We address the problem of site-wide operational optimization of production plants in the context of Industry 4.0, with an emphasis on approaches driven by sensor data. A multi-plant production site is a complex network of plants and intermediate storage tanks through which materials flow continuously and are transformed from raw inflows into product outflows. A site-wide production strategy is a time-indexed plan for operating the network in real time; it specifies the plant flow rates and the corresponding tank inventories. The strategy must be dynamic so that it can respond to changes such as breakdowns or shifting economic objectives, which requires capturing the run-time behavior of each process so that controls can be altered as needed. We present a novel solution that uses machine learning to learn process relationships from sensor data and converts the process network into a surrogate network of regression-based transformers coupled via inventory balances and physical constraints such as capacity limits. We discuss physical and modeling considerations that must be handled in practice to realize such a representation from sensor data. We emphasize the choice of segmented linear models and couple them with integer-linear modeling techniques to devise a prediction-optimization framework for site-wide optimization. We illustrate the application and effectiveness of the proposed framework with a case study based on the oil sands processing industry.
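As a minimal sketch of the surrogate representation described above, the following Python fragment couples one plant, modeled as a segmented linear function of its feed rate, to one downstream tank through an inventory balance with a capacity limit. All names (`piecewise_linear`, `simulate_tank`) and all numbers (breakpoints, slopes, intercepts, flows, capacity) are hypothetical and chosen only for illustration; they are not taken from the case study, and the optimization step itself is not shown.

```python
from bisect import bisect_right


def piecewise_linear(x, breakpoints, slopes, intercepts):
    """Evaluate a segmented linear model.

    Segment k covers [breakpoints[k], breakpoints[k + 1]) and predicts
    slopes[k] * x + intercepts[k]. Inputs outside the range are handled
    by the first or last segment.
    """
    k = bisect_right(breakpoints, x) - 1
    k = max(0, min(k, len(slopes) - 1))
    return slopes[k] * x + intercepts[k]


def simulate_tank(initial_inventory, feeds, draws, capacity, plant):
    """Roll the inventory balance forward over time.

    Balance per period: I[t+1] = I[t] + plant(feed[t]) - draw[t].
    Raises ValueError if the tank would run empty or overflow.
    """
    inventory = initial_inventory
    trajectory = []
    for t, (feed, draw) in enumerate(zip(feeds, draws)):
        inventory = inventory + plant(feed) - draw
        if not 0.0 <= inventory <= capacity:
            raise ValueError(f"inventory {inventory} violates limits at period {t}")
        trajectory.append(inventory)
    return trajectory


# Hypothetical plant surrogate: two segments, continuous at feed = 50.
BREAKPOINTS = [0.0, 50.0, 100.0]
SLOPES = [0.8, 0.6]
INTERCEPTS = [0.0, 10.0]


def plant_model(feed):
    """Assumed regression surrogate mapping feed rate to product rate."""
    return piecewise_linear(feed, BREAKPOINTS, SLOPES, INTERCEPTS)


# Hypothetical three-period plan: plant feed rates and downstream draws.
inventories = simulate_tank(
    initial_inventory=20.0,
    feeds=[40.0, 60.0, 80.0],
    draws=[30.0, 40.0, 50.0],
    capacity=100.0,
    plant=plant_model,
)
print(inventories)
```

In an optimization setting, the feed rates would be decision variables rather than fixed inputs, and the segment selection performed here by `bisect_right` would typically be encoded with binary variables, which is one common way segmented linear models lead to an integer-linear formulation.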