Using machine learning to correct for nonphotochemical quenching in high-frequency, in vivo fluorometer data
In vivo fluorometers use chlorophyll a fluorescence (Fchl) as a proxy to monitor phytoplankton biomass. However, the fluorescence yield is affected by photoprotection processes triggered by increased irradiance (nonphotochemical quenching; NPQ), creating diurnal reductions in Fchl that may be mistaken for reductions in phytoplankton biomass. Published correction methods are mostly designed for pelagic oceans and are ill-suited for inland waters or for high-frequency data collection. A machine learning-based method was developed to correct vertical profiler data from an oligotrophic lake. NPQ was estimated as a percent reduction in Fchl by comparing daytime values to mean, unquenched values from the previous night. A random forest regression was trained on sensor data collected coincident with Fchl, including solar radiation, water temperature, depth, and dissolved oxygen saturation. The accuracy of the model was assessed using a grouped 10-fold cross validation (mean absolute error [MAE]: 7.6%; root mean square error [RMSE]: 10.2%), and the model was then used to correct Fchl profiles. The model also predicted NPQ and corrected unseen Fchl profiles from a later period, with somewhat higher error (MAE: 9.0%; RMSE: 14.4%). Fchl profiles were then correlated to laboratory results, allowing corrected profiles to be compared directly to collected samples. The correction reduced the error (RMSE) due to NPQ from 0.67 μg L−1 in uncorrected Fchl data to 0.33 μg L−1. These results suggest that the use of machine learning models may be an effective way to correct for NPQ and may have universal applicability.
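The arithmetic behind the NPQ estimate and the subsequent correction can be illustrated with a minimal Python sketch. It is not the study's implementation: the abstract does not give the exact formulas, so the percent-reduction definition, its inversion, and the grouping of profiles into folds by a label such as sampling day are assumptions, and all function names are hypothetical. The random forest itself is omitted; `npq_pred_percent` stands in for whatever a trained regressor would predict.

```python
from math import sqrt
from statistics import mean


def npq_percent(f_day, f_night_values):
    """Percent reduction of a daytime Fchl reading relative to the mean
    unquenched Fchl from the previous night (assumed definition)."""
    f_ref = mean(f_night_values)
    return 100.0 * (f_ref - f_day) / f_ref


def correct_fchl(f_day, npq_pred_percent):
    """Invert the percent reduction to estimate unquenched Fchl
    from a daytime reading and a predicted NPQ (assumed inversion)."""
    if npq_pred_percent >= 100.0:
        raise ValueError("predicted NPQ must be below 100%")
    return f_day / (1.0 - npq_pred_percent / 100.0)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def rmse(y_true, y_pred):
    """Root mean square error."""
    return sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def grouped_folds(groups, k):
    """Assign a fold index to each sample so that all samples sharing a
    group label (hypothetically, a sampling day) land in the same fold."""
    unique = sorted(set(groups))
    fold_of = {g: i % k for i, g in enumerate(unique)}
    return [fold_of[g] for g in groups]


# Example with made-up numbers: night mean of 8.0, daytime reading of 6.0.
npq = npq_percent(6.0, [7.5, 8.5])   # 25.0 percent reduction
restored = correct_fchl(6.0, npq)    # back to 8.0
```

Keeping every sample from one group in the same fold prevents near-duplicate measurements from appearing in both the training and test sets, which is the usual motivation for a grouped cross validation.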