Water Research X Long-term multivariate water quality forecasting for sustainable aquaculture management
ABSTRACT
Accurate water quality prediction is essential for intelligent aquaculture management, enabling timely intervention, risk mitigation, and sustainable resource use. Key parameters such as dissolved oxygen, chlorophyll-a, and pH are influenced by complex spatiotemporal dynamics, making long-term forecasting particularly challenging in high-density aquaculture systems. Traditional methods struggle to balance local details and global trends, while circadian rhythms, feeding cycles, and seasonal shifts cause dynamic dependencies and distribution drift. To address these issues, we propose a novel deep learning framework with three core components: (1) a multi-scale decomposition module with time-frequency enhancement, which removes cross-scale redundancy, suppresses noise, and integrates local-global features via hierarchical decomposition and feature reorganization; (2) an adaptive sequence perception attention mechanism based on graph learning, which captures dynamic variable dependencies and models spatiotemporal interactions, including environmental coupling and aquaculture disturbances; and (3) a GRU-MoE network with a dynamic expert selection strategy that adjusts to data characteristics, mitigating distribution drift caused by human interventions like feeding and oxygenation. Extensive experiments on four real-world water quality datasets show the proposed method outperforms six deep learning baselines, achieving an average MAE reduction of 53.17%, RMSE reduction of 51.68%, R² improvement of 0.4945, and KGE improvement of 0.1979. Furthermore, Kolmogorov-Smirnov test results confirm the model's ability to recover real data distributions and their temporal evolution.
This high-precision long-term prediction method enhances aquaculture system resilience, reduces risks from water quality fluctuations, and provides a robust foundation for informed decision-making and sustainable aquaculture management.
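For readers less familiar with the four accuracy metrics reported above, the sketch below computes them for a single variable. It assumes the commonly used definitions (R² as one minus the ratio of residual to total sum of squares, and the original three-term form of the Kling-Gupta efficiency); the exact variants used in the experiments are not stated in this section, and the function names are illustrative.

```python
import math

def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def rmse(obs, sim):
    """Root mean squared error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def r_squared(obs, sim):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def kge(obs, sim):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    n = len(obs)
    mu_o, mu_s = sum(obs) / n, sum(sim) / n
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    cov = sum((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)) / n
    r = cov / (sd_o * sd_s)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Made-up dissolved-oxygen values (mg/L), for illustration only.
observed = [6.0, 7.0, 8.0, 7.0]
predicted = [6.5, 7.0, 7.5, 7.0]
print(mae(observed, predicted), rmse(observed, predicted))
print(r_squared(observed, predicted), kge(observed, predicted))
```

MAE and RMSE are errors (lower is better), whereas R² and KGE are skill scores with a maximum of 1, which is why the abstract reports reductions for the former and improvements for the latter.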
One. Introduction
Accurate prediction of water quality parameters is the foundation of intelligent aquaculture management. Forecasting key parameters such as dissolved oxygen, chlorophyll-a, temperature, turbidity, salinity, and pH allows the aquaculture environment to be monitored in real time and water quality anomalies to be detected early, so that aquatic animals are protected from growth restriction or disease caused by harsh conditions, which in turn improves production and economic returns. Accurate forecasts also help optimize feeding and water regulation strategies, reduce resource waste, and improve the sustainability of aquaculture. In addition, they help assess the impact of climate change on marine aquaculture ecology and provide a basis for scientific management and policy formulation. A high-precision long-term prediction model for aquaculture water quality parameters is therefore urgently needed. Such a model can provide dynamic early-warning support for aquaculture modes such as factory-scale recirculating aquaculture systems and offshore cages, support the assessment of environmental carrying capacity and the formulation of regional aquaculture capacity standards, and supply solid data support for the industry's transition to an eco-friendly and sustainable development model.
Water quality prediction models can generally be divided into two categories: physics-based models and data-driven models. Physics-based models simulate specific water chemistry processes through equations and parameterization schemes with clear physical meaning, and have been widely used in water quality simulation and prediction. Such models are often combined with data assimilation techniques to improve their robustness and reliability. However, they have several inherent limitations. Their reliance on idealized assumptions limits their predictive capability in complex or highly dynamic environments. Model construction usually requires detailed prior knowledge of the physical and chemical properties of the water body, and obtaining this knowledge often demands substantial experimental and observational resources, which raises the technical threshold and cost of application. More importantly, these models involve complex numerical computation with high computational overhead, which makes it difficult to meet the timeliness requirements of real-time or near-real-time water quality prediction and restricts their engineering application and large-scale deployment.
With advances in data mining and the growing abundance of environmental monitoring data, data-driven water quality prediction models are being studied and applied ever more widely. These models do not aim to reveal the physical and chemical mechanisms behind water quality changes; instead, they learn the complex nonlinear mapping between meteorological factors and water quality parameters. By modeling methodology, they fall into two categories: machine learning-based and deep learning-based models. Traditional machine learning methods such as decision trees, support vector machines, and hidden Markov models offset some of the shortcomings of mechanistic models through their nonlinear modeling capability. However, they still face significant challenges in practice: they depend heavily on feature engineering, their computational efficiency drops sharply as dimensionality grows in high-dimensional and complex data, and their generalization ability is limited. They also struggle to capture the dynamic correlations between time steps, which restricts their ability to model the temporal evolution of water quality. Consequently, although they perform well in short-term prediction tasks, their performance is clearly limited on water quality data with long-range temporal dependence, multivariate coupling, and pronounced dynamic changes.
Deep neural networks have been widely used in time series prediction owing to their strong feature learning ability, noise resistance, and generalization performance. In water quality prediction, models based on convolutional neural networks perform well because of their powerful local feature extraction. For example, methods such as MICN and PDF achieve high prediction accuracy by capturing local periodic features and modeling long-term dependencies, while an improved model that introduces temporal convolutional networks and double residual structures further enhances interpretability. However, when handling long-term dependencies, such convolution-based methods rely heavily on the design of the convolution kernels, which makes it difficult to model global dynamic associations. In contrast, recurrent neural networks propagate time series information dynamically through their inherent directed cyclic structure. Important variants such as long short-term memory (LSTM) networks and gated recurrent units (GRU) are widely used in complex time series prediction because of their stronger nonlinear modeling capability and memory mechanisms. In water quality prediction in particular, GRU-based models can capture long-term dependencies in multivariate sequences through their gating mechanisms. Nevertheless, although LSTM and GRU have advantages over traditional methods in modeling long-term dependencies, their global modeling capability remains limited for sequences with very long time spans.
In recent years, deep learning models based on the Transformer architecture have made significant progress in time series forecasting. Thanks to the representational power of the self-attention mechanism, these models capture complex and dynamic dependencies between time points and perform well on time series with highly nonlinear characteristics. Notably, in long-sequence prediction their inherent global modeling ability overcomes the bottleneck of traditional methods, which are limited to local dependencies. However, the computational complexity of standard self-attention grows quadratically with sequence length, which is the main constraint on its use in ultra-long sequence scenarios. Researchers have therefore proposed a variety of efficient attention mechanisms that aim to reduce the computational cost substantially while retaining as much of the sequence modeling capability as possible. Sparse attention restricts the attention computation to key positions, which greatly reduces the overhead. Low-rank decomposition and kernel-based approaches reduce complexity by approximating the attention matrix with a low-rank form or by mapping it to a high-dimensional feature space, respectively. Segment-based attention and window mechanisms divide long sequences into sub-intervals that are modeled separately, which reduces resource consumption and also improves the balance between local and global feature modeling.
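To make the cost argument concrete, the following minimal sketch restricts scaled dot-product attention to a sliding window and counts the query-key pairs that are actually scored. It is a generic illustration of the window idea rather than the mechanism of any specific method cited here; `local_attention`, `attended_pairs`, and `half_width` are hypothetical names.

```python
import math

def attended_pairs(seq_len, half_width):
    """Number of (query, key) pairs scored when |i - j| <= half_width."""
    total = 0
    for i in range(seq_len):
        lo = max(0, i - half_width)
        hi = min(seq_len - 1, i + half_width)
        total += hi - lo + 1
    return total

def local_attention(queries, keys, values, half_width):
    """Scaled dot-product attention where step i only sees steps i-w .. i+w."""
    dim = len(queries[0])
    outputs = []
    for i, query in enumerate(queries):
        window = range(max(0, i - half_width), min(len(keys), i + half_width + 1))
        scores = [sum(q * k for q, k in zip(query, keys[j])) / math.sqrt(dim)
                  for j in window]
        peak = max(scores)                       # subtract max for stability
        weights = [math.exp(s - peak) for s in scores]
        norm = sum(weights)
        outputs.append([
            sum(w / norm * values[j][c] for w, j in zip(weights, window))
            for c in range(len(values[0]))
        ])
    return outputs

seq_len, half_width = 720, 12
print("full attention pairs:  ", seq_len * seq_len)
print("window attention pairs:", attended_pairs(seq_len, half_width))
```

For a 720-step input and a half-width of 12, full attention scores 518,400 pairs, while the window variant scores 17,844, so the cost grows linearly rather than quadratically with sequence length. The price is that each step no longer sees distant context directly, which is the local-global trade-off discussed above.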
Although Transformer-based methods have made significant progress in long-sequence modeling, they still face many challenges in complex applications such as aquaculture water quality prediction. These challenges expose potential deficiencies of current mainstream methods in multi-scale feature fusion, multivariate coupling relationships, attention distribution, and robustness to concept drift. First, water quality time series generally show pronounced multi-scale characteristics, with markedly different behavior in short-term fluctuations and long-term trends. However, existing Transformer architectures mostly rely on single-scale modeling and lack effective mechanisms for scale decoupling and feature integration, so key patterns (such as sudden drops in dissolved oxygen) are easily masked by redundant information or smooth trends, which weakens prediction stability. Second, the dot-product attention used by mainstream methods tends to produce smooth, homogeneous attention distributions, which makes it difficult to focus on key time points such as abrupt changes (for example, sudden changes in turbidity caused by heavy rain) and leaves the model insufficiently responsive to local abnormal events. Although models such as DECSF-Net introduce cross-source data fusion strategies, their attention allocation mechanisms have not been effectively improved, and their performance in predicting sudden events is limited. Finally, owing to factors such as seasonal changes and extreme weather, water quality data often undergo distribution drift. Current static model structures such as Transformer and LSTM lack adaptive adjustment capability and cope poorly with dynamic changes in the data distribution. Although online learning methods improve overall robustness, they do not effectively model the distribution differences between local data segments and struggle to resolve the inconsistency between local features and global patterns. In view of this, this paper proposes a novel deep learning framework with the following main contributions:
One. A time-frequency enhanced multi-scale decomposable fusion strategy is proposed to eliminate redundant information in multi-scale time series data and to balance local and global key features. Time-frequency domain enhancement highlights the global trend and local details of the series, and an improved moving average method extracts the different temporal patterns. The sequence is decomposed into multiple scales by selecting appropriate kernel sizes, which ensures that the features at each scale are diverse and independent; residual connections then remove redundant information and aggregate the temporal patterns.
Two. An adaptive sequence-aware attention mechanism is proposed to address the row-homogeneity phenomenon of the traditional attention mechanism, which causes key time points, local features, and multivariate dependencies to be missed. By combining dynamic changes in the time domain with periodic characteristics in the frequency domain, the mechanism accurately captures key time points and their features and makes attention allocation and feature extraction more efficient. In addition, a graph structure framework is introduced to model the complex dependencies among variables through graph representation and graph aggregation of the time series.
Three. A GRU-MoE model is proposed to address the inconsistency between local and overall distributions caused by distribution drift. It prevents the model from overfitting local features while ignoring global trends, which would otherwise degrade both short-term prediction and long-term trend identification. The model consists of a set of specially designed experts, each optimized for the distribution of a particular patch of the input time series, and it automatically adjusts the experts' weights and strategies to achieve more accurate and adaptive predictions.
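The patch-wise expert weighting in the third contribution can be illustrated with a small mixture-of-experts sketch. It is only a toy under stated assumptions: the three experts are simple hand-written forecasters standing in for GRU experts, the gate is a hand-written heuristic standing in for a learned gating layer, and top-k selection is one possible reading of the dynamic expert selection strategy. All names are hypothetical.

```python
import math

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_gate(scores, k):
    """Keep the k highest-scoring experts, renormalise them, zero the rest."""
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    probs = softmax([scores[i] for i in keep])
    weights = [0.0] * len(scores)
    for index, prob in zip(keep, probs):
        weights[index] = prob
    return weights

# Stand-in experts (assumption: the real experts would be trained networks).
def expert_last_value(patch):
    return patch[-1]

def expert_mean(patch):
    return sum(patch) / len(patch)

def expert_linear_trend(patch):
    step = (patch[-1] - patch[0]) / (len(patch) - 1)
    return patch[-1] + step

EXPERTS = [expert_last_value, expert_mean, expert_linear_trend]

def gate_scores(patch):
    """Heuristic gate: flat -> last value, oscillating -> mean, drifting -> trend."""
    spread = max(patch) - min(patch)
    net_change = abs(patch[-1] - patch[0])
    return [-spread, spread - net_change, net_change]

def moe_forecast(patch, k=2):
    """One-step forecast for a patch as a gated mixture of expert outputs."""
    weights = top_k_gate(gate_scores(patch), k)
    prediction = sum(w * expert(patch) for w, expert in zip(weights, EXPERTS))
    return prediction, weights

def forecast_per_patch(series, patch_len, k=2):
    """Split the series into non-overlapping patches and gate each separately."""
    starts = range(0, len(series) - patch_len + 1, patch_len)
    return [moe_forecast(series[s:s + patch_len], k) for s in starts]

# A drifting patch followed by a flat one, as after an aeration event.
for prediction, weights in forecast_per_patch([1.0, 2.0, 3.0, 4.0, 5.0, 5.0, 5.0, 5.0], 4):
    print(round(prediction, 3), [round(w, 3) for w in weights])
```

Because the gate is evaluated per patch, the two patches in the usage example receive different expert mixtures even though they belong to the same series, which is the behaviour the contribution relies on to handle local distribution shifts.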
At the methodological level, this study extends the theoretical framework of multi-scale time series modeling; at the application level, it responds to the urgent need for high-precision water quality prediction in intelligent aquaculture management. The proposed multi-scale decomposition strategy, adaptive attention mechanism, and mixture-of-experts structure jointly improve the model's ability to identify key features and adapt to complex environmental changes, yielding good prediction stability and generalization performance. The results can provide effective technical support for dynamic early warning, water quality regulation, and ecological risk prevention and control in aquaculture, and promote the shift of water quality management from experience-driven to data-driven practice. They also provide a data foundation and theoretical support for ecological carrying capacity assessment, sustainable use of marine resources, and related policy formulation in the context of climate change, and thus have both scientific significance and practical value.
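As an illustration of the multi-scale decomposition strategy in the first contribution, the sketch below peels trends off a series with moving averages of decreasing kernel size and passes the residual on to the next scale. It assumes a plain centred moving average with edge padding and odd kernel sizes; the improved moving average and the time-frequency enhancement of the proposed method are not reproduced, and the function names are hypothetical.

```python
def moving_average(series, kernel):
    """Centred moving average; both ends are padded by repeating the edge value."""
    half = kernel // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [sum(padded[i:i + kernel]) / kernel for i in range(len(series))]

def multiscale_decompose(series, kernels):
    """Return one trend per kernel (coarsest first) plus the final residual.

    Each scale is extracted from what the coarser scales left behind, so the
    components do not repeat the same information and sum back to the input.
    """
    residual = list(series)
    components = []
    for kernel in sorted(kernels, reverse=True):
        trend = moving_average(residual, kernel)
        components.append(trend)
        residual = [r - t for r, t in zip(residual, trend)]
    components.append(residual)
    return components

# Made-up series for illustration only.
series = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0, 8.0, 7.0, 9.0]
coarse, fine, remainder = multiscale_decompose(series, kernels=[3, 5])
rebuilt = [c + f + r for c, f, r in zip(coarse, fine, remainder)]
print(all(abs(a - b) < 1e-9 for a, b in zip(rebuilt, series)))
```

Subtracting each trend before extracting the next is what keeps the scales from overlapping: a slow trend captured by the large kernel cannot reappear in the component obtained with the small kernel.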
The rest of this paper is organized as follows: Section Two reports the principal experimental results and discusses them in detail. Section Three concludes the paper and outlines future work. Section Four introduces the framework of our model.