Contextual Reuse of Big Data Systems: A Case Study Assessing Groundwater Recharge Influences

Buccella, Agustina; Cechich, Alejandra; Garrido, Walter; Montenegro, Ayelén

URI: https://rdi.uncoma.edu.ar/handle/uncomaid/19508

Abstract:

The process of building data analytics systems, including big data systems, is currently investigated from various perspectives that generally focus on specific aspects, such as data security or privacy, to the detriment of an engineering perspective on systems development. To address this limitation, we propose developing analytics systems through a reuse-based approach that covers the stages from problem definition to results analysis by identifying variations and building reusable, context-based assets. This study presents the reuse process through two case studies that address the water table level prediction problem in two different contexts: the irrigated period and the non-irrigated period in the same study area. The objective is to demonstrate the influence of context on the performance of widely used predictive models for this problem, including long short-term memory (LSTM) networks, artificial neural networks (ANNs), and support vector machines (SVMs), as well as the potential for reusing the developed analytics system. Additionally, we applied permutation feature importance (PFI) to determine the contribution of individual variables to the prediction. The results confirm that the same problem hypotheses yield different performance in each case in terms of the coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), and mean squared error (MSE). They also show that the best-performing predictive models differ for some of the hypotheses (ANN in one case and LSTM in another), supporting the assumption that context can influence model selection and performance. Reusing assets allows these alternatives to be evaluated more efficiently during development, resulting in analytics systems that are more closely aligned with reality, while also offering the advantages of software system composition.
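
The evaluation metrics and the permutation feature importance (PFI) step named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy model, the synthetic data, and the function names `regression_metrics`, `permutation_feature_importance`, and `toy_predict` are all assumptions introduced here, and the toy model merely stands in for an LSTM, ANN, or SVM predictor.

```python
import math
import random


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and MSE for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(mse),
        "MAE": mae,
        "MSE": mse,
    }


def permutation_feature_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = regression_metrics(y, [predict(row) for row in X])["MSE"]
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = regression_metrics(y, [predict(row) for row in X_perm])["MSE"]
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(row):
    # Hypothetical stand-in model: uses feature 0 only and ignores feature 1.
    return 2.0 * row[0]


# Synthetic data (assumption): the target depends on feature 0 only.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(50)]
y = [2.0 * a for a, _ in X]

metrics = regression_metrics(y, [toy_predict(row) for row in X])
importances = permutation_feature_importance(toy_predict, X, y)
print(metrics)
print(importances)
```

In this toy setting, shuffling the ignored feature leaves the error unchanged, while shuffling the informative feature increases it. Running the same procedure on data from each context (for example, the irrigated and the non-irrigated period) would be one way to compare how variable contributions shift between contexts.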
