CEREALIA: A Digital Twin Framework for Weather-Driven Agricultural Decision-Making Under Imperfect Conditions

CEREALIA is presented at SIGSPATIAL '25!

T. Ahmed and M. Hasan, "Weather-driven agricultural decision-making under imperfect conditions," in Proc. of ACM International Conference on Advances in Geographic Information Systems (SIGSPATIAL), pp. 1–4, Nov. 2025. [PDF] [Extended Version]

Background

Modern agriculture relies on precise weather data to optimize irrigation, frost protection, and pest control. However, real-world weather networks often suffer from sensor faults, calibration drift, or communication errors. These inconsistencies can lead to poor predictions and crop losses.

High-level schematic of CEREALIA. CEREALIA stores historical weather traces and simulated anomalous data, which are used to detect inconsistent data generated at runtime by remote weather stations deployed in the field. End users can use CEREALIA to analyze how inconsistent data affects the target agricultural decision-making process.

Our Framework: CEREALIA

We introduce CEREALIA – a modular digital twin platform that detects, analyzes, and mitigates inconsistencies in agricultural weather data.

CEREALIA mirrors field weather stations in real time, classifies inconsistencies using neural models, and supports resilient decision-making when perfect data is unavailable.

Workflow of CEREALIA. The system integrates noisy data with historical traces to train machine learning model(s). A runtime consistency checker evaluates the impact of imperfect data on agricultural decision processes (e.g., fruit heat or frost prediction).

Key Contributions

Real-world deployment of CEREALIA on an NVIDIA Jetson AGX Orin connected to a live weather station in Quincy, Washington. The setup streams live sensor data to the digital twin for anomaly detection, imputation, and decision-support inference.

Inconsistency Generation in CEREALIA

To realistically study imperfect weather data, CEREALIA includes a noise generator module that emulates sensor faults. This allows us to inject anomalies into otherwise clean data and observe how inconsistency affects decision-making models.

Visualization of four types of inconsistencies in air temperature readings: (a) Random, (b) Malfunction, (c) Drift, and (d) Bias.

We consider four types of inconsistencies: random, malfunction, drift, and bias.

These generators mimic common real-world sensor faults: random spikes, faulty oscillations, long-term drifts, and constant offsets. Incorporating them allows CEREALIA to test the robustness of anomaly detection and decision-support models under imperfect weather data.
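A minimal sketch of how the four inconsistency types could be emulated on a temperature trace. The function names, default magnitudes, and fault windows are illustrative assumptions, not CEREALIA's actual noise generator module.

```python
import math
import random


def add_random(values, rate=0.05, scale=5.0, seed=0):
    """Random: perturb a fraction `rate` of samples with isolated spikes."""
    rng = random.Random(seed)
    return [v + rng.uniform(-scale, scale) if rng.random() < rate else v
            for v in values]


def add_malfunction(values, start, length, amplitude=3.0, period=6):
    """Malfunction: superimpose a faulty oscillation on [start, start + length)."""
    out = list(values)
    for i in range(start, min(start + length, len(out))):
        out[i] += amplitude * math.sin(2 * math.pi * (i - start) / period)
    return out


def add_drift(values, start, slope=0.05):
    """Drift: error grows linearly with time after `start`."""
    out = list(values)
    for i in range(start, len(out)):
        out[i] += slope * (i - start)
    return out


def add_bias(values, start, offset=2.0):
    """Bias: constant offset applied from `start` onward."""
    out = list(values)
    for i in range(start, len(out)):
        out[i] += offset
    return out


# Hypothetical clean trace: a smooth daily temperature cycle, 96 samples per day.
clean = [20.0 + 5.0 * math.sin(2 * math.pi * t / 96) for t in range(192)]
faulty = add_drift(clean, start=100)
```

Each generator returns a new list and leaves the clean trace untouched, so the same baseline can be reused to produce one faulty copy per inconsistency type.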

Machine Learning Models for Inconsistency Detection

To classify inconsistent weather measurements, CEREALIA leverages a diverse set of nine state-of-the-art neural network models. These models capture both short-term patterns and long-term temporal dependencies in weather data.

Together, these models allow CEREALIA to detect and classify anomalies such as random noise, sensor malfunctions, drifts, and biases with high accuracy while operating efficiently on embedded hardware.
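As a rough illustration of how a labelled trace might be turned into training samples for such classifiers, the sketch below slices a series into fixed-length windows and assigns each window a class. The window length, stride, label names, and the "most frequent fault wins" rule are all assumptions made for this example; the source does not describe CEREALIA's actual preprocessing.

```python
from collections import Counter


def make_windows(series, point_labels, window=8, stride=4):
    """Return (window_values, window_label) pairs for a labelled series.

    A window is labelled "normal" only if every point in it is normal;
    otherwise it takes the most frequent fault label among its points.
    """
    samples = []
    for start in range(0, len(series) - window + 1, stride):
        chunk = series[start:start + window]
        faults = [lab for lab in point_labels[start:start + window]
                  if lab != "normal"]
        label = Counter(faults).most_common(1)[0][0] if faults else "normal"
        samples.append((chunk, label))
    return samples


# Hypothetical example: 12 readings, with a drift fault flagged at index 9.
series = list(range(12))
labels = ["normal"] * 12
labels[9] = "drift"
samples = make_windows(series, labels)
```

Longer windows give a model more temporal context for slow faults such as drift, at the cost of fewer training samples per trace.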

Results Summary

We evaluated CEREALIA with nine machine learning models across multiple weather datasets.

These results show that CEREALIA can robustly detect anomalies in weather data and support agricultural decision-making even under imperfect measurement conditions.

Case Studies

1. Weather Data Imputation

Weather stations often produce missing or faulty measurements due to sensor malfunctions or network outages. Such gaps can interrupt decision-making tasks like irrigation scheduling or fruit stress prediction.

CEREALIA addresses this issue using a generative recurrent model (C-RNN-GAN) trained on historical traces and noisy samples. The model can accurately predict missing values across key attributes (temperature, humidity, pressure, wind), ensuring uninterrupted data streams.

Example of CEREALIA imputing air temperature values when real-time sensor feeds are imperfect. Red shaded regions denote inconsistent inputs, replaced by predicted values.
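The replacement step can be sketched as follows. To keep the example self-contained, it uses plain linear interpolation between the nearest consistent neighbours as a stand-in predictor; this is not the C-RNN-GAN model that CEREALIA uses, and the function name and inputs are hypothetical.

```python
def impute(values, flagged):
    """Replace flagged samples using the nearest unflagged neighbours.

    Interior gaps are filled by linear interpolation; gaps at either end
    are filled with the closest consistent value.
    """
    out = list(values)
    good = [i for i, bad in enumerate(flagged) if not bad]
    if not good:
        raise ValueError("no consistent samples to interpolate from")
    for i, bad in enumerate(flagged):
        if not bad:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            out[i] = values[left] + weight * (values[right] - values[left])
    return out


# Hypothetical air temperature trace with three readings flagged as inconsistent.
readings = [10.0, 50.0, 50.0, 50.0, 18.0]
flags = [False, True, True, True, False]
repaired = impute(readings, flags)
```

A learned generative model would replace the interpolation step, but the surrounding logic stays the same: only flagged samples are overwritten, so consistent measurements pass through unchanged.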

2. Fruit Surface Temperature Prediction

Heat stress is a major concern for fruit growers, as excessive surface temperature can cause sunburn and quality loss. Accurate, consistent weather inputs are required for triggering protective measures such as cooling, fogging, or netting. However, faulty sensor data can reduce prediction reliability.

CEREALIA incorporates nine neural models (CNN: TCN, ResNet; RNN: LSTM, Bi-LSTM, GRU; Transformer: TST, Informer; Hybrid AE: TST-AE, LSTM-AE) to predict fruit surface temperature from weather attributes such as canopy air temperature, wind speed, dew point, and solar radiation. When imperfect measurements were introduced, prediction errors increased for every model (for example, the MAE of TCN rose from 0.6874 to 1.9347). With CEREALIA imputing inconsistencies, performance improved and approached the no-fault baseline.

Models    No Imperfection               Imperfect Measurements        Imputing Inconsistencies
          MAE       RMSE      Metric 3  MAE       RMSE      Metric 3  MAE       RMSE      Metric 3
TCN       0.6874    1.2911    0.9288    1.9347    4.9019    0.2634    0.7652    1.3466    0.9226
ResNet    0.5823    1.2395    0.9344    2.1711    5.4140    0.1014    0.6695    1.3038    0.9274
LSTM      0.8335    1.3969    0.9167    1.9208    4.1776    0.4650    0.9215    1.4688    0.9079
Bi-LSTM   1.0283    1.5890    0.8922    1.9139    3.8689    0.5411    1.0959    1.6467    0.8842
GRU       1.7242    2.1495    0.8027    2.3225    3.8069    0.5557    1.8041    2.2292    0.7879
TST       0.8128    1.3729    0.9195    2.4770    5.3600    0.1193    0.9348    1.8015    0.8615
Informer  0.5983    1.2530    0.9330    2.2968    5.3474    0.1234    0.6769    1.3122    0.9265
TST-AE    1.5949    1.9784    0.8329    2.3072    3.5205    0.6201    1.6834    2.0721    0.8167
LSTM-AE   0.9393    1.4239    0.9134    1.7415    3.4767    0.6295    1.0146    1.4949    0.9046

The third metric in each group is unlabeled in the source; its values are consistent with a goodness-of-fit score such as R² (higher is better).

The results show that when faulty weather feeds are used directly, prediction error rises for every model: MAE increases by a factor of roughly 1.3 (GRU) to 3.8 (Informer). With CEREALIA imputing inconsistencies, accuracy recovers substantially, and MAE returns to within roughly 0.07 to 0.12 of the no-imperfection baseline.
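The size of the degradation and the recovery can be checked directly from the MAE values in the table above. The short script below is only a worked calculation over those reported numbers; the variable and function names are our own.

```python
# MAE per model, copied from the table:
# (no imperfection, imperfect measurements, imputing inconsistencies)
MAE = {
    "TCN": (0.6874, 1.9347, 0.7652),
    "ResNet": (0.5823, 2.1711, 0.6695),
    "LSTM": (0.8335, 1.9208, 0.9215),
    "Bi-LSTM": (1.0283, 1.9139, 1.0959),
    "GRU": (1.7242, 2.3225, 1.8041),
    "TST": (0.8128, 2.4770, 0.9348),
    "Informer": (0.5983, 2.2968, 0.6769),
    "TST-AE": (1.5949, 2.3072, 1.6834),
    "LSTM-AE": (0.9393, 1.7415, 1.0146),
}


def summarize(mae_table):
    """Return (degradation ratio, residual gap after imputation) per model."""
    ratios = {m: faulty / clean for m, (clean, faulty, _) in mae_table.items()}
    gaps = {m: imputed - clean for m, (clean, _, imputed) in mae_table.items()}
    return ratios, gaps


ratios, gaps = summarize(MAE)
for model in MAE:
    print(f"{model:9s} faulty/clean MAE = {ratios[model]:.2f}, "
          f"imputed - clean MAE = {gaps[model]:.4f}")
```

The ratio column shows how strongly each model reacts to faulty inputs, while the gap column shows how much error remains after imputation relative to perfect sensor data.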

The CEREALIA implementation is publicly available at https://github.com/CPS2RL/ag-dt

Impact

CEREALIA bridges computing and agriculture, enabling more resilient, data-driven decision processes. By leveraging digital twins, we show how imperfect measurements can still lead to reliable farm management outcomes.