Global forecasting models for dengue outbreaks in endemic regions: a systematic review

Abstract

Background. Dengue is a rapidly spreading mosquito-borne disease, posing significant global health challenges, particularly in endemic regions. Recent years have witnessed an increase in the frequency and intensity of dengue outbreaks, necessitating robust forecasting models for early intervention.

This systematic review aims to synthesize recent literature on dengue forecasting models, evaluate their predictive performance, and identify the most effective approaches.

Materials and methods. A comprehensive search in Scopus, PubMed, ScienceDirect, and Springer databases was conducted following PRISMA guidelines. Studies were selected based on strict inclusion and exclusion criteria, and the quality of the research was evaluated using TRIPOD criteria. Out of 1,366 identified studies, 13 met the eligibility criteria. Data were extracted and analyzed to assess the accuracy and validity of the forecasting models employed.

Results. The findings indicate that machine learning-based models, particularly random forest, outperform conventional statistical models such as ARIMA and Poisson regression. Additionally, climate data, especially temperature and rainfall, play a critical role in forecasting dengue incidence.

Conclusion. The present study corroborates the superior efficacy of machine learning-based forecasting models, particularly random forest, in forecasting dengue cases compared to conventional statistical methods. This finding provides a foundation for the development of an enhanced early warning system to address future outbreaks of dengue.

Full Text

Introduction

Dengue is one of the fastest-spreading mosquito-borne diseases, especially in tropical and subtropical regions, and is caused by several types of dengue virus [1, 2]. The World Health Organization has reported an 8-fold increase in global dengue incidence between 2000 and 2019. In 2023, over 5 million cases were documented across 80 countries, with at least 23 nations experiencing dengue outbreaks. That number more than doubled in 2024, with more than 10.6 million cases reported in North and South America alone. However, the actual number of cases is likely significantly higher, emphasizing the urgent need for effective public health interventions to mitigate this escalating crisis [3]. Although most infections are mild, severe forms such as dengue shock syndrome can lead to death [4, 5]. In the absence of a specific drug or vaccine for this virus, case fatality rates can reach 20% if diagnosis is not prompt [6], particularly in resource-constrained areas. When outbreaks occur on a large scale, the sheer number of severe dengue cases can overwhelm the health system and impede the delivery of optimal care. Dengue also poses a huge social and economic burden to many tropical countries where the disease is endemic [7]. Early and precise prediction of outbreak size and trends in disease incidence can limit further spread [8] and help plan the allocation of health resources to meet needs during an outbreak.

The two principal vectors capable of transmitting dengue are Aedes aegypti and A. albopictus. The transmission of dengue is influenced by a number of factors, including environmental and climate change, urbanization, globalization, vector activity, and behavioral change [9]. The interaction between humans, climate, and mosquitoes gives rise to a complex system that exerts a profound influence on dengue transmission patterns, which in turn affects the likelihood of outbreaks [10]. This relationship has been researched for decades through the development of forecasting models in different parts of the world. These models vary widely in both purpose [11, 12] and setting [13–15]. While many of these models perform well on particular tasks, efficient prediction requires a systematic, adaptive, and generalizable framework capable of identifying weather- and population-related patterns of vulnerability across geographic regions. The scientific community has not yet reached agreement on which models provide the best predictions. There are many research reports on prediction tools for dengue outbreaks [16–19]. However, research that provides a comprehensive summary of the performance and predictive ability of these tools remains limited. Previous studies have underscored the value of integrating diverse epidemiological tools, including mapping and mathematical models, to develop an effective early warning system [20]. However, that work did not prioritize the identification of significant predictors in the development of an early warning system for dengue. Other studies that emphasize early warning systems and incorporate numerous case forecasting models have been conducted, but they examined only the case forecasting of the various models used [21].

Various forecasting models have been developed over the years, integrating epidemiological, environmental, and climatic variables. While some models rely on traditional statistical methods such as Autoregressive Integrated Moving Average (ARIMA) and Poisson regression [14, 22–24], emerging research highlights the superior accuracy of machine learning models, particularly random forest and Long Short-Term Memory (LSTM) [25, 26]. However, there is still no consensus on the most effective forecasting approach. To address this research gap, several recent studies have explored novel methodologies in dengue forecasting. These studies indicate that integrating deep learning techniques, such as LSTM and transformer models, significantly improves prediction accuracy compared to conventional statistical models [27]. Furthermore, recent findings suggest that incorporating real-time meteorological and mobility data improves forecasting precision [28]. These updated approaches not only improve prediction accuracy but also enhance model adaptability across different geographical regions. Despite these advancements, inconsistencies in data quality, limited external validation, and computational constraints continue to pose challenges in real-world applications. This review focuses on determining which model exhibits the highest accuracy and examining its internal and external validity. Its objective is to synthesize recent literature on dengue case forecasting, discuss related evidence, and evaluate different models' forecasting performance to identify the most effective one.

Materials and methods

This review used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) approach, which includes methods for determining resources, eligibility, inclusion and exclusion criteria, and the process of systematic review, extraction, and analysis of data from the available literature [7]. PRISMA 2020 replaces the previous edition published in 2009, introducing new reporting guidelines that include more comprehensive study identification, selection, scoring, and synthesis methods [29]. This guide enables the search for terms relevant to the review and provides advice on aspects that need to be addressed in the review report for publication purposes [21].

Research Question Formulation

Research questions were developed using PICo, a tool that helps frame relevant research questions for systematic reviews. The PICo concept incorporates three elements: population or problem, interest, and context [30]. Based on PICo, the three main components in this review are dengue (Problem), case forecast model (Interest), and case prediction (Context). These concepts guided the formulation of the research question: “What is the evidence of the dengue case forecast model and its performance in predicting cases?”

Systematic Searching Strategies

The systematic search strategy comprised three stages: identification, screening, and eligibility assessment.

Identification

In the identification stage, synonyms and variations were used to enrich the keywords. Search strings were then built from these keywords with Boolean operators, as illustrated in Table 1. A systematic literature search of four major databases (Scopus, PubMed, ScienceDirect, and Springer) identified a total of 1,366 relevant records. Sixteen duplicate records were found and removed, leaving 1,350 records for title screening. All potential records were then exported from the databases and organized into Excel sheets for title and abstract screening.

 

Table 1. Keywords search used in the screening process

| Database | Keywords used |
|---|---|
| PubMed | ((((((((((((((((((((((((dengue fever) OR (dengue incidence)) OR (dengue outbreaks)) OR (dengue epidemic)) AND (forecasting models)) OR (predictive models)) OR (prediction models)) OR (epidemic forecasting)) OR (outbreak prediction)) AND (machine learning)) OR (statistical models)) OR (ARIMA)) OR (regression models)) OR (random forest)) OR (neural networks)) OR (support vector machines)) AND (environmental factors)) OR (climate variables)) OR (temperature)) OR (rainfall)) OR (humidity)) OR (climate data)) OR (weather patterns)) AND (endemic regions)) AND (tropical areas) |
| Scopus | TITLE-ABS-KEY ("dengue fever" OR "dengue incidence" OR "dengue outbreak*" OR "dengue epidemic*") AND ("forecast* model*" OR "predict* model*" OR "prediction model*" OR "epidemic forecast*" OR "outbreak prediction") AND ("machine learning" OR "statistical model*" OR "ARIMA" OR "regression model*" OR "random forest" OR "neural network*" OR "support vector machine*") AND ("environment* factor*" OR "climate variable*" OR "temperature" OR "rainfall" OR "humidity" OR "climate data" OR "weather pattern*") AND ("endemic region*" OR "tropical area*" OR "high-risk area*" OR "disease-endemic region*") |
| ScienceDirect (Search 1) | ("dengue fever" OR "dengue incidence") AND ("forecasting models" OR "prediction models") |
| ScienceDirect (Search 2) | ("dengue fever" OR "dengue incidence") AND ("prediction models" OR "outbreak prediction") AND ("machine learning" OR "statistical models") |
| ScienceDirect (Search 3) | ("dengue fever" OR "dengue outbreaks") AND ("predictive models" OR "forecasting models") AND ("environmental factors" OR "temperature" OR "rainfall") |
| Springer | ("dengue fever" OR "dengue incidence" OR "dengue outbreaks") AND ("forecasting models" OR "predictive models") AND ("machine learning" OR "statistical models" OR "ARIMA") AND ("environmental factors" OR "climate" OR "rainfall") AND ("endemic regions" OR "tropical areas") |
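The way keyword groups are combined with Boolean operators can be sketched in a few lines of Python. This is only an illustration, not the tooling the authors used: the function name `build_query` and the list layout are assumptions, and the terms are copied from the Springer search string in Table 1.

```python
def build_query(groups):
    """Join terms with OR inside each concept group, and groups with AND."""
    clauses = []
    for terms in groups:
        quoted = " OR ".join(f'"{term}"' for term in terms)
        clauses.append(f"({quoted})")
    return " AND ".join(clauses)


# Concept groups copied from the Springer search string in Table 1.
springer_groups = [
    ["dengue fever", "dengue incidence", "dengue outbreaks"],
    ["forecasting models", "predictive models"],
    ["machine learning", "statistical models", "ARIMA"],
    ["environmental factors", "climate", "rainfall"],
    ["endemic regions", "tropical areas"],
]

query = build_query(springer_groups)
print(query)
```

Keeping each concept (disease, model type, method, climate predictor, setting) in its own parenthesized OR group makes the AND logic explicit, which is harder to see in a string with deeply nested parentheses.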

 

Screening

Two authors screened the titles and abstracts in accordance with the review questions and the established inclusion and exclusion criteria. Inclusion criteria were primary research in peer-reviewed journals and English-language articles. We excluded systematic review articles, books, conference proceedings, and non-peer-reviewed articles, such as editorials, commentaries, opinion pieces, or short reports. The screening process resulted in the elimination of 1,120 articles that were deemed irrelevant to the review. The remaining 230 articles were then read, including their abstracts, and assessed for eligibility.

Eligibility

A total of 64 full-text articles were retrieved for eligibility assessment. Two authors independently reviewed all full-text articles for eligibility. All studies found to be unrelated to the topic and outcome of interest were excluded, and the reasons for exclusion were recorded. In total, 51 articles were excluded for the following reasons:

  1. studies that did not focus on predicting the number of future cases (n = 14);
  2. studies that did not use or evaluate prediction or forecasting models, including machine learning methods (such as random forest, LSTM) or statistical models (such as ARIMA, Seasonal Autoregressive Integrated Moving Average (SARIMA), regression) (n = 19);
  3. articles that did not involve key climate variables in the forecasting (n = 11);
  4. studies conducted in non-endemic or low prevalence dengue areas (n = 7).

The remaining 13 eligible articles proceeded to quality assessment.
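The record counts reported for the identification, screening, and eligibility stages can be checked with simple arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative, and the step from 230 screened-in records to 64 retrieved full texts is taken as reported rather than derived.

```python
# Counts reported in the identification, screening, and eligibility stages.
identified = 1366
duplicates_removed = 16
excluded_at_screening = 1120
full_text_retrieved = 64

# Full-text exclusions, keyed by the numbered reason in the list above.
full_text_exclusions = {1: 14, 2: 19, 3: 11, 4: 7}

screened = identified - duplicates_removed
remaining_after_screening = screened - excluded_at_screening
excluded_full_text = sum(full_text_exclusions.values())
included = full_text_retrieved - excluded_full_text

print(f"Records screened: {screened}")
print(f"Remaining after title/abstract screening: {remaining_after_screening}")
print(f"Full-text articles excluded: {excluded_full_text}")
print(f"Studies included: {included}")
```

A tally like this is a cheap consistency check before drawing a PRISMA flow diagram, since every box in the diagram must follow from the one above it.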

Quality Assessment

The quality of the studies was assessed using the criteria described in TRIPOD (Transparent Reporting of multivariable prediction models for Individual Prognosis or Diagnosis) [31]. The TRIPOD statement is a checklist of 22 items that are considered essential for the proper reporting of research that develops or validates multivariable prediction models [32]. The TRIPOD guidelines explicitly cover the development and validation of prediction models for diagnosis and prognosis across all medical domains and predictor types. Two authors conducted the quality assessment independently. Reporting scores were obtained by awarding one point for each reported item relevant to the study. The total score was converted to a percentage based on the maximum possible score. Ultimately, 13 articles (each with a percentage score > 70%) were included in the review [21]. Table 2 presents the scores and percentages of each quality assessment adapted from the TRIPOD checklist.

 

Table 2. Quality appraisal score of eligible articles adapted from TRIPOD checklist [32, 42]

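| Checklist section | Topic | Item | [25] | [26] | [27] | [28] | [33] | [34] | [35] | [36] | [37] | [38] | [39] | [40] | [41] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Title and abstract | Title | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | Abstract | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Introduction | Background and objectives | 3a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | | 3b | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Methods | Source of data | 4a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | | 4b | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| | Participants | 5a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | | 5b | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | Outcome | 6a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | Predictors | 7a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | Sample size | 8 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| | Missing data | 9 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | Statistical analysis methods | 10a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | | 10b | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | | 10d | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Results | Participants | 13a | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | | 13b | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | Model development | 14a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | | 14b | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| | Model specification | 15a | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 |
| | | 15b | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | Model performance | 16 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Discussion | Limitations | 18 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | Interpretation | 19b | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | Implications | 20 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Other information | Supplementary information | 21 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| | Funding | 22 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 |
| Final score | | | 27 | 20 | 20 | 20 | 24 | 20 | 21 | 25 | 20 | 20 | 20 | 20 | 20 |
| Percentage | | | 100 | 74.1 | 74.1 | 74.1 | 88.9 | 74.1 | 77.8 | 92.6 | 74.1 | 74.1 | 74.1 | 74.1 | 74.1 |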
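The conversion from a final score to a percentage in Table 2 can be reproduced with a short calculation. This sketch assumes a maximum of 27 points, inferred from the table, where a final score of 27 corresponds to 100%; the names `MAX_SCORE`, `THRESHOLD`, and `to_percentage` are illustrative and not taken from the source.

```python
MAX_SCORE = 27      # assumed: number of scored checklist rows in Table 2
THRESHOLD = 70.0    # inclusion cut-off stated in the text (score > 70%)

# Final scores per study, copied from the "Final score" row of Table 2.
final_scores = {
    "[25]": 27, "[26]": 20, "[27]": 20, "[28]": 20, "[33]": 24,
    "[34]": 20, "[35]": 21, "[36]": 25, "[37]": 20, "[38]": 20,
    "[39]": 20, "[40]": 20, "[41]": 20,
}


def to_percentage(score, max_score=MAX_SCORE):
    """Convert a raw checklist score to a percentage with one decimal."""
    return round(100 * score / max_score, 1)


percentages = {ref: to_percentage(score) for ref, score in final_scores.items()}
passed = [ref for ref, pct in percentages.items() if pct > THRESHOLD]

for ref, pct in percentages.items():
    print(ref, pct)
print(f"{len(passed)} studies exceed {THRESHOLD}%")
```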

 

Data Extraction and Synthesis

The author extracted the data independently using a standardized data extraction form and organized it in a Microsoft Excel worksheet. The information collected included: author (year), country, study design, candidate predictors, research, data frequency, model techniques used, model performance, outcome, model accuracy, evaluation. The PRISMA flowchart is shown in Figure 1.

 

Fig. 1. Systematic review flow.

 

Results

Study characteristics

A total of 13 studies met the eligibility criteria and were included in this systematic review. Of these 13 studies, 4 (31%) were conducted in the Americas, 4 (31%) in East Asia, 4 (31%) in Southeast Asia, and 1 (8%) in South Asia. Brazil was the country with the highest number of eligible studies (n = 4) [25, 26, 33, 34], followed by China (n = 2) [27, 35], Taiwan (n = 2) [36, 37], and Vietnam (n = 2) [28, 38]. Other studies were conducted in Malaysia [39], Sri Lanka [40], and the Philippines [41]. Five (42%) studies were published between 2015 and 2020, 9 studies between 2018–2022, and 7 (58%) studies were published between 2021–2024. The largest share of studies (46%) used weekly data, 23% used monthly data, and the remainder used daily or yearly data. More than half (n = 7; 54%) of the studies used machine learning model techniques [25–28, 33, 36, 39], and the remaining studies (n = 6; 46%) used statistical model techniques [34, 35, 37, 38, 40, 41]. The characteristics of the included studies are summarized in Fig. 2. Details of the characteristics within each study are presented in Table 3.

 

Fig. 2. Study characteristics

 

Table 3. Characteristics and main findings of each study

| Source | Country | Study design | Candidate predictors | Data unit | Model techniques used | Model performance | Outcome | Model accuracy | Evaluation |
|---|---|---|---|---|---|---|---|---|---|
| [25] | Brazil | Observational study | Rainfall, maximum temperature, minimum temperature, relative median temperature, insolation, rate of evaporation, median relative humidity, median wind speed | Monthly | Machine learning (Random Forests, Gradient Boosting, Multilayer Perceptron, Support Vector Regression) | RMSE, MAE (lowest errors with Random Forests) | Monthly cases of dengue | RMSE: 15.5 = 84.5%; MAE: 11.9 = 88.1% | Internal and external |
| [26] | Brazil | Comparative study | Historical dengue cases, climate variables, tweets | Weekly | Machine learning (LSTM, Random Forest, LASSO) | MSE, MSLE | Forecasting dengue incidence | LSTM: MSE = 0.04 (96%), MSLE = 0.01 (100%); Random Forest: MSE = 0.17 (83%), MSLE = 0.13 (87%); LASSO: MSE = 0.4 (60%), MSLE = 0.33 (67%) | Internal and external |
| [27] | China | Spatiotemporal analysis | Imported cases, Tmin, Forest, Pop, Prec, Tmean, GDP, RH, Cropland, Tmax, Impervious, Water | Daily | Random Forest, Gradient Boosting Machine, Support Vector Machine | AUC | Dengue incidence | AUC = 0.91 (91%) | Internal and external |
| [28] | Vietnam | Observational | Climate data (temperature, precipitation, humidity, evaporation, sunshine hours) | Daily | Machine learning (LSTM, LSTM-ATT, CNN, Transformer) | RMSE and MAE | Forecasting dengue fever incidence | RMSE: 1.60; MAE: 1.95; accuracy rate 100% | Internal only |
| [33] | Brazil | Quantitative research design | Epidemiological data, Google search data, weather | Weekly | Random Forest, LASSO regression | RMSE, R², Pearson correlation | Dengue incidence | LASSO = 70%-90%; up to 90% | Internal only |
| [34] | Brazil | Ecological time-series study | Climatic, environmental, social factors | Monthly | Statistical models (ARIMA, ETS, TBATS, BATS, STLM, StructTS, NNETAR, ELM, MLP, null model) | MAPE, relative MAPE, Theil’s U | Dengue cases | ARIMA and TBATS are the best models in various time horizons (12 months, 6 months, and 3 months); model accuracy not mentioned | Internal only |
| [35] | China | Time series analysis | Imported cases, minimum temperature, accumulative precipitation | Monthly | Time series Poisson regression | | Dengue outbreaks | R² = 0.98 (98%) | Internal only |
| [36] | Taiwan | Observational study | Meteorological variables, AQI, vector data | Daily | Machine learning (Random Forest, XGBoost, Logistic Regression) | AUC | Dengue fever incidence | Random Forest: AUC = 0.9547, accuracy = 89.94%; XGBoost: AUC = 0.9329; Logistic Regression: AUC = 0.7905 | Internal only |
| [37] | Taiwan | Observational study | Minimum temperature, maximum cumulative rainfall | Yearly | Poisson regression | MSE | Dengue incidence | MSE for validation set = 2.21; MSE for training set = 2.11 | Internal only |
| [38] | Vietnam | Observational study | Climate variables (temperature, humidity, precipitation), time-shifted variables | Weekly | SARIMAX; XGBoost; LSTM; Negative Binomial Regression | MAE, RMSE, AIC | Weekly dengue case counts | SARIMAX = 25.678 (83.33%); XGBoost = 21.409 (100%); LSTM = 30.456 (70.34%); Negative Binomial Regression = 22.345 (95.78%) | Internal only |
| [39] | Malaysia | Time series analysis | Epidemiological (notified cases, onset cases, interventions), environmental (rainfall, temperature, humidity) | Weekly | Random Forest; Support Vector Machine (SVM); Artificial Neural Network (ANN); Autoregressive Distributed Lag (ADL); Hierarchical Forecasting (Optimal Combination); Hierarchical Forecasting (Bottom Up) | MAPE | Dengue outbreak forecasting | Random Forest = 95% (with all factors); SVM = 92.47%; ANN = 86.10%; ADL = 85.70%; Hierarchical Forecasting (Optimal Combination) = 85.67%; Hierarchical Forecasting (Bottom Up) = 84.85% | Internal only |
| [40] | Sri Lanka | Time series analysis | Historical dengue incidence data | Weekly | Modified ARIMA (statistical) | MAPE | Dengue incidence forecast | MAPE: 1.554 (44.6%) (validation), 0.3184 (training) (68.16%) | Internal only |
| [41] | Philippines | Hybrid model development | Dengue incidence, climate data, past incidence | Weekly | ARIMA, NNAR, ANN, SVM, LSTM | RMSE, MAE, SMAPE | Dengue outbreaks | Hybrid ARIMA-NNAR: ~85% | Internal only |

 

Approach and Accuracy of Forecasting Model for dengue cases

All included studies used modeling approaches such as machine learning and statistical methods for dengue case forecasting. Out of 13 studies, 6 (26.1%) used the random forest approach [25–27, 33, 36, 39], 5 (21.7%) used the LSTM approach [26, 28, 34, 38, 41], 3 (13%) used ARIMA [34, 40, 41], and 2 others used Least Absolute Shrinkage and Selection Operator (LASSO), Gradient Boosting, XGBoost, Poisson regression, or SARIMA. In terms of performance, the studies used different evaluation metrics, including Root Mean Squared Error (RMSE), R-squared (R²), Pearson correlation, Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE), Area Under the Curve (AUC), Mean Squared Error (MSE), Mean Squared Logarithmic Error (MSLE), and Akaike Information Criterion (AIC). The types of model used are shown in Fig. 3.

 

Fig. 3. Type of model technique used.
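For readers less familiar with the error metrics listed above, the following sketch shows their usual textbook definitions using only the Python standard library. The case counts are hypothetical and serve only to illustrate the arithmetic; they are not taken from any included study.

```python
import math


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the case counts."""
    return math.sqrt(mse(actual, predicted))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (undefined if an actual value is 0)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def msle(actual, predicted):
    """Mean squared logarithmic error, computed on log(1 + x)."""
    return sum(
        (math.log1p(a) - math.log1p(p)) ** 2 for a, p in zip(actual, predicted)
    ) / len(actual)


# Hypothetical weekly case counts and forecasts.
actual = [10, 20, 40, 50]
predicted = [12, 18, 44, 50]

for name, metric in [("MSE", mse), ("RMSE", rmse), ("MAE", mae),
                     ("MAPE", mape), ("MSLE", msle)]:
    print(name, round(metric(actual, predicted), 4))
```

Because RMSE and MAE depend on the scale of the case counts while MAPE and MSLE are relative measures, values of different metrics are not directly comparable across studies.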

 

Among the 13 included articles, the three forecasting methods with the highest model accuracy were random forest, LSTM, and LASSO. The 6 articles using the random forest method showed an average model accuracy of 89% [25–27, 33, 36, 39]. Of the 5 articles using the LSTM method, 3 reported model accuracy, with an average of 89% [26, 28, 38], while the other 2 did not report a percentage [34, 41]. For the 2 articles that used the LASSO method, the average model accuracy was 65% [26, 33]. The accuracy of the forecasting models is shown in Fig. 4. In general, all of the case forecasting models included in the review showed fairly good forecasting ability. Overall, climate indicators were the predictors most frequently used in the best-performing models. However, some studies used a combination of climate and epidemiological indicators and showed that previous dengue cases significantly influenced current dengue cases [39].

 

Fig. 4. Average model accuracy
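The averages shown in Fig. 4 are simple means of per-study accuracy values. The source does not state which value was taken from each study, so the selection below is an assumption: one accuracy value per study, picked from Table 3 (for example, the RMSE-based 84.5% for [25], the MSE-based values for [26], and the lower bound of 70% for LASSO in [33]). With these picks the means round to the reported figures.

```python
from statistics import mean

# Assumed per-study accuracy values (%), selected from Table 3.
accuracy_by_model = {
    "Random forest": {"[25]": 84.5, "[26]": 83.0, "[27]": 91.0,
                      "[33]": 90.0, "[36]": 89.94, "[39]": 95.0},
    "LSTM": {"[26]": 96.0, "[28]": 100.0, "[38]": 70.34},
    "LASSO": {"[26]": 60.0, "[33]": 70.0},
}

# Unweighted mean per model family, rounded to a whole percent.
averages = {
    model: round(mean(list(values.values())))
    for model, values in accuracy_by_model.items()
}

for model, avg in averages.items():
    print(f"{model}: {avg}%")
```

An unweighted mean treats every study equally regardless of sample size or metric, which is worth keeping in mind because the underlying percentages come from different error measures.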

 

Random forest model accuracy

Figure 5 illustrates the accuracy of the random forest models applied in dengue forecasting studies. The dataset includes models developed in six original studies, with accuracy values ranging from 83% to 92%. The average model accuracy is 89%. The results highlight the superior predictive performance of random forest models in dengue incidence forecasting, reinforcing their potential for integration into early warning systems for outbreak management.

 

Fig. 5. Random forest model accuracy.

 

Discussion

This systematic review aims to summarize and discuss the evidence on various dengue case forecasting methods, model performance, and their ability to explain dengue incidence. This review shows that dengue prediction has become a topic of research interest, especially in Asia, where 69% of the included studies were conducted. This trend reflects the fact that the Asian region carries about 70% of the global dengue burden [43]. Climate data, particularly temperature, rainfall, and humidity, are important predictors of dengue incidence, but they are often not available in time for health providers working on dengue early warning systems. Several studies have found that countries with better meteorological records achieve higher performance metrics [25, 34, 35]. Therefore, integration with local meteorological departments to obtain real-time meteorological data would improve access to meteorological information and benefit end users in early outbreak detection.

In general, climatic variables play an important role in the prediction of dengue cases. Climate variables such as mean temperature [25, 27, 28, 38, 39], minimum temperature [27, 35–37], maximum temperature [27, 37, 38], rainfall [27, 28, 36, 37, 39], humidity [25, 33, 39, 40], relative humidity [25, 28, 33], wind speed [25, 28, 33], evaporation, and sunshine [28] are important input parameters in the development of dengue incidence prediction models. Temperature showed the best predictive capacity of the meteorological variables studied in this review. In Vietnam, temperature was a significant predictor in the best dengue forecasting model, for which the AUC and sensitivity were 87.42% and 96.88%, respectively [28]. A study in Ba Ria Vung Tau Province, Vietnam, reported temperature and humidity as reliable variables in predicting dengue cases, with an AUC and sensitivity of 90.00% and 85.00%, respectively [38]. Meanwhile, a study in Taiwan showed that temperature and rainfall are important factors in predicting dengue cases, with an AUC and sensitivity of 88% and 80%, respectively [37].

In general, the dengue case prediction models included in the studies demonstrated a relatively high level of predictive ability. However, the predictive accuracy of these models varies considerably depending on the specific model employed and the quality of the data used. The most commonly utilized statistical modeling techniques in dengue research are ARIMA, Generalized Additive Models (GAM), Negative Binomial Regression, and Poisson Regression. ARIMA and GAM are established models for examining the relationship between environmental factors and disease outcomes, as well as for conducting time series prediction analysis [44, 45]. According to recent literature, time series techniques are particularly considered effective in predicting the highly auto-correlated nature of dengue infections [46]. In recent years, data-driven techniques based on machine learning algorithms such as Random Forest, Decision Tree, Support Vector Machine (SVM), and Naïve Bayes have shown promising results in predictive analysis for classification problems [47].
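The autocorrelation mentioned above can be made concrete with a small calculation. The sketch below computes the sample autocorrelation of a case series at a chosen lag; the weekly counts are hypothetical and the function name is illustrative.

```python
from statistics import mean


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    mu = mean(series)
    denominator = sum((x - mu) ** 2 for x in series)
    numerator = sum(
        (series[t] - mu) * (series[t - lag] - mu)
        for t in range(lag, len(series))
    )
    return numerator / denominator


# Hypothetical weekly dengue case counts over one outbreak wave.
weekly_cases = [5, 8, 13, 21, 30, 34, 29, 20, 12, 7, 4, 3]

r1 = autocorrelation(weekly_cases, lag=1)
r4 = autocorrelation(weekly_cases, lag=4)
print(f"lag-1 autocorrelation: {r1:.2f}")
print(f"lag-4 autocorrelation: {r4:.2f}")
```

A high value at short lags means that last week's count carries much of the information about this week's count, which is the property that ARIMA-type models exploit through their autoregressive terms.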

More than half of the included studies rely on machine learning methods, particularly supervised learning models, to assess conventional and novel data streams. Supervised learning models are defined by the use of labeled data sets to train algorithms to accurately classify data or predict outcomes [21]. Machine learning techniques offer several advantages and demonstrate lower error rates than conventional statistical models in predicting dengue cases. In the era of big data, these techniques can exploit the growing availability of data and, being non-parametric, relax strict distributional assumptions [7]. Random forest, neural network, gradient boosting, and support vector algorithms are important machine learning algorithms that have made significant contributions to several areas of public health, especially in forecasting infectious diseases such as COVID-19 [48] and malaria [49], and they have similar uses in predicting dengue outbreaks [7].

Based on the studies included in this review, we consider random forest to be the best-performing machine learning method at present. Findings from Brazil indicate that the accuracy of this model in recognizing dengue cases is more than 90% [33]. Likewise, findings from Malaysia indicate that its accuracy reaches 95% [39]. Another study, conducted in Singapore, similarly reported the potential of random forest and its strong predictive ability in clustering the spatial risk of dengue transmission. The dengue risk map generated using random forest has high accuracy and is a good tool to guide vector control operations, allowing targeted preventive measures before and during dengue outbreaks [50].

All studies employed internal validation to assess the accuracy of their findings. The utility of a forecasting model depends on the certainty of its accuracy, that is, the extent to which it can predict real-world outcomes [51]. Notably, the majority of published models have not undergone real-world validation. Models are unlikely to perform as well in real-world samples as they do in the samples from which they were derived. This discrepancy, known as validity shrinkage, is often substantial. Consequently, future models should include mechanisms for estimating and reporting potential validity shrinkage, as well as predictive validity, in real-world data [52, 53]. External validation, on the other hand, was used in only three studies [25–27]. This is despite the fact that external validation is considered very important for model development and is a key indicator of model performance, because it demonstrates a model's applicability to other participants, centers, regions, or environments [54]. External validation should also be employed when a model is redeveloped, which entails adjusting, updating, or recalibrating the original model based on validation data with the objective of enhancing its performance [55].
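One simple way to report validity shrinkage is the change in error between the data used to develop a model and the data used to validate it. The sketch below is an illustration under stated assumptions, not a method taken from the included studies: the function name is hypothetical, and the inputs are the training and validation MSE values reported for study [37] in Table 3, which come from an internal split. For external validation, the second argument would be the error on an independent dataset.

```python
def validity_shrinkage(development_error, validation_error):
    """Return the absolute and relative increase in error on validation data."""
    absolute = validation_error - development_error
    relative = absolute / development_error
    return absolute, relative


# MSE values reported for study [37] in Table 3 (training vs. validation set).
absolute, relative = validity_shrinkage(
    development_error=2.11,
    validation_error=2.21,
)

print(f"MSE increases by {absolute:.2f} ({relative:.1%}) on the validation set")
```

Reporting the relative change alongside the raw errors makes it easier to compare shrinkage across studies that use different case-count scales.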

It should be noted that this systematic review is not without limitations. Firstly, the majority of the included studies originate from Asia, which encompasses a multitude of non-English speaking countries. Consequently, this review may have overlooked a substantial corpus of related literature published in other languages. Secondly, the inclusion criteria stipulated the necessity for studies to be derived from primary research in peer-reviewed journals. Consequently, preprints and grey literature, such as conference abstracts, committee and government reports, were excluded. It is therefore possible that some studies may have been omitted from our review.

Conclusion

Forecasting of dengue cases is a valuable resource for policymakers formulating strategies to prevent dengue outbreaks, particularly in regions where the disease is endemic. The results of this systematic review indicate that the machine learning method using the random forest algorithm is more effective than other methods, particularly statistical methods. Furthermore, this systematic review presents evidence on predictors in dengue case forecasting, with a focus on incorporating climatic factors to create an early warning system, which can be used as a reference for preventing dengue transmission. The findings from this review have the potential to form the basis for more effective modelling practices in the future. These findings will contribute to the development of robust modelling across different settings and populations and have significant implications for planning and decision-making processes for early dengue intervention and prevention.

About the authors

Agung Sutriyawan

Diponegoro University; Bhakti Kencana University

Author for correspondence.
Email: agung.epid@gmail.com
ORCID iD: 0000-0002-6119-6073

researcher, Diponegoro University; Head, Department of public health, Faculty of health sciences, Bhakti Kencana University

Indonesia, Semarang; Bandung

Mursid Rahardjo

Diponegoro University

Email: mursidraharjo@gmail.com
ORCID iD: 0000-0003-4791-1242

senior researcher, Department of environmental health, Faculty of public health

Indonesia, Semarang

Martini Martini

Diponegoro University

Email: martini@live.undip.ac.id
ORCID iD: 0000-0002-6773-1727

senior researcher, Department of epidemiology, Faculty of public health

Indonesia, Semarang

Dwi Sutiningsih

Diponegoro University

Email: dwi.sutiningsih@live.undip.ac.id
ORCID iD: 0000-0002-4128-6688

senior researcher, Department of epidemiology, Faculty of public health

Indonesia, Semarang

Cheerawit Rattanapan

Mahidol University

Email: cheerawit.rat@mahidol.ac.th
ORCID iD: 0000-0002-1799-422X

senior researcher, ASEAN Institute for Health Development

Thailand, Bangkok

Nur Faeza Abu Kassim

Universiti Sains Malaysia

Email: nurfaeza@usm.my
ORCID iD: 0000-0001-6620-8603

senior researcher, School of biological sciences

Malaysia, Penang

References

  1. Sarker R., Roknuzzaman A.S.M., Haque M.A., et al. Upsurge of dengue outbreaks in several WHO regions: Public awareness, vector control activities, and international collaborations are key to prevent spread. Health Sci. Rep. 2024;7(4):e2034. DOI: https://doi.org/10.1002/hsr2.2034
  2. Hossain M.S., Noman A.A., Mamun S.M.A.A., Mosabbir A.A. Twenty-two years of dengue outbreaks in Bangladesh: epidemiology, clinical spectrum, serotypes, and future disease risks. Trop. Med. Health. 2023;51(1):37. DOI: https://doi.org/10.1186/s41182-023-00528-6
  3. CDC. Dengue on the Rise: Get the Facts. Available at: https://cdc.gov/dengue/stories/dengue-on-the-rise-get-the-facts.html
  4. Trivedi S., Chakravarty A. Neurological complications of dengue fever. Curr. Neurol. Neurosci. Rep. 2022;22(8):515–29. DOI: https://doi.org/10.1007/s11910-022-01213-7
  5. Umakanth M., Suganthan N. Unusual manifestations of dengue fever: a review on expanded dengue syndrome. Cureus. 2020;12(9):e10678. DOI: https://doi.org/10.7759/cureus.10678
  6. Capeding M.R., Tran N.H., Hadinegoro S.R., et al. Clinical efficacy and safety of a novel tetravalent dengue vaccine in healthy children in Asia: a phase 3, randomised, observer-masked, placebo-controlled trial. Lancet. 2014;384(9951):1358–65. DOI: https://doi.org/10.1016/S0140-6736(14)61060-6
  7. Leung X.Y., Islam R.M., Adhami M., et al. A systematic review of dengue outbreak prediction models: Current scenario and future directions. PLoS Negl. Trop. Dis. 2023;17(2):e0010631. DOI: https://doi.org/10.1371/journal.pntd.0010631
  8. Chen H.L., Hsiao W.H., Lee H.C., et al. Selection and characterization of DNA aptamers targeting all four serotypes of dengue viruses. PLoS One. 2015;10(6):e0131240. DOI: https://doi.org/10.1371/journal.pone.0131240
  9. Zhu G., Liu J., Tan Q., Shi B. Inferring the spatio-temporal patterns of dengue transmission from surveillance data in Guangzhou, China. PLoS Negl. Trop. Dis. 2016;10(4):e0004633. DOI: https://doi.org/10.1371/journal.pntd.0004633
  10. Teurlai M., Menkès C.E., Cavarero V., et al. Socio-economic and climate factors associated with dengue fever spatial heterogeneity: a worked example in New Caledonia. PLoS Negl. Trop. Dis. 2015;9(12):e0004211. DOI: https://doi.org/10.1371/journal.pntd.0004211
  11. Phung D., Talukder M.R., Rutherford S., Chu C. A climate-based prediction model in the high-risk clusters of the Mekong Delta region, Vietnam: towards improving dengue prevention and control. Trop. Med. Int. Health. 2016;21(10):1324–33. DOI: https://doi.org/10.1111/tmi.12754
  12. Medlock J.M., Leach S.A. Effect of climate change on vector-borne disease risk in the UK. Lancet Infect. Dis. 2015;15(6):721–30. DOI: https://doi.org/10.1016/S1473-3099(15)70091-5
  13. Benedum C.M., Seidahmed O.M.E., Eltahir E.A.B., Markuzon N. Statistical modeling of the effect of rainfall flushing on dengue transmission in Singapore. PLoS Negl. Trop. Dis. 2018;12(12):e0006935. DOI: https://doi.org/10.1371/journal.pntd.0006935
  14. Gharbi M., Quenel P., Gustave J., et al. Time series analysis of dengue incidence in Guadeloupe, French West Indies: forecasting models using climate variables as predictors. BMC Infect. Dis. 2011;11:166. DOI: https://doi.org/10.1186/1471-2334-11-166
  15. Betanzos-Reyes Á.F., Rodríguez M.H., Romero-Martínez M., et al. Association of dengue fever with Aedes spp. abundance and climatological effects. Salud Publica Mex. 2018;60(1):12–20. DOI: https://doi.org/10.21149/8141
  16. Gluskin R.T., Johansson M.A., Santillana M., Brownstein J.S. Evaluation of Internet-based dengue query data: Google Dengue Trends. PLoS Negl. Trop. Dis. 2014;8(2):e2713. DOI: https://doi.org/10.1371/journal.pntd.0002713
  17. Ogashawara I., Li L., Moreno-Madriñán M.J. Spatial-temporal assessment of environmental factors related to dengue outbreaks in São Paulo, Brazil. Geohealth. 2019;3(8):202–17. DOI: https://doi.org/10.1029/2019GH000186
  18. Anno S., Hara T., Kai H., et al. Spatiotemporal dengue fever hotspots associated with climatic factors in Taiwan including outbreak predictions based on machine-learning. Geospat. Health. 2019;14(2). DOI: https://doi.org/10.4081/gh.2019.771
  19. Baquero O.S., Santana L.M.R., Chiaravalloti-Neto F. Dengue forecasting in São Paulo city with generalized additive models, artificial neural networks and seasonal autoregressive integrated moving average models. PLoS One. 2018;13(4):e0195065. DOI: https://doi.org/10.1371/journal.pone.0195065
  20. Racloz V., Ramsey R., Tong S., Hu W. Surveillance of dengue fever virus: a review of epidemiological models and early warning systems. PLoS Negl. Trop. Dis. 2012;6(5):e1648. DOI: https://doi.org/10.1371/journal.pntd.0001648
  21. Baharom M., Ahmad N., Hod R., Abdul Manaf M.R. Dengue early warning system as outbreak prediction tool: a systematic review. Risk Manag. Healthc. Policy. 2022;15:871–86. DOI: https://doi.org/10.2147/RMHP.S361106
  22. Aburas H.M., Cetiner B.G., Sari M. Dengue confirmed-cases prediction: A neural network model. Expert Syst. Appl. 2010;37(6):4256–60. DOI: https://doi.org/10.1016/j.eswa.2009.11.077
  23. Chang F.S., Tseng Y.T., Hsu P.S., et al. Re-assess vector indices threshold as an early warning tool for predicting dengue epidemic in a dengue non-endemic country. PLoS Negl. Trop. Dis. 2015;9(9):e0004043. DOI: https://doi.org/10.1371/journal.pntd.0004043
  24. Ahmad Qureshi E.M., Tabinda A.B., Vehra S. Predicting dengue outbreak in the metropolitan city Lahore, Pakistan, using dengue vector indices and selected climatological variables as predictors. J. Pak. Med. Assoc. 2017;67(3):416–21.
  25. Roster K., Connaughton C., Rodrigues F.A. Machine-learning-based forecasting of dengue fever in Brazilian cities using epidemiologic and meteorological variables. Am. J. Epidemiol. 2022;191(10):1803–12. DOI: https://doi.org/10.1093/aje/kwac090
  26. Mussumeci E., Codeço Coelho F. Large-scale multivariate forecasting models for Dengue – LSTM versus random forest regression. Spat. Spatiotemporal. Epidemiol. 2020;35:100372. DOI: https://doi.org/10.1016/j.sste.2020.100372
  27. Ren H., Xu N. Forecasting and mapping dengue fever epidemics in China: a spatiotemporal analysis. Infect. Dis. Poverty. 2024;13(1):50. DOI: https://doi.org/10.1186/s40249-024-01219-y
  28. Nguyen V.H., Tuyet-Hanh T.T., Mulhall J., et al. Deep learning models for forecasting dengue fever based on climate data in Vietnam. PLoS Negl. Trop. Dis. 2022;16(6):e0010509. DOI: https://doi.org/10.1371/journal.pntd.0010509
  29. Page M.J., McKenzie J.E., Bossuyt P.M., et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71. DOI: https://doi.org/10.1136/bmj.n71
  30. Lockwood C., Munn Z., Porritt K. Qualitative research synthesis: methodological guidance for systematic reviewers utilizing meta-aggregation. Int. J. Evid. Based Healthc. 2015;13(3):179–87. DOI: https://doi.org/10.1097/XEB.0000000000000062
  31. Moons K.G., Altman D.G., Reitsma J.B., et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Ann. Intern. Med. 2015;162(1):W1–73. DOI: https://doi.org/10.7326/M14-0698
  32. Collins G.S., Reitsma J.B., Altman D.G., Moons K.G. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): the TRIPOD statement. Ann. Intern. Med. 2015;162(1):55–63. DOI: https://doi.org/10.7326/M14-0697
  33. Koplewitz G., Lu F., Clemente L., et al. Predicting dengue incidence leveraging internet-based data sources. A case study in 20 cities in Brazil. PLoS Negl. Trop. Dis. 2022;16(1):e0010071. DOI: https://doi.org/10.1371/journal.pntd.0010071
  34. Lima M.V.M., Laporta G.Z. Evaluation of the models for forecasting dengue in Brazil from 2000 to 2017: An ecological time-series study. Insects. 2020;11(11):794. DOI: https://doi.org/10.3390/insects11110794
  35. Sang S., Gu S., Bi P., et al. Predicting unprecedented dengue outbreak using imported cases and climatic factors in Guangzhou, 2014. PLoS Negl. Trop. Dis. 2015;9(5):e0003808. DOI: https://doi.org/10.1371/journal.pntd.0003808
  36. Kuo C.Y., Yang W.W., Su E.C. Improving dengue fever predictions in Taiwan based on feature selection and random forests. BMC Infect. Dis. 2024;24(Suppl. 2):334. DOI: https://doi.org/10.1186/s12879-024-09220-4
  37. Yuan H.Y., Wen T.H., Kung Y.H., et al. Prediction of annual dengue incidence by hydro-climatic extremes for southern Taiwan. Int. J. Biometeorol. 2019;63(2):259–68. DOI: https://doi.org/10.1007/s00484-018-01659-w
  38. Tuan D.A., Dang T.N. Leveraging climate data for dengue forecasting in Ba Ria Vung Tau Province, Vietnam: An advanced machine learning approach. Trop. Med. Infect. Dis. 2024;9(10):250. DOI: https://doi.org/10.3390/tropicalmed9100250
  39. Ismail S., Fildes R., Ahmad R., et al. The practicality of Malaysia dengue outbreak forecasting model as an early warning system. Infect. Dis. Model. 2022;7(3):510–25. DOI: https://doi.org/10.1016/j.idm.2022.07.008
  40. Karasinghe N., Peiris S., Jayathilaka R., Dharmasena T. Forecasting weekly dengue incidence in Sri Lanka: Modified Autoregressive Integrated Moving Average modeling approach. PLoS One. 2024;19(3):e0299953. DOI: https://doi.org/10.1371/journal.pone.0299953
  41. Chakraborty T., Chattopadhyay S., Ghosh I. Forecasting dengue epidemics using a hybrid methodology. Phys. A: Stat. Mech. Appl. 2019;527:121266. DOI: https://doi.org/10.1016/j.physa.2019.121266
  42. Baharom M., Ahmad N., Hod R., Abdul Manaf M.R. Dengue early warning system as outbreak prediction tool: a systematic review. Risk Manag. Healthc. Policy. 2022;15:871–86. DOI: https://doi.org/10.2147/RMHP.S361106
  43. Ilic I., Ilic M. Global patterns of trends in incidence and mortality of dengue, 1990-2019: An analysis based on the global burden of disease study. Medicina (Kaunas). 2024;60(3):425. DOI: https://doi.org/10.3390/medicina60030425
  44. Nayak S.D.P., Narayan K.A. Prediction of dengue outbreaks in Kerala state using disease surveillance and meteorological data. Int. J. Community Med. Public Health. 2019;6(10):4392. DOI: https://doi.org/10.18203/2394-6040.ijcmph20194500
  45. Liu D., Guo S., Zou M., et al. A dengue fever predicting model based on Baidu search index data and climate data in South China. PLoS One. 2019;14(12):e0226841. DOI: https://doi.org/10.1371/journal.pone.0226841
  46. Johansson M.A., Reich N.G., Hota A., et al. Evaluating the performance of infectious disease forecasts: A comparison of climate-driven and seasonal dengue forecasts for Mexico. Sci. Rep. 2016;6:33707. DOI: https://doi.org/10.1038/srep33707
  47. Salim N.A.M., Wah Y.B., Reeves C., et al. Prediction of dengue outbreak in Selangor Malaysia using machine learning techniques. Sci. Rep. 2021;11(1):939. DOI: https://doi.org/10.1038/s41598-020-79193-2
  48. Bullock J., Luccioni A., Hoffman Pham K., et al. Mapping the landscape of Artificial Intelligence applications against COVID-19. J. Artif. Intell. Res. 2020;69:807–45. DOI: https://doi.org/10.1613/jair.1.12162
  49. Zinszer K., Verma A.D., Charland K., et al. A scoping review of malaria forecasting: past work and future directions. BMJ Open. 2012;2(6):e001992. DOI: https://doi.org/10.1136/bmjopen-2012-001992
  50. Ong J., Liu X., Rajarethinam J., et al. Mapping dengue risk in Singapore using Random Forest. PLoS Negl. Trop. Dis. 2018;12(6):e0006587. DOI: https://doi.org/10.1371/journal.pntd.0006587
  51. Johansson M.A., Apfeldorf K.M., Dobson S., et al. An open challenge to advance probabilistic forecasting for dengue epidemics. Proc. Natl. Acad. Sci. U.S.A. 2019;116(48):24268–74. DOI: https://doi.org/10.1073/pnas.1909865116
  52. Ivanescu A.E., Li P., George B., et al. The importance of prediction model validation and assessment in obesity and nutrition research. Int. J. Obes. (Lond.). 2016;40(6):887–94. DOI: https://doi.org/10.1038/ijo.2015.214
  53. Steyerberg E.W., Lingsma H.F. Predicting citations: Validating prediction models. BMJ. 2008;336(7648):789. DOI: https://doi.org/10.1136/bmj.39542.610000.3A
  54. Moons K.G., de Groot J.A., Bouwmeester W., et al. Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS checklist. PLoS Med. 2014;11(10):e1001744. DOI: https://doi.org/10.1371/journal.pmed.1001744
  55. Moons K.G., Kengne A.P., Grobbee D.E., et al. Risk prediction models: II. External validation, model updating, and impact assessment. Heart. 2012;98(9):691–8. DOI: https://doi.org/10.1136/heartjnl-2011-301247

Supplementary files

1. JATS XML
2. Fig. 1. Systematic review flow.
3. Fig. 2. Study characteristics.
4. Fig. 3. Type of model technique used.
5. Fig. 4. Average model accuracy.
6. Fig. 5. Random forest model accuracy.

Copyright (c) 2025 Sutriyawan A., Rahardjo M., Martini M., Sutiningsih D., Rattanapan C., Kassim N.F.

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

This mass media outlet is registered by the Federal Service for Supervision of Communications, Information Technology and Mass Media (Roskomnadzor).
Registration number and date of the registration decision: ПИ № ФС77-75442, 1 April 2019.

