The COVID-19 pandemic continues to ravage the world. Even as global infections crossed 2.6 million, India’s number at around 21,370 seems modest, given we are home to one-sixth of the world’s population. Based on data from Johns Hopkins, in per capita terms, only 16 in a million people in India are infected by COVID-19, vs 338 in a million people globally (as of 22nd April 2020). Things in India are not as bad… but what does the future look like?
Given my interest in numbers and trends, I have been trying to figure out whether we could forecast the trends for COVID-19. I requested the data behind the popular forecasts (Johns Hopkins-CDDEP, BCG and others), but these were allegedly disputed or not meant for public dissemination, and I did not get a response. In general, I noticed that most of the forecasts did not provide day-wise numbers. On a log scale without supporting numbers, it was difficult to decipher from the presentations and reports what the forecasters wanted to say. I would not have been able to read even my own forecast chart without the accompanying numbers.
Expert forecasts from the US, compiled by fivethirtyeight, varied hugely. My colleague Gunnvant has created a data scraper and visualization tool for COVID-19. However, I did not find any good forecasts for India.
Some bad ones are out there. “A five-member Central team has projected that the number of COVID-19 cases in Mumbai will touch an estimated 42,604 by April 30 and spiral to 6,56,407 by May 15. Based on mathematical modelling for Mumbai by the Union Ministry of Health on April 16”
Source: The Hindu
The assumptions are too simplistic: a doubling time of 3.8 days is maintained throughout the forecast period. Such high numbers are great for scaremongering, grabbing eyeballs and making headlines. The state government is disputing these numbers, and it should. Such “mathematical modelling” has been done by team members who understood neither mathematics nor modelling. These forecasts add negligible value. May I direct these ill-trained forecasters to some courses at Jigsaw Academy …
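As a quick back-of-the-envelope check, the two Mumbai figures quoted above can be plugged into a constant exponential growth formula to see what doubling time they imply. This is only my arithmetic on the published numbers, assuming pure exponential growth between the two dates; it is not the Central team's actual model.

```python
import math

# Figures quoted in the projection above (Mumbai, as reported by The Hindu).
cases_apr30 = 42_604
cases_may15 = 656_407
days_between = 15  # 30 April to 15 May

# Under constant exponential growth:
#   cases(t) = cases(0) * 2 ** (t / doubling_time)
doublings = math.log2(cases_may15 / cases_apr30)
implied_doubling_time = days_between / doublings

print(f"Doublings in {days_between} days: {doublings:.2f}")
print(f"Implied doubling time: {implied_doubling_time:.1f} days")
```

The implied doubling time comes out at roughly 3.8 days, held flat for the whole fortnight, which is exactly the kind of assumption that makes the headline number balloon.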
Given my absolute lack of knowledge on diseases, I was initially hesitant to try to forecast it. I take solace from the words of Mark Weir of Ohio State’s ecology, epidemiology, and population health program:
Source: fivethirtyeight.com
I looked at this as a data forecasting problem and decided to build a simple time series model. Having spent over a decade forecasting revenues, profits and the unknowable stock prices of my coverage universe, I was used to being wrong and forecasting things I had no idea of! Here is the result, the link to my COVID-19 confirmed infections predictions for India: https://docs.google.com/spreadsheets/d/1dc9hwCSz7hoqkgymPghar0AnN80weDgRICQ2qXrmxB0/edit?usp=sharing
When I built the model, these are the things I wanted to have:
You may view the details in the Google sheet. However, you will not be able to edit or change anything. You may copy it to your own Google Drive if you would like to make any changes. All changes in the forecast are recorded, and ideally it will be updated once a day.
The data is sourced from Johns Hopkins (details in the Google spreadsheet). As some of the data is country-wise and some is state-wise (for countries like the US, China and Australia), we use groupby in Python to aggregate it to country level and download the result as an Excel file. We use a simple time series forecasting model to predict the number of confirmed COVID-19 infections over the next seven days. We also highlight the upper bound and lower bound of the estimates, and we check the difference between our mean estimate and the actual numbers. My daily forecasts are available from 11th April, and since then the actual number has been within 5% of the forecast. Here are my forecasts for the next seven days:
Source: https://docs.google.com/spreadsheets/d/1dc9hwCSz7hoqkgymPghar0AnN80weDgRICQ2qXrmxB0/edit?usp=sharing
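For readers who want to reproduce the country-level aggregation mentioned above, here is a minimal sketch using pandas. It assumes the wide layout of the Johns Hopkins time series files (a `Province/State` column, a `Country/Region` column, `Lat`, `Long`, then one column per date). The rows below are placeholder numbers for illustration only, and the helper name `to_country_totals` and the output file name are my own.

```python
import pandas as pd

# Toy rows in the wide layout described above (one column per date).
# The numbers are placeholders, not real case counts.
raw = pd.DataFrame({
    "Province/State": ["New South Wales", "Victoria", None],
    "Country/Region": ["Australia", "Australia", "India"],
    "Lat": [-33.87, -37.81, 21.0],
    "Long": [151.21, 144.96, 78.0],
    "4/20/20": [100, 80, 500],
    "4/21/20": [110, 85, 560],
})

ID_COLS = ("Province/State", "Country/Region", "Lat", "Long")

def to_country_totals(df: pd.DataFrame) -> pd.DataFrame:
    """Sum state-wise rows so that each country has a single row."""
    date_cols = [c for c in df.columns if c not in ID_COLS]
    return df.groupby("Country/Region")[date_cols].sum()

country_totals = to_country_totals(raw)
print(country_totals)

# Saving to Excel (needs an Excel writer such as openpyxl installed):
# country_totals.to_excel("confirmed_by_country.xlsx")
```

Countries that are already reported as a single row, like India, pass through unchanged, while the state-wise rows for Australia are added up.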
The model is a work in progress and I am considering some fine-tuning. The lower bound is easier to predict as it can’t be less than actuals. The upper bound needs to be tested, especially once we are out of lockdown, which may increase the rate of spread. I look forward to extending the duration of the forecast and seeing if we can predict the peak of the infection in India. I hope to share the model soon.
Given these limitations, honestly, I am surprised the simple model has reasonably good predictive power. I decided to post it on a public forum to (i) make myself update it daily and (ii) see if the model continues to predict the numbers as well, especially under public scrutiny!
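Until the model itself is shared, here is a rough sketch of what a simple seven-day forecast with bounds could look like. To be clear, this is not the model behind the spreadsheet: it is a stand-in that fits a straight line to the log of recent cumulative cases, widens the band with the horizon, and floors the lower bound at the latest actual because cumulative counts cannot fall. The function names, the 7-day window and the band multiplier `z` are all assumptions, and the sample history is a made-up series.

```python
import math

def forecast_cumulative(history, horizon=7, window=7, z=2.0):
    """Project cumulative cases from a log-linear trend on recent days.

    Returns one (lower, mean, upper) tuple per future day.
    """
    recent = history[-window:]
    xs = list(range(len(recent)))
    ys = [math.log(v) for v in recent]

    # Ordinary least squares for ys = intercept + slope * xs.
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    intercept = y_mean - slope * x_mean

    # Residual spread in log space sets the width of the band.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sigma = math.sqrt(sum(r ** 2 for r in residuals) / max(len(xs) - 2, 1))

    last_actual = history[-1]
    out = []
    for step in range(1, horizon + 1):
        log_mean = intercept + slope * (xs[-1] + step)
        band = z * sigma * math.sqrt(step)  # band widens with the horizon
        mean = max(math.exp(log_mean), last_actual)
        # Cumulative counts cannot fall, so floor at the latest actual.
        lower = max(math.exp(log_mean - band), last_actual)
        upper = max(math.exp(log_mean + band), mean)
        out.append((lower, mean, upper))
    return out

def pct_error(predicted, actual):
    """Percentage by which the actual differs from the prediction."""
    return 100.0 * (actual - predicted) / predicted

# Made-up history growing about 7% a day, purely for illustration.
history = [round(10_000 * 1.07 ** d) for d in range(10)]
forecast = forecast_cumulative(history)
for day, (lower, mean, upper) in enumerate(forecast, start=1):
    print(f"day +{day}: {lower:,.0f} / {mean:,.0f} / {upper:,.0f}")
```

The `pct_error` helper is the kind of daily check described earlier: compare each day's actual with the mean estimate made for it, and see whether the gap stays within a few percent.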
Note that my predictions keep changing each day as fresh data comes in. My prediction for today’s (23rd April) confirmed cases has increased by 4% over the last seven days. I am searching for the peak and hoping to see the numbers fall. Hopefully, my numbers will prove excessive and we will see them reduce… Unfortunately, the forecasts seem to be edging up. All models are right until they go wrong! Hopefully, this one errs by predicting too much, and the numbers end up being lower than forecast…
Let’s have a more sensible discussion on numbers and expectations. I had estimated that India would have around 11,000 confirmed infections on 14th April and that there would be a push to keep the lockdown intact. With cases around 20,000 currently, heading to around 35,000 by 30th April and expected to cross 40,000 by 3rd May, are we looking at a partial lockdown, at least, continuing? We will know soon enough…
Ok, we all agree that Mumbai hitting 6.5 lakh cases by 15th May is baloney. However, while the experts in the Union Ministry of Health expect over 42,000 confirmed cases in Mumbai by 30th April, do I have the audacity to suggest that the whole of India will have fewer than 42,000 cases by then?
Yes, I do. Game on! And because I back myself, may the better forecaster win!
Disclaimer:
I offer my views with the knowledge that diseases, medicine and healthcare are not my area of expertise. This is an attempt at predictive time series analysis. There are a lot of bad models out there, and I am confident this will be better than most.
Also, given that many discussions on the topic have been polarized by political leanings and viewpoints, I would like to stress that these are not to promote any ideology or offer judgment on government policy decisions.
My only wish is that the governments, both state and central, focus on improving healthcare infrastructure and facilities in India, while they leave the forecasting to those who can!