Predicting Covid-19 Infections Using the SIR Model

Predicting the spread of Covid-19 infections in various areas using a basic rendition of the SIR Model


As of April 15th, 2021 at 5:00 pm, Johns Hopkins University had recorded about 138 million confirmed Covid-19 cases worldwide.

And. that. is. crazy.

When my country, Canada, first went into lockdown, I remember not really understanding how dire the situation was. Our schools closed around March Break, so we were given an extended vacation. “More time to focus on projects”, I thought to myself.

And now, after Ontario has been put into an even stricter lockdown, I can’t help but wonder if the sudden spike in cases could’ve been foreseen. Could our government have done more beforehand in order to avoid this situation? Could Ontario have avoided another big lockdown?

Well, let me introduce you to…

SIR Models.

Okok. I’m sorry! I wasn’t trying to overwhelm you with this picture so let me break it down.

A Guide to SIR Models

A SIR model is a basic set of equations, usually shown as graphs, that is used to model how an infectious disease spreads through a population. The idea of modelling epidemics mathematically goes back to the 1760s, and the SIR model in its modern form dates to 1927 (Kermack and McKendrick). Models of this kind have been used to analyze all types of infectious diseases, from cholera outbreaks to the Ebola epidemic.

The SIR model assumes that a population is broken down into 3 categories which are each represented by a variable:

  • S stands for susceptibles and is the number of individuals who could potentially contract the virus.
  • I stands for infectives and is the number of individuals who have the virus and can transmit it.
  • R stands for removed and is the number of individuals who can no longer catch or spread the virus: they have recovered or are otherwise immune, can’t transmit it to others (for example, because they are in isolation), or have passed away.

Now, to create a SIR model, you must first pick the data that you will use. This includes a specific area (e.g. Italy), a time period (e.g. September to December 2020), and the case data for the disease (e.g. Covid-19 cases).

Next, one needs to establish the independent variable, which in this case is time t (usually measured in days). The dependent variables come in two sets. Set 1 counts people: the number of susceptible individuals [S = S(t)], the number of infected individuals [I = I(t)], and the number of removed individuals [R = R(t)]. Set 2 expresses the same quantities as fractions of the total population N: the susceptible fraction [s(t) = S(t)/N], the infected fraction [i(t) = I(t)/N], and the removed fraction [r(t) = R(t)/N].
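For reference, here is the standard textbook form of the equations that tie these variables together. The symbols β (the transmission rate) and γ (the removal rate) are my own labels and are not defined anywhere above, and the paper I follow later may use different symbols or scaling, so treat this as a sketch of the usual formulation rather than the exact one I worked from.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \, \frac{S(t)\, I(t)}{N}, \\
\frac{dI}{dt} &= \beta \, \frac{S(t)\, I(t)}{N} - \gamma \, I(t), \\
\frac{dR}{dt} &= \gamma \, I(t), \qquad S(t) + I(t) + R(t) = N .
\end{aligned}
```

Dividing each equation by N gives the same system for the fractions: ds/dt = -β s i, di/dt = β s i - γ i, and dr/dt = γ i, with s + i + r = 1.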

From here, it is simply a matter of plugging the data into the variables and working through the numerical calculations.
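If you would rather let a computer do the arithmetic, here is a minimal Python sketch of one way to step a SIR model forward in time, using the simple forward Euler method. It assumes the standard textbook SIR equations with a transmission rate `beta` and a removal rate `gamma`. Those names, the function `simulate_sir`, and every number below are illustrative assumptions; none of them are fitted to real data or taken from my notebook.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, steps_per_day=10):
    """Integrate the standard SIR equations with forward Euler.

    Returns three lists (S, I, R) with one value per day, starting at day 0.
    beta and gamma are per-day rates (hypothetical names, not from the article).
    """
    dt = 1.0 / steps_per_day
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    S, I, R = [s], [i], [r]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt  # people moving S -> I
            new_removals = gamma * i * dt                    # people moving I -> R
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        S.append(s)
        I.append(i)
        R.append(r)
    return S, I, R


# Illustrative values only, not fitted to any real outbreak.
S, I, R = simulate_sir(population=1_000_000, initial_infected=10,
                       beta=0.3, gamma=0.1, days=160)
peak_day = max(range(len(I)), key=I.__getitem__)
print(f"Infections peak on day {peak_day} at about {I[peak_day]:.0f} people")
```

Forward Euler is the closest thing to doing the calculation by hand: each small step just moves a few people from one category to the next. A smaller step (a larger `steps_per_day`) gives a smoother, more accurate curve.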

Or… you could use this SIR model generator. But where’s the fun in that?

Predicting Infections in China & South Korea

Wanting to predict the spread of Covid-19, I decided to reproduce the results from Ian Cooper, Argha Mondal, and Chris G. Antonopoulos’ paper, A SIR model assumption for the spread of COVID-19 in different communities.

This allowed me to familiarize myself with the SIR model calculations and check that my own numbers were accurate. Before coming up with your own predictions, I would definitely suggest taking the data from a research paper and calculating the SIR model for it yourself. Comparing your curves with the published ones lets you see whether any part of your graph has gone wrong.
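Alongside comparing against a paper, a couple of quick sanity checks can catch arithmetic slips early. The sketch below assumes the basic closed-population SIR model, where S + I + R always adds up to the total population N, S never goes up, and R never goes down. The function name `check_sir_series` and the tiny data set are made up purely for illustration.

```python
def check_sir_series(S, I, R, population, tolerance=1e-6):
    """Return a list of problems found in hand-calculated S, I, R series.

    An empty list means the series passed these basic checks.
    """
    if not (len(S) == len(I) == len(R)):
        return ["S, I and R have different lengths"]
    problems = []
    for day, (s, i, r) in enumerate(zip(S, I, R)):
        if min(s, i, r) < 0:
            problems.append(f"day {day}: a category is negative")
        if abs(s + i + r - population) > tolerance * population:
            problems.append(f"day {day}: S + I + R = {s + i + r}, expected {population}")
    for day in range(1, len(S)):
        if S[day] > S[day - 1]:
            problems.append(f"day {day}: S increased")
        if R[day] < R[day - 1]:
            problems.append(f"day {day}: R decreased")
    return problems


# Made-up three-day example with a deliberate slip on day 2 (the total is 995).
S = [990, 970, 940]
I = [10, 25, 45]
R = [0, 5, 10]
problems = check_sir_series(S, I, R, population=1000)
print(problems)
```

These checks only catch bookkeeping mistakes. They cannot tell you whether your rates are sensible, which is why comparing against a published result is still worth the effort.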

I would also like to note that I redrew my graphs online to give them a cleaner look. Anything beats the severely worn-out notebook where all my original calculations and graphs were.

Here were my results:

For China, I decided to focus on the total infections during the period of January to June 2020.
For South Korea, I decided to focus on the active infections at the time from February to June 2020.

In my case, the predictions were focused on the months January to August and February to September, respectively.

Final Thoughts

All in all, SIR models are an extremely sick concept. The ability to almost accurately predict the spread of a virus over a certain period of time is valuable!

Having completed this on my own, I have a better grasp of how virus predictions feed into the work of governments and health agencies. With access to this data, these institutions are able to put precautionary measures in place to avoid events such as a recession or country-wide panic…or even a stricter lockdown.

Still, I would also caution that you doubl-NO TRIPLE CHECK your calculations. I may or may not be speaking from experience when I say one mistake can really change the entire trajectory of your graphs.

hihi 🐳

Thank you so much for reading my article! My name is Alysha Selvarajah and I am a 16 y/o with an interest in exploring new fields (the current one being virology).

Please shoot me a message on Twitter, LinkedIn, or Instagram. I would love to connect with you and chat about anything :)
