COVID-19

Covid-19 discussion, bring your own statistics
Post Reply
KAJ
Clardic Fug
Posts: 197
Joined: Thu Nov 14, 2019 5:05 pm
Location: UK

Re: COVID-19

Post by KAJ » Sat Feb 20, 2021 3:15 pm

shpalman wrote:
Sat Feb 20, 2021 2:56 pm
KAJ wrote:
Sat Feb 20, 2021 2:32 pm
shpalman wrote:
Sat Feb 20, 2021 9:15 am
If you look at James Annan's models, his R is slowly varying most of the time and then jumps to a different value when the government changes the rules.

If you're fitting a polynomial, which is at most second order, to the logarithm of the data then you're already assuming that the data is exponential with a slowly varying rate. And it kind of is, or else it just wouldn't fit that well.
I think we'll have to agree to differ on terminology. When I read "exponential with a varying rate" I think "not exponential", for "exponential with slowly varying rate" I think "approximately exponential".
When I write "exponential with slowly varying rate" I intend "can be described as exponential over a limited time span".

"approximately exponential" could also mean "over a long term such that fluctuations away from the ideal behaviour smooth out" but in this case would mean "getting increasingly inaccurate the further away we get" (from the "limited time span" I mentioned). As such it can't really be extrapolated very far.
Yes. Before retirement I spent a lot of time modelling microbial growth, survival, and death and I get a bit nerdy. Don't take me seriously.

KAJ
Clardic Fug
Posts: 197
Joined: Thu Nov 14, 2019 5:05 pm
Location: UK

Re: COVID-19

Post by KAJ » Sat Feb 20, 2021 3:46 pm

jimbob wrote:
Sat Feb 20, 2021 3:00 pm
<snip>
Deaths declined almost perfectly exponentially from the first peak to August
Yes, looking at deaths by date of death from mid-April to mid-July there's some curvature but its magnitude is small - a straight line is a very good fit (multiple R-squared 0.9822 for the linear fit vs. 0.9854 for the quadratic).
DateDeaths.png

Code: Select all

Analysis of Variance Table

Model 1: log(DateDeaths) ~ poly(date, 2)
Model 2: log(DateDeaths) ~ date
  Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
1     87 1.6778                                  
2     88 2.0432 -1  -0.36542 18.948 3.648e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
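A quick sketch of the same nested-model comparison (illustrative Python rather than the R above, with made-up data that mimics the shape, not the real death series):

```python
import numpy as np

rng = np.random.default_rng(42)

# Made-up illustrative series: ~90 days of log(deaths), almost linear
# with a touch of curvature (NOT the real death data).
days = np.arange(90.0)
log_deaths = 6.5 - 0.03 * days - 5e-5 * days**2 + rng.normal(0, 0.1, 90)

def rss(degree):
    """Residual sum of squares for a polynomial fit of the given degree."""
    coeffs = np.polyfit(days, log_deaths, degree)
    return float(np.sum((log_deaths - np.polyval(coeffs, days)) ** 2))

rss_lin, rss_quad = rss(1), rss(2)

# F statistic for the nested comparison, as in the ANOVA table:
# (drop in RSS per extra parameter) / (residual variance of the larger model)
f_stat = (rss_lin - rss_quad) / (rss_quad / (len(days) - 3))
print(f"linear RSS {rss_lin:.4f}, quadratic RSS {rss_quad:.4f}, F {f_stat:.1f}")
```

The quadratic can only ever reduce the RSS; the F-test asks whether the reduction is bigger than noise alone would give.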


User avatar
Gfamily
After Pie
Posts: 2241
Joined: Mon Nov 11, 2019 1:00 pm
Location: NW England

Re: COVID-19

Post by Gfamily » Sat Feb 20, 2021 3:48 pm

jimbob wrote:
Sat Feb 20, 2021 3:00 pm
Bird on a Fire wrote:
Fri Feb 19, 2021 6:55 pm
So what would an exponential decrease suggest, mechanistically? That there's some asymptotic limit below which transmission won't fall for other reasons? In which case, getting below that limit would require a further development.
I'd say it suggests that although it's a difficult term to determine, R is a reasonable parameter to explain what's happening.


Deaths declined almost perfectly exponentially from the first peak to August
That'll be the asymptotic cases :)
My avatar was a scientific result that was later found to be 'mistaken' - I rarely claim to be 100% correct
ETA 5/8/20: I've been advised that the result was correct, it was the initial interpretation that needed to be withdrawn
Meta? I'd say so!

User avatar
Bird on a Fire
Light of Blast
Posts: 5243
Joined: Fri Oct 11, 2019 5:05 pm
Location: nadir of brie

Re: COVID-19

Post by Bird on a Fire » Sat Feb 20, 2021 4:21 pm

KAJ wrote:
Sat Feb 20, 2021 3:15 pm
shpalman wrote:
Sat Feb 20, 2021 2:56 pm
KAJ wrote:
Sat Feb 20, 2021 2:32 pm

I think we'll have to agree to differ on terminology. When I read "exponential with a varying rate" I think "not exponential", for "exponential with slowly varying rate" I think "approximately exponential".
When I write "exponential with slowly varying rate" I intend "can be described as exponential over a limited time span".

"approximately exponential" could also mean "over a long term such that fluctuations away from the ideal behaviour smooth out" but in this case would mean "getting increasingly inaccurate the further away we get" (from the "limited time span" I mentioned). As such it can't really be extrapolated very far.
Yes. Before retirement I spent a lot of time modelling microbial growth, survival, and death and I get a bit nerdy. Don't take me seriously.
Sorry, I was getting confused. I was thinking of logistic growth curves*, which also use exponentials but can have non-zero asymptotes. Another suggested model structure to compare with the pure exponential ;)

So yes, the relationship with R makes a lot more sense. We can't actively "remove" infections from the population, we can only adjust the rate at which new ones appear.

*I've used them (inter alia) for doing phytoplankton microcosm experiments, and it's super cool when you plot the numbers how tweaking some parameter like a nutrient ratio can cause the population levels to adjust to a new steady state following a logistic process. (We had to count the planktons by eye with a microscope, though, which was incredibly boring. Flow cytometry ftw.)
He has the grace of a swan, the wisdom of an owl, and the eye of an eagle—ladies and gentlemen, this man is for the birds!

KAJ
Clardic Fug
Posts: 197
Joined: Thu Nov 14, 2019 5:05 pm
Location: UK

Re: COVID-19

Post by KAJ » Sat Feb 20, 2021 4:43 pm

Bird on a Fire wrote:
Sat Feb 20, 2021 4:21 pm
KAJ wrote:
Sat Feb 20, 2021 3:15 pm
<snip>
Yes. Before retirement I spent a lot of time modelling microbial growth, survival, and death and I get a bit nerdy. Don't take me seriously.
Sorry, I was getting confused. I was thinking of logistic growth curves*, which also use exponentials but can have non-zero asymptotes. Another suggested model structure to compare with the pure exponential ;)

So yes, the relationship with R makes a lot more sense. We can't actively "remove" infections from the population, we can only adjust the rate at which new ones appear.

*I've used them (inter alia) for doing phytoplankton microcosm experiments, and it's super cool when you plot the numbers how tweaking some parameter like a nutrient ratio can cause the population levels to adjust to a new steady state following a logistic process. (We had to count the planktons by eye with a microscope, though, which was incredibly boring. Flow cytometry ftw.)
Yes. Microbial growth curves under constant conditions are usually modelled by sigmoidal curves with 4 (or more) parameters, related to initial value, lag time, growth rate, final value. There's a number of such "primary" models around including the logistic. I was interested in the effect of changed/changing conditions on the parameters of those primary models, as represented by "secondary" models - again there's a number around. It was generally much more expensive to get an anything-like-adequate data set than to do the statistics. /nerdy derail

User avatar
Martin Y
After Pie
Posts: 1748
Joined: Mon Nov 11, 2019 1:08 pm

Re: COVID-19

Post by Martin Y » Sat Feb 20, 2021 4:48 pm

sTeamTraen wrote:
Fri Feb 19, 2021 2:24 pm
Last Saturday evening, David Davis MP went on Twitter to promote a preprint from a hospital in Barcelona that claimed an 80% reduction in ICU admissions and a 60% reduction in deaths simply by administering Vitamin D.

I'm pleased to report that this preprint has now been taken down, and that I may have been able to contribute to that result.
Just catching up but that's remarkable. Am I completely misunderstanding or did that paper really claim to have categorised its Barcelona hospital patient subjects as "White, Asian, African American or Latino"?

User avatar
shpalman
Stummy Beige
Posts: 3621
Joined: Mon Nov 11, 2019 12:53 pm
Location: One step beyond

Re: COVID-19

Post by shpalman » Sat Feb 20, 2021 5:01 pm

Bird on a Fire wrote:
Sat Feb 20, 2021 4:21 pm
KAJ wrote:
Sat Feb 20, 2021 3:15 pm
shpalman wrote:
Sat Feb 20, 2021 2:56 pm

When I write "exponential with slowly varying rate" I intend "can be described as exponential over a limited time span".

"approximately exponential" could also mean "over a long term such that fluctuations away from the ideal behaviour smooth out" but in this case would mean "getting increasingly inaccurate the further away we get" (from the "limited time span" I mentioned). As such it can't really be extrapolated very far.
Yes. Before retirement I spent a lot of time modelling microbial growth, survival, and death and I get a bit nerdy. Don't take me seriously.
Sorry, I was getting confused. I was thinking of logistic growth curves*, which also use exponentials but can have non-zero asymptotes. Another suggested model structure to compare with the pure exponential ;)

So yes, the relationship with R makes a lot more sense. We can't actively "remove" infections from the population, we can only adjust the rate at which new ones appear.

*I've used them (inter alia) for doing phytoplankton microcosm experiments, and it's super cool when you plot the numbers how tweaking some parameter like a nutrient ratio can cause the population levels to adjust to a new steady state following a logistic process. (We had to count the planktons by eye with a microscope, though, which was incredibly boring. Flow cytometry ftw.)
People suggested logistic curves for modelling the pandemic on the way up, as if the reduction in new infection rate was due to the finite size of the population and not the lockdown measures (or people personally deciding to take precautions).

We do sort-of "remove" infections from the population, in the sense that we quarantine them.

The usual simple model for this sort of thing is the SIR model, in which you have a Susceptible population, an Infected population, and a Recovered population, and new infections rely on Infected people meeting Susceptible people. That will give exponential growth at the beginning until a substantial fraction of the population has been infected. However it doesn't make total sense to try to fit the covid stats with this since new infections are more likely caused by people who we don't (yet) know are themselves infected. You'd need to add the Exposed population, the Quarantined population...
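A minimal sketch of that SIR model (illustrative Python; the parameter values are assumed for demonstration, not fitted to any covid data):

```python
# S' = -beta*S*I/N;  I' = beta*S*I/N - gamma*I;  R' = gamma*I
def sir(N=1_000_000, I0=10, beta=0.25, gamma=0.1, days=300, dt=0.1):
    """Euler-integrate the SIR equations; returns the I trajectory per step."""
    S, I, R = N - I0, float(I0), 0.0
    history = []
    for _ in range(int(days / dt)):
        new_inf = beta * S * I / N * dt  # new infections need S meeting I
        new_rec = gamma * I * dt         # recoveries at a constant rate
        S, I, R = S - new_inf, I + new_inf - new_rec, R + new_rec
        history.append(I)
    return history

I_hist = sir()
peak = max(I_hist)
# Early growth is near-exponential; it only rolls over once S is depleted.
print(f"peak infected: {peak:.0f}; infected at day 300: {I_hist[-1]:.2f}")
```

With quarantine, exposure, etc. you'd bolt on extra compartments (SEIR, SEIQR, ...) in the same style.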
molto tricky

User avatar
sTeamTraen
Dorkwood
Posts: 1584
Joined: Mon Nov 11, 2019 4:24 pm
Location: Palma de Mallorca, Spain

Re: COVID-19

Post by sTeamTraen » Sat Feb 20, 2021 8:58 pm

Martin Y wrote:
Sat Feb 20, 2021 4:48 pm
sTeamTraen wrote:
Fri Feb 19, 2021 2:24 pm
Last Saturday evening, David Davis MP went on Twitter to promote a preprint from a hospital in Barcelona that claimed an 80% reduction in ICU admissions and a 60% reduction in deaths simply by administering Vitamin D.

I'm pleased to report that this preprint has now been taken down, and that I may have been able to contribute to that result.
Just catching up but that's remarkable. Am I completely misunderstanding or did that paper really claim to have categorised its Barcelona hospital patient subjects as "White, Asian, African American or Latino"?
Yes, that was part of the weirdness. First, Spanish hospitals don't (according to my contacts) ask for ethnicity data. Second, the categories are weird (and again, to make sense of them, we'd need to know what the reference categories are, and there is no official list because Spain doesn't classify people that way). Maybe "African American" is the authors trying to say "Black", and "Latino" means "Latin American", but I would expect far more of the early infected population in Barcelona to be Latin American than Asian, and the numbers here are the other way round.

I also note that one of the giveaways with the obviously fabricated Surgisphere data was that it required hospitals on six continents to be collecting racial classification data using the US model (White, Black, Asian, Native American, Hawaiian, and Pacific Islander, with "Hispanic" as an orthogonal category). Not going to happen in Lagos or Berlin.
Something something hammer something something nail

User avatar
JQH
Dorkwood
Posts: 1260
Joined: Mon Nov 11, 2019 3:30 pm
Location: Sar Flandan

Re: COVID-19

Post by JQH » Sun Feb 21, 2021 12:24 am

Gfamily wrote:
Sat Feb 20, 2021 3:48 pm
jimbob wrote:
Sat Feb 20, 2021 3:00 pm
Bird on a Fire wrote:
Fri Feb 19, 2021 6:55 pm
So what would an exponential decrease suggest, mechanistically? That there's some asymptotic limit below which transmission won't fall for other reasons? In which case, getting below that limit would require a further development.
I'd say it suggests that although it's a difficult term to determine, R is a reasonable parameter to explain what's happening.


Deaths declined almost perfectly exponentially from the first peak to August
That'll be the asymptotic cases :)
Very good!
And remember that if you botch the exit, the carnival of reaction may be coming to a town near you.

Fintan O'Toole

Millennie Al
Snowbonk
Posts: 539
Joined: Mon Mar 16, 2020 4:02 am

Re: COVID-19

Post by Millennie Al » Sun Feb 21, 2021 1:39 am

Bird on a Fire wrote:
Fri Feb 19, 2021 6:55 pm
So what would an exponential decrease suggest, mechanistically? That there's some asymptotic limit below which transmission won't fall for other reasons? In which case, getting below that limit would require a further development.
Quite the opposite. There's some limit below which the disease completely dies out. For example, if the number of cases halves every week, then we might have weeks with 16, 8, 4, 2, 1 after which you can't have half a case, so a more appropriate interpretation is that you have a 50% chance of one case and a 50% chance of zero. But zero is an exceptional value. Once it hits zero, there's no coming back.
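That coin-flip reading of low counts can be made concrete with a toy simulation (illustrative Python; the 16-case start and 50% weekly "survival" are just the example numbers above):

```python
import random

random.seed(1)

def weeks_to_extinction(n0=16, p=0.5):
    """Run the chain 16, ~8, ~4, ... until it hits the absorbing zero."""
    n, weeks = n0, 0
    while n > 0:
        # each current case independently seeds one next-week case with prob p
        n = sum(1 for _ in range(n) if random.random() < p)
        weeks += 1
    return weeks

runs = [weeks_to_extinction() for _ in range(1000)]
mean_weeks = sum(runs) / len(runs)
# Once a run reaches 0 cases it can never come back.
print(f"mean weeks to die out from 16 cases: {mean_weeks:.1f}")
```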
Covid-19 - Don't catch it: don't spread it.

Millennie Al
Snowbonk
Posts: 539
Joined: Mon Mar 16, 2020 4:02 am

Re: COVID-19

Post by Millennie Al » Sun Feb 21, 2021 2:01 am

KAJ wrote:
Sat Feb 20, 2021 2:32 pm
shpalman wrote:
Sat Feb 20, 2021 9:15 am
If you look at James Annan's models, his R is slowly varying most of the time and then jumps to a different value when the government changes the rules.

If you're fitting a polynomial, which is at most second order, to the logarithm of the data then you're already assuming that the data is exponential with a slowly varying rate. And it kind of is, or else it just wouldn't fit that well.
I think we'll have to agree to differ on terminology. When I read "exponential with a varying rate" I think "not exponential", for "exponential with slowly varying rate" I think "approximately exponential".
Think of it as being like riding a bike. Each revolution of the pedals corresponds to a fixed movement of the back wheel. Then you change gears and it changes, but only to a different fixed ratio. Regardless of how many gears you have, the underlying mechanism is fundamentally a fixed ratio.

Similarly for infections: the only way to catch it is from someone who has it, so for a given set of rules and attitudes, if twice as many people have it, then twice as many will catch it. The underlying mechanism is exponential (well, actually, it's logistic, but that only matters much later and premature consideration of that leads to the herd immunity delusion).

If the fall is deviating from the previous exponential, I'd guess it's a combination of people feeling a bit safer due to vaccinations (i.e. they perceive that fewer other people will be infectious, so they're at lower risk of exposure), feeling safer due to the numbers falling (same reasoning), and people starting to do things that they had been putting off but got tired of waiting for.

People's instincts with regard to the numbers are completely wrong because exponential growth and decay is totally non-intuitive. There's a common puzzle which says that if there's a pond with a lily that grows so that it covers twice the area in one day, and it covers the pond in 30 days, how long does it take to cover half the pond? It counts as a puzzle because so many people are tempted to say 15 days. There are lots of variants of this puzzle because exponential functions can only be dealt with through understanding the maths. In particular, exponentials beat any polynomial, so even attempting to allow for them intuitively, you underestimate.

And it goes the other way too. If a fall from 10,000 cases to 1,000 cases takes a week, then the next week sees a fall to 100. The first week saw an improvement of 9,000, but this one only improved by 900, which will seem very disappointing. Then the next week only sees a fall to 10, which is even worse. So it becomes very difficult to persuade people in the face of improvements which diminish so quickly, just as it is difficult to motivate them to take the threat seriously when it is growing.
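The lily-pond answer and the diminishing improvements can be checked in a few lines (illustrative Python, using the numbers from the post):

```python
# Doubling daily and covering the pond on day 30 means half-covered on
# day 29, not day 15: work backwards by halving.
coverage, day = 1.0, 30
while coverage > 0.5:
    coverage /= 2
    day -= 1
print(day)  # 29

# Falling tenfold per week: the absolute improvement shrinks tenfold too.
cases = [10_000, 1_000, 100, 10]
falls = [a - b for a, b in zip(cases, cases[1:])]
print(falls)  # [9000, 900, 90]
```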
Covid-19 - Don't catch it: don't spread it.

KAJ
Clardic Fug
Posts: 197
Joined: Thu Nov 14, 2019 5:05 pm
Location: UK

Re: COVID-19

Post by KAJ » Sun Feb 21, 2021 11:54 am

Millennie Al wrote:
Sun Feb 21, 2021 2:01 am
KAJ wrote:
Sat Feb 20, 2021 2:32 pm
shpalman wrote:
Sat Feb 20, 2021 9:15 am
If you look at James Annan's models, his R is slowly varying most of the time and then jumps to a different value when the government changes the rules.

If you're fitting a polynomial, which is at most second order, to the logarithm of the data then you're already assuming that the data is exponential with a slowly varying rate. And it kind of is, or else it just wouldn't fit that well.
I think we'll have to agree to differ on terminology. When I read "exponential with a varying rate" I think "not exponential", for "exponential with slowly varying rate" I think "approximately exponential".
Think of it as being like riding a bike. Each revolution of the pedals corresponds to a fixed movement of the back wheel. Then you change gears and it changes, but only to a different fixed ratio. Regardless of how many gears you have, the underlying mechanism is fundamentally a fixed ratio.
<snip>
Emphasis added. In my terminology (feel free to use your own, but please be clear) exactly exponential growth/decline has a fixed rate constant. If there is a "gear change" the growth/decline is exponential before and after the change, but not overall.

As I've said before, I'm not equipped to speculate on pandemic mechanisms. I restrict my comments to descriptive statistics and fitting models to data, where I have some experience.

KAJ
Clardic Fug
Posts: 197
Joined: Thu Nov 14, 2019 5:05 pm
Location: UK

Re: COVID-19

Post by KAJ » Sun Feb 21, 2021 12:02 pm

Millennie Al wrote:
Sun Feb 21, 2021 1:39 am
Bird on a Fire wrote:
Fri Feb 19, 2021 6:55 pm
So what would an exponential decrease suggest, mechanistically? That there's some asymptotic limit below which transmission won't fall for other reasons? In which case, getting below that limit would require a further development.
Quite the opposite. There's some limit below which the disease completely dies out. For example, if the number of cases halves every week, then we might have weeks with 16, 8, 4, 2, 1 after which you can't have half a case, so a more appropriate interpretation is that you have a 50% chance of one case and a 50% chance of zero. But zero is an exceptional value. Once it hits zero, there's no coming back.
It's potentially misleading to apply an exactly exponential decay model to small integer values. Such a model can't handle, for example, odd numbers of cases: 60, 30, 15, ???.

An exponential model is an approximation to reality and at small integer values is a not-very-good approximation. At low numbers you really need a probabilistic model.

User avatar
lpm
Stummy Beige
Posts: 2788
Joined: Mon Nov 11, 2019 1:05 pm

Re: COVID-19

Post by lpm » Sun Feb 21, 2021 12:21 pm

The noise increases at low levels. The "single meat packing plant at Doncaster" type effect.

But nobody is going to care about statistics when cases are in the hundreds. What matters is not being misled by noise and detecting underlying deviations from the trendline over the next few months.
What ever happened to that Trump guy, you know, the one who was president for a bit?

User avatar
shpalman
Stummy Beige
Posts: 3621
Joined: Mon Nov 11, 2019 12:53 pm
Location: One step beyond

Re: COVID-19

Post by shpalman » Sun Feb 21, 2021 12:21 pm

KAJ wrote:
Sun Feb 21, 2021 12:02 pm
Millennie Al wrote:
Sun Feb 21, 2021 1:39 am
Bird on a Fire wrote:
Fri Feb 19, 2021 6:55 pm
So what would an exponential decrease suggest, mechanistically? That there's some asymptotic limit below which transmission won't fall for other reasons? In which case, getting below that limit would require a further development.
Quite the opposite. There's some limit below which the disease completely dies out. For example, if the number of cases halves every week, then we might have weeks with 16, 8, 4, 2, 1 after which you can't have half a case, so a more appropriate interpretation is that you have a 50% chance of one case and a 50% chance of zero. But zero is an exceptional value. Once it hits zero, there's no coming back.
It's potentially misleading to apply an exactly exponential decay model to small integer values. Such a model can't handle, for example, odd numbers of cases; 60, 30, 15, ???.
Only because you're thinking in terms of halving time. In that case you might as well extract the thirding time, which would have exactly the same amount of scientific fundamentality (the inverse of the exponential rate is the time it takes for the number to fall to 1/e ~ 0.36788 of the previous value and it's hard to imagine that this will ever be an integer, but you can convert that to any 1/nth-ing time you like with a factor of log_e(n)).
An exponential model is an approximation to reality and at small integer values is a not-very-good approximation. At low numbers you really need a probabilistic model.
Well, yes, but integer case data should vaguely average out to follow the exponential model.

An integer model at zero cases will never get to 1 case, whereas a model allowed to go to fractions of a case might, if R changes to being >1. What does this mean? Well, case data indicates that just over 0.5% of people in Lombardy currently have the covids, that we know of. So maybe we could assume the true number is 1%. Does that mean there's 0.01 of a covid in my apartment? Of course not, either I have covid or I don't, and if I don't and I avoid all contact with humans then it will stay at zero. But then if everybody managed to do that, there'd eventually be no covids. What it more generally expresses is the chance that I'll catch covid as I go about my socially-distanced business. Or in other words it makes no sense to apply the model to such a limited part of the population, especially as I'm anyway connected to a much larger population.
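The rate-to-1/nth-ing-time conversion from the previous paragraph is easy to check numerically (illustrative Python; the rate k is an assumed example value):

```python
import math

# Assumed example rate: cases fall to 1/e of their value in ~10 days.
k = 0.099  # per day

e_folding = 1 / k          # time to fall to 1/e ~ 0.36788
halving = math.log(2) / k  # time to fall to 1/2
thirding = math.log(3) / k # time to fall to 1/3

print(f"1/e time {e_folding:.1f} d, halving {halving:.1f} d, thirding {thirding:.1f} d")

# The ratio thirding/halving is log2(3) regardless of k - neither the
# halving nor the thirding time is more fundamental than the rate itself.
assert abs(thirding / halving - math.log2(3)) < 1e-12
```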
molto tricky

KAJ
Clardic Fug
Posts: 197
Joined: Thu Nov 14, 2019 5:05 pm
Location: UK

Re: COVID-19

Post by KAJ » Sun Feb 21, 2021 3:38 pm

shpalman wrote:
Sun Feb 21, 2021 12:21 pm
KAJ wrote:
Sun Feb 21, 2021 12:02 pm
Millennie Al wrote:
Sun Feb 21, 2021 1:39 am


Quite the opposite. There's some limit below which the disease completely dies out. For example, if the number of cases halves every week, then we might have weeks with 16, 8, 4, 2, 1 after which you can't have half a case, so a more appropriate interpretation is that you have a 50% chance of one case and a 50% chance of zero. But zero is an exceptional value. Once it hits zero, there's no coming back.
It's potentially misleading to apply an exactly exponential decay model to small integer values. Such a model can't handle, for example, odd numbers of cases; 60, 30, 15, ???.
Only because you're thinking in terms of halving time. In that case you might as well extract the thirding time, which would have exactly the same amount of scientific fundamentality (the inverse of the exponential rate is the time it takes for the number to fall to 1/e ~ 0.36788 of the previous value and it's hard to imagine that this will ever be an integer, but you can convert that to any 1/nth-ing time you like with a factor of log_e(n)).
An exponential model is an approximation to reality and at small integer values is a not-very-good approximation. At low numbers you really need a probabilistic model.
Well, yes, but integer case data should vaguely average out to follow the exponential model.

An integer model at zero cases will never get to 1 case, whereas a model allowed to go to fractions of a case might, if R changes to being >1. What does this mean? Well, case data indicates that just over 0.5% of people in Lombardy currently have the covids, that we know of. So maybe we could assume the true number is 1%. Does that mean there's 0.01 of a covid in my apartment? Of course not, either I have covid or I don't, and if I don't and I avoid all contact with humans then it will stay at zero. But then if everybody managed to do that, there'd eventually be no covids. What it more generally expresses, is the chances that I'll catch covid as I go about my socially-distanced business. Or in other words it makes no sense to apply the model to such a limited part of the population, especially as I'm anyway connected to a much larger population.
I think we're agreeing with each other. Exponential growth/decay (dN/dt = kN) is a mathematical model which is attractively simple and closely fits many real world situations. It follows from some quite simple mechanistic models which are often called "exponential", although they are really distinct. Like all models, it "breaks" outside its domain of validity, which is clearer if any underlying mechanistic model is explicit.

Millennie Al answered a mechanistic/real-world question from Bird on a Fire about small numbers, which I think are outside the domain of validity of the exponential model - this triggered my nerd reflex (xkcd). You correctly pointed out that my extension of his example was not a good rebuttal. A more correct phrasing of that argument would have been that the codomain of the exponential model includes non-integers whereas case numbers must be integers - the difference becomes important at low numbers. A better answer would have been that we do not expect the exponential model to [continue to] fit well at low counts.
lpm wrote:
Sun Feb 21, 2021 12:21 pm
The noise increases at low levels. The "single meat packing plant at Doncaster" type effect.

But nobody ~~is going to~~ should care about statistics when cases are in the hundreds. What matters is not being misled by noise and detecting underlying deviations from the trendline over the next few months.
FIFY. Note that Millennie Al answered a question about low numbers with the exponential model.
Continuing in my nerdy vein, I'm a little uneasy about regarding the probabilistic/random/unpredictable part of the data as "noise", implying that it is somehow unreal. I know you didn't mean that and are well aware that it can be relatively substantial.

User avatar
lpm
Stummy Beige
Posts: 2788
Joined: Mon Nov 11, 2019 1:05 pm

Re: COVID-19

Post by lpm » Sun Feb 21, 2021 4:43 pm

But there is a component of noise that is "unreal" - data errors. Somebody forgetting to submit data, rows falling off the bottom of the Excel sheet, or someone misreading handwriting and entering 11 instead of 71. Much of the time they cancel out; sometimes you get a bunch all falling the same way followed by the opposite way the next day.
What ever happened to that Trump guy, you know, the one who was president for a bit?

User avatar
Bird on a Fire
Light of Blast
Posts: 5243
Joined: Fri Oct 11, 2019 5:05 pm
Location: nadir of brie

Re: COVID-19

Post by Bird on a Fire » Sun Feb 21, 2021 5:02 pm

shpalman wrote:
Sat Feb 20, 2021 5:01 pm
We do sort-of "remove" infections from the population, in the sense that we quarantine them.
Well, ok, but they'd still be counted in these case data, which are numbers of infections detected. It's not like a population census where you're e.g. counting all the badgers in the woods, and the numbers depend on births and deaths. The covid numbers are just "number of new badgers detected per week", with the caveat that by the time we detect a badger it's probably already had the chance to reproduce sneakily, and sort of making the assumption that after a badger has been detected we isolate it so it can't breed any further.

(I actually might be doing some S(E)IR modelling later this year, but of information transmission rather than viruses. Right now I don't have a strong intuitive sense of how the different wiggly bits interact.)
He has the grace of a swan, the wisdom of an owl, and the eye of an eagle—ladies and gentlemen, this man is for the birds!

User avatar
Bird on a Fire
Light of Blast
Posts: 5243
Joined: Fri Oct 11, 2019 5:05 pm
Location: nadir of brie

Re: COVID-19

Post by Bird on a Fire » Sun Feb 21, 2021 5:10 pm

lpm wrote:
Sun Feb 21, 2021 4:43 pm
But there is a component of noise that is "unreal" - data errors. Somebody forgetting to submit data, rows falling off the bottom of the Excel sheet, or someone misreading handwriting and entering 11 instead of 71. Much of the time they cancel out; sometimes you get a bunch all falling the same way followed by the opposite way the next day.
I suspect those errors will be relatively small compared to the variation inherent in random processes, though. Even with perfect detection and perfect data collection, a population process occurring at a fixed probability will still wobble about all over the place.

It's relatively straightforward to get the plausible range of variation in silico by running some simulations.
He has the grace of a swan, the wisdom of an owl, and the eye of an eagle—ladies and gentlemen, this man is for the birds!

User avatar
Bird on a Fire
Light of Blast
Posts: 5243
Joined: Fri Oct 11, 2019 5:05 pm
Location: nadir of brie

Re: COVID-19

Post by Bird on a Fire » Sun Feb 21, 2021 7:03 pm

Thought I'd have a play with some trajectories to look at the size of expected variations.

This is a super-simple individual-based model. Each infected individual infects n others, where n is drawn from a Poisson distribution with lambda (i.e. mean and variance) equal to some reproduction number (so assuming a constant halving time). So with R<1, most individuals infect nobody, but some infect 1 or even 2 or 3 etc. This gives a mean that follows an exponential curve, but with stochastic variation around it.

Summing those new infections in each time step gives the total infected population, which then feeds forward to the next time step. A pretty bog standard Markov chain population model, in fact. I've done 1000 reps of each scenario just to get a look.

With large numbers around 30,000 like the UK has had recently, the 'noise' from using random numbers is comparatively small, and doesn't obscure the trend:
Rplot.png

Below 10,000, things are starting to get a bit wobbly:
Rplot01.png

But once we're into the realm of smaller numbers, quite a lot of plausible scenarios can look to the naked eye like a deviation from the exponential trend, even though it's really just the randoms:
Rplot03.png
He has the grace of a swan, the wisdom of an owl, and the eye of an eagle—ladies and gentlemen, this man is for the birds!

User avatar
Bird on a Fire
Light of Blast
Posts: 5243
Joined: Fri Oct 11, 2019 5:05 pm
Location: nadir of brie

Re: COVID-19

Post by Bird on a Fire » Sun Feb 21, 2021 7:07 pm

R code for the above:

Code: Select all

# Function to simulate and plot population trajectories

# Four input parameters:
# t = number of time steps to model;
# N0 = initial population size;
# Rt = population growth rate within a time step;
# Nrep = number of population replicates to simulate

sim_exp <- function(t, N0, Rt, Nrep) {
  res <- matrix(NA_real_, nrow=Nrep, ncol=t) # empty matrix to store results
  
  res[,1] <- N0
  
  # each time step: every current infection generates a Poisson(Rt) number
  # of new ones, so the next column is a sum of Poisson draws per replicate
  for (t_i in 2:t) {
    for (N_i in 1:Nrep) {
      res[N_i, t_i] <- sum(rpois(n=res[N_i, (t_i-1)], lambda = Rt))
    }
  }
  
  # reshape to long format: one row per (replicate, time step)
  res_df <- as.data.frame(res)
  res_df$pop <- 1:Nrep
  
  require(reshape2)
  res_long <- melt(res_df, id.vars = 'pop')
  res_long$T <- rep(1:t, each=Nrep)
  
  require(ggplot2)
  p <- ggplot(res_long, aes(x=T, y=value)) + geom_path(alpha=25/Nrep, aes(group=pop)) + scale_y_log10() +
    ggtitle(paste0('N0 = ', N0, '; Rt = ', Rt,
                   ';\nMean pop size at t=', t, ': ', round(mean(res[,t]),1), 
                   ';\n95% CIs:', round(quantile(res[,t], 0.025)), '-' , round(quantile(res[,t], 0.975)), 
                   '; sd = ', round(sd(res[,t]), 1))) +
    theme_classic() + xlab('Time step') + ylab('N. of infections')
  print(p)
  invisible(p) # return the plot so it can be reused, e.g. with plot_grid()
}

# Generate plots for post
sim_exp(t=12, N0=30000, Rt=0.9, Nrep=1000) 
sim_exp(t=12, N0=10000, Rt=0.9, Nrep=1000) 

p1 <- sim_exp(t=12, N0=1000, Rt=0.9, Nrep=1000) 
p2 <- sim_exp(t=12, N0=500, Rt=0.9, Nrep=1000) 
p3 <- sim_exp(t=12, N0=100, Rt=0.9, Nrep=1000) 
p4 <- sim_exp(t=12, N0=50, Rt=0.9, Nrep=1000) 

require(cowplot)
plot_grid(p1, p2, p3, p4, nrow=2)

User avatar
Bird on a Fire
Light of Blast
Posts: 5243
Joined: Fri Oct 11, 2019 5:05 pm
Location: nadir of brie

Re: COVID-19

Post by Bird on a Fire » Sun Feb 21, 2021 7:18 pm

So to come back to the question of the importance of data errors...

With "large" numbers of cases, the effect of natural variation is relatively small compared with the ongoing trend. Once the numbers are "small", that variation is a considerable proportion of expected change.
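To put rough numbers on that (back-of-envelope, using the same starting sizes as the plots above): the next generation is a sum of N Poisson(Rt) draws, so its standard deviation is sqrt(N*Rt) and the noise-to-signal ratio of a single step shrinks like 1/sqrt(N).

```r
# Coefficient of variation (sd / mean) of one step of the branching
# process: the next step has mean N * Rt and sd sqrt(N * Rt)
Rt <- 0.9
N  <- c(30000, 10000, 1000, 100, 50)
cv <- sqrt(N * Rt) / (N * Rt)
round(cv, 3)  # roughly 0.006 at N = 30000 but about 0.15 at N = 50
```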

I suspect that measurement errors will show the opposite pattern. While there are lots of cases, systems and staff are overwhelmed, there's no time to check anything, and you'll get a comparatively large amount of mess. Once the case numbers are smaller and more manageable, there's time to double-check all the data for every covid patient in the hospital, to run samples twice, possibly even trace contacts and stuff.

I also suspect that as the numbers decline, the proportion of cases that actually get detected and end up in the data will increase for the same sorts of reasons. I have no clue how strong this effect would have to be for the decline to be detectably different from exponential, but it would be a fun thing to add to the model ;)
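One crude way to bolt that onto the model, as a sketch only: thin the true counts with a binomial whose detection probability rises as case numbers fall. The functional form, the helper name `detect_prob` and every parameter value here are invented purely for illustration.

```r
# Hypothetical sketch: observed cases as a binomial thinning of the true
# counts, with detection probability rising as true case numbers fall.
# All names and numbers below are made up for illustration.
detect_prob <- function(N, p_min = 0.4, p_max = 0.9, N_half = 5000) {
  p_max - (p_max - p_min) * N / (N + N_half)  # higher p when N is small
}
set.seed(2)
true_N   <- round(30000 * 0.9^(0:40))  # ideal exponential decline
observed <- rbinom(length(true_N), size = true_N,
                   prob = detect_prob(true_N))
# observed falls more slowly than true_N, so a log-linear fit to the
# observed series would underestimate the true rate of decline
```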

User avatar
shpalman
Stummy Beige
Posts: 3621
Joined: Mon Nov 11, 2019 12:53 pm
Location: One step beyond

Re: COVID-19

Post by shpalman » Sun Feb 21, 2021 7:29 pm

That's interesting but note that it looks worse for small numbers on a log scale because the steps between integer values are more obvious, and zero drops off the bottom.
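The size of those integer steps on a log axis is easy to quantify (just arithmetic):

```r
# Gap on a log10 axis between adjacent integer counts: large at the
# bottom of the scale, invisible at the top
round(diff(log10(c(1, 2, 3, 4, 5))), 3)  # 0.301 0.176 0.125 0.097
round(log10(30001) - log10(30000), 7)    # 1.45e-05
```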
molto tricky

KAJ
Clardic Fug
Posts: 197
Joined: Thu Nov 14, 2019 5:05 pm
Location: UK

Re: COVID-19

Post by KAJ » Sun Feb 21, 2021 7:45 pm

Bird on a Fire wrote:
Sun Feb 21, 2021 7:03 pm
Thought I'd have a play with some trajectories to look at the size of expected variations.

This is a super-simple individual-based model. Each infected individual infects n others, where n is drawn from a Poisson distribution with lambda (ie, mean and s.d.) equal to some reproduction number (so assuming a constant halving time). So with R<1, most individuals infect nobody, but some infect 1 or even 2 or 3 etc. This gives a mean that follows an exponential curve, but with stochastic variation around it.

Summing those new infections in each time step gives the total infected population, which then feeds forward to the next time step. A pretty bog standard Markov chain population model, in fact. I've done 1000 reps of each scenario just to get a look.
Just a couple of nit-picks. Please don't be offended.

"...Poisson distribution with lambda (ie, mean and s.d.)..." should be "mean and variance".
"...reps of each scenario..." In this context time is on an interval scale (Wiki), so the origin is arbitrary. That means the later scenarios (N0 = 10000, 1000, etc.) are just later evolutions of the mean outcome of the first scenario (N0 = 30000). Effectively you've zoomed in on different regions of the tail.
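To put numbers on the time shift (assuming the mean trajectory N0 * Rt^k with Rt = 0.9, as in the posted code):

```r
# Number of steps for the mean to decay from 30000 down to the other
# starting values: each later scenario sits k = log(N/30000) / log(0.9)
# steps along the same mean trajectory
N0 <- 30000
Rt <- 0.9
targets <- c(10000, 1000, 500, 100, 50)
round(log(targets / N0) / log(Rt), 1)  # 10.4 32.3 38.9 54.1 60.7
```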

User avatar
Bird on a Fire
Light of Blast
Posts: 5243
Joined: Fri Oct 11, 2019 5:05 pm
Location: nadir of brie

Re: COVID-19

Post by Bird on a Fire » Sun Feb 21, 2021 7:46 pm

shpalman wrote:
Sun Feb 21, 2021 7:29 pm
That's interesting but note that it looks worse for small numbers on a log scale because the steps between integer values are more obvious, and zero drops off the bottom.
I think the same (relative) fattening of the ribbons is noticeable on a linear scale:
Rplot05.png
Rplot05.png (103.48 KiB) Viewed 112 times
You can also see the trademark right skew of the Poisson process: the median trajectory sits below the mean, with a long tail of trajectories running well above it.

But yeah, if we run that last scenario forward for more time steps you can definitely start to see the effect you're talking about (linear on left, log on right):
Rplot07.png
Rplot07.png (174.02 KiB) Viewed 112 times

Post Reply