What happens with smooth.Rt and gaps in data? #8

@tobadia

From a recent e-mail:

Greetings!

I am Chris Swenson, a data scientist in the United States, working for a multi-state, non-profit hospital system called SSM Health. We have been using the R0 package in R for the past few months, and I have a question about how the smooth.Rt function works.

For the past few months, we’ve been calculating the R(0) / R(t) for specific regions where we have hospitals, to alert the infection control staff to potential surges in COVID-19 cases. I’ve been using the data from Johns Hopkins to supply the county-level regional data as inputs to the estimation. (We’ve had to do some data corrections. For example, some locations appear to skip weekends and then report very high numbers on Mondays.)

It appears that if we’re not careful handling the data, some unexpected results may occur. There have been some situations where the data was not entered correctly, and it appears that the smooth.Rt function will exclude any incomplete periods (e.g., 7-day periods, in our case). Below is some code where I ran the estimate and smoothing for the data, removed 1 day and estimated, removed 1 week and estimated, and compared all three estimates. I also included the final table, and since the data is included in R, you may run the code to compare.

My question is: What happens in the smooth.Rt function when a period is incomplete, for example, the input data only has 5 days instead of 7 in a week? It appears the most recent week of data generates a much smaller estimate after smoothing. Should we be cautious when using the most recent data?
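Until the behaviour for incomplete periods is clarified, one way to sidestep the question is to trim the incidence series to a whole number of periods before estimating. The following is a minimal sketch in base R; `trim_to_full_periods` is a hypothetical helper, not part of the R0 package, and it assumes that dropping the most recent partial period is acceptable for the analysis.

```r
# Hypothetical helper (not part of the R0 package): keep only as many
# observations as fit into a whole number of periods, dropping the
# trailing partial period.
trim_to_full_periods <- function(incid, period = 7L) {
  stopifnot(period >= 1)
  n_keep <- (length(incid) %/% period) * period
  incid[seq_len(n_keep)]  # names (e.g. dates) are preserved
}

# Toy series of 19 daily counts: 2 full weeks plus 5 extra days.
toy <- seq_len(19)
trimmed <- trim_to_full_periods(toy, period = 7L)
print(length(trimmed))  # 14
```

The trimmed vector could then be passed to estimate.R in place of the raw series, so that every period handed to smooth.Rt is complete.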

Provided replicable example

library(R0)
data(Germany.1918)

# Generation time distribution: gamma with mean 3 and sd 1.5
mGT <- generation.time("gamma", c(3,1.5))

# 1) Full series
TD <- estimate.R(Germany.1918, mGT, begin=as.integer(1), end=as.integer(length(Germany.1918)), methods="TD", nsim=100)
TD.weekly <- smooth.Rt(TD$estimates$TD, 7)
init <- TD.weekly$R
#TD.weekly
print(paste('Original: ', length(init)))

# 2) Series with the last day removed
len <- length(Germany.1918)-1
test <- Germany.1918[1:len]
TD <- estimate.R(test, mGT, begin=as.integer(1), end=as.integer(length(test)), methods="TD", nsim=100)
TD.weekly <- smooth.Rt(TD$estimates$TD, 7)
new <- TD.weekly$R
#TD.weekly
print(paste('One Less Day: ', length(new)))

# 3) Series with the last week removed
len <- length(Germany.1918)-7
test2 <- Germany.1918[1:len]
TD <- estimate.R(test2, mGT, begin=as.integer(1), end=as.integer(length(test2)), methods="TD", nsim=100)
TD.weekly <- smooth.Rt(TD$estimates$TD, 7)
new2 <- TD.weekly$R
#TD.weekly
print(paste('One Less Week: ', length(new2)))

# Compare the three estimates; the last week of the full estimate is dropped
# so that all columns have the same length.
df_init_full <- as.data.frame(init)
df_init <- as.data.frame(init)[1:(length(init)-1),]  # parentheses needed: ':' binds tighter than '-'
df_new <- as.data.frame(new)
df_new2 <- as.data.frame(new2)
df_compare <- cbind(df_init, df_new, df_new2)

Comparison of outputs

Week  Full Data  Data Missing 1 Day  Data Missing 1 Week
1     1.8784     1.8784              1.8784
2     1.5810     1.5810              1.5810
3     1.3569     1.3569              1.3569
4     1.1316     1.1316              1.1316
5     0.9615     0.9615              0.9615
6     0.8119     0.8119              0.8119
7     0.8045     0.8045              0.8045
8     0.8396     0.8396              0.8396
9     0.8543     0.8543              0.8543
10    0.8258     0.8258              0.8258
11    0.8544     0.8544              0.8544
12    0.9776     0.9776              0.9776
13    0.9517     0.9517              0.9517
14    0.9273     0.9273              0.9273
15    0.9635     0.9635              0.9635
16    0.9509     0.9509              0.9481
17    0.9827     0.9843              0.4994
18    0.5844

Note: I manually added week 18 from the original estimation for comparison.
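Rather than adding the extra week by hand, the estimates of unequal length could be combined by padding the shorter ones with NA. This is a small sketch in base R; `compare_estimates` is a hypothetical helper, not part of the R0 package, and the values below are toy numbers for illustration only.

```r
# Hypothetical helper (not part of R0): combine estimate vectors of unequal
# length into one data frame, padding the shorter ones with NA at the end.
compare_estimates <- function(...) {
  cols <- list(...)
  n <- max(lengths(cols))
  padded <- lapply(cols, function(x) {
    x <- as.numeric(x)
    length(x) <- n  # extending a vector's length fills the tail with NA
    x
  })
  as.data.frame(padded)
}

# Toy values for illustration only (not real estimates).
full  <- c(1.2, 1.0, 0.9)
short <- c(1.2, 1.0)
cmp <- compare_estimates(full = full, short = short)
print(cmp)
```

With the vectors from the example above, the call would look something like compare_estimates(full = init, minus_day = new, minus_week = new2), which keeps the final week of the full estimate in the table without trimming it first.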
