Compute and visualise incidence (reworking of the original incidence package)

Last update: Nov 22, 2022

Overview

incidence2

incidence2 is an R package that implements functions and classes to compute, handle and visualise incidence from linelist data. It refocusses the scope of the original incidence package. Unlike the original package, incidence2 concentrates only on the initial calculation, manipulation and plotting of the resultant incidence objects.

Installing the package

The development version, which this documentation refers to, can be installed from GitHub with:

if (!require(remotes)) {
  install.packages("remotes")
}
remotes::install_github("reconverse/incidence2", build_vignettes = TRUE)

You can install the current version of the package from the releases page or directly from CRAN with:

install.packages("incidence2")

Resources

Vignettes

A short overview of incidence2 is provided in the worked example below. More detailed tutorials are distributed as vignettes with the package:

vignette("Introduction", package = "incidence2")
vignette("handling_incidence_objects", package = "incidence2")
vignette("customizing_incidence_plots", package = "incidence2")
vignette("alternative_date_groupings", package = "incidence2")

Getting help online

Bug reports and feature requests should be posted on GitHub using the issue system: https://github.com/reconverse/incidence2/issues.
Online documentation: https://www.reconverse.org/incidence2.
All other questions should be posted on the RECON slack channel; see https://www.repidemicsconsortium.org/forum/ for details on how to join.

A quick overview

This short example uses the simulated Ebola Virus Disease (EVD) outbreak from the package outbreaks. It shows how to compute incidence for various time steps and plot the resulting output.

First, we load the data:

library(outbreaks)
library(incidence2)

dat <- ebola_sim_clean$linelist
str(dat)
#> 'data.frame':    5829 obs. of  11 variables:
#>  $ case_id                : chr  "d1fafd" "53371b" "f5c3d8" "6c286a" ...
#>  $ generation             : int  0 1 1 2 2 0 3 3 2 3 ...
#>  $ date_of_infection      : Date, format: NA "2014-04-09" ...
#>  $ date_of_onset          : Date, format: "2014-04-07" "2014-04-15" ...
#>  $ date_of_hospitalisation: Date, format: "2014-04-17" "2014-04-20" ...
#>  $ date_of_outcome        : Date, format: "2014-04-19" NA ...
#>  $ outcome                : Factor w/ 2 levels "Death","Recover": NA NA 2 1 2 NA 2 1 2 1 ...
#>  $ gender                 : Factor w/ 2 levels "f","m": 1 2 1 1 1 1 1 1 2 2 ...
#>  $ hospital               : Factor w/ 5 levels "Connaught Hospital",..: 2 1 3 NA 3 NA 1 4 3 5 ...
#>  $ lon                    : num  -13.2 -13.2 -13.2 -13.2 -13.2 ...
#>  $ lat                    : num  8.47 8.46 8.48 8.46 8.45 ...

Computing and plotting incidence

We compute the weekly incidence:

i_7 <- incidence(dat, date_index = date_of_onset, interval = 7)
i_7
#> An incidence object: 56 x 2
#> date range: [2014-04-07 to 2014-04-13] to [2015-04-27 to 2015-05-03]
#> cases: 5829
#> interval: 7 days
#> cumulative: FALSE
#> 
#>                  date_index count
#>                    <period> <int>
#>  1 2014-04-07 to 2014-04-13     1
#>  2 2014-04-14 to 2014-04-20     1
#>  3 2014-04-21 to 2014-04-27     5
#>  4 2014-04-28 to 2014-05-04     4
#>  5 2014-05-05 to 2014-05-11    12
#>  6 2014-05-12 to 2014-05-18    17
#>  7 2014-05-19 to 2014-05-25    15
#>  8 2014-05-26 to 2014-06-01    19
#>  9 2014-06-02 to 2014-06-08    23
#> 10 2014-06-09 to 2014-06-15    21
#> # … with 46 more rows
summary(i_7)
#> date range: [2014-04-07 to 2014-04-13] to [2015-04-27 to 2015-05-03]
#> cases: 5829
#> interval: 7 days
#> cumulative: FALSE
#> timespan: 392 days
plot(i_7, color = "white")

Notice how specifying the interval as 7 creates weekly intervals with the coverage displayed by date. incidence() also allows us to create year-weekly groupings with the default being weeks starting on a Monday (following the ISO 8601 date and time standard). incidence() can also compute incidence by specified groups using the groups argument. As an example, below we can compute the weekly incidence by gender and plot in a single, stacked chart:

iw <- incidence(dat, interval = "week", date_index = date_of_onset, groups = gender)
iw
#> An incidence object: 109 x 3
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> 
#>    date_index gender count
#>        <yrwk> <fct>  <int>
#>  1   2014-W15 f          1
#>  2   2014-W16 m          1
#>  3   2014-W17 f          4
#>  4   2014-W17 m          1
#>  5   2014-W18 f          4
#>  6   2014-W19 f          9
#>  7   2014-W19 m          3
#>  8   2014-W20 f          7
#>  9   2014-W20 m         10
#> 10   2014-W21 f          8
#> # … with 99 more rows
summary(iw)
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> timespan: 392 days
#> 
#> 1 grouped variable
#> 
#>   gender count
#>   <fct>  <int>
#> 1 f       2934
#> 2 m       2895
plot(iw, fill = "gender", color = "white")

We can also facet our plot (grouping detected automatically):

facet_plot(iw, n_breaks = 3, color = "white")

It is also possible to group by multiple variables specifying different facets and fills:

iw2 <- incidence(dat, date_of_onset, interval = "week",  groups = c(gender, hospital))
iw2
#> An incidence object: 601 x 4
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> 
#>    date_index gender hospital                                     count
#>        <yrwk> <fct>  <fct>                                        <int>
#>  1   2014-W15 f      Military Hospital                                1
#>  2   2014-W16 m      Connaught Hospital                               1
#>  3   2014-W17 f      <NA>                                             2
#>  4   2014-W17 f      other                                            2
#>  5   2014-W17 m      other                                            1
#>  6   2014-W18 f      <NA>                                             1
#>  7   2014-W18 f      Connaught Hospital                               1
#>  8   2014-W18 f      Princess Christian Maternity Hospital (PCMH)     1
#>  9   2014-W18 f      Rokupa Hospital                                  1
#> 10   2014-W19 f      <NA>                                             1
#> # … with 591 more rows
summary(iw2)
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> timespan: 392 days
#> 
#> 2 grouped variables
#> 
#>   gender count
#>   <fct>  <int>
#> 1 f       2934
#> 2 m       2895
#> 
#> 
#>   hospital                                     count
#>   <fct>                                        <int>
#> 1 Military Hospital                              889
#> 2 Connaught Hospital                            1737
#> 3 <NA>                                          1456
#> 4 other                                          876
#> 5 Princess Christian Maternity Hospital (PCMH)   420
#> 6 Rokupa Hospital                                451
facet_plot(iw2, facets = gender, fill = hospital, n_breaks = 3)

Using an alternative function

The incidence() function wraps the date grouping functionality of the grates package, providing an easy-to-use interface for constructing incidence objects. Sometimes, however, you may want greater flexibility in choosing how to transform your "date" inputs. Using the function build_incidence(), you can specify the function you wish to apply. We illustrate this below with the excellent clock package:

library(clock)

# create a week function comparable to above approach
isoweek <- function(x) calendar_narrow(as_iso_year_week_day(x), "week")

clock_week_inci <- 
  build_incidence(
    dat,
    date_index = date_of_onset,
    groups = c(gender, hospital),
    FUN = isoweek
  )

clock_week_inci
#> An incidence object: 601 x 4
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> 
#>    date_index      gender hospital                                     count
#>    <iso_ywd<week>> <fct>  <fct>                                        <int>
#>  1 2014-W15        f      Military Hospital                                1
#>  2 2014-W16        m      Connaught Hospital                               1
#>  3 2014-W17        f      other                                            2
#>  4 2014-W17        f      <NA>                                             2
#>  5 2014-W17        m      other                                            1
#>  6 2014-W18        f      Connaught Hospital                               1
#>  7 2014-W18        f      Princess Christian Maternity Hospital (PCMH)     1
#>  8 2014-W18        f      Rokupa Hospital                                  1
#>  9 2014-W18        f      <NA>                                             1
#> 10 2014-W19        f      Connaught Hospital                               2
#> # … with 591 more rows
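
As a rough illustration of what a custom grouping function does, the base R sketch below maps each date to an ISO year-week label and then counts the cases per label. It is only an analogy for the idea behind the FUN argument, not the package's implementation: iso_week_label and the onset vector are hypothetical names introduced here, only the first two dates come from the linelist shown above, and the "%G-W%V" format codes are assumed to be supported by the platform's date formatting.

```r
# Illustrative onset dates (hypothetical apart from the first two,
# which appear in the str() output of the linelist).
onset <- as.Date(c("2014-04-07", "2014-04-15", "2014-04-21",
                   "2014-04-27", "2014-04-28"))

# Hypothetical grouping function: ISO 8601 year-week label.
# %G is the ISO week-based year, %V the ISO week number.
iso_week_label <- function(x) format(x, "%G-W%V")

# Count the cases falling in each ISO week.
weekly <- as.data.frame(
  table(date_index = iso_week_label(onset)),
  responseName = "count",
  stringsAsFactors = FALSE
)
weekly
```

Any function that turns a vector of dates into a vector of group labels could be swapped in for iso_week_label in this sketch.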

Comments

Default color palette
The 3 requirements of the new color palette would be:

[x] look nice to as many humans as possible

[ ] be colorblind friendly

[ ] correspond to categorical variables

Quite a bit of thinking on these has been done by the viridis package: https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html

As well, these very good resources:

https://personal.sron.nl/~pault/#sec:qualitative

https://www.osapublishing.org/oe/abstract.cfm?uri=oe-21-8-9862

I suggest we use this issue to propose palettes. Ideally put them to a vote at some point.
enhancement help wanted discussion
opened by thibautjombart 10
Not importing incidence 1?

I'm mostly just curious, but what is the reasoning behind the design decision to copy over the code from {incidence} initially instead of importing? From my perspective, if there's a bug in the future, then there are two places where it needs to be fixed.

opened by zkamvar 9
Adding moving average as geom_line

Hey - not a must have but might be nice to have the option to add a moving average line. This is pretty commonly used on messy epi data. Should be quite easy to implement now that {slider} has been released. See r4epi discussion

Maybe should just be left for users to do separately (i.e. add to plot themselves after)?
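
For anyone wanting to do this separately in the meantime, a trailing moving average can be sketched in base R with stats::filter(), so without {slider}. The helper name moving_average is made up for this example, and the counts are the first ten weekly counts from the README output. It is an illustration only, not a proposed implementation.

```r
# Hypothetical helper: trailing moving average over a window of k points.
# stats::filter() is called with its namespace because dplyr masks filter().
moving_average <- function(x, k = 7) {
  as.numeric(stats::filter(x, rep(1 / k, k), sides = 1))
}

# First ten weekly counts from the README example.
weekly_counts <- c(1, 1, 5, 4, 12, 17, 15, 19, 23, 21)

# The first k - 1 values are NA because the window is incomplete there.
ma3 <- moving_average(weekly_counts, k = 3)
ma3
```

The resulting vector could then be drawn as a line over the bars.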
enhancement

opened by aspina7 8
Width argument to specify no gap between bars
Hello! Great improvements on the original package - thank you very much! I really like the ability to facet and to use count data.

I would like to ask if the plotting functions can allow a width argument or otherwise an option for there to be no gap between bars. At the US CDC and it seems in Europe as well (see ref below) there is a traditional guideline that epidemic curves (when large enough that cases are not shown as boxes) should be histograms and not bar charts - or at least that there be no spaces between the bars. If this option can be offered I think it would also offer a solution to the varying width and frequency of "white lines"/gaps between bars, which appear for example in the GitHub readme (below).

From the vignette - "white lines"/bar gaps appearing at different frequencies across the plot

From the vignette - "white lines"/bar gaps of varying thickness across the plot

I tried to include a width argument in plot() but it was not accepted. When I tried to add a geom_col() to plot() and specify width that also did not work. While experimenting, I tried to use geom_col alone directly on a weekly incidence2 object. When I specified width = 7 I was able to achieve non-overlapping bars without any gaps. This makes sense given that it was a weekly incidence object and according to this ggplot2 issue discussion which says that geom_col width is interpreted in absolute units (days in this case).

Here is that example - the outbreaks ebola_sim_clean linelist

pacman::p_load(incidence2, tidyverse, outbreaks)
b <- incidence2::incidence(outbreaks::ebola_sim_clean$linelist, date_index = date_of_onset, groups = gender, interval = "week")
plot(b, fill = gender) # weird varying white "gaps" between bars
ggplot(data = b) + geom_col(aes(x = bin_date, y = count, fill = gender), width = 7) # no gaps

I just wanted to chime in and see if this was something that is possible. Perhaps at the least the width argument could be allowed to pass to the underlying geom_col? Then the user could tinker and find the correct width?

Thanks very much for considering!

ECDC guidelines for presentation of surveillance data
opened by nsbatra 7
Where to put labels on the x-axis?
There have been debates in the past on where dates should appear on the x-axis. I will try to sum up views / things to take into account below, and maybe some will add thoughts to it.

Original incidence package

epicurves were treated as histograms; bars represent case counts between 2 time points, so that e.g. for monthly incidence, a date on the x-axis marks the left hand-side of the bin (label to the left)

for fitting, a single date needs to be associated to a case count; thus we were using the middle of the time interval (label in the middle)

we did not have options for plotting epicurves as points / lines

several users complained that label in the middle was more intuitive

Current considerations

I suspect most epis do not read epicurves as histograms, so label in the middle would make sense

if we add geom_point and geom_line as options for plot and facet_plot, it is preferable to have a consistent label positioning, which works the same for all geoms; label in the middle seems better for this: it still makes sense with geom_bar

model predictions will probably work better with label in the middle

devel-wise, it is safer to go with the least amount of fiddling with ggplot2 handling of the x-axis

discussion
opened by thibautjombart 7
consider using {tsibble}

hi - like the idea of moving this to cleaner syntax!

Just thought I would suggest using tsibble to do some of the underlying legwork. There are yearweek and year* functions and all works pretty clean.

The advantage over aweek is that it is recognised as a date automatically. The disadvantage compared with aweek is that you cannot (as of yet) set a different start day for a week.

Actually just posted issues this morning on {aweek} and {tsibble} about this.

As a sidenote - while in the process of redoing everything might be worth considering renaming just to make certain epis happy and avoid semantic discussions around incidence vs incidence rate vs prevalence (see)
discussion

opened by aspina7 6
Allowing adding non-integer numbers to grate objects
A maybe not very frequent use case: define limits between two time intervals defined by grate, e.g. to visually delineate epochs in a graph using a vertical line.

Currently the following will error on purpose:

> as_yrwk("2021-W03") + 1
[1] "2021-W04"
> as_yrwk("2021-W03") + 1.5
Error: Can only add whole numbers to <yrwk> objects

But unsure if we want to change this or not.
opened by thibautjombart 4

User request: allow date adjustment using % strptime abbreviations syntax

@nsbatra - the following requires the dev (GitHub main/master branches) of incidence2 and grates but hopefully works as you were hoping. Let me know what you think:

library(outbreaks)
library(incidence2)

dat <- ebola_sim_clean$linelist
x <- incidence(dat, date_of_onset, interval = "month")
x
#> An incidence2 object: 13 x 2
#> 5829 cases from 2014-Apr to 2015-Apr
#> interval: 1 month
#> cumulative: FALSE
#> 
#>    date_index count
#>    <month>    <int>
#>  1 2014-Apr       7
#>  2 2014-May      67
#>  3 2014-Jun     102
#>  4 2014-Jul     228
#>  5 2014-Aug     540
#>  6 2014-Sep    1144
#>  7 2014-Oct    1199
#>  8 2014-Nov     779
#>  9 2014-Dec     567
#> 10 2015-Jan     427
#> 11 2015-Feb     307
#> 12 2015-Mar     277
#> 13 2015-Apr     185
 
# centred dates (default for yearweek, single months, quarters and years)
plot(x, color = "white")


# histogram-esque dates on the breaks (defaults to "%Y-%m-%d")
plot(x, color = "white", centre_dates = FALSE)


# can specify a different format
plot(x, color = "white", centre_dates = FALSE, date_format = "%d-%m-%Y")

Created on 2021-05-19 by the reprex package (v2.0.0)

opened by TimTaylor 3

Strategy for renaming functions
For instance, pool may be better named regroup, and I guess there could be more cases like this. Generally speaking, renaming things from the original incidence poses some trade-offs. There are several strategies we may consider:

Stick to the old

We keep old names as much as possible, and only use new names for new features.

Scrap the old

As this is a reboot, we can do away with old names, and rely on documentation for people to find out correspondence. A softer version would be to have incidence2::pool merely return NULL (or an error) and throw a message saying that this feature is now called regroup in incidence2.

Aliases

We could have incidence2::regroup <- incidence2::pool. If so, do we want to:

keep aliases going forward (I think not)

mark old names as deprecated and eventually remove them? It might make sense in terms of transition, but it is weird to develop a new package with already deprecated functions, with a schedule that explicitly plans breaking backward compatibility further down the line.

discussion
opened by thibautjombart 3
Possible new features: subsetting time windows
Subsetting objects by given time windows may be one of the only things made slightly easier in the original incidence package. For instance, x[1:5] would get you the first 5 time steps (days / weeks / months) of the object, which is a little trickier to do now. It would be useful to have some functions helping with this - see some proposed example uses below.

Filter first / last days / weeks / months etc.

Filter the data to retain the first or last data points, predicated on a duration. There is a question here, as to how duration can be specified:

simplest form: duration is provided as integer days

other simple form: duration is provided as integer time intervals (as specified by the bins of the object)

interpreted like the 'interval' argument of incidence2::incidence, so we could do things like "3 months" to have the first 3 months of data (possibly months 1 and 3 not being complete)

Examples would be (depending on the option above we retain):

filter_first(x, 30): retain the first 30 days of data, or all of it if there are less than 30 days

filter_first(x, "1 month"): retain the first month of data; may not be a full month, only data from the first reported month

filter_last(x, "4 weeks"): retain the last 4 weeks of data; the last week may not be complete e.g. if the last date is a Thursday, so this may not be 28 days of data

filter_last(x, 28): retain the last 28 days of data (irrespective of week definition)
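
To make the simplest option concrete (duration given as integer days), here is a minimal base R sketch. filter_first_days() and filter_last_days() are hypothetical names used only for illustration, and the sketch assumes a plain data frame with a Date column called date rather than an incidence object.

```r
# Hypothetical helper: keep rows within the first n_days of the data.
filter_first_days <- function(x, n_days, date_col = "date") {
  dates <- x[[date_col]]
  x[dates < min(dates) + n_days, , drop = FALSE]
}

# Hypothetical helper: keep rows within the last n_days of the data.
filter_last_days <- function(x, n_days, date_col = "date") {
  dates <- x[[date_col]]
  x[dates > max(dates) - n_days, , drop = FALSE]
}

# Made-up daily counts covering 60 consecutive days.
daily <- data.frame(
  date  = as.Date("2014-04-07") + 0:59,
  count = rep(1L, 60)
)

first30 <- filter_first_days(daily, 30)  # first 30 days
last28  <- filter_last_days(daily, 28)   # last 28 days
```

If the data span fewer days than requested, both helpers simply return all rows, which matches the behaviour proposed above.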

Subset

We could re-implement the features of incidence::subset(), but possibly renaming the function. It would merely be a wrapper for filter on dates.
enhancement
opened by thibautjombart 2
On the behaviour of `facet_plot`
Some thoughts on how facetting may work, esp with regards to using groups for facetting and/or color-filling. Nothing hard-set, more for discussion purpose. It would be useful to have a facet argument handling which grouping variables are used for facetting. Together with fill, this should give more flexibility to the user for designing plots with different grouping variables displayed.

Here are some proposed behaviours:

facet_plot(x): plot the incidence object using all grouping variables for facetting

facet_plot(x, facet = "foo"): same, using only variable foo for facetting

facet_plot(x, facet = c(foo, bar)): same, using variables foo and bar

facet_plot(x, facet = "foo", fill = bar): use foo for facetting and bar for filling

facet_plot(x, facet = c("foo", "bar"), fill = bar): use foo and bar for facetting, and bar for filling; redundant, but that's okay, the user asked for it

What do you think?
opened by thibautjombart 2
version2 - dplyr for vctrs_rcrd and POSIXlt support

Currently using only data.table for aggregation which means we cannot support non-atomic columns (see https://github.com/Rdatatable/data.table/issues?q=is%3Aopen+is%3Aissue+label%3A%22non-atomic+column%22 for further discussion). Previously I switched behaviour based on the input and dispatched to either data.table or dplyr accordingly. Need to think about whether I do the same thing going forward or error on known problematic inputs.

I want to avoid changing user inputs so if I do not go with the dual use I'd like to error on POSIXlt input.
version2 development

opened by TimTaylor 1

incidence() works with POSIXt columns when interval = "week" but not interval = "day"

Please place an "x" in all the boxes that apply

[x] I have the most recent version of incidence2 and R
[x] I have found a bug
[x] I have a reproducible example

If a column is stored as POSIXt / datetime rather than a plain Date, incidence() behaves inconsistently: it works when interval = "week" but not when interval = "day".

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(incidence2)

# Create a dataset with Dates stored as datetimes / POSIXt
data("covid19_england_nhscalls_2020", package = "outbreaks")

d <- covid19_england_nhscalls_2020 %>% 
  mutate(across(where(lubridate::is.Date), as.POSIXct, tz = "UTC"))

# Convert this dataset to incidence2

# WORKS
d %>%
  incidence("date",
            interval = "week",
            counts = "count"
  )
#> An incidence object: 27 x 2
#> date range: [2020-W12] to [2020-W38]
#> cases: 4101446
#> interval: 1 (Monday) week 
#> cumulative: FALSE
#> 
#>    date_index  count
#>        <yrwk>  <int>
#>  1   2020-W12 677547
#>  2   2020-W13 866757
#>  3   2020-W14 522298
#>  4   2020-W15 331617
#>  5   2020-W16 212969
#>  6   2020-W17 156984
#>  7   2020-W18 154765
#>  8   2020-W19 117314
#>  9   2020-W20 107629
#> 10   2020-W21  88949
#> # … with 17 more rows

# DOESN'T WORK
d %>%
  incidence("date",
            interval = "day",
            counts = "count"
  )
#> Error in `create_interval_string()`:
#> ! Not implemented for class POSIXct, POSIXt

#> Backtrace:
#>     ▆
#>  1. ├─d %>% incidence("date", interval = "day", counts = "count")
#>  2. └─incidence2::incidence(., "date", interval = "day", counts = "count")
#>  3.   ├─incidence2:::create_interval_string(dat$date_index)
#>  4.   └─incidence2:::create_interval_string.default(dat$date_index)
#>  5.     └─rlang::abort(...)

Created on 2022-11-21 with reprex v2.0.2.9000
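
Until this is fixed, one possible workaround (a sketch, not something confirmed by the maintainers) is to convert datetime columns back to Date before calling incidence(). The data frame below is a made-up stand-in for the NHS calls data.

```r
# Made-up stand-in for a dataset whose dates are stored as POSIXct.
d <- data.frame(
  date  = as.POSIXct(c("2020-03-18", "2020-03-19", "2020-03-20"), tz = "UTC"),
  count = c(10L, 12L, 9L)
)

# Find datetime columns and convert them to plain Dates.
# The time zone is passed explicitly so the calendar day does not shift.
is_datetime <- vapply(d, inherits, logical(1), what = "POSIXt")
d[is_datetime] <- lapply(d[is_datetime], as.Date, tz = "UTC")

str(d)
```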

bug released

opened by Bisaloo 1

Warning about change in tidyselect

Please place an "x" in all the boxes that apply

[x] I have the most recent version of incidence2 and R
[x] I have found a bug
[x] I have a reproducible example

Please include a brief description of the problem with a code example:

library(incidence2)

data(ebola_sim_clean, package = "outbreaks")
dat <- ebola_sim_clean$linelist

inci <- incidence(dat,
                  date_index = date_of_onset,
                  interval = 7,
                  groups = hospital)

green_grey <- "#5E7E80"

facet_plot(inci, fill = green_grey)
#> Warning: Using an external vector in selections was deprecated in tidyselect 1.1.0.
#> ℹ Please use `all_of()` or `any_of()` instead.
#>   # Was:
#>   data %>% select(green_grey)
#> 
#>   # Now:
#>   data %>% select(all_of(green_grey))
#> 
#> See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.

Created on 2022-10-26 with reprex v2.0.2.9000

Looking at the full error stack, this is caused by

https://github.com/reconverse/incidence2/blob/6d527d1f2d5e0de682a435643d2f6d193f3f734e/R/plot.R#L212-L218

Related: #78

bug released

opened by Bisaloo 1

Improve build_incidence() documentation

Currently the build_incidence() documentation does not mention that it cannot be used with the built-in plotting functionality. This should be documented, and potentially we should create a plot method that warns when called directly.

opened by TimTaylor 0

Releases(v1.2.2)

v1.2.2(Aug 23, 2021)
Bug fixes

Fixes bug when input object to incidence is a data.table.

v1.2.1(Jul 15, 2021)
Bug fixes

Fixes bug in incidence() when more than one column was given for the date_index.

Fixes incorrect test that did not take into account changing time zones.

v1.2.0(Jul 7, 2021)
New functions

new_incidence(): A minimal incidence constructor.

validate_incidence(): Check for internal consistency of incidence-like object.

build_incidence(): Allows you to construct an incidence object whilst specifying your own date grouping function.

format.incidence()

Deprecated functions

cumulate() will now give a deprecation error. We have removed the function to avoid users erroneously regressing against a cumulative count.

Bug fixes

Fixes bug in incidence() when dates were a character vector and the default, daily, interval was specified.

Other updates

Now uses dplyr to handle list based columns (e.g. record-type objects from vctrs). For data.frames with only atomic columns, data.table is still used.

Printing and summaries of incidence objects have been improved to remove duplication in the overview section.

v1.1(May 29, 2021)
New function complete_counts().

plot() and facet_plot() now have a centre_dates argument which can be set to FALSE to get histogram-esque date labels for single month, quarter and yearweek groupings.

Internal refactoring due to breaking changes in the upstream grates package.

v1.0.0(Mar 30, 2021)
Due to multiple changes in the underlying representation of incidence2 objects this release may possibly break old workflows particularly those relying on the old implementations of date grouping:

Now uses the package grates for date grouping. This introduces the S3 classes yrwk, yrmon, yrqtr, yr, period and int_period as well as associated constructors which incidence now builds upon. As a result of this the aweek dependency has been dropped.

Adds keep_first and keep_last functions.

Construction of incidence objects now faster due to underlying use of data.table.

v0.2.2(Nov 12, 2020)
Fixes bug in get_interval.

Removes message that was displayed when incidence class dropped.

Refactoring of internal code to improve maintainability.

Tests now use the 3rd edition of testthat.

v0.2.1(Oct 16, 2020)
Fixes bug in as.data.frame.incidence2

Limits internal reliance on dplyr.

v0.2.0(Sep 22, 2020)
Fixes issue with monthly incidence objects when show_cases = TRUE (see #42).

Additional checks added for assessing whether a manipulated incidence object maintains its class.

Improved implementation speed.

NA's now ignored in the count variable of a pre-aggregated input to incidence function.

Fixes axis labelling and spacing.

v0.1.0(Sep 10, 2020)
Initial release.


GitHub Repository https://www.reconverse.org/incidence2/
