Epidemiology analysis package

Overview

zEpid

zEpid is an epidemiology analysis package, providing easy-to-use tools for epidemiologists coding in Python 3.5+. The purpose of this library is to provide a toolset to make epidemiology e-z. A variety of calculations and plots can be generated through various functions. For a sample walkthrough of what this library is capable of, please see the tutorials available at https://github.com/pzivich/Python-for-Epidemiologists

A few highlights: basic epidemiology calculations, functional form assessment plots, effect measure plots, and causal inference tools. Implemented estimators include: inverse probability of treatment weights, inverse probability of censoring weights, inverse probability of missing weights, augmented inverse probability of treatment weights, the time-fixed g-formula, the Monte Carlo g-formula, the iterative conditional g-formula, and targeted maximum likelihood estimation (TMLE). Additionally, generalizability/transportability tools are available, including: inverse probability of sampling weights, the g-transport formula, and doubly robust generalizability/transportability formulas.

If you have any requests for items to be included, please contact me and I will work on adding any requested features. You can contact me either through GitHub (https://github.com/pzivich), email (gmail: zepidpy), or twitter (@zepidpy).

Installation

Installing:

You can install zEpid with pip: `pip install zepid`

Dependencies:

pandas >= 0.18.0, numpy, statsmodels >= 0.7.0, matplotlib >= 2.0, scipy, tabulate

Module Features

Measures

Calculate measures directly from a pandas DataFrame object. Implemented measures include: risk ratio, risk difference, odds ratio, incidence rate ratio, incidence rate difference, number needed to treat, sensitivity, specificity, population attributable fraction, and attributable community risk.

Measures can be directly calculated from a pandas DataFrame object or using summary data.

Other handy features include: splines, a Table 1 generator, interaction contrast, interaction contrast ratio, positive predictive value, negative predictive value, a screening cost analyzer, counternull p-values, and conversion between odds and proportions.
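As a sketch of what the summary-data calculators compute (using only the standard library, not zEpid's actual API), the risk ratio and risk difference with a Wald-type confidence interval can be obtained from 2-by-2 counts:

```python
import math

# Hypothetical 2-by-2 summary counts: a, b = exposed with/without the
# outcome; c, d = unexposed with/without the outcome
a, b, c, d = 25, 75, 10, 90

r1 = a / (a + b)          # risk among the exposed
r0 = c / (c + d)          # risk among the unexposed

risk_ratio = r1 / r0
risk_difference = r1 - r0

# Wald-type 95% confidence interval for the risk ratio (on the log scale)
se_log_rr = math.sqrt(1/a - 1/(a + b) + 1/c - 1/(c + d))
lcl = math.exp(math.log(risk_ratio) - 1.96 * se_log_rr)
ucl = math.exp(math.log(risk_ratio) + 1.96 * se_log_rr)

print(risk_ratio, risk_difference, (lcl, ucl))
```

zEpid's own classes additionally produce formatted output and other measures; the arithmetic above is the core of the risk ratio calculation.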

For guided tutorials with Jupyter Notebooks: https://github.com/pzivich/Python-for-Epidemiologists/blob/master/3_Epidemiology_Analysis/a_basics/1_basic_measures.ipynb

Graphics

Uses matplotlib in the background to generate some useful plots. Implemented plots include: functional form assessment (with statsmodels output), p-value function plots, spaghetti plots, effect measure plots (forest plots), receiver operating characteristic (ROC) curves, dynamic risk plots, and L'Abbé plots.

For examples see: http://zepid.readthedocs.io/en/latest/Graphics.html

Causal

The causal branch includes various estimators for causal inference with observational data. Details on currently implemented estimators are below:

G-Computation Algorithm

Current implementation includes the time-fixed exposure g-formula, the Monte Carlo g-formula, and the iterative conditional g-formula.
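The idea behind the time-fixed g-formula can be sketched in plain pandas, using hypothetical data and a saturated (stratum-mean) outcome model; zEpid's TimeFixedGFormula fits regression models instead:

```python
import pandas as pd

# Hypothetical data: binary treatment A, binary confounder L, binary outcome Y
df = pd.DataFrame({'A': [1, 1, 0, 1, 0, 0, 0, 1],
                   'L': [1, 1, 1, 0, 0, 0, 0, 0],
                   'Y': [1, 0, 1, 1, 0, 0, 1, 0]})

# Saturated outcome model: E[Y | A, L] estimated by stratum-specific means
q = df.groupby(['A', 'L'])['Y'].mean()

# Standardize over the observed distribution of L (plug-in g-formula)
p_l = df['L'].value_counts(normalize=True)
risk_treated = sum(q[(1, l)] * p_l[l] for l in p_l.index)
risk_untreated = sum(q[(0, l)] * p_l[l] for l in p_l.index)
print(risk_treated - risk_untreated)
```

The two standardized risks are the counterfactual risks under "treat everyone" and "treat no one"; their difference is the plug-in estimate of the average treatment effect.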

Inverse Probability Weights

Current implementation includes inverse probability of treatment weights (IPTW), inverse probability of censoring weights (IPCW), and inverse probability of missing weights (IPMW). Diagnostics are also available for IPTW. IPMW supports monotone missing data.
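A minimal IPTW sketch in plain pandas (hypothetical data; the propensity score is taken as the stratum-specific treatment proportion rather than a fitted logistic model, and zEpid's IPTW class offers far more, e.g. stabilization and diagnostics):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 0, 1, 0, 0, 0, 1],
                   'L': [1, 1, 1, 0, 0, 0, 0, 0],
                   'Y': [1, 0, 1, 1, 0, 0, 1, 0]})

# Propensity score Pr(A=1 | L), here by stratum proportions
ps = df.groupby('L')['A'].transform('mean')

# Unstabilized inverse probability of treatment weights
df['w'] = df['A'] / ps + (1 - df['A']) / (1 - ps)

# Weighted (pseudo-population) risks under treatment and no treatment
r1 = (df['w'] * df['A'] * df['Y']).sum() / (df['w'] * df['A']).sum()
r0 = (df['w'] * (1 - df['A']) * df['Y']).sum() / (df['w'] * (1 - df['A'])).sum()
print(r1 - r0)
```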

Augmented Inverse Probability Weights

Current implementation includes the augmented IPTW estimator described by Funk et al. (2011, American Journal of Epidemiology).
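The augmented IPTW estimator combines an outcome model and a propensity model. A sketch with hypothetical data, where stratum means stand in for fitted models (not zEpid's AIPTW class itself):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 0, 1, 0, 0, 0, 1],
                   'L': [1, 1, 1, 0, 0, 0, 0, 0],
                   'Y': [1, 0, 1, 1, 0, 0, 1, 0]})

# Nuisance models: propensity score and outcome predictions by stratum
ps = df.groupby('L')['A'].transform('mean')
q = df.groupby(['A', 'L'])['Y'].mean()
q1 = df['L'].map(lambda l: q[(1, l)])   # predicted Y under A=1
q0 = df['L'].map(lambda l: q[(0, l)])   # predicted Y under A=0

# Doubly robust estimating equations for the two counterfactual means
mu1 = (df['A'] * (df['Y'] - q1) / ps + q1).mean()
mu0 = ((1 - df['A']) * (df['Y'] - q0) / (1 - ps) + q0).mean()
print(mu1 - mu0)
```

The estimator is consistent if either the propensity model or the outcome model is correctly specified, which is the "doubly robust" property.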

Targeted Maximum Likelihood Estimator

TMLE can be estimated through standard logistic regression models or through user-input functions. Alternatively, users can supply machine learning algorithms to estimate the probabilities. Supported machine learning libraries include sklearn.
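The targeting (fluctuation) step that distinguishes TMLE can be sketched with NumPy alone: a one-parameter logistic fluctuation of the initial outcome predictions along the "clever covariate", solved here by Newton-Raphson rather than an offset GLM fit. All inputs are simulated placeholders, not zEpid objects:

```python
import numpy as np

def expit(x): return 1 / (1 + np.exp(-x))
def logit(p): return np.log(p / (1 - p))

# Hypothetical inputs: outcome Y, treatment A, initial outcome predictions
# Q1/Q0 (under A=1 / A=0), and propensity scores g
rng = np.random.RandomState(0)
n = 500
g = rng.uniform(0.2, 0.8, n)
A = rng.binomial(1, g)
Q1 = rng.uniform(0.2, 0.8, n)
Q0 = rng.uniform(0.2, 0.8, n)
QA = np.where(A == 1, Q1, Q0)
Y = rng.binomial(1, QA)

# Clever covariate and one-parameter logistic fluctuation with offset
H = A / g - (1 - A) / (1 - g)
eps = 0.0
for _ in range(50):                     # Newton-Raphson on the score
    p = expit(logit(QA) + eps * H)
    score = np.sum(H * (Y - p))
    info = np.sum(H**2 * p * (1 - p))
    eps += score / info

# Updated (targeted) predictions and the resulting risk difference
Q1_star = expit(logit(Q1) + eps * (1 / g))
Q0_star = expit(logit(Q0) - eps * (1 / (1 - g)))
ate = np.mean(Q1_star - Q0_star)
print(eps, ate)
```

After targeting, the efficient influence curve equation is (approximately) solved, which is what yields TMLE's variance estimate.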

Generalizability / Transportability

For generalizing results or transporting to a different target population, several estimators are available. These include inverse probability of sampling weights, g-transport formula, and doubly robust formulas

Tutorials for the usage of these estimators are available at: https://github.com/pzivich/Python-for-Epidemiologists/tree/master/3_Epidemiology_Analysis/c_causal_inference

G-estimation of Structural Nested Mean Models

Single time-point g-estimation of structural nested mean models is supported.
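For a linear structural nested mean model with a single time point and a correct treatment model, the g-estimating equation has a closed-form solution. A hedged sketch on simulated data with known treatment probabilities (not zEpid's GEstimationSNM API):

```python
import numpy as np

rng = np.random.RandomState(1)
n = 1000
L = rng.binomial(1, 0.5, n)
ps = 0.3 + 0.4 * L                      # Pr(A=1 | L), treated as known here
A = rng.binomial(1, ps)
psi_true = 2.0
Y = 1.0 + psi_true * A + 0.5 * L + rng.normal(0, 1, n)

# Closed-form solution of sum_i (A_i - e(L_i)) * (Y_i - psi * A_i) = 0,
# the g-estimating equation for the SNM E[Y^a - Y^0 | A=a, L] = psi * a
resid = A - ps
psi_hat = np.sum(resid * Y) / np.sum(resid * A)
print(psi_hat)
```

In practice the propensity scores would be estimated (e.g. by logistic regression), and the search-based solver generalizes this to more complex SNMs.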

Sensitivity Analyses

Includes a trapezoidal distribution generator and a corrected risk ratio.
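The corrected risk ratio follows the standard bias formula for an unmeasured binary confounder; a sketch with hypothetical bias parameters (illustrating the arithmetic, not necessarily zEpid's exact function signature):

```python
# Simple bias analysis for an unmeasured binary confounder U
rr_obs = 2.0       # observed exposure-outcome risk ratio
rr_cu = 3.0        # hypothetical confounder-outcome risk ratio
p1, p0 = 0.6, 0.2  # hypothetical prevalence of U in exposed / unexposed

# Bias factor and confounding-corrected risk ratio
bias = (p1 * (rr_cu - 1) + 1) / (p0 * (rr_cu - 1) + 1)
rr_adj = rr_obs / bias
print(rr_adj)
```

Drawing the bias parameters from trapezoidal distributions, as in probabilistic bias analysis, gives a distribution of corrected estimates rather than a single value.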

Tutorials are available at: https://github.com/pzivich/Python-for-Epidemiologists/tree/master/3_Epidemiology_Analysis/d_sensitivity_analyses

Comments
  • Confusing about the time in MonteCarloGFormula

Hi, I am confused about how to use the time and history data. It seems that we should fit a model with data from questionnaires k-1 and k-2, but I couldn't find such a code implementation. I found you used data from time k only while fitting your model.

    #176 exposure_model #207 outcome_model #274 add_covariate_model

    question 
    opened by Jnnocent 14
  • SingleCrossFit `invalid value encountered in log`

@pzivich, when using SingleCrossfitTMLE for a continuous outcome with the sm.Gaussian GLM class, I encountered the following error:

    xxx/lib/python3.7/site-packages/zepid/causal/doublyrobust/crossfit.py:1663: RuntimeWarning: invalid value encountered in log
      log = sm.GLM(ys, np.column_stack((h1ws, h0ws)), offset=np.log(probability_to_odds(py_os)),
    xxx/lib/python3.7/site-packages/zepid/causal/doublyrobust/crossfit.py:1669: RuntimeWarning: invalid value encountered in log
      ystar0 = np.append(ystar0, logistic.cdf(np.log(probability_to_odds(py_ns)) - epsilon[1] / pa0s))
    

Here is how I defined the estimators for the super learner, as well as the parameter inputs.

    link_i = sm.genmod.families.links.identity()
    SL_glm = GLMSL(family = sm.families.family.Gaussian(link=link_i))
    GLMSL(family = sm.families.family.Binomial())
    
    sctmle = SingleCrossfitTMLE(dataset = df, exposure='treatment', outcome='paid_amt', continuous_bound = 0.01)
    sctmle.exposure_model('gender_cd_F + prospective_risk + age_nbr', GLMSL(family = sm.families.family.Binomial()), bound=[0.01, 0.99])
    sctmle.outcome_model('gender_cd_F + prospective_risk + age_nbr', SL_glm)
    sctmle.fit(n_splits = 2, n_partitions=3, random_state=12345, method = 'median')
    sctmle.summary()
    

If I use any other ML estimator, such as Lasso, GBM, or RandomForest from sklearn, for the outcome model, it works fine. The error is only related to the use of the GLMSL family.

    Could you share any idea of the reason of this error and how I can fix this issue? Much appreciated!

    opened by miaow27 8
  • Add G-formula

One lofty goal is to implement the g-formula. Would need to code two versions: time-fixed and time-varying. The chapter by Robins & Hernán is a good reference. I have code that implements the g-formula using pandas. It is reasonably fast.

    TODO: generalize to a class, allow input models then predict, need to determine how to allow users to input custom treatment regimes (all/none/natural course are easy to do), compare results (https://www.ncbi.nlm.nih.gov/pubmed/25140837)

    Time-fixed version will be relatively easy to write up

    Time-varying will need the ability to specify a large amount of models and specify the order in which the models are fit.

Note: I am also considering reorganizing in v0.2.0 so that IPW/g-formula/doubly robust will all be contained within a folder called causal, rather than adding to the current ipw folder

    enhancement 
    opened by pzivich 8
  • generalize branch

    In the future, I think a zepid.causal.generalize branch would be a useful addition. This branch would contain some generalizability and transportability tools. Specifically, the g-transport formula, inverse probability of sampling weights, inverse odds of sampling weights, and doubly robust generalizability.

    Generally, I think I can repurpose a fair bit of the existing code. I need to consider how best to handle the distinction between generalizability (sample from target population) and transportability (sample not from target population). I am imagining that the user would pass in two data sets, the data to estimate the model on, and the other data set to generalize to.

As far as I know, these resources are largely lacking in all other commonly used epi software. Plus, this is becoming an increasingly hot topic in epi (and I think it will catch on more widely once people recognize you can go from your biased sample to a full population under an additional set of assumptions)

    Resources:

    • g-transport and IPSW estimators: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5466356/

    • inverse odds of sampling: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860052/

    • Doubly Robust estimator: https://arxiv.org/pdf/1902.06080.pdf

    • General introduction: https://arxiv.org/pdf/1805.00550.pdf

Notes: Some record of Issa's presentation at CIRG. This is the more difficult doubly robust estimator. It applies when only a subset of the cohort has some measure needed for transportability. Rather than throwing out the individuals who don't have X2 measured, you can use the process in the arXiv paper. For the nested-nested population, the robust estimator has three parts. b_a(X_1, S) is harder to estimate, but you can use the following algorithm:

    1. model Y as a function of X={X1, X2} among S=1 and A=a

    2. Predict \hat{Y} among those with D=1

    3. Model \hat{Y} as X1, S in D=1

    4. Predict \hat{Y*} in D={0, 1}

Also hard to estimate Pr(S=1|X) because X2 is only observed for a subset. Can use an m-estimator to obtain it. Can do this by a weighted regression with weight 1 for D=1 & S=1 and 1/C(X1, S) for D=1 & S=0. This is a little less obvious to me but seems doable
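The four numbered steps above can be sketched with ordinary least squares via NumPy (simulated data and linear working models, purely for illustration of the prediction-of-predictions structure):

```python
import numpy as np

rng = np.random.RandomState(42)
n = 400
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
S = rng.binomial(1, 0.5, n)              # sample membership
A = rng.binomial(1, 0.5, n)              # treatment
D = rng.binomial(1, 0.7, n)              # subset with X2 measured
Y = 1 + X1 + 0.5 * X2 + A + rng.normal(size=n)

def ols_fit(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

a = 1
# Step 1: model Y on X = {X1, X2} among S == 1 and A == a
m = (S == 1) & (A == a)
beta = ols_fit(np.column_stack([np.ones(m.sum()), X1[m], X2[m]]), Y[m])

# Step 2: predict Y-hat among D == 1
d = D == 1
yhat = np.column_stack([np.ones(d.sum()), X1[d], X2[d]]) @ beta

# Step 3: model Y-hat on {X1, S} within D == 1
gamma = ols_fit(np.column_stack([np.ones(d.sum()), X1[d], S[d]]), yhat)

# Step 4: predict Y-star for everyone (D in {0, 1})
ystar = np.column_stack([np.ones(n), X1, S]) @ gamma
print(ystar.mean())
```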

    enhancement Short-term Causal inference 
    opened by pzivich 7
  • TMLE & Machine Learning

    TMLE is not guaranteed to attain nominal coverage when used with machine learning. A simulation paper showing major problems is: https://arxiv.org/abs/1711.07137 As a result, I don't feel like TMLE can continue to be supported with machine learning, especially since it implies the confidence intervals are way too narrow (sometimes resulting in 0% coverage). I know this is a divergence from R's tmleverse, but I would rather enforce the best practice/standards than allow incorrect use of methods

    Due to this issue, I will be dropping support for TMLE with machine learning. In place of this, I plan on adding CrossfitTMLE which will support machine learning approaches. The crossfitting will result in valid confidence intervals / inference.

    Tentative plan:

    • In v0.8.0, TMLE will throw a warning when using the custom_model argument.

    • Once the Crossfit-AIPW and Crossfit-TMLE are available (v0.9.0), TMLE will lose that functionality. If users want to use TMLE with machine learning, they will need to use a prior version

    bug change Short-term Causal inference 
    opened by pzivich 6
  • G-estimation of Structural Nested Models

Add SNM to the zepid.causal branch. After this addition, all of Robins' g-methods will be implemented in zEpid.

    SNM are discussed in the Causal Inference book (https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/) and The Chapter. SAS code for search-based and closed-form solvers is available at the site. Ideally will have both implemented. Will start with time-fixed estimator

    enhancement Long-term wishlist Causal inference 
    opened by pzivich 6
  • v0.8.0

    Update for version 0.8.0. Below are listed the planned additions

    • [x] ~Update how the weights argument works in applicable causal models (IPTW, AIPW, g-formula)~ #106 No longer using this approach

    Inverse Probability of Treatment Weights

    • [x] Changing entire structure #102

    • [x] Figure out why new structure is giving slightly different values for positivity calculations...

    • [x] Add g-bounds option to truncate weights

    • [x] Update tests for new structure

    • [x] Weight argument behaves slightly different (diagnostics now available for either IPTW alone or with other weights)

    • [x] New summary function for results

    • [x] ~Allowing for continuous treatments in estimation of MSM~ ...for later

    • [x] ~Plots available for binary or continuous treatments~ ...for later

    Inverse Probability of Censoring Weights

    • [x] ~Correction for pooled logistic weight calculation with late-entry occurring~ Raise ValueError if late-entry is detected. The user will need to do some additional work

    • [x] Create better documentation for when late-entry occurs for this model

    G-formula

    • [x] Add diagnostics (prediction accuracy of model)

    • [x] Add run_diagnostics()

    Augmented IPTW

    • [x] Add g-bounds

    • [x] Add diagnostics for weights and outcome model

    • [x] Add run_diagnostics()

    TMLE

    • [x] New warning when using machine learning algorithms to estimate nuisance functions #109

    • [x] Add diagnostics for weights and outcome model

    • [x] Add run_diagnostics()

    S-value

    • [x] Add calculator for S-values, a (potentially) more informative measure than p-values #107

    ReadTheDocs Documentation

    • [x] Add S-value

    • [x] Update IPTW

    • [x] Make sure run_diagnostics() and bound are sufficiently explained

    opened by pzivich 5
  • refactor spline so an anonymous function can be returned for use elsewhere

    Previously my code might look like:

    rossi_with_splines[['age0', 'age1']] = spline(rossi_with_splines, var='age', term=2, restricted=True)
    cph = CoxPHFitter().fit(rossi_with_splines.drop('age', axis=1), 'week', 'arrest')
    
    # this part is nasty
    df = rossi_with_splines.drop_duplicates('age').sort_values('age')[['age', 'age0', 'age1']].set_index('age')
    (df * cph.hazards_[['age0', 'age1']].values).sum(1).plot()
    

    vs

    spline_transform, _ = create_spline_transform(df['age'], term=2, restricted=True)
    rossi_with_splines[['age0', 'age1']] = spline_transform(rossi_with_splines['age'])
    
    cph = CoxPHFitter().fit(rossi_with_splines.drop('age', axis=1), 'week', 'arrest')
    
    ages_to_consider = np.arange(20, 45)
    y = spline_transform(ages_to_consider).dot(cph.hazards_[['age0', 'age1']].values)
    plot(ages_to_consider, y)
    
    opened by CamDavidsonPilon 5
  • v0.5.0

    Features to be implemented:

    • [x] Replace AIPW with the more specific AIPTW #57

    • [x] Add support for monotone IPMW #55

    • [ ] ~~Add support for nonmonotone IPMW #55~~ As I have read further into this, it gets a little complicated (even for the unconditional scenario). Will save for later implementation

    • [ ] Add support for continuous treatments in TimeFixedGFormula #49

    • [ ] ~~Add stratify option to measures #56~~

    • [x] TMLE with continuous outcomes #39

    • [x] TMLE with missing data #39 (only applies to missing outcome data)

    • [ ] ~~Add support for stochastic interventions into TMLE #52~~ Above two changes to TMLE will take precedence. Stochastic treatments are to be added later

    • [ ] ~~Add support for permutation weighting (TBD depending on complexity)~~ Will open a new branch for this project. No idea on how long implementation may take

    • [x] Incorporate random error in MonteCarloRR

    Maintenance

    • [x] Add Python 3.7 support

    • [x] Check to see if matplotlib 3 breaks anything. LGTM via test_graphics_manual.py

    • [x] Magic-g warning updates for g-formula #63

    opened by pzivich 5
  • Add interference

    Later addition, but since statsmodels 0.9.0 has GLIMMIX, I would like to add something to deal with interference for the causal branch. I don't have any part of this worked out, so I will need to take some time to really learn what is happening in these papers

    References: https://www.ncbi.nlm.nih.gov/pubmed/21068053 https://onlinelibrary.wiley.com/doi/abs/10.1111/biom.12184 https://github.com/bsaul/inferference

    Branch plan:

    ---causal
          |
          -interference
    

    Verification: inferference the R package has some datasets that I can compare results with

    Other: Will need to update requirements to need statsmodels 0.9.0

    enhancement Long-term wishlist Causal inference 
    opened by pzivich 5
  • Enhancements to Monte-Carlo g-formula

    As noted in #73 and #77 there are some further optional enhancements I can add to MonteCarloGFormula

    Items to add:

    • [x] Censoring model

    • [ ] Competing risk model

    Testing:

    • [x] Test censoring model works as intended (compare to Keil 2014)

    • [ ] Test competing risks. May be easiest to simulate up a quick data set to compare. Don't have anything on hand

    The updates to Monte-Carlo g-formula will be added to a future update (haven't decided which version they will make it into)

    Optional:

• [x] Reduce memory burden of unneeded replicates

    I sometimes run into a MemoryError when replicating Keil et al 2014 with many resamples. A potential way out of this is to "throw away" the observations that are not the final observation for that individual. Can add option low_memory=True to throw out those unnecessary observations. User could return the full simulated dataframe with False.

    enhancement 
    opened by pzivich 4
  • Unable to install latest 0.9.0 version through pip

    Using the latest version of pip 22.2.2 I am unable to install the most recent zEpid 0.9.0 release on python 3.7.0

    pip install -Iv zepid==0.9.0

    ERROR: Could not find a version that satisfies the requirement zepid==0.9.0 (from versions: 0.1.0, 0.1.1, 0.1.2, 0.1.3, 0.1.5, 0.1.6, 0.2.0, 0.2.1, 0.3.0, 0.3.1, 0.3.2, 0.4.0, 0.4.1, 0.4.2, 0.4.3, 0.5.0, 0.5.1, 0.5.2, 0.6.0, 0.6.1, 0.7.0, 0.7.1, 0.7.2, 0.8.0, 0.8.1, 0.8.2) ERROR: No matching distribution found for zepid==0.9.0

    opened by aidanberry12 7
• Saving DAGs programmatically

    I had corresponded with @pzivich over email and am posting our communication here for the benefit of other users.

    JD.

    Is it possible to program saving figures of directed acyclic graphs (DAGs) using zEpid? E.g. using the M-bias DAG code in the docs, typing plt.savefig('dag.png') only saves a blank PNG. To save it to disk, I'd need to plot the figure then manually click-and-point on the pop-up to save it.

    PZ.

Unfortunately, saving the drawn DAGs isn't too easy. In the background, I use NetworkX to organize and plot the diagram. NetworkX uses matplotlib, but it doesn't return the matplotlib axes object. So while you can tweak parts of the graph in various ways, NetworkX doesn't allow you to directly access the drawn part of the image. Normally, this isn't a problem, but when it gets wrapped up in a class object that returns the matplotlib axes (which is what DirectedAcyclicGraph.draw_dag(...) does) it can lead to the issues you noted.

    Currently, the best work-around is to generate the image by hand. Below is some code that should do the trick to match what is output by DirectedAcyclicGraph

    import networkx as nx
    import matplotlib.pyplot as plt
    from zepid.causal.causalgraph import DirectedAcyclicGraph
    
    dag = DirectedAcyclicGraph(exposure='X', outcome="Y")
    dag.add_arrows((('X', 'Y'),
                    ('U1', 'X'), ('U1', 'B'),
                    ('U2', 'B'), ('U2', 'Y')
                   ))
    
    fig = plt.figure(figsize=(6, 5))
    ax = plt.subplot(1, 1, 1)
    positions = nx.spectral_layout(dag.dag)
    nx.draw_networkx(dag.dag, positions, node_color="#d3d3d3", node_size=1000, edge_color='black',
                     linewidths=1.0, width=1.5, arrowsize=15, ax=ax, font_size=12)
    plt.axis('off')
    plt.savefig("filename.png", format='png', dpi=300)
    plt.close()
    
    

    Thanks Paul for the advice!

For the longer term, it seems useful to build this or something similar into zEpid graphics to programmatically save (complex) DAGs in Python for publication. Possibly using position values from DAGs generated in dagitty, which is handy to quickly graph and analyse complex DAGs. Just a thought.

    Cheers

    opened by joannadiong 11
  • Add Odds Ratio and other estimands for AIPTW and TMLE

    Currently AIPTW only returns RD and RR. TMLE returns those and OR as well. I should add support for OR with AIPTW (even though I am not a huge fan of OR when we have nicer estimands)

    I should also add support for all / none, and things like ATT and ATU for TMLE and AIPTW both. Basically I need to look up the influence curve info in the TMLE book(s)

    enhancement Causal inference 
    opened by pzivich 0
  • MonteCarloGFormula

    Currently you need to set the np.random.seed outside of the function for reproducibility (which isn't good). I should use a similar RandomState approach that the cross-fit estimators use
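A minimal illustration of the proposed pattern (plain NumPy, not zEpid's actual API): accept a RandomState object so results are reproducible without touching the global seed:

```python
import numpy as np

def simulate(n, random_state=None):
    # Accept an explicit RandomState rather than relying on np.random.seed;
    # hypothetical function name for illustration only
    rs = random_state if random_state is not None else np.random.RandomState()
    return rs.binomial(1, 0.5, n)

# Two runs with the same RandomState seed produce identical draws
run1 = simulate(5, random_state=np.random.RandomState(12345))
run2 = simulate(5, random_state=np.random.RandomState(12345))
print((run1 == run2).all())
```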

    bug Causal inference 
    opened by pzivich 0
  • Update documentation (and possibly re-organize)

    I wrote most of the ReadTheDocs documentation 2-3 years ago now. It is dated (and my understanding has expanded), so I should go back and review everything after the v0.9.0 release

    Here are some things to consider

    • Use a different split than time-fixed and time-varying exposures
    • Add a futures section (rather than having embedded in documents)
    • Update the LIPTW / SIPTW info (once done)
    • Replace Chat Gitter button with GitHub Discussions
    • Add SuperLearner page to docs
    enhancement help wanted Website 
    opened by pzivich 2
Releases
  • latest-version(Oct 23, 2022)

  • v0.9.0(Dec 30, 2020)

The 0.9.x series drops support for Python 3.5.x. Only Python 3.6+ is now supported. Support has also been added for Python 3.8.

Cross-fit estimators have been implemented for better causal inference with machine learning. Cross-fit estimators include SingleCrossfitAIPTW, DoubleCrossfitAIPTW, SingleCrossfitTMLE, and DoubleCrossfitTMLE. Currently, functionality is limited to treatment and outcome nuisance models only (i.e., no model for missing data). These estimators also do not accept weighted data (since most of sklearn does not support weights).

Super-learner functionality has been added via SuperLearner. Additions also include empirical mean (EmpiricalMeanSL), generalized linear model (GLMSL), and step-wise backward/forward selection via AIC (StepwiseSL). These new estimators are wrappers that are compatible with SuperLearner and mimic some of the functionality of R's SuperLearner.

    Directed Acyclic Graphs have been added via DirectedAcyclicGraph. These analyze the graph for sufficient adjustment sets, and can be used to display the graph. These rely on an optional NetworkX dependency.

    AIPTW now supports the custom_model optional argument for user-input models. This is the same as TMLE now.

    zipper_plot function for creating zipper plots has been added.

    Housekeeping: bound has been updated to new procedure, updated how print_results displays to be uniform, created function to check missingness of input data in causal estimators, added warning regarding ATT and ATU variance for IPTW, and added back observation IDs for MonteCarloGFormula

    Future plans: TimeFixedGFormula will be deprecated in favor of two estimators with different labels. This will more clearly delineate ATE versus stochastic effects. The replacement estimators are to be added

  • v0.8.1(Oct 3, 2019)

    Added support for pygam's LogisticGAM for TMLE with custom models (Thanks darrenreger!)

    Removed warning for TMLE with custom models following updates to Issue #109 I plan on creating a smarter warning system that flags non-Donsker class machine learning algorithms and warns the user. I still need to think through how to do this.

  • v0.8.0(Jul 17, 2019)

    Major changes to IPTW. IPTW now supports calculation of a marginal structural model directly.

    Greater support for censored data in IPTW, AIPTW, and GEstimationSNM

    Addition of s-values

  • v0.7.2(May 19, 2019)

  • v0.7.1(May 3, 2019)

  • v0.6.0(Mar 31, 2019)

    MonteCarloGFormula now includes a separate censoring_model() function for informative censoring. Additionally, I added a low memory option to reduce the memory burden during the Monte-Carlo procedure

IterativeCondGFormula has been refactored to accept only data in a wide format. This allows me to handle more complex treatment assignments and specify models correctly. Additional tests have been added comparing to R's ltmle

    There is a new branch in zepid.causal. This is the generalize branch. It contains various tools for generalizing or transporting estimates from a biased sample to the target population of interest. Options available are inverse probability of sampling weights for generalizability (IPSW), inverse odds of sampling weights for transportability (IPSW), the g-transport formula (GTransportFormula), and doubly-robust augmented inverse probability of sampling weights (AIPSW)

RiskDifference now calculates the Fréchet probability bounds

TMLE now allows for specified bounds on the Q-model predictions. Additionally, it avoids an error when predicted continuous values are outside the bounded values.

    AIPTW now has confidence intervals for the risk difference based on influence curves

    spline now uses numpy.percentile to allow for older versions of NumPy. Additionally, new function create_spline_transform returns a general function for splines, which can be used within other functions

    Lots of documentation updates for all functions. Additionally, summary() functions are starting to be updated. Currently, only stylistic changes

  • v0.4.3(Feb 8, 2019)

  • v0.3.2(Nov 5, 2018)

    MAJOR CHANGES:

    TMLE now allows estimation of risk ratios and odds ratios. Estimation procedure is based on tmle.R

    TMLE variance formula has been modified to match tmle.R rather than other resources. This is beneficial for future implementation of missing data adjustment. Also would allow for mediation analysis with TMLE (not a priority for me at this time).

    TMLE now includes an option to place bounds on predicted probabilities using the bound option. Default is to use all predicted probabilities. Either symmetrical or asymmetrical truncation can be specified.

    TimeFixedGFormula now allows weighted data as an input. For example, IPMW can be integrated into the time-fixed g-formula estimation. Estimation for weighted data uses statsmodels GEE. As a result of the difference between GLM and GEE, the check of the number of dropped data was removed.

TimeVaryGFormula now allows weighted data as an input. For example, sampling weights can be integrated into the time-varying g-formula estimation. Estimation for weighted data uses statsmodels GEE.

    MINOR CHANGES:

    Added Sciatica Trial data set. Mertens, BJA, Jacobs, WCH, Brand, R, and Peul, WC. Assessment of patient-specific surgery effect based on weighted estimation and propensity scoring in the re-analysis of the Sciatica Trial. PLOS One 2014. Future plan is to replicate this analysis if possible.

Added data from Freireich EJ et al., "The Effect of 6-Mercaptopurine on the Duration of Steroid-induced Remissions in Acute Leukemia: A Model for Evaluation of Other Potentially Useful Therapy" Blood 1963

    TMLE now allows general sklearn algorithms. Fixed issue where predict_proba() is used to generate probabilities within sklearn rather than predict. Looking at this, I am probably going to clean up the logic behind this and the rest of custom_model functionality in the future

    AIPW object now contains risk_difference and risk_ratio to match RiskRatio and RiskDifference classes

  • v0.3.0(Aug 27, 2018)

  • v0.2.1(Aug 13, 2018)

  • v0.2.0(Aug 7, 2018)

    BIG CHANGES:

    IPW all moved to zepid.causal.ipw. zepid.ipw is no longer supported

    IPTW, IPCW, IPMW are now their own classes rather than functions. This was done since diagnostics are easier for IPTW and the user can access items directly from the models this way.

    Addition of TimeVaryGFormula to fit the g-formula for time-varying exposures/confounders

    effect_measure_plot() is now EffectMeasurePlot() to conform to PEP

    ROC_curve() is now roc(). Also 'probability' was changed to 'threshold', since it now allows any continuous variable for threshold determinations

    MINOR CHANGES:

    Added sensitivity analysis as proposed by Fox et al. 2005 (MonteCarloRR)

    Updated Sensitivity and Specificity functionality. Added Diagnostics, which calculates both sensitivity and specificity.

Updated dynamic risk plots to avoid a merging warning. The input timeline is converted to an integer (x100000), merged, then converted back

    Updated spline to use np.where rather than list comprehension

    Summary data calculators are now within zepid.calc.utils

    FUTURE CHANGES:

    All pandas effect/association measure calculations will be migrating from functions to classes in a future version. This will better meet PEP syntax guidelines and allow users to extract elements/print results. Still deciding on the setup for this... No changes are coming to summary measure calculators (aside from possibly name changes). Intended as part of v0.3.0

    Addition of Targeted Maximum Likelihood Estimation (TMLE). No current timeline developed

    Addition of IPW for Interference settings. No current timeline but hopefully before 2018 ends

    Further conforming to PEP guidelines (my bad)

  • v0.1.6(Jul 16, 2018)

    See CHANGELOG for the full list of details

    Briefly,

    Added causal branch Added time-fixed g-formula Added double-robust estimator Updated some fixes to errors

  • v0.1.5(Jul 11, 2018)

  • v0.1.3(Jul 2, 2018)

  • v0.1.2(Jun 25, 2018)

Owner: Paul Zivich, epidemiology post-doc working in epidemiologic methods and infectious diseases.
Code for our paper "SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization", ACL 2021

SimCLS Code for our paper: "SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization", ACL 2021 1. How to Install Requirements

Yixin Liu 150 Dec 12, 2022
PyTorch implementation of our ICCV2021 paper: StructDepth: Leveraging the structural regularities for self-supervised indoor depth estimation

StructDepth PyTorch implementation of our ICCV2021 paper: StructDepth: Leveraging the structural regularities for self-supervised indoor depth estimat

SJTU-ViSYS 112 Nov 28, 2022
Codebase for testing whether hidden states of neural networks encode discrete structures.

structural-probes Codebase for testing whether hidden states of neural networks encode discrete structures. Based on the paper A Structural Probe for

John Hewitt 349 Dec 17, 2022
Cobalt Strike teamserver detection.

Cobalt-Strike-det Cobalt Strike teamserver detection. usage: cobaltstrike_verify.py [-l TARGETS] [-t THREADS] optional arguments: -h, --help show this

TimWhite 17 Sep 27, 2022
The devkit of the nuScenes dataset.

nuScenes devkit Welcome to the devkit of the nuScenes and nuImages datasets. Overview Changelog Devkit setup nuImages nuImages setup Getting started w

Motional 1.6k Jan 05, 2023
Pytorch implementation for the paper: Contrastive Learning for Cold-start Recommendation

Contrastive Learning for Cold-start Recommendation This is our Pytorch implementation for the paper: Yinwei Wei, Xiang Wang, Qi Li, Liqiang Nie, Yan L

45 Dec 13, 2022
For the paper entitled ''A Case Study and Qualitative Analysis of Simple Cross-Lingual Opinion Mining''

Summary This is the source code for the paper "A Case Study and Qualitative Analysis of Simple Cross-Lingual Opinion Mining", which was accepted as fu

1 Nov 10, 2021
Time series annotation library.

CrowdCurio Time Series Annotator Library The CrowdCurio Time Series Annotation Library implements classification tasks for time series. Features Suppo

CrowdCurio 51 Sep 15, 2022
Image Segmentation and Object Detection in Pytorch

Image Segmentation and Object Detection in Pytorch Pytorch-Segmentation-Detection is a library for image segmentation and object detection with report

Daniil Pakhomov 732 Dec 10, 2022