A lightweight interface for reading output from the Weather Research and Forecasting (WRF) model into an xarray Dataset

Overview

xwrf

A lightweight interface for reading output from the Weather Research and Forecasting (WRF) model into an xarray Dataset. The primary objective of xwrf is to replicate crucial I/O functionality from the wrf-python package in a way that is more convenient for users and integrates seamlessly with the rest of the Pangeo software stack.


This code is highly experimental! Let the buyer beware ⚠️ ;)

Installation

xwrf may be installed with pip:

python -m pip install git+https://github.com/NCAR/xwrf.git

What is it?

Native WRF output files are not CF-compliant, which makes them awkward to use with tools like xarray. This package provides a simple interface for reading WRF output files into xarray Dataset objects using xarray's flexible and extensible I/O backend API. For example, the following code reads in a WRF output file:

In [1]: import xarray as xr

In [2]: path = "./tests/sample-data/wrfout_d03_2012-04-22_23_00_00_subset.nc"

In [3]: ds = xr.open_dataset(path, engine="xwrf")

In [4]: # or

In [5]: # ds = xr.open_dataset(path, engine="wrf")

In [6]: ds
Out[6]:
<xarray.Dataset>
Dimensions:  (Time: 1, south_north: 546, west_east: 480)
Coordinates:
    XLONG    (south_north, west_east) float32 ...
    XLAT     (south_north, west_east) float32 ...
Dimensions without coordinates: Time, south_north, west_east
Data variables:
    Q2       (Time, south_north, west_east) float32 ...
    PSFC     (Time, south_north, west_east) float32 ...
Attributes: (12/86)
    TITLE:                            OUTPUT FROM WRF V3.3.1 MODEL
    START_DATE:                      2012-04-20_00:00:00
    SIMULATION_START_DATE:           2012-04-20_00:00:00
    WEST-EAST_GRID_DIMENSION:        481
    SOUTH-NORTH_GRID_DIMENSION:      547
    BOTTOM-TOP_GRID_DIMENSION:       32
    ...                              ...
    NUM_LAND_CAT:                    24
    ISWATER:                         16
    ISLAKE:                          -1
    ISICE:                           24
    ISURBAN:                         1
    ISOILWATER:                      14

In addition to xr.open_dataset, xwrf also supports reading multiple WRF output files at once via the xr.open_mfdataset function:

ds = xr.open_mfdataset(list_of_files, engine="xwrf", parallel=True,
                       concat_dim="Time", combine="nested")

Why not just a preprocess function?

One can achieve the same functionality with a preprocess function. However, there are some additional I/O features that wrf-python implements under the hood that we think would be worth implementing as part of a backend engine instead of a regular preprocess function.
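For context, a preprocess-based alternative might look like the sketch below. The coordinate promotion and rename shown are illustrative assumptions, not xwrf's actual mapping:

```python
import xarray as xr


def wrf_preprocess(ds: xr.Dataset) -> xr.Dataset:
    """Sketch of a preprocess-style cleanup for raw WRF output.

    The specific renames below are illustrative assumptions, not
    the transformations xwrf actually applies.
    """
    # Promote WRF's 2D latitude/longitude fields to coordinates
    coord_vars = [name for name in ("XLAT", "XLONG") if name in ds]
    ds = ds.set_coords(coord_vars)
    # Rename the WRF time dimension to a more conventional name
    if "Time" in ds.dims:
        ds = ds.rename({"Time": "time"})
    return ds
```

Such a function could be passed as `preprocess=wrf_preprocess` to `xr.open_mfdataset`, but it cannot hook into xarray's decoding and lazy-loading machinery the way a backend engine can.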

Comments
  • First Release Blog Post

    First Release Blog Post

    Description

    I think that once we have a first release of xwrf, we should write a blog post demonstrating its use. It would be great if one of our WRF expert collaborators could spearhead this blog. Any volunteers?

    Implementation

    Personally, I think that a Jupyter Notebook is a good medium for a demonstration, and the notebook can be easily converted to a markdown doc for a blog-post.

    Tests

    N/A

    Questions

    Before embarking on this, though, we need to complete the features that we want in the first release. That said, I wouldn't be too keen on delaying the release. Earlier is better, even if incomplete.

    enhancement 
    opened by kmpaul 32
  • Implementation of salem-style x, y, and z coordinates

    Implementation of salem-style x, y, and z coordinates

    Change Summary

    As alluded to in #2, including dimension coordinates in the grid mapping/projection space is a key feature for integrating with other tools in the ecosystem like metpy and xgcm. In this (draft) PR, I've combined code ported from salem with some of my own one-off scripts and what already exists in xwrf to meet this goal. In particular, this introduces a pyproj dependency (for CRS handling and transforming the domain center point from lon/lat to easting/northing). Matching the assumptions already present in xwrf and salem, this implementation assumes we do not have a moving domain (which simplifies things greatly). Also, this implements the c_grid_axis_shift attr as-needed, so xgcm should be able to interpret our coords automatically, eliminating the need for direct handling (like #5) in xwrf.

    ~~Also, because it existed in salem and my scripts alongside the dimension coordinate handling, I also included my envisioned diagnostic field calculations. These are deliberately limited to only those four fields that require WRF-specific handling:~~

    • ~~ 'T' going to potential temperature has a magic number offset of 300 K~~
    • ~~ 'P' and 'PB' combine to form pressure, and are not otherwise used~~
    • ~~ 'PH' and 'PHB' combine to form geopotential, and are not otherwise used~~
    • ~~ Geopotential to geopotential height conversion depends on a particular value of g (9.81 m s**-2) that may not match the value used elsewhere~~

    ~~Unless I'm missing something, any other diagnostics should be derivable using these or other existing fields in a non-WRF-specific way (and so, fit outside of xwrf). If the netcdf4 backend already handles Dask chunks, then this should "just work" as it is currently written. However, I'm not sure how this should behave with respect to lazy-loading when chunks are not specified, so that is definitely a discussion to have in relation to #10.~~

    ~~Right now, no tests are included, as this is just a draft implementation to get the conversation started on how we want to approach these features. So, please do share your thoughts and ask questions!~~

    Related issue number

    • Closes #3
    • Closes #11

    Checklist

    • [x] Unit tests for the changes exist
    • [x] Tests pass on CI
    • [ ] Documentation reflects the changes where applicable
    enhancement 
    opened by jthielen 31
  • First Release?

    First Release?

    Now that we have xwrf in a usable state, should we consider cutting its first release soon (later this week or next week)? We already have the infrastructure in place for automatically publishing the package to PyPI. One missing piece is the documentation. The infrastructure for authoring the docs is already in place (uses markdown via myst + furo theme, and the current template follows this documentation system guide). I am opening this issue to keep track of other outstanding issues that need to be addressed before the first release. Feel free to add to this list (cc @ncar-xdev/xwrf)

    • [x] Update documentation
    • [x] Publish to PyPI
    • [x] Publish to conda-forge
    opened by andersy005 27
  • Tutorial

    Tutorial

    Change Summary

    Tutorial showing xWRF usage.

    Related issue number

    • Towards #69

    Checklist

    • [x] Unit tests for the changes exist
    • [x] Tests pass on CI
    • [x] Documentation reflects the changes where applicable
    opened by lpilz 20
  • Tutorial on xWRF

    Tutorial on xWRF

    What is your issue?

    The aim of this issue is to track progress in creating a tutorial for xWRF. Here is the start of a list of features to be presented. Please feel free to add to this list - I'll work on implementing this over the coming days.

    • [x] general parsing/coordinate transformation (what does xwrf do?)
    • [x] interface to metpy via unit CF-conventions and pint
    • [x] destaggering data using xgcm
    • [x] vertically interpolating data using xgcm
    • [x] plotting
    opened by lpilz 17
  • Update of tutorials for v0.0.2

    Update of tutorials for v0.0.2

    Change Summary

    Added a tutorial for using xgcm with Dask-backed data.

    Related issue number

    Closes #69

    Checklist

    • [x] Documentation reflects the changes where applicable
    documentation 
    opened by lpilz 13
  • First draft

    First draft "destagger" function

    Change Summary

    Here's an attempt at a "destaggering" function. This is based on the function in WRF-python (https://github.com/NCAR/wrf-python/blob/22fb45c54f5193b849fdff0279445532c1a6c89f/src/wrf/destag.py).

    I've tested it on the "west_east_stag" and "south_north_stag" dimensions. The function takes an xarray DataArray and guesses the name of the staggered dimension (it ends in "_stag"). If there is more than one (I don't think there is in WRF?), a NotImplementedError is raised.

    I'm also not sure if this should ultimately look like this at all, but I wanted to go ahead and throw this code out there.
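    The core averaging step described above can be sketched as follows. This is a simplified stand-in, not the PR's implementation: it drops coordinates and attributes that a real implementation would handle carefully.

    ```python
    import xarray as xr


    def destagger(da: xr.DataArray) -> xr.DataArray:
        """Average adjacent points along the single staggered dimension.

        Simplified sketch of the approach described above; coordinates
        and attributes are discarded for brevity.
        """
        staggered = [dim for dim in da.dims if str(dim).endswith("_stag")]
        if len(staggered) != 1:
            raise NotImplementedError("expected exactly one staggered dimension")
        dim = staggered[0]
        # Mean of each pair of neighbouring points along the staggered axis
        lower = da.isel({dim: slice(None, -1)}).data
        upper = da.isel({dim: slice(1, None)}).data
        new_dims = [str(d).replace("_stag", "") for d in da.dims]
        return xr.DataArray(0.5 * (lower + upper), dims=new_dims, name=da.name)
    ```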

    Related issue number

    This is related to issue #35

    Checklist

    I don't have any unit tests to check this -- I'm open to ideas on how to make unit tests (do they need to be on "real" data?) Maybe that's a separate issue.

    • [ ] Unit tests for the changes exist
    • [ ] Tests pass on CI
    • [ ] Documentation reflects the changes where applicable

    I'm new to collaborating on open-source projects, and writing code for wide usage, so any feedback is welcome!

    enhancement 
    opened by bsu-wrudisill 13
  • [MISC]: Curate sample datasets

    [MISC]: Curate sample datasets

    What is your issue?

    We currently don't have great sample datasets for testing and documentation. It's worth curating small, exemplary datasets. We could emulate the approach used by fatiando/ensaio or xarray's tutorial module. These datasets should probably be hosted in a separate GitHub repository.

    • Option 1: A separate data package (xwrf_data)
    import xwrf_data
    import xwrf
    import xarray as xr
    
    fname = xwrf_data.fetch_foo_dataset()
    ds = xr.open_dataset(fname).wrf.diag_and_destagger()
    
    • Option 2: Tutorial module within xwrf
    import xwrf
    import xarray as xr
    
    ds = xwrf.tutorial.open_dataset('foo_dataset').wrf.diag_and_destagger()
    

    Cc @ncar-xdev/xwrf

    enhancement 
    opened by andersy005 12
  • Division of Features in Top-Level API

    Division of Features in Top-Level API

    While detailed API discussions will be ongoing based on https://github.com/NCAR/xwrf/discussions/13 and other issues/discussions that follow from that, https://github.com/NCAR/xwrf/pull/14#issuecomment-977066277 and https://github.com/NCAR/xwrf/pull/14#issuecomment-977157649 raised a more high-level API point that would be good to clear up first: what features go into the xwrf backend, and what goes elsewhere (such as a .wrf accessor)?

    Original comments:


    If so, I think this means we can't have direct Dask operations within the backend, but would rather need to design custom backend arrays that play nicely with the Dask chunking xarray itself does, or re-evaluate the approach for derived quantities so that they are outside the backend. Perhaps the intake-esm approach could help in that regard at least?

    Wouldn't creating custom backend arrays be overkill? Assuming we want to support reading files via the Python-netCDF4 library, we might be able to write a custom data store that borrows from xarray's NetCDF4DataStore: https://github.com/pydata/xarray/blob/5db40465955a30acd601d0c3d7ceaebe34d28d11/xarray/backends/netCDF4_.py#L291. With this custom datastore, we would have more control over what to do with variables, dimensions, attrs before passing them to xarray. Wouldn't this suffice for the data loading (without the derived quantities)?

    I think there's value in keeping the backend plugin simple (e.g. performing simple tasks such as decoding coordinates, fixing attributes/metadata, etc) and everything else outside the backend. Deriving quantities doesn't seem simple enough to warrant having this functionality during the data loading.

    Some of the benefits of deriving quantities outside the backend are that this approach:

    (1) doesn't obfuscate what's going on, (2) gives users the opportunity to fix aspects of the dataset that might be missed by xwrf during data loading before passing the cleaned dataset to the functionality for deriving quantities, and (3) removes the requirement for deriving quantities to be a lazy operation, i.e. if your dataset is in memory, deriving the quantity is done eagerly.

    Originally posted by @andersy005 in https://github.com/NCAR/xwrf/issues/14#issuecomment-977066277


    Some of the benefits of deriving quantities outside the backend are that this approach:

    Also, wouldn't it be beneficial for deriving quantities to be backend-agnostic? I'm imagining cases in which the data have been post-processed and saved in a different format (e.g. Zarr) and you still want to be able to use the same code for deriving quantities on the fly.

    Originally posted by @andersy005 in https://github.com/NCAR/xwrf/issues/14#issuecomment-977072366


    Deriving quantities doesn't seem simple enough to warrant having this functionality during the data loading.

    This sounds like it factors directly into the "keep the solutions as general as possible (so that maybe MPAS can also profit from it)" discussion. However, I feel that we have to think about the user perspective too. I don't have any set opinions on this, and we should definitely discuss it, maybe in a larger group too. Here are some thoughts so far:

    I think the reason users like wrf-python is because it is an easy one-stop-shop for getting wrf output to work with python - this is especially true because lots of users are scientists and not software engineers or programmers. I personally take from this point that it would be prudent to keep the UX as easy as possible. I think this is what the Backend-approach does really well. Basically users just have to add the engine='xwrf' kwarg and then it just works (TM). Meaning that it provides the users with CF-compliant de-WRFified meteo data. Also, given that the de-WRFification of the variable data is not too difficult (it's basically just adding fields for three variables), I think the overhead in complexity wouldn't be too great. However, while I do see that it breaks the conceptual barrier between data loading (and decoding etc.) and computation, this breakage would be required in order to provide the user with meteo data rather than raw wrf fields.

    @andersy005 do you already have some other ideas on how one could handle this elegantly?

    Also, should we move this discussion to a separate issue maybe?

    Originally posted by @lpilz in https://github.com/NCAR/xwrf/issues/14#issuecomment-977157649

    opened by jthielen 10
  • Coordinate UX

    Coordinate UX

    I think this is pretty straightforward, as we just need the lat, lon and time coordinates; all others can be discarded. Unstaggering will be done in the variable initialization. However, we should be aware of moving-nest runs and keep the time dependence of lat and lon for those occasions.

    enhancement 
    opened by lpilz 9
  • Create xWRF logo

    Create xWRF logo

    What is your issue?

    It would be nice to have a minimalistic logo for the project. Does anyone have, or know someone with, design skills? :) This would be good for the overall branding of the project once we start advertising it after the first release.

    • https://github.com/ncar-xdev/xwrf/issues/51

    Cc @ncar-xdev/xwrf

    opened by andersy005 8
  • [Bug]: ValueError when using MetPy to calculate geostrophic winds

    [Bug]: ValueError when using MetPy to calculate geostrophic winds

    What happened?

    I'm trying to use the MetPy function mpcalc.geostrophic_wind() to calculate geostrophic winds from a wrfout file.

    I'm getting "ValueError: Must provide dx/dy arguments or input DataArray with latitude/longitude coordinates", along with a warning, "warnings.warn('More than one ' + axis + ' coordinate present for variable'".

    I don't know what's causing the problem.

    Minimal Complete Verifiable Example

    import metpy.calc as mpcalc
    import xarray as xr
    import xwrf
    
    # Open the NetCDF file
    filename = "wrfout_d01_2016-10-04_12:00:00"
    ds = xr.open_dataset(filename).xwrf.postprocess()
    
    # Extract the geopotential height
    z = ds['geopotential_height']
    
    # Compute the geostrophic wind
    geo_wind_u, geo_wind_v = mpcalc.geostrophic_wind(z)
    

    Relevant log output

    /mnt/iusers01/fatpou01/sees01/w34926hb/.conda/envs/metpy_env/lib/python3.9/site-packages/metpy/xarray.py:355: UserWarning: More than one latitude coordinate present for variable "geopotential_height".
      warnings.warn('More than one ' + axis + ' coordinate present for variable'
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/mnt/iusers01/fatpou01/sees01/w34926hb/.conda/envs/metpy_env/lib/python3.9/site-packages/metpy/xarray.py", line 1508, in wrapper
        raise ValueError('Must provide dx/dy arguments or input DataArray with '
    ValueError: Must provide dx/dy arguments or input DataArray with latitude/longitude coordinates.
    
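    As a possible workaround (not a confirmed fix), one could drop the duplicated WRF-native coordinates so that MetPy sees only one latitude/longitude pair. The coordinate names below are assumptions about the post-processed dataset:

    ```python
    import xarray as xr


    def drop_duplicate_latlon(da: xr.DataArray) -> xr.DataArray:
        """Workaround sketch: if a variable carries both WRF's native
        2D XLAT/XLONG coordinates and CF-style latitude/longitude
        coordinates, drop the native pair so MetPy's coordinate
        detection is unambiguous. Names here are assumptions.
        """
        extra = [c for c in ("XLAT", "XLONG") if c in da.coords]
        if extra and {"latitude", "longitude"} & set(da.coords):
            da = da.drop_vars(extra)
        return da
    ```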

    Environment

    System Information
    ------------------
    xWRF commit : None
    python      : 3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:58:50)
    [GCC 10.3.0]
    python-bits : 64
    OS          : Linux
    OS-release  : 3.10.0-1127.19.1.el7.x86_64
    machine     : x86_64
    processor   : x86_64
    byteorder   : little
    LC_ALL      : None
    LANG        : en_GB.UTF-8
    LOCALE      : ('en_GB', 'UTF-8')
    
    Installed Python Packages
    -------------------------
    cf_xarray   : 0.7.5
    dask        : 2022.11.0
    donfig      : 0.7.0
    matplotlib  : 3.6.2
    metpy       : 1.3.1
    netCDF4     : 1.6.2
    numpy       : 1.23.5
    pandas      : 1.5.1
    pint        : 0.20.1
    pooch       : v1.6.0
    pyproj      : 3.4.0
    xarray      : 2022.11.0
    xgcm        : 0.8.0
    xwrf        : 0.0.2
    

    Anything else we need to know?

    No response

    bug waiting for response 
    opened by starforge 3
  • [MISC]: Plot in metpy tutorial is missing

    [MISC]: Plot in metpy tutorial is missing

    What is your issue?

    On https://xwrf.readthedocs.io/en/latest/tutorials/metpy.html, the Skew-T plot is missing. @andersy005 is this an intermittent sphinx issue or do we have a misconfiguration somewhere?

    opened by lpilz 1
  • More comprehensive unit harmonization

    More comprehensive unit harmonization

    Change Summary

    Unit harmonization is improved by:

    • using a better map parsed from the WRF Registries (yes, all of them, but not WPS)
      • translations are generated manually using a custom external tool
      • includes all versions from WRFv4.0 onwards
      • makes bracket cleaning superfluous
    • extracting this map from the config yaml to avoid clutter
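    For illustration only, such a translation map might look like the following sketch; the entries and the helper name are invented examples, not the table shipped with xWRF:

    ```python
    # Illustrative sketch of a registry-derived unit translation map.
    # These entries are examples only, not xWRF's actual table.
    WRF_UNIT_MAP = {
        "K": "kelvin",
        "Pa": "pascal",
        "m s-1": "meter / second",
        "W m-2": "watt / meter ** 2",
        "-": "dimensionless",
    }


    def harmonize_units(wrf_units: str) -> str:
        """Translate a raw WRF Registry units string into a
        pint-friendly one, passing unknown strings through unchanged."""
        return WRF_UNIT_MAP.get(wrf_units, wrf_units)
    ```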

    Related issue number

    Checklist

    • [x] Unit tests for the changes exist
    • [x] Tests pass on CI
    • [x] Documentation reflects the changes where applicable
    enhancement 
    opened by lpilz 4
  • [FEATURE]: Add functionality to organize WRF data into a DataTree

    [FEATURE]: Add functionality to organize WRF data into a DataTree

    Description

    WRF output can easily have a couple hundred data variables in a dataset, which is not ideal for interactive exploration of a dataset's contents. With DataTree, we would have a tree-like hierarchical data structure for xarray which could be used for this.

    From @lpilz in https://github.com/xarray-contrib/xwrf/issues/10:

    • Which diagnostics do we want to provide and do we want to expose them in a DataTree eventually?

    One suggestion might be:

    DataTree("root")
    |-- DataNode("2d_variables")
    |   |-- DataArrayNode("sea_surface_temperature")
    |   |-- DataArrayNode("surface_temperature")
    |   |-- DataArrayNode("surface_air_pressure")
    |   |-- DataArrayNode("air_pressure_at_sea_level")
    |   |-- DataArrayNode("air_temperature_at_2m") (?)
    |   ....
    |-- DataNode("3d_variables")
        |-- DataArrayNode("air_temperature")
        |-- DataArrayNode("air_pressure")
        |-- DataArrayNode("northward_wind")
        |-- DataArrayNode("eastward_wind")
        ....
    

    Implementation

    This would likely become a new accessor method, such as .xwrf.organize().
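    As a rough sketch of what such an organize step could do, here is a hypothetical grouping by total number of dimensions (the real proposal would build a DataTree and likely use richer, spatially-aware groupings):

    ```python
    import xarray as xr


    def organize_by_rank(ds: xr.Dataset):
        """Hypothetical sketch of an .xwrf.organize()-style step.

        Groups data variables by their total number of dimensions into
        plain Datasets; a real implementation would build a DataTree.
        """
        groups = {}
        for name, var in ds.data_vars.items():
            key = f"{var.ndim}d_variables"
            groups.setdefault(key, xr.Dataset())
            groups[key][name] = var
        return groups
    ```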

    Tests

    After xwrf.postprocess(), we have a post processed dataset (with likely many data variables). Then, after xwrf.organize(), we would have a DataTree with (a yet to be decided) tree-like grouping of data variables. Calling xwrf.organize() without xwrf.postprocess() would fail.

    Questions

    What form of hierarchy would we want to have, and how deep?

    • 2d_variables vs. 3d_variables?
    • semantic grouping of variables, such as thermodynamic, grid_metrics, kinematic, accumulated, etc.?
    • Parse the WRF Registry somehow and assign groups based on that?
    • some other strategy?
    enhancement 
    opened by jthielen 0
  • [META]: Support for unexpected/non-pristine wrfout datasets

    [META]: Support for unexpected/non-pristine wrfout datasets

    What is your issue?

    As encountered in #36 and https://github.com/xarray-contrib/xwrf-data/pull/34 (and perhaps elsewhere), there may be several unexpected factors (old versions, tweaked registries, subsetting, etc.) that could cause xWRF's standard functionality to be unsupported or to fail. While this is definitely not something to prioritize for immediate releases, it would still be nice to make as much of xWRF's functionality as possible available to users whose WRF datasets "break" xWRF's norms. So, I propose this as a meta-issue to

    • track such unexpected/non-pristine examples
    • work towards features to enable extended compatibility and/or custom application of atomized functionality outside of the standard postprocess()
    • discuss any high-level design strategies to improve the experience of xWRF in these situations

    Running list of sub-issues

    (feel free to add/modify)

    • [ ] Missing latitude/longitude coordinates (xref #36)
      • Could be addressed by (one or both of)
        • Convenience methods to merge in coordinates from geo_em files
        • Recompute lat/lon from projection coordinates
    • [ ] Dataset grid definition attributes partially invalid due to spatial subsetting prior to postprocessing (xref https://github.com/xarray-contrib/xwrf-data/pull/34; local issue TBD)
      • Could be addressed by (one or both of)
        • Reference lat/lon being derived from XLAT/XLONG corner(s) rather than CEN_LON/CEN_LAT attrs
        • Require user input of needed info if some sanity check fails (which would also lead to support for completely missing attrs, not just CEN_LON/CEN_LAT being rendered invalid)
    enhancement 
    opened by jthielen 0
  • [MISC]: More careful consideration of different xarray options

    [MISC]: More careful consideration of different xarray options

    What is your issue?

    Test expected results under different xarray options

    In the spirit of improving the quality of our tests (xref #60), it would be nice to implement tests where different relevant xarray options are enabled (using set_options as a context manager). This would likely make it easier to catch issues like #96 .
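    Such a test might use set_options as a context manager, e.g. (the test name and option choice here are hypothetical):

    ```python
    import xarray as xr


    def test_arithmetic_under_keep_attrs():
        """Hypothetical test skeleton: exercise behaviour under a
        non-default xarray option via set_options as a context manager."""
        da = xr.DataArray([1.0, 2.0], dims="x", attrs={"units": "K"})
        with xr.set_options(keep_attrs=True):
            result = da + 1
        # With keep_attrs enabled, arithmetic preserves attributes
        assert result.attrs["units"] == "K"
    ```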

    Xarray options in issue reports

    Not sure of the best way to do this (bundle it into xwrf.show_versions()? Add another copy-paste box to the issue template?), but it could help with debugging if we knew the state of xarray.get_options().

    maintenance 
    opened by jthielen 0
Releases(v0.0.2)
  • v0.0.2(Sep 21, 2022)

    What's Changed

    • Add destaggering functionality by @jthielen in https://github.com/xarray-contrib/xwrf/pull/93
    • Fix destagger attrs by @lpilz in https://github.com/xarray-contrib/xwrf/pull/97
    • Fix staggered coordinate destaggering for dataarray destagger method by @jthielen in https://github.com/xarray-contrib/xwrf/pull/101
    • Added earth-relative wind field calculation to base diagnostics by @lpilz in https://github.com/xarray-contrib/xwrf/pull/100
    • Clean up _destag_variable with respect to types and terminology by @jthielen in https://github.com/xarray-contrib/xwrf/pull/103
    • Changed wrfout file (cf. xwrf-data/#34) by @lpilz in https://github.com/xarray-contrib/xwrf/pull/102
    • More unit harmonization by @lpilz in https://github.com/xarray-contrib/xwrf/pull/105
    • Fixing a further coords attrs fail. by @lpilz in https://github.com/xarray-contrib/xwrf/pull/107
    • Clear c_grid_axis_shift from attrs when destaggering by @jthielen in https://github.com/xarray-contrib/xwrf/pull/106
    • Update of tutorials for v0.0.2 by @lpilz in https://github.com/xarray-contrib/xwrf/pull/89

    Full Changelog: https://github.com/xarray-contrib/xwrf/compare/v0.0.1...v0.0.2

    Source code(tar.gz)
    Source code(zip)
  • v0.0.1(Sep 9, 2022)

    This is the first packaged release of xWRF (a lightweight interface for working with Weather Research and Forecasting (WRF) model output in xarray). Features in this release include:

    • An xwrf Dataset accessor with a postprocess method that can perform the following operations
      • Rename dimensions to match the CF conventions.
      • Rename variables to match the CF conventions.
      • Rename variable attributes to match the CF conventions.
      • Convert units to Pint-friendly units.
      • Decode times.
      • Include projection coordinates.
      • Collapse time dimension.
    • A tutorial module with several sample datasets
    • Documentation with several examples/tutorials

    Thank you to the following contributors for their efforts towards this release!

    • @andersy005
    • @lpilz
    • @jthielen
    • @kmpaul
    • @dcherian
    • @jukent

    Full Changelog: https://github.com/xarray-contrib/xwrf/commits/v0.0.1

    Source code(tar.gz)
    Source code(zip)
Owner
National Center for Atmospheric Research
NCAR is sponsored by the National Science Foundation and managed by the University Corporation for Atmospheric Research.