Suite of tools for retrieving USGS NWIS observations and evaluating National Water Model (NWM) data.

Overview

OWPHydroTools

Documentation

OWPHydroTools GitHub pages documentation

Motivation

We developed OWPHydroTools with data scientists in mind. We attempted to ensure that even the simplest methods, such as get, both accept and return data structures frequently used by data scientists working in scientific Python. Specifically, this means that pandas.DataFrames, geopandas.GeoDataFrames, and numpy.arrays are the most frequently encountered data structures when using OWPHydroTools. Most methods include sensible defaults that cover the majority of use cases but allow customization if required.

We also attempted to adhere to organizational (NOAA-OWP) data standards where they exist. This means pandas.DataFrames will contain column labels like usgs_site_code, reference_time, value_time, and measurement_unit, which are consistent with organization-wide naming conventions. Our intent is to make retrieving, evaluating, and exporting data as easy and reproducible as possible for scientists, practitioners, and other hydrological experts.

What's here?

We've taken a grab-and-go approach to installing and using OWPHydroTools. As with a standard toolbox, you typically install just the tool or tools that get your job done, without having to install every other tool available. This keeps installations light and lets new tools be added to the toolbox without affecting your workflows!

Note that we commonly refer to individual tools in OWPHydroTools as subpackages or by their name (e.g. nwis_client). You will find this terminology in both issues and documentation.

Currently the repository has the following subpackages:

  • events: Variety of methods used to perform event-based evaluations of hydrometric time series
  • nwm_client: Provides methods for retrieving National Water Model data from various sources including Google Cloud Platform and NOMADS
  • metrics: Variety of methods used to compute common evaluation metrics
  • nwis_client: Provides easy-to-use methods for retrieving data from the USGS NWIS Instantaneous Values (IV) Web Service
  • _restclient: A generic REST client with a built-in cache that makes the construction and retrieval of GET requests painless
  • caches: Provides a variety of object caching utilities

UTC Time

Note: the canonical pandas.DataFrames used by OWPHydroTools use time-zone naive datetimes that assume UTC time. In general, do not assume methods are compatible with time-zone aware datetimes or timestamps. Expect methods to transform time-zone aware datetimes and timestamps into their timezone naive counterparts at UTC time.
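
As a minimal sketch of what this convention implies (this is plain pandas, not an OWPHydroTools method, and the timestamps are made up for illustration), time-zone aware values can be converted to UTC and then stripped of their time zone before being passed to a tool:

```python
import pandas as pd

# Time-zone aware timestamps (hypothetical values with a fixed -04:00 offset).
# utc=True converts them to UTC while parsing.
aware = pd.Series(
    pd.to_datetime(
        ["2021-05-03T08:00:00-04:00", "2021-05-03T09:00:00-04:00"],
        utc=True,
    )
)

# Dropping the time zone leaves naive datetimes that implicitly mean UTC.
naive_utc = aware.dt.tz_localize(None)

print(naive_utc)
```

Doing the conversion explicitly makes it obvious which wall-clock time ends up in the DataFrame.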

Usage

Refer to each subpackage's README.md or documentation for examples of how to use each tool.

Installation

In accordance with the Python community, we support and advise the use of virtual environments in any workflow using Python. In the following installation guide, we use Python's built-in venv module to create a virtual environment in which the tools will be installed. Note that this is just personal preference; any Python virtual environment manager should work just fine (conda, pipenv, etc.).

# Create and activate python environment, requires python >= 3.8
$ python3 -m venv venv
$ source venv/bin/activate
$ python3 -m pip install --upgrade pip

# Install all tools
$ python3 -m pip install hydrotools

# Alternatively you can install a single tool
#  This installs the NWIS Client tool
$ python3 -m pip install hydrotools.nwis_client

OWPHydroTools Canonical Format

"Canonical" labels are protected and part of a fixed lexicon. Canonical labels are shared among all hydrotools subpackages. Subpackage methods should avoid changing or redefining these columns where they appear to encourage cross-compatibility. Existing canonical labels are listed below:

  • value [float32]: Indicates the real value of an individual measurement or simulated quantity.
  • value_time [datetime64[ns]]: formerly value_date, this indicates the valid time of value.
  • variable_name [category]: string category that indicates the real-world type of value (e.g. streamflow, gage height, temperature).
  • measurement_unit [category]: string category indicating the measurement unit (SI or standard) of value
  • qualifiers [category]: string category that indicates any special qualifying codes or messages that apply to value
  • series [integer32]: Use to disambiguate multiple coincident time series returned by a data source.
  • configuration [category]: string category used as a label for a particular time series, often used to distinguish types of model runs (e.g. short_range, medium_range, assimilation)
  • reference_time [datetime64[ns]]: formerly start_date; some reference time for a particular model simulation. Could be considered an issue time, start time, end time, or other meaningful reference time. Interpretation is simulation or forecast specific.
  • longitude [category]: float32 category, WGS84 decimal longitude
  • latitude [category]: float32 category, WGS84 decimal latitude
  • crs [category]: string category, Coordinate Reference System, typically "EPSG:4326"
  • geometry [geometry]: GeoPandas compatible GeoSeries used as the default "geometry" column
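
To make the dtypes above concrete, here is a small sketch that builds a DataFrame by hand using a few of the canonical labels. The values, unit string, and qualifier code are hypothetical, and real subpackages construct these frames for you; the snippet only illustrates the label and dtype conventions.

```python
import pandas as pd

# Hypothetical observations, made up purely for illustration.
df = pd.DataFrame(
    {
        "value_time": pd.to_datetime(["2021-05-03 00:00", "2021-05-03 00:15"]),
        "value": [101.0, 99.5],
        "variable_name": ["streamflow", "streamflow"],
        "measurement_unit": ["ft3/s", "ft3/s"],
        "qualifiers": ["P", "P"],
    }
)

# Cast each column to the dtype listed for its canonical label.
canonical_dtypes = {
    "value_time": "datetime64[ns]",
    "value": "float32",
    "variable_name": "category",
    "measurement_unit": "category",
    "qualifiers": "category",
}
df = df.astype(canonical_dtypes)

print(df.dtypes)
```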

Non-Canonical Column Labels

"Non-Canonical" labels are subpackage-specific extensions to the canonical standard. Packages may share these non-canonical labels, but cross-compatibility is not guaranteed. Examples of non-canonical labels are given below.

  • usgs_site_code [category]: string category indicating the USGS Site Code/gage ID
  • nwm_feature_id [integer32]: indicates the NWM reach feature ID/ComID
  • nws_lid [category]: string category indicating the NWS Location ID/gage ID
  • usace_gage_id [category]: string category indicating the USACE gage ID
  • start [datetime64[ns]]: datetime returned by event_detection that indicates the beginning of an event
  • end [datetime64[ns]]: datetime returned by event_detection that indicates the end of an event

Categorical Data Types

OWPHydroTools uses pandas.DataFrames that contain pandas.Categorical values to increase memory efficiency. Depending upon your use case, these values may require special consideration. To see if a DataFrame returned by an OWPHydroTools subpackage contains pandas.Categorical values, you can use pandas.DataFrame.info like so:

print(my_dataframe.info())

<class 'pandas.core.frame.DataFrame'>
Int64Index: 5706954 entries, 0 to 5706953
Data columns (total 7 columns):
 #   Column            Dtype         
---  ------            -----         
 0   value_date        datetime64[ns]
 1   variable_name     category      
 2   usgs_site_code    category      
 3   measurement_unit  category      
 4   value             float32       
 5   qualifiers        category      
 6   series            category      
dtypes: category(5), datetime64[ns](1), float32(1)
memory usage: 141.5 MB
None

Columns with Dtype category are pandas.Categorical. In most cases, these columns behave just like their primitive types (in this case str). However, categories can sometimes lead to unexpected behavior, such as when using pandas.DataFrame.groupby, which can return groups for categories that never appear in the data. pandas.Categorical values are also incompatible with fixed-format HDF files (you must use format="table") and may cause unexpected behavior when writing to geospatial formats using geopandas.

Possible solutions include:

Cast Categorical to str

Casting to str will resolve all of the aforementioned issues, including writing to geospatial formats.

my_dataframe['usgs_site_code'] = my_dataframe['usgs_site_code'].apply(str)

Remove unused categories

This will remove categories from the Series for which no values are actually present.

my_dataframe['usgs_site_code'] = my_dataframe['usgs_site_code'].cat.remove_unused_categories()

Use observed option with groupby

This limits groupby operations to category values that actually appear in the Series or DataFrame.

mean_flow = my_dataframe.groupby('usgs_site_code', observed=True).mean()
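
The difference the observed option makes can be seen in a small, self-contained sketch. The site codes and values below are made up for illustration; the third category is declared but never appears in the data.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "usgs_site_code": pd.Categorical(
            ["01646500", "01646500", "02064000"],
            # "07010040" is a declared category with no rows.
            categories=["01646500", "02064000", "07010040"],
        ),
        "value": [10.0, 20.0, 5.0],
    }
)

# observed=False keeps a group for every declared category.
all_groups = df.groupby("usgs_site_code", observed=False)["value"].mean()

# observed=True keeps only the categories present in the data.
seen_groups = df.groupby("usgs_site_code", observed=True)["value"].mean()

print(len(all_groups), len(seen_groups))
```

With observed=False the unused category still gets a row, filled with NaN, which is usually the source of the surprise.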
Comments
  • Add Social Vulnerability Index (SVI)  subpackage

    This PR adds a client for programmatically accessing the Centers for Disease Control and Prevention's (CDC) Social Vulnerability Index (SVI).

    "Social vulnerability refers to the potential negative effects on communities caused by external stresses on human health. Such stresses include natural or human-caused disasters, or disease outbreaks. Reducing social vulnerability can decrease both human suffering and economic loss." [source]

    The SVI has been released 5 times (2000, 2010, 2014, 2016, and 2018). It calculates a relative percentile ranking in four theme categories, plus an overall ranking, at a given geographic context and geographic scale. The themes are:

    • Socioeconomic
    • Household Composition & Disability
    • Minority Status & Language
    • Housing Type & Transportation

    Rankings are calculated relative to a geographic context: a single state or all states (United States). For example, a ranking calculated for some location at the United States geographic context is relative to all other locations where rankings were calculated in the United States. Similarly, SVI rankings are calculated at two geographic scales: census tract and county. That is, each ranking corresponds to either a county or a census tract. For example, if you were to retrieve the 2018 SVI at the census tract scale, at the state context for the state of Alabama, you would receive 1180 records (the number of census tracts in AL in the 2010 census), where each ranked percentile is calculated relative to the other census tracts in Alabama. The tool released in this PR only supports querying for rankings calculated at the United States geographic context. Future work will add support for retrieving rankings at the state spatial scale.

    Documentation for each year release of the SVI are located below:

    Example

    from hydrotools.svi_client import SVIClient
    
    client = SVIClient()
    df = client.get(
        location="AL", # state / nation name (i.e. "alabama" or "United States") also accepted. case insensitive
        geographic_scale="census_tract", # "census_tract" or "county"
        year="2018", # 2000, 2010, 2014, 2016, or 2018
        geographic_context="national" # only "national" supported. "state" will be supported in the future
        )
    print(df)
                        state_name state_abbreviation  ... svi_edition                                           geometry
            0        alabama                 al  ...        2018  POLYGON ((-87.21230 32.83583, -87.20970 32.835...
            1        alabama                 al  ...        2018  POLYGON ((-86.45640 31.65556, -86.44864 31.655...
            ...          ...                ...  ...         ...                                                ...
            29498    alabama                 al  ...        2018  POLYGON ((-85.99487 31.84424, -85.99381 31.844...
            29499    alabama                 al  ...        2018  POLYGON ((-86.19941 31.80787, -86.19809 31.808...
    

    Additions

    • adds a client for programmatically accessing the Centers for Disease Control and Prevention's (CDC) Social Vulnerability Index (SVI)

    Testing

    1. Integration tests are included that test all valid SVI queries currently supported by the tool.
    2. Some unit tests included for utility functions.

    Todos

    • Future work will add support for retrieving rankings at the state spatial scale.

    Checklist

    • [x] PR has an informative and human-readable title
    • [x] PR is well outlined and documented. See #12 for an example
    • [x] Changes are limited to a single goal (no scope creep)
    • [x] Code can be automatically merged (no conflicts)
    • [x] Code follows project standards (see CONTRIBUTING.md)
    • [x] Passes all existing automated tests
    • [x] Any change in functionality is tested
    • [x] New functions are documented (with a description, list of inputs, and expected output) using numpy docstring formatting
    • [x] Placeholder code is flagged / future todos are captured in comments
    • [x] Reviewers requested with the Reviewers tool :arrow_right:
    opened by aaraney 44
  • Cache GCP client responses to reduce repeated operations and network calls

    Each time the gcp client is used, it hits gcp to get the requested data. Given the size of the data and that repeated process, it only makes sense to implement some kind of cache. I propose that we use a file db (i.e. sqlite) to accomplish this for simplicity and broad support in python.

    High level logic

    1. check if db cache has desired data (stored as df)
      1. yes - return data
      2. no - continue
    2. get data from gcp
    3. create a df from the gcp data
    4. cache this df in an sqlite db using the URL path as the key
    5. return the df

    Requirements

    • sqlite lib must support multiprocessing and batch commits

    The sqlitedict library is mature, maintained, and seems to fit this bill for this feature. The lib lets you create/connect with a db and use it like you would a python dictionary. Most importantly, it supports multiprocessing.
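
The high-level logic above can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard library's sqlite3 and pickle modules rather than sqlitedict, the helper name get_with_cache, the fake_fetch stand-in, and the example key are all hypothetical, and a plain dict stands in for the DataFrame that would be cached.

```python
import os
import pickle
import sqlite3
import tempfile
from typing import Any, Callable


def get_with_cache(db_path: str, key: str, fetch: Callable[[str], Any]) -> Any:
    """Return the object cached under key, calling fetch(key) only on a miss."""
    con = sqlite3.connect(db_path)
    try:
        con.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value BLOB)"
        )
        # Step 1: check whether the cache already holds the desired data.
        row = con.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return pickle.loads(row[0])
        # Steps 2-3: retrieve the data and build the object to cache.
        data = fetch(key)
        # Step 4: store it, using the URL path as the key.
        con.execute(
            "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
            (key, pickle.dumps(data)),
        )
        con.commit()
        # Step 5: return the freshly retrieved object.
        return data
    finally:
        con.close()


# Hypothetical stand-in for the network call; records how often it runs.
calls = []


def fake_fetch(url_path: str) -> dict:
    calls.append(url_path)
    return {"url_path": url_path, "values": [1.0, 2.0]}


db_path = os.path.join(tempfile.mkdtemp(), "cache.sqlite")
example_key = "example/url/path.nc"  # made-up key for illustration
first = get_with_cache(db_path, example_key, fake_fetch)
second = get_with_cache(db_path, example_key, fake_fetch)
print(len(calls))
```

The second call is served from the sqlite file, so the fetch function runs only once. This sketch does not address the multiprocessing and batch-commit requirements listed above.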

    enhancement 
    opened by aaraney 26
  • As ROE user, I am getting a file that cache nwis data in my working folder that is collapsing the system.

    Hello guys! I have a brand new installation of miniconda3 (version 3.8) on a new system. I am calling the IV module to retrieve NWIS data and see a file in my working folder called:

    /[working_folder]/nwisiv_cache.sqlite

    How can I tell the program that I want that file in a specific location (i.e. /home)? I do not want this huge file to affect my system. Where is this file written by default?

    Thanks! Alex

    bug 
    opened by amaes3owp 13
  • `nwis_client`: `SyntaxError`

    Issue

    hydrotools.nwis_client version 3.0.6 raises a SyntaxError on import. @aaraney would you mind looking into this?

    Environment:

    Python 3.8.10 (default, Jun  4 2021, 15:09:15)
    [GCC 7.5.0] :: Anaconda, Inc. on linux
    

    Duplicate

    import hydrotools.nwis_client
    

    Error

    'base_url:typing.Union[str, yarl.URL, NoneType]=None' of kind 'POSITIONAL_OR_KEYWORD' follows 'cache:CacheBackend=None' of kind 'KEYWORD_ONLY'
    Traceback (most recent call last):
      File "/home/jason.regina/Projects/nwis_test/miniconda3/lib/python3.8/site-packages/aiohttp_client_cache/docs/forge_utils.py", line 32, in wrapper
        return revision(target_function)
      File "/home/jason.regina/Projects/nwis_test/miniconda3/lib/python3.8/site-packages/forge/_revision.py", line 330, in __call__
        next_.validate()
      File "/home/jason.regina/Projects/nwis_test/miniconda3/lib/python3.8/site-packages/forge/_signature.py", line 1344, in validate
        raise SyntaxError(
    SyntaxError: 'base_url:typing.Union[str, yarl.URL, NoneType]=None' of kind 'POSITIONAL_OR_KEYWORD' follows 'cache:CacheBackend=None' of kind 'KEYWORD_ONLY'
    
    bug 
    opened by jarq6c 12
  • NWIS Client: startDT and endDT get shifted an hour

    When retrieving data using the nwis_client tool, something is happening when specifying the startDT and endDT options where the returned data is shifted forward in time by 1 hour. We may be able to clean up the date handling and hand off a lot of it to pandas.

    bug 
    opened by jarq6c 11
  • (WIP) Change restclient from monkey patching

    In relation to #14 and #70. _restclient no longer uses requests_cache.install_cache which monkeypatches the requests library. Instead, requests_cache.CachedSession is now used. enable_cache [bool] parameter added to constructor of RestClient

    Changes

    • requests_cache.install_cache is no longer used. Instead requests_cache.CachedSession is now used.
    • enable_cache [bool] parameter added to constructor of RestClient

    Notes

    • Passes all unittests, however several slow tests for nwis_client are now failing. There appears to be something happening with multiprocessing and the requests_cache.CachedSession module.

    Checklist

    • [x] PR has an informative and human-readable title
    • [x] PR is well outlined and documented. See #12 for an example
    • [x] Changes are limited to a single goal (no scope creep)
    • [x] Code can be automatically merged (no conflicts)
    • [x] Code follows project standards (see CONTRIBUTING.md)
    • [ ] Passes all existing automated tests
    • [x] Any change in functionality is tested
    • [x] New functions are documented (with a description, list of inputs, and expected output) using numpy docstring formatting
    • [x] Placeholder code is flagged / future todos are captured in comments
    • [x] Reviewers requested with the Reviewers tool :arrow_right:
    enhancement 
    opened by aaraney 11
  • Add check for complete categories when constructing contingency tables

    This adds a check and corrects missing boolean categories in Categorical series input to compute_contingency_table.
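
A rough sketch of the idea, not the code merged in this PR: the helper name ensure_boolean_categories is hypothetical, and it only shows how a categorical series that contains just one boolean value can be given both categories.

```python
import pandas as pd


def ensure_boolean_categories(series: pd.Series) -> pd.Series:
    """Hypothetical helper: return a categorical copy with True and False categories."""
    series = series.astype("category")
    missing = [flag for flag in (True, False) if flag not in series.cat.categories]
    if missing:
        series = series.cat.add_categories(missing)
    return series


# An "all True" series has only one category until it is corrected.
observed = pd.Series([True, True, True], dtype="category")
fixed = ensure_boolean_categories(observed)

# Categorical value_counts reports a zero count for the added category.
counts = fixed.value_counts()
print(counts)
```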

    Additions

    • Checks for both True and False in observed and simulated input series.

    Removals

    • None

    Changes

    • Update original test to use categorical series instead of raw categories per the original docstring and intended use of compute_contingency_table.

    Testing

    1. Four new test scenarios for compute_contingency_table that check for all True and all False cases.

    Notes

    • Fixes #183

    Todos

    • None

    Checklist

    • [x] PR has an informative and human-readable title
    • [x] PR is well outlined and documented. See #12 for an example
    • [x] Changes are limited to a single goal (no scope creep)
    • [x] Code can be automatically merged (no conflicts)
    • [x] Code follows project standards (see CONTRIBUTING.md)
    • [x] Passes all existing automated tests
    • [x] Any change in functionality is tested
    • [x] New functions are documented (with a description, list of inputs, and expected output) using numpy docstring formatting
    • [ ] Placeholder code is flagged / future todos are captured in comments
    • [x] Reviewers requested with the Reviewers tool :arrow_right:
    bug 
    opened by jarq6c 9
  • As user, I would like event detection to consider when a station is temporarily discontinued

    Good morning. I am trying to create a list of events for the past 10 days at station FLRV2

    https://water.weather.gov/ahps2/hydrograph.php?wfo=rnk&gage=flrv2

    https://waterdata.usgs.gov/nwis/uv?cb_00060=on&cb_00065=on&format=gif_default&site_no=02064000&period=10&begin_date=2021-05-03&end_date=2021-05-10

    There has been a gage relocation due to a bridge construction. The function rolling_minimum is failing.

    Thanks, Alex

    bug 
    opened by amaes3owp 9
  • MAJOR CHANGE: Rebranding package namespace from evaluation_tools to HydroTools (hydrotools)

    The decision has been made to change the project's name from evaluation_tools to HydroTools. We feel this name better represents the goals of the project; as such, the repo remote name was also changed prior to this PR.

    The bulk of this PR moves both source and test file references and documentation under the new namespace and branding.

    Changes

    • python package namespace changed from evaluation_tools to hydrotools in source and test files
    • branding of package now camel cased as HydroTools
    • developer installation procedure changed from pip install -e . to python setup.py develop. The former will no longer work.
    • all major package versions bumped:
      • hydrotools: 2.0.0-alpha.0
      • hydrotools.metrics: 1.0.0-alpha.0
      • hydrotools.events: 1.0.0-alpha.0
      • hydrotools.gcp_client: 1.0.0-alpha.0
      • hydrotools.nwis_client: 2.0.0-alpha.0
      • hydrotools._restclient: 2.0.0-alpha.0

    Notes

    Todos

    • fix href to hydrotools._restclient made in 9779ab2 prior to deploy

    Checklist

    • [x] PR has an informative and human-readable title
    • [x] PR is well outlined and documented. See #12 for an example
    • [x] Changes are limited to a single goal (no scope creep)
    • [x] Code can be automatically merged (no conflicts)
    • [x] Code follows project standards (see CONTRIBUTING.md)
    • [x] Passes all existing automated tests
    • [x] Any change in functionality is tested
    • [x] New functions are documented (with a description, list of inputs, and expected output) using numpy docstring formatting
    • [x] Placeholder code is flagged / future todos are captured in comments
    • [x] Reviewers requested with the Reviewers tool :arrow_right:
    opened by aaraney 9
  • Resolve Nwm client IndexError: invalid index to scalar variable. (#180)

    See #180 for the bug report. This PR lays the groundwork for two new releases of hydrotools.nwm_client. The first release, 5.0.1-post1, pins h5netcdf<=0.13.0 to support users whose software depends on h5netcdf<=0.13.0. The second release, 5.0.2, pins h5netcdf>=0.14.0 and includes a patch that fixes the IndexError propagating from h5netcdf==0.14.0.

    fixes #180.

    Guidance

    • if your application requires h5netcdf<=0.13.0, use hydrotools.nwm_client==5.0.1-post1
    • otherwise, use hydrotools.nwm_client>=5.0.2

    Changes

    • Resolves IndexError: invalid index to scalar variable. present in hydrotools.nwm_client<=5.0.1

    Checklist

    • [x] PR has an informative and human-readable title
    • [x] PR is well outlined and documented. See #12 for an example
    • [x] Changes are limited to a single goal (no scope creep)
    • [x] Code can be automatically merged (no conflicts)
    • [x] Code follows project standards (see CONTRIBUTING.md)
    • [ ] Passes all existing automated tests
    • [x] Any change in functionality is tested
    • [x] New functions are documented (with a description, list of inputs, and expected output) using numpy docstring formatting
    • [x] Placeholder code is flagged / future todos are captured in comments
    • [x] Reviewers requested with the Reviewers tool :arrow_right:
    opened by aaraney 8
  • NWIS Client: Expose cache filename as option

    Allow users to set an alternative path for the cache when instantiating IVDataService.

    Relevant line here: https://github.com/NOAA-OWP/hydrotools/blob/7294aac3cb65001c933b80fcc4a73bd5724f3765/python/nwis_client/src/hydrotools/nwis_client/iv.py#L89

    enhancement 
    opened by jarq6c 8
  • NWM Client New Test Failure: AttributeError: 'EntryPoints' object has no attribute 'get'

    The test runners for #210 found a failing nwm_client_new test. The AttributeError is raised in xarray source when getting a xarray backend. See the below collapsed test failure trace for more detail.

    FAILED python/nwm_client_new/tests/test_NWMFileProcessor.py::test_get_dataset - AttributeError: 'EntryPoints' object has no attribute 'get'
    
    full trace

    source

    =================================== FAILURES ===================================
    _______________________________ test_get_dataset _______________________________
    
    Unclosed client session
    client_session: <hydrotools._restclient.async_client.ClientSession object at 0x7f90b51bf650>
        def test_get_dataset():
    >       ds = NWMFileProcessor.get_dataset(input_directory)
    
    python/nwm_client_new/tests/test_NWMFileProcessor.py:8: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    /opt/hostedtoolcache/Python/3.7.15/x64/lib/python3.7/site-packages/hydrotools/nwm_client_new/NWMFileProcessor.py:59: in get_dataset
        ds = xr.open_mfdataset(file_list, engine="netcdf4")
    /opt/hostedtoolcache/Python/3.7.15/x64/lib/python3.7/site-packages/xarray/backends/api.py:908: in open_mfdataset
        datasets = [open_(p, **open_kwargs) for p in paths]
    /opt/hostedtoolcache/Python/3.7.15/x64/lib/python3.7/site-packages/xarray/backends/api.py:908: in <listcomp>
        datasets = [open_(p, **open_kwargs) for p in paths]
    /opt/hostedtoolcache/Python/3.7.15/x64/lib/python3.7/site-packages/xarray/backends/api.py:481: in open_dataset
        backend = plugins.get_backend(engine)
    /opt/hostedtoolcache/Python/3.7.15/x64/lib/python3.7/site-packages/xarray/backends/plugins.py:161: in get_backend
        engines = list_engines()
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    
        @functools.lru_cache(maxsize=1)
        def list_engines():
    >       entrypoints = entry_points().get("xarray.backends", ())
    E       AttributeError: 'EntryPoints' object has no attribute 'get'
    
    /opt/hostedtoolcache/Python/3.7.15/x64/lib/python3.7/site-packages/xarray/backends/plugins.py:105: AttributeError
    =============================== warnings summary ===============================
    python/nwm_client_new/tests/test_NWMClient.py::test_QueryError
      /opt/hostedtoolcache/Python/3.7.15/x64/lib/python3.7/site-packages/hydrotools/nwm_client_new/NWMFileClient.py:197: RuntimeWarning: No data found for configuration 'analysis_assim' and reference time '20T00Z'
        warnings.warn(message, RuntimeWarning)
    
    python/nwm_client_new/tests/test_NWMClient.py::test_QueryError
      /opt/hostedtoolcache/Python/3.7.15/x64/lib/python3.7/site-packages/hydrotools/nwm_client_new/NWMFileClient.py:197: RuntimeWarning: No data found for configuration 'analysis_assim' and reference time '30T01Z'
        warnings.warn(message, RuntimeWarning)
    
    -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
    =========================== short test summary info ============================
    FAILED python/nwm_client_new/tests/test_NWMFileProcessor.py::test_get_dataset - AttributeError: 'EntryPoints' object has no attribute 'get'
    

    A quick look at xarray's issue tracker turned up a verbatim reproduction of this issue. The conversation there pointed to the source of the problem: the newly released version (5.0.0) of importlib-metadata combined with python 3.7 (no longer actively supported by xarray).
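
For context, the failing call relies on entry_points() returning a dict-like object. The sketch below is not xarray's fix; it only illustrates, using the standard library's importlib.metadata (Python 3.8+), how code can cope with both the older dict-style interface and the newer select() interface.

```python
from importlib.metadata import entry_points

GROUP = "xarray.backends"

eps = entry_points()
if hasattr(eps, "select"):
    # Newer interface: filter entry points by group with select().
    backends = eps.select(group=GROUP)
else:
    # Older interface: entry_points() returns a dict keyed by group name.
    backends = eps.get(GROUP, ())

# The list is empty when no package registers an entry point in this group.
backend_names = sorted(ep.name for ep in backends)
print(backend_names)
```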

    opened by aaraney 3
  • Resolve nwis service raising pandas FutureWarning

    pandas >= 1.5.0 was raising FutureWarning in nwis_client.get because of ambiguity in setting a column's values in-place or replacing the column's underlying array. Specifically, the warning is:

    FutureWarning: In a future version, 
    `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To 
    retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, 
    `df.isetitem(i, newvals)` dfs.loc[:, "value"] = pd.to_numeric(dfs["value"], downcast="float")
    

    This PR fixes that issue.
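
The PR's actual diff is not shown here. As an assumption-laden sketch, one way to avoid the ambiguity the warning describes is to assign the converted column by label instead of through .loc; the small DataFrame below is made up for illustration.

```python
import warnings

import pandas as pd

# Hypothetical frame with string values, as might come from a parsed response.
dfs = pd.DataFrame({"value": ["1.5", "2.25", "3.0"]})

with warnings.catch_warnings():
    # Turn any FutureWarning into an error so a regression would be obvious.
    warnings.simplefilter("error", FutureWarning)
    # Replace the whole column rather than setting values through .loc.
    dfs["value"] = pd.to_numeric(dfs["value"], downcast="float")

print(dfs["value"].dtype)
```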

    fixes #209

    nwis_client

    Changes

    • pandas >= 1.5.0 no longer raises FutureWarning in nwis_client.get. See #209.

    Testing

    1. test added to verify warning is not raised in nwis_client.get.

    Checklist

    • [x] PR has an informative and human-readable title
    • [x] PR is well outlined and documented. See #12 for an example
    • [x] Changes are limited to a single goal (no scope creep)
    • [x] Code can be automatically merged (no conflicts)
    • [x] Code follows project standards (see CONTRIBUTING.md)
    • [ ] Passes all existing automated tests
    • [x] Any change in functionality is tested
    • [x] New functions are documented (with a description, list of inputs, and expected output) using numpy docstring formatting
    • [x] Placeholder code is flagged / future todos are captured in comments
    • [ ] Reviewers requested with the Reviewers tool :arrow_right:
    opened by aaraney 1
  • NWIS IV Client `FutureWarning`

    When using the nwis iv client, I receive this warning:

    ~/env/lib/python3.8/site-packages/hydrotools/nwis_client/iv.py:283: FutureWarning: In a future version, 
    `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To 
    retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, 
    `df.isetitem(i, newvals)` dfs.loc[:, "value"] = pd.to_numeric(dfs["value"], downcast="float")
    
    bug 
    opened by jarq6c 3
  • Modules missing stubs or py.typed markers

    I received this error when running mypy on a project that depends on hydrotools.nwis_client.

    error: Skipping analyzing "hydrotools.nwis_client.iv": module is installed, but missing library stubs or py.typed marker
    

    I discovered that none of the hydrotools subpackages are PEP 561 compliant, even though they use type hints. I think we should consider adding py.typed files to help users who rely on type checkers.

    opened by jarq6c 0
  • NOTICE: Incorrect NWM RouteLink Assignment - Channel Feature 3624261

    @kvanwerkhoven discovered that the NWM Routelink contains an erroneous mapping from Reach ID 3624261 to USGS Gage 07010040. The error is present in the RouteLink stored on NOMADS and HydroShare. This error affects data returned from the nwm_client and nwm_client_new subpackages. We will update the RouteLink used by HydroTools once it's fixed.

    The National Water Model developers have been notified.

    Gage: https://waterdata.usgs.gov/mo/nwis/nwismap/?site_no=07010040&agency_cd=USGS

    NWM Reach 3624261 is for Deer Creek (contributing area ~6 sq.mi). USGS Gage 07010040 is for Denny Creek (contributing area ~0.5 sq.mi), a tributary to Deer Creek. See the map below (source: water.noaa.gov/map).

    noaa_water_map

    documentation 
    opened by jarq6c 0
Releases(v2.2.3)
  • v2.2.3(May 24, 2022)

    Content

    hydrotools-2.2.3 _restclient-3.0.5 nwis_client-3.2.1 caches-0.1.3 nwm_client-5.0.3 events-1.1.5 metrics-1.2.3

    What's Changed

    • Add Kling-Gupta Efficiency by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/172
    • Fix 160. **params arg in IVDataService.get now raises warning when case insensitively matches other parameters by @aaraney in https://github.com/NOAA-OWP/hydrotools/pull/174
    • Add cache_filename parameter to the NWIS Client constructor by @aaraney in https://github.com/NOAA-OWP/hydrotools/pull/176
    • Add nwm_client documentation and minor subpackage level import changes by @aaraney in https://github.com/NOAA-OWP/hydrotools/pull/179
    • Resolve Nwm client IndexError: invalid index to scalar variable. (#180) by @aaraney in https://github.com/NOAA-OWP/hydrotools/pull/182
    • Add check for complete categories when constructing contingency tables by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/184
    • Add nwis_client CLI by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/185
    • Patch update to nwm-client HTTP interface by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/186
    • Suppress NWIS Client value_time warning by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/187
    • Improvements to validation checks by @groutr in https://github.com/NOAA-OWP/hydrotools/pull/189
    • bump metrics version to 1.2.2 by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/190
    • Make map conversions more efficient alternative by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/191
    • HydroTools Superpackage Release 2.2.3 by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/194
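
    For readers unfamiliar with the Kling-Gupta Efficiency added in this release, the sketch below computes the commonly cited 2009 formulation from paired observed and simulated values. It is a standalone illustration using only the standard library; the function name and signature are assumptions and do not reflect the hydrotools metrics API, which may differ (for example, in scaling options).

```python
import math
from statistics import fmean, pstdev


def kling_gupta_efficiency(obs, sim):
    """Illustrative KGE: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2).

    Hypothetical helper for illustration, not the hydrotools implementation.
    """
    mu_o, mu_s = fmean(obs), fmean(sim)
    sd_o, sd_s = pstdev(obs), pstdev(sim)
    # Population covariance between observations and simulations
    cov = fmean([(o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)])
    r = cov / (sd_o * sd_s)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


observed = [1.0, 2.0, 3.0, 4.0, 5.0]
perfect = kling_gupta_efficiency(observed, observed)
doubled = kling_gupta_efficiency(observed, [2.0 * v for v in observed])
```

    A perfect simulation scores 1. Doubling every simulated value keeps the correlation at 1 but sets both ratios to 2, which lowers the score to 1 - sqrt(2).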

    New Contributors

    • @groutr made their first contribution in https://github.com/NOAA-OWP/hydrotools/pull/189

    Full Changelog: https://github.com/NOAA-OWP/hydrotools/compare/v2.2.2...v2.2.3

  • v2.2.2(Dec 15, 2021)

    Content

    hydrotools-2.2.2 _restclient-3.0.5 nwis_client-3.0.6 caches-0.1.3 nwm_client-5.0.1 events-1.1.5 metrics-1.1.3

    Notes

    gcp_client is still available, but deprecated. nwm_client_new is still in pre-release development. The hydrotools superpackage now pulls in google-cloud-storage dependencies by default.

    What's Changed

    • Warn for Null values in series when calling event detection by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/124
    • Update documentation by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/125
    • update name version and front page by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/126
    • Update package level installation makefile by @aaraney in https://github.com/NOAA-OWP/hydrotools/pull/128
    • Fix #99: "RuntimeError: This event loop is already running" in colab and notebook. Re-PR of #100 by @aaraney in https://github.com/NOAA-OWP/hydrotools/pull/130
    • fix nwis_client bug causing KeyError when station does not return data -- Updated by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/135
    • fix nwis_client bug causing KeyError when station does not return data by @aaraney in https://github.com/NOAA-OWP/hydrotools/pull/134
    • nwis_client now requires _restclient >= 3.0.4 by @aaraney in https://github.com/NOAA-OWP/hydrotools/pull/140
    • Redesign NWM Client Subpackage by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/138
    • Move package/subpackage version inside package source by @aaraney in https://github.com/NOAA-OWP/hydrotools/pull/142
    • Move nwm_client_new version inside package source by @aaraney in https://github.com/NOAA-OWP/hydrotools/pull/149
    • fix 153; _restclient pins aiohttp version to <=3.7.4.post0 by @aaraney in https://github.com/NOAA-OWP/hydrotools/pull/154
    • Switch NWM Crosswalk Source from CSV to URL by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/156
    • NWM Client: Limit number of get calls for testing by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/158
    • NWM Client New: Separate Large Modules by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/159
    • Metrics: Default to numpy scalars for computation by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/163
    • Advance Superpackage to 2.2.2 by @jarq6c in https://github.com/NOAA-OWP/hydrotools/pull/165

    Full Changelog: https://github.com/NOAA-OWP/hydrotools/compare/v2.1.2...v2.2.2

  • v2.1.2(Aug 6, 2021)

    Content

    hydrotools-2.1.2 _restclient-3.0.3 nwis_client-3.0.4 caches-0.1.3 nwm_client-5.0.1 events-1.1.4 metrics-1.0.3

    Details

    Mostly minor documentation updates. Switched from gcp_client to nwm_client to move toward providing a unified interface for accessing NWM data. gcp_client is still available, but deprecated. nwm_client includes an http interface for retrieving data from generic web servers.

  • v2.1.1(Jul 30, 2021)

    Content

    hydrotools-2.1.1 _restclient-3.0.2 nwis_client-3.0.3 caches-0.1.2 gcp_client-4.1.1 events-1.1.3 metrics-1.0.2

    Details

    Includes various bug fixes and improves documentation to reflect slightly new functionality from last release. Package structures have been refactored to adopt a "src" folder structure and use setup.cfg files. This makes for a slightly cleaner package base and unifies installation and development under pip.

    This release includes the addition of the hydrotools super package. This allows installation of all packages using:

    $ python3 -m pip install hydrotools
    

    Minimum versions are set where applicable to guarantee cross-package compatibility.

  • v2.1.0(Jun 16, 2021)

    Additions

    • IVDataService parameters: enable_cache and cache_expire_after allowing control over cache usage.
    • Url keyword arguments quote_treatment, safe, and quote_overide_map added to support flexibility in quoted url representation. Urls are still comparable using the == operator; however, urls can differ in their quote_url representation.
    • RestClient.mget now accepts a collection of query parameters and/or headers. Providing urls to mget is not required if a base url is set, so a collection of queries can be passed and they will be built into requests as parameters of the base url.
    • Url: Treats urls like pathlib.Paths. Supports appending paths with / and appending url query parameters with +.
    • ClientSession: Extension of aiohttp_client_cache that adds exponential backoff.
    • Variadic: Join list or tuple of objects on some delimiter. Useful when query parameters utilize a non-key=value&key=value2 pattern.
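
    The operator behavior described for Url and the delimiter joining described for Variadic can be pictured with the minimal sketch below. SketchUrl and join_variadic are hypothetical stand-ins written for this illustration; they are not the _restclient classes, and the example host is a placeholder.

```python
from typing import Optional
from urllib.parse import urlencode


class SketchUrl:
    """Hypothetical stand-in showing pathlib-style url building."""

    def __init__(self, base: str, query: Optional[dict] = None):
        self.base = base.rstrip("/")
        self.query = dict(query or {})

    def __truediv__(self, segment: str) -> "SketchUrl":
        # "/" appends a path segment, much like pathlib.Path
        return SketchUrl(f"{self.base}/{segment.strip('/')}", self.query)

    def __add__(self, params: dict) -> "SketchUrl":
        # "+" merges in query parameters
        return SketchUrl(self.base, {**self.query, **params})

    def __str__(self) -> str:
        if not self.query:
            return self.base
        return f"{self.base}?{urlencode(self.query)}"


def join_variadic(values, delimiter=","):
    """Join values on a delimiter for list-style query parameters."""
    return delimiter.join(str(v) for v in values)


sites = join_variadic(["01646500", "02146470"])
url = SketchUrl("https://example.com") / "api" / "iv" + {"sites": sites}
```

    Because / binds more tightly than +, the path segments are appended before the query parameters are merged.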

    Removals

    • IVDataService parameters: process and retry removed.
    • All code using multiprocessing removed
    • IVDataService methods following the patterns get_raw_ and get_json removed
    • U.S. State constants removed and no longer checked
    • procs property removed

    Changes

    • IVDataService now uses aiohttp via _restclient to do asynchronous retrieval of data.
    • documentation updated with examples
    • IVDataService::get
      • now accepts multiple states, hucs, counties, bounding boxes
      • is now an instance method instead of a classmethod
    • The default splitting factor for sites is now 20 instead of the previous 100.
    • Update Url docstring to reflect new kwarg additions
    • RestClient: a class which offers asynchronous performance via a convenient serial wrapper, sqlite database GET request caching, GET request backoff, batch GET requests, and flexible url argument encoding.
    • RestClient parameters changed (breaking).
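
    The splitting factor mentioned in the changes above can be read as the maximum number of sites grouped into a single request. The sketch below shows that idea under this assumption; split_sites is a hypothetical helper, not a function from nwis_client.

```python
from typing import Iterator, List, Sequence


def split_sites(sites: Sequence[str], factor: int = 20) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `factor` sites (illustrative only)."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    for start in range(0, len(sites), factor):
        yield list(sites[start:start + factor])


# 45 made-up, zero-padded site codes for demonstration
sites = [f"{n:08d}" for n in range(1, 46)]
groups = list(split_sites(sites))
group_sizes = [len(group) for group in groups]
```

    With 45 sites and a factor of 20, this produces three groups of 20, 20, and 5 sites. A smaller factor means more, shorter requests.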
  • v2.0.0-alpha.1(Jun 16, 2021)

    Last release before transition to asynchronous _restclient and nwis_client.

    Subpackages

    _restclient v2.1.0-alpha.0 caches v0.1.1 events 1.0.2 gcp_client 3.0.0 metrics 1.0.1 nwis_client 2.0.0-alpha.0

  • 2.0.0-alpha.0(Apr 2, 2021)

  • v1.3.5(Apr 1, 2021)

    • Refinements to event detection noise handling
    • Updates to package handling
    • Incorporates additional metrics (MSE and NSE)
    • Last release before namespace change to hydrotools
  • v1.3.2(Feb 18, 2021)

  • v1.3.0(Feb 12, 2021)

    Generic tools for conducting National Water Model evaluations against USGS Observations.

    Included Tools

    _restclient v1.0.0 events v0.2.1 gcp_client v0.2.0 metrics v0.1.1 nwis_client v1.1.0
