Mypy stubs, i.e., type information, for numpy, pandas and matplotlib

Overview

Mypy type stubs for NumPy, pandas, and Matplotlib

Join the chat at https://gitter.im/data-science-types/community

This is a PEP-561-compliant stub-only package which provides type information for matplotlib, numpy and pandas. Once this package is installed, the mypy type checker (or pytype or PyCharm) can recognize the types in these libraries.
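As a rough illustration of what "stub-only package" means in practice: PEP 561 names such packages `<library>-stubs`, and type checkers look for directories with that suffix next to the installed libraries. The sketch below scans a list of directories for them. The helper name `find_stub_packages` is made up for this example, and an editable install (`pip install -e .`) may not show up this way, since it is usually linked rather than copied into site-packages.

```python
import sys
from pathlib import Path
from typing import List, Optional, Sequence


def find_stub_packages(search_paths: Optional[Sequence[str]] = None) -> List[str]:
    """Return the sorted names of directories ending in '-stubs' on the given paths.

    Hypothetical helper for illustration only; it is not part of this package.
    """
    paths = sys.path if search_paths is None else search_paths
    found = set()
    for entry in paths:
        directory = Path(entry)
        if not directory.is_dir():
            continue  # skip zip files and paths that do not exist
        try:
            children = list(directory.iterdir())
        except OSError:
            continue  # unreadable directory
        for child in children:
            if child.is_dir() and child.name.endswith("-stubs"):
                found.add(child.name)
    return sorted(found)
```

Calling `find_stub_packages()` with no arguments searches `sys.path`; after a regular install of this package one would expect names such as `numpy-stubs` in the result.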

NOTE: This is a work in progress

Many functions are already typed, but a lot is still missing (NumPy and pandas are huge libraries). Chances are, you will see a message from Mypy claiming that a function does not exist when it does exist. If you encounter missing functions, we would be delighted for you to send a PR. If you are unsure of how to type a function, we can discuss it.
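For anyone unsure what typing a function involves, here is a minimal sketch of a stub-style declaration. `parse_numbers` is a made-up name, not a real NumPy or pandas function, and the signature is an assumption chosen only to show the shape. In a `.pyi` file the declaration would appear exactly like this, with `...` as the body; it is written as ordinary Python here only so that it can be executed.

```python
from typing import List, get_type_hints


# Hypothetical example: `parse_numbers` does not exist in NumPy or pandas.
# A stub declares only the signature; `...` stands in for the body.
def parse_numbers(values: List[str]) -> List[float]: ...


# The annotations are what a type checker reads; they can also be
# inspected at runtime.
hints = get_type_hints(parse_numbers)
print(hints["values"], hints["return"])
```

Adding a missing function to the stubs is mostly a matter of writing one such declaration per function (or several, using `typing.overload`, when the return type depends on the arguments).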

Installing

You can get this package from PyPI:

pip install data-science-types

To get the most up-to-date version, install it directly from GitHub:

pip install git+https://github.com/predictive-analytics-lab/data-science-types

Or clone the repository somewhere and run pip install -e . from inside it.

Examples

These are the kinds of things that can be checked:

Array creation

import numpy as np

arr1: np.ndarray[np.int64] = np.array([3, 7, 39, -3])  # OK
arr2: np.ndarray[np.int32] = np.array([3, 7, 39, -3])  # Type error
arr3: np.ndarray[np.int32] = np.array([3, 7, 39, -3], dtype=np.int32)  # OK
arr4: np.ndarray[float] = np.array([3, 7, 39, -3], dtype=float)  # Type error: the type of ndarray cannot be just "float"
arr5: np.ndarray[np.float64] = np.array([3, 7, 39, -3], dtype=float)  # OK

Operations

import numpy as np

arr1: np.ndarray[np.int64] = np.array([3, 7, 39, -3])
arr2: np.ndarray[np.int64] = np.array([4, 12, 9, -1])

result1: np.ndarray[np.int64] = np.divide(arr1, arr2)  # Type error
result2: np.ndarray[np.float64] = np.divide(arr1, arr2)  # OK

compare: np.ndarray[np.bool_] = (arr1 == arr2)

Reductions

import numpy as np

arr: np.ndarray[np.float64] = np.array([[1.3, 0.7], [-43.0, 5.6]])

sum1: int = np.sum(arr)  # Type error
sum2: np.float64 = np.sum(arr)  # OK
sum3: float = np.sum(arr)  # Also OK: np.float64 is a subclass of float
sum4: np.ndarray[np.float64] = np.sum(arr, axis=0)  # OK

# the same works with np.max, np.min and np.prod

Philosophy

The goal is not to recreate the APIs exactly. The main goal is to have useful checks on our code. Often the actual APIs in the libraries are more permissive than the type signatures in our stubs; but this is (usually) a feature and not a bug.
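A small sketch of that trade-off, using a made-up function `total` rather than anything from the actual stubs: the declared overloads accept only sequences of `int` or `float`, while the runtime implementation happily accepts any iterable.

```python
from typing import Sequence, overload


# Hypothetical signatures for illustration; not copied from the stubs.
@overload
def total(values: Sequence[int]) -> int: ...
@overload
def total(values: Sequence[float]) -> float: ...


def total(values):
    # The runtime behaviour is more permissive than the declared overloads:
    # any iterable whose items support `+` with a number works here.
    result = 0
    for value in values:
        result = result + value
    return result


print(total([1, 2, 3]))                   # matches the first overload
print(total(x for x in (1.5, 2.5)))       # runs fine, but a type checker rejects it
```

The second call works at runtime because a generator is iterable, yet it is not a `Sequence`, so a type checker would flag it. The narrower signature gives up some valid calls in exchange for catching more mistakes.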

Contributing

We always welcome contributions. All pull requests are subject to CI checks. We check for compliance with Mypy and that the file formatting conforms to our Black specification.

You can install these dev dependencies via

pip install -e '.[dev]'

This will also install NumPy, pandas, and Matplotlib to be able to run the tests.

Running CI locally (recommended)

We include a script for running the CI checks that are triggered when a PR is opened. To test these out locally, you need to install the type stubs in your environment. Typically, you would do this with

pip install -e .

Then use the check_all.sh script to run all tests:

./check_all.sh

Below we describe how to run the various checks individually, but check_all.sh should be easier to use.

Checking compliance with Mypy

The settings for Mypy are specified in the mypy.ini file in the repository. Just running

mypy tests

from the base directory should take these settings into account. We enforce 0 Mypy errors.

Formatting with black

We use Black to format the stub files. First, install black and then run

black .

from the base directory.

Pytest

python -m pytest -vv tests/

Flake8

flake8 *-stubs
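The four individual checks above can also be chained from Python. The following is only a sketch of the idea and makes no claim about what check_all.sh actually contains; the function names `run_checks` and `default_commands` are invented for this example, and the tools are assumed to be installed and on the PATH.

```python
import glob
import subprocess
import sys
from typing import List, Sequence, Tuple


def run_checks(commands: Sequence[Sequence[str]]) -> List[Tuple[str, int]]:
    """Run each command in order and collect (command line, exit code) pairs."""
    results = []
    for command in commands:
        completed = subprocess.run(list(command))
        results.append((" ".join(command), completed.returncode))
    return results


def default_commands() -> List[List[str]]:
    """The commands listed in this README, to be run from the base directory."""
    # A shell would expand *-stubs for flake8; glob does the same job here.
    stub_dirs = sorted(glob.glob("*-stubs"))
    return [
        ["mypy", "tests"],
        ["black", "."],
        [sys.executable, "-m", "pytest", "-vv", "tests/"],
        ["flake8", *stub_dirs],
    ]
```

From the base directory, `run_checks(default_commands())` returns one exit code per tool; any nonzero code means that check failed.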

License

Apache 2.0

Comments
  • Update pandas read_csv and to_csv

    Hey! I updated pandas read_csv and to_csv, and also made a small fix to pandas.Series (map function).

    There are some small changes made by the black formatter (was it badly formatted before, or did I have something wrong in my settings?).

    I would appreciate it if you could review this.

    opened by hellocoldworld 9
  • Support str and int as dtypes.

    Extend the set of dtypes with the str and int literals.

    Note -- it would help to add some comments to describe the intended use of the _Dtype types -- it was hard for me to guess if I needed to also extend any of these.

    Fix for #73

    opened by rpgoldman 9
  • Add Series.sort_index() signature

    • [x] Adds Series.sort_index based on stable version documentation
    • [x] Fixes wrong order of arguments in Series.sort_values() (ascending should go before inplace)
    • [x] Adds missing arguments to Series.sort_values()
    opened by krassowski 8
  • Small additions to DataFrame and Series

    I've made a few more additions to the stub, fleshing it out as I found I needed more for my work. I've corrected the issue I found - thanks again, thomkeh! - and hope that others can benefit from this work.

    Thank you!

    opened by ZHSimon 8
  • Flesh out pandas and numpy a bit more

    This is the result of testing data-science-types against another project I contribute to: https://github.com/jldbc/pybaseball

    • I added some common .gitignores for venv and vscode
    • I found a few Pandas tweaks to support functions and parameters that we are using.
      • Tweak DataFrame.apply, DataFrame.drop, DataFrame.merge, DataFrame.rank, DataFrame.reindex, DataFrame.replace
      • Add DataFrame.assign, DataFrame.filter
      • Tweak Series.rank
      • Add pandas.isnull
      • Tweak DataFrame.loc
    • A few changes to numpy as well
      • Allow tuples -> numpy.array
      • Tweak numpy.setdiff1d
      • Add numpy.cos, numpy.deg2rad, numpy.sin

    Everything was done using the latest Pandas docs for reference to data types:

    https://pandas.pydata.org/pandas-docs/stable/reference/

    I also did my best to add tests to support the changes as well

    opened by TheCleric 7
  • Shelelena/pandas improvements

    Improvements in pandas DataFrame, DataFrameGroupBy and SeriesGroupBy

    • specify DataFrame.groupby
    • add DataFrameGroupBy.aggregate
    • adjust data type in DataFrame.__init__
    • add __getattr__ to get columns in DataFrame and DataFrameGroupBy
    • correct return type of DataFrameGroupBy.__getitem__
    • add some missing statistical methods to DataFrameGroupBy and SeriesGroupBy
    opened by Shelelena 7
  • Fix numpy arange overload

    Change order of start/stop to comply with numpy documentation (https://numpy.org/doc/stable/reference/generated/numpy.arange.html) and change data types to float.

    This is my first ever PR on github for a public repository, so please be gentle. If I need to clear anything up, please let me know.

    opened by Hvlot 5
  • Missing pandas.to_numeric

    The pandas stubs are missing pandas.to_numeric.

    I would like to do a PR but I'm not really sure where to start or how to write proper type hints for this, as I've only just started learning about python typing for the last few days. Any help would be much appreciated.

    opened by wwuck 5
  • Add support for Series and DF at methods

    Created _AtIndexer classes for Series and DataFrame and used them to type the corresponding at() methods.

    Partial solution to #74.

    This doesn't fully work, because it doesn't handle the possibility that a data frame will contain categorical (string) or integer data, instead of just float. I don't know how to do this.

    opened by rpgoldman 5
  • Add support for pandas.IntDtypes and pandas.UIntDtypes

    This adds support for pandas:

    • Int8Dtype
    • Int16Dtype
    • Int32Dtype
    • Int64Dtype
    • UInt8Dtype
    • UInt16Dtype
    • UInt32Dtype
    • UInt64Dtype

    As well as a slew of base classes.

    We'll see how the CI likes it, but for some reason on my local machine, mypy is saying it can't find any of the types, when they have been clearly added to the __init__.pyi in the pandas-stubs root.

    If that continues to be a problem, I may need advice on how to fix.

    opened by TheCleric 4
  • Will numpy stubs be removed after next numpy release?

    Numpy has finally merged the stubs from numpy-stubs into the main numpy project.

    https://github.com/numpy/numpy-stubs/pull/88 https://github.com/numpy/numpy/pull/16515

    Will the numpy stubs in this project be removed when numpy 1.20.0 is released?

    opened by wwuck 4
  • No overload variant of "subplots" matches argument type "bool"

    When I perform the following:

    from matplotlib.pyplot import subplots
    FIG, AXES = subplots(constrained_layout=True)
    

    I get the warning:

    No overload variant of "subplots" matches argument type "bool".

    Does that need to be added?

    opened by uihsnv 0
  • test_frame_iloc fails on Pandas 1.2

    tests/pandas_test.py line 92 fails on Pandas 1.2

    Extracting the relevant code

    import pandas as pd
    df: pd.DataFrame = pd.DataFrame(
        [[1.0, 2.0], [4.0, 5.0], [7.0, 8.0]],
        index=["cobra", "viper", "sidewinder"],
        columns=["max_speed", "shield"],
    )
    s: "pd.Series[float]" = df["shield"].copy()
    df.iloc[0] = s
    

    Results in

    ValueError: could not broadcast input array from shape (3) into shape (2)
    

    This runs fine on Pandas 1.1.5

    opened by EdwardJRoss 0
  • Pandas `DataFrame.concat` missing some arguments

    The concat method for joining multiple DataFrames appears to be missing several arguments, such as join, keys, levels, and more.

    https://github.com/predictive-analytics-lab/data-science-types/blob/faebf595b16772d3aa70d56ea179a2eaffdbd565/pandas-stubs/__init__.pyi#L37-L42

    Compare to the Pandas docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html

    opened by kevinhu 0
Releases(v0.2.23)
Owner
Predictive Analytics Lab