A CLI tool to reduce friction between data scientists by removing notebook metadata and gracefully resolving git conflicts.

Last update: Dec 25, 2022

Overview

databooks

databooks is a package that reduces friction for data scientists using Jupyter notebooks by reducing the number of git conflicts between notebooks and assisting in the resolution of the conflicts that remain.

The key features include:

CLI tool
- Clear notebook metadata
- Resolve git conflicts
Simple to use
Simple API for modelling and comparing notebooks using Pydantic

Requirements

databooks is built on top of:

Python 3.8+
Typer
Rich
Pydantic
GitPython

Installation

pip install -i https://test.pypi.org/simple/ databooks

Usage

Clear metadata

Simply specify the paths for notebook files to remove metadata. By doing so, we can already avoid many of the conflicts.

$ databooks meta [OPTIONS] PATHS...
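To illustrate why clearing metadata avoids conflicts, here is a minimal sketch of the idea using only the standard library. It is not how databooks implements the meta command: the helper name `clear_metadata` and the choice of fields to reset (notebook metadata, cell metadata and execution counts) are assumptions for illustration.

```python
import json
from typing import Any, Dict


def clear_metadata(notebook: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a notebook dict with volatile fields reset.

    Hypothetical helper: which fields count as "volatile" is an assumption.
    """
    cleaned = json.loads(json.dumps(notebook))  # deep copy via JSON round-trip
    cleaned["metadata"] = {}  # notebook-level metadata (kernel info, etc.)
    for cell in cleaned.get("cells", []):
        cell["metadata"] = {}  # cell-level metadata
        if cell.get("cell_type") == "code":
            cell["execution_count"] = None  # changes on every run
    return cleaned


notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {"kernelspec": {"name": "python3"}},
    "cells": [
        {
            "cell_type": "code",
            "metadata": {"tags": ["demo"]},
            "source": "1 + 1",
            "execution_count": 7,
            "outputs": [],
        }
    ],
}
cleaned = clear_metadata(notebook)
print(json.dumps(cleaned, indent=2))
```

Two collaborators who run the same cells in a different order would then commit identical JSON for those fields, so git has nothing to conflict on there.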

Fix git conflicts for notebooks

Specify the paths for notebook files with conflicts to be fixed. databooks then finds the source notebooks that caused the conflicts and compares them (so no JSON manipulation!).

$ databooks fix [OPTIONS] PATHS...
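As a rough illustration of comparing two versions of a notebook cell by cell, the sketch below uses `difflib.SequenceMatcher` from the standard library. This is an assumption about one possible approach, not a description of the databooks internals; `diff_cells` and the sample cell sources are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def diff_cells(
    first: List[str], last: List[str]
) -> List[Tuple[str, List[str], List[str]]]:
    """Group two lists of cell sources into equal/replace/delete/insert blocks."""
    matcher = SequenceMatcher(a=first, b=last, autojunk=False)
    return [
        (tag, first[i1:i2], last[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


# Cell sources from the two notebook versions involved in a conflict.
cells_main = ["import pandas as pd", "df = pd.read_csv('data.csv')"]
cells_other = ["import pandas as pd", "df = pd.read_csv('data.csv')", "df.describe()"]

blocks = diff_cells(cells_main, cells_other)
for tag, ours, theirs in blocks:
    print(tag, ours, theirs)
```

Working on parsed cells rather than on the raw file means the conflict markers never have to be untangled from the JSON text.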

License

This project is licensed under the terms of the MIT license.

Comments

Tests don't work when run without a git repository

When packaging software for Fedora, we run tests in an unpacked archive without an initialized git repository for the codebase because we are testing the installed version. Would it make sense to make the tests compatible with this way of running them?

Maybe the tests can initialize their own repositories in a tmpdir or something like that.
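A minimal sketch of that suggestion, assuming the `git` executable is available on PATH. The helper names `init_temp_repo` and `demo` are hypothetical and not part of the databooks test suite.

```python
import subprocess
import tempfile
from pathlib import Path


def init_temp_repo(path: Path) -> Path:
    """Initialize an empty git repository at `path` and return it.

    Hypothetical helper; assumes the `git` executable is on PATH.
    """
    subprocess.run(["git", "init", "--quiet", str(path)], check=True)
    return path


def demo() -> bool:
    """Create a throwaway repository and report whether `.git` exists."""
    with tempfile.TemporaryDirectory() as tmp:
        repo_dir = init_temp_repo(Path(tmp))
        return (repo_dir / ".git").is_dir()
```

In a pytest suite the same helper could be called from a fixture with the built-in `tmp_path` directory, so that each test gets its own repository regardless of where the source tree was unpacked.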

The current log I have is:

python -m pytest
===================================== test session starts =====================================
platform linux -- Python 3.10.7, pytest-7.1.3, pluggy-1.0.0
rootdir: /home/lbalhar/Software/databooks
collected 118 items                                                                           

tests/test_affirm.py ......................                                             [ 18%]
tests/test_cli.py ..........FF.FFF.F.F.FF                                               [ 38%]
tests/test_common.py ..                                                                 [ 39%]
tests/test_conflicts.py .........                                                       [ 47%]
tests/test_git_utils.py F.                                                              [ 49%]
tests/test_metadata.py ..........                                                       [ 57%]
tests/test_recipes.py ...............                                                   [ 70%]
tests/test_tui.py ..................                                                    [ 85%]
tests/test_data_models/test_base.py .                                                   [ 86%]
tests/test_data_models/test_notebook.py ................                                [100%]

========================================== FAILURES ===========================================
__________________________________________ test_meta __________________________________________

tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_meta0')

    def test_meta(tmpdir: LocalPath) -> None:
        """Remove notebook metadata."""
        read_path = tmpdir.mkdir("notebooks") / "test_meta_nb.ipynb"  # type: ignore
        TestJupyterNotebook().jupyter_notebook.write(read_path)
    
        nb_read = JupyterNotebook.parse_file(path=read_path)
        result = runner.invoke(app, ["meta", str(read_path), "--yes"])
        nb_write = JupyterNotebook.parse_file(path=read_path)
    
>       assert result.exit_code == 0
E       assert 1 == 0
E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code

tests/test_cli.py:50: AssertionError
______________________________________ test_meta__check _______________________________________

tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_meta__check0')
caplog = <_pytest.logging.LogCaptureFixture object at 0x7fca4a63baf0>

    def test_meta__check(tmpdir: LocalPath, caplog: LogCaptureFixture) -> None:
        """Report on existing notebook metadata (both when it is and isn't present)."""
        caplog.set_level(logging.INFO)
    
        read_path = tmpdir.mkdir("notebooks") / "test_meta_nb.ipynb"  # type: ignore
        TestJupyterNotebook().jupyter_notebook.write(read_path)
    
        nb_read = JupyterNotebook.parse_file(path=read_path)
        result = runner.invoke(app, ["meta", str(read_path), "--check"])
        nb_write = JupyterNotebook.parse_file(path=read_path)
    
        logs = list(caplog.records)
        assert result.exit_code == 1
>       assert len(logs) == 1
E       assert 0 == 1
E        +  where 0 = len([])

tests/test_cli.py:83: AssertionError
______________________________________ test_meta__script ______________________________________

tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_meta__script0')

    def test_meta__script(tmpdir: LocalPath) -> None:
        """Raise `typer.BadParameter` when passing a script instead of a notebook."""
        py_path = tmpdir.mkdir("files") / "a_script.py"  # type: ignore
        py_path.write_text("# some python code", encoding="utf-8")
    
        result = runner.invoke(app, ["meta", str(py_path)])
>       assert result.exit_code == 2
E       assert 1 == 2
E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code

tests/test_cli.py:139: AssertionError
____________________________________ test_meta__no_confirm ____________________________________

tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_meta__no_confirm0')

    def test_meta__no_confirm(tmpdir: LocalPath) -> None:
        """Don't make any changes without confirmation to overwrite files (prompt)."""
        nb_path = tmpdir.mkdir("notebooks") / "test_meta_nb.ipynb"  # type: ignore
        TestJupyterNotebook().jupyter_notebook.write(nb_path)
    
        result = runner.invoke(app, ["meta", str(nb_path)])
    
        assert result.exit_code == 1
        assert JupyterNotebook.parse_file(nb_path) == TestJupyterNotebook().jupyter_notebook
>       assert result.output == (
            "1 files will be overwritten (no prefix nor suffix was passed)."
            " Continue? [y/n]: \nAborted!\n"
        )
E       AssertionError: assert '' == '1 files will... \nAborted!\n'
E         - 1 files will be overwritten (no prefix nor suffix was passed). Continue? [y/n]: 
E         - Aborted!

tests/test_cli.py:155: AssertionError
_____________________________________ test_meta__confirm ______________________________________

tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_meta__confirm0')

    def test_meta__confirm(tmpdir: LocalPath) -> None:
        """Make changes when confirming overwrite via the prompt."""
        nb_path = tmpdir.mkdir("notebooks") / "test_meta_nb.ipynb"  # type: ignore
        TestJupyterNotebook().jupyter_notebook.write(nb_path)
    
        result = runner.invoke(app, ["meta", str(nb_path)], input="y")
    
>       assert result.exit_code == 0
E       assert 1 == 0
E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code

tests/test_cli.py:168: AssertionError
_________________________________________ test_assert _________________________________________

caplog = <_pytest.logging.LogCaptureFixture object at 0x7fca4a66dff0>

    def test_assert(caplog: LogCaptureFixture) -> None:
        """Assert that notebook has sequential and increasing cell execution."""
        caplog.set_level(logging.INFO)
    
        exprs = (
            "[c.execution_count for c in exec_cells] == list(range(1, len(exec_cells) + 1))"
        )
        recipe = "seq-increase"
        with resources.path("tests.files", "demo.ipynb") as nb_path:
            result = runner.invoke(
                app, ["assert", str(nb_path), "--expr", exprs, "--recipe", recipe]
            )
    
        logs = list(caplog.records)
>       assert result.exit_code == 0
E       assert 1 == 0
E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code

tests/test_cli.py:203: AssertionError
__________________________________________ test_fix ___________________________________________

tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_fix0')

    def test_fix(tmpdir: LocalPath) -> None:
        """Fix notebook conflicts."""
        # Setup
        nb_path = Path("test_conflicts_nb.ipynb")
        notebook_1 = TestJupyterNotebook().jupyter_notebook
        notebook_2 = TestJupyterNotebook().jupyter_notebook
    
        notebook_1.metadata = NotebookMetadata(
            kernelspec=dict(
                display_name="different_kernel_display_name", name="kernel_name"
            ),
            field_to_remove=["Field to remove"],
            another_field_to_remove="another field",
        )
    
        extra_cell = BaseCell(
            cell_type="raw",
            metadata=CellMetadata(random_meta=["meta"]),
            source="extra",
        )
        notebook_2.cells = notebook_2.cells + [extra_cell]
        notebook_2.nbformat += 1
        notebook_2.nbformat_minor += 1
    
        git_repo = init_repo_conflicts(
            tmpdir=tmpdir,
            filename=nb_path,
            contents_main=notebook_1.json(),
            contents_other=notebook_2.json(),
            commit_message_main="Notebook from main",
            commit_message_other="Notebook from other",
        )
    
        conflict_files = get_conflict_blobs(repo=git_repo)
        id_main = conflict_files[0].first_log
        id_other = conflict_files[0].last_log
    
        # Run CLI and check conflict resolution
        result = runner.invoke(app, ["fix", str(tmpdir)])
>       fixed_notebook = JupyterNotebook.parse_file(path=tmpdir / nb_path)

tests/test_cli.py:268: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
databooks/data_models/notebook.py:250: in parse_file
    return super(JupyterNotebook, cls).parse_file(
pydantic/main.py:561: in pydantic.main.BaseModel.parse_file
    ???
pydantic/parse.py:64: in pydantic.parse.load_file
    ???
pydantic/parse.py:37: in pydantic.parse.load_str_bytes
    ???
/usr/lib64/python3.10/json/__init__.py:346: in loads
    return _default_decoder.decode(s)
/usr/lib64/python3.10/json/decoder.py:337: in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <json.decoder.JSONDecoder object at 0x7fca4be6b1f0>
s = '<<<<<<< HEAD\n{"nbformat": 4, "nbformat_minor": 4, "metadata": {"field_to_remove": ["Field to remove"], "another_fiel...xecution_count": 1}, {"metadata": {"random_meta": ["meta"]}, "source": "extra", "cell_type": "raw"}]}\n>>>>>>> other\n'
idx = 0

    def raw_decode(self, s, idx=0):
        """Decode a JSON document from ``s`` (a ``str`` beginning with
        a JSON document) and return a 2-tuple of the Python
        representation and the index in ``s`` where the document ended.
    
        This can be used to decode a JSON document from a string that may
        have extraneous data at the end.
    
        """
        try:
            obj, end = self.scan_once(s, idx)
        except StopIteration as err:
>           raise JSONDecodeError("Expecting value", s, err.value) from None
E           json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

/usr/lib64/python3.10/json/decoder.py:355: JSONDecodeError
__________________________________________ test_show __________________________________________

    def test_show() -> None:
        """Show notebook in terminal."""
        with resources.path("tests.files", "tui-demo.ipynb") as nb_path:
            result = runner.invoke(app, ["show", str(nb_path)])
>       assert result.exit_code == 0
E       assert 1 == 0
E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code

tests/test_cli.py:382: AssertionError
____________________________________ test_show_no_multiple ____________________________________

    def test_show_no_multiple() -> None:
        """Don't show multiple notebooks if not confirmed in prompt."""
        with resources.path("tests.files", "tui-demo.ipynb") as nb:
            dirpath = str(nb.parent)
    
        # Exit code is 0 if user responds to prompt with `n`
        result = runner.invoke(app, ["show", dirpath], input="n")
>       assert result.exit_code == 0
E       assert 1 == 0
E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code

tests/test_cli.py:457: AssertionError
________________________________________ test_get_repo ________________________________________

    def test_get_repo() -> None:
        """Find git repository."""
        curr_dir = Path(__file__).parent
>       repo = get_repo(curr_dir)

tests/test_git_utils.py:54: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
databooks/git_utils.py:38: in get_repo
    repo = Repo(path=repo_dir)
../../.virtualenvs/databooks/lib/python3.10/site-packages/git/repo/base.py:266: in __init__
    self.working_dir: Optional[PathLike] = self._working_tree_dir or self.common_dir
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <git.repo.base.Repo ''>

    @property
    def common_dir(self) -> PathLike:
        """
        :return: The git dir that holds everything except possibly HEAD,
            FETCH_HEAD, ORIG_HEAD, COMMIT_EDITMSG, index, and logs/."""
        if self._common_dir:
            return self._common_dir
        elif self.git_dir:
            return self.git_dir
        else:
            # or could return ""
>           raise InvalidGitRepositoryError()
E           git.exc.InvalidGitRepositoryError

../../.virtualenvs/databooks/lib/python3.10/site-packages/git/repo/base.py:347: InvalidGitRepositoryError
=================================== short test summary info ===================================
FAILED tests/test_cli.py::test_meta - assert 1 == 0
FAILED tests/test_cli.py::test_meta__check - assert 0 == 1
FAILED tests/test_cli.py::test_meta__script - assert 1 == 2
FAILED tests/test_cli.py::test_meta__no_confirm - AssertionError: assert '' == '1 files will...
FAILED tests/test_cli.py::test_meta__confirm - assert 1 == 0
FAILED tests/test_cli.py::test_assert - assert 1 == 0
FAILED tests/test_cli.py::test_fix - json.decoder.JSONDecodeError: Expecting value: line 1 c...
FAILED tests/test_cli.py::test_show - assert 1 == 0
FAILED tests/test_cli.py::test_show_no_multiple - assert 1 == 0
FAILED tests/test_git_utils.py::test_get_repo - git.exc.InvalidGitRepositoryError
=============================== 10 failed, 108 passed in 0.60s ================================

opened by frenzymadness 5

Drop dependency on `py` and simplify tests

The first commit drops the dependency on the py module and switches from LocalPath to pathlib.Path, which has a different API, so it required some more changes.

The second commit actually removes the unnecessary subdirs. If you want to keep them, I can drop the second commit.
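One API difference of the kind mentioned above, sketched with the standard library only: `LocalPath.mkdir("notebooks")` returns the new directory, whereas `pathlib.Path.mkdir()` returns `None`, so the one-line pattern seen in the test log has to be split. The helper name `make_notebook_path` is hypothetical.

```python
import tempfile
from pathlib import Path


def make_notebook_path(tmp_path: Path) -> Path:
    """Build a notebook path inside a fresh "notebooks" subdirectory.

    With LocalPath, `tmpdir.mkdir("notebooks") / "nb.ipynb"` works in one
    expression. `Path.mkdir()` returns None, so the steps are separate here.
    """
    nb_dir = tmp_path / "notebooks"
    nb_dir.mkdir()
    return nb_dir / "test_meta_nb.ipynb"


with tempfile.TemporaryDirectory() as tmp:
    nb_path = make_notebook_path(Path(tmp))
    nb_path.write_text("{}", encoding="utf-8")
    exists = nb_path.exists()
    parent_name = nb_path.parent.name
```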

opened by frenzymadness 4
RPM package in Fedora - problem with Typer version

Hi!

I really enjoyed your talk at PyCon PL today and decided that it'd be great to have databooks packaged in Fedora Linux. I've tried to create a package but found a problem: the Typer version in Fedora (0.6.1) is too new, while databooks requires version < 0.5.

https://github.com/datarootsio/databooks/blob/0b1d3e67c5fd00a145179545a456786bab5b7e77/pyproject.toml#L20

Would it make sense to make the requirement less strict? If they are following semantic versioning, something like <1.0.0 should guarantee enough stability.

opened by frenzymadness 4
Fix the location of LICENSE file

LICENSE is included automatically by poetry and installed into the .dist-info directory. The manual include caused the file to also be installed into lib/python3.11/site-packages, next to the main databooks package.

opened by frenzymadness 3
pytest 7.2.0 no longer depends on py

This will no longer work with pytest 7.2.0 and higher. The py library is deprecated, but you can fix the problem by requiring it explicitly if there is no other way for now.

https://github.com/datarootsio/databooks/blob/6febdbadde115cdb6fb89eec737d00cfff1d3223/tests/test_common.py#L3

opened by frenzymadness 2

"Dynamic" recipes

Make recipes take variables so that they could be reused more easily.

Example:

The recipe:

max_cells = RecipeInfo(
    src="len(nb.cells) < amount_max",
    description="Assert that there are less than 'amount_max' cells in the notebook.",
)

The command:

databooks assert path/to/nbs --recipe max_cells --amount_max 10

databooks assert path/to/nbs --recipe max_cells --help

Would return the arguments of that recipe
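A minimal sketch of how such variables could be injected into the scope of a recipe expression. The `RecipeInfo` dataclass below is a stand-in with only the two fields from the example, and `run_recipe` is a hypothetical helper. It uses plain `eval` with empty builtins for brevity, which is not a safe way to evaluate untrusted input.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any


@dataclass
class RecipeInfo:
    """Stand-in for the recipe container (fields taken from the example)."""

    src: str
    description: str


max_cells = RecipeInfo(
    src="len(nb.cells) < amount_max",
    description="Assert that there are less than 'amount_max' cells in the notebook.",
)


def run_recipe(recipe: RecipeInfo, nb: Any, **variables: Any) -> bool:
    """Evaluate a recipe with extra variables in scope (hypothetical helper)."""
    scope = {"nb": nb, "len": len, **variables}
    # Empty builtins so only the names in `scope` are reachable by name.
    return bool(eval(recipe.src, {"__builtins__": {}}, scope))


nb = SimpleNamespace(cells=["a", "b", "c"])  # fake notebook with three cells
under_ten = run_recipe(max_cells, nb, amount_max=10)
under_three = run_recipe(max_cells, nb, amount_max=3)
print(under_ten, under_three)
```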

opened by nicogelders 2

Feature:assert command
New CLI command! 🎉

A databooks assert ... command that will take an expression and evaluate against all notebooks in path (internally databooks.affirm.affirm to hopefully remind of python's assert without creating any ambiguity). The command also comes with recipes that are short-hand for some useful expressions. These can also be used as inspiration for new expressions.

Variables in scope for expression:

nb: Jupyter notebook found in path

raw_cells: notebook cells of raw type

md_cells: notebook cells of markdown type

code_cells: notebook cells of code type

exec_cells: notebook cells of executed code type

It uses Python's eval. But since eval is dangerous, we first parse the string and allow only whitelisted nodes/attributes.

Allowed nodes are in databooks.affirm._ALLOWED_NODES

Allowed functions are in databooks.affirm._ALLOWED_BUILTINS

Only class attributes are allowed (found Pydantic model attributes using dict(model).keys())

Check tests/test_affirm.py for examples of valid/invalid expressions.
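The whitelist idea can be sketched with the standard `ast` module. The contents of `ALLOWED_NODES` and `ALLOWED_BUILTINS` below are assumptions chosen for this example and do not reproduce the sets defined in databooks.affirm; `safe_eval` is a hypothetical name.

```python
import ast
from typing import Any, Dict

# Assumed whitelists for illustration only.
ALLOWED_NODES = (
    ast.Expression, ast.Compare, ast.Call, ast.Name, ast.Load,
    ast.Constant, ast.Attribute, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
    ast.Eq, ast.NotEq,
)
ALLOWED_BUILTINS = {"len": len, "all": all, "any": any}


def safe_eval(expr: str, scope: Dict[str, Any]) -> Any:
    """Evaluate `expr` only if every AST node and name is whitelisted."""
    tree = ast.parse(expr, mode="eval")
    names = {**ALLOWED_BUILTINS, **scope}
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"Node not allowed: {type(node).__name__}")
        if isinstance(node, ast.Name) and node.id not in names:
            raise ValueError(f"Name not allowed: {node.id}")
        if isinstance(node, ast.Attribute) and node.attr.startswith("_"):
            raise ValueError(f"Attribute not allowed: {node.attr}")
    return eval(compile(tree, "<expr>", "eval"), {"__builtins__": {}}, names)


def is_rejected(expr: str, scope: Dict[str, Any]) -> bool:
    """Return True when `safe_eval` refuses the expression."""
    try:
        safe_eval(expr, scope)
    except ValueError:
        return True
    return False


ok = safe_eval("len(code_cells) < 3", {"code_cells": ["a", "b"]})
blocked_import = is_rejected("__import__('os')", {})
blocked_comprehension = is_rejected("[c for c in code_cells]", {"code_cells": []})
```

Checking the tree before calling `eval` means a rejected expression is never executed, not even partially.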
opened by murilo-cunha 2
Log-verbosity

The package uses logging functionality but introduces a "verbose" concept where log level INFO suddenly becomes more verbose. This is a small PR that switches the log level to DEBUG when the --verbose flag is passed by the user.
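The behaviour described can be sketched with the standard `logging` module. The function name `configure_logging` and the logger names are hypothetical, and the actual change in the PR may be structured differently.

```python
import logging


def configure_logging(verbose: bool = False, name: str = "demo") -> logging.Logger:
    """Return a logger at DEBUG level when verbose, INFO otherwise."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)
    return logger


quiet = configure_logging(verbose=False, name="demo.quiet")
chatty = configure_logging(verbose=True, name="demo.verbose")

quiet.debug("hidden: below the INFO threshold")
chatty.debug("emitted once a handler is attached")
```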

opened by Bart6114 2
add exception for html tables and try-except when parsing outputs (fa…
Changes:

Add custom exception when parsing HTML to table fails

Add try-except for that exception when parsing outputs - when there are issues, fallback to text implementation
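The fallback pattern in these two changes might look like the sketch below, written with the standard `html.parser` module. The names `HtmlTableError`, `parse_table` and `render_output` are hypothetical and the parsing is deliberately simplistic; it is not the code from the PR.

```python
from html.parser import HTMLParser
from typing import Dict, List


class HtmlTableError(Exception):
    """Raised when HTML cannot be parsed into a table (hypothetical name)."""


class _TableParser(HTMLParser):
    """Collect the text of <td>/<th> cells, row by row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


def parse_table(html: str) -> List[List[str]]:
    """Return table rows, or raise HtmlTableError if none are found."""
    parser = _TableParser()
    parser.feed(html)
    rows = [row for row in parser.rows if row]
    if not rows:
        raise HtmlTableError("no table rows found")
    return rows


def render_output(data: Dict[str, str]) -> str:
    """Prefer the HTML table; fall back to text/plain when parsing fails."""
    try:
        rows = parse_table(data.get("text/html", ""))
    except HtmlTableError:
        return data.get("text/plain", "")
    return "\n".join(" | ".join(row) for row in rows)
```

A dedicated exception type keeps the fallback narrow: only table-parsing failures trigger the text rendering, while unrelated errors still surface.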
opened by murilo-cunha 1
Add rich tables
Add rich table (HTML) rendering to notebook.

Other HTML elements are not displayed.

Fall back to displaying text if that's the case

Refactor logic to simplify this

Refactor logic to gracefully show unsupported mime types (anything that is not one of image/png, text/html, text/plain)
opened by murilo-cunha 1
Repo found from paths
We should not need the git repo for meta and assert commands

Config files/repos should always be located from the filepaths, not current working directory

Only exception is when we run databooks diff with no paths - in that case we use the current dir to find the repo

This is not the case for databooks fix, which requires paths
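Locating the repository from a file path instead of the current working directory can be sketched as a walk up the parent directories. `find_repo_root` is a hypothetical helper that only looks for a `.git` entry; databooks itself relies on GitPython for this.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_repo_root(path: Path) -> Optional[Path]:
    """Walk up from `path` until a directory containing `.git` is found."""
    start = path if path.is_dir() else path.parent
    for candidate in (start, *start.parents):
        if (candidate / ".git").exists():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / ".git").mkdir()  # fake repository marker, enough for the sketch
    nb_path = root / "notebooks" / "demo.ipynb"
    nb_path.parent.mkdir()
    nb_path.write_text("{}", encoding="utf-8")
    found = find_repo_root(nb_path)
    is_root = found == root
```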

Closes https://github.com/datarootsio/databooks/issues/48
opened by murilo-cunha 1
Enrich diffs
Enhancements:

If diff is only on code/output, then only show the split for that section

Show when only cell/notebook metadata changes - maybe make cell border yellow, etc...

For cell source diffs, show lines/character changes by changing the background color (red/green)

Add rich rendering for other outputs

HTML (other than tables)

PNG

SVG

...
opened by murilo-cunha 0
[RFE] databooks run

I like the way databooks is able to show the content of the notebook. It might make sense to implement a run command which can run all cells in a notebook and show the progress in the CLI interface cell by cell. We don't need to implement the execution because it can be done by nbconvert or nbclient packages.

What do you think?

opened by frenzymadness 3
Can I have the pre-commit hook recurse subdirectories?

I'd like to use databooks in the pre-commit Git hook, but my notebooks are in a sub-folder called "notebooks". Is there a way to specify that in the pre-commit?

opened by tonyreina 1

Releases(1.3.7)

1.3.7(Nov 27, 2022)
What's Changed

add exception for html tables and try-except when parsing outputs (fa… by @murilo-cunha in https://github.com/datarootsio/databooks/pull/62

Full Changelog: https://github.com/datarootsio/databooks/compare/1.3.6...1.3.7
1.3.6(Nov 11, 2022)
What's Changed

Add export option by @murilo-cunha in https://github.com/datarootsio/databooks/pull/60

Full Changelog: https://github.com/datarootsio/databooks/compare/1.3.5...1.3.6
1.3.5(Nov 11, 2022)
What's Changed

Add rich tables by @murilo-cunha in https://github.com/datarootsio/databooks/pull/59

Full Changelog: https://github.com/datarootsio/databooks/compare/1.3.4...1.3.5
1.3.4(Nov 9, 2022)
What's Changed

Add Python 3.11 to CI by @frenzymadness in https://github.com/datarootsio/databooks/pull/58

Full Changelog: https://github.com/datarootsio/databooks/compare/1.3.3...1.3.4
1.3.3(Nov 8, 2022)
What's Changed

Fix the location of LICENSE file by @frenzymadness in https://github.com/datarootsio/databooks/pull/57

Full Changelog: https://github.com/datarootsio/databooks/compare/1.3.2...1.3.3
1.3.2(Nov 8, 2022)
What's Changed

Fix test_get_repo by @frenzymadness in https://github.com/datarootsio/databooks/pull/56

Full Changelog: https://github.com/datarootsio/databooks/compare/1.3.1...1.3.2
1.3.1(Nov 8, 2022)
What's Changed

Repo found from paths by @murilo-cunha in https://github.com/datarootsio/databooks/pull/54

Full Changelog: https://github.com/datarootsio/databooks/compare/1.3.0...1.3.1
1.3.0(Nov 7, 2022)
What's Changed

Add diff command by @murilo-cunha in https://github.com/datarootsio/databooks/pull/45

Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.9...1.3.0
1.2.9(Nov 5, 2022)
What's Changed

Relax typer version by @murilo-cunha in https://github.com/datarootsio/databooks/pull/53

Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.8...1.2.9
1.2.8(Nov 5, 2022)
What's Changed

Drop dependency on py and simplify tests by @frenzymadness in https://github.com/datarootsio/databooks/pull/51

New Contributors

@frenzymadness made their first contribution in https://github.com/datarootsio/databooks/pull/51

Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.7...1.2.8
1.2.7(Nov 4, 2022)
What's Changed

include pull_request event to trigger cicd by @murilo-cunha in https://github.com/datarootsio/databooks/pull/52

Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.6...1.2.7
1.2.6(Nov 3, 2022)
What's Changed

Bugfix: config not found by @murilo-cunha in https://github.com/datarootsio/databooks/pull/44

Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.5...1.2.6
1.2.5(Oct 26, 2022)
What's Changed

Update docs by @murilo-cunha in https://github.com/datarootsio/databooks/pull/41

Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.4...1.2.5
1.2.4(Oct 23, 2022)
What's Changed

Update recipe links in documentation and CLI help by @boaarmpit in https://github.com/datarootsio/databooks/pull/42

New Contributors

@boaarmpit made their first contribution in https://github.com/datarootsio/databooks/pull/42

Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.3...1.2.4
1.2.3(Oct 20, 2022)
What's Changed

Add pager option and display kernel name when showing notebooks by @murilo-cunha in https://github.com/datarootsio/databooks/pull/40

Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.2...1.2.3
1.2.2(Oct 19, 2022)
What's Changed

Add rich diff representation of notebook by @murilo-cunha in https://github.com/datarootsio/databooks/pull/39

Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.1...1.2.2
1.2.1(Oct 18, 2022)
What's Changed

Use specific cell types (code, raw, or markdown) by @murilo-cunha in https://github.com/datarootsio/databooks/pull/38

Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.0...1.2.1
1.2.0(Oct 14, 2022)
What's Changed

feature: add show command by @murilo-cunha in https://github.com/datarootsio/databooks/pull/37

Full Changelog: https://github.com/datarootsio/databooks/compare/1.1.1...1.2.0
1.1.1(Oct 10, 2022)
What's Changed

Fix recipes for clean notebooks by @murilo-cunha in https://github.com/datarootsio/databooks/pull/36

Full Changelog: https://github.com/datarootsio/databooks/compare/1.1.0...1.1.1
1.1.0(Oct 6, 2022)
What's Changed

Add confirm prompt by @murilo-cunha in https://github.com/datarootsio/databooks/pull/35

Full Changelog: https://github.com/datarootsio/databooks/compare/1.0.6...1.1.0
1.0.6(Oct 5, 2022)
What's Changed

Fix recipe tests by @murilo-cunha in https://github.com/datarootsio/databooks/pull/34

Full Changelog: https://github.com/datarootsio/databooks/compare/1.0.5...1.0.6
1.0.5(Apr 28, 2022)
What's Changed

Add indentation on output notebook JSON by @murilo-cunha in https://github.com/datarootsio/databooks/pull/33

Full Changelog: https://github.com/datarootsio/databooks/compare/1.0.4...1.0.5
1.0.4(Apr 14, 2022)
What's Changed

Allow type annotations on imports (PEP-561) by @murilo-cunha in https://github.com/datarootsio/databooks/pull/32

Full Changelog: https://github.com/datarootsio/databooks/compare/1.0.3...1.0.4
1.0.3(Apr 11, 2022)
What's Changed

Bugfix/fix cog docs gen by @murilo-cunha in https://github.com/datarootsio/databooks/pull/31

Full Changelog: https://github.com/datarootsio/databooks/compare/1.0.2...1.0.3
1.0.2(Mar 15, 2022)
What's Changed

Add write method by @murilo-cunha in https://github.com/datarootsio/databooks/pull/30

Full Changelog: https://github.com/datarootsio/databooks/compare/1.0.1...1.0.2
1.0.1(Mar 3, 2022)
What's Changed

Assert docs and change deps by @murilo-cunha in https://github.com/datarootsio/databooks/pull/29

Full Changelog: https://github.com/datarootsio/databooks/compare/1.0.0...1.0.1
1.0.0(Feb 26, 2022)
What's Changed

Feature:assert command by @murilo-cunha in https://github.com/datarootsio/databooks/pull/26

Full Changelog: https://github.com/datarootsio/databooks/compare/0.1.15...1.0.0
0.1.15(Feb 16, 2022)
What's Changed

Add config tests by @murilo-cunha in https://github.com/datarootsio/databooks/pull/23

Full Changelog: https://github.com/datarootsio/databooks/compare/0.1.14...0.1.15
0.1.14(Feb 3, 2022)
What's Changed

Feature/add py37 support by @murilo-cunha in https://github.com/datarootsio/databooks/pull/22

Full Changelog: https://github.com/datarootsio/databooks/compare/0.1.13...0.1.14
0.1.13(Jan 31, 2022)
What's Changed

add API docs for missing files - config and logging by @murilo-cunha in https://github.com/datarootsio/databooks/pull/25

Full Changelog: https://github.com/datarootsio/databooks/compare/0.1.12...0.1.13

Owner

dataroots

Supporting your data driven strategy.

GitHub Repository https://datarootsio.github.io/databooks/
