A CLI tool that reduces friction between data scientists by removing notebook metadata (which reduces git conflicts) and gracefully resolving git conflicts.

Overview

databooks

databooks is a package for reducing the friction data scientists face while using Jupyter notebooks, by reducing the number of git conflicts between different notebooks and assisting in the resolution of those conflicts.

The key features include:

  • CLI tool
    • Clear notebook metadata
    • Resolve git conflicts
  • Simple to use
  • Simple API for modelling and comparing notebooks using Pydantic

Requirements

databooks is built on top of:

Installation

pip install -i https://test.pypi.org/simple/ databooks

Usage

Clear metadata

Simply specify the paths for notebook files to remove metadata. By doing so, we can already avoid many of the conflicts.

$ databooks meta [OPTIONS] PATHS...
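
To illustrate why clearing metadata avoids conflicts, here is a minimal sketch of the idea using only the standard library. It is not databooks' implementation: the function name clear_metadata is hypothetical, and the choice of fields to clear (notebook metadata, cell metadata, execution counts) is an assumption made for the example.

```python
import copy
from typing import Any, Dict


def clear_metadata(nb: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a notebook dict with volatile fields cleared.

    Assumption: the fields below are the ones that change between runs
    and therefore tend to cause git conflicts.
    """
    cleaned = copy.deepcopy(nb)  # leave the caller's notebook untouched
    cleaned["metadata"] = {}
    for cell in cleaned.get("cells", []):
        cell["metadata"] = {}
        if cell.get("cell_type") == "code":
            cell["execution_count"] = None
    return cleaned


notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {"kernelspec": {"name": "python3"}},
    "cells": [
        {
            "cell_type": "code",
            "metadata": {"collapsed": True},
            "execution_count": 7,
            "source": "1 + 1",
            "outputs": [],
        }
    ],
}
cleaned_notebook = clear_metadata(notebook)
```

Two people who ran the same cells in a different order would produce identical output from this function, so git sees no difference in those fields.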

Fix git conflicts for notebooks

Specify the paths for notebook files with conflicts to be fixed. Then, databooks finds the source notebooks that caused the conflicts and compares them (so no JSON manipulation!).

$ databooks fix [OPTIONS] PATHS...
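
Comparing notebooks at the cell level, rather than editing conflicted JSON by hand, can be sketched with the standard library's difflib. This is only an illustration of the general idea and not the algorithm databooks uses; diff_cells and the sample cell sources are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

CellDiff = Tuple[str, List[str], List[str]]


def diff_cells(main: List[str], other: List[str]) -> List[CellDiff]:
    """Group the cell sources of two notebook versions into diff blocks.

    Each block is (tag, cells_from_main, cells_from_other), where tag is
    one of "equal", "insert", "delete" or "replace".
    """
    matcher = SequenceMatcher(a=main, b=other, autojunk=False)
    return [
        (tag, main[i1:i2], other[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


cells_main = ["import pandas as pd", "df.head()"]
cells_other = ["import pandas as pd", "df.head()", "df.describe()"]
blocks = diff_cells(cells_main, cells_other)
```

Only the blocks whose tag is not "equal" need a decision from the user, which is far less work than untangling conflict markers inside raw JSON.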

License

This project is licensed under the terms of the MIT license.

Comments
  • tests don't work when run without git repository

    When packaging software for Fedora, we run tests in an unpacked archive without an initialized git repository for the codebase because we are testing the installed version. Would it make sense to make the tests compatible with this way of running them?

    Maybe the tests can initialize their own repositories in a tmpdir or something like that.
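
One way the suggestion above could look, as a minimal sketch: a helper that creates a throwaway repository in a temporary directory by calling the git executable. The helper names git_available and init_test_repo are hypothetical and not part of the databooks test suite; in pytest the same helper could be called from a fixture with the built-in tmp_path.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git_available() -> bool:
    """Return True if a git executable can be found on PATH."""
    return shutil.which("git") is not None


def init_test_repo(path: Path) -> Path:
    """Initialize an empty git repository at `path` and return the path."""
    subprocess.run(
        ["git", "init", "--quiet", str(path)],
        check=True,
        capture_output=True,
    )
    return path


if git_available():
    with tempfile.TemporaryDirectory() as tmp:
        repo_dir = init_test_repo(Path(tmp))
        # Tests would write notebooks below repo_dir and run the CLI there.
        repo_was_created = (repo_dir / ".git").is_dir()
```

With a repository of their own, the tests no longer depend on whether the source tree they run from is a git checkout.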

    The current log I have is:

    python -m pytest
    ===================================== test session starts =====================================
    platform linux -- Python 3.10.7, pytest-7.1.3, pluggy-1.0.0
    rootdir: /home/lbalhar/Software/databooks
    collected 118 items                                                                           
    
    tests/test_affirm.py ......................                                             [ 18%]
    tests/test_cli.py ..........FF.FFF.F.F.FF                                               [ 38%]
    tests/test_common.py ..                                                                 [ 39%]
    tests/test_conflicts.py .........                                                       [ 47%]
    tests/test_git_utils.py F.                                                              [ 49%]
    tests/test_metadata.py ..........                                                       [ 57%]
    tests/test_recipes.py ...............                                                   [ 70%]
    tests/test_tui.py ..................                                                    [ 85%]
    tests/test_data_models/test_base.py .                                                   [ 86%]
    tests/test_data_models/test_notebook.py ................                                [100%]
    
    ========================================== FAILURES ===========================================
    __________________________________________ test_meta __________________________________________
    
    tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_meta0')
    
        def test_meta(tmpdir: LocalPath) -> None:
            """Remove notebook metadata."""
            read_path = tmpdir.mkdir("notebooks") / "test_meta_nb.ipynb"  # type: ignore
            TestJupyterNotebook().jupyter_notebook.write(read_path)
        
            nb_read = JupyterNotebook.parse_file(path=read_path)
            result = runner.invoke(app, ["meta", str(read_path), "--yes"])
            nb_write = JupyterNotebook.parse_file(path=read_path)
        
    >       assert result.exit_code == 0
    E       assert 1 == 0
    E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code
    
    tests/test_cli.py:50: AssertionError
    ______________________________________ test_meta__check _______________________________________
    
    tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_meta__check0')
    caplog = <_pytest.logging.LogCaptureFixture object at 0x7fca4a63baf0>
    
        def test_meta__check(tmpdir: LocalPath, caplog: LogCaptureFixture) -> None:
            """Report on existing notebook metadata (both when it is and isn't present)."""
            caplog.set_level(logging.INFO)
        
            read_path = tmpdir.mkdir("notebooks") / "test_meta_nb.ipynb"  # type: ignore
            TestJupyterNotebook().jupyter_notebook.write(read_path)
        
            nb_read = JupyterNotebook.parse_file(path=read_path)
            result = runner.invoke(app, ["meta", str(read_path), "--check"])
            nb_write = JupyterNotebook.parse_file(path=read_path)
        
            logs = list(caplog.records)
            assert result.exit_code == 1
    >       assert len(logs) == 1
    E       assert 0 == 1
    E        +  where 0 = len([])
    
    tests/test_cli.py:83: AssertionError
    ______________________________________ test_meta__script ______________________________________
    
    tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_meta__script0')
    
        def test_meta__script(tmpdir: LocalPath) -> None:
            """Raise `typer.BadParameter` when passing a script instead of a notebook."""
            py_path = tmpdir.mkdir("files") / "a_script.py"  # type: ignore
            py_path.write_text("# some python code", encoding="utf-8")
        
            result = runner.invoke(app, ["meta", str(py_path)])
    >       assert result.exit_code == 2
    E       assert 1 == 2
    E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code
    
    tests/test_cli.py:139: AssertionError
    ____________________________________ test_meta__no_confirm ____________________________________
    
    tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_meta__no_confirm0')
    
        def test_meta__no_confirm(tmpdir: LocalPath) -> None:
            """Don't make any changes without confirmation to overwrite files (prompt)."""
            nb_path = tmpdir.mkdir("notebooks") / "test_meta_nb.ipynb"  # type: ignore
            TestJupyterNotebook().jupyter_notebook.write(nb_path)
        
            result = runner.invoke(app, ["meta", str(nb_path)])
        
            assert result.exit_code == 1
            assert JupyterNotebook.parse_file(nb_path) == TestJupyterNotebook().jupyter_notebook
    >       assert result.output == (
                "1 files will be overwritten (no prefix nor suffix was passed)."
                " Continue? [y/n]: \nAborted!\n"
            )
    E       AssertionError: assert '' == '1 files will... \nAborted!\n'
    E         - 1 files will be overwritten (no prefix nor suffix was passed). Continue? [y/n]: 
    E         - Aborted!
    
    tests/test_cli.py:155: AssertionError
    _____________________________________ test_meta__confirm ______________________________________
    
    tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_meta__confirm0')
    
        def test_meta__confirm(tmpdir: LocalPath) -> None:
            """Make changes when confirming overwrite via the prompt."""
            nb_path = tmpdir.mkdir("notebooks") / "test_meta_nb.ipynb"  # type: ignore
            TestJupyterNotebook().jupyter_notebook.write(nb_path)
        
            result = runner.invoke(app, ["meta", str(nb_path)], input="y")
        
    >       assert result.exit_code == 0
    E       assert 1 == 0
    E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code
    
    tests/test_cli.py:168: AssertionError
    _________________________________________ test_assert _________________________________________
    
    caplog = <_pytest.logging.LogCaptureFixture object at 0x7fca4a66dff0>
    
        def test_assert(caplog: LogCaptureFixture) -> None:
            """Assert that notebook has sequential and increasing cell execution."""
            caplog.set_level(logging.INFO)
        
            exprs = (
                "[c.execution_count for c in exec_cells] == list(range(1, len(exec_cells) + 1))"
            )
            recipe = "seq-increase"
            with resources.path("tests.files", "demo.ipynb") as nb_path:
                result = runner.invoke(
                    app, ["assert", str(nb_path), "--expr", exprs, "--recipe", recipe]
                )
        
            logs = list(caplog.records)
    >       assert result.exit_code == 0
    E       assert 1 == 0
    E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code
    
    tests/test_cli.py:203: AssertionError
    __________________________________________ test_fix ___________________________________________
    
    tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_fix0')
    
        def test_fix(tmpdir: LocalPath) -> None:
            """Fix notebook conflicts."""
            # Setup
            nb_path = Path("test_conflicts_nb.ipynb")
            notebook_1 = TestJupyterNotebook().jupyter_notebook
            notebook_2 = TestJupyterNotebook().jupyter_notebook
        
            notebook_1.metadata = NotebookMetadata(
                kernelspec=dict(
                    display_name="different_kernel_display_name", name="kernel_name"
                ),
                field_to_remove=["Field to remove"],
                another_field_to_remove="another field",
            )
        
            extra_cell = BaseCell(
                cell_type="raw",
                metadata=CellMetadata(random_meta=["meta"]),
                source="extra",
            )
            notebook_2.cells = notebook_2.cells + [extra_cell]
            notebook_2.nbformat += 1
            notebook_2.nbformat_minor += 1
        
            git_repo = init_repo_conflicts(
                tmpdir=tmpdir,
                filename=nb_path,
                contents_main=notebook_1.json(),
                contents_other=notebook_2.json(),
                commit_message_main="Notebook from main",
                commit_message_other="Notebook from other",
            )
        
            conflict_files = get_conflict_blobs(repo=git_repo)
            id_main = conflict_files[0].first_log
            id_other = conflict_files[0].last_log
        
            # Run CLI and check conflict resolution
            result = runner.invoke(app, ["fix", str(tmpdir)])
    >       fixed_notebook = JupyterNotebook.parse_file(path=tmpdir / nb_path)
    
    tests/test_cli.py:268: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    databooks/data_models/notebook.py:250: in parse_file
        return super(JupyterNotebook, cls).parse_file(
    pydantic/main.py:561: in pydantic.main.BaseModel.parse_file
        ???
    pydantic/parse.py:64: in pydantic.parse.load_file
        ???
    pydantic/parse.py:37: in pydantic.parse.load_str_bytes
        ???
    /usr/lib64/python3.10/json/__init__.py:346: in loads
        return _default_decoder.decode(s)
    /usr/lib64/python3.10/json/decoder.py:337: in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    
    self = <json.decoder.JSONDecoder object at 0x7fca4be6b1f0>
    s = '<<<<<<< HEAD\n{"nbformat": 4, "nbformat_minor": 4, "metadata": {"field_to_remove": ["Field to remove"], "another_fiel...xecution_count": 1}, {"metadata": {"random_meta": ["meta"]}, "source": "extra", "cell_type": "raw"}]}\n>>>>>>> other\n'
    idx = 0
    
        def raw_decode(self, s, idx=0):
            """Decode a JSON document from ``s`` (a ``str`` beginning with
            a JSON document) and return a 2-tuple of the Python
            representation and the index in ``s`` where the document ended.
        
            This can be used to decode a JSON document from a string that may
            have extraneous data at the end.
        
            """
            try:
                obj, end = self.scan_once(s, idx)
            except StopIteration as err:
    >           raise JSONDecodeError("Expecting value", s, err.value) from None
    E           json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    
    /usr/lib64/python3.10/json/decoder.py:355: JSONDecodeError
    __________________________________________ test_show __________________________________________
    
        def test_show() -> None:
            """Show notebook in terminal."""
            with resources.path("tests.files", "tui-demo.ipynb") as nb_path:
                result = runner.invoke(app, ["show", str(nb_path)])
    >       assert result.exit_code == 0
    E       assert 1 == 0
    E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code
    
    tests/test_cli.py:382: AssertionError
    ____________________________________ test_show_no_multiple ____________________________________
    
        def test_show_no_multiple() -> None:
            """Don't show multiple notebooks if not confirmed in prompt."""
            with resources.path("tests.files", "tui-demo.ipynb") as nb:
                dirpath = str(nb.parent)
        
            # Exit code is 0 if user responds to prompt with `n`
            result = runner.invoke(app, ["show", dirpath], input="n")
    >       assert result.exit_code == 0
    E       assert 1 == 0
    E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code
    
    tests/test_cli.py:457: AssertionError
    ________________________________________ test_get_repo ________________________________________
    
        def test_get_repo() -> None:
            """Find git repository."""
            curr_dir = Path(__file__).parent
    >       repo = get_repo(curr_dir)
    
    tests/test_git_utils.py:54: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    databooks/git_utils.py:38: in get_repo
        repo = Repo(path=repo_dir)
    ../../.virtualenvs/databooks/lib/python3.10/site-packages/git/repo/base.py:266: in __init__
        self.working_dir: Optional[PathLike] = self._working_tree_dir or self.common_dir
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    
    self = <git.repo.base.Repo ''>
    
        @property
        def common_dir(self) -> PathLike:
            """
            :return: The git dir that holds everything except possibly HEAD,
                FETCH_HEAD, ORIG_HEAD, COMMIT_EDITMSG, index, and logs/."""
            if self._common_dir:
                return self._common_dir
            elif self.git_dir:
                return self.git_dir
            else:
                # or could return ""
    >           raise InvalidGitRepositoryError()
    E           git.exc.InvalidGitRepositoryError
    
    ../../.virtualenvs/databooks/lib/python3.10/site-packages/git/repo/base.py:347: InvalidGitRepositoryError
    =================================== short test summary info ===================================
    FAILED tests/test_cli.py::test_meta - assert 1 == 0
    FAILED tests/test_cli.py::test_meta__check - assert 0 == 1
    FAILED tests/test_cli.py::test_meta__script - assert 1 == 2
    FAILED tests/test_cli.py::test_meta__no_confirm - AssertionError: assert '' == '1 files will...
    FAILED tests/test_cli.py::test_meta__confirm - assert 1 == 0
    FAILED tests/test_cli.py::test_assert - assert 1 == 0
    FAILED tests/test_cli.py::test_fix - json.decoder.JSONDecodeError: Expecting value: line 1 c...
    FAILED tests/test_cli.py::test_show - assert 1 == 0
    FAILED tests/test_cli.py::test_show_no_multiple - assert 1 == 0
    FAILED tests/test_git_utils.py::test_get_repo - git.exc.InvalidGitRepositoryError
    =============================== 10 failed, 108 passed in 0.60s ================================
    
    opened by frenzymadness 5
  • Drop dependency on `py` and simplify tests

    The first commit drops the dependency on the py module and switches from LocalPath to pathlib.Path, which has a different API, so it required some more changes.
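
As an illustration of the API difference: the test log above chains `tmpdir.mkdir("notebooks") / "test_meta_nb.ipynb"`, which works because LocalPath.mkdir returns the new directory. pathlib.Path.mkdir returns None, so the directory has to be created in a separate step. The sketch below uses a hypothetical helper, make_notebook_path, and is not the code from the PR.

```python
import tempfile
from pathlib import Path


def make_notebook_path(base: Path, name: str = "test_meta_nb.ipynb") -> Path:
    """Create a `notebooks` subdirectory under `base` and return a file path in it."""
    nb_dir = base / "notebooks"
    nb_dir.mkdir()  # returns None, so it cannot be chained with `/`
    return nb_dir / name


with tempfile.TemporaryDirectory() as tmp:
    nb_path = make_notebook_path(Path(tmp))
    nb_path.write_text("{}", encoding="utf-8")
    written = nb_path.read_text(encoding="utf-8")
```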

    The second commit actually removes the unnecessary subdirs. If you want to keep them, I can drop the second commit.

    opened by frenzymadness 4
  • RPM package in Fedora - problem with Typer version

    Hi!

    I really enjoyed your talk at PyCon PL today and I've decided that it'd be great to have databooks packaged in Fedora Linux. I've tried to create a package but found a problem: the Typer version in Fedora is too new (0.6.1), but databooks requires version < 0.5.

    https://github.com/datarootsio/databooks/blob/0b1d3e67c5fd00a145179545a456786bab5b7e77/pyproject.toml#L20

    Would it make sense to make the requirement less strict? If they are following semantic versioning, something like <1.0.0 should guarantee enough stability.

    opened by frenzymadness 4
  • Fix the location of LICENSE file

    LICENSE is included automatically by poetry and it's installed into .dist-info directory. This manual include caused the file to be installed also into lib/python3.11/site-packages next to the main databooks package.

    opened by frenzymadness 3
  • pytest 7.2.0 no longer depends on py

    This will no longer work with pytest 7.2.0 and higher. py lib is deprecated but you can fix the problem by requiring it explicitly if there is no other way now.

    https://github.com/datarootsio/databooks/blob/6febdbadde115cdb6fb89eec737d00cfff1d3223/tests/test_common.py#L3

    opened by frenzymadness 2
  • "Dynamic" recipes

    Make recipes take variables so that they could be reused more easily.

    Example:

    The recipe:

    max_cells = RecipeInfo(
        src="len(nb.cells) < amount_max",
        description="Assert that there are less than 'amount_max' cells in the notebook.",
    )
    

    The command:

    databooks assert path/to/nbs --recipe max_cells --amount_max 10
    
    databooks assert path/to/nbs --recipe max_cells --help
    

    Would return the arguments of that recipe
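
A rough sketch of how the arguments of such a recipe could be discovered and supplied. Everything here is hypothetical: the RecipeInfo dataclass is a stand-in for the real class, recipe_parameters and check are made-up helpers, and the set of notebook variables is taken from the assert command description further down. Plain eval is used for brevity and is not safe for untrusted input.

```python
import ast
import builtins
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, List


@dataclass(frozen=True)
class RecipeInfo:
    """Hypothetical stand-in for the recipe container shown above."""

    src: str
    description: str


# Names that the assert command puts in scope (assumed from its description).
NOTEBOOK_SCOPE = {"nb", "raw_cells", "md_cells", "code_cells", "exec_cells"}


def recipe_parameters(recipe: RecipeInfo) -> List[str]:
    """List the free variables of a recipe, i.e. the values a user must pass."""
    tree = ast.parse(recipe.src, mode="eval")
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(
        name
        for name in names
        if name not in NOTEBOOK_SCOPE and not hasattr(builtins, name)
    )


def check(recipe: RecipeInfo, nb: Any, **params: Any) -> bool:
    """Evaluate a recipe against a notebook with user-supplied parameters."""
    missing = set(recipe_parameters(recipe)) - set(params)
    if missing:
        raise TypeError(f"Missing recipe parameters: {sorted(missing)}")
    return bool(eval(recipe.src, {"__builtins__": {"len": len}}, {"nb": nb, **params}))


max_cells = RecipeInfo(
    src="len(nb.cells) < amount_max",
    description="Assert that there are less than 'amount_max' cells in the notebook.",
)
fake_nb = SimpleNamespace(cells=["a", "b", "c"])  # stand-in for a parsed notebook
```

Here recipe_parameters(max_cells) is what a `--help` for the recipe could print, and check(max_cells, fake_nb, amount_max=10) corresponds to passing `--amount_max 10`.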

    opened by nicogelders 2
  • Feature:assert command

    New CLI command! 🎉

    A databooks assert ... command that will take an expression and evaluate against all notebooks in path (internally databooks.affirm.affirm to hopefully remind of python's assert without creating any ambiguity). The command also comes with recipes that are short-hand for some useful expressions. These can also be used as inspiration for new expressions.

    Variables in scope for expression:

    • nb: Jupyter notebook found in path
    • raw_cells: notebook cells of raw type
    • md_cells: notebook cells of markdown type
    • code_cells: notebook cells of code type
    • exec_cells: notebook cells of executed code type

    It uses Python's eval. Since eval is dangerous, we first parse the string and only allow whitelisted nodes/attributes.

    • Allowed nodes are in databooks.affirm._ALLOWED_NODES
    • Allowed functions are in databooks.affirm._ALLOWED_BUILTINS
    • Only class attributes are allowed (found Pydantic model attributes using dict(model).keys())
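
The general technique of parsing before evaluating can be sketched as follows. This is an illustration under assumptions, not the code in databooks.affirm: the contents of ALLOWED_NODES and ALLOWED_BUILTINS below are made up for the example, and attribute access is rejected entirely to keep the sketch short.

```python
import ast
from typing import Any, Dict

# Assumed whitelist for this sketch; the real lists live in databooks.affirm.
ALLOWED_NODES = (
    ast.Expression, ast.Compare, ast.Call, ast.Name, ast.Load,
    ast.Constant, ast.List, ast.Lt, ast.Gt, ast.Eq,
)
ALLOWED_BUILTINS = {"len": len, "all": all, "any": any, "range": range, "list": list}


def safe_eval(src: str, scope: Dict[str, Any]) -> Any:
    """Evaluate `src` only if every AST node and every name is whitelisted."""
    tree = ast.parse(src, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"Node not allowed: {type(node).__name__}")
        if (
            isinstance(node, ast.Name)
            and node.id not in scope
            and node.id not in ALLOWED_BUILTINS
        ):
            raise ValueError(f"Name not allowed: {node.id}")
    code = compile(tree, "<expr>", "eval")
    return eval(code, {"__builtins__": ALLOWED_BUILTINS}, dict(scope))


result = safe_eval("len(cells) < 3", {"cells": ["a", "b"]})
```

An expression such as `__import__('os')` is rejected before eval is ever called, because the name is neither in the scope nor in the allowed builtins.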

    Check tests/test_affirm.py for examples of valid/invalid expressions.

    opened by murilo-cunha 2
  • Log-verbosity

    The package uses logging functionality but introduces a "verbose" concept where log level INFO suddenly becomes more verbose. This is a small PR that switches the log level to DEBUG when the --verbose flag is passed by the user.
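
The behaviour described can be sketched with the standard logging module. The function configure_logging and the logger name are hypothetical and only illustrate the mapping from a verbose flag to a log level.

```python
import logging


def configure_logging(verbose: bool = False) -> logging.Logger:
    """Return a logger at DEBUG level when `verbose` is set, INFO otherwise."""
    logger = logging.getLogger("databooks_sketch")  # hypothetical logger name
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)
    return logger


quiet_logger = configure_logging(verbose=False)
quiet_shows_debug = quiet_logger.isEnabledFor(logging.DEBUG)

verbose_logger = configure_logging(verbose=True)
verbose_shows_debug = verbose_logger.isEnabledFor(logging.DEBUG)
```

With this mapping, INFO messages keep a single meaning and the extra detail is emitted through logger.debug calls.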

    opened by Bart6114 2
  • add exception for html tables and try-except when parsing outputs (fa…

    Changes:

    • Add custom exception when parsing HTML to table fails
    • Add try-except for that exception when parsing outputs - when there are issues, fallback to text implementation
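
The pattern in the two changes above (a custom exception plus a fallback to text) might look roughly like this. The names HtmlTableError, parse_table and render_output are hypothetical, and the parser is a deliberately small stand-in built on the standard library, not the one used in databooks.

```python
from html.parser import HTMLParser
from typing import List


class HtmlTableError(Exception):
    """Raised when an HTML snippet cannot be parsed into table rows."""


class _TableParser(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            if not self.rows:
                raise HtmlTableError("table cell found outside of a row")
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


def parse_table(html: str) -> List[List[str]]:
    """Parse an HTML table into rows of cell text, or raise HtmlTableError."""
    parser = _TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        raise HtmlTableError("no table rows found")
    return parser.rows


def render_output(html: str, text: str) -> str:
    """Render the HTML as a simple table, falling back to the plain text."""
    try:
        rows = parse_table(html)
    except HtmlTableError:
        return text
    return "\n".join(" | ".join(row) for row in rows)


table_html = "<table><tr><th>a</th><th>b</th></tr><tr><td>1</td><td>2</td></tr></table>"
rendered = render_output(table_html, "plain text")
fallback = render_output("<p>not a table</p>", "plain text")
```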
    opened by murilo-cunha 1
  • Add rich tables

    Add rich table (HTML) rendering to notebook.

    • Other HTML elements are not displayed.
    • Fall back to displaying text if that's the case
    • Refactor logic to simplify this
    • Refactor logic to gracefully show unsupported mime types (anything that is not one of image/png, text/html, text/plain)
    opened by murilo-cunha 1
  • Repo found from paths

    • We should not need the git repo for meta and assert commands
    • Config files/repos should always be located from the filepaths, not current working directory
      • Only exception is when we run databooks diff with no paths - in that case we use the current dir to find the repo
      • This is not the case for databooks fix, which requires paths

    Closes https://github.com/datarootsio/databooks/issues/48

    opened by murilo-cunha 1
  • Enrich diffs

    Enhancements:

    • If diff is only on code/output, then only show the split for that section
    • Show when only cell/notebook metadata changes - maybe make cell border yellow, etc...
    • For cell source diffs, show lines/character changes by changing the background color (red/green)
    • Add rich rendering for other outputs
      • HTML (other than tables)
      • PNG
      • SVG
      • ...
    opened by murilo-cunha 0
  • [RFE] databooks run

    I like the way databooks is able to show the content of the notebook. It might make sense to implement a run command which can run all cells in a notebook and show the progress in the CLI interface cell by cell. We don't need to implement the execution because it can be done by nbconvert or nbclient packages.

    What do you think?

    opened by frenzymadness 3
  • Can I have the pre-commit hook recurse subdirectories?

    I'd like to use databooks in the pre-commit Git hook, but my notebooks are in a sub-folder called "notebooks". Is there a way to specify that in the pre-commit?

    opened by tonyreina 1
Releases(1.3.7)
  • 1.3.7(Nov 27, 2022)

    What's Changed

    • add exception for html tables and try-except when parsing outputs (fa… by @murilo-cunha in https://github.com/datarootsio/databooks/pull/62

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.3.6...1.3.7

  • 1.3.6(Nov 11, 2022)

    What's Changed

    • Add export option by @murilo-cunha in https://github.com/datarootsio/databooks/pull/60

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.3.5...1.3.6

  • 1.3.5(Nov 11, 2022)

    What's Changed

    • Add rich tables by @murilo-cunha in https://github.com/datarootsio/databooks/pull/59

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.3.4...1.3.5

  • 1.3.4(Nov 9, 2022)

    What's Changed

    • Add Python 3.11 to CI by @frenzymadness in https://github.com/datarootsio/databooks/pull/58

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.3.3...1.3.4

  • 1.3.3(Nov 8, 2022)

    What's Changed

    • Fix the location of LICENSE file by @frenzymadness in https://github.com/datarootsio/databooks/pull/57

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.3.2...1.3.3

  • 1.3.2(Nov 8, 2022)

    What's Changed

    • Fix test_get_repo by @frenzymadness in https://github.com/datarootsio/databooks/pull/56

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.3.1...1.3.2

  • 1.3.1(Nov 8, 2022)

    What's Changed

    • Repo found from paths by @murilo-cunha in https://github.com/datarootsio/databooks/pull/54

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.3.0...1.3.1

  • 1.3.0(Nov 7, 2022)

    What's Changed

    • Add diff command by @murilo-cunha in https://github.com/datarootsio/databooks/pull/45

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.9...1.3.0

  • 1.2.9(Nov 5, 2022)

    What's Changed

    • Relax typer version by @murilo-cunha in https://github.com/datarootsio/databooks/pull/53

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.8...1.2.9

  • 1.2.8(Nov 5, 2022)

    What's Changed

    • Drop dependency on py and simplify tests by @frenzymadness in https://github.com/datarootsio/databooks/pull/51

    New Contributors

    • @frenzymadness made their first contribution in https://github.com/datarootsio/databooks/pull/51

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.7...1.2.8

  • 1.2.7(Nov 4, 2022)

    What's Changed

    • include pull_request event to trigger cicd by @murilo-cunha in https://github.com/datarootsio/databooks/pull/52

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.6...1.2.7

  • 1.2.6(Nov 3, 2022)

    What's Changed

    • Bugfix: config not found by @murilo-cunha in https://github.com/datarootsio/databooks/pull/44

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.5...1.2.6

  • 1.2.5(Oct 26, 2022)

    What's Changed

    • Update docs by @murilo-cunha in https://github.com/datarootsio/databooks/pull/41

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.4...1.2.5

  • 1.2.4(Oct 23, 2022)

    What's Changed

    • Update recipe links in documentation and CLI help by @boaarmpit in https://github.com/datarootsio/databooks/pull/42

    New Contributors

    • @boaarmpit made their first contribution in https://github.com/datarootsio/databooks/pull/42

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.3...1.2.4

  • 1.2.3(Oct 20, 2022)

    What's Changed

    • Add pager option and display kernel name when showing notebooks by @murilo-cunha in https://github.com/datarootsio/databooks/pull/40

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.2...1.2.3

  • 1.2.2(Oct 19, 2022)

    What's Changed

    • Add rich diff representation of notebook by @murilo-cunha in https://github.com/datarootsio/databooks/pull/39

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.1...1.2.2

  • 1.2.1(Oct 18, 2022)

    What's Changed

    • Use specific cell types (code, raw, or markdown) by @murilo-cunha in https://github.com/datarootsio/databooks/pull/38

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.2.0...1.2.1

  • 1.2.0(Oct 14, 2022)

    What's Changed

    • feature: add show command by @murilo-cunha in https://github.com/datarootsio/databooks/pull/37

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.1.1...1.2.0

  • 1.1.1(Oct 10, 2022)

    What's Changed

    • Fix recipes for clean notebooks by @murilo-cunha in https://github.com/datarootsio/databooks/pull/36

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.1.0...1.1.1

  • 1.1.0(Oct 6, 2022)

    What's Changed

    • Add confirm prompt by @murilo-cunha in https://github.com/datarootsio/databooks/pull/35

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.0.6...1.1.0

  • 1.0.6(Oct 5, 2022)

    What's Changed

    • Fix recipe tests by @murilo-cunha in https://github.com/datarootsio/databooks/pull/34

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.0.5...1.0.6

  • 1.0.5(Apr 28, 2022)

    What's Changed

    • Add indentation on output notebook JSON by @murilo-cunha in https://github.com/datarootsio/databooks/pull/33

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.0.4...1.0.5

  • 1.0.4(Apr 14, 2022)

    What's Changed

    • Allow type annotations on imports (PEP-561) by @murilo-cunha in https://github.com/datarootsio/databooks/pull/32

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.0.3...1.0.4

  • 1.0.3(Apr 11, 2022)

    What's Changed

    • Bugfix/fix cog docs gen by @murilo-cunha in https://github.com/datarootsio/databooks/pull/31

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.0.2...1.0.3

  • 1.0.2(Mar 15, 2022)

    What's Changed

    • Add write method by @murilo-cunha in https://github.com/datarootsio/databooks/pull/30

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.0.1...1.0.2

  • 1.0.1(Mar 3, 2022)

    What's Changed

    • Assert docs and change deps by @murilo-cunha in https://github.com/datarootsio/databooks/pull/29

    Full Changelog: https://github.com/datarootsio/databooks/compare/1.0.0...1.0.1

  • 1.0.0(Feb 26, 2022)

    What's Changed

    • Feature:assert command by @murilo-cunha in https://github.com/datarootsio/databooks/pull/26

    Full Changelog: https://github.com/datarootsio/databooks/compare/0.1.15...1.0.0

  • 0.1.15(Feb 16, 2022)

    What's Changed

    • Add config tests by @murilo-cunha in https://github.com/datarootsio/databooks/pull/23

    Full Changelog: https://github.com/datarootsio/databooks/compare/0.1.14...0.1.15

  • 0.1.14(Feb 3, 2022)

    What's Changed

    • Feature/add py37 support by @murilo-cunha in https://github.com/datarootsio/databooks/pull/22

    Full Changelog: https://github.com/datarootsio/databooks/compare/0.1.13...0.1.14

  • 0.1.13(Jan 31, 2022)

    What's Changed

    • add API docs for missing files - config and logging by @murilo-cunha in https://github.com/datarootsio/databooks/pull/25

    Full Changelog: https://github.com/datarootsio/databooks/compare/0.1.12...0.1.13

Owner
dataroots
Supporting your data driven strategy.

PyMC 7.2k Dec 30, 2022
Semi-Automated Data Processing

Perform semi automated exploratory data analysis, feature engineering and feature selection on provided dataset by visualizing every possibilities on each step and assisting the user to make a meanin

Arun Singh Babal 1 Jan 17, 2022
A computer algebra system written in pure Python

SymPy See the AUTHORS file for the list of authors. And many more people helped on the SymPy mailing list, reported bugs, helped organize SymPy's part

SymPy 9.9k Dec 31, 2022
Two phase pipeline + StreamlitTwo phase pipeline + Streamlit

Two phase pipeline + Streamlit This is an example project that demonstrates how to create a pipeline that consists of two phases of execution. In betw

Rick Lamers 1 Nov 17, 2021
Repositori untuk menyimpan material Long Course STMKGxHMGI tentang Geophysical Python for Seismic Data Analysis

Long Course "Geophysical Python for Seismic Data Analysis" Instruktur: Dr.rer.nat. Wiwit Suryanto, M.Si Dipersiapkan oleh: Anang Sahroni Waktu: Sesi 1

Anang Sahroni 0 Dec 04, 2021
Binance Kline Data With Python

Binance Kline Data by seunghan(gingerthorp) reference https://github.com/binance/binance-public-data/ All intervals are supported: 1m, 3m, 5m, 15m, 30

shquant 5 Jul 13, 2022
A data analysis using python and pandas to showcase trends in school performance.

A data analysis using python and pandas to showcase trends in school performance. A data analysis to showcase trends in school performance using Panda

Jimmy Faccioli 0 Sep 07, 2021
The lastest all in one bombing tool coded in python uses tbomb api

BaapG-Attack is a python3 based script which is officially made for linux based distro . It is inbuit mass bomber with sms, mail, calls and many more bombing

59 Dec 25, 2022
A Python package for modular causal inference analysis and model evaluations

Causal Inference 360 A Python package for inferring causal effects from observational data. Description Causal inference analysis enables estimating t

International Business Machines 506 Dec 19, 2022
Data collection, enhancement, and metrics calculation.

l3_data_collection Data collection, enhancement, and metrics calculation. Summary Repository containing code for QuantDAO's JDT data collection task.

Ruiwyn 3 Dec 23, 2022