TileDB-Py is a Python interface to the TileDB Storage Engine.

Overview

Quick Installation

TileDB-Py is available from either PyPI with pip:

pip install tiledb

or from conda-forge with conda or mamba:

conda install -c conda-forge tiledb-py

Dataframes functionality (tiledb.from_pandas, Array.df[]) requires Pandas 1.0 or higher, and PyArrow 1.0 or higher.
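
A quick way to check these minimums before relying on the dataframe helpers is sketched below. It uses only the standard library. The names `MINIMUMS`, `major_minor` and `dataframe_support_status` are illustrative assumptions and are not part of TileDB-Py.

```python
import re
from importlib import metadata

# Minimum (major, minor) versions stated in the installation notes above.
MINIMUMS = {"pandas": (1, 0), "pyarrow": (1, 0)}


def major_minor(version):
    """Extract (major, minor) from a version string such as '1.5.3'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def dataframe_support_status():
    """Report, per package, whether the installed version meets the minimum."""
    status = {}
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            status[name] = "not installed"
            continue
        verdict = "ok" if major_minor(installed) >= minimum else "too old"
        status[name] = f"{installed} ({verdict})"
    return status


print(dataframe_support_status())
```

The comparison looks only at the major and minor components, which is enough for a "1.0 or higher" requirement.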

Contributing

We welcome contributions. Please see CONTRIBUTING.md for suggestions and development-build instructions. For larger features, please open an issue to discuss goals and approach in order to ensure a smooth PR integration and review process.

Comments
  • error on call to consolidate

    With the most recent release, I am getting an error on consolidate. Something odd seems to be happening on the second dimension (based upon the error message).

    Traceback (most recent call last):
      File "./bug.py", line 38, in <module>
        main()
      File "./bug.py", line 34, in main
        tiledb.consolidate(name)
      File "tiledb/libtiledb.pyx", line 4420, in tiledb.libtiledb.consolidate
      File "tiledb/libtiledb.pyx", line 387, in tiledb.libtiledb._raise_ctx_err
      File "tiledb/libtiledb.pyx", line 372, in tiledb.libtiledb._raise_tiledb_error
    tiledb.libtiledb.TileDBError: [TileDB::Query] Error: Subarray out of bounds. subarray: [0, 27999, 0, 20699] domain: [0, 27999, 0, 20645]
    

    To reproduce:

    import tiledb
    import numpy as np
    
    
    def make_array(name, shape):
    
        filters = tiledb.FilterList([
            tiledb.ZstdFilter(),
        ])
        attrs = [
            tiledb.Attr(dtype=np.float32, filters=filters)
        ]
        domain = tiledb.Domain(tiledb.Dim(name="obs", domain=(0, shape[0] - 1), tile=min(shape[0], 200), dtype=np.uint32),
                               tiledb.Dim(name="var", domain=(0, shape[1] - 1), tile=min(shape[1], 100), dtype=np.uint32))
    
        schema = tiledb.ArraySchema(domain=domain, sparse=False, attrs=attrs,
                                    cell_order='row-major', tile_order='row-major')
        tiledb.DenseArray.create(name, schema)
    
    
    def main():
    
        shape = (28000, 20646)
        name = "X"
        make_array(name, shape)
    
        stride = int(np.power(10, np.around(np.log10(1e8 / shape[1]))))
        with tiledb.DenseArray(name, mode='w') as X:
            for row in range(0, shape[0], stride):
                lim = min(row+stride, shape[0])
                print(row, lim)
                X[row:lim, :] = np.random.rand(lim-row, shape[1])
    
            tiledb.consolidate(name)
    
    
    if __name__ == '__main__':
        main()
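
One observation about the numbers in the error message, offered as a reading of the arithmetic rather than a confirmed diagnosis: the second dimension has 20646 cells and a tile extent of 100, and the reported subarray upper bound (20699) is what you get by rounding that domain up to a whole number of tiles. The helper `expand_to_tiles` below is hypothetical and only reproduces that calculation.

```python
def expand_to_tiles(lower, upper, tile):
    """Round an inclusive [lower, upper] range up to a whole number of tiles."""
    cells = upper - lower + 1
    tiles = -(-cells // tile)  # ceiling division
    return lower, lower + tiles * tile - 1


# Values from the reproduction script: shape (28000, 20646), tile extents (200, 100).
obs = expand_to_tiles(0, 27999, 200)
var = expand_to_tiles(0, 20645, 100)
print(obs, var)  # (0, 27999) (0, 20699)
```

The first dimension divides evenly into tiles, so its bounds are unchanged. Only the second dimension grows past its domain, which matches the dimension singled out in the report.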
    
    
    opened by bkmartinjr 17
  • 0.8.1 crashes with “OSError: libtiledb.so: cannot open shared object file: No such file or directory”

    This happens on travis:

    import tiledb
    
    python/3.8.6/site-packages/tiledb/__init__.py:38: in <module>
        ctypes.CDLL(lib_name)
    python/3.8.6/lib/python3.8/ctypes/__init__.py:373: in __init__
        self._handle = _dlopen(self._name, mode)
    E   OSError: libtiledb.so: cannot open shared object file: No such file or directory
    

    /edit: yup, 0.8.1 specific. If I specify tiledb!=0.8.1 it installs 0.8 and my tests run

    bug 
    opened by flying-sheep 13
  • Update dataframe_.py

    if sparse == False and (not index_dims or "index_col" not in kwargs): NameError: name 'kwargs' is not defined

    kwargs was replaced with tiledb_args

    opened by royassis 11
  • QueryConditions for Dense Arrays

    Hi.

    I have the following scenario: I open a TileDB array that was created as a dense array and apply a QueryCondition to it, and it complains that QueryConditions may only be applied to sparse arrays.

    import numpy as np
    import pandas as pd
    import tiledb as td
    
    # create the df
    
    columns = ["index", "seqnames", "start", "end"]
    records = np.array(
        [
            [0, "chr1", 0, 5],
            [1, "chr1", 6, 10],
            [2, "chr2", 11, 15],
            [3, "chr2", 16, 20],
            [4, "chr3", 21, 25],
        ]
    )
    
    df = pd.DataFrame(records, columns=columns)
    
    # write to tiledb
    
    td.from_pandas("./test_pandas", df)
    
    
    # open tiledb with query
    
    with td.open("./test_pandas", "r") as a:
        qc = td.QueryCondition("seqnames == 'chr1'")
        q = a.query(attr_cond=qc)
    

    The traceback:

    ---------------------------------------------------------------------------
    TileDBError                               Traceback (most recent call last)
    test.py in <cell line: 28>()
         [168](file:///test.py?line=167) with td.open("./test_pandas", "r") as a:
         [169](file:///test.py?line=168)     qc = td.QueryCondition("seqnames == 'chr1'")
    ---> [170](file:///test.py?line=169)     q = a.query(attr_cond=qc)
    
    File tiledb/libtiledb.pyx:4369, in tiledb.libtiledb.DenseArrayImpl.query()
    
    File tiledb/libtiledb.pyx:4098, in tiledb.libtiledb.Query.__init__()
    
    TileDBError: QueryConditions may only be applied to sparse arrays
    

    The documentation implies that this would work with dense arrays. Is there a workaround for this?

    opened by VictorAlex1 11
  • Consolidation causes core dump

    Code to reproduce:

    import numpy as np
    import tiledb
    
    dense_array_name = 'test_1d'
    
    ctx = tiledb.Ctx()
    
    dom = tiledb.Domain(tiledb.Dim(ctx=ctx, domain=(0, 11), tile=12, dtype=np.int64), ctx=ctx)
    schema = tiledb.ArraySchema(ctx=ctx, domain=dom, sparse=False,
                                attrs=[tiledb.Attr(ctx=ctx, dtype=np.int64)])
    
    tiledb.DenseArray.create(dense_array_name, schema)
    
    with tiledb.DenseArray(dense_array_name, ctx=ctx, mode='w') as A:
        A[:2] = np.arange(2)
        A[2:4] = np.arange(2, 4)
        A[4:6] = np.arange(4, 6)
        A[6:8] = np.arange(6, 8)
        A[8:10] = np.arange(8, 10)
        A[10:12] = np.arange(10, 12)
        
    tiledb.consolidate(uri=dense_array_name)
    

    I have just updated tiledb-py to version 0.4.1 via conda, and the problem persists. My laptop is a MacBook Pro running macOS High Sierra. The Python version is 3.7.3.

    opened by qinxuye 11
  • QueryCondition with leading whitespace throws error

    If a QueryCondition has leading white space, the QueryCondition parser will throw an error. It does not seem sensitive to intermediate or trailing white space.

    Ideally, it would ignore all whitespace including newlines, allowing multi-line strings, e.g., in Python:

    qc = tiledb.QueryCondition("""
             a == 'a'
          and
            b == 'b'
        """)
    

    Example failure running TileDB-Py 0.17.5:

    In [34]: tiledb.QueryCondition("a == 'a'")
    Out[34]: QueryCondition(expression="a == 'a'")
    
    In [35]: tiledb.QueryCondition(" a == 'a'")
    ---------------------------------------------------------------------------
    IndentationError                          Traceback (most recent call last)
    File ~/projects/soma-scratch/venv/lib/python3.10/site-packages/tiledb/query_condition.py:115, in QueryCondition.__post_init__(self)
        114 try:
    --> 115     self.tree = ast.parse(self.expression, mode="eval")
        116 except:
    
    File /usr/lib/python3.10/ast.py:50, in parse(source, filename, mode, type_comments, feature_version)
         49 # Else it should be an int giving the minor version for 3.x.
    ---> 50 return compile(source, filename, mode, flags,
         51                _feature_version=feature_version)
    
    IndentationError: unexpected indent (<unknown>, line 1)
    
    During handling of the above exception, another exception occurred:
    
    TileDBError                               Traceback (most recent call last)
    Cell In [35], line 1
    ----> 1 tiledb.QueryCondition(" a == 'a'")
    
    File <string>:5, in __init__(self, expression, ctx)
    
    File ~/projects/soma-scratch/venv/lib/python3.10/site-packages/tiledb/query_condition.py:117, in QueryCondition.__post_init__(self)
        115     self.tree = ast.parse(self.expression, mode="eval")
        116 except:
    --> 117     raise tiledb.TileDBError(
        118         "Could not parse the given QueryCondition statement: "
        119         f"{self.expression}"
        120     )
        122 if not self.tree:
        123     raise tiledb.TileDBError(
        124         "The query condition statement could not be parsed properly. "
        125         "(Is this an empty expression?)"
        126     )
    
    TileDBError: Could not parse the given QueryCondition statement:  a == 'a'
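
The traceback shows the failure originating in `ast.parse(..., mode="eval")`, which rejects leading indentation. Below is a minimal sketch of one possible caller-side workaround, assuming it is acceptable to wrap the expression in parentheses so that Python's parser ignores line breaks and indentation. `parse_condition` is a hypothetical helper and not TileDB-Py's own parser.

```python
import ast


def parse_condition(expression):
    """Parse a condition string, tolerating leading whitespace and newlines."""
    stripped = expression.strip()
    if not stripped:
        raise ValueError("empty query condition")
    # Inside parentheses, Python ignores newlines and indentation.
    return ast.parse(f"(\n{stripped}\n)", mode="eval")


multi_line = """
         a == 'a'
      and
        b == 'b'
    """
tree = parse_condition(multi_line)
print(type(tree.body).__name__, len(tree.body.values))  # BoolOp 2
```

Calling `expression.strip()` alone fixes the single-line case. The parentheses are what make the multi-line form parse.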
    
    opened by bkmartinjr 10
  • Deadlock writing concurrently to S3 array

    It seems that TileDB deadlocks when writing concurrently to an array in S3. It doesn't lock up with ThreadPoolExecutor, or when writing to a file:// array.

    TileDB-Py 0.9.3 TileDB 2.3.2

    from concurrent.futures.process import ProcessPoolExecutor
    
    import numpy as np
    from tiledb import *
    
    array_name = 's3://...'
    ctx = Ctx(...)
    
    s = ArraySchema(domain=Domain(Dim('a', dtype=np.int8, domain=(0, 126))), attrs=[Attr('x', dtype=np.int32)], sparse=True)
    
    SparseArray.create(array_name, schema=s, ctx=ctx)
    
    
    def write(i):
        with SparseArray(array_name, mode='w', ctx=ctx) as A:
            A[i] = i  # Processes hang here
    
    
    with ProcessPoolExecutor() as pool:
        for i in range(50):
            pool.submit(write, i)
    
    
    with SparseArray(array_name, ctx=ctx) as A:
        print(A[:])
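
The report notes that threads do not hang but processes do. A common cause of hangs of this kind is forking a process after a library has already started background threads. Whether that is what happens here is an assumption the issue does not confirm. Under that assumption, one thing to try is starting the workers with the "spawn" method instead of "fork". The helper names below are illustrative.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def spawn_context():
    """Return a multiprocessing context whose workers start fresh (no fork)."""
    return multiprocessing.get_context("spawn")


def make_spawn_pool(max_workers=None):
    """Create a process pool whose workers are started with 'spawn'."""
    return ProcessPoolExecutor(max_workers=max_workers, mp_context=spawn_context())
```

In the reproduction script, `ProcessPoolExecutor()` would be replaced by `make_spawn_pool()`. With "spawn", the code that creates the pool must sit under an `if __name__ == "__main__":` guard, and the worker function must be importable at module level.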
    
    opened by gatesn 10
  • uncaught exception when operating on s3 storage

    Hello,

    I’m trying to port a Python program from an HDFS back end to S3. Running some simple tests I’m getting an uncaught exception from TileDB. As the S3 back end, I'm generally using minio for development, though I've reproduced the same issue using a Ceph object store. Here’s the condensed example:

    [email protected]:/tdmq-dist$ python3
    Python 3.6.9 (default, Oct  8 2020, 12:12:24)
    [GCC 8.4.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import tiledb
    >>> tiledb.__version__
    '0.7.0'
    >>> service_info = {
    ...         'version' : '0.1',
    ...         'tiledb' : {
    ...             'storage.root' : 's3://firstbucket/',
    ...             'config': {
    ...                 "vfs.s3.aws_access_key_id": "tdm-user",
    ...                 "vfs.s3.aws_secret_access_key": "tdm-user-s3",
    ...                 "vfs.s3.endpoint_override": "minio:9000",
    ...                 "vfs.s3.scheme": "http",
    ...                 "vfs.s3.region": "",
    ...                 "vfs.s3.verify_ssl": "false",
    ...                 "vfs.s3.use_virtual_addressing": "false",
    ...                 "vfs.s3.use_multipart_upload": "false",
    ...                 "vfs.s3.logging_level": 'TRACE'
    ...                 }
    ...             }
    ...         }
    >>> def clean_s3(tdmq_s3_service_info):
    ...     import tiledb
    ...     config = tiledb.Config(params=tdmq_s3_service_info['tiledb']['config'])
    ...     bucket = tdmq_s3_service_info['tiledb']['storage.root']
    ...     assert bucket.startswith('s3://')
    ...     ctx = tiledb.Ctx(config=config)
    ...     vfs = tiledb.VFS(ctx=ctx)
    ...     if vfs.is_bucket(bucket):
    ...         vfs.empty_bucket(bucket)
    ...     else:
    ...         vfs.create_bucket(bucket)
    ...     return tdmq_s3_service_info
    

    The first one or two times I call clean_s3(service_info) it works fine.

    >>> clean_s3(service_info)
    log4j:WARN File option not set for appender [FSLOGGER].
    log4j:WARN Are you using FileAppender instead of ConsoleAppender?
    {'version': '0.1', 'tiledb': {'storage.root': 's3://firstbucket/', 'config': {'vfs.s3.aws_access_key_id': 'tdm-user', 'vfs.s3.aws_secret_access_key': 'tdm-user-s3', 'vfs.s3.endpoint_override': 'minio:9000', 'vfs.s3.scheme': 'http', 'vfs.s3.region': '', 'vfs.s3.verify_ssl': 'false', 'vfs.s3.use_virtual_addressing': 'false', 'vfs.s3.use_multipart_upload': 'false', 'vfs.s3.logging_level': 'TRACE'}}}
    >>> clean_s3(tdmq_s3_service_info)
    {'version': '0.1', 'tiledb': {'storage.root': 's3://firstbucket/', 'config': {'vfs.s3.aws_access_key_id': 'tdm-user', 'vfs.s3.aws_secret_access_key': 'tdm-user-s3', 'vfs.s3.endpoint_override': 'minio:9000', 'vfs.s3.scheme': 'http', 'vfs.s3.region': '', 'vfs.s3.verify_ssl': 'false', 'vfs.s3.use_virtual_addressing': 'false', 'vfs.s3.use_multipart_upload': 'false', 'vfs.s3.logging_level': 'TRACE'}}}
    

    Then something breaks. Further calls to the function result in an uncaught exception in TileDB:

    >>> clean_s3(s3_service_info)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 8, in clean_s3
      File "tiledb/libtiledb.pyx", line 5511, in tiledb.libtiledb.VFS.is_bucket
      File "tiledb/libtiledb.pyx", line 481, in tiledb.libtiledb._raise_ctx_err
      File "tiledb/libtiledb.pyx", line 466, in tiledb.libtiledb._raise_tiledb_error
    tiledb.libtiledb.TileDBError: Error: Internal TileDB uncaught exception; basic_string::compare: __pos (which is 18446744073709551615) > this->size() (which is 4)
    >>> clean_s3(service_info)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 8, in clean_s3
      File "tiledb/libtiledb.pyx", line 5511, in tiledb.libtiledb.VFS.is_bucket
      File "tiledb/libtiledb.pyx", line 481, in tiledb.libtiledb._raise_ctx_err
      File "tiledb/libtiledb.pyx", line 466, in tiledb.libtiledb._raise_tiledb_error
    tiledb.libtiledb.TileDBError: Error: Internal TileDB uncaught exception; basic_string::compare: __pos (which is 18446744073709551615) > this->size() (which is 4)
    

    Once this happens I can't seem to do any S3 operation. Here is some other information I have collected:

    • Reinitializing the tiledb context has no effect.
    • Trying to reload the module has no effect (i.e., importlib.reload(tiledb)).
    • I can reproduce this with both minio and Ceph.
    • Generally, things break after the first couple of times I've called the function, generally in quick succession. Sometimes it happens on the first call.
    • I've tried distancing subsequent calls to clean_s3 by as much as 15 seconds, but the thing still breaks.
    • I'm running in Docker, in a custom ubuntu-based image. TileDB-Py is installed via pip.
    • I have condensed this example from a more complicated scenario and another example I put together (which I posted in the forum). In those cases, the exception was being generated by tiledb.object_type and by tiledb.DenseArray.create.

    Given the value of __pos, I guess it's got something to do with an unsigned type being used where a signed one is expected -- or maybe not. Let me know if I can be of help.
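
The guess about signedness can be checked arithmetically. The reported position, 18446744073709551615, equals 2**64 - 1, which is the bit pattern of -1 read as an unsigned 64-bit integer. It is also the value C++ uses for `std::string::npos` on typical 64-bit platforms. The sketch below only verifies the arithmetic and does not establish where the value comes from inside TileDB. `as_unsigned_64` is an illustrative helper.

```python
import struct


def as_unsigned_64(value):
    """Reinterpret a signed 64-bit integer as unsigned (two's complement)."""
    return struct.unpack("<Q", struct.pack("<q", value))[0]


reported_pos = 18446744073709551615
print(reported_pos == 2**64 - 1)           # True
print(as_unsigned_64(-1) == reported_pos)  # True
```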

    opened by ilveroluca 10
  • zero length unicode string causes assertion failure

    tiledb version 0.5.6

    Attempting to write an array of zero-length unicode strings triggers an assertion failure.

    test case:

    attrs = [tiledb.Attr(name="foo", dtype=np.unicode)]
    domain = tiledb.Domain(tiledb.Dim(domain=(0, 1000), tile=1000))
    schema = tiledb.ArraySchema(domain=domain, sparse=False, attrs=attrs)
    tiledb.DenseArray.create("foo", schema)
    
    with tiledb.DenseArray("foo", mode="w") as A:
        # change empty string to non-empty and things work
        val = np.array(['' for i in range(0, 1001)], dtype=np.unicode)
        A[:] = val
    

    result:

    # python foo.py
    Traceback (most recent call last):
      File "foo.py", line 61, in <module>
        A[:] = np.zeros((1000,), dtype=np.unicode)
      File "tiledb/libtiledb.pyx", line 4175, in tiledb.libtiledb.DenseArrayImpl.__setitem__
      File "tiledb/libtiledb.pyx", line 274, in tiledb.libtiledb._write_array
      File "tiledb/np2buf.pyx", line 108, in tiledb.libtiledb.array_to_buffer
    AssertionError
    
    opened by bkmartinjr 10
  • Querying data with negative real indices

    Hello there,

    I'm working with a TileDB array containing two domains with negative values. When querying values I get:

    tiledb.libtiledb.TileDBError: [TileDB::Query] Error: Subarray lower bound is larger than upper bound. subarray: [86.75, 86.5, 26.75, 26.75, 157498, 157498] domain: [-89.75, 89.75, -179.75, 179.75, 157498, 157863]
    

    My query looks as follows:

    data = self.data.query(attrs=["some_data"], coords=True)[-3.25,26.75,157498.4375]
    

    Also the error message does not display the subarray I define in my query.

    opened by aosterthun 10
  • Support for orthogonal indexing

    I ran this on TileDB-py 0.3.0 with TileDB 1.4.0.

    It seems like the slicing operation on a SparseArray or a DenseArray only supports queries such as:

    a[1, 30:40]

    However, if I have a list of values to query (say 1, 4, 10), it would fail:

    a[[1, 4, 10], 30:40]

    The only way I can think of doing this is to iterate through the list and do a slice like:

    a[1, 30:40]
    a[4, 30:40]
    a[10, 30:40]

    Is there a more efficient way to do this? I'd imagine doing this in Python would be pretty slow. If I have more dimensions it can get more complicated. For instance, if I know the exact indices for the things I want to query, say for a 3D array:

    a[[1, 4, 10], [20, 30, 40], 30:40]

    Then I have to get the Cartesian product of [1, 4, 10] and [20, 30, 40] and slice through the rest.

    In zarr, for instance, there's a special function to do this (https://zarr.readthedocs.io/en/stable/tutorial.html#orthogonal-indexing). Do you see this as a viable option to implement in TileDB?
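
The manual workaround described in this issue, one slice per combination of the listed indices, can be sketched with `itertools.product`. A nested Python list stands in for the array so the example runs without TileDB. `orthogonal_select` and `read_slice` are hypothetical names, and the sketch issues the same number of reads as the hand-written loop, so it organizes the work rather than making it faster.

```python
from itertools import product


def orthogonal_select(read_slice, index_lists, trailing):
    """Read one trailing slice for every combination of the listed indices."""
    return {
        combo: read_slice(*combo, trailing)
        for combo in product(*index_lists)
    }


# Stand-in 3-D "array": data[i][j][k] == i * 100 + j * 10 + k
data = [[[i * 100 + j * 10 + k for k in range(5)] for j in range(3)] for i in range(3)]


def read_slice(i, j, k_slice):
    """Stand-in for a[i, j, start:stop] on a real array."""
    return data[i][j][k_slice]


result = orthogonal_select(read_slice, [[0, 2], [1, 2]], slice(1, 3))
print(result[(2, 1)])  # [211, 212]
```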

    opened by will133 10
  • CtxMixin classmethods

    Replace "private" constructor parameters (_capsule, _lt_obv) with CtxMixin classmethods (from_capsule, from_pybind11). Follow-up to https://github.com/TileDB-Inc/TileDB-Py/pull/1548.

    opened by gsakkis 0
Releases(0.19.1)
  • 0.19.1(Jan 4, 2023)

    TileDB Embedded updates:

    Improvements

    • Move Dim and Domain from Cython to pure Python #1327

    Bug Fixes

    • Ensure NumPy array matches array schema dimensions for dense writes #1514

    https://pypi.org/project/tiledb/0.19.1/

  • 0.19.0(Dec 7, 2022)

    Packaging Notes

    • Added support for Python 3.11

    TileDB Embedded updates:

    Deprecations

    • FragmentInfoList.non_empty_domain deprecated for FragmentInfoList.nonempty_domain
    • FragmentInfoList.to_vacuum_num deprecated for len(FragmentInfoList.to_vacuum)
    • FragmentInfoList.to_vacuum_uri deprecated for FragmentInfoList.to_vacuum
    • FragmentInfoList.dense deprecated for not FragmentInfoList.sparse
    • FragmentInfo.non_empty_domain deprecated for FragmentInfo.nonempty_domain
    • FragmentInfo.to_vacuum_num deprecated for len(FragmentInfo.to_vacuum)
    • FragmentInfo.to_vacuum_uri deprecated for FragmentInfo.to_vacuum
    • FragmentInfo.dense deprecated for not FragmentInfo.sparse
    • FragmentsInfo deprecated for FragmentInfoList
    • tiledb.delete_fragments deprecated for Array.delete_fragments
    • Array.timestamp deprecated for Array.timestamp_range
    • Array.coords_dtype deprecated with no replacement; combined coords have been removed from libtiledb
    • Array.query(attr_cond=...) deprecated for Array.query(cond=...)
    • Array.query(cond=tiledb.QueryCondition('expression')) deprecated for Array.query(cond='expression')

    API Changes

    • Add support for WebpFilter #1395
    • Support Boolean types for query conditions #1432
    • Support for partial consolidation using a list of fragment URIs #1431
    • Addition of ArraySchemaEvolution.timestamp #1480
    • Addition of ArraySchema.has_dim #1430
    • Addition of Array.delete_array #1428

    Bug Fixes

    • Fix issue where queries in delete mode error out on arrays with string dimensions #1473
    • Fix representation of nullable integers in dataframe when using PyArrow path #1439
    • Check for uninitialized query state after submit and error out if uninitialized #1483

    https://pypi.org/project/tiledb/0.19.0/

  • 0.18.3(Nov 28, 2022)

    Packaging Notes

    • Linux wheels now built on manylinux2014; previously built on manylinux2010
    • Windows wheels NOT AVAILABLE for this release

    TileDB Embedded updates:

    Improvements

    • Move from_numpy out of Cython into pure Python #1436
    • Move Attr From Cython to Pure Python #1411

    Bug Fixes

    • Fix .df and .multi_index always returning attributes applied in query conditions #1433

    https://pypi.org/project/tiledb/0.18.3/

  • 0.18.2(Nov 7, 2022)

  • 0.18.1(Nov 3, 2022)

  • 0.18.0(Oct 26, 2022)

    TileDB Embedded updates:

    API Changes

    • Changes to query conditions #1341
      • Support query conditions on sparse dimensions
      • Deprecate attr_cond in favor of cond
      • Deprecate passing tiledb.QueryCondition to cond in favor of passing string directly
    • Add support for XORFilter #1294
    • Addition of Array.delete_fragments; deprecate tiledb.delete_fragments #1329
    • Array and Group metadata now store bytes as TILEDB_BLOB #1384
    • Addition of {Array,Group}.metadata.dump() #1384
    • Addition of Group.is_relative to check if the URI component of a group member is relative #1386
    • Addition of query deletes to delete data that satisfies a given query condition #1309
    • Addition of FileIO.readinto #1389

    Improvements

    • Addition of Utility Function get_last_ctx_err_str() for C API #1351
    • Move Context and Config from Cython to pure Python #1379

    https://pypi.org/project/tiledb/0.18.0/

  • 0.17.6(Oct 25, 2022)

    TileDB-Py 0.17.6 Release Notes

    Bug Fixes

    • Correct writing empty/null strings to array. tiledb.main.array_to_buffer needs to resize data buffer at the end of convert_unicode; otherwise, last cell will be stored with trailing null chars #1339
    • Revert #1326 due to issues with Context lifetime in multiprocess settings #1372

    https://pypi.org/project/tiledb/0.17.6/

  • 0.17.5(Oct 11, 2022)

    TileDB-Py 0.17.5 Release Notes

    Improvements

    • Move Attr from Cython to pure Python #1326

    API Changes

    • Permit true-ASCII attributes in non-from-pandas dataframes #1337
    • Addition of Array.upgrade_version to upgrade array to latest version #1334
    • Attributes in query conditions no longer need to be passed to Array.query's attr arg #1333
    • ArraySchemaEvolution checks context's last error for error message #1335

    https://pypi.org/project/tiledb/0.17.5/

  • 0.17.4(Sep 23, 2022)

    TileDB-Py 0.17.4 Release Notes

    TileDB Embedded updates:

    API Changes

    • Addition of FloatScaleFilter #1195

    Misc Updates

    • Wheels are minimally supported for macOS 10.15 Catalina #1275

    https://pypi.org/project/tiledb/0.17.4/

  • 0.17.3(Sep 12, 2022)

    TileDB-Py 0.17.3 Release Notes

    API Changes

    • Add ability to pass shape tuple to empty_like #1316
    • Support retrieving MBRs of var-sized dimensions #1311

    Misc Updates

    • Wheels will no longer be supported for macOS 10.15 Catalina; the minimum supported macOS version is now 11 Big Sur #1300
    • Wheels will no longer be supported for Python 3.6 #1300

    https://pypi.org/project/tiledb/0.17.3/

  • 0.17.2(Aug 25, 2022)

    TileDB-Py 0.17.2 Release Notes

    TileDB Embedded updates:

    Bug Fixes

    • Fix issue where querying an array with a Boolean type when arrow=True, but is unselected in .query(attr=...), results in an error pyarrow.lib.ArrowInvalid: Invalid column index to set field. #1291
    • Use Arrow type fixed-width binary ("w:") for non-variable TILEDB_CHAR #1286

    https://pypi.org/project/tiledb/0.17.2/

  • 0.17.1(Aug 16, 2022)

  • 0.17.0(Aug 8, 2022)

  • 0.16.5(Aug 8, 2022)

  • 0.16.4(Aug 4, 2022)

    TileDB-Py 0.16.4 Release Notes

    TileDB Embedded updates:

    Improvements

    • setup.py revert back to retrieving core version by using ctypes by parsing tiledb_version.h; the tiledb shared object lib now returns back a full path #1226
    • Update minimum required cmake version to >=3.23; required for building libtiledb #1260

    API Changes

    • Addition of in operator for QueryCondition #1214
    • Revert the regular indexer [:] to return entire array rather than nonempty domain in order to maintain NumPy semantics #1261

    Bug Fixes

    • Deprecate Filestore.import_uri in lieu of Filestore.copy_from #1226

    https://pypi.org/project/tiledb/0.16.4/

  • 0.16.3(Jul 11, 2022)

    TileDB-Py 0.16.3 Release Notes

    Packaging Notes

    • This removes import tkinter from test_libtiledb.py which was preventing the conda package from building properly

    https://pypi.org/project/tiledb/0.16.3/

  • 0.16.2(Jul 8, 2022)

    TileDB-Py 0.16.2 Release Notes

    TileDB Embedded updates:

    Improvements

    • setup.py retrieves core version by using ctypes to call tiledb_version rather than parsing tiledb_version.h #1191

    Bug Fixes

    • Set nonempty domain of string dimension to (None, None) for empty array #1182

    API Changes

    • Support QueryCondition for dense arrays #1198
    • Querying dense array with [:] returns shape that matches nonempty domain, consistent with .df[:] and .multi_index[:] #1199
    • Addition of from_numpy support for mode={ingest,schema_only,append} #1185

    https://pypi.org/project/tiledb/0.16.2/

  • 0.16.1(Jun 25, 2022)

  • 0.16.0(Jun 23, 2022)

    TileDB-Py 0.16.0 Release Notes

    TileDB Embedded updates:

    API Changes

    • Addition of Filestore API #1070
    • Use bool instead of uint8 for Boolean dtype in dataframe_.py #1154
    • Support QueryCondition OR operator #1146

    https://pypi.org/project/tiledb/0.16.0/

  • 0.15.6(Jun 22, 2022)

  • 0.15.5(Jun 15, 2022)

    TileDB-Py 0.15.5 Release Notes

    TileDB Embedded updates:

    API Changes

    • Support TILEDB_BLOB dtype #1159

    Bug Fixes

    • Fix error where passing a Context to Group would segfault intermittently #1165
    • Correct Boolean values when use_arrow=True #1167

    https://pypi.org/project/tiledb/0.15.5/

  • 0.15.4(Jun 13, 2022)

  • 0.15.3(Jun 4, 2022)

  • 0.15.2(Jun 1, 2022)

    TileDB-Py 0.15.2 Release Notes

    TileDB Embedded updates:

    Improvements

    • Refactor MultiRangeIndexer & DataFrameIndexer: addition of ABC _BaseIndexer with virtual method _run_query and generator _BaseIndexer.__iter__; remove _iter_state; and fix bugs related to incomplete queries #1134

    Bug Fixes

    • Fix race condition in {Dense,Sparse}Array.__new__ #1096
    • Correcting stats_dump issues: Python stats now also in JSON form if json=True, resolve name mangling of json argument and json module, and pulling "timer" and "counter" stats from stats_json_core for libtiledb>=2.3 #1140

    API Changes

    • Addition of tiledb.DictionaryFilter #1074
    • Add support for Datatype::TILEDB_BOOL #1110
    • Addition of Group.__contains__ to check if member with given name is in Group #1125
    • Support with-statement for Groups #1124
    • Addition of keys, values, and items to Group.meta #1123
    • Group.member also returns name if given #1141

    https://pypi.org/project/tiledb/0.15.2/

  • 0.15.1(May 19, 2022)

  • 0.15.0(May 16, 2022)

    TileDB-Py 0.15.0 Release Notes

    TileDB Embedded updates:

    Misc Updates

    • Wheels will no longer be supported for macOS 10.14 Mojave; the minimum supported macOS version is now 10.15 Catalina #1080

    https://pypi.org/project/tiledb/0.15.0/

  • 0.14.5(May 12, 2022)

  • 0.14.4(May 10, 2022)

    TileDB-Py 0.14.4 Release Notes

    Misc Updates

    • Update MACOSX_DEPLOYMENT_TARGET from 10.14 to 10.15 #1080

    Bug Fixes

    • Correct handling of Arrow cell count with all empty result #1082

    https://pypi.org/project/tiledb/0.14.4/

  • 0.14.3(May 3, 2022)

    TileDB-Py 0.14.3 Release Notes

    Improvements

    • Refactor display of TileDB objects in Jupyter notebooks to be more readable #1049
    • Improve documentation for Filter, FilterList, VFS, FileIO, Group, and QueryCondition #1043, #1058

    Bug Fixes

    • Dim.shape correctly errors out if type is not integer or datetime #1055
    • Correctly check dtypes in from_pandas for supported versions of NumPy <1.20 #1054
    • Fix Arrow Table lifetime issues when using .query(return_arrow=True) #1056

    https://pypi.org/project/tiledb/0.14.3/

  • 0.14.2(Apr 27, 2022)

    TileDB-Py 0.14.2 Release Notes

    TileDB Embedded updates:

    Improvements

    • Add Group and Object to docs #1040

    Bug Fixes

    • Correct Group.__repr__ to call correct _dump function #1040
    • Check type of ctx in from_pandas and from_csv #1042
    • Only allow use of .df indexer for .query(return_arrow=True); error out with meaningful error message otherwise #1045

    https://pypi.org/project/tiledb/0.14.2/

Owner
TileDB, Inc.