python-bigquery (🥈34 · ⭐ 3.5K · 📈) - Google BigQuery API client library. Apache-2

Overview

Python Client for Google BigQuery

Querying massive datasets can be time-consuming and expensive without the right hardware and infrastructure. Google BigQuery solves this problem by enabling super-fast SQL queries against append-mostly tables, using the processing power of Google's infrastructure.

Quick Start

In order to use this library, you first need to go through the following steps:

  1. Select or create a Cloud Platform project.
  2. Enable billing for your project.
  3. Enable the Google Cloud BigQuery API.
  4. Set up Authentication.

Installation

Install this library in a virtualenv using pip. virtualenv is a tool to create isolated Python environments. The basic problem it addresses is one of dependencies and versions, and indirectly permissions.

With virtualenv, it's possible to install this library without needing system install permissions, and without clashing with the installed system dependencies.

Supported Python Versions

Python >= 3.6, < 3.10

Unsupported Python Versions

Python == 2.7, Python == 3.5.

The last version of this library compatible with Python 2.7 and 3.5 is google-cloud-bigquery==1.28.0.

Mac/Linux

pip install virtualenv
virtualenv <your-env>
source <your-env>/bin/activate
<your-env>/bin/pip install google-cloud-bigquery

Windows

pip install virtualenv
virtualenv <your-env>
<your-env>\Scripts\activate
<your-env>\Scripts\pip.exe install google-cloud-bigquery

Example Usage

Perform a query

from google.cloud import bigquery

client = bigquery.Client()

# Perform a query.
QUERY = (
    'SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` '
    'WHERE state = "TX" '
    'LIMIT 100')
query_job = client.query(QUERY)  # API request
rows = query_job.result()  # Waits for query to finish

for row in rows:
    print(row.name)

Instrumenting With OpenTelemetry

This application uses OpenTelemetry to output tracing data from API calls to BigQuery. To enable OpenTelemetry tracing in the BigQuery client the following PyPI packages need to be installed:

pip install google-cloud-bigquery[opentelemetry] opentelemetry-exporter-google-cloud

After installation, OpenTelemetry can be used in the BigQuery client and in BigQuery jobs. First, however, an exporter must be specified to determine where the trace data is sent. For example:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchExportSpanProcessor
from opentelemetry.exporter.cloud_trace import CloudTraceSpanExporter
trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(
    BatchExportSpanProcessor(CloudTraceSpanExporter())
)

In this example all tracing data will be published to the Google Cloud Trace console. For more information on OpenTelemetry, please consult the OpenTelemetry documentation.

Comments
  • Can't upload data with "2019-07-08 08:00:00" datetime format to Google Bigquery with pandas

    Environment details

    I'm using pandas with google-cloud-python

    Steps to reproduce

    1. I have a dataframe with a datetime column, e.g. "2019-07-08 08:00:00", and my schema has a created column with DATETIME type.
    2. I tried to convert it using pd.to_datetime()
    3. Then I used load_table_from_dataframe() to insert data.

    Code example

    my_df = get_sessions()  # this return a dataframe has a column name is created which is datetime[ns] type ex :"2020-01-08 08:00:00"
    my_df['created'] = pd.to_datetime(my_df['created'], format='%Y-%m-%d %H:%M:%S').astype('datetime64[ns]')
    res = bigquery_client.client.load_table_from_dataframe(my_df, table_id)
    res.result()
    
    # exp: my value "2020-01-08 08:00:00" is being changed as INVALID or this value "0013-03-01T03:05:00" or other wrong value @plamut please help
    

    I just updated my problem. Thanks!

    api: bigquery type: bug priority: p2 external 
    opened by namnguyenbk 36
  • Bigquery: import error with v1.24.0

    bug googleapis/google-cloud-python#9965 is still happening in v1.24.0 and six v1.14.0

    File "/root/.local/share/virtualenvs/code-788z9T0p/lib/python3.6/site-packages/google/cloud/bigquery/schema.py", line 17, in <module>
        from six.moves import collections_abc
    ImportError: cannot import name 'collections_abc'

    why did you close the googleapis/google-cloud-python#9965 issue if it still reproduces for many people?

    api: bigquery needs more info type: question 
    opened by sagydr 31
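When an import like the one in the traceback above fails even though a recent package version is reported as installed, one thing worth ruling out is that a different copy of the module is being picked up from elsewhere on the path. The sketch below is a generic diagnostic, not something taken from the issue thread; the helper name `describe_module` is hypothetical, and the idea that a stale `six` is shadowing the expected one is only an assumption to check.

```python
import importlib


def describe_module(name):
    """Return the version and file location of an importable module, or None.

    Hypothetical helper for illustration: it only reports what the current
    interpreter would import, it does not fix anything.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return {
        "version": getattr(module, "__version__", None),
        "file": getattr(module, "__file__", None),
    }


# Shows which copy of six (if any) this interpreter resolves.
info = describe_module("six")
print(info)
```

If the reported file lives outside the active virtualenv, or the version differs from what pip lists, the environment rather than the library is the likelier culprit.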
  • BigQuery: make jobs awaitable

    I know BigQuery jobs are asynchronous by default. However, I am struggling to make my datapipeline async end-to-end.

    Looking at this JS example, I thought it would be the most Pythonic to make a BigQuery job awaitable. However, I can't get that to work in Python i.e. errors when await client.query(query). Looking at the source code, I don't see which method returns an awaitable object.

    I have little experience in writing async Python code and found this example that wraps jobs in an async def coroutine.

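    import asyncio

    from google.cloud import bigquery


    class BQApi(object):
        def __init__(self):
            # BQ_CONFIG is defined elsewhere in the reporter's code.
            self.api = bigquery.Client.from_service_account_json(BQ_CONFIG["credentials"])

        async def exec_query(self, query, **kwargs) -> bigquery.table.RowIterator:
            job = self.api.query(query, **kwargs)
            task = asyncio.create_task(self.coroutine_job(job))
            return await task

        @staticmethod
        async def coroutine_job(job):
            # job.result() is a blocking call, so this coroutine still
            # blocks the event loop while it waits.
            return job.result()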
    

    The google.api_core.operation.Operation shows how to use add_done_callback to asynchronously wait for long-running operations. I have tried that, but the following yields AttributeError: 'QueryJob' object has no attribute '_condition' :

    from concurrent.futures import ThreadPoolExecutor, as_completed
    query1 = 'SELECT 1'
    query2 = 'SELECT 2'
    
    def my_callback(future):
        result = future.result()
    
    operations = [bq.query(query1), bq.query(query2)]
    [operation.add_done_callback(my_callback) for operation in operations]
    results2 = []
    for future in as_completed(operations):
        results2.append(list(future.result()))
    

    Given that jobs are already asynchronous, would it make sense to add a method that returns an awaitable?

    Or am I missing something, and is there a Pythonic way to use the BigQuery client with the async/await pattern?

    wontfix api: bigquery type: feature request Python 3 Only 
    opened by dkapitan 27
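The question above is about awaiting a call that blocks. One common general-purpose approach, independent of this library, is to hand the blocking call to a worker thread with `loop.run_in_executor` and await the returned future. The sketch below uses only the standard library; `blocking_result` is a hypothetical stand-in for a blocking call such as `job.result()`, and nothing here is part of the BigQuery client API.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_result(value, delay=0.1):
    """Hypothetical stand-in for a blocking call such as ``job.result()``."""
    time.sleep(delay)
    return value


async def await_blocking(executor, func, *args):
    """Run a blocking callable in a worker thread and await its result."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, func, *args)


async def run_all(values):
    # One worker thread per call, so the waits overlap instead of
    # blocking the event loop one after another.
    with ThreadPoolExecutor(max_workers=len(values)) as executor:
        tasks = [await_blocking(executor, blocking_result, v) for v in values]
        return await asyncio.gather(*tasks)


# gather() preserves the order of the awaitables it was given.
results = asyncio.run(run_all([1, 2]))
```

With a real client, the same pattern would pass the job's blocking method as `func`. The thread still blocks while it waits, but the event loop stays free to run other coroutines.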
  • BigQuery: Upload pandas DataFrame containing arrays

    The documentation for the Python BigQuery API indicates that arrays are possible. However, when passing from a pandas dataframe to BigQuery, there is a pyarrow struct issue.

    The only way around it seems to be to drop the columns and then use JSON normalization for a separate table.

    from google.cloud import bigquery
    
    project = 'lake'
    client = bigquery.Client(credentials=credentials, project=project)
    dataset_ref = client.dataset('XXX')
    table_ref = dataset_ref.table('RAW_XXX')
    job_config = bigquery.LoadJobConfig()
    job_config.autodetect = True
    job_config.write_disposition = 'WRITE_TRUNCATE'
    
    client.load_table_from_dataframe(appended_data, table_ref, job_config=job_config).result()
    

    This is the error received: NotImplementedError: struct

    The reason I wanted to use this API as it indicates Nested Array support, which is perfect for our data lake in BQ but I assume this doesn't work?

    api: bigquery type: feature request 
    opened by AETDDraper 21
  • 500 server error when creating table using clustering

    Environment details

    • OS type and version: Ubuntu20 PopOs
    • Python version: 3.7.8
    • pip version: 20.1.1
    • google-cloud-bigquery version: 1.27.2

    Steps to reproduce

    I'm creating a table with some columns, one of which is of type GEOGRAPHY. When I try to create the table with sample data and choose to use clustering, I get the 500 error. I can create the table only if no clustering is used. I can also create the table with clustering if I don't include the column of type GEOGRAPHY. Code with a toy example to reproduce it:

    Code example

    import time
    import pandas as pd
    from google.cloud import bigquery
    from shapely.geometry import Point
    
    client = bigquery.Client()
    PROJECT_ID = ""
    table_id = f"{PROJECT_ID}.data_capture.toy"
    
    df = pd.DataFrame(
        dict(
            lat=[6.208969] * 100,
            lon=[-75.571696] * 100,
            logged_at=[int(time.time() * 1000) for _ in range(100)],
        )
    )
    df["point"] = df.apply(lambda row: Point(row["lon"], row["lat"]).wkb_hex, axis=1)
    
    job_config = bigquery.LoadJobConfig(
        schema=[
            bigquery.SchemaField("lon", "FLOAT64", "REQUIRED"),
            bigquery.SchemaField("lat", "FLOAT64", "REQUIRED"),
            bigquery.SchemaField("point", "GEOGRAPHY", "REQUIRED"),
            bigquery.SchemaField("logged_at", "TIMESTAMP", "REQUIRED"),
        ],
        write_disposition="WRITE_TRUNCATE",
        time_partitioning=bigquery.TimePartitioning(
            type_=bigquery.TimePartitioningType.DAY, field="logged_at",
        ),
        clustering_fields=["logged_at"],
    )
    
    job = client.load_table_from_dataframe(
        df, table_id, job_config=job_config
    )  # Make an API request.
    job.result()  # Wait for the job to complete.
    

    Stack trace

    Traceback (most recent call last):
      File "test.py", line 108, in <module>
        job.result()  # Wait for the job to complete.
      File "/home/charlie/data/kiwi/data-upload/.venv/lib/python3.7/site-packages/google/cloud/bigquery/job.py", line 812, in result
        return super(_AsyncJob, self).result(timeout=timeout)
      File "/home/charlie/data/kiwi/data-upload/.venv/lib/python3.7/site-packages/google/api_core/future/polling.py", line 130, in result
        raise self._exception
    google.api_core.exceptions.InternalServerError: 500 An internal error occurred and the request could not be completed. Error: 3144498
    

    Thank you in advance!

    api: bigquery type: bug priority: p2 external 
    opened by charlielito 20
  • ImportError: cannot import name bigquery_storage_v1beta1 from google.cloud

    This error occurs when I run a query using %%bigquery magics in GCP-hosted notebook and the query fails.

    "ImportError: cannot import name bigquery_storage_v1beta1 from google.cloud (unknown location)"

    Environment details

    • OS type and version:
    • Python version: 3.7
    • pip version: 20.1.1
    • google-cloud-bigquery version: 1.26.0 and 1.26.1; not an issue with 1.25.0

    Steps to reproduce

    1. Install current version of google-cloud-bigquery
    2. Query

    Code example

    %pip install --upgrade google-cloud-bigquery[bqstorage,pandas]
    %load_ext google.cloud.bigquery
    import google.cloud.bigquery.magics
    
    %%bigquery stackoverflow --use_bqstorage_api
    SELECT
      CONCAT(
        'https://stackoverflow.com/questions/',
        CAST(id as STRING)) as url,
      view_count
    FROM `bigquery-public-data.stackoverflow.posts_questions`
    WHERE tags like '%google-bigquery%'
    ORDER BY view_count DESC
    LIMIT 10
    

    Stack trace

    ---------------------------------------------------------------------------
    ImportError                               Traceback (most recent call last)
    <ipython-input-2-29432a7a9e7c> in <module>
    ----> 1 get_ipython().run_cell_magic('bigquery', 'stackoverflow --use_bqstorage_api', "SELECT\n  CONCAT(\n    'https://stackoverflow.com/questions/',\n    CAST(id as STRING)) as url,\n  view_count\nFROM `bigquery-public-data.stackoverflow.posts_questions`\nWHERE tags like '%google-bigquery%'\nORDER BY view_count DESC\nLIMIT 10\n")
    
    /opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
       2369             with self.builtin_trap:
       2370                 args = (magic_arg_s, cell)
    -> 2371                 result = fn(*args, **kwargs)
       2372             return result
       2373 
    
    /opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/magics.py in _cell_magic(line, query)
        589             )
        590         else:
    --> 591             result = query_job.to_dataframe(bqstorage_client=bqstorage_client)
        592 
        593         if args.destination_var:
    
    /opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/job.py in to_dataframe(self, bqstorage_client, dtypes, progress_bar_type, create_bqstorage_client, date_as_object)
       3381             progress_bar_type=progress_bar_type,
       3382             create_bqstorage_client=create_bqstorage_client,
    -> 3383             date_as_object=date_as_object,
       3384         )
       3385 
    
    /opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/table.py in to_dataframe(self, bqstorage_client, dtypes, progress_bar_type, create_bqstorage_client, date_as_object)
       1725                 progress_bar_type=progress_bar_type,
       1726                 bqstorage_client=bqstorage_client,
    -> 1727                 create_bqstorage_client=create_bqstorage_client,
       1728             )
       1729             df = record_batch.to_pandas(date_as_object=date_as_object)
    
    /opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/table.py in to_arrow(self, progress_bar_type, bqstorage_client, create_bqstorage_client)
       1543             record_batches = []
       1544             for record_batch in self._to_arrow_iterable(
    -> 1545                 bqstorage_client=bqstorage_client
       1546             ):
       1547                 record_batches.append(record_batch)
    
    /opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/table.py in _to_page_iterable(self, bqstorage_download, tabledata_list_download, bqstorage_client)
       1432     ):
       1433         if bqstorage_client is not None:
    -> 1434             for item in bqstorage_download():
       1435                 yield item
       1436             return
    
    /opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/_pandas_helpers.py in _download_table_bqstorage(project_id, table, bqstorage_client, preserve_order, selected_fields, page_to_item)
        626     # is available and can be imported.
        627     from google.cloud import bigquery_storage_v1
    --> 628     from google.cloud import bigquery_storage_v1beta1
        629 
        630     if "$" in table.table_id:
    
    ImportError: cannot import name 'bigquery_storage_v1beta1' from 'google.cloud' (unknown location)
    
    api: bigquery type: bug priority: p2 external 
    opened by vanessanielsen 20
  • convert time columns to dbtime by default in `to_dataframe`

    Currently TIME columns are just exposed as string objects. This would be a better experience and align better with the expectations for working with timeseries in pandas https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html

    Presumably one could combine a date column with a time column to create a datetime by adding them.

    api: bigquery type: feature request semver: major 
    opened by tswast 19
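The idea above of building a datetime from a date column and a time column can be illustrated with plain Python values. The following is a minimal standard-library sketch, not the library's conversion code; the function names are hypothetical. It shows both the direct `datetime.combine` route and the "adding them" route, where the time is first turned into an offset from midnight.

```python
import datetime


def combine_direct(date_value, time_value):
    """Combine a date and a time into a single naive datetime."""
    return datetime.datetime.combine(date_value, time_value)


def combine_by_adding(date_value, time_value):
    """Same result, expressed as midnight of the date plus a time offset."""
    midnight = datetime.datetime.combine(date_value, datetime.time.min)
    offset = datetime.timedelta(
        hours=time_value.hour,
        minutes=time_value.minute,
        seconds=time_value.second,
        microseconds=time_value.microsecond,
    )
    return midnight + offset


# Illustrative rows standing in for a DATE column and a TIME column.
rows = [
    (datetime.date(2021, 7, 8), datetime.time(8, 0, 0)),
    (datetime.date(2021, 7, 9), datetime.time(23, 59, 59, 500000)),
]
combined = [combine_direct(d, t) for d, t in rows]
added = [combine_by_adding(d, t) for d, t in rows]
```

Both routes give the same values, which is why a time dtype that behaves like an offset from midnight is convenient for this kind of arithmetic.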
  • feat: add support for Parquet options

    Closes #661.

    For load jobs and external tables config.

    PR checklist:

    • [x] Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
    • [x] Ensure the tests and linter pass
    • [x] Code coverage does not decrease (if any source code was changed)
    • [x] Appropriate docs were updated (if necessary)
    api: bigquery cla: yes 
    opened by plamut 19
  • feat: use geopandas for GEOGRAPHY columns if geopandas is installed

    This would technically be a breaking change, but it might make sense to do while we are changing default dtypes in https://github.com/googleapis/python-bigquery/pull/786 for https://issuetracker.google.com/144712110

    If the GeoPandas library is installed (meaning, GeoPandas should be considered an optional "extra"), it may make sense to use the extension dtypes provided by GeoPandas by default on GEOGRAPHY columns.

    api: bigquery type: feature request semver: major 
    opened by tswast 18
  • Purpose of timeout in client.get_job(timeout=5)

    Hi,

    What is the purpose of the timeout when fetching job information? The bigquery.Client documentation describes it as the time to wait before retrying. Should the retry attempt be made by the user, or will the client handle it?

    But I am getting an exception raised

    host='bigquery.googleapis.com', port=443): Read timed out. (read timeout=5)"}

    api: bigquery type: question 
    opened by nitishxp 18
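The question above mixes two separate ideas: a timeout bounds how long a single attempt may take, while a retry policy decides whether to try again after an attempt fails. The sketch below illustrates that distinction with a generic retry loop. It is not the client's retry implementation, and whether the client itself retries a read timeout is not answered in the text above; `AttemptTimeout`, `call_with_retry`, and `flaky_get_job` are all hypothetical names.

```python
import time


class AttemptTimeout(Exception):
    """Hypothetical error raised when one attempt exceeds its timeout."""


def call_with_retry(func, timeout, max_attempts=3, backoff=0.0):
    """Call func(timeout=...) up to max_attempts times, retrying on timeout."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            # The timeout applies to this single attempt only.
            return func(timeout=timeout)
        except AttemptTimeout as exc:
            last_error = exc
            # Linear backoff between attempts (zero by default).
            time.sleep(backoff * attempt)
    raise last_error


calls = []


def flaky_get_job(timeout):
    """Stand-in for a request that times out twice, then succeeds."""
    calls.append(timeout)
    if len(calls) < 3:
        raise AttemptTimeout("read timed out (timeout=%s)" % timeout)
    return "job-info"


result = call_with_retry(flaky_get_job, timeout=5)
```

If no retry wrapper is in effect, the first timed-out attempt surfaces directly to the caller, which matches the kind of exception quoted in the question.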
  • fix: support ARRAY data type when loading from DataFrame with Parquet

    Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

    • [x] Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
    • [x] Ensure the tests and linter pass
    • [x] Code coverage does not decrease (if any source code was changed)
    • [x] Appropriate docs were updated (if necessary)

    Fixes #19 🦕

    api: bigquery cla: yes 
    opened by judahrand 17
  • docs: revise create table cmek sample

    Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

    • [ ] Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
    • [ ] Ensure the tests and linter pass
    • [ ] Code coverage does not decrease (if any source code was changed)
    • [ ] Appropriate docs were updated (if necessary)

    Towards #790 🦕

    api: bigquery samples size: m 
    opened by Mattix23 1
  • docs: revise label table code samples

    Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

    • [ ] Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
    • [ ] Ensure the tests and linter pass
    • [ ] Code coverage does not decrease (if any source code was changed)
    • [ ] Appropriate docs were updated (if necessary)

    Towards #790 🦕

    api: bigquery samples size: m 
    opened by Mattix23 1
  • add update_table_access.py

    The public docs lack sample Python code for setting IAM on a BigQuery table. https://cloud.google.com/bigquery/docs/control-access-to-resources-iam#grant_access_to_a_table_or_view

    Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

    • [ ] Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
    • [ ] Ensure the tests and linter pass
    • [ ] Code coverage does not decrease (if any source code was changed)
    • [ ] Appropriate docs were updated (if necessary)

    Fixes #<1449> 🦕

    api: bigquery size: m 
    opened by nonokangwei 2
  • Grant access to bigquery table sample code

    Thanks for stopping by to let us know something could be better!

    PLEASE READ: If you have a support contract with Google, please create an issue in the support console instead of filing on GitHub. This will ensure a timely response.

    Is your feature request related to a problem? Please describe.
    In https://cloud.google.com/bigquery/docs/control-access-to-resources-iam#grant_access_to_a_table_or_view, there is no Python sample code for granting access on a table or view.

    Describe the solution you'd like
    Provide simple Python sample code for granting access on a table or view.

    Describe alternatives you've considered
    Customers need to write the code themselves from complex documentation.

    Additional context
    n/a

    api: bigquery samples 
    opened by nonokangwei 0
  • Support for JSON query parameters

    It is currently not possible to escape json parameters in a query like this:

    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.JsonQueryParameter("data", {"foo": "bar"})
        ]
    )
    stmt = 'UPDATE FROM my_table SET dat[email protected]';
    query_job = client.query(stmt, job_config=job_config)
    

    It seem quite important to avoid SQL injections.

    api: bigquery 
    opened by maingoh 0
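Until a dedicated JSON parameter type is available, one possible workaround is to serialize the value to a JSON string, bind it as an ordinary STRING query parameter, and convert it inside the SQL with `PARSE_JSON`. The sketch below assumes that approach; the helper names, the table `my_table`, and the column `data` are hypothetical, and the statement is an illustration rather than the reporter's query. The client import is done lazily so that the serialization helper runs without the library installed.

```python
import json


def json_param_value(payload):
    """Serialize a Python object to a JSON string for a STRING parameter."""
    return json.dumps(payload, sort_keys=True)


def build_job_config(payload):
    """Build a QueryJobConfig that binds the payload as @data (a STRING)."""
    # Imported here so json_param_value works without the client installed.
    from google.cloud import bigquery

    return bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter(
                "data", "STRING", json_param_value(payload)
            )
        ]
    )


# Hypothetical table and column names, for illustration only.
# The parameter is bound by the service, so the JSON text is never
# spliced into the SQL string itself.
STATEMENT = "UPDATE my_table SET data = PARSE_JSON(@data) WHERE TRUE"

value = json_param_value({"foo": "bar"})
```

A call such as `client.query(STATEMENT, job_config=build_job_config({"foo": "bar"}))` would then send the statement with the bound parameter, assuming a configured client and an existing table.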
  • docs: Revised create_partitioned_table sample

    Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

    • [ ] Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
    • [ ] Ensure the tests and linter pass
    • [ ] Code coverage does not decrease (if any source code was changed)
    • [ ] Appropriate docs were updated (if necessary)

    Towards #790 🦕

    api: bigquery samples size: m 
    opened by thejaredchapman 1
Releases(v3.4.1)
Owner
Google APIs
Clients for Google APIs and tools that help produce them.
Import an entity definition document into SQLite3. Manage the entity. Also, create a "Create Table SQL file".

EntityDocumentMaker Version 1.00 After importing the entity definition (Excel file), store the data in sqlite3.

G-jon FujiYama 1 Jan 09, 2022
Pure-python PostgreSQL driver

pg-purepy pg-purepy is a pure-Python PostgreSQL wrapper based on the anyio library. A lot of this library was inspired by the pg8000 library. Credits

Lura Skye 11 May 23, 2022
Micro ODM for MongoDB

Beanie - is an asynchronous ODM for MongoDB, based on Motor and Pydantic. It uses an abstraction over Pydantic models and Motor collections to work wi

Roman 993 Jan 03, 2023
Pandas Google BigQuery

pandas-gbq pandas-gbq is a package providing an interface to the Google BigQuery API from pandas Installation Install latest release version via conda

Python for Data 345 Dec 28, 2022
Python ODBC bridge

pyodbc pyodbc is an open source Python module that makes accessing ODBC databases simple. It implements the DB API 2.0 specification but is packed wit

Michael Kleehammer 2.6k Dec 27, 2022
GINO Is Not ORM - a Python asyncio ORM on SQLAlchemy core.

GINO - GINO Is Not ORM - is a lightweight asynchronous ORM built on top of SQLAlchemy core for Python asyncio. GINO 1.0 supports only PostgreSQL with

GINO Community 2.5k Dec 27, 2022
The JavaScript Database, for Node.js, nw.js, electron and the browser

The JavaScript Database Embedded persistent or in memory database for Node.js, nw.js, Electron and browsers, 100% JavaScript, no binary dependency. AP

Louis Chatriot 13.2k Jan 02, 2023
High level Python client for Elasticsearch

Elasticsearch DSL Elasticsearch DSL is a high-level library whose aim is to help with writing and running queries against Elasticsearch. It is built o

elastic 3.6k Jan 03, 2023
Pure Python MySQL Client

PyMySQL Table of Contents Requirements Installation Documentation Example Resources License This package contains a pure-Python MySQL client library,

PyMySQL 7.2k Jan 09, 2023
Python interface to Oracle Database conforming to the Python DB API 2.0 specification.

cx_Oracle version 8.2 (Development) cx_Oracle is a Python extension module that enables access to Oracle Database. It conforms to the Python database

Oracle 841 Dec 21, 2022
Anomaly detection on SQL data warehouses and databases

With CueObserve, you can run anomaly detection on data in your SQL data warehouses and databases. Getting Started Install via Docker docker run -p 300

Cuebook 171 Dec 18, 2022
Redis Python Client - The Python interface to the Redis key-value store.

redis-py The Python interface to the Redis key-value store. Installation | Contributing | Getting Started | Connecting To Redis Installation redis-py

Redis 11k Jan 08, 2023
MongoDB data stream pipeline tools by YouGov (adopted from MongoDB)

mongo-connector The mongo-connector project originated as a MongoDB mongo-labs project and is now community-maintained under the custody of YouGov, Pl

YouGov 1.9k Jan 04, 2023
Python Wrapper For sqlite3 and aiosqlite

Python Wrapper For sqlite3 and aiosqlite

6 May 30, 2022
A tool to snapshot sqlite databases you don't own

The core here is my first attempt at a solution of this, combining ideas from browser_history.py and karlicoss/HPI/sqlite.py to create a library/CLI tool to (as safely as possible) copy databases whi

Sean Breckenridge 10 Dec 22, 2022
Making it easy to query APIs via SQL

Shillelagh Shillelagh (ʃɪˈleɪlɪ) is an implementation of the Python DB API 2.0 based on SQLite (using the APSW library): from shillelagh.backends.apsw

Beto Dealmeida 207 Dec 30, 2022
MariaDB connector using python and flask

MariaDB connector using python and flask This should work with flask and to be deployed on docker. Setting up stuff 1. Docker build and run docker bui

Bayangmbe Mounmo 1 Jan 11, 2022
Makes it easier to write raw SQL in Python.

CoolSQL Makes it easier to write raw SQL in Python. Usage Quick Start from coolsql import Field name = Field("name") age = Field("age") condition =

Aber 7 Aug 21, 2022
dask-sql is a distributed SQL query engine in python using Dask

dask-sql is a distributed SQL query engine in Python. It allows you to query and transform your data using a mixture of common SQL operations and Python code and also scale up the calculation easily

Nils Braun 271 Dec 30, 2022
TileDB-Py is a Python interface to the TileDB Storage Engine.

TileDB-Py TileDB-Py is a Python interface to the TileDB Storage Engine. Quick Links Installation Build Instructions TileDB Documentation Python API re

TileDB, Inc. 149 Nov 28, 2022