Python client for the Socrata Open Data API

Overview

sodapy

sodapy is a Python client for the Socrata Open Data API.

Installation

You can install with pip install sodapy.

To install from source, clone this repository and run python setup.py install from the project root.

Requirements

At its core, this library depends heavily on the Requests package. All other requirements can be found in requirements.txt. sodapy is currently compatible with Python 3.5, 3.6, 3.7 and 3.8.

Documentation

The official Socrata Open Data API docs provide thorough documentation of the available methods, as well as other client libraries. A quick list of eligible domains to use with this API is available via the Socrata Discovery API or Socrata's Open Data Network.

This library supports writing directly to datasets with the Socrata Open Data API. For write operations that use data transformations in the Socrata Data Management Experience (the user interface for creating datasets), use the Socrata Data Management API. For more details on when to use SODA vs the Data Management API, see the Data Management API documentation. A Python SDK for the Socrata Data Management API can be found at socrata-py.

Examples

The examples directory contains Jupyter notebooks that show sodapy in action.

Interface

client

Import the library and set up a connection to get started.

>>> from sodapy import Socrata
>>> client = Socrata(
        "sandbox.demo.socrata.com",
        "FakeAppToken",
        username="[email protected]",
        password="mypassword",
        timeout=10
    )

username and password are only required for creating or modifying data. An application token isn't strictly required (it can be None), but queries executed from a client without an application token are subject to strict throttling limits. You may want to increase timeout (given in seconds) when making large requests. To create a bare-bones client:

>>> client = Socrata("sandbox.demo.socrata.com", None)

A client can also be created with a context manager to obviate the need for teardown:

>>> with Socrata("sandbox.demo.socrata.com", None) as client:
...     pass  # do some stuff

The client, by default, makes requests over HTTPS. To modify this behavior, or to make requests through a proxy, take a look here.
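One way to keep credentials out of source code is to read the constructor arguments from environment variables. The following is a minimal sketch, not part of sodapy: the helper client_settings_from_env and the SODAPY_* variable names are hypothetical, and only the keyword names (username, password, timeout) come from the constructor call shown above.

```python
import os


def client_settings_from_env(environ=None):
    """Collect Socrata constructor arguments from environment variables.

    The SODAPY_* names are assumptions made for this sketch; sodapy
    itself does not read them.
    """
    environ = os.environ if environ is None else environ
    # Domain is mandatory; the app token may be absent (None).
    args = (environ["SODAPY_DOMAIN"], environ.get("SODAPY_APP_TOKEN"))
    kwargs = {"timeout": int(environ.get("SODAPY_TIMEOUT", "10"))}
    # Credentials are only needed for creating or modifying data.
    if "SODAPY_USERNAME" in environ and "SODAPY_PASSWORD" in environ:
        kwargs["username"] = environ["SODAPY_USERNAME"]
        kwargs["password"] = environ["SODAPY_PASSWORD"]
    return args, kwargs


args, kwargs = client_settings_from_env(
    {"SODAPY_DOMAIN": "sandbox.demo.socrata.com", "SODAPY_TIMEOUT": "30"}
)
print(args, kwargs)
```

The returned values could then be passed along as Socrata(*args, **kwargs).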

datasets(limit=0, offset=0)

Retrieve datasets associated with a particular domain. The optional limit and offset keyword args can be used to retrieve a subset of the datasets. By default, all datasets are returned.

>>> client.datasets()
[{"resource" : {"name" : "Approved Building Permits", "id" : "msk6-43c6", "parent_fxf" : null, "description" : "Data of approved building/construction permits",...}, {resource : {...}}, ...]

get(dataset_identifier, content_type="json", **kwargs)

Retrieve data from the requested resources. Filter and query data by field name, id, or using SoQL keywords.

>>> client.get("nimj-3ivp", limit=2)
[{u'geolocation': {u'latitude': u'41.1085', u'needs_recoding': False, u'longitude': u'-117.6135'}, u'version': u'9', u'source': u'nn', u'region': u'Nevada', u'occurred_at': u'2012-09-14T22:38:01', u'number_of_stations': u'15', u'depth': u'7.60', u'magnitude': u'2.7', u'earthquake_id': u'00388610'}, {...}]

>>> client.get("nimj-3ivp", where="depth > 300", order="magnitude DESC", exclude_system_fields=False)
[{u'geolocation': {u'latitude': u'-15.563', u'needs_recoding': False, u'longitude': u'-175.6104'}, u'version': u'9', u':updated_at': 1348778988, u'number_of_stations': u'275', u'region': u'Tonga', u':created_meta': u'21484', u'occurred_at': u'2012-09-13T21:16:43', u':id': 132, u'source': u'us', u'depth': u'328.30', u'magnitude': u'4.8', u':meta': u'{\n}', u':updated_meta': u'21484', u'earthquake_id': u'c000cnb5', u':created_at': 1348778988}, {...}]

>>> client.get("nimj-3ivp/193", exclude_system_fields=False)
{u'geolocation': {u'latitude': u'21.6711', u'needs_recoding': False, u'longitude': u'142.9236'}, u'version': u'C', u':updated_at': 1348778988, u'number_of_stations': u'136', u'region': u'Mariana Islands region', u':created_meta': u'21484', u'occurred_at': u'2012-09-13T11:19:07', u':id': 193, u'source': u'us', u'depth': u'300.70', u'magnitude': u'4.4', u':meta': u'{\n}', u':updated_meta': u'21484', u':position': 193, u'earthquake_id': u'c000cmsq', u':created_at': 1348778988}

>>> client.get("nimj-3ivp", region="Kansas")
[{u'geolocation': {u'latitude': u'38.10', u'needs_recoding': False, u'longitude': u'-100.6135'}, u'version': u'9', u'source': u'nn', u'region': u'Kansas', u'occurred_at': u'2010-09-19T20:52:09', u'number_of_stations': u'15', u'depth': u'300.0', u'magnitude': u'1.9', u'earthquake_id': u'00189621'}, {...}]
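When a result set is too large for one request, limit can be combined with the SoQL offset keyword to fetch it page by page. The sketch below assumes that get() accepts an offset keyword alongside limit; iter_pages and FakeClient are hypothetical names introduced here, and FakeClient only stands in for a real client so the example runs offline.

```python
def iter_pages(client, dataset_identifier, page_size=1000, **query):
    """Yield lists of rows, one page at a time, until the data runs out."""
    offset = 0
    while True:
        page = client.get(
            dataset_identifier, limit=page_size, offset=offset, **query
        )
        if not page:
            break
        yield page
        # A short page means there is nothing left to fetch.
        if len(page) < page_size:
            break
        offset += page_size


class FakeClient:
    """Stand-in for a Socrata client (hypothetical, for illustration only)."""

    def __init__(self, rows):
        self.rows = rows

    def get(self, dataset_identifier, limit=1000, offset=0, **kwargs):
        return self.rows[offset:offset + limit]


pages = list(iter_pages(FakeClient(list(range(5))), "nimj-3ivp", page_size=2))
print(pages)  # [[0, 1], [2, 3], [4]]
```

With a real client it is probably wise to also pass an order argument so that rows keep a stable position between requests.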

get_all(dataset_identifier, content_type="json", **kwargs)

Read data from the requested resource, paginating over all results. Accepts the same arguments as get(). Returns a generator.

>>> client.get_all("nimj-3ivp")
<generator object Socrata.get_all at 0x7fa0dc8be7b0>

>>> for item in client.get_all("nimj-3ivp"):
...     print(item)
...
{'geolocation': {'latitude': '-15.563', 'needs_recoding': False, 'longitude': '-175.6104'}, 'version': '9', ':updated_at': 1348778988, 'number_of_stations': '275', 'region': 'Tonga', ':created_meta': '21484', 'occurred_at': '2012-09-13T21:16:43', ':id': 132, 'source': 'us', 'depth': '328.30', 'magnitude': '4.8', ':meta': '{\n}', ':updated_meta': '21484', 'earthquake_id': 'c000cnb5', ':created_at': 1348778988}
...

>>> import itertools
>>> items = client.get_all("nimj-3ivp")
>>> first_five = list(itertools.islice(items, 5))
>>> len(first_five)
5
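Because get_all() returns a generator, rows can also be processed in fixed-size batches without loading the whole dataset into memory. This is a small sketch using only the standard library; the batched helper is hypothetical, and the fake_rows generator merely stands in for the result of client.get_all(...).

```python
import itertools


def batched(rows, size):
    """Yield lists of at most `size` items taken from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(itertools.islice(iterator, size))
        if not batch:
            return
        yield batch


# Stand-in for client.get_all("nimj-3ivp"); any generator of rows works.
fake_rows = ({"earthquake_id": str(n)} for n in range(7))
sizes = [len(batch) for batch in batched(fake_rows, 3)]
print(sizes)  # [3, 3, 1]
```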

get_metadata(dataset_identifier, content_type="json")

Retrieve the metadata associated with a particular dataset.

>>> client.get_metadata("nimj-3ivp")
{"newBackend": false, "licenseId": "CC0_10", "publicationDate": 1436655117, "viewLastModified": 1451289003, "owner": {"roleName": "administrator", "rights": [], "displayName": "Brett", "id": "cdqe-xcn5", "screenName": "Brett"}, "query": {}, "id": "songs", "createdAt": 1398014181, "category": "Public Safety", "publicationAppendEnabled": true, "publicationStage": "published", "rowsUpdatedBy": "cdqe-xcn5", "publicationGroup": 1552205, "displayType": "table", "state": "normal", "attributionLink": "http://foo.bar.com", "tableId": 3523378, "columns": [], "metadata": {"rdfSubject": "0", "renderTypeConfig": {"visible": {"table": true}}, "availableDisplayTypes": ["table", "fatrow", "page"], "attachments": ... }}

update_metadata(dataset_identifier, update_fields, content_type="json")

Update the metadata for a particular dataset. update_fields should be a dictionary containing only the metadata keys that you wish to overwrite.

Note: Invalid payloads to this method could corrupt the dataset or visualization. See this comment for more information.

>>> client.update_metadata("nimj-3ivp", {"attributionLink": "https://anothertest.com"})
{"newBackend": false, "licenseId": "CC0_10", "publicationDate": 1436655117, "viewLastModified": 1451289003, "owner": {"roleName": "administrator", "rights": [], "displayName": "Brett", "id": "cdqe-xcn5", "screenName": "Brett"}, "query": {}, "id": "songs", "createdAt": 1398014181, "category": "Public Safety", "publicationAppendEnabled": true, "publicationStage": "published", "rowsUpdatedBy": "cdqe-xcn5", "publicationGroup": 1552205, "displayType": "table", "state": "normal", "attributionLink": "https://anothertest.com", "tableId": 3523378, "columns": [], "metadata": {"rdfSubject": "0", "renderTypeConfig": {"visible": {"table": true}}, "availableDisplayTypes": ["table", "fatrow", "page"], "attachments": ... }}
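Since update_fields should contain only the keys to overwrite, it can help to compare the desired values with the current metadata first. The helper below is a hypothetical sketch, not part of sodapy; it compares top-level keys only, so a nested dictionary is treated as changed whenever any part of it differs.

```python
def changed_fields(current_metadata, desired):
    """Return only the entries of `desired` that differ from the current metadata."""
    return {
        key: value
        for key, value in desired.items()
        if current_metadata.get(key) != value
    }


# `current` stands in for the dictionary returned by client.get_metadata(...).
current = {"attributionLink": "http://foo.bar.com", "category": "Public Safety"}
update_fields = changed_fields(
    current,
    {"attributionLink": "https://anothertest.com", "category": "Public Safety"},
)
print(update_fields)  # {'attributionLink': 'https://anothertest.com'}
```

The resulting dictionary could then be passed as the second argument to update_metadata.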

download_attachments(dataset_identifier, content_type="json", download_dir="~/sodapy_downloads")

Download all attachments associated with a dataset. Return a list of paths to the downloaded files.

>>> client.download_attachments("nimj-3ivp", download_dir="~/Desktop")
['/Users/xmunoz/Desktop/nimj-3ivp/FireIncident_Codes.PDF', '/Users/xmunoz/Desktop/nimj-3ivp/AccidentReport.jpg']

create(name, **kwargs)

Create a new dataset. Optionally, specify keyword args such as:

  • description description of the dataset
  • columns list of fields
  • category dataset category (must exist in /admin/metadata)
  • tags list of tag strings
  • row_identifier field name of primary key
  • new_backend whether to create the dataset in the new backend

Example usage:

>>> columns = [{"fieldName": "delegation", "name": "Delegation", "dataTypeName": "text"}, {"fieldName": "members", "name": "Members", "dataTypeName": "number"}]
>>> tags = ["politics", "geography"]
>>> client.create("Delegates", description="List of delegates", columns=columns, row_identifier="delegation", tags=tags, category="Transparency")
{u'id': u'2frc-hyvj', u'name': u'Foo Bar', u'description': u'test dataset', u'publicationStage': u'unpublished', u'columns': [ { u'name': u'Foo', u'dataTypeName': u'text', u'fieldName': u'foo', ... }, { u'name': u'Bar', u'dataTypeName': u'number', u'fieldName': u'bar', ... } ], u'metadata': { u'rowIdentifier': 230641051 }, ... }
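For datasets with many fields, the columns list can be generated from a compact specification rather than written out by hand. The make_columns helper below is a hypothetical convenience, not part of sodapy; it only reproduces the dictionary shape used in the example above.

```python
def make_columns(specs):
    """Turn (fieldName, name, dataTypeName) tuples into column dictionaries."""
    return [
        {"fieldName": field_name, "name": name, "dataTypeName": data_type}
        for field_name, name, data_type in specs
    ]


columns = make_columns(
    [
        ("delegation", "Delegation", "text"),
        ("members", "Members", "number"),
    ]
)
print(columns)
```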

publish(dataset_identifier, content_type="json")

Publish a dataset after creating it, i.e. take it out of 'working copy' mode. The dataset id returned from create is used to publish.

>>> client.publish("2frc-hyvj")
{u'id': u'2frc-hyvj', u'name': u'Foo Bar', u'description': u'test dataset', u'publicationStage': u'unpublished', u'columns': [ { u'name': u'Foo', u'dataTypeName': u'text', u'fieldName': u'foo', ... }, { u'name': u'Bar', u'dataTypeName': u'number', u'fieldName': u'bar', ... } ], u'metadata': { u'rowIdentifier': 230641051 }, ... }
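The create and publish steps can be chained by reading the id from the dictionary that create returns. The sketch below assumes the response contains an "id" key, as in the example output above; create_and_publish and RecordingClient are hypothetical names, and RecordingClient only stands in for a real client so the flow can be run offline.

```python
def create_and_publish(client, name, **kwargs):
    """Create a dataset, publish it, and return its identifier."""
    created = client.create(name, **kwargs)
    dataset_id = created["id"]
    client.publish(dataset_id)
    return dataset_id


class RecordingClient:
    """Stand-in client that records calls (hypothetical, for illustration)."""

    def __init__(self):
        self.calls = []

    def create(self, name, **kwargs):
        self.calls.append(("create", name))
        return {"id": "2frc-hyvj"}

    def publish(self, dataset_identifier):
        self.calls.append(("publish", dataset_identifier))
        return {"id": dataset_identifier}


fake = RecordingClient()
dataset_id = create_and_publish(fake, "Delegates", description="List of delegates")
print(dataset_id, fake.calls)
```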

set_permission(dataset_identifier, permission="private", content_type="json")

Set the permissions of a dataset to public or private.

>>> client.set_permission("2frc-hyvj", "public")
<Response [200]>

upsert(dataset_identifier, payload, content_type="json")

Create a new row in an existing dataset.

>>> data = [{'Delegation': 'AJU', 'Name': 'Alaska', 'Key': 'AL', 'Entity': 'Juneau'}]
>>> client.upsert("eb9n-hr43", data)
{u'Errors': 0, u'Rows Deleted': 0, u'Rows Updated': 0, u'By SID': 0, u'Rows Created': 1, u'By RowIdentifier': 0}

Update/Delete rows in a dataset.

>>> data = [{'Delegation': 'sfa', ':id': 8, 'Name': 'bar', 'Key': 'doo', 'Entity': 'dsfsd'}, {':id': 7, ':deleted': True}]
>>> client.upsert("eb9n-hr43", data)
{u'Errors': 0, u'Rows Deleted': 1, u'Rows Updated': 1, u'By SID': 2, u'Rows Created': 0, u'By RowIdentifier': 0}

Upserts can even be performed with a CSV file.

>>> with open("upsert_test.csv") as data:
...     client.upsert("eb9n-hr43", data)
{u'Errors': 0, u'Rows Deleted': 0, u'Rows Updated': 1, u'By SID': 1, u'Rows Created': 0, u'By RowIdentifier': 0}
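For large payloads, one option is to send rows in smaller chunks and add up the per-request summaries. This is only a sketch under stated assumptions: upsert_in_chunks and FakeUpsertClient are hypothetical, the chunk size is arbitrary, and the summary keys are taken from the example responses above.

```python
def upsert_in_chunks(client, dataset_identifier, rows, chunk_size=1000):
    """Upsert `rows` in slices of `chunk_size` and sum the result counters."""
    totals = {}
    for start in range(0, len(rows), chunk_size):
        result = client.upsert(dataset_identifier, rows[start:start + chunk_size])
        for key, count in result.items():
            totals[key] = totals.get(key, 0) + count
    return totals


class FakeUpsertClient:
    """Stand-in client that pretends every row is created (illustration only)."""

    def upsert(self, dataset_identifier, payload):
        return {"Errors": 0, "Rows Created": len(payload)}


rows = [{"Delegation": "AJU", "Key": str(n)} for n in range(5)]
totals = upsert_in_chunks(FakeUpsertClient(), "eb9n-hr43", rows, chunk_size=2)
print(totals)  # {'Errors': 0, 'Rows Created': 5}
```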

replace(dataset_identifier, payload, content_type="json")

Similar in usage to upsert, but overwrites existing data.

>>> data = open("replace_test.csv")
>>> client.replace("eb9n-hr43", data)
{u'Errors': 0, u'Rows Deleted': 0, u'Rows Updated': 0, u'By SID': 0, u'Rows Created': 12, u'By RowIdentifier': 0}

create_non_data_file(params, file_obj)

Creates a new file-based dataset with the name provided in the files tuple. A valid file input would be:

files = (
    {'file': ("gtfs2", open('myfile.zip', 'rb'))}
)

>>> with open(nondatafile_path, 'rb') as f:
...     files = (
...         {'file': ("nondatafile.zip", f)}
...     )
...     response = client.create_non_data_file(params, files)

replace_non_data_file(dataset_identifier, params, file_obj)

Same as create_non_data_file, but replaces a file that already exists in a file-based dataset.

Note: a table-based dataset cannot be replaced by a file-based dataset. Use create_non_data_file in order to replace.

>>> with open(nondatafile_path, 'rb') as f:
...     files = (
...         {'file': ("nondatafile.zip", f)}
...     )
...     response = client.replace_non_data_file(DATASET_IDENTIFIER, {}, files)

delete(dataset_identifier, row_id=None, content_type="json")

Delete an individual row.

>>> client.delete("nimj-3ivp", row_id=2)
<Response [200]>

Delete the entire dataset.

>>> client.delete("nimj-3ivp")
<Response [200]>

close()

Close the session when you're finished.

>>> client.close()

Run tests

$ pytest

Contributing

See CONTRIBUTING.md.

Meta

This package uses semantic versioning.

Source and wheel distributions are available on PyPI. Here is how I create those releases.

python3 setup.py bdist_wheel
python3 setup.py sdist
twine upload dist/*

Owner

Cristina