Logica is a logic programming language that compiles to StandardSQL and runs on Google BigQuery.

Overview

Logica: language of Big Data

Logica is an open source declarative logic programming language for data manipulation. Logica is a successor to Yedalog, a language created at Google earlier.

Why?

Logica is for engineers, data scientists and other specialists who want to use logic programming syntax when writing queries and pipelines to run on BigQuery.

Logica compiles to StandardSQL and gives you access to the power of BigQuery engine with the convenience of logic programming syntax. This is useful because BigQuery is magnitudes more powerful than state of the art native logic programming engines.

We encourage you to try Logica, especially if

  • you already use logic programming and need more computational power, or
  • you use SQL, but feel unsatisfied about its readability, or
  • you want to learn logic programming and apply it to processing of Big Data.

In the future we plan to support more SQL dialects and engines.

I have not heard of logic programming. What is it?

Logic programming is a declarative programming paradigm where the program is written as a set of logical statements.

Logic programming was developed in academia from the late 60s. Prolog and Datalog are the most prominent examples of logic programming languages. Logica is a language of the Datalog family.

Datalog and relational databases start from the same idea: think of data as relations and think of data manipulation as a sequence of operations over these relations. But Datalog and SQL differ in how these operations are described. Datalog is inspired by the mathematical syntax of the first order propositional logic and SQL follows the syntax of natural language.

SQL was based on the natural language to give access to databases to the people without formal training in computer programming or mathematics. This convenience may become costly when the logic that you want to express is non trivial. There are many examples of hard-to-read SQL queries that correspond to simple logic programs.

How does Logica work?

Logica compiles the logic program into a SQL expression, so it can be executed on BigQuery, the state of the art SQL engine.

Among database theoreticians Datalog and SQL are known to be equivalent. And indeed the conversion from Datalog to SQL and back is often straightforward. However there are a few nuances, for example how to treat disjunction and negation. In Logica we tried to make choices that make understanding of the resulting SQL structure as easy as possible, thus empowering user to write programs that are executed efficiently.

Why is it called Logica?

Logica stands for Logic with aggregation.

How to learn?

Learn basics of Logica with the CoLab tutorial located at tutorial folder. See examples of using Logica in examples folder.

Tutorial and examples show how to access Logica from CoLab. You can also install Logica command line tool.

Prerequisites

To run Logica programs on BigQuery you will need a Google Cloud Project. Once you have a project you can run Logica programs in CoLab providing your project id.

To run Logica locally you need Python3.

To initiate Logica predicates execution from the command line you will need bq, a BigQuery command line tool. For that you need to install Google Cloud SDK.

Installation

Google Cloud Project is the only thing you need to run Logica in Colab, see Hello World example.

You can install Logica command with pip as follows.

# Install.
python3 -m pip install logica
# Run:
# To see usage message.
python3 -m logica
# To print SQL for HelloWorld program.
python3 -m logica - print Greet <<<'Greet(greeting: "Hello world!")'

If your PATH includes Python's bin folder then you will also be able to run it simply as

logica - print Greet <<<'Greet(greeting: "Hello world!")'

Alternatively, you can clone GitHub repository:

git clone https://github.com/evgskv/logica
cd logica
./logica - print Greet <<<'Greet(greeting: "Hello world!")'

Code samples

Here a couple examples of how Logica code looks like.

Prime numbers

Find prime numbers less than 30.

Program primes.l:

# Define natural numbers from 1 to 29.
N(x) :- x in Range(30);
# Define primes.
Prime(prime: x) :-
  N(x),
  x > 1,
  ~(
    N(y),
    y > 1,
    y != x,
    x % y == 0
  );

Running primes.l

$ logica primes.l run Prime
+-------+
| prime |
+-------+
|     2 |
|     3 |
|     5 |
|     7 |
|    11 |
|    13 |
|    17 |
|    19 |
|    23 |
|    29 |
+-------+

News mentions

Who was mentioned in the news in 2020 the most? Let's query GDELT Project dataset.

Program mentions.l

@OrderBy(Mentions, "mentions desc");
@Limit(Mentions, 10);
Mentions(person:, mentions? += 1) distinct :-
  gdelt-bq.gdeltv2.gkg(persons:, date:),
  Substr(ToString(date), 0, 4) == "2020",
  the_persons == Split(persons, ";"),
  person in the_persons;

Running mentions.l

$ logica mentions.l run Mentions
+----------------+----------+
|     person     | mentions |
+----------------+----------+
| donald trump   |  3624228 |
| joe biden      |  1591320 |
| los angeles    |  1221998 |
| george floyd   |   923472 |
| boris johnson  |   845955 |
| barack obama   |   541672 |
| vladimir putin |   486428 |
| bernie sanders |   409224 |
| andrew cuomo   |   375594 |
| nancy pelosi   |   375373 |
+----------------+----------+

Note that cities of Los Angeles and Las Vegas are mentioned in this table due to known missclasification issue in the GDELT data analysis.

Feedback

Feel free to create github issues for bugs and feature requests.

You questions and comments are welcome at our github discussions!


Unless otherwise noted, the Logica source files are distributed under the Apache 2.0 license found in the LICENSE file.

This is not an officially supported Google product.

Owner
Evgeny Skvortsov
Software Engineer
Evgeny Skvortsov
Pysolr — Python Solr client

pysolr pysolr is a lightweight Python client for Apache Solr. It provides an interface that queries the server and returns results based on the query.

Haystack Search 626 Dec 01, 2022
Logica is a logic programming language that compiles to StandardSQL and runs on Google BigQuery.

Logica: language of Big Data Logica is an open source declarative logic programming language for data manipulation. Logica is a successor to Yedalog,

Evgeny Skvortsov 1.5k Dec 30, 2022
Pandas Google BigQuery

pandas-gbq pandas-gbq is a package providing an interface to the Google BigQuery API from pandas Installation Install latest release version via conda

Python for Data 345 Dec 28, 2022
Asynchronous interface for peewee ORM powered by asyncio

peewee-async Asynchronous interface for peewee ORM powered by asyncio. Important notes Since version 0.6.0a only peewee 3.5+ is supported If you still

05Bit 666 Dec 30, 2022
A Python wheel containing PostgreSQL

postgresql-wheel A Python wheel for Linux containing a complete, self-contained, locally installable PostgreSQL database server. All servers run as th

Michel Pelletier 71 Nov 09, 2022
Implementing basic MySQL CRUD (Create, Read, Update, Delete) queries, using Python.

MySQL with Python Implementing basic MySQL CRUD (Create, Read, Update, Delete) queries, using Python. We can connect to a MySQL database hosted locall

MousamSingh 5 Dec 01, 2021
Python client for Apache Kafka

Kafka Python client Python client for the Apache Kafka distributed stream processing system. kafka-python is designed to function much like the offici

Dana Powers 5.1k Jan 08, 2023
PubMed Mapper: A Python library that map PubMed XML to Python object

pubmed-mapper: A Python Library that map PubMed XML to Python object 中文文档 1. Philosophy view UML Programmatically access PubMed article is a common ta

灵魂工具人 33 Dec 08, 2022
Py2neo is a client library and toolkit for working with Neo4j from within Python

Py2neo Py2neo is a client library and toolkit for working with Neo4j from within Python applications. The library supports both Bolt and HTTP and prov

py2neo.org 1.2k Jan 02, 2023
Making it easy to query APIs via SQL

Shillelagh Shillelagh (ʃɪˈleɪlɪ) is an implementation of the Python DB API 2.0 based on SQLite (using the APSW library): from shillelagh.backends.apsw

Beto Dealmeida 207 Dec 30, 2022
Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.

dataset: databases for lazy people In short, dataset makes reading and writing data in databases as simple as reading and writing JSON files. Read the

Friedrich Lindenberg 4.2k Jan 02, 2023
A tool to snapshot sqlite databases you don't own

The core here is my first attempt at a solution of this, combining ideas from browser_history.py and karlicoss/HPI/sqlite.py to create a library/CLI tool to (as safely as possible) copy databases whi

Sean Breckenridge 10 Dec 22, 2022
Class to connect to XAMPP MySQL Database

MySQL-DB-Connection-Class Class to connect to XAMPP MySQL Database Basta fazer o download o mysql_connect.py e modificar os parâmetros que quiser. E d

Alexandre Pimentel 4 Jul 12, 2021
A supercharged SQLite library for Python

SuperSQLite: a supercharged SQLite library for Python A feature-packed Python package and for utilizing SQLite in Python by Plasticity. It is intended

Plasticity 703 Dec 30, 2022
A Python library for Cloudant and CouchDB

Cloudant Python Client This is the official Cloudant library for Python. Installation and Usage Getting Started API Reference Related Documentation De

Cloudant 162 Dec 19, 2022
A CRUD and REST api with mongodb atlas.

Movies_api A CRUD and REST api with mongodb atlas. Setup First import all the python dependencies in your virtual environment or globally by the follo

Pratyush Kongalla 0 Nov 09, 2022
SAP HANA Connector in pure Python

SAP HANA Database Client for Python Important Notice This public repository is read-only and no longer maintained. The active maintained alternative i

SAP Archive 299 Nov 20, 2022
A selection of SQLite3 databases to practice querying from.

Dummy SQL Databases This is a collection of dummy SQLite3 databases, for learning and practicing SQL querying, generated with the VS Code extension Ge

1 Feb 26, 2022
AWS SDK for Python

Boto3 - The AWS SDK for Python Boto3 is the Amazon Web Services (AWS) Software Development Kit (SDK) for Python, which allows Python developers to wri

the boto project 7.8k Jan 04, 2023
Records is a very simple, but powerful, library for making raw SQL queries to most relational databases.

Records: SQL for Humans™ Records is a very simple, but powerful, library for making raw SQL queries to most relational databases. Just write SQL. No b

Kenneth Reitz 6.9k Jan 03, 2023