Release of SPLASH: Dataset for semantic parse correction with natural language feedback in the context of text-to-SQL parsing

Last update: Oct 31, 2022

Overview

SPLASH: Semantic Parsing with Language Assistance from Humans

SPLASH is dataset for the task of semantic parse correction with natural language feedback in the context of text-to-SQL parsing.

The task, dataset along with baseline results are presented in
Speak to your Parser: Interactive Text-to-SQL with Natural Language Feedback.
Ahmed Elgohary, Saghar Hosseini and Ahmed Hassan Awadallah.
ACL 2020.

Release

The train.json, dev.json and test.json contain the training, development and testing examples of SPLASH. In addition to that, we also release the 179 examples that are based on the EditSQL parser (Please, see section 6.3 in the paper for more details). The EditSQL examples are in editsql.json. SPLASH is distributed under the CC BY-SA 4.0 license.

Format

Each example contains the following fields:

db_id: Name of Spider database.

question: Question (Utterance) as provided in Spider.

predicted_parse: The predicted SQL parse by the relevant model.

predicted_parse_with_values: The predicted SQL with the values (annonomized in predicted_parse) inferred by a rule-based post-processor. Note that we still use Spider's evaluation measure which ignores the values, but inferring values for the predicted parse is essential for generating meaningful explanations.

predicted_parse_explanation: The generated natural language explanation of the predicted SQL.

feedback: Collected natural language feedback.

gold_parse: The gold parse of the given question as provided in Spider.

beam: The top 20 predictions with corresponding scores produced by Seq2Struct beam search.

Please, refer to the paper for more details.

Example

    {
        "db_id": "csu_1", 
        "question": "Which university is in Los Angeles county and opened after 1950?", 
        "predicted_parse": "SELECT T1.Campus FROM Campuses AS T1 JOIN faculty AS T2 ON T1.Id = T2.Campus WHERE T1.County = value AND T1.Year > value AND T2.Year > value", 
        "predicted_parse_with_values": "SELECT T1.Campus FROM Campuses AS T1 JOIN faculty AS T2 ON T1.Id = T2.Campus WHERE T1.County = \"Los Angeles\" AND T1.Year > 1950 AND T2.Year > 2002",
        "predicted_parse_explanation": [
            "Step 1: For each row in Campuses table, find the corresponding rows in faculty     
            table", 
            "Step 2: find Campuses's Campus of the results of step 1 whose County equals Los 
             Angeles and Campuses's Year greater than 1950 and faculty's Year greater than 2002"
        ],
        "feedback": "In step 2 Remove faculty 's year greater than 2002\".", 
        "gold_parse": "SELECT campus FROM campuses WHERE county  =  \"Los Angeles\" AND YEAR  >  
        1950", 
        "beam": [
            [
                "SELECT T1.Campus FROM Campuses AS T1 JOIN faculty AS T2 ON T1.Id = T2.Campus WHERE T1.County = value AND T2.Year > value AND T2.Year > value", 
                -1.5820374488830566
            ], 
            [
                "SELECT T1.County FROM Campuses AS T1 JOIN faculty AS T2 ON T1.Id = T2.Campus WHERE T1.Campus = value AND T2.Year > value AND T2.Year > value", 
                -2.0078020095825195
            ], 
            ..
  }

Please, contact Ahmed Elgohary < [email protected] > for any questions/feedback.

Citation

@inproceedings{Elgohary20Speak,
Title = {Speak to your Parser: Interactive Text-to-SQL with Natural Language Feedback},
Author = {Ahmed Elgohary and Saghar Hosseini and Ahmed Hassan Awadallah},
Year = {2020},
Booktitle = {Association for Computational Linguistics},
}

Release of SPLASH: Dataset for semantic parse correction with natural language feedback in the context of text-to-SQL parsing

Related tags

Overview

SPLASH: Semantic Parsing with Language Assistance from Humans

Release

Format

Example

Please, contact Ahmed Elgohary < [email protected] > for any questions/feedback.

Citation

Owner

Microsoft Research - Language and Information Technologies (MSR LIT)

A deep learning tabular classification architecture inspired by TabTransformer with integrated gated multilayer perceptron.

PyTorch implementation of "ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context" (INTERSPEECH 2020)

PyTorch implementation of spectral graph ConvNets, NIPS’16

Code for MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks

NumPy로 구현한 딥러닝 라이브러리입니다. (자동 미분 지원)

HyperPose is a library for building high-performance custom pose estimation applications.

A 35mm camera, based on the Canonet G-III QL17 rangefinder, simulated in Python.

Cross-modal Retrieval using Transformer Encoder Reasoning Networks (TERN). With use of Metric Learning and FAISS for fast similarity search on GPU

Grow Function: Generate 3D Stacked Bifurcating Double Deep Cellular Automata based organisms which differentiate using a Genetic Algorithm...

Code and Resources for the Transformer Encoder Reasoning Network (TERN)

3D Human Pose Machines with Self-supervised Learning

Official repository for the CVPR 2021 paper "Learning Feature Aggregation for Deep 3D Morphable Models"

This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation

Optimal Adaptive Allocation using Deep Reinforcement Learning in a Dose-Response Study

Security evaluation module with onnx, pytorch, and SecML.

[CVPR 2022] CoTTA Code for our CVPR 2022 paper Continual Test-Time Domain Adaptation

Dataset and Code for the paper "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021), and "Depth-only Object Tracking" (BMVC2021)

基于pytorch构建cyclegan示例

A High-Level Fusion Scheme for Circular Quantities published at the 20th International Conference on Advanced Robotics

Code for our paper 'Generalized Category Discovery'