Discovering Explanatory Sentences in Legal Case Decisions Using Pre-trained Language Models.

Overview

Statutory Interpretation Data Set

This repository contains the data set created for the following research papers:

Savelka, Jaromir, and Kevin D. Ashley. "Discovering Explanatory Sentences in Legal Case Decisions Using Pre-trained Language Models." Findings of the Association for Computational Linguistics: EMNLP 2021. 2021.

Jaromir Savelka, Huihui Xu, and Kevin D. Ashley. 2019. Improving Sentence Retrieval from Case Law for Statutory Interpretation. In Seventeenth International Conference on Artificial Intelligence and Law (ICAIL ’19), June 17–21, 2019, Montreal, QC, Canada, Floris Bex (Ed.). ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3322640.3326736

Task

Given a statutory provision, a user's interest in the meaning of a phrase from the provision, and a list of sentences, we would like to rank more highly the sentences that elaborate upon the meaning of the statutory phrase of interest, such as:

  • definitional sentences (e.g., a sentence that provides a test for when the phrase applies)
  • sentences that state explicitly in a different way what the statutory phrase means or state what it does not mean
  • sentences that provide an example, instance, or counterexample of the phrase
  • sentences that show how a court determines whether something is such an example, instance, or counterexample.

Corpus Overview

For this corpus, we selected forty-two terms from different provisions of the United States Code.

For each term, we collected a set of sentences by extracting all sentences that mention the term from the court decisions retrieved from the Caselaw Access Project data.

In total the corpus consists of 26,959 sentences.

The sentences are classified into four categories according to their usefulness for interpreting the term:

  • high value - sentence intended to define or elaborate on the meaning of the term
  • certain value - sentence that provides grounds to elaborate on the term's meaning
  • potential value - sentence that provides additional information beyond what is known from the provision the term comes from
  • no value - no additional information over what is known from the provision
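
For ranking experiments, the four categories can be treated as ordered relevance grades. The following is a minimal sketch, not the authors' code: the label strings are assumed to match the category names above (the strings stored in the files may differ), and the numeric grades 3 to 0 and the helper `ideal_ranking` are illustrative choices.

```python
# Hypothetical mapping from category names to ordinal relevance grades.
# Both the label strings and the numeric grades are assumptions.
LABEL_GRADES = {
    "high value": 3,
    "certain value": 2,
    "potential value": 1,
    "no value": 0,
}


def ideal_ranking(sentences):
    """Sort (text, label) pairs from most to least useful.

    Python's sort is stable, so sentences with the same label
    keep their original relative order.
    """
    return sorted(sentences, key=lambda s: LABEL_GRADES[s[1]], reverse=True)


example = [
    ("Sentence A", "no value"),
    ("Sentence B", "high value"),
    ("Sentence C", "potential value"),
    ("Sentence D", "high value"),
]
ranked = ideal_ranking(example)
```

Here `ranked` places the two "high value" sentences first, which is the ordering a retrieval system would ideally approximate.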

See Annotation guidelines for additional details.

Data Structure

Each zip file contains data related to one of the forty-two queries. There are four files in total, containing the texts at different levels of granularity. These make it possible to replicate the experiments reported in the paper cited above.

  • case
    • original_id - case id from the Caselaw Access Project
    • name
    • short_name
    • date
    • official_date
    • official_citation
    • alternate_citations
    • court
    • short_court - court abbreviation
    • jurisdiction
    • short_jurisdiction - jurisdiction abbreviation
    • attorneys
    • parties
    • judges
    • text
  • opinion
    • case_id - pointer to the case the opinion belongs to
    • author
    • type - e.g., concurrence, dissent
    • position - position of the opinion within the case
    • text
  • paragraph
    • case_id - pointer to the case the paragraph belongs to
    • opinion_id - pointer to the opinion the paragraph belongs to
    • position - position of the paragraph within the opinion
    • text
  • sentence
    • case_id - pointer to the case the sentence belongs to
    • opinion_id - pointer to the opinion the sentence belongs to
    • paragraph_id - pointer to the paragraph the sentence belongs to
    • position - position of the sentence within the paragraph
    • text
    • label - human-created gold label of the sentence value
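
The pointer fields above make it possible to reassemble sentences into their enclosing paragraphs. The sketch below is only an illustration under assumptions: the on-disk file format is not described here, so records are assumed to be already loaded into Python objects; the `Sentence` class, its field types, and `group_by_paragraph` are hypothetical names, with the fields mirroring the sentence record listed above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Sentence:
    # Field names follow the sentence record above; the types are assumptions.
    case_id: int
    opinion_id: int
    paragraph_id: int
    position: int
    text: str
    label: str


def group_by_paragraph(sentences):
    """Group sentences by paragraph_id, ordered by position within each paragraph."""
    groups = defaultdict(list)
    for sentence in sentences:
        groups[sentence.paragraph_id].append(sentence)
    for members in groups.values():
        members.sort(key=lambda s: s.position)
    return dict(groups)


sample = [
    Sentence(1, 10, 100, 1, "Second sentence.", "no value"),
    Sentence(1, 10, 100, 0, "First sentence.", "high value"),
    Sentence(1, 10, 101, 0, "Another paragraph.", "potential value"),
]
paragraphs = group_by_paragraph(sample)
```

The same pattern applies one level up, grouping paragraphs by `opinion_id` or opinions by `case_id`.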

Terms of Use

If you use the data, we kindly ask you to provide the following two attributions:

Savelka, Jaromir, and Kevin D. Ashley. "Discovering Explanatory Sentences in Legal Case Decisions Using Pre-trained Language Models." Findings of the Association for Computational Linguistics: EMNLP 2021. 2021.

The President and Fellows of Harvard University, Caselaw Access Project, Caselaw Access Project, 2018.
