The CLRS Algorithmic Reasoning Benchmark

Last update: Jan 05, 2023

Overview

Learning representations of algorithms is an emerging area of machine learning that seeks to bridge concepts from neural networks with classical algorithms. The CLRS Algorithmic Reasoning Benchmark (CLRS) consolidates and extends previous work toward evaluating algorithmic reasoning by providing a suite of implementations of classical algorithms. These algorithms have been selected from the third edition of the standard textbook Introduction to Algorithms by Cormen, Leiserson, Rivest and Stein.

Installation

The CLRS Algorithmic Reasoning Benchmark can be installed with pip directly from GitHub, with the following command:

pip install git+https://github.com/deepmind/clrs.git

or from PyPI:

pip install dm-clrs

Getting started

To set up a Python virtual environment with the required dependencies, run:

python3 -m venv clrs_env
source clrs_env/bin/activate
python setup.py install

and to run our example baseline model:

python -m clrs.examples.run

Algorithms as graphs

CLRS implements the selected algorithms in an idiomatic way, which aligns as closely as possible to the original CLRS 3ed pseudocode. By controlling the input data distribution to conform to each algorithm's preconditions, we are able to automatically generate input/output pairs. We additionally provide trajectories of "hints" that expose the internal state of each algorithm, both to optionally simplify the learning challenge and to distinguish between different algorithms that solve the same overall task (e.g. sorting).
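
To make the idea of a hint trajectory concrete, here is a minimal sketch in plain Python. It is not the benchmark's implementation: the function name and the choice to record the whole array after each outer-loop step are assumptions made for illustration, and the real benchmark defines its own hint fields per algorithm.

```python
from typing import List, Tuple


def insertion_sort_with_hints(values: List[int]) -> Tuple[List[int], List[List[int]]]:
    """Sorts a copy of `values`, recording the array after every outer-loop step.

    The recorded snapshots stand in for a "hint" trajectory (an assumption
    for this sketch, not the benchmark's actual hint format).
    """
    array = list(values)
    hints = [list(array)]  # state before any step is taken
    for j in range(1, len(array)):
        key = array[j]
        i = j - 1
        # Shift larger elements one slot to the right to make room for `key`.
        while i >= 0 and array[i] > key:
            array[i + 1] = array[i]
            i -= 1
        array[i + 1] = key
        hints.append(list(array))
    return array, hints


output, hints = insertion_sort_with_hints([3, 1, 2])
print(output)  # [1, 2, 3]
print(hints)   # [[3, 1, 2], [1, 3, 2], [1, 2, 3]]
```

Two sorting algorithms would produce the same output for this input, but their intermediate snapshots would generally differ, which is what lets hints tell them apart.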

In the most generic sense, algorithms can be seen as manipulating sets of objects, along with any relations between them (which can themselves be decomposed into binary relations). Accordingly, we study all of the algorithms in this benchmark using a graph representation. When objects obey a stricter ordered structure (e.g. arrays or rooted trees), we impose this ordering through the inclusion of predecessor links.
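
As a rough illustration of predecessor links, the sketch below encodes an array's ordering as one predecessor index per node and then recovers the ordering from those links. The helper names are hypothetical, and the convention that the first node points at itself is an assumption of this sketch rather than something stated above.

```python
from typing import List


def array_predecessors(n: int) -> List[int]:
    """Predecessor index for each of n array positions.

    Position i points at i - 1; the first position points at itself,
    a convention assumed here to mark the head of the ordering.
    """
    return [max(i - 1, 0) for i in range(n)]


def order_from_predecessors(pred: List[int]) -> List[int]:
    """Recovers the node ordering from predecessor links (non-empty input)."""
    # Invert the links: successor[p] is the node whose predecessor is p.
    successor = {p: i for i, p in enumerate(pred) if p != i}
    # The head is the single node that points at itself.
    head = next(i for i, p in enumerate(pred) if p == i)
    order = [head]
    while order[-1] in successor:
        order.append(successor[order[-1]])
    return order


print(array_predecessors(4))               # [0, 0, 1, 2]
print(order_from_predecessors([1, 1, 0]))  # [1, 0, 2]
```

The second call shows that the ordering lives in the links, not in the node indices: node 1 is the head, followed by node 0, then node 2.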

How it works

For each algorithm, we provide a canonical set of train, eval and test trajectories for benchmarking out-of-distribution generalization.

Split	Trajectories	Problem Size
Train	1000	16
Eval	32	16
Test	32	64

where "problem size" refers to e.g. the length of an array or number of nodes in a graph, depending on the algorithm. These trajectories can be used like so:

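import clrs

# Sampler over the canonical BFS training trajectories, plus the algorithm spec.
train_sampler, spec = clrs.clrs21_train("bfs")

# `model`, `num_train_steps` and `batch_size` are assumed to be defined by the caller.
for step in range(num_train_steps):
  feedback = train_sampler.next(batch_size)
  model.train(feedback.features)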

Here, feedback is a namedtuple with the following structure:

Feedback = collections.namedtuple('Feedback', ['features', 'outputs'])
Features = collections.namedtuple('Features', ['inputs', 'hints', 'lengths'])

where the content of Features can be used for training and outputs is reserved for evaluation. Each field of the tuple is an ndarray with a leading batch dimension. Because hints are provided for the full algorithm trajectory, these contain an additional time dimension padded up to the maximum length max(T) of any trajectory within the dataset. The lengths field specifies the true length t <= max(T) for each trajectory, which can be used e.g. for loss masking.
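
The loss masking mentioned above can be sketched with NumPy as follows. This is an illustrative assumption, not the library's loss code: the function name is hypothetical, and the per-step loss is assumed to be laid out as [batch, time] (the actual hint arrays may use a different axis order).

```python
import numpy as np


def masked_mean_loss(per_step_loss: np.ndarray, lengths: np.ndarray) -> float:
    """Averages a [batch, time] loss over valid (unpadded) steps only."""
    max_t = per_step_loss.shape[1]
    # mask[b, t] is True while step t lies within trajectory b's true length.
    mask = np.arange(max_t)[None, :] < lengths[:, None]
    total = np.sum(per_step_loss * mask)
    # Guard against an all-padding batch to avoid dividing by zero.
    return float(total / np.maximum(mask.sum(), 1))


loss = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
lengths = np.array([2, 3])  # first trajectory has one padded step
print(masked_mean_loss(loss, lengths))  # 3.6
```

Here the padded entry (3.0) is excluded, so the mean is taken over the five valid steps: (1 + 2 + 4 + 5 + 6) / 5.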

Please see the examples directory for full working Graph Neural Network (GNN) examples using JAX and the DeepMind JAX Ecosystem of libraries.

What we provide

Algorithms

Our initial CLRS-21 benchmark includes the following 21 algorithms. More algorithms will be supported in the near future.

Divide and conquer
- Maximum subarray (Kadane)
Dynamic programming
- Matrix chain order
- Optimal binary search tree
Graphs
- Depth-first search
- Breadth-first search
- Topological sort
- Minimum spanning tree (Prim)
- Single-source shortest-path (Bellman-Ford)
- Single-source shortest-path (Dijkstra)
- DAG shortest paths
- All-pairs shortest-path (Floyd-Warshall)
Greedy
- Task scheduling
Searching
- Minimum
- Binary search
- Quickselect
Sorting
- Insertion sort
- Bubble sort
- Heapsort
- Quicksort
Strings
- String matcher (naive)
- String matcher (KMP)

Baselines

We additionally provide JAX implementations of the following GNN baselines:

Graph Attention Networks (Veličković et al., ICLR 2018)
Message-Passing Neural Networks (Gilmer et al., ICML 2017)

Citation

To cite the CLRS Algorithmic Reasoning Benchmark:

@article{deepmind2021clrs,
  author = {Petar Veli\v{c}kovi\'{c} and Adri\`{a} Puigdom\`{e}nech Badia and
    David Budden and Razvan Pascanu and Andrea Banino and Misha Dashevskiy and
    Raia Hadsell and Charles Blundell},
  title = {The CLRS Algorithmic Reasoning Benchmark},
  year = {2021},
}

Comments

More input signals for evaluation

Hi, in some parts of the code you increase the number of input signals for a few algorithms for evaluation. But the generated dataset on Google storage does not seem to contain them (i.e., it contains only 32 trajectories of each). Is the change going to be reflected in future versions?

opened by smahdavi4 9
Inability to reproduce paper results

Thanks to the authors for constructing this benchmark.

I'm having trouble reproducing some of the test scores reported in the paper, in Table 2. Comparing my runs against the paper results (averaging across 3 seeds: 42, 43, and 44):

Graham Scan task:
- MPNN: 0.6355 vs. 0.9104 published
- PGN: 0.3622 vs. 0.5687 published

Binary Search task:
- MPNN: 0.2026 vs. 0.3683 published
- PGN: 0.4390 vs. 0.7695 published

Here are the values I used for my reproduction experiments:

Values for batch size, train items, learning rate, and hint teacher forcing noise were obtained from sections 4.1 and 4.2 of the paper. Values for eval_every, dropout, use_ln, and use_lstm (which were not found in the paper) were the default values in the provided run file. Additionally, I used processor type "pgn_mask" for the PGN experiments.

What settings should I use to reproduce the paper results more accurately? Are there hyperparameter settings, specified in the paper or not, that I am getting wrong?

Finally, I noticed the most recent commit, which fixes the axis for mean reduction in PGN. Would that cause PGN to perform differently than reported in the paper, and perhaps explain the discrepancy in the results I obtained?

opened by CameronDiao 5
Is the paper still available?

Thank you for the benchmark on neural algorithmic reasoning. I love the throwback to the classical CLRS textbook!

I remember bumping into the PDF of the paper online, but cannot seem to access it anymore. Will the paper associated with the repo be available soon?

opened by chaitjo 3
Why no directed graph for FloydWarshall, Dijkstra, BFS and BellmanFord

Hi,

What is the reason that you chose to use undirected graphs for the algorithms mentioned in the title? As far as I see, they should all be able to support directed graphs as well.

Thanks!

opened by sigeisler 2
Hint `A_t` in SCC

Hi,

thank you for the quite extensive work in putting CLRS together.

Do you have a reference or reasoning about what hints you included? For example, can you elaborate on why you included hint A_t in strongly connected components?

Thanks!

opened by sigeisler 2
Sampling bug on undirected weighted graphs

Hi, I think there is an issue in sampling the undirected weighted graphs. The sampled graph is first symmetrized and the weights are sampled afterwards, so the two directions of an edge end up with different weights. For algorithms that can handle directed graphs, this might not disrupt the algorithm's behavior. But for the ones that output undirected edges (e.g., MST Kruskal), this would not be the true behavior of the algorithm.
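
One way to avoid the mismatch described above is to sample edges and weights for the upper triangle only and then mirror them. The sketch below assumes an Erdos-Renyi graph with uniform weights in [0, 1); the function name and parameters are hypothetical and this is not the library's sampler.

```python
import numpy as np


def sample_undirected_weighted_graph(
    n: int, p: float, rng: np.random.Generator
) -> np.ndarray:
    """Samples an Erdos-Renyi graph whose weight matrix is symmetric."""
    # Decide edges and draw weights for the strict upper triangle only (i < j) ...
    upper_edges = np.triu(rng.random((n, n)) < p, k=1)
    upper_weights = np.triu(rng.uniform(0.0, 1.0, size=(n, n)), k=1) * upper_edges
    # ... then mirror, so both directions of an edge share one weight.
    return upper_weights + upper_weights.T


rng = np.random.default_rng(0)
weights = sample_undirected_weighted_graph(5, 0.5, rng)
print(np.allclose(weights, weights.T))  # True
```

Sampling weights after symmetrizing the adjacency matrix would instead draw two independent values per edge, which is the behavior the issue reports.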

opened by smahdavi4 2
Update CLRS models with multi-algorithm options:

Example of multi-algorithm training with on-the-fly samples of different lengths and parameters.

Option to randomize positional input.

Option to move "pred_h" constant hints to inputs.

Option to enforce permutation constraints on the outputs of sorting algorithms.

Option to do soft or hard rematerialization of hints during training.

Option for gradient norm clipping.

Option to initialize scalar encoders with Xavier weights.

New processor types: with triplets and with gating.
opened by copybara-service[bot] 1
Faster batching. Previous version resized batch for each sample, resulting in very long sampler creation times for algorithms like quickselect with big test batches.


opened by copybara-service[bot] 1
Pass processor factory instead of processor string when creating model. This makes it easier to add new processors as processor parameters don't need to be passed down to model and net.


opened by copybara-service[bot] 1
Problems with jax

I installed the required libraries by pip install -r requirement.txt. The CUDA works well and the GPU can be found by tensorflow. However, when I try to run the code, an error occurs.

"Unable to initialize backend 'cuda'": module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'

I searched on the web and found it might be caused by the version of the libraries.

Could you please share the versions of those packages you used with me? Thank you very much.
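
When comparing environments for a version mismatch like this one, it can help to print the installed versions of the relevant packages. The sketch below uses only the standard library; the helper name and the list of package names are assumptions chosen for illustration.

```python
from importlib import metadata
from typing import Dict, Iterable


def installed_versions(packages: Iterable[str]) -> Dict[str, str]:
    """Maps each distribution name to its installed version, or 'not installed'."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


# Package names below are an assumption about which ones matter here.
for name, version in installed_versions(["jax", "jaxlib", "tensorflow"]).items():
    print(f"{name}: {version}")
```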

opened by WilliamLi0623 0
Added subgraph_mode

Use as adjacency matrix the subgraph around nodes having ground-truth hints that changed from the previous iteration.

Run with the star subgraphs by using python3 -m clrs.examples.run --algorithm dfs --processor_type gatv2 --hint_mode encoded_decoded_nodiff --hint_teacher_forcing_noise 0.5 --subgraph_mode stars

opened by beabevi 1

tensorflow-macos and tensorflow-metal

Any comments on using 'tensorflow-macos' and 'tensorflow-metal' in the 'clrs' ecosystem?

I was able to install tensorflow-macos and tensorflow-metal in the clrs virtual environment. My AMD GPU is being recognized ... but 'clrs' is looking for: 'tpu_driver', 'cuda', 'tpu'. Any ideas?

% python3 -m clrs.examples.run                                  
I0605 17:03:39.042836 4560762368 run.py:196] Using CLRS30 spec: {'train': {'num_samples': 1000, 'length': 16, 'seed': 1}, 'val': {'num_samples': 32, 'length': 16, 'seed': 2}, 'test': {'num_samples': 32, 'length': 64, 'seed': 3}}
I0605 17:03:39.044355 4560762368 run.py:180] Dataset found at /tmp/CLRS30/CLRS30_v1.0.0. Skipping download.
Metal device set to: AMD Radeon Pro 5700 XT

systemMemory: 128.00 GB
maxCacheSize: 7.99 GB
...
  devices = jax.devices()
  logging.info(devices)

I0605 17:03:40.532486 4560762368 xla_bridge.py:330] Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker: 
I0605 17:03:40.534332 4560762368 xla_bridge.py:330] Unable to initialize backend 'gpu': NOT_FOUND: Could not find registered platform with name: "cuda". Available platform names are: Interpreter Host
I0605 17:03:40.536365 4560762368 xla_bridge.py:330] Unable to initialize backend 'tpu': INVALID_ARGUMENT: TpuPlatform is not available.
W0605 17:03:40.536445 4560762368 xla_bridge.py:335] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)

https://github.com/google/jax/issues/7163

opened by dbl001 0

Releases(v1.0.0)

v1.0.0(Jun 1, 2022)
Main changes

Extended the benchmark from 21 to 30 tasks by adding the following:

Activity selection (Gavril, 1972)

Longest common subsequence

Articulation points

Bridges

Kosaraju's strongly connected components algorithm (Aho et al., 1974)

Kruskal's minimum spanning tree algorithm (Kruskal, 1956)

Segment intersection

Graham scan convex hull algorithm (Graham, 1972)

Jarvis' march convex hull algorithm (Jarvis, 1973)

Added new baseline processors:

Deep Sets (Zaheer et al., NIPS 2017) and Pointer Graph Networks (Veličković et al., NeurIPS 2020) as particularisations of the existing Message-Passing Neural Network processor.

End-to-End Memory Networks (Sukhbaatar et al., NIPS 2015)

Graph Attention Networks v2 (Brody et al., ICLR 2022)

Detailed changes

Add PyPI installation instructions. by @copybara-service in https://github.com/deepmind/clrs/pull/6

Fix README typo. by @copybara-service in https://github.com/deepmind/clrs/pull/7

Expose Sampler base class in public API. by @copybara-service in https://github.com/deepmind/clrs/pull/8

Add dataset reader. by @copybara-service in https://github.com/deepmind/clrs/pull/12

Patch imbalanced samplers for DFS-based algorithms. by @copybara-service in https://github.com/deepmind/clrs/pull/15

Disk-based samplers for convex hull algorithms. by @copybara-service in https://github.com/deepmind/clrs/pull/16

Avoid dividing by zero in F_1 score computation. by @copybara-service in https://github.com/deepmind/clrs/pull/18

Sparsify the graphs generated for Kruskal. by @copybara-service in https://github.com/deepmind/clrs/pull/20

Option to add an lstm after the processor. by @copybara-service in https://github.com/deepmind/clrs/pull/19

Include dataset class and creation using tensorflow_datasets format. by @copybara-service in https://github.com/deepmind/clrs/pull/23

Change types of DataPoint and DataPoint members. by @copybara-service in https://github.com/deepmind/clrs/pull/22

Remove unnecessary data loading procedures. by @copybara-service in https://github.com/deepmind/clrs/pull/24

Modify example to run with the tf.data.Datasets dataset. by @copybara-service in https://github.com/deepmind/clrs/pull/25

Expose processors in CLRS by @copybara-service in https://github.com/deepmind/clrs/pull/21

Update CLRS-21 to CLRS-30. by @copybara-service in https://github.com/deepmind/clrs/pull/26

Update README with new algorithms. by @copybara-service in https://github.com/deepmind/clrs/pull/27

Add dropout to example. by @copybara-service in https://github.com/deepmind/clrs/pull/28

Make example download dataset. by @copybara-service in https://github.com/deepmind/clrs/pull/30

Force full dataset pipeline to be on the CPU. by @copybara-service in https://github.com/deepmind/clrs/pull/31

Set default dropout to 0.0 for now. by @copybara-service in https://github.com/deepmind/clrs/pull/32

Added support for GATv2 and masked GATs. by @copybara-service in https://github.com/deepmind/clrs/pull/33

Pad memory in MemNets and disable embeddings. by @copybara-service in https://github.com/deepmind/clrs/pull/34

baselines.py refactoring (2/N) by @copybara-service in https://github.com/deepmind/clrs/pull/36

baselines.py refactoring (3/N). by @copybara-service in https://github.com/deepmind/clrs/pull/38

Update readme. by @copybara-service in https://github.com/deepmind/clrs/pull/37

Generate more samples in tasks where the number of signals is small. by @copybara-service in https://github.com/deepmind/clrs/pull/40

Fix MemNet embeddings by @copybara-service in https://github.com/deepmind/clrs/pull/41

Supporting multiple attention heads in GAT and GATv2. by @copybara-service in https://github.com/deepmind/clrs/pull/42

Use GATv2 + add option to use different number of heads. by @copybara-service in https://github.com/deepmind/clrs/pull/43

Fix GAT processors. by @copybara-service in https://github.com/deepmind/clrs/pull/44

Fix samplers_test by @copybara-service in https://github.com/deepmind/clrs/pull/47

Update requirements.txt by @copybara-service in https://github.com/deepmind/clrs/pull/45

Bug in hint loss for CATEGORICAL type. The number of unmasked datapoints (jnp.sum(unmasked_data)) was computed over the whole time sequence instead of the pertinent time slice. by @copybara-service in https://github.com/deepmind/clrs/pull/53

Use internal rng for batch selection. Makes batch sampling deterministic given seed. by @copybara-service in https://github.com/deepmind/clrs/pull/49

baselines.py refactoring (6/N) by @copybara-service in https://github.com/deepmind/clrs/pull/52

Time-chunked datasets. by @copybara-service in https://github.com/deepmind/clrs/pull/48

Potential bug in edge diff decoding. by @copybara-service in https://github.com/deepmind/clrs/pull/54

Losses for chunked data. by @copybara-service in https://github.com/deepmind/clrs/pull/55

Changes to hint losses, mostly for decode_diffs=True. Before, only one of the terms of the MASK type loss was masked by gt_diff. Also, the loss was averaged over all time steps, including steps without diffs and therefore contributing 0 to the loss. Now we average only over the non-zero-diff steps. by @copybara-service in https://github.com/deepmind/clrs/pull/57

Adapt baseline model to process multiple algorithms with a single processor. by @copybara-service in https://github.com/deepmind/clrs/pull/59

Explicitly denote a hint learning mode, to delimit the tasks of interest to CLRS. by @copybara-service in https://github.com/deepmind/clrs/pull/60

Give names to encoder and decoder params. This facilitates analysis, especially in multi-algorithm training. by @copybara-service in https://github.com/deepmind/clrs/pull/63

Symmetrise the weights of sampled weighted undirected Erdos-Renyi graphs. by @copybara-service in https://github.com/deepmind/clrs/pull/62

Fix dataset size for augmented validation + test sets. by @copybara-service in https://github.com/deepmind/clrs/pull/65

Bug when hint mode is 'none': the multi-algorithm version needs something in the list diff decoders. by @copybara-service in https://github.com/deepmind/clrs/pull/66

Change requirements to a fixed tensorflow datasets nightly build. by @copybara-service in https://github.com/deepmind/clrs/pull/68

Patch KMP algorithm to incorporate the "reset" node. by @copybara-service in https://github.com/deepmind/clrs/pull/69

Allow for multiple-batch evaluation in example run script. by @copybara-service in https://github.com/deepmind/clrs/pull/70

Bug in SearchSampler: arrays should be sorted. by @copybara-service in https://github.com/deepmind/clrs/pull/71

Record separate hint eval scores for analysis. by @copybara-service in https://github.com/deepmind/clrs/pull/72

Symmetrised edges for PGN. by @copybara-service in https://github.com/deepmind/clrs/pull/73

Option for noise in teacher forcing by @copybara-service in https://github.com/deepmind/clrs/pull/74

Regularize PGN_MASK losses by predicting min_value-1 at missing edges instead of -10^5 by @copybara-service in https://github.com/deepmind/clrs/pull/75

Make encoded_decoded_nodiff default mode, and add flag to control teacher forcing noise. by @copybara-service in https://github.com/deepmind/clrs/pull/76

Detailed evaluation of hints in verbose mode. by @copybara-service in https://github.com/deepmind/clrs/pull/79

Pass processor factory instead of processor string when creating model. This makes it easier to add new processors as processor parameters don't need to be passed down to model and net. by @copybara-service in https://github.com/deepmind/clrs/pull/81

Update README. by @copybara-service in https://github.com/deepmind/clrs/pull/82

Use large negative number instead of 0 to discard non-connected edges for max aggregation in PGN processor. by @copybara-service in https://github.com/deepmind/clrs/pull/83

Add tensorflow requirement. by @copybara-service in https://github.com/deepmind/clrs/pull/84

Change deprecated tree_multimap to tree_map by @copybara-service in https://github.com/deepmind/clrs/pull/85

Increase version number for PyPI release. by @copybara-service in https://github.com/deepmind/clrs/pull/87

Full Changelog: https://github.com/deepmind/clrs/compare/v0.0.2...v1.0.0
v0.0.2(Aug 26, 2021)

The CLRS Algorithmic Reasoning Benchmark.
v0.0.1(Aug 26, 2021)

Initial release of CLRS Algorithmic Reasoning Benchmark.

Owner

DeepMind
