ICLR 2022 Paper submission trend analysis

Last update: Dec 06, 2022

Related tags

Data Analysis ICLR2022-OpenReviewData

Overview

Visualize ICLR 2022 OpenReview Data

ICLR 2022 Paper submission analysis from https://openreview.net/group?id=ICLR.cc/2022/Conference

Requirements

pip install wordcloud nltk pandas imageio selenium tqdm

Download the required NLTK packages:

import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('wordnet')
nltk.download('stopwords')

If calling webdriver.Edge('msedgedriver.exe') fails, try the following:

Delete msedgedriver.exe, since the bundled copy may only work on my computer (Windows).
Install Microsoft Edge (Chromium): to confirm that it is installed, go to edge://settings/help in the browser and verify that the version number is 75 or later.
Download Microsoft Edge Driver:
- Go to edge://settings/help to get the version of Edge.
- Navigate to the Microsoft Edge Driver downloads page and download the driver that matches the Edge version number.

From https://stackoverflow.com/questions/63529124/how-to-open-up-microsoft-edge-using-selenium-and-python
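The steps above can be sketched in code. This is a minimal sketch, assuming Selenium 4, where the driver path is passed through a Service object rather than as a positional argument. The function names find_local_driver and make_edge_driver are hypothetical and are not part of this repo.

```python
from pathlib import Path


def find_local_driver(directory=".", name="msedgedriver.exe"):
    """Return the path to a driver binary in `directory`, or None if absent."""
    candidate = Path(directory) / name
    return candidate if candidate.is_file() else None


def make_edge_driver(driver_path=None):
    """Start Edge, using an explicit driver binary when one is given.

    Assumption: Selenium 4 is installed. Selenium is imported lazily so
    that find_local_driver() stays usable without it.
    """
    from selenium import webdriver
    from selenium.webdriver.edge.service import Service

    if driver_path is None:
        # Recent Selenium 4 releases can locate a matching driver themselves.
        return webdriver.Edge()
    return webdriver.Edge(service=Service(executable_path=str(driver_path)))


# Example usage (opens a browser window, so it is left commented out):
# driver = make_edge_driver(find_local_driver("."))
# driver.get("https://openreview.net/group?id=ICLR.cc/2022/Conference")
# driver.quit()
```

If the driver version does not match the installed Edge version, starting the session is expected to fail, which is why the download step asks for a matching version.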

Crawl Data

Run crawl_paperlist.py to crawl the list of papers (~0.5h).

Paper List (3,407 submissions in total)

crawl_paperlist.py crawls only 3,000 papers, although there are 3,407 submissions in total. The full paper list is as follows:

Visualization

Keywords Frequency

The 50 most common keywords (case-insensitive) and their frequencies:

Keywords Cloud

The word cloud formed from the submission keywords shows the hot topics, including deep learning, reinforcement learning, representation learning, and graph neural networks.

Title Keywords Frequency

The 50 most common title keywords (case-insensitive) and their frequencies:
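Title keywords have to be extracted from free text first. The sketch below is a simplified, standard-library stand-in: it tokenizes with a regular expression and filters a tiny illustrative stopword set. The requirements suggest the repo relies on NLTK (stopwords, WordNet) for this step, so treat the stopword list, the lack of lemmatization, and the name title_keyword_frequencies as assumptions.

```python
import re
from collections import Counter

# Tiny illustrative stopword set (an assumption); NLTK's English stopword
# corpus would be the fuller choice.
STOPWORDS = {
    "a", "an", "the", "of", "for", "and", "in", "on",
    "with", "to", "via", "is", "by", "from",
}


def title_keyword_frequencies(titles, top_n=50):
    """Count lowercase title tokens, ignoring stopwords."""
    counts = Counter()
    for title in titles:
        # Keep alphabetic tokens, allowing internal hyphens
        # (e.g. "self-supervised").
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", title.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(top_n)


# Hypothetical titles, for illustration only.
titles = [
    "Learning to Learn with Graphs",
    "A Study of Graph Learning",
]
top = title_keyword_frequencies(titles)
```

Without lemmatization, "graph" and "graphs" are counted separately; a lemmatizer such as NLTK's WordNet-based one would merge them.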

Title Keywords Cloud

The word clouds formed by keywords of submission titles:

Acknowledgment

Inspired by this repo: https://github.com/evanzd/ICLR2021-OpenReviewData

Owner

Jintang Li
