Find papers by keywords and venues. Then download it automatically

Overview

paper finder

Find papers by keywords and venues. Then download it automatically.

How to use this?

Search

CLI

python search.py -k "knowledge tracing,knowledge trace" -v "KDD,IJCAI" -o data/kt_result.csv
  • min_year : paper >= min_year
  • max_year : paper<=max_year
  • -k : keywords, different keywords split use ,
  • -v : venue, split using ,. If default, will use the default venues.
  • o : output file path

Python api

from search import search
keyword_list=['knowledge tracing','knowledge trace']
venue_list=['KDD','IJCAI']
search(keyword_list=keyword_list,venue_list=venue_list,min_year=2016,max_year=2021,output='data/kt_result.csv')

Your can find venues' name in there.

Download

CLI

python download.py -i data/kt_result.csv  -o pdfs
  • i : the csv path from search
  • o : the dir to save pdfs, we will create sub folder for each venue. Such as pdfs/AIED

Python api

from utils.download import download_from_df
import pandas as pd

csv_path = "data/kt_result.csv"
df = pd.read_csv(csv_path)
df = download_from_df(df,save_dir='pdfs')
df.to_csv(csv_path.replace('.csv','_download_result.csv'),index=False)

Todo

  • Search papers.
  • Download papers

Author Warning

This code is only used for academic communication. The author has no liability for copyright. DO NOT ENGAGE IN ANY ILLEGAL ACTIVITIES. Please download and read the genuine articles from the publisher.

You might also like...
This tool crawls a list of websites and download all PDF and office documents

This tool crawls a list of websites and download all PDF and office documents. Then it analyses the PDF documents and tries to detect accessibility issues.

Simple tool to scrape and download cross country ski timings and results from live.skidor.com

LiveSkidorDownload Simple tool to scrape and download cross country ski timings and results from live.skidor.com Usage: Put the python file in a dedic

Liveskidordownload - Simple tool to scrape and download cross country ski timings and results from live.skidor.com

LiveSkidorDownload Simple tool to scrape and download cross country ski timings

A Telegram crawler to search groups and channels automatically and collect any type of data from them.
A Telegram crawler to search groups and channels automatically and collect any type of data from them.

Introduction This is a crawler I wrote in Python using the APIs of Telethon months ago. This tool was not intended to be publicly available for a numb

This code will be able to scrape movies from a movie website and also provide download links to newly uploaded movies.

Movies-Scraper You are probably tired of navigating through a movie website to get the right movie you'd want to watch during the weekend. There may e

Script used to download data for stocks.

This script is useful for downloading stock market data for a wide range of companies specified by their respective tickers. The script reads in the d

Download images from forum threads

Forum Image Scraper Downloads images from forum threads Only works with forums which doesn't require a login to view and have an incremental paginatio

Command line program to download documents from web portals.

command line document download made easy Highlights list available documents in json format or download them filter documents using string matching re

download NCERT books using scrapy

download_ncert_books download NCERT books using scrapy Downloading Books: You can either use the spider by cloning this repo and following the instruc

Releases(v0.1)
Owner
Jiahao Chen (TabChen)
Fighting!
Jiahao Chen (TabChen)
TarkovScrappy - A nifty little bot that lets you know if a queried item might be required for a quest at some point in the land of Tarkov!

TarkovScrappy A nifty little bot that lets you know if a queried item might be required for a quest at some point in the land of Tarkov! Hideout items

Joshua Smeda 2 Apr 11, 2022
A Python Covid-19 cases tracker that scrapes data off the web and presents the number of Cases, Recovered Cases, and Deaths that occurred because of the pandemic.

A Python Covid-19 cases tracker that scrapes data off the web and presents the number of Cases, Recovered Cases, and Deaths that occurred because of the pandemic.

Alex Papadopoulos 1 Nov 13, 2021
A simple Discord scraper for discord bots

A simple Discord scraper for discord bots. That includes sending an guild members ids to an file, Mass inviter for joining servers your bot is in and Fetching all the servers of the bot (w/MemberCoun

3zg 1 Jan 06, 2022
NASA APOD Discord Bot - Fetches information from NASA APOD site.

NASA APOD Discord Bot - Fetches information from NASA APOD site.

Astronomy Club IITK 4 Apr 23, 2022
Complete pipeline for crawling online newspaper article.

Complete pipeline for crawling online newspaper article. The articles are stored to MongoDB. The whole pipeline is dockerized, thus the user does not need to worry about dependencies. Additionally, d

newspipe 4 May 27, 2022
Visual scraping for Scrapy

Portia Portia is a tool that allows you to visually scrape websites without any programming knowledge required. With Portia you can annotate a web pag

Scrapinghub 8.7k Jan 05, 2023
This program scrapes information and images for movies and TV shows.

Media-WebScraper This program scrapes information and images for movies and TV shows. Summary For more information on the program, read the WebScrape_

1 Dec 05, 2021
Displays market info for the LUNI token on the Terra Blockchain

LuniBot for Discord Displays market info for the LUNI/LUNA token on the Terra Blockchain (Webscrape method currently scraping CoinMarketCap). Will evo

0 Jan 22, 2022
🕷 Phone Crawler with multi-thread functionality

Phone Crawler: Phone Crawler with multi-thread functionality Disclaimer: I'm not responsible for any illegal/misuse actions, this program was made for

Kmuv1t 3 Feb 10, 2022
A web scraper that exports your entire WhatsApp chat history.

WhatSoup 🍲 A web scraper that exports your entire WhatsApp chat history. Table of Contents Overview Demo Prerequisites Instructions Frequen

Eddy Harrington 87 Jan 06, 2023
哔哩哔哩爬取器:以个人为中心

Open Bilibili Crawer 哔哩哔哩是一个信息非常丰富的社交平台,我们基于此构造社交网络。在该网络中,节点包括用户(up主),以及视频、专栏等创作产物;关系包括:用户之间,包括关注关系(following/follower),回复关系(评论区),转发关系(对视频or动态转发);用户对创

Boshen Shi 3 Oct 21, 2021
🐞 Douban Movie / Douban Book Scarpy

Python3-based Douban Movie/Douban Book Scarpy crawler for cover downloading + data crawling + review entry.

Xingbo Jia 1 Dec 03, 2022
Binance harvester - A Python 3 script to harvest data from the Binance socket stream and calculate popular TA indicators and produce lists of top trending coins

Binance harvester - A Python 3 script to harvest data from the Binance socket stream and calculate popular TA indicators and produce lists of top trending coins

68 Oct 08, 2022
Automated data scraper for Thailand COVID-19 data

The Researcher COVID data Automated data scraper for Thailand COVID-19 data Accessing the Data 1st Dose Provincial Vaccination Data 2nd Dose Provincia

Porames Vatanaprasan 31 Apr 17, 2022
Pseudo API for Google Trends

pytrends Introduction Unofficial API for Google Trends Allows simple interface for automating downloading of reports from Google Trends. Only good unt

General Mills 2.6k Dec 28, 2022
A modern CSS selector implementation for BeautifulSoup

Soup Sieve Overview Soup Sieve is a CSS selector library designed to be used with Beautiful Soup 4. It aims to provide selecting, matching, and filter

Isaac Muse 151 Dec 23, 2022
A simple, configurable and expandable combined shop scraper to minimize the costs of ordering several items

combined-shop-scraper A simple, configurable and expandable combined shop scraper to minimize the costs of ordering several items. Features Define an

2 Dec 13, 2021
爱奇艺会员,腾讯视频,哔哩哔哩,百度,各类签到

My-Actions 个人收集并适配Github Actions的各类签到大杂烩 不要fork了 ⭐️ star就行 使用方式 新建仓库并同步代码 点击Settings - Secrets - 点击绿色按钮 (如无绿色按钮说明已激活。直接到下一步。) 新增 new secret 并设置 Secr

280 Dec 30, 2022
Dex-scrapper - Hobby project for scrapping dex data on VeChain

Folders /zumo_abis # abi extracted from zumo repo /zumo_pools # runtime e

3 Jan 20, 2022
An IpVanish Proxies Scraper

EzProxies Tired of searching for good proxies for hours? Just get an IpVanish account and get thousands of good proxies in few seconds! Showcase Watch

11 Nov 13, 2022