Automatically download arXiv daily papers and crop key information from them. (CPU version)


FocusAX

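Filter the latest daily arXiv papers by keyword, or search arXiv directly.

  • Automatically downloads papers, extracts abstracts, and crops the tables and figures found in the text.

Installing the required environment

  • Install paddle
# GPU install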
python3 -m pip install paddlepaddle-gpu==2.1.1 -i https://mirror.baidu.com/pypi/simple

# CPU install
python3 -m pip install paddlepaddle==2.1.1 -i https://mirror.baidu.com/pypi/simple
  • Install Layout-Parser
pip3 install -U https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
pip install "paddleocr>=2.2"
  • Install the other required packages
pip3 install -r requirements.txt
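  • Download the model weights
  • Download the PubLayNet weights, extract them, and place them in the paperparse directory. The directory structure is as follows: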
FocusAX
    - paperparse
        - ppyolov2_r50vd_dcn_365e_publaynet
            - inference.pdiparams
            - inference.pdiparams.info
            - inference.pdmodel
        - ...
    - downloader
        - ...
    - utils
        - ...
    - configs.py
    - focus_daily.py
    - focus_search.py
    - README.py
    - ...
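
Before running the tool, it can help to confirm that the weights landed in the expected place. The following is a minimal sketch using only the standard library; the file names come from the directory listing above, while the helper name `missing_weight_files` is hypothetical and not part of FocusAX.

```python
from pathlib import Path

# Paths and file names taken from the directory listing above.
MODEL_DIR = Path("paperparse") / "ppyolov2_r50vd_dcn_365e_publaynet"
REQUIRED_FILES = (
    "inference.pdiparams",
    "inference.pdiparams.info",
    "inference.pdmodel",
)


def missing_weight_files(repo_root="."):
    """Return the required weight files that are absent under repo_root."""
    model_dir = Path(repo_root) / MODEL_DIR
    return [name for name in REQUIRED_FILES if not (model_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_weight_files()
    if missing:
        print("Missing weight files:", ", ".join(missing))
    else:
        print("All PubLayNet weight files are in place.")
```

Run it from the FocusAX root directory; an empty result means all three files were found.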

Usage

  • configs.py: program configuration file
# =============== Network proxy ================
# proxy = None  # do not use a proxy
proxy = {"http": "socks5://127.0.0.1:8080", "https": "socks5://127.0.0.1:8080"}
# =============== Root directory for saved files ================
root_path = "./arxiv"
# =============== DNN model inference settings ================
threshold = 0.5
enable_mkldnn = True
enforce_cpu = True
thread_num = 4
  • focus_daily.py: filter the articles on arXiv daily by keyword (current day only)
if __name__ == '__main__':
    key_words = ['GAN']  # keywords to match
    subject_words = ['ML', 'CV', 'AI']  # subject categories to match
    start_parse(key_words, subject_words, needPDF=True, needZip=False)
  • focus_search.py: search arXiv by keyword
start_parse('Keyword')
  • A new folder is created under the root_path directory to store the results
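
The keyword and subject filtering can be pictured roughly as follows. This is only an illustrative sketch, not the logic in focus_daily.py: the entry format (a dict with `title` and `subjects`), the helper name `matches`, whole-word case-insensitive title matching, and the mapping of a subject word such as `CV` onto the suffix of an arXiv category code such as `cs.CV` are all assumptions.

```python
import re


def matches(entry, key_words, subject_words):
    """Return True if the entry has a keyword in its title and an allowed subject."""
    # Whole-word, case-insensitive match, so "GAN" does not hit "Organ".
    title_hit = any(
        re.search(r"\b" + re.escape(word) + r"\b", entry["title"], re.IGNORECASE)
        for word in key_words
    )
    # "cs.CV" -> "CV"; a code without a dot is compared as-is.
    suffixes = {code.rsplit(".", 1)[-1] for code in entry["subjects"]}
    subject_hit = any(word in suffixes for word in subject_words)
    return title_hit and subject_hit


# Hypothetical entries, for illustration only.
papers = [
    {"title": "A GAN for image synthesis", "subjects": ["cs.CV"]},
    {"title": "Organ segmentation with CNNs", "subjects": ["cs.CV"]},
]
selected = [p for p in papers if matches(p, ["GAN"], ["ML", "CV", "AI"])]
```

Whole-word matching is a deliberate choice here: a plain substring test would be simpler but would also match unrelated words that merely contain the keyword.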

Results

The abs.md file in each folder contains a summary of the corresponding PDF; open it with a Markdown editor such as Typora.

Note: papers with non-standard layouts can produce messy crops.
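
Since the crops come from detected layout regions, raising `threshold` in configs.py is one thing to try when the results look noisy. The sketch below shows the general idea of confidence filtering; it assumes `threshold` is a detection-score cut-off and that detections are dicts with `type` and `score` keys, neither of which is confirmed by this README. The helper name `keep_regions` is hypothetical.

```python
# Hypothetical detection records; the tool's real output format may differ.
def keep_regions(detections, threshold=0.5, wanted=("Table", "Figure")):
    """Keep detections of the wanted types whose score meets the threshold."""
    return [
        d for d in detections
        if d["type"] in wanted and d["score"] >= threshold
    ]


detections = [
    {"type": "Table", "score": 0.9},
    {"type": "Figure", "score": 0.3},
    {"type": "Text", "score": 0.99},
    {"type": "Figure", "score": 0.5},
]
kept = keep_regions(detections)
```

With the default of 0.5 the low-scoring figure and the text block are dropped; a higher value trades missed tables or figures for fewer stray crops.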

Other

Owner
HeoLis
Interested in generative methods.