Python wrapper for Xeno-canto API 2.0. Enables downloading bird data with one command line

Overview

birdData

BirdData is a python wrapper for Xeno-canto API 2.0. Enables user to download bird data with one command line. BirdData supports multithreading download.

Environment

Download repo to local:

git clone [email protected]:realzza/birdData.git

Set up environment:

pip install -r requirement.txt

Usage

Single-thread

Download audio data for one bird species. Use scientific name starting with lowercase. e.g, cettia cetti.

python download.py --name "cettia cetti"

Download audio data for a file of species names. Format requirement: names divided by "\n"

python download.py --name name_file

General Usage:

usage: download.py [-h] --name NAME

download bird audios

optional arguments:
  -h, --help   show this help message and exit
  --name NAME  [1] name of one bird species; [2] file of bird species spaced
               by '\n'

Multi-thread

Speed up downloading using multiple threads.

python download-mult.py --name "cettia cetti" --process-ratio 0.6

Download multiple birds in a file, format requirement: names divided by "\n"

python download-mult.py --name name_file --process-ratio 0.6

General Usage:

usage: download-mult.py [-h] --name NAME [--process-ratio PROCESS_RATIO]

download bird audios

optional arguments:
  -h, --help            show this help message and exit
  --name NAME           [1] name of one bird species; [2] file of bird species
                        spaced by '\n'
  --process-ratio PROCESS_RATIO
                        float[0~1], define cpu utilities in downloading audios
                        [default: 0.8]

To-do

  • [12.29] multiprocess download
  • define sample rate prior to download

Contact

Feel free to file an issue had you encountered any problems. Have fun!

You might also like...
PRAW, an acronym for "Python Reddit API Wrapper", is a python package that allows for simple access to Reddit's API.

PRAW: The Python Reddit API Wrapper PRAW, an acronym for "Python Reddit API Wrapper", is a Python package that allows for simple access to Reddit's AP

A Python Instagram Scraper for Downloading Profile's Posts, stories, ProfilePic and See the Details of Particular Instagram Profile.
A Python Instagram Scraper for Downloading Profile's Posts, stories, ProfilePic and See the Details of Particular Instagram Profile.

✔ ✔ InstAstra ⚡ ⚡ ⁜ Description ~ A Python Instagram Scraper for Downloading Profile's Posts, stories, ProfilePic and See the Details of Particular In

Python API wrapper around Trello's API

A wrapper around the Trello API written in Python. Each Trello object is represented by a corresponding Python object. The attributes of these objects

Async ready API wrapper for Revolt API written in Python.

Mutiny Async ready API wrapper for Revolt API written in Python. Installation Python 3.9 or higher is required To install the library, you can just ru

A Python API wrapper for the Twitter API!

PyTweet PyTweet is an api wrapper made for twitter using twitter's api version 2! Installation Windows py3 -m pip install PyTweet Linux python -m pip

Python API wrapper library for Convex Value API

convex-value-python Python API wrapper library for Convex Value API. Further Links: Convex Value homepage @ConvexValue on Twitter JB on Twitter Authen

This an API wrapper library for the OpenSea API written in Python 3.

OpenSea NFT API Python 3 wrapper This an API wrapper library for the OpenSea API written in Python 3. The library provides a simplified interface to f

YARSAW is an Async Python API Wrapper for the Random Stuff API.

Yet Another Random Stuff API Wrapper - YARSAW YARSAW is an Async Python API Wrapper for the Random Stuff API. This module makes it simpler for you to

EpikCord.py - This is an API Wrapper for Discord's API for Python

EpikCord.py - This is an API Wrapper for Discord's API for Python! We've decided not to fork discord.py and start completely from scratch for a new, better structuring system!

Comments
  • ZeroDivisionError

    ZeroDivisionError

    您好,我在尝试批量下载的时候,发现当查询结果为0个的时候,会出现 ZeroDivisionError: integer division or modulo by zero 的错误,查询trackback,错误源头在utils.py文件的第4行portion = len(lst) //n的位置。

    除此之外,我还发现当查询以下物种名称的时候也会出现同样的error。我的查询时间范围是从1971-01-01:

    ['Maleo', 'Moluccan Megapode', 'Nicobar Megapode', 'Long-billed Partridge', 'Black Partridge', 'Udzungwa Forest Partridge', 'Rubeho Forest Partridge', "Roll's Partridge", 'Sumatran Partridge', 'Grey-breasted Partridge', 'Red-billed Partridge', 'Chestnut-necklaced Partridge', 'Crestless Fireback', 'Crested Fireback', 'Siamese Fireback', 'Great Argus', 'Madagascan Pochard']
    

    以下这个list是对应以上的common_name的查询结果数量

    [7, 6, 2, 18, 3, 6, 34, 10, 1, 31, 8, 35, 1, 17, 24, 83, 5]
    

    期待回复,感谢!

    small fix 
    opened by ZhongJunhong 1
  • Something about Download Timeout

    Something about Download Timeout

    您好,我在下载的时候发现一些音频文件在某些情况下(可能是网络的问题)下载时间过长,而且进度条一直停滞不前,有时甚至停滞十分钟到二十分钟不等,有时候我只能手动中断使之进入下一个循环。

    后来我在stackoverflow上找到一个非常非常简单的解决方案,就是在执行请求的代码前加入socket.setdefaulttimeout(30),控制socket打开的时间,比如此处我设置为30秒。

    socket.setdefaulttimeout(30) 
    q.retrieve_recordings(multiprocess=True, nproc=10, attempts=10, outdir="/mnt/database/xcdata/")
    

    同时我加入一些其他设置,让代码在触发socket.timeout错误后,可以将对应的请求参数记录下来,并继续执行下一个循环。由此可以避免下载停滞的问题。当我完成所有循环以后,会将socket打开的时间再增加(比如增加到120秒),对先前记录下的请求参数再次执行。由此循环多次以将所有参数请求完毕。

    https://stackoverflow.com/questions/32763720/timeout-a-file-download-with-python-urllib

    希望能帮助到大家。

    Hello, when I was downloading, I found that the download time of some audio files was too long under certain circumstances (may be a problem with the network), and the progress bar has been stagnant, sometimes even for ten to twenty minutes, there are Sometimes I can only manually interrupt it to enter the next cycle.

    Later, I found a very, very simple solution on stackoverflow, which is to add socket.setdefaulttimeout(30) before executing the requested code to control the opening time of the socket. For example, I set it to 30 seconds here.

    socket.setdefaulttimeout(30) 
    q.retrieve_recordings(multiprocess=True, nproc=10, attempts=10, outdir="/mnt/database/xcdata/")
    

    At the same time, I added some other settings so that after the code triggers the socket.timeout error, it can record the corresponding request parameters and continue to execute the next loop. This can avoid the problem of download stagnation. When I finish all the loops, I will increase the socket opening time (for example, to 120 seconds), and execute again on the previously recorded request parameters. Loop for multiple times to complete the request for all parameters.

    https://stackoverflow.com/questions/32763720/timeout-a-file-download-with-python-urllib

    I Hope that can help everyone.

    opened by ZhongJunhong 3
Releases(v0.0.4)
  • v0.0.4(May 13, 2022)

    • Support Query by bird name.
    • Automatically cut inessential processes in query traffic.
    • Optimized query assignment strategy in recording retrieval.
    Source code(tar.gz)
    Source code(zip)
Owner
_zza
Computational (Para)linguistics | Multimodal Analysis | Founder of @Presento | Previously @ByteDance AI-lab Shanghai
_zza
A light weight Python library for the Spotify Web API

Spotipy A light weight Python library for the Spotify Web API Documentation Spotipy's full documentation is online at Spotipy Documentation. Installat

Paul Lamere 4.2k Jan 06, 2023
PepeSniper is an open-source Discord Nitro auto claimer/redeemer made in python.

PepeSniper is an open-source Discord Nitro auto claimer made in python. It sure as hell is not the fastest sniper out there but it gets the job done in a timely and stable manner. It also supports ho

Unknown User 1 Dec 22, 2021
C Y B Ξ R UserBot is a project that simplifies the use of Telegram. All rights reserved.

C Y B Ξ R USΞRBOT 🇦🇿 C Y B Ξ R UserBot is a project that simplifies the use of Telegram. All rights reserved. Automatic Setup Android: open Termux p

C Y B Ξ R 0 Sep 20, 2022
Total time of all YouTube videos in a playlist.

Youtube Playlist Total Times Total time of all YouTube videos in a playlist. How to Use Download chromedriver depending on your os and chrome version

Mohammad Dori 3 Jul 15, 2022
短信发送 Python 程序(包含1000+有效接口)

短信轰炸 Python 程序(包含1000+有效接口) 前言 这是一个爬取网络上在线轰炸的接口,后通过 Python 异步 请求接口以达到 手机短信轰炸 的目的。 此为开源项目,仅供娱乐学习使用,使用者所带来的一切后果与作者无关,使用请遵守相关的法律法规,合理使用,请勿滥用。 食用方法 1. 爬取接

蓝鲸落 10.2k Jan 02, 2023
A python client for the Software-Challenge Germany.

sc-client-python A python client for the Software-Challenge Germany. Creating a new project (Optional) Install virtualenv virtualenv is a tool that cr

rpkak 3 Jan 22, 2022
A Discord bot to play bluffing games like Dobbins or Bobbins

Usage: pip install -r requirements.txt python3 bot.py DISCORD_BOT_TOKEN Gameplay: All commands are case-insensitive, with trailing punctuation and spa

4 May 27, 2022
Easy-apply-bot - A LinkedIn Easy Apply bot to help with my job search.

easy-apply-bot A LinkedIn Easy Apply bot to help with my job search. Getting Started First, clone the repository somewhere onto your computer, or down

Matthew Alunni 5 Dec 09, 2022
WikiChecker - Repositorio oficial del complemento WikiChecker para NVDA.

WikiChecker Buscador rápido de artículos en Wikipedia. Introducción. El complemento WikiChecker para NVDA permite a los usuarios consultar de forma rá

2 Jan 10, 2022
AWS Lambda Fast API starter application

AWS Lambda Fast API Fast API starter application compatible with API Gateway and Lambda Function. How to deploy it? Terraform AWS Lambda API is a reus

OBytes 6 Apr 20, 2022
Pdisk Link Converter Telegram Bot, Convert link in a single click

Pdisk Converter Bot Make short link by using Pdisk API key Installation The Easy Way Required Variables BOT_TOKEN: Create a bot using @BotFather, and

Ayush Kumar Jaiswal 6 Jul 28, 2022
A liblary whre you can find helpful functions for your discord bot

DBotUtils A liblary whre you can find helpful functions for your discord bot Easy setup Setup is easily and flexible. Change anytime. After setup just

Kondek286 1 Nov 02, 2021
Instadev - Crack Instagram IqbalDev

Crack Instagram IqbalDev ⇨ Install Script Di Termux $ pkg update && upgrade $

Dicky Wahyudi 1 Feb 27, 2022
Unofficial calendar integration with Gradescope

Gradescope-Calendar This script scrapes your Gradescope account for courses and assignment details. Assignment details currently can be transferred to

6 May 06, 2022
The Most advanced and User-Friendly Google Collab NoteBook to download Torrent directly to Google Drive with File or Magnet Link support and with added protection of Timeout Preventer.

Torrent To Google Drive (UI Added! 😊 ) A Simple and User-Friendly Google Collab Notebook with UI to download Torrent to Google Drive using (.Torrent)

Dr.Caduceus 33 Aug 16, 2022
Projeto de estudantes do primeiro período do CIn - UFPE voltado para a criação de um sistema interativo no fechamento da disciplina IF669 - Introdução a Programação.

Projeto Game: Dona da Lua Alunos: Beatriz Férre Clara Kenderessy Matheus Silva Rafael Baltar Roseane Oliveira Samuel Marsaro Sinopse O Cebolinha apron

Maria Clara Kenderessy 5 Dec 20, 2021
Pretend to be a discord bot

Pretendabot © Pretend to be a discord bot! About Pretendabot© is an app that lets you become a discord bot!. It uses discord intrigrations(webhooks) a

Advik 3 Apr 24, 2022
Petpy is an easy-to-use and convenient Python wrapper for the Petfinder API.

Petpy is an easy-to-use and convenient Python wrapper for the Petfinder API. Includes methods for parsing output JSON into pandas DataFrames for easier data analysis

Aaron Schlegel 27 Nov 19, 2022
Discord E-Store Bot

A delivery bot for Discord, works like Amazon where real users can pack & deliver orders in different servers!

Amit Pathak 2 Jan 28, 2022
Ethereum Gas Fee for the MacBook Pro touchbar (using BetterTouchTool)

Gasbar Ethereum Gas Fee for the MacBook Pro touchbar (using BetterTouchTool) Worried about Ethereum gas fees? Me too. I'd like to keep an eye on them

TSS 51 Nov 14, 2022