Script used to download data for stocks.

Overview
This script is useful for downloading stock market data for a wide range of companies specified by their respective tickers. The script reads in the desired tickers and interacts with yahoo finance to download and save csv files containing information for: Date, Open, High, Low, Close, Adjusted Close, and Volume. Once data for a ticker is downloaded and stored, further requests for data will simply append the most recent information onto the existing csv file. Additionally, each time a user requests downloads, a list of the successful and failed requests will be generated. 


A few important notes:
-Most importantly, HUGE shoutout to https://github.com/bradlucas/get-yahoo-quotes-python for the repo on downloading historic data from yahoo finance. My code is build on top of the work done there, which was a huge time saver.
-Make sure to set up the directories for your ticker_location and csv_location.
-The default behavior is to download as much data that yahoo finance can provide.
-This data is daily historic data


There are 5 command line arguments which may be helpful to facilitate the data download process, which may either be used directly in the terminal, or have their defaults set by modifying the download_data.py script.

Command Line Arguments:

--ticker_location (path): this specifies the file location containing a list of tickers to download data for. The list should be saved as a text file with each ticker on its own new line.

--csv_location (path): this is the directory where csv files should be saved. If this directory does not already exist, create it manually before running the script.

--add_tickers (string): this gives the user an option to add more tickers to their existing list and database. Pass in a string of tickers separated by commas (no spaces) to add the tickers to the list, and download their csv files. The default list of tickers will be updated to contain these new tickers specified. If there is not already a default list of tickers, create this before running the script.

--remove_tickers (string): this gives the user an option to remove tickers from their list and database. Pass in a string of tickers separated by commas (no spaces) to remove the tickers from the list as well as the database (csv_location). If there is not already a default list of tickers, create this before running the script.

--verbose (bool): this provides extra information while downloading data, useful for debugging. Set to false to only see the progress bar for data being downloaded.



To use the script, follow these simple steps.

0. Install dependencies using pip install -r requirements.txt
1. Set up a default list of tickers. This can be a blank text file, or a list of tickers each on their own new line, saved as a text file.
2. Set up a directory to save csv files to.
3. Optionally, change the default ticker_location and csv_location file paths in the script itself.
4. Run the script download_data.py from the command line, or your favorite IDE.

Examples: 

Download using a pre-saved list of tickers

python download_data.py --ticker_location /home/user/Desktop/tickers.txt --csv_location /home/user/Desktop/CSVFiles/

Download data using a string of tickers without referencing a tickers.txt file

python download_data.py --csv_location /home/user/Desktop/CSVFiles/ --add_tickers "GME,AMC,AAPL,TSLA,SPY"

Download data using a string of tickers with referencing a tickers.txt file

python download_data.py --csv_location /home/user/Desktop/CSVFiles/ --ticker_location /home/user/Desktop/tickers.txt --add_tickers "GME,AMC,AAPL,TSLA,SPY"



From here, the rest is history (pun intended ;)). When downloading from a pre-saved list of tickers, the computer will open as many threads as it can to speed up this highly parallelizable process to get you your data as quick as possible. Once its finished, you'll find all the data in your csv_location folder!

Now that you have data, you can easily update the files with the latest information at the end of each day, week, or whatever time frame you prefer. Simply run the script in the same way as previously described, and the newest data will be appended to the existing files. If there is a new ticker in your list, the full set of data will be downloaded.


Happy downloading!
Owner
Carmelo Gonzales
My main interests include: applied deep learning, edge computing, automation, data acquisition, sensor integration, and metaprogramming.
Carmelo Gonzales
Simple tool to scrape and download cross country ski timings and results from live.skidor.com

LiveSkidorDownload Simple tool to scrape and download cross country ski timings

0 Jan 07, 2022
Scrapes all articles and their headlines from theonion.com

The Onion Article Scraper Scrapes all articles and their headlines from the satirical news website https://www.theonion.com Also see Clickhole Article

0 Nov 17, 2021
Example of scraping a paginated API endpoint and dumping the data into a DB

Provider API Scraper Example Example of scraping a paginated API endpoint and dumping the data into a DB. Pre-requisits Python = 3.9 Pipenv Setup # i

Alex Skobelev 1 Oct 20, 2021
Instagram profile scrapper with python

IG Profile Scrapper Instagram profile Scrapper Just type the username, and boo! :D Instalation clone this repo to your computer git clone https://gith

its Galih 6 Nov 07, 2022
A scrapy pipeline that provides an easy way to store files and images using various folder structures.

scrapy-folder-tree This is a scrapy pipeline that provides an easy way to store files and images using various folder structures. Supported folder str

Panagiotis Simakis 7 Oct 23, 2022
Binance harvester - A Python 3 script to harvest data from the Binance socket stream and calculate popular TA indicators and produce lists of top trending coins

Binance harvester - A Python 3 script to harvest data from the Binance socket stream and calculate popular TA indicators and produce lists of top trending coins

68 Oct 08, 2022
A scalable frontier for web crawlers

Frontera Overview Frontera is a web crawling framework consisting of crawl frontier, and distribution/scaling primitives, allowing to build a large sc

Scrapinghub 1.2k Jan 02, 2023
Scrapes Every Email Address of Every Society in Every University

society-email-scrape Site Live at https://kcsoc.github.io/society-email-scrape/ How to automatically generate new data Go to unis.yml Add your uni Cre

Krishna Consciousness Society 18 Dec 14, 2022
API which uses discord to scrape NameMC searches/droptime/dropping status of minecraft names

NameMC Scrape API This is an api to scrape NameMC using message previews generated by discord. NameMC makes it a pain to scrape their website, but som

Twilak 2 Dec 22, 2021
Instagram_scrapper - This project allow you to scrape the list of followers, following or both from a public Instagram account, and create a csv or excel file easily.

Instagram_scrapper This project allow you to scrape the list of followers, following or both from a public Instagram account, and create a csv or exce

Lakhdar Belkharroubi 5 Oct 17, 2022
An introduction to free, automated web scraping with GitHub’s powerful new Actions framework.

An introduction to free, automated web scraping with GitHub’s powerful new Actions framework Published at palewi.re/docs/first-github-scraper/ Contrib

Ben Welsh 15 Nov 24, 2022
A package that provides you Latest Cyber/Hacker News from website using Web-Scraping.

cybernews A package that provides you Latest Cyber/Hacker News from website using Web-Scraping. Latest Cyber/Hacker News Using Webscraping Developed b

Hitesh Rana 4 Jun 02, 2022
Automatically scrapes all menu items from the Taco Bell website

Automatically scrapes all menu items from the Taco Bell website. Returns as PANDAS dataframe.

Sasha 2 Jan 15, 2022
A spider for Universal Online Judge(UOJ) system, converting problem pages to PDFs.

Universal Online Judge Spider Introduction This is a spider for Universal Online Judge (UOJ) system (https://uoj.ac/). It also works for all other Onl

TriNitroTofu 1 Dec 07, 2021
Scrape all the media from an OnlyFans account - Updated regularly

Scrape all the media from an OnlyFans account - Updated regularly

CRIMINAL 3.2k Dec 29, 2022
Twitter Claimer / Swapper / Turbo - Proxyless - Multithreading

Twitter Turbo / Auto Claimer / Swapper Version: 1.0 Last Update: 01/26/2022 Use this at your own descretion. I've only used this on test accounts and

Underscores 6 May 02, 2022
Luis M. Capdevielle 1 Jan 14, 2022
Web Scraping images using Selenium and Python

Web Scraping images using Selenium and Python A propos de ce document This is a markdown document about Web scraping images and videos using Selenium

Nafaa BOUGRAINE 3 Jul 01, 2022
自动完成每日体温上报(Github Actions)

体温上报助手 简介 每天 10:30 GMT+8 自动完成体温上报,如想修改定时运行的时间,可修改 .github/workflows/SduHealthReport.yml 中 schedule 属性。 如果当日有异常,请手动在小程序端/PC 端填写!

Teng Zhang 23 Sep 15, 2022
Scrape plants scientific name information from Agroforestry Species Switchboard 2.0.

Agroforestry Species Switchboard 2.0 Scraper Scrape plants scientific name information from Species Switchboard 2.0. Requirements python = 3.10 (you

Mgs. M. Rizqi Fadhlurrahman 2 Dec 23, 2021