Amazon Scraper: A command-line tool for scraping Amazon product data

Overview

Amazon Product Scraper: 2021

Description

A command-line tool that scrapes Amazon product data and saves it in CSV or JSON format.

Requirements

  • Python 3
  • pip3

Installation

Using git clone (you'll need git installed for this):

git clone https://github.com/scrapewalrus/amazon-scraper-python-2021

Or download and extract the zip file of the project manually.

You'll also need to install the project's requirements before it will run. In a terminal, go to the amazon-product-scraper folder and run pip install -r requirements.txt.

Usage

To launch the Amazon scraper, go to the amazon-product-scraper folder in a terminal and run python amazon_scraper.py -k "your keyword".

NOTE: the keyword must be preceded by either -k or --keyword. This argument is required.

Example:

amazon_scraper.py - the name of the scraper file.

-k or --keyword - required argument to pass before entering your keyword.

-p or --proxies - optional argument to enable proxies. To avoid getting blocked, I highly recommend using proxies. I'm using Residential Proxies from Oxylabs. For the highest success rate, I suggest Residential Proxies over Datacenter proxies, as they're almost impossible to detect and have the smallest footprint. If you decide to use a different proxy provider, keep in mind that you'll have to make some minor adjustments in the get-proxies.py file.

-j or --json - optional argument for storing extracted data in .json format. The default output format is .csv.
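The flags described above could be declared with Python's standard argparse module. The sketch below is only an illustration of that layout, not the project's actual code: the function names build_parser and output_extension are hypothetical, and treating -p and -j as simple on/off switches is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the flags listed in this README (assumed layout)."""
    parser = argparse.ArgumentParser(
        prog="amazon_scraper.py",
        description="Scrape Amazon product data to CSV or JSON.",
    )
    # -k / --keyword is the only required argument.
    parser.add_argument("-k", "--keyword", required=True, help="search keyword")
    # Assumption: -p and -j are boolean switches that take no value.
    parser.add_argument("-p", "--proxies", action="store_true", help="enable proxies")
    parser.add_argument("-j", "--json", action="store_true", help="store data as .json")
    return parser


def output_extension(args: argparse.Namespace) -> str:
    """Return the output extension implied by the flags; .csv is the default."""
    return ".json" if args.json else ".csv"


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-k", "funny t shirt for women", "-j"])
print(args.keyword, args.proxies, output_extension(args))
```

Marking the keyword with required=True makes argparse exit with a usage message when -k is missing, which matches the note that the keyword argument is mandatory.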

Example of product data #1: JSON

[
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/Mostly-T-Shirt-Womens-Letter-Printed/dp/B07QN2NQ59/ref=sr_1_3?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-3",
        "PRODUCT_NAME": "I'm Mostly Peace Love and Light Funny T-Shirt Womens Graphic Printed Short Sleeve Tops Tee",
        "PRICE": "$21.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "1,637"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/YITAN-Women-Graphic-Funny-X-Large/dp/B074QMG4D7/ref=sr_1_4?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-4",
        "PRODUCT_NAME": "YITAN Women's Cute Juniors Tops Teen Girl Tee Funny T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "12,281"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/DANVOUY-Womens-V-Neck-Doesnt-Definitely/dp/B07V55ZXVS/ref=sr_1_5?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-5",
        "PRODUCT_NAME": "DANVOUY Womens If My Mouth Doesn't Say It My Face Definitely Will T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "6,787"
    }
]
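In the JSON example above, PRICE, PRODUCT_RATING and NUMBER_OF_RATINGS are stored as strings. If you want to sort or filter on them, they need to be converted first. The following is a minimal sketch of one way to do that; the normalize helper is hypothetical and not part of the scraper, and it assumes every record has a "$"-prefixed price and non-empty rating fields.

```python
import json
from decimal import Decimal

# One record from the example above, trimmed to the fields being converted.
raw = """
[
    {
        "PAGE": 1,
        "PRICE": "$21.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "1,637"
    }
]
"""


def normalize(record: dict) -> dict:
    """Return a copy of the record with its numeric fields converted from strings."""
    out = dict(record)
    # Strip the currency symbol and thousands separators before parsing.
    out["PRICE"] = Decimal(record["PRICE"].lstrip("$").replace(",", ""))
    out["PRODUCT_RATING"] = float(record["PRODUCT_RATING"])
    out["NUMBER_OF_RATINGS"] = int(record["NUMBER_OF_RATINGS"].replace(",", ""))
    return out


products = [normalize(r) for r in json.loads(raw)]
print(products[0]["PRICE"], products[0]["NUMBER_OF_RATINGS"])
```

Decimal is used for the price so that currency values are not subject to binary floating-point rounding.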

Example of product data #2: CSV

[Image: amazon-csv-product-data-example]
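As a rough illustration of what the same records look like in CSV form, the sketch below writes them with Python's csv module. It assumes the CSV columns use the same names and order as the JSON keys shown earlier; the actual file produced by the scraper may differ. The records_to_csv helper is hypothetical.

```python
import csv
import io

# Assumption: CSV columns match the keys of the JSON example.
FIELDS = [
    "SOURCE_URL", "PAGE", "KEYWORD", "PRODUCT_LINK",
    "PRODUCT_NAME", "PRICE", "PRODUCT_RATING", "NUMBER_OF_RATINGS",
]


def records_to_csv(records: list) -> str:
    """Serialize a list of product dicts to CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


sample = [{
    "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
    "PAGE": 1,
    "KEYWORD": "funny t shirt for women",
    "PRODUCT_LINK": "https://www.amazon.com/YITAN-Women-Graphic-Funny-X-Large/dp/B074QMG4D7/ref=sr_1_4?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-4",
    "PRODUCT_NAME": "YITAN Women's Cute Juniors Tops Teen Girl Tee Funny T Shirt",
    "PRICE": "$12.99",
    "PRODUCT_RATING": "4.6",
    "NUMBER_OF_RATINGS": "12,281",
}]

print(records_to_csv(sample))
```

Values that contain a comma, such as "12,281", are quoted automatically by the csv writer, so they survive a round trip through csv.DictReader.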
