Amazon Scraper: A command-line tool for scraping Amazon product data

Overview

Amazon Product Scraper: 2021

Description

A command-line tool for scraping Amazon product data to CSV or JSON format(s).

Requirements

  • Python 3
  • pip3

Installation

Using git clone (you'll need git installed for this):

git clone https://github.com/scrapewalrus/amazon-scraper-python-2021

Or download and extract the zip file of the project manually

You'll also need to install requirements for the project to run. Locate amazon-product-scraper folder via terminal and type pip install -r requirements.txt:

Usage

To launch the Amazon scraper locate the amazon-product-scraper folder via terminal and type python amazon_scraper.py -k "your keyword". This will start the program.

NOTE: you must declare either -k or --keyword before entering your keyword. It's a required argument.

Example:

amazon_scraper.py - the name of a scraper file.

-k or --keyword - required argument to pass before entering your keyword.

-p or --proxies - optional argument to enable proxies. To avoid getting blocked I highly recommend using proxies. I'm using Residential Proxies from Oxylabs. For highest success rate, I suggest Residential Proxies over Datacenter as they're almost impossible to detect and have the smallest footprint. If you decide to use different proxy provider services keep in mind that you'll have to make some minor adjustments in get-proxies.py file.

-j or --json - optional argument for storing extracted data in .json format. Default output format is .csv.

Example of product data #1: JSON

[
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/Mostly-T-Shirt-Womens-Letter-Printed/dp/B07QN2NQ59/ref=sr_1_3?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-3",
        "PRODUCT_NAME": "I'm Mostly Peace Love and Light Funny T-Shirt Womens Graphic Printed Short Sleeve Tops Tee",
        "PRICE": "$21.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "1,637"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/YITAN-Women-Graphic-Funny-X-Large/dp/B074QMG4D7/ref=sr_1_4?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-4",
        "PRODUCT_NAME": "YITAN Women's Cute Juniors Tops Teen Girl Tee Funny T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "12,281"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/DANVOUY-Womens-V-Neck-Doesnt-Definitely/dp/B07V55ZXVS/ref=sr_1_5?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-5",
        "PRODUCT_NAME": "DANVOUY Womens If My Mouth Doesn't Say It My Face Definitely Will T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "6,787"
    }]

Example of product data #2: CSV

amazon-csv-product-data-example

Helping you manage your data science projects sanely.

PyDS CLI Helping you manage your data science projects sanely. Requirements Anaconda/Miniconda/Miniforge/Mambaforge (Mambaforge recommended!) git on y

Eric Ma 16 Apr 25, 2022
Objexplore is an interactive Python object explorer for the terminal.

Objexplore is an interactive Python object explorer for the terminal. Use it while debugging, or exploring a new library, or whatever! 9D1FAC73-B2A5-4

kylepollina 249 Dec 23, 2022
A simple CLI productivity tool to quickly display the syntax of a desired piece of code

Iforgor Iforgor is a customisable and easy to use command line tool to manage code samples. It's a good way to quickly get your hand on syntax you don

Solaris 21 Jan 03, 2023
Fun project to generate The Matrix Code effect on you terminal.

Fun project to generate The Matrix Code effect on you terminal.

Henrique Bastos 11 Jul 13, 2022
StackOverflow in your terminal.

how. How do I ...? This project was started to help developers ask more questions. Table of Contents Installation Usage Foss Community Copyright Insta

Ron Nathaniel 2 Jan 31, 2022
lfb (light file browser) is a terminal file browser

lfb (light file browser) is a terminal file browser. The whole program is a mess as of now. In the feature I will remove the need for external dependencies, tidy up the code, make an actual readme, a

2 Apr 09, 2022
GanTTY - Project planning from the terminal

GanTTY - Project planning from the terminal

Timeo Sam Pochin 161 Dec 26, 2022
frogtrade9000 - a command-line Rich client for the freqtrade REST API

frogtrade9000 - a command-line Rich client for the freqtrade REST API I found FreqUI too cumbersome and slow on my Raspberry Pi 400 when running multi

Robert Davey 79 Dec 02, 2022
A simple yet powerful timer and time tracker from the command line.

Focus Phase Focus Phase (FP) is a simple yet powerful timer and time tracker. It is a command-line application written in Python and can be installed

Ammar Alyousfi 13 Jan 13, 2022
A python command line tool to calculate options max pain for a given company symbol and options expiry date.

Options-Max-Pain-Calculator A python command line tool to calculate options max pain for a given company symbol and options expiry date. Overview - Ma

13 Dec 26, 2022
Fast as FUCK nvim completion. SQLite, concurrent scheduler, hundreds of hours of optimization.

Fast as FUCK nvim completion. SQLite, concurrent scheduler, hundreds of hours of optimization.

i love my dog 2.8k Jan 05, 2023
Because sometimes you need to do it live

doitlive doitlive is a tool for live presentations in the terminal. It reads a file of shell commands and replays the commands in a fake terminal sess

Steven Loria 3.2k Jan 09, 2023
Free and Open-Source Command Line tool for Text Replacement

Sniplet Free and Open Source Text Replacement Tool Description: Sniplet is a work in progress CLI tool which can do text replacement globally in Linux

Veeraraghavan Narasimhan 13 Nov 28, 2022
A collection of command-line interface games written in python

Command Line Interface Python Games Collection of some starter python game projects for beginners How to play these games Clone this repository git cl

Paras Gupta 7 Jun 06, 2022
gget is a free and open-source command-line tool and Python package that enables efficient querying of genomic databases.

gget is a free and open-source command-line tool and Python package that enables efficient querying of genomic databases. gget consists of a collection of separate but interoperable modules, each des

Pachter Lab 570 Dec 29, 2022
Palm CLI - the tool-belt for data teams

Palm CLI: The extensible CLI at your fingertips Palm is a universal CLI developed to improve the life and work of data professionals. Palm CLI documen

Palmetto 41 Dec 12, 2022
A Reverse Shell Python Packages

A Reverse Shell Python Packages

1 Nov 03, 2021
Command-line tool for looking up colors and palettes.

Colorpedia Colorpedia is a command-line tool for looking up colors, shades and palettes. Supported color models: HEX, RGB, HSL, HSV, CMYK. Requirement

Joohwan Oh 282 Dec 27, 2022
Password manager for the CLI simps.

CLI Password Manager Password manager for the CLI simps. Free software: MIT license

1 Dec 30, 2021
Command-line search tool for GitHub

cligh is a command-line search tool for GitHub.

1 Oct 02, 2022