Pseudo API for Google Trends

pytrends

Introduction

Unofficial API for Google Trends

Provides a simple interface for automating the download of reports from Google Trends. It only works until Google changes their backend again :-P. When that happens, feel free to contribute!

Looking for maintainers!

Installation

pip install pytrends

Requirements

  • Written for Python 3.3+
  • Requires Requests, lxml, Pandas

API

Connect to Google

from pytrends.request import TrendReq

pytrends = TrendReq(hl='en-US', tz=360)

Or, if you are blocked by Google's rate limit and want to use proxies:

from pytrends.request import TrendReq

pytrends = TrendReq(hl='en-US', tz=360, timeout=(10,25), proxies=['https://34.203.233.13:80',], retries=2, backoff_factor=0.1, requests_args={'verify':False})

  • timeout

    • Tuple of (connect, read) timeouts

  • tz

    • Timezone offset in minutes
    • For example, US CST is '360' (note: NOT -360; Google expresses the offset this way)
  • proxies

    • Only https proxies that Google accepts will work
    • list ['https://34.203.233.13:80','https://35.201.123.31:880', ..., ...]
  • retries

    • number of retries total/connect/read all represented by one scalar
  • backoff_factor

    • A backoff factor to apply between attempts after the second try (most errors are resolved immediately by a second try without a delay). urllib3 will sleep for: {backoff factor} * (2 ^ ({number of total retries} - 1)) seconds. If the backoff_factor is 0.1, then sleep() will sleep for [0.0s, 0.2s, 0.4s, …] between retries. It will never be longer than Retry.BACKOFF_MAX. By default, backoff is disabled (set to 0).
  • requests_args

    • A dict with additional parameters to pass along to the underlying requests library, for example verify=False to ignore SSL errors

Note: the parameter hl specifies the host language for accessing Google Trends.

Note: only https proxies will work, and you need to add the port number after the proxy IP address.
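To make the backoff_factor formula above concrete, here is a small sketch that reproduces the sleep schedule in plain Python. It does not call urllib3. The function name backoff_delays and the backoff_max default of 120 seconds are assumptions for illustration; the cap in your installed urllib3 may differ.

```python
def backoff_delays(backoff_factor, retries, backoff_max=120.0):
    """Return the sleep (in seconds) before each retry, per the formula above."""
    delays = []
    for attempt in range(1, retries + 1):
        if attempt == 1:
            # The first retry happens immediately, without a delay.
            delays.append(0.0)
        else:
            delay = backoff_factor * (2 ** (attempt - 1))
            # backoff_max is an assumed cap standing in for Retry.BACKOFF_MAX.
            delays.append(min(backoff_max, delay))
    return delays


print(backoff_delays(0.1, 3))  # [0.0, 0.2, 0.4]
```

With backoff_factor=0.1 and retries=2, as in the TrendReq call above, the schedule would be a single immediate retry followed by one retry after 0.2 seconds.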

Build Payload

kw_list = ["Blockchain"]
pytrends.build_payload(kw_list, cat=0, timeframe='today 5-y', geo='', gprop='')

Parameters

  • kw_list

    • Required
    • Keywords to get data for
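Because a payload accepts at most five terms (see kw_list under Common API parameters), a longer keyword list has to be split across several build_payload calls. A minimal sketch of the batching step follows; chunk_keywords is a hypothetical helper and is not part of pytrends.

```python
def chunk_keywords(keywords, size=5):
    """Split a keyword list into consecutive batches of at most `size` terms."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [keywords[i:i + size] for i in range(0, len(keywords), size)]


terms = ['Pizza', 'Italian', 'Spaghetti', 'Breadsticks', 'Sausage', 'Lasagna', 'Risotto']
for batch in chunk_keywords(terms):
    # Each batch could then be passed to pytrends.build_payload(batch, ...)
    print(batch)
```

Keep in mind that Google Trends scales values relative to the terms in the same request, so numbers from different batches are not directly comparable.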

API Methods

The following API methods are available:

  • Interest Over Time: returns historical, indexed data for when the keyword was searched most as shown on Google Trends' Interest Over Time section.

  • Historical Hourly Interest: returns historical, indexed, hourly data for when the keyword was searched most as shown on Google Trends' Interest Over Time section. It sends multiple requests to Google, each retrieving one week of hourly data. It seems like this would be the only way to get historical, hourly data.

  • Interest by Region: returns data for where the keyword is most searched as shown on Google Trends' Interest by Region section.

  • Related Topics: returns data for the related keywords to a provided keyword shown on Google Trends' Related Topics section.

  • Related Queries: returns data for the related keywords to a provided keyword shown on Google Trends' Related Queries section.

  • Trending Searches: returns data for latest trending searches shown on Google Trends' Trending Searches section.

  • Top Charts: returns the data for a given topic shown in Google Trends' Top Charts section.

  • Suggestions: returns a list of additional suggested keywords that can be used to refine a trend search.

Common API parameters

Many API methods use the following:

  • kw_list

    • keywords to get data for

    • Example ['Pizza']

    • Up to five terms in a list: ['Pizza', 'Italian', 'Spaghetti', 'Breadsticks', 'Sausage']

      • Advanced Keywords

        • When using Google Trends dashboard Google may provide suggested narrowed search terms.
        • For example "iron" will have a drop down of "Iron Chemical Element, Iron Cross, Iron Man, etc".
        • Find the encoded topic by using the pytrends.suggestions() function and choose the most relevant one for you.
        • For example: https://www.google.com/trends/explore#q=%2Fm%2F025rw19&cmpt=q
        • "%2Fm%2F025rw19" is the topic "Iron Chemical Element"; use this value with pytrends
  • cat

    • Category to narrow results
    • Find available categories by inspecting the URL when manually using Google Trends. The category starts after cat= and ends before the next &. Alternatively, view the wiki page containing all available categories.
    • For example: "https://www.google.com/trends/explore#q=pizza&cat=71"
    • '71' is the category
    • Defaults to no category
  • geo

    • Two letter country abbreviation
    • For example United States is 'US'
    • Defaults to World
    • More detail is available for States/Provinces by specifying additional abbreviations
    • For example: Alabama would be 'US-AL'
    • For example: England would be 'GB-ENG'
  • tz

    • Timezone offset
    • For example US CST is '360'

  • timeframe

    • Date to start from

    • Defaults to the last 5 years, 'today 5-y'.

    • Everything: 'all'

    • Specific dates: 'YYYY-MM-DD YYYY-MM-DD', for example '2016-12-14 2017-01-25'

    • Specific datetimes: 'YYYY-MM-DDTHH YYYY-MM-DDTHH', for example '2017-02-06T10 2017-02-12T07'

      • Note: the time component is based on UTC
    • Current Time Minus Time Pattern:

      • By Month: 'today #-m' where # is the number of months back from today to pull data for

        • For example: 'today 3-m' would get data from 3 months ago until today
        • NOTE: Google uses the UTC date as 'today'
        • Works for 1, 3, 12 months only!
      • Daily: 'now #-d' where # is the number of days back from now to pull data for

        • For example: 'now 7-d' would get data from the last week
        • Works for 1, 7 days only!
      • Hourly: 'now #-H' where # is the number of hours back from now to pull data for

        • For example: 'now 1-H' would get data from the last hour
        • Works for 1, 4 hours only!
  • gprop

    • What Google property to filter to
    • Example 'images'
    • Defaults to web searches
    • Can be images, news, youtube or froogle (for Google Shopping results)
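The explicit timeframe formats listed above are plain strings, so they can be assembled with the standard library. The sketch below builds the 'YYYY-MM-DD YYYY-MM-DD' and 'YYYY-MM-DDTHH YYYY-MM-DDTHH' forms. The helper names are hypothetical and not part of pytrends.

```python
from datetime import date, datetime


def date_timeframe(start, end):
    """Format two dates as 'YYYY-MM-DD YYYY-MM-DD'."""
    if start > end:
        raise ValueError("start must not be after end")
    return f"{start:%Y-%m-%d} {end:%Y-%m-%d}"


def hourly_timeframe(start, end):
    """Format two datetimes (interpreted as UTC) as 'YYYY-MM-DDTHH YYYY-MM-DDTHH'."""
    if start > end:
        raise ValueError("start must not be after end")
    return f"{start:%Y-%m-%dT%H} {end:%Y-%m-%dT%H}"


print(date_timeframe(date(2016, 12, 14), date(2017, 1, 25)))
print(hourly_timeframe(datetime(2017, 2, 6, 10), datetime(2017, 2, 12, 7)))
```

The resulting strings match the two examples given for the timeframe parameter and could be passed as timeframe= to build_payload.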

Interest Over Time

pytrends.interest_over_time()

Returns pandas.DataFrame

Historical Hourly Interest

pytrends.get_historical_interest(kw_list, year_start=2018, month_start=1, day_start=1, hour_start=0, year_end=2018, month_end=2, day_end=1, hour_end=0, cat=0, geo='', gprop='', sleep=0)

Parameters

  • kw_list

    • Required
    • list of keywords for which you would like the historical data
  • year_start, month_start, day_start, hour_start, year_end, month_end, day_end, hour_end

    • the time period for which you would like the historical data
  • sleep

    • If you are rate-limited by Google, you should set this parameter to a number of seconds (e.g. 60) to space out the API calls.

Returns pandas.DataFrame
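As described under API Methods, the hourly history is collected by sending several requests that each cover one week. The sketch below shows only how a date range could be cut into such one-week windows. It illustrates the idea and is not the library's actual implementation; weekly_windows is a hypothetical name.

```python
from datetime import datetime, timedelta


def weekly_windows(start, end):
    """Yield (window_start, window_end) pairs of at most 7 days covering [start, end]."""
    step = timedelta(days=7)
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


windows = list(weekly_windows(datetime(2018, 1, 1, 0), datetime(2018, 2, 1, 0)))
for window_start, window_end in windows:
    # Same 'YYYY-MM-DDTHH YYYY-MM-DDTHH' form as the timeframe parameter
    print(f"{window_start:%Y-%m-%dT%H} {window_end:%Y-%m-%dT%H}")
```

For the default arguments shown above (2018-01-01 to 2018-02-01) this yields five windows, the last one shorter than a week.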

Interest by Region

pytrends.interest_by_region(resolution='COUNTRY', inc_low_vol=True, inc_geo_code=False)

Parameters

  • resolution

    • 'CITY' returns city level data
    • 'COUNTRY' returns country level data
    • 'DMA' returns Metro level data
    • 'REGION' returns Region level data
  • inc_low_vol

    • True/False (includes google trends data for low volume countries/regions as well)
  • inc_geo_code

    • True/False (includes ISO codes of countries along with the names in the data)

Returns pandas.DataFrame

Related Topics

pytrends.related_topics()

Returns dictionary of pandas.DataFrames

Related Queries

pytrends.related_queries()

Returns dictionary of pandas.DataFrames

Trending Searches

pytrends.trending_searches(pn='united_states') # trending searches in real time for United States
pytrends.trending_searches(pn='japan') # Japan

Returns pandas.DataFrame

Top Charts

pytrends.top_charts(date, hl='en-US', tz=300, geo='GLOBAL')

Parameters

  • date

    • Required
    • YYYY integer
    • Example 2019 for the year 2019 Top Chart data
    • Note: Google removed support for monthly queries (e.g. YYYY-MM)
    • Note: Google does not return data for the current year

Returns pandas.DataFrame

Suggestions

pytrends.suggestions(keyword)

Parameters

  • keyword

    • Required
    • keyword to get suggestions for

Returns dictionary

Categories

pytrends.categories()

Returns dictionary

Caveats

  • This is not an official or supported API
  • Google may change aggregation level for items with very large or very small search volume
  • Rate Limit is not publicly known, let me know if you have a consistent estimate
    • One user reports that 1,400 sequential requests of a 4-hour timeframe got them to the limit. (Replicated on 2 networks)
    • In testing, 60 seconds of sleep between requests (successful or not) was the right amount once you reach the limit.
  • For certain configurations the dependency lib certifi requires the environment variable REQUESTS_CA_BUNDLE to be explicitly set and exported. This variable must contain the path where the ca-certificates are saved; otherwise an SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] error is raised at runtime.
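The 60-second pause mentioned above can be wrapped around any request. The sketch below is one possible pattern and is not part of pytrends: call_with_pause is a hypothetical helper, and it retries on any exception because the exact exception raised on a rate limit depends on the pytrends version.

```python
import time


def call_with_pause(func, attempts=3, pause=60, sleep=time.sleep):
    """Call func(); on failure wait `pause` seconds and retry, up to `attempts` tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as error:  # assumption: rate-limit errors surface as exceptions
            last_error = error
            if attempt < attempts - 1:
                # Wait before the next try; no wait after the final failure.
                sleep(pause)
    raise last_error
```

For example, call_with_pause(pytrends.interest_over_time) would retry the request up to three times, sleeping 60 seconds between tries. The sleep argument exists only so the pause can be replaced in tests.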

Comments
  • 429 After first request?

    Howdy,

    I'm attempting to use the library and I'm getting hit with a 429 error after copying the example code. Here's my script:

    from pytrends.request import TrendReq
    
    pytrends = TrendReq(hl='en-US', tz=360)
    kw_list = ["Blockchain"]
    pytrends.build_payload(kw_list, cat=0, timeframe='today 5-y', geo='', gprop='')
    pytrends.interest_over_time()
    

    I can visit the trends website fine, and I can copy and paste the URL produced by the API and get a json file just fine. It's hard for me to imagine being rate limited on my first request and still being able to visit the site normally.

    Any ideas?

    opened by alexsullivan114 41
  • RateLimitError

    Hi!

    Thank you for the updates of the code. I tried to run the new updated version. After about 10 downloads, I receive the following traceback:

    Traceback (most recent call last):
      File "C:/Users/Documents/Python Scripts/collect_gtrends.py", line 34, in <module>
        trend=pytrend.trend(trend_payload, return_type='dataframe')
      File "C:\Users\AppData\Roaming\Python\Python27\site-packages\pytrends\request.py", line 62, in trend
        raise RateLimitError
    pytrends.request.RateLimitError

    I don't think this is the quota limit problem. Maybe I was downloading too frequently? How many seconds do you wait between requests? My current program sleeps for 5-10 seconds. Is that not enough? Thank you!

    opened by sarahjohns 29
  • Is it just me or have 429 errors really increased these past few days?

    I used to get a 429 error every time I requested more than 6 items per hour or so.

    But recently, especially today, I am not able to request more than 1 per hour without getting a 429. Is it just my IP acting up?

    opened by igalci 23
  • ModuleNotFoundError: No module named 'pandas.io.json.normalize'

    I have the pandas, lxml, numpy and json modules, but I get this error when I run the pytrends example code:

    ModuleNotFoundError: No module named 'pandas.io.json.normalize'

    opened by kubilaykilinc 20
  • Python 2 compatibility issues

    Hi all,

    I'm having trouble using this library with Python 2 when there is an error in the response. The JSONDecodeError that is caught when parsing the response is not defined in Python 2 (as stated in https://docs.python.org/3/library/json.html#json.JSONDecodeError). The same page states that JSONDecodeError is a subclass of ValueError, which could be caught instead for the Python 2 version.

    Thanks, Luca

    bug 
    opened by covix 20
  • Google Quota limit - IP Address changer

    Hi, as mentioned in previous issues, I hit the Google quota limit after barely 10 requests. I tried to change my IP address by routing the requests through Tor, but I have not yet been able to bypass the limitation. I raised the issue on the following page: http://stackoverflow.com/questions/40406458/google-trends-quota-limit-ip-address-changer

    opened by jblemoine 18
  • interest_over_time doesn't work

    Hi, I have the following issue:

    Using your example I execute the following code:

    pytrend.build_payload(kw_list=['pizza', 'bagel'])
    pytrend.interest_over_time()

    After the last call I get "ValueError: year is out of range".

    And the following call:

    pytrend.interest_by_region()

    gives me "ValueError: No JSON object could be decoded".

    At the same time pytrend.related_queries() works well.

    What could be wrong here?

    opened by FourthWiz 15
  • request for tests

    Merging PRs is difficult for the maintainers because of a lack of robust tests. If somebody writes a broad set of tests, it will significantly improve the ability to merge updates with less risk.

    Thank you for your help!

    help wanted good first issue 
    opened by emlazzarin 14
  • Script stopped working.. 400 Bad Request error

    I have been using this Python script for a long time, and suddenly today it stopped working. I am getting a 400 Bad Request error and am not able to download any Google Trends CSV file from the script.

    I am getting the error for the connector as well: "connector = pyGTrends(google_username, google_password)"

    I think this is the main issue.

    opened by ravimevcha 14
  • Trends with daily granularity

    I think this is not an issue, because Google decides the granularity of the results (daily, weekly or monthly) depending on the search time frame. So I decided to implement a method that splits big time frames into smaller ones (e.g. 90 days) with a one-day overlap to normalize the scale between the data. Do you think this could be a good improvement for pytrends?

    help wanted 
    opened by bigme666 13
  • SSL cert failure on VPN

    When I try to log in using TrendReq(..) I get an SSL error. From what I've been able to figure out, the form at https://accounts.google.com/ServiceLogin doesn't seem to accept the Passwd argument until you make the request with the email, and then you have to make a new request with the password. Not sure if this is on the right track or I'm just being an idiot.

    opened by ZenW00kie 13
  • Combination of interest_over_time() and interest_by_region()

    Hey there, I was wondering if it is possible to request data over time with a defined geographical resolution. Currently, it is only possible to have either a temporal or a spatial differentiation, but not both at the same time. Since different Google Trends API URLs are used for the two requests, I think Google Trends may restrict this option. Thanks!

    opened by MoritzDPTV 0
  • Getting incomplete data requesting timeframe=all

    I'm searching for a specific term and my request comes back with a data gap, while the data shown on the Google website and the downloaded data are complete. When I request data for the specific gap, I get the correct data points.

    kw_list = ['Arbeitslosigkeit']
    pytrends.build_payload(kw_list, cat=0, timeframe='2010-01-01 2012-12-25', geo='DE', gprop='')

    pytrends.build_payload(kw_list, cat=0, timeframe='all', geo='DE', gprop='')

    opened by RogerRendon 1
  •  Cannot get the same result as the webpage

    When I use interest_over_time() I cannot get the same result as the webpage. I noticed that the webpage's headers['req'] is different from the one sent by the library. I changed it to match the webpage's, but I still cannot get the same result. What should I do?

    The webpage's headers['req'] is shown below. Some fields do not seem to have existed before. Is this the reason? req: {"time":"2004-01-01 2022-11-17","resolution":"MONTH","locale":"zh-CN","comparisonItem":[{"geo":{"country":"BR"},"complexKeywordsRestriction":{"keyword":[{"type":"ENTITY","value":"/m/01hpbc"}]}},{"geo":{"country":"BR"},"complexKeywordsRestriction":{"keyword":[{"type":"ENTITY","value":"/g/11dymw9wxl"}]}}],"requestOptions":{"property":"","backend":"IZG","category":0},"userConfig":{"userType":"USER_TYPE_LEGIT_USER"}}

    opened by jmz1996 2
  •  Interest_over_time missing data

    Today I started facing an issue with the Interest_over_time missing data.

    The trend data just drops to 0 for about a year or so then the data picks back up.

    Last night I had no issues then this morning it started. Tested on multiple machines and different networks.

    For example, try running Interest_over_time for the keyword "barefoot shoes" you'll see around 2020 the data goes to 0 and then returns to normal.

    It only happens for some keywords while others are fine.

    Anyone else facing this issue?

    opened by nicktba 3
  • Newbie: specification of years "today 5-y" works but not "today 10-y"

    This could be a newbie issue. Are there restrictions on the years valid in the timeframe parameter? I can get the payload to work with "today 5-y" but not "today 10-y". The "all" value works. I note that in other parts of the API there are specific limits. Are years restricted to 5 or all? Thanks

    opened by loquor 0
  • No way to know what changed between versions

    Currently there is no way to know what changed between versions except to download both versions from PyPI and check the differences in the source code. This makes it very risky to depend on this library for anything non-amateurish.

    Please consider adding one or more of the following:

    • Release in Github with changelog.
    • Annotated tags in the commit where the version is released.
    • Add a CHANGELOG.md file to the repo with a header for every version released; bonus points if you follow the Keep a changelog format.
    • Add a changelog section in the README.md with a header for every version released.
    opened by Terseus 2
Releases(v4.8.0)
Owner
General Mills