Python wrapper for Wikipedia

Overview

Wikipedia API

Wikipedia-API is easy to use Python wrapper for Wikipedias' API. It supports extracting texts, sections, links, categories, translations, etc from Wikipedia. Documentation provides code snippets for the most common use cases.

build status Documentation Status Test Coverage Version Py Versions GitHub stars

Installation

This package requires at least Python 3.4 to install because it's using IntEnum.

pip3 install wikipedia-api

Usage

Goal of Wikipedia-API is to provide simple and easy to use API for retrieving informations from Wikipedia. Bellow are examples of common use cases.

Importing

import wikipediaapi

How To Get Single Page

Getting single page is straightforward. You have to initialize Wikipedia object and ask for page by its name. It's parameter language has be one of supported languages.

import wikipediaapi
    wiki_wiki = wikipediaapi.Wikipedia('en')

    page_py = wiki_wiki.page('Python_(programming_language)')

How To Check If Wiki Page Exists

For checking, whether page exists, you can use function exists.

page_py = wiki_wiki.page('Python_(programming_language)')
print("Page - Exists: %s" % page_py.exists())
# Page - Exists: True

page_missing = wiki_wiki.page('NonExistingPageWithStrangeName')
print("Page - Exists: %s" %     page_missing.exists())
# Page - Exists: False

How To Get Page Summary

Class WikipediaPage has property summary, which returns description of Wiki page.

import wikipediaapi
    wiki_wiki = wikipediaapi.Wikipedia('en')

    print("Page - Title: %s" % page_py.title)
    # Page - Title: Python (programming language)

    print("Page - Summary: %s" % page_py.summary[0:60])
    # Page - Summary: Python is a widely used high-level programming language for

How To Get Page URL

WikipediaPage has two properties with URL of the page. It is fullurl and canonicalurl.

print(page_py.fullurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

print(page_py.canonicalurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

How To Get Full Text

To get full text of Wikipedia page you should use property text which constructs text of the page as concatanation of summary and sections with their titles and texts.

wiki_wiki = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.WIKI
)

p_wiki = wiki_wiki.page("Test 1")
print(p_wiki.text)
# Summary
# Section 1
# Text of section 1
# Section 1.1
# Text of section 1.1
# ...


wiki_html = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.HTML
)
p_html = wiki_html.page("Test 1")
print(p_html.text)
# <p>Summary</p>
# <h2>Section 1</h2>
# <p>Text of section 1</p>
# <h3>Section 1.1</h3>
# <p>Text of section 1.1</p>
# ...

How To Get Page Sections

To get all top level sections of page, you have to use property sections. It returns list of WikipediaPageSection, so you have to use recursion to get all subsections.

def print_sections(sections, level=0):
        for s in sections:
                print("%s: %s - %s" % ("*" * (level + 1), s.title, s.text[0:40]))
                print_sections(s.sections, level + 1)


print_sections(page_py.sections)
# *: History - Python was conceived in the late 1980s,
# *: Features and philosophy - Python is a multi-paradigm programming l
# *: Syntax and semantics - Python is meant to be an easily readable
# **: Indentation - Python uses whitespace indentation, rath
# **: Statements and control flow - Python's statements include (among other
# **: Expressions - Some Python expressions are similar to l

How To Get Page In Other Languages

If you want to get other translations of given page, you should use property langlinks. It is map, where key is language code and value is WikipediaPage.

def print_langlinks(page):
        langlinks = page.langlinks
        for k in sorted(langlinks.keys()):
            v = langlinks[k]
            print("%s: %s - %s: %s" % (k, v.language, v.title, v.fullurl))

print_langlinks(page_py)
# af: af - Python (programmeertaal): https://af.wikipedia.org/wiki/Python_(programmeertaal)
# als: als - Python (Programmiersprache): https://als.wikipedia.org/wiki/Python_(Programmiersprache)
# an: an - Python: https://an.wikipedia.org/wiki/Python
# ar: ar - بايثون: https://ar.wikipedia.org/wiki/%D8%A8%D8%A7%D9%8A%D8%AB%D9%88%D9%86
# as: as - পাইথন: https://as.wikipedia.org/wiki/%E0%A6%AA%E0%A6%BE%E0%A6%87%E0%A6%A5%E0%A6%A8

page_py_cs = page_py.langlinks['cs']
print("Page - Summary: %s" % page_py_cs.summary[0:60])
# Page - Summary: Python (anglická výslovnost [ˈpaiθtən]) je vysokoúrovňový sk

How To Get Links To Other Pages

If you want to get all links to other wiki pages from given page, you need to use property links. It's map, where key is page title and value is WikipediaPage.

def print_links(page):
        links = page.links
        for title in sorted(links.keys()):
            print("%s: %s" % (title, links[title]))

print_links(page_py)
# 3ds Max: 3ds Max (id: ??, ns: 0)
# ?:: ?: (id: ??, ns: 0)
# ABC (programming language): ABC (programming language) (id: ??, ns: 0)
# ALGOL 68: ALGOL 68 (id: ??, ns: 0)
# Abaqus: Abaqus (id: ??, ns: 0)
# ...

How To Get Page Categories

If you want to get all categories under which page belongs, you should use property categories. It's map, where key is category title and value is WikipediaPage.

def print_categories(page):
        categories = page.categories
        for title in sorted(categories.keys()):
            print("%s: %s" % (title, categories[title]))


print("Categories")
print_categories(page_py)
# Category:All articles containing potentially dated statements: ...
# Category:All articles with unsourced statements: ...
# Category:Articles containing potentially dated statements from August 2016: ...
# Category:Articles containing potentially dated statements from March 2017: ...
# Category:Articles containing potentially dated statements from September 2017: ...

How To Get All Pages From Category

To get all pages from given category, you should use property categorymembers. It returns all members of given category. You have to implement recursion and deduplication by yourself.

def print_categorymembers(categorymembers, level=0, max_level=1):
        for c in categorymembers.values():
            print("%s: %s (ns: %d)" % ("*" * (level + 1), c.title, c.ns))
            if c.ns == wikipediaapi.Namespace.CATEGORY and level < max_level:
                print_categorymembers(c.categorymembers, level=level + 1, max_level=max_level)


cat = wiki_wiki.page("Category:Physics")
print("Category members: Category:Physics")
print_categorymembers(cat.categorymembers)

# Category members: Category:Physics
# * Statistical mechanics (ns: 0)
# * Category:Physical quantities (ns: 14)
# ** Refractive index (ns: 0)
# ** Vapor quality (ns: 0)
# ** Electric susceptibility (ns: 0)
# ** Specific weight (ns: 0)
# ** Category:Viscosity (ns: 14)
# *** Brookfield Engineering (ns: 0)

How To See Underlying API Call

If you have problems with retrieving data you can get URL of undrerlying API call. This will help you determine if the problem is in the library or somewhere else.

import wikipediaapi
import sys
wikipediaapi.log.setLevel(level=wikipediaapi.logging.DEBUG)

# Set handler if you use Python in interactive mode
out_hdlr = wikipediaapi.logging.StreamHandler(sys.stderr)
out_hdlr.setFormatter(wikipediaapi.logging.Formatter('%(asctime)s %(message)s'))
out_hdlr.setLevel(wikipediaapi.logging.DEBUG)
wikipediaapi.log.addHandler(out_hdlr)

wiki = wikipediaapi.Wikipedia(language='en')

page_ostrava = wiki.page('Ostrava')
print(page_ostrava.summary)
# logger prints out: Request URL: http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Ostrava&explaintext=1&exsectionformat=wiki

External Links

Other Badges

Code Climate Issue Count Coveralls Version Py Versions implementations Downloads Tags github-release Github commits (since latest release) GitHub forks GitHub stars GitHub watchers GitHub commit activity the past week, 4 weeks, year Last commit GitHub code size in bytes GitHub repo size in bytes PyPi License PyPi Wheel PyPi Format PyPi PyVersions PyPi Implementations PyPi Status PyPi Downloads - Day PyPi Downloads - Week PyPi Downloads - Month Libraries.io - SourceRank Libraries.io - Dependent Repos

Other Pages

.. toctree::
        :maxdepth: 2

        API
        CHANGES
        DEVELOPMENT
        wikipediaapi/api

Owner
Martin Majlis
Martin Majlis
Simple Webhook Spammer with Optional Proxy Support

😎 �Simple Webhook Spammer with Optional Proxy Support:- [+] git clone https://g

Terminal1337 12 Sep 29, 2022
Feedback-TelegramBot is a resemblance bot which can be deployed on server

Feedback-TelegramBot Feedback-TelegramBot is a resemblance bot which can be deployed on server This work is based on Telegram library, thanks to their

2 Jan 03, 2022
aws-lambda-scheduler lets you call any existing AWS Lambda Function you have in a future time.

aws-lambda-scheduler aws-lambda-scheduler lets you call any existing AWS Lambda Function you have in the future. This functionality is achieved by dyn

Oğuzhan Yılmaz 57 Dec 17, 2022
“ Hey there 👋 I'm Sophia „ TG Group management bot with Some Extra features..

❤️ Sophia ❤️ Avaiilable on Telegram as SophiaBot 🏃‍♂️ Easy Deploy Mandatory Vars [+] Make Sure You Add All These Mandatory Vars. [-] APP_ID: You ca

THEEKSHANA 5 Dec 09, 2021
A discord bot written in python

arch-bot A discord bot written in python prefix: . help: .help Installation Requirements A discord bot token Your user id Python installed. For window

3 Jan 10, 2022
PyFacebook

== PyFacebook == PyFacebook is a Python client library for the Facebook API. Samuel Cormier-Iijima ( Samuel Cormier-Iijima 573 Dec 20, 2022

This is a story bot, that will scrape stories from r/stories subreddit and convert it into an Audio File.

Introduction This is a story bot, that will scrape stories from r/stories subreddit and convert it into an Audio File. Installation pip install -r req

Yasho 11 Jun 30, 2022
An incomplete add-on extension to Pyrogram, to create telegram bots a bit more easily

PyStark A star ⭐ from you means a lot An incomplete add-on extension to Pyrogram

Stark Bots 36 Dec 23, 2022
A discord token grabber made in Python 3

Discord Token Grabber A Discord token grabber written in Python 3. This version of the grabber only supports Windows. Features Transfers via Discord w

Mega145 4 Aug 04, 2022
Reverse engineered connection to the TradingView ticker in Python

Tradingview-ticker Reverse engineered connection to the TradingView ticker in Python. Makes a websocket connection to the Tradeview website and receiv

Aaron 20 Dec 02, 2022
The public discord bot, created by: primitt, further developed by: duino-coin team.

Duino Stats Mini A public Duino-Stats Discord bot. Click this link to invite the bot to your server. License Duino Stats Mini distributed under the MI

primboi 8 Mar 14, 2022
A basic implementation of the Battlesnake API in Python

Getting started with Battlesnake and Python This is a basic implementation of the Battlesnake API in Python. It's a great starting point for anyone wa

Gaurav Batra 2 Dec 08, 2021
Maintained Fork of Jishaku For nextcord

Onami a debugging and utility extension for nextcord bots Read the documentation online. Fork Onami is a actively maintained fork of Jishaku for nextc

RPS 11 Dec 14, 2022
Official API documentation for Highrise

Highrise API The Highrise API is implemented as vanilla XML over HTTP using all four verbs (GET/POST/PUT/DELETE). Every resource, like Person, Deal, o

Basecamp 128 Dec 06, 2022
Simple Reddit bot that replies to comments containing a certain word.

reddit-replier-bot Small comment reply bot based on PRAW. This script will scan the comments of a subreddit as they come in and look for a trigger wor

Kefendy 0 Jun 04, 2022
Modified Version of mega.py package for Pyrogram Bots

Pyro Mega.py Python library for the Mega.co.nz API, currently supporting: login uploading downloading deleting searching sharing renaming moving files

I'm Not A Bot #Left_TG 10 Aug 03, 2022
A simple use library for bot discord.py developers

Discord Bot Template It's a simple use library for bot discord.py developers. Ob

Tir Omar 0 Oct 16, 2022
Telegram Bot For Screenshot Generation.

Screenshotit_bot Telegram Bot For Screenshot Generation. Description An attempt to implement the screenshot generation of telegram files without downl

1 Nov 06, 2021
聚合空间测绘搜索(Fofa,Zoomeye,Quake,Shodan,Censys,BinaryEdge)

#Search-Tools Search-Tools集合比较常见的网络空间探测引擎 Fofa,Zoomeye,Quake,Shodan,Censys,BinaryEdge 简单说明 ICO搜索目前只有Fofa,Shodan,Quake支持 代理设置是防止在API请求过于频繁,或者在实战中,好多红队打

311 Dec 16, 2022
Telegram to TamTam stickers

Telegram to TamTam stickers @tg_stickers TamTam бот, который конвертирует Telegram стикеры в формат TamTam и помогает загрузить их в TamTam. Все делае

Ivan Buymov 22 Nov 01, 2022