Python wrapper for Wikipedia

Overview

Wikipedia API

Wikipedia-API is easy to use Python wrapper for Wikipedias' API. It supports extracting texts, sections, links, categories, translations, etc from Wikipedia. Documentation provides code snippets for the most common use cases.

build status Documentation Status Test Coverage Version Py Versions GitHub stars

Installation

This package requires at least Python 3.4 to install because it's using IntEnum.

pip3 install wikipedia-api

Usage

Goal of Wikipedia-API is to provide simple and easy to use API for retrieving informations from Wikipedia. Bellow are examples of common use cases.

Importing

import wikipediaapi

How To Get Single Page

Getting single page is straightforward. You have to initialize Wikipedia object and ask for page by its name. It's parameter language has be one of supported languages.

import wikipediaapi
    wiki_wiki = wikipediaapi.Wikipedia('en')

    page_py = wiki_wiki.page('Python_(programming_language)')

How To Check If Wiki Page Exists

For checking, whether page exists, you can use function exists.

page_py = wiki_wiki.page('Python_(programming_language)')
print("Page - Exists: %s" % page_py.exists())
# Page - Exists: True

page_missing = wiki_wiki.page('NonExistingPageWithStrangeName')
print("Page - Exists: %s" %     page_missing.exists())
# Page - Exists: False

How To Get Page Summary

Class WikipediaPage has property summary, which returns description of Wiki page.

import wikipediaapi
    wiki_wiki = wikipediaapi.Wikipedia('en')

    print("Page - Title: %s" % page_py.title)
    # Page - Title: Python (programming language)

    print("Page - Summary: %s" % page_py.summary[0:60])
    # Page - Summary: Python is a widely used high-level programming language for

How To Get Page URL

WikipediaPage has two properties with URL of the page. It is fullurl and canonicalurl.

print(page_py.fullurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

print(page_py.canonicalurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

How To Get Full Text

To get full text of Wikipedia page you should use property text which constructs text of the page as concatanation of summary and sections with their titles and texts.

wiki_wiki = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.WIKI
)

p_wiki = wiki_wiki.page("Test 1")
print(p_wiki.text)
# Summary
# Section 1
# Text of section 1
# Section 1.1
# Text of section 1.1
# ...


wiki_html = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.HTML
)
p_html = wiki_html.page("Test 1")
print(p_html.text)
# <p>Summary</p>
# <h2>Section 1</h2>
# <p>Text of section 1</p>
# <h3>Section 1.1</h3>
# <p>Text of section 1.1</p>
# ...

How To Get Page Sections

To get all top level sections of page, you have to use property sections. It returns list of WikipediaPageSection, so you have to use recursion to get all subsections.

def print_sections(sections, level=0):
        for s in sections:
                print("%s: %s - %s" % ("*" * (level + 1), s.title, s.text[0:40]))
                print_sections(s.sections, level + 1)


print_sections(page_py.sections)
# *: History - Python was conceived in the late 1980s,
# *: Features and philosophy - Python is a multi-paradigm programming l
# *: Syntax and semantics - Python is meant to be an easily readable
# **: Indentation - Python uses whitespace indentation, rath
# **: Statements and control flow - Python's statements include (among other
# **: Expressions - Some Python expressions are similar to l

How To Get Page In Other Languages

If you want to get other translations of given page, you should use property langlinks. It is map, where key is language code and value is WikipediaPage.

def print_langlinks(page):
        langlinks = page.langlinks
        for k in sorted(langlinks.keys()):
            v = langlinks[k]
            print("%s: %s - %s: %s" % (k, v.language, v.title, v.fullurl))

print_langlinks(page_py)
# af: af - Python (programmeertaal): https://af.wikipedia.org/wiki/Python_(programmeertaal)
# als: als - Python (Programmiersprache): https://als.wikipedia.org/wiki/Python_(Programmiersprache)
# an: an - Python: https://an.wikipedia.org/wiki/Python
# ar: ar - بايثون: https://ar.wikipedia.org/wiki/%D8%A8%D8%A7%D9%8A%D8%AB%D9%88%D9%86
# as: as - পাইথন: https://as.wikipedia.org/wiki/%E0%A6%AA%E0%A6%BE%E0%A6%87%E0%A6%A5%E0%A6%A8

page_py_cs = page_py.langlinks['cs']
print("Page - Summary: %s" % page_py_cs.summary[0:60])
# Page - Summary: Python (anglická výslovnost [ˈpaiθtən]) je vysokoúrovňový sk

How To Get Links To Other Pages

If you want to get all links to other wiki pages from given page, you need to use property links. It's map, where key is page title and value is WikipediaPage.

def print_links(page):
        links = page.links
        for title in sorted(links.keys()):
            print("%s: %s" % (title, links[title]))

print_links(page_py)
# 3ds Max: 3ds Max (id: ??, ns: 0)
# ?:: ?: (id: ??, ns: 0)
# ABC (programming language): ABC (programming language) (id: ??, ns: 0)
# ALGOL 68: ALGOL 68 (id: ??, ns: 0)
# Abaqus: Abaqus (id: ??, ns: 0)
# ...

How To Get Page Categories

If you want to get all categories under which page belongs, you should use property categories. It's map, where key is category title and value is WikipediaPage.

def print_categories(page):
        categories = page.categories
        for title in sorted(categories.keys()):
            print("%s: %s" % (title, categories[title]))


print("Categories")
print_categories(page_py)
# Category:All articles containing potentially dated statements: ...
# Category:All articles with unsourced statements: ...
# Category:Articles containing potentially dated statements from August 2016: ...
# Category:Articles containing potentially dated statements from March 2017: ...
# Category:Articles containing potentially dated statements from September 2017: ...

How To Get All Pages From Category

To get all pages from given category, you should use property categorymembers. It returns all members of given category. You have to implement recursion and deduplication by yourself.

def print_categorymembers(categorymembers, level=0, max_level=1):
        for c in categorymembers.values():
            print("%s: %s (ns: %d)" % ("*" * (level + 1), c.title, c.ns))
            if c.ns == wikipediaapi.Namespace.CATEGORY and level < max_level:
                print_categorymembers(c.categorymembers, level=level + 1, max_level=max_level)


cat = wiki_wiki.page("Category:Physics")
print("Category members: Category:Physics")
print_categorymembers(cat.categorymembers)

# Category members: Category:Physics
# * Statistical mechanics (ns: 0)
# * Category:Physical quantities (ns: 14)
# ** Refractive index (ns: 0)
# ** Vapor quality (ns: 0)
# ** Electric susceptibility (ns: 0)
# ** Specific weight (ns: 0)
# ** Category:Viscosity (ns: 14)
# *** Brookfield Engineering (ns: 0)

How To See Underlying API Call

If you have problems with retrieving data you can get URL of undrerlying API call. This will help you determine if the problem is in the library or somewhere else.

import wikipediaapi
import sys
wikipediaapi.log.setLevel(level=wikipediaapi.logging.DEBUG)

# Set handler if you use Python in interactive mode
out_hdlr = wikipediaapi.logging.StreamHandler(sys.stderr)
out_hdlr.setFormatter(wikipediaapi.logging.Formatter('%(asctime)s %(message)s'))
out_hdlr.setLevel(wikipediaapi.logging.DEBUG)
wikipediaapi.log.addHandler(out_hdlr)

wiki = wikipediaapi.Wikipedia(language='en')

page_ostrava = wiki.page('Ostrava')
print(page_ostrava.summary)
# logger prints out: Request URL: http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Ostrava&explaintext=1&exsectionformat=wiki

External Links

Other Badges

Code Climate Issue Count Coveralls Version Py Versions implementations Downloads Tags github-release Github commits (since latest release) GitHub forks GitHub stars GitHub watchers GitHub commit activity the past week, 4 weeks, year Last commit GitHub code size in bytes GitHub repo size in bytes PyPi License PyPi Wheel PyPi Format PyPi PyVersions PyPi Implementations PyPi Status PyPi Downloads - Day PyPi Downloads - Week PyPi Downloads - Month Libraries.io - SourceRank Libraries.io - Dependent Repos

Other Pages

.. toctree::
        :maxdepth: 2

        API
        CHANGES
        DEVELOPMENT
        wikipediaapi/api

Owner
Martin Majlis
Martin Majlis
A demo titiler for Sentinel 2 Digital Twin dataset

This is a DEMO custom api built on top of TiTiler to create Web Map Tiles from the Digital Twin Sentinel-2 COG created by Sinergise

Development Seed 26 May 21, 2022
GitPython is a python library used to interact with Git repositories.

Gitoxide: A peek into the future… I started working on GitPython in 2009, back in the days when Python was 'my thing' and I had great plans with it. O

3.8k Jan 03, 2023
Repo-cloner - Script takes user public liked repos and clone it to a local folder

Liked repos cloner Script takes user public liked repos and clone it to a local

Aleksei 2 Jun 18, 2022
Cleiton Leonel 4 Apr 22, 2022
Telegram bot implementing Lex Arcana using python-telegram-bot library.

Lex Arcana Telegram Bot 🤖 Telegram bot implementing Lex Arcana using python-telegram-bot library. This bot was evaluated for the course "Computer Eng

Nicolò Sonnino 6 Jun 22, 2022
A clean, easy to scale discord bot template

A clean, easy to scale discord bot template. Develope using nextcord library and can be use with any other discord.py forked library.

めがねこ 3 Mar 03, 2022
🐍 VerificaC19 SDK implementation for Python

VerificaC19 Python SDK 🐍 VerificaC19 SDK implementation for Python. Requirements Python version = 3.7 Make sure zbar is installed in your system For

Lotrèk 10 Jan 14, 2022
A simple, multipurpose Discord bot.

EpicBot 🏅 A simple, multipurpose Discord bot. • Info EpicBot is a multipurpose Discord bot that was designed to make your Discord life easier and coo

Nirlep_5252_ 130 Dec 29, 2022
Upload on Doodstream by Url, File and also by direct forward post from other channel...

Upload on Doodstream by Url, File and also by direct forward post from other channel...

Pʀᴇᴅᴀᴛᴏʀ 8 Aug 10, 2022
Upvotes and karma for Discord: Heart 💗 or Crush 💔 a comment to give points to an user, or Star ⭐ it to add it to the Best Of!

🤖 Reto Reto is a community-oriented Discord bot, featuring a karma system, a way to reward the best comments, leaderboards, and so much more! React t

Erik Bianco Vera 3 May 07, 2022
Automation that uses Github Actions, Google Drive API, YouTube Data API and youtube-dl together to feed BackJam app with new music

Automation that uses Github Actions, Google Drive API, YouTube Data API and youtube-dl together to feed BackJam app with new music

Antônio Oliveira 1 Nov 21, 2021
Project glow is an open source bot worked on by many people to create a good and safe moderation bot for all

Project Glow Greetings, I see you have stumbled upon project glow. Project glow is an open source bot worked on by many people to create a good and sa

Glowstikk 24 Sep 29, 2022
VALORANT rank yoinker lets you retrieve the ranks and basic informations of everyone in the lobby, regardless of gamemode.

vRY VALORANT rank yoinker Retrieve the rank and basic information of everyone in the lobby, regardless of gamemode. Table of Contents Terms of Use Abo

Isaac Kenyon 270 Dec 30, 2022
Python library for RetroMMO related stuff, including API wrapper

python library for RetroMMO related stuff, including API wrapper.

1 Nov 25, 2021
A Telegram Bot written in Python for mirroring files on the Internet to your Google Drive or Telegram

Original Repo mirror-leech-telegram-bot This is a Telegram Bot written in Python for mirroring files on the Internet to your Google Drive or Telegram.

0 Jan 03, 2022
PyLyrics Is An [Open-Source] Bot That Can Help You Get Song Lyrics

PyLyrics-Bot Telegram Bot To Search Song Lyrics From Genuis. 🤖 Demo: 👨‍💻 Deploy: ❤ Deploy Your Own Bot : Star 🌟 Fork 🍴 & Deploy -Easy Way -Self-h

DAMIEN 12 Nov 12, 2022
Autofilter with imdb bot || broakcast , imdb poster and imdb rating

LuciferMoringstar_Robot How To Deploy Video Subscribe YouTube Channel Added Features Imdb posters for autofilter. Imdb rating for autofilter. Custom c

Muhammed 127 Dec 29, 2022
An EmbedBuilder in Python for discord.py embeds. Pip Module.

Discord.py-MaxEmbeds An EmbedBuilder for Discord bots in Python. You need discord.py to use this module. Installation Step 1 First you have to install

Max Tischberger 6 Jan 13, 2022
SmsSender v3.0.0 - the script is designed to send free SMS to any number and with any text.

SmsSender v3.0.0 - скрипт предназначен для бесплатной отправки SMS на любой номер и с любым текстом. Возможны небольшие баги, в скором времени исправл

Андрей Сергеев 20 Dec 03, 2021
Low-level, feature rich and easy to use discord python wrapper

PWRCord Low-level, feature rich and easy to use discord python wrapper Important Note: At this point, this library API is considered unstable and can

MIguel Lopes 1 Dec 26, 2021