Python wrapper for Wikipedia

Overview

Wikipedia API

Wikipedia-API is easy to use Python wrapper for Wikipedias' API. It supports extracting texts, sections, links, categories, translations, etc from Wikipedia. Documentation provides code snippets for the most common use cases.

build status Documentation Status Test Coverage Version Py Versions GitHub stars

Installation

This package requires at least Python 3.4 to install because it's using IntEnum.

pip3 install wikipedia-api

Usage

Goal of Wikipedia-API is to provide simple and easy to use API for retrieving informations from Wikipedia. Bellow are examples of common use cases.

Importing

import wikipediaapi

How To Get Single Page

Getting single page is straightforward. You have to initialize Wikipedia object and ask for page by its name. It's parameter language has be one of supported languages.

import wikipediaapi
    wiki_wiki = wikipediaapi.Wikipedia('en')

    page_py = wiki_wiki.page('Python_(programming_language)')

How To Check If Wiki Page Exists

For checking, whether page exists, you can use function exists.

page_py = wiki_wiki.page('Python_(programming_language)')
print("Page - Exists: %s" % page_py.exists())
# Page - Exists: True

page_missing = wiki_wiki.page('NonExistingPageWithStrangeName')
print("Page - Exists: %s" %     page_missing.exists())
# Page - Exists: False

How To Get Page Summary

Class WikipediaPage has property summary, which returns description of Wiki page.

import wikipediaapi
    wiki_wiki = wikipediaapi.Wikipedia('en')

    print("Page - Title: %s" % page_py.title)
    # Page - Title: Python (programming language)

    print("Page - Summary: %s" % page_py.summary[0:60])
    # Page - Summary: Python is a widely used high-level programming language for

How To Get Page URL

WikipediaPage has two properties with URL of the page. It is fullurl and canonicalurl.

print(page_py.fullurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

print(page_py.canonicalurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

How To Get Full Text

To get full text of Wikipedia page you should use property text which constructs text of the page as concatanation of summary and sections with their titles and texts.

wiki_wiki = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.WIKI
)

p_wiki = wiki_wiki.page("Test 1")
print(p_wiki.text)
# Summary
# Section 1
# Text of section 1
# Section 1.1
# Text of section 1.1
# ...


wiki_html = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.HTML
)
p_html = wiki_html.page("Test 1")
print(p_html.text)
# <p>Summary</p>
# <h2>Section 1</h2>
# <p>Text of section 1</p>
# <h3>Section 1.1</h3>
# <p>Text of section 1.1</p>
# ...

How To Get Page Sections

To get all top level sections of page, you have to use property sections. It returns list of WikipediaPageSection, so you have to use recursion to get all subsections.

def print_sections(sections, level=0):
        for s in sections:
                print("%s: %s - %s" % ("*" * (level + 1), s.title, s.text[0:40]))
                print_sections(s.sections, level + 1)


print_sections(page_py.sections)
# *: History - Python was conceived in the late 1980s,
# *: Features and philosophy - Python is a multi-paradigm programming l
# *: Syntax and semantics - Python is meant to be an easily readable
# **: Indentation - Python uses whitespace indentation, rath
# **: Statements and control flow - Python's statements include (among other
# **: Expressions - Some Python expressions are similar to l

How To Get Page In Other Languages

If you want to get other translations of given page, you should use property langlinks. It is map, where key is language code and value is WikipediaPage.

def print_langlinks(page):
        langlinks = page.langlinks
        for k in sorted(langlinks.keys()):
            v = langlinks[k]
            print("%s: %s - %s: %s" % (k, v.language, v.title, v.fullurl))

print_langlinks(page_py)
# af: af - Python (programmeertaal): https://af.wikipedia.org/wiki/Python_(programmeertaal)
# als: als - Python (Programmiersprache): https://als.wikipedia.org/wiki/Python_(Programmiersprache)
# an: an - Python: https://an.wikipedia.org/wiki/Python
# ar: ar - بايثون: https://ar.wikipedia.org/wiki/%D8%A8%D8%A7%D9%8A%D8%AB%D9%88%D9%86
# as: as - পাইথন: https://as.wikipedia.org/wiki/%E0%A6%AA%E0%A6%BE%E0%A6%87%E0%A6%A5%E0%A6%A8

page_py_cs = page_py.langlinks['cs']
print("Page - Summary: %s" % page_py_cs.summary[0:60])
# Page - Summary: Python (anglická výslovnost [ˈpaiθtən]) je vysokoúrovňový sk

How To Get Links To Other Pages

If you want to get all links to other wiki pages from given page, you need to use property links. It's map, where key is page title and value is WikipediaPage.

def print_links(page):
        links = page.links
        for title in sorted(links.keys()):
            print("%s: %s" % (title, links[title]))

print_links(page_py)
# 3ds Max: 3ds Max (id: ??, ns: 0)
# ?:: ?: (id: ??, ns: 0)
# ABC (programming language): ABC (programming language) (id: ??, ns: 0)
# ALGOL 68: ALGOL 68 (id: ??, ns: 0)
# Abaqus: Abaqus (id: ??, ns: 0)
# ...

How To Get Page Categories

If you want to get all categories under which page belongs, you should use property categories. It's map, where key is category title and value is WikipediaPage.

def print_categories(page):
        categories = page.categories
        for title in sorted(categories.keys()):
            print("%s: %s" % (title, categories[title]))


print("Categories")
print_categories(page_py)
# Category:All articles containing potentially dated statements: ...
# Category:All articles with unsourced statements: ...
# Category:Articles containing potentially dated statements from August 2016: ...
# Category:Articles containing potentially dated statements from March 2017: ...
# Category:Articles containing potentially dated statements from September 2017: ...

How To Get All Pages From Category

To get all pages from given category, you should use property categorymembers. It returns all members of given category. You have to implement recursion and deduplication by yourself.

def print_categorymembers(categorymembers, level=0, max_level=1):
        for c in categorymembers.values():
            print("%s: %s (ns: %d)" % ("*" * (level + 1), c.title, c.ns))
            if c.ns == wikipediaapi.Namespace.CATEGORY and level < max_level:
                print_categorymembers(c.categorymembers, level=level + 1, max_level=max_level)


cat = wiki_wiki.page("Category:Physics")
print("Category members: Category:Physics")
print_categorymembers(cat.categorymembers)

# Category members: Category:Physics
# * Statistical mechanics (ns: 0)
# * Category:Physical quantities (ns: 14)
# ** Refractive index (ns: 0)
# ** Vapor quality (ns: 0)
# ** Electric susceptibility (ns: 0)
# ** Specific weight (ns: 0)
# ** Category:Viscosity (ns: 14)
# *** Brookfield Engineering (ns: 0)

How To See Underlying API Call

If you have problems with retrieving data you can get URL of undrerlying API call. This will help you determine if the problem is in the library or somewhere else.

import wikipediaapi
import sys
wikipediaapi.log.setLevel(level=wikipediaapi.logging.DEBUG)

# Set handler if you use Python in interactive mode
out_hdlr = wikipediaapi.logging.StreamHandler(sys.stderr)
out_hdlr.setFormatter(wikipediaapi.logging.Formatter('%(asctime)s %(message)s'))
out_hdlr.setLevel(wikipediaapi.logging.DEBUG)
wikipediaapi.log.addHandler(out_hdlr)

wiki = wikipediaapi.Wikipedia(language='en')

page_ostrava = wiki.page('Ostrava')
print(page_ostrava.summary)
# logger prints out: Request URL: http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Ostrava&explaintext=1&exsectionformat=wiki

External Links

Other Badges

Code Climate Issue Count Coveralls Version Py Versions implementations Downloads Tags github-release Github commits (since latest release) GitHub forks GitHub stars GitHub watchers GitHub commit activity the past week, 4 weeks, year Last commit GitHub code size in bytes GitHub repo size in bytes PyPi License PyPi Wheel PyPi Format PyPi PyVersions PyPi Implementations PyPi Status PyPi Downloads - Day PyPi Downloads - Week PyPi Downloads - Month Libraries.io - SourceRank Libraries.io - Dependent Repos

Other Pages

.. toctree::
        :maxdepth: 2

        API
        CHANGES
        DEVELOPMENT
        wikipediaapi/api

Owner
Martin Majlis
Martin Majlis
List of twitch bots n bigots

This is a collection of bot account names NamelistMASTER contains all the names we reccomend you ban in your channel Sometimes people get on that list

62 Sep 05, 2021
A plugin for modmail-bot for stealing,making ,etc emojis

EmojiPlugin for the Modmail-bot My first plugin .. its very Basic I will make more and better too Only 3 commands for now emojiadd-supports .jpg, .png

1 Dec 28, 2021
A Recommendation System For Diabetes Detection And Treatment

Diabetes-detection-tg-bot A Recommendation System For Diabetes Detection And Treatment Данная система помогает определить наличие или отсутствие сахар

Alexander Kanonirov 1 Nov 22, 2021
Documentation and Samples for the Official HN API

Hacker News API Overview In partnership with Firebase, we're making the public Hacker News data available in near real time. Firebase enables easy acc

Y Combinator Hacker News 9.6k Jan 03, 2023
Unofficial GoPro API Library for Python - connect to GoPro via WiFi.

GoPro API for Python Unofficial GoPro API Library for Python - connect to GoPro cameras via WiFi. Compatibility: HERO3 HERO3+ HERO4 (including HERO Se

Konrad Iturbe 1.3k Jan 01, 2023
radiant discord anti nuke src leaked lol.

radiant-anti-wizz-leaked radiant discord anti nuke src leaked lol, the whole anti sucks but idc. sucks to suck thats tuff bro LMAOOOOOO join my server

ok 15 Aug 06, 2022
Companion "receiver" to matrix-appservice-webhooks for [matrix].

Matrix Webhook Receiver Companion "receiver" to matrix-appservice-webhooks for [matrix]. The purpose of this app is to listen for generic webhook mess

Kim Brose 13 Sep 29, 2022
Python binding for Microsoft LightGBM

pyLightGBM: python binding for Microsoft LightGBM Features: Regression, Classification (binary, multi class) Feature importance (clf.feature_importanc

Ardalan 330 Nov 18, 2022
Rotates Amazon Personalize filters on a schedule based on dynamic templates

Amazon Personalize Filter Rotation This project contains the source code and supporting files for deploying a serverless application that provides aut

James Jory 2 Nov 12, 2021
Unofficial Coinbase Python Library

Unofficial Coinbase Python Library Python Library for the Coinbase API for use with three legged oAuth2 and classic API key usage Version 0.3.0 Requir

George Sibble 104 Dec 01, 2022
Python library for the DeepL language translation API.

The DeepL API is a language translation API that allows other computer programs to send texts and documents to DeepL's servers and receive high-quality translations. This opens a whole universe of op

DeepL 535 Jan 04, 2023
Aplicação dos metodos de classificação em 3 diferentes banco de dados. Usando...

Machine Learning - Métodos de classificação Base de Dados utilizadas: Dados de crédito Dados do Census Métodos de classificação aplicados: Naive Bayes

1 Jan 18, 2022
DankMemer-Farmer - Autofarm Self-Bot for Discord bot Named Dankmemer.

DankMemer-Farmer Autofarm Self-Bot for Discord bot Named Dankmemer. Warning We are not responsible if you got banned, since "self-bots" outside of the

Mole 16 Dec 14, 2022
A chatbot on Telegram using technologies of cloud computing.

Chatbot This project is about a chatbot on Telegram to study the cloud computing. You can refer to the project of chatbot-deploy which is conveinent f

Jeffery 4 Apr 24, 2022
Discord Bot written in Python that plays music in your voice channel

Discord Bot that plays music! I decided to create a simple Discord bot using Python in order to advance my coding skills. Please don't ask me for help

Eric Yeung 39 Jan 01, 2023
Tools ini hanya bisa digunakan untuk menyerang website atau http/s

☢️ Tawkun DoS ☢️ Tools ini hanya bisa digunakan untuk menyerang website atau http/s FITUR: [ ☯️ ] Proxy Mode [ 🔥 ] SOCKS Mode | Kadang Eror [ ☢️ ] Ht

Bandhitawkunthi 9 Jul 19, 2022
The fastest nuker on discord, Proxy support and more

About okuru nuker is a nuker for discord written in python, It uses methods such as threading and requests to ban faster and perform at higher speeds.

63 Dec 31, 2022
Qbittorrent / Aria2 Mirror & Leech Telegram Bot

This is a Telegram Bot written in Python for mirroring files on the Internet to your Google Drive or Telegram. Based on python-aria-mirror-bot Feature

Hüzünlü Artemis [HuzunluArtemis] 81 Jul 15, 2022
Discord Bot Personnal Server - Ha-Neul

Haneul Bot, it's a discord for help me on my personnal discord, she do a lot of boring and repetitive stain. You can use on your own server if you want, you just need to find a host for the programm

Maxvyr 1 Feb 03, 2022
A surviv.io bot that helps you manage you clan in surviv.io!

Scooter-Surviv.io-Clan-Bot A Surviv.io Discord Bot This is a bot that helps manage your surviv.io clan! Read below for more!!. Features Lets you creat

cosmic|duck 1 Jan 03, 2022