Asynchronous Python HTTP Requests for Humans using Futures

Overview

A small add-on for the Python requests HTTP library. It makes use of Python 3.2's concurrent.futures or the backport for earlier versions of Python.

The additional API and changes are minimal and strive to avoid surprises.

The following synchronous code:

from requests import Session

session = Session()
# first request starts and blocks until finished
response_one = session.get('http://httpbin.org/get')
# second request starts once first is finished
response_two = session.get('http://httpbin.org/get?foo=bar')
# both requests are complete
print('response one status: {0}'.format(response_one.status_code))
print(response_one.content)
print('response two status: {0}'.format(response_two.status_code))
print(response_two.content)

The code above can be translated to use futures, and thus run asynchronously, by creating a FuturesSession and capturing the returned Future in place of the Response. The Response can be retrieved by calling the result method on the Future:

from requests_futures.sessions import FuturesSession

session = FuturesSession()
# first request is started in background
future_one = session.get('http://httpbin.org/get')
# second request is started immediately
future_two = session.get('http://httpbin.org/get?foo=bar')
# wait for the first request to complete, if it hasn't already
response_one = future_one.result()
print('response one status: {0}'.format(response_one.status_code))
print(response_one.content)
# wait for the second request to complete, if it hasn't already
response_two = future_two.result()
print('response two status: {0}'.format(response_two.status_code))
print(response_two.content)
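
To see the effect without touching the network, the following sketch uses only the standard library's concurrent.futures, which FuturesSession builds on. Note that fake_get and DELAY are hypothetical stand-ins for a blocking session.get call and are not part of requests-futures; this is an illustration under those assumptions, not the library's implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter, sleep

DELAY = 0.3  # assumption: pretend every request takes this many seconds


def fake_get(url):
    # hypothetical stand-in for a blocking HTTP call
    sleep(DELAY)
    return 'response for {0}'.format(url)


urls = ['http://httpbin.org/get', 'http://httpbin.org/get?foo=bar']

start = perf_counter()
with ThreadPoolExecutor(max_workers=8) as executor:
    # both calls are submitted before either result is waited on
    futures = [executor.submit(fake_get, url) for url in urls]
    results = [future.result() for future in futures]
elapsed = perf_counter() - start

print(results)
print('elapsed: {0:.2f}s for {1} requests'.format(elapsed, len(urls)))
```

Because both calls overlap, the total time should be close to one DELAY rather than the sum of the two.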

By default a ThreadPoolExecutor is created with 8 workers. If you would like to adjust that value or share an executor across multiple sessions, you can provide one to the FuturesSession constructor:

from concurrent.futures import ThreadPoolExecutor
from requests_futures.sessions import FuturesSession

session = FuturesSession(executor=ThreadPoolExecutor(max_workers=10))
# ...

As a shortcut, if you only need to change the number of workers, you can pass max_workers directly to the FuturesSession constructor:

from requests_futures.sessions import FuturesSession
session = FuturesSession(max_workers=10)

FuturesSession will use an existing session object if supplied:

from requests import session
from requests_futures.sessions import FuturesSession
my_session = session()
future_session = FuturesSession(session=my_session)

That's it. The API of requests.Session is preserved without any modifications beyond returning a Future rather than a Response. As with all futures, exceptions are raised by the future.result() call, so try/except blocks should be moved there.
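
The following is a minimal sketch of where the try/except ends up. It uses only the standard library so that it runs offline; failing_get is a hypothetical stand-in for a request that fails in the worker thread. With a real FuturesSession the exception would be whatever requests raises, for example requests.exceptions.ConnectionError.

```python
from concurrent.futures import ThreadPoolExecutor


def failing_get(url):
    # hypothetical stand-in for a request that fails in the worker thread
    raise ConnectionError('could not reach {0}'.format(url))


with ThreadPoolExecutor(max_workers=1) as executor:
    # submitting does not raise, even though the call is going to fail
    future = executor.submit(failing_get, 'http://example.invalid/')
    try:
        future.result()
        outcome = 'ok'
    except ConnectionError as exc:
        # the worker's exception is re-raised here, in the calling thread
        outcome = 'failed: {0}'.format(exc)

print(outcome)
```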

Tying extra information to the request/response

The most common piece of information needed is the URL of the request. This can be accessed without any extra steps using the request property of the response object.

from concurrent.futures import as_completed
from pprint import pprint
from requests_futures.sessions import FuturesSession

session = FuturesSession()

futures = [session.get(f'http://httpbin.org/get?{i}') for i in range(3)]

for future in as_completed(futures):
    resp = future.result()
    pprint({
        'url': resp.request.url,
        'content': resp.json(),
    })

There are situations in which you may want to tie additional information to a request/response. There are a number of ways to go about this; the simplest is to attach the additional information to the future object itself.

from concurrent.futures import as_completed
from pprint import pprint
from requests_futures.sessions import FuturesSession

session = FuturesSession()

futures = []
for i in range(3):
    future = session.get('http://httpbin.org/get')
    future.i = i
    futures.append(future)

for future in as_completed(futures):
    resp = future.result()
    pprint({
        'i': future.i,
        'content': resp.json(),
    })
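
Another of those ways, sketched here as an assumption rather than as something the library prescribes, is to keep the extra information in a dictionary keyed by the future. The example uses only the standard library so that it runs offline; fake_get is a hypothetical stand-in for a request.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_get(url):
    # hypothetical stand-in for a request returning its body
    return 'body of {0}'.format(url)


with ThreadPoolExecutor(max_workers=3) as executor:
    # map each future to the extra information we want to keep with it
    future_to_index = {
        executor.submit(fake_get, 'http://httpbin.org/get?{0}'.format(i)): i
        for i in range(3)
    }
    results = {}
    for future in as_completed(future_to_index):
        # look the extra information up again once the future completes
        i = future_to_index[future]
        results[i] = future.result()

for i in sorted(results):
    print(i, results[i])
```

This leaves the future objects untouched, which can be preferable when they are shared with code that does not expect extra attributes.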

Canceling queued requests (a.k.a. cleaning up after yourself)

If you know that you won't be needing any additional responses from futures that haven't yet resolved, it's a good idea to cancel those requests. You can do this by using the session as a context manager:

from requests_futures.sessions import FuturesSession
with FuturesSession(max_workers=1) as session:
    future = session.get('https://httpbin.org/get')
    future2 = session.get('https://httpbin.org/delay/10')
    future3 = session.get('https://httpbin.org/delay/10')
    response = future.result()

In this example, the second or third request will be skipped, saving time and resources that would otherwise be wasted.
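
The sketch below does not show how FuturesSession implements its context manager. It only illustrates the standard-library Future.cancel() behavior that makes skipping possible: work still waiting in the queue can be cancelled, while work that is already running cannot. slow_task and the two Event objects are hypothetical helpers used to make the timing deterministic.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Event

started = Event()
release = Event()


def slow_task():
    # hypothetical stand-in for a slow request
    started.set()
    release.wait(timeout=5)
    return 'done'


executor = ThreadPoolExecutor(max_workers=1)
running = executor.submit(slow_task)
queued = executor.submit(slow_task)

# make sure the single worker has picked up the first task
started.wait(timeout=5)

cancelled_queued = queued.cancel()    # still waiting in the queue
cancelled_running = running.cancel()  # already executing

# let the running task finish and wait for the worker to exit
release.set()
executor.shutdown()

print('queued cancelled:', cancelled_queued)
print('running cancelled:', cancelled_running)
print('running result:', running.result())
```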

Iterating over a list of request responses

Without preserving the order of the requests:

from concurrent.futures import as_completed
from requests_futures.sessions import FuturesSession
with FuturesSession() as session:
    futures = [session.get('https://httpbin.org/delay/{}'.format(i % 3)) for i in range(10)]
    for future in as_completed(futures):
        resp = future.result()
        print(resp.json()['url'])
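
If the responses are needed in submission order instead, one option is to iterate over the list of futures itself and call result() on each. A slow early request then delays the loop, although the later requests keep running in the background. A minimal sketch using only the standard library, with fake_get as a hypothetical stand-in for a request:

```python
from concurrent.futures import ThreadPoolExecutor
from time import sleep


def fake_get(delay, label):
    # hypothetical stand-in for a request; earlier labels finish later
    sleep(delay)
    return label


with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(fake_get, 0.03 * (3 - i), i) for i in range(3)]
    # iterating the list (not as_completed) keeps submission order
    ordered = [future.result() for future in futures]

print(ordered)
```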

Working in the Background

Additional processing can be done in the background using the hooks functionality of requests. This can be useful for shifting work out of the foreground; JSON parsing is a simple example.

from pprint import pprint
from requests_futures.sessions import FuturesSession

session = FuturesSession()

def response_hook(resp, *args, **kwargs):
    # parse the json storing the result on the response object
    resp.data = resp.json()

future = session.get('http://httpbin.org/get', hooks={
    'response': response_hook,
})
# do some other stuff, send some more requests while this one works
response = future.result()
print('response status {0}'.format(response.status_code))
# data will have been attached to the response object in the background
pprint(response.data)

Hooks can also be applied to the session.

from pprint import pprint
from requests_futures.sessions import FuturesSession

def response_hook(resp, *args, **kwargs):
    # parse the json storing the result on the response object
    resp.data = resp.json()

session = FuturesSession()
session.hooks['response'] = response_hook

future = session.get('http://httpbin.org/get')
# do some other stuff, send some more requests while this one works
response = future.result()
print('response status {0}'.format(response.status_code))
# data will have been attached to the response object in the background
pprint(response.data)

A more advanced example adds an elapsed property to the response of every request:

from pprint import pprint
from requests_futures.sessions import FuturesSession
from time import time


class ElapsedFuturesSession(FuturesSession):

    def request(self, method, url, hooks=None, *args, **kwargs):
        start = time()
        if hooks is None:
            hooks = {}

        def timing(r, *args, **kwargs):
            r.elapsed = time() - start

        try:
            if isinstance(hooks['response'], (list, tuple)):
                # needs to be first so we don't time other hooks execution;
                # build a new list so tuples work and the caller's
                # sequence is not modified
                hooks['response'] = [timing] + list(hooks['response'])
            else:
                hooks['response'] = [timing, hooks['response']]
        except KeyError:
            hooks['response'] = timing

        return super(ElapsedFuturesSession, self) \
            .request(method, url, hooks=hooks, *args, **kwargs)



session = ElapsedFuturesSession()
future = session.get('http://httpbin.org/get')
# do some other stuff, send some more requests while this one works
response = future.result()
print('response status {0}'.format(response.status_code))
print('response elapsed {0}'.format(response.elapsed))

Using ProcessPoolExecutor

Similarly to ThreadPoolExecutor, it is possible to use an instance of ProcessPoolExecutor. As the name suggests, the requests will be executed concurrently in separate processes rather than threads.

from concurrent.futures import ProcessPoolExecutor
from requests_futures.sessions import FuturesSession

session = FuturesSession(executor=ProcessPoolExecutor(max_workers=10))
# ... use as before

Hint

Using the ProcessPoolExecutor is useful in cases where memory usage per request is very high (large responses) and cycling the interpreter is required to release memory back to the OS.

A base requirement of using ProcessPoolExecutor is that Session.request and FuturesSession both be pickle-able.

This means that only Python 3.5 and later are fully supported, while Python 3.4 REQUIRES an existing requests.Session instance to be passed when initializing FuturesSession. Python 2.X and < 3.4 are currently not supported.

# Using python 3.4
from concurrent.futures import ProcessPoolExecutor
from requests import Session
from requests_futures.sessions import FuturesSession

session = FuturesSession(executor=ProcessPoolExecutor(max_workers=10),
                         session=Session())
# ... use as before

In case pickling fails, an exception is raised pointing to this documentation.

# Using python 2.7
from concurrent.futures import ProcessPoolExecutor
from requests import Session
from requests_futures.sessions import FuturesSession

session = FuturesSession(executor=ProcessPoolExecutor(max_workers=10),
                         session=Session())
Traceback (most recent call last):
...
RuntimeError: Cannot pickle function. Refer to documentation: https://github.com/ross/requests-futures/#using-processpoolexecutor

Important

  • Python >= 3.4 required
  • A session instance is required when using Python < 3.5
  • If sub-classing FuturesSession it must be importable (module global)
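
The pickling requirement can be checked up front with the standard pickle module. The sketch below is only an illustration of which kinds of objects pickle: is_picklable and make_hook are hypothetical helpers, not part of requests-futures. Importable, module-level callables pickle; lambdas and functions defined inside other functions do not, which is why hooks and subclasses need to live at module level.

```python
import math
import pickle


def is_picklable(obj):
    # True when pickle can serialize obj, which is what anything
    # handed to a ProcessPoolExecutor has to support
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def make_hook():
    # hypothetical factory returning a hook defined in a local scope
    def local_hook(resp, *args, **kwargs):
        return resp
    return local_hook


importable_ok = is_picklable(math.sqrt)       # importable function
lambda_ok = is_picklable(lambda resp: resp)   # anonymous function
local_ok = is_picklable(make_hook())          # function defined inside another

print(importable_ok, lambda_ok, local_ok)
```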

Installation

pip install requests-futures

Owner

Ross McFarland