Prometheus instrumentation library for Python applications

Last update: Jan 07, 2023

Overview

Prometheus Python Client

The official Python 2 and 3 client for Prometheus.

Three Step Demo

One: Install the client:

pip install prometheus-client

Two: Paste the following into a Python interpreter:

from prometheus_client import start_http_server, Summary
import random
import time

# Create a metric to track time spent and requests made.
REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing request')

# Decorate function with metric.
@REQUEST_TIME.time()
def process_request(t):
    """A dummy function that takes some time."""
    time.sleep(t)

if __name__ == '__main__':
    # Start up the server to expose the metrics.
    start_http_server(8000)
    # Generate some requests.
    while True:
        process_request(random.random())

Three: Visit http://localhost:8000/ to view the metrics.

From one easy to use decorator you get:

request_processing_seconds_count: Number of times this function was called.
request_processing_seconds_sum: Total amount of time spent in this function.

Prometheus's rate function allows calculation of both requests per second, and latency over time from this data.
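As a rough illustration of that arithmetic, here is a minimal sketch in plain Python. It is not how Prometheus implements rate (which also handles counter resets and extrapolation); the helper name and the scrape values are hypothetical.

```python
def per_second_increase(earlier, later, interval_seconds):
    """Simplified stand-in for rate(): per-second increase between two scrapes."""
    return (later - earlier) / interval_seconds


# Hypothetical values from two scrapes taken 60 seconds apart.
count_t0, count_t1 = 100.0, 220.0   # request_processing_seconds_count
sum_t0, sum_t1 = 50.0, 110.0        # request_processing_seconds_sum

requests_per_second = per_second_increase(count_t0, count_t1, 60.0)
# Average latency is the rate of the sum divided by the rate of the count.
average_latency = per_second_increase(sum_t0, sum_t1, 60.0) / requests_per_second

print(requests_per_second, average_latency)
```

With these made-up numbers, 120 requests took 60 seconds in total, so the function averaged half a second per call.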

In addition if you're on Linux the process metrics expose CPU, memory and other information about the process for free!

Installation

pip install prometheus_client

This package can be found on PyPI.

Instrumenting

Four types of metric are offered: Counter, Gauge, Summary and Histogram. See the documentation on metric types and instrumentation best practices on how to use them.

Counter

Counters go up, and reset when the process restarts.

from prometheus_client import Counter
c = Counter('my_failures', 'Description of counter')
c.inc()     # Increment by 1
c.inc(1.6)  # Increment by given value

If there is a suffix of _total on the metric name, it will be removed. When exposing the time series for counter, a _total suffix will be added. This is for compatibility between OpenMetrics and the Prometheus text format, as OpenMetrics requires the _total suffix.
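A small sketch of that naming behaviour, assuming a private registry so the default collectors stay out of the output. The metric names are only examples.

```python
from prometheus_client import CollectorRegistry, Counter, generate_latest

# A private registry keeps the example output free of default collectors.
registry = CollectorRegistry()

without_suffix = Counter('my_failures', 'Description of counter', registry=registry)
with_suffix = Counter('my_errors_total', 'Description of counter', registry=registry)
without_suffix.inc()
with_suffix.inc()

output = generate_latest(registry).decode('utf-8')
# Keep only the counter value samples (skip comments and *_created samples).
sample_lines = [line for line in output.splitlines()
                if line and not line.startswith('#') and '_created' not in line]
print(sample_lines)
```

Both counters should be exposed with exactly one _total suffix, whichever way they were declared.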

There are utilities to count exceptions raised:

@c.count_exceptions()
def f():
  pass

with c.count_exceptions():
  pass

# Count only one type of exception
with c.count_exceptions(ValueError):
  pass

Gauge

Gauges can go up and down.

from prometheus_client import Gauge
g = Gauge('my_inprogress_requests', 'Description of gauge')
g.inc()      # Increment by 1
g.dec(10)    # Decrement by given value
g.set(4.2)   # Set to a given value

There are utilities for common use cases:

g.set_to_current_time()   # Set to current unixtime

# Increment when entered, decrement when exited.
@g.track_inprogress()
def f():
  pass

with g.track_inprogress():
  pass

A Gauge can also take its value from a callback:

d = Gauge('data_objects', 'Number of objects')
my_dict = {}
d.set_function(lambda: len(my_dict))

Summary

Summaries track the size and number of events.

from prometheus_client import Summary
s = Summary('request_latency_seconds', 'Description of summary')
s.observe(4.7)    # Observe 4.7 (seconds in this case)

There are utilities for timing code:

@s.time()
def f():
  pass

with s.time():
  pass

The Python client doesn't store or expose quantile information at this time.

Histogram

Histograms track the size and number of events in buckets. This allows for aggregatable calculation of quantiles.

from prometheus_client import Histogram
h = Histogram('request_latency_seconds', 'Description of histogram')
h.observe(4.7)    # Observe 4.7 (seconds in this case)

The default buckets are intended to cover a typical web/rpc request from milliseconds to seconds. They can be overridden by passing the buckets keyword argument to Histogram.
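To make the bucket idea concrete, here is a plain-Python sketch of cumulative bucket counting. It is an illustration only, not the client's implementation; the function name, bounds and observations are hypothetical.

```python
import math


def bucket_counts(observations, upper_bounds):
    """Count observations less than or equal to each upper bound.

    Buckets are cumulative, and a final +Inf bucket always holds the
    total number of observations.
    """
    bounds = list(upper_bounds) + [math.inf]
    return {bound: sum(1 for value in observations if value <= bound)
            for bound in bounds}


counts = bucket_counts([0.05, 0.3, 0.7, 4.7], [0.1, 0.5, 1.0])
print(counts)
```

Because every bucket includes the ones below it, counts from several processes can be summed bucket by bucket, which is what makes quantile estimates aggregatable.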

There are utilities for timing code:

@h.time()
def f():
  pass

with h.time():
  pass

Info

Info tracks key-value information, usually about a whole target.

from prometheus_client import Info
i = Info('my_build_version', 'Description of info')
i.info({'version': '1.2.3', 'buildhost': 'foo@bar'})

Enum

Enum tracks which of a set of states something is currently in.

from prometheus_client import Enum
e = Enum('my_task_state', 'Description of enum',
        states=['starting', 'running', 'stopped'])
e.state('running')

Labels

All metrics can have labels, allowing grouping of related time series.

See the best practices on naming and labels.

Taking a counter as an example:

from prometheus_client import Counter
c = Counter('my_requests_total', 'HTTP Failures', ['method', 'endpoint'])
c.labels('get', '/').inc()
c.labels('post', '/submit').inc()

Labels can also be passed as keyword-arguments:

from prometheus_client import Counter
c = Counter('my_requests_total', 'HTTP Failures', ['method', 'endpoint'])
c.labels(method='get', endpoint='/').inc()
c.labels(method='post', endpoint='/submit').inc()

Metrics with labels are not initialized when declared, because the client can't know what values the label can have. It is recommended to initialize the label values by calling the .labels() method alone:

from prometheus_client import Counter
c = Counter('my_requests_total', 'HTTP Failures', ['method', 'endpoint'])
c.labels('get', '/')
c.labels('post', '/submit')
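The effect of initializing can be checked by comparing the exposition before and after the .labels() call. This is a minimal sketch that assumes a private registry; the variable names are illustrative.

```python
from prometheus_client import CollectorRegistry, Counter, generate_latest

registry = CollectorRegistry()
c = Counter('my_requests_total', 'HTTP Failures', ['method', 'endpoint'],
            registry=registry)

before = generate_latest(registry).decode('utf-8')
c.labels('get', '/')   # creates the child time series without incrementing it
after = generate_latest(registry).decode('utf-8')

has_sample_before = 'method="get"' in before
has_sample_after = 'method="get"' in after
print(has_sample_before, has_sample_after)
```

Before the call only the HELP and TYPE lines are present; afterwards the labelled series is exported even though it has never been incremented.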

Process Collector

The Python client automatically exports metrics about process CPU usage, RAM, file descriptors and start time. These all have the prefix process, and are only currently available on Linux.

The namespace and pid constructor arguments allow for exporting metrics about other processes, for example:

from prometheus_client import ProcessCollector
ProcessCollector(namespace='mydaemon', pid=lambda: open('/var/run/daemon.pid').read())

Platform Collector

The client also automatically exports some metadata about Python. If using Jython, metadata about the JVM in use is also included. This information is available as labels on the python_info metric. The value of the metric is 1, since it is the labels that carry information.

Exporting

There are several options for exporting metrics.

HTTP

Metrics are usually exposed over HTTP, to be read by the Prometheus server.

The easiest way to do this is via start_http_server, which will start an HTTP server in a daemon thread on the given port:

from prometheus_client import start_http_server

start_http_server(8000)

Visit http://localhost:8000/ to view the metrics.

To add Prometheus exposition to an existing HTTP server, see the MetricsHandler class which provides a BaseHTTPRequestHandler. It also serves as a simple example of how to write a custom endpoint.

Twisted

To use Prometheus with Twisted, there is MetricsResource, which exposes metrics as a Twisted resource.

from prometheus_client.twisted import MetricsResource
from twisted.web.server import Site
from twisted.web.resource import Resource
from twisted.internet import reactor

root = Resource()
root.putChild(b'metrics', MetricsResource())

factory = Site(root)
reactor.listenTCP(8000, factory)
reactor.run()

WSGI

To use Prometheus with WSGI, there is make_wsgi_app which creates a WSGI application.

from prometheus_client import make_wsgi_app
from wsgiref.simple_server import make_server

app = make_wsgi_app()
httpd = make_server('', 8000, app)
httpd.serve_forever()

Such an application can be useful when integrating Prometheus metrics with WSGI apps.

The method start_wsgi_server can be used to serve the metrics through the WSGI reference implementation in a new thread.

from prometheus_client import start_wsgi_server

start_wsgi_server(8000)

ASGI

To use Prometheus with ASGI, there is make_asgi_app which creates an ASGI application.

from prometheus_client import make_asgi_app

app = make_asgi_app()

Such an application can be useful when integrating Prometheus metrics with ASGI apps.

Flask

To use Prometheus with Flask we need to serve metrics through a Prometheus WSGI application. This can be achieved using Flask's application dispatching. Below is a working example.

Save the snippet below in a myapp.py file

from flask import Flask
from werkzeug.middleware.dispatcher import DispatcherMiddleware
from prometheus_client import make_wsgi_app

# Create my app
app = Flask(__name__)

# Add prometheus wsgi middleware to route /metrics requests
app.wsgi_app = DispatcherMiddleware(app.wsgi_app, {
    '/metrics': make_wsgi_app()
})

Run the example web application like this

# Install uwsgi if you do not have it
pip install uwsgi
uwsgi --http 127.0.0.1:8000 --wsgi-file myapp.py --callable app

Visit http://localhost:8000/metrics to see the metrics

Node exporter textfile collector

The textfile collector allows machine-level statistics to be exported out via the Node exporter.

This is useful for monitoring cronjobs, or for writing cronjobs to expose metrics about a machine that the Node exporter does not support or that would not make sense to collect at every scrape (for example, anything involving subprocesses).

from prometheus_client import CollectorRegistry, Gauge, write_to_textfile

registry = CollectorRegistry()
g = Gauge('raid_status', '1 if raid array is okay', registry=registry)
g.set(1)
write_to_textfile('/configured/textfile/path/raid.prom', registry)

A separate registry is used, as the default registry may contain other metrics such as those from the Process Collector.

Exporting to a Pushgateway

The Pushgateway allows ephemeral and batch jobs to expose their metrics to Prometheus.

from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

registry = CollectorRegistry()
g = Gauge('job_last_success_unixtime', 'Last time a batch job successfully finished', registry=registry)
g.set_to_current_time()
push_to_gateway('localhost:9091', job='batchA', registry=registry)

A separate registry is used, as the default registry may contain other metrics such as those from the Process Collector.

Pushgateway functions take a grouping key. push_to_gateway replaces metrics with the same grouping key, pushadd_to_gateway only replaces metrics with the same name and grouping key and delete_from_gateway deletes metrics with the given job and grouping key. See the Pushgateway documentation for more information.
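The difference between the three functions can be pictured with a toy in-memory model. This is only an illustration of the semantics described above, not the Pushgateway or the client code; ToyGateway and its methods are hypothetical.

```python
class ToyGateway:
    """In-memory stand-in that mimics the grouping semantics described above."""

    def __init__(self):
        # Maps a grouping key (job plus extra labels) to {metric name: value}.
        self.groups = {}

    @staticmethod
    def _key(job, grouping_key=None):
        return (job,) + tuple(sorted((grouping_key or {}).items()))

    def push(self, job, metrics, grouping_key=None):
        # Like push_to_gateway: replace every metric in the group.
        self.groups[self._key(job, grouping_key)] = dict(metrics)

    def pushadd(self, job, metrics, grouping_key=None):
        # Like pushadd_to_gateway: replace only metrics with the same name.
        self.groups.setdefault(self._key(job, grouping_key), {}).update(metrics)

    def delete(self, job, grouping_key=None):
        # Like delete_from_gateway: drop the whole group.
        self.groups.pop(self._key(job, grouping_key), None)


gw = ToyGateway()
gw.push('batchA', {'a': 1, 'b': 2})
gw.pushadd('batchA', {'b': 3})            # 'a' survives, 'b' is replaced
after_pushadd = dict(gw.groups[('batchA',)])
gw.push('batchA', {'c': 4})               # 'a' and 'b' are gone
after_push = dict(gw.groups[('batchA',)])
gw.delete('batchA')
after_delete = dict(gw.groups)
```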

instance_ip_grouping_key returns a grouping key with the instance label set to the host's IP address.

Handlers for authentication

If the push gateway you are connecting to is protected with HTTP Basic Auth, you can use a special handler to set the Authorization header.

from prometheus_client import CollectorRegistry, Gauge, push_to_gateway
from prometheus_client.exposition import basic_auth_handler

def my_auth_handler(url, method, timeout, headers, data):
    username = 'foobar'
    password = 'secret123'
    return basic_auth_handler(url, method, timeout, headers, data, username, password)

registry = CollectorRegistry()
g = Gauge('job_last_success_unixtime', 'Last time a batch job successfully finished', registry=registry)
g.set_to_current_time()
push_to_gateway('localhost:9091', job='batchA', registry=registry, handler=my_auth_handler)

Bridges

It is also possible to expose metrics to systems other than Prometheus. This allows you to take advantage of Prometheus instrumentation even if you are not quite ready to fully transition to Prometheus yet.

Graphite

Metrics are pushed over TCP in the Graphite plaintext format.

from prometheus_client.bridge.graphite import GraphiteBridge

gb = GraphiteBridge(('graphite.your.org', 2003))
# Push once.
gb.push()
# Push every 10 seconds in a daemon thread.
gb.start(10.0)

Graphite tags are also supported.

from prometheus_client import Counter
from prometheus_client.bridge.graphite import GraphiteBridge

gb = GraphiteBridge(('graphite.your.org', 2003), tags=True)
c = Counter('my_requests_total', 'HTTP Failures', ['method', 'endpoint'])
c.labels('get', '/').inc()
gb.push()

Custom Collectors

Sometimes it is not possible to directly instrument code, as it is not in your control. This requires you to proxy metrics from other systems.

To do so you need to create a custom collector, for example:

from prometheus_client.core import GaugeMetricFamily, CounterMetricFamily, REGISTRY

class CustomCollector(object):
    def collect(self):
        yield GaugeMetricFamily('my_gauge', 'Help text', value=7)
        c = CounterMetricFamily('my_counter_total', 'Help text', labels=['foo'])
        c.add_metric(['bar'], 1.7)
        c.add_metric(['baz'], 3.8)
        yield c

REGISTRY.register(CustomCollector())

SummaryMetricFamily, HistogramMetricFamily and InfoMetricFamily work similarly.

A collector may implement a describe method which returns metrics in the same format as collect (though you don't have to include the samples). This is used to predetermine the names of time series a CollectorRegistry exposes and thus to detect collisions and duplicate registrations.

Usually custom collectors do not have to implement describe. If describe is not implemented and the CollectorRegistry was created with auto_describe=True (which is the case for the default registry) then collect will be called at registration time instead of describe. If this could cause problems, either implement a proper describe, or if that's not practical have describe return an empty list.
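As a sketch of the last option, the collector below returns an empty list from describe so that registration does not trigger a collect. The class name and the call counter are hypothetical and exist only to make the behaviour visible.

```python
from prometheus_client.core import CollectorRegistry, GaugeMetricFamily


class LazyCollector(object):
    """Collector whose collect() should not run at registration time."""

    def __init__(self):
        self.collect_calls = 0

    def describe(self):
        # An empty description means no names are pre-registered, so the
        # registry cannot detect collisions for this collector.
        return []

    def collect(self):
        self.collect_calls += 1
        yield GaugeMetricFamily('my_gauge', 'Help text', value=7)


registry = CollectorRegistry(auto_describe=True)
collector = LazyCollector()
registry.register(collector)
calls_after_register = collector.collect_calls

families = list(registry.collect())
family_names = [family.name for family in families]
```

The trade-off is that duplicate metric names from this collector go unnoticed until scrape time.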

Multiprocess Mode (Gunicorn)

Prometheus client libraries presume a threaded model, where metrics are shared across workers. This doesn't work so well for languages such as Python where it's common to have processes rather than threads to handle large workloads.

To handle this the client library can be put in multiprocess mode. This comes with a number of limitations:

Registries can not be used as normal, all instantiated metrics are exported
Custom collectors do not work (e.g. cpu and memory metrics)
Info and Enum metrics do not work
The pushgateway cannot be used
Gauges cannot use the pid label

There are several steps to getting this working:

1. Gunicorn deployment:

The prometheus_multiproc_dir environment variable must be set to a directory that the client library can use for metrics. This directory must be wiped between Gunicorn runs (before startup is recommended).

This environment variable should be set from a start-up shell script, and not directly from Python (otherwise it may not propagate to child processes).

2. Metrics collector:

The application must initialize a new CollectorRegistry, and store the multi-process collector inside.

from prometheus_client import multiprocess
from prometheus_client import generate_latest, CollectorRegistry, CONTENT_TYPE_LATEST

# Expose metrics.
def app(environ, start_response):
    registry = CollectorRegistry()
    multiprocess.MultiProcessCollector(registry)
    data = generate_latest(registry)
    status = '200 OK'
    response_headers = [
        ('Content-type', CONTENT_TYPE_LATEST),
        ('Content-Length', str(len(data)))
    ]
    start_response(status, response_headers)
    return iter([data])

3. Gunicorn configuration:

The gunicorn configuration file needs to include the following function:

from prometheus_client import multiprocess

def child_exit(server, worker):
    multiprocess.mark_process_dead(worker.pid)

4. Metrics tuning (Gauge):

When Gauge metrics are used, additional tuning needs to be performed. Gauges have several modes they can run in, which can be selected with the multiprocess_mode parameter.

'all': Default. Return a timeseries per process alive or dead.
'liveall': Return a timeseries per process that is still alive.
'livesum': Return a single timeseries that is the sum of the values of alive processes.
'max': Return a single timeseries that is the maximum of the values of all processes, alive or dead.
'min': Return a single timeseries that is the minimum of the values of all processes, alive or dead.

from prometheus_client import Gauge

# Example gauge
IN_PROGRESS = Gauge("inprogress_requests", "help", multiprocess_mode='livesum')
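The modes can be read as different ways of combining one value per process. The following plain-Python sketch mirrors the descriptions above; it is not the client's implementation, and the function name, pids and values are hypothetical.

```python
def aggregate(values_by_pid, live_pids, mode):
    """Combine per-process gauge values according to a multiprocess mode.

    values_by_pid maps pid -> gauge value; live_pids is the set of pids
    that are still running.
    """
    if mode == 'all':
        return dict(values_by_pid)
    live = {pid: value for pid, value in values_by_pid.items()
            if pid in live_pids}
    if mode == 'liveall':
        return live
    if mode == 'livesum':
        return sum(live.values())
    if mode == 'max':
        return max(values_by_pid.values())
    if mode == 'min':
        return min(values_by_pid.values())
    raise ValueError('unknown mode: %r' % mode)


values = {101: 3.0, 102: 5.0, 103: 1.0}   # process 103 has exited
live = {101, 102}

print(aggregate(values, live, 'livesum'))
```

For an in-progress request gauge, 'livesum' is the natural choice because requests held by a dead worker no longer exist.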

Parser

The Python client supports parsing the Prometheus text format. This is intended for advanced use cases where you have servers exposing Prometheus metrics and need to get them into some other system.

from prometheus_client.parser import text_string_to_metric_families
for family in text_string_to_metric_families(u"my_gauge 1.0\n"):
  for sample in family.samples:
    print("Name: {0} Labels: {1} Value: {2}".format(*sample))

Comments

Gunicorn/multiprocess with restarting workers, .db files are not cleaned up

We are using Gunicorn with multiple workers with the Gunicorn max_requests option. We found that memory usage in our containers increases with every worker restart. By reducing the max_requests value so workers restart very frequently, the problem became apparent.

Using the method described in the readme:

from prometheus_client import multiprocess

def child_exit(server, worker):
    multiprocess.mark_process_dead(worker.pid)

While running a performance test, I see the following in the container:

[[email protected] app]# cd /app/prometheus_tmp
[[email protected] prometheus_tmp]# ls
counter_10.db   counter_214.db  counter_319.db  counter_41.db   counter_91.db     histogram_196.db  histogram_296.db  histogram_401.db  histogram_69.db
counter_111.db  counter_222.db  counter_31.db   counter_425.db  counter_97.db     histogram_1.db    histogram_305.db  histogram_409.db  histogram_77.db
counter_117.db  counter_232.db  counter_328.db  counter_433.db  counter_9.db      histogram_202.db  histogram_311.db  histogram_419.db  histogram_84.db
counter_130.db  counter_23.db   counter_338.db  counter_444.db  histogram_10.db   histogram_214.db  histogram_319.db  histogram_41.db   histogram_91.db
counter_138.db  counter_244.db  counter_345.db  counter_450.db  histogram_111.db  histogram_222.db  histogram_31.db   histogram_425.db  histogram_97.db
counter_148.db  counter_251.db  counter_353.db  counter_458.db  histogram_117.db  histogram_232.db  histogram_328.db  histogram_433.db  histogram_9.db
counter_154.db  counter_258.db  counter_361.db  counter_470.db  histogram_130.db  histogram_23.db   histogram_338.db  histogram_444.db
counter_175.db  counter_269.db  counter_36.db   counter_48.db   histogram_138.db  histogram_244.db  histogram_345.db  histogram_450.db
counter_187.db  counter_277.db  counter_374.db  counter_56.db   histogram_148.db  histogram_251.db  histogram_353.db  histogram_458.db
counter_18.db   counter_286.db  counter_384.db  counter_61.db   histogram_154.db  histogram_258.db  histogram_361.db  histogram_470.db
counter_196.db  counter_296.db  counter_401.db  counter_69.db   histogram_175.db  histogram_269.db  histogram_36.db   histogram_48.db
counter_1.db    counter_305.db  counter_409.db  counter_77.db   histogram_187.db  histogram_277.db  histogram_374.db  histogram_56.db
counter_202.db  counter_311.db  counter_419.db  counter_84.db   histogram_18.db   histogram_286.db  histogram_384.db  histogram_61.db

Watching that directory, new files are created with every new worker and old ones are not being cleaned up. Deleting the files in that directory reduces the memory usage of that container, which then starts building again.

What is the proper way of dealing with these database files? Is this mark_process_dead's responsibility?

And a general question about Prometheus: do we need to keep metrics around after they are collected? Could we wipe our metrics after the collector hits our metrics endpoint? If so, can this be done via the prometheus_client?

opened by Dominick-Peluso-Bose 36

Support sharing registry across multiple workers (where possible)

Similar to https://github.com/prometheus/client_ruby/issues/9, when using the Python client library on workers that get load balanced by something like gunicorn or uwsgi, each scrape hits only one worker, since workers can't share state with each other.

At least uwsgi supports sharing memory: http://uwsgi-docs.readthedocs.org/en/latest/SharedArea.html This should be used to share the registry across all workers. Maybe gunicorn supports something similar.

opened by discordianfish 33

Handle multiprocess setups using preloading and equivalents

Sometimes the application raises an error.

At line https://github.com/prometheus/client_python/blob/master/prometheus_client/core.py#L361

Stacktrace (most recent call last):

  File "flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "flask/app.py", line 1473, in full_dispatch_request
    rv = self.preprocess_request()
  File "flask/app.py", line 1666, in preprocess_request
    rv = func()
  File "core/middleware.py", line 43, in before_request_middleware
    metrics.requests_total.labels(env_role=metrics.APP_ENV_ROLE, method=request.method, url_rule=rule).inc()
  File "prometheus_client/core.py", line 498, in labels
    self._metrics[labelvalues] = self._wrappedClass(self._name, self._labelnames, labelvalues, **self._kwargs)
  File "prometheus_client/core.py", line 599, in __init__
    self._value = _ValueClass(self._type, name, name, labelnames, labelvalues)
  File "prometheus_client/core.py", line 414, in __init__
    files[file_prefix] = _MmapedDict(filename)
  File "prometheus_client/core.py", line 335, in __init__
    for key, _, pos in self._read_all_values():
  File "prometheus_client/core.py", line 361, in _read_all_values
    encoded = struct.unpack_from('{0}s'.format(encoded_len).encode(), self._m, pos)[0]

We run the application via uWSGI:

[uwsgi]
chdir=/usr/src/app/
env = APP_ROLE=dev_uwsgi
wsgi-file = /usr/src/app/app.wsgi
master=True
vacuum=True
max-requests=5000
harakiri=120
post-buffering=65536
workers=16
listen=4000
# socket=0.0.0.0:8997
stats=/tmp/uwsgi-app.stats
logger=syslog:uwsgi_app_stage,local0
buffer-size=65536
http = 0.0.0.0:8051

opened by Lispython 26

start_http_server is unable to listen on IPv6 sockets

This is a very similar issue to #132 and #69.

#69 was closed as a bug in CPython. That bug was resolved in CPython 3.8; however, client_python has since migrated to WSGI.

This is how it all behaves with the latest Python (3.8.5):

# python3.8
Python 3.8.5 (default, Jul 23 2020, 07:58:41)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import prometheus_client
>>> prometheus_client.start_http_server(port=9097, addr='::')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/site-packages/prometheus_client/exposition.py", line 78, in start_wsgi_server
    httpd = make_server(addr, port, app, ThreadingWSGIServer, handler_class=_SilentHandler)
  File "/usr/local/lib/python3.8/wsgiref/simple_server.py", line 154, in make_server
    server = server_class((host, port), handler_class)
  File "/usr/local/lib/python3.8/socketserver.py", line 452, in __init__
    self.server_bind()
  File "/usr/local/lib/python3.8/wsgiref/simple_server.py", line 50, in server_bind
    HTTPServer.server_bind(self)
  File "/usr/local/lib/python3.8/http/server.py", line 138, in server_bind
    socketserver.TCPServer.server_bind(self)
  File "/usr/local/lib/python3.8/socketserver.py", line 466, in server_bind
    self.socket.bind(self.server_address)
socket.gaierror: [Errno -9] Address family for hostname not supported

The issue lies with the ThreadingWSGIServer class, which defaults to AF_INET address family. A workaround similar to https://github.com/bottlepy/bottle/commit/af547f495d673a910b1c70d9ff36c2cddafeb22d can be implemented in exposition.py:

# diff -u /usr/local/lib/python3.8/site-packages/prometheus_client/exposition.py.broken /usr/local/lib/python3.8/site-packages/prometheus_client/exposition.py
--- /usr/local/lib/python3.8/site-packages/prometheus_client/exposition.py.broken	2020-07-23 11:28:12.198472090 +0000
+++ /usr/local/lib/python3.8/site-packages/prometheus_client/exposition.py	2020-07-23 11:29:07.506269752 +0000
@@ -75,7 +75,12 @@
 def start_wsgi_server(port, addr='', registry=REGISTRY):
     """Starts a WSGI server for prometheus metrics as a daemon thread."""
     app = make_wsgi_app(registry)
-    httpd = make_server(addr, port, app, ThreadingWSGIServer, handler_class=_SilentHandler)
+    server_cls = ThreadingWSGIServer
+    if ':' in addr: # Fix wsgiref for IPv6 addresses.
+        if getattr(server_cls, 'address_family') == socket.AF_INET:
+            class server_cls(server_cls):
+                address_family = socket.AF_INET6
+    httpd = make_server(addr, port, app, server_cls, handler_class=_SilentHandler)
     t = threading.Thread(target=httpd.serve_forever)
     t.daemon = True
     t.start()
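The same idea, pulled out of the diff into a standalone helper for illustration. It uses only the standard library; the function name server_class_for is hypothetical, and plain WSGIServer stands in for ThreadingWSGIServer.

```python
import socket
from wsgiref.simple_server import WSGIServer


def server_class_for(addr, base=WSGIServer):
    """Return a server class whose address family matches the bind address."""
    if ':' in addr and base.address_family == socket.AF_INET:
        # A colon in the address is taken as a sign of an IPv6 literal.
        class IPv6Server(base):
            address_family = socket.AF_INET6
        return IPv6Server
    return base


ipv6_cls = server_class_for('::')
ipv4_cls = server_class_for('0.0.0.0')
print(ipv6_cls.address_family, ipv4_cls.address_family)
```

The returned class can then be passed to make_server in place of the original server class; IPv4 addresses keep using the unmodified base class.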

And voila, now it works as expected:

# python3.8
Python 3.8.5 (default, Jul 23 2020, 07:58:41)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import prometheus_client
>>> prometheus_client.start_http_server(port=9097, addr='::')
>>> prometheus_client.start_http_server(port=9098)
>>>

# netstat -ltnp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
(...)
tcp        0      0 0.0.0.0:9098            0.0.0.0:*               LISTEN      20262/python3.8
(...)
tcp6       0      0 :::9097                 :::*                    LISTEN      20262/python3.8
(...)

This is a non-invasive fix that doesn't break any existing code, while also helping those of us who want our exporters to listen on IPv6 sockets.

Is it possible to adjust exposition.py this way, please? Thank you.

opened by zajdee 23

Added ASGI application

This PR introduces an ASGI counterpart to the current WSGI app.

The ASGI code itself is moved into a separate module asgi.py, which is conditionally included as it is only valid in Python 3 (due to async/await calls).

opened by Skeen 22

Please add support for pushgateways protected with basic authentication

Currently there doesn't seem to be any way to provide basic authentication credentials for talking to pushgateways (urllib2 doesn't allow putting them in the URL). Would be great if this could be added.

opened by kkoppel 18

Optionally disable Process and Platform collectors

Is there any way to optionally disable those collectors during usage? I would like to be able to disable those metrics from being generated except if they will actually be used.

A quick check on the code shows me that they are loaded from prometheus_client/__init__.py but there aren't (that I see) many options on that except by doing a PR with code modified.

I was able to see that trying to import resource on Microsoft Windows used to cause an exception, but the current version simply ignores the failure and generates constant values for those metrics:

Thanks!

opened by glasswalk3r 17

Multiprocess exposition speed boost

Hey there!

We at @valohai are still (#367, #368) bumping into situations where a long-lived multiproc (uWSGI) app's metric exposition steadily gets slower and slower, until it starts clogging up all workers and bumping into Prometheus's deadlines, and things break in a decidedly ungood way.

I figured I could take a little look at what exactly is taking time there, and I'm delighted to say I managed to eke out a roughly 5.8-fold speed increase in my test case (2857 files in the multiproc dir, totaling 2.8 GiB).

By far the largest boost here came from actually not using mmap() at all (f0319fa) when we're only reading the file; instead, simply reading the file fully into memory and parsing things from the memory buffer is much, much faster. Given each file (in my case anyway) is about 1 meg a pop, it shouldn't cause too much momentary memory pressure either.

Another decent optimization (0d8b870) came from looking at vmprof's output (and remembering python-babel/babel#571); it said a lot of time was spent in small, numerous memory allocations within Python, and the trail led to json.loads(). Since the JSON blobs in the files are written with sort_keys=True, we can be fairly certain that there's going to be plenty of string duplication. Simply adding an unbounded lru_cache() to where the JSON strings are being parsed into (nicely immutable!) objects gave a nice speed boost and probably also reduced memory churn since the same objects get reused. Calling cache_info() on the lru_cache validated the guess of duplication: CacheInfo(hits=573057, misses=3229, maxsize=None, currsize=3229)
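The caching idea can be sketched in isolation. This is a minimal illustration rather than the code from the commit: parse_key and the sample key strings are hypothetical, and the real key layout in the files may differ.

```python
import json
from functools import lru_cache


@lru_cache(maxsize=None)
def parse_key(encoded):
    """Parse a JSON-encoded key once and reuse the immutable result."""
    # Tuples are hashable and immutable, so sharing them between callers is safe.
    return tuple(json.loads(encoded))


# Hypothetical keys; the same strings repeat across files, as described above.
keys = [
    '["requests_total", "GET", "/"]',
    '["requests_total", "POST", "/submit"]',
    '["requests_total", "GET", "/"]',
    '["requests_total", "GET", "/"]',
]
parsed = [parse_key(key) for key in keys]
info = parse_key.cache_info()
print(info)
```

Only the first occurrence of each distinct string reaches json.loads(); repeats are served from the cache and return the very same object.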

A handful of other micro-optimizations brought the speed up a little more still.

My benchmark code was essentially 5 iterations of

registry = prometheus_client.CollectorRegistry()
multiprocess.MultiProcessCollector(registry)
metrics_page = prometheus_client.generate_latest(registry)
assert metrics_page

and on my machine it took 22.478 seconds on 5132fd2 and 3.885 on 43cc95c91. 🎉

opened by akx 16

Rename labelvalues to _labelvalues to make clear that it's internal

Metrics with labelvalues are not collected. In the example below, I would expect all 3 counters to be collected, but the one with labelvalues is not.

from prometheus_client import REGISTRY, Counter

c1 = Counter("c1", "c1")
c2 = Counter("c2", "c2", labelnames=["l1"])
c3 = Counter("c3", "c3", labelnames=["l1"], labelvalues=["v1"])

c1.inc()
c2.labels("v1").inc()
c3.inc()

for item in REGISTRY.collect():
    print(item)  # prints c1 and c2, but not c3

opened by bm371613 16

HTTP server leaks memory in python3.7.1

I'm using version 0.4.2, and it seems like the default http server, started via start_http_server leaks memory on every request.

Repro script:

#!/usr/bin/env python3
import gc
import time
import tracemalloc
from random import randint

from prometheus_client import Summary, start_http_server

# Create a metric to track time spent and requests made.
REQUEST_TIME = Summary('request_processing_seconds',
                       'Time spent processing request')


# Decorate function with metric.
@REQUEST_TIME.time()
def process_request(t):
    """A dummy function that takes some time."""
    time.sleep(t)


if __name__ == '__main__':
    # Start up the server to expose the metrics.
    start_http_server(8000)
    tracemalloc.start()
    snapshot = tracemalloc.take_snapshot()
    # Generate some requests.
    while True:
        process_request(randint(1, 5))
        gc.collect()
        snapshot2 = tracemalloc.take_snapshot()
        top_stats = snapshot2.compare_to(snapshot, 'lineno')
        print("[ Top 5 differences ]")
        for stat in top_stats[:5]:
            print(stat)

If you kick off the above script and then do:

while true; do curl localhost:8000; done

you can observe its memory footprint growing on every iteration:

[ Top 5 differences ]
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:799: size=946 KiB (+946 KiB), count=1682 (+1682), average=576 B
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:365: size=868 KiB (+868 KiB), count=1683 (+1683), average=528 B
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:500: size=422 KiB (+422 KiB), count=6753 (+6753), average=64 B
/Users/jose/code/prometheus-mem-leak/.venv/lib/python3.7/site-packages/prometheus_client/core.py:802: size=354 KiB (+354 KiB), count=5027 (+5027), average=72 B
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:899: size=263 KiB (+263 KiB), count=3366 (+3366), average=80 B
[ Top 5 differences ]
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:799: size=992 KiB (+992 KiB), count=1763 (+1763), average=576 B
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:365: size=910 KiB (+910 KiB), count=1764 (+1764), average=528 B
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:500: size=443 KiB (+443 KiB), count=7078 (+7078), average=64 B
/Users/jose/code/prometheus-mem-leak/.venv/lib/python3.7/site-packages/prometheus_client/core.py:802: size=370 KiB (+370 KiB), count=5266 (+5266), average=72 B
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:899: size=276 KiB (+276 KiB), count=3528 (+3528), average=80 B
[ Top 5 differences ]
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:799: size=1063 KiB (+1063 KiB), count=1889 (+1889), average=576 B
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:365: size=974 KiB (+974 KiB), count=1889 (+1889), average=528 B
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:500: size=474 KiB (+474 KiB), count=7584 (+7584), average=64 B
/Users/jose/code/prometheus-mem-leak/.venv/lib/python3.7/site-packages/prometheus_client/core.py:802: size=397 KiB (+397 KiB), count=5639 (+5639), average=72 B
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:899: size=295 KiB (+295 KiB), count=3778 (+3778), average=80 B
[ Top 5 differences ]
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:799: size=1115 KiB (+1115 KiB), count=1982 (+1982), average=576 B
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:365: size=1022 KiB (+1022 KiB), count=1983 (+1983), average=528 B
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:500: size=497 KiB (+497 KiB), count=7956 (+7956), average=64 B
/Users/jose/code/prometheus-mem-leak/.venv/lib/python3.7/site-packages/prometheus_client/core.py:802: size=416 KiB (+416 KiB), count=5922 (+5922), average=72 B
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:899: size=310 KiB (+310 KiB), count=3966 (+3966), average=80 B
[ Top 5 differences ]
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:799: size=1151 KiB (+1151 KiB), count=2046 (+2046), average=576 B
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:365: size=1055 KiB (+1055 KiB), count=2047 (+2047), average=528 B
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:500: size=514 KiB (+514 KiB), count=8213 (+8213), average=64 B
/Users/jose/code/prometheus-mem-leak/.venv/lib/python3.7/site-packages/prometheus_client/core.py:802: size=430 KiB (+430 KiB), count=6108 (+6108), average=72 B
/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:899: size=320 KiB (+320 KiB), count=4094 (+4094), average=80 B

It's clearly coming from the server, because as soon as you stop the curl loop, memory usage stops growing.

opened by josecv 16

A server serving HTTP/1.1 responses with MetricsHandler causes client requests to time out

On a server speaking HTTP/1.1, the Content-Length header is recommended in responses, at least for GET requests. Without this header, clients wait indefinitely and end up in a read timeout. There are other recommendations, such as the server closing the connection, but that counteracts connection persistence (keepalive).

from http.server import BaseHTTPRequestHandler, HTTPServer
from prometheus_client import MetricsHandler

class HTTPRequestHandler(MetricsHandler):
    
    # Necessary for connection persistence (keepalive)
    protocol_version = 'HTTP/1.1'

    def do_GET(self):
        return super(HTTPRequestHandler, self).do_GET()

if __name__ == "__main__":
    server_address = ('', 8000)
    HTTPServer(server_address, HTTPRequestHandler).serve_forever()

[[email protected]] 16:49 $ curl -v --max-time 10 localhost:8000
* Rebuilt URL to: localhost:8000/
*   Trying ::1...
* TCP_NODELAY set
* connect to ::1 port 8000 failed: Connection refused
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET / HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.58.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Server: BaseHTTP/0.6 Python/3.6.5
< Date: Thu, 23 Aug 2018 13:49:55 GMT
< Content-Type: text/plain; version=0.0.4; charset=utf-8
* no chunk, no close, no size. Assume close to signal end
< 
# I removed the metrics payload from here ....
python_info{implementation="CPython",major="3",minor="6",patchlevel="5",version="3.6.5"} 1.0
* Operation timed out after 10002 milliseconds with 1013 bytes received
* stopped the pause stream!
* Closing connection 0
curl: (28) Operation timed out after 10002 milliseconds with 1013 bytes received

To fix this, add self.send_header("Content-Length", len(output)) after https://github.com/prometheus/client_python/blob/master/prometheus_client/exposition.py#L100

RFC -> https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
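The effect of the header can be sketched with the standard library alone. The block below is not the library's MetricsHandler: KeepAliveHandler, fetch_twice and the payload are hypothetical stand-ins, used only to show that once Content-Length is sent, a client can read two responses over one persistent HTTP/1.1 connection without waiting for a timeout.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    # HTTP/1.1 keeps the connection open, so the body length must be announced.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Placeholder payload (assumption); a real handler would render metrics here.
        output = b"example_metric 1.0\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(output)))
        self.end_headers()
        self.wfile.write(output)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_twice():
    """Start the server on a free port and reuse one connection for two GETs."""
    server = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    bodies = []
    try:
        for _ in range(2):
            conn.request("GET", "/")
            bodies.append(conn.getresponse().read())
    finally:
        conn.close()          # lets the handler loop see EOF and return
        server.shutdown()
        server.server_close()
    return bodies
```

Calling fetch_twice() returns both bodies promptly; removing the Content-Length line reproduces the hang shown in the curl trace above.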

opened by banjoh 16

Multiprocess application sometimes throwing UnicodeDecodeError in prometheus_client/mmap_dict.py

Hi, I have a multiprocess-enabled application with the Flask exporter. Sometimes it throws the error below:

File "usr/local/lib/python3.8/site-packages/prometheus_client/mmap_dict.py", line 44, in _read_all_values
    yield encoded_key.decode('utf-8'), value, pos
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf0 in position 18: invalid continuation byte

After this error my application crashes. Can someone please help me with this?

opened by Velan987 3

Use start_http_server to expose metrics in a /metrics subfolder

Hi All

Rather than expose my metrics at the top level, I would like to expose them in a metrics subfolder, similar to what node-exporter does. I can't seem to find a way to do this with this client library. Can anyone help me solve this?

Thanks

Stephen
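One possible approach, sketched with a small WSGI dispatcher rather than start_http_server itself: serve the metrics app only under a path prefix and return 404 elsewhere. The helper names mount, placeholder_metrics_app and get are hypothetical, and the placeholder app stands in for prometheus_client.make_wsgi_app() so that the block runs with the standard library alone.

```python
def mount(prefix, app):
    """Return a WSGI app that serves `app` under `prefix` and 404s elsewhere."""
    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path == prefix or path.startswith(prefix + "/"):
            return app(environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found\n"]
    return dispatcher


def placeholder_metrics_app(environ, start_response):
    # Stand-in (assumption) for prometheus_client.make_wsgi_app().
    body = b"example_metric 1.0\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def get(app, path):
    """Call a WSGI app in-process and return (status, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    body = b"".join(app(environ, start_response))
    return captured["status"], body


app = mount("/metrics", placeholder_metrics_app)
```

To serve it, app can be handed to any WSGI server, for example wsgiref.simple_server.make_server('', 8000, app).serve_forever().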

opened by sbates130272 0

MetricWrapperBase labels() method static typing for label names

Hi, recently I was thinking about a possible improvement to the labels method of MetricWrapperBase and friends.

A very common use case is described right in Counter's docstring:

from prometheus_client import Counter
c = Counter('my_requests_total', 'HTTP Failures', ['method', 'endpoint'])
c.labels('get', '/').inc()
c.labels('post', '/submit').inc()

But when you have N different counters, especially with different numbers of label names, in a large legacy codebase, or metrics used in edge cases that are very hard to test (or where the effort of testing them all is not acceptable for some reason), you eventually end up with errors where the number of arguments does not match the label names specified. For example, with the counter above:

try:
    do_something()
except VeryRareException:
    if int(time.time()) % 99999 == 0:
        c.labels('get').inc()  # Surprise!!! ValueError

Maybe we can do better somehow? It would be especially useful if label names like ['method', 'endpoint'] could be passed in a way that type checkers understand, so that errors surface before the code is ever run. Ideally this would be 100% backward compatible with existing implementations (that part will be hard).

To give just one rough idea, TypeVarTuple (https://docs.python.org/3/library/typing.html#typing.TypeVarTuple) could do the job, though only with partial backward compatibility. Here is a PoC for MetricWrapperBase:

Disclaimer: both TypeVarTuple and Self require Python 3.11+.

from typing import TypeVarTuple, Self
...
LabelNames = TypeVarTuple("LabelNames")

class MetricWrapperBase(Collector, Generic[*LabelNames]):
    ...
    def __init__(self,
                 name: str,
                 documentation: str,
                 labelnames: tuple[*LabelNames] = (),
                 namespace: str = '',
                 subsystem: str = '',
                 unit: str = '',
                 registry: Optional[CollectorRegistry] = REGISTRY,
                 _labelvalues: Optional[Sequence[str]] = None,
                 ) -> None:
        ...

    def labels(self: T, *labelvalues: *LabelNames) -> Self:
        ...  # breaking changes there, only args

With that we get the desired result:

x = MetricWrapperBase("x", "y", ("short name", "data"))
x.labels("Ok name", "Ok data")
x.labels("Forgot second arg")

Of course this is very far from perfect: only tuples can be used (no lists), and labels accepts only positional args, not kwargs. Requiring Python 3.11 is also questionable, but there is the typing_extensions library, and this could always live as optional stubs only, or as some nasty overloads.

I am by no means a Python typing ninja, but maybe someone can come up with better ideas or has some thoughts on this topic. New typing features arrive with every Python release, so there may now be solutions that didn't exist a couple of years ago.

opened by rafsaf 0
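As a point of comparison, a workaround that needs no library changes can be sketched with typing.NamedTuple: the label schema becomes a small class, so a type checker sees the required fields, and the values are passed to labels() as keyword arguments. This is not the TypeVarTuple approach proposed above; RequestLabels and LABELNAMES are hypothetical names, and the Counter calls are left as comments so the block runs with the standard library alone.

```python
from typing import NamedTuple


class RequestLabels(NamedTuple):
    """Hypothetical label schema matching labelnames ['method', 'endpoint']."""
    method: str
    endpoint: str


# The field names double as the labelnames list for the metric constructor.
LABELNAMES = list(RequestLabels._fields)

ok = RequestLabels("get", "/")

# With a real counter this would be used roughly as:
#   c = Counter('my_requests_total', 'HTTP Failures', LABELNAMES)
#   c.labels(**ok._asdict()).inc()
# A type checker rejects RequestLabels("get") before the code runs;
# at runtime the same mistake raises TypeError when the tuple is built.
```

The trade-off is one extra class per label set, in exchange for errors that appear where the labels are constructed rather than in a rarely executed branch.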

How to measure time spent by collector.collect?

I have a third-party application that cannot expose any metrics by itself, so I wrote my own exporter that takes data from the app and exposes metrics. The next idea was to measure, and expose to Prometheus, the exact time it takes all of the collectors to get current data from the app. But two different approaches, a decorator on collector.collect() and measurements taken directly inside collector.collect(), yield different results, and I don't understand what the decorator actually measures.

Could you please explain this difference or maybe I'm missing something?

import time
from typing import Iterable
from prometheus_client import make_wsgi_app, Gauge
from prometheus_client.core import REGISTRY, GaugeMetricFamily, Metric
from prometheus_client.registry import Collector
from wsgiref.simple_server import make_server


COLLECTOR_TIME = Gauge('collector_spent_secods', 'Time spent by collector to get data', ['collector'])


class MetricsCollector(Collector):
    @COLLECTOR_TIME.labels('decorator').time()
    def collect(self) -> Iterable[Metric]:
        st = time.perf_counter()
        gauge = GaugeMetricFamily(
            name='some_metric_seconds',
            documentation='Some seconds metric',
            labels=['somename']
        )
        time.sleep(1)
        gauge.add_metric(['foo'], 3.0)
        COLLECTOR_TIME.labels('inside').set(time.perf_counter() - st)
        yield gauge


if __name__ == "__main__":
    REGISTRY.register(MetricsCollector())
    app = make_wsgi_app(REGISTRY)
    host = '0.0.0.0'
    port = 6543
    httpd = make_server(host, port, app)
    httpd.serve_forever()

$ curl localhost:6543 2>/dev/null | egrep -v '^#' | grep 'collector_spent_secods'
collector_spent_secods{collector="decorator"} 7.499998901039362e-06
collector_spent_secods{collector="inside"} 1.0012022000155412
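A likely explanation is that collect() contains a yield and is therefore a generator function: calling it only creates a generator object, and the body runs later, when the registry iterates over the result. A decorator that times the call itself would then measure only generator creation. The sketch below reproduces this with the standard library; timed is a hypothetical stand-in for the decorator form of .time(), not the library's implementation.

```python
import time


def timed(results, key):
    """Hypothetical timing decorator: records how long the call itself takes."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                results[key] = time.perf_counter() - start
        return wrapper
    return decorator


results = {}


@timed(results, "decorator")
def collect():
    start = time.perf_counter()
    time.sleep(0.1)  # simulated slow data fetch
    results["inside"] = time.perf_counter() - start
    yield "sample"


gen = collect()       # returns a generator immediately; the body has not run yet
samples = list(gen)   # the body, including the sleep, runs here
```

After this runs, results["decorator"] holds only the tiny cost of creating the generator, while results["inside"] holds roughly the sleep duration, which mirrors the two values in the curl output above.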

opened by atatarn 0

Add Timer decorator support for partial functions

When we use time to decorate a partial function, we get a "cannot decorate partial function" error. Should the timer support partial functions, @hynek @csmarchbanks? I think it should, and I have tried to add the relevant code.

opened by dafu-wu 3

Add no-op asgi-lifespan handling

Super simple hook that checks for lifespan and immediately responds type.complete, essentially making startup/shutdown lifespan events no-ops for the prometheus ASGI app.

Fixes #855

import asyncio

from hypercorn.asyncio import serve
from hypercorn.config import Config
from hypercorn.middleware import DispatcherMiddleware
from prometheus_client import make_asgi_app


async def main():
    # Mount the Prometheus ASGI app under /metrics.
    app = DispatcherMiddleware({
        "/metrics": make_asgi_app(),
    })

    config = Config()
    config.bind = ["localhost:8080"]

    await serve(app, config)


if __name__ == "__main__":
    asyncio.run(main())

Before:

    assert scope.get("type") == "http"
AssertionError

After:

[2022-11-15 16:05:20 -0800] [59048] [INFO] Running on http://127.0.0.1:8080 (CTRL + C to quit)
INFO:hypercorn.error:Running on http://127.0.0.1:8080 (CTRL + C to quit)

Signed-off-by: Sheena Artrip [email protected]
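The idea of a no-op lifespan handler can be sketched as a standalone ASGI wrapper. This is not the code from the pull request: with_noop_lifespan, placeholder_app and simulate_lifespan are hypothetical names, and the message types are the ones defined by the ASGI lifespan protocol.

```python
import asyncio


def with_noop_lifespan(app):
    """Wrap an ASGI app so lifespan events are acknowledged and otherwise ignored."""
    async def wrapper(scope, receive, send):
        if scope["type"] != "lifespan":
            await app(scope, receive, send)
            return
        while True:
            message = await receive()
            if message["type"] == "lifespan.startup":
                await send({"type": "lifespan.startup.complete"})
            elif message["type"] == "lifespan.shutdown":
                await send({"type": "lifespan.shutdown.complete"})
                return
    return wrapper


async def placeholder_app(scope, receive, send):
    # Mirrors the assertion from the "Before" traceback (assumption: HTTP-only app).
    assert scope.get("type") == "http"


async def simulate_lifespan(app):
    """Drive one startup/shutdown cycle the way an ASGI server would."""
    incoming = asyncio.Queue()
    sent = []
    await incoming.put({"type": "lifespan.startup"})
    await incoming.put({"type": "lifespan.shutdown"})

    async def send(message):
        sent.append(message["type"])

    await app({"type": "lifespan"}, incoming.get, send)
    return sent


sent = asyncio.run(simulate_lifespan(with_noop_lifespan(placeholder_app)))
```

Without the wrapper, the lifespan scope would reach the HTTP-only assertion and fail in the same way as the "Before" output.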

opened by sheenobu 1

Releases(v0.15.0)

v0.15.0(Oct 13, 2022)

[CHANGE] Remove choose_formatter. choose_formatter only existed for v0.14.x and was deprecated in v0.14.1. https://github.com/prometheus/client_python/pull/846
[FEATURE] Support TLS auth when using push gateway with tls_auth_handler. https://github.com/prometheus/client_python/pull/841
[ENHANCEMENT] Add sum, livemin, and livemax multiprocess modes for Gauges. https://github.com/prometheus/client_python/pull/794
v0.14.1(Apr 8, 2022)

[BUGFIX] Revert choose_encoder being renamed to choose_formatter to fix a breaking change. For the 0.14.x release cycle choose_formatter will still exist, but will be removed in 0.15.0. #796
v0.14.0(Apr 5, 2022)

[ENHANCEMENT] Continued typing improvements and coverage. #759, #771, #781
[ENHANCEMENT] Allow binding to IPv6 addresses. #657
[ENHANCEMENT] Negotiate gzip content-encoding, enabled by default. #776
[ENHANCEMENT] Allow disabling _created metrics via the PROMETHEUS_DISABLE_CREATED_SERIES environment variable. #774
[BUGFIX] Correct minor typo in exception raised when exemplar labels are too long. #773
v0.13.1(Jan 28, 2022)

[BUGFIX] Relax some type constraints that were too strict. #754, #755, #756, #758
[BUGFIX] Explicitly export functions with __all__. #757
v0.13.0(Jan 25, 2022)

[CHANGE] Drop support for Python versions 2.7, 3.4, and 3.5. #718
[FEATURE] Support adding labels when using .time() #730
[ENHANCEMENT] Begin to add type hints to functions. #705
[ENHANCEMENT] Improved go-to-declaration behavior for editors. #747
[BUGFIX] Remove trailing slashes from pushgateway URLs. #722
[BUGFIX] Catch non-integer bucket/count values. #726
v0.12.0(Oct 29, 2021)

[FEATURE] Exemplar support (excludes multiprocess) #669
[ENHANCEMENT] Add support for Python 3.10 #706
[ENHANCEMENT] Restricted Registry will handle metrics added after restricting #675, #680
[ENHANCEMENT] Raise a more helpful error if a metric is not observable #666
[BUGFIX] Fix instance_ip_grouping_key not working on MacOS #687
[BUGFIX] Fix assertion error from favicon.ico with Python 2.7 #715
v0.11.0(Jun 1, 2021)

[CHANGE] Specify that the labelvalues argument on metric constructors is internal by renaming it to _labelvalues. If you are affected by this change, it is likely that the metric was not being registered. #660
[BUGFIX] write_to_textfile will overwrite files on Windows. If using Python 3.4 or newer the replace will be atomic. #650
v0.10.1(Apr 8, 2021)

[BUGFIX] Support lowercase prometheus_multiproc_dir environment variable in mark_process_dead. #644
v0.10.0(Apr 2, 2021)
[CHANGE] Python 2.6 is no longer supported. #592

[CHANGE] The prometheus_multiproc_dir environment variable is deprecated in favor of PROMETHEUS_MULTIPROC_DIR. #624

[FEATURE] Follow redirects when pushing to Pushgateway using passthrough_redirect_handler. #622

[FEATURE] Metrics support a clear() method to remove all children. #642

[ENHANCEMENT] Tag support in GraphiteBridge. #618

v0.9.0(Nov 16, 2020)

[ENHANCEMENT] Add support for python3.9 (#600)
[ENHANCEMENT] Various updates for latest OpenMetrics draft spec (#576 #577)
v0.8.0(May 25, 2020)

[FEATURE] Added ASGI application (#512)
[FEATURE] Add support for parsing timestamps in Prometheus exposition format. (#483)
[FEATURE] Add target_info to registries (#453)
[ENHANCEMENT] Handle empty and slashes in label values for pushgateway (#547 #442)
[ENHANCEMENT] Various updates for latest OpenMetrics draft spec (#434 #445 #538 #460 #496)
[ENHANCEMENT] Add HELP output for auto-created metrics (#471)
[ENHANCEMENT] Use mmap.PAGESIZE constant as value for first read. (#505)
[ENHANCEMENT] Add __repr__ method to metric objects, make them debug friendly. (#481)
[ENHANCEMENT] Add observability check to metrics (#455 #520)
[BUGFIX] Fix urlparse in python >= 3.7.6 (#497)
[BUGFIX] Cleaning up name before appending unit on name (#543)
[BUGFIX] Allow for OSError on Google App Engine (#448)
v0.7.1(Jun 20, 2019)

[BUGFIX] multiprocess: don't crash on missing gauge_live/sum files (#424)
[BUGFIX] correctly bind method on Python 2.x (#403)
v0.7.0(Jun 7, 2019)

[ENHANCEMENT] Multiprocess exposition speed boost (#421)
[ENHANCEMENT] optimize openmetrics text parsing (~4x perf) (#402)
[ENHANCEMENT] Add python3.7 support (#418)
[ENHANCEMENT] Change exemplar length limit to be only for label names+values (#397)
[BUGFIX] Disable gcCollector for pypy (#380)
v0.6.0(Feb 19, 2019)

[ENHANCEMENT] Better exceptions on exposition failure (#364)
[BUGFIX] Fix deadlock in gcCollector, metrics are now different (#371)
[BUGFIX] Fix thread leak in Python 3.7 (#356)
[BUGFIX] Make the format strings compatible with Python 2.6 (#361)
[BUGFIX] parser: ensure samples are of type Sample (#358)
v0.5.0(Dec 6, 2018)

[ENHANCEMENT] Be more resilient to certain file corruptions (#329)
[ENHANCEMENT] Permit subclassing of MetricsHandler (#339)
[ENHANCEMENT] Updates based on latest OpenMetrics draft spec discussions (#338 #346)
[BUGFIX] In multiprocess mode, ensure that metrics initialise to the correct file (#346)
[BUGFIX] Avoid re-entrant calls to GC collector's callback (#343)
v0.4.2(Oct 15, 2018)

[BUGFIX] Disable GCCollector in multiprocess mode to prevent a deadlock
v0.4.1(Oct 9, 2018)

[BUGFIX] Fix OpenMetrics http negotiation handling
v0.4.0(Oct 3, 2018)

[CHANGE] Counter time series will now always be exposed with _total, and counter metrics will have a _total suffix stripped. This is as the internal data model is now OpenMetrics, rather than Prometheus Text Format (#300)
[CHANGE] Samples now use a namedtuple (#300)
[FEATURE] Add OpenMetrics exposition and parser (#300 #306)
[FEATURE] Add Info, Stateset, Enum, GaugeHistogram support for OpenMetrics (#300)
[FEATURE] Add timestamp support for Prometheus text format exposition (#300)
[FEATURE] Add garbage collection metrics (#301)
[ENHANCEMENT] If reading multiprocess file, open it readonly. (#307)
[BUGFIX] Fix bug in WSGI app code. (#307)
[BUGFIX] Write to multiprocess files directly (#315)
v0.3.1(Jul 30, 2018)

[BUGFIX] Fix handling of escaping in parser
[BUGFIX] Fix concurrency issues with timers
v0.3.0(Jul 10, 2018)

[ENHANCEMENT] 4.5x speedup in parser #282
[ENHANCEMENT] Performance improvements for multiproc mode #266
[BUGFIX] Fix FD leak in multiproc mode #269
v0.2.0(Apr 3, 2018)

[CHANGE/ENHANCEMENT] Set default timeout of 30s on pushgateway actions
[ENHANCEMENT] Various performance improvements to multi-process mode
[BUGFIX] Handle QUERY_STRING not being present for WSGI
v0.1.1(Jan 15, 2018)

[BUGFIX] Handle non-ASCII characters in /proc/pid/stat
[BUGFIX] Make check for Python 2.6 work on development versions of Python
v0.1.0(Dec 14, 2017)

[FEATURE] Add UntypedMetricFamily
[FEATURE] Allow start_http_server to take a registry, for use in multiprocess setups
[ENHANCEMENT] Don't log requests to WSGI server
[ENHANCEMENT] Improved error handling when prometheus_multiproc_dir isn't set
[BUGFIX] Handle /proc/self/fd not being accessible
[BUGFIX] Workaround urlparse bug in Python 2.6
v0.0.21(Sep 14, 2017)

[BUGFIX] In multi-proc mode correctly handle metrics being created in both parent and child processes
[BUGFIX] Handle iterators being passed as labelnames to *MetricFamily
[ENHANCEMENT] Python 3.6 now officially supported
v0.0.20(Jul 19, 2017)

[FEATURE] Support all modes of multi-process operation in multiproc mode, and it's a little faster too
[FEATURE] Add platform collector by default to add information about the Python/JVM runtime
[ENHANCEMENT] Httpserver now multi-threaded
[BUGFIX] Use namespace/subsystem correctly in multiproc mode
[BUGFIX] Support labelnames being an empty list
0.0.19(Jan 31, 2017)

[FEATURE] Support basic auth and allow for custom handlers for talking to the pushgateway
[BUGFIX] Support trailing commas in parser
0.0.18(Nov 24, 2016)

[FEATURE] Add optional describe() method on collectors, fallback to collect() if not present and explicitly requested on the registry. This is enabled on the default registry
[FEATURE] Use describe() method to raise an exception on duplicate time series names in a registry
[FEATURE] Add support for ?name[]=xxx to limit what metrics names are returned over http from a registry
[BUGFIX] An exception in a collector now causes a 500 rather than a blank 200
[BUGFIX] Disallow colon in label names
[BUGFIX] Correctly parse untyped metrics into one metric, not several
0.0.17(Oct 19, 2016)

[BUGFIX] Gauge.set_to_current_time to return correct value on Python3
0.0.16(Oct 10, 2016)

[FEATURE] Experimental multi-process support added
0.0.15(Oct 2, 2016)

It's no longer possible to pass in a dict to labels(), instead use labels(**dict).

[FEATURE] labels function supports labels as keyword arguments
[CHANGE] labels function no longer supports being passed a dict
[FEATURE] Pushgateway can now be specified as a URL prefix, allowing for https
[IMPROVEMENT] Cleanup of process collector
[FEATURE] Signatures of decorated functions are now preserved

Owner

Prometheus

Diamond is a python daemon that collects system metrics and publishes them to Graphite (and others). It is capable of collecting cpu, memory, network, i/o, load and disk metrics. Additionally, it features an API for implementing custom collectors for gathering metrics from almost any source.

GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.

ScoutAPM Python Agent. Supports Django, Flask, and many other frameworks.

Was an interactive continuous Python profiler.

Development tool to measure, monitor and analyze the memory behavior of Python objects in a running Python application.

Sentry is cross-platform application monitoring, with a focus on error reporting.

Display machine state using Python3 with Flask.

Visual profiler for Python

Automatically monitor the evolving performance of Flask/Python web services.

Glances an Eye on your system. A top/htop alternative for GNU/Linux, BSD, Mac OS and Windows operating systems.

Call-graph profiling for TwinCAT 3

Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automatically use request headers such as x-request-id or x-correlation-id.

Exports osu! user stats to prometheus metrics for a specified set of users

Prometheus instrumentation library for Python applications

Scalene: a high-performance, high-precision CPU and memory profiler for Python

Yet Another Python Profiler, but this time thread&coroutine&greenlet aware.