A Python wrapper for libmagic

Overview

python-magic


python-magic is a Python interface to the libmagic file type identification library. libmagic identifies file types by checking their headers according to a predefined list of file types. This functionality is exposed to the command line by the Unix command file.

Usage

>>> import magic
>>> magic.from_file("testdata/test.pdf")
'PDF document, version 1.2'
# recommend using at least the first 2048 bytes, as less can produce incorrect identification
>>> magic.from_buffer(open("testdata/test.pdf", "rb").read(2048))
'PDF document, version 1.2'
>>> magic.from_file("testdata/test.pdf", mime=True)
'application/pdf'
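
The buffer-based call above can be wrapped so the file handle is always closed and only the header is read. This is a minimal sketch, not part of python-magic: `read_header` and `sniff_mime` are hypothetical helper names, and the 2048-byte default simply follows the recommendation in the comment above.

```python
HEADER_BYTES = 2048  # sample size recommended in the usage notes above


def read_header(path, size=HEADER_BYTES):
    """Return up to `size` leading bytes of the file at `path`."""
    with open(path, "rb") as fh:
        return fh.read(size)


def sniff_mime(path):
    """Identify a file's MIME type from its header only (hypothetical helper)."""
    import magic  # imported lazily so read_header works without libmagic

    return magic.from_buffer(read_header(path), mime=True)
```

Calling `sniff_mime("testdata/test.pdf")` should behave like the `from_buffer` example above, but with `mime=True`.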

There is also a Magic class that provides more direct control, including overriding the magic database file and turning on character encoding detection. This is not recommended for general use. In particular, it is not safe to share across multiple threads and will throw if this is attempted.

>>> f = magic.Magic(uncompress=True)
>>> f.from_file('testdata/test.gz')
'ASCII text (gzip compressed data, was "test", last modified: Sat Jun 28
21:32:52 2008, from Unix)'

You can also combine the flag options:

>>> f = magic.Magic(mime=True, uncompress=True)
>>> f.from_file('testdata/test.gz')
'text/plain'
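
Because a Magic instance must not be shared across threads, one possible workaround is to keep a separate instance per thread. The sketch below uses `threading.local` from the standard library. `get_detector` and `_default_factory` are hypothetical names, not part of python-magic, and the `mime=True` default is only an illustration.

```python
import threading

_thread_state = threading.local()  # holds one detector per thread


def _default_factory():
    import magic  # lazy import; requires python-magic and libmagic

    return magic.Magic(mime=True)


def get_detector(factory=_default_factory):
    """Return this thread's detector, creating it on first use."""
    detector = getattr(_thread_state, "detector", None)
    if detector is None:
        detector = factory()
        _thread_state.detector = detector
    return detector
```

Each worker thread would then call `get_detector().from_file(path)` instead of using a module-level Magic object.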

Installation

The current stable version of python-magic is available on PyPI and can be installed by running pip install python-magic.

This module is a simple wrapper around the libmagic C library, and that must be installed as well:

Debian/Ubuntu

sudo apt-get install libmagic1

Windows

You'll need DLLs for libmagic. @julian-r maintains a PyPI package with the DLLs; you can fetch it with:

pip install python-magic-bin

OSX

  • When using Homebrew: brew install libmagic
  • When using MacPorts: port install file
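
To check whether the C library is visible before importing the wrapper, a rough probe with the standard library's `ctypes.util.find_library` can help. This is only an approximation: python-magic has its own loading logic, which may find the library under other names or paths, so a `None` result here is a hint rather than proof. `find_libmagic` is a hypothetical helper name.

```python
import ctypes.util


def find_libmagic():
    """Return the name or path of a libmagic shared library, or None if the system lookup finds nothing."""
    return ctypes.util.find_library("magic")


if __name__ == "__main__":
    location = find_libmagic()
    if location is None:
        print("libmagic not found; install it with your platform's package manager")
    else:
        print(f"libmagic found: {location}")
```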

Troubleshooting

  • 'MagicException: could not find any magic files!': some installations of libmagic do not correctly point to their magic database file. Try specifying the path to the file explicitly in the constructor: magic.Magic(magic_file="path_to_magic_file").

  • 'WindowsError: [Error 193] %1 is not a valid Win32 application': Attempting to run the 32-bit libmagic DLL in a 64-bit build of Python will fail with this error. Here are 64-bit builds of libmagic for Windows: https://github.com/pidydx/libmagicwin64. Newer versions can be found here: https://github.com/nscaife/file-windows.

  • 'WindowsError: exception: access violation writing 0x00000000': This may indicate you are mixing Windows Python and Cygwin Python. Make sure your libmagic and Python builds are consistent.
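
The first two items above can be sketched in code: look for a magic database in a few places and pass it to the constructor, and check whether the running interpreter is 32-bit or 64-bit before choosing a DLL. The candidate paths are assumptions that vary between systems, and `first_existing`, `python_bitness`, and `make_detector` are hypothetical helper names. Only `magic.Magic(magic_file=...)` comes from the text above.

```python
import os
import struct

# Assumed candidate locations; adjust for your system.
CANDIDATE_MAGIC_FILES = (
    "/usr/share/misc/magic.mgc",
    "/usr/share/file/magic.mgc",
    "/usr/local/share/misc/magic.mgc",
)


def first_existing(paths=CANDIDATE_MAGIC_FILES):
    """Return the first path that is an existing file, or None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None


def python_bitness():
    """Return 32 or 64, based on the pointer size of this interpreter."""
    return struct.calcsize("P") * 8


def make_detector(paths=CANDIDATE_MAGIC_FILES):
    """Build a Magic object, naming the database explicitly if one is found."""
    import magic  # lazy import; requires python-magic and libmagic

    magic_file = first_existing(paths)
    if magic_file is None:
        return magic.Magic()  # fall back to libmagic's default lookup
    return magic.Magic(magic_file=magic_file)
```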

Bug Reports

python-magic is a thin layer over the libmagic C library. Historically, most bugs reported against python-magic have actually been bugs in libmagic; libmagic bugs can be reported on their tracker here: https://bugs.astron.com/my_view_page.php. If you're not sure where the bug lies, feel free to file an issue on GitHub and I can triage it.

Running the tests

To run the tests across a variety of Linux distributions (depends on Docker):

./test_docker.sh

To run tests locally across all available Python versions:

./test/run.py

To run against a specific Python version:

LC_ALL=en_US.UTF-8 python3 test/test.py

libmagic and python-magic

See COMPAT.md for a guide to libmagic / python-magic compatibility.

Versioning

Minor version bumps should be backwards compatible. Major bumps are not.

Author

Written by Adam Hupp in 2001 for a project that never got off the ground. It originally used SWIG for the C library bindings, but switched to ctypes once that was part of the Python standard library.

You can contact me via my website or GitHub.

License

python-magic is distributed under the MIT license. See the included LICENSE file for details.

I am providing code in the repository to you under an open source license. Because this is my personal repository, the license you receive to my code is from me and not my employer (Facebook).
