A python wrapper for libmagic

Overview

python-magic

PyPI version Build Status

python-magic is a Python interface to the libmagic file type identification library. libmagic identifies file types by checking their headers according to a predefined list of file types. This functionality is exposed to the command line by the Unix command file.

Usage

>>> import magic
>>> magic.from_file("testdata/test.pdf")
'PDF document, version 1.2'
# recommend using at least the first 2048 bytes, as less can produce incorrect identification
>>> magic.from_buffer(open("testdata/test.pdf", "rb").read(2048))
'PDF document, version 1.2'
>>> magic.from_file("testdata/test.pdf", mime=True)
'application/pdf'

There is also a Magic class that provides more direct control, including overriding the magic database file and turning on character encoding detection. This is not recommended for general use. In particular, it's not safe for sharing across multiple threads and will fail throw if this is attempted.

>>> f = magic.Magic(uncompress=True)
>>> f.from_file('testdata/test.gz')
'ASCII text (gzip compressed data, was "test", last modified: Sat Jun 28
21:32:52 2008, from Unix)'

You can also combine the flag options:

>>> f = magic.Magic(mime=True, uncompress=True)
>>> f.from_file('testdata/test.gz')
'text/plain'

Installation

The current stable version of python-magic is available on PyPI and can be installed by running pip install python-magic.

Other sources:

This module is a simple wrapper around the libmagic C library, and that must be installed as well:

Debian/Ubuntu

sudo apt-get install libmagic1

Windows

You'll need DLLs for libmagic. @julian-r maintains a pypi package with the DLLs, you can fetch it with:

pip install python-magic-bin

OSX

  • When using Homebrew: brew install libmagic
  • When using macports: port install file

Troubleshooting

  • 'MagicException: could not find any magic files!': some installations of libmagic do not correctly point to their magic database file. Try specifying the path to the file explicitly in the constructor: magic.Magic(magic_file="path_to_magic_file").

  • 'WindowsError: [Error 193] %1 is not a valid Win32 application': Attempting to run the 32-bit libmagic DLL in a 64-bit build of python will fail with this error. Here are 64-bit builds of libmagic for windows: https://github.com/pidydx/libmagicwin64. Newer version can be found here: https://github.com/nscaife/file-windows.

  • 'WindowsError: exception: access violation writing 0x00000000 ' This may indicate you are mixing Windows Python and Cygwin Python. Make sure your libmagic and python builds are consistent.

Bug Reports

python-magic is a thin layer over the libmagic C library. Historically, most bugs that have been reported against python-magic are actually bugs in libmagic; libmagic bugs can be reported on their tracker here: https://bugs.astron.com/my_view_page.php. If you're not sure where the bug lies feel free to file an issue on GitHub and I can triage it.

Running the tests

To run the tests across a variety of linux distributions (depends on Docker):

./test_docker.sh

To run tests locally across all available python versions:

./test/run.py

To run against a specific python version:

LC_ALL=en_US.UTF-8 python3 test/test.py

libmagic and python-magic

See COMPAT.md for a guide to libmagic / python-magic compatability.

Versioning

Minor version bumps should be backwards compatible. Major bumps are not.

Author

Written by Adam Hupp in 2001 for a project that never got off the ground. It originally used SWIG for the C library bindings, but switched to ctypes once that was part of the python standard library.

You can contact me via my website or GitHub.

License

python-magic is distributed under the MIT license. See the included LICENSE file for details.

I am providing code in the repository to you under an open source license. Because this is my personal repository, the license you receive to my code is from me and not my employer (Facebook).

Owner
Adam Hupp
Adam Hupp
LightCSV - This CSV reader is implemented in just pure Python.

LightCSV Simple light CSV reader This CSV reader is implemented in just pure Python. It allows to specify a separator, a quote char and column titles

Jose Rodriguez 6 Mar 05, 2022
This is a file deletion program that asks you for an extension of a file (.mp3, .pdf, .docx, etc.) to delete all of the files in a dir that have that extension.

FileBulk This is a file deletion program that asks you for an extension of a file (.mp3, .pdf, .docx, etc.) to delete all of the files in a dir that h

Enoc Mena 1 Jun 26, 2022
A Certificate renaming tool made for IEEE CS SBC, SJCE.

PDF Batch Renamer Made for IEEE CS SBC, SJCE How to use? Before using the python script, ensure that pytesseract, pdf2image, opencv and other supporti

Ashwin Kumar U 2 Nov 14, 2021
Fast Python reader and editor for ASAM MDF / MF4 (Measurement Data Format) files

asammdf is a fast parser and editor for ASAM (Association for Standardization of Automation and Measuring Systems) MDF (Measurement Data Format) files

Daniel Hrisca 440 Dec 31, 2022
Read and write TIFF files

Read and write TIFF files Tifffile is a Python library to store numpy arrays in TIFF (Tagged Image File Format) files, and read image and metadata fro

Christoph Gohlke 346 Dec 18, 2022
Import Python modules from any file system path

pathimp Import Python modules from any file system path. Installation pip3 install pathimp Usage import pathimp

Danijar Hafner 2 Nov 29, 2021
This simple python script pcopy reads a list of file names and copies them to a separate folder

pCopy This simple python script pcopy reads a list of file names and copies them to a separate folder. Pre-requisites Python 3 (ver. 3.6) How to use

Madhuranga Rathnayake 0 Sep 03, 2021
Creates folders into a directory to categorize files in that directory by file extensions and move all things from sub-directories to current directory.

Categorize and Uncategorize Your Folders Table of Content TL;DR just take me to how to install. What are Extension Categorizer and Folder Dumper Insta

Furkan Baytekin 1 Oct 17, 2021
ValveVMF - A python library to parse Valve's VMF files

ValveVMF ValveVMF is a Python library for parsing .vmf files for the Source Engi

pySourceSDK 2 Jan 02, 2022
Lumar - Smart File Creator

Lumar is a free tool for creating and managing files. With Lumar you can quickly create any type of file, add a file content and file size. With Lumar you can also find out if Photoshop or other imag

Paul - FloatDesign 3 Dec 10, 2021
Uproot is a library for reading and writing ROOT files in pure Python and NumPy.

Uproot is a library for reading and writing ROOT files in pure Python and NumPy. Unlike the standard C++ ROOT implementation, Uproot is only an I/O li

Scikit-HEP Project 164 Dec 31, 2022
Extract longest transcript or longest CDS transcript from GTF annotation file or gencode transcripts fasta file.

Extract longest transcript or longest CDS transcript from GTF annotation file or gencode transcripts fasta file.

laojunjun 13 Nov 23, 2022
Listreqs is a simple requirements.txt generator. It's an alternative to pipreqs

⚡ Listreqs Listreqs is a simple requirements.txt generator. It's an alternative to pipreqs. Where in Pipreqs, it helps you to Generate requirements.tx

Soumyadip Sarkar 4 Oct 15, 2021
Vericopy - This Python script provides various usage modes for secure local file copying and hashing.

Vericopy This Python script provides various usage modes for secure local file copying and hashing. Hash data is captured and logged for paths before

15 Nov 05, 2022
Organizer is a python program that organizes your downloads folder

Organizer Organizer is a python program that organizes your downloads folder, it can run as a service and so will start along with the system, and the

Gustavo 2 Oct 18, 2021
This is just a GUI that detects your file's real extension using the filetype module.

Real-file.extnsn This is just a GUI that detects your file's real extension using the filetype module. Requirements Python 3.4 and above filetype modu

1 Aug 08, 2021
Better directory iterator and faster os.walk(), now in the Python 3.5 stdlib

scandir, a better directory iterator and faster os.walk() scandir() is a directory iteration function like os.listdir(), except that instead of return

Ben Hoyt 506 Dec 29, 2022
Simple, convenient and cross-platform file date changing library. 📝📅

Simple, convenient and cross-platform file date changing library.

kubinka0505 15 Dec 18, 2022
Add Ranges and page numbers to IIIF Manifest from a CSV.

Add Ranges and page numbers to IIIF Manifest from CSV specific to a workflow of the Bibliotheca Hertziana.

Raffaele Viglianti 3 Apr 28, 2022
Yadl - it is a simple library for working with both dotenv files and environment variables.

Yadl Yadl - it is a simple library for working with both dotenv files and environment variables. Features Validation of whitespaces. Validation of num

Ivan Kapranov 3 Oct 19, 2021