🍏 Make Thinc faster on macOS by calling into Apple's native Accelerate library

Overview

thinc-apple-ops

Make spaCy and Thinc up to 8 × faster on macOS by calling into Apple's native libraries.

Install

Make sure you have Xcode installed and then install with pip:

pip install thinc-apple-ops

🏫 Motivation

Matrix multiplication is one of the primary operations in machine learning. Since matrix multiplication is computationally expensive, using a fast matrix multiplication implementation can speed up training and prediction significantly.

Most linear algebra libraries provide matrix multiplication in the form of the standardized BLAS gemm functions. The work behind scences is done by a set of matrix multiplication kernels that are meticulously tuned for specific architectures. Matrix multiplication kernels use architecture-specific SIMD instructions for data-level parallism and can take factors such as cache sizes and intstruction latency into account. Thinc uses the BLIS linear algebra library, which provides optimized matrix multiplication kernels for most x86_64 and some ARM CPUs.

Recent Apple Silicon CPUs, such as the M-series used in Macs, differ from traditional x86_64 and ARM CPUs in that they have a separate matrix co-processor(s) called AMX. Since AMX is not well-documented, it is unclear how many AMX units Apple M CPUs have. It is certain that the (single) performance cluster of the M1 has an AMX unit and there is empirical evidence that both performance clusters of the M1 Pro/Max have an AMX unit.

Even though AMX units use a set of undocumented instructions, the units can be used through Apple's Accelerate linear algebra library. Since Accelerate implements the BLAS interface, it can be used as a replacement of the BLIS library that is used by Thinc. This is where the thinc-apple-ops package comes in. thinc-apple-ops extends the default Thinc ops, so that gemm matrix multiplication from Accelerate is used in place of the BLIS implementation of gemm. As a result, matrix multiplication in Thinc is performed on the fast AMX unit(s).

Benchmarks

Using thinc-apple-ops leads to large speedups in prediction and training on Apple Silicon Macs, as shown by the benchmarks below.

Prediction

This first benchark compares prediction speed of the de_core_news_lg spaCy model between the M1 with and without thinc-apple-ops. Results for an Intel Mac Mini and AMD Ryzen 5900X are also provided for comparison. Results are in words per second. In this prediction benchmark, using thinc-apple-ops improves performance by 4.3 times.

CPU BLIS thinc-apple-ops Package power (Watt)
Mac Mini (M1) 6492 27676 5
MacBook Air Core i5 2020 9790 10983 9
AMD Ryzen 5900X 22568 N/A 52

Training

In the second benchmark, we compare the training speed of the de_core_news_lg spaCy model (without NER). The results are in training iterations per second. Using thinc-apple-ops improves training time by 3.0 times.

CPU BLIS thinc-apple-ops Package power (Watt)
Mac Mini M1 2020 3.34 10.07 5
MacBook Air Core i5 2020 3.10 3.27 10
AMD Ryzen 5900X 6.53 N/A 53
Comments
  • Pass through Accelerate sgemm/saxpy in Ops.cblas

    Pass through Accelerate sgemm/saxpy in Ops.cblas

    This can be used by e.g. the parser in spaCy 3.4 to use Accelerate's implementations.

    I am not sure how to handle this dependency-wise, since this requires Thinc 8.1, but we still want to people to be able to use thinc-apple-ops with Thinc 8.0.x and spaCy < 3.4. Do we need another minor release that sets thinc < 8.1.0?

    opened by danieldk 5
  • IndexError: Out of bounds on buffer access (axis 1)

    IndexError: Out of bounds on buffer access (axis 1)

    Hi I tried to use this awesome package and I am getting this error. Not sure what it means, maybe you guys could help me?

    I should mention that my data is quite big and I am also using some SWAP space. Could this be the reason of this error?

    [2021-09-28 21:09:01,238] [INFO] Set up nlp object from config
    [2021-09-28 21:09:01,500] [INFO] Pipeline: ['tok2vec', 'ner', 'sentencizer', 'entity_linker']
    [2021-09-28 21:09:01,505] [INFO] Created vocabulary
    [2021-09-28 21:09:01,505] [INFO] Finished initializing nlp object
    Traceback (most recent call last):
      File "/Users/joozty/Documents/kolurbo/venv/bin/spacy", line 8, in <module>
        sys.exit(setup_cli())
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/spacy/cli/_util.py", line 69, in setup_cli
        command(prog_name=COMMAND)
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/click/core.py", line 1137, in __call__
        return self.main(*args, **kwargs)
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/click/core.py", line 1062, in main
        rv = self.invoke(ctx)
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/click/core.py", line 1668, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/click/core.py", line 763, in invoke
        return __callback(*args, **kwargs)
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/typer/main.py", line 500, in wrapper
        return callback(**use_params)  # type: ignore
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/spacy/cli/train.py", line 60, in train_cli
        nlp = init_nlp(config, use_gpu=use_gpu)
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/spacy/training/initialize.py", line 84, in init_nlp
        nlp.initialize(lambda: train_corpus(nlp), sgd=optimizer)
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/spacy/language.py", line 1272, in initialize
        proc.initialize(get_examples, nlp=self, **p_settings)
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/spacy/pipeline/tok2vec.py", line 216, in initialize
        self.model.initialize(X=doc_sample)
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/thinc/model.py", line 299, in initialize
        self.init(self, X=X, Y=Y)
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/thinc/layers/chain.py", line 86, in init
        layer.initialize(X=curr_input, Y=Y)
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/thinc/model.py", line 299, in initialize
        self.init(self, X=X, Y=Y)
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/thinc/layers/chain.py", line 90, in init
        curr_input = layer.predict(curr_input)
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/thinc/model.py", line 315, in predict
        return self._func(self, X, is_train=False)[0]
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/thinc/layers/concatenate.py", line 44, in forward
        Ys, callbacks = zip(*[layer(X, is_train=is_train) for layer in model.layers])
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/thinc/layers/concatenate.py", line 44, in <listcomp>
        Ys, callbacks = zip(*[layer(X, is_train=is_train) for layer in model.layers])
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/thinc/model.py", line 291, in __call__
        return self._func(self, X, is_train=is_train)
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/spacy/ml/staticvectors.py", line 46, in forward
        vectors_data = model.ops.gemm(model.ops.as_contig(V[rows]), W, trans2=True)
      File "/Users/joozty/Documents/kolurbo/venv/lib/python3.9/site-packages/thinc_apple_ops/ops.py", line 25, in gemm
        C = blas.gemm(x, y, trans1=trans1, trans2=trans2)
      File "thinc_apple_ops/blas.pyx", line 37, in thinc_apple_ops.blas.gemm
      File "thinc_apple_ops/blas.pyx", line 53, in thinc_apple_ops.blas.gemm
    IndexError: Out of bounds on buffer access (axis 1)
    

    Info about spaCy

    • spaCy version: 3.1.3
    • Platform: macOS-11.6-arm64-arm-64bit
    • Python version: 3.9.7
    • Pipelines: en_core_web_sm (3.1.0), en_core_web_md (3.1.0)
    opened by Joozty 2
  • Can't compile thinc on Macbook Air M1

    Can't compile thinc on Macbook Air M1

    Hello, I find myself unable to compile this otherwise magnificent tool! Please help, if you can!

    I am on MacOS 12.1, Kernel Version 21.2.0, and have installed the latest Python (3.10.2)

    Here is the error message I get after trying to install with pip (apparently it can't find the Accelerate Libraries, especially Accelerate.h Header ...):

    ERROR: Command errored out with exit status 1: command: /Library/Frameworks/Python.framework/Versions/3.10/bin/python3.10 /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py build_wheel /var/folders/n7/t2plqm6n2jq4khmj0bckswg40000gq/T/tmp0bhlw2sh cwd: /private/var/folders/n7/t2plqm6n2jq4khmj0bckswg40000gq/T/pip-install-wgga78t9/thinc-apple-ops_f5b38888c7a149cd9f99fd524c2bd340 Complete output (34 lines): running bdist_wheel running build running build_py creating build creating build/lib.macosx-10.9-universal2-3.10 creating build/lib.macosx-10.9-universal2-3.10/thinc_apple_ops copying thinc_apple_ops/init.py -> build/lib.macosx-10.9-universal2-3.10/thinc_apple_ops copying thinc_apple_ops/ops.py -> build/lib.macosx-10.9-universal2-3.10/thinc_apple_ops creating build/lib.macosx-10.9-universal2-3.10/thinc_apple_ops/tests copying thinc_apple_ops/tests/init.py -> build/lib.macosx-10.9-universal2-3.10/thinc_apple_ops/tests copying thinc_apple_ops/tests/test_gemm.py -> build/lib.macosx-10.9-universal2-3.10/thinc_apple_ops/tests running egg_info warning: no files found matching '.pxd' under directory 'thinc_apple_ops' warning: no files found matching '.txt' under directory 'thinc_apple_ops' writing manifest file 'thinc_apple_ops.egg-info/SOURCES.txt' copying thinc_apple_ops/blas.pyx -> build/lib.macosx-10.9-universal2-3.10/thinc_apple_ops copying thinc_apple_ops/py.typed -> build/lib.macosx-10.9-universal2-3.10/thinc_apple_ops running build_ext creating build/temp.macosx-10.9-universal2-3.10 creating build/temp.macosx-10.9-universal2-3.10/thinc_apple_ops clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -arch arm64 -arch x86_64 -g -I/private/var/folders/n7/t2plqm6n2jq4khmj0bckswg40000gq/T/pip-build-env-b0flamc2/overlay/lib/python3.10/site-packages/numpy/core/include -I/Library/Frameworks/Python.framework/Versions/3.10/include/python3.10 -c thinc_apple_ops/blas.c -o build/temp.macosx-10.9-universal2-3.10/thinc_apple_ops/blas.o In file included from thinc_apple_ops/blas.c:706: In file included from /private/var/folders/n7/t2plqm6n2jq4khmj0bckswg40000gq/T/pip-build-env-b0flamc2/overlay/lib/python3.10/site-packages/numpy/core/include/numpy/arrayobject.h:5: In file included from /private/var/folders/n7/t2plqm6n2jq4khmj0bckswg40000gq/T/pip-build-env-b0flamc2/overlay/lib/python3.10/site-packages/numpy/core/include/numpy/ndarrayobject.h:12: In file included from /private/var/folders/n7/t2plqm6n2jq4khmj0bckswg40000gq/T/pip-build-env-b0flamc2/overlay/lib/python3.10/site-packages/numpy/core/include/numpy/ndarraytypes.h:1960: /private/var/folders/n7/t2plqm6n2jq4khmj0bckswg40000gq/T/pip-build-env-b0flamc2/overlay/lib/python3.10/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-W#warnings] #warning "Using deprecated NumPy API, disable it with "
    ^ thinc_apple_ops/blas.c:714:10: fatal error: 'Accelerate/Accelerate.h' file not found #include "Accelerate/Accelerate.h" ^~~~~~~~~~~~~~~~~~~~~~~~~ thinc_apple_ops/blas.c:714:10: note: did not find header 'Accelerate.h' in framework 'Accelerate' (loaded from '/System/Library/Frameworks') 1 warning and 1 error generated. error: command '/Library/Developer/CommandLineTools/usr/bin/clang' failed with exit code 1

    ERROR: Failed building wheel for thinc-apple-ops Failed to build thinc-apple-ops ERROR: Could not build wheels for thinc-apple-ops, which is required to install pyproject.toml-based projects

    ------------------------------------------ END---------------------------------------------------------------------------

    Any help would be greatly appreciated, thanks!

    duplicate 
    opened by amal1us 1
  • AppleOps.gemm: write in-place when `output` is given

    AppleOps.gemm: write in-place when `output` is given

    NumpyOps.gemm (with BLIS) writes the result of matrix multiplication in-place when the output argument is given. This changes AppleOps.gemm to do the same, avoiding allocation of a temporary.

    enhancement 
    opened by danieldk 0
  • Change thinc upper bound to <8.1.0

    Change thinc upper bound to <8.1.0

    thinc-apple-ops will require thinc >= 8.1.0 in the future for the CBLAS passthrough functionality. As discussed in #15, we should first do another minor thinc-apple-ops release specifically for thinc <8.1.0.

    Also bump the version to v0.0.7 to prepare for the release.

    opened by danieldk 0
  • Fix 0-size arrays

    Fix 0-size arrays

    Our bit of Cython code uses memory buffers, which apparently have a bounds-check when the size is 0 when acquiring the pointer. In contrast, in other bits of code we often acquire the buffer by casting the array.data pointer, which has no such bounds check. This led to IndexError being raised when zero shapes were passed through.

    opened by honnibal 0
  • Require thinc with ops registry

    Require thinc with ops registry

    Technically it doesn't require a currently unreleased version of thinc to run, but if people install it into an existing venv, then it's better to require the version of thinc to upgraded so that it's detected and used.

    opened by adrianeboyd 0
Releases(v0.1.3)
Owner
Explosion
A software company specializing in developer tools for Artificial Intelligence and Natural Language Processing
Explosion
A web-based chat application that enables multiple users to interact with one another

A web-based chat application that enables multiple users to interact with one another, in the same chat room or different ones according to their choosing.

3 Apr 22, 2022
Statically typed BNF with semantic actions; A frontend of frontend frameworks; Use your grammar everywhere.

Statically typed BNF with semantic actions; A frontend of frontend frameworks; Use your grammar everywhere.

Taine Zhao 56 Dec 14, 2022
Banking management project using Tkinter GUI in python.

Bank-Management Banking management project using Tkinter GUI in python. Packages required Tkinter - Tkinter is the standard GUI library for Python. sq

Anjali Kumawat 7 Jul 03, 2022
Streamlit apps done following data professor's course on YouTube

streamlit-twelve-apps Streamlit apps done following data professor's course on YouTube Español Curso de apps de data science hecho por Data Professor

Federico Bravin 1 Jan 10, 2022
CupScript is a simple programing language made with python

CupScript CupScript is a simple programming language made with python It includes some basic functions, variables, loops, and some other built in func

FUSEN 23 Dec 29, 2022
We want to check several batch of web URLs (1~100 K) and find the phishing website/URL among them.

We want to check several batch of web URLs (1~100 K) and find the phishing website/URL among them. This module is designed to do the URL/web attestation by using the API from NUS-Phishperida-Project.

3 Dec 28, 2022
A simple API to upload notes or files to KBFS

This API can be used to upload either secure notes or files to a secure KeybaseFS folder.

Dakota Brown 1 Oct 08, 2021
Python library to decorate and beautify strings

outputformater Python library to decorate and beautify your standard output 💖 I

Felipe Delestro Matos 259 Dec 13, 2022
SpaCy3Urdu: run command to setup assets(dataset from UD)

Project setup run command to setup assets(dataset from UD) spacy project assets It uses project.yml file and download the data from UD GitHub reposito

Muhammad Irfan 1 Dec 14, 2021
Ronin - Create Fud Meterpreter Payload To Hack Windows 11

Ronin - Create Fud Meterpreter Payload To Hack Windows 11

Dj4w3d H4mm4di 6 May 09, 2022
Blender addons - A collection of Blender tools I've written for myself over the years.

gret A collection of Blender tools I've written for myself over the years. I use these daily so they should be bug-free, mostly. Feel free to take and

217 Jan 08, 2023
Old versions of Deadcord that are problematic or used as reference.

⚠️ Unmaintained and broken. We have decided to release the old version of Deadcord before our v1.0 rewrite. (which will be equiped with much more feat

Galaxzy 1 Feb 10, 2022
RELATE is an Environment for Learning And TEaching

RELATE Relate is an Environment for Learning And TEaching RELATE is a web-based courseware package. It is set apart by the following features: Focus o

Andreas Klöckner 311 Dec 25, 2022
This repository contains Python games that I've worked on. You'll learn how to create python games with AI. I try to focus on creating board games without GUI in Jupyter-notebook.

92_Python_Games 🎮 Introduction 👋 This repository contains Python games that I've worked on. You'll learn how to create python games with AI. I try t

Milaan Parmar / Милан пармар / _米兰 帕尔马 166 Jan 01, 2023
One Ansible Module for using LINE notify API to send notification. It can be required in the collection list.

Ansible Collection - hazel_shen.line_notify Documentation for the collection. ansible-galaxy collection install hazel_shen.line_notify --ignore-certs

Hazel Shen 4 Jul 19, 2021
2021华为软件精英挑战赛 程序输出分析器

AutoGrader 0.2.0更新:加入资源分配溢出检测,如果发生资源溢出会输出溢出发生的位置。 如果通过检测,会显示通过符号 如果没有通过检测,会显示警告,并输出溢出发生的位置和操作

54 Aug 14, 2022
A tool to determine optimal projects for Gridcoin crunchers. Maximize your magnitude!

FindTheMag FindTheMag helps optimize your BOINC client for Gridcoin mining. You can group BOINC projects into two groups: "preferred" projects and "mi

7 Oct 04, 2022
Скрипт позволяет заводить задачи в Панель мониторинга YouTrack на основе парсинга сайта safe-surf.ru

Скрипт позволяет заводить задачи в Панель мониторинга YouTrack на основе парсинга сайта safe-surf.ru

Bad_karma 3 Feb 12, 2022
Zeus - Advanced Punishments with Embeds.

Zeus Advanced Punishments with Embeds. Make sure to put the Discord Bot Token in the " TOKEN = '' " Language Python Features Ban Kick Mute Unmute Warn

2 Jan 05, 2022
This repo is related to Google Coding Challenge, given to Bright Network Internship Experience 2021.

BrightNetworkUK-GCC-2021 This repo is related to Google Coding Challenge, given to Bright Network Internship Experience 2021. Language used here is py

Dareer Ahmad Mufti 28 May 23, 2022