Awesome Scientific Audio

Curated list of Python software and packages related to scientific research in audio

Last Update: Oct. 27, 2021, 6:10 p.m.

Thank you faroit & contributors
View Topic on GitHub:
faroit/awesome-python-scientific-audio

Read-Write

Expressive Digital Signal Processing (DSP) package for Python

573
68
5y 29d
GPL-3.0

cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python

359
82
1y 7d
MIT

Python wrapper around sox.

343
61
8m
BSD-3-Clause

Python I/O for STEM audio files

48
10
8m
MIT

Read music meta data and length of MP3, OGG, OPUS, MP4, M4A, FLAC, WMA and Wave files with python 2 or 3

435
80
11m
MIT

mutagen (https://github.com/quodlibet/mutagen) - Reads and writes all kinds of audio metadata for various formats.

PyAV (https://github.com/mikeboers/PyAV) - PyAV is a Pythonic binding for FFmpeg or Libav.

PySoundFile (https://github.com/bastibe/PySoundFile) - Library based on libsndfile, CFFI, and NumPy.

Transformations - General DSP

An audio digital processing toolbox based on a workflow/pipeline principle

224
34
1y 100d
BSD-3-Clause

Gammatone-based spectrograms, using gammatone filterbanks or Fourier transform weightings.

144
56
3y 11d
BSD-3-Clause

๐ŸŽš๏ธ Open Source Audio Matching and Mastering

386
50
8m
GPL-3.0

A fast MDCT implementation using SciPy and FFTs

31
7
1y 5m
MIT

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

650
177
8m
MIT

python wrapper for rubberband

61
17
1y 9m
ISC

python-acoustics (https://github.com/python-acoustics/python-acoustics/) - Useful tools for acousticians.

audiotsm (https://github.com/Muges/audiotsm) - Real-time audio time-scale modification procedures.

pyFFTW (https://github.com/pyFFTW/pyFFTW) - Wrapper for FFTW(3).

nsgt (https://github.com/grrrr/nsgt) - Non-stationary Gabor transform, constant-Q.

pydub (https://github.com/jiaaro/pydub) - Manipulate audio with a simple and easy high-level interface.

pytftb (https://github.com/scikit-signal/pytftb) - Implementation of the MATLAB Time-Frequency Toolbox.

pywt (https://github.com/PyWavelets/pywt) - Discrete Wavelet Transform in Python.

resampy (https://github.com/bmcfee/resampy) - Sample rate conversion.

sfs-python (https://github.com/sfstoolbox/sfs-python) - Sound Field Synthesis Toolbox.

Analyze, visualize and process sound field data recorded by spherical microphone arrays.

stft (https://github.com/nils-werner/stft) - Standalone package for Short-Time Fourier Transform.

Feature extraction

Expressive Digital Signal Processing (DSP) package for Python

573
68
5y 29d
GPL-3.0

This library provides common speech features for ASR including MFCCs and filterbank energies.

1.84K
551
10m
MIT

Audio features extraction

186
42
2y 80d
LGPL-3.0

SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/

831
112
11m
Apache-2.0

spafe: Simplified Python Audio-Features Extraction

127
26
4m
BSD-3-Clause

aubio (https://github.com/aubio/aubio) - Feature extractor, written in C, with a Python interface.

essentia (https://github.com/MTG/essentia) - Music-related low-level and high-level feature extractor, C++ based, includes Python bindings.

Data augmentation

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

206
27
1y 42d
MIT

Python library for audio augmentation

49
10
6m
BSD-3-Clause

muda (https://github.com/bmcfee/muda) - Musical Data Augmentation.

Speech Processing

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

16.62K
3.08K
8m
MPL-2.0

gentle forced aligner

904
213
1y 5m
MIT

Praat in Python, the Pythonic way

467
60
8m
GPL-3.0

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

951
219
9m
MIT

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications

3.77K
969
8m
Apache-2.0

Python interface to the WebRTC Voice Activity Detector

1.02K
275
8m
n/a

A python package for calculating the PESQ.

183
47
1y 6m
MIT

Python implementation of the Short Term Objective Intelligibility measure

156
39
1y 7m
MIT

A Python wrapper for the high-quality vocoder "World"

414
88
1y 8d
MIT

Speech recognition module for Python, supporting several engines and APIs, online and offline.

5.43K
1.88K
2y 118d
n/a

aeneas (https://github.com/readbeyond/aeneas/) - Forced aligner, based on MFCC+DTW, 35+ languages.

persephone (https://github.com/persephone-tools/persephone) - Automatic phoneme transcription tool.

Montreal-Forced-Aligner (https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner) - Forced aligner, based on Kaldi (HMM), English (others can be trained).

SIDEKIT (https://pypi.python.org/pypi/SIDEKIT/) - Speaker and language recognition.

Environmental Sounds

sed_eval (https://github.com/TUT-ARG/sed_eval) - Evaluation toolbox for Sound Event Detection.

Perceptual Models - Auditory Models

Inner ear models for Python

82
32
2y 49d
GPL-3.0

Audio library for modelling loudness

29
10
2y 81d
GPL-3.0

brian2 (https://github.com/brian-team/brian2) - Spiking neural networks simulator, includes cochlea model.

pyloudnorm (https://github.com/csteinmetz1/pyloudnorm) - Audio loudness meter and normalization, implements ITU-R BS.1770-4.

sfs-python (https://github.com/sfstoolbox/sfs-python) - Sound Field Synthesis Toolbox.

Source Separation

17
4
1y 8m
BSD-3-Clause

Sparse Beta-Divergence Tensor Factorization Library

44
9
6y 1d
MIT

Holistic source separation framework including DSP methods and deep learning methods.

nimfa (https://github.com/marinkaz/nimfa) - Several flavors of non-negative matrix factorization.

Music Information Retrieval

Python tools for the corpus analysis of popular music.

15
2
4y 10m
MIT

madmom (https://github.com/CPJKU/madmom) - MIR packages with strong focus on beat detection, onset detection and chord recognition.

mir_eval (https://github.com/craffel/mir_eval) - Common scores for various MIR tasks. Also includes bss_eval implementation.

msaf (https://github.com/urinieto/msaf) - Music Structure Analysis Framework.

librosa (https://github.com/librosa/librosa) - General audio and music analysis.

Deep Learning

kapre: Keras Audio Preprocessors

716
129
11m
MIT

Data manipulation and transformation for audio signal processing, powered by PyTorch

1.23K
280
8m
BSD-2-Clause

Audio processing by using pytorch 1D convolution network

464
46
5m
MIT

Symbolic Music - MIDI - Musicology

Mingus is a music package for Python

556
132
10m
GPL-3.0

music21 (https://github.com/cuthbertLab/music21) - Toolkit for Computer-Aided Musicology.

mido (https://github.com/olemb/mido) - Realtime MIDI wrapper.

pretty-midi (https://github.com/craffel/pretty-midi) - Utility functions for handling MIDI data in a nice/intuitive way.

Realtime applications

Python game programming in Jupyter notebooks.

42
5
11m
BSD-2-Clause

Play and Record Sound with Python

508
90
10m
MIT

Real-Time Spherical Microphone Renderer for binaural reproduction in Python

33
5
7m
n/a

PYO (https://github.com/belangeo/pyo) - Realtime audio DSP engine.

Web Audio

Scalable audio processing framework written in Python with a RESTful API

298
51
9m
AGPL-3.0

Audio Dataset and Dataloaders

A Python wrapper around the Soundcloud API

49
8
6y 6d
BSD-2-Clause

Python library for handling audio datasets.

99
23
1y 83d
MIT

beets (https://github.com/beetbox/beets) - Music library manager and MusicBrainz tagger.

dsdtools (https://github.com/faroit/dsdtools) - Parse and process the demixing secrets dataset.

medleydb (https://github.com/marl/medleydb) - Parse medleydb audio + annotations.

youtube-dl (https://github.com/rg3/youtube-dl) - Download YouTube videos (and the audio).

Common loaders for Music Information Retrieval (MIR) datasets.

Wrappers for Audio Plugins

vamp (https://pypi.python.org/pypi/vamp) - Interface to compiled Vamp plugins.

Tutorials

WhirlwindTourOfPython (https://github.com/jakevdp/WhirlwindTourOfPython)

scipy-lecture-notes (https://github.com/scipy-lectures/scipy-lecture-notes) - Highly recommended tutorial, covers large parts of the scientific Python ecosystem.

Short overview of equivalent python functions for switchers.

stanford-mir (https://github.com/stevetjoa/stanford-mir) - Collection of instructional IPython notebooks for music information retrieval (MIR).

Books

Python Data Science Handbook: full text in Jupyter Notebooks

28.15K
12.51K
2y 11m
n/a

Scientific Papers

John C. Glover, Victor Lazzarini and Joseph Timoney, Linux Audio Conference 2011.

Video (https://www.youtube.com/watch?v=MhOdbtPhbLU) - Brian McFee, Colin Raffel, Dawen Liang, Daniel P.W. Ellis, Matt McVicar, Eric Battenberg, Oriol Nieto, SciPy 2015.

Video (https://www.youtube.com/watch?v=37R_R82lfwA) - Hervé Bredin, Ruiqing Yin, Juan Manuel Coria, Gregory Gelly, Pavel Korshunov, Marvin Lavechin, Diego Fustes, Hadrien Titeux, Wassim Bouaziz, Marie-Philippe Gill, ICASSP 2020.

Other Resources

Audio Signal Processing, Python based course from UPF of Barcelona and Stanford University.

Masters Course Material (University of Rostock) with many Python examples.

Music Information Retrieval Community.