Awesome Scientific Audio

Curated list of Python software and packages related to scientific research in audio.

Last Update: Dec. 3, 2020, 12:02 a.m.

Thank you faroit & contributors
View Topic on GitHub:
faroit/awesome-python-scientific-audio

Read-Write

Expressive Digital Signal Processing (DSP) package for Python

Stars: 565 | Forks: 67 | Last commit: 4y 66d ago | License: GPL-3.0

cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python

Stars: 352 | Forks: 83 | Last commit: 44d ago | License: MIT

Python wrapper around sox.

Stars: 312 | Forks: 59 | Last commit: 64d ago | License: BSD-3-Clause

Python I/O for STEM audio files

Stars: 38 | Forks: 8 | Last commit: 9d ago | License: MIT

Read music metadata and length of MP3, OGG, OPUS, MP4, M4A, FLAC, WMA and Wave files with Python 2 or 3

Stars: 426 | Forks: 79 | Last commit: 22d ago | License: MIT

Python library for handling audio datasets.

Stars: 89 | Forks: 21 | Last commit: 4m ago | License: MIT

mutagen (https://github.com/quodlibet/mutagen) - Reads and writes all kinds of audio metadata for various formats.

PyAV (https://github.com/mikeboers/PyAV) - PyAV is a Pythonic binding for FFmpeg or Libav.

PySoundFile (https://github.com/bastibe/PySoundFile) - Library based on libsndfile, CFFI, and NumPy.

Transformations - General DSP

An audio digital processing toolbox based on a workflow/pipeline principle

Stars: 219 | Forks: 35 | Last commit: 4m ago | License: BSD-3-Clause

Gammatone-based spectrograms, using gammatone filterbanks or Fourier transform weightings.

Stars: 133 | Forks: 52 | Last commit: 2y 48d ago | License: BSD-3-Clause

A fast MDCT implementation using SciPy and FFTs

Stars: 31 | Forks: 7 | Last commit: 6m ago | License: MIT

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Stars: 610 | Forks: 171 | Last commit: 6d ago | License: MIT

Python wrapper for rubberband

Stars: 55 | Forks: 14 | Last commit: 10m ago | License: ISC

python-acoustics (https://github.com/python-acoustics/python-acoustics/) - Useful tools for acousticians.

audiotsm (https://github.com/Muges/audiotsm) - Real-time audio time-scale modification procedures.

pyFFTW (https://github.com/pyFFTW/pyFFTW) - Wrapper for FFTW(3).

nsgt (https://github.com/grrrr/nsgt) - Non-stationary Gabor transform, constant-Q.

pydub (https://github.com/jiaaro/pydub) - Manipulate audio with a simple and easy high-level interface.

pytftb (https://github.com/scikit-signal/pytftb) - Implementation of the MATLAB Time-Frequency Toolbox.

pywt (https://github.com/PyWavelets/pywt) - Discrete Wavelet Transform in Python.

resampy (https://github.com/bmcfee/resampy) - Sample rate conversion.

sfs-python (https://github.com/sfstoolbox/sfs-python) - Sound Field Synthesis Toolbox.

stft (https://github.com/nils-werner/stft) - Standalone package for Short-Time Fourier Transform.

Feature extraction

Expressive Digital Signal Processing (DSP) package for Python

Stars: 565 | Forks: 67 | Last commit: 4y 66d ago | License: GPL-3.0

This library provides common speech features for ASR including MFCCs and filterbank energies.

Stars: 1.8K | Forks: 539 | Last commit: 10m ago | License: MIT

Audio features extraction

Stars: 184 | Forks: 42 | Last commit: 1y 117d ago | License: LGPL-3.0

SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/

Stars: 828 | Forks: 110 | Last commit: 12d ago | License: Apache-2.0

aubio (https://github.com/aubio/aubio) - Feature extractor, written in C, Python interface.

essentia (https://github.com/MTG/essentia) - Music-related low-level and high-level feature extractor, C++ based, includes Python bindings.

Data augmentation

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

Stars: 206 | Forks: 27 | Last commit: 79d ago | License: MIT

muda (https://github.com/bmcfee/muda) - Musical Data Augmentation.

pydiogment (https://github.com/SuperKogito/pydiogment) - Audio Data Augmentation.

Speech Processing

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

Stars: 15.9K | Forks: 2.94K | Last commit: 6d ago | License: MPL-2.0

gentle forced aligner

Stars: 858 | Forks: 207 | Last commit: 6m ago | License: MIT

Praat in Python, the Pythonic way

Stars: 421 | Forks: 55 | Last commit: 7d ago | License: GPL-3.0

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Stars: 852 | Forks: 205 | Last commit: 8d ago | License: MIT

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications

Stars: 3.56K | Forks: 933 | Last commit: 14d ago | License: Apache-2.0

Python interface to the WebRTC Voice Activity Detector

Stars: 953 | Forks: 259 | Last commit: 52d ago | License: n/a

A Python package for calculating PESQ.

Stars: 160 | Forks: 42 | Last commit: 7m ago | License: MIT

Python implementation of the Short Term Objective Intelligibility measure

Stars: 146 | Forks: 37 | Last commit: 8m ago | License: MIT

A Python wrapper for the high-quality vocoder "World"

Stars: 393 | Forks: 85 | Last commit: 45d ago | License: MIT

Speech recognition module for Python, supporting several engines and APIs, online and offline.

Stars: 5.25K | Forks: 1.82K | Last commit: 1y 5m ago | License: n/a

aeneas (https://github.com/readbeyond/aeneas/) - Forced aligner, based on MFCC+DTW, 35+ languages.

persephone (https://github.com/persephone-tools/persephone) - Automatic phoneme transcription tool.

Montreal-Forced-Aligner (https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner) - Forced aligner, based on Kaldi (HMM), English (others can be trained).

SIDEKIT (https://pypi.python.org/pypi/SIDEKIT/) - Speaker and language recognition.

Environmental Sounds

sed_eval (https://github.com/TUT-ARG/sed_eval) - Evaluation toolbox for Sound Event Detection.

Perceptual Models - Auditory Models

Inner ear models for Python

Stars: 75 | Forks: 30 | Last commit: 1y 86d ago | License: GPL-3.0

Audio library for modelling loudness

Stars: 28 | Forks: 10 | Last commit: 1y 118d ago | License: GPL-3.0

brian2 (https://github.com/brian-team/brian2) - Spiking neural networks simulator, includes cochlea model.

pyloudnorm (https://github.com/csteinmetz1/pyloudnorm) - Audio loudness meter and normalization, implements ITU-R BS.1770-4.

sfs-python (https://github.com/sfstoolbox/sfs-python) - Sound Field Synthesis Toolbox.

Source Separation

Stars: 17 | Forks: 4 | Last commit: 9m ago | License: BSD-3-Clause

Sparse Beta-Divergence Tensor Factorization Library

Stars: 43 | Forks: 9 | Last commit: 5y 38d ago | License: MIT

nussl (https://github.com/interactiveaudiolab/nussl) - Holistic source separation framework including DSP methods and deep learning methods.

nimfa (https://github.com/marinkaz/nimfa) - Several flavors of non-negative matrix factorization.

Music Information Retrieval

Python tools for the corpus analysis of popular music.

Stars: 14 | Forks: 2 | Last commit: 3y 11m ago | License: MIT

madmom (https://github.com/CPJKU/madmom) - MIR packages with strong focus on beat detection, onset detection and chord recognition.

mir_eval (https://github.com/craffel/mir_eval) - Common scores for various MIR tasks. Also includes bss_eval implementation.

msaf (https://github.com/urinieto/msaf) - Music Structure Analysis Framework.

librosa (https://github.com/librosa/librosa) - General audio and music analysis.

Deep Learning

kapre: Keras Audio Preprocessors

Stars: 677 | Forks: 125 | Last commit: 17d ago | License: MIT

Data manipulation and transformation for audio signal processing, powered by PyTorch

Stars: 1.14K | Forks: 259 | Last commit: 13d ago | License: BSD-2-Clause

Symbolic Music - MIDI - Musicology

Mingus is a music package for Python

Stars: 528 | Forks: 130 | Last commit: 6m ago | License: GPL-3.0

music21 (https://github.com/cuthbertLab/music21) - Toolkit for Computer-Aided Musicology.

mido (https://github.com/olemb/mido) - Realtime MIDI wrapper.

pretty-midi (https://github.com/craffel/pretty-midi) - Utility functions for handling MIDI data in a nice/intuitive way.

Realtime applications

Python game programming in Jupyter notebooks.

Stars: 42 | Forks: 5 | Last commit: 14d ago | License: BSD-2-Clause

Play and Record Sound with Python

Stars: 478 | Forks: 84 | Last commit: 15d ago | License: MIT

PYO (https://github.com/belangeo/pyo) - Realtime audio DSP engine.

Web Audio

Scalable audio processing framework written in Python with a RESTful API

Stars: 295 | Forks: 49 | Last commit: 51d ago | License: AGPL-3.0

Audio related APIs and Datasets

A Python wrapper around the Soundcloud API

Stars: 30 | Forks: 4 | Last commit: 5y 43d ago | License: BSD-2-Clause

beets (https://github.com/beetbox/beets) - Music library manager and MusicBrainz tagger.

dsdtools (https://github.com/faroit/dsdtools) - Parse and process the demixing secrets dataset.

medleydb (https://github.com/marl/medleydb) - Parse MedleyDB audio + annotations.

youtube-dl (https://github.com/rg3/youtube-dl) - Download YouTube videos (and the audio).

Wrappers for Audio Plugins

vamp (https://pypi.python.org/pypi/vamp) - Interface to compiled Vamp plugins.

Tutorials

WhirlwindTourOfPython (https://github.com/jakevdp/WhirlwindTourOfPython)

scipy-lecture-notes (https://github.com/scipy-lectures/scipy-lecture-notes) - Highly recommended tutorial, covers large parts of the scientific Python ecosystem.

Short overview of equivalent Python functions for switchers.

stanford-mir (https://github.com/stevetjoa/stanford-mir) - Collection of instructional IPython notebooks for music information retrieval (MIR).

Books

Python Data Science Handbook: full text in Jupyter Notebooks

Stars: 27.01K | Forks: 11.88K | Last commit: 2y 4d ago | License: n/a

Scientific Papers

John C. Glover, Victor Lazzarini and Joseph Timoney, Linux Audio Conference 2011.

Video (https://www.youtube.com/watch?v=MhOdbtPhbLU) - Brian McFee, Colin Raffel, Dawen Liang, Daniel P.W. Ellis, Matt McVicar, Eric Battenberg, Oriol Nieto, SciPy 2015.

Video (https://www.youtube.com/watch?v=37R_R82lfwA) - Hervé Bredin, Ruiqing Yin, Juan Manuel Coria, Gregory Gelly, Pavel Korshunov, Marvin Lavechin, Diego Fustes, Hadrien Titeux, Wassim Bouaziz, Marie-Philippe Gill, ICASSP 2020.

Other Resources

Audio Signal Processing, a Python-based course from UPF Barcelona and Stanford University.

Master's course material (University of Rostock) with many Python examples.

Music Information Retrieval Community.