Awesome Scientific Audio

Curated list of Python software and packages related to scientific research in audio

Last Update: Jan. 17, 2022, 3:06 p.m.

Thank you faroit & contributors
View Topic on GitHub:
faroit/awesome-python-scientific-audio

Read-Write

Expressive Digital Signal Processing (DSP) package for Python

603
71
5y 111d
GPL-3.0

cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python

387
95
45d
MIT

Python wrapper around sox.

402
74
11m
BSD-3-Clause

Python I/O for STEM audio files

66
10
11m
MIT

Read audio and music metadata and duration of MP3, OGG, OPUS, MP4, M4A, FLAC, WMA, Wave and AIFF files with Python 2 or 3

499
87
31d
MIT

Reads and writes all kinds of audio metadata for various formats.

PyAV is a Pythonic binding for FFmpeg or Libav.

Library based on libsndfile, CFFI, and NumPy.

Transformations - General DSP

An audio digital processing toolbox based on a workflow/pipeline principle

233
35
1y 6m
BSD-3-Clause

Gammatone-based spectrograms, using gammatone filterbanks or Fourier transform weightings.

169
61
3y 93d
BSD-3-Clause

๐ŸŽš๏ธ Open Source Audio Matching and Mastering

537
66
51d
GPL-3.0

A fast MDCT implementation using SciPy and FFTs

42
7
1y 8m
MIT

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

846
348
48d
MIT

Python wrapper for rubberband

86
18
10m
ISC

Real-time audio time-scale modification procedures.

Non-stationary Gabor transform, constant-Q.

Manipulate audio with a simple and easy high-level interface.

Implementation of the MATLAB Time-Frequency Toolbox.

Discrete Wavelet Transform in Python.

Sound Field Synthesis Toolbox.

Analyze, visualize and process sound field data recorded by spherical microphone arrays.

Standalone package for Short-Time Fourier Transform.

Feature extraction

Expressive Digital Signal Processing (DSP) package for Python

603
71
5y 111d
GPL-3.0

This library provides common speech features for ASR including MFCCs and filterbank energies.

2.02K
589
1y 17d
MIT

Audio features extraction

222
45
7m
LGPL-3.0

SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/

849
116
1y 57d
Apache-2.0

spafe: Simplified Python Audio-Features Extraction

176
37
7m
BSD-3-Clause

Feature extractor, written in C, Python interface.

Music-related low-level and high-level feature extractor, C++ based, includes Python bindings.

Data augmentation

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

828
106
4d
MIT

Python library for audio augmentation

63
12
1y 6m
BSD-3-Clause

Speech Processing

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

18.82K
3.39K
61d
MPL-2.0

Gentle forced aligner

1.06K
243
1y 8m
MIT

Praat in Python, the Pythonic way

605
73
9d
GPL-3.0

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

1.35K
287
4d
MIT

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications

4.54K
1.07K
8d
Apache-2.0

Python interface to the WebRTC Voice Activity Detector

1.3K
322
11m
n/a

A Python package for calculating the PESQ.

231
54
1y 8m
MIT

Python implementation of the Short Term Objective Intelligibility measure

206
46
3d
MIT

A Python wrapper for the high-quality vocoder "World"

492
101
7m
MIT

Speech recognition module for Python, supporting several engines and APIs, online and offline.

6.02K
2.05K
34d
n/a

Forced aligner, based on MFCC+DTW, 35+ languages.

Forced aligner, based on Kaldi (HMM), English (others can be trained).

Speaker and Language recognition.

Environmental Sounds

Evaluation toolbox for Sound Event Detection

Perceptual Models - Auditory Models

Inner ear models for Python

89
36
2y 4m
GPL-3.0

Audio library for modelling loudness

31
11
2y 5m
GPL-3.0

Spiking neural networks simulator, includes cochlea model.

Audio loudness meter and normalization, implements ITU-R BS.1770-4.

Sound Field Synthesis Toolbox.

Source Separation

18
4
1y 10m
BSD-3-Clause

Sparse Beta-Divergence Tensor Factorization Library

46
9
6y 83d
MIT

Holistic source separation framework including DSP methods and deep learning methods.

Several flavors of non-negative matrix factorization.

Music Information Retrieval

Python tools for the corpus analysis of popular music.

18
1
5y 27d
MIT

MIR packages with strong focus on beat detection, onset detection and chord recognition.

Common scores for various MIR tasks. Also includes bss_eval implementation.

Music Structure Analysis Framework.

General audio and music analysis.

Deep Learning

kapre: Keras Audio Preprocessors

799
140
64d
MIT

Data manipulation and transformation for audio signal processing, powered by PyTorch

1.54K
376
9d
BSD-2-Clause

Audio processing using PyTorch 1D convolution networks

637
61
24d
MIT

Symbolic Music - MIDI - Musicology

Mingus is a music package for Python

651
147
1y 8m
GPL-3.0

Toolkit for Computer-Aided Musicology.

Utility functions for handling MIDI data in a nice/intuitive way.

Realtime applications

Python game programming in Jupyter notebooks.

150
14
8m
BSD-2-Clause

Play and Record Sound with Python

606
104
17d
MIT

Real-Time Spherical Microphone Renderer for binaural reproduction in Python

42
7
7d
n/a

Web Audio

Scalable audio processing framework written in Python with a RESTful API

313
56
9m
AGPL-3.0

Audio Dataset and Dataloaders

A Python wrapper around the Soundcloud API

76
25
5m
BSD-2-Clause

Python library for handling audio datasets.

121
23
1y 5m
MIT

Music library manager and MusicBrainz tagger.

Parse and process the MUSDB18 dataset.

Download YouTube videos (and the audio).

Common loaders for Music Information Retrieval (MIR) datasets.

Wrappers for Audio Plugins

Tutorials

https://github.com/jakevdp/WhirlwindTourOfPython

Highly recommended tutorial, covers large parts of the scientific Python ecosystem.

Collection of instructional IPython notebooks for music information retrieval (MIR).

Books

Python Data Science Handbook: full text in Jupyter Notebooks

32.22K
14.46K
3y 49d
n/a

Scientific Papers

Video - Brian McFee, Colin Raffel, Dawen Liang, Daniel P.W. Ellis, Matt McVicar, Eric Battenberg, Oriol Nieto, SciPy 2015.

Video - Hervé Bredin, Ruiqing Yin, Juan Manuel Coria, Gregory Gelly, Pavel Korshunov, Marvin Lavechin, Diego Fustes, Hadrien Titeux, Wassim Bouaziz, Marie-Philippe Gill, ICASSP 2020.

Other Resources