Awesome Data Science

Probably the best curated list of data science software in Python.

Last Update: Aug. 17, 2022, 6:04 p.m.

Thank you krzjoa & contributors
View Topic on GitHub:
krzjoa/awesome-python-data-science

Each resource below is listed with its GitHub statistics, including the number of stars and the time since the last commit, so the list can be filtered by type (repository/other resource), stars, and commit age in months.
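The star counts and commit ages on this page use abbreviated notations such as 2.98K and 2y 5m. The following is a minimal sketch of how such values could be parsed and used for filtering. It assumes that "K" means thousands, that "y", "m" and "d" mean years, months and days, and that a month has 30 days; the function names and the record layout are hypothetical and are not part of this page.

```python
import re


def parse_count(text: str) -> int:
    """Convert an abbreviated count such as '2.98K' or '537' to an integer."""
    text = text.strip()
    if text.upper().endswith("K"):
        return round(float(text[:-1]) * 1000)
    return int(text)


def parse_age_months(text: str) -> float:
    """Convert an age such as '2y 5m', '6m' or '3y 90d' to approximate months."""
    units = {"y": 12.0, "m": 1.0, "d": 1.0 / 30.0}  # assumption: 30-day months
    total = 0.0
    for value, unit in re.findall(r"(\d+)\s*([ymd])", text):
        total += int(value) * units[unit]
    return total


def filter_entries(entries, min_stars=0, max_age_months=float("inf")):
    """Keep entries with enough stars and a recent enough last commit."""
    return [
        entry
        for entry in entries
        if parse_count(entry[1]) >= min_stars
        and parse_age_months(entry[2]) <= max_age_months
    ]


# Records copied from entries on this page: (description, stars, last commit)
entries = [
    ("cuML - RAPIDS Machine Learning Library", "2.57K", "6m"),
    ("PySpark + Scikit-learn = Sparkit-learn", "1.12K", "4y 9m"),
    ("Machine Learning toolbox for Humans", "646", "5y 9m"),
]

# Entries with at least 1,000 stars and a commit within the last 12 months
recent_popular = filter_entries(entries, min_stars=1000, max_age_months=12)
```

Converting both fields to plain numbers first keeps the filter itself a simple comparison.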

General Purpose Machine Learning

High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM), with Python and CLI interfaces.

2.98K
537
2y 5m
Apache-2.0

cuML - RAPIDS Machine Learning Library

2.57K
379
6m
Apache-2.0

A modular active learning framework for Python

1.61K
237
1y 7m
MIT

PySpark + Scikit-learn = Sparkit-learn

1.12K
248
4y 9m
Apache-2.0

mlpack: a scalable C++ machine learning library

3.91K
1.41K
6m
n/a

A toolkit for making real world machine learning and data analysis applications in C++

10.91K
3.03K
6m
BSL-1.0

A library of extension and helper modules for Python's data analysis and machine learning libraries.

3.8K
748
7m
n/a

Waiting hours for a future prediction is unacceptable. Hyperlearn makes AI and ML algorithms 50% faster and use 90% less memory, and doesn't require you to use new hardware! ML algorithms like PCA, Linear Regression, and NMF are all faster!

1.23K
112
8m
BSD-3-Clause

Machine Learning toolbox for Humans

646
134
5y 9m
n/a

A scikit-learn based module for multi-label et al. classification

721
142
3y 90d
BSD-2-Clause

Sequence learning toolkit for Python

628
100
6y 6m
MIT

Simple structured learning framework for python

657
173
10m
BSD-2-Clause

Highly interpretable classifiers for scikit learn, producing easily understood decision rules instead of black box models

480
66
5y 8d
n/a

Python implementation of the rulefit algorithm

279
78
1y 52d
MIT

Metric learning algorithms in Python

1.21K
223
9m
MIT

[HELP REQUESTED] Generalized Additive Models in Python

667
120
2y 34d
Apache-2.0

Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)

1.51K
184
6m
GPL-3.0

Little Ball of Fur - A graph sampling extension library for NetworKit and NetworkX (CIKM 2020)

581
40
6m
GPL-3.0

Uplift modeling and causal inference with machine learning algorithms

2.78K
428
6m
n/a

Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort. See our docs: https://docs.deepchecks.com

621
39
7m
n/a

Automated Machine Learning

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

8.44K
1.46K
1y 7m
LGPL-3.0

Automated Machine Learning with scikit-learn

6.02K
1.12K
6m
BSD-3-Clause

MLBox is a powerful Automated Machine Learning python library.

1.28K
260
1y 11m
n/a

Ensemble Methods

Stacked Generalization (Ensemble Learning)

179
74
4y 8m
MIT

Library for machine learning stacking generalization.

113
23
3y 109d
Apache-2.0

Python package for stacking (machine learning technique)

648
77
2y 9m
n/a

Imbalanced Datasets

A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning

5.7K
1.17K
6m
MIT

Python-based implementations of algorithms for learning on imbalanced data.

200
64
6m
n/a

Random Forests

It is a forest of random projection trees

210
42
2y 6m
Apache-2.0

Scikit-learn compatible wrapper of the Random Bits Forest program written by Wang et al. (2016)

8
2
6y 19d
n/a

Home repository for the Regularized Greedy Forest (RGF) library. It includes original implementation from the paper and multithreaded one written in C++, along with various language-specific wrappers.

354
51
7m
n/a

Extreme Learning Machine

Extreme Learning Machine implementation in Python

484
245
1y 5m
n/a

Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

67
53
3y 9m
n/a

High performance implementation of Extreme Learning Machines (fast randomized neural networks).

164
58
11m
n/a

Kernel Methods

Factorization machines in python

876
312
4y 4m
n/a

fastFM: A Library for Factorization Machines

964
199
1y 4m
n/a

TensorFlow implementation of an arbitrary order Factorization Machine

776
187
7m
MIT

Support vector machines (SVMs) and related kernel-based learning algorithms are a well-known class of machine learning algorithms, for non-parametric classification and regression. liquidSVM is an implementation of SVMs whose key features are: fully integrated hyper-parameter selection, extreme speed on both small and large data sets, full flexibility for experts, and inclusion of a variety of different learning scenarios: multi-class classification, ROC, and Neyman-Pearson learning, and least-squares, quantile, and expectile regression.

55
8
2y 11m
AGPL-3.0

Relevance Vector Machine implementation using the scikit-learn API.

193
67
5y 97d
n/a

ThunderSVM: A Fast SVM Library on GPUs and CPUs

1.37K
189
1y 6m
Apache-2.0

Gradient Boosting

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

22.19K
8.31K
6m
Apache-2.0

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

13.46K
3.48K
6m
MIT

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

6.35K
957
6m
Apache-2.0

ThunderGBM: Fast GBDTs and Random Forests on GPUs

619
82
1y 7m
Apache-2.0

PyTorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

53.96K
14.92K
6m
n/a

Datasets, Transforms and Models specific to Computer Vision

10.85K
5.61K
6m
BSD-3-Clause

Data loaders and abstractions for text and NLP

2.95K
691
6m
BSD-3-Clause

Data manipulation and transformation for audio signal processing, powered by PyTorch

1.59K
384
6m
BSD-2-Clause

High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.

3.86K
523
6m
BSD-3-Clause

A simplified framework and utilities for PyTorch

509
61
6m
LGPL-3.0

A scikit-learn compatible neural network library that wraps PyTorch

4.36K
305
7m
BSD-3-Clause

Simple tools for logging and visualizing, loading and training

1.36K
195
1y 7m
BSD-3-Clause

Graph Neural Network Library for PyTorch

13.78K
2.42K
6m
MIT

Accelerated deep learning R&D

2.84K
353
6m
Apache-2.0

PyTorch Geometric Temporal: Spatiotemporal Signal Processing with Neural Machine Learning Models (CIKM 2021)

1.33K
194
6m
MIT

A PyTorch based deep learning library for drug pair scoring.

33
2
7m
Apache-2.0

TensorFlow

An Open Source Machine Learning Framework for Everyone

167.1K
87.06K
4d
Apache-2.0

Deep Learning and Reinforcement Learning Library for Scientists and Engineers

6.83K
1.52K
9m
n/a

Deep learning library featuring a higher-level API for TensorFlow.

9.58K
2.43K
1y 8m
n/a

TensorFlow-based neural network library

9.19K
1.31K
6m
Apache-2.0

A Neural Net Training Interface on TensorFlow, with focus on speed + flexibility

6.16K
1.83K
8m
Apache-2.0

Machine Learning Management & Orchestration Platform (Monorepo for Polyaxon's MLOps Tools)

3.01K
301
6m
Apache-2.0

NeuPy is a Tensorflow based python library for prototyping and building neural networks

702
153
2y 11m
MIT

Deploy tensorflow graphs for fast evaluation and export to tensorflow-less environments running numpy.

346
39
1y 7m
BSD-3-Clause

TensorFlow ROCm port

581
67
6m
Apache-2.0

Deep learning with dynamic computation graphs in TensorFlow

1.82K
279
4y 9m
Apache-2.0

๐Ÿ“ Wrapper library for text generation / language models at char and word level with RNN in TensorFlow

63
30
4y 118d
MIT

TensorLight - A high-level framework for TensorFlow

10
4
5y 106d
MIT

Mesh TensorFlow: Model Parallelism Made Easier

1.2K
209
6m
Apache-2.0

Data-centric declarative deep learning framework

8.11K
965
6m
Apache-2.0

Keras community contributions

1.54K
655
2y 8m
MIT

Keras + Hyperopt: A very simple wrapper for convenient hyperparameter optimization

2.11K
307
9m
MIT

Distributed Deep learning with Keras & Spark

1.53K
296
1y 1d
MIT

Train/evaluate a Keras model, get metrics streamed to a dashboard in your browser.

497
49
5y 87d
MIT

Graph Neural Networks with Keras and Tensorflow 2.

1.99K
270
6m
MIT

QKeras: a quantization deep learning library for Tensorflow Keras

361
72
6m
Apache-2.0

MXNet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

19.85K
6.89K
6m
Apache-2.0

A clear, concise, simple yet powerful and efficient API for deep learning.

2.32K
228
4y 5m
Apache-2.0

Simple, efficient and flexible vision toolbox for mxnet framework.

31
9
4y 8m
BSD-3-Clause

Gluon CV Toolkit

5.06K
1.14K
6m
Apache-2.0

NLP made easy

2.37K
524
11m
Apache-2.0

Transfer Learning library for Deep Neural Networks.

240
50
1y 4m
Apache-2.0

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

30
9
2y 7m
Apache-2.0

Others

Source-to-Source Debuggable Derivatives in Pure Python

2.22K
388
4y 11d
Apache-2.0

Efficiently computes derivatives of numpy code.

5.65K
807
1y 5m
MIT

Myia prototyping

420
40
1y 5m
MIT

Neural Network Libraries

2.52K
314
6m
Apache-2.0

Caffe: a fast open framework for deep learning.

32.25K
18.94K
2y 6m
n/a

hipCaffe: the HIP port of Caffe

124
27
3y 10m
n/a

Probably the best curated list of data science software in Python.

1.22K
219
7m
CC-BY-4.0

Web Scraping

Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

8.15K
1.59K
2y 115d
BSD-3-Clause

Scrape Twitter for Tweets

1.98K
538
2y 21d
MIT

Data Containers

Create HTML profiling reports from pandas DataFrame objects

8.5K
1.23K
7m
MIT

cuDF - GPU DataFrame Library

4.5K
588
6m
Apache-2.0

NumPy and Pandas interface to Big Data

3.02K
380
3y 4d
n/a

sqldf for pandas

1.08K
157
5y 6m
MIT

Google BigQuery connector for pandas

288
99
6m
BSD-3-Clause

Universal 1d/2d data containers with Transformers functionality for data analysis.

26
5
3y 10m
n/a

A pure Python implementation of Apache Spark's RDD and DStream interfaces.

244
48
1y 5m
n/a

High performance datastore for time series and tick data

2.6K
529
6m
LGPL-2.1

A Python package for manipulating 2-dimensional tabular data structures

1.45K
130
6m
MPL-2.0

Koalas: pandas API on Apache Spark

3.08K
330
10m
Apache-2.0

Modin: Speed up your Pandas workflows by changing a single line of code

6.78K
481
6m
n/a

A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner

1.9K
88
6m
MIT

The easy way to write your own flavor of Pandas

224
15
2y 8m
MIT

The goal of pandas-log is to provide feedback about basic pandas operations. It provides simple wrapper functions for the most common functions that add additional logs

183
12
1y 8m
MIT

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀

6.91K
535
6m
MIT

Pipelines

Easy pipelines for pandas DataFrames.

651
33
6m
n/a

functional data manipulation for pandas

180
25
6y 11m
n/a

dplyr for python

745
58
5y 7m
MIT

Pandas integration with sklearn

2.55K
407
1y 102d
n/a

BatchFlow helps you conveniently work with random or sequential batches of your data and define data processing and machine learning workflows even for datasets that do not fit into memory.

170
39
6m
Apache-2.0

Clean APIs for data cleaning. Python implementation of R package Janitor

823
138
6m
MIT

A Python toolkit for processing tabular data

386
27
7m
MIT

Build, test, deploy, iterate - Dev and prod tool for data science pipelines

49
3
3y 35d
n/a

Directions overlay for working with pandas in an analysis environment

429
22
1y 11m
BSD-3-Clause

General

An open source python library for automated feature engineering

5.98K
786
6m
BSD-3-Clause

scikit-learn addon to operate on set/"group"-based features

40
8
6y 11d
BSD-3-Clause

A set of tools for creating and testing machine learning features, with a scikit-learn compatible API

380
82
4y 7m
n/a

a feature engineering wrapper for sklearn

45
20
3y 9m
GPL-3.0

A sklearn-compatible Python implementation of Multifactor Dimensionality Reduction (MDR) for feature construction.

118
22
4y 8m
MIT

Automatic extraction of relevant features from time series

6.17K
961
8m
MIT

Feature Selection

open-source feature selection repository in python

1.17K
404
2y 9m
GPL-2.0

Python implementations of the Boruta all-relevant feature selection method.

1.09K
211
10m
BSD-3-Clause

A fast xgboost feature selection algorithm

175
31
4y 6m
MIT

A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.

343
65
1y 6m
MIT

General Purposes

matplotlib: plotting with Python

15.02K
6.25K
6m
n/a

Statistical data visualization in Python

9.15K
1.56K
7m
BSD-3-Clause

Painlessly create beautiful matplotlib plots.

1.6K
140
7y 10m
MIT

Ternary plotting library for python with matplotlib

514
131
10m
MIT

Missing data visualization module for Python.

3.07K
393
1y 45d
MIT

Python library that makes it easy for data scientists to create charts.

3.09K
282
1y 6m
Apache-2.0

Python histogram library - histograms as updateable, fully semantic objects with visualization tools. [P]ython [HYST]ograms.

115
18
6m
MIT

Interactive plots

A python package for animating plots built on matplotlib.

386
35
1y 10m
MIT

Interactive Data Visualization in the browser, from Python

15.96K
3.88K
6m
BSD-3-Clause

Plotting library for IPython/Jupyter notebooks

3.23K
459
6m
Apache-2.0

🎨 Python Echarts Plotting Library

11.99K
2.62K
9m
MIT

Map

A Python package for interactive mapping with Google Earth Engine, ipyleaflet, and ipywidgets.

1.67K
680
6m
MIT

Automatic Plotting

With Holoviews, your data visualizes itself.

2.11K
349
6m
BSD-3-Clause

Automatically Visualize any dataset, any size with a single line of code. Created by Ram Seshadri. Collaborators Welcome. Permission Granted upon Request.

636
98
6m
Apache-2.0

Visualize and compare datasets, target values and associations, with one line of code.

1.91K
195
1y 41d
MIT

NLP

Python library for interactive topic model visualization. Port of the R LDAvis package.

1.57K
328
1y 4m
BSD-3-Clause

Deployment

Model Explanation

The official implementation of "The Shapley Value of Classifiers in Ensemble Games" (CIKM 2021).

136
17
6m
MIT

Algorithms for explaining machine learning models

1.52K
180
6m
Apache-2.0

Code for "High-Precision Model-Agnostic Explanations" paper

682
99
9m
BSD-2-Clause

Bias and Fairness Audit Toolkit

458
87
1y 83d
MIT

Contrastive Explanation (Foil Trees), developed at TNO/Utrecht University

38
5
1y 98d
BSD-3-Clause

Visual analysis and diagnostic tools to facilitate machine learning model selection.

3.5K
514
7m
Apache-2.0

An intuitive library to add plotting functionality to scikit-learn objects.

2.17K
263
4y 0d
MIT

A game theoretic approach to explain the output of any machine learning model.

15.46K
2.33K
8m
MIT

A library for debugging/inspecting machine learning classifiers and explaining their predictions

2.5K
319
2y 6m
MIT

Lime: Explaining the predictions of any machine learning classifier

9.57K
1.57K
1y 20d
BSD-2-Clause

321
67
5y 4m
n/a

98
30
4y 28d
n/a

python partial dependence plot toolbox

652
108
1y 5m
MIT

Python implementation of R package breakDown

40
4
1y 117d
n/a

⬛ Python Individual Conditional Expectation Plot Toolbox

120
29
4y 6m
MIT

Python Library for Model Interpretation/Explanations

1.01K
169
6m
UPL-1.0

Model analysis tools for TensorFlow

1.14K
243
6m
Apache-2.0

A library that implements fairness-aware machine learning algorithms

88
19
4y 19d
MIT

696
138
1y 5m
BSD-3-Clause

Interpretability and explainability of data and machine learning models

1.05K
224
6m
Apache-2.0

Auralisation of learned features in CNN (for audio)

39
10
5y 5m
n/a

🎆 A visualization of the CapsNet layers to better understand how it works

383
92
2y 105d
MIT

A collection of infrastructure and tools for research in neural network interpretability.

4.36K
621
1y 5m
Apache-2.0

Visualizer for neural network, deep learning, and machine learning models

17.68K
2.03K
6m
MIT

tensorboard for pytorch (and chainer, mxnet, numpy, ...)

7.23K
848
6m
MIT

Logging MXNet data for visualization in TensorBoard.

332
51
2y 6m
Apache-2.0

Reinforcement Learning

A toolkit for developing and comparing reinforcement learning algorithms.

26.4K
7.58K
6m
n/a

Reinforcement Learning Coach by Intel AI Lab enables easy experimentation with state of the art Reinforcement Learning algorithms

2.1K
424
1y 51d
Apache-2.0

A toolkit for reproducible reinforcement learning research.

1.4K
247
10m
MIT

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

12.33K
4.24K
2y 6m
MIT

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

3.44K
667
11m
MIT

A platform for Reasoning systems (Reinforcement Learning, Contextual Bandits, etc.)

3.11K
451
6m
BSD-3-Clause

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.

2.18K
598
6m
Apache-2.0

Tensorforce: a TensorFlow library for applied reinforcement learning

3.09K
517
6m
Apache-2.0

TensorFlow Reinforcement Learning

3.11K
379
1y 6d
Apache-2.0

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.

9.72K
1.31K
8m
Apache-2.0

Deep Reinforcement Learning for Keras.

5.2K
1.33K
2y 9m
MIT

ChainerRL is a deep reinforcement learning library built on top of Chainer.

1.03K
220
1y 4m
MIT

Probabilistic Methods

Deep universal probabilistic programming with Python and PyTorch

7.3K
893
6m
Apache-2.0

Fast, flexible and easy to use probabilistic modelling in Python.

2.83K
515
6m
MIT

Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Aesara

6.35K
1.52K
6m
n/a

InferPy: Deep Probabilistic Modeling with Tensorflow Made Easy

135
15
2y 43d
Apache-2.0

PyStan, a Python interface to Stan, a platform for statistical modeling. Documentation: https://pystan.readthedocs.io

156
36
10m
ISC

Python package for Bayesian Machine Learning with scikit-learn API

444
108
11m
MIT

Python Library for learning (Structure and Parameter), inference (Probabilistic and Causal), and simulations in Bayesian Networks.

1.99K
627
6m
MIT

Supervised domain-agnostic prediction framework for probabilistic modelling

110
16
3y 6m
n/a

Probabilistic Programming and Statistical Inference in PyTorch

108
14
5y 46d
MIT

Python package facilitating the use of Bayesian Deep Learning methods with Variational Inference for PyTorch

332
46
3y 8m
MIT

The Python ensemble sampling toolkit for affine-invariant MCMC

1.2K
413
7m
MIT

A library for hidden semi-Markov models with explicit durations

58
16
12m
GPL-3.0

506
164
1y 11m
MIT

A highly efficient and modular implementation of Gaussian Processes in PyTorch

2.67K
409
6m
MIT

Modular Probabilistic Programming on MXNet

99
30
3y 81d
Apache-2.0

scikit-learn inspired API for CRFsuite

390
178
2y 8m
n/a

Genetic Programming

Genetic Programming in Python, with a scikit-learn inspired API

1.08K
192
10m
BSD-3-Clause

Distributed Evolutionary Algorithms in Python

4.58K
970
6m
LGPL-3.0

A Genetic Programming platform for Python with TensorFlow for wicked-fast CPU and GPU support.

140
62
1y 4m
n/a

A strongly-typed genetic programming framework for Python

105
12
4y 67d
n/a

Genetic feature selection module for scikit-learn

212
54
7m
n/a

Optimization

Spearmint Bayesian optimization codebase

1.46K
311
3y 4m
n/a

Bayesian optimization in PyTorch

2.19K
244
6m
MIT

Genetic Algorithm, Particle Swarm Optimization, Simulated Annealing, Ant Colony Optimization Algorithm, Immune Algorithm, Artificial Fish Swarm Algorithm, Differential Evolution and TSP (Traveling salesman)

2.91K
700
7m
MIT

Sequential Model-based Algorithm Configuration

660
168
9m
n/a

optimization routines for hyperparameter tuning

380
73
2y 99d
n/a

Distributed Asynchronous Hyperparameter Optimization in Python

6.08K
943
8m
n/a

Hyper-parameter optimization for sklearn

1.3K
246
1y 66d
n/a

Use evolutionary algorithms instead of gridsearch in scikit-learn

676
114
1y 19d
MIT

SigOpt wrappers for scikit-learn methods

70
14
2y 4m
MIT

A Python implementation of global optimization with gaussian processes.

5.74K
1.26K
1y 11m
MIT

Safe Bayesian Optimization

102
42
2y 4m
MIT

Sequential model-based optimization with a scipy.optimize interface

2.29K
438
10m
BSD-3-Clause

🎯 A comprehensive gradient-free optimization framework written in Python

550
59
4y 11m
MIT

A research toolkit for particle swarm optimization in Python

878
289
1y 56d
MIT

A Free and Open Source Python Library for Multiobjective Optimization

370
129
1y 119d
GPL-3.0

Bayesian Optimization using GPflow

246
58
1y 8m
Apache-2.0

POT : Python Optimal Transport

1.37K
392
6m
MIT

Hyperparameter Optimization for TensorFlow, Keras and PyTorch

1.49K
251
6m
MIT

library for nonlinear optimization, wrapping many algorithms for global and local, constrained or unconstrained, optimization

1.19K
363
7m
n/a

Time Series

A unified framework for machine learning with time series

4.63K
674
9m
BSD-3-Clause

A python library for easy manipulation and forecasting of time series.

4.44K
466
25d
Apache-2.0

Lightning โšก๏ธ fast forecasting with statistical and econometric models.

838
55
22d
MIT

Scalable machine learning based time series forecasting.

64
11
21d
Apache-2.0

Scalable and user friendly neural forecasting algorithms for time series data.

823
78
56d
MIT

A machine learning toolkit dedicated to time-series data

2.01K
262
8m
BSD-2-Clause

Module for statistical learning, with a particular emphasis on time-dependent modelling

370
84
2y 64d
BSD-3-Clause

A flexible, intuitive and fast forecasting library

1.55K
76
21d
BSD-2-Clause

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

14.03K
4.04K
6m
MIT

Open source time series library for Python

1.94K
225
3y 8m
BSD-3-Clause

Probabilistic programming framework that facilitates objective model selection for time-varying parameter models.

96
22
8m
MIT

Anomaly Detection and Correlation library

996
197
4y 7m
Apache-2.0

Datetimes for Humans™

3.3K
212
9m
MIT

ML powered analytics engine for outlier detection and root cause analysis.

229
20
5m
MIT

Natural Language Processing

NLTK Source

10.44K
2.55K
6m
Apache-2.0

The Classical Language Toolkit

706
305
6m
MIT

scikit-learn wrappers for Python fastText.

226
24
7m
n/a

Simple text to phones converter for multiple languages

588
107
8m
GPL-3.0

A very simple framework for state-of-the-art Natural Language Processing (NLP)

11.24K
1.81K
6m
n/a

Computer Audition

Python library for audio and music analysis

5K
787
6m
ISC

Audio features extraction

224
45
1y 58d
LGPL-3.0

a library for audio and music analysis

2.65K
338
6m
GPL-3.0

C++ library for audio and music analysis, description and synthesis, including Python bindings

2.04K
443
6m
AGPL-3.0

LibXtract is a simple, portable, lightweight library of audio feature extraction functions.

207
48
3y 34d
MIT

Marsyas - Music Analysis, Retrieval and Synthesis for Audio Signals

349
95
1y 5m
GPL-2.0

A library for augmenting annotated audio data

198
32
1y 107d
ISC

Python audio and music signal processing library

869
151
7m
n/a

Computer Vision

Open Source Computer Vision Library

59.69K
50.18K
6m
n/a

Image processing in Python

4.77K
1.96K
6m
n/a

Image augmentation for machine learning experiments.

12.26K
2.23K
2y 78d
MIT

Image augmentation library in Python for machine learning.

4.63K
841
10m
MIT

Fast image augmentation library and an easy-to-use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about the library: https://www.mdpi.com/2078-2489/11/2/125

9.66K
1.24K
6m
MIT

Statistics

A library for managing, validating, summarizing, and visualizing data.

385
38
6m
Apache-2.0

Create HTML profiling reports from pandas DataFrame objects

8.5K
1.23K
7m
MIT

Statsmodels: statistical modeling and econometrics in Python

7.1K
2.43K
6m
BSD-3-Clause

Supply a wrapper StockDataFrame based on the pandas.DataFrame with inline stock statistics/indicators support.

955
251
7m
BSD-3-Clause

Pandas-based utility to calculate weighted means, medians, distributions, standard deviations, and more.

90
9
2y 8m
MIT

Multiple Pairwise Comparisons (Post Hoc) Tests in Python

232
24
8m
MIT

Performance analysis of predictive (alpha) stock factors

2.19K
845
2y 113d
Apache-2.0

Distributed Computing

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

12.12K
2.01K
6m
n/a

Distributed machine learning platform

897
185
5y 6m
n/a

Framework and Library for Distributed Online Machine Learning

703
149
3y 95d
LGPL-2.1

Microsoft Distributed Machine Learning Toolkit

2.76K
592
5y 26d
MIT

PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of "PaddlePaddle": high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)

17.58K
4.27K
6m
Apache-2.0

Scalable Machine Learning with Dask

787
220
7m
BSD-3-Clause

A distributed task scheduler for Dask

1.3K
591
6m
BSD-3-Clause

Experimentation

๐Ÿ•๏ธ Development environment for machine learning

496
28
57d
Apache-2.0

Sacred is a tool to help you configure, organize, log and reproduce experiments developed at IDSIA.

3.73K
347
6m
MIT

A web-based application for quick, scalable, and automated hyperparameter tuning and stacked ensembling in Python.

1.26K
111
4y 12m
Apache-2.0

A visual dataflow programming language for sklearn

187
35
5y 50d
MIT

Adaptive Experimentation Platform

1.7K
185
6m
MIT

Evaluation

A library of metrics for evaluating recommender systems

340
81
9m
MIT

Machine learning evaluation metrics, implemented in Python, R, Haskell, and MATLAB / Octave

1.49K
438
6y 11m
n/a

Machine learning model evaluation made easy: plots, tables, HTML reports, experiment tracking and Jupyter notebook analysis.

320
29
6m
MIT

A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.

1.64K
527
6m
Apache-2.0

Computations

Parallel computing with task scheduling

9.53K
1.44K
6m
BSD-3-Clause

Fast NumPy array functions written in C

703
83
1y 4m
BSD-2-Clause

NumPy & SciPy for GPU

5.75K
569
6m
MIT

Python library for multilinear algebra and tensor factorizations

389
112
5y 10m
GPL-3.0

Solve automatic numerical differentiation problems in one or more variables.

166
32
11m
BSD-3-Clause

Add built-in support for quaternions to numpy

471
76
6m
MIT

Adaptive: parallel active learning of mathematical functions

687
42
8m
BSD-3-Clause

Spatial Analysis

Python tools for geographic data

3K
681
6m
BSD-3-Clause

PySAL: Python Spatial Analysis Library Meta-Package

971
279
6m
BSD-3-Clause

Quantum Computing

PennyLane is a cross-platform Python library for differentiable programming of quantum computers. Train a quantum computer the same way as a neural network.

1.16K
346
6m
Apache-2.0

QML: Quantum Machine Learning

161
65
3y 11m
MIT

Conversion

Transpile trained scikit-learn estimators to C, Java, JavaScript and others.

1.12K
154
2y 8m
MIT

Open standard for machine learning interoperability

12.04K
2.39K
6m
Apache-2.0

MMdnn is a set of tools to help users inter-operate among different deep learning frameworks, e.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch, ONNX and CoreML.

5.51K
973
2y 4d
MIT