Awesome Data Science

Probably the best curated list of data science software in Python.

Thank you krzjoa & contributors
View Topic on GitHub:
krzjoa/awesome-python-data-science

General Purpose Machine Learning

High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.

2.84K
510
1y 73d
Apache-2.0

cuML - RAPIDS Machine Learning Library

1.95K
313
82d
Apache-2.0

A modular active learning framework for Python

1.1K
183
4m
MIT

PySpark + Scikit-learn = Sparkit-learn

1.06K
239
3y 6m
Apache-2.0

mlpack: a scalable C++ machine learning library

3.56K
1.32K
83d
n/a

A toolkit for making real world machine learning and data analysis applications in C++

9.91K
2.88K
84d
BSL-1.0

A library of extension and helper modules for Python's data analysis and machine learning libraries.

3.35K
695
82d
n/a

50% faster, 50% less RAM Machine Learning. Numba rewritten Sklearn. SVD, NNMF, PCA, LinearReg, RidgeReg, Randomized, Truncated SVD/PCA, CSR Matrices all 50+% faster

1.2K
108
7m
BSD-3-Clause

Machine Learning toolbox for Humans

633
134
4y 6m
n/a

A scikit-learn based module for multi-label et al. classification

632
122
1y 12m
BSD-2-Clause

Sequence learning toolkit for Python

586
98
5y 90d
MIT

Simple structured learning framework for python

630
169
2y 7m
BSD-2-Clause

Highly interpretable classifiers for scikit learn, producing easily understood decision rules instead of black box models

452
67
3y 9m
n/a

Python implementation of the rulefit algorithm

219
67
6m
MIT

Metric learning algorithms in Python

1.11K
214
9m
MIT

[HELP REQUESTED] Generalized Additive Models in Python

561
105
10m
Apache-2.0

Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)

1.16K
140
110d
GPL-3.0

Little Ball of Fur - A graph sampling extension library for NetworKit and NetworkX (CIKM 2020)

495
32
93d
GPL-3.0

Uplift modeling and causal inference with machine learning algorithms

1.68K
263
85d
n/a

Machine learning in Python. sklearn

Machine learning toolbox.

Time Series

A machine learning toolkit dedicated to time-series data

1.48K
223
110d
BSD-2-Clause

Module for statistical learning, with a particular emphasis on time-dependent modelling

323
74
11m
BSD-3-Clause

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

12.28K
3.51K
108d
MIT

Open source time series library for Python

1.85K
214
2y 5m
BSD-3-Clause

Probabilistic programming framework that facilitates objective model selection for time-varying parameter models.

86
19
1y 53d
MIT

Anomaly Detection and Correlation library

880
182
3y 4m
Apache-2.0

Datetimes for Humans™

3.25K
208
12m
MIT

Powerful extensions to the standard datetime module

Automated Machine Learning

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

7.83K
1.38K
4m
LGPL-3.0

Automated Machine Learning with scikit-learn

5.22K
982
87d
n/a

MLBox is a powerful Automated Machine Learning python library.

1.19K
247
8m
n/a

Ensemble Methods

Stacked Generalization (Ensemble Learning)

172
67
3y 4m
MIT

Library for machine learning stacking generalization.

106
22
2y 14d
Apache-2.0

Python package for stacking (machine learning technique)

582
68
1y 6m
n/a

High performance ensemble learning. Scikit-learn compatible

Imbalanced Datasets

A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning

5.05K
1.07K
86d
MIT

Python-based implementations of algorithms for learning on imbalanced data.

178
57
1y 5m
n/a

Random Forests

It is a forest of random projection trees

195
39
1y 98d
Apache-2.0

Scikit-learn compatible wrapper of the Random Bits Forest program written by (Wang et al., 2016)

7
1
4y 9m
n/a

Home repository for the Regularized Greedy Forest (RGF) library. It includes original implementation from the paper and multithreaded one written in C++, along with various language-specific wrappers.

338
49
83d
n/a

Extreme Learning Machine

Extreme Learning Machine implementation in Python

448
233
3y 9m
n/a

Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

61
53
2y 6m
n/a

High performance implementation of Extreme Learning Machines (fast randomized neural networks).

159
55
2y 11m
n/a

Kernel Methods

Factorization machines in python

822
299
3y 29d
n/a

fastFM: A Library for Factorization Machines

900
194
1y 71d
n/a

TensorFlow implementation of an arbitrary order Factorization Machine

757
183
11m
MIT

Support vector machines (SVMs) and related kernel-based learning algorithms are a well-known class of machine learning algorithms, for non-parametric classification and regression. liquidSVM is an implementation of SVMs whose key features are: fully integrated hyper-parameter selection, extreme speed on both small and large data sets, full flexibility for experts, and inclusion of a variety of different learning scenarios: multi-class classification, ROC, and Neyman-Pearson learning, and least-squares, quantile, and expectile regression.

48
5
1y 8m
AGPL-3.0

Relevance Vector Machine implementation using the scikit-learn API.

177
62
4y 2d
n/a

ThunderSVM: A Fast SVM Library on GPUs and CPUs

1.27K
169
94d
Apache-2.0

Gradient Boosting

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

20.61K
7.91K
82d
Apache-2.0

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

12.21K
3.21K
82d
MIT

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

5.71K
867
81d
Apache-2.0

ThunderGBM: Fast GBDTs and Random Forests on GPUs

580
71
4m
Apache-2.0

PyTorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

46.37K
12.33K
81d
n/a

Datasets, Transforms and Models specific to Computer Vision

8.42K
4.35K
81d
BSD-3-Clause

Data loaders and abstractions for text and NLP

2.65K
614
81d
BSD-3-Clause

Data manipulation and transformation for audio signal processing, powered by PyTorch

1.23K
280
82d
BSD-2-Clause

High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.

3.27K
438
81d
BSD-3-Clause

A simplified framework and utilities for PyTorch

452
49
93d
LGPL-3.0

A scikit-learn compatible neural network library that wraps PyTorch

3.79K
281
97d
BSD-3-Clause

Simple tools for logging and visualizing, loading and training

1.28K
190
4m
BSD-3-Clause

Geometric Deep Learning Extension Library for PyTorch

10.27K
1.76K
81d
MIT

Accelerated deep learning R&D

2.44K
280
114d
Apache-2.0

A Temporal Extension Library for PyTorch Geometric

349
40
87d
MIT

TensorFlow

An Open Source Machine Learning Framework for Everyone

153.46K
84.06K
81d
Apache-2.0

Deep Learning and Reinforcement Learning Library for Scientists and Engineers 🔥

6.5K
1.46K
82d
n/a

Deep learning library featuring a higher-level API for TensorFlow.

9.52K
2.43K
5m
n/a

TensorFlow-based neural network library

8.77K
1.26K
94d
Apache-2.0

A Neural Net Training Interface on TensorFlow, with focus on speed + flexibility

5.93K
1.77K
87d
Apache-2.0

Machine Learning Platform for Kubernetes

2.73K
265
82d
Apache-2.0

NeuPy is a Tensorflow based python library for prototyping and building neural networks

665
148
1y 8m
MIT

Deploy tensorflow graphs for fast evaluation and export to tensorflow-less environments running numpy.

346
39
4m
BSD-3-Clause

TensorFlow ROCm port

543
65
82d
Apache-2.0

Deep learning with dynamic computation graphs in TensorFlow

1.8K
279
3y 6m
Apache-2.0

๐Ÿ“ Wrapper library for text generation / language models at char and word level with RNN in TensorFlow

62
30
3y 23d
MIT

TensorLight - A high-level framework for TensorFlow

9
2
4y 11d
MIT

Mesh TensorFlow: Model Parallelism Made Easier

873
156
99d
Apache-2.0

Ludwig is a toolbox that lets you train and evaluate deep learning models without the need to write code.

7.57K
889
83d
Apache-2.0

Keras community contributions

1.49K
615
1y 4m
MIT

Keras + Hyperopt: A very simple wrapper for convenient hyperparameter optimization

2.08K
301
4m
MIT

Distributed Deep learning with Keras & Spark

1.45K
288
91d
MIT

Train/evaluate a Keras model, get metrics streamed to a dashboard in your browser.

496
51
3y 11m
MIT

Graph Neural Networks with Keras and Tensorflow 2.

1.63K
198
88d
MIT

QKeras: a quantization deep learning library for Tensorflow Keras

246
50
84d
Apache-2.0

A high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. Keras compatible

MXNet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

19.28K
6.8K
82d
Apache-2.0

A clear, concise, simple yet powerful and efficient API for deep learning.

2.32K
226
3y 78d
Apache-2.0

Simple, efficient and flexible vision toolbox for mxnet framework.

31
9
3y 5m
BSD-3-Clause

Gluon CV Toolkit

4.56K
1.05K
82d
Apache-2.0

NLP made easy

2.23K
510
82d
Apache-2.0

Transfer Learning library for Deep Neural Networks.

174
42
10m
Apache-2.0

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

30
9
1y 4m
Apache-2.0

Others

Source-to-Source Debuggable Derivatives in Pure Python

2.15K
352
2y 9m
Apache-2.0

Efficiently computes derivatives of numpy code.

5.14K
733
1y 5m
MIT

Myia prototyping

367
39
88d
MIT

Neural Network Libraries

2.42K
299
87d
Apache-2.0

Caffe: a fast open framework for deep learning.

31.4K
18.77K
1y 92d
n/a

hipCaffe: the HIP port of Caffe

123
26
2y 7m
n/a

Probably the best curated list of data science software in Python.

786
142
4m
CC-BY-4.0

Web Scraping

Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

7.8K
1.54K
1y 20d
BSD-3-Clause

Scrape Twitter for Tweets

1.63K
497
9m
MIT

Providing Pythonic idioms for iterating, searching, and modifying HTML or XML.

A fast high-level screen scraping and web crawling framework.

Use Selenium Python API to access all functionalities of Selenium WebDriver in an intuitive way like a real user.

Data Containers

Create HTML profiling reports from pandas DataFrame objects

6.82K
1.04K
83d
MIT

cuDF - GPU DataFrame Library

3.68K
490
81d
Apache-2.0

NumPy and Pandas interface to Big Data

2.93K
378
1y 9m
n/a

sqldf for pandas

942
148
4y 104d
MIT

Pandas Google BigQuery

242
85
116d
BSD-3-Clause

Universal 1d/2d data containers with Transformers functionality for data analysis.

24
4
2y 7m
n/a

A pure Python implementation of Apache Spark's RDD and DStream interfaces.

231
44
82d
n/a

High performance datastore for time series and tick data

2.18K
447
109d
LGPL-2.1

A Python package for manipulating 2-dimensional tabular data structures

1.14K
99
81d
MPL-2.0

Koalas: pandas API on Apache Spark

2.66K
298
81d
Apache-2.0

Modin: Speed up your Pandas workflows by changing a single line of code

5.73K
392
82d
n/a

A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner

1.55K
72
4m
MIT

The easy way to write your own flavor of Pandas

194
14
1y 5m
MIT

The goal of pandas-log is to provide feedback about basic pandas operations. It provides simple wrapper functions for the most common functions that add additional logs

169
11
5m
MIT

Out-of-Core DataFrames for Python, ML, visualize and explore big tabular data at a billion rows per second 🚀

5.82K
448
99d
MIT

Powerful Python data analysis toolkit.

Pipelines

Easy pipelines for pandas DataFrames.

580
29
116d
n/a

functional data manipulation for pandas

175
24
5y 8m
n/a

dplyr for python

732
55
4y 4m
MIT

Pandas integration with sklearn

2.34K
389
84d
n/a

BatchFlow helps you conveniently work with random or sequential batches of your data and define data processing and machine learning workflows even for datasets that do not fit into memory.

154
36
113d
Apache-2.0

Clean APIs for data cleaning. Python implementation of R package Janitor

628
121
83d
MIT

A Python toolkit for processing tabular data

374
24
9m
MIT

Build, test, deploy, iterate - Dev and prod tool for data science pipelines

46
2
1y 10m
n/a

Directions overlay for working with pandas in an analysis environment

417
21
8m
BSD-3-Clause

Python pipe (|) operator with support for DataFrames, NumPy, and PyTorch.

Automates your software builds, tests, and deployments.

General

An open source python library for automated feature engineering

5.41K
709
85d
BSD-3-Clause

scikit-learn addon to operate on set/"group"-based features

40
8
4y 9m
BSD-3-Clause

A set of tools for creating and testing machine learning features, with a scikit-learn compatible API

377
83
3y 4m
n/a

a feature engineering wrapper for sklearn

43
19
2y 5m
GPL-3.0

A sklearn-compatible Python implementation of Multifactor Dimensionality Reduction (MDR) for feature construction.

113
19
3y 5m
MIT

Automatic extraction of relevant features from time series:

5.42K
852
107d
MIT

Feature Selection

open-source feature selection repository in python

1.05K
350
1y 6m
GPL-2.0

Python implementations of the Boruta all-relevant feature selection method.

905
183
90d
BSD-3-Clause

A fast xgboost feature selection algorithm

163
28
3y 114d
MIT

A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.

309
58
89d
MIT

General Purposes

matplotlib: plotting with Python

13.17K
5.67K
84d
n/a

Statistical data visualization using matplotlib

8.14K
1.38K
87d
BSD-3-Clause

Painlessly create beautiful matplotlib plots.

1.56K
144
6y 7m
MIT

Ternary plotting library for python with matplotlib

392
111
87d
MIT

Missing data visualization module for Python.

2.65K
348
4m
MIT

Python library that makes it easy for data scientists to create charts.

2.82K
254
6m
Apache-2.0

Python histogram library - histograms as updateable, fully semantic objects with visualization tools. [P]ython [HYST]ograms.

107
17
113d
MIT

Interactive plots

A Python package for animating plots, built on matplotlib.

357
32
7m
MIT

Interactive Data Visualization in the browser, from Python

14.7K
3.64K
81d
BSD-3-Clause

Plotting library for IPython/Jupyter notebooks

3.01K
434
81d
Apache-2.0

🎨 Python Echarts Plotting Library

9.65K
2.19K
8m
MIT

A Python library that makes interactive and publication-quality graphs.

Declarative statistical visualization library for Python. Supports many data transformations in code when building graphs.

Map

A Python package for interactive mapping with Google Earth Engine, ipyleaflet, and folium

924
394
83d
MIT

Makes it easy to visualize data on an interactive open street map

Automatic Plotting

With Holoviews, your data visualizes itself.

1.81K
305
82d
BSD-3-Clause

Automatically Visualize any dataset, any size with a single line of code. Created by Ram Seshadri. Collaborators Welcome. Permission Granted upon Request.

305
60
4m
Apache-2.0

Visualize and compare datasets, target values and associations, with one line of code.

1.27K
145
85d
MIT

NLP

Python library for interactive topic model visualization. Port of the R LDAvis package.

1.41K
302
83d
BSD-3-Clause

Deployment

A collection of APIs to turn scripts and notebooks into interactive reports.

Enables sharing and executing Jupyter Notebooks

Modern, fast (high-performance), web framework for building APIs with Python

Makes it easy to deploy machine learning models

Model Explanation

A data-driven approach to quantify the value of classifiers in a machine learning ensemble.

15
2
4m
MIT

Algorithms for monitoring and explaining machine learning models

897
110
86d
Apache-2.0

Code for "High-Precision Model-Agnostic Explanations" paper

622
91
8m
BSD-2-Clause

Bias and Fairness Audit Toolkit

359
67
101d
MIT

Contrastive Explanation (Foil Trees), developed at TNO/Utrecht University

33
4
88d
BSD-3-Clause

Visual analysis and diagnostic tools to facilitate machine learning model selection.

3.11K
477
91d
Apache-2.0

An intuitive library to add plotting functionality to scikit-learn objects.

2.03K
249
2y 9m
MIT

A game theoretic approach to explain the output of any machine learning model.

11.77K
1.73K
94d
MIT

A library for debugging/inspecting machine learning classifiers and explaining their predictions

2.3K
298
1y 114d
MIT

Lime: Explaining the predictions of any machine learning classifier

8.46K
1.37K
4m
BSD-2-Clause

278
51
2y 7m
n/a

78
26
2y 24d
n/a

python partial dependence plot toolbox

533
85
2y 5m
MIT

Python implementation of R package breakDown

38
4
2y 6m
n/a

⬛ Python Individual Conditional Expectation Plot Toolbox

100
25
3y 111d
MIT

Python Library for Model Interpretation/Explanations

969
162
10m
UPL-1.0

Model analysis tools for TensorFlow

1.04K
212
85d
Apache-2.0

A library that implements fairness-aware machine learning algorithms

69
17
2y 9m
MIT

601
123
10m
BSD-3-Clause

Interpretability and explainability of data and machine learning models

770
174
5m
Apache-2.0

Auralisation of learned features in CNN (for audio)

37
10
4y 66d
n/a

🎆 A visualization of the CapsNet layers to better understand how it works

369
91
1y 10d
MIT

A collection of infrastructure and tools for research in neural network interpretability.

4.07K
584
109d
Apache-2.0

Visualizer for neural network, deep learning, and machine learning models

13.37K
1.62K
81d
MIT

Exploration tool for your NeuralNetwork

31
2
2y 6m
MIT

tensorboard for pytorch (and chainer, mxnet, numpy, ...)

6.82K
788
10m
MIT

Logging MXNet data for visualization in TensorBoard.

330
50
1y 112d
Apache-2.0

Reinforcement Learning

A toolkit for developing and comparing reinforcement learning algorithms.

23.5K
6.69K
88d
n/a

Reinforcement Learning Coach by Intel AI Lab enables easy experimentation with state of the art Reinforcement Learning algorithms

1.92K
379
95d
Apache-2.0

A toolkit for reproducible reinforcement learning research.

1.08K
203
82d
MIT

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

11.18K
3.89K
1y 105d
MIT

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

2.92K
583
4m
MIT

A platform for Reasoning systems (Reinforcement Learning, Contextual Bandits, etc.)

2.81K
398
81d
BSD-3-Clause

TF-Agents is a library for Reinforcement Learning in TensorFlow

1.8K
471
81d
Apache-2.0

Tensorforce: a TensorFlow library for applied reinforcement learning

2.88K
489
96d
Apache-2.0

TensorFlow Reinforcement Learning

3.06K
371
1y 25d
Apache-2.0

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.

9.32K
1.24K
92d
Apache-2.0

Deep Reinforcement Learning for Keras.

4.96K
1.3K
1y 6m
MIT

ChainerRL is a deep reinforcement learning library built on top of Chainer.

928
212
5m
MIT

Probabilistic Methods

Fast, flexible and easy to use probabilistic modelling in Python.

2.6K
473
92d
MIT

Deep universal probabilistic programming with Python and PyTorch

6.74K
826
82d
Apache-2.0

THIS IS THE OLD PYMC PROJECT. PLEASE USE PYMC3 INSTEAD:

888
232
9m
AFL-3.0

Decorator for PyMC3

45
5
3y 4m
n/a

InferPy: Deep Probabilistic Modeling with Tensorflow Made Easy

130
15
10m
Apache-2.0

PyStan, a Python interface to Stan, a platform for statistical modeling. Documentation: https://pystan.readthedocs.io

38
11
85d
ISC

Bayesian dessert for Lasagne

84
6
3y 10m
MIT

Python package for Bayesian Machine Learning with scikit-learn API

420
108
1y 4m
MIT

Scikit-learn compatible estimation of general graphical models

176
31
4m
MIT

Python Library for learning (Structure and Parameter) and inference (Statistical and Causal) in Bayesian Networks.

1.71K
563
82d
MIT

Supervised domain-agnostic prediction framework for probabilistic modelling

107
15
2y 87d
n/a

A bare-bones TensorFlow framework for Bayesian deep learning and Gaussian process approximation

126
9
2y 7m
Apache-2.0

Probabilistic Programming and Statistical Inference in PyTorch

108
14
3y 10m
MIT

Python package facilitating the use of Bayesian Deep Learning methods with Variational Inference for PyTorch

315
46
2y 5m
MIT

The Python ensemble sampling toolkit for affine-invariant MCMC

1.11K
389
86d
MIT

A library for hidden semi-Markov models with explicit durations

46
11
3y 4m
GPL-3.0

465
146
8m
MIT

A highly efficient and modular implementation of Gaussian Processes in PyTorch

2.3K
328
82d
MIT

Modular Probabilistic Programming on MXNet

95
26
1y 11m
Apache-2.0

scikit-learn inspired API for CRFsuite

360
154
1y 5m
n/a

Python package for Bayesian statistical modeling and Probabilistic Machine Learning. Theano compatible

A library for probabilistic modeling, inference, and criticism. sklearn

Genetic Programming

Genetic Programming in Python, with a scikit-learn inspired API

907
156
1y 90d
BSD-3-Clause

Distributed Evolutionary Algorithms in Python

4.09K
868
4m
LGPL-3.0

A Genetic Programming platform for Python with TensorFlow for wicked-fast CPU and GPU support.

128
60
109d
n/a

A strongly-typed genetic programming framework for Python

98
11
2y 11m
n/a

Genetic feature selection module for scikit-learn

141
42
5m
n/a

Optimization

Spearmint Bayesian optimization codebase

1.41K
302
2y 44d
n/a

Bayesian optimization in PyTorch

1.84K
185
82d
MIT

Genetic Algorithm, Particle Swarm Optimization, Simulated Annealing, Ant Colony Optimization Algorithm, Immune Algorithm, Artificial Fish Swarm Algorithm, Differential Evolution and TSP (Traveling salesman)

1.98K
463
102d
MIT

Sequential Model-based Algorithm Configuration

561
156
6m
n/a

optimization routines for hyperparameter tuning

363
74
1y 4d
n/a

Distributed Asynchronous Hyperparameter Optimization in Python

5.45K
868
95d
n/a

Hyper-parameter optimization for sklearn

1.18K
224
6m
n/a

Use evolutionary algorithms instead of gridsearch in scikit-learn

628
112
1y 5m
MIT

SigOpt wrappers for scikit-learn methods

69
12
1y 44d
MIT

A Python implementation of global optimization with gaussian processes.

4.89K
1.1K
8m
MIT

Safe Bayesian Optimization

90
35
1y 29d
MIT

Sequential model-based optimization with a scipy.optimize interface

2.04K
390
4m
BSD-3-Clause

🎯 A comprehensive gradient-free optimization framework written in Python

545
55
3y 8m
MIT

A research toolkit for particle swarm optimization in Python

730
234
4m
MIT

A Free and Open Source Python Library for Multiobjective Optimization

309
116
11m
GPL-3.0

Bayesian Optimization using GPflow

227
55
5m
Apache-2.0

POT : Python Optimal Transport

912
199
4m
MIT

Hyperparameter Optimization for TensorFlow, Keras and PyTorch

1.38K
229
5m
MIT

library for nonlinear optimization, wrapping many algorithms for global and local, constrained or unconstrained, optimization

938
295
5m
n/a

Natural Language Processing

NLTK Source

9.65K
2.44K
94d
Apache-2.0

The Classical Language Toolkit

630
300
82d
MIT

scikit-learn wrappers for Python fastText.

210
21
8m
n/a

Simple text to phones converter for multiple languages

410
84
4m
GPL-3.0

A very simple framework for state-of-the-art Natural Language Processing (NLP)

9.97K
1.51K
83d
n/a

Topic Modelling for Humans.

A natural language processing toolkit.

A library for industrial-strength natural language processing in Python and Cython.

Computer Audition

Python library for audio and music analysis

4.31K
701
88d
ISC

Audio features extraction

186
42
1y 9m
LGPL-3.0

a library for audio and music analysis

2.05K
301
116d
GPL-3.0

C++ library for audio and music analysis, description and synthesis, including Python bindings

1.74K
403
81d
AGPL-3.0

LibXtract is a simple, portable, lightweight library of audio feature extraction functions.

201
46
1y 10m
MIT

Marsyas - Music Analysis, Retrieval and Synthesis for Audio Signals

330
92
7m
GPL-2.0

A library for augmenting annotated audio data

174
31
9m
ISC

Python audio and music signal processing library

718
122
1y 4m
n/a

Computer Vision

Open Source Computer Vision Library

52.52K
43.35K
82d
n/a

Image processing in Python

4.2K
1.76K
82d
n/a

Image augmentation for machine learning experiments.

10.72K
2K
11m
MIT

Image augmentation library in Python for machine learning.

4.32K
806
1y 67d
MIT

Fast image augmentation library and easy to use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about library: https://www.mdpi.com/2078-2489/11/2/125

7.35K
960
88d
MIT

Statistics

An extension to the pandas DataFrame describe function.

358
36
1y 8m
MIT

Create HTML profiling reports from pandas DataFrame objects

6.82K
1.04K
83d
MIT

Statsmodels: statistical modeling and econometrics in Python

6.04K
2.17K
85d
n/a

Supply a wrapper StockDataFrame based on the pandas.DataFrame with inline stock statistics/indicators support.

724
201
7m
BSD-3-Clause

Pandas-based utility to calculate weighted means, medians, distributions, standard deviations, and more.

83
9
1y 5m
MIT

Multiple Pairwise Comparisons (Post Hoc) Tests in Python

178
19
86d
MIT

Performance analysis of predictive (alpha) stock factors

1.78K
671
1y 18d
Apache-2.0

Distributed Computing

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

10.85K
1.75K
81d
n/a

Distributed machine learning platform

890
185
4y 112d
n/a

Framework and Library for Distributed Online Machine Learning

701
149
2y 0d
LGPL-2.1

Microsoft Distributed Machine Learning Toolkit

2.77K
593
3y 9m
MIT

PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)

14.34K
3.57K
81d
Apache-2.0

Scalable Machine Learning with Dask

685
189
89d
BSD-3-Clause

A distributed task scheduler for Dask

1.15K
512
81d
n/a

Exposes the Spark programming model to Python. Apache Spark based

Experimentation

Sacred is a tool to help you configure, organize, log and reproduce experiments developed at IDSIA.

3.32K
318
86d
MIT

A web-based application for quick, scalable, and automated hyperparameter tuning and stacked ensembling in Python.

1.25K
109
3y 8m
Apache-2.0

A visual dataflow programming language for sklearn

175
36
3y 10m
MIT

Adaptive Experimentation Platform

1.41K
143
84d
MIT

A lightweight ML experiment tracking, results visualization and management tool.

Evaluation

A library of metrics for evaluating recommender systems

236
56
5m
MIT

Machine learning evaluation metrics, implemented in Python, R, Haskell, and MATLAB / Octave

1.39K
426
5y 8m
n/a

Machine learning model evaluation made easy: plots, tables, HTML reports, experiment tracking and Jupyter notebook analysis.

286
26
4m
MIT

A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.

1.23K
409
82d
Apache-2.0

Computations

Parallel computing with task scheduling

7.95K
1.23K
81d
BSD-3-Clause

Fast NumPy array functions written in C

577
71
111d
BSD-2-Clause

A NumPy-compatible array library accelerated by CUDA

4.84K
435
81d
MIT

Python library for multilinear algebra and tensor factorizations

379
109
4y 7m
GPL-3.0

Solve automatic numerical differentiation problems in one or more variables.

136
31
5m
BSD-3-Clause

Add built-in support for quaternions to numpy

378
59
6m
MIT

Adaptive: parallel active learning of mathematical functions

642
36
6m
BSD-3-Clause

A fundamental package for scientific computing with Python.

Spatial Analysis

Python tools for geographic data

2.48K
556
81d
BSD-3-Clause

PySAL: Python Spatial Analysis Library Meta-Package

822
258
103d
BSD-3-Clause

Quantum Computing

PennyLane is a cross-platform Python library for differentiable programming of quantum computers. Train a quantum computer the same way as a neural network.

773
222
84d
Apache-2.0

QML: Quantum Machine Learning

139
54
2y 8m
MIT

Conversion

Transpile trained scikit-learn estimators to C, Java, JavaScript and others.

996
129
1y 4m
MIT

Open standard for machine learning interoperability

9.8K
1.82K
82d
Apache-2.0

MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch Onnx and CoreML.

5.22K
949
9m
MIT