Awesome Streaming

A curated list of awesome streaming frameworks, applications, etc.

Last Update: Dec. 1, 2020, 9:16 p.m.

Thank you manuzhang & contributors
View Topic on GitHub:
manuzhang/awesome-streaming

Streaming Engine

Mirror of Apache Apex core

348
178
3y 79d
Apache-2.0

Apache Flink

14.76K
8.1K
2d
Apache-2.0

Apache Heron (Incubating) is a realtime, distributed, fault-tolerant stream processing engine from Twitter

3.49K
616
5d
Apache-2.0

Mirror of Apache Samza

660
272
7d
Apache-2.0

Apache Spark - A unified analytics engine for large-scale data processing

28.14K
22.94K
3d
Apache-2.0

Mirror of Apache Storm

6.17K
4.06K
16d
n/a

SQL-based streaming analytics platform at scale

1.16K
277
2y 4m
Apache-2.0

Python Stream Processing

5.07K
422
54d
n/a

Lightweight real-time big data streaming engine over Akka

742
161
1y 2d
Apache-2.0

Distributed Stream and Batch Processing

815
161
2d
n/a

Haskell distributed stream processing with exactly-once semantics

86
7
6y 5m
Apache-2.0

A platform that makes it easy for developers to build realtime, cost-effective, operations-focused applications

1.07K
90
14d
Apache-2.0

Muppet

126
32
5y 5m
Apache-2.0

Distributed, masterless, high performance, fault tolerant data processing

1.98K
212
1y 94d
EPL-1.0

Mirror of Apache S4

38
17
8y 4m
Apache-2.0

Window-Based Hybrid CPU/GPU Stream Processing Engine

34
10
89d
n/a

Spooker is a dynamic framework for processing high volume data streams via processing pipelines

27
5
4y 10m
Apache-2.0

High Throughput Real-time Stream Processing Framework

275
34
3y 8m
Apache-2.0

The core libraries of the teknek stream processing platform

7
1
4y 11m
Apache-2.0

Trill is a single-node query processor for temporal or streaming data.

1.09K
106
33d
MIT

Distributed Stream Processing

1.43K
66
15d
Apache-2.0

Multi-core Window-Based Stream Processing Engine

26
5
34d
Apache-2.0

Streaming Library

Mirror of Apache Kafka

17.42K
9.31K
2d
Apache-2.0

Build highly concurrent, distributed, and resilient message-driven applications on the JVM

11.23K
3.33K
2d
n/a

A stream processor for mundane tasks written in Go

2.69K
232
2d
MIT

Compositional, streaming I/O library for Scala

1.77K
466
4d
n/a

Asynchronous, Reactive Programming for Scala and Scala.js.

1.73K
230
3d
Apache-2.0

StreamLine - Streaming Analytics

150
93
1y 4m
Apache-2.0

StreamAlert is a serverless, realtime data analysis framework which empowers you to ingest, analyze, and alert on data from any environment, using datasources and alerting logic you define.

2.41K
301
58d
Apache-2.0

A lightweight Reactive Streams Infrastructure Toolkit for Scala.

175
23
2y 5m
MPL-2.0

Real-time stream processing for python

881
111
8d
n/a

Stream Ops is a fully embeddable data streaming engine and stream processing API for Java.

35
8
1y 44d
n/a

Python Data Streams

187
17
13d
Apache-2.0

Streaming Application

A platform for real-time streaming search

97
21
4y 9m
MIT

A scalable, mature and versatile web crawler based on Apache Storm

681
216
6d
Apache-2.0

IoT

Lightweight stream processing engine for IoT

198
34
1y 29d
MIT

Mirror of Apache Edgent (Incubating)

199
134
1y 8m
Apache-2.0

Apache StreamPipes - A self-service (Industrial) IoT toolbox to enable non-technical users to connect, analyze and explore IoT data streams.

184
30
4d
Apache-2.0

DSL

Apache Beam is a unified programming model for Batch and Streaming

4.4K
2.79K
5d
Apache-2.0

Experiments in Streaming

58
3
4y 98d
Apache-2.0

Esper Complex Event Processing, Streaming SQL and Event Series Analysis

667
210
22d
n/a

Run Python in Apache Storm topologies. Pythonic API, CLI tooling, and a topology DSL.

1.42K
217
56d
n/a

Streaming MapReduce with Scalding and Storm

2.09K
261
1y 9m
Apache-2.0

Data Pipeline

Mirror of Apache Kafka

17.42K
9.31K
2d
Apache-2.0

Apache Pulsar - distributed pub-sub messaging system

6.87K
1.69K
2d
Apache-2.0

An extensible distributed system for reliable nearline data streaming at scale

616
81
92d
BSD-2-Clause

LinkedIn's previous generation Kafka to HDFS pipeline.

874
475
5y 43d
n/a

Source-agnostic distributed change data capture system

3.02K
676
6m
Apache-2.0

Mirror of Apache Flume

2.04K
1.4K
6m
Apache-2.0

Build platforms that flexibly mix SQL, batch, and stream processing paradigms

202
30
76d
MIT

A highly available, high-performance distributed messaging system.

1.29K
699
3y 10m
Apache-2.0

NATS Streaming System Server

2.11K
259
13d
Apache-2.0

A realtime distributed messaging platform

18.83K
2.46K
4d
MIT

Privacy and Security focused Segment-alternative, in Golang and React

2.15K
101
2d
AGPL-3.0

Netflix's distributed Data Pipeline

747
173
4y 11m
Apache-2.0

StreamSets Data Collector - Continuous big data and cloud platform ingest infrastructure

1.07K
577
7d
Apache-2.0

[C++] - a high-performance distributed system by Facebook for streaming and storing sequential data, using a log structure.

Online Machine Learning

Mirror of Apache Samoa (Incubating)

234
105
9m
Apache-2.0

Core Java Sketch Library.

668
183
40d
Apache-2.0

Stream Data Mining Library for Spark Streaming

447
142
42d
Apache-2.0

Python application to set up and run streaming (contextual) bandit experiments.

69
16
5m
MIT

Apache Storm + OpenCV = large scale distributed image and video analysis.

161
60
4y 57d
Apache-2.0

Trident-ML: A realtime online machine learning library

382
90
5y 63d
n/a

Anomaly detection framework @ PayPal

100
30
1y 92d
Apache-2.0

Streaming SQL

High-performance time-series aggregation for PostgreSQL

2.35K
212
1y 7m
Apache-2.0

A streaming / online query processing / analytics engine based on Apache Storm

267
98
3y 6m
Apache-2.0
0
0
3y 4m
Apache-2.0

The event streaming database purpose-built for stream processing applications

3.88K
715
6d
n/a

Benchmark

42
40
4y 32d
Apache-2.0

A simple storm performance/stress test

77
39
50d
n/a

Benchmarks for Low Latency (Streaming) solutions including Apache Storm, Apache Spark, Apache Flink, ...

520
280
50d
Apache-2.0

Automated message queue orchestration for scaled-up benchmarking.

230
36
4y 10m
Apache-2.0

Toolkit

Build highly concurrent, distributed, and resilient message-driven applications on the JVM

11.23K
3.33K
2d
n/a

Event driven concurrent framework for Python

1.86K
161
11m
BSD-3-Clause

Efficient reliable UDP unicast, UDP multicast, and IPC message transport

5.27K
622
4d
Apache-2.0

StreamFlow™ is a stream processing tool designed to help build and monitor processing workflows.

236
64
5y 96d
Apache-2.0

Integration of Samza and Luwak

100
15
6y 24d
n/a

SSE Stream Aggregator

798
249
5y 4m
Apache-2.0

Closed Source

Provides real-time data processing over large, distributed data streams.

[.NET] - a massively scalable, fully managed, real-time data stream engine provided by Microsoft Azure.

Serverless stream and batch data processing service.

[C++] - a distributed stream processing framework built in C++ on top of Apache.

platform for distributed processing and real-time analytics. Integrates with many of the popular technologies in the Big Data ecosystem (Kafka, HDFS, Spark, etc.)

[C++] - distributed processing framework and streaming machine learning library.

framework for building low-latency data-processing applications that is widely used at Google.

Readings

Grokking Streaming Systems helps you unravel what streaming systems are, how they work, and whether they're right for your business. Written to be tool-agnostic, you'll be able to apply what you learn no matter which framework you choose.