Awesome Empirical Software Engineering

A curated repository of software engineering repository mining data sets

Last Update: Aug. 7, 2022, 6:15 p.m.

Thank you dspinellis & contributors
View Topic on GitHub: dspinellis/awesome-msr

Awesome Empirical Software Engineering

😎 Awesome lists about all kinds of interesting topics
Stars: 189.2K, forks: 22.82K, last commit: 6m ago, license: CC0-1.0

Repositories

Data Sets

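A Database of Real Faults and an Experimental Infrastructure to Enable Controlled Experiments in Software Engineering Research
Stars: 397, forks: 194, last commit: 9m ago, license: MIT

The Bug Catalog of the Maven Ecosystem
Stars: 1, forks: 3, last commit: 7y 34d ago, license: n/a

Generating the Blueprints of the Java Ecosystem (MSR Data Paper 2015)
Stars: 0, forks: 0, last commit: 7y 86d ago, license: n/a

Multi-extract and Multi-level Dataset of Mozilla Issue Tracking History
Stars: 7, forks: 1, last commit: 6y 87d ago, license: Apache-2.0

A Data Set of OCL Expressions on GitHub
Stars: 4, forks: 2, last commit: 4y 50d ago, license: n/a

Continuous Unix commit history from 1970 until today
Stars: 4.83K, forks: 390, last commit: 1y 7m ago, license: n/a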

Tools

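A library for mining of path-based representations of code (and more)
Stars: 189, forks: 65, last commit: 9m ago, license: MIT

A multi-language tokenizer for extracting identifiers from source code
Stars: 18, forks: 5, last commit: 1y 4m ago, license: Apache-2.0

A tool for mining commits and diffs from Git repositories to automatically extract code change pattern instances and features with AST analysis
Stars: 69, forks: 28, last commit: 1y 4m ago, license: MIT

Collect and view OSS cryptocurrency development
Stars: 8, forks: 0, last commit: 3y 36d ago, license: MIT

Database smell detector
Stars: 13, forks: 1, last commit: 4y 6m ago, license: MIT

Detects smells and computes metrics of Java code
Stars: 109, forks: 36, last commit: 1y 104d ago, license: Apache-2.0

An agile tool to analyze Git repositories
Stars: 17, forks: 5, last commit: 11m ago, license: LGPL-3.0

This project mines Maven Central and creates a global dependency graph
Stars: 23, forks: 7, last commit: 11m ago, license: n/a

Send Sir Perceval on a quest to retrieve and gather data from software repositories
Stars: 236, forks: 132, last commit: 9m ago, license: GPL-3.0

Smell detection tool for Puppet code
Stars: 39, forks: 14, last commit: 1y 10m ago, license: Apache-2.0

Python framework to analyse Git repositories
Stars: 517, forks: 106, last commit: 9m ago, license: Apache-2.0

C Quality Metrics
Stars: 45, forks: 9, last commit: 2y 17d ago, license: n/a

Calculate the score of a repository based on best engineering practices
Stars: 86, forks: 21, last commit: 1y 10m ago, license: Apache-2.0

A vulnerability patch gathering tool
Stars: 27, forks: 16, last commit: 3y 7m ago, license: Apache-2.0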

Research Outlets