Awesome Bioinformatics

A curated list of awesome Bioinformatics libraries and software.

Thank you danielecook & contributors
View Topic on GitHub:
danielecook/Awesome-Bioinformatics

Package suites

Core BioPerl 1.x code

267
177
7m
n/a

Official git repository for Biopython (originally converted from CVS)

2.89K
1.39K
1y 26d
n/a

This library provides implementations of many algorithms and data structures that are useful for bioinformatics. All provided implementations are rigorously tested via continuous integration.

991
141
1y 41d
MIT

The modern C++ library for sequence analysis. Contains version 3 of the library and API docs.

226
58
1y 26d
n/a

A Go package for engineering organisms.

160
23
1y 29d
MIT

OCaml Bioinformatics Library

107
18
1y 4m
n/a

Downloading

The command-line interface to GGD

27
2
1y 4m
MIT

Web application to explore the Sequence Read Archive.

119
24
1y 5m
GPL-2.0

Compressing

Compressor for genomic files (FASTQ, SAM/BAM, VCF, FASTA, GVF, 23andMe...), up to 5x better than gzip and faster too

68
4
10m
n/a

Command Line Utilities

Useful bash one-liners for bioinformatics.

1.31K
411
3y 5m
n/a

Modular and universal bioinformatics

294
43
3y 12d
MIT

Syntax highlighting for computational biology

190
28
1y 4m
GPL-3.0

A suite of utilities for converting to and working with CSV, the king of tabular file formats.

4.87K
565
1y 53d
MIT

A cross-platform, efficient and practical CSV/TSV toolkit in Golang

649
65
1y 45d
MIT

Easily submitting multiple PBS jobs or running local jobs in parallel. Multiple input files supported.

21
6
3y 5m
MIT

a wee tool for random access into BGZF files.

78
12
4y 6m
MIT

sort genomic data

30
1
2y 6m
MIT

Note: tabix and bgzip binaries are now part of the HTSlib project.

82
39
1y 119d
n/a

Write-once-read-many table for large datasets.

25
4
6y 26d
LGPL-3.0

Create an index on a compressed text file

553
36
4y 5d
BSD-2-Clause

Workflow Managers

BigDataScript: Scripting language for big data

90
22
1y 8m
n/a

Bpipe - a tool for running and managing bioinformatics pipelines

195
53
1y 53d
n/a

Repository for the CWL standards. Use https://cwl.discourse.group/ for support 😊

1.29K
187
1y 72d
Apache-2.0

Scientific workflow engine designed for simplicity & scalability. Trivially transition from one-off use cases to massive-scale production environments

742
259
1y 26d
n/a

A DSL for data-driven computational pipelines

1.53K
373
1y 26d
Apache-2.0

Yet another redundant workflow engine

357
22
42d
Apache-2.0

CGAT-ruffus is a lightweight python module for running computational pipelines

163
34
1y 4m
MIT

Robust, flexible and resource-efficient pipelines using Go and the commandline

925
67
6m
MIT

This is the SeqWare Project's main repo.

26
18
4y 4m
GPL-3.0

Workflow Description Language - Specification and Implementations

24
6
3y 8m
BSD-3-Clause

Pipelines

A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin

4.38K
482
1y 57d
n/a

A flexible pipeline for complete analysis of bacterial genomes

105
23
1y 26d
MIT

Generic but comprehensive pipeline for prokaryotic genome annotation and interrogation with interactive reports and shiny app.

39
3
1y 34d
GPL-3.0

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis

860
347
9m
MIT

Software for intuitively doing Differential Gene Expression (DGE) analysis on Windows and GNU/Linux, based on R packages.

3
0
3y 47d
n/a

A pipeline for preprocessing NGS data from Illumina, Nanopore and PacBio technologies

8
3
1y 44d
GPL-3.0

Sequence Processing

Automatic Filtering, Trimming, Error Removing and Quality Control for fastq data

179
51
2y 6m
MIT

A quality control analysis tool for high throughput sequencing data

197
51
1y 106d
n/a

Simple FASTQ quality assessment using Python

94
15
1y 6m
MIT

FASTA/FASTQ pre-processing programs

130
56
3y 11m
n/a

Aggregate results from bioinformatics analyses across many samples into a single report.

772
426
1y 28d
GPL-3.0

seqfu - Sequence Fastx Utilities

23
0
1y 27d
GPL-3.0

A cross-platform and ultrafast toolkit for FASTA/Q file manipulation in Golang

750
112
1y 29d
MIT

An imagemagick-like frontend to Biopython SeqIO

93
21
1y 11m
GPL-3.0

Toolkit for processing sequences in FASTA/Q formats

881
255
1y 55d
MIT

Explore and analyze biological sequence data

11
1
1y 4m
MIT
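
Several entries in this category perform FASTQ quality assessment and filtering. As a rough illustration of the idea only (not the implementation of any tool listed here), the following minimal sketch reads four-line FASTQ records and splits reads by mean Phred score. The function names, the Phred+33 offset and the threshold of 20 are assumptions chosen for the example.

```python
import io
from statistics import mean


def read_fastq(handle):
    """Yield (name, sequence, quality) from a strict 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of input (or a blank line) stops iteration
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        yield header[1:], seq, qual


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (assumes Phred+33 encoding)."""
    return mean(ord(c) - offset for c in qual)


def summarize(handle, min_mean_q=20):
    """Split read names into (kept, dropped) by mean quality (hypothetical threshold)."""
    kept, dropped = [], []
    for name, _seq, qual in read_fastq(handle):
        (kept if mean_phred(qual) >= min_mean_q else dropped).append(name)
    return kept, dropped


# Two toy reads: 'I' encodes Q40 and '!' encodes Q0 under Phred+33.
EXAMPLE = "@read1\nACGT\n+\nIIII\n@read2\nACGTAC\n+\n!!!!II\n"

if __name__ == "__main__":
    kept, dropped = summarize(io.StringIO(EXAMPLE))
    print("kept:", kept, "dropped:", dropped)
```

Real FASTQ files may wrap sequences over multiple lines or use other quality encodings; the dedicated tools above handle those cases.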

Data Analysis

Scalable genomic data analysis.

756
205
1y 26d
MIT

Scalable gVCF merging and joint variant calling for population sequencing projects

76
23
1y 71d
Apache-2.0

Pairwise

A fast and sensitive gapped read aligner

402
135
1y 36d
GPL-3.0

Burrows-Wheeler Aligner for short-read alignment (see minimap2 for long-read alignment)

1.04K
495
1y 46d
GPL-3.0

Wavefront alignment algorithm (WFA): Fast and exact gap-affine pairwise alignment

167
14
1y 4m
n/a

Pairwise Sequence Alignment Library

160
22
1y 7m
n/a

Mummer alignment tool

274
87
1y 87d
n/a

Accelerated BLAST compatible local sequence aligner.

701
148
6m
GPL-3.0

Multiple Sequence Alignment

A simple Partial Order Aligner based on Lee, Grasso and Sharlow (2002), for education/demonstration purposes

52
11
1y 73d
GPL-2.0

Clustering

MMseqs2: ultra fast and sensitive search and clustering suite

577
87
1y 27d
GPL-3.0

Quantification

RSEM: accurate quantification of gene and isoform expression from RNA-Seq data

290
99
1y 4m
GPL-3.0

Variant Calling

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.

2.37K
587
1y 34d
BSD-3-Clause

Bayesian haplotype-based genetic polymorphism discovery and genotyping.

560
232
1y 8m
MIT

Official code repository for GATK versions 1.0 through 3.7 (core engine). For GATK 4 code, see the https://github.com/broadinstitute/gatk repository

277
221
4y 101d
n/a

Bayesian haplotype-based mutation calling

238
31
1y 28d
MIT

This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html

411
186
1y 28d
n/a

Structural variant callers

DELLY2: Structural variant discovery by integrated paired-end and split-read analysis

277
112
1y 33d
BSD-3-Clause

lumpy: a general probabilistic framework for structural variant discovery

234
109
2y 77d
MIT

Structural variant and indel caller for mapped sequencing data

293
106
3y 4m
n/a

GRIDSS: the Genomic Rearrangement IDentification Software Suite

165
42
1y 27d
n/a

structural variant calling and genotyping with existing tools, but, smoothly.

156
18
1y 46d
Apache-2.0

BAM File Utilities

C++ API & command-line toolkit for working with BAM data

346
142
1y 4m
MIT

A bam toolbox

1
1
5y 45d
MIT

Automate common sam & bam conversions

6
1
9y 7m
n/a

fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing

444
76
1y 4m
MIT

SAMStat displays various properties of next-generation sequencing reads stored in SAM/BAM format.

11
1
4y 11m
GPL-3.0

fast sample-swap and relatedness checks on BAMs/CRAMs/VCFs/GVCFs... "like damn that is one smart wine guy"

156
22
10m
MIT

Software for calculating telomere length

41
24
4y 38d
GPL-3.0

VCF File Utilities

This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html

411
186
1y 28d
n/a

annotate a VCF with other VCFs/BEDs/tabixed files

274
52
1y 35d
MIT

C++ library and cmdline tools for parsing and manipulating VCF files

448
202
1y 6m
MIT

A set of tools written in Perl and C++ for working with VCF files, such as those generated by the 1000 Genomes Project.

354
133
2y 34d
LGPL-3.0

GFF BED File Utilities

Another Gtf/Gff Analysis Toolkit

174
25
1y 26d
GPL-3.0

GFF and GTF file manipulation and interconversion

173
63
1y 60d
MIT

bedtools - the swiss army knife for genome arithmetic

725
265
1y 31d
MIT
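
"Genome arithmetic" on BED-style intervals largely comes down to operations such as intersection. The sketch below shows the basic idea in plain Python, assuming BED conventions (0-based, half-open coordinates). The `intersect` function is a hypothetical helper written for this example; it uses a simple nested scan and is not how bedtools or any other listed tool is implemented.

```python
from collections import defaultdict


def intersect(a, b):
    """Return the overlapping segments between two lists of (chrom, start, end).

    Coordinates are assumed to be 0-based and half-open, as in BED files,
    so intervals that merely touch (end == start) do not overlap.
    """
    # Group the B intervals by chromosome so each A interval only
    # compares against candidates on the same chromosome.
    b_by_chrom = defaultdict(list)
    for chrom, start, end in b:
        b_by_chrom[chrom].append((start, end))

    overlaps = []
    for chrom, start, end in a:
        for b_start, b_end in b_by_chrom.get(chrom, ()):
            lo, hi = max(start, b_start), min(end, b_end)
            if lo < hi:  # non-empty overlap
                overlaps.append((chrom, lo, hi))
    return sorted(overlaps)


if __name__ == "__main__":
    genes = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 60)]
    peaks = [("chr1", 150, 350), ("chr2", 60, 70)]
    for chrom, start, end in intersect(genes, peaks):
        print(chrom, start, end, sep="\t")
```

The nested scan is quadratic in the worst case; for large files, sorted sweeps or interval trees, as used by dedicated tools, scale far better.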

Variant Simulation

tools for adding mutations to existing .bam files, used for testing mutation callers

178
69
1y 4m
MIT

Reads simulator

194
87
1y 88d
n/a

Variant Prediction/Annotation

Data

python access to UCSC genomes database

126
39
2y 95d
MIT

Python interface to access reference genome features (such as genes, transcripts, and exons) from Ensembl

266
48
1y 52d
Apache-2.0

Access to Biological Web Services from Python.

189
58
1y 47d
n/a

Tools

A fast Python library for VCF files leveraging Cython for speed.

51
13
4y 8m
MIT

cython + htslib == fast VCF and BCF processing

275
49
1y 35d
MIT

Python wrapper -- and more -- for Aaron Quinlan's BEDTools (bioinformatics tools)

235
85
1y 47d
n/a

Efficient pythonic random access to fasta subsequences

324
55
1y 33d
n/a

Pysam is a Python module for reading and manipulating SAM/BAM/VCF/BCF files. It's a lightweight wrapper of the htslib C-API, the same one that powers samtools, bcftools, and tabix.

539
227
1y 28d
MIT

A Variant Call Format reader for Python.

353
186
1y 70d
n/a

Assembly

SPAdes Genome Assembler

380
94
1y 32d
n/a

SKESA assembler

69
12
1y 43d
n/a

Annotation

Rapid prokaryotic genome annotation

511
176
1y 6m
n/a

Rapid & standardized annotation of bacterial genomes & plasmids

106
12
1y 26d
GPL-3.0

Long-read Assembly

A single molecule sequence assembler for genomes large and small.

494
161
1y 36d
n/a

De novo assembler for single molecule sequencing reads using repeat graphs

447
97
1y 46d
n/a

Hifiasm: a haplotype-resolved assembler for accurate HiFi reads

202
33
1y 79d
MIT

Redbean: A fuzzy Bruijn graph approach to long noisy reads assembly

432
83
1y 9m
GPL-3.0

Genome Browsers / Gene Diagrams

📈 DNA Sequence Visualization for Humans

31
8
1y 4m
MIT

Interactive web-based genome browser.

210
68
3y 95d
BSD-2-Clause

🔬 A library of JavaScript components to represent biological data

443
122
1y 50d
Apache-2.0

Flexible circular visualization of genome-associated data with BioPerl and SVG.

38
6
3y 5m
Artistic-2.0

Horizon chart js library for DNA data.

58
6
6y 7m
n/a

Integrative Genomics Viewer. Fast, efficient, scalable visualization tool for genomics data and annotations

452
214
1y 27d
MIT

SVG-based genome viewer written in JavaScript using D3

32
25
7y 4m
GPL-2.0

A modern genome browser built with JavaScript and HTML5.

409
197
1y 42d
LGPL-2.1

Pathogen-Host Analysis Tool - A modern Next-Generation Sequencing (NGS) analysis platform

16
2
1y 49d
GPL-3.0

Interactive in-browser track viewer

257
63
1y 35d
Apache-2.0

HTML5 canvas genomic graphics library

74
17
3y 6m
n/a

Circos Related

Circos is a software package for visualizing data and information. It visualizes data in a circular layout; this makes Circos ideal for exploring relationships between objects or positions.

54
30
4y 33d
n/a

Fuji plot: a circos representation of multiple GWAS results

47
15
1y 10m
GPL-3.0

Database Access

Becoming a Bioinformatician

Bioinformatics on GitHub

Alternative splicing resource

25
6
4y 8m
n/a

Sequencing

[1:34:35] - Excellent (technical) overview of next-generation and third-generation sequencing technologies, along with some applications in cancer research.

RNA-Seq

Informatics for RNA-seq: A web resource for analysis on the cloud. Educational tutorials and working pipelines for RNA-seq analysis including an introduction to: cloud computing, critical file formats, reference genomes, gene annotation, expression, differential expression, alternative splicing, data visualization, and interpretation.

1.11K
569
1y 7m
CC-BY-SA-4.0

RNAseq analysis notes from Ming Tang

614
238
1y 5m
MIT

[46:39] - Dr. Lior Pachter shares his stories from the supplement for well-known RNA-seq analysis software CuffDiff and Cufflinks and explains some of their methodologies.

ChIP-Seq

ChIP-seq analysis notes from Ming Tang

526
257
1y 7m
MIT

YouTube Channels and Playlists

Blogs

Miscellaneous

Online networking groups