Awesome NLP with Ruby

Curated List: Practical Natural Language Processing done in Ruby

Last Update: Sept. 25, 2022, 6:06 p.m.

Thank you arbox & contributors
View Topic on GitHub: arbox/nlp-with-ruby

Pipeline Generation

Composable Operations is a tool set for creating operations and assembling multiple of these operations in operation pipelines.

46
6
6y 4d
MIT

Ruby wrapper for Apache Spark

222
26
5y 27d
MIT

Simplifying Kafka for Ruby apps

190
28
1y 24d
Apache-2.0

Ruby: parallel processing made simple and fast

3.76K
246
10m
MIT

Parallel Workflow extension for Rake, runs on multicores, clusters, clouds.

57
3
2y 8m
MIT

Multipurpose Engines

Ruby bindings to the OpenNLP Java toolkit.

89
11
7y 12m
n/a

Ruby bindings to the Stanford Core NLP tools (English, French, German).

428
68
2y 6m
n/a

Natural language processing framework for Ruby.

1.35K
128
5y 4m
n/a

Wrapper for basic NLP tools

2
1
4y 5m
MIT

JRuby tools wrapper for Apache OpenNLP

11
2
3y 10m
MIT

A wrapper module for using spaCy natural language processing library from the Ruby programming language via PyCall

32
3
41d
MIT

On-line APIs

An SDK for AlchemyAPI using Ruby. Please note that this legacy AlchemyAPI SDK is no longer supported by IBM. Please use the Watson SDKs instead: https://github.com/watson-developer-cloud?utf8=✓&query=sdk

36
28
5y 12m
Apache-2.0

Ruby library for Wit.ai

276
71
1y 25d
n/a

Ruby based API for the project Wortschatz Leipzig.

19
8
4y 10m
MIT

Official Ruby client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Ruby apps.

79
13
1y 62d
MIT

Google Cloud Client Library for Ruby

1.13K
474
10m
Apache-2.0

Language Identification

28
7
11m
MIT

Segmentation

A simple tokenizer in Ruby for NLP tasks.

43
11
5y 5m
n/a

A multilingual tokenizer to split a string into tokens

87
9
1y 8m
MIT

Natural language processing algorithms implemented in pure Ruby with minimal dependencies

19
0
5y 5m
MIT

Simple and customizable text tokenization gem.

30
4
12m
MIT

Pragmatic Segmenter is a rule-based sentence boundary detection gem that works out-of-the-box across many languages.

484
37
1y 4m
MIT

Ruby port of the NLTK Punkt sentence segmentation algorithm

90
9
4y 109d
n/a

Accurate Bayesian sentence tokenizer in Ruby.

79
12
8y 5m
n/a

A fast and accurate rule-based sentence segmentation tool for Ruby.

53
5
6y 9m
n/a

Stemming

Expose libstemmer_c to Ruby

255
20
1y 9m
MIT

Ruby port of UEALite Stemmer - a conservative stemmer for search and indexing

50
6
12m
Apache-2.0

Lemmatization

Lemmatizer for text in English. Inspired by Python's nltk.corpus.reader.wordnet.morphy

100
15
11m
MIT

Lexical Statistics: Counting Types and Tokens

Your Word Counter Gem

6
0
10y 10m
MIT

A word counter for String and Hash in Ruby

4
1
3y 6m
MIT

A Ruby natural language processor.

154
30
11m
MIT

Filtering Stop Words

Project for filtering stopwords

64
31
11m
MIT

Phrasal Level Processing

N-Gram generator in Ruby - http://en.wikipedia.org/wiki/N-gram

36
7
11m
MIT

Break words and phrases into ngrams.

11
1
8y 9m
MIT

A flexible and general-purpose ngrams library written in Ruby. Raingrams supports ngram sizes greater than 1, text/non-text grams, multiple parsing styles and open/closed vocabulary models.

69
7
1y 7m
MIT
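
For readers new to the term, the idea behind the n-gram libraries in this section can be sketched in a few lines of plain Ruby. This is only an illustration of the technique using the standard library's `Enumerable#each_cons`; the helper name `ngrams` is hypothetical and is not the API of any gem listed here.

```ruby
# Return every contiguous run of n items from a token list.
# `ngrams` is an illustrative helper, not part of any listed gem.
def ngrams(tokens, n)
  return [] if n < 1 # each_cons raises ArgumentError for sizes below 1
  tokens.each_cons(n).to_a
end

# Word-level bigrams
words = "the quick brown fox".split
bigrams = ngrams(words, 2)
# => [["the", "quick"], ["quick", "brown"], ["brown", "fox"]]

# Character-level trigrams
trigrams = ngrams("ruby".chars, 3).map(&:join)
# => ["rub", "uby"]
```

The same helper works on words or characters because it only assumes an enumerable of tokens; tokenization itself is left to the segmentation tools listed earlier.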

Constituency Parsing

An Earley parser written in Ruby

31
3
11m
MIT

Syntax tree generator made with Ruby and RMagick

60
10
1y 7m
n/a

Semantic Analysis

Approximate String Matching library

344
32
3y 71d
Apache-2.0

Calculates edit distance using Damerau-Levenshtein algorithm

122
14
1y 71d
MIT

Fast Ruby FFI string edit distance algorithms

79
1
9y 7m
n/a

Fast string edit distance computation, using the Damerau-Levenshtein algorithm.

147
17
7y 27d
BSD-2-Clause
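
As a rough illustration of what the edit distance entries above compute, here is a minimal pure-Ruby sketch. It implements the restricted Damerau-Levenshtein variant (often called optimal string alignment), which may differ from the exact variant a given gem uses; the method name `osa_distance` is hypothetical.

```ruby
# Optimal string alignment distance: Levenshtein edits (insert, delete,
# substitute) plus swapping two adjacent characters, each at cost 1.
# `osa_distance` is an illustrative name, not a listed gem's API.
def osa_distance(a, b)
  a = a.chars
  b = b.chars
  # d[i][j] holds the distance between the first i chars of a
  # and the first j chars of b.
  d = Array.new(a.size + 1) { |i| [i] + [0] * b.size }
  (0..b.size).each { |j| d[0][j] = j }

  (1..a.size).each do |i|
    (1..b.size).each do |j|
      cost = a[i - 1] == b[j - 1] ? 0 : 1
      d[i][j] = [
        d[i - 1][j] + 1,       # deletion
        d[i][j - 1] + 1,       # insertion
        d[i - 1][j - 1] + cost # substitution (or match when cost is 0)
      ].min
      # Adjacent transposition, e.g. "ca" -> "ac"
      if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]
        d[i][j] = [d[i][j], d[i - 2][j - 2] + 1].min
      end
    end
  end
  d[a.size][b.size]
end

osa_distance("kitten", "sitting") # => 3
osa_distance("ca", "ac")          # => 1
```

The table-based version runs in time proportional to the product of the two string lengths, which is why several of the listed gems delegate the inner loop to C or FFI code.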

Term Frequency - Inverse Document Frequency in Ruby

35
7
10y 7m
MIT

Ruby gem to calculate the similarity between texts using tf*idf

623
58
1y 11m
MIT
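
The tf*idf weighting that the two entries above refer to can be sketched as follows. This is a minimal version assuming raw term frequency normalized by document length and an unsmoothed natural-log inverse document frequency; real libraries often use other variants. The method name `tf_idf` is hypothetical, and the code needs Ruby 2.7 or later for `Enumerable#tally`.

```ruby
# Compute a tf-idf weight for every term in every document.
# docs is an array of token arrays; the result is one Hash per document.
# `tf_idf` is an illustrative name, not a listed gem's API.
def tf_idf(docs)
  total = docs.size.to_f

  # Document frequency: in how many documents does each term occur?
  df = Hash.new(0)
  docs.each { |tokens| tokens.uniq.each { |term| df[term] += 1 } }

  docs.map do |tokens|
    tokens.tally.to_h do |term, count|
      tf  = count.to_f / tokens.size    # share of this document's tokens
      idf = Math.log(total / df[term])  # 0 when the term is in every document
      [term, tf * idf]
    end
  end
end

docs = [%w[ruby nlp ruby], %w[ruby parser]]
weights = tf_idf(docs)
weights[0]["ruby"]          # => 0.0 (appears in every document)
weights[0]["nlp"].round(3)  # => 0.231
```

Terms that occur in every document receive a weight of zero, so the terms that distinguish one document from the others dominate any similarity score built on these weights.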

Pragmatical Analysis

A simple and extensible sentiment analysis gem

13
2
9y 8m
MIT

Spelling and Error Correction

Ruby wrapper for correcting spelling and grammar mistakes based on the context of complete sentences.

481
20
3y 29d
n/a

Ruby bindings to Hunspell 1.2.x with iconv support

4
2
10y 44d
n/a

Ruby FFI bindings for Hunspell.

46
24
1y 7m
MIT

Ruby wrapper for the famous spell checker library hunspell.

34
12
3y 10m
LGPL-3.0

Text Alignment

Alignment functions for corpus linguistics

1
1
8y 113d
n/a

Machine Translation

REST client for Google APIs

2.46K
794
10m
Apache-2.0

Ruby client for the Microsoft Translator API

22
11
5y 4m
MIT

Translations with speech synthesis in your terminal as a Ruby gem

507
21
5y 4m
MIT

Ruby NLP library

2
1
6y 6m
MIT

Sentiment Analysis

Sentiment analysis for the German language

18
5
6y 5m
n/a

Numbers, Dates, and Time Parsing

Chronic is a pure Ruby natural language date parser.

3.12K
464
4y 8m
MIT

A natural language parser for validating complex date ranges

27
3
1y 98d
MIT

A simple Ruby natural language parser for elapsed time

344
67
1y 20d
MIT

A dirt simple library for parsing and formatting human readable dates

152
11
8y 49d
MIT

Nickel extracts date, time, and message information from naturally worded text.

105
16
4y 11m
MIT

Natural language parser for recurring events

75
10
2y 8d
MIT

Parse numbers in natural language from strings (e.g. "forty two").

33
13
1y 7m
MIT

Named Entity Recognition

Named entity recognition with Stanford NER and Ruby

17
3
10y 4m
n/a

Ruby binding for the Stanford POS Tagger and Named Entity Recognizer

87
12
8y 54d
n/a

Text-to-Speech-to-Text

Ruby wrapper for ‘espeak’ and ‘lame’ with sugar on top to create Text-To-Speech mp3 files.

147
19
1y 14d
MIT

A Ruby gem for Text-To-Speech using the Google Translate service.

87
26
12m
MIT

A Ruby library for consuming the AT&T Speech API for speech to text.

20
6
8y 6m
MIT

Ruby speech recognition with Pocketsphinx

252
37
5y 64d
MIT

Dialog Agents, Assistants, and Chatbots

A straightforward Ruby-based Twitter Bot Framework, using OAuth to authenticate.

489
110
1y 4m
MIT

ChatOps for Ruby.

1.66K
165
1y 55d
MIT

Linguistic Resources

A pure Ruby interface to the WordNet database

85
22
3y 36d
n/a

A Ruby interface to the WordNet® Lexical Database.

133
28
4y 11m
n/a

Machine Learning Libraries

Ruby language bindings for LIBSVM

277
32
2y 22d
n/a

Machine Learning & Data Mining with JRuby

65
8
1y 8m
MIT

ID3-based implementation of the ML Decision Tree algorithm

1.32K
132
3y 11m
n/a

A Ruby interface to the Timbl machine-learning library

5
2
12y 11m
MIT

A general classifier module to allow Bayesian and other types of classifications. A fork of cardmagic/classifier.

513
107
2y 10d
LGPL-2.1

A Ruby wrapper for Latent Dirichlet Allocation (LDA).

132
34
1y 12m
LGPL-2.1

This is the Ruby interface to LIBLINEAR (much more efficient than LIBSVM for text classification and other large linear classifications)

82
11
10y 48d
n/a

A redis-backed Bayesian classifier

37
12
6y 9m
MIT

A JRuby maximum entropy classifier for string data, based on the OpenNLP Maxent framework

9
5
13y 85d
n/a

Simple Naive Bayes classifier

46
7
10y 8m
MIT

A robust, full-featured Ruby implementation of Naive Bayes

147
30
1y 48d
MIT

A generalized rack framework for text classifications.

10
2
1y 8m
MIT

Naive Bayes text classification implementation as an OmniCat classifier strategy. (#ruby #naivebayes)

31
3
1y 8m
MIT

Ruby library for interfacing with FANN (Fast Artificial Neural Network)

437
37
9m
MIT

Library for NLP with Ruby

1
1
6y 55d
MIT

Optical Character Recognition

A Ruby wrapper library to the tesseract-ocr API.

601
76
5y 87d
n/a

Text Extraction

Read text and metadata from files and documents (.doc, .docx, .pages, .odt, .rtf, .pdf)

466
105
1y 4m
MIT

Full Text Search, Information Retrieval, Indexing

A Ruby client for Apache Solr

417
147
11m
n/a

Solr-powered search for Ruby objects

2.94K
926
1y 23d
MIT

Sphinx plugin for ActiveRecord/Rails

1.6K
475
7m
MIT

Ruby integrations for Elasticsearch

1.85K
568
7m
Apache-2.0

Elasticsearch integrations for ActiveModel/Record and Ruby on Rails

2.88K
738
1y 17d
Apache-2.0

REST client for Google APIs

2.46K
794
10m
Apache-2.0

Language Aware String Manipulation

Find a needle (a document or record) in a haystack using string similarity and (optionally) regular expression rules. Uses Dice's Coefficient (aka Pair Similarity) and Levenshtein Distance internally.

637
49
1y 4m
MIT

Fuzzy string matching library for Ruby

265
40
2y 7m
Apache-2.0

Ruby on Rails

50.14K
20.12K
7m
MIT

Fuzzy document finding in Ruby

20
6
4y 11m
n/a

Unicode normalization library. (Mirror of Yoshida-san's code base to maintain the RubyGem.)

81
7
1y 8m
n/a

Find many kinds of common information in a string. A CommonRegex port for Ruby.

74
4
6y 6m
n/a

Generate strings that match a given regular expression

501
27
1y 6m
MIT

Make difficult regular expressions easy! Ruby port of the awesome VerbalExpressions repo - https://github.com/jehna/VerbalExpressions

574
23
1y 25d
MIT

Hebrew - English Transliteration Engine

5
2
10m
MIT

Ruby bindings to re2, an "efficient, principled regular expression library".

73
9
1y 6m
BSD-3-Clause

Regex to sample value. Ruby gem. RegexSample.generate(/(a|b)/) #=> a or b

1
1
5y 58d
MIT

Russian transliteration using nalgeon/iuliia schemas

8
2
1y 6m
MIT

Articles, Posts, Talks, and Presentations

Projects and Code Examples

Distance Measurements are Awesome!

60
6
6y 20d
n/a

Named entity recognition with Stanford NER and Ruby

17
3
10y 4m
n/a

Books

Community

Needs your Help!

Ferret: the extensible information retrieval library for Ruby.

275
49
4y 9m
MIT

A Ruby C wrapper for Open Text Summarizer

205
17
10y 5m
n/a

Related Resources

A list of Neural MT implementations

366
70
2y 11m
Apache-2.0

A collection of awesome Ruby libraries, tools, frameworks and software

11.79K
1.65K
11m
n/a

A collection of links to Ruby Natural Language Processing (NLP) libraries, tools and software

1.22K
100
1y 4m
n/a

A curated list of speech and natural language processing resources

2.04K
284
3y 5m
n/a

Official gem repository: Ruby kernel for Jupyter/IPython Notebook

678
11
10m
MIT

Links to awesome OCR projects

1.93K
292
1y 5m
n/a

TensorFlow - A curated list of dedicated resources http://tensorflow.org

16.48K
3.02K
1y 17d
CC0-1.0