The following tables give an overview of the tutorials and the 1-liners each one uses.
The tables are grouped by category.
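As a rough illustration of how to read the "1-liners used" column: each comma-separated entry is one reference string, and an entry with several space-separated names (for example `sentiment pos albert emotion`) combines several components into a single pipeline. The sketch below assumes the NLU Python package is installed and that `nlu.load(...)` followed by `.predict(...)` is the entry point. `build_reference` and `run_one_liner` are hypothetical helper names used only for this example and are not part of the library.

```python
def build_reference(*components: str) -> str:
    """Join component names into one space-separated reference string."""
    cleaned = [c.strip() for c in components if c and c.strip()]
    if not cleaned:
        raise ValueError("at least one component name is required")
    return " ".join(cleaned)


def run_one_liner(reference: str, text: str):
    """Load the referenced pipeline and run it on `text`.

    Assumption: the `nlu` package and its Spark dependencies are installed.
    The import is done lazily so that build_reference works without them.
    """
    import nlu  # assumption: installed separately, e.g. via pip
    return nlu.load(reference).predict(text)


if __name__ == "__main__":
    ref = build_reference("sentiment", "pos", "albert", "emotion")
    print(ref)
    # Uncomment to run the pipeline (models are downloaded on first use):
    # print(run_one_liner(ref, "NLU makes this a single line of code."))
```

Keeping the string assembly separate from the model call means the reference can be inspected or logged before any model download is triggered.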
Embeddings Tutorials Overview
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Albert Word Embeddings | albert, sentiment pos albert emotion | | Albert-Paper, Albert on Github, Albert on TensorFlow, T-SNE, T-SNE-Albert, Albert_Embedding |
Bert Word Embeddings | bert, pos sentiment emotion bert | | Bert-Paper, Bert Github, T-SNE, T-SNE-Bert, Bert_Embedding |
BIOBERT Word Embeddings | biobert, sentiment pos biobert emotion | | BioBert-Paper, Bert Github, BERT: Deep Bidirectional Transformers, Bert Github, T-SNE, T-SNE-Biobert, Biobert_Embedding |
COVIDBERT Word Embeddings | covidbert, sentiment covidbert pos | | CovidBert-Paper, Bert Github, T-SNE, T-SNE-CovidBert, Covidbert_Embedding |
ELECTRA Word Embeddings | electra, sentiment pos en.embed.electra emotion | | Electra-Paper, T-SNE, T-SNE-Electra, Electra_Embedding |
ELMO Word Embeddings | elmo, sentiment pos elmo emotion | | ELMO-Paper, Elmo-TensorFlow, T-SNE, T-SNE-Elmo, Elmo-Embedding |
GLOVE Word Embeddings | glove, sentiment pos glove emotion | | Glove-Paper, T-SNE, T-SNE-Glove, Glove_Embedding |
XLNET Word Embeddings | xlnet, sentiment pos xlnet emotion | | XLNet-Paper, Bert Github, T-SNE, T-SNE-XLNet, Xlnet_Embedding |
Multiple Word-Embeddings and Part of Speech in 1 Line of code | bert electra elmo glove xlnet albert pos | | Bert-Paper, Albert-Paper, ELMO-Paper, Electra-Paper, XLNet-Paper, Glove-Paper |
Text Preprocessing and Cleaning
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Normalizing | norm | | - |
Detect sentences | sentence_detector.deep, sentence_detector.pragmatic, xx.sentence_detector | | Sentence Detector |
Spellchecking | n.a. | n.a. | - |
Stemming | en.stem, de.stem | | - |
Stopwords removal | stopwords | | Stopwords |
Tokenization | tokenize | | - |
Normalization of Documents | norm_document | | - |
Sequence to Sequence
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Open and Closed book question answering with Google’s T5 | en.t5, answer_question | | T5-Paper, T5-Model |
Overview of every task available with T5 | en.t5.base | | T5-Paper, T5-Model |
Translate between more than 200 Languages in 1 line of code with Marian Models | tr.translate_to.fr, en.translate_to.fr, fr.translate_to.he, en.translate_to.de | | Marian-Papers, Translation-Pipeline (En to Fr), Translation-Pipeline (En to Ger) |
Sentence Embeddings
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
BERT Sentence Embeddings | embed_sentence.bert, pos sentiment embed_sentence.bert | | Bert-Paper, Bert Github, Bert-Sentence_Embedding |
ELECTRA Sentence Embeddings | embed_sentence.electra, pos sentiment embed_sentence.electra | | Electra Paper, Sentence-Electra-Embedding |
USE Sentence Embeddings | use, pos sentiment use emotion | | Universal Sentence Encoder, USE-TensorFlow, Sentence-USE-Embedding |
Sentence similarity using BERT embeddings | embed_sentence.bert, use en.embed_sentence.electra embed_sentence.bert | | Bert-Paper, Bert Github, Bert-Sentence_Embedding |
Part of Speech
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Part of Speech tagging | pos | | Part of Speech |
Named Entity Recognition (NER)
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
NER Aspect Airline ATIS | en.ner.aspect.airline | | NER Airline Model, Atis intent Dataset |
NLU-NER_CONLL_2003_5class_example | ner | | NER-Piple |
Named-entity recognition with Deep Learning ONTO NOTES | ner.onto | | NER_Onto |
Aspect based NER-Sentiment-Restaurants | en.ner.aspect_sentiment | | - |
Multilingual Tasks
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Chinese | zh.segment_words, zh.pos, zh.ner, zh.translate_to.en | | Translation-Pipeline (Zh to En) |
Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Japanese | ja.segment_words, ja.pos, ja.ner, ja.translate_to.en | | Translation-Pipeline (Ja to En) |
Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Korean | ko.segment_words, ko.pos, ko.ner.kmou.glove_840B_300d, ko.translate_to.en | | - |
Matchers
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Date Matching | match.datetime | | - |
Dependency Parsing
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Typed Dependency Parsing | dep | | Dependency Parsing |
Untyped Dependency Parsing | dep.untyped | | - |
Classifiers
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
E2E Classification | e2e | | e2e-Model |
Language Classification | lang | | - |
Cyberbullying Classification | classify.cyberbullying | | Cyberbullying-Classifier |
Sentiment Classification for Twitter | emotion | | Emotion detection |
Fake News Classification | en.classify.fakenews | | Fakenews-Classifier |
Intent Classification | en.classify.intent.airline | | Airline-Intention classifier, Atis-Dataset |
Question classification based on the TREC dataset | en.classify.questions | | Question-Classifier |
Sarcasm Classification | en.classify.sarcasm | | Sarcasm-Classifier |
Sentiment Classification for Twitter | en.sentiment.twitter | | Sentiment_Twitter-Classifier |
Sentiment Classification for Movies | en.sentiment.imdb | | Sentiment_imdb-Classifier |
Spam Classification | en.classify.spam | | Spam-Classifier |
Toxic text classification | en.classify.toxic | | Toxic-Classifier |
Unsupervised keyword extraction using the YAKE algorithm | yake | | - |
Notebook for Classification of Banking Queries | en.classify.distilbert_sequence.banking77 | | DistilBERT Sequence Classification - Banking77 |
Notebook for Classification of Intent in Texts | en.ner.snips | | Identify intent in general text - SNIPS dataset |
Notebook for Classification of Similar Questions | en.classify.questionpair | | Question Pair Classifier |
Notebook for Classification of Questions vs Statements | en.classify.question_vs_statement | | Bert for Sequence Classification (Question vs Statement) |
Notebook for Classification of News into 4 classes | en.classify.distilbert_sequence.ag_news | | DistilBERT Sequence Classification Base - AG News (distilbert_base_sequence_classifier_ag_news) |
Chunkers
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Grammatical Chunk Matching | match.chunks | | - |
Getting n-Grams | ngram | | - |
Healthcare
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Assertion | en.med_ner.clinical en.assert, en.med_ner.clinical.biobert en.assert.biobert, … | | Healthcare-NER, NER_Clinical-Classifier, Toxic-Classifier |
De-Identification Model overview | med_ner.jsl.wip.clinical en.de_identify, med_ner.jsl.wip.clinical en.de_identify.clinical, … | | NER-Clinical |
Drug Normalization | norm_drugs | | - |
Entity Resolution | med_ner.jsl.wip.clinical en.resolve_chunk.cpt_clinical, med_ner.jsl.wip.clinical en.resolve.icd10cm, … | | NER-Clinical, Entity-Resolver clinical |
Medical Named Entity Recognition | en.med_ner.ade.clinical, en.med_ner.ade.clinical_bert, en.med_ner.anatomy, en.med_ner.anatomy.biobert, … | | - |
Relation Extraction | en.med_ner.jsl.wip.clinical.greedy en.relation, en.med_ner.jsl.wip.clinical.greedy en.relation.bodypart.problem, … | | - |
Visualization
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Visualization of NLP-Models with Spark-NLP and NLU | ner, dep.typed, med_ner.jsl.wip.clinical resolve_chunk.rxnorm.in, med_ner.jsl.wip.clinical resolve.icd10cm | | NER-Piple, Dependency Parsing, NER-Clinical, Entity-Resolver (Chunks) clinical |
Example Notebooks on Kaggle, Examination on real life Problems.
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
NLU Covid-19 Emotion Showcase | emotion | | Emotion detection |
NLU Covid-19 Sentiment Showcase | sentiment | | Sentiment classification |
NLU Airline Emotion Demo | emotion | | Emotion detection |
NLU Airline Sentiment Demo | sentiment | | Sentiment classification |
Release Notebooks
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Bengali NER Hindi Embeddings for 30 Models | bn.ner, bn.lemma, ja.lemma, am.lemma, bh.lemma, en.ner.onto.bert.small_l2_128, … | | Bengali-NER, Bengali-Lemmatizer, Japanese-Lemmatizer, Amharic-Lemmatizer |
Entity Resolution | med_ner.jsl.wip.clinical en.resolve.umls, med_ner.jsl.wip.clinical en.resolve.loinc, med_ner.jsl.wip.clinical en.resolve.loinc.biobert | | - |
Crash-Course
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
NLU 20 Minutes Crashcourse - the fast Data Science route | spell, sentiment, pos, ner, yake, en.t5, emotion, answer_question, en.t5.base, … | | T5-Model, Part of Speech, NER-Piple, Emotion detection, Spellchecker, Sentiment classification |
Natural Language Processing (NLP)
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Chapter 0: Intro: 1-liners | sentiment, pos, ner, bert, elmo, embed_sentence.bert | | Part of Speech, NER-Piple, Sentiment classification, Elmo-Embedding, Bert-Sentence_Embedding |
Chapter 1: NLU base-features with some classifiers on testdata | emotion, yake, stem | | Emotion detection |
Chapter 2: Translation between 300+ languages with Marian | tr.translate_to.en, en.translate_to.fr, en.translate_to.he | | Translation-Pipeline (En to Fr), Translation (En to He) |
Chapter 3: Answer questions and summarize Texts with T5 | answer_question, en.t5, en.t5.base | | T5-Model |
Chapter 4: Overview of T5-Tasks | en.t5.base | | T5-Model |
NLU-Crashcourse Graph AI
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Graph NLU 20 Minutes Crashcourse - State of the Art Text Mining for Graphs | spell, sentiment, pos, ner, yake, emotion, med_ner.jsl.wip.clinical, … | | Part of Speech, NER-Piple, Emotion detection, Spellchecker, Sentiment classification |
Healthcare-Training
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Healthcare | med_ner.human_phenotype.gene_biobert, med_ner.ade_biobert, med_ner.anatomy, med_ner.bacterial_species, … | | - |
Multilingual-Training
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Part 0: Intro: 1-liners | spell, sentiment, pos, ner, bert, elmo, embed_sentence.bert | | Bert-Paper, Bert Github, T-SNE, T-SNE-Bert, Part of Speech, NER-Piple, Spellchecker, Sentiment classification, Elmo-Embedding, Bert-Sentence_Embedding |
Part 1: Quick Start, base-features with some classifiers on Testdata | yake, stem, ner, emotion | | NER-Piple, Emotion detection |
Part 2: Translate between 200+ Languages in 1 line of code with Marian-Models | en.translate_to.de, en.translate_to.fr, en.translate_to.he | | Translation-Pipeline (En to Fr), Translation-Pipeline (En to Ger), Translation (En to He) |
Part 3: More Multilingual NLP-translations for Asian Languages with Marian | en.translate_to.hi, en.translate_to.ru, en.translate_to.zh | | Translation (En to Hi), Translation (En to Ru), Translation (En to Zh) |
Part 4: Unsupervised Chinese Keyword Extraction, NER and Translation from Chinese news | zh.translate_to.en, zh.segment_words, yake, zh.lemma, zh.ner | | Translation-Pipeline (Zh to En), Zh-Lemmatizer |
Part 5: Multilingual sentiment classifier training for 100+ languages | train.sentiment, xx.embed_sentence.labse train.sentiment | n.a. | Sentence_Embedding.Labse |
Part 6: Question-answering and Text-summarization with T5-Model | answer_question, en.t5, en.t5.base | | T5-Paper |
Part 7: Overview of all tasks available with T5 | en.t5.base | | T5-Paper |
Part 8: Overview of some of the Multilingual models with State Of the Art accuracy (1-liner) | bn.lemma, ja.lemma, am.lemma, bh.lemma, zh.segment_words, … | | Bengali-Lemmatizer, Japanese-Lemmatizer, Amharic-Lemmatizer |
Multilingual-Examples
Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
---|---|---|---|
Overview of some Multilingual models available with State Of the Art accuracy (1-liner) | bn.ner.cc_300d, ja.ner, zh.ner, th.ner.lst20.glove_840B_300D, ar.ner | | Bengali-NER |
NLU 20 Minutes Crashcourse - the fast Data Science route |