1-liner Tutorial Notebooks

The following tables give an overview of the tutorial notebooks and the 1-liners each of them uses.
The tables are grouped by category.
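
As a rough illustration of what a 1-liner is: each reference in the tables below is a string that is handed to the NLU loader, and the result is then asked for predictions. The sketch assumes the `nlu` Python package and its `nlu.load(...).predict(...)` calls. The helper `run_one_liner`, its `loader` parameter and the `EchoPipeline` stand-in are hypothetical and exist only so the example can be tried without downloading a model.

```python
def run_one_liner(reference, texts, loader=None):
    """Load the component(s) named by `reference` and run them on `texts`.

    `loader` is a hypothetical hook for illustration; when it is omitted,
    the function assumes the `nlu` package is installed and uses `nlu.load`.
    """
    if loader is None:
        import nlu  # assumption: NLU and its Spark dependencies are installed
        loader = nlu.load
    return loader(reference).predict(texts)


class EchoPipeline:
    """Stand-in pipeline (not part of NLU) that only records its inputs."""

    def __init__(self, reference):
        self.reference = reference

    def predict(self, texts):
        return [(self.reference, text) for text in texts]


# Dry run with the stand-in; drop `loader=` to call the real library instead.
result = run_one_liner("sentiment", ["I love this notebook"], loader=EchoPipeline)
print(result)
```

With the real library the return value would be whatever `predict` produces for the loaded model, not the tuples shown by the stand-in.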

Embeddings Tutorials Overview

Tutorial Description	1-liners used	Dataset and Paper References
Albert Word Embeddings	`albert`, `sentiment pos albert emotion`	Albert-Paper, Albert on Github, Albert on TensorFlow, T-SNE, T-SNE-Albert, Albert_Embedding
Bert Word Embeddings	`bert`, `pos sentiment emotion bert`	Bert-Paper, Bert Github, T-SNE, T-SNE-Bert, Bert_Embedding
BIOBERT Word Embeddings	`biobert`, `sentiment pos biobert emotion`	BioBert-Paper, Bert Github, BERT: Deep Bidirectional Transformers, T-SNE, T-SNE-Biobert, Biobert_Embedding
COVIDBERT Word Embeddings	`covidbert`, `sentiment covidbert pos`	CovidBert-Paper, Bert Github, T-SNE, T-SNE-CovidBert, Covidbert_Embedding
ELECTRA Word Embeddings	`electra`, `sentiment pos en.embed.electra emotion`	Electra-Paper, T-SNE, T-SNE-Electra, Electra_Embedding
ELMO Word Embeddings	`elmo`, `sentiment pos elmo emotion`	ELMO-Paper, Elmo-TensorFlow, T-SNE, T-SNE-Elmo, Elmo-Embedding
GLOVE Word Embeddings	`glove`, `sentiment pos glove emotion`	Glove-Paper, T-SNE, T-SNE-Glove, Glove_Embedding
XLNET Word Embeddings	`xlnet`, `sentiment pos xlnet emotion`	XLNet-Paper, Bert Github, T-SNE, T-SNE-XLNet, Xlnet_Embedding
Multiple Word-Embeddings and Part of Speech in 1 Line of code	`bert electra elmo glove xlnet albert pos`	Bert-Paper, Albert-Paper, ELMO-Paper, Electra-Paper, XLNet-Paper, Glove-Paper
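
The last row of the table combines several components by writing their references into one space-separated string. A minimal sketch of building such a string from individual references follows; the helper name `combine_references` is hypothetical and not part of NLU.

```python
def combine_references(*references):
    """Join individual 1-liner references into one space-separated string.

    Surrounding whitespace is stripped and empty entries are skipped.
    """
    cleaned = [ref.strip() for ref in references if ref.strip()]
    return " ".join(cleaned)


combined = combine_references(
    "bert", "electra", "elmo", "glove", "xlnet", "albert", "pos"
)
print(combined)  # bert electra elmo glove xlnet albert pos
```

The resulting string matches the 1-liner listed in the last row and would be passed to the loader as a single argument.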

Text Preprocessing and Cleaning

Tutorial Description	1-liners used	Open In Colab	Dataset and Paper References
Normalizing	`norm`		-
Detect sentences	`sentence_detector.deep`, `sentence_detector.pragmatic`, `xx.sentence_detector`		Sentence Detector
Spellchecking	n.a.	n.a.	-
Stemming	`en.stem`, `de.stem`		-
Stopwords removal	`stopwords`		Stopwords
Tokenization	`tokenize`		-
Normalization of Documents	`norm_document`		-

Sequence to Sequence

Tutorial Description	1-liners used	Dataset and Paper References
Open and Closed book question answering with Google’s T5	`en.t5`, `answer_question`	T5-Paper, T5-Model
Overview of every task available with T5	`en.t5.base`	T5-Paper, T5-Model
Translate between more than 200 Languages in 1 line of code with Marian Models	`tr.translate_to.fr`, `en.translate_to.fr`, `fr.translate_to.he`, `en.translate_to.de`	Marian-Papers, Translation-Pipeline (En to Fr), Translation-Pipeline (En to Ger)
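
The translation references in the table follow a visible pattern: source language code, the literal `translate_to`, then the target language code. The sketch below builds and parses strings of that shape. Both function names are hypothetical, and a well-formed string does not guarantee that a model for that language pair exists.

```python
def translation_reference(source_lang, target_lang):
    """Build a reference such as 'en.translate_to.fr' from two language codes."""
    return f"{source_lang.strip().lower()}.translate_to.{target_lang.strip().lower()}"


def parse_translation_reference(reference):
    """Split '<src>.translate_to.<tgt>' into (src, tgt); raise ValueError otherwise."""
    parts = reference.split(".")
    if len(parts) != 3 or parts[1] != "translate_to" or not parts[0] or not parts[2]:
        raise ValueError(f"not a translation reference: {reference!r}")
    return parts[0], parts[2]


ref = translation_reference("en", "fr")
print(ref)                                           # en.translate_to.fr
print(parse_translation_reference("tr.translate_to.fr"))  # ('tr', 'fr')
```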

Sentence Embeddings

Tutorial Description	1-liners used	Dataset and Paper References
BERT Sentence Embeddings	`embed_sentence.bert`, `pos sentiment embed_sentence.bert`	Bert-Paper, Bert Github, Bert-Sentence_Embedding
ELECTRA Sentence Embeddings	`embed_sentence.electra`, `pos sentiment embed_sentence.electra`	Electra Paper, Sentence-Electra-Embedding
USE Sentence Embeddings	`use`, `pos sentiment use emotion`	Universal Sentence Encoder, USE-TensorFlow, Sentence-USE-Embedding
Sentence similarity using BERT embeddings	`embed_sentence.bert`, `use en.embed_sentence.electra embed_sentence.bert`	Bert-Paper, Bert Github, Bert-Sentence_Embedding
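
Sentence similarity from embeddings is commonly measured with cosine similarity. The table does not say which measure the notebook uses, so the following is only a sketch under that assumption, written in plain Python with made-up toy vectors in place of real sentence embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for sentence embeddings (illustrative values only).
same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 3.0])
print(same_direction, orthogonal)
```

Values near 1 indicate vectors pointing in nearly the same direction, while values near 0 indicate unrelated directions.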

Part of Speech

Tutorial Description	1-liners used	Open In Colab	Dataset and Paper References
Part of Speech tagging	`pos`		Part of Speech

Named Entity Recognition (NER)

Tutorial Description	1-liners used	Dataset and Paper References
NER Aspect Airline ATIS	`en.ner.aspect.airline`	NER Airline Model, Atis intent Dataset
NLU-NER_CONLL_2003_5class_example	`ner`	NER-Piple
Named-entity recognition with Deep Learning ONTO NOTES	`ner.onto`	NER_Onto
Aspect based NER-Sentiment-Restaurants	`en.ner.aspect_sentiment`	-

Multilingual Tasks

Tutorial Description	1-liners used	Dataset and Paper References
Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Chinese	`zh.segment_words`, `zh.pos`, `zh.ner`, `zh.translate_to.en`	Translation-Pipeline (Zh to En)
Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Japanese	`ja.segment_words`, `ja.pos`, `ja.ner`, `ja.translate_to.en`	Translation-Pipeline (Ja to En)
Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Korean	`ko.segment_words`, `ko.pos`, `ko.ner.kmou.glove_840B_300d`, `ko.translate_to.en`	-

Matchers

Tutorial Description	1-liners used	Open In Colab	Dataset and Paper References
Date Matching	`match.datetime`		-

Dependency Parsing

Tutorial Description	1-liners used	Open In Colab	Dataset and Paper References
Typed Dependency Parsing	`dep`		Dependency Parsing
Untyped Dependency Parsing	`dep.untyped`		-

Classifiers

Tutorial Description	1-liners used	Dataset and Paper References
E2E Classification	`e2e`	e2e-Model
Language Classification	`lang`	-
Cyberbullying Classification	`classify.cyberbullying`	Cyberbullying-Classifier
Sentiment Classification for Twitter	`emotion`	Emotion detection
Fake News Classification	`en.classify.fakenews`	Fakenews-Classifier
Intent Classification	`en.classify.intent.airline`	Airline-Intention classifier, Atis-Dataset
Question classification based on the TREC dataset	`en.classify.questions`	Question-Classifier
Sarcasm Classification	`en.classify.sarcasm`	Sarcasm-Classifier
Sentiment Classification for Twitter	`en.sentiment.twitter`	Sentiment_Twitter-Classifier
Sentiment Classification for Movies	`en.sentiment.imdb`	Sentiment_imdb-Classifier
Spam Classification	`en.classify.spam`	Spam-Classifier
Toxic text classification	`en.classify.toxic`	Toxic-Classifier
Unsupervised keyword extraction using the YAKE algorithm	`yake`	-
Notebook for Classification of Banking Queries	`en.classify.distilbert_sequence.banking77`	DistilBERT Sequence Classification - Banking77
Notebook for Classification of Intent in Texts	`en.ner.snips`	Identify intent in general text - SNIPS dataset
Notebook for classification of Similar Questions	`en.classify.questionpair`	Question Pair Classifier
Notebook for Classification of Questions vs Statements	`en.classify.question_vs_statement`	Bert for Sequence Classification (Question vs Statement)
Notebook for Classification of News into 4 classes	`en.classify.distilbert_sequence.ag_news`	DistilBERT Sequence Classification Base - AG News (distilbert_base_sequence_classifier_ag_news)

Chunkers

Tutorial Description	1-liners used	Open In Colab	Dataset and Paper References
Grammatical Chunk Matching	`match.chunks`		-
Getting n-Grams	`ngram`		-

Healthcare

Tutorial Description	1-liners used	Dataset and Paper References
Assertion	`en.med_ner.clinical en.assert`, `en.med_ner.clinical.biobert en.assert.biobert`, …	Healthcare-NER, NER_Clinical-Classifier, Toxic-Classifier
De-Identification Model overview	`med_ner.jsl.wip.clinical en.de_identify`, `med_ner.jsl.wip.clinical en.de_identify.clinical`, …	NER-Clinical
Drug Normalization	`norm_drugs`	-
Entity Resolution	`med_ner.jsl.wip.clinical en.resolve_chunk.cpt_clinical`, `med_ner.jsl.wip.clinical en.resolve.icd10cm`, …	NER-Clinical, Entity-Resolver clinical
Medical Named Entity Recognition	`en.med_ner.ade.clinical`, `en.med_ner.ade.clinical_bert`, `en.med_ner.anatomy`, `en.med_ner.anatomy.biobert`, …	-
Relation Extraction	`en.med_ner.jsl.wip.clinical.greedy en.relation`, `en.med_ner.jsl.wip.clinical.greedy en.relation.bodypart.problem`, …	-

Visualization

Tutorial Description	1-liners used	Open In Colab	Dataset and Paper References
Visualization of NLP-Models with Spark-NLP and NLU	`ner`, `dep.typed`, `med_ner.jsl.wip.clinical resolve_chunk.rxnorm.in`, `med_ner.jsl.wip.clinical resolve.icd10cm`		NER-Piple, Dependency Parsing, NER-Clinical, Entity-Resolver (Chunks) clinical

Example Notebooks on Kaggle: Examining Real-Life Problems

Tutorial Description	1-liners used	Dataset and Paper References
NLU Covid-19 Emotion Showcase	`emotion`	Emotion detection
NLU Covid-19 Sentiment Showcase	`sentiment`	Sentiment classification
NLU Airline Emotion Demo	`emotion`	Emotion detection
NLU Airline Sentiment Demo	`sentiment`	Sentiment classification

Release Notebooks

Tutorial Description	1-liners used	Open In Colab	Dataset and Paper References
Bengali NER Hindi Embeddings for 30 Models	`bn.ner`, `bn.lemma`, `ja.lemma`, `am.lemma`, `bh.lemma`, `en.ner.onto.bert.small_l2_128`, …		Bengali-NER, Bengali-Lemmatizer, Japanese-Lemmatizer, Amharic-Lemmatizer
Entity Resolution	`med_ner.jsl.wip.clinical en.resolve.umls`, `med_ner.jsl.wip.clinical en.resolve.loinc`, `med_ner.jsl.wip.clinical en.resolve.loinc.biobert`		-

Crash-Course

Tutorial Description	1-liners used	Open In Colab	Dataset and Paper References
NLU 20 Minutes Crashcourse - the fast Data Science route	`spell`, `sentiment`, `pos`, `ner`, `yake`, `en.t5`, `emotion`, `answer_question`, `en.t5.base`, …		T5-Model, Part of Speech, NER-Piple, Emotion detection, Spellchecker, Sentiment classification

Natural Language Processing (NLP)

Tutorial Description	1-liners used	Dataset and Paper References
Chapter 0: Intro: 1-liners	`sentiment`, `pos`, `ner`, `bert`, `elmo`, `embed_sentence.bert`	Part of Speech, NER-Piple, Sentiment classification, Elmo-Embedding, Bert-Sentence_Embedding
Chapter 1: NLU base-features with some classifiers on testdata	`emotion`, `yake`, `stem`	Emotion detection
Chapter 2: Translation between 300+ languages with Marian	`tr.translate_to.en`, `en.translate_to.fr`, `en.translate_to.he`	Translation-Pipeline (En to Fr), Translation (En to He)
Chapter 3: Answer questions and summarize Texts with T5	`answer_question`, `en.t5`, `en.t5.base`	T5-Model
Chapter 4: Overview of T5-Tasks	`en.t5.base`	T5-Model

NLU-Crashcourse Graph AI

Tutorial Description	1-liners used	Open In Colab	Dataset and Paper References
Graph NLU 20 Minutes Crashcourse - State of the Art Text Mining for Graphs	`spell`, `sentiment`, `pos`, `ner`, `yake`, `emotion`, `med_ner.jsl.wip.clinical`, …		Part of Speech, NER-Piple, Emotion detection, Spellchecker, Sentiment classification

Healthcare-Training

Tutorial Description	1-liners used	Open In Colab	Dataset and Paper References
Healthcare	`med_ner.human_phenotype.gene_biobert`, `med_ner.ade_biobert`, `med_ner.anatomy`, `med_ner.bacterial_species`, …		-

Multilingual-Training

Tutorial Description	1-liners used	Open In Colab	Dataset and Paper References
Part 0: Intro: 1-liners	`spell`, `sentiment`, `pos`, `ner`, `bert`, `elmo`, `embed_sentence.bert`		Bert-Paper, Bert Github, T-SNE, T-SNE-Bert, Part of Speech, NER-Piple, Spellchecker, Sentiment classification, Elmo-Embedding, Bert-Sentence_Embedding
Part 1: Quick Start, base-features with some classifiers on Testdata	`yake`, `stem`, `ner`, `emotion`		NER-Piple, Emotion detection
Part 2: Translate between 200+ Languages in 1 line of code with Marian-Models	`en.translate_to.de`, `en.translate_to.fr`, `en.translate_to.he`		Translation-Pipeline (En to Fr), Translation-Pipeline (En to Ger), Translation (En to He)
Part 3: More Multilingual NLP-translations for Asian Languages with Marian	`en.translate_to.hi`, `en.translate_to.ru`, `en.translate_to.zh`		Translation (En to Hi), Translation (En to Ru), Translation (En to Zh)
Part 4: Unsupervised Chinese Keyword Extraction, NER and Translation from Chinese news	`zh.translate_to.en`, `zh.segment_words`, `yake`, `zh.lemma`, `zh.ner`		Translation-Pipeline (Zh to En), Zh-Lemmatizer
Part 5: Multilingual sentiment classifier training for 100+ languages	`train.sentiment`, `xx.embed_sentence.labse train.sentiment`	n.a.	Sentence_Embedding.Labse
Part 6: Question-answering and Text-summarization with the T5 model	`answer_question`, `en.t5`, `en.t5.base`		T5-Paper
Part 7: Overview of all tasks available with T5	`en.t5.base`		T5-Paper
Part 8: Overview of some of the Multilingual models with State Of the Art accuracy (1-liner)	`bn.lemma`, `ja.lemma`, `am.lemma`, `bh.lemma`, `zh.segment_words`, …		Bengali-Lemmatizer, Japanese-Lemmatizer, Amharic-Lemmatizer

Multilingual-Examples

Tutorial Description	1-liners used	Open In Colab	Dataset and Paper References
Overview of some Multilingual models available with State Of the Art accuracy (1-liner)	`bn.ner.cc_300d`, `ja.ner`, `zh.ner`, `th.ner.lst20.glove_840B_300D`, `ar.ner`		Bengali-NER
NLU 20 Minutes Crashcourse - the fast Data Science route
