John Snow Labs

Powerful One-Liners

Over a thousand NLP models in hundreds of languages are at your fingertips with just one line of code

Elegant Python

Directly read and write pandas dataframes for frictionless integration with other libraries and existing ML pipelines

100% Open Source

Including pre-trained models & pipelines

Quick and Easy

NLU is available on PyPI and Conda

  
# Install NLU from PyPI
pip install nlu

# Install NLU from Anaconda/Conda
conda install -c johnsnowlabs nlu
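
After installing, one quick way to confirm that the package is visible to the current interpreter is to look it up without importing it. This is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name and not part of NLU.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the import system can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


# 'nlu' is the import name used by the examples on this page.
print("nlu installed:", is_installed("nlu"))
```

If this prints `False`, the install most likely went into a different Python environment than the one running the check.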

Benchmark

NLU is based on the award-winning Spark NLP, which achieves the best performance in peer-reviewed results

Training NER

State-of-the-art Deep Learning algorithms
Achieve high accuracy with one line of code
350+ NLP models
176+ unique NLP models and algorithms
68+ unique NLP pipelines consisting of different NLU components
50+ languages supported
14+ embeddings: BERT, ELMO, ALBERT, XLNET, GLOVE, USE, ELECTRA
50+ pre-trained classifiers: Emotion, Sarcasm, Language, Question, E2E, Toxic
36+ pre-trained NER (Named Entity Recognition) models
34+ pre-trained POS (Part of Speech) models
3+ pre-trained Lemmatizer models
Dependency parsing untyped and typed
Spell Checking
Multi-lingual NER models in Dutch, English, French, German, Italian, Norwegian, Polish, Portuguese, Russian, Spanish

System	Year	Language	Accuracy
Spark NLP v2.4	2020	Python/Scala/Java/R	93.3 (test F1) - 95.9 (dev F1)
Spark NLP v2.x	2019	Python/Scala/Java/R	93
Spark NLP v1.x	2018	Python/Scala/Java/R	92
spaCy v2.x	2017	Python/Cython	92.6
spaCy v1.x	2015	Python/Cython	91.8
ClearNLP	2015	Java	91.7
CoreNLP	2015	Java	89.6
MATE	2015	Java	92.5
Turbo	2015	C++	92.4

Right Out of The Box

NLU ships with many NLP features, pre-trained models and pipelines

It takes pandas DataFrames as input and returns pandas DataFrames

All in one line

Named Entity Recognition (NER) 18 class
Named Entity Recognition (NER) 5 Class
Part of speech (POS)
Emotion Classifier
Sentiment Classifier
Question Classifier 50 class
Fake News Classifier
Cyberbullying Classifier
Spam Classifier
Sarcasm Classifier
IMDB Movie Sentiment Classifier
Twitter Sentiment Classifier
Language Classifier
E2E Classifier
Toxic Classifier
Word Embeddings Bert
Word Embeddings Biobert
Word Embeddings Covidbert
Word Embeddings Albert
Electra Embeddings
Word Embeddings Elmo
Word Embeddings Xlnet
Word Embeddings Glove
Multiple Token Embeddings at once
Bert Sentence Embeddings
Electra Sentence Embeddings
Sentence Embeddings Use
Spell Checking
Dependency Parsing Unlabeled
Dependency Parsing Labeled
Tokenization
Stemmer
Stopwords Removal
Lemmatization
Normalizers
NGrams
Date Matching
Entity Chunking
Sentence Detection

Named Entity Recognition (NER) 18 class

NER ONTO example

nlu.load('ner').predict('Angela Merkel from Germany and the American Donald Trump dont share many opinions')

embeddings	ner_tag	entities
[[-0.563759982585907, 0.26958999037742615, 0.3…	PER	Angela Merkel
[[-0.563759982585907, 0.26958999037742615, 0.3…	GPE	Germany
[[-0.563759982585907, 0.26958999037742615, 0.3…	NORP	American
[[-0.563759982585907, 0.26958999037742615, 0.3…	PER	Donald Trump

Named Entity Recognition (NER) 5 Class

NER CONLL example

nlu.load('ner.conll').predict('Angela Merkel from Germany and the American Donald Trump dont share many opinions')

embeddings	ner_tag	entities
[[-0.563759982585907, 0.26958999037742615, 0.3…	PER	Angela Merkel
[[-0.563759982585907, 0.26958999037742615, 0.3…	LOC	Germany
[[-0.563759982585907, 0.26958999037742615, 0.3…	MISC	American
[[-0.563759982585907, 0.26958999037742615, 0.3…	PER	Donald Trump
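
The `entities` column groups consecutive tokens that belong to the same entity. NER models commonly emit one tag per token in the IOB scheme (`B-` opens an entity, `I-` continues it, `O` is outside any entity). The sketch below shows how such tags could be merged into chunks. The token and tag lists are illustrative assumptions, and `merge_iob` is a hypothetical helper, not an NLU function.

```python
def merge_iob(tokens, tags):
    """Merge per-token IOB tags into (entity_text, label) pairs."""
    entities = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_label
        )
        if starts_new:
            # A new entity begins: flush the previous one, if any.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-"):
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # 'O' tag: close whatever entity was open.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
    if current_tokens:
        entities.append((" ".join(current_tokens), current_label))
    return entities


# Assumed token-level tags for the example sentence (CoNLL-style labels).
tokens = ["Angela", "Merkel", "from", "Germany", "and", "the", "American", "Donald", "Trump"]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O", "O", "B-MISC", "B-PER", "I-PER"]
print(merge_iob(tokens, tags))
```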

Part of speech (POS)

POS classifies each token with a part-of-speech tag
Part of Speech example

nlu.load('pos').predict('Part of speech assigns each token in a sentence a grammatical label')

token	pos
Part	NN
of	IN
speech	NN
assigns	NNS
each	DT
token	NN
in	IN
a	DT
sentence	NN
a	DT
grammatical	JJ
label	NN

Emotion Classifier

Emotion Classifier example
Classifies text as one of 4 categories (joy, fear, surprise, sadness)

nlu.load('emotion').predict('I love NLU!')

sentence_embeddings	emotion_confidence	sentence	emotion
[0.027570432052016258, -0.052647676318883896, …]	0.976017	I love NLU!	joy

Sentiment Classifier

Sentiment Classifier Example
Classifies binary sentiment for every sentence, either positive or negative.

nlu.load('sentiment').predict("I hate this guy Sami")

sentiment_confidence	sentence	sentiment	checked
0.5778	I hate this guy Sami	negative	[I, hate, this, guy, Sami]

Question Classifier 50 class

50 Class Questions Classifier example
Classify between 50 different types of questions trained on Trec50
With predict(meta=True), NLU also outputs the probabilities for the other 49 question classes.

nlu.load('en.classify.trec50').predict('How expensive is the Watch?')

sentence_embeddings	question_confidence	sentence	question
[0.051809534430503845, 0.03128402680158615, -0…]	0.919436	How expensive is the watch?	NUM_count

Fake News Classifier

Fake News Classifier example

nlu.load('en.classify.fakenews').predict('Unicorns have been sighted on Mars!')

sentence_embeddings	fake_confidence	sentence	fake
[-0.01756167598068714, 0.015006818808615208, -…]	1.000000	Unicorns have been sighted on Mars!	FAKE

Cyberbullying Classifier

Cyberbullying Classifier example
Classifies sexism and racism

nlu.load('en.classify.cyberbullying').predict('Women belong in the kitchen.') # sorry we really don't mean it

sentence_embeddings	cyberbullying_confidence	sentence	cyberbullying
[-0.054944973438978195, -0.022223370149731636,…]	0.999998	Women belong in the kitchen.	sexism

Spam Classifier

Spam Classifier example

nlu.load('en.classify.spam').predict('Please sign up for this FREE membership it costs $$NO MONEY$$ just your mobile number!')

sentence_embeddings	spam_confidence	sentence	spam
[0.008322705514729023, 0.009957313537597656, 0…]	1.000000	Please sign up for this FREE membership it cos…	spam

Sarcasm Classifier

Sarcasm Classifier example

nlu.load('en.classify.sarcasm').predict('gotta love the teachers who give exams on the day after halloween')

sentence_embeddings	sarcasm_confidence	sentence	sarcasm
[-0.03146284446120262, 0.04071342945098877, 0….]	0.999985	gotta love the teachers who give exams on the…	sarcasm

IMDB Movie Sentiment Classifier

Movie Review Sentiment Classifier example

nlu.load('en.sentiment.imdb').predict('The Matrix was a pretty good movie')

document	sentence_embeddings	sentiment_negative	sentiment_negative	sentiment_positive	sentiment
The Matrix was a pretty good movie	[[0.04629608988761902, -0.020867452025413513, … ]	[2.7235753918830596e-07]	[2.7235753918830596e-07]	[0.9999997615814209]	[positive]

Twitter Sentiment Classifier

Twitter Sentiment Classifier Example

nlu.load('en.sentiment.twitter').predict('@elonmusk Tesla stock price is too high imo')

document	sentence_embeddings	sentiment_negative	sentiment_negative	sentiment_positive	sentiment
@elonmusk Tesla stock price is too high imo	[[0.08604438602924347, 0.04703635722398758, -0…]	[1.0]	[1.0]	[1.692714735043349e-36]	[negative]

Language Classifier

Languages Classifier example
Classifies the following 20 languages:
Bulgarian, Czech, German, Greek, English, Spanish, Finnish, French, Croatian, Hungarian, Italian, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Swedish, Turkish, and Ukrainian

nlu.load('lang').predict(['NLU is an open-source text processing library for advanced natural language processing for the Python.','NLU est une bibliothèque de traitement de texte open source pour le traitement avancé du langage naturel pour les langages de programmation Python.'])

language_confidence	document	language
0.985407	NLU is an open-source text processing library …]	en
0.999822	NLU est une bibliothèque de traitement de text…]	fr

E2E Classifier

E2E Classifier example
This is a multi-class classifier trained on the E2E dataset for natural language generation

nlu.load('e2e').predict('E2E is a dataset for training generative models')

sentence_embeddings	e2e	e2e_confidence	sentence
[0.021445205435156822, -0.039284929633140564, …,]	customer rating[high]	0.703248	E2E is a dataset for training generative models
None	name[The Waterman]	0.703248	None
None	eatType[restaurant]	0.703248	None
None	priceRange[£20-25]	0.703248	None
None	familyFriendly[no]	0.703248	None
None	familyFriendly[yes]	0.703248	None

Toxic Classifier

Toxic Text Classifier example

nlu.load('en.classify.toxic').predict('You are to stupid')

toxic_confidence	toxic	sentence_embeddings	document
0.978273	[toxic,insult]	[[-0.03398505970835686, 0.0007853527786210179,…,]	You are to stupid

Word Embeddings Bert

BERT Word Embeddings example

nlu.load('bert').predict('NLU offers the latest embeddings in one line ')

token	bert_embeddings
NLU	[0.3253086805343628, -0.574441134929657, -0.08…]
offers	[-0.6660361886024475, -0.1494743824005127, -0…]
the	[-0.6587662696838379, 0.3323703110218048, 0.16…]
latest	[0.7552685737609863, 0.17207926511764526, 1.35…]
embeddings	[-0.09838500618934631, -1.1448147296905518, -1…]
in	[-0.4635896384716034, 0.38369956612586975, 0.0…]
one	[0.26821616291999817, 0.7025910019874573, 0.15…]
line	[-0.31930840015411377, -0.48271292448043823, 0…]

Word Embeddings Biobert

BIOBERT Word Embeddings example
Bert model pretrained on Bio dataset

nlu.load('biobert').predict('Biobert was pretrained on a medical dataset')

token	biobert_embeddings
NLU	[0.3253086805343628, -0.574441134929657, -0.08…]
offers	[-0.6660361886024475, -0.1494743824005127, -0…]
the	[-0.6587662696838379, 0.3323703110218048, 0.16…]
latest	[0.7552685737609863, 0.17207926511764526, 1.35…]
embeddings	[-0.09838500618934631, -1.1448147296905518, -1…]
in	[-0.4635896384716034, 0.38369956612586975, 0.0…]
one	[0.26821616291999817, 0.7025910019874573, 0.15…]
line	[-0.31930840015411377, -0.48271292448043823, 0…]

Word Embeddings Covidbert

COVIDBERT Word Embeddings
Bert model pretrained on COVID dataset

nlu.load('covidbert').predict('He was suprised by the diversity of NLU')

token	covid_embeddings
He	[-1.0551927089691162, -1.534174919128418, 1.29…,]
was	[-0.14796507358551025, -1.3928604125976562, 0….,]
suprised	[1.0647121667861938, -0.3664901852607727, 0.54…,]
by	[-0.15271103382110596, -0.6812090277671814, -0…,]
the	[-0.45744237303733826, -1.4266574382781982, -0…,]
diversity	[-0.05339818447828293, -0.5118572115898132, 0….,]
of	[-0.2971905767917633, -1.0936176776885986, -0….,]
NLU	[-0.9573594331741333, -0.18001675605773926, -1…,]

Word Embeddings Albert

ALBERT Word Embeddings example

nlu.load('albert').predict('Albert uses a collection of many berts to generate embeddings')

token	albert_embeddings
Albert	[-0.08257609605789185, -0.8017427325248718, 1…]
uses	[0.8256351947784424, -1.5144840478897095, 0.90…]
a	[-0.22089454531669617, -0.24295514822006226, 3…]
collection	[-0.2136894017457962, -0.8225528597831726, -0…]
of	[1.7623294591903687, -1.113651156425476, 0.800…]
many	[0.6415284872055054, -0.04533941298723221, 1.9…]
berts	[-0.5591965317726135, -1.1773797273635864, -0…]
to	[1.0956681966781616, -1.4180747270584106, -0.2…]
generate	[-0.6759272813796997, -1.3546931743621826, 1.6…]
embeddings	[-0.0035803020000457764, -0.35928264260292053,…]

Electra Embeddings

ELECTRA Word Embeddings example

nlu.load('electra').predict('He was suprised by the diversity of NLU')

token	electra_embeddings
He	[0.29674115777015686, -0.21371933817863464, -0…,]
was	[-0.4278327524662018, -0.5352768898010254, -0….,]
suprised	[-0.3090559244155884, 0.8737565279006958, -1.0…,]
by	[-0.07821277529001236, 0.13081523776054382, 0….,]
the	[0.5462881922721863, 0.0683358758687973, -0.41…,]
diversity	[0.1381239891052246, 0.2956242859363556, 0.250…,]
of	[-0.5667567253112793, -0.3955455720424652, -0….,]
NLU	[0.5597224831581116, -0.703249454498291, -1.08…,]

Word Embeddings Elmo

ELMO Word Embeddings example

nlu.load('elmo').predict('Elmo was trained on Left to right masked to learn its embeddings')

token	elmo_embeddings
Elmo	[0.6083735227584839, 0.20089012384414673, 0.42…]
was	[0.2980785369873047, -0.07382500916719437, -0…]
trained	[-0.39923471212387085, 0.17155063152313232, 0…]
on	[0.04337821900844574, 0.1392083466053009, -0.4…]
Left	[0.4468783736228943, -0.623046875, 0.771505534…]
to	[-0.18209676444530487, 0.03812692314386368, 0…]
right	[0.23305709660053253, -0.6459438800811768, 0.5…]
masked	[-0.7243442535400391, 0.10247116535902023, 0.1…]
to	[-0.18209676444530487, 0.03812692314386368, 0…]
learn	[1.2942464351654053, 0.7376189231872559, -0.58…]
its	[0.055951207876205444, 0.19218483567237854, -0…]
embeddings	[-1.31377112865448, 0.7727609872817993, 0.6748…]

Word Embeddings Xlnet

XLNET Word Embeddings example

nlu.load('xlnet').predict('XLNET computes contextualized word representations using combination of Autoregressive Language Model and Permutation Language Model')

token	xlnet_embeddings
XLNET	[-0.02719488926231861, -1.7693557739257812, -0…]
computes	[-1.8262947797775269, 0.8455266356468201, 0.57…]
contextualized	[2.8446314334869385, -0.3564329445362091, -2.1…]
word	[-0.6143839359283447, -1.7368144989013672, -0…]
representations	[-0.30445945262908936, -1.2129613161087036, 0…]
using	[0.07423821836709976, -0.02561005763709545, -0…]
combination	[-0.5387097597122192, -1.1827564239501953, 0.5…]
of	[-1.403516411781311, 0.3108177185058594, -0.32…]
Autoregressive	[-1.0869172811508179, 0.7135171890258789, -0.2…]
Language	[-0.33215752243995667, -1.4108021259307861, -0…]
Model	[-1.6097160577774048, -0.2548254430294037, 0.0…]
and	[0.7884324789047241, -1.507911205291748, 0.677…]
Permutation	[0.6049966812133789, -0.157279372215271, -0.06…]
Language	[-0.33215752243995667, -1.4108021259307861, -0…]
Model	[-1.6097160577774048, -0.2548254430294037, 0.0…]

Word Embeddings Glove

GLOVE Word Embeddings example

nlu.load('glove').predict('Glove embeddings are generated by aggregating global word-word co-occurrence matrix from a corpus')

token	glove_embeddings
Glove	[0.3677999973297119, 0.37073999643325806, 0.32…]
embeddings	[0.732479989528656, 0.3734700083732605, 0.0188…]
are	[-0.5153300166130066, 0.8318600058555603, 0.22…]
generated	[-0.35510000586509705, 0.6115900278091431, 0.4…]
by	[-0.20874999463558197, -0.11739999800920486, 0…]
aggregating	[-0.5133699774742126, 0.04489300027489662, 0.1…]
global	[0.24281999468803406, 0.6170300245285034, 0.66…]
word-word	[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, …]
co-occurrence	[0.16384999454021454, -0.3178800046443939, 0.1…]
matrix	[-0.2663800120353699, 0.4449099898338318, 0.32…]
from	[0.30730998516082764, 0.24737000465393066, 0.6…]
a	[-0.2708599865436554, 0.04400600120425224, -0…]
corpus	[0.39937999844551086, 0.15894000232219696, -0…]

Multiple Token Embeddings at once

Compare 6 Embeddings at once with NLU and T-SNE example

#This takes around 10GB RAM, watch out!
nlu.load('bert albert electra elmo xlnet use glove').predict('Get all of them at once! Watch your RAM tough!')

xlnet_embeddings	use_embeddings	elmo_embeddings	electra_embeddings	glove_embeddings	sentence	albert_embeddings	biobert_embeddings	bert_embeddings
[[-0.003953204490244389, -1.5821468830108643, …,]	[-0.019299551844596863, -0.04762779921293259, …,]	[[0.04002974182367325, -0.43536433577537537, -…,]	[[0.19559216499328613, -0.46693214774131775, -…,]	[[0.1443299949169159, 0.4395099878311157, 0.58…,]	Get all of them at once, watch your RAM tough!	[[-0.4743960201740265, -0.581386387348175, 0.7…,]	[[-0.00012563914060592651, -1.372296929359436,…,]	[[-0.7687976360321045, 0.8489367961883545, -0….,]

Bert Sentence Embeddings

BERT Sentence Embeddings example

nlu.load('embed_sentence.bert').predict('He was suprised by the diversity of NLU')

sentence	bert_sentence_embeddings
He was suprised by the diversity of NLU	[-1.0726687908172607, 0.4481312036514282, -0.0…,]

Electra Sentence Embeddings

ELECTRA Sentence Embeddings example

nlu.load('embed_sentence.electra').predict('He was suprised by the diversity of NLU')

sentence	electra_sentence_embeddings
He was suprised by the diversity of NLU	[0.005376118700951338, 0.18036000430583954, -0…,]

Sentence Embeddings Use

USE Sentence Embeddings example

nlu.load('use').predict('USE is designed to encode whole sentences and documents into vectors that can be used for text classification, semantic similarity, clustering or other NLP tasks')

sentence	use_embeddings
USE is designed to encode whole sentences and …]	[0.03302069380879402, -0.004255455918610096, -…]
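
Sentence vectors like the ones above are typically compared with cosine similarity when used for semantic similarity or clustering. The following is a minimal sketch in plain Python; the two short vectors are made-up toy values, not real model output, and `cosine_similarity` is a hypothetical helper name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors standing in for two sentence embeddings.
vec_a = [0.2, 0.1, -0.4]
vec_b = [0.25, 0.05, -0.35]
print(round(cosine_similarity(vec_a, vec_b), 4))
```

Values close to 1 indicate vectors pointing in nearly the same direction, values near 0 indicate unrelated directions.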

Spell Checking

Spell checking example

nlu.load('spell').predict('I liek pentut buttr ant jely')

token	checked
I	I
liek	like
pentut	peantut
buttr	buttr
ant	and
jely	jelli

Dependency Parsing Unlabeled

Untyped Dependency Parsing example

nlu.load('dep.untyped').predict('Untyped Dependencies represent a grammatical tree structure')

token	pos	dependency
Untyped	NNP	ROOT
Dependencies	NNP	represent
represent	VBD	Untyped
a	DT	structure
grammatical	JJ	structure
tree	NN	structure
structure	NN	represent

Dependency Parsing Labeled

Typed Dependency Parsing example

nlu.load('dep').predict('Typed Dependencies represent a grammatical tree structure where every edge has a label')

token	pos	dependency	labled_dependency
Typed	NNP	ROOT	root
Dependencies	NNP	represent	nsubj
represent	VBD	Typed	parataxis
a	DT	structure	nsubj
grammatical	JJ	structure	amod
tree	NN	structure	flat
structure	NN	represent	nsubj
where	WRB	structure	mark
every	DT	edge	nsubj
edge	NN	where	nsubj
has	VBZ	ROOT	root
a	DT	label	nsubj
label	NN	has	nsubj

Tokenization

Tokenization example

nlu.load('tokenize').predict('Each word and symbol will generate a token.')

token
Each
word
and
symbol
will
generate
a
token
.
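
To make the idea concrete, a rule of the form "every run of word characters is a token, and every other non-space symbol is its own token" can be sketched with a single regular expression. This is an illustrative approximation, not the tokenizer NLU uses; `tokenize` and `TOKEN_PATTERN` are hypothetical names.

```python
import re

# A word is a run of word characters; any other non-space character is its own token.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Split text into word and symbol tokens."""
    return TOKEN_PATTERN.findall(text)


print(tokenize("Each word and symbol will generate a token."))
```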

Stemmer

Stemmer example

nlu.load('stemm').predict('NLU can get you the stem of a word')

token	stem
NLU	nlu
can	can
get	get
you	you
the	the
stem	stem
of	of
a	a
word	word

Stopwords Removal

Stopwords Removal example

nlu.load('stopwords').predict('I want you to remove stopwords from this sentence please')

token	cleanTokens
I	remove
want	stopwords
you	sentence
to	None
remove	None
stopwords	None
from	None
this	None
sentence	None
please	None
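
Conceptually, stopword removal is a filter against a word list. The sketch below uses a tiny hand-picked stopword set chosen only to reproduce the example; it is an assumption for illustration and not the list shipped with NLU. `remove_stopwords` is a hypothetical helper.

```python
# Hypothetical, deliberately small stopword set for this one sentence.
STOPWORDS = {"i", "want", "you", "to", "from", "this", "please"}


def remove_stopwords(tokens):
    """Keep only tokens whose lowercase form is not in the stopword set."""
    return [token for token in tokens if token.lower() not in STOPWORDS]


tokens = "I want you to remove stopwords from this sentence please".split()
print(remove_stopwords(tokens))
```

Because the cleaned list is shorter than the token list, a table that shows both side by side pads the shorter column with `None`.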

Lemmatization

Lemmatization example

nlu.load('lemma').predict('Lemmatizing generates a less noisy version of the inputted tokens')

token	lemma
Lemmatizing	Lemmatizing
generates	generate
a	a
less	less
noisy	noisy
version	version
of	of
the	the
inputted	input
tokens	token

Normalizers

Normalizing example

nlu.load('norm').predict('@CKL_IT says that #normalizers are pretty useful to clean #structured_strings in #NLU like tweets')

normalized	token
CKLIT	@CKL_IT
says	says
that	that
normalizers	#normalizers
are	are
pretty	pretty
useful	useful
to	to
clean	clean
structuredstrings	#structured_strings
in	in
NLU	#NLU
like	like
tweets	tweets
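
The effect shown in the table can be approximated by deleting every character that is not a letter. This is a rough sketch under that assumption; the actual cleanup pattern used by the normalizer may differ, and `normalize` and `NON_LETTERS` are hypothetical names.

```python
import re

# Assumed cleanup rule: drop everything that is not an ASCII letter.
NON_LETTERS = re.compile(r"[^A-Za-z]")


def normalize(token):
    """Strip non-letter characters such as '@', '#' and '_' from a token."""
    return NON_LETTERS.sub("", token)


sample_tokens = ["@CKL_IT", "#normalizers", "#structured_strings", "#NLU", "tweets"]
print([normalize(token) for token in sample_tokens])
```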

NGrams

NGrams example

nlu.load('ngram').predict('To be or not to be')

document	ngrams	pos
To be or not to be	[To, be, or, not, to, be, To be, be or, or not…]	[TO, VB, CC, RB, TO, VB]
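
The `ngrams` column lists the unigrams first and then the bigrams. A minimal sketch of how such a list can be built from tokens is shown below; the helper names are hypothetical and the maximum n of 2 is inferred from the visible part of the output.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token windows, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngrams_up_to(tokens, max_n):
    """Concatenate the 1-grams, 2-grams, ... up to max_n-grams, in that order."""
    result = []
    for n in range(1, max_n + 1):
        result.extend(ngrams(tokens, n))
    return result


tokens = "To be or not to be".split()
print(ngrams_up_to(tokens, 2))
```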

Date Matching

Date Matching example

nlu.load('match.datetime').predict('In the years 2000/01/01 to 2010/01/01 a lot of things happened')

document	date
In the years 2000/01/01 to 2010/01/01 a lot of things happened	[2000/01/01, 2010/01/01]
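
For dates written as `YYYY/MM/DD`, as in this example, the matching step can be approximated with a regular expression. This sketch handles only that one format and does not validate month or day ranges; the real matcher supports more. `match_dates` and `DATE_PATTERN` are hypothetical names.

```python
import re

# Four digits, slash, two digits, slash, two digits, bounded by word boundaries.
DATE_PATTERN = re.compile(r"\b\d{4}/\d{2}/\d{2}\b")


def match_dates(text):
    """Return every YYYY/MM/DD-shaped substring in the text, in order."""
    return DATE_PATTERN.findall(text)


print(match_dates("In the years 2000/01/01 to 2010/01/01 a lot of things happened"))
```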

Entity Chunking

Splits text into rows based on matched grammatical entities, using patterns over POS labels.
Entity Chunking Example

# First we load the pipeline
pipe = nlu.load('match.chunks')
# Now we print the info to see which component sits at which index and which parameters we can configure on them
pipe.generate_class_metadata_table()
# Let's set our chunker to match only runs of NN or runs of JJ
pipe['default_chunker'].setRegexParsers(['<NN>+', '<JJ>+'])
# Now we can predict with the configured pipeline
pipe.predict("Jim and Joe went to the big blue market next to the town hall")

# the outputs of component_list.print_info()
The following parameters are configurable for this NLU pipeline (You can copy paste the examples) :
>>> component_list['document_assembler'] has settable params:
component_list['document_assembler'].setCleanupMode('disabled')         | Info: possible values: disabled, inplace, inplace_full, shrink, shrink_full, each, each_full, delete_full | Currently set to : disabled
>>> component_list['sentence_detector'] has settable params:
component_list['sentence_detector'].setCustomBounds([])                 | Info: characters used to explicitly mark sentence bounds | Currently set to : []
component_list['sentence_detector'].setDetectLists(True)                | Info: whether detect lists during sentence detection | Currently set to : True
component_list['sentence_detector'].setExplodeSentences(False)          | Info: whether to explode each sentence into a different row, for better parallelization. Defaults to false. | Currently set to : False
component_list['sentence_detector'].setMaxLength(99999)                 | Info: Set the maximum allowed length for each sentence | Currently set to : 99999
component_list['sentence_detector'].setMinLength(0)                     | Info: Set the minimum allowed length for each sentence. | Currently set to : 0
component_list['sentence_detector'].setUseAbbreviations(True)           | Info: whether to apply abbreviations at sentence detection | Currently set to : True
component_list['sentence_detector'].setUseCustomBoundsOnly(False)       | Info: Only utilize custom bounds in sentence detection | Currently set to : False
>>> component_list['regex_matcher'] has settable params:
component_list['regex_matcher'].setCaseSensitiveExceptions(True)        | Info: Whether to care for case sensitiveness in exceptions | Currently set to : True
component_list['regex_matcher'].setTargetPattern('\S+')                 | Info: pattern to grab from text as token candidates. Defaults \S+ | Currently set to : \S+
component_list['regex_matcher'].setMaxLength(99999)                     | Info: Set the maximum allowed length for each token | Currently set to : 99999
component_list['regex_matcher'].setMinLength(0)                         | Info: Set the minimum allowed length for each token | Currently set to : 0
>>> component_list['sentiment_dl'] has settable params:
>>> component_list['default_chunker'] has settable params:
component_list['default_chunker'].setRegexParsers(['<DT>?<JJ>*<NN>+'])  | Info: an array of grammar based chunk parsers | Currently set to : ['<DT>?<JJ>*<NN>+']

chunk	pos
market	[NNP, CC, NNP, VBD, TO, DT, JJ, JJ, NN, JJ, TO…
town hall	[NNP, CC, NNP, VBD, TO, DT, JJ, JJ, NN, JJ, TO…
big blue	[NNP, CC, NNP, VBD, TO, DT, JJ, JJ, NN, JJ, TO…
next	[NNP, CC, NNP, VBD, TO, DT, JJ, JJ, NN, JJ, TO…
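
The chunks above follow from applying each pattern to the sequence of POS tags rather than to the words. The sketch below imitates that idea in plain Python: tags are joined into one string such as `<DT><JJ><NN>`, each `<TAG>` in a pattern is wrapped so that quantifiers apply to the whole tag, and matches are mapped back to tokens. It is an illustration under these assumptions, not the library's implementation. `chunk_by_pos` is a hypothetical helper, and the last three tags are assumed because the output above is truncated.

```python
import re


def chunk_by_pos(tokens, tags, patterns):
    """Return token spans whose POS-tag sequence matches any of the patterns."""
    tag_string = "".join(f"<{tag}>" for tag in tags)

    # Map character offsets in tag_string back to token indices.
    start_to_index, end_to_index = {}, {}
    offset = 0
    for index, tag in enumerate(tags):
        start_to_index[offset] = index
        offset += len(tag) + 2  # account for the surrounding '<' and '>'
        end_to_index[offset] = index

    chunks = []
    for pattern in patterns:
        # '<NN>+' must mean "one or more NN tags", so wrap each tag in a group.
        regex = re.compile(re.sub(r"<([^<>]+)>", r"(?:<\1>)", pattern))
        for match in regex.finditer(tag_string):
            first = start_to_index[match.start()]
            last = end_to_index[match.end()]
            chunks.append(" ".join(tokens[first:last + 1]))
    return chunks


tokens = "Jim and Joe went to the big blue market next to the town hall".split()
# The final DT, NN, NN are assumed; the printed output is cut off after the second TO.
tags = ["NNP", "CC", "NNP", "VBD", "TO", "DT", "JJ", "JJ", "NN", "JJ", "TO", "DT", "NN", "NN"]
print(chunk_by_pos(tokens, tags, ["<NN>+", "<JJ>+"]))
```

Note that `<NN>` does not match `<NNP>` here, which is why the proper nouns Jim and Joe are not returned as chunks.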

Sentence Detection

Sentence Detection example

nlu.load('sentence_detector').predict('NLU can detect things. Like beginning and endings of sentences. It can also do much more!', output_level='sentence')

sentence	word_embeddings	pos	ner
NLU can detect things.	[[0.4970400035381317, -0.013454999774694443, 0…]	[NNP, MD, VB, NNS, ., IN, VBG, CC, NNS, IN, NN… ]	[O, O, O, O, O, B-sent, O, O, O, O, O, O, B-se…]
Like beginning and endings of sentences.	[[0.4970400035381317, -0.013454999774694443, 0…]	[NNP, MD, VB, NNS, ., IN, VBG, CC, NNS, IN, NN…]	[O, O, O, O, O, B-sent, O, O, O, O, O, O, B-se…]
It can also do much more!	[[0.4970400035381317, -0.013454999774694443, 0…]	[NNP, MD, VB, NNS, ., IN, VBG, CC, NNS, IN, NN…]	[O, O, O, O, O, B-sent, O, O, O, O, O, O, B-se…]
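
A naive version of sentence detection splits on whitespace that follows sentence-ending punctuation. The sketch below does only that; it ignores abbreviations, lists and custom bounds, all of which the real detector has settings for. `split_sentences` and `SENTENCE_BOUNDARY` are hypothetical names.

```python
import re

# Split at whitespace that is preceded by '.', '!' or '?'.
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Return the sentences of a text using a simple punctuation rule."""
    return [part for part in SENTENCE_BOUNDARY.split(text.strip()) if part]


text = "NLU can detect things. Like beginning and endings of sentences. It can also do much more!"
print(split_sentences(text))
```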