Examples

 

Usage examples of NLU.load()

The following examples demonstrate how to use nlu’s load api accompanied by the outputs generated by it. It enables loading any model or pipeline in one line
You need to pass one NLU reference to the load method.
You can also pass multiple whitespace separated references.
You can find all NLU references here

Named Entity Recognition (NER) 18 class

NER ONTO example

Predicts the following 18 NER classes from the ONTO dataset :

Type Description
PERSON People, including fictional like Harry Potter
NORP Nationalities or religious or political groups like the Germans
FAC Buildings, airports, highways, bridges, etc. like New York Airport
ORG Companies, agencies, institutions, etc. like Microsoft
GPE Countries, cities, states. like Germany
LOC Non-GPE locations, mountain ranges, bodies of water. Like the Sahara desert
PRODUCT Objects, vehicles, foods, etc. (Not services.) like playstation
EVENT Named hurricanes, battles, wars, sports events, etc. like hurricane Katrina
WORK_OF_ART Titles of books, songs, etc. Like Mona Lisa
LAW Named documents made into laws. Like : Declaration of Independence
LANGUAGE Any named language. Like Turkish
DATE Absolute or relative dates or periods. Like every second friday
TIME Times smaller than a day. Like every minute
PERCENT Percentage, including ”%“. Like 55% of workers enjoy their work
MONEY Monetary values, including unit. Like 50$ for those pants
QUANTITY Measurements, as of weight or distance. Like this person weights 50kg
ORDINAL “first”, “second”, etc. Like David placed first in the tournament
CARDINAL Numerals that do not fall under another type. Like hundreds of models are avaiable in NLU
nlu.load('ner').predict('Angela Merkel from Germany and the American Donald Trump dont share many opinions')
embeddings ner_tag entities
[[-0.563759982585907, 0.26958999037742615, 0.3… PER Angela Merkel
[[-0.563759982585907, 0.26958999037742615, 0.3… LOC Germany
[[-0.563759982585907, 0.26958999037742615, 0.3… MISC American
[[-0.563759982585907, 0.26958999037742615, 0.3… PER Donald Trump

Named Entity Recognition (NER) 5 Class

NER CONLL example

Predicts the following NER classes from the CONLL dataset :

Tag Description
B-PER A person like Jim or Joe
B-ORG An organisation like Microsoft or PETA
B-LOC A location like Germany
B-MISC Anything else like Playstation
O Everything that is not an entity.
nlu.load('ner.conll').predict('Angela Merkel from Germany and the American Donald Trump dont share many opinions')
embeddings ner_tag entities
[[-0.563759982585907, 0.26958999037742615, 0.3… PER Angela Merkel
[[-0.563759982585907, 0.26958999037742615, 0.3… LOC Germany
[[-0.563759982585907, 0.26958999037742615, 0.3… MISC American
[[-0.563759982585907, 0.26958999037742615, 0.3… PER Donald Trump

Part of speech (POS)

POS Classifies each token with one of the following tags

Part of Speech example

Tag Description Example
CC Coordinating conjunction This batch of mushroom stew is savory and delicious
CD Cardinal number Here are five coins
DT Determiner The bunny went home
EX Existential there There is a storm coming
FW Foreign word I’m having a déjà vu
IN Preposition or subordinating conjunction He is cleverer than I am
JJ Adjective She wore a beautiful dress
JJR Adjective, comparative My house is bigger than yours
JJS Adjective, superlative I am the shortest person in my family
LS List item marker A number of things need to be considered before starting a business , such as premises , finance , product demand , staffing and access to customers
MD Modal You must stop when the traffic lights turn red
NN Noun, singular or mass The dog likes to run
NNS Noun, plural The cars are fast
NNP Proper noun, singular I ordered the chair from Amazon
NNPS Proper noun, plural We visted the Kennedys
PDT Predeterminer Both the children had a toy
POS Possessive ending I built the dog’s house
PRP Personal pronoun You need to stop
PRP$ Possessive pronoun Remember not to judge a book by its cover
RB Adverb The dog barks loudly
RBR Adverb, comparative Could you sing more quietly please?
RBS Adverb, superlative Everyone in the race ran fast, but John ran the fastest of all
RP Particle He ate up all his dinner
SYM Symbol What are you doing ?
TO to Please send it back to me
UH Interjection Wow! You look gorgeous
VB Verb, base form We play soccer
VBD Verb, past tense I worked at a restaurant
VBG Verb, gerund or present participle Smoking kills people
VBN Verb, past participle She has done her homework
VBP Verb, non-3rd person singular present You flit from place to place
VBZ Verb, 3rd person singular present He never calls me
WDT Wh-determiner The store honored the complaints, which were less than 25 days old
WP Wh-pronoun Who can help me?
WP$ Possessive wh-pronoun Whose fault is it?
WRB Wh-adverb Where are you going?
nlu.load('pos').predict('Part of speech assigns each token in a sentence a grammatical label')
token pos
Part NN
of IN
speech NN
assigns NNS
each DT
token NN
in IN
a DT
sentence NN
a DT
grammatical JJ
label NN

Emotion Classifier

Emotion Classifier example
Classifies text as one of 4 categories (joy, fear, surprise, sadness)

nlu.load('emotion').predict('I love NLU!')
sentence_embeddings emotion_confidence sentence emotion
[0.027570432052016258, -0.052647676318883896, …] 0.976017 I love NLU! joy

Sentiment Classifier

Sentiment Classifier Example

Classifies binary sentiment for every sentence, either positive or negative.

nlu.load('sentiment').predict("I hate this guy Sami")
sentiment_confidence sentence sentiment checked
0.5778 I hate this guy Sami negative [I, hate, this, guy, Sami]

Question Classifier 50 class

50 Class Questions Classifier example

Classifies between 50 different types of questions trained on the Trec50 dataset When setting predict(meta=True) nlu will output the probabilities for all other 49 question classes. The classes are the following :

Abbreviation question classes:

Class Definition
abb abbreviation
exp expression abbreviated

Entities question classes:

Class Definition
animal animals
body organs of body
color colors
creative inventions, books and other creative pieces
currency currency names
dis .med. diseases and medicine
event events
food food
instrument musical instrument
lang languages
letter letters like a-z
other other entities
plant plants
product products
religion religions
sport sports
substance elements and substances
symbol symbols and signs
technique techniques and methods
term equivalent terms
vehicle vehicles
word words with a special property

Description and abstract concepts question classes:

Class Definition
definition definition of sth.
description description of sth.
manner manner of an action
reason reasons

Human being question classes:

Class Definition
group a group or organization of persons
ind an individual
title title of a person
description description of a person

Location question classes:

Class Definition
city cities
country countries
mountain mountains
other other locations
state states

Numeric question classes:

Class Definition
code postcodes or other codes
count number of sth.
date dates
distance linear measures
money prices
order ranks
other other numbers
period the lasting time of sth.
percent fractions
speed speed
temp temperature
size size, area and volume
weight weight
nlu.load('en.classify.trec50').predict('How expensive is the Watch?')
sentence_embeddings question_confidence sentence question
[0.051809534430503845, 0.03128402680158615, -0…] 0.919436 How expensive is the watch? NUM_count

Fake News Classifier

Fake News Classifier example

nlu.load('en.classify.fakenews').predict('Unicorns have been sighted on Mars!')
sentence_embeddings fake_confidence sentence fake
[-0.01756167598068714, 0.015006818808615208, -…] 1.000000 Unicorns have been sighted on Mars! FAKE

Cyberbullying Classifier

Cyberbullying Classifier example

Classifies sexism and racism

nlu.load('en.classify.cyberbullying').predict('Women belong in the kitchen.') # sorry we really don't mean it
sentence_embeddings cyberbullying_confidence sentence cyberbullying
[-0.054944973438978195, -0.022223370149731636,…] 0.999998 Women belong in the kitchen. sexism

Spam Classifier

Spam Classifier example

nlu.load('en.classify.spam').predict('Please sign up for this FREE membership it costs $$NO MONEY$$ just your mobile number!')
sentence_embeddings spam_confidence sentence spam
[0.008322705514729023, 0.009957313537597656, 0…] 1.000000 Please sign up for this FREE membership it cos… spam

Sarcasm Classifier

Sarcasm Classifier example

nlu.load('en.classify.sarcasm').predict('gotta love the teachers who give exams on the day after halloween')
sentence_embeddings sarcasm_confidence sentence sarcasm
[-0.03146284446120262, 0.04071342945098877, 0….] 0.999985 gotta love the teachers who give exams on the… sarcasm

IMDB Movie Sentiment Classifier

Movie Review Sentiment Classifier example

nlu.load('en.sentiment.imdb').predict('The Matrix was a pretty good movie')
document sentence_embeddings sentiment_negative sentiment_negative sentiment_positive sentiment
The Matrix was a pretty good movie [[0.04629608988761902, -0.020867452025413513, … ] [2.7235753918830596e-07] [2.7235753918830596e-07] [0.9999997615814209] [positive]

Twitter Sentiment Classifier

Twitter Sentiment Classifier Example

nlu.load('en.sentiment.twitter').predict('@elonmusk Tesla stock price is too high imo')
document sentence_embeddings sentiment_negative sentiment_negative sentiment_positive sentiment
@elonmusk Tesla stock price is too high imo [[0.08604438602924347, 0.04703635722398758, -0…] [1.0] [1.0] [1.692714735043349e-36] [negative]

Language Classifier

Languages Classifier example
Classifies the following 20 languages :
Bulgarian, Czech, German, Greek, English, Spanish, Finnish, French, Croatian, Hungarian, Italy, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Swedish, Turkish, and Ukrainian

nlu.load('lang').predict(['NLU is an open-source text processing library for advanced natural language processing for the Python.','NLU est une bibliothèque de traitement de texte open source pour le traitement avancé du langage naturel pour les langages de programmation Python.'])
language_confidence document language
0.985407 NLU is an open-source text processing library …] en
0.999822 NLU est une bibliothèque de traitement de text…] fr

E2E Classifier

E2E Classifier example

This is a multi class classifier trained on the E2E dataset for Natural language generation

nlu.load('e2e').predict('E2E is a dataset for training generative models')
sentence_embeddings e2e e2e_confidence sentence
[0.021445205435156822, -0.039284929633140564, …,] customer rating[high] 0.703248 E2E is a dataset for training generative models
None name[The Waterman] 0.703248 None
None eatType[restaurant] 0.703248 None
None priceRange[£20-25] 0.703248 None
None familyFriendly[no] 0.703248 None
None familyFriendly[yes] 0.703248 None

Toxic Classifier

Toxic Text Classifier example

nlu.load('en.classify.toxic').predict('You are to stupid')
toxic_confidence toxic sentence_embeddings document
0.978273 [toxic,insult] [[-0.03398505970835686, 0.0007853527786210179,…,] You are to stupid  

YAKE Unsupervised Keyword Extractor

YAKE Keyword Extraction Example

nlu.load('yake').predict("NLU is a Python Library for beginners and experts in NLP")
keywords_score_confidence keywords sentence
0.454232 [nlu, nlp, python library] NLU is a Python Library for beginners and expe…

Word Embeddings Bert

BERT Word Embeddings example

nlu.load('bert').predict('NLU offers the latest embeddings in one line ')
token bert_embeddings
NLU [0.3253086805343628, -0.574441134929657, -0.08…]
offers [-0.6660361886024475, -0.1494743824005127, -0…]
the [-0.6587662696838379, 0.3323703110218048, 0.16…]
latest [0.7552685737609863, 0.17207926511764526, 1.35…]
embeddings [-0.09838500618934631, -1.1448147296905518, -1…]
in [-0.4635896384716034, 0.38369956612586975, 0.0…]
one [0.26821616291999817, 0.7025910019874573, 0.15…]
line [-0.31930840015411377, -0.48271292448043823, 0…]

Word Embeddings Biobert

BIOBERT Word Embeddings example
Bert model pretrained on Bio dataset

nlu.load('biobert').predict('Biobert was pretrained on a medical dataset')
token biobert_embeddings
NLU [0.3253086805343628, -0.574441134929657, -0.08…]
offers [-0.6660361886024475, -0.1494743824005127, -0…]
the [-0.6587662696838379, 0.3323703110218048, 0.16…]
latest [0.7552685737609863, 0.17207926511764526, 1.35…]
embeddings [-0.09838500618934631, -1.1448147296905518, -1…]
in [-0.4635896384716034, 0.38369956612586975, 0.0…]
one [0.26821616291999817, 0.7025910019874573, 0.15…]
line [-0.31930840015411377, -0.48271292448043823, 0…]

Word Embeddings Covidbert

COVIDBERT Word Embeddings
Bert model pretrained on COVID dataset

nlu.load('covidbert').predict('Albert uses a collection of many berts to generate embeddings')
token covid_embeddings
He [-1.0551927089691162, -1.534174919128418, 1.29…,]
was [-0.14796507358551025, -1.3928604125976562, 0….,]
suprised [1.0647121667861938, -0.3664901852607727, 0.54…,]
by [-0.15271103382110596, -0.6812090277671814, -0…,]
the [-0.45744237303733826, -1.4266574382781982, -0…,]
diversity [-0.05339818447828293, -0.5118572115898132, 0….,]
of [-0.2971905767917633, -1.0936176776885986, -0….,]
NLU [-0.9573594331741333, -0.18001675605773926, -1…,]

Word Embeddings Albert

ALBERT Word Embeddings examle

nlu.load('albert').predict('Albert uses a collection of many berts to generate embeddings')
token albert_embeddings
Albert [-0.08257609605789185, -0.8017427325248718, 1…]
uses [0.8256351947784424, -1.5144840478897095, 0.90…]
a [-0.22089454531669617, -0.24295514822006226, 3…]
collection [-0.2136894017457962, -0.8225528597831726, -0…]
of [1.7623294591903687, -1.113651156425476, 0.800…]
many [0.6415284872055054, -0.04533941298723221, 1.9…]
berts [-0.5591965317726135, -1.1773797273635864, -0…]
to [1.0956681966781616, -1.4180747270584106, -0.2…]
generate [-0.6759272813796997, -1.3546931743621826, 1.6…]
embeddings [-0.0035803020000457764, -0.35928264260292053,…]

Electra Embeddings

ELECTRA Word Embeddings example

nlu.load('electra').predict('He was suprised by the diversity of NLU')
token electra_embeddings
He [0.29674115777015686, -0.21371933817863464, -0…,]
was [-0.4278327524662018, -0.5352768898010254, -0….,]
suprised [-0.3090559244155884, 0.8737565279006958, -1.0…,]
by [-0.07821277529001236, 0.13081523776054382, 0….,]
the [0.5462881922721863, 0.0683358758687973, -0.41…,]
diversity [0.1381239891052246, 0.2956242859363556, 0.250…,]
of [-0.5667567253112793, -0.3955455720424652, -0….,]
NLU [0.5597224831581116, -0.703249454498291, -1.08…,]

Word Embeddings Elmo

ELMO Word Embeddings example

nlu.load('elmo').predict('Elmo was trained on Left to right masked to learn its embeddings')
token elmo_embeddings
Elmo [0.6083735227584839, 0.20089012384414673, 0.42…]
was [0.2980785369873047, -0.07382500916719437, -0…]
trained [-0.39923471212387085, 0.17155063152313232, 0…]
on [0.04337821900844574, 0.1392083466053009, -0.4…]
Left [0.4468783736228943, -0.623046875, 0.771505534…]
to [-0.18209676444530487, 0.03812692314386368, 0…]
right [0.23305709660053253, -0.6459438800811768, 0.5…]
masked [-0.7243442535400391, 0.10247116535902023, 0.1…]
to [-0.18209676444530487, 0.03812692314386368, 0…]
learn [1.2942464351654053, 0.7376189231872559, -0.58…]
its [0.055951207876205444, 0.19218483567237854, -0…]
embeddings [-1.31377112865448, 0.7727609872817993, 0.6748…]

Word Embeddings Xlnet

XLNET Word Embeddings example

nlu.load('xlnet').predict('XLNET computes contextualized word representations using combination of Autoregressive Language Model and Permutation Language Model')
token xlnet_embeddings
XLNET [-0.02719488926231861, -1.7693557739257812, -0…]
computes [-1.8262947797775269, 0.8455266356468201, 0.57…]
contextualized [2.8446314334869385, -0.3564329445362091, -2.1…]
word [-0.6143839359283447, -1.7368144989013672, -0…]
representations [-0.30445945262908936, -1.2129613161087036, 0…]
using [0.07423821836709976, -0.02561005763709545, -0…]
combination [-0.5387097597122192, -1.1827564239501953, 0.5…]
of [-1.403516411781311, 0.3108177185058594, -0.32…]
Autoregressive [-1.0869172811508179, 0.7135171890258789, -0.2…]
Language [-0.33215752243995667, -1.4108021259307861, -0…]
Model [-1.6097160577774048, -0.2548254430294037, 0.0…]
and [0.7884324789047241, -1.507911205291748, 0.677…]
Permutation [0.6049966812133789, -0.157279372215271, -0.06…]
Language [-0.33215752243995667, -1.4108021259307861, -0…]
Model [-1.6097160577774048, -0.2548254430294037, 0.0…]

Word Embeddings Glove

GLOVE Word Embeddings example

nlu.load('glove').predict('Glove embeddings are generated by aggregating global word-word co-occurrence matrix from a corpus')
token glove_embeddings
Glove [0.3677999973297119, 0.37073999643325806, 0.32…]
embeddings [0.732479989528656, 0.3734700083732605, 0.0188…]
are [-0.5153300166130066, 0.8318600058555603, 0.22…]
generated [-0.35510000586509705, 0.6115900278091431, 0.4…]
by [-0.20874999463558197, -0.11739999800920486, 0…]
aggregating [-0.5133699774742126, 0.04489300027489662, 0.1…]
global [0.24281999468803406, 0.6170300245285034, 0.66…]
word-word [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, …]
co-occurrence [0.16384999454021454, -0.3178800046443939, 0.1…]
matrix [-0.2663800120353699, 0.4449099898338318, 0.32…]
from [0.30730998516082764, 0.24737000465393066, 0.6…]
a [-0.2708599865436554, 0.04400600120425224, -0…]
corpus [0.39937999844551086, 0.15894000232219696, -0…]

Multiple Token Embeddings at once

Compare 6 Embeddings at once with NLU and T-SNE example

#This takes around 10GB RAM, watch out!
nlu.load('bert albert electra elmo xlnet use glove').predict('Get all of them at once! Watch your RAM tough!')
xlnet_embeddings use_embeddings elmo_embeddings electra_embeddings glove_embeddings sentence albert_embeddings biobert_embeddings bert_embeddings
[[-0.003953204490244389, -1.5821468830108643, …,] [-0.019299551844596863, -0.04762779921293259, …,] [[0.04002974182367325, -0.43536433577537537, -…,] [[0.19559216499328613, -0.46693214774131775, -…,] [[0.1443299949169159, 0.4395099878311157, 0.58…,] Get all of them at once, watch your RAM tough! [[-0.4743960201740265, -0.581386387348175, 0.7…,] [[-0.00012563914060592651, -1.372296929359436,…,] [[-0.7687976360321045, 0.8489367961883545, -0….,]

Bert Sentence Embeddings

BERT Sentence Embeddings example

sentence bert_sentence_embeddings
He was suprised by the diversity of NLU [-1.0726687908172607, 0.4481312036514282, -0.0…,]

Electra Sentence Embeddings

ELECTRA Sentence Embeddings example

nlu.load('embed_sentence.electra').predict('He was suprised by the diversity of NLU')
sentence electra_sentence_embeddings
He was suprised by the diversity of NLU [0.005376118700951338, 0.18036000430583954, -0…,]

Sentence Embeddings Use

USE Sentence Embeddings example

nlu.load('use').predict('USE is designed to encode whole sentences and documents into vectors that can be used for text classification, semantic similarity, clustering or oder NLP tasks')
sentence use_embeddings
USE is designed to encode whole sentences and …] [0.03302069380879402, -0.004255455918610096, -…]

Spell Checking

Spell checking example

nlu.load('spell').predict('I liek pentut buttr ant jely')
token checked
I I
liek like
peantut pentut
buttr buttr
and and
jelli jely

Dependency Parsing Unlabeled

Untyped Dependency Parsing example

nlu.load('dep.untyped').predict('Untyped Dependencies represent a grammatical tree structure')
token pos dependency
Untyped NNP ROOT
Dependencies NNP represent
represent VBD Untyped
a DT structure
grammatical JJ structure
tree NN structure
structure NN represent

Dependency Parsing Labeled

Typed Dependency Parsing example

nlu.load('dep').predict('Typed Dependencies represent a grammatical tree structure where every edge has a label')
token pos dependency labled_dependency
Typed NNP ROOT root
Dependencies NNP represent nsubj
represent VBD Typed parataxis
a DT structure nsubj
grammatical JJ structure amod
tree NN structure flat
structure NN represent nsubj
where WRB structure mark
every DT edge nsubj
edge NN where nsubj
has VBZ ROOT root
a DT label nsubj
label NN has nsubj

Tokenization

Tokenization example

nlu.load('tokenize').predict('Each word and symbol in a sentence will generate token.')
token
Each
word
and
symbol
will
generate
a
token
.

Stemmer

Stemmer example

nlu.load('stemm').predict('NLU can get you the stem of a word')
token stem
NLU nlu
can can
get get
you you
the the
stem stem
of of
a a
word word

Stopwords Removal

Stopwords Removal example

nlu.load('stopwords').predict('I want you to remove stopwords from this sentence please')
token cleanTokens
I remove
want stopewords
you sentence
to None
remove None
stopwords None
from None
this None
sentence None
please None

Lemmatization

Lemmatization example

nlu.load('lemma').predict('Lemmatizing generates a less noisy version of the inputted tokens')
token lemma
Lemmatizing Lemmatizing
generates generate
a a
less less
noisy noisy
version version
of of
the the
inputted input
tokens token

Normalizers

Normalizing example

nlu.load('norm').predict('@CKL_IT says that #normalizers are pretty useful to clean #structured_strings in #NLU like tweets')
normalized token
CKLIT @CKL_IT
says says
that that
normalizers #normalizers
are are
pretty pretty
useful useful
to to
clean clean
structuredstrings #structured_strings
in in
NLU #NLU
like like
tweets tweets

NGrams

NGrams example

nlu.load('ngram').predict('Wht a wondful day!')
document ngrams pos
To be or not to be [To, be, or, not, to, be, To be, be or, or not…] [TO, VB, CC, RB, TO, VB]

Date Matching

Date Matching example

nlu.load('match.datetime').predict('In the years 2000/01/01 to 2010/01/01 a lot of things happened')
document date
In the years 2000/01/01 to 2010/01/01 a lot of things happened [2000/01/01, 2001/01/01]

Entity Chunking

Checkout see here for all possible POS labels or
Splits text into rows based on matched grammatical entities.

Entity Chunking Example

# First we load the pipeline
pipe = nlu.load('match.chunks')
# Now we print the info to see at which index which com,ponent is and what parameters we can configure on them 
pipe.generate_class_metadata_table()
# Lets set our Chunker to only match NN
pipe['default_chunker'].setRegexParsers(['<NN>+', '<JJ>+'])
# Now we can predict with the configured pipeline
pipe.predict("Jim and Joe went to the big blue market next to the town hall")
# the outputs of pipe.print_info()
The following parameters are configurable for this NLU pipeline (You can copy paste the examples) :
>>> pipe['document_assembler'] has settable params:
pipe['document_assembler'].setCleanupMode('disabled')         | Info: possible values: disabled, inplace, inplace_full, shrink, shrink_full, each, each_full, delete_full | Currently set to : disabled
>>> pipe['sentence_detector'] has settable params:
pipe['sentence_detector'].setCustomBounds([])                 | Info: characters used to explicitly mark sentence bounds | Currently set to : []
pipe['sentence_detector'].setDetectLists(True)                | Info: whether detect lists during sentence detection | Currently set to : True
pipe['sentence_detector'].setExplodeSentences(False)          | Info: whether to explode each sentence into a different row, for better parallelization. Defaults to false. | Currently set to : False
pipe['sentence_detector'].setMaxLength(99999)                 | Info: Set the maximum allowed length for each sentence | Currently set to : 99999
pipe['sentence_detector'].setMinLength(0)                     | Info: Set the minimum allowed length for each sentence. | Currently set to : 0
pipe['sentence_detector'].setUseAbbreviations(True)           | Info: whether to apply abbreviations at sentence detection | Currently set to : True
pipe['sentence_detector'].setUseCustomBoundsOnly(False)       | Info: Only utilize custom bounds in sentence detection | Currently set to : False
>>> pipe['regex_matcher'] has settable params:
pipe['regex_matcher'].setCaseSensitiveExceptions(True)        | Info: Whether to care for case sensitiveness in exceptions | Currently set to : True
pipe['regex_matcher'].setTargetPattern('\S+')                 | Info: pattern to grab from text as token candidates. Defaults \S+ | Currently set to : \S+
pipe['regex_matcher'].setMaxLength(99999)                     | Info: Set the maximum allowed length for each token | Currently set to : 99999
pipe['regex_matcher'].setMinLength(0)                         | Info: Set the minimum allowed length for each token | Currently set to : 0
>>> pipe['sentiment_dl'] has settable params:
>>> pipe['default_chunker'] has settable params:
pipe['default_chunker'].setRegexParsers(['<DT>?<JJ>*<NN>+'])  | Info: an array of grammar based chunk parsers | Currently set to : ['<DT>?<JJ>*<NN>+']```
chunk pos
market [NNP, CC, NNP, VBD, TO, DT, JJ, JJ, NN, JJ, TO…
town hall [NNP, CC, NNP, VBD, TO, DT, JJ, JJ, NN, JJ, TO…
big blue [NNP, CC, NNP, VBD, TO, DT, JJ, JJ, NN, JJ, TO…
next [NNP, CC, NNP, VBD, TO, DT, JJ, JJ, NN, JJ, TO…

Sentence Detection

Sentence Detection example

nlu.load('sentence_detector').predict('NLU can detect things. Like beginning and endings of sentences. It can also do much more!', output_level ='sentence')  
sentence word_embeddings pos ner
NLU can detect things. [[0.4970400035381317, -0.013454999774694443, 0…] [NNP, MD, VB, NNS, ., IN, VBG, CC, NNS, IN, NN… ] [O, O, O, O, O, B-sent, O, O, O, O, O, O, B-se…]
Like beginning and endings of sentences. [[0.4970400035381317, -0.013454999774694443, 0…] [NNP, MD, VB, NNS, ., IN, VBG, CC, NNS, IN, NN…] [O, O, O, O, O, B-sent, O, O, O, O, O, O, B-se…]
It can also do much more! [[0.4970400035381317, -0.013454999774694443, 0…] [NNP, MD, VB, NNS, ., IN, VBG, CC, NNS, IN, NN…] [O, O, O, O, O, B-sent, O, O, O, O, O, O, B-se…]
Last updated