The nlp.load() function

The nlp.load() method takes one or more NLP pipeline, model, or component references, separated by whitespace.
See the Model Namespace for an overview of all possible references.
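For example, a single call can load several components at once. The following is a minimal sketch; pos and sentiment are short component references (see the alias examples below), and the exact prediction columns depend on which components are loaded:

from johnsnowlabs import nlp

# Two whitespace-separated references load two components into one pipeline
pipe = nlp.load('pos sentiment')
pipe.predict("John Snow Labs makes it easy to combine components.")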

The nlp module resolves any query passed to the load method against the following reference format:
language.component_type.dataset.embeddings, e.g. en.sentiment.twitter.use

Many parts of the query can be omitted; the nlp module will then fill in the best possible defaults, such as the embeddings for a chosen dataset.
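For illustration, here is a minimal sketch that loads the fully qualified reference from above next to a shortened one; which dataset and embeddings the defaults resolve to can vary between library versions:

from johnsnowlabs import nlp

# Fully qualified reference: language.component_type.dataset.embeddings
full_pipe = nlp.load('en.sentiment.twitter.use')

# Shortened reference: dataset and embeddings are omitted,
# so the nlp module picks defaults for them
short_pipe = nlp.load('en.sentiment')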

The NLP Namespace also provides a few aliases that make referencing a model even easier.
This makes it possible to get predictions by referencing only the component name.
Examples of aliases are nlp.load('bert') and nlp.load('sentiment').

It is also possible to omit the language prefix and start the query with component_type.dataset.embeddings; in that case the nlp module automatically sets the language to English.
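As a short sketch of this default, the two calls below should resolve to the same English pipeline:

from johnsnowlabs import nlp

# No language prefix: English is assumed ...
pipe_default = nlp.load('sentiment.twitter')

# ... which matches spelling the language out explicitly
pipe_explicit = nlp.load('en.sentiment.twitter')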

The nlp.load() method returns an NLU pipeline object that provides predictions:

from johnsnowlabs import nlp 

pipeline = nlp.load('sentiment')
pipeline.predict("I love this Documentation! It's so good!")

This is equivalent to:

from johnsnowlabs import nlp
nlp.load('sentiment').predict("I love this Documentation! It's so good!")

Load Parameters

In addition to the model reference, the load method accepts several optional parameters. Setting nlp.load(nlp_reference, verbose=True) will generate log output that can be helpful for troubleshooting.
If you encounter any errors, please run in verbose mode and post your output on our GitHub Issues page.

Parameter name       Description
request              NLP reference of the model to load
path                 Path to a locally stored Spark NLP model or pipeline
gpu                  Whether to load GPU jars. Set to True to enable.
m1_chip              Whether to load M1 jars. Set to True to enable.
streamlit_caching    Whether to use caching for the nlp.display() functions. Set to True to enable.
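A minimal sketch of passing these parameters follows; the parameter names are taken from the table above, the local path is a hypothetical placeholder, and you should check your installed version for the exact signature:

from johnsnowlabs import nlp

# Load with verbose logging and GPU jars enabled
pipe = nlp.load('sentiment', verbose=True, gpu=True)

# Load a locally stored Spark NLP model or pipeline (hypothetical path)
local_pipe = nlp.load(path='/path/to/stored/spark_nlp_pipeline')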

Configuring loaded models

To configure your model or pipeline, first load an NLP component and use the print_components() function.
The printed output tells you which NLP component sits at which index of the pipe_components attribute.
A model can then be configured via setter methods named after its parameters, as shown in the sketch after the printed output below.

# Print the configurable parameters for every component in the pipeline
pipe = nlp.load('en.sentiment.twitter')
pipe.generate_class_metadata_table()

# Configure the first component (the document assembler) via its setter
document_assembler_model = pipe.components[0].model
document_assembler_model.setCleanupMode('inplace')

The generate_class_metadata_table() call will print:

-------------------------------------At pipe.pipe_components[0].model  : document_assembler with configurable parameters: --------------------------------------
Param Name [ cleanupMode ] :  Param Info : possible values: disabled, inplace, inplace_full, shrink, shrink_full, each, each_full, delete_full  currently Configured as :  disabled
--------------------------------------------At pipe.pipe_components[1].model  : glove with configurable parameters: --------------------------------------------
Param Name [ dimension ] :  Param Info : Number of embedding dimensions  currently Configured as :  512
----------------------------------------At pipe.pipe_components[2].model  : sentiment_dl  with configurable parameters: ----------------------------------------
Param Name [ threshold ] :  Param Info : The minimum threshold for the final result otherwise it will be neutral  currently Configured as :  0.6
Param Name [ thresholdLabel ] :  Param Info : In case the score is less than threshold, what should be the label. Default is neutral.  currently Configured as :  neutral
Param Name [ classes ] :  Param Info : get the tags used to trained this NerDLModel  currently Configured as :  ['positive', 'negative']
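Building on the printed output above, here is a minimal sketch that reconfigures the sentiment_dl component at index 2; the setter names are an assumption based on the convention of naming setters after parameters, so verify them against your installed version:

# Adjust the sentiment_dl model shown at index 2 of the printed output.
# setThreshold / setThresholdLabel are assumed to follow the
# "setter named after the parameter" convention described above.
sentiment_model = pipe.components[2].model
sentiment_model.setThreshold(0.7)
sentiment_model.setThresholdLabel('neutral')

# Predict with the reconfigured pipeline
pipe.predict("The new threshold changes which results fall back to neutral.")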

Namespace

The NLP namespace describes the collection of all models, pipelines, and components available in NLP and supported by the nlp.load() method.
You can view it on the Namespace page.
