Powerful One-Liners
Over a thousand NLP models in hundreds of languages are at your fingertips with just one line of code
Elegant Python
Directly read and write pandas dataframes for frictionless integration with other libraries and existing ML pipelines
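As a minimal sketch of what the one-liner and the pandas integration might look like, assuming the library described here is John Snow Labs' `nlu` package: the helper name `annotate` and the default model name `"sentiment"` are illustrative choices, not something this page specifies.

```python
def annotate(data, model_name="sentiment"):
    """Run a pre-trained pipeline over `data` and return the library's output.

    `data` may be a string, a list of strings, or a pandas DataFrame.
    `annotate` and the default model name are hypothetical choices for this sketch.
    """
    # Imported lazily so this module still loads where nlu (and the Spark/Java
    # stack it depends on) is not installed.
    import nlu

    # The "one line": load a pipeline by name and run prediction on the input.
    return nlu.load(model_name).predict(data)
```

Under those assumptions, `annotate("I love this library")` should return a pandas DataFrame with one row per input text. When passing a DataFrame in, the text is expected to live in a column named `text`; check the installed version's documentation if that differs.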
100% Open Source
Including pre-trained models & pipelines
Benchmark
Spark NLP 2.5.x achieved the best results in peer-reviewed academic benchmarks
Training NER
- State-of-the-art Deep Learning algorithms
- Achieve high accuracy with one line of code
- 350+ NLP models
- 176+ unique NLP models and algorithms
- 68+ unique NLP pipelines consisting of different NLU components
- 50+ languages supported
- 14+ embeddings: BERT, ELMo, ALBERT, XLNet, GloVe, USE, ELECTRA
- 50+ pre-trained classifiers: Emotion, Sarcasm, Language, Question, E2E, Toxic
- 36+ pre-trained NER (Named Entity Recognition) models
- 34+ pre-trained POS (Part of Speech) models
- 3+ pre-trained Lemmatizer models
- Dependency parsing (untyped and typed)
- Spell Checking
- Multi-lingual NER models in Dutch, English, French, German, Italian, Norwegian, Polish, Portuguese, Russian, Spanish
System | Year | Language | Accuracy |
---|---|---|---|
Spark NLP v2.4 | 2020 | Python/Scala/Java/R | 93.3 (test F1) - 95.9 (dev F1) |
Spark NLP v2.x | 2019 | Python/Scala/Java/R | 93 |
Spark NLP v1.x | 2018 | Python/Scala/Java/R | 92 |
spaCy v2.x | 2017 | Python/Cython | 92.6 |
spaCy v1.x | 2015 | Python/Cython | 91.8 |
ClearNLP | 2015 | Java | 91.7 |
CoreNLP | 2015 | Java | 89.6 |
MATE | 2015 | Java | 92.5 |
Turbo | 2015 | C++ | 92.4 |
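
The first row of the table reports F1 scores. As a reminder of how that metric is derived, here is a small sketch of span-level F1 as the harmonic mean of precision and recall. It illustrates the metric only and is not the evaluation script behind these benchmarks; the function name `f1_score` and the example spans are invented for illustration.

```python
def f1_score(gold, predicted):
    """Span-level F1 for two sets of (start, end, label) tuples."""
    # A prediction counts as correct only if span and label both match exactly.
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # correct / everything predicted
    recall = true_positives / len(gold)          # correct / everything in gold
    return 2 * precision * recall / (precision + recall)


# Invented example: three gold entities, three predictions, two of them correct.
gold = {(0, 2, "PER"), (5, 6, "LOC"), (9, 10, "ORG")}
predicted = {(0, 2, "PER"), (5, 6, "LOC"), (12, 13, "ORG")}
score = f1_score(gold, predicted)
print(f"F1 = {score:.3f}")
```

With two of three predictions correct and two of three gold entities found, precision and recall are both 2/3, so F1 is also 2/3.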