The library is released under the Apache 2.0 license, is written in Scala with no dependencies on other NLP or ML libraries, and is designed to natively extend the Spark ML Pipeline API.
John Snow Labs is the company leading and sponsoring the development of the Spark NLP library. The company provides commercial support, indemnification, and consulting for it. This gives the library long-term financial backing, a funded, active development team, and a growing stream of real-world projects that drives robustness and roadmap prioritization.
Alex Thomas from Indeed and David Talby from Pacific AI, for the initial design and code of this library.
Saif Addin Ellafi from John Snow Labs for building the first client-ready version of the library.
Eduardo Munoz, Navneet Behl, Danilo Burbano and Anju Aggarwal from John Snow Labs, for expanding the production-grade codebase and functionality, and for managing the project.
Aleksei Alekseev and Alberto Andreotti from Pacific AI, for contributing machine learning annotators to the library.
Joseph Bradley and Xiangrui Meng from Databricks, for guidance on the Spark ML API extension guidelines.
Claudiu Branzan from G2 Web Services, for design contributions and review.
Ben Lorica from O’Reilly, for driving us to move this from idea to reality.
Emmanuel Asimadi, for testing, reporting issues, and building in-progress documentation.
Maziyar Panahi, for providing valuable performance feedback and reporting issues.
Vincenzo Gaudenzi from DXC.technology, for contributing Italian datasets and models.