21 Integrations with Spark NLP

View the list of Spark NLP integrations below. Compare the best Spark NLP integrations by features, ratings, user reviews, and pricing of software that integrates with Spark NLP. Here are the current Spark NLP integrations in 2024:

  • 1
    TensorFlow
    TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state of the art in ML and developers easily build and deploy ML-powered applications. Build and train ML models easily using intuitive high-level APIs like Keras with eager execution, which makes for immediate model iteration and easy debugging (a minimal sketch follows below). Easily train and deploy models in the cloud, on-prem, in the browser, or on-device no matter what language you use. A simple and flexible architecture takes new ideas from concept to code, to state-of-the-art models, and to publication faster. Build, deploy, and experiment easily with TensorFlow.
    Starting Price: Free
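    As a rough sketch of the high-level Keras workflow described above (the toy data, layer size, and optimizer choice are illustrative, not anything prescribed by TensorFlow or Spark NLP):

```python
# Minimal Keras example: build, train, and run a small model eagerly.
import numpy as np
import tensorflow as tf

# Toy data: learn y = 2x + 1 from a handful of points.
x = np.array([[0.0], [1.0], [2.0], [3.0]], dtype=np.float32)
y = np.array([[1.0], [3.0], [5.0], [7.0]], dtype=np.float32)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse")
model.fit(x, y, epochs=200, verbose=0)

# Should print a value close to 9.0 for an input of 4.0.
print(model.predict(np.array([[4.0]], dtype=np.float32)))
```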
  • 2
    Facebook
    Facebook is the world's largest social network. We build technologies that help people connect with friends and family, find communities, and grow businesses. From fundraising to offering life-saving help in a Facebook post or signing up to donate blood, we’re inspired by the ways people show up for each other in times of need. The Facebook app helps you connect with friends, family and communities of people who share your interests. Connecting with your friends and family as well as discovering new ones is easy with features like Groups, Watch and Marketplace.
    Starting Price: Free
  • 3
    OpenAI
    OpenAI’s mission is to ensure that artificial general intelligence (AGI)—by which we mean highly autonomous systems that outperform humans at most economically valuable work—benefits all of humanity. We will attempt to directly build safe and beneficial AGI, but will also consider our mission fulfilled if our work aids others to achieve this outcome. Apply our API to any language task — semantic search, summarization, sentiment analysis, content generation, translation, and more — with only a few examples or by specifying your task in English. One simple integration gives you access to our constantly-improving AI technology. Explore how to integrate with the API with sample completions, such as the sketch below.
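    A minimal sketch of this kind of API call using the official Python client; it assumes an OPENAI_API_KEY environment variable is set, and the model name is a placeholder you would swap for whichever model you have access to:

```python
# Sketch of a sentiment-analysis call against the OpenAI API.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name; substitute any available model
    messages=[
        {"role": "system",
         "content": "Classify the sentiment of the user's text as positive, negative, or neutral."},
        {"role": "user",
         "content": "The new release is fast and easy to use."},
    ],
)
print(response.choices[0].message.content)
```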
  • 4
    Python
    The core of extensible programming is defining functions. Python allows mandatory and optional arguments, keyword arguments, and even arbitrary argument lists, as shown in the sketch below. Python is easy to pick up whether you're a first-time programmer or experienced with other languages. The community hosts conferences and meetups to collaborate on code, and much more. Python's documentation will help you along the way, and the mailing lists will keep you in touch. The Python Package Index (PyPI) hosts thousands of third-party modules for Python. Both Python's standard library and the community-contributed modules allow for endless possibilities.
    Starting Price: Free
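    The argument-handling features mentioned above, in a small self-contained example:

```python
# Mandatory, defaulted, keyword-only, and arbitrary argument lists in one function.
def describe(name, greeting="Hello", *tags, sep=", ", **extras):
    """name is mandatory, greeting has a default, *tags collects extra positionals,
    sep is keyword-only (it follows *tags), and **extras collects keyword arguments."""
    line = f"{greeting}, {name}"
    if tags:
        line += " [" + sep.join(tags) + "]"
    if extras:
        line += " " + str(extras)
    return line

print(describe("Ada"))
print(describe("Ada", "Hi", "math", "computing", sep=" / ", role="pioneer"))
```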
  • 5
    BERT
    Google
    BERT is a large language model and a method of pre-training language representations. Pre-training refers to how BERT is first trained on a large source of text, such as Wikipedia. You can then apply the training results to other Natural Language Processing (NLP) tasks, such as question answering and sentiment analysis. With BERT and AI Platform Training, you can train a variety of NLP models in about 30 minutes.
    Starting Price: Free
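    A minimal sketch of reusing pre-trained BERT representations for a downstream task. It goes through the Hugging Face Transformers library rather than AI Platform Training, purely as an illustration of applying the pre-trained model:

```python
# Load a pre-trained BERT checkpoint and extract contextual token embeddings.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Spark NLP supports BERT embeddings.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextual vector per token; pool or feed these into a downstream
# task such as question answering or sentiment analysis.
print(outputs.last_hidden_state.shape)  # (1, num_tokens, 768)
```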
  • 6
    Conda
    Package, dependency, and environment management for any language: Python, R, Ruby, Lua, Scala, Java, JavaScript, C/C++, Fortran, and more. Conda is an open-source package management and environment management system that runs on Windows, macOS, Linux, and z/OS. Conda quickly installs, runs, and updates packages and their dependencies. Conda easily creates, saves, loads, and switches between environments on your local computer. It was created for Python programs, but it can package and distribute software for any language. Conda as a package manager helps you find and install packages. If you need a package that requires a different version of Python, you do not need to switch to a different environment manager, because conda is also an environment manager. With just a few commands, you can set up a totally separate environment to run that different version of Python, while continuing to run your usual version of Python in your normal environment.
    Starting Price: Free
  • 7
    Java
    Oracle
    The Java™ Programming Language is a general-purpose, concurrent, strongly typed, class-based object-oriented language. It is normally compiled to the bytecode instruction set and binary format defined in the Java Virtual Machine Specification. In the Java programming language, all source code is first written in plain text files ending with the .java extension. Those source files are then compiled into .class files by the javac compiler. A .class file does not contain code that is native to your processor; it instead contains bytecodes — the machine language of the Java Virtual Machine (Java VM). The java launcher tool then runs your application with an instance of the Java Virtual Machine.
    Starting Price: Free
  • 8
    Scala
    Scala combines object-oriented and functional programming in one concise, high-level language. Scala's static types help avoid bugs in complex applications, and its JVM and JavaScript runtimes let you build high-performance systems with easy access to huge ecosystems of libraries. The Scala compiler is smart about static types. Most of the time, you need not tell it the types of your variables. Instead, its powerful type inference will figure them out for you. In Scala, case classes are used to represent structural data types. They implicitly equip the class with meaningful toString, equals and hashCode methods, as well as the ability to be deconstructed with pattern matching. In Scala, functions are values, and can be defined as anonymous functions with a concise syntax.
    Starting Price: Free
  • 9
    R
    The R Foundation
    R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R. R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity. One of R’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed.
    Starting Price: Free
  • 10
    Flair
    Create with Flair, the AI design tool for branded content and product photoshoots. Generate high-quality marketing assets in seconds. Generate entire photoshoots in less than a minute. Generate content in your brand's signature style. Users can choose from our library of high-end styles, or create their own custom moodboard to generate images in their brand's signature aesthetic. Shoot your product anywhere. We preserve your brand's details.
    Starting Price: $18 per month
  • 11
    APIFuzzer
    APIFuzzer reads your API description and step-by-step fuzzes the fields to validate if your application can cope with the fuzzed parameters, and it does not require coding. Parse API definition from a local file or remote URL. JSON and YAML file format support. All HTTP methods are supported. Fuzzing of the request body, query string, path parameter, and request header is supported. Relies on random mutations and supports CI integration. Generate JUnit XML test report format. Send a request to an alternative URL. Support HTTP basic auth from the configuration. Save the report of the failed test in JSON format into the pre-configured folder.
    Starting Price: Free
  • 12
    ELMO
    Looking for an integrated HR information system (HRIS) to help your organization manage its people, processes, and pay? From improving employee engagement to increasing efficiencies and reducing costs, our integrated cloud-based platform has you covered. ELMO offers a comprehensive suite of cloud HR, payroll and rostering / time & attendance software solutions that can be configured however your organization requires, and these are available within a single dashboard and single user interface. We can help your organization streamline its HR & payroll processes to increase productivity and efficiency and to reduce costs. ISO certification is an important way to demonstrate our commitment to security at all levels of the business, and to show that security is a core, ongoing, and evolving aspect of our business operations & services. At ELMO we understand that our cloud HR & payroll solutions play a key role in helping our customers manage their most important resources.
  • 13
    Databricks Data Intelligence Platform
    The Databricks Data Intelligence Platform allows your entire organization to use data and AI. It’s built on a lakehouse to provide an open, unified foundation for all data and governance, and is powered by a Data Intelligence Engine that understands the uniqueness of your data. The winners in every industry will be data and AI companies. From ETL to data warehousing to generative AI, Databricks helps you simplify and accelerate your data and AI goals. Databricks combines generative AI with the unification benefits of a lakehouse to power a Data Intelligence Engine that understands the unique semantics of your data. This allows the Databricks Platform to automatically optimize performance and manage infrastructure in ways unique to your business. The Data Intelligence Engine understands your organization’s language, so search and discovery of new data is as easy as asking a question like you would to a coworker.
  • 14
    Whisper
    OpenAI
    We’ve trained and are open-sourcing a neural net called Whisper that approaches human-level robustness and accuracy in English speech recognition. Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. We show that the use of such a large and diverse dataset leads to improved robustness to accents, background noise, and technical language. Moreover, it enables transcription in multiple languages, as well as translation from those languages into English. We are open-sourcing models and inference code to serve as a foundation for building useful applications and for further research on robust speech processing. The Whisper architecture is a simple end-to-end approach, implemented as an encoder-decoder Transformer. Input audio is split into 30-second chunks, converted into a log-Mel spectrogram, and then passed into an encoder.
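    A short sketch using the open-source whisper Python package; the "base" model size and the audio path are placeholders:

```python
# Transcribe an audio file with an open-source Whisper checkpoint.
import whisper

model = whisper.load_model("base")          # one of the released model sizes
result = model.transcribe("audio.mp3")      # chunking and log-Mel conversion happen internally
print(result["text"])
```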
  • 15
    RoBERTa
    RoBERTa builds on BERT’s language masking strategy, wherein the system learns to predict intentionally hidden sections of text within otherwise unannotated language examples. RoBERTa, which was implemented in PyTorch, modifies key hyperparameters in BERT, including removing BERT’s next-sentence pretraining objective, and training with much larger mini-batches and learning rates. This allows RoBERTa to improve on the masked language modeling objective compared with BERT and leads to better downstream task performance. We also explore training RoBERTa on an order of magnitude more data than BERT, for a longer amount of time. We used existing unannotated NLP datasets as well as CC-News, a novel set drawn from public news articles.
    Starting Price: Free
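    To make the masked-language-modeling objective concrete, here is a small fill-mask sketch with the roberta-base checkpoint via the Hugging Face Transformers pipeline (an illustrative route; the original release ships fairseq code):

```python
# Ask RoBERTa to fill in an intentionally hidden token.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="roberta-base")

# RoBERTa's mask token is written as <mask>.
for candidate in fill_mask("Spark NLP is a <mask> library."):
    print(candidate["token_str"], round(candidate["score"], 3))
```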
  • 16
    XLNet
    XLNet is a new unsupervised language representation learning method based on a novel generalized permutation language modeling objective. Additionally, XLNet employs Transformer-XL as the backbone model, exhibiting excellent performance for language tasks involving long context. Overall, XLNet achieves state-of-the-art (SOTA) results on various downstream language tasks including question answering, natural language inference, sentiment analysis, and document ranking.
    Starting Price: Free
  • 17
    spaCy
    spaCy is designed to help you do real work, build real products, or gather real insights. The library respects your time and tries to avoid wasting it. It's easy to install, and its API is simple and productive. spaCy excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython. If your application needs to process entire web dumps, spaCy is the library you want to be using. Since its release in 2015, spaCy has become an industry standard with a huge ecosystem. Choose from a variety of plugins, integrate with your machine learning stack, and build custom components and workflows. Components for named entity recognition, part-of-speech tagging, dependency parsing, sentence segmentation, text classification, lemmatization, morphological analysis, entity linking, and more. Easily extensible with custom components and attributes. Easy model packaging, deployment, and workflow management.
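    A typical pipeline call that exercises several of the components listed above; it assumes the small English model has been installed separately (python -m spacy download en_core_web_sm):

```python
# One pass through the spaCy pipeline: tagging, parsing, and named entities.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

for token in doc[:5]:
    print(token.text, token.pos_, token.dep_)   # part-of-speech and dependency labels

for ent in doc.ents:
    print(ent.text, ent.label_)                 # named entities
```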
  • 18
    Apache Spark
    Apache Software Foundation
    Apache Spark™ is a unified analytics engine for large-scale data processing. Apache Spark achieves high performance for both batch and streaming data, using a state-of-the-art DAG scheduler, a query optimizer, and a physical execution engine. Spark offers over 80 high-level operators that make it easy to build parallel apps. And you can use it interactively from the Scala, Python, R, and SQL shells. Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application. Spark runs on Hadoop, Apache Mesos, Kubernetes, standalone, or in the cloud. It can access diverse data sources. You can run Spark using its standalone cluster mode, on EC2, on Hadoop YARN, on Mesos, or on Kubernetes. Access data in HDFS, Alluxio, Apache Cassandra, Apache HBase, Apache Hive, and hundreds of other data sources.
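    A minimal PySpark sketch of the DataFrame and SQL interfaces mentioned above (the column names and data are made up):

```python
# Start a local Spark session, build a DataFrame, and query it two ways.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("example").getOrCreate()

df = spark.createDataFrame(
    [("Alice", 34), ("Bob", 45), ("Cara", 29)],
    ["name", "age"],
)

# DataFrame API.
df.filter(df.age > 30).show()

# SQL over the same data.
df.createOrReplaceTempView("people")
spark.sql("SELECT name FROM people WHERE age < 40").show()

spark.stop()
```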
  • 19
    ALBERT
    Google
    ALBERT is a self-supervised Transformer model that was pretrained on a large corpus of English data. This means it does not require manual labelling, and instead uses an automated process to generate inputs and labels from raw texts. It is trained with two distinct objectives in mind. The first is Masked Language Modeling (MLM), which randomly masks 15% of words in the input sentence and requires the model to predict them. This technique differs from RNNs and autoregressive models like GPT as it allows the model to learn bidirectional sentence representations. The second objective is Sentence Ordering Prediction (SOP), which entails predicting the ordering of two consecutive segments of text during pretraining.
  • 20
    Maven
    Our first cohorts sold out within a few hours. Apply Now to join the waitlist and secure your spot in the next cohort. Do you have knowledge to share with the world but don’t know where to start? Many creators are overwhelmed by the number of variables, unknown unknowns, and sheer amount of work involved in creating a complex digital product like a cohort-based course. That’s why Maven is accepting applications to our new cohort-based course on How to Build a Cohort-Based Course (so meta). Our course makes it easy for anyone to join without an existing course and to have a fully complete course you’re proud to launch by the end of six weeks. Our company is fully remote, and we're assembling a team of extraordinary talent to revolutionize education on the Internet. We're looking for our first engineers as we gear up to launch courses with an incredible set of early instructors. Check out our open positions.
  • 21
    T5
    Google
    With T5, we propose reframing all NLP tasks into a unified text-to-text-format where the input and output are always text strings, in contrast to BERT-style models that can only output either a class label or a span of the input. Our text-to-text framework allows us to use the same model, loss function, and hyperparameters on any NLP task, including machine translation, document summarization, question answering, and classification tasks (e.g., sentiment analysis). We can even apply T5 to regression tasks by training it to predict the string representation of a number instead of the number itself.
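    A small sketch of the text-to-text pattern, using the Hugging Face Transformers port of T5 and the t5-small checkpoint, chosen here only for illustration:

```python
# The task is expressed as a text prefix; the answer comes back as text.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer(
    "translate English to German: The book is on the table.",
    return_tensors="pt",
)
output_ids = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```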