Showing 46 open source projects for "natural language processing"

  • 1
    Awesome Fraud Detection Research Papers

    A curated list of data mining papers about fraud detection

    A curated list of data mining papers about fraud detection from several conferences.
    Downloads: 2 This Week
  • 2
    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    ...Annotation is required because raw media is considered to be unstructured and not usable without it. That’s why training data is required for many modern machine learning use cases including computer vision, natural language processing and speech recognition.
    Downloads: 3 This Week
  • 3
    Riemann

    A network event stream processing system, in Clojure

    Riemann aggregates events from your servers and applications with a powerful stream processing language. Send an email for every exception in your app. Track the latency distribution of your web app. See the top processes on any host, by memory and CPU. Combine statistics from every Riak node in your cluster and forward to Graphite. Track user activity from second to second. Riemann streams are just functions which accept an event.
    Downloads: 8 This Week
  • 4
    Data Formulator

    Create rich visualizations with AI

    To create rich visualizations, data analysts often need to iterate back and forth between data processing and chart specification. Doing so requires not only proficiency in data transformation and visualization tools but also effort to manage a branching history of many different versions of data and charts. Recent LLM-powered AI systems have greatly improved visualization authoring experiences, for example by mitigating manual data...
    Downloads: 2 This Week
  • 5
    Numaflow

    Kubernetes-native platform to run massively parallel data/streaming

    Numaflow is a Kubernetes-native tool for running massively parallel stream processing. A Numaflow Pipeline is implemented as a Kubernetes custom resource and consists of one or more source, data processing, and sink vertices. Numaflow installs in a few minutes and is easier and cheaper to use for simple data processing applications than a full-featured stream processing platform.
    Downloads: 2 This Week
  • 6
    Siddhi Core Libraries

    Stream Processing and Complex Event Processing Engine

    ...Agile development experience with SQL-like query language and graphical drag-and-drop editor supporting event simulation. Lightweight runtime that can natively run on Kubernetes, Docker, VM, or bare metal, and embedded in any Java or Python application. Scalable, and highly available distributed event processing on Kubernetes, with NATS Streaming and Siddhi Kubernetes Operator.
    Downloads: 0 This Week
  • 7
    101-0250-00

    ETH course - Solving PDEs in parallel on GPUs

    This course aims to cover state-of-the-art methods in modern parallel Graphics Processing Unit (GPU) computing, supercomputing and code development with applications to natural sciences and engineering.
    Downloads: 1 This Week
  • 8
    Benthos

    Fancy stream processing made operationally mundane

    Benthos is a high-performance and resilient stream processor, able to connect various sources and sinks in a range of brokering patterns and perform hydration, enrichments, transformations and filters on payloads. It comes with a powerful mapping language, is easy to deploy and monitor, and is ready to drop into your pipeline as a static binary, Docker image, or serverless function, making it cloud native as heck. Delivery guarantees can be a dodgy subject. Benthos processes and...
    Downloads: 10 This Week
  • 9
    AI Data Science Team

    An AI-powered data science team of agents

    ...It provides a modular agent framework where each agent focuses on a step in the typical data science pipeline — for example, loading data from CSV/Excel files, cleaning and wrangling messy datasets, engineering predictive features, building models with AutoML, connecting to SQL databases, and producing visual outputs — all driven by natural language or programmatic instructions. The project includes ready-to-use applications that showcase these agents in action, such as an exploratory data analysis copilot that generates reports, a pandas data analyst that combines wrangling and plotting, and SQL database agents that can query business databases and output results directly.
    Downloads: 1 This Week
  • 10
    Gridap.jl

    Grid-based approximation of partial differential equations in Julia

    ...One can implement new FE spaces, new reference elements, use external mesh generators, linear solvers, post-processing tools, etc. See, e.g., the list of available Gridap plugins.
    Downloads: 4 This Week
  • 11
    Searchkick

    Intelligent search made easy

    Searchkick brings powerful, production-ready search to Rails by mapping Active Record models into Elasticsearch with sensible defaults and easy customization. It supports language analyzers, stemming, synonyms, misspelling tolerance, and highlighting so search results feel natural to end users. Indexing is model-centric: you declare what fields to index, add computed fields, and trigger reindexing via callbacks or background jobs, with options for zero-downtime rolling reindexes. On the query side, a simple API covers relevance tuning, boosting, filtering, faceting/aggregations, and pagination, while still allowing direct access to advanced Elasticsearch features when needed. ...
    Downloads: 0 This Week
  • 12
    Pytente

    A Computational Tool for Patent Analysis and Retrieval

    Pytente is an advanced solution for automating the collection, storage, and processing of bibliographic patent data. The tool was designed to simplify the collection of large volumes of data from open-access repositories. Pytente ensures structured storage of the information, as well as validation and removal of duplicate records. Among the various features the tool provides, notable ones include the customized extraction of subsets of...
    Downloads: 0 This Week
  • 13
    Altaxo Data Processing/Plotting Program

    Data manipulation and plotting program with scripting

    Altaxo is a data manipulation and plotting program written in C# for Microsoft .NET. It features worksheet views and plot views, a scripting language (currently C#) for data processing and automation, import of data from ASCII files or from images, export of graphs, and embedding of graphs in other documents.
    Downloads: 1 This Week
  • 14
    HPCC Systems

    End-to-end big data in a massively scalable supercomputing platform.

    ...With HPCC Systems, developers can design applications with Big Data at their core, enabling businesses to better analyze and understand data at scale, improving business time to results and decisions. HPCC Systems offers a consistent data-centric programming language, two processing platforms and a single, complete end-to-end architecture for efficient processing.
    Downloads: 0 This Week
  • 15
    Catbird Linux

    Linux for content creation, web scraping, coding, and data analysis.

    Catbird Linux is a USB-pluggable live Linux operating system built for media creation, web scraping, and software coding. It is the daily driver you want for retrieving data, making videos or podcasts, and building software tools to automate repetitive tasks. It is ready for work in Python, Lua, and Go, with numerous packages for web scraping or downloading data via API calls. Using Catbird Linux, it is possible to accomplish in-depth stock market analysis, track weather...
    Downloads: 9 This Week
  • 16
    Quick 2d Plot

    Program for live 2d graphical representation of data streams

    Quick2dPlot, or q2d for short, is an open source, minimalistic plotting program designed for live 2d graphical representation of data streams. It may be useful for plotting the output of a user's application programs, especially when the user wants to see one or more plots during calculations or a data acquisition process. The program is command-driven and uses no widgets. Q2d is written in C and uses the SDL2 library for plotting. Currently...
    Downloads: 0 This Week
  • 17
    text-dedup

    All-in-one text de-duplication

    text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible...
    Downloads: 0 This Week
  • 18
    SentimentAnalysis-Rick&Morty

    Rick & Morty Sentiment Analysis - End-of-Degree Project - UNIR

    The remarkable progress in the field of Big Data has driven the development of new technologies in natural language processing and data analysis. Text mining is a fascinating application of data analysis that extracts relevant information from related writings in different linguistic contexts. Within natural language processing, sentiment analysis and classification therefore stand out as key applications supported by text mining. ...
    Downloads: 0 This Week
  • 19
    Transducers.jl

    Efficient transducers for Julia

    Transducers are transformations of a "sequence" of input that can be composed very efficiently. The interface used by transducers naturally describes a wide range of processes that are expressible as a succession of steps. Furthermore, transducers can be defined without specifying the details of the input and output (collections, streams, channels, etc.) and therefore achieve full reusability. Transducers were introduced by Rich Hickey, the creator of the Clojure language. His Strange Loop...
    Downloads: 0 This Week
  • 20
    Wooey

    A Django app that creates automatic web UIs for Python scripts

    ...Enable the easy wrapping of any program in simple Python instead of having to use language specific to existing tools such as Galaxy. Enable fellow lab members with no command line experience to utilize Python scripts. Autodocument workflows for data analysis (simple model saving).
    Downloads: 0 This Week
  • 21
    SZT-bigdata

    SZT‑bigdata is an open source project

    SZT‑bigdata is an open-source project that analyzes real Shenzhen metro (subway) card usage data using big‑data frameworks such as Spark, Hadoop, Hive, Kafka, Flink, ClickHouse, HBase, and Elasticsearch. It aims to explore transit passenger flow patterns and system optimization using a variety of Scala-based technologies.
    Downloads: 1 This Week
  • 22
    Strategems

    Quantitative systematic trading strategy development and backtesting

    ...Given the highly iterative nature of event-driven trading strategy development, Julia's high-performance design (particularly in the context of loops) and straightforward syntax would seem to make it a natural fit as a language for systematic strategy research and development. While this package remains early in development, with time the hope is to be able to rapidly implement a trading idea, construct a historical backtest, analyze its results, optimize over a given parameter set, and visualize all of this with great detail.
    Downloads: 0 This Week
  • 23
    http://lc.kubagro.ru/ http://lc.kubagro.ru/aidos/index.htm http://lc.kubagro.ru/aidos/_Aidos-X.htm The Eidos system started running on the IBM PC in 1992 and has run on MS Windows since 2012. It is implemented in Alaska+Express. I want to try to port some modes, and maybe all of them, to Harbor. The full source text in a single file is here: http://lc.kubagro.ru/__AIDOS-X.txt Responsible Secretary Kubgau scientific journal, Professor of computer science Department Kubgau...
    Downloads: 0 This Week
  • 24
    Deep Learning with PyTorch

    Latest techniques in deep learning and representation learning

    This course concerns the latest techniques in deep learning and representation learning, focusing on supervised and unsupervised deep learning, embedding methods, metric learning, convolutional and recurrent nets, with applications to computer vision, natural language understanding, and speech recognition. The prerequisites include DS-GA 1001 Intro to Data Science or a graduate-level machine learning course. To follow the exercises, you will need a laptop with Miniconda (a minimal version of Anaconda) and several Python packages installed. The following instructions work as is for Mac or Ubuntu Linux users; Windows users would need to install and work in the Git BASH terminal. ...
    Downloads: 0 This Week
  • 25
    nonechucks

    Deal with bad samples in your dataset dynamically

    ...What if you have a dataset of 1000s of images, out of which a few dozen images are unreadable because the image files are corrupted? Or what if your dataset is a folder full of scanned PDFs that you have to OCRize, and then run a language detector on the resulting text, because you want only the ones that are in English? Or maybe you have an AlternateIndexSampler, and you want to be able to move to dataset[6] after dataset[4] fails while attempting to load! PyTorch's data processing module expects you to rid your dataset of any unwanted or invalid samples before you feed them into its pipeline, and provides no easy way to define a "fallback policy" in case such samples are encountered during dataset iteration.
    Downloads: 0 This Week