Showing 25 open source projects for "language processing"

View related business solutions
  • Atera - an All-in-one platform for IT management Icon
    Atera - an All-in-one platform for IT management

    Ideal for IT departments and MSPs (managed service providers)

    Your IT essentials, integrated & elevated. Take your IT management from automated to autonomous, download Atera's agent to start your free trial!
    Try Atera now
  • Ship Agents Faster Icon
    Ship Agents Faster

    Transform your applications and workflows into powerful agentic systems at global scale.

    Gemini Enterprise Agent Platform lets you rapidly build, scale, govern and optimize production-ready agents grounded in your organization's data. The platform enables developers to build custom or pre-built agents for virtually any use case. New customers get $300 in free credits.
    Get Started Free
  • 1
    Awesome Fraud Detection Research Papers

    Awesome Fraud Detection Research Papers

    A curated list of data mining papers about fraud detection

    A curated list of data mining papers about fraud detection from several conferences.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Numaflow

    Numaflow

    Kubernetes-native platform to run massively parallel data/streaming

    Numaflow is a Kubernetes-native tool for running massively parallel stream processing. A Numaflow Pipeline is implemented as a Kubernetes custom resource and consists of one or more source, data processing, and sink vertices. Numaflow installs in a few minutes and is easier and cheaper to use for simple data processing applications than a full-featured stream processing platform.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    Diffgram

    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    ...Annotation is required because raw media is considered to be unstructured and not usable without it. That’s why training data is required for many modern machine learning use cases including computer vision, natural language processing and speech recognition.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    Siddhi Core Libraries

    Siddhi Core Libraries

    Stream Processing and Complex Event Processing Engine

    ...Agile development experience with SQL-like query language and graphical drag-and-drop editor supporting event simulation. Lightweight runtime that can natively run on Kubernetes, Docker, VM, or bare metal, and embedded in any Java or Python application. Scalable, and highly available distributed event processing on Kubernetes, with NATS Streaming and Siddhi Kubernetes Operator.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Secure File Transfer for Windows with Cerberus by Redwood Icon
    Secure File Transfer for Windows with Cerberus by Redwood

    Protect and share files over FTP/S, SFTP, HTTPS and SCP with the #1 rated Windows file transfer server.

    Cerberus supports unlimited users and connections on a single IP, with built-in encryption, 2FA, and a browser-based web client — all deployable in under 15 minutes with a 25-day free trial.
    Try for Free
  • 5
    Riemann

    Riemann

    A network event stream processing system, in Clojure

    Riemann aggregates events from your servers and applications with a powerful stream processing language. Send an email for every exception in your app. Track the latency distribution of your web app. See the top processes on any host, by memory and CPU. Combine statistics from every Riak node in your cluster and forward to Graphite. Track user activity from second to second. Riemann streams are just functions which accept an event.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    Benthos

    Benthos

    Fancy stream processing made operationally mundane

    Benthos is a high performance and resilient stream processor, able to connect various sources and sinks in a range of brokering patterns and perform hydration, enrichments, transformations and filters on payloads. It comes with a powerful mapping language, is easy to deploy and monitor, and ready to drop into your pipeline either as a static binary, docker image, or serverless function, making it cloud native as heck. Delivery guarantees can be a dodgy subject. Benthos processes and...
    Downloads: 18 This Week
    Last Update:
    See Project
  • 7
    Gridap.jl

    Gridap.jl

    Grid-based approximation of partial differential equations in Julia

    ...One can implement new FE spaces, new reference elements, use external mesh generators, linear solvers, post-processing tools, etc. See, e.g., the list of available Gridap plugins.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 8
    Data Formulator

    Data Formulator

    Create rich visualizations with AI

    To create rich visualizations, data analysts often need to iterate back and forth among data processing and chart specification to achieve their goals. To achieve this, analysts need not only proficiency in data transformation and visualization tools but also efforts to manage the branching history consisting of many different versions of data and charts. Recent LLM-powered AI systems have greatly improved visualization authoring experiences, for example by mitigating manual data...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9

    Pytente

    Uma Ferramenta Computacional para Análise e Recuperação de Patentes

    O Pytente é uma solução avançada para automatizar o processo de coleta, armazenamento e tratamento de dados bibliográficos de patentes. A ferramenta foi projetada para simplificar a coleta de grandes volumes de dados em repositórios de acesso aberto. O Pytente garante o armazenamento estruturado das informações, além da validação e eliminação de registros duplicados. Dentre as diversas funcionalidades disponibilizadas pela ferramenta, destacam-se a extração personalizada de subconjuntos de...
    Downloads: 0 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 10
    text-dedup

    text-dedup

    All-in-one text de-duplication

    text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    SentimentAnalysis-Rick&Morty

    SentimentAnalysis-Rick&Morty

    Rick & Morty Sentiment Analysis - End-of-Degree Project - UNIR

    The remarkable progress in the field of Big Data has driven the development of new technologies in natural language processing and data analysis. Text mining is a fascinating application of data analysis that extracts relevant information from related writings in different linguistic contexts. And therefore, in natural language processing, sentiment analysis and classification stands out as a key application supported by text mining. Through the extraction of information from textual data, it becomes possible to identify and comprehend the sentiments and emotions conveyed. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Transducers.jl

    Transducers.jl

    Efficient transducers for Julia

    Transducers are transformations of "sequence" of input that can be composed very efficiently. The interface used by transducers naturally describes a wide range of processes that is expressible as a succession of steps. Furthermore, transducers can be defined without specifying the details of the input and output (collections, streams, channels, etc.) and therefore achieves a full reusability. Transducers are introduced by Rich Hickey, the creator of the Clojure language. His Strange Loop...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 13
    Wooey

    Wooey

    A Django app that creates automatic web UIs for Python scripts

    ...Enable the easy wrapping of any program in simple python instead of having to use language specific to existing tools such as Galaxy. Enable fellow lab members with no command line experience to utilize python scripts. Autodocument workflows for data analysis (simple model saving).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    SZT-bigdata

    SZT-bigdata

    SZT‑bigdata is an open source project

    SZT‑bigdata is an open-source project analyzing real Shenzhen metro (subway) card usage data using big‑data frameworks like Spark, Hadoop, Hive, Kafka, Flink, ClickHouse, HBase, and Elasticsearch. Aimed at exploring transit passenger flow patterns and system optimization using a variety of Scala-based technologies.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    nonechucks

    nonechucks

    Deal with bad samples in your dataset dynamically

    ...What if you have a dataset of 1000s of images, out of which a few dozen images are unreadable because the image files are corrupted? Or what if your dataset is a folder full of scanned PDFs that you have to OCRize, and then run a language detector on the resulting text, because you want only the ones that are in English? Or maybe you have an AlternateIndexSampler, and you want to be able to move to dataset[6] after dataset[4] fails while attempting to load! PyTorch's data processing module expects you to rid your dataset of any unwanted or invalid samples before you feed them into its pipeline, and provides no easy way to define a "fallback policy" in case such samples are encountered during dataset iteration.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    AI learning

    AI learning

    AiLearning, data analysis plus machine learning practice

    We actively respond to the Research Open Source Initiative (DOCX) . Open source today is not just open source, but datasets, models, tutorials, and experimental records. We are also exploring other categories of open source solutions and protocols. I hope you will understand this initiative, combine this initiative with your own interests, and do what you can. Everyone's tiny contributions, together, are the entire open source ecosystem. We are iBooker, a large open-source community,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17

    PRADA

    PRADA : Pipeline for RNA-Sequencing Data Analysis

    Massively parallel sequencing of cDNA reverse transcribed from RNA (RNASeq) provides an accurate estimate of the quantity and composition of mRNAs. To characterize the transcriptome through the analysis of RNA-seq data, we developed PRADA. PRADA focuses on the processing and analysis of gene expression estimates, supervised and unsupervised gene fusion identification, and supervised intragenic deletion identification. PRADA currently supports 7 modules to process and identify...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18

    Larch: Data Analysis for X-ray Spectra

    Data Processing and Analysis for X-ray Spectroscopy and More

    Larch is a scientific data processing language that is designed to be easy to use for novices and complete enough for advanced data processing and analysis. Larch provides a wide range of functionality for dealing with arrays of scientific data, and basic tools to make it easy to use and organize complex data. Larch has been primarily developed for dealing with x-ray spectroscopic and scattering data, especially the kind of data collected at modern synchrotrons and x-ray sources. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Cascalog

    Cascalog

    Data processing on Hadoop without the hassle

    Cascalog is a powerful Clojure (and Java) data processing and querying library built atop Hadoop (via Cascading), providing a high-level, Datalog-inspired abstraction for both big data processing and local computation. Cascalog is hosted at Clojars, and some of its dependencies are hosted at Conjars. Both Clo/Con-jars are maven repos that's easy to use with maven or leiningen. The Cascalog website contains more information and links to Various articles and tutorials. The best way to get...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Graphical Grammar Studio

    Graphical Grammar Studio

    An user friendly grammar tool for natural language processing tasks

    Full documentation with tutorials is included in the download package. Graphical Grammar Studio is a tool for applying grammars which behave as words acceptors/consumers and annotators. GGS grammars can be used to find and annotate sequences of words which respect certain conditions, in a given input. Its purpose is for creating NLP tools like phrase chunkers, named entity finders, pronoun co-reference solvers etc. A grammar is represented by a state machine which can be visualized, edited...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    q pipeline manager

    q pipeline manager

    q: integrated platform for pipeline configuration and management

    The q utility is a platform for creating and managing data analysis pipelines. It expands the value of your existing job scheduler - either Grid Engine or TORQUE PBS - through numerous functions that help you organize, submit, monitor, manage and share your informatics work. Data processing pipelines require high-level organization and parallelization of work to optimize resource utilization and decrease the time to results. q (from queue) allows complex job sequences to be efficiently...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Pear3DEngine

    Pear3DEngine

    Pear3DEngine is a modern and modular 3D development framework

    Pear3DEngine is a modern and modular 3D development framework that lets you create professional games, simulations and more. You are free to develop your program in C + +, XML or LUA and publish it as open source software or selling it as a commercial program. The rendering engine uses internally OpenGL or DirectX optionally. The planned editor supports software development on Linux, Windows and maybe MacOS X. DirectX 9 and therefore Windows XP are currently not supported and support is not...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Sanchay
    Sanchay is a collection of tools and APIs for language researchers. It has some implementations of NLP algorithms, some flexible APIs, several user friendly annotation interfaces and Sanchay Query Language for language resources.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    SYRAH si propone di far emergere e rappresentare i concetti espressi per mezzo di un linguaggio naturale. SYRAH aims to discover and represent concepts expressed in natural languages. NLP, lemma, lemmario, italiano, rete, semantica, clustering, semantic
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    A Java API for using suffix trees with natural language and an Eclipse/SWT-based GUI for suffix tree visualization using Graphviz.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next