Search Results for "language processing" - Page 22


Showing 846 open source projects for "language processing"

1

GLMixer

Graphic Live Mixer

GLMixer performs real-time graphical blending of several movie clips and of computer-generated graphics. Drop video files in the mixing workspace and place them in a circular area to change their opacity; if you select two videos, moving them together performs a fading transition. This principle generalizes to a large number of videos. Direct interaction with the videos lets you work quickly and reactively, and move and deform them on screen. The output of your operations is shown in the...

11 Reviews

Downloads: 111 This Week

Last Update: 2024-01-22
See Project
2

OLiA

OWL/DL ontologies for linguistic annotations

MOVED TO https://github.com/acoli-repo/olia. The Ontologies of Linguistic Annotations (OLiA) provide an OWL/DL taxonomy of data categories as a reference for linguistic annotation (OLiA Reference Model), plus OWL/DL models for a large number of annotation schemes (OLiA Annotation Models) and their relationship to reference data categories (OLiA Linking Models). The OLiA Reference Model itself is linked to community-maintained repositories such as GOLD (http://linguistics-ontology.org/)...

Downloads: 0 This Week

Last Update: 2019-11-11
See Project
3

nonechucks

Deal with bad samples in your dataset dynamically

...What if you have a dataset of 1000s of images, out of which a few dozen images are unreadable because the image files are corrupted? Or what if your dataset is a folder full of scanned PDFs that you have to OCRize, and then run a language detector on the resulting text, because you want only the ones that are in English? Or maybe you have an AlternateIndexSampler, and you want to be able to move to dataset[6] after dataset[4] fails while attempting to load! PyTorch's data processing module expects you to rid your dataset of any unwanted or invalid samples before you feed them into its pipeline, and provides no easy way to define a "fallback policy" in case such samples are encountered during dataset iteration.

Downloads: 0 This Week

Last Update: 2023-06-12
See Project
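
The "fallback policy" idea in the nonechucks entry above can be illustrated with a minimal sketch in plain Python. This is not the nonechucks API: `SkipBadSamples` and `FlakyDataset` are hypothetical names invented for the example, and the policy shown (skip the failing index and move on) is only one possible choice.

```python
from typing import Any, Iterator


class SkipBadSamples:
    """Wrap an indexable dataset and skip samples that fail to load.

    Hypothetical helper for illustration; it is not the nonechucks API.
    """

    def __init__(self, dataset: Any, errors: tuple = (Exception,)):
        self.dataset = dataset
        self.errors = errors
        self.bad_indices = set()

    def __iter__(self) -> Iterator[Any]:
        for index in range(len(self.dataset)):
            try:
                yield self.dataset[index]
            except self.errors:
                # Fallback policy: remember the bad index and move on.
                self.bad_indices.add(index)


class FlakyDataset:
    """Toy dataset whose odd-numbered samples raise on access."""

    def __len__(self) -> int:
        return 6

    def __getitem__(self, index: int) -> int:
        if index % 2 == 1:
            raise ValueError(f"corrupt sample {index}")
        return index * 10


safe = SkipBadSamples(FlakyDataset())
good_samples = list(safe)
```

After iteration, `good_samples` holds only the samples that loaded, and `safe.bad_indices` records which indices were skipped.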
4

Django Celery

Old Celery integration project for Django

Celery is a simple, flexible, and reliable distributed system to process vast amounts of messages, while providing operations with the tools required to maintain such a system. It’s a task queue with a focus on real-time processing, while also supporting task scheduling. Celery has a large and diverse community of users and contributors; you should come join us on IRC or our mailing list. Celery is Open Source and licensed under the BSD License. A task queue’s input is a unit of work called a...

Downloads: 5 This Week

Last Update: 2022-09-01
See Project
5

YouTubeCrawler

Go-based automation utility that downloads YouTube videos

This tool is a Go-based automation utility that downloads YouTube videos and permanently embeds or “hard-codes” their subtitles (typically English) into MP4 output files. The workflow involves specifying one or more URLs (via a simple “url” text file in each folder) and the program uses youtube-dl to fetch video and subtitle, then ffmpeg to overlay the subtitles onto the video track. The architecture follows a command-pattern setup: tasks implement a common interface and are scheduled and...

Downloads: 0 This Week

Last Update: 2025-10-24
See Project
6

ModularAdmin

Free Dashboard Theme Built On Bootstrap 4 | HTML Version

ModularAdmin is an open source dashboard theme built in a modular way, which makes it easy to scale, modify and maintain. We use SASS as the CSS preprocessor language. Main variables are defined in the src/_variables.scss file. To make life easier, we broke the styles down into components, and on build we merge all .scss files together and process them into the dist/css/app.css file. There are also different theme variations located in the src/_themes/ folder, where you can change the main variables to get different themes. ...

Downloads: 0 This Week

Last Update: 2022-04-20
See Project
7

TEXT2DATA

Text Analytics Platform

Bring a text analytics platform that uses NLP (Natural Language Processing) and machine learning to your work environment. Extract essential information from your text documents and let artificial intelligence save you time. Get detailed and agile reports on your unstructured data.

Downloads: 0 This Week

Last Update: 2019-07-17
See Project
8

Duckling (Old)

Clojure library that parses text into structured data

Duckling (the “old” archived version) is a natural language processing library (in Clojure) for parsing text to structured data: specifically, recognizing quantities such as dates, times, durations, measurements, currencies, etc., from free-form text. To use Duckling in your project, you just need two functions: load! to load the default configuration, and parse to parse a string.

Downloads: 0 This Week

Last Update: 2025-09-24
See Project
9

codelyzer

Static analysis for Angular projects

...You can run the static code analyzer over web apps, NativeScript, Ionic, etc. Note that by default all components are aligned with the style guide, so you won't see any errors in the console. Codelyzer supports any template and style language through custom hooks. If you're using Sass, for instance, you can allow codelyzer to analyze your styles by creating a file .codelyzer.js in the root of your project (where the node_modules directory is). In the configuration file you can implement custom pre-processing and template resolution logic. Lint rules encode logic for syntactic and semantic checks of TypeScript, HTML, CSS and Angular expressions source code.

Downloads: 1 This Week

Last Update: 2022-03-07
See Project
10

Web Book Downloader

Download websites as e-books: PDF, TXT, EPUB.

This application lets the user download chapters from a website in three ways: from the table of contents; from a range (first chapter address to last chapter address); or by crawling from the first chapter to the nth. In the settings you can customize the language and the input (website encoding); for simplicity, the output uses the same encoding. To add your own language, add a new class to the strings package, and new fields to the Settings class and the GUI menu (initialize method).

1 Review

Downloads: 3 This Week

Last Update: 2019-06-15
See Project
11

NeuroNER

Named-entity recognition using neural networks

...Identified entities can be used in various downstream applications such as patient note de-identification and information extraction systems. They can also be used as features for machine learning systems for other natural language processing tasks. NeuroNER leverages the state-of-the-art prediction capabilities of neural networks (a.k.a. "deep learning"). It is cross-platform, open source, freely available, and straightforward to use. It enables users to create or modify annotations for a new or existing corpus and to train the neural network that performs the NER. During training, NeuroNER allows monitoring of the network. ...

Downloads: 0 This Week

Last Update: 2022-08-12
See Project
12

lazynlp

Library to scrape and clean web pages to create massive datasets

LazyNLP is a lightweight tool for collecting and curating large-scale text datasets for machine learning and NLP applications with minimal manual effort.

Downloads: 0 This Week

Last Update: 2025-01-22
See Project
13

Pipelines

An experimental programming language for data flow

Pipelines is a language and runtime for crafting massively parallel pipelines. Unlike other languages for defining data flow, the Pipeline language requires the implementation of components to be defined separately in the Python scripting language. This allows the details of implementations to be separated from the structure of the pipeline while providing access to thousands of active libraries for machine learning, data analysis, and processing.

Downloads: 2 This Week

Last Update: 2025-01-21
See Project
14

Common Litt

Simple JavaScript library for automatic transliteration; input tool.

This project focuses on automatic conversion between language alphabets. Using the 'lit.js' library, you can currently convert between English, Tamil and Sinhala scripts in either direction. This is useful when you need to know how to write something in another given language. It is still at the development stage but works well and is easy to customize. Live demo available at: http://commonlitt.42web.io/ For UI creation I used Bootstrap and jQuery. For easy array...

Downloads: 0 This Week

Last Update: 2019-06-01
See Project
15

Faust : signal processing language

Faust is a programming language for realtime audio signal processing

[UPDATE] The project has been moved to GitHub (https://github.com/grame-cncm/faust). Do not use this repository anymore! FAUST (Functional Audio Stream) is a functional programming language specifically designed for real-time signal processing and synthesis. FAUST targets high-performance signal processing applications and audio plug-ins for a variety of platforms and standards. The Faust compiler translates DSP specifications into very efficient C++ code. Thanks to the notion of architecture, FAUST programs can be easily deployed on a large variety of audio platforms and plugin formats (jack, alsa, ladspa, maxmsp, puredata, csound, supercollider, pure, vst, coreaudio) without any change to the FAUST code.

11 Reviews

Downloads: 0 This Week

Last Update: 2019-01-31
See Project
16

TextRank

TextRank implementation for Python 3

TextRank is an implementation of the TextRank algorithm for extractive text summarization and keyword extraction, inspired by Google’s PageRank.

Downloads: 0 This Week

Last Update: 2025-01-24
See Project
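
The PageRank-inspired ranking mentioned in the TextRank entry above can be sketched for the keyword-extraction case as follows. This is a minimal illustration, not the listed project's API: the function name `textrank_keywords`, the co-occurrence window of 2, the damping factor of 0.85 and the fixed iteration count are all assumptions chosen for the example.

```python
from collections import defaultdict


def textrank_keywords(words, window=2, damping=0.85, iterations=50):
    """Rank words by PageRank over an undirected co-occurrence graph."""
    # Link each word to the distinct words that follow it within the window.
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Power iteration: each word shares its score equally among its neighbors.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }
    return sorted(scores, key=scores.get, reverse=True)


tokens = "graph ranking scores graph nodes graph edges".split()
ranked = textrank_keywords(tokens)
```

In this toy input, "graph" co-occurs with every other word, so it comes first in `ranked`. A real pipeline would normally filter stop words and tokenize properly before building the graph.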
17

go-libav

Go language bindings for ffmpeg libraries

go-libav is a Go language binding for the FFmpeg libav libraries, enabling developers to perform advanced multimedia processing directly in Go applications. It exposes low-level functionality such as encoding, decoding, muxing, and demuxing through Go-friendly abstractions. The project is designed for performance-critical systems where direct control over media pipelines is required.

Downloads: 0 This Week

Last Update: 2026-04-27
See Project
18

unfluff

Automatically extract body content (and other cool stuff) from HTML

unfluff is a Node.js library designed to automatically extract the main content from an HTML document — stripping away navigation bars, ads, footers and other boilerplate to leave you with the “body content”, metadata (title, author, date) and other useful fields. It’s a tool very much aimed at content-analysis, web scraping, building datasets, or repurposing article text for downstream processing (like machine-learning or summarization). The API is simple: you feed in raw HTML and it returns a structured object with the extracted text and other fields. It supports caching internal representations to speed up repeated extractions. While its language support is best for English, it is still widely used in web-content-processing pipelines. ...

Downloads: 0 This Week

Last Update: 2025-11-14
See Project
19

iText®, a JAVA PDF library

PDF Library for Developers

iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest...

Downloads: 123 This Week

Last Update: 2024-06-01
See Project
20

cnn-text-classification-tf

Convolutional Neural Network for Text Classification in Tensorflow

The cnn-text-classification-tf repository by Denny Britz is a well-known educational implementation of convolutional neural networks for text classification using TensorFlow, aimed at helping developers and researchers understand how CNNs can be applied to natural language processing tasks. Based loosely on Kim’s influential paper on CNNs for sentence classification, this codebase demonstrates how to preprocess text data, convert words into learned embeddings, and apply multiple convolution filters to extract n-gram features that are then pooled and fed into a classifier. The project includes scripts for training, evaluation, and data handling, making it easy to run experiments on datasets such as movie reviews or other labeled text collections. ...

Downloads: 0 This Week

Last Update: 2026-02-13
See Project
21

DeepLearn

Implementation of research papers on Deep Learning + NLP + CV in Python

Welcome to DeepLearn. This repository contains implementations of the following research papers on NLP, CV, ML, and deep learning. The required dependencies are listed in requirement.txt. I will also use dl-text modules for preparing the datasets. If you haven't used it, please do have a quick look at it. CV, transfer learning, representation learning.

Downloads: 0 This Week

Last Update: 2022-08-11
See Project
22

OpenPR

OpenPR stands for Open Pattern Recognition project and is intended to be an open source library for algorithms of image processing, computer vision, natural language processing, pattern recognition, machine learning and the related fields.

Downloads: 4 This Week

Last Update: 2018-05-15
See Project
23

cebe/markdown

A super fast, highly extensible markdown parser for PHP

...It is a set of PHP classes, each representing a Markdown flavor, and a command line tool for converting Markdown files to HTML files. The implementation focus is to be fast (see benchmark) and extensible. You are able to add additional language elements by directly hooking into the parser; no (possibly error-prone) post- or pre-processing is needed to extend the language. It is also well tested, to provide the best rendering results even in edge cases where other parsers fail.

Downloads: 0 This Week

Last Update: 2024-09-24
See Project
24

pyhanlp

Chinese word segmentation

pyhanlp is a Python interface for HanLP (Han Language Processing) that lets you use a mature Java-based NLP toolkit from Python workflows without rebuilding the underlying algorithms. It is commonly used for Chinese-language NLP tasks where you want production-grade tokenization and linguistic analysis, but still want the convenience of Python scripting. The project focuses on making HanLP’s capabilities accessible through a Python-friendly API surface, so you can integrate NLP steps into data pipelines, notebooks, and downstream ML or information-extraction code. ...

Downloads: 0 This Week

Last Update: 2026-01-22
See Project
25

IceNLP

IceNLP is an open source Natural Language Processing (NLP) toolkit for analyzing and processing Icelandic text. The toolkit is implemented in Java.

1 Review

Downloads: 1 This Week

Last Update: 2018-04-13
See Project


© 2026 Slashdot Media. All Rights Reserved.
