Page 12 | language processing free download

gensim

Topic Modelling for Humans

Gensim is a Python library for topic modeling, document indexing, and similarity retrieval with large corpora. The target audience is the natural language processing (NLP) and information retrieval (IR) community.
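As a rough illustration of what "document indexing and similarity retrieval" means, the sketch below builds a bag-of-words index and ranks documents against a query by cosine similarity. It uses only the Python standard library and is not Gensim's API; the function names (`tokenize`, `build_index`, `most_similar`) and the tiny corpus are made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents):
    """Index each document as a bag-of-words count vector."""
    return [Counter(tokenize(doc)) for doc in documents]


def most_similar(query, index):
    """Return (score, document_position) pairs, best match first."""
    query_vec = Counter(tokenize(query))
    scores = [(cosine(query_vec, vec), i) for i, vec in enumerate(index)]
    return sorted(scores, reverse=True)


# Hypothetical three-document corpus, for illustration only.
corpus = [
    "topic models find themes in large text corpora",
    "asynchronous runtimes schedule tasks across threads",
    "similarity retrieval ranks documents against a query",
]
index = build_index(corpus)
ranking = most_similar("rank documents by similarity", index)
print(corpus[ranking[0][1]])
```

A production library would typically add term weighting (such as TF-IDF) and streaming over corpora too large for memory; this sketch leaves both out.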

Downloads: 0 This Week

Last Update: 2025-10-16

Raycast Ollama

Raycast extension for Ollama

Raycast Ollama is an extension for Raycast that integrates Ollama-based large language models directly into the macOS productivity launcher environment. It allows users to interact with local AI models through Raycast commands, enabling quick access to chat, text generation, and other AI-powered tasks without leaving their workflow. The extension is designed to be lightweight and fast, aligning with Raycast’s philosophy of keyboard-driven productivity. It provides a seamless interface for...

Downloads: 2 This Week

Last Update: 2026-05-26

LLPhant

A comprehensive PHP Generative AI Framework

LLPhant is a PHP-based generative AI framework designed to bring large language model capabilities into modern PHP applications with a structure inspired by frameworks like LangChain. It provides developers with a set of abstractions for building AI-powered features such as chat systems, retrieval pipelines, and automated workflows while remaining simple enough to integrate into existing Symfony or Laravel projects. The framework focuses on usability by offering straightforward APIs for...

Downloads: 2 This Week

Last Update: 2026-05-16

course.fast.ai

The fast.ai course notebooks

...The repository includes lesson notebooks, slide presentations, spreadsheets, and supplementary materials that help students understand neural networks, computer vision, and natural language processing tasks. The materials are designed to work alongside the fast.ai book and video lectures so learners can follow a structured learning pathway through modern deep learning techniques.

Downloads: 2 This Week

Last Update: 2026-03-11

Bacalhau

Community-driven, simple, yet powerful framework

Bacalhau is a decentralized compute platform for running jobs on data stored across distributed networks, like IPFS or Filecoin, without moving the data to centralized cloud environments. It allows developers to run containerized workloads close to where the data lives, reducing latency, cost, and privacy risks. Bacalhau supports various runtime environments and is designed to make decentralized data processing as accessible as traditional cloud computing. It’s especially useful for...

Downloads: 2 This Week

Last Update: 2026-04-18

BrowserAI

Run local LLMs like llama, deepseek, kokoro etc. inside your browser

BrowserAI is a cutting-edge platform that allows users to run large language models (LLMs) directly in their web browser without the need for a server. It leverages WebGPU for accelerated performance and supports offline functionality, making it a highly efficient and privacy-conscious solution. The platform provides a developer-friendly SDK with pre-configured popular models, and it allows for seamless switching between MLC and Transformer engines. Additionally, it supports features such as...

Downloads: 2 This Week

Last Update: 2026-04-18

AlaSQL

JavaScript SQL database for browser and Node.js for relational tables

AlaSQL.js - JavaScript SQL database for browser and Node.js. Handles both traditional relational tables and nested JSON data (NoSQL). Export, store, and import data from localStorage, IndexedDB, or Excel. We focus on speed by taking advantage of the dynamic nature of JavaScript when building up queries. Real-world solutions demand flexibility regarding where data comes from and where it is to be stored. We focus on flexibility by making sure you can import/export and query directly on data...

Downloads: 4 This Week

Last Update: 2026-01-08

Conversational Health Agents (CHA)

A Personalized LLM-powered Agent Framework

CHA, or Conversational Health Agents, is an open-source framework designed to build intelligent healthcare assistants powered by large language models and external data sources. The system enables developers to create personalized AI agents that can interact with users through natural language while performing multi-step reasoning and task execution. It integrates orchestration capabilities that allow the agent to gather information from APIs, knowledge bases, and external services in order to generate more accurate and context-aware responses. ...

Downloads: 0 This Week

Last Update: 2026-03-17

rust-bert

Rust native ready-to-use NLP pipelines and transformer-based models

rust-bert is a Rust-based implementation of transformer-based natural language processing models that provides ready-to-use pipelines for tasks such as text classification, summarization, and question answering. The project ports many capabilities of the Hugging Face Transformers ecosystem into the Rust programming language. It allows developers to run state-of-the-art NLP models like BERT, GPT-2, and DistilBERT directly within Rust applications while maintaining high performance and memory efficiency. ...

Downloads: 0 This Week

Last Update: 2026-03-11

AiLearning-Theory-Applying

Quickly get started with AI theory and practical applications

...Advanced sections explore modern AI topics including transformers, BERT-based natural language processing systems, and practical competition-style machine learning workflows.

Downloads: 0 This Week

Last Update: 2026-03-11

Mem0

The Memory layer for AI Agents

Mem0 is a self-improving memory layer designed for Large Language Model (LLM) applications, enabling personalized AI experiences that save costs and delight users. It remembers user preferences, adapts to individual needs, and continuously improves over time. Key features include enhancing future conversations by building smarter AI that learns from every interaction, reducing LLM costs by up to 80% through intelligent data filtering, delivering more accurate and personalized AI outputs by...

Downloads: 0 This Week

Last Update: 7 days ago

Data Formulator

Create rich visualizations with AI

To create rich visualizations, data analysts often need to iterate back and forth between data processing and chart specification. Doing so requires not only proficiency in data transformation and visualization tools but also effort to manage a branching history made up of many different versions of data and charts. Recent LLM-powered AI systems have greatly improved visualization authoring experiences, for example by mitigating manual data...

Downloads: 0 This Week

Last Update: 2026-05-28

GLM-4.5V

GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

GLM-4.5V is the preceding iteration in the GLM-V series that laid much of the groundwork for general multimodal reasoning and vision-language understanding. It embodies the design philosophy of mixing visual and textual modalities into a unified model capable of general-purpose reasoning, content understanding, and generation, while already supporting a wide variety of tasks: from image captioning and visual question answering to content recognition, GUI-based agents, video understanding,...

Downloads: 2 This Week

Last Update: 2026-05-16

Ollama-rs

A simple and easy-to-use library for interacting with the Ollama API

Ollama-rs is a Rust library designed to provide a simple and efficient interface for interacting with the Ollama API, enabling developers to integrate local large language models into Rust applications. It follows the official Ollama API closely, ensuring compatibility while offering an idiomatic Rust experience with strong typing and asynchronous execution. The library supports a wide range of operations, including text generation, chat interactions, embeddings, and model management, making...

Downloads: 3 This Week

Last Update: 2026-04-20

Agentic Data Scientist

An end-to-end Data Scientist

Agentic Data Scientist is an experimental AI-driven research framework that orchestrates data science workflows through autonomous agents that can reason, plan, and execute complex analytics tasks. Unlike traditional scripted pipelines, this project lets AI agents break down high-level research goals into sub-tasks such as data acquisition, cleaning, modeling, evaluation, and reporting, with minimal human direction. Each agent is designed to independently call functions, interact with data...

Downloads: 3 This Week

Last Update: 2026-05-29

Label Sleuth

Open source no-code system for text annotation and building of text

...Domain experts can quickly start labeling their data through an intuitive user interface. Developed by researchers across industry and academia, Label Sleuth incorporates the latest research from human-computer interaction, natural language processing, and artificial intelligence. Label Sleuth has been designed with an extensible architecture allowing the easy integration of new components, such as additional model architectures or active learning techniques.

Downloads: 3 This Week

Last Update: 2024-06-17

VibeVoice

Open-source multi-speaker long-form text-to-speech model

...A key innovation is its use of continuous acoustic and semantic speech tokenizers operating at an ultra-low frame rate of 7.5 Hz, enabling high audio fidelity with efficient processing of long sequences. The model integrates a Qwen2.5-based large language model with a diffusion head to produce realistic acoustic details and capture conversational context. Training involved curriculum learning with increasing sequence lengths up to 65K tokens, allowing VibeVoice to handle very long dialogues effectively. Safety mechanisms include an audible disclaimer and imperceptible watermarking in all generated audio to mitigate misuse risks.
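To put the 7.5 Hz frame rate in perspective, the arithmetic below converts between audio duration and frame count. It is a back-of-the-envelope sketch, not VibeVoice code: the helper names are invented, "65K" is read as 65,000, and the upper bound assumes every token in the sequence is a speech frame, which overstates the usable audio length because real sequences also carry text tokens.

```python
FRAME_RATE_HZ = 7.5  # tokenizer frame rate quoted in the description


def frames_for_duration(seconds, frame_rate_hz=FRAME_RATE_HZ):
    """Number of whole tokenizer frames needed to cover `seconds` of audio."""
    return int(seconds * frame_rate_hz)


def duration_for_frames(frames, frame_rate_hz=FRAME_RATE_HZ):
    """Seconds of audio represented by `frames` tokenizer frames."""
    return frames / frame_rate_hz


# One minute of audio at 7.5 frames per second.
one_minute = frames_for_duration(60)

# Assumption: "65K" means 65,000 tokens and all of them are speech frames.
context_budget = 65_000
max_minutes = duration_for_frames(context_budget) / 60

print(one_minute, round(max_minutes, 1))
```

At this rate a minute of speech costs only a few hundred frames, which is why sequences in the tens of thousands of tokens can span very long dialogues.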

Downloads: 11 This Week

Last Update: 2026-05-06

pytudes

Python programs, usually short, of considerable difficulty

...The name is inspired by musical études, meaning compact exercises designed to build mastery through focused practice. The repository includes readable solutions, experiments, notebooks, and scripts that cover algorithms, puzzles, probability, search, language processing, simulation, and mathematical reasoning. It is useful for programmers who want to study elegant Python code while learning how experienced developers approach problem solving. Many examples emphasize clarity and compactness rather than framework-heavy engineering. pytudes is best understood as a learning library, a coding style reference, and a set of practical programming studies.

Downloads: 1 This Week

Last Update: 2026-05-29

StoryGen Atelier

AI-assisted storyboard and video generation tool

StoryGen Atelier is an advanced creative tool that blends AI with visual storytelling, making it possible to generate fully structured storyboards and stitched videos from text prompts without requiring manual art or animation skills. Users begin with natural language descriptions of their story or scene, and the system uses state-of-the-art large models to generate both the script and corresponding frames. Once individual frames are created, a second AI model generates transition clips that smoothly link the frames into a coherent short video sequence, and the tool then assembles everything into a finished video using standard video processing tools. ...

Downloads: 1 This Week

Last Update: 2026-02-04

Swirl

Swirl queries any number of data sources with APIs

Swirl queries any number of data sources with APIs and uses spaCy and NLTK to re-rank the unified results without extracting and indexing anything! Includes zero-code configs for Apache Solr, ChatGPT, Elasticsearch, OpenSearch, PostgreSQL, Google BigQuery, RequestsGet, Google PSE, NLResearch.com, Miro & more! SWIRL adapts and distributes queries to anything with a search API (search engines, databases, NoSQL engines, cloud/SaaS services, etc.) and uses AI (Large Language Models) to re-rank...

Downloads: 0 This Week

Last Update: 2026-05-22

LangExtract

A Python library for extracting structured information

...LangExtract supports a wide range of models, including Google Gemini, OpenAI GPT, and local LLMs via Ollama, making it adaptable to different deployment environments and compliance needs. The system excels at handling long documents using optimized chunking, multi-pass extraction, and parallel processing to ensure both high recall and structured consistency.
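The combination of chunking, multi-pass extraction, and parallel processing mentioned above can be sketched with the standard library alone. This is not LangExtract's API: `chunk_text`, `extract_years`, and `extract` are hypothetical names, and a regular expression that finds four-digit years stands in for the model call a real system would make on each chunk.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def chunk_text(text, size=40, overlap=10):
    """Split text into overlapping character windows as (offset, chunk) pairs."""
    step = size - overlap
    last_start = max(len(text) - overlap, 1)
    return [(start, text[start:start + size]) for start in range(0, last_start, step)]


def extract_years(chunk):
    """Hypothetical extractor (stand-in for a model call): find four-digit years."""
    offset, text = chunk
    pattern = r"\b(?:19|20)\d{2}\b"
    return {(offset + m.start(), m.group()) for m in re.finditer(pattern, text)}


def extract(text, passes=((40, 10), (60, 15)), workers=4):
    """Run several passes with different chunk sizes and merge the results.

    Hits are keyed by absolute offset, so a year found in two overlapping
    chunks, or in two passes, is only reported once.
    """
    found = set()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for size, overlap in passes:
            for hits in pool.map(extract_years, chunk_text(text, size, overlap)):
                found |= hits
    return sorted(found)


# Made-up sample document, for illustration only.
document = (
    "The library was first released in 2009. A major rewrite followed in 2014, "
    "and the current maintainers took over in 2021."
)
results = extract(document)
print(results)
```

The overlap is chosen to be longer than the longest item being extracted, so an item split by one chunk boundary still appears whole in a neighbouring chunk; keeping the source offset with each hit is what makes deduplication across chunks and passes straightforward.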

Downloads: 4 This Week

Last Update: 2026-05-20

SALMONN family

A suite of advanced multi-modal LLMs

SALMONN is a family of advanced multi-modal large language models (LLMs) developed by ByteDance and designed to handle and integrate multiple data modalities (e.g. text, audio, video) rather than just plain text. The repository bundles different branches targeting specialized tasks (e.g. video-SALMONN, speech-quality assessment, general multimodal tasks), suggesting that the project is modular and extensible across domains. SALMONN aims to push the frontier of multi-modal AI by allowing models...

Downloads: 0 This Week

Last Update: 2026-05-14

Matrix

Multi-Agent daTa geneRation Infra and eXperimentation framework

Matrix is a distributed, large-scale engine for multi-agent synthetic data generation and experiments. It provides the infrastructure to run thousands of “agentic” workflows concurrently (e.g. multiple LLMs interacting, reasoning, generating content, data-processing pipelines) by leveraging distributed computing (like Ray + cluster management). The idea is to treat data generation as a “data-to-data” transformation: each input item defines a task, and the runtime orchestrates asynchronous, peer-to-peer agent workflows, avoiding global synchronization bottlenecks. That design makes Matrix particularly well-suited for large-batch inference, model benchmarking, data curation, augmentation, or generation, whether for language, code, dialogue, or multimodal tasks. ...

Downloads: 0 This Week

Last Update: 2026-03-05

IMS Toucan

Controllable and fast Text-to-Speech for over 7000 languages

IMS-Toucan is a toolkit for training, using, and teaching state-of-the-art text-to-speech systems, built at the Institute for Natural Language Processing (IMS), University of Stuttgart. It is the official home of ToucanTTS, a massively multilingual TTS system designed to support over 7,000 languages with a single unified framework. The toolkit focuses on being fast and controllable while not requiring huge amounts of compute, making it practical for research labs and smaller teams. ...

Downloads: 0 This Week

Last Update: 2025-11-28

Tokio

A runtime for writing reliable asynchronous applications with Rust

...Tokio is reliable in that its APIs are memory-safe, thread-safe, and misuse-resistant. Thanks to its task scheduler, it is also incredibly fast. It is capable of processing hundreds of thousands of requests per second with little to no overhead.

Downloads: 0 This Week

Last Update: 2026-05-08

Search Results for "language processing" - Page 12

Showing 842 open source projects for "language processing"
