high free download - SourceForge

Showing 17 open source projects for "high"


Filters: Semantic Search, Linux

1

MyScaleDB

A ClickHouse fork that supports high-performance vector search

...The system is built on top of the ClickHouse database engine and extends it with specialized indexing and search capabilities optimized for vector embeddings. This design allows developers to store structured data, unstructured text, and high-dimensional vector embeddings within a single database platform. MyScaleDB enables developers to perform vector similarity searches using standard SQL syntax, eliminating the need to learn specialized vector database query languages. The database is optimized for high performance and scalability, allowing it to handle extremely large datasets and high query loads typical of production AI applications.

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
2

Zvec

A lightweight, lightning-fast, in-process vector database

...Developed by Alibaba’s Tongyi Lab, it positions itself as the “SQLite of vector databases” by being easy to integrate, minimal in dependencies, and capable of handling high throughput with low latency on edge devices or small systems. Zvec excels at approximate nearest neighbor search and retrieval tasks that power features like semantic search, recommendation systems, and retrieval-augmented generation (RAG) setups. Its performance benchmarks show it achieving high queries-per-second and fast index build times compared to similar tools. ...

Downloads: 0 This Week

Last Update: 3 days ago
See Project
3

pgvector

Open-source vector similarity search for Postgres

pgvector is an open-source PostgreSQL extension that equips PostgreSQL databases with vector data storage, indexing, and similarity search capabilities—ideal for embeddings-based applications like semantic search and recommendations. You can add an index to use approximate nearest neighbor search, which trades some recall for speed. Unlike typical indexes, you will see different results for queries after adding an approximate index. An HNSW index creates a multilayer graph. It has better...

Downloads: 49 This Week

Last Update: 2026-06-18
See Project
4

PHP Client For NLP Cloud

NLP Cloud serves high performance pre-trained or custom models for NER

NLP Cloud serves high-performance pre-trained or custom models for NER, sentiment analysis, classification, summarization, dialogue summarization, paraphrasing, intent classification, product description and ad generation, chatbots, grammar and spelling correction, keyword and keyphrase extraction, text generation, image generation, blog post generation, code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search, semantic similarity, tokenization, POS tagging, embeddings, and dependency parsing. ...

Downloads: 2 This Week

Last Update: 2024-11-27
See Project
5

FlagEmbedding

Retrieval and Retrieval-augmented LLMs

FlagEmbedding is an open-source toolkit for building and deploying high-performance text embedding models used in information retrieval and retrieval-augmented generation systems. The project is part of the BAAI FlagOpen ecosystem and focuses on creating embedding models that transform text into dense vector representations suitable for semantic search and large language model pipelines. FlagEmbedding includes a family of models known as BGE (BAAI General Embedding), which are designed to achieve strong performance across multilingual and cross-lingual retrieval benchmarks. ...

Downloads: 1 This Week

Last Update: 2026-04-22
See Project
6

Python Client For NLP Cloud

NLP Cloud serves high performance pre-trained or custom models for NER

NLP Cloud serves high-performance pre-trained or custom models for NER, sentiment analysis, classification, summarization, dialogue summarization, paraphrasing, intent classification, product description and ad generation, chatbots, grammar and spelling correction, keyword and keyphrase extraction, text generation, image generation, blog post generation, source code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search, semantic similarity, tokenization, POS tagging, embeddings, and dependency parsing. ...

Downloads: 0 This Week

Last Update: 2024-11-27
See Project
7

Node.js Client For NLP Cloud

NLP Cloud serves high performance pre-trained or custom models

This is the Node.js client (with TypeScript types) for the NLP Cloud API. NLP Cloud serves high-performance pre-trained or custom models for NER, sentiment analysis, classification, summarization, dialogue summarization, paraphrasing, intent classification, product description and ad generation, chatbots, grammar and spelling correction, keyword and keyphrase extraction, text generation, image generation, blog post generation, code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search, semantic similarity, tokenization, POS tagging, embeddings, and dependency parsing. ...

Downloads: 0 This Week

Last Update: 2024-11-27
See Project
8

SemTools

Semantic search and document parsing tools for the command line

SemTools is an open-source command-line toolkit designed for document parsing, semantic indexing, and semantic search workflows. The project focuses on enabling developers and AI agents to process large document collections and extract meaningful semantic representations that can be searched efficiently. Built with Rust for performance and reliability, the toolchain provides fast processing of text and structured documents while maintaining low system overhead. SemTools can parse documents,...

Downloads: 3 This Week

Last Update: 2026-03-13
See Project
9

OpenViking

Context database designed specifically for AI Agents

OpenViking is an open-source context database engineered for efficient indexing and retrieval of large amounts of unstructured or semi-structured context data used by AI applications. It’s primarily designed to serve as a high-performance, scalable backend for storing app context, embeddings, conversational histories, and other textual artifacts that need rapid lookup and semantic search, which makes it especially useful for systems like chatbots or memory-augmented agents. The project is implemented with performance in mind, often leveraging optimized data structures that balance fast reads and writes with minimal resource consumption. ...

Downloads: 2 This Week

Last Update: 3 days ago
See Project
10

CocoIndex

ETL framework to index data for AI, such as RAG

...It lets users index and retrieve content based on meaning rather than keywords, making it ideal for modern AI-based search applications. CocoIndex leverages vector embeddings and integrates with various models and frameworks, including OpenAI and Hugging Face, to provide high-quality semantic understanding. It’s built for transparency, ease of use, and local control over your search data, distinguishing itself from closed, black-box systems. The tool is suitable for developers working on personal knowledge bases, AI search interfaces, or private LLM applications.

Downloads: 2 This Week

Last Update: 2 days ago
See Project
11

LEANN

Local RAG engine for private multimodal knowledge search on devices

...LEANN introduces a storage-efficient approximate nearest neighbor index combined with on-the-fly embedding recomputation to avoid storing large embedding vectors. By recomputing embeddings during queries and using compact graph-based indexing structures, LEANN can maintain high search accuracy while minimizing disk usage. It aims to act as a unified personal knowledge layer that connects different types of data such as documents, code, images, and other local files into a searchable context for language models.

Downloads: 0 This Week

Last Update: 2026-03-13
See Project
12

Controllable-RAG-Agent

This repository provides an advanced RAG

...The pipeline ingests PDFs, splits them into chapters, cleans and preprocesses text, then constructs vector stores for fine-grained chunks, chapter summaries, and book quotes to support nuanced queries. At query time, it anonymizes entities, creates a high-level plan, de-anonymizes and expands that plan into concrete retrieval or reasoning tasks, and executes them in sequence while continuously revising the plan. A key focus is hallucination control: each answer is verified against retrieved context, and responses are reworked when they are not sufficiently grounded in the source documents.

Downloads: 0 This Week

Last Update: 2026-06-04
See Project
13

Burn To The Brim

Utility for efficiently grouping files and folders together

Burn To The Brim is a highly efficient archiving utility designed to solve the classic subset-sum (bin packing) optimization problem. It selects and groups files and directories (documents, high-fidelity media, or raw backups) to optimally fill recordable Blu-ray discs, USB drives, or custom-capacity storage drives. By recursively scanning your designated folders, BTTB matches item sizes to your media capacity, finding a near-perfect selection in milliseconds and a perfect packing configuration in just a few seconds. ...

Downloads: 1 This Week

Last Update: 2026-06-12
See Project
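
The selection problem described in the Burn To The Brim entry above (choose files whose combined size fills a medium as fully as possible) can be illustrated with a small subset-sum sketch. This is not the tool's actual algorithm, which the listing does not describe. The function name `best_fill`, the file names, and the sizes (whole units such as megabytes) are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def best_fill(sizes: Dict[str, int], capacity: int) -> Tuple[int, List[str]]:
    """Return the largest total <= capacity and one set of items reaching it.

    Illustrative subset-sum sketch; sizes are assumed to be whole units
    (for example megabytes) so the table of reachable totals stays small.
    """
    # Maps each reachable total to one list of item names that produces it.
    reachable: Dict[int, List[str]] = {0: []}
    for name, size in sizes.items():
        # Iterate over a snapshot so each item is used at most once.
        for total, chosen in list(reachable.items()):
            new_total = total + size
            if new_total <= capacity and new_total not in reachable:
                reachable[new_total] = chosen + [name]
    best = max(reachable)
    return best, reachable[best]


if __name__ == "__main__":
    # Hypothetical file sizes in megabytes and a 1000 MB target medium.
    files = {"a.iso": 700, "b.mkv": 450, "c.zip": 300, "d.tar": 250}
    total, chosen = best_fill(files, 1000)
    print(total, chosen)
```

The table of reachable totals grows with the capacity, so a sketch like this only suits coarse size units; a real tool working at byte granularity would need a different strategy.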
14

finetuner

Task-oriented finetuning for better embeddings on neural search

...With Finetuner, you can easily enhance the performance of pre-trained models, making them production-ready without extensive labeling or expensive hardware. Create high-quality embeddings for semantic search, visual similarity search, cross-modal text image search, recommendation systems, clustering, duplication detection, anomaly detection, or other uses. Bring considerable improvements to model performance, making the most out of as little as a few hundred training samples, and finish fine-tuning in as little as an hour.

Downloads: 0 This Week

Last Update: 2023-08-21
See Project
15

hora

Efficient approximate nearest neighbor search algorithm collections

...The library is written in Rust and emphasizes performance, safety, and efficient memory management, making it suitable for production-grade applications requiring low latency and high throughput.

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
16

bge-large-en-v1.5

BGE-Large v1.5: High-accuracy English embedding model for retrieval

BAAI/bge-large-en-v1.5 is a powerful English sentence embedding model designed by the Beijing Academy of Artificial Intelligence to enhance retrieval-augmented language model systems. It uses a BERT-based architecture fine-tuned to produce high-quality dense vector representations optimized for sentence similarity, search, and retrieval. This model is part of the BGE (BAAI General Embedding) family and delivers improved similarity distribution and state-of-the-art results on the MTEB benchmark. It is recommended for use in document retrieval tasks, semantic search, and passage reranking, particularly when paired with a reranker like BGE-Reranker. ...

Downloads: 0 This Week

Last Update: 2025-07-02
See Project
17

bge-base-en-v1.5

Efficient English embedding model for semantic search and retrieval

bge-base-en-v1.5 is an English sentence embedding model from BAAI optimized for dense retrieval tasks, part of the BGE (BAAI General Embedding) family. It is a fine-tuned BERT-based model designed to produce high-quality, semantically meaningful embeddings for tasks like semantic similarity, information retrieval, classification, and clustering. This version (v1.5) improves retrieval performance and stabilizes similarity score distribution without requiring instruction-based prompts. With 768 embedding dimensions and a maximum sequence length of 512 tokens, it achieves strong performance across multiple MTEB benchmarks, nearly matching larger models while maintaining efficiency. ...

Downloads: 0 This Week

Last Update: 2025-07-01
See Project
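
Several entries above describe embedding models whose vectors are compared for semantic similarity. As a minimal sketch of that comparison step, the code below ranks documents by cosine similarity to a query vector. Cosine similarity is assumed here as the metric because it is a common choice for dense embeddings; the listing does not state which metric any of these projects uses. The tiny two-dimensional vectors and the names `cosine_similarity` and `rank_by_similarity` are hypothetical stand-ins, not output from a real model.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], docs: Dict[str, Sequence[float]]
) -> List[str]:
    """Return document names ordered from most to least similar to the query."""
    return sorted(
        docs,
        key=lambda name: cosine_similarity(query, docs[name]),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical 2-dimensional "embeddings"; real models emit hundreds of dimensions.
    documents = {"near": [0.9, 0.1], "far": [0.0, 1.0]}
    print(rank_by_similarity([1.0, 0.0], documents))
```

This exhaustive comparison is an exact search; the approximate nearest neighbor indexes mentioned in other entries exist to avoid scanning every vector at larger scales.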