Open source libraries and APIs to build custom preprocessing pipelines
Instill Core is a full-stack AI infrastructure tool for data
Superlinked is a Python framework for AI Engineers
Parse files for optimal RAG
AI-Powered Data Processing: Use LOTUS to process all of your datasets
Extract schema, statistics and entities from datasets
Claude Code skill for generating production-quality SVG+PNG technical
Autonomous LLM agent for end-to-end data science workflows
Context database designed specifically for AI Agents
Central interface to connect your LLMs with external data
A modular graph-based Retrieval-Augmented Generation (RAG) system
Python module for parsing semi-structured text into Python tables
A system for agentic LLM-powered data processing and ETL
AI-data warehouse to enrich, transform and analyze unstructured data
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine
No-code LLM Platform to launch APIs and ETL Pipelines
Lightweight library for scraping websites with LLMs
Training data (data labeling, annotation, workflow) for all data types
Airweave lets agents search any app
Synthetic data generators for structured and unstructured text
Deterministic LLM Outputs for AI Applications and AI Agents
Open-Source Financial Large Language Models
Open-source choice to scale, assess and maintain natural language data
The data structure for multimodal data
Deals with all kinds of unstructured data, for applications such as reverse image search