Showing 396 open source projects for "data integration"

  • 1
    Universal Sentence Encoder

    Encoder of greater-than-word length text trained on a variety of data

    The Universal Sentence Encoder (USE) is a pre-trained deep learning model designed to encode sentences into fixed-length embeddings for use in various natural language processing (NLP) tasks. It leverages Transformer and Deep Averaging Network (DAN) architectures to generate embeddings that capture the semantic meaning of sentences. The model is designed for tasks like sentiment analysis, semantic textual similarity, and clustering, and provides high-quality sentence representations in a...
    Downloads: 0 This Week
  • 2
    CodinIT.dev

    Free, local, open-source AI app builder

    ...A natural-language API enables powerful data queries and updates, automating tasks without leaving the chat interface. By running entirely locally, CodinIT.dev delivers maximum privacy, minimal latency, and smooth developer experiences free from cloud-based inconsistencies.
    Downloads: 2 This Week
  • 3
    PyDenseCRF

    Python wrapper to Philipp Krähenbühl's dense (fully connected) CRFs

    PyDenseCRF is a Python library that provides a wrapper around the implementation of fully connected Conditional Random Fields (CRFs) developed by Philipp Krähenbühl and Vladlen Koltun. The project allows developers and researchers to integrate Dense CRF inference into Python-based machine learning pipelines, particularly for computer vision tasks such as image segmentation and labeling. Conditional Random Fields are probabilistic graphical models used to model contextual relationships...
    Downloads: 0 This Week
  • 4
    sketch

    AI code-writing assistant that understands data content

    Sketch is an open-source AI-powered data analysis assistant designed specifically for pandas users, enabling natural language interaction with tabular datasets to generate code, insights, and transformations. It works by summarizing the structure and statistical properties of a dataset and providing that context to a language model, allowing it to generate highly relevant and accurate responses tailored to the data. The tool integrates directly into pandas dataframes through an extension,...
    Downloads: 0 This Week
  • 5
    LLaMA-MoE

    Building Mixture-of-Experts from LLaMA with Continual Pre-training

    LLaMA-MoE is an open-source project that builds mixture-of-experts language models from LLaMA through expert partitioning and continual pre-training. The repository is centered on making MoE research more accessible by offering smaller and more affordable models with only about 3.0 to 3.5 billion activated parameters, which helps reduce deployment and experimentation costs. Its architecture works by splitting LLaMA feed-forward networks into sparse experts and adding gating mechanisms so...
    Downloads: 0 This Week
  • 6
    CLIP-as-service

    Embed images and sentences into fixed-length vectors

    ...Smooth integration with neural search ecosystem including Jina and DocArray. Build cross-modal and multi-modal solutions in no time.
    Downloads: 0 This Week
  • 7
    towhee

    Framework that is dedicated to making neural data processing

    ...Towhee provides out-of-the-box integration with your favorite libraries, tools, and frameworks, making development quick and easy. Towhee includes a pythonic method-chaining API for describing custom data processing pipelines. We also support schemas, making processing unstructured data as easy as handling tabular data.
    Downloads: 0 This Week
  • 8
    Adala

    Adala: Autonomous DAta (Labeling) Agent framework

    Adala is a data-centric AI framework focused on dataset curation, annotation, and validation. It helps AI teams manage high-quality training datasets by providing tools for data auditing, error detection, and quality assessment.
    Downloads: 0 This Week
  • 9
    DPM-Solver

    Fast ODE Solver for Diffusion Probabilistic Model Sampling

    DPM-Solver is a machine learning research implementation focused on accelerating the sampling process in diffusion probabilistic models used for generative AI tasks. Diffusion models are powerful generative systems capable of producing high-quality images and other data, but traditional sampling methods often require hundreds or thousands of computational steps. The project introduces a specialized numerical solver designed to approximate the diffusion process using a small number of high-order integration steps. By reformulating the sampling problem as the solution of a diffusion-related ordinary differential equation, the solver can produce high-quality samples much more efficiently. ...
    Downloads: 0 This Week
  • 10
    RAGs

    Build ChatGPT over your data, all with natural language

    RAGs is an open-source application designed to simplify the creation of retrieval-augmented generation pipelines through an interactive interface. Built with Streamlit and powered by the LlamaIndex ecosystem, the tool allows users to construct AI assistants that answer questions using their own data sources. Instead of requiring extensive programming knowledge, the application allows users to configure and build a RAG system using natural language instructions. The system automatically...
    Downloads: 1 This Week
  • 11
    DB-GPT-Hub

    A repository that contains models, datasets, and fine-tuning

    DB-GPT-Hub is an open-source repository that provides datasets, models, and training tools designed to improve large language models for database interaction tasks, particularly Text-to-SQL. The project serves as a specialized extension of the broader DB-GPT ecosystem, focusing on the preparation and evaluation of models capable of translating natural language questions into structured database queries. It offers a modular framework that supports data preparation, model fine-tuning,...
    Downloads: 0 This Week
  • 12
    YiVal

    Your Automatic Prompt Engineering Assistant for GenAI Applications

    ...It focuses on experimentation and optimization by allowing users to test multiple prompt variations, configurations, and model parameters in parallel, then evaluate their outputs using structured metrics and scoring systems. The platform is particularly useful in production environments where prompt quality directly impacts user experience, as it provides a repeatable and data-driven approach to refining prompts rather than relying on manual trial and error. YiVal supports integration with various LLM providers and can orchestrate experiments across different models, making it adaptable to evolving AI ecosystems. It also includes evaluation pipelines that help quantify output quality based on criteria such as accuracy, coherence, or task-specific benchmarks.
    Downloads: 0 This Week
  • 13
    Prem AI

    Prem provides a unified environment to develop AI applications

    ...For example, all models of type Chat expose the OpenAI API for easy integration with existing tools and the AI app ecosystem. Each service we support is published on the Prem Registry.
    Downloads: 0 This Week
  • 14
    Alink

    Alink is the Machine Learning algorithm platform based on Flink

    Alink is Alibaba’s scalable machine learning algorithm platform built on Apache Flink, designed for batch and stream data processing. It provides a wide variety of ready-to-use ML algorithms for tasks like classification, regression, clustering, recommendation, and more. Written in Java and Scala, Alink is suitable for enterprise-grade big data applications where performance and scalability are crucial. It supports model training, evaluation, and deployment in real-time environments and...
    Downloads: 0 This Week
  • 15
    Chinese Llama 2 7B

    The first Chinese LLaMA2 model in the open source community

    Chinese Llama 2 7B is an open-source large language model adapted from the LLaMA-2 architecture and optimized for Chinese and bilingual Chinese-English applications. The project provides a version of LLaMA-2 that has been further trained on Chinese data so it can better understand and generate text in Chinese while maintaining compatibility with the original model ecosystem. In addition to the model weights, the repository also includes supervised fine-tuning datasets and training resources...
    Downloads: 0 This Week
  • 16
    fastquant

    Backtest and optimize your ML trading strategies with only 3 lines

    fastquant is a Python library designed to simplify quantitative financial analysis and algorithmic trading strategy development. The project focuses on making backtesting accessible by providing a high-level interface that allows users to test investment strategies with only a few lines of code. It integrates historical market data sources and trading frameworks so that users can quickly build experiments without constructing complex data pipelines. The framework enables users to test common...
    Downloads: 0 This Week
  • 17
    Lingua-Go

    The most accurate natural language detection library for Go

    Lingua-Go is a Golang implementation of the Lingua language detection library, providing efficient and accurate language identification for Go-based applications. Its task is simple: It tells you which language some text is written in. This is very useful as a preprocessing step for linguistic data in natural language processing applications such as text classification and spell checking. Other use cases, for instance, might include routing e-mails to the right geographically located...
    Downloads: 0 This Week
  • 18
    AnyTrading

    The most simple, flexible, and comprehensive OpenAI Gym trading

    gym-anytrading is an OpenAI Gym-compatible environment designed for developing and testing reinforcement learning algorithms on trading strategies. It simulates trading environments for financial markets, including stocks and forex.
    Downloads: 0 This Week
  • 19
    LlamaGPT

    Self-hosted ChatGPT-like chatbot powered by Llama models locally

    LlamaGPT is a self-hosted chatbot application designed to provide a conversational AI experience similar to ChatGPT while running entirely on local hardware. It uses Llama-based large language models to generate responses and operates without requiring external AI services. Because the system runs locally, it keeps all interactions and data on the user's device, enabling a fully private environment for experimentation with AI chat interfaces. LlamaGPT includes both a user interface and an API...
    Downloads: 2 This Week
  • 20
    AI-powered enterprise search engine

    AI-powered enterprise search engine

    Gerev is an open-source, AI-powered enterprise search engine designed to help organizations quickly locate and retrieve information scattered across multiple internal tools, documents, and communication platforms. It enables users to search across sources such as Slack, Confluence, Jira, Google Drive, and other enterprise systems, consolidating fragmented knowledge into a single, unified search experience. By leveraging natural language processing, Gerev allows...
    Downloads: 1 This Week
  • 21
    PromethAI

    Open-source framework that gives you AI Agents

    PromethAI-Backend is a backend framework for AI-driven automation and knowledge extraction. It is designed to integrate with large language models (LLMs) to provide AI-enhanced workflows, including content generation, summarization, and data analysis.
    Downloads: 0 This Week
  • 22
    learn2learn

    A PyTorch Library for Meta-learning Research

    Learn2Learn is a PyTorch-based library focused on meta-learning and few-shot learning research. It provides reusable components and meta-learning algorithms, making it easier to build, train, and evaluate models that can quickly adapt to new tasks with minimal data. Learn2Learn is widely used in research for tasks such as few-shot classification, reinforcement learning, and optimization.
    Downloads: 0 This Week
  • 23
    TensorFlow Ranking

    Learning to rank in TensorFlow

    TensorFlow Ranking is a library for Learning-to-Rank (LTR) techniques on the TensorFlow platform. It provides commonly used loss functions, including pointwise, pairwise, and listwise losses; commonly used ranking metrics such as Mean Reciprocal Rank (MRR) and Normalized Discounted Cumulative Gain (NDCG); multi-item (also known as groupwise) scoring functions; a LambdaLoss implementation for direct ranking metric optimization; and unbiased Learning-to-Rank from biased feedback data. We envision that this library...
    Downloads: 0 This Week
  • 24
    PIFuHD

    High-Resolution 3D Human Digitization from A Single Image

    PIFuHD (Pixel-Aligned Implicit Function for 3D human reconstruction at high resolution) is a method and codebase to reconstruct high-fidelity 3D human meshes from a single image. It extends prior PIFu work by increasing resolution and detail, enabling fine geometry in cloth folds, hair, and subtle surface features. The method operates by learning an implicit occupancy / surface function conditioned on the image and camera projection; at inference time it queries dense points to reconstruct a...
    Downloads: 5 This Week
  • 25
    Alpa

    Training and serving large-scale neural networks

    Alpa is a system for training and serving large-scale neural networks. Scaling neural networks to hundreds of billions of parameters has enabled dramatic breakthroughs such as GPT-3, but training and serving these large-scale neural networks require complicated distributed system techniques. Alpa aims to automate large-scale distributed training and serving with just a few lines of code.
    Downloads: 0 This Week