Showing 19 open source projects for "unstructured data"

  • 1
    Milvus

    Vector database for scalable similarity search and AI applications

    Milvus is an open-source vector database built to power embedding similarity search and AI applications. Milvus makes unstructured data search more accessible and provides a consistent user experience regardless of the deployment environment. Milvus 2.0 is a cloud-native vector database with storage and computation separated by design. All components in this refactored version of Milvus are stateless to enhance elasticity and flexibility. Average latency is measured in milliseconds on trillion-vector datasets. ...
    Downloads: 10 This Week
  • 2
    Scanopy

    Clean network diagrams. One-time setup, zero upkeep

    Scanopy is a powerful multi-modal data capture and analysis toolkit that enables users to collect, process, and visualize structured and unstructured information from a variety of sources in a flexible pipeline. It is built to handle complex scanning tasks — such as OCR, document analysis, audio transcription, network data capture, and image extraction — while providing unified APIs and workflows that make managing heterogeneous data sources seamless. ...
    Downloads: 23 This Week
  • 3
    Gretel Synthetics

    Synthetic data generators for structured and unstructured text

    Unlock unlimited possibilities with synthetic data. Share, create, and augment data with cutting-edge generative AI. Generate unlimited data in minutes with synthetic data delivered as a service. Synthesize data that is as good as or better than your original dataset, and maintain relationships and statistical insights. Customize privacy settings so that data is always safe while remaining useful for downstream workflows. Ensure data accuracy and privacy confidently with expert-grade reports....
    Downloads: 9 This Week
  • 4
    DataChain

    AI-data warehouse to enrich, transform and analyze unstructured data

    DataChain enables multimodal API calls and local AI inference to run in parallel over many samples as chained operations. The resulting datasets can be saved, versioned, and sent directly to PyTorch and TensorFlow for training. DataChain can persist features of Python objects returned by AI models and enables vectorized analytical operations over them. Typical use cases are data curation, LLM analytics and validation, image segmentation, pose detection, and GenAI alignment. Datachain...
    Downloads: 5 This Week
  • 5
    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    ...Training Data is the art of supervising machines through data. This includes annotation, which produces structured data ready to be consumed by a machine learning model. Annotation is required because raw media is considered unstructured and is not usable without it. That’s why training data is required for many modern machine learning use cases, including computer vision, natural language processing, and speech recognition.
    Downloads: 9 This Week
  • 6
    Pimcore

    Open Source Data & Experience Management Platform

    Whether you're dealing with unstructured web documents or structured data for MDM/PIM, you define the UI design (web documents with a template, structured data with an intuitive graphical editor) and Pimcore persists the data efficiently, optimized for fast access. Thanks to its framework approach, Pimcore is very flexible and adapts to your needs.
    Downloads: 1 This Week
  • 7
    Gridap.jl

    Grid-based approximation of partial differential equations in Julia

    Gridap provides a set of tools, written in the Julia programming language, for the grid-based approximation of partial differential equations (PDEs). The library currently supports linear and nonlinear PDE systems for scalar and vector fields, single and multi-field problems, and conforming and nonconforming finite element (FE) discretizations, on structured and unstructured meshes of simplices and n-cubes. It also provides methods for time integration. Gridap is extensible and modular. One can...
    Downloads: 7 This Week
  • 8
    LinearSolve.jl

    High-Performance Unified Interface for Linear Solvers in Julia

    LinearSolve.jl is a unified interface for the linear solving packages of Julia. It interfaces with other packages of the Julia ecosystem to make it easy to test alternative solver packages and pass small types to control algorithm swapping. It also interfaces with the ModelingToolkit.jl world of symbolic modeling to allow for automatically generating high-performance code. Performance is key: the current methods are made to be highly performant on scalar and statically sized small problems,...
    Downloads: 3 This Week
  • 9
    FinGPT

    Open-Source Financial Large Language Models

    FinGPT is an open-source, finance-specialized large language model framework that blends the capabilities of general LLMs with real-time financial data feeds, domain-specific knowledge bases, and task-oriented agents to support market analysis, research automation, and decision support. It extends traditional GPT-style models by connecting them to live or historical financial datasets, news APIs, and economic indicators so that outputs are grounded in relevant and recent market conditions...
    Downloads: 10 This Week
  • 10
    Relaticle

    The Next-Generation Open-Source CRM Platform written with Laravel

    ...The interface lets you write plain text notes and tag or connect them dynamically, making it easier to uncover patterns and connections over time instead of losing insights in a long, unstructured list. Because it’s built with productivity and exploration in mind, Relaticle offers fast search, semantic context awareness, and the ability to zoom from high-level overviews down to specific node details. It also supports self-hosting so users retain full control over their data without relying on third-party servers or cloud subscriptions.
    Downloads: 4 This Week
  • 11
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20 AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, enabling efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 8 This Week
  • 12
    BDS

    Blockchain data parsing and persisting results

    JD Cloud Blockchain Data Service (BDS) is a real-time data aggregation, analysis, and visualization service for chain-like unstructured data from all kinds of third-party blockchains. Splitter is the key module of BDS and provides its data analysis capability. Splitter is responsible for consuming blockchain data from a message queue (Kafka) and inserting it into persistent data storage services (relational database, data warehouse, etc.) for further processing. ...
    Downloads: 1 This Week
  • 13
    TEXT2DATA

    Text Analytics Platform

    Bring a text analytics platform that uses NLP (Natural Language Processing) and machine learning to your work environment. Extract essential information from your text documents and let artificial intelligence save you time. Get detailed and agile reports on your unstructured data.
    Downloads: 0 This Week
  • 14

    iCubing

    Several OLAP algorithms, data structures and HPC OLAP versions

    OLAP technology is very useful for decision makers and for data mining tools working with big data. To that end, the iCubing project implements several multidimensional data cube approaches for cube indexing, querying, updating, and mining. There are also several cube types, i.e. alphanumeric cubes, text cubes with unstructured data, and geo cubes with geo types, dimensions, measures, and hierarchies, so OLAP remains a hard challenge more than 20 years after the seminal paper by Jim Gray et al. in 1997. ...
    Downloads: 0 This Week
  • 15
    Moved to sf.net/projects/cloveretl/. CloverETL is a Java ETL framework that transforms structured or unstructured data. It works as a standalone application or embedded in other applications as a data transformation library.
    Downloads: 24 This Week
  • 16
    Single-click, real-time searching of both structured and unstructured data and information. Simultaneous searching of structured data (databases) and unstructured data (documents) from within a web browser, desktop application, or application plugin.
    Downloads: 0 This Week
  • 17
    The Hardware Assisted Visibility Sorting (HAVS) algorithm is a GPU-based, direct volume renderer for unstructured grids. The algorithm operates in both object- and image-space and includes a sample-based, dynamic level-of-detail algorithm.
    Downloads: 0 This Week
  • 18
    bitHull is a simple unstructured data store-and-share mechanism. It is part experimental graph-based task/note/idea management system and part data aggregator.
    Downloads: 0 This Week
  • 19
    The Enterprise Knowledge Base (EKB) from ModelDriven.org is a repository for enterprise knowledge: metadata, planning and governance. The EKB can manage and transform both structured and unstructured data as files, Eclipse-EMF or RDF ontologies.
    Downloads: 0 This Week