Showing 19 open source projects for "extract"

View related business solutions
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 1
    DocTR

    DocTR

    Library for OCR-related tasks powered by Deep Learning

    DocTR provides an easy and powerful way to extract valuable information from your documents. Seemlessly process documents for Natural Language Understanding tasks: we provide OCR predictors to parse textual information (localize and identify each word) from your documents. Robust 2-stage (detection + recognition) OCR predictors with pretrained parameters. User-friendly, 3 lines of code to load a document and extract text with a predictor.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    docext

    docext

    An on-premises, OCR-free unstructured data extraction

    ...This allows the system to detect and extract structured elements such as tables, signatures, key fields, and layout information while maintaining semantic understanding of the document content. The toolkit can also convert complex documents into structured markdown representations that preserve formatting and contextual relationships.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    deepfakes_faceswap

    deepfakes_faceswap

    Deepfakes Software For All

    Faceswap is the leading free and open source multi-platform deepfakes software. When faceswapping was first developed and published, the technology was groundbreaking, it was a huge step in AI development. It was also completely ignored outside of academia because the code was confusing and fragmentary. It required a thorough understanding of complicated AI techniques and took a lot of effort to figure it out. Until one individual brought it together into a single, cohesive collection.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 4
    Video-subtitle-extractor

    Video-subtitle-extractor

    A GUI tool for extracting hard-coded subtitle (hardsub) from videos

    Video hard subtitle extraction, generate srt file. There is no need to apply for a third-party API, and text recognition can be implemented locally. A deep learning-based video subtitle extraction framework, including subtitle region detection and subtitle content extraction. A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files. Use local OCR recognition, no need to set up and call any API, and do not need to access online OCR services such as Baidu...
    Downloads: 61 This Week
    Last Update:
    See Project
  • Application Monitoring That Won't Slow Your App Down Icon
    Application Monitoring That Won't Slow Your App Down

    AppSignal's Rust-based agent is lightweight and stable. Already running in thousands of production apps.

    Full APM with errors, performance, logs, and uptime monitoring. 99.999% uptime SLA on the platform itself.
    Start Free
  • 5
    SimpleHTR

    SimpleHTR

    Handwritten Text Recognition (HTR) system implemented with TensorFlow

    ...The project focuses on converting images of handwritten text into machine-readable digital text using neural networks. The system uses a combination of convolutional neural networks and recurrent neural networks to extract visual features and model sequential character patterns in handwriting. It also employs connectionist temporal classification (CTC) to align predicted character sequences with input images without requiring character-level segmentation. The repository provides code for training models, performing inference on handwritten text images, and evaluating recognition accuracy. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    fastdup

    fastdup

    An unsupervised and free tool for image and video dataset analysis

    fastdup is a powerful free tool designed to rapidly extract valuable insights from your image & video datasets. Assisting you to increase your dataset images & labels quality and reduce your data operations costs at an unparalleled scale.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    pyAudioAnalysis

    pyAudioAnalysis

    Python Audio Analysis Library: Feature Extraction, Classification

    pyAudioAnalysis is an open-source Python library designed for audio signal analysis, machine learning, and music information retrieval tasks. The project provides a collection of tools that allow developers to extract meaningful features from audio files and use those features for classification, segmentation, and analysis. The library supports multiple audio processing workflows, including feature extraction from raw audio signals, training of machine learning models, and automatic audio segmentation. It also includes utilities for visualizing audio features and analyzing patterns within sound recordings, which can be useful in applications such as speech recognition, music classification, and acoustic event detection. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    TimeMixer

    TimeMixer

    Decomposable Multiscale Mixing for Time Series Forecasting

    ...Instead of relying on traditional recurrent or transformer-based architectures, TimeMixer is implemented as a fully multilayer perceptron–based model that performs temporal mixing across different resolutions of the data. The architecture introduces specialized components such as Past-Decomposable-Mixing blocks, which extract information from historical sequences at different scales, and Future-Multipredictor-Mixing modules that combine predictions from multiple forecasting paths. This design allows the model to integrate complementary information across scales and produce more accurate predictions for complex temporal patterns.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Semantra

    Semantra

    Multi-tool for semantic search

    ...The software analyzes text and PDF documents stored locally and creates embeddings that allow queries to retrieve results based on conceptual similarity. It is primarily intended for individuals who need to extract insights from large document collections, including researchers, journalists, students, and historians. The system runs from the command line and automatically launches a local web interface where users can perform interactive searches and examine document passages related to a query. By relying on semantic embeddings and contextual analysis, the tool can identify passages that are relevant even when the query uses different wording than the source documents.
    Downloads: 0 This Week
    Last Update:
    See Project
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 10
    ktrain

    ktrain

    ktrain is a Python library that makes deep learning AI more accessible

    ktrain is a Python library that makes deep learning and AI more accessible and easier to apply. ktrain is a lightweight wrapper for the deep learning library TensorFlow Keras (and other libraries) to help build, train, and deploy neural networks and other machine learning models. Inspired by ML framework extensions like fastai and ludwig, ktrain is designed to make deep learning and AI more accessible and easier to apply for both newcomers and experienced practitioners. With only a few lines...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Ubix Linux

    Ubix Linux

    The Pocket Datalab

    Ubix stands for Universal Business Intelligence Computing System. Ubix Linux is an open-source, Debian-based Linux distribution geared towards data acquisition, transformation, analysis and presentation. Ubix Linux purpose is to offer a tiny but versatile datalab. Ubix Linux is easily accessible, resource-efficient and completely portable on a simple USB key. Ubix Linux is a perfect toolset for learning data analysis and artificial intelligence basics on small to medium...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    DeepKE

    DeepKE

    An Open Toolkit for Knowledge Graph Extraction and Construction

    ...DeepKE is a knowledge extraction toolkit supporting cnSchema, standard supervised, low-resource, and document-level scenarios for entity, relation, and attribution extraction. It allows developers and researchers to customize datasets and models to extract information from unstructured texts. DeepKE supports low-resource settings with only a few labeled (e.g., 16/32 shot) instances for widespread information extraction tasks. Since relations are expressed over multiple sentences in real-world applications, DeepKE supports document-level relation extraction. We present a new open-source and extensible knowledge extraction toolkit, called DeepKE, supporting standard fully supervised, low-resource few-shot and document-level scenarios.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    imgaug

    imgaug

    Image augmentation for machine learning experiments

    ...Affine transformations, perspective transformations, contrast changes, gaussian noise, dropout of regions, hue/saturation changes, cropping/padding, blurring, etc. Rotate image and segmentation map on it by the same value sampled. Convert keypoints to distance maps, extract pixels within bounding boxes from images, clip polygon to the image plane, etc. Scale segmentation maps, average/max pool of images/maps, pad images to aspect ratios (e.g. to square them). Draw heatmaps, segmentation maps, keypoints, bounding boxes, etc.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Image classification models for Keras

    Image classification models for Keras

    Keras code and weights files for popular deep learning models

    All architectures are compatible with both TensorFlow and Theano, and upon instantiation the models will be built according to the image dimension ordering set in your Keras configuration file at ~/.keras/keras.json. For instance, if you have set image_dim_ordering=tf, then any model loaded from this repository will get built according to the TensorFlow dimension ordering convention, "Width-Height-Depth". Pre-trained weights can be automatically loaded upon instantiation (weights='imagenet'...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Five video classification methods

    Five video classification methods

    Code that accompanies my blog post outlining five video classification

    Classifying video presents unique challenges for machine learning models. As I’ve covered in my previous posts, video has the added (and interesting) property of temporal features in addition to the spatial features present in 2D images. While this additional information provides us more to work with, it also requires different network architectures and, often, adds larger memory and computational demands.We won’t use any optical flow images. This reduces model complexity, training time, and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    TexLexAn is an open source text analyser for Linux, able to estimate the readability and reading time, to classify and summarize texts. It has some learning abilities and accepts html, doc, pdf, ppt, odt and txt documents. Written in C and Python.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Pronac MediaMonkey Extension

    Pronac MediaMonkey Extension

    Recommends music based upon your current taste.

    A music recommendation engine. It is meant to be an add-on for popular media players like Winamp, Amarok, Rhythmbox or Banshee. Currently supports only MediaMonkey Player. Downlaod, extract and run "pronac.exe". Play the first song from the Now Playing list, it'll recommend you next songs from the same list. NOTE: MAKE SURE THAT SONG SHUFFLE IS TURNED OFF WHILE USING PRONAC. Based upon K-Nearest Neighbor Machine Learning Algorithm, K-Fold Cross Validation and EchoNest for audio features.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Spider that recollects data from MySpace Social Network. At now, it is only designed to extract information from native american people because it is used for a social science study in the UNAM (Universidad Nacional Autónoma de México).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    A Python function library to extract EEG feature from EEG time series in standard Python and numpy data structure. Features include classical spectral analysis, entropies, fractal dimensions, DFA, inter-channel synchrony and order, etc.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB