Showing 81 open source projects for "extraction"

  • 1
    OCR Web based

    OCR web based for Browser Firefox & PC

    ...id=com.ulm.ocr | Add-on for Opera: http://bit.ly/1F0E0wP | Release 1.0.1: For safety reasons, I disabled importing an image from a URL. You can also write or draw directly on the canvas and have the result run through character recognition and text extraction.
    Downloads: 0 This Week
  • 2
    Parallec

    Fast Parallel Async HTTP/SSH/TCP/UDP/Ping Client Java Library

    Fast parallel async HTTP/SSH/TCP/UDP/Ping client Java library on Akka. Aggregate 100,000 APIs and send results anywhere in 20 lines of code. View production use cases. Ping or make HTTP calls to 8,000 servers, with responses aggregated in 12 seconds. Parallec means Parallel Client (pronounced "para-like"). Open source from eBay Cloud. A convenient response context passes any object you need when handling a response. Process data any way and send it anywhere. Intuitive builder pattern APIs make...
    Downloads: 0 This Week
  • 3
    Turbo Download Manager

    A portable modern multi-threading download manager for all platforms

    A modern multi-threaded download manager for Windows, Linux, Mac OS, Firefox, Chrome, Opera and Android devices. For bug reports visit: https://github.com/inbasic/turbo-download-manager/issues For FAQs visit: http://add0n.com/turbo-download-manager.html Turbo Download Manager is a stand-alone application without any dependencies. It should run out of the box. Just set the download location when adding the first job request. If you have a browser and would like to integrate this...
    Downloads: 79 This Week
  • 4
    Bifrozt

    High interaction honeypot solution for Linux based systems

    NOTICE: This project has changed format from an ISO image to Ansible and has moved to GitHub. GitHub link: https://github.com/Bifrozt/bifrozt-ansible
    Downloads: 0 This Week
  • 5
    PDF Clown

    General-Purpose PDF Library for Java and .NET

    PDF Clown is a general-purpose Java and .NET library for manipulating PDF files through multiple abstraction layers, rigorously adhering to the PDF 1.7 specification (ISO 32000-1). This project aims to provide universal access to PDF files (creation, reading, editing, rendering...) through an accurate and elegant object-oriented API. * Features: http://pdfclown.org/overview/features/ * Overview: http://pdfclown.org/overview/architecture/ * Website: http://pdfclown.org/ * Blog:...
    Downloads: 3 This Week
  • 6
    webStraktor is a programmable World Wide Web data extraction client. Its purpose is to scrape HTML-based content via the HTTP protocol and extract relevant information. webStraktor features a scripting language to facilitate the collection, extraction and storage of information available on the web, including images. The scripting language uses elements of regular expression and XPath syntax.
    Downloads: 0 This Week
  • 7
    DownloadDaemon
    DownloadDaemon is a comfortable download manager with many features, such as one-click-hoster support. It can be remote-controlled in several ways (web/GUI/console clients), which makes it well suited to file and root servers as well as local use.
    Downloads: 5 This Week
  • 8
    Flexible Survey - online survey system

    Simple and extensible survey system

    Flexible Survey is open source, simple and extensible software to create online surveys. Features:
      - Surveys defined in XML
      - Wide and extensible set of available questions
      - Branching
      - AND/OR/NOT operators
      - Answers filtering
      - Answers randomization
      - Results stored in PostgreSQL or CSV file
    A good solution for users who demand highly flexible software.
    Downloads: 0 This Week
  • 9
    Constellio Enterprise Search engine

    Open source Search Engine and Enterprise Search

    Constellio is an enterprise search engine that allows companies to search all their organization's information through a single interface (Web, CRM, ERP, ECM, Mail, etc.). Constellio is based on Apache Solr and Google Search Appliance's connector. Constellio has a powerful web crawler.
    Downloads: 0 This Week
  • 10
    eTube Downloader

    Download YouTube videos and songs fast and easy

    eTube Downloader is a simple 2-step program to download YouTube videos and optionally extract their audio. Its straightforward 2-step wizard lets you select whether you want to download songs (extract audio from videos) or videos, then enter the desired download quality and the YouTube URLs, and the download starts.
    Downloads: 2 This Week
  • 11
    SeerSuite
    SeerSuite is an application toolkit for digital libraries and search engines, e.g., CiteSeerX. CiteSeerX has moved to GitHub; please get the latest code from: https://github.com/SeerLabs/CiteSeerX
    Downloads: 0 This Week
  • 12
    Customizable browser-based file editor environment (text and web/WYSIWYG) in PHP (GPL licensed) with loads of features. Tested only in Firefox.
    Downloads: 0 This Week
  • 13
    Graph-based Extraction and Summarization - a generic graph-based summarization framework. Basic functionality is provided - third-party modules can be plugged in.
    Downloads: 0 This Week
  • 14
    Please visit http://imgv.sf.net/ - IMGV is a cross-platform image viewer. Features include slideshows, EXIF viewing, histograms, gamma correction, adjustable thumbnails, playlists, website image extraction, multi-dir loading, movies, and much more.
    Downloads: 3 This Week
  • 15
    The purpose of the Metabrain library is to give developers a way to extract this information from the Internet without resorting to natural language parsing or other complex techniques, using instead statistical methods and patterns/trends analysis.
    Downloads: 0 This Week
  • 16
    FX Player is a Web-based streaming server with a Flash iTunes-like interface. It shares your MP3 library and allows access to your tracks through the Internet. Coded in Java, FX Player runs on most platforms, including Mac OS X, Windows, Linux and Unix.
    Downloads: 0 This Week
  • 17
    A PHP class to make life easier for developers.
    Downloads: 0 This Week
  • 18
    Marquee Plus
    This is an add-on plugin/module for WordPress and Joomla that helps you place marquees in your WordPress blog or Joomla site. Visit http://www.etdsonline.com/technical/plugins-and-modules for more information on these plugins and modules.
    Downloads: 0 This Week
  • 19
    Folksonomy Web Crawler
    A Web crawler prototype designed to index pages of certain resource sharing platforms based on folksonomy tags. The results are displayed in an Excel spreadsheet.
    Downloads: 0 This Week
  • 20
    A web scraper written in Java that is simple to set up. It uses a modified regex syntax that lets you quickly write complex patterns to parse data out of a website. It includes a GUI tool for testing your configuration scripts and can be fully automated through the command line.
    Downloads: 0 This Week
  • 21
    The Cornell Web Lab Collaboration Server is a suite of tools and services for GUI-based extraction, analysis and sharing of archived web data. See http://weblab.infosci.cornell.edu/ and http://www.cs.cornell.edu/~weigel for details about the project.
    Downloads: 0 This Week
  • 22
    The Metadata Express is a web application for maintaining and browsing documentation of databases, a.k.a. metadata. It offers automatic extraction, a method of creating description links between objects, and an issue tracking system, and it is very easy to use.
    Downloads: 0 This Week
  • 23
    WebNews Crawler is a specialized web crawler (spider, fetcher) designed to acquire and clean news articles from RSS and HTML pages. It can perform site-specific extraction to pull out only the actual news content, filtering out the advertising and other cruft.
    Downloads: 0 This Week
  • 24
    FOXY is a filtering web proxy. Originally designed to provide device-independent access to the World Wide Web, it may also be used for HTTP filtering, extraction and reauthoring of existing web content, or as a security device against web-based attacks.
    Downloads: 4 This Week
  • 25
    GoldSeeker is a small application for extracting formatted data. It can parse information from a text, HTML or other file and export it to a database.
    Downloads: 0 This Week