Showing 37 open source projects for "duplicate files windows"

  • 1
    Video-subtitle-extractor

    A GUI tool for extracting hard-coded subtitle (hardsub) from videos

    Extracts hard-coded subtitles (hardsubs) from videos and generates SRT files. A deep learning-based video subtitle extraction framework, including subtitle region detection and subtitle content extraction. Text recognition runs locally, so there is no need to apply for a third-party API or to access online OCR services such as Baidu...
    Downloads: 66 This Week
  • 2
    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 106 This Week
  • 3
    Scribe.js

    JavaScript OCR and text extraction for images and PDFs

    Scribe.js is a JavaScript library that provides Optical Character Recognition (OCR) and text extraction capabilities for both images and PDF documents, aimed at developers who want to build OCR features directly into their applications. The library can take image files (such as PNG or JPEG) and recognize the text they contain, and it can also extract text from PDF files that either already contain text or are image-based scans, using modern web standards and WebAssembly under the hood. In...
    Downloads: 3 This Week
  • 4
    Papermerge

    Open Source Document Management System for Digital Archives

    Papermerge is an open source document management system (DMS) primarily designed for archiving and retrieving your digital documents. Instead of having piles of paper documents all over your desk, office or drawers - you can quickly scan them and configure your scanner to directly upload to Papermerge DMS. Store, organize and index scanned documents in PDF, JPEG and TIFF formats. Instantly find relevant information using full text, tags and metadata-based search. Papermerge is free and...
    Downloads: 18 This Week
  • 5
    Comandi Vocali Offline per Windows

    Offline voice command system for Windows, fast and private.

    ... 👉 New working version: https://voicecommander2multilingual.sourceforge.io/ or download it directly: https://sourceforge.net/projects/voicecommander2multilingual/files/VoiceCommander2.zip/download VoiceCommander 2.0 is stable, improved, and fully operational. Comandi Vocali Offline per Windows is a voice control system that runs entirely locally on your PC. It lets you control the computer by voice with no internet connection, no cloud, and no data sent to external parties. ...
    Downloads: 85 This Week
  • 6
    Hathi Download Helper

    Download books from the hathitrust website in a fast and easy manner

    2025-05-08 PLEASE NOTE: Due to changes to the API of the hathitrust homepage, the HDH is no longer functional! Please check the project Wiki for alternative methods: https://sourceforge.net/p/hathidownloadhelper/alternative/ Hathi Download Helper was a tool for downloading public domain books from hathitrust.org. E-Mail contact:...
    Downloads: 38 This Week
  • 7
    Super PDF Editor (a Batch PDF Processor)

    Create, Edit, Delete, Organize, Convert, Export, Secure & Sign PDF.

    Super PDF Editor - Powerful, superfast, lightweight PDF processor. All-in-one PDF solution, PDF editing with 80+ tools and functions. The easy-to-use software is complete with editing tools for modifying PDF files your way. Most comprehensive, powerful, process-based and lightning-fast batch processor software. OCR PDF. PDF Imposition, Reverse Pages, Resize Page, Scale Page, Booklet, N-up Pages, Merge, Split by page, Extract Page, Rotate Page. Replace Page, Insert Page, Delete Page....
    Downloads: 20 This Week
  • 8
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout AI-driven data processing tool written in C++20, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 4 This Week
  • 9
    MyBox

    Easy Tools of PDF, Image, File, Network, Data, and Medias

    javafx-desktop-apps pdf image ocr icc barcode color-palette text bytes markdown html archive compress digest video audio editor converter media https://github.com/Mararsh/MyBox Self-contained packages need neither a Java environment nor installation. Jar packages need Java 16 or higher.
    Downloads: 4 This Week
  • 10

    Image To Text tools

    ITTT is a free tool designed to scan and extract text from images.

    Image To Text Tools is a 100% free, user-friendly tool designed to scan images and extract the text they contain into editable text formats. Whether you need to extract text from scanned documents, photographs, or other image files, Image To Text Tools provides accurate and reliable Optical Character Recognition (OCR) capabilities to meet your needs.
    Downloads: 10 This Week
  • 11
    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives and other difficult to access resources. A command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image meta-data, scanned documents, maven dependencies and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, so text within a document within a zip archive, and so on. Here you...
    Downloads: 0 This Week
  • 12
    Super-PDF-Editor-Lite

    World's most comprehensive, powerful, process-based PDF editor

    ...OCR works on PDF files, including scanned PDFs, and on image files in multiple formats. Auto and manual image enhancement for better OCR accuracy and quality. Supports 165+ languages with three language data sets. Use multiple languages at once. International languages: 127 languages; High, Medium, and Fast quality.
    Downloads: 5 This Week
  • 13
    Super-PDF-Editor

    World's most comprehensive, powerful, process-based PDF editor

    World's most comprehensive, powerful, process-based and lightning-fast PDF reader, editor and batch processor. PDF editing with 60+ feature-rich tools and functions, such as OCR of PDFs and images with output as searchable PDF, Text, hOCR, Box, or UNLV. Image enhancement before the OCR operation improves OCR performance. PDF imposition, etc. Super PDF Editor is best for bulk PDF processing, especially for the printing industry. Easy PDF imposition, booklet, N-up pages, and more....
    Downloads: 4 This Week
  • 14
    e-Dokyumento

    e-Dokyumento is a web-based Document Management System (DMS)

    e-Dokyumento is an open source, web-based Document Management System (DMS) that automates the basic office document workflow, such as receiving, filing, routing, and approving, through capturing (scanning), digitizing (OCR reading), storing, tagging, and electronically routing and approving (e-signature) electronic documents. # Demo : https://e-dokyumento.herokuapp.com/ https://edokyu.seillig.com/ (refer to Readme.md for the...
    Downloads: 2 This Week
  • 15
    Paperless-ng

    A supercharged version of paperless, scan, index and archive docs

    Paperless is a simple Django application running in two parts, a Consumer (the thing that does the indexing) and a Web server (the part that lets you search & download already-indexed documents). Paper is a nightmare. Environmental issues aside, there’s no excuse for it in the 21st century. It takes up space, collects dust, doesn’t support any form of a search feature, indexing is tedious, it’s heavy and prone to damage & loss. I wrote this to make “going paperless” easier. I do not have to...
    Downloads: 0 This Week
  • 16

    Merge PDF Files

    It is a Windows library that merges standard PDFs into a final PDF

    The library is intended for developers, for inclusion in desktop applications or server services. There are lots of SDKs on the market for creating (merging) PDFs, and almost all of them have limitations. Our Windows library (MergePDFByNMI.dll) only merges standard PDF files (there are several PDF formats). You can send the input PDFs by file name or as a byte array, and you can have the final PDF saved to a file or returned as a byte array. The library calls can be synchronous or asynchronous. As a benchmark, the library was used to create a PDF from single-page scanned images processed by an OCR SDK (not included in our library; you can use any on the market): 20,000 images (the OCR SDK creates single-page, text-searchable PDFs, running 50 threads) in 80 minutes. ...
    Downloads: 1 This Week
  • 17
    Subtitle Workshop

    Free subtitle editor

    Subtitle Workshop is a free application for creating, editing, and converting text-based subtitle files. It supports all the subtitle formats you need and has all the features you would want.
    Downloads: 963 This Week
  • 18
    OCR Web based

    OCR web based for Browser Firefox & PC

    Optical Character Recognition in JS for Browser is based on ocrad.js. OCR for Browser is a free extension that you can use to extract text from any image you supply. Just upload your image files. OCR for Browser accepts JPG, GIF, TIFF, BMP, or PNG files. ========= Get OCR for Android (Beta release) - https://play.google.com/store/apps/details?id=com.ulm.ocr ========= Add-on for Opera: http://bit.ly/1F0E0wP ========= Release 1.0.1 For safety reasons, I disabled...
    Downloads: 0 This Week
  • 19

    cbrTekStraktor

    An application to automatically extract text from comic books.

    cbrTekStraktor is an application to automatically extract text from the text bubbles or speech balloons present in comic book reader files (CBR). Its prime goal is to perform analysis on the texts of comic books. cbrTekStraktor can, however, also be used for scanlation or similar purposes. The application also lets you manually define text areas in CBR files. It includes a simple graphical editor for further processing the extracted text. The text extraction is...
    Downloads: 2 This Week
  • 20
    DoAllWithPDF_servicemenu

    KDE servicemenu for PDF

    Lets KDE users do many things with a right-click on a PDF file.
    Downloads: 0 This Week
  • 21

    WebDjVuTextEd

    Edit the OCR text layer of DjVu documents in a web browser

    WebDjVuTextEd lets you edit the text layer of OCR'ed DjVu documents in a web browser. You can modify the structure (paragraphs, lines, words...), create, delete, and edit text nodes, modify their container box with the mouse, and run a spellchecker. The program does not read DjVu files directly; it requires exported XML text data and images. When using it without a webserver, you can open and save local files, but cannot take advantage of auto-save and spell checking. Note that current SVN...
    Downloads: 0 This Week
  • 22
    DJVU++

    The complete DjVu solution, with OCR technology (Arabic, English).

    DjVu++ is a user-friendly program for manipulating DjVu files such as eBooks, with plenty of editing features. The program offers a free replacement for the proprietary PDF format, with similar resolution and smaller file size. DjVu++ also supports OCR to handle text in scanned books and images. The program shows good performance for English, as well as for Arabic, where it aims to lead free and commercial software in this area. The main features of the DjVu++ program are: o...
    Downloads: 3 This Week
  • 23
    DjVuPlus

    Read DjVu documents, with OCR technology (Arabic, English), small size

    The DjVu Reference Library 3.5 was released by Lizardtech under the GNU General Public License version 2. DjVuLibre-3.5 was developed by Leon Bottou and others as a "Derived Work" of the DjVu Reference Library 3.5. As such, it is also subject to the GNU General Public License version 2. Several patents apply to two very specific aspects of DjVu and DjVuLibre. The patents cover a particular aspect of the ZP-coder (the arithmetic coder used in DjVu and implemented in libdjvu/ZPCodec.cpp)...
    Downloads: 0 This Week
  • 24

    bnlviewer

    METS / ALTO viewer written in Java and Javascript

    The National library of Luxembourg's viewer for METS (http://www.loc.gov/standards/mets/) files with OCR files in the ALTO format. The viewer needs a tomcat application server to run in. It can be deployed so that it reads the METS files from a local folder. Its main use is for digitized newspapers and postcards but can be adapted to other METS profiles as well. The viewer can be seen in action at: http://www.eluxemburgensia.lu Other known users include: National library of Latvia...
    Downloads: 1 This Week
  • 25
    Sanskrit / Hindi - Tesseract OCR

    Devanagari fonts traineddata for Tesseract OCR

    Read https://sourceforge.net/projects/tesseracthindi/files/OCRHindi_using_VietOCR_and_Tesseract.pdf/download for how to use vietocr gui for OCR of Hindi and Sanskrit texts using tesseract-ocr ***** Please see https://github.com/Shreeshrii/ imagessan and imageshin for newer box/tiff pairs, traineddata files, ocr evaluation statistics and ground truth files with images for Sanskrit and Hindi. ***** Following is OLD information - saved only for archival purposes. Tesseract OCR...
    Downloads: 3 This Week