Showing 22 open source projects for "text processing"

  • 1
    Swiss File Knife

    One hundred command line tools in a small and portable binary.

    Create zip files, extract zip files, replace text in files, search in files using expressions, stream text editor, instant command line ftp and http server, send folder via network, copy folder excluding sub folders and files, find duplicate files, run a command on all files of a folder, split and join large files, make md5 checksum lists of files, remove tab characters, convert CR/LF, list newest or biggest files of a folder, compare folders, treesize, show first or last lines of a file,...
    Downloads: 400 This Week
  • 2
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20 AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, enabling efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. ...
    Downloads: 1 This Week
  • 3

    UniversalTextExtractor

    Command-line toolset for extracting text from files

    Command-line toolset for extracting text from files (documents, images, archives) into SQLite with OCR support. Simple, expandable, one shell script only.
    Downloads: 0 This Week
  • 4
    AI File Sorter

    Local AI file organization with categorization and rename suggestions

    AI File Sorter is a cross-platform desktop application that uses AI (local LLMs run on your computer) to organize files and suggest meaningful file names based on real content, not just filenames or extensions. The app can analyze images locally and propose descriptive rename suggestions (for example, IMG_2048.jpg → clouds_over_lake.jpg). It can also analyze document text to improve categorization and renaming. Supported formats include PDF, DOCX, XLSX, PPTX, ODT, ODS, ODP, and common...
    Downloads: 516 This Week
  • 5

    Safe Harbor Deidentification

    Safe Harbor Deidentification for medical documents

    Phalanx - Deidentify: the Safe Harbor Deidentification Mode of Phalanx is an abridged pipeline of NLP annotators, culminating in NER annotators that write text offsets as output. It uses the Safe Harbor deidentification method.
    Downloads: 0 This Week
  • 6

    PanDocElectron

    Graphical User Interface for PanDoc for Linux, Mac & Windows

    PanDoc Graphical User Interface implemented with Electron for Linux, Mac and Windows. It supports users in converting source documents into various other formats like docx, odt, html and reveal documentation. The zip files contain the full source code because PanDocElectron is written in HTML/JavaScript. Electron is used more or less as a browser that runs the HTML/JavaScript application. [Download PanDocElectron](https://sourceforge.net/p/pandocelectron/wiki/Home/) Extract the zip-file...
    Downloads: 1 This Week
  • 7
    GitSync is a shell script designed to simplify the use of the version control system Git (see www.git-scm.com for more information) by providing a "do everything to sync my repository" command.
    Downloads: 3 This Week
  • 8
    Drag and drop files, directories, or HTML URLs into a Java GUI, then perform text operations on the files and write the results to output files. Operations include concatenation, text and regex editing, and other file/string/row/column/script operations.
    Downloads: 0 This Week
  • 9
    PTools is a set of useful tools written in Pascal. It includes a scientific calculator, archiver, text editor, remote administration, and more. It is designed to be portable across operating systems, especially Java-based mobiles, Windows and Unixes.
    Downloads: 6 This Week
  • 10
    Filecmp is a command-line application that takes two filenames as arguments and outputs the result of comparing them, e.g. whether they are the same or not. It may look irrelevant, but sometimes it's very useful, especially inside scripts.
    Downloads: 0 This Week
  • 11
    This application reads the output of web forms posted on your website (usually via email) and converts it to CSV files for importing into a database or managing in Excel.
    Downloads: 0 This Week
  • 12
    Smart Id3 Tag from filename. Rename file from Tag. Full Regex Search & Replace. Intuitive parsing. Smart file numbering. Artist Album report. Split compilation album. Configurable text processing. mp3, flac, ogg, mp4, m4a, mp4p + more. Java.
    Downloads: 0 This Week
  • 13
    csvtoxml parses CSV (comma-separated value) data and converts it into XML. It is a command-line console utility that reads stdin and writes stdout, so it can be piped with more, cat, pr, wget, zip, and find -exec for added functionality. Keywords: file stream term c c++ small fast parser unix win osx
    Downloads: 0 This Week
  • 14
    Visual xsltproc is a tool that helps you write an XSLT file and debug it to find errors. It writes XML and generates XML (with XML syntax highlighting and line numbers). Finally, if the result is XSL-FO, it generates a PDF using Apache FOP (Java). Built on Qt 4.2.
    Downloads: 0 This Week
  • 15
    This project utilizes the iPod's ability to store and display short text files to allow you to view RSS Feeds, Weather Forecasts, Movie Showtimes, and other text documents on your iPod when you are away from your computer
    Downloads: 0 This Week
  • 16
    A file backup/transfer program that moves and processes files between computers in a production (industrial) process. Processing of a file begins when it is created in or copied to a directory, and files can be redirected to another machine for use.
    Downloads: 0 This Week
  • 17
    Full implementation of ISO 2022 files (ECMA-35) as a library.
    Downloads: 0 This Week
  • 18
    The FileNamer for EML allows you to rename a lot of .eml files like 001.eml, 002.eml, 003.eml to something more descriptive.
    Downloads: 0 This Week
  • 19
    A Simple Tool Kit for Programmers and general users. The purpose of this project is to benefit both the programmers and the end users.
    Downloads: 0 This Week
  • 20
    Latexss is a program that transforms a text file with a very minimal style syntax into a self-contained LaTeX spreadsheet powered by the fp package. A graphical front-end (klatexss) is provided to generate both the text and the LaTeX file.
    Downloads: 0 This Week
  • 21
    CTA (conversor de ficheros de texto, "text file converter") is a program to change the format of one or multiple text files between Unix text file format and DOS/Windows text file format. With this easy program you can see text files correctly, without annoying symbols or comp
    Downloads: 0 This Week
  • 22
    Guber, short for Gutenberg renamer, renames text files provided by the Gutenberg project into the format "Author, Title" by automatically extracting the relevant information from the text file. Guber can do single files or batch processing of a directo
    Downloads: 0 This Week