Showing 14 open source projects for "pdf java linearize"

  • 1
    Free Manga Downloader

    Forked from https://sf.net/p/fmd/

    The Free Manga Downloader (FMD) is an open source application written in Object Pascal for managing and downloading manga from various websites. This is a mirror of the main repository on GitHub. For feedback or bug reports, visit https://github.com/riderkick/FMD
    Downloads: 275 This Week
  • 2
    OpenSearchServer Search Engine

    An open source search engine with a RESTful API and crawlers

    OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.), and the client libraries (REST API, Ruby, Rails, Node.js, PHP, Perl), you can quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on...
    Downloads: 3 This Week
  • 3

    eXtensible Text Framework (XTF)

    Framework for search and display of heterogeneous document collections.

    NOTICE: This code repository is deprecated. Please visit https://github.com/cdlib/xtf for the latest updates. Obsolete Description: The eXtensible Text Framework (XTF) is an architecture that supports searching across collections of heterogeneous textual data (XML, PDF, HTML, text, and more), and the presentation of results and documents in a highly configurable manner. Includes highly customized versions of the proven open-source components Lucene and Saxon.
    Downloads: 0 This Week
  • 4
    Regain is a Java search engine based on Jakarta Lucene. It indexes and searches files in many formats (HTML, XML, doc(x), xls(x), ppt(x), oo, PDF, RTF, mp3, mp4, Java). A tag library eases integrating search results into your JSP-based web page.
    Downloads: 3 This Week
  • 5
    IDRA (InDexing and Retrieving Automatically) is a tool for indexing a wide range of text files (TXT, DOC, PDF) and image annotation files (XML). It also supports query-based searching, visualizing an index, saving an index for reuse, evaluation, etc.
    Downloads: 0 This Week
  • 6

    Free Manga Downloader

    The Free Manga Downloader (FMD) is an open source application written in Object-Pascal for managing and downloading manga from various websites such as AnimeA, Batoto, MangaFox, MangaStream, ...
    Downloads: 132 This Week
  • 7
    DocInfoRetriever is a web-based full-text document search engine based on Lucene. It lets you search the contents and metadata of documents. Supported formats include doc, xls, pdf, odt, jpg, and others, as well as torrent files.
    Downloads: 0 This Week
  • 8
    Booletin is a search engine for official gazettes (BOE, BOCM, etc.) that includes an email alert system. It uses Apache Lucene to index the PDF content of Spain's official gazettes.
    Downloads: 0 This Week
  • 9
    SearchSaver is a web search tool that lets you search multiple search engines simultaneously and export selected results to XML (RSS, Atom) or PDF files. It presents the search results in a tabbed interface as well as a tree-style explorer view.
    Downloads: 0 This Week
  • 10
    PDFBox is a Java PDF Library. This project will allow access to all of the components in a PDF document. More PDF manipulation features will be added as the project matures. This ships with a utility to take a PDF document and output a text file.
    Downloads: 3 This Week
  • 11
    Spencer is a Java-based, web-hosted filesystem indexing application. It indexes files on network shares, reads inside MSOffice, Open/StarOffice, PDF and zip files and provides a web interface to the index with search functions to find the file you want.
    Downloads: 0 This Week
  • 12
    This project implements a complete enterprise solution based on Lucene. It's a smart engine implemented to index numerous file formats (pdf, ps, xls, doc, ppt). The engine can index file systems (filtering), databases, mailing folders, web sites and
    Downloads: 0 This Week
  • 13
    A 100% Java multithreaded search engine. The client and server communicate over TCP/IP. To index objects, it fetches documents over HTTP and parses HTML, PDF, XML, and plain-text files. Artlight use
    Downloads: 0 This Week
  • 14
    A SOAP-based Document/File-Sharing solution written in Java. It includes a basic web-interface but other clients are possible. You can share and download all common office document formats like MS Word, Excel, OpenOffice and PDF.
    Downloads: 0 This Week