Showing 32 open source projects for "office open xml"

  • 1
    Open Notebook

    An Open Source implementation of Notebook LM with more flexibility

    Open Notebook enables users to organize and analyze multi-modal content such as PDFs, videos, audio files, web pages, and Office documents. It combines full-text and vector search with context-aware AI chat to deliver insights grounded in your own research materials. With advanced features like multi-speaker podcast generation, customizable content transformations, and a comprehensive REST API, Open Notebook provides a powerful and extensible research environment.
    Downloads: 12 This Week
  • 2
    MarkItDown

    Python tool for converting files and office documents to Markdown

    MarkItDown is a lightweight Python utility developed by Microsoft for converting various files and office documents to Markdown format. It is particularly useful for preparing documents for use with large language models and related text analysis pipelines.
    Downloads: 7 This Week
  • 3
    Paperless-ngx

    A community-supported supercharged version of paperless

    Paperless-ngx is a community-supported open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.
    Downloads: 15 This Week
  • 4
    text-extract-api

    Document (PDF, Word, PPTX ...) extraction and parse API

    text-extract-api is an open-source service designed to extract readable text from a wide variety of document formats through a simple API interface. The project focuses on converting complex files such as PDFs, images, scanned documents, and office files into structured plain text that can be processed by downstream applications or language models. Instead of requiring developers to integrate multiple document parsing libraries individually, the system centralizes text extraction capabilities into a unified API that standardizes the output. ...
    Downloads: 0 This Week
  • 5
    Papermerge

    Open Source Document Management System for Digital Archives

    Papermerge is an open source document management system (DMS) primarily designed for archiving and retrieving your digital documents. Instead of having piles of paper documents all over your desk, office or drawers - you can quickly scan them and configure your scanner to directly upload to Papermerge DMS. Store, organize and index scanned documents in PDF, JPEG and TIFF formats.
    Downloads: 16 This Week
  • 6
    ScrapeGraphAI

    Python scraper based on AI

    Extracting content from websites and local documents using LLMs. ScrapeGraphAI is a web scraping Python library that uses LLMs and direct graph logic to create scraping pipelines for websites and local documents (XML, HTML, JSON, Markdown, etc.). Just say which information you want to extract and the library will do it for you.
    Downloads: 5 This Week
  • 7
    PasteMD

    Paste Markdown and AI responses into Word and Excel instantly

    PasteMD is a lightweight desktop utility designed to streamline the process of transferring formatted content from the clipboard into office applications such as Word, WPS, and Excel. It primarily targets users who frequently copy content from AI chat tools or web pages and encounter formatting issues, especially with Markdown, tables, and LaTeX formulas. PasteMD operates from the system tray and monitors clipboard content, automatically converting Markdown or HTML into properly formatted...
    Downloads: 2 This Week
  • 8
    Android Use

    Automate native Android apps with AI using accessibility APIs

    android-action-kernel is an open source Python library designed to let AI agents control and automate native Android applications running on real devices or emulators. It fills a gap in automation tooling by focusing on mobile-first workflows where traditional browser or desktop-based automation doesn’t work, such as logistics, gig work, field operations, and other industries reliant on phones or tablets.
    Downloads: 4 This Week
  • 9
    Paper2Slides

    From Paper to Presentation in One Click

    Paper2Slides is an automation tool that converts research papers, reports, and other documents into polished slide decks and posters with minimal manual effort. It is designed to replace the repetitive work of turning dense technical documents into presentation-friendly structure by extracting key points, figures, and data into a coherent visual narrative. The system supports multiple input formats, so you can process PDFs and common office documents rather than being locked to a single file...
    Downloads: 0 This Week
  • 10
    files-to-prompt

    Concatenate a directory full of files into a single prompt

    files-to-prompt is a Python command-line tool that takes one or more files or entire directories and concatenates their contents into a single, LLM-friendly prompt. It walks the directory tree, outputting each file preceded by its relative path and a separator, so a model can understand which content came from where. The tool is aimed at workflows where you want to ask an LLM questions about a whole codebase, documentation set, or notes folder without manually copying files together. It...
    Downloads: 0 This Week
  • 11
    AiHound

    AI powered image classification for nudity and documents / id-cards

    AI Hound is designed to run from a USB pen drive or any other kind of removable, writable media. The program checks all Office documents, images, and videos against various image categories. Currently it can recognize nudity/porn and scanned or photographed documents, ID cards, and credit cards. I am working on a model that also recognizes various types of drugs in images.
    Downloads: 0 This Week
  • 12
    Menagerie

    A collection of high-quality models for the MuJoCo physics engine

    MuJoCo Menagerie, developed by Google DeepMind, is a curated collection of high-quality simulation models designed for use with the MuJoCo physics engine. It serves as a comprehensive library of accurate and ready-to-use robotic, biomechanical, and mechanical models, ensuring users can perform reliable simulations without having to build or tune models from scratch. The repository aims to improve reproducibility and quality across robotics research by providing verified models that adhere to...
    Downloads: 2 This Week
  • 13
    e-Dokyumento

    e-Dokyumento is a web-based Document Management System (DMS)

    e-Dokyumento is an open-source, web-based Document Management System (DMS) that automates the basic office document workflow, such as receiving, filing, routing, and approving, through capturing (scanning), digitizing (OCR), storing, tagging, and electronically routing and approving (e-signature) electronic documents. Demo: https://e-dokyumento.herokuapp.com/ https://edokyu.seillig.com/ (refer to Readme.md for the...
    Downloads: 3 This Week
  • 14
    Paperless-ng

    A supercharged version of paperless, scan, index and archive docs

    Paperless is a simple Django application running in two parts, a Consumer (the thing that does the indexing) and a Web server (the part that lets you search & download already-indexed documents). Paper is a nightmare. Environmental issues aside, there’s no excuse for it in the 21st century. It takes up space, collects dust, doesn’t support any form of a search feature, indexing is tedious, it’s heavy and prone to damage & loss. I wrote this to make “going paperless” easier. I do not have to...
    Downloads: 0 This Week
  • 15
    BerryNet

    Deep learning gateway on Raspberry Pi and other edge devices

    This project turns edge devices such as Raspberry Pi into an intelligent gateway with deep learning running on it. No internet connection is required, everything is done locally on the edge device itself. Further, multiple edge devices can create a distributed AIoT network. At DT42, we believe that bringing deep learning to edge devices is the trend towards the future. It not only saves costs of data transmission and storage but also makes devices able to respond according to the events...
    Downloads: 0 This Week
  • 16
    EverydayWechat

    Python tool that automates WeChat messages, replies, & group utilities

    EverydayWechat is a Python-based automation tool designed to enhance and automate interactions on the WeChat messaging platform. Built using Python 3 and the Itchat library, it connects to the web version of WeChat to perform various automated messaging tasks. It allows users to send scheduled messages to friends or group chats, including daily weather updates, reminders, inspirational quotes, and other personalized content. It also supports intelligent automatic replies to incoming messages...
    Downloads: 0 This Week
  • 17
    LabelImg

    Graphical image annotation tool and label object bounding boxes

    ...Click 'Change default saved annotation folder' in Menu/File. Click 'Open Dir'. Click 'Create RectBox'. Click and release left mouse to select a region to annotate the rect box. You can use right mouse to drag the rect box to copy or move it. The annotation will be saved to the folder you specify. You can refer to the hotkeys to speed up your workflow.
    Downloads: 100 This Week
  • 18
    PyTom

    http://www.sciencedirect.com/science/article/pii/S1047847711003492

    PyTom is a toolbox developed for interpreting cryo electron tomography data. All steps from reconstruction, localization, alignment and classification are covered with standard and improved methods. Please sign up to our mailing list to keep up with the most recent updates and versions.
    Downloads: 0 This Week
  • 19

    PyAIMLng

    The Next Generation of Python AIML Interpreter

    A Python AIML interpreter with non-compliant extensions. PyAIMLng is an interpreter for AIML (the Artificial Intelligence Markup Language), forked from Cort Stratton's PyAIML. PyAIMLng adds additional features which are not part of the AIML 1.0.1 specification in order to provide the bot master with a rich set of tools from which to build a more believable AIML bot.
    Downloads: 0 This Week
  • 20
    C++ library for working with OWL ontologies
    Downloads: 0 This Week
  • 21

    mwetoolkit

    THIS PROJECT MIGRATED TO https://gitlab.com/mwetoolkit/mwetoolkit3/

    The Multiword Expressions toolkit aids in the automatic identification and extraction of multiword units in running text. These include idioms (kick the bucket), noun compounds (cable car), phrasal verbs (take off, give up), etc. Even though it focuses on multiword expressions, the framework is quite complete and can also be useful in any corpus-based study in computational linguistics. The mwetoolkit can be...
    Downloads: 0 This Week
  • 22

    Domotic Speech-recognition interface

    Speech-recognition interface for a domotic system.

    This product recognizes oral commands and translates them to domotic orders for a domotic system. This product does not implement a domotic system. This product is an interface to be plugged into a domotic system. The speech recognition is done by an Arduino UNO board and an EasyVR shield. Available oral commands are generated from a house description file in XML format. The oral commands have to be trained for a specific user. For this purpose 2 interfaces are provided: a command line...
    Downloads: 0 This Week
  • 23
    Soar
    Soar is a general cognitive architecture for developing systems that exhibit intelligent behavior. Researchers all over the world, both from the fields of artificial intelligence and cognitive science, are using Soar for a variety of tasks.
    Downloads: 0 This Week
  • 24
    Spyse is a software framework for building multi-agent systems. It allows Python developers to build distributed intelligent systems of multiple cooperative agents based on FIPA, OWL, SOA and many others. Spyse is designed for ease-of-use and fun.
    Downloads: 0 This Week
  • 25
    iDocs is an intellectual document workflow project with text mining options.
    Downloads: 0 This Week