Showing 35 open source projects for "parse files"

  • 1
    pre-commit-hooks

    Some out-of-the-box hooks for pre-commit

    Some out-of-the-box hooks for pre-commit. Using pre-commit-hooks with pre-commit. Instead of loading the files, simply parse them for syntax. A syntax-only check enables extensions and unsafe constructs which would otherwise be forbidden. Using this option removes all guarantees of portability to other yaml implementations. Detect symlinks which are changed to regular files whose content is the path that the symlink was pointing to. This usually happens on Windows when a user clones a repository that has symlinks but they do not have permission to create symlinks. ...
    Downloads: 0 This Week
  • 2
    LlamaParse

    Parse files for optimal RAG

    LlamaParse is a GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). Load in 160+ data sources and data formats, from unstructured and semi-structured to structured data (APIs, PDFs, documents, SQL, etc.). Store and index your data for different use cases. Integrate with 40+ vector stores, document stores, graph stores, and SQL db providers.
    Downloads: 2 This Week
  • 3
    Notoma

    Use Notion as your blogging editor, with any static gen blog engine

    Use Notion as your blogging editor, with any static gen blog engine. Notoma converts Notion pages to Markdown files. Convert contents of your Notion Blog database to a bunch of Markdown files. Watch Notion Blog database for updates and regenerate Markdown files on any updates. Create a new Notion database for your Blog with all required fields.
    Downloads: 0 This Week
  • 4
    yt-dlp

    A youtube-dl fork with additional features and fixes

    yt-dlp is a youtube-dl fork based on the now inactive youtube-dlc. The main focus of this project is adding new features and patches while also keeping up to date with the original project.
    Downloads: 592 This Week
  • 5
    python-bibtexparser v2

    Bibtex parser for Python 3

    Welcome to python-bibtexparser, a parser for .bib files with a long history and wide adoption. Bibtexparser is available in two versions: v1 and v2. For new projects, we recommend using v2 which, in the long run, will provide an overall more robust and faster experience. For now, however, note that v2 is an early beta and does not contain all features of v1.
    Downloads: 0 This Week
  • 6
    text-extract-api

    Document (PDF, Word, PPTX ...) extraction and parse API

    text-extract-api is an open-source service designed to extract readable text from a wide variety of document formats through a simple API interface. The project focuses on converting complex files such as PDFs, images, scanned documents, and office files into structured plain text that can be processed by downstream applications or language models. Instead of requiring developers to integrate multiple document parsing libraries individually, the system centralizes text extraction...
    Downloads: 1 This Week
  • 7
    PaperQA2

    High accuracy RAG for answering questions from scientific documents

    PaperQA2 is a package for doing high-accuracy retrieval augmented generation (RAG) on PDFs or text files, with a focus on the scientific literature. See our recent 2024 paper to see examples of PaperQA2's superhuman performance in scientific tasks like question answering, summarization, and contradiction detection. In this example we take a folder of research paper PDFs, magically get their metadata - including citation counts and a retraction check, then parse and cache PDFs into a full-text search index, and finally answer the user question with an LLM agent.
    Downloads: 2 This Week
  • 8
    videodl

    Lightweight Python tool for downloading videos from many platforms

    Videodl is a lightweight video downloader implemented entirely in Python that allows users to retrieve videos from a wide range of online media platforms. It focuses on providing a fast and simple way to parse video pages and download media files, often prioritizing high-definition versions without watermarks when available. It supports numerous video platforms across both Chinese and international streaming ecosystems, enabling users to fetch content from many popular services through a unified interface. Videodl works by implementing platform-specific client modules that extract video information and download links from supported services. ...
    Downloads: 3 This Week
  • 9
    Nexa is an advanced tool that's used to parse and compare Ericsson XML files (mainly the 3GPP version). This tool can plot RF data as well as sites and cells, making it very handy for day-to-day operations. Current functions: 1. Parse BSC/RNC/4G/5G/IoT XML files from Ericsson. 2. Compare the configuration of the selected XML files. 3. Generate the Bulk-Configuration files. 4.
    Downloads: 4 This Week
  • 10
    PhiPsi

    An eXtended Finite Element Method (XFEM) Software.

    PhiPsi is a 2D and 3D computational solid mechanics program, which involves the extended finite element method (XFEM), as well as the finite element method (FEM). PhiPsi is written in Fortran and compiled using the GNU Fortran compiler (gfortran). PPView is a visualization tool for PhiPsi. PPView can be used to import Abaqus inp files, view the model defined in the PhiPsi keywords file (*.kpp), edit the PhiPsi keywords file, perform a PhiPsi simulation, and view the simulation result files...
    Downloads: 3 This Week
  • 11
    PathPicker

    Accepts a wide range of input, output from git commands & grep results

    PathPicker accepts a wide range of input, output from git commands, grep results, searches, pretty much anything. After parsing the input, PathPicker presents you with a nice UI to select which files you're interested in. After that you can open them in your favorite editor or execute arbitrary commands. Facebook PathPicker is a simple command line tool that solves the perpetual problem of selecting files out of bash output. Bash is fully supported and works the best. ZSH is supported as...
    Downloads: 0 This Week
  • 12
    pyment

    Formats and converts Python docstrings and generates patches

    ...It will parse one or several Python scripts and retrieve existing docstrings. Then, for all found functions/methods/classes, it will generate formatted docstrings with parameters and default values. At the end, patches can be generated for each file. Then, one can apply the patches to the initial scripts. It is also possible to update the files directly without generating patches, or to output on stdout.
    Downloads: 0 This Week
  • 13
    Rdbtools

    Parse Redis dump.rdb files, Analyze Memory, and Export Data to JSON

    Rdbtools is a parser for Redis' dump.rdb files. The parser generates events similar to an XML SAX parser and is very efficient memory-wise. Rdbtools is written in Python, though there are similar projects in other languages. Every run of RDB Tool requires specifying a command to indicate what should be done with the parsed RDB data. Valid commands are JSON, diff, justkeys, justkeyvals and protocol.
    Downloads: 1 This Week
  • 14
    Zeus Scanner

    Advanced reconnaissance utility

    ...It combines URL parsing, search engine querying, crawling, proxy support, and vulnerability assessment workflows in one tool. The scanner can work with multiple search engines, extract URLs from Google ban and webcache URLs, and parse robots.txt or sitemap.xml files. It also supports proxy configurations, Tor proxy compatibility, and Tor browser emulation for flexible routing during authorized assessments. Zeus-Scanner includes checks for issues such as XSS, SQL injection, clickjacking, exposed admin panels, port scanning, whois lookup, and header protection. ...
    Downloads: 0 This Week
  • 15
    Rekall

    Rekall Memory Forensic Framework

    Rekall is a powerful memory forensics framework that turns raw RAM captures—or live system state—into structured artifacts investigators can query and script. It ships with a large collection of plugins that parse OS internals to recover processes, modules, sockets, registry hives, and file objects, even when rootkits try to hide them. The design emphasizes repeatability: investigators run well-defined analyses that produce timelines, indicators, and reports suitable for case work or...
    Downloads: 17 This Week
  • 16
    Command Line Parser GetPot

    Tool to parse the command line and configuration files.

    Powerful command line and configuration file parsing for C++, Python, Ruby and Java (others to come). This tool provides many features, such as separate treatment for options, variables, and flags, unrecognized object detection, prefixes and much more.
    Downloads: 2 This Week
  • 17
    LogicalLogViewer

    Parses a log file, shows the relevant information in a table

    Parses a log (file or HTTP) and shows the relevant information in the form of a table. Allows filtering, searching, and limited keyword highlighting. It is fully customizable: you can choose which information to show, add alternative parsers, and read from a file or an HTTP server. The current alpha version supports multiple parsers, whose specification is found in an XML file (which should have the extension '.lpc'). The interface is not customizable yet. If you're interested in how it is done, check out the code.
    Downloads: 0 This Week
  • 18
    Parse C++ header files using ply.lex to generate a navigable class tree representing the class structure. CppHeaderParser.py has the advantage of being a pure Python C++ header parser. Grab a copy of ply at http://www.dabeaz.com/ply/
    Downloads: 0 This Week
  • 19
    HyperSQL is like a doxygen plus javadoc for SQL, hypermapping SQL views, packages, procedures, and functions to HTML source code listings and showing all code locations where these are used.
    Downloads: 0 This Week
  • 20

    Analyzer for Files

    A tool to look into file contents

    Analyzer for Files (AoF) is a tool to look into file contents, analyze the structure with installed plug-ins, and show the results with several split windows including converted data and a tree if successful. It was designed as a workbench with a core and plug-in extensions. It can handle the normal plain-text file and data, complex binaries supported with the corresponding plug-ins. What's more, the developers can deploy and release their own plug-ins according to the plug-in developing...
    Downloads: 0 This Week
  • 21

    ant_farm

    Python-based reverse-engineering tool

    ant_farm provides a GUI framework for integrating all of those python tools you have written over the years to parse files, execute algorithms, display data etc.
    Downloads: 0 This Week
  • 22

    Latest on Arxiv

    Parse the latest Arxiv RSS stream and get your institute preprints

    Latest on Arxiv is a program/script with a simple job to do. Every day, it will download all PDF files from your favorite Arxiv RSS, and will then scan them to see if any authors from your favorite institute(s) are on there. If so, it will save the resulting index. The matched files are then parsed into a short list, featuring only the latest 4 preprints (which is ideal for a single TV screen in the coffee corner), and a long list which contains every paper in a clickable way. I provide...
    Downloads: 0 This Week
  • 23

    EnDiskEx

    Bulk extractor for Ensoniq-formatted disk images

    EnDiskEx is a command-line tool that bulk extracts instruments, sequences, songs, and banks from Ensoniq-formatted disk images (RAW, GKH, EDE, and EDA) for the EPS/ASR family of samplers. The extracted files are saved as EFE / SMF / TXT files. EnDiskEx is designed to extract Ensoniq banks for re-creation within a different DAW. It will track down the instrument and song files from bank references even if they were saved on another disk. There also exists a disk mapping feature to...
    Downloads: 8 This Week
  • 24

    Pyradox2

    Various tools for Paradox games, an extension of Pyradox

    Presently, the scripts here can be used to parse the game data files for Victoria 2. These values are put into wiki format for the purpose of adding to the Paradoxian wiki (http://www.paradoxian.org/vicky2wiki/Reference_Guides). These scripts are particularly useful to have since anytime Paradox makes updates to the core game, values often change. These scripts help ensure that the wiki can be easily kept up to date to help players better understand how the game works and to help modders understand core game mechanics.
    Downloads: 0 This Week
  • 25

    PyProperties

    Provides support for properties files in Python 3.x

    pyproperties provides support for properties files in Python. Written entirely from scratch, it is not in any way derived from java.util.Properties. There are projects which try to mimic j.u.P. This is not one of them. It can read, parse and store properties files but also provides some more advanced functionality like manipulating comments and type-guessing.
    Downloads: 0 This Week
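
As a rough illustration of what parsing a properties file with comment handling and type-guessing (the features the PyProperties entry mentions) can involve, here is a minimal Python sketch. It does not use or reproduce the pyproperties API: `parse_properties` and `guess_type` are hypothetical names, and the accepted syntax (`key=value` or `key: value` lines, comments starting with `#` or `!`) is an assumption made for the example.

```python
def guess_type(raw):
    """Best-effort conversion of a raw string value (hypothetical helper)."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    # Try the narrowest numeric type first, then fall back to the string.
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw


def parse_properties(text):
    """Parse simple properties text; return (values, comments).

    Assumed syntax: 'key=value' or 'key: value', one entry per line,
    with comment lines starting with '#' or '!'.
    """
    values, comments = {}, []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped[0] in "#!":
            # Keep comment text so it could be written back out later.
            comments.append(stripped[1:].strip())
            continue
        # Split on whichever separator ('=' or ':') appears first.
        positions = [i for i in (stripped.find("="), stripped.find(":")) if i != -1]
        if not positions:
            values[stripped] = ""  # a bare key with no value
            continue
        cut = min(positions)
        key = stripped[:cut].strip()
        values[key] = guess_type(stripped[cut + 1:].strip())
    return values, comments


sample = """# server settings
host = example.org
port: 8080
debug=true
ratio=0.5
"""
values, comments = parse_properties(sample)
print(values)
print(comments)
```

The sketch keeps comments in a separate list rather than discarding them, which is one simple way to make them available for later manipulation. Line continuations and escape sequences are deliberately left out.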