Instagram OSINT tool for gathering profile data and public posts
A high-quality tool for converting PDF to Markdown and JSON
PDF Parser for AI-ready data. Automate PDF accessibility
Fast and efficient unstructured data extraction
ContextGem: Effortless LLM extraction from documents
Machine learning software for extracting information
Download pictures (or videos) along with their captions
Python & command-line tool to gather text on the Web
Tool to help you collect, organize, annotate, cite, and share research
A versatile toolkit for PDF manipulation
A tool to simulate Amazon EC2 instance metadata
A distributed job server
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
Lightweight Go package to parse, analyze, and extract metadata
A library for interacting with the nhentai API
A self-hostable bookmark-everything app
Coomer downloader
Open source OSINT tool for gathering data on emails, phones, and IPs
Assist in organizing your piles of documents
Copybara: A tool for transforming and moving code between repositories
CLI tool to extract (meta)data from PDF and manipulate PDF files
Cross-platform GUI tool for downloading videos from Bilibili sites
ExtractThinker is a Document Intelligence library for LLMs
Document content and metadata extraction microservice
This is a public repository containing scrapers