Search Results for "metadata extraction tool"

Showing 387 open source projects for "metadata extraction tool"

  • 1
    dude uncomplicated data extraction

    dude uncomplicated data extraction: A simple framework

    Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, aims to make it easy to build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha, so expect breaking changes. You can run your scraper from the command line by supplying URLs, an output filename of your choice, and the paths to your Python scripts to the dude scrape command.
    Downloads: 0 This Week
  • 2
    Amazon EC2 Metadata Mock

    A tool to simulate Amazon EC2 instance metadata

    Instance metadata is data about your instance that you can use to configure or manage the running instance. Instance metadata is divided into categories, for example, hostname, events, and security groups. You can also use instance metadata to access user data that you specified when launching your instance. For example, you can specify parameters for configuring your instance, or include a simple script. You can build generic AMIs and use user data to modify the configuration files supplied...
    Downloads: 1 This Week
  • 3
    DBeaver

    Free universal database tool

    DBeaver is a free, multi-platform database tool that supports any database having a JDBC driver. It is useful for developers, SQL programmers, database administrators and analysts. DBeaver comes with plenty of great features such as metadata and SQL editors, ERD, data export/import/migration and more. Plugins are available for certain databases, and there are also several database management utilities. DBeaver’s Enterprise Edition provides even more features and supports non-JDBC...
    Downloads: 367 This Week
  • 4
    yt-dlp

    A youtube-dl fork with additional features and fixes

    yt-dlp is a youtube-dl fork based on the now-inactive youtube-dlc. The main focus of this project is adding new features and patches while also keeping up to date with the original project.
    Downloads: 220 This Week
  • 5
    Pandoc

    The universal markup converter

    Pandoc is a universal document converter that can convert files from one markup format into another. With Pandoc, you have a Swiss Army knife of a converter, able to convert practically any markup format into any other. Pandoc contains a Haskell library for conversions as well as a command-line tool that uses this library. It can convert to and from just about anything: lightweight markup formats, HTML formats, documentation formats, ebooks, TeX formats, word processor formats...
    Downloads: 113 This Week
  • 6
    Video-subtitle-extractor

    A GUI tool for extracting hard-coded subtitles (hardsub) from videos

    A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files. It is a deep learning-based video subtitle extraction framework that covers subtitle region detection and subtitle content extraction. Text recognition runs locally with OCR, so there is no need to apply for a third-party API or to access online OCR services such as Baidu...
    Downloads: 53 This Week
  • 7
    Lantern

    Tool to access videos, messaging, and other popular apps

    Can't access your favorite apps? Download Lantern to easily access videos, messaging, and other popular apps while at school or work. Lantern is an application that allows you to bypass firewalls to use your favorite applications and access your favorite websites. Lantern does not cooperate with any law enforcement in any country. Lantern encrypts all of your traffic to blocked sites and services to protect your data and privacy. Lantern passed multiple third-party white-box security audits...
    Downloads: 62 This Week
  • 8
    CSV Lint

    CSV Lint plug-in for Notepad++ for syntax highlighting

    CSV Lint plug-in for Notepad++ for syntax highlighting, CSV validation, automatic column and datatype detection, fixed-width datasets, changing datetime format and decimal separator, sorting data, counting unique values, converting to XML, JSON, SQL, etc. A plugin for data cleaning and working with messy data files. Use CSV Lint for metadata discovery, technical data validation, and reformatting on tabular data files. It is not meant to be a replacement for spreadsheet programs like Excel or SPSS, but rather...
    Downloads: 29 This Week
  • 9
    NetBox

    The premier source of truth powering network automation

    .... It is a web-based application that can be used to manage IP addresses and the devices and cables connected to them, as well as providing a data center infrastructure management (DCIM) tool. It supports virtualization, inventory management, and cable management. It has a web-based user interface and RESTful API, to easily integrate with other tools and automate tasks.
    Downloads: 25 This Week
  • 10
    ripgrep

    Regex pattern directory search tool that respects your .gitignore

    ripgrep is a line-oriented search tool that recursively searches the current directory for a regex pattern. By default, ripgrep will respect your .gitignore and automatically skip hidden files, hidden directories, and binary files. ripgrep has first-class support on Windows, macOS and Linux, with binary downloads available for every release. ripgrep is similar to other popular search tools like The Silver Searcher, ack and grep. ripgrep supports arbitrary input preprocessing filters which...
    Downloads: 18 This Week
  • 11
    mtail

    Extract internal monitoring data from application logs

    Extract internal monitoring data from application logs for collection in a time-series database. mtail is a tool for extracting metrics from application logs to be exported into a time-series database or time-series calculator for alerting and dashboarding. It fills a monitoring niche by being the glue between applications that do not export their own internal state (other than via logs) and existing monitoring systems, such that system operators do not need to patch those applications...
    Downloads: 10 This Week
  • 12
    GROBID

    Machine learning software for extracting information

    GROBID is a machine learning library for extracting, parsing, and re-structuring raw documents such as PDF into structured XML/TEI encoded documents, with a particular focus on technical and scientific publications. Development started in 2008 as a hobby. In 2011 the tool was made available as open source. Work on GROBID has been steady as a side project since the beginning and is expected to continue as such. Header extraction and parsing from articles in PDF format. The extraction...
    Downloads: 1 This Week
  • 13
    ANTLR

    Parser generator to read, process, or translate structured text

    ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. From a grammar, ANTLR generates a parser that can build and walk parse trees. It’s widely used in academia and industry to build all sorts of languages, tools, and frameworks. Twitter search uses ANTLR for query parsing, with over 2 billion queries a day. The languages for Hive...
    Downloads: 9 This Week
  • 14
    audioFlux

    A library for audio and music analysis, feature extraction

    audioFlux is a deep learning tool library for audio and music analysis and feature extraction. It can be used for deep learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc. It supports dozens of time-frequency analysis transformation methods and hundreds of corresponding time-domain and frequency-domain feature combinations. It can be provided to deep learning networks for training and is used...
    Downloads: 3 This Week
  • 15
    Trafilatura

    Python & command-line tool to gather text on the Web

    Trafilatura is a Python package and command-line tool designed to gather text on the Web. It includes discovery, extraction and text-processing components. Its main applications are web crawling, downloads, scraping, and extraction of main texts, metadata and comments. It aims at staying handy and modular: no database is required, the output can be converted to various commonly used formats. Going from raw HTML to essential parts can alleviate many problems related to text quality, first...
    Downloads: 0 This Week
  • 16
    Kaniko

    Build Container Images In Kubernetes

    kaniko is a tool to build container images from a Dockerfile, inside a container or Kubernetes cluster. kaniko doesn't depend on a Docker daemon and executes each command within a Dockerfile completely in userspace. This enables building container images in environments that can't easily or securely run a Docker daemon, such as a standard Kubernetes cluster.
    Downloads: 3 This Week
  • 17
    MyDumper

    MyDumper project

    MyDumper is a MySQL logical backup tool. It consists of two tools: mydumper, which exports a consistent backup of MySQL databases, and myloader, which reads the backup from mydumper, connects to the destination database, and imports the backup. Both tools use multithreading capabilities. MyDumper is open source and maintained by the community; it is not a Percona, MariaDB or MySQL product. Parallelism (hence, speed) and performance (avoids expensive character set conversion routines, efficient code...
    Downloads: 3 This Week
  • 18
    GitVersion

    From git log to SemVer in no time

    ... pipeline with TeamCity, AppVeyor, Jenkins or any of the other supported build servers. GitVersion is a tool that generates a Semantic Version number based on your Git history. The version number generated from GitVersion can then be used for various different purposes. GitVersion can be used in a Continuous Server pipeline to generate a version number that both labels the build itself and makes the different version variables available to the rest of the build pipeline.
    Downloads: 2 This Week
  • 19
    ffsend

    Easily and securely share files from the command line

    Easily and securely share files and directories from the command line through a safe, private and encrypted link using a single simple command. Files are shared using the Send service and may be up to 1GB. Others are able to download these files with this tool, or through their web browser. All files are always encrypted on the client, and secrets are never shared with the remote host. An optional password may be specified, and a default file lifetime of 1 (up to 20) download or 24 hours...
    Downloads: 2 This Week
  • 20
    Hyperion Android

    App Debugging & Inspection Tool for Android

    ... the Plugin interface and expose the implementation as a service. The plugins made available in this repository leverage Google's AutoService annotation processor to generate the service metadata and simplify the process.
    Downloads: 2 This Week
  • 21
    Amplify

    Automatic enrichment, enhancement, and explanation of your data

    Amplify attaches afterburners to your data. It explains your data through metadata extraction, classification, tagging, and reporting. It enriches your data with derivative data generation like thumbnails, previews, conversions, etc. It enhances your data with batteries-included value-adds like data quality reports, image augmentation, OCR, translations, etc. Amplify leverages the decentralized compute provided by Bacalhau to magically enrich your data. A built-in suite of pipelines decides what your data is and how to best improve upon...
    Downloads: 0 This Week
  • 22
    SVGO

    Node.js tool for optimizing SVG files

    SVG Optimizer is a Node.js-based tool for optimizing SVG vector graphics files. SVG files, in particular those exported from multiple editors, normally contain tons of redundant and useless information. This can include editor metadata, comments, hidden elements, default or non-optimal values and other stuff that can be safely removed or converted without affecting the SVG rendering result. Some options can be configured with CLI though it may be easier to have the configuration in a separate...
    Downloads: 1 This Week
  • 23
    CyberScraper 2077

    A Powerful web scraper powered by LLM | OpenAI, Gemini & Ollama

    CyberScraper 2077 is not just another web scraping tool – it's a glimpse into the future of data extraction. Born from the neon-lit streets of a cyberpunk world, this AI-powered scraper uses OpenAI, Gemini and LocalLLM Models to slice through the web's defenses, extracting the data you need with unparalleled precision and style.
    Downloads: 1 This Week
  • 24
    EMBA

    The firmware security analyzer

    EMBA is designed as the central firmware analysis tool for penetration testers and product security teams. It supports the complete security analysis process starting with firmware extraction, doing static analysis and dynamic analysis via emulation and finally generating a web report. EMBA automatically discovers possible weak spots and vulnerabilities in firmware. Examples are insecure binaries, old and outdated software components, potentially vulnerable scripts, or hard-coded passwords...
    Downloads: 1 This Week
  • 25
    fq

    jq for binary formats

    Tool, language, and decoders for working with binary data. fq is inspired by the well-known jq tool and language and allows you to work with binary formats the same way you would using jq. In addition, it can also present data similar to a hex viewer, transform, slice, and concatenate binary data, supports nested formats, and has an interactive REPL with auto-completion. It was originally designed to query, inspect and debug codecs and metadata in media files and containers like mp4, FLAC, mp3...
    Downloads: 1 This Week