mediaTagging

mediaTagging is a collection of command-line tools for reading, validating, organizing and writing metadata for audio and video files.

The tools are designed to be deterministic, script-friendly and transparent in their behavior. They avoid heuristics, hidden logic and side effects. Each tool has a clearly defined purpose and can be used independently or as part of a processing pipeline. All tools provide their own -h help output and do not require external documentation.

The project focuses on MP3 and MP4/M4A metadata handling, but several tools also work with other audio and video formats through ffprobe/ffmpeg.


Tool Overview

Tools are listed in alphabetical order.

Tool        Description
gettag      Reads metadata from audio/video files and outputs the result as a JSON array.
id3purify   Validates the ID3 tags of MP3 files against a whitelist of allowed frames.
id3safe     Moves MP3 files into a deterministic directory structure and generates a defined filename based on the file's metadata.
media2xml   Creates an XML file containing the metadata of exactly one media file.
setMp4Tag   Writes metadata to MP4/M4A files in a deterministic and controlled way.
xml2json    Converts one or more XML files produced by media2xml into a JSON array.

gettag

gettag reads metadata from audio and video files and outputs the result as a JSON array.

The command-line interface follows the style of mid3v2, using options such as -l, -A, -a and others to select specific fields.

Field names are normalized to predictable identifiers, for example album-sort becomes album_sort and the ID3 frame TSST becomes set_subtitle.
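The normalization can be pictured as a lookup plus a character substitution. The following Python sketch is illustrative only: the function name normalize_field, the FRAME_NAMES table and the lower-casing step are assumptions, and only the two mappings named above come from this README.

```python
# Hypothetical frame-to-name table; only TSST -> set_subtitle is documented above.
FRAME_NAMES = {"TSST": "set_subtitle"}


def normalize_field(name: str) -> str:
    """Return a predictable identifier for a raw tag key or ID3 frame ID."""
    if name in FRAME_NAMES:
        return FRAME_NAMES[name]
    # Assumption: keys are lower-cased and separators become underscores.
    return name.strip().lower().replace("-", "_").replace(" ", "_")


print(normalize_field("album-sort"))  # album_sort
print(normalize_field("TSST"))        # set_subtitle
```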

Although optimized for MP3, gettag works with many other formats because it uses ffprobe as its backend.
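To show what an ffprobe-based reader has to work with, the Python sketch below builds a standard ffprobe call and extracts the container tags from its JSON output. It is not gettag's implementation; the function names and the sample data are assumptions, and only the ffprobe options themselves are standard.

```python
import json


def ffprobe_command(path: str) -> list:
    """Build an ffprobe call that prints container-level metadata as JSON."""
    return ["ffprobe", "-v", "quiet", "-print_format", "json", "-show_format", path]


def extract_tags(ffprobe_json: str) -> dict:
    """Return the container tags from ffprobe's JSON output (empty if none)."""
    return json.loads(ffprobe_json).get("format", {}).get("tags", {})


# Abbreviated, made-up sample of the structure ffprobe emits for -show_format.
sample = '{"format": {"filename": "song.mp3", "tags": {"title": "Example", "artist": "Someone"}}}'

print(ffprobe_command("song.mp3"))
print(extract_tags(sample))
```

The command list can be passed to subprocess.run on a system where ffprobe is installed; the sketch only prints it so that it runs anywhere.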


id3purify

id3purify validates the ID3 tags of an MP3 file against a whitelist of allowed frames.

In validation mode, it reports all frames that are not permitted and does not modify the file.

In clean mode, it removes every frame that is not part of the whitelist while preserving all permitted frames unchanged.

id3purify never creates new metadata and never modifies the contents of permitted frames. It only reports or removes frames that are outside the configured whitelist.
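The two modes can be sketched on a plain dictionary of frames. This Python example illustrates the rule only, not id3purify's code; the function names and the sample frames are assumptions.

```python
def find_forbidden(frames: dict, whitelist: set) -> list:
    """Validation mode: list frame IDs that are not whitelisted; change nothing."""
    return sorted(fid for fid in frames if fid not in whitelist)


def clean(frames: dict, whitelist: set) -> dict:
    """Clean mode: keep whitelisted frames unchanged and drop all others."""
    return {fid: value for fid, value in frames.items() if fid in whitelist}


frames = {"TIT2": "Title", "TPE1": "Artist", "PRIV": "vendor data"}
whitelist = {"TIT2", "TPE1", "TALB"}

print(find_forbidden(frames, whitelist))  # ['PRIV']
print(clean(frames, whitelist))           # {'TIT2': 'Title', 'TPE1': 'Artist'}
```

Note that TALB is whitelisted but absent from the input, and it stays absent after cleaning: permitted frames are never created, only kept.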


id3safe

id3safe moves MP3 files into a deterministic directory structure and generates destination paths and filenames from the file's metadata.

It does not write or modify tags, except in enumeration mode.

The destination path is generated from a user-configurable format string. Format specifiers are expanded from metadata fields such as artist, album, title, track number, release year and file extension.

The default format is:

%R/%S/%Y - %A/%0XT. %a - %t.%e

where:

  • %R is the media root directory
  • %S is the sort criterion
  • %Y is the release year
  • %A is the album name
  • %T is the track number
  • %a is the artist
  • %t is the title
  • %e is the file extension

Additional format specifiers allow fixed-width and dynamic-width track numbers, year formatting, and alternative filename layouts.
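The expansion of single-letter specifiers can be sketched in a few lines of Python. This is a simplified illustration, not id3safe's code: the function name expand, the sample values and the shortened format string are assumptions, and the width modifiers and year formatting are left out because their exact rules are not spelled out here.

```python
import re


def expand(fmt: str, values: dict) -> str:
    """Replace each %X specifier with values[X]; unknown specifiers stay as-is."""
    return re.sub(r"%([A-Za-z])", lambda m: values.get(m.group(1), m.group(0)), fmt)


# Made-up sample values for the specifiers described above.
values = {"R": "/media/music", "S": "Someone", "A": "Example Album",
          "T": "3", "a": "Someone", "t": "Example", "e": "mp3"}

print(expand("%R/%S/%A/%T. %a - %t.%e", values))
# /media/music/Someone/Example Album/3. Someone - Example.mp3
```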

The value of %S is determined by the first available value in the following sequence:

  • artist_sort (TSOP)
  • album_sort (TSOA)
  • album_artist (TPE2)
  • artist (TPE1)
  • "Unknown Artist"

The value of %A is determined by the first available value in the following sequence:

  • album (TALB)
  • "No Album"

The value of %a is determined by the first available value in the following sequence:

  • artist (TPE1)
  • "Unknown Artist"

The value of %t is determined by the first available value in the following sequence:

  • title (TIT2)
  • "Unknown Title"

If artist_sort (TSOP) starts with '@', the remaining part is interpreted as a reference to another existing ID3 attribute.

The referenced attribute name is capitalized when constructing the path fragment.

Example:

{
  "artist_sort": "@genre",
  "genre": "Soundtrack"
}

Results in:

%S := "Genre/Soundtrack"
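The fallback sequences and the '@' reference rule can be sketched as follows. This Python example is an illustration under stated assumptions, not id3safe's code: the function names are hypothetical, and the behavior for a reference to a missing attribute (falling through to the next candidate) is an assumption, since the README only describes references to existing attributes.

```python
def first_available(tags: dict, keys: list, default: str) -> str:
    """Return the first non-empty tag value among keys, else the default."""
    for key in keys:
        if tags.get(key):
            return tags[key]
    return default


def resolve_sort(tags: dict) -> str:
    """Resolve %S, including the '@' reference form of artist_sort."""
    sort = tags.get("artist_sort", "")
    if sort.startswith("@"):
        ref = sort[1:]
        if tags.get(ref):
            # The referenced attribute name is capitalized in the path fragment.
            return f"{ref.capitalize()}/{tags[ref]}"
        # Assumption: an unresolved reference falls through to the next candidates.
        tags = {k: v for k, v in tags.items() if k != "artist_sort"}
    return first_available(
        tags, ["artist_sort", "album_sort", "album_artist", "artist"], "Unknown Artist")


print(resolve_sort({"artist_sort": "@genre", "genre": "Soundtrack"}))  # Genre/Soundtrack
print(resolve_sort({"artist": "Someone"}))                              # Someone
print(first_available({}, ["album"], "No Album"))                       # No Album
```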

media2xml

media2xml creates an XML file containing the metadata of exactly one media file.

It was originally designed as a metadata backup mechanism for MP4 files at a time when no reliable MP4 tag writer was available.

The resulting XML files contain all relevant information and can later be used to reconstruct metadata with modern tools.

media2xml does not modify the media file.


setMp4Tag

setMp4Tag writes metadata to MP4/M4A files in a deterministic and controlled way.

It accepts either a single XML file (from media2xml) or a JSON array (from xml2json).

For each referenced media file, setMp4Tag removes all existing metadata and QuickTime atoms and writes exactly the fields defined in the XML or JSON structure.

It does not infer or generate additional fields and does not preserve existing metadata.
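The "discard everything, then write only what is listed" idea maps onto standard ffmpeg options. The Python sketch below builds such a command line without running it. Whether setMp4Tag invokes ffmpeg this way is an assumption; the function name and sample tags are hypothetical, and the sketch does not cover the removal of QuickTime atoms mentioned above.

```python
def ffmpeg_retag_command(src: str, dst: str, tags: dict) -> list:
    """Build an ffmpeg call that drops existing metadata and sets only the given tags."""
    cmd = ["ffmpeg", "-i", src,
           "-map_metadata", "-1",  # do not copy any metadata from the input
           "-c", "copy"]           # copy the streams without re-encoding
    for key, value in tags.items():
        cmd += ["-metadata", f"{key}={value}"]
    return cmd + [dst]


print(ffmpeg_retag_command("in.m4a", "out.m4a", {"title": "Example", "artist": "Someone"}))
```

Writing to a separate output file, as here, leaves the source file untouched until the result has been checked.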


xml2json

xml2json converts one or more XML files produced by media2xml into a JSON array.

The resulting JSON structure contains all information extracted from the media file:

  • technical metadata under "media"
  • tag metadata under "tags"
  • optional track lists under "tracks" (e.g. for chapter markers)

This structure is not identical to the output of gettag. However, the values inside "tags" are compatible with the tag fields returned by gettag.

This allows older XML-based metadata archives to be integrated into the current JSON-based workflow while preserving all additional metadata.
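A consumer of the xml2json output might walk the array as sketched below. The three top-level keys are the ones listed above; everything inside them in the sample (codec, title, artist, the chapter entry) is made up for illustration, and the function names are hypothetical.

```python
import json

SECTIONS = ("media", "tags", "tracks")


def sections_present(entries: list) -> list:
    """For each entry, report which of the documented sections it contains."""
    return [[name for name in SECTIONS if name in entry] for entry in entries]


def all_tags(entries: list) -> list:
    """Collect the "tags" object of every entry (empty dict if missing)."""
    return [entry.get("tags", {}) for entry in entries]


# Made-up sample array; the inner field names are assumptions.
sample = json.loads("""
[
  {"media": {"codec": "aac"}, "tags": {"title": "Example", "artist": "Someone"}},
  {"media": {"codec": "aac"}, "tags": {"title": "Other"}, "tracks": [{"title": "Chapter 1"}]}
]
""")

print(sections_present(sample))
print(all_tags(sample))
```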


Requirements

gettag

  • ffprobe/avprobe
  • md5neutral
  • jq

id3purify

  • gettag
  • mid3v2

id3safe

  • gettag
  • id3purify
  • mid3v2
  • jq
  • perl

media2xml

  • ffprobe/avprobe
  • xmlstarlet

setMp4Tag

  • ffmpeg/avconv
  • jq
  • xml2json

xml2json

  • xmlstarlet
  • jq

Notes

  • All tools are deterministic and designed for scripting environments
  • No tool performs hidden inference or heuristic tagging
  • All metadata transformations are explicit and reproducible

(c) 2011–2026 Hans-Hermann Froehlich

All rights reserved.

Source: README.md, updated 2026-06-10