File               Date        Author        Commit
.hgtags            2013-10-31  Kurt Garloff  [ab100f] Added tag TS_EXTRACT_0_9 for changeset 11c0780c...
Makefile           2021-08-15  Kurt Garloff  [974d11] EIT file parser added.
README.md          2013-10-31  Kurt Garloff  [3f4658] Add license note.
eit_parse.cc       2021-08-15  Kurt Garloff  [974d11] EIT file parser added.
reader.cc          2013-10-28  Kurt Garloff  [699c18] Refactoring next step:
reader.h           2013-10-28  Kurt Garloff  [699c18] Refactoring next step:
restore_movies.py  2019-07-23  Kurt Garloff  [40a79d] Fix error handling
ts_extract.cc      2018-09-15  Kurt Garloff  [1a1b56] Use snprintf for limited buffer. Fix make clean.
ts_fmt.c           2013-10-28  Kurt Garloff  [bb8911] Renamed ts_fmt.cc in ts_fmt.c and make ts_fmt.h...
ts_fmt.h           2021-08-15  Kurt Garloff  [974d11] EIT file parser added.
ts_merge.cc        2018-09-15  Kurt Garloff  [fb9d7b] More verbose ...
ts_parse.cc        2020-11-24  Kurt Garloff  [7f8e7a] Output STARTPCR. (3 digits fixed format).
ts_sblock.cc       2019-07-09  Kurt Garloff  [34f685] Better debug output.
ts_sblock.h        2018-09-15  Kurt Garloff  [23d6e4] Implement find_pcr() member in stream_block class.
ts_stream.cc       2018-09-22  Kurt Garloff  [968176] remove unneeded var
ts_stream.h        2018-09-22  Kurt Garloff  [58fc60] Better PCRDiscont exception info.

Read Me

Documentation for ts_extract

What is this?

ts_extract is a tool that helps you to recover lost .ts (MPEG2 transport stream)
files. It does so by scanning the files/partitions/devices you specify at the
command line and detecting valid .ts file fragments.

It then analyzes these fragments to see whether they can be recombined into
complete files, looking at the stream PIDs (and the PAT/PMT directories), the
continuity fields in the packets, and the PCR.

This can be useful to recover lost files from a hard disk after a disk crash or
after deleting files by mistake. (For a crashed disk, first create an image
with a tool such as dd_rescue.)

Approach

The program works in three steps:

  1. It reads all the input files/partitions/disks/images/... and creates a list
    of extents that represent TS file fragments. (This step typically takes the
    longest; its results can be stored in a log file that can be reread later,
    so that this step can be skipped.)

  2. It then uses the analyzed metadata to reconnect the fragments into (hopefully)
    complete TS streams. (This takes some computation time: tens of minutes
    for tens of thousands of fragments.)

  3. The resulting streams can then optionally be copied into files.
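
As an illustration of step 1, here is a minimal sketch of how TS fragments can
be located in raw data. It relies only on the fact that every TS packet starts
with the sync byte 0x47 and is 188 bytes long. The function name
`find_ts_extents` and the `min_packets` threshold are hypothetical and are not
taken from the ts_extract sources, which apply more checks than this.

```python
SYNC = 0x47   # first byte of every TS packet
PKT = 188     # TS packet size in bytes


def find_ts_extents(data: bytes, min_packets: int = 4):
    """Return (offset, length) extents where a sync byte repeats every 188 bytes.

    Hypothetical helper: a run shorter than `min_packets` packets is treated
    as a chance hit and ignored.
    """
    extents = []
    pos = 0
    n = len(data)
    while pos + PKT <= n:
        if data[pos] != SYNC:
            pos += 1
            continue
        # Count the consecutive complete packets that start with a sync byte.
        count = 0
        while pos + (count + 1) * PKT <= n and data[pos + count * PKT] == SYNC:
            count += 1
        if count >= min_packets:
            extents.append((pos, count * PKT))
            pos += count * PKT
        else:
            pos += 1
    return extents
```

A real scan would read the device block by block rather than holding
everything in memory, and would record stream metadata along with each extent.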

TS analysis

The tool looks for a number of features in TS files to detect whether they
belong together:

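  • A TS packet is 188 bytes long and does NOT fit evenly into a 4k filesystem
    block; this actually is an advantage for us, as the packet boundaries shift
    from block to block. With gcd(188, 4096) = 4 there are 188/4 = 47 possible
    offsets, which gives us a test for consecutive blocks with only 1/47 error
    probability.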

  • The TS packets have a per-PID continuity counter (4 bits) that is
    incremented with each payload-carrying packet of that PID; this gives us a
    test with a 1/16 error rate. As a block typically contains packets of
    several PIDs, the test becomes even more useful.

  • The TS packets belong to a program that's described by a PAT and PMT;
    we assume that only ONE program is recorded in a .ts file (which is the
    case for e.g. E2-based VDRs); a change in the PID set or the PAT or
    PMT indicates that we have a new stream. (This also applies to the
    TSID.)

  • One of the PIDs of a program typically carries a timestamp (PCR). This
    needs to be continuous and allows us to determine how to reconnect the
    fragments.
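
The checks listed above can be sketched in a few lines. The header and PCR bit
layouts follow the MPEG-2 transport stream format; all function names are
hypothetical and not taken from the ts_extract sources, and `may_follow` is a
simplification that assumes every packet carries payload.

```python
import math

PKT = 188      # TS packet size in bytes
BLOCK = 4096   # assumed filesystem block size


def block_phases(pkt: int = PKT, block: int = BLOCK) -> int:
    """Number of distinct packet offsets that can occur at a block start."""
    return pkt // math.gcd(pkt, block)


def expected_sync_offset(block_index: int, pkt: int = PKT, block: int = BLOCK) -> int:
    """Offset of the first packet start in a block, if the file starts at block 0."""
    return (-block_index * block) % pkt


def parse_header(packet: bytes):
    """Return (pid, continuity_counter, has_payload) from the 4-byte TS header."""
    if len(packet) < 4 or packet[0] != 0x47:
        raise ValueError("not a TS packet")
    pid = ((packet[1] & 0x1F) << 8) | packet[2]
    has_payload = bool(packet[3] & 0x10)
    return pid, packet[3] & 0x0F, has_payload


def may_follow(last_cc: dict, first_cc: dict) -> bool:
    """True if the counters of all shared PIDs advance by one (mod 16).

    last_cc maps PID -> last counter seen in the earlier fragment,
    first_cc maps PID -> first counter seen in the later fragment.
    """
    shared = last_cc.keys() & first_cc.keys()
    return bool(shared) and all((last_cc[p] + 1) % 16 == first_cc[p] for p in shared)


def parse_pcr(packet: bytes):
    """Return the PCR in 27 MHz ticks, or None if the packet carries none."""
    if len(packet) < 12 or packet[0] != 0x47:
        return None
    if not packet[3] & 0x20:                    # no adaptation field
        return None
    if packet[4] < 7 or not packet[5] & 0x10:   # field too short or PCR flag clear
        return None
    b = packet[6:12]
    base = (b[0] << 25) | (b[1] << 17) | (b[2] << 9) | (b[3] << 1) | (b[4] >> 7)
    ext = ((b[4] & 0x01) << 8) | b[5]
    return base * 300 + ext                     # 33-bit 90 kHz base, 9-bit extension
```

With k shared PIDs, a chance match of all counters has a probability of about
(1/16)^k, which is why several PIDs per block make the counter test stronger.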

Features

  • You can filter for transport stream IDs (TSID) and program IDs. If you look
    for a few files only or for a series of files, this is an effective way
    to limit the amount of (CPU) time needed to reconnect the fragments and
    to limit the number of files and the storage required to output the results.

  • The state can be saved and reloaded again, which is a good time saver.
    The recommended approach to rescue files is to NOT filter in the first
    step, but to save the state (option -l). You can then use that as input
    (option -L) and skip the first step.

  • Options -v (verbose) and -d allow you to get more information and
    observe the inner workings of the program. You can specify these options
    multiple times to increase the level of verbosity.

  • There's no man page (yet), only the -h (help) summary and this little
    document.

Hints

  • If you have deleted files by mistake, make sure you cut off write operations
    to that filesystem as soon (or as much) as possible.

  • If you have a crashed disk, do the analysis/extraction on an image (created
    with e.g. dd_rescue).

  • Use options -l and -L to save time. Use -l without filtering (options
    -t and -g) initially and then use -L together with filtering.

Limitations

  • The program currently does not parse tables (PAT, PMT) larger than one packet.

  • The reconnection routine only works reliably for files with up to
    13 hours of stream data (this is due to an optimization).

  • The program only deals with streams that contain ONE program. (The PAT
    may reference multiple programs; that's fine.)
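
To illustrate the first and last limitation, here is a minimal sketch of a PAT
parser that handles only a section fitting into a single packet and that may
list several programs. The section layout follows the MPEG-2 PSI format; the
function name `parse_pat` is hypothetical and not taken from the ts_extract
sources, and the CRC32 at the end of the section is not verified here.

```python
def parse_pat(packet: bytes):
    """Return (tsid, {program_number: pmt_pid}) for a single-packet PAT.

    Returns None if the packet does not start a PAT section or if the
    section does not fit into this one packet.
    """
    if len(packet) != 188 or packet[0] != 0x47:
        raise ValueError("not a 188-byte TS packet")
    pid = ((packet[1] & 0x1F) << 8) | packet[2]
    starts_section = bool(packet[1] & 0x40)   # payload_unit_start_indicator
    if pid != 0 or not starts_section:        # the PAT always lives on PID 0
        return None
    pos = 4
    if packet[3] & 0x20:                      # skip the adaptation field
        pos += 1 + packet[4]
    if pos >= 188:
        return None
    pos += 1 + packet[pos]                    # skip pointer_field
    if pos + 3 > 188 or packet[pos] != 0x00:  # table_id 0x00 marks a PAT
        return None
    section_length = ((packet[pos + 1] & 0x0F) << 8) | packet[pos + 2]
    end = pos + 3 + section_length
    if section_length < 9 or end > 188:
        return None                           # malformed or multi-packet section
    tsid = (packet[pos + 3] << 8) | packet[pos + 4]
    programs = {}
    # Entries are 4 bytes each; the last 4 bytes of the section hold the CRC32.
    for i in range(pos + 8, end - 4, 4):
        number = (packet[i] << 8) | packet[i + 1]
        pmt_pid = ((packet[i + 2] & 0x1F) << 8) | packet[i + 3]
        if number != 0:                       # program 0 points at the NIT
            programs[number] = pmt_pid
    return tsid, programs
```

The returned TSID and program numbers are the kind of values one would filter
on when looking for specific recordings.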

License

I release this under the terms of the GNU General Public License (GPL) version 2
or version 3 (at your option).

More information
