ts_extract Code

Analyze and reassemble TS (mpeg transport stream) fragments

Status: Beta

Brought to you by: garloff

Tree [974d11] default tip /


File	Date	Author	Commit
.hgtags	2013-10-31	Kurt Garloff	[ab100f] Added tag TS_EXTRACT_0_9 for changeset 11c0780c...
Makefile	2021-08-15	Kurt Garloff	[974d11] EIT file parser added.
README.md	2013-10-31	Kurt Garloff	[3f4658] Add license note.
eit_parse.cc	2021-08-15	Kurt Garloff	[974d11] EIT file parser added.
reader.cc	2013-10-28	Kurt Garloff	[699c18] Refactoring next step:
reader.h	2013-10-28	Kurt Garloff	[699c18] Refactoring next step:
restore_movies.py	2019-07-23	Kurt Garloff	[40a79d] Fix error handling
ts_extract.cc	2018-09-15	Kurt Garloff	[1a1b56] Use snprintf for limited buffer. Fix make clean.
ts_fmt.c	2013-10-28	Kurt Garloff	[bb8911] Renamed ts_fmt.cc in ts_fmt.c and make ts_fmt.h...
ts_fmt.h	2021-08-15	Kurt Garloff	[974d11] EIT file parser added.
ts_merge.cc	2018-09-15	Kurt Garloff	[fb9d7b] More verbose ...
ts_parse.cc	2020-11-24	Kurt Garloff	[7f8e7a] Output STARTPCR. (3 digits fixed format).
ts_sblock.cc	2019-07-09	Kurt Garloff	[34f685] Better debug output.
ts_sblock.h	2018-09-15	Kurt Garloff	[23d6e4] Implement find_pcr() member in stream_block class.
ts_stream.cc	2018-09-22	Kurt Garloff	[968176] remove unneeded var
ts_stream.h	2018-09-22	Kurt Garloff	[58fc60] Better PCRDiscont exception info.

Read Me

Documentation for ts_extract

What is this?

ts_extract is a tool that helps you to recover lost .ts (MPEG2 transport stream)
files. It does so by scanning the files/partitions/devices you specify at the
command line and detecting valid .ts file fragments.

It then analyzes these fragments and checks whether they can be recombined into
complete files by looking at the stream PIDs (and the PAT/PMT directories), the
continuity counters in the packets, and the PCR.

This can be useful to recover lost files from a hard disk after a disk crash or
after deleting files by mistake. (For a crashed disk, first create an image
using a tool such as dd_rescue.)

Approach

The program works in three steps:

1. It reads all the input files/partitions/disks/images/... and creates a list
   of extents that represent TS file fragments. (This step typically takes the
   longest; its results can be stored in a log file that can be reread
   later to skip this step.)
2. It then uses the analyzed metadata to reconnect the fragments into (hopefully)
   complete TS streams. (This takes some computation time: tens of minutes
   for tens of thousands of fragments.)
3. The resulting streams can then optionally be copied into files.
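
To illustrate the first step, here is a minimal Python sketch of how runs of back-to-back TS packets could be located in a raw buffer. It is not taken from the ts_extract sources: the function name `find_ts_extents` and the `min_packets` threshold are hypothetical, and the only facts it relies on are the 188-byte packet size and the 0x47 sync byte that starts every TS packet.

```python
SYNC_BYTE = 0x47      # every TS packet starts with this byte
PACKET_SIZE = 188     # fixed TS packet length in bytes


def find_ts_extents(data: bytes, min_packets: int = 4):
    """Return (offset, packet_count) pairs for runs of back-to-back TS packets.

    Hypothetical helper for illustration; min_packets is an assumed threshold
    that suppresses accidental 0x47 bytes in unrelated data.
    """
    extents = []
    pos = 0
    limit = len(data) - PACKET_SIZE   # last offset where a whole packet fits
    while pos <= limit:
        if data[pos] != SYNC_BYTE:
            pos += 1
            continue
        # Count the whole packets that follow with a sync byte at each start.
        count = 0
        while (pos + count * PACKET_SIZE <= limit
               and data[pos + count * PACKET_SIZE] == SYNC_BYTE):
            count += 1
        if count >= min_packets:
            extents.append((pos, count))
            pos += count * PACKET_SIZE
        else:
            pos += 1
    return extents
```

A real scanner would read the device in chunks and record far more metadata per extent (PIDs, counters, PCR values); this sketch only shows the sync-byte test.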

TS analysis

The tool looks for a number of features in TS files to detect whether they
belong together:

- A TS packet is 188 bytes long and does NOT fit evenly into a 4k filesystem
  block; this is actually an advantage for us. Because gcd(188, 4096) = 4, a
  packet boundary can fall at 188/4 = 47 different offsets within a block,
  which gives us a test for consecutive blocks with only a 1/(188/4) = 1/47
  error probability.
- The TS packets have a per-PID continuity counter (4 bits) that is
  incremented with each payload-carrying packet of that PID; this gives us a
  test with a 1/16 error rate. Since a block typically contains several PIDs,
  this becomes even more useful.
- The TS packets belong to a program that is described by a PAT and a PMT;
  we assume that only ONE program is recorded in a .ts file (which is the
  case for e.g. E2 based VDRs). A change in the PID set, the PAT or the
  PMT indicates that we have a new stream. (This also applies to the
  TSID.)
- One of the PIDs of a program typically carries a timestamp (PCR). This
  needs to be continuous and allows us to determine how to reconnect the
  fragments.
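
The checks listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used by ts_extract: all function names are hypothetical, the block size is assumed to be 4096 bytes, and the bit layout follows the standard TS packet header (sync byte, 13-bit PID, 4-bit continuity counter, optional adaptation field carrying a 33-bit PCR base plus a 9-bit extension).

```python
from math import gcd

PACKET_SIZE = 188
BLOCK_SIZE = 4096     # assumed filesystem block size
SYNC_BYTE = 0x47


def distinct_sync_phases(packet=PACKET_SIZE, block=BLOCK_SIZE):
    """Number of distinct in-block offsets at which a packet can start."""
    return packet // gcd(packet, block)


def next_block_phase(phase):
    """Offset of the first packet start in the block after one with `phase`."""
    return (phase - BLOCK_SIZE) % PACKET_SIZE


def parse_header(packet: bytes):
    """Return (pid, continuity_counter, has_payload, pcr_or_None)."""
    if len(packet) != PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a TS packet")
    pid = ((packet[1] & 0x1F) << 8) | packet[2]
    has_adaptation = bool(packet[3] & 0x20)
    has_payload = bool(packet[3] & 0x10)
    cc = packet[3] & 0x0F
    pcr = None
    # Adaptation field: length at byte 4, flags at byte 5 (0x10 = PCR present).
    if has_adaptation and packet[4] >= 7 and packet[5] & 0x10:
        b = packet[6:12]
        base = (b[0] << 25) | (b[1] << 17) | (b[2] << 9) | (b[3] << 1) | (b[4] >> 7)
        ext = ((b[4] & 0x01) << 8) | b[5]
        pcr = base * 300 + ext        # PCR in 27 MHz ticks
    return pid, cc, has_payload, pcr


def cc_follows(prev_cc, cc, has_payload):
    """True if `cc` is the expected successor of `prev_cc` on the same PID."""
    expected = (prev_cc + 1) & 0x0F if has_payload else prev_cc
    return cc == expected
```

With these helpers, two candidate blocks can be tested for adjacency by comparing the sync offset found in the second block against `next_block_phase` of the first, and by applying `cc_follows` per PID across the block boundary.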

Features

- You can filter for transport stream IDs (TSID) and program IDs. If you are
  looking for only a few files or for a series of files, this is an effective
  way to limit the (CPU) time needed to reconnect the fragments and
  to limit the number of files and the storage required to output the results.
- The state can be saved and reloaded again; this is a good time saver.
  The recommended approach to rescue files is to NOT filter in the first
  step, but to save the state (option -l). You can then use that as input
  (option -L) and skip over the first step.
- Options -v (verbose) and -d allow you to get more information and
  observe the inner workings of the program. You can specify these options
  multiple times to increase the level of verbosity.
- There's no man page (yet), only the -h (help) summary and this little
  document.

Hints

- If you have deleted files by mistake, make sure you stop write operations
  to that filesystem as soon (or as much) as possible.
- If you have a crashed disk, do the analysis/extraction on an image (created
  by e.g. dd_rescue).
- Use options -l and -L to save time. Use -l without filtering (options
  -t and -g) initially and then use -L together with filtering.

Limitations

- The program currently does not parse tables (PAT, PMT) larger than one packet.
- The reconnection routine only works reliably for files with up to
  13 hours of stream data (this is due to an optimization).
- The program only deals with streams that contain ONE program. (The PAT
  may reference multiple programs, that's fine.)
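
To make the one-packet limitation on tables concrete, the following Python sketch parses a PAT that fits into a single TS packet and rejects one that spills over. It is an illustration, not the parser from the ts_extract sources: the function name `parse_single_packet_pat` is hypothetical, the section layout is the standard PAT layout, and the CRC32 at the end of the section is deliberately not verified.

```python
def parse_single_packet_pat(packet: bytes):
    """Return (tsid, {program_number: pmt_pid}) for a one-packet PAT.

    Hypothetical helper for illustration; the section CRC32 is not checked.
    """
    if len(packet) != 188 or packet[0] != 0x47:
        raise ValueError("not a TS packet")
    pid = ((packet[1] & 0x1F) << 8) | packet[2]
    starts_section = bool(packet[1] & 0x40)   # payload_unit_start_indicator
    if pid != 0 or not starts_section or not packet[3] & 0x10:
        raise ValueError("not the start of a PAT section")
    pos = 4
    if packet[3] & 0x20:                      # skip adaptation field
        pos += 1 + packet[4]
    if pos >= 188:
        raise ValueError("no room for a payload")
    pos += 1 + packet[pos]                    # skip pointer_field
    if pos + 3 > 188:
        raise ValueError("section header does not fit")
    if packet[pos] != 0x00:
        raise ValueError("table_id is not PAT")
    section_length = ((packet[pos + 1] & 0x0F) << 8) | packet[pos + 2]
    if section_length < 9:
        raise ValueError("section too short")
    end = pos + 3 + section_length
    if end > 188:
        raise ValueError("PAT section spans more than one packet")
    tsid = (packet[pos + 3] << 8) | packet[pos + 4]
    programs = {}
    for i in range(pos + 8, end - 4, 4):      # entries stop before the CRC32
        number = (packet[i] << 8) | packet[i + 1]
        pmt_pid = ((packet[i + 2] & 0x1F) << 8) | packet[i + 3]
        if number != 0:                       # program 0 points to the NIT
            programs[number] = pmt_pid
    return tsid, programs
```

The returned TSID and program numbers are the kind of values a TSID or program filter would compare against; handling longer tables would require reassembling the section from several packets of the same PID.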

License

I release this software under the terms of the GNU General Public License (GPL)
version 2 or version 3 (at your option).

More information

See the Wikipedia page on the MPEG transport stream for a description of the
MPEG2 TS format.
