MediaInfo and binary file representation

DV
2014-08-18
2014-08-29
  • DV
    2014-08-18

    As MediaInfo supports quite a few file formats, I have a question that relates to a different way of using the information about binary file structures that MediaInfo parses and outputs.
    Let me give an example first to make clear what I mean:
    Let's assume I have an .mpg or .avi file with an incorrect aspect ratio. This aspect ratio is reported by MediaInfo. But what I actually need is to understand which exact bytes inside the file correspond to this aspect ratio value to be able to set the correct aspect ratio by modifying the binary contents of the file.
    So, in general, the question is: is it possible to see/generate the relation between the textual information reported by MediaInfo and the original specific bytes of the input binary file? In terms of MediaInfo's source code, my question seems to relate to functions such as Get_L2, Get_X4 and so on, which already take a text comment. For example: Get_S1 ( 4, aspect_ratio_information, "aspect_ratio_information"). My question can probably be rephrased as: is it possible to update all these Get_... functions in such a way that they create some kind of "dump" (or "log") that says something like:

    File offset 0x..., ... bytes - aspect_ratio_information
    File offset 0x..., ... bytes - frame_rate_code

    and so on?
    This would help to understand the relation between bytes at specific offsets and their actual meaning in terms of the structures of the binary file format.
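
A minimal sketch of the kind of log described above, assuming a simple byte-oriented reader. `LoggingReader`, `GetBE`, `TraceEntry` and `FormatEntry` are hypothetical names invented for illustration; they are not MediaInfo's actual classes or its Get_... functions, and the sketch handles whole bytes only, not bit fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of one parsed field (not a MediaInfo type).
struct TraceEntry {
    std::size_t offset;  // offset in the buffer where the field starts
    std::size_t size;    // field size in bytes
    std::string name;    // field name taken from the format specification
};

// Hypothetical reader that logs every field it reads.
class LoggingReader {
public:
    explicit LoggingReader(std::vector<std::uint8_t> data) : data_(std::move(data)) {}

    // Read a big-endian unsigned integer of 1 to 4 bytes and record its origin.
    std::uint32_t GetBE(std::size_t size, const char* name) {
        assert(size >= 1 && size <= 4 && pos_ + size <= data_.size());
        std::uint32_t value = 0;
        for (std::size_t i = 0; i < size; ++i)
            value = (value << 8) | data_[pos_ + i];
        log_.push_back({pos_, size, name});
        pos_ += size;
        return value;
    }

    const std::vector<TraceEntry>& Log() const { return log_; }

private:
    std::vector<std::uint8_t> data_;
    std::size_t pos_ = 0;
    std::vector<TraceEntry> log_;
};

// Format one entry in the "File offset 0x..., ... bytes - name" style.
std::string FormatEntry(const TraceEntry& e) {
    char buf[128];
    std::snprintf(buf, sizeof buf, "File offset 0x%08zX, %zu bytes - %s",
                  e.offset, e.size, e.name.c_str());
    return buf;
}
```

Each call to `GetBE` returns the value and appends one entry to the log, so the dump can be produced after parsing by formatting every entry in `Log()`.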

    Another question regarding the same subject, but from a different angle, would be: do you have some documentation that describes different binary file formats (supported by MediaInfo) in some standardized form of tables or something similar?

     
  • But what I actually need is to understand which exact bytes inside the file correspond to this aspect ratio value to be able to set the correct aspect ratio by modifying the binary contents of the file.

    Good luck, it is not so easy to modify such bits everywhere in a file. There are a lot of traps.

    So, in general, the question is: is it possible to see/generate the relation between the textual information reported by MediaInfo and the original specific bytes of the input binary file?

    With, for example, the Linux CLI binary (the Windows version is built with this feature disabled to reduce binary size), you can run "mediainfo --trace=1 YourFile" and you'll get a trace (example attached).
    Still alpha status.

    Another question regarding the same subject, but from a different angle, would be: do you have some documentation that describes different binary file formats (supported by MediaInfo) in some standardized form of tables or something similar?

    You need to obtain the format specifications (ITU, ISO, SMPTE...).


    If you have specific needs (correction of some files), customization is possible, but it is not free.

     
  • DV
    2014-08-19

    Hi Jerome,
    Thank you, the attached trace file looks very promising and seems to reflect my question regarding reporting the offsets and sizes of binary records (or structures).
    Could you point me to the function(s) and source file(s) that implement this tracing functionality? I will probably be able to make or suggest some improvements in the code (I'll be using the Windows version).

     
  • Could you point me to the function(s) and source file(s) that implement this tracing functionality?

    They are the ones you pointed out (Get_L2()...)
    The common part is in File__Analyze.cpp, void File__Analyze::Param()
    Any improvement would be appreciated.

    For compiling with trace support, change #define MEDIAINFO_TRACE_NO to MEDIAINFO_TRACE_YES.

     
  • DV
    2014-08-21

    Thank you, I was able to build both Release and Debug version under VS2008. Though, I did not find the "--trace" command line option you mentioned (I was using mediainfo-code-6391-MediaInfo-trunk). Where can I find this option - or what should I change or add to make it work?
    Also, the following method seems to be missing from "MediaInfo\ZenLib\Source\ZenLib\Ztring.h":

    Ztring& From_Unicode (const wchar_t ch) {return From_Unicode(&ch, 0, 1);};

    Probably it's related to the definition of WSTRING_MISSING, because other methods such as From_UTF8, From_UTF16 and so on did not require such an overloaded version. Though, the overloaded version for (wchar_t ch) can always be added just for complete certainty :)

     
  • Where can I find this option - or what should I change or add to make it work?

    Arghh.... Sorry, I screwed up, I was wrong.
    The command is " --Inform=Details;1"
    In the source code, you can add MediaInfo::Option("Inform", "Details;1");

    This is a cryptic command I plan to simplify, but it is not done yet (there is so much to do, this is a bit of a mess :( ).

    Ztring& From_Unicode (const wchar_t ch) {return From_Unicode(&ch, 0, 1);};

    ZenLib is not up to date. Anyway, yesterday I patched MediaInfo SVN so that it does not rely on the latest version of ZenLib SVN. This issue should be gone now.

     
  • DV
    2014-08-21

    Yes, I was using ZenLib from mediainfo_0.7.69_AllInclusive.7z and it was outdated.
    Now, with latest ZenLib from https://sourceforge.net/projects/zenlib/ it's OK.
    By the way, zlib included in mediainfo_0.7.69_AllInclusive.7z is very important to remain there because 1) MediaInfoLib depends on it and 2) \zlib\projects contains ready-to-use project files which can't be found otherwise.
    As for the project files, the latest version of MediaInfoLib seems to include outdated project files for VS2005 and VS2008: the VS2005 version lacks the Debug configuration at all, the VS2008 version does not include the dependency on aes-gladman - and even with this dependency added can't be compiled.
    I've used VS2012 and got a successful Debug build. However the Release build gave me "fatal error C1001: An internal error has occurred in the compiler" while linking MediaInfoDLL. I've been able to fix it by setting the Optimization to Disabled (/Od) as mentioned here: http://stackoverflow.com/questions/7076494/fatal-error-c1001-an-internal-error-has-occurred-in-the-compiler
    So now I'm ready to play with the trace/Inform feature :) Will post here once there is anything interesting to write.

    P.S.
    Returning to Ztring.h in ZenLib, I can see that the implementation of Ztring::From_Unicode (const wchar_t S) is almost a duplicate of Ztring& Ztring::From_Unicode (const wchar_t* S). I'd propose to rewrite that method in such a way:

    Ztring& Ztring::From_Unicode (const wchar_t ch)  
    {  
      wchar_t S[2] = { ch, 0 };  
      return From_Unicode(S);  
    }
    

    What do you think?

     
  • Wow, you tested a lot, impressive! :)

    Yes, I was using ZenLib from mediainfo_0.7.69_AllInclusive.7z and it was outdated.

    But it is expected to work with SVN trunk starting yesterday, I removed the use of the new method.

    the VS2005 version lacks the Debug configuration at all

    Intentional: I am not motivated to maintain a Debug build for it. Is anyone still using it? ;-)
    (actually, I am more inclined to remove support for such versions of Visual Studio completely)

    VS2008 version does not include the dependency on aes-gladman - and even with this dependency added can't be compiled.

    Intentional: aes-gladman is not from me and depends on an include that is not in Visual Studio 2008 (not compliant with C99...), so I was not interested in adding AES support with this old compiler. I don't care about old compilers (or do you need them?)

    I've used VS2012 and got a successful Debug build. However the Release build gave me "fatal error C1001: An internal error has occurred in the compiler" while linking MediaInfoDLL.

    Tested on my side, works well. I'll check further a bit later (removing optimizations is something I don't like)

    So now I'm ready to play with the trace/Inform feature :)

    Go! :)

    Will post here once there is anything interesting to write.

    Do :).
    And if you are interested in improving it, you can make some proposals (I don't promise I'll accept them, but I'll review them)

    Returning to Ztring.h in ZenLib, I can see that the implementation of Ztring::From_Unicode (const wchar_t S)

    I changed it a bit differently.
    I am aware the code is not beautiful, but not my priority for the moment.

     
  • DV
    2014-08-22

    Well, if VS2005 and VS2008 are no longer supported for MediaInfoLib, let's either mark the corresponding project folders as outdated (e.g. "Project\MSVC2005_outdated") or just exclude them from the sources.
    As for the trace feature, I did not really find that its inclusion increases the file size noticeably. It's just about 15% of the original file size whereas the feature is definitely useful - personally I certainly want it!
    I'd like to propose to report a size of each parameter in addition to its offset, also mentioning whether it's a signed value or not. The corresponding change could be:

        inline void Param      (const char*   Parameter, int8u  Value) {Param(Parameter, Ztring::ToZtring(Value)+__T(" (0x")+Ztring().From_CC1(Value)+__T(") {1 byte unsigned}"));}
        inline void Param      (const char*   Parameter, int8s  Value) {Param(Parameter, Ztring::ToZtring(Value)+__T(" (0x")+Ztring().From_CC1(Value)+__T(") {1 byte signed}"));}
        inline void Param      (const char*   Parameter, int16u Value) {Param(Parameter, Ztring::ToZtring(Value)+__T(" (0x")+Ztring().From_CC2(Value)+__T(") {2 bytes unsigned}"));}
        inline void Param      (const char*   Parameter, int16s Value) {Param(Parameter, Ztring::ToZtring(Value)+__T(" (0x")+Ztring().From_CC2(Value)+__T(") {2 bytes signed}"));}
    ...
    

    However the output looks really ugly in this case :(
    Probably something can be done inside of the function
    void File__Analyze::Param(const Ztring& Parameter, const Ztring& Value).
    For example, it could use something like
    BS->SizeOfLastCall_Get(); (similar to 'BS->OffsetBeforeLastCall_Get();')
    to report the size of the parameter. Not sure how it should deal with signed/unsigned though - probably return a string, not an integer?
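
One possible way to avoid repeating the size and signedness text in every overload is to derive it from the integer type. This is only a sketch under that assumption: `SizeSuffix` and `DescribeParam` are hypothetical helpers, not part of MediaInfoLib, they use std::string rather than Ztring, and the hexadecimal part of the output is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical helper: builds a "{N bytes signed}" style suffix from the type.
template <typename T>
std::string SizeSuffix() {
    static_assert(std::is_integral<T>::value, "integer types only");
    const std::string unit = (sizeof(T) == 1) ? " byte " : " bytes ";
    const char* sign = std::is_signed<T>::value ? "signed" : "unsigned";
    return "{" + std::to_string(sizeof(T)) + unit + sign + "}";
}

// Hypothetical helper: "name: value {N bytes signed|unsigned}".
template <typename T>
std::string DescribeParam(const std::string& name, T value) {
    assert(!name.empty());
    // Widen first so that 8-bit values are printed as numbers, not characters.
    using Wide = typename std::conditional<std::is_signed<T>::value,
                                           long long, unsigned long long>::type;
    return name + ": " + std::to_string(static_cast<Wide>(value)) + " " + SizeSuffix<T>();
}
```

For example, `DescribeParam<std::uint16_t>("cblp", 144)` yields `cblp: 144 {2 bytes unsigned}`.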
    Also I'd like to propose to add some kind of a "header" to the information reported by the trace feature. Something similar to the following:

    File: "MediaInfo.exe"
    Size: 1,256,448 bytes
    Date: 21/08/2014
    Detected file type: Win32 Executable
    Trace dump:
    00000000 MZ (248 bytes)
    00000000 magic: MZ
    00000002 cblp: 144 (0x0090) {2 bytes unsigned}
    00000004 cp: 3 (0x0003) {2 bytes unsigned}
    ...

     
  • Well, if VS2005 and VS2008 are no longer supported for MediaInfoLib, let's either mark the corresponding project folders as outdated (e.g. "Project\MSVC2005_outdated") or just exclude them from the sources.

    For the moment, only one feature does not work (AES), and it is rarely used, which is not enough to justify removing them... They are partially supported and will be fully supported if someone is interested enough in it.

    As for the trace feature, I did not really find that its inclusion increases the file size noticeably. It's just about 15% of the original file size whereas the feature is definitely useful - personally I certainly want it!

    Lot of people already complain about the "huge" size of MediaInfo. Don't forget that MediaInfo is used by hundreds of thousands of users and they don't have the same interests as you (a lot of people really don't care about trace, and they also use "MediaInfo lite", which has a more basic GUI and is smaller in size...)

    The plan is to release a specific version of MediaInfo with trace feature enabled, when I have time to handle it.
    Also: for the moment, the feature is not stable enough (it may freeze due to the huge amount of traces and so on...), I still need to work on it before something more public. And it is not the priority of my current sponsors + I found no sponsor for this feature, so there is no short term plan for a public release.
    You can manage your own build, no problem, but a public release on the MediaInfo website is another matter (if you are interested in being a maintainer of the feature, handling feedback from users and doing a lot of development, I am interested ;-) ).

    I'd like to propose to report a size of each parameter in addition to its offset,

    I already do it for bit streams because I thought it is useful. Is it useful for bytes? We already have the offset and can see the difference between offsets. But it is possible if it is useful.

    signed value or not.

    A lot of text for little value. Is it really interesting?

    Also I'd like to propose to add some kind of a "header" to the information reported by the trace feature.

    The problem is that when the output is used in a tree view, it breaks it. See screenshot.
    It is a good idea, but maybe it will be something optional. To be discussed.

     
    • DV
      2014-08-26

      Lot of people already complain about the "huge" size of MediaInfo

      I was using UPX (http://upx.sourceforge.net/) with several versions of MediaInfo.dll and with other software as well and did not encounter any side effect or whatever. Could be considered as an option.

      Also: for the moment, the feature is not stable enough

      I think, this is definitely what I could help with. Are there some concrete examples to investigate?

      I'd like to propose to report a size of each parameter in addition to its offset

      I already do it for bit streams because I thought it is useful. Is it useful for bytes?

      Yes, it's useful for bytes because the end of one parameter is not always the beginning of the next one. I mean, there are "gaps" between some parameters of some structures, so it's worth knowing the exact size of each parameter to avoid being misled by a "gap".
      And you are right, signed/unsigned is not so important. (Though, adding 's' or 'u' to the size might be useful - though would need the explanation that 's' stands for "signed" and 'u' - for "unsigned".)
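
To illustrate why the size matters, here is a small sketch that finds such "gaps" once both the offset and the size of each parameter are known. `Field`, `Gap` and `FindGaps` are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Field { std::size_t offset; std::size_t size; };  // one reported parameter
struct Gap   { std::size_t offset; std::size_t size; };  // bytes not covered by any parameter

// Fields are assumed to be sorted by offset and not overlapping.
std::vector<Gap> FindGaps(const std::vector<Field>& fields) {
    std::vector<Gap> gaps;
    for (std::size_t i = 1; i < fields.size(); ++i) {
        const std::size_t prev_end = fields[i - 1].offset + fields[i - 1].size;
        assert(prev_end <= fields[i].offset);  // checks the "sorted, no overlap" assumption
        if (prev_end < fields[i].offset)
            gaps.push_back({prev_end, fields[i].offset - prev_end});
    }
    return gaps;
}
```

With offsets alone, a field at offset 4 followed by a field at offset 8 looks like a 4-byte field. With sizes, a 2-byte field at offset 2 followed by a field at offset 8 is reported as a 4-byte gap starting at offset 4.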

      As for the output "header" and the tree view - yes, I was thinking more about the command-line version. Probably the header could have a specific "beginning" and "ending" that will be recognized and ignored by the tree view.
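
A rough sketch of that idea, assuming the header is wrapped between two marker lines that a consumer such as a tree view can skip. The marker strings and the functions `MakeHeader` and `StripHeader` are made up for this example; nothing like them is defined by MediaInfo.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical marker lines delimiting the optional header.
const std::string kHeaderBegin = "### TRACE HEADER BEGIN";
const std::string kHeaderEnd   = "### TRACE HEADER END";

// Build a header block for the command-line output.
std::string MakeHeader(const std::string& file_name, unsigned long long size_bytes) {
    assert(!file_name.empty());
    std::ostringstream out;
    out << kHeaderBegin << '\n'
        << "File: \"" << file_name << "\"\n"
        << "Size: " << size_bytes << " bytes\n"
        << kHeaderEnd << '\n';
    return out.str();
}

// Remove everything between the markers (inclusive), keeping the trace lines.
std::string StripHeader(const std::string& text) {
    std::istringstream in(text);
    std::ostringstream out;
    std::string line;
    bool in_header = false;
    while (std::getline(in, line)) {
        if (line == kHeaderBegin) { in_header = true;  continue; }
        if (line == kHeaderEnd)   { in_header = false; continue; }
        if (!in_header) out << line << '\n';
    }
    return out.str();
}
```

The command-line tool would print `MakeHeader(...)` followed by the trace, while a tree view would pass the text through `StripHeader` before building its nodes.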

       
      • I was using UPX (http://upx.sourceforge.net/) with several versions of MediaInfo.dll and with other software as well and did not encounter any side effect or whatever. Could be considered as an option.

        UPX is flagged as risky by several anti-virus programs, and the problem is the size of the compressed (LZMA) installer; UPX does not compress better than LZMA. The size on disk is not a big issue.

        Also: for the moment, the feature is not stable enough

        I think, this is definitely what I could help with. Are there some concrete examples to investigate?

        Offsets are wrong with bitstreams.
        With AVC and HEVC, the offsets are the ones computed after the 0x000003 special bytes are removed, not the real ones.
        If a frame spans 2 packets, the offset is wrong.
        Sometimes (I remember seeing it with MPEG-PS), data is displayed weirdly because parsing is unfinished before the next part of the file is loaded.
        Speed issue with MPEG-TS. There is actually too much data with TS files.
        I have no specific example in mind, but I have already seen some cases of missing data for the stream in special cases (container...) --> My conclusion was that I need to review all the files I have.
        No good GUI for it (I have the wxWidgets version but it is an awful hack; the Qt version also has this feature but it is not tested enough and not made by me. The tree view as in the screenshot is important for me. Issue: for the moment, the Windows GUI is done with the crappy old Borland VCL, and I am not very motivated to update this code; I would rather build a new GUI, but this is not the priority).
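
For the AVC/HEVC item, one conceivable fix is to keep a map from each unescaped byte back to its position in the original data while the 0x03 emulation prevention bytes are removed. The sketch below only illustrates that idea; `Unescaped` and `RemoveEmulationPrevention` are hypothetical names, and this is not how MediaInfo currently handles it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Unescaped {
    std::vector<std::uint8_t> bytes;         // payload with emulation prevention bytes removed
    std::vector<std::size_t>  source_offset; // source_offset[i] = index of bytes[i] in the input
};

// Drop each 0x03 that follows two zero bytes, remembering where the kept bytes came from.
Unescaped RemoveEmulationPrevention(const std::vector<std::uint8_t>& in) {
    Unescaped out;
    std::size_t zeros = 0;  // number of consecutive 0x00 bytes just seen
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (zeros >= 2 && in[i] == 0x03) {
            zeros = 0;      // skip the emulation prevention byte
            continue;
        }
        out.bytes.push_back(in[i]);
        out.source_offset.push_back(i);
        zeros = (in[i] == 0x00) ? zeros + 1 : 0;
    }
    assert(out.bytes.size() == out.source_offset.size());
    return out;
}
```

A trace entry for the byte at index `i` of the unescaped payload could then report the start offset of the escaped data plus `source_offset[i]` instead of plus `i`.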

        I am interested in having contributions, but this one is big if you want a public release. Are you sure you'll be interested in working on issues you don't have on your side (e.g. formats you don't care about)?

        Thinking about that, if it is so interesting, I could try to add the trace feature in the Windows GUI (text view only) with a big warning "not stable" and see if I still get complaints about size...

        Yes, it's useful for bytes because the end of one parameter is not always the beginning of the next one.

        OK.

        As for the output "header" and the tree view - yes, I was thinking more about the command-line version. Probably the header could have a specific "beginning" and "ending" that will be recognized and ignored by the tree view.

        Acceptable. My unstable tools use the DLL directly, so I'll let you propose a patch in the CLI code with a header and footer you'd like ;-).

         
  • DV
    2014-08-29

    OK, I'll play with the command-line version with regard to adding some "header" to the tracing feature. (I could not do it earlier because I was a little overloaded this week.)
    With regard to contributions... it's again a case where I was thinking about the library itself and the command line, not about the GUI :) I'll be glad to help with debugging the tracing feature when it is needed (in case of some issues and so on). And for sure I'll test it with my local files.
    As for the GUI, I used "Win32++" some time ago and found it very good. Not sure how useful this information is, though, because it's not cross-platform and personally I am kind of tired of all these endless handles, controls, window procedures, message loops and so on.