transcoding suggestions

  • charles shick

    charles shick - 2007-06-15

    hi jin,

    back in a couple of threads, you had mentioned that mediatomb might be doing some transcoding in future versions ...

    another related feature that would be very cool: during the scan process of videos, use `tcprobe` or another tool to detect which elementary streams (presentation units) are contained in a vob file. once detected, the administrator or even the user could select from the menu which presentation units should be sent over the wire.

    this would allow users to select audio format, language, and subtitles from the server rather than from the client :-)

    thanks again


    • Jin

      Jin - 2007-06-15

      that's indeed an interesting feature! I'll take a look at this tcprobe tool, let's see how this could be done.

    • Torsten Schlabach

      Is there any transcoding available yet at all?

      Are there any hooks in the code right now?

    • Jin

      Jin - 2007-07-06

      Currently it is not available; the only "hook" or workaround:
      add an external URL item to MediaTomb that points to a VLC server on your local network. Then make
      sure that VLC is set up to play the file of your choice. I think someone tried it and it should work.

      We plan to bring out a new release with playlist and inotify scan support this weekend, let's see if we can
      make it. After that we will start implementing the transcoding feature.

    • YuChung Wang

      YuChung Wang - 2007-07-26

      However, how do we "select it from the server"? It's not a good idea to select them one by one from the server. Instead, we may export this information by using a virtual folder.

      For example, if aaa.vob contains one video stream and three audio streams, we may export it as

          - aaa(Chinese)
          - aaa(English)
          - aaa(Spanish)

      We can write a simple parser to extract the audio and video information from the VOB file and then implement a transcoder in the HTTP server to extract only the selected streams from the VOB file itself. In this way, any UPnP client device can play the VOB with a near-DVD experience.
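
The virtual folder layout described above can be sketched roughly as follows. This is only an illustration in Python, not MediaTomb code: the `AudioStream` type, the `virtual_items` helper and the dictionary keys are hypothetical names, and the stream list is assumed to come from some VOB parser that is not shown here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AudioStream:
    index: int     # position of the audio stream inside the VOB (assumed to come from a parser)
    language: str  # human-readable language label


def virtual_items(filename: str, streams: List[AudioStream]) -> List[Dict]:
    """Build one virtual item per audio stream of a single VOB file."""
    base = filename.rsplit(".", 1)[0]  # "aaa.vob" -> "aaa"
    return [
        {"title": f"{base}({s.language})", "source": filename, "audio_index": s.index}
        for s in streams
    ]


items = virtual_items(
    "aaa.vob",
    [AudioStream(0, "Chinese"), AudioStream(1, "English"), AudioStream(2, "Spanish")],
)
for item in items:
    print(item["title"])
```

Each virtual item keeps a reference to the same source file plus the index of the audio stream to pick, so the server would only need to decide at request time which streams to send.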

      In addition, we can follow the DLNA way to generate a virtual IFO file from the VOB and export it as an attribute of the item. However, I doubt that any DMA supports this function yet.

    • Jin

      Jin - 2007-07-26

      That's actually a very good idea!

      As I mentioned in some of my other posts on the forum, we plan to introduce transcoding in two steps (i.e. there will be two releases, first "basic" or "generic" version of transcoding, and then an improved version will follow).

      The basic version allows hooking in any external program to act as a transcoder; it allows passing parameters to that program, depending on the transcoding profile, which can be defined in config.xml.
      That means that you could use anything (for example mencoder or just simple mpg321 for mp3->pcm conversion) to transcode your data, the only important thing is that it is able to write to a fifo (that's where we will be reading the transcoded stream from). Right now I am not sure if we can offer the language selection feature in the basic version, I will have to think this through.
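
The FIFO hand-off described above could look roughly like this. It is a minimal POSIX-only sketch in Python, not the MediaTomb implementation: the `read_via_fifo` helper and the `%out` placeholder are hypothetical, and a tiny Python one-liner stands in for a real transcoder such as mencoder.

```python
import os
import subprocess
import sys
import tempfile


def read_via_fifo(command_template, chunk_size=65536):
    """Run an external program that writes to a FIFO and return what it wrote."""
    workdir = tempfile.mkdtemp()
    fifo_path = os.path.join(workdir, "transcode.fifo")
    os.mkfifo(fifo_path)  # POSIX only
    # Substitute the FIFO path for the hypothetical "%out" placeholder.
    command = [fifo_path if arg == "%out" else arg for arg in command_template]
    proc = subprocess.Popen(command)
    data = bytearray()
    try:
        # open() blocks until the external program opens the FIFO for writing.
        with open(fifo_path, "rb") as fifo:
            while True:
                chunk = fifo.read(chunk_size)
                if not chunk:  # writer closed the FIFO: end of stream
                    break
                data.extend(chunk)
    finally:
        proc.wait()
        os.remove(fifo_path)
        os.rmdir(workdir)
    return bytes(data)


# Stand-in "transcoder": writes fixed bytes to the path given as its argument.
fake_transcoder = [
    sys.executable, "-c",
    "import sys; open(sys.argv[1], 'wb').write(b'transcoded-stream')",
    "%out",
]
result = read_via_fifo(fake_transcoder)
print(result)
```

A real server would forward each chunk to the HTTP client as it arrives instead of collecting everything in memory, and it would need a timeout in case the external program exits without ever opening the FIFO.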

      The improved transcoding support will use libraries like ffmpeg to do transcoding natively, this will give us maximum control over the data - so adding the vob language selection there should be fairly easy.

      Currently I am working on the basic version, and it looks like it will be done very soon.

      So the plan is to have two releases:
      1. basic transcoding using external transcoders
      2. additionally native transcoding with advanced features

      Another interesting thing would be to use an external transcoding server, let's see if we can offer that as well.

      • charles shick

        charles shick - 2007-07-26

        hi jin, hi all,

        i'm not a programmer ... but, if mediatomb could read from a fifo socket, that would be really nifty:

        1) ivtv writes its PS to a socket, so mediatomb could now deliver TV from Hauppauge and others :-)
        2) if mediatomb could just read from a socket and stream the contents via http to a rendering client, then we're golden.

        i can already imagine feeding a vob to tcdemux to parse up a PS -- just feed the whole vob to tcdemux, and it spits out what you've told it to the socket -- one video track and one audio track for example ...

        super fantastic :-) :-) :-)



    • Jin

      Jin - 2007-07-26

      well, actually reading from a socket would already be possible (I think I'd have to change one line or so), so I could enable it for items that are created manually from the UI, then this ivtv thing would probably work already.

      let me know if you want to play around with dev-code :>

    • Jin

      Jin - 2007-07-26

      OK, I made sure that a FIFO can now be added via the UI (not via FS view, but in DB view): click the "[+]" icon, select "item" from the dropdown list and simply enter the location of your fifo in the appropriate field.

      the most important thing is to get the mimetype right - you will only be able to use one particular fifo for one particular type of content (i.e. only video or audio, etc.) - the renderer needs to know what it will be playing, thus you have to set the mimetype correctly.

      when requested, the stream from the fifo will be served to the renderer in a usual way (just like with any other item)

      let me know how it works out :>

    • YuChung Wang

      YuChung Wang - 2007-07-27

      There are two features here. One is transcoding and the other is virtual folder support. The virtual folder is a generic playlist, which produces a folder from a single file, such as M3U/PLS. It can be used to support structured files, such as transport stream, VOD, DVD filesystem, and VCD filesystem as well.

      For the transcoding, we had better export it using DLNA-like tags such as DLNA.ORG_CI to indicate whether it is a transcoded resource or not. It means that we will have two resources for a single item.
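
The "two resources for a single item" idea might be sketched like this in Python. The `protocol_info` and `build_item` helpers, the dictionary layout and the `server.example` URLs are all hypothetical, and the value convention for DLNA.ORG_CI (1 for converted content, 0 otherwise) is an assumption based on common usage rather than on the DLNA specification itself.

```python
def protocol_info(mimetype, transcoded):
    """Return a protocolInfo string whose fourth field carries the CI flag."""
    ci_flag = 1 if transcoded else 0  # assumed convention: 1 = converted content
    return "http-get:*:{}:DLNA.ORG_CI={}".format(mimetype, ci_flag)


def build_item(title, original, transcoded):
    """Attach the original and the transcoded resource to a single item."""
    resources = []
    for (url, mimetype), is_transcoded in ((original, False), (transcoded, True)):
        resources.append(
            {"url": url, "protocolInfo": protocol_info(mimetype, is_transcoded)}
        )
    return {"title": title, "resources": resources}


# Hypothetical URLs and mimetypes, for illustration only.
item = build_item(
    "aaa",
    ("http://server.example/content/aaa.vob", "video/mpeg"),
    ("http://server.example/transcode/aaa.avi", "video/x-msvideo"),
)
for res in item["resources"]:
    print(res["protocolInfo"])
```

A renderer that understands the flag could then prefer the original resource and fall back to the transcoded one only when it cannot play the source format.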

      For playlists, it is different. We will export it as a container object with multiple child item objects under it. We can export it as a multi-level folder tree as well if necessary. I treat this kind of information as an EPG. For example, if we find an XMLTV file, we may export it as a lot of virtual folders which show the information by time, by date or by other criteria. I don't know if we can do it by using the javascript support or not. If yes, it's fantastic. For example, we may use it to implement youtube support as well.

    • Jin

      Jin - 2007-07-27

      Well yes, that is how I am doing it now with transcoding: I add resources to the same item. I only have a problem with DLNA - the spec is not freely available; by now we have quite a good understanding of what the various tags mean, however I'd still prefer to read the spec and that will unfortunately not happen. So far we had to add a few tags for the Playstation 3 - that's triggered by the <protocolInfo extend="yes"/> parameter in config.xml

      Now, a word on playlists:
      they can indeed be used to point to or to import online content, also, since playlist parsing is implemented via js you could write a parser for any kind of ASCII-based playlist file, we do not have xml support there so you would probably have to stick to regular expressions and string comparisons; that's why I do not see the playlist feature as an online content feature.

      Also, online content is different in the way that it can often change and our database needs to be synced/updated with a particular service. So my idea is to provide some sort of a generic API that would allow to easily fetch and keep in sync content from services on the network. It should be made in a way that MediaTomb knows how to deal with a particular service and can automatically refresh the content information/content layout when needed.

      But well.. we will finish transcoding first :)

    • YuChung Wang

      YuChung Wang - 2007-07-28

      I am more interested in adding DVD library support to mediatomb so that we can export the network DVD library to the devices in the UPnP network. In order to do this, we need to add a DVD parser to extract the audio track/subtitle information from the DVD filesystem and present them as virtual folders and items.

      I am thinking of using libdvdread to extract information from the DVD filesystem and create the virtual folders. In addition, we need to enhance the HTTP server to extract the correct audio/video and subtitle streams from the DVD filesystem and convert them to an MPEG2 PS stream.

      However, there is a big issue for the transcoding here. How do we want to implement the trick mode? Most media players implement their own client side trick mode. Unfortunately, none of these implementations will be compatible with transcoding since all of them require reading the index table before starting trick mode. The only solution for this is the server side trick mode. It means that the client will send special HTTP requests to the media server and expect the media server to produce the trick mode stream for them.

    • Jin

      Jin - 2007-07-28

      To be honest - I am not very familiar with the DVD filesystem, I'll have to do some reading on this. Do we really need to apply transcoding here, or is it just an issue of picking the right streams? Well, I have to read up on the topic before I make any further comments.

      Regarding trickmodes and transcoding... yes, that is indeed a problem; I have some ideas on how this could be solved (I have a big hack in mind but it needs to be tested), another option is time based seeking instead of the normal byte range requests, so far I have not seen a renderer that supports that, although someone told me that the Telegent TG100 may be able to do that. I'd have to read up on this topic too.

      One more word on the index table... I think that depends on the format? Do you really need an index table on a continuous MPEG1/MPEG2 videostream? Assuming that we know the total length, wouldn't the usual range requests do the job? I think the main problem with transcoding and seeking is that we do not know the size of the resulting transcoded video...

      As for special server side trickmodes: I think the server should try and support the usual range request scheme that most clients use for seeking; my idea is to try and fake the range thing on content where the actual size of the data is not known, this would make sure that all renderers are supported. I would not like to resort to "special" HTTP requests since it would introduce incompatibility to various devices... well.. I'm thinking of the redsonic headers that are used on the DSM - why exactly was that done? I do not really see a reason for this, apart from the attempt to make the renderer incompatible with other servers.

    • YuChung Wang

      YuChung Wang - 2007-07-30

      We don't need real transcoding for DVD. Actually, the blocks in the DVD might not even be sequential. libdvdread has the capability to extract the correct blocks from the DVD filesystem and generate an MPEG2 stream from them.

      For the transcoding, time-based seek is different from the real trick modes. FF/FB requires the server or client to get the I frames from the whole stream and play them at faster speed. Real FF/FB cannot be implemented with time-based seek since it cannot tell us the location of the I frames.

      In order to do this, we need a real server side trick mode. Currently, all viiv compatible renderers have such capability. Actually, DLNA has defined the server side trick mode. Viiv defines its own extension as well. The time-based seek is not defined in UPnP either. It is only defined in DLNA, the same as server side trick mode.

      In my experience, the work for server side trick mode is huge. There are a lot of corner cases. The time-based seek might be easier. If we want to implement it, we must follow the DLNA specification so that we can find renderers which support this function.

      Currently, there are only a few viiv devices still on the market. In the US, the DSM510 is available from D-link. In Europe, the AMP150 is available from FSC. The DSM520 can use the firmware for the AMP150 if you know how to upgrade it.

    • Jin

      Jin - 2007-07-30

      Well, again the problem that we have - the DLNA spec is not freely available, so I have no idea what exactly it is defining.

      Thus my thought, to hide the whole seeking behind the usual range requests mechanism which is supported by all renderers. I do not know if this will be possible, we would have to investigate this.

    • YuChung Wang

      YuChung Wang - 2007-07-30

      Since MPEG has no index table, most renderers will implement some kind of trick to extract I frames from the MPEG file. The most common trick is to read a chunk which is big enough to contain at least one I frame and then do a big jump to skip some I frames. For example, it may read 0-100K to get at least one I frame and then read 400K-500K to get another one.

      However, the jump distance is totally adaptive. It means that it will use the bitrate and the speed to guess how far it should jump. Different renderers may have different strategies.

      According to this behaviour, we can implement the range seek in the following way.

      When we receive a range seek, we use the start offset to emulate a seek in the original file. For example, if we receive a 1024000-1025000 range seek, we use 1024000 to determine a time. If the transcoding is 1Mbps CBR, we can easily get the time as 1024000/(1024*1024/8) = 7.81 seconds. Assume that the GOP of the original file is 0.5 seconds. We seek to the 8 second mark in the original file and generate the transcoded stream from there.
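
The arithmetic above can be checked with a few lines of Python. The `seek_time_for_offset` helper is a hypothetical name, and snapping to the nearest GOP boundary is an assumption made here to reproduce the 8 second figure; the post does not say which rounding rule to use.

```python
def seek_time_for_offset(byte_offset, bitrate_bps, gop_seconds):
    """Map the start offset of a range request to a seek time in the source.

    Returns the raw time for a CBR stream and that time snapped to the
    nearest GOP boundary (the snapping rule is an assumption).
    """
    bytes_per_second = bitrate_bps / 8
    raw_seconds = byte_offset / bytes_per_second
    gop_index = int(raw_seconds / gop_seconds + 0.5)  # round to nearest GOP
    return raw_seconds, gop_index * gop_seconds


# Numbers from the example: 1 Mbps CBR (1024*1024 bits/s), 0.5 second GOP.
raw, snapped = seek_time_for_offset(1024000, 1024 * 1024, 0.5)
print(raw, snapped)
```

This only holds for constant bitrate output; with a variable bitrate transcoder the mapping from byte offset to time would have to be estimated some other way.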

      One issue is the transcoding speed. The other issue is that "adaptive" behaviour. We need to fake it as well. I don't know if we can implement a strategy which fits all renderers.

    • StephanV

      StephanV - 2007-09-24

      Are there any plans to include a type of transcoding that would help with bitrate reduction?  I find that in my environment, for instance, anything above 700-800kbps or so will cause some video pauses.  I've noticed that MPEG-4 source encoded at about 300kbps gives me no issues.  Maybe my choppy video is another underlying issue, but I've tried with two different Ubuntu boxes (a 1GHz NetVista, and a VMware guest running on a 2.8GHz hyperthreaded host).

      I was trying to rule out CPU utilization issues on the 1GHz machine (it was constantly in the 60-80% while video was playing, but that was under Twonky) by also trying on a VMware guest, but I was getting choppy video there too.

      • Jin

        Jin - 2007-09-24

        Yes, that is possible using the current SVN, just create an appropriate transcoding profile; in your case the target mimetype will simply equal the source mimetype which is not a problem at all.

