
Decoding MPEG4 and h264 Frames from RTP bytes [C#]

2012-11-02
2014-11-07
  • juliusfriedman

    juliusfriedman - 2012-11-02

    Hello All,
    I have created some code in pure C# which allows for RTSP / RTP Communication. I have successfully created a Server and Client and I have verified players such as VLC can display streams from my server. Performance is great and I am happy overall so far with the transport implementation.
    I now have the task of displaying individual frames from the streams when requested for a Web Server in the form of JPEG or BMP.

    I have accomplished this for JPEG streams quite easily and now I have the task of MPEG4 and h264. (I have attached the source code below to show the encapsulation I am trying to achieve).
    I was hoping that Media Foundation had facilities I could utilize to decode directly from bytes when requested.

    E.g. I have an RTPFrame from an MPEG4 source, and I want to take this RTPFrame, which consists of MPEG4-ES data in the form of a byte array, pass it to Media Foundation, and receive a Bitmap or RGB values which I could write to a Bitmap.

    I was hoping there was an example available which showed a developer how to accomplish something like this, as most of the resources available read from files or servers, and not from RTP sources or raw bytes, which is what matters in my case.

    Ideally I would be able to set up a single IMFSourceReader (or whatever the appropriate interface instance is) per video type, feed frames from the given camera on demand when they arrive, and then call some method to retrieve the decoded data.

    My implementation allows me to keep or remove the RTP headers from each packet and join them with a simple ToBytes() (from the RTPFrame or derived). I can prepend this with the SDP or any other information which would be required.

    So for 10 streams of the same type I would only need a single decoder and I would pass the complete frames (with SDP config preceding each) when demanded and I would then save the resulting data in the output stream for display using GDI as I have with JPEG RTP data.
    If someone could show me how to get this working, or at least give me links to the correct interfaces to utilize with a high-level description of how to achieve what I am trying to do, I would appreciate it greatly; my only other options would be to utilize VLC or some other external library, or to port a decoder.

    I imagine I should be able to end up with a ProcessPackets which just creates a MemoryStream, adds the SDP config bytes, and adds the joined frame bytes via ToBytes().
    Then in ToImage() I should be able to pass the created stream to a Media Foundation interface instance, on which I will then be able to call some method, I suppose GetSample(), and then create a Bitmap from the returned value in some fashion....

    internal void ProcessPackets()
    {
        Buffer.Write(SDPConfig, 0, SDPConfig.Length);
        var allPackets = ToBytes(); // Rtp headers removed
        Buffer.Write(allPackets, 0, allPackets.Length);
    }

    internal System.Drawing.Image ToImage()
    {
        // Need help with this part...
    }

    I just can't seem to find examples or definitions which allow this type of interaction...

    I hope this is possible using Media Foundation otherwise I have a lot of porting or interop to do.

    Attached is how I am handling JPEG Frames so you have an idea of the encapsulation I am trying to achieve.

    I can't just use the examples with the 'rtsp' url to my media on my server, as I get a 'Not support streaming media server' error from this library, so now I am trying to find methods of just decoding frame by frame as explained above.

    If this is possible I would even be willing to donate my Rtsp / Rtp classes to this framework to allow this type of task to be performed more easily. I also imagine archiving and transcoding would be much easier as well, and this framework combined with my transport code would allow a media server to be created fairly easily.

    Thanks,
    Jay

     

    Last edit: juliusfriedman 2012-11-02
  • snarfle

    snarfle - 2012-11-03

    Hmm. I'm not sure whether you can use MF for this or not.

    I know you can create source filters for data that MF consumes. In fact, the WavSource sample shows how to do this for wav files.

    In theory you could then use other MF methods (SampleGrabber or Source Reader) to retrieve samples in a specified output format. But I don't know that MF is going to be comfortable acting as a simple "sample converter."

    Still can't hurt to try.

     
  • juliusfriedman

    juliusfriedman - 2012-11-03

    Thanks snarfle!

    I think I see a roundabout way to achieve it...

    I would utilize the IMFByteStream interface and then call Read for every byte[] I want to pass it.

    I would then call MFCreateSourceReaderFromByteStream and use the GetSample method.

    Immediate questions I have are:

    What type of data is the IMFByteStream expecting? e.g. How should I pass the bytes... packet by packet or frame by frame... and where would I specify the decoder config from SDP?

    Since it seems the above requires a lot of code just to get to a media sample, can't a sample just be created manually from my bytes, and the sample then painted onto a bitmap? (This still doesn't tell me what format it expects and how to pass the SDP config.)

    E.g.

    // Using the MediaFoundation.NET wrappers:
    MFExtern.MFCreateMemoryBuffer(source.Length, out buffer);
    buffer.Lock(out ptr, out maxLength, out currentLength);
    Marshal.Copy(source, 0, ptr, source.Length);
    buffer.Unlock();
    buffer.SetCurrentLength(source.Length);
    MFExtern.MFCreateSample(out sample);
    sample.AddBuffer(buffer);

    Then I just need to figure out how to get the bitmaps from the Sample.

    Is there any way to determine what data the sample expects or how to pass the config data?
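    For reference, here is a hedged sketch (untested, using identifiers from the MediaFoundation.NET wrappers) of how the SDP decoder config might be attached to a media type. For MPEG-4 Part 2 elementary streams, the SDP "config=" blob (the VOS/VOL headers) is what MF_MT_MPEG_SEQUENCE_HEADER is documented to carry; for H.264, the SPS/PPS NAL units are normally left in-band in the Annex-B stream instead. The subtype GUID used here (MP4V) is an assumption and depends on the actual stream:

```csharp
using MediaFoundation;
using MediaFoundation.Misc;

class MediaTypeSetup
{
    // Sketch: build an input media type describing an MPEG-4 Part 2
    // elementary stream, with the SDP config blob attached.
    public static IMFMediaType CreateMpeg4Type(byte[] sdpConfig, int width, int height)
    {
        IMFMediaType mt;
        int hr = MFExtern.MFCreateMediaType(out mt);
        MFError.ThrowExceptionForHR(hr);

        mt.SetGUID(MFAttributesClsid.MF_MT_MAJOR_TYPE, MFMediaType.Video);
        mt.SetGUID(MFAttributesClsid.MF_MT_SUBTYPE, MFMediaType.MP4V); // assumption: MPEG-4 Part 2 subtype
        mt.SetBlob(MFAttributesClsid.MF_MT_MPEG_SEQUENCE_HEADER, sdpConfig, sdpConfig.Length);

        // MF_MT_FRAME_SIZE packs width in the high 32 bits, height in the low 32.
        mt.SetUINT64(MFAttributesClsid.MF_MT_FRAME_SIZE, ((long)width << 32) | (uint)height);
        return mt;
    }
}
```

    This media type would then be handed to whatever consumes the samples (for example via SetCurrentMediaType on a source reader stream), but whether the pipeline accepts it this way is exactly the open question.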

    Thanks!

     

    Last edit: juliusfriedman 2012-11-03
  • snarfle

    snarfle - 2012-11-04

    Since I haven't tried what you aim to accomplish, I think I'm going to defer this to the MS folks. I'd suggest asking this in the MSDN forum.

    You might want to be a little vague about what programming language you are using, as some people will refuse to answer even basic questions if they realize you aren't using C++. Don't ask me why.

     
  • juliusfriedman

    juliusfriedman - 2012-11-05

    Thanks Snarfle! I already had a post there before coming here... I will try to bump it and see what happens... Thanks again!

     
  • juliusfriedman

    juliusfriedman - 2014-11-07

    Just an FYI: this is not possible out of the box; middleware is required to demux RTP and RTSP.

    My library (http://net7mma.codeplex.com) can mux and demux RTP, and then you can use MF to render.
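    To illustrate the kind of middleware involved, below is a minimal sketch (not the library's actual code) of H.264 RTP depacketization per RFC 6184, assuming each input byte[] is an RTP payload with the 12-byte RTP header already removed. It handles only single NAL unit packets (types 1-23) and FU-A fragments (type 28); STAP/MTAP aggregation and all error handling are omitted for brevity:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class H264Depacketizer
{
    static readonly byte[] StartCode = { 0, 0, 0, 1 };

    // Converts a sequence of RTP payloads into an Annex-B byte stream
    // that a decoder can consume after demuxing.
    public static byte[] ToAnnexB(IEnumerable<byte[]> payloads)
    {
        var output = new MemoryStream();
        foreach (var p in payloads)
        {
            int nalType = p[0] & 0x1F;
            if (nalType >= 1 && nalType <= 23)
            {
                // Single NAL unit packet: prepend a start code and copy as-is.
                output.Write(StartCode, 0, StartCode.Length);
                output.Write(p, 0, p.Length);
            }
            else if (nalType == 28) // FU-A fragment
            {
                bool start = (p[1] & 0x80) != 0;
                if (start)
                {
                    // Rebuild the original NAL header from the FU indicator
                    // (F + NRI bits) and the FU header (type bits).
                    byte nalHeader = (byte)((p[0] & 0xE0) | (p[1] & 0x1F));
                    output.Write(StartCode, 0, StartCode.Length);
                    output.WriteByte(nalHeader);
                }
                // Skip the two FU bytes; append the fragment payload.
                output.Write(p, 2, p.Length - 2);
            }
            // Other packetization modes (STAP-A etc.) would be handled here.
        }
        return output.ToArray();
    }
}
```

    The resulting Annex-B stream (with the SPS/PPS left in-band) is the form a downstream decoder, such as MF's H.264 transform, expects.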

     

    Last edit: juliusfriedman 2014-11-07
