From: Mike M. <mel...@pc...> - 2003-01-16 20:47:08
On Thu, 16 Jan 2003, John McCutchan wrote:

> Yes I think that is correct, I don't really know anything about xine
> so I am not sure how to calculate the pts. This same code appeared in
> many demuxers so I copied it. What exactly does a pts represent?

pts stands for presentation timestamp. In MPEG jargon, the pts is the
time at which a piece of decoded data (either a video or audio buffer)
is supposed to be presented to the output unit (monitor or speakers).
xine handles pts in the MPEG tradition: in reference to a 90 kHz clock.
For example:

  pts 0      is 0 sec
  pts 90000  is 1 sec
  pts 135000 is 1.5 sec
  pts 180000 is 2 sec

Let's take some video examples. Suppose that the video in a file is
supposed to be played at 15 frames/second. At what pts should frame 0
be displayed? 0, that's the easy one. How about frame 1?

   1     pts
  --- = -----   =>   pts = 1 * 90000 / 15 = 6000
  15    90000

Frame 2 is presented at pts 12000, frame 3 at 18000, and so on. In
general, frame n will be presented at (n * 90000 / 15) for this 15 fps
file. Note that some files have variable framerates, where a single
fixed formula like this does not apply.

For audio you will be sending a buffer of PCM samples. Suppose the
audio is mono, 8 bits, 44100 Hz. 1 second of audio will be 44100 bytes
long. When dispatched in reference to the 90 kHz clock, these 44100
bytes correspond to 90000 pts units. So if you have a stream of PCM
data with the properties stated above:

  # of bytes    pts
  ---------- = -----
    44100      90000

But the formula does not always hold. If the same data is stereo, there
will be twice as much data (one 8-bit sample per channel) in 1 second:

   # of bytes
   ----------
       2          pts
  ------------ = -----
     44100       90000

Further, if the audio were 16 bits instead of 8 bits, an individual
audio frame would be 4 bytes instead of 2:

   # of bytes
   ----------
       4          pts
  ------------ = -----
     44100       90000

Wait, what's an audio frame? Well, this audio has a sample rate of
44100 samples/second.
An audio frame is the amount of data that is sent out at each of those
44100 sample points:

  properties:         frame size:
  8 bits, mono      = 1 byte
  8 bits, stereo    = 2 bytes
  16 bits, mono     = 2 bytes
  16 bits, stereo   = 4 bytes

BTW, you may have spotted that in the audio decoder you have to specify
the number of frames in the audio buffer where you are putting the
decoded PCM for dispatch. The number of frames will be the total number
of bytes divided by the frame size. So now our general audio pts
calculation becomes:

   # of bytes
   ----------
   frame size     pts                   90000 * # of bytes
  ------------ = -----   =>   pts = ------------------------
  sample rate    90000              frame size * sample rate

All of that covers decoded PCM audio. With VBR audio, either all
variable-length chunks are going to decode to the same number of PCM
samples or there is going to be some way to determine the variable size
of the decoded audio from an individual block. Either way, you will
likely keep a running tally of total decoded PCM bytes in either the
demuxer or the decoder and run this calculation before dispatching the
final audio buffer.

That was a long explanation. But now that I have typed it out it should
probably go in the Hacker's Guide. Go on, ask another question... :)

Hope this helps...
--
	-Mike Melanson
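
For anyone who prefers to see the arithmetic as code, the video and
audio pts formulas from this message can be sketched roughly as follows.
This is an illustrative Python sketch only, not xine code: every
function name here is hypothetical, integer division is assumed to be
acceptable, and stamping each buffer with the pts of its first byte is
an assumption about how the running tally would be used.

```python
PTS_PER_SECOND = 90000  # ticks per second of the 90 kHz reference clock


def video_pts(frame_number, fps):
    """pts of frame n in a constant-framerate stream (hypothetical helper)."""
    return frame_number * PTS_PER_SECOND // fps


def audio_frame_size(bits_per_sample, channels):
    """Bytes sent out at each sample point: one sample per channel."""
    return (bits_per_sample // 8) * channels


def audio_pts(total_bytes, bits_per_sample, channels, sample_rate):
    """pts = 90000 * bytes / (frame size * sample rate)."""
    frame_size = audio_frame_size(bits_per_sample, channels)
    return PTS_PER_SECOND * total_bytes // (frame_size * sample_rate)


def buffer_start_pts(buffer_sizes, bits_per_sample, channels, sample_rate):
    """Keep a running tally of decoded PCM bytes and yield one pts per buffer.

    Assumption: each buffer is stamped with the pts of its first byte.
    """
    total_bytes = 0
    for size in buffer_sizes:
        yield audio_pts(total_bytes, bits_per_sample, channels, sample_rate)
        total_bytes += size


if __name__ == "__main__":
    print(video_pts(1, 15))                 # frame 1 at 15 fps
    print(audio_pts(44100, 8, 1, 44100))    # 1 second of 8-bit mono
    print(audio_pts(176400, 16, 2, 44100))  # 1 second of 16-bit stereo
    print(list(buffer_start_pts([4410, 4410, 4410], 8, 1, 44100)))
```

The frame size is what keeps the result independent of sample width
and channel count: one second of audio comes out as 90000 pts units
whether it occupies 44100 bytes (8-bit mono) or 176400 bytes (16-bit
stereo).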