From: David I. L. <dl...@vt...> - 2003-01-30 22:59:12
|
walters is having a slight problem with mp3 type finding. the code in gstmp3types.c does a simple check for ID3 tags and a start code at the beginning of the buffer. However, this doesn't work for some iradio streams which do not start the stream at the beginning. This causes further type find functions to be called and it turns out festival plugin decides the mp3 data is actually "text/plain" and it can handle it. Oops. The simple fix for this right now is to have the mp3types.c code scan the entire buffer it gets for a startcode. This has a performance penalty if the buffer is something else. And it also will fail if not enough data is available. [As a side note, this mp3 type code has buffer range errors. It could, in theory, segfault if the input buffer is < 4 bytes. Or more likely, do type detecting on invalue buffer data! Probably need to do more auditing for such errors elsewhere as well.] Possible longer term solutions might be to let the type find function return some more info on it's state and have optional effort flag. An "effort" flag would tell the type find function if it should just check the front of the buffer or do a more extensive scan through the whole buffer. Returned values would be a "confidence" value, a guint64 specifying if more data is needed to do a higher confidence check, and caps of the potentially found type. An algorithm using this would be to scan all type find functions with a buffer at "low effort". If "high confidence" caps are found, stop. Else continue checking. If nothing is found do a "high effort" check. This would tell the type find functions to do more extensive check through the whole stream (or whatever they determind appropriate) in case the stream is out of sync. (as is the case with walters iradio stream). I'm not sure what the failure modes are if this fails too. Either pick the highest confidence type so far or bail out with an error. This sort of thing involves a bit of API change. Adding effort to the type find signature: typedef GstCaps* (*GstTypeFindFunc) (GstBuffer *buf, gpointer priv); to typedef GstCaps* (*GstTypeFindFunc) (GstBuffer *buf, GstTypeFindEffort effort, gpointer priv); And some flags/values for the effort type. The return values like confidence and more-data-needed could be added to caps or maybe that's overloading the caps idea too far: typedef GstCaps* (*GstTypeFindFunc) ( GstBuffer *buf, /* buffer */ GstTypeFindEffort effort, /* effort to use */ GstTypeFindCapsConfidence *confidence, /* out: result caps confidence */ guint64 *data_request_size, /* out: need more data */ gpointer priv); I think for the immediate fix the mp3 code should scan the entire buffer it gets. Which is slower and could fail if buffer is short and has no start code. But it's better than nothing. Thoughts? -dave |
From: David I. L. <dl...@vt...> - 2003-01-30 23:18:10
|
* David I. Lehn <dl...@vt...> [20030130 18:05]: > walters is having a slight problem with mp3 type finding. the code in > gstmp3types.c does a simple check for ID3 tags and a start code at the > beginning of the buffer. However, this doesn't work for some iradio > streams which do not start the stream at the beginning. This causes > further type find functions to be called and it turns out festival > plugin decides the mp3 data is actually "text/plain" and it can handle > it. Oops. ... > I think for the immediate fix the mp3 code should scan the entire buffer > it gets. Which is slower and could fail if buffer is short and has no > start code. But it's better than nothing. > > Thoughts? > I forgot to add, in this case it really seems like gnomevfssrc should see that this is an icy mp3 stream and set it's output caps appropriately and that spider should use this info. -dave |
From: Erik W. <om...@te...> - 2003-01-30 23:21:58
|
On Thu, 30 Jan 2003, David I. Lehn wrote: > I forgot to add, in this case it really seems like gnomevfssrc should > see that this is an icy mp3 stream and set it's output caps > appropriately and that spider should use this info. Yeah, that's true. Is that a fix we want to consider for 0.6.0? Erik Walthinsen <om...@te...> - System Administrator __ / \ GStreamer - The only way to stream! | | M E G A ***** http://gstreamer.net/ ***** _\ /_ |
From: Colin W. <wa...@de...> - 2003-01-31 18:24:11
|
On Thu, 2003-01-30 at 18:20, Erik Walthinsen wrote: > On Thu, 30 Jan 2003, David I. Lehn wrote: > > > I forgot to add, in this case it really seems like gnomevfssrc should > > see that this is an icy mp3 stream and set it's output caps > > appropriately and that spider should use this info. > > Yeah, that's true. Is that a fix we want to consider for 0.6.0? Ok, I worked around this problem for now with the patch in #104918. It, in combination with your mp3 typefinding fix, should suffice for me until we redo the typefinding and autoplugging stuff for 0.7.x... |
From: Colin W. <wa...@de...> - 2003-01-31 00:44:09
|
On Thu, 2003-01-30 at 18:17, David I. Lehn wrote: > I forgot to add, in this case it really seems like gnomevfssrc should > see that this is an icy mp3 stream and set it's output caps > appropriately and that spider should use this info. I believe that the icy metadata stuff can be used for both vorbis and mp3, so I don't think that's possible. At least I don't see anything in the common icy-* headers which say whether the stream is mp3 or vorbis. Now one thing it *definitely* should be doing is grabbing the Content-Type header if available. I will work on that. |
From: Colin W. <wa...@de...> - 2003-01-31 00:44:09
|
On Thu, 2003-01-30 at 17:58, David I. Lehn wrote: > [As a side note, this mp3 type code has buffer range errors. It could, > in theory, segfault if the input buffer is < 4 bytes. Or more likely, > do type detecting on invalue buffer data! This is fixed now. > I think for the immediate fix the mp3 code should scan the entire buffer > it gets. Which is slower and could fail if buffer is short and has no > start code. But it's better than nothing. Your API changes look good to me. I would say though that we shouldn't worry a lot about the performance penalty; after all, in the common case GStreamer will have filename extensions to work with. Also, it might be good for apps to have a way to give hints to the autoplugger about the expected media type (I didn't see a way to do this currently). So that way the autoplugger could try those types first, with both low and high effort, and only later fall back to other types. |
From: Erik W. <om...@te...> - 2003-01-31 00:51:52
|
On 30 Jan 2003, Colin Walters wrote: > Your API changes look good to me. I would say though that we shouldn't > worry a lot about the performance penalty; after all, in the common case > GStreamer will have filename extensions to work with. I'm working on reimplementing the header checks now to search the entire buffer. It won't be pretty, and as I mention in the comments it's not quite correct either. But without an autoplugcache in place and a 'certainty' system in place, it'll have to be enough for now. > Also, it might be good for apps to have a way to give hints to the > autoplugger about the expected media type (I didn't see a way to do this > currently). So that way the autoplugger could try those types first, > with both low and high effort, and only later fall back to other types. Well, the filesrc should really be doing this. gnomevfssrc should be able to set caps based on several of its properties, such as iradio. The longer-term typefind fixes depend on general use of the autoplugcache. The correct way to detect an mp3 that's mid-stream is to search for a frame header, then calculate the size of that frame, skip those bytes, and confirm that there's another frame header at the expected location. In order to do that, we very well may require two consecutive buffers. I'm not sure how spider manages sending the buffer to typefind, then 'recalling' it and sending it into the newly selected element, but the autoplugcache is designed to do this, up to an arbitrary depth. When I rework the original dynamic autoplugger, that's what'll be usd. Erik Walthinsen <om...@te...> - System Administrator __ / \ GStreamer - The only way to stream! | | M E G A ***** http://gstreamer.net/ ***** _\ /_ |
From: Benjamin O. <in...@pu...> - 2003-02-03 11:49:40
|
On Thu, 30 Jan 2003, Erik Walthinsen wrote: > Well, the filesrc should really be doing this. gnomevfssrc should be able > to set caps based on several of its properties, such as iradio. > I don't think it's a good idea to detect filetypes based on extensions, though it might be a nice idea to use it as a hint on which typefind to try first. Even Nautilus uses sniffing exclusively I believe. > I'm not sure how spider manages sending the buffer to typefind, then > 'recalling' it and sending it into the newly selected element, but the > autoplugcache is designed to do this, up to an arbitrary depth. When I > rework the original dynamic autoplugger, that's what'll be usd. > If no type is found, spider fetches a new buffer, appends it to the old one and restarts typefinding with that bigger buffer. Benjamin |
From: Colin W. <wa...@de...> - 2003-02-03 15:09:15
|
[ I am on the list; please don't CC me ] On Mon, 2003-02-03 at 06:49, Benjamin Otte wrote: > I don't think it's a good idea to detect filetypes based on extensions, > though it might be a nice idea to use it as a hint on which typefind to > try first. It is a quite efficient method. Also note that typefinding by content isn't perfect either; you could just happen to have a say flac file where a section looks like an mp3 frame. > Even Nautilus uses sniffing exclusively I believe. Exclusively? No way. Think about for remote filesystems. Opening up every single file and looking at its content would be prohibitively expensive. Just imagine looking at a web directory with a few hundred files; then you'd need to make a few hundred HTTP requests. > If no type is found, spider fetches a new buffer, appends it to the old > one and restarts typefinding with that bigger buffer. Makes sense. |
From: Steve B. <st...@st...> - 2003-02-03 18:46:13
|
On Mon, 2003-02-03 at 12:49, Benjamin Otte wrote: > On Thu, 30 Jan 2003, Erik Walthinsen wrote: > > I'm not sure how spider manages sending the buffer to typefind, then > > 'recalling' it and sending it into the newly selected element, but the > > autoplugcache is designed to do this, up to an arbitrary depth. When I > > rework the original dynamic autoplugger, that's what'll be usd. > > > If no type is found, spider fetches a new buffer, appends it to the old > one and restarts typefinding with that bigger buffer. Not anymore it doesn't. This approach was causing spider to consume the whole file if the type couldn't be found. Currently it just uses the first buffer for typefinding. In the common case the buffer will be big enough to work so since the change the player has been behaving better. The code is there to build a buffer of a minimum arbitrary size (say 4096 bytes) but I was having some kind of problem merging the buffers which caused typefinding to fail. This code is commented out right now. cheers -- Steve Baker <st...@st...> |
From: Erik W. <om...@te...> - 2003-02-03 20:25:31
|
On Mon, 3 Feb 2003, Benjamin Otte wrote: > I don't think it's a good idea to detect filetypes based on extensions, > though it might be a nice idea to use it as a hint on which typefind to > try first. The src element would only set the mime/type, which isn't sufficient by itself generally. It does give a hint though. There should be a provision for mime/type to be a strong hint to typefind... > I think that needed effort should be a parameter when registering the > typefind func, not when calling it. Mp3 would then register 2 functions, > one with less effort and one with more effort. While I think an effort/likeliness/certainty metric would be worth something, we definitely do *not* want to force plugins to have multiple typefinds. If they *want* to do that, we might want to support it. But the key thing to keep in mind is that GStreamer's main purpose in life is to deal with all the hard stuff. Wherever possible, applications and elements should not have to code anything that GStreamer can't deal with in some efficient, reasonable way. Forcing two typefind functions would defeat that. THe primary goal in this case should be to maximize certainty, reduce effort, and sort be likelihood. The first two are a function of the coding of the plugin, the last is a system-wide metric we have to assign to each plugin, with mp3 being rather high on the list <g> Erik Walthinsen <om...@te...> - System Administrator __ / \ GStreamer - The only way to stream! | | M E G A ***** http://gstreamer.net/ ***** _\ /_ |
From: Benjamin O. <in...@pu...> - 2003-02-03 11:58:08
|
On 30 Jan 2003, Colin Walters wrote: > Your API changes look good to me. I would say though that we shouldn't > worry a lot about the performance penalty; after all, in the common case > GStreamer will have filename extensions to work with. > > Also, it might be good for apps to have a way to give hints to the > autoplugger about the expected media type (I didn't see a way to do this > currently). So that way the autoplugger could try those types first, > with both low and high effort, and only later fall back to other types. > Some thoughts about this: I think that needed effort should be a parameter when registering the typefind func, not when calling it. Mp3 would then register 2 functions, one with less effort and one with more effort. Typefinding would order all its functions by effort and go top down. Second advantage: More common types could get a higher effort value and be called earlier. Mp3 is more common than Mod these days... The second thing we should implement when we change this stuff is to include a value on how big a buffer this typefinding function needs in the worst case. This would allow spider to stop typefinding once it has collected such a buffer without missing any possible types. I'd like to implement this btw (/me hears thomasvs shout 'fix spider first' from behind) Benjamin |