From: Scott W. <bau...@co...> - 2005-02-15 07:35:40
I'm very excited about finding Libvisual! I wanted to take a few days to grok how it all works and meshes together before I posted my comments, and I _think_ I'm getting a handle on this. If I'm completely off-base, feel free to bash me upside the head.

Before my comments start, let me point out that, in case it hasn't been reported yet, the frame-limiter for 0.2.0 is broken, at least for the XMMS plugin. It's chewing up all available CPU time. I profiled one of the simpler plugins (the scope) to check this out, and when sized very small (roughly 100 x 50 by my eye) the "render" callback was still getting called more than 600 times per second, even though frames were (theoretically) being limited to 30/second.

So here are my initial thoughts and suggestions, for what they're worth:

It would appear to me that instead of pcm data being "pushed" to the visualizer engine, as is done in the visualizer plug-in models put forth by XMMS and WinAmp, the pcm data in Libvisual is being "pulled" via an Input plugin's VisPluginInputUploadFunc or by implementing an upload callback. The "pull" model works just fine, but Libvisual needs to add some means for synchronization and non-audio data, or else you'll cut out an entire class of visualizations.

That's probably unclear, so let me give an example. Let's say I'm decoding an MPEG or AVI file using FFMPEG (ffmpeg.sourceforge.net) - as I decode, I'm going to get interleaved "packets" of audio and video data. Depending on the codec, the display times between individual video frames can vary wildly, so a simple latency calculation won't sync the video to the audio. Instead, the codec (or ffmpeg, or the application itself) calculates and provides a "presentation time stamp", in stream-relative time, of when to display each frame of video. The video visualization plugin's job is to buffer frames and display them when the proper time comes.
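To make the buffer-and-display idea concrete, here's a minimal sketch of the timing decision such a video actor would make on each "render" call. All names here (`video_frame`, `frame_is_due`, the millisecond units) are my own illustrations, not part of the Libvisual API; the only assumption is that the application can expose the current stream playback time:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical buffered frame carrying the presentation time stamp
 * the decoder (or the application) computed for it. */
typedef struct {
    int64_t pts_ms;   /* presentation time stamp, stream-relative, in ms */
    /* ... pixel data would live here ... */
} video_frame;

/* Decide whether the frame at the head of the queue should be shown now:
 * its PTS has been reached, but it isn't so stale that it should just be
 * dropped.  stream_time_ms is the current playing time the application
 * would expose to the plugin. */
static bool frame_is_due(const video_frame *f, int64_t stream_time_ms,
                         int64_t stale_threshold_ms)
{
    if (stream_time_ms < f->pts_ms)
        return false;                               /* too early: keep buffering */
    return (stream_time_ms - f->pts_ms) <= stale_threshold_ms;
}
```

The point is that between two presentation time stamps the render callback would simply do nothing, which is exactly why a fixed latency calculation can't substitute for real stream-time queries.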
So, if you stick with the "pull" data model, the application needs to be able to expose a method for the visualization plugin to get the current playing stream time. Likewise, there needs to be a method to query visualization plugins to see if they can accept and handle certain special data types (so I don't exacerbate entropy sending video packets to Goom, for instance), and an API for getting that special data to those that do (as simple as a "userdata" callback in addition to the "render" callback).

In the same light, consider a Karaoke (mp3+cdg or ogg+cdg) plugin. There are actually two data sources: a standard .mp3 or .ogg file, and a separate .cdg file that contains the karaoke lyrics and graphics that were ripped out of the subchannel data of an audio CD+G disc. A karaoke visualizer would get the .cdg data sent to it as a one-shot package at the start of the stream, and before returning from a "song_start" callback (and there should be one of these, as well as callbacks for "song_pause", "song_resume", and "song_end", so the thing ain't chewing up CPU cycles if Joe User needs to pause audio to do something CPU-intensive for a bit) it would decode the CDG data into frames and generate presentation time stamps for each frame. Again, during song playback, it doesn't care a whiff about pcm data; it just wants to monitor stream playback time and display each frame synced to the audio output.

In fact, there's another standard using MIDI with karaoke lyrics that may not generate any pcm data at all (and while I'm at it, I might want my non-karaoke MIDI file player to generate data for a graphic piano keyboard visualizer in pass-through or hardware-synth mode, or use "regular" audio visualizers when using a software synth that creates regular pcm data). In both of the above cases, the "render" callback may or may not actually draw a frame if the call happens between two presentation time stamps.
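One simple way to do the capability query is a bitmask that each plugin advertises, which the application checks before routing a packet. This is purely a sketch of the proposal; none of these constants or functions exist in Libvisual today:

```c
#include <stdbool.h>

/* Illustrative special-data-type IDs; not real Libvisual constants. */
enum vis_data_type {
    VIS_DATA_PCM   = 1 << 0,   /* ordinary audio samples            */
    VIS_DATA_VIDEO = 1 << 1,   /* decoded/compressed video packets  */
    VIS_DATA_CDG   = 1 << 2,   /* karaoke CD+G subchannel data      */
    VIS_DATA_TAGS  = 1 << 3    /* stream tags such as ID3v2 or APE  */
};

/* Each plugin would advertise a bitmask of what it accepts, so the
 * application never sends video packets to a pcm-only actor like Goom. */
static bool plugin_accepts(unsigned advertised_mask, enum vis_data_type t)
{
    return (advertised_mask & t) != 0;
}
```

A "userdata" callback would then only ever be invoked with data types the plugin declared, and the song_start/song_pause/song_resume/song_end lifecycle callbacks could be optional (NULL) entries in the same plugin descriptor.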
For that matter, Libvisual should not assume that any Actor actually draws a frame during any particular "render" call unless the Actor tells Libvisual that it _did_ draw a frame. Likewise, there needs to be a method or callback to the application to let it know that a new frame is available for drawing, so that an unchanged video buffer isn't getting re-blitted without cause. As an application writer, I obviously have a vested interest in any CPU-saving tweak possible.

As far as the interface for querying which special data types a visualization plugin can accept, I think something similar to the way a WinAmp input plug-in exposes which filetypes it can handle would work great. Off the top of my head, here are a few special data types of interest:

- Streaming video
- Karaoke CDG
- Stream tags (i.e. ID3v2 or APE)

Either in addition to, or in place of, the VisSongInfo stuff, you could also abstract special data types for:

- Artist/track title/album/year
- Still images associated with an artist/track (album cover art, artist photos, etc. in JPEG, BMP, GIF, PNG, etc.)

I also want to point out an obvious omission here (understandable, since you probably weren't considering a video class of visualizers): in addition to the RGB and OpenGL display types, there should also be a YUV display type. The same rules of no-blit/no-morph between dissimilar display types should apply. The YUV420P format (SDL type: SDL_YV12_OVERLAY) should be sufficient out of the starting gate.

I also think there should be a non-GUI method for getting/setting individual visualizer configuration settings via a serialized string. The application may or may not be able to decipher the contents of the command strings for a particular visualizer plug-in, but they can still be thought of as a "bookmark" of what the user likes, above and beyond the last-used configuration settings. Wouldn't it also be nice if the plug-ins exposed language-neutral author/copyright/credits/plug-in name & version info to the application?
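As a footnote to the YUV420P suggestion: part of its appeal for a video display type is that it's a cheap planar format - one full-resolution luma (Y) plane plus two quarter-resolution chroma (U, V) planes, i.e. 12 bits per pixel versus 24 for RGB. A small sketch of the buffer math (the helper name is mine, and this assumes even width/height as YUV420P effectively requires):

```c
#include <stddef.h>

/* Bytes needed for one YUV420P frame (the layout SDL calls
 * SDL_YV12_OVERLAY): a full-res Y plane plus U and V planes
 * subsampled 2x2, giving 1.5 bytes per pixel overall. */
static size_t yuv420p_frame_size(size_t width, size_t height)
{
    size_t y_plane = width * height;                /* full-res luma      */
    size_t chroma  = (width / 2) * (height / 2);    /* each of U and V    */
    return y_plane + 2 * chroma;
}
```

So a 320x240 frame needs only 115200 bytes, exactly half of the 230400 a 24-bit RGB buffer of the same size takes - which matters when a video actor is pushing 30 frames a second.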
A method of getting a list of presets (for those visualizers that support them) and selecting one by non-GUI means would also be cool.

Finally, on my wish list (though I realize the added YUV mode adds another layer of complexity to this, and alpha-channelled shaped text is probably out of the question for that), I'd love to see some sort of overlay engine. Perhaps I, as an application author, want a scrolling ticker at the bottom of the screen announcing sports scores or happy hour specials (!), or some On Screen Display info for a TV or radio tuner card when I'm running in full-screen mode (or otherwise). Not that big of a deal for me to add at the application level, but a guy can dream, right?

In VisUI, a useful addition would be a tab or page widget for implementing multiple dialog box pages. I think wxWidgets has about the snazziest way of specifying a platform-neutral dialog box that I've seen.

One last question: is frequency spectrum analysis being done even for those visualizers that don't need it? If so, there should be a way to turn off those expensive FFTs when they don't help a given visualizer (and/or morph).

Keep up the great work!

S.W.