From: mput <ro...@mp...> - 2003-09-17 01:20:21
# CC to [timidity-talk] (again).

Hi.

I've not tested with SoundFonts, as I don't have any stereo drumset (I'm
using a Macintosh, and there's no .sfark or .sfpack expander for it). So I
forwarded this to the Japanese mailing list, and some developers confirmed
your situation.

This may be a bug in 2.11.3; it seems to have been fixed since Oct. 19 last
year, so (as you reported) 2.12.0 has no problems now.

> I do not compile the timidity program, because i don't know how to do
> it. Maybe you could explain it to me?

Well, I'm not familiar with this... especially Windows compilers. See
"configs\msc-project.zip" inside the source distribution. The project
files should be there.

> if it would be possible to include a parameter FOR PLAYBACK for the
> Windows console and/or GUI version that would switch between Gauss and
> cspline interpolation???

No. That option is compiled in for now. Is it desirable to make this a
switch?

Thanks,
--
mput <ro...@mp...>

On 2003.09.16, at 15:41, Friend wrote:

> Hello!
>
> Thank you for your answer.
>
> I do not compile the timidity program, because i don't know how to do
> it. Maybe you could explain it to me?
>
> What I really wanted to know is, if it would be possible to include a
> parameter FOR PLAYBACK for the Windows console and/or GUI version that
> would switch between Gauss and cspline interpolation???
>
> Regarding your comment: if you try cspline version 2.11.3 with GUS
> patches, are the drum patches you use STEREO? The problem occurs with my
> STEREO SF2 drum patches. Please see my original message for the
> description of the problem.
>
> Thank you very much for your time and dedication!!!
>
> Peace
> Friend
>
> On 12 Sep 2003 at 9:41, mput wrote:
>
>> Hi.
>>
>> I've tried the cspline version of 2.11.3 with GUS/patch drums, but I
>> could not find any problems. I think cspline interpolation works fine.
>>
>> If you are able to compile, what happens when you choose cspline in
>> 2.12.0-pre1b?
>>
>> --
>> mput <ro...@mp...>
>>
>> On 2003.09.10, at 06:24, Friend wrote:
>>
>>> Hello!
>>>
>>> I'm normally using Timidity++ Experimental for Windows 2.12.0-pre1b,
>>> and the same console version.
>>>
>>> I have encountered the following situation:
>>>
>>> When using the console version from
>>> http://www.stardate.bc.ca/eawpatches/source/timpp2113q.exe (cspline
>>> algorithm), the stereo drumset I use (a soundfont) sounds mostly only
>>> on the left channel. The right channel sounds too, but very low.
>>>
>>> When I use the 2.12.0-pre1b version instead, the sound is totally
>>> different (I guess because of the Gauss algorithm?), but the stereo
>>> drum soundfont works perfectly!!!
>>>
>>> ---
>>> So my question is the following:
>>>
>>> Is there any possibility to adjust the cspline algorithm using
>>> parameters for the Timidity++ 2.12.0-pre1b version (GUI or console)?
>>>
>>> If not, could it be possible to integrate a parameter for it?
>>> ---
>>>
>>> Thank you very much for all the fine work and the superb Timidity!!!
>>>
>>> Peace
>>> R. Sastre
From: mput <ro...@mp...> - 2003-09-17 10:07:19
Hi.

In what situations does cubic spline sound better than -N 3 Gauss
interpolation (a particular MIDI file, a particular SF2, or something
else)? That could point to a real problem.

On 2003.09.17, at 15:09, Friend wrote:

> Hello!
>
> Thank you again for your fast response.
>
>> I've not tested with SoundFonts as I don't have any stereo drumset (I'm
>> using Macintosh and there's no .sfark or .sfpack expander for it). So I
>> forwarded this to Japanese Mailing list, and some developers confirmed
>> your situation.
>
> If you need some, we could arrange something (FTP, or so).

Thanks, but I have no problems now, except that I cannot confirm a bug
report :)

>> This may be a bug on 2.11.3, seems to be fixed since Oct. 19 last year,
>> so (as you reported) 2.12.0 has no problems now.
>
> Thank you for this information. I spent hours trying to find the bug (I
> thought it was a wrong parameter, or something ;-) )
>
>> Well, I'm not familiar with this... especially for Windows compilers.
>> See "configs\msc-project.zip" inside the source distribution. There may
>> be the project files.
>
> Thanx.
>
>>> if it would be possible to include a parameter FOR PLAYBACK for the
>>> Windows console and/or GUI version that would switch between Gauss and
>>> cspline interpolation???
>>
>> No. That option is compiled in for now. Is it desirable to make this a
>> switch?
>
> Yes, very much so. I find the cspline sound much better in some
> situations. Also, apparently it uses less CPU time than Gauss. So,
> please, please, make it a switch in both versions (console+GUI)
>
> Thanx again! Timidity is soooo great!
>
> Peace
> R. Sastre
From: Eric A. W. <ew...@cc...> - 2003-09-17 22:46:09
> I have an idea about the cspline runtime parameter: If it is a problem
> to make a switch, why not make a separate compilation for it? I mean a
> timidity-cspline.exe.

I think that is a good idea. Maybe have separate console binaries for
linear, cubic spline, Lagrange, and Gauss. This may not be practical for
experimental builds, but it would be a good idea once the 2.12.0 final is
released.

> Could anyone do this for me? I'm not familiar at all with the
> compilation process, and it sounds veeery complicated.

It's a bit complicated to get things set up to begin with, especially if
you aren't familiar with unix compiler environments, but once you get it
working it is very easy to make new builds. Making separate builds for the
different interpolation methods would only take a few minutes of effort. I
could do this and send you binaries for each method.

> I'm sure it's my lack of knowledge, but what i cannot understand in your
> explanation is why the cspline switch would be a runtime problem. I
> think of it like this: BEFORE playing the midi, i adjust cspline, so it
> has to check it only initially, then it runs in cspline all the time,
> without checking anything about it in any loops.

Unfortunately, the resampling code in timidity is not set up the way you
describe :( I will try to give a brief explanation that isn't too hard to
understand. Please forgive me if I get too technical. Resampling is done
within many loops in timidity that look basically like:

    for (j = 0; j < i; j++)
    {
        RESAMPLATION;
        ofs += incr;
    }

RESAMPLATION is a macro that contains the resampling routine. A long time
ago, in a version far far away, timidity only had linear interpolation or
no interpolation at all. It made sense to use macros to switch between the
two, since that would result in the best speed, plus it was very clean and
easy to read. Between the RESAMPLATION and RESAMPLATION_CACHE macros,
there are 17 different loops like this throughout the timidity code.

If you wanted to efficiently choose between each of them at runtime, you
would have to rewrite each of these 17 loops to look like:

    if (do_linear_interpolation)
    {
        for (j = 0; j < i; j++)
        {
            LINEAR_RESAMPLATION;
            ofs += incr;
        }
    }
    else if (do_cubic_spline_interpolation)
    {
        <insert appropriate loop for cubic spline>
    }
    else if (do_lagrange_interpolation)
    {
        etc. etc.
    }
    <continue adding more else if blocks for the remaining methods>

This would be a royal pain in the butt, would result in a large amount of
ugly looking code bloat, and changing 17 different loops is 17 different
places where the programmer could make a stupid typo and cause it not to
work quite right. It is much easier, cleaner, and safer to simply change
the RESAMPLATION macro when adding a new interpolation method.

Unfortunately, this results in some of the checks, like whether or not to
use gauss or linear interpolation, being done on every single sample. But
this isn't what hurts performance the most; what probably hurts
performance the most is how the compiler allocates registers when it
optimizes the loops.

A few years ago, I did all my development on my Pentium 133. This
encouraged me to optimize many things :) A P133 is too slow to use cubic
spline interpolation all of the time when both reverb and chorus are in
use (the old chorus; the new -EFchorus=1 should be much faster than the
older chorus methods, since it does not double the number of voices). When
the number of voices got too high, cubic spline would take too much CPU,
many voices would be killed, and it would sound bad. So when I rewrote the
voice reduction routines, I set the reduce_quality_flag whenever the CPU
load got too high. This let timidity fall back to using linear
interpolation whenever it got into trouble using cubic spline, so you
could use the higher quality cubic spline during most of the midi, and
only fall back to linear interpolation in the really CPU-intensive
portions.

But when I benchmarked LINEAR_INTERPOLATION versus CSPLINE_INTERPOLATION
with the reduce_quality_flag permanently set (the -4 command line option),
the linear fallback in the cubic spline interpolation was significantly
slower than if I had defined LINEAR_INTERPOLATION. From simple cycle
counts, one extra comparison and a jump should not be slowing things down
by 1/3rd to 1/2. What I think made the big difference was how the compiler
allocated registers within the loop (although I did not look at the actual
assembly code to be sure of what was going on). By adding two separate
cases, instead of just one, the compiler could no longer assign all the
variables to registers, and that slowed things down a lot. I'm not
absolutely certain this was the cause, but I think it is very likely.
Loop unrolling optimizations probably become much more difficult as well?

So, generally speaking, the smaller/simpler the loop, the better the
compiler can optimize it and the faster the code will run. Adding a new
case within the gauss interpolation to let it perform cubic spline instead
of gauss would further complicate the loop, and would likely slow down the
gauss interpolation even more. The cubic spline mode within gauss would
almost certainly be slower than if CSPLINE_INTERPOLATION were defined, due
to the register usage and loop unrolling issues I discussed above. The
only way to know for sure, however, is to just make the change and see how
much slower it really is. I could just be paranoid.

These speed problems could be avoided if all 17 loops were rewritten into
separate cases like what I discussed at the start of this email, so that
there would be separate loops for each different interpolation method.
But, as I said earlier, it would be very easy to make a typo or other
silly mistake when changing one of those 17 loops and accidentally
generate slightly wrong output, not to mention that it would take a good
amount of time and effort. It would also make the code much more difficult
to read than the simple layout it has now. Changing the current macro
setup into separate loops would not be a simple undertaking....

If someone wants to change the gauss macro to implement a switch for cubic
spline (or Lagrange), that's fine. Just be sure to compare the speed of
the new macro to the old ones before you decide to keep the change. If it
has a significant negative impact on the speed, then it should not be
used. I am a firm believer in the expression "If it ain't broke, don't
fix it." :)

-Eric
From: Takashi I. <ti...@su...> - 2003-09-18 10:16:42
At Wed, 17 Sep 2003 17:46:04 -0500 (CDT), Eric A. Welsh wrote:

> [...]
>
> Unfortunately, the resampling code in timidity is not set up the way
> you describe :( [...] Resampling is done within many loops in timidity
> that look basically like:
>
>     for (j = 0; j < i; j++)
>     {
>         RESAMPLATION;
>         ofs += incr;
>     }
>
> RESAMPLATION is a macro that contains the resampling routine.

Well, I doubt whether using a macro here brings us so much benefit
nowadays. Of course, there is an overhead if we replace it with a
function (or a function pointer reference), but the code would be much
simpler, and the compiler can be clever enough to minimize this cost.

Another coding hack with macros would be:

- Define the resampling functions in another file (e.g. resample_code.c),
  which contains something like:

      static void RS_PLAIN_C(int v, int32 *cptr) { ... }
      static sample_t *RS_PLAIN(int v, int32 *countptr) { ... }

- In resample.c, include it several times with different resampling
  macros:

      #define RESAMPLATION ...  /* gauss */
      #define RS_PLAIN_C gauss_rs_plain_c
      #define RS_PLAIN gauss_rs_plain
      ...
      #include "resample_code.c"
      #undef RESAMPLATION
      #undef RS_PLAIN_C
      #undef RS_PLAIN
      ...
      #define RESAMPLATION ...  /* cubic spline */
      #define RS_PLAIN_C cubic_rs_plain_c
      #define RS_PLAIN cubic_rs_plain
      ...
      #include "resample_code.c"
      #undef RESAMPLATION
      #undef RS_PLAIN_C
      #undef RS_PLAIN
      ...

- Define a table for each resampling algorithm, and choose the table
  entry from the command line option. It would look like:

      struct resample_table {
          void (*plain_c)(...);
          sample_t *(*plain)(...);
          ...
      };

      static struct resample_table tables[] = {
          { gauss_rs_plain_c, gauss_rs_plain, ... },
          { cubic_rs_plain_c, cubic_rs_plain, ... },
          ...
      };

This way, the ugliness can be reduced, at least. But I still prefer
functions instead of these macros.

ciao,

--
Takashi Iwai <ti...@su...>  SuSE Linux AG - www.suse.de
ALSA Developer                 ALSA Project - www.alsa-project.org
From: Eric A. W. <ew...@cc...> - 2003-09-19 19:58:47
> > RESAMPLATION is a macro that contains the resampling routine.
>
> Well, I doubt whether using a macro here brings us so much benefit
> nowadays. Of course, there is an overhead if we replace it with a
> function (or a function pointer reference), but the code would be much
> simpler, and the compiler can be clever enough to minimize this cost.

I agree that an array of function pointers would be the cleanest (but
probably not the fastest) way to allow the user to choose the
interpolation method with a command line switch. I don't know how much
replacing the macros with function pointers would slow things down.
Compiler optimizations can do a lot, but I'm still worried about the
function call overhead. If the interpolation loops are done with function
pointers, would that mean that the compiler cannot inline the functions?
And without inlining, there would be no loop unrolling optimizations on
them either? It might not make much difference for slow routines like
-N 34 gauss, but I'm guessing that there would be a noticeable slowdown
for the simpler routines like linear interpolation.

I agree that the current macros are ugly. The use of the macros is fairly
clean, but the macros themselves are ugly. If function pointers do not
slow things down much, then it would be good to replace the macros with
something cleaner. But if the macros still result in noticeably faster
code, then I don't mind them being ugly.

-Eric
From: Takashi I. <ti...@su...> - 2003-10-02 15:42:28
At Fri, 19 Sep 2003 02:21:23 -0500 (CDT), Eric A. Welsh wrote:

> [...]
>
> I agree that an array of function pointers would be the cleanest (but
> probably not the fastest) way to allow the user to choose the
> interpolation method with a command line switch. I don't know how much
> replacing the macros with function pointers would slow things down.
> [...]

Surprisingly, while investigating this, I found that:

- using functions is not slower, but sometimes even faster (up to 10%)
  than macro expansions!
- a lot of code can be removed from resample.c and recache.c.

Hence I have committed my changes to CVS now. If you find a significant
slowdown, please let me know; I'll revert them (after checking why it
happens).

Now the algorithm can be switched with the --resample=xxx (or
-EFresample=xxx) option, where xxx is the algorithm name. Also, if you
run the ncurses interface, it can be switched dynamically by pressing 'E'
followed by the 'Fresample=xxx' sequence. It's interesting to compare the
difference among algorithms in real time :)

--
Takashi Iwai <tiwai dot suse.de>  ALSA Developer - www.alsa-project.org
From: Eric A. W. <ew...@cc...> - 2003-10-04 05:43:06
> Surprisingly, while investigating this, I found that:
>
> - using functions is not slower, but sometimes even faster (up to 10%)
>   than macro expansions!

The new code is very nice. Great work! However, my test results (on only
a single file so far) are a little bit different. Gauss -N 34 is 2.7%
faster, Lagrange is about the same speed, and linear and cubic spline are
6-8% slower.

I compiled it with GCC 3.3.1 under cygwin (with mingw libraries) using
the following CFLAGS:

    -mno-cygwin -O3 -fomit-frame-pointer -funroll-all-loops -ffast-math
    -foptimize-sibling-calls -fforce-mem -mcpu=athlon-xp
    -momit-leaf-frame-pointer -malign-double -maccumulate-outgoing-args
    -mno-align-stringops -minline-all-stringops

I tested the speed using:

    time ./timidity -Ow1S -idtv -EFchorus=2 -EFns=0 -s 44100 -S 0

-EFchorus=2 doubles the number of voices so that there are more notes to
be resampled. -S 0 disables pre-resampling, so that the resampling
functions are used much more than normal. I should have disabled reverb,
so that resampling would take up an even larger percentage of the CPU
time, but I did not think to do so until after I started to write this
email.

Here are the times, in seconds:

                      Old       New    % change
                   -------   -------   --------
    Linear          20.265    21.606     +6.6
    Lagrange        26.871    26.969     +0.4
    Cspline         29.543    31.883     +7.9
    Gauss (-N 34)  148.821   144.754     -2.7

The times are accurate to about +/- 0.01 s, so the difference in time for
Lagrange is larger than the experimental error.

I find it strange that linear and cspline are slower, while Lagrange
remains unchanged. Perhaps Gauss is faster because the macro was so large
before, and there is now much less code within the resamplation loops now
that they call a function, so perhaps the optimizer was able to optimize
the surrounding code better? I cannot explain the speed
increase/decrease differences between the different methods.

The slower times for cubic spline and linear look like the functions add
more overhead than the macros, as I had worried before. But the speed of
Lagrange is unchanged, and that is even simpler code than cubic spline!
And Gauss is even a little bit faster! So for Lagrange and Gauss, it
looks like my worries were incorrect, and the compiler did a good job of
optimizing the functions.

I will experiment with different compiler optimization flags (like
disabling loop unrolling, -funroll-loops vs. -funroll-all-loops, setting
the function inline limit higher, etc.) and see if the linear and cspline
speeds can be improved. It may just be that GCC isn't optimizing those as
well as it should.

Most people will probably use Gauss, and if their computers are not fast
enough, they will probably use Lagrange instead. Since these are as fast
as or faster than before, I see no reason not to keep the new
resamplation functions. The new code is MUCH cleaner than the old macros,
and it is nice to be able to switch interpolation methods without
recompiling. I am a little bothered by the slightly slower linear
interpolation results, but hopefully this can be fixed with different
compiler options (or a different compiler entirely, like the Intel
compiler under Linux). Overall, I think this is a good change.

-Eric
From: Matthew W. M. <mwm...@co...> - 2003-10-05 20:34:33
On Sat, Oct 04, 2003 at 12:42:57AM -0500, Eric A. Welsh wrote:

> However, my test results (on only a single file so far) are a little
> bit different. Gauss -N 34 is 2.7% faster, Lagrange is about the same
> speed, and linear and cubic spline are 6-8% slower.
>
> I compiled it with GCC 3.3.1 under cygwin (with mingw libraries) using
> the following CFLAGS:
>
>     -mno-cygwin -O3 -fomit-frame-pointer -funroll-all-loops -ffast-math
>     -foptimize-sibling-calls -fforce-mem -mcpu=athlon-xp
>     -momit-leaf-frame-pointer -malign-double -maccumulate-outgoing-args
>     -mno-align-stringops -minline-all-stringops

In my experience, all this kitting out with fancy optimisations is only
marginally helpful at best, and may even be harmful to performance. (You
can tell I'm not a hot-rod fanatic, can't you?) ;) It's probably better
to use a simpler flag set, such as:

    -march=pentium-mmx -O3

This has served me pretty well. Of course, in your case you'll want to
replace it with '-march=athlon-xp', and if your CPU supports SSE, you may
wish to add '-mfpmath=sse,387'.

Since you're not specifying otherwise, I'm assuming you're compiling for
your own use, of course. If you're compiling for the general public's
use, you may wish to specify '-march=i386 -mcpu=athlon-xp' instead.

--
Matthew W. Miller <mwm...@co...>
From: Takashi I. <ti...@su...> - 2003-10-06 10:24:54
At Sat, 4 Oct 2003 00:42:57 -0500 (CDT), Eric A. Welsh wrote:

> I find it strange that linear and cspline are slower, while Lagrange
> remains unchanged.
>
> Perhaps Gauss is faster because the macro was so large before, and
> there is now much less code within the resamplation loops now that they
> call a function, so perhaps the optimizer was able to optimize the
> surrounding code better? I cannot explain the speed increase/decrease
> differences between the different methods.

I guess so, too. Or it could be related to the cache.

> The slower times for cubic spline and linear look like the functions
> add more overhead than the macros, as I had worried before. But the
> speed of Lagrange is unchanged, and that is even simpler code than
> cubic spline! And Gauss is even a little bit faster! So for Lagrange
> and Gauss, it looks like my worries were incorrect, and the compiler
> did a good job of optimizing the functions.

Yes. In fact, when I tried to optimize the code using 3DNow!, it resulted
in even slower code than what gcc generates :)

Anyway, the overhead can be reduced if you set FIXED_RESAMPLATION in
timidity.h. Then the resamplation function is called statically, which
lets the compiler inline the calls. However, in that case you cannot
switch the resamplation method, so there is no big merit to the
functionization except for the code clean-up.

--
Takashi Iwai <tiwai dot suse.de>  ALSA Developer - www.alsa-project.org
From: Eric A. W. <ew...@cc...> - 2003-09-17 09:09:16
> > I do not compile the timidity program, because i don't know how to do
> > it. Maybe you could explain it to me?
>
> Well, I'm not familiar with this... especially for Windows compilers.
> See "configs\msc-project.zip" inside the source distribution. There may
> be the project files.

I use cygwin with mingw32 to build the console version. To configure it,
use:

    ./configure CC='gcc -mno-cygwin' LD='gcc -mno-cygwin' \
        --enable-vt100 --enable-ncurses --without-x --enable-spline=gauss

And to build it:

    make CFLAGS='-mno-cygwin -O3 -fomit-frame-pointer -funroll-all-loops
        -ffast-math -foptimize-sibling-calls -fforce-mem -mcpu=athlon-xp
        -momit-leaf-frame-pointer -malign-double
        -maccumulate-outgoing-args -mno-align-stringops
        -minline-all-stringops'

You don't really need all of the optimization flags after -O3, but
-fomit-frame-pointer and -funroll-all-loops are a good idea.

If you do not already have ncurses (or pdcurses; I use 2.4beta, since 2.4
final had some issues) installed and compiled with -mno-cygwin, then you
won't be able to use the ncurses interface when compiling with
-mno-cygwin. -mno-cygwin uses MinGW32 instead of the cygwin1.dll unix
translation layer. I use -mno-cygwin so that I can build binaries that
run anywhere, without requiring cygwin1.dll. If you want to use regular
cygwin instead (cygwin has an option to auto-download and install ncurses
for you), just remove the -mno-cygwin flag from the lines I gave above,
at which point the CC and LD settings will no longer be needed for the
configure script.

I use pdcurses because it does not require an ANSI text display driver
(such as ansi.sys) or an /etc/termcap file, both of which are required
for ncurses to work in a Windows DOS box (at least they were the last
time I tried), and neither of which the average user would have installed
on their machine.

> No. That option is compiled in for now. Is it desirable to make this a
> switch?

I think that the "Lagrange" interpolation (it's actually the Newton form
of the polynomial, but I don't think anyone reading this is going to be
that picky about it ^_^) is faster than -N 3 third-order Gauss (but I'm
not sure, I don't remember the results of those benchmarks very well),
since it uses all int32 math, whereas Gauss uses doubles. Cubic spline is
a good bit slower than Lagrange, and they have roughly the same accuracy,
so Lagrange is probably preferable to cubic spline. I don't know whether
cubic spline is any faster than a 3rd-order Gauss polynomial. I do know
that 3rd-order Gauss is more accurate than either cubic spline or
3rd-order Lagrange/Newton.

Since 3rd-order Gauss is still pretty fast, I would prefer the Gauss
interpolation: it is more accurate, and it allows you to specify even
higher orders of interpolation (up to -N 34 if your CPU can handle it).
-N 0 will use linear interpolation, but it will still be a little slower
than if LINEAR_INTERPOLATION were defined, due to the check for
reduce_quality_flag in the interpolation loop. Gauss interpolation should
be the best choice for most users, since it gives a large amount of
control over the quality/speed of the program.

I do not think it would be a good idea to make a runtime option that lets
you switch between Gauss and one of the 3rd-order interpolation modes.
Checking for this option inside the interpolation loop would add more
overhead, which would lessen any speed gain that Lagrange has over
3rd-order Gauss. The main reason to define LAGRANGE_INTERPOLATION would
be to use higher quality interpolation (better than linear) on old, slow
CPUs that may not be quite fast enough for 3rd-order Gauss, but are fast
enough for something better than linear or -N 0. Keeping the
interpolation modes as #defines allows for the highest speed, which is a
must for the old, slow CPUs that would require Lagrange instead of Gauss.
Making it a runtime option would slow it down, which would defeat the
purpose of using Lagrange in the first place. I think it is best to leave
the interpolation modes defined at compile time for best speed.

-Eric