The latest changes to the timing code (as committed on
2005-01-07) have caused sound output to be delayed by ~
1 second on my laptop (Dell Inspiron 5150, with libao
0.8.5, a 2.6.9 kernel and a "Multimedia audio
controller: Intel Corp. 82801DB/DBL/DBM
(ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 01)"
according to lspci).
OSS sound works fine; I haven't tested any others yet.
Logged In: YES
user_id=11017
Hm, the new timing code has two sound-enabled modes implemented:
1) We are using a sound driver that provides a blocking
socket with, say, a couple of frags' worth of latency (e.g.
OSS). We just use writes to the socket to manage our speed.
2) We are using a sound driver that consumes data from a
sound_fifo in a separate thread; the sound consumption rate
implicitly sets the speed.
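The two modes differ mainly in who owns the buffer that throttles Fuse. A back-of-the-envelope latency model (the fragment counts, sizes and rates below are made-up illustrative figures, not Fuse's or any driver's real values):

```python
# Model of where latency comes from in the two timing modes.
# All sizes are in sample frames; rate is frames per second.

def blocking_mode_latency(fragments, fragment_frames, rate):
    """Mode 1: the sound API blocks once its kernel-side fragments
    are full, so the latency is whatever the API chooses to buffer."""
    return fragments * fragment_frames / rate

def sfifo_mode_latency(fifo_frames, rate):
    """Mode 2: we sleep when our own sfifo is full, so the latency
    is a fifo size we picked ourselves."""
    return fifo_frames / rate

# OSS-style driver: 4 fragments of 1024 frames at 32000 Hz -> 128 ms
print(blocking_mode_latency(4, 1024, 32000))   # 0.128
# sfifo sized to two video frames of sound (2 * 640 frames) -> 40 ms
print(sfifo_mode_latency(2 * 640, 32000))      # 0.04
# a driver buffering a whole second, as in this bug report -> 1 s
print(blocking_mode_latency(1, 32000, 32000))  # 1.0
```

The point of the sketch: in mode 1 the latency is whatever the driver imposes; in mode 2 it is a parameter we control.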
The aosound code was using the ALWAYS_USE_TIMER mode before;
this suffered from poor synchronisation and would
periodically underrun.
It appears that in your case the underlying API (probably
ALSA with your 2.6 kernel) is providing about a second's
worth of hardware buffering, which is like our case 1) above
but with far too much latency. libao offers no way to
control the number of buffers outstanding or their size.
I'd suggest that the way forward for the libao code would be
to have a separate thread for audio and a sound_fifo similar
to the SDL audio code.
I expect that barring the DirectSound driver all the others
work as blocking sockets and should work as before. We will
need something similar to the libao change for the
DirectSound driver, though I believe that the SDL code will
now work unchanged on Windows.
Logged In: YES
user_id=57243
Hmm...
The two modes you mention don't differ so much...
First: the host machine generates the sound samples. If we
use the sound card directly, and we can directly set the
output level, then OK, we can produce sound without delay
or slippage...
Second: all the cards on the market today work a little
differently... (Yes, if you have an old SB and you can
reach the DSP directly... probably... but where are these
old cards? Who uses them?) Generally there is some memory
(on the host machine for low-end cards, or on the card for
high-end ones), and the sound processor fetches values from
this memory (DMA) and feeds the DA converter with them.
OK, to feed the DA converter, all cards use some kind of
ring buffer... The buffer size and the sampling frequency
determine the lag of the sound... The sound subsystem
driver may use the card's ring buffer directly, or offer a
secondary (ring) buffer to the application, but the basic
"problem" is the same: a BUFFER...
If we add a second or third buffer, we cannot change the
underlying problem :-( sorry...
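The ring-buffer mechanism described here can be sketched as a toy model (not any real driver's code): the card's DMA read pointer chases the application's write pointer, and the gap between them is the lag.

```python
class RingBuffer:
    """Toy sound-card ring buffer: the app writes at `wr`, the
    'DMA' read pointer `rd` chases it; `size` is the capacity."""
    def __init__(self, size):
        self.size = size
        self.wr = 0      # total frames written by the application
        self.rd = 0      # total frames consumed by the 'card'

    def write(self, frames):
        free = self.size - (self.wr - self.rd)
        written = min(frames, free)   # a real driver would block here
        self.wr += written
        return written

    def dma_consume(self, frames):
        self.rd += min(frames, self.wr - self.rd)

    def lag_seconds(self, rate):
        # frames queued but not yet played = the "slip" described above
        return (self.wr - self.rd) / rate

rb = RingBuffer(size=2048)
rb.write(2048)                 # fill the buffer
rb.dma_consume(512)            # card plays one period's worth
print(rb.lag_seconds(32000))   # 1536 / 32000 = 0.048
```

Whatever layers sit on top, the queued-but-unplayed distance is what the listener hears as delay.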
There would be no problem if we could generate sound ahead
of time (like video players do...), but since we cannot, we
have to reduce the buffer! But if we reduce the buffer we
have to feed the sound card more precisely... If we run a
lot of applications in parallel, use a low-resolution
system timer, and so on, we will probably hear clicks and
pops (buffer underruns) more often...
OK, we can start a separate thread, a new sound fifo, or
other magical things, but if we do not precalculate the
sound samples, we always lag!
A simple graph:
>---z80 emulation-----$$$$$$$$$$$$$------------->
        ^             ^            ^
keypress+             |            |
          sound start-+            |
samples reach the DA---------------+
                      <------------> minimal delay
The question: what is between "sound start" and "samples
reach the DA"?
1. If we could use direct DSP access (reaching the DA
directly): technically nothing, so no delay (DOS strikes back!)
2. If we use direct card access: the "hardware buffer"...
3. If we use a sound subsystem: the subsystem's (complex)
buffer...
4. If we use a sound subsystem + fifo: the fifo size + the
subsystem buffer...
If the host machine's processor is shared between many
applications (DOS strikes back again), then we need a buffer.
More processor power, fewer applications, a higher-frequency
system timer, a smaller load => a smaller buffer is needed...
Even if you feed the sound subsystem frame by frame, the
driver USES a buffer. It pushes some samples to memory and
starts playback, leaves the card alone... gets an interrupt,
pushes more samples and leaves the card... and so on.
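The push-samples / interrupt / push-again cycle above can be simulated with a toy model (a hypothetical driver, purely to illustrate the mechanism; the period and buffer sizes are arbitrary):

```python
def simulate(interrupts, period_frames, buffer_periods, app_keeps_up):
    """Interrupt-driven refills: the card drains one period per
    interrupt; on each interrupt the app refills one period, unless
    it has fallen behind. Returns the number of underruns (clicks)."""
    level = buffer_periods * period_frames   # start with a full buffer
    underruns = 0
    for _ in range(interrupts):
        level -= period_frames               # card plays one period (DMA)
        if level < 0:
            underruns += 1                   # nothing left to play: click
            level = 0
        if app_keeps_up:
            level = min(level + period_frames,
                        buffer_periods * period_frames)
    return underruns

print(simulate(100, 512, 4, app_keeps_up=True))   # 0 - no clicks
print(simulate(100, 512, 4, app_keeps_up=False))  # 96 - clicks once drained
```

A bigger buffer only buys time before the clicks start; it cannot replace timely refills.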
OK. Philip: the i810/845 audio subsystem is basically an
AC'97 card (as you say) and it has a lot of drawbacks, e.g.:
only certain sampling frequencies (8 kHz, 16, 22, 44, 48 I
think), it can play only 16-bit stereo sound, and it uses
system memory with DMA plus an on-card fifo only a few bytes
deep (2 frames). So if Fuse produces e.g. 32 kHz mono 8-bit
sound, the driver has to convert it to 44 or 48 kHz 16-bit
stereo in the background, and if you use the dmix "software"
plugin, your sound subsystem allocates quite a big sound
buffer to do its job smoothly... You should try the buf_size
ao parameter as Fred mentioned before...
Fred,
I tried the SDL (magical new-thread + sfifo combined) sound
on my 500 MHz K2, and I heard about 5 clicks while spect48
ran this program: PAUSE 50*30:BORDER 6, with only Firefox,
gtk-gnutella, 3 xterms with ssh, fvwm and the usual base
daemons running in the background. Linux 2.6.15.1 with
built-in ALSA, a GRAVIS IW and an SB Live... (Fuse uses the
GRAVIS.)
Next I tried the native ALSA driver with buffer=2000 frames.
I heard only 2 clicks and the sound delay was good... I did
not feel more delay than with SDL (I do not know how much
buffer SDL uses)!
Yes! You see: native ALSA is superior :-) OK, it is just a
joke... But we can see there is basically no difference! If
I could change the SDL driver's "buffer size" I could reach
the same number of clicks...
OK, I think the middle-ground solution is this: Fuse should
provide a sound delay, or latency, or some similar
parameter, and the user can set the balance between the
sound delay and the number of buffer underruns, depending
on the actual situation and need...
Logged In: YES
user_id=11017
Originator: NO
I forgot to follow up on this: I am aware of what you are saying, but in older versions of Fuse sound was not closely enough connected to the machine being emulated to make the appropriate adjustments. Now that Fuse starts sound after a machine is selected, we can use the machine information to adjust the size of the sfifo in line with the amount of sound produced each frame. I have checked in such a change.
The sound fifo *is* a source of the latency you are asking for; the problem to date has been getting an appropriate size for it.
Could you retry the SDL version on your machine? I expect that you will get better results, and that converting the libao sound code to use the fifo (to limit the latency to the required amount) will resolve this bug.
We still have a remaining issue for SDL: SDL cannot convert sound frequencies in non-power-of-two ratios, which means that if your hardware does not support the Fuse default frequency of 32 kHz, the sound may not sound very good. This should be solvable by using the --sound-freq argument to select a speed natively supported by your sound hardware - better suggestions are welcome.
Logged In: YES
user_id=57243
Originator: NO
Hi Fred!
You're right, the sound fifo(s) are the source of the latency. But if we add a new fifo, the latency may increase... not decrease :-)
What happens with SDL sound output:
1. Fuse generates the sound samples (638 frames at once, at 32000 Hz with a UK Spectrum 48K)
2. passes the data to 'sound_lowlevel_frame'
3. sound_lowlevel_frame passes it immediately to the sfifo and returns
4. the SDL sound system uses a callback ('sdlwrite') to read sound samples from the sfifo
And the timing: Fuse polls the sfifo status and, if it is full, sleeps N×10 ms.
What happens with blocking sound IO:
1. Fuse generates the sound samples (638 frames at once, at 32000 Hz with a UK Spectrum 48K)
2. passes the data to 'sound_lowlevel_frame'
3. sound_lowlevel_frame passes it to the sound subsystem and waits for the write
And the timing: Fuse does not sleep but is blocked by sound IO.
The difference, essentially:
- with the sfifo, Fuse waits for the sound subsystem in a _(u)sleep_
- with blocking IO, Fuse waits for the sound subsystem in a _wait for IO_
Not such a big difference...
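The two write paths above can be put side by side in a toy model (an illustrative sketch, not Fuse's actual sfifo implementation; only the 638-frame figure comes from the discussion):

```python
from collections import deque

FRAME = 638   # frames Fuse generates per 1/50 s (48K Spectrum at 32 kHz)

class SFifo:
    """Minimal sound fifo: the producer writes what fits and returns
    at once; a callback thread would drain it with read()."""
    def __init__(self, capacity):
        self.q = deque()
        self.capacity = capacity

    def room(self):
        return self.capacity - len(self.q)

    def write(self, samples):
        n = min(len(samples), self.room())
        self.q.extend(samples[:n])   # sound_lowlevel_frame returns at once
        return n

    def read(self, n):
        # an SDL-style callback drains from here; on underrun the real
        # sdlwrite pads with zeros (silence) instead of clicking
        return [self.q.popleft() for _ in range(min(n, len(self.q)))]

fifo = SFifo(capacity=2 * FRAME)
fifo.write([0] * FRAME)   # frame 1 goes straight in
fifo.write([0] * FRAME)   # frame 2 fills the fifo
print(fifo.room())        # 0 -> here Fuse would sleep N x 10 ms
fifo.read(640)            # the callback consumes about one frame
print(fifo.room())        # 640 -> Fuse can write again
```

With blocking IO the same throttling happens, except the "room" check lives inside the kernel driver and Fuse sits in the write call instead of a sleep.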
If we 'convert' libao to use an sfifo (very difficult, because libao does not include callback functionality), the latency is not decreased.
By the way: Philip did not say which backend he used with libao (alsa, oss, arts, nas, ...). If ALSA (alsa09), the default sound buffer size (this is the ALSA 'hardware' ring buffer) is 500000 us (0.5 s). But some ALSA configurations may set up a bigger one, and libao may not be able to override them. If Philip tries with '-d alsa09:buffer_time=100000', libao will try to set up a 0.1 s buffer with the ALSA backend.
I tried the 0.1 s buffer with an i810 subsystem and there was no audible sound latency (to my ears). If Philip is using some high-latency sound backend with libao (arts - KDE, esd - GNOME), please do not expect a miracle...
Back to latency.
1. Fuse creates 638 frames at 32000 Hz, so we have an initial 1/50 s lag
2. Fuse's SDL output puts this packet into the sfifo and playback starts
3. SDL, if needed, reads samples into its own ring buffer
The second lag is here, because the SDL backend first reads some data (it may fill up the whole buffer) and only then starts to play. And when the ring buffer drops to a given level, it tries to read more data.
4. if SDL does not consume samples as fast as Fuse can generate them, the sfifo gets full and Fuse goes to _sleep_
5. if SDL eats more samples than Fuse can generate (buffer underrun), sound_lowlevel_frame sends _0_
With blocking IO:
1. Fuse creates 638 frames at 32000 Hz, so we have an initial 1/50 s lag
2. Fuse's sound output 'driver' puts this packet into the sound subsystem and playback starts
The second lag is here, because the sound subsystem first reads some data (it may fill up its whole ring buffer) and only then starts to play.
4. if the sound subsystem does not consume samples as fast as Fuse can generate them, it _blocks_ 'sound_lowlevel_frame', and Fuse waits for IO
5. if the sound subsystem eats more samples than Fuse can generate (buffer underrun), something interesting may happen (sound output suspended, or it gets into 'xrun' status, or something else...)
What is the difference? As I said before: in the first case Fuse waits in _sleep_ and in the second case Fuse _waits for IO_...
The lag does not depend on the type of sound output; there is just the initial lag (~1/50 s), the sound subsystem lag (caused by the 'start level' of the subsystem ring buffer) and the card lag (on-card fifo and other sound-processing lags).
Because Fuse cannot work 'for the future' -- it cannot generate sound samples to be played in the future -- we always have a definite latency. If we fill up the sfifo, the sound samples in the sfifo are all _late_, because in the ideal case the real timestamp of the last sample in the sfifo is 1/50 s _earlier_! If Fuse gets further ahead of the sound subsystem, then the sound has more lag...
If we use a small buffer we get low latency, but in any case a bigger chance of buffer underrun. There is an audible difference between SDL and the other sound outputs, though, because sdlwrite fills the buffer with zeros if there are no samples in the sfifo. So if we are not making sound at that moment, we do not hear a click (we do not catch the buffer underrun). With libao (ALSA backend) we hear a click, because ALSA stops the sound and restarts it when it gets samples again.
So, with or without the sfifo, I think there is not much difference in latency and underruns if we use the same amount of buffering. The possibility of changing the buffer size is another question.
So, after all, I think the sfifo only changes this: Fuse waits in '(u)sleep' instead of 'wait for IO'. :-)
Logged In: YES
user_id=11017
Originator: NO
>The difference essentially:
> -with sfifo fuse wait for the sound subsytem in a _(u)sleep_
> -with blocking io fuse wait for the sound subsytem in a _wait for io_
>
> Not so big difference...
Big difference - with blocking IO we are at the mercy of the sound API and the unknown amount of buffering it adds (or none at all with a callback API); with the sfifo we know exactly how much latency we are adding.
> If we 'convert' libao to uses sfifo (very difficult, because libao not
> include callback functionality), the latency not decreased.
I don't think the conversion would be difficult; we just need a separate thread reading the sound fifo and writing to libao (effectively what is done in the SDL and CoreAudio drivers).
We know the frequency, so we know the rate of sound consumption, and we can simply put in a known amount of sound samples per unit time (e.g. wait for two frames' sound to build up, then feed in those two frames' worth, followed by a frame's worth every 1/50th or 1/60th of a second). Then even though libao has a one second buffer before blocking, we can continue processing with only our desired amount of sound latency.
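The pacing idea can be checked with a small model (a sketch with assumed numbers: 640 frames per 1/50 s at 32000 Hz and a 1 s deep sink; neither function is real Fuse code):

```python
def naive_feed(ticks, frame=640, sink_frames=32000):
    """Write until the sink's buffer would block (the current
    blocking-mode behaviour against a 1 s deep driver)."""
    outstanding = 0
    for _ in range(ticks):
        while outstanding + frame <= sink_frames:
            outstanding += frame        # keep writing until it would block
        outstanding -= frame            # the card plays one frame per tick
    return outstanding                  # latency left queued in the driver

def paced_feed(ticks, frame=640):
    """The proposal above: prime two frames, then write exactly one
    frame per tick, ignoring how deep the sink's buffer is."""
    outstanding = 2 * frame             # prime with two frames' worth
    peak = outstanding
    for _ in range(ticks):
        outstanding += frame            # feed one frame this tick...
        outstanding -= frame            # ...the card plays one frame
        peak = max(peak, outstanding)
    return peak                         # worst-case queued latency

print(naive_feed(100))   # 31360 frames, ~0.98 s queued in the driver
print(paced_feed(100))   # 1280 frames = 40 ms, regardless of sink depth
```

The naive writer inherits the sink's full depth as latency; the paced writer's latency is bounded by the two primed frames no matter how large the sink buffer is.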
Logged In: YES
user_id=57243
Originator: NO
>Big difference - with blocking io we are at the mercy of the sound API and
>the unknown amount of buffering it adds (or none at all with a callback
>API), with sfifo we know exactly how much latency we are adding.
Why do you think that the blocking sound API uses additional buffering?
Why do you think that with the SDL callback you directly set and write the sound card's ring buffer? SDL uses some backend to do the dirty job. On a Mac it may use the CoreAudio system. On Linux SDL can use a lot of backends (like libao), e.g. OSS, ALSA, aRts, NAS, esd etc... If the backend allows the ring buffer size to be set, you can control it; if the backend does not allow it, you cannot.
By the way, if libao uses ALSA you can _totally_ control the _ring buffer_ size and period (OSS calls it a fragment) size, and there is _no_ additional buffer or anything else...
>We know the frequency, so we know the rate of sound consumption, we can
>simply put in a known amount of sound samples per unit time (e.g. wait for
>two frames sound to build up, then feed in those two frames worth, followed
>by a frames worth every 1/50th or 1/60th per second). Then even though
>libao has a one second buffer before blocking, we can continue processing
>with our desired amount of sound latency only.
Not exactly.
- First: libao may or may not have a 1 second buffer 'before blocking'. And we may or may not be able to control the size of this buffer. With the ALSA backend I lowered the ring buffer to 1 frame! (yes, a lot of underruns :-)
- Second: the way sound works is a little bit different... ehhh...
Generally all sound systems use a ring buffer. The ring buffer has a size, a start 'level' and a fragment/period time. The start level is the point at which the sound card starts to play the samples (usually with DMA). The fragment (OSS) / period (ALSA) size controls the interrupt 'rate' of the sound card. These interrupts are sent to the PC to inform the application that it is now time to fill up the sound buffer.
|-----------------------+----|
0                      80  100
For example, the start level is at 80% and the period size is 50% of the buffer. This means that if we fill the buffer up to 80%, the 'driver' starts playing the samples. If at this moment we add very fresh samples to the buffer, our lag is (buffer size * 0.8). Now the program can do something else, because the sound card is playing the samples. When the card has played as many samples as 50% of the whole buffer, it sends an interrupt, and the continuation of the story depends on whether a callback is used or not. If it is, the sound subsystem calls the user function (the callback), and the callback has to fill the buffer. If we do not use a callback, the sound subsystem just notes that the sound card has consumed 50% of the samples from the buffer, and at the next interrupt -- if the application has not refilled the buffer -- it does something, e.g. stops playback or fills the buffer with zeros... If the buffer is full and we want to write, the sound subsystem blocks the operation until there is a period-sized empty space in the buffer. So, no additional buffer...
If we always generate a given amount of samples (e.g. 1/50 s worth) and we always write it to the buffer at the right time, then our lag is 1/50 s + 80% of the ring buffer (in the example above). It does not depend on the style of sound IO (callback or not).
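The "1/50 s + 80% of the ring buffer" arithmetic can be made concrete (the 8192-frame buffer is an arbitrary example, not any particular card's size):

```python
def startup_lag(buffer_frames, start_level, rate):
    """Lag contributed by the ring buffer's start level: playback
    begins only once the fill level reaches start_level, so a sample
    written at that moment waits this long before reaching the DA."""
    return start_level * buffer_frames / rate

FRAME_S = 1 / 50               # Fuse generates 1/50 s of sound per frame
ring = startup_lag(8192, 0.8, 32000)
print(round(ring, 4))          # 0.2048 - 80% start level of an 8192 buffer
print(round(FRAME_S + ring, 4))  # 0.2248 - generation lag + ring buffer lag
```

With start_level = 1.0 (play only when 100% full) and a one-second buffer, the same formula yields the full one-second lag discussed in this bug.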
Generally the sound starts to play when we fill the ring buffer up to 100% -- some sound subsystems allow this to be changed (e.g. ALSA), but SDL, I think, does not. So Fuse generates sound every frame (1/50 s), SDL uses a 2×1/50 s buffer, and Fuse blocks if the sfifo is full; the maximum lag is theoretically 5×1/50 s = 1/10 s, but Fuse does not get ahead of real time, so it is really 4×1/50 s. The minimum lag is 3×1/50 s: the SDL buffer plus the generation lag. With libao, if we use a 2×1/50 s buffer, the situation is the same :-). The maximum lag is 4×1/50 s (2×1/50 s + 1×1/50 s, plus at most 1/50 s from the blocking of the IO). Because the sound card does its job in real time, and Fuse does its job in 'real time' too, there is no possibility of a big slip between them (Fuse cannot work 'for the future').
If Fuse could do something interesting instead of waiting for IO, then there would be a difference, but Fuse only sleeps.
With blocking IO, if Fuse spends more time rendering the screen or something else, it waits fewer ticks for sound IO... When Fuse uses a callback for sound, Fuse takes fewer 10 ms sleeps in timer_frame... That's it (IMHO).
>We know the frequency, so we know the rate of sound consumption, we can
>simply put in a known amount of sound samples per unit time (e.g. wait for
>two frames sound to build up, then feed in those two frames worth, followed
>by a frames worth every 1/50th or 1/60th per second). Then even though
>libao has a one second buffer before blocking, we can continue processing
>with our desired amount of sound latency only.
So... libao does not have a 'one second buffer before blocking'. The backend of libao has a buffer, and that backend blocks the IO when its buffer is full. The latency does not depend strictly on the buffer size; it depends on _when_ the sound subsystem _starts_ the sound card playing the samples. Generally (I said it before) it starts playing when the buffer is _full_. So we may have a separate thread or other magic, but the sound card starts playing exactly at the moment the ring buffer level hits 100%, and we have as much lag as the size of the buffer.
We have two choices:
1. lower the buffer,
2. or order the subsystem to start the sound before the buffer level hits 100%, e.g. at 5%... But because Fuse generates 1/50 s worth of sound every 1/50 s, and the sound card plays 1/50 s worth of sound in exactly 1/50 s :-), this is the same situation as 1. with the sound buffer lowered to 5% of its original size. :-)
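The claimed equivalence of the two choices can be checked numerically (illustrative figures: a 32000-frame, i.e. one-second, buffer at 32000 Hz):

```python
def lag(buffer_frames, start_level, rate):
    # seconds of sound queued at the moment playback starts
    return start_level * buffer_frames / rate

RATE = 32000
# choice 2: keep the 1 s (32000-frame) buffer but start playing at 5% fill
early_start = lag(32000, 0.05, RATE)
# choice 1: shrink the buffer to 5% of its size and fill it completely
small_buffer = lag(32000 * 0.05, 1.0, RATE)
print(early_start == small_buffer)   # True - the two choices coincide
```

Both give 50 ms of queued sound at playback start, which is the point being made: lowering the start level and shrinking the buffer are the same knob for a steady-rate producer.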
Logged In: YES
user_id=11017
Originator: NO
> Why do you think, that the blocking sound API use additionally buffering?
I don't mean every blocking API (OSS has no such problem with most drivers), but that we have two modes for the timer - one which leaves the buffering to the blocking sound API, and one in which we specifically provide our own buffer and sleep when that buffer is full (the sfifo based model).
The mode used for libao is the blocking API; this bug states that the sound is delayed by 1 second. In the blocking API version of the timer we write to the sound API interface until it blocks - thus I believe libao is putting in a second's worth of buffering.
I suppose it is possible that the driver being used by libao (e.g. OSS, ALSA etc.) is causing the buffering rather than libao itself, but I don't think that this is very interesting unless the libao code is not setting this explicitly when it could.
> Why do you think, with SDL callback, you directly set and write the sound card ring buffer?
I don't think that. With SDL I directly set the sfifo size, and I ask SDL to set the sound callback size (which may or may not correspond to the fragment size on the sound card) to the power-of-two size smaller than a frame's worth of sound, in order to minimise the impedance mismatch between the two.
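The "power of two smaller than a frame's worth" choice can be computed directly (a sketch; the function name is made up, and the 640/960 figures assume 32000 Hz and 48000 Hz at 50 frames per second):

```python
def sdl_callback_samples(frame_samples):
    """Largest power of two not exceeding one video frame's worth of
    sound samples - SDL audio buffer sizes must be powers of two."""
    size = 1
    while size * 2 <= frame_samples:
        size *= 2
    return size

print(sdl_callback_samples(640))   # 512 (32000 Hz / 50 frames per second)
print(sdl_callback_samples(960))   # 512 (48000 Hz / 50 frames per second)
```

Picking the power of two just below the frame size keeps the callback period shorter than the emulator's frame period, so the callback never has to wait a whole frame for data.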
> By the way, if libao use ALSA you can _totally_ controll the _ring buffer_
> size and period (oss call it fragment) sizes, and _no_ additionally buffer
> or something else...
If libao can control the sound buffer sizes it should be doing so - it is a requirement of the blocking sound API mode timer that it does.
> So.. libao not have a 'one second buffer before blocking'. The backend of
> libao have a buffer, and this backend block the IO if there buffer full.
From the perspective of the timer system the difference is not significant. The timer system requires that sound drivers that are driven by Fuse in "blocking mode" block when a reasonable amount of data is written to the sound system. This is expected to be in the region of the amount of data in a single frame. In this bug libao is buffering for a second before blocking.
> The latency exactly not depend hardly on the buffer size, it depend on
> _when_ the sound subsystem _start_ the sound card to play the samples.
This is correct if you are talking about the precise latency that occurs between the visuals and the sound - I am not; when I am talking about sound latency, I am referring to the need to introduce latency into the system when generating real-time sound, in order to minimise buffer underruns in the face of unpredictable scheduling by the OS etc.
> Generally (I said it before) start playing, when the buffer is _full_. So
> we may have a separated thread or something other magic, but the sound card
> start playing just that moment when the ringbuffer level hit the 100%, and
> we have as lag as the size of the buffer.
> We have two choice:
> 1. lower the buffer,
I am presuming that as the libao driver doesn't already do this it is not possible. If this can be done with libao, please provide a patch to reduce the buffer size to approximately a frame as this is a hard requirement of the Fuse blocking sound API mode timer.
> 2. or, order the subsystem to start the sound before buffer level hit
> 100% e.g. at 5%... But, because fuse generate 1/50s amount of sound every
> 1/50s, and sound card play 1/50s amount of sound exactly every 1/50s :-),
If libao is not blocking when Fuse writes to it, you will not reduce the 1s delay experienced by the user in this bug, as Fuse will generate sound until the driver blocks, which it will only do when its buffer is full. Fuse will build up a full 1s of latency before blocking.
Logged In: YES
user_id=57243
Originator: NO
First of all, I cannot reproduce this 'bug' on an i810-based (Intel motherboard) P4 desktop machine. Or, more exactly, I found that the latency -- if libao uses the ALSA backend -- is about 1/2 s (coinciding with the 0.5 s default buffer libao uses with ALSA).
On the other hand, Philip never gave further information about his system:
- I don't know whether he used OSS, or ALSA with OSS emulation?
- did libao use OSS, ALSA, aRts, esd or NAS?
- and is the one second exactly one second, or (I think so :-) just a guessed one second?
You wrote:
>possible. If this can be done with libao, please provide a patch to reduce
>the buffer size to approximately a frame as this is a hard requirement of
>the Fuse blocking sound API mode timer.
The situation is not so simple:
- first, libao provides a very simple interface, designed for the ogg123 sound file player
- there are a lot of underlying sound drivers (sun, oss, mac, irix, ...) and depending on the backend we do or do _not_ have control over the buffer size. With the 'sun', 'oss', 'mac', 'irix', 'esd' and 'arts' outputs we do _not_; with 'nas', 'alsa-0.5' and 'alsa-0.9+' we _do_
- if we can control the buffer size we don't need a patch, just the appropriate option added to the --sound-driver command line argument (e.g. buffer_time=100000 (a 0.1 s buffer) for alsa-0.9+)
- otherwise we have _no way at all_ to control the amount of sound buffering used by the sound subsystem
Additionally, with ALSA we may have other problems when we want to lower the buffer, because of its complexity and its 'intelligence' (e.g. dmix).
And back to latency, and the difference between 'sfifo', 'callback' and 'blocking' style sound :-)
First of all, there are 3 types of sound output you mentioned:
1. 'blocking' IO with a direct write from 'sound_lowlevel_frame'
2. 'callback' IO with a write from the 'sfifo'
3. 'blocking' IO with a separate thread writing from the 'sfifo'
You say that if we change libao from 1 to 3, then we gain more control over latency.
You wrote that before:
>I don't think the conversion would be difficult, we just need a separate
>thread reading the sound fifo and writing libao (effectively what is done
>in the SDL and CoreAudio drivers).
>
>... Then even though
>libao has a one second buffer before blocking, we can continue processing
>with our desired amount of sound latency only.
Sorry Fred, but this may be a false assumption. :-(
First: (sorry, but this is my firm belief) libao has _not_ got a buffer. Really. libao uses a buffer only if it has to swap the byte order (big/little endian), but this is another type of buffer and has no effect on latency.
Second: this one-second buffer is not before libao blocks, but before the sound starts playing.
Third: if we (or as you say: libao) have a one-second buffer, it is largely irrelevant how we fill it:
- directly from 'sound_lowlevel_frame'
- or from a 'separate thread'
we always have a one-second lag, because we have to generate one second of sound before playback starts, and the sound card plays one second of sound in exactly one second...
OK, if we dive in deeper, we can do some tricks, e.g. first fill up the buffer in order to start the sound, then wait a specific time to empty the buffer enough (decreasing the sound lag), and then start generating sound frames as needed (synchronising with the timer, not with the sound subsystem). But we don't need an sfifo to do that. And we may have other restrictions, e.g. with ALSA and OSS we can only transfer samples as packets (fragments/periods), and the subsystem blocks the IO until there is at least one period of empty space in the buffer...
To sum up:
1. The 'one second latency' is not a general 'libao' driver bug. It depends on a lot of things (the underlying sound subsystem, the sound card).
2. We cannot easily eliminate this lag by changing the 'libao' blocking-style output into an sfifo-based one.
3. In some cases we can control this lag by changing the buffer size used by libao (the sound subsystem).
4. I made a patch so that libao tries to reduce the sound buffer when the alsa-0.9+ driver is used (unless the buffer size is given explicitly): patch #1661522 http://sourceforge.net/tracker/index.php?func=detail&aid=1661522&group_id=91293&atid=596650
:-)
Logged In: YES
user_id=11017
Originator: NO
Thanks for the patch - Phil: does patch #1661522 help the problem?
> And back to latency, and difference of 'sfifo', 'callback' and 'blocking'
> style sound:-)
> First of all, there are 3 type of sound output you mentioned:
> 1. 'blocking' IO with direct write from 'sound_lowlevel_frame'
> 2. 'callback' IO with write from 'sfifo'
> 3. 'blocking' IO with a separate thread write from 'sfifo'
Correct (of which only 1 and 2 are currently implemented in Fuse). Of course if sound is disabled, we don't do any of these, we just run as much emulation as is required by the time that has passed since our last 10ms sleep. We don't call sleep in 1, we do in 2 after filling our sound buffer (which is where I refer to when I say we are adding latency).
> You say that, if we change libao from 1 to 3, then we gain more controll
> over latency.
I believe so, though I don't claim to be an authority on libao, and I understand that it has many backends that may behave differently from one another (which may make it a poor API for games and emulators).
>>... Then even though libao has a one second buffer before blocking, we can continue
>> processing with our desired amount of sound latency only.
>
> Sorry Fred, but this is may a false assumption. :-(
It may be, though I think we have seen several sound bugs over the years that suggest this can be the case even if it is not in this specific case.
> First: (sorry, but this is my fond belief) libao has _not_ got a buffer.
> Really. libao uses buffer only if it has to swap the byte order (big/little
> endian), but this is an another type of buffer and no effect on latency.
My assumption is that if we have a 1s latency, this means that there is a 1s buffer somewhere below aosound.c:sound_lowlevel_frame. I don't distinguish between aosound.c, libao, and the underlying driver in use as I am speaking from the perspective of the timer system which is the first subject of this bug.
> Second: this one second buffer not before libao blocking, but before sound
> start playing
Again, I am speaking from the perspective of the timer system in Fuse. The blocking-mode sound timer in Fuse calls sound_lowlevel_frame in a tight loop; it is this routine blocking or not that controls the latency between emulation time and sound time, and the emulation speed of Fuse.
If sound_lowlevel_frame in aosound.c does not block at all, Fuse will consume 100% CPU and run faster than 100% speed with sound enabled on a reasonably fast CPU. This is not what Phil observed, so I believe that ao_play in aosound.c:sound_lowlevel_frame blocks after 1s for Phil at least.
> Third: if we (or as you say: libao) have a one second buffer, hardly
> irrelevant how we fill it:
> - direct from 'sound_lowlevel_frame'
> - or a 'separated thread'
True - I am not saying there is only one way to do it; I am more concerned that we keep getting bugs from time to time where particular sound libraries don't work the same on all machines (due to driver bugs I guess), and we rely on them working a particular way in Fuse. See bug 915289 for an example of an OSS driver that does not work correctly for sound/osssound.c.
I think that if we didn't rely on the varying kernel driver implementations working to spec so much on Linux that some of this would be reduced/eliminated. I believe that the sfifo-based approach is sound and can be adapted to work with any reasonable API. I am not writing code to change all the drivers to use that approach, and I am not making anyone else do so.
> 1. The 'one second latency' is not a general 'libao' driver bug. It depend
> a lot of thing (underlaying sound subsystem, sound card).
I am not claiming that aosound.c always produces the same 1s latency result - that it does so at all is a bug in aosound.c from my perspective, as all the sound drivers are required to add only a reasonable amount of latency (in the order of one frame's sound samples, like osssound.c). I don't distinguish between aosound.c, libao, the underlying sound system or the sound card - all of this should be transparent to the Fuse timer system.
> 2. We cannot eliminate this lag easily by to change the 'libao' blocking
> style output into a 'sfifo' used one.
If the problem is as you describe, that the sound does not start playing until it has a full 1s buffer for Phil, then I think you are correct. OTOH we have seen problems with OSS drivers that don't block correctly, and making Fuse sleep and not block on the sound API write still allowed the sound to play in those cases.
From your description, if libao was writing to such an OSS driver, I am not sure that the fixes you propose would be enough, whereas having Fuse sleep, with some buffered sound in an sfifo to cover underruns would.
I agree this is not what Phil describes, but I am interested in having fewer bugs and less support effort for the various odd Linux etc. sound drivers, not more.
Maybe we should just delete libao and add the ALSA driver - are there any extra systems supported by libao that would not be supported by the other sound drivers already in Fuse? From a quick look at the libao page it seems to be just AIX and IRIX, and the SDL version already covers IRIX at least. I am not sure how many AIX users we have, but they've never complained about sound on the mailing list or this project page AFAIK.
> 3. In some case we allowed to controll this lag by to change the buffer size used by libao (sound subsystem)
I think that if we can't be reasonably sure we will get the correct result with libao, it may not be a suitable API to use for Fuse.
> 4. I make a patch, so libao try to reduce the sound buffer if the
> alsa-0.9+ driver used (except if we give the buffer size explicitly) patch
> #1661522
Thanks very much for this, hopefully it will resolve this bug.
Logged In: YES
user_id=57243
Originator: NO
First of all, I cannot reproduce this 'bug' on an i810-based (Intel motherboard) P4 desktop machine. Or, more exactly, I found that the latency -- if libao uses the ALSA backend -- is about 1/2 s (coinciding with the 0.5 s default buffer libao uses with ALSA).
On the other hand, Philip never gave any further information about his system:
- I don't know whether he used OSS, or ALSA with OSS emulation
- whether libao used OSS, ALSA, aRts, esd or nas
- and whether the one second is exactly one second, or (as I suspect :-) just a guessed one second...
You wrote:
>possible. If this can be done with libao, please provide a patch to reduce
>the buffer size to approximately a frame as this is a hard requirement of
>the Fuse blocking sound API mode timer.
The situation is not so simple:
- first, libao provides a very simple interface, designed for the ogg123 sound file player
- there are a lot of underlying sound drivers (sun, oss, mac, irix, ...), and depending on the backend we do or do _not_ have control over the buffer size. With the 'sun', 'oss', 'mac', 'irix', 'esd' and 'arts' outputs we do _not_; with 'nas', 'alsa-0.5' and 'alsa-0.9+' we _do_
- where we can control the buffer size we don't need a patch, just the appropriate option added to the --sound-driver command line argument (e.g. buffer_time=100000 (2/50s buffer) for alsa-0.9+)
- otherwise we have _no way at all_ to control the amount of sound buffering used by the sound subsystem
Additionally, with ALSA we may have other problems when we want to lower the buffer, because of its complexity and its 'intelligence' (e.g. dmix).
And back to latency, and the difference between 'sfifo', 'callback' and 'blocking' style sound :-)
First of all, there are three types of sound output you mentioned:
1. 'blocking' IO, with direct writes from 'sound_lowlevel_frame'
2. 'callback' IO, with writes from an 'sfifo'
3. 'blocking' IO, with a separate thread writing from an 'sfifo'
You say that if we change libao from 1 to 3, we gain more control over latency.
You wrote that before:
>I don't think the conversion would be difficult, we just need a separate
>thread reading the sound fifo and writing libao (effectively what is done
>in the SDL and CoreAudio drivers).
>
>... Then even though
>libao has a one second buffer before blocking, we can continue processing
>with our desired amount of sound latency only.
Sorry Fred, but this may be a false assumption. :-(
First: (sorry, but this is my firm belief) libao has _not_ got a buffer. Really. libao uses a buffer only if it has to swap the byte order (big/little endian), but that is another kind of buffer and has no effect on latency.
Second: this one-second buffer is not in libao before it blocks; it is filled before the sound starts playing.
Third: if we (or, as you say, libao) have a one-second buffer, it hardly matters how we fill it:
- directly from 'sound_lowlevel_frame'
- or from a 'separate thread'
we always have a one-second lag, because we have to generate one second of sound before playback starts, and the sound card plays one second of sound in exactly one second...
OK, if we dive into it more deeply, we can use a trick: first fill up the buffer so that the sound starts playing, then wait for a specific time to let the buffer drain enough (decreasing the sound lag), and then start generating sound frames as needed, synchronising with the timer rather than with the sound subsystem. But we don't need an 'sfifo' to do that. And we may have other restrictions: e.g. with ALSA and OSS we can only transfer samples in packets (fragments/periods), and the subsystem blocks the IO until the buffer has at least one period of empty space...
To sum up:
1. The 'one second latency' is not a general 'libao' driver bug. It depends on a lot of things (the underlying sound subsystem, the sound card).
2. We cannot easily eliminate this lag by changing the 'libao' blocking style output into an 'sfifo'-based one.
3. In some cases we can control this lag by changing the buffer size used by libao (the sound subsystem)
4. I have made a patch so that libao tries to reduce the sound buffer if the alsa-0.9+ driver is used (unless the buffer size is given explicitly): patch #1661522 http://sourceforge.net/tracker/index.php?func=detail&aid=1661522&group_id=91293&atid=596650
:-)
Logged In: YES
user_id=57243
Originator: NO
Sorry for the previous duplicated submission.
>Maybe we should just delete libao and add the ALSA driver - are there any
>extra systems supported by libao that would not be supported by the other
>sound drivers already in Fuse? From a quick look at the libao page it seems
>to be just AIX and IRIX, and the SDL version already covers IRIX at least,
>I am not sure how many AIX users we have, but they've never complained
>about sound on the mailing list or this project page AFAIK.
OK! Btw, I only made the libao sound output in the first place because it has a very simple API, it can use ALSA, and it can generate 'file' output. But now ALSA can do all of these things...
>and we rely on them working a particular way in Fuse. See bug 915289 for an
>example of an OSS driver that does not work correctly for
>sound/osssound.c.
>I think that if we didn't rely on the varying kernel driver
>implementations working to spec so much on Linux that some of this would be
>reduced/eliminated. I believe that the sfifo-based approach is sound and
Btw, OSS has been obsolete for a long time (since about 2004, maybe?)...
>can be adapted to work with any reasonable API. I am not writing code to
>change all the drivers to use that approach, and I am not making anyone
>else do so.
I have tried to convert the ALSA driver to use an _sfifo_, but for now I have a very mysterious problem (distorted sound with echo, and pops and clicks)...
Logged In: YES
user_id=11017
Originator: NO
As we are no longer talking about libao, I'll respond on the mailing list.
Logged In: YES
user_id=1117002
Originator: NO
Hello,
Apologies for dragging this bug up again, but I think I've run into it while trying to get Fuse to run on an SGI O2 running Irix.
I've managed to get the SDL UI and GTK+ UI to work, but the libao sound lags when using the GTK+ UI. I'm a bit limited with respect to the audio options available to me under Irix and I'm not too sure how to work around the problem. I have libao and esd installed from Nekochan.net (they also have aRts and OpenAL available; I think that's it for sound libraries). I've tried using the -d esd flag, but I think it's still going through libao and it's still being delayed.
I was wondering if somebody would be able to give me any ideas around this problem... AFAIK I don't have access to ALSA or OSS at all on Irix. I would like to get the GTK UI working mainly because it's easier and quicker to use, and I'm hoping to submit a package for Fuse to the Nekochan project.
Thanks for your help,
Jim
Logged In: YES
user_id=57243
Originator: NO
As I wrote before: "- there are a lot of underlying sound drivers (sun, oss, mac, irix, ...),
and depending on the backend we do or do _not_ have control over the buffer
size. With the 'sun', 'oss', 'mac', 'irix', 'esd' and 'arts' outputs we do _not_;
with 'nas', 'alsa-0.5' and 'alsa-0.9+' we _do_" - so there is not much hope of "fixing" the libao sound lag on Irix. But someone could write a native Irix sound driver, or make sdlsound work with the GTK+ UI...