Audio

Peter Maersk-Moller

Snowmix Audio Guide.

Updated for Snowmix-0.4.4 and GStreamer-1.x

Snowmix supports audio mixing from version 0.4 and onwards. This is done using 3 key components called audio feed, audio mixer and audio sink. Each component is intended to have at least one source and to source at least one other component or destination.

For users familiar with setting up audio configuration and wanting to understand AV sync in Snowmix, please see the page Understanding Audio and Video Sync for Snowmix.

Audio Feed

Audio feeds are created using the reserved command audio feed add. The syntax of the command is:

    audio feed add [<feed id> [<feed name>]]

Feed id is a positive integer greater than 0. Audio feed 0 is reserved for a future special purpose. Each audio feed is sourced via a control connection that can be established as a TCP connection to the host running Snowmix. The port used is the port specified in the ini file using the command

    system port <port number>

An audio feed reads data sent to it as fast as possible. The data read is queued if at least one active mixer or at least one active sink is sourced by the audio feed. Otherwise the data is dropped. Audio feeds can source an unlimited number of mixers and sinks. Audio feeds can change the volume of an audio stream and possibly mute it. Changing the volume of an audio feed will affect all mixers, and possibly sinks, using the feed as a source.

Audio Mixer

Audio mixers are created using the reserved command audio mixer add. The syntax of the command is:

    audio mixer add [<mixer id> [<mixer name>]]

Mixer id is a positive integer greater than 0. Audio mixer 0 is reserved for a future special purpose. Audio mixers are sourced from audio feeds, audio mixers and in later versions possibly audio sinks. An audio mixer can be sourced by an unlimited number of feeds and mixers (and later sinks). By default an audio mixer consumes data from its sources at the given rate and as such rate shapes its output. Audio mixers can source an unlimited number of audio mixers and audio sinks. An audio mixer can change the volume of each of its sources as well as mute them. Likewise it can change the volume of its output as well as mute it. Changing the volume of a source of a mixer using the command audio mixer source volume ... will only affect the source volume in the specified mixer and not in other mixers and sinks possibly using the same source.

Audio Sink

Audio sinks are created using the reserved command audio sink add. The syntax of the command is:

    audio sink add [<sink id> [<sink name>]]

Sink id is a positive integer greater than 0. Audio sink 0 is a special purpose sink needing no file or control connection to consume data as fast as it receives it. Currently audio sink 0 only supports a single source. This may change in later versions. Audio sinks are each sourced from one audio mixer or audio feed. Later versions may support sinks being sourced from other sinks. An individual audio sink can either write to a file or write out through a control connection. It cannot do both at the same time. For that you will need two sinks sourced from the same source, one writing to the control connection and the other writing to a file.

WARNING: Because audio sinks can write to a file, they can also write to a fifo/pipe file. If Snowmix tries to open a fifo (named pipe) for writing and that fifo does not have a reader, Snowmix will hang until a process opens the fifo for reading. This is not a bug, but a feature. Now you are warned should you choose to use fifos.

Formats

Audio feeds and sinks need to have the format of the audio input (feeds) and output (sinks) specified. The commands for setting the format are audio feed format and audio sink format. The syntax of these commands is:

    audio feed format [<feed id> (8 | 16 | 24 | 32 | 64) (signed | unsigned | float)]
    audio sink format [<sink id> (8 | 16 | 24 | 32 | 64) (signed | unsigned | float)]

Currently only 16 bit signed and unsigned formats are supported. Audio mixers internally use 32 bit signed integers, which are truncated to 16 bit when samples are passed between audio components.

Sample Rate

Audio feeds, mixers and sinks need to have the sample rate specified before they can work properly. This is done by the commands audio feed rate, audio mixer rate and audio sink rate. The syntax of these commands is:

    audio feed rate [<feed id> <rate>]
    audio mixer rate [<mixer id> <rate>]
    audio sink rate [<sink id> <rate>]

The parameter rate is a positive integer indicating the sample rate in Hz. A rate of 48000, for example, indicates audio sampled at 48000 Hz (48 kHz).

Channels

Audio feeds, mixers and sinks need to have the number of channels specified before they can work properly. This is done by the commands audio feed channels, audio mixer channels and audio sink channels. The syntax of these commands is:

    audio feed channels [<feed id> <channels>]
    audio mixer channels [<mixer id> <channels>]
    audio sink channels [<sink id> <channels>]

The parameter channels is a positive integer greater than 0 indicating the number of channels in each stream. Channels are expected to be interleaved in the audio stream. It is possible to mix audio streams even though they do not have an identical number of channels. By default, a single channel audio feed used in a mixer with two or more channels is mixed into every channel of the mixer. By default, an audio feed with two or more channels used in a single channel mixer has its channels mixed together into the single channel. This can double the volume, so lowering the volume for the source in the mixer might be needed.

Channel mapping and mixing.

The default mapping/mixing between channels in a mixer may be changed by the audio mixer source map command. While Snowmix may support an unlimited number of channels in an audio stream, audio is currently primarily tested with 1 and 2 channel audio streams. The syntax of channel mapping is:

    audio mixer source [map <mixer id> <source id> <map 1> <map 2> ... <map n>]

At least one map must be provided. A map is two non-negative integers separated by a comma: first the source channel and then the mixer channel. Channels are counted from 0. The following shows a couple of examples of source mapping.

    audio mixer source map 1 2 0,0 1,1
    audio mixer source map 1 2 0,1 1,0
    audio mixer source map 1 2 0,0 1,0
    audio mixer source map 1 2 0,0 0,0 0,0 0,0

In the examples, mixer id 1 and source id 2 are used. The first example is the default mapping of a 2 channel source and a 2 channel mixer output, where channel 0 of the source is mixed to channel 0 of the mixer's output and channel 1 of the source is mixed to channel 1 of the mixer's output.
In the second example, the channels are swapped (0 to 1 and 1 to 0). In the third example, channels 0 and 1 of the source are both mixed into channel 0 of the output of the mixer. In the fourth example, channel 0 of the source is mixed 4 times into channel 0 of the output. This is an inefficient way to increase the volume by a factor of 4.

Clipping

When 2 or more channels and/or sources are mixed into one, clipping or truncation of the resulting sample may occur if the sum of two or more samples exceeds what the sample format can hold, which is 15 bit plus sign for signed samples and 16 bit for unsigned samples. If a sample is clipped, the buffer holding the sample will be marked as clipped. When a buffer marked as clipped is used by another audio component such as an audio mixer, the output buffers of that component will also be marked as clipped. To avoid clipping, the input volume for sources to mixers must be reduced. Reducing the output volume of a mixer that clips due to its input will not prevent the mixer from clipping, and its output will still be marked as clipped.
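The effect of clipping can be illustrated with a small shell sketch. It assumes 16 bit signed samples and that a sum outside the range is clamped to the nearest limit; the exact way Snowmix limits the value is not described here. The function name mix_clip is made up for this illustration and is not a Snowmix command.

```shell
#!/bin/sh
# Illustration only: add two 16 bit signed samples as a mixer would and
# clamp the sum to the 16 bit signed range (assumed behaviour).
# Prints the resulting sample followed by a clipped flag (0 or 1).
mix_clip() {
  sum=$(( $1 + $2 ))
  clipped=0
  if [ "$sum" -gt 32767 ] ; then
    sum=32767 ; clipped=1
  elif [ "$sum" -lt -32768 ] ; then
    sum=-32768 ; clipped=1
  fi
  echo "$sum $clipped"
}

mix_clip 20000 10000     # sum fits in 16 bit signed
mix_clip 30000 10000     # sum exceeds 32767 and is clipped
```

With both source volumes reduced to 0.5, the second pair becomes 15000 and 5000, and the sum fits again.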

States.

Audio feeds, mixers and sinks internally have the following states: SETUP, READY, PENDING, RUNNING and STALLED. While feeds, mixers and sinks are being configured, they will be in the state SETUP. When sufficient configuration has been applied to a component, it will change state to READY. When started, a component will change state to PENDING. When data flows through a component, its state will change to RUNNING. If the connection to a feed or sink is closed, the feed or sink will be stopped and change state to READY.

Starting audio feeds.

In the following, audio queues are the means of connection between sinks, mixers and feeds.

Audio feeds are started by an external process establishing a control connection to Snowmix and sending the following command sequence followed by a newline:

    audio feed ctr isaudio <feed id>

The feed id is the id of the feed, typically created in the ini file. What follows the newline must be an audio data stream in the format specified for the feed. If the feed specified was ready, the state of the feed will change to PENDING. In this state, and if one or more audio queues have been applied to the audio feed, the feed will change state to RUNNING when data arrives for the feed.

Starting audio sinks.

Audio sinks can be started either by the command audio sink start or audio sink ctr isaudio. When this happens, the sink state (if ready) will change to PENDING. The sink will also apply an audio queue to its source. When at least a single sample is received on the source queue to the sink, the sink will change state to RUNNING. The syntax of the start commands is:

    audio sink start [<sink id>]
    audio sink ctr isaudio <sink id>

If the command audio sink ctr isaudio <sink id> followed by a newline is given on a control connection, and given that the sink has been set up correctly, that control connection will be used to send an audio data stream out until the sink is either stopped, which will make Snowmix disconnect the control connection, or the external process that had established the connection disconnects it. The latter will make the sink stop and change state to READY.

Do not start an audio sink with the command audio sink start <sink id> in your ini file unless the audio sink has a file specified and you want Snowmix to start writing audio samples to it when Snowmix starts. One note though: a pipeline with an audio feed, a mixer and a sink, where the mixer and the sink are started, does not produce audio samples unless either the audio feed delivers samples or the mixer is started in a special way discussed further down.

Starting audio mixers.

Audio mixers can be started by the command audio mixer start. The syntax of the command is:

    audio mixer start [[soft ]<mixer id>]

If an audio mixer was ready when started, it will change state to PENDING. When an audio queue is applied to a pending mixer, either by another mixer or by a sink, the mixer will itself apply audio queues to all its sources. If the mixer had at least one audio queue applied to it and is then started without the keyword soft, the mixer will change state to RUNNING without waiting for data from its sources. If the mixer is started with the keyword soft, the mixer will enter the state PENDING and wait for audio data from at least one of its sources before changing state to RUNNING and starting to mix data. If the mixer is started without the keyword soft and has no audio queue applied to it, the mixer will wait in the state PENDING until at least one audio queue has been applied to it. When that happens, it will apply audio queues to its sources and stay in the state PENDING until at least one of its sources sends samples to it.

In the table below, the mixer is assumed to be in the state READY. The table lists what will happen when the mixer is started with and without the keyword soft.

| State | Soft | Queue Applied | New State | Further changes |
|-------|------|---------------|-----------|-----------------|
| READY | No   | No            | PENDING   | Will enter state RUNNING when a queue has been applied and samples are received from at least one source, independently of how many sources it has |
| READY | No   | Yes           | RUNNING   | Will start sending samples to its audio queue or queues without awaiting samples to arrive from its source or sources |
| READY | Yes  | No            | PENDING   | Will enter state RUNNING when at least one queue has been applied and samples are received from at least one source, independently of how many sources it has |
| READY | Yes  | Yes           | PENDING   | Will enter state RUNNING when samples are received from at least one source, independently of how many sources it has |

In the table above, Queue Applied means that another started mixer or a started sink is using the mixer as its source and has subsequently applied an audio queue to it.
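The four rows of the table can be condensed into a small lookup, sketched below in shell. The function name mixer_start_state is made up for this illustration. It only restates the table and does not talk to Snowmix.

```shell
#!/bin/sh
# Illustration only: the state a READY mixer enters when it is started.
#   $1: started with the keyword soft (yes|no)
#   $2: an audio queue was already applied to the mixer (yes|no)
mixer_start_state() {
  if [ "$1" = "no" ] && [ "$2" = "yes" ] ; then
    echo RUNNING      # started without soft and a consumer is attached
  else
    echo PENDING      # waits for a queue and/or the first source samples
  fi
}

mixer_start_state no yes
mixer_start_state yes yes
```

Only the combination of no soft keyword and an already applied queue makes the mixer run at once. In every other case it waits in PENDING.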

If a setup requires all of the following 3 conditions to apply, then a special setup is needed. The 3 conditions are:

  • a sink must output samples as soon as it is started
  • the sink is sourced by a mixer that was started before the sink was started
  • the mixer has yet to receive samples from its sources when the sink is started.

The solution is to create the special audio sink 0. The audio sink 0 must be sourced by the mixer and the audio sink 0 must be started before the mixer is started and the mixer must be started without the soft keyword. This way the mixer will start mixing samples as soon as it is started and any other sink sourced by the mixer will receive samples for output as soon as these are started.

If multiple mixers need the setup described above, the audio sink 0 can be sourced by a mixer that is itself sourced by the mixers that need to be started. Audio sink 0 is special in the sense that it applies an audio queue to its source when started and silently drops any samples it receives.
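The ordering described above can be sketched as a shell helper that prints the commands for the ini file or a control connection. The helper name sink0_prestart and the sink name Null-Sink are made up for this illustration. The channels, rate and format lines for sink 0 are assumptions copied from the 2 channel, 48000 Hz setup used elsewhere in this guide, and may not be required for sink 0. The mixer is assumed to have been added and configured already.

```shell
#!/bin/sh
# Illustration only: print the commands that start a mixer through the
# special audio sink 0. $1 is the id of an already configured mixer.
# The order matters: sink 0 is started before the mixer, and the mixer is
# started without the keyword soft.
sink0_prestart() {
  cat <<EOF
audio sink add 0 Null-Sink
audio sink channels 0 2
audio sink rate 0 48000
audio sink format 0 16 signed
audio sink source mixer 0 $1
audio sink start 0
audio mixer start $1
EOF
}

sink0_prestart 1
```

Any other sink sourced by mixer 1 and started later should then receive samples as soon as it starts.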

Example

After this introduction, it is time to show an example of a 2 input, 1 output mixer. The mixer is sourced from 2 audio feeds and provides output to a single sink.

    audio feed add 1 Speaker
    audio feed channels 1 1
    audio feed rate 1 48000
    audio feed format 1 16 signed

    audio feed add 2 Line-In
    audio feed channels 2 2
    audio feed rate 2 48000
    audio feed format 2 16 signed

    audio mixer add 1 Main Mixer
    audio mixer channels 1 2
    audio mixer rate 1 48000
    audio mixer source feed 1 1
    audio mixer source feed 1 2

    audio sink add 1 Line-Out
    audio sink channels 1 2
    audio sink rate 1 48000
    audio sink format 1 16 signed
    audio sink source mixer 1 1
    audio sink file 1 /tmp/audio

    audio mixer start 1
    audio sink start 1

In the above example, we define two audio feeds, each with a sample rate of 48 kHz and 16 bit signed integer audio. The first feed has a single channel and the second feed has 2 channels. In addition to the feeds, a mixer with two inputs is defined, as well as a sink sourced from the mixer and writing to a file. Even though the mixer is started without the keyword soft, the mixer will not immediately start mixing and producing samples for the sink to write. The mixer will await the first samples arriving from one of its sources before it starts mixing samples and forwarding them to the sink. The reason for this is that the mixer is started before the sink and subsequently doesn't have an audio queue applied for it to fill when started. If however the audio sink is started before the mixer, the mixer will start mixing and forwarding samples to the sink without awaiting the first samples arriving from its sources. Until audio from an audio feed arrives at the mixer, the samples mixed are silent samples.

Muting.

Feeds, mixers and sinks can mute their output. In addition to this, mixers can mute the input of a source to the mixer without affecting other components that may or may not use the same source. The syntax for muting and unmuting is shown below:

    audio feed mute [(on | off) <feed id>]
    audio mixer mute [(on | off) <mixer id>]
    audio sink mute [(on | off) <sink id>]
    audio mixer source [mute (on|off) <mixer id> <source no>]

Using the mute command without parameters will list the volume and mute state for all components in the group (feeds, mixers or sinks).

Volume.

Feeds, mixers and sinks can set the volume of their output. In addition to this, mixers can set the volume of the input of a source to the mixer without affecting other components that may or may not use the same source. The volume is set on a per channel basis. The syntax for setting volume is shown below:

    audio feed volume [<feed id> <volume 0> ... <volume n>]
    audio mixer volume [<mixer id> <volume 0> ... <volume n>]
    audio sink volume [<sink id> <volume 0> ... <volume n>]
    audio mixer source [volume <mixer id> <source no> <volume 0> ... <volume n>]

The value for volume N is a positive real number less than or equal to 4.0. N is the channel number in the audio stream counting from 0. A volume of 1.0 means unchanged volume. It is possible to use the character '-' as a substitute for a volume value that should be left unchanged. To set the volume to 0.5 for the second channel of the third source in the first mixer, the command would be:

    audio mixer source volume 1 3 - 0.5

The following command sets the volume to 0.5 for the first channel of the same mixer and source:

    audio mixer source volume 1 3 0.5

Setting the first channel volume to 0.7 and the second channel volume to 0.6 for the same source and same mixer can be done using the following command:

    audio mixer source volume 1 3 0.7 0.6

Using the volume command without parameters (for feeds, mixers and sinks, not for mixer sources) will list the volume and mute state for all the audio components in the group. A group is either feeds, mixers or sinks.

Delay for feeds.

It is possible to set a fixed delay of silence samples that will be added to an audio feed every time it is started. The syntax for adding a delay in milliseconds is shown below:

    audio feed delay [<feed id> <delay in ms>]

The command audio feed delay will list the delay configured for all feeds.

Adding silence and dropping samples.

It is possible for a feed, a mixer and a sink to add silence or drop samples. The commands for adding silence and dropping samples are shown below:

    audio feed add silence <feed id> <ms>
    audio mixer add silence <mixer id> <ms>
    audio sink add silence <sink id> <ms>
    audio mixer source add silence <mixer id> <source no> <ms>
    audio feed drop <feed id> <ms>
    audio mixer drop <mixer id> <ms>
    audio sink drop <sink id> <ms>
    audio mixer source drop <mixer id> <source no> <ms>

The value of ms is the amount of audio to add or drop, given as the number of samples expressed in milliseconds.
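To relate a millisecond value to samples and bytes, the arithmetic can be sketched in shell as below. The helper names ms_to_frames and ms_to_bytes are made up for this illustration. Shell integer arithmetic rounds down, and how Snowmix itself rounds is not stated here.

```shell
#!/bin/sh
# Illustration only: convert a duration in milliseconds to sample frames
# (one sample per channel) and to bytes of interleaved audio.
#   frames = rate * ms / 1000
#   bytes  = frames * channels * bits per sample / 8
ms_to_frames() {   # $1: rate in Hz, $2: ms
  echo $(( $1 * $2 / 1000 ))
}
ms_to_bytes() {    # $1: rate in Hz, $2: ms, $3: channels, $4: bits per sample
  echo $(( $1 * $2 / 1000 * $3 * $4 / 8 ))
}

ms_to_frames 48000 20        # prints 960
ms_to_bytes 48000 20 2 16    # prints 3840
```

So dropping 20 ms from a 2 channel, 16 bit, 48000 Hz stream corresponds to 960 sample frames, or 3840 bytes.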

Audio info and status.

To get the information and status of feeds, mixers and sinks, the following commands are available:

    audio feed info
    audio mixer info
    audio sink info
    audio feed status
    audio mixer status
    audio sink status

The info command is intended to list the more permanent settings. In the current version the following information is listed : state, rate, channels, bytespersample, signess, volume, mute, buffersize, delay, queues.

The status command is intended to list information changing more rapidly. The status information listed in the current version is: state, samples, samplespersecond, avg_samplespersecond, silence, dropped, clipped, delay, rms. A value larger than 0 for clipped indicates that clipping has occurred. Delay is the number of samples in the queue expressed in milliseconds. The value of rms is a value in percent from 0 to 100 per channel indicating the power of the audio signal. The rms is calculated as the sum of squared samples in a buffer, divided by the number of samples in the buffer. If the calculated rms is less than 95% of the previously calculated rms value, rms is set to 95% of the previous rms value, giving a slightly slowed fall-back effect. There will be an rms value for each channel, separated by commas.
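The 95% fall-back rule can be sketched with a few lines of shell and awk. The helper name rms_decay is made up for this illustration. It takes already calculated per-buffer rms values for one channel, one per line, and applies only the fall-back rule. How Snowmix scales the raw value to percent is not covered.

```shell
#!/bin/sh
# Illustration only: apply the 95% fall-back rule to a series of per-buffer
# rms values read from stdin, one value per line.
rms_decay() {
  awk '{
    floor = 0.95 * prev              # lowest value allowed after the previous buffer
    rms = ($1 < floor) ? floor : $1  # never fall faster than 5% per buffer
    printf "%.2f\n", rms
    prev = rms
  }'
}

printf '100\n50\n96\n10\n' | rms_decay
```

For the input 100, 50, 96, 10 this prints 100.00, 95.00, 96.00 and 91.20, so a sudden drop in level shows up as a gradual fall in the reported value.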

Verbosity.

The following commands will either toggle or set the verbosity level for audio feeds, mixers and sinks:

    audio feed verbose [<level>]
    audio mixer verbose [<level>]
    audio sink verbose [<level>]

The default level is 0, meaning no verbosity. Toggling verbosity sets the level to 1, assuming the level was 0. Otherwise toggling verbosity sets the level to 0. An increased level of verbosity will result in an increased amount of debugging information being written. All additional information written at a verbosity level of 2 and above will be written to stderr rather than to a possible control connection.

Audio help

To list the syntax for audio commands, the help command can be used. The syntax for the help commands is:

    audio feed help
    audio mixer help
    audio sink help

Feeding audio to Snowmix.

In the following example an external process is feeding audio data to Snowmix. It is assumed that Snowmix, through its ini file, has a feed configured for the input of data:

    # Works for both GStreamer-0.10 and GStreamer-1.x
    which gst-launch-1.0 2>/dev/null 1>&2
    if [ $? -eq 0 ] ; then
      AUDIOFORMAT='audio/x-raw, format=S16LE, layout=interleaved'
      gstlaunch=gst-launch-1.0
    else
      AUDIOFORMAT='audio/x-raw-int, endianness=1234, signed=true, width=16, depth=16'
      gstlaunch=gst-launch-0.10
    fi
    (
      # printf sends the command followed by exactly one newline in any shell
      printf 'audio feed ctr isaudio 1\n'
      $gstlaunch -v audiotestsrc is-live=true  !\
        $AUDIOFORMAT', rate=48000, channels=1' !\
        fdsink fd=3 sync=true 3>&1 1>&2
    ) | nc 127.0.0.1 9999

It is worth noting that we are using 16 bit signed integer audio sampled at 48000 Hz and a single channel. It is also worth noting that gst-launch is limiting its pipeline to only write samples in real-time at the right speed using 'sync=true' for the fdsink module. Without it, in this specific case, GStreamer would produce samples as fast as the CPU would allow.

The script directory holds additional script examples where both video and audio from the same file are fed into the mixer.

Getting audio from Snowmix.

In the following example an external and possibly remote process is reading audio from Snowmix and playing it on the remote computer. The audio sink needed to be able to read audio from Snowmix is assumed to be configured for 2 interleaved channels, a 48000 Hz sample rate and 16 bit signed integer samples. Snowmix is assumed to run on a host with the IP address 192.168.1.2.

    # Works for both GStreamer-0.10 and GStreamer-1.x
    which gst-launch-1.0 2>/dev/null 1>&2
    if [ $? -eq 0 ] ; then
      AUDIOFORMAT='audio/x-raw, format=S16LE, layout=interleaved'
      gstlaunch=gst-launch-1.0
    else
      AUDIOFORMAT='audio/x-raw-int, endianness=1234, signed=true, width=16, depth=16'
      gstlaunch=gst-launch-0.10
    fi
    (echo 'audio sink ctr isaudio 1' ; cat >/dev/null) | nc 192.168.1.2 9999 | \
      (head -1
       $gstlaunch -v fdsrc fd=0                !\
        $AUDIOFORMAT', rate=48000, channels=2' !\
        audioconvert                           !\
        autoaudiosink
       )

The above example works fine though it has one minor drawback. After nc has received a string on stdin to send to Snowmix, stdin must be kept open to prevent nc from terminating. This is done with the command cat >/dev/null. Many other options exist. However the effect is that the overall script does not terminate when Snowmix terminates the connection between nc and Snowmix. For that you need to send the process a kill signal, perhaps through a CTRL-C. What is needed is a way to tell nc or a similar program to stay open also after stdin is closed or has sent an EOF. Feel free to suggest solutions here or in the forum for Snowmix.

In the following example an external process is reading audio from Snowmix, encoding it to mp3, encapsulating it in an MPEG Transport Stream and making it available on TCP port 5000 on the server running Snowmix. Snowmix is assumed to run on a host with the IP address 192.168.1.2.

    # Works for both GStreamer-0.10 and GStreamer-1.x
    which gst-launch-1.0 2>/dev/null 1>&2
    if [ $? -eq 0 ] ; then
      AUDIOFORMAT='audio/x-raw, format=S16LE, layout=interleaved'
      gstlaunch=gst-launch-1.0
    else
      AUDIOFORMAT='audio/x-raw-int, endianness=1234, signed=true, width=16, depth=16'
      gstlaunch=gst-launch-0.10
    fi
    (echo 'audio sink ctr isaudio 1' ; cat >/dev/null) | nc 127.0.0.1 9999 | \
      (head -1
       $gstlaunch -v fdsrc fd=0                !\
        $AUDIOFORMAT', rate=48000, channels=2' !\
        lamemp3enc                             !\
        mpegtsmux                              !\
        mpegtsparse                            !\
        queue leaky=2                          !\
        tcpserversink port=5000 host=0.0.0.0
      )

The mp3 encoded audio can be played by VLC when opening a network stream and specifying tcp://192.168.1.2:5000 as the source URL, assuming that is the IP address of the host.

Another way to play it would be to use the script shown below:

    # Works for both GStreamer-0.10 and GStreamer-1.x
    which gst-launch-1.0 2>/dev/null 1>&2
    if [ $? -eq 0 ] ; then
      gstlaunch=gst-launch-1.0 ; decodebin=decodebin      # GStreamer-1.x
    else
      gstlaunch=gst-launch-0.10 ; decodebin=decodebin2    # GStreamer-0.10
    fi
    $gstlaunch -v tcpclientsrc host=192.168.1.2 port=5000 ! \
        $decodebin                                        ! \
        autoaudiosink

Pausing audio mixing and dropping source samples

Sometimes it can be required to hold/delay mixing a source in an audio mixer. The command audio mixer source pause is intended for this. The command has the following syntax:

    audio mixer source [pause <mixer id> <source no> <frames>]

The argument frames specifies for how many frame periods the inclusion of the source will be suspended. While a source for a mixer is paused, arriving samples will build up. When a source is no longer paused, either because the number of frame periods specified has passed or because the pausing has been disabled, the mixer will start mixing the queued samples at normal speed. If the delay introduced is to be cancelled, the command audio mixer source drop can drop samples. Another method could be to drop samples below a certain threshold using the command audio mixer source rmsthreshold. This command will drop audio frames that have an RMS value below the threshold specified for the command. The command has the following syntax:

    audio mixer source [rmsthreshold <mixer id> <source no> <level>]

Assume the mixer is mixing two sources, where one is a commentator (the speaker) and the other is audio from a VHF radio with mostly silence and occasional messages. It could then be useful to drop the VHF radio source samples when they are below a threshold representing silence and otherwise delay them while the speaker is talking. When the speaker is talking, a script detecting the RMS level of the speaker can ask the VHF source to both pause and drop silence. If there are any incoming messages on the radio, these will be queued. The script should notify the speaker about incoming VHF messages. The speaker can then round off and stop talking. The script can then detect that the speaker is silent, suspend the pausing of the VHF radio source and start playing the queued messages. When these have been played, the script can notify the speaker, the speaker can start talking again and the script can again pause the VHF source for incoming radio messages.

Snowmix version 0.4.4 has all the necessary hooks and commands to implement the scenario described here using the embedded Tcl interpreter.
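As a rough sketch of the first step of this scenario, driven from an external shell script rather than the embedded Tcl interpreter, the helper below prints the two commands that hold back the VHF source. The helper name vhf_hold is made up for this illustration, and the mixer id, source no, number of frames and threshold level used in the call are placeholders, not recommended values. How pausing is disabled again is not shown, since its command form is not given above.

```shell
#!/bin/sh
# Illustration only: print the commands that pause a mixer source and drop
# its silent frames while the speaker is talking.
#   $1: mixer id   $2: source no   $3: frame periods to pause   $4: rms threshold
vhf_hold() {
  printf 'audio mixer source pause %s %s %s\n' "$1" "$2" "$3"
  printf 'audio mixer source rmsthreshold %s %s %s\n' "$1" "$2" "$4"
}

vhf_hold 1 2 250 5
# The output could be piped to the control port used in the earlier examples,
# for instance:  vhf_hold 1 2 250 5 | nc 127.0.0.1 9999
```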

Questions.

Feel free to comment here, but questions are best asked and answered in the Snowmix forum / discussion.

