- Make sure you have installed the FlowDesigner and ManyEars packages before starting. Binaries for OSX, Windows & Linux can be found in the SourceForge download section.
If binaries are not available for your installation, the following section contains information on how to build from source.
Building From Source
Please consult the [Downloads] section to get the code from SVN or the latest tarball. Instructions for supported operating systems are next.
OSX
Windows
**Notes:**
* We are assuming here that you are using Dev-Cpp to get pre-compiled (binary) packages from community devPaks.
* We are also assuming that FlowDesigner is installed in C:\Program Files\FlowDesigner
- 1) Follow the instructions to compile FlowDesigner first.
- 2) Install the FFTW3 package from Dev-Cpp
- 3) Open a terminal in the ManyEars root directory and generate the makefiles:
cmake -G "MinGW Makefiles" .
- 4) Compile and install ManyEars:
make install
(The default install directory is C:\Program Files\FlowDesigner\lib\flowdesigner\toolbox\MANYEARS. This can be changed in the root CMakeLists.txt.)
- 5) Run flowdesigner.exe from C:\Program Files\FlowDesigner\bin. The ManyEars toolbox should be available in the TreeView.
Linux
The main processing loop contains the following elements:
[[img src=ProcessingLoop.png width=640]]
- RTAudio is used to capture the audio stream.
- We are sampling at 48 kHz.
- Frames are 1024 samples wide, with 50% overlap.
- The 8 input channels are interleaved, and each sample is 16-bit little endian.
- 8 gains are defined, one per microphone input. This is useful for fine-tuning gains to compensate for slightly different microphones or sampling hardware.
- Gains are applied to each sample.
- Localization algorithm with the steered beamformer. Please refer to the publications for more information.
- Geometric sound source separation. Please refer to the publications for more information.
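The input format listed above (8 interleaved channels, 16-bit little-endian samples, one gain per microphone, 1024-sample frames with 50% overlap) can be illustrated with a minimal sketch. This is not the ManyEars or RTAudio code: the names `deinterleave`, `decodeSample` and `frameCount` are hypothetical, and the sketch assumes the raw stream arrives as a plain byte buffer.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kChannels  = 8;              // microphones, interleaved in the stream
constexpr std::size_t kFrameSize = 1024;           // samples per frame, per channel
constexpr std::size_t kHopSize   = kFrameSize / 2; // 50% overlap between frames

// Decode one signed 16-bit little-endian sample from two raw bytes.
inline std::int16_t decodeSample(const unsigned char* p) {
    return static_cast<std::int16_t>(
        static_cast<std::uint16_t>(p[0] | (p[1] << 8)));
}

// Split an interleaved byte buffer into one float vector per channel,
// multiplying every sample by the gain of its channel.
std::array<std::vector<float>, kChannels> deinterleave(
        const std::vector<unsigned char>& raw,
        const std::array<float, kChannels>& gains) {
    std::array<std::vector<float>, kChannels> out;
    const std::size_t bytesPerTick = kChannels * 2; // one sample for each channel
    const std::size_t ticks = raw.size() / bytesPerTick;
    for (std::size_t t = 0; t < ticks; ++t) {
        for (std::size_t c = 0; c < kChannels; ++c) {
            const unsigned char* p = &raw[(t * kChannels + c) * 2];
            out[c].push_back(gains[c] * decodeSample(p));
        }
    }
    return out;
}

// Number of complete overlapping frames contained in a channel.
std::size_t frameCount(std::size_t samplesPerChannel) {
    if (samplesPerChannel < kFrameSize) return 0;
    return 1 + (samplesPerChannel - kFrameSize) / kHopSize;
}
```

With a hop of 512 samples, each new frame reuses the second half of the previous one, so one second of audio at 48 kHz yields roughly 92 frames per channel.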
SourceTrack outputs a Vector of SourceInfo, the data structure that describes active sound sources in FlowDesigner. The most useful fields are:
- float x[3]: x, y, z coordinates of the detected source, relative to the origin used for the microphone array positions.
- float strength: strength of the source.
- int source_id: unique id of the source.
ToString
- Transforms the Vector<ObjectRef> output of SourceTrack, which contains the source information, into a FlowDesigner serialized string.
QtSendString
- Sends the string to a TCP socket on port 30000. This is the port used by the audioviewer application for displaying live sources.
Constant (CONDITION)
- This is the loop condition. The constant is set to "TRUE" for infinite looping.
- Separated streams will be saved in WAV format.
Open Sound Control output using UDP sockets.
- The OSC format for ManyEars is the following:
"/manyears\0\0\0,iffffff\0\0\0\0"
+ source_id (4 bytes)
+ x (4 bytes)
+ y (4 bytes)
+ z (4 bytes)
+ strength (4 bytes)
+ theta (4 bytes)
+ phi (4 bytes)
The resulting code is:
int id = info->source_id;
float x = info->x[0];
float y = info->x[1];
float z = info->x[2];
float strength = info->strength;
float phi = -180.0 * atan2(info->x[2], info->x[1]) / M_PI;
float theta = 180.0 * atan2(info->x[1],info->x[0]) / M_PI;
stringstream outputString;
char header[12] = {'/','m','a','n','y','e','a','r','s', '\0', '\0','\0'};
char tags[12] = {',','i','f','f','f','f','f','f','\0','\0','\0','\0'};
outputString.write(header,12);
outputString.write(tags,12);
BinIO::write<int>(outputString,&id,1);
BinIO::write<float>(outputString,&x,1);
BinIO::write<float>(outputString,&y,1);
BinIO::write<float>(outputString,&z,1);
BinIO::write<float>(outputString,&strength,1);
BinIO::write<float>(outputString,&theta,1);
BinIO::write<float>(outputString,&phi,1);
// Send the whole packet as a single UDP datagram
const std::string packet = outputString.str();
socket.writeDatagram(packet.data(), packet.size(),
QHostAddress(hostname.c_str()), portnumber);
Important Notes
- Phi and theta can be calculated from x, y, z. We have decided to transmit them anyway for ease of use.
- float phi = -180.0 * atan2(info->x[2], info->x[1]) / M_PI;
- float theta = 180.0 * atan2(info->x[1], info->x[0]) / M_PI;
- Data must be aligned on 32 bits with OSC.
- tags[12] = {',','i','f','f','f','f','f','f','\0','\0','\0','\0'}; describes the argument types of the OSC packet (one int followed by six floats). Binary data is written in big endian (network order).
- We send N separate packets, one for each active source.
- We should consider writing a generic OSC FlowDesigner Node.
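On the receiving side, the packet layout described above (12-byte address, 12-byte type tags, then seven 4-byte big-endian values) could be decoded roughly as follows. This is only a sketch: `ManyEarsSource` and `parseManyEarsPacket` are hypothetical names that are not part of ManyEars, and the code assumes that each datagram carries exactly one 52-byte message and that floats are IEEE 754 single precision.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical receiver-side view of one ManyEars OSC message.
struct ManyEarsSource {
    std::int32_t id;
    float x, y, z, strength, theta, phi;
};

// 12 bytes of address + 12 bytes of type tags + 7 arguments of 4 bytes.
constexpr std::size_t kPacketSize = 12 + 12 + 7 * 4;

// Read a 32-bit big-endian (network order) unsigned integer.
static std::uint32_t readBE32(const unsigned char* p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
}

// Read a 32-bit big-endian float by reinterpreting its bit pattern.
static float readBEFloat(const unsigned char* p) {
    const std::uint32_t bits = readBE32(p);
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Returns true and fills 'out' if 'data' is a well-formed /manyears message.
bool parseManyEarsPacket(const unsigned char* data, std::size_t size,
                         ManyEarsSource& out) {
    static const char header[12] =
        {'/','m','a','n','y','e','a','r','s','\0','\0','\0'};
    static const char tags[12] =
        {',','i','f','f','f','f','f','f','\0','\0','\0','\0'};
    if (size != kPacketSize) return false;
    if (std::memcmp(data, header, 12) != 0) return false;
    if (std::memcmp(data + 12, tags, 12) != 0) return false;

    const unsigned char* p = data + 24; // first argument
    out.id       = static_cast<std::int32_t>(readBE32(p));
    out.x        = readBEFloat(p + 4);
    out.y        = readBEFloat(p + 8);
    out.z        = readBEFloat(p + 12);
    out.strength = readBEFloat(p + 16);
    out.theta    = readBEFloat(p + 20);
    out.phi      = readBEFloat(p + 24);
    return true;
}
```

Reading the bytes one by one keeps the decoder independent of the host byte order, and both the address and the tag string are already padded to a multiple of 4 bytes, which is why every argument stays 32-bit aligned.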
Microphones Configuration
The default microphone configuration is defined in the file cube_mic_pos.mat:
[[img src=Cube.jpg]]
- Microphone #1 x=-18cm y=16cm z=-15.5cm
- Microphone #2 x=-18cm y=16cm z=15.5cm
- Microphone #3 x=18cm y=16cm z=-15.5cm
- Microphone #4 x=18cm y=16cm z=15.5cm
- Microphone #5 x=-18cm y=-16cm z=-15.5cm
- Microphone #6 x=-18cm y=-16cm z=15.5cm
- Microphone #7 x=18cm y=-16cm z=-15.5cm
- Microphone #8 x=18cm y=-16cm z=15.5cm
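To give a feel for what these positions imply, the sketch below stores the default cube geometry in metres and computes the largest distance between two microphones, together with the corresponding maximum inter-microphone delay in samples. It does not reproduce the format of cube_mic_pos.mat; the names `MicPosition`, `kCubeMics`, `maxPairDistance` and `maxDelaySamples` are hypothetical, and the speed of sound of 343 m/s is an assumption.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

struct MicPosition { float x, y, z; }; // metres, relative to the array origin

// Default cube configuration listed above, converted from cm to m.
const std::array<MicPosition, 8> kCubeMics = {{
    {-0.18f,  0.16f, -0.155f},  // Microphone #1
    {-0.18f,  0.16f,  0.155f},  // Microphone #2
    { 0.18f,  0.16f, -0.155f},  // Microphone #3
    { 0.18f,  0.16f,  0.155f},  // Microphone #4
    {-0.18f, -0.16f, -0.155f},  // Microphone #5
    {-0.18f, -0.16f,  0.155f},  // Microphone #6
    { 0.18f, -0.16f, -0.155f},  // Microphone #7
    { 0.18f, -0.16f,  0.155f},  // Microphone #8
}};

// Euclidean distance between two microphones, in metres.
float micDistance(const MicPosition& a, const MicPosition& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Largest distance over all microphone pairs, in metres.
float maxPairDistance(const std::array<MicPosition, 8>& mics) {
    float best = 0.0f;
    for (std::size_t i = 0; i < mics.size(); ++i)
        for (std::size_t j = i + 1; j < mics.size(); ++j)
            best = std::fmax(best, micDistance(mics[i], mics[j]));
    return best;
}

// Largest possible arrival-time difference between two microphones,
// expressed in samples (speed of sound is an assumed 343 m/s by default).
float maxDelaySamples(const std::array<MicPosition, 8>& mics,
                      float sampleRateHz, float speedOfSound = 343.0f) {
    return maxPairDistance(mics) / speedOfSound * sampleRateHz;
}
```

For example, `maxDelaySamples(kCubeMics, 48000.0f)` gives the delay range, in samples, that a time-delay based localization would have to cover for this geometry at the 48 kHz sampling rate.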
Important Notes
- Microphones must be **connected in the correct order** to the sound card inputs. Microphone #1 goes into input #1, Microphone #2 goes into input #2, etc.
- Microphone positions can be changed to fit your setup. You must configure the microphone positions to match your hardware, otherwise the algorithm won't work.
- Sampling time for each audio input must be synchronized by hardware.
Audioviewer