Running on embedded platforms

This page is dedicated to compiling and running SLAM++ on embedded platforms (not the usual x86 / x64, but for example ARM).

Raspberry Pi

We tested SLAM++ on a Raspberry Pi Model B (the one with 512 MB of RAM). You will need a fast SD card, which is not part of the package; we tested with a Kingston 8 GB class 10 SDHC card (SD10V/8GB). To get started, follow the instructions on the Raspberry Pi downloads page and put Raspbian "wheezy" (or any other hard-float distribution) on your SD card, using either dd in Linux or Win32DiskImager in Windows.

Then put the SD card in your Raspberry, hook up a micro-USB cable for power, connect a keyboard and mouse over USB, connect LAN using the provided RJ45 connector, and connect a monitor over HDMI (possibly using an HDMI to DVI adapter).

The first time, the Raspberry boots to a setup menu. You will probably want to use all the available space on the SD card and to disable the setup menu on the next boot. You may also want to enable SSH access. We chose the X server GUI option in order to be able to see the result images; otherwise SLAM++ works just fine from the console. You can always come back to the setup menu by typing sudo raspi-config in the console.

First, you need to connect to the network. If you bought a WiFi dongle or if you are connecting to a DHCP-enabled network, it should work right away. If, on the other hand, you need a static IP address (for example when using Internet Connection Sharing (ICS) with your desktop), you will need to edit /etc/network/interfaces, like this:

sudo leafpad /etc/network/interfaces

And modify the file to this (remember to adjust the IP addresses to your network settings):

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
address 192.168.1.2
netmask 255.255.255.0
network 192.168.1.0
broadcast 192.168.1.255
gateway 192.168.1.1
dns-nameservers 8.8.8.8 8.8.4.4

Note that the DNS servers are Google's public servers. You might want to use the addresses that your other computers use instead.
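
The network and broadcast values follow from the address and netmask, so they are easy to get wrong when adapting the example to another network. A minimal sketch for deriving them, assuming a POSIX shell (net_fields is a hypothetical helper name, not something that ships with Raspbian):

```sh
#!/bin/sh
# net_fields ADDRESS NETMASK
# Prints the "network" and "broadcast" lines for /etc/network/interfaces,
# given an IPv4 address and netmask in dotted-quad form.
# The body runs in a subshell, so the IFS change does not leak out.
net_fields() (
    IFS=.
    set -- $1 $2    # split both dotted quads into eight octets: $1..$4, $5..$8
    # network: address AND netmask, octet by octet
    echo "network $(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
    # broadcast: address OR inverted netmask, octet by octet
    echo "broadcast $(( $1 | (255 ^ $5) )).$(( $2 | (255 ^ $6) )).$(( $3 | (255 ^ $7) )).$(( $4 | (255 ^ $8) ))"
)
```

Calling net_fields 192.168.1.2 255.255.255.0 reproduces the network and broadcast lines used in the file above.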

Next, unless you are fine with the British keyboard layout that Raspbian seems to come with, you will also need to edit /etc/default/keyboard, like this:

sudo leafpad /etc/default/keyboard

And add one or more layouts, for example like this:

XKBMODEL="pc105"
XKBLAYOUT="cz,us"
XKBVARIANT=""
XKBOPTIONS="grp:ctrl_shift_toggle,lv3:ralt_switch"

BACKSPACE="guess"

If in doubt, refer to:

man 5 keyboard

Also, unfortunately, Num Lock is off after the Raspberry boots. To fix that, one can get a small utility called numlockx:

sudo apt-get install numlockx
cd ~/.config
echo '#!/bin/sh' > numlock.sh
echo numlockx on >> numlock.sh
chmod +x numlock.sh
sudo crontab -e

Append an entry that runs numlock.sh (give the full path to the script) at the end of the cron table, save (F3) and quit (Alt+X). Now your Raspberry should boot with Num Lock on.

In order to see the result images, you need to install a small image viewer (the bundled one doesn't support the compressed .tga images that SLAM++ produces). feh seems to be a nice choice:

sudo apt-get install feh

In order to build, one also needs CMake and SVN:

sudo apt-get install cmake
sudo apt-get install subversion

One last step is increasing the swap space, as 512 MB of RAM won't cut it for compiling all those tricky templates:

sudo leafpad /etc/dphys-swapfile

And increase the size to 1 GB:

CONF_SWAPSIZE=1024

Save and restart the swapfile, like this:

sudo /etc/init.d/dphys-swapfile stop
sudo /etc/init.d/dphys-swapfile start

Note that if there are too many applications running, the swapfile will fail to stop, saying that there is not enough free memory. In that case, just restart the whole system. Also note that having a swap file on the SD card shortens its lifetime (as this type of memory only survives a limited number of write cycles). When finished, it is better to reduce the swap size back to the default 100 MB.
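
When working over SSH without the GUI, leafpad is not an option, but the same change can be scripted. A minimal sketch, assuming the config file contains an uncommented CONF_SWAPSIZE= line (set_swapsize is a hypothetical helper name):

```sh
#!/bin/sh
# set_swapsize FILE SIZE_MB
# Rewrites the CONF_SWAPSIZE line of a dphys-swapfile style config file,
# leaving all other lines untouched. Assumes the line is not commented out.
set_swapsize() {
    tmp=$(mktemp) || return 1
    # Write the edited copy to a temporary file first, then copy it back,
    # so the original file keeps its owner and permissions.
    sed "s/^CONF_SWAPSIZE=.*/CONF_SWAPSIZE=$2/" "$1" > "$tmp" &&
        cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}
```

Run as root, set_swapsize /etc/dphys-swapfile 1024 would make the edit described above, and set_swapsize /etc/dphys-swapfile 100 would restore the default afterwards. The swapfile still needs to be stopped and started for the change to take effect.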

Now we are set to go:

cd ~/Desktop
mkdir slam
cd slam
svn checkout svn://svn.code.sf.net/p/slam-plus-plus/code/trunk .

There is a CMakeLists.txt modified to work with the Raspberry by default. You can download it and replace the CMakeLists.txt that came from SVN. Alternatively, you need to disable -fprefetch-loop-arrays (as ARM doesn't seem to have prefetch functionality), -march=native and also -fopenmp / -lgomp (as the Raspberry is single-core) yourself. Then configure and build:

cd build
cmake -i ..
<configure the build, or just press Enter to skip>
make
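
If you edit the CMakeLists.txt from SVN yourself instead of downloading the modified one, the flags named above have to come out before cmake is run. A rough sketch of scripting that, assuming the flags appear as literal text in CMakeLists.txt, which may not match how the actual file spells them (strip_flags is a hypothetical helper name):

```sh
#!/bin/sh
# strip_flags FILE
# Removes the four flags wherever they appear literally in FILE.
# The unmodified file is kept next to it as FILE.orig.
strip_flags() {
    cp "$1" "$1.orig" &&
    sed -e 's/-fprefetch-loop-arrays//g' \
        -e 's/-march=native//g' \
        -e 's/-fopenmp//g' \
        -e 's/-lgomp//g' "$1.orig" > "$1"
}
```

From the build directory this would be called as strip_flags ../CMakeLists.txt. It is worth comparing the result against the .orig copy before building, since a blind text replacement can miss flags that are set in other ways.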

Building takes a long time (over 8 hours) on the Raspberry. This is partly due to swapping memory to the slow SD card, and partly due to the plain truth that the onboard single-core ARM11 (ARMv6) with no instruction set extensions is by no means fast.

If the compiler gives errors such as "virtual memory exhausted" or sometimes "internal compiler error", you probably need to close some applications or increase the size of the swapfile, and try again. The code builds without any modifications (at least on the 512 MB model).

After that, you need to download data and run as usual:

../bin/slam-plus-plus -i ../data/manhattanOlson3500.txt --lambda -po

And that should produce the result in about 2.5 seconds (it takes less than 0.05 seconds on a moderately powerful PC). To see the result, either type feh solution.tga in the console, or right-click solution.tga, select "Open with ..." and type feh %f in the box. That should do the trick.

To decrease build times, one can modify include/slam/ConfigSolvers.h to only compile solvers that are really needed (typically Lambda for pose / landmark problems, Lambda_LM for bundle adjustment or fastL for incremental solving). Also, one can go to src/slam/Solve??Impl.cpp and disable types of problems that are not required. Cross-compilation might also be a good idea, if you have another computer with g++.

Please note that the commands were not copy-pasted from Raspberry, and may therefore contain typos. If you have any problems, don't hesitate to contact us. If you have tried running on your Raspberry or another embedded platform, we would love to hear about it.

To get some idea about running times, there are some benchmarks (not overclocked):

Dataset	Mode	Time native (sec)	Time CHOLMOD (sec)	Time CSparse (sec)
Manhattan-Olson	batch	2.269394	3.67878	2.686455
10K	batch	18.615160	16.469152	26.732065
100K	batch	392.682670	330.219778	735.995354
Intel	batch	0.533424	0.649770	0.573284
Killian court	batch	0.524023	0.733992	0.554213
City 10k	batch	17.253814	14.28070	26.266288
City Trees 10k	batch	4.865824	7.226268	4.756619
Victoria Park	batch every 100 vertices	33.778355	52.497989	33.900039
Parking Garage	batch	8.369959	10.555369	11.680632
Sphere	batch	36.463761	33.839142	106.024728
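
To compare the solvers more directly, the times can be turned into ratios relative to the native solver. A small sketch using awk, with the rows copied from the table above (the Mode column is dropped, and ratios is a hypothetical helper name):

```sh
#!/bin/sh
# ratios: reads "dataset;native;CHOLMOD;CSparse" lines on standard input and
# prints each library's time divided by the native solver's time.
# Values above 1 mean the native solver was faster on that dataset.
ratios() {
    LC_ALL=C awk -F';' '{ printf "%s: CHOLMOD %.2fx, CSparse %.2fx\n", $1, $3 / $2, $4 / $2 }'
}

ratios <<'EOF'
Manhattan-Olson;2.269394;3.67878;2.686455
10K;18.615160;16.469152;26.732065
100K;392.682670;330.219778;735.995354
Intel;0.533424;0.649770;0.573284
Killian court;0.524023;0.733992;0.554213
City 10k;17.253814;14.28070;26.266288
City Trees 10k;4.865824;7.226268;4.756619
Victoria Park;33.778355;52.497989;33.900039
Parking Garage;8.369959;10.555369;11.680632
Sphere;36.463761;33.839142;106.024728
EOF
```

For the Sphere row, for instance, this gives about 0.93 for CHOLMOD and 2.91 for CSparse.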

Note that the Raspberry has no NEON instructions, so our solver is at a disadvantage compared to CHOLMOD or CSparse (since it was designed to make use of multimedia instructions). Still, it achieves very good results, especially on real datasets. The difference is mostly because the artificial datasets are designed to contain large supernodes, and our solver currently performs simplicial factorization, similar to CSparse. CHOLMOD performs supernodal factorization, which gains some speedup.

Too slow? Unfortunately, that's all the Raspberry's got. Can't we use its GPU? No. It only supports OpenGL (no CUDA, no OpenCL), restricting computation to single-precision floating point (at best), which will just not cut it: the code would most likely become numerically unstable.

