Re: [simspark-devel] SimSpark boosting project


Hey,

On Sat, Feb 19, 2011 at 7:34 PM, Hedayat Vatankhah <hed...@gm...> wrote:

>  Hi Sander!
> (my reply with more time to write! :P)
>

Great, thanks :)

>
>
> On 11/02/18 11:53, Sander van Dijk wrote:
>
> Hello MC,
>
> As you may know, the RC federation put out a call for proposals for
> projects, among others to work on the competition infrastructure. I sent in
> a proposal together with Ubbo Visser, which got accepted. So now I am with
> Ubbo to work on simspark for 2 months, and I would like to keep you up to
> date with what we're doing.
>
> Thanks for keeping us informed, and very happy to hear that it is going to
> happen in these 2 months. :)
>
>
>
> The main aim of our project is to make the simulator usable for bulk
> training. One part of that means debugging the simulator and making it faster
> and more stable, the other part is to make some external tools. On the
> second point we are still working out the details, but on the first part I
> did some work:
>
> * Did a lot of profiling, which among other things showed that the server
> spent more than 10% of the time on dynamic casting alone. This was mostly
> because of continuous searches for nodes in the scene tree. I have put in
> some caching here to alleviate it, reducing the time spent casting to 1%.
> The time freed up has now gone to ODE. I still have to create some
> performance tests to see if this made stuff faster.
>
> That's great. It'd also be nice if you specified which tools you use for
> profiling, for the record. IIRC, previously we had some inconsistent
> profiling results depending on the tools we used, so it might be helpful to
> know the tools besides the results. Also, it might be helpful if you provide
> the complete results about the most time-consuming parts in more detail.
> Maybe there are people who'll work on some other time-consuming parts. :)
>

Yes, good point. I will make sure to record all test details. For now: I am
mostly using valgrind (with the callgrind and helgrind tools in particular). I
first tried gprof, but with that everything was very unstable; maybe the
current fixes have helped with that.
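
To illustrate the kind of caching mentioned above: a minimal C++ sketch,
not the actual simspark code. All names here (Node, BodyNode, FindFirst,
CachedLookup, BuildChain) are hypothetical. The idea is that the tree
search with its dynamic_cast per visited node runs once, and later calls
reuse the stored pointer until someone invalidates it.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical scene-tree node; polymorphic so dynamic_cast works.
struct Node {
    virtual ~Node() = default;
    std::vector<std::unique_ptr<Node>> children;
};

// Hypothetical node type that we want to look up in the tree.
struct BodyNode : Node {};

// Depth-first search that performs one dynamic_cast per visited node.
// castCount is incremented for every cast so the cost can be observed.
template <class T>
T* FindFirst(Node& root, int& castCount) {
    ++castCount;
    if (T* hit = dynamic_cast<T*>(&root)) return hit;
    for (auto& child : root.children)
        if (T* hit = FindFirst<T>(*child, castCount)) return hit;
    return nullptr;
}

// Caches the result of the search; the tree is only walked again
// after Invalidate() has been called (e.g. when the tree changes).
template <class T>
class CachedLookup {
public:
    explicit CachedLookup(Node& root) : mRoot(root) {}

    T* Get() {
        if (!mValid) {
            mCached = FindFirst<T>(mRoot, mCastCount);
            mValid = true;
        }
        return mCached;
    }

    void Invalidate() { mValid = false; }
    int CastCount() const { return mCastCount; }

private:
    Node& mRoot;
    T* mCached = nullptr;
    bool mValid = false;
    int mCastCount = 0;
};

// Builds root -> `depth` plain nodes in a chain -> one BodyNode leaf.
std::unique_ptr<Node> BuildChain(int depth) {
    assert(depth >= 0);
    std::unique_ptr<Node> root(new Node);
    Node* tail = root.get();
    for (int i = 0; i < depth; ++i) {
        tail->children.push_back(std::unique_ptr<Node>(new Node));
        tail = tail->children.back().get();
    }
    tail->children.push_back(std::unique_ptr<Node>(new BodyNode));
    return root;
}
```

The trade-off in a sketch like this is that the cache has to be
invalidated whenever the scene tree changes, otherwise a stale pointer
is returned.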

> * Multi-threading mode is fixed (but see below). Although at first I was
> doubtful whether the current way it is done would help, it should,
> because the second and third most costly things, gathering perception
> data (20%) and gathering monitor data (8%), can now be done in parallel.
> However, while running a 6vs6 game there is no really noticeable speed-up.
> But again, I still have to do proper performance tests to see what it does.
>
> Thanks for the fix.
> We've not conducted any benchmarks to measure the difference between
> multi-threading enabled and disabled, but our experience at IranOpen 2010
> and apparently at GermanOpen 2010 has shown that multi-threaded mode is
> faster in practice.
> Also, Andreas Seekircher has reported his experience with multi-threaded
> mode, which also confirms that it is faster, but he has also faced a
> problem which deserves some attention:
>
> However there was a strange behavior, that the simulation was running quite
> fast on my laptop with up to 9 agents and the simulator was using more than
> one core. When I started the 10th agent it was getting much slower and it
> seemed that the simulator was then using only one core (it was then again
> the same speed like without multi-threading). This happened with 4 cores. On
> a dual core system already the 5th agent slowed down the simulation... Is
> this a known issue?
>
> I guess that in this situation, ODE is the main bottleneck. But that's just
> a guess.
>

Yes, Andreas notified me of that, too. This happens when agents and server
are run on the same machine with multiple cores. I have found that at some
point, the system's scheduler assigns an agent to the same core as the
server (even though in practice there is still room on another core), so the
server can't run at full speed. With taskset(1) it is possible to explicitly
set a process's CPU affinity, and by starting the agents with e.g.
'taskset 2 ./start.sh localhost' the server is able to take up 100% again.
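
For reference, taskset reads its mask argument as a hexadecimal bitmask,
so the mask 2 above has only bit 1 set and selects the second CPU (CPU 1).
The same pinning can also be done from inside a program on Linux with
sched_setaffinity(2). A minimal sketch, assuming Linux/glibc; the helper
names FirstAllowedCpu, AllowedCpuCount and PinToCpu are made up for this
example and are not part of simspark:

```cpp
#include <cassert>
#include <sched.h>

// Returns the index of the first CPU the caller may run on, or -1 on error.
int FirstAllowedCpu() {
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &allowed)) return cpu;
    return -1;
}

// Returns how many CPUs the caller may currently run on, or -1 on error.
int AllowedCpuCount() {
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return -1;
    return CPU_COUNT(&allowed);
}

// Restricts the calling thread (pid 0 means "the caller") to one CPU.
// Returns true on success.
bool PinToCpu(int cpu) {
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    return sched_setaffinity(0, sizeof(mask), &mask) == 0;
}
```

Calling PinToCpu(1) early in an agent would correspond to the
'taskset 2' invocation above, provided CPU 1 is available to the process.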

> However, the biggest opportunity to optimise is ODE, which now eats up
> 67-70% of all computation time. There was a project at CMU in 2007 to
> parallelize ODE [1], where they made the collision detection parallel.
> Profiling shows however that this will not help: collision detection takes
> up 0.45% in rcssserver3d. What is expensive for us, is stepping the physics.
> Luckily, ODE already splits this work into different parts, updating
> 'islands' separately, where in our case each island is one agent. I am now
> working on parallelising this, and if that works we can in theory cut up the
> 67% CPU time into 12/18 parts (6vs6/9vs9) that can be run in parallel,
> hopefully making having 8 cores actually useful.
>
> Great. But it'd probably be nice if we parallelized the collision detection
> too. In particular, its computation time will increase considerably when two
> or more robots collide (fortunately the new referee doesn't allow many robots
> to collide at the same place, but with more players it is more likely that
> we'll have collisions in different parts of the field).
>

From what I can tell so far it seems that the main part of the speed
reduction due to collisions is not caused by the collision detection, but by
the fact that it adds many new constraints to the LCP problem that ODE
solves to step the physics. However, I still have to make a team of robots
that just run into each other to be able to say that with certainty. And you
are right, it would be nice in any case ;)
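
To make the island idea concrete: a minimal C++ sketch of stepping
independent islands in parallel. This does not use the ODE API; Island,
StepIsland and StepAllParallel are hypothetical stand-ins, and the sketch
assumes the islands share no state during a step. That assumption breaks
exactly in the collision case discussed above, where contacts couple the
bodies of two agents.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for one island (one agent): a 1-D point mass.
struct Island {
    double position;
    double velocity;
};

// Stand-in for the expensive per-island physics step
// (semi-implicit Euler under constant gravity).
void StepIsland(Island& island, double dt) {
    island.velocity += -9.81 * dt;
    island.position += island.velocity * dt;
}

// Steps every island on its own thread and waits for all of them.
// Safe only because each thread touches exactly one island.
void StepAllParallel(std::vector<Island>& islands, double dt) {
    assert(dt > 0.0);
    std::vector<std::thread> workers;
    workers.reserve(islands.size());
    for (std::size_t i = 0; i < islands.size(); ++i)
        workers.emplace_back([&islands, i, dt] { StepIsland(islands[i], dt); });
    for (auto& worker : workers) worker.join();
}
```

Spawning one thread per island per step is only for illustration; with a
0.02 step a fixed pool of worker threads would likely be needed to keep
the thread start-up cost from eating the gain.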

> So much for what I am doing. Now, I would also like something from you guys
> ;-) First of all, give the new stuff I committed a good test. Behaviour of
> the simulator should still be the same, but it could be that I missed
> something and that timing of messages is slightly different, breaking
> agents. Also, give the multi threaded mode a good test, see if you can make
> it crash. And, finally, I will be working full time on the simulator for 1
> 1/2 months more, if you think there is anything that I may be able to
> squeeze in there, do let me know!
>
> Certainly! :)
> I noticed something in your recent commit: you've removed the ugly
> busy-waiting loop in SimControlThread, but wouldn't that result in a faster
> simulation when ODE doesn't have much work to do? The loop was there to make
> sure that a cycle lasts no less than 0.02 s (mSimStep). If I'm not mistaken,
> it is now possible for a cycle to finish too soon.
>

I think you refer to:

            if (isInputControl)
            {
                while (int(mSumDeltaTime*100) < int(mSimStep*100))
                    controlNode->StartCycle(); // advance the time
            }

? The only use I saw of that was to keep updating the InputControl to get
messages from the monitor while the physics are updated. The time check is
there to stop doing this when the physics are done. Without this check, the
InputControl (and the other controls, AgentControl and MonitorControl) wait
for the physics to be done anyway at the next barrier, so the cycle can't be
finished too soon. And this loop caused the most problems with
multi-threading, because it allowed the scene graph to be changed while the
physics were still running on it.
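
The barrier argument can be shown with a small C++ sketch. This is not
the simspark implementation; CycleBarrier and RunCycles are hypothetical
names. One thread stands in for the physics and the others for the
control nodes. Each control thread waits at the barrier once per cycle,
so it can never get ahead of the physics, with or without a time check.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical reusable barrier: Wait() blocks until `count` threads arrived.
class CycleBarrier {
public:
    explicit CycleBarrier(std::size_t count)
        : mCount(count), mWaiting(0), mGeneration(0) {
        assert(count > 0);
    }

    void Wait() {
        std::unique_lock<std::mutex> lock(mMutex);
        const std::size_t generation = mGeneration;
        if (++mWaiting == mCount) {
            // Last arrival: release everyone and start a new generation.
            mWaiting = 0;
            ++mGeneration;
            mCond.notify_all();
        } else {
            mCond.wait(lock, [&] { return generation != mGeneration; });
        }
    }

private:
    std::mutex mMutex;
    std::condition_variable mCond;
    std::size_t mCount, mWaiting, mGeneration;
};

// Runs `cycles` cycles with one physics thread and `controls` control
// threads. Returns true if no control thread ever passed the barrier of
// cycle c before the physics of cycle c had finished.
bool RunCycles(int cycles, int controls) {
    assert(cycles > 0 && controls > 0);
    CycleBarrier barrier(static_cast<std::size_t>(controls) + 1);
    std::atomic<int> physicsDone(0);
    std::atomic<bool> inOrder(true);

    std::thread physics([&] {
        for (int c = 0; c < cycles; ++c) {
            physicsDone.store(c + 1);  // stand-in for stepping the physics
            barrier.Wait();
        }
    });

    std::vector<std::thread> controlThreads;
    for (int i = 0; i < controls; ++i)
        controlThreads.emplace_back([&] {
            for (int c = 0; c < cycles; ++c) {
                barrier.Wait();  // blocks until the physics thread arrives
                if (physicsDone.load() < c + 1) inOrder.store(false);
            }
        });

    physics.join();
    for (auto& thread : controlThreads) thread.join();
    return inOrder.load();
}
```

Note that the barrier only orders the threads relative to each other; it
says nothing about wall-clock time per cycle.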

> There are some collaboration opportunities between your work and what I'm
> planning to do, but I'll talk about them in a separate email soon.
>

Looking forward to it :)

>
> Thanks,
> Hedayat
>
>
> Cheers,
> Sander
>
> [1] http://www.cs.cmu.edu/~mpa/ode/
>
> --
> Adaptive Systems Research Group
> Department of Computer Science
> University of Hertfordshire
> United Kingdom
>
>

-- 
Adaptive Systems Research Group
Department of Computer Science
University of Hertfordshire
United Kingdom