From: James T. <zak...@ma...> - 2016-01-28 19:14:57
> On 26 Jan 2016, at 16:39, Stuart Buchanan <stu...@gm...> wrote:
>
> Yes. HLA isn't really a silver bullet, though I expect that as we
> become more familiar with it, new use-cases will become apparent.
>
>> Would there be any interest in optional threaded subsystems? If HLA
>> is not viable for this, then is the threading option on the table?
>> Can HLA eliminate all the CPU bottlenecks, or should there be a target
>> of using threads to shift most things off the main thread, combined
>> with HLA to do the computational heavy lifting?
>
> Taking a step back here, I think the main issue that we're trying to address by
> looking at multi-threading and HLA is to get nice, consistent, and ideally
> faster frame-rates.
>
> HLA directly addresses part of this if we split off the rendering from
> everything else. So the viewer will run at (say) 60fps irrespective of what
> the FDM etc. is running at.
>
> What is unclear to me right now is how this would impact the rest of
> the simulator.
>
> For example, it may be the case that we are sufficiently rendering bound (GPU,
> CPU, bus, memory) at present that removing all the rendering from the main
> core makes it simply not worth trying to do anything else - issues such as Nasal
> GC become unnoticeable and the core simulator is running at > 60fps
> all the time.
>
> I would be tempted to wait until we get a bit further along with HLA
> development before investing a lot of energy elsewhere with multi-threading.
> Of course, that has a dependency on me.

Stuart, thanks for posting this, it helps a lot. I’d like to explain how my own line of thought fits into this picture.
Again emphasising it’s /my/ thoughts, not official:

1 - Before HLA was ever considered, I thought for a long time about some kind of locking / TLS-based solution to have a property-tree (or shadow) per thread, with the assumption that very fine-grained threading like Edward is proposing would be a good idea.

2 - As I’ve grown more familiar with the code performance, I’ve come to the conclusion that rewriting to be aggressively threaded is not worth the effort for the current codebase. As Stuart said, the only thing that blocks CPU right now is graphics; if we could be sure that rendering (OSG + our pieces of drawing) were in a separate thread from ‘everything else’, we’d use two CPU cores at 100% (ish) and get better framerates. Right now we have no subsystem which utilises enough CPU to be worth threading more intensively than that, with the possible exception of OpenAL and the sound manager. [1] [2]

3 - OSG threading in theory does the above *already*, but the way we integrate that with the property system causes race conditions and crashes, and dynamic scene elements reduce the amount of parallelisation significantly.

4 - It would be very nice for lots of reasons to be able to separate the simulator and rendering into discrete processes (optionally!). My personal aim there is to be able to hack on an alternative renderer using more modern graphics techniques, but without jeopardising, or indeed touching, the existing rendering code. Of course being able to run multiple viewers is also very desirable.

5 - As Edward and others have said, the /possibility/ to run a subsystem in its own thread/process does open up research and add-on areas, such as using an entire CPU core for a weather, traffic or some other intensive simulation, whether in C++ or Python, or any other language.

6 - HLA can cover some of these needs, but we have to define the API via the FOM [3].
Especially for separating simulation and rendering in the current codebase, we have the problem that the viewers do need a shadow copy of the property tree to make animations, shader effects and the like function, and most people seem to agree synchronising a property tree over HLA is a bad idea.

Given all the above, my preference is to investigate some of the things I already proposed, because doing so will definitely address points 2, 3 & 4 in a way that doesn’t break compatibility. We get point 5 as a bonus, and we don’t do any harm to 6 (using HLA). Personally I would prefer point 5 was /also/ handled by HLA, but it would be silly to make a solution to the first points and not architect it to allow future subsystems to run on threads. And of course we could move Nasal to its own thread to prove the concept.

Another way of looking at this is that the critical thing is actually point 3 - fixing existing crashes/races when we use OSG threading aggressively - and I think the easiest way to do that is to decouple the OSG side from the simulation using a shadow property tree. The fact it would give us points 2 & 4 at the same time is of course very desirable.

Hope that all makes sense.

Kind regards,
James

[1] - we do have subsystems which block for unacceptable amounts of time, e.g. Nasal GC, but that isn’t a problem that should be solved by aggressive use of threading - threading would just mask the problem somewhat.

[2] - the OSG pager thread makes this something of a lie of course, but the pager doesn’t usually saturate a CPU for long periods of time.

[3] - the HLA document for a federation which says what data is passed around by what entities in the simulation.
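
The shadow property tree idea in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not FlightGear code: `ShadowPropertyTree`, `simulate` and `demo` are hypothetical names, the tree is flattened to a dictionary keyed by path, and the property path is just an example. The simulation thread queues changes, and the render thread pulls them across once per frame so that it only ever reads its own copy.

```python
import threading


class ShadowPropertyTree:
    """Hypothetical shadow copy of a flat property map (not FlightGear's API).

    The simulation thread calls set(); the render thread calls sync() once
    per frame and then get() as often as it likes.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}   # changes queued by the simulation thread
        self._shadow = {}    # stable copy, touched only by the render thread

    def set(self, path, value):
        # Simulation side: record the change; the lock is held only briefly.
        with self._lock:
            self._pending[path] = value

    def sync(self):
        # Render side: take all queued changes in one step, then apply them
        # outside the lock so the simulation thread is never blocked for long.
        with self._lock:
            changes, self._pending = self._pending, {}
        self._shadow.update(changes)
        return len(changes)

    def get(self, path, default=None):
        # Render side: reads only ever touch the render thread's own copy.
        return self._shadow.get(path, default)


def simulate(tree, steps):
    # Stand-in for the simulation loop; the property path is illustrative.
    for step in range(steps):
        tree.set("/position/altitude-ft", 1000 + step)


def demo():
    tree = ShadowPropertyTree()
    worker = threading.Thread(target=simulate, args=(tree, 100))
    worker.start()
    worker.join()  # wait so the values read below are deterministic
    before = tree.get("/position/altitude-ft")
    tree.sync()
    after = tree.get("/position/altitude-ft")
    return before, after
```

Calling `demo()` shows that the render side sees nothing until it calls `sync()`, after which it sees the latest value written by the simulation thread. A real implementation would also have to deal with hierarchy, listeners and writes flowing back from the viewer, none of which this sketch attempts.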