From: Edward W. <ew...@ta...> - 2006-10-07 02:28:56
Hi Nick,

>> So you're trying to make Valgrind go faster, or dynamic binary translation
>> systems in general?
> Dynamic binary systems in general.

I work on the NSF TeraGrid, a distributed high performance computing infrastructure in the US, and one of our biggest challenges is supporting the plethora of ISAs that we host in our distributed infrastructure. So a research challenge is to enable _some_ HPC applications to migrate between systems connected across our 30 Gbps network backbone across the country. There are lots of people working on using VMMs like Xen to enable this, but I think for many situations, user-space dynamic binary translators like Valgrind are sufficient, without the overhead of setting up entire OS VMs as containers for a running HPC job across sites.

And since I am in the area of HPC, performance (plus reliability) is a very important consideration. Fortunately, many of our systems are now multi-core, e.g. here in Texas we support a system with 1300 Woodcrest compute nodes, each of which provides the application with dual CPUs, each dual-core. Furthermore, the prospect of CPUs with up to 80 cores has already been announced by Intel, so things will be getting better (or worse, depending on how you look at it). So if we can leverage this trend in ever-increasing processor cores and provide a certain level of location transparency through processor emulation, this will be a big win for computational science in general.

I know Valgrind is more than a processor emulator, unlike QEMU and QuickTransit. But unlike QuickTransit, Valgrind is open-source, and unlike QEMU, Valgrind is more efficient.

It's not absolutely critical that I pick the fastest platform for my research, but being in HPC, it's a lot easier to convince my colleagues if real performance gains can be seen fairly quickly.

Sorry if I did not convey all this information in my previous email. Email is sometimes not the best vehicle to convey complex motivation.

>> Sure. I'm in an area where I hear that comment quite often: "this
>> application can't be parallelized".
> I didn't say that.

Sorry. I didn't mean to misinterpret your statement.

- Ed