From: Julian S. <js...@ac...> - 2003-11-20 23:18:57
> Ideally, we want a cheap way of detecting s-m-c that can be on all the
> time (then we could get rid of the INVALIDATE_TRANSLATIONS macro). I have
> an extremely vague idea about tracking things at the page level, and
> possibly throwing out all translations that come from code within a page
> in certain circumstances. Or something.

Hmm. Well, there are some options, but none are good.

One is (or might be, depending on Jeremy's views) to mess with page
level permissions, so as to remove write permission for any page from
which we've taken a translation. Then V takes a page fault whenever
writes to that page happen; we catch the fault, note the page as dirty
and throw away translations from it as soon as possible. So we freeload
on the host's memory protection hardware and have zero run-time overhead.

Another is the pure-software approach in very old valgrinds. I think
what I had was an array of char indexed by addr>>12 or some such (that
would be 2^20 bytes on a 32-bit machine); and there was some kind of
check or something at each write. Not cheap, but you could drastically
reduce the cost by only testing some writes -- for example, writes from
the FPU are most unlikely to generate code, and writes happening as part
of a read-op-write operation x86 insn are also unlikely to. Also I think
I excluded writes of sizes > 1 since it doesn't make much sense to write
x86 code on a word-by-word basis, since it's really a byte stream.

Personally I prefer the portability/system independence of the pure-sw
approach. IIRC the optimised version didn't give much overhead, but I
can't really remember any more, and I don't think I have a copy of the
code base that still has that stuff in it.

I wonder if the frequent %esp changes will give a performance problem
for stack writes. Let's see: if code is written into the stack and then
executed, the first time, we make a correct translation, and we mark
that stack page as dirty.
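
A minimal sketch of the pure-software scheme described above, not the
code from the old valgrinds (which is not shown in this thread). It
assumes 4KB pages and a 32-bit guest address space. All names here
(smc_page_has_code, smc_note_translation, smc_check_write, the
smc_write_kind enum) are hypothetical, and the discard counter merely
stands in for actually throwing translations away.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SMC_PAGE_SHIFT 12                              /* addr>>12: 4KB pages */
#define SMC_NUM_PAGES  (1u << (32 - SMC_PAGE_SHIFT))   /* 2^20 entries */

/* Hypothetical classification of a guest write, used only for filtering. */
enum smc_write_kind { SMC_WRITE_PLAIN, SMC_WRITE_FPU, SMC_WRITE_RMW };

/* One byte per guest page: nonzero if a translation was taken from it. */
static unsigned char smc_page_has_code[SMC_NUM_PAGES];

/* Placeholder for the real work of discarding translations. */
static unsigned long smc_discards;

/* Call when a translation is made from code at guest address addr. */
static void smc_note_translation(uint32_t addr)
{
    smc_page_has_code[addr >> SMC_PAGE_SHIFT] = 1;
}

/* Call on a guest write. Returns 1 if translations were discarded.
   The filters are the heuristics from the text: FPU writes,
   read-op-write insns and writes wider than one byte go unchecked,
   so a program that does write code that way would be missed. */
static int smc_check_write(uint32_t addr, size_t size,
                           enum smc_write_kind kind)
{
    assert(size > 0);
    if (kind != SMC_WRITE_PLAIN || size > 1)
        return 0;

    uint32_t page = addr >> SMC_PAGE_SHIFT;
    if (!smc_page_has_code[page])
        return 0;                 /* common case: one load and a test */

    smc_page_has_code[page] = 0;  /* nothing left to discard here */
    smc_discards++;
    return 1;
}
```

In this sketch only the first unfiltered write after a translation pays
for a discard; later writes to the same page see a clear entry until
another translation is taken from it.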
The next write to that stack page gets the overhead of discarding
translations from it. So it looks like, apart from the cost of checking
every write (subject to the above filtering criteria), the cost of
supporting s-m-c is proportional to the number of translation discards
to be done, and the stack doesn't cause special problems.

I think in the long term we'd need to be able to support s-m-c,
preferably in a self-contained way. On uncooperative platforms like x86
we could use the pure-software approach. Many RISC machines have a flush
insn or some such which you have to do before running code you just
made; that makes our life really easy, since we don't then have to check
any writes.

If it should ever come to pass that the vcpu stuff gets completely
redesigned, we could translate multiple bbs at once and find where the
loops are. It may then be possible to identify writes to memory
locations in which we can see that a subsequent write happens to the
same location, before we lose track of the control flow. In that case
we know that the first write cannot be generating code (since it's
overwritten later, and we know all the places where the program will
execute in between the two writes) and so the s-m-c check for the first
write is redundant.

So, the use of a cleverer translation/analysis engine, originally
intended to do better liveness analysis, may also help here.

J