|
From: Nicholas N. <nj...@cs...> - 2005-05-19 13:51:28
|
On Thu, 19 May 2005, Julian Seward wrote:

>> The good thing is that the checksum doesn't need to be very good, since
>> false collisions will only cause extra translations to be executed.
>> Something incredibly simple like adding up each byte of code (modulo 256)
>> might be good enough.
>
> Are you sure? If the checksum concludes incorrectly that the new code
> is the same as the old code, then we are hosed :-)

Hmm, yes, ignore me.

N
|
|
From: Thomas S. <ste...@gm...> - 2005-05-19 15:21:51
|
On 5/19/05, Julian Seward <js...@ac...> wrote:
> Are you sure? If the checksum concludes incorrectly that the new code
> is the same as the old code, then we are hosed :-) [i think]

Yes, that seems to be the case.

> So in fact I'd really prefer a 64-bit checksum if possible.

Actually, I think using a checksum at all is a bad idea. In the best
case you leave a small chance of "bad things happening"; with 64-bit
the chance is just smaller than with a 32-bit checksum. But I think
that Valgrind should be always correct, not just most of the time :-)

And there is the question of exploits. If there is a deterministic
case where Valgrind makes a mistake, someone might decide to write a
webpage that uses the exploit to take over your computer. OK, that is
highly unlikely, but not impossible.

Since the trampolines are usually small, what about just doing a
memcmp() with the original code?

Thomas
|
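The memcmp() idea can be sketched as follows. This is only an illustration under stated assumptions: `CodeSnapshot` and the `snapshot_*` functions are hypothetical names invented here, not part of Valgrind, and the sketch assumes the original code bytes are small enough that keeping a private copy per translation is acceptable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record kept alongside a translation (not a Valgrind type). */
typedef struct {
    const unsigned char *guest_addr; /* where the original code lives      */
    size_t               len;        /* number of original bytes covered   */
    unsigned char       *snapshot;   /* private copy taken at translation  */
} CodeSnapshot;

/* Copy the original bytes at translation time.
   Returns 0 on success, -1 if the copy could not be allocated. */
static int snapshot_take(CodeSnapshot *s, const unsigned char *code, size_t len)
{
    assert(s != NULL && code != NULL && len > 0);
    s->snapshot = malloc(len);
    if (s->snapshot == NULL)
        return -1;
    memcpy(s->snapshot, code, len);
    s->guest_addr = code;
    s->len = len;
    return 0;
}

/* Nonzero iff the code is still byte-for-byte what was translated. */
static int snapshot_still_valid(const CodeSnapshot *s)
{
    assert(s != NULL && s->snapshot != NULL);
    return memcmp(s->guest_addr, s->snapshot, s->len) == 0;
}

/* Release the private copy. */
static void snapshot_free(CodeSnapshot *s)
{
    assert(s != NULL);
    free(s->snapshot);
    s->snapshot = NULL;
}
```

Unlike a checksum, this comparison cannot report a false match; the cost is one extra copy of the original bytes per translation, which is why it suits small fragments best.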
|
From: Julian S. <js...@ac...> - 2005-05-19 16:21:25
|
> Actually, I think using a checksum at all is a bad idea. In the best
> case you leave a small chance of "bad things happening"; with 64-bit
> the chance is just smaller than with a 32-bit checksum. But I think
> that Valgrind should be always correct, not just most of the time :-)

With a 32-bit checksum your screwup probability is 2.33e-10. With a
64-bit checksum the probability is 5.42e-20. The probability of a
meteorite striking my house each day is almost certainly larger than
5.42e-20. I'm not going to worry about a 64-bit checksum being wrong.

> And there is the question of exploits. If there is a deterministic
> case where Valgrind makes a mistake, someone might decide to write a
> webpage that uses the exploit to take over your computer. OK, that is
> highly unlikely, but not impossible.

Huh? Valgrind is "just another" userspace process; you don't get
elevated privileges from running your program on it. And so there
is no extra risk.

> Since the trampolines are usually small, what about just doing a
> memcmp() with the original code?

It's possible, but I would prefer a more general mechanism that did
not assume the code fragments are small.

J
|
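The two probabilities quoted above are 2^-32 and 2^-64. A small sketch of that arithmetic follows; `false_match_probability` is a hypothetical helper named here for illustration, and it assumes the checksum behaves like a uniformly random function of the code bytes. That assumption does not hold for deliberately crafted input, nor for a weak scheme such as a byte sum modulo 256, where the chance is 1/256 at best.

```c
#include <assert.h>

/* Probability that modified code happens to produce the same checksum,
   assuming an ideal (uniformly distributed) checksum of the given width.
   The result is 2^-bits, computed by repeated halving. */
static double false_match_probability(unsigned bits)
{
    double p = 1.0;

    assert(bits <= 1000);   /* keep the result a normal double */
    while (bits-- > 0)
        p *= 0.5;           /* halving is exact in binary floating point */
    return p;
}
```

Here `false_match_probability(32)` is roughly 2.33e-10 and `false_match_probability(64)` roughly 5.42e-20, which agrees with the figures in the message.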
|
From: Nicholas N. <nj...@cs...> - 2005-05-19 13:52:38
|
On Thu, 19 May 2005, Duncan Sands wrote:

>>> How do you know how big the code is?
>>
>> what code? [unclear what you're referring to]
>
> The code you're taking the CRC of. For example, in the
> case of a trampoline, you have a pointer to a 10 byte long
> instruction sequence on the stack. You want to calculate
> a CRC of these 10 bytes and store them somewhere. How do
> you know it is 10 bytes long? Or were you planning to do
> a CRC of some fixed length that seems big enough to cover
> most cases?

The first time Valgrind translates the sequence, it discovers how long it
is. Then, each time that sequence is due to run, Valgrind can check that
many bytes. Does that make sense?

N
|
|
From: Duncan S. <bal...@fr...> - 2005-05-19 14:05:03
|
> > The code you're taking the CRC of. For example, in the
> > case of a trampoline, you have a pointer to a 10 byte long
> > instruction sequence on the stack. You want to calculate
> > a CRC of these 10 bytes and store them somewhere. How do
> > you know it is 10 bytes long? Or were you planning to do
> > a CRC of some fixed length that seems big enough to cover
> > most cases?
>
> The first time Valgrind translates the sequence, it discovers how long it
> is. Then, each time that sequence is due to run, Valgrind can check that
> many bytes. Does that make sense?

Not really :) If the code contains conditional instructions then you
may not execute all of it; for example, there may be a bunch of stuff at
the end that wasn't executed the first time. Then you will think the
code is shorter than it is. If some instructions in the block at the
end are modified then the CRC won't notice, so things won't be
refreshed. Due to different variable/register values the code may
branch differently when things are re-run, perhaps causing you to
execute the unrefreshed code at the end.

I know nothing about how Valgrind works, so maybe I need to be beaten
with a clue stick, but... isn't this a problem?

D.
|
|
From: Julian S. <js...@ac...> - 2005-05-19 14:21:07
|
> isn't this a problem?

No. It's simple, at least in principle: V makes a translation of a
piece of code -- and it knows exactly which code addresses form part
of the translation's original. When we come to run the translation
we merely have to ensure that all the insn bytes from which the
translation was derived are the same as they were to start with.

There's no ambiguity in the question "is the translation still valid
or not" -- it's easy to answer.

J
|
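A minimal sketch of the self-check described above, under stated assumptions: `Translation`, `CodeRange`, `MAX_RANGES` and the functions below are hypothetical names invented for illustration, not Valgrind's data structures, and the 64-bit FNV-1a hash is a stand-in chosen here because it needs no lookup table, not the checksum the thread settled on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES 3   /* hypothetical limit on address ranges per translation */

/* One contiguous run of original instruction bytes. */
typedef struct {
    const unsigned char *base;
    size_t               len;
} CodeRange;

/* Hypothetical per-translation record: which bytes it was derived from,
   and their hash at translation time. */
typedef struct {
    CodeRange ranges[MAX_RANGES];
    size_t    n_ranges;
    uint64_t  expected;
} Translation;

/* 64-bit FNV-1a over all bytes of all ranges, in order. */
static uint64_t hash_ranges(const CodeRange *r, size_t n)
{
    uint64_t h = UINT64_C(0xcbf29ce484222325);   /* FNV offset basis */
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < r[i].len; j++) {
            h ^= r[i].base[j];
            h *= UINT64_C(0x100000001b3);        /* FNV prime */
        }
    }
    return h;
}

/* Called once, when the translation is made. */
static void translation_record(Translation *t, const CodeRange *r, size_t n)
{
    assert(t != NULL && r != NULL && n >= 1 && n <= MAX_RANGES);
    for (size_t i = 0; i < n; i++)
        t->ranges[i] = r[i];
    t->n_ranges = n;
    t->expected = hash_ranges(r, n);
}

/* Called before each run: nonzero iff the recorded bytes hash as before. */
static int translation_still_valid(const Translation *t)
{
    assert(t != NULL);
    return hash_ranges(t->ranges, t->n_ranges) == t->expected;
}
```

Only the bytes a translation was derived from are checked. Bytes that were never translated can change freely, because no cached translation depends on them.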
|
From: Duncan S. <bal...@fr...> - 2005-05-19 14:13:51
|
> The first time Valgrind translates the sequence, it discovers how long it
> is. Then, each time that sequence is due to run, Valgrind can check that
> many bytes. Does that make sense?

A big stick shaped sonic clue wave just blasted my brain. Let me guess:
a "translation" consists of cached data about a bunch of instructions
that have been executed, or considered for execution. The problem is
that when instructions are modified, the translation gets out of date.
However this is not a problem if instructions were never "translated".
Thus there is no problem. Hang on, didn't I already know all this?
Seems I need more coffee...

Sorry for the noise,

Duncan.
|
>>>> How do you know how big the code is?

> The first time Valgrind translates the sequence, it discovers how long
> it is. Then, each time that sequence is due to run, Valgrind can check
> that many bytes. Does that make sense?

Does that assume that the sequence is exactly one basic block?
(True for the current usual gcc 2-instruction x86 case, but ...)

--
|
|
From: Nicholas N. <nj...@cs...> - 2005-05-19 14:31:47
|
On Thu, 19 May 2005, John Reiser wrote:

>> The first time Valgrind translates the sequence, it discovers how long it
>> is. Then, each time that sequence is due to run, Valgrind can check that
>> many bytes. Does that make sense?
>
> Does that assume that the sequence is exactly one basic block?
> (True for the current usual gcc 2-instruction x86 case, but ...)

I don't think it matters. Each translation of stack code will be
considered and checked separately.

N
|
|
From: Nicholas N. <nj...@cs...> - 2005-05-19 15:37:57
|
On Thu, 19 May 2005, Thomas Steffen wrote:

> And there is the question of exploits. If there is a deterministic
> case where valgrind makes a mistake, someone might decide to write a
> webpage that uses the exploit to take over your computer. Ok, that is
> highly unlikely, but not impossible.

This has me scratching my head. I'm sure there are lots of ways in which
Valgrind is "insecure", but I don't see it as a security-sensitive
application in any way. Unless you're, say, running your webserver under
Valgrind, in which case you're crazy.

Can you give any more detail where the risk lies here?

N
|
|
From: Thomas S. <ste...@gm...> - 2005-05-19 18:28:29
|
On 5/19/05, Nicholas Nethercote <nj...@cs...> wrote:
> This has me scratching my head. I'm sure there are lots of ways in which
> Valgrind is "insecure", but I don't see it as a security-sensitive
> application in any way.

Well, probably a browser is more likely. Imagine that a trampoline is
used to check the security context. If you start the browser, it
points to the local homepage, so everything is allowed. Now you go to
a malicious website. Somehow it manages to pass the necessary
parameters to generate a trampoline with the same checksum. Valgrind
will use the previous translation, which may make the browser think
the page is in a local context. And then the website can take over
your browser.

It is contrived, but I think it is theoretically possible. I am not
sure whether we have to worry about security problems, or whether the
user should stick to "trusted content".

> Unless you're, say, running your webserver under
> Valgrind, in which case you're crazy.

Is it? Probably. But maybe you just want to find memory leaks in your
webserver?

Thomas
|
|
From: Duncan S. <dun...@ma...> - 2005-07-06 07:28:21
|
Hi Julian,

> So, what I'm thinking is to calculate a 32-bit CRC of the code
> and store it in the translation; then rerun the crc for the
> self-check. Except a CRC is expensive in terms of insns and
> cache misses (it requires a table). Mark Adler (co-author
> of gzip) had some other magic checksum scheme that gzip uses, which
> doesn't require a table and is fast. Maybe use that instead.

did you think any more about automatic invalidation of translations?
As a short term hack, just checking for the signature of a trampoline
would be good enough for me, so if you can give me some pointers to which
bit of code I should be looking at, I will try to hack it up myself.

All the best,

Duncan.
|
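The table-free checksum attributed to Mark Adler in the quoted text is presumably Adler-32, the checksum defined for the zlib format. That identification is an assumption made here; the thread does not name the scheme. A straightforward, unoptimised sketch follows, with `adler32_simple` as a hypothetical name to avoid confusion with zlib's own `adler32()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adler-32: two running sums modulo 65521 (the largest prime below 2^16).
   'a' is 1 plus the sum of all bytes; 'b' is the sum of the successive
   values of 'a'. No lookup table is needed. */
static uint32_t adler32_simple(const unsigned char *buf, size_t len)
{
    uint32_t a = 1, b = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        a = (a + buf[i]) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}
```

For the nine ASCII bytes of "Wikipedia" this returns 0x11E60398, and for empty input it returns 1. Adler-32 is known to distribute poorly over very short inputs, which may matter for fragments as small as the 10-byte trampolines discussed earlier.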
|
From: Julian S. <js...@ac...> - 2005-07-06 08:18:25
|
> did you think any more about automatic invalidation of translations?
> As a short term hack, just checking for the signature of a trampoline
> would be good enough for me, so if you can give me some pointers to which
> bit of code I should be looking at, I will try to hack it up myself.

I haven't yet solved the problem, but I did do an important piece
of background infrastructure rejiggery (vex r1247, if you receive
commit messages) which means that the rest of the fix only needs to
be implemented once rather than differently for each supported
architecture. Anyway, as a result of your mail I wrote this down on
my short term ToDo list and hopefully it won't be too difficult now.
I'll try and hack up something by the weekend.

J
|
|
From: Duncan S. <dun...@ma...> - 2005-07-06 09:32:35
|
Hi Julian,

> I haven't yet solved the problem, but I did do an important piece
> of background infrastructure rejiggery (vex r1247, if you receive
> commit messages) which means that the rest of the fix only needs to
> be implemented once rather than differently for each supported
> architecture. Anyway, as a result of your mail I wrote this down on
> my short term ToDo list and hopefully it won't be too difficult now.
> I'll try and hack up something by the weekend.

thanks a lot for working on this! I see that you invented a great new
verb "commoning" for your commit message :) How can I receive commit
messages, by the way?

Thanks again,

Duncan.
|
|
From: Julian S. <js...@ac...> - 2005-07-06 09:44:43
|
> thanks a lot for working on this! I see that you invented a great new
> verb "commoning" for your commit message :) How can I get receive commit
> messages by the way?

They are all sent to the valgrind-developers list.

"commoning up", what's wrong with that?

J
|