From: <al...@cr...> - 2008-02-13 17:30:31
|
Quoting Jesse Becker <ha...@gm...>: > Attached is a bunch of variable sanitation checking. It isn't done, > but I wanted to throw it out for comments before I go too far down > some hole and can't dig myself out. > > This is not a patch, since are only two "include_once" lines (in > index.php and graph.php) for existing files. The rest of the patch > would be to add this file. Just imagine that this file gets run > after conf.php is sourced, be before get_context.php is read. > > The idea is to take $_GET (and later $_COOKIE), and check to make sure > that their contents have valid information. Invalid information is > *discarded*, and cannot be used by the rest of the code. Thus, if > $_GET['st'] has bogus data (non-integral data, to be precise), then it > is deleted from the array. This is pretty harsh, but should make > problems obvious very quickly. > > There are two main sections to the code. The first is the large array > near the top of the file. This defines what parameters we care about, > and how they should be used. Anything not in this array, or doens't > match the datatype requested, will be discarded. Any new parameters > that are added (for example, to indicated which metric groups should > be "collapsed") should be added to this array. > > The second part is the foreach{} loop at the end. It runs through all > variables in $_GET (and later, $_COOKIE, etc), and checks if we care > about it at all (e.g. it is in the large array I just mentioned). > Keys that we want, and are valid will be kept, but everything else > will be pitched. > > As I said, it isn't done yet, although it is basically functional. > I'm looking for general comments, not specific bug reports yet. > > So, comments and suggestion welcome. I think this looks like a good approach. Centralizing all the input validation is a great idea. You said you hadn't finished with it yet, so I may be commenting on stuff that you're still working on. Please disregard if so. :) You said the only patching would be to include this script in get_context.php and graph.php. So, I take that to mean that get_context.php and graph.php would still access values in $_GET like they currently do, but these would be modified by santize.php. I think it'd be clearer to do everything in one place. Then, the only script that touches $_GET/$_COOKIE is sanitize.php, and all other scripts use variables that it makes available. That way you have only 1 place where user input is touched, and $_GET continues to be what you'd expect: raw user input. In the foreach() loop, if a $_GET value is found to be valid (the if() condition is true), put the value in a $clean array, using the same key as $_GET/$_COOKIE. All other code should use these values from the $clean array rather than the current local variables. When reading other scripts, this convention makes it clear that a variable is user input, and that it has been put through appropriate checks. If a value is invalid, maybe create the key in $clean with a sane default value, or a null if no sensible default is possible. Code that wanted to use something in $clean then wouldn't need to do lots of isset() checking. This is pretty much what get_context.php and graph.php already do. Won't the INT, FLOAT, and NUMBER checks in valid_parameter() always be true? (float)$value would always be a float. The BOOL check is allowing a 1 or 0, not an actual boolean. If I saw a value being validated as BOOL, I'd expect it to be a boolean, not a 1 or a 0. Mostly it doesn't matter since PHP is so loose with types, but the broken 'Show Hosts' bug in 3.0.6 shows that the difference does matter in some cases. error_log calls should identify where they are coming from. Alternately, you could trigger_error( "message", E_USER_WARNING ). That automatically adds information about file and line number where the error occurred. Hope that's helpful, alex |