From: Steven J. S. <sj...@Ju...> - 2002-08-07 15:04:57
On Tue, 6 Aug 2002, Alex Russell wrote:

> On Tue, 6 Aug 2002, Steven J. Sobol wrote:
>
> > This is a little function I wrote to escape any character not contained
> > in the variable ok_chars. Characters not represented in that variable are
> > escaped to the corresponding html entities. For example, a space character
> > (ASCII 32) is converted to &#32;
>
> There's no canonicalization. Use of the asc() function assumes that the
> char can be represented in 1 byte.

I acknowledge that that is an issue.

> Is that a valid assumption at all
> times? Will you ever have input from non-ascii char sets? What does the
> asc() function return in those cases? Is that output safe?

Not sure yet. I have to do some more research on the subject.

> > It's VBscript. This is an app running on IIS, and the function is used
> > to cleanse data going into a SQL Server 2000 database.
>
> I think you should just drop any chars that aren't allowed.

In practice, this code is going to be used on a script that allows the site owner to post news items. What happens if they post something like "It's time" ("it is time", so the apostrophe is appropriate) and it comes out as "Its time" (grammatically incorrect)? Not good... I'd rather keep the dangerous characters there and escape them.

> > First, do the & # and semicolon have any special meaning to SQL2K or
> > any of the other popular database engines? I'm thinking not - but I'm not
> > the expert here.
> >
> > Second, do you think it's ok, given the purpose of the function, to
> > include the % as a valid character that will not be escaped?
>
> I would be wary of "%", "!", "/", and "="
> the "/" char is used as a division operator in some SQL dialects, while
> "!" is used as negation. "=" is obviously part of the SQL BNF, and
> should be disallowed. As for "%", I dunno, but it just kinda strikes me
> as dangerous somehow.

Perhaps the fact that % is the wildcard character in ANSI SQL and most SQL dialects is what is concerning you.
And I probably should take all four of those characters out. Question is how to allow the client to make posts using those characters without opening up holes.

HTML entities can handle multi-byte characters, can't they? That's not the problem here.

-- 
Steve Sobol, CTO JustThe.net LLC, Mentor On The Lake, OH
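
For readers following the thread, here is a minimal sketch of the escaping approach described above, written in Python rather than the VBScript the message refers to. The original function and the real value of ok_chars are not shown in this thread, so the names `escape_unlisted` and `OK_CHARS` and the whitelist contents are assumptions made for illustration only.

```python
# Hypothetical whitelist: the actual ok_chars value is not given in the thread.
# "%", "!", "/" and "=" are deliberately left out, as discussed above.
OK_CHARS = (
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
)


def escape_unlisted(text, ok_chars=OK_CHARS):
    """Replace every character not in ok_chars with a decimal numeric entity."""
    out = []
    for ch in text:
        if ch in ok_chars:
            # Whitelisted characters pass through unchanged.
            out.append(ch)
        else:
            # ord() returns the full Unicode code point, so this does not
            # assume the character fits in one byte.
            out.append("&#%d;" % ord(ch))
    return "".join(out)


if __name__ == "__main__":
    print(escape_unlisted("It's time"))
    print(escape_unlisted("50% = 1/2!"))
```

With this whitelist, the apostrophe in "It's time" is kept as an entity instead of being dropped, and none of the four characters flagged above appears literally in the output.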