From: Gabriele B. <g.b...@co...> - 2004-08-26 05:59:36
On Mon, 2004-08-23 at 21:02, Neal Richter wrote:
> I'm finding that several things are weird in the htdig/HTML.cc class
>
> 1) If a page has an ill-formed comment tag like this:
> <!-- hennerik CVSweb $Revision: 1.64 0->

We could indeed think of a more flexible way to handle it ...

> 2) In the HTML::parse function
>
> 223 unsigned char *text = (unsigned char *)new char[contents->length()+1];
>
>
> This variable seems intended to store the document contents. However, both
> times it's used as the RHS of an assignment statement:
>
> 224 unsigned char *ptext = text;
> [snip]
> 380 position = text;
> 381 start = position;
> 382
> 383 while (*position)
> 384 {
>
> Note that the while statement (lines 384 to 545) is likely never entered
> since gcc seems to initialize text to zeros on Linux. The behavior could
> be platform dependent since who knows what's in that memory.

Well ... isn't the 'text' string filled through the 'ptext' pointer?

Cheers,
-Gabriele
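
To illustrate the question about 'ptext', here is a minimal sketch. It is not the actual HTML::parse code: it only assumes, as the quoted lines 223 and 224 suggest, that 'ptext' aliases 'text' and that the document is copied through it before the scan begins. The function name copy_then_scan and its std::string parameter are hypothetical, introduced only for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of htdig: copies `contents` into a freshly
// allocated buffer through a second pointer, then counts the characters
// visited when scanning from the start of that buffer.
std::size_t copy_then_scan(const std::string &contents)
{
    // Same allocation pattern as the quoted line 223. The memory is
    // uninitialized at this point, so nothing may be read from it yet.
    unsigned char *text = (unsigned char *)new char[contents.length() + 1];

    // Same pattern as the quoted line 224: ptext points at the same buffer.
    unsigned char *ptext = text;

    // Assumed fill step: every write through ptext lands in text's buffer.
    for (std::size_t i = 0; i < contents.length(); i++)
        *ptext++ = (unsigned char)contents[i];
    *ptext = '\0';  // terminate so the scan below has a defined end

    // Same pattern as the quoted lines 380 to 383: scan from the start.
    std::size_t seen = 0;
    unsigned char *position = text;
    while (*position)
    {
        seen++;
        position++;
    }

    delete[] (char *)text;
    return seen;
}
```

If the buffer is filled this way, calling copy_then_scan("abc") walks three characters in the while loop, so the loop is entered whenever the document is non-empty, regardless of what the allocator left in the memory beforehand.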