#14 JavaScript support

Rogan Dawes

It would be great to implement JavaScript support in
WebScarab (mainly the spider, I guess).

We could combine:

TagSoup (or maybe HTMLParser) to generate SAX events
DOMHandler to build a DOM from the SAX events
Rhino to execute any script elements we encounter

with some kind of pushback InputStream to allow script
elements to write into the document so that any
additions would also be parsed by the SAX handler.
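To make the pushback idea concrete, here is a rough sketch using only java.io.PushbackReader. It is not WebScarab code: the class name PushbackSketch, the parse() method and the runScript callback are all made up for illustration. The callback stands in for Rhino executing the script and collecting whatever it passed to document.write(); a toy character loop stands in for TagSoup.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.Function;

// Hypothetical sketch; none of these names exist in WebScarab.
class PushbackSketch {
    private static final String OPEN = "<script>";
    private static final String CLOSE = "</script>";

    // Returns the markup the parser ends up seeing, with each script element
    // replaced by the text that script "wrote" into the document.
    // runScript is an assumed stand-in for executing the script in Rhino.
    static String parse(String html, Function<String, String> runScript) {
        StringBuilder seen = new StringBuilder();
        // The pushback buffer must be large enough for the written text.
        try (PushbackReader in = new PushbackReader(new StringReader(html), 4096)) {
            int c;
            while ((c = in.read()) != -1) {
                seen.append((char) c);
                int tail = seen.length() - CLOSE.length();
                if (tail < 0 || seen.indexOf(CLOSE, tail) < 0) {
                    continue; // not at the end of a script element yet
                }
                int open = seen.lastIndexOf(OPEN);
                if (open < 0) {
                    continue; // stray closing tag, leave it alone
                }
                String body = seen.substring(open + OPEN.length(), tail);
                seen.setLength(open); // drop the script element itself
                // Push the script's output back so the next read() parses it.
                in.unread(runScript.apply(body).toCharArray());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        String page = "<p>a</p><script>W</script><p>b</p>";
        System.out.println(parse(page, body -> "<a href='/x'>" + body + "</a>"));
    }
}
```

Because the written text goes back through the same reader, a script that writes another script element would be picked up on the next pass through the loop.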

It seems like the best way to do this is to process
pages asynchronously, and keep a "waiting list" if the
page has a dependency on some other URL, e.g. a child
FRAME, or an included JavaScript resource.

So processing of the page would be suspended until
those pending URLs had been retrieved. This means that
we may have multiple pages being processed at one time.
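A minimal sketch of what such a waiting list might look like, assuming pages and resources are identified by plain URL strings. The class and method names (WaitingListSketch, suspend, resourceFetched) are hypothetical, and the actual fetching and threading are left out entirely.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not an existing WebScarab class.
class WaitingListSketch {
    // Suspended page URL -> URLs it is still waiting for.
    private final Map<String, Set<String>> pending = new LinkedHashMap<>();
    // Pages whose dependencies have all arrived, in completion order.
    private final List<String> ready = new ArrayList<>();

    // Suspend a page until every dependency (child FRAME, script src, ...)
    // has been retrieved. A page with no dependencies is ready at once.
    synchronized void suspend(String page, Collection<String> dependencies) {
        if (dependencies.isEmpty()) {
            ready.add(page);
        } else {
            pending.put(page, new LinkedHashSet<>(dependencies));
        }
    }

    // Called when a response arrives; releases any page that was only
    // waiting for this URL.
    synchronized void resourceFetched(String url) {
        Iterator<Map.Entry<String, Set<String>>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Set<String>> entry = it.next();
            entry.getValue().remove(url);
            if (entry.getValue().isEmpty()) {
                it.remove();
                ready.add(entry.getKey());
            }
        }
    }

    synchronized List<String> readyPages() {
        return new ArrayList<>(ready);
    }
}
```

Several pages can sit in the map at once, which matches the point above about multiple pages being in progress at the same time.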

Finally, once there are no more "pending resources",
fire any "on*" events and monitor for things that
provide a URL.

e.g. document.location.href, window.open(), etc.

We would probably have to provide a wrapper for
"document" that implements the non-standard things like
document.write(), etc.
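As a rough illustration of the wrapper idea, the sketch below shows the Java side only: an object that records URLs handed to it and buffers written markup. Everything here is an assumption for illustration (the class ScriptHostSketch and its method names are invented), and the wiring that would expose it to scripts as "document" and "window" through Rhino is deliberately omitted. Relative URLs are resolved against the page URL with java.net.URI.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the Rhino host-object wiring is not shown.
class ScriptHostSketch {
    private final URI base;
    private final List<URI> discovered = new ArrayList<>();
    private final StringBuilder written = new StringBuilder();

    ScriptHostSketch(String pageUrl) {
        this.base = URI.create(pageUrl);
    }

    // Would back window.open(url): record the target for the spider.
    void windowOpen(String url) {
        discovered.add(base.resolve(url));
    }

    // Would back an assignment to document.location.href.
    void setLocationHref(String url) {
        discovered.add(base.resolve(url));
    }

    // Would back document.write(text): buffer the markup so it can be
    // fed back to the parser.
    void documentWrite(String text) {
        written.append(text);
    }

    List<URI> discoveredUrls() {
        return new ArrayList<>(discovered);
    }

    String pendingMarkup() {
        return written.toString();
    }
}
```

The spider could then queue whatever discoveredUrls() returns once the page's event handlers have run.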

It's a big project, but could be fun!

