
remove tags

2006-06-25
2013-04-27
  • Francesco "Darkside"

    Hi,
    I have to build a system for managing and extracting
    scripts from web pages.

    The problem is implementing a "cleaning" function: it should
    extract the scripts into one or more external files, strip
    them from the page, and link the script-free HTML page to
    the external script file(s). How can I use the HTML parser
    to remove the scripts from a page and get the HTML back
    (without the scripts)?

    Example:

    Page.html (with JavaScript) -----> Page.html
    (without JavaScript) + file.js

    thanks,
    Francesco

     
    • Derrick Oswald - 2006-06-26

      You should be able to subclass ScriptTag so that it writes the script to disk when doSemanticAction() is called and returns nothing (an empty string) when the page is converted back to HTML with toHtml().

      See the documentation for PrototypicalNodeFactory.
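
To make the requested transformation concrete (Page.html with inline scripts in, script-free Page.html plus file.js out), here is a rough sketch that uses only the Java standard library. It is not the ScriptTag subclass approach suggested above: a regular expression stands in for the parser, so it will miss edge cases such as "</script>" appearing inside a string literal. The names ScriptExtractor, Result and extract are hypothetical and chosen only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the HTML Parser library.
final class ScriptExtractor {

    // Matches one <script ...>...</script> element.
    // Group 1 = the attribute text, group 2 = the script body.
    private static final Pattern SCRIPT = Pattern.compile(
            "<script\\b([^>]*)>(.*?)</script\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Holds the cleaned page and the collected script text.
    static final class Result {
        final String html;
        final String script;

        Result(String html, String script) {
            this.html = html;
            this.script = script;
        }
    }

    // Moves every inline script body into one string and leaves a single
    // <script src="jsFileName"> link where the first inline script was.
    static Result extract(String html, String jsFileName) {
        Matcher m = SCRIPT.matcher(html);
        StringBuilder cleaned = new StringBuilder();
        StringBuilder scripts = new StringBuilder();
        int last = 0;
        boolean linked = false;

        while (m.find()) {
            // Copy the markup between the previous script and this one.
            cleaned.append(html, last, m.start());
            last = m.end();

            // Crude check (an assumption of this sketch): a tag that
            // already has a src attribute is external, so keep it as is.
            if (m.group(1).toLowerCase().contains("src=")) {
                cleaned.append(m.group());
                continue;
            }

            scripts.append(m.group(2).trim()).append('\n');
            if (!linked) {
                cleaned.append("<script src=\"")
                       .append(jsFileName)
                       .append("\"></script>");
                linked = true;
            }
        }
        // Copy whatever follows the last script element.
        cleaned.append(html, last, html.length());
        return new Result(cleaned.toString(), scripts.toString());
    }
}
```

The two strings in the result can then be written to Page.html and file.js, for example with java.nio.file.Files. Note that merging all inline scripts into one file linked at the position of the first script changes when later scripts run, so pages whose scripts depend on markup that appears between them may need one external file per script instead.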

       
