Re: [Htmlparser-user] Change Attributes of TDs and TRs
From: Derrick O. <Der...@Ro...> - 2006-01-18 23:27:02
Sorry, I wasn't thinking. You need to use the recursive flag (second parameter) to dig down into the list:

    NodeList tables = all_nodes.extractAllNodesThatMatch (all_tables, true);

Fuhrmann, Michael wrote:

>Hi,
>
>When I use the method you suggested, the tables NodeList contains nothing.
>Do you have any idea why?
>
>NodeList all_nodes = parser.parse(null);
>NodeFilter all_tables = new NodeClassFilter(TableTag.class);
>NodeList tables = all_nodes.extractAllNodesThatMatch(all_tables);
>
>The list all_nodes contains the whole site, but when I use the node filter nothing stays in it.
>
>Thanks and best regards
>Michael
>
>-----Original Message-----
>From: htm...@li... [mailto:htm...@li...] On Behalf Of Derrick Oswald
>Sent: Thursday, 12 January 2006 15:46
>To: htm...@li...
>Subject: Re: [Htmlparser-user] Change Attributes of TDs and TRs
>
>Gather all the nodes into a list using no filter:
>    NodeList all_nodes = parser.parse (null);
>
>Then use the table filter on the whole list, process the nodes, and then
>turn it back into a string:
>    NodeList tables = all_nodes.extractAllNodesThatMatch (all_tables);
>    ... process the tables list...
>    System.out.println (all_nodes.toHtml ());
>
>Fuhrmann, Michael wrote:
>
>>Thanks for your support!
>>But actually I don't want to parse the whole thing twice.
>>My problem is that the page I want to parse contains many tables.
>>Unfortunately these tables contain other tables, and so on.
>>What I want to do is change several attributes of the tds and trs in all tables.
>>The aim is to clean up the "dirty" HTML code in order to finally generate a PDF.
>>My thought was to write a for loop that goes through all table tags.
>>Or do you know a better solution?
>>
>>-----Original Message-----
>>From: htm...@li... [mailto:htm...@li...] On Behalf Of Derrick Oswald
>>Sent: Thursday, 12 January 2006 01:24
>>To: htm...@li...
>>Subject: Re: [Htmlparser-user] Change Attributes of TDs and TRs
>>
>>By the way, after this call:
>>    NodeList list = parser.parse (all_tables);
>>the parser will be at the end of the page and return no more nodes.
>>So, this:
>>    // Separate all table tags
>>    for (NodeIterator e = parser.elements (); e.hasMoreNodes ();)
>>        e.nextNode ().collectInto (list,all_tables);
>>doesn't do anything.
>>
>>You can use:
>>    parser.reset ();
>>to start again, if that is what you really want to do, but in your case
>>you would get duplicates of everything.
>>
>>Third Eye wrote:
>>
>>>The TableTag object already has a function to get the rows, and TableRow
>>>has a function to get the columns. You don't need to iterate yourself.
>>>
>>>On 1/11/06, Fuhrmann, Michael <mic...@sa...> wrote:
>>>
>>>>Hi All!
>>>>
>>>>I want to change several attributes of the td and tr tags of certain tables,
>>>>but I don't know if I am doing it the right way.
>>>>The problem is that I find the right table (only tables with ids), but I
>>>>don't reach the td or tr tags.
>>>>My code looks like this:
>>>>
>>>>public void cleanDokument(HttpServletRequest request, HttpServletResponse response) throws IOException
>>>>{
>>>>    // Get the calling HTML document, define the writer and open the connection
>>>>    URLConnection connection;
>>>>    URL request_url = new URL(request.getHeader("referer").toString());
>>>>
>>>>    PrintWriter out = response.getWriter();
>>>>    connection = (HttpURLConnection)request_url.openConnection ();
>>>>
>>>>    try
>>>>    {
>>>>        Parser parser = new Parser ();
>>>>        parser.setConnection(connection);
>>>>
>>>>        NodeFilter all_tables = new TagNameFilter("table");
>>>>        NodeList list = parser.parse (all_tables);
>>>>        Node[] nodelist;
>>>>
>>>>        // Separate all table tags
>>>>        for (NodeIterator e = parser.elements (); e.hasMoreNodes ();)
>>>>            e.nextNode ().collectInto (list,all_tables);
>>>>
>>>>        nodelist=list.toNodeArray();
>>>>
>>>>        for (int h=0; h<nodelist.length;h++)
>>>>        {
>>>>            if (nodelist[h] instanceof TableTag)
>>>>            {
>>>>                // for loop for the td's and tr's
>>>>                if(((TableTag)nodelist[h]).getAttribute("id")!= null)
>>>>                {
>>>>                    for (int i=0; i<nodelist.length; i++)
>>>>                    {
>>>>                        out.println(nodelist.toString());
>>>>                        if(nodelist[i] instanceof TableRow)
>>>>                        {
>>>>                            out.println("Row found!");
>>>>                            ((TableRow)nodelist[i]).removeAttribute ("nowrap");
>>>>                        }
>>>>                        else if (nodelist[i] instanceof TableColumn)
>>>>                        {
>>>>                            out.println("Column found!");
>>>>                            ((TableColumn)nodelist[i]).removeAttribute ("nowrap");
>>>>                        }
>>>>                    }
>>>>                    out.println(nodelist[h].toHtml());
>>>>                }
>>>>            }
>>>>            else if(nodelist[h] instanceof TableRow || nodelist[h] instanceof TableColumn)
>>>>            {
>>>>                out.println("Else reached!");
>>>>                out.println(((TableRow)nodelist[h]).getText());
>>>>            }
>>>>        }
>>>>        //makePdf(out,response);
>>>>    }
>>>>    catch(Exception e)
>>>>    {
>>>>        out.println("Error while parsing!");
>>>>        e.printStackTrace(out);
>>>>    }
>>>>}
>>>>
>>>>Does my nodelist contain the tr and td tags?
>>>>Is it right to use instanceof TableRow?
>>>>
>>>>Many thanks and best regards
>>>>Michael
>>>
>>>--
>>>Naveen K Kohli
>>>http://www.netomatix.com
>>
>>_______________________________________________
>>Htmlparser-user mailing list
>>Htm...@li...
>>https://lists.sourceforge.net/lists/listinfo/htmlparser-user
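
The recursive flag discussed at the top of this thread can be illustrated with a small stand-in model. This is only a sketch, not the HTML Parser library itself: the Node class, extractMatching, and samplePage are hypothetical names made up for illustration, and it assumes that the list returned by parser.parse(null) holds just the outermost tags (here a single html node), with the tables nested further down as children. Under that assumption a flat match finds nothing, while a recursive match finds both the outer and the nested table.

```java
import java.util.ArrayList;
import java.util.List;

class RecursiveMatchSketch {
    /** Hypothetical stand-in for a parsed tag: a name plus its child nodes. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            for (Node k : kids) {
                children.add(k);
            }
        }
    }

    /** Collects nodes called `name`; looks inside children only when `recursive` is true. */
    static List<Node> extractMatching(List<Node> nodes, String name, boolean recursive) {
        List<Node> out = new ArrayList<>();
        for (Node n : nodes) {
            if (n.name.equals(name)) {
                out.add(n);
            }
            if (recursive) {
                out.addAll(extractMatching(n.children, name, true));
            }
        }
        return out;
    }

    /** Builds a page whose only top-level node is html, with a table nested inside another table. */
    static List<Node> samplePage() {
        Node inner = new Node("table", new Node("tr", new Node("td")));
        Node outer = new Node("table", new Node("tr", new Node("td", inner)));
        List<Node> top = new ArrayList<>();
        top.add(new Node("html", new Node("body", outer)));
        return top;
    }

    public static void main(String[] args) {
        List<Node> all = samplePage();
        System.out.println(extractMatching(all, "table", false).size()); // prints 0
        System.out.println(extractMatching(all, "table", true).size());  // prints 2
    }
}
```

With the real library, the corresponding call is the two-argument extractAllNodesThatMatch (all_tables, true) quoted in the reply above. Nested tables are returned as separate entries, so each table appears in the result once in its own right.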