From: Sreenath <sre...@gm...> - 2014-03-28 05:01:32
|
Hello, I navigated the same pages with both HtmlUnit and a real browser and compared the requests made in each case. A few of the requests the browser makes are missing from the HtmlUnit navigation. Please advise on this. Thanks, Sreenadh -- If you judge people, you have no time to love them. |
From: Ronald B. <rb...@rb...> - 2014-03-26 19:23:49
|
Hi David, please check the latest snapshot; hopefully the problem is solved. RBRi |
From: Amer Al-A. <ame...@ya...> - 2014-03-26 03:49:57
|
Hello, I have found that RowContainer.insertRow(int) does not allow inserting a row before the last row. The problem is in this line: if(index >= rowCount - 1) { I think the -1 should be removed. http://sourceforge.net/p/htmlunit/code/HEAD/tree/tags/HtmlUnit-2.14/src/main/java/com/gargoylesoftware/htmlunit/javascript/host/RowContainer.java#l150 Thanks |
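The off-by-one described above can be illustrated with a small stand-alone sketch. It does not use HtmlUnit at all: RowInsertSketch, insertRowReported and insertRowProposed are hypothetical names, the rows are plain strings, and the handling of index -1 as "append" is an assumption taken from the DOM insertRow convention rather than from the quoted source line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a row container; this is not HtmlUnit's RowContainer.
class RowInsertSketch {

    // Uses the condition quoted in the report: index == rowCount - 1 is treated as "append".
    static List<String> insertRowReported(List<String> rows, int index, String newRow) {
        List<String> result = new ArrayList<>(rows);
        int rowCount = result.size();
        if (index == -1 || index >= rowCount - 1) { // "index == -1" is an assumption
            result.add(newRow);                      // appended after the last row
        } else {
            result.add(index, newRow);               // inserted before the row at index
        }
        return result;
    }

    // Same logic with the "- 1" removed, as the report suggests.
    static List<String> insertRowProposed(List<String> rows, int index, String newRow) {
        List<String> result = new ArrayList<>(rows);
        int rowCount = result.size();
        if (index == -1 || index >= rowCount) {
            result.add(newRow);
        } else {
            result.add(index, newRow);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("r0", "r1", "r2");
        // Asking for index 2 should place the new row before "r2".
        System.out.println(insertRowReported(rows, 2, "new")); // [r0, r1, r2, new]
        System.out.println(insertRowProposed(rows, 2, "new")); // [r0, r1, new, r2]
    }
}
```

With three rows, index 2 is the last row, so the reported condition appends instead of inserting before it; without the -1 only index 3 (or -1) appends.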
From: Amir N. <ami...@gm...> - 2014-03-23 19:30:56
|
Hello, I was wondering whether there is a way to retrieve all CSS attributes of an element at once, without calling every getter of the ComputedCSSStyleDeclaration class. Thanks a lot. -- Amir Najjar, Automation Developer Galil Software || DudaMobile Office: +972 4 9118464 | Mobile: +972 52 4899511 ami...@gm... | www.galilsoftware.com | www.dudamobile.com |
From: Ronald B. <rb...@rb...> - 2014-03-23 17:10:33
|
Hi David, this looks like the usual problem with unsynchronized HashMaps. See here as a starting point: http://stackoverflow.com/questions/17070184/hashmap-stuck-on-get Please open an issue; I will try to fix it. RBRi |
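For readers unfamiliar with the problem named above: on older JVMs, a plain java.util.HashMap that is modified by several threads without synchronization can end up with a corrupted bucket chain, after which a get() call may spin forever at 100% CPU. The sketch below is not the HtmlUnit fix and does not touch HtmlUnit code; SharedMapSketch and fillConcurrently are hypothetical names. It only shows the general remedy of sharing a map type that is designed for concurrent writers (wrapping the map with Collections.synchronizedMap is another option).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example class; it illustrates the remedy, not HtmlUnit internals.
class SharedMapSketch {

    // Fills the given map from several threads at once; each thread writes its own key range.
    static int fillConcurrently(Map<Integer, Integer> map, int threads, int keysPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int offset = t * keysPerThread;
            pool.submit(() -> {
                for (int i = 0; i < keysPerThread; i++) {
                    map.put(offset + i, i);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                pool.shutdownNow();
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return map.size();
    }

    public static void main(String[] args) {
        // ConcurrentHashMap tolerates concurrent writers, so no entries are lost
        // and no reader can get stuck on a corrupted bucket chain.
        System.out.println(fillConcurrently(new ConcurrentHashMap<Integer, Integer>(), 8, 10_000));
    }
}
```

Passing a plain HashMap to the same method is exactly the unsafe pattern: the result is then undefined, and on the Java versions of that era it could hang in the way the thread dump shows.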
From: David M. G. <mic...@gm...> - 2014-03-23 15:32:57
|
Hi all,

I need some pointers on how to continue investigating this.

I am using the HtmlUnit 2.15 snapshot.

My Java process is hanging at 100% CPU.

jstack gives the following:

Full thread dump Java HotSpot(TM) 64-Bit Server VM (14.0-b16 mixed mode):

"Attach Listener" daemon prio=10 tid=0x00000000471f8800 nid=0x7756 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"JS executor for com.gargoylesoftware.htmlunit.WebClient@39e32cd6" daemon prio=10 tid=0x00000000471fc800 nid=0x5e61 waiting on condition [0x00002ad519ddc000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at com.gargoylesoftware.htmlunit.javascript.background.DefaultJavaScriptExecutor.run(DefaultJavaScriptExecutor.java:180)
        at java.lang.Thread.run(Thread.java:619)

"JS executor for com.gargoylesoftware.htmlunit.WebClient@62946d22" daemon prio=10 tid=0x00002ad51c442000 nid=0x5813 waiting on condition [0x00002ad519ab6000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at com.gargoylesoftware.htmlunit.javascript.background.DefaultJavaScriptExecutor.run(DefaultJavaScriptExecutor.java:180)
        at java.lang.Thread.run(Thread.java:619)

"Low Memory Detector" daemon prio=10 tid=0x0000000046a47800 nid=0x578a runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x0000000046a44800 nid=0x5789 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x0000000046a40800 nid=0x5788 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x0000000046a3e800 nid=0x5787 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x0000000046a1a000 nid=0x5785 in Object.wait() [0x00002ad515a1d000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00002ad504c5aa28> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
        - locked <0x00002ad504c5aa28> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x0000000046a18000 nid=0x5784 in Object.wait() [0x00002ad51591c000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00002ad504c5baf8> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:485)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
        - locked <0x00002ad504c5baf8> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x0000000046867000 nid=0x577a runnable [0x00002ad4fbfb4000]
   java.lang.Thread.State: RUNNABLE
        at java.util.HashMap.get(HashMap.java:303)
        at com.gargoylesoftware.htmlunit.html.HtmlPage.addElement(HtmlPage.java:1875)
        at com.gargoylesoftware.htmlunit.html.HtmlPage.addMappedElement(HtmlPage.java:1850)
        at com.gargoylesoftware.htmlunit.html.HtmlPage.notifyNodeAdded(HtmlPage.java:1793)
        at com.gargoylesoftware.htmlunit.html.DomNode.fireAddition(DomNode.java:1043)
        at com.gargoylesoftware.htmlunit.html.DomNode.appendChild(DomNode.java:937)
        at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.addNodeToRightParent(HTMLParser.java:652)
        at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.startElement(HTMLParser.java:565)
        at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
        at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.startElement(HTMLParser.java:510)
        at org.cyberneko.html.HTMLTagBalancer.callStartElement(HTMLTagBalancer.java:1164)
        at org.cyberneko.html.HTMLTagBalancer.startElement(HTMLTagBalancer.java:754)
        at org.cyberneko.html.filters.DefaultFilter.startElement(DefaultFilter.java:136)
        at org.cyberneko.html.filters.NamespaceBinder.startElement(NamespaceBinder.java:279)
        at org.cyberneko.html.HTMLScanner$ContentScanner.scanStartElement(HTMLScanner.java:2760)
        at org.cyberneko.html.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2110)
        at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:920)
        at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:499)
        at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:452)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.parse(HTMLParser.java:926)
        at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:245)
        at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.java:191)
        at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:268)
        at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:156)
        at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:455)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:329)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:394)
        ...

"VM Thread" prio=10 tid=0x0000000046a11800 nid=0x5783 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x0000000046871000 nid=0x577b runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x0000000046872800 nid=0x577c runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x0000000046874800 nid=0x577d runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x0000000046876000 nid=0x577e runnable

"GC task thread#4 (ParallelGC)" prio=10 tid=0x0000000046878000 nid=0x577f runnable

"GC task thread#5 (ParallelGC)" prio=10 tid=0x000000004687a000 nid=0x5780 runnable

"GC task thread#6 (ParallelGC)" prio=10 tid=0x000000004687c000 nid=0x5781 runnable

"GC task thread#7 (ParallelGC)" prio=10 tid=0x000000004687d800 nid=0x5782 runnable

"VM Periodic Task Thread" prio=10 tid=0x0000000046a4a000 nid=0x578b waiting on condition

JNI global references: 1142

When I wait a couple of seconds and take the thread dump again, the trace does not change.

The memory looks fine:

  S0     S1     E      O      P     YGC    YGCT   FGC   FGCT    GCT
  0.00  75.00  88.28  34.37  73.05  1025   1.798    8   0.842   2.640

I am running on a file of URLs, so I cannot pinpoint the line where it gets stuck.

What else can I change?

Thanks,
David |
From: David N. <dav...@gm...> - 2014-03-19 14:48:10
|
Well, it may not be pfSense after all. I connected the servers directly to the modem, ran the crawlers, and I'm still getting UnknownHostExceptions. It must be either my modem, rate limiting on my provider's DNS servers, or the line I'm on itself. |
From: Alain B. <alb...@gm...> - 2014-03-19 12:33:05
|
I would like to use HtmlUnit to do something I can't do with browser techniques because of cross-domain security. Some users of my site also have to fill in a form on another site that I have no control over, and all the data they need is already in a form on my site, so to make things easier for them I would like to do the transfer server-side. My question: could I run into problems with multiple instances of HtmlUnit? Several users may be connected at the same time. Perhaps my question is silly, but I'm a newbie with these problems. Regards |
From: Ahmed A. <asa...@ya...> - 2014-03-19 06:48:17
|
Hi David, Please provide complete (hopefully minimal) case. Ahmed ________________________________ From: David Hill <DH...@St...> To: "htm...@li..." <htm...@li...> Sent: Tuesday, March 18, 2014 11:35 PM Subject: [Htmlunit-user] FW: problem with cached event handlers I should have posted this to the users list, not the dev list: After upgrading from 2.9 to 2.14 we have a number of failing tests that involve making sure end users can't perform obvious hacks. These tests involve modifying an HTML page before submitting it back to the server. We are seeing the HTML has been correctly updated, but when we change properties in the HTML through HTMLUnit, those changes are no longer executed properly since upgrading. the element looks like this HtmlButtonInput[<input type="button" id="resend1" onclick="resend(19, 'jdoe');" value="Resend">] but the elements scriptObject_.eventListenersContainer_.eventHandlers_[0].value.handler_.jsSnippet = function onclick() {resend(19, 'smith');} note the element was hacked to change the user from smith to jdoe, but HTMLUnit still submits smith. Dave This e-mail and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this e-mail in error please notify the originator of the message. This footer also confirms that this e-mail message has been scanned for the presence of computer viruses. Any views expressed in this message are those of the individual sender, except where the sender specifies and with authority, states them to be the views of Iowa Student Loan. ------------------------------------------------------------------------------ Learn Graph Databases - Download FREE O'Reilly Book "Graph Databases" is the definitive new guide to graph databases and their applications. Written by three acclaimed leaders in the field, this first edition is now available. Download your free book today! 
http://p.sf.net/sfu/13534_NeoTech _______________________________________________ Htmlunit-user mailing list Htm...@li... https://lists.sourceforge.net/lists/listinfo/htmlunit-user |
From: David H. <DH...@St...> - 2014-03-18 20:52:06
|
I should have posted this to the users list, not the dev list: After upgrading from 2.9 to 2.14 we have a number of failing tests that check that end users can't perform obvious hacks. These tests modify an HTML page before submitting it back to the server. We can see that the HTML has been correctly updated, but since upgrading, the changes we make to properties in the HTML through HTMLUnit no longer take effect when the handler runs. The element looks like this: HtmlButtonInput[<input type="button" id="resend1" onclick="resend(19, 'jdoe');" value="Resend">] but the element's scriptObject_.eventListenersContainer_.eventHandlers_[0].value.handler_.jsSnippet = function onclick() {resend(19, 'smith');} Note that the element was hacked to change the user from smith to jdoe, but HTMLUnit still submits smith. Dave This e-mail and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this e-mail in error please notify the originator of the message. This footer also confirms that this e-mail message has been scanned for the presence of computer viruses. Any views expressed in this message are those of the individual sender, except where the sender specifies and with authority, states them to be the views of Iowa Student Loan. |
From: David N. <dav...@gm...> - 2014-03-18 13:32:01
|
It seems that pfSense is the culprit. I loaded the crawlers on a few servers with 100 threads per instance, waited for UnknownHostExceptions to be thrown, then plugged a laptop directly into my modem, bypassing my 2 pfSense routers. All DNS queries have gone through, no problem. I'm contacting the pfSense lists to see if anyone there knows what might be going on. It's probably a misconfiguration on my part... or maybe it just needs to be tuned differently. I'd be surprised if something were wrong with pfSense itself. Thanks for your input, Ahmed. -David |
From: Torak t. <sni...@gm...> - 2014-03-17 17:52:58
|
ok new script. all files imported the right way and error fixed: import java.io.IOException; import java.net.MalformedURLException; import org.eclipse.swt.widgets.Shell; import org.eclipse.swt.widgets.Display; import com.gargoylesoftware.htmlunit.*; import com.gargoylesoftware.htmlunit.html.*; public class BotStart { public static void main(String[] args) throws Exception, FailingHttpStatusCodeException, MalformedURLException, IOException { Display display = new Display(); Shell myshell = new Shell(display); myshell.setText("adf.ly BOT 0.1"); myshell.open(); while (!myshell.isDisposed()) { final WebClient webClient = new WebClient(BrowserVersion.CHROME); final HtmlPage page1 = webClient.getPage("http://adf.ly/ftDpz"); final HtmlInput button = page1.getElementById("skip_ad_button") final HtmlPage page2 = button.click(); //some more stuff in here :) webClient.closeAllWindows(); if (!display.readAndDispatch()) display.sleep(); } display.dispose(); } } but as you can probably see there is another problem. i dont know if your meant to use HtmlUnit for Botting purposes but i thought id test it out. my bot needs to : get to my adfly link ( shown above) get the skip ad button, wait 5 seconds, then click it. but in order to do the button.click() you need to have a htmlInput, which does not work directly from a html page ie: final HtmlInput button = page1.getElementById("skip_ad_button") idk what htmldivision does but if anyone can tell me how to make it work so that button is the obtained element id. It also keeps saying "obsolete data type encountered: text/javascript" if anyone knows how to fix that. sorry for the bother. ik ur all sick of me now :( |
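On the "obsolete data type encountered" message: one possible way to quiet it, sketched here under assumptions, is to turn off HtmlUnit's logging the same way Jack Bi's 2014-03-15 post in this archive does (a java.util.logging level plus the commons-logging system property). The class name QuietHtmlUnitLogging and the method silence() are made up for this sketch, and whether this hides that particular message depends on which logging backend is actually in use, so treat it as something to try rather than a confirmed fix.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class QuietHtmlUnitLogging {
    // Keep a strong reference: java.util.logging holds loggers weakly, so a level
    // set on an otherwise unreferenced logger can be lost after garbage collection.
    private static final Logger HTMLUNIT_LOGGER = Logger.getLogger("com.gargoylesoftware");

    /** Turns off java.util.logging output for everything under com.gargoylesoftware. */
    static void silence() {
        HTMLUNIT_LOGGER.setLevel(Level.OFF);
        // Assumption: this is only picked up if commons-logging has not been
        // initialised yet, so call silence() before creating any WebClient.
        System.setProperty("org.apache.commons.logging.Log",
                "org.apache.commons.logging.impl.NoOpLog");
    }

    public static void main(String[] args) {
        silence();
        System.out.println("HtmlUnit logger level: " + HTMLUNIT_LOGGER.getLevel());
    }
}
```

Calling silence() as the first statement of main, before the WebClient is constructed, matches where the original post places these two lines.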
From: Raj R. <raj...@gm...> - 2014-03-15 23:09:46
|
Thank you, appreciate the help. On Friday, March 14, 2014, Ronald Brill wrote: > This is fixed now > > RBRi > > On Fri, 14 Mar 2014 07:34:38 +0100 (CET) Ronald Brill wrote: > > > >Hi, > > > >this seems to be a bug. Please open an issue for the missing support of > window.navigate when simulating IE8. > >Fix should be easy. > > > > RBRi > > > >On Thu, 13 Mar 2014 10:03:09 -0500 Raj Rajeswaran wrote: > >> > >>Morning .... > >> > >>I have a button with onClick defined as such: > >> > >>onclick="window.navigate(returnPage)" > >> > >>When the button is clicked, I am getting a ScriptException as follows: > >> > >>*com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot find > >>function navigate in object [object].* > >> > >>Running in INTERNET_EXPLORER_8 mode. > >> > >> > >>Any clues how to fix this? > >> > >>Thanks > >>Ra |
From: Torak t. <sni...@gm...> - 2014-03-15 10:10:11
|
ok so i made a script in eclipse: import org.eclipse.swt.widgets.Display; import org.eclipse.swt.widgets.Shell; import org.junit.*; import org.apache.*; public class HelloWorldSWT { public static void main(String[] args) { Display display = new Display(); Shell myshell = new Shell(display); myshell.setText("adf.ly BOT 0.1"); myshell.open(); while (!myshell.isDisposed()) { //these 3 could not be resolved to a type final WebClient webClient = new WebClient(); final HtmlPage page = webClient.getPage("http://adf.ly/ftDpz"); final HtmlDivision div = page.getHtmlElementById("Skip_Ad_Button"); // to be developed webClient.closeAllWindows(); if (!display.readAndDispatch()) display.sleep(); } display.dispose(); } everything else is fine, it doesnt work yet of course cuz i have to add the 5 second wait time and the button.click. but i think its a problem with the org.apache.* node. does it matter that i couldnt find and import some of the files like nekohtml and xalan? can someone tell me whats wrong? |
From: Jack Bi <814...@qq...> - 2014-03-15 09:42:58
|
Hi Ronald: Thank you for your reply. If you are interested in my code. /******************************************************/ import java.io.IOException; import java.net.MalformedURLException; import java.util.ArrayList; import java.util.List; import java.util.logging.Level; import org.apache.log4j.Logger; import org.apache.log4j.chainsaw.Main; import org.jsoup.Jsoup; import org.jsoup.nodes.Document; import org.jsoup.nodes.Element; import org.jsoup.select.Elements; import com.gargoylesoftware.htmlunit.BrowserVersion; import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException; import com.gargoylesoftware.htmlunit.WebClient; import com.gargoylesoftware.htmlunit.WebWindowEvent; import com.gargoylesoftware.htmlunit.WebWindowListener; import com.gargoylesoftware.htmlunit.html.HtmlElement; import com.gargoylesoftware.htmlunit.html.HtmlForm; import com.gargoylesoftware.htmlunit.html.HtmlInput; import com.gargoylesoftware.htmlunit.html.HtmlOption; import com.gargoylesoftware.htmlunit.html.HtmlPage; import com.gargoylesoftware.htmlunit.html.HtmlSelect; public class A { private static Logger logger = Logger.getLogger(A.class); static boolean warningTag = false; static boolean finalTag = false; static boolean htmlTag = false; static final String descUrl = "https://twitter.com/"; static String descTag = "welcome twitter"; static HtmlPage page = null; static HtmlPage page2 = null; static WebClient webClient = null; static String htmlString = null; static String urlString = ""; public static void initWebClient() { webClient = new WebClient(BrowserVersion.FIREFOX_24); webClient.getOptions().setThrowExceptionOnFailingStatusCode(false); webClient.getOptions().setThrowExceptionOnScriptError(false); webClient.getOptions().setCssEnabled(false); webClient.getOptions().setRedirectEnabled(true); webClient.getOptions().setJavaScriptEnabled(true); // webClient.getOptions().setTimeout(20*1000); webClient.getOptions().setUseInsecureSSL(true); webClient.addWebWindowListener(new 
WebWindowListener() { @Override public void webWindowOpened(WebWindowEvent event) { } @Override public void webWindowContentChanged(WebWindowEvent event) { page = (HtmlPage) webClient.getCurrentWindow().getEnclosedPage(); if (page.getTitleText().equals("Security Warning")) { warningTag = true; } if (page.asText().contains(descTag)) { finalTag = true; } } @Override public void webWindowClosed(WebWindowEvent event) { } }); } public static void webClientGetPage(String url) throws IOException, InterruptedException { try { page = webClient.getPage(url); } catch (FailingHttpStatusCodeException e) { logger.info(url + ":Error:Http Status"); return; } catch (MalformedURLException e) { logger.info(url + ":Error:URL Error"); return; } catch (Exception e) { logger.info(url + ":Error:Connect Error"); return; } try { dealRootPage(page); } catch (Exception e) { logger.info(urlString + ":dealRootPage Failed"); return; } } public static void dealRootPage(HtmlPage htmlPage) throws IOException, InterruptedException { Document document = Jsoup.parse(htmlPage.asXml()); Elements forms = document.getElementsByTag("form"); List<HtmlForm> forms2 = htmlPage.getForms(); if (forms.size() == 0) { logger.info(urlString + ":Error:Cannot find Form"); return; } else { for (int i = 0; i < forms.size(); i++) { Element form = forms.get(i); HtmlForm form2 = forms2.get(i); System.out.println("Execute:form" + (i+1)); try { parseForm(form,form2); } catch (Exception e) { logger.info(urlString + ":parForm:form" + i + "Failed"); continue; } } } } public static int parseForm(Element formElement, HtmlForm form2) throws IOException, InterruptedException { String inputName = formElement.select("input[type=text]").attr("name"); String buttonValue = formElement.select("input[type=submit]").attr("value"); Elements selects = formElement.getElementsByTag("select"); HtmlSelect select = null; if (inputName.length() < 1) { logger.info(urlString + ":Error:Cannot find Input"); return -2; } HtmlInput inputText = 
form2.getInputByName(inputName); inputText.setAttribute("value", descUrl); HtmlInput buttonInput = form2.getInputByValue(buttonValue); if (selects.size() == 0) { try { doRequest(buttonInput); } catch (Exception e) { logger.info(urlString + ":doRequest(button)" + "Failed"); return -7; } } else { Elements options = formElement.getElementsByTag("select").get(0).getElementsByTag("option"); String selectName = selects.get(0).attr("name"); select = form2.getSelectByName(selectName); for (int i = 0; i < options.size(); i++) { String optionValue = options.get(i).attr("value"); HtmlOption option = select.getOptionByValue(optionValue); select.setSelectedAttribute(option, true); System.out.println("Select Server:" + (i+1)); try { doRequest(buttonInput); } catch (Exception e) { logger.info(urlString + ":doRequest " + "Server:" + (i+1) + "Failed"); continue; } } } return 0; } public static int doRequest(HtmlInput buttonInput) throws InterruptedException, IOException { page = (HtmlPage)(buttonInput.click()); for(int i = 0; i < 1000; i++){ if (warningTag || finalTag) { break; } if (i == 100) { logger.info(urlString + "Error:JS Timeout"); System.out.println(page.asXml()); return -4; } Thread.sleep(1000); } int result = dealPage(page); warningTag = false; finalTag = false; Thread.sleep(5000); return result; } public static int dealPage(HtmlPage page) throws IOException, InterruptedException { if (page.asText().contains("Twitter")) { logger.info(urlString + ":" + "Succ"); System.out.println(page.asXml()); return 1; } else if (page.asText().contains("Warning")) { return dealWarning(page); } return 0; } public static int dealWarning(HtmlPage page) throws IOException, InterruptedException { HtmlForm form2 = page.getForms().get(0); HtmlElement button2 = form2.getInputByValue("Continue anyway..."); @SuppressWarnings("unused") HtmlPage page3 = button2.click(); for(int i = 0; i < 1000; i++){ if (finalTag) { return 2;//Succ } if (i == 50) { logger.info(urlString + ":" + "Error:Warning 
timeout"); return -5; } Thread.sleep(1000); } return 0; } public static void main(String[] args) throws IOException, InterruptedException { java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(Level.OFF); System.setProperty("org.apache.commons.logging.Log","org.apache.commons.logging.impl.NoOpLog"); urlString = "http://www.q8daili.com/"; initWebClient(); Log.loadLogProperties(); logger.info("**************************************************"); logger.info("begin"); try { webClientGetPage(urlString); } catch (Exception e) { logger.info(urlString + ":Get root HTMLPage failed"); } } } Thank you! yours Jack! -- View this message in context: http://htmlunit.10904.n7.nabble.com/htmlUnit-always-failed-when-execute-JS-in-html-tp33378p33383.html Sent from the HtmlUnit - General mailing list archive at Nabble.com. |
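The code above signals from the WebWindowListener to the polling loops through plain static boolean fields (warningTag, finalTag). If the listener happens to run on a different thread than the loop, Java gives no guarantee that the loop ever sees the update, which could be one source of "sometimes it works" behaviour; the post does not say enough to confirm that this is the cause. A minimal sketch of the same wait using a CountDownLatch instead, with the hypothetical names PageSignal, markReached and await standing in for the flag and the loop:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: one thread waits until another reports that the target page arrived. */
final class PageSignal {
    private final CountDownLatch reached = new CountDownLatch(1);

    /** Call from the listener callback, where the original code sets finalTag = true. */
    void markReached() {
        reached.countDown();
    }

    /** Waits until markReached() has been called or the timeout elapses; true if it was called. */
    boolean await(long timeout, TimeUnit unit) {
        try {
            return reached.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        PageSignal signal = new PageSignal();
        // Stand-in for a callback thread: reports success after 100 ms.
        Thread callback = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            signal.markReached();
        });
        callback.start();
        System.out.println("reached: " + signal.await(5, TimeUnit.SECONDS)); // prints "reached: true"
        callback.join();
    }
}
```

A latch is single-use, so a fresh PageSignal would be needed for each doRequest call, where the original resets the flags to false.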
From: Jack Bi <814...@qq...> - 2014-03-15 09:37:32
|
Hi David: I will try PhantomJs and CasperJs that you said. thank you very much! yours Jack -- View this message in context: http://htmlunit.10904.n7.nabble.com/htmlUnit-always-failed-when-execute-JS-in-html-tp33378p33382.html Sent from the HtmlUnit - General mailing list archive at Nabble.com. |
From: Ronald B. <rb...@rb...> - 2014-03-15 09:18:42
|
Hi Jack, I personally think that the JS support of HtmlUnit is not that bad. From your info I'm not able to see your problem. Do you get an error, or just not the response you expect? In general HtmlUnit tries to mimic the browser as closely as possible. So if you find a situation where the behaviour is different, you have to isolate it (see http://htmlunit.sourceforge.net/submittingBugs.html and http://htmlunit.sourceforge.net/submittingJSBugs.html) and open an issue. Depending on the level of detail you provide, we usually fix this as fast as possible. RBRi |
From: David N. <dav...@gm...> - 2014-03-15 06:51:50
|
On 3/14/14, Ahmed Ashour <asa...@ya...> wrote: > The only known issue is 1577 where HtmlUnit makes new socket connection per > call to getPage(), which is not happening in 2.13 > > I suggest two things: revert to 2.13, and directly use HttpClient or Java > net with very minimal logic, so that we know if it's HtmlUnit or Java. Hi, thanks for the reply. I'm currently using HTMLUnit 2.11. The relevant code from inside the threads is as follows: WebClient client = new WebClient(BrowserVersion.FIREFOX_10); client.setRefreshHandler(new ThreadedRefreshHandler()); client.getCookieManager().setCookiesEnabled(true); client.getOptions().setThrowExceptionOnFailingStatusCode(false); client.getOptions().setThrowExceptionOnScriptError(false); client.getOptions().setJavaScriptEnabled(false); client.getOptions().setCssEnabled(false); client.getOptions().setPopupBlockerEnabled(true); client.getOptions().setPrintContentOnFailingStatusCode(false); client.getOptions().setUseInsecureSSL(true); client.setRefreshHandler(new ThreadedRefreshHandler()); client.getOptions().setTimeout(120000); client.setJavaScriptTimeout(20000); client.getOptions().setPrintContentOnFailingStatusCode(false); HtmlPage p = client.getPage(url); client.closeAllWindows(); ---------- WebClient client = new WebClient(BrowserVersion.FIREFOX_10); client.setRefreshHandler(new ThreadedRefreshHandler()); client.getCookieManager().setCookiesEnabled(true); client.getOptions().setThrowExceptionOnFailingStatusCode(false); client.getOptions().setThrowExceptionOnScriptError(false); client.getOptions().setJavaScriptEnabled(false); client.getOptions().setCssEnabled(false); client.getOptions().setPopupBlockerEnabled(true); client.getOptions().setPrintContentOnFailingStatusCode(false); client.getOptions().setTimeout(20000); client.getOptions().setPrintContentOnFailingStatusCode(false); client.getOptions().setUseInsecureSSL(true); WebRequest r = new WebRequest(new URL(iframeURL)); r.setAdditionalHeader("Referer", referUrl); Page p = 
client.getPage(r); String readFileToString = p.getWebResponse().getContentAsString(); client.closeAllWindows(); ---------- WebClient client = new WebClient(BrowserVersion.FIREFOX_10); client.setRefreshHandler(new ThreadedRefreshHandler()); client.getCookieManager().setCookiesEnabled(true); client.getOptions().setThrowExceptionOnFailingStatusCode(false); client.getOptions().setThrowExceptionOnScriptError(false); client.getOptions().setJavaScriptEnabled(false); client.getOptions().setCssEnabled(false); client.getOptions().setPopupBlockerEnabled(true); client.getOptions().setPrintContentOnFailingStatusCode(false); client.getOptions().setTimeout(20000); client.getOptions().setPrintContentOnFailingStatusCode(false); client.getOptions().setUseInsecureSSL(true); Page p = client.getPage(url); Pattern patt = Pattern.compile("[0-9]+"); Matcher match = patt.matcher(p.getUrl().toString()); if (match.find()) { String tempNumber = match.group(); p = client.getPage(url2); String temp = p.getWebResponse().getContentAsString(); patt = Pattern.compile("apiKey:\"(.*?)\","); match = patt.matcher(temp); if (match.find() && match.group().split(":").length > 0) { ret = match.group().split(":")[1].replace("\"", "").replace(",", ""); } } client.closeAllWindows(); ---------- -David PS: What's the etiquette here? Top posting or bottom posting? Gmail makes it to much easier to top post and that's what I prefer, but some lists insist on bottom posting. |
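A side note on the last snippet: the pattern already has a capture group, so the key can be read with group(1) instead of splitting the whole match on ":" and stripping quotes and commas. The split approach would cut the value short if the key itself contained a colon. A small sketch of that idea; ApiKeyExtractor and the sample string are made up for illustration and not taken from any real page:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ApiKeyExtractor {
    // Same pattern as in the post; the parentheses capture the text between the quotes.
    private static final Pattern API_KEY = Pattern.compile("apiKey:\"(.*?)\",");

    /** Returns the captured key, or empty if the pattern does not occur in the text. */
    static Optional<String> extract(String text) {
        Matcher m = API_KEY.matcher(text);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up sample input with a colon inside the key.
        String sample = "config = {apiKey:\"abc:123\",debug:false}";
        System.out.println(extract(sample).orElse("(none)")); // prints abc:123
    }
}
```

With the same sample, split(":")[1] followed by the two replace calls would yield only abc.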
From: David N. <dav...@gm...> - 2014-03-15 06:26:38
|
On 3/14/14, Jack Bi <814...@qq...> wrote: > Hi all: > I found htmlUnit always failed when execute JS in html, and if I try > the > same code, sometimes it's success, I was puzzled for a long time. > My Java code like this, hope someone can help me solve this > problem,thank you very much. > > WebClient webClient = new > WebClient(BrowserVersion.FIREFOX_24); > webClient.addWebWindowListener(new WebWindowListener() > { > > public void webWindowOpened(WebWindowEvent event) { > > } > public void webWindowContentChanged(WebWindowEvent event) { > page = (HtmlPage) webClient.getCurrentWindow().getEnclosedPage(); > if (page.getTitleText().equals("Security Warning")) { > warningTag = true; > } > if (page.asText().contains(descTag)) { > finalTag = true; > } > } > public void webWindowClosed(WebWindowEvent event) { > > } > }); > > page = webClient.getPage(urlString); > for(int i = 0; i < 200; i++){ > if (warningTag || finalTag) { > break; > } > Thread.sleep(1000); > } > > the HTML code is like this: > <html> > <head> > <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> > <title> > title > </title> > <script type="text/javascript"> > window.setTimeout('document.getElementById("formxh4t").submit();',1000); > <form name="formxh4t" id="formxh4t" > action="http://k2.ddtuangou.com/daili/process.php?action=update" > method="post" onsubmit="return updateLocation(this);"> > <input name="u" type="hidden" class="textbox" id="input" > value="http://dongtaiwang.com" size="60"/> > <input type="hidden" name="type" value="0"/> > </form> > </body> > </html> I have found the Javascript engines used by HTMLUnit unable to properly execute JS as well. For my project I wound up having to switch to PhantomJS/CasperJS. I'd recommend giving it a try. I still use HTMLUnit for some things, and for those things it does a really nice job, but when it comes to executing Javascript I have to use something else. |
From: Jack Bi <814...@qq...> - 2014-03-15 04:54:10
|
Hi all: I found htmlUnit always failed when execute JS in html, and if I try the same code, sometimes it's success, I was puzzled for a long time. My Java code like this, hope someone can help me solve this problem,thank you very much. WebClient webClient = new WebClient(BrowserVersion.FIREFOX_24); webClient.addWebWindowListener(new WebWindowListener() { public void webWindowOpened(WebWindowEvent event) { } public void webWindowContentChanged(WebWindowEvent event) { page = (HtmlPage) webClient.getCurrentWindow().getEnclosedPage(); if (page.getTitleText().equals("Security Warning")) { warningTag = true; } if (page.asText().contains(descTag)) { finalTag = true; } } public void webWindowClosed(WebWindowEvent event) { } }); page = webClient.getPage(urlString); for(int i = 0; i < 200; i++){ if (warningTag || finalTag) { break; } Thread.sleep(1000); } the HTML code is like this: <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> <title> title </title> <script type="text/javascript"> window.setTimeout('document.getElementById("formxh4t").submit();',1000); <form name="formxh4t" id="formxh4t" action="http://k2.ddtuangou.com/daili/process.php?action=update" method="post" onsubmit="return updateLocation(this);"> <input name="u" type="hidden" class="textbox" id="input" value="http://dongtaiwang.com" size="60"/> <input type="hidden" name="type" value="0"/> </form> </body> </html> thank you! yours jack -- View this message in context: http://htmlunit.10904.n7.nabble.com/htmlUnit-always-failed-when-execute-JS-in-html-tp33378.html Sent from the HtmlUnit - General mailing list archive at Nabble.com. |
From: Ahmed A. <asa...@ya...> - 2014-03-14 22:01:15
|
Hi David,

The only known issue is 1577, where HtmlUnit opens a new socket
connection on every call to getPage(); this does not happen in 2.13.

I suggest two things: revert to 2.13, and run the same requests directly
through HttpClient or java.net with very minimal logic, so that we know
whether the problem is in HtmlUnit or in Java.

Ahmed

> On Mar 14, 2014, at 11:05 PM, David Noel <dav...@gm...> wrote:
>
> I've encountered an issue while scaling a Java project that I'm not
> sure how to resolve. Any thoughts would be appreciated.
>
> The code is a crawler that uses HTMLUnit's getPage method. I'm running
> 100 threads per instance. When I have 1 instance up and running
> everything is fine. When I scale it to a second machine though I start
> having trouble. Calls to getPage keep throwing UnknownHostException's.
> Roughly 1 out of every 20 calls throw this exception. For some reason
> it's unable to resolve domain names.. and it's not just the crawlers,
> my entire network starts to bug on DNS queries. On different systems
> on the same network I get 'unable to resolve host' errors in my web
> browser periodically when loading URL's. Usually when I retry it goes
> through, but it keeps happening sporadically as long as the crawlers
> are running.
>
> So many things could be going wrong here. Thinking maybe it was my
> provider throttling DNS queries I've tried changing DNS servers, but
> that's done nothing. Thinking it might be a bandwidth issue I checked
> systat, but the cumulative load is well under what my line can handle.
> What else could be causing this? My network is pretty simple: Provider
> <--> modem <--> 2 routers running pfSense <--> Servers and
> workstations. The servers are running FreeBSD, and the workstations
> run FreeBSD, Windows, and OSX.
>
> Has anyone encountered this before? Does anyone have any thoughts on
> what might be causing it?
>
> My only other thought is that maybe pfSense is doing something
> strange, so if I can't come up with any better ideas I'll try plugging
> the servers directly into the modem. I'd rather have them behind the
> routers though, so this would be a less-than-ideal solution.
>
> -David
>
> ------------------------------------------------------------------------------
> Learn Graph Databases - Download FREE O'Reilly Book
> "Graph Databases" is the definitive new guide to graph databases and their
> applications. Written by three acclaimed leaders in the field,
> this first edition is now available. Download your free book today!
> http://p.sf.net/sfu/13534_NeoTech
> _______________________________________________
> Htmlunit-user mailing list
> Htm...@li...
> https://lists.sourceforge.net/lists/listinfo/htmlunit-user
|
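The "very minimal logic" test with plain java.net suggested in the reply above could be sketched roughly as follows. This is only an illustration under stated assumptions: DnsProbe and countFailures are made-up names, the thread count of 100 simply mirrors the crawler setup described in the quoted message, and the probe does nothing but resolve names, so any UnknownHostException it counts cannot come from HtmlUnit.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical probe: resolves host names concurrently using only java.net. */
final class DnsProbe {

    /** Resolves every host on a pool of the given size; returns how many lookups failed. */
    static int countFailures(List<String> hosts, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<Future<Boolean>>();
            for (final String host : hosts) {
                results.add(pool.submit(new Callable<Boolean>() {
                    public Boolean call() {
                        try {
                            InetAddress.getByName(host); // the only network call made
                            return Boolean.TRUE;
                        } catch (UnknownHostException e) {
                            return Boolean.FALSE;
                        }
                    }
                }));
            }
            int failures = 0;
            for (Future<Boolean> result : results) {
                try {
                    if (!result.get()) {
                        failures++;
                    }
                } catch (ExecutionException e) {
                    failures++; // any unexpected error also counts as a failed lookup
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    failures++;
                }
            }
            return failures;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList(args);
        int failures = countFailures(hosts, 100);
        System.out.println(failures + " of " + hosts.size() + " lookups failed");
    }
}
```

The JVM caches successful lookups for a while, so the probe is only meaningful with a long list of distinct host names, for example the ones the crawler was about to visit. If it reproduces the failures on its own, the problem sits below HtmlUnit (JVM, OS resolver, router, or upstream DNS); if it does not, reverting to 2.13 is the more interesting experiment.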
From: David N. <dav...@gm...> - 2014-03-14 20:05:07
|
I've encountered an issue while scaling a Java project that I'm not sure
how to resolve. Any thoughts would be appreciated.

The code is a crawler that uses HtmlUnit's getPage method. I'm running
100 threads per instance. With one instance up and running everything is
fine. When I scale to a second machine, though, I start having trouble:
calls to getPage keep throwing UnknownHostExceptions. Roughly 1 out of
every 20 calls throws this exception. For some reason it's unable to
resolve domain names, and it's not just the crawlers: my entire network
starts to fail on DNS queries. On other systems on the same network I
periodically get 'unable to resolve host' errors in my web browser when
loading URLs. Usually a retry goes through, but it keeps happening
sporadically for as long as the crawlers are running.

So many things could be going wrong here. Thinking my provider might be
throttling DNS queries, I tried changing DNS servers, but that did
nothing. Thinking it might be a bandwidth issue, I checked systat, but
the cumulative load is well under what my line can handle. What else
could be causing this? My network is pretty simple:

Provider <--> modem <--> 2 routers running pfSense <--> Servers and workstations

The servers run FreeBSD, and the workstations run FreeBSD, Windows, and
OS X.

Has anyone encountered this before? Does anyone have any thoughts on
what might be causing it?

My only other thought is that maybe pfSense is doing something strange,
so if I can't come up with any better ideas I'll try plugging the
servers directly into the modem. I'd rather have them behind the routers
though, so this would be a less-than-ideal solution.

-David
|
From: Ronald B. <rb...@rb...> - 2014-03-14 19:31:09
|
This is fixed now.

RBRi

On Fri, 14 Mar 2014 07:34:38 +0100 (CET) Ronald Brill wrote:
>
>Hi,
>
>this seems to be a bug. Please open an issue for the missing support of window.navigate when simulating IE8.
>Fix should be easy.
>
> RBRi
>
>On Thu, 13 Mar 2014 10:03:09 -0500 Raj Rajeswaran wrote:
>>
>>Morning ....
>>
>>I have a button with onClick defined as such:
>>
>>onclick="window.navigate(returnPage)"
>>
>>When the button is clicked, I am getting a ScriptException as follows:
>>
>>*com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot find
>>function navigate in object [object].*
>>
>>Running in INTERNET_EXPLORER_8 mode.
>>
>>
>>Any clues how to fix this?
>>
>>Thanks
>>Ra
>>
|
From: Ronald B. <rb...@rb...> - 2014-03-14 06:34:55
|
Hi,

this seems to be a bug. Please open an issue for the missing support of
window.navigate when simulating IE8. The fix should be easy.

RBRi

On Thu, 13 Mar 2014 10:03:09 -0500 Raj Rajeswaran wrote:
>
>Morning ....
>
>I have a button with onClick defined as such:
>
>onclick="window.navigate(returnPage)"
>
>When the button is clicked, I am getting a ScriptException as follows:
>
>*com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot find
>function navigate in object [object].*
>
>Running in INTERNET_EXPLORER_8 mode.
>
>
>Any clues how to fix this?
>
>Thanks
>Ra
>
|
From: Ahmed A. <asa...@ya...> - 2014-03-13 20:06:42
|
Hi Torak,

HtmlUnit doesn't calculate element positions.

There are other ways to get the element; see
http://htmlunit.sourceforge.net/gettingStarted.html

Ahmed

________________________________
From: Torak twelve <sni...@gm...>
To: htm...@li...
Sent: Thursday, March 13, 2014 8:36 PM
Subject: [Htmlunit-user] Im New At this and need some help

OK, right. I'm new at everything here: computer programming, Java,
HtmlUnit. So I have only a rough idea of how to do everything, and I am
not sure what to do.

First, how do you perform the click() command at a certain position on
the web page? Like:

    htmlpage mypage = GetPage()
    mypage.setCursorPos(500,30)
    button Mybutton = GetPos()
    button.click()

I don't know if that's right, but I hope you can understand what I want
to do: I want to move the mouse to a certain position and then click. I
know hardly anything about this, so please correct me if I'm wrong.
Thanks.

PS: not sure if my message got through last time.
|