jcrawler-main Mailing List for Crawler/Load Tester in Java
Status: Beta
Brought to you by: idumali
From: Jovana M. <jo...@we...> - 2012-10-16 14:10:28
|
Dear Sir,

I was wondering whether your organization received my last mails explaining the nature of my proposal. Once again, I would really appreciate the chance to translate your articles and thereby make them more accessible to readers in the former Yugoslavia. Please do not hesitate to ask me any questions that would let me explain my purpose more adequately. I believe this step would be very beneficial, and I would really appreciate it if you could take it into consideration. I am looking forward to hearing from you.

Yours sincerely,

Jovana Milutinovich
http://science.webhostinggeeks.com/
jo...@we...
Tel: +381 63 8049100
|
From: Jovana M. <jo...@we...> - 2012-10-10 13:40:57
|
Dear Sir,

I sent you my request for permission to translate your article a week ago, and it was unclear whether your organization had time to take a look at it. Given my goal of connecting people from the former Yugoslavia with your article and the information in it, I would be very grateful if your website could take my proposal into consideration. I hope that you will find the time to review it.

Yours sincerely,

Jovana Milutinovich
http://science.webhostinggeeks.com/
jo...@we...
Tel: +381 63 8049100
|
From: Jovana M. <jo...@we...> - 2012-10-03 10:21:49
|
Dear Sir,

My name is Jovana. I found your article extremely interesting and would like to spread the word to people from the former Yugoslavia. Here is the URL of your article: http://jcrawler.sourceforge.net/index.php

Would you mind if I translated your article into Serbo-Croatian and posted it on our site? My purpose is to help people from the former Yugoslavia better understand some very useful information about computer science.

Some quick info about myself: I was born in Yugoslavia, Europe. The former Yugoslavia consisted of the now fully independent states of Serbia, Montenegro, Croatia, Bosnia & Herzegovina, Slovenia, and Macedonia, which are all united by the Serbo-Croatian language. I am currently studying Computer Science at the University of Belgrade, Serbia.

With kind regards,

Jovana Milutinovich
http://science.webhostinggeeks.com/
jo...@we...
Tel: +381 63 8049100
|
From: irakli <ir...@gm...> - 2011-01-05 01:15:36
|
It does not count load in terms of "simultaneous users", because no such thing exists in reality; it is a metric fabricated by simplistic load-testing tools and has very little meaning. Loadtest.io (and JCrawler) measure load in page-requests per second. The amount of load you can generate depends on the computer you are using and the network connection you have.

"JavaScript-embedded web application" and "AJAX" or "Web 2.0" web application can mean anything. If your web system exposes URL links as proper HTML anchor tags, they will be crawled; if not, they won't.

_______________________________________________
Jcrawler-main mailing list
Jcr...@li...
https://lists.sourceforge.net/lists/listinfo/jcrawler-main
|
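The anchor-tag point above can be illustrated with a minimal sketch (hypothetical code, not JCrawler's source): a crawler of this kind only discovers links that appear as HTML `<a href=...>` tags in the fetched markup, so endpoints reachable only through JavaScript handlers are invisible to it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: link discovery by scanning fetched HTML for anchor tags.
// Links generated purely by JavaScript never appear in the markup, so this
// kind of extraction cannot see them.
public class AnchorExtractor {
    private static final Pattern HREF =
        Pattern.compile("<a\\s[^>]*href\\s*=\\s*[\"']([^\"']+)[\"']",
                        Pattern.CASE_INSENSITIVE);

    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));   // the href value of each anchor tag
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"/docs\">Docs</a>"
                    + "<span onclick=\"load('/ajax')\">Ajax-only link</span></p>";
        System.out.println(extractLinks(html)); // prints [/docs]
    }
}
```

The regex is a simplification; a production crawler would use a real HTML parser, but the visibility rule is the same: no anchor tag, no crawl.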
From: rajani p. <raj...@gm...> - 2011-01-05 00:34:54
|
Hi,

1. Can I run more than 1000 users simultaneously on one computer?
2. Does JCrawler support JavaScript-embedded web applications?
3. Does JCrawler support AJAX / Web 2.0 web-based applications?

Thanks,
Raj
|
From: rajani p. <raj...@gm...> - 2011-01-05 00:05:26
|
Hi,

1. Can I run more than 1000 users simultaneously on one computer?
2. Does JCrawler support JavaScript-embedded web applications?
3. Does JCrawler support AJAX / Web 2.0 web-based applications?

Thanks,
Raj
|
From: irakli <ir...@gm...> - 2011-01-04 23:11:49
|
Raj,

JCrawler has moved to a new project called Loadtest.io, and you can find usage documentation at:
https://github.com/inadarei/loadtestio/blob/master/README.markdown
|
From: rajani p. <raj...@gm...> - 2011-01-04 23:03:07
|
Hi,

Does anybody have a JCrawler user guide or tutorial (not the installation guide)? If anybody has one, please send it to me; I would like to become familiar with it.

Thanks,
Raj
|
From: rajani p. <raj...@gm...> - 2011-01-04 22:54:00
|
Hi,

Does anybody have a JCrawler user guide or tutorial (not the installation guide)? If anybody has one, please send it to me; I would like to become familiar with it.

Thanks,
Raj
|
From: irakli <ir...@gm...> - 2010-11-01 06:59:41
|
Dear JCrawler users,

I wanted to thank you for being loyal JCrawler users over the years, and to let you know that I am ceasing development of JCrawler effective immediately. It has been a tool that I have used a lot personally, and one that, hopefully, others found useful as well. That said, most of it was written six years ago, and its architecture is just as old. I have no desire to keep contributing to the outdated architecture.

I still believe the main idea behind JCrawler is a valid one. That is why I have lately been spending a lot of time writing a re-architected tool with a similar intention. I have released the first version of the code on GitHub: http://loadtest.io and that is where I intend to continue development. Feel free to download it, try it, and contribute.

Thank you,
Irakli
|
From: irakli <ir...@gm...> - 2010-10-05 16:25:35
|
Not really. It is designed to parse HTML and crawl URLs. That said, if your Silverlight application is a public, SEO-optimized site, you are probably exposing every endpoint you have in Silverlight as HTML URLs as well, in which case it would work.
|
From: John H. <jo...@of...> - 2010-10-05 14:22:09
|
Hello everyone,

My name is John. I'm working as a test engineer, and I came across JCrawler while searching for stress- and load-testing tools. As my current project is developed in Silverlight, I have a quick question: does JCrawler support this Microsoft technology, Silverlight?

Thanks in advance.

Best regards,

John Harper
|
From: Krishna C. R. B. <cha...@gm...> - 2009-07-30 17:56:21
|
Can we stop JCrawler after it is done with one pass? In the log we see many repeated URLs; that is the reason I had this doubt. Please let me know if we can make it stop after one pass, either by a code change or a config change.

Thank you very much for the time and patience,

-KC
|
From: Eddie B. <Ed...@ma...> - 2009-07-29 21:09:26
|
Remove unsubscribe |
From: Krishna C. R. B. <cha...@gm...> - 2009-07-29 20:45:48
|
Can we stop JCrawler after it is done with one pass? In the log we see many repeated URLs; that is the reason I had this doubt. Please let me know if we can make it stop after one pass, either by a code change or a config change.

Thank you very much for the time and patience,

-KC
|
From: irakli <ir...@gm...> - 2009-07-29 20:22:25
|
JCrawler only crawls unique URLs; it will not hit the same URL twice in one pass. However, to sustain load for a reasonable time, once it has finished one pass and exhausted all unique URLs, it will restart.

That is something that could be made configurable; it just has not come up as much of a need so far.
|
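The policy described above, visiting each unique URL once per pass and restarting only after the frontier is exhausted, can be sketched as follows. This is a hypothetical illustration, not JCrawler's actual source; the in-memory "site" map stands in for real HTTP fetches.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch of the crawl policy: each unique URL is fetched at most
// once per pass, and a pass ends when the frontier of discovered links empties.
public class OnePassCrawler {

    // fetch maps a URL to the links discovered on that page;
    // returns the order in which URLs were visited during one pass.
    public static List<String> crawlOnePass(List<String> seeds,
                                            Function<String, List<String>> fetch) {
        Deque<String> frontier = new ArrayDeque<>(seeds);
        Set<String> visited = new HashSet<>();
        List<String> order = new ArrayList<>();
        while (!frontier.isEmpty()) {
            String url = frontier.poll();
            if (!visited.add(url)) continue;   // duplicate within this pass: skip
            order.add(url);
            frontier.addAll(fetch.apply(url)); // enqueue newly discovered links
        }
        // A load tester would now clear `visited` and restart from the seeds;
        // a cache-warming run, as KC wants, would simply stop here.
        return order;
    }

    public static void main(String[] args) {
        // A tiny in-memory "site" stands in for real HTTP fetches.
        Map<String, List<String>> site = Map.of(
            "/",  List.of("/a", "/b"),
            "/a", List.of("/b", "/"),  // back-links do not trigger re-fetches
            "/b", List.of());
        System.out.println(crawlOnePass(List.of("/"),
            u -> site.getOrDefault(u, List.of()))); // prints [/, /a, /b]
    }
}
```

Under this model, "stop after one pass" is exactly the difference between returning when the frontier empties and clearing the visited set to restart.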
From: Krishna C. R. B. <cha...@gm...> - 2009-07-29 20:17:31
|
Hello All,

We are trying to use JCrawler to crawl through our site to build up the cache at regular intervals, and we do not want it to run and generate load. Is there a way to stop the crawl after it hits all the unique URLs? Also, more importantly, can we make it crawl only unique URLs?

Thank you,
KC
|
From: Aditya LV <lva...@gm...> - 2009-03-14 22:19:11
|
Hi everyone,

I have to load-test, for 300 virtual users, a Java application developed using applets. The protocol is TCP/IP. Is there any open-source tool that meets my requirement? Please let me know ASAP if there is one. Thanks in advance to anyone who can give me this information.

Thanks,
Aditya
|
From: Eddie B. <Ed...@ma...> - 2009-03-13 20:03:56
|
Thanks. No. Hopefully this problem will eventually be solved.

Eddie

From: Neo Wang [mailto:wan...@gm...]
Sent: Friday, March 13, 2009 4:56 PM
To: Eddie Barcellos
Subject: Re: [Jcrawler-main] Stops working after a while...

Hi Eddie, it probably is a bug. As I understand it now, this is how jcrawler works (referring to http://sourceforge.net/mailarchive/forum.php?thread_name=36EC972D677CE544B5441B43BC7A0C2112E069%40dgexchange.DGF.local&forum_name=jcrawler-main):

1. It puts every URL it can find into a FIFO queue.
2. It keeps fetching 20 URLs from the FIFO every 250 milliseconds (without waiting for each page to be fully fetched), which means URLs are not repeated. Each page being fetched is one active thread, so the active-thread count keeps rising while pages are in flight; when a page is fully fetched, the count drops by one.
3. It restarts the FIFO queue once all URLs have been used.

I guess the FIFO fails to restart at a certain point, and the active threads then go all the way down to zero; that is what we see in your log and in mine.

Thanks,
Neo
|
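Neo's three steps can be modeled with a short sketch. This is a simplified, hypothetical dispatcher, not JCrawler's actual code: a timer drains up to 20 URLs from the FIFO every 250 ms and hands each to a worker, so the active-thread gauge rises while fetches are in flight and falls as they complete.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the dispatch loop: every `periodMs` a timer drains up
// to `batchSize` URLs from the FIFO; each in-flight "fetch" is one active
// thread. Returns the number of URLs fetched once the queue is drained.
public class BatchDispatcher {
    public static int dispatchAll(BlockingQueue<String> fifo,
                                  int batchSize, long periodMs)
            throws InterruptedException {
        AtomicInteger fetched = new AtomicInteger();
        AtomicInteger active = new AtomicInteger();   // the "Active Threads" gauge
        CountDownLatch done = new CountDownLatch(fifo.size());
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        ExecutorService workers = Executors.newCachedThreadPool();
        timer.scheduleAtFixedRate(() -> {
            for (int i = 0; i < batchSize; i++) {
                String url = fifo.poll();
                if (url == null) return;              // queue exhausted this round
                active.incrementAndGet();             // gauge rises on dispatch
                workers.submit(() -> {                // simulated page fetch
                    fetched.incrementAndGet();
                    active.decrementAndGet();         // gauge falls on completion
                    done.countDown();
                });
            }
        }, 0, periodMs, TimeUnit.MILLISECONDS);
        done.await();                                 // wait for all fetches
        timer.shutdownNow();
        workers.shutdown();
        return fetched.get();
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> fifo = new LinkedBlockingQueue<>();
        for (int i = 0; i < 50; i++) fifo.add("/page" + i);
        System.out.println(dispatchAll(fifo, 20, 250)); // prints 50
    }
}
```

In this model the hang Neo and Eddie observe would correspond to step 3 never re-filling the queue: the timer keeps firing, finds the FIFO empty, and the gauge decays to zero.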
From: Eddie B. <Ed...@ma...> - 2009-03-12 22:58:46
|
Thanks Neo. I apologize, but I don't have the time to debug this tool... I guess I will have to learn to use Pylot.

Best,

Eddie
|
From: Neo W. <wan...@gm...> - 2009-03-12 22:43:31
|
Hi Eddie,

It is the same for me. As far as I can tell, the number of active threads climbs to a peak, then drops to 0, and after that the crawler looks dead. So we probably need to figure out what "active threads" really means... I have to read the code now anyway because of my job, so I may find something out and let you know.

PS: I wrote a Java program to analyze monitor.log. It reports the maximum active threads (with the current speed at that moment) and the maximum current speed (with the active threads at that moment), and it also converts monitor.log into CSV files so you can build a more readable chart (you have to do that manually in Excel). Here is the code for your reference.

package foo;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.util.ArrayList;

public class C {

    public static void main(String[] args) throws Exception {
        BufferedReader file = new BufferedReader(new FileReader(
                "C:/Documents and Settings/nwang/My Documents/Downloads/jcrawler/dist/monitor.log"));
        String s = file.readLine();
        ArrayList<String> times = new ArrayList<String>();
        ArrayList<String> averageSpeeds = new ArrayList<String>();
        ArrayList<String> currentSpeeds = new ArrayList<String>();
        ArrayList<String> activeThreads = new ArrayList<String>();
        while (s != null) {
            if (s.indexOf("Elapsed") >= 0) {
                // The log lines carry no date; times in hours 00-09 are
                // assumed to fall after midnight, i.e. on the next day.
                String time = s.substring(0, 8);
                if (time.substring(0, 1).equals("0")) {
                    times.add("2009-02-19 " + time);
                } else {
                    times.add("2009-02-18 " + time);
                }
            } else if (s.indexOf("Average Speed:") >= 0) {
                averageSpeeds.add(s.substring("Average Speed:".length(),
                        s.indexOf("pages/second fetched")).trim());
            } else if (s.indexOf("Current Speed:") >= 0) {
                currentSpeeds.add(s.substring("Current Speed:".length(),
                        s.indexOf("pages/second fetched")).trim());
            } else if (s.indexOf("Active Threads:") >= 0) {
                activeThreads.add(s.substring("Active Threads:".length()).trim());
            }
            s = file.readLine();
        }
        file.close();

        BufferedWriter writer = new BufferedWriter(new FileWriter(
                "C:/Documents and Settings/nwang/My Documents/Downloads/jcrawler/dist/monitor1.csv"));
        writer.append("Timeline, Average Speed (pages/second), Current Speed (pages/second), Active Threads\n");
        int i = 0;
        int maxActiveThreads = 0;
        String maxActiveThreadsTime = null;
        String maxActiveThreadsSpeed = null;
        double maxSpeed = 0.00;
        String maxSpeedThreads = null;
        String maxSpeedTime = null;
        for (int j = 0; j < times.size(); j++) {
            i++;
            int x = Integer.parseInt(activeThreads.get(j));
            if (x > maxActiveThreads) {
                maxActiveThreads = x;
                maxActiveThreadsTime = times.get(j);
                maxActiveThreadsSpeed = currentSpeeds.get(j);
            }
            double d = Double.parseDouble(currentSpeeds.get(j));
            if (d > maxSpeed) {
                maxSpeed = d;
                maxSpeedTime = times.get(j);
                maxSpeedThreads = activeThreads.get(j);
            }
            // Start a new CSV file every 10000 rows so the charts stay manageable.
            if (i == 10000) {
                writer.close();
                writer = new BufferedWriter(new FileWriter(
                        "C:/Documents and Settings/nwang/My Documents/Downloads/jcrawler/dist/monitor" + j + ".csv"));
                writer.append("Timeline, Average Speed (pages/second), Current Speed (pages/second), Active Threads\n");
                i = 0;
            }
            writer.append(times.get(j) + "," + averageSpeeds.get(j) + ","
                    + currentSpeeds.get(j) + "," + activeThreads.get(j) + "\n");
        }
        System.out.println("max threads:" + maxActiveThreads + "\t time: "
                + maxActiveThreadsTime + "\t speed: " + maxActiveThreadsSpeed);
        System.out.println("max speed:" + maxSpeed + "\t time: " + maxSpeedTime
                + "\t threads: " + maxSpeedThreads);
        writer.close();
    }
}

On Thu, Mar 12, 2009 at 8:10 AM, Eddie Barcellos <Ed...@ma...> wrote:
> First I wanted to thank the developer(s) for making this product available.
> I tried a LOT of different open source solutions but this one seemed to be
> the best!
>
> I just started messing around with jcrawler, and first I had problems with
> the heap size. Then I set -Xms64m, -Xmx512m and the heap size problem went
> away. I have it set to fetch every 1s and log every 3s.
>
> Now it just stops fetching after a couple of hours and it doesn't show an error
> message. 
the log is at > http://boxstr.com/files/4993604_jug46/monitor.zip > > Thanks in advance, > > Eddie > > > > > ------------------------------------------------------------------------------ > Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are > powering Web 2.0 with engaging, cross-platform capabilities. Quickly and > easily build your RIAs with Flex Builder, the Eclipse(TM)based development > software that enables intelligent coding and step-through debugging. > Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com > _______________________________________________ > Jcrawler-main mailing list > Jcr...@li... > https://lists.sourceforge.net/lists/listinfo/jcrawler-main > |
From: Eddie B. <Ed...@ma...> - 2009-03-12 15:32:57
|
First I wanted to thank the developer(s) for making this product available. I tried a LOT of different open source solutions but this one seemed to be the best! I just started messing around with JCrawler, and first I had problems with the heap size. Then I set -Xms64m, -Xmx512m and the heap size problem went away. I have it set to fetch every 1s and log every 3s. Now it just stops fetching after a couple of hours and it doesn't show an error message. The log is at http://boxstr.com/files/4993604_jug46/monitor.zip Thanks in advance, Eddie |
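[Editor's note: the flags mentioned above are the standard JVM heap options, passed as -Xms64m -Xmx512m on the java command line. A minimal standalone sketch (not part of JCrawler; the jar name in the comment is illustrative) to verify the setting took effect:]

```java
// Run with the same heap flags used for JCrawler, e.g.:
//   java -Xms64m -Xmx512m HeapCheck
// (with JCrawler itself the flags go on its launch command; the exact
// jar/class name depends on your installation).
public class HeapCheck {
    public static void main(String[] args) {
        // maxMemory() reports the -Xmx ceiling the JVM was granted.
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap: " + maxMb + " MB");
    }
}
```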
From: Neo W. <wan...@gm...> - 2009-03-04 19:01:43
|
Here is my log:

08:23:49::382 ====== Elapsed: 973 mins =====
Average Speed: 1.16 pages/second fetched
Current Speed: 0.72 pages/second fetched
Active Threads: 0

Doesn't it mean JCrawler is not fetching pages when 'active threads' is zero? Then how come 'current speed' is not zero? |
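[Editor's note: one possible explanation, a guess rather than JCrawler's actual implementation: if "Current Speed" is computed over a trailing time window, it can stay non-zero for a while after the last fetch completes, even though no threads are active any more. A minimal sketch of such a windowed rate:]

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a "current speed" counter averaged over a
// trailing window. After the last fetch, the rate decays to zero only
// once the window drains, so a zero-thread, non-zero-speed sample is
// possible in the interim.
public class WindowedRate {
    private final long windowMillis;
    private final Deque<Long> completions = new ArrayDeque<Long>();

    public WindowedRate(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    public void recordFetch(long nowMillis) {
        completions.addLast(nowMillis);
    }

    public double pagesPerSecond(long nowMillis) {
        // Drop completions that fell out of the trailing window.
        while (!completions.isEmpty()
                && completions.peekFirst() < nowMillis - windowMillis) {
            completions.removeFirst();
        }
        return completions.size() * 1000.0 / windowMillis;
    }

    public static void main(String[] args) {
        WindowedRate rate = new WindowedRate(10_000); // 10-second window
        // 10 fetches in the first 5 seconds, then all threads die.
        for (long t = 0; t < 5_000; t += 500) rate.recordFetch(t);
        System.out.println(rate.pagesPerSecond(6_000));  // still 1.0
        System.out.println(rate.pagesPerSecond(20_000)); // drained to 0.0
    }
}
```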
From: Loritsch, B. <BLo...@ci...> - 2008-11-12 13:44:22
|
You just have to add your certificate to the trust store first. Use keytool to do that:

keytool -import -keystore "/path/to/keystore" -trustcacerts -file "/my/self/signed/cert"

Be sure to answer "yes" when keytool asks if you want to trust it.

From: dur...@as... [mailto:dur...@as...]
Sent: Wednesday, November 12, 2008 5:55 AM
To: ir...@gm...
Cc: jcr...@li...
Subject: Re: [Jcrawler-main] https -- in jcrawler

Hi Irakli, If my certificate is self-signed, is it OK for JCrawler? Regards, Durgababu. Internet ir...@gm... 12-11-08 03:19 PM To jcr...@li... cc Subject Re: [Jcrawler-main] https -- in jcrawler Looks like you need to add the SSL certificate of your website to your JVM keystore. On Wed, Nov 12, 2008 at 1:41 AM, <dur...@as...> wrote:
>
> Hi Jcrawler-main,
>
> When I am crawling an https URL, I am getting the error below.
> Could you please let me know how to crawl https URLs.
>
> Regards,
> Durgababu.
>
> [THREAD#1 CREATED 18:38:43::089] WARN com.jcrawler.scheduler.FetcherTask
> - Could not fetch URL:
> https://ivision-sa.staging.echonet:30443/jsp/Authentifica
> tion.jsp/
> javax.net.ssl.SSLHandshakeException:
> sun.security.validator.ValidatorException:
> PKIX path building failed:
> sun.security.provider.certpath.SunCertPathBuilderExce
> ption: unable to find valid certification path to requested target
> 985 [THREAD#1 CREATED 18:38:43::089] WARN
> com.jcrawler.scheduler.FetcherTask
> - resultedHTML is null for url
> https://ivision-sa.staging.echonet:30443/jsp/Auth
> entification.jsp/
>
> This message and any attachments (the "message") is
> intended solely for the addressees and is confidential.
> If you receive this message in error, please delete it and
> immediately notify the sender. Any use not in accord with
> its purpose, any dissemination or disclosure, either whole
> or partial, is prohibited except formal approval. The internet
> can not guarantee the integrity of this message. 
> BNP PARIBAS (and its subsidiaries) shall (will) not
> therefore be liable for the message if modified.
> Do not print this message unless it is necessary,
> consider the environment.
>
> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
> Build the coolest Linux based applications with Moblin SDK & win great prizes
> Grand prize is a trip for two to an Open Source event anywhere in the world
> http://moblin-contest.org/redirect.php?banner_id=100&url=/
> _______________________________________________
> Jcrawler-main mailing list
> Jcr...@li...
> https://lists.sourceforge.net/lists/listinfo/jcrawler-main
>
**********************************************************************
This e-mail, including attachments, may include privileged, confidential and/or proprietary information protected by state and/or federal law, and may be used only by the person or entity to which it is addressed. If the reader of this e-mail is not the intended recipient or his or her authorized agent, the reader is hereby notified that any use, disclosure, distribution or copying of this e-mail (or attachments) is prohibited. If you have received this e-mail in error, please notify the sender by replying to this message and delete this e-mail, and any attachments, immediately.
**********************************************************************
|
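[Editor's note: as an alternative to importing the certificate into the JVM's default cacerts file with keytool, the JVM can be pointed at a separate trust store via the standard javax.net.ssl system properties. A minimal sketch; the path and password below are placeholders, and the properties must be set before the first HTTPS connection is opened:]

```java
// Point the JVM at a custom trust store containing the self-signed
// certificate. Path and password are placeholders, not real values.
public class TrustStoreSetup {
    public static void main(String[] args) {
        System.setProperty("javax.net.ssl.trustStore", "/path/to/keystore");
        System.setProperty("javax.net.ssl.trustStorePassword", "changeit");
        // Equivalent command-line form:
        //   java -Djavax.net.ssl.trustStore=/path/to/keystore \
        //        -Djavax.net.ssl.trustStorePassword=changeit ...
        System.out.println(System.getProperty("javax.net.ssl.trustStore"));
    }
}
```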
From: <dur...@as...> - 2008-11-12 10:55:37
|
Hi Irakli, If my certificate is self-signed, is it OK for JCrawler?

Regards,
Durgababu.

Internet ir...@gm... 12-11-08 03:19 PM To jcr...@li... cc Subject Re: [Jcrawler-main] https -- in jcrawler

Looks like you need to add the SSL certificate of your website to your JVM keystore.

On Wed, Nov 12, 2008 at 1:41 AM, <dur...@as...> wrote:
>
> Hi Jcrawler-main,
>
> When I am crawling an https URL, I am getting the error below.
> Could you please let me know how to crawl https URLs.
>
> Regards,
> Durgababu.
>
> [THREAD#1 CREATED 18:38:43::089] WARN com.jcrawler.scheduler.FetcherTask
> - Could not fetch URL:
> https://ivision-sa.staging.echonet:30443/jsp/Authentifica
> tion.jsp/
> javax.net.ssl.SSLHandshakeException:
> sun.security.validator.ValidatorException:
> PKIX path building failed:
> sun.security.provider.certpath.SunCertPathBuilderExce
> ption: unable to find valid certification path to requested target
> 985 [THREAD#1 CREATED 18:38:43::089] WARN
> com.jcrawler.scheduler.FetcherTask
> - resultedHTML is null for url
> https://ivision-sa.staging.echonet:30443/jsp/Auth
> entification.jsp/
>
> This message and any attachments (the "message") is
> intended solely for the addressees and is confidential.
> If you receive this message in error, please delete it and
> immediately notify the sender. Any use not in accord with
> its purpose, any dissemination or disclosure, either whole
> or partial, is prohibited except formal approval. The internet
> can not guarantee the integrity of this message.
> BNP PARIBAS (and its subsidiaries) shall (will) not
> therefore be liable for the message if modified.
> Do not print this message unless it is necessary,
> consider the environment.
>
> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
> Build the coolest Linux based applications with Moblin SDK & win great prizes
> Grand prize is a trip for two to an Open Source event anywhere in the world
> http://moblin-contest.org/redirect.php?banner_id=100&url=/
> _______________________________________________
> Jcrawler-main mailing list
> Jcr...@li...
> https://lists.sourceforge.net/lists/listinfo/jcrawler-main
>
|