#41 Port Web Harvest to HttpComponents 4.x

Milestone: 2.1.0rc2-RELEASE
Status: open
Owner: nobody
Labels: None
Priority: 9
Updated: 2013-03-26
Created: 2013-03-25
Private: No

Please port the Web Harvest core to Apache HttpComponents 4.x (currently 4.2.3).
This is hard work, but it would give you a more up-to-date core library and fewer of the bugs that come from commons-httpclient 3.1. The commons-httpclient 3.1 bugs are acknowledged by its own team. HttpComponents 4.x is nearly a full redesign of the library, so you can't simply replace the library and leave the code as is.
Another important reason is that commons-httpclient 3.1 doesn't work on cloud hosting platforms such as Jelastic and OpenShift, so Web Harvest needs HttpComponents 4.x to be compatible with those environments.
For these reasons I think fully porting Web Harvest to HttpClient 4 would be a great enhancement, and your hard work will be repaid with a more stable, better-designed library with a lot of bugs fixed.

Discussion

  • Robert Bala

    Robert Bala - 2013-03-25
    • milestone: Backlog --> 2.1.0rc2-RELEASE
     
  • Alessandro Accardo

    Hey folks, I've downloaded the complete source of Web Harvest from SVN, revision 644.
    I'm trying to port Web Harvest to HttpComponents 4.x, as I requested above.
    I plan to complete the port by the end of this week, if I'm lucky, so I'm asking myself (and you) whether there are test cases or specific unit tests I can run to finish the port without introducing regressions or new bugs related to the new library. Can you advise me? Can you also tell me whether porting revision 644 is the right choice, and if so, where I should put the updated code? I want to help your community by following your rules and your planning, so that you can keep my code and maintain it, and so that this big change doesn't hinder your work.
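
One way to guard the HTTP layer against regressions during a port like this is a test that talks to a throwaway local server instead of the real network. The following is only a minimal sketch under stated assumptions, not something taken from the Web Harvest test suite: the class name, the `fetch` and `roundTrip` helpers, and the `/ping` path are all hypothetical, and the JDK's `HttpURLConnection` stands in for whatever HttpComponents 4.x based call the ported code would make.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class LocalHttpRegressionSketch {

    // Hypothetical stand-in for the HTTP layer under test. In a real port this
    // would delegate to the HttpComponents 4.x based code instead.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    // Starts a local server on a free port, serves `body` at /ping,
    // fetches it once through fetch(), and stops the server again.
    static String roundTrip(String body) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            byte[] payload = body.getBytes(StandardCharsets.UTF_8);
            server.createContext("/ping", exchange -> {
                exchange.sendResponseHeaders(200, payload.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(payload);
                }
            });
            server.start();
            try {
                int port = server.getAddress().getPort();
                return fetch("http://127.0.0.1:" + port + "/ping");
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String expected = "hello from the local server";
        String actual = roundTrip(expected);
        if (!expected.equals(actual)) {
            throw new AssertionError("unexpected body: " + actual);
        }
        System.out.println("round trip matched");
    }
}
```

Because the server binds to port 0, the operating system picks a free port, so the check does not depend on any external host being reachable. Swapping the body of `fetch` for the ported client call would let the same check run before and after the library change.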

     
  • Alessandro Accardo

    Hey all, I think I've got something: here is a patch file. I don't know if you have a way to handle it, but I have the complete source too. These are the minimal changes required to get WH working with HttpClient 4.x, and it seems to work. I ran the unit tests and fewer of them fail than with the mainstream version; I don't know why they fail even in the trunk. In any case, the tests related to HTTP connections seem to pass, so it might work.
    If you have an environment where the trunk source is fully working and you can run the patched classes, please check them and let everyone know. Bye!

     
