From: xratemx <xr...@go...> - 2014-08-20 14:08:44

Hey archiving community,

I am trying to find a way to record/crawl HTTPS websites using a proxy. I have already tested a few WARC proxies, and they do a great job for HTTP! But I'm totally lost when it comes to recording HTTPS content. Since all of them support HTTPS, I am kind of confused, and I am curious what you guys use to archive HTTPS content.

I tested these tools:

LiveArchivingProxy + warcwriter (https://github.com/INA-DLWeb/LiveArchivingProxy)
WarcMITMProxy (https://github.com/odie5533/WarcMITMProxy)
warcprox (https://github.com/internetarchive/warcprox)

Note: I don't want to harvest whole websites, only a few pages, so Heritrix is probably not an option?!

Thank you