extract http body

Forum: Open Discussion

Creator: Duro

Created: 2010-05-28

Updated: 2012-09-04

Duro - 2010-05-28

Hi, I'd like to extract an HTTP message body, which is a PDF file. The file is
served from a script that sends its response with the content type
"application/force-download"; the PDF file is in the body. Do you have to use
http.client for this, or is there some other variable that gives me the
message body?

thanks, Juraj

Anonymous - 2010-05-28

I believe you'll need to do the processing yourself. WebHarvest retrieves the
response, and it's up to you to slice it to extract the PDF. I suggest looking
into the XPath string functions, specifically substring-after.
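
To make the slicing idea concrete, here is a minimal sketch in Python, outside of WebHarvest. It is not WebHarvest configuration and not the tool's own API. It assumes you already hold the raw response (headers plus body) as bytes; the names substring_after and extract_pdf are hypothetical helpers invented for illustration. It works on bytes rather than text so that no character-encoding conversion touches the PDF data.

```python
def substring_after(data: bytes, marker: bytes) -> bytes:
    """Return everything after the first occurrence of marker.

    Mirrors the convention of XPath's substring-after: if the marker
    is not found, the result is empty.
    """
    _head, sep, tail = data.partition(marker)
    return tail if sep else b""


def extract_pdf(raw_response: bytes) -> bytes:
    """Slice the PDF out of a raw HTTP response (hypothetical helper).

    Assumes the headers end with a blank line (CRLF CRLF) and that the
    body contains a PDF, which starts with the "%PDF-" signature.
    """
    body = substring_after(raw_response, b"\r\n\r\n")
    start = body.find(b"%PDF-")
    if start == -1:
        raise ValueError("no PDF signature found in response body")
    return body[start:]


if __name__ == "__main__":
    # Tiny fake response, only to show the call; not real PDF content.
    raw = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: application/force-download\r\n"
        b"\r\n"
        b"%PDF-1.4\nfake content\n%%EOF\n"
    )
    with open("out.pdf", "wb") as fh:
        fh.write(extract_pdf(raw))
```

If the tool already hands you only the body, the first step is unnecessary and the check for the "%PDF-" signature is all that remains.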

good luck!
