From: SourceForge.net <no...@so...> - 2006-05-31 09:34:13
|
Bugs item #1484404, was opened at 2006-05-08 23:07
Message generated for change (Comment added) made by nobody
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=424135&aid=1484404&group_id=39046

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: tv_grab_ch
Group: None
Status: Open
Resolution: Wont Fix
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Stefan Siegl (stesie)
Summary: tv_grab_ch fails

Initial Comment:
I noticed that, for at least the last 3 days, tv_grab_ch has failed to get any data from www.fernsehen.ch:

    http://www.fernsehen.ch/sender/: cannot grab webpage http://www.fernsehen.ch/sender/ (tried 2 times). giving up. sorry at /usr/bin/tv_grab_ch line 730.

See also http://www2.holmlund.se/xmltv-nightly/t_ch_1.log

Thanks for your support,
-olla

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2006-05-31 02:34

Message:
Logged In: NO

I wrote a new tv_grab_ch_bluewin grabber (using fernsehen.bluewin.ch), but it's highly beta at the moment...

----------------------------------------------------------------------

Comment By: olla (olla)
Date: 2006-05-27 02:40

Message:
Logged In: YES
user_id=1519979

Don't panic! A friend and I are currently developing a new grabber_ch (not for fernsehen.ch). I'll keep you informed (e.g. for beta testing). ;-)

Regards, olla

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2006-05-27 02:06

Message:
Logged In: NO

Change the line in Get_nice.pm from

    $ua->agent("xmltv/$XMLTV::VERSION");

to

    $ua->agent("");

This fixes the 403 error message. fernsehen.ch provides better data for Swiss television than tv_today does, so keep it in the Makefile. When tv_today disallows grabbers, what will you do? Stop grabbing?

----------------------------------------------------------------------

Comment By: Robert Eden (rmeden)
Date: 2006-05-26 08:56

Message:
Logged In: YES
user_id=270469

I'll chime in and agree with stesie.

XMLTV has always had a policy to avoid "wars" with data providers. If they ask us not to use their site, or take action directed specifically at us, we abide by their wishes.

Sometimes, when contacted, the data source is willing to work out a solution to provide data... maybe authorization to redistribute and a mirror server, maybe a raw XML feed, who knows. If someone (usually the maintainer) is interested in the feed, try and work something out with the site.

While it may be *legal* for us to use a public site, we don't go where we're not wanted. The data provider could easily make our life difficult and we're not interested in an arms race.

Robert

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2006-05-25 12:28

Message:
Logged In: NO

Change the line in Get_nice.pm from

    $ua->agent("xmltv/$XMLTV::VERSION");

to

    $ua->agent("");

This fixes the 403 error message. fernsehen.ch provides better data for Swiss television than tv_today does, so keep it in the Makefile. When tv_today disallows grabbers, what will you do? Stop grabbing?

----------------------------------------------------------------------

Comment By: Stefan Siegl (stesie)
Date: 2006-05-25 10:52

Message:
Logged In: YES
user_id=313365

Well, you can forge the user-agent line of course. But do you think that it will help?

If the fernsehen.ch guys decided that they don't want us to grab their site, then they can and probably will keep us (you) from doing that. If you change the user agent to `empty' they'll just block that as well (sooner or later), and so on. If this isn't possible they can change their PHP pages slightly, etc.

To make it short, I think stealing their data (i.e. against their will) is the wrong approach and not fair.

Therefore I'm definitely not going to re-enable the grabber unless they allow us (i.e. with the xmltv agent string) to get the data from them (and I'm quite convinced that the other XMLTV developers agree with me here).

In case the tvtoday.de folks block the XMLTV user agent as well, I'd just disable that grabber too (and maybe consider writing another grabber). What else do you expect?

cheers,
stesie

PS: folks, please don't do what $anonymous has suggested.

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2006-05-25 09:44

Message:
Logged In: NO

Change the line in Get_nice.pm from

    $ua->agent("xmltv/$XMLTV::VERSION");

to

    $ua->agent("");

This fixes the 403 error message. fernsehen.ch provides better data for Swiss television than tv_today does, so keep it in the Makefile. When tv_today disallows grabbers, what will you do? Stop grabbing?

----------------------------------------------------------------------

Comment By: Stefan Siegl (stesie)
Date: 2006-05-11 09:12

Message:
Logged In: YES
user_id=313365

Well, if you choose another data source, you need to mostly rewrite the grabber, since these webpages are designed in a different way. That in turn takes time, time I don't have. However, you or somebody else can of course do this.

If you decide to stick to fernsehen.ch, you might not even need to adjust the grabber (except for a different URI, for example).

cheers,
stesie

----------------------------------------------------------------------

Comment By: olla (olla)
Date: 2006-05-10 01:44

Message:
Logged In: YES
user_id=1519979

If (as you say) the data quality is poor anyhow, why not choose another webpage?
I don't know the prerequisites for a grabber to work, but here are some pages featuring Swiss TV program:

www.tvstar.ch
www.tvtv.ch
http://fernsehen.bluewin.ch/programm/

regards,
olla from Switzerland

----------------------------------------------------------------------

Comment By: Stefan Siegl (stesie)
Date: 2006-05-09 12:49

Message:
Logged In: YES
user_id=313365

hi,

seems like tv_grab_ch's life is over - fernsehen.ch's HTTP server returns a 403 for every request with "xmltv" at the beginning of the user agent. Therefore I must conclude that they don't want us to scrape their site any more. A reason for this might be that they have recently started offering a membership account (which costs 24 EUR a year).

In case any Swiss feels like asking them what could be done, like providing the data for money (well, it should be better data in that case; what we had before was far from perfect), etc. -- please feel free to do so. I'm not Swiss, and thus not willing to pay for the data, nor have I got the time to do that job.

In case of any open questions please feel free to ask me, however.

For the moment I'm going to disable the grabber in the Makefile, etc.

cheers,
stesie

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=424135&aid=1484404&group_id=39046
|