From: Adam P. <ap...@th...> - 2002-10-24 19:53:30
Hi everyone,

First off, this is a great product. *All* of my troubles have turned out to be related to our new load-balancing architecture, not htdig, and when I've worked out all the kinks I'll post a workaround summary for others.

A couple of questions I couldn't find the answers to in the archives:

1. Is there a way to strip out elements from the search returns? In our case, each <title> tag in the site includes the site name, so headers from search returns look like this:

The Onion | Damn You, Hearst!
The Onion | I Miss My Old Sled
The Onion | Drop Dead, Every Last One of You!

Pretty silly, right? I'd like to parse that repeating element out, preferably without employing an auxiliary script.

2. I see that there are configuration attributes for translate_latin1, translate_amp, and translate_lt_gt. I thought translate_latin1: false might work, but I'm still getting the entity — printing out on the page instead of the em-dash in search results. Is there a config attribute I'm missing?

Thanks!

--
Adam Powell
Web Programmer, The Onion
America's Finest News Source
ap...@th... | voice: 608.256.1372 | fax: 608.256.2535
www.theonion.com