#61 Language sensitive searches

Status: open
Owner: nobody
Labels: htdig (31)
Priority: 5
Updated: 2003-03-21
Created: 2003-03-21
Private: No

Hi,

I am setting up a bilingual website with content
negotiation. ht://Dig does not work well with that.
It does not tell the server which language it would
like the response in (it sends no Accept-Language: header).

Currently, only the default language of each webpage
gets indexed, and the language set in the searcher's
browser is not taken into account at all. What a pity.

Ideally, one htdig run would be able to index more than
one language by requesting each of them separately. More
modestly, it would be useful to be able to run ht://Dig
twice, with different language settings, and perhaps
even with different databases. Hacking htsearch to
pick the preferred language is easy (see the attached
script).

I made a quick stab at it but failed, on account of
misunderstanding how the configuration file data
(specifically, the "locale" setting) is read into
htdig. Perhaps I'll give it another shot later.
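
To illustrate what "requesting each language separately" means at the HTTP level, here is a minimal Python sketch. It is not ht://Dig code; the function names and the example URL are assumptions made up for illustration. It only shows one request per language, each carrying its own Accept-Language header, which a server doing content negotiation can then answer with the matching variant.

```python
import urllib.request


def build_language_requests(url, languages):
    """Build one request per language, each with its own Accept-Language header.

    Hypothetical helper: nothing is sent until the request is opened.
    """
    return {
        lang: urllib.request.Request(url, headers={"Accept-Language": lang})
        for lang in languages
    }


def fetch_variants(url, languages, timeout=10):
    """Fetch the negotiated variant of url for every language (needs network)."""
    pages = {}
    for lang, request in build_language_requests(url, languages).items():
        with urllib.request.urlopen(request, timeout=timeout) as response:
            pages[lang] = response.read()
    return pages
```

An indexer working this way would keep the result for each language in its own index, for example `fetch_variants("http://example.org/", ["nl", "en"])` followed by one indexing pass per entry.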

Thanks for ht://Dig!

Rick van Rein,
OpenFortress.

Discussion

  • Rick van Rein

    Rick van Rein - 2003-03-21

    Search wrapper that interprets language preference

     
  • Rick van Rein

    Rick van Rein - 2003-03-21

    Logged In: YES
    user_id=23236

    It seems the attachment got lost in the mail. Here it is.

     
  • Nobody/Anonymous

    Logged In: NO

    Problem solved. A bit kludgy, but it works.

    The script I provided did not work properly, due to a safety
    setting in htsearch, so I compiled htsearch twice, once per
    language, with different --with-default-config-file and
    --with-database-dir settings. The switch between the two versions
    of htsearch is made with a simple cgi-bin script, much like the
    one I uploaded before.

    I patched ht://Dig to provide the Accept-Language HTTP header
    if the variable accept_language is set in the configuration file.
    This means that two different configuration files are needed to
    support two languages, but luckily they can share a lot using
    include. The only different settings are database_dir and the
    new accept_language setting.

    I am not sure that splitting the database like this is ideal, but it
    is the least intrusive change to ht://Dig, and that is what I wanted
    first and foremost.
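
As a rough illustration of the two-configuration setup described above, the following Python sketch renders one small per-language configuration file that includes a shared file and then sets only the two differing attributes. The file paths and the function name are hypothetical, and the value format of accept_language is an assumption, since that attribute comes from the patch and not from stock ht://Dig.

```python
def render_config(lang, shared_config, database_root):
    """Return the text of a per-language configuration file.

    Assumptions: shared_config holds every common setting, each language
    gets its own subdirectory under database_root, and accept_language
    (the attribute added by the patch) takes a plain language tag.
    """
    lines = [
        f"include: {shared_config}",
        f"database_dir: {database_root}/{lang}",
        f"accept_language: {lang}",
    ]
    return "\n".join(lines) + "\n"


def render_all(languages, shared_config, database_root):
    """Map each language to the text of its configuration file."""
    return {
        lang: render_config(lang, shared_config, database_root)
        for lang in languages
    }
```

Calling `render_all(["nl", "en"], "/etc/htdig/common.conf", "/var/lib/htdig")` would yield two three-line files that differ only in database_dir and accept_language.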

    If you want to see it, use the two languages "nl" and "en" on the
    search box of http://openfortress.nl and see what happens if you
    type "notaris" or "notary" or "lawyer". Usually your browser will
    renegotiate the page with the server if you reload the page after a
    change to your language settings. Such renegotiation leads to
    the new search output.

    Hope this is helpful to others as well. You are cordially invited to
    mention OpenFortress as the source of all this.

    Enjoy,
    Rick van Rein.

     
  • Rick van Rein

    Rick van Rein - 2003-03-24

    Working search wrapper that interprets language preferences

     
  • Rick van Rein

    Rick van Rein - 2003-03-24

    Logged In: YES
    user_id=23236

    The previously submitted script will not perform properly; the use
    of -c is prohibited in certain situations. Attached is a new script.

    It assumes that htsearch is built twice, and stored once with .nl
    and once with .en extension (vary as you need). The script then
    invokes the proper one.

    Building htsearch twice is done with different ./configure options
    for the database dir and configuration file. The config files can
    share most options, thanks to the "include" directive.

    -Rick
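
The attached script itself is not reproduced in this thread. As a stand-in, here is a minimal Python sketch of the same idea: read the browser's Accept-Language header from the CGI environment, pick the best supported language, and hand over to the matching htsearch build. The install path, the default language, and the function names are assumptions for illustration, and the parsing is simplified (the `*` wildcard is not handled).

```python
import os

SUPPORTED = ("nl", "en")  # assumption: one htsearch build per entry
DEFAULT = "en"            # assumption: fallback when nothing matches
HTSEARCH_BASE = "/usr/lib/cgi-bin/htsearch"  # hypothetical install path


def pick_language(accept_language, supported=SUPPORTED, default=DEFAULT):
    """Return the supported language with the highest q-value in the header."""
    candidates = []
    for position, item in enumerate(accept_language.split(",")):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0].lower()
        if not tag:
            continue
        quality = 1.0  # a tag without q= counts as q=1
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        primary = tag.split("-")[0]  # "en-GB" counts as "en"
        if primary in supported and quality > 0:
            # Sort by descending quality, then by position in the header.
            candidates.append((-quality, position, primary))
    return min(candidates)[2] if candidates else default


def main():
    """CGI entry point: replace this process with the chosen htsearch build."""
    lang = pick_language(os.environ.get("HTTP_ACCEPT_LANGUAGE", ""))
    binary = HTSEARCH_BASE + "." + lang
    os.execv(binary, [binary])
```

A real wrapper would call `main()` at the end of the script. Because the process is replaced with `os.execv`, the CGI environment (query string and so on) is passed to htsearch unchanged, and no -c option is needed since each build has its own compiled-in configuration file.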

     
  • Rick van Rein

    Rick van Rein - 2003-03-24

    Contextual diff to patch htdig with Accept-Language: header

     
