Search Engine Indexing of Reference Database

adrianbj
2005-05-31
2013-05-28
  • adrianbj
    2005-05-31

    Just wondering if there is a way to have search engine robots index all the references in the database?

    Seems to me that if they found the results of a search with no terms (which results in Refbase displaying all references in alphabetical order) that would work.

    Any thoughts please.

    Thanks,

    Adrian

     
    • Hi Adrian,

      I was wondering about the same, just recently. Google says they are able to index dynamically generated pages and that they index them (at least to some degree). I remember that a while back when I checked Google's site there was a note that only simple URLs (containing only one parameter or such) are indexed. I don't think this is still true.

      Anyhow, it might be worth including a 'show.php?record=1234' link on the details page for a given record (with '1234' being its record serial number). We had plans to do this anyhow, so that it's easier for a user to copy or bookmark a permanent link to the currently displayed record. In refbase-0.8.0, there's already a code snippet (lines 1204 to 1207) in 'search.php' which is currently commented out and which you could use as a basis for playing around.

      Since Google doesn't crawl deep into a web site's hierarchy, it might be a good idea to set up a top-level page that always presents 'show.php' links to the most recent records.

      Just some thoughts...

      Regards, Matthias

       
      • I think that, eventually, MOST of the database will be indexed. Some spiders do follow links quite deeply, and once one spider has crawled your site, other spiders eventually get updated from the indexes it generated.

        However, I always have a show.php link which shows the records I'd most like to be indexed. This is useful to humans too, as it shows just the good stuff.

         
        • adrianbj
          2005-05-31

          Thank you both for your suggestions. I enabled the 'Link to this record' option that Matthias suggested. I am also going to modify it to create an 'Email this link' option, which I think users will find useful.

          I was actually thinking of adding a 'Show All Records' option on the main page - I was thinking about also having it display all records at once - figured this is the best way to ensure search engine indexing, but concerned that maybe it will result in high server load if it started getting lots of direct links from Google.

          What do you guys think?

          Adrian

           
          • > I was actually thinking of adding a 'Show All Records' option
            > on the main page

            Yes, kind of a 'Browse records' link. It seems useful, especially since a user isn't likely to discover that omitting the search term will show all records.

            > I was thinking about also having it display all records at
            > once - figured this is the best way to ensure search engine
            > indexing, but concerned that maybe it will result in high
            > server load if it started getting lots of direct links from
            > Google.

            Hmm, I'm not sure whether this would put a severe strain on the server.

            A static page (linked from the main page) that lists all the new and/or important records (as Rick suggests) and that gets generated/updated automatically seems like a good alternative.

            Regards, Matthias
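
As a rough illustration of the automatically generated static page suggested above, here is a minimal Python sketch (refbase itself is written in PHP, so this only shows the idea, not refbase code). The function name, the sample records and the page layout are hypothetical. Only the 'show.php?record=1234' link form comes from this thread.

```python
from html import escape


def render_recent_records_page(records, base_url="show.php"):
    """Return a static HTML page linking to each record's permanent URL.

    `records` is an iterable of (serial, title) pairs. In practice these
    would be read from the refbase database; here they are made up.
    """
    items = []
    for serial, title in records:
        # 'show.php?record=<serial>' is the link form mentioned in the thread.
        href = f"{base_url}?record={int(serial)}"
        items.append(f'  <li><a href="{href}">{escape(title)}</a></li>')
    lines = ["<html><body>", "<h1>Recent records</h1>", "<ul>", *items, "</ul>", "</body></html>"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = [(1234, "Sea ice & algae"), (1235, "Polar oceanography")]
    print(render_recent_records_page(sample))
```

Regenerating such a page on a schedule (for example from a cron job) and linking it from the main page would give crawlers a shallow entry point without running a full "show everything" query on each visit.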

             
            • adrianbj
              2005-05-31

              Thanks Matthias,

              I decided to add a Browse All Records Link in the hope that Google will index this and then all the pages within it.

              This is the code I added to the header.inc.php file. There might be a simpler way to do it (if so, please let me know). Otherwise, this might be useful for others who are interested.

              <a href="search.php?sqlQuery=SELECT%20author%2C%20title%2C%20year%2C%20publication%2C%20volume%2C%20pages%20FROM%20refs%20WHERE%20author%20LIKE%20%22%25%25%25%22%20ORDER%20BY%20author%2C%20year%20ASC%2C%20author&formType=sqlSearch&submit=&showLinks=1&headerMsg=" title="browse all publications in the database">Browse All Publications </a>
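
For anyone who would rather not percent-encode such a link by hand, the following Python sketch rebuilds the same href from the readable SQL. The helper name is hypothetical. The parameter names (sqlQuery, formType, submit, showLinks, headerMsg) are simply the ones that appear in the link above.

```python
from urllib.parse import quote, urlencode

# The readable form of the query that is embedded in the link above.
BROWSE_ALL_SQL = (
    'SELECT author, title, year, publication, volume, pages FROM refs '
    'WHERE author LIKE "%%%" ORDER BY author, year ASC, author'
)


def build_search_link(sql, extra_params):
    """Return a 'search.php?...' link with the SQL query percent-encoded."""
    params = {"sqlQuery": sql, **extra_params}
    # quote_via=quote encodes spaces as %20 rather than '+', as in the post.
    return "search.php?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    link = build_search_link(
        BROWSE_ALL_SQL,
        {"formType": "sqlSearch", "submit": "", "showLinks": "1", "headerMsg": ""},
    )
    print(link)
```

Keeping the SQL in readable form makes later changes (such as a different sort order) much less error-prone than editing the encoded string directly.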

               
              • Hi Adrian,

                Your query is fine. However, it won't include records where the author field is NULL ('LIKE "%"' will find empty strings but not NULL values). The following is a modification of your query which also includes records where author is NULL:

                search.php?sqlQuery=SELECT%20author%2C%20title%2C%20year%2C%20publication%2C%20volume%2C%20pages%20FROM%20refs%20WHERE%20author%20LIKE%20%22%25%25%25%22%20OR%20author%20IS%20NULL%20ORDER%20BY%20author%2C%20year%20ASC%2C%20author&formType=sqlSearch&showLinks=1
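
The difference between the two WHERE clauses can be checked with a small self-contained Python sketch. Note the assumptions: it uses an in-memory SQLite table rather than refbase's MySQL database, and the three sample rows are made up. The NULL behaviour of LIKE shown here is standard SQL and should carry over.

```python
import sqlite3


def count_matches(where_clause):
    """Count rows of a tiny, made-up 'refs' table matching a WHERE clause."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE refs (serial INTEGER, author TEXT)")
    conn.executemany(
        "INSERT INTO refs VALUES (?, ?)",
        [(1, "Smith, J"), (2, ""), (3, None)],  # named, empty and NULL author
    )
    (n,) = conn.execute(f"SELECT COUNT(*) FROM refs WHERE {where_clause}").fetchone()
    conn.close()
    return n


if __name__ == "__main__":
    # LIKE never matches NULL, so the NULL-author row is skipped.
    print(count_matches("author LIKE '%%%'"))
    # Adding 'OR author IS NULL' brings that row back.
    print(count_matches("author LIKE '%%%' OR author IS NULL"))
```

With the three sample rows, the first query counts the named and the empty-string author only, while the second counts all three.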

                The refbase CVS repository contains a new version of 'show.php' which would allow you to write:

                show.php?serial=%2E%2B&recordConditionalSelector=contains

                to achieve the same result. The sort order is different from yours, though: 'ORDER BY author, year DESC, publication'.
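
To see what that 'show.php' link actually asks for, the query string can be decoded with a short Python sketch. The helper name is hypothetical, and it is an assumption here (not confirmed in this thread) that the 'contains' selector treats the serial value as a regular expression pattern.

```python
import re
from urllib.parse import parse_qs, urlsplit


def decode_show_link(link):
    """Return the decoded query parameters of a link as a plain dict."""
    query = urlsplit(link).query
    return {key: values[0] for key, values in parse_qs(query).items()}


if __name__ == "__main__":
    params = decode_show_link("show.php?serial=%2E%2B&recordConditionalSelector=contains")
    print(params)
    # '%2E%2B' decodes to '.+', which as a regular expression matches any
    # non-empty value, i.e. every record that has a serial number.
    pattern = params["serial"]
    print(bool(re.search(pattern, "1234")))
```

Under that assumption, the link selects every record whose serial is non-empty, which is why it yields the same set of records as the SQL query above.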

                One problem is that I'm not sure which other recently modified files from CVS are required for the new 'show.php' to work correctly. At least the new 'common.inc' files in the 'locales' directory are required (IIRC).

                Regards, Matthias