Re: [htmltmpl] RFC: Template Tag Attributes
From: Mathew R. <mat...@re...> - 2004-06-03 23:03:17
> >This is a common mistake that information creators think 'is a good
> >thing'... The web got popular for a number of reasons - one of them being
> >"full text indexing of all content" (including headers/footers/etc).
>
> Why? There is no useful information in headers/footers. By nature of
> using a templating system, they are the same on every page in a given
> section. Including them in search results only increases the noise and the
> amount of information that needs to be indexed.

Only says you - a user of your site may find the headers and footers to be very useful.

> >The point is that it is the user of the system that wants to find the
> >information - not the author telling you what you can and can't search
> >for. Classic example -> books used to have (and still do) an index in
> >the last couple of pages of the book, yet the user could never find what
> >they were looking for; until the book made it onto CDROM, at which point
> >full-text-searching was possible.
>
> Almost every piece of a book is useful to search. But what good would it
> be to search for a chapter heading? That information is already given to
> you in the table of contents.

An index is a whole different beast to a table of contents. And as you say, every piece of a book is useful to search -> thus why bother only indexing content that the author considers valid? Why not just index _everything_, then let the user decide...

> >-> Full text searching is a _much better_ solution to search problems than
> >indexing on what YOU think is the information they want.
> >
> >Mathew
> >
> >PS. This means, use a spider.. or even better use google via a "site:..."
> >search.
>
> Google PageRank is very good at searching a broad sample of sites. It's
> not so good for individual sites.

You are kidding, right? The algorithms that Google uses already take into account common content on multiple pages. OpenOffice.org use Google as their own site-specific search engine. As do a number of sites. The only real problem with using Google is that they only spider the web every few weeks, thus if you update more frequently than that, you may have a problem.

Mathew
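
For readers who want the "index _everything_" idea in concrete form, the following is a minimal sketch of a full-text (inverted) index built over whole rendered pages, header and footer included, roughly as a spider would see them. It is only an illustration: it has nothing to do with HTML::Template itself or with how Google works, and the page names, page text, and the `build_index` and `search` helpers are all hypothetical.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page names containing it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the sorted page names that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


# Hypothetical rendered pages: header + body, exactly as served.
# Nothing is excluded on the author's say-so.
pages = {
    "install.html": "Acme Docs | Support: 555-0100 | Installing the widget",
    "faq.html": "Acme Docs | Support: 555-0100 | Frequently asked questions",
}
index = build_index(pages)
```

With this index, a query on a header word such as `search(index, "support")` matches both pages, while `search(index, "installing widget")` matches only `install.html`. Whether the header hits count as noise or as useful results is exactly the disagreement in this thread; the sketch only shows that indexing everything leaves that choice to the person searching.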