Re: [htmltmpl] RFC: Template Tag Attributes

> >This is a common mistake that information creators think 'is a good
> >thing'...  The web got popular for a number of reasons - one of them being
> >"full text indexing of all content" (including headers/footers/etc).
>
> Why?  There is no useful information in headers/footers.  By nature of
> using a templating system, they are the same on every page in a given
> section.  Including them in search results only increases the noise and the
> amount of information that needs to be indexed.

Says only you. A user of your site may find the headers and footers
very useful.

> >The point is that, it is the user of the system that wants to find the
> >information - not the author telling you what you can and cant search
> >for.   Classic example -> books used to have (and still do) an index in
> >the last couple of pages of the book, yet the user could never find what
> >they were looking for; until the book made it onto CDROM at which point
> >full-text-searching was possible.
>
> Almost every piece of a book is useful to search.  But what good would it
> be to search for a chapter heading?  That information is already given to
> you in the table of contents.

An index is a whole different beast from a table of contents.  And as you
say, every piece of a book is useful to search, so why bother indexing
only the content that the author considers valid?  Why not just index
_everything_, then let the user decide...

> >-> Full text searching is a _much better_ solution to search problems than
> >indexing on what YOU think is the information they want.
> >
> >Mathew
> >
> >PS.  This means, use a spider.. or even better use google via a "site:..."
> >search.
>
> Google PageRank is very good at searching a broad sample of sites.  It's
> not so good for individual sites.

You are kidding, right?  The algorithms that Google uses already take
into account common content on multiple pages.  OpenOffice.org uses
Google as its own site-specific search engine, as do a number of other
sites.  The only real problem with using Google is that it only spiders
the web every few weeks, so if you update more frequently than that, you
may have a problem.

Mathew