From: Don A. <don...@co...> - 2006-01-15 23:41:17
On Sun, 2006-01-15 at 23:44 +0100, Richard Bos wrote:
> how are the file names determined of the html export filter, the one that
> results in "My Family Tree" (Narrative website). I get e.g a page name of
> the form: gramps/ppl/1/A/1AJ4S766A35YICG7Z1.html
> Would it be possible for a 3rd party program perl script e.g. to determine
> this name from the gedcom ID I0019....? Why not give the person page name,
> the gedcom id tag?

Believe it or not, there is a very good rationale for this. The name of
the file, in this case 1AJ4S766A35YICG7Z1.html, corresponds to the
internal database key. This key is unique, and does not change. No
matter what you change on a person, including the ID value, the same
person will *always* be generated with the same file name. So, in a way,
this serves as a permanent link.

An equally important reason, if not a more important one, has to do with
maintaining your server's performance. Those of you who have worked with
servers know that file system performance can degrade significantly as
the number of files in a directory grows. If you want to test this, find
a directory with 10K files in it, and see how long it takes to get a
directory listing. The standard Linux file system, ext3, seems to handle
up to 256 files in a directory before it starts to degrade.

Going with an evenly distributed naming scheme allows us to spread the
files evenly among subdirectories, thereby decreasing access times to
the files. The database key that we use is nicely distributed across the
name space. This is also why the first two characters are used for
subdirectories (in this case ppl/1/A). You can have tens of thousands of
people generated in this way without affecting your web server
performance, since the evenly distributed names prevent too many files
from ending up in a single directory. Alex and I actually did a
significant amount of analysis on this particular issue.

Using the ID values would lose these benefits.
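
As a rough illustration of the two-level directory scheme, and of how
differently sequential IDs would spread, here is a minimal Python sketch.
It is not the code the report generator uses: page_path, bucket_counts,
and the default "ppl" root are hypothetical names chosen to match the
example path above, and the random strings are only stand-ins for real
database keys.

```python
import random
import string
from collections import Counter


def page_path(handle, root="ppl"):
    """Build a page path sharded by the first two characters of the key.

    Hypothetical helper: the layout is inferred from the example path
    ppl/1/A/1AJ4S766A35YICG7Z1.html, not taken from the exporter.
    """
    return f"{root}/{handle[0]}/{handle[1]}/{handle}.html"


def bucket_counts(names):
    """Count how many names land in each two-level subdirectory."""
    return Counter((name[0], name[1]) for name in names)


print(page_path("1AJ4S766A35YICG7Z1"))  # ppl/1/A/1AJ4S766A35YICG7Z1.html

# Sequential IDs I0001..I1000: the first two characters are "I0" or "I1".
ids = [f"I{n:04d}" for n in range(1, 1001)]
id_buckets = bucket_counts(ids)
print(len(id_buckets), max(id_buckets.values()))  # 2 999

# Stand-in keys: uniformly random 18-character strings (an assumption,
# real keys are generated differently but are also well spread).
rng = random.Random(0)
alphabet = string.ascii_uppercase + string.digits
fake = ["".join(rng.choices(alphabet, k=18)) for _ in range(1000)]
fake_buckets = bucket_counts(fake)
print(len(fake_buckets), max(fake_buckets.values()))
```

With sequential IDs, 999 of the 1000 pages share one directory, while the
random stand-in keys scatter over hundreds of directories with only a
handful of pages in each.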
With naming structures like I0001, I0002, etc., you get a very poor
distribution of names, and you have no protection in case someone
changes the ID value on you. However, it would be a simple plugin to map
ID values to handles if you really needed such a map. A user could
probably generate this in a matter of a few minutes.

Don

PS. - I know that someone is going to say, "But you can just limit the
number of people per directory, and add directories as needed, just keep
track of which files are in which directory." This is partially true,
but depending on the selected ID values and selected people, a generated
page can end up in different directories from run to run.

--
Don Allingham <don...@co...>