libdb-develop Mailing List for LibDB (Page 7)

On Feb 6, 2004, at 7:06 PM, Morbus Iff wrote:

> How do people feel about /installer.cgi compared to 
> installer/index.cgi?

Sounds good.

Bruce

Download: http://osdn.dl.sf.net/sourceforge/libdb/libdb-0.0.2.tar.gz
Blog Entry: http://www.disobey.com/dnn/2004/02/index.shtml#001576
Release Notes: http://www.disobey.com/noos/LibDB/?ReleaseNotes002
LibDB Homepage: http://www.disobey.com/noos/LibDB/

A day later, and a new release of the installer code. I've fixed four
bugs, as well as pondered ways of improving *before the install*. First,
the settings file ".htlibdb", which stores the database username and
password. Obviously, it's important to make this file well-protected; we
certainly don't want to give free access to the user's database.

The ".ht" part gives the file special meaning to Apache, the most popular
web server. When a file starts with ".ht", it is never served to the
general public, protected by a filter that stops outside access. Two
problems. First, I'm assuming the user is running Apache. Other webservers
might not treat ".ht" in the same way, canceling all implied security.

The other downside of ".htlibdb" affects end-usability. A file that starts
with a dot is "hidden" from normal viewing. Similarly, most FTP clients
hide them in file listings. If the LibDB installer can't change that file's
permissions, it instructs the user to. If the user *can't see* the
invisible file, that means more support emails for me, sure, but also more
people *who just give up*, never asking for help. Too much effort, and you
know what? It is.

I'm a few inches away from renaming ".htlibdb" to "settings.cgi", which
follows the same behavior as Movable Type. In most cases, webservers are
configured to treat .cgi files as executables, running them as opposed to
simply showing their contents. Since the file is not code, it'd "short
circuit" and display an "Internal Server Error". For configurations where
.cgi is not executable, our installer can attempt to fix the permissions,
prompting the user otherwise. There are more chances for .cgi scripts to
break and leak data, but there are also more chances for a user to actually
finish the install instead of throwing their hands up in "where is it?"
frustration.

The second big problem with the installer is location: installer/index.cgi.
There's also a file called /index.cgi and, in early tests, two people
mistook one file for the other, either in permission changing or in
accessing through the browser. I'm thinking a better solution would be
/installer.cgi: no more accidental duplicity, a single location for
executable files, and a more understandable URL.

Download, test, comment, improve. Thank you!

-- 
Morbus Iff ( don't heckle the super-villian )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus

I wrote:

> BTW, and only slightly off-topic, one of the useful things I am 
> thinking about MODS' handling of this is that one can have more than 
> one roleTerm element associated with a name.  E.g.:
>
> <role>
> 	<roleTerm type="text" authority="dc">creator</roleTerm>
> 	<roleTerm type="text" authority="marc">writer</roleTerm>
> </role>

Correction: both contributor and creator are MARC role terms, so would 
have that authority attribute.

Bruce

How do people feel about /installer.cgi compared to installer/index.cgi?
A quick reading by one tester had him loading /index.cgi, due to the
similarities of the filename.

The prime reason for using installer/index.cgi was the ability to say
"now, delete the entire installer/ directory to prevent malicious users
from overwriting your database". But, it'd be just as easy for me to
say "delete the installer/ directory", and have the installer.cgi
fail if that directory doesn't exist.
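
A minimal sketch of the guard described above, assuming installer.cgi sits at the top of the LibDB directory next to its installer/ directory. Python is used purely for illustration (it is not LibDB's code), and `installer_may_run` is a hypothetical name.

```python
import os
import tempfile


def installer_may_run(libdb_root):
    """Return True only while the installer/ directory still exists."""
    return os.path.isdir(os.path.join(libdb_root, "installer"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "installer"))
        print(installer_may_run(root))  # installer/ is present
        os.rmdir(os.path.join(root, "installer"))
        print(installer_may_run(root))  # installer/ was deleted by the user
```

With a check like this, deleting installer/ would be enough to disable the installer even if installer.cgi itself is left behind.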

Then, there'll be less confusion with /installer.cgi compared to
installer/index.cgi, both in the "what file do I set permissions on",
and "what file do I load in my browser".

Thoughts?

-- 
Morbus Iff ( shower your women, i'm coming )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus

Another thing: titles.

In my sort of realm, parsing titles and subtitles is important, as are 
abbreviated titles in many cases.  One interesting recent proposal on 
the mods list was something like:

<title>
    <titleMain>A Title</titleMain>
    <titleSub>Subtitle</titleSub>
    <titleSort>Title</titleSort>
</title>

...or:

<title type="abbreviated">
    <titleMain>J. of AAG</titleMain>
</title>

One thing that MODS doesn't have is a way to abbreviate person or 
corporate names though.

Bruce

>Based on what I've been learning from the MODS community, it seems to
>me you ought to have some way to track with which community a given
>role is associated.  For example, clearly I would want you to include
>the MARC relator role terms, because it allows me to track authors,
>translators, recipients, etc... even directors (which I do keep track
>of).  But it would probably be important to know that role term X is
>associated with that controlled list.  It allows you to then map to
>MODS too, where you have:

Fair enough. The big challenge is showing the proper
list, at the proper time. Say we have books and movies.
It's quite easy to say "show me only roles related to books".
But, now, add two different authorities. How would we display
them? Both in the same list ("author [marc]", "author [bob]")?
What about equivalence? Which is the more proper authority to use?

>BTW, and only slightly off-topic, one of the useful things I am
>thinking about MODS' handling of this is that one can have more than
>one roleTerm element associated with a name.  E.g.:

LibDB can have more than one. In fact, most movable parts
of the LibDB database can have "more than one" of something.
There are certainly exceptions ("provenance", for example).
Granted, there's a huge history, both alleged and factual, to
the owners of The Spear Of Destiny, but I'm not sure how having
individual fields for each owner would be useful (well, ok,
saying it like that, now I can, but "provenance" is used so
rarely that I'm gonna let it fester until someone *else*
requests multiple provenanii for each item).

><role>
>	<roleTerm type="text" authority="dc">creator</roleTerm>
>	<roleTerm type="text" authority="marc">writer</roleTerm>
></role>
>
>My thinking here -- and someone confirm that I am right (or not) please
>-- is by having the more generic (but still useful) creator role, it
>can be easily mapped to DC and to any citation formatting output.

It certainly seems useful to me. Who the hell is the "creator" of a
movie? It's not solely the director, nor the actors, nor the lighting guy.

-- 
Morbus Iff ( be realistic. demand the impossible. )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus

On Feb 6, 2004, at 5:36 PM, Morbus Iff wrote:

> Did you happen to use WinZip to extract the tar.gz file?

Nope; StuffIt.

Bruce

 >>  2) Upload the newly extracted files and directories to your web site.
 >>     They should be placed into a directory where CGI scripts may
 >>     be executed (these directories typically have "cgi" in their
 >>     name somewhere). If you don't know, or can't find, this
 >>     directory, contact your web host for further information.
 >
 >Cannot you have the archive extracted into a single folder that gets
 >placed in the CGI directory?  I find this a little confusing myself.

So, when you extracted the file, you received a bunch of folders and files 
that you had to manually select each one to upload? And you'd prefer to 
have the archive extract into a directory named libdb/ (itself containing 
the folders and files), and just upload that one folder?

 >>  4) Open the URL to installer/index.cgi in your browser and follow
 >>     the remaining instructions.
 >
 >"installer/index.cgi" isn't a valid url.
 >If I try this, though, it doesn't work:
 >http://localhost/installer/index.cgi

Define "it doesn't work". Did you upload all those files into
a libdb/ folder on the server? Or would http://localhost/README.txt
show you the file you quoted from above?

-- 
Morbus Iff ( i put the demon back in codemonkey )
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
Spidering Hacks: http://amazon.com/exec/obidos/ASIN/0596005776/disobeycom
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus

On Feb 6, 2004, at 5:09 PM, Morbus Iff wrote:

> This, I think, is very important for data sharing. Since every
> single thing in LibDB comes with a unique ID, it is imperative that
> the data sharing understands that UniqueIDA on this LibDB installation
> is the same as UniqueIDA on another LibDB installation.

Based on what I've been learning from the MODS community, it seems to 
me you ought to have some way to track with which community a given 
role is associated.  For example, clearly I would want you to include 
the MARC relator role terms, because it allows me to track authors, 
translators, recipients, etc... even directors (which I do keep track 
of).  But it would probably be important to know that role term X is 
associated with that controlled list.  It allows you to then map to 
MODS too, where you have:

<role>
	<roleTerm type="text" authority="marc">writer</roleTerm>
</role>

...or:

<role>
	<roleTerm type="text" authority="libdb-movie">whatever</roleTerm>
</role>

...or some sort of RDF-based URI.

BTW, and only slightly off-topic, one of the useful things I am 
thinking about MODS' handling of this is that one can have more than 
one roleTerm element associated with a name.  E.g.:

<role>
	<roleTerm type="text" authority="dc">creator</roleTerm>
	<roleTerm type="text" authority="marc">writer</roleTerm>
</role>

My thinking here -- and someone confirm that I am right (or not) please 
-- is by having the more generic (but still useful) creator role, it 
can be easily mapped to DC and to any citation formatting output.
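
A rough sketch of how a consumer might pick the term for a given authority out of such a role element. The sample is copied from this message as written, without the namespace a real MODS record would carry, and `role_terms_by_authority` is a hypothetical helper, not part of any MODS toolkit.

```python
import xml.etree.ElementTree as ET

# Sample role element, copied from the message above (no XML namespace).
ROLE_XML = """
<role>
    <roleTerm type="text" authority="dc">creator</roleTerm>
    <roleTerm type="text" authority="marc">writer</roleTerm>
</role>
"""


def role_terms_by_authority(role_xml):
    """Map each authority attribute to the text of its roleTerm element."""
    role = ET.fromstring(role_xml)
    return {
        term.get("authority"): (term.text or "").strip()
        for term in role.findall("roleTerm")
    }


if __name__ == "__main__":
    terms = role_terms_by_authority(ROLE_XML)
    print(terms.get("dc"))    # generic term, e.g. for a DC mapping
    print(terms.get("marc"))  # more specific term
```

A citation formatter could then ask for the generic term first and fall back to the more specific one when it is missing.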

Bruce

On Feb 6, 2004, at 5:23 PM, Morbus Iff wrote:

> >Cannot you have the archive extracted into a single folder that gets
> >placed in the CGI directory?  I find this a little confusing myself.
>
> So, when you extracted the file, you received a bunch of folders and 
> files that you had to manually select each one to upload? And you'd 
> prefer to have the archive extract into a directory named libdb/ 
> (itself containing the folders and files), and just upload that one 
> folder?

Right.  It's not clear (to me) which files must be uploaded, and where 
they can and should go.  Sticking them in a single folder and saying 
"move the folder" is more clear for me.

> >>  4) Open the URL to installer/index.cgi in your browser and follow
> >>     the remaining instructions.
> >
> >"installer/index.cgi" isn't a valid url.
> >If I try this, though, it doesn't work:
> >http://localhost/installer/index.cgi
>
> Define "it doesn't work".

"File not found"

> Did you upload all those files into
> a libdb/ folder on the server?

On OS X, I put them all in /Library/WebServer/CGI-Executables; no 
subdirectory.

Bruce

 >Also Morbus, I looked the sql definitions, and note that somewhere
 >around half of the code is movie-specific role information.  I know at
 >some point you're planning to open that up to generalization; how are
 >you going to do this, given where you're at now?

Roles will be customizable once there's an interface
to them <g>. There are two things to consider:

  * roles should follow the MARC/LC roles/relator, which
    I have not yet had a chance to properly address.

  * roles should be shown based on their media. so a book
    wouldn't show role for "underwater camera operator".

Regardless of what *I* choose for roles in the default database,
roles will be customizable in the not-yet-existing interface.
You'll be able to modify, add, and delete them as you see fit.

From the standpoint of the "official" SQL code
included in the tarball, my take on it is this:

  * if you send it, and it makes sense, I'll include it.

This, I think, is very important for data sharing. Since every
single thing in LibDB comes with a unique ID, it is imperative that
the data sharing understands that UniqueIDA on this LibDB installation
is the same as UniqueIDA on another LibDB installation. This is
facilitated through the use of UniqueIDA being a default statement
in the SQL databases.
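
To illustrate the idea (this is a sketch, not the LibDB schema): if the default rows ship with fixed IDs in the SQL, two separate installations agree on what an ID means. The `roles` table, its columns, and the sample rows below are all hypothetical, and SQLite stands in for whatever database LibDB actually targets.

```python
import sqlite3

# Hypothetical default rows: the same IDs ship in every installation's SQL.
DEFAULT_ROLES = [(1, "Director"), (2, "Writer")]


def new_installation():
    """Create an in-memory database seeded with the shared default rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE roles (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    db.executemany("INSERT INTO roles (id, name) VALUES (?, ?)", DEFAULT_ROLES)
    db.commit()
    return db


def role_name(db, role_id):
    """Look up a role name by ID, or return None if the ID is unknown."""
    row = db.execute("SELECT name FROM roles WHERE id = ?", (role_id,)).fetchone()
    return row[0] if row else None


def import_credit(db, person, role_id):
    """Accept a shared (person, role_id) pair only if the ID is known locally."""
    name = role_name(db, role_id)
    if name is None:
        raise LookupError("unknown role id %d" % role_id)
    return "%s: %s" % (person, name)
```

Data exported from one installation as (person, role ID) pairs can then be read by another without a name-matching step, as long as both started from the same defaults.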

From the standpoint of the future:

  * 0.0.2 is coming out tonight. it fixes some installer bugs.

  * 0.1.0, being worked on now, will contain some attempt to
    include an editing/browsing interface for the individual
    parts of LibDB: the individual roles, the individual person,
    the individual concept, etc. Aggregate parts, ie. "show me
    this movie", aren't planned until after 0.1.0.

-- 
Morbus Iff ( i put the demon back in codemonkey )
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
Spidering Hacks: http://amazon.com/exec/obidos/ASIN/0596005776/disobeycom
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus

 >>>  2) Upload the newly extracted files and directories to your web site.
 >>>     They should be placed into a directory where CGI scripts may
 >>>     be executed (these directories typically have "cgi" in their
 >>>     name somewhere). If you don't know, or can't find, this
 >>>     directory, contact your web host for further information.
 >>
 >>Cannot you have the archive extracted into a single folder that gets
 >>placed in the CGI directory?  I find this a little confusing myself.

Did you happen to use WinZip to extract the tar.gz file?

-- 
Morbus Iff ( i put the demon back in codemonkey )
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
Spidering Hacks: http://amazon.com/exec/obidos/ASIN/0596005776/disobeycom
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus

Hey all.

LibDB needs to save your database password, in plain text, to a file on 
your server. Currently, it saves this into a file called ".htlibdb", which 
is in the root of your libdb/ directory.

That name was chosen for one reason: under the Apache webserver, any file 
that starts with ".ht" is never served via the web. In other words, no one 
would ever be able to retrieve your database password, unless Apache had 
been grossly misconfigured.

A major problem with this has been brought to my attention:

  * the starting "." causes the file to be hidden in most file
    listings (OS X hides them in the Finder, some FTP clients
    do not show dot files by default, etc.). The problem: if the
    installer can't set the permissions of the file itself, the
    user has to. If the user can't *find* the file because it's
    being hidden by their software, they can't fix its permissions either.
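
The fallback described in that bullet could look roughly like this. It is a sketch in Python rather than LibDB's own code, the function names are hypothetical, and the mode is left as a parameter because the message does not say which permissions the installer wants.

```python
import os
import stat
import tempfile


def try_set_mode(path, mode):
    """Attempt to chmod `path`; return True on success, False if the OS refuses."""
    try:
        os.chmod(path, mode)
    except OSError:
        return False
    return True


def permission_message(path, mode):
    """Return "ok", or the instruction the user would have to follow by hand."""
    if try_set_mode(path, mode):
        return "ok"
    return "Please set permissions %o on %s yourself." % (mode, os.path.basename(path))


if __name__ == "__main__":
    fd, demo_path = tempfile.mkstemp()
    os.close(fd)
    print(permission_message(demo_path, 0o600))
    print(oct(stat.S_IMODE(os.stat(demo_path).st_mode)))
    os.remove(demo_path)
```

The second branch is exactly where a hidden dot file hurts: the message names a file the user may not be able to see in their FTP client.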

In Movable Type, the solution is to use a publicly accessible
file that is shortcircuited to break. The file, "mt-db-pass.cgi",
merely contains one word: the password. However, because it's named
.cgi, the assumption is that:

  * Apache will try to execute the file if you're in a /cgi-bin/.
  * The file is set executable anyways, and fails with a parse error.

These both "shortcircuit" the file to cause it to "break", never
revealing the password. The downside is that a broken CGI conf
*could* cause the file to be readable. Breaking the CGI conf
is a lot easier than breaking the .ht configuration.

The other downside of .htlibdb is that, as far as I know, other
webservers MAY NOT follow the same procedure: they may have no
problem with serving .ht files to the web, at large.

So. ".htlibdb" or "settings.cgi": which do you prefer, which is the
lesser of two evils? On one hand (.htlibdb), we've got
more security, but less user-friendliness. On the other
(settings.cgi), the file will be easier to find, but
potentially easier to view.

Thoughts?

-- 
Morbus Iff ( i put the demon back in codemonkey )
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
Spidering Hacks: http://amazon.com/exec/obidos/ASIN/0596005776/disobeycom
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus

On Feb 5, 2004, at 10:09 PM, Morbus Iff wrote:

> Grab the download, check
> out the README.txt, and let me know what I'm missing, what can be 
> improved,
> and what doesn't work. I'm specifically looking for suggestions on the
> verbiage used. Is it understandable? Is there something that can be 
> stated
> more clearly? Where did I lose you?

I'm not sure, but I can't get it to work!

Here's the text:

> 1) Unarchive the compressed file into a directory of your choice
>     (if you're reading this, you've probably already done that).

OK.

>  2) Upload the newly extracted files and directories to your web site.
>     They should be placed into a directory where CGI scripts may
>     be executed (these directories typically have "cgi" in their
>     name somewhere). If you don't know, or can't find, this
>     directory, contact your web host for further information.

Cannot you have the archive extracted into a single folder that gets 
placed in the CGI directory?  I find this a little confusing myself.

>  3) Give the installer/index.cgi file execute permissions (*how* to do
>     that will depend on your software: some programs allow you to give
>     "execute" permissions to "owner", "group" and "everybody", others
>     will tell you to set "755" permissions on the file, etc.)

OK.

>  4) Open the URL to installer/index.cgi in your browser and follow
>     the remaining instructions.

"installer/index.cgi" isn't a valid url. If I try this, though, it 
doesn't work:

http://localhost/installer/index.cgi

Bruce

Also Morbus, I looked at the sql definitions, and note that somewhere 
around half of the code is movie-specific role information.  I know at 
some point you're planning to open that up to generalization; how are 
you going to do this, given where you're at now?

Bruce

Download: http://osdn.dl.sf.net/sourceforge/libdb/libdb-0.0.1.tar.gz
Blog Entry: http://www.disobey.com/dnn/2004/02/index.shtml#001575
Release Notes: http://www.disobey.com/noos/LibDB/?ReleaseNotes001
LibDB Homepage: http://www.disobey.com/noos/LibDB/

LibDB 0.0.1 has been released. It consists of a web-based installation
program that checks file permissions, verifies the required modules are
installed, and connects to the user's database to import the default schema
and sample data. That's all it does. You'll be installing a program that
doesn't yet exist.

Why bother creating an installer for code that hasn't been written, as
backwards as putting jelly on the outside of your sandwich? Two reasons.
First, I knew the installer code could be used for the rest of LibDB: the
same frameworks are used for template parsing, database access, settings
retrieval, language determination, etc.. The installer helped stress-test
and debug before the "proper" and "important" work.

The second reason concerns end-usability. In too many projects, the
documentation and packaging are written only after the code has been
debugged and tested. But the biggest lie any developer can tell
themselves is that they'll *ever* be done debugging and testing code.
There's always going to be one more thing to fix, one more feature to add,
one more piece of goldplating to layer the ox with. The documentation and
packaging, continually being put off, rarely get finished. All this great
code, and no clue how to get it working.

For people to use LibDB, they need to get it installed. The "casual" user
profile (as described in the ProjectGoals), when faced with installing a
web-based application, is assumed to have only an FTP account and "generic"
web host, where tech support is a three hour (or three day) test of
patience. The installer had to be as walk-through-ish as possible, and it
had to be developed first.

Documentation-wise, there are 100+ database fields. They're described in
the DatabaseSchema, but no one is ever going to check the "manual" for what
"provenance" means. That's why every field is described in the DB itself:
the help can be retrieved right along with the matching data. Contextual
help, built into the database, viewable in the interface, without extra
effort. The same attention will be applied to tasks: when the user wants to
*do something* compared to *define something*.

So, LibDB 0.0.1 is merely the installer, and I need your help to refine
this very important step of the end-user process. Grab the download, check
out the README.txt, and let me know what I'm missing, what can be improved,
and what doesn't work. I'm specifically looking for suggestions on the
verbiage used. Is it understandable? Is there something that can be stated
more clearly? Where did I lose you? Email comments to mo...@di... or
use the InstallationNotes:

  http://www.disobey.com/noos/LibDB/?InstallationNotes

-- 
Morbus Iff ( i know a little of everything, a lot of nothing. )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus

I'll be CCing this to the libdb-develop list at Sourceforge
sans any identification. If you want to jump on board and
introduce yourself, responding to my comments, have a blast.

 >My understanding of FRBR, as a spec and meme, is that they are just what
 >they claim -- functional requirements, but for users, not systems.

Exactly. It's a concept model, not a data model.

 >Seeing that you've modelled its core concepts as literal tables, I'm
 >wondering what your processing model will be, and how you're planning to
 >do basic object storage.

Heh, heh. Why don't you talk to uh, my um, marketing department <g>.

 >Hmm, let me try that again:  How do you plan to populate the FRBR
 >primitives tables (work, expression, etc.)?  Are you sucking in
 >MARC/MODS/etc records and running a local implementation of OCLC's FRBR
 >algorithm on them?  If so, are you discarding the original source
 >metadata?  Will you count on users to identify the relationships, or will
 >it be semi-/fully-automated?

Aaaahh. Much better! <g>

Yes, there will be aggregated data. This shouldn't be too surprising,
since my latest book has been O'Reilly's SPIDERING HACKS. The first
type of data I'm *specifically* attacking for LibDB is movies, but
if you looked at the database tables without that knowledge, it
may not be immediately obvious (ie. the database tables are not
dependent on an implied media). As such, data would be sucked
down from IMDb, but for books and other standard librarian
stuff, it'd be sucked in through (whatever formats LibDB
supports, which could be MARC, MODS, etc.).

The original source metadata would be discarded. LibDB will
support export formats (in a RESTian URL structure), such that
you'd be able to get data as RDF, MARC, FOAF, etc., etc.

With that in mind, the planned workflow for movies:

  1) User types in movie name and year.
  2) User gets back either:

     a) the matching movie from IMDb, split up in a
        giant form that doesn't mention any FRBR terms.

     b) a list of matching movies, to which they'd choose
        the right one, and be faced with a), above.

  3) user verifies all information.

There's a heckuva lot missing between 2a and 3, and that's mainly all 
interface/forms. I don't have any plans to mention the term "relationships",
whatsoever. LibDB will handle all the core relationships implicitly:
it will create the work/expression relationship based on the data
sucked down, and the expression/manifestation/item relationships
based on user data ("i own the dvd, it's in the third box, and
I thought the movie sucked").

Relationships with Group 1 and Group 2 entities (for movies, cast,
crew, and companies) are handled automatically within the code.
The user will merely see a list of all the people who starred in
the movie, all the people/companies who worked on the movie,
and they'll have the option of choosing which info they want
to save into the database (though, I'm up in the air on that
one), as well as the ability to override any of the "roles"
relationships.

Now again, I won't mention "roles" at all.
The interface would look something like:

  "Julia Roberts"         Cast Member  "CharacterName"
  "Something Someone"     Crew Member  [ "2nd Post Production Assistant" ]
  "Artisan Entertainment"              [ "Distributor" ]

In this example, the [] indicates a select/popup box, and "2nd Post 
Production Assistant" is the data received from IMDb. The user would
be able to (as I would) pick the more generic "Post Production
Assistant" from that select box. The select box is populated
with all the roles the database knows of (in a future version of
the database, roles will be associated with an authority/form,
so that if you were adding a "book", you wouldn't see "Post
Production Assistant", and if you were adding a "film", you'd
see "Titles" instead of "Typesetter").
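
A small sketch of that future filtering, assuming each role simply lists the media it applies to. The role table below is invented for illustration (including the guess that "Distributor" applies to both media), and `roles_for` is a hypothetical name.

```python
# Hypothetical role table: each role lists the media it applies to.
ROLES = [
    {"name": "Typesetter", "media": {"book"}},
    {"name": "Post Production Assistant", "media": {"film"}},
    {"name": "Titles", "media": {"film"}},
    {"name": "Distributor", "media": {"book", "film"}},
]


def roles_for(media, roles=ROLES):
    """Return the role names to offer in the select box for one media type."""
    return sorted(r["name"] for r in roles if media in r["media"])


if __name__ == "__main__":
    print(roles_for("book"))
    print(roles_for("film"))
```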

Likewise, Group 3 entities would be defined as relationships, but
to the end user, they'd just see a big text box that says "Enter
Concepts, one per line", "Enter Events, one per line". I'm still
debating on having a popup of known Concepts, Events, and just
having the user pick from a dozen possible popups (along with
write-ins).

Once the user has gone through all the data, making changes where
they'd see fit, data verification would occur. This is largely
grey area at the moment, but stuff like this would happen:

   "The concept 'Murder' exists, and has been assigned."
   "The event 'Sherwood Forest' did not exist, and has been assigned."

   "You already have a person in the database named 'Julia Roberts'.
    Is this the same 'Julia Roberts' that was involved with:

         * Work 1 (ie. movie 'Erin Brockovich')
         * Work 2 (ie. movie 'Runaway Bride')"

and so on.
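
Since this part is admittedly grey area, the following is only one possible shape for it: a sketch in which concepts live in a set and people map to the works they are already linked to. Both function names are hypothetical, and the messages reuse the wording from the examples above.

```python
def assign_concept(known_concepts, name):
    """Assign a concept, adding it to the known set first if it is new."""
    if name in known_concepts:
        return "The concept '%s' exists, and has been assigned." % name
    known_concepts.add(name)
    return "The concept '%s' did not exist, and has been assigned." % name


def possible_duplicates(people, name):
    """Return the works already linked to a person with the same name.

    An empty list means there is nothing to ask the user about.
    """
    return people.get(name, [])


if __name__ == "__main__":
    concepts = {"Murder"}
    print(assign_concept(concepts, "Murder"))
    print(assign_concept(concepts, "Revenge"))
    people = {"Julia Roberts": ["Erin Brockovich", "Runaway Bride"]}
    print(possible_duplicates(people, "Julia Roberts"))
```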

Of course, at some point, users will want to more granularly define
relationships. They may want to "create a relationship type" called
"Sister", and then "make a relationship" between "person Mary Kate" and
"person Ashley". Those sorts of relationships can't be implied easily
from any data that currently exists. However, once they're created, the
relationship becomes usable to other applications (ie. when a user exports
either of those sisters as RDF or FOAF data).
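
As a sketch of the two steps ("create a relationship type", then "make a relationship"), here is a tiny in-memory model. The class and method names are hypothetical and say nothing about how LibDB will actually store relationships.

```python
from collections import namedtuple

# One user-defined link between two entities.
Relationship = namedtuple("Relationship", "kind subject object")


class RelationshipStore:
    """Holds user-defined relationship types and the links made with them."""

    def __init__(self):
        self.types = set()
        self.links = []

    def create_type(self, kind):
        self.types.add(kind)

    def relate(self, kind, subject, obj):
        """Link two entities; the relationship type must be created first."""
        if kind not in self.types:
            raise KeyError(kind)
        link = Relationship(kind, subject, obj)
        self.links.append(link)
        return link

    def involving(self, person):
        """Return every link a person takes part in, e.g. for an export."""
        return [l for l in self.links if person in (l.subject, l.object)]


if __name__ == "__main__":
    store = RelationshipStore()
    store.create_type("Sister")
    store.relate("Sister", "Mary Kate", "Ashley")
    print(store.involving("Ashley"))
```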

Does this answer your questions?

-- 
Morbus Iff ( i put the demon back in codemonkey )
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
Spidering Hacks: http://amazon.com/exec/obidos/ASIN/0596005776/disobeycom
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus

On Jan 31, 2004, at 7:57 PM, Morbus Iff wrote:

> Here's the question: "shouldn't the installer
> move the media_files/ directory for the user"?

I can only tell you from my perspective that so long as the 
documentation is clear, I can likely figure that out.  I get much more 
confused and frustrated when I have to deal with multiple dependencies, 
each with different config options (ports, passwords, etc., etc.).  So 
if all I have to do is stick the files in the right directory, I'm 
happy.

Also, Morbus, I realized the other day you're a Mac user too.  It would 
be nice if you had an installer for Mac OS X that just used the default 
OS Apache stuff and directories.

Bruce

Sounds all good (particularly the post-movie-only phase).  Dan, 
incidentally, has been working on some interesting stuff that I 
recently blogged about:

	http://curtis.med.yale.edu/dchud/writings/blm.html

Another sorta-related development of late is that two developers of 
Z39.50 and SRW technologies -- the stuff you'd use to suck in 
library-oriented data -- have joined the OpenOffice bibliographic 
module project.  Some of this stuff now does automatic conversion of 
USMARC --> MODS v3.  I think there's a lot of promise in this sort of 
online data extraction.

Bruce

> >I'll be CCing this to the libdb-develop list at Sourceforge
> >sans any identification. If you want jump on board and
> >introduce yourself, responding to my comments, have a blast.

Just for clarification, he gave me the go-ahead to identify him.
I actually sent this message (along with the Installation: one,
coming soon) on Friday, but due to hiccups at Sourceforge, I've
had to resend 'em.

-- 
Morbus Iff ( is this a cut out bath-poster Morbus, or what? )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus

Hey all. I wanted to chat with you regarding the installer and
a conversation I got into with an early tester of said code. In
a nutshell, it comes down to "how casual is casual?".

The conversation occurred over what we can expect the user to
handle, and what we can expect the installer to handle for the
user. Before I get into the technical side of things, brush
up on the User Profiles shown here:

  http://disobey.com/noos/LibDB/?ProjectGoals

We'll be talking about the "casual user". I'd like to state
that my definition of the "casual user" contains these maxims:

 * the user does not have shell access to their web host.
 * the user is not the root user at their web host.
 * the user has FTP'd the LibDB files to the web host.

With that in mind, the technical aspects. There are two different
ways (actually, three, but mod_perl is not important to this
discussion) a user can run LibDB:

 * in a /cgi-bin/ directory provided by their web host. all
   files in a cgi-bin are treated as executable. this means
   that media files, like GIFs, JPEGs, CSS, etc. CAN NOT be
   viewed from under this directory. /cgi-bin/style.css would
   cause an Internal Server Error as Apache would try to
   execute it.

 * in an ExecCGI'd directory. ExecCGI is a way to say "ok,
   anything in this directory that has a .cgi extension
   should be considered an executable script; everything
   else should be considered a normal file to be handled
   as you would any other day of the week".
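
The two setups can be modelled in a few lines. This is a simplified sketch of the description above, not of Apache itself: real behaviour depends on how the server is configured, and `apache_would_execute` and the two directory labels are hypothetical.

```python
import posixpath


def apache_would_execute(url_path, directory_kind):
    """Model the two setups described above.

    directory_kind is "cgi-bin" (every file is treated as executable)
    or "execcgi" (only files ending in .cgi are executed).
    """
    if directory_kind == "cgi-bin":
        return True
    if directory_kind == "execcgi":
        return posixpath.splitext(url_path)[1].lower() == ".cgi"
    raise ValueError("unknown directory kind: %r" % directory_kind)


if __name__ == "__main__":
    print(apache_would_execute("/cgi-bin/style.css", "cgi-bin"))
    print(apache_would_execute("/libdb/media_files/style.css", "execcgi"))
    print(apache_would_execute("/libdb/index.cgi", "execcgi"))
```

Under this model a stylesheet in a /cgi-bin/ is "executed" (and so fails), while the same file in an ExecCGI'd directory is served normally.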

LibDB, and the installer, require media files for proper visual
operation. These will include javascript files, CSS files, and
image files. These files are stored in a directory called
media_files/. Nothing too exciting here.

One of the very first steps the installer takes is to make an
attempt to access these media files. This will ensure that the
user can properly see images and styles that are part of the
installer program (as well as the rest of LibDB).

Currently, the installer loads an external CSS file (located
in media_files/). If a certain bit of text in the installer is
red, then the CSS has been loaded successfully. This typically
means that the user is under an ExecCGI directory: media_files/
style.css does not end in .cgi, thus Apache serves it as normal.

If the text is NOT RED, however, it means that Apache was not
able to serve the style.css. Currently, this means "the user
has placed the LibDB files into a /cgi-bin/ directory". The
installer instructs the user to move the media_files/ directory
to a web accessible location, and enter the URL to the newly
moved directory. The user enters the URL, the script reloads,
and if the text is red, everything is fine, and the installer
continues.
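
The check above relies on the user's eyes (is the text red?). As a
rough sketch only, here is how the same check could be done over
HTTP in Python. This is not what the installer currently does: the
function names and the returned labels are hypothetical, and the
only thing taken from this email is that style.css lives under the
media_files/ URL the user enters and that a failed load from a
/cgi-bin/ tends to show up as a 500.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin


def stylesheet_url(media_base_url: str) -> str:
    """Build the URL of style.css under the media_files/ URL the user entered."""
    # urljoin drops the last path segment unless the base ends in a slash.
    if not media_base_url.endswith("/"):
        media_base_url += "/"
    return urljoin(media_base_url, "style.css")


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url (needs network access)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses are raised as exceptions


def interpret_status(status: int) -> str:
    """Map a status code to the installer's next step (labels are made up)."""
    if status == 200:
        return "continue"
    if status == 500:
        return "move_media_files"  # most likely sitting in a /cgi-bin/
    return "check_url"  # e.g. 404: the URL the user typed is wrong
```

A caller would chain the three, e.g.
interpret_status(fetch_status(stylesheet_url(base))). Connection
failures (urllib.error.URLError) are deliberately left uncaught here.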

Here's the question: "shouldn't the installer
move the media_files/ directory for the user"?

I certainly like the idea. Currently, if
the text is NOT RED, the assumption is that:

 a) the user will open up their FTP program.

 b) the user will navigate to their LibDB directory.

 c) the user will move the media_files directory to
    a web accessible location, without fail.

 d) the user will enter the URL to the location
    they moved the media_files to, without fail.

Assuming no failure, that's another four steps the user must go
through before they can actually install LibDB. Is this a problem?

The conversation I had basically went (and to the person
in question, I'm paraphrasing rather badly to make this
email more concise <g>).

 THEM: Well, the installer should detect that the loading
       of the style.css failed (with a check of HTTP
       code 500) and move the media_files for the user.

   ME: Yeah, but I'd still have to ask the user for a directory
       to move them into, which could potentially be dangerous,
       because it assumes the user knows the full path of
       their web directory. That's not something normal
       people know - at the very least, it'd be a call to
       their webhost, or a rummage through paperwork.

 THEM: The installer shouldn't ask them for the directory,
       it should just move them automatically. You can
       program heuristics for this.

   ME: I certainly agree - some /cgi-bin/'s are subdirectories
       of the user's root (so I could move "up", check for the
       existence of .html files, and copy) or are in the same
       directory as an "htdocs", "htweb", etc. (so, move "up",
       if failure, look for "htdocs", copy), but I still worry
       about this. By not prompting the user, we're assuming
       that our heuristics will NOT fail. And what happens if
       the user already has a directory called "media_files"?
       What happens if the user, a month from now, updates their
       site, sees "media_files", has no idea what it is ("I didn't
       place that there!") and deletes it?

 THEM: Yeah, but the user could just as easily say
       "pfff, this is hard", and never use LibDB again.
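
For anyone who wants to poke at the heuristics mentioned in that
exchange, here's a minimal Python sketch. It only covers the two
cases named above (move "up" and look for .html files, otherwise
look for a sibling "htdocs" or "htweb"), plus a refusal to touch an
existing "media_files". The function names are hypothetical, it
assumes it is handed the cgi-bin directory itself, and it is exactly
as fallible as the conversation worries it would be.

```python
from pathlib import Path
from typing import Optional

CANDIDATE_DIRS = ("htdocs", "htweb")  # the names mentioned in the thread


def guess_web_root(cgi_dir: Path) -> Optional[Path]:
    """Guess the web-accessible directory for a given cgi-bin, or None."""
    parent = cgi_dir.parent
    # Case 1: cgi-bin is a subdirectory of the web root, next to .html files.
    if any(parent.glob("*.html")):
        return parent
    # Case 2: cgi-bin sits beside a directory such as htdocs/.
    for name in CANDIDATE_DIRS:
        candidate = parent / name
        if candidate.is_dir():
            return candidate
    return None  # heuristics failed; the user has to be asked after all


def safe_target(web_root: Path, name: str = "media_files") -> Optional[Path]:
    """Return where media_files would go, or None if that name is taken."""
    target = web_root / name
    return None if target.exists() else target
```

Note that neither function copies anything; a None from either one
is the point where the installer would have to fall back to asking.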

So. I want people's thoughts. The core question is: should
the installer move/copy media_files for the user, or should
we assume that the user will be willing to do it?

At this point, my feelings are:

 * the user has already uploaded the LibDB directory via FTP.

 * the user probably already has a website (I don't know of
   many non-website-users that would explore web-based
   software as opposed to a native implementation).

 * the user knows where .html files go, and thus, moving
   the media_files directory to the same place won't be
   an extra step that causes much duress.

 * copying automatically is awesome, but has too many
   things to worry about (heuristics, future deletion, etc.).

-- 
Morbus Iff ( cheese and rice saves )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus
-- 
Morbus Iff ( is this a cut out bath-poster Morbus, or what? )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus

 >Date: Fri, 30 Jan 2004 17:31:05 -0500
 >To: Daniel Chudnov <dc...@um...>
 >From: Morbus Iff <mo...@di...>
 >Subject: Re: libdb and FRBR
 >
 >
 >I'll be CCing this to the libdb-develop list at Sourceforge
 >sans any identification. If you want to jump on board and
 >introduce yourself, responding to my comments, have a blast.
 >
 >>My understanding of FRBR, as a spec and meme, is that they are just what
 >>they claim -- functional requirements, but for users, not systems.
 >
 >Exactly. It's a concept model, not a data model.
 >
 >>Seeing that you've modelled its core concepts as literal tables, I'm
 >>wondering what your processing model will be, and how you're planning to
 >>do basic object storage.
 >
 >Heh, heh. Why don't you talk to uh, my um, marketing department <g>.
 >
 >>Hmm, let me try that again:  How do you plan to populate the FRBR
 >>primitives tables (work, expression, etc.)?  Are you sucking in
 >>MARC/MODS/etc records and running a local implementation of OCLC's FRBR
 >>algorithm on them?  If so, are you discarding the original source
 >>metadata?  Will you count on users to identify the relationships, or will
 >>it be semi-/fully-automated?
 >
 >Aaaahh. Much better! <g>
 >
 >Yes, there will be aggregated data. This shouldn't be too surprising,
 >since my latest book has been O'Reilly's SPIDERING HACKS. The first
 >type of data I'm *specifically* attacking for LibDB is movies, but
 >if you looked at the database tables without that knowledge, it
 >may not be immediately obvious (ie. the database tables are not
 >dependent on an implied media). As such, data would be sucked
 >down from IMDb, but for books and other standard librarian
 >stuff, it'd be sucked in through (whatever formats LibDB
 >supports, which could be MARC, MODS, etc.).
 >
 >The original source metadata would be discarded. LibDB will
 >support export formats (in a RESTian URL structure), such that
 >you'd be able to get data as RDF, MARC, FOAF, etc., etc.
 >
 >With that in mind, the planned workflow for movies:
 >
 > 1) User types in movie name and year.
 > 2) User gets back either:
 >
 >    a) the matching movie from IMDb, split up in a
 >       giant form that doesn't mention any FRBR terms.
 >
 >    b) a list of matching movies, to which they'd choose
 >       the right one, and be faced with a), above.
 >
 > 3) user verifies all information.
 >
 >There's a heckuva lot missing between 2a and 3, and that's mainly all
 >interface/forms. I don't have any plans to mention the term "relationships",
 >whatsoever. LibDB will handle all the core relationships implicitly:
 >it will create the work/expression relationship based on the data
 >sucked down, and the expression/manifestation/item relationships
 >based on user data ("i own the dvd, it's in the third box, and
 >I thought the movie sucked").
 >
 >Relationships with Group 1 and Group 2 entities (for movies, cast,
 >crew, and companies) are handled automatically within the code.
 >The user will merely see a list of all the people who starred in
 >the movie, all the people/companies who worked on the movie,
 >and they'll have the option of choosing which info they want
 >to save into the database (though, I'm up in the air on that
 >one), as well as the ability to override any of the "roles"
 >relationships.
 >
 >Now again, I won't mention "roles" at all.
 >The interface would look something like:
 >
 > "Julia Roberts"         Cast Member  "CharacterName"
 > "Something Someone"     Crew Member  [ "2nd Post Production Assistant" ]
 > "Artisan Entertainment"              [ "Distributor" ]
 >
 >In this example, the [] indicates a select/popup box, and "2nd Post
 >Production Assistant" is the data received from IMDb. The user would
 >be able to (as I would) pick the more generic "Post Production
 >Assistant" from that select box. The select box is populated
 >with all the roles the database knows of (in a future version of
 >the database, roles will be associated with an authority/form,
 >so that if you were adding a "book", you wouldn't see "Post
 >Production Assistant", and if you were adding a "film", you'd
 >see "Titles" instead of "Typesetter").
 >
 >Likewise, Group 3 entities would be defined as relationships, but
 >to the end user, they'd just see a big text box that says "Enter
 >Concepts, one per line", "Enter Events, one per line". I'm still
 >debating on having a popup of known Concepts, Events, and just
 >having the user pick from a dozen possible popups (along with
 >write-ins).
 >
 >Once the user has gone through all the data, making changes where
 >they'd see fit, data verification would occur. This is largely
 >grey area at the moment, but stuff like this would happen:
 >
 >  "The concept 'Murder' exists, and has been assigned."
 >  "The event 'Sherwood Forest' did not exist, and has been assigned."
 >
 >  "You already have a person in the database named 'Julia Roberts'.
 >   Is this the same 'Julia Roberts' that was involved with:
 >
 >        * Work 1 (ie. movie 'Erin Brockovich')
 >        * Work 2 (ie. movie 'Runaway Bride')"
 >
 >and so on.
 >
 >Of course, at some point, users will want to more granularly define
 >relationships. They may want to "create a relationship type" called
 >"Sister", and then "make a relationship" between "person Mary Kate" and
 >"person Ashley". Those sorts of relationships can't be implied easily
 >from any data that currently exists. However, once they're created, the
 >relationship becomes usable to other applications (ie. when a user exports
 >either of those sisters as RDF or FOAF data).
 >
 >Does this answer your questions?
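
The data verification step in the quoted message is described as
"largely grey area", so the following is only a guess at its shape,
written in Python with plain sets and dicts standing in for the
database. The function names and data layout are hypothetical; the
message wording is copied from the examples in the quoted text.

```python
def assign_term(known_terms: set, kind: str, name: str) -> str:
    """Assign a concept or event, creating it first if it is not known yet."""
    if name in known_terms:
        return f"The {kind} '{name}' exists, and has been assigned."
    known_terms.add(name)
    return f"The {kind} '{name}' did not exist, and has been assigned."


def works_for_existing_person(people: dict, name: str) -> list:
    """Return the works already linked to a person with this name, if any.

    A non-empty result means the user should be asked whether the incoming
    person is the same one; matching on the name alone cannot decide that.
    """
    return list(people.get(name, []))
```

With people = {"Julia Roberts": ["Erin Brockovich", "Runaway Bride"]},
a lookup for "Julia Roberts" returns both works, which is the list
the "Is this the same 'Julia Roberts'" prompt would show.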

-- 
Morbus Iff ( cheese and rice saves )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus

>http://journal.dajobe.org/journal/archives/2004_01.html#001665
>"Lots of new stuff that's too difficult to summarise,
>the main things are a new MySQL store..."

That *is* good news, though I've fallen out of favor with
using RDF triples for data storage (instead using them
only as a means of output).

I'll investigate more though.

-- 
Morbus Iff ( i'm the droid you're looking for )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus

FYI...

http://journal.dajobe.org/journal/archives/2004_01.html#001665

"Lots of new stuff that's too difficult to summarise, the main things 
are a new MySQL store..."

Bruce

