[libdb-develop] Re: libdb and FRBR
From: Morbus I. <mo...@di...> - 2004-02-01 02:15:28
I'll be CCing this to the libdb-develop list at Sourceforge sans any
identification. If you want to jump on board and introduce yourself,
responding to my comments, have a blast.

>My understanding of FRBR, as a spec and meme, is that they are just what
>they claim -- functional requirements, but for users, not systems.

Exactly. It's a concept model, not a data model.

>Seeing that you've modelled its core concepts as literal tables, I'm
>wondering what your processing model will be, and how you're planning to
>do basic object storage.

Heh, heh. Why don't you talk to uh, my um, marketing department <g>.

>Hmm, let me try that again: How do you plan to populate the FRBR
>primitives tables (work, expression, etc.)? Are you sucking in
>MARC/MODS/etc records and running a local implementation of OCLC's FRBR
>algorithm on them? If so, are you discarding the original source
>metadata? Will you count on users to identify the relationships, or will
>it be semi-/fully-automated?

Aaaahh. Much better! <g>

Yes, there will be aggregated data. This shouldn't be too surprising,
since my latest book has been O'Reilly's SPIDERING HACKS. The first type
of data I'm *specifically* attacking for LibDB is movies, but if you
looked at the database tables without that knowledge, it may not be
immediately obvious (ie. the database tables are not dependent on an
implied media).

As such, movie data would be sucked down from IMDb, but for books and
other standard librarian stuff, it'd be sucked in through whatever
formats LibDB supports (which could be MARC, MODS, etc.). The original
source metadata would be discarded. LibDB will support export formats
(in a RESTian URL structure), such that you'd be able to get data as
RDF, MARC, FOAF, etc., etc.

With that in mind, the planned workflow for movies:

 1) User types in movie name and year.
 2) User gets back either:
    a) the matching movie from IMDb, split up in a giant form that
       doesn't mention any FRBR terms.
    b) a list of matching movies, from which they'd choose the right
       one, and be faced with a), above.
 3) User verifies all information.

There's a heckuva lot missing between 2a and 3, and that's mainly all
interface/forms. I don't have any plans to mention the term
"relationships", whatsoever. LibDB will handle all the core
relationships implicitly: it will create the work/expression
relationship based on the data sucked down, and the
expression/manifestation/item relationships based on user data ("i own
the dvd, it's in the third box, and I thought the movie sucked").

Relationships between Group 1 and Group 2 entities (for movies: cast,
crew, and companies) are handled automatically within the code. The user
will merely see a list of all the people who starred in the movie, all
the people/companies who worked on the movie, and they'll have the
option of choosing which info they want to save into the database
(though, I'm up in the air on that one), as well as the ability to
override any of the "roles" relationships. Now again, I won't mention
"roles" at all. The interface would look something like:

 "Julia Roberts"          Cast Member   "CharacterName"
 "Something Someone"      Crew Member   [ "2nd Post Production Assistant" ]
 "Artisan Entertainment"                [ "Distributor" ]

In this example, the [] indicates a select/popup box, and "2nd Post
Production Assistant" is the data received from IMDb. The user would be
able to (as I would) pick the more generic "Post Production Assistant"
from that select box. The select box is populated with all the roles the
database knows of (in a future version of the database, roles will be
associated with an authority/form, so that if you were adding a "book",
you wouldn't see "Post Production Assistant", and if you were adding a
"film", you'd see "Titles" instead of "Typesetter").

Likewise, Group 3 entities would be defined as relationships, but the
end user would just see a big text box that says "Enter Concepts, one
per line", "Enter Events, one per line".
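
A minimal sketch of how that role select box might be populated,
including the future per-form filtering. This is only an illustration in
Python: `RoleCatalog`, `add`, and `options_for` are hypothetical names,
not anything from the LibDB code or schema, and treating a role with no
forms attached as valid for every form is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class RoleCatalog:
    """Hypothetical store of every role the database knows of."""

    # Role name -> forms it applies to. An empty set means "any form"
    # (an assumption made for this sketch).
    roles: dict = field(default_factory=dict)

    def add(self, role, *forms):
        """Register a role, optionally tied to one or more forms."""
        self.roles.setdefault(role, set()).update(forms)

    def options_for(self, form, fetched_role=None):
        """Build the select-box options shown when adding a given form.

        The role received from the source (e.g. IMDb) is listed first,
        so it stays the default until the user picks a generic one.
        """
        options = sorted(
            role
            for role, forms in self.roles.items()
            if not forms or form in forms
        )
        if fetched_role:
            if fetched_role in options:
                options.remove(fetched_role)
            options.insert(0, fetched_role)
        return options


catalog = RoleCatalog()
catalog.add("Post Production Assistant", "film")
catalog.add("Titles", "film")
catalog.add("Typesetter", "book")
catalog.add("Distributor")  # no form given: offered everywhere

film_options = catalog.options_for("film", "2nd Post Production Assistant")
book_options = catalog.options_for("book")
```

Under those assumptions, `film_options` starts with the IMDb-supplied
"2nd Post Production Assistant" followed by the generic film roles,
while `book_options` never offers "Post Production Assistant".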
For Group 3, I'm still debating whether to have popups of known Concepts
and Events, and just have the user pick from a dozen possible popups
(along with write-ins).

Once the user has gone through all the data, making changes where they
see fit, data verification would occur. This is largely grey area at the
moment, but stuff like this would happen:

 "The concept 'Murder' exists, and has been assigned."

 "The event 'Sherwood Forest' did not exist, and has been assigned."

 "You already have a person in the database named 'Julia Roberts'.
  Is this the same 'Julia Roberts' that was involved with:

    * Work 1 (ie. movie 'Erin Brockovich')
    * Work 2 (ie. movie 'Runaway Bride')"

and so on.

Of course, at some point, users will want to define relationships more
granularly. They may want to "create a relationship type" called
"Sister", and then "make a relationship" between "person Mary Kate" and
"person Ashley". Those sorts of relationships can't be implied easily
from any data that currently exists. However, once they're created, the
relationship becomes usable to other applications (ie. when a user
exports either of those sisters as RDF or FOAF data).

Does this answer your questions?

-- 
Morbus Iff ( i put the demon back in codemonkey )
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
Spidering Hacks: http://amazon.com/exec/obidos/ASIN/0596005776/disobeycom
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus