|
From: Jeff S. <jsq...@us...> - 2002-10-30 12:30:41
|
jsquyres 02/10/30 04:30:40 Added: libmaildb/db/mysql/doc design.html Log: Forgot to add this file Revision Changes Path 1.1 maildb/libmaildb/db/mysql/doc/design.html Index: design.html =================================================================== <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <html> <head> <!-- saved from url=(0189)http://aves.org:2095/liza+aves.org/neomail.pl?action=viewattachment&sessionid=avesorg-session-0.0982186974297328&message_id=fd02e2f221c24eb3770452d3143b0894&folder=INBOX&attachment_number=1 --> <meta http-equiv="Content-Type" content="text/html; charset=windows-1252"> <meta content="MSHTML 5.50.4611.1300" name="GENERATOR"> <title>MailDB Design</title> </head> <body> <h1>MailDB Database Design</h1> <p><a href="#intro">Introduction</a><br> <a href="#tables">Table Descriptions</a><br> <a href="#issues">Open Questions/Issues</a><br> <a href="#details">Table Details</a><br> <a href="#storage">MySQL Storage Notes</a><br> </p> <p></p> <hr><a name="intro"></a> <h2>Introduction</h2> <p>This document describes the proposed design for the MailDB database. This design is a work-in-progress and is subject to change; outstanding questions/issues are listed below in <a href="http://aves.org:2095/liza+aves.org/neomail.pl?action=viewattachment&sessionid=avesorg-session-0.0982186974297328&message_id=fd02e2f221c24eb3770452d3143b0894&folder=INBOX&attachment_number=1#issues">Open Questions/Issues</a>. </p> <p>This initial design assumes use of MySQL as the backend RDBMS. The design is generally normalized, the notable exception being message header tags (left of the colon) in the individual header records (which are stored in the msg_hdrs table). In this case the benefits of normalizing the data seemed to be outweighed by the slight increase in complexity of queries. [Note: I'm a bit on the fence on this one...decision not final. :-) -- lyw] </p> <p>Table summary descriptions are provided in the following section. A detailed description of table columns, datatypes, and a brief description/use of columns is in <a href="#details">Table Details</a>, below. The full SQL for table creation may be found in accompanying file <a href="cr_maildb.sql">cr_maildb.sql</a>, and sample queries are provided in accompanying file <a href="queries.sql">queries.sql</a>. </p> <hr><a name="tables"></a> <h2>Table Descriptions</h2> <p>There are four basic categories of tables: data common to all messages, user meta data, message data, and logs. </p> <p> <table border="1"> <tbody> <tr> <th colspan="6">Common message data</th> </tr> <tr> <th>Table</th> <th>Description</th> <th>Primary key</th> <th>Add'l Indexes</th> <th>References</th> <th>Referenced by</th> </tr> <tr> <td valign="top">mime_types</td> <td valign="top">Normalized table of mime_type definitions.</td> <td valign="top">Mime type numeric identifier (mt_id)</td> <td valign="top">Mime type (mt_desc)</td> <td valign="top">n/a</td> <td valign="top">msg_attach</td> </tr> <tr> <th colspan="6">User meta data</th> </tr> <tr> <th>Table</th> <th>Description</th> <th>Primary key</th> <th>Add'l Indexes</th> <th>References</th> <th>Referenced by</th> </tr> <tr> <td valign="top">users</td> <td valign="top">Normalized table of MailDB users.</td> <td valign="top">User numeric identifier (u_id)</td> <td valign="top">User name (u_desc)</td> <td valign="top">n/a</td> <td valign="top">cats<br> views<br> msg_users</td> </tr> <tr> <td valign="top">cats</td> <td valign="top">Message categories (i.e. "folders") created by users, to which messages are assigned. Categories may be [multiply] nested, and messages may assigned to multiple categories.</td> <td valign="top">Category numeric identifier (ca_id)</td> <td valign="top">Category owner (ca_u_id)</td> <td valign="top">n/a</td> <td valign="top">msg_users</td> </tr> <tr> </tr> <tr> <td valign="top">views</td> <td valign="top">User views of messages...specifically, defined search criteria and the SQL required to execute the searches.</td> <td valign="top">View numeric identifier (vw_id)</td> <td valign="top">View owner (vw_u_id)</td> <td valign="top">users</td> <td valign="top">n/a</td> </tr> <tr> <th colspan="6">Message data</th> </tr> <tr> <th>Table</th> <th>Description</th> <th>Primary key</th> <th>Add'l Indexes</th> <th>References</th> <th>Referenced by</th> </tr> <tr> <td valign="top">msg_ids<br> </td> <td valign="top">Normalized table of message ids.<br> </td> <td valign="top">Numeric message record identifier (m_id)<br> </td> <td valign="top">Message id (m_msg_id)<br> </td> <td valign="top">n/a<br> </td> <td valign="top">msg_hdrs, msg_users, msg_attach, msg_log<br> </td> </tr> <tr> <td valign="top">msg_hdrs</td> <td valign="top">Message header information for a given message. Message header sort order is maintained. Multiple records will exist for each message, one per unique header line. </td> <td valign="top">Numeric message header record identifier (mh_id)</td> <td valign="top">Normalized message id (mh_m_id), header key (mh_key), header value (mh_value) [fulltext index]</td> <td valign="top">n/a</td> <td valign="top">msg_users</td> </tr> <tr> <td valign="top">msg_users</td> <td valign="top">Records describing user instances (category filing) of specific messages. Users may assign messages to one or more categories, thus multiple records may exist in the table for a given message & user. Includes references into msg_hdrs table for faster retrieval of message date, sender, subject for category "scans".</td> <td valign="top">Message instance numeric identifier (mu_id).</td> <td valign="top">Normalized message id (mu_m_id), user (mu_u_id), category (mu_ca_id)</td> <td valign="top">users, cats, msg_hdrs</td> <td valign="top">n/a</td> </tr> <tr> <td valign="top">msg_attach</td> <td valign="top">Attachments to a given message. Message "body" not in an attachment is treated within the database as attachment 0. Attachment sort order is maintained. Records include mime type, file encoding, multi-part separator data, flag for attachment storage (0 = internal, in the database; 1 = external, in the filesystem).</td> <td valign="top">Message attachment numeric identifier (ma_id).</td> <td valign="top">Normalized message id (ma_m_id), body (ma_body) [fulltext index]</td> <td valign="top">mime_types</td> <td valign="top">n/a</td> </tr> <tr> <th colspan="6">Logging</th> </tr> <tr> <th>Table</th> <th>Description</th> <th>Primary key</th> <th>Add'l Indexes</th> <th>References</th> <th>Referenced by</th> </tr> <tr> <td valign="top">obj_log</td> <td valign="top">History of actions on meta objects - namely, create, update, delete of users, categories, views, mime_types. To be maintained via trigger (to be defined).</td> <td valign="top">Numeric log record identifier (obj_id)</td> <td valign="top">Object type (obj_type), object identifier (obj_type_id), object description (obj_desc)</td> <td valign="top">n/a</td> <td valign="top">n/a</td> </tr> <tr> <td valign="top">msg_log</td> <td valign="top">History of message activity - user adds, deletes from categories. To be maintained via trigger (to be defined).</td> <td valign="top">Numeric log record identifier (msl_id).</td> <td valign="top">Normalized message id (msl_m_id), user (msl_u_id), category (msl_ca_id)</td> <td valign="top">n/a</td> <td valign="top">n/a</td> </tr> </tbody> </table> </p> <hr><a name="issues"></a> <h2>Open Questions/Issues</h2> <ol> <li>Design issues <ol type="a"> <li>Review all field sizing, especially msg_id and record identifiers. </li> <li>Should header info be normalized? Not done so above. </li> <li>Should users or mime_types be de-normalized? </li> <li>Size threshold for determining whether attachment is stored in the database or as external file? (Need to know here only for reasonable sizing of the ma_body field in msg_attach.) </li> </ol> </li> <li>MySQL specifics <ol type="a"> <li>Triggers for maintaining log (history) tables. <b>Update: No trigger or stored procedure support in MySQL. Estimate is that these items will be included in version 5.0. Current stable MySQL is 3.23.x and 4.x is now available.</b> </li> <li>Check MySQL behavior when auto_incremented column "rolls over". </li> <li>Review MySQL regex pattern matching. </li> <li>Note vanilla MySQL has no true foreign key constraints. You can define the foreign keys (foo integer references bar(foo)) but they aren't enforced. </li> </ol> </li> <li>Message questions <ol type="a"> <li>Multi-part separator data format? </li> <li>Which header line is used for message threading? Should it be included in msg_users table for quick category scanning? </li> </ol> </li> <li>Miscellaneous <ol type="a"> <li>Construct and test sample queries... </li> </ol> </li> </ol> <hr><a name="details"></a> <h2>Table Details</h2> <table border="1"> <tbody> <tr> <th>Table</th> <th>Column</th> <th>Datatype</th> <th>Primary Key</th> <th>Indexed</th> <th>Description</th> </tr> <tr> <td valign="top" rowspan="2">mime_types</td> <td>mt_id</td> <td>smallint unsigned not null auto_increment</td> <td>X</td> <td>X</td> <td>Unique identifier for mime_type record</td> </tr> <tr> <td>mt_desc</td> <td>varchar(127)</td> <td> </td> <td>X</td> <td>Message body/attachment MIME type</td> </tr> <tr> <td valign="top" rowspan="2">users</td> <td>u_id</td> <td>smallint unsigned not null auto_increment</td> <td>X</td> <td>X</td> <td>Unique identifier for user record</td> </tr> <tr> <td>u_desc</td> <td>varchar(127)</td> <td> </td> <td>X</td> <td>User description (login/reference name, e.g. 'weissler' or 'li...@av...'</td> </tr> <tr> <td valign="top" rowspan="4">cats</td> <td>ca_id</td> <td>mediumint unsigned not null auto_increment</td> <td>X</td> <td>X</td> <td>Unique identifier for category record</td> </tr> <tr> <td>ca_u_id</td> <td>smallint unsigned not null</td> <td> </td> <td>X</td> <td>User owning this category. References users(u_id).</td> </tr> <tr> <td>ca_desc</td> <td>varchar(127)</td> <td> </td> <td> </td> <td>Category name, e.g. "Inbox", "Deleted", "Cycling"...</td> </tr> <tr> <td>ca_parent</td> <td>mediumint unsigned</td> <td> </td> <td> </td> <td>"Parent" category, allowing nesting of categories. Top-level categories have null parents. References cats(ca_id).</td> </tr> <tr> <td valign="top" rowspan="4">views</td> <td>vw_id</td> <td>mediumint unsigned not null auto_increment</td> <td>X</td> <td>X</td> <td>Unique identifier for view record</td> </tr> <tr> <td>vw_desc</td> <td>varchar(127)</td> <td> </td> <td> </td> <td>View name, e.g. "Messages from Robert"</td> </tr> <tr> <td>vw_u_id</td> <td>smallint unsigned not null</td> <td> </td> <td>X</td> <td>User owning view. References users(u_id).</td> </tr> <tr> <td>vw_query</td> <td>text</td> <td> </td> <td> </td> <td>SQL required to retrieve messages meeting user's search criteria</td> </tr> <tr> <td valign="top" rowspan="2">msg_ids<br> </td> <td valign="top">m_id<br> </td> <td valign="top">mediumint unsigned not null auto_increment<br> </td> <td valign="top">X<br> </td> <td valign="top">X<br> </td> <td valign="top">Unique identifier for message id (using less storage than the string message_id's in messages)<br> </td> </tr> <tr> <td valign="top">m_msg_id<br> </td> <td valign="top">varchar(255)<br> </td> <td valign="top"><br> </td> <td valign="top">X<br> </td> <td valign="top">Message identifier (assigned by MTA)<br> </td> </tr> <tr> <td valign="top" rowspan="5">msg_hdrs</td> <td>mh_id</td> <td>mediumint unsigned not null auto_increment</td> <td>X</td> <td>X</td> <td>Unique identifier for message header record</td> </tr> <tr> <td>mh_m_id</td> <td>mediumint unsigned not null</td> <td> </td> <td>X</td> <td>Message identifier</td> </tr> <tr> <td>mh_key</td> <td>varchar(63)</td> <td> </td> <td>X</td> <td>Header tag - e.g. "To", "From", "Subject"...anything left of colon (:) on header line. </td> </tr> <tr> <td>mh_value</td> <td>varchar(255)</td> <td> </td> <td>X</td> <td>Header value - anything right of colon (:) on header line. Fulltext index.</td> </tr> <tr> <td>mh_sort_order</td> <td>smallint unsigned not null</td> <td> </td> <td> </td> <td>Header sequence number.</td> </tr> <tr> <td valign="top" rowspan="8">msg_users</td> <td>mu_id</td> <td>mediumint unsigned not null auto_increment</td> <td>X</td> <td>X</td> <td>Unique identifier for message instance record.</td> </tr> <tr> <td>mu_m_id</td> <td>mediumint unsigned not null</td> <td> </td> <td>X</td> <td>Message identifier.</td> </tr> <tr> <td>mu_u_id</td> <td>smallint unsigned not null</td> <td> </td> <td>X</td> <td>User owning a copy of the message.</td> </tr> <tr> <td>mu_ca_id</td> <td>mediumint unsigned not null</td> <td> </td> <td>X</td> <td>Category to which the user assigned the message.</td> </tr> <tr> <td>mu_date</td> <td>mediumint unsigned not null</td> <td> </td> <td> </td> <td>Foreign key ref to corresponding "Date" header in msg_hdrs.</td> </tr> <tr> <td>mu_from</td> <td>mediumint unsigned not null</td> <td> </td> <td> </td> <td>Foreign key ref to corresponding "From" header in msg_hdrs.</td> </tr> <tr> <td>mu_subject</td> <td>mediumint unsigned not null</td> <td> </td> <td> </td> <td>Foreign key ref to corresponding "Subject" header in msg_hdrs.</td> </tr> <tr> <td>mu_flags</td> <td>tinyint unsigned</td> <td> </td> <td> </td> <td>Bit flags for the message.</td> </tr> <tr> <td valign="top" rowspan="9">msg_attach</td> <td>ma_id</td> <td>mediumint unsigned not null auto_increment</td> <td>X</td> <td>X</td> <td>Unique identifier for message attachment record.</td> </tr> <tr> <td>ma_m_id</td> <td>mediumint unsigned not null</td> <td> </td> <td>X</td> <td>Message identifier.</td> </tr> <tr> <td>ma_mt_id</td> <td>smallint unsigned not null</td> <td> </td> <td> </td> <td>Foreign key ref to mime_types(mt_id)</td> </tr> <tr> <td>ma_encode</td> <td>varchar(127)</td> <td> </td> <td> </td> <td>Attachment encoding</td> </tr> <tr> <td>ma_storage</td> <td>boolean</td> <td> </td> <td> </td> <td>Denotes whether attachment is stored within the database (0) or externally in the filesystem (1).</td> </tr> <tr> <td>ma_body</td> <td>mediumtext</td> <td> </td> <td>X</td> <td>Attachment body (for attachments stored within the database). Fulltext index.</td> </tr> <tr> <td>ma_path</td> <td>varchar(255)</td> <td> </td> <td> </td> <td>Full pathname to attachment (for attachments stored in the filesystem).</td> </tr> <tr> <td>ma_sep_data</td> <td>varchar(255)</td> <td> </td> <td> </td> <td>Multi-part separator data.</td> </tr> <tr> <td>ma_sort_order</td> <td>tinyint unsigned not null</td> <td> </td> <td> </td> <td>Attachment sequence number.</td> </tr> <tr> <td valign="top" rowspan="6">obj_log</td> <td>obj_id</td> <td>int unsigned not null auto_increment</td> <td>X</td> <td>X</td> <td>Unique identifier for object log record.</td> </tr> <tr> <td>obj_type</td> <td>varchar(15)</td> <td> </td> <td>X</td> <td>Type of object - e.g. user, view, cat, mime_type</td> </tr> <tr> <td>obj_type_id</td> <td>mediumint unsigned not null</td> <td> </td> <td>X</td> <td>Object identifier; if obj_type = user, obj_type_id corresponds to users(u_id), obj_type = cat, obj_type_id corresponds to cats(ca_id), etc.</td> </tr> <tr> <td>obj_desc</td> <td>varchar(127)</td> <td> </td> <td>X</td> <td>Object description (username, viewname, category name...)</td> </tr> <tr> <td>obj_action</td> <td>char(1)</td> <td> </td> <td> </td> <td>Action taken on this object. Values include C (create), U (update), D (delete).</td> </tr> <tr> <td>obj_datetime</td> <td>timestamp(14)</td> <td> </td> <td> </td> <td>Timestamp of object action.</td> </tr> <tr> <td valign="top" rowspan="6">msg_log</td> <td>msl_id</td> <td>int unsigned not null auto_increment</td> <td>X</td> <td>X</td> <td>Unique identifier for message log record.</td> </tr> <tr> <td>msl_m_id</td> <td>mediumint unsigned not null</td> <td> </td> <td>X</td> <td>Message identifier.</td> </tr> <tr> <td>msl_u_id</td> <td>smallint unsigned not null</td> <td> </td> <td>X</td> <td>User owning instance of the specified message.</td> </tr> <tr> <td>msl_ca_id</td> <td>mediumint unsigned not null</td> <td> </td> <td>X</td> <td>Category to which the user assigned the message.</td> </tr> <tr> <td>msl_action</td> <td>char(1)</td> <td> </td> <td>X</td> <td>Action taken on this message. Values include C (create), U (update), D (delete).</td> </tr> <tr> <td>msl_datetime</td> <td>timestamp(14)</td> <td> </td> <td> </td> <td>Timestamp of message action.</td> </tr> </tbody> </table> <p></p> <hr><a name="storage"></a> <h2>MySQL Storage Notes</h2> <p> <table border="1"> <tbody> <tr> <th>Datatype</th> <th>Max</th> <th>Storage</th> </tr> <tr> <td>varchar</td> <td>255</td> <td>L+1</td> </tr> <tr> <td>tinytext</td> <td>255</td> <td>L+1</td> </tr> <tr> <td>text</td> <td>64k</td> <td>L+2</td> </tr> <tr> <td>mediumtext</td> <td>16m</td> <td>L+3</td> </tr> <tr> <td>longtext</td> <td>4g</td> <td>L+4</td> </tr> <tr> <td>tinyint (U)</td> <td>255</td> <td>1</td> </tr> <tr> <td>smallint (U)</td> <td>65535</td> <td>2</td> </tr> <tr> <td>mediumint (U)</td> <td>16777215</td> <td>3</td> </tr> <tr> <td>int (U)</td> <td>4294967295</td> <td>4</td> </tr> <tr> <td>timestamp</td> <td>YYYYMMDDHHMISS</td> <td>4</td> </tr> </tbody> </table> </p> <p>...where (U) = unsigned, and L = length of text string. </p> <p></p> <hr><i>Last update: 17 October 2002 by <a href="mailto:li...@av...">Liza Weissler</a></i><a href="mailto:li...@av..."> </a><br> </body> </html> |