Re: [mydms-devs] Proposed Improvements to MyDMS
From: Peter G. <pet...@ya...> - 2006-05-05 05:39:37
Malcolm,

Apologies for the delay in responding; your email disappeared into a pile of
junk mail.

It's good to see some development happening with MyDMS. I was very interested
in this product, and a need has arisen where it would provide a benefit.

A few queries:

Does your version implement some sort of document numbering system, i.e. more
of a Document Register than a true Document Management System? The reason is
that I would prefer a user to be able to request a document number by filling
in a form on screen, with someone else approving the number. The only time I
would require the DMS side would be to archive the completed PDF file.

Another limitation of MyDMS was searching within PDF files for
cross-referenced information.

Do you have an improved version that uses the keyword format rather than the
folder structure? I was considering using a format similar to the
www.mininova.org website, which has tables broken up by categories/keywords
etc. This would generate a web page a lot quicker.

Anyway, good to see something happening with this project.

Catch Ya

--- Malcolm Cowe <Malcolm.Cowe@Sun.COM> wrote:

> Hi,
>
> I have joined this mailing list as it is likely that I will be working
> on improving MyDMS for the next 2-3 months. An experimental internal
> deployment has experienced some teething problems as the size of the
> repository grows, so I have been assigned to look into it. I am using
> MyDMS 1.4.4 as a basis for further development.
>
> I have attached a document outlining improvements to MyDMS that will
> deliver the benefits that our internal users require in order to be able
> to continue to use MyDMS.
>
> The only option that I am unclear on how to develop and contribute back
> to the project is the document workflow (item 5). This is because we
> have an internal process that may not be general purpose enough to
> incorporate into the main source.
> There's nothing proprietary about what
> we do, but it relies on our corporate authentication mechanisms rather
> than on the user/groups stored in the MyDMS table. Now that I think
> about it, I might be able to provide a method for hooking the
> authentication into external authentication systems.
>
> Regards,
>
> Malcolm.
>
> MyDMS Improvements
> ==================
>
> 1. Replace short open tags (<?) with <?php tags.
>    - Difficulty (time): Easy (10 minutes).
>    - Status: Done.
>
> 2. Remove dependency upon register_globals.
>    - Difficulty (time): Easy (3 days). But boring.
>    - Status: Not started.
>
> 3. Database: Remove the id column from tblDocumentContent. Change the primary
>    key to be (document, version) with the auto_increment on the version field.
>    This enables the database to automatically assign the next appropriate
>    version number to any new row inserted into the table without having to
>    explicitly look up the existing version value, increment the result and
>    store that as part of the insert statement. It guarantees that there will
>    be no conflicts or duplication of version numbers by simultaneous inserts.
>    The id field is not required as it is not used anywhere except as a
>    convenient identifier. It is easily replaced by (document id, version).
>
>    - Difficulty (time): Easy (1 day).
>    - Status: Not started.
>
> 4. Database: Move the lock field into a separate table for managing locks.
>    With the current implementation it is possible, although unlikely,
>    for two people to simultaneously request a lock on a document. I propose
>    that a new table be created for storing locks in a list where the primary
>    key is the document ID (and must therefore be unique), and a second field
>    stores the user ID of the person locking the document. In this way, if
>    there are two [near-] simultaneous requests to lock a document, the first
>    request to insert a row into the lock table gets the lock on the document.
>    Any subsequent requests to insert a row will result in an error (duplicate
>    key).
>
>    - Difficulty (time): Easy (2/3 days).
>    - Status: Not started.
>
> 5. Document approval workflow. A basic, static, workflow for managing the
>    life-cycle of a document from initial draft to publication. Allows user to
>    determine the status of a document. Allows an author to submit a document
>    for peer review before publishing it in its final format. Better
>    traceability and control over the version management and change history.
>
>    - Difficulty (time): Moderate (3 weeks).
>    - Status: Not started.
>
> 6. Optimise search. Because of the recursive nature of the filesystem
>    developed for MyDMS, searching is very, very slow. What works for
>    traditional file systems does not necessarily translate into a good system
>    for a database to follow. Even for moderate MyDMS deployments, the number
>    of queries required and the amount of processing overhead generated is far
>    too high.
>
>    I have experimented with alternative mechanisms on a local deployment where
>    searching often (usually) times out. A standard search of the database
>    using the existing code takes over 4 minutes, if it completes at all. I
>    have managed to reduce this time to 0.12 seconds (yes, just over one tenth
>    of one second) using the same database, just by making a few simple
>    assumptions about how to search. Specifically, I do not use the folder
>    structure _at all_. This may seem like a cheat, but it has resulted in
>    greatly improved usability in our tests thus far. System load on the
>    database server is also reduced from 80-90% utilisation to what amounts to
>    background noise.
>
>    I do not necessarily propose that the folder structure be removed from
>    MyDMS, although that is an option (if managed carefully). There are two
>    pre-existing methods available for organising documents in MyDMS: folders
>    and keywords.
>
>    One option is to develop a system that relies entirely upon keywords for
>    its structure and relationships -- a sort of associative filing system. To
>    refine the display of documents, use an intersection of keywords (e.g.
>    "list all documents with keywords key1 AND key2 AND key3"). This also
>    easily allows documents to be associated with more than one group or
>    collection. This is sort of how GMail works -- it relies on the application
>    of one or more labels to a message in order to organise a collection of
>    email.
>
>    It is also possible to improve the performance of the existing folder
>    mechanism by incorporating an index or folderList field in the tblDocuments
>    table. For each document, record the folder hierarchy to which it belongs.
>    One can then quickly search tblDocuments, pulling out those records where
>    the specified folder ID is listed, instead of having to recursively search
>    the folder tables. It does introduce some overhead when moving files
>    around, but it's not much. It also looks a little hacky, but I think it
>    stands up as a reasonable compromise.
>
>    Of course, both implementations can happily co-exist, so it is also
>    feasible to maintain both mechanisms.
>
>    - Difficulty (time): Moderate (4 weeks).
>    - Status: Completed discovery (feasibility). Development not started.
>
> 7. Optimise display. The explorer pane on the page display takes far too long
>    to display, and it duplicates much of the information presented on the rest
>    of the page. The overhead of rendering the folder-tree greatly out-weighs
>    any potential benefit and I suggest that it be removed entirely. I have
>    created a compact theme that significantly reduces the page load times, but
>    there are still some areas that can be improved.
>
>    - Difficulty (time): Easy (1 week).
>    - Status: Partially implemented.