From: John M. <jo...@eb...> - 2014-01-19 15:08:35
Hi all,

We are tantalisingly close to primarily using maven to build the CDK. Using a standardised way of building will not only make it easier to develop, but the separate source trees will also give a different perspective on how the project is organised.

Egon recently showed an overview of the current module dependencies:

http://egonw.tumblr.com/post/70669027045/current-overview-of-cdk-module-dependencies-in

There’s a bit of a hairball in the middle but it generally looks okay. The true picture is a little more complicated, as inherited dependencies are not shown there.

One thing I’ve been thinking about recently is to group the modules together into ‘super’ modules. This coarser partition of functionality breaks down the code into a much clearer view of what is available and where.

In the scheme below I have organised the current modules into 6 (7) top-level groups. Unfortunately some of the best names that would be obvious to use (e.g. ‘io’, ‘qsar’) are already taken, so it’s a bit difficult to think of concise, descriptive names for these. I’ve tried to stick to nouns, as the verbs are a bit yuk. We could also go plural: ‘ios’, ‘qsars’, ‘descriptors’ etc...

If you have suggestions for (or objections to) the group names, please say so. Also, there are still many corners of the library whose function / utility I’m not 100% sure of. Let me know if there are any parts out of place.

Thanks,
John

Scheme:

base - contains the object-model and ‘core’ parts that everything else builds upon
  alternative name: ‘domain’?

  annotation atomtype core data silent datadebug dict interfaces reaction
  standard valencycheck

tool - contains general utilities and tools that are utilised together or separately in other modules

  fingerprint builder3d builder3dtools forcefield charges cip formula
  fragment group hash
  isomorphism - could go in base
  pcore sdg signature smarts structgen tautomer
  smsd - this could also be a top-level module, as it sits atop everything else and was originally a separate project

storage - contains functionality for storing data to different formats. Ideally we would use ‘io’, but that module is quite essential (contains Molfile readers) and it would not be feasible to refactor the naming.
  alternative names: ios, store (verb), persist (verb) - e.k. very corporate

  inchi io ioformats ionpot iordf libiocml libiomd pdb pdbcml smiles

prediction - contains quantitative modelling descriptors. We could perhaps use ‘qsar’ here and change that module’s name to ‘qsar-core’, as it only contains base classes and no descriptors.
  alternative names: describe, descriptors

  qsar qsaratomic qsarbond qsarcml qsarionpot qsarmolecular qsarprotein

display - contains functionality for drawing depictions (not laying out atoms). This would also include ‘render-svg’, ‘render-eps’ etc.
  alternative names: depict (verb)

  render renderawt renderbasic renderextra

deprecated - contains old code that is not used in the library but is required for backwards compatibility; only included in downstream projects when needed

miscellaneous - bits I can’t place anywhere else
  alternative name: other

  control - undo/redo framework and modifications to structures
  extra - dumping ground for classes that need a home; includes IO, matrix utilities and everything but the kitchen sink :)
  diff - primarily used in tests to display and describe differences in attributes. It could go in base, but I see this as auxiliary functionality.
  log4j - implementation of CDK’s own logging framework. It could go in base, but it would be good to switch to SLF4J at some point instead of maintaining a custom logging facade. Placing it here highlights that.
  qm - quantum mechanics? Not in tool, as it only seems to contain a few data structures and no real functionality? It also contains a renderer, eh?
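
To make the grouping concrete, here is a rough sketch of how one ‘super’ module (base) might look as a Maven aggregator POM. It is only an illustration under assumptions: the groupId, artifactIds, version and the idea that each module lives in a directory named after it are all hypothetical, and the real CDK coordinates may well differ. Only the module names are taken from the scheme above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical aggregator POM for the proposed 'base' group.
     All coordinates below are assumptions made for illustration only. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Assumed top-level CDK parent POM (hypothetical coordinates) -->
  <parent>
    <groupId>org.example.cdk</groupId>
    <artifactId>cdk-parent</artifactId>
    <version>0.0.1-SNAPSHOT</version>
  </parent>

  <!-- 'pom' packaging: this module produces no jar, it only groups others -->
  <artifactId>cdk-base</artifactId>
  <packaging>pom</packaging>

  <!-- One entry per module in the 'base' group; each value is assumed
       to be a sub-directory containing that module's own pom.xml -->
  <modules>
    <module>annotation</module>
    <module>atomtype</module>
    <module>core</module>
    <module>data</module>
    <module>silent</module>
    <module>datadebug</module>
    <module>dict</module>
    <module>interfaces</module>
    <module>reaction</module>
    <module>standard</module>
    <module>valencycheck</module>
  </modules>
</project>
```

The other groups (tool, storage, prediction, display, ...) would follow the same pattern, and the top-level POM would then list the groups rather than the individual modules.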