dim8 wants to merge 16 commits from /u/dim8/scaffoldhunter/ to master, 2018-07-26
Within the branch feature-107
we have written a data calculation plug-in that processes a set of molecular descriptors from the CDK Library. The list of descriptors includes:
and all in all adds 47 new separate table values. The full source code is to be found in the
edu.udo.scaffoldhunter.plugins.datacalculation.impl.konplugin
and its respective sub-package fingerprints
.
We have tested it on some samples and to the best of our knowledge there aren't any apparent issues. Please, review and give us your feedback.
Commit | Date | |
---|---|---|
2018-02-02 00:36:58 | Tree | |
[0350f9]
by
![]() fixed description overflow, matched indices to description, added chi |
2018-01-25 20:22:53 | Tree |
2018-01-25 15:40:21 | Tree | |
2018-01-24 15:06:34 | Tree | |
2018-01-24 11:53:50 | Tree | |
2018-01-22 10:31:20 | Tree | |
2018-01-19 02:57:54 | Tree | |
2018-01-18 23:12:20 | Tree | |
2018-01-18 20:58:13 | Tree | |
2018-01-18 17:57:57 | Tree | |
2018-01-18 16:11:04 | Tree | |
2018-01-16 15:48:19 | Tree | |
2018-01-16 15:23:39 | Tree | |
2018-01-16 15:20:15 | Tree | |
[682e09]
by
![]() New Plugin Descriptor Calculations: molecular weight, chi descriptors, kappa shape indices |
2017-12-14 19:20:39 | Tree |
2017-11-30 23:18:58 | Tree |
First of all: Thank you for the contribution.
Now: Let's get to the technical bits :-)
I have some genral Questions before we go into source code details:
Code-Style:
Code:
For the feature:
Any further questions? Feel free to ask and diskuss them. (for general question not related to this pull request, please use our developer mailing list)
Last edit: Till Schäfer 2018-02-02
Thanks for the quick feedback!
The name is a portmanteau derived from Konstanz plus Plugin. It was thought as a small gesture to the university of the same name, where the plugin has been developed. Frankly, we are open to suggestions. A content-based derived modification from the title (Molecular Descriptors Calculator) of the plug-in, e.g. CDKMolPlugin might, however, be more suitable.
Because, the group of chemists, who sought to use Scaffold Hunter in their research had requested many values and separating them in that many plug-ins wasn't feasible (considering each value - 47). So, instead of implementing 9 separate ones, we went for the more straightforward solution. If a user needs only a subset of them, s/he could still take only those for the analysis part.
Not really. There are parameters that could be set only for the H-Bond Acceptor/Donor counts and the TPSA descriptor, all in terms of aromaticity. There we opted out for the default option, which is false. About the Weight descriptor, there the molecular weight is calculated by design, and thus the parameter is set internally. One hindrance in general towards any exposure of parameters as settings turned out to be the limited place for plugin details. Because the description is on all accounts pretty lengthy and the panel is in fact not scrollable, an addition of settings panel would have gone hidden. As we weren't sure whether this was by design, this wasn't put in any ticket. Is it in fact so?
We have, do you have anything specific in mind?
I'm not sure that I understand you correctly. Could you give me an example? Do you mean e.g.
The structure itself shouldn't be modified. The reason was that we considered the AtomCountPlugin as an example at the start. This will be fixed in the next commit.
Actually, any possible exceptions are caught. See the
getDescriptorValue()
method, which is called on each descriptor. That way, if one descriptor calculation fails. the following could proceed.I understand. Will see, whether that is possible under the newest JDK and, if possible would update it. Otherwise, I would restrict the class and its methods visibility to package.
Thanks, will fix it.
This goes back to your very first question and will be updated synchronously. Because this is a direct description of the plugin, <name>.html is the most suited and minimal enough for this purpose.
We are aware of SH's code hygiene, and it is admirable. Exactly in the name of keeping it clean, we decided to do those changes, but keep them minimal. Because of their inhomogeneity and their lack of great importance, it would have been too much unnecessary effort for both sides, if we had acted on each one on its own.
About the mailing list, I looked for sort of "Submit" option in it, when I wanted to report the two issues (295 and 296), but could find any. To add, I could find the respective address, too.
This is a combined thought on both aspects above: If something is bundled, it should be clear to the user and developer, where to search for some functionality. The displayed plugin name already summarizes the topic. We might just describe it a little bit more specific and call it "QSAR Molecular Descriptors Calculator". Then, the use is aware, what he can expect by the plugin.
The package name and the classes also should be named accordingly to avoid confusion.
Of course, you are free to honor the origin of delopment. Thus, feel free to add a notice in the JavaDoc of the plugin class.
Regarding the Descriptors it would be very nice if they can be selected/toggled in the plugin, e.g. by a list with checkboxes. We already think about extending the functionality of this plugin to resepect all parameter-free IMolecularDescriptors from cdk. If you are interested, you can do this extension to all parameter-free IMolecularDescriptors by yourself, but only the ability to select the current descriptors is required to merge this request.
Have you added a ScrollPane?
hmm, maybe you just used another version of the formatter, but here the javadocs are formatted different (e.g. @param is allwas followed by a newline) and the code is formatted differently in some places. Have you applied the formmatter on the whole class? If so, just keep it the current way. Otherwise, please do so.
Yes, that are the comments I am talking about. We also have them in some places (its a legacy), but they are completely useless for modern IDEs, which can give you much bettern descriptions about the superclass methods. These comments also tend to get outdated if some super-class gets renamed.
Oh, I have not seen them. However, the current implementation will still fail, since #getDescriptorValue() will return null in the case of an Exception. Afterwards you pass the null value to #addFingerprintValues(), which calls toString on the null-result. This will raise another uncatched exception in #apply. Long story should, please make sure to not insert null values (just do not add the property in this case).
Our code should be compatible to Java 7 at this point. The ArrayList#toArray(T[] a) method is available for this version.
I was referring to the "scaffold_hunter" part here. Its clear that is has something to do with the scaffold hunter, thus thats just noise. On the other hand the file name does not give any hint about the actual content. Thus, I could think about something like descriptor_help.html, descriptor_description.html ;-), or something like that.
You can subscribe to the mailinglist here:
https://sourceforge.net/p/scaffoldhunter/mailman/?source=navbar
Afterwards, you can just send regular emails to that list.
Last edit: Till Schäfer 2018-02-12
We have updated the state of the plug-in with the following:
About the ability to select single descriptors for calculation, we haven't added it, because there isn't enough space on the side pane to show it. The settings panel itself could have scrollable pane, but that won't fix the issue. The description panel is the one that has to be scrollable or alternatively the whole side details pane. But changing this is out of scope for the feature, and we can't do it only for this plug-in, since the description has to be given as a bare String.
Sorry for the late feedback. The current verson looks fine, but one small thing is missing:
The current properties are named in a confusing way:
All descriptors form the same type are named the same, but have a number atached. One needs to read the (very good) description of the property to understand the meaning. That description is not always available to the user (table view shows it by hovering the proprty, other views (especially drop down lists) do not do so). Thus, it would be much better to just name the proerty accordingly (please keep the description anyway).
Example: "Chi Chain Descriptor 1" -> "Chi Chain Descriptor SCH-3"