From: Wout B. <wbi...@he...> - 2020-07-15 02:27:02
Unfortunately I won't be able to join the call because I'm co-chairing the CompMS session at ISMB the whole day. (*Maybe* I'll be able to join the first 10 minutes.) Final preparations for this have taken up quite some time in the past two weeks as well, so I haven't yet had the time to contribute further to the specification document.

I've been thinking about using PURLs or CV references to link quality metrics to CVs, and I'm starting to favor something that was suggested two weeks ago: stick to the old format, but reserve CV keys (namespaces) for the most commonly used CVs. That would be "QC" for our CV, "MS" for the MS CV, and "UO" for the unit ontology. I think these cover the majority of use cases, and if we identify other commonly used CVs we can still add those as well.

The advantage is that terms in these CVs can be queried efficiently, because the CV keys are fixed. They essentially function the same way as PURLs, without actually having to use PURLs. For CVs that aren't reserved a two-query solution will still be needed (first look up which key the file uses for the CV, then query the terms with that key), but since such queries should be rare I don't consider that too much of a problem. A further advantage is that we avoid the somewhat clunky PURL specification, don't depend on an external service, and (very importantly imo!) don't require web lookups to the PURL service, which is essential for working in firewalled compute environments.

A small disadvantage is that, to validate these reserved CV keys, we'd need to add explicit functionality to the mzQC Python library. But this is hardly a showstopper.

I think this solution gives us the best of both worlds. It's also nice that we don't need to change the JSON schema. (Although we should probably still change the CV references in the JSON schema from a dictionary to a list.) The only remaining task would be to clearly document this behavior and the reserved CV keys in the specification document.
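To make the difference between the single lookup and the two-query lookup concrete, here is a rough Python sketch. It is only an illustration under assumptions: the field names (`cv`, `uri`, `qualityMetrics`, `cvRef`, `accession`), the example.org URIs, the accessions, and all function names are hypothetical and are not taken from the mzQC schema or the mzQC Python library.

```python
# Reserved CV keys as proposed in this message.
RESERVED_CV_KEYS = frozenset({"QC", "MS", "UO"})

# Placeholder URIs (assumption): the real CV locations would come from the spec.
EXPECTED_URIS = {
    "QC": "https://example.org/qc-cv.obo",
    "MS": "https://example.org/psi-ms.obo",
    "UO": "https://example.org/uo.obo",
}


def metrics_for_reserved_cv(doc, cv_key, accession):
    """Single query: the CV key is fixed, so we can filter on it directly."""
    if cv_key not in RESERVED_CV_KEYS:
        raise ValueError(f"{cv_key!r} is not a reserved CV key")
    return [m for m in doc["qualityMetrics"]
            if m["cvRef"] == cv_key and m["accession"] == accession]


def metrics_for_other_cv(doc, cv_uri, accession):
    """Two queries: resolve the file-specific key first, then filter on it."""
    # Query 1: which key(s) does this file use for the CV at cv_uri?
    keys = {key for key, cv in doc["cv"].items() if cv["uri"] == cv_uri}
    # Query 2: select the metrics that reference one of those keys.
    return [m for m in doc["qualityMetrics"]
            if m["cvRef"] in keys and m["accession"] == accession]


def misused_reserved_keys(doc, expected_uris=EXPECTED_URIS):
    """Return reserved keys that the file maps to an unexpected CV URI."""
    return sorted(key for key, cv in doc["cv"].items()
                  if key in expected_uris and cv["uri"] != expected_uris[key])


# Hypothetical file content in the old (dictionary-based) format.
doc = {
    "cv": {
        "QC": {"uri": "https://example.org/qc-cv.obo"},
        "foo": {"uri": "https://example.org/other.obo"},
    },
    "qualityMetrics": [
        {"cvRef": "QC", "accession": "QC:0000001", "value": 1},
        {"cvRef": "foo", "accession": "FOO:0000001", "value": 2},
    ],
}

reserved_hits = metrics_for_reserved_cv(doc, "QC", "QC:0000001")
other_hits = metrics_for_other_cv(doc, "https://example.org/other.obo", "FOO:0000001")
```

The validation helper is the kind of explicit check the library would need: a file that uses "QC" for some other CV would be flagged, while non-reserved keys such as "foo" are left alone.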
I think we should also adopt and explicitly document some best practices that were discussed in the context of adopting PURLs: that CV terms are final and can only ever be deprecated (i.e. an accession will always point to the same CV term), and that we should document an official CV versioning scheme.

Let me know if I've overlooked something here.

Best,
Wout

On 14/07/2020 19:05, Wout Bittremieux wrote:
> Dear colleagues,
>
> This is a reminder that our next teleconference is scheduled for
> Wednesday, July 15, at 14h00 GMT (15h00 London, 16h00 Western
> Europe, 16h00 Cape Town, 17h00 Turkey, 7h00 San Diego).
>
> You can connect to our teleconference on Zoom through the
> following link:
> https://uchealth.zoom.us/j/92419363577?pwd=WVp5Q3FXNU9vaVdJT0ZNRllXWlN3Zz09
> (Password: 012575)
>
> I'd like to propose the following agenda items:
>
> - Update on finalization of specification document
> (https://docs.google.com/document/d/132F3MBgDJgtFlXxDZhpJ1oHGbKL8pT6dk9fvL55L5_M/edit).
>
> - New CV requests for PTXQC via mailing list
> (https://sourceforge.net/p/psidev/mailman/message/37059772/). @Chris: Is
> this not covered yet?
> - Continue discussion on CV references / PURLs in mzQC schema
> (https://github.com/HUPO-PSI/mzQC/pull/103).
>
> Thanks,
> Wout