From: Jonathan C. <jon...@cs...> - 2012-07-18 10:32:03
Hi Frank et al.,

I've been thinking about this and playing around with ideas for the last few weeks, trying to see how it fits with our protocol language ideas too. (Incidentally, our functional curation tool now supports most of SED-ML L1V1, and I should have the suggestions below implemented too in time for COMBINE.)

Regarding especially the two points for discussion, I think the best solution is actually to have three concrete subclasses of an AbstractTask: ModelTask, NestedTask (or RepeatedTask if that name is preferred), and CombinedTask. Frank's current RepeatedTask proposal combines aspects of all three of these concepts into one class, and I think things are clearer and easier to implement if they are separated.

* ModelTask provides the current L1V1 Task - it associates a model with a simulation algorithm. This would be the only task with a simulationReference and modelReference.

* NestedTask provides the ranges, changes, and a single subtask that gets run repeatedly. The option to iterate directly over a simulation by providing a simulationReference and modelReference can be expressed just as well by providing a ModelTask as the subtask, and this avoids conflating different behaviours in one class.

* CombinedTask allows tasks to be grouped: it has a listOfSubtasks, and a scheduling attribute that specifies whether they must be run in the order given (sequential) or whether order is immaterial and hence they could be parallelised (parallel). CombinedTask allows you to structure dependencies between tasks that are as complex as you like, by having CombinedTasks as subtasks of another CombinedTask. Personally I think it's also clearer and easier to implement than having either an order attribute or listing task dependencies.

Of course, this does change the Task class hierarchy. But since the L1V1 Task class still exists unchanged in my proposed hierarchy (it's just not at the top), I don't think that's too much of an issue even if we wanted this to go in L1Vx; for L2 I think achieving a clean (and hence more easily comprehensible) design should be the most important concern. A sketch of what this could look like follows.
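Purely as an illustration - the element and attribute names below (modelTask, nestedTask, combinedTask, scheduling, and so on) are just my guesses at a possible syntax, not anything agreed, and the details of ranges and changes are elided:

  <listOfTasks>

    <!-- ModelTask: the L1V1 Task unchanged - it pairs a model with a
         simulation, and is the only class carrying modelReference and
         simulationReference -->
    <modelTask id="toSteadyState" modelReference="model1"
               simulationReference="steadyStateSim" />
    <modelTask id="timecourse" modelReference="model1"
               simulationReference="timecourseSim" />

    <!-- NestedTask: ranges, changes, and a single subtask run repeatedly -->
    <nestedTask id="scanK1">
      <listOfRanges>
        <uniformRange id="k1_range" start="0.1" end="1.0" numberOfPoints="10" />
      </listOfRanges>
      <listOfChanges>
        <!-- setValue changes referencing k1_range would go here -->
      </listOfChanges>
      <subTask task="timecourse" />
    </nestedTask>

    <!-- CombinedTask: groups tasks; 'sequential' means run in the given
         order, 'parallel' means order is immaterial -->
    <combinedTask id="experiment" scheduling="sequential">
      <listOfSubtasks>
        <subTask task="toSteadyState" />
        <subTask task="scanK1" />
      </listOfSubtasks>
    </combinedTask>

  </listOfTasks>

The sequential CombinedTask here is essentially the "get the model to steady state first, then scan from that point" pattern I come back to under resetModel below.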
Now to more low-level details.

* The 'range' attribute on repeatedTask seems to be redundant and potentially confusing, especially as there can be multiple ranges. What's the behaviour if there are multiple unrelated ranges with different numbers of values? Presumably we should enforce that all ranges have the same number of values, and that they are all iterated over simultaneously, hence just providing different values for the benefit of changes. Since individual changes can specify the range(s) of interest, "blessing" one range doesn't seem necessary.

* There are several methods presented for referring to a range value from a setValue change. I prefer the way shown in the examples on page 5 of Frank's proposal, where a variable element is used to link the range id to a variable id for use in the math block. This seems most in line with existing SED-ML behaviour, and also allows a change to reference multiple ranges cleanly, which the approach of giving a range attribute on the setValue element doesn't.

* The same comment applies to the index attribute on a functionalRange: again you could be explicit about it by defining a variable. I can see the benefit of keeping the short-hand method for common cases, but suggest that it is explicitly defined as being a short-hand.

* With resetModel, is the intention that it resets to the initial conditions of the model, or to the model's state as it existed at the start of the task? I think from page 4 you envisage the former, but the latter seems more flexible to me (and not much harder to implement). It would allow you, for instance, to have a sequence of tasks in which the first gets the model to steady state, and the second does a repeated simulation from that point.

* I don't particularly like having a step attribute on oneStep - I'd much rather have a simulation class on which I can specify the desired end point to reach.

  o The main reason for this is that in moderately long-running simulations, the way in which you calculate time points can noticeably affect the results you get: if you use repeated addition of a time-step, accumulated floating point error can even mean you don't end up at the expected overall end time exactly, and have to take a very small or slightly larger final step. If instead you compute each output time point directly as n*dt, the points are computed more accurately and there are no issues with the final step (provided of course that dt divides the overall interval).

  o Another advantage of specifying the end point rather than the step is that you don't have the "step" information duplicated in both the range of the task and the step of the simulation. The task's range naturally provides the value up to which you want to solve in all the cases I can think of.

  o Specifying the end point based on the range value does require, however, that you can parameterise your simulation and/or task classes, and that setValue in a NestedTask can address these parameters as well as those in the model. For the latter point, we could extend the target attribute to allow values such as '#id' as well as XPath expressions; '#id' is not valid XPath, so there's no potential for confusion. Whereas an XPath expression would address a model variable/parameter, '#id' would select the protocol parameter with matching id. Specifying what parameters are available would require some more work, possibly building on Richard's proposal <http://sourceforge.net/tracker/?func=detail&aid=3391892&group_id=293618&atid=2532228>. There's a sketch of what I mean just after this list.

* I also have a question on which I'd like input from others with more experience: the oneStep proposal specifies that it "defines a simulation step of a deterministic simulation." Is there any reason why you can't run a stochastic simulation up to a desired end point? There is also a related use case that might be worth considering here: what if you're simulating a cell-cycle model and wish to run up until the model indicates cell division should occur? In this case you don't know what the ultimate end point will be, and you'll almost certainly want to resolve it to greater precision than a repeatedTask range or timecourseSimulation output granularity.
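To illustrate the parameterisation and '#id' points above (the parameter element on oneStep, the '#stopTime' target form, and all the other names are invented for this sketch, not agreed syntax):

  <!-- A oneStep-style simulation whose end point is a protocol parameter
       rather than a fixed step attribute -->
  <oneStep id="stepSim">
    <parameter id="stopTime" value="0" />
  </oneStep>

  <modelTask id="advance" modelReference="model1"
             simulationReference="stepSim" />

  <nestedTask id="course">
    <listOfRanges>
      <!-- each output point is a multiple of dt, not a repeated sum -->
      <uniformRange id="outputTimes" start="0" end="100" numberOfPoints="1000" />
    </listOfRanges>
    <listOfChanges>
      <!-- '#stopTime' is not valid XPath, so it unambiguously selects the
           protocol parameter rather than a model variable; the range
           attribute is used here as the short-hand form -->
      <setValue target="#stopTime" range="outputTimes" />
    </listOfChanges>
    <subTask task="advance" />
  </nestedTask>

Each iteration then just advances the model to the next value of outputTimes, so there is no separate step attribute that has to be kept consistent with the task's range.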
Best wishes,
Jonathan

On 10/06/2012 11:43, Frank T. Bergmann wrote:
> Hello together,
>
> I have modified my original proposal and added the suggestions made by
> Nicolas and the discussions at HARMONY 2012. The key changes:
>
> - instead of a NestedSimulation class a RepeatedTask is used
> - the SteadyState simulation class is back
>
> The proposal can be found here:
>
> http://identifiers.org/combine.specifications/sed-ml.proposal.nested-simulations.FB.version-3
>
> And an implementation is available with the SED-ML Web Tools:
>
> http://sysbioapps.dyndns.org/SED-ML_Web_Tools/
>
> where all examples are available to be run.
>
> Points for discussion:
> --------------------------
> - Stuart brought up the idea of having subTasks not specifying an order
>   attribute but instead describing the tasks they depend on, the idea being
>   that this would allow for a potential parallel execution of tasks.
>
> - Should we have further task subclasses or not?
>
> I look forward to your comments.
>
> Cheers
> Frank