Thread: [limesurvey-developers] need opinion on cascading relevance logic
From: Thomas W. M. M. M. <tw...@co...> - 2011-08-04 14:32:20
All (but mostly Thibault and Carsten)-

I'd like input on the desired way to support cascading relevance when at least one of the variables is irrelevant. This wiki page shows a few dozen examples and some proposed results - http://docs.limesurvey.org/Expression+Manager+Roadmap , with room for your comments on each example.

The underlying questions are:
(1) Which operators should always return false if any of their arguments are irrelevant?
    (a) Comparators? (==, !=, >, >=, <, <=)
    (b) Math? (+, -, *, /, +=, -=, *=, /=)
    (c) Logic? (&&, ||, !)
(2) Can we safely apply the above to just plain variables (e.g. a == b), or do people truly want this to apply to sub-equations (e.g. (a+b)==c)? Cascading irrelevant (N/A) status to sub-equations is harder.
(3) Is it always safe to NOT return false for functions that take multiple arguments - like sum(a,b,c), if(a,b,c), list(a,b,c), max(a,b,c)? That is, each sub-equation (the parts between commas in the function call) will be false, but an argument will still be passed to the function.

/Tom
From: Carsten S. <car...@li...> - 2011-08-04 14:46:43
On 04.08.2011 16:32, Thomas White, MD, MS, MA wrote:
> All (but mostly Thibault and Carsten)-
>
> I'd like input on the desired way to support cascading relevance when
> at least one of the variables is irrelevant. This wiki page shows a
> few dozen examples and some proposed results -
> http://docs.limesurvey.org/Expression+Manager+Roadmap , with room for
> your comments for each example.

I think if a question's condition relies on a question that is not in the current logical branch (i.e. one that would never be displayed if the user walked straight down this condition path), then it must be a setup glitch (it could also happen if a question was deleted). This should never happen anyway, should it?

I would expect that if the survey is not active yet and the admin tests it, some error message is shown while testing the survey. Such an 'erroneous condition' should never make it into a live survey - I am not sure if there is a way to check for these invalid conditions on activation?

-Carsten
From: Thomas W. M. M. M. <tw...@co...> - 2011-08-04 14:57:24
Carsten-

Say you have questions Q1, Q2, Q3, Q4, Q5, and the relevance for Q5 is sum(Q1,Q2,Q3,Q4) > 5. If you delete question Q2, you will get a syntax error since Q2 does not exist. Similarly, if you move Q2 after Q5, you will also get a syntax error since Q2 cannot be set prior to calling the relevance for Q5. Having such syntax errors would prevent the survey from being activated.

However, you might have complex relevance statements for each of Q1-Q4 such that there are a dozen different paths one could take to get to Q5. That would not cause any syntax errors (even though the user would only see a subset of questions Q1-Q4 depending upon how they responded), so the survey could be activated.

/Tom
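[Editorial sketch] The two activation-time checks Tom describes (a referenced question must exist, and it must come before the question whose relevance uses it) could look roughly like the following. This is an illustrative Python sketch, not LimeSurvey code: `find_reference_errors`, `KNOWN_FUNCTIONS`, the regex-based variable extraction and the data layout are all assumptions made for the example.

```python
import re

# Hypothetical list: names treated as functions rather than variables.
KNOWN_FUNCTIONS = {"sum", "if", "list", "max"}


def find_reference_errors(order, relevance):
    """Return error strings for relevance equations that refer to questions
    that do not exist or that are not asked before the dependent question.

    order     -- question codes in survey order, e.g. ["Q1", "Q2", ...]
    relevance -- dict mapping a question code to its relevance equation
    """
    position = {code: i for i, code in enumerate(order)}
    errors = []
    for code in order:
        equation = relevance.get(code, "1")  # assumed default: always relevant
        # Naive tokenizer: every identifier that is not a known function
        # name is treated as a variable reference.
        for name in re.findall(r"[A-Za-z_]\w*", equation):
            if name in KNOWN_FUNCTIONS:
                continue
            if name not in position:
                errors.append(f"{code}: unknown variable {name}")
            elif position[name] >= position[code]:
                errors.append(f"{code}: {name} is not set before {code}")
    return errors
```

Note that a check of this kind only catches the static problems (deleted or misplaced questions). It cannot detect the run-time case raised in the second paragraph, where Q2 exists and precedes Q5 but is skipped for a particular respondent.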
From: Carsten S. <car...@li...> - 2011-08-04 15:08:40
Hi Thomas

> Say you have questions Q1, Q2, Q3, Q4, Q5, and the relevance for Q5 is
> sum(Q1,Q2,Q3,Q4) > 5. If you delete question Q2, you will get a
> syntax error since Q2 does not exist. Similarly, if you move Q2 after
> Q5, you will also get a syntax error since Q2 cannot be set prior to
> calling the relevance for Q5. Having such syntax errors would prevent
> the survey from being activated.

Agreed.

> However, you might have complex relevance statements for each of Q1-Q4
> such that there are a dozen different paths one could take to get to
> Q5. That would not cause any syntax errors (even though the user
> would only see a subset of questions Q1-Q4 depending upon how they
> responded), so the survey could be activated.

I guess this scenario never showed up in the old conditions engine because this type of condition just wasn't possible - so please be lenient with me while I free my thoughts from the old barriers.

In the case of a SUM, I would certainly expect a non-shown Q2 to evaluate to 0.

-Carsten
From: Thibault Le M. <Thi...@su...> - 2011-08-04 15:34:52
On 04/08/2011 17:08, Carsten Schmitz wrote:
> I guess this scenario never showed up in the old conditions engine
> because this type of conditions just wasn't possible - so please be
> lenient with me freeing my thoughts of the old barriers.

Same for me ;-)

> In the case of a SUM I would certainly see that a non-shown Q2 would
> evaluate to 0.

Yes, but this should be decided on a per-function basis, and not be a general principle. Working on conditions, I found out that it is very difficult to imagine all possible scenarios. If we make such an assumption, it is possible that sometime in the future we'll find a case where the assumption is wrong and prevents us from "going on".

Thibault
From: Thibault Le M. <Thi...@su...> - 2011-08-04 16:57:25
On 04/08/2011 16:32, Thomas White, MD, MS, MA wrote:
> All (but mostly Thibault and Carsten)-
>
> I'd like input on the desired way to support cascading relevance when
> at least one of the variables is irrelevant. This wiki page shows a
> few dozen examples and some proposed results -
> http://docs.limesurvey.org/Expression+Manager+Roadmap , with room for
> your comments for each example.
>
> The underlying questions are:
> (1) Which operators should always return false if any of their
> arguments are irrelevant?
> (a) Comparators? (==, !=, >, >=, <, <=)
> (b) Math? (+, -, *, /, +=, -=, *=, /=)
> (c) Logic (&&, ||, !)
> (2) Can we safely apply the above to just plain variables (e.g. a ==
> b), or do people truly want this to apply to sub-equations (e.g.
> (a+b)==c). Cascading irrelevant (N/A) status to sub-equations is harder.
> (3) Is it always safe to NOT return false for functions that take
> multiple arguments - like sum(a,b,c), if(a,b,c), list(a,b,c),
> max(a,b,c) - e.g. each sub-equation (the parts between commas in the
> function call) will be false, but an argument will be passed to the
> function.

Hum,

I've seen the table on the wiki page... but wouldn't it be confusing for our users if operators behave differently?

In fact, I think I would return false for any Expression using a non-available (N/A) value; this includes 1a, 1b, 1c and, I'm afraid, 2 as well.
=> Is (2) difficult if we rewrite array_filter (the only sub-question filtering feature) using EM? Of course, this requires adding the relevanceXXX you described in your last commit for sub-questions as well.

I wouldn't assume that (3) is safe in all cases. It is really up to the function to define whether it can process non-available arguments and still return a meaningful result. sum() seems, indeed, a good candidate if non-available arguments are set to "0". But we can't make this a general rule.

IMHO:
* ExpressionManager should "throw an exception" if a reference to a non-available value is found, whatever the expression is, and in this case return FALSE. This would avoid having to imagine all combinations of operators and would remain consistent (since simple).
* The only function that would "mask" this exception would be the is_relevant() function.
* We could imagine a new notation {Q3:0} which could refer to the value of Q3 if available or '0' if not available (in this case, this notation should mask the exception as well).

This is only my first impression, though; I have not stepped into the code.

Thibault
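[Editorial sketch] Thibault's three bullet points (raise on any N/A reference and treat the whole expression as false, let is_relevant() mask the exception, and allow a per-variable default in the spirit of {Q3:0}) can be illustrated as below. This is a Python sketch under stated assumptions, not Expression Manager code: the `Survey` class, the `NotAvailable` exception, and the idea of writing expressions as Python callables are all invented for the example.

```python
class NotAvailable(Exception):
    """Raised when an expression reads a variable whose question is irrelevant."""


class Survey:
    def __init__(self, values, relevant):
        self.values = values      # question code -> current answer
        self.relevant = relevant  # question code -> bool

    def get(self, code, default=None):
        """Return the answer for `code`. For an irrelevant question, return
        `default` if one was given (the {Q3:0} idea), else raise."""
        if self.relevant.get(code, False):
            return self.values[code]
        if default is not None:
            return default
        raise NotAvailable(code)

    def is_relevant(self, code):
        """The one lookup that never raises for an irrelevant question."""
        return self.relevant.get(code, False)

    def evaluate(self, expression):
        """Evaluate a callable expression; any unmasked N/A makes it False."""
        try:
            return expression(self)
        except NotAvailable:
            return False
```

For instance, with Q3 irrelevant, `survey.evaluate(lambda s: s.get("Q1") + s.get("Q3") > 5)` yields False regardless of Q1, while `s.get("Q3", 0)` lets the same expression proceed with 0 in place of Q3. The appeal of this design is that no operator needs special N/A handling.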
From: Julian G. <ju...@d-...> - 2011-08-04 19:17:00
On Thu, Aug 04, 2011 at 06:57:04PM +0200, Thibault Le Meur wrote:
> * we could imagine a new notation {Q3:0} which could refer to the value
> of Q3 if available or '0' if not available (in this case, this notation
> should mask the Exception as well).

This sounds good; there is similar notation in bash:

    When not performing substring expansion, using the forms documented
    below, bash tests for a parameter that is unset or null. Omitting the
    colon results in a test only for a parameter that is unset.

    ${parameter:-word}
        Use Default Values. If parameter is unset or null, the expansion
        of word is substituted. Otherwise, the value of parameter is
        substituted.

    ${parameter:=word}
        Assign Default Values. If parameter is unset or null, the
        expansion of word is assigned to parameter. The value of
        parameter is then substituted. Positional parameters and special
        parameters may not be assigned to in this way.

    ${parameter:?word}
        Display Error if Null or Unset. If parameter is null or unset,
        the expansion of word (or a message to that effect if word is not
        present) is written to the standard error and the shell, if it is
        not interactive, exits. Otherwise, the value of parameter is
        substituted.

    ${parameter:+word}
        Use Alternate Value. If parameter is null or unset, nothing is
        substituted, otherwise the expansion of word is substituted.

Julian
From: Thomas W. M. M. M. <tw...@co...> - 2011-08-05 14:03:27
Thibault-

Making these changes may be unavoidable, but I need to think this through some more.

If you look at the updated wiki page (http://docs.limesurvey.org/Expression+Manager#Functions_that_are_Planned_or_Being_Considered), you will see that many of the planned functions require that we know the data type of the variable. Valid data types include NUMBER, STRING, DATE, TIME and several flavors of NULL (e.g. NA, REFUSED, UNKNOWN).

The way I did this in the past was to create a low-level Datum object that had getter methods to return Number, String, Date, Time, etc. versions of the Datum.

For LimeSurvey, it may make sense to use a tuple of (value,type,relevant) for all data values within equations. That way our math operators and functions can conditionally adjust their behavior based upon the data type, plus do better run-time validation and/or exception handling (like date_diff(A,B), which could return the number of days between two dates if they are both actually dates).

To make this truly dynamic, it may also be necessary to create an array mapping table in JavaScript rather than many more hidden input elements, so that I can access needed metadata, such as:
(1) Question - sequence number, code, type, text, number of enumerated answer choices
(2) Answer - code, text, sequence number (so we know it is the 4th option in the enumerated list)

From that, there could be arrays mapping:
(1) Variable aliases to canonical name (e.g. question code and INSERTANS:SGQA both map to the javaSGQA or answerSGQA JavaScript variable name)
(2) Canonical name to an array of question attributes (questionNum (from SGQA), sequenceNum, code, type, text, number of enumerated answer choices, canonical AnswerId, canonical SubQuestionId)
(3) questionNum to array of (relevanceStatus, displayStatus)
(4) answerId to ordered list of arrays of (code, text, sequenceNum)

If so, then functions like ExprMgr_get_value(name) would use those mapping arrays to determine whether the question is relevant, its current value, and any other metadata that might be needed.

But before I consider diving into this (since it is exclusively needed for on-page dynamic features), I need to think about this further.

/Tom

On Fri, Aug 5, 2011 at 3:02 AM, Thibault Le Meur <Thi...@su...> wrote:
> Hi Thomas,
>
> On 04/08/2011 21:23, Thomas White, MD, MS, MA wrote:
> > Carsten-
> > Do you agree with Thibault?
>
> Yes Carsten, I'd like your opinion on this.
>
> > The dilemma is how to implement it. Using exceptions would not work
> > since I need to collect metadata like the list of variables used and
> > potential syntax errors. EM's recursive descent parser calls
> > EvaluateExpressions() which calls EvaluateExpression(), where
> > Expressions is a set of one or more Expression separated by commas.
>
> Ok, I see.
> Just for my information, how do you currently handle a divide-by-zero
> error within one of the Expressions?
>
> > One option is to have EvaluateExpression() return a tuple of
> > (value,relevant), where value is the result of evaluating the
> > Expression, and relevant is false if any of the variables used within
> > the Expression are irrelevant. Functions could be converted to process
> > a list of tuples, so they could decide how to handle irrelevant
> > arguments on a case-by-case basis. This would mean having to write
> > custom PHP and JavaScript functions for all of the EM functions, but
> > most should be short and easy. However, regular operators won't handle
> > tuples properly. An option there is to create functions for each
> > operator, so that sum(a-b,c*d,e>f) would become
> > sum(op('-',a,b),op('*',c,d),op('>',e,f)). This is do-able, but more
> > work than I was expecting.
>
> If this is too much work, then maybe an alternative solution is to be
> found.
> I guess that your proposal was to implement the "non-available argument"
> behaviour inside each function/operator (hence your table in the wiki
> page). If this is easier to implement than not-relevant argument event
> "bubbling up", then it is up to the brave developer to decide: I don't
> want to make you break the great work you've done so far.
>
> > Then, rather than a new notation like Q3:0 to return the value of Q3
> > if available or 0 if not, we could have a function like zifnr(Q3) (for
> > zero if not relevant)
>
> Sure, but maybe we would need an ifnr(Q3,'') as well. Or imagine a
> slider with default value 5: we may want something like ifnr(Q3,5).
>
> I'm very sorry I pointed out this tricky issue, but it was one of the
> drawbacks of our last implementation, so I'd rather have EM be generic
> enough to handle "all" possible future use cases.
>
> Regards,
> Thibault
>
> > /Tom
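[Editorial sketch] The (value,relevant) tuple option from the quoted exchange, together with op(), zifnr() and ifnr(), can be illustrated as follows. This is a Python sketch under stated assumptions rather than the actual EM implementation: the `Datum` layout (it omits the proposed `type` field), the `OPERATORS` table and `em_sum` are invented names, and only a handful of operators are covered.

```python
import operator
from typing import NamedTuple


class Datum(NamedTuple):
    """Stand-in for the proposed tuple; the 'type' field is omitted here."""
    value: object
    relevant: bool = True


OPERATORS = {
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "/": operator.truediv, ">": operator.gt, "<": operator.lt,
    "==": operator.eq,
}


def op(symbol, a, b):
    """Apply a binary operator; the result is irrelevant if either side is."""
    if not (a.relevant and b.relevant):
        return Datum(False, False)
    return Datum(OPERATORS[symbol](a.value, b.value), True)


def em_sum(*args):
    """sum() decides for itself: irrelevant arguments simply count as 0."""
    return Datum(sum(d.value for d in args if d.relevant), True)


def ifnr(d, default):
    """Value of d if it is relevant, otherwise the supplied default."""
    return d if d.relevant else Datum(default, True)


def zifnr(d):
    """'Zero if not relevant': shorthand for ifnr(d, 0)."""
    return ifnr(d, 0)
```

With this shape, `sum(a-b,c*d,e>f)` would be written `em_sum(op("-", a, b), op("*", c, d), op(">", e, f))`, and each function is free to choose its own policy for irrelevant arguments, which is the per-function decision Thibault argued for earlier in the thread.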
From: Thibault Le M. <Thi...@su...> - 2011-08-05 16:47:55
Hi Thomas,

On 05/08/2011 16:03, Thomas White, MD, MS, MA wrote:
> Thibault-
>
> Making these changes may be unavoidable, but I need to think this
> through some more.

Ok,

> If you look at the updated wiki page
> (http://docs.limesurvey.org/Expression+Manager#Functions_that_are_Planned_or_Being_Considered),
> you will see that many of the planned functions require that we know
> the data type of the variable. Valid data types include NUMBER,
> STRING, DATE, TIME and several flavors of NULL (e.g. NA, REFUSED,
> UNKNOWN).

Yes, I've seen this.

> The way I did this in the past was to create a low level Datum object
> that had getter methods to return Number, String, Date, Time, etc.
> versions of the Datum.
> For LimeSurvey, it may make sense to use a tuple for all Data values
> (within equations) of (value,type,relevant). That way our math
> operators and functions can conditionally adjust their behavior based
> upon the data type, plus do better run-time validation and/or
> exception handling (like date_diff(A,B) which could return the number
> of days between two dates if they are both actually dates).

That sounds great.

> To make this truly dynamic, it may also be necessary to create an
> array mapping table in JavaScript rather than many more hidden input
> elements, so that I can access needed metadata, such as:
> (1) Question - sequence number, code, type, text, number of enumerated
> answer choices
> (2) Answer - code, text, sequence number (so know it is the 4th option
> in the enumerated list)

I agree.

> From that, there could be an arrays mapping:
> (1) Variable aliases to canonical name (e.g. question code and
> INSERTANS:SGQA both map to the javaSGQA or answerSGQA JavaScript
> variable name)
> (2) canonical name maps to an array of question attributes
> (questionNum (from SGQA), sequenceNum, code, type, text, number of
> enumerated answer choices, canonical AnswerId, canonical SubQuestionId)
> (3) questionNum to array of (relevanceStatus, displayStatus)
> (4) answerId to ordered list of array of (code, text, sequenceNum)
>
> If so, then functions like ExprMgr_get_value(name) would use those
> mapping arrays to determine whether the question is relevant, its
> current value, and any other metadata that might be needed.

OK,

> But before I consider diving into this (since it is exclusively needed
> for on-page dynamic features), I need to think about this further.

Sure, let the force flow through you ;-)

Thx,
Thibault
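[Editorial sketch] The mapping tables Tom proposes (alias to canonical name, canonical name to question attributes, question number to status) and a lookup in the spirit of ExprMgr_get_value(name) might be laid out as below. Tom suggests JavaScript arrays; Python dicts are used here purely for illustration. Every identifier, SGQA value and field in the sample data is invented, and returning 0 for an irrelevant question is an assumption of this sketch.

```python
# Hypothetical sample data; none of these identifiers come from a real survey.
ALIASES = {  # (1) variable alias -> canonical name
    "Q1": "java111X22X333",
    "INSERTANS:111X22X333": "java111X22X333",
}
ATTRIBUTES = {  # (2) canonical name -> question attributes
    "java111X22X333": {"questionNum": 333, "sequenceNum": 0,
                       "code": "Q1", "type": "N"},
}
STATUS = {  # (3) questionNum -> relevance and display status
    333: {"relevanceStatus": False, "displayStatus": False},
}
VALUES = {"java111X22X333": 12}  # current answers by canonical name


def get_value(name, default=0):
    """Resolve an alias and return the current answer, or `default` when
    the owning question is currently irrelevant."""
    canonical = ALIASES[name]
    question_num = ATTRIBUTES[canonical]["questionNum"]
    if not STATUS[question_num]["relevanceStatus"]:
        return default
    return VALUES[canonical]
```

Keeping relevance in one table keyed by question number means that flipping a single status entry changes the result for every alias of that question, without touching the stored answers.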
From: Thomas W. M. M. M. <tw...@co...> - 2011-08-14 09:45:17
Carsten and Thibault-

Cascading relevance is now working in the 10722 release. However, I had to use a different strategy than initially planned. Details can be found here: http://docs.limesurvey.org/Expression+Manager#Cascading_Conditions

It will still be possible to do the Q1:0 syntax (somewhat like bash) if needed - this would internally be treated like Q1.NAOK, but with a different default value (the current default for NA is 0).

/Tom