Making these changes may be unavoidable, but I need to think this through some more. 

If you look at the updated wiki page, you will see that many of the planned functions require that we know the data type of the variable. Valid data types include NUMBER, STRING, DATE, TIME and several flavors of NULL (e.g. NA, REFUSED, UNKNOWN).

The way I did this in the past was to create a low level Datum object that had getter methods to return Number, String, Date, Time, etc. versions of the Datum.
For LimeSurvey, it may make sense to use a tuple of (value,type,relevant) for all Data values within equations. That way our math operators and functions can conditionally adjust their behavior based upon the data type, and also do better run-time validation and/or exception handling. For example, date_diff(A,B) could return the number of days between two dates, but only if both arguments are actually dates.
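To make the tuple idea concrete, here is a minimal sketch, written in Python only for brevity (the real code would be PHP and JavaScript). The Datum class name, the type strings, the sign convention of the result, and the choice to return an NA Datum on a type mismatch are all assumptions for illustration, not a settled design.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any


@dataclass(frozen=True)
class Datum:
    """Hypothetical (value, type, relevant) tuple for one equation operand."""
    value: Any
    type: str            # "NUMBER", "STRING", "DATE", "TIME", or a NULL flavor such as "NA"
    relevant: bool = True


def date_diff(a: Datum, b: Datum) -> Datum:
    """Return the number of days from b to a, or NA if either operand is unusable."""
    both_relevant = a.relevant and b.relevant
    if both_relevant and a.type == "DATE" and b.type == "DATE":
        return Datum((a.value - b.value).days, "NUMBER")
    # Run-time validation: wrong type or irrelevant operand yields NA, not a crash.
    return Datum(None, "NA", both_relevant)


later = Datum(date(2011, 8, 5), "DATE")
earlier = Datum(date(2011, 8, 4), "DATE")
not_a_date = Datum("hello", "STRING")

print(date_diff(later, earlier))     # a NUMBER Datum
print(date_diff(later, not_a_date))  # an NA Datum
```

The point is only that the function itself can inspect type and relevance before deciding what to return.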

To make this truly dynamic, it may also be necessary to create an array mapping table in JavaScript rather than many more hidden input elements, so that I can access needed metadata, such as:
(1) Question - sequence number, code, type, text, number of enumerated answer choices
(2) Answer - code, text, sequence number (so we know it is, say, the 4th option in the enumerated list)

From that, there could be arrays mapping:
(1) Variable aliases to canonical name (e.g. question code and INSERTANS:SGQA both map to the javaSGQA or answerSGQA JavaScript variable name)
(2) canonical name to an array of question attributes (questionNum (from SGQA), sequenceNum, code, type, text, number of enumerated answer choices, canonical AnswerId, canonical SubQuestionId)
(3) questionNum to array of (relevanceStatus, displayStatus)
(4) answerId to an ordered list of arrays of (code, text, sequenceNum)

If so, then functions like ExprMgr_get_value(name) would use those mapping arrays to determine whether the question is relevant, its current value, and any other metadata that might be needed.
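As a rough sketch of how such a lookup might chain through the mapping arrays (Python used only for brevity; on the page these would be JavaScript objects). The sample SGQA 1X2X3, the table names, the stored values, and the (value, relevant) return shape are all made up for illustration.

```python
# (1) Variable aliases -> canonical name (hypothetical sample data)
ALIASES = {
    "Q3": "java1X2X3",
    "INSERTANS:1X2X3": "java1X2X3",
}

# (2) Canonical name -> question attributes
QUESTION_ATTRS = {
    "java1X2X3": {"questionNum": 3, "sequenceNum": 0, "code": "Q3", "type": "N"},
}

# (3) questionNum -> (relevanceStatus, displayStatus)
QUESTION_STATUS = {
    3: {"relevanceStatus": True, "displayStatus": True},
}

# Current answers, keyed by canonical name
CURRENT_VALUES = {"java1X2X3": 7}


def expr_mgr_get_value(name):
    """Resolve any alias and return (value, relevant) for that variable."""
    canonical = ALIASES[name]                     # raises KeyError for unknown names
    attrs = QUESTION_ATTRS[canonical]
    status = QUESTION_STATUS[attrs["questionNum"]]
    return CURRENT_VALUES.get(canonical), status["relevanceStatus"]


print(expr_mgr_get_value("Q3"))
print(expr_mgr_get_value("INSERTANS:1X2X3"))
```

Both aliases resolve through the same canonical name, so relevance and value stay in one place.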

But before I consider diving into this (since it is exclusively needed for on-page dynamic features), I need to think about this further.


On Fri, Aug 5, 2011 at 3:02 AM, Thibault Le Meur wrote:
Hi Thomas,

On 04/08/2011 21:23, Thomas White, MD, MS, MA wrote:

Do you agree with Thibault?

Yes Carsten, I'd like your opinion on this.

The dilemma is how to implement it. Using exceptions would not work, since I need to collect metadata like the list of variables used and potential syntax errors. EM's recursive descent parser calls EvaluateExpressions(), which calls EvaluateExpression(), where Expressions is a set of one or more Expression separated by commas.
Ok, I see.
Just for my information, how do you currently handle a divide by zero error within one of the Expressions?

One option is to have EvaluateExpression() return a tuple of (value,relevant), where value is the result of evaluating the Expression, and relevant is false if any of the variables used within the Expression are irrelevant. Functions could be converted to process a list of tuples, so they could decide how to handle irrelevant arguments on a case-by-case basis. This would mean having to write custom PHP and JavaScript functions for all of the EM functions, but most should be short and easy. However, regular operators won't handle tuples properly. An option there is to create functions for each operator, so that sum(a-b,c*d,e>f) would become sum(op('-',a,b),op('*',c,d),op('>',e,f)). This is do-able, but more work than I was expecting.
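A minimal sketch of the op() idea, in Python only for brevity (the real versions would be PHP and JavaScript). The operator table, the em_sum name, and the policy that sum skips irrelevant arguments are assumptions for illustration; each function could choose its own policy.

```python
import operator

# Hypothetical operator table; only a few operators are shown.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    ">": operator.gt,
    "==": operator.eq,
}


def op(symbol, a, b):
    """Apply a binary operator to two (value, relevant) tuples."""
    (a_value, a_relevant), (b_value, b_relevant) = a, b
    if not (a_relevant and b_relevant):
        return (False, False)          # irrelevance cascades up through the operator
    return (OPS[symbol](a_value, b_value), True)


def em_sum(*args):
    """Sum that, as one possible policy, ignores irrelevant arguments."""
    return (sum(value for value, relevant in args if relevant), True)


a, b = (5, True), (2, True)
c, d = (3, True), (4, False)           # d is irrelevant
e, f = (2, True), (1, True)

# sum(a-b, c*d, e>f) rewritten with op(); c*d is skipped because d is irrelevant.
result = em_sum(op("-", a, b), op("*", c, d), op(">", e, f))
print(result)
```

Because every sub-equation returns a tuple, the function sees which arguments were irrelevant and can decide what to do with them.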

If this is too much work, then maybe an alternative solution is to be found.
I guess that your proposal was to implement the "non-available argument" behaviour inside each function/operator (hence your table in the wiki page).
If this is easier to implement than the not-relevant argument event "bubbling up", then it is up to the brave developer to decide: I don't want to make you break the great work you've done so far.

Then, rather than a new notation like Q3:0 to return the value of Q3 if available or 0 if not, we could have a function like zifnr(Q3) (for zero if not relevant)
Sure, but maybe we would need an ifnr(Q3,'') as well. Or imagine a slider with default value 5: we may want something like ifnr(Q3,5).
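For illustration, both helpers could be very small if variables arrive as (value, relevant) tuples. This is only a sketch in Python; the tuple shape is an assumption, and the real functions would need PHP and JavaScript versions.

```python
def ifnr(datum, default):
    """Return the value if the variable is relevant, otherwise the given default."""
    value, relevant = datum
    return value if relevant else default


def zifnr(datum):
    """Zero if not relevant: shorthand for ifnr(datum, 0)."""
    return ifnr(datum, 0)


q3_hidden = (None, False)    # Q3 is currently irrelevant
q3_answered = (3, True)      # Q3 is relevant and answered 3

print(zifnr(q3_hidden))      # falls back to 0
print(ifnr(q3_hidden, ""))   # falls back to the empty string
print(ifnr(q3_hidden, 5))    # falls back to the slider default
print(ifnr(q3_answered, 5))  # uses the actual answer
```

With a general ifnr(), zifnr() becomes a one-line convenience wrapper.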

I'm very sorry I pointed out this tricky issue, but it was one of the drawbacks of our last implementation, so I'd rather have EM be generic enough to handle "all" possible future use cases.



On Thu, Aug 4, 2011 at 12:57 PM, Thibault Le Meur wrote:
On 04/08/2011 16:32, Thomas White, MD, MS, MA wrote:

All (but mostly Thibault and Carsten)-

I'd like input on the desired way to support cascading relevance when at least one of the variables is irrelevant. This wiki page shows a few dozen examples and some proposed results, with room for your comments on each example.

The underlying questions are:
(1) Which operators should always return false if any of their arguments are irrelevant?
(a) Comparators? (==, !=, >, >=, <, <=)
(b) Math? (+, -, *, /, +=, -=, *=, /=)
(c) Logic? (&&, ||, !)
(2) Can we safely apply the above to just plain variables (e.g. a == b), or do people truly want this to apply to sub-equations (e.g. (a+b)==c). Cascading irrelevant (N/A) status to sub-equations is harder.
(3) Is it always safe to NOT return false for functions that take multiple arguments - like sum(a,b,c), if(a,b,c), list(a,b,c), max(a,b,c) - e.g. each sub-equation (the parts between commas in the function call) will be false, but an argument will be passed to the function.


I've seen the table on the wiki page... but wouldn't it be confusing for our users if operators behave differently?

In fact, I think I would return false for any Expression using a non-available (N/A) value; this includes 1a, 1b, 1c and, I'm afraid, 2 as well.
   => Is (2) difficult if we rewrite array_filter (the only sub-question filtering feature) using EM? Of course, this requires adding the relevanceXXX you described in your last commit for sub-questions as well.

I wouldn't assume that (3) is safe in all cases. It is really up to the function to define if it can process non-available arguments and still return a meaningful result.
sum() seems, indeed, a good candidate if non-available arguments are set to "0". But we can't make this a general rule.

* ExpressionManager should "throw an exception" if a reference to a non-available value is found, whatever the expression is, and in this case return FALSE. This would avoid having to imagine all combinations of operators and would remain consistent (since simple)
* the only function that would "mask" this exception would be the is_relevant() function.
* we could imagine a new notation {Q3:0} which could refer to the value of Q3 if available or '0' if not available (in this case, this notation should mask the Exception as well).
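To illustrate the three points above, here is a minimal sketch in Python (for brevity only). The NotAvailable exception, the ref() helper standing in for the {Q3} and {Q3:0} notations, and the sample data are all hypothetical; this does not reflect how ExpressionManager is actually written.

```python
class NotAvailable(Exception):
    """Raised when an expression references a non-available (irrelevant) value."""


# Hypothetical survey state: Q3 is currently irrelevant.
RELEVANT = {"Q1": True, "Q3": False}
VALUES = {"Q1": 4, "Q3": 9}

_NO_DEFAULT = object()


def ref(name, default=_NO_DEFAULT):
    """Stand-in for {name}, or {name:default} when a default is supplied."""
    if RELEVANT[name]:
        return VALUES[name]
    if default is not _NO_DEFAULT:
        return default                 # the {Q3:0} notation masks the exception
    raise NotAvailable(name)


def is_relevant(name):
    """The only function that inspects relevance without raising."""
    return RELEVANT[name]


def evaluate(expression):
    """Evaluate a zero-argument callable; any non-available reference yields False."""
    try:
        return expression()
    except NotAvailable:
        return False


print(evaluate(lambda: ref("Q1") + ref("Q3")))      # Q3 is not available
print(evaluate(lambda: ref("Q1") + ref("Q3", 0)))   # Q3 replaced by its default
print(evaluate(lambda: not is_relevant("Q3")))      # relevance test never raises
```

The appeal is that operators need no special cases: a single catch at the top of the evaluation decides the result.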

This is only my first impression, though I have not stepped into the code.