Re: [pure-lang-users] case

Eddie Rucker wrote:
> On Thu, 2008-08-28 at 02:46 +0200, Albert Graef wrote:
>> Albert Graef wrote:
>>>> case type of
>>>>   int = gsl_matrix_int_alloc r c, gsl_matrix_int_set;
>>>>   double = gsl_matrix_alloc r c, gsl_matrix_set;
>>>> end;
> 
>> BTW, the above won't work in Haskell or ML either... That poor little 
>> int function doesn't like it if you toss it around as if it was some 
>> piece of silly data! ;-)
> 
> I thought that was one of the points of using first class functions
> (ducking tomatoes) ;-) 

I knew that this was coming. ;-)

But of course you're right, functions are data just like any other value, 
and in fact you can compare them with (===). The reason why you can't match 
against an ordinary (non-nullary) function symbol lies in the "head = 
function" rule.

The "head = function" rule allowed me to get rid of the lexical 
distinction between function and variable symbols (which is one of Q's 
worst parts). This applies to all kinds of rewriting rules, not only 
'case'. So, consider a rule like:

foo bar (baz x) = x+1;

The problem is how to decide which symbols on the lhs of that equation 
are literal (function or constant) symbols, and which are the variables 
which are to be bound by the equation.

Note that Haskell employs the constructor discipline, and even lexically 
distinguishes constructor symbols, to figure out what the variables are. 
But for Pure there's no lexical distinction and we don't have (nor want) 
the constructor discipline either.

But if you take another look at this equation,

foo bar (baz x) = x+1;

it's plain to see for a human reader that foo and baz are function 
symbols and that bar and x must either be variables or constant symbols 
since they're leaves of the lhs expression tree. There's no way to 
decide the latter question, however, without someone telling you, say, 
that bar is a constant, and x isn't, and that is what Pure's 'nullary' 
declarations are for.

The "head = function" rule just makes that intuitive interpretation 
precise, so that the compiler can use it to parse the lhs of 
equations in an unambiguous way. It says that:

- The head symbol of a function application, like foo in 'foo bar (baz 
x)' or baz in 'baz x', is always a literal (function) symbol.

- All other symbols (bar and x in this case) are variables, *unless* 
they're explicitly declared as 'nullary' in which case they're literal 
(constant) symbols.
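
As a rough illustration, the two clauses above can be modelled in a few 
lines of Python. This is only a toy sketch, not how the Pure compiler 
is actually implemented: the tuple encoding of terms and the name 
'classify' are assumptions made up for this example.

```python
# Toy model of the "head = function" rule (hypothetical, not Pure's code).
# A term is either a symbol (a str) or an application, written as a
# tuple (head, arg1, arg2, ...) whose head is a symbol.

def classify(term, nullary=frozenset()):
    """Split the symbols of a left-hand side into (literals, variables)."""
    literals, variables = set(), set()

    def visit(t):
        if isinstance(t, tuple):
            head, *args = t
            literals.add(head)        # head of an application: always literal
            for arg in args:
                visit(arg)
        elif t in nullary:
            literals.add(t)           # declared nullary: literal constant
        else:
            variables.add(t)          # any other leaf: a variable

    visit(term)
    return literals, variables

# The lhs of 'foo bar (baz x) = x+1'
lhs = ("foo", "bar", ("baz", "x"))

literals, variables = classify(lhs)
# literals == {"foo", "baz"}, variables == {"bar", "x"}

literals_n, variables_n = classify(lhs, nullary={"bar"})
# with bar declared nullary: literals_n == {"foo", "baz", "bar"},
# variables_n == {"x"}
```

Note that the classification looks only at the lhs itself and at the 
set of nullary symbols; no other definitions in the program enter into it.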

That rule is just a convention, of course, but it seems perfectly 
reasonable and in fact it lines up nicely with term rewriting theory. 
Indeed, all those fixity and nullary declarations in the prelude and in 
your own programs are just what term rewriting theorists call the 
"signature" of an algebra, which is simply an "alphabet" of function 
symbols with their "arities" (number of arguments they expect), with 
function symbols of arity zero denoting the "constants" of that algebra.

The only differences from standard TRS theory are that

(1) in Pure you don't have to declare the arities of non-operator and 
non-nullary symbols, they're inferred automatically;

(2) a symbol occurring as a leaf on the lhs can denote a variable even 
if it's also used as a (non-nullary) function symbol elsewhere.

Feature #2 is what actually bit you in your 'case' example, so you might 
argue that this is *not* a good idea. But if you insist that function 
symbols must never be used as variable names, then a rule like 'foo bla 
= bla+1' would suddenly change meaning if you happen to add a definition 
like 'bla x = x-1' somewhere else in your program! That would be very 
bad indeed and would give rise to horrible bugs almost impossible to 
avoid in bigger programs. Therefore, (2) is much preferable. In fact, 
Haskell treats this in exactly the same way. You can even write stuff 
like 'suck suck = suck+1' which will give you 'suck 99 => 100', in both 
Haskell and Pure. Try it!

(NB: It might be tempting to just declare all your function or at least 
the constructor symbols 'nullary' to circumvent (2) above. Don't do 
that, it's a very bad idea for the reasons I just discussed. Nullary 
symbols should really be reserved for special kinds of atomic values, 
like [] and () in the prelude. If you really need to introduce nullary 
symbols, make them stand out or declare them as private symbols, at 
least if your code is supposed to be used as a library by others.)

Back to your example:

case type of
   int = gsl_matrix_int_alloc r c, gsl_matrix_int_set;
   double = gsl_matrix_alloc r c, gsl_matrix_set;
end;

It should be clear by now that the rule 'int = gsl_matrix_int_alloc ...' 
doesn't match against 'int' as a literal symbol, because 'int' is not 
nullary, so, by what I said above, 'int' is just a variable there!
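
The practical consequence can be sketched with a small Python model of 
'case' matching. Again this is a hypothetical toy, not Pure's matcher: 
the helpers 'matches' and 'run_case' are invented for the example, 
patterns are restricted to single symbols, and the rule bodies are 
reduced to the name of the allocator they would call.

```python
# Toy model of matching single-symbol 'case' patterns (hypothetical).

def matches(pattern, value, nullary=frozenset()):
    """Return the variable bindings if pattern matches value, else None."""
    if pattern in nullary:
        # Literal constant: matches only the very same symbol.
        return {} if pattern == value else None
    # Not nullary, so the pattern is a variable: it matches anything.
    return {pattern: value}

def run_case(value, rules, nullary=frozenset()):
    """Return the result of the first rule whose pattern matches value."""
    for pattern, result in rules:
        if matches(pattern, value, nullary) is not None:
            return result
    raise ValueError("no matching rule")

rules = [("int", "gsl_matrix_int_alloc"),
         ("double", "gsl_matrix_alloc")]

# 'int' is a variable here, so the first rule catches everything:
first = run_case("double", rules)
# first == "gsl_matrix_int_alloc"

# Purely for contrast (not a recommended fix): if both symbols were
# treated as literals, the second rule would be selected.
second = run_case("double", rules, nullary={"int", "double"})
# second == "gsl_matrix_alloc"
```

In the first call the 'double' rule is never even tried, which mirrors 
what happens with the 'case' expression above.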

Ok, I hope that this finally clears up the mysteries surrounding Pure's 
symbol declarations and the "head = function" rule. :) Maybe someone 
could put this on the wiki, so that we finally have some content for the 
FAQ section there?

HTH,
Albert

-- 
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr....@t-..., ag...@mu...
WWW:    http://www.musikinformatik.uni-mainz.de/ag