On 2/3/12 3:33 AM, Juan Jose Garcia-Ripoll wrote:
On Fri, Feb 3, 2012 at 1:40 AM, Raymond Toy <toy.raymond@gmail.com> wrote:
1. You have to check after reading the string to see what it contains.  (I guess a very small compile-time cost.)

Indeed, the cost is very small, and the check can be folded into the routine that reads the string into the buffer.
2. Because I didn't think any Lisp did that, but it's not illegal to do so.
3. It's a burden on the user if the type of a constant string depends on what's in it.  Being illiterate, I only know ASCII, so, perhaps this isn't a problem in practice.

I implemented this because, after I introduced Unicode, all programs began using 4 times more memory than their non-Unicode versions. That is natural: symbols, strings, code, all of this data can be stored as either base-strings or extended strings, and if the core does not try to save space, everything defaults to the most expensive version.
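
For readers following along, the check described above can be sketched roughly as follows. This is not ECL's actual reader code: NARROW-STRING is a hypothetical name, and the sketch only assumes the standard types BASE-CHAR and SIMPLE-BASE-STRING.

```lisp
;; Hypothetical helper, not taken from ECL or any other implementation.
(defun narrow-string (string)
  "Return STRING as a SIMPLE-BASE-STRING when every character is a
BASE-CHAR; otherwise return STRING unchanged."
  (if (every (lambda (ch) (typep ch 'base-char)) string)
      ;; All characters fit in the cheaper representation.
      (coerce string 'simple-base-string)
      ;; At least one extended character: keep the wider string.
      string))
```

A reader that applied something like this to each literal would pay one pass over the string, which is the small cost mentioned in point 1, and the resulting type of the literal would depend on its contents, which is the concern in point 3.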

Indeed. This is one reason why cmucl uses UTF-16 strings. For most users this is not a problem, because most characters are in the BMP (Basic Multilingual Plane) and so take a single 16-bit code unit.
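
To make the space argument concrete, here is a small sketch of the UTF-16 arithmetic. The function names are made up for illustration, and the code works on integer code points so that it does not depend on how any particular Lisp represents characters.

```lisp
;; Illustrative only; these names are hypothetical.
(defun utf16-code-units (code-point)
  "Number of 16-bit code units UTF-16 needs for CODE-POINT."
  ;; Code points in the BMP take one unit; the rest need a surrogate pair.
  (if (<= code-point #xFFFF) 1 2))

(defun utf16-octets (code-points)
  "Storage in octets for a sequence of code points encoded as UTF-16."
  (* 2 (reduce #'+ code-points :key #'utf16-code-units)))
```

For example, (utf16-octets '(#x41 #xF1 #x1F600)) gives 8, where a fixed 32-bit representation would need 12 octets for the same three code points.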

I had a look at f2cl's code, and the following code would more or less fix it. There might be simpler ways, such as looking only at PARAMETER statements, but my Fortran is a bit rusty and I do not know f2cl so well. Note also that one possible optimization could be to wrap LOAD-TIME-VALUE around COERCE, for those Lisps that would not precompute the COERCE call.
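
The LOAD-TIME-VALUE idea can be illustrated with a small sketch. The function name, the literal, and the target type (SIMPLE-ARRAY CHARACTER (*)) are assumptions chosen for illustration; this is not f2cl output and not the patch itself.

```lisp
;; Hypothetical example; the name, literal and target type are assumptions.
(defun default-title ()
  ;; In compiled code, LOAD-TIME-VALUE arranges for the COERCE to be
  ;; evaluated once, when the code is loaded or compiled, instead of on
  ;; every call. The second argument T declares the value read-only.
  (load-time-value (coerce "hello" '(simple-array character (*))) t))
```

With this shape the literal always has the wider string type regardless of what the reader produced, and a compiler that does not fold the COERCE itself still pays for it only once.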

Thanks for this fix.  I was looking for something a bit deeper in the guts of f2cl, but this will probably be fine for now.