My take on quantifying language expressivity ("Ending the Language Wars")
is derived from work by Andrey Kolmogorov, one of the founders of Algorithmic
Information Theory. The idea is that you can quantify the complexity of
something by measuring how BIG a computer program you need to reproduce
that thing. In this case, we'll be doing the inverse: measuring
the simplicity. How small can your language make your code?
Proper expressivity means not only a host of sophisticated constructs for
common operations, but also a reduction of ambiguity in what is happening, so
that a glance is enough for the reader to understand what the programmer was
saying.
XXX yet to be integrated: We'll define this idea of "simplification" as two
factors. The first is text-wise reduction: fewer characters needed to express
complex concepts (a sort of reversal of Algorithmic Information Theory). In
literature, the equivalent is the use of sophisticated words that say a lot in
a few characters without increasing ambiguity of meaning. Text that doesn't
get turned into code (comments, docstrings) doesn't count. The second is the
less easy-to-quantify concept of maintainability. Fleshing out this latter
concept, it is clear it has to do with how easily one can establish programmer
consensus for the given task; i.e. how many programmers of the language would
put it back the same way you've expressed it or otherwise agree on the best
implementation of a given problem.
I will define the Kolmogorov Quotient so that higher numbers for a given
language denote a reduction in the complexity (i.e. greater succinctness) of
solving the problem in the given language.
Once the basic premise and a methodology are agreed to, the choice of any
specific implementation/architecture is only a matter of a rough constant of
difference. That is, as long as the architecture is the same across all
measurements, the number should be valid and comparable between languages.
It could be implemented something like so: Implement a language that maps
machine op-codes to mnemonics (like Assembly) and measure the number of bytes
of machine code needed to perform a standard suite(*) of common, non-threaded
programming tasks (call it: machine_language_count). Then code that exact
functionality in the language you want to measure (without using
external libraries) and count the number of bytes of source code you used
(call it: test_language_count).
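As a rough illustration of the counting step, here is a minimal sketch that assumes the language under test is Python itself. The function name countable_source_bytes is hypothetical, and using ast.unparse (Python 3.9+) to drop comments is an assumption of this sketch, not part of the method described above. It also normalizes whitespace, which a real measurement might or might not want.

```python
import ast


def countable_source_bytes(source: str) -> int:
    """Count UTF-8 bytes of Python source, ignoring comments and docstrings.

    Hypothetical helper: the parser discards comments, docstrings are
    removed by hand, and the remaining code is re-emitted with ast.unparse.
    """
    tree = ast.parse(source)  # comments never reach the syntax tree
    holders = (ast.Module, ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    for node in ast.walk(tree):
        if not isinstance(node, holders) or not node.body:
            continue
        first = node.body[0]
        is_docstring = (
            isinstance(first, ast.Expr)
            and isinstance(first.value, ast.Constant)
            and isinstance(first.value.value, str)
        )
        if is_docstring:
            rest = node.body[1:]
            # A function or class body may not be empty; a module body may.
            if not rest and not isinstance(node, ast.Module):
                rest = [ast.Pass()]
            node.body = rest
    return len(ast.unparse(tree).encode("utf-8"))


sample = 'def f(x):\n    """Add one."""\n    # bump the value\n    return x + 1\n'
test_language_count = countable_source_bytes(sample)
print(test_language_count)
```

The same idea would need a different parser for every other language being measured, so treat this only as a picture of what "bytes of source code" could mean once comments and docstrings are excluded.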
The expressivity of your language, then, is machine_language_count /
test_language_count. The elegance of your program is
2/#_of_equivalent_programs * this number. By "equivalent programs", it is
meant the number of programs that fluent programmers of the language agree are
good and equivalent solutions to the problem. Whether the output should be the
same for every input is untested (there may be good solutions that don't
include every extreme case). Languages which have a large number of equivalent
programs say something interesting about either the programming language or
the types of programmers the language attracts.
Expressivity should always be greater than 1.0, otherwise there was no purpose
to your language: machine language implemented it more succinctly. The closer
elegance gets to 1.0, the closer the code is to its maximum elegance.
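The two ratios can be written out as a small sketch. The function names and the sample byte counts below are hypothetical, chosen only to illustrate the arithmetic; the formulas themselves are the ones stated in the text.

```python
def expressivity(machine_language_count: int, test_language_count: int) -> float:
    """Bytes of machine code divided by bytes of source in the tested language."""
    if machine_language_count <= 0 or test_language_count <= 0:
        raise ValueError("byte counts must be positive")
    return machine_language_count / test_language_count


def elegance(expressivity_value: float, equivalent_programs: int) -> float:
    """2 / #_of_equivalent_programs * expressivity, as given in the text."""
    if equivalent_programs <= 0:
        raise ValueError("need at least one equivalent program")
    return 2 / equivalent_programs * expressivity_value


# Hypothetical measurements: 1200 bytes of machine code, 300 bytes of source,
# and 4 programs that fluent programmers accept as equivalent solutions.
e = expressivity(1200, 300)
print(e)               # 4.0
print(e > 1.0)         # True: the language beats machine code on size
print(elegance(e, 4))  # 2.0
```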
One might consider replacing keywords and function names with a minimal
symbol/identifier as a substitute for the purpose of measuring, so as not to
get caught up in particular naming choices. However, one eventually sees that
naming and keywords are a significant source of expression, and therefore
expressivity, and that the only other item really being measured is a
language's architectural tools provided over assembly, like parametrized
functions, objects, containers, and type safety vs. runtime choices. The
answer to the former (naming) is to fine-tune your spoken language into
shorter forms, not to sacrifice readability.
Out of date: (*) "standard suite of common programming tasks...": I see two
main categories:
The purpose of this idea is to end the tiring language wars about whose
language is the best. By giving a quantitative metric, people can at least
argue better.
Once you've made an expressive or higher-level language, you can move on to
[elegance].