Monday, 26 September 2016

What else are we getting wrong?

I like reading history of science and, as a working academic, I am amused by theories that used to be widely accepted yet false, such as the miasmatic theory of disease or the phlogiston theory of combustion. The examples are numerous enough to have their own Wikipedia Category. It makes me wonder which of the current orthodoxies are as spectacularly misguided (this is my personal bet). But the most challenging exercise is to think about my own discipline, the theory of programming languages, and think if any of the (sometimes unstated) dogmas could be utterly wrong.

I think asking ourselves this question is a useful exercise in critical thinking even though it sounds like trolling. First of all, it is justified, if we consider the historical trajectory of mainstream programming languages in terms of improving programmer productivity. I don't have a precise definition of what programmer productivity means, but I certainly don't mean KLOC/week or something silly like that. Lets just define it informally as the ability to complete a task up to a reasonable level of correctness, where the precise interpretation of "task" and "reasonable" and "correctness" is left to the reader. 

Programming started with raw codes introduced directly into the working memory of a computer via switches, relays and buttons. Moving on to reading the code off of cards must have brought about a tremendous leap forward in productivity. Further progress from codes to assembly languages with symbolic labels must have been at least as productivity-enhancing. The next step, to FORTRAN, also must have boosted programmer productivity tremendously. The syntax was nicer -- especially the ability to write expressions and functions -- but the killer feature must have been code portability, given by machine independence. This covers progress in the 1940s and 1950s, the early days of programming. Programming languages obviously evolved greatly since. But did any new programming language ideas lead to any improvements in programmer productivity comparable to assembly or FORTRAN? Is any current language 100 times better than FORTRAN for programmer productivity, just as FORTRAN was compared to machine-code programming?  

Probably not. If a civil engineer wants to build a bridge there will be some choices. They can choose from a simple beam bridge to an extremely complex suspension bridge. They will roughly know what the choice involves in terms of cost, strength, maximum span, construction time and so on. Similarly, or even more so, in programming languages the engineer will face a broad array of languages and frameworks for building any project. Choices will need to be made. Unlike the case of bridges though, the choice will be informed by hardly any quantitative factors, but mostly by taste, ideology and other social or psychological contingencies. When should one choose, for example, functional over object-oriented programming? The question sounds like flame bait but it is a very reasonable one. First year students will ask it all the time until we educate them that such questions are impolite. It is like asking what football team, rock group or religion are better. It is dogma, and it is widely known and accepted to be dogma.
Some aspects of programming languages are scientifically solid. Compilers do get measurably better, for a given language, and the measures of improvement are quantitative: faster code, less memory, faster compile times. Software verification and testing can also get measurably better: more kinds of programs can be automatically verified, faster verification, fewer false positives and negatives. But programming language design is an issue that is rarely dealt with in a similarly scientific and quantitative fashion.

There are scientific questions we can ask about a programming language design or feature. The first one, and this is a question usually asked reasonably well, is whether it can be compiled efficiently. This question is answered constructively by building a compiler and comparing benchmarks with other languages and compilers against various measures as I mention above. The second one, and this is a question rarely asked well, is whether it improves some measure of programmer productivity. This question is of course asked a lot, but not scientifically. Because this important question is not asked scientifically, researchers do what all people do -- use whatever available heuristics are at hand. They use a community-driven sense of "what is right", of "what makes sense", of "what is elegant" to determine what a "good" versus "bad" language is. But to anyone familiar with the history of science, this is a treacherous path. This is the way false dogmas are generated: ideas that can be beautiful yet wrong. They are not necessarily wrong, but they could be. We just don't know. We do no science on them, just speculation and debate.

Is there any such prevalent orthodoxy in (academic) programing language design, seductive but not scientifically validated? The drive towards ever more sophisticated type systems seems to me potentially so. The Curry-Howard-Lambek correspondence is a beautiful paradigm which brings together aspects of language, logic, mathematics, and even physics and systems theory. Phil Wadler has a beautiful talk on the topic [video] and Baez and Stoy's Rosetta Stone paper is equally fascinating [pdf]. My own research has been heavily influenced by these ideas, which I embrace. They are persuasive. I am a fan. But I know of no scientific studies actually backing up the (sometimes implicit) claim that this types-first methodology leads to better languages, from the point of view programmer productivity. It might -- we don't know though. 

The ideology of sophisticated type systems is not merely academic, it does have a growing impact on language design. There are language such as Agda and Idris, which are embodiments of type theory at its most sophisticated. The more popular Haskell also has a type system which may surprise you with its amazing powers. This wonderful talk by Stephanie Weirich is a great introduction [video]. Finally, Rust is another new language, this time lets say "production-oriented", with a tremendously sophisticated type system. I instinctively agree with the ideology of types, and I instinctively like Agda, Idris, Haskell, Rust, and all their cousins and friends. But we should worry a little bit about our relying on aesthetics with no evidence in support.

We should worry, for instance, when Python, a language that breaks all academic language design tenets, is one of the most popular languages in the world. We should worry and we should try to understand what is going on with programming languages not just mathematically but also from a human point of view, psychologically. A couple of years ago I had breakfast with Guido van Rossum -- he almost left the table when I told him I was an academic PL researcher -- and I was quite impressed with the way he dismissed all mathematical aspects of the study of programming languages but was almost exclusively focussed on "soft" aspects of usability such as concrete syntax. I was not convinced by his arguments, but I found them intriguing.
Evidence-based usability research in programming languages is, sadly, a fringe activity. I could find no papers in our main conferences (POPL, PLDI, ICFP, etc) on this topic. There is a small community of usability researchers in programming languages (Psychology of Programming Interest Group) and some of their work is promising. But I think much more evidence-based research is needed in PL design, and this work should be much more encouraged and promoted by our community. Because programming languages are not the abstract Platonic objects of mathematics, are not even edifices bringing together form and function in a way that needs to be aesthetically pleasing. They are tools which we use to accomplish tasks. The programming language is the hammer and the plow of today's economy. They need not be beautiful, unless beauty itself is part of their function. They just need to get the job done.

Yaron Minsky, a man of tremendous accomplishments whom I hugely respect seems to think that such empirical validation is impossible [blog]. He says "there is no pile of sophomores high enough to prove anything", in reference to (the very few) academic studies of programming language usability, which tend to focus on students as subjects. I think this opinion is broadly shared in our community, and this makes me quite unhappy. I don't think we should dismiss science, facts, evidence, experiments and rely solely on aesthetics instead. Science worked out pretty well for humanity for the last thousand years or so, whereas aesthetics-driven progress has been often questionable. Giving up on science is simply giving up. 

Haskell: If monads are the solution, what is the problem?

Monads have exerted a curious fascination on programmers who turn to Haskell. The incomprehension of the newbies is matched in intensity onl...