My Blog

The curious case of case sensitive identifiers

11 July 2010 | Technology


Endless religious debates have been had about case sensitivity in programming languages. As with all such debates, fundamentalism and stubbornness quickly take over from common sense and rationality, with incessant flame wars as the result. But the basic fact is very simple: for humans, case sensitivity in identifiers (I'm deliberately limiting the scope of this article to identifiers, ignoring for now strings and other code or data) harms the readability of code and the productivity of writing and debugging it and, therefore, the maintainability, stability and reliability of applications.

It's unfortunate that the original C language had case sensitivity for identifiers, as its popularity has contributed to the adoption of that principle in many other languages and environments. Not because case sensitivity is such a great idea, but simply because once you have established a principle like this it is hard to get rid of - because of the ongoing need for backward compatibility, familiarity and habit and plain, simple inertia.

While my main interest and career have been in technology and engineering, I happen to have been educated also as a linguist. As such, I have always looked at matters such as case sensitivity (and other aspects of legibility, for instance operator overloading) from a perhaps different perspective. Where humans need to be able to read code, case sensitivity is definitely a bad idea - the way we read natural language makes us ill suited to deal with case sensitivity in identifiers. Fundamentally, reading is a process of pattern recognition - it's a common misconception that we read by analysing individual letters.

Indo-European languages (including English, which is the inspiration for almost all programming languages) do use capitalisation, but mostly for structural and functional purposes, not for semantic reasons. For example, nouns in German are capitalised and in most Indo-European languages abbreviations are often written in all capitals. But if someone accidentally writes "Jack sat at the tAble" instead of "table", we have no trouble in understanding the meaning, because the capitalisation of the pattern "table" is not linked to its associated semantic content.

The role of identifiers in computer languages (as elsewhere) is almost entirely semantic, not functional. The identifier carries meaning about the data or object it contains or refers to. For such a semantic purpose, case sensitivity is quite counterproductive. When reading code, the semantics are most important. "TotalPrice = UnitPrice * Quantity" means the same to us as "totalprice = unitprice * quantity" or "totalPrice = unitPrice * quantity", etc. For a case sensitive compiler or interpreter, each of these refers to a different set of variables - thus creating a conflict between how our human brain works and what the compiler is doing.
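To make that conflict concrete, here is a minimal sketch in Python, used only as one example of a case sensitive language. The function and variable names are invented for illustration. It shows the two typical outcomes of a capitalisation slip: a silent wrong result when the misspelled name is assigned to, and an error only at run time when it is read.

```python
def total_price_buggy(unit_price, quantity):
    total_price = 0.0
    # Slip of the shift key: this creates a second variable, total_Price,
    # instead of updating total_price. No error is reported.
    total_Price = unit_price * quantity
    return total_price


def total_price_fixed(unit_price, quantity):
    # Same logic with consistent capitalisation.
    total_price = unit_price * quantity
    return total_price


def misspelled_read_fails():
    TotalPrice = 10.0
    try:
        # Different case, so a different (and undefined) name.
        return totalprice
    except NameError:
        return "NameError"
```

To a human reader the two total_price functions look the same at a glance, yet the first always returns 0.0.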

Consequently, programmers end up wasting time finding and debugging issues related to case sensitivity, whereas they should be spending time on algorithm design, requirement analysis, code logic and so on.

Clearly, it is simpler to write compilers to be case sensitive, in that extra computing effort is needed to implement case insensitivity. The exercise is trivial in the case of ASCII, but Unicode requires more effort. However, computer languages and compilers only need to deal with a fairly small character domain, and the amount of processing power now available to us means case insensitivity in computer languages is fairly easily implemented.
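As a rough illustration of that difference in effort, the following Python sketch folds identifiers before they are stored in or looked up from a symbol table. The names fold_ascii, fold_unicode and SymbolTable are hypothetical, and the Unicode variant (normalise, case fold, normalise again) is only an approximation of proper Unicode caseless matching, not a description of how any particular compiler does it.

```python
import unicodedata


def fold_ascii(name):
    """ASCII-only folding: map A-Z to a-z and leave everything else alone."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in name)


def fold_unicode(name):
    """Approximate Unicode folding: normalise, case fold, normalise again."""
    folded = unicodedata.normalize("NFKC", name).casefold()
    return unicodedata.normalize("NFKC", folded)


class SymbolTable:
    """A toy symbol table that ignores case in identifiers."""

    def __init__(self, fold=fold_unicode):
        self._fold = fold
        self._values = {}

    def define(self, name, value):
        self._values[self._fold(name)] = value

    def lookup(self, name):
        return self._values[self._fold(name)]


table = SymbolTable()
table.define("UnitPrice", 2.5)
table.define("Quantity", 4)
# Any capitalisation now reaches the same entry.
total = table.lookup("unitprice") * table.lookup("QUANTITY")
```

The ASCII version is a one-line character mapping. The Unicode version has to cope with cases such as "Straße" and "STRASSE", which fold_unicode treats as equal but fold_ascii does not.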

Readability of code, semantically, is important not just when applications are first created, but for long term maintenance, support and further development. Identifier case sensitivity interferes demonstrably and drastically with this readability and, as such, should be avoided wherever possible. Not for reasons of beauty or belief, but simply because it makes writing code more difficult and less productive than it ought to be.
