This blog has been moved to http://info.timkellogg.me/blog/

Sunday, January 9, 2011

Declaring the Future of Programming

Programming languages have developed significantly over the past several decades. I hypothesize that this development has tended more towards declarative syntax than imperative. The future of programming languages will only become more declarative in the years to come.

In the beginning was machine code. Programmers wrote programs by stringing together arcane byte codes of instructions and parameters. Programs were getting pretty hard to read so they made assemblers so you could write instructions in plain text, complete with comments. An assembler program would process the source code and turn each instruction into it's equivalent machine code. This is imperative programming at its most pure state.

When the first C compiler was written it immediately became popular because the programmer only had to declare what should happen in the program and the compiler would generate the necessary machine code to make that happen. Hence why you can write a C program that can be compiled for Linux, Windows and Mac with zero changes to the source code. However, C and C++ are still imperative languages in most other aspects because the thought process is still very much a "do this, now do this, now do this" algorithmic sequence of instructions.

Query Languages

The hallmark of declarative languages thoughout history is probably SQL (referring strictly to set operations here). In SQL you describe the result set and let the DBMS decide the best way to produce that result set. For instance, consider this query:

select p.FirstName, p.LastName, a.AccountName
from Person p
inner join Account a
on p.PersonId = a.ResponsiblePerson
where a.IsActive = 1
order by p.FirstName, p.LastName

First we describe the columns that we want (this actually happens last, if you want to be technical). In the from clause we say what tables we want information from and specify how we want them matched up using the on clause of the join. In the where clause we specify what criteria for the rows that we want to show and in the order by we describe the sort order.

All this was done strictly declaratively. If you have the opportunity to look at the execution plan, it all ends up being quite elaborate. It might consult two or three indexes before actually joining rows, selecting columns and ordering the result set not to mention all the asynchronous locking that took place so as not to run into race conditions. If we had to write this in C# or Java code it would be an extremely gnarly component and would probably be buggy and slow.

Expression Trees in C#

Interestingly, .NET land is also developing into a declarative playground. The biggest step in this direction happened with Linq and it's expression trees. Now, the Linq query syntax is declarative, but I'm referring to something more basic. Expression trees can be broken down at run time by a processor that can analyze the contents of a lambda that it was passed. For instance, NHibernate can receive a method call like:

var timsAccounts = accounts.Where(x => x.ResponsiblePerson == "Tim");

and pull out the meaning (ResponsiblePerson = Tim) and convert it into a SQL "where" clause at run time (sql = "where a.ResponsiblePerson = 'Tim'). The implications of this are wild, and in recent months and years have become very powerful. Examples include Fluent NHibernate, Moq, and Castle Windsor's fluent registration API. Both castle windsor and NHibernate both used to use XML configuration files but have since moved towards using expression trees in combination with dynamic proxies and interceptors to configure via code. This declarative approach is leading towards less code that has potential to be more efficient.

Treatise on Domain Specific Languages

The topic of domain specific languages deserves an entire blog post. SQL and CSS are the obvious examples, but there are hundreds more. In one of my internships a coworker wrote a DSL to specify sort order for dictionaries for arcane natural languages and scripts. A simple DSL is much easier to develop than a GUI for the same purpose and can many times be easier for a non-techy user to learn and become productive in.

The sad news is that colleges and universities are putting less focus on compiler & parser classes. The assumption being that we have all the languages we need, why would we need more? The answer is simple: by providing a simple syntax to describe problems or solutions we can simplify the entire process of arriving to that solution. If the problem is abstracted away from the solution we can easily leverage constructs like multi-threading and highly optimized solutions. Sometime you should take a look at the byte codes that your compiler produces - ask yourself if you could have even thought of those sorts of mind bending tricks.

We need domain specific languages because they simplify problems. They create more effective abstraction than even inversion of control frameworks. Unfortunately, less people are learning about string processing these days. How many people have you worked with actually consider themselves proficient in regular expressions or compiler generators? (yet two more declarative DSLs that simplify solutions)

Conclusion

Anytime you write code that is less imperative, it allows the layer underneath more room to innovate efficient algorithms. Surely this isn't surprising since any good programmer would feel exactly the same way towards a micro-managing supervisor. So after saying all this, it should be clear why I believe that the future of programming is declarative. Declarative syntaxes allow us to simplify the problem by simply stating what the problem is (or describing what the solution looks like) and allowing the underlying engine to determine the solution. As such, I believe we will be seeing the number of domain specific languages multiply in the years to come.

No comments:

Post a Comment