I love runtimes. Leveraging increasing processor speed to create useful abstractions for the programmer is an idea which (I think) is just in its infancy, but which will ultimately become the basic model of software construction.
The runtimes for Java and C# have at their core a single feature: a memory manager. That alone has proven so worthwhile that it has created a tectonic shift in modern software development. It's interesting (and a little sad) to me that this is really the only abstraction built into these runtimes, as I think that loose (or at least non-overt) typing, combined with more flexible data types (I want to be able to add numbers to strings, for example), makes for shorter, easier-to-read, and less buggy programs.
This is my bias, and I don't want to argue it right now. The thing that really interests me about runtime technologies is that if the runtime designer can identify a few core, powerful constructs and then build those constructs into the runtime in native code, then she can create a runtime which will have decent performance characteristics even though it offers very high-level constructs.
Something I've been thinking a lot about lately is the way in which a program which is run by a runtime is actually just data. This is somewhat obvious, but it is rarely formalized. In fact, runtime languages usually do a lot to look like the compiled languages that preceded them. The fact that I have to think about whether I want an int or a float in Java strikes me as ridiculous. If I don't care, I shouldn't have to think about it.
This is a side note, but I also think it's interesting that performance issues often keep real programs from taking full advantage of Java's built-in garbage collector. In theory, it's great that everything in Java is and should be an object; in practice, programmers often have to write their own object pools to work around the slowness of allocating memory. This is another way in which the full power of the Java runtime cannot be exploited.
Anyway, back to the main point, which is that to date, the virtual machine abstraction has been relatively weak: the Java and MS.NET runtimes are essentially idealized versions of modern computer architecture that don't leak memory or crash (at least, they're not supposed to leak memory or crash). This reminds me of something I heard about the early days of cinema: there was a while, before Eisenstein and others invented a cinematic vernacular, when people made movies by just locking down a camera and filming a play.
Functional programming is the closest match to the kind of abstractions that I'm talking about, but I find that purely functional languages are cumbersome for all but the most trivial applications. XSLT is the functional language that's probably most widely used today, and I think that anyone who has used it will testify to its deficiencies. It's good at simple things, but the exceptional cases, the idiosyncrasies in the data model, and the inevitable mid-project changes tend to be very difficult to deal with using XSLT. More importantly, the if...then and for...in kind of logic that is honestly the bread and butter of most production code is hard to recreate in functional languages, and the implementation tends to be fragile, so that a change to the logic forces large rewrites of the program.
Laszlo currently has an interesting mix of functional and imperative programming. I've written about Laszlo's use of XPath before, but here's how it relates to program transformation.
```xml
<canvas layout="spacing: 2">
  <dataset name="myData">
    <people>
      <Tim> Nice guy. </Tim>
      <Ron> Not so bright. </Ron>
      <Kelly> Tall. </Kelly>
      <Adam> Bombastic. </Adam>
    </people>
  </dataset>
  <view datapath="myData:/people/*" layout="axis:'x'">
    <text datapath="name()"/>
    <text datapath="text()"/>
  </view>
</canvas>
```
Here at Laszlo, we call this effect "data replication." This program essentially says: for each of the nodes in myData:/people, make a view with two text children, the first showing the node's name() and the second showing the node's text().
The nice thing about Laszlo is that I can do this:
```xml
<canvas layout="spacing: 2">
  <dataset name="myData" src="mydata.xml"/>
  <view fontstyle="bold" layout="axis:'x'" name="title">
    <text>Name</text>
    <text>Comment</text>
  </view>
  <view datapath="myData:/people/*" layout="axis:'x'" name="people">
    <text datapath="name()"/>
    <text datapath="text()"/>
  </view>
</canvas>
```
This is a simple exception to the main program logic -- I want column headers in a slightly different presentation style -- but I can add them to my program without having to add them to my data model and then try to identify that data in my for-each transformation. Again, if you've used XSLT, you know how hard it can be to add these types of things at just the right spot in the middle of a complex transformation.
I'm pretty happy with the way that data replication works with Laszlo, but it's a little bit ad hoc. The transformation of the people view happens at runtime, meaning that in order to know whether there are one or many people views, the first people view has to be made. In practice, this makes for some messy programming, because the first people view is subtly different from the others. This can be especially frustrating, because often the programmer knows at design time whether or not she intends a given data-mapped view to replicate.
What we've been talking about lately is making it so that the runtime is essentially a big data replication engine. In Laszlo as it is implemented now, there isn't really an abstraction layer between the part of the above program that specifies the title view, and the runtime objects which are created to represent it. The transformation from the declaration of that view to a constructor call happens in the compiler. What I've been thinking about is the idea of passing a Laszlo program to the runtime as data. The Laszlo language, then, is a default mapping that tells the runtime how to map data to constructor calls. The default mapping is:
| XML node | Laszlo instance |
| --- | --- |
| name | class |
| attributes | attributes |
| children | children (recursive) |
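The default mapping in the table can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Laszlo compiler or runtime is implemented: the `Instance` class and the `instantiate` function are hypothetical names, and text content is left out because the table does not cover it.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical stand-in for a runtime object; not a real Laszlo class."""
    cls: str
    attributes: dict
    children: list = field(default_factory=list)


def instantiate(node):
    """Apply the default mapping: name -> class, attributes -> attributes,
    children -> children (recursively)."""
    return Instance(
        cls=node.tag,
        attributes=dict(node.attrib),
        children=[instantiate(child) for child in node],
    )


# The title view from the program above, treated purely as data.
program = ET.fromstring(
    '<view fontstyle="bold" name="title">'
    "<text>Name</text><text>Comment</text>"
    "</view>"
)
title = instantiate(program)
print(title.cls, title.attributes["name"], [c.cls for c in title.children])
```

The point of the sketch is that nothing in `instantiate` knows about views or text: the program is walked like any other XML tree.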
A datapath declaration is a way of using other data as program data, and changing that mapping. In the above program, the replicated row does this:
| XML node (of myData:/people) | Laszlo instance |
| --- | --- |
| "view" | class |
| "people" | name attribute |
| "axis:'x'" | layout attribute |
| text (with separate mapping) | first child |
| text (with separate mapping) | second child |
| name | ignore |
| attributes | ignore |
| children | ignore |
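Here is a rough Python sketch of that replicated-row mapping, again as an illustration rather than a description of the actual runtime. The `replicate` function and the plain-dict representation are hypothetical, and stripping the whitespace around each node's text is an assumption made for readability.

```python
import xml.etree.ElementTree as ET

# The same data as the myData dataset in the first program.
DATA = ET.fromstring(
    "<people>"
    "<Tim> Nice guy. </Tim>"
    "<Ron> Not so bright. </Ron>"
    "<Kelly> Tall. </Kelly>"
    "<Adam> Bombastic. </Adam>"
    "</people>"
)


def replicate(nodes):
    """Build one plain-dict 'view' description per data node.

    The class, name, and layout come from the program, not from the node;
    the node only feeds the two text children.
    """
    return [
        {
            "class": "view",
            "attributes": {"name": "people", "layout": "axis:'x'"},
            "children": [
                # Corresponds to datapath="name()"
                {"class": "text", "text": node.tag},
                # Corresponds to datapath="text()"; stripping is an assumption
                {"class": "text", "text": (node.text or "").strip()},
            ],
        }
        for node in nodes
    ]


rows = replicate(list(DATA))  # children of <people>, like myData:/people/*
for row in rows:
    print(" | ".join(child["text"] for child in row["children"]))
```

Each data node contributes only to the two text children; its own name, attributes, and children are otherwise ignored, as in the last three rows of the table.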
I think it would be so nice if this were built into the core of the Laszlo runtime. (Instead, right now, the people view is responsible for figuring out whether it needs to replicate. The program really is a program, and the data is fed into it.)
When we've talked about this at Laszlo, people's reaction has been either "that's too abstract for me to understand how it's useful" or "it sounds right, although I don't know specifically what it would buy us."
I think that the main advantage is that this would remove the distinction between data-backed parts of a program and non-data-backed ones. Any part of a Laszlo program hierarchy could be read back out as data, and more importantly, component writers would not have to write special code for the data-backed case. I think it would eliminate a whole class of bugs, and it would open the door to a much more formal description of Laszlo programs, which might lead to compile-time optimizations. Besides, it's just really cool.
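To make "read back out as data" a little more concrete, here is a small Python sketch that inverts the default mapping. It assumes a made-up plain-dict representation of an instance tree and a hypothetical `to_xml` function; Laszlo does not provide either.

```python
import xml.etree.ElementTree as ET


def to_xml(instance):
    """Invert the default mapping: class -> name, attributes -> attributes,
    children -> children (recursively)."""
    node = ET.Element(instance["class"], instance.get("attributes", {}))
    for child in instance.get("children", []):
        node.append(to_xml(child))
    return node


# A hand-built instance tree; it could just as well have come from replication.
people_row = {
    "class": "view",
    "attributes": {"name": "people", "layout": "axis:'x'"},
    "children": [
        {"class": "text", "attributes": {"datapath": "name()"}},
        {"class": "text", "attributes": {"datapath": "text()"}},
    ],
}

node = to_xml(people_row)
print(ET.tostring(node, encoding="unicode"))
```

Because the same function handles every node, a data-backed view and a hand-written one serialize the same way, which is the property the paragraph above is after.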
Comments
I totally agree with this. I think there are other benefits as well - for instance creating a laszlo code editor which could present multiple views of the current code, view the code as data to test and verify, etc. I also have wanted to create an application which has a need for this data/program fusion...
I seem to recall that it was not Eisenstein that invented the pan (although he used it to great effect), it was Porter, one of Edison's cameramen, who noticed that when a camera fell over while still running, the resulting film was still understandable and had an interesting effect. Prior to that, people had just _assumed_ that moving the camera would ruin the film (as it did for still pictures). Porter's "Great Train Robbery" is cited as the earliest example of a film that includes pan shots. Porter's invention happened in the early days of cinematography, but was not widely adopted for some time.
In a similar way, McCarthy's *Lisp* broke out of the straitjacket of numerical programming by realizing that programs were data. Lisp is as old as Fortran, but many of its concepts are still not widely adopted. Paul Graham has an interesting take on this in his "Revenge of the Nerds" [http://www.paulgraham.com/icad.html], where he shows that, one by one, the features of Lisp are slowly being grafted onto Fortran's numerical descendants.