martin's blog

Tuesday, March 3, 2009

For what it's worth: Java & expressiveness

Since Ola Bini started blogging about his very own language-project, Ioke, I've been pretty intrigued by the whole package. As soon as the first version had been released I started playing around with it, digging deeper into its concepts and contributing to the Programming Guide and Ioke's core libraries. One of the language's highest goals is expressiveness. It's been designed from the ground up to push the limits on what you can do with internal DSLs. Some of the ingredients that make that happen are Ioke's incredibly flexible treatment of operators as well as the inclusion of a macro system, which, coming from a non-Lisp background, is the most outstanding one for me. For that reason Ola uses the term folding language nowadays to describe Ioke. Well, but this is not a blog post about Ioke. It just sort of reminded me once again of Java's ineptitude when it comes to creating concise and expressive APIs/DSLs.

Java su... ah, excuse me, ... lacks!

Regarding expressiveness Java is like an iron maiden that constrains you and pierces you with a lot of sharp objects (no pun intended). You usually have to resort to all kinds of compromises and even trickery in a few cases (1). Sure, you can implement fluent APIs for providing more easily readable natural language constructs (e.g. JSR-310, the Date and Time API, goes to great lengths to make the API easy to read and natural to use). It makes sense and I think it's a valuable addition to good API design. But it still lacks in my opinion for numerous reasons.

An important one is that Java's restrictions in identifier naming make it impossible to use more appropriate symbols for certain operations. For example, it prevents you from using operator characters in method names or even just appending a '?' to function names, like in Ruby, which makes boolean functions very easy to read. This dilemma is probably due to the fact that operators, which often play an essential part in DSL design, are more or less considered primitives in the Java language rather than functions as in many (or should I say most?) other, particularly functional, programming languages. Whereas Java's closest cousin C++ at least allows operators to be defined as functions for user-defined classes, Java does not ... you're stuck with the built-in ones. I cringe everytime I see code using Java's BigInteger API. Truely horrible from an aesthetic point of view. Other languages offer considerably more freedom in that they achieve operator overloading (2) by treating operators as functions while additionally providing the syntactic sugar of "hiding" the actual method call. This paves the way for highly specialized and readable DSLs. For example, see Scala's parser combinator library that allows writing BNF-like syntax directly in the source code. Note that existing languages differ from each other in that some (e.g Ioke, Haskell or Smalltalk) allow the definition of entirely new operators, possibly incl. associativity and precedence, whereas others provide just a limited set (e.g. Groovy or Ruby). If Stuart Halloway had his way, operator overloading would be one of the distinguishing features of Java.next.

Alternatively, Java also lacks the syntactic sugar of languages like Groovy, Ruby or Scala to omit the parenthesis on n-ary methods (in Scala's case: n = 1) and use them in infix position such that you can write x or y instead of x.or(y). Such tiny syntactic gimmicks alone greatly improve the readability of code. Inspired by Io, Self and Smalltalk, Ioke generally uses whitespace to separate messages from each other, i.e. your reading of a line of code is not "interrupted" by punctuation marks.

Even if ...

... Java had all of the above, it would still be vastly inferior to other languages which are based on different concepts and paradigms and incorporate appropriate feature sets that make them particularly suited for implementing internal DSLs. Among these are:

The dynamic and reflective nature of some languages allows you to observe and modify a program's structure at runtime to shape core libraries to the needs of your DSL, e.g. by adding new methods or by redefining existing ones. Groovy's Categories and ExpandoMetaClass (even on a per-instance basis since version 1.6), Ruby's extensive metaprogramming facilities (see Rails) or Smalltalk's meta object protocol come to mind. Furthermore, concepts like Groovy's methodMissing / propertyMissing and invokeMethod, Ruby's method_missing, Ioke's pass etc. let you create effective DSLs with little effort. Ever tried to create an XML builder in Java? In Groovy it's just a few lines of code.
The dynamic-language argument is a bit weak though: Scala's implicit conversions, for example, are accountable to a very sophisticated static type system rather than a dynamic one. Moreover, InfoQ hosts an interesting interview with Lennart Augustsson on DSLs written in Haskell where he emphasizes amongst other things the language's ability to easily define new control constructs, operators and syntax without the need for dynamic or meta programming.
At the risk of making some fellow Java developers go mental: Closures facilitate the creation of custom control abstractions that (almost) feel like real language constructs.
Finally, Lisps (and now Ioke) raise the bar even higher with the inclusion of macro systems which provide powerful ways to hide "low level" details behind custom layers of abstraction and enable you to easily extend a language's syntax (3).

Conclusion

Personally, I thing that an expressive language should allow you to convey the things that you want to say without requiring a significant translation step by yourself and the users of the API. Once again, for example, Java's BigInteger API leaves a lot to be desired in this respect. I do realize that talking about what constructs and features contribute to the expressiveness of a programming language is highly subjective and often lies in the eye of the beholder (4). The power and structure of DSLs obviously vary from situation to situation and the amount of abstraction you need depends on specific requirements. Sometimes all you want is an API that more or less reads natural to your stakeholders. Yet at other times you also need custom general-purpose constructs like control structures, operators, etc. For those cases Java is not a good match. Even though there are numerous features that make other languages vastly superior to Java, things like operator overloading, extended character sets for identifiers or even just some syntactic sugar here or there are fundamental in my opinion and would kick it up a notch for internal DSLs written in Java. And there might be light at the end of the tunnel: Because of the increasing importance of the JVM as a language host, the recently established Project Coin could include a proposal that would allow Java to call into languages with different naming restrictions. How that would be achieved I don't know, but I'm certainly following the mailing lists should such a proposal show up ...

----------

(1) Adrian Kuhn shows how to use Roman numerals in your Java code by bending JSR-269, the Pluggable Annotation Processing API, to rewrite Java's AST. However, there're a few caveats: It only works with Java 6 and only with Sun's Java compiler because he uses internal APIs. Interesting, but shows the complexity that you're about to deal with should you choose that path. If you're interested you might also take a look at the The Hacker's Guide to Javac.

(2) Some have argued that the term operator overloading does not apply to languages that treat operators as functions. Rather, operators are syntactic sugar because they translate to an equivalent function call form.

(3) Paul Graham once said in one of his Lisp essays that "In Lisp, you don't just write your program down toward the language, you also build the language up toward your program."

(4) For those interested: Matthias Felleisen presents a formal notion of expressiveness in his paper called On the Expressive Power of Programming Languages (admittedly I've not read it yet in its entirety).

Sunday, March 1, 2009

The first days in the life of Project Coin

On February 27th the Compiler Group-sponsored OpenJDK project Project Coin has finally gone live. Proposals for small language changes may be submitted to the dev mailing list until March 30th 2009 after which a subset will be included in a JSR draft. In the two days since inception of the project there have been posted quite a few proposals to the list.

Neal Gafter has been very active so far and proposed no less than three language changes:

Block expressions allow a series of statements to be written inside a parenthesized expression. This avoids the need for helper functions that are used only once or the introduction of temporary variables.
Improved exception handling includes catching multiple exception types as well as improved checking for rethrown exceptions.
Improved wildcard syntax introduces syntactic sugar for writing ? extends T as out T (covariant) and ? super T as in T (contravariant). Both in and out would be considered as context-sensitive keywords, i.e. they still can be used as identifiers elsewhere. The syntax looks an awful lot like the upcoming variance feature of C# 4.0. However, C#'s designers apparently decided on declaration-site variance similar to Scala as opposed to Java's use-site variance.

Josh Bloch submitted a proposal for Automatic Resource Management. People who closely followed the discussion around closures in Java and the various proposals that were brought forward should recognize this one (however, I don't know if there are significant differences to Josh's older proposals). The posting of the proposal prompted an interesting discussion with Neal Gafter who (obviously?) seems not very much in favour of this feature or at least does not agree with Josh on its specification.

Jeremy Manson of Google proposed Improved Type Inference for Generic Instance Creation that would allow you to avoid the explicit declaration of parameterized types in class instance creation expressions even though the parameterized type of the constructor is obvious from the context. Neal Gafter also chimed in on this one and pointed out some of what he believes to be omissions or shortcomings of the specification. Apparently he and Joe Darcy, the initiator of Project Coin, are working on a similar proposal that is based on a different implementation strategy.

Additionally, Joe Darcy proposed the use of Strings in switch-statements, Ruslan Shevchenko submitted a proposal for multi-line strings and Adrian Kuhn suggested the use of the default keyword for default (i.e. package-private) visibility.

Altogether a very promising start in my opinion, though I don't necessarily want to see each and every one of those proposals in the language. For example, I've not yet made up my mind about Neal's improved wildcard syntax. On a first scan of the proposal I was not sure what to make of the in and out keywords, i.e. what they should convey to the developer. After refreshing my mind a bit by reading up on the theory behind variance as well as Scala's variance annotations and C#'s upcoming support I felt a bit more like I'd groked it. Well, that's maybe worth a blog post on its own.

Anyway, the mailing list is open to anyone for submitting a proposal and/or for chiming in on any of the previously submitted proposals. I'm not sure I will however because seeing the concentrated brain power on the list makes me afraid of looking like an idiot ;) Nevertheless, discussions so far have been very interesting and I'm confident that we'll see a lot more of that in the days and weeks to come. It's good to finally see things in motion ...

Saturday, January 31, 2009

Nomisma jacta est

The brand new OpenJDK project that is hosting small language changes for JDK 7 is called Project Coin (an interesting play-of-words). There's no website and/or mailing list yet, but apparently the plan is to get them up and running sometime in February. People are encouraged to propose language changes by formally describing them in a detailed propsal form that is to be sent to the mailing list once it has been established. Interestingly, proposals that wish to remove existing features from Java will not be considered.

As part of the above announcement Joe Darcy lists a few language changes that he thinks might be worthwhile to include in Java 7. These are not exactly brand new and have been discussed on and off over the past view years. Forks of javac implementing some of those features are hosted on Kijaro. However, they received more attention again recently as it became clear that "language-shattering" proposals like closures, first-class properties, reified generics, operator overloading, etc. won't be considered. Thanks to Stephen Colebourne's (and others of course) tireless endeavour the most recent survey results regarding the popularity of those changes can be viewed on his blog. (However, as I've said before, I think that those polls are probably a little too biased because conference attendees are hardly a representative fraction of the entire community of Java developers). Well, Joe probably doesn't speak for Sun in general, but I think chances are that the majority of those changes he mentioned are going to be part of Java 7. I would certainly welcome most of them.

We'll see how it all turns out. I'm pretty excited because it sure looks like that the initial proposal phase is going to be carried out in the public and I'm looking forward to the discussions of the various proposals. Because with the creation of Project Coint we are finally looking at a formal process backed by Sun regarding the evolution of the Java language, these could be very intersting times ...

Sunday, January 25, 2009

Debugging the Ioke sources

As far as I'm concerned one of the best ways to learn what a software is doing internally and how it does it, is stepping through the code with a debugger. That allows you to get a feel for the call chain and to inspect the state of objects and variables at runtime. But how're you supposed to debug an external application? With Java applications it becomes quite simple.

To that end the Java platform provides the Java Platform Debugger Architecture. It consists of two interfaces, the native JVM Tools Interface (JVM TI) for the backend and the Java-based Java Debug Interface (JDI) for the front end, and a protocol, the Java Debug Wire Protocol (JDWP) that ties both interfaces together and provides the format of debugging information and requests between both. By means of this architecture, a debugger can now connect to remote Java processes.

On to a concrete example: Recently I've taken quite a liking to Ioke. I always try to understand the internals of the libraries, frameworks etc. that I'm using. In Ioke's case even more so because it's a programming language and I love programming languages, man! ;) Ioke's written in Java. That mean's we can debug it remotely using Java's Debugger Architecture. Even better, you can find many of Ioke's core concepts directly represented in the Java source. Thus, the cognitive transfer from some program written in Ioke to that program's representation in Java becomes a bit easier.

To debug an external Java process that process has to be running and must have been configured for external debugging. To debug Ioke start the IIk with the option -J-agentlib:jdwp= and use the following sub-options transport=dt_socket,address=8000,server=y,suspend=y.

The agentlib:jdwp option is the preferred way on Java 5 VMs and above (it uses the new interfaces introduced in Java 5). For older VMs you have to use the options -Xdebug and -Xrunjdwp. The above means the following: Listen for a socket connection on port 8000 (rather than explicitely connecting to a debugger) and suspend the VM before the main class loads. Once the debugger application connects, it can send a JDWP command to resume the VM.

I'll use Eclipse and its debugger as an example, but I guess NetBeans or IntelliJ are fine as well (though I've not yet tested it). In Eclipse open the Debug Configurations dialog and create a new configuration for a Remote Java Application. Specify the project, the host and the port (must be the same as the one you gave to the VM). Click on Debug and the VM Ioke's running in will resume normally. Type something in the interactive promt and send it to the interpreter. If you've set any breakpoints in the source code, the debugger will halt and you're good to go doing whatever you want to do.

Pretty neat, ey? Maybe you've already known this stuff, but if not ...

(If anybody knows a better and/or simpler way, I'd really like to know about it.)

Friday, January 23, 2009

Ioke

Ioke is a relatively new addition to the current crop of JVM languages and has been released as version Ioke O a month ago with the next release Ioke S scheduled for today. (Update: seems I'm a bit behind: Ioke S has just been released!). Ioke's designed by Ola Bini, a JRuby core developer, and features strong, dynamic typing, prototype-based object orientation and is a homoiconic general purpose language inspired by Io, Smalltalk, Self, Lisp and Ruby. It's mainly written in Java.

I've been following the evolution of Ioke since the beginning and became quite intrigued by it in the process. The language's main focus is expressiveness rather than performance. It's worth to point out that performance considerations have been completely neglected during the design process in favour of enhancing the expressive power of Ioke. According to Ola, "expressiveness is the ultimate goal of the language". I assume that doesn't mean that performance won't ever be a concern. It just hasn't played a decisive role in the design so far. Actually, Ola's made the point quite early on that Ioke's not your usual language:

"Ioke is a language I designed for myself. That means that some design choices might not match what you would expect. Ioke is a language that makes sense to me. If it makes sense to you too, great! But be warned that I have made several decisions that doesn't necessarily match what most general purpose languages choose to do."

If you want to know more about Ioke's features, I recommend taking a look at the Programming Guide as well as Ola's aforementioned blog posts. Therefore I don't intend to go into too much detail, but I think there are some noteworthy things to mention. Note, however, that I'm by far no expert, so please correct me if I'm talking nonsense.

Ioke has a powerful macro system similar to languages of the Lisp family. Macros are a kind of code generation facility that allow you e.g. to extend Ioke's syntax (comprehensions are implemented as macros for example). Mind you, we're not talking about mere string substitution like #define in C.

During the transition from Ioke O to Ioke S a couple of new macro types were introduced that allow you to write even more powerful abstractions. That recently led Ola to borrow a term coined by Steve Yegge to describe Ioke as a folding language, that is, a language in which you can write code that writes code that writes code etc. If that sounds as confusing to you as it did for me when I first heard that term, just just click on above link for further clarification.
Ioke features a condition system influenced by Common Lisp that allows fine-grained control over any conditions (not necessarily constrained to error conditions) that might arise during execution of a program. It serves a similar purpose as Java's exception handling mechanism, but is more flexible in that it allows e.g. invoking restarts in the dynamic context of the place the condition happened to provide a way to get back to valid state. The debugger in Ioke's REPL, called IIk (interactive Ioke), is implemented using the condition system. It interactively provides various options to handle certain conditions. For example, a method in Ioke signals Condition Error Invocation TooFewArguments if it's not been given enough arguments. On that condition the debugger allows for giving additional arguments to the method and subsequently restarts the block of code that caused the condition. Nice ;)
As of the latest version, Ioke supports a bit of aspect-oriented programming in that it's possible to add before, around and after advice to cells (see below for an explanation of this term) of an object.
Just like Smalltalk, Ioke is all about sending messages to objects. There are no keywords or statements. Everything is an expression that is made up of a chain of messages that return something which in turn is the receiver of the next message. Inspired by Smalltalk and Io, messages are separated by whitespace rather than a period as for example in Java. If you want to know more, Ola's written a detailed blog post about his early decisions and reasoning regarding the syntax of Ioke.
Accompanying the distribution are a couple of tools specifically written for Ioke. First, there's a minimal port of Ruby's BDD framework RSpec called ISpec, which is used exclusively throughout Ioke's test suite. Additionally, a tool called DokGen is used to generate the reference documentation. You can view the current API here.

Well, 'nuff said. To learn more about a language and how it feels you've got to write some code. So I decided to port a Java implementation of an alogrithm to find all two-word anagrams of a given input word from a given dictionary. It turned out to be quite straightforward even though Ola had to help me out on a few occasions.

The algorithm works like this:

Compute a code for each word in the given input dictionary. The code is the product of prime numbers which have been assigned to each letter of the word.
Store the code and the word such that the code maps to a list of words that have the same code, that is, they consist of the same letters. I've used a Multimap for that.
Finally, the Multimap is searched for two codes that multiply to the code that corresponds to the input word. If such two codes can be found, you have a two-word anagram of the input word.

Here's the code (hope it's readable enough; still on the look out for a decent syntax highlighter):


Multimap = Origin mimic
Multimap initialize = method(self content = {})
Multimap cell("[]=") = method(key, value,
 if(content key?(key),
   content[key] << keys =" method(" words =" method(text," index =" method(words," encode =" method(word," asprime =" method(char," primes =" {}("> 2,  "b" => 3,  "c" => 5,  "d" => 7,  "e" => 11,
 "f" => 13, "g" => 17, "h" => 19, "i" => 23, "j" => 29,
 "k" => 31, "l" => 37, "m" => 41, "n" => 43, "o" => 47,
 "p" => 53, "q" => 59, "r" => 61, "s" => 67, "t" => 71,
 "u" => 73, "v" => 79, "w" => 83, "x" => 89, "y" => 97,
 "z" => 101) withDefault(1)

anagram = method(word,
 code = encode(word)
 Multimap keys each(x,
   if(code % x == 0,
     y = code / x
     if(Multimap key?(y),
      "#{Multimap[x]} and #{Multimap[y]}" println))))

index(words(FileSystem readFully("wordlist.txt")))
anagram("documenting")

Let's step through some parts of the code. First of all, there're quite a few terms appearing in the following paragraph(s) that denote core concepts of Ioke. It might not be obvious what they mean at first glance. I'll try to explain them very briefly, but for further clarification please refer to the Programming Guide.

The first part of the program defines a new kind of object, a Multimap, that mimics Origin. Origin is the kind (the Ioke term for type) that most objects in Ioke should start from. The term mimic captures the prototype-based part of Ioke. In prototype-based programming there's no distinction between classes and objects themselves. New objects are created by cloning them from existing objects, that is, existing objects sort of serve as prototypes for new objects. Ola's decided to call that mimicking an object. A mimic is basically the parent of an object. Multimap mimics Origin and therefore inherits (like in object-oriented programming) all its cells. Cells represent all data in Ioke (think slots in Self for example) and contain variables, methods, etc.

The Multimap's initialize method creates a cell on the receiving object (that's what the predefined variable self refers to) that contains an empty Dict, which is basically a HashMap. The remaining cells are redefined methods of Dict that just delegate to the cell called content ... composition over inheritance ;).

The remaining methods implement the algorithm described above. Methods in Ioke can be defined in several ways. Let's take the index method as an example for one of those ways:


index = method(words,
 words each(x, Multimap[encode(x)] = x))

This method takes one positional argument and the actual code to execute. In this case the method is given a list of words on which the method each (from the kind Mixins Enumerable) is called. If you're familiar with Ruby, Groovy, etc., you should know that method. For those who don't: each executes the code in the second argument for each element x of the collection. Here the code for each word is computed and placed in the Multimap together with the actual word. As you can see, messages are separated by whitespace, which makes it very natural to read in my opinion.

The cell primes is an example of the definition of a Dict. It uses the method {} that takes a variables number of arguments, in this case Pairs that are defined by the literal Pair syntax =>. primes maps individual characters to prime numbers and is also given a default value of 1 that is returned if a queried key is not contained in the Dict.

The complete program works like this: The text file wordlist.txt (a huge collection of words separated by newline characters) is read in and passed as a Text object to the method words which converts it to lower case and tries to match it against the given pattern. The return value is a list of Text objects that were matched (here: all of them). This list is now indexed as described above: Each word in the list is encoded by means of the message chain word chars reduce(1, code, x, code * asPrime(x))) in the method encode. What it does is basically reducing the collection of characters of the input word to one value: the product of prime numbers that correspond to the particular characters according to the Dict primes. Again, if you're familiar with Ruby etc. that method should not come as something new. reduce is also known as fold or inject, both of which would have been available as well. Now the method anagram tries to find all two-word anagrams of the word documenting by first encoding it and subsequently by trying to find two keys in the Multimap that multiply to exactly that code. Pretty simple and straighforwad, ey? ;)

The program finishes successfully in about a minute (give or take) on my machine. The original Java implementation is a lot faster (it runs in about 0.3 seconds), but is also a lot less concise and more verbose in general. As I've already mentioned, performance has not been a key factor in the design of Ioke so far.

Note that there are a few subtleties in the code above, e.g. the seemingly strange cell("[]=") method definition in Multimap. These are described in a lot more detail in the Programming Guide.

To summarize, I had a lot of fun daring my first steps with Ioke. There are definitely a few things you need to wrap your head around, but I really like the way the code feels and reads. Naturally there's a lot more to it than the above code snippet shows. For instance, I've not used aspects, blocks or any of the more sophisticated features like conditions and macros. The entire package seems pretty well thought out and looks like it has received a lot of brain power and hard work, beginning with the language and the Programming Guide itself, the REPL, the debugger, the ISpec framework and the DokGen tool. Altogether there's something to Ioke that may well be made for something bigger in the months and years to come. Finally, for anyone who's thinking about occupying himself/herself more seriously with Ioke: The small community that's been forming recently is a very friendly and helpful crowd. Ola's also very open to outside contributions and actually asks people to help out on the project. So if you want to play a part in the evolution of a cool new language on the JVM ... there's nothing to stop you ;)

Some final bits of information ... really:

The Ioke source code is hosted at GitHub. Besides the Java source it contains variuos examples like a parser, a spelling corrector, the game of life, etc. written in Ioke. Also check out the implementation of the language's built-in functionality which is full of all the sophisticated stuff.
Mailing lists and a bug tracker are available at Project Kenai.
There's a Google Group called ioke-language.
An IRC channel called #ioke was set up on irc.freenode.net.
Sam Aaron, one of the contributors, set up an Ioke reddit.
Martin Elwin, also one of the contributors, has a couple of nice blog posts about setting up a development environment for Ioke under Linux.
The Ioke distribution contains an Emacs mode and a TextMate bundle for syntax highlighting. Those who're primarily working on Windows may take a look at the commercial e Text Editor, which supports TextMate bundles or, as Felipe Rodrigues suggested, the SciTE editor. The latter comes with syntax highlighting for Lisp, which might be good enough. It's also possible to define your own settings if you know what you're doing.

Well ... go try it out!

Update: Sam Aaron had a bit of a play with my implementation of a two-word anagram finder and even wrote some accompanying specs. Whereas my solution is mainly procedural in style, his is entirely object-oriented and encapsulates the algorithm. Plus, you get to see ISpec in action. Very cool!

Tuesday, December 30, 2008

Does size matter?

After years of discussion, countless debates, musings, polls and seemingly gazillion proposals (see Alex Miller's awesome Java 7 feed / page), it seems there's hope after all. Recent events indicate that concrete plans for Java 7 are finally under way:

Joe Darcy recently "proposed an OpenJDK project to host the discussion of the proposals and potentially some prototype implementations" of a set of small language changes in Java 7.
Only one day later, the majority of members of the OpenJDK Compiler Group "ratified" the proposal.
That same week Mark Reinhold keynoted about Java 7 at the Devoxx conference, Antwerp, and even announced a preliminary release date: early 2010

And it's about time. It's been over 2 years since Java 6 came out and it will be just over 3 years if the release date proves to be accurate.

However, there's that word ... small. Sun is dedicated to keep any changes to the language small, at least for the coming release. As Joe Darcy put it:

"Given the rough timeline for JDK 7 and other on-going efforts to change the language, such as modules and annotations on types, only a limited number of small changes can be considered... ."

But, what is small, anyway? Joe Darcy offers his guidance on measuring the size of a language change, which is definitely an interesting read. Here, a language change is defined as small if it is about equally small under all three mandatory deliverables of a JSR, namely Specification, Reference Implementation and Compatibility Tests. (Additionally, a small language change may be distinguished by likely requiring little effort to learn the involved concepts). Joe also gives a few examples: in order to measure the size of a specification change, one has to assess several factors, e.g. the number of new concepts, their complexity as well as their interaction with existing features and elements of the language, which, in turn, may result in more changes to the specification. For example, autoboxing / unboxing is an example where the concept itself seems quite simple, but the actual complexity stems from interaction with generics and method resolution. Thus, autoboxing / unboxing has to be considered a medium-sized change. As Joe wrote before, changing the Java language is not such a simple deed after all. Interestingly, Alex Buckley tackled this issue once by arguing that the complexity of a language feature could be measured by exploiting the structure of the JLS in that one analyzes which parts would be affected by a given change. Subsequently, the complexity is calculated according to a formula that involves, among other things, cross references between chapters that need to be changed to fully describe the feature. Intersting, indeed.

A consequence of considering only small changes to the language is, of course, that some of the most hotly debated topics in the community, e.g. closures, first-class properties and reified generics, won't make it into Java 7. This was more or less confirmed in Mark Reinhold's keynote at Devoxx. In general, Joe Darcy suggests that changes to the type system are probably out of scope for Java 7 as they entail at least a medium-sized change. The community didn't take long to react ... mostly with disappointment as it seems.

Well, now we've got this out of the way, the next obviuos question is: which language changes are going to be in Java 7, that is, which ones are deemed small enough to be considered for addition? As we all know there's a plethora of things that people like to see added to the language and I don't dare try to list each and every one of them (Alex Miller's aforementioned page is a good start, though). Apparently one of Sun's primary goals for the coming release is to identify and adress the average Java programmer's day-to-day pain points, that is, common coding patterns that have been recognized as needlessly verbose. Some of the most recent attempts to measure the pulse of the community in this regard were undertaken by Stephen Colebourne (this year's results are available here and here) as well as by a friend of mine, David Linsin (here). Stephen's presentation outlines some of the proposed changes. Note, however, that most of those and other proposals (if not all of them) have been around for years in Sun's bug databases and have surely been discussed equally long. Some have also been prototyped and provided on kijaro, if you're interested.

Ok, so what do I think? As a matter of fact, I think, yes, size does matter and I'm more or less happy with the current course of events. I agree with Alex Buckley when he said that:

"The tough part isn't designing any single one of [those hundreds and hundreds of possible features in the database] ... but choosing which ones to design. We have to Do The Right Thing as well as Do The Thing Right."

Judging from all those polls the majority of developers desparately craved for having closures in the language (or reified generics, first-class properties, etc. for that matter, depending on who you talk to) even though we've not been able to reach a consensus on which form they should have taken eventually (or even at all). Some even predict Java's death now that the decision's been made. They fear, and mostly seem quite certain about it, that Java's demise is just a matter of time in the face of all the other languages that offer all those shiny constructs. While I don't agree with Ricky on that particular matter, I personally would have loved to have closures in the language for the benefits they bring: more concise and expressive code (compared to anonymous inner classes), the ability to create control abstractions for recurring patterns, etc. Not to mention that full closures would render the discussion about automatic resource management (see also in the presentation above) and probably a couple of other proposed language changes obsolete. Furthermore, they could simplify existing or proposed Java APIs like Doug Lea's Fork/Join framework, which is part of JSR 166y (the API/source code of a port to the BGGA prototype is available here).

However, inspite of the (overwhelming?) advantages of supporting full closures in Java, I'm primarily concerned that above polls are not a representative cross section of the world-wide Java "community". Mind you, I'm basing this solely on my personal experience so far. Most of the developers I know for example don't have the slightest idea what a closure is and what is is good for in the practical sense let alone the technical sense (I'm talking Java developers here). Most of them don't even know that there've been attempts to add closures to Java for years now. Mentioning it gives me mostly blank stares or irritating "why-don't-you-occupy-yourself-with-something-reasonable" laughs even from seasoned developers. Let's face it, a lot of developers concern themselves with the stuff they are paid for, but don't care about anything beyond. Taking into consideration that generics have been around for close to 5 years now and that most people (again, based on my experience) don't know how to use them correctly (*), I think closures would add another level of complexity to the language that most developers wouldn't know how to deal with correctly, especially when combined with generics etc. I imagine that such code would be hard to read, hard to write and very confusing to most of them. Even though I fooled around with the BGGA prototype quite a bit and know what closures are about in general, I'm not so presumptuous to exclude myself from that group completely. I really liked Josh Bloch's presentation at JavaPolis 2007 on that subject. As Brian Goetz said:

"The issue being debated is not whether closures are a good idea -- because they clearly are -- but whether the benefits of retrofitting the Java language with closures are worth the costs."

Personally, I think that it's not worth it for the reasons above. At least now. As I've already said, I'd be quite happy with some of the current proposals for small language changes because they'd indeed make my daily life a bit easier, e.g. improved type inference on the declaration of parameterized types, catching multiple exception types and List/Map literals. Some of them (e.g. null de-referencing), I hope, we won't ever see added to Java, but that's left for another day. At Devoxx I listened to Thomas Enebo's and Charles Nutter's talk about JRuby and I was sometimes simply amazed at Ruby's conciseness and expressiveness compared to Java and its capabilities in general (Scala's another candidate here). Since then I wished Java could do the same on more than a couple of occasions. Perhaps "features" like List/Map literals can bring a tiny bit of that feeling to Java.

To conclude, I'd find it interesting to apply Joe Darcy's aforementioned sizing criteria to the currently proposed changes. I don't feel qualified enough to do it myself, but I hope somebody will sometimes.

* That's a bit unfair though, I guess, because I doubt that there are a lot of people in the world that may claim to know everything about generics and how to use them without growing a foot-long nose.

Update: Just stumbled across Alex Miller's article on JavaWorld where he muses about expected additions and changes for Java 7.

Friday, December 26, 2008

The beginning ... with Devoxx 2008

So this is finally going to be my first post on this blog. I've really put that off far too long and I'm hoping to turn this into a regular habit. Although it's been a few weeks since then, this year's Devoxx (fomerly known as JavaPolis, and Javoxx for a short time) gave me the final push over into the land of bloggers. I noticed quite a few people that were busy writing blogs at one time or another during the conference and met and recognized some because I follow their blog on a more or less regular basis. So the decision was made. I'm of course not the only one nor the first one doing this. If anybody's interested, the Devoxx web page contains links to dozens of blog posts about the conference. An especially detailed account of the first four days can be found here.

To begin with, the Devoxx conference is definitely the most exciting conference I've yet been to. 3200 attendees coming from 35 different countries. You get lots of interesting talks about lots of interesting topics as well as dedicated and knowledgable people all around. I find it nothing short of amazing that Stephan Janssen, the founder of Devoxx, manages to organize an event that costs about nothing compared to other conferences and yet has such an elusive list of speakers year after year. You could definitely call it Europe's JavaOne.

As I'm very interested in all kinds of programming languages and everything having to do with programming languages, I was glad to see that there were quite a lot of talks on that matter starting with a couple of 3-hour introductory university talks (all of them on the first day!) on Scala, JRuby, Groovy and Jython. During the main conference there were also a couple of more advanced talks on these languages. I especially liked Bill Venner's talk on The Feel of Scala where he dove a bit deeper into Scala's feature set and complexity. He mostly used his open source project ScalaTest as a running example to explain and show some of the functional benefits of Scala like currying and how he used them to "emulate" Ruby's RSpec in certain aspects. Other parts of his talk were dedicated to aspects of Scala's sophisticated type system, e.g. structural types, views, etc. Quick side note: As you may probably know, the first book about Scala, Programming in Scala (by Martin Odersky, Lex Spoon and Bill Venners), was released just a few weeks ago and apparently was the best-selling book at the Devoxx book store.

Well, Devoxx is still mainly a Java conference and thus one of my personal highlights was that we were finally provided with some first-hand news about Java 7 and, hold your breath, a release date, early 2010, in Mark Reinhold's keynote (note, however, that he described his presentation as being not binding). After Joe Darcy already blogged about a JSR for small language changes the day Devoxx started, Mark also talked about Sun's plans for Java 7, including some more or less likely features and language changes in Java 7 as well as the controversial topic of Java modularity in the form of Project Jigsaw. He even suggested that Jigsaw could provide the basis for breaking backwards compatibility with older JDKs and could help to get rid of the classpath (hell) as we know it. I'm not exactly sure how introducing modularization into the JDK/JVM would help to achieve those ends, so I'm gonna have do some research on that. Another interesting take-away from the keynote was the clear statement about which language features won't make it into Java 7: Closures, first-class properties, reified generics, etc. to name just a few. I won't go into any detail here because other people have already written about that at great length, e.g. Hamlet D'Arcy compiled a complete list of the stuff Mark talked about.

Even though I don't feel qualified enough to really comment about Project Jigsaw, I was left with the impression during a BOF session with Mark and Alex Buckley that Sun itself is not yet very clear about Jigsaw and modularizing the JDK/JVM. There were more than a few questions that were answered with a neutral "Yeah, we still have to figure that one out", e.g. when it comes to the interaction of modules and the protected keyword. Apart from that, the BOF turned out to be quite interesting (even though 90% of the time was spent talking about Jigsaw). For example, as you would expect, the question about reified generics was also raised and Alex argued unsurprisingly against introducing them in the JVM. Basically, the question he asked was: What do they really bring you apart from being able to reflect over the type at runtime? You'd still be not able to call new T() in Java as that's inherently not possible without introducing structural types. Furthermore, you would considerably complicate the use of libraries because you'd have old code without generics, new code with erased generics and on top of that new code with reified generics. Sounds like an awful mess! According to Alex, this whole issue of reified generics is apperently just to complex to shove into the JVM.

For me, the most intersting talk of the conference was Towards a Dynamic VM by Alex Buckley and Brian Goetz. They spoke about JSR 292 (Support of Dynamically Typed Languages on the JVM) at great length and did a terrific job at presenting a reasonably complicated and deeply technical topic in a quite accessible way. It included facts about method handles, interface injection and, of course, the invokedynamic bytecode. For anyone who's interested, the early draft of the spec goes down pretty well and is very informative. Charles Nutter also has a great article about JSR 292. Funny thing: Mark Reinhold mentioned during his keynote that there won't be an invokedynamic instruction, but Alex and Brian said that there would only an hour later. It turned out that Mark based this section of his talk on slightly outdated information. He said that John Rose, the spec lead of JSR 292, "reintroduced" the bytecode instruction only a couple of weeks ago.

I'll spare you the probably boring recollections of each and every talk I've attended, so here's a list of just a couple of more highlights:

Josh Bloch keynoted about the second edition of Effective Java. Unfortunately, it was almost exactly the same talk he gave at JavaOne earlier this year. I've not been there, but I've watched his presentation on the Sun Developer Network. Although I've not heard anything new, I find Josh's style of presentation highly entertaining.
Brian Goetz talked about library-based parallelism in Java 7, e.g. the fork/join framework. As I've read his book on Java concurrency from front to back (highly recommended!), it was cool to see him talk in person. He really knows his stuff and provided an interesting overview of possible additions to the java.util.concurrent library for Java 7.
Mahmood Ali gave a talk about pluggable type-checking with JSR 308 (Annotations on Java Types). Mahmood, member of the expert group, did a very good job in showing the use of so-called type checkers. By allowing Java annotations to appear in more places in the source code, you're able to write more informative types (i.e. "extending" the type system) that can then be detected by type checkers. I'd have wished for a bit more in-depth presentation and Mahmood also finished about 30 minutes short, but it was a good talk nevertheless.
As always, the live recording of the JavaPosse was just pure fun; not just because Atlassian gave out free beer again (cheers!).

Again, the general discussions that were held on the obligatory whiteboards on the first floor attracted a lot of attention this year. Photographs of all of them can be viewed here; compiled results are available on Stephen Colebourne's blog. Two of the whiteboards invited all attendees to vote for or against particular language features for Java 7. Stephen Colebourne meticulously compiled these results as well (great work, Stephen!). Josh Bloch, Alex Buckley and Brian Goetz were often seen hanging around there having discussions with people about one thing or another and/or answering questions.

Well, nothing ain't perfect. One of Devoxx's biggest downsides was that internet access was very slow and unreliable. Apparently the conference's provider made some mistakes and they ended up with only a couple of 1 MBit lines. Obviously not good enough for 3200 people almost constantly browsing with their notebooks and iPhones and downloading stuff and tools they've seen during the presentations. Well, it got a bit better towards the end of the week, but it was nevertheless still far from being usable. A co-worker of mine tried to download the latest version of Scala on Monday, but failed until Thursday night. Well, it'd be cool to check out some of the stuff you've seen right away, don't you think?

As a minor point I would have wished for a bit more variety in meals. They gave out sandwiches on Monday and Wednesday and a bowl of salad on Tuesday and Thursday. No vegetarian food. But lucky enough, the Metropolis complex in Antwerp houses a couple of fast food restaurants.

Above downsides notwithstanding, my overall experience at the Devoxx conference was very satisfying. I was able to learn a few new things and was given the chance to catch up on some stuff that I promised myself to study for some time now. There was a lot more to Devoxx than just what I've written about, but I've restricted myself to a few personal highlights. As you probably may have noticed, I didn't mention JavaFX at all, which was released just a couple of days before Devoxx started and thus had its launch party at Devoxx. Well, I've just not yet caught up on the "hype"...