November 19th, 2011

By Stoyan Rachev

After the Devoxx introductory Scala session, about which I wrote in a previous post, I went to Martin Odersky’s talk What’s in Store for Scala. “New developments for Scala: Better tools, full reflection, runtime compilation, and dynamic types.” Based on the abstract and the fact that the speaker is the language creator himself, I was prepared for this session to go into deeper details and provide stronger insights.

Here is a link to the slides from this session.

Overview and Features

Scala is used today in a large number of enterprises, and on a large number of cloud platforms. The main adoption vectors are Web, Trading, and Financial sectors. The main value proposition of Scala is being very fast to first products and scalability afterwards. The language is getting increasingly popular – Martin mentioned that last time he did a talk at Devoxx his talk was the only Scala one, but this year there are so many Scala talks that there is a conflict.

The latest release, Scala 2.9, introduced many important new features, such as parallel and concurrent computing facilities yet to appear in Java, a faster REPL, progress on IDEs, and better docs. The Play Framework 2.0 is an open web application framework inspired by Ruby on Rails. The Scala Eclipse IDE 2.0 is being completely reworked, with the major goals being to make it reliable and responsive. Martin’s ambition is to eventually make it better than the Java IDE, building on Scala’s excellent type system.

The next release, Scala 2.10, which is scheduled for early 2012, is going to introduce a few major new features:

  • New reflection framework, much more powerful than native Java reflection
  • Reification – types are preserved at runtime, but only on programmer demand via “manifests”; this is less radical than in .NET, where full reification is very expensive and complex
  • Type Dynamic – will help in the interfacing with dynamic languages, lets the programmer selectively suppress errors, similar to “dynamic” in .NET, but simpler
  • More IDE improvements, among them a debugger
  • Faster builds
  • SIPs – string interpolation, simpler implicits
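To make two of these items more concrete, here is a minimal sketch of type Dynamic and string interpolation. It is not from the talk, which predates the release; it assumes the forms that, as far as I know, eventually shipped in Scala 2.10 (the `scala.Dynamic` marker trait with `selectDynamic`, and the `s` interpolator). The `Record` class and its fields are hypothetical names chosen for illustration.

```scala
import scala.language.dynamics

object DynamicSketch {
  // Hypothetical record type: field names are only checked at runtime.
  class Record(fields: Map[String, Any]) extends Dynamic {
    // The compiler rewrites `r.foo` to `r.selectDynamic("foo")`
    // when `foo` is not a statically known member.
    def selectDynamic(name: String): Any = fields(name)
  }

  val r = new Record(Map("name" -> "Ada", "age" -> 36))

  // `name` and `age` are not declared on Record; both go through selectDynamic.
  val name: Any = r.name

  // String interpolation: expressions inside ${...} are spliced into the string.
  val greeting: String = s"${r.name} is ${r.age}"
}
```

A misspelled field such as `r.nmae` would still compile and only fail at runtime, which is exactly the selective suppression of errors mentioned above.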

Scalability and Parallelism

Scala comes from “scalable”, and its name brings the promise of scalability. This is why in the remainder of the talk Martin focused on two important aspects of “Scala in the Large”, large in terms of both many cores and large systems.

Regarding scaling to many cores, Martin repeated again what other speakers at this conference said: Moore’s Law is today only achieved by increasing the number of cores. It takes 24 000 running threads to keep a modern GPU fully loaded. The need for easier and safer parallel computing is sometimes called “PPP (popular parallel programming) grand challenge”.

Here, Martin differentiated between “Parallel Programming” and “Concurrent Programming”. The first is about executing programs faster on parallel hardware, while the second is about managing concurrent execution threads explicitly. But both are too hard to do using the familiar Java model with threads and locks.

The root of the problem is in the non-determinism caused by concurrent threads accessing shared mutable state, as demonstrated by the following code snippet:

var x = 0
async { x = x + 1 }
async { x = x * 2 }
// can give 0, 1, or 2
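To see where the different results come from, the snippet can be modelled deterministically. The sketch below is my own illustration, not code from the talk: it assumes each async block is a non-atomic read of x followed by a write, and enumerates every interleaving of those steps. All names (`Step`, `Read`, `Write`, `run`) are hypothetical.

```scala
object Interleavings {
  // One step of a schedule: thread `t` either reads x into a local
  // or writes its computed result back to x.
  sealed trait Step
  case class Read(t: Int) extends Step
  case class Write(t: Int) extends Step

  // Thread 1 models `x = x + 1`, thread 2 models `x = x * 2`.
  val update: Map[Int, Int => Int] =
    Map(1 -> ((v: Int) => v + 1), 2 -> ((v: Int) => v * 2))

  // Every ordering of the four steps in which each thread reads before it writes.
  val schedules: List[List[Step]] =
    List[Step](Read(1), Write(1), Read(2), Write(2)).permutations
      .filter(s => s.indexOf(Read(1)) < s.indexOf(Write(1)) &&
                   s.indexOf(Read(2)) < s.indexOf(Write(2)))
      .toList

  // Replays one schedule starting from x = 0 and returns the final value of x.
  def run(schedule: List[Step]): Int =
    schedule.foldLeft((0, Map.empty[Int, Int])) {
      case ((x, locals), Read(t))  => (x, locals + (t -> x))
      case ((x, locals), Write(t)) => (update(t)(locals(t)), locals)
    }._1

  // The set of final values reachable over all schedules.
  val outcomes: Set[Int] = schedules.map(run).toSet
}
```

Under that assumption there are six schedules, and `Interleavings.outcomes` contains 0, 1 and 2: the result depends entirely on how the two threads happen to be scheduled.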

According to Martin, the only way to really solve the problem is to get rid of the mutable state. But this means programming functionally, and this is why Functional Programming (FP) has become mainstream and popular only very recently. FP here has the advantage of offering a completely different mode of thinking: if you program imperatively, you think in terms of time, while if you program functionally, you think in terms of space. If you look at space, solving a problem in a parallel way is easy; it’s like building a cathedral with 1000 workers, each of them working on their own piece. But if you look at time, it’s much more complicated, because you need to think about protecting your resources from concurrent access. However, we as humans are much better at thinking spatially than temporally, which is why from a thought perspective the functional way is much simpler than sticking to an imperative mindset.

In this space, Scala has many appealing features – it is not only agile, both object-oriented and functional, and both safe and performant, but also offers excellent parallel processing facilities. In Scala’s toolbox for parallelism there are parallel collections and parallel DSLs as core language features, and for concurrency there are actors, STM, and futures, all available in the Akka framework.

Since Akka was already presented in a previous talk, Martin focused on the parallelism features that are provided by the core language. First, he outlined parallel collections. He showed a simple Scala class Person with a name and an age, and then used it to partition an array of people into minors and adults sequentially. He compared the Java and the Scala code, outlining again the much more concise syntax of Scala, closer to the problem domain – a simple pattern match, an infix method call, and a function value.

val people: Array[Person]
val (minors, adults) = people partition (_.age < 18)
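For readers who want to try the sequential version, here is a self-contained sketch. The `Person` case class follows the description above (a name and an age); the sample data is hypothetical and not from the slides.

```scala
object PartitionExample {
  // A person with a name and an age, as described in the talk.
  case class Person(name: String, age: Int)

  // Hypothetical sample data for illustration.
  val people: Array[Person] = Array(
    Person("Ann", 12), Person("Bob", 35), Person("Cid", 17), Person("Dee", 18)
  )

  // `partition` returns a pair (elements satisfying the predicate, the rest);
  // the pattern on the left destructures that pair into two values.
  val (minors, adults) = people partition (_.age < 18)
}
```

Here `minors` holds Ann and Cid, and `adults` holds Bob and Dee, in their original order.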

He then came to the parallel case, which in Java would be rather hard to implement, while in Scala you can just add .par on people and the job is done.

val (minors, adults) = people.par partition (_.age < 18)

The above code converts the array into a “parallel array”, and then all operations that make sense to be executed in parallel suddenly are executed in parallel. There is a hidden contract, however – you give the system permission to execute operations in any order it likes. This means there should not be any side effects. But fully avoiding side effects means that you need to be functional, otherwise it would just not work. Martin said that he is very curious what will happen in Java after the announced introduction of similar features in Java 8 – according to the speaker, if Java doesn’t morph into a functional language, it’s way too dangerous to use that.

The parallel collections in Scala use the Java 7 Fork/Join framework to split the work by the number of processors, and introduce an adaptive mechanism in which the number of slices is changed dynamically depending on the workload, to get optimal performance out of the box. The bottom line: they are easy to use, concise, safe, fast, and scalable.
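To give a feeling for what happens underneath, here is a rough sketch of divide-and-conquer counting written directly against `java.util.concurrent.ForkJoinPool` and `RecursiveTask`. It is not how the Scala library is actually implemented: the `CountTask` class and the fixed `threshold` are my own assumptions, and the adaptive slicing described above is not modelled.

```scala
import java.util.concurrent.{ForkJoinPool, RecursiveTask}

object ForkJoinSketch {
  // Counts the elements of xs in [from, until) that satisfy `p`.
  // Ranges larger than `threshold` (a fixed, hypothetical cutoff) are split in half.
  class CountTask(xs: Array[Int], from: Int, until: Int,
                  p: Int => Boolean, threshold: Int) extends RecursiveTask[Int] {
    override def compute(): Int =
      if (until - from <= threshold) {
        // Small slice: count sequentially.
        var n = 0
        var i = from
        while (i < until) { if (p(xs(i))) n += 1; i += 1 }
        n
      } else {
        val mid = from + (until - from) / 2
        val left = new CountTask(xs, from, mid, p, threshold)
        val right = new CountTask(xs, mid, until, p, threshold)
        left.fork()                   // schedule the left half on the pool
        right.compute() + left.join() // do the right half here, then combine
      }
  }

  // Counts ages below 18 using a fresh pool sized to the available processors.
  def countMinors(ages: Array[Int]): Int = {
    val pool = new ForkJoinPool()
    try pool.invoke(new CountTask(ages, 0, ages.length, _ < 18, threshold = 1024))
    finally pool.shutdown()
  }
}
```

The predicate is pure, so the result does not depend on the order in which the slices are processed, which is the contract that parallel collections rely on.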

But still, how can we reach the tens of thousands of threads mentioned above, which an application needs to fully load modern parallel hardware? Parallel collections and actors are not sufficient. Here Martin made a “bet for the future” that one possible solution could be parallel embedded DSLs. The idea is to capture the parallelism from the problem domain, which the compiler is not able to extract automatically, by using a Scala-based DSL.

Here Martin showed an example from EPFL / Stanford research. Liszt is a DSL for physics simulation, which is used for example to simulate airflow in hypersonic jets. It is mesh-based, with very irregular meshes. Liszt is implemented by a functional program that constructs a representation of itself as an AST (abstract syntax tree), which is then converted to code that is optimized for the size of the problem and the hardware. According to the speaker, the performance of this implementation beats that of hand-written C++ code; problems tend to be so complex that humans can’t do an optimal job anymore.

Reflection

In the remainder of the talk, Martin focused on another new feature in Scala 2.10 – its reflection facilities. The goal here is to achieve the analogue of Java’s reflection, but with Scala’s full types. Programmers should be able to ask an instance for its Scala runtime class, then ask the class for its supertype, obtain its members, etc.

Reflection in Java is limited – for example, it doesn’t tell you if type A conforms to type B. Why wasn’t this done in Java? It turns out this would only be possible if the reflection implementation duplicated essential parts of the compiler, which is too hard and may lead to serious consistency issues.
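The limitation is easy to observe from Scala itself. The following sketch is my own illustration rather than an example from the talk: it uses only `getClass` and `java.lang.Class.isAssignableFrom`, and shows that class-level reflection only sees erased classes, so it cannot distinguish a list of integers from a list of strings.

```scala
object ErasureSketch {
  val ints: List[Int] = List(1, 2, 3)
  val strings: List[String] = List("a", "b")

  // Both lists share one runtime class; the element type has been erased.
  val sameRuntimeClass: Boolean = ints.getClass == strings.getClass

  // Class-level conformance checks work, but only between erased classes:
  // this says "a list is a Seq", not "List[Int] conforms to Seq[Int]".
  val listIsSeq: Boolean = classOf[Seq[_]].isAssignableFrom(ints.getClass)
}
```

Both values are `true`, and nothing in the `Class` objects lets us ask whether `List[Int]` conforms to `Seq[Any]` or to `Seq[String]`. Answering that needs the full type information the compiler works with.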

How to do better? Martin looked at dependency injection and introduced the Cake Pattern for solving cyclic dependencies in dependency injection. It turns out this approach can be used to achieve better reflection, or as Martin put it, to “bake a cake that’s both compiler and reflection”. There are three cakes: reflect.internal.Universe, extended by nsc.Global (scalac) and reflect.runtime.Mirror. To avoid exposing too much detail, the API facade scala.reflect.api is introduced on top of reflect.internal.Universe.
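For those unfamiliar with the pattern, here is a minimal sketch of a “cake” with two mutually dependent slices. It only illustrates the mechanism (self-types plus mixin composition); the names `Symbols`, `Types`, `Sym`, `Tpe` and `MiniUniverse` are hypothetical and do not reflect the actual layout of the compiler or of the reflection library.

```scala
object CakeSketch {
  // Slice 1: symbols. The self-type says "whoever mixes me in must also be Types".
  trait Symbols { this: Types =>
    case class Sym(name: String) {
      def tpe: Tpe = Tpe(this) // uses a member of the other slice
    }
  }

  // Slice 2: types. It depends on Symbols in turn, so the dependency is cyclic.
  trait Types { this: Symbols =>
    case class Tpe(sym: Sym) {
      override def toString: String = "Tpe(" + sym.name + ")"
    }
  }

  // The cake: all slices are assembled in a single object.
  object MiniUniverse extends Symbols with Types
}
```

With this in place, `CakeSketch.MiniUniverse.Sym("x").tpe` yields `Tpe(x)`. Neither trait can be instantiated on its own, and the compiler checks at the point of assembly that every required slice is present.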

In conclusion, Scala is a very regular language when it comes to composition – everything can be nested, everything can be abstract (methods, values, types), and last but not least, the type of this can be declared freely. All this allows us to solve previously unsolvable problems, such as achieving better reflection.

Conclusion

Although I was not able to fully grasp some of the more intricate technical details of this talk due to my limited knowledge of Scala, after it I felt even more motivated to further explore this language. Yes, it is more complex than other JVM-based languages (not to mention Java itself), but I think it’s quite possible that this complexity reflects the essential complexity of the problem domain, rather than being accidental. Based also on other Devoxx talks, I believe that the much praised Java simplicity is indeed a myth, since it offers programmers seemingly simple, but often inadequate language facilities in terms of conciseness and expressive power. In this respect, Scala has taken the opposite direction, and has so far positioned itself as the most promising “potential Java killer” (as other speakers called it) in the new JVM languages space.

