February 20th, 2015

by Ivan St. Ivanov


In the Bulgarian JUG we had an event dedicated to trying out the OpenJDK Valhalla project’s achievements in the area of using primitives as parameters of generic types. Our colleague and blogger Mihail Stoynov already wrote about our workshop.

In the first part of this three-part series of posts you could read about the reasoning behind not supporting generic classes with primitive parameters. Before I continue with the current proposal for the implementation, I would like to again make a very important disclaimer. I am not an expert in this matter. I just try to follow the project Valhalla mailing list and read Brian Goetz’s State of Specialization document. So look at this series more as explaining the generics proposals in layman’s terms.

Project Valhalla

Whenever the OpenJDK developers want to experiment with a concept, they first create a dedicated OpenJDK project for it. Such a project usually has its own source repository, which is a fork of the OpenJDK sources. It has its own page and mailing list, and its main purpose is to experiment with ideas for implementing the new concept before creating the JDK Enhancement Proposals (JEPs) and Java Specification Requests (JSRs) and committing source code to the real repositories. Features like lambdas, script engine support, method handles and invokedynamic took this path before entering an official Java release.

One such project is Project Valhalla. Its goal is to research things like value types, enhanced volatiles and primitives (and value types) as parameters of generic types. The last feature of this impressive list is the topic of this blog series. Being part of a research project means that it may or may not make it as such into one of the future releases of Java. As Java 9 will be out quite soon (hopefully in a year or so), it is almost certain that we will not get the opportunity to generify over primitives any time soon. Still, it is a good idea to closely follow the development of this and other features, which is why I wrote this blog post.

Coming back to the primitives in generics topic: after considering many arguments, the project developers, led by Java language architect Brian Goetz, decided to make three substantial compromises and came up with a proposal.

Compromise #1: the language syntax

The first syntactic construct that comes to mind when talking about, let’s say, a list of primitive integers is List<int>. However, in the previous installment we saw why this is not possible. Or better said, it wouldn’t be possible without making big changes in the platform and possibly breaking backward compatibility. That is, the existing rules are rigorous about the fact that a generic type parameter can always be converted to Object, can be assigned null, etc. As big incompatible changes are not permitted, we come to compromise number one: if a class wants to allow enhanced generic support, i.e. support for primitives as generic parameter types, it has to state that explicitly in its definition, rather than relying on the rules for generics being changed. This means that a new special syntax will be introduced at the language level to distinguish these enhanced generics from the existing ones. Here is the current proposal:

public class Box<any T> {

    private T value;

    public Box(T value) {
        this.value = value;
    }

    T value() {
        return value;
    }
}

You can probably notice the any modifier on the type variable. As per the current proposal it will be used to denote that the Box class can be parameterized with both reference and primitive types. Now you can do things like this with the Box:

Box<int> intBox = new Box<>(42);
System.out.println(intBox.value());


Compromise #2: the runtime representation

Another concern that has to be taken into account is the runtime representation of an enhanced generic type. Before we go into further details, let me explain the term specialization.

The process of creating different implementations of a certain type based on its generic parameters is called specialization. Let’s take C++: there you have templates, and the compiler generates a different class for each distinct combination of template arguments. This is called heterogeneous translation. In Java the situation is the opposite: one and the same runtime class is created for any type parameter. This is called homogeneous translation. With heterogeneous translation you are flexible in terms of combining parameter types: you can do things like <String+Integer> for example. But you are not allowed to do things like <? extends Number>, which as we know is perfectly fine in Java.
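To see what homogeneous translation means in practice, here is a tiny snippet you can run on any current JDK (the class name is just for illustration): both parameterizations of ArrayList are backed by the very same runtime class.

import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> integers = new ArrayList<>();

        // Erasure means there is only one ArrayList class at runtime,
        // no matter what the type parameter was in the source code.
        System.out.println(strings.getClass() == integers.getClass()); // prints "true"
    }
}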

So, coming back to the proposed implementation. The homogeneous translation of generic types was possible in Java because of erasure. However, primitive types cannot be erased. And this brings compromise number two: there will be a hybrid homogeneous-heterogeneous translation. This means that reference type parameters will continue to be erased and translated as they used to be, while primitive type parameters will be specialized: there will be a separate runtime class for every primitive generic instantiation.

To illustrate this, let’s go back to the code from above:

Box<int> intBox = new Box<>(42);
System.out.println(intBox.value());


Let’s take a look at the bytecode that is emitted (using javap -c), or at least at the relevant parts of it:

0: new           #3                  // class "Box${0=I}"
......
6: invokespecial #4                  // Method "Box${0=I}"."<init>":(I)V
....
14: invokevirtual #6                  // Method "Box${0=I}".value:()I


You can easily notice that whenever the Box is parameterized with a primitive int, the class that is generated at runtime is not called just Box (as it would be in the case of an erased Box), but Box${0=I}, where I is the JVM descriptor for int. So the class name is augmented with specialization info to help the virtual machine generate the right runtime class.

Compromise #3: subtyping

With generics as they exist today the following subtyping rules are valid: Box<Integer> extends Box<?>, which in turn extends the raw type Box. This does not apply to Box<int>, though. The reason again hides in the fact that there is no common supertype of reference and primitive types. If we were to allow Box<int> to extend the raw Box, then the former should have its value field of type Object. At the same time this field should be of type int, because that is how it was declared. As int and Object don’t share a common supertype, this is not possible.

So the only subtyping relationship with primitive generics would be of the kind ArrayList<int> extends List<int>. And most unfortunately, List<int> cannot extend List<Integer>, because of the transitive inheritance leading to the raw type.

Restrictions and special features

Let’s take again our Box<any T> class. Because T can be either a primitive or a reference type, there are some restrictions on the things that you can do with it:

  • You cannot assign or even compare the value field of the Box class with null
  • It cannot be converted to Object or Object[]
  • You cannot synchronize or lock a block of code with it
  • It is not possible to convert Box<any T> to Box<?> or Box

At the same time, there are some features that are only available to the enhanced generics (both the restrictions above and these new abilities are sketched in the snippet after this list):

  • You can do things like new T[<size>] (it is not possible to do that with an erased T). This will instantiate an Object[] when T is a reference type and the correct primitive array when T is a primitive type.
  • You can do comparisons with the instanceof operator
  • You can call Box<any T>.class
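Here is a rough sketch of both lists in the prototype’s syntax. None of it compiles on a stock JDK, and the comments are my reading of the proposal rather than output of an actual Valhalla build:

class Demo<any T> {

    private T value;

    void restricted(Box<any T> box) {
        // value = null;            // rejected: T may be a primitive, and primitives have no null
        // if (value == null) {}    // rejected for the same reason
        // Object o = value;        // rejected: no conversion to Object (or Object[]) for a primitive T
        // synchronized (value) {}  // rejected: cannot lock on a possibly-primitive value
        // Box<?> w = box;          // rejected: Box<any T> converts to neither Box<?> nor the raw Box
    }

    void allowed(int size) {
        T[] values = new T[size];   // new: yields Object[] for a reference T, int[] for T = int, and so on
        // instanceof checks and Box<any T>.class literals are also meant to work in such code,
        // though I am not sure about their exact spelling in the prototype
    }
}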

Generic methods

So far we’ve only discussed the implementation proposal for enhanced generic types. But what about enhanced generic methods? They are supported, so you can do things like:

<any T> void printValue(Box<T> box)

While the language syntax mirrors that of the enhanced generic types, the internal representation is different. With enhanced generic types it is possible to have specialization – a different runtime type for each generic parameter type. But this is not possible with methods (i.e. a separate runtime method for each different call), because that way the interface of the class would change – it would gain more methods than were declared. This is not easy to achieve, as most VM implementations are built on the assumption that the number of methods of a given class is fixed.

That is why the enhanced generic methods take the same approach as lambda expressions: invokedynamic. There will be a special bootstrap class (GenericMethodSpecializer), which will receive as arguments all the information needed to decide which specialized method to call.
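A small sketch of how such a method would be used (again in the prototype’s syntax, so none of this is valid on a stock JDK):

static <any T> void printValue(Box<T> box) {
    System.out.println(box.value());
}

printValue(new Box<>("hello"));   // reference type: the erased version of the method is used
printValue(new Box<int>(42));     // primitive type: invokedynamic routes the call to a specialization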

Conclusion

In this second installment of the Primitives in Generics series we went quickly through the proposal coming from project Valhalla on how this feature may be implemented in Java. We saw what the proposed syntax is and how it will be represented in the virtual machine. Then we discussed some of the restrictions introduced in subtyping and in the operations allowed with generic parameters. We also touched on the topic of generic methods and how they differ in terms of internal representation.

In the final part of the series we’ll walk the migration path of existing JDK APIs and namely the most important of them all: the collections library.

February 5th, 2015

by Ivan St. Ivanov


Last week in the Bulgarian JUG we had an event dedicated to trying out the OpenJDK Valhalla project’s achievements in the area of using primitives as parameters of generic types. Our colleague and dedicated blogger Mihail Stoynov already wrote about our workshop. In his post you can find very useful links to the slides that I showed, the VM that he prepared (which you can use to try Valhalla yourself) and even the recording of the meeting (which, unfortunately for some of you, is in Bulgarian).

In this three-part series of posts I would like to go into some more detail about the current implementation proposal and the reasoning behind the decisions. Now, I am not an expert in this matter. I just try to follow the project Valhalla mailing list and read Brian Goetz’s State of Specialization document. So look at this series more as explaining the generics proposals in layman’s terms.

Introduction

Java generics is one of the most widely discussed topics in the language. While the debate whether generics should be reified, i.e. whether the generic parameter information should stop being erased by javac, has arguably been the hottest topic for years now, the lack of support for primitives as parameter types is something that at least causes some confusion. It leads to unnecessary boxing when, for example, you want to put an int into a List (read on to find out about the performance penalty). It also leads to adding “companion” classes to most of the generic APIs, like IntStream and LongStream for example.
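To make that concrete, here is a plain little snippet (the class name is just for illustration): the List forces every int through an Integer, which is exactly why the JDK grew primitive “companion” types like IntStream.

import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

public class BoxingDemo {
    public static void main(String[] args) {
        List<Integer> boxed = new ArrayList<>();
        boxed.add(42);                                  // autoboxing: Integer.valueOf(42) behind the scenes

        int sum = IntStream.rangeClosed(1, 10).sum();   // the "companion" class works on plain ints
        System.out.println(sum);                        // prints 55
    }
}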

One of the goals of OpenJDK’s project Valhalla is to research the possibility of generifying over primitives in the language, the virtual machine and the standard libraries. Yes, it’s just research, which means that the current proposal may or may not appear in a future version of Java. What I am pretty sure of is that it won’t make it into Java 9, which is about to be shipped next year.

The State of Generics

As already mentioned, it is not possible at the moment to define a generic type or method parameterized with a primitive type. So if you want to create, let’s say, a List of integers, you have to use the wrapper class:

List<Integer> intList = new ArrayList<>();

Besides looking kind of artificial, this also brings a huge performance penalty. It comes from the way reference types are laid out in memory. If we take the array list above, internally it is represented as an array of Integers. This means that we get an array of references to Integer objects scattered around the heap. Looking at a single Integer object, we need memory not only for the int value itself but for everything else needed by the virtual machine: a pointer to the class object, some space for the garbage collector flags, some for the object monitor used by the Java synchronization infrastructure, etc. So instead of beautifully ordered ints, we potentially get something like this:

[Figure: array layout – the list’s backing array holds references to Integer objects scattered around the heap]

The memory overhead is not the only issue here. Modern processor architectures rely on several layers of cache, which makes going to main memory to fetch the value of a variable a very expensive operation in terms of CPU cycles. That is why most of the optimizations done by the JVM and the hardware try to keep the data being worked on as close to the CPU as possible. But the problem is that when talking about arrays (remember, ArrayList is backed by an array), the CPU can only prefetch contiguous memory addresses into its caches. Thus if our Integers are scattered around the heap, most probably they will not be prefetched and we’ll have to pay the performance penalty of the cache misses.
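Here is a naive way to get a feel for that difference on your own machine. This is only a sketch – a serious measurement would use a harness like JMH, with proper warm-up – but it contrasts summing a contiguous int[] with summing the same values behind Integer references:

import java.util.ArrayList;
import java.util.List;

public class LayoutCost {
    private static final int N = 10_000_000;

    public static void main(String[] args) {
        int[] primitives = new int[N];
        List<Integer> boxed = new ArrayList<>(N);
        for (int i = 0; i < N; i++) {
            primitives[i] = i;
            boxed.add(i);                        // every element becomes a separate Integer object
        }

        long start = System.nanoTime();
        long sum1 = 0;
        for (int i = 0; i < N; i++) {
            sum1 += primitives[i];               // contiguous memory, cache and prefetcher friendly
        }
        long middle = System.nanoTime();
        long sum2 = 0;
        for (int i = 0; i < N; i++) {
            sum2 += boxed.get(i);                // a pointer dereference per element, scattered over the heap
        }
        long end = System.nanoTime();

        System.out.println(sum1 + " in " + (middle - start) / 1_000_000 + " ms");
        System.out.println(sum2 + " in " + (end - middle) / 1_000_000 + " ms");
    }
}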

Why not generics over primitives?

A natural question here would be: why doesn’t Java support generifying over primitive types? The short answer is: because of generic type erasure by javac. A slightly longer answer follows in the next few paragraphs.

Let’s suppose that we have the following class definition:

public class Box<T> {
    private T value;
  
    public Box(T value) {
        this.value = value;
    }

    public T get() {
        return value;
    }
}


The generic type T is only used by javac to ensure that values of the correct type are put into the Box and then retrieved. If you decompile the product of the above class’s compilation, you will notice that the type of the value field, the constructor parameter and the return type of the get() method are all java.lang.Object. The compiler simply “erases” the type information that you wrote above and replaces it with the type that is the parent of all reference types. In Java there is no such thing as a common type of all types, both reference and primitive – something like Any in Scala, for example. That is why you cannot apply erasure to all types: with the current implementation javac wouldn’t know what primitives should be erased to.
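For illustration, this is roughly the source-level equivalent of what the compiler produces – reconstructed by hand here, not copied from an actual decompiler:

public class Box {
    private Object value;

    public Box(Object value) {
        this.value = value;
    }

    public Object get() {
        return value;
    }
}

// ...while at the use sites the compiler quietly inserts the casts:
Box box = new Box("hi");           // was: Box<String> box = new Box<>("hi");
String s = (String) box.get();     // was: String s = box.get();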

Why erasure at all?

Astute readers will ask the question: “But why do we need this erasure anyway?” Before trying to answer it, let me first elaborate a bit on the compatibility topic.

Suppose that we have a type A. Then we have a class C that uses or extends A. And then A is changed in some manner. We say that this change is source incompatible if class C does not compile any more after the change (this is a rather simplistic explanation – there are a couple of other subtle causes of source incompatibility, but let’s keep it simple). Some of you might remember when the enum keyword was added to the language in Java 5 – all the code that used it as an identifier did not compile anymore. The same will be the fate of all of you that use sun.misc.Unsafe, BTW ;).
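To make the enum example concrete, this is the kind of code that stopped compiling on the day Java 5 was released:

int enum = 3;   // a legal identifier before Java 5, a compile error ever since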

Next, suppose that we somehow change A and that the two classes live in different jars. Class C may still compile after A’s change; however, if you drop the hypothetical A.jar into the class path of your program, class C may refuse to link. This is considered binary incompatibility. You may refer to the Java Language Specification for more information on that matter.

Going back to the generics story. If it had been decided upon their introduction in Java 5 that the generic type information was not going to be erased, then most likely that would have broken at least the binary compatibility of existing classes. As generics were applied to the most widely used part of the API – the collections library – it would have meant that virtually every meaningful Java program in the world would have had to be at least recompiled on the day its users decided to upgrade to Java 5. The situation becomes even more complicated because most of the libraries that we use in our programs are developed and maintained by someone else. This means that if you wanted to upgrade to Java 5, you would have had to wait for all your external libraries to be recompiled with Java 5 as well.

The bottom line is that non-erased generic type parameters would have brought a “flag day” on which everybody would have had to recompile and deliver a new version of their libraries. That might be fine for smaller or more obedient language communities, but it is not the case for Java.

Conclusion

So there are really compelling reasons why we are not able to generify our types and methods over primitive types. In the next installment of this series we’ll look at the current proposal in OpenJDK’s project Valhalla on how it can be implemented without breaking compatibility with older releases of Java.

October 6th, 2014

by Ivan St. Ivanov


The last day at JavaOne started as usual with the community keynote. I didn’t go to it, because I wanted to have a rest after the Aerosmith and Macklemore & Ryan Lewis concert the night before and also wanted to catch up with my blogs. However, the people that I follow on Twitter were kind enough to update me with the most interesting bits of the session. Additionally, there’s already a blog post from Ben Evans about it.

Basically Oracle understood their mistake from the opening keynote on Sunday and let Mark Reinhold and Brian Goetz onto the stage to do the technical talk that was cut so abruptly a few days ago. Apparently Java 9 was the central topic. Modularity, value types, primitive types in generics, native data and code access, array rework, etc. are the most prominent things that will hopefully come with the next Java release.

The keynote continued with a panel moderated by Mark Reinhold, featuring Brian Goetz as well as James Gosling (the father of Java), Charles Nutter (the JRuby guy, from Red Hat), John Rose (JVM architect, known for things like invokedynamic and the value types work in Java) and Brian Oliver (Oracle’s in-memory data grid architect). They answered various questions tagged with #j1qa on Twitter (BTW, it was trending in the top 10 on Thursday morning). The most tweeted statement from the conversation was that James Gosling was not sorry for introducing null into the language.

The final part of the keynote was dedicated to the community (at the end, this was the community keynote). The most important news was that like last year all the JavaOne 2014 sessions will be on Parleys.

JBoss Forge

I don’t know whether there is a reader of this blog who does not know what JBoss Forge is (check out its new cool website). But still: it is a rapid application development tool, built on an extensible platform and, as of its second version, with an abstraction over user input. Yes, as of Forge 2 you can run the tool both on the command line and with wizards in your favorite IDE. Actually, a Forge-generated or Forge-managed project doesn’t have any dependency on the tool: it builds all its knowledge from the pom.xml for Maven projects and from build.gradle for Gradle projects. Forge’s extensibility comes from its modularity. No, it’s not OSGi – Forge is based on JBoss Modules and has quite some machinery based on Maven on top.

Where does Forge help, you would ask? In many areas, I would answer:

  • Figure out for you the right dependency that you have to add to a Java (EE) project
  • Put the right entry for your JPA provider in the persistence.xml
  • Help you declare a bi-directional one-to-many relationship
  • Plumb for you the boring CRUD classes
  • Prepare a mock UI (a.k.a. a scaffold) out of your domain model

This is done by already existing Forge addons, but you can very easily create one of your own to help you in your daily job. If you have a tedious task where you constantly copy and paste artifacts from one project to another, you can think of implementing your own addon. Addons are built with Maven and you can use dependency injection (CDI or the service loader) for the dependencies from the Forge core that your addon might require. Once you have developed your addon, Forge gives you a way to test it the Arquillian way, i.e. in-container without any need to mock anything, and then deploy it on the Forge runtime. You can also develop listeners that observe events before and after Forge command executions.
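To give a feeling for what an addon command looks like, here is a very rough sketch written from memory of the Forge 2 addon API – the package names, the use of UIInput and the exact method signatures below are my assumptions and may differ from the real interfaces:

import javax.inject.Inject;

import org.jboss.forge.addon.ui.command.AbstractUICommand;
import org.jboss.forge.addon.ui.context.UIBuilder;
import org.jboss.forge.addon.ui.context.UIExecutionContext;
import org.jboss.forge.addon.ui.input.UIInput;
import org.jboss.forge.addon.ui.result.Result;
import org.jboss.forge.addon.ui.result.Results;

public class GreetCommand extends AbstractUICommand {

    @Inject
    private UIInput<String> name;                // shows up as a CLI option or an IDE wizard field

    @Override
    public void initializeUI(UIBuilder builder) throws Exception {
        builder.add(name);                       // register the input with both user interfaces
    }

    @Override
    public Result execute(UIExecutionContext context) throws Exception {
        return Results.success("Hello, " + name.getValue());
    }
}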

Go and try Forge and then join us and contribute! And you might get one of these.

Be reactive with Java 8

The next logical step after applying functional programming practices to your APIs is to make the modules containing those APIs reactive. Well, this might sound like the next buzzword, but in fact it is not. The reactive programming style was invented long ago; we forgot about it and now we have rediscovered how cool it is.

How does your program become reactive? Well, if it serves requests from external clients and these requests take longer than expected, the call should not block on the client side, and the server should respond in a timely manner that it is processing the request. It is very important not to spawn a separate thread for every request – if the client retries the request again and again, the threads may quickly be exhausted and memory could fill up. Using futures is not a solution to that: yes, they seem asynchronous, but in the end you have to wait on their get() method. Callbacks seem a neat solution, but the problem with them is that they compose poorly – nesting them quickly becomes unmanageable.

So, how do you achieve the efficiency of the reactive programming style in Java 8? There are quite a few libraries, but the last session that I happened to visit at JavaOne (by Venkat Subramaniam) focused on one of them: the RxJava library from Netflix.

Its API is built around the Observable class. You create an instance of that class on the server side. It receives a kind of recipe as a parameter, telling the RxJava library what to do upon an event happening on the server side. The recipe works with an implementation of the Observer interface, and you can call any of its methods inside the create() method body. Then the client can call one of the subscribe() methods of the received Observable to pass lambdas that will react to a given event happening on the server side. The types of events that the client can react to are a new value (the onNext event), an error (onError) and completion (onCompleted). For each of them there is a dedicated subscribe() overload, as well as an overload receiving a full-blown extension of the Subscriber abstract class.

You can call the unsubscribe() method on the client side (on the Subscription returned by subscribe()) to stop “receiving” updates from the server part. Besides that, the Observable class has lots of other methods that make it look like a real [functional] lazy abstraction: skip() and skipWhile() for skipping the first events happening on the server, take() to take some of the events and then stop observing, operators like zip() and combineLatest() to combine it with other Observables, etc.
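A minimal RxJava 1.x sketch to tie these pieces together (the class name is just for illustration):

import rx.Observable;
import rx.Subscription;

public class RxDemo {
    public static void main(String[] args) {
        // The "recipe": what to emit once somebody subscribes
        Observable<Integer> numbers = Observable.create(subscriber -> {
            for (int i = 1; i <= 5; i++) {
                subscriber.onNext(i);            // emit the next value
            }
            subscriber.onCompleted();            // signal completion
        });

        Subscription subscription = numbers
                .skip(1)                         // drop the first value
                .take(3)                         // then observe only three more
                .subscribe(
                        value -> System.out.println("got " + value),   // onNext
                        error -> error.printStackTrace(),              // onError
                        () -> System.out.println("done"));             // onCompleted

        subscription.unsubscribe();              // stop receiving further updates
    }
}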

The RxJava library provides APIs for other languages on the JVM besides Java: Groovy, Scala, Clojure to name a few. I guess that this is a really great new topic as of Java 8, so it deserves much more exploration than these few paragraphs.

Java2Days preparation instead of map, flatmap and reduce

I was really interested to visit another functional talk as well. It would go to another level compared to others that I’ve seen in the recent couple of years. It promised not only to go through the usual stream collectors and reduce methods, and not to stop at forEach and filter or at the simplest map methods, but to go into other functional terms like flatMap or more complex reductions (foldLeft probably?). All in the Java world.

However, this was the only slot when I could meet Reza Rahman from Oracle to discuss what we are going to prepare for the Java2Days conference that we are running in Sofia the week after Devoxx. So I decided to watch that talk on Parleys later and had a beer with Reza during the closing hours at the Duke’s café. I can promise you that if everything goes fine, we’ll have a really cool agenda for our Bulgarian conference: hands-on labs covering the latest Java SE 8 and Java EE 7 technologies plus tools like Forge and Arquillian, Java SE 9 hacking, talks on Java EE 8, etc. Can’t wait for it to happen!