March 6th, 2015

by Ivan St. Ivanov

At the Bulgarian JUG we had an event dedicated to trying out what the OpenJDK Valhalla project has achieved so far in the area of using primitives as type parameters of generics. Our colleague and blogger Mihail Stoynov already wrote about our workshop. I decided, though, to go into a little more detail and explain the various aspects of the feature.

In the first part of this three-part series of posts you could read about the reasoning behind not supporting generic classes with primitive parameters. In the second part we went through the proposed syntax, the implementation approaches and the compromises that led to them. In this last installment I will turn your attention to something that is very important when such big language and platform changes happen: the possible migration of the existing APIs, namely the Collections library.

Before I start, I would like to repeat the same disclaimer. I am not an expert in this matter. I just try to follow the project Valhalla mailing list, and I have read Brian Goetz’s State of Specialization document. So look at this series more as explaining the generics proposals in layman’s terms.

The usual suspect

Java 5 and Java 8 have one thing in common. They were both revolutionary releases with respect to the language, the platform and the APIs. And in both cases most of the discussions were about how the big features coming with these new versions would affect and be affected by the java.util libraries.

For example, let’s take the introduction of generics. The container nature of the collection API made it one of the most obvious candidates for generifying. That is why most of the compromises that were made (type erasure for example) were because of it and its ubiquitous usage.

The same happened in Java 8. The collections were again the usual suspect for benefiting from the newly introduced functional concepts made possible by lambda expressions. Thus the Stream API was born, which allowed for a more functional approach to working with containers, rather than the imperative style of programming with iterators and loops. But in order to retrofit that new API into the existing java.util interfaces, they needed to be evolved in a backward compatible manner. Thus the needs of the collections (for the most part) brought us the concept of default methods.

So based on the history that we have, there’s no doubt that the introduction of the primitives in generics feature researched in project Valhalla will lead to some more migration challenges, mainly caused by the praised collection API.

Migration challenges

Let’s go through some of these challenges.

The java.util.ArrayList class is backed by an Object array, on the assumption that the type parameter extends Object. This no longer holds if you want to make ArrayList parameterized with any T. This location, and every other one where we have the following code:

T[] array = (T[]) new Object[n];

have to be adapted to support the specialization. Like this:

T[] array = new T[n];
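For comparison, here is a minimal sketch of how the erased idiom behaves in today's Java. The class SimpleList and its methods are made up for illustration and are not taken from the JDK sources. The point is only that the cast compiles because T is erased to Object, which is also why an int has to be boxed before it can be stored.

```java
import java.util.Arrays;

// Hypothetical, stripped-down list that uses the erased-array idiom.
class SimpleList<T> {
    private T[] array;
    private int size;

    @SuppressWarnings("unchecked")
    SimpleList(int capacity) {
        // Legal only because T is erased to Object at runtime.
        array = (T[]) new Object[capacity];
    }

    void add(T element) {
        if (size == array.length) {
            // Grow the backing array; it is still an Object[] underneath.
            array = Arrays.copyOf(array, Math.max(1, size * 2));
        }
        array[size++] = element;
    }

    T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        return array[index];
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        SimpleList<Integer> numbers = new SimpleList<>(2);
        numbers.add(1); // the int literal is boxed to an Integer here
        numbers.add(2);
        numbers.add(3); // triggers a resize of the backing array
        System.out.println(numbers.get(2)); // prints 3
    }
}
```

With specialization, the hope is that a list of int would be backed by a real int[] instead of an Object[] full of boxes.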

This is an internal representation, you say, so changing it is somewhat easier because it does not break compatibility. And you would be right. But let’s move on to the next challenge. Let’s look at the List interface and its most commonly used implementation: ArrayList. What do you think about this method in the interface:

boolean remove(Object o);

It works fine with erasure, but it will not work in the specialized case, because primitive values are not Objects.

Similar is the situation with another, this time generified method in the List interface:

boolean removeAll(Collection<?> c);

As we saw in the previous installment of this series, Collection<?> cannot be cast to Collection<any T>. So this will not work with specialized List either.
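To see how loosely typed these two signatures are under erasure, here is a small sketch in today's Java. The class and method names are hypothetical; only the java.util calls are real. Both calls compile even though the arguments have nothing to do with String, precisely because the parameters are typed as Object and Collection<?>.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class ErasedSignaturesDemo {

    static boolean removeUnrelated(Collection<String> names) {
        // Compiles because the parameter type is Object, not String.
        return names.remove(Integer.valueOf(42));
    }

    static boolean removeAllUnrelated(Collection<String> names) {
        // Compiles because the parameter type is Collection<?>.
        Collection<?> unknown = Arrays.asList(1, 2, 3);
        return names.removeAll(unknown);
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("Ivan", "Mihail"));
        System.out.println(removeUnrelated(names));    // prints false
        System.out.println(removeAllUnrelated(names)); // prints false
        System.out.println(names.size());              // prints 2
    }
}
```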

Going even further, let’s consider the hypothetical situation where the remove method is generified and takes T instead of Object. Then we will get the following overloaded methods in the List interface with rather different meaning:

remove(T element);
remove(int position);

This is not a simple overload: the above methods have completely different semantics. The first one is used to remove a concrete element, while the second one is used to remove the element at the specified index (presumably in a random access collection). If the class that defines these methods is made generic over any T and T is int, both methods end up as remove(int), and the virtual machine will have a hard time picking which of the two to call.
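A milder form of this confusion already exists today with List<Integer>, where only boxing keeps the two overloads apart. The sketch below uses a hypothetical demo class around the real java.util.List API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveOverloadDemo {

    static List<Integer> sample() {
        return new ArrayList<>(Arrays.asList(10, 20, 30, 1));
    }

    static List<Integer> removeAtIndex() {
        List<Integer> list = sample();
        // An int argument selects remove(int index): the element at index 1 goes away.
        list.remove(1);
        return list;
    }

    static List<Integer> removeValue() {
        List<Integer> list = sample();
        // An Integer argument selects remove(Object): the element equal to 1 goes away.
        list.remove(Integer.valueOf(1));
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeAtIndex()); // prints [10, 30, 1]
        System.out.println(removeValue());   // prints [10, 20, 30]
    }
}
```

With a list specialized to int there would be no boxed form left to tell the two calls apart.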

The final challenge that we’ll look at here comes with the so-called sentinel value of the get(key) method of java.util.Map. At the moment, if there is no value for the specified key in the map, this method returns null. However, this can no longer be the case if the map can hold non-reference types, because null cannot be assigned to them. Finding a proper sentinel value for primitive types is not an easy job (think about boolean :)).
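Here is a short sketch of the sentinel problem as it already shows up in today's Java. The demo class is hypothetical; Map.get, Map.getOrDefault and Map.containsKey are the real java.util.Map methods.

```java
import java.util.HashMap;
import java.util.Map;

class MapSentinelDemo {

    static Map<String, Integer> scores() {
        Map<String, Integer> scores = new HashMap<>();
        scores.put("Ivan", 0);
        return scores;
    }

    // Returns true if unboxing the missing value blows up.
    static boolean unboxingMissingValueFails() {
        try {
            int score = scores().get("Mihail"); // get returns null, unboxing throws
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    static int scoreOrDefault(String name) {
        // A default of 0 cannot be told apart from Ivan's real score of 0.
        return scores().getOrDefault(name, 0);
    }

    public static void main(String[] args) {
        System.out.println(scores().get("Mihail"));         // prints null
        System.out.println(unboxingMissingValueFails());    // prints true
        System.out.println(scoreOrDefault("Ivan"));         // prints 0
        System.out.println(scoreOrDefault("Mihail"));       // prints 0
        System.out.println(scores().containsKey("Mihail")); // prints false
    }
}
```

Any primitive value picked as the "not found" marker has the same weakness as the 0 above: it is also a perfectly valid value.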

The peeling technique

After going through the challenges, let’s take a look at how they can be resolved.

One of the solutions proposed in Brian Goetz’s paper is the so-called peeling technique. According to it, an interface is broken down into layers: one generic layer that is common to all kinds of type parameters, and then optional separate layers for the different kinds of type parameters (one for reference types, another one for primitive types, etc.). To illustrate this, let’s take our hypothetical fully generified List with an overloaded remove method:

public interface List<E> {
    boolean remove(E element);
    boolean remove(int index);
}

It is already clear that its any-fying is not straightforward because of the option to specialize E to int. So, if we want to have a list over any E, we’ll have to somehow avoid method overloading. This can be done with the following possible steps:

  1. Define methods removeByValue(E) and removeByIndex(int) that are available to all the possible types E (primitive and reference). These methods will belong to the generic layer.
  2. To keep the backward compatibility, keep the overloaded remove methods, but define them only in the reference layer.
  3. In the same reference layer provide default implementations of the newly added generic methods that simply delegate to the respective remove method.

Here is one possible syntax for those three steps:

interface List<any E> {
    // 1) New methods added to the generic layer
    void removeByValue(E element);
    void removeByIndex(int pos);

    layer<ref E> {
        // 2) Abstract methods that exist only in the ref layer
        void remove(int pos);
        void remove(E element);

        // 3) Default implementations of the new generic methods
        default void removeByIndex(int pos) { remove(pos); }
        default void removeByValue(E e) { remove(e); }
    }
}
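The layer syntax above is only a proposal and does not compile with any released Java version. The same shape can be roughly approximated in today's Java by turning the reference layer into a sub-interface. In the sketch below the names AnyList, RefList and ArrayRefList are hypothetical, and AnyList still cannot be instantiated with int, so this illustrates only the layering and the delegation, not the specialization itself.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the generic layer: only unambiguous method names.
interface AnyList<E> {
    void removeByValue(E element);
    void removeByIndex(int pos);
}

// Stand-in for the reference layer, emulated here as a sub-interface.
interface RefList<E> extends AnyList<E> {
    // The legacy overloads live only here.
    void remove(int pos);
    void remove(E element);

    // The new generic methods simply delegate to the legacy overloads.
    @Override
    default void removeByIndex(int pos) { remove(pos); }

    @Override
    default void removeByValue(E element) { remove(element); }
}

// Minimal implementation that only has to provide the legacy methods.
final class ArrayRefList<E> implements RefList<E> {
    private final List<E> items = new ArrayList<>();

    void add(E element) { items.add(element); }

    @Override
    public void remove(int pos) { items.remove(pos); }

    @Override
    public void remove(E element) { items.remove((Object) element); }

    List<E> snapshot() { return new ArrayList<>(items); }

    public static void main(String[] args) {
        ArrayRefList<Integer> list = new ArrayRefList<>();
        list.add(10);
        list.add(20);
        list.add(30);
        list.add(1);
        list.removeByIndex(1); // removes the element at index 1 (20)
        list.removeByValue(1); // removes the element equal to 1
        System.out.println(list.snapshot()); // prints [10, 30]
    }
}
```

Callers that go through removeByIndex and removeByValue never have to think about which overload an int argument selects.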

I will leave it to your imagination or curiosity to find out how the Map.get method could be implemented to support primitive return values.

First experiments

This proposal sounds a bit theoretical and at the same time bold. I still remember the controversy that something like default methods in interfaces caused in Java 8. We are most probably going to have private methods in interfaces in Java 9. And nobody paid real attention to static methods in interfaces (again Java 8). Compared to those changes, the layer stuff looks like a revolution of its own. But let’s leave the theoretical discussions to the theoreticians, the philosophical disputes to the philosophers and let’s take a look at something tangible.

A few weeks ago Peter Levart came up with a first experiment for anyfying part of the collection API. He took the following paths in this first attempt:

  • Methods that were not fully generic (like Collection.remove(Object)) were complemented with an additional default method (Collection.removeElement(E))
  • Code that assumed the internal representation is an Object was changed so that it uses E (or T) instead. See, for example, the sort method
  • The construct new T[length] was used in the TimSort constructor instead of (T[]) new Object[length]
  • An interesting idiom (I don’t know what else to call it) was used to differentiate between pieces of code specific to reference and primitive (or actually value) types: __WhereVal(E) and __WhereRef(E)
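The first bullet can be sketched in today's Java as shown below. This is not Peter Levart's actual code: the interface ElementRemovable and the class ElementList are hypothetical stand-ins for the patched Collection, and the __WhereVal/__WhereRef idiom is not reproduced because it exists only in the Valhalla prototype.

```java
import java.util.ArrayList;

// Hypothetical stand-in for a Collection that has been given the extra method.
interface ElementRemovable<E> {
    // The legacy, not fully generic method.
    boolean remove(Object o);

    // The fully generic complement, delegating to the legacy method.
    default boolean removeElement(E element) {
        return remove(element);
    }
}

// ArrayList already provides remove(Object), so nothing else has to be implemented.
class ElementList<E> extends ArrayList<E> implements ElementRemovable<E> {

    public static void main(String[] args) {
        ElementList<String> names = new ElementList<>();
        names.add("Ivan");
        names.add("Mihail");
        System.out.println(names.removeElement("Ivan")); // prints true
        System.out.println(names);                       // prints [Mihail]
        // names.removeElement(42) would not compile, unlike names.remove((Object) 42)
    }
}
```

Because the new method is a default one, existing implementations pick it up without being touched.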

Conclusion

In this third and final part of the blog series about primitive generic parameters we talked about the challenges that the existing APIs (mostly the collections) will face when this feature is introduced. I briefly showed you the latest proposal (so far) for coping with those challenges, as well as the initial experiments done in the project Valhalla source repositories.

And that was it! In this three part series I tried to share with you in plain English the things that I shared with our Java user group in plain Bulgarian a month ago. I will be extremely happy if you enjoyed it and if you learned something new. Stay tuned for more great content from me and especially from our JUG!
