alt
December 9th, 2013

by Ivan St. Ivanov

This year I visited the fifth edition of Java2Days and guess what – it was my fifth attendance. I still remember my Winnie the Pooh paraphrase four years ago: This is the best Java Conference that I’ve been at. Actually, this is the only Java conference that I’ve been at. Well, after all these years, four Devoxx’s and two JavaOne’s I can definitely say that I still enjoy very much going to our local (not-only) Java geeks gathering! I will not blog here about my overall impressions though, but rather about one of the talks that I had.

For a second consecutive time I was speaker at Java2Days. Likewise last year, I was co-speaker to Koen Aers about JBoss Forge – a project that I contribute to from time to time. But this particular post will be about my second session: Dissecting the Hotspot JVM.

Regular readers of this blog might remember that I had the idea to bring OpenJDK to Bulgaria. I got that at last year’s JavaOne when I saw some talks by London Java Community (LJC) members Ben Evans and Martijn Verburg on adopting the Java reference implementation at various JUGs around the world. In the last couple of months I saw my dream come true, with the tremendous help and energy from other two BG JUG members: Martin Toshev and Dmitriy Aleksandrov (known better as Mitia). And of course not to forget the amazing support from LJC’s own Mani Sarkar that also gave a talk at Java2Days and participated in all our activities throughout the conference.

So, let’s get to what we showed at our Dissecting the Hotspot JVM talk. It was co-hosted by Martin Toshev and me, but for the most part was prepared by my co-speaker (kudos for that!). In the beginning we introduced the topic by describing what a Virtual Machine is (no, not that kind of VMs, that you run in VirtualBox or VMWare software). Then we described in a few words what the Hotspot JVM gives to the JVM developers: a byte code interpreter, a couple of compilers, memory model, garbage collection, classloading, startup and shutdown…

Most of the time we spent explaining the three major subsystems of the JVM as defined in this diagram, which we borrowed from artima.com:

The classloading subsystem is responsible for loading, validating and initializing the classes from the file system (or other media) to the memory. We spent some time here to explain the class format: the magic number (CAFEBABE), the class format version, the constant pool, the references to this and super classes as well as to the implemented interfaces, then the fields, methods and attributes.

The biggest part of our talk was devoted to the runtime data subsystem, i.e. the way data is stored in memory during a Java program runtime. Hotspot defines two types of memory: shared by all the threads and specific to every single thread. When a new thread is spawned, it gets its own memory that can be guaranteed to be used just by that thread. It contains the program counter pointing to the next instruction that should be executed as well as the Java and the native stacks. The Java stack in particular consists of number of frames: when a new method is called, the JVM creates a fixed sized segment in memory (called stack frame), which reserves space for the method return value, the local variables (including the method parameters and a reference to this in non-static methods), a reference to another stack, used to store the operands for the various operations run inside the method and a reference to the constant pool. The memory shared between all the threads contains the heap and things like JIT-compiled code, class definitions, interned strings, etc. We then went to explain the overhead that a single object takes when stored in memory. It’s not only about storing the object fields, but we get two machine words (4 bytes in 32-bit machines and 8 bytes on 64-bit ones) in addition. The first one is the so called mark work, which contains the hashcode, information concerning the garbage collector and locking. The other one, the so called class word, contains a reference to object class’s meta-data.

In the last part of our talk we dived into the execution engine. The na?ve look into that is to treat it as a simple interpreter: we have an array of byte code op codes, and the JVM executes them in a row. However, when the virtual machine identifies that a certain chunk of code is small and is executed more than often, it might decide to compile it just in time (hence the name JIT) to assembly code. It does some assumptions in order to do that, so if an event happens that would invalidate those assumptions, the already compiled code may be de-optimized and go back to its interpreted version. For example if the compiler assumed that there is just one implementation of a certain interface and decides that it should directly call that implementation’s method instead of going to look them up, but at some later time a classloader loads another implementation of the interface, then the assumptions gets invalidated and the code is de-optimized.

We had a lot of questions at the end, to most of which we were able to answer. An attendee asked about cross compiling OpenJDK, which means building an image for certain operating system on another operating system. We could not answer, but Mani found some interesting resources a few days later and shared them in the mailing list.

As a whole I think it was a pretty successful talk. We managed to deliver a lot of useful information in really structured way without going out of our time boundaries. I hope this and all the other conference events that we organized will bring much more people to our JUG meetings next year. Good times…

Leave a Reply

*