December 9th, 2013

by Ivan St. Ivanov

This year I visited the fifth edition of Java2Days and guess what – it was my fifth attendance. I still remember my Winnie the Pooh paraphrase from four years ago: "This is the best Java conference that I’ve been to. Actually, this is the only Java conference that I’ve been to." Well, after all these years, four Devoxxes and two JavaOnes, I can definitely say that I still very much enjoy going to our local (not-only-)Java geeks’ gathering! I will not blog here about my overall impressions though, but rather about one of the talks that I gave.

For the second consecutive time I was a speaker at Java2Days. Like last year, I presented together with Koen Aers about JBoss Forge – a project that I contribute to from time to time. But this particular post will be about my second session: Dissecting the Hotspot JVM.

Regular readers of this blog might remember that I had the idea to bring OpenJDK to Bulgaria. I got it at last year’s JavaOne, when I saw some talks by London Java Community (LJC) members Ben Evans and Martijn Verburg on adopting the Java reference implementation at various JUGs around the world. In the last couple of months I saw my dream come true, with the tremendous help and energy of two other BG JUG members: Martin Toshev and Dmitriy Aleksandrov (better known as Mitia). And of course not to forget the amazing support from LJC’s own Mani Sarkar, who also gave a talk at Java2Days and participated in all our activities throughout the conference.

So, let’s get to what we showed in our Dissecting the Hotspot JVM talk. It was co-hosted by Martin Toshev and me, but for the most part it was prepared by my co-speaker (kudos for that!). In the beginning we introduced the topic by describing what a virtual machine is (no, not the kind of VM that you run in VirtualBox or VMware). Then we described in a few words what the Hotspot JVM gives to Java developers: a byte code interpreter, a couple of compilers, a memory model, garbage collection, classloading, startup and shutdown…

We spent most of the time explaining the three major subsystems of the JVM, as defined in this diagram, which we borrowed from artima.com:

The classloading subsystem is responsible for loading, validating and initializing the classes from the file system (or other media) into memory. We spent some time here explaining the class file format: the magic number (CAFEBABE), the class format version, the constant pool, the references to this and super classes as well as to the implemented interfaces, and then the fields, methods and attributes.
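If you want to poke at this yourself, javap -v prints the constant pool and all the rest, but even a few lines of plain Java are enough to see the magic number and the version. Here is a minimal sketch of my own (not something we showed in the talk):

    import java.io.DataInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;

    public class ClassFilePeek {
        public static void main(String[] args) throws IOException {
            // Pass the path to any compiled .class file as the first argument
            try (DataInputStream in = new DataInputStream(new FileInputStream(args[0]))) {
                int magic = in.readInt();            // should print CAFEBABE
                int minor = in.readUnsignedShort();  // minor class format version
                int major = in.readUnsignedShort();  // major version, e.g. 51 for Java 7
                System.out.printf("magic: %X, version: %d.%d%n", magic, major, minor);
            }
        }
    }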

The biggest part of our talk was devoted to the runtime data subsystem, i.e. the way data is stored in memory while a Java program runs. Hotspot defines two types of memory: memory shared by all the threads and memory specific to every single thread. When a new thread is spawned, it gets its own memory that is guaranteed to be used just by that thread. It contains the program counter, pointing to the next instruction to be executed, as well as the Java and the native stacks. The Java stack in particular consists of a number of frames: when a method is called, the JVM creates a fixed-size segment in memory (called a stack frame), which reserves space for the method return value, the local variables (including the method parameters and a reference to this in non-static methods), the operand stack, used to store the operands for the various operations run inside the method, and a reference to the constant pool. The memory shared between all the threads contains the heap and things like JIT-compiled code, class definitions, interned strings, etc. We then went on to explain the overhead that a single object incurs when stored in memory. It’s not only about storing the object fields: every object carries two additional machine words (each 4 bytes on 32-bit machines and 8 bytes on 64-bit ones). The first one is the so called mark word, which contains the hashcode and information used by the garbage collector and for locking. The other one, the so called class word, contains a reference to the object class’s meta-data.
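If you want to see these header words with your own eyes, the OpenJDK JOL (Java Object Layout) tool prints the exact layout of any class. Here is a minimal sketch of my own (not part of the talk), assuming the org.openjdk.jol:jol-core library is on the classpath:

    import org.openjdk.jol.info.ClassLayout;

    public class ObjectHeaderDemo {
        public static void main(String[] args) {
            // Prints the offsets and sizes of the mark word, the class word
            // and the value field of java.lang.Integer
            System.out.println(ClassLayout.parseClass(Integer.class).toPrintable());
        }
    }

On a 64-bit HotSpot with compressed oops you should typically see a 12-byte header followed by the 4-byte int value.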

In the last part of our talk we dived into the execution engine. The naïve way to look at it is to treat it as a simple interpreter: we have an array of byte code op codes, and the JVM executes them one after another. However, when the virtual machine identifies that a certain chunk of code is small and executed often, it might decide to compile it just in time (hence the name JIT) to assembly code. It makes some assumptions in order to do that, so if an event happens that invalidates those assumptions, the already compiled code may be de-optimized and fall back to its interpreted version. For example, if the compiler assumed that there is just one implementation of a certain interface and decided to call that implementation’s method directly instead of doing a virtual lookup, but at some later time a classloader loads another implementation of the interface, then the assumption gets invalidated and the code is de-optimized.
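You can watch the JIT compiler make these decisions by running a hot loop with the -XX:+PrintCompilation flag – a quick sketch of my own, not something we demoed:

    public class JitDemo {
        // A tiny method that quickly becomes "hot" and is a candidate for JIT compilation
        static int square(int x) {
            return x * x;
        }

        public static void main(String[] args) {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += square(i);
            }
            System.out.println(sum);
        }
    }

Running it with java -XX:+PrintCompilation JitDemo prints a line for every method HotSpot compiles; JitDemo::square shows up once it crosses the invocation threshold. Adding -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining also shows the inlining decisions.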

We had a lot of questions at the end, most of which we were able to answer. An attendee asked about cross compiling OpenJDK, which means building an image for one operating system on another operating system. We could not answer that one, but Mani found some interesting resources a few days later and shared them on the mailing list.

As a whole I think it was a pretty successful talk. We managed to deliver a lot of useful information in a really structured way without running over our time slot. I hope this and all the other conference events that we organized will bring many more people to our JUG meetings next year. Good times…

October 4th, 2013

by Ivan St. Ivanov

The community keynote

I’m not a big fan of keynotes at conferences. Before going to my first conference (back in 2010) I thought that this is where all the announcements and big news come from. Now that I’ve been to my second JavaOne, I see that the community keynote is not for me. I’m not saying it was bad, but it was all about the different areas of life where Java is involved. Sounds good, and the people that showed their achievements (James Gosling included) deserve all the best. But to be honest, I did not enjoy that session.

The most boring part was the introduction. One of the big sponsors of JavaOne (Freescale) showed their vision of the Internet of Things. Most of the time I thought that the gentleman speaking had gotten the venue wrong: instead of the Moscone Center, where Oracle OpenWorld was going on, he came to show his white-collar presentation to the geeks. If I compare it to the other sponsor’s talk (IBM) at the first-day keynote, I would say that the blue giant’s was much more geek-oriented: it showed IBM’s vision of how they want to improve developers’ lives. What we saw now was how Freescale is going to make millions from home automation and the like.

Well, the community event had a really motivating part. Right after Stephan Janssen talked about Devoxx4Kids we saw on the stage a 10-year-old youngster, who had not been happy with his Minecraft experience and learned how to hack it… in Java. Really, it was very interesting to listen to that boy (his name is Aditya Gupta and he happens to be the son of Arun Gupta) talking about programming, Java and even de-obfuscating code.

The Forge gathering

As the talk to which I wanted to go was full, I decided to go to the Howard Street Café. There I saw Lincoln Baxter, the project lead of JBoss Forge – a project that I also contribute to from time to time. We were soon joined by another contributor – Paul Bakker. Lincoln explained to us in detail the architecture of Forge 2.0, the ideas behind addons and how the Furnace framework (i.e. our OSGi lite) works. Paul wants to integrate bnd tools into Forge, while I am right now migrating the git tools to 2.0. So Lincoln’s lecture was quite valuable for both of us.

You see, conferences are not just about going to talks; they are also about talking to each other (some people call it networking). You go there and meet all those geeks whom you otherwise just follow on Twitter or exchange emails with from time to time. Now they are all there and you can ask them whatever is bugging you. Or just go and say hi!

Venkat in action

After that I went to see a talk on mixing JVM languages with Java. The JavaOne content catalog site is designed in a way that you have to click two or three times in order to get to the speakers for a session. So I did not know who was speaking there. And when I entered I found that it was Venkat Subramaniam.

You’ve got to see this guy in action. He has an infinite source of energy somewhere inside. He talks and talks and waves his hands and live-codes and at the same time never stops talking. Until now I had only heard of him, but had never seen him live.

His talk was on calling Java from Groovy and Scala and vice versa. He was using the Groovy shell, the Scala REPL and IntelliJ IDEA, and his presentation was a plain text file in which he just ticked off the agenda topics as he covered them.

Basically calling Java from the other languages was not a big challenge. The only problem was with Java methods whose names have a special meaning in the respective language, e.g. yield and def in Scala. You work around this (in most JVM languages) by simply “escaping” the method name (usually by surrounding it with quotes, or backticks in Scala). There was more to tackle in the opposite case: when a Java program calls something written in the other two languages. And that is natural: they are more powerful in terms of language features, but in the end they are compiled down to byte code. What Venkat showed was first how to run such programs: include the respective runtime jar (Scala or Groovy) plus the compiled Scala or Groovy classes in your classpath. Then look at the file system to see what was generated: for example, for companion objects scalac generates class files with a $ suffix, like <CompanionObject>$.class. After that you reference the class that you saw on the file system from your Java code. But that does not always work smoothly: sometimes the IDE complains that it can’t find a certain class (generated by scalac), yet the program eventually runs. The bottom line is: you have to know the internals of the language that you want to call very well – its syntax and, most importantly, how it maps to Java concepts.
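To make it a bit more concrete, here is a sketch of my own (the Greeter object is hypothetical, not from Venkat’s demo). If you compile a Scala object such as object Greeter { def greet(name: String) = "Hello, " + name }, scalac emits a Greeter$.class holding a static MODULE$ singleton field, and with scala-library.jar and the compiled classes on the classpath you can call it from Java like this:

    public class CallScalaFromJava {
        public static void main(String[] args) {
            // Greeter$ is the class scalac generates for the (hypothetical) Greeter object;
            // MODULE$ is the static field that holds its singleton instance.
            String greeting = Greeter$.MODULE$.greet("JavaOne");
            System.out.println(greeting);
        }
    }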

My five cents on mixing JVM languages. Most importantly: you shouldn’t do it just because it sounds cool. There must be a reason for that. The most compelling reason hides in Ola Bini’s programming languages pyramid. The essence is that the stable part of an application (the business logic for example) should be written in a stable language, which is statically compiled. But if you want to create a DSL that uses those APIs, Groovy or Scala are far better choices for that. And that’s where JVM language interoperability kicks in. Never did it myself, so I am just pondering here.

JVM internals

The last two sessions for me at this JavaOne were all about the JVM. IBM’s presentation at the community day keynote had touched the surface of their work on packed objects, which try to overcome the problem that a big part of the memory consumed by Java objects is just header data, telling the JVM how to store, hash and synchronize on them. The talk by an IBM architect on the last day showed exactly how much memory overhead the different Java collection structures require.

He started off by calculating how much memory an Integer object takes. On a 32-bit virtual machine it takes 4 times the space that a plain int would take. That makes 96 bits of “administrative” data and just 32 bits for the int value. The first 4 bytes are the pointer to the class object, the next 4 are for things like the hashcode, then come 4 bytes for locking and synchronization, and only then come the last 4 bytes for the real value. The overhead becomes even bigger on 64-bit JVMs, where storing that same integer takes 224 bits (the first three groups now take 8 bytes each, while the int value stays at 4 bytes). This can be decreased a little by using compressed object pointers (an option of the JVM).

The second and more interesting part of the presentation was about the collections: the memory they use for storing objects, how fast you can find an object inside them and the strategy each collection uses for resizing when it reaches its capacity limit. The collections that were compared were the Hash* ones (HashSet, HashMap, Hashtable), the lists (ArrayList, LinkedList) and the StringBuffer. As you might guess, there’s a tradeoff – in the Hash* collections you find an element faster than in the lists, but the overhead for storing objects (i.e. the amount of memory that just contains “administrative” data for the JVM) is bigger (1.5 times more than in LinkedLists and 9 times more than in ArrayLists).

Expanding a collection is another very interesting topic. What happens if you have created a StringBuffer with a capacity of 40 MB, you fill it up and then add one extra character? Well, you will get an 80 MB StringBuffer, as the capacity of a StringBuffer doubles upon expansion. The same applies to the Hash* structures; ArrayLists resize by a factor of 1.5. LinkedLists are the best structure in this respect, as by their nature they always grow by exactly one element.
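You can observe this growth yourself with StringBuffer’s capacity() method. A tiny sketch of my own (with a much smaller buffer than the 40 MB example, and without asserting the exact growth factor):

    public class CapacityGrowthDemo {
        public static void main(String[] args) {
            StringBuffer buffer = new StringBuffer(16); // explicit initial capacity
            System.out.println("initial capacity: " + buffer.capacity());

            // Fill the buffer up to its capacity, then add one more character
            for (int i = 0; i < 16; i++) {
                buffer.append('x');
            }
            buffer.append('!');

            // The capacity jumps well past 17 - the buffer grows in large steps,
            // not one character at a time
            System.out.println("capacity after overflowing: " + buffer.capacity());
        }
    }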

Briefly, my takeaways from this very informative talk are:

  • Use Hash* collections if you care more about finding an element faster than about the consumed memory
  • If possible, create your collections with optimal capacity so that you avoid resizing. At the same time, try to avoid additional overhead in memory consumption (empty collections also use memory)
  • If you are not sure about your memory consumption, use Eclipse Memory Analyzer

The last session that I visited happened to be about the SAP JVM. But as I work for that company, I would like to keep the JavaOne 2013 series free from even the slightest corporate flavor. That’s why I am going to write about that talk in a separate post in the next few weeks.

October 4th, 2013

by Ivan St. Ivanov

It’s Arquillian time

Day three of JavaOne started and finished for me with Arquillian. In the morning I went to the talk by Andrew Rubinger and Aslak Knutsen about the ABC of integration testing. I hope most of the people reading this blog know what Arquillian is and does, but I will summarize it in one sentence: it makes your Java EE application unit tests run inside the container so that you don’t have to mock all the services that the container provides: persistence, CDI, transactions, etc.
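For readers who have never seen it, a typical Arquillian test looks roughly like the sketch below (written from memory; the GreetingService CDI bean is hypothetical, not an example from the talk):

    import javax.inject.Inject;

    import org.jboss.arquillian.container.test.api.Deployment;
    import org.jboss.arquillian.junit.Arquillian;
    import org.jboss.shrinkwrap.api.ShrinkWrap;
    import org.jboss.shrinkwrap.api.asset.EmptyAsset;
    import org.jboss.shrinkwrap.api.spec.JavaArchive;
    import org.junit.Assert;
    import org.junit.Test;
    import org.junit.runner.RunWith;

    @RunWith(Arquillian.class)
    public class GreetingServiceTest {

        @Deployment
        public static JavaArchive createDeployment() {
            // ShrinkWrap builds the micro-deployment that Arquillian ships to the container
            return ShrinkWrap.create(JavaArchive.class)
                    .addClass(GreetingService.class)
                    .addAsManifestResource(EmptyAsset.INSTANCE, "beans.xml");
        }

        @Inject
        private GreetingService service; // injected by the container, no mocking needed

        @Test
        public void shouldGreetByName() {
            Assert.assertEquals("Hello, JavaOne!", service.greet("JavaOne"));
        }
    }

The @Deployment method builds the archive, Arquillian deploys it to the container configured in arquillian.xml, and the test then runs in-container with real injection instead of mocks.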

I think this is my 5th or 6th session in which I have listened to a talk about the library. I blogged about it while it was still in its alpha releases. I have even used it in some of my projects. So I was pretty aware of most of the things they presented. However, I learned some new cool stuff. And as I always say: you have to be patient when speakers give introductions that you are already familiar with. You have to give the people attending such sessions for the first time the chance to get an initial feel of how everything works. And I really recommend getting Andrew and Aslak’s new book, which I hope will be available at the end of November.

I spent the afternoon hacking on Arquillian itself as part of the code garden initiative. There the leads of popular open source projects gave community members the opportunity to contribute small features. So I spent some time with Aslak while he explained to me the task that I was supposed to work on. Then I spent some more time understanding how that particular part of the Arquillian core works by running and debugging the unit tests. Finally I hacked it and at the end of the day I sent my pull request. Cheers!

Java APIs and the Internet of Things

One of the trending topics of this year’s JavaOne was Java in the Internet of Things. During the community day I saw what cool stuff the SouJava members did with some Raspberry Pis and Arduinos. But now it was time for real hacking. Robert Savage showed us how we can use the pi4j library to write event-based Java programs which control devices attached to the various ports of a Raspberry Pi.

First of all, what is the Raspberry Pi? It’s a small, cheap (just $35) computer, which has a 700 MHz ARM processor, 512 MB of RAM, a very good GPU, an SD card slot, two USB ports, HDMI, audio output, WiFi and some low-level I/O peripherals. It runs various versions of Linux and, most importantly, Java can also run on it. The pi4j library abstracts away exactly that low-level I/O (GPIO, serial ports and some other peripherals that I don’t really understand).

GPIO stands for General Purpose Input/Output. It’s a 26-pin header, and you can connect devices to most of these pins. If you want to check their state or control them with pi4j, you first have to create a GpioController and then use it to obtain a handle to a device by providing the ID (a simple number) of the pin where the device is attached. The good thing here is that there is a component API that supports most of the popular devices (keypad, LED, LCD, relay, temperature sensor to name a few), so you don’t have to be concerned with low-level hardware stuff. Once you get a handle to the device you can register listeners for events (like pressing a connected button), in response to which you can change the state of other devices (like lighting up a LED). There is also a serial device API, which allows you to open the port and send commands over it.
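Here is roughly what that looks like in code, as far as I can reconstruct it – a sketch with a button on one GPIO pin toggling a LED on another (the pin numbers are arbitrary, not the ones from the demo):

    import com.pi4j.io.gpio.GpioController;
    import com.pi4j.io.gpio.GpioFactory;
    import com.pi4j.io.gpio.GpioPinDigitalInput;
    import com.pi4j.io.gpio.GpioPinDigitalOutput;
    import com.pi4j.io.gpio.PinPullResistance;
    import com.pi4j.io.gpio.PinState;
    import com.pi4j.io.gpio.RaspiPin;
    import com.pi4j.io.gpio.event.GpioPinDigitalStateChangeEvent;
    import com.pi4j.io.gpio.event.GpioPinListenerDigital;

    public class ButtonLedDemo {
        public static void main(String[] args) throws InterruptedException {
            // The controller is the entry point to all GPIO operations
            GpioController gpio = GpioFactory.getInstance();

            // Provision a button as a digital input and a LED as a digital output
            GpioPinDigitalInput button =
                    gpio.provisionDigitalInputPin(RaspiPin.GPIO_02, PinPullResistance.PULL_DOWN);
            final GpioPinDigitalOutput led =
                    gpio.provisionDigitalOutputPin(RaspiPin.GPIO_01, "MyLed", PinState.LOW);

            // React to button state changes by toggling the LED
            button.addListener(new GpioPinListenerDigital() {
                @Override
                public void handleGpioPinDigitalStateChangeEvent(GpioPinDigitalStateChangeEvent event) {
                    if (event.getState().isHigh()) {
                        led.toggle();
                    }
                }
            });

            // Keep the program alive so the listener can keep firing
            Thread.sleep(Long.MAX_VALUE);
        }
    }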

What you can also do is run a web server (like Jetty or GlassFish) and react to some events by sending an email or a tweet, for example. I can’t believe what a real hardware newbie like me can do with this amazing computer and the pi4j library!