14 October 2007 - 20:06
Managed Environments
When writing Java code, it’s useful to differentiate between two types of targets: a “normal” environment and a “managed” environment [1]. The difference between the two is simple. In a normal environment, you (the person writing the code) call the main() method [2]. In a managed environment, you do not. Managed environments are sometimes called container environments because they usually follow a containment or hosting model. In this model, the host container is the code that contains the main() method, and independent units of third-party code (hereafter plugins[3]) are managed by the container.
One of the most familiar examples of managed environments to most Java developers is the application server model used to host server-side Java applications. Such an application server can range from a full-blown JEE container implementation to a smaller-scale operation that hosts servlets with possibly a few EE features. Another example of a managed environment that most Java developers will have interacted with is the Eclipse IDE. Eclipse is entirely built around a container model (for the last few versions, that model has been OSGi). Users of Eclipse need not know or care about this, but anyone who writes plugins for Eclipse must understand the model and its implications.
It’s helpful to think about the differences between managed and normal environments. In theory, the environment should be mostly transparent to the code executing in that environment. In practice, however, there are some issues that need to be considered. This is especially true if you are writing library-level code: some libraries are much more “container-friendly” than others. If you are writing application-level code this may or may not be an important issue for you. Occasionally, you might have a single code base that you want to use in both types of environments.
The biggest difference between the two types of environments is the classloader layout. Normal Java environments are usually pretty simple in this regard [4]. Without going into unnecessary detail, this layout contains a bootstrap classloader that loads the system classes (like java.*) and an application classloader that loads everything else, including the classes that you write and any libraries those classes depend on. In contrast, a managed environment usually implies a more complicated classloader hierarchy. There will typically be a separate classloader for each plugin. There will often also be one or more classloaders in the hierarchy to support the container itself, and one or more classloaders that are shared among all of the plugins [5].
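To see which layout your own code is running in, you can walk the parent chain of a class's classloader. The following is only a minimal sketch: the class name LoaderChain and the "bootstrap" label are my own inventions for illustration, and the loader names it prints differ between JVM versions and containers. Some containers (OSGi, for example) do not use a strict parent chain, so the output there shows only part of the picture.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {

    /**
     * Returns the class names of the loaders from the given class's own
     * loader up through its parents. The bootstrap loader is represented
     * as null in the API, so it is added here under a made-up label.
     */
    static List<String> chainOf(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = type.getClassLoader(); cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        // A class you wrote: typically the application loader and its parents.
        System.out.println("application class: " + chainOf(LoaderChain.class));
        // A system class: loaded directly by the bootstrap loader.
        System.out.println("java.lang.String:  " + chainOf(String.class));
    }
}
```

Running the same code inside a container will usually print a longer chain, with the plugin's own loader at the front.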
Why is the classloader layout important to think about? Two big reasons: visibility and static variables. Visibility is a simple property that has to do with the relationship between classloaders. Class A has visibility of class B if A’s classloader is able to load class B. Visibility becomes important in a variety of contexts. For example, many APIs will instantiate objects for you. In order to do this, they might need to have visibility of the class to be instantiated [6]. Static variables are associated with a class instead of an instance of that class, and are often used to obtain global variable semantics in Java. However, static variables aren’t really that similar to global variables - they do have a scope, and that scope is the class they are associated with. Such variables are only visible to classes that have visibility to the static variable class. If two different classloaders in the application each load a class that defines a static variable, there will be two instances of that variable. This situation may or may not be anticipated by the author of the class.
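The static variable point is easy to demonstrate. The sketch below loads the same class through two separate classloaders and shows that each copy has its own static field. Everything in it is hypothetical: IsolatingLoader is a toy stand-in for a per-plugin classloader, not how any particular container works, and it assumes the compiled class files can be read as resources from the application classloader (true when running from a normal classpath directory or jar).

```java
import java.io.IOException;
import java.io.InputStream;

class StaticScopeDemo {

    /** A class whose static field is meant to behave like a global. */
    public static class Registry {
        public static int count = 0;
    }

    /**
     * Toy plugin-style loader: defines its own copy of one named class
     * instead of delegating to the parent. All other classes are loaded
     * through the usual parent-first delegation.
     */
    static class IsolatingLoader extends ClassLoader {
        private final String isolatedName;

        IsolatingLoader(ClassLoader parent, String isolatedName) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    c = defineFromParentResource(name);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        /** Reads the class file bytes via the parent and defines them in this loader. */
        private Class<?> defineFromParentResource(String name)
                throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] bytes = in.readAllBytes();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /**
     * Sets the static field through one loader's copy of Registry, then
     * reads it through that copy, a second loader's copy, and the
     * application's own copy, in that order.
     */
    static int[] setInOneReadInOthers(int value) throws Exception {
        ClassLoader app = StaticScopeDemo.class.getClassLoader();
        String name = Registry.class.getName();
        Class<?> copyA = new IsolatingLoader(app, name).loadClass(name);
        Class<?> copyB = new IsolatingLoader(app, name).loadClass(name);
        copyA.getField("count").setInt(null, value);
        return new int[] {
            copyA.getField("count").getInt(null),
            copyB.getField("count").getInt(null),
            Registry.count
        };
    }

    public static void main(String[] args) throws Exception {
        int[] seen = setInOneReadInOthers(5);
        System.out.println("copy A: " + seen[0]
                + ", copy B: " + seen[1]
                + ", application copy: " + seen[2]);
    }
}
```

Only the copy that was written to sees the new value. The other two copies still hold the initial value, even though all three come from the same source code.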
In the “normal” environment, it’s hard to make things go wrong classloader-wise. Any libraries your application code has dependencies on are pretty much guaranteed to have visibility of your code, since there’s essentially just one flat classloading space. Any static variables behave mostly like global variables for the same reason. In fact, you can pretty much ignore many classloader issues in a normal environment.
In a managed environment, there is much more potential for things to go wrong. As mentioned above, each plugin will have its own classloader. Most managed environments have complicated rules or configuration options that govern the relationships between the plugin classloaders, any shared classloaders, the container classloaders, and the bootstrap classloader. Since there is not a flat classloading space, visibility issues come into play. For instance, a library may not be able to instantiate an object because the library’s classloader does not have visibility of the object’s class. A class may be loaded twice by different classloaders, and code may then try to use both versions in the same context (leading to very confusing ClassCastExceptions!). In short, all kinds of problems can happen, and debugging them can make for hours and hours of fun.
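One way a library can sidestep the instantiation problem is to let the caller say which classloader to use, falling back to the thread context classloader. The following is a minimal sketch of that idea, not a recommendation for any specific library. The Instantiator class and its method are hypothetical names. The calls it relies on (Class.forName with an explicit loader, Thread.getContextClassLoader, Class.cast) are standard.

```java
class Instantiator {

    /**
     * Creates an instance of the named class using the given loader.
     * If no loader is given, falls back to the thread context classloader,
     * and finally to the loader that loaded this library class.
     */
    static <T> T newInstance(String className, Class<T> expectedType, ClassLoader loader)
            throws ReflectiveOperationException {
        ClassLoader effective = loader;
        if (effective == null) {
            effective = Thread.currentThread().getContextClassLoader();
        }
        if (effective == null) {
            effective = Instantiator.class.getClassLoader();
        }
        // Look the class up in the chosen loader rather than in whichever
        // loader happens to have loaded this library.
        Class<?> type = Class.forName(className, true, effective);
        Object instance = type.getDeclaredConstructor().newInstance();
        // Throws ClassCastException if expectedType was loaded by a
        // different loader than the one the instance's class sees.
        return expectedType.cast(instance);
    }

    public static void main(String[] args) throws Exception {
        java.util.List<?> list =
                newInstance("java.util.ArrayList", java.util.List.class, null);
        System.out.println(list.getClass().getName());
    }
}
```

A plugin would pass its own classloader (for example, SomePluginClass.class.getClassLoader()) so that the library can see classes that live only in the plugin.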
A particularly interesting problem that often occurs in managed environments is an insidious type of memory leak. Managed environments often provide for some amount of dynamic loading and unloading of plugins at runtime. This is most often implemented by discarding the plugin classloader when the plugin is unloaded. The plugin classloader contains references to all of the plugin classes, which in turn contain references to any static variables defined by the plugin. If all goes well, the garbage collector can reclaim the plugin classloader and everything referenced by it. However, it only takes one outside reference to a single object from the plugin to keep the classloader from being garbage collected. This is because every object has a reference to its class, and every class has a reference to the classloader that loaded it. It turns out to be very non-trivial to cleanly unload a plugin at runtime - just perform a web search for “classloader memory leaks” to read all sorts of war stories. The Java language and libraries are just not designed for that kind of thing. Luckily, most managed environments do not have core functionality that depends on dynamic unloading [7].
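The reference chain from object to class to classloader can be made visible in a few lines. This is a sketch under assumptions of my own: the "plugin" loader is an empty anonymous ClassLoader, the "plugin object" is a dynamic proxy whose class is defined in that loader, and CONTAINER_LISTENERS stands in for any long-lived container-side collection. It does not try to show garbage collection itself, only that the loader stays reachable.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class LeakChainDemo {

    /** Stand-in for a container-level registry that outlives any one plugin. */
    static final List<Runnable> CONTAINER_LISTENERS = new ArrayList<>();

    /** Creates a fresh loader playing the role of a plugin classloader. */
    static ClassLoader newPluginLoader() {
        return new ClassLoader(LeakChainDemo.class.getClassLoader()) { };
    }

    /**
     * Creates an object whose class is defined in the given loader.
     * A dynamic proxy is used only because it is a compact way to get a
     * class defined in a loader of our choosing.
     */
    static Runnable newPluginObject(ClassLoader pluginLoader) {
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "hashCode": return System.identityHashCode(proxy);
                case "equals":   return proxy == args[0];
                case "toString": return "pluginObject";
                default:         return null; // run() does nothing
            }
        };
        return (Runnable) Proxy.newProxyInstance(
                pluginLoader, new Class<?>[] { Runnable.class }, handler);
    }

    /** Follows object -> class -> classloader. */
    static ClassLoader loaderReachableFrom(Object o) {
        return o.getClass().getClassLoader();
    }

    public static void main(String[] args) {
        ClassLoader pluginLoader = newPluginLoader();
        CONTAINER_LISTENERS.add(newPluginObject(pluginLoader));

        // The container "unloads" the plugin by dropping its loader reference...
        pluginLoader = null;

        // ...but the loader is still reachable through the registered listener.
        ClassLoader stillThere = loaderReachableFrom(CONTAINER_LISTENERS.get(0));
        System.out.println("loader still reachable: " + (stillThere != null));
    }
}
```

Until the listener is removed from the container's list, the plugin loader and every class it loaded remain strongly reachable.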
Another problem occurs with the issue of configuration. Let me skip straight to an example: using Java system properties for configuration simply does not work in a managed environment [8]. System properties have global scope - when a system property is set, the value is visible to every other part of the application. In particular, system properties cut across classloader boundaries. Suppose a managed environment contains plugin A and plugin B, both of which depend on a library which is placed in a shared classloader space. Assume further that said library relies on system properties for configuration. This is bad because plugin A can now affect plugin B’s configuration.
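Here is the plugin A and plugin B scenario as a small sketch, together with one container-friendlier alternative. The library classes and the property name examplelib.timeout are made up for illustration. Passing configuration in explicitly is just one possible fix, not the only one.

```java
class SystemPropertyClash {

    /** Hypothetical shared library that reads its configuration from a system property. */
    static final class GlobalConfigLibrary {
        static int timeoutSeconds() {
            return Integer.parseInt(System.getProperty("examplelib.timeout", "30"));
        }
    }

    /** Container-friendlier variant: each caller hands in its own configuration. */
    static final class ExplicitConfigLibrary {
        private final int timeoutSeconds;

        ExplicitConfigLibrary(int timeoutSeconds) {
            this.timeoutSeconds = timeoutSeconds;
        }

        int timeoutSeconds() {
            return timeoutSeconds;
        }
    }

    public static void main(String[] args) {
        // "Plugin A" configures the shared library for its own needs...
        System.setProperty("examplelib.timeout", "5");

        // ...and "plugin B", which never touched the property, gets A's value.
        System.out.println("plugin B sees timeout: " + GlobalConfigLibrary.timeoutSeconds());

        // With explicit configuration, each plugin keeps its own setting.
        ExplicitConfigLibrary forA = new ExplicitConfigLibrary(5);
        ExplicitConfigLibrary forB = new ExplicitConfigLibrary(60);
        System.out.println("A: " + forA.timeoutSeconds() + ", B: " + forB.timeoutSeconds());

        System.clearProperty("examplelib.timeout");
    }
}
```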
So when writing Java code, whether a library or not, consider the target environment of your code. Managed environments are becoming more and more common. Managed environment techniques like custom classloaders that would have been considered advanced 5 years ago are now much more commonplace, which means you’re much more likely to run into them. Managed environments, long the standby of the server-side Java space, are now increasingly popular on the desktop. The Java community is starting to take a lot of interest in standardizing managed environments [9]. If you’re writing application-level code, understand which type of environment you’re targeting. If you’re writing a library, take the time to write container-friendly code.
Footnotes:
[1]
The term “managed” here means something different than another common use of the word: in the .NET world, the term “managed” refers to code that uses automatic memory management as opposed to manual allocation and freeing. That’s not the meaning I’m using in this article.
[2]
That is, the public static void main(String[] args) method. In other words, you control the entry point of the Java application.
[3]
Plugins may not be the best term, but “containees” or “independent third-party units of code” was just too unwieldy to use throughout the rest of the article. If you’re a server-side person, just substitute “web application” everywhere you see “plugin”.
[4]
But not always. As non-trivial applications grow, they almost always eventually start playing magic classloader tricks to support various features. At that point, they start to look a lot more like managed environments from the perspective of some of the code.
[5]
For example, see this document for a description of the classloader hierarchy in one managed environment.
[6]
Of course, a well-written API that does this will not require such visibility. Instead, it will either require that a classloader that does have visibility be passed, or it will make use of hacks such as the thread context classloader.
[7]
For instance, most JEE containers support hot redeploying of applications. This functionality is very useful during development but rarely enabled when the application is put into production.
[8]
Despite the fact that it does not work, many libraries do it anyway. You’ve been warned! :)