Classloader Magic

30 09 2008

OSGi has classloader magic, although it’s not really that much magic. Once inside a running JVM, a class is an instance of the class information bound to the classloader that loaded it. So when binding to a class, it’s both the classloader and the class that identify the instance. All those ClassCastExceptions that everyone shouts at… “you stupid program .. those are the same class” … only to realise some time later that you were the stupid one. Perhaps ClassCastException has the wrong name, since it really talks about the class instance rather than the class type, although that may just be because many developers think of a class as a type.
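
To make that concrete, here is a minimal sketch in plain Java (no OSGi involved). The names ClassIdentityDemo, Payload and IsolatingLoader are made up for illustration, and it assumes the compiled class file can be read back as a resource from the loader that loaded it, which is the case when running from a normal classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ClassIdentityDemo {

    /** Hypothetical payload class; any class would do. */
    public static class Payload {}

    /** Defines Payload itself instead of delegating to the parent loader. */
    static class IsolatingLoader extends ClassLoader {
        IsolatingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(Payload.class.getName())) {
                return super.loadClass(name, resolve); // everything else: normal delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        /** Reads the same .class bytes the parent loader would use. */
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads a second copy of Payload through a fresh IsolatingLoader. */
    static Class<?> loadIsolated() throws Exception {
        ClassLoader parent = Payload.class.getClassLoader();
        return new IsolatingLoader(parent).loadClass(Payload.class.getName());
    }

    /** Returns true if casting an instance of the second copy to Payload throws. */
    static boolean castFails() throws Exception {
        Object other = loadIsolated().getDeclaredConstructor().newInstance();
        try {
            Payload p = (Payload) other;
            return p == null;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        Class<?> original = Payload.class;
        Class<?> reloaded = loadIsolated();
        System.out.println("same name:  " + original.getName().equals(reloaded.getName()));
        System.out.println("same class: " + (original == reloaded));
        System.out.println("cast fails: " + castFails());
    }
}
```

The bytes and the name are identical; only the defining classloader differs, and that is enough for the JVM to treat the two as unrelated classes.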

So there is no classloader magic. All that OSGi classloaders do is work together through a shared registry of resolver instances of specific versions of class types. Having done that, the class instance becomes available.

The shared space that is normally provided by the typical parent classloader, as seen in Tomcat applications, becomes a slightly more intelligent classloader that maintains a virtual shared space of exported packages. The interesting part will be to see if it’s possible to use this same type of classloader as the basis for a web application context. The shared space would then become a window into a group of OSGi classloaders, each managing its own set of internal class instances and exported class instances.

How Shindig Works.

24 09 2008

Here is a great description of the internals of Shindig, well worth a read for the technically minded.

Linear Classloaders and OSGi

4 09 2008

OSGi does not remove all classloader problems, as can be seen from and where Peter Kriens notes that

“Hibernate manipulates the classpath, and programs like that usually do not work well together with OSGi based systems. The reason is that in many systems the class visibility between modules is more or less unrestricted. In OSGi frameworks, the classpath is well defined and restricted. This gives us a lot of good features but it also gives us pain when we want to use a library that has aspirations to become a classloader when it grows up.”

It turns out that some JPA solutions are OSGi friendly and others are not. It all depends on what is done to load the persistence.xml and the related classes, and then the proxy classes cause further classloader problems.
I guess, since the author, Peter, is OSGi Director of Technology, he knows what he is talking about.
Apparently EclipseLink was written to be OSGi friendly, and non-proxy, classloader-clever ORM solutions also work. Cayenne falls into this group and reportedly works OK inside OSGi, although I don’t know if that’s v2 or v3.

Stopping Spotlight

4 09 2008

Second in the slow network performance of a backup drive series. Yes, I can’t run unit tests fast enough at the moment, so I am trying to speed my box up and fix all of those things that are broken.


Whoever said building a search index was cheap? It’s not. Spotlight does impact disk performance, especially on mounted drives, and above all on network mounted drives.

It’s possible to disable indexing on a volume while Spotlight is running,

mdutil -i off /Volumes/ieb_remote

and then to delete the contents of the index on that drive.

mdutil -E /Volumes/ieb_remote

and to see the status of indexing on the drive

mdutil -s /Volumes/ieb_remote


OSX network errors

4 09 2008

OSX has a bug in its network stack, apparently associated with a 10.5 update, but the fix also appears to cure slow performance on large file transfers on Tiger. Details are here.

The main issue is the ack response, where both machines back off and wait for the other to ack, being far too polite. The change appears to have fixed things for me when backing up to a very, very slow Time Capsule, which is now fast again.

The command to turn the delayed ack off is

sysctl -w net.inet.tcp.delayed_ack=0

and to set it back to the default

sysctl -w net.inet.tcp.delayed_ack=3

and to list it

sysctl net.inet.tcp.delayed_ack

The annoying part is that if a network mounted disk is mounted over a dodgy TCP connection, it will lose the mount and stall other disk activities. This can turn up as slow response rather than total loss of the mount.

Code Quality

1 09 2008

What do you do if you have no idea who might use your code, and you want to be able to sleep at night? Simple: you make certain that all the lines of code you write, for all the situations you can think of, are covered by a unit test, so you don’t get a 4 am call-out to something that you broke. Fortunately open source code doesn’t come with a Blackberry you have to answer or a pager that’s always on, but that doesn’t mean that there isn’t a sense of responsibility to others that use your code. Even if what I write disappears into a corporation and supports perhaps millions of users, I still feel that I should make every effort to make it as solid and reliable as I can.

A level of testing that allows me to sleep can’t be achieved by just writing unit tests. A number of tools, and a number of additional concepts, are needed to help. At the simple level, unit tests generate code coverage and high levels of coverage like Shindig’s social API see  generate increasing quality. The code is known to work before it hits a server environment, and patches don’t introduce quiet bugs that only become noisy in production. Cobertura, or any other coverage reporting tool, is vital in identifying lines of code that haven’t been tested or scenarios that are not covered.

The second tool to generate quality is FindBugs, which tells the developer about bad or dangerous code patterns. Running it as part of the build gets these reports close to the developer, and once you have added FindBugs annotations the developer gains control over false positives.

To a lesser extent, Checkstyle reports like  can also identify overly complex or potentially buggy code.

All of these tools and reports are great at improving code quality, but there is one increasingly important area where they fail. Multicore processors are encouraging more and more use of parallelism within our applications. Scalable web applications are starting to consider overlapping network wait times within a single request to minimize the request latency. With this in-processor parallelism, single JVM testing cannot simulate or test the real randomness of an application under load, as the JVM is inherently synchronized with the testing framework. To achieve realistic testing, I would argue that there needs to be some randomness in the testing sequence and profile. Only with this randomness can a testing framework expect to expose those race conditions that only happen when networked machines are communicating over unpredictable connections in a truly disconnected environment.
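
As a small step in that direction, here is a sketch of what injecting randomness into a test might look like. It stays inside one JVM, so it cannot reproduce the unpredictability of real networked machines; it only perturbs thread timing. The class JitterHarness and its methods are made up for illustration and are not part of any testing framework mentioned here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class JitterHarness {

    /**
     * Runs the task `iterations` times on each of `threads` worker threads,
     * inserting seeded random pauses and yields between calls.
     * The seed fixes each worker's pause pattern, not the actual interleaving.
     */
    static void runWithJitter(Runnable task, int threads, int iterations, long seed)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        List<Future<Void>> results = new ArrayList<>();
        try {
            for (int t = 0; t < threads; t++) {
                Random random = new Random(seed + t); // one generator per worker
                Callable<Void> worker = () -> {
                    start.await(); // release all workers at the same moment
                    for (int i = 0; i < iterations; i++) {
                        int choice = random.nextInt(4);
                        if (choice == 0) {
                            Thread.sleep(0, random.nextInt(100_000)); // short random pause
                        } else if (choice == 1) {
                            Thread.yield();
                        }
                        task.run();
                    }
                    return null;
                };
                results.add(pool.submit(worker));
            }
            start.countDown();
            for (Future<Void> result : results) {
                result.get(); // rethrows any failure from a worker
            }
        } finally {
            pool.shutdownNow();
        }
    }

    /** Example use: a thread-safe counter must not lose updates under jitter. */
    static int countWithAtomic(int threads, int iterations, long seed) throws Exception {
        AtomicInteger counter = new AtomicInteger();
        runWithJitter(counter::incrementAndGet, threads, iterations, seed);
        return counter.get();
    }

    public static void main(String[] args) throws Exception {
        int total = countWithAtomic(4, 200, 42L);
        System.out.println("expected 800, got " + total);
    }
}
```

Swapping the AtomicInteger for a plain int field incremented without synchronization would give a harness like this a chance of exposing lost updates, although a failure on any particular run is not guaranteed.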

Now I can sleep.

Shindig SPI Implementation

1 09 2008

A while back I started a Service Provider Implementation for Shindig, aiming to store the data necessary to support the OpenSocial API in a simple DB. This was before there was an SPI; now there is, and it makes quite a bit of sense. There is a push to release Shindig 1.0 by the end of the month, and although a sample implementation of the SPI may not be in the core 1.0 release, I think it would make sense to have something done by then, not least because having an implementation of the SPI to break will keep us clean on maintaining the SPI.

The OpenSocial spec has grown in recent months, as more eyes have been over it. 0.7 was a relatively simple social model, but in 0.8.1, which has just been released, you can see the influence of input from the likes of MySpace and LinkedIn. All of this makes sense, and certainly makes the model more complete, but at the same time makes it all the more complicated.

So how to implement the SPI? I looked at pure JDBC, but decided that, due to the size of the model, this was going to take quite a time to make work well. I have used Hibernate in the past, but often found that it leads to a great object model and a really bad DB model that dies in production. There are more modern standards like JPA, but some of these have the same potential as older versions of Hibernate to make a mess of the DB. A while back I used Apache Cayenne on another reasonably large scale project. What struck me at the time was how it made the developer focus on the construction of a great DB model that would perform at scale, and then drove the connection of that DB model into an object model.

For Shindig, I am using Apache Cayenne 2. I did a v0.7 implementation with this stack and I am now updating it to 0.8.1. The big thing that strikes me is that I am designing the database from an entity model and then performing relational to object mapping, rather than building an object model and finding a way to store it. The second thing that strikes me is that, even though there are significant additions and changes between 0.7 and 0.8.1, Cayenne is quick and easy to work with, leaving me free to make the optimizations I want in the ORM conversions without making me jump through hoops.

What will be really interesting to see is how the implementation process impacts the database design and how it scales up in production. At the moment it looks nice, but as always with a DB, looking nice needs to be checked with lots of data and load at scale.