The benefits of Unit testing.

29 09 2007

OK, so it’s only one data point, but it illustrates the benefits of unit testing.

I have spent 3 weeks fixing a blocker bug in search without starting Sakai. Those three weeks were not spent just writing code; they were mainly spent writing unit tests: initially simple tests, then more complex multi-threaded tests simulating life in a cluster.

After sorting out the Spring configuration file and getting that working (there has to be a better way of doing IoC), the search component starts up and runs in soak mode (constantly indexing) for 12 hours on 500MB of data. First time! No errors!

When in unit test mode, the developer cycle time was 10 seconds or less. Starting up Sakai still takes at least 60…. you get bored… go read emails… write blogs 🙂

Unit testing… now you know it makes sense!

What’s the right way to do IoC?

28 09 2007

When we talk about IoC, there is a vast spectrum of IoC complexity that we are willing to accept. Those who love Spring IoC in XML will create thousands of lines of XML and proudly show 5 lines of Java. At the other end of the scale are those who wire up just 2 or 3 large beans in IoC to represent the exposed bean. Which is right or better? I have no idea; both are probably right depending on your religion. Here are some observations.

Static factory patterns are a complete pain: they make it hard to unit test and eliminate most deployment options, which is why we use IoC.

Manually constructing large bean structures in code, before resorting to an IoC framework, reduces what can be configured after distribution, but it also reduces the complexity of the deployed, exposed beans. It is also compiled, so the structure can be largely validated at compile time. This is still IoC; it’s just IoC using fixed method calls rather than calls invoked by some IoC management framework. So manual in-code IoC simplifies what is exposed to the deployer and makes the configuration of the component more reliable.

On the other hand, doing all the IoC within Spring XML maintains flexibility, but increases complexity and delays much of the validation until runtime. Excessive Spring XML is effectively programming in XML, and should not be excused as not being lines of code. It’s all code. The positive side of this approach is that the end result is highly configurable and customizable; the downside is that it’s almost impenetrable except to the author, and sometimes only on the day they wrote it. So heavy use of XML IoC can destabilize the deployment packages and confuse all but the authors with unnecessary detail.

So as a test I did both. I have about 50 beans that are reused multiple times and use pure getter/setter IoC. There is only 1 new() in all the bean code. I constructed unit tests with in-code IoC in the test case classes. In general it takes about an hour to construct this from scratch doing manual edits in Eclipse. Then for production I did a pure Spring XML file. This has taken about 14 hours to construct so far, even with SpringIDE in Eclipse. Maybe I am a slow typist, or SpringIDE isn’t helping me enough, but the lack of edit-time validation of the subtler details of the XML, with the final validation only happening at runtime, appears to be slowing the development cycle.
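As a rough illustration of what in-code IoC in a test case can look like, here is a minimal sketch. Every name in it (IndexStorage, InMemoryIndexStorage, IndexWorker, InCodeWiring) is hypothetical and invented for the example; none of them are Sakai classes. The point is only that the wiring is plain setter calls, so a missing or mistyped dependency shows up in the compiler or on the first test run.

```java
import java.util.ArrayList;
import java.util.List;

// All names below are hypothetical; none of them are Sakai classes.
interface IndexStorage {
    void store(String document);
    int size();
}

class InMemoryIndexStorage implements IndexStorage {
    private final List<String> documents = new ArrayList<String>();

    public void store(String document) { documents.add(document); }

    public int size() { return documents.size(); }
}

class IndexWorker {
    private IndexStorage storage; // injected through the setter, never created here

    public void setStorage(IndexStorage storage) { this.storage = storage; }

    // Plays the role of an init method: fail fast if the wiring is incomplete.
    public void init() {
        if (storage == null) {
            throw new IllegalStateException("storage has not been injected");
        }
    }

    // Stores every document and returns the total number held by the storage.
    public int index(List<String> documents) {
        for (String document : documents) {
            storage.store(document);
        }
        return storage.size();
    }
}

class InCodeWiring {
    // The wiring an XML file would otherwise describe, as method calls the compiler can check.
    static IndexWorker wire() {
        IndexWorker worker = new IndexWorker();
        worker.setStorage(new InMemoryIndexStorage());
        worker.init();
        return worker;
    }
}
```

A test case would call InCodeWiring.wire() in its setup and exercise the returned worker directly, with no container start-up involved.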

I will stick with the Spring XML for this component, but it makes me wonder if there is a better way. Google Guice uses @Inject annotations and Modules to do all the IoC directly in the code. It’s clearly not as wide as Spring…. but it looks much easier for simple IoC and eliminates all the XML files.

Whatever the right way is, it has to allow the developer to cycle fast and make progress.

Timer leaks

21 09 2007

If you use Timer and TimerTask you may find something strange with one-shot TimerTasks, i.e. ones that run just once after a delay. If you add a lot of them to the Timer, they tend to be held onto by the Timer itself, and hence anything they reference will also not get GC’d.

The JavaDoc appears to say that if you cancel the task, e.g. with TimerTask.cancel(), and then call Timer.purge(), the references should be released; however, in tests I have done this does not appear to happen in JDK 1.5.

If you want to delay invocation of events, then a java.util.concurrent.DelayQueue is probably a good alternative. The Queue can be processed periodically by a single TimerTask, resulting in no memory leak.
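A minimal sketch of the DelayQueue approach follows. DelayedEvent and DelayedEventPump are hypothetical names made up for the example; only DelayQueue and the Delayed interface come from java.util.concurrent. The idea is that a single periodic TimerTask calls drainDue(), and each event leaves the queue, and so becomes collectable, as soon as it has fired.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper: holds an action until its due time has passed.
class DelayedEvent implements Delayed {
    private final long dueNanos;
    private final Runnable action;

    DelayedEvent(long delayMillis, Runnable action) {
        this.dueNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
        this.action = action;
    }

    // Remaining delay; zero or negative means the event is due.
    public long getDelay(TimeUnit unit) {
        return unit.convert(dueNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    // Orders events so that the one due soonest sits at the head of the queue.
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
    }

    void fire() {
        action.run();
    }
}

// Hypothetical pump: schedule() queues events, drainDue() fires the ones that are due.
class DelayedEventPump {
    private final DelayQueue<DelayedEvent> queue = new DelayQueue<DelayedEvent>();

    void schedule(long delayMillis, Runnable action) {
        queue.put(new DelayedEvent(delayMillis, action));
    }

    // Intended to be called from one periodic TimerTask. poll() returns null
    // when nothing has expired yet, so this never blocks.
    int drainDue() {
        int fired = 0;
        DelayedEvent event;
        while ((event = queue.poll()) != null) {
            event.fire();
            fired++;
        }
        return fired;
    }

    int pending() {
        return queue.size();
    }
}
```

The trade-off is resolution: an event fires on the first drainDue() call after its delay has passed, so the period of the single TimerTask bounds how late it can be.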

Running specific Maven Test with JVM Args

21 09 2007

I have some long-running tests in search, but I wouldn’t want anyone to run them as part of the normal build. The tests don’t have the word Test in the class name, which prevents them from running, but they can be invoked on the command line with -Dtest=classname:

mvn -Dtest=SearchSoak test

I have also found that it’s sometimes necessary to add JVM args to the unit test. Reconfiguring the Surefire plugin makes this possible; in the pom:

    <maven.test.jvmargs> </maven.test.jvmargs>

And then to run with a heap dump and YourKit connection

export DYLD_LIBRARY_PATH="$DYLD_LIBRARY_PATH:/Applications/YourKit Java Profiler"
mvn -Dtest=SearchSoak \
   -Dmaven.test.jvmargs='-XX:+HeapDumpOnOutOfMemoryError -agentlib:yjpagent' \
   test

HSQL Unit testing

18 09 2007

Don’t be fooled by HSQL unit testing… its transaction isolation may lead you to believe that your unit tests are working perfectly, but it doesn’t support READ_COMMITTED transaction isolation, and it’s a true transaction monitor when it comes to committing the data, i.e. the code is single threaded. Since Sakai uses READ_COMMITTED for its transaction isolation in production, rather than READ_DIRTY, tests that work on HSQL may not work in production, and tests that work in production may not work in HSQL.

update testtable set x = x + 1 where id = 99;
select x from testtable where id = 99;

If testtable contains 1, and you do this in HSQL on multiple threads, there will be collisions on what is selected from testtable, since the select reads the dirty data direct from the database, and not the data in the transaction. In MySQL and Oracle, each thread will get a unique number since a) the select is taken from the value in the transaction and b) until the commit fires, the record is locked for update. Most of the time it doesn’t matter, but if you are doing any tests that involve more than one thread, beware.
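One way to make that kind of collision visible in a multi-threaded test is to run the increment-and-read step on several threads and count how many threads got a value that another thread also got. The sketch below is only the harness: UniqueValueCheck is a hypothetical name, and the Callable passed in is assumed to wrap the update and select above in a single transaction against whichever database is under test.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical test helper, not part of Sakai or any database driver.
class UniqueValueCheck {

    // Runs incrementAndRead once per thread and returns how many of the
    // returned values had already been returned to another thread.
    static int countDuplicates(int threads, Callable<Integer> incrementAndRead) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<Future<Integer>>();
            for (int i = 0; i < threads; i++) {
                results.add(pool.submit(incrementAndRead));
            }
            Set<Integer> seen = new HashSet<Integer>();
            int duplicates = 0;
            for (Future<Integer> result : results) {
                // add() returns false when the value was already seen.
                if (!seen.add(result.get())) {
                    duplicates++;
                }
            }
            return duplicates;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

A result of zero means every thread saw its own number; anything above zero is the collision described above. A single clean run does not prove much, since whether threads overlap depends on timing, so a test like this is best repeated many times.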

There is an alternative, DerbyDB from Apache, that has an Apache license and so can be used in Maven poms; however, the dialect is not the same as Oracle, MySQL or HSQL. Have a read of this for more details.

Xen Bridge on Debian Sarge/Etch with 2 interfaces

14 09 2007

The standard network-bridge script that comes with Xen on Debian Sarge does not appear to work. The problem appears to be that the network script, after converting the hardware ethernet into a promiscuous port (peth1) and binding a virtual port veth0.1 to the bridge, fails to bind the fake eth1 to the virtual port.

I don’t know if it’s the right solution, but binding the new fake eth1 to the bridge xenbr0 makes it all work.

brctl delif xenbr0 veth0.1
brctl addif xenbr0 eth1

Does the trick.

If you want to correct the scripts, change the lines

        ip link set ${bridge} up
        add_to_bridge  ${bridge} ${vif0}
        add_to_bridge2 ${bridge} ${pdev}
        do_ifup ${netdev}

to

        ip link set ${bridge} up
        add_to_bridge  ${bridge} ${netdev}
        add_to_bridge2 ${bridge} ${pdev}
        do_ifup ${netdev}

and the lines

        brctl delif ${bridge} ${pdev}
        brctl delif ${bridge} ${vif0}
        ip link set ${bridge} down

to

        brctl delif ${bridge} ${pdev}
        brctl delif ${bridge} ${netdev}
        ip link set ${bridge} down

Documentation on the Entity Binary Serialization

12 09 2007

I have put some rough documentation on the new Entity Serialization being used in 2.5 at