Lucene lock files that won’t go away

5 09 2007

As part of the search testing, I’ve been running a large number of threads operating on Lucene segments. It all works fine for about 45 minutes and then I start seeing messages like:

Caused by:
    Lock obtain timed out:

This prevents the index from being opened for either reading or writing. Looking in the /tmp directory, there are hundreds of lucene-*.lock files. I believe that the GUID in the lock file name is based on the canonical path to the segment on disk (it looks like a SHA-1 hash), and if you forget to close the index, the lock will remain. So if your segments have fixed names and your program crashes, you will be left with lock files, and some time later you will not be able to open those segments. You could delete the lock files on startup, but that would be dangerous. If you manage the indexes in your own code, then using a shutdown hook and a register of active indexes will ensure that the indexes are all closed.


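Runtime.getRuntime().addShutdownHook(new Thread() {
    /* (non-Javadoc)
     * @see java.lang.Thread#run()
     */
    public void run() {
        try {
            // close every index still held in the register of active indexes
        } catch ( Exception ex ) {
            // log and carry on, the JVM is shutting down anyway
        }
    }
});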
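The register of active indexes could look something like the sketch below. The class name ActiveIndexRegister and its methods are hypothetical, and it tracks plain java.io.Closeable objects rather than any particular Lucene class. The only point is that everything opened gets recorded, and the shutdown hook closes whatever is still open.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical register of open indexes; not part of Lucene or Sakai. */
final class ActiveIndexRegister {
    // Thread-safe set, since many threads open and close indexes concurrently.
    private final Set<Closeable> open = ConcurrentHashMap.newKeySet();

    /** Record an index (reader, writer, searcher...) as open. */
    <T extends Closeable> T register(T index) {
        open.add(index);
        return index;
    }

    /** Close one index and forget it. */
    void close(Closeable index) throws IOException {
        if (open.remove(index)) {
            index.close();
        }
    }

    /** Close everything still open; returns how many failed to close. */
    int closeAll() {
        int failures = 0;
        for (Closeable index : new ArrayList<>(open)) {
            try {
                close(index);
            } catch (Exception ex) {
                failures++; // keep going so one bad index does not block the rest
            }
        }
        return failures;
    }

    int openCount() {
        return open.size();
    }

    /** Install a JVM shutdown hook that closes whatever is still registered. */
    void installShutdownHook() {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> closeAll(), "index-closer"));
    }
}
```

A shutdown hook runs on a normal exit or a Ctrl-C, but not if the JVM is killed outright (kill -9, power loss), so stale lock files are still possible in that case.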

My test now runs forever 🙂

Zip files without an End-of-central-directory

5 09 2007

If you create zip files with ZipOutputStream and forget to close or flush the stream correctly, you may be able to re-open them in Java, but if you try unzip or another command line utility, you will get a cryptic End-of-central-directory missing message.

Looking at the archive with an octal dump (e.g. od -a) exposes the fact that the central directory is there at the end, but it’s missing a few trailing bytes. Closing the stream correctly will fix the problem.
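A quick way to see the difference is to build an archive in memory and look for the end-of-central-directory signature (the bytes P, K, 5, 6) in the last 22 bytes. This is only a sketch: the class and method names are made up, and it assumes the archive has no comment, so the end record is exactly 22 bytes long. Here the unclosed stream never writes the central directory at all, which is a slightly different failure from the truncated one described above, but the fix is the same.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper for checking whether a zip was finished properly. */
final class ZipCloseCheck {

    /** Build a one-entry archive in memory, closing the stream only if asked to. */
    static byte[] writeZip(boolean closeStream) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ZipOutputStream zip = new ZipOutputStream(bytes);
            zip.putNextEntry(new ZipEntry("hello.txt"));
            zip.write("hello".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            if (closeStream) {
                zip.close(); // writes the central directory and its end record
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if the last 22 bytes start with the end record signature "PK\005\006". */
    static boolean endsWithEndRecord(byte[] zip) {
        int at = zip.length - 22; // minimum end record size, assumes no archive comment
        return at >= 0
                && zip[at] == 'P' && zip[at + 1] == 'K'
                && zip[at + 2] == 5 && zip[at + 3] == 6;
    }

    public static void main(String[] args) {
        System.out.println("closed:     " + endsWithEndRecord(writeZip(true)));
        System.out.println("not closed: " + endsWithEndRecord(writeZip(false)));
    }
}
```

In real code, opening the ZipOutputStream in a try-with-resources block guarantees the close happens even when an exception is thrown part way through.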

Unit Testing Strategies

5 09 2007

When I first wrote Search, I didn’t do much detailed unit testing because it was just so hard to make it work with the whole of the Sakai framework. And because of that lack of unit testing, the development of search was hard work. I did have some unit tests that did things like launch 50 free running threads to try and break synchronization of certain parts of the search framework, but much of this I did with Sakai up and running. The cycle was slow, and the startup even slower with the sort of tests I needed to do.
Recently I have been re-writing the search indexing strategies to make them more robust, using a two-phase commit strategy controlled by 2 independent transaction managers (along the lines of XA) with a redo log of index updates. Clearly this is slightly more complex, and this time I need unit testing. So here is my personal list of dos and don’ts for unit testing with Sakai.

  • Don’t use any static covers anywhere in your code; this kills any possibility of unit testing.
  • Don’t use the Sakai ComponentManager; this again will kill any possibility, since it requires logging components to be ready.
  • Don’t bind to spring, or at least not initially.
  • Do inject all your dependencies.
  • Do create custom mock objects to mock up only the parts of the APIs that you need; with Eclipse it’s easy to let it do 90% of the work.
  • Do start by invoking the target method, run and fill in the missing dependencies after each run, which is why I say don’t bind to spring. Do all your injection in the unit test case.
  • Do bring up hsql and create the tables you need.
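As a rough illustration of the injection and mock points in the list, here is a minimal sketch. All of the names (ContentSource, IndexWorker, MockContentSource) are invented for the example and are not taken from the Sakai search code. The class under test receives its dependency through a setter, the mock implements only the one method the test needs, and all the wiring happens in the test itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical dependency of the class under test. */
interface ContentSource {
    String fetch(String reference);
}

/** Hypothetical class under test: dependencies arrive through setters, no static covers. */
final class IndexWorker {
    private ContentSource contentSource;
    private final List<String> indexed = new ArrayList<>();

    void setContentSource(ContentSource contentSource) {
        this.contentSource = contentSource;
    }

    /** Returns true if the reference had content and was added to the index. */
    boolean index(String reference) {
        if (contentSource == null) {
            throw new IllegalStateException("contentSource has not been injected");
        }
        String content = contentSource.fetch(reference);
        if (content == null || content.isEmpty()) {
            return false;
        }
        indexed.add(reference + "=" + content);
        return true;
    }

    List<String> indexed() {
        return indexed;
    }
}

/** Hand-written mock implementing only what the test needs. */
final class MockContentSource implements ContentSource {
    final List<String> requested = new ArrayList<>();

    public String fetch(String reference) {
        requested.add(reference);
        return reference.startsWith("/empty") ? "" : "content of " + reference;
    }
}

final class IndexWorkerTest {
    /** Wires the object graph by hand, then exercises the target method. */
    static void testIndexSkipsEmptyContent() {
        MockContentSource mock = new MockContentSource();
        IndexWorker worker = new IndexWorker();
        worker.setContentSource(mock); // injection done in the test, not by Spring

        check(worker.index("/site/a"), "non-empty content should be indexed");
        check(!worker.index("/empty/b"), "empty content should be skipped");
        check(mock.requested.size() == 2, "both references should be fetched");
        check(worker.indexed().size() == 1, "only one entry should be indexed");
    }

    static void check(boolean ok, String message) {
        if (!ok) throw new AssertionError(message);
    }
}
```

Running the target method before the setter call fails straight away with the IllegalStateException, which is the "run and fill in the missing dependencies" cycle in miniature.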

I have done this for the new search indexer and found about 10 really strange potential bugs that might have taken months to uncover inside Sakai. I have also been able to simulate a 10 node cluster using long running unit tests under multiple copies of maven that have exposed all sorts of locking and synchronization issues that would have been hard to find in a real Sakai cluster.

The code is at

Datasources at 

and an example of a large test class at

I am certain there will be those who say I should be using Spring to do this construction, but the cycle appears to be faster with this direct route, and it exposes where getters and setters are missing, without bringing the unit test up.

Finally, if you write a really big test that you don’t want to run all the time, use

mvn -Dtest=SearchSoak test

to run it explicitly. This one launches 3 unsynchronized threads representing 3 Sakai servers and runs for 10 minutes trying to break the code.
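The general shape of such a soak test can be sketched as below. This is not the SearchSoak class itself; SoakRunner and its fields are hypothetical names, and the task being hammered is whatever Runnable you pass in. Each thread runs freely until the deadline, and any exception is recorded rather than lost on a background thread.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical soak-test harness: free-running threads, fixed duration. */
final class SoakRunner {
    final AtomicLong operations = new AtomicLong();
    final ConcurrentLinkedQueue<Throwable> failures = new ConcurrentLinkedQueue<>();

    /** Run the task on the given number of threads until the time is up. */
    void run(int threads, long millis, Runnable task) {
        CountDownLatch start = new CountDownLatch(1);
        long deadline = System.currentTimeMillis() + millis;
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all threads together
                    do {
                        task.run();
                        operations.incrementAndGet();
                    } while (System.currentTimeMillis() < deadline);
                } catch (Throwable t) {
                    failures.add(t); // record the failure and stop this worker
                }
            }, "soak-" + i);
            workers[i].start();
        }
        start.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

A run matching the description above would be something like run(3, 10 * 60 * 1000L, task), followed by a check that failures is empty.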