Xythos releases JSR-170 beta programme

11 09 2007

http://www.sys-con.com/read/426849.htm

Xythos has released its JSR-170 beta programme; Sakai gets a strong mention.




Faster, Lighter Serialization

7 09 2007

I have been having a long hard look at the serialization of data to storage in Sakai. Back in 2.4 I noticed while testing Content Hosting DAV for search that there was quite a lot of GC activity. A bit of profiling (with YourKit, thanks for the OS license 🙂 ) showed that about 400-500 MB was going through the GC per upload in a big worksite due to quota calculations. This isn't quite as bad as it sounds, since objects going through the heap quickly don't cost anything. However, this is not good.

So in 2.4 we put in a caching fix which meant that this would only happen once every ten minutes of upload activity. But it made me think about what was going wrong. Once again YourKit showed that the DOM parser was at fault.

A bit of free-memory calculation around the serializer and parser parts of the route to and from the database shows that each entity takes 2-3ms and consumes 300-400KB of heap, hence a site with 1000 resources consumes 400MB.

When Cocoon 1 came out it was slow. It used DOM. Cocoon 2 came out with SAX and was much faster. So step 1: convert to a SAX parser for reading the entities. This dropped the parse time to about 1ms and the heap requirements down to 40K. An OK speedup, but I have a gut feeling that this is still too slow. Writing the block, still using a DOM, cost 4ms and 70K.
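
To illustrate the idea (this is not the actual Sakai parser; the element and attribute names here are hypothetical), a SAX handler populates the entity as the elements stream past rather than building a whole document tree first:

import java.io.ByteArrayInputStream;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEntityReader
{
   public void parse(String xml) throws Exception
   {
      SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
      parser.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")), new DefaultHandler()
      {
         @Override
         public void startElement(String uri, String localName, String qName,
               Attributes attributes)
         {
            // Copy fields straight onto the entity as elements stream past,
            // instead of materialising the whole document tree first.
            if ("resource".equals(qName))
            {
               String id = attributes.getValue("id");
               // ... set the id and the remaining attributes on the entity object
            }
         }
      });
   }
}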

So yesterday I started to write a serializing infrastructure that would not use XML, but would parse into a binary format. The early unit tests were showing that a parser and serialiser based on DataOutputStream and DataInputStream was, under load, taking about 0.000020ms per entity with a 35 byte overhead per entity. By overhead, I mean the extra memory required for parsing, not the memory required for the input or output data. My methodology was probably flawed, but these results looked like they were worth pursuing.
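
As a flavour of the technique (a minimal sketch only; the real serializer covers the full entity state, and these field and class names are mine), an entity can be written and read with DataOutputStream and DataInputStream like this:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class BinaryEntitySerializer
{
   private static final int VERSION = 1;

   public byte[] serialize(String id, long contentLength) throws IOException
   {
      ByteArrayOutputStream bos = new ByteArrayOutputStream();
      DataOutputStream dos = new DataOutputStream(bos);
      dos.writeInt(VERSION);      // a format version, so the layout can change later
      dos.writeUTF(id);           // length-prefixed UTF-8, no DOM or text parsing needed
      dos.writeLong(contentLength);
      dos.flush();
      return bos.toByteArray();
   }

   public void parse(byte[] data) throws IOException
   {
      DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data));
      int version = dis.readInt();
      String id = dis.readUTF();
      long contentLength = dis.readLong();
      // ... populate the entity object from the decoded fields
   }
}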

So with the above parser, a large DAV upload, which hammers CHS, generated the following metrics for serialize and parse. Parse, i.e. reading the data from the database into an entity object, takes 62-70us and requires about 4K per entity; serialize takes 49us and 6K.

INFO: Average direct Parse now 0.06278688524590163ms 4706.255300546448 bytes (2007-09-07 22:59:54,014 http-8080-Processor23_org.sakaiproject.util.BaseDbSingleStorage)
INFO: Average direct Parse now 0.07424657534246576ms 4538.613698630137 bytes (2007-09-07 22:59:54,487 http-8080-Processor23_org.sakaiproject.util.BaseDbSingleStorage)
INFO: Average Serialization now 0.049322033898305084ms 6768.359322033898 bytes (2007-09-07 22:59:54,787 http-8080-Processor23_org.sakaiproject.util.BaseDbSingleStorage)
INFO: Average direct Parse now 0.06304347826086956ms 4680.677826086957 bytes (2007-09-07 22:59:56,718 http-8080-Processor23_org.sakaiproject.util.BaseDbSingleStorage)
INFO: Average direct Parse now 0.07405405405405406ms 4757.4648648648645 bytes (2007-09-07 22:59:59,191 http-8080-Processor22_org.sakaiproject.util.BaseDbSingleStorage)

The modifications can read either XML or the binary format, and can be configured to write either XML or binary. Obviously if you write XML then you lose all the advantages. The data is still stored in the same CLOB database columns, but it just looks different, as below.


CHSBRE
^/group/7f33526e-446f-4fca-80f6-f9dc0b48b7a1/caret/darwinfiles/caret-files/vol14/set/v14jun.amp )org.sakaiproject.content.types.fileUpload inherited
????????????????
d e DAV:creationdate 20070907215527865 e CHEF:is-collection false e DAV:getlastmodified 20070907215527866 e DAV:getcontenttype application/octet-stream e DAV:getcontentlength 103371 e DAV:displayname
v14jun.amp e CHEF:copyright >copyright (c) 2007, Sakai Administrator. All Rights Reserved. e
CHEF:creator admin e CHEF:modifiedby admin application/octet-stream ?? 1/2007/250/21/2932b7bf-e41f-478b-00c7-d30f298a58d3
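
To give an idea of how the dual-format support might hang together (the marker check is an assumption based on the CHSBRE prefix visible above, and the class and method names are illustrative rather than the actual code), a reader can sniff the blob and pick a parser, while the writer follows a configuration flag:

public class DualFormatSerializer
{
   private static final byte[] BINARY_MARKER = "CHSBRE".getBytes();

   private boolean writeBinary = true; // configuration: write binary or legacy XML

   public Object parse(byte[] blob) throws Exception
   {
      if (startsWith(blob, BINARY_MARKER))
      {
         return parseBinary(blob); // new DataInputStream-based path
      }
      return parseXml(blob);       // legacy XML path, so existing rows still load
   }

   public byte[] serialize(Object entity) throws Exception
   {
      return writeBinary ? serializeBinary(entity) : serializeXml(entity);
   }

   private boolean startsWith(byte[] blob, byte[] marker)
   {
      if (blob.length < marker.length) return false;
      for (int i = 0; i < marker.length; i++)
      {
         if (blob[i] != marker[i]) return false;
      }
      return true;
   }

   // Hypothetical hooks onto the real parsers and serializers.
   private Object parseBinary(byte[] blob) throws Exception { return null; }
   private Object parseXml(byte[] blob) throws Exception { return null; }
   private byte[] serializeBinary(Object entity) throws Exception { return new byte[0]; }
   private byte[] serializeXml(Object entity) throws Exception { return new byte[0]; }
}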

Disadvantages? Well, the format of serialization is not XML. Strings are still readable, but you probably need the serialization classes to make real use of it. I have structured these so they can run outside Sakai.





Surefire Unit Test arguments in Maven 2

5 09 2007

To make the surefire plugin for Maven 2 operate in a separate JVM instance, with different JVM args (e.g. more memory, or a profiler), you can change the way in which the unit tests are launched.


<plugins>
  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <configuration>
      <forkMode>pertest</forkMode>
      <argLine>${maven.test.jvmargs}</argLine>
    </configuration>
  </plugin>
</plugins>

Then you will need to declare ${maven.test.jvmargs} in the properties section to keep Maven happy.


<properties>
  <deploy.target/>
  <maven.test.jvmargs></maven.test.jvmargs>
</properties>

and finally to run

mvn -Dtest=SearchSoak -Dmaven.test.jvmargs='-agentlib:yjpagent -Xmx128m' test

Perhaps at some point the maven.junit.jvmargs parameter will appear in Maven 2, as it did in Maven 1?





Lucene lock files that won't go away

5 09 2007

As part of the search testing, I’ve been running a large number of threads operating on Lucene segments. It all works fine for about 45 minutes, and then I start seeing messages like

Caused by: java.io.IOException:
    Lock obtain timed out:
    Lock@/tmp/lucene-8b98f884942d60a1ef42e1098e14f204-commit.lock
         at org.apache.lucene.store.Lock.obtain(Lock.java:58)
         at org.apache.lucene.store.Lock$With.run(Lock.java:108)

Which prevents the index from opening for both reading and writing. Looking in the /tmp directory, there are hundreds of lucene-*.lock files. I believe that the GUID is based on the canonical path (it looks like SHA-1) to the segment on disk, and if you forget to close the index, the lock will remain. So if your segments have fixed names and your program crashes, you will be left with lock files, and some time later you will not be able to open those segments.

You could delete the lock files on startup, but that would be dangerous. If you manage the indexes in your own code, then using a shutdown hook and a register of active indexes will ensure that the indexes are all closed.

eg

// Close the open reader when the JVM exits, so the commit lock is released.
Runtime.getRuntime().addShutdownHook(new Thread() {
   @Override
   public void run()
   {
      try
      {
         multiReader.close();
      }
      catch ( Exception ex )
      {
         // nothing useful can be done this late in shutdown
      }
   }
});
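
The single-reader hook above extends naturally to the register of active indexes mentioned earlier. A minimal sketch, with names of my own invention rather than anything from the Sakai search code:

import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.apache.lucene.index.IndexReader;

public class IndexRegistry
{
   private final Set<IndexReader> open = new HashSet<IndexReader>();

   public IndexRegistry()
   {
      Runtime.getRuntime().addShutdownHook(new Thread() {
         @Override
         public void run()
         {
            closeAll(); // release every commit lock before the JVM exits
         }
      });
   }

   public synchronized void register(IndexReader reader)
   {
      open.add(reader);
   }

   public synchronized void closeAll()
   {
      for (IndexReader reader : open)
      {
         try
         {
            reader.close();
         }
         catch (IOException ex)
         {
            // nothing useful can be done during shutdown
         }
      }
      open.clear();
   }
}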

My test now runs forever 🙂





Zip files without an End-of-central-directory

5 09 2007

If you create zip files with the ZipOutputStream and forget to close or flush the stream correctly, you may be able to re-open them in Java, but if you try unzip or another command line utility, you will get a cryptic End-of-central-directory missing message.

Looking at the archive with octal dump (eg od -a broken.zip) exposes the fact that the central directory is there at the end…. but it's missing a few trailing bytes. Closing the stream correctly will fix the problem.
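
For completeness, a minimal example of writing a zip correctly (the file and entry names are just placeholders); closing the stream is what writes the End-of-central-directory record:

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipWriter
{
   public void write(String path, byte[] content) throws IOException
   {
      ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(path));
      try
      {
         zos.putNextEntry(new ZipEntry("file.txt"));
         zos.write(content);
         zos.closeEntry();
      }
      finally
      {
         // Without this, the End-of-central-directory record is never written.
         zos.close();
      }
   }
}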





Unit Testing Strategies

5 09 2007

When I first wrote Search, I didn't do much detailed unit testing because it was just so hard to make it work with the whole of the Sakai framework. And because of that lack of unit testing, the development of search was hard work. I did have some unit tests that did things like launch 50 free-running threads to try and break synchronization of certain parts of the search framework, but much of this I did with Sakai up and running. The cycle was slow, and the startup even slower with the sort of tests I needed to do.
Recently I have been rewriting the search indexing strategies to make them more robust, using a two-phase commit strategy controlled by two independent transaction managers (along the lines of XA) with a redo log of index updates. Clearly this is slightly more complex, and this time I need unit testing. So here is my personal list of dos and don'ts for unit testing with Sakai.

  • Don't use any static covers anywhere in your code; this kills any possibility of unit testing.
  • Don't use the Sakai ComponentManager; this again will kill any possibility, since it requires logging components to be ready.
  • Don't bind to Spring, or at least not initially.
  • Do inject all your dependencies.
  • Do create custom mock objects to mock up only the parts of the APIs that you need; with Eclipse it's easy to let it do 90% of the work.
  • Do start by invoking the target method, then run and fill in the missing dependencies after each run, which is why I say don't bind to Spring. Do all your injection in the unit test case.
  • Do bring up HSQLDB and create the tables you need (see the sketch below).
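
As a sketch of what this looks like in practice (JournalWriter and the table are hypothetical stand-ins, not real search classes), a JUnit test can bring up an in-memory HSQLDB in setUp(), create the tables it needs, and inject every dependency by hand:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

import junit.framework.TestCase;

public class JournalWriterTest extends TestCase
{
   /** Hypothetical component under test, with its one dependency injected via a setter. */
   static class JournalWriter
   {
      private Connection connection;

      public void setConnection(Connection connection) { this.connection = connection; }

      public void save(long id, String status) throws Exception
      {
         Statement s = connection.createStatement();
         s.execute("INSERT INTO search_journal VALUES (" + id + ", '" + status + "')");
         s.close();
      }
   }

   private Connection connection;

   @Override
   protected void setUp() throws Exception
   {
      // In-memory HSQLDB: fast to start, thrown away at the end of the test.
      Class.forName("org.hsqldb.jdbcDriver");
      connection = DriverManager.getConnection("jdbc:hsqldb:mem:testdb", "sa", "");
      Statement s = connection.createStatement();
      s.execute("CREATE TABLE search_journal (id BIGINT, status VARCHAR(32))");
      s.close();
   }

   public void testSave() throws Exception
   {
      JournalWriter writer = new JournalWriter();
      writer.setConnection(connection); // no ComponentManager, no Spring context

      writer.save(1L, "pending");

      Statement s = connection.createStatement();
      ResultSet rs = s.executeQuery("SELECT COUNT(*) FROM search_journal");
      assertTrue(rs.next());
      assertEquals(1, rs.getInt(1));
      rs.close();
      s.close();
   }

   @Override
   protected void tearDown() throws Exception
   {
      connection.createStatement().execute("SHUTDOWN");
   }
}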

I have done this for the new search indexer and found about 10 really strange potential bugs that might have taken months to uncover inside Sakai. I have also been able to simulate a 10-node cluster using long-running unit tests under multiple copies of Maven, which have exposed all sorts of locking and synchronization issues that would have been hard to find in a real Sakai cluster.

The code is at https://source.sakaiproject.org/svn/search/trunk/search-impl/impl/src/test/org/sakaiproject/search/

Datasources at

https://source.sakaiproject.org/svn/search/trunk/search-impl/impl/src/test/org/sakaiproject/search/indexer/impl/test/TDataSource.java 

and an example of a large test class at

https://source.sakaiproject.org/svn/search/trunk/search-impl/impl/src/test/org/sakaiproject/search/indexer/impl/test/MergeUpdateOperationTest.java

I am certain there will be those who say I should be using Spring to do this construction, but the cycle appears to be faster with this direct route, and it exposes where getters and setters are missing without even bringing the unit test up.

Finally, if you write a really big test that you don't want to run all the time, use

mvn -Dtest=SearchSoak test

to run it explicitly. This one launches 3 unsynchronized threads representing 3 Sakai servers and runs for 10 minutes trying to break the code.