Searching in Jackrabbit with jcr:path as the primary search vector is expensive… avoid it. Searching on node properties and node types is fast.
Jackrabbit searching
25 03 2009
Categories : Uncategorized
Alternative Locking for a Jackrabbit Cluster
20 03 2009
From the previous two posts you will see that I have been working on fixing some concurrent update issues with Jackrabbit in a cluster. The optimising and merging nature of Jackrabbit's conflict resolution strategy certainly gives it performance, but it does not guarantee that the data will always be persisted. Handling those exceptions would work in a perfect world, but I don't have one of those to hand.
The solution, for the moment at least, appears to be to lock the nodes prior to modification, locating the closest persisted ancestor to hold the lock. Unfortunately the Jackrabbit lock manager uses the Journal records when performing locks, so I have written an in-memory lock manager that replicates the map of locks over the cluster without using the database. This, if it proves reliable, should eliminate the need to access a shared database on every lock and unlock operation. The unit tests show that with no contention a lock takes 0.02ms and clearing a set of about 10 locks takes 0.004ms. Obviously with massive contention the lock time approaches a factor of the throughput. Sadly, the locking system is susceptible to deadlock since we cannot guarantee the order of locking; however, since updates follow the same code paths, the locking and unlocking order is liable to be the same. It's the same problem as exists inside the DB, except that the scope of the locks we are dealing with is probably smaller.
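As a rough illustration only, and not the lock manager described above, here is a minimal single-JVM sketch of a map-based lock manager. The class name `InMemoryLockManager`, its methods and the string owner ids are all hypothetical, and replicating the map across the cluster is deliberately left out. It also shows one way to sidestep the ordering problem: when several paths are needed at once, take them in sorted order and roll back on failure.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: node path -> owner id, held in memory on one JVM only. */
final class InMemoryLockManager {
    private final ConcurrentHashMap<String, String> locks =
            new ConcurrentHashMap<String, String>();

    /** Take the lock if it is free; succeeds again for the owner that already holds it. */
    boolean tryLock(String path, String owner) {
        String holder = locks.putIfAbsent(path, owner);
        return holder == null || holder.equals(owner);
    }

    /** Poll for the lock until it is obtained or timeoutMs has elapsed. */
    boolean lock(String path, String owner, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!tryLock(path, owner)) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt visible to callers
                return false;
            }
        }
        return true;
    }

    /** Lock several paths in sorted order so that all callers acquire in the same order. */
    boolean lockAll(Collection<String> paths, String owner, long timeoutMs) {
        List<String> acquired = new ArrayList<String>();
        for (String path : new TreeSet<String>(paths)) {
            boolean alreadyHeld = owner.equals(locks.get(path));
            if (!lock(path, owner, timeoutMs)) {
                // Roll back only the locks taken by this call.
                for (String p : acquired) {
                    unlock(p, owner);
                }
                return false;
            }
            if (!alreadyHeld) {
                acquired.add(path);
            }
        }
        return true;
    }

    /** Release one lock; has no effect unless owner is the current holder. */
    boolean unlock(String path, String owner) {
        return locks.remove(path, owner);
    }

    /** Release every lock held by owner and return how many were released. */
    int unlockAll(String owner) {
        int released = 0;
        for (Map.Entry<String, String> e : locks.entrySet()) {
            if (e.getValue().equals(owner) && locks.remove(e.getKey(), owner)) {
                released++;
            }
        }
        return released;
    }

    /** Current holder of the lock on path, or null if it is free. */
    String holderOf(String path) {
        return locks.get(path);
    }
}
```

Sorting the paths only helps when every lock a unit of work needs is known up front; locks taken one at a time along different code paths can still deadlock, which is the risk described above.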
Impact of Locks in a cluster
20 03 2009
I thought JCR locking was a potential solution, but there are some issues. With Jackrabbit, each lock generates a journal entry, and it looks like there might be some journal activity generated when attempting to get a lock.
Using the locking mechanism in the previous post, I have one update to a property on one node, performed 200 times by each of ten threads concurrently. That leads to 19K journal updates. If I unwind the threads and the operations into loops performing the work sequentially, for the 2000 operations I get about 4000 journal entries. That means locking multiplies the number of database operations in Jackrabbit by between 4x and 5x under load (19K against 4000). Since these are write operations and they need to be replayed on all application server nodes in a cluster, that might not be acceptable.
There are two other approaches to this problem: accept that the exception can happen and handle it in the same way you would an optimistic lock failure, or create an RDBMS lock scheme.
The optimistic lock failure recovery has complications, since it requires perfect transactional isolation in the application code.
The RDBMS locking table might work provided the root persisted node can be identified. It may also be possible to implement this with a cluster-replicated cache, avoiding any DB overhead.
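To make the first option concrete, here is a minimal sketch of optimistic-failure handling: rerun the whole unit of work when a stale-state conflict is reported. Everything in it is hypothetical, including the `OptimisticRetry` helper and the `StaleStateException` class, which merely stands in for whatever exception the repository raises on a conflicting update. It only works if the unit of work can safely be redone from scratch, which is exactly the isolation requirement mentioned above.

```java
import java.util.function.Supplier;

/** Hypothetical helper: rerun a unit of work when it fails with a stale-state conflict. */
final class OptimisticRetry {

    /** Stand-in for the repository's conflict exception (an assumption, not a real API). */
    static final class StaleStateException extends RuntimeException {
        StaleStateException(String message) {
            super(message);
        }
    }

    /**
     * Runs unitOfWork up to maxAttempts times. The work must start from fresh
     * state on every call, since a failed attempt is assumed to be rolled back.
     */
    static <T> T run(Supplier<T> unitOfWork, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        StaleStateException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return unitOfWork.get();
            } catch (StaleStateException e) {
                last = e; // someone else changed the item first; try the whole unit again
            }
        }
        throw last; // every attempt hit a conflict
    }
}
```

Any state the unit of work reads outside the supplier would be stale on the second attempt, so in practice the supplier has to re-read everything it depends on.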
JCR Locks and Concurrent Modifications
20 03 2009
Heavy concurrent modification of a single node in Jackrabbit will result in InvalidItemStateException, even within a transaction.
The solution is to lock the node. The code below performs a database-like lock on the node, timing out after 30s if no lock was obtained. The lock needs to be explicitly unlocked, as it's a cluster-wide lock on the node.
I suspect, however, that the propagation rate will not be fast enough to maintain consistency over a cluster, but then again… nothing will be fast enough without impacting performance. The slightly annoying feature of this is that you must perform locking manually. This is IMVHO a bit crazy, since at some point if you don't and you write to the node, you will get an exception, and if you are in a transaction (as you should be) you won't be able to recover from the exception, since it will require rollback and a complete redo of the whole transaction.
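import javax.jcr.Node;
import javax.jcr.RepositoryException;
import javax.jcr.lock.Lock;
import javax.jcr.lock.LockException;

public Lock getNodeLock(Node node) throws RepositoryException {
    try {
        // Reuse the lock if this session already holds its token.
        Lock lock = node.getLock();
        if (lock.getLockToken() != null) {
            return lock;
        }
    } catch (LockException e) {
        // Node is not locked yet; fall through and acquire it.
    }
    long deadline = System.currentTimeMillis() + 30000L; // give up after 30s
    long sleepTime = 100;
    int tries = 0;
    while (System.currentTimeMillis() < deadline) {
        try {
            // Deep lock, not session scoped, so it must be unlocked explicitly.
            return node.lock(true, false);
        } catch (LockException ex) {
            // Held elsewhere: back off, capping the wait at 500ms.
            if (sleepTime < 500) {
                sleepTime = sleepTime + 10;
            }
            if (++tries % 20 == 0) {
                System.err.println(Thread.currentThread() + " Waiting for " + sleepTime + " ms " + tries);
            }
            try {
                Thread.sleep(sleepTime);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new LockException("Interrupted while locking " + node.getPath());
            }
        }
    }
    throw new LockException("Failed to lock node " + node.getPath());
}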
Jackrabbit Observation
19 03 2009
Not Observation as in the ObservationManager sense, but an observation about JCR and Jackrabbit that has been confusing me and still is. If I put access control on JCR, I don't get notification of an access control failure until I try to save an item or, if in a transaction, at the commit (need to check that). This means that the failure doesn't happen in the code where the problem is. I am not certain that is right since, given a permission denied on save, you might take alternative action, but if you have to wait until the end of the transaction… how can you?
A second thing that is consuming a certain amount of my free thought cycles at the moment is the issue of locking. I would have thought that opening a transaction and adding a node in the tree would lock the parent node until the transaction had been committed. However, this does not appear to be the case. Does this mean that I have to explicitly lock parent nodes, or nodes on modification? If so… what a pain… is there a better way?
Faster Jackrabbit
12 03 2009
As with everything, there are right ways to do things and wrong ways. It looks like Jackrabbit is the same: doing lots of saves generates lots of version history in Jackrabbit and results in lots of DB traffic, which makes all JCR operations slow. If you can, do one save per request cycle; binding a transaction manager to the JCR objects means that all SQL activity is performed at the end of the request cycle in one block. Having seen the impact of a small amount of tuning on write performance, I think there will be scope for more.