Cayenne is a hot pepper, no kidding.

28 04 2007

So I was getting fed up with the data model I am working with: lots of many-to-many mapping tables, often with several tables between the real entities. So I looked up flattening with Cayenne.

http://cayenne.apache.org/doc20/flattened-relationships.html

Which looks rather simple, coming from a Hibernate world with sets, one-to-many, many-to-many and reverse joins there to confuse.

This aspect of Cayenne is a little confusing to someone used to complexity. If you define the database relationships in the mapping file, then you can tell Cayenne the path between the entities in terms of the database mappings.

That's it.

It does the rest. I have 2 distant entities, with 4 mapping tables between them, 5 hops, and from one class I can do a getRemoteEntityList(); Cayenne converts that into a multi-table join according to the paths it was configured with in the mappings, and you get the list in one SQL statement.
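
For illustration, a minimal sketch of what that call looks like, assuming hypothetical generated classes (LocalEntity, RemoteEntity) and an equally made-up mapping path; this is not my actual schema:

import java.util.List;
import org.apache.cayenne.DataObjectUtils;
import org.apache.cayenne.access.DataContext;

// Hypothetical: assumes the map file defines a flattened relationship such as
//   <obj-relationship name="remoteEntityList" source="LocalEntity"
//       target="RemoteEntity"
//       db-relationship-path="toMapA.toMapB.toMapC.toMapD.toRemote"/>
DataContext context = DataContext.createDataContext();
LocalEntity local = (LocalEntity) DataObjectUtils.objectForPK(context, LocalEntity.class, 42);
// One getter call; Cayenne issues a single multi-table join across the 5 hops.
List remoteEntities = local.getRemoteEntityList();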

This isn't too far off what we tend to do with hand-optimised SQL.

The XML specification of this is quite easy to understand, just define the path as a dot-separated list of relationships, but the GUI mapper is even easier: at every step of navigation it knows what relationships are available and gives you the options for where to go next. If there is only one path it is completely automatic; if there is a decision to be made it is up to you.

This also works for updates and inserts (I am told); however, if there is more than one mapping table, the relationship is read-only. If there is only one, it will perform all the operations on the intermediate table.





Apache Cayenne Is Cool (Hot, So far)

28 04 2007

The first weeks of a marriage are bliss, and I guess that's what the first hours of using yet another ORM tool are like. And so far, 4 hours in, I'm having real fun (sad) with Apache Cayenne. I have a largish schema, not related to Sakai, that I need to map: about 80 entities, mostly legacy, with some reasonably complex relationships. The Cayenne Modeler tool and its integration with Eclipse is excellent, as is their getting started documentation.

After 30 minutes I had the whole schema mapped and building in Eclipse.

After 90 minutes, and after some problems with Spring XML (not related), the DataContext is up and running and I am able to navigate existing objects.

Didn’t even have to write a finder or any ORM code at this point.
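
For flavour, a minimal sketch of the kind of navigation this gives you; the entity names here are hypothetical, not from my actual schema:

import java.util.List;
import org.apache.cayenne.access.DataContext;
import org.apache.cayenne.query.SelectQuery;

// Fetch and navigate with no hand-written finder code; the generated
// classes expose the mapped relationships directly.
DataContext context = DataContext.createDataContext();
List orders = context.performQuery(new SelectQuery(Order.class));
Order first = (Order) orders.get(0);
List lines = first.getOrderLines();   // relationship navigation, faulted lazily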

Unfortunately the database I am working with has no relationships recorded in the DB, but the Modeler is letting me add them by hand. Once the relationships are marked, and I add the mappings into the classes, it knows which ones to use and it understands reverse navigation. At the moment it just looks like it works.

I am bound to hit problems at some point, but the first impression is nothing like it was with Hibernate 2 and Spring: 4 hours of hard work followed by a trip to the bookshop.

Although there isn't an obvious Spring integration, it looks modular and will probably integrate with no problems. It might need a bit of configuration around the DataSource, but it looks like it's not nearly as picky about being in Sakai Shared and will share a connection. So far it's working so well that I haven't had to look too far inside to work that one out.

No wonder it popped up as a top-level project at Apache so fast.





Spring 2 XML validation errors (cvc-elt.1)

28 04 2007

If you get one of these in Spring 2:

org.springframework.beans.factory.BeanDefinitionStoreException: Line 5 in XML document from class path resource [applicationContext.xml] is invalid; nested exception is org.xml.sax.SAXParseException: cvc-elt.1: Cannot find the declaration of element 'beans'.

Then you need to change your applicationContext.xml from


<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans/spring-beans-2.0.xsd">

To


<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd">

The schemaLocation attribute takes pairs of namespace URI and XSD location, so the broken version gives the parser no schema for the beans namespace at all. There may also be problems with local resolution of the XSD, but the documentation on the web is wrong in a number of places.





Search was broken

20 04 2007

I've been breaking search in a big way over the last week, but it's fixed now (everything crossed).

The test environment is a 2-node cluster, both nodes running flat-out indexing, with ApacheBench hammering the search interface: 5 concurrent users doing 10,000 requests flat out against each cluster node.

It's not a full-blown performance test, but it stresses the environment enough to expose the 1-in-100,000 failure events.

The merging algorithm in 2.3 was a bit dumb: it caused small segments to be created and not merged into larger ones as quickly as would have been nice. This has changed. New index cycles generate new temporary segments; these are merged into the current active segment, and the result is distributed to the cluster. When the size of the active segment exceeds a limit, currently 20M, a new active segment is started.

When there are more than 10 segments, these are merged into 1 larger segment, which is optimised in the process to remove deleted documents and re-order numbering. Every time this happens a new segment is created and distributed to the cluster.

So there is 1 large segment and up to 9 smaller segments. When the large segment reaches a second size limit (1.5G) it is frozen and a new large segment is created. The indexing process will eventually create many frozen segments, which a sysadmin might want to merge.
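
To make the cycle concrete, here is a hedged sketch of the merge policy in Java; the class, fields and method names are illustrative, not the actual Sakai search code, and the limits are the defaults from the description above:

import java.util.ArrayList;
import java.util.List;

public class MergePolicySketch {

    static final long ACTIVE_LIMIT = 20L * 1024 * 1024;    // ~20M active-segment limit
    static final long FREEZE_LIMIT = 1536L * 1024 * 1024;  // ~1.5G frozen-segment limit
    static final int MERGE_AT = 10;                        // merge when more than 10 segments

    static class Segment {
        long bytes;
        boolean frozen;
    }

    private final List<Segment> smallSegments = new ArrayList<Segment>();
    private Segment large = new Segment();
    private Segment active = new Segment();

    // One index cycle: a new temporary segment is folded into the active segment.
    void indexCycle(long tempSegmentBytes) {
        active.bytes += tempSegmentBytes;        // merge temp into active
        distributeToCluster(active);
        if (active.bytes > ACTIVE_LIMIT) {       // retire it, start a new active segment
            smallSegments.add(active);
            active = new Segment();
        }
        if (smallSegments.size() > MERGE_AT) {
            mergeSmallSegments();
        }
    }

    // Merge the small segments into the large one, optimising in the process
    // (deleted documents removed, numbering reordered).
    private void mergeSmallSegments() {
        for (Segment s : smallSegments) {
            large.bytes += s.bytes;
        }
        smallSegments.clear();
        distributeToCluster(large);
        if (large.bytes > FREEZE_LIMIT) {        // freeze and start a new large segment
            large.frozen = true;
            large = new Segment();
        }
    }

    private void distributeToCluster(Segment s) {
        // ship the updated segment to the other cluster nodes (omitted)
    }
}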

The index is now much more compact, containing no content, which means that it grows much more slowly: growth is related to the size of the language being indexed and the number of items, not the size of each item. So although the frozen segments are created by the process, I would expect to see 1 frozen segment for every 10–20M documents (a wild guess, but I know 100,000 documents fit in 200M and the rate of growth is less than linear).

Since the index grows much more slowly now, it's a bit hard to test with the default settings, so these can be adjusted in sakai.properties.

If you want to do a rapid test you can use the following sakai.properties:

soakTest@org.sakaiproject.search.api.SearchIndexBuilderWorker=true
segmentThreshold@clusterSearchStorageImpl=204800
maxSegmentSize@clusterSearchStorageImpl=15728640
maxMegeSegmentSize@clusterSearchStorageImpl=12728640

  • segmentThreshold: the largest size of a new segment
  • maxSegmentSize: the largest size of a merged segment, after which the segment becomes frozen (but is still used)
  • maxMegeSegmentSize: the largest size of a merged segment that new segments are added to
  • soakTest: causes a new "rebuild whole instance" command to be posted to the queue when the number pending has reduced to zero

Also, on the admin page you can now turn diagnostics on and off at runtime, on the node the page is being served from. The diagnostics appear in the log files.





JCRService + WebDAV Working

15 04 2007

I've now got the JCRService integrated into Sakai and working with Jackrabbit's WebDAV library. The JCRService, which exposes the standard JSR-170 API to Sakai, is using a Jackrabbit implementation integrated into Sakai. The WebDAV implementation uses Jackrabbit's own WebDAV libraries, which have support for:

  • RFC 2518 (WebDAV – HTTP Extensions for Distributed Authoring)
  • RFC 3253 (DeltaV – Versioning Extensions to WebDAV)
  • RFC 3648 (Ordered Collections Protocol)
  • RFC 3744 (Access Control Protocol)
  • DAV Searching and Locating (DASL)
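
As a flavour of what the JSR-170 API gives Sakai code, here is a minimal sketch of writing a file node; jcrService.login() is an assumed entry point, not the confirmed Sakai JCRService method:

import java.io.ByteArrayInputStream;
import java.util.Calendar;
import javax.jcr.Node;
import javax.jcr.Session;

// Assumption: the Sakai JCRService hands out a JSR-170 Session like this.
Session session = jcrService.login();
Node root = session.getRootNode();

// Standard JSR-170 node types: an nt:file needs an nt:resource child
// named jcr:content that carries the data.
Node folder = root.addNode("docs", "nt:folder");
Node file = folder.addNode("hello.txt", "nt:file");
Node content = file.addNode("jcr:content", "nt:resource");
content.setProperty("jcr:mimeType", "text/plain");
content.setProperty("jcr:lastModified", Calendar.getInstance());
content.setProperty("jcr:data", new ByteArrayInputStream("Hello JCR".getBytes()));
session.save();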

There is still some work to do in the AccessManager to get all the functions mapped and to set up some sort of default structure, but so far performance looks good: about 1000 items representing 20MB goes up in under 1 minute with some base-level logging turned on. I expect this to drop a bit when the Sakai SecurityService is fully integrated.

[Screenshot: JCR WebDAV working]





Maven 1 XML and DOM Serialization in JDK 1.5

12 04 2007

To serialise a DOM in JAXP while avoiding dependencies on Xerces, there are 2 options. This is for JDK 1.5 and later; prior to that you can just use Xerces.

1. Use an identity transform, e.g.

import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

// document is an existing org.w3c.dom.Document; an identity transform
// copies it unchanged to the given Result.
TransformerFactory fact = TransformerFactory.newInstance();
Transformer idTransform = fact.newTransformer();
Source input = new DOMSource(document);
Result output = new StreamResult(System.out);
idTransform.transform(input, output);

or 2. use DOM Level 3 Load and Save, e.g.

import java.io.FileOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.ls.*;

// Ask the builder's DOMImplementation for its Load and Save feature,
// then serialise the same document from option 1 to a file.
FileOutputStream out = new FileOutputStream(fileName);
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
DOMImplementation impl = builder.getDOMImplementation();
DOMImplementationLS feature = (DOMImplementationLS) impl.getFeature("LS", "3.0");
LSSerializer serializer = feature.createLSSerializer();
LSOutput output = feature.createLSOutput();
output.setByteStream(out);
output.setEncoding("UTF-8");
serializer.write(document, output);

However, when you try to build option 2 with Maven 1, it won't compile, since Maven 1 pulls Xerces 2.4.0 into the root classpath… which is somewhere around JDK 1.2 vintage.

The solution: put

maven.compile.fork=true

in project.properties of the project you need to build; the compiler then forks and uses the JVM classpath without all the Maven 1 jars. Thanks to Josh Holtzman for finding this.

Incidentally, the DOMImplementationRegistry appears to be empty in JDK 5, with no implementations at all.

The first method needs namespace support, so it will fail to serialise if the namespace is not found.