Sling Documentation Annotations

16 11 2009

It has been noticed that documentation that is not kept in the same version control system as the code is frequently not maintained. This leads to users of the interfaces getting increasingly frustrated as nothing appears to work, although, to be fair to the developers, the users may well be looking at out-of-date documentation.

To address this in Sling/Sakai K2, I have just done a first pass at documentation annotations that are discovered at runtime to build a set of documentation for the service. I say service because Sling is OSGi based, and every HTTP service endpoint is implemented as an OSGi service implementing javax.servlet.Servlet. The approach could be used for any service, but I am using it for Sling servlets.

How to use:

First off, all servlets that are active in the system are automatically registered with the documentation system, and if any of them don’t contain documentation there are some gentle reminders to the developer to create it. If no documentation annotations are present, there are some friendly defaults.

Add the documentation annotations to your build.


So, add some annotations to your class:

@SlingServlet(methods = "GET", paths = "/system/doc")
@ServiceDocumentation(name = "DocumentationServlet", description = "Provides auto documentation of servlets registered with OSGi. Documentation will use the "
 + "service registration properties, or annotations if present."
 + " Requests to this servlet take the form /system/doc?p=&lt;classname&gt; where <em>classname</em>"
 + " is the fully qualified name of the class deployed into the OSGi container. If the class is "
 + "not present a 404 will be returned; if the class is present, it will be interrogated to extract "
 + "documentation from the class. In addition to extracting annotation-based documentation the servlet will "
 + "display the OSGi service properties. All documentation is assumed to be HTML encoded. If the browser is "
 + "directed to <a href=\"/system/doc\" >/system/doc</a> a list of all servlets in the system will be displayed ",
 bindings = @ServiceBinding(type = BindingType.PATH, bindings = "/system/doc"),
 methods = {
 @ServiceMethod(name = "GET",
 description = "GETs to this servlet will produce documentation for the class, " +
 "or an index of all servlets.",
 parameters = @ServiceParameter(name = "p",
 description = "The name of the class to display the documentation for")) })

public class DocumentationServlet extends SlingSafeMethodsServlet {

And then rebuild and reload your Servlet.

Finally browse to http://localhost:8080/system/doc to check the documentation.


Obviously, because the documentation is in the code and is deployed with the code, provided the developer keeps it current with the code they are editing, the documentation will be correct. So the next $64K question is: “How do you make developers document what they do?”

Note to self: JcrResourceResolver2, selectors and extensions

13 11 2009

This really is a note to myself, as I have a habit of forgetting this and spending ages debugging.

In JcrResourceResolver2.resolveInternal there is a loop that attempts to resolve a URI by selectively stripping segments off the last path element, using a ‘.’ as a separator. When a resource resolves, the section of the path that resolved is used as the resource path, and the remainder is used as the resource path info. The selectors and extension are explicitly parsed out of the resource path info, ignoring anything that the ResourceProvider might have done. It is therefore vital not to attempt to process a path within a ResourceProvider, as the path used to determine the resource path and the path info is local to resolveInternal and not influenced by anything that the ResourceProvider might choose to do.
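The stripping loop can be sketched as follows. This is a hypothetical, simplified illustration of the sequence of candidate paths the resolver tries, not the actual JcrResourceResolver2 code; the class and method names are mine:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the candidate-stripping loop: the resolver repeatedly
// drops the last dot-separated segment of the final path element
// until a resource is found. Whatever was stripped becomes the
// resource path info, from which selectors and the extension are
// later parsed. Hypothetical illustration only.
public class CandidatePaths {

    public static List<String> candidates(String uri) {
        List<String> result = new ArrayList<>();
        String current = uri;
        result.add(current);
        // Only strip dots that fall within the last path element.
        int slash = current.lastIndexOf('/');
        int dot;
        while ((dot = current.lastIndexOf('.')) > slash) {
            current = current.substring(0, dot);
            result.add(current);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String c : candidates("/content/page.print.a4.html")) {
            System.out.println(c);
        }
        // prints:
        // /content/page.print.a4.html
        // /content/page.print.a4
        // /content/page.print
        // /content/page
    }
}
```

If "/content/page" is the candidate that resolves, the remainder ".print.a4.html" becomes the path info, yielding selectors "print" and "a4" and extension "html" — all computed inside resolveInternal, regardless of the ResourceProvider.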

There are one or two special cases. If no resource is found, the last element is processed in its entirety before a NonExistingResource is created, so anything that gets involved in the ResourceProvider process should be very careful about creating synthetic resources, since unless a convention is followed the resource path can’t be determined from the URL alone. Take the example of a virtual resource: since it is virtual, and the resolution process is abstract and delayed until after resolution, nothing is known about the ultimate target. The convention we follow is that the last element can’t contain ‘.’. I think this is going to cause problems and result in moving the whole resolution process inside the resolveInternal call tree, putting any code in a modified Sling JCR Resource bundle. Unlike previous attempts this will not require patching the code base, just repackaging, unless I can work out a sensible patch that avoids recursive resolution.

Sorry if that was very boring; as I said in the title, note to self.

Declarative optional multiple references flaky in OSGi

12 11 2009

It looks like binding multiple optional references in OSGi is flaky, at least with Felix 1.2.0. Uhh, what does that mean?

AFAICT, an annotation like

@Reference(name = "virtualResourceType",
 cardinality = ReferenceCardinality.OPTIONAL_MULTIPLE,
 referenceInterface = VirtualResourceType.class)


public void bindVirtualResourceType(VirtualResourceType virtualResourceType) {
  LOGGER.info("Bound " + virtualResourceType);
}

public void unbindVirtualResourceType(VirtualResourceType virtualResourceType) {
  LOGGER.info("UnBound " + virtualResourceType);
}

Only binds some of the time on reload, but unbind works every time.

I have a feeling that the code below will work, though I have no idea why:

protected void bindVirtualResourceType(ServiceReference reference) {
   VirtualResourceType virtualResourceType = (VirtualResourceType) this.componentContext.locateService(
       "VirtualResourceType", reference);
   if ( virtualResourceType != null ) {
      LOGGER.info("=====================BOUND VIRTUAL RESOURCE TYPE {}===============================",
          virtualResourceType.getResourceType());
      virtualResourceTypes.put(virtualResourceType.getResourceType(), virtualResourceType);
   } else {
      LOGGER.info("=====================Failed to find BOUND VIRTUAL RESOURCE TYPE {}===============================",
          reference);
   }
}


The annotation should have been

@Reference(name = "virtualResourceType",
cardinality = ReferenceCardinality.OPTIONAL_MULTIPLE,
referenceInterface = VirtualResourceType.class,
policy = ReferencePolicy.DYNAMIC)

and then the bind and unbind can be

protected void bindVirtualResourceType(VirtualResourceType virtualResourceType) {
    virtualResourceTypes.put(virtualResourceType.getResourceType(), virtualResourceType);
}

protected void unbindVirtualResourceType(VirtualResourceType virtualResourceType) {
    virtualResourceTypes.remove(virtualResourceType.getResourceType());
}
(I am an idiot!)

Sling Runtime Logging Config

10 11 2009

One of the most annoying things about bug hunting in open source is that you can see the developer left log.debug( statements for you in the code, but you have to shut down, reconfigure logging, and restart. In Apache Sling this isn’t the case. According to the documentation, you can configure logging at runtime. AND you can configure it on a class-by-class basis. Here are some screenshots of how.

Go to the admin console, select the configuration tab.

Picture 6

Select the logging.config factory from the Configuration Factories drop-down,

Picture 7

Set the properties including the package or class that you want this logging config to apply to, and save.

Picture 8
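For reference, the same settings end up as an OSGi factory configuration. The property names below are from the Sling Commons Log bundle and may differ between versions; the file path and logger name are just examples:

```
# Factory PID: org.apache.sling.commons.log.LogManager.factory.config
# Log level for the loggers named below (DEBUG, INFO, WARN, ERROR)
org.apache.sling.commons.log.level = DEBUG
# File the log output is written to
org.apache.sling.commons.log.file = logs/resourceresolver.log
# Logger (class or package) names this config applies to
org.apache.sling.commons.log.names = org.apache.sling.jcr.resource
```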



As I mentioned, you can do all of this on a running instance; no need to shut down.

Sling and OSGi in Oakland

4 11 2009

Over here in Oakland there has been a lot of interest in two areas: firstly NoSQL storage, and secondly OSGi-based platforms. The NoSQL platforms are of local interest since anyone thinking of creating a business that will become profitable in social media has to think about huge numbers of users to have any chance of converting the revenue from page views into real cash. They have to start from a scaling viewpoint. That doesn’t mean they have to go out and spend whatever meager funding they just raised on massive hardware; it means that they have to think in a way that scales. Doing this, they start in a way that scales for the first 10 users, and then, as the numbers ramp up faster than they can provision systems or install software, they at least stand some chance of keeping up. Right at the back end, doing this with traditional RDBMSs is complete nonsense. OK, so you might be able to build a MySQL cluster in multi-master mode to handle X users, but at some point you are going to run out of ability to add more, and you won’t get to 10X or 100X, and by the way, break-even was at 1000X. To me that’s the case for NoSQL: eventual consistency and a parallel architecture where scale-up is almost 100%.

This makes me laugh. Back in 1992, having parallelised many scientific codes with what felt like real human benefits, Monte Carlo simulations for brain radiotherapy, early versions of GROWMOS and MNDO, protein folding codes, and an algebraic multigrid CFD code used for predicting the spread of fires in tube stations, and some military applications, we never saw this level of speedup. Perhaps the problems were just not grand challenge enough… and social media is…. But on the serious side, thinking of the app as a massively parallel app from the start creates opportunities to have all the data already distributed and available for algorithmic discovery from the start. Not surprising the Hadoop sessions were the largest, even if some of the analysis was on the dark side of the internet.

The other strand at ApacheCon that grabbed interest was OSGi: small components, plugging into standardized containers, loaded at runtime. In academia the grand challenge problems are those of the digital libraries, with the responsibility to preserve information for hundreds of years: researchers’ datasets that may contain the essence of a future discovery. There must be a duty to ensure that this information is stored in such a way as to allow analysis to be performed. We have to think of the storage as a massively parallel machine, the cloud. Then we have to think of the mechanism for enabling future analysis. Using OSGi as the component model and storing data in the cloud opens these possibilities up. I’ve heard Fedora Commons (that’s the digital library project, not the Linux distro) and DSpace are thinking this way: adopting OSGi as a component model, thinking of cloud storage for the data.

Confusing, but logical ItemExistsException

3 11 2009

In Jackrabbit, if a session does not have permission to read an item in the repository, an AccessDeniedException is thrown. In Sling this appears as a 404 over HTTP, which makes perfect sense (until I start to think about it). However, if you suspect the item really does exist, you can try to modify it. The result is an ItemExistsException at the JCR layer, confirming that the AccessDeniedException on read was correct: the item exists but you can’t write to it. What is confusing is that session.itemExists() returns false and Sling gives a 404, both trying to hide the information, but it’s all too easy to use the update operation to determine whether the information isn’t there or you simply don’t have read access to it.

An example exception is

/private/9b/ba/25/71/user1_1257266276 is [[/_user/private/9b/ba/25/71/user1_1257266276/rep:policy/allow][false][user1-1257266276], [/_user/private/9b/ba/25/71/user1_1257266276/rep:policy/deny0][true][everyone], [/rep:policy/allow0][true][everyone]] 
03.11.2009 08:37:57.201 *INFO* [ [1257266276921] POST /_user/private/GetAllProfilesTest1257266276.html HTTP/1.1] org.sakaiproject.kernel.resource.AbstractPathResourceTypeProvider  /_user/private/9b/ba/25/71/user1_1257266276/GetAllProfilesTest1257266276 is a virtual file, base is /_user/private
  03.11.2009 08:37:57.207 *ERROR* [ [1257266276921] POST /_user/private/GetAllProfilesTest1257266276.html HTTP/1.1]
Exception during response processing. javax.jcr.ItemExistsException: /_user/private/9b/ba/25/71/user1_1257266276
at org.apache.jackrabbit.core.NodeImpl.internalAddChildNode(
at org.apache.jackrabbit.core.NodeImpl.internalAddNode(
at org.apache.jackrabbit.core.NodeImpl.internalAddNode(
at org.apache.jackrabbit.core.NodeImpl.addNode(

There are two implications here.

  • If using the if ( !session.itemExists(path) ) { …. session.addNode(path) … } pattern, you should expect an ItemExistsException to be thrown and handle it appropriately.
  • If the aim of a 404 is to hide the existence of information, then it doesn’t work; perhaps it really should be a 403 every time, since it’s easy enough to bypass the 404, and emitting a 404 implies to client code that it can create a node that doesn’t exist.
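The first bullet can be sketched as follows. FakeSession and the local ItemExistsException are stand-ins for the javax.jcr API (Session and its checked ItemExistsException), used only to make the control flow runnable; in real code you would catch javax.jcr.ItemExistsException around Node.addNode():

```java
// Illustrates the pattern above: even after an itemExists() check
// returns false, addNode() may still throw ItemExistsException when
// the item exists but is hidden from this session by access control.
// FakeSession is a stand-in for javax.jcr.Session.
class ItemExistsException extends Exception {}

class FakeSession {
    // Simulates a node that exists but is unreadable to this session.
    private final boolean hiddenNodePresent;

    FakeSession(boolean hiddenNodePresent) {
        this.hiddenNodePresent = hiddenNodePresent;
    }

    boolean itemExists(String path) {
        // Access control hides the node, so this reports false
        // whether or not the node actually exists.
        return false;
    }

    void addNode(String path) throws ItemExistsException {
        if (hiddenNodePresent) {
            throw new ItemExistsException();
        }
    }
}

public class SafeAddNode {
    // Returns true if the node was created, false if it turned out
    // to already exist (possibly unreadable to this session).
    public static boolean addIfAbsent(FakeSession session, String path) {
        if (!session.itemExists(path)) {
            try {
                session.addNode(path);
                return true;
            } catch (ItemExistsException e) {
                // The check said "no", but the node exists and we
                // simply cannot read it: handle, don't propagate.
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(addIfAbsent(new FakeSession(false), "/a")); // true
        System.out.println(addIfAbsent(new FakeSession(true), "/a"));  // false
    }
}
```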