New Programming Language

5 02 2010

My new programming language, one that always compiles, never has bugs, has perfect style and is generally delivered on time (all IMHO), is English. Developers must have a screw loose. Generally they refuse to write anything down, and often they say the documentation is in the code; and yet most of their leisure time is taken up refactoring, rewriting and perfecting that algorithm that started out as a simple sort and is now drinking credit card limits on a cluster of Amazon nodes. Meanwhile, those crafty non-developer types lean back and claim victory with a page of prose that no compiler can even start to understand, and yet they are the ones living it up with deadlines met. Now, we do do our best to ensure that doesn’t happen, but there is a lesson to be learnt here.

Although 90% of those that write code for a living don’t do documentation, and certainly don’t do it in advance, it’s far quicker and easier to experiment with ideas in plain English (oops, forgot, that should be _i18n_native_tongue_). I won’t go on. Suffice it to say I’ve been trying out spec before implementation, and found that on many occasions nuances that I can see would have appeared in the implementation have appeared in the spec instead, but the cost of correction has been significantly lower, and my personal dialect of English has passed my patched English compiler most of the time. Others’ substandard copies have problems. The biggest thing I have noticed with this approach is that I haven’t wasted hours debugging what I would eventually throw away.

It’s interesting that the software industry does so little of this. My first vocation in life was as a design engineer: mechanical, high vacuum. There was no point in taking a stock bar out of the stores, putting it into a lathe and hoping to machine off the right amount to make a stub shaft. In fact, with high vacuum and often high voltage, that was a positively dangerous pastime. Perhaps 200 years ago, at the start of the industrial revolution, or today if creating a work of art, but for engineering the physical material is always the final part of the process. As software engineering, still a young profession, evolves, perhaps we will see more design and works of art and less metal bashing.





In the Zone

2 02 2010

I have been doing an experiment for the past 3 months. I work in a busy office, open plan and quite noisy at times. There are many projects running in the office, probably about 20 at any one time, with a mixture of management, creatives and engineers. The thinking goes that with an open plan office, where everyone can hear everything, there is cross-fertilization of thoughts and information between projects, and the normal process of active management of projects is not necessary because everything is visible and no one ever gets the chance to bury themselves in a hole for months. Well, there is a free flow of information between individuals, and being within the University of Cambridge, free flow of thoughts is what we stand for. We’ve just spent a whole year shouting about 800 years of Cambridge thoughts changing the world (very arrogant, but that’s also a Cambridge trait). So is this free flow of thoughts good?

Several years ago, when I was still allowed to do sysadmin (a “joy” I am no longer considered safe to tackle), I was migrating an active CVS repository to Subversion, the new and better source control system that didn’t force you to lock before edit. I was sitting there at my keyboard, trying to get it done before a meeting, while a developer had taken the enforced downtime to continue that esoteric argument about memory consumption of byte arrays in C++ versus Java with a manager or co-developer, I can’t remember which. “^Rcvs” goes the keyboard, hoping to retrieve “find . -name '*cvs*'” for the zillionth time; then, as I hit the enter key, I hear something just so outrageous I have to turn around and join in. Meanwhile “rm -rf /repo/cvsmain” popped up from bash history and diligently got to work. Needless to say, the developer whose work was in that repo cornered me at twilight, and I never did sysadmin again. Fortunately there was no period in hospital. If you hate doing sysadmin, that’s a sure-fire way to get out of it.

There is a reason a Study is called a Study, and the most common utterance spoken in a library is “shhh”. To get complex things into your head takes time free from distractions. The sorry tale above was a simple case, and the volume of information very low, but the conversation and level of bullshit were so tempting I couldn’t resist, and bang, I wasn’t thinking about what I was doing; my mind was blank and empty, just like the disk space 5 minutes later. Never again will I try to mix bullshit with the command line.

So three months ago, I realised that most of my coding and complex work was done after 8 in the evening, or between 5 and 8 in the morning if I could not sleep. Time for an experiment: what will happen to my working hours if I move out and zone in? I upped camp and replugged the IP phone into an empty office down the corridor. By 10 I have cleared email, loaded my head with everything that was there yesterday and verified that dementia has not set in, yet, and I find I can get a good hour’s productive output in before the 15 second question brings it all to a grinding halt for the next hour. I looked at my timesheets last week (yes, even in a University we have timesheets, there is no escape) and discovered that since moving to a quiet “study” environment, my hours have dropped from averaging 60+h to a far more reasonable 40h per week. I am almost certain that quality has gone up. I am working on a new version of the Sakai backend; looking at the stats, we have resolved close to 500 issues, just did a 0.2 release and have 6 open bugs, with close to 90% test and integration test coverage.

I know plenty of other engineers have reported the same phenomenon, often to deaf ears, but I can’t help thinking that the change has something to do with it. So if you manage a mixed team of engineers and creatives, try an experiment: create a space for communication, but put those that need to study to get the job done in an environment where the most common utterance is “shhh” and let them get on with what was asked. If they fall asleep, kick them, but quietly. If it doesn’t work, shout at them, but take them outside first, please.





Incremental Integration testing with Sling

3 12 2009

I keep on forgetting how to do this.

cd launchpad/testing
mvn clean install
mvn jetty:run 1> run.log 2>&1 &
# wait for startup to finish
mvn test -Dtest=**/integrationtest/**/*Test.java

You can leave the server running, redeploy bundles and re-run specific tests, avoiding a full rebuild.





Smart Meters, I don’t get it.

3 12 2009

In the UK there has just been an announcement that every house will have a smart meter to monitor home energy use. Fantastic, at least if we want to reduce our consumption at home we can. But hold on a minute: rollout is going to take over 10 years, it’s going to cost £6.8Bn, and it’s only expected to result in a 10% saving in the home. I don’t get it. Who pays? Apparently the home owner: £330 to get it installed, saving £26 per annum off the average bill. Since most silicon-based devices in constant use have an MTBF of < 10 years, the cost will never be recovered. I don’t get it. Ahh, it’s to reduce our carbon footprint. I wonder how much extra carbon footprint £6.8Bn of expenditure equates to; all economic activity has an impact (other than planting trees on a commercial scale and using the wood for buildings that last 200 years). If people really want to make an impact they need to do less, consume less, and keep things simple. Call me cynical, but the smart meter initiative sounds like a stitch-up between government and industry: industry to create a huge new market for something that didn’t exist previously, government to find a new way of avoiding building sufficient green energy plants, nuclear included, to meet peaks of demand, and hence raising taxes. I wonder how many MW of green energy you could provision for £6.8Bn; if it was nuclear, probably 2 according to http://en.wikipedia.org/wiki/Economics_of_new_nuclear_power_plants; if it was wind, it’s much harder to tell. That’s why I don’t get it.
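
The payback claim can be checked with a back-of-envelope calculation. The sketch below uses only the two figures quoted in the post (£330 installation, £26 annual saving) and ignores interest, energy price changes and replacement costs; the class and method names are made up for illustration.

```java
// Back-of-envelope payback check using only the figures quoted in the post.
class SmartMeterPayback {

    /** Years needed for the annual saving to repay the up-front cost. */
    static double paybackYears(double installCost, double annualSaving) {
        return installCost / annualSaving;
    }

    public static void main(String[] args) {
        double years = paybackYears(330.0, 26.0);
        System.out.println("Payback period in years: " + years);
        // A device expected to fail within 10 years never reaches break-even.
        System.out.println("Recovered within 10 years? " + (years <= 10.0));
    }
}
```

At £26 a year the meter takes roughly 12.7 years to pay for itself, which is longer than the 10 year life assumed above.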





Sling Documentation Annotations

16 11 2009

It’s been noticed that documentation that is not in the same version control system as the code is frequently not maintained. This leads to the users of the interfaces getting increasingly frustrated as nothing appears to work, although, to be fair to the developers, the users may well be looking at out-of-date documentation.

To address this in Sling/Sakai K2 I have just done a first pass at documentation annotations that are discovered at runtime to build a set of documentation for the Service. I say service because Sling is OSGi based, and every HTTP service end point is implemented as an OSGi service implementing javax.servlet.Servlet. The approach could be used for any service, but I am using it for Sling Servlets.
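
As a rough illustration of what "discovered at runtime" means, here is a minimal sketch using plain Java reflection. The @Doc annotation and the classes below are hypothetical stand-ins, not the Sakai K2 annotation types, and the real implementation also draws on OSGi service registration properties, which this sketch leaves out.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class AnnotationDocSketch {

    /** Hypothetical documentation annotation; not the Sakai K2 type. */
    @Retention(RetentionPolicy.RUNTIME) // RUNTIME retention is what makes reflection possible
    @Target(ElementType.TYPE)
    @interface Doc {
        String name();
        String description();
    }

    @Doc(name = "HelloServlet", description = "Says hello.")
    static class Documented { }

    static class Undocumented { }

    /** Builds a one-line description, with a friendly default when no annotation is present. */
    static String describe(Class<?> type) {
        Doc doc = type.getAnnotation(Doc.class);
        if (doc == null) {
            return type.getName() + ": no documentation available";
        }
        return doc.name() + ": " + doc.description();
    }

    public static void main(String[] args) {
        System.out.println(describe(Documented.class));
        System.out.println(describe(Undocumented.class));
    }
}
```

The important detail is the RUNTIME retention policy: without it the annotation is dropped before the running system can read it.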

How to use:

First off, all Servlets that are active in the system are automatically registered with the documentation system, and if any of them don’t contain documentation there are some gentle reminders to the developer to create the documentation. If there are no documentation annotations present there are some friendly defaults.

Add the documentation annotations to your build:

<dependency>
 <groupId>org.sakaiproject.kernel</groupId>
 <artifactId>org.sakaiproject.kernel.doc</artifactId>
 <version>0.1-SNAPSHOT</version>
</dependency>

So, add some annotations to your class:

@SlingServlet(methods = "GET", paths = "/system/doc")
@ServiceDocumentation(name = "DocumentationServlet", description = "Provides auto documentation of servlets registered with OSGi. Documentation will use the "
 + "service registration properties, or annotations if present."
 + " Requests to this servlet take the form /system/doc?p=&lt;classname&gt; where <em>classname</em>"
 + " is the fully qualified name of the class deployed into the OSGi container. If the class is "
 + "not present a 404 will be returned, if the class is present, it will be interrogated to extract "
 + "documentation from the class. In addition to extracting annotation based documentation the servlet will "
 + "display the OSGi service properties. All documentation is assumed to be HTML encoded. If the browser is "
 + "directed to <a href=\"/system/doc\" >/system/doc</a> a list of all servlets in the system will be displayed ",
 bindings = @ServiceBinding(type = BindingType.PATH, bindings = "/system/doc"),
 methods = {
 @ServiceMethod(name = "GET",
 description = "GETs to this servlet will produce documentation for the class, " +
 "or an index of all servlets.",
 parameters = @ServiceParameter(name = "p",
 description = "The name of the class to display the documentation for")) })

public class DocumentationServlet extends SlingSafeMethodsServlet {

And then rebuild and reload your Servlet.

Finally browse to http://localhost:8080/system/doc to check the documentation.

 

Obviously, because the documentation is in the code and it’s deployed with the code, provided the developer keeps it current with the code they are editing, the documentation will be correct. So the next $64K question is “How to make developers document what they do?”





Note to self: JcrResourceResolver2, selectors and extensions

13 11 2009

This really is a note to myself, as I have a habit of forgetting this and spending ages debugging.

In JcrResourceResolver2.resolveInternal there is a loop that attempts to resolve a URI by selectively stripping off the segments of the last element using a . as a separator. When a resource resolves, the section of the path that resolves is used as the resource path, and the remainder is used as the resource path info. The selectors and extensions are explicitly parsed out of the resource path info, ignoring anything that the ResourceProvider might have done. It is therefore vital not to attempt to process a path within a ResourceProvider, as the path used to determine the resource path and the path info is local to resolveInternal and not influenced by anything that the ResourceProvider might choose to do.
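
To make the loop easier to picture, here is a simplified sketch of the idea. It is not the Sling source: the class and method names are hypothetical, the repository is stood in for by a plain set of paths, and it assumes the longest candidate is tried first.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ResolveSketch {

    /**
     * Tries the full URI first, then strips dot-separated segments from the
     * last path element until something resolves.
     * Returns {resourcePath, pathInfo}, or null when nothing resolves.
     */
    static String[] resolve(String uri, Set<String> existingPaths) {
        int lastSlash = uri.lastIndexOf('/');
        String candidate = uri;
        while (true) {
            if (existingPaths.contains(candidate)) {
                // Whatever was stripped off becomes the resource path info.
                return new String[] { candidate, uri.substring(candidate.length()) };
            }
            int dot = candidate.lastIndexOf('.');
            if (dot <= lastSlash) {
                return null; // no dots left in the last element
            }
            candidate = candidate.substring(0, dot);
        }
    }

    public static void main(String[] args) {
        Set<String> repo = new HashSet<>(Arrays.asList("/content/page"));
        String[] result = resolve("/content/page.print.a4.html", repo);
        System.out.println(result[0] + " | " + result[1]);
    }
}
```

In this sketch the path info for /content/page.print.a4.html is .print.a4.html, derived purely from the URI and the point at which resolution succeeded. That is why anything a ResourceProvider does to the path has no effect on the selectors and extension.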

There are one or two special cases. If no resource is found, the last element is processed in its entirety before a NonExistingResource is created, and hence anything that gets involved in the ResourceProvider process should be very careful about creating synthetic resources, since unless a convention is followed the resource path can’t be determined from the URL alone. Take the example of a virtual resource. Since it’s virtual and the resolution process is abstract, delayed until after resolution, nothing is known about the ultimate target. The convention we follow is that the last element can’t contain ‘.’. I think this is going to cause problems and result in moving the whole resolution process inside the resolveInternal call tree, putting any code in a modified Sling JCR Resource Bundle. Unlike previous attempts this will not require patching the code base, just repackaging, unless I can work out a sensible patch that avoids recursive resolution.

Sorry if that was very boring; as I said in the title, Note to self.





Declarative optional multiple references flaky in OSGi

12 11 2009

It looks like binding multiple optional references in OSGi is flaky at least with Felix 1.2.0. Uhh what does that mean?

AFAICT, an annotation like

@Reference(name="virtualResourceType",
 cardinality = ReferenceCardinality.OPTIONAL_MULTIPLE,
 referenceInterface = VirtualResourceType.class)

with

public void bindVirtualResourceType(VirtualResourceType virtualResourceType) {
    log.info("Bound " + virtualResourceType);
    store.put(virtualResourceType, virtualResourceType);
}
public void unbindVirtualResourceType(VirtualResourceType virtualResourceType) {
    log.info("UnBound " + virtualResourceType);
    store.remove(virtualResourceType);
}

Only binds some of the time on reload, but unbind works every time.

I have a feeling that the version below will work, no idea why.

protected void bindVirtualResourceType(ServiceReference reference) {
   VirtualResourceType virtualResourceType =  (VirtualResourceType) this.componentContext.locateService(
       "VirtualResourceType", reference);
   if ( virtualResourceType != null ) {
      LOGGER.info("=====================BOUND VIRTUAL RESOURCE TYPE{}===============================",virtualResourceType.getResourceType());
      virtualResourceTypes.put(virtualResourceType.getResourceType(), virtualResourceType);
   } else {
      LOGGER.info("=====================Failed to find BOUND VIRTUAL RESOURCE TYPE{}===============================",reference);
   }
}

Update:

The annotation should have been

@Reference(name = "virtualResourceType",
    cardinality = ReferenceCardinality.OPTIONAL_MULTIPLE,
    referenceInterface = VirtualResourceType.class,
    policy = ReferencePolicy.DYNAMIC)

and then the bind and unbind can be

protected void bindVirtualResourceType(VirtualResourceType virtualResourceType) {
    virtualResourceTypes.put(virtualResourceType.getResourceType(), virtualResourceType);
}

protected void unbindVirtualResourceType(VirtualResourceType virtualResourceType) {
    virtualResourceTypes.remove(virtualResourceType.getResourceType());
}

(I am an idiot!)





Sling Runtime Logging Config

10 11 2009

One of the most annoying things about bug hunting in open source is that you can see that the developer left log.debug() statements for you in the code, but you have to shut down, reconfigure logging and restart. In Apache Sling this isn’t the case. According to the documentation at http://sling.apache.org/site/logging.html you can configure logging at runtime, AND you can configure it on a class by class basis. Here are some screenshots of how.

Go to the admin console, select the configuration tab.

Picture 6

From the Configuration Factories drop-down, select the logging.config factory.

Picture 7

Set the properties including the package or class that you want this logging config to apply to, and save.

Picture 8

 

 

As I mentioned, you can do all of this on a running instance, with no need to shut down.





Confusing, but logical ItemExistsException

3 11 2009

In Jackrabbit, if a session does not have permission to read an Item in the repository, an AccessDeniedException is thrown. In Sling this appears as a 404 at the HTTP layer, which makes perfect sense (until I start to think about it). However, if you suspect the item really does exist, you can try to modify it. The result is an ItemExistsException at the JCR layer, confirming that the AccessDeniedException on read was correct: the item exists but you can’t write to it. What is confusing is that session.itemExists() returns false and Sling gives a 404, both trying to hide the information, but it’s all too easy to use the update operation to determine whether the information isn’t there or you just don’t have read access to it.

An example exception is


/private/9b/ba/25/71/user1_1257266276 is [[/_user/private/9b/ba/25/71/user1_1257266276/rep:policy/allow][false][user1-1257266276], [/_user/private/9b/ba/25/71/user1_1257266276/rep:policy/deny0][true][everyone], [/rep:policy/allow0][true][everyone]] 
allows(-wPR-)denies(-----)allowPrivileges(-wPRC)denyPrivileges(-----)parentAllows(-----)parentDenies(-----)[/_user/private/9b/ba/25/71/user1_1257266276/rep:policy/allow][false][user1-1257266276],LocalAllow 
allows(-wPR-)denies(r----)allowPrivileges(-wPRC)denyPrivileges(r----)parentAllows(-----)parentDenies(-----)[/_user/private/9b/ba/25/71/user1_1257266276/rep:policy/deny0][true][everyone],LocalDeny 
allows(-wPR-)denies(r----)allowPrivileges(-wPRC)denyPrivileges(r----)parentAllows(r----)parentDenies(-----)[/rep:policy/allow0][true][everyone],NotLocalAllow 
allows(-wPR-)denies(r----)allowPrivileges(-wPRC)denyPrivileges(r----)parentAllows(r----)parentDenies(-----)[/rep:policy/allow0][true][everyone],LocalAllow 
03.11.2009 08:37:57.201 *INFO* [127.0.0.1 [1257266276921] POST /_user/private/GetAllProfilesTest1257266276.html HTTP/1.1] org.sakaiproject.kernel.resource.AbstractPathResourceTypeProvider  /_user/private/9b/ba/25/71/user1_1257266276/GetAllProfilesTest1257266276 is a virtual file, base is /_user/private  
03.11.2009 08:37:57.207 *ERROR* [127.0.0.1 [1257266276921] POST /_user/private/GetAllProfilesTest1257266276.html HTTP/1.1] org.apache.sling.servlets.post.impl.operations.ModifyOperation Exception during response processing.
javax.jcr.ItemExistsException: /_user/private/9b/ba/25/71/user1_1257266276 
at org.apache.jackrabbit.core.NodeImpl.internalAddChildNode(NodeImpl.java:777) 
at org.apache.jackrabbit.core.NodeImpl.internalAddNode(NodeImpl.java:729) 
at org.apache.jackrabbit.core.NodeImpl.internalAddNode(NodeImpl.java:677) 
at org.apache.jackrabbit.core.NodeImpl.addNode(NodeImpl.java:2093) 
at org.apache.sling.servlets.post.impl.operations.ModifyOperation.deepGetOrCreateNode(ModifyOperation.java:709) 
at org.apache.sling.servlets.post.impl.operations.ModifyOperation.processCreate(ModifyOperation.java:216) 
at org.apache.sling.servlets.post.impl.operations.ModifyOperation.doRun(ModifyOperation.java:89) 
at org.apache.sling.servlets.post.AbstractSlingPostOperation.run(AbstractSlingPostOperation.java:87) 
at org.apache.sling.servlets.post.impl.SlingPostServlet.doPost(SlingPostServlet.java:173) 
at org.apache.sling.api.servlets.SlingAllMethodsServlet.mayService(SlingAllMethodsServlet.java:143) 
at org.apache.sling.api.servlets.SlingSafeMethodsServlet.service(SlingSafeMethodsServlet.java:338) 
at org.apache.sling.api.servlets.SlingSafeMethodsServlet.service(SlingSafeMethodsServlet.java:369)

There are 2 implications here.

  • If using the if ( !session.itemExists(path) ) {  … session.addNode(path) … } pattern, you should expect an ItemExistsException to be thrown and handle it appropriately.
  • If the aim of a 404 is to hide the existence of information, then it doesn’t work. Perhaps it really should be a 403 every time, since it’s easy enough to bypass the 404, and emitting a 404 implies that client code can create a node that doesn’t exist.
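
The first implication can be sketched with a toy model. Everything below is hypothetical: ToySession and the nested ItemExistsException only mimic the behaviour described above and are not the javax.jcr or Jackrabbit classes, which would need a running repository.

```java
import java.util.HashSet;
import java.util.Set;

class HiddenItemSketch {

    /** Hypothetical stand-in for javax.jcr.ItemExistsException. */
    static class ItemExistsException extends Exception {
        ItemExistsException(String path) { super(path); }
    }

    /** Toy session: some items exist but cannot be read by this session. */
    static class ToySession {
        private final Set<String> readable = new HashSet<>();
        private final Set<String> hidden = new HashSet<>();

        void addHidden(String path) { hidden.add(path); }

        /** Reports false for hidden items, as described in the post. */
        boolean itemExists(String path) { return readable.contains(path); }

        void addNode(String path) throws ItemExistsException {
            if (readable.contains(path) || hidden.contains(path)) {
                throw new ItemExistsException(path);
            }
            readable.add(path);
        }
    }

    enum Outcome { CREATED, ALREADY_VISIBLE, EXISTS_BUT_HIDDEN }

    /** Check-then-add, treating the exception as an expected outcome rather than a failure. */
    static Outcome createIfMissing(ToySession session, String path) {
        if (session.itemExists(path)) {
            return Outcome.ALREADY_VISIBLE;
        }
        try {
            session.addNode(path);
            return Outcome.CREATED;
        } catch (ItemExistsException e) {
            // itemExists() said no, but the item is there: we lack read access.
            return Outcome.EXISTS_BUT_HIDDEN;
        }
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();
        session.addHidden("/_user/private/someone");
        System.out.println(createIfMissing(session, "/_user/private/someone"));
        System.out.println(createIfMissing(session, "/_user/public/new"));
    }
}
```

The same sketch also shows the second implication: the EXISTS_BUT_HIDDEN outcome is exactly the information the 404 was supposed to conceal.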




Clouds are search based, humans are not.

10 10 2009

Spending 4 hours in a car driving to Oxford to give a presentation at OSS Watch gave me an opportunity to think. Perhaps getting mildly lost in Milton Keynes on the way back consolidated my thoughts. For those that don’t live in Europe, Milton Keynes is a new town built in the 1960s, consuming a village of the same name. It’s laid out on a grid pattern like many US cities, rather alien to Europeans, who have become used to winding roads that promise the reward of a destination. “The Great North Road” goes north-south, and for the authorities in London it took them to the Great North. If named by those in Newcastle it might have been called “The Crowded South Road” to discourage any brain drain. But the interesting thing that struck me as I pulled over to search on Google Maps for just where “H5” went in Milton Keynes (H5 is the name of one of the grid roads) was that humans are unable to make sense of large amounts of unfamiliar information. For the average European (inhabitants of Milton Keynes excluded), the grid pattern of Milton Keynes with its symbolic naming of the major arteries is confusing, just as the winding roads of Europe with ancient names and strange numbers like “A1” must feel like a trip along the blood vessels of some strange animal for the average US city dweller. But even then we are all given a frame of reference or a language that enables our small brains to navigate this space. In the UK, the road names provide us a way: if you follow “Cambridge Road” out of the east end of London, you stand some chance of ending up in Cambridge; before urban sprawl that chance was a certainty. Imagine a world where there were no maps, and no visibility beyond the end of your nose, except a device. That device allowed you to say where you wanted to go, and it would go out into the cloud of information and tell you the way. This is the world of search and the cloud.

The compartmentalisation and ordering has been abstracted to such an extent that all containers are removed and everything exists within a massive amorphous cloud. We have developed highly efficient tools to locate information within that cloud, eliminating all need to pre-categorise anything. But are we missing something? We are humans after all, and we have become adept at sharing and communicating by compartmentalizing what is important. We talk of main roads, autobahns, highways, interstates and know that although there are smaller, less travelled routes to one side, we could take a detour, follow our noses, make discoveries and likely get back on the highway at the next junction. The trigger might be a signpost tempting us off the trunk route. Cloud and search do not really provide us with this structure, and the point of this post is that when you try to interface a compartmentalized or hierarchical mechanism with a search-based cloud system it generates tension.

I and a team at Cambridge have been arguing about the UX issues surrounding file storage. There is a desire to create a cloud-based storage system where users throw files into a central pot of information. There is no structure in the organization of the information, although users can retrieve a file by its URI. There is nothing complicated or difficult about this. The URI is totally meaningless, like a Tiny URL before you follow it, but the file has metadata attached that makes it possible to find it by search (e.g. tags). This enables us to create multiple views of the information. Obviously free text search is also available. On one side of the argument there is the opinion that these views should be a single level deep and there should be no hierarchy, e.g. /tags/caesium-137 would return all files tagged with caesium-137. In the middle there is the view that humans need structure and so we should allow the views to have some taxonomy, e.g. /physics/fission/caesium-137, and on the other side there is a view that the meaningless URI should have meaning as well.
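
The first two positions can be sketched as views over the same flat store. This is only an illustration: the class, the opaque IDs and the decision to treat a taxonomy path as the intersection of its segments' tags are all assumptions, not how any system discussed here actually works.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TagViewSketch {

    // tag -> opaque file IDs; the IDs carry no meaning, like the URIs in the post
    private final Map<String, Set<String>> idsByTag = new HashMap<>();

    void store(String opaqueId, String... tags) {
        for (String tag : tags) {
            idsByTag.computeIfAbsent(tag, t -> new TreeSet<>()).add(opaqueId);
        }
    }

    /** Flat view, e.g. /tags/caesium-137: every file carrying that one tag. */
    Set<String> flatView(String tag) {
        return idsByTag.getOrDefault(tag, Collections.<String>emptySet());
    }

    /** Taxonomy view, e.g. /physics/fission/caesium-137: files carrying every segment as a tag. */
    Set<String> taxonomyView(String path) {
        Set<String> result = null;
        for (String segment : path.split("/")) {
            if (segment.isEmpty()) {
                continue; // skip the empty segment before the leading slash
            }
            if (result == null) {
                result = new TreeSet<>(flatView(segment));
            } else {
                result.retainAll(flatView(segment));
            }
        }
        return result == null ? Collections.<String>emptySet() : result;
    }

    public static void main(String[] args) {
        TagViewSketch files = new TagViewSketch();
        files.store("f81d", "physics", "fission", "caesium-137");
        files.store("a3c9", "physics");
        System.out.println(files.flatView("physics"));
        System.out.println(files.taxonomyView("/physics/fission/caesium-137"));
    }
}
```

Both views are computed from tags at query time, so neither forces a file to live in a single place.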

The tension this generates comes from the ability to list the contents at any one level, and consequently the ability of other systems to interface to the structure. Listing /tags is equivalent to listing all the tags: viable within a single subject area, but impractical across the whole of human existence. The problem is exacerbated when tools that assume hierarchical structure are interfaced. Many have made the assumption that the hierarchy has been defined to limit the number of children to a manageable number. Filesystems, and the tools that act on them, don’t traditionally expect to support millions of child nodes. In fact most file system browsers become unusable over a few thousand items. So as soon as you interface these tools to an information store that is cloud based they fail, as they are not clever enough to tell the user what options there might be without listing all the options at the next level.

The real underlying question is, if we were to undo history and start intelligent life with a search engine, would Parmenides have thought about ontologies to organize our world into shared hierarchies with which we could communicate with one another about every aspect of existence? Perhaps the human brain craves structure even if, in its default form, it’s suboptimal.