• SPF oops

    It looks a lot like Google has started to enforce its email authentication requirements (https://blog.google/products/gmail/gmail-security-authentication-spam-protection/). This is a graph of forwarded vs rejected email for a small domain that I use Cloudflare to forward to Gmail accounts. Normally rejection rates are very low. Last week they started to rise.

  • Noise

    Simulating real behaviour requires some level of randomness or noise in the signal. In my case I want to simulate the behaviour of a helm at sea. They try to steer to a bearing but environmental factors ensure that they won't do so accurately. Waves, wind, distractions and lack of sleep all play a part. To simulate this, randomness is not enough. A random number between 0 and 1.0 is completely random, jumping everywhere in that range. Noise is required. Noise is a continuous waveform, not discrete steps. So how to achieve it?
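    One common way to get a continuous signal out of random numbers is value noise: fix random values at integer points and interpolate smoothly between them. The sketch below is only an illustration of that idea, not necessarily the approach the full post takes. The names make_value_noise and helm_heading, the 5 degree maximum error and the 20 second wander period are all assumptions chosen for the example.

```python
import math
import random


def make_value_noise(seed=0, lattice_size=256):
    """Return a function noise(t) that varies smoothly in the range [-1, 1]."""
    rng = random.Random(seed)
    # Random values pinned to integer positions; the lattice repeats after lattice_size.
    lattice = [rng.uniform(-1.0, 1.0) for _ in range(lattice_size)]

    def noise(t):
        i = math.floor(t)
        frac = t - i
        a = lattice[i % lattice_size]
        b = lattice[(i + 1) % lattice_size]
        # Cosine interpolation: the weight eases from 0 to 1 with zero slope at both ends,
        # so the curve has no jumps or sharp corners at the lattice points.
        w = (1.0 - math.cos(frac * math.pi)) / 2.0
        return a + (b - a) * w

    return noise


def helm_heading(target_bearing, t_seconds, noise,
                 max_error_deg=5.0, wander_period_s=20.0):
    """Heading actually steered at time t: the target plus a slowly wandering error."""
    error = max_error_deg * noise(t_seconds / wander_period_s)
    return (target_bearing + error) % 360.0


if __name__ == "__main__":
    noise = make_value_noise(seed=42)
    for t in range(0, 60, 5):
        print(t, round(helm_heading(90.0, t, noise), 2))
```

    Summing several such noise functions at different periods and amplitudes would give a rougher, more natural looking wander, at the cost of more parameters to tune.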

  • Site Moved

    This blog has been moved from Wordpress to Github pages, source at https://github.com/ieb/ieb.github.io. If you have comments or want to correct something please open a PR and I will consider it.

  • Why decentralised tracing is wrong.

    Let me be clear, I am not associated with or working on any tracing app. I am a user. Let’s also be clear. Covid-19 is a frightening virus which has a horrible impact on some. 30% of patients don’t make it out of ICU. The impact on some ethnic minorities is disproportionate. If you have a BMI over 30 you’re 37% more likely not to survive.

  • Contact Tracing with Privacy Built in.

    Contact tracing is an n squared or more computationally intensive operation when done centrally. It is impossible to do at scale manually, with each operation taking days. For governments to perform it centrally requires citizens to give up their rights to privacy, especially when a contact tracing app is being used. And yet, it seems, most central contact tracing apps appear to want to gather all the data centrally. Given the current urgency this will not work and will require months of negotiation. I am not a Politician or a Lawyer, I am an Engineer. As an Engineer this is how I would perform contact tracing.

  • Fouling


  • Metrics off Grid

    I sail, as many of those who have met me know. I have had the same boat for the past 25 years and I am in the process of changing it. Seeing Isador eventually sold will be a sad day as she predates even my wife. She was the site of our first date. My kids have grown up on her, strapped into car seats when babies, laughing at their parents soaked and at times terrified (parents that is, not the kids). See https://hallbergrassy38forsale.wordpress.com/ for info.

  • Referendums are binary so should be advisory

    If you ask for the solution to a multi-faceted question with a binary question you will get the wrong answer with a probability of 50%. Like a quantum bit, the general population can be in any state based on the last input or observation, and so a Referendum, like the EU Referendum just held in the UK, should only ever be advisory. In that Referendum there were several themes: immigration, the economy and UK sovereignty. The inputs the general population were given, by various politicians on both sides of the argument, were exaggerated or untrue. It was no real surprise to hear some promises retracted once the winning side had to commit to deliver on them. No £350m per week for the NHS. No free trade deal with the EU without the same rights for EU workers as before. Migration unchanged. The economy was hit; we don’t know how much it will be hit over the coming years, and we are all, globally, hoping that in spite of a shock more severe than Lehman Brothers in 2008, the central banks have quietly taken their own experts’ advice and put in place sufficient plans to deal with the situation. Had the Bank of England not intervened on Friday morning, the sheer cliff the FTSE100 was falling off would have continued to near 0. When it did, the index did an impression of a base jumper, parachute open, drifting gently upwards.

  • AI in FM

    Limited experience in either of these fields does not stop thought or research. At the risk of being corrected, from which I will learn, I’ll share those thoughts.

  • What to do when your ISP blocks VPN IKE packets on port 500

    VPN IKE packets are the first phase of establishing a VPN. UDP versions of this packet go out on port 500. Some ISPs (PlusNet) block packets to routers on port 500, probably because they don’t want you to run a VPN end point on your home router. However this also breaks a normal 500<->500 UDP IKE conversation. Some routers rewrite the source port of the IKE packet so that they can support more than one VPN. The feature is often called an IPSec application gateway. The router keeps a list of the UDP port mappings using the MAC address of the internal machine. So the first machine to send a VPN IKE packet will get 500<->500, the second 1500<->500, the third 2500<->500 etc. If your ISP filters packets inbound to your router on UDP 500, the VPN on the first machine will always fail to work. You can trick your router into thinking your machine is the second or later machine by changing the MAC address before you send the first packet. On OSX
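    To make the port mapping behaviour above concrete, here is a toy model of the router's table. It is a sketch only: the class and function names are made up, and the step of 1000 between ports is inferred from the 500, 1500, 2500 sequence in the text rather than taken from any router's documentation.

```python
class IkePortMapper:
    """Toy model of a router's IPSec application gateway port table."""

    IKE_PORT = 500
    PORT_STEP = 1000  # assumption: inferred from the 500, 1500, 2500 sequence

    def __init__(self):
        # Maps internal MAC address -> external source port used for IKE.
        self._by_mac = {}

    def external_port(self, mac):
        """Return the external source port assigned to this MAC, allocating if new."""
        mac = mac.lower()
        if mac not in self._by_mac:
            self._by_mac[mac] = self.IKE_PORT + self.PORT_STEP * len(self._by_mac)
        return self._by_mac[mac]


def isp_allows_inbound(port, blocked_ports=(500,)):
    """True if replies addressed to this port on the router get past the ISP filter."""
    return port not in blocked_ports


if __name__ == "__main__":
    router = IkePortMapper()
    # The first MAC seen gets port 500, which the ISP filters, so its VPN fails.
    first = router.external_port("aa:bb:cc:00:00:01")
    print(first, isp_allows_inbound(first))
    # A machine presenting a different MAC is treated as a new machine.
    second = router.external_port("aa:bb:cc:00:00:02")
    print(second, isp_allows_inbound(second))
```

    In this model, changing the MAC address before sending the first packet simply causes a second entry to be allocated, which is why the replies then arrive on a port the ISP does not filter.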

  • Scaling streaming from a threaded app server

    One of the criticisms that is often leveled against threaded servers, where a thread or process is bound to a request for the lifetime of that request, is that they don’t scale when presented with a classical web scalability problem. In many applications the criticism is justified, not because the architecture is at fault, but often because some fundamental rules of implementation have been broken. Threaded servers are good at serving requests where the application thread has to be bound to the request for the shortest possible time and, while it is bound, no IO waits are encountered. If that rule is adhered to, then some necessities of reliable web applications are relatively trivial to achieve and the server will be capable of delivering throughput that saturates all the resources of the hardware. Unfortunately, all too often application developers break that rule and think the only solution is to use a much more complex environment that requires event based programming to interleave the IO wait states of thousands of in progress requests. In the process they dispose of transactions, since the storage system they were using (an RDBMS) can’t possibly manage several hundred thousand in progress transactions, even if there was sufficient memory on the app server to manage the resources associated with each request to transaction mapping… unless they have an infinite hardware budget and there was no such thing as physics.

  • Making the Digital Repository at Cambridge Fast(er)

    For the past month or so I have been working on upgrading the Digital Repository at the University of Cambridge Library from a heavily customised version of DSpace 1.6 to a minimally customised version of DSpace 3. The local customizations were deemed necessary

  • AIS NMEA and Google Maps API

    Those who know me will know I like nothing better than to get well offshore, away from any hope of network connectivity. It’s like stepping back 20 years to before the internet and it’s blissfully quiet. The only problem is that 20 years ago it was too quiet. Crossing the English Channel in thick fog with no radar and a Decca unit that could only be relied on to give you a fix to within a mile some of the time made you glad to be alive when you stood on solid ground again. Rocks and strong currents round the Alderney Race were not nearly as frightening as the St Malo ferry looming out of the fog, horns blaring, as if there was anything a 12m yacht could do in reply. After a white knuckle trip I bought a 16nm Radar which turned the unseen steel monsters of the Channel into a passage like a tortoise crossing a freeway reading an iPod. I don’t know which was better, trying to guess if the container ship making 25kn, 10nm away, was going to pass in front of or behind you, or placing your trust in the unseen ship’s crew who had spent the past 4 days rolling north through Biscay with no sleep.

  • HowTo: Quickly resolve what a Sling/OSGi bundle needs.

    Resolving dependencies for an OSGi bundle can be hard at times, especially if working with legacy code. The sure-fire way of finding all the dependencies is to spin the bundle up in an OSGi container, but that requires building the bundle and deploying it. Here is a quick way of doing it with Maven that may at first sound odd.

  • Sakai CLE ElasticSearch


  • Fibonacci ring for Cassandra


  • Node.js vs SilkJS

    Node.js, everyone on the planet has heard about. Every developer at least. SilkJS is relatively new and makes an interesting server to compare Node.js against because it shares so much of the same code base. Both are based on the Google V8 JavaScript engine, which converts JS into compiled code before executing it. Node.js, as we all know, uses a single thread that uses an OS level event queue to process events. What is often overlooked is that Node.js uses a single thread, and therefore a single core of the host machine. SilkJS is a threaded server using pthreads where each thread processes the request, leaving it up to the OS to manage interleaving between threads while waiting for IO to complete. Node.js is often referred to as Async and SilkJS as Sync. There are advantages to both approaches, and they are the source of many flame wars. There is a good summary of the differences and reasons for each approach on the SilkJS website. In essence SilkJS claims to have a less complex programming model that does not require the developer to constantly think of everything in terms of events and callbacks in order to coerce a single thread into doing useful work whilst IO is happening. This approach does, however, hand the interleaving of IO over to the OS, letting it decide when each pthread should be run. OS developers will argue that that’s what an OS should be doing, and certainly to get the most out of modern multicore hardware there is almost no way of getting away from the need to run multiple processes or threads to use all cores. There is some evidence in the benchmarks (horror, benchmarks, that’s a red rag to a bull!) from Node.js, SilkJS, Tomcat7, Jetty8, Tornado etc that using multiple threads or processes is a requirement for making use of all cores. So what is that evidence?

  • Google CourseBuilder, a scalable course delivery platform ?

    This week I discovered Google CourseBuilder, the latest entry into the MOOC arena. It’s a Google App Engine application that Google Research used to host a MOOC to 155K students a few months ago. It follows a similar pedagogy to that used by other MOOC providers, with high quality video lessons that give the student the feeling they are working one on one with the lecturer. Google have open sourced the code under an Apache 2 license, which gives us all an insight into the economies of scale that a MOOC represents. Unlike the traditional Virtual Learning Environment, where the needs of staff are catered for in the user interface, Google CourseBuilder currently delegates all the functionality to spreadsheets and editing snippets of JavaScript and HTML. There is no reason why it could not be given a user interface, but when you consider what it is trying to do you realise that staff user interfaces for course creation are less important than the delivery of the course at scale. Consequently the application itself is tightly focused on delivering the course as quickly and as simply as possible to as many users as possible. Google App Engine makes this easy, even for mere mortals. Once you have accepted that nothing is really for free, and you do have to pay for bandwidth and energy used at some point, scaling this application up to 100K or even 1M users requires little or no effort on your part. You also, at the moment, have to accept that if you are going to reach that many students, you are going to have to ask for a little bit of help from someone to write some HTML, drive a spreadsheet and write a bit of JavaScript, as well as hit the “deploy” button on the App Engine SDK. I say at the moment, because it isn’t going to be that hard to create an administrative UI, and that’s what I have been doing for a few hours this week.

  • Jackrabbit, Oak, Sling, maybe even OAE

    Back in January 2010 the Jackrabbit team started asking its community what it wanted to see in the next version of Jackrabbit. There were some themes in the responses: high(er) levels of write concurrency, millions of child nodes and cloud scale clustering with NoSQL backends. I voted for all of these.

  • Massively Online


  • Evolution of networks

    In late 1993 I remember using my first web browser. I was on an IBM RS6000, called “elysium”. It had a 24bit 3D graphics accelerator, obviously vital for the blue on grey text that turned purple when you clicked it. I also needed it to look at the results of analysis runs: the stress waves flowing through the body shell of the next BMW 3 Series, calculated by a parallel implementation of an iterative dynamic solver I was working on for MSC-Nastran. I was at the Parallel Application Center in Southampton, UK. I worked on European projects doing large scale parallelisations, generally in engineering but always solving practical problems: processing seismic survey data from the South Atlantic in 6 days rather than 6 weeks on HP Convex clusters; Monte Carlo simulations of how brain tissue would absorb radiation during radiotherapy, avoiding subjecting the cancer patient to two hospital visits. It was interesting work, and in my naive youth I felt at times I was doing good for humanity. Using a web browser was becoming part of my normal life. I used it to visit the few academic sites that contained real information, to access papers and research that previously we would have received in paper form from the British Library. This was the first network, exciting, raw, generating a shift in how I communicated and how effective I was.

  • Follow Up: ORM

    My last post was about the good, the bad and the ugly of ORM. Having just surprised myself at the ease with which it was possible to build a reasonably complex query in Django ORM, I thought it would be worth sharing by way of an example. The query is a common problem. I want a list of users, and their profile information, who are members of a group or its subgroups. I have several tables: Profile, Principal, Group, GroupGroups, GroupMembers. The ORM expression is

  • Is ORM So bad?

    ORM gets a bad name, and most of the time it deserves a bad name. It produces nasty horrible queries that don’t scale and lead to dog slow applications in production. Well that’s not entirely fair. Programmers who write code or SQL and never bother to check if they have made stupid mistakes lead to dog slow applications in production. The problem with ORM is that it puts lots of power into a programmer’s hands, lets them loose and makes them think that they can get away without thinking about what will happen when their application has more than 10 rows in a table. There is no magic to ORM; just like raw SQL, you have to tune it.

  • Is ORM so bad ?

  • Languages and Threading models

    Since I emerged from the dark world of Java, where anything is possible, I have been missing the freedom to do whatever I wanted with threads to exploit as many cores as are available. With a certain level of nervousness I have been reading commentary on most of the major languages surrounding their threading models and how they make it easy or hard to utilize or waste hardware resources. Every article I read sits on a scale somewhere between absolute truth and utter FUD. The articles towards the FUD end of the scale always seem to cite benchmarks created by the author of the winning platform, so are easy to spot. This post is not about which language is better or what app server is the coolest thing, it's a note to myself on what I have learnt, with the hope that if I have read too much FUD, someone will save me.

  • The trouble with Time Machine

    Every now and again Time Machine will spit out a “Can’t perform backup, you must re-create your backup from scratch” or “Can’t attach backup”. For anyone who was relying on its rollback-time feature this is a reasonably depressing message and does typify modern operating systems, especially those of the closed source variety. At some point, having spent all the budget on pretty user interfaces and catered for all use cases, the deadline driven environment decides, “Aw stuffit, we will just pop up a catch all, you’re stuffed mate dialog box”. 99% of users rant and rave and delete their backup, starting again with a sense of injustice. If you’re reading this and have little or no technical knowledge, that’s what you should do now.

  • PyOAE renamed DjOAE

    I’ve been talking to several folks since my last post on PyOAE and it has become clear that the name doesn’t convey the right message. The questions often center around the production usage of a native Python webapp or the complexity of writing your own framework from scratch. To address this issue I have renamed PyOAE to DjOAE to reflect its true nature.

  • Vivo Harvester for Symplectic Elements

    I’ve been doing some work recently on a Vivo Harvester for Symplectic Elements. Vivo, as its website says, aims to build a global collaboration network for researchers. They say National, I think Global. The project hopes to do this by collecting everything known about a researcher, including the links between researchers, and publishing that to the web. You could say it’s like Facebook or G+ profiles for academics, except that the data is derived from reliable sources. Humans are notoriously unreliable when talking about themselves, and academics may be the worst offenders. That means that much of the information within a Vivo profile, and the Academic Graph that the underlying semantic web represents, has been reviewed, either because the links between individuals have been validated by co-authored research in peer reviewed journals, or because the source of the information has been checked and validated.

  • PyOAE

    Those of you who have been watching my G+ feed will have noticed some videos being posted. Those videos are of the OAE 1.2 UI running on a server developed in Python using Django. I am getting an increasing stream of questions about what it is, hence this blog post.

  • Flashback

    The world wakes up to an OSX virus. News media jumps on the story, terrifying users that they might be infected. Even though the malware that users were tricked into installing may not be nice, it's clear from looking at the removal procedure that, unlike on the Windows platform where a virus normally buries itself deep within the inner workings of the OS, this trojan simply modified an XML file on disk and hence reveals its location. To be successful in doing that it would have had to persuade the user to give it elevated privileges, as the file is only writable by the root user. If it failed to do that it would have infected user space.

  • Modern WebApps

    Modern web apps, like it or not, are going to make use of things like WebSockets. Browser support is already present and UX designers will start requiring that UI implementations get data from the server in real time. Polling is not a viable solution for real deployment since at a network level it will cause the endless transfer of useless data to and from the server. Each request asks every time, “what happened?” and the server dutifully responds “like I said last time, nothing”. Even with minimal response sizes, every request comes with headers that will eat network capacity. Moving away from the polling model will be easy for UI developers working mostly in the client and creating tempting UIs for a handful of users. Those attractive UIs generate demand and soon the handful of users become hundreds or thousands. In the past we were able to simply scale up the web server, turn keep alives off, distribute static content and tune the hell out of each critical request. As WebSockets become more widespread, that won’t be possible. The problem here is that web servers have been built for the last 10 years on a thread per request model, and many supporting libraries share that assumption. In the polling world that’s fine, since the request gets bound to the thread, the response is generated as fast as possible, and the thread is unbound. Provided the response time is low enough, the request throughput of the server will be maintained high enough to service all requests without exhausting the OS’s ability to manage threads/processes/open resources.

  • Deploying/Tuning SparseMap Correctly: Update

    In the last post I reported the impact of using SparseMap with caching disabled, but at the same time noticed that there was an error in the way in which JDBC connections were handled. SparseMap doesn’t pool JDBC connections. It makes the assumption that anyone using SparseMap will read the Session API and note that Sessions should only be used on one thread during their active life time. Although the SparseMap Session Impl is thread safe, doing that eliminates all blocking synchronization and concurrent locks. If the Session API usage is followed, then JDBC connections can be taken out of the pool and bound to threads, which ensures that only 1 thread will ever access a JDBC connection at any one time. When bound to threads, JDBC connections can be long lived and so are only revalidated if they have been idle for some time. If any connection is found to be in an error state it’s replaced with a fresh connection.
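    The thread binding described above can be sketched in a few lines. This is a Python illustration of the pattern, not SparseMap's Java code: the class name, the connect and is_valid callbacks and the 60 second idle threshold are all assumptions made for the example.

```python
import threading
import time


class ThreadBoundConnections:
    """Keeps one long-lived connection per thread, revalidating only after idle periods."""

    def __init__(self, connect, is_valid, max_idle_seconds=60.0, clock=time.monotonic):
        self._connect = connect      # hypothetical factory returning a new connection
        self._is_valid = is_valid    # hypothetical health check, e.g. a cheap test query
        self._max_idle = max_idle_seconds
        self._clock = clock
        self._local = threading.local()  # each thread sees its own attributes

    def get(self):
        now = self._clock()
        conn = getattr(self._local, "conn", None)
        if conn is None:
            # First use on this thread: bind a fresh connection to it.
            conn = self._connect()
        elif now - self._local.last_used > self._max_idle and not self._is_valid(conn):
            # Only connections that have sat idle are checked; broken ones are replaced.
            conn = self._connect()
        self._local.conn = conn
        self._local.last_used = now
        return conn
```

    Because a connection is only ever reachable from the thread that created it, no locking is needed around its use, which is the point the post makes about eliminating blocking synchronization.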

  • Deploying/Tuning SparseMap Correctly


  • OpenID HTML and HTMLYadis Discovery parser for OpenID4Java

    OpenID4Java is a great library for doing OpenID and OAuth. Step2 will probably be better but it's not released. Unfortunately the HTML and HTMLYadis parsers rely on parsing the full HTML document and pull in a large number of libraries. These include things like Xerces and Resolver, which can cause problems if running multiple versions in the same JVM under OSGi. For anyone else wanting to eliminate dependencies, here are Regex based parsers that have no dependencies outside core OpenID4Java and the JRE.

  • Monitoring SparseMap

    Yesterday I started to add monitoring to SparseMap to show what impact client operations were having and to give users some feedback on why/if certain actions were slow. There is now a service StatsService, with an implementation StatsServiceImpl that collects counts and timings for storage layer calls and API calls. Storage layer calls are generally expensive. If SQL storage is used that means a network operation. If a column DB storage layer is used, it may mean a network operation, but it's certainly more expensive than not. Storage calls also indicate a failure in the caching. The StatsServiceImpl can also be configured in OSGi to report the usage of individual Sparse Sessions when the session is logged out, normally at the end of a request. So using long term application stats should tell a user of SparseMap why certain areas are going slow, what storage operations need attention and, if using SQL, give the DBA an instant indication of where the SQL needs to be tuned. The short term session stats should enable users of SparseMap to tune their usage to avoid thousands of storage layer calls and eliminate unnecessary API calls. Currently output is to the log file. Soon there will be some JMX beans and maybe some form of streamed output of stats. The log output is as follows.

  • SparseMap Content 1.5 released

    Sparse Map version 1.5 has been tagged (org.sakaiproject.nakamura.core-1.5) and released. Downloads of the source tree in Zip and TarGZ form are available from GitHub.

  • Search ACLs Part 2: Simple is always best

    I can’t take any credit for what I say next, all that goes to Mark Triggs who I spent some time chatting with last week. Simple is always best, and although a Bloom filter might be the CS way of solving membership resolution, it’s not the way to do it in an inverted index where we are really looking for the intersection between two sets. We already have lists of principals that can read each document, associated with each document as a multivalued keyword. This list is created by analysing the fully resolved ACL for each content item at the point of indexing. In general, content items don’t have crazy numbers of ACLs, so although the number of unique sets of read principals may be high, the number of read principals is not crazy high, so in that form cardinality etc are not overloaded.
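    The set intersection idea can be shown with a tiny in-memory inverted index. This is a sketch of the concept only, not Solr code: the function names, document ids and principal names are invented for the example.

```python
from collections import defaultdict


def build_reader_index(docs):
    """Build principal -> set of doc ids from doc id -> principals allowed to read."""
    postings = defaultdict(set)
    for doc_id, read_principals in docs.items():
        for principal in read_principals:
            postings[principal].add(doc_id)
    return postings


def readable_docs(postings, user_principals):
    """A document is readable if its reader list shares at least one principal with the user."""
    result = set()
    for principal in user_principals:
        result |= postings.get(principal, set())
    return result


if __name__ == "__main__":
    docs = {
        "doc1": ["everyone"],
        "doc2": ["group:staff", "user:alice"],
        "doc3": ["user:bob"],
    }
    postings = build_reader_index(docs)
    print(sorted(readable_docs(postings, ["everyone", "user:alice"])))
```

    The lookup cost depends on how many principals the user holds and the size of each postings list, not on testing each candidate document one at a time, which is why a plain multivalued field suits an inverted index.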

  • Access Control Lists in Solr/Lucene

    This isn’t so much about access control lists in Solr or Lucene but more about access control lists in an inverted index in general. The problem is as follows. We have a large set of data that is access controlled. The access control is managed by users and they can make individual items closed or open or anywhere between. The access control list on the content, which may be files or simply bundles of metadata, takes the form of 2 bitmaps, representing the permissions granted and denied, each pair of bitmaps being associated with a principal, and the set of principal/bitmap pairs being associated with each content item. A side complication is that the content is organised hierarchically and permissions for any one user inherit following the hierarchy back to the root of the tree. Users have many principals through membership of groups, through directly granted static principals and through dynamically acquired principals. All of this is implemented outside of Solr in a content system. It’s Solr’s task to index the content in such a way that a query on the content for an item is efficient and returns a dense result set that can have the one or two content items that the user can’t read filtered out before the user gets to see the list. Ie we can tolerate a few items the user can’t read being found by a Solr query, but we can’t tolerate most being unreadable. In the ACL bitmaps, we are only interested in the read permission.
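    As an illustration of the grant/deny bitmap model, the sketch below resolves which principals end up with read permission on an item by walking the ACLs from the root of the tree down to the item. The bit values, the function name and the precedence rules (a deny beats a grant on the same node, and a node closer to the item overrides its ancestors) are assumptions for the example; the real content system may resolve permissions differently.

```python
READ = 0x1   # hypothetical bit positions, chosen for the example
WRITE = 0x2


def resolve_read_principals(path_acls):
    """Return the set of principals that can read the item at the end of the path.

    path_acls is a list ordered root -> item; each entry maps a principal
    to a (granted_bitmap, denied_bitmap) pair set directly on that node.
    """
    readers = set()
    for acl in path_acls:
        for principal, (granted, denied) in acl.items():
            if denied & READ:
                # Assumed rule: an explicit deny wins and overrides anything inherited.
                readers.discard(principal)
            elif granted & READ:
                readers.add(principal)
    return readers


if __name__ == "__main__":
    path = [
        {"everyone": (READ, 0)},                      # root: open to all
        {"everyone": (0, READ),                       # folder: closed again...
         "group:staff": (READ | WRITE, 0)},           # ...except for staff
        {"user:alice": (READ, 0)},                    # item: alice added directly
    ]
    print(sorted(resolve_read_principals(path)))
```

    The resulting set is the kind of value that could be stored against each document at indexing time, since only the read bit matters for search.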

  • Deprecate Solr Bundle

    Before that scares the hell out of anyone using Solr, the Solr bundle I am talking about is a small shim OSGi bundle that takes content from a Social Content Repository system called Sparse Map and indexes the content using Solr, either embedded or as a remote Solr cluster. The Solr used is a snapshot from the current 4.0 development branch of Solr. Right, now that's cleared up I suspect 90% of the readers will leave the page to go and read something else?

  • Rogue Gadgets

    I have long thought one of the problems with OpenSocial is its openness to enable any Gadget based app anywhere. Even if there is a technical solution to the problem of a rogue App in the browser sandbox afforded by the iframe, that simply defers the issue. Sure, the Gadget code that is the App can’t escape the iframe sandbox and interfere with the in browser container or other iframe hosted apps from the same source. Unfortunately wonderful technical solutions are of little interest to a user whose user experience is impacted by the safe but rogue app. The app may be technically well behaved, but downright offensive and inappropriate on many other levels, and this is an area which has given many institutions food for thought when considering gadget based platforms like Google Apps for Education. A survey of the gadgets that a user could deploy via an open gadget rendering endpoint reveals that many violate the internal policies of most organizations. Racial and sexual equality are often compromised. Even basic decency. It’s the openness of the gadget renderer that causes the problem; in many cases when deployed, it will render anything it’s given. It’s not hard to find gadgets providing porn in gmodules, the source of iGoogle, which is not exactly what an institution would want to endorse on its staff/student home pages.

  • SparseMap Content version 1.4 released.

    Sparse Map version 1.4 has been tagged (org.sakaiproject.nakamura.core-1.4) and released. Downloads of the source tree in Zip and TarGZ form are available from GitHub.

  • OSGi and SPI

    OSGi provides a nice simple model to build components in, and the classloader policies enable reasonably sophisticated isolation between packages and versions that makes it possible to consider multiple versions of an API, and implementations of those APIs, within a single container. Where OSGi starts to come unstuck is with SPIs, or Service Provider Interfaces. It’s not so much the SPI that’s a problem, rather the implementation. SPIs normally allow a deployer to replace the internal implementation of some feature of a service. In Shindig there is an SPI for the various Social services that allows deployers to take Shindig’s implementation of OpenSocial and graft that implementation onto their existing Social graph. In other places the SPI might cover a lower level concept. Something as simple as storage. In almost all cases the SPI implementation needs some sort of access to the internals of the service that it is supporting, and that’s where the problem starts. In most of the models I have seen, OSGi bundles Export packages that represent the APIs they provide. Those APIs provide a communications conduit to the internal implementation of the services that the API describes without exposing the implementation. That allows the developer of the API to stabilise the API whilst allowing the implementation to evolve. The OSGi classloader policy gives that developer some certainty that well-behaved clients (ie the ones that don’t circumvent the OSGi classloader policies) won’t be binding to the internals of the implementation.

  • Solr Search Bundle 1.3 Released

    The Solr Search v1.3 bundle developed for Nakamura has been released. This is not to be confused with Apache Solr 4. The bundle wraps a snapshot version of Apache Solr 4 at revision 1162474 and exposes a number of OSGi components that allow a SolrJ client to interact with the Solr server.

  • Minimalism

    In spare moments between real work, I’ve been experimenting with a lightweight content server for user generated content. In short, that means content in a hierarchical tree that is shallow and very wide. It doesn’t preclude deep narrow trees, but wide and shallow is what it does best. Here are some of the things I wanted to do.

  • Sparse Map Content 1.3 Released

    Sparse Map version 1.3 has been tagged (org.sakaiproject.nakamura.core-1.3) and released. Downloads of the source tree in Zip and TarGZ form are available from GitHub.

  • Clustering Sakai OAE: Part II of ?

    Part II : Solr Clustering

  • Solr Search Bundle 1.2 released

    The Solr Search v1.2 bundle developed for Nakamura has been released. This is not to be confused with Apache Solr 4. The bundle wraps a snapshot version of Apache Solr 4 at revision 1162474 and exposes a number of OSGi components that allow a SolrJ client to interact with the Solr server.

  • Clustering Sakai OAE: Part I of ?

    Over the past month, on and off, I have been working with the developers at Charles Sturt University (CSU) to get Sakai OAE to cluster. The reasons they need it are obvious. Without clustering, a single instance of Sakai OAE will not scale up to the number of users they need to support, and with a single instance an SLA on uptime would be pointless. So we have to cluster. Sakai has generally been deployed in a cluster at any institution where more than a handful of students were using the system. Clustering Sakai CLE is relatively straightforward compared to OAE. It can be done by configuring a load balancer with sticky sessions and deploying multiple app servers connecting to a single Database. However when an app server node dies with Sakai CLE, users are logged off and lose state. There are some commercial extensions to Sakai CLE using session replication with Terracotta that may address this.

  • Solr Search Bundle 1.1 released

    The Solr Search v1.1 bundle developed for Nakamura has been released. This is not to be confused with Apache Solr 4. The bundle wraps a snapshot version of Apache Solr 4 at revision 1162474 and exposes a number of OSGi components that allow a SolrJ client to interact with the Solr server. This release adds a priority based queue to enable Quality of Service commitments to be made on the time between a request to index a content item and that item appearing in the index. Those commitments are fulfilled with parallel queues, ensuring that the indexing happens within the requested time and is not blocked by less urgent items. The queues are persistent and reliable, ensuring that should a remote Solr server fail, the request to index will not be lost. Optionally a deployer can specify that real time indexing is enabled using soft commits by configuration of the queue and the Solr server. Other improvements are listed with the issues fixed against this version, link below. Thanks go to everyone who contributed and helped to get this release out.
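
    As a rough illustration of the idea (not the bundle’s implementation), the sketch below orders index requests by deadline so that an urgent item is never stuck behind a less urgent one. It uses a single in-memory heap rather than the parallel persistent queues described above, and the names `IndexQueue`, `submit` and `next_batch` are made up for this example.

```python
import heapq
import itertools


class IndexQueue:
    """Serves index requests in deadline order (hypothetical sketch)."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties so equal deadlines stay in arrival order.
        self._counter = itertools.count()

    def submit(self, path, requested_at, ttl_seconds):
        # The deadline is the latest time the item should appear in the index.
        deadline = requested_at + ttl_seconds
        heapq.heappush(self._heap, (deadline, next(self._counter), path))

    def next_batch(self, size):
        # Pop up to `size` requests, most urgent first.
        batch = []
        while self._heap and len(batch) < size:
            deadline, _, path = heapq.heappop(self._heap)
            batch.append((path, deadline))
        return batch
```

    A worker would call `next_batch` in a loop and send each batch to Solr; persistence and retry on server failure are left out of the sketch.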

  • Sparse Map Content 1.2 Released

    Sparse Map version 1.2 has been tagged (org.sakaiproject.nakamura.core-1.2) and released. Downloads of the source tree in Zip and TarGZ form are available from GitHub.

  • 1990's Internet

    Many years ago, when I migrated from dialup modem to cable the word broadband really did change your life. Always on, ping latencies below 500ms and ssh sessions that didn’t resemble the teletype printer that used to broadcast the football results to the nation on Saturday at 5pm (for those that remember the 1970s). I was fortunate enough to be in a rural UK village located between two nodes in what was then Cambridge Cable’s trunk network (later NTL, later Virgin). I was the first in this village to be connected, and I remember weeks of anticipation followed by dashed hopes as a stream of cable installation engineers visited. Eventually they admitted that there was not enough signal getting to the door, and put in a slightly better quality cable. For almost 10 years after that, my Motorola Surfboard modem, connected to a UPS, gave me broadband connectivity, 24x7x365 even through power cuts. Never more than a few Mb/s but always on and always reliable, sub 10ms latency with no packet loss.

  • SparseMap Content 1.1 Released

    SparseMap Content 1.1 has been tagged, released and pushed to the maven repo. Details of the issues that have been fixed are at https://github.com/ieb/sparsemapcontent/issues?milestone=2&state=closed. This release includes a Lock Manager and a migration framework. Thanks to Chris Tweney at University of California, Berkeley for his input on the migration framework and apologies to those that have submitted patches that didn’t get into this release, notably a MongoDB driver from Erik Froese. The tag contains support for webdav and a number of other extensions although there hasn’t been a formal release of these bundles yet.

  • Lift Community

    I have been looking for a way to create RESTful services quickly and simply. JAX-RS is one way, but in my search I wandered into the world of Scala and came across the Lift community. Not exactly a perfect RESTful framework, but their “Expected Behavior in the Lift community” is well worth a read before you post (or reply) with (or to) frustration on list. I certainly have shared many of the same thoughts with the author, David Pollak, from time to time, and I suspect at times with shame that I have not always lived by the rules he so clearly expresses. It’s one more link on my bookmark list titled “To Reflect on what I do, regularly”.

  • Upgrading Sparse to Cassandra 0.8.1 (notes)

    Our GSoC student Aadish has been doing some nice work with Cassandra and the Sparse content system, but the version of Cassandra we are binding to is a bit old. So it’s time for an upgrade. Thrift is quite particular about versions, especially over the 0.6 boundary with Cassandra 0.7 where things changed a bit. Most notable was the addition of login to the Cassandra instance, removing the keyspace parameter from most client calls. Now you use set_keyspace(keyspace) when the client is initialised. The other sensible change was the use of java.nio.ByteBuffer in most of the calls to avoid having to do expensive to byte[] conversions all over the place. So there are some API changes, not too hard to accommodate.

  • Comparative wide column indexing in SparseMap 1.1

    I hate doing comparative tests with databases, as it always generates DB wars. “Why didn’t you use version X where that’s fixed ?” or “What about using config z with tweak n?”. Sure, fixes come out and databases need tuning, but if it is possible to make very simple table operations go faster on small sets of data…that should be the default OOTB config. This limited test was done on the same piece of non production hardware (a laptop) to see if there was a way of getting round the MySQL 64K limit on rows without impacting performance. The test comes in 2 parts. The first is a table load that loads rows into a wide column table as used by SparseMap, which represents the cost of saving an Object in SparseMap, and the second is a set of queries performed once 100K rows are present in the database table. It would have been nice to see how the performance degraded or not over the entire test, but as you will see… that would have taken hours to complete.

  • Indexing and Query Performance in SparseMap 1.0, indicative tests

    In late May or early June it became apparent that the table based indexing approach in Sparse Map used by Sakai OAE had problems. Performing queries on large key value tables can work, provided those queries are simple in nature and the volume of records in the table is not excessive. In fact parts of Wordpress’s ontology store use this approach. Unfortunately in Sakai OAE the key value table grows at 10-120 times the rate of the main content store, which grows at 2x the number of content items. In addition to this the queries that need to be performed on this table are paged, sorted and distinct. Not surprisingly that generates a performance issue. It first became apparent in Derby where it was really obvious. So obvious that Sakai OAE 1.0 RC1 would grind to a halt after running the integration test suite on Derby. Those issues were fixed for the Derby driver with the 1.0 release of SparseMap on which Sakai OAE 1.0 was based. Unfortunately, further testing shows that all other databases are affected. I say all, I mean MySQL and PostgreSQL since I don’t have access to an Oracle DB at the moment to test… but there is a high probability that it will also be affected. To show the impact, here are some comparative graphs. The first one shows Derby query performance at the OAE 1.0 RC1 tag. Fortunately this is not what was released since SparseMap 1.0 contains a fix for the problem. At the time the consensus was that the problem did not affect MySQL or PostgreSQL, and to some extent that’s true; however, detailed testing shows that the problem affects MySQL and PostgreSQL and presumably Oracle.

  • Sparsemap Content WebDav

    I have always felt that a good test of a content system is to give it to a badly behaved client. WebDav clients tend to be badly behaved and often make unreasonable requests of a http server that was not expecting to be a webdav server. OSX Finder is a good example. It often makes 100s of PROPFIND requests just because the user looked at a Finder window, and dragging and dropping a folder onto a webdav volume in Finder is going to generate a storm of requests, some reading some writing.

  • GSoC Review

    This is the third or fourth year I have done a GSoC project as a mentor. They are always great fun, and just as hiring well reaps rewards, so does evaluating a GSoC student. Every year I have chosen one with attitude over ability. The right attitude that is. An attitude to show those who appear to know more about the subject the evidence that proves they need to think again, and a hunger for the information that might add to what they already know. My wife always says men are strange. When they are lost in a big city they rarely ask the way. More often they search for a map or pull out an iPhone or HTC and try to work it out for themselves. All around them are experts that will not only tell them the way, but tell them about the great cafe to stop for coffee at one street up, and the dark alley to avoid one street down. Ability can be acquired, attitude is who you are. If it’s wrong, then it takes a long time to change. This year the GSoC student I chose and was mentoring was Aadish Kotwal. He knows how to ask questions and enrich his journey, in the process he enriched mine and showed me a thing or two. I think he enjoyed the process, and wrote some great code. I certainly enjoyed it. This is what he said:

  • SparseMapContent 1.0 released

    Normally I would announce this sort of thing on a mailing list, but I don’t have one, so this is the next best thing. SparseMapContent v1.0 has just been released. It’s a project I started almost 12 months ago to enable Sakai OAE to store user generated content in shallow wide hierarchies, where 1000s of users would be performing updates, potentially served by an elastic cluster. It stores its content in a simple key value store with rdbms storage on Derby, MySQL, Oracle, PostgreSQL or column storage on Apache Cassandra or Apache HBase. In addition there are bitstream stores both for a shared filesystem and as blocks within the underlying column database. It comes bundled as an OSGi bundle intended to work with Apache Sling.

  • A View of Human Society

    “Although it may be tempting to think of social institutions as functioning like organs, human societies behave, in practice, much more like slime molds. They don’t have eyes or brains that are anything like human eyes and brains. So although any one of us can learn from our mistakes, foresee problems and act reasonably to solve them, collectively we don’t do a very good job of this.

    Granted, our species is very intelligent and has developed exquisite communication across the world and through the centuries. But while people are smart enough to anticipate problems, they are also smart enough to make counterarguments. Every good idea in history has had to fight against many bad ideas before winning broad acceptance.”

    - Nathan Myhrvold 2011

  • Cambridge OAE: IMS-CP Import Support

    We have been working hard on IMS-CP at Cambridge. I should qualify that, Yushan Li from Tsinghua University who has been at Caret for the past month has been working hard on implementing IMS-CP import, and I have been causing trouble for him. Still, we have a working implementation of IMS-CP import that is in our “production” branch. It takes an IMS-CP file and during upload, unpacks the Zip file and converts the IMS Manifest into a Sakai Doc structure definition, so that the IMS-CP appears as a Multilevel Sakai Doc within the user library. They can still get to the original Zip file if they want to download it, and they can also use the Sakai Doc anywhere they can use any other Sakai Doc created inside OAE. Resources within the IMS-CP are also unpacked so that relative URLs within pages of the IMS-CP continue to work correctly.

  • Cambridge OAE: Engineering Syllabus Widget

    For the deployment of OAE at Cambridge we needed an easy way in which institutions could embed syllabus information into the Courses and Group pages. The University of Cambridge has many departments, each of which operates largely autonomously when it comes to the provision of teaching and learning information to its students. The Engineering department, which will be one of the early adopters, maintains this information as web pages with a reasonably well defined structure. Other departments may have a similar approach. This widget enables a user to embed content from the Engineering Teaching pages directly into any page within the Sakai OAE instance at Cambridge, simply by selecting the year and lecture from a set of drop downs. The functionality is implemented as a widget using a template configured proxy on the back end. We use Google’s Javascript Caja implementation to sanitize the HTML, part of the Sakai Widget API, and we parse the html to remove headers and footers. Development for this widget took about 6 hours to complete. No back end functionality was required.

  • Cambridge OAE: Accept Terms and Conditions Widget.

    For the University of Cambridge OAE instance we have an “Accept Terms and Conditions Widget”. This is loaded as a widget on all pages and checks that the user has accepted the current terms and conditions. If they have not, it pulls their official identity from the institutional LDAP, and asks them a) do you accept the terms and conditions and b) are these the details that should be used. The wording of the terms and conditions is being reviewed by the lawyers at the moment, and can be internationalized as required. Development took about 4 hours, start to end, and it uses most of the Sakai Widgets API. There is one small patch to the core code base to ensure that a div appears on every page to load the widget. Code is in the Cambridge OAE extension repositories https://github.com/ieb/ucamux , https://github.com/ieb/ucamex and is deployed as separate jars into the instance.

  • Sakai OAE and the future

    A bit over a week ago I posted about Sakai CLE and the future; now it’s the turn of Sakai OAE. For the past 3 years I’ve been designing and developing the backend to Sakai OAE, with an aim to make it as efficient and performant as I could. Capable of supporting the demands of a pure web 2 application. The first 2 years it was an Open Source project. We added committers based on merit. Many of those who joined in worked for Higher Ed, most were already part of the Sakai community, but there were no formal arrangements so we made do with whatever could be contributed. The past year it’s been a managed project with funding. At the Sakai conference in downtown LA this year I decided that I didn’t agree with the direction of the management of the managed project, although I am firmly committed to the original concept of OAE and the Open Source part of the project. I won’t go into my reasons here, if you really are interested in what will soon become ancient history, stop me when you see me and I will be happy to tell you what I think.

  • Sakai CLE and OAE the future

    I have heard a quite disturbing message, that I know is incorrect and completely out of date. Someone once said that Sakai 2.9 would be the last version of Sakai CLE. Completely wrong, and will certainly be proved completely wrong in the next few months. In fact any sort of prediction of that type made in Community Source is always going to be wrong. Community demand drives releases, and those doing the releases are the ones who decide when they are done, and when they are not going to do a release. There are hundreds of institutions running Sakai CLE and there will be Sakai CLE releases as long as there are institutions wanting releases. So if you read or see a presentation that says anything about the last Sakai CLE release, don’t believe a word of it.

  • Very Slow Derby Queries

    Here’s a problem for anyone reading this.

  • Multiple Sling Opting Servlets,

    Sling OptingServlets allow a servlet to be registered to a resource type and decide if it’s going to handle the resource. This sounds good, except that there is only one slot available per registration. If you have a resource type X, method GET and you want to have 5 Servlets registered to optionally process requests against resource type X with no selectors and no extensions, then you can’t do it.

  • Operation Aurora and Source Control

    If you were watching the news a year ago you can’t have missed the hack on Google. If you were watching some news today you might have noticed that Morgan Stanley was also a target. What amazes me is the attack vector. Malicious email apparently from a trusted colleague launches a browser, downloads some javascript, and exploits a zero-day vulnerability in the browser which then gives the attacker access to inside the corporate network. Now that is pretty basic hacking that most script kiddies could do (and do). If you were worried about getting caught you would have hacked a few intermediate machines or servers before reaching your target. The next step is what amazes me more than anything. With local access, the SCM is wide open. LOL, not in my world it’s not. Imagine, Apache SVN or Github configured without ssl/ssh, configured to allow any user that can open a port to create an unprivileged user who can then go on to discover other things, and even, if configured really badly (http://www.wired.com/images_blogs/threatlevel/2010/03/operationaurora_wp_0310_fnl.pdf indicates that that’s an OOTB config), that user gets privileges. From the reports of the attack and the white papers, it looks like the attackers were able to bypass version control and audit to make changes at will to the repository. What’s not clear in any of these incidents is what the SCMs were holding. Product, core production code or documents and management reports?

  • Choosing friends carefully

    In http://ceki.blogspot.com/2010/05/forces-and-vulnerabilites-of-apache.html Ceki Gülcü points out the problems of a meritocracy, in a post that is worth reading. What he skips over is the causes of the situation he describes. In any open-source project that is not a benevolent dictatorship or really a marketing front for a commercial product, inviting someone to become a committer, and giving them a vote, is not that distant from telling them your social security number, credit card details, giving them a key to your front door and telling them when you will be out. Unfortunately with ever increasing competition, squeezed margins and companies less and less willing to give their employees time for free with no strings attached, it’s now harder than ever to create sustainable communities. So what do we do ? We have no choice but to lower the barriers to entry, and that perhaps is in the nature of the members of most Apache PMC’s. The members are naturally optimistic people who think the best of their fellow human being and have no cause to be cynical about anyone’s intentions. 6 months being a “nice bloke” on list is minimal investment for anyone hell bent on destroying an opensource community. I am quite surprised no corporate is hiring developers for just that purpose. As Ceki Gülcü points out, it’s not even email abuse and personal attacks that kill a community, but long email posts that everyone has to read to understand, and endless trivial arguments over pointless issues. A sentence written once and read once consumes no time, but on list, every sentence written is read a thousand times. That’s not an observation of any project today, but something to be guarded against. Keep the standards up and meritocracy will survive, drop them and the project dies a slow death.

  • Getting somewhere, not sure where

    Finally I feel like I am getting somewhere.

  • Testing Sparse

    The depressing thing about profiling and load testing is the more you do it, the more problems you discover. In Nakamura I have now ported the content pool (amorphous pool of content identified by ID rather than path) from Jackrabbit to a Sparse Content Map store that I mentioned some weeks back. Load testing shows that the only concurrency issues are related to the Derby or MySQL JDBC drivers, and I am certain that I will need to jump through some hoops to minimise the sequential nature of RDBMS’s on write. From earlier tests that’s not going to be the case for the Cassandra backend which should be fully concurrent on write. Load testing also shows memory leaks. Initially this was the typical prepared statements, being opened in a loop, leaking without closing. Interestingly, Derby is immune to this behaviour and recognises duplicates so no leaks, but MySQL leaks badly, and always creates an internal result set implementation when a prepared statement is created. That was easy to fix: a thread local hash map to ensure only one prepared statement of each sharded type is opened, and then all are closed after the end of the update operation.
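
    The fix described above can be sketched as a per-thread cache keyed by SQL text. This is only an illustration of the pattern, written in Python rather than against JDBC; `StatementCache`, `FakeStatement` and the `prepare` callback are hypothetical names, with `FakeStatement` standing in for a real prepared statement.

```python
import threading


class FakeStatement:
    """Stand-in for a prepared statement (hypothetical, for illustration)."""

    def __init__(self, sql):
        self.sql = sql
        self.closed = False

    def close(self):
        self.closed = True


class StatementCache:
    """Keeps at most one open statement per SQL string, per thread."""

    def __init__(self, prepare):
        self._prepare = prepare          # callable that opens a statement
        self._local = threading.local()  # each thread sees its own map

    def _statements(self):
        if not hasattr(self._local, "statements"):
            self._local.statements = {}
        return self._local.statements

    def get(self, sql):
        # Reuse the statement if this thread has already opened it.
        statements = self._statements()
        if sql not in statements:
            statements[sql] = self._prepare(sql)
        return statements[sql]

    def close_all(self):
        # Call at the end of the update operation to release everything.
        statements = self._statements()
        for statement in statements.values():
            statement.close()
        statements.clear()
```

    Calling `get` inside a loop then opens the statement once, and `close_all` in a finally block at the end of the update releases it.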

  • Programmatic Logging Configuration in OSGi

    Although you can configure Logging via configuration files and admin consoles in OSGi, Sling/Felix in this instance, can you do it programmatically? First, why would you want to? Well you might have a bundle, embedding a sub component where the logging verbosity is more than you want in the main logs. Obviously you can configure this bundle by adding config files, but that’s not so easy if the config is outside the bundle, and you might want to put the logging configuration under the control of the bundle itself on activation. In Sling, Logging is configured via a Factory, which you can get hold of via the ConfigurationAdmin service provided by OSGi.

  • Solr 4 OSGi Bundle

    I need a Solr bundle to provide search capabilities to my app, but I want it to work in a cluster, and I have to have near real time search indexing, so using SolrJ4 makes sense. On the Solr 4 road map it looks like a strong probability that it will be possible to configure for Near Real Time Search indexing in Solr based on the capabilities that were introduced into Lucene in version 2.9. So the approach is to create an OSGi bundle based on the 4.0-SNAPSHOT version of Solr, that will operate SolrJ in two modes. Remote for a cluster implementation where one or more Solr servers can provide search capabilities to the cluster, and Embedded where the App server cluster is a cluster of 1. My environment is based on Sling, which is OSGi. In some senses this makes life easier as the classloader policies of OSGi allow me to isolate dependencies used by complex components such as Solr. On the other hand OSGi makes life harder since it requires that no clever tricks have been played with the underlying classloaders. Solr has a SolrResourceLoader that adds some custom package resolution and classloader structure which by default won’t work in OSGi, so here is how to bring up Solr 4 Embedded as an OSGi component. At this point I have to give credit to Josh Holtzman working on the Matterhorn project for his pointer. He did this with Solr 1.3 a long time ago, and was kind enough to give me pointers.

  • Java Web Start for server applications

    If you want to make a Java Web Start distribution of a web application server built by maven, then there is a neat plugin to make it a bit easier, webstart-maven-plugin from org.codehaus. However, even if you give it full permissions and sign everything correctly, it probably won’t work as the SecurityManager that the Java Web Start Client uses is going to stop all sorts of things, like finding the current user’s home directory, that a normal server app would have no problem with. So, first make certain that you have

  • Sparse Map Content Store

    While doing soak tests and endless failed release candidates for the Sakai3/Sakai OAE/Nakamura Q1 release I have been experimenting with a content store. This work was triggered by a desire to have support for storage of users, groups and content with access-control that a) clustered and b) didn’t have concurrency issues. I also wanted something thin and simple so that I could spend more time on making it fast and concurrent than having lots of features. The original implementation was done in Python (btw, that’s my second Python app), in about 2 days on top of Cassandra, but the Sakai community is Java so although that was really fast to implement and resulted in code that had all the concurrency required, it wasn’t any good for the bulk of the community. (Community over code). The re-implementation in Java has taken several weeks part time, but along the way APIs have appeared, and what was hard coded to work on Cassandra is now based on an abstract concept of a sparse map, or sparse array. Data is stored in rows, and each row contains a set of columns from a potentially infinite set of columns, i.e. a Column DB. All of the objects (Users, Groups and content) can be stored in that structure and with modification based operations on the underlying store the API is amenable to being stored in many ways. So far I have drivers for Memory (trivial Concurrent Hash Maps), JDBC (MySQL and Derby) and Cassandra. Content bodies are slightly problematic since some of these stores are not amenable to streaming GB of data (Cassandra & JDBC) so the driver also has a blocked storage abstraction that collects together chunks of the body into block sets. As currently configured a block set is 64 1MB blocks in a single row stored as byte[] columns. Where a streamable source is available (eg File) there is a driver to stream content in and out direct from file.
    The code base has minimal dependencies, the main one being Google Collections for its map implementations and so it should be possible to create a driver for BigTable and run inside an App Engine.
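
    To make the block set layout concrete, here is a minimal sketch of splitting a content body into rows of 64 1MB blocks, with each row modelled as a plain dictionary of columns. It is an illustration under assumptions, not the driver code: the column names (`block:0`, `block:1`, …) and the function names are invented for this example.

```python
BLOCK_SIZE = 1024 * 1024  # 1MB per block, as described above
BLOCKS_PER_SET = 64       # blocks held in a single row


def split_into_block_sets(body, block_size=BLOCK_SIZE,
                          blocks_per_set=BLOCKS_PER_SET):
    """Return a list of rows; each row maps a column name to one block."""
    set_size = block_size * blocks_per_set
    rows = []
    for set_start in range(0, len(body), set_size):
        set_end = min(set_start + set_size, len(body))
        row = {}
        for index, offset in enumerate(range(set_start, set_end, block_size)):
            # Hypothetical column naming; the last block may be short.
            row["block:%d" % index] = body[offset:offset + block_size]
        rows.append(row)
    return rows


def join_block_sets(rows):
    """Reassemble the body by reading blocks back in row and column order."""
    return b"".join(row["block:%d" % i]
                    for row in rows
                    for i in range(len(row)))
```

    With the default sizes a 100MB body becomes two rows, the first holding 64 blocks and the second 36, so no single column value ever exceeds 1MB.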

  • YouTube Snippets

    One of the perks of being a member of University of Cambridge is you can (are actively encouraged to) attend Lectures, in any department, on any subject. I think I am right in saying 99% are open to any member of the University. Every now and again the Computer Lab has a speaker worth listening to. Oliver Heckmann, Director of Engineering, Google Zurich, and his talk “A Look Into Youtube - The World’s Largest Video Site” was one of those, especially seeing as a few hours earlier Turkey reimposed their ban on YouTube for what they claimed was unsuitable content, identified by Dr Heckmann’s content ID system. He was relaxed, unflustered by the robust stance Google Inc’s chief counsel was taking, reported minutes before by Reuters, to paraphrase, probably incorrectly, “….. censorship by any one country is an attack on US free trade… “, non US readers might be wondering about Global free trade at this point.

  • Version on Create Concurrency

    In Jackrabbit 2.1.1 (possibly fixed in later versions) if you create nodes with mix:versionable added to them, the version history will be created which will block other threads performing writes and Persistence manager level reads. If the persistence manager is a JDBC based persistence manager and other threads are attempting to find items that are not in the shared cache, reads will also be blocked, as they need to access the database by reading. Remember the PM is a single threaded transaction monitor in JR 2.1.1 and earlier. So creating an item with mix:versionable where many concurrent requests are performing the operation results in a thread trace as below (red == blocked threads).

  • Given enough Rope

    A bit of background. We have been experiencing bad performance for certain searches in Jackrabbit using Lucene. I always knew that sorting on terms not in the Jackrabbit Lucene index was a bad thing, but never looked much further than that until now.

  • Ever wondered why Skype doesn't work at home.

    Have you got a router with DOS protection? Go and look at the logs and if it shows lots of denied UDP traffic, turn the DOS protection off and see what happens. Inbound Skype often uses UDP and some routers think that the inbound traffic is a DOS attempt, blocking the packets and making the Skype audio sound like you are in a cave network at best. It also kills VPN/IPSec performance. Mine went from 5KB/s up to 600KB/s when I turned it off, after posting this I will make certain my IP changes.

  • Our code is like custard

    When you hit it, it becomes like concrete. Deadlocked like concrete. Oh dear, having just cut an RC3 for our Q1 release we elected to spin up a server and let friendly users have a play, confident that we had found all the contention issues. So confident that the JMeter test scripts had found everything, we used those test scripts to populate the QA server with a modest number of users, groups and content. I knew something wasn’t quite right as I loaded up the content and got complaints of slow response over IRC; come on, content upload being performed by 12 users with only 6K users loaded should not cause others to see really slow response (define really slow ? minutes, no kidding). In a moment of desperation I take solace in the fact that even though some queries were taking minutes, the server was rock solid. (lol).

  • Wrong Content Model

    I knew there was something wrong. You know that gut feeling you have about something, when it just doesn’t feel right, but you can’t explain coherently enough to persuade others so eventually self doubt creeps in, and you go along with the crowd. We have a mantra, borrowed phrase really, borrowed from JCR. “Content is everything”. It’s possible it was borrowed without knowing what it really meant. One of those emperor’s new clothes things, this time the emperor really was wearing clothes, and they were so fantastic that perhaps we thought they would fit our build, so we borrowed them, not quite realising what that meant.

  • Jackrabbit Performance

    Having spent a month or so gradually improving performance in Nakamura I thought I should share some of the progress. First off, Nakamura uses Sling which uses Jackrabbit as the JCR, but Nakamura probably abuses many of the JCR concepts, so what I’ve found isn’t representative of normal Jackrabbit usage. We do a lot of ACL manipulation, we have lots of editors of content, with lots of denys, we have groups that are dynamic, all of this fundamentally changes the way in which the internals of Jackrabbit respond under load, and it’s possible that this has been the root cause of all our problems. Still, we have them and I will post later with some observations on that.

  • Performance and Releases

    Why does everyone do performance testing at the last minute ? Must be because as a release date approaches the features pile in and there is no time to test if they work, let alone perform. Sadly that’s where we are with Nakamura at the moment. We have slipped our release several times now and are seeing a never ending stream of bugs and performance problems. Some of these are just plain bugs, some are problems with single threaded performance and some are problems with concurrency. Plain bugs are easy to cope with, fix them. This late in the cycle the fixes tend to be small in scope because we did at least do some of the early things right, the other two areas are not so easy.

  • Variable char encoding of bytes

    I may not be looking in the right place, but often I want to take a byte[] and convert it into a char[] where the char representation comes from a set of chars that I decide on. This is not for false encryption or obfuscation, I just want a safe compact representation of keys. Hex is ok, but bulky and does not give much flexibility. I couldn’t find anything off the shelf that would work with random encoding sequences and random byte[] lengths so here is what I quickly put together. Published for anyone who also couldn’t find anything off the shelf. It benchmarks at about 110K encodings per second (roughly 9µs each) on a MacBook Pro Java16 and is reasonably memory efficient. It could be made more efficient, but that would require more cpu.
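
    For readers who want the general shape of such an encoder, here is a minimal Python sketch. It is not the code from the post: it treats the bytes as one large integer and rewrites it in base N, where N is the size of a caller-supplied alphabet, and it prepends a 0x01 sentinel byte (an assumption of this sketch) so that leading zero bytes survive the round trip.

```python
def encode(data, alphabet):
    """Encode bytes using only the characters in `alphabet`."""
    base = len(alphabet)
    if base < 2 or len(set(alphabet)) != base:
        raise ValueError("alphabet needs at least two distinct characters")
    # The sentinel byte keeps leading zero bytes from being lost.
    number = int.from_bytes(b"\x01" + bytes(data), "big")
    chars = []
    while number:
        number, digit = divmod(number, base)
        chars.append(alphabet[digit])
    return "".join(reversed(chars))


def decode(text, alphabet):
    """Reverse `encode` for the same alphabet."""
    base = len(alphabet)
    index = {ch: i for i, ch in enumerate(alphabet)}
    number = 0
    for ch in text:
        number = number * base + index[ch]
    raw = number.to_bytes((number.bit_length() + 7) // 8, "big")
    return raw[1:]  # drop the sentinel byte
```

    A larger alphabet gives shorter keys: the same bytes come out shorter with a 31 character alphabet than with the 16 characters of hex. The big-integer arithmetic is quadratic in the input length, which is fine for short keys but not for large payloads.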

  • Sometimes new is not better

    Here is a classic example of new not being better. My utilities provider has a “new and improved interface”, so much improved that they felt it necessary to tell their customers how much better it was.

  • And it's working, JAMES3 IMAP in OSGi on Jackrabbit

    A self-contained James OSGi bundle containing a DNS Server and an IMAP server, binding down to a Sling repository. The screenshot is of OSX iMail connection doctor checking the imap connection and running some simple tests.

  • Bundling larger OSGi components.

    At the moment, I am bundling the James IMAP server with the JCR backend into OSGi. I could add all the jars to the OSGi container, but I already have about 150 classloaders and don’t really want to add more, so rightly or wrongly I am creating a larger bundle with most of the dependencies isolated inside the bundle. Wrongly, the OSGi purists scream. James servers typically use Phoenix. Being awkward, I want to bring the James IMAP server up standalone, only creating what is absolutely necessary, so I am not using Phoenix to configure it; I am instancing a custom NIOServer extending the default ImapServer based on the same. James servers, if you let BND analyse the classes, require lots of things, most of which I don’t think I will be using: things like Spring, Torque and Ant Tools. Getting well over 1000 packages just right in an OSGi manifest is a bit of a pain. Once it’s right, you will be thankful you took the time; if not, ask the OSGi purists who are now crawling up the walls and screaming. So here is the approach I am taking, which appears to be relatively painless (note, relatively):

  • Jackrabbit PM in columns

    A while back I asked about Cassandra as a Persistence Manager for Jackrabbit 2. The problem that exists with any column DB, and to some extent Cassandra (although it has a concept of quorum), is that the persisted value is eventually consistent over the whole cluster. Jackrabbit PMs, at least in JR2 and earlier, need a level of ACID in what they do. The PM acts as a central Transaction Monitor streaming updates to storage and maintaining an in-memory state. If you look at JR running against an RDBMS this is obvious. The ratio of select to modify is completely reversed, with 90% update and 10% read in spite of the JR application experiencing 90% read and 10% update. This presents no problem on a single node, provided the PM gives some guarantees of ACID-like behaviour. The problem comes when JR is run in a cluster. When the state of items is changed on one node, that change must be known on all nodes, so that their internal caches can be adjusted. That’s what the ClusterNode implementation in JR2 does. In a cluster it goes further. The sequence in which the changes are applied must be consistent, so that an add, remove, add happens in that order, on all nodes. Finally, and crucially for a persistence layer that is eventually consistent, all nodes must be able to see the committed version of an item when they receive the event concerning the item.

  • Whats going on in Nakamura/Sakai 3

    There are lots of things happening in Nakamura, the back end for Sakai 3, and many of them are of little interest to anyone thinking of how Sakai 3 might impact their lives in teaching, learning and collaboration in higher ed. However, standing back for a moment, 2 features are going to make a massive impact, perhaps not initially, but certainly longer term: a full Business Rules Engine and an activity-based Workflow Engine. I don’t expect these words would excite many readers and I would not expect the features to be instantly visible to the teacher, student or researcher. However, with these capabilities in place it becomes possible for new features and functionality to be implemented with less effort and programming. We already have a highly flexible component architecture, in OSGi, allowing deployers to add functionality without modifying the core code base. We also have an extremely flexible unstructured storage model courtesy of Apache Sling and Apache Jackrabbit, which means that, unlike many comparable products, there is no schema rebuild required to add functionality, but we moved to that structure 18 months ago and that’s not what excites me. We already have a flexible widget-based UI model that allows institutions to own, develop and deploy “Apps” for Sakai. We have OpenSocial integration allowing Students, Teachers and Researchers to build academic networks. That’s all in addition to all the features you rightly demand from Moodle 2, Blackboard 9 and others. But that’s not what excites me.

  • Configuring Logging in Nakamura/Sling, at runtime

    One of the most frustrating things about trying to find out what is broken in a running application is that the information in the logs is not informative enough, and if it is, there is too much of it to be useful. This is often because the logging configuration has to be set prior to the server starting. So you have 2 options: configure the logging at debug level just in case there is a problem, or restart the server when there is a problem. Neither is really practical in production. Debug on everything kills the server, and you’re definitely in the wrong profession if you can predict where to put debug on (ie the future); you should be a banker.

  • ACL extensions just got easier

    Extending the types of ACLs in Jackrabbit 1.x was hard. After 1.5, where there was a reasonable ACL implementation, much of the code that managed this area was buried deep within inner classes inside the DefaultAccessManager and related classes of Jackrabbit 1.5/1.6. In Jackrabbit 2, as part of the improvement to the UserManager (I guess), it’s become much easier to make extensions. I had a patch that modified the core classes of ACLEditor, ACLProvider, ACLTemplate in JR1.5, allowing the injection of a class that would control the way in which Access Control Entries were collected for a node. This allowed things like dynamic principal membership to be implemented (eg membership of a group that is determined by a rule set, and not a membership record). The upside was that this was possible in 1.5; the downside was that it was hard and required re-implementation of some of the core classes, in the same package space, to get round private, protected and even code blocks based on class names. So the patch was huge and hard to maintain.

  • Jackrabbit2 User Manager Scaling.

    In Jackrabbit 1.x the User Manager (at least after 1.5) stored users in a security workspace within the repository. This was great, and works well up to about 1000 users. However it uses a structure where users are nested in the user that created the user. If “admin” creates all your users, then there will be 1000 child nodes under the admin user node. Sadly, additions become slower the more child nodes there are. Pretty soon a number of things happen. First, the number of child nodes causes updates to become so slow that it’s hard to add users (>5K users). This can be addressed by sharding the child node path, avoiding large numbers of child nodes. Secondly (and this is harder to solve), the query that is used to find a user, or to check that the user doesn’t exist somewhere, becomes progressively more expensive, so that when you get to about 25K users the create user operation has slowed by an order of magnitude. That may not sound too bad, since it’s not often that you want to create a user. However, retrieval of a user becomes slower as well, since you can’t calculate the location of the user node from the userID, and since this needs to be done on almost every request, it slows everything.
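
    To make the sharding idea concrete, here is a minimal sketch of a path that can be calculated from the userID alone, so no query is needed to locate the node. It is only an illustration: the class UserPathSharder, the /users root and the choice of SHA-1 with two hex characters per level are all assumptions of mine, not what Jackrabbit does, and the userID is appended without any JCR name escaping.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a sharded node path from the userID alone.
final class UserPathSharder {

    // Each level adds one path segment taken from one byte of the hash,
    // so no single node ends up with more than 256 shard children.
    static String shardedPath(String root, String userId, int levels) {
        byte[] digest = sha1(userId);
        if (levels < 0 || levels > digest.length) {
            throw new IllegalArgumentException("levels must be 0.." + digest.length);
        }
        StringBuilder path = new StringBuilder(root);
        for (int i = 0; i < levels; i++) {
            path.append('/').append(String.format("%02x", digest[i] & 0xff));
        }
        return path.append('/').append(userId).toString();
    }

    private static byte[] sha1(String s) {
        try {
            return MessageDigest.getInstance("SHA-1")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }
}
```

    With two levels, shardedPath("/users", "ieb", 2) gives a path of the form /users/xx/yy/ieb, and the same userID always maps to the same place.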

  • New Programming Language

    My new programming language that always compiles, never has bugs, has perfect style and is generally delivered on time (all IMHO) is English. Developers must have a screw loose. Generally they refuse to write anything down, often they say the documentation is in the code, and yet most of their leisure time is taken up refactoring, rewriting and perfecting that algorithm that started out as a simple sort and is now drinking credit card limits on a cluster of Amazon nodes. Meanwhile, those crafty non-developer types lean back and claim victory with a page of prose that no compiler can even start to understand, and yet they are the ones living it up with deadlines met. Now, we do do our best to ensure that doesn’t happen, but there is a lesson to be learnt here.

  • In the Zone

    I have been doing an experiment for the past 3 months. I work in a busy office, open plan and quite noisy at times. There are many projects running in the office, probably about 20 at any one time, with a mixture of management, creatives and engineers. The thinking goes that with an open plan office where everyone can hear everything, there is cross-fertilization of thoughts and information between projects, and the normal process of active management of projects is not necessary because everything is visible and no one ever gets the chance to bury themselves in a hole for months. Well, there is a free flow of information between individuals, and being within the University of Cambridge, free flow of thoughts is what we stand for. We’ve just spent a whole year shouting about 800 years of Cambridge thoughts changing the world (very arrogant, but that’s also a Cambridge trait). So is this free flow of thoughts good?

  • Incremental Integration testing with Sling

    I keep on forgetting how to do this.

  • Smart Meters, I dont get it.

    In the UK there has just been an announcement that every house will have a smart meter to monitor home energy use. Fantastic, at least if we want to reduce our consumption at home we can. But hold on a minute: rollout is going to take over 10 years, it’s going to cost £6.8Bn and it’s only expected to result in a 10% saving in the home. I don’t get it. Who pays? Apparently the home owner. £330 to get it installed, saving £26 per annum off the average bill. Since most silicon-based devices in constant use have an MTBF of < 10 years, the cost will never be recovered. I don’t get it. Ahh, it’s to reduce our Carbon footprint. I wonder how much extra Carbon footprint £6.8Bn of expenditure equates to; all economic activity has an impact (other than planting trees on a commercial scale and using the wood for buildings that last 200 years). If people really want to make an impact they need to do less, consume less, and keep things simple. Call me cynical, but the smart meter initiative sounds like a stitch-up between government and industry: industry to create a huge new market for something that didn’t exist previously, government to find a new way of avoiding building sufficient green energy plants, nuclear included, to meet peaks of demand and hence raising taxes. I wonder how many MW of green energy you could provision for £6.8Bn. If it was Nuclear, probably 2 according to http://en.wikipedia.org/wiki/Economics_of_new_nuclear_power_plants; if it was wind, it’s much harder to tell. That’s why I don’t get it.

  • Sling Documentation Annotations

    It’s been noticed that documentation that is not in the same version control system as the code is frequently not maintained. This leads to the users of the interfaces getting increasingly frustrated as nothing appears to work, although to be fair to the developers the users may well be looking at out-of-date documentation.

  • Note to self: JcrResourceResolver2, selectors and extensions

    This really is a note to myself, as I have a habit of forgetting this and spending ages debugging.

  • Declarative optional multiple references flaky in OSGi

    It looks like binding multiple optional references in OSGi is flaky at least with Felix 1.2.0. Uhh what does that mean?

  • Sling Runtime Logging Config

    One of the most annoying things about bug hunting in open source is that you can see that the developer left log.debug( statements for you in the code, but you have to shut down, reconfigure logging and restart. In Apache Sling this isn’t the case. According to the documentation at http://sling.apache.org/site/logging.html you can configure logging at runtime. AND you can configure it on a class by class basis. Here are some screenshots of how.

  • Sling and OSGi in Oakland

    Over here in Oakland there has been a lot of interest in 2 areas: firstly nosql storage, and secondly OSGi-based platforms. The nosql platforms are of local interest since anyone thinking of creating a business that will become profitable in social media has to think about huge numbers of users to have any chance of converting the revenue from page views into real cash. They have to start from a scaling viewpoint. That doesn’t mean they have to go out and spend whatever meager funding they just raised on massive hardware; it means that they have to think in a way that scales. Doing this, they start in a way that scales for the first 10 users, and then as the numbers ramp up faster than they can provision systems or install software, they can at least stand some chance of keeping up. Right at the backend, doing this with traditional RDBMS’s is complete nonsense. Ok, so you might be able to build a MySQL cluster in multimaster mode to handle X users, but at some point you are going to run out of ability to add more and you won’t get to 10X or 100X, and by the way break even was at 1000X. To me that’s the case for nosql, eventual consistency and a parallel architecture where scale-up is almost 100%. This makes me laugh. Back in 1992, having parallelised many scientific codes with what felt like real human benefits (Monte Carlo simulations for brain radiotherapy, early versions of GROWMOS and MNDO, protein folding codes, an algebraic multigrid CFD code used for predicting spread of fires in tube stations, and some military applications), we never saw this level of speedup. Perhaps the problems were just not grand challenge enough… and social media is…. But on the serious side, thinking of the app as a massively parallel app from the start creates opportunities to have all the data already distributed and available for algorithmic discovery from the start. Not surprisingly, the Hadoop sessions were the largest, even if some of the analysis was on the dark side of the internet.

  • Confusing, but logical ItemExistsException

    In Jackrabbit, if a session does not have permission to read an Item in the repository, an AccessDeniedException is thrown. In Sling this appears as a 404 at http, which makes perfect sense (until I start to think about it). However, if you suspect the item really does exist, you can try to modify the item. The result is an ItemExistsException at the JCR layer, confirming that the AccessDeniedException on read was correct: the item exists but you can’t write to it. What is confusing is that session.itemExists() returns false and Sling gives a 404, both trying to hide the information, but it’s all too easy to use the update operation to determine whether the information isn’t there, or whether you just don’t have read on it.

  • Fedora OSGi

    Had a visit from Edwin Shin (Fedora Commons Committer) this week, who nipped over from France where he now lives. Compared to the Med, Cambridge is cold this year. We spent the week looking at the progress made on Sakai K2, based on Apache Sling, from the viewpoint of Digital Repositories. Looks like there is lots of common ground. We both see long term storage as being cloud based, with interesting points on storage mechanisms like Apache Cassandra, Project Voldemort etc. Also the component structure provided by OSGi (Apache Felix in the case of Sling) has some strong benefits. He started to embed a Fedora RDF component into Sling/K2 as an OSGi component. I wonder how much of the Fedora/DSpace functionality could be covered by standard components like this. We might achieve the same economies we saw with Sakai, 1.8 million lines of code down to about 60K, just because we avoided “not-invented-here”.

  • Clouds are search based, humans are not.

    Spending 4 hours in a car driving to Oxford to give a presentation at OSS Watch gave me an opportunity to think. Perhaps getting mildly lost in Milton Keynes on the way back consolidated my thoughts. For those that don’t live in Europe, Milton Keynes is a new town built in the 1960’s, consuming a village of the same name. It’s laid out on a grid pattern like many US cities, rather alien to Europeans who have become used to winding roads that promise the reward of a destination. “The Great North Road” goes north-south, and for the authorities in London it took them to the Great North. If named by those in Newcastle it might have been called “The Crowded South Road” to discourage any brain drain. But the interesting thing that struck me as I pulled over to search on Google Maps for just where “H5” went in Milton Keynes (H5 is the name of one of the grid roads) was that humans are unable to make sense of large amounts of unfamiliar information. For the average European (inhabitants of Milton Keynes excluded), the grid pattern of Milton Keynes with its symbolic naming of the major arteries is confusing, just as the winding roads of Europe with ancient names and strange numbers like “A1” must feel like a trip along the blood vessels of some strange animal for the average US city dweller. But even then we are all given a frame of reference or a language that enables our small brains to navigate this space. In the UK, the road names provide us a way: if you follow “Cambridge Road” out of the east end of London, you stand some chance of ending up in Cambridge; before urban sprawl that chance was a certainty. Imagine a world where there were no maps, and no visibility beyond the end of your nose, except a device. That device allowed you to say where you wanted to go, and it would go out into the cloud of information and tell you the way. This is the world of search and the cloud.
    The compartmentalisation and ordering has been abstracted to such an extent that all containers are removed and everything exists within a massive amorphous cloud. We have developed highly efficient tools to locate information within that cloud, eliminating all need to pre-categorise anything. But are we missing something? We are humans after all, and we have become adept at sharing and communicating by compartmentalising what is important. We talk of main roads, autobahns, highways and interstates, and know that although there are smaller, less travelled routes to one side, we could take a detour, follow our noses, make discoveries and likely get back on the highway at the next junction. The trigger might be a signpost tempting us off the trunk route. Cloud and search do not really provide us with this structure, and the point of this post is that when you try to interface a compartmentalised or hierarchical mechanism with a search-based cloud system it generates tension.

  • Sling Virtual Paths, ResourceProviders

    I feel guilty and bad. I have a patch against the core of the Sling engine for resolving what I call virtual resources, but that patch is against implementation code and breaks with annoying regularity, as well as requiring the Sling Resource bundle to be unpacked, patched and repacked. There has to be a better way. I did refactor this to eliminate most of the code changes, and that’s in SLING-1129, but IMHO it’s still not great.

  • HowTo: add more than one source directory into a maven build

    More and more project builds use code generators to convert model files or interface descriptions into code and associated resources. The normal approach is to generate the code every time and place it in the …./target area of the build so that it gets built during maven’s compile phase. However there are times when you want to generate the code rarely, and have it form part of the code base. If this is the case, then putting it in the main source tree, with all the hand-crafted code, can be dangerous and confusing. Ideally generated code, even if you keep it for a long time, should really go in its own source tree, eg …/src-generated.

  • Automated Testing

    JUnit testing and integration testing are all very well, but being within the same JVM generates some level of synchronization even if effort has been taken to isolate the testers from what is being tested. Frequently there is also a level of internal knowledge shared between the tester and the tested, which blurs the API being tested.

  • Reloading OSGi

    If you want to break an OSGi container, just reload a utility library with static methods that other things depend on… my experience has been that the whole container grinds to a halt, only to be recovered by a kill. YMMV.

  • OSGi and Snapshot versions.

    If you call your versions 1.2.0-SNAPSHOT and reference them as such in a manifest file, they won’t load in OSGi, or at least not with Felix, as the version segments are expected to be numbers, parsed on “.”. In the manifest the version must be parsable, e.g. 1.2.0.SNAPSHOT, which looks a bit odd, but works. It looks like the bnd tool does this for you.
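
    For illustration, here is a minimal sketch of that conversion in plain Java. It is not how bnd implements it; the class OsgiVersions is hypothetical, and the sketch assumes the part before the first “-” is made of numeric segments, padding it to three segments so the qualifier lands in the fourth position that the OSGi version syntax (major.minor.micro.qualifier) expects.

```java
// Hypothetical helper: rewrites a Maven-style version into OSGi syntax.
final class OsgiVersions {

    static String toOsgiVersion(String mavenVersion) {
        int dash = mavenVersion.indexOf('-');
        if (dash < 0) {
            return mavenVersion; // no qualifier, nothing to rewrite
        }
        String[] numeric = mavenVersion.substring(0, dash).split("\\.");
        // OSGi qualifiers may only contain letters, digits, '_' and '-'.
        String qualifier = mavenVersion.substring(dash + 1)
                .replaceAll("[^A-Za-z0-9_-]", "_");

        StringBuilder osgi = new StringBuilder();
        for (int i = 0; i < 3; i++) {
            if (i > 0) {
                osgi.append('.');
            }
            // Pad missing minor/micro segments with 0; extra segments are dropped.
            osgi.append(i < numeric.length ? numeric[i] : "0");
        }
        return osgi.append('.').append(qualifier).toString();
    }
}
```

    So toOsgiVersion("1.2.0-SNAPSHOT") returns 1.2.0.SNAPSHOT, and a shorter version such as 1.2-SNAPSHOT is padded to 1.2.0.SNAPSHOT.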

  • File meta data

    Digital repositories need metadata associated with content, at the most basic level this can be as simple as properties associated with the file nodes in JCR.

  • Inverted Index Scalability

    Search mechanisms based on inverted indexes work because the number of terms in the search space is considerably smaller than the search space itself; otherwise, why would you bother to invert? So most search engines work well on languages. The human brain is quite capable of learning a controlled vocabulary that enables it to communicate concepts with other humans. Like a search engine, it would suffer if it had to learn a single token for every piece of knowledge that ever existed. Communication would be highly efficient, but rather boring; single words followed by long and contemplative periods of thought.
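
    A toy example makes the point about term counts. This is only a sketch of the general technique, not of how any particular search engine stores its index: the class InvertedIndex is hypothetical, it tokenises naively on non-word characters, and it keeps everything in memory.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index: maps each term to the ids of documents containing it.
final class InvertedIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Looking up a term is one map access, however many documents exist.
    Set<Integer> search(String term) {
        Set<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? Collections.emptySet() : Collections.unmodifiableSet(hits);
    }

    // Number of distinct terms, which is what the index size depends on.
    int termCount() {
        return postings.size();
    }
}
```

    Indexing “the cat sat”, “the dog sat” and “The cat ran.” stores nine words but only five distinct terms. The gap between those two numbers is what makes inverting worthwhile, and it widens as more text in the same vocabulary is added.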

  • Abstracting Sakai Urls

    For years the Sakai community has suffered with unspeakable URLs. To give educators URLs that they can only communicate in text, and are unable to spread by word of mouth, must be a barrier to teaching. As we rewrite Sakai based on Apache Sling I am determined to ensure that a tutor at Cambridge can say, in a busy street, at lunch time, to a confused student, “go to camtools quantumwell2008 and look for the lecture notes” with some confidence that the student will enter camtools.cam.ac.uk/quantumwell2008 and find what they need.

  • Reloadablility

    I have been using reloadability a lot recently in OSGi and it is great, but being great does depend largely on the developer. It’s rather like reloading webapps. If the component is working well, and the developer hasn’t done anything to cause a problem, then a bundle in OSGi will reload over and over again. If however they have captured resources and don’t conform to the life-cycle methods of whichever bundle activation flavor they are using, then reloading will fail. Not necessarily immediately, but at some point. For example, other bundles may depend on the reloaded bundle and they may not correctly reset their internal state, or just occasionally some sort of lockup might happen as below.

  • Community behaviour and choice of scm.

    Coming from projects where the community is collaborative in nature, I have often wondered how the management of scm, and committer access influences the nature of the community. I wasn’t really certain that it had any impact until I tried something other than svn and cvs. IMHO, cvs is painful in the extreme for widely distributed groups, svn is better but it places some interesting barriers to creativity. With a centrally provisioned repository, personal expression and experimentation is limited. Few take branches to experiment, most work locally until they are willing to share. Developer communities resort to lazy commit consensus and/or tools like the excellent codereview app running on a Google App Engine. The former requiring commit stream discipline and the latter requiring constant searching for approval from fellow committers. So these tools are filling the holes left by a centralized scm in a distributed development environment.

  • Dynamic Groups

    We have been looking hard at how AuthZ works in Jackrabbit and Sling. Not least that JCR Sessions are considered expensive objects, even though they typically take < 1ms to initialize. The reason they are considered expensive, is not the cost of creating one, but the potential long term cost. Hence sessions are pooled for reuse but the selection strategy looks for a free session bound to the user before taking one from the pool of all sessions. Obviously the session is cleaned prior to being bound to another user.
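
    The selection strategy described above can be sketched in a few lines. This is an illustration under my own assumptions, not Sling or Jackrabbit code: UserAffinityPool and Lease are hypothetical names, the “session” is a generic type rather than a JCR Session, and the cleaning step is whatever callback the caller supplies.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical pool: prefers a free session last bound to the same user.
final class UserAffinityPool<S> {

    // A pooled session together with the user it was last bound to.
    static final class Lease<S> {
        final S session;
        String lastUser;

        Lease(S session, String lastUser) {
            this.session = session;
            this.lastUser = lastUser;
        }
    }

    private final Deque<Lease<S>> free = new ArrayDeque<>();
    private final Supplier<S> factory;
    private final Consumer<S> cleaner;

    UserAffinityPool(Supplier<S> factory, Consumer<S> cleaner) {
        this.factory = factory;
        this.cleaner = cleaner;
    }

    synchronized Lease<S> acquire(String userId) {
        // 1. Look for a free session already bound to this user.
        for (Iterator<Lease<S>> it = free.iterator(); it.hasNext();) {
            Lease<S> lease = it.next();
            if (userId.equals(lease.lastUser)) {
                it.remove();
                return lease;
            }
        }
        // 2. Otherwise take any free session and clean it before rebinding.
        Lease<S> lease = free.pollFirst();
        if (lease != null) {
            cleaner.accept(lease.session);
            lease.lastUser = userId;
            return lease;
        }
        // 3. Nothing free, so create a new session.
        return new Lease<>(factory.get(), userId);
    }

    synchronized void release(Lease<S> lease) {
        free.addLast(lease);
    }
}
```

    The linear scan in step 1 is fine for a small pool; a larger pool would probably want a map from user to free sessions instead.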

  • Following a bunch of forks.

    Can start to be a pain, this fetches things a bit faster for review.

  • Bundle Dependencies

    Having spent a lot of time resolving bundle dependencies over the last two weeks, here are some of the things that I have learnt.

  • bundle and dependency plugin

    I have been experimenting with getting a happy medium in the bundle and dependency plugins with respect to internal dependencies. The bundle plugin analyses the code base and constructs a suitable MANIFEST.MF for the bundle, and the dependency plugin embeds jars into the jar to resolve internal dependencies. Their handling of those jars is interlinked, and confusing when it comes to both dependencies and transitive dependencies. The dependency plugin may pick up your dependencies and copy them into the jar, but unless they are required by the directives in the bundle plugin they won’t appear in the final bundle. So here are some settings that appear to work.

  • Be careful what you publish

    Google are now analysing uploaded videos for copyright infringement. If you happen to make a video cam recording of anything that has some ambient music, and publish that, even to one person, I am reliably informed that, if the music was copyrighted, by publishing you have breached the copyright. Do it too many times and Google will ban you.

  • Loading from OSGi Framework bundles

    There are some really confusing things about the Class resolution in OSGi, that to the uninitiated like myself of 4 days ago appear like complete black magic. First off, there are 9 rules to OSGi class resolution to confuse you, if you were not already confused enough by classloading, but since I have struggled for 4 days, I thought I might share one solution.

  • OSGi Service Model is limited

    I don’t want to sound negative, but as always when you look past the hype and read the detail some of the truth comes out. Take the OSGi Configuration Service. On the face of it this would allow each service to have its own configuration and allow you to bring a service up and let it manage everything. Well, to an extent that is true, provided you adopt the ManagedService model. This means that OSGi manages all the services for you: you define what you want to be configured by using constants within your code or creating an xml file defining the configuration constants, and OSGi takes care of the rest. Sounds pleasant enough, and allows changes in configuration to be listened to by the components. Update the config, and it’s reflected in the component. So this works perfectly well for a single component with a single service impl exposing a single api, but as soon as your bundle contains a collection of service implementations that are constructed with IoC, then none of the services can be managed. In short the ManagedService places a boundary around the service implementation that prevents it from communicating with other services except via static instances, or something horrible.

  • Jackrabbit searching

    Jackrabbit searching on jcr:path as the primary search vector is expensive… avoid. Node properties and node types are fast.

  • Alternative Locking for a Jackrabbit Cluster

    From the previous 2 posts you will see that I have been working on fixing some concurrent update issues with Jackrabbit in a cluster. The optimising and merging nature of Jackrabbit’s conflict resolution strategy certainly gives it performance, but it does not guarantee that the data will always be persisted. Handling those exceptions would work in a perfect world, but I don’t have one of those to hand.

  • Impact of Locks in a cluster

    I thought JCR locking was a potential solution, but there are some issues. With Jackrabbit, each lock generates a journal entry, and it looks like there might be some journal activity generated with attempting to get a lock.

  • JCR Locks and Concurrent Modifications

    Heavy concurrent modification of a single node in Jackrabbit will result in InvalidItemStateException even with a transaction. The solution is to lock the node. The code below performs a database-like lock on the node, timing out after 30s if no lock was obtained. The lock needs to be unlocked as it’s a cluster-wide lock on the node.

  • Jackrabbit Observation

    Not Observation as in the ObservationManager sense, but an observation about JCR and Jackrabbit that has been confusing me and still is. If I put access control on JCR, I don’t get notification of an access control failure until I try to save an item or, if in a transaction, at the commit (need to check that). This means that the failure doesn’t happen in the code where the problem is. I am not certain that is right since, given a permission denied on save, you might take alternative action, but if you have to wait until the end of the transaction… how can you?

  • Faster Jackrabbit

    Just as with everything, there are right ways to do things and wrong ways. It looks like Jackrabbit is the same: doing lots of saves generates lots of version history in Jackrabbit and results in lots of DB traffic, which makes all JCR operations slow. If you can, do one save per request cycle; binding a transaction manager to the JCR objects means that all SQL activity is performed at the end of the request cycle in one block. Having seen the impact of a small amount of tuning on write performance, I think there will be scope for more.

  • Xerces and Classloaders

    It can be hard to work out what happens with Xerces and classloaders. It’s a common cause of ClassCastExceptions. Often it is caused by Java loading one class from some parent classloader and then the ParserFactories loading from the context classloader; the result is a ClassCastException even though the classes are the same.

  • JCR SQL Queries

    Care needs to be taken

  • K2 Search

    Search is all inside Jackrabbit inside K2, there is no search service and no managing segments. It supports all standard documents that you might encounter, and the lag between update and search is generally < 100ms. This is not really any surprise since the Query mechanism inside Jackrabbit depends on search. This means we can do relational queries using the search engine without hitting the DB at all…. just a bit more scalable.

  • K2 Memory Usage

    33MB Perm Space, 11MB Heap after startup; that’s with most of the functionality to support the UX project.

  • JPA EntityManagers

    Been having problems with JPA EntityManagers recently in K2 as we didn’t realise they should not be shared between threads. Fortunately fixing this in K2 doesn’t require very much. Create a proxy to a thread-bound entity manager contained within a ThreadBound holder. When these are put into a request scope cache from the CacheManager, the ThreadBound.unbind method is called when the request ends, so we have a hook to perform commit and close down on the EntityManager. The really nice thing about using a Cache from the CacheManager is that it can also be created with a Thread scope rather than a Request scope. In Thread scope, it’s not unbound… so by changing one switch we can, if the JPA usage is well behaved, move from a per-request to a per-thread binding. Fortunately the overhead of creating an EntityManager from most JPA providers is < 1ms… and with this approach, we don’t need to create it on every request.
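
    The shape of that holder can be sketched without any JPA classes at all. What follows is my own stand-in, not the K2 code: ThreadBoundHolder is a hypothetical name, the factory plays the role of creating an EntityManager, and the unbind callback is where commit and close would go.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical stand-in for a thread-bound resource holder.
final class ThreadBoundHolder<T> {
    private final ThreadLocal<T> bound = new ThreadLocal<>();
    private final Supplier<T> factory;
    private final Consumer<T> onUnbind;

    ThreadBoundHolder(Supplier<T> factory, Consumer<T> onUnbind) {
        this.factory = factory;
        this.onUnbind = onUnbind;
    }

    // Returns this thread's instance, creating it lazily on first use,
    // so no instance is ever shared between threads.
    T get() {
        T value = bound.get();
        if (value == null) {
            value = factory.get();
            bound.set(value);
        }
        return value;
    }

    // Called at the end of the scope (request or thread): releases the
    // instance and gives the caller a hook to commit and close it.
    void unbind() {
        T value = bound.get();
        if (value != null) {
            bound.remove();
            onUnbind.accept(value);
        }
    }
}
```

    Whether unbind() is called at the end of each request or only when the thread is retired is then the single switch between per-request and per-thread binding.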

  • Kernel Update

    That sounds like a security patch, but it’s not :). The Sakai Kernel work is moving forwards. We have a demo server up and running with the UX work on top of it and should soon have Content Authoring functional. We have tested some of the aspects. Loaded 5000 users, a bit slow at the moment at 0.4s per user, but then we don’t have any indexes in the DB. We have tested webdav, which looks just fine with 2G of files from 1K to 100K in size, and done some tests with large files > 1G. So far so good. The best bit appears to be that it is quite happy in 64M of memory and starts in 12s.

  • What not to do at an Airport ?

    Easy… arrive 5 hours after the plane takes off on New Year’s Eve… in Sydney… bound for London with no flights going out :). That’s what I did, oops. My family, who were with me, were not that happy, but the consolation was we saw the fireworks on Sydney Harbour Bridge.

  • Apple Mail Sync Hang

    Sometimes when offline I have noticed Apple Mail hang, especially on slow connections with lots of latency. It appears that some ADSL providers in Australia route their US bound traffic via Japan, giving high packet latencies over the Pacific (400ms), so IMAP to GMail is slower than normal. If you have been working offline for some time, Apple Mail will store your operations in an offline cache in ~/Library/Mail/IMAP-XXXX/.OfflineCache where the XXXX is the account name. In this folder there are normally lots of numbered files and an operations file. The numbered files are email messages, and the operations file is a redo log written in binary format referencing the numbered files. When Mail comes back online it will replay the operations file.

  • Adding a Branch to Git

    One of the benefits of Git is that it’s easier to manage and merge in multiple branches. You can do this with a git-svn repo by pulling the svn branches into the git repo to perform merge operations locally, before committing back to the svn branch. Normally you might have pulled the whole tree from svn including the branches and tags with git svn clone http://myrepo.com/svn -T trunk -b branches -t tags but what do you do if you only took out the trunk and want to add the branch in?

  • Unit Test and Constructor IoC

    I might have said it before, but Constructor IoC has 2 big advantages over setters and getters.

  • Increasing Code Coverage

    One of the problems for developers is that the honest ones want to increase their code coverage and unit tests, but don’t really know how much of their code isn’t tested. There are solutions to this. You can add a coverage report to maven and build a site, or you can use an eclipse plugin, which is more interactive. http://www.eclemma.org/ works well inside eclipse and gives good information about the lines being covered. The temptation with this evidence that code is covered or not is to chase the % coverage up by devising unreal ways of exercising the code. This should be resisted at all costs, as it will give a false sense of security, but it would be great to get to the point where the last 5% of coverage is all there is to worry about.

  • Performing a Mvn Release (Notes)

    Maven performs a release by editing the pom files and replacing the version numbers; it then commits the edited files after a build. Having done that, it re-edits the poms to the new trunk version and commits again.

  • OSGi version resolution

    OSGi version resolution looks great, but there is a problem that I am encountering. I can deploy more than one version of a jar as separate OSGi bundles, but unless the consuming bundle was explicitly written to specify a range of versions, that bundle will not start.

  • Classloader Magic

    OSGi has classloader magic, although it’s not really that much magic. Once inside a running JVM, a class is an instance of the class information bound to the classloader that loaded it. So when binding to the class, it’s both the classloader and the class that identify this instance. All those ClassCastExceptions where everyone shouts out… “you stupid program .. those are the same class” … only to realise some time later that you were the stupid one. Perhaps ClassCastExceptions have the wrong name, since they really talk about the class instance rather than the class type, although that may just be because many developers think of class as type.

  • How Shindig Works.

    Here is a great description of the internals of Shindig, http://sites.google.com/site/opensocialarticles/Home/shindig-rest-java. Well worth a read for the technically minded.

  • Linear Classloaders and OSGi

    OSGi does not remove all classloader problems, as can be seen from http://www.osgi.org/blog/2008/07/qvo-vadimus.html and http://www.osgi.org/blog/2007/06/osgi-and-hibernate.html where Peter Kriens notes that

  • Stopping Spotlight

    Second in the slow network performance of a backup drive series. Yes, I can’t run unit tests fast enough at the moment, so I am trying to speed my box up and fix all of those things that are broken.

  • OSX network errors

    OSX has a bug in its network stack, apparently associated with a 10.5 update, but the fix appears to cure slow performance on large file transfers on Tiger as well. Details are at http://gp.darkproductions.com/2008/03/fix-for-slow-wifi-80211-file-transfers.html

  • Code Quality

    What do you do if you have no idea who might use your code, and you want to be able to sleep at night? Simple, you make certain that all the lines of code you write for all the situations you can think of are covered by a unit test, so you don’t get a 4 am call out to something that you broke. Fortunately open source code doesn’t come with a Blackberry you **have** to answer or a pager that’s always on, but that doesn’t mean that there isn’t a sense of responsibility to others that use your code. Even if what you write disappears into a corporation and supports perhaps millions of users, I still feel a sense that I should make every effort to make it as solid and reliable as I can.

  • Shindig SPI Implementation

    A while back I started a Service Provider Implementation for Shindig, aiming to store the data necessary to support the OpenSocial API in a simple DB. This was before there was an SPI; now there is, and it makes quite a bit of sense. There is a push to release Shindig 1.0 by the end of the month, and although a sample implementation of the SPI may not be in the core 1.0 release, I think it would make sense to have something done by then. Not least because it will keep us clean on maintaining the SPI, if we have an implementation of the SPI to break.

  • Sakai K1 1.0 Site

    I have published a maven site of the latest snapshot of the kernel containing current javadoc and test reports, including some extremely embarrassing code coverage (fixable). eg

  • HTTPRest in a load balancer.

    The situation is that you have a REST based API that is used by a client that doesn’t support cookies, and you want to ensure that the client binds to an existing session. And if that’s not a pain, it needs to work in a load balanced cluster.

  • Pulling SVN with Git on OSX

    I meant to do this a while back, document git setup on OSX.

  • Sakai Realm Relationships

    While writing some new queries I once again found myself looking for an entity diagram of the core relationships in Sakai Realm. I couldn’t find one, so I did a quick sketch, here for reference.

  • Reasons to support Safari

    With the excellence of Firefox many developers leave out Safari from their list of primary targeted browsers. I think this is beginning to be a mistake.

  • How To Design a Good API and Why it matters

    A great talk on API design, 1h long but worth listening to. Especially the advice,

  • Moving Git patch streams between repos.

    I have 2 Git repositories mirroring SVN repositories, and I want to use git to replay all the patches I made in one of the repositories to the other. The added complication is one of the SVN repos that I am mirroring uses externals, the other does not. So my source is a single repo, with no externals, and the target has externals. Fortunately I want to make all my changes in one of the modules.

  • What’s so good about git with svn

    This should be what’s so good about git, but Sakai has SVN as its central repo…. so what’s so good about git-svn ?

  • Installing git on OSX Tiger to work with SVN.

    First off, no dmg available for Tiger, so you need to build from source.

  • Kernel DB Reverse Engineer with Cayenne

    Since I have been playing with Cayenne, I thought I would have a look at what it would do with the Sakai Kernel database. This is the limited set of services that represent a minimal kernel, no UI.

  • Cayenne Plugin

    I have been playing with Shindig to create a datamodel to go behind the OpenSocial API. At first I thought it would be simple to use JDBC direct, but it turns out that the model has a reasonable number of dependencies and the entities are quite big. So, I decided to use Cayenne. The model is created and has some simple test cases, and it all appears to work. I did write a maven plugin to generate SQL scripts from the Cayenne model, so it supports most major databases (about 8). Creating the model from some Java interfaces was quite easy, in fact I did most of it by converting the java class into cayenne XML.

  • plexus-util build problems

    Looks like there are some dependency issues with building maven plugins. Plexus Utils 1.1 appears to be at fault, with messages such as NoSuchRealmException. If you get this, try upgrading the maven api to 2.0.6 or later in your plugin build. Worked for me.

  • Widget accessibility and Shindig

    I have spent most of the day playing with Shindig, integrating its data-model into Sakai and learning Guice, which is very nice and simple, especially its error feedback, which just seems to get straight to the problem.

  • Offline Maven 2

    Ever tried to run maven 2 with no network and snapshots, after midnight…. it’s hard because it likes to check for updates of snapshots. The following in ~/.m2/settings.xml might help.

  • Embedding Nasty Templates, Trick

    If you have a template that, because of its nature/structure/purpose, is not valid XML… yes it does happen… (sherlog search snippets), you can insert it into a comment and then a CDATA and retrieve it from the DOM. Yes, HTML comments are in the DOM.

  • Javascript Templates

    There are 2 main ways of generating markup from within javascript. Either you perform DOM manipulation, by string injection or by direct DOM manipulation, or you can use templates.

  • Lucene Index Merge and Optimisation

    Lucene index merge has some parameters that affect how the index is built. This has an impact on the index operations other than search. The MergeFactor controls how many documents are stored within each segment before a new one is started and how many are started before they are collected into a larger one. So a Factor of 10 means 10 documents before aggregating, and 10 aggregated indexes of a certain size before aggregating again. Consequently MergeFactor controls the number of open files.
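
    To make the arithmetic concrete, here is a toy model of that merging rule. It is not Lucene code and ignores buffering, merge policies and the several files each segment uses; MergeFactorModel and its methods are made up for illustration. Every document starts a level 0 segment, and whenever a level holds mergeFactor segments they collapse into one segment at the next level up.

    ```java
    /** Toy model of factor-based merging; names and rules are assumptions, not Lucene API. */
    final class MergeFactorModel {
        private MergeFactorModel() {}

        /** Segments left at each level after adding docs documents one at a time. */
        static int[] segmentsPerLevel(int docs, int mergeFactor) {
            if (mergeFactor < 2) {
                throw new IllegalArgumentException("mergeFactor must be at least 2");
            }
            int[] levels = new int[32];
            for (int d = 0; d < docs; d++) {
                int level = 0;
                levels[level]++;
                // A full level is merged into a single segment one level up.
                while (levels[level] == mergeFactor) {
                    levels[level] = 0;
                    level++;
                    levels[level]++;
                }
            }
            return levels;
        }

        /** Total live segments, a rough proxy for how many files stay open. */
        static int totalSegments(int docs, int mergeFactor) {
            int total = 0;
            for (int count : segmentsPerLevel(docs, mergeFactor)) {
                total += count;
            }
            return total;
        }
    }
    ```

    In this model totalSegments(999, 10) is 27 and totalSegments(1000, 10) is 1, so a level never holds more than mergeFactor - 1 segments between merges. A larger factor means fewer merges while indexing but more segments lying around at any one time.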

  • Installing Sources.

    To be able to install jar sources you can run the mvn source:jar maven command and that will put jar sources into your local repo, so you can use them in eclipse.

  • Reducing Working Code Size

    How many of us load the whole of the Sakai code base into eclipse, and wonder why it consumes so much memory? Most I guess. Alternatively you can just load the code you are working on and use the local maven repo for the Sakai jars; that way eclipse will run in considerably less memory. When you need to access the source code, if the repo has the source jars, then they can be used instead of the live code base. Obviously this doesn’t allow you to edit any code anywhere…. but then should we all be doing that anyway… except for those rare debugging exercises.

  • Why Spring and Hibernate can’t be separated

    After extracting the spring-core, spring-hibernate3 and all the various parts of Sakai, and fixing the classloader issues surrounding IdGenerators etc, I find both Spring and Hibernate use CGLib for proxies, and if you separate Spring from Hibernate, they fight over CGLib. Either Hibernate can’t create proxies because it can’t see the hibernate classes from CGLib, or Spring can’t get to CGLib because it’s not in shared. Looks like it’s not going to be possible to separate Spring and Hibernate into separate classloaders without providing some extra level of visibility between the classloaders.

  • Reloading components in webapps... now :)

    All this talk of reloading components as a requirement for developers got me thinking. Webapps do load a spring context, so why not write a context listener for a webapp that loads that webapp up as a component ? Well, at the moment almost nothing stands in the way, provided I treat the webapp the same as I would a component. I can change the packaging from component to war in pom.xml, add web.xml to the WEB-INF folder with a modified context listener class that loads the component with a new classloader and the correct context classloader. Start up tomcat and hot deploy. The component comes up. I can then redeploy, and I get a whole bunch of spring Infos about bean overloads. And reloading works, no tomcat restart required.

  • Google Calendar

    Everyone keeps on telling me how cool Google Calendar is, and how we should throw away our Enterprise calendar systems. I sort of heard what they said and intended to get round to having a look one day. Finally I did. And it is.

  • OSGI Components

    Perhaps this is premature, but a quick look around Apache Felix and Spring OSGI (Sorry I should say Spring Modules…. why the name change ?) gives the impression that it’s not going to be too hard to make most of Sakai OSGI.

  • Spring Proxies that Don’t appear to quite work.

    Spring Auto Proxies take the hassle out of AOP, but they don’t always work. Here is the situation. I have an implementation in a classloader (component classloader) that can’t be seen by the classloader that spring lives in (shared classloader). The Service API to the implementation is in the shared classloader. So I can create an Auto Proxy on the Service API and all is Ok. But then there are some configuration settings in the implementation, expressed as getters and setters, that are not present in the API. In Sakai, sakai.properties does this all over the place with a method@springID where springID is traditionally the API id.

  • Component Loaders

    I have been looking at an alternative mechanism of loading non webapp components in sakai for about 2 weeks. Currently we load the component manager as a side effect of webapp startup. Using a static ComponentManager factory means that the first thing to access it causes it to startup. Consequently, the first webapp with a context listener will perform a start-up. There is nothing particularly wrong with this other than the lack of control over the startup and lifecycle of the component manager, coupled with the need to have a static ComponentManager factory.

  • Vectors ... read the small print

    Well I thought Vectors were synchronized. The Object is, but Iterators are not.
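
    A small sketch of what the small print implies, assuming the usual fix of holding the Vector’s own monitor for the whole loop. Vector’s methods synchronize on the Vector instance, so writers on other threads block until the iteration has finished. VectorIteration is just an illustrative name.

    ```java
    import java.util.Vector;

    final class VectorIteration {
        private VectorIteration() {}

        /** Sums the values while holding the Vector's lock, so no other thread can modify it mid-loop. */
        static int sum(Vector<Integer> values) {
            synchronized (values) {
                int total = 0;
                for (int value : values) {
                    total += value;
                }
                return total;
            }
        }
    }
    ```

    Without the synchronized block each individual call is still thread safe, but another thread can add or remove an element between two steps of the iterator, and the loop is then likely to fail with a ConcurrentModificationException.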

  • Loading Component Manager.

    I’ve been having a look at the loadup of the component manager. At the moment we do it as a side effect of the first webapp to load. This is Ok, but it doesn’t really tie into the life cycle of the container and feels wrong. We have seen some problems in the past due to startup order, which the Component Manager should not be subject to.

  • How to run Oracle on OSX Intel

    You can’t, well not on OSX. But you can bring up a Parallels desktop, install debian etch from the network iso, make certain you give it 1G of swap to keep oracle happy, don’t add any extra packages, and then edit /etc/apt/sources.list

  • Running Oracle On OSX

    Being slightly desperate for a test instance of Oracle on OSX I had a brainwave. Run Linux inside Parallels Desktop and run Oracle inside that VM. Since Parallels uses VT and can go direct to hardware, it looks like an option. However, to get Ubuntu 7.04 running you have to tell Parallels your OS is Solaris. Something to do with the vga setup. Apparently the line vga=771 also works on boot.

  • JSR-170 does provide a level of interoperability.

    Well that’s a bit obvious, standards are supposed to do that, however frequently they fail to generate interoperability.

  • Sakai Search working in a cluster again

    Not so long ago I realised that Sakai search was corrupting its indexes every month or so due to NFS latencies in propagating changes in the shared segments. I had incorrectly assumed that these latencies might be in the order of 10s max, but this appears not to be the case.

  • The benefits of Unit testing.

    Ok, so it’s only one data point, but it illustrates the benefits of unit testing.

  • What's the right way to do IoC ?

    When we talk about IoC, there is a vast spectrum of IoC complexity that we are willing to accept. Those who love Spring IoC in XML will create 1000’s of lines of XML and proudly show 5 lines of Java. On the other end of the scale there are those that IoC 2 or 3 large beans to represent the exposed bean. Which is right or better ? I have no idea, both are probably right depending on your religion. Here are some observations.

  • Timer leaks

    If you use Timer and TimerTask you may find some strange behaviour with one shot TimerTasks, i.e. ones that run just once after a delay. If you add a lot of them to the Timer, they tend to be held onto by the Timer itself, and hence anything they reference will also not get GC’d.

  • Running specific Maven Test with JVM Args

    I have some long running tests in search, but I wouldn’t want anyone to run them as part of the normal build. The tests don’t have the word Test in the classname, which prevents them from running, but they can be invoked on the command line with -Dtest=classname

  • HSQL Unit testing

    Don’t be fooled by HSQL unit testing… its transaction isolation may lead you to believe that your unit tests are working perfectly, but it doesn’t support READ_COMMITTED transaction isolation, and it’s a true transaction monitor when it comes to committing the data, ie the code is single threaded. Since Sakai uses READ_COMMITTED for its transaction isolation in production, rather than READ_UNCOMMITTED (dirty reads), tests that work on HSQL will not work in production, and tests that work in production won’t work in HSQL.

  • Xen Bridge on Debian Sarge/Etch with 2 interfaces

    The standard network-bridge script that comes with Xen on Debian Sarge does not appear to work. The problem appears to be that, after converting the hardware ethernet into a promiscuous port (peth1) and binding a virtual port veth0.1 to the bridge, the network script fails to bind the fake eth1 to the virtual port.

  • Documentation on the Entity Binary Serialization

    I have put some rough documentation on the new Entity Serialization being used in 2.5 at http://bugs.sakaiproject.org/confluence/display/~ianeboston/Entity+Type1+Block+Encoding

  • Xythos releases JSR-170 beta programme


  • Faster Lighter Serialization.

    I have been having a long hard look at the serialization of data to storage in Sakai. Back in 2.4 I noticed while testing Content Hosting DAV for search that there was quite a lot of GC activity. A bit of profiling (with YourKit, thanks for the OS license :) ) showed that about 400 to 500 MB was going through the GC per upload in a big worksite due to quota calculations. This isn’t quite as bad as it sounds, since objects going through the heap quickly don’t cost much. However, this is not good.

  • Surefire Unit Test arguments in Maven 2

    To make the surefire plugin for maven2 operate in a separate jvm instance, and have different jvm args (eg more memory, or a profiler), you can change the way in which the unit tests are launched.

  • Lucene lock files that won’t go away

    As part of the search testing, I’ve been running a large number of threads operating on lucene segments. It all works fine for about 45 minutes and then I start seeing messages like

  • Zip files without an End-of-central-directory

    If you create zip files with the ZipOutputStream and forget to close or flush the stream correctly, you may be able to re-open them in Java, but if you try unzip or another command line utility, you will get a cryptic End-of-central-directory missing message.
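
    A minimal sketch of the safe pattern on a modern JDK, assuming try-with-resources is available. Closing the ZipOutputStream is what writes the central directory and its end record. ZipCloseExample and both of its methods are illustrative names, and the signature check is only there to show that the record made it into the output.

    ```java
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.UncheckedIOException;
    import java.nio.charset.StandardCharsets;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipOutputStream;

    final class ZipCloseExample {
        private ZipCloseExample() {}

        /** Builds a one-entry zip in memory, closing the stream so the archive is complete. */
        static byte[] zipOne(String name, String content) {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write(content.getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            // The stream is closed here, so the central directory has been written.
            return bytes.toByteArray();
        }

        /** Looks for the end-of-central-directory signature, the bytes P K 0x05 0x06. */
        static boolean hasEndOfCentralDirectory(byte[] zip) {
            // The end record is at least 22 bytes long, so start that far from the end.
            for (int i = zip.length - 22; i >= 0; i--) {
                if (zip[i] == 0x50 && zip[i + 1] == 0x4B && zip[i + 2] == 0x05 && zip[i + 3] == 0x06) {
                    return true;
                }
            }
            return false;
        }
    }
    ```

    On the JDK 1.5 of the time the same effect needs an explicit close() in a finally block.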

  • Unit Testing Strategies

    When I first wrote Search, I didn’t do much detailed unit testing because it was just so hard to make it work with the whole of the Sakai framework. And because of that lack of unit testing, the development of search was hard work. I did have some unit tests that did things like launch 50 free running threads to try and break synchronization of certain parts of the search framework, but much of this I did with Sakai up and running. The cycle was slow, and the startup even slower with the sort of tests I needed to do. Recently I have been re-writing the search indexing strategies to make it more robust, using a 2 phase commit strategy controlled by 2 independent transaction managers (along the lines of XA) with a redo log of index updates. Clearly this is slightly more complex, and this time I need unit testing. So here is my personal list of do’s and dont’s for unit testing with sakai.

  • Development and Testing cluster up

    I used to do Sakai cluster testing with multiple tomcat instances on the same box, and share the ports. This is Ok, but when 5 developers try to do the same thing on the same box you soon run out of meaningful port numbers, and a careless killall can bin a few hours worth of testing.

  • Xen Virtual Sakai Servers with an NFS Home

    The aim is to create a number of Xen virtual servers with as much configuration in place as possible to bring up Sakai. As per the previous posts, I am using an old Sarge box, with a Dom0 Xen installation and a bridged Xen network. Each client will hopefully DHCP its interface up, but I could build clients with fixed IPs.

  • Xen installation on Debian, commands to remember

    Just a note to myself on a Xen installation on Sarge. Having managed to boot the machine into a non responsive state several times, here is how not to do it.

  • Ajax and UTF8

    I have suddenly found out that the faith that I had placed in the javascript escape(), encodeURI() and encodeURIComponent() for encoding correctly was misplaced. Here is the problem: a traditional form submits UTF-8 perfectly and all works. An AJAX form only works if there are no UTF-8 characters. And this only happens with certain UTF-8 characters that are high up in the range. It turns out that the %20%u5FFD encoding produced by the above doesn’t work when submitted as application/x-www-form-urlencoded to Tomcat, even with charset=utf-8 or Character-Encoding: UTF-8; . The encoding has to be +%E5%BF%BD, the percent-encoded UTF-8 bytes, to make it work. If you bring up tcpdump and look at the raw tcp packets you will see that Firefox uses the latter for direct posts.
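
    For comparison, this is what the server side considers the form encoding of the same value. It is only a sketch using java.net.URLEncoder from the JDK; FormEncoding is an illustrative wrapper, and the input is a space followed by U+5FFD as in the example above.

    ```java
    import java.io.UnsupportedEncodingException;
    import java.net.URLEncoder;

    final class FormEncoding {
        private FormEncoding() {}

        /** Encodes a value as application/x-www-form-urlencoded using UTF-8 bytes. */
        static String encode(String value) {
            try {
                return URLEncoder.encode(value, "UTF-8");
            } catch (UnsupportedEncodingException e) {
                // Every JVM is required to support UTF-8, so this should not happen.
                throw new IllegalStateException(e);
            }
        }

        public static void main(String[] args) {
            // Prints +%E5%BF%BD: the space becomes '+', U+5FFD becomes three UTF-8 bytes.
            System.out.println(encode(" \u5FFD"));
        }
    }
    ```

    The %uXXXX form is not part of URL encoding at all, which is presumably why a form decoder has no idea what to do with it.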

  • Caching

    It’s a while since I have had a chance to look at caching. Most of us reach for HashMap or something more concurrent, but there are plenty of cache implementations and the use cases are relatively well understood. Put an object, get it back, and clear it. JCACHE JSR-107 never really did appear to produce a standard, perhaps it was one of those standards that was just too close to the Map concept to be interesting for those involved, however several of the cache providers look like they support it. But in Sakai we have already been using a cache for some time. We have a sophisticated internal cache in the form of the memory service. This works in a cluster and several years back was state of the art. Since then the Concurrent work has delivered really sensible concurrent hash maps, Commons Collections has LRU hash maps and collections, and the higher level caching providers have moved on.
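
    As a reminder of how little code the basic put, get and evict case needs, here is a sketch of an LRU map built on the JDK’s LinkedHashMap in access order. It is not the Sakai memory service or the Commons Collections class, and LruCache is an illustrative name. It is not thread safe on its own and would need Collections.synchronizedMap or similar around it.

    ```java
    import java.util.LinkedHashMap;
    import java.util.Map;

    /** Illustrative LRU cache: evicts the least recently used entry once maxEntries is exceeded. */
    final class LruCache<K, V> extends LinkedHashMap<K, V> {
        private static final long serialVersionUID = 1L;
        private final int maxEntries;

        LruCache(int maxEntries) {
            // The third argument switches the map from insertion order to access order.
            super(16, 0.75f, true);
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Called after every put; returning true drops the least recently used entry.
            return size() > maxEntries;
        }
    }
    ```

    What this leaves out is everything that makes a real cache provider interesting: expiry, statistics, and above all invalidation across a cluster.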

  • Mobile Search Contexts

    Looks like the world outside Sakai is continuing to focus on search that has context. Taptu provides search results where the content is the sort of things that you actually want on a mobile phone…. not just the entire web recoded for a mobile phone. I understand that they are using Lucene and Velocity, and from the looks of it are buying lots of disk for the indexes. I wouldn’t expect a Sakai instance to need a 750GB index, but it looks like Taptu are already up at 10x that. See http://www.taptu.com/blog/ for the blog … which also looks obsessed and well informed on/with mobile devices.

  • Build Sakai with Maven 2

    Maven 2 is upon us… so here is a quick refresh.

  • Conversion of Sakai to Maven 2 build

    A long time ago, we started the conversion work on Sakai to make it use a Maven 2 build, partly because it promised to give us better control over dependencies but also because many of the projects we use in Sakai were also moving to a maven 2 build.

  • Maven2 repository distribution

    The maven wagon plugin supports a greater number of plugins to perform distribution than standard maven, which is good because doing a maven 2 deploy of a new artifact is a pain. However, webdav support is not quite there. It works just fine (in beta2) to do a deploy to an existing folder in the repo, but when there is no folder structure there, it fails.

  • A JSF Ajax Portal is possible and looks cool.

    I am no fan of JSF, but watching the eXo portal 2.0 is an eye opener.

  • Why Class Extension Pattern with Inner Classes is Bad

    I’m not against class extension patterns, most of Java is based on it and makes sense, but when you have a class extension model, coupled with inner classes and interfaces, the programmer can express all sorts of bad behavior.

  • Jackrabbit in a cluster, some pitfalls

    Although jackrabbit in a cluster does work, there are one or 2 issues with the 1.3 code base. There is a small patch that needs to be applied to the ClusterNode implementation to make it terminate the journal record.

  • So much for launch dates.

    Caret have been helping launch another Darwin website, this time Darwin Correspondence. And this time we thought we had the launch under control. The media were expected to publish tomorrow morning, and the site would be up and ready, but we were wrong. http://news.bbc.co.uk/1/hi/sci/tech/6657237.stm The BBC decided they would publish a story a little earlier than expected.

  • My MacBook is contributing to global warming. (Jackrabbit Cluster Testing Session)

    Only by a few watts, testing Jackrabbit in a cluster. I have been testing a 2 node cluster of Sakai with JCRService with 2 data sets. One is lots of small files, 9,215 files in 178MB, some indexable, and the other is 100 or so files of mixed type for 315MB. The single node cluster (ie no cluster) works seamlessly, uploading 4 files/second for the smaller files and growing the heap by about 100M every 2 minutes, which gets GC’d back down to a baseline of about 160M, probably about 1M per request. Looking at the CPU time, most time appears to be spent syncing the data down to the DB, which is good, indicating that the JCR is running as fast as the DB can accept data.

  • mvn eclipse:eclipse does some cool things

    Not having tried this much before I wasn’t certain what it did exactly, but once you have fully populated poms in your project, a quick mvn eclipse:eclipse will sync all the .settings, .project and .classpath files that eclipse uses. The really great bits are, 1) it knows about transitive deps and adds those where necessary 2) it knows about source code and if you do a mvn -DdownloadSources=true eclipse:eclipse it will download the sources, wire in the javadoc etc. It also knows about projects in the same package, referencing those directly.

  • Maven1 Maven2 synchronization

    Maven 2 is now building in trunk, but the other problem that I’m trying to fix is which jars are deployed and where they are deployed to. There are a number of bash commands I’m using to automate this process, recorded here so I don’t forget them.

  • Cayenne is a hot pepper, no kidding.

    So I was getting fed up with the datamodel I am working with, lots of many to many mapping tables, often with several tables between the real entities. So I looked up flattening with Cayenne.

  • Apache Cayenne Is Cool (Hot, So far)

    The first weeks of a marriage are bliss, and I guess that’s what the first hours of using yet another ORM tool are like. And so far, 4 hours in, I’m having real fun (sad) with Apache Cayenne. I have a largish schema, not related to Sakai, that I need to map, about 80 entities, mostly legacy with some reasonably complex relationships. The Cayenne Modeler tool and its integration with Eclipse is excellent, as is their getting started documentation.

  • Spring 2 XML validation errors (cvc-elt.1)

    If you get one of these in Spring 2

  • Search was broken

    I’ve been breaking search in a big way over the last week, but it’s fixed now (everything crossed)

  • JCRService + WebDAV Working

    I’ve now got the JCRService integrated into Sakai and working with Jackrabbit’s WebDAV library. The JCRService, which exposes the standard 170 API to Sakai, is using a Jackrabbit implementation integrated into Sakai. The WebDAV implementation uses Jackrabbit’s own WebDAV libraries, which have support for

  • Maven 1 XML and DOM Serialization in JDK 1.5

    To serialise DOM in JAXP avoiding dependencies on Xerces there are 2 options. This is JDK 1.5 code and later, prior to that you can just use Xerces
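
    The excerpt does not say which two options are meant, so this sketch is a guess at them: an identity Transformer, and the DOM Level 3 Load and Save serializer, both of which ship with the JDK from 1.5 onwards. DomSerialization and its method names are illustrative.

    ```java
    import java.io.StringWriter;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerException;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.ls.DOMImplementationLS;
    import org.w3c.dom.ls.LSSerializer;

    final class DomSerialization {
        private DomSerialization() {}

        /** Builds a tiny document to serialize. */
        static Document sample() {
            try {
                Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
                Element root = doc.createElement("root");
                root.appendChild(doc.createTextNode("hello"));
                doc.appendChild(root);
                return doc;
            } catch (ParserConfigurationException e) {
                throw new IllegalStateException(e);
            }
        }

        /** Option 1 (assumed): an identity transform from a DOMSource to a stream. */
        static String viaTransformer(Document doc) {
            try {
                Transformer identity = TransformerFactory.newInstance().newTransformer();
                identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
                StringWriter out = new StringWriter();
                identity.transform(new DOMSource(doc), new StreamResult(out));
                return out.toString();
            } catch (TransformerException e) {
                throw new IllegalStateException(e);
            }
        }

        /** Option 2 (assumed): the DOM Level 3 Load and Save serializer. */
        static String viaLoadSave(Document doc) {
            DOMImplementationLS loadSave =
                    (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
            LSSerializer serializer = loadSave.createLSSerializer();
            serializer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);
            return serializer.writeToString(doc);
        }
    }
    ```

    Neither route imports anything from org.apache.xerces, so the code compiles against the JDK alone even though the implementation underneath may well be a repackaged Xerces.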

  • Maven 2 Version

    You may have noticed that the maven 2 build has M2 all over it. This was to avoid conflicts with the maven 1 builds, but we need to be able to build everything to particular versions. This is not quite as easy as with maven 1 where project.properties allowed you to build a specific version.

  • Thread Unbind

    ThreadLocalManager does not give the application using it any indication of when the request cycle is over. If it had some unbind mechanism then objects could be notified, eg if the object being ejected from the thread local implements ThreadUnbind, its unbind method is called.
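
    A sketch of what that could look like. ThreadScope here is a made-up stand-in and not Sakai’s ThreadLocalManager, and the single-method ThreadUnbind interface is an assumption about its shape. When the per-thread map is cleared at the end of the request, anything in it that implements ThreadUnbind gets told.

    ```java
    import java.util.HashMap;
    import java.util.Map;

    /** Assumed shape of the callback: one method, called when the object leaves the thread. */
    interface ThreadUnbind {
        void unbind();
    }

    /** Hypothetical per-thread store that notifies its contents when cleared. */
    final class ThreadScope {
        private static final ThreadLocal<Map<String, Object>> BOUND =
                ThreadLocal.withInitial(HashMap::new);

        private ThreadScope() {}

        static void set(String key, Object value) {
            BOUND.get().put(key, value);
        }

        static Object get(String key) {
            return BOUND.get().get(key);
        }

        /** Called once at the end of the request cycle. */
        static void clear() {
            Map<String, Object> ejected = BOUND.get();
            BOUND.remove(); // detach first, so callbacks see an empty scope
            for (Object value : ejected.values()) {
                if (value instanceof ThreadUnbind) {
                    ((ThreadUnbind) value).unbind();
                }
            }
        }
    }
    ```

    Objects that don’t care about the request ending just don’t implement the interface, so nothing already stored in a thread local has to change.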

  • Reimplementing JCR

    I’m having another go at implementing a JCR based service in Sakai; this time I’m not binding it to CHS, as I want to see if it’s possible to solve the clustering issues with Jackrabbit 1.2.3. So far the structure looks interesting: javax.jcr.* lives in shared, and with a very small additional API, users can get hold of a JCR session, which may be authenticated as system, anon or a Sakai user. Once they have this they can manipulate the repository using all the normal JCR/JSR-170 API features.

  • And you thought you had privacy.

    How do you go about storing large volumes of data persistently in a browser? Cookies are no good because the data gets sent back to the server, so it will break most http connections at about 4K. But there are a number of newer mechanisms appearing. IE5+ has had userData ( http://msdn.microsoft.com/workshop/author/behaviors/reference/behaviors/userdata.asp ) for a while, Firefox2 has DOM:Storage from the HTML5 spec (see http://www.whatwg.org/specs/web-apps/current-work/#storage and http://developer.mozilla.org/en/docs/DOM:Storage ) and then there are things like halfnote ( http://aaronboodman.com/halfnote/ ). We appear to be moving rapidly to a state where the browser is becoming an offline client, but none of this works for FF1.5, Safari and others.


    Earlier I had raised the idea that JCR might exist behind RMI, potentially in a remote JVM. One thing I hadn’t considered was the security model. In past implementations of Jackrabbit as a Service in Sakai it was tightly bound and embedded. There is a security manager that uses JAAS to connect to the Sakai Security services (AuthZGroups) and answer role/context based security questions.

  • Google Just can’t help themselves.

    With all that money swimming around in Google they can’t help but hire lots of programmers and let them experiment. Recent examples: http://tools.google.com/gapminder/ which lets you correlate CO2 tonnes per capita to life expectancy and play how it changed over the years (actually this is Flash and it was bought by Google, but it won’t be long before it becomes Javascript+HTML+Canvas), and then there is a Canvas emulator for poor old IE browsers, http://code.google.com/p/explorercanvas/ but the UI that makes you blink continues to be http://pipes.yahoo.com/pipes/pipe.edit?_id=0mwRk4O72xGtSjMVl7okhQ just right click to see if you can find any Flash….. there isn’t any! Java2D for Javascript

  • Jackrabbit cluster has moved on

    It looks like Jackrabbit clustering has moved on quite a bit since I last brought a JCR up. Originally there was a demand for clustering in http://issues.apache.org/jira/browse/JCR-169 and later that was implemented under http://issues.apache.org/jira/browse/JCR-623. The basic mechanism is to maintain a journal log that all nodes in the cluster subscribe to and replay to remain consistent within the cluster. The replay causes the nodes in the cluster to keep their caches consistent with the other nodes in the cluster. The transport mechanism of the cluster was originally the file system but could be anything http://mail-archives.apache.org/mod_mbox/jackrabbit-dev/200611.mbox/%3c66c10f230611100158u70ff7b9ak59b889b86b409215@mail.gmail.com%3e
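
    A minimal sketch of the journal-and-replay idea, in Java. The names `Journal`, `ClusterNode` and `replay` are hypothetical and are not Jackrabbit’s API; a plain map stands in for the shared database. It only illustrates how replaying a shared log lets each node evict stale entries from its local cache before reading.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Shared append-only log of changed item ids (hypothetical, not Jackrabbit's API). */
final class Journal {
    private final List<String> entries = new ArrayList<>();

    synchronized void append(String itemId) {
        entries.add(itemId);
    }

    /** Copy of every entry recorded from the given revision onwards. */
    synchronized List<String> since(int revision) {
        return new ArrayList<>(entries.subList(revision, entries.size()));
    }
}

/** One cluster member with a local cache in front of the shared store. */
final class ClusterNode {
    private final Journal journal;
    private final Map<String, String> sharedStore; // stands in for the shared database
    private final Map<String, String> cache = new HashMap<>();
    private int revision = 0; // how much of the journal this node has replayed

    ClusterNode(Journal journal, Map<String, String> sharedStore) {
        this.journal = journal;
        this.sharedStore = sharedStore;
    }

    void write(String itemId, String value) {
        sharedStore.put(itemId, value);
        journal.append(itemId); // tell every node that this item changed
    }

    String read(String itemId) {
        replay();
        return cache.computeIfAbsent(itemId, sharedStore::get);
    }

    /** Replays unseen journal entries, evicting each changed item from the cache. */
    private void replay() {
        List<String> changes = journal.since(revision);
        for (String itemId : changes) {
            cache.remove(itemId);
        }
        revision += changes.size();
    }
}
```

    Two `ClusterNode` instances sharing one `Journal` and one store see each other’s writes on the next read, because the replay drops the cached copy first. A real implementation would also have to deal with concurrent writers and with journal growth, which this sketch ignores.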

  • Velocity HTML Escaping

    The age old problem, often overlooked in the rush to get a UI out of the door…. HTML escaping. Some view technologies, like JSP, have no support for it and you are just left to do it all manually, or forget. Some, like RSF, just do it without even thinking about it because they are XML based. And some have mechanisms to let you choose.
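
    For the “do it all manually” case, a hand-rolled escaper looks roughly like the sketch below. `HtmlEscaper` is a hypothetical helper written for illustration, not part of Velocity, JSP or RSF; it replaces the five characters that matter in HTML text and attribute values.

```java
/** Hypothetical helper: escapes text for safe inclusion in HTML. */
final class HtmlEscaper {
    private HtmlEscaper() {
    }

    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                case '\'':
                    out.append("&#39;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

    Every value written into a template has to pass through something like `HtmlEscaper.escape(value)`, which is exactly the step that gets forgotten when the view technology does not do it for you.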

  • Skinnable Portal

    Now that this is working, I have written some documentation (very brief) http://issues.sakaiproject.org/confluence/x/hpM

  • Apple Remote Desktop over SSH

    If you can get an SSH connection into a network then you can forward IP over an SSH tunnel…. we all knew that… but Apple Remote Desktop works quite well tunneled. The port to forward is 5900; if you bring it to your local host then you can connect with ARD to localhost and control the remote one.

  • Large SVN changes with .externals in place

    If you need to make changes to a large source tree with an .externals defining the source tree (like Sakai), it can be a pain, so it’s better to do it with a script perhaps.

  • Resetting Continuum passwords

    The only place I could find info on resetting a Continuum password was as a patch in Codehaus JIRA.

  • Content Hosting Handler

    We now have a patch for trunk for Content Hosting Handler.. that works!

  • GET urls are latin-1 encoded.

    Looks like GET URLs are latin-1 encoded.

  • UTF-8 Encoding on Gets with tomcat.

    Looks like browsers don’t UTF-8 encode GETs correctly, or at least that’s what’s observed. Take a form, make it a GET and put some UTF-8 chars in … e.g. 中国的网页
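
    A sketch of what a charset mismatch on a GET looks like, using only `java.net.URLEncoder` and `java.net.URLDecoder`. It assumes the query value was percent-encoded as UTF-8 and that the receiving side decodes it as ISO-8859-1; which side actually used which charset in the case above is not established here, and `GetEncodingDemo` is a hypothetical name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

/** Hypothetical demo of a charset mismatch on a GET parameter. */
final class GetEncodingDemo {
    private GetEncodingDemo() {
    }

    /** Percent-encodes a value as UTF-8 bytes. */
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    /** Decodes a percent-encoded value with the given charset name. */
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u56fd"; // the first two characters of the example above
        String onTheWire = encodeUtf8(original);
        System.out.println(onTheWire);
        // Decoding with the wrong charset turns each UTF-8 byte into its own character.
        System.out.println(decode(onTheWire, "ISO-8859-1").length());
        System.out.println(decode(onTheWire, "UTF-8").equals(original));
    }
}
```

    The two characters travel as six bytes; decoded as latin-1 they come back as six unrelated characters, decoded as UTF-8 they round-trip.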

  • Powerbook Hard Disks.

    If you find that your Powerbook sounds like a distant helicopter when you carry it around, replace your hard disk… now, before you lose it! Apparently the bearings of the Toshiba 60G disks in G4s fail with too much heat. I thought mine was just the fan…. until it sounded more like a Chinook overhead. http://www.raf.mod.uk/equipment/chinook.html

  • If you find that you get an invalid header field with a Maven 2 packaged war or jar then you are probably using an old version of the maven-archiver plugin that can’t cope with pom descriptions containing line breaks.

  • Where Maven 2 parent resolution breaks.

    There is a subtlety in the way in which maven 2 parent resolution breaks. If you have a project that has a parent that needs to be built, the dependency tree does not take into account parent poms when performing the build. You have to explicitly list the parent as a module to make it build and be part of the tree.

  • Dependency Resolution

    Maven 2 dependency resolution is a wonderful thing, but it’s not quite there in 2 when you have multiple classloaders and don’t necessarily want the transitive dependencies to flow through. In Sakai in Maven 1 we defined everything explicitly, but in Maven 2 I find that jars in the shared classloader creep into the components packaging and webapp packaging. The only way to stop this is to make the jar provided, but if that is done the dependency does not flow, so you can’t make a dependency transitive up to a war file and then not pack it. It’s all or nothing: provided, or not and packed.

  • Maven Ant Plugin

    Although the docs are a bit thin, the maven-ant-plugin works well for Maven 2 so you can port maven.xml into this form in some instances.

  • Maven2 more..... building

    I’ve now got it all building, including osp, sam, sections, gradebook (after a break going to the Mall, went into the Apple store, looked and looked and looked, but walked out empty handed…. maybe nothing is quite as tempting as it used to be, but I did see some iPod games, I wonder when Apple will release the dev kit… if ever) sorry… yes, it’s building now. Next problem is which deployer to use.

  • Maven2 Sakai .. more

    So I now have a set of poms for all apart from OSP that verify and go through the build. The XSLT approach has been augmented by scripts and manual editing to manipulate the poms into a structure that builds.

  • Maven 2 Continues....

    So, while listening to the “reassurances” of the Blackboard attorney that they wouldn’t issue writs against opensource before tea time……:)… I tried to do a bit of innovation. The Maven 2 conversion for the build structure is starting to work. One thing that I notice is that the dependency definition is much more precise in Maven 2, which is generating missing deps in some of the projects (xerces et al).

  • More on Maven 2

    After looking at the options, and being pointed at an XSL by Dave DeWolf, I’ve written a script that scans the Sakai code base, converts all project.xml files to poms and then builds the necessary multiproject module definitions.

  • Maven 2

    What would be required to make Sakai maven 2 ? I’m only investigating to see what it would take, I’m not advocating that this should be done.

  • Tagged Search

    New feature for search: tagged words of the search terms in the search results.

  • Jackrabbit Cluster

    Jackrabbit has created a clustered version… or at least there is a version in trunk that will be part of core that clusters. Jackrabbit can be made to use a shared DB quite easily, but if you do that and have 2 or more nodes accessing, the local cache maintains a stale copy of the data. The cache is what makes Jackrabbit fast. You could throw away the cache or make it write through expire, but that would kill much of the performance.

  • Clustered MySQL

    If you buy fast disks, put them in a cluster and put a DB on top of them, then you might hope the clustering mechanism would not kill all that performance you were after. With 300MB/s disks and 1000BaseT interconnect, DRBD generates a 3x slow down with MySQL on anything that needs disk block writes. It would probably be OK with slow disks, but when you find that order by generates disk writes, and most of the Sakai selects have order by, then you’re onto a non-starter. So we have abandoned DRBD for our HA cluster.

  • MySQL Cluster

    We have been having fun with a mysql cluster,

  • Triple Stores, Marsupials and Communities

    What happens when a corporate tries to buy and exploit open source without the community? The likelihood is that it dies. Open source is the community. When Northrop Grumman bought Tucana it acquired the copyright to parts of the Kowari code base, which was available under an MPL license. It then sent letters to some of the developers preventing them from releasing version 1.1, claiming it would damage their business. Presumably to avoid damages the key developers left, and the community entered a log jam. The kowari-dev list makes interesting reading, especially the lack of messages since May 06, the last one being from a lawyer. Unfortunately I suspect that the kowari-general list would have made interesting reading…. but that’s disappeared with all the old messages.

  • Hosting Darwin

    A few months ago, Caret offered hosting space to Darwin Online, a Cambridge University project. We didn’t write any of the code; that work was done by Antranig Basman, who works at Caret part time.

  • ActiveMQ / Kowari / Sakai Events

    For some time, I’ve been thinking about how the Sakai Events, which can fill up a production database, should be managed. Although of interest, they are not necessarily needed in the Sakai database for the smooth running of Sakai, and when there are 10 - 20M events present, the event service slows down a little on insert.

  • Segment Merge

    The segment merge algorithm in search is dumb and needs to be made better. At the moment it has a habit of not merging up to the full 2G segment size. This has the advantage that we don’t ask for massive transfers, but it would be better to be able to ask for a target segment size and actually get it.
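
    One way to “ask for a target segment size” is a greedy grouping: sort the segments by size and keep adding them to a group until the next one would push the group past the target. The sketch below is an illustration of that idea only, not the algorithm used in Sakai search; `SegmentMergePlanner` and `plan` are hypothetical names, and sizes are plain numbers in whatever unit you choose.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical planner: groups segment sizes so each merged group stays within a target. */
final class SegmentMergePlanner {
    private SegmentMergePlanner() {
    }

    static List<List<Long>> plan(List<Long> segmentSizes, long targetSize) {
        List<Long> sorted = new ArrayList<>(segmentSizes);
        Collections.sort(sorted); // smallest first, so small segments are merged together

        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentSize = 0;
        for (long size : sorted) {
            // Close the current group if this segment would take it past the target.
            if (!current.isEmpty() && currentSize + size > targetSize) {
                groups.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(size);
            currentSize += size;
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }
}
```

    With segments of 300, 100, 200, 900 and 500 and a target of 1000, the three smallest are merged into one group of 600 and the two larger ones are left alone. A segment that is already bigger than the target ends up in a group of its own.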

  • Structured Storage of Segments

    When the size of the index gets big there are some problems that I thought wouldn’t appear. A 500G index of 1G segments is going to have at least 500 files in the local segments space and in the shared segments space, at this size I would hope on the file system.

  • Search Hardware Requirements

    The hardware requirements of search are somewhat undefined…. why? Because we are dealing with a variety of document types with all sorts of content. A 10M PDF might contain only 10K of indexable content, and a 100K email message might contain 99K of indexable content, this makes it difficult to come up with anything precise about the size of the index.

  • Velocity Based Spring MVC in Sakai

    Well, it is possible, but you have to rewrite the Spring MVC view resolver and the Velocity Configurer so that neither extends any Spring framework classes, otherwise they will want to drag Velocity into shared. Once you have done that, it works like a dream, except that you can’t easily get the templates onto the filesystem.

  • Spring MVC in sakai

    There is an unfortunate side effect of using Spring MVC based apps in Sakai. Sakai has the Springframework libraries in shared, and it also has a number of utility classes that make Spring MVC much easier, but if those classes are expected to load the selected view technology, that view technology must also be placed in the shared classloader. As it stands, you can’t use the Spring provided VelocityViewResolver as it will pull the Velocity jars into the shared classloader, and break every Velocity app in Sakai…. but you can write your own view resolver which will sit in your own webapp classloader.

  • Skinable Charon

    After becoming closely acquainted with Charon at integration week (that’s the current Sakai portal and not a typo :) … more’s the pity ), I’ve had a look at making it a bit more skinnable. After a quick round of templating engines, and considering string properties for good measure, I noticed that some, including Velocity, allow you to render without binding dependencies. So I’ve extracted all the inline HTML from Charon, created vm templates and put the Velocity binding behind an API. It should now be possible to have per site portal templates that go beyond what can be done with CSS… and plug in any other templating engine.

  • To 1.5 or not

    Pandora’s box is open. We may not realize it, but there are a whole load of things in 1.5 that may trip us up. This is more about not moving to 1.5 than moving to it.

  • Alfresco

    Looks like Alfresco is starting to pursue JSR-170 for its 1.2 release. The positive side is that it clusters; the negative side is that at the moment it doesn’t pass all the Level 2 170 tests. It could be plugged in under the new CHS service with a small adapter, but if it’s not solid through the JSR-170 API, then it will be a pain to run, even in a cluster.

  • In Tomcat Unit Testing

    Although the Test Harness is great and enables Unit tests to work for components, there is a ‘dirty hackers’ alternative.

  • Hierachy Service

    I’ve started to think about the hierarchy service. I’m thinking about a Hibernate managed (with lazy connections) object tree, on the basis that access is read mostly. This greatly simplifies the API and usage. To avoid issues with Entity names, the path, which will be stored in the object, will be SHA1 hashed to generate the unique node id. So we can have node paths up to 4K long.
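
    Hashing the path into a fixed-length id can be sketched with the JDK’s `MessageDigest`. The class and method names (`NodeIds`, `nodeId`) are hypothetical, and encoding the path as UTF-8 and rendering the digest as lower-case hex are assumptions; the point is only that the id is always 40 characters however long the path is.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: derives a fixed-length node id from a hierarchy path. */
final class NodeIds {
    private NodeIds() {
    }

    /** Returns the SHA-1 digest of the path as 40 lower-case hex characters. */
    static String nodeId(String path) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(path.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }
}
```

    The id column can then be a fixed 40 character key, while the full path lives in an ordinary, unindexed column on the node.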

  • Log4J and Chainsaw

    After years of using tail -f or less on log4j log files, and then trying to do clever things in XL with the output, I realize there is this thing called Chainsaw, which accepts a feed of Log4j over IP.

  • JCR Session Startup

    Is good, and quick,

  • JCR Sessions.

    I had thought with the structure of JCR that it would be a good idea to open and attach one session to each request thread, avoiding the session creation mechanism. Before you think that this is totally daft, remember that the JCR persistence manager in Jackrabbit manages persistence for the entire JVM. So a managed session attached to the request thread is not so dumb…. well, perhaps, except that if anything goes wrong with the session, that state persists with the session to the next request. The interesting bit is that the error hangs around until an eden GC collection cycle takes place…. at which point any objects that were left uncommitted in the session are finalized. If the finalization ‘rolls back’ the JCR object transaction, the session recovers, but it looks like everything that was ‘committed’ after the error state is also rolled back.
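
    The leak from one request to the next can be shown without any JCR classes. In the sketch below `SketchSession` is a hypothetical stand-in for a repository session holding uncommitted changes, and `ThreadBoundSessions` keeps one per thread; neither is Jackrabbit or Sakai code. Discarding leftovers in a `finally` block is one possible guard, offered as an assumption rather than as the fix used here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a repository session that holds uncommitted changes. */
final class SketchSession {
    final List<String> pending = new ArrayList<>();

    void change(String item) {
        pending.add(item);
    }

    void save() {
        pending.clear(); // pretend the changes were persisted
    }

    void discard() {
        pending.clear(); // drop anything left uncommitted
    }
}

/** Keeps one session per thread, reused by every request that thread serves. */
final class ThreadBoundSessions {
    private static final ThreadLocal<SketchSession> SESSION =
            ThreadLocal.withInitial(SketchSession::new);

    private ThreadBoundSessions() {
    }

    static SketchSession current() {
        return SESSION.get();
    }

    /** Runs one request; when cleanUp is true, leftovers are discarded afterwards. */
    static void handle(Runnable request, boolean cleanUp) {
        try {
            request.run();
        } catch (RuntimeException e) {
            // The request failed part-way, so its changes were never saved.
        } finally {
            if (cleanUp) {
                current().discard();
            }
        }
    }
}
```

    Without the clean-up, a request that fails after `change(...)` leaves its pending items in the session, and the next request served by the same thread inherits them.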

  • Jackrabbit Cluster

    There is one thing that I must have forgotten to check out completely with Jackrabbit….. clustering! Although it uses a DB, and has a Persistence Manager, there is a cache sitting above the Persistence Manager which, in a cluster, risks becoming invalid. There is work in this area under https://issues.apache.org/jira/browse/JCR-169 but no indication of when it’s going to be implemented. I have also seen a jackrabbit-dev discussion on the implementation of cache concurrency in a cluster, which appears to be sensible and almost working.

  • Configuring Jackrabbit

    There are some interesting things that happen when you try to configure Jackrabbit. Firstly, the configuration appears to be stored in the repository home, so, once created, changing the startup configuration probably won’t change the repository…. except that some settings appear to get through. For instance, the driver class and the database url appear to get through, but the username and password don’t. I’m not certain what exactly is happening.

  • JackRabbit CHS service working!

    It’s working… unbelievable :) Hats off to Apache Jackrabbit as it looks stable and easy to work with. Also the existing Sakai CHS code is quite solid, as I’ve mostly done cut and paste and only found one or 2 places where abstract dependencies cross the API boundaries.

  • Boddington/Tetra/Hierachy

    The penny dropped, I went to Inverness to meet the Tetra/Boddington crowd from UHI and the similarities and differences between Tetra/Boddington and Sakai became clear. They are both collaborative environments for education and research, but Boddington has a hierarchical organization where Sakai has a flat organizational structure. If you add a Hierarchy super structure to organize Sakai Sites and Entities and a mechanism of viewing that structure then Sakai starts to look a lot like Boddington.

  • TidlyWiki

    At first I thought…. not another Wiki…. but then this looks interesting. TiddlyWiki is a wiki in an HTML page with the entire wiki engine written in JavaScript. Perfectly possible: JavaScript has good regex support and all the wiki engine needs to do is re-write DIVs with the rendered content.

  • Content Hosting JSR-170 with Jackrabbit Continued....

    So the CHS based on Jackrabbit is largely written and working. It can be found in Sakai contrib (https://source.sakaiproject.org/contrib/tfd/trunk). Surprisingly it works! After a few hiccups with node types and getting the Sakai representation of additional properties into the JCR node structure, it all works. I think I’ve only had to fix about 3 bugs so far, mainly where /’s were not put into Id’s.

  • Search Bugs

    There has been a slow stream of bugs coming from Cape Town with search. It shows that it’s being used in anger, which is good, but it also shows that there wasn’t enough testing in QA and my own approach to testing wasn’t 100%. But that’s not untypical of a developer who doesn’t think up every type of user input.

  • Big Packets in MYSQL

    Once again we discover that putting lots of data into the DB as blobs is not the best thing to do. In this case it is the search segments with MySQL. If you chuck 4 - 5 in blobs into MySQL InnoDB tables, then everything runs a little slow on that table. You can’t perform seeks on the blobs (not real ones) and when you try to retrieve them you find a practical file size limit somewhere below 2G, as you might expect on a normal 32bit file system, not because MySQL can’t cope with the data, but because the connectors and things on the network time out.

  • Content Hosting API in 170

    Almost there. One thing that I don’t know is a good idea or not: the Groups attributes associated with entities inside content hosting are stored in content hosting. So if you want to let Content Hosting manage its security underneath, by talking to AuthZGroups, then you have to stop it asking itself.

  • More on Patents

    Turns out Blackboard lawyers have stated they have no intention of going after opensource since they are not commercial (oS that is). But then they have commercial partners.

  • Blackboard Patent

    It doesn’t come as any surprise that Blackboard has a number of patents, some of them granted in places like Australia, NZ and the US, and some pending in the EU and other places.

  • GroupEdit and RWiki

    Looks like there are others with the same off line Collaborative document writing. If you took Sakai + RWiki and exposed its services as web services, it would be a short step to integrating a full blown off line Collaborative Document Writing environment.

  • ContentHosting JSR-170

    For those who don’t know, JSR-170 is the Java Content Repository specification. Apache has recently released a 1.0 implementation with Jackrabbit, which looks good. Day Software, who formed a large part of that project, are using something similar in commercial products that they sell. If you believe their web site (no reason not to) they have some solid names as customers.

  • Search Deployments 2

    The Second problem that has been found in Search in production is that the number of index segments grows.

  • Search Deployments 1

    It looks like there are a number of sites actively deploying Sakai Search in production. Thankfully the stream of requests for fixes has been relatively low volume, so I did something right.

  • Sakai Search Update

    I have been doing quite a bit of work recently improving the performance and operation of the search indexer in Sakai. Problems have been mainly around using Hibernate in long running transactions. It is so unpredictable in how it touches the database that long running transactions will lock large areas of the database… which causes other threads, connected to users’ requests, to randomly exhibit stale object, optimistic locking and lock timeout failures. This is only made worse in MySQL, where the lock is placed on the primary key index, which means that neighboring records are also locked (if they don’t exist).


    XPDL is a large schema (1000+ lines). I wanted to create an entity model from the schema, but I also wanted that entity model to make sense. So I looked at HyperJaxB, JaxB, Castor and a number of other technologies. These are all great technologies, except for 2 things. With a complex real life schema, all tend to represent the object model like a DOM. It’s not that easy to make the object model understand the xsd. For instance means a list of attributes, not a new object that implements a container that contains attributes. My second requirement is that the generated model should persist in Hibernate. This is where the JaxB like mapping technologies fall over. The bean model that is created is so complex that it looks completely mad when mapped into a database. It’s almost impossible to do anything with.

  • IronGrid, P6Spy, IronTrack, IronEye

    Picking up an old Hibernate book, flicking to the back, I noticed IronTrack. I remember this from back in 2003, but never used it. When you start searching for IronGrid in Google, it looks like they don’t exist any more, a pity since it looked like a good way of seeing what a DB was doing. There are commercial products that do the same, but this was supposed to be open source. However there is talk of a Trial License, which is a new model to me: open source, but if you want the binaries, you need a license. Strange. Perhaps that’s why they don’t exist any more.

  • Workflow models

    I had a complete and working workflow engine that had most of the concepts present in XBPLM. Then I realized that it was entity focused and not service focused. To make things work really well in Sakai or any other SOA, they need to minimize the size of the entity model that is pulled into memory and focus much more on delivering a result. The workflow engine was teetering on becoming an EJB monster without the EJB container.

  • Workflow Implementation

    I’ve resurrected the Workflow implementation that was started at the end of the last JISC VRE project meeting. Wonderful how meetings give you time to think, but leaving it alone for such a long time has done two things: helped me forget what I was doing, and let me solve the same problems in a different but better way.

  • IMS-LD and Workflow

    There has been a thread on sakai-pedagogy on Learning Design sparked by Mark Norton. This discussion triggered a long held thought, that IMS-LD is a specialized form of workflow that could be implemented and enacted in a generic workflow environment. I don’t know how true this is, or if there is a sufficiently complete mapping to make this possible, but experimentation will help us discover if this is the case.

  • JCR JCP-170 JackRabbit and ContentHostingService

    As I start to look more at ContentHostingService and the JCR API it looks like they address the same problem, from different positions in the content stack. The Sakai Content Hosting Service API has been bound to by many of the tools in Sakai. Changing this API would be a significant investment with widespread impact. The JCR API is a standard and, although complete, appears to be missing some features that the Sakai Content Hosting Service uses. My interpretation, perhaps incorrect, is that JCR does not support the fine grained access control that exists within Sakai Content Hosting Service. There is however a mechanism for injecting the concept of a user into the JCR, but that mechanism feels like representing a user of the JCR rather than a user of Sakai. This, perhaps, is a philosophical standpoint.

  • Clustered Index in Search

    After spending many hours with a hot CPU, I finally came up with an efficient mechanism of indexing in real time in a database clustered environment. I could have put the Lucene index segments directly into the database with a JDBCDirectory from the Compass Framework. But unfortunately the MySQL configuration of Sakai prohibited the emulation of seeks within BLOBs, so the performance was hopeless. I’m not convinced that emulating seeks in BLOBs actually helps, as I think the entire BLOB might still be streamed to the app server from the database.

  • Section Group Support for Wiki in Sakai

    There is already some support for Groups and Sections in Sakai RWiki. This is basic support that connects a Wiki SubSpace to a Worksite group. If the connection is made (by using the name of the group as the SubSpace name), permissions are taken from the Group permissions. There is a wiki macro that will generate links to all the potential Group/Section SubSites in a Worksite (see the list of macros in the editing help page)

  • LGPL What is acceptable extension

    Sesame is LGPL licensed, with a clarification on Java binding. The net result of the statement in the Readme is that you can use Sesame in another project without it having to be LGPL. That’s great! Well, it’s great if you want to use the LGPL library in a way the developers intended. When it comes to reimplementing an underlying driver you are faced with three choices: implement the driver so that it’s compatible with the internal implementation, implement your own algorithm, or use something else.

  • Sesame RDBMS Drivers

    I’ve written a Data source based Sesame driver, but one thing occurs to me in the Sakai environment. Most production deployments do not allow the application servers to perform DDL operations on the database. Looking at the default implementations, that’s the non data source ones, they all perform lots of DDL on the database in startup and in operation. This could be a problem for embedding Sesame inside Sakai. I think I am going to have to re-implement the schema generation model from scratch. It might even be worth using Hibernate to build the schema, although it’s not going to make sense to use Hibernate to derive the object model; the queries are just too complex and optimized.

  • Sesame RDBMS Implementation

    It looks like there are some interesting features in the Sesame default RDBMS implementation. Since it uses its own connection pooling, it tends to commit on close. If the standard connection pool that is used by default is replaced by a javax.sql.DataSource, things like commit don’t happen when Sesame thinks they should have happened. The net result is a bunch of exceptions associated with lock timeouts, as one connection coming out of the data source blocks subsequent connections. The solution looks like it’s going to be to re-implement most of the RDBMS layer with one that is aware of a DataSource rather than a connection pool.

  • Sesame in a Clustered environment

    Sesame has one major advantage in a clustered environment: it stores its content in a database. I’m not saying this is a good thing, but it just makes it easier to deploy in a clustered environment where the only thing that is shared is the database. It should be relatively easy to make it work OOTB with Sakai… however, it looks like the default implementation of the Sesame RDBMS Sail driver (this is the RDF Repository abstraction layer) likes to get a jdbc url, user name and password. This would be OK, except that Sakai likes to use a Data source.

  • Wiki Sub-Sites Groups and Sections

    In general the Wiki tool was well received, and the presentations done by Harriet, Andrew and Frances Tracy invoked thought. It was especially good to see faculty members relaying real teaching and research experience of Sakai in use.

  • Exploding Content Hosting Service

    We had some extremely productive conversations on the ContentHostingService towards the end of the Vancouver Conference. The basic idea of the ContentHostingPlugin was extended, and it looks like it might be worth attempting to restructure the ContentHostingService to separate the implementation of the default storage mechanism so that node properties are stored centrally, but all handling mechanisms become plug ins. This will mean merging both ContentResource and ContentCollection into a single ContentEntity more fully, so that the core of ContentHosting can treat them the same, regardless of where and how they are stored. This opens the potential to have tools inject ContentHostingHandlers into the ContentHostingService, i.e. Repositories, DSpace, IMS-CP etc. Not going to be 2.2, but maybe 2.3. Will be working with Jim Eng on this, as it’s his code.

  • Semantic Search

    Currently Search performs its indexing on text streams. There is a significant amount of information that can be extracted from entities, besides the simple digest of content. This includes things like the entity reference, the URL, title, description etc. There is also other information. We could create multiple indexes for this in Lucene quite easily, but it would not necessarily provide the search structure that is required. A better approach is probably going to be to represent this in RDF. So I’m going to try to enhance the EntityContentProducer with an RDF stream and place a pluggable RDF triple store underneath the search engine to operate as a secondary stream. It’s quite possible that this will solve some of the search clustering problems and will certainly address the results clustering that would begin to make search really cool.
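
    To make the idea of an “RDF stream” concrete, the sketch below turns an entity’s title and description into N-Triples style lines. Everything here is an assumption for illustration: `EntityTriples` is a hypothetical class, the subject is whatever IRI you choose to give the entity, and the Dublin Core `title` and `description` properties are just one plausible vocabulary, not something the search code is known to use.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: renders entity metadata as N-Triples style lines. */
final class EntityTriples {
    /** Dublin Core element namespace, used here purely as an example vocabulary. */
    static final String DC = "http://purl.org/dc/elements/1.1/";

    private EntityTriples() {
    }

    /** Builds one triple with an IRI subject, an IRI predicate and a plain literal object. */
    static String triple(String subjectIri, String predicateIri, String literal) {
        return "<" + subjectIri + "> <" + predicateIri + "> \"" + escape(literal) + "\" .";
    }

    /** Escapes backslashes, quotes and line breaks inside a literal. */
    static String escape(String literal) {
        return literal
                .replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r");
    }

    /** Emits the triples describing one entity. */
    static List<String> describe(String entityIri, String title, String description) {
        List<String> triples = new ArrayList<>();
        triples.add(triple(entityIri, DC + "title", title));
        triples.add(triple(entityIri, DC + "description", description));
        return triples;
    }
}
```

    Each content producer would emit lines like these alongside its text digest, and the triple store underneath search would index them as the secondary stream.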

  • Vancouver 2006

    Vancouver 2006 was a watershed conference for me. The Sakai Foundation, less than 6 months old, transitioned from a funded project into an open source community. Chuck was appointed as CEO, thankfully, as selecting almost anyone else would have created a year or more of pain for the community. Problems and issues were openly exposed and discussed without blame or recrimination. The future feels positive.