SparseMapContent 1.0 released

26 08 2011

Normally I would announce this sort of thing on a mailing list, but I don't have one, so this is the next best thing. SparseMapContent v1.0 has just been released. It's a project I started almost 12 months ago to enable Sakai OAE to store user-generated content in shallow, wide hierarchies, where thousands of users would be performing updates, potentially served by an elastic cluster. It stores its content in a simple key-value store, with RDBMS storage on Derby, MySQL, Oracle or PostgreSQL, or column storage on Apache Cassandra or Apache HBase. In addition, there are bitstream stores that keep content either on a shared filesystem or as blocks within the underlying column database. It comes packaged as an OSGi bundle intended to work with Apache Sling.

Released with SparseMapContent is a Solr bundle based on a snapshot of Solr 4, intended to provide a free-text and keyword index for the content stored in SparseMapContent. That bundle maintains a persistent queue of pointers to content to be indexed, filled by OSGi events; an indexer framework that resolves the pointers and indexes the content, including metadata; and a search index. This bundle is also used by Sakai OAE, although, like the SparseMapContent bundle, it could be used by any application wanting to maintain a Solr index from an event stream.
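
For readers who want a feel for that flow, here is a minimal sketch in plain Java. It is not the bundle's code or API: the class `IndexQueueSketch` and all of its methods are hypothetical, the queue is in memory rather than persistent, and a simple map stands in for both the content store and the Solr index. It only illustrates the idea of queueing pointers (paths) on an event and resolving them later when indexing.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical illustration only; none of these names come from the real bundles. */
class IndexQueueSketch {
    // Stand-in for the content store: path -> properties.
    private final Map<String, Map<String, String>> store = new HashMap<>();
    // Stand-in for the persistent queue of pointers (in memory here).
    private final Queue<String> pending = new ArrayDeque<>();
    // Stand-in for the search index: path -> indexed text.
    private final Map<String, String> index = new HashMap<>();

    /** Saves content and fires the "event" that queues a pointer to it. */
    void save(String path, Map<String, String> properties) {
        store.put(path, new HashMap<>(properties));
        onContentEvent(path);
    }

    /** Event handler: only the pointer is queued, never the content itself. */
    void onContentEvent(String path) {
        pending.add(path);
    }

    /** Indexer: resolves each queued pointer and indexes whatever it finds now. */
    int drain() {
        int processed = 0;
        String path;
        while ((path = pending.poll()) != null) {
            Map<String, String> properties = store.get(path);
            if (properties == null) {
                index.remove(path); // content has gone, so drop it from the index
            } else {
                index.put(path, String.join(" ", properties.values()));
            }
            processed++;
        }
        return processed;
    }

    /** Very crude "search": substring match against the indexed text. */
    boolean matches(String path, String term) {
        String text = index.get(path);
        return text != null && text.contains(term);
    }
}
```

Because only pointers are queued, the indexer always sees the latest state of the content when it resolves them, and the writer does not have to wait for indexing to finish.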

If you don't like the names, neither do I, and suggestions are welcome. Thank you to all those who have made contributions over the past 12 months.

 

Maven Repo: http://www2.caret.cam.ac.uk/maven2

    <dependency>
      <groupId>org.sakaiproject.nakamura</groupId>
      <artifactId>org.sakaiproject.nakamura.core</artifactId>
      <version>1.0</version>
    </dependency>
    <dependency>
      <groupId>org.sakaiproject.nakamura</groupId>
      <artifactId>org.sakaiproject.nakamura.solr</artifactId>
      <version>1.0</version>
    </dependency>

Code bases, issue trackers, etc

https://github.com/ieb/sparsemapcontent
https://github.com/ieb/solr
License: Apache 2

4 responses

30 08 2011
Harry Wang

Hi Ian,

Thanks a lot for the great work on sparsemap. I posted a comment on Sakai project site as follows and have not got any reply. I think maybe I can ask you 🙂
“Hi, I remember one goal of using sparsemap is to support NoSQL databases such as Cassandra. Is there a document showing how to configure OAE with NoSQL databases? Do you guys recommend using a NoSQL database for OAE over MySQL? Any comments are highly appreciated.”
https://confluence.sakaiproject.org/display/KERNDOC/Configuring+Nakamura+for+MySQL

What’s your opinion on this? Thanks!!

30 08 2011
Ian

The original aim of sparse map was to target NoSQL databases, but since no one is running Cassandra, or now HBase, in production, support for those databases always lags behind the JDBC databases. If you want to experiment, you need to use the Sling Web Console ( http://localhost:8080/system/console ) to deactivate the JDBCClientContentPool and activate the CassandraClientPool. Unfortunately, some functionality is still missing from the Cassandra driver, such as performant sorting on queries.

31 08 2011
Harry Wang

Thanks, Ian. This is helpful. So, I guess MySQL is the best open source DB for OAE in production.

31 08 2011
Ian Boston

I would say PostgreSQL is a better choice. MySQL has concurrency issues with B-tree based locking that at best slow writes down and at worst, on early versions, cause deadlocks. Still, both are neck and neck, and sparsemap supports both.

Hth Ian




