SparseMap Content 1.1 Released

27 09 2011

SparseMap Content 1.1 has been tagged, released, and pushed to the Maven repo. Details of the issues that have been fixed are at https://github.com/ieb/sparsemapcontent/issues?milestone=2&state=closed. This release includes a Lock Manager and a migration framework. Thanks to Chris Tweney at the University of California, Berkeley for his input on the migration framework, and apologies to those who have submitted patches that didn’t get into this release, notably a MongoDB driver from Erik Froese. The tag contains support for WebDav and a number of other extensions, although there hasn’t been a formal release of those bundles yet.

Release Tag:
https://github.com/ieb/sparsemapcontent/tree/org.sakaiproject.nakamura.core-1.1

Maven Repo:
http://www2.caret.cam.ac.uk/maven2/

Maven Dependency:

<dependency>
  <groupId>org.sakaiproject.nakamura</groupId>
  <artifactId>org.sakaiproject.nakamura.core</artifactId>
  <version>1.1</version>    
</dependency>
....
 <repositories>
    <repository>
      <id>caret-releases</id>
      <name>Caret Releases</name>
      <releases>
        <enabled>true</enabled>
      </releases>
      <snapshots>
        <enabled>false</enabled>
      </snapshots>
      <url>http://www2.caret.cam.ac.uk/maven2/</url>
    </repository>
 </repositories>




Lift Community

24 09 2011

I have been looking for a way to create RESTful services quickly and simply. JAX-RS is one way, but in my search I wandered into the world of Scala and came across the Lift community. It is not exactly a perfect RESTful framework, but their “Expected Behavior in the Lift community” is well worth a read before you post (or reply) with (or to) frustration on a mailing list. I have certainly shared many of the same thoughts as the author, David Pollak, from time to time, and I suspect, at times with shame, that I have not always lived by the rules he so clearly expresses. It’s one more link on my bookmark list titled “To Reflect on what I do, regularly”.





Comparative wide column indexing in SparseMap 1.1

23 09 2011

I hate doing comparative tests with databases, as it always generates DB wars: “Why didn’t you use version X where that’s fixed?” or “What about using config Z with tweak N?”. Sure, fixes come out and databases need tuning, but if it is possible to make very simple table operations go faster on small sets of data, that should be the default OOTB config. This limited test was done on the same piece of non-production hardware (a laptop) to see if there was a way of getting round the MySQL 64K limit on row size without impacting performance. The test comes in two parts: a table load that inserts rows into a wide column table as used by SparseMap, representing the cost of saving an object in SparseMap, and a set of queries performed once 100K rows are present in the database table. It would have been nice to see how the performance degraded (or not) over the entire test, but as you will see, that would have taken hours to complete.

The problem with MySQL is that there is a limit to the number of columns you can have in a wide column table. The limit is on the total size of the row, and in SparseMap, because we don’t have a limit on the size of properties, we have to reserve sufficient space to make searching viable, which results in relatively wide columns. Hence we can only get about 25 columns into a table. A potential way of circumventing the MySQL 64K row limit is to use a datatype that is not stored in the row, like MySQL TEXT, and then create an index based on the first 255 characters. The MySQL documentation reports that TEXT and VARCHAR are functionally equivalent, except that if a TEXT column is returned then a temporary on-disk table is needed to support the query. Fortunately Sparse is only interested in the rowid, a varchar(32). Unfortunately, sorting will still create the temporary disk table.
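
For illustration, here is roughly what the TEXT-based variant would look like, abbreviated to two of the 25 value columns (a sketch modelled on the varchar table shown later in this post; the table name is hypothetical):

-- Hypothetical TEXT-based variant of the wide column index table.
-- TEXT values are stored off the row, so the columns no longer count
-- against the 64K row limit, and each index covers only the first 255
-- characters of the column.
create table cn_css_index_t ( rid varchar(32) NOT NULL,
            v1 text, v2 text, /* ... up to v25 ... */
            primary key(rid)) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE INDEX cn_css_index_t_v1 ON cn_css_index_t (v1(255));
CREATE INDEX cn_css_index_t_v2 ON cn_css_index_t (v2(255));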

Insert performance

Insert performance degrades on MySQL as the number of rows increases. That degradation is worse for TEXT fields than varchar fields, but it hardly matters, since at 50ms per insert Sakai OAE is going to be unusable on MySQL, as each update operation needs 10s or perhaps 100s of updates to content objects. Earlier tests showed that there was no difference between KV indexing and wide column indexing with MySQL, as the problem was simply transposed. PostgreSQL shows insert time remaining constant as the number of rows grows. It’s impossible to see what that time is from the graph, but the inserts between 99000 and 99500 averaged 1.3ms per row. It’s perfectly possible that the degradation in MySQL inserts is due to a fundamental mistake in the table structure, and that will need to be investigated some more, but as it stands insert time is an issue.
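
For reference, the per-object save being timed is essentially a single wide row write of this shape (a sketch; the real driver builds the statement dynamically from the indexed properties, and the values here are placeholders):

-- Hypothetical shape of the operation timed in the load phase: one wide
-- row per object, with each indexed property in one of the v1..v25 columns.
INSERT INTO cn_css_index (rid, v1, v8, v15)
VALUES ('0123456789abcdef0123456789abcdef', 'value1', 'value8', 'value15');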

Query Performance

Query performance was tested on 100K rows after the inserts were completed. A range of queries was performed, starting with a single select term and no sorting (Q1) through to 5 select terms and 2 sorts (Q5S2). The results show that TEXT based columns impact query time significantly, probably because temporary tables are placed on disk, and perhaps the sorts are disk sorts. It’s clear that wide TEXT columns are not viable as an indexing approach, with queries averaging 1s. Since the average page load in Sakai OAE might need 10-20 of these queries, it would probably take between 30 and 40s to load each page. Clearly a non-starter.
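
To make the query labels concrete, the two ends of the range look roughly like this in SQL (a sketch based on the query form shown at the end of this post; which columns carry the select terms and sorts varied per test):

-- Q1: a single select term, no sorting.
select rid from cn_css_index where v1 = ?;
-- Q5S2: five select terms and two sorts.
select rid from cn_css_index
  where v1 = ? and v2 = ? and v3 = ? and v4 = ? and v5 = ?
  order by v8, v9;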

Removing the MySQL wide TEXT columns from the graph, we see that apart from some query start-up times MySQL queries average 25-30ms per query and PostgreSQL is sub-5ms per query. It’s not clear whether the MySQL query cache would make a big difference, since the table in question has a high update rate, and I suspect that any query cache would be constantly invalidated.
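
That suspicion is easy to check on a running MySQL 5.1 instance, since the query cache counters are exposed as status variables (a diagnostic aside, not part of the test harness):

-- A high Qcache_inserts count relative to Qcache_hits on a write-heavy
-- table suggests the cache is being invalidated as fast as it fills.
SHOW STATUS LIKE 'Qcache%';
SHOW VARIABLES LIKE 'query_cache%';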

As I said, I dislike comparative tests between DB vendors, since it’s almost impossible to compare like with like. However, in this test readers should take the PostgreSQL line as what might be an acceptable response and use that to evaluate whether the MySQL lines are acceptable. The lines might be changed by version or configuration. I used MySQL 5.1 with a large configuration, and PostgreSQL 9.0 with an out-of-the-box configuration. In all cases the DB instances were empty and performing no other work.

On the basis of these tests, unless the MySQL configuration in SparseMap is fixed, I can’t recommend it for a production backend. What I don’t know at the moment is whether the schema and table structure can be fixed, or whether a MySQL-specific indexer will need to be written.

Just in case I have made a massive mistake here, the MySQL table in question is of the form:

create table  cn_css_index ( rid varchar(32) NOT NULL,
            v1 varchar(780), ... v25 varchar(780), 
            primary key(rid)) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE INDEX cn_css_index_v1 ON cn_css_index (v1(255));
CREATE INDEX cn_css_index_v2 ON cn_css_index (v2(255));
...
CREATE INDEX cn_css_index_v25 ON cn_css_index (v25(255));

with queries of the form

select rid from cn_css_index where v1 = ? and v15 = ? order by v8;
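
For anyone reproducing this, MySQL’s EXPLAIN is the quickest way to confirm whether the prefix indexes are used for queries of this shape (shown with hypothetical literal values in place of the placeholders):

EXPLAIN select rid from cn_css_index
  where v1 = 'value1' and v15 = 'value15' order by v8;
-- "Using temporary" and "Using filesort" in the Extra column are the
-- temporary table and sort costs discussed above.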

Finally, the series of tests I have been performing over the past few days to verify scalability may become moot, since I have just enabled near real time search in the Solr component, and provided that can be made to cluster, it may be possible to remove indexing from SparseMap for most use cases.





Indexing and Query Performance in SparseMap 1.0, indicative tests

21 09 2011

Since late May or early June it has been apparent that the table based indexing approach in SparseMap, used by Sakai OAE, had problems. Performing queries on large key value tables can work, provided those queries are simple in nature and the volume of records in the table is not excessive. In fact parts of WordPress’s ontology store use this approach. Unfortunately in Sakai OAE the key value table grows at 10-120 times the rate of the main content store, which itself grows at 2x the number of content items. In addition, the queries that need to be performed on this table are paged, sorted and distinct. Not surprisingly, that generates a performance issue. It first became apparent on Derby, where it was really obvious: so obvious that Sakai OAE 1.0 RC1 would grind to a halt after running the integration test suite on Derby. Those issues were fixed for the Derby driver in the 1.0 release of SparseMap, on which Sakai OAE 1.0 was based. Unfortunately, further testing shows that all other databases are affected. I say all; I mean MySQL and PostgreSQL, since I don’t have access to an Oracle DB at the moment to test… but there is a high probability that it will also be affected. To show the impact, here are some comparative graphs. The first one shows Derby query performance at the OAE 1.0 RC1 tag. Fortunately this is not what was released, since SparseMap 1.0 contains a fix for the problem. At the time the consensus was that the problem did not affect MySQL or PostgreSQL, and to some extent that’s true; however, detailed testing shows that the problem affects MySQL and PostgreSQL, and presumably Oracle.

So what does this really mean? It means that any page that relies on a data feed that uses SQL queries will slow dramatically. These tests are only for 10K objects, which sounds like a lot, but is not. Each content item needs 2 objects. A single user might have 20 content items in their home space, so with 100 users there are 4K objects before anyone has logged in. Once those users log in they will create pages and upload content. A content item typically has 100 embedded items, so in reality a small pilot of 100 users will reach 100K items in a few weeks.

Ignore the Derby OAE 1.0 RC1 graph, it’s irrelevant, but look at the MySQL OAE 1.0 graph. The pattern for OAE 1.0 is a 4x increase in query time per doubling of the number of rows. At 100K items queries may take up to 1000ms to execute, which means that a page with one SQL based feed will take at least 1s to return. Most pages have many more data feeds than that. Since this post is based on evidence, I will stop there and leave it to the reader to extrapolate.

What can or should be done?

Short term:

  • Don’t run SparseMap in key value indexing mode on RDBMS back ends if you have more than a few columns to index, or if you are asking it to perform direct and complex queries; i.e. don’t use SparseMap 1.0. Unfortunately none of the NoSQL drivers support complex direct queries, and all the RDBMS drivers in SparseMap 1.0 suffer from the problem… except Derby, which strangely might be a better production DB for OAE 1.0 than MySQL or PostgreSQL. A sketch of what key value indexing looks like at the table level follows below.
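
For contrast with the wide column table in the previous post, key value indexing stores one row per indexed property, roughly like this (a sketch; SparseMap’s actual table and column names may differ):

-- Hypothetical key value index table: an object with 20 indexed
-- properties costs 20 rows here instead of 1 wide row.
create table cn_css_index_kv ( rid varchar(32) NOT NULL,
            propname varchar(64) NOT NULL, propvalue varchar(780),
            primary key(rid, propname));
CREATE INDEX cn_css_index_kv_pv ON cn_css_index_kv (propname, propvalue(255));

-- Each extra select term needs another self join, which is what hurts
-- as the table grows:
select a.rid from cn_css_index_kv a, cn_css_index_kv b
  where a.rid = b.rid
    and a.propname = 'prop1' and a.propvalue = ?
    and b.propname = 'prop15' and b.propvalue = ?;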

Long Term:

  • Don’t unnecessarily create massive content trees with deep hierarchies to represent simple objects. Unpacking a JSON tree into single objects will do just this. If there is no need to do it then it’s wasteful, and the data would be better stored as a single property, resulting in 1 or 2 content items per conceptual object.
  • Don’t rely on indexing as a substitute for publishing data in the right form for 99% of the queries (i.e. write exactly what you want to read once, and read it billions of times). Although the MySQL query cache does a good job of caching query results, in a situation where there is high write traffic the cache gets invalidated regularly.
  • Don’t think that inverted indexes will eliminate the problem. All have latency or cardinality issues, and most have clustering issues, which is why anyone running at scale in a cluster always puts a quality of service parameter on the delay between data being written and data appearing in the index. There is plenty of material on this subject on the devops blogs at places like LinkedIn and Facebook.

 





Sparsemap Content WebDav

18 09 2011

I have always felt that a good test of a content system is to give it to a badly behaved client. WebDav clients tend to be badly behaved and often make unreasonable requests of an HTTP server that was not expecting to be a WebDav server. OSX Finder is a good example: it often makes hundreds of PROPFIND requests just because the user looked at a Finder window, and dragging and dropping a folder onto a WebDav volume in Finder generates a storm of requests, some reading, some writing.

Rather than attempting to write my own WebDav protocol library I have been using the Milton WebDav library, developed by Brad McEvoy. Although there is quite a bit of Spring Framework in the code base, none of it is an absolute requirement, so it was easy to integrate into a simple, light OSGi container. I used Apache Felix, via a modified version of the Apache Sling Launchpad, to create a standalone executable jar containing Jetty and about 10 other components. The memory footprint at startup is about 4MB, and startup of the full OSGi container takes less than 5s. Adding in Milton support for WebDav and a DAV Level 2 implementation of the SPI classes to interface between Milton and Sparse Map Content brings that up to about 7MB, but barely impacts the startup time.

To get DAV Level 1 support, Sparse Map Content already had all the features required. To get Level 2 support I added a LockManager to Sparse Map Content, storing the locks as maps within the content system, but keeping the locks independent of the content system itself to allow DAV clients to pre-lock content trees before they exist.

The net result is a small, lightweight and reasonably fast DAV server running in an OSGi container. It runs happily under load in less than 25MB of memory and feels almost like a native file system when used within Finder. Finder naturally hammers the server with thousands of PROPFIND operations and loads of OSX specific file creates, but both Sparse Map Content and the protocol handling provided by Milton hardly appear to notice. The trace below is from JConsole while uploading about 4GB in about 1000 files. SparseMap is running using wide column indexes under Derby, so I suspect most of the retained 14MB will be Derby caches. The upload finishes slightly after 13:12.

End to end, adding WebDav to Sparse Map Content took a few hours, mainly due to the excellent protocol support and clean, simple code in Milton, coupled with the almost instant response from Brad on the Milton users list. All the code for this can be found in the extensions sub folder of the sparsemap git repository, and will be in the 1.1 release in a few days’ time.