Our code is like custard

29 10 2010

When you hit it, it becomes like concrete. Deadlocked like concrete. Oh dear. Having just cut an RC3 for our Q1 release, we elected to spin up a server and let friendly users have a play, confident that we had found all the contention issues. We were so confident that the JMeter test scripts had found everything that we used those same scripts to populate the QA server with a modest number of users, groups and content. I knew something wasn’t quite right when I loaded up the content and got complaints of slow response over IRC. Come on, content upload by 12 users with only 6K users loaded should not cause others to see really slow response (define really slow? Minutes, no kidding). In a moment of desperation I took solace in the fact that even though some queries were taking minutes, the server was rock solid. (lol).

So with that background and my head firmly stuck in the sand, we went into the one-hour bug bash to gently press the code base. At this moment I want you to visualise a large pool full of custard, with about 30 people on the edge who all jump in at the same time. Without much movement, and with a bit of shock, the whole pool turns into a solid lump. Our code base is a non-Newtonian fluid, just like the custard in the swimming pool, and this blog post is stress relief before trying to find out where that deadlock is coming from. Not what I want to be doing when I was supposed to be spending the weekend with family. If I find the cause and it’s not too embarrassing I might post it here.

 


2 responses

29 10 2010
Ian

Mistake No. 1: we allowed the UI to specify the sort order without thinking that it would be a problem, resulting in:

//*[@sling:resourceType='sakai/user-home'] order by public/authprofile/basic/elements/firstName/@value ascending

Since ordering by public/authprofile/basic/elements/firstName/@value can’t produce a result set that is loaded lazily by an iterator, all the results get loaded into memory to be sorted. So that’s 5×4×number-of-users JCR items loaded into memory, e.g. for 6K users, 120K items before we start sorting. No wonder the query takes over 10 minutes to complete.

Quick solution: don’t sort
Longer-term solution: custom scorer
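
To make the arithmetic above concrete, here is a rough Python sketch of why asking for a sorted first page pulls everything into memory while an unsorted one does not. This is not the Sakai or Jackrabbit code: `CountingStore`, `iter_users` and the made-up first names are purely hypothetical stand-ins, and the only numbers taken from the comment are the 5×4 items per user and the 6K users.

```python
from itertools import islice

ITEMS_PER_USER = 5 * 4  # from the comment above: 5x4 JCR items per user


class CountingStore:
    """Hypothetical stand-in for a repository that counts item loads."""

    def __init__(self, user_count):
        self.user_count = user_count
        self.items_loaded = 0

    def iter_users(self):
        # Lazily yield one user at a time, counting its items as loaded.
        for i in range(self.user_count):
            self.items_loaded += ITEMS_PER_USER
            # Made-up first names, scrambled so the input is not already sorted.
            name = f"name{(i * 7919) % self.user_count:05d}"
            yield {"id": i, "firstName": name}


def first_page_unsorted(store, page_size):
    # The iterator is only advanced page_size times.
    return list(islice(store.iter_users(), page_size))


def first_page_sorted(store, page_size):
    # sorted() must consume the whole iterator before it can return anything.
    ordered = sorted(store.iter_users(), key=lambda u: u["firstName"])
    return ordered[:page_size]


unsorted_store = CountingStore(6000)
first_page_unsorted(unsorted_store, 10)
print(unsorted_store.items_loaded)  # 200

sorted_store = CountingStore(6000)
first_page_sorted(sorted_store, 10)
print(sorted_store.items_loaded)  # 120000
```

Same page of 10 users either way, but the sorted version touches all 120K items first, which is the shape of the problem with the UI-supplied sort order.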

29 10 2010
Josh Baron

Please know that the effort is more than appreciated!

Wish I could thank your family as well.

Happy bug hunting!



