CakePHP Containable Statement Pitfalls
Published on Mar 15, 2009 by Jamie MunroOn some of our more recent projects, we have been using the Containable behavior that CakePHP provides. It works quite nicely, a lot easier than the previous format of having to bind and unbind models on the fly.
The pitfalls began to appear when some of our clients were reporting that some pages are extremely slow. After a bit of research, it was quite apparent that the root of the cause was attributed to us using the containable feature.
To help you understand, let me provide a scenario. Users can create an album. Albums contain pictures. Pictures contains comments. In our model, each association is set to so User hasMany Album. Album hasMany Picture. Picture hasMany Comment.
Using this relationship we can do a find('all') on User and have it contain the data it wants. For example:
<?php $users = $this->User->find('all', array('contain' => array( 'Album' => array( 'Picture' => array( 'Comment' ) ) ) ) ); ?>
The above statement will return us all users, albums, photos, and comments for those photos in a nice recursive array. Now you are asking, so what are the pitfalls? Well, let me clarify, there are no pitfalls with the data returned, it compilies it quite nicely, it's how it gets there.
To achieve the data above, CakePHP will do one query that selects * from users and left joins albums. For each album returned in the above result, it will in turn do one select * from pictures where the album_id is in the pictures table. Then for those results, it will do the same for the comments table. This can turn into 100s over queries.
Not only does it perform an IN statement, it performs an extraordinary amount of additional queries. I can be ok with the additional queries, as there are a lot of arguments that support smaller, simpler queries are just as effective as less, larger more complex queries.
There is simply no argument for the IN statement. The IN statement is far slower than if we were to perform one query using left or inner joins to return the same data, especially on large tables.
After doing some research, it is clear that CakePHP will not be solving this issue in 1.2.x. I have heard rumors that it will be addressed in version 2 though, which I will be quite excited for.
My advice, avoid using contains when joining more than one table. Instead write a custom query in your model. To give you an idea of how simple it is, the contains statement above could have been written like:
SELECT `User`.*, `Album`.*, `Picture`.*, `Comment`.* FROM users `User` INNER JOIN albums `Album` ON `User`.id = `Album`.user_id INNER JOIN photos `Photo` ON `Album`.id = `Photo`.album_id INNER JOIN comments `Comment` ON `Picture`.id = `Comment`.picture_id
This will accomplish the same thing, it is important to note though that your array will be structured different. It will be a flat array oppose to a recursive array.
Don't worry CakePHP, I still <3 you!