CakePHP Containable Statement Pitfalls

Published on Mar 15, 2009 by Jamie Munro

On some of our more recent projects, we have been using the Containable behavior that CakePHP provides. It works quite nicely, a lot easier than the previous format of having to bind and unbind models on the fly.

The pitfalls began to appear when some of our clients were reporting that some pages are extremely slow. After a bit of research, it was quite apparent that the root of the cause was attributed to us using the containable feature.



To help you understand, let me provide a scenario. Users can create an album. Albums contain pictures. Pictures contains comments. In our model, each association is set to so User hasMany Album. Album hasMany Picture. Picture hasMany Comment.

Using this relationship we can do a find('all') on User and have it contain the data it wants. For example:

<?php
$users = $this->User->find('all', array('contain' => array(
 'Album' => array(
 'Picture' => array(
 'Comment'
 )
 )
)
 )
 );
?>


The above statement will return us all users, albums, photos, and comments for those photos in a nice recursive array. Now you are asking, so what are the pitfalls? Well, let me clarify, there are no pitfalls with the data returned, it compilies it quite nicely, it's how it gets there.

To achieve the data above, CakePHP will do one query that selects * from users and left joins albums. For each album returned in the above result, it will in turn do one select * from pictures where the album_id is in the pictures table. Then for those results, it will do the same for the comments table. This can turn into 100s over queries.

Not only does it perform an IN statement, it performs an extraordinary amount of additional queries. I can be ok with the additional queries, as there are a lot of arguments that support smaller, simpler queries are just as effective as less, larger more complex queries.

There is simply no argument for the IN statement. The IN statement is far slower than if we were to perform one query using left or inner joins to return the same data, especially on large tables.

After doing some research, it is clear that CakePHP will not be solving this issue in 1.2.x. I have heard rumors that it will be addressed in version 2 though, which I will be quite excited for.

My advice, avoid using contains when joining more than one table. Instead write a custom query in your model. To give you an idea of how simple it is, the contains statement above could have been written like:

SELECT `User`.*, `Album`.*, `Picture`.*, `Comment`.*
FROM
 users `User`
INNER JOIN
 albums `Album` ON `User`.id = `Album`.user_id
INNER JOIN
 photos `Photo` ON `Album`.id = `Photo`.album_id
INNER JOIN
 comments `Comment` ON `Picture`.id = `Comment`.picture_id


This will accomplish the same thing, it is important to note though that your array will be structured different. It will be a flat array oppose to a recursive array.

Don't worry CakePHP, I still <3 you!

Tags: SQL | CakePHP | Optimization

Related Posts

blog comments powered by Disqus