Beware Includes with Conditions
##N+1 Problem
Lets say that you want to print out all of the invoices for each of your cusomters. Your code may look like this:
For every loop in customers, we have to go back to the database to fetch the
invoices associated with that customer. This is quite slow, and will lead to
performance problems. It is known as the N+1 problem - you will issue N+1
queries, where N is the number of customers. If you look at the server log, you
will see multiple queries being run.
##Using Includes
A simple solution to this problem is to simple preload all of the invoices for the cusomters, using an includes
statement.
Now you just generate two small, tight sql querires.
##Getting into trouble
Now lets say we only want to show the invoices that are have a status of “Open”. We may be tempted to add a where
condition on our include to do the filtering for us.
Unfortunately, Rails cannot figure out how to optimize and break apart this query, so it will Left Outer Join everything…
Here’s output from a different set of models.
The query above, while large, it is still much more performant than dealing with N+1 problems. However, once the tables start to get large or you are returning a large amount of rows, the query like the one above can be potentially troublesome. The database returns something like this:
customer_id | name | invoice_id | title
------------|---------|------------|----------------------------
1 | malcolm | 1 | Some invoice title
1 | malcolm | 2 | Another invoice
2 | wash | 3 | Late invoice title
3 | zoe | 4 | Some invoice title
3 | zoe | 5 | Another title
3 | zoe | 6 | Invoice Title
We get some duplication - a customer has_many invoices so every time we load a unique invoice we are going to load ALL of the customer data along with it.
##Double the has_many
Now, issues arise when you try and eager load data from more than one has_many.
This generates two left outer joins. Whats going to happen is that the database is going to return a row for every unique combination of customer, invoice and receipt. So if we have 25 customers, each customer has 10 invoices (250 total) and 10 receipts (250 total). The result set that is returned is not 525 records, but something closer to 2500 records. Now Rails has to instantiate all of that into ActiveRecord objects.
##When is it safe to add where conditions with an includes?
Only when conditions are used on the main model
This will generate just 3 small queries.