A Curious Case of Optimisation
Consider the following problem often encountered in web applications:
There are two models, `Student` and `User`. The students table has an attribute `user_id` which references the `id` on the users table. You want to print out the username of every student in section A.
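Concretely, a Rails-style setup for this might look as follows (a sketch only; the `belongs_to :user` association is an assumption here, and it is what the `includes` example further down relies on):

```ruby
class User < ApplicationRecord
  # users table: id, username, ...
end

class Student < ApplicationRecord
  # students table: id, user_id (references users.id), section, ...
  belongs_to :user
end
```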
You implement the following code, which translates to this pseudocode: fetch all students in section A from the database; for each student, fetch the user from the database.
```ruby
students = Student.where(section: 'A')

students.each do |student|
  puts User.find(student.user_id).username
end
```
Someone reviews your code and correctly points out that you are running N + 1 queries, which slows down the page. You fix it by prefetching all users and finding the right user in memory.
```ruby
students = Student.where(section: 'A')
user_ids = students.pluck(:user_id)
users = User.where(id: user_ids)

students.each do |student|
  user = users.find { |user| user.id == student.user_id }
  puts user.username
end
```
You now run two queries, and the page loads faster on the development database. Your code is merged, but production feels more sluggish.
That doesn’t make sense!
Benchmarks
Here are benchmarks (in seconds) for the two implementations above and two others, Hash and ORM API (discussed below), on a local database:
| N | (N + 1) Query | Prefetch | Hash | ORM API |
|---|---|---|---|---|
| 10 | 0.0163 | 0.0055 | 0.0053 | 0.0048 |
| 50 | 0.0557 | 0.0108 | 0.0082 | 0.0089 |
| 100 | 0.1005 | 0.0197 | 0.0120 | 0.0140 |
| 500 | 0.4517 | 0.1505 | 0.0428 | 0.0564 |
| 1,000 | 0.9273 | 0.4634 | 0.0767 | 0.0927 |
| 5,000 | 4.9729 | 10.572 | 0.3522 | 0.3989 |
| 10,000 | 9.3168 | 60.571 | 0.6063 | 0.8187 |
Observations from the above table:
- The Hash and ORM API approaches scale well, taking less than a second even for 10,000 records.
- Prefetch performs better for small values of N (< 2,000), which is typical of development databases.
- The (N + 1) query performs better for larger N, which is what production databases hold.
The benchmark depends on many factors, notably the round-trip time to the database. A local database responds in ~2 ms, whereas a remote database responds in ~150 ms, which completely changes the results.
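For reference, numbers like the ones above can be produced with a small harness around Ruby's `Benchmark` module. This is a minimal sketch, assuming the tables have been seeded with N students in section A and their matching users:

```ruby
require "benchmark"

# Print the wall-clock time taken by a block.
def report(label)
  elapsed = Benchmark.realtime { yield }
  puts format("%-10s %.4f s", label, elapsed)
end

report("(N + 1)") do
  Student.where(section: 'A').each { |s| User.find(s.user_id).username }
end

report("Prefetch") do
  students = Student.where(section: 'A')
  users = User.where(id: students.pluck(:user_id)).to_a
  students.each { |s| users.find { |u| u.id == s.user_id }.username }
end
```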
Analysis
Let's understand the costs involved in both methods:

Cost of (N + 1) query = (Fetch students) + N * (Fetch user) = O(N) queries

Cost of prefetch method = (Fetch students) + (Pluck user_ids) + (Fetch users) + (N * (N - 1) / 2) * (Cost of one iteration) = O(N^2) iterations
The cost of a single iteration is much smaller than the cost of a single query (~10^-4 ms vs ~0.5 ms). For small N, the cost of N queries dominates over the N^2 iterations. But as N grows, the cost of N^2 iterations adds up and grows quadratically, eventually dwarfing the query cost.
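To see roughly where the curves cross, we can plug those per-operation estimates into a toy model. This is a sketch only: the ~0.5 ms query and ~10^-4 ms iteration figures are the local-database estimates above, the N values are arbitrary, and the real crossover point will vary with the setup.

```ruby
# Toy cost model using the rough per-operation estimates above.
QUERY_MS     = 0.5    # one round trip to a local database
ITERATION_MS = 1e-4   # one in-memory comparison in the prefetch loop

[100, 1_000, 10_000, 100_000].each do |n|
  n_plus_one_ms = (1 + n) * QUERY_MS                                 # initial fetch + one query per student
  prefetch_ms   = 3 * QUERY_MS + (n * (n - 1) / 2.0) * ITERATION_MS  # three queries + quadratic scanning
  puts format("N = %7d   (N + 1): %10.1f ms   prefetch: %10.1f ms", n, n_plus_one_ms, prefetch_ms)
end
```

With these rough numbers the two curves cross somewhere around N ≈ 10,000; with a slower remote database each of the N + 1 queries costs far more, so the crossover moves to a much larger N.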
Profiling is the only way to be sure about performance.
Let’s see two better ways to solve this problem.
Hash approach
Here's the critical realization: each user matches precisely one of the students, and hashes are a great way to store one-to-one mappings. Using a hash converts the O(N) inner iteration of the prefetch method into an O(1) hash table lookup.
Algorithm:

1. Fetch all students in section A from the database.
2. Fetch the `user_id` of all such students from the database (using `SELECT students.user_id` or the ORM equivalent).
3. Fetch all users with the `user_id`s from step 2.
4. Create a hash with the foreign key as the key and the user object itself as the value.
5. For each student, look up the user in the hash.
Implementation:
```ruby
# Step 1
students = Student.where(section: 'A')

# Step 2 - Avoid a query using preloaded students
user_ids = students.pluck(:user_id)

# Step 3
users = User.where(id: user_ids)

# Step 4
users = users.to_h { |user| [user.id, user] }

# Step 5
students.each do |student|
  puts users[student.user_id].username
end
```
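If you are on Rails, ActiveSupport's `index_by` collapses steps 3 and 4 into a single expression:

```ruby
# Hash of user id => user record, equivalent to steps 3 and 4 above
users = User.where(id: user_ids).index_by(&:id)
```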
Using ORM API
Databases and ORMs have long recognized the need to work with related records efficiently. Rails has `includes` and Django has `prefetch_related` for fetching related records in bulk.
```ruby
students = Student.where(section: 'A').includes(:user)

students.each do |student|
  puts student.user.username
end
```
Phew! That was shorter and is just as fast as the hash approach. Under the hood, `includes` typically issues two queries, one for the students and one for the matching users, and wires up the association in memory, much like the hash approach.