技术控

    今日:0| 主题:63445
收藏本版 (1)
最新软件应用技术尽在掌握

[其他] N + 1: When More Queries Is a Good Thing

[复制链接]
丹·噌噯伱 发表于 2016-12-2 07:29:37
216 1

N + 1: When More Queries Is a Good Thing

N + 1: When More Queries Is a Good Thing

  Over the last week I have been trying to understand how eager loading works in Rails to eliminate the infamous N+1 query problem by reducing the number of queries fired. My initial hypothesis was that reducing the number of queries as much as possible was the goal. However, I was surprised by what I discovered.
   Using includes to Reduce Queries

   Most posts that you read about the infamous N + 1 Query Problem cite the includes method to address the issue. includes is used to eager load associations related to the model by using the minimum number of queries possible. For this, under the hood, it uses a preload or left outer join, depending on the situation. I will explain both situations in subsequent sections.
   This is nicely explained with examples in Active Record Query Methods documentation .
   When and How to Use includes ?

  Suppose that our user can have many posts and can comment on any post. Each post can have many comments. The basic structure is shown in snippet shown below:
  1. # models/users.rb
  2. class User < ApplicationRecord
  3.   has_many :posts
  4.   has_many :comments
  5. end

  6. # models/posts.rb
  7. class Post < ApplicationRecord
  8.   has_many :comments
  9.   belongs_to :user
  10. end

  11. # models/comments.rb
  12. class Comment < ApplicationRecord
  13.   belongs_to :user
  14.   belongs_to :post
  15. end
复制代码
  Now, if we want a user’s information with posts made by the user along with their comments, simply calling User.all will first load users, then it will fetch the posts by each user. After fetching the posts, it will fetch the comments made by the user for those posts. If we have 10 users each having 5 posts, and on average 2 comments on each post, one User.all will end up doing around 1 + 5 + 10 queries.
  1. # users_controller.rb
  2. def index
  3.   @users = User.all
  4.   render json: @users
  5. end
复制代码
  A simple solution is to use includes to tell Active Record that we want to fetch users and all related posts:
  1. @users = User.all.includes(:posts)
复制代码
   

N + 1: When More Queries Is a Good Thing

N + 1: When More Queries Is a Good Thing
   posts are preloaded, comments are not preloaded
    This improvises performance a bit, as it fetches users first, and then in the subsequent query it fetches posts related to those users. Now the previous 1 + 5 + 10 queries are reduced to 1 + 1 + 10 queries. But this will be a lot better if comments related to posts are loaded in advance, as well. This will reduce it all down to 1 + 1 + 1, totaling 3 queries to fetch all the data. Look at the snippet shown below to understand:
  1. # users_controller.rb
  2. def index
  3.   @users = User.all.includes(:posts => [:comments])
  4.   render json: @users
  5. end
复制代码
   

N + 1: When More Queries Is a Good Thing

N + 1: When More Queries Is a Good Thing
   All data is loaded in just 3 queries, one for users, one for posts, and one for comments related to posts
    All data is loaded in just 3 queries, one for users, one for posts, and one for comments related to posts
   Passing comments in an array tells active record to preload comments related to the posts as well. If some relationship of comments needs to preloaded, we can change arguments passed to the includes methods, like so:
  1. User.all.includes(:posts => [:comments => [:another_relationship]])
复制代码
  This way, any number of nested relationships can be preloaded. For all of the above queries includes uses preload.
  Fetching Posts with a Specific Title

  1. User.all.includes(:posts => [:comments]).where('posts.title = ?', some_title)
复制代码
This will raise an error. Whereas,
  1. User.all.includes(:posts => [:comments]).where(posts: {    title: some_title })
复制代码
will give us the expected result. This happens because when hash conditions are passed, a left outer join of users and posts is performed to fetch users with posts having the specific title.
     

N + 1: When More Queries Is a Good Thing

N + 1: When More Queries Is a Good Thing
   Specifying hash conditions on included relations
    But, what if we want to use a pure string or array conditions instead of hash conditions to specify conditions on included relations? Look at the following example:
  1. User.all.includes(:posts => [:comments]).where('posts.title = ?', some_title).references(:posts)
复制代码
  Notice the references(:posts) part? references tells includes to force join the posts relation with a left outer join . To understand this, see the example query generated by the above line of code:
     

N + 1: When More Queries Is a Good Thing

N + 1: When More Queries Is a Good Thing
   Example query when posts relation is force joined with includes through references
    We reduced the number of queries from 1 + 5 + 10 to 1 query. That’s great!
  But, Less is NOT Always More

   Look at the last two example queries. Both are 3 to 4 lines long, and are have substrings like t0_r1 , t0_r2 , … , t2_r5 . This seems unusual. I am not an SQL expert and didn’t know what this means. These are known as a CROSS JOIN or CARTESIAN join.
   So, using references or hash conditions to specify conditions for included relations can cause very long queries and unnecessary outer joins, which may affect performance and memory adversely. Instead, splitting a large single query into a few queries would be more beneficial.
   The Active Record documentation clearly states that when you need to query associations, you should use join queries with includes instead of references .
   Even though Active Record lets you specify conditions on the eager loaded associations, just like joins , the recommended way is to use joins instead.
  A better way to specify conditions on eager loaded associations:
  1. User.all.joins(:posts).where('posts.title = ? ', some_title).includes(:posts => [:comments])
复制代码
This generates 1 + 1 + 1 queries and loads only the users having posts matching the given conditions, such as a specific title, etc.
  Conclusion

  Eager loading associations can be very useful and improve performance to a great extent, but can also cause serious harm when many nested associations are loaded. I for one was a bit surprised that reducing trips to the database can actually make things worse.
   This post at Engine Yard explains issues related to eager loading associations very well. For example, do not forget to add pagination and limit to records being fetched.
  I hope you enjoyed this quick post on N + 1 queries.
じ★ve梦悱灬 发表于 2016-12-3 04:46:02
路过的帮顶
回复 支持 反对

使用道具 举报

我要投稿

推荐阅读


回页顶回复上一篇下一篇回列表
手机版/c.CoLaBug.com ( 粤ICP备05003221号 | 粤公网安备 44010402000842号 )

© 2001-2017 Comsenz Inc.

返回顶部 返回列表