I’m currently using Rails with PostgreSQL and am taking advantage of some advanced features within my named scopes. Primarily trigram text comparisons and some PostGIS distance queries.

I have advanced scopes like this.

scope :name_similar_to, -> (name, min_match=0.6) do
  where "similarity(lower(name), ?) >= ?",
    name.downcase,
    min_match
end

scope :close_to, -> (longitude, latitude) do
  where "ST_Within(lonlat::geometry, ST_Expand(ST_GeomFromText('POINT(? ?)', 4326), ?))",
    longitude,
    latitude,
    0.002
end

I also have simple scopes like this.

scope :postal_code, -> (postal_code) do
  where(postal_code: postal_code.to_s.gsub(/\-/, ""))
end

Chaining scopes to create complex queries is one of the things I love about ActiveRecord… but be careful.

Be mindful of the queries that ActiveRecord creates for you.

Consider the following query created by chaining these scopes together.

Location
  .name_similar_to("Palo")
  .close_to(37.4419, 122.1419)
  .postal_code(94303)
  .to_sql

# SELECT "standard_hotels".*
# FROM "standard_hotels"
# WHERE "standard_hotels"."postal_code" = '94303'
# AND (similarity(lower(name), 'palo') >= 0.6)
# AND (ST_Within(lonlat::geometry, ST_Expand(ST_GeomFromText('POINT(37.4419 122.1419)', 4326), 0.002)))

On the surface this looks pretty good, but in reality the performance is poor (around 400ms given the db size)… even though indexes exist in all the right places.

How might we improve performance? Lets reduce the number of records similarity matching and distance comparisons need to be performed on.

subquery = Location
  .select(:id)
  .postal_code(94303)
  .to_sql

Location
  .name_similar_to("Palo")
  .close_to(37.4419, 122.1419)
  .where("id in (#{subquery})")
  .to_sql

# SELECT "standard_hotels".*
# FROM "standard_hotels"
# WHERE (similarity(lower(name), 'palo') >= 0.6)
# AND (ST_Within(lonlat::geometry, ST_Expand(ST_GeomFromText('POINT(37.4419 122.1419)', 4326), 0.002)))
# AND (id in (SELECT id FROM "standard_hotels"  WHERE "standard_hotels"."postal_code" = '94303'))

That did it! The query now runs in 60ms instead of 400ms. Narrowing the initial results with a subquery yielded an 85% speed gain.

It’s always a good idea to review the queries chained scopes are creating for you.




comments powered by Disqus