#57 ✓resolved
Silex

Don't put conditions in where clause

Reported by Silex | February 27th, 2009 @ 05:39 AM

Hi,

First, what I'm using:


>> Searchlogic::Version::STRING
=> "1.6.4"
?> `mysql --version`
=> "mysql  Ver 14.14 Distrib 5.1.29-rc, for apple-darwin9.4.0 (i386) using readline 5.1\n"

And my problem follows:


>> Resident.count(:all, :conditions => { :tags => { :name_ne => 'qweasd' }})
=> 134
>> Resident.count(:all)
=> 140

But the problem is that there's no Resident at all that has a tag 'qweasd'. The difference in count is because some residents have no tags at all and the way you construct the SQL forces left joins to become inner joins.

The following code:


Resident.count(:all, :conditions => { :tags => { :name_ne => 'qweasd' }})

Creates the following SQL:


SELECT count(*) AS count_all FROM `residents` 
LEFT OUTER JOIN `taggings` ON (`residents`.`id` = `taggings`.`taggable_id` AND `taggings`.`taggable_type` = 'Resident') 
LEFT OUTER JOIN `tags` ON (`tags`.`id` = `taggings`.`tag_id`)
WHERE (`tags`.`name` != 'qweasd')

Putting the condition in the where clause forces the left outer join to be an inner join, thus NOT returning all the residents.

If I change the query to be:


SELECT count(*) AS count_all FROM `residents` 
LEFT OUTER JOIN `taggings` ON (`residents`.`id` = `taggings`.`taggable_id` AND `taggings`.`taggable_type` = 'Resident') 
LEFT OUTER JOIN `tags` ON (`tags`.`id` = `taggings`.`tag_id`)
AND (`tags`.`name` != 'qweasd')

There it works as intended and returns all the residents! I only moved the conditions into the join conditions.

Thanks

Comments and changes to this ticket

  • Silex

    Silex February 27th, 2009 @ 09:27 AM

    Hum wait, actually seems the AND clause is simply ignored. I'm not too sure what is happening now...

  • Silex

    Silex February 27th, 2009 @ 10:01 AM

    Okay I found it, it's because tags.name resolves to NULL when a resident has no tags (because of the left join), and NULL != 'qweasd' resolves to false...

    Not too sure how to fix this.

    
    mysql> select count(*) from residents;
    +----------+
    | count(*) |
    +----------+
    |      140 | 
    +----------+
    1 row in set (0.00 sec)
    
    mysql> select count(*) from residents where religion_id != 'qweasd';
    +----------+
    | count(*) |
    +----------+
    |      111 | 
    +----------+
    1 row in set, 113 warnings (0.00 sec)
    
    mysql> select count(*) from residents where religion_id IS NULL;
    +----------+
    | count(*) |
    +----------+
    |       29 | 
    +----------+
    1 row in set (0.00 sec)
    
    mysql> select count(*) from residents where religion_id = 'qweasd';
    +----------+
    | count(*) |
    +----------+
    |        0 | 
    +----------+
    1 row in set (0.00 sec)
    
  • Silex

    Silex February 27th, 2009 @ 10:06 AM

    A common fix is to use the SQL IFNULL() function, so the final query looks like :

    
    SELECT COUNT(*) FROM `residents` 
    LEFT JOIN `taggings` ON (`residents`.`id` = `taggings`.`taggable_id` AND `taggings`.`taggable_type` = 'Resident') 
    LEFT JOIN `tags` ON (`tags`.`id` = `taggings`.`tag_id`) 
    WHERE (IFNULL(tags.name, '') != 'qweasd')
    

    Do you have any support in searchlogic to do that? What are your thoughts on this issue?

  • Silex

    Silex March 3rd, 2009 @ 09:51 AM

    Hum, this ticket has become some kind of a mess, let's make a clear question out of it so it's simpler for you :

    
    >> Resident.count(:all, :conditions => { :tags => { :name_ne => 'qweasd' }})
    => 134
    >> Resident.count(:all)
    => 140
    

    There's 6 residents difference in count despite the fact that no Resident has a tag 'qweasd'. The difference comes from the fact that a LEFT JOIN uses NULL to represent a missed join, and the generated SQL uses != in the condition, and comparing something with NULL always returns false.

    The following code:

    
    Resident.count(:all, :conditions => { :tags => { :name_ne => 'qweasd' }})
    

    Generates the following SQL:

    
    SELECT count(*) AS count_all FROM `residents` 
    LEFT OUTER JOIN `taggings` ON (`residents`.`id` = `taggings`.`taggable_id` AND `taggings`.`taggable_type` = 'Resident') 
    LEFT OUTER JOIN `tags` ON (`tags`.`id` = `taggings`.`tag_id`)
    WHERE (`tags`.`name` != 'qweasd')
    

    If we want correct results, it should use IFNULL() and generate the following SQL:

    
    SELECT count(*) AS count_all FROM `residents` 
    LEFT OUTER JOIN `taggings` ON (`residents`.`id` = `taggings`.`taggable_id` AND `taggings`.`taggable_type` = 'Resident') 
    LEFT OUTER JOIN `tags` ON (`tags`.`id` = `taggings`.`tag_id`)
    WHERE (IFNULL(`tags`.`name`, '') != 'qweasd')
    

    So the questions to solve this issue are:

    • Is there a way to specify the "WHERE" part manually? (so I can add the IFNULL() part myself)
    • More generally, is there a way to manipulate the generated SQL?

    Thanks

  • Silex

    Silex March 4th, 2009 @ 06:05 AM

    Do you want me to create a simpler testcase? Do you need more informations from me? Do you want me to create a new ticket with a more accurate title?

    Thanks.

  • Ben Johnson

    Ben Johnson March 4th, 2009 @ 04:43 PM

    • State changed from “new” to “open”

    I apologize, I have been unusually busy this week. I will look at this in more detail tonight.

  • Silex

    Silex March 8th, 2009 @ 05:47 PM

    No worries. A simpler view of the problem is probably the classic User - Item relation ship:

    
    class User < ActiveRecord::Base
      has_many :items
    end
    
    class Item < ActiveRecord::Base
      belongs_to :user
    end
    
    User.create(:name => 'John')
    u = User.create(:name => 'Marc')
    Item.create(:user => u, :name => 'pencil')
    
    User.count(:all, :conditions => { :items => { :name_ne => 'qweasd' }})
    

    The count should return 1 user only when there's obviously two users that doesn't have an item named 'qweasd'.

  • Silex

    Silex March 8th, 2009 @ 05:49 PM

    "The count should return 1 user only when there's obviously two users that doesn't have an item named 'qweasd'."

    What I meant is that it should return 2 because there's two users without an item named 'qweasd' but it returns 1 because of the IFNULL() problem described above.

  • Silex

    Silex March 11th, 2009 @ 05:34 AM

    I think I should be able to fix my problem using something like:

    
    User.count(:all, :conditions => { :items => { :name_is_blank => true,  :or_name_ne => 'qweasd'}})
    

    Does _is_blank exist? Another ticket suggested not_blank exists but the "Column conditions" section of the documentation doesn't talk about it.

  • Ben Johnson

    Ben Johnson March 23rd, 2009 @ 02:48 AM

    I see what you are saying. If the IFNULL condition available for all database types?

  • Silex

    Silex March 24th, 2009 @ 05:14 AM

    well, IFNULL() is the classic "MySQL" fix, so I doubt it's present on other databases.

    Now about my problem, I think that for something like ActiveRecord to generate it, one should probably try to use your "_is_blank" method.

    I want to try what I wrote in my previous post but had a work rush and had no time to do it, I'd be able to try this week.

    The thing is that if add an if_null modifier you'll have to detect the "default" for the type to compare ('' for string, 0 for integers etc).

    I just want to test wether "WHERE items.name IS NULL OR items.name != 'qweasd'" works the same as "WHERE IFNULL(items.name, '') != 'qweasd'". If it's the case I think I can use that and so it avoids you some work :)

  • Silex

    Silex March 30th, 2009 @ 05:43 PM

    Hello,

    I did a little bit of testing, it looks like my is_blank solution would work!

    This is the rails application from http://github.com/binarylogic/se.... I wrote the following in script/console to illustrate the problem:

    
    >> User.all.size
    => 926
    >> User.find(:all, :conditions => { :orders => { :description_ne => 'asd' }}).size
    => 822
    >> User.find(:all, :conditions => { :orders => { :description_is_blank => true, :or_description_ne => 'asd' }}).size
    => 926
    >> User.find_by_sql("SELECT DISTINCT `users`.* FROM `users` LEFT OUTER JOIN `orders` ON orders.user_id = users.id WHERE IFNULL(`orders`.`description`, '') != 'asd'").size
    => 926
    

    Here is the SQL output:

    
    User Load (4.2ms)   SELECT * FROM `users`
    User Load (64.4ms)   SELECT DISTINCT `users`.* FROM `users` LEFT OUTER JOIN `orders` ON orders.user_id = users.id WHERE (`orders`.`description` != 'asd')
    User Load (1388.3ms)   SELECT DISTINCT `users`.* FROM `users` LEFT OUTER JOIN `orders` ON orders.user_id = users.id WHERE ((`orders`.`description` IS NULL or `orders`.`description` = '' or `orders`.`description` = false) OR `orders`.`description` != 'asd')
    User Load (1389.2ms)   SELECT DISTINCT `users`.* FROM `users` LEFT OUTER JOIN `orders` ON orders.user_id = users.id WHERE IFNULL(`orders`.`description`, '') != 'asd'
    

    Seems the workaround I found is ok.

    Now maybe you can help me with something I'd like very much to do in searchlogic:

    How can I add bits of hand-written SQL which would interfere in a friendly way with "@search = User.new_search(params[:search])" ?

    Basically if I want to use #find_by_sql I loose all the niceness of #new_search handling the ordering, the pageing, etc. I'd like something like @search.add_to_where_clause('AND X = 23')...

    Thanks!

  • Silex
  • Ben Johnson

    Ben Johnson April 16th, 2009 @ 03:19 AM

    Hi Silex,

    I apologize, really like to try and stay on top of these tickets. The fact is that I do not have as much time as I had had in the past.

    Did you for searchlogic and make the changes there?

  • Silex

    Silex April 16th, 2009 @ 05:42 AM

    I'm sorry, I really can't understand your question :) Looks like it lacks a word, if not severals.

  • Ben Johnson
  • Silex

    Silex April 17th, 2009 @ 04:01 AM

    I did not do any changes!

    It looks like the length of this ticket confuses you, I'll do a short recap of the problem and the found solution:

    With the rails application from http://github.com/binarylogic/se..., which makes it easy for you to replicate the problem, I can do the following:

    
    >> User.all.size
    => 926
    >> User.find(:all, :conditions => { :orders => { :description_ne => 'QWEIOHAF' }}).size
    => 822
    

    Without thinking, you'd think the two numbers should be the same, as no orders have a 'QWEIOHAF' description. But because of the left join and the comparison with NULL that always returns false for users that don't have any orders, the returned numbers are different.

    I first thought that you'd provide some kind of fix in searchlogic (use IFNULL somewhere, etc). Then I realised I could still use searchlogic but tweak the "query" a little bit with _is_blank and get the behavior I want:

    
    >> User.find(:all, :conditions => { :orders => { :description_is_blank => true, :or_description_ne => 'QWEIOHAF' }}).size
    => 926
    

    So basically I was able to find a solution to my problem without any modifications to searchlogic. Wether the issue I raised stays a problem for searchlogic or is fixed is up to you, for my particular application the workaround I found is decent.

    Now, I'd be very happy if you answered this question:

    How can I add bits of hand-written SQL which would interfere in a friendly way with "@search = User.new_search(params[:search])" ?

    Basically if I want to use #find_by_sql I loose all the niceness of #new_search handling the ordering, the pageing, etc. I'd like the best of the two worlds, #new_search handling all the boring parts of search form validation, sql injection prevention, etc AND being able to tweak the query so I can get around MySQL traps etc. Ideally, I'd like something like @search.add_to_where_clause('AND X = 23') or equivalent... maybe there's already something in rails for it but couldn't find it.

    Thanks! Philippe

  • Silex

    Silex April 25th, 2009 @ 06:36 AM

    My last post has everything that you need to know about this issue, is something unclear?

  • Silex

    Silex June 9th, 2009 @ 10:23 AM

    Looks like searchlogic v2 would be a better answer than my workaround.

  • Ben Johnson

    Ben Johnson June 20th, 2009 @ 04:59 AM

    • State changed from “open” to “resolved”

    Yeah I was just going through this ticket. I think your best bet is to use v2, sorry I can't be more help.

Please Sign in or create a free account to add a new ticket.

With your very own profile, you can contribute to projects, track your activity, watch tickets, receive and update tickets through your email and much more.

New-ticket Create new ticket

Create your profile

Help contribute to this project by taking a few moments to create your personal profile. Create your profile ยป

Provides common named scopes and object based searching.

People watching this ticket

Pages