Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: ActiveRecord::Calculations#ids returns duplicate ids #46503

Merged
merged 1 commit into from
Dec 9, 2022

Conversation

joshuay03
Copy link
Contributor

@joshuay03 joshuay03 commented Nov 15, 2022

Motivation / Background

This Pull Request has been created to address #46455.

Detail

This Pull Request changes ActiveRecord::Calculations#ids to only return the unique ids of the base model, ignoring the join on eager loaded, preloaded and included associations.

The cause for this was the join dependancy applied within #pluck when the relation contained preloads. #ids has been updated to query the relation's foreign key itself rather than relying on #pluck unless the records have already been loaded.

Additional information

Checklist

Before submitting the PR make sure the following are checked:

  • This Pull Request is related to one change. Changes that are unrelated should be opened in separate PRs.
  • Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
  • Tests are added or updated if you fix a bug or add a feature.
  • CHANGELOG files are updated for the changed libraries if there is a behavior change or additional feature. Minor bug fixes and documentation changes should not be included.
  • CI is passing.

@joshuay03 joshuay03 force-pushed the update-calculations-ids branch 2 times, most recently from 082228a to df21f6f Compare November 15, 2022 10:50
@joshuay03 joshuay03 changed the title Update: Calculations#ids doesn't pluck association ids Fix: Calculations#ids doesn't pluck association ids Nov 15, 2022
@joshuay03 joshuay03 force-pushed the update-calculations-ids branch 3 times, most recently from e4b342b to b876c5c Compare November 15, 2022 12:12
@joshuay03 joshuay03 changed the title Fix: Calculations#ids doesn't pluck association ids Fix: Calculations#ids plucks included record ids Nov 15, 2022
@joshuay03 joshuay03 changed the title Fix: Calculations#ids plucks included record ids Fix: Calculations#ids plucks included record IDs Nov 15, 2022
@joshuay03 joshuay03 force-pushed the update-calculations-ids branch 3 times, most recently from 5a2223f to 5b575ba Compare November 15, 2022 12:33
@joshuay03 joshuay03 changed the title Fix: Calculations#ids plucks included record IDs Fix: Calculations#ids plucks included associations IDs Nov 15, 2022
@joshuay03 joshuay03 changed the title Fix: Calculations#ids plucks included associations IDs Fix: ActiveRecord::Calculations#ids plucks included associations IDs Nov 15, 2022
@joshuay03 joshuay03 force-pushed the update-calculations-ids branch 2 times, most recently from e129452 to a473cf9 Compare November 15, 2022 12:38
@joshuay03 joshuay03 changed the title Fix: ActiveRecord::Calculations#ids plucks included associations IDs Fix: ActiveRecord::Calculations#ids plucks included associations IDs Nov 26, 2022
Copy link
Contributor

@adrianna-chang-shopify adrianna-chang-shopify left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👋 Hey Joshua, thanks for picking this up!

Relying on the base relation makes sense to me, but I think #ids should still rely on #pluck, instead of extracting the pluck logic to #perform_pluck and calling that directly from #ids. #pluck avoids making queries if the relation is already loaded:

def pluck(*column_names)
  if loaded? && all_attributes?(column_names)
    result = records.pluck(*column_names)
    if @async
      return Promise::Complete.new(result)
    else
      return result
    end
  end
...
end

So if we jump straight to the #perform_pluck logic you've extracted when #ids is called, we will always perform a query. What do you think about still using #pluck in #ids, but scoping to the base relation there, ie.

def ids
  klass.pluck(primary_key)
end

?

@joshuay03
Copy link
Contributor Author

joshuay03 commented Dec 1, 2022

Thanks for the review! @adrianna-chang-shopify

So if we jump straight to the #perform_pluck logic you've extracted when #ids is called, we will always perform a query. What do you think about still using #pluck in #ids, but scoping to the base relation there, ie.

This makes way more sense. I'm still getting used to how AR base classes are structured so thanks for the suggestion which I've now actioned.

@adrianna-chang-shopify
Copy link
Contributor

adrianna-chang-shopify commented Dec 1, 2022

No problem! Test failure looks legit though, let me take a look 🤔

Ah, it's just the assertion that's wrong now that we're deduplicating ids, so 👍

@joshuay03 joshuay03 force-pushed the update-calculations-ids branch 6 times, most recently from 4414979 to 2d44f66 Compare December 1, 2022 16:51
@byroot byroot removed the ready PRs ready to merge label Dec 8, 2022
@adrianna-chang-shopify
Copy link
Contributor

Ah, yes scoping would be a problem 🤦‍♀️ Is there a reason not to do pluck(primary_key).uniq? That seems simplest..

@casperisfine
Copy link
Contributor

Is there a reason not to do pluck(primary_key).uniq? That seems simplest..

That is doing the deduplication on the Ruby side, that isn't ideal. Something like SELECT DISTINCT <pk> might be a solution, but not 100% sure I haven't dug much in the original issue yet.

@adrianna-chang-shopify
Copy link
Contributor

Ah, yeah that makes sense to me 👍 distinct.pluck(primary_key) gives us what we want, right? Although we're still constructing the join dependency unnecessarily at that point, so perhaps it makes more sense to inline the query inside #ids instead of relying on #pluck..?

@joshuay03 joshuay03 force-pushed the update-calculations-ids branch 5 times, most recently from 69434bb to 3078167 Compare December 8, 2022 17:06
@joshuay03
Copy link
Contributor Author

joshuay03 commented Dec 8, 2022

so perhaps it makes more sense to inline the query inside #ids instead of relying on #pluck..?

I agree @adrianna-chang-shopify.
I've updated #ids to pluck if records have been preloaded, and select the foreign key otherwise, handling where contradictions and async just like #pluck and #pick. Changes and additional test coverage visible in these comparisons:

I'm happy to drop async_ids and move to a separate PR if required.

@byroot byroot merged commit 4435b7a into rails:main Dec 9, 2022
@byroot
Copy link
Member

byroot commented Dec 9, 2022

Nice work! Thank you.

@simi
Copy link
Contributor

simi commented Dec 9, 2022

🤔 this adds async_ids as well, which is missing in commit message and changelog. Is that on purpose (to not promote async methods yet)?

@matthewd
Copy link
Member

matthewd commented Dec 9, 2022

I'm not sure this is something we should / need to change. But if we're going to, I think we need to take some more inspiration from the existing pluck tests, as the new implementation is no longer covered by them:

  def test_ids_if_table_included
    c = Company.create!(name: "test", contracts: [Contract.new(developer_id: 7)])
    assert_equal [c.id], Company.includes(:contracts).where("contracts.id" => c.contracts.first).ids
  end

@joshuay03
Copy link
Contributor Author

joshuay03 commented Dec 10, 2022

I think we need to take some more inspiration from the existing pluck tests, as the new implementation is no longer covered by them.

@matthewd I'll get a separate PR up with improved test coverage.

Update: Got this PR up: #46697.

@joshuay03 joshuay03 deleted the update-calculations-ids branch December 10, 2022 01:26
rafaelfranca pushed a commit that referenced this pull request Sep 27, 2023
This PR adds `async_ids` introduced at #46503
to 7_1_release_notes.
koic added a commit to koic/rubocop-rails that referenced this pull request Oct 14, 2023
…iveRecordAllMethod`

Follow up rails/rails#44446, rails/rails#37944,
rails/rails#46503, and rails/rails#47010.

This PR supports some Rails 7.1's new querying methods for `Rails/RedundantActiveRecordAllMethod`.
koic added a commit to koic/rubocop-rails that referenced this pull request Oct 14, 2023
…iveRecordAllMethod`

Follow up rails/rails#44446, rails/rails#37944,
rails/rails#46503, and rails/rails#47010.

This PR supports some Rails 7.1's new querying methods for `Rails/RedundantActiveRecordAllMethod`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants