Different results on consecutive runs #436

Open
thiagov opened this issue Sep 7, 2022 · 4 comments
Labels
bug Something isn't working help wanted Extra attention is needed question Further information is requested

Comments

@thiagov

thiagov commented Sep 7, 2022

Watchman Version: 0.21.3

What were you trying to do?
Use the “Search SDNs” service from watchman.

What did you expect to see?
I expect that consecutive requests using the same parameters would return the same result.

What did you see?
The result is different on each run. For some parameters, the last address returned is changing on each run.

How can we reproduce the problem?
With the following search string:

/search?name=M%20And%20S%20Radiology%20Associates%20Pa&address=PO%20Box%202947&city=San%20Antonio&zip=78299-2947&country=United%20States&sdnType=entity

The last address returned is either the address with entityId 26211 or the address with entityId 26213. The other addresses returned are always the same.

Is this the expected behavior?

@adamdecaf adamdecaf added bug Something isn't working enhancement New feature or request labels Sep 7, 2022
@adamdecaf
Member

Thanks for opening this bug report. That's not intended behavior and the only situation I'd expect different results is if the data was refreshed between requests.

I tried out your query and can see two different addresses returned.

curl -v "http://localhost:8084/search?name=M%20And%20S%20Radiology%20Associates%20Pa&address=PO%20Box%202947&city=San%20Antonio&zip=78299-2947&country=United%20States&sdnType=entity" | jq '.addresses[9]'
{
  "entityID": "26211",
  "addressID": "39766",
  "address": "675 Third Avenue, 29th Floor",
  "cityStateProvincePostalCode": "New York, NY 10017",
  "country": "United States",
  "addressRemarks": "",
  "match": 0.476984126984127
}
{
  "entityID": "26213",
  "addressID": "39771",
  "address": "675 Third Avenue, 29th Floor",
  "cityStateProvincePostalCode": "New York, NY 10017",
  "country": "United States",
  "addressRemarks": "",
  "match": 0.476984126984127
}

The problem seems to be that match is exactly the same for both addresses.

  "match": 0.476984126984127
  "match": 0.476984126984127

Watchman runs multiple search operations concurrently for each query. The results are collected into a sorted slice in a thread-safe manner, but because the searches run concurrently, one address may be added before or after the other in an unpredictable way, so results with an identical match can come back in either order.
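To illustrate the effect described above, here is a minimal sketch in Go. It is not Watchman's actual code: the `Address` struct, `sortByMatch`, `lastEntityID`, the result limit, and the extra high-scoring entry with entityID "1" are all assumptions made for the example. It sorts by match only, keeps tied results in arrival order, and then truncates the list, so whichever tied address arrived first is the one that survives.

```go
package main

import (
	"fmt"
	"sort"
)

// Address is a hypothetical, trimmed-down stand-in for a search result.
type Address struct {
	EntityID string
	Match    float64
}

// sortByMatch returns a copy ordered by descending match only.
// Results with an equal match keep the order in which they arrived.
func sortByMatch(results []Address) []Address {
	out := append([]Address(nil), results...)
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Match > out[j].Match
	})
	return out
}

// lastEntityID sorts the results, keeps at most limit of them,
// and reports the entityID of the last one that is kept.
func lastEntityID(arrival []Address, limit int) string {
	sorted := sortByMatch(arrival)
	if len(sorted) > limit {
		sorted = sorted[:limit]
	}
	if len(sorted) == 0 {
		return ""
	}
	return sorted[len(sorted)-1].EntityID
}

func main() {
	top := Address{EntityID: "1", Match: 0.9} // hypothetical better match
	a := Address{EntityID: "26211", Match: 0.476984126984127}
	b := Address{EntityID: "26213", Match: 0.476984126984127}

	// Same results, two possible arrival orders from concurrent searches.
	fmt.Println(lastEntityID([]Address{top, a, b}, 2)) // 26211
	fmt.Println(lastEntityID([]Address{top, b, a}, 2)) // 26213
}
```

The two calls differ only in the order the tied addresses arrive, which is the part that concurrency makes unpredictable.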

@adamdecaf
Member

This non-deterministic behavior could happen with other queries, but it is rare. My first idea for solving this would be to also sort on another identifier (entityID, addressID, etc.), but I assume that would confuse people while making search much more complex.

@thiagov
Author

thiagov commented Sep 8, 2022

Thanks for the explanation!

Yeah, ordering the results by match and using entityId as a tiebreaker seems like a good idea. It would certainly solve the issue I am experiencing.
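A rough sketch of that tiebreaker in Go might look like the following. It is an illustration only, not Watchman's implementation: the `Address` struct and `sortDeterministic` are hypothetical names, and it assumes the (entityID, addressID) pair is unique per result. IDs are compared as strings, as they appear in the JSON above, which is enough for a stable order even though it is not numeric order.

```go
package main

import (
	"fmt"
	"sort"
)

// Address is a hypothetical, trimmed-down stand-in for a search result.
type Address struct {
	EntityID  string
	AddressID string
	Match     float64
}

// sortDeterministic orders by descending match, then breaks ties by
// entityID and addressID so the output does not depend on arrival order.
func sortDeterministic(results []Address) {
	sort.Slice(results, func(i, j int) bool {
		a, b := results[i], results[j]
		if a.Match != b.Match {
			return a.Match > b.Match
		}
		if a.EntityID != b.EntityID {
			return a.EntityID < b.EntityID
		}
		return a.AddressID < b.AddressID
	})
}

func main() {
	a := Address{EntityID: "26211", AddressID: "39766", Match: 0.476984126984127}
	b := Address{EntityID: "26213", AddressID: "39771", Match: 0.476984126984127}

	first := []Address{a, b}
	second := []Address{b, a} // same results, opposite arrival order

	sortDeterministic(first)
	sortDeterministic(second)

	fmt.Println(first[0].EntityID, first[1].EntityID)   // 26211 26213
	fmt.Println(second[0].EntityID, second[1].EntityID) // 26211 26213
}
```

The extra comparisons only run when two results have exactly the same match, so the added work is limited to ties.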

For now I will consider that these non-deterministic results can happen in rare cases.

@adamdecaf
Member

> Yeah, ordering the results by match and using entityId as a tiebreaker seems like a good idea.

I'm not sure what overhead this would introduce, but we can investigate a bit.

@adamdecaf adamdecaf added help wanted Extra attention is needed question Further information is requested and removed enhancement New feature or request labels Sep 21, 2022

2 participants