Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve command scrubber in MongoClientTracer (#1587) #1663

Merged
merged 2 commits into from
Nov 19, 2020

Conversation

jyemin
Copy link
Contributor

@jyemin jyemin commented Nov 17, 2020

  • Don't scrub the command field value at all if it's a string
  • Use JsonWriter to improve efficiency of the scrubber: this will be more efficient than copying to a new BsonDocument (which is a LinkedHashMap underneath), since JsonWriter will write directly to a StringWriter. I considered doing this with ScrubbingBsonWriter class that implements org.bson.BsonWriter, but I was concerned about future proofing: while not common, MongoDB does occasionally add new BSON value types, and those do require new methods to be added to BsonWriter. This actually happened in the 3.4 release with the addition of the Decimal128 type.
  • Truncate the query string. If available, user JsonWriter.Builder.maxLength and JsonWriter.isTruncated to make the truncation more efficient. Otherwise, just use substring. Truncation is important because commands in MongoDB can be quite large: up to 48 MB for a bulk write

* Don't scrub the command field value at all if it's a string
* Use JsonWriter to improve efficiency of the scrubber
* If available, user JsonWriter.Builder.maxLength to limit size of the query string
private final JsonWriterSettings jsonWriterSettings;

public MongoClientTracer() {
this(32 * 1024);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got this default from the SqlNormalizer

private static final Method IS_TRUNCATED_METHOD;

static {
IS_TRUNCATED_METHOD = Arrays.stream(JsonWriter.class.getMethods())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method was added in the 3.7 release of the driver so accessing it reflectively

writeScrubbed(command, new JsonWriter(stringWriter, jsonWriterSettings), true);
// If using MongoDB driver >= 3.7, the substring invocation will be a no-op due to use of
// JsonWriterSettings.Builder.maxLength in the static initializer for JSON_WRITER_SETTINGS
return stringWriter.getBuffer()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question for reviewers: should we append a "..." to the query or just cut it off? Or do we need to make sure that even when truncated it remains valid JSON?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like SQL just cuts off at the limit like here, so seems good

if (UNSCRUBBED_FIELDS.contains(entry.getKey()) && entry.getValue().isString()) {
scrub.put(entry.getKey(), entry.getValue());
writer.writeName(entry.getKey());
// the first field of the root document is the command name, so we preserve its value
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question for reviewers: since we already extract the collection name into the Span, is it even necessary to preserve it in the scrubbed command?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

preserving it seems good, e.g. table names are preserved when scrubbing sql queries

Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

writeScrubbed(command, new JsonWriter(stringWriter, jsonWriterSettings), true);
// If using MongoDB driver >= 3.7, the substring invocation will be a no-op due to use of
// JsonWriterSettings.Builder.maxLength in the static initializer for JSON_WRITER_SETTINGS
return stringWriter.getBuffer()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like SQL just cuts off at the limit like here, so seems good

if (UNSCRUBBED_FIELDS.contains(entry.getKey()) && entry.getValue().isString()) {
scrub.put(entry.getKey(), entry.getValue());
writer.writeName(entry.getKey());
// the first field of the root document is the command name, so we preserve its value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

preserving it seems good, e.g. table names are preserved when scrubbing sql queries

@trask trask merged commit 65a6293 into open-telemetry:master Nov 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants