-
Notifications
You must be signed in to change notification settings - Fork 791
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve command scrubber in MongoClientTracer (#1587) #1663
Conversation
jyemin
commented
Nov 17, 2020
•
edited
Loading
edited
- Don't scrub the command field value at all if it's a string
- Use JsonWriter to improve efficiency of the scrubber: this will be more efficient than copying to a new BsonDocument (which is a LinkedHashMap underneath), since JsonWriter will write directly to a StringWriter. I considered doing this with ScrubbingBsonWriter class that implements org.bson.BsonWriter, but I was concerned about future proofing: while not common, MongoDB does occasionally add new BSON value types, and those do require new methods to be added to BsonWriter. This actually happened in the 3.4 release with the addition of the Decimal128 type.
- Truncate the query string. If available, user JsonWriter.Builder.maxLength and JsonWriter.isTruncated to make the truncation more efficient. Otherwise, just use substring. Truncation is important because commands in MongoDB can be quite large: up to 48 MB for a bulk write
* Don't scrub the command field value at all if it's a string * Use JsonWriter to improve efficiency of the scrubber * If available, user JsonWriter.Builder.maxLength to limit size of the query string
private final JsonWriterSettings jsonWriterSettings; | ||
|
||
public MongoClientTracer() { | ||
this(32 * 1024); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got this default from the SqlNormalizer
private static final Method IS_TRUNCATED_METHOD; | ||
|
||
static { | ||
IS_TRUNCATED_METHOD = Arrays.stream(JsonWriter.class.getMethods()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method was added in the 3.7 release of the driver so accessing it reflectively
writeScrubbed(command, new JsonWriter(stringWriter, jsonWriterSettings), true); | ||
// If using MongoDB driver >= 3.7, the substring invocation will be a no-op due to use of | ||
// JsonWriterSettings.Builder.maxLength in the static initializer for JSON_WRITER_SETTINGS | ||
return stringWriter.getBuffer() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question for reviewers: should we append a "..." to the query or just cut it off? Or do we need to make sure that even when truncated it remains valid JSON?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like SQL just cuts off at the limit like here, so seems good
if (UNSCRUBBED_FIELDS.contains(entry.getKey()) && entry.getValue().isString()) { | ||
scrub.put(entry.getKey(), entry.getValue()); | ||
writer.writeName(entry.getKey()); | ||
// the first field of the root document is the command name, so we preserve its value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question for reviewers: since we already extract the collection name into the Span, is it even necessary to preserve it in the scrubbed command?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
preserving it seems good, e.g. table names are preserved when scrubbing sql queries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
writeScrubbed(command, new JsonWriter(stringWriter, jsonWriterSettings), true); | ||
// If using MongoDB driver >= 3.7, the substring invocation will be a no-op due to use of | ||
// JsonWriterSettings.Builder.maxLength in the static initializer for JSON_WRITER_SETTINGS | ||
return stringWriter.getBuffer() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like SQL just cuts off at the limit like here, so seems good
if (UNSCRUBBED_FIELDS.contains(entry.getKey()) && entry.getValue().isString()) { | ||
scrub.put(entry.getKey(), entry.getValue()); | ||
writer.writeName(entry.getKey()); | ||
// the first field of the root document is the command name, so we preserve its value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
preserving it seems good, e.g. table names are preserved when scrubbing sql queries