[exporter/clickhouse] ServiceName as column in Clickhouse metrics tables #31670

Closed
jwafle opened this issue Mar 9, 2024 · 4 comments · Fixed by #31803

jwafle (Contributor) commented Mar 9, 2024

Component(s)

exporter/clickhouse

Is your feature request related to a problem? Please describe.

I currently use the clickhouseexporter for exporting traces, which follow a table schema with ServiceName as a column available for query. This column is indexed in the traces table, allowing for good performance on queries where ServiceName is used as a filter, which is a vast majority of queries in my experience.

When looking at benchmarking the metrics exporting against our current solution, I realized that ServiceName is not a column in the metrics tables. As someone comfortable with querying the traces table, this came as a surprise to me.

Given that the OpenTelemetry semantic conventions spec specifically calls out that SDKs are required to set service.name and that ServiceName is already a column in the Clickhouse spans table, I was wondering if there is a specific design decision behind not including this column in the metrics tables?
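To make the difference concrete, here is a rough sketch of the two query shapes being compared. It assumes the exporter's default table names (`otel_traces` and `otel_metrics_sum`); the service name `checkout` is hypothetical.

```sql
-- Traces: ServiceName is a real column, so the filter is a plain column comparison.
SELECT Timestamp, TraceId, SpanName, Duration
FROM otel_traces
WHERE ServiceName = 'checkout'              -- hypothetical service name
  AND Timestamp >= now() - INTERVAL 1 HOUR
ORDER BY Timestamp DESC
LIMIT 20;

-- Metrics (current schema): the service name has to be read out of the
-- ResourceAttributes map for every row that is scanned.
SELECT MetricName, TimeUnix, Value
FROM otel_metrics_sum
WHERE ResourceAttributes['service.name'] = 'checkout'
  AND TimeUnix >= now() - INTERVAL 1 HOUR
ORDER BY TimeUnix DESC
LIMIT 20;
```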

Describe the solution you'd like

I propose two possible solutions:

  1. ServiceName added as a column for all metrics tables. This would require some migration for users already using the clickhouse exporter, but to be honest, the migration shouldn't be that hard (just add a string column to the already existing tables). In fact, logic could probably be written to update the tables without any user intervention required by checking if the table exists in the old format and creating the new column if necessary.

  2. If there is a specific reason that ServiceName should not be a column in these tables, I would suggest automatically creating materialized views that would make it easier and more performant for users to query by ServiceName and including explicit documentation on how to do so.

Last possible solution (less favorable, in my opinion):

  1. Including specific documentation that makes it easy for users to manually create and query the materialized views proposed in solution 2.
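A minimal sketch of what the automatic upgrade in solution 1 could run, assuming the default metrics table names. `ADD COLUMN IF NOT EXISTS` makes the statement safe to repeat on every start-up. The `DEFAULT` expression is an assumption about how old rows would be served, not something the exporter is confirmed to do.

```sql
-- Idempotent: does nothing if the column is already present.
-- The DEFAULT lets rows written before the upgrade resolve ServiceName
-- from the resource attributes at read time.
ALTER TABLE otel_metrics_sum
    ADD COLUMN IF NOT EXISTS ServiceName LowCardinality(String)
    DEFAULT ResourceAttributes['service.name'] CODEC(ZSTD(1));

-- The same statement would be repeated for otel_metrics_gauge,
-- otel_metrics_histogram, otel_metrics_exponential_histogram
-- and otel_metrics_summary.
```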

Describe alternatives you've considered

Possible solutions provided above.

Additional context

n/a

jwafle added the enhancement and needs triage labels Mar 9, 2024

github-actions bot (Contributor) commented Mar 9, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@Frapschen (Contributor)

ServiceName is meaningful; however, adding it is a breaking change.

Since table schema changes are unpredictable, maybe we can introduce a table schema migration tool to the exporter.

migrate looks good; it can be used from code.

/cc @hanjm @SpencerTorres

Frapschen removed the needs triage label Mar 13, 2024

@Frapschen (Contributor)

As a temporary way to speed up queries that filter on ServiceName, we can refer to Improving Map performance.

Update the table schema:

ALTER TABLE otel_metrics_sum ADD COLUMN ServiceName String DEFAULT ResourceAttributes['service.name'] CODEC(ZSTD(1));
ALTER TABLE otel_metrics_sum MATERIALIZE COLUMN ServiceName;

# result:
┌─name─────────────────────────┬─type───────────────────────────────────────┬─default_type─┬─default_expression─────────────────┬─comment─┬─codec_expression──┬─ttl_expression─┐
│ ResourceAttributes           │ Map(LowCardinality(String), String)        │              │                                    │         │ ZSTD(1)           │                │
│ ResourceSchemaUrl            │ String                                     │              │                                    │         │ ZSTD(1)           │                │
│ ScopeName                    │ String                                     │              │                                    │         │ ZSTD(1)           │                │
│ ScopeVersion                 │ String                                     │              │                                    │         │ ZSTD(1)           │                │
│ ScopeAttributes              │ Map(LowCardinality(String), String)        │              │                                    │         │ ZSTD(1)           │                │
│ ScopeDroppedAttrCount        │ UInt32                                     │              │                                    │         │ ZSTD(1)           │                │
│ ScopeSchemaUrl               │ String                                     │              │                                    │         │ ZSTD(1)           │                │
│ MetricName                   │ String                                     │              │                                    │         │ ZSTD(1)           │                │
│ MetricDescription            │ String                                     │              │                                    │         │ ZSTD(1)           │                │
│ MetricUnit                   │ String                                     │              │                                    │         │ ZSTD(1)           │                │
│ Attributes                   │ Map(LowCardinality(String), String)        │              │                                    │         │ ZSTD(1)           │                │
│ StartTimeUnix                │ DateTime64(9)                              │              │                                    │         │ Delta(8), ZSTD(1) │                │
│ TimeUnix                     │ DateTime64(9)                              │              │                                    │         │ Delta(8), ZSTD(1) │                │
│ Value                        │ Float64                                    │              │                                    │         │ ZSTD(1)           │                │
│ Flags                        │ UInt32                                     │              │                                    │         │ ZSTD(1)           │                │
│ Exemplars.FilteredAttributes │ Array(Map(LowCardinality(String), String)) │              │                                    │         │ ZSTD(1)           │                │
│ Exemplars.TimeUnix           │ Array(DateTime64(9))                       │              │                                    │         │ ZSTD(1)           │                │
│ Exemplars.Value              │ Array(Float64)                             │              │                                    │         │ ZSTD(1)           │                │
│ Exemplars.SpanId             │ Array(String)                              │              │                                    │         │ ZSTD(1)           │                │
│ Exemplars.TraceId            │ Array(String)                              │              │                                    │         │ ZSTD(1)           │                │
│ AggTemp                      │ Int32                                      │              │                                    │         │ ZSTD(1)           │                │
│ IsMonotonic                  │ Bool                                       │              │                                    │         │ Delta(1), ZSTD(1) │                │
│ ServiceName                  │ String                                     │ DEFAULT      │ ResourceAttributes['service.name'] │         │ ZSTD(1)           │                │
└──────────────────────────────┴────────────────────────────────────────────┴──────────────┴────────────────────────────────────┴─────────┴───────────────────┴────────────────┘

After inserting data:

select * from otel_metrics_sum format Vertical


Row 1:
──────
ResourceAttributes:           {'service.name':'demo 1','Resource Attributes 1':'value1'}
ResourceSchemaUrl:            Resource SchemaUrl 1
ScopeName:                    Scope name 1
ScopeVersion:                 Scope version 1
ScopeAttributes:              {'Scope Attributes 1':'value1'}
ScopeDroppedAttrCount:        10
ScopeSchemaUrl:               Scope SchemaUrl 1
MetricName:                   sum metrics
MetricDescription:            This is a sum metrics
MetricUnit:                   count
Attributes:                   {'sum_label_1':'1'}
StartTimeUnix:                2023-12-25 09:53:49.000000000
TimeUnix:                     2023-12-25 09:53:49.000000000
Value:                        11.234
Flags:                        0
Exemplars.FilteredAttributes: [{'key2':'value2','key':'value'}]
Exemplars.TimeUnix:           ['2023-12-25 09:53:49.000000000']
Exemplars.Value:              [54]
Exemplars.SpanId:             ['0102030000000000']
Exemplars.TraceId:            ['01020300000000000000000000000000']
AggTemp:                      0
IsMonotonic:                  false
ServiceName:                  demo 1
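With the column in place, a query can filter on it directly instead of going through the map. A small sketch using the sample row above:

```sql
-- Filter on the dedicated column rather than ResourceAttributes['service.name'].
SELECT MetricName, TimeUnix, Value
FROM otel_metrics_sum
WHERE ServiceName = 'demo 1'
ORDER BY TimeUnix DESC
LIMIT 10;
```

Note that adding the column by itself does not change the table's sorting key, so this mainly avoids the per-row map lookup.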

hanjm (Member) commented Mar 15, 2024

@Frapschen A migration tool looks heavy and complex, and it may block on altering the ClickHouse table. The component's current status is beta, so making a breaking change is OK; just note it in the changelog.

dmitryax pushed a commit that referenced this issue Mar 28, 2024
…cs (#31803)

resolve:
#31670

it's a breaking change.

users who upgrade to the latest version need to alter the Clickhouse
table:
```
ALTER TABLE otel_metrics_exponential_histogram ADD COLUMN ServiceName LowCardinality(String) CODEC(ZSTD(1));
ALTER TABLE otel_metrics_gauge ADD COLUMN ServiceName LowCardinality(String) CODEC(ZSTD(1));
ALTER TABLE otel_metrics_histogram ADD COLUMN ServiceName LowCardinality(String) CODEC(ZSTD(1));
ALTER TABLE otel_metrics_sum ADD COLUMN ServiceName LowCardinality(String) CODEC(ZSTD(1));
ALTER TABLE otel_metrics_summary ADD COLUMN ServiceName LowCardinality(String) CODEC(ZSTD(1));
```
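Since these statements add the column without a default expression, rows written before the upgrade would presumably read back an empty ServiceName. If that matters, one possible backfill is sketched below. It is not part of the referenced commit, and it runs as a ClickHouse mutation, which rewrites data parts asynchronously and can be expensive on large tables.

```sql
-- Optional backfill for pre-upgrade rows (an assumption, not from the commit).
ALTER TABLE otel_metrics_sum
    UPDATE ServiceName = ResourceAttributes['service.name']
    WHERE ServiceName = '';

-- Repeat for otel_metrics_gauge, otel_metrics_histogram,
-- otel_metrics_exponential_histogram and otel_metrics_summary as needed.
```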
rimitchell pushed a commit to rimitchell/opentelemetry-collector-contrib that referenced this issue May 8, 2024
…cs (open-telemetry#31803)
