Tutorials

Build a ClickHouse Telemetry Pipeline with OpenTelemetry and GlassFlow

This guide shows you how to stream OTLP data, perform deduplication and PII masking with GlassFlow, and send data directly into ClickHouse for high-performance, low-cost observability.

Written by

Armend Avdijaj


ClickHouse has become a popular choice for observability because it can store and query high-volume telemetry data at a fraction of the cost of many hosted platforms. For small and mid-sized companies, that makes it attractive: you can keep more data, query it quickly, and avoid sending every trace, log, or metric to an expensive black-box service.

The challenge is getting telemetry into ClickHouse reliably. OpenTelemetry gives you a standard way to collect data from applications, but you still need to handle duplicate retries, sensitive attributes, schema mapping, batching, and backpressure before the data reaches ClickHouse.

This guide shows one practical architecture using the GlassFlow OTel Connector: a native OTLP ingestor built into GlassFlow that accepts telemetry from the OpenTelemetry Collector, applies streaming transformations, and writes efficient batches into ClickHouse.

The companion demo in this repository is traces-only so the moving parts are easy to understand. It walks through three common telemetry concerns: duplicate spans after retries, sensitive data in trace attributes, and cost from retaining every successful trace. GlassFlow handles span deduplication and stateless masking, while the OpenTelemetry Collector handles tail sampling before data is written to ClickHouse. The same OTLP pattern can be extended to logs and metrics.


Why ClickHouse Telemetry Pipelines Need a Streaming Layer

The simplest ClickHouse observability architecture looks like this:

Applications → OTel Collector → ClickHouse → Dashboard

That can work for early experiments, but production telemetry usually needs a processing step before storage:

  • Deduplication: collector retries and network interruptions can produce duplicate spans or events.

  • Schema control: OTLP payloads need to land in ClickHouse-native columns and types.

  • Cost control: successful traces are often high-volume and low-value compared with errors, and even some known error classes may be noise you do not want to store.

  • Privacy controls: emails, IDs, tokens, and other sensitive attributes should be removed before storage.

  • Backpressure and batching: ClickHouse performs best when inserts are batched and paced.

Some teams solve this with Kafka between the collector and ClickHouse. Kafka is valuable when you need durable replay, long retention, or several independent consumers. But if your primary goal is to stream telemetry into ClickHouse, running brokers, partitions, topics, and consumer groups can be more infrastructure than a smaller team wants to own.

The GlassFlow OTel Connector

GlassFlow natively understands OTLP — the wire protocol used by every OpenTelemetry SDK and the OTel Collector. When you create a pipeline with an OTLP source type (for example, otlp.traces), GlassFlow provisions an OTLP ingestor endpoint inside your cluster.

This guide uses the following architecture:

TelemetryGen → OTel Collector → GlassFlow (OTLP) → ClickHouse → HyperDX

The OTel Collector routes traces directly to the GlassFlow pipeline endpoint over gRPC or HTTP. No Kafka, no custom plugins, no topic management.


Figure 1: OTLP traces flow from the collector into GlassFlow, then into ClickHouse for storage and HyperDX for exploration.


GlassFlow provides the streaming pieces that are easy to underestimate when building on ClickHouse:

  • Real-time deduplication — removes duplicate spans using a configurable key and time window. The demo uses span_id with a one-hour window, but the same mechanism can be set to longer windows when retries or replays arrive much later.

  • Filtering before storage — drops telemetry you know you do not need, including noisy error classes, health-check spans, or low-value success traffic.

  • Schema mapping — converts OTLP field types to ClickHouse-native types (e.g., Map(String, String), LowCardinality(String), DateTime64(9))

  • Batched writes — buffers records in memory and flushes to ClickHouse in efficient bulk inserts

  • Backpressure handling — applies flow control so ClickHouse is not overwhelmed

  • Horizontal scaling — GlassFlow can scale ingestion and sink components with replicas; the published performance benchmark shows roughly linear scaling up to about 510k events/sec at 10 replicas.

Pipeline definitions use the API v3 shape: a sources array, optional transforms (including deduplication), and a sink with nested connection_params and mapping. That matches all current GlassFlow ETL APIs and keeps routing explicit per pipeline.

What We're Building

By the end of this guide, you'll have a Kubernetes-based traces pipeline:

| Pipeline | Source | Sink table |
| --- | --- | --- |
| otlp-traces | OTLP traces | otel_traces |

We'll use telemetrygen with two containers in one Job: mostly successful spans at ~45/s and error spans at ~5/s, each carrying demo attributes (user_email, demo_ssn) so you can verify redaction in the GlassFlow pipeline before data reaches ClickHouse. The collector runs tail sampling so you keep all error traces and a small fraction of successful ones. The OTel Collector configuration pattern applies to any OTLP-compatible SDK or agent.


Figure 2: Each demo concern is handled in the layer where it is cheapest and safest to apply.


Prerequisites

  • kubectl configured against a cluster (kind works for local development)

  • helm ≥ 3.12

  • clickhouse-client (for verification queries)

  • make (the demo wraps kind, Helm, and curl in a small Makefile)

  • A machine with at least 4 CPU cores, 8 GB RAM, and roughly 10 GB disk space for the bundled stack

All demo files are available in the demos/observability-v2 directory of this repository.


Step 1 — Create the Cluster

```shell
cd clickhouse-etl/demos/observability-v2
make cluster
```

This creates a kind cluster named observability-v2 (override with CLUSTER_NAME) using the demo's node image and config. If you're using an existing cluster, skip this step.

Step 2 — Add Helm Repositories and Create Namespaces

```shell
make repos
make ns
```

Under the hood, this adds the opentelemetry, glassflow (https://glassflow.github.io/charts), and clickstack repositories, runs helm repo update for only those three repos, and ensures the otel, glassflow, and hyperdx namespaces exist.

Step 3 — Deploy the Stack

```shell
make install
```

This installs, in order: the OpenTelemetry Collector, GlassFlow ETL from the published glassflow/glassflow-etl chart by default, and ClickStack (HyperDX with embedded ClickHouse). The demo pins GLASSFLOW_CHART_VERSION=0.5.16; you can override GLASSFLOW_CHART if you want to test a local chart archive instead.

```shell
helm search repo glassflow/glassflow-etl
helm upgrade --install glassflow glassflow/glassflow-etl \
    --namespace glassflow \
    -f k8s/helm-values/glassflow.values.yaml \
    --version 0.5.16 \
    --wait --timeout 10m
```

The Helm values (k8s/helm-values/glassflow.values.yaml) enable the OTLP receiver, size resources for a local cluster, and disable the Kafka gateway on the UI:

```yaml
sources:
  otlpReceiver:
    enabled: true
    replicas: 1
    resources:
      requests: { cpu: "200m", memory: "256Mi" }
      limits:   { cpu: "500m", memory: "512Mi" }

api:
  resources:
    requests: { cpu: "100m", memory: "100Mi" }
    limits:   { cpu: "250m", memory: "200Mi" }

ui:
  kafkaGateway:
    enabled: false
```

After install, make status shows pod state in each namespace.

Step 4 — Deploy the GlassFlow Pipeline

The HTTP API listens on port 8081 inside the cluster. Forward localhost:8080 → 8081 so the default GLASSFLOW_API_URL used by the Makefile works:

```shell
make pf-glassflow-api
```

In another terminal, create the pipeline:

```shell
make deploy-pipelines
```

This submits glassflow-pipelines/traces-pipeline.json via POST /api/v1/pipeline. If port 8080 is busy, forward the API on another port and set GLASSFLOW_API_URL, for example:

```shell
kubectl port-forward -n glassflow svc/glassflow-api 18080:8081
GLASSFLOW_API_URL=http://localhost:18080 make deploy-pipelines
```

What's in a pipeline definition?

The traces pipeline uses v3 JSON: sources, transforms, and sink with connection_params aligned to your ClickHouse credentials (these must match k8s/helm-values/hyperdx.values.yaml and the table DDL):

```json
{
  "version": "v3",
  "pipeline_id": "otlp-traces",
  "name": "OTLP Traces Pipeline",
  "sources": [
    {
      "type": "otlp.traces",
      "source_id": "traces"
    }
  ],
  "transforms": [
    {
      "type": "dedup",
      "source_id": "traces",
      "config": {
        "key": "span_id",
        "time_window": "1h"
      }
    },
    {
      "type": "stateless",
      "source_id": "traces",
      "config": {
        "transforms": [
          // ... passthrough transforms for mapped trace fields
          {
            "expression": "{'service.name': resource_attributes['service.name'], 'user_email': '[REDACTED]', 'demo_ssn': '[REDACTED]'}",
            "output_name": "resource_attributes",
            "output_type": "map"
          },
          {
            "expression": "{'user_email': '[REDACTED]', 'demo_ssn': '[REDACTED]'}",
            "output_name": "attributes",
            "output_type": "map"
          }
        ]
      }
    }
  ],
  "sink": {
    "type": "clickhouse",
    "connection_params": {
      "host": "hyperdx-clickstack-clickhouse.hyperdx.svc.cluster.local",
      "port": "9000",
      "http_port": "8123",
      "database": "default",
      "username": "otelcollector",
      "password": "otelcollectorpass",
      "secure": false,
      "skip_certificate_verification": true
    },
    "table": "otel_traces",
    "max_batch_size": 1000,
    "max_delay_time": "1s",
    "mapping": [
      {"name": "start_timestamp", "column_name": "Timestamp", "column_type": "DateTime64(9)"},
      {"name": "trace_id", "column_name": "TraceId", "column_type": "String"},
      {"name": "span_id", "column_name": "SpanId", "column_type": "String"},
      // ... additional fields
    ]
  }
}
```
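The sink's max_batch_size and max_delay_time settings describe a flush-on-size-or-age policy: a batch is written when it fills up or when its oldest row has waited long enough. A small illustrative sketch of that behavior (the class and names here are ours, not GlassFlow internals):

```python
class BatchBuffer:
    """Flush when the buffer reaches max_batch_size or max_delay elapses.

    Illustrative model of the sink's batching semantics only.
    """

    def __init__(self, max_batch_size: int, max_delay_seconds: float, flush):
        self.max_batch_size = max_batch_size
        self.max_delay = max_delay_seconds
        self.flush = flush          # callback receiving a list of rows
        self.rows: list[dict] = []
        self.first_row_at: float | None = None

    def add(self, row: dict, now: float) -> None:
        if not self.rows:
            self.first_row_at = now
        self.rows.append(row)
        self._maybe_flush(now)

    def tick(self, now: float) -> None:
        # Called periodically so time-based flushes happen without new rows.
        self._maybe_flush(now)

    def _maybe_flush(self, now: float) -> None:
        if not self.rows:
            return
        full = len(self.rows) >= self.max_batch_size
        stale = now - self.first_row_at >= self.max_delay
        if full or stale:
            self.flush(self.rows)
            self.rows, self.first_row_at = [], None

batches = []
buf = BatchBuffer(max_batch_size=3, max_delay_seconds=1.0, flush=batches.append)
buf.add({"span": "a"}, now=0.0)
buf.add({"span": "b"}, now=0.2)
buf.tick(now=0.5)          # neither full nor stale: no flush yet
assert batches == []
buf.tick(now=1.1)          # max delay exceeded: flush the two rows
assert len(batches) == 1 and len(batches[0]) == 2
```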

The important ideas:

  • sources: otlp.traces with a stable source_id ties transforms to the OTLP stream.

  • Dedup transform: duplicate deliveries (for example collector retries) collapse on span_id inside a one-hour window before ClickHouse.

  • Stateless transform: the demo rewrites the OTLP resource and span attribute maps with [REDACTED] values for user_email and demo_ssn; resource_attributes['service.name'] is passed through so HyperDX and ClickHouse can still group by service.

  • sink.connection_params: in-cluster DNS for ClickHouse plus credentials; keep these in sync if you rotate passwords.

  • sink.mapping: maps OTLP fields to ClickHouse column types; Map(String, String) fits dynamic resource and span attributes, while ServiceName is derived in ClickHouse from ResourceAttributes['service.name'].
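The stateless masking expressions amount to overwriting known-sensitive keys in an attribute map while passing everything else through. A hypothetical Python equivalent (the key names come from the demo; the email value is a placeholder, and a real pipeline would use GlassFlow's expression syntax instead):

```python
REDACTED = "[REDACTED]"
SENSITIVE_KEYS = {"user_email", "demo_ssn"}  # keys masked in the demo

def mask_attributes(attrs: dict[str, str]) -> dict[str, str]:
    """Return a copy of an attribute map with sensitive keys overwritten."""
    return {k: (REDACTED if k in SENSITIVE_KEYS else v) for k, v in attrs.items()}

resource = {"service.name": "checkout", "user_email": "demo@example.com"}
masked = mask_attributes(resource)
assert masked["service.name"] == "checkout"   # passthrough survives
assert masked["user_email"] == "[REDACTED]"   # PII overwritten
```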

GlassFlow exposes a unified OTLP receiver (glassflow-otlp-receiver.glassflow.svc.cluster.local:4317). The collector sets x-glassflow-pipeline-id: otlp-traces on export so traffic lands in this pipeline.

Step 5 — Create the ClickHouse Table

Port-forward ClickHouse (bundled with the HyperDX Helm release):

```shell
kubectl port-forward -n hyperdx svc/hyperdx-clickstack-clickhouse 9000:9000
```

Then create the traces table:

```shell
make create-clickhouse-tables
```

This runs clickhouse/create_otel_tables.sql, which defines otel_traces only — span fields, resource and scope maps, a derived ServiceName, and bloom filters on trace, span, service, status, and attribute fields.

Schema highlights

otel_traces — stores distributed spans. Together with GlassFlow deduplication on span_id, each span identifier should appear once within the configured time window. That means ClickHouse can answer trace queries without ReplacingMergeTree cleanup or FINAL reads:

```sql
CREATE TABLE IF NOT EXISTS otel_traces
(
    `Timestamp` DateTime64(9) CODEC(Delta(8), ZSTD(1)),
    `TraceId` String CODEC(ZSTD(1)),
    `SpanId` String CODEC(ZSTD(1)),
    `ParentSpanId` String CODEC(ZSTD(1)),
    `TraceState` String CODEC(ZSTD(1)),
    `SpanName` LowCardinality(String) CODEC(ZSTD(1)),
    `SpanKind` LowCardinality(String) CODEC(ZSTD(1)),
    `ResourceAttributes` Map(String, String) CODEC(ZSTD(1)),
    `ServiceName` LowCardinality(String) DEFAULT ResourceAttributes['service.name'] CODEC(ZSTD(1)),
    `ScopeName` String CODEC(ZSTD(1)),
    `ScopeVersion` String CODEC(ZSTD(1)),
    `SpanAttributes` Map(String, String) CODEC(ZSTD(1)),
    `Duration` UInt64 CODEC(ZSTD(1)),
    `StatusCode` LowCardinality(String) CODEC(ZSTD(1)),
    `StatusMessage` String CODEC(ZSTD(1)),
    INDEX idx_trace_id TraceId TYPE bloom_filter(0.001) GRANULARITY 1,
    INDEX idx_span_id SpanId TYPE bloom_filter(0.001) GRANULARITY 1
)
ENGINE = MergeTree()
PARTITION BY toDate(Timestamp)
ORDER BY (ServiceName, Timestamp, TraceId, SpanId);
```

Step 6 — Configure the OTel Collector

The OTel Collector is the bridge between your instrumented services and GlassFlow. This setup uses the standard otel/opentelemetry-collector-contrib image with the built-in OTLP exporter, plus tail_sampling for cost-aware retention. Masking happens in the GlassFlow pipeline, so the collector config stays focused on receiving OTLP, sampling, batching, and forwarding.

The demo installs the collector via Helm (make install). The relevant fragment of k8s/helm-values/otel-collector.values.yaml:

```yaml
mode: deployment

image:
  repository: otel/opentelemetry-collector-contrib
  tag: "0.120.0"

config:
  receivers:
    otlp:
      protocols:
        grpc: { endpoint: 0.0.0.0:4317 }
        http: { endpoint: 0.0.0.0:4318 }

  processors:
    batch:
      send_batch_size: 64
      send_batch_max_size: 128
    memory_limiter:
      limit_mib: 128
      check_interval: 1s
    tail_sampling:
      decision_wait: 5s
      num_traces: 50000
      policies:
        - name: keep-errors
          type: status_code
          status_code:
            status_codes: [ERROR]
        - name: keep-10pct-ok
          type: probabilistic
          probabilistic:
            sampling_percentage: 10

  exporters:
    otlp/glassflow-traces:
      endpoint: "glassflow-otlp-receiver.glassflow.svc.cluster.local:4317"
      tls:
        insecure: true
      headers:
        x-glassflow-pipeline-id: "otlp-traces"

  service:
    pipelines:
      traces:
        receivers: [otlp]
        processors: [memory_limiter, tail_sampling, batch]
        exporters: [otlp/glassflow-traces]
```

The collector still decides which traces to keep, but it no longer mutates attributes. In this demo, GlassFlow stateless transformations write redacted demo values for both ResourceAttributes and SpanAttributes before the ClickHouse sink. For production-grade PII handling, treat this as a pattern for pipeline-owned field-level masking; broader regex-style scrubbing may still belong in an SDK, collector processor, or another edge component, depending on your compliance boundary.
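The two sampling policies amount to a simple decision rule: keep every error trace, and keep a stable fraction of successful ones. A simplified Python model of that rule (the real tail_sampling processor buffers whole traces for decision_wait and supports many more policy types; the hashing here is our illustration of making the decision deterministic per trace):

```python
import hashlib

def keep_trace(status_code: str, trace_id: str,
               ok_sampling_percentage: float = 10.0) -> bool:
    """Keep all error traces; keep a deterministic fraction of OK traces."""
    if status_code == "ERROR":
        return True
    # Hash the trace id so the decision is stable for every span of a trace.
    bucket = int(hashlib.sha256(trace_id.encode()).hexdigest(), 16) % 100
    return bucket < ok_sampling_percentage

assert keep_trace("ERROR", "any-trace-id")         # errors always kept
kept = sum(keep_trace("OK", f"trace-{i}") for i in range(10_000))
assert 700 < kept < 1300                           # roughly 10% of OK traces
```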

Step 7 — Generate Synthetic Traces

```shell
make telemetry
```

This applies k8s/telemetry/telemetrygen-traces.yaml: one Job with two containers — ~45 spans/s with Ok and ~5/s with Error, each attaching demo PII attributes for redaction checks. For example:

```yaml
containers:
  - name: telemetrygen-ok
    args:
      - traces
      - --otlp-endpoint=otel-opentelemetry-collector.otel.svc.cluster.local:4317
      - --otlp-insecure
      - --duration=50m
      - --rate=45
      - --status-code=Ok
      - --otlp-attributes=user_email="[email protected]"
      - --otlp-attributes=demo_ssn="000-00-0000"
  - name: telemetrygen-error
    args:
      - traces
      - --rate=5
      - --status-code=Error
      # ... same endpoint; different demo attributes
```

Note: The collector service name is otel-opentelemetry-collector — the Helm release name (otel) is prefixed by the chart when mode: deployment creates a ClusterIP service.

Within about a minute, traces should appear in HyperDX. Because of tail sampling, otel_traces should contain a much higher share of ERROR rows than the generator mix alone would produce.
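The expected stored mix follows directly from the generator rates and the sampling policy:

```python
# Generator mix vs. stored mix under the demo's tail-sampling policies.
ok_rate, error_rate = 45.0, 5.0      # spans/sec from telemetrygen
ok_sampling = 0.10                   # keep-10pct-ok policy

stored_ok = ok_rate * ok_sampling    # 4.5 spans/sec survive sampling
stored_error = error_rate            # keep-errors retains all 5/sec
error_share = stored_error / (stored_ok + stored_error)

assert round(stored_ok, 1) == 4.5
assert round(error_share, 2) == 0.53  # errors dominate stored rows
```

So although errors are only 10% of generated spans, they make up roughly half of what lands in ClickHouse.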

Step 8 — Explore in HyperDX and GlassFlow

HyperDX (ClickStack):

```shell
make pf-hyperdx
```

Open `http://localhost:8090`. From the Chart Explorer view, navigate to **Search** and you should see the incoming spans.


Figure 3: HyperDX querying trace telemetry stored in ClickHouse.

The GlassFlow UI port-forward uses local port 8081 so the API can keep 8080:

```shell
make pf-glassflow
```

Open http://localhost:8081, open Pipelines, and confirm otlp-traces is running. Keep make pf-glassflow-api running if you still need the API on http://localhost:8080.

Verifying Data Quality

With ClickHouse port-forward active, run:


Figure 4: A healthy run has no duplicate span IDs, no DLQ backlog, masked demo PII, and error-prioritized retention.

```sql
-- Volume by status (after tail sampling)
SELECT StatusCode, count() AS n
FROM otel_traces
GROUP BY StatusCode
ORDER BY n DESC;

-- Dedup: no duplicate span_id within the GlassFlow window
SELECT SpanId, count() AS c
FROM otel_traces
GROUP BY SpanId
HAVING c > 1
LIMIT 20;

-- Redaction: latest rows should show [REDACTED] in both resource and span maps
SELECT
  count() AS latest_rows,
  countIf(ResourceAttributes['user_email'] = '[REDACTED]') AS redacted_resource_email,
  countIf(ResourceAttributes['demo_ssn'] = '[REDACTED]') AS redacted_resource_ssn,
  countIf(SpanAttributes['user_email'] = '[REDACTED]') AS redacted_span_email,
  countIf(SpanAttributes['demo_ssn'] = '[REDACTED]') AS redacted_span_ssn,
  countIf(arrayExists(x -> x IN ('[email protected]', '[email protected]', '000-00-0000', '999-99-9999'), mapValues(ResourceAttributes))) AS raw_resource_attr_rows,
  countIf(arrayExists(x -> x IN ('[email protected]', '[email protected]', '000-00-0000', '999-99-9999'), mapValues(SpanAttributes))) AS raw_span_attr_rows
FROM otel_traces
WHERE Timestamp >= (SELECT max(Timestamp) FROM otel_traces) - INTERVAL 30 SECOND;
```

In a verified run after deploying the stateless transformation, the latest window showed redacted values in both maps and zero raw demo PII rows.

Choosing the Right ClickHouse Telemetry Architecture

There is more than one way to send OpenTelemetry data to ClickHouse. The right choice depends on how much operational complexity you want to own.

| Architecture | Best fit | Trade-offs |
| --- | --- | --- |
| OTel Collector → ClickHouse | Small prototypes and very simple pipelines | Minimal components, but limited room for deduplication, schema normalization, and controlled batching before storage |
| OTel Collector → Kafka → Consumers → ClickHouse | Large platforms with multiple downstream consumers, replay needs, or existing Kafka operations | Durable and flexible, but adds brokers, topics, partitions, consumer groups, lag monitoring, and extra cost |
| OTel Collector → GlassFlow → ClickHouse | Small and mid-sized teams that want production-grade telemetry ingestion without operating Kafka | Adds a focused streaming layer for OTLP ingestion, long-window deduplication, filtering, mapping, batching, backpressure, and horizontal scaling |

The GlassFlow OTel Connector is a good fit when:

  • ClickHouse is your primary telemetry store.

  • You want to build an internal observability stack without taking on a full event-streaming platform.

  • You need practical safeguards like long-window deduplication, filtering of unneeded telemetry, schema mapping, batching, and DLQ visibility.

  • You expect throughput to grow and want a path to horizontal scaling; GlassFlow's published benchmark reaches about 510k events/sec with 10 ingestor and sink replicas.

  • You want to start with traces and later extend the same pattern to logs or metrics.

Kafka still makes sense when:

  • Multiple independent consumers need to read the same telemetry stream, for example ClickHouse, S3 archival, and real-time alerting.

  • You need durable replay across long retention windows.

  • Your company already runs Kafka well and telemetry is just another stream in that platform.

For many ClickHouse observability use cases, especially at small and mid-sized companies, GlassFlow gives you the processing layer you need without turning the telemetry project into a Kafka operations project. For the scaling details behind those numbers, see the GlassFlow performance benchmark.

Cleanup

```shell
make telemetry-remove
make uninstall
make ns-remove
make cluster-delete
```

Summary

For teams choosing ClickHouse as their telemetry store, GlassFlow provides a focused ingestion and processing layer between OpenTelemetry and ClickHouse. By accepting OTLP natively, GlassFlow can apply real-time deduplication over configurable windows, filter out telemetry you do not need, coerce schemas, and batch writes before telemetry reaches storage. When volume grows, the same architecture can scale horizontally; GlassFlow's benchmark shows about 510k events/sec at 10 replicas.

The repository demo focuses on traces end to end: span-level deduplication and stateless PII masking in GlassFlow, tail sampling in the OpenTelemetry Collector, and a single otel_traces table in ClickHouse viewable in HyperDX. The same OTLP receiver and header-based routing extend naturally to otlp.logs and otlp.metrics when you are ready.

All configuration files are available in demos/observability-v2. For step-by-step troubleshooting, port-forward notes, and API schema alignment details, see GUIDE.md. If you run into issues or have questions, open an issue on GitHub or reach out to the GlassFlow community.

We are also adding more native sources for GlassFlow and want to hear from the community. If you need a new source for GlassFlow, check out integrations and request access here.
