OpenTelemetry in Node.js: Complete Observability Guide
Stop guessing about production performance. OpenTelemetry provides vendor-neutral instrumentation for traces, metrics, and logs. This guide walks through manual and auto-instrumentation for Node.js apps.

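Observability is more than logging: it is traces, metrics, and structured logs working together. OpenTelemetry (OTel) has become the industry standard for instrumentation. It superseded OpenTracing and OpenCensus, and Jaeger has retired its own client libraries in favor of the OTel SDKs; backends such as Jaeger and Prometheus remain in use for storing and querying the data. In 2026, it is the default choice for instrumenting Node.js applications. This guide covers setup, instrumentation, and exporting to backend systems.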
What OpenTelemetry Provides
- Traces: End-to-end request flow across services, showing latency breakdowns
- Metrics: Request rates, error rates, latency histograms, custom business metrics
- Logs: Structured logs with trace correlation
Basic Setup: Manual Tracing
import { NodeSDK } from '@opentelemetry/sdk-node';
import { ConsoleSpanExporter } from '@opentelemetry/sdk-trace-node';
import { Resource } from '@opentelemetry/resources';
import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions';

// Note: on SDK 2.x, `new Resource(...)` is replaced by resourceFromAttributes(...)
// and SemanticResourceAttributes by ATTR_SERVICE_NAME / ATTR_SERVICE_VERSION.
const sdk = new NodeSDK({
  resource: new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: 'my-node-api',
    [SemanticResourceAttributes.SERVICE_VERSION]: '1.0.0',
  }),
  traceExporter: new ConsoleSpanExporter(), // Replace with OTLP exporter for production
});
sdk.start();

// Graceful shutdown: flush pending spans, then exit even if the flush fails
process.on('SIGTERM', () => {
  sdk.shutdown()
    .catch((err) => console.error('OpenTelemetry shutdown failed', err))
    .finally(() => process.exit(0));
});
Manual Instrumentation with Spans
import { trace, SpanStatusCode } from '@opentelemetry/api';

const tracer = trace.getTracer('my-tracer');

app.get('/api/process/:id', (req, res, next) => {
  // Create a span for the entire operation
  return tracer.startActiveSpan('process-request', async (span) => {
    if (req.user?.id) span.setAttribute('user.id', req.user.id);
    span.setAttribute('request.id', req.params.id);
    try {
      // Nested span for database query
      const result = await tracer.startActiveSpan('database-query', async (dbSpan) => {
        try {
          dbSpan.setAttribute('db.query.text', 'SELECT * FROM data WHERE id = $1');
          return await db.query('SELECT * FROM data WHERE id = $1', [req.params.id]);
        } finally {
          dbSpan.end(); // end the span even if the query throws
        }
      });
      // Another nested span for external API
      const enriched = await tracer.startActiveSpan('external-api-call', async (apiSpan) => {
        try {
          const response = await fetch('https://api.enrichment.com/process', {
            method: 'POST', // fetch rejects a body on the default GET
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify(result),
          });
          apiSpan.setAttribute('http.response.status_code', response.status);
          return await response.json();
        } finally {
          apiSpan.end();
        }
      });
      res.json(enriched);
    } catch (err) {
      span.recordException(err);
      span.setStatus({ code: SpanStatusCode.ERROR, message: err.message });
      next(err); // hand the error to Express instead of leaving a rejected promise
    } finally {
      span.end();
    }
  });
});
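The end-the-span bookkeeping in the handler above has to be repeated for every span, which makes it easy to miss a path. A minimal sketch of a wrapper that centralizes it follows. `withSpan` is a hypothetical helper, not part of the OpenTelemetry API; it assumes only that the tracer exposes `startActiveSpan(name, fn)` and that the span exposes `recordException`, `setStatus`, and `end`.

```javascript
// Hypothetical helper, not part of @opentelemetry/api.
// 2 is the numeric value of SpanStatusCode.ERROR in @opentelemetry/api.
const SPAN_STATUS_ERROR = 2;

// Runs fn inside an active span and ends the span exactly once,
// whether fn resolves or throws.
async function withSpan(tracer, name, fn) {
  return tracer.startActiveSpan(name, async (span) => {
    try {
      return await fn(span);
    } catch (err) {
      span.recordException(err);
      span.setStatus({ code: SPAN_STATUS_ERROR, message: String(err?.message ?? err) });
      throw err; // the caller still sees the original error
    } finally {
      span.end();
    }
  });
}
```

With a helper like this, the database step could shrink to something like `await withSpan(tracer, 'database-query', () => db.query(sql, params))`.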
Auto-Instrumentation for Zero-Code Tracing
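Use @opentelemetry/auto-instrumentations-node to automatically instrument common libraries. Initialize the SDK before any instrumented library is imported (for example by preloading the setup file with node --require), because instrumentation works by patching modules as they load:
import { NodeSDK } from '@opentelemetry/sdk-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';

const sdk = new NodeSDK({
  instrumentations: [getNodeAutoInstrumentations()],
  // This automatically instruments:
  // - HTTP/HTTPS (Express, fetch, http module)
  // - Database (PostgreSQL, MySQL, MongoDB, Redis)
  // - gRPC, GraphQL, and more
});
sdk.start();
// No code changes needed in your routes. HTTP requests and DB queries made through supported libraries are traced.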
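Tracing everything by default also traces health checks and other noise. `getNodeAutoInstrumentations` accepts a configuration object keyed by instrumentation package name, so a sketch of a trimmed-down setup might look like the following. The ignored paths and the names `IGNORED_PATHS`, `ignoreIncomingRequest`, and `autoInstrumentationConfig` are assumptions for illustration.

```javascript
// Paths that should not produce traces (assumed names; adjust to your app).
const IGNORED_PATHS = new Set(['/healthz', '/metrics']);

// Returns true when the incoming request should NOT be traced.
function ignoreIncomingRequest(req) {
  const path = (req.url ?? '').split('?')[0];
  return IGNORED_PATHS.has(path);
}

// Keyed by instrumentation package name; intended to be passed as
// getNodeAutoInstrumentations(autoInstrumentationConfig).
const autoInstrumentationConfig = {
  // File-system spans are usually noise
  '@opentelemetry/instrumentation-fs': { enabled: false },
  '@opentelemetry/instrumentation-http': {
    ignoreIncomingRequestHook: ignoreIncomingRequest,
  },
};
```

Filtering at the instrumentation level means ignored requests never create spans at all, which is cheaper than dropping them later in a collector.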
Metrics Collection: HTTP Request Duration Histogram
import { metrics } from '@opentelemetry/api';

// Requires a metricReader on the NodeSDK (see the exporting section)
const meter = metrics.getMeter('http-metrics');
const requestDuration = meter.createHistogram('http.request.duration', {
  description: 'Duration of HTTP requests',
  unit: 'ms',
});

app.use((req, res, next) => {
  const start = performance.now(); // monotonic clock, unaffected by system time changes
  res.on('finish', () => {
    const duration = performance.now() - start;
    requestDuration.record(duration, {
      method: req.method,
      // Use the route template, never the raw path, to keep cardinality bounded
      route: req.route?.path ?? 'unmatched',
      status_code: res.statusCode,
    });
  });
  next();
});
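A histogram does not keep every recorded duration. The SDK folds each value into a bucket and exports only the bucket counts, the sum, and the total count. The sketch below illustrates that aggregation with the default explicit bucket boundaries from the OpenTelemetry specification, which suit the `ms` unit used above. `createHistogramState` and `record` are hypothetical names, and this is a simplified model rather than the SDK's implementation.

```javascript
// Default explicit bucket boundaries from the OpenTelemetry specification.
const DEFAULT_BOUNDARIES = [0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000];

function createHistogramState(boundaries = DEFAULT_BOUNDARIES) {
  // One bucket per boundary plus a final overflow bucket.
  return { boundaries, counts: new Array(boundaries.length + 1).fill(0), sum: 0, count: 0 };
}

function record(state, value) {
  // The first boundary >= value marks the bucket (upper bounds are inclusive);
  // values above every boundary land in the overflow bucket.
  let index = state.boundaries.findIndex((upper) => value <= upper);
  if (index === -1) index = state.boundaries.length;
  state.counts[index] += 1;
  state.sum += value;
  state.count += 1;
}

const state = createHistogramState();
for (const durationMs of [3, 5, 42, 12000]) record(state, durationMs);
```

Percentiles computed by a backend are estimates interpolated from these buckets, so boundaries that match the expected latency range matter more than the number of samples.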
Exporting to Backend Systems
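Jaeger accepts OTLP directly. Zipkin, Prometheus, Datadog, and most other backends are usually reached through an OpenTelemetry Collector (or a vendor agent) that receives OTLP and forwards it:
import { NodeSDK } from '@opentelemetry/sdk-node';
import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-grpc';
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-grpc';

const traceExporter = new OTLPTraceExporter({
  url: 'http://jaeger-collector:4317', // OTLP gRPC endpoint
});
const metricExporter = new OTLPMetricExporter({
  // An OTLP gRPC receiver such as an OpenTelemetry Collector (hostname is an example);
  // Prometheus itself does not listen for OTLP on port 4317
  url: 'http://otel-collector:4317',
});
const sdk = new NodeSDK({
  traceExporter,
  metricReader: new PeriodicExportingMetricReader({ exporter: metricExporter }),
});
sdk.start();
For Datadog, point an OTLP exporter (for example @opentelemetry/exporter-trace-otlp-http) at a Datadog Agent or Collector with OTLP ingestion enabled.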
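Hardcoded endpoints make it awkward to move between environments. The OpenTelemetry specification defines environment variables for this, where a signal-specific variable such as `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` takes precedence over the generic `OTEL_EXPORTER_OTLP_ENDPOINT`. The sketch below mirrors that precedence for gRPC endpoints; `resolveOtlpGrpcEndpoint` is a hypothetical helper written only to illustrate the lookup order.

```javascript
// Resolves the OTLP gRPC endpoint for a signal ('traces', 'metrics', or 'logs').
// Precedence: signal-specific variable, then the generic variable, then the
// specification's default for OTLP over gRPC.
function resolveOtlpGrpcEndpoint(signal, env = process.env) {
  const specific = env[`OTEL_EXPORTER_OTLP_${signal.toUpperCase()}_ENDPOINT`];
  const generic = env.OTEL_EXPORTER_OTLP_ENDPOINT;
  return specific || generic || 'http://localhost:4317';
}
```

In practice the OTLP exporters read these variables themselves when no `url` option is passed, so omitting `url` and setting the variables in the deployment is often enough.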
Correlating Logs with Traces
Inject trace/span IDs into your logs:
import { trace } from '@opentelemetry/api';
import pino from 'pino';

const logger = pino({
  mixin() {
    const span = trace.getActiveSpan();
    const spanContext = span?.spanContext();
    if (spanContext) {
      return {
        trace_id: spanContext.traceId,
        span_id: spanContext.spanId,
      };
    }
    return {};
  },
});
// Every log now includes trace context automatically
logger.info('User logged in');
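The mixin can find the current span without it being passed around because the active context follows the request through awaits and callbacks. In Node.js the default context manager is built on `AsyncLocalStorage`. The sketch below shows that mechanism using only the standard library; `activeTrace`, `logLine`, and `handleRequest` are hypothetical stand-ins for the context manager, the logger mixin, and a request handler.

```javascript
const { AsyncLocalStorage } = require('node:async_hooks');

// Stand-in for OpenTelemetry's active context.
const activeTrace = new AsyncLocalStorage();

// Stand-in for the logger mixin: reads IDs from whatever context is active.
function logLine(message) {
  const ctx = activeTrace.getStore();
  return JSON.stringify({ ...(ctx ?? {}), msg: message });
}

async function handleRequest(traceId) {
  // Everything awaited inside run() sees this store, even across timers.
  return activeTrace.run({ trace_id: traceId }, async () => {
    await new Promise((resolve) => setTimeout(resolve, 5));
    return logLine('User logged in');
  });
}
```

Two overlapping calls to `handleRequest` each log their own `trace_id`, while a `logLine` call made outside any `run()` carries none. That is the same behavior the pino mixin relies on.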
Sampling for High-Volume Systems
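Tracing every request in a high-volume system is expensive in both cost and performance. Use a probabilistic sampler, wrapped in a parent-based sampler so that each service honors the sampling decision made upstream and traces stay complete:
import { NodeSDK } from '@opentelemetry/sdk-node';
import { ParentBasedSampler, TraceIdRatioBasedSampler } from '@opentelemetry/sdk-trace-node';

const sdk = new NodeSDK({
  sampler: new ParentBasedSampler({
    root: new TraceIdRatioBasedSampler(0.01), // sample 1% of new traces
  }),
});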
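A ratio-based sampler decides from the trace ID itself rather than from a random draw, so every service that sees the same trace ID reaches the same decision. The sketch below shows one way to do that: fold the ID into a 32-bit number and compare it with a threshold. It is a simplified illustration of the idea, not the SDK's code, and `foldTraceId` and `shouldSample` are hypothetical names.

```javascript
// Folds a 32-hex-character trace ID into one unsigned 32-bit number
// by XOR-ing its four 8-character chunks.
function foldTraceId(traceId) {
  let acc = 0;
  for (let pos = 0; pos < traceId.length; pos += 8) {
    acc = (acc ^ parseInt(traceId.slice(pos, pos + 8), 16)) >>> 0;
  }
  return acc;
}

// Samples the trace when its folded ID falls below ratio * 2^32 (approximately).
function shouldSample(traceId, ratio) {
  const upperBound = Math.floor(ratio * 0xffffffff);
  return foldTraceId(traceId) < upperBound;
}
```

Because the decision is a pure function of the trace ID, raising the ratio only adds traces to the sampled set; a trace sampled at 1% is still sampled at 10%.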
Production Considerations
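Performance impact: Auto-instrumentation adds CPU overhead, roughly 3-8% depending on workload and which instrumentations are enabled, so measure it in your own service. Sampling reduces this.
Memory leak prevention: Always end spans. startActiveSpan does not end the span for you, so call span.end() in a finally block; a span that is never ended is never exported.
Sensitive data: Redact query parameters, authorization headers, and similar values before they are recorded as span attributes.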
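Redaction is easiest to reason about as a pure function applied to a value before it becomes a span attribute. The sketch below shows one such approach for URLs and headers. The lists of sensitive names and the helpers `redactUrl` and `redactHeaders` are assumptions for illustration, not OpenTelemetry APIs.

```javascript
// Names treated as sensitive (assumed lists; extend for your application).
const SENSITIVE_PARAMS = new Set(['token', 'password', 'api_key']);
const SENSITIVE_HEADERS = new Set(['authorization', 'cookie', 'x-api-key']);

// Replaces the values of sensitive query parameters, leaving the rest intact.
function redactUrl(rawUrl) {
  const url = new URL(rawUrl);
  for (const key of [...url.searchParams.keys()]) {
    if (SENSITIVE_PARAMS.has(key.toLowerCase())) url.searchParams.set(key, 'REDACTED');
  }
  return url.toString();
}

// Returns a copy of the headers with sensitive values replaced.
function redactHeaders(headers) {
  return Object.fromEntries(
    Object.entries(headers).map(([name, value]) =>
      SENSITIVE_HEADERS.has(name.toLowerCase()) ? [name, 'REDACTED'] : [name, value]
    )
  );
}
```

Functions like these can be called in manual instrumentation before `span.setAttribute`, or from the request hooks that individual instrumentations expose.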
Common Pitfalls
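- Not propagating context across async boundaries: Use context.with() or ensure instrumentation wraps all async calls.
- Too many attributes: High-cardinality attributes (user IDs, request IDs) blow up storage, especially on metrics, where every distinct value creates a new time series. Keep them on sampled spans and off metric attributes.
- Blocking shutdown: sdk.shutdown() takes no timeout argument, so set timeoutMillis on the exporter (for example new OTLPTraceExporter({ timeoutMillis: 5000 })) and bound the shutdown call with your own timer to avoid a hanging process.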
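One way to bound shutdown time regardless of exporter settings is to race the shutdown promise against a timer. A minimal sketch follows, assuming only that `sdk.shutdown()` returns a promise; `shutdownWithTimeout` and its result strings are hypothetical.

```javascript
// Resolves with 'flushed' if shutdown() finishes in time, 'timeout' if it
// does not, and 'failed' if the exporter reports an error.
async function shutdownWithTimeout(sdk, timeoutMs = 5000) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve('timeout'), timeoutMs);
  });
  try {
    return await Promise.race([sdk.shutdown().then(() => 'flushed'), timeout]);
  } catch (err) {
    return 'failed'; // an export error should not keep the process alive
  } finally {
    clearTimeout(timer); // do not leave the timer holding the event loop open
  }
}
```

A SIGTERM handler could then await `shutdownWithTimeout(sdk)` and call `process.exit(0)` whatever the outcome, so a stuck collector cannot block a deploy.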
Visualizing with Jaeger
Run Jaeger locally:
docker run -d --name jaeger -p 16686:16686 -p 4317:4317 jaegertracing/all-in-one
Access UI at http://localhost:16686. Search traces by service, operation, or tags.
Conclusion
OpenTelemetry is non-negotiable for production Node.js services in 2026. Start with auto-instrumentation to get traces with almost no code, add custom spans for critical business logic, and export to Jaeger or Datadog. Correlate logs with trace IDs to debug end-to-end. The setup takes about an hour, and the debugging time it saves pays that back quickly. Don't wait for an outage: instrument today.