How We Process Billions of Logs Using ClickHouse

Introduction

In the fast-paced world of application development and deployment, effective logging is crucial for monitoring, debugging, and maintaining the health of systems. At CyberMind Works, we have developed a robust logging pipeline to handle billions of logs efficiently. This article delves into how we leverage ClickHouse, Fluent Bit, Pino, and Grafana to create a high-performance logging infrastructure.

Why We Chose ClickHouse

ClickHouse is a columnar database management system known for its fast query processing, making it ideal for analytics and log management. It is designed to handle large volumes of data with high throughput and low latency. This makes ClickHouse an excellent choice for our logging needs, where we require rapid ingestion and querying of billions of log entries.

Database Setup (ClickHouse)

We use ClickHouse to store and query our logs. Here are the table schemas we use:

Logs Table

CREATE TABLE cmwlogs.backendlogs
(
    namespace LowCardinality(String), -- service/deployment that emitted the log
    level UInt8,                      -- Pino numeric level (10=trace ... 60=fatal)
    timestamp DateTime,
    log String                        -- the raw JSON log line
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(timestamp)
ORDER BY (namespace, level, timestamp)
SETTINGS index_granularity = 8192;
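
Sorting by (namespace, level, timestamp) makes the everyday "recent errors for one service" lookup cheap, since ClickHouse can skip straight to the matching ranges. A sketch of such a query (the namespace value is illustrative; 50 is Pino's error level):

SELECT timestamp, log
FROM cmwlogs.backendlogs
WHERE namespace = 'sales-jobverse-stage-backend'
  AND level >= 50
  AND timestamp >= now() - INTERVAL 1 DAY
ORDER BY timestamp DESC
LIMIT 100;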

Logging

Logging is the backbone of our monitoring and debugging process. Here's a brief overview:

Why Logging is Important

1. Debugging: Helps identify and resolve issues quickly.
2. Monitoring: Provides insights into system health and performance.
3. Auditing and Compliance: Maintains an audit trail for regulatory requirements.
4. Performance Optimization: Identifies bottlenecks and areas for improvement.

Tools We Use

We use Pino and NestJS Pino for backend logging. Logs are written to stdout and collected by a Fluent Bit DaemonSet, which then pushes them to our ClickHouse database.
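
For reference, each Pino log line on stdout is a single JSON object, which is what Fluent Bit later parses with Merge_Log (the field values here are illustrative):

{"level":30,"time":1718012345678,"pid":1,"hostname":"backend-7d4f9","msg":"request completed"}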

Setting Up Pino and NestJS Pino

Installation

npm install pino nestjs-pino pino-pretty uuid @nestjs/config

Configuration in `app.module.ts`

import { Module } from '@nestjs/common';
import { ConfigModule, ConfigService } from '@nestjs/config';
import { LoggerModule } from 'nestjs-pino';
import pino, { DestinationStream } from 'pino';
import { v4 } from 'uuid';

@Module({
  imports: [
    LoggerModule.forRootAsync({
      imports: [ConfigModule],
      inject: [ConfigService],
      useFactory: (configService: ConfigService) => {
        const streams: DestinationStream[] = [process.stdout];
        const env = configService.get('NODE_ENV');
        // Dash-free UUID as the per-request correlation ID
        const base = { genReqId: () => v4().replace(/-/g, '') };
        return {
          pinoHttp:
            env === 'development'
              ? // Pretty, colorized output locally; pino rejects a transport
                // combined with a destination stream, so no multistream here
                {
                  ...base,
                  transport: {
                    target: 'pino-pretty',
                    options: { colorize: true },
                  },
                }
              : // Raw JSON to the multistream (stdout) for Fluent Bit
                [base, pino.multistream(streams)],
        };
      },
    }),
  ],
})
export class AppModule {}

Configuration in `main.ts`

import { NestFactory } from '@nestjs/core';
import { Logger } from 'nestjs-pino';
import { AppModule } from './app.module';

async function bootstrap() {
  // bufferLogs holds early log lines until the Pino logger is attached below
  const app = await NestFactory.create(AppModule, {
    cors: true,
    bufferLogs: true,
  });
  app.useLogger(app.get(Logger));
  await app.listen(3000);
}

bootstrap();

Injecting the Logger in Services

constructor(
  @InjectPinoLogger(AuthService.name) // Adds the service name to every log entry
  private readonly logger: PinoLogger,
) {}
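
Once injected, the logger follows Pino's call signature: a structured-context object first, then the message. A hypothetical call site (email and err are illustrative variables):

this.logger.info({ email }, 'User login attempted');
this.logger.error({ err }, 'Login failed');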

Source Map Support

Enable source maps in tsconfig.json:

{
  "compilerOptions": {
    "sourceMap": true
  }
}

Add the following at the top of `main.ts`, before anything else runs:

import { install } from 'source-map-support';
install();

Fluent Bit Setup

To handle log collection and forwarding, we use Fluent Bit, a lightweight and high-performance log processor.

Installation

kubectl create namespace fluent-bit
helm repo add fluent https://fluent.github.io/helm-charts
helm upgrade --install fluent-bit fluent/fluent-bit --values values.yaml -n fluent-bit
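
The chart's default values.yaml already tails container logs; for orientation, the inputs stanza looks roughly like this (a sketch based on the chart defaults):

inputs: |
  [INPUT]
      Name tail
      Path /var/log/containers/*.log
      multiline.parser docker, cri
      Tag kube.*
      Mem_Buf_Limit 5MB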

Filtering Logs

Fluent Bit can filter logs on Kubernetes metadata using record accessors. The grep filter below keeps only records from pods whose should_log label matches:

filters: |
  [FILTER]
      Name kubernetes
      Match kube.*
      Merge_Log On
      Keep_Log Off
      K8S-Logging.Parser On

  [FILTER]
      Name grep
      Match *
      Regex $kubernetes['labels']['should_log'] do
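
A values.yaml like this also needs an output stanza to push the surviving records to ClickHouse. One way to do that (a sketch, not our exact config; the Host value is a hypothetical in-cluster service name) is Fluent Bit's http output pointed at ClickHouse's HTTP interface, which accepts INSERT ... FORMAT JSONEachRow on port 8123:

outputs: |
  [OUTPUT]
      Name http
      Match kube.*
      Host clickhouse.logging.svc
      Port 8123
      URI /?query=INSERT+INTO+cmwlogs.backendlogs+FORMAT+JSONEachRow
      Format json_lines
      Json_date_key timestamp
      Json_date_format epoch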

Configuring Deployments

Add the should_log label to your Kubernetes deployment configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sales-jobverse-stage-backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sales-jobverse-stage-backend
  template:
    metadata:
      labels:
        app: sales-jobverse-stage-backend
        should_log: do
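
To sanity-check which workloads the grep filter will pick up, list every pod that carries the label:

kubectl get pods --all-namespaces -l should_log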

Grafana for Visualization

To visualize and analyze our logs, we use Grafana, an open-source platform for monitoring and observability. Grafana integrates seamlessly with ClickHouse, allowing us to build interactive dashboards and perform complex queries on our log data.

Setting Up Grafana

1. Install Grafana:

helm repo add grafana https://grafana.github.io/helm-charts
helm install grafana grafana/grafana

2. Configure Data Source:
- Install the ClickHouse data source plugin (ClickHouse support is not built into Grafana).
- Open Grafana and navigate to Configuration > Data Sources.
- Add ClickHouse as a data source and provide the connection details.

3. Create Dashboards:
- Create custom dashboards to visualize log data.
- Use ClickHouse queries to filter and aggregate log information (see the sketch below).
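
A typical panel query against the backendlogs schema buckets log volume by level over time; a sketch:

SELECT
    toStartOfMinute(timestamp) AS minute,
    level,
    count() AS events
FROM cmwlogs.backendlogs
WHERE timestamp >= now() - INTERVAL 1 HOUR
GROUP BY minute, level
ORDER BY minute;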

The Logging Pipeline

Collecting Logs

Logs are collected by Fluent Bit, which runs as a daemon set in our Kubernetes cluster. Fluent Bit listens to all pod logs, filters them based on labels, and forwards the relevant logs to ClickHouse.

Storing Logs

In ClickHouse, logs are stored in dedicated tables: the backendlogs table shown above for the backend, plus an analogous table for frontend logs. The MergeTree layout, monthly partitioning, and sort key allow for efficient querying and analysis.
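
Monthly partitioning also keeps retention cheap: expiring a month of logs is a metadata-level partition drop rather than a row-by-row delete. For example, with the toYYYYMM partition key:

ALTER TABLE cmwlogs.backendlogs DROP PARTITION 202401;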

Querying and Analyzing Logs

ClickHouse's high-performance query engine enables us to quickly analyze large volumes of logs. Grafana helps visualize these logs, providing insights into system health, debugging issues, and optimizing performance.

Conclusion

By leveraging ClickHouse, Fluent Bit, Pino, and Grafana, we have created a high-performance logging infrastructure capable of processing billions of logs efficiently. This setup not only helps us maintain the health of our systems but also provides valuable insights for debugging and performance optimization. Through continuous monitoring and logging, CyberMind Works is committed to delivering robust and reliable applications.

For further reading and detailed configurations, refer to the following resources:

- Pino Documentation
- NestJS Pino
- Fluent Bit Documentation
- ClickHouse Documentation
- Grafana Documentation


About Boopesh Mahendran

Boopesh is one of the Co-Founders of CyberMind Works and the Head of Engineering. An alum of Madras Institute of Technology with a rich professional background, he has previously worked at Adobe and Amazon. His expertise drives the innovative solutions at CyberMind Works.
