Building a Claude Code Usage Analytics Platform: Architecture Deep Dive

Executive Summary

🎓 New to system design? Before diving into this technical architecture, you might want to read From Idea to Architecture: Using Claude Code to Design Complex Systems to learn how we used Claude Code to brainstorm and design this system from scratch.

roiAI is a comprehensive analytics platform for analyzing AI service usage and costs, specifically designed for Claude Code users. The system consists of three main components:

roiAI CLI - A command-line tool that collects and analyzes Claude Code usage data from any user's machine. Works locally or syncs across all your devices for consolidated analytics.
roiAI Web API - A RESTful API service that aggregates data from multiple machines, enabling cross-device analytics and team insights
roiAI Web Application - A Next.js-based web interface for viewing unified analytics from all your devices and global rankings

System Architecture Overview

The roiAI system follows a three-tier architecture with clear separation between client, edge, and server components.

Infrastructure Note: All traffic to roiai.fyi passes through Cloudflare, which provides DNS management, DDoS protection, SSL/TLS termination, CDN caching, and Web Application Firewall (WAF) services before reaching our Kubernetes cluster.

Data Flow

graph TB
    subgraph "CLIENT SIDE - User Machines"
        subgraph "Developer"
            CLI1[roiAI CLI<br/>Work Laptop]
            DB1[(Local SQLite)]
            CLI1 --> DB1
        end
        
        subgraph "Same User - Different Device"
            CLI2[roiAI CLI<br/>Home Desktop]
            DB2[(Local SQLite)]
            CLI2 --> DB2
        end
        
        subgraph "Cloud Development"
            CLI3[roiAI CLI<br/>Cloud VM/Codespace]
            DB3[(Local SQLite)]
            CLI3 --> DB3
        end
    end
    
    subgraph "CLOUDFLARE EDGE"
        CF[Cloudflare<br/>DNS/CDN/WAF/SSL]
    end
    
    subgraph "SERVER SIDE - Cloud Infrastructure"
        subgraph "Kubernetes Cluster"
            subgraph "Ingress Layer"
                INGRESS[NGINX Ingress<br/>roiai.fyi / api.roiai.fyi]
            end
            
            subgraph "Application Layer"
                WEB[Web Service<br/>Next.js:3000]
                API[API Service<br/>/api endpoints]
            end
            
            subgraph "Background Workers"
                STATS[Stats Worker<br/>Port: 3001]
                RANK[Ranking Worker<br/>Port: 3002]
            end
            
            subgraph "Data Layer"
                PG[(PostgreSQL<br/>10Gi PV)]
            end
            
            INGRESS --> WEB
            INGRESS --> API
            WEB --> PG
            API --> PG
            STATS --> PG
            RANK --> PG
        end
    end
    
    CLI1 -.->|HTTPS| CF
    CLI2 -.->|HTTPS| CF
    CLI3 -.->|HTTPS| CF
    CF ==>|Proxied| INGRESS
    
    classDef clientStyle fill:#ffe0e0,stroke:#c0392b,stroke-width:2px
    classDef dbStyle fill:#fff3e0,stroke:#e65100,stroke-width:2px
    classDef serverStyle fill:#e0ffe0,stroke:#27ae60,stroke-width:2px
    classDef workerStyle fill:#fce4ec,stroke:#880e4f,stroke-width:2px
    classDef cloudflareStyle fill:#f6821f,stroke:#f48120,stroke-width:3px,color:#fff
    
    class CLI1,CLI2,CLI3 clientStyle
    class DB1,DB2,DB3,PG dbStyle
    class WEB,API,INGRESS serverStyle
    class STATS,RANK workerStyle
    class CF cloudflareStyle

Component Details

1. roiAI CLI (Client-Side)

Purpose

Collects and analyzes Claude Code usage data from any user's machine - whether they're developers, data scientists, researchers, or other professionals using AI tools. Seamlessly sync and consolidate analytics across all your devices (work laptop, home desktop, cloud VMs, etc.).

Multi-Device Innovation: Unlike traditional analytics tools, roiAI recognizes that modern developers work across multiple environments. Whether you're coding on your work laptop during the day, continuing on your home desktop in the evening, or using cloud development environments like GitHub Codespaces - roiAI consolidates all your Claude Code usage into one unified view.

Key Features

✓ Privacy-First: Can operate entirely offline with local-only analytics
✓ Universal Compatibility: Works on any machine where Claude Code is used
✓ Incremental Sync: Processes only new data since last sync
✓ Batch Processing: Handles large datasets efficiently
✓ Multi-Device Analytics: Consolidate usage data from all your machines (work laptop, home desktop, cloud VMs) into a unified dashboard
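
One way to picture the Incremental Sync feature is a per-file checkpoint that remembers how many log lines were already processed, so a later run only returns what was appended since. The sketch below is an assumption about how such a checkpoint could work, not roiAI's actual implementation; `SyncCheckpoint` and `readNewLines` are hypothetical names.

```typescript
// Hypothetical checkpoint: number of lines already processed, keyed by log file path.
type SyncCheckpoint = Map<string, number>;

interface NewLines {
  lines: string[];
  checkpoint: SyncCheckpoint;
}

// Returns only the lines added since the last sync, plus an updated checkpoint.
function readNewLines(
  filePath: string,
  fileContent: string,
  checkpoint: SyncCheckpoint,
): NewLines {
  const allLines = fileContent.split("\n").filter((line) => line.trim() !== "");
  const alreadyProcessed = checkpoint.get(filePath) ?? 0;
  // If the file shrank (rotated or truncated), start again from the beginning.
  const start = alreadyProcessed <= allLines.length ? alreadyProcessed : 0;
  const next: SyncCheckpoint = new Map(Array.from(checkpoint.entries()));
  next.set(filePath, allLines.length);
  return { lines: allLines.slice(start), checkpoint: next };
}
```

Returning a new checkpoint instead of mutating the old one lets the caller persist it only after the new lines have been stored successfully.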

User Types

Software Developers, Data Scientists, ML Researchers, Content Creators, Business Analysts

Technology Stack

TypeScript/Node.js, SQLite (Prisma ORM), Commander.js, Axios

Core Commands

roiai cc sync         # Local analysis only
roiai cc login        # Authenticate with cloud
roiai cc push         # Upload data to cloud
roiai cc push-status  # Check auth and sync status
roiai cc logout       # Remove authentication

Data Flow

graph LR
    A[Claude Logs] -->|Read| B[roiAI CLI]
    B -->|Parse JSONL| C[Process Messages]
    C -->|Store| D[Local SQLite]
    D -->|Aggregate| E[Statistics]
    E -->|Optional| F[Push to Cloud]
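
The parse and aggregate steps of this flow can be sketched as two small functions. This is an illustrative sketch only: the record fields (`messageId`, `model`, `inputTokens`, `outputTokens`) are assumptions made for the example, and the real Claude Code log schema may differ.

```typescript
// Hypothetical shape of one usage record; real Claude Code log fields may differ.
interface UsageRecord {
  messageId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

interface UsageTotals {
  messages: number;
  inputTokens: number;
  outputTokens: number;
}

// Runtime check so that unrelated or incomplete JSON objects are skipped.
function isUsageRecord(value: unknown): value is UsageRecord {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.messageId === "string" &&
    typeof v.model === "string" &&
    typeof v.inputTokens === "number" &&
    typeof v.outputTokens === "number"
  );
}

// Parse JSONL text, skipping blank, malformed, or unrecognized lines.
function parseJsonl(text: string): UsageRecord[] {
  const records: UsageRecord[] = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    try {
      const parsed: unknown = JSON.parse(line);
      if (isUsageRecord(parsed)) records.push(parsed);
    } catch {
      // A partially written last line can occur in append-only logs; ignore it.
    }
  }
  return records;
}

// Sum message and token counts per model (the "Statistics" step).
function aggregateByModel(records: UsageRecord[]): Map<string, UsageTotals> {
  const totals = new Map<string, UsageTotals>();
  for (const r of records) {
    const t = totals.get(r.model) ?? { messages: 0, inputTokens: 0, outputTokens: 0 };
    t.messages += 1;
    t.inputTokens += r.inputTokens;
    t.outputTokens += r.outputTokens;
    totals.set(r.model, t);
  }
  return totals;
}
```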

2. roiAI Web API (Server-Side)

Purpose

RESTful API service that receives data from CLI clients and provides analytics endpoints.

Network Architecture

Cloudflare Integration:

DNS Management: All roiai.fyi domains managed through Cloudflare
SSL/TLS: Universal SSL certificates at the edge, with traffic encrypted on both the client-to-Cloudflare and Cloudflare-to-origin hops
DDoS Protection: Layer 3/4/7 attack mitigation
CDN: Static assets cached at edge locations globally
WAF: Web Application Firewall with custom rules
Analytics: Traffic analytics and threat monitoring

API Structure

Category	Endpoints	Purpose
Authentication	/api/v1/auth/*	User registration, login, email verification
Data Sync	/api/v1/cli/*	Batch upload, health checks
User Management	/api/v1/users/activity	User activity data

Key Design Decisions

Batch Processing: Accepts up to 1000 messages per sync request
Deduplication: Prevents duplicate message processing
Rate Limiting: Protects against abuse (enforced at both Cloudflare and application level)
Standardized Error Codes: Consistent error handling across endpoints
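
The batch limit and deduplication decisions can be illustrated with a small server-side check. This is a minimal sketch under stated assumptions: `messageId` as the unique key and the in-memory `alreadyStored` set are hypothetical stand-ins, and a real service would more likely rely on a database unique constraint.

```typescript
// Limit taken from the design decisions above.
const MAX_BATCH_SIZE = 1000;

interface IncomingMessage {
  messageId: string; // hypothetical unique key for a message
}

interface DedupResult<T> {
  accepted: T[];
  duplicates: number;
}

// Rejects oversized batches, then drops messages that were already stored
// or that appear more than once within the same batch.
function dedupeBatch<T extends IncomingMessage>(
  batch: T[],
  alreadyStored: ReadonlySet<string>,
): DedupResult<T> {
  if (batch.length > MAX_BATCH_SIZE) {
    throw new RangeError(`Batch of ${batch.length} exceeds limit of ${MAX_BATCH_SIZE}`);
  }
  const seenInBatch = new Set<string>();
  const accepted: T[] = [];
  for (const msg of batch) {
    if (alreadyStored.has(msg.messageId) || seenInBatch.has(msg.messageId)) continue;
    seenInBatch.add(msg.messageId);
    accepted.push(msg);
  }
  return { accepted, duplicates: batch.length - accepted.length };
}
```

Because duplicates are skipped instead of rejected, a client that retries an upload after a network failure can resend the same batch safely.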

3. Kubernetes Deployment (Helm Charts)

Architecture Components

Component	Type	Replicas	Resources	Notes
Web Service	Next.js app	1 (HPA: 3)	100m CPU / 256Mi RAM	Pod Anti-Affinity for HA
Stats Worker	Background	2 (HPA: 4)	50m CPU / 128Mi RAM	Port 3001 for health
Ranking Worker	Background	1 (fixed)	50m CPU / 128Mi RAM	Advisory locks prevent concurrent runs
PostgreSQL	Database	1	200m CPU / 256Mi RAM	10Gi persistent volume

Supporting Infrastructure

✓ Cloudflare: DNS, CDN, SSL/TLS, DDoS protection, WAF
✓ Ingress: NGINX with rate limiting and custom headers
✓ Domains: roiai.fyi, www.roiai.fyi, api.roiai.fyi (all via Cloudflare)
✓ Storage Class: vultr-block-storage
✓ Monitoring: Prometheus ServiceMonitors and alerts

Deployment Topology

graph TB
    subgraph "External Services"
        CF[Cloudflare<br/>Edge Network]
        LE[Let's Encrypt<br/>Origin Certs]
    end
    
    subgraph "Kubernetes Namespace: roiai"
        subgraph "Public Facing"
            ING[NGINX Ingress<br/>Rate Limiting]
            SVC[Service: web-service<br/>ClusterIP]
        end
        
        subgraph "Application Pods"
            WEB1[web-deployment-1<br/>Next.js]
            WEB2[web-deployment-2<br/>Next.js]
            STATS1[stats-worker-1]
            STATS2[stats-worker-2]
            RANK[ranking-worker-1]
        end
        
        subgraph "Data Persistence"
            PG[PostgreSQL StatefulSet]
            PVC1[postgres-data-pvc<br/>10Gi]
            PVC2[backup-pvc<br/>10Gi]
        end
        
        subgraph "Jobs & CronJobs"
            MIG[db-migrate Job]
            BAK[backup CronJob<br/>Daily 2AM]
        end
        
        CF ==> ING
        ING --> SVC
        SVC --> WEB1
        SVC --> WEB2
        WEB1 --> PG
        WEB2 --> PG
        STATS1 --> PG
        STATS2 --> PG
        RANK --> PG
        PG --> PVC1
        BAK --> PVC2
        MIG --> PG
        ING -.-> LE
    end
    
    style CF fill:#f6821f,stroke:#f48120,color:#fff
    style ING fill:#3498db,stroke:#2980b9
    style PG fill:#e67e22,stroke:#d35400
    style RANK fill:#e74c3c,stroke:#c0392b

System Interactions

1. CLI Authentication Flow (Client → Server)

sequenceDiagram
    participant User
    participant CLI as roiAI CLI<br/>[Client Machine]
    participant Server as roiAI Server<br/>[Cloud]
    participant DB as PostgreSQL<br/>[Cloud]

    User->>CLI: roiai cc login
    CLI->>User: Prompt for credentials
    User->>CLI: Email/Username + Password
    CLI->>CLI: Collect machine info
    
    Note over CLI,Server: Client → Server Communication
    CLI->>Server: POST /api/v1/cli/login
    Note over CLI,Server: CliLoginRequest with machine_info
    
    Server->>Server: Validate request schema
    Server->>Server: Authenticate user credentials
    Server->>Server: Generate API key + hash
    Server->>DB: Store API key with machine info
    Server->>CLI: CliLoginResponse (user + api_key)
    
    CLI->>CLI: Store credentials locally
    CLI->>User: Success confirmation
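
The "Generate API key + hash" step stores only a hash on the server, so a database leak does not expose usable keys. The diagram does not say which algorithm is used; the sketch below assumes a random 32-byte key hashed with SHA-256 via Node's built-in `node:crypto` module. The `roiai_` prefix and the function names are hypothetical.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

interface IssuedKey {
  apiKey: string;  // returned to the CLI once, never stored server-side
  keyHash: string; // stored in the database alongside the machine info
}

// SHA-256 is an assumption; the source only says the key is hashed.
function hashApiKey(apiKey: string): string {
  return createHash("sha256").update(apiKey).digest("hex");
}

// Generates a random key and the hash to persist.
function issueApiKey(): IssuedKey {
  const apiKey = "roiai_" + randomBytes(32).toString("hex"); // prefix is hypothetical
  return { apiKey, keyHash: hashApiKey(apiKey) };
}

// Compares hashes in constant time to avoid leaking information through timing.
function verifyApiKey(presentedKey: string, storedHash: string): boolean {
  const presented = Buffer.from(hashApiKey(presentedKey), "hex");
  const stored = Buffer.from(storedHash, "hex");
  return presented.length === stored.length && timingSafeEqual(presented, stored);
}
```

A fast hash is reasonable here only because the key is long and random; user passwords still call for a slow hash such as the bcrypt setup described later in the article.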

2. Data Sync Flow (Client → Server)

sequenceDiagram
    participant User
    participant CLI as roiAI CLI<br/>[Client Machine]
    participant LocalDB as SQLite<br/>[Client Machine]
    participant Server as roiAI Server<br/>[Cloud]
    participant DB as PostgreSQL<br/>[Cloud]

    User->>CLI: roiai cc push
    CLI->>LocalDB: Check sync status
    
    alt Local sync needed
        CLI->>CLI: Process .jsonl files
        CLI->>LocalDB: Store new messages
    end
    
    Note over CLI,Server: Client → Server Communication
    CLI->>Server: GET /api/v1/cli/health
    Server->>CLI: Authentication confirmation
    
    CLI->>LocalDB: Analyze push queue
    CLI->>User: Display statistics
    
    loop For each batch
        CLI->>LocalDB: Select batch messages
        CLI->>CLI: Build PushRequest
        Note over CLI,Server: Client → Server Batch Upload
        CLI->>Server: POST /api/v1/cli/upsync
        Note over CLI,Server: Batch of messages + entities
        
        Server->>Server: Validate request
        Server->>Server: Transform entities
        Server->>DB: Begin transaction
        Server->>DB: Process entities
        Server->>DB: Process messages
        Server->>DB: Update child counts
        Server->>DB: Collect stats deltas
        Server->>DB: Commit transaction
        Server->>CLI: PushResponse with results
        
        CLI->>LocalDB: Mark messages as synced
        CLI->>User: Update progress
    end
    
    CLI->>User: Final summary
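
The batch loop in this flow can be sketched from the client's point of view. This is a simplified illustration and not the CLI's actual code: `QueuedMessage`, `Uploader`, and `pushAll` are hypothetical names, and the uploader is injected so the network call (a POST to /api/v1/cli/upsync in the diagram) stays out of the sketch.

```typescript
interface QueuedMessage {
  messageId: string;
  synced: boolean;
}

// Hypothetical transport; the real CLI would POST the batch to /api/v1/cli/upsync.
type Uploader = (batch: QueuedMessage[]) => Promise<{ acceptedIds: string[] }>;

// Splits items into consecutive batches of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new RangeError("size must be positive");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Uploads unsynced messages batch by batch and marks each one as synced
// only after the server has confirmed it.
async function pushAll(
  queue: QueuedMessage[],
  upload: Uploader,
  batchSize = 1000,
): Promise<number> {
  let syncedCount = 0;
  const pending = queue.filter((m) => !m.synced);
  for (const batch of chunk(pending, batchSize)) {
    const { acceptedIds } = await upload(batch);
    const accepted = new Set(acceptedIds);
    for (const msg of batch) {
      if (accepted.has(msg.messageId)) {
        msg.synced = true;
        syncedCount += 1;
      }
    }
  }
  return syncedCount;
}
```

Marking messages as synced per batch means an interrupted push can resume from the first unconfirmed batch on the next run.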

3. Background Workers Flow

flowchart TB
    subgraph "Kubernetes Cluster [Cloud]"
        subgraph "Stats Worker"
            SW1[Stats Worker<br/>Instance 1]
            SW2[Stats Worker<br/>Instance 2]
        end
        
        subgraph "Ranking Worker"
            RW[Ranking Worker<br/>Single Instance]
        end
        
        subgraph "Database"
            PG[(PostgreSQL)]
            DELTA[PendingStatsDelta<br/>Table]
            STATS[Statistics Tables<br/>Daily/Weekly/Monthly/Yearly]
            RANK[Ranking Views<br/>Materialized]
        end
        
        SW1 -->|Process deltas| DELTA
        SW2 -->|Process deltas| DELTA
        DELTA -->|Aggregate| STATS
        
        RW -->|Calculate rankings| STATS
        STATS -->|Update| RANK
        
        RW -.->|Advisory lock| PG
        
        API[API Service] -->|Write| DELTA
    end
    
    style SW1 fill:#e0f2f1,stroke:#00796b
    style SW2 fill:#e0f2f1,stroke:#00796b
    style RW fill:#fce4ec,stroke:#c2185b
    style DELTA fill:#fff3e0,stroke:#e65100
    style STATS fill:#e8eaf6,stroke:#303f9f
    style RANK fill:#e8eaf6,stroke:#303f9f

Worker Responsibilities

Worker	Purpose	Processing
Stats Worker	Process pending statistics deltas	Runs every few minutes, aggregates PendingStatsDelta into time-based statistics tables
Ranking Worker	Calculate user rankings	Runs periodically, updates cost rankings across all time periods using advisory locks
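
The advisory-lock guard mentioned for the Ranking Worker could look roughly like the sketch below. `pg_try_advisory_lock` and `pg_advisory_unlock` are real PostgreSQL functions; everything else (the `LockClient` interface, the lock id `42`, the function name) is an assumption made to keep the example independent of any particular database driver.

```typescript
// Minimal query interface so the sketch does not depend on a specific driver.
interface LockClient {
  query(sql: string, params: unknown[]): Promise<{ rows: Array<Record<string, unknown>> }>;
}

// Hypothetical lock id; any application-chosen 64-bit integer works.
const RANKING_LOCK_ID = 42;

// Runs `job` only if the advisory lock is free. Returns false when another
// session already holds the lock, so overlapping runs are skipped.
async function runExclusively(
  client: LockClient,
  lockId: number,
  job: () => Promise<void>,
): Promise<boolean> {
  const res = await client.query("SELECT pg_try_advisory_lock($1) AS locked", [lockId]);
  if (res.rows[0]?.locked !== true) return false;
  try {
    await job();
    return true;
  } finally {
    // Session-level locks must be released on the same connection that took them.
    await client.query("SELECT pg_advisory_unlock($1)", [lockId]);
  }
}
```

Because the lock belongs to a database session, the client passed in should be a single dedicated connection, not a pool that may hand each query to a different connection.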

How Stats Worker Operates

1. Delta Collection: When messages are uploaded via API, the server writes incremental statistics to the PendingStatsDelta table
2. Batch Processing: Stats Worker periodically queries pending deltas grouped by user and date
3. Aggregation: For each user/date combination, it sums all deltas (messages, tokens, costs)
4. Upsert Operations: Updates or inserts into appropriate statistics tables (DailyStats, WeeklyStats, MonthlyStats, YearlyStats)
5. Cleanup: Deletes processed deltas to prevent reprocessing
6. Parallel Processing: Multiple Stats Worker instances can process different user/date combinations simultaneously
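
The aggregation and upsert steps above can be sketched in memory. This is an illustration under assumptions, not the worker's real code: the field names are hypothetical, and a `Map` stands in for the DailyStats table that the real worker would update inside a database transaction.

```typescript
// Hypothetical row shape, used both for pending deltas and for daily totals.
interface StatsRow {
  userId: string;
  date: string; // e.g. "2025-01-31"
  messages: number;
  tokens: number;
  cost: number;
}

const keyOf = (row: { userId: string; date: string }): string => `${row.userId}|${row.date}`;

// Step 3: sum all pending deltas per user/date combination.
function aggregateDeltas(deltas: StatsRow[]): StatsRow[] {
  const byKey = new Map<string, StatsRow>();
  for (const d of deltas) {
    const total = byKey.get(keyOf(d)) ?? { userId: d.userId, date: d.date, messages: 0, tokens: 0, cost: 0 };
    total.messages += d.messages;
    total.tokens += d.tokens;
    total.cost += d.cost;
    byKey.set(keyOf(d), total);
  }
  return Array.from(byKey.values());
}

// Step 4: add each aggregate to an existing row, or insert a new row.
function upsertDaily(table: Map<string, StatsRow>, totals: StatsRow[]): void {
  for (const t of totals) {
    const row = table.get(keyOf(t));
    if (row) {
      row.messages += t.messages;
      row.tokens += t.tokens;
      row.cost += t.cost;
    } else {
      table.set(keyOf(t), { ...t });
    }
  }
}
```

Since deltas are additive, applying the same delta twice would double-count it, which is why the cleanup step has to delete processed deltas together with the upsert.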

Materialized Views for Fast Rankings

The system uses PostgreSQL materialized views to cache pre-computed rankings:

-- Example: current_daily_cost_rankings view
CREATE MATERIALIZED VIEW current_daily_cost_rankings AS
SELECT 
    u.id AS userId,  -- aliased so the userId index below can find the column
    u.username,
    u.email,
    ds.totalCost,
    ds.totalMessages,
    ds.costRank,
    PERCENT_RANK() OVER (ORDER BY ds.totalCost DESC) as percentile
FROM users u
JOIN daily_stats ds ON u.id = ds.userId
WHERE ds.date = CURRENT_DATE
ORDER BY ds.costRank;

-- Indexed for fast queries
CREATE INDEX ON current_daily_cost_rankings (costRank);
CREATE INDEX ON current_daily_cost_rankings (userId);
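
To make the `percentile` column concrete: PostgreSQL defines `PERCENT_RANK()` as (rank - 1) / (total rows - 1), where tied rows share a rank. The sketch below reproduces that calculation in TypeScript for illustration. It assumes `costRank` follows the same tie rule as SQL `RANK()`; the source does not say how `costRank` in `daily_stats` is actually computed, and `rankByCost` is a hypothetical name.

```typescript
interface UserCost {
  userId: string;
  totalCost: number;
}

interface RankedUser extends UserCost {
  costRank: number;   // 1 = highest cost; ties share a rank, like SQL RANK()
  percentile: number; // (rank - 1) / (n - 1), like SQL PERCENT_RANK()
}

// Ranks users by cost, highest first, mirroring ORDER BY totalCost DESC.
function rankByCost(users: UserCost[]): RankedUser[] {
  const sorted = [...users].sort((a, b) => b.totalCost - a.totalCost);
  const n = sorted.length;
  const ranked: RankedUser[] = [];
  let prevCost: number | undefined;
  let prevRank = 0;
  sorted.forEach((u, i) => {
    // Equal costs keep the previous rank; otherwise the rank is the 1-based position.
    const costRank = u.totalCost === prevCost ? prevRank : i + 1;
    const percentile = n > 1 ? (costRank - 1) / (n - 1) : 0;
    ranked.push({ ...u, costRank, percentile });
    prevCost = u.totalCost;
    prevRank = costRank;
  });
  return ranked;
}
```

With the descending order used in the view, the highest spender gets percentile 0 and the lowest gets 1.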

Design Rationale

1. Privacy-First Architecture

• Local-First Processing: All analysis happens on user's machine
• Optional Cloud Sync: Users control when/if to upload
• Machine Isolation: Each machine analyzed separately

2. Scalability Considerations

• Cloudflare Edge: Global CDN reduces origin server load
• Batch Processing: Reduces API calls and database transactions
• Worker Separation: Background tasks don't impact API performance
• Horizontal Scaling: All services except ranking worker can scale

3. Reliability Features

• DDoS Protection: Cloudflare shields against attacks
• Retry Mechanisms: Failed syncs retry with exponential backoff
• Deduplication: Prevents data corruption from duplicate uploads
• Health Checks: All services expose health endpoints
• Pod Disruption Budgets: Ensures availability during updates
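
Retrying with exponential backoff, as listed above, means doubling the wait after each failed attempt up to some cap. The sketch below shows the general technique; the base delay, cap, and attempt count are illustrative assumptions, not roiAI's actual settings, and `withRetry` is a hypothetical helper.

```typescript
// Delay before the next retry: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Runs `operation` until it succeeds or `maxAttempts` is reached.
// `sleep` is injectable so the function can be tested without real waiting.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      // No need to wait after the final failed attempt.
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```

Production code often adds random jitter to each delay so that many clients recovering from the same outage do not retry in lockstep.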

4. Performance Optimizations

• Edge Caching: Static assets served from Cloudflare's global network
• Incremental Sync: Only new data processed
• Database Indexing: Optimized queries for rankings
• Connection Pooling: 30 connections in production

Security Considerations

1. Network Security (Cloudflare Layer)

WAF Rules: Custom rules to block malicious requests
Bot Management: Protection against automated attacks
SSL/TLS: Traffic encrypted on both hops (client to Cloudflare, Cloudflare to origin) using origin certificates
IP Filtering: Whitelist/blacklist capabilities

2. Authentication & Authorization

JWT Tokens: Access tokens that expire after 7 days
API Keys: Machine-specific for CLI authentication
Bcrypt: Password hashing with 12 rounds

3. Container Security

Non-Root Users: All containers run as user 1001
ReadOnly Filesystems: Prevents runtime modifications
Capability Dropping: Minimal Linux capabilities
Security Contexts: Enforced at pod and container level

4. Application Security

CORS: Restricted to specific domains
Rate Limiting: Dual-layer (Cloudflare + NGINX)
Input Validation: Strict schema validation
SQL Injection Protection: Prisma ORM with parameterized queries

Operational Aspects

1. Monitoring & Alerting

graph LR
    A[Services] -->|metrics| B[Prometheus]
    B -->|alerts| C[Alert Manager]
    C -->|notify| D[Email/Slack]
    B -->|visualize| E[Grafana]
    CF[Cloudflare] -->|analytics| F[CF Dashboard]
    
    subgraph Alerts
        G[Memory > 90%]
        H[Pending > 10k]
        I[Pod Restarts]
        J[CF Rate Limits]
    end
    
    C --> Alerts
    F --> J

2. Backup & Recovery

✓ Daily Backups: PostgreSQL data backed up at 2 AM UTC
✓ 7-Day Retention: Rolling backup window
✓ Persistent Volumes: Data survives pod restarts
✓ Cloudflare Logs: 7-day retention for security analysis

3. Deployment Strategy

✓ Rolling Updates: Zero-downtime deployments
✓ Image Versioning: Semantic versioning for releases
✓ Environment Separation: Dev/Prod configurations
✓ Cloudflare Page Rules: Different caching strategies per environment

Future Considerations

1. Potential Enhancements

Cloudflare Workers, Redis Integration, Real-time Analytics, Multi-region Support, GraphQL API

2. Scaling Strategies

• Cloudflare Workers for edge computing
• Database read replicas for analytics queries
• Message queue for async processing
• Kubernetes cluster autoscaling

Conclusion

💡 Want to design your own system like this? Learn how to leverage Claude Code for system design in our tutorial: From Idea to Architecture: Using Claude Code to Design Complex Systems. Discover the iterative process and best practices we used to create roiAI.

The roiAI system demonstrates a well-architected solution for AI usage analytics that balances:

🔒 User Privacy: Local-first approach with optional cloud sync
🌍 Global Performance: Cloudflare edge network for worldwide access
📈 Scalability: Horizontal scaling with multiple worker instances and HPA
🛡️ Reliability: Multi-layer protection and comprehensive error handling
🔐 Security: Defense-in-depth from edge to application
🔧 Maintainability: Clear separation of concerns and standard practices

The architecture supports users from various backgrounds - developers, data scientists, researchers, and business professionals - who want local-only analytics or need consolidated insights across all their devices (work laptop, home desktop, cloud VMs, etc.), making it a flexible solution for the evolving AI development landscape.

Frequently Asked Questions

What makes roiAI different from other analytics platforms?

roiAI is specifically designed for Claude Code usage analytics with a privacy-first architecture. It offers local-only analysis options, seamless multi-device synchronization, and tracks API-equivalent costs without storing any actual code or conversation content.

How does the multi-device sync work?

Each device running roiAI CLI gets a unique identifier. When you run 'roiai cc push', the CLI uploads metadata (not code) to the cloud where it's deduplicated and aggregated. This allows you to see unified analytics across your work laptop, home desktop, cloud VMs, and any other development environment.

Can I use roiAI without uploading data to the cloud?

Yes! Run 'roiai cc sync' for local-only analysis. All data stays on your machine, and you get full analytics without any cloud connectivity. Cloud sync is completely optional and only happens when you explicitly use 'roiai cc push'.

What technology stack powers roiAI?

The roiAI platform uses TypeScript/Node.js for the CLI, Next.js 15 for the web application, PostgreSQL with Prisma ORM for data storage, Kubernetes for container orchestration, and Cloudflare for global edge network services. The entire stack is optimized for horizontal scaling and high availability.

How secure is my usage data?

Security is built into every layer: encryption in transit via HTTPS, bcrypt password hashing, JWT tokens for authentication, device-specific API keys, Cloudflare WAF protection, and container security with non-root users and read-only filesystems. We never store your actual code or conversations - only metadata like token counts and timestamps.
