Building a Claude Code Usage Analytics Platform: Architecture Deep Dive

A comprehensive technical breakdown of roiAI - a production-ready analytics platform for tracking Claude Code usage, costs, and productivity metrics across teams.

roiAI Team
Engineering
15 min read

Executive Summary

🎓 New to system design? Before diving into this technical architecture, you might want to read From Idea to Architecture: Using Claude Code to Design Complex Systems to learn how we used Claude Code to brainstorm and design this system from scratch.

roiAI is a comprehensive analytics platform for analyzing AI service usage and costs, specifically designed for Claude Code users. The system consists of three main components:

  1. roiAI CLI - A command-line tool that collects and analyzes Claude Code usage data from any user's machine. Works locally or syncs across all your devices for consolidated analytics.
  2. roiAI Web API - A RESTful API service that aggregates data from multiple machines, enabling cross-device analytics and team insights
  3. roiAI Web Application - A Next.js-based web interface for viewing unified analytics from all your devices and global rankings

System Architecture Overview

The roiAI system follows a three-tier architecture with clear separation between client, edge, and server components.

Infrastructure Note: All traffic to roiai.fyi passes through Cloudflare, which provides DNS management, DDoS protection, SSL/TLS termination, CDN caching, and Web Application Firewall (WAF) services before reaching our Kubernetes cluster.

Data Flow

[Diagram: system data flow]

Component Details

1. roiAI CLI (Client-Side)

Purpose

Collects and analyzes Claude Code usage data from any user's machine - whether they're developers, data scientists, researchers, or other professionals using AI tools. Seamlessly sync and consolidate analytics across all your devices (work laptop, home desktop, cloud VMs, etc.).

Multi-Device Innovation: Unlike traditional analytics tools, roiAI recognizes that modern developers work across multiple environments. Whether you're coding on your work laptop during the day, continuing on your home desktop in the evening, or using cloud development environments like GitHub Codespaces - roiAI consolidates all your Claude Code usage into one unified view.

Key Features

  • Privacy-First: Can operate entirely offline with local-only analytics
  • Universal Compatibility: Works on any machine where Claude Code is used
  • Incremental Sync: Processes only new data since last sync
  • Batch Processing: Handles large datasets efficiently
  • Multi-Device Analytics: Consolidate usage data from all your machines (work laptop, home desktop, cloud VMs) into a unified dashboard
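
The incremental sync feature can be pictured as a cursor over the local usage records. The sketch below is only an illustration under assumptions: the `UsageRecord` shape, the `SyncState` cursor, and the `selectNewRecords` name are hypothetical and not taken from the roiAI source.

```typescript
// Hypothetical record shape; the real roiAI schema is not shown in this article.
interface UsageRecord {
  messageId: string;
  timestamp: string; // ISO 8601, UTC
  inputTokens: number;
  outputTokens: number;
}

// Hypothetical sync cursor persisted between runs.
interface SyncState {
  lastSyncedAt: string | null; // null means nothing has been synced yet
}

// Return only the records newer than the stored cursor, plus the advanced cursor.
function selectNewRecords(
  records: UsageRecord[],
  state: SyncState
): { fresh: UsageRecord[]; next: SyncState } {
  const cursor =
    state.lastSyncedAt === null ? -Infinity : Date.parse(state.lastSyncedAt);
  const fresh = records
    .filter((r) => Date.parse(r.timestamp) > cursor)
    .sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
  // If nothing is new, the cursor stays where it was.
  const last =
    fresh.length > 0 ? fresh[fresh.length - 1].timestamp : state.lastSyncedAt;
  return { fresh, next: { lastSyncedAt: last } };
}
```

A strict "newer than" comparison can skip a record that shares its timestamp with the cursor. A cursor based on a monotonically increasing ID, or an inclusive comparison backed by deduplication, would avoid that edge case.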

User Types

  • Software Developers
  • Data Scientists
  • ML Researchers
  • Content Creators
  • Business Analysts

Technology Stack

  • TypeScript/Node.js
  • SQLite (Prisma ORM)
  • Commander.js
  • Axios

Core Commands

roiai cc sync         # Local analysis only
roiai cc login        # Authenticate with cloud
roiai cc push         # Upload data to cloud
roiai cc push-status  # Check auth and sync status
roiai cc logout       # Remove authentication

Data Flow

[Diagram: data flow between client side and server side]

2. roiAI Web API (Server-Side)

Purpose

RESTful API service that receives data from CLI clients and provides analytics endpoints.

Network Architecture

Cloudflare Integration:

  • DNS Management: All roiai.fyi domains managed through Cloudflare
  • SSL/TLS: Universal SSL certificates at the edge, with encrypted connections from client to Cloudflare and from Cloudflare to origin
  • DDoS Protection: Layer 3/4/7 attack mitigation
  • CDN: Static assets cached at edge locations globally
  • WAF: Web Application Firewall with custom rules
  • Analytics: Traffic analytics and threat monitoring

API Structure

Category         Endpoints               Purpose
Authentication   /api/v1/auth/*          User registration, login, email verification
Data Sync        /api/v1/cli/*           Batch upload, health checks
User Management  /api/v1/users/activity  User activity data

Key Design Decisions

  • Batch Processing: Accepts up to 1000 messages per sync request
  • Deduplication: Prevents duplicate message processing
  • Rate Limiting: Protects against abuse (enforced at both Cloudflare and application level)
  • Standardized Error Codes: Consistent error handling across endpoints
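
The batch limit and deduplication decisions can be sketched as follows. Only the 1000-message limit comes from the list above. The `chunk` and `dedupe` helpers and the `messageId` field are hypothetical, and the in-memory `Set` stands in for whatever the server actually uses (most likely a unique constraint in the database).

```typescript
const MAX_BATCH_SIZE = 1000; // per-request limit stated above

// Client side: split an array into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number = MAX_BATCH_SIZE): T[][] {
  if (size <= 0) throw new RangeError("size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Server side (assumed): accept only messages whose ID has not been seen before.
function dedupe<T extends { messageId: string }>(
  batch: T[],
  seen: Set<string>
): T[] {
  const accepted: T[] = [];
  for (const msg of batch) {
    if (seen.has(msg.messageId)) continue; // duplicate upload, skip it
    seen.add(msg.messageId);
    accepted.push(msg);
  }
  return accepted;
}
```

Because duplicates are dropped on arrival, a client can safely resend a batch after a timeout without inflating the totals.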

3. Kubernetes Deployment (Helm Charts)

Architecture Components

Component       Type         Replicas    Resources             Notes
Web Service     Next.js app  1 (HPA: 3)  100m CPU / 256Mi RAM  Pod Anti-Affinity for HA
Stats Worker    Background   2 (HPA: 4)  50m CPU / 128Mi RAM   Port 3001 for health
Ranking Worker  Background   1 (fixed)   50m CPU / 128Mi RAM   Advisory locks prevent multiple
PostgreSQL      Database     1           200m CPU / 256Mi RAM  10Gi persistent volume

Supporting Infrastructure

  • Cloudflare: DNS, CDN, SSL/TLS, DDoS protection, WAF
  • Ingress: NGINX with rate limiting and custom headers
  • Domains: roiai.fyi, www.roiai.fyi, api.roiai.fyi (all via Cloudflare)
  • Storage Class: vultr-block-storage
  • Monitoring: Prometheus ServiceMonitors and alerts

Deployment Topology

System Interactions

1. CLI Authentication Flow (Client → Server)

2. Data Sync Flow (Client → Server)

3. Background Workers Flow

Worker Responsibilities

Worker          Purpose                            Processing
Stats Worker    Process pending statistics deltas  Runs every few minutes, aggregates PendingStatsDelta into time-based statistics tables
Ranking Worker  Calculate user rankings            Runs periodically, updates cost rankings across all time periods using advisory locks

How Stats Worker Operates

  1. Delta Collection: When messages are uploaded via API, the server writes incremental statistics to the PendingStatsDelta table
  2. Batch Processing: Stats Worker periodically queries pending deltas grouped by user and date
  3. Aggregation: For each user/date combination, it sums all deltas (messages, tokens, costs)
  4. Upsert Operations: Updates or inserts into appropriate statistics tables (DailyStats, WeeklyStats, MonthlyStats, YearlyStats)
  5. Cleanup: Deletes processed deltas to prevent reprocessing
  6. Parallel Processing: Multiple Stats Worker instances can process different user/date combinations simultaneously
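
Steps 2 through 4 amount to a group-by-and-sum followed by an upsert. The sketch below shows that logic in memory, under assumptions: the `StatsDelta` fields are modeled loosely on the PendingStatsDelta description above, the function names are hypothetical, and a `Map` stands in for the DailyStats table. The real worker presumably does this in PostgreSQL.

```typescript
// Hypothetical delta shape; field names are assumptions.
interface StatsDelta {
  userId: string;
  date: string; // YYYY-MM-DD
  messages: number;
  tokens: number;
  cost: number;
}

// Add one delta into a map keyed by user/date, inserting the key if missing.
function addInto(table: Map<string, StatsDelta>, d: StatsDelta): void {
  const key = `${d.userId}|${d.date}`;
  const row = table.get(key);
  if (row === undefined) {
    table.set(key, { ...d }); // insert
  } else {
    row.messages += d.messages; // update
    row.tokens += d.tokens;
    row.cost += d.cost;
  }
}

// Steps 2 and 3: sum all pending deltas per user/date combination.
function aggregateDeltas(deltas: StatsDelta[]): StatsDelta[] {
  const totals = new Map<string, StatsDelta>();
  for (const d of deltas) addInto(totals, d);
  return Array.from(totals.values());
}

// Step 4: upsert the aggregated totals into the daily statistics table.
function upsertDaily(daily: Map<string, StatsDelta>, totals: StatsDelta[]): void {
  for (const t of totals) addInto(daily, t);
}
```

Summing deltas before touching the statistics table means each user/date row is written once per run, regardless of how many uploads produced deltas for it.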

Materialized Views for Fast Rankings

The system uses PostgreSQL materialized views to cache pre-computed rankings:

-- Example: current_daily_cost_rankings view
-- CURRENT_DATE is evaluated when the view is created or refreshed
CREATE MATERIALIZED VIEW current_daily_cost_rankings AS
SELECT
    u.id AS userId,  -- aliased so the userId index below resolves
    u.username,
    u.email,
    ds.totalCost,
    ds.totalMessages,
    ds.costRank,
    -- 0 for the highest-cost user, 1 for the lowest
    PERCENT_RANK() OVER (ORDER BY ds.totalCost DESC) AS percentile
FROM users u
JOIN daily_stats ds ON u.id = ds.userId
WHERE ds.date = CURRENT_DATE
ORDER BY ds.costRank;

-- Indexed for fast queries
CREATE INDEX ON current_daily_cost_rankings (costRank);
CREATE INDEX ON current_daily_cost_rankings (userId);
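
For readers less familiar with window functions: PostgreSQL defines PERCENT_RANK() as (rank - 1) / (total rows - 1), where tied rows share a rank. The sketch below reproduces that arithmetic in TypeScript. It assumes costRank follows RANK() semantics, which the article does not confirm, and the `CostRow` and `rankByCost` names are hypothetical.

```typescript
interface CostRow {
  userId: string;
  totalCost: number;
}

interface RankedRow extends CostRow {
  costRank: number;   // 1 = highest cost; ties share a rank (assumed RANK() semantics)
  percentile: number; // (rank - 1) / (n - 1), or 0 when there is a single row
}

function rankByCost(rows: CostRow[]): RankedRow[] {
  // Sort a copy by cost, highest first, matching ORDER BY totalCost DESC.
  const sorted = rows.slice().sort((a, b) => b.totalCost - a.totalCost);
  const n = sorted.length;
  const ranked: RankedRow[] = [];
  for (let i = 0; i < n; i++) {
    // A row tied with the previous one inherits that row's rank.
    const tied = i > 0 && sorted[i].totalCost === sorted[i - 1].totalCost;
    const rank = tied ? ranked[i - 1].costRank : i + 1;
    ranked.push({
      ...sorted[i],
      costRank: rank,
      percentile: n > 1 ? (rank - 1) / (n - 1) : 0,
    });
  }
  return ranked;
}
```

With four users whose costs are 30, 30, 20, and 10, the ranks come out as 1, 1, 3, 4 and the percentiles as 0, 0, 2/3, and 1.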

Design Rationale

1. Privacy-First Architecture

  • Local-First Processing: All analysis happens on user's machine
  • Optional Cloud Sync: Users control when/if to upload
  • Machine Isolation: Each machine analyzed separately

2. Scalability Considerations

  • Cloudflare Edge: Global CDN reduces origin server load
  • Batch Processing: Reduces API calls and database transactions
  • Worker Separation: Background tasks don't impact API performance
  • Horizontal Scaling: The web service and stats worker scale out via HPA; the ranking worker stays at a single replica

3. Reliability Features

  • DDoS Protection: Cloudflare shields against attacks
  • Retry Mechanisms: Failed syncs retry with exponential backoff
  • Deduplication: Prevents data corruption from duplicate uploads
  • Health Checks: All services expose health endpoints
  • Pod Disruption Budgets: Ensures availability during updates
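
The retry mechanism can be sketched as below. The article states only that failed syncs retry with exponential backoff. The base delay, the cap, the attempt count, and the `backoffDelayMs` and `withRetry` names are all assumptions made for illustration.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
// The 500 ms base and 30 s cap are assumed values, not roiAI's.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Run `op`, retrying on failure with exponentially growing pauses.
// `sleep` is injectable so the function can be tested without real waiting.
async function withRetry<T>(
  op: () => Promise<T>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      // No pause after the final attempt; the error is rethrown instead.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

Production clients often add random jitter to each delay so that many machines recovering at once do not retry in lockstep. This sketch leaves that out for brevity.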

4. Performance Optimizations

  • Edge Caching: Static assets served from Cloudflare's global network
  • Incremental Sync: Only new data processed
  • Database Indexing: Optimized queries for rankings
  • Connection Pooling: 30 connections in production

Security Considerations

1. Network Security (Cloudflare Layer)

  • WAF Rules: Custom rules to block malicious requests
  • Bot Management: Protection against automated attacks
  • SSL/TLS: Traffic is encrypted on both legs (client to Cloudflare, Cloudflare to origin) using origin certificates
  • IP Filtering: Whitelist/blacklist capabilities

2. Authentication & Authorization

  • JWT Tokens: Access tokens expire after 7 days
  • API Keys: Machine-specific for CLI authentication
  • Bcrypt: Password hashing with 12 rounds

3. Container Security

  • Non-Root Users: All containers run as user 1001
  • ReadOnly Filesystems: Prevents runtime modifications
  • Capability Dropping: Minimal Linux capabilities
  • Security Contexts: Enforced at pod and container level

4. Application Security

  • CORS: Restricted to specific domains
  • Rate Limiting: Dual-layer (Cloudflare + NGINX)
  • Input Validation: Strict schema validation
  • SQL Injection Protection: Prisma ORM with parameterized queries
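
Strict schema validation on the sync endpoint might look like the hand-written guard below. This is a sketch under assumptions: the payload shape, the field names, and the `parseSyncPayload` function are hypothetical, and the article does not say which validation approach or library the API actually uses.

```typescript
// Hypothetical message shape for a sync upload.
interface SyncMessage {
  messageId: string;
  timestamp: string; // must parse as a date
  inputTokens: number;
  outputTokens: number;
}

function isNonNegativeInt(v: unknown): v is number {
  return typeof v === "number" && Number.isInteger(v) && v >= 0;
}

// Type guard: true only if every field is present and well-formed.
function isSyncMessage(v: unknown): v is SyncMessage {
  if (typeof v !== "object" || v === null) return false;
  const m = v as Record<string, unknown>;
  return (
    typeof m.messageId === "string" &&
    m.messageId.length > 0 &&
    typeof m.timestamp === "string" &&
    !Number.isNaN(Date.parse(m.timestamp)) &&
    isNonNegativeInt(m.inputTokens) &&
    isNonNegativeInt(m.outputTokens)
  );
}

// Validate an untrusted request body before it reaches the database layer.
function parseSyncPayload(body: unknown): SyncMessage[] {
  if (typeof body !== "object" || body === null) {
    throw new TypeError("body must be an object");
  }
  const messages = (body as Record<string, unknown>).messages;
  if (!Array.isArray(messages)) {
    throw new TypeError("messages must be an array");
  }
  if (messages.length > 1000) {
    throw new RangeError("at most 1000 messages per request");
  }
  messages.forEach((m, i) => {
    if (!isSyncMessage(m)) throw new TypeError(`messages[${i}] is invalid`);
  });
  return messages as SyncMessage[];
}
```

Rejecting malformed input at the boundary keeps the later stages (deduplication, delta writes) free of defensive checks.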

Operational Aspects

1. Monitoring & Alerting

2. Backup & Recovery

  • Daily Backups: PostgreSQL data backed up at 2 AM UTC
  • 7-Day Retention: Rolling backup window
  • Persistent Volumes: Data survives pod restarts
  • Cloudflare Logs: 7-day retention for security analysis

3. Deployment Strategy

  • Rolling Updates: Zero-downtime deployments
  • Image Versioning: Semantic versioning for releases
  • Environment Separation: Dev/Prod configurations
  • Cloudflare Page Rules: Different caching strategies per environment

Future Considerations

1. Potential Enhancements

  • Cloudflare Workers
  • Redis Integration
  • Real-time Analytics
  • Multi-region Support
  • GraphQL API

2. Scaling Strategies

  • Cloudflare Workers for edge computing
  • Database read replicas for analytics queries
  • Message queue for async processing
  • Kubernetes cluster autoscaling

Conclusion

💡 Want to design your own system like this? Learn how to leverage Claude Code for system design in our tutorial: From Idea to Architecture: Using Claude Code to Design Complex Systems. Discover the iterative process and best practices we used to create roiAI.

The roiAI system demonstrates a well-architected solution for AI usage analytics that balances:

  • User Privacy: Local-first approach with optional cloud sync
  • Global Performance: Cloudflare edge network for worldwide access
  • Scalability: Horizontal scaling with multiple worker instances and HPA
  • Reliability: Multi-layer protection and comprehensive error handling
  • Security: Defense-in-depth from edge to application
  • Maintainability: Clear separation of concerns and standard practices

The architecture supports users from various backgrounds - developers, data scientists, researchers, and business professionals - who want local-only analytics or need consolidated insights across all their devices (work laptop, home desktop, cloud VMs, etc.), making it a flexible solution for the evolving AI development landscape.

Frequently Asked Questions

What makes roiAI different from other analytics platforms?

roiAI is specifically designed for Claude Code usage analytics with a privacy-first architecture. It offers local-only analysis options, seamless multi-device synchronization, and tracks API-equivalent costs without storing any actual code or conversation content.

How does the multi-device sync work?

Each device running roiAI CLI gets a unique identifier. When you run 'roiai cc push', the CLI uploads metadata (not code) to the cloud where it's deduplicated and aggregated. This allows you to see unified analytics across your work laptop, home desktop, cloud VMs, and any other development environment.

Can I use roiAI without uploading data to the cloud?

Yes! Run 'roiai cc sync' for local-only analysis. All data stays on your machine, and you get full analytics without any cloud connectivity. Cloud sync is completely optional and only happens when you explicitly use 'roiai cc push'.

What technology stack powers roiAI?

The roiAI platform uses TypeScript/Node.js for the CLI, Next.js 15 for the web application, PostgreSQL with Prisma ORM for data storage, Kubernetes for container orchestration, and Cloudflare for global edge network services. The entire stack is optimized for horizontal scaling and high availability.

How secure is my usage data?

Security is built into every layer: encryption in transit via HTTPS, bcrypt password hashing, JWT tokens for authentication, device-specific API keys, Cloudflare WAF protection, and container security with non-root users and read-only filesystems. We never store your actual code or conversations, only metadata like token counts and timestamps.
