Infrastructure

Cloudflare

Network: 300+ PoPs
Edge Runtime: Workers (V8 Isolates)
Storage: R2 (S3-compatible)
Database: D1 (SQLite at edge)

Cloudflare's developer platform has expanded from CDN and DDoS protection into a comprehensive edge computing stack. Workers run JavaScript/TypeScript (and Wasm) in V8 isolates across 300+ points of presence globally — with cold starts measured in microseconds rather than the hundreds of milliseconds typical of Lambda or Cloud Run. R2 object storage eliminates the egress fees that make S3 expensive at scale. D1 brings SQLite to the edge. KV provides globally distributed, eventually consistent key-value storage.

Axevate deploys Next.js applications, AI API proxies, and eCommerce edge logic on Cloudflare. The platform is particularly well-suited for applications where latency is the primary constraint: global storefronts where sub-100ms response times affect conversion, AI API gateways that need rate limiting and caching close to users, and edge-auth middleware that runs before requests reach the origin. Our experience covers the full stack: Workers, Pages, R2, D1, KV, Workers AI, AI Gateway, and Zero Trust tunnels.


1. Workers: Edge Serverless Compute

Cloudflare Workers run JavaScript, TypeScript, Python, and WebAssembly in V8 isolates — not Node.js. This distinction matters: there is no process startup, no file system, no Node.js built-ins (no fs, no child_process, no native addons). Cold starts are ~0ms because isolates are pre-warmed and shared across requests. The result is consistently sub-5ms overhead globally, regardless of region.

The Workers runtime implements Web Standard APIs (fetch, Response, Request, Headers, crypto, WebSockets, Streams) rather than Node.js APIs. Most modern JavaScript packages work in Workers if they don't depend on Node built-ins. The compatibility date setting enables gradual access to newer APIs. For packages that require Node.js, the nodejs_compat compatibility flag adds a subset of Node APIs (path, events, Buffer, stream) — enough to run many packages not originally written for the edge.

Workers are billed per request and per CPU time — not per memory or idle time. The free tier covers 100,000 requests/day and 10ms CPU per request. The paid tier ($5/month + $0.30/million requests) allows up to 30 seconds of CPU time per request. Workers are appropriate for request routing, edge authentication, API proxying, personalization logic, and full application serving via Hono or Remix. They are not appropriate for long-running background jobs, memory-intensive operations, or Node.js-native dependencies.
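The pricing model above reduces to simple arithmetic. A sketch, with one assumption the text doesn't state: the paid plan includes an allowance of roughly 10 million requests per month before the per-million rate applies — verify against current pricing.

```typescript
// Back-of-envelope Workers bill on the paid plan: $5/month base plus
// $0.30 per million requests beyond an assumed 10M included requests.
function workersMonthlyCostUSD(requests: number): number {
  const base = 5;
  const included = 10_000_000; // assumption — check current pricing
  const billable = Math.max(0, requests - included);
  return base + (billable / 1_000_000) * 0.3;
}
```

At 50 million requests/month this yields $5 + 40 × $0.30 = $17 — request volume is rarely the dominant line item compared to egress or database costs.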

2. Cloudflare Pages & Next.js Deployment

Cloudflare Pages provides Git-connected deployment for static sites and full-stack applications. The @cloudflare/next-on-pages adapter translates Next.js App Router applications into Workers-compatible bundles, enabling Next.js deployments that run on the edge globally rather than on a single-region serverless function.

The adapter supports App Router Server Components, Server Actions, API routes, middleware, and ISR/on-demand revalidation — but with constraints. The Workers bundle size limit (1MB compressed on the free plan, 10MB on paid) restricts very large applications. Server-side node modules that require Node.js built-ins need the nodejs_compat flag and may have partial support. Dynamic imports and code splitting require careful attention to avoid bundle size violations.
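In practice the runtime constraints are declared in the project's Wrangler configuration. A hedged sketch — field values here are illustrative and should be checked against current Wrangler documentation; the build output path is what next-on-pages emits by default:

```toml
# wrangler.toml for a next-on-pages Pages project (sketch)
name = "storefront"
compatibility_date = "2024-09-23"
compatibility_flags = ["nodejs_compat"]   # enables the Node API subset
pages_build_output_dir = ".vercel/output/static"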

For teams choosing between Vercel and Cloudflare Pages: Vercel has full Next.js feature support with zero configuration (it's built by the same team), simpler debugging, and better observability tooling. Cloudflare Pages has lower latency globally via edge execution, no per-seat pricing, and dramatically lower egress costs for high-traffic sites. The right choice depends on whether latency or cost is the primary constraint — both are production-viable for most Next.js applications.

3. R2: Object Storage Without Egress Fees

Cloudflare R2 is S3-compatible object storage with zero egress fees. AWS S3's egress pricing ($0.09/GB out) becomes significant at scale — a CDN serving 100TB/month pays $9,000 in egress to S3. R2 eliminates this entirely: you pay for storage ($0.015/GB/month) and operations only. For media-heavy applications, user-generated content platforms, and backup storage, this represents a substantial cost reduction.
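The cost claim above is simple to model. A sketch using the rates from the text ($0.09/GB S3 egress, $0.015/GB-month R2 storage); real S3 egress pricing is tiered, so treat the flat rate as an approximation.

```typescript
// Egress vs storage arithmetic from the text's pricing figures.
const GB_PER_TB = 1000;

// S3: pay per GB transferred out (~$0.09/GB at the rate cited above).
function s3EgressUSD(tbServedPerMonth: number): number {
  return tbServedPerMonth * GB_PER_TB * 0.09;
}

// R2: zero egress; pay $0.015/GB-month for storage.
function r2StorageUSD(tbStored: number): number {
  return tbStored * GB_PER_TB * 0.015;
}
```

Serving 100TB/month from S3 costs ~$9,000 in egress alone; storing 80TB in R2 costs ~$1,200/month with $0 egress.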

R2 exposes an S3-compatible API, so existing S3 SDKs work with only an endpoint change (e.g. AWS SDK v3 pointed at your account's R2 endpoint); inside Workers, the native R2 binding skips the S3 API entirely. R2 buckets can be exposed publicly via a custom domain through Cloudflare's CDN — files are served directly from the edge with CDN caching, no origin egress. R2 also supports event notifications (consumed by Workers) and object lifecycle policies.

R2 is an appropriate replacement for S3 + CloudFront when your storage and serving are tightly coupled. It is not a full S3 replacement for all use cases: it lacks S3 Glacier equivalents for archival pricing, and some AWS-native services (EMR, SageMaker, Glue) integrate with S3 but not R2. For web application asset storage, user uploads, and static site hosting, R2 is the lower-cost choice with comparable durability guarantees.

4. D1 Database & Workers KV

D1 is Cloudflare's serverless SQLite database, accessed from Workers via the D1 binding. Each D1 database is a SQLite file replicated globally — primary writes go to a single region, reads are served from read replicas close to the Worker instance. This gives strong consistency for writes with globally low-latency reads, which matches most web application access patterns (reads far outnumber writes; writes go through a write path that can tolerate 50-100ms regional latency).

D1 supports standard SQLite SQL with some limitations: no user-defined functions and a per-database size cap (10GB on paid plans at the time of writing). The D1 API is async, uses prepared statements, and offers batch operations for efficiency. For Next.js applications on Cloudflare Pages, D1 replaces the need for an external database connection entirely — your data layer runs at the edge without a connection pool or latency to a separate database region.
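The prepared-statement pattern looks like this. A sketch against a minimal hand-written interface — in a real Worker, the database arrives as a binding (conventionally `env.DB`) and the full types come from @cloudflare/workers-types.

```typescript
// Minimal slice of the D1 API shape: prepare → bind → first.
interface D1PreparedStatement {
  bind(...values: unknown[]): D1PreparedStatement;
  first<T>(): Promise<T | null>;
}
interface D1Database {
  prepare(sql: string): D1PreparedStatement;
}

// Typical edge data-access function: parameterized query, typed row.
async function getUserEmail(db: D1Database, id: number): Promise<string | null> {
  const row = await db
    .prepare("SELECT email FROM users WHERE id = ?")
    .bind(id)
    .first<{ email: string }>();
  return row?.email ?? null;
}
```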

Workers KV is a globally distributed key-value store with eventual consistency. Writes propagate to all edge locations within ~60 seconds. KV is appropriate for configuration values, session tokens, feature flags, and any data where eventual consistency is acceptable and read performance at global scale matters. It is not appropriate for data requiring strong consistency (shopping cart contents, financial records) or complex queries.
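A feature-flag read is the canonical KV use case. Sketched against a minimal interface (the binding name and key layout are illustrative); the point is that a flipped flag may take up to ~60 seconds to reach every PoP, which is fine for flags and fatal for carts.

```typescript
// Minimal slice of the KV API shape used for feature flags.
interface KVNamespace {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
}

// Reads resolve at the nearest PoP; a missing key means "off".
async function isEnabled(flags: KVNamespace, name: string): Promise<boolean> {
  return (await flags.get(`flag:${name}`)) === "on";
}
```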

5. AI Gateway & Workers AI

Cloudflare AI Gateway is a proxy layer between your application and AI providers (OpenAI, Anthropic, Google, Hugging Face). It adds request logging, cost analytics, rate limiting, semantic caching (return cached responses for similar queries, reducing costs 20-40%), and provider fallback — all via a URL substitution with no code changes. For production AI applications spending significant budget on API calls, AI Gateway's semantic caching alone typically pays for itself within weeks.
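"URL substitution with no code changes" means the only integration step is swapping the provider base URL for the gateway endpoint. A sketch — account ID and gateway name below are placeholders:

```typescript
// Build the AI Gateway base URL for the OpenAI provider route.
// Requests keep the same paths and bodies as api.openai.com.
function gatewayBaseURL(accountId: string, gateway: string): string {
  return `https://gateway.ai.cloudflare.com/v1/${accountId}/${gateway}/openai`;
}
```

In practice this string is passed as the `baseURL` option when constructing the OpenAI SDK client; logging, caching, and rate limiting then happen transparently at the gateway.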

Workers AI runs inference directly on Cloudflare's GPU-equipped edge network. Available models include Llama 3, Mistral, Phi-2 (text generation), Whisper (transcription), BAAI/bge embedding models, and Stable Diffusion (image generation). Latency is low and there are no cold starts. The tradeoff: model selection is limited to what Cloudflare hosts, and the frontier models (GPT-4o, Claude Opus, Gemini Ultra) are only available via AI Gateway to external providers, not via Workers AI directly. Workers AI is appropriate for lightweight inference workloads — classification, embeddings generation, moderation — where model quality matters less than latency and cost.
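From a Worker, inference is a single call on the AI binding. Sketched against a minimal interface; the model identifier follows Cloudflare's `@cf/` catalog naming and the response shape shown is an assumption to verify against the current model docs.

```typescript
// Minimal slice of the Workers AI binding shape.
interface Ai {
  run(model: string, input: Record<string, unknown>): Promise<unknown>;
}

// Embeddings generation — the kind of lightweight inference workload
// where Workers AI fits (latency and cost over frontier-model quality).
async function embed(ai: Ai, text: string): Promise<number[]> {
  const out = (await ai.run("@cf/baai/bge-base-en-v1.5", { text: [text] })) as {
    data: number[][];
  };
  return out.data[0];
}
```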


How We Use It in Practice

Real architectural problems across industries — and how we approach them.

Global eCommerce / Retail

Edge Auth Middleware + Personalization: Next.js on Cloudflare Pages at 50ms

A luxury goods brand running a Next.js storefront on Vercel had customers in Japan and Australia experiencing 800-1,200ms page loads due to regional serverless cold starts and origin round-trips for authentication. Their JWT validation and geolocation-based personalization were running in a US-East Lambda, forcing every request across the globe before a cached response could be served.

Our approach

Migrated the Next.js application to Cloudflare Pages via next-on-pages. Authentication middleware moved into a Cloudflare Worker that validates JWTs using the Web Crypto API (no Node.js crypto dependency) and resolves in <5ms at the PoP closest to the user. Geolocation-based product catalog variants (currency, size standards, available SKUs) stored in Workers KV — acceptable for the eventual consistency window since catalog data changes infrequently. Write operations (cart, checkout) route to a D1 database co-located via Smart Placement. Page loads in Tokyo dropped from 1,100ms to 68ms at p95.

AI Applications / Cost Optimization

LangGraph + Cloudflare AI Gateway: Semantic Caching for a High-Volume Support Agent

A SaaS company running a LangGraph-based support agent handling 50,000 conversations per month was spending $14,000/month on OpenAI API calls. Analysis showed that 35% of inbound queries were semantically near-identical — variations of 'how do I reset my password', 'where is my invoice', 'can I change my plan' — but the agent was making fresh API calls for every one.

Our approach

Cloudflare AI Gateway inserted between the LangGraph application and OpenAI as a URL-substitution proxy. Semantic caching enabled with a 0.88 similarity threshold — tuned by sampling 2,000 real conversation pairs and measuring which threshold preserved response quality while maximizing cache hits. Static system prompts and FAQ-class tool results are cached; user-specific data injections are structured to appear at the end of the prompt, preserving cache eligibility for the shared prefix. AI Gateway analytics surface cache hit rate per day; threshold is reviewed monthly. Current cache hit rate: 31%, reducing monthly OpenAI spend to $9,600 with no measurable degradation in resolution rates.
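The cache decision above reduces to: embed the incoming query, compare it against cached-query embeddings, and reuse a stored response when similarity clears the tuned threshold. A sketch of that core check with plain cosine similarity — the gateway does this internally; this only illustrates what the 0.88 threshold is operating on.

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// A cached response is reused when similarity clears the tuned threshold.
function isCacheHit(query: number[], cached: number[], threshold = 0.88): boolean {
  return cosine(query, cached) >= threshold;
}
```

Raising the threshold trades cache-hit rate for response fidelity, which is why it was tuned against 2,000 real conversation pairs rather than set a priori.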

Media & Publishing

R2 + Cloudflare CDN: Eliminating $40K/Year in S3 Egress for a Video Platform

A B2B training platform serving instructional videos was paying $3,400/month in AWS S3 egress fees — 40TB/month served through CloudFront at $0.085/GB. The content was static and long-lived; the egress cost was purely infrastructure overhead with no business value. Migrating storage was a significant operational concern given the volume of existing assets.

Our approach

Incremental migration: R2 bucket created with matching path structure; new uploads routed to R2 from day one. Existing S3 assets migrated using rclone over 3 weeks during low-traffic hours with checksum verification. Cloudflare CDN configured as the public delivery layer over the R2 bucket — no custom origin server required. S3 lifecycle policies archive remaining legacy content to Glacier for compliance retention. Post-migration egress cost: $0. R2 storage cost for 80TB: $1,200/month. Net savings: $2,200/month ($26,400/year). Migration completed with zero user-facing downtime; one broken URL batch was identified and fixed via a redirect Worker rule.
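The redirect Worker rule mentioned above amounts to a prefix rewrite with a permanent redirect. A sketch — the path mapping is illustrative, not the client's actual layout:

```typescript
// Legacy S3-era paths rewritten to their R2/CDN equivalents (illustrative).
const legacyPrefix = "/assets/v1/";
const currentPrefix = "/media/";

// Returns a 301 for legacy paths, or null to let the request pass through.
function redirectResponse(request: Request): Response | null {
  const url = new URL(request.url);
  if (!url.pathname.startsWith(legacyPrefix)) return null;
  url.pathname = currentPrefix + url.pathname.slice(legacyPrefix.length);
  return Response.redirect(url.toString(), 301);
}
```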

FAQ

When should we use Cloudflare Workers instead of AWS Lambda?

Workers for globally distributed, latency-sensitive workloads with simple runtime requirements. Lambda for workloads needing full Node.js, large native dependencies, long execution times (up to 15 minutes), or deep AWS ecosystem integration (SQS, DynamoDB, S3 events). Workers start in microseconds vs Lambda's 100ms+ cold starts. Lambda has a significantly larger runtime surface area. Most teams use both: Workers for edge routing, auth, and API proxies; Lambda for compute-intensive background jobs and AWS-integrated services.

Ready to build with Cloudflare?

Talk to Us