DataWeave · Cloudflare

DataWeave's workload — crawling retailer catalogs, matching products with ML, serving AI insights to brand customers — is one of the cleanest fits for Cloudflare's developer platform I've seen. The crawling, the egress tax, and the AI inference all run cleaner on our primitives.

AI commerce analytics at scale.
The runtime underneath it.

The pitch on dataweave.com is direct: "AI-powered e-commerce analytics for digital commerce." Pricing, assortment, product matching, share of search — across retailer catalogs, served to brand customers as real-time intelligence. Three workloads stacked on top of each other, each with its own infrastructure problem.

Crawl

JS-heavy retailer sites, headless browser at scale

Match

ML product-matching across catalogs

Serve

AI insights to brand customers per query

GA · June 2026 Your AI bill is out of control. Cloudflare can fix it now.
AI Gateway ships dollar-denominated spend limits per brand customer, identity-driven budgets, PII redaction, and automatic fallback routing — across every model provider, in one log line. For the Claude inference path that's already in production at DataWeave, this is the cost-control layer that wasn't possible until last week. Read the announcement.

Three primitives that map directly to how DataWeave actually runs:

Workers + Browser Rendering — managed headless Chromium for JS-heavy retailer sites. Replaces Lambda + EC2 crawler fleets and the scaling/cost overhead that comes with them.
R2 — zero-egress object storage for crawled HTML, images, and product feeds. The S3 egress-per-customer-query bill is the silent margin tax on every analytics product; R2 erases it.
AI Gateway — in front of Claude (already in production). Semantic cache on repeated SKU attribute extractions, per-tenant cost attribution across your brand customer base, fallback when Anthropic rate-limits.

One question for the DataWeave platform team:

Is the bigger near-term pain on the crawling side — managing the EC2 fleet that pulls down JS-heavy retailer pages — or on the inference economics side — keeping per-brand AI cost predictable as customer count grows? 20 minutes to find the right starting point.

Deeper Dive

The full architecture, ready when you are

The detailed primitive-by-primitive mapping — including the request-flow diagram for a brand-customer query on Cloudflare, the Workers + Browser Rendering crawler architecture, the egress-tax math, and the 90-day rollout — runs about 45 KB of dense technical content.

Read the expanded version →

Grab 20 minutes →

Matt Holscher Cloudflare · Developer Platform

AI commerce analytics at scale.The runtime underneath it.

One question for the DataWeave platform team:

The full architecture, ready when you are

AI commerce analytics at scale.
The runtime underneath it.