Know what your AI workflow is likely to cost.

Build the graph, choose the models, add prompts, retrieval, and infrastructure, then simulate the result. Flowcost shows cost, latency, and dependencies before the architecture turns into code.

Content pipeline

Research: gemini-2.5-pro
Write: claude-sonnet-4.6
Generate image: gemini-3-pro-image
Review: claude-sonnet-4.6
Vercel Functions: Compute
Supabase: Database
Cloudflare R2: Object storage

Before you build

Find the tradeoffs before they become rework.

Use Flowcost while the architecture is still fluid. Model the workflow once, test the assumptions that matter, and leave with a scenario your team can review, share, and use as the implementation baseline.

01

Expose token bloat before it ships.

See how workflow prompts, node prompts, and retrieved context change token load, latency, and context-window pressure before the assumptions disappear into code.

02

Find the moment a clever workflow gets expensive.

Add retries, inline tool use, and multi-agent routing to understand when the architecture gets slower, costlier, or harder to justify.

03

Model the users who actually create the bill.

Split monthly users into segments with different traffic share and usage levels. Flowcost turns that into one deterministic scenario, so cost and latency move with the audience you expect.
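To make the idea concrete, here is a minimal sketch of how segments could roll up into a single monthly figure. It is an illustration, not Flowcost's actual data model: the `Segment` fields, the flat `cost_per_request`, and all numbers are hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical user segment; field names are illustrative assumptions."""
    name: str
    traffic_share: float      # fraction of monthly users; shares should sum to 1.0
    requests_per_user: float  # workflow runs per user per month


def monthly_cost(monthly_users: int, segments: list[Segment], cost_per_request: float) -> float:
    """Blend all segments into one deterministic monthly cost."""
    total_share = sum(s.traffic_share for s in segments)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1.0")
    requests = sum(monthly_users * s.traffic_share * s.requests_per_user for s in segments)
    return requests * cost_per_request


# Assumed example: 10,000 monthly users, most casual, a few heavy.
segments = [
    Segment("casual", traffic_share=0.8, requests_per_user=5),
    Segment("power", traffic_share=0.2, requests_per_user=60),
]
total = monthly_cost(10_000, segments, cost_per_request=0.02)
print(f"${total:,.2f}/mo")  # $3,200.00/mo
```

In this made-up example the power segment is 20% of users but 75% of requests, which is the kind of skew a single average usage number would hide.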

04

Show what retrieval adds beyond the model call.

Model the knowledge source, embedding path, vector storage, and backing systems behind retrieval. Then tune when context is injected and see how much it changes cost, latency, and prompt load.

05

Choose models before the choice hardens.

Browse models filtered to the step you are designing, compare providers and specializations, and see limits, pricing meters, and capability differences before the workflow depends on them.

06

Hand off a scenario, not a vague spec.

Share the workflow with teammates or export a structured implementation brief for a coding agent, with the graph, bindings, infrastructure, and assumptions intact.

One tool

Change one part. Watch the whole scenario move.

A model swap changes more than the model bill. Prompts, retrieval, tools, caching, compute, and service dependencies move with it. Flowcost keeps those surfaces in one scenario so the tradeoffs stay visible.

Demand volume
Text billing
Prompt cache
Retrieval
Workflow reach
Web search
Retries
Latency
Tooling
Embeddings
Persistence
Transfer / egress
Multimodal
Knowledge refresh
Plan fees

Pricing

Start free. One plan when you need more.

Use Flowcost free to pressure-test one workflow with the core catalog. Upgrade when you need broader coverage, more scenarios, and stronger export or sharing controls.

Plans: Core ($0) and Pro ($99/yr)

Feature: Core / Pro
Active scenarios: 1 / Unlimited
Cloud scenarios: 1 / Unlimited
AI model catalog: Core / Full
Infrastructure catalog: Core / Full
Full cost & latency breakdown
Minimum viable price
User personas & demand curves: Core / Full
Connected infrastructure
Multi-modal & agent modeling
RAG modeling
Shareable scenario links: Public / Access phrase
Variations: 1 / Unlimited
Agent handoff export
PDF export
Duplication control
Codebase import CLI: Soon
Local & self-hosted models: Soon
AI model recommendations: Soon

FAQ

What Flowcost does (and doesn't)

How current is the pricing data?

Pricing is checked automatically and reviewed before it reaches the catalog. Provider changes are usually reflected within hours, not weeks. If you spot a discrepancy, let us know and we will correct it quickly.

Can I model agentic loops and retries?

Flowcost models workflows as directed graphs, not cycles. For retries and fallbacks, each node has a goal policy: single-pass, retry-once, or escalate-to-a-stronger-model. That policy adjusts the effective request count and cost. If your agent design has true feedback loops, flatten them into a chain with pass-rate estimates on each branch.
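One way to reason about what a goal policy does to cost is to compute the expected cost per request from a pass-rate estimate. The sketch below is an assumption about how such an adjustment could work, not Flowcost's actual formula. The policy names come from the answer above; `expected_cost`, its parameters, and the arithmetic are hypothetical.

```python
def expected_cost(policy: str, cost: float, pass_rate: float, stronger_cost: float = 0.0) -> float:
    """Expected cost per request under an assumed reading of each goal policy.

    cost          -- cost of one call to the node's model
    pass_rate     -- estimated share of first attempts that succeed (0 to 1)
    stronger_cost -- cost of one call to the escalation model (assumed input)
    """
    if not 0.0 <= pass_rate <= 1.0:
        raise ValueError("pass_rate must be between 0 and 1")
    fail_rate = 1.0 - pass_rate
    if policy == "single-pass":
        return cost                              # one call, no second attempt
    if policy == "retry-once":
        return cost * (1.0 + fail_rate)          # failed attempts are called once more
    if policy == "escalate":
        return cost + fail_rate * stronger_cost  # failures go to the stronger model
    raise ValueError(f"unknown policy: {policy}")


# Assumed numbers: $0.01 per call, 90% pass rate, $0.05 for the stronger model.
for policy in ("single-pass", "retry-once", "escalate"):
    print(policy, round(expected_cost(policy, 0.01, 0.9, stronger_cost=0.05), 4))
```

The same pass-rate idea applies when you flatten a feedback loop into a chain: each extra pass becomes a branch whose traffic is the share of requests that failed the previous one.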

What happens when part of the estimate is incomplete?

You still get a number. Flowcost returns the total it can calculate and flags every gap: an unsupported meter, a missing catalog entry, or an unresolved infrastructure dimension. The estimate is directional, and nothing is dropped silently. You decide whether the known portion is enough for the decision in front of you.
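The pattern behind this answer, a partial total plus an explicit list of gaps, can be sketched in a few lines. This is an illustration of the idea only: the `Estimate` type, the `estimate` function, and the line items are hypothetical and do not reflect Flowcost's internals.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Estimate:
    """Hypothetical result type: the known total plus every flagged gap."""
    total: float = 0.0
    gaps: list[str] = field(default_factory=list)


def estimate(line_items: dict[str, Optional[float]]) -> Estimate:
    """Sum the items that can be priced and record the ones that cannot."""
    result = Estimate()
    for name, cost in line_items.items():
        if cost is None:
            result.gaps.append(name)  # flag the gap instead of treating it as zero
        else:
            result.total += cost
    return result


# Assumed monthly line items; None marks a cost that could not be resolved.
result = estimate({"model calls": 120.0, "vector storage": 25.0, "egress": None})
print(result.total)  # 145.0
print(result.gaps)   # ['egress']
```

Keeping the gaps next to the total means a reader can tell a complete estimate from a partial one without checking how it was built.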

Does Flowcost model latency under load?

Flowcost estimates per-operation latency from provider profiles: how long a single call takes, not how it behaves at scale. It does not simulate queue depth, rate-limit backoff, or cold starts. If you need to understand throughput under concurrency, you will still need a load test.

How do I compare different architectures?

Create variations. Each variation shares the same workflow graph but lets you swap models, infrastructure, and settings independently. You can compare cost, latency, and assumptions without rebuilding the scenario from scratch.

What can’t Flowcost estimate?

Anything outside the workflow itself: engineering time, fine-tuning runs, cross-region egress, compliance costs, or traffic spikes you have not modeled. Flowcost covers the runtime cost and latency of the workflow you define. Everything around it is still yours to account for.

Is this a monitoring tool?

No. Flowcost is a planning workspace for architecture and economics before you build or redesign. It does not connect to production, ingest logs, or track live spend. Use it to compare scenarios, surface tradeoffs, and find your pricing floor before the invoice arrives.