Where to deploy AI agents and LLM apps in 2026, without paying AWS prices.
You built an AI agent in your AI IDE. The README says "deploy this to production." This is the part every agent tutorial skips: which platforms work, which break, and what you should actually pay.
Published April 26, 2026
The problem with most agent tutorials
Most LangChain, LangGraph, and agent SDK tutorials end at python app.py. They do not tell you how to keep the script running after you close your laptop. They do not say what happens when you need a database to remember conversation history, a queue for slow tool calls, or a webhook the agent can answer at 3 a.m. without you being awake.
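To make the conversation-history gap concrete, here is a minimal sketch of what the tutorials skip: persisting turns to a database instead of a process-local dict. This uses stdlib sqlite3 as a stand-in for Postgres, and the table and function names are hypothetical, not from any framework.

```python
import sqlite3

# Tutorials keep history in an in-memory dict; a restart wipes it.
# Production keeps it in a database. sqlite3 stands in for Postgres
# here; the schema and helper names are illustrative.
def connect(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " user_id TEXT, role TEXT, content TEXT,"
        " ts DATETIME DEFAULT CURRENT_TIMESTAMP)"
    )
    return db

def save_turn(db, user_id, role, content):
    db.execute(
        "INSERT INTO messages (user_id, role, content) VALUES (?, ?, ?)",
        (user_id, role, content),
    )
    db.commit()

def load_history(db, user_id, limit=20):
    # Most recent `limit` turns, returned oldest-first for the model.
    rows = db.execute(
        "SELECT role, content FROM messages WHERE user_id = ?"
        " ORDER BY rowid DESC LIMIT ?",
        (user_id, limit),
    ).fetchall()
    return [{"role": r, "content": c} for r, c in reversed(rows)]
```

Swap the connection string for a `DATABASE_URL` and the same shape works against Postgres.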
The result is a strange gap. You can vibe code an entire agent in a weekend with an AI IDE, but the moment you try to ship it, you hit a wall of cloud configuration that the AI cannot help you with as easily. That wall is what this post is about.
Why Vercel, Heroku, and AWS are not the right answer
Three names show up over and over when developers ask "where do I deploy this?" Each one breaks for AI agents in a specific way.
Vercel
Vercel is built for serverless functions. Each function has a hard execution timeout: 10 seconds on Hobby, 60 on Pro, 5 minutes on Enterprise. Most agent calls take longer than that. A LangChain agent that does a web search, calls a model, runs a tool, and replies routinely takes 20 to 90 seconds for a single turn. On Vercel that turn is killed mid-response. You can work around some of this with streaming and Edge Functions, but you cannot run a long-lived process: an agent that polls Slack, watches a queue, or holds a websocket open for a chat session. Vercel is for stateless web requests, not stateful agents.
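The timeout failure mode is easy to reproduce in miniature. This sketch models a serverless runtime as a hard deadline around a handler; the durations are illustrative, not measured, and nothing here is Vercel's actual implementation.

```python
import asyncio

# A toy model of a serverless runtime: a hard deadline wrapped
# around the handler. If the agent turn outlives the cap, the
# platform terminates it and the caller gets nothing.
async def agent_turn(duration):
    await asyncio.sleep(duration)  # stands in for search + model + tool
    return "final answer"

async def serverless_invoke(handler_coro, timeout):
    try:
        return await asyncio.wait_for(handler_coro, timeout=timeout)
    except asyncio.TimeoutError:
        return None  # killed mid-response
```

A 30-second turn under a 10-second cap comes back as `None` every time; the same turn on a long-lived process simply finishes.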
Heroku
Heroku can run a long-lived process. It also charges $50 per month per dyno for anything beyond the smallest tier. Add Postgres ($50), Redis ($15), and a worker dyno ($50) for background tool calls, and you are at $165 per month before you have a single user. For an agent that mostly idles, you are paying for capacity you are not using.
AWS
AWS can do everything an AI agent needs. AWS can also bury a non-technical developer for two weekends. Lambda for short calls, ECS Fargate for long-running, RDS for the database, ElastiCache for state, Secrets Manager for API keys, CloudWatch for logs, and IAM for permissions. By the time you have wired all of those, your 20-minute agent has cost you 20 hours of YAML. And the bill is the worst of the three: typical small AI agents on AWS cost $400 to $1,200 per month once you include Bedrock or model passthrough costs and RDS.
The actual deploy checklist for an AI agent
Forget platforms for a minute. What does an AI agent actually need to run in production? Here is the real checklist.
1. Long-running processes. The agent loop, queue worker, or websocket handler needs to run continuously, not start up per request.
2. CPU and GPU bursts. When a user kicks off a task, you may want to run inference, transcription, or image generation. That work needs compute, then it needs to disappear.
3. Environment variable secrets. Model API keys, database URLs, third-party tokens. These have to be injected at runtime, never committed to your repo.
4. A real database. Conversation history, task state, user accounts. Postgres is the standard. SQLite on a single container will lose data the moment the container restarts.
5. Cron and scheduled jobs. Most agents need a heartbeat: refresh data every hour, summarize the day at 5 p.m., retry failed tool calls every 10 minutes.
6. Websockets or long polling. If your agent talks to a chat UI, you need a connection that stays open.
7. One environment for all of it. Splitting these across five providers is what turns a side project into a configuration job.
Any platform you pick has to do all seven without you having to glue them together yourself.
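Item 3 is the one that bites people first, so here is the minimal pattern: read every secret from the environment at runtime and fail fast when one is missing. The variable names are hypothetical.

```python
import os

# Secrets come from the environment at runtime, never from the repo.
# Failing fast on a missing key beats a cryptic auth error an hour
# into a run. The names below are illustrative.
def require_env(name):
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required secret: {name}")
    return value
```

At startup you would call this once per secret, e.g. `require_env("OPENAI_API_KEY")` and `require_env("DATABASE_URL")`, so a misconfigured deploy dies immediately with a readable message.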
How Varity handles this in one command
Varity is the easiest way to deploy any app, AI agent, or LLM app, at 60 to 80 percent less than AWS prices. The whole point of the platform is that the seven-item checklist above runs without configuration.
Inside an AI IDE like Claude Code or Cursor, you ask the agent to deploy with Varity. Under the hood the CLI runs:
$ varitykit deploy
The intelligent orchestration algorithm reads your repo, detects what services your code needs, picks the right provider for each one, and ships. For an LLM agent that means:
- Long-running container compute, billed per second, not per dyno hour.
- GPU compute attached on demand for inference workloads. Per-second billing, so you only pay while a request is actually running.
- Secrets injected at runtime from the developer portal, never written to disk.
- Postgres database auto-configured and connected through DATABASE_URL.
- Cron expressed in your project config, no separate scheduler service.
- Websockets work the same way an HTTP route works. No load balancer surgery.
- Auth, payments, and storage included in the platform cost.
You do not pick instance types. You do not write a Dockerfile unless you want to. You do not configure a VPC. The CLI prints a live URL after about 60 seconds.
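The websocket point is really a point about process shape: one connection, held open, many messages, which a per-request serverless function cannot do. This sketch uses asyncio TCP streams as a stand-in for a websocket (no external library) just to show the long-lived handler pattern; it is not Varity code.

```python
import asyncio

# A stand-in for the websocket case: a long-lived connection handler.
# The protocol here is plain TCP lines, but the process shape is the
# point: the handler stays alive across many messages on one socket.
async def handle(reader, writer):
    while line := await reader.readline():
        writer.write(b"echo: " + line)
        await writer.drain()
    writer.close()

async def demo():
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"hi\n")
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    server.close()
    await server.wait_closed()
    return reply
```

On a platform with long-lived processes this handler just runs; on a per-request platform there is nowhere for it to live.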
If you want the longer version of how this works, the Varity docs walk through the deploy flow for a LangChain agent, a chatbot with memory, and a retrieval-augmented LLM app.
Real pricing for a real AI agent
Numbers help. Here is the monthly bill for a typical small AI agent in production: a LangChain agent with a Postgres database, model passthrough, websocket chat, and 1,000 monthly active users running about 10,000 model calls per day.
| Service | AWS | Vercel Pro | Heroku | Varity |
|---|---|---|---|---|
| Compute | $300/mo | $200/mo | $150/mo | $40/mo |
| Database | $250/mo | $120/mo | $50/mo | $30/mo |
| Background jobs and cron | $80/mo | $90/mo | $50/mo | $0 (included) |
| Auth | $100/mo | $80/mo | $50/mo | $0 (included) |
| Total | $730/mo | $490/mo | $300/mo | $70/mo |
AWS includes Bedrock passthrough, ECS Fargate Spot, RDS Postgres, and Cognito. Vercel includes Vercel Pro, Neon Postgres, Inngest for background jobs, and Clerk. Heroku includes a Performance dyno, Heroku Postgres Standard, Heroku Scheduler, and Auth0. Model passthrough costs are not included in any column and are roughly the same across all four (about $300 per month for 10,000 daily calls on a small model). See the Varity pricing page for current per-second rates.
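As a sanity check, the table's totals and the relative savings can be recomputed directly from the per-line numbers above:

```python
# Per-service monthly costs from the comparison table.
stacks = {
    "AWS":    {"compute": 300, "database": 250, "jobs": 80, "auth": 100},
    "Vercel": {"compute": 200, "database": 120, "jobs": 90, "auth": 80},
    "Heroku": {"compute": 150, "database": 50,  "jobs": 50, "auth": 50},
    "Varity": {"compute": 40,  "database": 30,  "jobs": 0,  "auth": 0},
}

# Total per stack, and Varity's saving relative to each alternative.
totals = {name: sum(parts.values()) for name, parts in stacks.items()}
savings = {
    name: 1 - totals["Varity"] / total
    for name, total in totals.items()
    if name != "Varity"
}
```

This reproduces the $730 / $490 / $300 / $70 totals, with Varity coming out between roughly 77 and 90 percent cheaper in this particular example.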
The pattern is the same as any other app shape: Varity is cheaper on every line because auth, database, and background jobs are part of the platform, not three separate vendor bills. In this example the total is $70 against $300 to $730.
The vibe coder shortcut
If you are building inside an AI IDE, you do not have to leave it to deploy. The Varity MCP server runs inside Claude Code, Cursor, and Windsurf. You type:
deploy this to Varity
The IDE calls the Varity MCP, which runs varitykit deploy for you, streams the build log back, and pastes the live URL into your chat. From the moment you stop coding to the moment your agent is in production, you do not touch a cloud console.
AI gives you the power to build. Varity gives you the power to ship. The deploy step is supposed to take 60 seconds, not a weekend.
What to do next
If your agent is already on Vercel or AWS and you want to migrate, the migrate from Vercel page walks through pointing a one-click migrate button at your existing project. If you are starting fresh, install the CLI and run varitykit init inside the project your AI IDE just built. Either way, the agent is in production after one command.
The next time an LLM tutorial ends at python app.py, you will know what to do.
Frequently asked questions
What is the best platform to host an AI agent in 2026?
For most non-technical developers, Varity is the best platform for hosting AI agents. It supports long-running processes, per-second GPU billing, a built-in database, cron, websockets, and secrets management. Vercel does not support long-running processes due to its hard execution timeout. Heroku can run a long-lived process but costs about $165 per month before you have a single user. AWS can handle everything but takes days to configure correctly.
How much does it cost to host an AI agent in production?
For a typical small AI agent with a Postgres database, websocket chat, model passthrough, and 1,000 monthly active users running 10,000 model calls per day: Varity costs around $70 per month, a Vercel Pro stack around $490, and AWS Bedrock with ECS and RDS around $730. The savings on Varity come from auth, database, cron, and background jobs being included in the platform cost rather than paid separately.
Can I deploy an AI agent from Claude Code or Cursor without leaving the editor?
Yes. Varity has an MCP server that runs inside Claude Code, Cursor, and Windsurf. Type 'deploy this to Varity' and the IDE calls the Varity MCP, which runs varitykit deploy for you, streams the build log back, and pastes the live URL into your chat.
Why does Vercel not work for AI agents?
Vercel is built for serverless functions with hard execution timeouts: 10 seconds on Hobby, 60 seconds on Pro, and 5 minutes on Enterprise. Most AI agent calls take longer than that. A LangChain agent doing a web search, calling a model, running a tool, and replying routinely takes 20 to 90 seconds for a single turn. On Vercel that turn is killed mid-response. Vercel also does not support long-running processes like queue workers or persistent websocket connections.
What does an AI agent need to run in production?
An AI agent needs seven things to run reliably in production: long-running process support, CPU and GPU burst compute, environment variable secrets management, a real database for conversation history and task state, cron and scheduled jobs, websocket or long-polling support, and one environment that handles all of the above. Any platform that splits these across five providers turns a side project into a configuration job.
How do I deploy an AI agent with Varity?
Run 'varitykit init' inside your project directory and then 'varitykit deploy'. Varity reads your repo, detects what services your code needs, and deploys in about 60 seconds. If you are already on Vercel, you can migrate with the one-click migrate button on the Varity migrate from Vercel page.
Deploy your AI agent in 60 seconds
Deploy for free during beta. No credit card required until your app has real traffic.