Announcing Atlas: open-source text-to-SQL with a semantic layer
You're already using AI to query your data. Atlas makes it safe, accurate, and deployable.
The problem: everyone is already doing this
Every data team we talk to is using ChatGPT or Copilot to write SQL. Paste your schema, describe the question, copy the query back into your database client, and hope it works.
It works surprisingly often. But it fails in ways that are hard to catch: silently wrong column references, missing WHERE clauses that filter soft-deletes, metrics calculated before discounts instead of after. The AI doesn't know your business rules. It guesses from column names like fact_txn_amt and is_del_flg, and it gets the semantics wrong often enough that you can't trust the output without reading every query line by line.
The other problem is operational. ChatGPT can't run the query. It doesn't validate that the SQL is read-only. It doesn't enforce row-level access. There's no audit trail. You can't embed it in a product. It's a parlor trick, not infrastructure.
What Atlas is
Atlas is a text-to-SQL agent that connects to your database, understands your schema through a semantic layer, validates every query, and runs it. All in one place. Self-host it with Docker, Railway, or Vercel, or use Atlas Cloud at app.useatlas.dev and skip infrastructure entirely.
The core idea: give the AI the context it needs to write correct SQL, then validate the output before it touches your database. No training data, no vector databases, no fine-tuning. Just a YAML semantic layer that describes what your tables and columns actually mean.
$ bun create @useatlas my-app --demo
$ cd my-app && bun run dev
> Ready on http://localhost:3000
> Connected to PostgreSQL - 42 tables profiled
> Semantic layer generated at ./semantic/
Run atlas init against your database and it profiles every table (column types, sample values, cardinality, nullability) and generates YAML entity files that the agent reads before writing SQL. You can enrich these with descriptions, business terms, and known query patterns. Changes go through pull requests. The semantic layer lives in your repo, versioned like code.
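To make that concrete, here is a rough sketch of the kind of information an entity file can carry. The real files are YAML and this is not Atlas's actual schema; the TypeScript shape and field names below are illustrative assumptions only.

// Illustrative only: a hypothetical shape for one semantic-layer entity.
// Atlas's real entity files are YAML and their exact schema may differ.
interface SemanticEntity {
  table: string;                      // physical table name, e.g. "fact_txn"
  description: string;                // what the table means in business terms
  columns: Array<{
    name: string;                     // e.g. "fact_txn_amt"
    type: string;                     // profiled column type
    description?: string;             // human-written meaning added during enrichment
    sampleValues?: string[];          // captured by profiling
  }>;
  filters?: string[];                 // rules the agent must always apply
  metrics?: Record<string, string>;   // named metrics with canonical SQL expressions
}

const orders: SemanticEntity = {
  table: "fact_txn",
  description: "One row per completed transaction.",
  columns: [
    { name: "fact_txn_amt", type: "numeric", description: "Gross amount before discounts" },
    { name: "is_del_flg", type: "boolean", description: "Soft-delete marker; true rows are deleted" },
  ],
  filters: ["is_del_flg = false"],
  metrics: { net_revenue: "SUM(fact_txn_amt - discount_amt)" },
};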
How it works under the hood
When a user asks a question, the agent reads the semantic layer to understand what tables exist, what columns mean, how tables join, and what metrics are defined. Then it writes SQL. Before the query reaches your database, it passes through a 7-layer validation pipeline:
- Empty check: rejects blank input
- Regex mutation guard: blocks INSERT, UPDATE, DELETE, DROP
- AST parse: confirms a single SELECT statement
- Table whitelist: only tables in the semantic layer are queryable
- RLS injection: appends WHERE clauses for tenant isolation
- Auto LIMIT: prevents unbounded result sets
- Statement timeout: kills runaway queries
This is defense in depth: any single layer can fail, and a dangerous query still executes only if every layer fails at once.
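As a rough illustration of how such a pipeline composes (a sketch only, not Atlas's actual implementation, which is built on Effect.ts), each layer either rewrites the SQL or rejects it, and the query only reaches the database if every layer passes:

type QueryContext = { tenantId: string; allowedTables: string[] };
type Check = (sql: string, ctx: QueryContext) => string; // return (possibly rewritten) SQL, or throw

const checks: Check[] = [
  (sql) => { if (!sql.trim()) throw new Error("empty query"); return sql; },
  (sql) => { if (/\b(insert|update|delete|drop)\b/i.test(sql)) throw new Error("mutation blocked"); return sql; },
  (sql) => (/\blimit\b/i.test(sql) ? sql : `${sql} LIMIT 1000`), // auto LIMIT shown as a rewrite example
  // The AST parse, table whitelist, RLS injection, and statement timeout layers follow the
  // same shape: inspect or rewrite the SQL, then hand it to the next check.
];

function validate(sql: string, ctx: QueryContext): string {
  return checks.reduce((current, check) => check(current, ctx), sql);
}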
What ships in 1.0
- 7 databases: PostgreSQL, MySQL, BigQuery, ClickHouse, DuckDB, Snowflake, and Salesforce via datasource plugins
- 6 LLM providers: Anthropic, OpenAI, Bedrock, Ollama, OpenAI-compatible (vLLM, TGI, LiteLLM), and AI Gateway. Bring your own keys or use Atlas Cloud's managed tokens
- 21+ plugins: Datasource adapters, sandbox backends, interaction channels (Slack, Teams, MCP), action triggers (email, JIRA, webhooks). Build your own with the Plugin SDK
- Embeddable everywhere: Script tag widget, React component, TypeScript SDK, headless API. Works with Next.js, Nuxt, SvelteKit, or any HTTP client (see the embedding sketch after this list)
- Chat SDK: 8 platform adapters (Slack, Teams, Discord, Telegram, Google Chat, GitHub, Linear, and WhatsApp)
- Enterprise features: SSO (SAML/OIDC), SCIM provisioning, custom roles, IP allowlists, approval workflows, audit log retention and export, data residency
- Effect.ts architecture: The entire backend uses Effect.ts for structured concurrency, typed errors, composable Layers, and graceful shutdown. The agent loop runs on @effect/ai, database connections on @effect/sql
- Admin console: Connections, users, plugin marketplace, semantic layer editor with autocomplete, query analytics, learned patterns, billing, and settings. All in one place
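For a sense of what embedding can look like, here is a hypothetical React usage sketch. The package name, component, and props are assumptions for illustration, not Atlas's published API; see the docs for the real SDK surface.

// Hypothetical sketch: "@useatlas/react" and the AtlasChat component/props are assumed
// names for illustration, not Atlas's documented API.
import { AtlasChat } from "@useatlas/react";

// baseUrl points at your self-hosted Atlas instance; token is a short-lived credential
// minted by your backend; context carries the identity used for row-level access.
export function AnalystPanel({ userId, token }: { userId: string; token: string }) {
  return (
    <AtlasChat
      baseUrl="https://atlas.internal.example.com"
      token={token}
      context={{ userId }}
    />
  );
}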
How Atlas compares
There are good tools in this space. Atlas is different in a few specific ways.
vs Vanna AI: Vanna is a Python library that learns from historical queries via RAG. Atlas uses an explicit YAML semantic layer. You know exactly what context the agent sees, and changes go through code review. Vanna is great for Python shops that want a library. Atlas is a deployable product with auth, admin, and embedding built in.
vs WrenAI: WrenAI is a GenBI platform with a UI-based semantic modeling layer. It's closer to “replace Looker” than “embed an analyst.” Atlas is designed to be a component in your application, not a standalone BI tool. WrenAI is also AGPL-3.0 end-to-end. Atlas's client libraries are MIT.
vs raw MCP: Connecting Claude Desktop directly to your database via an MCP server gives the AI raw schema with no business context, no validation, and no audit trail. Atlas has its own MCP server that provides the same semantic layer and validation pipeline. Context + safety, not just connectivity.
vs enterprise platforms: ThoughtSpot, Databricks AI/BI, and Looker AI are powerful but proprietary and locked to their ecosystems. Atlas is open-source, deploy-anywhere, and designed for embedding, not for replacing your entire BI stack.
Detailed comparisons: docs.useatlas.dev/comparisons
Self-hosted is free. Cloud is for teams.
Atlas is AGPL-3.0 licensed. You can self-host the full product, every feature, no artificial limits, for free. Run bun create @useatlas, connect your database, and you're done.
Atlas Cloud is the managed option for teams that don't want to run infrastructure. It starts with a 14-day free trial (no credit card), then Starter, Pro, and Business tiers. See pricing.
Try it
The fastest way to see Atlas is the live demo. No signup, no installation. It's connected to a cybersecurity SaaS dataset with 60 tables and 200K rows of realistic, messy data.
$ bun create @useatlas my-app
$ cd my-app
$ cp .env.example .env # add your ANTHROPIC_API_KEY + ATLAS_DATASOURCE_URL
$ bun run dev
Read the quick start guide for the full walkthrough, or jump straight to connecting your database.