cairndata captures it. An AI assistant that learns your pipeline's business context, monitors upstream repos for breaking changes, and writes dbt models with knowledge no doc ever captured.
Backend adds a column to the users table. Nobody tells you.
Your staging model doesn't have it. Data is incomplete. You find out when an analyst asks "why is the tier field empty?"
Someone changes a status enum from 4 values to 6.
Your JOIN assumed 4 values. New values create fan-out. Metrics jump 30% overnight. Nobody knows why.
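The mechanism is easy to reproduce. A minimal Python sketch (hypothetical tables, not your schema): the moment the join key stops being unique on one side, every matching row multiplies.

```python
# Illustrative only: tiny in-memory "tables" standing in for warehouse tables.
orders = [
    {"order_id": 1, "status": "active"},
    {"order_id": 2, "status": "churned"},
]

# The mapping table was assumed to have one row per status. After the enum
# grew, "active" got a second mapping row -> the join key is no longer unique.
status_map = [
    {"status": "active", "bucket": "retained"},
    {"status": "active", "bucket": "trial"},     # new row from the enum change
    {"status": "churned", "bucket": "lost"},
]

# A plain inner join: each order row pairs with every matching mapping row.
joined = [
    {**order, "bucket": m["bucket"]}
    for order in orders
    for m in status_map
    if order["status"] == m["status"]
]

print(len(orders), "orders fan out to", len(joined), "rows")  # 2 -> 3
```

Any aggregate built on `joined` now double-counts the duplicated orders, which is exactly how metrics jump overnight with no code change on your side.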
A migration runs Friday. Your pipeline runs Monday.
72 hours of bad data in the warehouse. The analyst already sent the report to C-level. Finance is "reconciling."
You join a new project. You spend 3 weeks just understanding the pipeline.
Because the real knowledge lives in Slack threads, PR comments from 2 years ago, and one person's head. No doc covers it.
You can't watch every PR in every upstream repo. The team that changed the schema doesn't even know your pipeline exists. This isn't a people problem — it's a tooling gap.
cairndata is an AI assistant that gets better at understanding your pipeline with every session. It remembers what it learns.
cairndata builds a knowledge graph of your project — tables, sources, business meanings, gotchas, past incidents. On first run it scans your project and upstream repos. Then it learns from every session. After a month, it knows more about your pipeline than a new team member after a quarter.
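The graph's exact schema isn't shown here, but conceptually each node carries that kind of context. An illustrative JSON sketch (field names are assumptions, not cairndata's actual format):

```json
{
  "node": "stg_users",
  "type": "staging_model",
  "sources": ["backend-repo: users table"],
  "business_meaning": "One row per registered user; tier drives billing reports",
  "gotchas": ["status enum expanded from 4 to 6 values; JOINs assumed 4"],
  "incidents": ["fan-out in downstream metrics after the enum change"]
}
```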
The review-upstream-prs skill scans open PRs in your upstream repos, filters them by tier priority and keywords, and analyzes their impact on your pipeline, including grain analysis that detects when an enum expansion causes fan-out in downstream JOINs. The output is a prioritized impact report.
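The filtering step might look something like this. A hedged sketch in Python: the PR data, tier config, and keyword list are all hypothetical, not cairndata's actual API.

```python
# Hypothetical PRs fetched from upstream repos, each tagged with the
# tier priority its repo was assigned in config.
prs = [
    {"repo": "backend", "title": "Add tier column to users", "tier": 1},
    {"repo": "infra", "title": "Bump CI image version", "tier": 3},
    {"repo": "backend", "title": "Expand status enum to 6 values", "tier": 1},
]

# Keywords that suggest a schema-affecting change.
keywords = ("column", "enum", "migration", "schema")

# Keep PRs from high-priority repos, or any PR whose title hits a keyword,
# then sort so the highest-priority repos come first in the report.
flagged = sorted(
    (pr for pr in prs
     if pr["tier"] <= 2 or any(k in pr["title"].lower() for k in keywords)),
    key=lambda pr: pr["tier"],
)

for pr in flagged:
    print(f"tier {pr['tier']}: {pr['repo']} - {pr['title']}")
```

The grain analysis would then run only on the flagged PRs, which keeps the expensive step focused on changes that can actually break your pipeline.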
cairndata writes dbt models knowing your project's conventions (CTE structure, naming, test patterns), warehouse schema (from cache), and business context (from the knowledge graph). It debugs problems by checking what it knows about the model first — before querying the warehouse.
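The "knowledge first, warehouse second" order can be sketched as a two-step lookup. Function names and the cache shape below are assumptions for illustration:

```python
# Hypothetical in-memory slice of the knowledge graph.
knowledge = {
    "stg_users": {"gotcha": "tier column added upstream; staging model lags"},
}

def debug_model(name, query_warehouse):
    """Check accumulated knowledge before touching the warehouse."""
    # Step 1: what do we already know about this model?
    known = knowledge.get(name)
    if known:
        return f"known issue: {known['gotcha']}"
    # Step 2: only then fall back to a slower, costlier warehouse query.
    return query_warehouse(name)

print(debug_model("stg_users", lambda n: "ran warehouse query"))
print(debug_model("fct_orders", lambda n: "ran warehouse query"))
```

The design point is ordering: a known gotcha answers the question instantly, and the warehouse is only queried when the graph has nothing to say.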
A repo you clone. Skills, config, knowledge graph — all files on your disk. No dashboards, no vendor APIs. Runs on top of Claude Code, so AI calls go through Anthropic, but your project data stays local.
Data observability tools tell you data is bad — after the fact. cairndata reads upstream PRs and understands why data changed, because it has your pipeline's context and history.
Knowledge graph, schema cache, session journal — every session makes cairndata smarter about your pipeline. Tribal knowledge captured in code, not in people's heads.
Skills are markdown files. Config is a text file. The knowledge graph is JSON. Everything is readable, editable, and version-controllable. Don't like how a skill works? Change it. Want to add your own? Drop a file.
Installs to ~/.cairndata/ and ~/.claude/skills/. No global installs.
Run setup.sh with your dbt project path. Configure GCP project, datasets, and upstream repos. This creates your project config in ~/.cairndata/.
Get notified when cairndata launches. One email, no spam. You'll know before your pipeline does.