Building a Threat Model for Your AI App in 30 Minutes

You ship your AI app in record time. It works beautifully — until someone jailbroke your chatbot, exfiltrated your OpenAI API key, and racked up a $7,000 bill overnight.

You didn't need a huge security team to prevent it.
You just needed a threat model.

Threat modeling isn't just for enterprise architects with 50-page PDFs. For AI apps, it's a fast, high-leverage exercise that surfaces the most likely attack paths before attackers find them. And the good news? You can build a working threat model for your app in under 30 minutes.

A threat model is a structured way to identify what can go wrong in your system before attackers do. For modern AI apps, a lightweight, time-boxed approach can go a long way in preventing costly security incidents.

Introduction

A threat model is a structured way to identify what can go wrong in your system before attackers do. Traditionally, it's been associated with heavyweight frameworks like STRIDE or long workshops. But for modern AI apps — especially ones built by small teams or indie developers — a lightweight, time-boxed approach can go a long way.

Why it matters for AI apps:

You're working with new attack surfaces (prompt injection, jailbreaks, embeddings, output exfiltration) that traditional web checklists don't cover
Many apps ship fast with little to no review
A single mistake (like exposing a key in a system prompt) can have massive impact

This post walks you through a practical 30-minute exercise to threat model your AI app. All you need is a whiteboard (or a napkin), a basic understanding of your architecture, and 30 focused minutes.

Step 1: Define What You're Protecting (5 min)

Before you can defend, you need to know what matters.

Ask:

What sensitive assets does my app touch?
Where do those assets live?
Why would attackers want them?

Think broadly: not just secrets, but anything that would cause damage if exposed or modified.

Example Asset Table

Asset	Where It Lives	Why It Matters
OpenAI API Key	.env + serverless runtime	Can be abused to rack up huge bills
Supabase Service Key	.env + RAG pipeline	Grants access to your entire database
Internal Docs / PDFs	Vector DB	Proprietary knowledge base
Plugin Credentials	.env + tool registry	Access to external systems
System Prompt	App code or env vars	Can contain hidden logic or sensitive data

Don't overthink this step — five minutes is enough. You just need a quick list of what you're protecting.

Step 2: Map Your Data Flows (10 min)

Next, map how data moves through your system.

You don't need a perfect architecture diagram. Just sketch the major components:

Where does user input enter?
How does it flow through the model, RAG pipeline, or agents?
Where does it exit (e.g., UI, logs, third-party APIs)?
Where are your trust boundaries?

Example Data Flow


User → Model → Tool Calls → External APIs
  ↓       ↓         ↓
Vector DB → Output → Frontend/Logs

Trust boundaries to note:

User input boundary: where prompt injection and jailbreaks happen
Vector DB ingestion: where indirect prompt injection can enter
Output boundary: where silent exfiltration happens
Tool invocation: where untrusted outputs can trigger real actions

This map will be the backbone of your threat model. It shows where attackers can enter, move, and exit.

Step 3: Identify Likely Attack Surfaces (10 min)

Now, look at each asset and data flow. Ask:

Where can attackers influence the system?
Where could sensitive data leak?
What happens if a model output is malicious or wrong?

For AI apps, the key attack surfaces are well-known. Here's a quick reference:

Attack Surface	Example Risk	Related Resource
Prompt Injection	Attacker overrides instructions	Prompt Injection 101
Jailbreaks	Bypassing model safety rules	Real-World AI Jailbreaks
Data Exfiltration	API keys leaked through output	Silent Exfiltration
Untrusted Outputs	LLM-generated SQL vulnerable to injection	When LLMs Write Code
Vector DB Leakage	Sensitive embeddings exposed via RAG	Vector DBs & Embeddings Security

This is where your map meets reality. Don't try to be exhaustive — focus on likely threats, not every theoretical edge case.

Step 4: Prioritize & Plan Defenses (5 min)

You can't fix everything at once. So prioritize.

For each attack surface, rate:

Likelihood: How likely is this to happen in your context?
Impact: How bad would it be if it did?

List 1–2 practical mitigations.

Example Mini-Matrix

Attack Surface	Likelihood	Impact	Mitigation
Prompt Injection	High	High	Prompt segmentation, Rafter scanning
Data Exfiltration	High	High	Output filtering, key isolation
Untrusted Outputs	Medium	High	Code review + scanners
Jailbreaks	Medium	Medium	Role separation, jailbreak detection
Vector DB Leakage	Low	High	Access control, document sanitization

This table gives you a security to-do list that's practical, not overwhelming.

Step 5: Automate What You Can (Bonus)

Manual checklists are great. Automated scanning is better.

Once you've mapped your attack surfaces, integrate scanning to catch issues before attackers do. Tools like Rafter combine traditional static analysis with AI-aware scanning to detect:

Prompt injection patterns in code and configs
Key exposure in prompts and frontends
Insecure code generation outputs
Dangerous RAG configurations

This lets you operationalize your threat model. Instead of relying on memory, you've got a security net running continuously in CI. Start by scanning your repo with Rafter to automatically cover the most common AI-specific attack surfaces.

Conclusion

Threat modeling for AI apps doesn't have to be a long workshop or a 40-page document. In 30 minutes, you can:

List your critical assets
Map your data flows
Identify likely attack surfaces
Prioritize risks
Put basic defenses in place

This lightweight exercise dramatically reduces the chance that a prompt injection, jailbreak, or silent exfiltration blindsides you after launch.

Do it once today — then revisit periodically as your app evolves.
Run a Rafter scan to automatically cover the most common AI-specific attack surfaces.

It's one of the highest ROI security activities you can do as an indie developer or small team.