How Rafter Scans AI-Generated Code: Under the Hood

Written by the Rafter Team

Rafter scans AI-generated code differently than traditional SAST tools. Traditional scanners were built for human-written code—consistent style, deliberate architecture, gradual evolution over months or years. AI-generated code breaks those assumptions: it arrives in bulk, mixes frameworks freely, handles errors inconsistently, and introduces vulnerability patterns that a human developer would catch mid-keystroke but that LLMs produce without hesitation. Rafter's scanning pipeline addresses these differences by combining battle-tested open-source static analyzers with a proprietary AI review layer, then consolidating findings into deduplicated, severity-ranked results with copy-paste fix prompts you can drop directly into your AI coding tool.
This post explains exactly how that pipeline works. We built Rafter to scan the code that AI tools generate, so we're going to be transparent about what we catch, what we miss, and how we're improving.
Why AI-Generated Code Needs Different Scanning
Human developers build mental models as they write. They remember that the auth middleware on line 12 protects the route on line 87. They notice when error handling in one file contradicts the pattern in another. They hesitate before hardcoding a default password, even in a prototype.
AI coding assistants don't do any of that. They generate code token by token, optimizing for local coherence without maintaining a global security model. The result is a distinct set of vulnerability patterns that traditional scanners weren't designed to catch at volume.
| Pattern | Human-Written Code | AI-Generated Code |
|---|---|---|
| Error handling | Consistent within a project—teams adopt conventions | Inconsistent across files—one route validates input, the next doesn't |
| Authentication | Centralized middleware or decorator pattern | Mixed approaches—JWT in one endpoint, session cookies in another, nothing in a third |
| Hardcoded values | Developers know to use env vars (usually) | Default credentials, API keys, and connection strings appear in generated code frequently |
| Framework usage | Follows framework conventions and security defaults | Mixes framework patterns—React with raw DOM manipulation, Express with manual header setting |
| Dependency choices | Teams vet and standardize dependencies | Suggests whatever was common in training data, including deprecated or vulnerable packages |
| Permission defaults | Teams review IAM, CORS, and access policies | Over-permissive by default—Access-Control-Allow-Origin: *, public S3 buckets, disabled RLS |
The Veracode State of AI-Generated Code Security report (July 2025) found that 45% of AI-generated code contains at least one security vulnerability. Pearce et al. (2021) showed that 40% of Copilot suggestions in security-sensitive contexts were insecure, and BaxBench (February 2025) found 49% of AI-generated outputs were vulnerable or incorrect. The problem isn't that AI writes worse code than humans—it's that AI writes code faster than humans can review it, and the vulnerability patterns are different enough that scanners tuned for human-written code miss them.
AI-generated code isn't inherently less secure than human-written code. But it's produced at a volume and speed that makes manual review impossible. Automated scanning tuned for AI output patterns is the only way to maintain security velocity.
Rafter's Scanning Pipeline
When you connect a repository to Rafter—through the GitHub app, CLI, or API—the scanning pipeline runs through five stages:
1. Repository ingestion. Rafter clones the repo at the specified branch or commit, identifies the languages and frameworks in use, and routes the codebase to the appropriate scanner configurations.
2. Static analysis layer. Multiple open-source and proprietary analyzers run in parallel against the codebase. Each scanner targets different vulnerability classes—secrets detection, dependency vulnerabilities, SAST pattern matching, infrastructure misconfigurations, and code quality issues.
3. AI-powered contextual review. Rafter's proprietary AI scanner (rf) analyzes the code with context that static rules can't capture—cross-file data flow, intent inference, and AI-specific vulnerability patterns like inconsistent auth approaches or missing input validation in generated routes.
4. Finding consolidation. Results from all scanners are merged, deduplicated by fingerprint, classified against the OWASP Top 10:2025, and ranked by severity.
5. Fix generation. Every finding gets a plain-English explanation and a structured fix prompt designed to paste directly into ChatGPT, Claude, Cursor, or any AI coding assistant.
A typical scan completes in 30 seconds to 2 minutes. Results appear in your dashboard, CLI output, or API response in SARIF-compatible format for integration with existing toolchains.
Static Analysis Layer
Rafter doesn't rely on a single scanner. The static analysis layer runs multiple specialized tools in parallel, each targeting vulnerability classes the others miss:
| Scanner | Category | What It Finds |
|---|---|---|
| Gitleaks | Secret Detection | Hardcoded API keys, tokens, passwords, and credentials in code and git history |
| Trivy + Bandit + OpenGrep | SAST + SCA | Known CVEs in dependencies, SQL injection, XSS, command injection, insecure crypto, deserialization flaws |
| Checkov | Infrastructure as Code | Terraform, CloudFormation, and Kubernetes misconfigurations—public buckets, open security groups, missing encryption |
| Rafter AI (rf) | AI-Specific | Proprietary rules tuned for AI-generated code patterns—inconsistent auth, over-permissive defaults, framework misuse |
| ESLint Security Rules | Code Quality | JavaScript/TypeScript anti-patterns, prototype pollution vectors, unsafe regex |
These aren't run sequentially—they execute in parallel on GCP Cloud Run workers, so adding scanners doesn't linearly increase scan time. Each scanner produces findings in a normalized format that feeds into the consolidation stage.
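As a rough illustration of that fan-out pattern (this is not Rafter's actual code; the `Finding` shape and the stub scanner adapters are invented for the sketch), parallel execution might look like this:

```typescript
// Illustrative sketch of parallel scanner fan-out. The Finding shape and
// the stub adapters below are invented for this example.
interface Finding {
  scanner: string;
  ruleId: string;
  file: string;
  severity: 'error' | 'warning' | 'note';
}

type Scanner = (repoPath: string) => Promise<Finding[]>;

// Stubs standing in for real tool adapters (gitleaks, trivy, ...)
const gitleaks: Scanner = async (_repo) => [
  { scanner: 'gitleaks', ruleId: 'aws-access-key', file: 'config.ts', severity: 'error' },
];
const trivy: Scanner = async (_repo) => [
  { scanner: 'trivy', ruleId: 'CVE-2024-12345', file: 'package-lock.json', severity: 'warning' },
];

async function runScanners(repoPath: string, scanners: Scanner[]): Promise<Finding[]> {
  // Every scanner starts immediately, so wall time tracks the slowest
  // scanner rather than the sum of all scanners.
  const perScanner = await Promise.all(scanners.map((scan) => scan(repoPath)));
  return perScanner.flat();
}
```

Because each adapter emits the same normalized shape, the consolidation stage downstream never needs to know which tool a finding came from.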
Why Multiple Scanners Matter
No single scanner catches everything. Gitleaks finds the AWS key hardcoded in a config file. Trivy finds the critical CVE in an outdated dependency. Bandit catches the eval() call with user input. Checkov catches the Terraform resource with a public IP and no security group. The Rafter AI scanner catches the endpoint that has authentication in development but not in production because the AI tool generated the route handler without the auth middleware that exists on every other route.
Running all of these together, then deduplicating the overlapping results, gives you broader coverage than any single tool while keeping noise manageable.
AI-Powered Contextual Review
The Rafter AI scanner (rf) is where things get interesting—and where we should be honest about both strengths and limitations.
Traditional static analysis works by matching patterns: "if a user-controlled value reaches an SQL query without parameterization, flag it." This works well for known vulnerability patterns but fails when the vulnerability is contextual—when the issue isn't in any single line but in how multiple files interact.
Rafter's AI layer adds three capabilities that static rules can't provide:
Cross-File Context
AI-generated codebases frequently have inconsistent security patterns across files. One API route validates input, sanitizes output, and checks authentication. The next route—generated in a different session or by a different prompt—does none of that. Static analysis checks each file independently. The AI layer reviews routes in context, flagging when a route deviates from the security patterns established elsewhere in the project.
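Conceptually, the check resembles flagging routes that deviate from the project's dominant auth pattern. This is a toy model (the real AI layer reasons over the code itself; the route table and guard name here are invented):

```typescript
// Toy model of the cross-file consistency check: if most routes use an
// auth guard, flag the ones that don't. Route and guard names are invented.
interface Route {
  path: string;
  middleware: string[];
}

function routesMissingAuth(routes: Route[], guard = 'requireAuth'): string[] {
  const guarded = routes.filter((r) => r.middleware.includes(guard)).length;
  // Only treat the guard as the established pattern if most routes use it
  if (guarded <= routes.length / 2) return [];
  return routes
    .filter((r) => !r.middleware.includes(guard))
    .map((r) => r.path);
}
```

The key idea is that the baseline is learned from the project, not hardcoded: a route is only suspicious relative to what the rest of the codebase does.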
Intent Inference
When an AI tool generates a file upload handler without file type validation, a static analyzer sees valid code. The AI layer recognizes the intent—"this is a file upload endpoint"—and checks whether the implementation matches security expectations for that intent: file type validation, size limits, storage path sanitization, and malware scanning hooks.
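The kind of checks expected of a file upload endpoint can be sketched like this (the allow-list, size limit, and `validateUpload` helper are illustrative, not a Rafter API):

```typescript
// Sketch of the checks the AI layer expects a file upload handler to have.
// The MIME allow-list and size limit are illustrative values.
const ALLOWED_TYPES = new Set(['image/png', 'image/jpeg', 'application/pdf']);
const MAX_BYTES = 5 * 1024 * 1024; // 5 MB

interface Upload {
  filename: string;
  mimeType: string;
  size: number;
}

function validateUpload(file: Upload): string | null {
  if (!ALLOWED_TYPES.has(file.mimeType)) return 'unsupported file type';
  if (file.size > MAX_BYTES) return 'file too large';
  // Strip directory components so "../../etc/passwd" can't escape the upload dir
  const safeName = file.filename.replace(/^.*[\\/]/, '');
  if (safeName !== file.filename) return 'invalid filename';
  return null; // passed all checks
}
```

A static rule sees nothing wrong with a handler that omits all of this; intent inference flags the absence.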
AI-Specific Pattern Detection
Some vulnerability patterns are almost exclusive to AI-generated code:
```javascript
// ✗ Vulnerable: AI-generated Supabase client with RLS bypassed
import { createClient } from '@supabase/supabase-js'

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL,
  process.env.SUPABASE_SERVICE_ROLE_KEY // Service role key bypasses RLS
)

// This endpoint lets any authenticated user read ANY user's data
export async function GET(req, { params }) {
  const { data } = await supabase
    .from('profiles')
    .select('*')
    .eq('id', params.userId)
  return Response.json(data)
}
```

```javascript
// ✓ Secure: Using anon key with RLS enforced
import { createClient } from '@supabase/supabase-js'

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL,
  process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY // Anon key respects RLS
)

export async function GET(req, { params }) {
  // RLS policy ensures users can only read their own profile
  const { data } = await supabase
    .from('profiles')
    .select('*')
    .eq('id', params.userId)
  return Response.json(data)
}
```
This pattern—using the Supabase service role key in a client-facing endpoint—is rare in human-written code because developers learn about RLS before building with Supabase. AI coding assistants generate it constantly because the service role key "works" and produces fewer errors during generation. Rafter's AI scanner flags this pattern specifically, along with similar issues in Firebase (missing security rules), Appwrite (exposed endpoints), and other backend-as-a-service platforms that AI tools frequently suggest.
Finding Consolidation and Deduplication
Five scanners running in parallel produce overlapping findings. The same hardcoded secret might be flagged by Gitleaks (as a credential leak), by the SAST scanner (as a hardcoded string in a sensitive context), and by the AI scanner (as a service key used in a client-facing endpoint). Showing three findings for one issue creates noise that makes developers ignore results.
Rafter's consolidation pipeline handles this in three steps:
1. Fingerprint generation. Each finding gets a SHA-256 fingerprint derived from the finding type and location. The same vulnerability in the same file produces the same fingerprint across runs, which enables tracking over time and prevents duplicate alerts on unchanged code.
2. Deduplication. Findings with matching fingerprints are merged. The first occurrence wins, but metadata from overlapping findings enriches the primary finding—if Gitleaks identifies the secret type and the AI scanner identifies the security impact of where it's used, both contribute to the final finding.
3. OWASP classification and severity ranking. Every deduplicated finding is mapped to the OWASP Top 10:2025 categories using a three-tier classification system:
- Priority 1: CWE-to-OWASP mapping (most precise—if the finding has a CWE identifier, it maps directly)
- Priority 2: Keyword-based classification (regex matching on finding descriptions)
- Priority 3: Tool-based heuristic (Gitleaks findings default to Cryptographic Failures, Checkov findings default to Security Misconfiguration)
Severity levels are normalized across all scanners: critical, high, severe, and blocker all map to error. warning and medium map to warning. note, info, and low map to note. This gives you a single, consistent severity scale regardless of which scanner produced the finding.
The result is a clean, ranked list of findings—not a noisy dump of raw scanner output. Each finding appears once, with a clear severity level, OWASP category, and file location.
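As an illustration, the fingerprinting, first-wins dedup, and severity normalization steps might look like this sketch (the field names and the exact fields Rafter hashes are assumptions, not its actual internals):

```typescript
import { createHash } from 'node:crypto';

interface Finding {
  ruleId: string;
  file: string;
  line: number;
  severity: string; // raw scanner severity, e.g. "critical", "medium", "info"
}

// Stable fingerprint from finding type + location: the same issue in the
// same place hashes identically across runs. (Exact hashed fields assumed.)
function fingerprint(f: Finding): string {
  return createHash('sha256').update(`${f.ruleId}:${f.file}:${f.line}`).digest('hex');
}

// First occurrence wins (the real pipeline also merges metadata
// from the duplicates into the surviving finding).
function dedupe(findings: Finding[]): Finding[] {
  const seen = new Map<string, Finding>();
  for (const f of findings) {
    const fp = fingerprint(f);
    if (!seen.has(fp)) seen.set(fp, f);
  }
  return [...seen.values()];
}

// Collapse every scanner's severity scale onto error/warning/note,
// following the mapping described above.
function normalizeSeverity(raw: string): 'error' | 'warning' | 'note' {
  if (['critical', 'high', 'severe', 'blocker'].includes(raw)) return 'error';
  if (['warning', 'medium'].includes(raw)) return 'warning';
  return 'note'; // note, info, low, and anything unrecognized
}
```

Because the fingerprint is content-addressed rather than run-specific, an unchanged vulnerability keeps the same identity across scans, which is what makes tracking over time possible.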
Fix Generation: From Finding to Prompt
Finding vulnerabilities is the easy part. The hard part is fixing them—especially when the developer who needs to fix the code is using an AI tool to write it. Rafter closes this loop by generating structured fix prompts for every finding.
Here's how it works:
1. Finding context assembly. For each vulnerability, Rafter assembles the rule ID, severity level, file path, line number, and a plain-English description of what's wrong and why it matters.
2. Single-vulnerability prompts. Each finding produces a prompt structured for AI coding assistants:
```text
You are a senior application-security engineer. Never mock data,
suppress linter security rules, or shortcut the fix. Think step-by-step.

VULNERABILITY: hardcoded-credentials
SEVERITY: error
FILE: src/lib/db.ts:14
DESCRIPTION: Database connection string with credentials hardcoded in source code.

Provide:
1. Explanation of the vulnerability and its impact
2. Step-by-step remediation with code examples
3. Prevention strategies to avoid reintroduction
```
3. Bulk remediation prompts. When a scan produces multiple findings, Rafter groups them by rule ID and generates a consolidated prompt that asks for prioritized remediation across all issues—so you can paste one prompt into Claude or ChatGPT and get a comprehensive fix plan.
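Assembling the single-finding prompt shown above is essentially string templating over the finding context; a sketch (the `Finding` field names are assumptions about Rafter's internal shape):

```typescript
// Illustrative assembly of a single-finding fix prompt.
// Field names are assumed, not Rafter's actual schema.
interface Finding {
  ruleId: string;
  severity: string;
  file: string;
  line: number;
  description: string;
}

function buildFixPrompt(f: Finding): string {
  return [
    'You are a senior application-security engineer. Never mock data,',
    'suppress linter security rules, or shortcut the fix. Think step-by-step.',
    `VULNERABILITY: ${f.ruleId}`,
    `SEVERITY: ${f.severity}`,
    `FILE: ${f.file}:${f.line}`,
    `DESCRIPTION: ${f.description}`,
    'Provide:',
    '1. Explanation of the vulnerability and its impact',
    '2. Step-by-step remediation with code examples',
    '3. Prevention strategies to avoid reintroduction',
  ].join('\n');
}
```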
The fix prompt workflow looks like this in practice:
- Rafter scans your repo and finds 8 vulnerabilities
- You open the findings in the dashboard or CLI
- Each finding has a "Copy Fix Prompt" button
- You paste the prompt into Cursor, ChatGPT, Claude, or Lovable
- The AI tool generates the fix using the structured context
- You rescan to verify the fix resolved the issue
This scan-find-fix-rescan loop is deliberate. AI tools generate vulnerable code; Rafter finds the vulnerabilities; AI tools fix the vulnerabilities using Rafter's structured prompts; Rafter verifies the fixes. The AI is both the source of the problem and—with proper guidance—the solution.
What Rafter Doesn't Catch (Honest Assessment)
No scanner catches everything, and we'd rather be transparent about our limits than have you discover them in production.
Business logic vulnerabilities. Rafter can't determine that your e-commerce checkout allows negative quantities, that your access control lets users escalate their own permissions through a multi-step workflow, or that your rate limiter has a race condition in the token bucket implementation. Business logic requires understanding what the application should do, which is beyond any automated scanner.
Runtime-only vulnerabilities. Issues that only manifest at runtime—SSRF through DNS rebinding, timing-based side channels, race conditions in concurrent request handling—require dynamic testing (DAST) or targeted penetration testing. Rafter is a static analysis platform.
Novel vulnerability classes. Rafter's static rules and AI patterns detect known vulnerability classes and their variations. A genuinely novel attack technique—something nobody has seen before—won't match existing patterns until we update the scanner. We release rule updates continuously, but there's always a lag between discovery and detection.
Obfuscated or intentionally deceptive code. If malicious code is deliberately obfuscated to evade scanning—encoded payloads, indirect evaluation through computed property access, multi-stage deobfuscation—static analysis has fundamental limits. This is a constraint shared by all SAST tools.
Deep dependency chain analysis. Rafter scans your direct and transitive dependencies for known CVEs via Trivy, but it doesn't trace data flow through third-party library code. If a vulnerability exists in how your code interacts with a library's internal behavior, that requires deeper program analysis than our current pipeline provides.
We're actively working on expanding coverage in these areas—particularly runtime-aware analysis and deeper cross-file data flow tracking. But today, these are real gaps, and combining Rafter with DAST tools and manual penetration testing is the right approach for comprehensive coverage.
Integration Points
Rafter is designed to fit into the workflow you already have, not replace it.
GitHub App
Connect your GitHub account, select repositories, and Rafter scans automatically on push or pull request. The GitHub app handles authentication, branch selection, and org-level configuration—including per-repo scan triggers, branch filters, scan modes (fast or plus), cooldown periods, and monthly scan caps for cost control.
CLI
For local development and CI/CD integration:
```shell
# Install and scan
npx @rafter/cli scan                    # Node projects
pip install rafter-cli && rafter scan   # Python projects

# Scan with options
rafter scan --mode plus --branch main
```
REST API
For programmatic access and custom integrations:
```shell
# Trigger a scan
curl -X POST https://api.rafter.so/api/static/scan \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"repository_name": "owner/repo", "scan_mode": "fast"}'

# Check results (URL quoted so the shell doesn't treat & as a job separator)
curl "https://api.rafter.so/api/static/scan?scan_id=xxx&format=json" \
  -H "X-API-Key: your-api-key"
```
Results come back in SARIF-compatible JSON or formatted Markdown, so you can feed them into existing security dashboards, Slack alerts, or issue trackers.
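For example, feeding results into a Slack alert or custom dashboard is a matter of walking the standard SARIF structure. This sketch covers only the SARIF fields it uses, and the sample data in the test is made up:

```typescript
// Minimal sketch of summarizing findings from a SARIF 2.1.0 payload.
// Only the fields this function reads are typed here.
interface SarifLog {
  runs: {
    results: {
      ruleId: string;
      level: 'error' | 'warning' | 'note';
      message: { text: string };
      locations: { physicalLocation: { artifactLocation: { uri: string } } }[];
    }[];
  }[];
}

// One human-readable line per finding, e.g. for a chat alert
function summarize(log: SarifLog): string[] {
  return log.runs.flatMap((run) =>
    run.results.map(
      (r) =>
        `[${r.level}] ${r.ruleId} in ` +
        `${r.locations[0].physicalLocation.artifactLocation.uri}: ${r.message.text}`
    )
  );
}
```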
Setting Up Rafter on Your AI-Built Project
Whether you're building with Cursor, Claude, Lovable, Replit, or any other AI coding tool, here's how to get scanning running in five minutes:
- Sign in at rafter.so/dashboard with your GitHub account
- Connect your repo—select the repository and branch you want to scan
- Run your first scan—choose `fast` for quick results or `plus` for deep analysis
- Review findings—each vulnerability includes severity, OWASP category, file location, and a plain-English explanation
- Fix with AI—copy the fix prompt, paste it into your AI coding tool, apply the fix, and rescan to verify
For CI/CD integration, add the Rafter CLI to your GitHub Actions workflow to scan every pull request automatically.
First scans are free. No credit card required. Most AI-built projects complete scanning in 30 seconds to 2 minutes.
Conclusion
Rafter's scanning pipeline is built for the specific reality of AI-generated code: high volume, inconsistent patterns, and vulnerability classes that traditional scanners weren't designed to catch. The combination of open-source static analyzers, a proprietary AI review layer, and structured fix prompts creates a scan-find-fix loop that works with AI tools rather than against them.
We're transparent about what we catch and what we miss because trust matters more than marketing. No scanner catches everything. Rafter catches a meaningful set of vulnerabilities that would otherwise ship to production in AI-generated code—and it does it in 30 seconds to 2 minutes.
Next steps:
- Run your first scan—sign in with GitHub, select a repo, get results in 30 seconds
- Review the security tool comparison guide to understand where Rafter fits in a comprehensive security strategy
- Set up automated scanning in your CI/CD pipeline for continuous protection
- Read about vibe coding security to understand the broader security landscape for AI-generated applications