The numbers

GitHub's 2024 Secret Scanning report documents a problem that is accelerating, not stabilizing: 23.8 million secrets were detected in public repositories over the course of the year -- API keys, database connection strings, OAuth tokens, private keys, and cloud provider credentials committed directly to source control.

That figure alone is alarming. The more consequential finding is the correlation with AI coding assistants.

23.8M secrets leaked on GitHub in 2024
40% higher leak rate in Copilot-using repositories
6.4% secret leak rate with Copilot (vs. 4.6% without)
3.0 valid secrets generated per Copilot prompt (across 8,127 suggestions)

Repositories using GitHub Copilot exhibit a 6.4% secret leak rate, compared to 4.6% for repositories without Copilot -- a roughly 40% relative increase. Across a sample of 8,127 Copilot suggestions, researchers found an average of 3.0 valid, working secrets per prompt -- credentials that could authenticate against real services.
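For clarity, the "40%" is the relative increase between the two leak rates, not an absolute difference. The arithmetic:

```python
with_copilot = 0.064   # secret leak rate in Copilot-using repositories
without = 0.046        # secret leak rate without Copilot
relative_increase = (with_copilot - without) / without
assert round(relative_increase, 2) == 0.39   # ~40% when rounded to the nearest ten
```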

These are not theoretical risks. These are credentials that, if discovered by an attacker, provide immediate access to production infrastructure. The velocity of AI-generated code means these secrets are being committed faster than any human review process can catch them.

Enterprise implication: If your engineering teams use AI coding assistants -- and statistically, they do -- your codebase is producing secrets at a higher rate than before. The question is whether your detection infrastructure has scaled accordingly.


Five paths credentials leak through AI agents

The GitHub data captures only one vector: secrets committed to source control. In practice, AI agents introduce at least five distinct credential exposure paths, each with different detection requirements.

1. Hardcoding during generation (Critical)

AI models generate code with hardcoded credentials because that is what their training data contains. When a developer prompts "connect to my Postgres database," the model produces a connection string with a placeholder that looks like a real credential -- or, in some cases, is a real credential memorized from training data. The developer accepts the suggestion, commits, and pushes. The secret is now in version history permanently.
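This failure mode is mechanically detectable. A minimal sketch of the detection side -- the regex and function name are illustrative, not from any particular scanner -- that flags inline passwords in URL-style connection strings:

```python
import re

# Matches URL-style connection strings carrying an inline password,
# e.g. postgres://user:secret@host:5432/db  (scheme://user:password@host).
INLINE_PASSWORD = re.compile(r"\b[a-z][a-z0-9+.-]*://[^:/\s]+:([^@\s]+)@", re.IGNORECASE)

def find_inline_passwords(source: str) -> list[str]:
    """Return the password portion of any credential-bearing URLs in source."""
    return [m.group(1) for m in INLINE_PASSWORD.finditer(source)]

suggestion = 'DATABASE_URL = "postgres://app:hunter2@db.internal:5432/prod"'
assert find_inline_passwords(suggestion) == ["hunter2"]
```

Note that ports do not false-positive: `https://host:8080/x` never matches because the pattern requires an `@` after the password segment.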

2. Context window exposure (Critical)

Developers paste code containing credentials into LLM prompts for debugging or refactoring. The credentials are now part of the model's input context. For hosted LLMs (ChatGPT, Claude, Gemini), this means the credentials have left your network perimeter. For fine-tuned or cached models, the credentials may persist in training data or conversation logs. This is especially dangerous when developers paste .env files or configuration blocks into public-facing LLMs.
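The mitigation is to redact before anything leaves the machine. A minimal sketch, with deliberately small and illustrative patterns (a production filter needs far more):

```python
import re

# Common high-signal secret shapes; illustrative, not exhaustive.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                            # AWS access key ID
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),                       # generic "sk-" API key
    re.compile(r"(?i)(?:password|passwd|secret)\s*[:=]\s*\S+"), # key=value assignments
]

def redact(text: str, marker: str = "[REDACTED]") -> str:
    """Replace anything matching a secret pattern before it goes into a prompt."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(marker, text)
    return text

env_file = "DB_PASSWORD=hunter2\nAWS_KEY=AKIAIOSFODNN7EXAMPLE"
scrubbed = redact(env_file)
assert "hunter2" not in scrubbed and "AKIAIOSFODNN7EXAMPLE" not in scrubbed
```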

3. Training data memorization (High)

LLMs memorize and regurgitate credentials from their training corpora. Research has demonstrated that models can reproduce valid AWS keys, Stripe API tokens, and database passwords that appeared in public repositories. The 3.0 valid secrets per prompt finding is direct evidence of this phenomenon. The model is not "inventing" credentials -- it is recalling them from training data and presenting them as boilerplate.

4. MCP tool exfiltration (Critical)

Model Context Protocol (MCP) servers connect AI agents to external tools -- databases, APIs, file systems. A compromised or poorly configured MCP server can include credentials in its tool call responses, which the agent then processes and potentially logs, caches, or forwards to other services. A malicious MCP server can actively exfiltrate credentials by embedding them in responses designed to be passed to other tools.
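A gateway sitting between agent and tool can enforce this mechanically. A minimal sketch of the pattern -- field names and regexes are illustrative, not the MCP wire format or any particular gateway's rule set:

```python
import json
import re

# Illustrative credential shapes; a real gateway ships many more patterns.
CREDENTIAL = re.compile(r"AKIA[0-9A-Z]{16}|(?i:bearer)\s+[A-Za-z0-9._\-]{20,}")

def filter_tool_response(response: dict) -> dict:
    """Redact credential-shaped strings in a tool response before the agent
    can process, log, cache, or forward it."""
    raw = json.dumps(response)
    return json.loads(CREDENTIAL.sub("[REDACTED]", raw))

leaky = {"result": "connected; debug: aws_key=AKIAIOSFODNN7EXAMPLE"}
clean = filter_tool_response(leaky)
assert "AKIAIOSFODNN7EXAMPLE" not in clean["result"]
```

Serializing the whole response and scanning the serialized form means nested fields are covered without walking the structure by hand.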

5. Framework CVEs (High)

The frameworks that power AI agents have their own vulnerability surface. LangChain has disclosed multiple CVEs including arbitrary code execution through prompt injection. Langflow (CVE-2025-3248) exposed unauthenticated remote code execution that could extract environment variables -- including any secrets stored as env vars. These are not hypothetical: they are patched CVEs with public exploits.
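A low-effort mitigation is a CI check that compares installed framework versions against the patched floors from the relevant advisories. A sketch -- the floor versions below are placeholders, not real advisory data, and in practice the installed versions would come from `importlib.metadata.version`:

```python
# Known-vulnerable floors: package -> first patched version.
# These numbers are ILLUSTRATIVE placeholders; take real minimums from the CVE records.
PATCHED_FLOOR = {"langflow": (1, 3, 0), "langchain": (0, 1, 0)}

def parse_version(version: str) -> tuple[int, ...]:
    """Best-effort numeric parse; good enough for floor comparisons."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def unpatched(installed: dict[str, str]) -> list[str]:
    """Names of installed packages still below their patched floor."""
    return sorted(name for name, version in installed.items()
                  if name in PATCHED_FLOOR and parse_version(version) < PATCHED_FLOOR[name])

assert unpatched({"langflow": "1.2.0", "langchain": "0.2.1"}) == ["langflow"]
```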

Static and runtime coverage for MCP exfiltration

Path 4 -- MCP tool exfiltration -- requires two layers of defense. Before deployment, Aguara Scanner analyzes MCP server configurations with 148+ rules that detect hardcoded credentials, overly permissive tool definitions, and known-vulnerable server patterns. At runtime, Oktsec's MCP Gateway intercepts every tool call and response, scanning for credential patterns before data leaves the perimeter. A credential in a tool response is caught and redacted before the agent can process it.


Why pre-commit hooks are not enough

The standard recommendation for secret leakage is pre-commit hook scanning: tools like gitleaks, trufflehog, or GitHub's native secret scanning. These tools are necessary. Every engineering team should run them. But they address only one of the five paths described above -- secrets committed to source control.

AI agents operate at runtime. They call tools, receive responses, pass data between services, and communicate with other agents -- all outside the scope of a pre-commit hook. By design, a pre-commit scanner can never see any of this:

The runtime gap: Pre-commit hooks and static scanning operate at code-time. AI agents operate at runtime. A credential can leak through a tool call response, an agent-to-agent message, or an MCP server output -- none of which touch a git repository. Closing this gap requires runtime content scanning on every message and tool call that crosses a trust boundary.

This is the layer Oktsec was built for. The platform's 169+ detection rules include 15 credential redaction patterns that scan agent-to-agent communication, tool call payloads, and MCP server responses at runtime. When a credential pattern is detected, it is redacted before it reaches the next hop -- whether that is another agent, an external API, or a log file.


Defense-in-depth: three layers

Securing AI agent workflows against credential leakage requires coverage at three distinct layers. No single tool covers all three. The architecture must be deliberately layered.

Layer 1: Secrets Management
Never store credentials in code, configuration files, or environment variables accessible to agents. Use a secrets manager (HashiCorp Vault, AWS Secrets Manager, Azure Key Vault) with automatic rotation. Inject secrets at the infrastructure level -- not the application level. The agent should never see the raw credential.
Layer 2: Pre-deploy Scanning
Scan all code, configurations, and agent definitions before deployment. This includes MCP server configurations, Langflow flow files, and agent system prompts. Aguara Scanner provides 148+ rules across 15 security categories that detect hardcoded credentials in MCP configurations, overly permissive tool definitions, and known-vulnerable patterns -- all deterministic, no LLM in the loop, and self-hostable for air-gapped environments.
Layer 3: Runtime Enforcement
Scan every message, tool call, and agent response at runtime. Oktsec provides 15 credential redaction patterns that catch AWS keys, database connection strings, OAuth tokens, private keys, and other sensitive values in transit. The MCP Gateway intercepts tool calls and responses before they cross the perimeter -- a credential in a tool response is redacted before the agent processes it. This is the layer that pre-commit hooks cannot reach.

Coverage across the credential lifecycle

Aguara and Oktsec together provide coverage from configuration through runtime, catching credentials at every stage of the AI agent lifecycle.

148+ Aguara static rules
15 Credential redaction patterns
169+ Oktsec detection rules

Static scanning catches what is in configuration files. Runtime scanning catches what is in motion. Both are required. Neither alone is sufficient.


Code review checklist for AI-generated code

When reviewing pull requests that contain AI-generated code, apply the following checks. These supplement -- not replace -- automated scanning.

No hardcoded credentials. Search for patterns: API keys, connection strings, tokens, passwords, private keys. Check .env.example files for real values accidentally committed instead of placeholders.
Secrets loaded from environment or vault. Verify that credentials are accessed via os.environ, process.env, or a secrets manager SDK -- never from string literals or config files checked into the repository.
No credentials in AI prompts. Check system prompts, tool descriptions, and agent instructions for embedded API keys or database passwords. AI-generated agent configurations frequently include example credentials that are actually real.
MCP server configs are credential-free. Verify that MCP server definitions do not contain hardcoded tokens, API keys, or authentication headers. Use environment variable references or a secrets manager.
Pre-commit hooks are active. Confirm that gitleaks, trufflehog, or equivalent tooling is configured in .pre-commit-config.yaml and running on every commit. Verify it has not been bypassed with --no-verify.
Agent framework dependencies are pinned and patched. Check LangChain, Langflow, CrewAI, and other agent framework versions against known CVEs. Unpatched framework vulnerabilities can expose environment variables and secrets.
Credential rotation is automated. Verify that any credentials the system uses have automatic rotation configured. A leaked credential with a 24-hour TTL is a manageable incident. A leaked credential with no expiration is a breach.
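During review, an entropy heuristic (the same idea trufflehog popularized) helps triage long string literals: random-looking tokens score high in bits per character, English text scores low. A sketch with illustrative thresholds that would need tuning in practice:

```python
import math
import re

def shannon_entropy(s: str) -> float:
    """Bits per character; random tokens score high, natural language scores low."""
    freq = {c: s.count(c) / len(s) for c in set(s)}
    return -sum(p * math.log2(p) for p in freq.values())

def suspicious_literals(source: str, min_len: int = 20, threshold: float = 4.0) -> list[str]:
    """Flag long, high-entropy string literals for manual review.
    Heuristic only; min_len and threshold are illustrative."""
    literals = re.findall(r'"([^"\s]{%d,})"' % min_len, source)
    return [lit for lit in literals if shannon_entropy(lit) > threshold]

code = 'token = "ghp_x9Kq2LmZr8Tb4VwYc1NdEjP7sAfH3uGk0QoX"\nmsg = "retry the request later"'
flagged = suspicious_literals(code)
assert len(flagged) == 1 and flagged[0].startswith("ghp_")
```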

For automated coverage, run Aguara against your MCP server configurations and agent definitions as part of CI. The scanner detects credential patterns that human review frequently misses, especially in JSON and YAML configuration files where secrets are easy to overlook.

# Scan MCP server configurations for hardcoded credentials
npx aguara scan ./mcp-servers/ --format sarif --output results.sarif

# Scan agent configuration files
npx aguara scan ./agents/ --rules credential,secrets,api-key

# Integrate with CI -- fail the build on credential findings
npx aguara scan . --severity critical,high --exit-code 1

The speed of AI-generated code outpaces human review

The fundamental problem is not that developers are careless. It is that AI coding assistants produce code at a velocity that exceeds the capacity of manual review. A developer using Copilot can accept dozens of suggestions per hour. Each suggestion may or may not contain a hardcoded credential. The odds are small per suggestion, but at scale -- across thousands of developers, millions of suggestions, and hundreds of repositories -- the math is unforgiving.

23.8 million leaked secrets is not a failure of individual discipline. It is a systemic failure of detection infrastructure to keep pace with generation speed.

The response must be equally systematic: secrets management at the infrastructure layer, static scanning at the configuration layer, and runtime enforcement at the communication layer. Automation at every stage. No manual gates that depend on a human catching what a model missed.

The bottom line: AI agents do not understand that a string is a secret. They do not distinguish between a placeholder and a production credential. They cannot evaluate whether a tool call response should be forwarded or redacted. That judgment must be encoded in automated systems that operate at every layer -- from the IDE to the runtime -- without depending on human review to catch every case.


Secure your AI agent stack

Oktsec provides credential detection, MCP Gateway, and runtime content scanning for AI agent communication. Open source, self-hosted, no LLM.