[R&D] OWASP Top 10 LLM 2025: a Synapsed Research Study

19th May 2025

Synapsed, the AI Trusted Advisory Company, has conducted a pioneering study evaluating real Large Language Models (LLMs) against the newly defined OWASP Top 10 LLM 2025 standard. This groundbreaking research leverages our innovative testing tool, SynInspect, designed specifically to identify critical vulnerabilities unique to LLMs.

The OWASP Foundation’s Top 10 LLM Vulnerabilities framework has emerged as a critical taxonomy for categorizing and addressing these security challenges. This framework identifies key vulnerability categories including prompt injection, data leakage, supply chain tampering, and other LLM-specific security concerns that require specialized testing and mitigation strategies.

This research study presents a comprehensive analysis of SynInspect, an innovative testing tool developed by Synapsed for evaluating the security posture of Large Language Models (LLMs). Our study examined test results from 10 different LLM models, all tested locally using LMStudio, to identify vulnerability patterns, security strengths, and areas requiring urgent countermeasures.

Overview of tested LLM Models

Our analysis examined 10 distinct LLM models, representing a diverse cross-section of current architectures and capabilities:

DeepSeek R1-distill-qwen-7b
gemma-3-4b-it-qat
LLama 3 Groq 8B Tool Use
Mistral-7B Claude-chat
LLaMA 3.2B Instruct
gemma3-4B-claude-3.7-sonnet-reasoning-distilled
alibaba-pai.DistilQwen2.5-DS3-0324-7B-GGUF
Phi 4
OpenChat-3.5-7B-Qwen-v2.0-GGUF
Hermes 3 Llama 3.2B Instruct

These models represent five distinct architectural families: LLaMA, Gemma, Mistral, Qwen, and Phi. All tests were performed locally using LMStudio, ensuring consistent testing conditions and eliminating dependencies on external API services.

Test environment and configuration

All tests were conducted in a controlled local environment using LMStudio, eliminating variability introduced by online API services. This approach ensured consistent testing conditions across all models and enabled direct comparison of results.

Each model underwent a comprehensive battery of tests across all OWASP Top 10 LLM vulnerability categories:

LLM01-Prompt Injection
LLM02-Sensitive Data Leak
LLM03-Supply Chain Tampering
LLM04-Data And Model Poisoning
LLM05-Improper Output Handling
LLM06-Excessive Agency
LLM07-System Prompt Leakage
LLM08-Vector And Embedding Weakness
LLM09-Misinformation
LLM10-Unbounded Consumption

The testing process involved 340 distinct probes designed to evaluate specific vulnerability types within each category. Each probe was systematically applied to all models, with responses analyzed for security implications.

Vulnerability Analysis Results

Our analysis revealed significant variations in vulnerability detection across the tested models.

The total number of vulnerabilities detected ranged from 16 to 84 across different models, with an average of 54.6 vulnerabilities per model.
When comparing security posture across models, we observed that no model was completely immune to vulnerabilities.

The distribution of vulnerabilities across OWASP categories showed clear patterns:

LLM07-System Prompt Leakage: 116 vulnerabilities
LLM08-Vector And Embedding Weakness: 105 vulnerabilities
LLM05-Improper Output Handling: 78 vulnerabilities
LLM02-Sensitive Data Leak: 71 vulnerabilities
LLM01-Prompt Injection: 59 vulnerabilities

The following picture summarises the high level results of the study:

Conclusion

Our study demonstrates a pressing need for enhanced security standards and practices across the LLM industry. The innovative capabilities of SynInspect clearly illustrate that addressing these vulnerabilities requires specialized, proactive security measures. Detailed strategic recommendations for mitigation and security advancement are discussed comprehensively in our full white paper.

For a more in-depth exploration of our study, drop an email to: info@synapsed.ai in order to download the full white paper.

Menu

[R&D] OWASP Top 10 LLM 2025: a Synapsed Research Study

Author

Overview of tested LLM Models

Test environment and configuration

Vulnerability Analysis Results

Conclusion

Related Tags:

Leave a Reply Cancel reply