What Are the Main Risks to LLM Security?

Large Language Models, or LLMs, are deep learning algorithms that are trained to replicate language. While being trained, they map millions of words onto a high-dimensional vector graph, transforming their semantic meaning into a numerical map. Similarities between words are measured according to the space between them on this graph: this is how an LLM model can then recreate text in new ways, since it calculates which words are likely to follow the previous one. This ability makes it incredibly useful for quickly handling queries, translations, and summarization across an organization’s user base and employees.

However, LLM security is the cutting-edge process of protecting corporate data and processes while handling novel and highly unpredictable large language models. Securing LLM applications demands an in-depth understanding of the major risks that contribute to LLM security breaches.

자세히 알아보기 Read the GigaOm Radar Report

The Top 10 Risks Facing LLMs

Because LLMs are already being rapidly adopted, there’s a lot of data on the security oversights originating from real-life applications; OWASP tracks these, and lists the most widespread and serious issues in new LLM models. These are the 10 biggest threats facing LLMs today.

#1. Prompt Injection

Prompt injection is a vulnerability where attackers craft inputs that deliberately override the model’s original instructions and manipulate its output. By embedding deceptive commands—such as “Ignore all previous instructions”—an attacker can bypass safeguards and force the system to generate unauthorized responses.

This technique becomes especially dangerous when LLMs process external or user-supplied data, as hidden instructions within the input can execute unintended actions without the model recognizing the manipulation. Such exploits can lead to severe consequences, including data breaches, financial losses, business email compromise, and regulatory violations like GDPR non-compliance. Ultimately, prompt injection attacks are one of the most common Large Language Model vulnerabilities, thanks to the public-facing nature of most LLM platforms. The result of this can be unauthorized access, leaking sensitive information, and undermining decision-making processes.

#2. Sensitive Information Disclosure

Since LLMs process vast and diverse datasets, they may inadvertently learn and replicate sensitive information embedded in their training data. Without proper safeguards, models could expose personal data, proprietary business details, or other confidential content in user interactions. Implementing strict data filtering, access controls, and prompt monitoring is essential to minimizing these risks and ensuring compliance with privacy regulations.

#3. Supply Chain Vulnerabilities

LLM supply chain vulnerabilities arise when compromised components, services, or datasets undermine system integrity, leading to data breaches, biased outputs, or system failures. Unlike traditional software risks, which focus on code flaws and dependencies, LLMs introduce unique threats tied to third-party models, datasets, and fine-tuning methods.

Malicious actors can exploit these weaknesses by tampering with pre-trained models, injecting poisoned data, or manipulating fine-tuning processes. Techniques like Low-Rank Adaptation (LoRA) and Parameter-Efficient Fine-Tuning (PEFT)—commonly used in open-access platforms like Hugging Face—further expand the attack surface. Additionally, on-device LLMs introduce new risks by decentralizing deployment, making it harder to secure the entire supply chain.

This extra risk lies atop the traditional software vulnerabilities that can impact other components within LLM-adjacent systems. Outdated and deprecated dependencies are one example of this increased risk of compromised applications. Ensuring the integrity of third-party models, validating datasets, and applying strict security measures are essential to mitigating these threats.

#4. Data and Model Poisoning

Training data poisoning occurs when attackers inject manipulated data into an LLM’s training or fine-tuning process, subtly altering its behavior to introduce biases, backdoors, or vulnerabilities. This can cause the model to generate misleading, insecure, or unethical responses that serve the attacker’s hidden objectives.

Because the impact of poisoned data becomes deeply embedded in the model during training, correcting it after deployment is extremely difficult, if not impossible. Ensuring the integrity of training datasets from the outset—through rigorous validation, access controls, and anomaly detection—is critical to maintaining the security, reliability, and ethical performance of LLMs.

#5. Insecure Output Handling

Insecure output handling occurs when the responses generated by LLMs are not properly validated, sanitized, or controlled before being processed by other systems or shown to users. If LLM-generated output is directly inserted into databases, executed in system shells, or used to generate code for web applications, it can introduce security risks such as unauthorized command execution or code injection.

A key challenge stems from the complexity of transformer-based architectures, which, while excellent at understanding context, can produce unpredictable outputs. Without strict monitoring and sanitization, these responses may contain sensitive, inappropriate, or exploitable content, increasing the risk of security breaches and operational failures.

#6. Excessive Agency

LLM tools are often granted a higher degree of agency than traditional applications. This is because LLMs are expected to call functions and interact with interfaces outside of the core app. For instance, an enterprise tool that answers employees’ questions about their pay, remaining overtime, and other queries, will need to be connected to the underlying HR software, usually via an API.

One example of this is a plugin that retrieves embeddings within a vector database: if it’s allowed to accept configuration parameters, attackers can use it to change host parameters and exfiltrate training data. Consider another plugin, such as one that allows a user to query an LLM model with a URL input; an attacker can construct a request that redirects this URL to a third-party domain that’s pre-loaded with malicious code, setting up a possible content injection attack into the LLM.

#7. System Prompt Leakage

System prompts, designed to steer an LLM’s behavior according to application requirements, can inadvertently contain sensitive information that was never meant to be exposed. If discovered, this hidden data can be exploited to facilitate further attacks.

One major risk is the unintentional disclosure of internal decision-making processes that should remain confidential. When attackers gain insight into how an application functions, they can identify weaknesses and find ways to bypass security controls. For example, in a banking chatbot, a system prompt might include details of transactions or loan limits. If an attacker uncovers this information, they may attempt to manipulate transactions to exceed the limit or circumvent restrictions on loan amounts.

By exposing such operational details, system prompts can become an unintentional attack surface, making it critical to ensure they do not reveal sensitive data that could compromise security.

#8. Vector and Embedding Weaknesses

Many organizations choose to enhance LLM reliability with Retrieval Augmented Generation (RAG). This technique combines the pre-trained model with an authoritative database of verified, source-appropriate information. This way, after a user requests a response from the LLM, it references the information first. RAG allows developers to keep an LLM’s responses up-to-date, and reduce hallucinations. However, RAG relies on vector and embedding mechanisms – wherein the specific data is transformed into vectors.

If there are vulnerabilities within the RAG, however, this can introduce significant threats to the LLM’s host organization. For instance, attackers can attempt to invert these embeddings, and therefore recover the source information from the RAG. If this information is supplied from within the company, there is the risk of customer, product, or intellectual data theft.

#9 Misinformation

One of the most well-recognized risks for any LLM is the chance of it spreading misinformation within its corporation or user base. This vulnerability can have a significant impact on the user’s surrounding security, integrity, and reputation. Some of the misinformation has been heavily reported on: when an Air Canada customer asked about bereavement rates, their site’s chatbot confidently stated that the ticket can be refunded after travel is completed.  However, when the customer went to request compensation after landing, Air Canada refused, stating that it was only valid before travel. The customer then went on to successfully sue the organization.

A related challenge for internally-facing LLM deployments is overreliance. This happens when employees place too much trust in LLM-generated content without first verifying its accuracy. As a result, misinformation can have a greater impact, as users may incorporate incorrect data into important decisions or processes without proper validation.

#10. Unbounded Resource Usage

Attackers can take advantage of the resource usage required to respond to queries by creating high-volume task queues, or manipulating the model’s context window to degrade performance, disrupt service availability, and drive up operational costs. These denial of service (DoS) attacks exploit the resource-intensive nature of LLMs by overwhelming them with computationally expensive tasks.

One particularly concerning technique is recursive context expansion, where an attacker crafts inputs that force the LLM to repeatedly extend and process its context, consuming excessive memory and compute power. Without proper safeguards, such attacks can impact system reliability, slow response times for legitimate users, and introduce financial burdens due to increased resource consumption. Implementing rate limiting, resource allocation controls, and anomaly detection is crucial to mitigating these threats.

Protect New LLMs with Check Point CloudGuard

Check Point CloudGuard is a security platform that protects your entire cloud ecosystem. By combining account activity logs, real-time network activity, and cutting-edge threat intelligence like OWASP’s, CloudGuard grants real-time visibility into LLMs and their surrounding databases and plugins.

Alongside in-depth visibility, CloudGuard provides automatic remediation that takes an attack’s severity and context to heart. Explore how CloudGuard can defend against cutting-edge LLM and cloud-based threats with a demo.

×
  피드백
본 웹 사이트에서는 기능과 분석 및 마케팅 목적으로 쿠키를 사용합니다. 웹 사이트를 계속 이용하면 쿠키 사용에 동의하시게 됩니다. 자세한 내용은 쿠키 공지를 읽어 주십시오.