Researchers Reveal 'Deceptive Delight' Method to Jailbreak AI Models
2024-10-23 09:54
Cybersecurity researchers have shed light on a new adversarial technique that can be used to jailbreak large language models (LLMs) during an interactive conversation by slipping an undesirable instruction in between benign ones. The approach has been codenamed Deceptive Delight by Palo Alto Networks Unit 42, which described it as both simple and effective, achieving an average…
News URL
https://thehackernews.com/2024/10/researchers-reveal-deceptive-delight.html
Related news
- 20% of Generative AI ‘Jailbreak’ Attacks Succeed, With 90% Exposing Sensitive Data (source)
- Apple Opens PCC Source Code for Researchers to Identify Bugs in Cloud AI Security (source)
- Researchers Uncover Vulnerabilities in Open-Source AI and ML Models (source)
- Researchers Warn of Privilege Escalation Risks in Google's Vertex AI ML Platform (source)