https://openreview.net/forum?id=UWymGURI75
Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game | OpenReview
While Large Language Models (LLMs) are increasingly being used in real-world applications, they remain vulnerable to prompt injection attacks: malicious third...
prompt injection attacks, tensor trust, online game