Universal and Transferable Adversarial Attacks on Aligned Language Models - arxiv.org
