Results
Found 36 publication records. Showing 36 according to the selection in the facets.
Hits | Authors | Title | Venue | Year | Link | Author keywords |
36 | Charlie Miller | Mobile Attacks and Defense. | IEEE Secur. Priv. | 2011 | DBLP DOI BibTeX RDF | jailbreaking, App Store, Android Market, data execution prevention, DEP, address space layout randomization, ASLR, computer security, malware, SMS, smart phone, Android, iPhone, sandbox, iOS |
28 | Andy Zhou, Bo Li, Haohan Wang | Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Yichuan Mo, Yuji Wang, Zeming Wei, Yisen Wang 0001 | Studious Bob Fight Back Against Jailbreaking via Prompt Adversarial Tuning. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Yixin Cheng, Markos Georgopoulos, Volkan Cevher, Grigorios G. Chrysos | Leveraging the Context through Multi-Round Interactions for Jailbreaking Attacks. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Zhenxing Niu, Haodong Ren, Xinbo Gao 0001, Gang Hua 0001, Rong Jin | Jailbreaking Attack against Multimodal Large Language Model. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Divij Handa, Advait Chirmule, Bimal G. Gajera, Chitta Baral | Jailbreaking Proprietary Large Language Models using Word Substitution Cipher. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Guangyu Shen, Siyuan Cheng 0005, Kaiyuan Zhang 0002, Guanhong Tao 0001, Shengwei An, Lu Yan, Zhuo Zhang 0002, Shiqing Ma, Xiangyu Zhang 0001 | Rapid Optimization for Jailbreaking LLMs via Subconscious Exploitation and Echopraxia. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Taeyoun Kim, Suhas Kotha, Aditi Raghunathan | Jailbreaking is Best Solved by Definition. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Huijie Lv, Xiao Wang, Yuansen Zhang, Caishuang Huang, Shihan Dou, Junjie Ye, Tao Gui, Qi Zhang, Xuanjing Huang 0001 | CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J. Pappas, Florian Tramèr, Hamed Hassani, Eric Wong 0001 | JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Weikang Zhou, Xiao Wang, Limao Xiong, Han Xia, Yingshuang Gu, Mingxu Chai, Fukang Zhu, Caishuang Huang, Shihan Dou, Zhiheng Xi, Rui Zheng, Songyang Gao, Yicheng Zou, Hang Yan 0001, Yifan Le, Ruohui Wang, Lijun Li, Jing Shao, Tao Gui, Qi Zhang, Xuanjing Huang 0001 | EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Yifan Li 0009, Hangyu Guo, Kun Zhou 0002, Wayne Xin Zhao, Ji-Rong Wen | Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimodal Large Language Models. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Tianlong Li, Xiaoqing Zheng, Xuanjing Huang 0001 | Open the Pandora's Box of LLMs: Jailbreaking LLMs through Representation Engineering. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Zhenhua Wang, Wei Xie 0007, Baosheng Wang, Enze Wang, Zhiwen Gui, Shuoyoucheng Ma, Kai Chen | Foot In The Door: Understanding Large Language Model Jailbreaking via Cognitive Psychology. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Tong Liu, Yingjie Zhang, Zhe Zhao, Yinpeng Dong, Guozhu Meng, Kai Chen | Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Yihan Wang, Zhouxing Shi, Andrew Bai, Cho-Jui Hsieh | Defending LLMs against Jailbreaking Attacks via Backtranslation. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Canaan Yung, Hadi Mohaghegh Dolatabadi, Sarah M. Erfani, Christopher Leckie | Round Trip Translation Defence against Large Language Model Jailbreaking Attacks. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Xuandong Zhao, Xianjun Yang, Tianyu Pang, Chao Du, Lei Li 0005, Yu-Xiang Wang, William Yang Wang | Weak-to-Strong Jailbreaking on Large Language Models. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Xijia Tao, Shuai Zhong, Lei Li 0039, Qi Liu, Lingpeng Kong | ImgTrojan: Jailbreaking Vision-Language Models with ONE Image. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Xingang Guo, Fangxu Yu, Huan Zhang, Lianhui Qin, Bin Hu | COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Daoyuan Wu, Shuai Wang, Yang Liu, Ning Liu | LLMs Can Defend Themselves Against Jailbreaking in a Practical Manner: A Vision Paper. | CoRR | 2024 | DBLP DOI BibTeX RDF | |
28 | Yunhao Liu 0002, Gengzhong Feng, Yangyang Sun, Xiangyin Kong | Jailbreaking in closed two-sided platforms. | Inf. Manag. | 2023 | DBLP DOI BibTeX RDF | |
28 | Yichen Gong, Delong Ran, Jinyuan Liu, Conglei Wang, Tianshuo Cong, Anyu Wang 0001, Sisi Duan, Xiaoyun Wang | FigStep: Jailbreaking Large Vision-language Models via Typographic Visual Prompts. | CoRR | 2023 | DBLP DOI BibTeX RDF | |
28 | Yuanwei Wu, Xiang Li, Yixin Liu, Pan Zhou, Lichao Sun 0001 | Jailbreaking GPT-4V via Self-Adversarial Attacks with System Prompts. | CoRR | 2023 | DBLP DOI BibTeX RDF | |
28 | Nan Xu, Fei Wang 0060, Ben Zhou, Bangzheng Li, Chaowei Xiao, Muhao Chen | Cognitive Overload: Jailbreaking Large Language Models with Overloaded Logical Thinking. | CoRR | 2023 | DBLP DOI BibTeX RDF | |
28 | Zhexin Zhang, Junxiao Yang, Pei Ke, Minlie Huang | Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization. | CoRR | 2023 | DBLP DOI BibTeX RDF | |
28 | Alexander Robey, Eric Wong 0001, Hamed Hassani, George J. Pappas | SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks. | CoRR | 2023 | DBLP DOI BibTeX RDF | |
28 | Xiaoyu Zhang, Cen Zhang, Tianlin Li, Yihao Huang 0006, Xiaojun Jia, Xiaofei Xie, Yang Liu, Chao Shen | A Mutation-Based Method for Multi-Modal Jailbreaking Attack Detection. | CoRR | 2023 | DBLP DOI BibTeX RDF | |
28 | Anay Mehrotra, Manolis Zampetakis, Paul Kassianik, Blaine Nelson, Hyrum Anderson, Yaron Singer, Amin Karbasi | Tree of Attacks: Jailbreaking Black-Box LLMs Automatically. | CoRR | 2023 | DBLP DOI BibTeX RDF | |
28 | Patrick Chao, Alexander Robey, Edgar Dobriban, Hamed Hassani, George J. Pappas, Eric Wong 0001 | Jailbreaking Black Box Large Language Models in Twenty Queries. | CoRR | 2023 | DBLP DOI BibTeX RDF | |
28 | Yi Liu, Gelei Deng, Zhengzi Xu, Yuekang Li, Yaowen Zheng, Ying Zhang, Lida Zhao, Tianwei Zhang 0004, Yang Liu 0003 | Jailbreaking ChatGPT via Prompt Engineering: An Empirical Study. | CoRR | 2023 | DBLP DOI BibTeX RDF | |
28 | Raz Lapid, Ron Langberg, Moshe Sipper | Open Sesame! Universal Black Box Jailbreaking of Large Language Models. | CoRR | 2023 | DBLP DOI BibTeX RDF | |
28 | Haoran Li 0003, Dadi Guo, Wei Fan, Mingshi Xu, Yangqiu Song | Multi-step Jailbreaking Privacy Attacks on ChatGPT. | CoRR | 2023 | DBLP DOI BibTeX RDF | |
28 | Haoran Li, Dadi Guo, Wei Fan, Mingshi Xu, Jie Huang 0009, Fanpu Meng, Yangqiu Song | Multi-step Jailbreaking Privacy Attacks on ChatGPT. | EMNLP (Findings) | 2023 | DBLP DOI BibTeX RDF | |
28 | Hasan Cavusoglu, Huseyin Cavusoglu, Xianjun Geng | Bloatware and Jailbreaking: Strategic Impacts of Consumer-Initiated Modification of Technology Products. | Inf. Syst. Res. | 2020 | DBLP DOI BibTeX RDF | |
28 | Silvio Peroni, David M. Shotton, Fabio Vitali | Jailbreaking your Reference Lists: the OpenCitations Project Strikes Again. | ISWC (Posters & Demos) | 2016 | DBLP BibTeX RDF | |