FacetedDBLP

Searching for "jailbreaking" with no syntactic query expansion in all metadata.

Publication years (Num. hits)
2011 (1) 2016 (1) 2020 (1) 2023 (13) 2024 (20)
Publication types (Num. hits)
article (34) inproceedings (2)

Results
Found 36 publication records. Showing 36 according to the selection in the facets.
Hits | Authors | Title | Venue | Year | Author keywords
36 | Charlie Miller | Mobile Attacks and Defense. | IEEE Secur. Priv. | 2011 | jailbreaking, App Store, Android Market, data execution prevention, DEP, address space layout randomization, ASLR, computer security, malware, SMS, smart phone, Android, iPhone, sandbox, iOS
28 | Andy Zhou, Bo Li, Haohan Wang | Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks. | CoRR | 2024
28 | Yichuan Mo, Yuji Wang, Zeming Wei, Yisen Wang 0001 | Studious Bob Fight Back Against Jailbreaking via Prompt Adversarial Tuning. | CoRR | 2024
28 | Yixin Cheng, Markos Georgopoulos, Volkan Cevher, Grigorios G. Chrysos | Leveraging the Context through Multi-Round Interactions for Jailbreaking Attacks. | CoRR | 2024
28 | Zhenxing Niu, Haodong Ren, Xinbo Gao 0001, Gang Hua 0001, Rong Jin | Jailbreaking Attack against Multimodal Large Language Model. | CoRR | 2024
28 | Divij Handa, Advait Chirmule, Bimal G. Gajera, Chitta Baral | Jailbreaking Proprietary Large Language Models using Word Substitution Cipher. | CoRR | 2024
28 | Guangyu Shen, Siyuan Cheng 0005, Kaiyuan Zhang 0002, Guanhong Tao 0001, Shengwei An, Lu Yan, Zhuo Zhang 0002, Shiqing Ma, Xiangyu Zhang 0001 | Rapid Optimization for Jailbreaking LLMs via Subconscious Exploitation and Echopraxia. | CoRR | 2024
28 | Taeyoun Kim, Suhas Kotha, Aditi Raghunathan | Jailbreaking is Best Solved by Definition. | CoRR | 2024
28 | Huijie Lv, Xiao Wang, Yuansen Zhang, Caishuang Huang, Shihan Dou, Junjie Ye, Tao Gui, Qi Zhang, Xuanjing Huang 0001 | CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models. | CoRR | 2024
28 | Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J. Pappas, Florian Tramèr, Hamed Hassani, Eric Wong 0001 | JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models. | CoRR | 2024
28 | Weikang Zhou, Xiao Wang, Limao Xiong, Han Xia, Yingshuang Gu, Mingxu Chai, Fukang Zhu, Caishuang Huang, Shihan Dou, Zhiheng Xi, Rui Zheng, Songyang Gao, Yicheng Zou, Hang Yan 0001, Yifan Le, Ruohui Wang, Lijun Li, Jing Shao, Tao Gui, Qi Zhang, Xuanjing Huang 0001 | EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models. | CoRR | 2024
28 | Yifan Li 0009, Hangyu Guo, Kun Zhou 0002, Wayne Xin Zhao, Ji-Rong Wen | Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimodal Large Language Models. | CoRR | 2024
28 | Tianlong Li, Xiaoqing Zheng, Xuanjing Huang 0001 | Open the Pandora's Box of LLMs: Jailbreaking LLMs through Representation Engineering. | CoRR | 2024
28 | Zhenhua Wang, Wei Xie 0007, Baosheng Wang, Enze Wang, Zhiwen Gui, Shuoyoucheng Ma, Kai Chen | Foot In The Door: Understanding Large Language Model Jailbreaking via Cognitive Psychology. | CoRR | 2024
28 | Tong Liu, Yingjie Zhang, Zhe Zhao, Yinpeng Dong, Guozhu Meng, Kai Chen | Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction. | CoRR | 2024
28 | Yihan Wang, Zhouxing Shi, Andrew Bai, Cho-Jui Hsieh | Defending LLMs against Jailbreaking Attacks via Backtranslation. | CoRR | 2024
28 | Canaan Yung, Hadi Mohaghegh Dolatabadi, Sarah M. Erfani, Christopher Leckie | Round Trip Translation Defence against Large Language Model Jailbreaking Attacks. | CoRR | 2024
28 | Xuandong Zhao, Xianjun Yang, Tianyu Pang, Chao Du, Lei Li 0005, Yu-Xiang Wang, William Yang Wang | Weak-to-Strong Jailbreaking on Large Language Models. | CoRR | 2024
28 | Xijia Tao, Shuai Zhong, Lei Li 0039, Qi Liu, Lingpeng Kong | ImgTrojan: Jailbreaking Vision-Language Models with ONE Image. | CoRR | 2024
28 | Xingang Guo, Fangxu Yu, Huan Zhang, Lianhui Qin, Bin Hu | COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability. | CoRR | 2024
28 | Daoyuan Wu, Shuai Wang, Yang Liu, Ning Liu | LLMs Can Defend Themselves Against Jailbreaking in a Practical Manner: A Vision Paper. | CoRR | 2024
28 | Yunhao Liu 0002, Gengzhong Feng, Yangyang Sun, Xiangyin Kong | Jailbreaking in closed two-sided platforms. | Inf. Manag. | 2023
28 | Yichen Gong, Delong Ran, Jinyuan Liu, Conglei Wang, Tianshuo Cong, Anyu Wang 0001, Sisi Duan, Xiaoyun Wang | FigStep: Jailbreaking Large Vision-language Models via Typographic Visual Prompts. | CoRR | 2023
28 | Yuanwei Wu, Xiang Li, Yixin Liu, Pan Zhou, Lichao Sun 0001 | Jailbreaking GPT-4V via Self-Adversarial Attacks with System Prompts. | CoRR | 2023
28 | Nan Xu, Fei Wang 0060, Ben Zhou, Bangzheng Li, Chaowei Xiao, Muhao Chen | Cognitive Overload: Jailbreaking Large Language Models with Overloaded Logical Thinking. | CoRR | 2023
28 | Zhexin Zhang, Junxiao Yang, Pei Ke, Minlie Huang | Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization. | CoRR | 2023
28 | Alexander Robey, Eric Wong 0001, Hamed Hassani, George J. Pappas | SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks. | CoRR | 2023
28 | Xiaoyu Zhang, Cen Zhang, Tianlin Li, Yihao Huang 0006, Xiaojun Jia, Xiaofei Xie, Yang Liu, Chao Shen | A Mutation-Based Method for Multi-Modal Jailbreaking Attack Detection. | CoRR | 2023
28 | Anay Mehrotra, Manolis Zampetakis, Paul Kassianik, Blaine Nelson, Hyrum Anderson, Yaron Singer, Amin Karbasi | Tree of Attacks: Jailbreaking Black-Box LLMs Automatically. | CoRR | 2023
28 | Patrick Chao, Alexander Robey, Edgar Dobriban, Hamed Hassani, George J. Pappas, Eric Wong 0001 | Jailbreaking Black Box Large Language Models in Twenty Queries. | CoRR | 2023
28 | Yi Liu, Gelei Deng, Zhengzi Xu, Yuekang Li, Yaowen Zheng, Ying Zhang, Lida Zhao, Tianwei Zhang 0004, Yang Liu 0003 | Jailbreaking ChatGPT via Prompt Engineering: An Empirical Study. | CoRR | 2023
28 | Raz Lapid, Ron Langberg, Moshe Sipper | Open Sesame! Universal Black Box Jailbreaking of Large Language Models. | CoRR | 2023
28 | Haoran Li 0003, Dadi Guo, Wei Fan, Mingshi Xu, Yangqiu Song | Multi-step Jailbreaking Privacy Attacks on ChatGPT. | CoRR | 2023
28 | Haoran Li, Dadi Guo, Wei Fan, Mingshi Xu, Jie Huang 0009, Fanpu Meng, Yangqiu Song | Multi-step Jailbreaking Privacy Attacks on ChatGPT. | EMNLP (Findings) | 2023
28 | Hasan Cavusoglu, Huseyin Cavusoglu, Xianjun Geng | Bloatware and Jailbreaking: Strategic Impacts of Consumer-Initiated Modification of Technology Products. | Inf. Syst. Res. | 2020
28 | Silvio Peroni, David M. Shotton, Fabio Vitali | Jailbreaking your Reference Lists: the OpenCitations Project Strikes Again. | ISWC (Posters & Demos) | 2016
Maintained by L3S.
Previously maintained by Jörg Diederich.
Based upon DBLP by Michael Ley.
Data released under the ODC-BY 1.0 license.