Rajashree Agrawal
Reed College (US)
Research Areas
Logic, programming, and type systems; Formal Methods in Verification; Security and Verification in Computing; Model Reduction and Neural Networks; Semantic Web and Ontologies
Most-Cited Works
- Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data (2024), 12 citations
- Many-shot Jailbreaking (2024), 3 citations
- Towards a Scalable Proof Engine: A Performant Prototype Rewriting Primitive for Coq (2024), 2 citations
- Compact Proofs of Model Performance via Mechanistic Interpretability (2024)
- Failures to Find Transferable Image Jailbreaks Between Vision-Language Models (2024)
- Jailbreak Defense in a Narrow Domain: Limitations of Existing Methods and a New Transcript-Classifier Approach (2024)
- Modular addition without black-boxes: Compressing explanations of MLPs that compute numerical integration (2024)