
Nick Crispino

PhD Student in Natural Language Processing

📚 Second-year PhD Student
👨‍🏫 Advisor: Professor Chenguang Wang
💻 GitHub 🐦 Twitter

Research

I'm interested in scaling multi-agent approaches to improve downstream task performance, expanding the real-world use cases of agentic workflows, and exploring critical areas of LLM safety through interpretability.


SteeringControl: Holistic Evaluation of Alignment Steering in LLMs


Vincent Siu*, Nicholas Crispino*, David Park, Nathan W. Henry, Zhun Wang, Yang Liu, Dawn Song, Chenguang Wang
in arXiv:2509.13450, 2025
arxiv / code / dataset

SteeringControl is a benchmark that measures the behavioral entanglement arising on out-of-distribution datasets when current representation steering methods are used to improve performance on a single behavior, paired with a modular code framework that standardizes training-free steering methods.
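
Several of the entries on this page involve training-free representation (activation) steering. As a rough, generic sketch of what that means, and not the SteeringControl framework itself, one can add a fixed direction to a layer's residual-stream activations through a forward hook. The model name, layer index, strength, and random direction below are all placeholder assumptions.

```python
# Minimal, illustrative activation-steering sketch (not SteeringControl).
# Uses GPT-2 as a small stand-in model; real methods extract the direction
# from activations rather than sampling it at random.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

layer_idx = 6                                     # which block to steer (arbitrary here)
alpha = 4.0                                       # steering strength (arbitrary here)
steer_dir = torch.randn(model.config.hidden_size)
steer_dir = steer_dir / steer_dir.norm()          # placeholder for an extracted direction

def add_direction(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states;
    # shift every position's residual-stream activation by alpha * direction.
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + alpha * steer_dir.to(hidden)
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.transformer.h[layer_idx].register_forward_hook(add_direction)
inputs = tok("The weather today is", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
handle.remove()                                   # restore the unsteered model
```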


RepIt: Representing Isolated Targets to Steer Language Models


Vincent Siu, Nathan W. Henry, Nicholas Crispino, Yang Liu, Dawn Song, Chenguang Wang
in arXiv:2509.13281, 2025
arxiv / code

RepIt isolates concept-specific representations of refusal that, applied with interpretability-based steering methods, suppress refusal on target concepts while preserving the baseline model's refusal on other concepts.
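
For intuition only, here is a toy NumPy sketch of one way to isolate a concept-specific component of a refusal direction by projecting out what it shares with a general refusal direction. The sizes and random vectors are invented, and this is not the RepIt algorithm itself.

```python
# Toy illustration (not the RepIt method): keep only the part of a
# concept-specific refusal direction that is not shared with a general
# refusal direction, via orthogonal projection.
import numpy as np

rng = np.random.default_rng(0)
d = 64                                    # hidden size (made up for the sketch)

# Pretend difference-of-means vectors over harmful vs. harmless prompts.
general_refusal = rng.normal(size=d)                               # all concepts
concept_refusal = rng.normal(size=d) + 0.8 * general_refusal       # one target concept

def project_out(v, u):
    """Remove from v its component along u."""
    u = u / np.linalg.norm(u)
    return v - (v @ u) * u

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

concept_specific = project_out(concept_refusal, general_refusal)
print("cosine with general direction before:", cosine(concept_refusal, general_refusal))
print("cosine with general direction after: ", cosine(concept_specific, general_refusal))
```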


COSMIC: Generalized Refusal Identification in LLM Activations


Vincent Siu, Nicholas Crispino, Zihao Yu, Sam Pan, Zhun Wang, Yang Liu, Dawn Song, Chenguang Wang
in Findings of the Association for Computational Linguistics (ACL), 2025
arxiv / code

COSMIC improves the direction selection step of refusal steering in LLMs by choosing the direction that maximizes the cosine similarity between paired harmless and harmful behavior, allowing activation steering to be applied more robustly.
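
As a loose illustration of cosine-similarity-based direction selection (the scoring below is simplified and is not COSMIC's actual criterion; all sizes and activations are synthetic), one can score a difference-of-means candidate at each layer by how well it agrees with the paired activation differences and keep the best one.

```python
# Generic sketch of selecting a steering direction by cosine similarity
# (simplified; not the COSMIC scoring criterion).
import numpy as np

rng = np.random.default_rng(1)
n_layers, n_pairs, d = 12, 32, 64         # made-up sizes for the sketch

# Pretend activations: [layer, example, hidden] for paired prompts.
harmful = rng.normal(size=(n_layers, n_pairs, d))
harmless = rng.normal(size=(n_layers, n_pairs, d))

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

best_layer, best_score = None, -np.inf
for layer in range(n_layers):
    candidate = (harmful[layer] - harmless[layer]).mean(axis=0)   # difference of means
    # Score: average cosine similarity between the candidate and each pair's difference.
    score = np.mean([cosine(candidate, harmful[layer, i] - harmless[layer, i])
                     for i in range(n_pairs)])
    if score > best_score:
        best_layer, best_score = layer, score

print(f"selected layer {best_layer} with mean cosine similarity {best_score:.3f}")
```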


MLAN: Language-Based Instruction Tuning Improves Zero-Shot Generalization of Multimodal Large Language Models


Jianhong Tu, Zhuohao Ni, Nicholas Crispino, Zihao Yu, Michael Bendersky, Beliz Gunel, Ruoxi Jia, Xin Liu, Lingjuan Lyu, Dawn Song, Chenguang Wang
in arXiv preprint, 2024
arxiv / code

MLAN proposes focusing on text-only instances during instruction tuning to improve instruction following in both the vision and language modalities of multimodal large language models, at a lower cost than traditional vision-only or vision-based methods.


Agent Instructs Large Language Models to be General Zero-Shot Reasoners


Nicholas Crispino, Kyle Montgomery, Fankun Zeng, Dawn Song, Chenguang Wang
in Forty-first International Conference on Machine Learning (ICML), 2024
arxiv / code

Zero-shot AgentInstruct uses an agent to generate dataset-specific instructions to improve the zero-shot performance of instruction-following large language models.
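
Structurally, and glossing over the actual multi-step agent, the workflow boils down to generating one set of instructions per dataset and prepending it to every zero-shot query. In the sketch below, `call_llm` is a hypothetical placeholder for any chat-completion API and is not part of the released code.

```python
# Structural sketch of an agent-generated-instructions workflow (heavily
# simplified relative to the paper): build dataset-level instructions once,
# then reuse them for every zero-shot example.

def call_llm(prompt: str) -> str:
    # Hypothetical placeholder; plug in whatever chat-completion API you use.
    raise NotImplementedError

def build_dataset_instructions(task_name: str, task_description: str, examples: list[str]) -> str:
    # Agent step: ask a strong model to write task-specific instructions
    # from a description of the dataset and a few unlabeled inputs.
    prompt = (
        f"You are helping a language model solve the task '{task_name}'.\n"
        f"Task description: {task_description}\n"
        "Sample inputs:\n" + "\n".join(examples[:3]) + "\n\n"
        "Write clear step-by-step instructions for solving this task."
    )
    return call_llm(prompt)

def zero_shot_answer(instructions: str, question: str) -> str:
    # Inference step: the same dataset-level instructions are prepended
    # to every zero-shot query.
    return call_llm(f"{instructions}\n\nQuestion: {question}\nAnswer:")
```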





Design and source code from Jon Barron's website. Template from Leonid Keselman. Modifications from me and Claude Code ✨.