Tristan Hume
Research Areas
Topic Modeling, Natural Language Processing Techniques, Explainable Artificial Intelligence (XAI), Software Engineering Research, Ethics and Social Impacts of AI
Most-Cited Works
- Discovering Language Model Behaviors with Model-Written Evaluations (2023), 119 citations
- The Capacity for Moral Self-Correction in Large Language Models (2023), 48 citations
- Measuring Faithfulness in Chain-of-Thought Reasoning (2023), 21 citations
- Specific versus General Principles for Constitutional AI (2023), 7 citations