Liane Lovitt
Research Areas
Topic Modeling, Explainable Artificial Intelligence (XAI), Natural Language Processing Techniques, Neural Networks and Applications, Reinforcement Learning in Robotics
Most-Cited Works
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (2022), cited 360 times
- Predictability and Surprise in Large Generative Models (2022), cited 171 times
- In-context Learning and Induction Heads (2022), cited 84 times
- Evaluating and Mitigating Discrimination in Language Model Decisions (2023), cited 9 times
- Clio: Privacy-Preserving Insights into Real-World AI Use (2024), cited 5 times