Reading
Research
AgentHarm
GT-HarmBench
PostTrainBench: Can LLM Agents Automate LLM Post-Training?
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Alignment faking in large language models
Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs
Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Reward Design
Leisure
The Lonely Skier
Butter
Rules for a Knight
A Wild Sheep Chase
The Wind-Up Bird Chronicle
Dune
The Riddle-Master Trilogy
Animal Farm
The Odyssey
The Iliad
Twenty Thousand Leagues Under the Sea
A Wizard of Earthsea
First Person Singular