News

The researchers argue that chain-of-thought (CoT) monitoring can help developers detect when models begin to exploit flaws in their training, ...