@AnthropicAI:
New research: When open-source models are fine-tuned on seemingly benign chemical synthesis information generated by frontier models, they become much better at chemical weapons tasks. We call this an elicitation attack. https://t.co/44mYnxFKzr