OpenAI told the Financial Times that it found evidence linking DeepSeek to the use of distillation — a common technique in which developers train smaller AI models on the outputs of larger, more capable ones.
Some early DeepSeek testers were surprised when the AI identified itself as ChatGPT in its responses, prompting speculation that DeepSeek's model might have been trained on ChatGPT conversations.
Meanwhile, DeepSeek's own research paper acknowledges that its smaller R1 variants are built on other open-source systems via distillation: "We demonstrate that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance compared to the reasoning patterns discovered through RL on small models."