1 Innovative architecture and low-cost, high-performance training methods of DeepSeek-V3
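Table 1 Comparison of training costs for various large models
| Model | Gemini Ultra | Llama-3 | Claude 3 | DeepSeek-V3 | GPT-4o | iFlytek Spark X1-13B |
| Training cost (USD) | approx. 190 million | approx. 90 million | approx. 100 million | approx. 5.57 million | approx. 100 million | approx. 7.8 million |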
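To make the gap in Table 1 easier to read, the costs can be expressed as multiples of the DeepSeek-V3 figure. The following is a minimal sketch, assuming the approximate values in Table 1 are taken at face value; the dictionary name `TRAINING_COST_USD`, the helper `cost_ratios`, and the English rendering "iFlytek Spark X1-13B" are illustrative choices, not part of the original article.

```python
# Approximate training costs in USD, transcribed from Table 1.
# These are rounded estimates ("approx."), so the ratios are indicative only.
TRAINING_COST_USD = {
    "Gemini Ultra": 190_000_000,
    "Llama-3": 90_000_000,
    "Claude 3": 100_000_000,
    "DeepSeek-V3": 5_570_000,
    "GPT-4o": 100_000_000,
    "iFlytek Spark X1-13B": 7_800_000,
}


def cost_ratios(costs, baseline="DeepSeek-V3"):
    """Return each model's cost as a multiple of the baseline model's cost."""
    base = costs[baseline]
    return {name: cost / base for name, cost in costs.items()}


if __name__ == "__main__":
    ratios = cost_ratios(TRAINING_COST_USD)
    # Print models from cheapest to most expensive relative to the baseline.
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
        print(f"{name:<22} {ratio:6.1f}x")
```

Under these figures, the reported cost of Gemini Ultra is roughly 34 times that of DeepSeek-V3, while Claude 3 and GPT-4o each come to roughly 18 times.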
DeepSeek: Technological innovations and development trends toward artificial general intelligence
Received date: 2025-02-14
Online published: 2025-04-19
Wenjun WU, Xingchuang LIAO, Jinkun ZHAO. DeepSeek: Technological innovations and development trends toward artificial general intelligence[J]. Science & Technology Review, 2025, 43(6): 14-20. DOI: 10.3981/j.issn.1000-7857.2025.02.00175