mirror of
https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp
synced 2025-12-17 02:48:41 +08:00
add figure
.gitattributes (vendored): 1 line changed
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+assets/cost.png filter=lfs diff=lfs merge=lfs -text
README.md

@@ -2,7 +2,7 @@
 license: mit
 library_name: transformers
 base_model:
-- deepseek-ai/DeepSeek-V3.1-Base
+- deepseek-ai/DeepSeek-V3.2-Exp-Base
 ---

 # DeepSeek-V3.2-Exp
@@ -50,7 +50,7 @@ We are excited to announce the official release of DeepSeek-V3.2-Exp, an experim
 This experimental release represents our ongoing research into more efficient transformer architectures, particularly focusing on improving computational efficiency when processing extended text sequences.

 <div align="center">
-<img src="cost.jpg" >
+<img src="assets/cost.png" >
 </div>

 - DeepSeek Sparse Attention (DSA) achieves fine-grained sparse attention for the first time, delivering substantial improvements in long-context training and inference efficiency while maintaining virtually identical model output quality.
assets/cost.png (new file): 3 lines added
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d4b8e78d9a3220108e480bb44383b03a0d8ccbf7fc41ac113c5e67f5d7c8a44d
+size 102206
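The added file is not the PNG itself but a Git LFS pointer: a small three-line key/value text file that stands in for the real binary, which LFS fetches by its `oid`. A minimal Python sketch (not part of the commit) of reading such a pointer back into its fields:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a dict of its key/value fields.

    Each line is "<key> <value>"; the value may itself contain spaces,
    so we split on the first space only.
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# The exact pointer content added by this commit for assets/cost.png.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:d4b8e78d9a3220108e480bb44383b03a0d8ccbf7fc41ac113c5e67f5d7c8a44d
size 102206
"""

fields = parse_lfs_pointer(pointer)
print(fields["size"])   # byte size of the real file content -> 102206
print(fields["oid"])    # "<algorithm>:<hash>" identifying the object
```

The `size` here (102206 bytes, about 100 KB) is the size of the actual image stored in LFS, not of the pointer file committed to the repository.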