# DeepSeek V3.2

First convert the Hugging Face model weights to the format required by our inference demo. Set `MP` to match your available GPU count:
```bash
cd inference
export EXPERTS=256
python convert.py --hf-ckpt-path ${HF_CKPT_PATH} --save-path ${SAVE_PATH} --n-experts ${EXPERTS} --model-parallel ${MP}
```
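`MP` must be set before running the command above. A minimal sketch for deriving it from the number of visible GPUs (this assumes `nvidia-smi` is on your `PATH`; the fallback value of 1 is an arbitrary default, not part of the original instructions):

```shell
# Count visible GPUs; nvidia-smi -L prints one line per device.
GPU_COUNT=$(nvidia-smi -L 2>/dev/null | wc -l)
# Fall back to 1 when no GPU is detected (e.g. nvidia-smi is missing).
export MP=$(( GPU_COUNT > 0 ? GPU_COUNT : 1 ))
```

You can also simply `export MP=8` (or whatever your GPU count is) by hand.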

Launch the interactive chat interface and start exploring DeepSeek's capabilities:
```bash
export CONFIG=config_671B_v3.2.json
torchrun --nproc-per-node ${MP} generate.py --ckpt-path ${SAVE_PATH} --config ${CONFIG} --interactive
```