# DeepSeek V3.2

First convert the Hugging Face model weights to the format required by our inference demo. Set `MP` to match your available GPU count:
```bash
cd inference
export EXPERTS=256
python convert.py --hf-ckpt-path ${HF_CKPT_PATH} --save-path ${SAVE_PATH} --n-experts ${EXPERTS} --model-parallel ${MP}
```
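`MP` must be set before running the command above. A minimal sketch for deriving it from the number of visible GPUs (this assumes `nvidia-smi` is on your `PATH`; the fallback value of 1 is an arbitrary default, not part of the original instructions):

```shell
# Count visible GPUs; nvidia-smi -L prints one line per device.
GPU_COUNT=$(nvidia-smi -L 2>/dev/null | wc -l)
# Fall back to 1 when no GPU is detected (e.g. nvidia-smi is missing).
export MP=$(( GPU_COUNT > 0 ? GPU_COUNT : 1 ))
```

You can also simply `export MP=8` (or whatever your GPU count is) by hand.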

Launch the interactive chat interface and start exploring DeepSeek's capabilities:
```bash
export CONFIG=config_671B_v3.2.json
torchrun --nproc-per-node ${MP} generate.py --ckpt-path ${SAVE_PATH} --config ${CONFIG} --interactive
```