prav719/DeepSeek-R1-Distill-Qwen-32B-flash-attention-2_H100
Tags: Transformers, TensorBoard, Safetensors, Generated from Trainer, trl, sft
Branch: main
Directory: runs
133 kB, 1 contributor, history: 3 commits
Latest commit: prav719, "Model save" (80691f4, verified), 10 months ago
Name                          Last commit message              Last updated
Feb23_14-18-26_mistralft-0    Training in progress, epoch 1    10 months ago
Feb24_05-28-35_mistralft-0    Training in progress, epoch 1    10 months ago
Feb24_05-30-09_mistralft-0    Training in progress, epoch 1    10 months ago
Feb24_05-32-12_mistralft-0    Training in progress, epoch 1    10 months ago
Feb24_05-34-20_mistralft-0    Training in progress, epoch 1    10 months ago
Feb24_05-52-11_mistralft-0    Training in progress, epoch 1    10 months ago
Feb24_06-03-25_mistralft-0    Model save                       10 months ago