---
license: mit
library_name: transformers
pipeline_tag: text-generation
tags:
- tensormind
- causal-lm
- text-generation
- chinese
- custom-code
language:
- zh
- en
model-index:
- name: TensorMind
  results:
  - task:
      type: text-generation
      name: Chinese Multiple-Choice Evaluation
    dataset:
      type: custom
      name: C-Eval
    metrics:
    - type: accuracy
      value: 27.27
      name: C-Eval (0-shot)
  - task:
      type: text-generation
      name: Chinese Multiple-Choice Evaluation
    dataset:
      type: custom
      name: CMMLU
    metrics:
    - type: accuracy
      value: 25.26
      name: CMMLU (0-shot)
  - task:
      type: text-generation
      name: Chinese Multiple-Choice Evaluation
    dataset:
      type: custom
      name: A-CLUE
    metrics:
    - type: accuracy
      value: 25.43
      name: A-CLUE (0-shot)
  - task:
      type: text-generation
      name: Chinese Multiple-Choice Evaluation
    dataset:
      type: custom
      name: TMMLU+
    metrics:
    - type: accuracy
      value: 24.96
      name: TMMLU+ (0-shot)
---

# TensorMind (0.5B)

TensorMind is a 536.9M-parameter causal language model for lightweight Chinese/English text generation.

## Model Details

- Architecture: Decoder-only Transformer (`TensorMindForCausalLM`)
- Layers: 32
- Hidden size: 1024
- Heads / KV heads: 16 / 8 (GQA)
- Context length: 32,768
- Vocab size: 32,768
- Positional encoding: RoPE
- Activation: SiLU
- Parameters: 536,941,568 (~0.5B)

## Quick Start

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "TensorMind/TensorMind"
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
)

prompt = "请用三句话介绍一下你自己。"  # "Please introduce yourself in three sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Benchmark Snapshot

Evaluation time: 2026-03-07 00:40 (UTC+8), zero-shot (`n-shot=0`).
| Model | Params | C-Eval | CMMLU | A-CLUE | TMMLU+ | AGIEval |
|---|---:|---:|---:|---:|---:|---:|
| TensorMind | 0.5B | 27.27 | 25.26 | 25.43 | 24.96 | 33.56 |

![TensorMind benchmark table](./assets/compare_table_tensormind.png)
![TensorMind benchmark radar](./assets/compare_radar_tensormind.png)

## Intended Use

- Lightweight chat and text generation
- Local experimentation and teaching
- Baseline model for research and fine-tuning

## Limitations

- This is a small model and can produce factual errors.
- The benchmark numbers above come from multiple-choice evaluations and do not fully reflect open-ended generation quality.
- Outputs may contain biased or unsafe content; apply filtering for production use.

## License

MIT License.
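
## Appendix: KV-Cache Footprint of the GQA Layout

The GQA configuration listed under Model Details (16 query heads sharing 8 KV heads) halves the per-sequence KV cache relative to full multi-head attention. A minimal sketch of that arithmetic, assuming the conventional head dimension of `hidden_size / num_heads = 1024 / 16 = 64` and fp16 cache storage (both are assumptions; check the released config for the actual values):

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """KV-cache size for one sequence.

    The leading factor of 2 counts the two cached tensors per layer
    (one for keys, one for values).
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# Figures from the Model Details list; head_dim = 64 is an assumption.
gqa = kv_cache_bytes(layers=32, kv_heads=8, head_dim=64, seq_len=32_768)
mha = kv_cache_bytes(layers=32, kv_heads=16, head_dim=64, seq_len=32_768)

print(f"GQA (8 KV heads):  {gqa / 2**30:.1f} GiB")  # 2.0 GiB at the full 32k context
print(f"MHA (16 KV heads): {mha / 2**30:.1f} GiB")  # 4.0 GiB
```

Under these assumptions, a single sequence at the full 32,768-token context needs about 2 GiB of fp16 KV cache, versus roughly 4 GiB if all 16 heads kept their own keys and values.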