fine-tune #52
by m-hasnain-sabqi - opened
- .eval_results/MathArena--aime_2026.yaml +0 -8
- .eval_results/MathArena--hmmt_feb_2026.yaml +0 -8
- .eval_results/swe_bench_verified.yaml +0 -19
- .eval_results/terminal_bench.yaml +0 -11
- .eval_results/terminal_bench_2.yaml +0 -10
- .eval_results/yc-bench.yaml +0 -9
- README.md +3 -16
- chat_template.jinja +2 -2
.eval_results/MathArena--aime_2026.yaml
DELETED
@@ -1,8 +0,0 @@
-- dataset:
-    id: MathArena/aime_2026
-  task_id: MathArena/aime_2026
-  value: 95.83
-  date: '2026-02-18'
-  source:
-    url: https://matharena.ai/?comp=aime--aime_2026
-    name: Official MathArena Evaluation
.eval_results/MathArena--hmmt_feb_2026.yaml
DELETED
@@ -1,8 +0,0 @@
-- dataset:
-    id: MathArena/hmmt_feb_2026
-  task_id: MathArena/hmmt_feb_2026
-  value: 86.36
-  date: '2026-02-23'
-  source:
-    url: https://matharena.ai/?comp=hmmt--hmmt_feb_2026
-    name: Official MathArena Evaluation
.eval_results/swe_bench_verified.yaml
DELETED
@@ -1,19 +0,0 @@
-- dataset:
-    id: SWE-bench/SWE-bench_Verified
-  task_id: swe_bench_%_resolved
-  value: 72.80
-  source:
-    url: https://www.swebench.com/
-    name: SWE-Bench official evaluation
-  user: nielsr
-  notes: high reasoning, official
-
-- dataset:
-    id: SWE-bench/SWE-bench_Verified
-  task_id: swe_bench_%_resolved
-  value: 77.8
-  source:
-    url: https://huggingface.co/zai-org/GLM-5/
-    name: Model card
-  user: nielsr
-  notes: Z.ai reported number
.eval_results/terminal_bench.yaml
DELETED
@@ -1,11 +0,0 @@
-- dataset:
-    id: harborframework/terminal-bench-2.0
-  task_id: terminal_bench
-  value: 52.4
-  date: '2026-02-23'
-  source:
-    url: https://www.tbench.ai/leaderboard/terminal-bench/2.0
-    name: Terminal-Bench Leaderboard
-  user: burtenshaw
-  notes: "agent: Terminus 2"
-
.eval_results/terminal_bench_2.yaml
DELETED
@@ -1,10 +0,0 @@
-- dataset:
-    id: harborframework/terminal-bench-2.0
-  task_id: terminalbench_2
-  value: 52.4
-  date: '2026-02-23'
-  source:
-    url: https://www.tbench.ai/leaderboard/terminal-bench/2.0
-    name: Terminal-Bench Leaderboard
-  user: SaylorTwift
-  notes: "agent: Terminus 2"
.eval_results/yc-bench.yaml
DELETED
@@ -1,9 +0,0 @@
-- dataset:
-    id: collinear-ai/yc-bench
-  task_id: medium
-  value: 1208190
-  date: "2026-03-24"
-  source:
-    url: https://github.com/collinear-ai/yc-bench
-    name: "YC-Bench eval"
-  notes: "avg final funds (USD) across seeds 1,2,3. GLM-5 (via OpenRouter z-ai/glm-5)"
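The deleted `.eval_results/*.yaml` files all follow one small record schema: a nested `dataset.id`, plus `task_id`, `value`, an optional `date`, a `source` with `url` and `name`, and optional `user`/`notes` fields. A minimal sketch of that schema in Python, with one record from the diffs above inlined as a dict; the `validate` helper is hypothetical, not part of the repository:

```python
# Illustrative sketch: one eval-result record from the deleted
# .eval_results/MathArena--aime_2026.yaml, inlined as a Python dict.
# The validate() helper is hypothetical, not part of the repo.
RECORD = {
    "dataset": {"id": "MathArena/aime_2026"},
    "task_id": "MathArena/aime_2026",
    "value": 95.83,
    "date": "2026-02-18",
    "source": {
        "url": "https://matharena.ai/?comp=aime--aime_2026",
        "name": "Official MathArena Evaluation",
    },
}

def validate(record: dict) -> tuple:
    """Check the required fields and return (dataset id, score)."""
    required = {"dataset", "task_id", "value", "source"}
    missing = required - record.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return record["dataset"]["id"], record["value"]

print(validate(RECORD))  # -> ('MathArena/aime_2026', 95.83)
```

The `date`, `user`, and `notes` fields vary between the deleted files, so the sketch treats them as optional.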
README.md
CHANGED
@@ -1,7 +1,7 @@
 ---
 language:
-- en
-- zh
+- en
+- zh
 library_name: transformers
 license: mit
 pipeline_tag: text-generation
@@ -22,11 +22,6 @@ pipeline_tag: text-generation
 👉 One click to <a href="https://chat.z.ai">GLM-5</a>.
 </p>
 
-<p align="center">
-[<a href="https://huggingface.co/papers/2602.15763" target="_blank">Paper</a>]
-[<a href="https://github.com/zai-org/GLM-5" target="_blank">GitHub</a>]
-</p>
-
 ## Introduction
 
 We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling is still one of the most important ways to improve the intelligence efficiency of Artificial General Intelligence (AGI). Compared to GLM-4.5, GLM-5 scales from 355B parameters (32B active) to 744B parameters (40B active), and increases pre-training data from 23T to 28.5T tokens. GLM-5 also integrates DeepSeek Sparse Attention (DSA), largely reducing deployment cost while preserving long-context capacity.
@@ -154,12 +149,4 @@ vLLM, SGLang, KTransformers, and xLLM all support local deployment of GLM-5. A s
 
 ## Citation
 
-```
-@article{glm5team2026glm5,
-  title={GLM-5: from Vibe Coding to Agentic Engineering},
-  author={GLM-5 Team and Aohan Zeng and Xin Lv and Zhenyu Hou and Zhengxiao Du and Qinkai Zheng and Bin Chen and Da Yin and Chendi Ge and Chengxing Xie and others},
-  journal={arXiv preprint arXiv:2602.15763},
-  year={2026},
-  url={https://huggingface.co/papers/2602.15763}
-}
-```
+Our technical report is coming soon.
chat_template.jinja
CHANGED
@@ -32,10 +32,10 @@ For each function call, output the function name and arguments within the follow
 {%- set ns = namespace(last_user_index=-1) %}
 {%- for m in messages %}
 {%- if m.role == 'user' %}
-{%
+{% set ns.last_user_index = loop.index0 -%}
 {%- endif %}
 {%- endfor %}
-{%
+{% for m in messages %}
 {%- if m.role == 'user' -%}<|user|>{{ visible_text(m.content) }}
 {%- elif m.role == 'assistant' -%}
 <|assistant|>
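The template change above restores two statements that were truncated to a bare `{%`: recording the index of the last user turn in a Jinja `namespace` (a plain `{% set %}` inside a `for` loop would not persist across iterations), then re-opening the loop over `messages` to render each turn. The last-user-index logic, mirrored in plain Python for illustration (the helper name is ours, not part of the template):

```python
def last_user_index(messages: list) -> int:
    """Mirror of the fixed Jinja logic: ns = namespace(last_user_index=-1),
    then ns.last_user_index = loop.index0 for every message whose role is 'user'.
    Returns -1 when there is no user message."""
    idx = -1
    for i, m in enumerate(messages):
        if m["role"] == "user":
            idx = i
    return idx

messages = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello"},
    {"role": "user", "content": "thanks"},
]
print(last_user_index(messages))  # -> 2
```

Templates typically use this index to mark where the final user prompt begins, e.g. to decide which turns keep their reasoning content.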