---
base_model: Writer/Palmyra-Local-1.7B
tags:
- instruct
- finetune
- DPO
- distillation
- small
- local
- On Device
- Transformers.js
- Enterprise LLM
- Enterprise
- Enterprise ready
model_type: palmyra
model-index:
- name: Palmyra-Local-1.7B
  results: []
license: other
license_name: writer-open-model-license
license_link: https://writer.com/legal/open-model-license/
extra_gated_prompt: >-
  By clicking "Agree", you agree to the [License
  Agreement](https://writer.com/legal/open-model-license/)
  and acknowledge Writer's [Privacy
  Policy](https://writer.com/legal/acceptable-use/).
extra_gated_fields:
  Name: text
  Email: text
  Organization or Affiliation: text
  Receive email updates and promotions on Writer products, services, and research?:
    type: select
    options:
      - 'Yes'
      - 'No'
  I acknowledge that this model is for non-commercial use only unless I acquire a separate license from Writer: checkbox
language:
- en
---

# Palmyra-Local-1.7B-Instruct

## Introduction

Palmyra-Local is part of the Palmyra series of domain-specialized language models, designed for high performance on enterprise and task-specific use cases. This release features a 1.7-billion-parameter instruction-tuned variant of Palmyra-Local, built for local deployment and optimized for enterprise-grade language understanding and generation.

Compared to earlier versions, Palmyra-Local brings the following enhancements:

- **Stronger domain reasoning in code and math**, powered by targeted expert tuning and curated domain datasets.
- **Improved instruction following**, including long-form generation (8K+ tokens), accurate handling of structured data such as tables, and consistent structured output, especially JSON.
- **Robust prompt handling**, enabling nuanced role-play, dynamic agent behavior, and complex prompt chaining in enterprise workflows.
- **Extended context support**, with a maximum context window of 128K tokens and generation of up to 8K tokens.
- **Multilingual capabilities**, supporting more than 29 languages, including English, Spanish, French, German, Chinese, Arabic, and Japanese.
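
Consistent JSON output is most useful when it is validated downstream before being consumed. A minimal sketch of such a check, assuming a "respond only in JSON" style prompt (the sample response string and helper name are illustrative, not part of the model's API):

```python
import json

def parse_model_json(text):
    """Return the parsed object if `text` is valid JSON, else None."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

# Illustrative model response to a prompt that requested JSON-only output
sample = '{"product": "Acme Notebook", "in_stock": true, "quantity": 42}'
record = parse_model_json(sample)
print(record)  # a Python dict on success, None on malformed output
```

Returning `None` instead of raising lets a pipeline retry the generation or fall back gracefully when the model emits malformed JSON.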

This repository includes the **instruction-tuned Palmyra-Local 1.7B model**, with the following architecture details:

- **Type**: Causal language model
- **Training Stages**: Pretraining + instruction tuning
- **Architecture**: Transformer with RoPE positional encoding
- **Total Parameters**: 1.7B
- **Number of Layers**: 28
- **Attention**: Grouped-query attention (GQA)
|
| |
|
| | ## Training Details |
| | - Architecture: Palmyra |
| | - Training Method: From scratch |
| | - Attention Mechanism: GQA |
| | - Training Data: [~1T packed dataset] |
| |
|
| |
|

## Benchmark Results

| Benchmark | Palmyra-Local-1.7B | Qwen2.5-1.5B-Instruct | GPT-4 mini | Llama-3.2-1B-Instruct | Llama-3.2-3B-Instruct |
|-----------|--------------------|-----------------------|------------|-----------------------|-----------------------|
| HumanEval | 74.10 | 61.60 | N/A | N/A | N/A |
| MBPP | 66.86 | 63.20 | N/A | N/A | N/A |
| GSM8K | 81.00 | 73.20 | 88.6 | N/A | 75.6 |
| MATH | 60.94 | 55.20 | 64.0 | N/A | 46.7 |
| MMLU | 59.82 | 58.37 | 67.3 | 32.2 | 58.0 |
| MMLU Pro | 34.10 | 32.40 | 52.8 | N/A | N/A |
| Average | 62.80 | 57.33 | N/A | N/A | N/A |
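
The Average row is the arithmetic mean of the six benchmark scores; it can be reproduced directly from the values in the table (shown here for the two models with complete results):

```python
# Per-benchmark scores copied from the table above, in row order:
# HumanEval, MBPP, GSM8K, MATH, MMLU, MMLU Pro
scores = {
    "Palmyra-Local-1.7B": [74.10, 66.86, 81.00, 60.94, 59.82, 34.10],
    "Qwen2.5-1.5B-Instruct": [61.60, 63.20, 73.20, 55.20, 58.37, 32.40],
}

for model, vals in scores.items():
    # Arithmetic mean, rounded to two decimals as in the table
    print(f"{model}: {sum(vals) / len(vals):.2f}")
# → Palmyra-Local-1.7B: 62.80
# → Qwen2.5-1.5B-Instruct: 57.33
```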

**Notes:**

- **HumanEval** and **MBPP**: benchmark scores for these tasks were not available for **GPT-4 mini**, **Llama-3.2-1B-Instruct**, and **Llama-3.2-3B-Instruct** in the sources published by the model creators.

## Usage

### Install dependencies

`requirements.txt`:

```txt
transformers==4.51.0
torch==2.6.0
tokenizers==0.21.1
accelerate==1.6.0
```

```bash
pip install -r requirements.txt
```
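
Before running inference, it can help to confirm that the pinned versions are actually the ones installed in the active environment. A small sketch using only the standard library (the package list is copied from `requirements.txt`; the helper name is illustrative):

```python
import importlib.metadata as md

# Versions pinned in requirements.txt
PINNED = {
    "transformers": "4.51.0",
    "torch": "2.6.0",
    "tokenizers": "0.21.1",
    "accelerate": "1.6.0",
}

def check(pkg, pinned):
    """Report the installed version of `pkg` against the pinned one."""
    try:
        installed = md.version(pkg)
    except md.PackageNotFoundError:
        return f"{pkg}: not installed (pinned: {pinned})"
    status = "ok" if installed == pinned else f"differs from pinned {pinned}"
    return f"{pkg}: {installed} ({status})"

for pkg, pinned in PINNED.items():
    print(check(pkg, pinned))
```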

---

### Inference

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Writer/Palmyra-local-1_7B"
auth_token = "xxx"  # replace with your Hugging Face access token

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True, token=auth_token)

# Load model in half precision to reduce memory usage
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
    token=auth_token,
)

# Prepare input
messages = [
    {"role": "user", "content": "Write a blog post about strangelets"},
]

# Use the chat template if available; fall back to plain encoding otherwise
if hasattr(tokenizer, "apply_chat_template"):
    input_ids = tokenizer.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
    )
else:
    input_text = messages[0]["content"]
    input_ids = tokenizer(input_text, return_tensors="pt").input_ids

# Move inputs to the same device as the model
input_ids = input_ids.to(model.device)

# Generation config (do_sample=True so temperature/top_p take effect)
gen_conf = {
    "max_new_tokens": 256,
    "eos_token_id": tokenizer.eos_token_id,
    "temperature": 0.7,
    "top_p": 0.9,
    "do_sample": True,
}

# Generate output
with torch.inference_mode():
    output_id = model.generate(input_ids, **gen_conf)

# Decode only the newly generated tokens
output_text = tokenizer.decode(output_id[0][input_ids.shape[1]:], skip_special_tokens=True)

print(output_text)
```

### Citation and Related Information

To cite this model:

```
@misc{Palmyra-Local-1.7B,
  author = {Writer Engineering team},
  title = {{Palmyra-Local-1.7B: A powerful LLM designed for on-device use}},
  howpublished = {\url{https://dev.writer.com}},
  year = 2025,
  month = {March}
}
```

**Contact**: Hello@writer.com