Instructions to use vtriple/Llama-3.1-8B-yara with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use vtriple/Llama-3.1-8B-yara with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
model = PeftModel.from_pretrained(base_model, "vtriple/Llama-3.1-8B-yara")

llama-cpp-python

How to use vtriple/Llama-3.1-8B-yara with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="vtriple/Llama-3.1-8B-yara",
	filename="llama3_8b_yara.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps

llama.cpp

How to use vtriple/Llama-3.1-8B-yara with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf vtriple/Llama-3.1-8B-yara
# Run inference directly in the terminal:
llama-cli -hf vtriple/Llama-3.1-8B-yara

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf vtriple/Llama-3.1-8B-yara
# Run inference directly in the terminal:
llama-cli -hf vtriple/Llama-3.1-8B-yara

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf vtriple/Llama-3.1-8B-yara
# Run inference directly in the terminal:
./llama-cli -hf vtriple/Llama-3.1-8B-yara

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf vtriple/Llama-3.1-8B-yara
# Run inference directly in the terminal:
./build/bin/llama-cli -hf vtriple/Llama-3.1-8B-yara

Use Docker

docker model run hf.co/vtriple/Llama-3.1-8B-yara

LM Studio
Jan
Ollama
How to use vtriple/Llama-3.1-8B-yara with Ollama:
```
ollama run hf.co/vtriple/Llama-3.1-8B-yara
```

Unsloth Studio new

How to use vtriple/Llama-3.1-8B-yara with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for vtriple/Llama-3.1-8B-yara to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for vtriple/Llama-3.1-8B-yara to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for vtriple/Llama-3.1-8B-yara to start chatting

Pi new

How to use vtriple/Llama-3.1-8B-yara with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf vtriple/Llama-3.1-8B-yara

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "vtriple/Llama-3.1-8B-yara"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use vtriple/Llama-3.1-8B-yara with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf vtriple/Llama-3.1-8B-yara

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default vtriple/Llama-3.1-8B-yara

Run Hermes

hermes

Docker Model Runner
How to use vtriple/Llama-3.1-8B-yara with Docker Model Runner:
```
docker model run hf.co/vtriple/Llama-3.1-8B-yara
```

Lemonade

How to use vtriple/Llama-3.1-8B-yara with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull vtriple/Llama-3.1-8B-yara

Run and chat with the model

lemonade run user.Llama-3.1-8B-yara-{{QUANT_TAG}}

List all available models

lemonade list

Model Card for LLaMA 3.1 8B Instruct - YARA Rule Generation Fine-tuned

This model is a fine-tuned version of the LLaMA 3.1 8B Instruct model, specifically adapted for YARA rule generation and cybersecurity-related tasks.

Model Details

Model Description

This model is based on the LLaMA 3.1 8B Instruct model and has been fine-tuned on a custom dataset of YARA rules and cybersecurity-related content. It is designed to assist in generating YARA rules and provide more accurate and relevant responses to queries in the cybersecurity domain, with a focus on malware detection and threat hunting.

Developed by: Wyatt Roersma (No organization affiliation)
Model type: Instruct-tuned Large Language Model
Language(s) (NLP): English (primary), with potential for limited multilingual capabilities
License: [Specify the license, likely related to the original LLaMA 3.1 license]
Finetuned from model: meta-llama/Meta-Llama-3.1-8B-Instruct

Model Sources

Repository: https://huggingface.co/vtriple/Llama-3.1-8B-yara

Uses

Direct Use

This model can be used for a variety of cybersecurity-related tasks, including:

Generating YARA rules for malware detection
Assisting in the interpretation and improvement of existing YARA rules
Answering questions about YARA syntax and best practices
Providing explanations of cybersecurity threats and vulnerabilities
Offering guidance on malware analysis and threat hunting techniques

Out-of-Scope Use

This model should not be used for:

Generating or assisting in the creation of malicious code
Providing legal or professional security advice without expert oversight
Making critical security decisions without human verification
Replacing professional malware analysis or threat intelligence processes

Bias, Risks, and Limitations

The model may reflect biases present in its training data and the original LLaMA 3.1 model.
It may occasionally generate incorrect or inconsistent YARA rules, especially for very specific or novel malware families.
The model's knowledge is limited to its training data cutoff and does not include real-time threat intelligence.
Generated YARA rules should always be reviewed and tested by security professionals before deployment.

Recommendations

Users should verify and test all generated YARA rules before implementation. The model should be used as an assistant tool to aid in rule creation and cybersecurity tasks, not as a replacement for expert knowledge or up-to-date threat intelligence. Always consult with cybersecurity professionals for critical security decisions and rule deployments.

How to Get Started with the Model

Use the following code to get started with the model:

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel, PeftConfig

# Load the model
model_name = "vtriple/Llama-3.1-8B-yara"
config = PeftConfig.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(model, model_name)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Example usage
prompt = "Generate a YARA rule to detect a PowerShell-based keylogger"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=500)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Training Details

Training Data

The model was fine-tuned on a custom dataset of YARA rules, cybersecurity-related questions and answers, and malware analysis reports. [You may want to add more specific details about your dataset here]

Training Procedure

Training Hyperparameters

Training regime: bf16 mixed precision
Optimizer: AdamW
Learning rate: 5e-5
Batch size: 4
Gradient accumulation steps: 4
Epochs: 5
Max steps: 4000

Evaluation

A custom YARA evaluation dataset was used to assess the model's performance in generating accurate and effective YARA rules. [You may want to add more details about your evaluation process and results]

Environmental Impact

Hardware Type: NVIDIA A100
Hours used: 12 Hours
Cloud Provider: vast.io

Technical Specifications

Model Architecture and Objective

This model uses the LLaMA 3.1 8B architecture with additional LoRA adapters for fine-tuning. It was trained using a causal language modeling objective on YARA rules and cybersecurity-specific data.

Compute Infrastructure

Hardware

Single NVIDIA A100 GPU

Software

Python 3.8+
PyTorch 2.0+
Transformers 4.28+
PEFT 0.12.0

Model Card Author

Wyatt Roersma

Model Card Contact

For questions about this model, please email Wyatt Roersma at wyattroersma@gmail.com.

Downloads last month: 43

GGUF

Model size

83.9M params

Architecture

llama

Hardware compatibility

We're not able to determine the quantization variants.

View all variants

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for vtriple/Llama-3.1-8B-yara

Base model

meta-llama/Llama-3.1-8B

Finetuned

meta-llama/Llama-3.1-8B-Instruct

Adapter

(2250)

this model