Instructions to use Esperanto/Protein-Llama-3-8B with libraries, inference providers, notebooks, and local apps.

Libraries

How to use Esperanto/Protein-Llama-3-8B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Esperanto/Protein-Llama-3-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Esperanto/Protein-Llama-3-8B")
model = AutoModelForCausalLM.from_pretrained("Esperanto/Protein-Llama-3-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Esperanto/Protein-Llama-3-8B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Esperanto/Protein-Llama-3-8B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Esperanto/Protein-Llama-3-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/Esperanto/Protein-Llama-3-8B

SGLang

How to use Esperanto/Protein-Llama-3-8B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Esperanto/Protein-Llama-3-8B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Esperanto/Protein-Llama-3-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Esperanto/Protein-Llama-3-8B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Esperanto/Protein-Llama-3-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Esperanto/Protein-Llama-3-8B with Docker Model Runner:
```
docker model run hf.co/Esperanto/Protein-Llama-3-8B
```

Model Details

Protein-Llama-3-8B is a specialized version of the Llama-3-8B language model, fine-tuned for protein language modeling. The model has been continually pre-trained with the LoRA technique on extensive datasets of protein sequences, enabling it to generate novel protein sequences from natural language prompts. It supports both uncontrollable and controllable protein generation; in the controllable case, users can specify desired characteristics for the proteins. The model is designed to facilitate advancements in protein engineering, making it a valuable tool for drug development, chemical synthesis, and other biotechnological applications. For full details, please read our paper.

Model Description

Generating novel protein sequences with desired properties, known as protein engineering, is crucial for industries such as drug development and chemical synthesis. Traditional protein engineering techniques often involve introducing random mutations into the gene encoding the protein of interest, followed by expression and screening to identify variants with improved or novel functions, which are then reproduced. While effective, these approaches are labor-intensive and time-consuming because they rely on iterating over known protein sequences. Being constrained by existing protein templates limits their ability to generate diverse protein sequences with entirely new capabilities. Moreover, the need to analyze numerous protein variants can waste valuable experimental resources.

Leveraging a Large Language Model (LLM) that has learned the "protein language" significantly accelerates this process. An LLM can generate and evaluate protein sequences in a matter of seconds. The inherent randomness of LLM-generated sequences enhances diversity, enabling the creation of completely novel proteins with potentially unprecedented functions. This not only streamlines the discovery and development process but also expands the scope of possibilities in protein engineering.

This model is based on the Llama-3-8B architecture and is capable of generating proteins based on user-defined characteristics.

Energy Efficient Protein Language Models: Leveraging Small Language Models with LoRA for Controllable Protein Generation

Usage

To download and use the Protein-Llama-3-8B model for inference, follow these steps:

Installation

Ensure the transformers library is installed, together with a backend such as PyTorch. Both can be installed with pip:

pip install transformers torch

Uncontrollable Generation

Uncontrollable generation is done by prompting the model with the phrase 'Seq=<'.

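from transformers import pipeline

# Load the model into a text-generation pipeline
generator = pipeline('text-generation', model="Esperanto/Protein-Llama-3-8B")

# Sample 500 sequences from the bare 'Seq=<' prompt
sequences = generator(
    "Seq=<",
    temperature=0.2,
    top_k=40,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.2,
    max_new_tokens=30,
    num_return_sequences=500,
)

# Each result is a dict whose 'generated_text' holds the prompt plus the continuation
for sequence in sequences:
    print(sequence['generated_text'])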
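
The generated text includes the 'Seq=<' prefix, so downstream tools usually need the bare residue string. The following is a minimal sketch of that post-processing step. It assumes the model closes a sequence with '>' (a sequence cut off by max_new_tokens has no closing bracket and is taken up to the end of the text), and the helper names extract_sequence and is_valid_protein are hypothetical, not part of the model or of transformers.

```python
import re

# The 20 standard amino-acid one-letter codes.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

# Assumption: a sequence starts after 'Seq=<' and ends at the first '>', if any.
_SEQ_PATTERN = re.compile(r"Seq=<([^>]*)>?")


def extract_sequence(generated_text):
    """Return the text after 'Seq=<' up to the first '>' (or end of text), else None."""
    match = _SEQ_PATTERN.search(generated_text)
    if match is None:
        return None
    return match.group(1).strip()


def is_valid_protein(sequence):
    """True if the sequence is non-empty and uses only standard residue codes."""
    return bool(sequence) and set(sequence) <= AMINO_ACIDS


if __name__ == "__main__":
    # Hypothetical example string, not actual model output.
    sample = "Seq=<MKTAYIAKQR>"
    seq = extract_sequence(sample)
    print(seq, is_valid_protein(seq))
```

Applied to the loop above, each sequence['generated_text'] can be passed through extract_sequence and filtered with is_valid_protein before further analysis.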

Controllable Generation

Controllable generation is done by prompting the model with '[Generate xxx protein] Seq=<', where xxx is any of the 10 protein families (classes) supported by this model.

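from transformers import pipeline

# Load the model into a text-generation pipeline
generator = pipeline('text-generation', model="Esperanto/Protein-Llama-3-8B")

# Sample 500 sequences conditioned on the requested protein family
sequences = generator(
    "[Generate Ligase enzyme protein] Seq=<",
    temperature=0.2,
    top_k=40,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.2,
    max_new_tokens=30,
    num_return_sequences=500,
)

# Each result is a dict whose 'generated_text' holds the prompt plus the continuation
for sequence in sequences:
    print(sequence['generated_text'])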
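
Requesting all 500 sequences in a single call may exceed available memory on smaller GPUs. The sketch below builds the prompt from a family name and splits the request into smaller calls. The helpers build_prompt and generate_in_batches are hypothetical names introduced for illustration, and the only family string used is the "Ligase enzyme" example given above. The generator argument is assumed to behave like the pipeline object: called with a prompt and num_return_sequences, it returns a list of dicts.

```python
def build_prompt(family=None):
    """Build a prompt; with no family, fall back to the uncontrolled 'Seq=<' prompt."""
    if family is None:
        return "Seq=<"
    return f"[Generate {family} protein] Seq=<"


def generate_in_batches(generator, prompt, total, batch_size, **generation_kwargs):
    """Collect `total` results by calling `generator` with at most `batch_size` per call."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    results = []
    remaining = total
    while remaining > 0:
        n = min(batch_size, remaining)
        # Assumption: the callable returns a list of n result dicts, as the
        # text-generation pipeline does for a single string prompt.
        results.extend(generator(prompt, num_return_sequences=n, **generation_kwargs))
        remaining -= n
    return results
```

With the pipeline above, a call such as generate_in_batches(generator, build_prompt("Ligase enzyme"), total=500, batch_size=50, temperature=0.2, top_k=40, top_p=0.9, do_sample=True, repetition_penalty=1.2, max_new_tokens=30) would issue ten calls of 50 sequences each. The batch size of 50 is an arbitrary choice and should be tuned to the available hardware.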