PersonaPlex Finetune — Pharma Adherence

Experimental research artifact. Not intended for production or clinical use. See Limitations below.

This repository contains the artifacts from an example finetuning run: a LoRA finetune of nvidia/personaplex-7b-v1 on synthetic pharma adherence (patient-support) dialogues.

Recipe and code: https://github.com/emotion-machine-org/personaplex-finetune
Base model: nvidia/personaplex-7b-v1 (7B params, dep_q=16, voice + role conditioning over Moshi)
Run name: adhery-v2-21
Trained: April 2026, 1024 steps, batch 8, duration 80 s/sample
Adapter: LoRA rank 64, scaling 2.0 (skip depformer, ~5.6% params trainable)
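
As a rough illustration of what "rank 64, scaling 2.0" means, the sketch below assumes the scaling factor multiplies the low-rank update directly (y = W x + scaling * B A x) rather than following an alpha/rank convention; args.yaml and the linked recipe are the authority on how this run actually applies it. The layer width `D_MODEL` and both helper functions are hypothetical and exist only for this example.

```python
def lora_added_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def lora_forward(x, W, A, B, scaling):
    """Compute y = W x + scaling * B (A x) using plain lists as vectors/matrices."""
    def matvec(M, v):
        return [sum(m * vi for m, vi in zip(row, v)) for row in M]

    base = matvec(W, x)                 # frozen base projection
    update = matvec(B, matvec(A, x))    # low-rank trainable path
    return [b + scaling * u for b, u in zip(base, update)]


RANK = 64        # from the run header
SCALING = 2.0    # from the run header
D_MODEL = 4096   # hypothetical layer width, for illustration only

added = lora_added_params(D_MODEL, D_MODEL, RANK)
frozen = D_MODEL * D_MODEL
print(f"added per layer: {added}, relative to frozen weight: {added / frozen:.4f}")
```

The per-layer ratio printed here is not the ~5.6% figure above: that figure depends on which modules carry adapters (the depformer is skipped) and on the true layer shapes.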

The full model card describing the recipe lives in the GitHub repo at MODEL_CARD.md; this README adapts that card to the specifics of this run.

What this is

A voice-native streaming language model finetuned to follow the structure of a patient-support / medication-adherence call: greeting, medication check-in, side-effect probe, adherence nudge, close. It was trained with mid-conversation context injection (the puppeteer mechanism), so an external system can supply talking points while the call is in progress.
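
The injection flow can be pictured as a mailbox that an external controller writes to and the streaming loop drains once per frame. The sketch below is a simulation under that assumption; `InjectionQueue`, `run_call`, and the event dictionary are hypothetical names and not the repository's API, and the place where a real loop would condition the model is marked only by a comment.

```python
import queue


class InjectionQueue:
    """Hypothetical thread-safe mailbox between an external controller and the model loop."""

    def __init__(self):
        self._pending = queue.Queue()

    def push(self, talking_point: str) -> None:
        self._pending.put(talking_point)

    def drain(self) -> list:
        """Return every talking point queued so far without blocking."""
        items = []
        while True:
            try:
                items.append(self._pending.get_nowait())
            except queue.Empty:
                return items


def run_call(num_frames: int, injections: InjectionQueue, external_events: dict) -> list:
    """Simulate a streaming loop; external_events maps frame index -> talking point."""
    log = []
    for frame in range(num_frames):
        if frame in external_events:              # stand-in for the external system
            injections.push(external_events[frame])
        for point in injections.drain():
            log.append((frame, point))            # a real loop would condition the model here
    return log


if __name__ == "__main__":
    events = {2: "ask about nausea", 4: "remind about refill"}
    for frame, point in run_call(6, InjectionQueue(), events):
        print(frame, point)
```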

Run artifacts included:

merged_step448/model.safetensors — merged checkpoint at step 448 (best so far on script-adherence eval)
checkpoints/checkpoint_000064..000768/ — LoRA adapter snapshots every 64 steps
args.yaml — full training config
metrics.train.jsonl, metrics.eval.jsonl — train/eval loss curves
wandb/, tb/ — Weights & Biases and TensorBoard run logs
gen_eval/step_000064..000768/ — generation eval results per checkpoint
gemini_eval_step448/ — Gemini judge transcripts + scoring at step 448
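
A minimal sketch for navigating these artifacts: it enumerates the snapshot directories implied by the "every 64 steps" naming above and picks the lowest-valued record from a JSONL metrics file. The record keys (`step`, `loss`) are assumptions; check the real files for the actual field names.

```python
import json


def checkpoint_steps(first: int = 64, last: int = 768, every: int = 64) -> list:
    """Steps for which an adapter snapshot is listed above."""
    return list(range(first, last + 1, every))


def checkpoint_dir(step: int) -> str:
    """Directory name in the zero-padded style shown in the artifact list."""
    return f"checkpoints/checkpoint_{step:06d}"


def best_record(jsonl_lines, key: str = "loss"):
    """Return the record with the smallest `key`, or None if no record has it."""
    records = [json.loads(line) for line in jsonl_lines if line.strip()]
    scored = [r for r in records if key in r]
    return min(scored, key=lambda r: r[key]) if scored else None


if __name__ == "__main__":
    print([checkpoint_dir(s) for s in checkpoint_steps()][:3])
    # Hypothetical records; real usage would pass open("metrics.eval.jsonl").
    sample = ['{"step": 64, "loss": 2.0}', '{"step": 128, "loss": 1.5}']
    print(best_record(sample))
```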

Training data

Synthetic patient-support dialogues (adhery-v2 dataset, ~2k samples, mean duration 296 s) generated with Claude and rendered to speech with VibeVoice 7B, aligned with WhisperX. Each sample carries a text_prompt, voice_prompt, and context_injections (frame-offset talking points).
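
One way to picture a training sample is sketched below. The three field names come from the description above; their types, the `ContextInjection` class, and the use of the 12.5 Hz Mimi codec frame rate for the offsets are assumptions made for illustration, so consult the pipeline code for the real schema.

```python
from dataclasses import dataclass, field

# Moshi's Mimi codec runs at 12.5 frames per second; that the frame offsets
# in this dataset use the same rate is an assumption.
FRAME_RATE_HZ = 12.5


@dataclass
class ContextInjection:
    frame_offset: int      # when the talking point becomes available
    talking_point: str     # text handed to the model mid-call


@dataclass
class Sample:
    text_prompt: str       # role / persona text
    voice_prompt: str      # assumed here to be a path to a reference audio clip
    context_injections: list = field(default_factory=list)


def frames_to_seconds(frames: int) -> float:
    return frames / FRAME_RATE_HZ


def seconds_to_frames(seconds: float) -> int:
    return round(seconds * FRAME_RATE_HZ)


if __name__ == "__main__":
    sample = Sample(
        text_prompt="You are a patient-support agent.",
        voice_prompt="voices/example.wav",  # hypothetical path
        context_injections=[ContextInjection(250, "Ask about side effects.")],
    )
    for inj in sample.context_injections:
        print(frames_to_seconds(inj.frame_offset), inj.talking_point)
    print(seconds_to_frames(80))  # frames in one 80 s training crop
```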

The data is not redistributed here. The pipeline in pipeline/ regenerates equivalent data from public sources.

Intended use

Research on voice-native patient-support agents and mid-call context injection.
Reproducing the pharma branch of the experiments documented in docs/history/notes/combined_experiment_report.md.

Out of scope / Limitations

This model is experimental and was trained for research. Specifically:

Not for production deployment of any kind, including any patient-facing context.
Not medical advice. The model is not aligned for clinical correctness; it can and will hallucinate medication names, dosages, schedules, and side effects.
English-only, biased toward American English (VibeVoice voice library).
No red-team / jailbreak evaluation.
No HIPAA / regulatory review.
No human-subject testing. All evaluation relies on LLM judges (Claude / Gemini).
Synthetic training data is fluent but stylized; expect degraded behavior on prompts that fall outside that distribution.
Voice cloning capability is inherited from PersonaPlex. Follow consent norms when using it.

Evaluation

Generation eval (every 64 steps, in gen_eval/): held-out prompts, 30 s generations, Claude-judged 1–5 on naturalness, accuracy, and script adherence.
Gemini eval at step 448 (gemini_eval_step448/): larger held-out set with Gemini as judge. See reviews.json for per-prompt scoring.

Eval is judge-based and not a substitute for human review.
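
To summarize 1-5 judge scores across prompts, something like the following could be used. The flat per-review schema and the key names are assumptions, since the layout of reviews.json and the gen_eval outputs is not described here; adapt the keys after inspecting the files.

```python
from collections import defaultdict
from statistics import mean

# Criteria named in the generation eval; the exact key spellings are assumptions.
CRITERIA = ("naturalness", "accuracy", "script_adherence")


def mean_scores(reviews):
    """Average each criterion over reviews, skipping missing or out-of-range values."""
    buckets = defaultdict(list)
    for review in reviews:
        for name in CRITERIA:
            score = review.get(name)
            if isinstance(score, (int, float)) and 1 <= score <= 5:
                buckets[name].append(score)
    return {name: mean(values) for name, values in buckets.items()}


if __name__ == "__main__":
    # Hypothetical reviews; real usage would json.load() the eval output instead.
    reviews = [
        {"naturalness": 4, "accuracy": 3, "script_adherence": 5},
        {"naturalness": 2, "accuracy": "n/a"},
    ]
    print(mean_scores(reviews))
```

Skipping malformed scores rather than raising keeps one bad judge response from invalidating the whole summary, at the cost of silently shrinking the sample; logging the skipped count would be a reasonable addition.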

License

Adapter weights: NVIDIA Open Model License (inherited from nvidia/personaplex-7b-v1).
Code in the linked GitHub repo: MIT.

Citation

@software{personaplex_finetune_pharma,
  title  = {PersonaPlex Finetune — Pharma Adherence},
  author = {emotion-machine-org},
  year   = {2026},
  url    = {https://github.com/emotion-machine-org/personaplex-finetune}
}

Underlying base models:

Moshi: Défossez et al., arXiv:2410.00037
PersonaPlex: NVIDIA, nvidia/personaplex-7b-v1 model card
