You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

D360-VLM v4.0.0

D360-VLM v4.0.0 is the V4 unified enterprise document intelligence model for extraction, classification, and structured JSON generation from business documents, including noisy and partially degraded scans.

What's New In V4

Migrates to one canonical d360_vlm runtime with legacy V3 API/inference wrappers for backward compatibility.
Adds release gates for field extraction F1, document classification accuracy, confidence calibration, and JSON schema validity.
Introduces a normalized business document schema (business-doc-v1) and a LLaVA-style image+instruction+JSON supervision manifest.
Ships budget-guarded Modal training + packaging workflow for repeatable enterprise releases.

Models Used

Primary V4 base model: HuggingFaceTB/SmolVLM-500M-Instruct
Training objective: Vision-to-JSON instruction tuning (image + prompt -> structured response)
Runtime surface: Unified d360_vlm engine + V3 compatibility endpoints

V4 Architecture

Backbone: AutoModelForVision2Seq initialized from HuggingFaceTB/SmolVLM-500M-Instruct
Processor: AutoProcessor multimodal tokenization for image + text sequences
Supervision format: LLaVA-style examples (<image>, business prompt, target JSON response)
Serving layer: Canonical FastAPI app in d360_vlm.api.app with V3-compatible routes and wrappers
Quality gates: Extraction/classification metrics, confidence buckets (ECE), schema checks, and release readiness flags

Training Summary (Current V4 Run)

Run name: d360_vlm_v4
Train rows: unknown
Eval rows: unknown
Elapsed minutes: unknown
Projected spend (USD): unknown (budget: unknown)
Hardware profile: Modal L4 GPU with runtime budget guard

Evaluation Signals

Release ready: None
Field F1: None
Document type accuracy: None
Schema valid rate: None

Included In This HF Repo

model/ trained model artifacts
model.py runtime wrapper
d360_vlm_v3_inference.py compatibility inference API
d360_vlm_v3_api.py compatibility FastAPI entrypoint
eval_gate_results.json gate report
model_card.json structured release metadata

Intended Use

Enterprise document workflows: forms, invoices, receipts, IDs, and mixed-layout business documents.
Structured extraction and classification into downstream JSON contracts.

Limitations

Performance is dataset-dependent; always validate on your own documents.
Degraded handwriting and extreme low-resolution scans may reduce quality.
Human review is recommended for high-risk compliance and financial workflows.

Repository

abrarali113/d360_vlm_unified

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Providers NEW

Image-Text-to-Text

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for abrarali113/d360_vlm_unified

Base model

HuggingFaceTB/SmolLM2-360M

Quantized

HuggingFaceTB/SmolLM2-360M-Instruct

Quantized

HuggingFaceTB/SmolVLM-500M-Instruct

Finetuned

(25)

this model