You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Log in or Sign Up to review the conditions and access this model content.

D360-VLM v4.0.0

D360-VLM v4.0.0 is the V4 unified enterprise document intelligence model for extraction, classification, and structured JSON generation from business documents, including noisy and partially degraded scans.

What's New In V4

  • Migrates to one canonical d360_vlm runtime with legacy V3 API/inference wrappers for backward compatibility.
  • Adds release gates for field extraction F1, document classification accuracy, confidence calibration, and JSON schema validity.
  • Introduces a normalized business document schema (business-doc-v1) and a LLaVA-style image+instruction+JSON supervision manifest.
  • Ships budget-guarded Modal training + packaging workflow for repeatable enterprise releases.

Models Used

  • Primary V4 base model: HuggingFaceTB/SmolVLM-500M-Instruct
  • Training objective: Vision-to-JSON instruction tuning (image + prompt -> structured response)
  • Runtime surface: Unified d360_vlm engine + V3 compatibility endpoints

V4 Architecture

  • Backbone: AutoModelForVision2Seq initialized from HuggingFaceTB/SmolVLM-500M-Instruct
  • Processor: AutoProcessor multimodal tokenization for image + text sequences
  • Supervision format: LLaVA-style examples (<image>, business prompt, target JSON response)
  • Serving layer: Canonical FastAPI app in d360_vlm.api.app with V3-compatible routes and wrappers
  • Quality gates: Extraction/classification metrics, confidence buckets (ECE), schema checks, and release readiness flags

Training Summary (Current V4 Run)

  • Run name: d360_vlm_v4
  • Train rows: unknown
  • Eval rows: unknown
  • Elapsed minutes: unknown
  • Projected spend (USD): unknown (budget: unknown)
  • Hardware profile: Modal L4 GPU with runtime budget guard

Evaluation Signals

  • Release ready: None
  • Field F1: None
  • Document type accuracy: None
  • Schema valid rate: None

Included In This HF Repo

  • model/ trained model artifacts
  • model.py runtime wrapper
  • d360_vlm_v3_inference.py compatibility inference API
  • d360_vlm_v3_api.py compatibility FastAPI entrypoint
  • eval_gate_results.json gate report
  • model_card.json structured release metadata

Intended Use

  • Enterprise document workflows: forms, invoices, receipts, IDs, and mixed-layout business documents.
  • Structured extraction and classification into downstream JSON contracts.

Limitations

  • Performance is dataset-dependent; always validate on your own documents.
  • Degraded handwriting and extreme low-resolution scans may reduce quality.
  • Human review is recommended for high-risk compliance and financial workflows.

Repository

abrarali113/d360_vlm_unified

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for abrarali113/d360_vlm_unified