Technology
Two deployment pathways — on-premise NVIDIA GPUs or HIPAA-compliant cloud — purpose-built for clinical radiology workflows. Hospitals choose the architecture that fits their security and budget requirements.
Dual Deployment Architecture
Voxel Vision offers two distinct AI inference pathways. Customers choose based on their institution's data governance policy, budget, and IT infrastructure. Both deliver the same clinical output — preliminary radiology reports generated from medical images.
Pathway 1: On-Premise
Zero Data Leaves Premises
Run open-source vision-language models locally on hospital-owned NVIDIA GPUs. DICOM images are processed entirely within the facility's network. Ideal for institutions with strict data governance policies that prohibit any PHI from leaving the premises.
How It Works
1. DICOM images are acquired from the scanner and routed to the local VoxelMD inference server
2. An open-source VLM (e.g., Gemma 4 / MedGemma, Qwen 3.5 VL) processes the images plus clinical context on a local NVIDIA GPU
3. The model generates a preliminary radiology report draft in seconds
4. The report is pushed to the radiologist's dictation workspace / PACS for review and signature
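The steps above can be sketched end to end as a minimal Python pipeline. Everything here (`Study`, `run_local_vlm`, `process_study`, the field names) is a hypothetical stand-in for illustration, not the actual VoxelMD API:

```python
from dataclasses import dataclass

@dataclass
class Study:
    accession: str          # study identifier assigned at the scanner
    modality: str           # e.g. "CT", "MR"
    clinical_context: str   # indication / order info pulled from the EHR

def run_local_vlm(study: Study) -> str:
    """Stand-in for on-GPU VLM inference (e.g., MedGemma served locally).
    In the on-premise pathway this call never leaves the hospital network."""
    return (f"Preliminary report for {study.accession} ({study.modality}): "
            f"indication '{study.clinical_context}'. No acute findings.")

def process_study(study: Study) -> dict:
    # Steps 1-2: images + context are routed to the local inference server
    draft = run_local_vlm(study)
    # Steps 3-4: the draft is queued for radiologist review and signature
    return {"accession": study.accession, "draft": draft, "status": "PENDING_REVIEW"}

result = process_study(Study("ACC123", "CT", "r/o pulmonary embolism"))
```

The key property of this pathway is that `run_local_vlm` is a local function call, not a network request.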
Pathway 2: HIPAA Cloud
BAA-Covered • De-Identified Data
Send de-identified DICOM data to enterprise-grade, HIPAA-compliant cloud APIs under a signed Business Associate Agreement (BAA). This mirrors the proven architecture used by RapidAI and Rad AI: DICOM-in, results-out. Ideal for institutions that prioritize cost efficiency and don't want to maintain GPU hardware.
How It Works
1. DICOM images are acquired from the scanner and routed to the VoxelMD edge gateway
2. PHI is stripped / de-identified before data leaves the hospital network
3. De-identified images are sent via an encrypted API call to HIPAA-compliant Vertex AI (Google Cloud with BAA)
4. An enterprise VLM (e.g., Gemini 3 Flash) processes the images and returns a preliminary report
5. The report is re-identified and delivered to the radiologist's PACS / dictation workspace
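The de-identify/re-identify round trip in steps 2 and 5 can be sketched as follows. The tag set and function names are illustrative only; a production de-identifier follows the DICOM PS3.15 confidentiality profiles and covers far more attributes:

```python
import uuid

# Tags treated as PHI for this sketch (a real profile covers many more).
PHI_TAGS = {"PatientName", "PatientID", "PatientBirthDate", "AccessionNumber"}

def deidentify(ds: dict) -> tuple[dict, dict]:
    """Step 2: strip PHI before anything leaves the hospital network.
    The re-identification map stays on the edge gateway."""
    token = uuid.uuid4().hex
    reid_map = {token: {k: v for k, v in ds.items() if k in PHI_TAGS}}
    scrubbed = {k: v for k, v in ds.items() if k not in PHI_TAGS}
    scrubbed["StudyToken"] = token
    return scrubbed, reid_map

def reidentify(report: dict, reid_map: dict) -> dict:
    """Step 5: restore PHI after the de-identified report comes back."""
    return {**report, **reid_map[report["StudyToken"]]}

study = {"PatientName": "DOE^JANE", "PatientID": "12345", "Modality": "CT"}
scrubbed, reid_map = deidentify(study)
# Steps 3-4 (the encrypted cloud API call) happen here; we simulate the
# returned payload, which echoes the token but carries no PHI.
cloud_report = {"StudyToken": scrubbed["StudyToken"], "draft": "No acute findings."}
final = reidentify(cloud_report, reid_map)
```

Because only the opaque `StudyToken` crosses the network boundary, the cloud provider never holds the mapping back to the patient.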
Same Output
Both pathways produce identical clinical deliverables — preliminary radiology report drafts ready for radiologist review.
Customer Choice
Hospitals choose the pathway that matches their data governance, budget, and IT infrastructure. Mix-and-match is supported.
Industry Standard
Cloud pathway follows the proven DICOM-in/results-out pattern used by FDA-cleared platforms like RapidAI and Rad AI.
Core Architecture
Shared infrastructure components designed for clinical-grade reliability across both deployment pathways.
DICOM-In / Results-Out
Standard DICOM protocol integration — no proprietary workstations required. Images are ingested from any compliant scanner; results are delivered back to PACS as DICOM Secondary Capture or directly to the dictation workspace.
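The results-out half of this pattern can be sketched in a few lines: wrap the report draft in Secondary Capture metadata so PACS files it with the source study. The Secondary Capture Image Storage SOP Class UID is a real DICOM constant; the function and field selection are illustrative:

```python
SC_SOP_CLASS_UID = "1.2.840.10008.5.1.4.1.1.7"  # Secondary Capture Image Storage

def report_to_secondary_capture(report_text: str, source_study: dict) -> dict:
    """Wrap a report draft in Secondary Capture metadata. Reusing the
    source StudyInstanceUID makes PACS hang it with the original images."""
    return {
        "SOPClassUID": SC_SOP_CLASS_UID,
        "StudyInstanceUID": source_study["StudyInstanceUID"],
        "SeriesDescription": "AI Preliminary Report",
        "Body": report_text,
    }

sc = report_to_secondary_capture(
    "No acute findings.",
    {"StudyInstanceUID": "1.2.3.4.5"},
)
```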
NVIDIA GPU Inference
On-premise inference powered by NVIDIA CUDA and TensorRT. Supports the latest consumer (RTX 5090), enterprise (B200/B300), and workstation GPUs, with quantized model variants for edge deployment.
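The core idea behind those quantized variants can be shown in plain Python. This is symmetric per-tensor INT8 quantization in miniature; toolkits like TensorRT apply the same idea per-channel with calibration data:

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric INT8 quantization: map fp32 weights into [-127, 127]
    with a single scale factor (per-tensor, no zero point)."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate fp32 values; precision loss is the trade-off
    for 4x smaller weights and faster integer math on the GPU."""
    return [v * scale for v in q]

q, scale = quantize_int8([0.5, -1.0, 0.25])
restored = dequantize(q, scale)
```

The largest-magnitude weight maps exactly to ±127; everything else is rounded, which is why quantized models are validated against the full-precision baseline before clinical use.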
Zero-Trust Security
End-to-end encryption (TLS in transit, AES-256 at rest), role-based access control, comprehensive audit logging. Designed to exceed HIPAA requirements and satisfy the most demanding hospital IT security audits.
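One common building block of the audit-logging piece is a tamper-evident, hash-chained log. The sketch below is an assumption about one reasonable design, not the actual VoxelMD implementation:

```python
import hashlib
import json

def _digest(entry: dict) -> str:
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()

def append_event(log: list, event: str, actor: str) -> None:
    """Each record commits to the previous record's hash, so any
    retroactive edit breaks every later link in the chain."""
    prev = log[-1]["hash"] if log else "0" * 64
    entry = {"event": event, "actor": actor, "prev": prev}
    log.append({**entry, "hash": _digest(entry)})

def verify(log: list) -> bool:
    """Walk the chain and recompute every hash."""
    prev = "0" * 64
    for rec in log:
        entry = {"event": rec["event"], "actor": rec["actor"], "prev": rec["prev"]}
        if rec["prev"] != prev or rec["hash"] != _digest(entry):
            return False
        prev = rec["hash"]
    return True

log: list = []
append_event(log, "VIEW_STUDY", "dr.smith")
append_event(log, "SIGN_REPORT", "dr.smith")
ok_before = verify(log)
log[0]["event"] = "DELETE_STUDY"   # simulated tampering
ok_after = verify(log)
```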
AI Model Stack
Vision-language models selected and validated for clinical radiology use cases across both deployment pathways.
On-Premise Models (Open-Source)
- Gemma 4 (4B–27B) — Google's open-weight multimodal model, optimized for vision-language tasks
- MedGemma — Healthcare-tuned variant for medical image comprehension and report generation
- Qwen 3.5 VL (7B–72B) — Alibaba's open-source VLM with dynamic resolution DICOM processing
- Fine-tuned on radiology datasets for anatomy-specific analysis (e.g., thyroid, CT perfusion)
Cloud Models (Enterprise APIs)
- Gemini 3 Flash — Google's fastest multimodal model via Vertex AI, ideal for real-time inference
- Gemini 3 Pro — Higher-capability model for complex multi-series interpretation
- HIPAA-compliant access via signed BAA with Google Cloud
- Customer-Managed Encryption Keys (CMEK) for data at rest
GPU Infrastructure (On-Premise)
- CUDA-optimized DICOM image preprocessing and model inference pipeline
- TensorRT model optimization for maximum throughput on NVIDIA hardware
- Support for INT8/INT4 quantized model variants for consumer-grade GPUs
- Parallel processing of multi-series imaging studies for batch workflows
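The parallel multi-series fan-out in the last bullet can be sketched with a thread pool; `infer_series` is a hypothetical placeholder for the per-series GPU worker:

```python
from concurrent.futures import ThreadPoolExecutor

def infer_series(series_uid: str) -> str:
    """Stand-in for per-series GPU inference; a real pipeline would
    dispatch to a CUDA/TensorRT worker here."""
    return f"{series_uid}:done"

def process_study(series_uids: list[str]) -> list[str]:
    # Fan the study's series out across workers; map() preserves input order,
    # so results line up with the original series list.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(infer_series, series_uids))

results = process_study(["1.2.3.1", "1.2.3.2", "1.2.3.3"])
```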
Data Pipeline (Shared)
- DICOM ingestion and normalization from any compliant modality
- Automated PHI stripping / de-identification for cloud pathway
- Quality control preprocessing — contrast, windowing, slice selection
- Structured output mapping to ACR-compliant radiology reporting standards
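The windowing step in the quality-control bullet is the classic CT display transform: clip Hounsfield units to a window and rescale to 8-bit. A minimal sketch (the window values shown are a typical brain window, used here only as an example):

```python
def apply_window(hu_values: list[int], center: float, width: float) -> list[int]:
    """Clip Hounsfield units to [center - width/2, center + width/2]
    and rescale to 8-bit display values (0-255)."""
    lo, hi = center - width / 2, center + width / 2
    out = []
    for hu in hu_values:
        clipped = min(max(hu, lo), hi)           # values outside the window saturate
        out.append(round((clipped - lo) / (hi - lo) * 255))
    return out

# A typical brain window: center 40 HU, width 80 HU.
pixels = apply_window([-1000, 0, 40, 80, 1000], center=40, width=80)
```

Everything below the window floor renders black and everything above renders white, which is why the pipeline picks the window per anatomy before handing pixels to the model.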
Clinical Integration
Seamless integration into existing radiology infrastructure — no workflow disruption.
Epic EHR
Deep integration with Epic for patient context, clinical history, and order data
Enterprise PACS
Direct PACS integration for image viewing and report synchronization
HL7 / FHIR
Standards-based interoperability for broad EHR compatibility
DICOM
Native DICOM ingestion from any compliant imaging modality
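On the FHIR side of this interoperability, a preliminary draft maps naturally onto a `DiagnosticReport` resource. The sketch below is a minimal, hypothetical payload (the identifier system URI is invented for illustration); `"preliminary"` is a standard FHIR report status and LOINC 18748-4 is the standard code for a diagnostic imaging study:

```python
def to_diagnostic_report(accession: str, draft_text: str) -> dict:
    """Build a minimal FHIR R4 DiagnosticReport for an AI draft.
    status="preliminary" signals that radiologist review is still pending."""
    return {
        "resourceType": "DiagnosticReport",
        "status": "preliminary",
        "code": {"coding": [{"system": "http://loinc.org", "code": "18748-4",
                             "display": "Diagnostic imaging study"}]},
        # Hypothetical identifier system, for illustration only:
        "identifier": [{"system": "urn:voxelmd:accession", "value": accession}],
        "conclusion": draft_text,
    }

report = to_diagnostic_report("ACC123", "No acute findings.")
```

After signature, the same resource would be re-posted with `status="final"`, which is how downstream EHR consumers distinguish the AI draft from the signed report.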
Security & Compliance
Patient data security is non-negotiable. The on-premise pathway ensures PHI never leaves the hospital network. The cloud pathway uses de-identification, BAA-covered endpoints, and CMEK encryption — following the same proven security model used by FDA-cleared radiology AI platforms.