
Cortina DUC: Running a 200B LLaMA Model Offline Inside a Briefcase

APR 24, 2026 11 min read By Lexcore AI Team

Most AI companies want you in the cloud. Your data on their servers. Your queries routed through their APIs. Your intelligence — rented, not owned.

Cortina DUC is built on a different premise: the most powerful AI in your building should fit in a briefcase.

The Problem We're Solving

In India, there are three types of users who cannot use cloud AI and know it:

  1. Lawyers handling privileged client communications subject to Bar Council confidentiality norms.
  2. Hospitals managing patient records covered by the Digital Personal Data Protection (DPDP) Act 2023 and pre-existing health data protection rules.
  3. HNIs and family offices whose financial and investment data is a liability if it leaves their network.

For these users, "just use ChatGPT" is not an answer. It's a compliance violation waiting to happen.

India's Digital Personal Data Protection Act 2023 creates meaningful obligations for how sensitive data can be processed and where. Sovereign AI is not a luxury feature — for certain sectors, it's a legal requirement.

What's Inside the Briefcase

Cortina DUC (Distributed Unified Compute) is a self-contained AI inference unit. Current hardware spec:

  Model size: 200B parameters
  Total VRAM: 96GB
  Cloud dependencies: 0
  Weight: 18kg
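As a rough way to relate a parameter count to a VRAM figure, model weights occupy about (parameters × bits per weight ÷ 8) bytes. The sketch below does only that arithmetic. The function names and the `overhead_gb` parameter are illustrative, and it ignores KV cache, activations, and any offloading to system memory, none of which are specified here.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    if n_params <= 0 or bits_per_weight <= 0:
        raise ValueError("n_params and bits_per_weight must be positive")
    return n_params * bits_per_weight / 8 / 1e9


def fits_in_vram(n_params: float, bits_per_weight: float,
                 vram_gb: float, overhead_gb: float = 0.0) -> bool:
    """True if the weights plus an assumed runtime overhead fit in vram_gb."""
    needed = weight_footprint_gb(n_params, bits_per_weight) + overhead_gb
    return needed <= vram_gb


if __name__ == "__main__":
    # Compare a few nominal precisions for a 200B-parameter model.
    for bits in (16, 8, 4):
        size = weight_footprint_gb(200e9, bits)
        print(f"200B params at {bits}-bit: {size:.0f} GB of weights")
```

Real quantisation formats store per-block scale data next to the weights, so the effective bits per weight sit somewhat above the nominal figure. Treat the result as a lower bound on weight storage.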

The Model Stack

We run a quantised version of LLaMA 3.1 405B at Q4_K_M precision, giving 200B+ effective parameters with minimal quality degradation. This is loaded alongside:

// Inference latency benchmarks (avg over 1,000 queries)
Legal summary (3,000 token doc):        1.4s
Medical report analysis (5-page PDF):   2.1s
Financial document extraction:          0.9s
Multi-doc RAG query (50-doc corpus):    3.3s
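For readers who want to collect comparable averages on their own hardware, here is a minimal timing harness. It is a sketch, not the method behind the numbers above: `run_query` is a hypothetical callable standing in for whatever sends one request to the local model, and the warm-up count and the choice of statistics are assumptions.

```python
import statistics
import time
from typing import Callable, Sequence


def benchmark(run_query: Callable[[str], object],
              queries: Sequence[str],
              warmup: int = 3) -> dict:
    """Time run_query over queries and return summary latencies in seconds."""
    if not queries:
        raise ValueError("queries must not be empty")

    # Warm-up calls are not timed, so one-off costs (model load, cache fill)
    # do not skew the averages.
    for q in queries[:warmup]:
        run_query(q)

    latencies = []
    for q in queries:
        start = time.perf_counter()
        run_query(q)
        latencies.append(time.perf_counter() - start)

    latencies.sort()
    n = len(latencies)
    # Nearest-rank 95th percentile: ceil(0.95 * n) as a 1-based rank.
    p95_index = (95 * n + 99) // 100 - 1
    return {
        "n": n,
        "mean_s": statistics.fmean(latencies),
        "median_s": statistics.median(latencies),
        "p95_s": latencies[p95_index],
    }
```

Reporting the median and a tail percentile alongside the mean helps, because a handful of slow queries can move an average a long way.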

Deployment: What It Looks Like On Site

A Cortina DUC deployment takes 4 hours on-site. We arrive with the unit, a setup technician, and a USB drive containing the client's custom model configuration. Steps:

  1. Physical placement in client's server room or secure office.
  2. LAN integration — the unit gets a static IP visible only to authorised devices.
  3. Document corpus ingestion — client uploads their files via a local web UI.
  4. User provisioning — each authorised user gets credentials, accessible from any browser on the local network.
  5. Air-gap verification — we physically disconnect internet and confirm full functionality.
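The air-gap check in the last step can be partly automated in software. The sketch below is one possible approach, not the procedure used on site: it tries TCP connections to a few external endpoints, which should all fail, and to the unit's local service, which should succeed. The probe addresses, the example LAN address, and the function names are assumptions for illustration.

```python
import socket
from typing import Callable, Iterable, List, Tuple

Endpoint = Tuple[str, int]

# Hypothetical probe targets: public IP addresses, so no DNS lookup is needed.
DEFAULT_PROBES: List[Endpoint] = [("1.1.1.1", 443), ("8.8.8.8", 53)]


def can_reach(endpoint: Endpoint, timeout: float = 2.0,
              connect: Callable = socket.create_connection) -> bool:
    """Return True if a TCP connection to endpoint succeeds within timeout."""
    try:
        conn = connect(endpoint, timeout)
    except OSError:  # covers timeouts, refused and unreachable networks
        return False
    conn.close()
    return True


def verify_air_gap(local_service: Endpoint,
                   external_probes: Iterable[Endpoint] = DEFAULT_PROBES,
                   connect: Callable = socket.create_connection) -> dict:
    """Check that external endpoints are unreachable and the local one is up."""
    leaks = [ep for ep in external_probes if can_reach(ep, connect=connect)]
    local_ok = can_reach(local_service, connect=connect)
    return {
        "leaks": leaks,                 # external endpoints that answered
        "local_service_ok": local_ok,   # the unit still serves the LAN
        "passed": bool(not leaks and local_ok),
    }


if __name__ == "__main__":
    # Example LAN address and port; replace with the unit's static IP.
    print(verify_air_gap(("192.168.1.50", 8080)))
```

A probe like this only shows that the tested endpoints are unreachable from the machine running it. It complements physically disconnecting the uplink and does not replace it.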

After setup, the unit requires zero cloud connectivity. Updates and model upgrades are delivered via encrypted USB.
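The post does not describe the update package format, so the following is only an illustration of one check an offline unit could run before applying a file from USB: recompute an HMAC-SHA256 tag over the file and compare it with the tag shipped alongside it. The shared key, the tag format, and the function names are all assumptions, and decryption is out of scope here.

```python
import hashlib
import hmac
from pathlib import Path


def file_hmac_sha256(path: Path, key: bytes, chunk_size: int = 1 << 20) -> str:
    """Stream the file through HMAC-SHA256 and return the hex tag."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    with open(path, "rb") as fh:
        # Read in chunks so multi-gigabyte model files need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            mac.update(chunk)
    return mac.hexdigest()


def verify_update(path: Path, expected_tag_hex: str, key: bytes) -> bool:
    """Return True only if the file's tag matches the expected tag."""
    actual = file_hmac_sha256(path, key)
    # compare_digest runs in constant time to avoid leaking the match position.
    return hmac.compare_digest(actual, expected_tag_hex.strip().lower())
```

A keyed tag detects both accidental corruption and tampering by anyone who does not hold the key. A plain checksum would catch only the former.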

Who Has One

We are not disclosing client names. What we can say: we have active deployments in two law firms (one in Mumbai, one in Delhi NCR), one diagnostic chain with 7 clinics in Tier 2 cities, and one family office managing assets above ₹500Cr.

The waitlist is currently 6 months. Enterprise enquiries go through the proposal form.
