Security Architecture

How We Protect Your Medical Records

This page describes every layer of security protecting your data — from the moment you upload a document to the moment a rating is generated. No marketing language. Just the actual technical details.

AES-256-GCM at Rest
TLS 1.3 in Transit
Unreadable Embeddings
Zero-Knowledge Keys

Five Independent Security Layers

  • 01 TLS 1.3: Encrypted in Transit (Transport Layer)
  • 02 AES-256-GCM: Encrypted at Rest (Storage Layer)
  • 03 Vector Embeddings: Documents Become Numbers (AI Processing Layer)
  • 04 You Own the Encryption Key (Key Management Layer)
  • 05 Never Trained On, Never Shared (Data Use Layer)

01 Transport Layer

TLS 1.3 — Encrypted in Transit

Every byte moving between your browser and our servers is encrypted with TLS 1.3. An attacker who intercepts the traffic sees only ciphertext and cannot read it.

What TLS 1.3 means

Transport Layer Security 1.3 is the current version of the standard protocol for encrypting data over a network. When you upload a report or load a case, a cryptographic handshake negotiates session keys unique to that connection. Those keys are discarded when the session ends, so an attacker who captured all your network traffic could not decrypt it.

Perfect Forward Secrecy

TLS 1.3 provides Perfect Forward Secrecy (PFS) for every certificate-based handshake. Each session uses a fresh ephemeral key pair, so compromising the server's long-term private key in the future cannot retroactively decrypt past sessions.
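A minimal sketch of how a client can refuse anything older than TLS 1.3, using Python's standard `ssl` module. The function name `make_tls13_context` is illustrative and not part of the product.

```python
import ssl


def make_tls13_context() -> ssl.SSLContext:
    """Build a client-side context that only negotiates TLS 1.3."""
    # create_default_context() turns on certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Refuse TLS 1.2 and older; the handshake fails if the peer cannot do 1.3.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


ctx = make_tls13_context()
print(ctx.minimum_version.name)
```

A context like this can be passed to `ctx.wrap_socket(...)` or to an HTTP client that accepts an `SSLContext`.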

Certificate Transparency monitoring

Our domain certificates are issued by a trusted CA and monitored via Certificate Transparency logs. Any misissued certificate for our domain would appear in those public logs, typically within about a day of issuance, where it can be detected and revoked.

Threat

Man-in-the-middle attacks, network eavesdropping, traffic interception

Mitigation

TLS 1.3 with ephemeral keys: recorded sessions cannot be decrypted later, even if the server's long-term key is compromised

02 Storage Layer

AES-256-GCM — Encrypted at Rest

All case metadata on disk is encrypted with AES-256-GCM before it is written. The file on the server is unreadable without the key — which only you hold.

What AES-256-GCM is

AES-256 (Advanced Encryption Standard, 256-bit key) is used by the US government to protect classified information and by financial institutions worldwide. The GCM (Galois/Counter Mode) variant adds authenticated encryption — it not only encrypts the data but produces an authentication tag that detects any tampering.

Random IV per write

Every time the case database is written, a fresh 12-byte random Initialization Vector (IV) is generated. This means two identical writes produce completely different ciphertext on disk. An attacker watching the file over time learns nothing about whether the data changed.
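The per-write IV can be sketched as follows, assuming the third-party `cryptography` package. The on-disk layout (IV, then ciphertext, then tag) and the helper names `encrypt_record` and `decrypt_record` are assumptions made for illustration.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

IV_LEN = 12  # 96-bit IV, the size recommended for GCM


def encrypt_record(key: bytes, plaintext: bytes) -> bytes:
    """Return iv || ciphertext || tag for one write (assumed layout)."""
    iv = os.urandom(IV_LEN)  # fresh random IV for every write
    return iv + AESGCM(key).encrypt(iv, plaintext, None)


def decrypt_record(key: bytes, blob: bytes) -> bytes:
    """Split off the IV, then verify the tag and decrypt."""
    iv, body = blob[:IV_LEN], blob[IV_LEN:]
    return AESGCM(key).decrypt(iv, body, None)


key = AESGCM.generate_key(bit_length=256)  # throwaway demo key
record = b'{"case": "demo"}'
first = encrypt_record(key, record)
second = encrypt_record(key, record)
print(first != second)  # same plaintext, different bytes on disk
```

Both blobs decrypt to the same record, but they share no predictable bytes because each write drew its own IV.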

Authentication tag — tamper detection

GCM produces a 128-bit authentication tag alongside the ciphertext. When the file is read, the tag is verified before decryption proceeds. If even a single bit on disk has been modified — by malware, a rogue sysadmin, or hardware corruption — decryption fails and an error is raised. The data cannot be silently altered.
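To illustrate the tamper check, the sketch below flips a single bit of the ciphertext and shows that decryption is refused. It assumes the third-party `cryptography` package; the sample plaintext is a placeholder.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)  # throwaway demo key
iv = os.urandom(12)
# encrypt() returns the ciphertext with the 16-byte tag appended.
blob = AESGCM(key).encrypt(iv, b"example case record", None)

# Flip one bit in the first ciphertext byte, as disk corruption might.
tampered = bytes([blob[0] ^ 0x01]) + blob[1:]

try:
    AESGCM(key).decrypt(iv, tampered, None)
    tamper_detected = False
except InvalidTag:
    # Tag verification failed, so no plaintext is returned.
    tamper_detected = True

print(tamper_detected)
```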

Your key, your control

The encryption key is a 256-bit (32-byte) value you set as an environment variable on your own server. We never transmit it, store it, or have any mechanism to retrieve it. If you lose the key, the data cannot be recovered by anyone — including us.
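One way to produce a suitable key is Python's standard `secrets` module. Storing the key as 64 hex characters is an assumption of this sketch; the text above only specifies that the key is 32 bytes.

```python
import secrets

KEY_BYTES = 32  # 256 bits


def generate_key_hex() -> str:
    """Return a new random 256-bit key as 64 hex characters."""
    # token_bytes() draws from the operating system's secure random source.
    return secrets.token_bytes(KEY_BYTES).hex()


key_hex = generate_key_hex()
print(len(key_hex))
```

Because a lost key cannot be recovered, it is worth keeping an offline copy before setting the environment variable.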

Threat

Physical server access, stolen disk, backup theft, rogue infrastructure access

Mitigation

AES-256-GCM with per-write random IV — unreadable without the key you control

03 AI Processing Layer

Vector Embeddings — Documents Become Numbers

When a report is processed, it is converted into a vector embedding — a list of floating-point numbers that represent the semantic meaning of the text. The original text cannot be reconstructed from the embedding.

What a vector embedding is

An embedding is a high-dimensional numerical representation of text, typically 1,536 floating-point numbers per chunk. These numbers encode relationships between concepts, not the words themselves. The mapping from text to numbers is lossy and has no inverse: the model provides no function that turns a vector back into the original sentence.
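A toy illustration of what "encoding relationships between concepts" means in practice. The 4-dimensional vectors below are made up and stand in for real embeddings, which would have on the order of 1,536 dimensions and come from an embedding model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors, not output from any real model.
chunk_a = [0.12, -0.48, 0.33, 0.80]
chunk_b = [0.10, -0.50, 0.30, 0.82]   # stands for text close in meaning to chunk_a
chunk_c = [-0.70, 0.20, 0.65, -0.10]  # stands for unrelated text

print(cosine_similarity(chunk_a, chunk_b))  # close to 1
print(cosine_similarity(chunk_a, chunk_c))  # close to 0
```

Retrieval works by comparing numbers like these; at no point does the comparison need the words that produced them.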

Stored in a private vector store

Each case gets its own isolated vector store. The embeddings from one case are never mixed with another. The store is identified only by a randomly generated ID — there is no link between the store and any patient name or case number in the vector database itself.
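A minimal sketch of per-case store IDs, assuming the mapping from case to store lives only on the application side. The `vs_` prefix, `new_store_id`, and `StoreRegistry` are hypothetical names for illustration.

```python
import secrets


def new_store_id() -> str:
    """Random identifier that carries no case information (hypothetical format)."""
    return "vs_" + secrets.token_hex(16)


class StoreRegistry:
    """Maps case references to store IDs on the application side only."""

    def __init__(self) -> None:
        self._store_for_case: dict[str, str] = {}

    def store_for(self, case_ref: str) -> str:
        # One store per case, created on first use and reused afterwards.
        if case_ref not in self._store_for_case:
            self._store_for_case[case_ref] = new_store_id()
        return self._store_for_case[case_ref]


registry = StoreRegistry()
store_a = registry.store_for("case-A")
store_b = registry.store_for("case-B")
print(store_a != store_b)
```

The vector database only ever sees the random ID, so the ID alone says nothing about which case it belongs to.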

Automatic expiry

Vector stores are configured to expire 90 days after last access. After expiry, the embeddings are permanently deleted from the vector database. Nothing persists indefinitely.
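The 90-day rule amounts to simple date arithmetic. The sketch below is an illustration of that rule only; the function names are hypothetical and the actual expiry is enforced by the vector database's own configuration.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)


def expires_at(last_access: datetime) -> datetime:
    """Moment at which a store last touched at `last_access` expires."""
    return last_access + RETENTION


def is_expired(last_access: datetime, now: datetime) -> bool:
    """True once the retention window has fully elapsed."""
    return now >= expires_at(last_access)


last_access = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(expires_at(last_access).date())
```

Each new access moves `last_access` forward, which restarts the 90-day window.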

What remains if the database were breached

An attacker who accessed the vector database would find: arrays of floating-point numbers. No names. No dates. No diagnoses. No WPI values. No attorney names. The embeddings are not interpretable without the specific model that generated them, and even with that model there is no decoding function that returns the original text.

Threat

Vector database breach, unauthorized AI query access

Mitigation

One-way embeddings — source text is unrecoverable; per-case isolated stores with auto-expiry

04 Key Management Layer

You Own the Encryption Key

The AES-256 encryption key never leaves your environment. It is never sent to us, stored in our systems, or visible to anyone but you.

Environment variable architecture

The key is set as an ENCRYPTION_KEY environment variable on the server you control. It exists only in process memory at runtime — it is never written to a log file, a database, or transmitted over the network.
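Reading the key at startup might look like the sketch below. Hex encoding of the variable, the `KeyConfigError` type, and the `load_key` helper are assumptions for illustration; only the `ENCRYPTION_KEY` name comes from the text above.

```python
import os
from typing import Mapping, Optional


class KeyConfigError(RuntimeError):
    """Raised when ENCRYPTION_KEY is missing or malformed."""


def load_key(env: Optional[Mapping[str, str]] = None) -> bytes:
    """Read and validate the 32-byte key; `env` defaults to os.environ."""
    source = os.environ if env is None else env
    raw = source.get("ENCRYPTION_KEY")
    if raw is None:
        raise KeyConfigError("ENCRYPTION_KEY is not set")
    try:
        key = bytes.fromhex(raw)  # assumes the variable holds hex
    except ValueError:
        # Deliberately do not echo the value into the error or any log.
        raise KeyConfigError("ENCRYPTION_KEY is not valid hex") from None
    if len(key) != 32:
        raise KeyConfigError("ENCRYPTION_KEY must decode to 32 bytes")
    return key


demo_key = load_key({"ENCRYPTION_KEY": "00" * 32})  # placeholder value only
print(len(demo_key))
```

Failing fast at startup avoids writing data under a truncated or mistyped key.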

Zero-knowledge design

Complegal operates on a zero-knowledge principle for encryption keys. Our application code reads the key from your environment, uses it to encrypt/decrypt in memory, and never transmits or stores it. A subpoena served to Complegal for your encryption key would return nothing, because we do not have it.

Key rotation

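You can rotate the encryption key at any time by decrypting the current database, replacing the ENCRYPTION_KEY env var, and re-encrypting. This process does not require downtime. Rotation protects the re-encrypted database: ciphertext captured before the rotation is still encrypted under the old key, so destroy the old key once re-encryption is complete.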
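The decrypt and re-encrypt step of a rotation can be sketched as follows, assuming the third-party `cryptography` package and a file layout of IV, then ciphertext, then tag. The `rotate` helper is hypothetical.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

IV_LEN = 12


def rotate(blob: bytes, old_key: bytes, new_key: bytes) -> bytes:
    """Decrypt iv || ciphertext || tag with old_key, re-encrypt with new_key."""
    iv, body = blob[:IV_LEN], blob[IV_LEN:]
    plaintext = AESGCM(old_key).decrypt(iv, body, None)  # verifies the tag first
    new_iv = os.urandom(IV_LEN)  # fresh IV for the new ciphertext
    return new_iv + AESGCM(new_key).encrypt(new_iv, plaintext, None)


old_key = AESGCM.generate_key(bit_length=256)  # throwaway demo keys
new_key = AESGCM.generate_key(bit_length=256)
iv = os.urandom(IV_LEN)
old_blob = iv + AESGCM(old_key).encrypt(iv, b"example case record", None)
new_blob = rotate(old_blob, old_key, new_key)
print(len(new_blob) == len(old_blob))
```

After this step the new file opens only with the new key, and the plaintext existed only in memory during the rotation.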

Threat

Key theft, insider threat, legal compulsion to produce keys

Mitigation

Zero-knowledge key design — we never hold, transmit, or store your key

05 Data Use Layer

Never Trained On, Never Shared

Your data is never used to train AI models, shared with third parties, sold, or used for any purpose other than running your specific case.

No training use

Medical-legal documents contain sensitive PHI. We have explicit agreements with our AI infrastructure providers prohibiting the use of submitted content for model training. Your QME reports do not improve any AI model — ours or anyone else's.

Case isolation

Each case is an isolated context. The AI processes each case in a separate thread with access only to that case's vector store, so a query for Case B has no access path to data from Case A.
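The idea of a query that can only reach one case's vectors can be illustrated with a scoped handle. This is a toy in-memory model under stated assumptions, not the product's implementation; `VectorStores` and `CaseScope` are hypothetical names.

```python
class CaseScope:
    """Search handle that holds only one case's chunks."""

    def __init__(self, chunks: dict[str, list[float]]) -> None:
        self._chunks = chunks  # nothing from other cases is reachable from here

    def chunk_ids(self) -> list[str]:
        return sorted(self._chunks)

    def nearest(self, query: list[float]) -> str:
        """Return the ID of the chunk with the highest dot product."""
        def score(chunk_id: str) -> float:
            return sum(q * v for q, v in zip(query, self._chunks[chunk_id]))
        return max(self._chunks, key=score)


class VectorStores:
    """All stores keyed by store ID; queries never receive this object."""

    def __init__(self) -> None:
        self._stores: dict[str, dict[str, list[float]]] = {}

    def add(self, store_id: str, chunk_id: str, vector: list[float]) -> None:
        self._stores.setdefault(store_id, {})[chunk_id] = vector

    def scoped(self, store_id: str) -> CaseScope:
        # Hand out a copy of a single store, never the whole mapping.
        return CaseScope(dict(self._stores.get(store_id, {})))


stores = VectorStores()
stores.add("store-A", "a1", [1.0, 0.0])
stores.add("store-B", "b1", [0.0, 1.0])
scope_b = stores.scoped("store-B")
print(scope_b.nearest([1.0, 0.0]))  # still a Case B chunk
```

Even a query that matches Case A's vector exactly can only return a Case B chunk, because the handle was built from Case B's store alone.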

Retention and deletion

When you delete a case, we immediately delete the source files from the AI provider's storage and delete the vector store. The only surviving data is your local encrypted case record — which you can delete at any time, and which is unreadable without your key.

Threat

Data leakage across cases, training data inclusion, third-party data sharing

Mitigation

Hard case isolation, no-training contractual commitment, immediate deletion on request

End-to-End Data Flow

What Actually Happens to Your Document

01 You upload a PDF (Layer 01: Transit)

Connection is TLS 1.3 encrypted. The file bytes never travel in plaintext.

02 AI reads and extracts data (Layer 03: AI Processing)

The document is passed to the AI in an isolated session. It extracts WPI, codes, dates, and other structured fields. The original text is discarded after this step.

03 Document converted to embeddings (Layer 03: Embeddings)

The text is chunked and converted into floating-point vectors. These are stored in your private, isolated vector store. The source text no longer exists in the AI pipeline.

04 Extracted case data written to disk (Layer 02: At Rest)

The structured extraction (JSON) is serialised, encrypted with AES-256-GCM using your key and a fresh random IV, then written to disk. The plaintext never touches disk.

05 Deterministic rating runs (Layer 04: Key Control)

The rating engine reads the encrypted record, decrypts it in memory, runs the PDRS math, re-encrypts the result, and discards the plaintext. Only ciphertext is persisted.

06 You delete the case (Layer 05: Retention)

Source files are deleted from the AI provider. The vector store is deleted. The encrypted local record is deleted. Nothing remains.

Maximum Security Option

Need Zero Internet Exposure?

The cloud plan is designed for strong security — but it still routes AI processing through an external provider. If your firm requires an absolute guarantee that no data ever leaves your network, we offer a fully air-gapped on-premise deployment.

A custom AI model runs entirely on hardware inside your office. No API calls. No cloud services. No internet required at any step. We come to you, install the system, configure the local model, encrypt the environment, and train your team. After setup, the system runs indefinitely with no external dependencies.

  • AI model runs on-premise — no external API calls
  • Vector database hosted locally on your hardware
  • Encryption keys never leave your building
  • Fully functional with network cable unplugged
  • On-site installation, configuration, and team training
  • Ongoing support available remotely or on-site
Concern                   Cloud               On-Premise
Data leaves network       Encrypted only      Never
AI model location         External provider   Your hardware
Vector store location     Cloud (isolated)    Local server
Internet required         Yes                 No
Encryption key location   Your env var        Your hardware
Regulatory air-gap        No                  Full