Dictum Team

AI clinical documentation privacy risks clinicians should know

Tags: HIPAA, privacy, security

Using AI to generate clinical notes introduces privacy risks that didn't exist when documentation was entirely manual. The technology works — it saves time and reduces charting burden. But it also creates new pathways for patient data to be exposed, misused, or retained longer than necessary. Understanding these risks doesn't mean avoiding AI documentation. It means choosing and configuring it carefully.

The main privacy risks fall into five categories: data access, storage and retention, model training, human review, and third-party sub-processors. Here's what each involves and what to do about it.

Data access risks

When you use an AI scribe, patient audio and clinical content pass through multiple systems. Each hop is a potential exposure point.

Audio transmission. If the tool sends audio to the cloud for processing, that data travels over the internet to the vendor's servers. Unless the connection is encrypted with TLS 1.2 or later, the audio can be intercepted in transit.

Server-side processing. Once audio reaches the vendor's infrastructure, it's processed by speech recognition and language models. During processing, the data exists in memory on the vendor's servers. Who can access those servers? What logging is in place?

API integrations. Some AI scribes connect to EHR systems, cloud storage, or third-party NLP services. Each integration creates an additional access point with its own security posture.

What to verify:

  • Is audio encrypted in transit (TLS 1.2 or later) on every hop between the device and the vendor's servers?
  • Does the vendor offer on-device processing to avoid cloud transmission entirely?
  • What access controls govern the vendor's internal systems?
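
One way to spot-check the transport encryption question is from the client side. The sketch below uses Python's standard ssl module to build a connection that refuses anything older than TLS 1.2 and reports which protocol version a server negotiates. The function names are illustrative, and the hostname you would test (for example, the endpoint a scribe app uploads to) is an assumption you would need to confirm with the vendor. A result here describes only the hop you test, not the vendor's internal traffic.

```python
import socket
import ssl
from typing import Optional


def strict_tls_context() -> ssl.SSLContext:
    """Build a client context that refuses anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_tls_version(host: str, port: int = 443,
                           timeout: float = 5.0) -> Optional[str]:
    """Connect to host and return the negotiated protocol, e.g. 'TLSv1.3'."""
    context = strict_tls_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.version()
```

If the server offers only older protocols, the handshake should fail with an ssl.SSLError rather than silently downgrading, which is the behavior you want from a client that handles patient audio.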

Storage and retention risks

Data at rest presents a different set of risks than data in transit.

Over-retention. Some vendors store audio recordings, transcripts, and generated notes for extended periods — sometimes indefinitely. Every day that data sits on a server is another day it could be breached, subpoenaed, or accessed by unauthorized personnel.

Backup and replication. Even if the primary copy is deleted, backups may persist. Ask whether deletion is complete across all replicas, backups, and disaster recovery systems.

Geographic storage. Where data is physically stored matters. Different countries have different data protection laws. If your vendor stores data outside the U.S., additional regulations may apply.

What to verify:

  • What is the default retention period for audio, transcripts, and notes?
  • Can you configure auto-deletion?
  • Is deletion complete across all backups and replicas?
  • Where are servers physically located?
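
To make the retention questions concrete, here is a minimal sketch of how an auto-deletion policy could be expressed and checked. The record fields, the artifact kinds, and the retention windows are all hypothetical; real values would come from your own policy and the vendor's documentation, and an actual deletion job would also have to reach backups and replicas.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class StoredArtifact:
    artifact_id: str
    kind: str             # "audio", "transcript", or "note"
    created_at: datetime  # timezone-aware, UTC


# Hypothetical retention windows, for illustration only.
RETENTION = {
    "audio": timedelta(days=7),
    "transcript": timedelta(days=30),
    "note": timedelta(days=90),
}


def overdue_for_deletion(artifacts, now):
    """Return artifacts whose age exceeds the retention window for their kind."""
    return [a for a in artifacts if now - a.created_at > RETENTION[a.kind]]


utc = timezone.utc
inventory = [
    StoredArtifact("a1", "audio", datetime(2025, 1, 1, tzinfo=utc)),
    StoredArtifact("t1", "transcript", datetime(2025, 1, 10, tzinfo=utc)),
    StoredArtifact("n1", "note", datetime(2024, 10, 1, tzinfo=utc)),
]
overdue = overdue_for_deletion(inventory, now=datetime(2025, 1, 31, tzinfo=utc))
```

With these sample dates, the audio file and the note are past their windows and the transcript is not. Keeping a separate window per artifact type reflects the point above that raw audio usually warrants the shortest retention.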

Model training risks

This is the risk that concerns clinicians most — and the one where vendor transparency is worst.

If a vendor uses patient encounters to train or fine-tune their AI models, your patient's data becomes embedded in the model's behavior. Even after the original data is deleted, its influence remains. De-identification reduces this risk but doesn't eliminate it, particularly for encounters involving rare conditions or unique clinical presentations.

For a detailed breakdown of training vs. processing and what to ask vendors, read our guide on whether AI medical scribes train on patient data.

What to verify:

  • Does the vendor use patient data for model training?
  • What de-identification methodology is applied?
  • Can you opt out?
  • Does the BAA address training explicitly?

Human review and access control risks

AI-generated notes sometimes require human review — by vendor staff for quality assurance, by your clinical team for accuracy, or by compliance teams for audit purposes. Each layer of human access is a potential privacy risk.

Vendor-side review. Some vendors have internal teams that review a sample of encounters to evaluate model performance. If those reviewers can see identifiable patient information, that's a privacy concern — even if the review is for quality purposes.

Insufficient role-based access. Within the vendor's organization, who can access your data? Are access controls granular, or can any engineer pull up patient recordings?

Audit logging gaps. If the vendor doesn't maintain detailed audit logs, there's no way to verify who accessed what data and when. This makes breach investigations and compliance audits significantly harder.

What to verify:

  • Does the vendor conduct human review of encounters? If so, is data de-identified first?
  • What role-based access controls are in place?
  • Are audit logs available to your practice?
  • Can you review access reports on request?
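
If a vendor does share access reports, a simple pass over them can surface access by roles you did not expect. The sketch below assumes a hypothetical log format (timestamp, user, role, record, action) and a hypothetical set of permitted roles; actual audit exports vary by vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    timestamp: str   # ISO 8601 string as exported by the vendor (assumed)
    user: str
    role: str
    record_id: str
    action: str      # e.g. "view", "export"


# Hypothetical roles your practice has agreed may view identifiable data.
ALLOWED_ROLES = frozenset({"clinician", "compliance_auditor"})


def flag_unexpected_access(events, allowed_roles=ALLOWED_ROLES):
    """Return events where the accessing role is not on the allowed list."""
    return [e for e in events if e.role not in allowed_roles]


sample_log = [
    AccessEvent("2025-01-15T09:02:00Z", "dr_lee", "clinician", "enc-104", "view"),
    AccessEvent("2025-01-15T11:47:00Z", "eng_42", "vendor_engineer", "enc-104", "view"),
    AccessEvent("2025-01-16T08:30:00Z", "qa_07", "vendor_qa", "enc-221", "export"),
]
flagged = flag_unexpected_access(sample_log)
```

Here the two vendor-side events are flagged for follow-up. A flagged event is a prompt to ask the vendor why the access occurred and whether the data was de-identified first, not proof of a violation.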

Third-party sub-processor risks

Most AI scribes don't run entirely on proprietary infrastructure. They use cloud providers (AWS, GCP, Azure), third-party speech recognition services, and external LLM APIs. Each sub-processor introduces its own data handling practices.

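Chain of custody. When your patient's audio passes through three or four different services, the chain of custody becomes harder to track. Your BAA is a contract with the vendor only. HIPAA requires the vendor to hold its own business associate agreements with any subcontractors that handle PHI, so ask the vendor to confirm those agreements are in place.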

LLM provider policies. If the vendor routes data through a third-party LLM (like a major cloud AI service), that provider may have its own data retention and training policies. Some LLM providers retain input data for abuse monitoring or model improvement unless the customer specifically opts out.

What to verify:

  • What sub-processors does the vendor use?
  • Does the BAA extend to all sub-processors?
  • Do sub-processors retain or train on data?
  • Has the vendor obtained appropriate data processing agreements from each sub-processor?

Privacy risk assessment checklist

Use this checklist to evaluate the privacy posture of any AI clinical documentation tool. Score each item as Met, Partially met, Not met, or Unknown.

Data access

  • ☐ Audio is encrypted with TLS 1.2+ during transmission
  • ☐ Notes and transcripts are encrypted at rest (AES-256)
  • ☐ On-device processing is available for sensitive encounters
  • ☐ API integrations use secure authentication and encryption

Storage and retention

  • ☐ Default retention period is clearly documented
  • ☐ Auto-deletion is configurable by the practice
  • ☐ Deletion covers all copies, backups, and replicas
  • ☐ Data storage locations are disclosed

Model training

  • ☐ Vendor explicitly states whether patient data is used for training
  • ☐ Training opt-out is available if training occurs
  • ☐ De-identification methodology is documented (Safe Harbor or Expert Determination)
  • ☐ BAA addresses model training explicitly

Human review and access

  • ☐ Vendor discloses whether human review of encounters occurs
  • ☐ Role-based access controls limit who can view PHI
  • ☐ Audit logs are maintained and accessible to your practice
  • ☐ Human reviewers only see de-identified data

Sub-processors

  • ☐ Complete list of sub-processors is available
  • ☐ BAA extends to all sub-processors
  • ☐ Sub-processors do not retain data for their own purposes
  • ☐ Data processing agreements are in place with each sub-processor

Compliance documentation

  • ☐ BAA is signed before any data is processed
  • ☐ SOC 2 Type II report (or equivalent attestation) is current
  • ☐ Breach notification process is documented in the BAA
  • ☐ Vendor provides compliance documentation on request
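
If you are comparing several vendors, the scoring can be tallied in a few lines. This is a minimal sketch, assuming you record each checklist item with one of the four statuses named above; the function name and data layout are illustrative and not part of any tool.

```python
from collections import Counter

STATUSES = ("Met", "Partially met", "Not met", "Unknown")


def summarize_assessment(assessment):
    """Count statuses per category and list the items that need follow-up."""
    summary = {}
    for category, items in assessment.items():
        invalid = sorted(set(items.values()) - set(STATUSES))
        if invalid:
            raise ValueError(f"Unrecognized status in {category!r}: {invalid}")
        counts = Counter(items.values())
        summary[category] = {
            "counts": {status: counts[status] for status in STATUSES},
            # Anything short of "Met" goes back to the vendor as a question.
            "follow_up": [item for item, status in items.items() if status != "Met"],
        }
    return summary


example = {
    "Compliance documentation": {
        "BAA is signed before any data is processed": "Met",
        "Breach notification process is documented in the BAA": "Partially met",
        "Vendor provides compliance documentation on request": "Unknown",
    },
}
result = summarize_assessment(example)
```

Treating Unknown as a follow-up item rather than a neutral score is a deliberate choice in this sketch: an unanswered privacy question is itself a finding.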

How Dictum addresses these risks

Dictum's architecture is designed to minimize each of the risk categories above:

  • End-to-end encryption for audio in transit and notes at rest
  • On-device processing in offline mode — audio never leaves the device
  • Configurable auto-delete with defined retention windows
  • No model training on patient encounter data
  • BAA available for all clinical users, with sub-processor coverage

Full details are available on the HIPAA compliance page and security overview.

Clinicians should review AI-generated documentation before adding it to the medical record and should use Dictum in accordance with their organization's policies and applicable laws.

Keep learning

Privacy risks don't exist in isolation. They connect to HIPAA compliance obligations, consent requirements, and vendor evaluation decisions. Continue with these related guides: