This page is the practical companion to Compliance. What is the data, where does it live, who can read it, and how does a Data Subject Rights request flow?
What Vorel collects
End-customer data (the actual lead / patient / guest / vehicle owner) sits in a small, intentional set of tables. The full inventory lives at handoff/docs/security/PII-INVENTORY.md; this is the
summary.
High-sensitivity (Direct Identifier + Communication Content)
- conversations.customer_identifier: WhatsApp wa_id (E.164 phone) or voice DID. The primary cross-channel identity anchor.
- conversations.customer_name: set from WhatsApp profile name on first inbound; nullable.
- messages.content: every inbound + outbound message text. The transcript source of truth: append-only, never edited.
- messages.content_translated: translation of the same content (when language differs from tenant default).
- messages.media_url: inbound media (images / voice clips). May contain customer-uploaded selfies / IDs.
- leads.name, leads.email, leads.phone: collected by the qualification sub-agent.
- leads.notes: free-form; may quote message content.
- appointments.customer_name, customer_phone, customer_email: required at booking time for confirmation copy.
- appointments.location_text: sometimes the customer’s address.
- leads.attributes (JSONB): vertical-specific slots; may embed strings the customer mentioned (employer, neighborhood, allergies, etc.).
Medium-sensitivity (Behavioral / Metadata)
- conversations.customer_language: 'en' | 'ar', detected from first message.
- conversations.tags: operator-curated tags.
- messages.tool_payload: JSON payload from tool calls; may include customer-derived strings (search queries, slot values from qualification).
- qa_evaluations.score + criteria: per-conversation grading; doesn’t carry raw PII but links to it via FKs.
Operator data (not end-customer PII)
- users.email, users.full_name: your dashboard team’s profile, mirror of Clerk.
- audit_log.ip_address, audit_log.user_agent: operator dashboard activity.
- audit_log.metadata: free-form JSONB per action; references customer ids by FK only. Raw customer PII is never stored here directly.
Where data lives
Primary store
- Postgres (Railway-managed): every tenant-scoped table. Disk-encrypted at rest. Row-Level Security gates every read by tenant_id.
- Redis (Railway-managed): BullMQ queues + rate-limit counters. No direct PII (job payloads are short-lived; rate limit keys are tenant id + IP / phone, not message content).
Caches + working storage
- In-process: request-scoped tenant context, conversation history during a single dispatch. Not persisted.
- Sentry: exception capture. Per apps/web/src/lib/sentry.ts, redacts vapk_* API keys and known PII fields before sending. Customer PII can still leak into stack traces if a downstream vendor returns it in an error message; the redactor catches the high-risk patterns.
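A redactor of this kind can be sketched as a pair of regex substitutions applied to event text before it leaves the process. The sketch below is in Python for brevity and is not the code in apps/web/src/lib/sentry.ts. The key shape after the vapk_ prefix, the phone pattern, the placeholder strings, and the function name scrub_event_text are all assumptions made for illustration.

```python
import re

# Assumed key shape: the "vapk_" prefix followed by URL-safe characters.
API_KEY_RE = re.compile(r"vapk_[A-Za-z0-9_\-]+")
# Assumed phone shape: optional "+", then 8 to 15 digits that are not part
# of a longer digit run (covers E.164 numbers and wa_id-style digits).
PHONE_RE = re.compile(r"(?<!\d)\+?[1-9]\d{7,14}(?!\d)")


def scrub_event_text(text: str) -> str:
    """Replace API keys and phone-like numbers with fixed placeholders."""
    text = API_KEY_RE.sub("[redacted-api-key]", text)
    return PHONE_RE.sub("[redacted-phone]", text)


if __name__ == "__main__":
    # A vendor error message that echoes a customer phone and an API key.
    print(scrub_event_text("vendor error for +971501234567 using vapk_live_abc123"))
```

A pattern-based scrub like this only catches what the patterns describe, which is why free-text PII in vendor error messages remains a residual risk.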
Ephemeral vendor stores
- Vapi: voice call orchestration; holds transcripts + recordings until our worker picks up the end-of-call-report and persists what we want.
- 360dialog: WhatsApp Cloud ingest. Holds inbound messages until our webhook receiver acknowledges.
- Deepgram, ElevenLabs: process audio in-flight; no long-term storage on their side per their commercial terms.
- Gemini: LLM API; per Google’s commercial-tier terms, requests are not used for model training.
Object storage
The PII inventory mentions s3://vorel-{env}-recordings/... for voice clips. This S3 bucket
is dormant today: the dormant Pulumi infrastructure provisions it, but Vorel’s current
Railway-hosted deployment doesn’t use it. Recordings live on Vapi’s CDN, with the URL referenced
in messages.media_url. The bucket reactivates with the AWS deployment path.
Append-only tables
Three tables block UPDATE + DELETE at the DB level via Postgres triggers:
- messages: transcript integrity.
- audit_log: audit integrity.
- billing_events: financial-record integrity.

The vorel role explicitly bypasses these triggers for retention sweeps + the
right-to-erasure path.
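The blocking behaviour can be illustrated with a small runnable sketch. It uses an in-memory SQLite database in place of Postgres, so the trigger syntax differs from the real schema, and the vorel role bypass is not modelled at all. The trigger names and the helper try_statement are hypothetical.

```python
import sqlite3


def make_append_only_db() -> sqlite3.Connection:
    """Create a demo messages table whose rows can be inserted but not changed."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE messages (id INTEGER PRIMARY KEY, content TEXT NOT NULL);

        -- Reject every UPDATE before it touches a row.
        CREATE TRIGGER messages_block_update BEFORE UPDATE ON messages
        BEGIN
            SELECT RAISE(ABORT, 'messages is append-only');
        END;

        -- Reject every DELETE before it removes a row.
        CREATE TRIGGER messages_block_delete BEFORE DELETE ON messages
        BEGIN
            SELECT RAISE(ABORT, 'messages is append-only');
        END;
        """
    )
    return conn


def try_statement(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> bool:
    """Return True if the statement ran, False if the database rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.DatabaseError:
        return False


if __name__ == "__main__":
    db = make_append_only_db()
    print(try_statement(db, "INSERT INTO messages (content) VALUES (?)", ("hello",)))
    print(try_statement(db, "UPDATE messages SET content = ? WHERE id = 1", ("edited",)))
    print(try_statement(db, "DELETE FROM messages WHERE id = 1"))
```

Enforcing the rule in the database rather than in application code means a bug in any one service cannot silently rewrite the transcript.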
Right to access (PDPL Art. 15 / GDPR Art. 15)
POST /api/tenant/export: operator-gated. Returns a ZIP containing:
- conversations.csv
- messages.csv
- leads.csv
- appointments.csv
- offerings.csv
- knowledge_base.csv
- audit_log.csv
- README.md: chain-of-custody (timestamp, operator, scope, redaction settings)

The default export is redacted. Pass include_full_pii=true for
the unredacted version (audit-logged with the operator + a justification field).
When a customer requests their data: your operator routes the request through this endpoint,
attaches the chain-of-custody README, and provides the resulting ZIP per your DSAR workflow.
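The shape of the export bundle can be sketched as follows. This is an illustration, not Vorel’s implementation: the function name build_export_zip, the REDACTED_COLUMNS map (drawn from the high-sensitivity summary on this page), the [redacted] placeholder in CSV cells, the README layout, and the rule that a justification must be present are all assumptions.

```python
import csv
import io
import zipfile
from datetime import datetime, timezone
from typing import Optional

# Assumption: columns hidden in the default (redacted) export.
REDACTED_COLUMNS = {
    "conversations": {"customer_identifier", "customer_name"},
    "messages": {"content", "content_translated", "media_url"},
    "leads": {"name", "email", "phone", "notes", "attributes"},
}


def build_export_zip(tables: dict, operator: str, scope: str,
                     include_full_pii: bool = False,
                     justification: Optional[str] = None) -> bytes:
    """Return ZIP bytes with one CSV per table plus a chain-of-custody README.md."""
    if include_full_pii and not justification:
        # Assumed policy: an unredacted export must state why it was requested.
        raise ValueError("include_full_pii requires a justification")
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, rows in tables.items():
            hidden = set() if include_full_pii else REDACTED_COLUMNS.get(name, set())
            out = io.StringIO()
            writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()) if rows else [])
            writer.writeheader()
            for row in rows:
                writer.writerow({k: "[redacted]" if k in hidden else v
                                 for k, v in row.items()})
            zf.writestr(f"{name}.csv", out.getvalue())
        readme = "\n".join([
            "# Chain of custody",
            f"- timestamp: {datetime.now(timezone.utc).isoformat()}",
            f"- operator: {operator}",
            f"- scope: {scope}",
            f"- include_full_pii: {str(include_full_pii).lower()}",
            f"- justification: {justification or 'n/a'}",
        ])
        zf.writestr("README.md", readme)
    return buf.getvalue()


if __name__ == "__main__":
    data = build_export_zip(
        {"messages": [{"id": 1, "content": "hello world"}]},
        operator="op@example.com", scope="tenant t1",
    )
    print(zipfile.ZipFile(io.BytesIO(data)).namelist())
```

Writing the redaction setting into the README keeps the bundle self-describing, so whoever receives it can tell which variant they hold.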
Right to erasure (PDPL Art. 17 / GDPR Art. 17)
POST /api/tenant/forget — operator-gated, dry-run-by-default.
The 7-step scrub for a single customer phone within a single tenant:
Tombstone conversations
customer_identifier replaced with redacted-<tenant_id>-<uuid8> so the conversation row
survives but doesn’t link to the real customer.
Redact message content
messages.content and content_translated overwritten with [redacted]. Append-only
triggers permit this via the vorel role’s bypass; standard tenant role still cannot.
Null leads PII
leads.name / email / phone / notes / attributes (the high-PII keys) nulled or
redacted.
Null appointments PII
appointments.customer_name / customer_phone / customer_email / location_text /
notes redacted.
Scrub audit-log JSONB references
Regex-replace E.164 phones + wa_id-shaped numbers in audit_log.metadata JSONB.
Salted-hash audit reference
A new audit row records the scrub, tagged with sha256(salt + phone) of the original
customer identifier. This row IS the chain-of-custody artefact proving deletion happened.
Salt comes from RIGHT_TO_ERASURE_HASH_SALT env var; without a salt the hash would be
trivially reversible by any phone-number dictionary.
What right-to-erasure does NOT touch
- messages row deletion: we redact content, we don’t delete the rows. The append-only trigger is intentional; the audit-log record needs the conversation row to survive for the chain of custody to make sense.
- audit_log rows: append-only. The deletion event is recorded as a new audit row, not by removing prior rows.
- billing_events: financial integrity supersedes per-customer scrub.
- Webhook deliveries already fired: past deliveries to your tenant’s outbound webhook URL carried the message content. Vorel can’t reach those receivers to retroactively scrub. Customers should expect that data may also live in your downstream systems and address it there per your own retention policy.
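Three of the scrub steps described in this section (the tombstone identifier, the audit-log metadata scrub, and the salted hash) can be sketched in a few lines. This is a minimal illustration rather than the production code path: the function names, the phone regex, the [redacted] replacement inside metadata, and the reading of uuid8 as the first 8 hex characters of a random UUID are assumptions.

```python
import hashlib
import re
import uuid

# Assumed shape for E.164 phones and wa_id-style digit strings.
PHONE_RE = re.compile(r"(?<!\d)\+?[1-9]\d{7,14}(?!\d)")


def tombstone_identifier(tenant_id: str) -> str:
    """Build a redacted-<tenant_id>-<uuid8> placeholder (uuid8 assumed: 8 hex chars)."""
    return f"redacted-{tenant_id}-{uuid.uuid4().hex[:8]}"


def salted_phone_hash(salt: str, phone: str) -> str:
    """Return sha256(salt + phone) as hex; an empty salt is refused."""
    if not salt:
        raise ValueError("a non-empty salt is required")
    return hashlib.sha256((salt + phone).encode("utf-8")).hexdigest()


def scrub_metadata(value):
    """Recursively replace phone-like substrings in a decoded JSONB value.

    Only string values are rewritten; dict keys and numeric values are left as-is.
    """
    if isinstance(value, str):
        return PHONE_RE.sub("[redacted]", value)
    if isinstance(value, list):
        return [scrub_metadata(v) for v in value]
    if isinstance(value, dict):
        return {k: scrub_metadata(v) for k, v in value.items()}
    return value


if __name__ == "__main__":
    print(tombstone_identifier("t1"))
    print(scrub_metadata({"note": "called +971501234567"}))
    print(salted_phone_hash("example-salt", "+971501234567"))
```

Because the hash is deterministic for a given salt and phone, the operator can later recompute it to match an erasure record to a request without storing the phone itself.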
Per-tenant retention configuration
| Data class | Default retention | Configurable per-tenant? |
|---|---|---|
| Conversations / messages | Subscription term + 30-day grace | No (Phase R extension) |
| Voice recordings | 90 days | Yes (30–365 days, operator-set) |
| QA evaluations | Same as conversations | No |
| Audit log | Subscription term + 30-day grace | No |
| Billing events | Indefinite (financial integrity) | No |
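The retention rules in the table can be expressed as a small validation sketch. The constants mirror the table; the function names, the error handling, and the assumption that the grace period is counted in calendar days from the subscription end date are illustrative only.

```python
from datetime import date, timedelta
from typing import Optional

DEFAULT_RECORDING_DAYS = 90   # default from the table
MIN_RECORDING_DAYS = 30       # lower bound of the operator-set range
MAX_RECORDING_DAYS = 365      # upper bound of the operator-set range
GRACE_DAYS = 30               # grace period after the subscription term


def recording_retention_days(override: Optional[int] = None) -> int:
    """Return the voice-recording retention in days, validating any override."""
    if override is None:
        return DEFAULT_RECORDING_DAYS
    if not MIN_RECORDING_DAYS <= override <= MAX_RECORDING_DAYS:
        raise ValueError(
            f"retention must be {MIN_RECORDING_DAYS} to {MAX_RECORDING_DAYS} days"
        )
    return override


def grace_period_end(subscription_end: date) -> date:
    """Return the last day of the 30-day grace period (assumed calendar days)."""
    return subscription_end + timedelta(days=GRACE_DAYS)


if __name__ == "__main__":
    print(recording_retention_days())          # tenant with no override
    print(recording_retention_days(180))       # tenant with a valid override
    print(grace_period_end(date(2025, 1, 1)))  # end of grace for a Jan 1 term end
```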
Customer requests workflow
When a customer of yours (your end customer) requests a Data Subject Rights action:
You receive the request
Customer emails / messages your team requesting their data or asking to be forgotten.
Notify your Vorel operator
Email your operator (or use your support channel) with: customer’s phone number + the
nature of the request (export / erasure) + any required documentation per your DSAR
process.
Operator runs the endpoint
Vorel’s operator runs /api/tenant/export or /api/tenant/forget against your tenant
scoped to the specific customer phone. Default settings; no include_full_pii unless you
explicitly request it.
Receive the artefact
Operator sends you the ZIP (for export) or the audit-log reference (for erasure). The
audit-log entry includes the salted-hash of the customer phone — sufficient proof of
deletion without re-introducing the PII.
What about analytics on customer data?
Per-tenant analytics (the /(dashboard)/analytics surface + the weekly-rollup API) reads
the same RLS-scoped tables. The operator-side analytics surface
(/admin/tenants/[id]/analytics) shows one tenant at a time: there’s no cross-tenant
aggregation on the operator surface today.
We do not run platform-wide analytics over customer content for any purpose (model training,
benchmarking, marketing, etc.).
Related docs
- Security overview — RLS, encryption, audit, rate limiting, SLOs
- Compliance — DPA + region model + sub-processor disclosure
- Product → Guardrails — runtime safety policy