Engineering | May 28, 2025 | 7 min read

How We Index Your Data Safely

Sam Chen

Synapse Team

Security is often listed as a feature on a pricing page and then forgotten. At Synapse, it's a constraint that shapes every architectural decision we make.

The indexing pipeline

When you connect a data source (a database, a document store, or a SaaS app), Synapse runs an ingestion pipeline with four stages:

Extraction: We pull data using read-only credentials you provide. We request the minimum permissions required and nothing more.
Chunking: Long documents are split into overlapping windows. This preserves context across chunk boundaries.
Embedding: Chunks are converted to vector representations using a model that runs inside your VPC. Embeddings never leave your infrastructure.
Storage: Vectors are stored in an isolated namespace in a managed vector database. Each workspace is cryptographically separated.
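
To make the chunking stage concrete, here is a minimal sketch of overlapping windows in Python. It is an illustration only, not our production chunker: the function name `chunk_words` and the `window_size` and `overlap` parameters are assumptions made for this example, and real chunkers typically work on tokens rather than whitespace-separated words.

```python
def chunk_words(words, window_size, overlap):
    """Split a list of words into windows that share `overlap` words.

    Hypothetical helper for illustration; names and parameters are assumptions.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must be in [0, window_size)")

    step = window_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + window_size])
        # Stop once a window has reached the end of the document,
        # so we do not emit a trailing chunk made only of repeated words.
        if start + window_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    text = "the quick brown fox jumps over the lazy dog again"
    for chunk in chunk_words(text.split(), window_size=4, overlap=2):
        print(" ".join(chunk))
```

Because each window repeats the last `overlap` words of the previous one, a sentence that straddles a boundary appears whole in at least one chunk, provided it is no longer than the overlap plus one word.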

What we never do

We never train on your data. We never use your data to improve models shared with other customers. We never store raw document text longer than the TTL you configure.
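
As a rough illustration of what a TTL on raw text implies, the sketch below drops any raw document whose age has reached the configured TTL. The function names `is_expired` and `purge_expired` and the in-memory dictionary are assumptions made for this example, not a description of our storage layer.

```python
from datetime import datetime, timedelta, timezone


def is_expired(stored_at, ttl, now):
    """Return True once a document has been held for at least `ttl`."""
    return now - stored_at >= ttl


def purge_expired(raw_docs, ttl, now):
    """Return a copy of `raw_docs` with expired entries removed.

    `raw_docs` maps a document ID to a (text, stored_at) pair.
    Hypothetical structure, used only for this sketch.
    """
    return {
        doc_id: (text, stored_at)
        for doc_id, (text, stored_at) in raw_docs.items()
        if not is_expired(stored_at, ttl, now)
    }


if __name__ == "__main__":
    now = datetime(2025, 5, 28, 12, 0, tzinfo=timezone.utc)
    docs = {
        "old": ("raw text A", now - timedelta(hours=25)),
        "new": ("raw text B", now - timedelta(hours=1)),
    }
    kept = purge_expired(docs, ttl=timedelta(hours=24), now=now)
    print(sorted(kept))
```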

The audit trail

Every query that touches your data is logged with a timestamp, the user who ran it, and the exact chunks that were retrieved. You can export this log at any time.
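
To show the shape of such a log, here is a minimal sketch of an audit entry holding the three fields described above, exported as JSON Lines. The `AuditEntry` class, the field names, and the export format are assumptions made for illustration; the actual log schema may differ.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """One logged query. Field names are hypothetical."""
    timestamp: str    # ISO 8601, UTC
    user: str         # who ran the query
    chunk_ids: tuple  # exact chunks that were retrieved


def record_query(log, user, chunk_ids, now=None):
    """Append an entry for a query to `log` and return it."""
    now = now or datetime.now(timezone.utc)
    entry = AuditEntry(
        timestamp=now.isoformat(),
        user=user,
        chunk_ids=tuple(chunk_ids),
    )
    log.append(entry)
    return entry


def export_log(log):
    """Serialize the log as JSON Lines, one entry per line."""
    return "\n".join(json.dumps(asdict(entry)) for entry in log)


if __name__ == "__main__":
    log = []
    record_query(log, "alice@example.com", ["doc-1#3", "doc-1#4"])
    print(export_log(log))
```

Making the entry immutable (`frozen=True`) is a small design choice that fits an audit trail: once written, a record should not be modified in place.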

Security isn't a checkbox for us. It's how we sleep at night.
