
Supabase — Conversation Data with Auth, RLS, and Real-Time Built In

Supabase is PostgreSQL with batteries included. You get Row Level Security for multi-tenant access control, Edge Functions for serverless webhook handlers, real-time subscriptions for instant alerting, PostgREST for automatic API generation from your schema, and a managed dashboard — all on top of a real PostgreSQL database you can connect to directly with any Postgres client.

This guide covers what Supabase adds on top of plain Postgres for conversation data pipelines. If you want the foundational SQL patterns, see the PostgreSQL guide. Everything here assumes you already have a Supabase project and want to use its platform features — not just the database.

Recommended default ingestion pattern

Edge Function webhook handler → direct INSERT via service role client

The Edge Function receives the webhook from your conversation platform or automation tool, normalizes the payload, and upserts into the conversations and conversation_signals tables. Use the service role key (bypasses RLS) for pipeline writes.
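As a sketch, the normalization step inside the handler might look like the following. The payload field names (id, platform, started_at) are assumptions, since every source platform shapes its webhooks differently:

```typescript
// Hypothetical webhook payload shape -- adjust to your source platform.
interface WebhookPayload {
  id: string;
  platform: string;
  started_at: string; // ISO 8601 timestamp
  duration?: number;
  transcript?: string;
}

// Map a raw payload onto a row for the conversations table in this guide.
function toConversationRow(payload: WebhookPayload, tenantId: string) {
  return {
    call_id: payload.id,
    source_platform: payload.platform,
    call_timestamp: new Date(payload.started_at).toISOString(),
    duration_seconds: payload.duration ?? null,
    transcript_text: payload.transcript ?? null,
    tenant_id: tenantId,
  };
}

// Inside the Edge Function, the service role client then performs the
// idempotent write (bypassing RLS), roughly:
//   await supabase.from("conversations")
//     .upsert(toConversationRow(payload, tenantId), { onConflict: "call_id" });
```

Keeping the normalization pure makes it easy to unit-test outside the Deno runtime; only the final upsert touches the database.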

When it fits

When to Choose Supabase

Supabase is not the right choice for every project. It shines when you need more than a database — when you need a platform.

You want a managed Postgres database with auth, RLS, and an instant REST API without provisioning separate infrastructure. One project gives you the database, the API layer, and the auth system.

Row Level Security lets you gate access to conversation data per team, tenant, or user — critical for multi-tenant applications where each customer should only see their own calls.

Edge Functions give you a serverless webhook handler that writes directly to the database. No separate API server, no container orchestration, no cold infrastructure to manage.

Real-time subscriptions let you build live dashboards and alert systems that react to new signals the instant they’re inserted. No polling, no message queue, no additional pub/sub layer.

PostgREST generates a REST API from your schema automatically. Frontend applications can query conversation data without a custom backend. Filter, sort, paginate — all via URL parameters.

You’re building a product or internal tool (not just an analytics layer) and want database + auth + API in one platform. Supabase collapses the stack so you ship faster.

Architecture choices

Three Choices That Shape Your Supabase Setup

These three decisions will shape how your conversation data pipeline behaves in production. Get them right early.

1. RLS impact on pipeline writes

RLS is enabled by default on new tables in Supabase. Your pipeline — Edge Functions, cron jobs, external scripts — needs to bypass RLS using the service role key. Application users should go through RLS policies. Design your policies before you start writing data, not after. Retrofitting RLS onto a table with existing data and active queries is painful and error-prone.

2. Service role key management

The service role key bypasses all RLS. Never expose it in client-side code. Use it only in Edge Functions, server-side scripts, and backend services. Rotate it immediately if compromised. Store it in environment variables — never commit it to source control, never log it, never pass it in a URL parameter.

3. Edge Function cold starts and limits

Edge Functions have a ~200ms cold start on first invocation. For webhook handlers that fire a few times per minute, this is fine. For high-frequency endpoints (>100 req/sec), consider an external service. Max execution time is 150 seconds. Memory limit is 150MB per invocation. These limits are generous for data ingestion but will bite you if you try to do heavy processing inside the function.

Schema design

Schema Design

A two-table model with Postgres-native types and Supabase-specific considerations. The tenant_id column on both tables powers RLS policies for multi-tenant access control.

conversations

One row per call

Columns

call_id (TEXT, PK): Primary key. Unique identifier from the source platform.
source_platform (TEXT): Where the call originated: gong, zoom, teams, etc.
call_timestamp (TIMESTAMPTZ, NOT NULL): When the call started. Timezone-aware for correct ordering across regions.
duration_seconds (INTEGER): Call duration in seconds.
participants (JSONB): Array of participant names, roles, and email addresses. Indexable with GIN.
deal_id (TEXT): Foreign key to CRM deal/opportunity.
account_id (TEXT): Foreign key to CRM account/company.
transcript_text (TEXT): Full transcript text. Can be large; consider Supabase Storage for very high volumes.
metadata (JSONB): Flexible JSON for source-specific metadata. Add a GIN index for efficient querying.
ingested_at (TIMESTAMPTZ, DEFAULT now()): When this row was inserted into the database.
tenant_id (TEXT): For multi-tenant RLS. Maps to the tenant claim in your JWT.

conversation_signals

One row per extracted signal

Columns

signal_id (TEXT, PK): Primary key. Unique identifier for this signal row.
call_id (TEXT, FK): Foreign key to conversations(call_id).
signal_name (TEXT, NOT NULL): Name of the extracted signal: overall_score, risk_flag, pain_point.
signal_value (TEXT): The extracted value. Cast to the appropriate type in queries.
signal_type (TEXT): Type hint: score, boolean, category, extraction, count.
confidence (REAL): Confidence score from the evaluation (0.0 to 1.0).
scoring_version (TEXT): Version of the evaluation framework that produced this signal.
evaluated_at (TIMESTAMPTZ): When this signal was generated.
tenant_id (TEXT): For multi-tenant RLS. Must match the conversations table.
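The two-table model above can be sketched as DDL. Treat this as a starting point, not a migration to run blindly; adjust types and constraints to your data:

```sql
create table conversations (
  call_id          text primary key,
  source_platform  text,
  call_timestamp   timestamptz not null,
  duration_seconds integer,
  participants     jsonb,
  deal_id          text,
  account_id       text,
  transcript_text  text,
  metadata         jsonb,
  ingested_at      timestamptz default now(),
  tenant_id        text
);

create table conversation_signals (
  signal_id       text primary key,
  call_id         text references conversations (call_id),
  signal_name     text not null,
  signal_value    text,
  signal_type     text,
  confidence      real,
  scoring_version text,
  evaluated_at    timestamptz,
  tenant_id       text
);
```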

RLS Policies & Indexes

Security and performance configuration

Enable RLS on both tables. Policy: users can only see rows where tenant_id = auth.jwt()->>'tenant_id'. Pipeline writes use the service role key, which bypasses RLS entirely.

B-tree index on call_timestamp for time-range queries. This is the most common filter in dashboards and reports.

Composite index on (call_id, signal_name) for fast signal lookups by conversation and signal type.

GIN indexes on JSONB columns (participants, metadata) for efficient JSON path queries and containment checks.
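The policies and indexes described above, spelled out as SQL. This is a sketch: the policy names are arbitrary, and you will likely want matching policies for writes if application users ever insert data:

```sql
alter table conversations enable row level security;
alter table conversation_signals enable row level security;

-- Users see only rows matching the tenant claim in their JWT.
create policy tenant_read_conversations on conversations
  for select using (tenant_id = auth.jwt() ->> 'tenant_id');
create policy tenant_read_signals on conversation_signals
  for select using (tenant_id = auth.jwt() ->> 'tenant_id');

-- Indexes for the common access paths.
create index on conversations (call_timestamp);
create index on conversation_signals (call_id, signal_name);
create index on conversations using gin (participants);
create index on conversations using gin (metadata);
```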

For Semarize-specific modeling, versioning, and grounding lineage best practices, see the canonical modeling guide.

Data loading

Loading Strategy

Start with the Edge Function webhook handler. Add bulk loading for historical backfills and external pipelines for complex orchestration.

Edge Function webhook handler

Recommended starting pattern

Deploy an Edge Function that accepts POST requests from your conversation platform or automation tool

Use the Supabase client with the service role key to upsert into the database

Edge Functions run at the edge — low latency for webhook receivers

Use INSERT ... ON CONFLICT (call_id) DO UPDATE for idempotent writes — re-delivering the same webhook produces the same result
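The idempotent write can be sketched as a plain upsert (column list abbreviated for readability):

```sql
insert into conversations (call_id, source_platform, call_timestamp, tenant_id)
values ($1, $2, $3, $4)
on conflict (call_id) do update
set source_platform = excluded.source_platform,
    call_timestamp  = excluded.call_timestamp,
    tenant_id       = excluded.tenant_id;
```

Re-delivering the same webhook re-runs the same statement and converges on the same row, which is exactly what you want from at-least-once webhook delivery.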

Bulk historical loads

Connect directly to Postgres using psql or a migration tool and use COPY FROM for maximum throughput

Or upload CSV files through the table editor in the Supabase dashboard

The Supabase client library works for smaller batch inserts (<1000 rows per request) — beyond that, prefer direct Postgres connections
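A historical backfill with psql might look like this, assuming a CSV whose header matches the column names (the file path is illustrative):

```sql
-- \copy runs client-side, so the CSV only needs to exist on your machine.
\copy conversations (call_id, source_platform, call_timestamp, duration_seconds, tenant_id) from 'backfill/conversations.csv' with (format csv, header true);
```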

External pipelines

Connect via the Postgres connection string (available in the Supabase dashboard under Settings → Database)

Use any Postgres client library — Supabase is Postgres under the hood, nothing proprietary in the wire protocol

For connection pooling, use Supabase’s built-in Supavisor pooler (port 6543) instead of direct connections (port 5432) to avoid exhausting your connection limit

Production concerns

Operational Playbook

Supabase-specific operational considerations for running a conversation data pipeline in production. Everything from the PostgreSQL guide still applies — autovacuum tuning, index maintenance, EXPLAIN ANALYZE. These are the extras.

RLS policy testing

Test policies in the SQL editor with: set role authenticated; set request.jwt.claims = '{"tenant_id": "test"}'; Then run your SELECT and INSERT queries to verify users can only see their own data. Test both read and write policies. A misconfigured policy that silently returns zero rows is worse than one that throws an error.
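Spelled out in the SQL editor, the test looks like this (the tenant value is a placeholder):

```sql
begin;
set local role authenticated;
set local request.jwt.claims to '{"tenant_id": "test"}';

-- Should return only rows where tenant_id = 'test'.
select count(*) from conversations;

rollback;  -- discard the role change along with any test writes
```

Wrapping the test in a transaction with set local means the role and claims reset automatically when you rollback.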

Edge Function monitoring

Check Edge Function logs in the Supabase dashboard under Edge Functions → Logs. Monitor for timeouts (150s limit), memory errors, and cold start frequency. If you’re hitting invocation limits on the free plan, scale to Pro. Set up alerts for failed invocations — a silently failing webhook handler means missing data.

Connection pooling with Supavisor

Use port 6543 (Supavisor) for application connections, not port 5432 (direct). Supavisor handles connection pooling automatically. Direct connections are limited and vary by plan — the free tier allows around 60 direct connections. External scripts, BI tools, and application backends should all go through the pooler.

Storage limits by plan

Free plan: 500MB database, 1GB file storage. Pro plan: 8GB database, 100GB file storage. Raw transcripts consume storage fast: a single call transcript can be 20–50KB, so at 100 calls per day that's 2–5MB daily in transcript data alone. Check your plan limits before loading large historical datasets. You'll likely need Pro for production volumes.

Realtime channel limits

Supabase Realtime can stream row-level changes (inserts, updates, deletes) to subscribed clients. On the free plan, you get 200 concurrent connections. For production alerting systems that serve multiple users or teams, use Pro for higher limits. Each dashboard tab or browser window counts as a separate connection.

Postgres under the hood

Everything from the PostgreSQL operational guide applies here: autovacuum tuning, index maintenance, EXPLAIN ANALYZE for slow queries. Access Postgres directly via the Supabase SQL editor or any Postgres client. Supabase does not hide the database from you — you have full superuser access.

What you can build

Analysis Examples

Three patterns that use Supabase-native tools — PostgREST, Realtime, and the SQL editor — to turn conversation signals into actionable output.

1. Live signal dashboard (PostgREST + custom UI)

Use Supabase’s auto-generated REST API to fetch conversation_signals filtered by team and date range. Build a React dashboard with the supabase-js client. No backend needed — PostgREST serves the data directly. Filter with URL parameters, paginate with Range headers, and let RLS handle per-tenant access control automatically.
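Under the hood, the PostgREST query is just a URL. A sketch of the request a dashboard might issue (the project host is a placeholder, and RLS scopes the result to the caller's tenant):

```typescript
// Build a PostgREST query URL for conversation_signals.
// Filters use PostgREST's operator syntax (eq., gte., order=col.desc).
function signalsUrl(base: string, since: string): string {
  const params = new URLSearchParams({
    select: "call_id,signal_name,signal_value,confidence",
    signal_name: "eq.overall_score",
    evaluated_at: `gte.${since}`,
    order: "evaluated_at.desc",
    limit: "50",
  });
  return `${base}/rest/v1/conversation_signals?${params.toString()}`;
}
```

In practice the supabase-js client builds these URLs for you; seeing the raw shape is mainly useful for debugging and for curl-based smoke tests.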

2. Real-time risk alerts (Supabase Realtime)

Enable Realtime on the conversation_signals table. Subscribe to INSERT events where signal_name = 'risk_flag'. When a high-risk signal lands, instantly notify the rep’s manager via a connected Slack webhook. Zero polling, instant response. The alert fires the moment the row is committed, not on the next polling cycle.
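Enabling Realtime on the table is a one-line change to the replication publication Supabase creates for you:

```sql
alter publication supabase_realtime add table conversation_signals;
```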

3. Team performance report (SQL editor + CSV export)

Use the Supabase SQL editor to run aggregation queries — average score by rep, by week, with trend lines. Export results as CSV for stakeholders who prefer spreadsheets. When self-service becomes important, connect a BI tool like Metabase or Grafana directly to your Postgres instance via the connection string.
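A sketch of the weekly aggregation, assuming the rep's name lives in the metadata JSONB under a rep key (an assumption; your pipeline may store it elsewhere):

```sql
select
  c.metadata ->> 'rep'                 as rep,
  date_trunc('week', c.call_timestamp) as week,
  avg((s.signal_value)::numeric)       as avg_score,
  count(*)                             as calls_scored
from conversation_signals s
join conversations c using (call_id)
where s.signal_name = 'overall_score'
group by 1, 2
order by 2, 1;
```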

Structured signals

RLS, Realtime, and the Signal Layer

Supabase's managed platform adds capabilities on top of Postgres that change how you expose and react to conversation intelligence data.

Row Level Security policies on conversation_signals let you scope access by tenant. A policy like USING (tenant_id = auth.jwt()->>'tenant_id') means your frontend can query signals via PostgREST without building a separate authorization layer. The database enforces tenant isolation, not your application code.

Supabase Realtime subscriptions can trigger alerts the moment a new signal is inserted. Subscribe to INSERT events on conversation_signals with a filter like signal_name=eq.churn_risk and confidence=gt.0.8 to push high-risk signals to Slack or a dashboard in under a second, without polling.

Edge Functions handle webhook-driven ingestion within the Deno runtime limits noted above (150-second max execution, 150MB memory). For signal ingestion, this is plenty: parse the JSON payload, run the upsert via the service role client, return 200. Keep the function stateless and let Supavisor handle connection pooling on port 6543.

PostgREST auto-generates a REST API over your signal tables. Your frontend can query conversation_signals with URL parameters like ?signal_name=eq.objection_handling&order=confidence.desc&limit=10 without writing SQL or building API endpoints.

Full modeling guide: Modeling Semarize Outputs covers versioning, lineage columns, and schema patterns for any database.
CRM sync: For pushing signals back to your CRM, see the HubSpot guide or Salesforce guide.
