Symbl.ai - How to Get Your Conversation Data

A practical guide to getting your conversation data out of Symbl.ai - covering the Async API, Streaming API, Conversation API, Tracker configuration, Nebula summaries, and how to route structured intelligence into your downstream systems.

What you'll learn

  • What conversation data you can extract from Symbl.ai - transcripts, topics, action items, questions, sentiment, entities, and Tracker hits
  • How to access data via the Symbl.ai APIs - authentication, Async API, Streaming API, and Conversation API
  • Three extraction patterns: batch processing, real-time streaming, and webhook-triggered flows
  • How to connect Symbl.ai data pipelines to Zapier, n8n, and Make
  • Advanced use cases - compliance monitoring, real-time intelligence, sentiment trending, and custom processing pipelines

Data

What Data You Can Extract From Symbl.ai

Symbl.ai is a processing platform - you send it audio, video, or text, and it returns structured conversation intelligence. Unlike platforms that also record calls, Symbl.ai focuses entirely on the analysis layer. Every processed conversation produces a rich set of structured outputs accessible via the Conversation API.

Structured outputs available per conversation

Full transcript with speaker labels
Topics (auto-detected conversation themes)
Action items with assignee detection
Follow-up suggestions
Questions asked during conversation
Sentiment analysis (per message and overall)
Named entities (people, orgs, dates, etc.)
Tracker hits (custom keyword/phrase detection)
Conversation analytics (talk ratios, silence, etc.)
Nebula abstractive summaries

API Access

How to Get Data via the Symbl.ai API

Symbl.ai exposes three main API surfaces: the Async API for batch processing, the Streaming API for real-time analysis, and the Conversation API for retrieving results. The workflow is: authenticate with your App ID and Secret, submit content for processing, then retrieve structured results via the Conversation API.

1. Authenticate

Symbl.ai uses OAuth 2.0-style token authentication. Send your App ID and App Secret to the POST /oauth2/token/generate endpoint to receive an access token. Include this token as a Bearer token in all subsequent API calls.

POST https://api.symbl.ai/oauth2/token/generate

{
  "type": "application",
  "appId": "<your_app_id>",
  "appSecret": "<your_app_secret>"
}

// Response:
// { "accessToken": "eyJhb...", "expiresIn": 86400 }
Access tokens expire after 24 hours by default. Your integration must handle automatic token refresh - cache the token and regenerate before the expiresIn window closes to avoid mid-pipeline auth failures.
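
A minimal token-cache sketch of the refresh pattern described above. The fetcher callable, class name, and the 5-minute safety margin are our own illustrative choices - only the response shape ({"accessToken": ..., "expiresIn": seconds}) comes from the Symbl.ai endpoint shown earlier.

```python
import time

class TokenCache:
    """Cache a Symbl.ai access token and regenerate it before expiry."""
    REFRESH_MARGIN = 300  # refresh 5 minutes before the expiresIn window closes

    def __init__(self, fetch_token):
        self._fetch = fetch_token  # callable that POSTs to /oauth2/token/generate
        self._token = None
        self._expires_at = 0.0

    def get(self, now=None):
        now = time.time() if now is None else now
        # Regenerate when missing or inside the safety margin
        if self._token is None or now >= self._expires_at - self.REFRESH_MARGIN:
            resp = self._fetch()
            self._token = resp["accessToken"]
            self._expires_at = now + resp["expiresIn"]
        return self._token
```

Every pipeline step can then call `cache.get()` and never see a stale token, regardless of how long the run takes.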

2. Submit content via the Async API

The Async API accepts audio, video, or text for batch processing. Submit a file URL via POST /v1/process/audio/url (or the video/text equivalents). The response returns a conversationId and a jobId for tracking processing status.

POST https://api.symbl.ai/v1/process/audio/url

{
  "url": "https://storage.example.com/calls/discovery-call.mp3",
  "name": "Discovery Call - Acme Corp",
  "confidenceThreshold": 0.6,
  "detectTopics": true,
  "detectActionItems": true,
  "detectQuestions": true,
  "enableSpeakerDiarization": true,
  "diarizationSpeakerCount": 2
}

// Response:
// { "conversationId": "5681...a3f2", "jobId": "9f1c...b7e4" }

You can also submit raw audio via POST /v1/process/audio (multipart upload) or text via POST /v1/process/text. Each endpoint returns a conversationId for result retrieval.
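
The submission above can be sketched as a small helper. The payload fields mirror the request body shown earlier; the helper names (`build_submit_payload`, `submit_audio`) are ours, not part of any Symbl.ai SDK.

```python
import json
import urllib.request

SYMBL_AUDIO_URL = "https://api.symbl.ai/v1/process/audio/url"

def build_submit_payload(recording_url, name, speaker_count=2):
    """Assemble the Async API request body from the fields documented above."""
    return {
        "url": recording_url,
        "name": name,
        "confidenceThreshold": 0.6,
        "detectTopics": True,
        "detectActionItems": True,
        "detectQuestions": True,
        "enableSpeakerDiarization": True,
        "diarizationSpeakerCount": speaker_count,
    }

def submit_audio(token, payload):
    """POST the payload; the response carries conversationId and jobId."""
    req = urllib.request.Request(
        SYMBL_AUDIO_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```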

3. Check processing status

Poll the job status via GET /v1/job/{jobId} until the status is completed. Alternatively, pass a webhookUrl in your submission request and Symbl.ai will POST a callback when processing finishes.

GET https://api.symbl.ai/v1/job/9f1c...b7e4

// Response:
// { "id": "9f1c...b7e4", "status": "completed" }

Processing time varies by file length and format. Audio files typically process in a fraction of their recording duration. Implement exponential backoff when polling to avoid unnecessary API calls.
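
A polling loop with exponential backoff, as recommended above, might look like this. `check_status` stands in for a `GET /v1/job/{jobId}` call; the delay values are illustrative defaults, not Symbl.ai limits.

```python
import time

def poll_job(check_status, base_delay=2.0, max_delay=60.0,
             max_attempts=30, sleep=time.sleep):
    """Poll until the job reports a terminal status, doubling the delay each try."""
    delay = base_delay
    for _ in range(max_attempts):
        status = check_status()  # e.g. returns "in_progress" or "completed"
        if status in ("completed", "failed"):
            return status
        sleep(delay)
        delay = min(delay * 2, max_delay)  # exponential backoff, capped
    raise TimeoutError("job did not finish within the polling budget")
```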

4. Retrieve results via the Conversation API

Structured endpoints

Once processing completes, use the Conversation API to retrieve individual signal types. Each endpoint returns structured JSON for a specific intelligence category: GET /v1/conversations/{id}/messages for transcript, /topics, /action-items, /questions, /follow-ups, /entities, and /analytics.

Trackers & Nebula

Retrieve custom Tracker detections via /trackers - each hit includes the matched phrase, context, and confidence score. For abstractive summaries, call the Nebula endpoint with the conversation ID to generate a natural-language summary of the entire conversation.
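
Pulling every signal type for one conversation reduces to iterating over the endpoints listed above. This is a hedged sketch: `fetch_json` stands in for an authenticated GET, and the helper names are ours.

```python
BASE = "https://api.symbl.ai/v1/conversations"
SIGNALS = ["messages", "topics", "action-items", "questions",
           "follow-ups", "entities", "trackers", "analytics"]

def conversation_urls(conversation_id):
    """Map each signal type to its Conversation API URL."""
    return {sig: f"{BASE}/{conversation_id}/{sig}" for sig in SIGNALS}

def fetch_all(conversation_id, fetch_json):
    """Retrieve every signal type; fetch_json(url) does the authenticated GET."""
    return {sig: fetch_json(url)
            for sig, url in conversation_urls(conversation_id).items()}
```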

Patterns

Key Extraction Flows

There are three practical patterns for processing conversations through Symbl.ai. The right choice depends on whether you're doing a one-off batch analysis, running ongoing extraction from a recording platform, or need real-time intelligence during live calls.

Batch Processing (Async API)

Process historical recordings in bulk

1. Collect audio/video URLs from your recording platform (Zoom, Teams, your telephony system, cloud storage, etc.)

2. For each file, POST to /v1/process/audio/url (or /video/url) with your desired configuration - speaker diarisation, topic detection, Tracker IDs, and confidence thresholds

3. Store the returned conversationId and jobId. Poll GET /v1/job/{jobId} or use webhook callbacks to track completion

4. Once processing completes, call the Conversation API endpoints to retrieve topics, action items, questions, entities, sentiment, and transcript data

5. Run your analysis pipeline against the structured output - score with Semarize, push to your warehouse, or route to downstream systems

Tip: Respect concurrent processing limits. Submit files in controlled batches and track each jobId. If a batch is interrupted, you can resume from the last unprocessed file without re-submitting completed ones.
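
The resumable-batch tip can be sketched as follows. `submit` stands in for an Async API submission returning (conversationId, jobId), and `done` is any set-like ledger you persist between runs - both are our assumptions for illustration.

```python
def process_batch(file_urls, submit, done, batch_size=5):
    """Submit files in controlled batches, skipping anything already processed."""
    results = {}
    pending = [u for u in file_urls if u not in done]  # resume point
    for i in range(0, len(pending), batch_size):
        batch = pending[i:i + batch_size]
        for url in batch:
            results[url] = submit(url)  # returns (conversationId, jobId)
            done.add(url)               # persist this ledger to survive interruptions
        # In a real pipeline, wait for this batch's jobs here before submitting more,
        # so you stay under your plan's concurrent-processing limit.
    return results
```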

Real-Time Streaming (WebSocket API)

Live intelligence during conversations

1. Open a WebSocket connection to wss://api.symbl.ai/v1/streaming/{connectionId}. Pass your access token and configuration (speaker info, Trackers, language settings) in the start_request message

2. Stream raw audio packets over the WebSocket as the conversation happens. Symbl.ai processes audio in real time and emits structured events back over the same connection

3. Listen for real-time events: topic_response, action_item_response, question_response, tracker_response, and message_response (transcript segments)

4. When the conversation ends, send a stop_request. Symbl.ai finalises processing and the conversation becomes available via the Conversation API for full result retrieval

Note: WebSocket connections have concurrent limits per account. Implement reconnection logic with exponential backoff - a dropped connection during a live call means lost real-time signals unless you fall back to the Async API for post-call processing.
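
A reconnection schedule with exponential backoff and jitter, as suggested above, can be computed like this. The base, cap, and jitter range are illustrative defaults, not Symbl.ai requirements.

```python
import random

def backoff_schedule(attempts, base=1.0, cap=30.0, jitter=None):
    """Return reconnect delays: doubling from base, capped, plus 0-1s of jitter."""
    jitter = jitter if jitter is not None else random.random
    return [min(base * 2 ** n, cap) + jitter() for n in range(attempts)]
```

Drive your reconnect loop from this schedule, and fall back to the Async API for post-call processing once the attempts are exhausted.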

Webhook-Triggered Incremental Processing

Automatic processing when new recordings appear

1. Set up a webhook in your recording platform (Zoom, Teams, etc.) that fires when a new recording is available. The webhook payload includes the recording URL

2. Your webhook handler receives the event, extracts the recording URL, and submits it to Symbl.ai's Async API with a webhookUrl callback pointing back to your system

3. Symbl.ai processes the recording and POSTs a callback to your webhookUrl when complete, including the conversationId

4. Your callback handler retrieves structured results from the Conversation API and routes them downstream - to Semarize for scoring, your CRM, or your data warehouse

5. Log the conversationId and recording ID as a deduplication key to prevent reprocessing if webhooks fire multiple times

Tip: Using Symbl.ai's webhookUrl callback eliminates the need to poll for job completion. Your pipeline only activates when results are actually ready, reducing unnecessary API calls and simplifying your architecture.
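
The deduplication step above boils down to treating the recording ID as an idempotency key. A minimal sketch, assuming `seen` is a persisted mapping and `submit` wraps the Async API call:

```python
def handle_recording(recording_id, recording_url, seen, submit):
    """Submit a recording at most once, even if the webhook fires repeatedly."""
    if recording_id in seen:
        return None                        # duplicate delivery - skip
    conversation_id = submit(recording_url)
    seen[recording_id] = conversation_id   # log the pair as the dedup record
    return conversation_id
```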

Automation

Send Symbl.ai Data to Automation Tools

Once you can extract structured conversation data from Symbl.ai, the next step is routing it through Semarize for structured scoring and into your downstream systems. Below are end-to-end example flows - each showing the full pipeline from recording trigger through Symbl.ai processing and Semarize evaluation to CRM, Slack, or database output.

ZapierNo-code automation

Recording → Symbl.ai → Zapier → Semarize → CRM

Detect a new recording from your meeting platform, submit it to Symbl.ai for processing, retrieve the structured output, send it to Semarize for scoring, then write the scored signals directly to your CRM.

Example Zap

1. Trigger: New Recording - fires when a new recording is available
   App: Zoom / Teams / Custom Webhook · Event: New Recording Available · Output: recording_url, meeting_id

2. Webhooks by Zapier: submit audio to the Symbl.ai Async API
   POST https://api.symbl.ai/v1/process/audio/url · Auth: Bearer {{access_token}} · Body: { url: {{recording_url}}, detectTopics: true }
   (poll until the job status is completed)

3. Webhooks by Zapier: retrieve results from the Conversation API
   GET /v1/conversations/{{conversationId}}/topics
   GET /v1/conversations/{{conversationId}}/action-items
   GET /v1/conversations/{{conversationId}}/messages

4. Webhooks by Zapier: POST /v1/runs (sync) to Semarize
   POST https://api.semarize.com/v1/runs · Auth: Bearer smz_live_... · Body: { kit_code, mode: "sync", input: { transcript } }

5. Salesforce - Update Record: write scored signals to the Opportunity
   AI Score: {{overall_score}} · Risk Flag: {{risk_flag}} · Topics: {{top_topics}}

Setup steps

1. Create a new Zap. Choose your recording source as the trigger (Zoom, Teams, or a custom webhook). Connect your account and select the "New Recording" event.

2. Add a "Webhooks by Zapier" Action to generate a Symbl.ai access token. POST to https://api.symbl.ai/oauth2/token/generate with your App ID and App Secret.

3. Add another "Webhooks by Zapier" Action to submit the recording URL to Symbl.ai. POST to https://api.symbl.ai/v1/process/audio/url with the recording URL and your processing configuration.

4. Add a Delay step (2-5 minutes depending on typical call length), then a "Webhooks by Zapier" Action to poll the job status. Alternatively, use Zapier's webhook trigger to receive the Symbl.ai callback.

5. Add HTTP Request steps to retrieve topics, action items, and messages from the Conversation API using the conversationId.

6. Add a "Webhooks by Zapier" Action to send the Symbl.ai output to Semarize. POST to https://api.semarize.com/v1/runs with kit_code, mode: "sync", and the transcript in input.transcript.

7. Add a Salesforce (or HubSpot, Sheets, etc.) Action to write the Semarize brick values to your CRM record.

8. Test each step end-to-end, then turn on the Zap.

Watch out for: Symbl.ai access tokens expire after 24 hours. For long-running Zaps, add a token refresh step at the start of each run. Also, Zapier has step data size limits - for very long transcripts, store the Symbl.ai output in cloud storage and pass a reference URL to Semarize.
Learn more about Zapier automation

n8n - Self-hosted workflows

Recording → Symbl.ai → n8n → Semarize → Database

Receive new recording notifications via webhook, process through Symbl.ai, retrieve structured intelligence, send to Semarize for scoring, then write the results to your database. n8n's native loop support handles Symbl.ai job polling and batch processing.

Example Workflow

1. Webhook - New Recording: receives the callback from your recording platform
   Method: POST · Path: /symbl-ingest · Output: recording_url, metadata

2. HTTP Request - Auth Token: POST /oauth2/token/generate (Symbl.ai)
   Body: { type: 'application', appId, appSecret }

3. HTTP Request - Submit Audio: POST https://api.symbl.ai/v1/process/audio/url
   Body: { url: {{recording_url}}, detectTopics: true } · Output: conversationId, jobId
   (poll until the job status is completed)

4. HTTP Request - Conversation API: fetch topics, action-items, and messages
   GET /v1/conversations/{id}/topics, /action-items, /messages

5. HTTP Request - Semarize: POST /v1/runs (sync)
   URL: https://api.semarize.com/v1/runs · Auth: Bearer smz_live_... · Body: { kit_code, mode: "sync", input: { transcript } }

6. Postgres - Insert Row: write the structured output to your database
   Table: call_evaluations · Columns: conversation_id, score, topics, action_items

Setup steps

1. Add a Webhook node as the workflow trigger. Configure your recording platform to POST new recording notifications to this webhook URL.

2. Add an HTTP Request node to generate a Symbl.ai access token. POST to https://api.symbl.ai/oauth2/token/generate with your App ID and App Secret. Cache the token if running multiple workflows.

3. Add an HTTP Request node to submit the recording to Symbl.ai. POST to https://api.symbl.ai/v1/process/audio/url with the recording URL, detection options, and speaker diarisation settings.

4. Add a Wait node (or a polling loop with an IF node) to check job status. GET https://api.symbl.ai/v1/job/{jobId} until status is "completed". Use exponential backoff in the loop.

5. Add HTTP Request nodes to retrieve structured data from the Conversation API - GET /v1/conversations/{id}/messages for the transcript, plus /topics, /action-items, /questions, and /entities.

6. Add a Code node (JavaScript) to assemble the Symbl.ai output into a clean transcript string. Join message text by speaker, preserving the conversation flow.

7. Add an HTTP Request node to send the transcript to Semarize. POST to https://api.semarize.com/v1/runs, add your API key as a Bearer token, set kit_code, set mode to "sync", and map the transcript into input.transcript.

8. Add a Postgres (or MySQL / HTTP Request) node to write the structured Semarize output. Use conversation_id as the primary key for upserts.

9. Activate the workflow. Monitor the first few runs to verify the full pipeline - Symbl.ai processing, result retrieval, Semarize scoring, and database writes.
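
The transcript-assembly step (the Code node above) is shown here in Python for illustration: join Symbl.ai messages into a speaker-labelled transcript, collapsing consecutive messages from the same speaker. The message shape follows the Conversation API's /messages response (a text field plus a "from" speaker object); the function name is ours.

```python
def assemble_transcript(messages):
    """Build a 'Speaker: text' transcript, merging consecutive same-speaker lines."""
    lines, last_speaker = [], None
    for msg in messages:
        speaker = (msg.get("from") or {}).get("name", "Unknown")
        text = msg.get("text", "").strip()
        if not text:
            continue
        if speaker == last_speaker:
            lines[-1] += " " + text            # same speaker keeps talking
        else:
            lines.append(f"{speaker}: {text}")
            last_speaker = speaker
    return "\n".join(lines)
```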

Watch out for: Use conversation IDs as deduplication keys to prevent reprocessing. If your recording platform fires duplicate webhooks, the first run should store the conversation ID - subsequent runs check for existence before submitting to Symbl.ai again.
Learn more about n8n automation

Make - Visual automation with branching

Recording → Symbl.ai → Make → Semarize → CRM + Slack

Fetch new recordings on a schedule, process through Symbl.ai, retrieve structured intelligence, send to Semarize for scoring, then use a Router to branch the output - alert on risk flags via Slack and write all signals to your CRM.

Example Scenario

1. Schedule - Every 30 min: triggers the scenario on an interval

2. HTTP - Auth Token: POST /oauth2/token/generate (Symbl.ai)
   Body: { type: 'application', appId, appSecret }

3. HTTP - Submit Recording: POST /v1/process/audio/url, iterating over each new recording
   Body: { url: {{item.recording_url}} } · Output: conversationId, jobId
   (wait for processing to complete)

4. HTTP - Conversation API: fetch topics, action-items, and messages
   GET /v1/conversations/{{conversationId}}/messages
   GET /v1/conversations/{{conversationId}}/topics

5. HTTP - Semarize: POST /v1/runs (sync)
   URL: https://api.semarize.com/v1/runs · Auth: Bearer smz_live_... · Body: { kit_code, mode: "sync", input: { transcript } }

6. Router - Branch on Risk Flag
   Branch 1 (risk detected): IF risk_flag.value = true → Slack - Alert Channel (#deal-alerts), message: "Risk on {{conversation_id}}, score: {{score}}"
   Branch 2 (all calls, fallthrough): Salesforce - Update Record on the Opportunity - AI Score: {{overall_score}}, Risk Flag: {{risk_flag}}, Topics: {{top_topics}}

Setup steps

1. Create a new Scenario. Add a Schedule module as the trigger, set to your desired interval (30 minutes works well for most teams).

2. Add an HTTP module to generate a Symbl.ai access token. POST to https://api.symbl.ai/oauth2/token/generate with your App ID and App Secret.

3. Add an HTTP module to fetch new recordings from your source platform. Use the platform's API to list recordings since the last run timestamp.

4. Add an Iterator module to loop through each recording. For each, add an HTTP module to submit to Symbl.ai's Async API with the recording URL and your processing config.

5. Add a Sleep module (or an HTTP polling loop) to wait for Symbl.ai processing. Then add HTTP modules to retrieve topics, messages, and action items from the Conversation API.

6. Add an HTTP module to send the assembled transcript to Semarize. POST to https://api.semarize.com/v1/runs with kit_code, mode: "sync", and input.transcript. Parse the response as JSON.

7. Add a Router module. Define Branch 1 with a filter: bricks.risk_flag.value equals true. Leave Branch 2 as a fallthrough (no filter).

8. On Branch 1, add a Slack module to alert your team when risk is detected. Map the score, risk flag, and conversation ID into the message.

9. On Branch 2, add a Salesforce module to write all brick values to the Opportunity record. Set the scenario schedule and activate.

Watch out for: Each API call counts as a Make operation. A scenario processing 20 recordings uses ~100+ operations (auth + submit + poll + retrieve + Semarize per recording). Use mode: "sync" for Semarize to avoid additional polling operations.
Learn more about Make automation

What you can build

What You Can Do With Symbl.ai Data in Semarize

Semarize unlocks structured compliance scoring, cross-session trend analysis, custom evaluation frameworks, and the ability to build your own intelligence layers on top of Symbl.ai's conversation output.

Real-Time Quality Gate for AI Agents

AI Response Evaluation & Safety Scoring

What Semarize generates

response_relevance = 0.87 · hallucination_detected = true · tone_appropriate = true · safety_violation = false

Your product team uses Symbl.ai's streaming API to power an AI voice agent. Every interaction generates a transcript - but are the AI agent's responses accurate, relevant, and safe? Pipe Symbl.ai's real-time transcripts into Semarize after each interaction. A quality gate kit scores every AI conversation for response_relevance, hallucination_detected, tone_appropriate, and safety_violation. When the AI agent tells a customer "we offer a full refund within 90 days" but your policy says 30 days, Semarize flags the hallucination with evidence. The AI team gets a daily quality report with scored, actionable signals from every conversation.

Learn more about AI Evaluation

AI Agent Quality Gate (last 24h, 3 conversations)

conv-0412 (Agent A): PASS - relevance 0.92
conv-0413 (Agent B): FAIL - relevance 0.87, hallucination detected
conv-0414 (Agent A): PASS - relevance 0.95

Hallucination detected in conv-0413: "We offer a full refund within 90 days" - policy states 30 days

AI Agent Knowledge Accuracy Audit

Grounded Verification for Automated Conversations

What Semarize generates

policy_citation_correct = false · product_info_accurate = true · hallucination_type = "fabricated_policy" · accuracy_rate = 0.91

You process AI voice agent conversations through Symbl.ai - but how do you verify the AI agent gave correct information? Run a knowledge-grounded kit against your policy documents and product database on every AI-handled interaction. Semarize checks whether the AI cited the correct return policy, quoted accurate shipping timelines, and referenced real product specifications. After auditing 10,000 AI conversations, you discover the agent fabricates a "90-day satisfaction guarantee" that doesn't exist in 3% of calls. The structured output feeds back into your AI agent's training loop to close specific hallucination patterns.

Learn more about AI Evaluation

Unified Deal Signal Summary

Signals from multiple sources feed into a unified deal view:
Symbl.ai: budget_confirmed = true
Zoom: timeline_set = Q3 2026
Teams: decision_maker = false

Deal Completeness: 67% - missing: decision_maker not identified in Teams follow-up

Conversation-Driven Product Feedback Loop

Feature-Level Sentiment & Urgency Scoring

What Semarize generates

feature_satisfaction = 32 · mention_count = 89 · churn_threats = 12 · request_urgency = "critical"

Your support team processes customer calls through Symbl.ai. You need to go beyond topics and action items to quantify product sentiment at the feature level. Run every support transcript through a product feedback kit. Semarize scores each call for feature_satisfaction (per feature mentioned), churn_signal_strength, bug_report_severity, and feature_request_urgency. A monthly product board report shows structured feedback from 500+ calls: "Dashboard loading" has a satisfaction score of 32/100, mentioned in 89 calls, with 12 explicit churn threats. Product prioritisation shifts from gut feel to conversation evidence.

Learn more about Customer Success

Product Feedback Report (500+ calls scored)

Feature | Satisfaction | Mentions | Urgency
Dashboard loading | 32/100 | 89 | Critical
Export functionality | 58/100 | 45 | High
Search filters | 71/100 | 34 | Medium
Notification system | 44/100 | 62 | High

"Dashboard loading" - satisfaction 32/100, mentioned in 89 calls, 12 explicit churn threats

Custom Conversation Analytics Platform

End-to-End Scored Data Pipeline

Vibe-coded

What Semarize generates

pipeline_stages = 4 · output_format = "typed SQL columns" · latency = "< 45s" · monthly_volume = 2,400

A data engineer vibe-codes a FastAPI service that chains Symbl.ai processing with Semarize evaluation. Audio files drop into an S3 bucket, a Lambda triggers Symbl.ai's Async API for transcription, then passes the transcript to Semarize for structured scoring. The scored output lands in Snowflake with typed columns: discovery_depth (int), budget_confirmed (bool), competitor_mentioned (varchar), sentiment_score (float). A dbt model aggregates scores by rep, team, and quarter. The BI team builds dashboards on structured, query-ready conversation data - fully typed and ready for analytics at scale.
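
The "typed columns" step of a pipeline like this can be sketched as a coercion function before the warehouse load. The field names mirror the columns described above; the defaults and coercion rules are illustrative assumptions, not a Semarize contract.

```python
def to_typed_row(conversation_id, bricks):
    """Coerce Semarize brick output into a typed row ready for a SQL insert."""
    return {
        "conversation_id": str(conversation_id),
        "discovery_depth": int(bricks.get("discovery_depth", 0)),
        "budget_confirmed": bool(bricks.get("budget_confirmed", False)),
        "competitor_mentioned": str(bricks.get("competitor_mentioned", "")),
        "sentiment_score": float(bricks.get("sentiment_score", 0.0)),
    }
```

Coercing at the pipeline boundary keeps type errors out of the warehouse, so dbt models and BI dashboards can rely on stable column types.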

Learn more about Data Science

Data Pipeline Architecture (vibe-coded): S3 bucket (audio) → Lambda (trigger) → Symbl.ai (transcript) → Semarize (structured JSON) → Snowflake (SQL columns)

Latency: < 45s · Monthly volume: 2,400 · Output: typed SQL · Stages: 4

Watch out for

Common Challenges & Gotchas

These are the issues that come up most often when teams start processing conversations through Symbl.ai at scale.

Async processing is not instant

The Async API queues files for processing. Attempting to retrieve results immediately after submission will return incomplete data. Poll the job status endpoint or use webhook callbacks to know when processing is done.

Access token expiration

Symbl.ai access tokens expire after a set period. Your integration must handle token refresh automatically - failing to do so will cause API calls to return 401 errors mid-pipeline. Cache the token and refresh before expiry.

Concurrent processing limits

Each plan tier has a limit on how many concurrent Async jobs or Streaming connections you can run. Exceeding the limit results in queued or rejected requests. For bulk backfills, implement a job queue with concurrency control.

Tracker configuration drift

Trackers need to be maintained as your business evolves. Competitor names change, product features launch, and compliance language updates. Stale Tracker configurations produce false negatives - regularly audit and update your Tracker vocabulary.

Speaker diarisation accuracy

Speaker separation quality depends on audio quality, microphone setup, and the number of speakers. Overlapping speech and poor-quality audio degrade diarisation accuracy. Validate speaker labels before using them for per-speaker analysis or attribution.

WebSocket connection management

The Streaming API uses WebSocket connections that can drop due to network issues. Implement reconnection logic with state recovery - losing a connection mid-conversation means losing real-time signals unless you have fallback processing via the Async API.

Conversation ID tracking

Every submission to Symbl.ai returns a conversation ID that you need to retrieve results. Losing or failing to store this ID means you cannot access the processed output. Use the conversation ID as your primary key and store it immediately after submission.
