Get Your Data
Sembly.ai - How to Get Your Conversation Data
A practical guide to getting your conversation data out of Sembly.ai - covering outbound automations, webhook configuration, data filtering, Zapier integration, and how to route structured meeting data into your downstream systems.
What you'll learn
- What conversation data you can extract from Sembly.ai - transcriptions, meeting notes, tasks, and metadata
- How to configure outbound automations at sembly.ai/automations/custom/ to push data to your endpoint
- Three extraction patterns: webhook receiver, Zapier integration, and filtered team routing
- How to connect Sembly.ai data pipelines to Zapier, n8n, and Make
- Advanced use cases - decision documentation, coaching automation, meeting ROI, and knowledge graphs
Data
What Data You Can Extract From Sembly.ai
Sembly.ai captures meetings and produces three primary data types, all deliverable via outbound automations. Each automation payload includes meeting metadata alongside the content, giving you full context for downstream processing.
Data types available via outbound automations
Transcription
Speaker-attributed transcript text with timestamps. Supports 45+ languages with automatic language detection. The primary input for structured analysis.
Meeting Notes
AI-generated summaries of the meeting. Useful for quick context but not sufficient for deep structured analysis - that is the role of Semarize, the downstream analysis service used in the flows later in this guide.
Tasks
Extracted action items with assignees and due dates. Sembly identifies tasks natively, but structured scoring and categorisation require downstream processing.
Automation Access
How to Get Data via Sembly.ai Outbound Automations
Sembly.ai uses an outbound automation system rather than a traditional REST API. You configure automations that push data to your chosen destination when meetings are processed. The workflow is: create a custom automation, configure your endpoint, set filters, and Sembly delivers the data as a JSON payload via HTTP POST.
Create a custom automation
Navigate to sembly.ai/automations/custom/ in your Sembly.ai dashboard. Create a new custom automation and select the data types you want to push: Transcription, Meeting Notes, and/or Tasks.
# Sembly.ai Custom Automation Configuration
# Navigate to: sembly.ai/automations/custom/

1. Click "Create Automation"
2. Select data types:
   - Transcription (speaker-attributed text)
   - Meeting Notes (AI summaries)
   - Tasks (with assignees & due dates)
3. Configure destination endpoint
4. Set filters (optional)
Configure your webhook endpoint
Provide the URL where Sembly.ai should deliver meeting data. This can be a direct webhook receiver, a Zapier webhook URL, an n8n webhook endpoint, or any HTTP endpoint that accepts POST requests with JSON payloads.
# Example: Webhook endpoint configuration
Destination URL: https://your-server.com/webhooks/sembly
Method: POST
Content-Type: application/json

# Example: Using a Zapier catch hook
Destination URL: https://hooks.zapier.com/hooks/catch/12345/abcdef/

# Example: Using an n8n webhook
Destination URL: https://your-n8n.com/webhook/sembly-meetings
Your endpoint must accept HTTP POST requests and return a 2xx status code. Sembly.ai delivers the payload as a JSON body containing the selected data types and meeting metadata.
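As a minimal sketch of such an endpoint, the following uses only Python's standard library. It accepts a POST, checks that the body is a JSON object, and answers with a 2xx status. The handler name, the `serve` helper, and port 8080 are illustrative choices, not anything Sembly.ai prescribes.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_delivery(raw_body: bytes) -> tuple[int, dict]:
    """Validate one delivery and return (status_code, response_body)."""
    try:
        payload = json.loads(raw_body)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return 400, {"error": "body is not valid JSON"}
    if not isinstance(payload, dict):
        return 400, {"error": "expected a JSON object"}
    # Hand the payload to your own pipeline here (queue, database, etc.).
    return 200, {"status": "accepted"}


class SemblyWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read exactly the number of bytes the sender announced.
        length = int(self.headers.get("Content-Length", 0))
        status, body = handle_delivery(self.rfile.read(length))
        encoded = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(encoded)))
        self.end_headers()
        self.wfile.write(encoded)


def serve(port: int = 8080) -> None:
    """Start the receiver; the port is an arbitrary choice for local testing."""
    HTTPServer(("0.0.0.0", port), SemblyWebhookHandler).serve_forever()
```

Calling `serve()` starts the receiver. Keeping the validation in `handle_delivery` makes it easy to test without opening a socket.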
Set up filters
Sembly.ai supports advanced filtering so you only receive the meetings you care about. Filters can be combined to narrow the scope precisely.
- By team: route only a specific team's meetings. E.g., send only Sales meetings to your coaching pipeline.
- By keyword: filter on keywords in meeting titles or content. E.g., only meetings mentioning "roadmap" or "sprint".
- By meeting type: filter by meeting category. E.g., only external prospect calls, not internal standups.
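The filters themselves are configured in the Sembly.ai dashboard, but a receiver can repeat the same checks as a second line of defence. The sketch below combines a team check and a keyword check with AND semantics. The field names `team`, `title`, and `transcript` are assumptions about the payload, not documented Sembly.ai fields.

```python
def matches_filters(meeting: dict, teams: set[str], keywords: list[str]) -> bool:
    """Return True when the meeting passes both the team and keyword checks.

    An empty `teams` set or `keywords` list disables that check.
    """
    team_ok = not teams or meeting.get("team") in teams
    # Search title and transcript together, case-insensitively.
    haystack = f"{meeting.get('title', '')} {meeting.get('transcript', '')}".lower()
    keyword_ok = not keywords or any(k.lower() in haystack for k in keywords)
    return team_ok and keyword_ok
```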
Receive and process the payload
Payload structure
Sembly.ai delivers a JSON payload containing the meeting metadata and your selected data types. The payload includes the meeting ID (use as a deduplication key), title, date, participants, and the data content.
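One way to work with the payload is to map it onto a typed record as soon as it arrives. This sketch assumes top-level `metadata`, `transcription`, `notes`, and `tasks` keys and a `meeting_id` inside `metadata`. These key names are assumptions, so inspect a real delivery and adjust them before relying on this.

```python
from dataclasses import dataclass, field


@dataclass
class Meeting:
    meeting_id: str
    title: str = ""
    date: str = ""
    participants: list = field(default_factory=list)
    transcription: str = ""
    notes: str = ""
    tasks: list = field(default_factory=list)


def parse_payload(payload: dict) -> Meeting:
    """Map a delivered payload onto a Meeting, failing early on a missing ID."""
    meta = payload.get("metadata", {})
    meeting_id = meta.get("meeting_id")  # assumed key name
    if not meeting_id:
        raise ValueError("payload has no meeting ID; cannot deduplicate")
    return Meeting(
        meeting_id=str(meeting_id),
        title=meta.get("title", ""),
        date=meta.get("date", ""),
        participants=list(meta.get("participants", [])),
        transcription=payload.get("transcription", ""),
        notes=payload.get("notes", ""),
        tasks=list(payload.get("tasks", [])),
    )
```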
Processing timing
Automations fire after Sembly.ai finishes processing the meeting. This typically takes a few minutes to an hour after the meeting ends, depending on length and system load. Your receiver should handle payloads arriving at variable intervals.
Patterns
Key Extraction Flows
There are three practical patterns for getting meeting data out of Sembly.ai. All are push-based since Sembly uses outbound automations rather than a pull API. The right choice depends on your infrastructure and integration preferences.
Custom Webhook Receiver
Direct HTTP endpoint for full control
Deploy a webhook receiver endpoint on your server or cloud function (AWS Lambda, Cloudflare Worker, etc.) that accepts POST requests with JSON payloads
Configure a custom automation at sembly.ai/automations/custom/ pointing to your endpoint URL. Select Transcription as the data type
When Sembly.ai finishes processing a meeting, it POSTs the transcript and metadata to your endpoint automatically
Your receiver parses the JSON payload, extracts the transcript and meeting metadata, and routes it to your analysis pipeline
Use the meeting ID from the payload as a deduplication key to prevent processing the same meeting twice on retry deliveries
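If the receiver runs as a cloud function, the steps above might look like the following AWS Lambda handler behind an HTTP proxy integration. It is a sketch only: the `transcription` key is an assumed payload field, and the summary it returns is a placeholder for a call into your own analysis pipeline.

```python
import base64
import json


def lambda_handler(event: dict, context: object) -> dict:
    """Parse one Sembly.ai delivery and acknowledge it with a 2xx response."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}
    if not isinstance(payload, dict):
        return {"statusCode": 400, "body": json.dumps({"error": "expected an object"})}

    transcript = payload.get("transcription", "")  # assumed key name
    # Replace this placeholder with a call into your analysis pipeline.
    summary = {"received_chars": len(transcript)}
    return {"statusCode": 200, "body": json.dumps(summary)}
```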
Zapier Integration
No-code automation via native Sembly.ai trigger
Create a new Zap with Sembly.ai as the trigger app. Select the "New Meeting" trigger event and connect your Sembly.ai account
The trigger fires automatically when Sembly.ai finishes processing a meeting, delivering the transcript and metadata to Zapier
Add downstream actions: send the transcript to Semarize for structured analysis, then route the scored output to your CRM, database, or notification channel
Use Zapier's built-in filtering to process only meetings that match your criteria (e.g., specific teams or meeting types)
Filtered Team Routing
Multiple automations for different teams and pipelines
Create separate custom automations for each team or use case. E.g., one automation for Sales meetings, one for Engineering, one for Customer Success
Apply team and keyword filters on each automation so only relevant meetings reach each pipeline
Point each automation to a different endpoint or webhook URL, routing data to team-specific analysis kits in Semarize
Each pipeline can use a different Semarize kit - a coaching scorecard for Sales, a decision log for Engineering, a sentiment tracker for CS
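When several automations point at one service, a small lookup table keeps the routing explicit. In this sketch each automation is assumed to post to its own URL path, and the path decides which Semarize kit receives the transcript. The paths and kit codes are hypothetical placeholders.

```python
from typing import Optional

# Hypothetical paths and kit codes; replace them with your own.
KIT_BY_PATH = {
    "/webhooks/sembly/sales": "sales-coaching-scorecard",
    "/webhooks/sembly/engineering": "engineering-decision-log",
    "/webhooks/sembly/cs": "cs-sentiment-tracker",
}


def kit_for_path(path: str) -> Optional[str]:
    """Return the kit code for a request path, or None for unknown paths."""
    return KIT_BY_PATH.get(path.rstrip("/"))
```

Returning `None` for an unknown path lets the receiver reject the request instead of sending a transcript to the wrong kit.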
Automation
Send Sembly.ai Data to Automation Tools
Once Sembly.ai is pushing meeting data via outbound automations, the next step is routing it through Semarize for structured analysis and into your downstream systems. Below are end-to-end example flows - each showing the full pipeline from Sembly.ai trigger through Semarize evaluation to CRM, Slack, or database output.
Sembly.ai → Zapier → Semarize → CRM
Use Sembly.ai's native Zapier trigger to detect new meetings, send the transcript to Semarize for structured analysis, then write the scored output - signals, flags, and evidence - directly to your CRM.
Setup steps
Create a new Zap. Choose Sembly.ai as the trigger app and select "New Meeting Processed" as the event. Connect your Sembly.ai account.
The trigger automatically delivers the transcript and meeting metadata when Sembly.ai finishes processing. No additional API call needed to fetch the transcript — Sembly pushes it directly.
Add a "Webhooks by Zapier" Action. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your Semarize API key as a Bearer token. In the body, set kit_code to your Kit, mode to "sync", and map the transcript text into input.transcript.
Add a Formatter step to extract individual brick values from the Semarize JSON response — overall_score, risk_flag, decision_count, etc.
Add a Salesforce (or HubSpot, Sheets, etc.) Action to write the extracted scores and signals to your CRM record.
Test each step end-to-end, then turn on the Zap.
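For reference, the request that the Webhooks step sends can be sketched outside Zapier as well. The URL, Bearer authentication, and the `kit_code`, `mode`, and `input.transcript` fields come from the steps above. Nesting `transcript` under an `input` object is an assumption drawn from the dotted field name, and the function names are illustrative.

```python
import json
import urllib.request

SEMARIZE_RUNS_URL = "https://api.semarize.com/v1/runs"


def build_run_request(api_key: str, kit_code: str, transcript: str) -> urllib.request.Request:
    """Build the POST request without sending it, so it can be inspected."""
    body = {
        "kit_code": kit_code,
        "mode": "sync",
        "input": {"transcript": transcript},  # assumed nesting
    }
    return urllib.request.Request(
        SEMARIZE_RUNS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_kit(api_key: str, kit_code: str, transcript: str, timeout: float = 60.0) -> dict:
    """Send the request and return the parsed JSON response."""
    request = build_run_request(api_key, kit_code, transcript)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```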
Sembly.ai → n8n → Semarize → Database
Receive Sembly.ai meeting data via an n8n webhook node, send each transcript to Semarize for analysis, then write the structured scores and signals to your database. Each delivery starts its own workflow execution, so meetings that arrive close together are processed independently.
Setup steps
Add a Webhook node as the workflow trigger. Set the method to POST and note the generated webhook URL. This is the URL you’ll configure in Sembly.ai’s custom automation.
In Sembly.ai, create a custom automation at sembly.ai/automations/custom/. Set the destination URL to your n8n webhook URL. Select Transcription as the data type.
Add a Code node (JavaScript) to extract the transcript text and meeting metadata from the Sembly.ai payload. The payload structure includes transcription, metadata, and optionally notes and tasks.
Add an HTTP Request node to send the transcript to Semarize. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your API key as a Bearer token. Set kit_code, mode to "sync", and map the transcript into input.transcript.
Add a Code node to extract the brick values from the Semarize response — overall_score, risk_flag, decision_count, evidence, confidence.
Add a Postgres (or MySQL / HTTP Request) node to write the structured output. Use meeting_id as the primary key for upserts.
Activate the workflow. Send a test meeting through Sembly.ai to verify the full pipeline.
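The last two processing steps, extracting brick values and upserting them by meeting ID, can be sketched in Python as follows. The response shape `{"bricks": {name: {"value": ...}}}` is an assumption inferred from the `bricks.risk_flag.value` path used elsewhere in this guide. SQLite stands in for Postgres here so the example runs without a server, and the table name is hypothetical.

```python
import sqlite3


def flatten_bricks(response: dict) -> dict:
    """Reduce {"bricks": {name: {"value": v}}} to {name: v}."""
    return {name: brick.get("value") for name, brick in response.get("bricks", {}).items()}


def upsert_scores(conn: sqlite3.Connection, meeting_id: str, bricks: dict) -> None:
    """Insert the scores, or overwrite them if the meeting was already stored."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS meeting_scores (
            meeting_id TEXT PRIMARY KEY,
            overall_score REAL,
            risk_flag INTEGER,
            decision_count INTEGER
        )
        """
    )
    conn.execute(
        """
        INSERT INTO meeting_scores (meeting_id, overall_score, risk_flag, decision_count)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(meeting_id) DO UPDATE SET
            overall_score = excluded.overall_score,
            risk_flag = excluded.risk_flag,
            decision_count = excluded.decision_count
        """,
        (
            meeting_id,
            bricks.get("overall_score"),
            int(bool(bricks.get("risk_flag"))),
            bricks.get("decision_count"),
        ),
    )
    conn.commit()
```

Because the write is keyed on `meeting_id`, a repeated delivery overwrites the earlier row rather than creating a duplicate.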
Sembly.ai → Make → Semarize → CRM + Slack
Receive Sembly.ai meeting data via a Make webhook, send each transcript to Semarize for structured analysis, then use a Router to branch the scored output - alert on risk flags via Slack and write all signals to your CRM.
Setup steps
Create a new Scenario. Add a Webhooks module (Custom webhook) as the trigger. Copy the generated webhook URL.
In Sembly.ai, create a custom automation at sembly.ai/automations/custom/. Set the destination URL to your Make webhook URL. Select Transcription and any other data types you need.
Back in Make, define the data structure for the Sembly.ai payload by sending a test meeting or manually defining the fields (transcription, metadata, notes, tasks).
Add an HTTP module to send the transcript to Semarize. Set URL to https://api.semarize.com/v1/runs, add your Bearer token, and set kit_code, mode to "sync", and input.transcript from the webhook payload. Parse the response as JSON.
Add a Router module. Define Branch 1 with a filter: bricks.risk_flag.value equals true. Leave Branch 2 without a filter so it runs for every meeting, including those that also trigger Branch 1.
On Branch 1, add a Slack module to alert your team when risk is detected. Map the score, risk flag, and meeting ID into the message.
On Branch 2, add a Salesforce module to write all brick values (score, risk_flag, decision_count) to the appropriate record.
Activate the scenario. Monitor the first few runs in Make's execution log to verify payloads are arriving and processing correctly.
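The Router's behaviour can be summarised in a few lines of code: the Slack branch runs only when the risk flag is true, while the CRM branch runs for every meeting. This is a sketch of the logic, not of Make itself. The action names are placeholders, and the response shape is assumed from the `bricks.risk_flag.value` filter above.

```python
def route(response: dict) -> list[str]:
    """Return the downstream actions to run for one Semarize response."""
    bricks = response.get("bricks", {})
    at_risk = bricks.get("risk_flag", {}).get("value") is True
    actions = []
    if at_risk:
        actions.append("slack_alert")  # Branch 1: filtered on the risk flag
    actions.append("crm_update")  # Branch 2: no filter, always runs
    return actions
```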
What you can build
What You Can Do With Sembly.ai Data in Semarize
Semarize enables structured decision logging, automated coaching pipelines, meeting ROI analysis, and the ability to build your own tools on structured conversation signals.
Engineering Decision Documentation
Structured Decision Capture
What Semarize generates
Your engineering teams make hundreds of technical decisions in meetings captured by Sembly.ai. Turning those conversations into a structured record of architectural decisions, trade-off discussions, and technical debt acknowledgements requires purpose-built evaluation. Set up an outbound automation that pipes every engineering meeting transcript into Semarize. A decision documentation kit extracts decision_type (architecture/tooling/process), decision_rationale, alternatives_discussed, trade_offs_acknowledged, and owner_assigned. Over 6 months, you build a searchable decision log of 340+ technical decisions — each with the full reasoning context. When a new engineer asks “why did we choose Kafka over RabbitMQ?” the answer is structured and timestamped.
Sales Meeting Coaching Automation
Automated Rep Scoring
What Semarize generates
Your sales team uses Sembly.ai to capture prospect meetings. Transcripts and summaries capture what was said — but turning that into automated, scored coaching requires structured evaluation. Configure Sembly’s outbound automation to send every sales meeting transcript to Semarize automatically. A coaching kit scores each call for discovery_question_quality, competitive_positioning, value_articulation, and next_step_commitment. When a rep scores below 50 on competitive_positioning for 3 consecutive calls, the system auto-creates a coaching task in your project management tool. The manager gets specific evidence: “Rep didn’t address the competitor comparison raised at 14:32.” Coaching becomes automated, specific, and evidence-backed.
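The trigger described here, a score below 50 on three consecutive calls, is a short check over a rep's score history. A minimal sketch, assuming the scores are stored oldest first and the function name is your own:

```python
def needs_coaching(scores: list[float], threshold: float = 50, streak: int = 3) -> bool:
    """True when the most recent `streak` scores are all below `threshold`."""
    if len(scores) < streak:
        return False
    return all(score < threshold for score in scores[-streak:])
```

Checking only the most recent calls means the task is created when the streak completes, and a single good call resets it.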
Meeting ROI by Team and Type
Cost-per-Decision Analysis
What Semarize generates
Your CEO wants to know: which team’s meetings produce the most value per hour? Quantifying meeting ROI requires scoring every conversation for outcomes, not just recording attendance. Pipe every transcript through a meeting value kit. Semarize scores each meeting for decision_count, action_item_clarity, strategic_alignment (does the discussion connect to company OKRs?), and participant_contribution_balance. Cross-reference with compensation data. A quarterly report shows engineering reviews produce 4.1 decisions/hour at $2,100/decision, while marketing syncs produce 0.6 decisions/hour at $14,800/decision. Meeting budgets get reallocated.
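The two headline figures are simple ratios once each meeting has a decision count, a duration, and a cost. The sketch below aggregates over a team's meetings. The tuple layout and the assumption that meeting cost has already been derived from compensation data are illustrative choices.

```python
from typing import Optional


def meeting_roi(meetings: list[tuple[int, float, float]]) -> dict[str, Optional[float]]:
    """Aggregate (decision_count, duration_hours, cost) tuples for one team.

    Returns decisions per hour and cost per decision, using None where the
    denominator is zero.
    """
    decisions = sum(m[0] for m in meetings)
    hours = sum(m[1] for m in meetings)
    cost = sum(m[2] for m in meetings)
    return {
        "decisions_per_hour": decisions / hours if hours else None,
        "cost_per_decision": cost / decisions if decisions else None,
    }
```

For example, two one-hour reviews that produced 5 and 3 decisions at a combined cost of $16,800 work out to 4.0 decisions per hour and $2,100 per decision.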
Custom Meeting Knowledge Graph
Connected Organisational Memory
What Semarize generates
A knowledge management lead vibe-codes a Neo4j-backed app that ingests Semarize output from every Sembly.ai meeting. The app builds a knowledge graph: meetings are nodes, topics are edges, decisions are properties, and people are connected through shared context. Query “What has Account X discussed across all meetings in the last quarter?” and get a structured timeline of topics, decisions, and sentiment shifts — pulled from engineering, sales, and CS meetings. Account managers prepare for QBRs in 10 minutes instead of 2 hours because the organisational memory is structured and connected.
Watch out for
Common Challenges & Gotchas
These are the issues that come up most often when teams start extracting meeting data from Sembly.ai at scale.
No pull-based API for historical data
Sembly.ai uses an outbound push model only. You cannot query or poll for past meeting data. Automations are forward-looking, so configure them before you need the data — any meetings that occur before setup won’t be sent automatically.
Processing delay after meetings end
Transcripts and meeting notes aren’t available the instant a meeting ends. Sembly processes recordings asynchronously, with typical delays of minutes to an hour. Your downstream pipeline should expect and handle this latency.
Webhook endpoint reliability
Since Sembly pushes data to your endpoint, your receiver must be reliably available. If your endpoint is down when Sembly fires the automation, you may miss data. Use a queuing layer or a service like Zapier that provides built-in retry logic.
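A queuing layer can be as simple as acknowledging the delivery immediately and doing the slow work on a background worker. The sketch below uses an in-memory queue to show the shape of the idea. It does not survive a restart, so a production setup would more likely use a durable queue service, and all names here are illustrative.

```python
import logging
import queue
import threading

deliveries = queue.Queue()


def accept(payload: dict) -> int:
    """Called from the HTTP handler: enqueue and acknowledge with a 2xx."""
    deliveries.put(payload)
    return 202


def worker(process, stop: threading.Event) -> None:
    """Drain the queue until `stop` is set, logging failures instead of dying."""
    while not stop.is_set():
        try:
            payload = deliveries.get(timeout=0.1)
        except queue.Empty:
            continue
        try:
            process(payload)
        except Exception:
            logging.exception("processing failed for one delivery")
        finally:
            deliveries.task_done()
```

Run `worker` in a `threading.Thread` and pass it whatever function performs the analysis. The HTTP handler then only ever calls `accept`.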
Payload size for long meetings
Multi-hour meetings produce large transcript payloads. Some automation tools have payload size limits. Plan for large payloads by using cloud storage as an intermediary or chunking transcripts before processing.
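Chunking can be done on line boundaries so that speaker turns stay intact. This is a minimal sketch that assumes the transcript is plain text with one turn per line. The 20,000-character default is an arbitrary figure, not a limit taken from any specific tool.

```python
def chunk_transcript(transcript: str, max_chars: int = 20_000) -> list[str]:
    """Split on line boundaries so no chunk exceeds max_chars.

    A single line longer than max_chars is kept whole in its own chunk.
    """
    chunks, current, size = [], [], 0
    for line in transcript.splitlines(keepends=True):
        # Close the current chunk before it would overflow.
        if current and size + len(line) > max_chars:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

Joining the chunks back together reproduces the original transcript, so nothing is lost in the split.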
Duplicate processing protection
Without idempotency checks, a retried webhook delivery could process the same meeting twice. Use the meeting ID from the payload as a deduplication key to ensure each transcript is handled exactly once.
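One common way to implement this check is to record each meeting ID in a table with a uniqueness constraint and let the database decide whether the ID is new. The sketch below uses SQLite from the standard library. The table and function names are hypothetical.

```python
import sqlite3


def claim_meeting(conn: sqlite3.Connection, meeting_id: str) -> bool:
    """Return True the first time a meeting ID is seen, False on repeats."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_meetings (meeting_id TEXT PRIMARY KEY)"
    )
    cursor = conn.execute(
        "INSERT OR IGNORE INTO processed_meetings (meeting_id) VALUES (?)",
        (meeting_id,),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted and 0 when it was ignored.
    return cursor.rowcount == 1
```

Call `claim_meeting` before any processing and skip the delivery when it returns `False`.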
Filter configuration complexity
Sembly.ai’s filtering system (by team, keywords, conversation types) is powerful but requires careful setup. Over-broad filters mean noise; over-narrow filters mean missed meetings. Test your filter configuration with a few meetings before going live.
Speaker label accuracy with large groups
Speaker attribution can degrade in meetings with many participants, poor audio, or frequent cross-talk. Validate speaker labels before using them for per-speaker scoring or coaching analysis.