Get Your Data
Dialpad - How to Get Your Conversation Data
A practical guide to getting your conversation data out of Dialpad - covering API access, per-call transcript extraction, Ai Moments, webhook-triggered flows, and how to route structured data into your downstream systems.
What you'll learn
- What conversation data you can extract from Dialpad - transcripts, Ai Moments, call recordings, and metadata
- How to access data via the Dialpad API - authentication, endpoints, and per-call transcript retrieval
- Two extraction patterns: batch polling and webhook-triggered via call_transcription events
- How to connect Dialpad data pipelines to Zapier, n8n, and Make
- Advanced use cases - agent QA scoring, contact center analytics, moment trend analysis, and custom dashboards
Data
What Data You Can Extract From Dialpad
Dialpad is a cloud communications platform with built-in Ai. Every call produces a transcript, moment detections, and rich metadata that can be extracted via API - the transcript text, speaker identification, Ai-detected key phrases and action items, call recordings, and contextual call detail records.
Common fields teams care about
API Access
How to Get Transcripts via the Dialpad API
Dialpad exposes call data and transcripts through a REST API, documented at developers.dialpad.com. The workflow is: authenticate with an API key or OAuth token, list calls to get call IDs, then fetch the transcript for each call individually.
Authenticate
Dialpad supports two authentication methods: API key and OAuth 2.0. For automation and server-to-server integrations, an API key is simplest. Generate one in the Dialpad admin console under Integrations, then pass it on every request, either as a Bearer token in the Authorization header or as an apikey query parameter.
Authorization: Bearer <your_api_key>
Content-Type: application/json
# Or as a query parameter:
GET https://dialpad.com/api/v2/calls?apikey=<your_api_key>
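As a minimal Python sketch of the Bearer header option above: the helper name dialpad_headers and the DIALPAD_API_KEY environment variable are illustrative assumptions, not part of Dialpad's API.

```python
import os


def dialpad_headers(api_key: str) -> dict[str, str]:
    """Build the request headers for the Bearer-token option."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Hypothetical variable name. Reading the key from the environment keeps it
# out of source control; the placeholder is used only when it is unset.
headers = dialpad_headers(os.environ.get("DIALPAD_API_KEY") or "<your_api_key>")
```

The same dictionary can be passed to whichever HTTP client your pipeline already uses.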
List calls to get call IDs
Use the GET /api/v2/stats/calls endpoint or the call events webhook to collect call IDs. Filter by date range with start_date and end_date, and page through results with limit and cursor. Each call object includes a call_id you will use to fetch the transcript.
GET https://dialpad.com/api/v2/stats/calls
?start_date=2025-01-01T00:00:00Z
&end_date=2025-02-01T00:00:00Z
&limit=100
&cursor=<next_page_cursor>
The response returns an array of call objects with call_id, date_started, duration, direction, and participant details. Keep paginating using the cursor until no more results are returned.
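The cursor loop described above can be sketched in Python as follows. The fetch_page callable is hypothetical (it stands in for your HTTP call), and the "items" and "cursor" response keys are assumptions to check against a real response.

```python
from typing import Callable, Iterator, Optional

# fetch_page is a hypothetical callable: it takes a cursor (None for the
# first page) and returns the decoded JSON body of one list-calls response.
FetchPage = Callable[[Optional[str]], dict]


def iter_calls(fetch_page: FetchPage) -> Iterator[dict]:
    """Yield every call object, following the cursor until it runs out."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        # "items" and "cursor" are assumed key names, not confirmed ones.
        yield from page.get("items", [])
        cursor = page.get("cursor")
        if not cursor:
            break
```

Because the function is a generator, a backfill can start fetching transcripts for the first page of calls before the last page has been listed.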
Fetch the transcript
For each call ID, request the transcript via GET /api/v2/transcripts/{call_id}. The response contains the transcript text, speaker labels, timestamps, and any detected Ai Moments (key phrases and action items).
GET https://dialpad.com/api/v2/transcripts/5678901234
// Response includes:
{
"call_id": "5678901234",
"transcript": [
{
"speaker": "Agent - Sarah M.",
"text": "Thanks for calling, how can I help?",
"timestamp": 0.5
},
...
],
"moments": [
{ "type": "action_item", "text": "Follow up on billing" },
{ "type": "keyword", "text": "competitor mentioned" }
]
}
Each entry in the transcript array includes speaker, text, and timestamp. The moments array contains Dialpad Ai-detected key phrases and action items. Reassemble into plain text by concatenating entries, or preserve the structured format for per-speaker analysis.
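A short Python sketch of both options, assuming the response shape shown in the example above. The function names are illustrative.

```python
def transcript_to_text(response: dict) -> str:
    """Join transcript entries into 'speaker: text' lines, ordered by timestamp."""
    entries = sorted(
        response.get("transcript", []),
        key=lambda entry: entry.get("timestamp", 0.0),
    )
    return "\n".join(f"{entry['speaker']}: {entry['text']}" for entry in entries)


def moments_by_type(response: dict) -> dict[str, list[str]]:
    """Group moment texts under their type, e.g. 'action_item' or 'keyword'."""
    grouped: dict[str, list[str]] = {}
    for moment in response.get("moments", []):
        grouped.setdefault(moment["type"], []).append(moment["text"])
    return grouped
```

Prefixing each line with the speaker keeps the plain-text form usable for per-speaker analysis downstream.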
Handle rate limits and transcript availability
Rate limits
Dialpad enforces a rate limit of approximately 1,200 requests/minute. When you receive a 429 response, retry with exponential backoff. For bulk operations, pace requests at ~15–20 per second and persist your pagination cursor between runs.
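One way to sketch the 429 handling in Python. The send callable is hypothetical (it wraps a single HTTP request), and the attempt count, base delay, and jitter are illustrative defaults rather than values Dialpad recommends.

```python
import random
import time
from typing import Callable, Tuple

# send is a hypothetical callable returning (status_code, body) for one request.
Send = Callable[[], Tuple[int, dict]]


def with_backoff(send: Send, max_attempts: int = 5, base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Retry a request on HTTP 429, doubling the wait after each attempt."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            # Any other status is handed back to the caller unchanged.
            return body
        if attempt < max_attempts - 1:
            # Exponential delay plus up to 10% jitter, so parallel workers
            # do not all retry at the same instant.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Passing sleep in as a parameter makes the retry schedule easy to test without waiting.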
Transcript timing
Dialpad Ai processes transcripts in near real-time during the call, but the finalized version becomes available shortly after the call ends - typically within a few minutes. For longer calls or during peak load, allow up to 15–30 minutes. Use the call_transcription webhook event to be notified when the transcript is ready.
Patterns
Key Extraction Flows
There are two primary patterns for getting transcripts out of Dialpad. The right choice depends on whether you're doing a historical backfill or need near real-time processing as calls complete.
Batch Polling (Backfill & Incremental)
Historical export or scheduled ongoing extraction
1. Define your date range. For backfills, this may be several months of historical calls. For incremental polling, use your last successful poll timestamp as the start.
2. Call GET /api/v2/stats/calls with start_date and end_date filters. Paginate through the full result set, collecting all call IDs.
3. For each call ID, fetch the transcript via GET /api/v2/transcripts/{call_id}. Pace requests to stay within the 1,200/minute rate limit.
4. Store each transcript with its call metadata (call ID, date, duration, participants, moments) in your data warehouse or object store.
5. Route stored data to your analysis pipeline: Semarize for structured evaluation, or directly to your BI tool for reporting.
6. Update your stored cursor / timestamp to the current run time for the next poll cycle.
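The poll cycle above can be sketched in Python as a single function. The list_calls, fetch_transcript, and store callables are hypothetical stand-ins for your HTTP calls and storage layer.

```python
from datetime import datetime, timezone
from typing import Callable, Iterable


def run_poll(last_poll: datetime,
             list_calls: Callable[[datetime, datetime], Iterable[dict]],
             fetch_transcript: Callable[[str], dict],
             store: Callable[[dict], None],
             now: Callable[[], datetime] = lambda: datetime.now(timezone.utc)) -> datetime:
    """Run one poll cycle and return the timestamp to persist for the next one."""
    window_end = now()
    for call in list_calls(last_poll, window_end):
        transcript = fetch_transcript(call["call_id"])
        # Keep the transcript next to its call metadata so later joins stay simple.
        store({"call": call, "transcript": transcript})
    # Return the end of the queried window rather than the finish time, so
    # calls that complete while this run is in progress are picked up next time.
    return window_end
```

This sketch fixes the window end when the run starts, which is a slightly stricter reading of "current run time" chosen to avoid gaps between consecutive polls.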
Webhook-Triggered
Near real-time on call transcription completion
1. Register a webhook endpoint in the Dialpad admin console. Subscribe to the call_transcription event type, which fires when Dialpad Ai finishes processing a call's transcript.
2. When the webhook fires, parse the event payload to extract the call_id and basic call metadata.
3. Fetch the full transcript via GET /api/v2/transcripts/{call_id} using the call ID from the event payload.
4. Route the transcript and metadata downstream: to Semarize for structured analysis, your CRM updater, or automation platform.
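A minimal Python sketch of the handler logic in the steps above, independent of any web framework. It assumes the delivery is a plain JSON body with "state" and "call_id" keys; both key names are assumptions to verify against a real payload, and fetch_transcript and route are hypothetical callables.

```python
import json
from typing import Callable, Optional


def handle_webhook(raw_body: bytes,
                   fetch_transcript: Callable[[str], dict],
                   route: Callable[[dict], None]) -> Optional[str]:
    """Process one webhook delivery; return the call ID handled, or None if skipped."""
    event = json.loads(raw_body)
    # "state" and "call_id" are assumed payload keys, not confirmed ones.
    if event.get("state") != "call_transcription":
        return None
    call_id = str(event["call_id"])
    route({"event": event, "transcript": fetch_transcript(call_id)})
    return call_id
```

Returning None for other event types lets one endpoint receive several subscriptions without processing calls whose transcripts are not ready.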
Automation
Send Dialpad Transcripts to Automation Tools
Once you can extract transcripts from Dialpad, the next step is routing them through Semarize for structured analysis and into your downstream systems. Below are end-to-end example flows - each showing the full pipeline from Dialpad trigger through Semarize evaluation to CRM, Slack, or database output.
Dialpad → Zapier → Semarize → CRM
Detect new Dialpad calls via webhook, fetch the transcript, send it to Semarize for structured analysis, then write the scored output - signals, flags, and evidence - directly to your CRM.
Setup steps
1. Create a new Zap. Choose "Webhooks by Zapier" as the trigger and select "Catch Hook". Copy the webhook URL and register it in Dialpad's admin console under Webhooks, subscribing to the call_transcription event.
2. Add a "Webhooks by Zapier" Action (Custom Request) to fetch the transcript from Dialpad. Set method to GET, URL to https://dialpad.com/api/v2/transcripts/{{call_id}}, and add your API key as a Bearer token.
3. Add a second "Webhooks by Zapier" Action. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your Semarize API key as a Bearer token. In the body, set kit_code to your Kit, mode to "sync", and map the transcript text into input.transcript.
4. Add a Formatter step to extract individual brick values from the Semarize JSON response - agent_score, compliance_flag, resolution_status, etc.
5. Add a Salesforce (or HubSpot, Sheets, etc.) Action to write the extracted scores and signals to your CRM record.
6. Test each step end-to-end, then turn on the Zap.
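For reference when filling in the second Custom Request step, the Semarize call can be described as plain data. This Python sketch uses only the fields named in the steps above (kit_code, mode, input.transcript); the helper name and the example kit code are illustrative, and any further body fields are outside what this guide states.

```python
import json


def build_semarize_request(api_key: str, kit_code: str, transcript_text: str) -> dict:
    """Describe the POST to Semarize as plain data that any HTTP client can send."""
    return {
        "method": "POST",
        "url": "https://api.semarize.com/v1/runs",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "kit_code": kit_code,
            "mode": "sync",
            "input": {"transcript": transcript_text},
        }),
    }


# "qa_kit" is a placeholder kit code for illustration only.
request = build_semarize_request("<semarize_api_key>", "qa_kit", "Agent: Hello")
```

Serializing the body with json.dumps avoids the quoting problems that tend to appear when transcript text is pasted into a hand-built JSON string.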
Dialpad → n8n → Semarize → Database
Poll Dialpad for new calls on a schedule, fetch transcripts, send each one to Semarize for analysis, then write the structured scores and signals to your database. n8n's native loop support handles pagination and batch processing.
Setup steps
1. Add a Cron node (Schedule Trigger in recent n8n versions) as the workflow trigger. Set the interval to your desired polling frequency (hourly works well for most teams).
2. Add an HTTP Request node to list new calls from Dialpad. Set method to GET, URL to https://dialpad.com/api/v2/stats/calls, configure Bearer auth with your API key, and set start_date to one interval ago.
3. Add a Split In Batches node (Loop Over Items in recent n8n versions) to iterate over the returned call IDs. Inside the loop, add an HTTP Request node to fetch each transcript via GET /api/v2/transcripts/{call_id}.
4. Add a Code node (JavaScript) to reassemble the transcript array into a single text string. Join each entry's text, prefixed by speaker name.
5. Add another HTTP Request node to send the transcript to Semarize. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your Semarize API key as a Bearer token. Set kit_code, mode to "sync", and map the transcript into input.transcript.
6. Add a Code node to extract the brick values from the Semarize response - agent_score, compliance_flag, resolution_status, evidence, confidence.
7. Add a Postgres (or MySQL / HTTP Request) node to write the structured output. Use call_id as the primary key for upserts.
8. Activate the workflow. Monitor the first few runs to verify Semarize responses are arriving and writing correctly.
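The extract-and-upsert steps can be sketched outside n8n as well. This Python example uses the standard library's sqlite3 in place of Postgres so it runs anywhere; the call_scores table is hypothetical, and the {"bricks": {name: {"value": ...}}} response shape is an assumption based on the bricks.escalation_flag.value path used elsewhere in this guide.

```python
import sqlite3

BRICKS = ("agent_score", "compliance_flag", "resolution_status")


def flatten_bricks(response: dict) -> dict:
    """Pull each brick's value out of the assumed response shape."""
    bricks = response.get("bricks", {})
    return {name: bricks.get(name, {}).get("value") for name in BRICKS}


def upsert_scores(conn: sqlite3.Connection, call_id: str, scores: dict) -> None:
    """Insert the row, or overwrite it when the call_id already exists."""
    conn.execute(
        """
        INSERT INTO call_scores (call_id, agent_score, compliance_flag, resolution_status)
        VALUES (:call_id, :agent_score, :compliance_flag, :resolution_status)
        ON CONFLICT(call_id) DO UPDATE SET
            agent_score = excluded.agent_score,
            compliance_flag = excluded.compliance_flag,
            resolution_status = excluded.resolution_status
        """,
        {"call_id": call_id, **scores},
    )
    conn.commit()


# Hypothetical table, created in memory for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE call_scores (call_id TEXT PRIMARY KEY, agent_score REAL, "
    "compliance_flag INTEGER, resolution_status TEXT)"
)
```

Keying the upsert on call_id means a re-run over the same window overwrites rows instead of duplicating them. Postgres accepts the same ON CONFLICT ... DO UPDATE form.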
Dialpad → Make → Semarize → CRM + Slack
Fetch new Dialpad transcripts on a schedule, send each to Semarize for structured analysis, then use a Router to branch the scored output - alert on escalation flags via Slack and write all signals to your CRM.
Setup steps
1. Create a new Scenario. Add a Schedule module as the trigger, set to your desired interval (15-60 minutes is typical for contact center data).
2. Add an HTTP module to list new calls from Dialpad. Set method to GET, URL to https://dialpad.com/api/v2/stats/calls, configure Bearer auth, and filter by start_date since the last run.
3. Add an Iterator module to loop through each call. For each, add an HTTP module to fetch the transcript via GET /api/v2/transcripts/{call_id}.
4. Add another HTTP module to send the transcript to Semarize. Set URL to https://api.semarize.com/v1/runs, add your Bearer token, and set kit_code, mode to "sync", and input.transcript from the previous step. Parse the response as JSON.
5. Add a Router module. Define Branch 1 with a filter: bricks.escalation_flag.value equals true. Leave Branch 2 as a fallthrough (no filter).
6. On Branch 1, add a Slack module to alert your team when an escalation is detected. Map the score, escalation flag, and call ID into the message.
7. On Branch 2, add a Salesforce module to write all brick values (agent_score, compliance_flag, resolution_status) to the Case or Contact record.
8. Set the scenario schedule and activate. Monitor the first few runs in Make's execution log.
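The Router logic in the steps above can be written out as a small Python function. It assumes Branch 2 runs for every call, which is how the flow is described here, and the destination names are illustrative labels.

```python
def route_call(bricks: dict) -> list[str]:
    """Return the destinations for one scored call, mirroring the two Router branches."""
    # Branch 2 has no filter, so every call is written to the CRM.
    destinations = ["crm"]
    # Branch 1 fires only when the escalation flag is exactly true.
    if bricks.get("escalation_flag", {}).get("value") is True:
        destinations.insert(0, "slack")
    return destinations
```

Checking for the boolean True explicitly means a missing brick or a string such as "false" never triggers an alert.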
What you can build
What You Can Do With Dialpad Data in Semarize
Custom QA scoring grounded against your playbook, cross-channel analytics, deep moment trend analysis, and your own tools built on structured conversation signals.
Knowledge-Grounded Disclosure Sequence Verification
Regulatory Evidence Automation
What Semarize generates
Your financial services contact centre handles regulated calls. Required disclosures must be delivered in the right order, with the right phrasing, within the required timeframe. Run every transcript through a regulatory compliance kit grounded against your compliance policy document. Semarize verifies disclosure_sequence_correct, consent_language_verbatim_match, risk_warning_delivered, and opt_out_offered - with the exact timestamp and evidence span for each. Every call generates a structured evidence package that maps directly to your regulatory filing template. Your compliance team gets weekly audit coverage of 100% of calls. Audit prep drops from 3 weeks to 3 days.
Policy & Procedure Accuracy Audit
Knowledge-Grounded Agent Verification
What Semarize generates
Run a knowledge-grounded kit against your product documentation, return policies, and troubleshooting guides on every call. Semarize checks whether the return window quoted was accurate, whether the warranty terms matched the current policy, and whether troubleshooting steps followed the approved sequence. After scoring 3,000 calls, you discover that 12% of agents cite outdated return policy terms - and the specific sections they get wrong. The cost of honouring incorrect promises drops immediately once targeted retraining addresses the exact knowledge gaps the data revealed.
Structured Coaching Signal Pipeline
Typed Conversation Data for Analytics
What Semarize generates
A data architect builds an Airflow DAG that runs every call through Semarize. Each call lands in BigQuery with typed columns: agent_id (string), empathy_score (float), resolution_quality (float), compliance_pass (bool), discovery_depth (int), escalation_risk (float). dbt models build derived tables: agent daily scorecards, weekly compliance summaries, and customer sentiment trends. The BI team builds Looker dashboards on structured conversation data that didn't exist 3 months ago - no data science team required, just an API and an orchestrator.
Custom Workforce Analytics Engine
Structured Signals Joined with WFM & CRM Data
What Semarize generates
A workforce analytics lead vibe-codes a Metabase dashboard that joins Semarize scores from every Dialpad call with WFM schedule data and CRM outcomes. The dashboard reveals that afternoon shifts have 15% lower empathy_scores than morning shifts, that agents who handle more than 40 calls/day see quality drop by 20% after call 35, and that the top 10% of agents by resolution_quality handle 30% fewer calls - but generate 2x the CSAT. Staffing models get adjusted: high-performers get premium time slots, and daily call caps prevent quality degradation.
Watch out for
Common Challenges & Gotchas
These are the issues that come up most often when teams start extracting transcripts from Dialpad at scale.
Transcripts are per-call only - no bulk endpoint
Unlike some platforms that offer bulk transcript export, Dialpad requires you to fetch transcripts one call at a time using the call_id. For historical backfills, this means building a loop that lists calls, collects IDs, and fetches each transcript individually. Plan for longer backfill times on large datasets.
Recording URLs expire
Dialpad serves call recordings via time-limited secure blob URLs. If you need the audio file, you must download it promptly after receiving the URL. Storing the URL for later retrieval won't work - the link will have expired. Build download-on-receipt into your pipeline.
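A download-on-receipt step might look like the following Python sketch, using only the standard library. The function name, the directory layout, and the .mp3 extension are assumptions; use whatever format and storage your pipeline expects.

```python
import shutil
import urllib.request
from pathlib import Path


def download_recording(url: str, dest_dir: Path, call_id: str) -> Path:
    """Stream the recording to disk right away, before the time-limited URL expires."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    # The .mp3 extension is an assumption about the recording format.
    target = dest_dir / f"{call_id}.mp3"
    partial = target.with_name(target.name + ".part")
    with urllib.request.urlopen(url, timeout=60) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    # Rename only after the full body is written, so a failed download
    # never leaves a truncated file under the final name.
    partial.replace(target)
    return target
```

Storing the returned path, rather than the original URL, gives later pipeline stages something that does not expire.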
OAuth token refresh required
If you use OAuth rather than a static API key, access tokens expire and must be refreshed periodically. Automation workflows that run on a schedule need to handle token refresh transparently, or they'll fail silently when the token expires mid-run.
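One way to make refresh transparent is a small token cache that renews shortly before expiry. In this Python sketch the refresh callable is hypothetical (it would perform the OAuth refresh request and return the new token with its lifetime), and the 60-second margin is an illustrative default.

```python
import time
from typing import Callable, Tuple

# refresh is a hypothetical callable that performs the OAuth refresh request
# and returns (access_token, lifetime_in_seconds).
Refresh = Callable[[], Tuple[str, float]]


class TokenCache:
    """Hand out a valid access token, refreshing shortly before expiry."""

    def __init__(self, refresh: Refresh, margin: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._refresh = refresh
        self._margin = margin
        self._clock = clock
        self._token = ""
        self._expires_at = float("-inf")  # Forces a refresh on first use.

    def get(self) -> str:
        # Refresh `margin` seconds early so a token does not expire mid-request.
        if self._clock() >= self._expires_at - self._margin:
            self._token, lifetime = self._refresh()
            self._expires_at = self._clock() + lifetime
        return self._token
```

Calling cache.get() before every request, instead of reading a token once at startup, is what keeps long scheduled runs from failing when the token expires partway through.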
Webhook delivery is not guaranteed exactly-once
Dialpad webhooks (including call_transcription events) may deliver the same event more than once, or miss delivery during outages. Implement idempotency checks using the call ID as a deduplication key, and run a periodic reconciliation poll to catch any missed events.
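The deduplication and reconciliation described above can be sketched in Python as one class. The in-memory set stands in for a durable store, and the process callable is a hypothetical hook for your fetch-and-route step.

```python
from typing import Callable, Iterable


class Deduplicator:
    """Process each call ID at most once, whichever path delivers it."""

    def __init__(self, process: Callable[[str], None]) -> None:
        self._process = process
        # In production this set would live in a durable store, such as a
        # database table with a unique constraint on the call ID.
        self._seen: set[str] = set()

    def handle(self, call_id: str) -> bool:
        """Process the call if it is new; return False for a duplicate."""
        if call_id in self._seen:
            return False
        self._process(call_id)
        # Mark as seen only after processing succeeds, so failures are retried.
        self._seen.add(call_id)
        return True

    def reconcile(self, polled_call_ids: Iterable[str]) -> list[str]:
        """Run IDs from a periodic poll through the same path; return those webhooks missed."""
        return [call_id for call_id in polled_call_ids if self.handle(call_id)]
```

Routing both webhook deliveries and the reconciliation poll through handle means repeated events and missed events are covered by the same check.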
Ai features require specific plan tiers
Dialpad Ai features - transcription, Ai Moments, Ai Scorecards - are not available on all plan tiers. If your account doesn't include Dialpad Ai, the transcript and moments endpoints may return empty results or be unavailable. Confirm your plan includes these features before building your extraction pipeline.
Contact center vs. business line data separation
Dialpad separates data between its UCaaS (business phone) and CCaaS (contact center) products. API calls for contact center data may use different endpoints or require separate credentials from business line calls. Make sure your integration targets the correct product line for the data you need.