
Zoom - How to Get Your Conversation Data

A practical guide to getting your conversation data from Zoom meetings - covering the Zoom API, cloud recording transcripts, real-time webhooks, and how to route structured data into downstream systems.

What you'll learn

  • What conversation data you can extract from Zoom - meeting transcripts, recording metadata, participant details, and audio files
  • How to access data via the Zoom API - Server-to-Server OAuth, endpoints, and pagination
  • Three extraction patterns: cloud recording downloads, scheduled polling, and webhook-triggered via Zoom events
  • How to connect Zoom data pipelines to Zapier, n8n, and Make
  • Advanced use cases - custom scoring, CRM enrichment, compliance, and warehouse analytics

Data

What Data You Can Extract From Zoom

Zoom cloud recordings produce a rich set of data assets beyond just the video file. Every meeting with cloud recording enabled generates transcripts, audio files, participant logs, and metadata that can be extracted via the Zoom API.

Common fields teams care about

Meeting transcript text (auto-generated VTT transcript from cloud recordings)
Audio/video recordings (MP4 video and M4A audio files from cloud recordings)
Speaker identification (participant-level attribution in transcript)
Meeting metadata (topic, host, start/end time, duration, meeting type)
Participant list (who joined, join/leave times)
Chat messages (in-meeting chat log)
Polling and Q&A data (poll results and Q&A responses if used)
Recording metadata (file size, download URLs, processing status)
Meeting registration data (registrant information if registration was required)
Webinar-specific data (panelist details, attendee engagement for webinar meetings)

API Access

How to Get Transcripts via the Zoom API

Zoom exposes cloud recordings and transcripts through a REST API. The workflow is: authenticate with Server-to-Server OAuth, list recordings by date range, then download the transcript file for each meeting.

1

Authenticate

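Create a Server-to-Server OAuth app in the Zoom App Marketplace. Grant scopes: cloud_recording:read:admin, meeting:read:admin (exact scope names differ between Zoom's classic and granular scope sets, so confirm them in the app's Scopes tab). Generate an access token via POST https://zoom.us/oauth/token with grant_type=account_credentials and your account_id, authenticating with the app's client ID and client secret.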

Authorization: Bearer {access_token}
Server-to-Server OAuth replaces the deprecated JWT app type. Your app needs admin approval in the Zoom Marketplace before it can generate tokens.
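
A minimal sketch of the token request described above, using only the Python standard library. The function name and the placeholder credentials are illustrative assumptions; the request is built but not sent, so you can inspect it before wiring in real secrets.

```python
import base64
import urllib.parse
import urllib.request

TOKEN_URL = "https://zoom.us/oauth/token"


def build_token_request(account_id: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) a Server-to-Server OAuth token request."""
    query = urllib.parse.urlencode(
        {"grant_type": "account_credentials", "account_id": account_id}
    )
    # The client ID and secret travel as HTTP Basic credentials.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        f"{TOKEN_URL}?{query}",
        method="POST",
        headers={"Authorization": f"Basic {basic}"},
    )


# Sending it (not executed here):
#   with urllib.request.urlopen(build_token_request(...)) as resp:
#       access_token = json.load(resp)["access_token"]
```

The returned access_token is what goes into the Authorization: Bearer header on later calls.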
2

List recordings

Call the GET /v2/users/{userId}/recordings endpoint with from and to date parameters. Returns meetings with their recording files. Each meeting object contains an array of recording files (video, audio, transcript). Pagination via next_page_token.

GET https://api.zoom.us/v2/users/{userId}/recordings?from=2026-01-01&to=2026-01-31

The response returns an array of meeting objects, each containing recording_files[] with file_type, download_url, and status. Keep paginating until next_page_token is empty.
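
The pagination loop can be sketched as below. This assumes the decoded response exposes the meeting array under a "meetings" key; fetch_page is a hypothetical callable you supply that performs the HTTP GET for a given page token and returns the decoded JSON.

```python
from typing import Callable, Iterator, List, Optional


def iter_meetings(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every meeting, following next_page_token until it is empty."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page.get("meetings", [])  # "meetings" key is an assumption
        token = page.get("next_page_token") or None
        if not token:
            break


def transcript_files(meeting: dict) -> List[dict]:
    """Return the recording files of one meeting whose file_type is TRANSCRIPT."""
    return [
        f for f in meeting.get("recording_files", [])
        if f.get("file_type") == "TRANSCRIPT"
    ]
```

Keeping the HTTP call behind fetch_page makes the loop easy to test with canned pages before pointing it at the live API.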

3

Download transcript

Find the recording file where file_type is TRANSCRIPT and download it from its download_url using your access token.

GET {download_url}?access_token={token}

The response is a WebVTT file containing timestamped utterances with speaker labels. Parse the VTT format to extract plain text segments by speaker for downstream analysis.
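
A rough sketch of the VTT parsing step. It assumes each cue's text starts with "Speaker Name: " when a speaker is known, which is an assumption about the transcript layout; a cue whose text happens to contain ": " without a speaker would be split incorrectly, so check the output against your own files.

```python
import re
from typing import Dict, List

# Matches a cue timing line such as "00:00:01.000 --> 00:00:03.500".
CUE_TIMING = re.compile(
    r"^((?:\d{2}:)?\d{2}:\d{2}\.\d{3}) --> ((?:\d{2}:)?\d{2}:\d{2}\.\d{3})"
)


def parse_vtt(vtt_text: str) -> List[Dict[str, str]]:
    """Turn WebVTT text into a list of {start, end, speaker, text} utterances."""
    utterances = []
    lines = vtt_text.splitlines()
    i = 0
    while i < len(lines):
        match = CUE_TIMING.match(lines[i].strip())
        if not match:
            i += 1  # header, cue number, or blank line
            continue
        i += 1
        payload = []
        # Cue text runs until the next blank line.
        while i < len(lines) and lines[i].strip():
            payload.append(lines[i].strip())
            i += 1
        text = " ".join(payload)
        speaker, sep, rest = text.partition(": ")
        if not sep:  # no speaker label on this cue
            speaker, rest = "", text
        utterances.append(
            {"start": match.group(1), "end": match.group(2),
             "speaker": speaker, "text": rest}
        )
    return utterances
```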

4

Handle recording availability and limits

Cloud recording requirement

Cloud recordings are available only if cloud recording is enabled. Local recordings do not appear in the API.

Rate limits & retention

Zoom enforces API rate limits that vary by plan. When a request is rejected with HTTP 429, honor the Retry-After header if one is present before retrying. Recordings are deleted automatically once the admin-configured retention period ends.
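
One way to sketch the retry logic. Here send is a hypothetical callable that performs one request and returns (status, headers, body). The code assumes Retry-After carries a whole number of seconds and falls back to exponential backoff for any other value; header lookup is case-sensitive in this sketch.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Retry a request while the server answers 429, waiting between tries."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts - 1:
            break  # out of attempts, do not sleep again
        retry_after = headers.get("Retry-After", "")
        # Assumption: a numeric Retry-After is in seconds; otherwise back off.
        delay = float(retry_after) if retry_after.isdigit() else float(2 ** attempt)
        sleep(delay)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```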

Patterns

Key Extraction Flows

There are three practical patterns for getting transcripts out of Zoom. The right choice depends on whether you're doing a one-off migration, running ongoing extraction, or need near real-time processing.

Backfill (Historical Export)

One-off migration of past meetings

1

Create a Server-to-Server OAuth app in the Zoom App Marketplace and generate an access token

2

List recordings by date range using GET /v2/users/{userId}/recordings with from and to parameters

3

Filter the recording files array for entries where file_type is TRANSCRIPT

4

Download and parse the VTT transcript files, extracting plain text with speaker attribution

5

Send the parsed transcript text to Semarize for structured analysis

Tip: Zoom's recording list endpoint supports from and to date filters, and a single request can cover at most one month. Process one month at a time; this also keeps payloads manageable.
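
Splitting a backfill into month-sized windows might look like this. The helper name is illustrative; each yielded pair can be formatted as YYYY-MM-DD and passed as the from and to parameters.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def month_windows(start: date, end: date) -> Iterator[Tuple[date, date]]:
    """Yield (from, to) pairs that each stay within one calendar month."""
    cursor = start
    while cursor <= end:
        # First day of the month after the cursor.
        if cursor.month == 12:
            next_month = date(cursor.year + 1, 1, 1)
        else:
            next_month = date(cursor.year, cursor.month + 1, 1)
        yield cursor, min(next_month - timedelta(days=1), end)
        cursor = next_month
```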

Incremental Polling

Ongoing extraction on a schedule

1

Schedule a job (cron, cloud function, or automation tool) to run at regular intervals

2

List recordings from the last N hours. The from parameter takes a date (YYYY-MM-DD), so query from the relevant day and drop older meetings on your side

3

Cross-reference returned meeting UUIDs against your list of already-processed meetings

4

Download transcripts for new meetings and parse the VTT format

5

Route the parsed transcript text to Semarize for structured analysis

Tip: Store processed meeting UUIDs. Zoom uses meeting UUID (not meeting ID) as the unique identifier for each meeting instance.
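
The deduplication step can be sketched as a simple set lookup. This assumes each meeting object carries its instance identifier in a "uuid" field and that you persist processed_uuids yourself (a database table, a key-value store, or a file).

```python
from typing import Iterable, List, Set


def select_unprocessed(meetings: Iterable[dict], processed_uuids: Set[str]) -> List[dict]:
    """Return meetings whose UUID has not been seen before, without duplicates."""
    fresh = []
    batch_seen = set()
    for meeting in meetings:
        uuid = meeting.get("uuid")  # "uuid" field name is an assumption
        if not uuid or uuid in processed_uuids or uuid in batch_seen:
            continue
        batch_seen.add(uuid)
        fresh.append(meeting)
    return fresh
```

Add each UUID to the persisted set only after its transcript has been processed successfully, so a failed run is retried on the next poll.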

Webhook-Triggered

Near real-time on recording completion

1

Create a Zoom webhook-only app in the Zoom App Marketplace and subscribe to the recording.completed event

2

When the webhook fires, parse the event payload to extract download URLs and meeting metadata

3

Download the transcript file from the payload's download URL using the included download token

4

Parse the VTT content and route the structured transcript to Semarize for analysis

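Note: Zoom webhooks include a download token that expires after a set period. Process the event promptly; once the token has expired, download the file with your own OAuth access token instead. The transcript can finish processing after the recording itself, so if the payload has no TRANSCRIPT file, Zoom's separate recording.transcript_completed event is the one to listen for.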
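
A sketch of pulling transcript downloads out of a webhook event. The payload layout used here (a top-level download_token, with the meeting under payload.object and its files under recording_files) is an assumption about the event shape, so verify it against a real event from your account.

```python
from typing import Dict, List


def transcript_downloads(event: dict) -> List[Dict[str, str]]:
    """List the transcript files in a recording.completed event."""
    if event.get("event") != "recording.completed":
        return []
    token = event.get("download_token", "")  # assumed top-level field
    meeting = event.get("payload", {}).get("object", {})
    return [
        {
            "meeting_uuid": meeting.get("uuid", ""),
            "url": f["download_url"],
            # Send this as the Authorization header when fetching the file.
            "authorization": f"Bearer {token}",
        }
        for f in meeting.get("recording_files", [])
        if f.get("file_type") == "TRANSCRIPT" and "download_url" in f
    ]
```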

Automation

Send Zoom Transcripts to Automation Tools

Once you can extract transcripts from Zoom, the next step is routing them through Semarize for structured analysis and into your downstream systems. Below are end-to-end example flows - each showing the full pipeline from Zoom trigger through Semarize evaluation to CRM, Slack, or database output.

Zapier - No-code automation

Zoom → Zapier → Semarize → CRM

Detect new Zoom recordings, download the transcript, send it to Semarize for structured analysis, then write the scored output - signals, flags, and evidence - directly to your CRM.

Example Zap

1. Trigger: New Recording in Zoom
   Fires when Zoom completes a cloud recording
   App: Zoom
   Event: New Recording
   Output: meeting_id, download_url
2. Webhooks by Zapier: Download transcript from Zoom
   Method: GET
   URL: {download_url}?access_token={token}
   Filter: file_type = TRANSCRIPT
   Result: Transcript returned
3. Webhooks by Zapier: POST /v1/runs (sync) to Semarize
   Method: POST
   URL: https://api.semarize.com/v1/runs
   Auth: Bearer smz_live_...
   Body: { kit_code, mode: "sync", input: { transcript } }
   Result: Structured output returned
4. Formatter by Zapier: Extract brick values from Semarize response
   Extract: bricks.overall_score.value
   Extract: bricks.risk_flag.value
   Extract: bricks.pain_point.value
5. Salesforce - Update Record: Write scored signals to Opportunity
   Object: Opportunity
   AI Score: {{overall_score}}
   Risk Flag: {{risk_flag}}
   Pain Point: {{pain_point}}

Setup steps

1

Create a new Zap. Choose Zoom as the trigger app and select "New Recording" as the event. Connect your Zoom account.

2

Add a "Webhooks by Zapier" Action (Custom Request) to download the transcript. Filter the recording files for file_type = TRANSCRIPT and GET the download_url with your access token.

3

Add a second "Webhooks by Zapier" Action. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your Semarize API key as a Bearer token. In the body, set kit_code to your Kit, mode to "sync", and map the parsed transcript text into input.transcript.

4

Add a Formatter step to extract individual brick values from the Semarize JSON response - overall_score, risk_flag, pain_point, etc.

5

Add a Salesforce (or HubSpot, Sheets, etc.) Action to write the extracted scores and signals to your CRM record.

6

Test each step end-to-end, then turn on the Zap.

Watch out for: Zapier has a built-in Zoom trigger for "New Recording." Use it instead of raw webhooks for simpler setup. Use mode: "sync" so Semarize returns results inline - Zapier doesn't natively support polling loops.
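
For reference, the Semarize call and the brick extraction that the Zap performs can be sketched in Python. The endpoint, body fields, and the bricks.<name>.value path come from the steps above; the function names are illustrative, and the response is assumed to be a JSON object with a top-level "bricks" key.

```python
import json
import urllib.request
from typing import Any, Dict, Iterable

SEMARIZE_RUNS_URL = "https://api.semarize.com/v1/runs"


def build_run_request(api_key: str, kit_code: str, transcript: str) -> urllib.request.Request:
    """Build (but do not send) a synchronous Semarize run request."""
    body = {"kit_code": kit_code, "mode": "sync", "input": {"transcript": transcript}}
    return urllib.request.Request(
        SEMARIZE_RUNS_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def extract_bricks(response: Dict[str, Any], names: Iterable[str]) -> Dict[str, Any]:
    """Pick bricks.<name>.value for each name; missing bricks map to None."""
    bricks = response.get("bricks", {})
    return {name: bricks.get(name, {}).get("value") for name in names}
```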
n8n - Self-hosted workflows

Zoom → n8n → Semarize → Database

Poll Zoom for new recordings on a schedule, download transcripts, send each one to Semarize for analysis, then write the structured scores and signals to your database. n8n's native loop support handles pagination and batch processing.

Example Workflow

1. Cron - Every Hour: Triggers the workflow on schedule
   Mode: Every Hour
   Timezone: UTC
2. HTTP Request - List Recordings: GET /v2/users/me/recordings (Zoom)
   Method: GET
   URL: https://api.zoom.us/v2/users/me/recordings
   Auth: OAuth2
   Params: { from: {{$now.minus({ hours: 1 }).toFormat('yyyy-MM-dd')}} }
3. For each meeting:
   a. HTTP Request - Download Transcript: GET download_url (Zoom)
      Filter: file_type = TRANSCRIPT
      URL: {{$json.download_url}}?access_token={{token}}
   b. Code - Parse VTT: Extract plain text with speaker labels
      Parse: VTT → structured utterances
   c. HTTP Request - Semarize: POST /v1/runs (sync)
      URL: https://api.semarize.com/v1/runs
      Auth: Bearer smz_live_...
      Body: { kit_code, mode: "sync", input: { transcript } }
      Result: Scores & signals returned
   d. Postgres - Insert Row: Write structured output to database
      Table: call_evaluations
      Columns: meeting_uuid, score, risk_flag, pain_point

Setup steps

1

Add a Cron node as the workflow trigger. Set the interval to your desired polling frequency (hourly works well for most teams).

2

Add an HTTP Request node to list new recordings from Zoom. Set method to GET, URL to https://api.zoom.us/v2/users/me/recordings, configure OAuth2 auth, and set the from parameter to the date (YYYY-MM-DD) one interval ago.

3

Add a Split In Batches node to iterate over the returned meetings. Inside the loop, filter recording files for file_type = TRANSCRIPT.

4

Add an HTTP Request node to download each VTT transcript file using the download_url from the recording file object.

5

Add a Code node (JavaScript) to parse the VTT format into structured utterances. Extract speaker labels and text segments.

6

Add another HTTP Request node to send the parsed transcript to Semarize. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your API key as a Bearer token. Set kit_code, mode to "sync", and map the transcript into input.transcript.

7

Add a Code node to extract the brick values from the Semarize response - overall_score, risk_flag, pain_point, evidence, confidence.

8

Add a Postgres (or MySQL / HTTP Request) node to write the structured output. Use meeting UUID as the primary key for upserts. Activate the workflow and monitor the first few runs.

Watch out for: Zoom access tokens expire after 1 hour. Use n8n's OAuth2 credential type to handle automatic refresh. Use meeting UUIDs as deduplication keys to prevent reprocessing.
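
The upsert keyed on meeting UUID can be sketched as follows. SQLite stands in for Postgres here so the example is self-contained; the table and column names come from the workflow above, while the column types are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS call_evaluations (
    meeting_uuid TEXT PRIMARY KEY,
    score REAL,
    risk_flag INTEGER,
    pain_point TEXT
)
"""

UPSERT = """
INSERT INTO call_evaluations (meeting_uuid, score, risk_flag, pain_point)
VALUES (?, ?, ?, ?)
ON CONFLICT(meeting_uuid) DO UPDATE SET
    score = excluded.score,
    risk_flag = excluded.risk_flag,
    pain_point = excluded.pain_point
"""


def upsert_evaluation(conn: sqlite3.Connection, meeting_uuid: str,
                      score: float, risk_flag: bool, pain_point: str) -> None:
    """Insert a row, or overwrite the existing row for the same meeting UUID."""
    conn.execute(SCHEMA)
    conn.execute(UPSERT, (meeting_uuid, score, int(risk_flag), pain_point))
    conn.commit()
```

Because the UUID is the primary key, reprocessing a meeting updates its row instead of creating a duplicate.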
Make - Visual automation with branching

Zoom → Make → Semarize → CRM + Slack

Receive Zoom recording webhooks, download transcripts, send each to Semarize for structured analysis, then use a Router to branch the scored output - alert on risk flags via Slack and write all signals to your CRM.

Example Scenario

1. Webhook - Recording Completed: Receives Zoom recording.completed event
   Event: recording.completed
   Payload: download URLs, meeting metadata
2. HTTP - Download Transcript: GET download_url (Zoom, TRANSCRIPT file)
   Filter: file_type = TRANSCRIPT
   URL: {{download_url}}?access_token={{token}}
3. HTTP - Semarize: POST /v1/runs (sync)
   URL: https://api.semarize.com/v1/runs
   Auth: Bearer smz_live_...
   Body: { kit_code, mode: "sync", input: { transcript } }
   Result: Structured output
4. Router - Branch on Risk Flag: Route by Semarize output
   Branch 1: IF risk_flag.value = true
   Branch 2: ALL (fallthrough)

Branch 1 - Risk detected
   Slack - Alert Channel: Notify team about flagged meeting
   Channel: #deal-alerts
   Message: Risk on {{meeting_uuid}}, score: {{score}}

Branch 2 - All meetings
   Salesforce - Update Record: Write all scored signals to Opportunity
   AI Score: {{overall_score}}
   Risk Flag: {{risk_flag}}
   Pain Point: {{pain_point}}

Setup steps

1

Create a new Scenario. Add a Webhook module as the trigger to receive Zoom's recording.completed event.

2

Configure your Zoom webhook-only app to send recording.completed events to your Make webhook URL. Handle the endpoint.url_validation verification request.

3

Add an HTTP module to download the transcript. Filter the recording files for file_type = TRANSCRIPT and GET the download_url with the included download token.

4

Add a Text Parser module or Code module to parse the VTT format into plain text with speaker labels.

5

Add another HTTP module to send the parsed transcript to Semarize. Set URL to https://api.semarize.com/v1/runs, add your Bearer token, and set kit_code, mode to "sync", and input.transcript from the previous step. Parse the response as JSON.

6

Add a Router module. Define Branch 1 with a filter: bricks.risk_flag.value equals true. Leave Branch 2 as a fallthrough (no filter).

7

On Branch 1, add a Slack module to alert your team when risk is detected. Map the score, risk flag, and meeting UUID into the message.

8

On Branch 2, add a Salesforce module to write all brick values (score, risk_flag, pain_point) to the Opportunity record. Activate the scenario.

Watch out for: Zoom sends webhook verification requests. Configure your Make webhook to handle the endpoint.url_validation event. Use mode: "sync" to avoid needing a polling loop for each run.
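
The verification reply can be sketched as below. It assumes the validation event carries payload.plainToken and expects a JSON reply containing plainToken plus encryptedToken, where encryptedToken is the hex HMAC-SHA256 of the plain token keyed with your app's secret token. Treat those field names as assumptions and check them against Zoom's webhook documentation.

```python
import hashlib
import hmac


def url_validation_response(event: dict, secret_token: str) -> dict:
    """Build the JSON body that answers an endpoint.url_validation event."""
    plain = event["payload"]["plainToken"]
    encrypted = hmac.new(
        secret_token.encode("utf-8"),
        plain.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    return {"plainToken": plain, "encryptedToken": encrypted}
```

In Make, the same value would be computed in the scenario and returned through a webhook response module.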

What you can build

What You Can Do With Zoom Data in Semarize

When raw transcripts become structured, grounded, and programmable, new possibilities open up. Here's what you can build.

Customer Research Insight Extraction

Product Intelligence

What Semarize generates

feature_request = "bulk_import"
pain_severity = 0.84
competitor_mentioned = "competitor_x"
sentiment = "frustrated"

Your product team runs 30 customer research calls per month on Zoom. Insights live in a PM’s notebook — if they remember to take notes. With Semarize, every call is scored against a product feedback kit. Feature requests, pain points, competitor mentions, and sentiment signals are extracted as structured data. The output feeds directly into your product backlog tool — each feature request becomes a typed record with severity score, customer segment, and the exact quote as evidence. Product decisions are made on aggregated signal data, not anecdotes.

Product Insights - Feb 2026 (12 calls analyzed)

Feature request: Bulk import
Severity: 0.84 · Segment: Enterprise
"...we need to import 10,000 records and there's no way to do it..."

Feature request: API webhooks
Severity: 0.71 · Segment: Mid-market
"...we'd love real-time notifications when..."

Pain point: Onboarding too slow
Severity: 0.68 · Segment: SMB
"...it took us three weeks to get the team set up..."

8 feature requests · 5 pain points · 3 competitor mentions this month

Meeting Effectiveness Scoring

Operational Intelligence

What Semarize generates

agenda_followed = true
decisions_made = 3
action_items_assigned = true
meeting_efficiency = 71

Your COO wants to fix meeting culture. You run a meeting effectiveness kit on every internal Zoom call. It scores whether an agenda was followed, how many decisions were explicitly stated, whether action items were assigned with owners, and whether the meeting stayed on track. Scores feed a weekly report showing which teams run effective meetings and which waste time. After one quarter of visibility, average meeting efficiency scores improve 30% — because what gets measured gets managed.

Meeting Effectiveness - Weekly Report

Team          Score   Change
Engineering   82      +8%
Sales         71      +3%
Marketing     54      -2%
Leadership    63      +12%

Avg decisions per meeting: 3.2 · Action items assigned: 78%

Training Session Evaluation

Enablement Intelligence

What Semarize generates

curriculum_coverage = 0.85
topic_missed = "objection_handling"
accuracy_vs_playbook = 0.91
engagement_level = "high"

Your enablement team records training sessions on Zoom. Instead of manually reviewing 2-hour recordings, they run a training evaluation kit grounded against the actual curriculum document. Semarize checks whether the trainer covered all required topics, whether roleplay exercises met the rubric, and flags any statements that contradict the official playbook. New hires get a structured skills report after each session — and trainers get feedback on what they missed.

Training Evaluation

Grounded against: Sales Curriculum v2

Product overview: 95% covered
Discovery methodology: 88% covered
Demo flow: 82% covered
Objection handling: missed
Closing techniques: 76% covered

Curriculum coverage: 85%
Playbook accuracy: 91% - 1 inaccurate claim detected

Multi-Platform Coaching Console

Sales Coaching

Vibe-coded

What Semarize generates

overall_score = 64
platform = "zoom"
weak_skill = "discovery"
calls_analyzed = 47

Your reps use both Zoom and Gong for customer calls depending on the account. A sales manager vibe-codes a coaching console that pulls Semarize scores from calls on both platforms. The console shows a unified skill view per rep — discovery, objection handling, closing — regardless of which tool recorded the call. One consistent scoring framework, one dashboard, no vendor lock-in. Coaching decisions are based on ALL conversations, not just the ones in one platform.

Coaching Console (vibe-coded with Next.js)

Rep        Calls   Zoom   Gong   Discovery   Objections   Closing
Sarah K.   47      32     15     42          78           61
James M.   38      20     18     71          65           83
Priya R.   29      29     0      55          44           58

Scores identical regardless of recording platform

Watch out for

Common Challenges & Gotchas

These are the issues that come up most often when teams start extracting transcripts from Zoom at scale.

Cloud recording must be enabled

Transcripts are only generated for cloud recordings. Local recordings don't produce API-accessible transcripts.

Server-to-Server OAuth setup

JWT apps are deprecated. Setting up S2S OAuth requires marketplace app creation and admin approval.

VTT parsing required

Transcripts come as WebVTT files that need parsing to extract plain text with speaker labels.

Recording retention limits

Cloud recordings are auto-deleted after the admin-configured retention period. Export before they expire.

API rate limits vary by plan

Rate limits differ between Pro, Business, and Enterprise plans. Implement backoff logic.

Meeting UUID vs Meeting ID

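Zoom uses two identifiers. Recurring meetings share a meeting ID but have a unique UUID per instance. Use the UUID whenever you need a specific instance, and double URL-encode any UUID that begins with / or contains // before placing it in a request path.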

Transcript quality varies

Auto-generated transcripts may have accuracy issues with accents, technical jargon, or poor audio quality. Speaker attribution can be inconsistent for dial-in participants.
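
For the meeting UUID gotcha above, a small sketch of path-safe encoding. It relies on Zoom's documented rule that a UUID beginning with / or containing // must be double URL-encoded; the helper name is illustrative.

```python
from urllib.parse import quote


def encode_meeting_uuid(uuid: str) -> str:
    """Percent-encode a meeting UUID for use in a URL path segment."""
    once = quote(uuid, safe="")
    # UUIDs that start with "/" or contain "//" need a second encoding pass.
    if uuid.startswith("/") or "//" in uuid:
        return quote(once, safe="")
    return once
```

For example, encode_meeting_uuid("/abc==") returns "%252Fabc%253D%253D", while a UUID without slashes is encoded only once.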
