Get Your Data
HubSpot - How to Get Your Conversation Data
A practical guide to getting your conversation data from HubSpot - covering the HubSpot API, call recording access, Calling SDK integrations, and how to route structured data into downstream systems.
What you'll learn
- What conversation data you can extract from HubSpot - call recordings, call metadata, engagement data, and CRM context
- How to access data via the HubSpot API - private apps, OAuth, and key endpoints
- Three extraction patterns: CRM engagement backfill, incremental API polling, and workflow-triggered webhooks
- How to connect HubSpot data pipelines to Zapier, n8n, and Make
- Advanced use cases - custom scoring, CRM enrichment, compliance, and warehouse analytics
Data
What Data You Can Extract From HubSpot
HubSpot captures more than just the call recording. Every call engagement produces a set of structured assets that can be extracted via API - recordings, metadata, associated CRM records, and contextual information about the contact, deal, and engagement history.
Common fields teams care about
API Access
How to Get Call Data via the HubSpot API
HubSpot exposes call engagements through a REST API. The workflow is: authenticate with a Private App token, list call objects with properties, then fetch recordings and associated CRM context for each call.
Authenticate
Create a Private App in HubSpot (Settings → Integrations → Private Apps). Grant scopes: crm.objects.contacts.read, sales-email-read, e-commerce. For calls specifically: crm.objects.calls.read.
Authorization: Bearer {private_app_token}
List call engagements
Call the GET /crm/v3/objects/calls endpoint with the properties you need. Results are paginated - use the after cursor to request the next page. The list endpoint does not filter by date; to restrict results to a date range, use the Search API with a filter on the hs_timestamp property.
GET https://api.hubapi.com/crm/v3/objects/calls?properties=hs_call_title,hs_call_duration,hs_call_recording_url,hs_call_status&limit=100
Returns paginated call objects with the requested properties.
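The listing step above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names (build_list_url, http_get_json, iter_calls) are hypothetical, and the response shape (results plus paging.next.after) is assumed from how HubSpot's CRM v3 list endpoints normally page.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.hubapi.com"
CALL_PROPERTIES = [
    "hs_call_title",
    "hs_call_duration",
    "hs_call_recording_url",
    "hs_call_status",
]


def build_list_url(after=None, limit=100):
    """Build the GET /crm/v3/objects/calls URL, adding the cursor when present."""
    params = {"properties": ",".join(CALL_PROPERTIES), "limit": str(limit)}
    if after:
        params["after"] = after
    query = urllib.parse.urlencode(params, safe=",")
    return f"{BASE_URL}/crm/v3/objects/calls?{query}"


def http_get_json(url, token):
    """Send an authenticated GET and decode the JSON body."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_calls(token, get_json=http_get_json):
    """Yield every call object, following paging.next.after until it is absent."""
    after = None
    while True:
        page = get_json(build_list_url(after), token)
        yield from page.get("results", [])
        # Assumed response shape: {"results": [...], "paging": {"next": {"after": "..."}}}
        after = page.get("paging", {}).get("next", {}).get("after")
        if not after:
            break
```

Passing get_json as a parameter keeps the paging logic testable without network access; in production the default http_get_json would be used with your Private App token.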
Access call recording
The hs_call_recording_url property contains the recording URL. For transcripts: if HubSpot Conversation Intelligence is enabled, use GET /crm/v3/objects/calls/{callId}?properties=hs_call_body to get the transcript text.
GET https://api.hubapi.com/crm/v3/objects/calls/{callId}?properties=hs_call_body,hs_call_recording_url
Recordings from third-party calling providers (Aircall, RingCentral) follow the provider's storage URLs.
Handle associations and context
Associations API
Use the Associations API to link calls to contacts, companies, and deals: GET /crm/v4/objects/calls/{callId}/associations/{toObjectType}.
Recording availability
HubSpot's call recording availability depends on the calling provider and plan tier. Not all plans include call recording or transcription.
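Resolving associations for a call might look like the sketch below. The function names are hypothetical, and the response shape (a results list whose items carry toObjectId) is an assumption about the v4 Associations API rather than something stated in this guide; check it against a real response before relying on it.

```python
BASE_URL = "https://api.hubapi.com"


def association_url(call_id, to_object_type):
    """Build GET /crm/v4/objects/calls/{callId}/associations/{toObjectType}."""
    return f"{BASE_URL}/crm/v4/objects/calls/{call_id}/associations/{to_object_type}"


def associated_ids(payload):
    """Pull the associated record IDs out of one associations response."""
    # Assumed shape: {"results": [{"toObjectId": 123, "associationTypes": [...]}]}
    return [str(item["toObjectId"]) for item in payload.get("results", [])]


def resolve_context(call_id, get_json, object_types=("contacts", "companies", "deals")):
    """Return {object_type: [ids]} for one call.

    get_json is any callable that takes a URL and returns decoded JSON,
    for example an authenticated HTTP helper.
    """
    return {
        object_type: associated_ids(get_json(association_url(call_id, object_type)))
        for object_type in object_types
    }
```

This makes one request per object type per call, so a backfill of many calls multiplies quickly; that cost is worth planning for alongside the rate limits discussed later.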
Patterns
Key Extraction Flows
There are three practical patterns for getting call data out of HubSpot. The right choice depends on whether you're doing a one-off migration, running ongoing extraction, or need near real-time processing.
Backfill (CRM Engagement Export)
One-off migration of past calls
Create a Private App with the required scopes for call and association access
Search for calls by date range via the Search API (POST /crm/v3/objects/calls/search) with hs_timestamp filters
Fetch recording URLs and transcript text for each call object
Resolve associations - link each call to its contact, company, and deal via the Associations API
Send the collected data to Semarize for structured analysis
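The date-range search in the backfill steps above could be expressed with a request body like the one this sketch builds. The helper names are hypothetical; the body layout (filterGroups, BETWEEN with value and highValue, epoch-millisecond timestamps) reflects our understanding of the CRM Search API and should be verified against the current HubSpot reference.

```python
from datetime import datetime, timezone


def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(moment.timestamp() * 1000)


def build_search_body(start, end, after=None, limit=100):
    """Build the JSON body for POST /crm/v3/objects/calls/search."""
    body = {
        "filterGroups": [
            {
                "filters": [
                    {
                        "propertyName": "hs_timestamp",
                        "operator": "BETWEEN",
                        "value": str(to_epoch_ms(start)),
                        "highValue": str(to_epoch_ms(end)),
                    }
                ]
            }
        ],
        # Oldest first, so an interrupted backfill can resume from the last call seen.
        "sorts": [{"propertyName": "hs_timestamp", "direction": "ASCENDING"}],
        "properties": ["hs_timestamp", "hs_call_title", "hs_call_body", "hs_call_recording_url"],
        "limit": limit,
    }
    if after:
        body["after"] = after
    return body


january = build_search_body(
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 2, 1, tzinfo=timezone.utc),
)
```

Use timezone-aware datetimes here: a naive datetime would be interpreted in the machine's local timezone and shift the window.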
Incremental Polling
Ongoing extraction on a schedule
Schedule a job (cron, cloud function, or automation tool) to run at your desired interval
Search for calls created since your last run timestamp using the Search API with BETWEEN filters on hs_timestamp
Filter out already-processed call IDs using your deduplication store
Fetch recordings and transcripts for new calls
Route the data to Semarize for structured analysis and update your high-water mark timestamp
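The deduplication and high-water-mark steps above can be sketched as a pure function. The names are hypothetical, and the sketch assumes each call object carries hs_timestamp as an ISO 8601 string under properties, which is how HubSpot typically returns datetime properties.

```python
from datetime import datetime, timezone

# Starting point for the very first run, before any high-water mark is stored.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_hs_timestamp(value):
    """Parse an ISO 8601 timestamp such as 2024-01-02T10:00:00.000Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def select_new_calls(calls, processed_ids, high_water_mark=EPOCH):
    """Return (calls not yet processed, updated high-water mark).

    processed_ids is the deduplication store, modeled here as a set of call IDs.
    """
    fresh = [call for call in calls if call["id"] not in processed_ids]
    timestamps = [parse_hs_timestamp(c["properties"]["hs_timestamp"]) for c in calls]
    new_mark = max(timestamps + [high_water_mark])
    return fresh, new_mark
```

Persist the new mark only after the batch has been processed successfully; otherwise a failed run would skip calls on the next poll. Keeping the ID-based deduplication alongside the timestamp guards against calls that share the boundary timestamp.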
Workflow-Triggered
Near real-time on call completion
Create a HubSpot workflow triggered on "Call completed" - this fires when a call engagement is logged
Use a webhook action in the workflow to notify your endpoint with the call ID and metadata
On your endpoint, fetch the call details and recording via the HubSpot API
Process the call data via Semarize and route the structured output to your downstream systems
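A receiving endpoint for the workflow webhook might handle the payload roughly as follows. This is only a sketch: the payload key callId is a hypothetical name (the actual field depends on how you configure the webhook action), and fetch_call stands in for whatever authenticated helper you use to call GET /crm/v3/objects/calls/{callId}.

```python
import json


def handle_call_completed(raw_body, fetch_call):
    """Turn a webhook body into the fields the rest of the pipeline needs.

    raw_body: the JSON string posted by the HubSpot workflow.
    fetch_call: callable taking a call ID and returning the call object as a dict.
    """
    payload = json.loads(raw_body)
    call_id = str(payload["callId"])  # hypothetical key, adjust to your webhook payload
    call = fetch_call(call_id)
    properties = call.get("properties", {})
    return {
        "call_id": call_id,
        "recording_url": properties.get("hs_call_recording_url"),
        "transcript": properties.get("hs_call_body"),
    }
```

Keeping the handler free of HTTP framework code means it can sit behind any web server or serverless runtime, and it can be exercised with a stubbed fetch_call.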
Automation
Send HubSpot Call Data to Automation Tools
Once you can extract call data from HubSpot, the next step is routing it through Semarize for structured analysis and into your downstream systems. Below are end-to-end example flows - each showing the full pipeline from HubSpot trigger through Semarize evaluation to CRM, Slack, or database output.
HubSpot → Zapier → Semarize → CRM
Detect new HubSpot call engagements, fetch the recording, send it to Semarize for structured analysis, then write the scored output - signals, flags, and evidence - directly to your CRM.
Setup steps
Create a new Zap. Choose HubSpot as the trigger app and select "New Engagement" as the event. Filter for engagement type = CALL. Connect your HubSpot account.
Add a "Webhooks by Zapier" Action (Custom Request) to fetch the recording. Set method to GET, URL to the recording URL from the trigger, and add auth if the provider requires it.
Add a second "Webhooks by Zapier" Action. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your Semarize API key as a Bearer token. In the body, set kit_code to your Kit, mode to "sync", and map the transcript or recording into the input.
Add a Formatter step to extract individual brick values from the Semarize JSON response - overall_score, risk_flag, pain_point, etc.
Add a HubSpot Action to write the extracted scores and signals back to the Contact or Deal record.
Test each step end-to-end, then turn on the Zap.
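For reference, the Semarize request configured in the Zap above corresponds roughly to the following Python. The URL, Bearer auth, kit_code, and mode fields come from the steps in this guide; nesting the transcript under input.transcript follows the n8n flow, and the function name and example values are hypothetical.

```python
import json
import urllib.request

SEMARIZE_RUNS_URL = "https://api.semarize.com/v1/runs"


def build_semarize_request(api_key, kit_code, transcript):
    """Build (but do not send) the POST request for a synchronous Semarize run."""
    body = {
        "kit_code": kit_code,
        "mode": "sync",
        "input": {"transcript": transcript},
    }
    return urllib.request.Request(
        SEMARIZE_RUNS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder credentials and kit code, for illustration only.
request = build_semarize_request("YOUR_API_KEY", "YOUR_KIT_CODE", "Rep: Hi ...")
```

Sending it is a matter of passing the request to urllib.request.urlopen and decoding the JSON response.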
HubSpot → n8n → Semarize → Database
Poll HubSpot for new calls on a schedule, fetch recordings, send each one to Semarize for analysis, then write the structured scores and signals to your database. n8n's native loop support handles pagination and batch processing.
Setup steps
Add a Cron node as the workflow trigger. Set the interval to your desired polling frequency (hourly works well for most teams).
Add a HubSpot node to search for new calls. Use the Search API with an hs_timestamp filter that covers the time since the previous run (one polling interval back). Configure auth with your Private App token.
Add a Split In Batches node to iterate over the returned call objects. Inside the loop, add an HTTP Request node to fetch each recording via the recording URL.
Add a Code node (JavaScript) to prepare the transcript. If using Conversation Intelligence, extract hs_call_body. Otherwise, process the audio recording.
Add another HTTP Request node to send the transcript to Semarize. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your API key as a Bearer token. Set kit_code, mode to "sync", and map the transcript into input.transcript.
Add a Code node to extract the brick values from the Semarize response - overall_score, risk_flag, pain_point, evidence, confidence.
Add a Postgres (or MySQL / HTTP Request) node to write the structured output. Use call_id as the primary key for upserts.
Activate the workflow. Monitor the first few runs to verify Semarize responses are arriving and writing correctly.
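Outside n8n, the extract-and-upsert steps above could be sketched like this. It assumes the Semarize response nests each brick as bricks.<name>.value (the path used in the Make flow in this guide); the table name and column types are hypothetical, and SQLite stands in for Postgres or MySQL so the example runs anywhere.

```python
import sqlite3


def extract_bricks(response, names=("overall_score", "risk_flag", "pain_point")):
    """Flatten bricks.<name>.value into a plain dict; missing bricks become None."""
    bricks = response.get("bricks", {})
    return {name: bricks.get(name, {}).get("value") for name in names}


def upsert_call_scores(conn, call_id, values):
    """Insert or update one row keyed by call_id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS call_scores ("
        "call_id TEXT PRIMARY KEY, overall_score REAL, "
        "risk_flag INTEGER, pain_point TEXT)"
    )
    conn.execute(
        "INSERT INTO call_scores (call_id, overall_score, risk_flag, pain_point) "
        "VALUES (?, ?, ?, ?) "
        "ON CONFLICT(call_id) DO UPDATE SET "
        "overall_score = excluded.overall_score, "
        "risk_flag = excluded.risk_flag, "
        "pain_point = excluded.pain_point",
        (call_id, values["overall_score"], values["risk_flag"], values["pain_point"]),
    )
    conn.commit()
```

Because call_id is the primary key, re-running the workflow over the same call overwrites the earlier row instead of duplicating it, which is what makes the polling pattern safe to retry.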
HubSpot → Make → Semarize → CRM + Slack
Receive HubSpot webhook notifications on call completion, fetch the recording, send it to Semarize for structured analysis, then use a Router to branch the scored output - alert on risk flags via Slack and write all signals to your CRM.
Setup steps
Create a new Scenario. Add a Webhooks module as the trigger - this will receive the HubSpot workflow webhook payload on call completion.
In HubSpot, create a workflow triggered on "Call completed". Add a webhook action pointing to your Make webhook URL.
Add an HTTP module to fetch the recording. Set method to GET, URL to the recording URL from the webhook payload, and add auth headers if the provider requires it.
Add another HTTP module to send the recording/transcript to Semarize. Set URL to https://api.semarize.com/v1/runs, add your Bearer token, and set kit_code, mode to "sync", and input from the previous step. Parse the response as JSON.
Add a Router module. Define Branch 1 with a filter: bricks.risk_flag.value equals true. Leave Branch 2 as a fallthrough (no filter).
On Branch 1, add a Slack module to alert your team when risk is detected. Map the score, risk flag, and call ID into the message.
On Branch 2, add a HubSpot module to write all brick values (score, risk_flag, pain_point) to the Deal record.
Set the scenario schedule and activate. Monitor the first few runs in Make's execution log.
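The Router logic in the scenario above amounts to the small decision function below. It is a sketch under the same assumption as the filter in the Router step (the risk flag lives at bricks.risk_flag.value), and the action names are hypothetical labels for the Slack and HubSpot branches.

```python
def route(result):
    """Return the list of actions to run for one Semarize result."""
    actions = []
    bricks = result.get("bricks", {})
    # Branch 1: alert only when the risk flag is explicitly true.
    if bricks.get("risk_flag", {}).get("value") is True:
        actions.append("slack_alert")
    # Branch 2: no filter, so the CRM write runs for every call.
    actions.append("crm_update")
    return actions
```

Comparing with "is True" avoids alerting on truthy strings such as "false" if a brick is ever returned as text rather than as a boolean.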
What you can build
What You Can Do With HubSpot Data in Semarize
Semarize gives you structured, typed signals you can ground against your own documents, write back to your CRM, and build custom tools on.
Playbook-Grounded Call Scoring
Sales Methodology Enforcement
What Semarize generates
Your VP of Sales has a 12-page sales playbook defining exactly how discovery calls should be run. Semarize evaluates whether the actual methodology was followed. Did the rep ask the five required discovery questions? Did they qualify budget before demoing? Did they position against competitors using approved messaging? The weekly playbook adherence report shows which reps follow the process and which are freelancing — grounded against your actual playbook document.
Grounded against: Sales Playbook v4
Budget qualification skipped - most common deviation across team
Automated CRM Field Enrichment
Pipeline Data Accuracy
What Semarize generates
Reps spend 15 minutes after every call updating HubSpot deal properties. Most don’t bother — so pipeline data rots. Semarize extracts typed deal signals from every call — budget range, decision timeline, champion name, next steps, competitors mentioned — and writes them directly to HubSpot properties via API. Every call auto-enriches the deal record with what was actually said, not what the rep remembered to log. Pipeline accuracy improves because the data comes from conversations, not manual entry.
Before call
After Semarize
Auto-populated from call transcript · 0 manual entry required
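Writing extracted signals back to a deal could look like the sketch below, which builds a PATCH request against the CRM v3 deals endpoint. The property names (budget_range, decision_timeline, champion_name) are hypothetical custom properties that would need to exist in your HubSpot portal first, and the function name is illustrative.

```python
import json
import urllib.request

BASE_URL = "https://api.hubapi.com"


def build_deal_update(token, deal_id, signals):
    """Build (but do not send) PATCH /crm/v3/objects/deals/{dealId}."""
    # Skip signals that were not detected so existing values are not blanked.
    properties = {name: value for name, value in signals.items() if value is not None}
    return urllib.request.Request(
        f"{BASE_URL}/crm/v3/objects/deals/{deal_id}",
        data=json.dumps({"properties": properties}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )
```

Dropping None values is a deliberate choice here: a call in which budget never came up should not erase a budget captured on an earlier call.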
Knowledge-Grounded Pricing & Packaging Verification
Commercial Accuracy Scoring
What Semarize generates
Your pricing page changed last quarter, but reps are still quoting old tiers on calls. Run a knowledge-grounded kit against your current rate card and discount authority matrix on every call. Semarize checks whether the pricing tier quoted was correct, whether the discount offered was within the rep’s authority level, and whether the packaging described matches the current product configuration. Finance gets a weekly commercial risk exposure report. After scoring 400 calls, the data shows 8% revenue leakage from mis-quoted pricing — caught before contracts go out, not after.
Grounded against: Battlecards Q1 2026
Rep: “They don’t support Slack integration”
Ground truth: Added Slack integration Dec 2025 (Outdated)
Rep: “Our uptime is 3x better”
Ground truth: 99.9% vs 99.5% SLA (Accurate)
Rep: “They require annual contracts”
Ground truth: Monthly billing available since Q4 (Outdated)
Structured Conversation Signal Pipeline
Typed Data for BI & Analytics
What Semarize generates
A RevOps analyst vibe-codes a pipeline that runs every call through Semarize and lands typed rows in the data warehouse: pain_category (varchar), urgency_level (float), budget_range (varchar), decision_timeline (date), competitor_mentioned (varchar), champion_identified (bool). These columns get joined with CRM deal data in dbt. For the first time, the team can answer “which pain categories convert fastest from which reps?” with a SQL query instead of listening to 50 recordings. Pipeline reviews shift from anecdotes to structured, queryable conversation evidence.
Signal coverage - Acme Corp
Combines conversation signals + CRM data · Updated after every call
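The typed columns described above map naturally onto a small record type. The sketch below assumes each signal arrives as bricks.<name>.value and that decision_timeline is an ISO date string; both are assumptions about the Semarize output rather than documented behavior, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CallSignalRow:
    """One warehouse row per call, mirroring the column types listed above."""
    call_id: str
    pain_category: Optional[str]
    urgency_level: Optional[float]
    budget_range: Optional[str]
    decision_timeline: Optional[date]
    competitor_mentioned: Optional[str]
    champion_identified: Optional[bool]


def row_from_bricks(call_id, bricks):
    """Coerce raw brick values into the typed row; missing signals become None."""
    def value(name):
        return bricks.get(name, {}).get("value")

    urgency = value("urgency_level")
    timeline = value("decision_timeline")
    return CallSignalRow(
        call_id=call_id,
        pain_category=value("pain_category"),
        urgency_level=float(urgency) if urgency is not None else None,
        budget_range=value("budget_range"),
        decision_timeline=date.fromisoformat(timeline) if timeline else None,
        competitor_mentioned=value("competitor_mentioned"),
        champion_identified=value("champion_identified"),
    )
```

Coercing types at the boundary means a malformed value fails loudly in the pipeline rather than landing as an untyped string that breaks the dbt join later.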
Watch out for
Common Challenges & Gotchas
These are the issues that come up most often when teams start extracting call data from HubSpot at scale.
Call recording depends on calling provider
HubSpot's built-in calling has recording limits. Third-party providers (Aircall, RingCentral) store recordings externally.
Transcription requires Conversation Intelligence
Full transcript text is only available if you have HubSpot's Conversation Intelligence add-on (Sales Hub Enterprise).
Recording URL authentication
Some recording URLs require authentication to download. Handle token refresh and URL expiry in your pipeline.
Association complexity
Calls can be associated with multiple contacts, companies, and deals. Resolving the right context requires additional API calls.
API rate limits
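HubSpot enforces per-app burst rate limits (on the order of 100 requests per 10 seconds for Private Apps, with the exact figure depending on your subscription tier), and the Search API has its own lower per-second limit. Implement queuing and backoff for bulk operations.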
Inconsistent recording formats
Different calling providers produce different audio formats and quality levels. Normalize before processing.
Plan tier restrictions
Call recording, transcription, and workflow webhook actions are gated by HubSpot plan tier. Verify your plan includes the features you need.
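For the rate-limit gotcha above, a simple client-side throttle is often enough for bulk jobs. The sketch below is a generic sliding-window limiter, not anything HubSpot provides; the defaults of 100 requests per 10 seconds mirror the figure quoted in this guide and should be adjusted to your account's actual limit.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Block until another request fits inside the rolling window."""

    def __init__(self, max_requests=100, window_seconds=10.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of requests still inside the window

    def acquire(self):
        """Call once before each API request."""
        while True:
            now = self._clock()
            # Forget requests that have aged out of the window.
            while self._sent and now - self._sent[0] >= self._window:
                self._sent.popleft()
            if len(self._sent) < self._max:
                self._sent.append(now)
                return
            # Window is full: wait until the oldest request expires, then re-check.
            self._sleep(self._window - (now - self._sent[0]))
```

Call limiter.acquire() immediately before every HubSpot request in a backfill loop. A throttle like this reduces 429 responses but does not replace handling them, since other integrations on the same account may share the quota.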