Semarize

Get Your Data

HubSpot - How to Get Your Conversation Data

A practical guide to getting your conversation data from HubSpot - covering the HubSpot API, call recording access, Calling SDK integrations, and how to route structured data into downstream systems.

What you'll learn

  • What conversation data you can extract from HubSpot - call recordings, call metadata, engagement data, and CRM context
  • How to access data via the HubSpot API - private apps, OAuth, and key endpoints
  • Three extraction patterns: CRM engagement export, API polling, and HubSpot workflow triggers
  • How to connect HubSpot data pipelines to Zapier, n8n, and Make
  • Advanced use cases - custom scoring, CRM enrichment, compliance, and warehouse analytics

Data

What Data You Can Extract From HubSpot

HubSpot captures more than just the call recording. Every call engagement produces a set of structured assets that can be extracted via API - recordings, metadata, associated CRM records, and contextual information about the contact, deal, and engagement history.

Common fields teams care about

Call recordings (audio files from HubSpot's built-in calling or integrated providers)
Call metadata (date, time, duration, outcome, call type, disposition)
Call notes and summaries (rep-entered notes and AI-generated summaries)
Associated records (linked contacts, companies, deals, tickets)
Engagement properties (call direction, status, recording URL)
Contact and company context (full CRM record associated with the call)
Deal stage and pipeline data (where the deal stands when the call happened)
Call transcripts (available via HubSpot's Conversation Intelligence add-on)
Activity timeline (call positioned within the full engagement history)
Custom properties (any custom fields your team tracks on call engagements)

API Access

How to Get Call Data via the HubSpot API

HubSpot exposes call engagements through a REST API. The workflow is: authenticate with a Private App token, list call objects with properties, then fetch recordings and associated CRM context for each call.

1. Authenticate

Create a Private App in HubSpot (Settings → Integrations → Private Apps). Grant scopes: crm.objects.contacts.read, sales-email-read, e-commerce. For calls specifically: crm.objects.calls.read.

Authorization: Bearer {private_app_token}
Private apps are the recommended auth method. OAuth is available for marketplace apps.

2. List call engagements

Call the GET /crm/v3/objects/calls endpoint with the properties you need. Results are paginated: pass the after cursor from each response to fetch the next page. To filter by date, use the Search API with a filter on the hs_timestamp property.

GET https://api.hubapi.com/crm/v3/objects/calls?properties=hs_call_title,hs_call_duration,hs_call_recording_url,hs_call_status&limit=100

Returns a page of call objects with the requested properties.
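
The listing step above can be sketched in Python roughly as follows. This is a minimal sketch rather than an official client: iter_calls, build_list_url, and the injectable fetch_json callable are names invented here, and the response shape (a results list plus paging.next.after) is assumed from the cursor pagination described above.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, Optional

BASE_URL = "https://api.hubapi.com/crm/v3/objects/calls"
PROPERTIES = "hs_call_title,hs_call_duration,hs_call_recording_url,hs_call_status"


def build_list_url(after: Optional[str] = None, limit: int = 100) -> str:
    """Build the list URL, adding the `after` cursor when one is known."""
    params = {"properties": PROPERTIES, "limit": str(limit)}
    if after:
        params["after"] = after
    return f"{BASE_URL}?{urllib.parse.urlencode(params)}"


def http_get_json(url: str, token: str) -> dict:
    """GET a URL with the Private App bearer token and decode the JSON body."""
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_calls(
    token: str, fetch_json: Callable[[str, str], dict] = http_get_json
) -> Iterator[dict]:
    """Yield every call object, following the `after` cursor until it runs out."""
    after = None
    while True:
        page = fetch_json(build_list_url(after), token)
        yield from page.get("results", [])
        # Assumed response shape: {"paging": {"next": {"after": "<cursor>"}}}
        after = page.get("paging", {}).get("next", {}).get("after")
        if not after:
            break
```

Passing a fake fetch_json makes the loop easy to test without touching the network; in production the default http_get_json is used.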

3. Access call recording

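The hs_call_recording_url property contains the recording URL. For transcripts: if HubSpot Conversation Intelligence is enabled, use GET /crm/v3/objects/calls/{callId}?properties=hs_call_body to get the transcript text. Because hs_call_body is also the field that holds call notes, confirm that it carries transcript text in your portal before relying on it.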

GET https://api.hubapi.com/crm/v3/objects/calls/{callId}?properties=hs_call_body,hs_call_recording_url

Recordings from third-party calling providers (Aircall, RingCentral) follow the provider's storage URLs.
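
A small sketch of how the call-detail response might be unpacked. The CallAssets class and its helper names are hypothetical, and the response shape (an id plus a properties object) is an assumption based on the request shown above.

```python
from dataclasses import dataclass
from typing import Optional

CALL_URL = "https://api.hubapi.com/crm/v3/objects/calls/{call_id}"


def call_detail_url(call_id: str) -> str:
    """URL for one call, requesting only the two properties this step needs."""
    return CALL_URL.format(call_id=call_id) + "?properties=hs_call_body,hs_call_recording_url"


@dataclass
class CallAssets:
    call_id: str
    recording_url: Optional[str]
    body_text: Optional[str]

    def preferred_input(self) -> Optional[str]:
        """Prefer text when present; otherwise fall back to the recording URL."""
        return self.body_text or self.recording_url


def extract_assets(call_object: dict) -> CallAssets:
    """Pull the recording URL and body text out of a call object (assumed shape)."""
    props = call_object.get("properties") or {}
    return CallAssets(
        call_id=str(call_object.get("id", "")),
        recording_url=props.get("hs_call_recording_url") or None,
        body_text=props.get("hs_call_body") or None,
    )
```

Empty strings are treated as missing, so a call with a blank hs_call_body falls back to its recording URL.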

4. Handle associations and context

Associations API

Use the Associations API to look up the contacts, companies, and deals linked to each call: GET /crm/v4/objects/calls/{callId}/associations/{toObjectType}.
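
As a rough illustration, resolving the context for one call could look like the sketch below. The function names and the fetch_json callable are invented for this example, and the response is assumed to carry a results list whose items include a toObjectId field.

```python
from typing import Callable, Dict, Iterable, List

ASSOCIATIONS_URL = (
    "https://api.hubapi.com/crm/v4/objects/calls/{call_id}/associations/{to_object_type}"
)


def association_url(call_id: str, to_object_type: str) -> str:
    """Build the associations URL for one call and one target object type."""
    return ASSOCIATIONS_URL.format(call_id=call_id, to_object_type=to_object_type)


def associated_ids(response: dict) -> List[str]:
    """Collect the associated record IDs from one response (assumed shape)."""
    return [
        str(item["toObjectId"])
        for item in response.get("results", [])
        if "toObjectId" in item
    ]


def resolve_context(
    call_id: str,
    fetch_json: Callable[[str], dict],
    object_types: Iterable[str] = ("contacts", "companies", "deals"),
) -> Dict[str, List[str]]:
    """Make one request per object type and return the IDs grouped by type."""
    return {
        object_type: associated_ids(fetch_json(association_url(call_id, object_type)))
        for object_type in object_types
    }
```

Each object type costs one extra request per call, which is worth keeping in mind when budgeting against rate limits.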

Recording availability

HubSpot's call recording availability depends on the calling provider and plan tier. Not all plans include call recording or transcription.

Patterns

Key Extraction Flows

There are three practical patterns for getting call data out of HubSpot. The right choice depends on whether you're doing a one-off migration, running ongoing extraction, or need near real-time processing.

Backfill (CRM Engagement Export)

One-off migration of past calls

1. Create a Private App with the required scopes for call and association access
2. Search for calls by date range via the Search API (POST /crm/v3/objects/calls/search) with hs_timestamp filters
3. Fetch recording URLs and transcript text for each call object
4. Resolve associations - link each call to its contact, company, and deal via the Associations API
5. Send the collected data to Semarize for structured analysis

Tip: Use the Search API (POST /crm/v3/objects/calls/search) with hs_timestamp filters for efficient date-range queries.
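
A sketch of the date-range search used in a backfill. The helper names are made up for illustration; the body layout (filterGroups, sorts, limit, after, and a BETWEEN filter with value and highValue in epoch milliseconds) reflects an assumption about the Search API request format. Splitting the range into windows is a design choice added here, on the assumption that a single search query returns only a capped number of total results.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Optional, Tuple


def to_epoch_ms(moment: datetime) -> str:
    """Express a timezone-aware datetime as epoch milliseconds, as a string."""
    return str(int(moment.timestamp() * 1000))


def search_body(
    start: datetime, end: datetime, after: Optional[str] = None, limit: int = 100
) -> dict:
    """Build a search request body for calls with hs_timestamp in [start, end]."""
    body = {
        "filterGroups": [{
            "filters": [{
                "propertyName": "hs_timestamp",
                "operator": "BETWEEN",
                "value": to_epoch_ms(start),
                "highValue": to_epoch_ms(end),
            }]
        }],
        "sorts": [{"propertyName": "hs_timestamp", "direction": "ASCENDING"}],
        "properties": ["hs_call_title", "hs_call_recording_url", "hs_call_body"],
        "limit": limit,
    }
    if after:
        body["after"] = after
    return body


def date_windows(
    start: datetime, end: datetime, days: int = 7
) -> Iterator[Tuple[datetime, datetime]]:
    """Split a long range into consecutive windows so each search stays small."""
    cursor = start
    while cursor < end:
        window_end = min(cursor + timedelta(days=days), end)
        yield cursor, window_end
        cursor = window_end
```

Adjacent windows share a boundary timestamp, so deduplicate by call ID when combining results.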

Incremental Polling

Ongoing extraction on a schedule

1. Schedule a job (cron, cloud function, or automation tool) to run at your desired interval
2. Search for calls created since your last run timestamp using the Search API with BETWEEN filters on hs_timestamp
3. Filter out already-processed call IDs using your deduplication store
4. Fetch recordings and transcripts for new calls
5. Route the data to Semarize for structured analysis and update your high-water mark timestamp

Tip: HubSpot's search API supports BETWEEN filters on timestamps. Track your high-water mark for efficient incremental queries.
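
The deduplication and high-water-mark bookkeeping can be sketched as a pure function. Names such as select_new_calls are hypothetical, and the sketch assumes hs_timestamp comes back as an ISO 8601 string; adjust the parser if your responses use epoch milliseconds instead.

```python
from datetime import datetime
from typing import Iterable, List, Set, Tuple


def parse_hs_timestamp(value: str) -> int:
    """Convert an ISO 8601 timestamp (assumed format) to epoch milliseconds."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return round(parsed.timestamp() * 1000)


def select_new_calls(
    calls: Iterable[dict], seen_ids: Set[str], high_water_ms: int
) -> Tuple[List[dict], int]:
    """Drop already-processed calls and compute the next high-water mark.

    Returns the unseen calls plus the newest hs_timestamp among them, or the
    previous mark if nothing new arrived.
    """
    new_calls = []
    newest = high_water_ms
    for call in calls:
        if str(call["id"]) in seen_ids:
            continue
        new_calls.append(call)
        stamp = (call.get("properties") or {}).get("hs_timestamp")
        if stamp:
            newest = max(newest, parse_hs_timestamp(stamp))
    return new_calls, newest
```

Persist the returned mark and add the call IDs to the deduplication store only after the calls have been processed successfully, so a failed run is retried rather than skipped.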

Workflow-Triggered

Near real-time on call completion

1. Create a HubSpot workflow triggered on "Call completed" - this fires when a call engagement is logged
2. Use a webhook action in the workflow to notify your endpoint with the call ID and metadata
3. On your endpoint, fetch the call details and recording via the HubSpot API
4. Process the call data via Semarize and route the structured output to your downstream systems

Note: HubSpot workflows can trigger webhooks on call completion. This gives near-real-time processing without polling. Requires Sales Hub Professional or Enterprise.
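
A sketch of the endpoint logic for the workflow-triggered pattern, written as plain functions so it can sit behind any web framework. The payload field names in CANDIDATE_ID_FIELDS are guesses, not a documented schema, so inspect a real webhook delivery and adjust them. fetch_call and process are placeholders for your own HubSpot lookup and Semarize step.

```python
import json
from typing import Callable, Optional, Tuple

# Hypothetical payload keys; check an actual webhook delivery for the real one.
CANDIDATE_ID_FIELDS = ("callId", "hs_object_id", "objectId")


def extract_call_id(raw_body: bytes) -> Optional[str]:
    """Parse the webhook body and return the call ID, or None if absent."""
    try:
        payload = json.loads(raw_body)
    except (ValueError, TypeError):
        return None
    if not isinstance(payload, dict):
        return None
    for field in CANDIDATE_ID_FIELDS:
        value = payload.get(field)
        if value not in (None, ""):
            return str(value)
    return None


def handle_webhook(
    raw_body: bytes,
    fetch_call: Callable[[str], dict],
    process: Callable[[dict], None],
) -> Tuple[int, str]:
    """Return an (HTTP status, message) pair for one webhook delivery."""
    call_id = extract_call_id(raw_body)
    if call_id is None:
        return 400, "missing call id"
    process(fetch_call(call_id))
    return 200, call_id
```

In practice it is often safer to enqueue the call ID and return quickly, then fetch and process in a background worker.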

Automation

Send HubSpot Call Data to Automation Tools

Once you can extract call data from HubSpot, the next step is routing it through Semarize for structured analysis and into your downstream systems. Below are end-to-end example flows - each showing the full pipeline from HubSpot trigger through Semarize evaluation to CRM, Slack, or database output.

Zapier - No-code automation

HubSpot → Zapier → Semarize → CRM

Detect new HubSpot call engagements, fetch the recording, send it to Semarize for structured analysis, then write the scored output - signals, flags, and evidence - directly to your CRM.

Example Zap

1. Trigger: New Engagement (Call)
   Fires when HubSpot logs a new call engagement
   App: HubSpot
   Event: New Engagement
   Filter: Type = CALL
   Output: call_id, recording_url
2. Webhooks by Zapier
   Fetch recording from URL
   Method: GET
   URL: {{hs_call_recording_url}}
   Auth: Bearer (if required)
   Result: Recording fetched
3. Webhooks by Zapier
   POST /v1/runs (sync) to Semarize
   Method: POST
   URL: https://api.semarize.com/v1/runs
   Auth: Bearer smz_live_...
   Body: { kit_code, mode: "sync", input: { transcript } }
   Result: Structured output returned
4. Formatter by Zapier
   Extract brick values from Semarize response
   Extract: bricks.overall_score.value
   Extract: bricks.risk_flag.value
   Extract: bricks.pain_point.value
5. HubSpot - Update Contact/Deal
   Write scored signals to Contact or Deal
   Object: Contact or Deal
   AI Score: {{overall_score}}
   Risk Flag: {{risk_flag}}
   Pain Point: {{pain_point}}

Setup steps

1. Create a new Zap. Choose HubSpot as the trigger app and select "New Engagement" as the event. Filter for engagement type = CALL. Connect your HubSpot account.
2. Add a "Webhooks by Zapier" Action (Custom Request) to fetch the recording. Set method to GET, URL to the recording URL from the trigger, and add auth if the provider requires it.
3. Add a second "Webhooks by Zapier" Action. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your Semarize API key as a Bearer token. In the body, set kit_code to your Kit, mode to "sync", and map the transcript or recording into the input.
4. Add a Formatter step to extract individual brick values from the Semarize JSON response - overall_score, risk_flag, pain_point, etc.
5. Add a HubSpot Action to write the extracted scores and signals back to the Contact or Deal record.
6. Test each step end-to-end, then turn on the Zap.

Watch out for: Zapier has a native HubSpot trigger for new engagements. Use it for simpler setup. Recording URLs may require additional auth to download.
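
For teams who prefer to prototype the Semarize call in code before wiring it into Zapier, the request and the brick extraction might look like this sketch. The endpoint, body fields, and bricks.<name>.value paths are taken from the example above; the function names and the exact response layout are assumptions.

```python
import json
import urllib.request
from typing import Dict, Iterable

SEMARIZE_URL = "https://api.semarize.com/v1/runs"


def build_run_request(api_key: str, kit_code: str, transcript: str) -> urllib.request.Request:
    """Build the synchronous run request: kit_code, mode "sync", input.transcript."""
    body = {"kit_code": kit_code, "mode": "sync", "input": {"transcript": transcript}}
    return urllib.request.Request(
        SEMARIZE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        method="POST",
    )


def brick_values(
    response: dict, names: Iterable[str] = ("overall_score", "risk_flag", "pain_point")
) -> Dict[str, object]:
    """Flatten bricks.<name>.value into {name: value}; missing bricks become None."""
    bricks = response.get("bricks") or {}
    return {name: (bricks.get(name) or {}).get("value") for name in names}
```

Send the request with urllib.request.urlopen(build_run_request(...)) and pass the decoded JSON to brick_values; the flattened dictionary corresponds to what the Formatter step produces.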
n8n - Self-hosted workflows

HubSpot → n8n → Semarize → Database

Poll HubSpot for new calls on a schedule, fetch recordings, send each one to Semarize for analysis, then write the structured scores and signals to your database. n8n's native loop support handles pagination and batch processing.

Example Workflow

1. Cron - Every Hour
   Triggers the workflow on schedule
   Mode: Every Hour
   Timezone: UTC
2. HubSpot - Search Calls
   POST /crm/v3/objects/calls/search
   Method: POST
   URL: https://api.hubapi.com/crm/v3/objects/calls/search
   Auth: Bearer (Private App)
   Body: { filters: [{ propertyName: hs_timestamp, operator: GTE, value: {{lastRun}} }] }

For each call:

3. HTTP Request - Fetch Recording
   Download recording from URL
   URL: {{hs_call_recording_url}}
4. Code - Prepare Transcript
   Process recording or extract transcript text
   Extract: hs_call_body or audio data
5. HTTP Request - Semarize
   POST /v1/runs (sync)
   URL: https://api.semarize.com/v1/runs
   Auth: Bearer smz_live_...
   Body: { kit_code, mode: "sync", input: { transcript } }
   Result: Scores & signals returned
6. Postgres - Insert Row
   Write structured output to database
   Table: call_evaluations
   Columns: call_id, score, risk_flag, pain_point

Setup steps

1. Add a Cron node as the workflow trigger. Set the interval to your desired polling frequency (hourly works well for most teams).
2. Add a HubSpot node to search for new calls. Use the Search API with a hs_timestamp filter set to one interval ago. Configure auth with your Private App token.
3. Add a Split In Batches node to iterate over the returned call objects. Inside the loop, add an HTTP Request node to fetch each recording via the recording URL.
4. Add a Code node (JavaScript) to prepare the transcript. If using Conversation Intelligence, extract hs_call_body. Otherwise, process the audio recording.
5. Add another HTTP Request node to send the transcript to Semarize. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your API key as a Bearer token. Set kit_code, mode to "sync", and map the transcript into input.transcript.
6. Add a Code node to extract the brick values from the Semarize response - overall_score, risk_flag, pain_point, evidence, confidence.
7. Add a Postgres (or MySQL / HTTP Request) node to write the structured output. Use call_id as the primary key for upserts.
8. Activate the workflow. Monitor the first few runs to verify Semarize responses are arriving and writing correctly.

Watch out for: n8n has a built-in HubSpot node. Use it for search and read operations to simplify auth handling.
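
The upsert keyed on call_id can be illustrated with a self-contained sketch. It uses Python's built-in sqlite3 as a stand-in so it runs anywhere; the ON CONFLICT clause has the same shape in Postgres, where the placeholders would be %s rather than ?. The table and column names follow the example workflow, and the function names are invented.

```python
import sqlite3

CREATE_SQL = """
CREATE TABLE IF NOT EXISTS call_evaluations (
    call_id    TEXT PRIMARY KEY,
    score      REAL,
    risk_flag  INTEGER,
    pain_point TEXT
)
"""

UPSERT_SQL = """
INSERT INTO call_evaluations (call_id, score, risk_flag, pain_point)
VALUES (?, ?, ?, ?)
ON CONFLICT (call_id) DO UPDATE SET
    score = excluded.score,
    risk_flag = excluded.risk_flag,
    pain_point = excluded.pain_point
"""


def upsert_evaluation(conn, call_id, score, risk_flag, pain_point):
    """Insert a row, or overwrite the existing row for the same call_id."""
    conn.execute(UPSERT_SQL, (call_id, score, int(bool(risk_flag)), pain_point))
    conn.commit()


def open_store(path=":memory:"):
    """Open the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(CREATE_SQL)
    return conn
```

Because the write is idempotent per call_id, re-running a polling interval after a partial failure does not create duplicate rows.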
Make - Visual automation with branching

HubSpot → Make → Semarize → CRM + Slack

Receive HubSpot webhook notifications on call completion, fetch the recording, send it to Semarize for structured analysis, then use a Router to branch the scored output - alert on risk flags via Slack and write all signals to your CRM.

Example Scenario

1. Webhook - Call Completed
   Receives HubSpot workflow webhook
   Source: HubSpot Workflow
   Event: Call completed
2. HTTP - Fetch Recording
   Download recording from URL
   Method: GET
   URL: {{hs_call_recording_url}}
   Auth: Bearer (if required)
3. HTTP - Semarize
   POST /v1/runs (sync)
   URL: https://api.semarize.com/v1/runs
   Auth: Bearer smz_live_...
   Body: { kit_code, mode: "sync", input: { transcript } }
   Result: Structured output
4. Router - Branch on Risk Flag
   Route by Semarize output
   Branch 1: IF risk_flag.value = true
   Branch 2: ALL (fallthrough)
5. Branch 1 - Risk detected: Slack - Alert Channel
   Notify team about flagged call
   Channel: #deal-alerts
   Message: Risk on {{call_id}}, score: {{score}}
6. Branch 2 - All calls: HubSpot - Update Deal
   Write all scored signals to Deal
   AI Score: {{overall_score}}
   Risk Flag: {{risk_flag}}
   Pain Point: {{pain_point}}

Setup steps

1. Create a new Scenario. Add a Webhooks module as the trigger - this will receive the HubSpot workflow webhook payload on call completion.
2. In HubSpot, create a workflow triggered on "Call completed". Add a webhook action pointing to your Make webhook URL.
3. Add an HTTP module to fetch the recording. Set method to GET, URL to the recording URL from the webhook payload, and add auth headers if the provider requires it.
4. Add another HTTP module to send the recording/transcript to Semarize. Set URL to https://api.semarize.com/v1/runs, add your Bearer token, and set kit_code, mode to "sync", and input from the previous step. Parse the response as JSON.
5. Add a Router module. Define Branch 1 with a filter: bricks.risk_flag.value equals true. Leave Branch 2 as a fallthrough (no filter).
6. On Branch 1, add a Slack module to alert your team when risk is detected. Map the score, risk flag, and call ID into the message.
7. On Branch 2, add a HubSpot module to write all brick values (score, risk_flag, pain_point) to the Deal record.
8. Set the scenario schedule and activate. Monitor the first few runs in Make's execution log.

Watch out for: HubSpot recording URLs may be temporary or require authentication. Download and store recordings before processing.
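
The Router logic in this scenario amounts to a small branching function. The sketch below mirrors it in Python; the action dictionaries and the HubSpot property names (ai_score, risk_flag, pain_point) are illustrative placeholders rather than real module or property identifiers.

```python
from typing import Dict, List


def _value(bricks: dict, name: str):
    """Read bricks.<name>.value, tolerating a missing brick."""
    return (bricks.get(name) or {}).get("value")


def route_outputs(call_id: str, bricks: dict) -> List[Dict[str, object]]:
    """Return the downstream actions for one scored call.

    Branch 1 (Slack alert) fires only when risk_flag is true.
    Branch 2 (CRM write) is the fallthrough and always fires.
    """
    score = _value(bricks, "overall_score")
    actions: List[Dict[str, object]] = []
    if _value(bricks, "risk_flag") is True:
        actions.append({
            "target": "slack",
            "channel": "#deal-alerts",
            "message": f"Risk on {call_id}, score: {score}",
        })
    actions.append({
        "target": "hubspot_deal",
        "properties": {
            "ai_score": score,
            "risk_flag": _value(bricks, "risk_flag"),
            "pain_point": _value(bricks, "pain_point"),
        },
    })
    return actions
```

Keeping the CRM write outside the conditional matches the fallthrough branch: every call updates the Deal, and flagged calls alert Slack as well.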

What you can build

What You Can Do With HubSpot Data in Semarize

Semarize gives you structured, typed signals you can ground against your own documents, write back to your CRM, and build custom tools on.

Playbook-Grounded Call Scoring

Sales Methodology Enforcement

What Semarize generates

playbook_adherence = 0.76
required_questions_asked = 3
methodology_followed = true
deviation = "skipped_budget"

Your VP of Sales has a 12-page sales playbook defining exactly how discovery calls should be run. Semarize evaluates whether the actual methodology was followed. Did the rep ask the five required discovery questions? Did they qualify budget before demoing? Did they position against competitors using approved messaging? The weekly playbook adherence report shows which reps follow the process and which are freelancing — grounded against your actual playbook document.

Playbook Adherence

Grounded against: Sales Playbook v4

  • Open with agenda setting: detected at 0:42
  • Identify 3+ pain points: 3 identified
  • Qualify budget range: not discussed
  • Map decision process: detected at 18:22
  • Confirm next steps with date: next step set, no date (warning)

3 of 5 required steps completed · 76% adherence

Budget qualification skipped - most common deviation across team

Automated CRM Field Enrichment

Pipeline Data Accuracy

What Semarize generates

budget_range = "$50k-75k"
decision_timeline = "Q2_2026"
champion_name = "Sarah Chen"
next_step = "technical_review"

Reps spend 15 minutes after every call updating HubSpot deal properties. Most don’t bother — so pipeline data rots. Semarize extracts typed deal signals from every call — budget range, decision timeline, champion name, next steps, competitors mentioned — and writes them directly to HubSpot properties via API. Every call auto-enriches the deal record with what was actually said, not what the rep remembered to log. Pipeline accuracy improves because the data comes from conversations, not manual entry.
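
Writing extracted signals back to a deal could be sketched as below. This assumes the CRM object update endpoint takes a PATCH request with a properties object; the internal property names passed in (for example budget_range) are hypothetical custom properties you would need to create in HubSpot first.

```python
import json
import urllib.request
from typing import Dict, Optional

DEAL_URL = "https://api.hubapi.com/crm/v3/objects/deals/{deal_id}"


def build_deal_update(
    deal_id: str, token: str, signals: Dict[str, Optional[str]]
) -> urllib.request.Request:
    """Build a PATCH request that sets only the signals that have a value."""
    # Skip None values so a call that never mentioned budget does not blank the field.
    properties = {name: value for name, value in signals.items() if value is not None}
    return urllib.request.Request(
        DEAL_URL.format(deal_id=deal_id),
        data=json.dumps({"properties": properties}).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="PATCH",
    )
```

Dropping empty signals is a deliberate choice here: it keeps earlier, human-entered values in place unless a call actually produced a replacement.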

Deal Auto-Enrichment: Acme Corp

Before call

  • Budget: -
  • Timeline: -
  • Champion: -
  • Next step: -
  • Competitor: -

After Semarize

  • Budget: $50k-75k
  • Timeline: Q2 2026
  • Champion: Sarah Chen, VP Ops
  • Next step: Technical review
  • Competitor: Competitor X

Auto-populated from call transcript · no manual entry required

Knowledge-Grounded Pricing & Packaging Verification

Commercial Accuracy Scoring

What Semarize generates

pricing_tier_correct = false
discount_within_authority = true
packaging_current = false
commercial_risk = "high"

Your pricing page changed last quarter, but reps are still quoting old tiers on calls. Run a knowledge-grounded kit against your current rate card and discount authority matrix on every call. Semarize checks whether the pricing tier quoted was correct, whether the discount offered was within the rep’s authority level, and whether the packaging described matches the current product configuration. Finance gets a weekly commercial risk exposure report. After scoring 400 calls, the data shows 8% revenue leakage from mis-quoted pricing — caught before contracts go out, not after.

Competitive Response Check

Grounded against: Battlecards Q1 2026

  • Rep: "They don’t support Slack integration"
    Ground truth: Added Slack integration Dec 2025 (Outdated)
  • Rep: "Our uptime is 3x better"
    Ground truth: 99.9% vs 99.5% SLA (Accurate)
  • Rep: "They require annual contracts"
    Ground truth: Monthly billing available since Q4 (Outdated)

1 of 3 claims accurate · 2 battlecard updates needed

Structured Conversation Signal Pipeline

Typed Data for BI & Analytics

Vibe-coded

What Semarize generates

pain_category = "operational_cost"
urgency_level = 0.78
typed_columns = 6
pipeline_query = "SQL-ready"

A RevOps analyst vibe-codes a pipeline that runs every call through Semarize and lands typed rows in the data warehouse: pain_category (varchar), urgency_level (float), budget_range (varchar), decision_timeline (date), competitor_mentioned (varchar), champion_identified (bool). These columns get joined with CRM deal data in dbt. For the first time, the team can answer “which pain categories convert fastest from which reps?” with a SQL query instead of listening to 50 recordings. Pipeline reviews shift from anecdotes to structured, queryable conversation evidence.

Deal Momentum Board (vibe-coded with Next.js)

  • Acme Corp · $85k · Stage 3 · 0.88 · ↑ trending up
  • Northwind · $120k · Stage 4 · 0.71 · → stable
  • Contoso · $45k · Stage 2 · 0.42 · stalling
  • Fabrikam · $200k · Stage 3 · 0.31 · at risk

Signal coverage - Acme Corp

Budget · Timeline · Champion · Legal · Security

Combines conversation signals + CRM data · Updated after every call

Watch out for

Common Challenges & Gotchas

These are the issues that come up most often when teams start extracting call data from HubSpot at scale.

Call recording depends on calling provider

HubSpot's built-in calling has recording limits. Third-party providers (Aircall, RingCentral) store recordings externally.

Transcription requires Conversation Intelligence

Full transcript text is only available if you have HubSpot's Conversation Intelligence add-on (Sales Hub Enterprise).

Recording URL authentication

Some recording URLs require authentication to download. Handle token refresh and URL expiry in your pipeline.

Association complexity

Calls can be associated with multiple contacts, companies, and deals. Resolving the right context requires additional API calls.

API rate limits

HubSpot enforces per-app rate limits (around 100 requests per 10 seconds for Private Apps, with the exact figure depending on your plan), and the Search API has its own, lower limit. Implement queuing for bulk operations.
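
One way to stay under a per-window limit is a client-side sliding-window limiter, sketched below. The class is illustrative and not part of any HubSpot SDK; the defaults use the 100-requests-per-10-seconds figure quoted above, which you should replace with the limit for your plan.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Block until a request slot is free within the trailing time window."""

    def __init__(
        self,
        max_requests: int = 100,
        window_seconds: float = 10.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._stamps: Deque[float] = deque()

    def _prune(self, now: float) -> None:
        """Forget requests that have aged out of the window."""
        while self._stamps and now - self._stamps[0] >= self._window:
            self._stamps.popleft()

    def acquire(self) -> None:
        """Wait (if needed) until another request is allowed, then record it."""
        self._prune(self._clock())
        while len(self._stamps) >= self._max:
            # Sleep until the oldest recorded request leaves the window.
            self._sleep(self._window - (self._clock() - self._stamps[0]))
            self._prune(self._clock())
        self._stamps.append(self._clock())
```

Call limiter.acquire() immediately before each HubSpot request. The injectable clock and sleep arguments exist so the limiter can be tested without real waiting.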

Inconsistent recording formats

Different calling providers produce different audio formats and quality levels. Normalize before processing.

Plan tier restrictions

Call recording, transcription, and workflow webhook actions are gated by HubSpot plan tier. Verify your plan includes the features you need.
