Salesforce - How to Get Your Conversation Data

A practical guide to getting your conversation data from Salesforce - covering the Salesforce API, Einstein Conversation Insights, Service Cloud Voice recordings, and how to route structured data into downstream systems.

What you'll learn

  • What conversation data you can extract from Salesforce - call logs, Einstein Conversation Insights transcripts, Service Cloud Voice recordings, and activity data
  • How to access data via the Salesforce API - Connected Apps, OAuth 2.0, and SOQL queries
  • Three extraction patterns: SOQL-based export, scheduled polling, and Platform Event-driven flows
  • How to connect Salesforce data pipelines to Zapier, n8n, and Make
  • Advanced use cases - custom scoring, CRM enrichment, compliance, and warehouse analytics

Data

What Data You Can Extract From Salesforce

Salesforce stores conversation data across multiple objects - Task records for call activity, VoiceCall records for Service Cloud Voice, and ConversationEntry objects for Einstein Conversation Insights. Each source provides different levels of detail depending on your org's configuration and add-ons.

Common fields teams care about

Call activity logs (Task records with call metadata - subject, description, duration, outcome)
Einstein Conversation Insights transcripts (AI-generated transcripts from Sales Cloud calls)
Service Cloud Voice recordings (call recordings from Amazon Connect or partner telephony)
Voice call metadata (call direction, duration, queue, agent, disposition)
Activity timeline (calls positioned within the full account/opportunity timeline)
Associated records (linked contacts, accounts, opportunities, cases)
Call sentiment and topics (Einstein-detected sentiment signals and topic mentions)
Custom fields on Task/VoiceCall objects (any custom fields your team tracks)
Call coaching signals (Einstein's built-in coaching recommendations if enabled)
Omni-Channel routing data (queue assignment, routing priority, wait times)

API Access

How to Get Call Data via the Salesforce API

Salesforce exposes call data through its REST API. The workflow is: authenticate via a Connected App with OAuth 2.0, query call records using SOQL, then fetch recordings and transcripts from the relevant objects.

1

Authenticate

Create a Connected App in Salesforce Setup (Setup → App Manager → New Connected App). Enable OAuth 2.0 with the scopes api, refresh_token, and offline_access. For server-to-server integrations with no interactive user, use the OAuth 2.0 JWT bearer flow; where a user authorizes access in the browser, use the web server flow.

Authorization: Bearer {access_token}
Use https://login.salesforce.com/services/oauth2/token for production or https://test.salesforce.com/... for sandbox.
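As a rough illustration of the token exchange, the sketch below builds (but does not send) a refresh-token grant request against the token endpoint above. It assumes a refresh token was already obtained through the web server flow; the function names and the sandbox switch are illustrative, not part of any Salesforce SDK.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(client_id, client_secret, refresh_token, sandbox=False):
    """Build a refresh-token grant request for the Salesforce token endpoint.

    The request is only constructed here; pass it to urllib.request.urlopen
    (or your HTTP client of choice) to actually exchange the token.
    """
    host = "test.salesforce.com" if sandbox else "login.salesforce.com"
    body = urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }).encode("ascii")
    return Request(
        f"https://{host}/services/oauth2/token",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def auth_header(access_token):
    """Header to attach to every subsequent REST API call."""
    return {"Authorization": f"Bearer {access_token}"}
```

The JSON response to this request includes an access_token and an instance_url; later API calls go to the instance URL, not the login host.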
2

Query call records

For Task-based calls, Service Cloud Voice records, and Einstein Conversation Insights, use SOQL queries via the REST API endpoint GET /services/data/v59.0/query?q={SOQL}.

-- Task-based calls
SELECT Id, Subject, Description, CallDurationInSeconds,
       CallType, ActivityDate, WhoId, WhatId
FROM Task
WHERE TaskSubtype = 'Call'
  AND ActivityDate >= 2026-01-01

-- Service Cloud Voice
SELECT Id, CallType, CallDurationInSeconds,
       FromPhoneNumber, ToPhoneNumber, VendorCallKey
FROM VoiceCall
WHERE CreatedDate >= 2026-01-01T00:00:00Z

-- Einstein Conversation Insights
-- Query ConversationEntry objects for transcript segments

Results are paginated: each response carries a done flag and, when more records exist, a nextRecordsUrl that you request to fetch the next batch.
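The pagination loop can be sketched as follows. This is a minimal example, assuming the caller supplies a fetch_json(url) function that performs the authenticated GET and returns the parsed JSON; that helper and the function names here are hypothetical.

```python
from urllib.parse import quote

API_PREFIX = "/services/data/v59.0"


def query_url(instance_url, soql):
    """URL-encode a SOQL string into the REST query endpoint."""
    return f"{instance_url}{API_PREFIX}/query?q={quote(soql)}"


def iter_records(instance_url, soql, fetch_json):
    """Yield every record of a SOQL query, following nextRecordsUrl.

    fetch_json(url) -> dict is injected so the loop can be tested without
    network access; in production it would send the Bearer token.
    """
    url = query_url(instance_url, soql)
    while url:
        page = fetch_json(url)
        yield from page.get("records", [])
        # nextRecordsUrl is a path relative to the instance URL.
        next_path = page.get("nextRecordsUrl")
        url = f"{instance_url}{next_path}" if next_path else None
```

Injecting the fetch function keeps auth, retries, and rate limiting out of the pagination logic.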

3

Access recordings and transcripts

Service Cloud Voice recordings can be accessed via the VoiceCall content endpoint. Einstein Conversation Insights provides transcript segments through ConversationEntry objects with speaker labels. Third-party CTI recordings require following the provider's URL/API for recording access.

-- Service Cloud Voice recording
GET /services/data/v59.0/sobjects/VoiceCall/{id}/Content

-- Einstein Conversation Insights
-- Query ConversationEntry objects for transcript
-- segments with speaker labels

-- Third-party CTI recordings
-- Follow the provider's URL/API for recording access

Einstein Conversation Insights returns transcript data as ConversationEntry records, each with a speaker ID and text segment. Third-party CTI providers (e.g., Five9, RingCentral) store recordings externally - check their documentation for API access.
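Once transcript segments are in hand, they usually need to be stitched into one text before analysis. The sketch below shows one way to do that; the keys speaker, text, and start are placeholder names for whatever fields your query returns, not confirmed ConversationEntry field names.

```python
def build_transcript(entries):
    """Join transcript segments into 'Speaker: text' lines in time order.

    Each entry is a dict with hypothetical keys:
      speaker - label or ID of who spoke
      text    - the segment text
      start   - a sortable start offset or timestamp
    Consecutive segments from the same speaker are merged into one line.
    """
    turns = []
    for entry in sorted(entries, key=lambda e: e["start"]):
        text = entry["text"].strip()
        if not text:
            continue
        if turns and turns[-1][0] == entry["speaker"]:
            turns[-1] = (entry["speaker"], turns[-1][1] + " " + text)
        else:
            turns.append((entry["speaker"], text))
    return "\n".join(f"{speaker}: {text}" for speaker, text in turns)
```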

4

Handle authentication and limits

API limits

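Salesforce enforces daily API request limits (calls per rolling 24-hour period) that depend on org edition and license count, alongside separate limits on concurrent long-running requests. Check current usage with GET /services/data/v59.0/limits, which reports the maximum and remaining DailyApiRequests. Orgs with Event Monitoring can also run SELECT COUNT() FROM ApiEvent.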
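One lightweight way to watch consumption during an extraction run is the Sforce-Limit-Info header that REST responses carry, in the form api-usage=used/limit. The parser below is a small sketch; the 90% threshold is an arbitrary assumption to tune for your org.

```python
import re


def parse_api_usage(header_value):
    """Parse 'api-usage=18/5000' into (used, limit); None if absent."""
    match = re.search(r"api-usage=(\d+)/(\d+)", header_value or "")
    if not match:
        return None
    return int(match.group(1)), int(match.group(2))


def should_pause(header_value, threshold=0.9):
    """True when usage has reached the given fraction of the daily limit."""
    usage = parse_api_usage(header_value)
    if usage is None:
        return False
    used, limit = usage
    return limit > 0 and used / limit >= threshold
```

Checking this after each page of results lets a backfill job stop itself before it starves other integrations of API calls.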

Einstein requirements

Einstein Conversation Insights requires Sales Cloud Einstein or Service Cloud Einstein. Not all orgs have it enabled. Check your org's entitlements before building transcript-dependent flows.

Patterns

Key Extraction Flows

There are three practical patterns for getting call data out of Salesforce. The right choice depends on whether you're doing a one-off migration, running ongoing extraction, or need near real-time processing.

Backfill (Historical Export)

One-off migration of past call data

1

Create a Connected App with the necessary OAuth scopes (api, refresh_token, offline_access)

2

Write a SOQL query for Tasks or VoiceCalls filtered by date range

3

Execute the query with pagination - each REST response returns a batch of up to 2,000 records, so keep requesting the nextRecordsUrl until the response reports done as true

4

Fetch recordings where available via the VoiceCall content endpoint or provider API

5

Send each transcript and call metadata to Semarize for structured analysis

Tip: In the REST API, following nextRecordsUrl plays the role that queryMore plays in the SOAP API. For very large backfills, consider the Bulk API to conserve daily API calls.
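For step 2 of the backfill, it can help to split a long date range into smaller windows so that a failed run can resume from the last completed window. The sketch below generates windows and the matching Task query; the 30-day window size and the function names are assumptions.

```python
from datetime import date, timedelta


def date_windows(start, end, days=30):
    """Yield (window_start, window_end) pairs covering [start, end)."""
    current = start
    while current < end:
        upper = min(current + timedelta(days=days), end)
        yield current, upper
        current = upper


def task_backfill_soql(window_start, window_end):
    """SOQL for call Tasks in one window; date literals are unquoted."""
    return (
        "SELECT Id, Subject, Description, CallDurationInSeconds, "
        "CallType, ActivityDate, WhoId, WhatId "
        "FROM Task WHERE TaskSubtype = 'Call' "
        f"AND ActivityDate >= {window_start.isoformat()} "
        f"AND ActivityDate < {window_end.isoformat()}"
    )
```

Half-open windows (>= start, < end) avoid counting a boundary day twice.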

Incremental Polling

Ongoing extraction on a schedule

1

Schedule a job (cron, Lambda, etc.) that runs your extraction script at regular intervals

2

Query Tasks or VoiceCalls modified since the last run using SystemModstamp or LastModifiedDate

3

Filter out already-processed record IDs to avoid reprocessing

4

Fetch recordings and transcripts for new or updated records

5

Route each transcript and its metadata to Semarize for structured analysis

Tip: Use SystemModstamp or LastModifiedDate for reliable incremental queries. These fields capture both insert and update events.
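The polling steps above can be sketched as a watermark query plus a de-duplication pass. This is an illustrative outline, not a complete job: persistence of the watermark and the seen set is left out, and the function names are hypothetical.

```python
from datetime import datetime, timezone


def soql_datetime(dt):
    """Format a timezone-aware datetime as an unquoted SOQL literal."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def incremental_soql(last_run):
    """Query call Tasks modified at or after the last run's watermark."""
    return (
        "SELECT Id, Subject, SystemModstamp FROM Task "
        "WHERE TaskSubtype = 'Call' "
        f"AND SystemModstamp >= {soql_datetime(last_run)} "
        "ORDER BY SystemModstamp"
    )


def select_unprocessed(records, seen):
    """Drop records already handled; 'seen' is a set updated in place."""
    fresh = []
    for rec in records:
        key = (rec["Id"], rec["SystemModstamp"])
        if key not in seen:
            seen.add(key)
            fresh.append(rec)
    return fresh


def next_watermark(records, last_run):
    """Advance the watermark to the newest SystemModstamp returned."""
    stamps = [
        datetime.strptime(r["SystemModstamp"], "%Y-%m-%dT%H:%M:%S.%f%z")
        for r in records
    ]
    return max(stamps + [last_run])
```

Keying the seen set on the ID plus the modification stamp, instead of the ID alone, is a design choice: records that reappear at the watermark boundary are skipped, while genuinely updated records are processed again.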

Platform Event-Driven

Near real-time on record creation

1

Create a Platform Event or enable Change Data Capture (CDC) on Task or VoiceCall objects in Salesforce Setup

2

Subscribe via CometD or an Apex trigger that listens for new call records

3

When new call data is captured, push the record to your processing endpoint

4

Fetch the recording and process via Semarize for structured analysis

Note: Change Data Capture (CDC) fires real-time events when records are created or modified. This eliminates the need for polling.
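On the subscriber side, each change event has to be filtered before it is pushed to processing. The sketch below assumes the message shape delivered over CometD, where data.payload contains a ChangeEventHeader with entityName, changeType, and recordIds. It also assumes that create events carry the TaskSubtype field, which you should verify against real events from your org.

```python
def call_record_ids(message, change_types=("CREATE",)):
    """Return record IDs from a Task change event that represents a call.

    Returns an empty list for other objects, other change types, or
    Tasks whose subtype is not 'Call'.
    """
    payload = message.get("data", {}).get("payload", {})
    header = payload.get("ChangeEventHeader", {})
    if header.get("entityName") != "Task":
        return []
    if header.get("changeType") not in change_types:
        return []
    # Assumption: the create event includes TaskSubtype. Update events
    # only list changed fields, so this check would need a lookup there.
    if payload.get("TaskSubtype") != "Call":
        return []
    return list(header.get("recordIds", []))
```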

Automation

Send Salesforce Call Data to Automation Tools

Once you can extract call data from Salesforce, the next step is routing it through Semarize for structured analysis and into your downstream systems. Below are end-to-end example flows - each showing the full pipeline from Salesforce trigger through Semarize evaluation to CRM, Slack, or database output.

Zapier - No-code automation

Salesforce → Zapier → Semarize → CRM

Detect new Salesforce call Tasks, fetch the recording, send it to Semarize for structured analysis, then write the scored output - signals, flags, and evidence - back to your Salesforce Opportunity.

Example Zap

  • Trigger: New Task (Call) in Salesforce - fires when a new call Task is created
      App: Salesforce
      Event: New Record
      Object: Task (TaskSubtype = 'Call')
  • Webhooks by Zapier - Fetch recording from provider
      Method: GET
      URL: Recording provider endpoint
      Auth: Provider credentials
      Output: Recording returned
  • Webhooks by Zapier - POST /v1/runs (sync) to Semarize
      Method: POST
      URL: https://api.semarize.com/v1/runs
      Auth: Bearer smz_live_...
      Body: { kit_code, mode: "sync", input: { transcript } }
      Output: Structured output returned
  • Formatter by Zapier - Extract brick values from Semarize response
      Extract: bricks.overall_score.value
      Extract: bricks.risk_flag.value
      Extract: bricks.pain_point.value
  • Salesforce - Update Record - Write scored signals to Opportunity
      Object: Opportunity
      AI Score: {{overall_score}}
      Risk Flag: {{risk_flag}}
      Pain Point: {{pain_point}}

Setup steps

1

Create a new Zap. Choose Salesforce as the trigger app and select "New Record" as the event. Set the object to Task and connect your Salesforce account.

2

Add a Filter step to only continue when TaskSubtype equals 'Call'. This prevents non-call activities from triggering the flow.

3

Add a "Webhooks by Zapier" Action (Custom Request) to fetch the recording from your telephony provider. Map the call ID or vendor call key from the Task record.

4

Add a second "Webhooks by Zapier" Action. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your Semarize API key as a Bearer token. In the body, set kit_code to your Kit, mode to "sync", and map the transcript text into input.transcript.

5

Add a Formatter step to extract individual brick values from the Semarize JSON response - overall_score, risk_flag, pain_point, etc.

6

Add a Salesforce Action to write the extracted scores and signals back to the related Opportunity record. Test each step end-to-end, then turn on the Zap.

Watch out for: Zapier's Salesforce trigger fires on every new Task, not only calls. Filter for TaskSubtype = 'Call' in your Zap to avoid processing non-call activities.
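For teams that prefer to script the Semarize call instead of using a webhook step, the request and the brick extraction can be sketched as below. The URL, body keys, and bricks.<name>.value paths come from the flow above; anything else about the response shape is an assumption, and the function names are illustrative.

```python
import json
from urllib.request import Request

SEMARIZE_RUNS_URL = "https://api.semarize.com/v1/runs"


def build_run_request(api_key, kit_code, transcript):
    """Build (but do not send) a synchronous run request."""
    body = json.dumps({
        "kit_code": kit_code,
        "mode": "sync",
        "input": {"transcript": transcript},
    }).encode("utf-8")
    return Request(
        SEMARIZE_RUNS_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_bricks(response, names=("overall_score", "risk_flag", "pain_point")):
    """Pull bricks.<name>.value for each name; missing bricks become None."""
    bricks = response.get("bricks", {})
    return {name: bricks.get(name, {}).get("value") for name in names}
```

Returning None for a missing brick, instead of raising, lets the write-back step decide whether a partial result is acceptable.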
n8n - Self-hosted workflows

Salesforce → n8n → Semarize → Database

Poll Salesforce for new call records on a schedule, fetch recordings, send each one to Semarize for analysis, then write the structured scores and signals to your database. n8n's built-in Salesforce node handles auth and pagination automatically.

Example Workflow

  • Cron - Every Hour - Triggers the workflow on schedule
      Mode: Every Hour
      Timezone: UTC
  • Salesforce - SOQL Query - Query Tasks modified since last run
      Node: Salesforce
      Operation: SOQL Query
      Query: SELECT Id, Subject, ... FROM Task WHERE TaskSubtype = 'Call' AND SystemModstamp >= {{last_run}}
  • For each call record:
      HTTP Request - Fetch Recording - GET recording from provider
        Method: GET
        URL: Provider endpoint / VoiceCall content
      Code - Prepare Data - Format call data and transcript for Semarize
        Map: call metadata + transcript text
      HTTP Request - Semarize - POST /v1/runs (sync)
        URL: https://api.semarize.com/v1/runs
        Auth: Bearer smz_live_...
        Body: { kit_code, mode: "sync", input: { transcript } }
        Output: Scores & signals returned
      Postgres - Insert Row - Write structured output to database
        Table: call_evaluations
        Columns: call_id, score, risk_flag, pain_point

Setup steps

1

Add a Cron node as the workflow trigger. Set the interval to your desired polling frequency (hourly works well for most teams).

2

Add a Salesforce node. Configure OAuth credentials for your Connected App. Set the operation to SOQL Query and write your query to fetch call Tasks modified since the last run.

3

Add a Split In Batches node to iterate over the returned call records. Inside the loop, add an HTTP Request node to fetch each recording from your telephony provider.

4

Add a Code node (JavaScript) to prepare the call data - combine metadata from the Salesforce record with the transcript or recording content.

5

Add another HTTP Request node to send the data to Semarize. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your API key as a Bearer token. Set kit_code, mode to "sync", and map the transcript into input.transcript.

6

Add a Code node to extract the brick values from the Semarize response - overall_score, risk_flag, pain_point, evidence, confidence.

7

Add a Postgres (or MySQL / HTTP Request) node to write the structured output. Use call_id as the primary key for upserts.

8

Activate the workflow. Monitor the first few runs to verify Semarize responses are arriving and writing correctly.

Watch out for: n8n has a built-in Salesforce node that supports SOQL queries. Use it to handle auth and pagination automatically.
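The upsert in step 7 can be illustrated outside n8n. The sketch below uses Python's built-in sqlite3 module as a stand-in for Postgres so it runs without a server; Postgres accepts the same ON CONFLICT ... DO UPDATE clause, though a driver such as psycopg uses %s placeholders instead of ?. The column types are assumptions.

```python
import sqlite3


def ensure_table(conn):
    """Create the call_evaluations table with call_id as primary key."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS call_evaluations ("
        "call_id TEXT PRIMARY KEY, score REAL, "
        "risk_flag INTEGER, pain_point TEXT)"
    )


def upsert_evaluation(conn, call_id, score, risk_flag, pain_point):
    """Insert a row, or overwrite the existing row for the same call_id."""
    conn.execute(
        "INSERT INTO call_evaluations (call_id, score, risk_flag, pain_point) "
        "VALUES (?, ?, ?, ?) "
        "ON CONFLICT(call_id) DO UPDATE SET "
        "score = excluded.score, "
        "risk_flag = excluded.risk_flag, "
        "pain_point = excluded.pain_point",
        (call_id, score, int(bool(risk_flag)), pain_point),
    )
    conn.commit()
```

Because the write is keyed on call_id, re-running the workflow over an overlapping time window updates rows instead of duplicating them.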
Make - Visual automation with branching

Salesforce → Make → Semarize → CRM + Slack

Receive new Salesforce call activity via webhook, fetch the recording, send it to Semarize for structured analysis, then use a Router to branch the scored output - alert on risk flags via Slack and write all signals back to your CRM.

Example Scenario

  • Webhook - New Call Activity - Triggered by Salesforce Outbound Message or Platform Event
      Source: Salesforce Platform Event / Outbound Message
  • HTTP - Fetch Recording - GET recording from provider
      Method: GET
      URL: Provider endpoint or VoiceCall content
      Auth: Provider credentials
  • HTTP - Semarize - POST /v1/runs (sync)
      URL: https://api.semarize.com/v1/runs
      Auth: Bearer smz_live_...
      Body: { kit_code, mode: "sync", input: { transcript } }
      Output: Structured output
  • Router - Branch on Risk Flag - Route by Semarize output
      Branch 1: IF risk_flag.value = true
      Branch 2: ALL (fallthrough)
  • Branch 1 (risk detected): Slack - Alert Channel - Notify team about flagged call
      Channel: #deal-alerts
      Message: Risk on {{call_id}}, score: {{score}}
  • Branch 2 (all calls): Salesforce - Update Record - Write all scored signals to Opportunity
      AI Score: {{overall_score}}
      Risk Flag: {{risk_flag}}
      Pain Point: {{pain_point}}

Setup steps

1

Create a new Scenario. Add a Webhook module as the trigger - configure it to receive events from Salesforce Outbound Messages or Platform Events.

2

In Salesforce Setup, configure an Outbound Message or Platform Event on the Task object (filtered to call Tasks) that sends data to your Make webhook URL.

3

Add an HTTP module to fetch the recording from your telephony provider. Map the call ID or vendor call key from the webhook payload.

4

Add another HTTP module to send the recording/transcript to Semarize. Set URL to https://api.semarize.com/v1/runs, add your Bearer token, and set kit_code, mode to "sync", and input.transcript from the previous step. Parse the response as JSON.

5

Add a Router module. Define Branch 1 with a filter: bricks.risk_flag.value equals true. Leave Branch 2 as a fallthrough (no filter).

6

On Branch 1, add a Slack module to alert your team when risk is detected. Map the score, risk flag, and call ID into the message.

7

On Branch 2, add a Salesforce module to write all brick values (score, risk_flag, pain_point) back to the Opportunity record.

8

Activate the scenario. Monitor the first few runs in Make's execution log.

Watch out for: Salesforce Outbound Messages or Platform Events can trigger Make scenarios. Configure the trigger in Salesforce Setup.
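The Router logic is simple enough to express in a few lines, which can be useful for testing the branching rules before building the scenario. In this sketch the action tuples are a made-up representation; only the channel name, message template, and brick names come from the scenario above.

```python
def route(call_id, bricks):
    """Return the list of actions for one evaluated call.

    Branch 1 fires only when risk_flag is exactly True; Branch 2 is a
    fallthrough that always writes the signals back to the CRM.
    """
    actions = []
    if bricks.get("risk_flag") is True:
        message = f"Risk on {call_id}, score: {bricks.get('overall_score')}"
        actions.append(("slack", "#deal-alerts", message))
    actions.append(("salesforce_update", call_id, {
        "AI Score": bricks.get("overall_score"),
        "Risk Flag": bricks.get("risk_flag"),
        "Pain Point": bricks.get("pain_point"),
    }))
    return actions
```

Note that a flagged call produces both actions: the alert does not replace the CRM write.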

What you can build

What You Can Do With Salesforce Data in Semarize

Salesforce stores your data. Semarize structures it. When conversation content is evaluated against your own frameworks and returned as typed, programmable output, new possibilities open up.

Custom QA Rubric Scoring

Contact Center QA

What Semarize generates

resolution_quality = 0.82
empathy_demonstrated = true
troubleshooting_complete = true
escalation_appropriate = false

Your contact center runs 500 calls per day. Your QA team has a 40-point rubric covering resolution quality, empathy, troubleshooting thoroughness, and escalation handling — and needs every call scored against it. Semarize evaluates every call against YOUR rubric, returning typed scores for each dimension. QA coverage goes from 5% random sampling to 100% automated evaluation. The QA team shifts from scoring calls to coaching on the scores.

QA Evaluation - Call SF-8721

Grounded against: QA Rubric v6

  • Resolution Quality: 82/100 (Issue identified, Root cause found, Preventive advice)
  • Empathy & Rapport: 91/100 (Active listening, Acknowledgment, Personalization)
  • Process Adherence: 74/100 (Verification, Ticket created, Summary provided)
  • Escalation Handling: 65/100 (Escalated without attempting resolution)

Overall: 78/100 - above threshold (70)

Knowledge-Grounded Resolution Accuracy

Policy & Product Verification

What Semarize generates

return_policy_correct = false
warranty_terms_accurate = true
troubleshooting_step_skipped = "power_cycle"
accuracy_score = 0.76

Your support team handles hundreds of calls daily. When agents quote return windows, warranty terms, or troubleshooting sequences, are they getting it right? Run a knowledge-grounded kit against your product documentation, return policies, and troubleshooting guides on every call. Semarize checks whether the return window quoted was accurate, whether the warranty terms matched the current policy, and whether troubleshooting steps followed the approved sequence. After scoring 3,000 calls, you discover that 12% of agents cite outdated return policy terms. The cost of honouring incorrect promises drops immediately once you target the specific agents and the specific policy sections they get wrong.

Escalation Risk Queue - Today (12 calls flagged)

  • SF-8834 · Billing dispute · 0.92 · Supervisor review - Frustration detected, no resolution attempted
  • SF-8836 · Integration error · 0.78 · Re-attempt - Technical issue, agent lacked product knowledge
  • SF-8839 · Account access · 0.45 · Monitor - Standard request, minor confusion
  • SF-8841 · Feature request · 0.31 · Route to product - No frustration, feature feedback captured

Predicted to prevent 8 unnecessary escalations today

Cross-Call Commitment Continuity Scoring

Follow-Through Analysis

What Semarize generates

follow_through_rate = 59%
dropped_commitments = 3
commitment_continuity = "weak"
close_rate_correlation = 2.8x

Action items get captured after every meeting — but do those commitments actually carry through to the next call? Run pairs of consecutive meeting transcripts through a commitment tracking kit. Semarize compares commitments_made in meeting N with commitments_referenced in meeting N+1, scoring follow_through_rate, dropped_commitments, and new_blockers_introduced. Across your sales team, the data shows a 41% commitment drop-off between discovery and demo calls. Deals where commitments carry through close at 2.8x the rate. Pipeline reviews now include commitment continuity as a deal health metric.
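To make the follow-through metric concrete, here is a toy calculation over two lists of commitments. It is a deliberately simplified sketch that matches commitments by normalized text; it does not describe how Semarize itself matches or scores them, and the function name is hypothetical.

```python
def follow_through(commitments_made, commitments_referenced):
    """Compare commitments from meeting N with those referenced in N+1.

    Matching is by lowercased, trimmed text, which is a simplification;
    real transcripts would need fuzzier matching.
    """
    made = {c.strip().lower() for c in commitments_made}
    referenced = {c.strip().lower() for c in commitments_referenced}
    if not made:
        return {"follow_through_rate": None, "dropped_commitments": []}
    kept = made & referenced
    return {
        "follow_through_rate": len(kept) / len(made),
        "dropped_commitments": sorted(made - referenced),
    }
```

For example, if four commitments were made and two are referenced in the next call, the rate is 0.5 and the other two are listed as dropped.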

Customer Journey - Case SF-9102

  • Chat, 11:30 AM - Score: 72, Handoff quality: 52%
  • Phone, 2:15 PM - Score: 61, Handoff quality: 78%
  • Email, next day - Score: 71

Journey score: 68/100

Chat → Phone handoff lost context. Customer repeated issue.

Custom Win/Loss Conversation Evidence Engine

Outcome-Linked Signal Analysis

Vibe-coded

What Semarize generates

win_predictor_1 = "budget_confirmed"
win_predictor_2 = "decision_maker"
model_accuracy = 0.82
deals_analysed = 500

A data analyst vibe-codes a Retool app that pulls Semarize scores from every transcript associated with closed deals. The app correlates conversation signals with outcomes: which Brick values predict wins vs losses? After analysing 500 closed deals, the model reveals that deals where budget_confirmed=true AND decision_maker_present=true AND discovery_depth>65 close at 4x the rate. The team updates their deal qualification criteria based on actual conversation evidence — not CRM checkbox data.
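The kind of comparison described here, a close rate for deals matching a signal combination versus the rest, can be sketched in a few lines. The deal dictionaries and the won key are hypothetical; the predicate mirrors the three conditions named above.

```python
def close_rate(deals):
    """Fraction of deals marked won; None for an empty list."""
    if not deals:
        return None
    return sum(1 for d in deals if d["won"]) / len(deals)


def lift(deals, predicate):
    """Close rate of matching deals divided by that of all other deals.

    Returns None when either group is empty or the baseline rate is zero.
    """
    matching = [d for d in deals if predicate(d)]
    rest = [d for d in deals if not predicate(d)]
    rate_match, rate_rest = close_rate(matching), close_rate(rest)
    if rate_match is None or not rate_rest:
        return None
    return rate_match / rate_rest


def qualified(deal):
    """The signal combination from the example above."""
    return (
        deal["budget_confirmed"]
        and deal["decision_maker_present"]
        and deal["discovery_depth"] > 65
    )
```

A ratio like this is descriptive, not causal, so it is best treated as a prompt for reviewing qualification criteria and not as proof that the signals drive wins.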

Revenue Intelligence (vibe-coded with Retool)

  • Acme Corp · $85k · Negotiation · 0.82 - Velocity: 4 calls/week ↑, Stakeholders: 3/5 mapped
  • Globex Inc · $120k · Proposal · 0.71 - Velocity: 2 calls/week →, Stakeholders: 4/5 mapped
  • Initech · $45k · Discovery · 0.54 - Velocity: 1 call/week ↓, Stakeholders: 1/5 mapped
  • Umbrella Co · $200k · Legal Review · 0.38 - Velocity: 0 calls this week ↓, Stakeholders: 2/5 mapped

Joins Semarize signals with SOQL opportunity data · Updated daily

Watch out for

Common Challenges & Gotchas

These are the issues that come up most often when teams start extracting call data from Salesforce at scale.

Multiple call data sources

Salesforce stores call data across Task objects, VoiceCall records, and Einstein Conversation entries. You may need to query multiple objects.

Einstein Conversation Insights requires add-on

Full transcript access requires Einstein for Sales or Service. Without it, you only get Task metadata and notes.

API call limits vary by edition

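Salesforce enforces daily API call limits that depend on edition and license count (e.g., 15,000 per 24 hours for Developer Edition, while Enterprise Edition starts at 100,000 plus a per-license allocation). Bulk operations should use the Bulk API to conserve limits.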

Recording access varies by provider

Service Cloud Voice uses Amazon Connect. Third-party CTI providers store recordings externally. Each has different access patterns.

SOQL query complexity

Querying related records (call → contact → account → opportunity) requires multiple queries or relationship queries. Plan your data model carefully.

Sandbox vs. production differences

API endpoints and data differ between sandbox and production. Always test in sandbox before deploying to production.

Change Data Capture setup

CDC requires admin configuration and has per-org event delivery limits. Monitor event bus capacity for high-volume orgs.
