Salesforce - How to Get Your Conversation Data

A practical guide to getting your conversation data from Salesforce - covering the Salesforce API, Einstein Conversation Insights, Service Cloud Voice recordings, and how to route structured data into downstream systems.

What you'll learn

What conversation data you can extract from Salesforce - call logs, Einstein Conversation Insights transcripts, Service Cloud Voice recordings, and activity data
How to access data via the Salesforce API - Connected Apps, OAuth 2.0, and SOQL queries
Three extraction patterns: SOQL-based export, scheduled polling, and Platform Event-driven flows
How to connect Salesforce data pipelines to Zapier, n8n, and Make
Advanced use cases - custom scoring, CRM enrichment, compliance, and warehouse analytics

Data

What Data You Can Extract From Salesforce

Salesforce stores conversation data across multiple objects - Task records for call activity, VoiceCall records for Service Cloud Voice, and ConversationEntry objects for Einstein Conversation Insights. Each source provides different levels of detail depending on your org's configuration and add-ons.

Common fields teams care about

Call activity logs (Task records with call metadata - subject, description, duration, outcome)

Einstein Conversation Insights transcripts (AI-generated transcripts from Sales Cloud calls)

Service Cloud Voice recordings (call recordings from Amazon Connect or partner telephony)

Voice call metadata (call direction, duration, queue, agent, disposition)

Activity timeline (calls positioned within the full account/opportunity timeline)

Associated records (linked contacts, accounts, opportunities, cases)

Call sentiment and topics (Einstein-detected sentiment signals and topic mentions)

Custom fields on Task/VoiceCall objects (any custom fields your team tracks)

Call coaching signals (Einstein's built-in coaching recommendations if enabled)

Omni-Channel routing data (queue assignment, routing priority, wait times)

API Access

How to Get Call Data via the Salesforce API

Salesforce exposes call data through its REST API. The workflow is: authenticate via a Connected App with OAuth 2.0, query call records using SOQL, then fetch recordings and transcripts from the relevant objects.

Authenticate

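Create a Connected App in Salesforce Setup (Setup → App Manager → New Connected App). Enable OAuth 2.0 with scopes: api, refresh_token, offline_access. Use the OAuth 2.0 JWT bearer flow for server-to-server integrations with no user present, or the web server flow when a user authorizes the app interactively. Both return an access token that you send on every API request: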

Authorization: Bearer {access_token}

Use https://login.salesforce.com/services/oauth2/token for production or https://test.salesforce.com/... for sandbox.
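Once a refresh token has been issued, a scheduled job can exchange it for a fresh access token without user interaction. The sketch below builds that request with the standard library only. The function names are illustrative, and the credentials are placeholders you would load from your own secret store; for a sandbox, pass the sandbox token URL as token_url.

```python
import json
import urllib.parse
import urllib.request

PROD_TOKEN_URL = "https://login.salesforce.com/services/oauth2/token"


def build_refresh_request(client_id, client_secret, refresh_token,
                          token_url=PROD_TOKEN_URL):
    """Build (but do not send) an OAuth 2.0 refresh-token grant request."""
    form = urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def fetch_token(request):
    """Send the request. Needs network access and real credentials."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


def auth_header(token_response):
    """Turn the token endpoint's JSON body into an Authorization header."""
    return {"Authorization": "Bearer " + token_response["access_token"]}
```

The token response also includes an instance_url value; use it as the base URL for subsequent REST calls rather than the login host.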

Query call records

For Task-based calls, Service Cloud Voice records, and Einstein Conversation Insights, use SOQL queries via the REST API endpoint GET /services/data/v59.0/query?q={SOQL}.

-- Task-based calls
SELECT Id, Subject, Description, CallDurationInSeconds,
       CallType, ActivityDate, WhoId, WhatId
FROM Task
WHERE TaskSubtype = 'Call'
  AND ActivityDate >= 2026-01-01

-- Service Cloud Voice
SELECT Id, CallType, CallDurationInSeconds,
       FromPhoneNumber, ToPhoneNumber, VendorCallKey
FROM VoiceCall
WHERE CreatedDate >= 2026-01-01T00:00:00Z

-- Einstein Conversation Insights
-- Query ConversationEntry objects for transcript segments

URL-encode the SOQL string before placing it in the q parameter. Results are paginated - each response includes a nextRecordsUrl if more records exist, and its done field stays false until the last page.
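The pagination loop can be sketched as a small generator. This is a minimal version, assuming a caller-supplied fetch_json function (hypothetical, not part of any Salesforce SDK) that performs an authenticated GET on a full URL and returns the decoded JSON body.

```python
import urllib.parse

API_VERSION = "v59.0"


def query_all(fetch_json, instance_url, soql):
    """Yield every record for a SOQL query, following nextRecordsUrl.

    fetch_json: hypothetical helper that GETs a URL with the Bearer token
    attached and returns the parsed JSON response.
    """
    url = "{}/services/data/{}/query?q={}".format(
        instance_url, API_VERSION, urllib.parse.quote(soql))
    while url:
        page = fetch_json(url)
        for record in page["records"]:
            yield record
        # nextRecordsUrl is a path relative to the instance URL and is
        # absent on the final page.
        next_path = page.get("nextRecordsUrl")
        url = instance_url + next_path if next_path else None
```

Because it is a generator, records can be processed as each page arrives instead of holding the full result set in memory.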

Access recordings and transcripts

Service Cloud Voice recordings can be accessed via the VoiceCall content endpoint. Einstein Conversation Insights provides transcript segments through ConversationEntry objects with speaker labels. Third-party CTI recordings require following the provider's URL/API for recording access.

-- Service Cloud Voice recording
GET /services/data/v59.0/sobjects/VoiceCall/{id}/Content

-- Einstein Conversation Insights
-- Query ConversationEntry objects for transcript
-- segments with speaker labels

-- Third-party CTI recordings
-- Follow the provider's URL/API for recording access

Einstein Conversation Insights returns transcript data as ConversationEntry records, each with a speaker ID and text segment. Third-party CTI providers (e.g., Five9, RingCentral) store recordings externally - check their documentation for API access.
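Since transcript segments arrive as separate records, they need to be ordered and joined before analysis. The sketch below does that for a list of record dictionaries. The default field names (ActorName, Message, EntryTime) are assumptions; confirm them against the ConversationEntry object in your own org and override the keyword arguments if they differ.

```python
def build_transcript(entries, speaker_key="ActorName", text_key="Message",
                     time_key="EntryTime"):
    """Join transcript segments into 'Speaker: text' lines in time order.

    The key names are assumed defaults, not confirmed field names. Timestamps
    are compared as strings, which works when they share one ISO 8601 format.
    """
    ordered = sorted(entries, key=lambda entry: entry[time_key])
    lines = []
    for entry in ordered:
        text = (entry.get(text_key) or "").strip()
        if not text:
            continue  # skip empty segments
        speaker = entry.get(speaker_key) or "Unknown"
        lines.append("{}: {}".format(speaker, text))
    return "\n".join(lines)
```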

Handle authentication and limits

API limits

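Salesforce enforces a daily API request limit based on org edition and license count (API calls per rolling 24-hour period). Monitor remaining capacity via the REST limits resource (GET /services/data/v59.0/limits, DailyApiRequests entry) or the Sforce-Limit-Info header returned with each REST response.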
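One lightweight way to track consumption is to read the Sforce-Limit-Info header that accompanies REST responses (for example api-usage=18/5000), or the DailyApiRequests entry returned by the limits resource. The helpers below are a sketch; the 10% reserve threshold is an arbitrary illustration, not a Salesforce recommendation.

```python
import re


def parse_limit_info(header_value):
    """Parse an Sforce-Limit-Info header such as 'api-usage=18/5000'."""
    match = re.search(r"api-usage=(\d+)/(\d+)", header_value)
    if match is None:
        raise ValueError("no api-usage entry in header: %r" % header_value)
    return int(match.group(1)), int(match.group(2))


def remaining_from_limits(limits_json):
    """Read remaining daily calls from a /limits response body."""
    return limits_json["DailyApiRequests"]["Remaining"]


def should_pause(used, limit, reserve_fraction=0.1):
    """True when fewer than reserve_fraction of the daily calls remain."""
    return (limit - used) < limit * reserve_fraction
```

An extraction job could call should_pause after each page and defer the rest of a backfill to the next 24-hour window when it returns True.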

Einstein requirements

Einstein Conversation Insights requires Sales Cloud Einstein or Service Cloud Einstein. Not all orgs have it enabled. Check your org's entitlements before building transcript-dependent flows.

Patterns

Key Extraction Flows

There are three practical patterns for getting call data out of Salesforce. The right choice depends on whether you're doing a one-off migration, running ongoing extraction, or need near real-time processing.

Backfill (Historical Export)

One-off migration of past call data

Create a Connected App with the necessary OAuth scopes (api, refresh_token, offline_access)

Write a SOQL query for Tasks or VoiceCalls filtered by date range

Execute the query with pagination - the REST API returns up to 2,000 records per response, plus a nextRecordsUrl to request the next batch

Fetch recordings where available via the VoiceCall content endpoint or provider API

Send each transcript and call metadata to Semarize for structured analysis

Tip: queryMore() is the SOAP API call. Over REST, the equivalent is to GET the nextRecordsUrl from each response until none is returned.
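For large backfills it can help to split the date range into windows so that a failed run resumes from the last completed window rather than from the start. The sketch below generates windows and the matching Task query; the 30-day window size and the function names are illustrative choices.

```python
from datetime import date, timedelta


def date_windows(start, end, days=30):
    """Yield (window_start, window_end) pairs covering [start, end)."""
    cursor = start
    while cursor < end:
        upper = min(cursor + timedelta(days=days), end)
        yield cursor, upper
        cursor = upper


def task_backfill_soql(window_start, window_end):
    """SOQL for call Tasks in one window. Date literals are unquoted."""
    return (
        "SELECT Id, Subject, Description, CallDurationInSeconds, "
        "CallType, ActivityDate, WhoId, WhatId "
        "FROM Task "
        "WHERE TaskSubtype = 'Call' "
        "AND ActivityDate >= {} AND ActivityDate < {}"
    ).format(window_start.isoformat(), window_end.isoformat())
```

Each window uses a half-open range (>= start, < end), so adjacent windows never return the same record twice.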

Incremental Polling

Ongoing extraction on a schedule

Schedule a job (cron, Lambda, etc.) that runs your extraction script at regular intervals

Query Tasks or VoiceCalls modified since the last run using SystemModstamp or LastModifiedDate

Filter out already-processed record IDs to avoid reprocessing

Fetch recordings and transcripts for new or updated records

Route each transcript and its metadata to Semarize for structured analysis

Tip: Use SystemModstamp or LastModifiedDate for reliable incremental queries. These fields capture both insert and update events.
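The polling steps above can be sketched as two small helpers: one builds the incremental query from the last run time, the other drops records that were already handled at the same modification stamp. Keying on the (Id, SystemModstamp) pair rather than Id alone is a design choice made here so that updated records are picked up again; the processed mapping is a hypothetical store you would persist between runs.

```python
from datetime import datetime, timezone


def soql_datetime(moment):
    """Format a timezone-aware datetime as an unquoted SOQL UTC literal."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def incremental_soql(last_run):
    """SOQL for call Tasks created or modified since last_run."""
    return (
        "SELECT Id, Subject, SystemModstamp FROM Task "
        "WHERE TaskSubtype = 'Call' AND SystemModstamp >= {} "
        "ORDER BY SystemModstamp"
    ).format(soql_datetime(last_run))


def select_new(records, processed):
    """Return records not yet handled at their current SystemModstamp.

    processed maps record Id -> the SystemModstamp string last handled.
    """
    return [r for r in records
            if processed.get(r["Id"]) != r["SystemModstamp"]]
```

Because the query uses >=, records stamped exactly at last_run come back on the next poll too; select_new is what keeps them from being processed twice.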

Platform Event-Driven

Near real-time on record creation

Create a Platform Event or enable Change Data Capture (CDC) on Task or VoiceCall objects in Salesforce Setup

Subscribe via CometD or an Apex trigger that listens for new call records

When new call data is captured, push the record to your processing endpoint

Fetch the recording and process via Semarize for structured analysis

Note: Change Data Capture (CDC) publishes near real-time events when records are created or modified. This removes the need for polling, though a periodic reconciliation query is still a sensible safety net for missed events.
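On the subscriber side, each change event has to be filtered down to the records worth processing. The sketch below assumes the message has already been decoded into a dictionary with the data → payload → ChangeEventHeader nesting used for CometD change events; verify the exact shape against your own subscription client before relying on it.

```python
def created_call_ids(message, entity="Task"):
    """Return record IDs from a change event if it is a newly created call.

    Assumes message["data"]["payload"]["ChangeEventHeader"] carries
    entityName, changeType, and recordIds.
    """
    payload = message["data"]["payload"]
    header = payload["ChangeEventHeader"]
    if header["entityName"] != entity or header["changeType"] != "CREATE":
        return []
    # For Task events, keep only call activities.
    if entity == "Task" and payload.get("TaskSubtype") != "Call":
        return []
    return list(header["recordIds"])


def replay_id(message):
    """Replay ID to persist so a reconnecting subscriber can resume."""
    return message["data"]["event"]["replayId"]
```

Storing the last replay ID after each successfully processed event lets the subscriber resume from that point after a disconnect, within the event retention window.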

Automation

Send Salesforce Call Data to Automation Tools

Once you can extract call data from Salesforce, the next step is routing it through Semarize for structured analysis and into your downstream systems. Below are end-to-end example flows - each showing the full pipeline from Salesforce trigger through Semarize evaluation to CRM, Slack, or database output.

Zapier - No-code automation

Salesforce → Zapier → Semarize → CRM

Detect new Salesforce call Tasks, fetch the recording, send it to Semarize for structured analysis, then write the scored output - signals, flags, and evidence - back to your Salesforce Opportunity.

Example Zap

Trigger: New Task (Call) in Salesforce

Fires when a new call Task is created

App: Salesforce

Event: New Record

Object: Task (TaskSubtype = 'Call')

Webhooks by Zapier

Fetch recording from provider

Method: GET

URL: Recording provider endpoint

Auth: Provider credentials

Recording returned

Webhooks by Zapier

POST /v1/runs (sync) to Semarize

Method: POST

URL: https://api.semarize.com/v1/runs

Auth: Bearer smz_live_...

Body: { kit_code, mode: "sync", input: { transcript } }

Structured output returned

Formatter by Zapier

Extract brick values from Semarize response

Extract: bricks.overall_score.value

Extract: bricks.risk_flag.value

Extract: bricks.pain_point.value

Salesforce - Update Record

Write scored signals to Opportunity

Object: Opportunity

AI Score: {{overall_score}}

Risk Flag: {{risk_flag}}

Pain Point: {{pain_point}}

Setup steps

Create a new Zap. Choose Salesforce as the trigger app and select "New Record" as the event. Set the object to Task and connect your Salesforce account.

Add a Filter step to only continue when TaskSubtype equals 'Call'. This prevents non-call activities from triggering the flow.

Add a "Webhooks by Zapier" Action (Custom Request) to fetch the recording from your telephony provider. Map the call ID or vendor call key from the Task record.

Add a second "Webhooks by Zapier" Action. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your Semarize API key as a Bearer token. In the body, set kit_code to your Kit, mode to "sync", and map the transcript text into input.transcript.

Add a Formatter step to extract individual brick values from the Semarize JSON response - overall_score, risk_flag, pain_point, etc.

Add a Salesforce Action to write the extracted scores and signals back to the related Opportunity record. Test each step end-to-end, then turn on the Zap.

Watch out for: Zapier's Salesforce trigger fires on every new Task, not just calls. Filter for TaskSubtype = 'Call' in your Zap to avoid processing non-call activities.
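For testing the Semarize call outside Zapier, the same request can be sketched in a few lines. The URL, Bearer auth, and body fields come from the example Zap above; the response shape is assumed only as far as the bricks.<name>.value paths shown there, and the function names are illustrative.

```python
import json
import urllib.request

SEMARIZE_RUNS_URL = "https://api.semarize.com/v1/runs"


def build_run_request(api_key, kit_code, transcript):
    """Build (but do not send) a synchronous run request."""
    body = json.dumps({
        "kit_code": kit_code,
        "mode": "sync",
        "input": {"transcript": transcript},
    }).encode("utf-8")
    return urllib.request.Request(
        SEMARIZE_RUNS_URL,
        data=body,
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_bricks(response,
                   names=("overall_score", "risk_flag", "pain_point")):
    """Pull bricks.<name>.value out of a parsed response, None if missing."""
    bricks = response.get("bricks", {})
    return {name: bricks.get(name, {}).get("value") for name in names}
```

extract_bricks mirrors what the Formatter step does in the Zap: it flattens the nested response into plain fields that map directly onto Opportunity columns.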

n8n - Self-hosted workflows

Salesforce → n8n → Semarize → Database

Poll Salesforce for new call records on a schedule, fetch recordings, send each one to Semarize for analysis, then write the structured scores and signals to your database. n8n's built-in Salesforce node handles auth and pagination automatically.

Example Workflow

Cron - Every Hour

Triggers the workflow on schedule

Mode: Every Hour

Timezone: UTC

Salesforce - SOQL Query

Query Tasks modified since last run

Node: Salesforce

Operation: SOQL Query

Query: SELECT Id, Subject, ... FROM Task WHERE TaskSubtype = 'Call' AND SystemModstamp >= {{last_run}}

For each call record

HTTP Request - Fetch Recording

GET recording from provider

Method: GET

URL: Provider endpoint / VoiceCall content

Code - Prepare Data

Format call data and transcript for Semarize

Map: call metadata + transcript text

HTTP Request - Semarize

POST /v1/runs (sync)

URL: https://api.semarize.com/v1/runs

Auth: Bearer smz_live_...

Body: { kit_code, mode: "sync", input: { transcript } }

Scores & signals returned

Postgres - Insert Row

Write structured output to database

Table: call_evaluations

Columns: call_id, score, risk_flag, pain_point

Setup steps

Add a Cron node as the workflow trigger. Set the interval to your desired polling frequency (hourly works well for most teams).

Add a Salesforce node. Configure OAuth credentials for your Connected App. Set the operation to SOQL Query and write your query to fetch call Tasks modified since the last run.

Add a Split In Batches node to iterate over the returned call records. Inside the loop, add an HTTP Request node to fetch each recording from your telephony provider.

Add a Code node (JavaScript) to prepare the call data - combine metadata from the Salesforce record with the transcript or recording content.

Add another HTTP Request node to send the data to Semarize. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your API key as a Bearer token. Set kit_code, mode to "sync", and map the transcript into input.transcript.

Add a Code node to extract the brick values from the Semarize response - overall_score, risk_flag, pain_point, evidence, confidence.

Add a Postgres (or MySQL / HTTP Request) node to write the structured output. Use call_id as the primary key for upserts.

Activate the workflow. Monitor the first few runs to verify Semarize responses are arriving and writing correctly.

Watch out for: n8n has a built-in Salesforce node that supports SOQL queries. Use it to handle auth and pagination automatically.
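The final write step relies on an upsert keyed by call_id so that a call re-evaluated on a later poll overwrites its earlier row instead of duplicating it. The sketch below shows the pattern using SQLite from the Python standard library as a stand-in for Postgres; Postgres accepts the same ON CONFLICT ... DO UPDATE form, though driver placeholders differ. Table and column names follow the example workflow above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS call_evaluations (
    call_id    TEXT PRIMARY KEY,
    score      REAL,
    risk_flag  INTEGER,
    pain_point TEXT
)
"""

UPSERT = """
INSERT INTO call_evaluations (call_id, score, risk_flag, pain_point)
VALUES (?, ?, ?, ?)
ON CONFLICT(call_id) DO UPDATE SET
    score = excluded.score,
    risk_flag = excluded.risk_flag,
    pain_point = excluded.pain_point
"""


def upsert_evaluation(conn, call_id, score, risk_flag, pain_point):
    """Insert a row, or overwrite the existing row for the same call_id."""
    conn.execute(UPSERT, (call_id, score, int(bool(risk_flag)), pain_point))
    conn.commit()
```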

Make - Visual automation with branching

Salesforce → Make → Semarize → CRM + Slack

Receive new Salesforce call activity via webhook, fetch the recording, send it to Semarize for structured analysis, then use a Router to branch the scored output - alert on risk flags via Slack and write all signals back to your CRM.

Example Scenario

Webhook - New Call Activity

Triggered by Salesforce Outbound Message or Platform Event

Source: Salesforce Platform Event / Outbound Message

HTTP - Fetch Recording

GET recording from provider

Method: GET

URL: Provider endpoint or VoiceCall content

Auth: Provider credentials

HTTP - Semarize

POST /v1/runs (sync)

URL: https://api.semarize.com/v1/runs

Auth: Bearer smz_live_...

Body: { kit_code, mode: "sync", input: { transcript } }

Structured output

Router - Branch on Risk Flag

Route by Semarize output

Branch 1: IF risk_flag.value = true

Branch 2: ALL (fallthrough)

Branch 1 - Risk detected

Slack - Alert Channel

Notify team about flagged call

Channel: #deal-alerts

Message: Risk on {{call_id}}, score: {{score}}

Branch 2 - All calls

Salesforce - Update Record

Write all scored signals to Opportunity

AI Score: {{overall_score}}

Risk Flag: {{risk_flag}}

Pain Point: {{pain_point}}

Setup steps

Create a new Scenario. Add a Webhook module as the trigger - configure it to receive events from Salesforce Outbound Messages or Platform Events.

In Salesforce Setup, configure an Outbound Message or Platform Event on the Task object (filtered to call Tasks) that sends data to your Make webhook URL.

Add an HTTP module to fetch the recording from your telephony provider. Map the call ID or vendor call key from the webhook payload.

Add another HTTP module to send the recording/transcript to Semarize. Set URL to https://api.semarize.com/v1/runs, add your Bearer token, and set kit_code, mode to "sync", and input.transcript from the previous step. Parse the response as JSON.

Add a Router module. Define Branch 1 with a filter: bricks.risk_flag.value equals true. Leave Branch 2 as a fallthrough (no filter).

On Branch 1, add a Slack module to alert your team when risk is detected. Map the score, risk flag, and call ID into the message.

On Branch 2, add a Salesforce module to write all brick values (score, risk_flag, pain_point) back to the Opportunity record.

Activate the scenario. Monitor the first few runs in Make's execution log.

Watch out for: Salesforce Outbound Messages or Platform Events can trigger Make scenarios. Configure the trigger in Salesforce Setup.
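The Router logic in this scenario is simple enough to express as a function, which can be handy for unit-testing the branching rule before building it in Make. This is a sketch of the rule only: the destination labels are illustrative, and the message mirrors the Slack template shown in the example scenario.

```python
def route(bricks):
    """Return the destinations a scored call should be sent to.

    Mirrors the two branches: Slack only when risk_flag is true,
    Salesforce for every call (the unfiltered branch).
    """
    destinations = []
    if bricks.get("risk_flag", {}).get("value") is True:
        destinations.append("slack")
    destinations.append("salesforce")
    return destinations


def slack_message(call_id, bricks):
    """Format the alert text for a flagged call."""
    score = bricks.get("overall_score", {}).get("value")
    return "Risk on {}, score: {}".format(call_id, score)
```

Note that a flagged call goes to both destinations: the unfiltered branch is not an else branch, so the Opportunity is still updated when an alert fires.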

What you can build

What You Can Do With Salesforce Data in Semarize

Salesforce stores your data. Semarize structures it. When conversation content is evaluated against your own frameworks and returned as typed, programmable output, new possibilities open up.

Custom QA Rubric Scoring

Contact Center QA

What Semarize generates

resolution_quality = 0.82
empathy_demonstrated = true
troubleshooting_complete = true
escalation_appropriate = false

Your contact center runs 500 calls per day. Your QA team has a 40-point rubric covering resolution quality, empathy, troubleshooting thoroughness, and escalation handling - and needs every call scored against it. Semarize evaluates every call against YOUR rubric, returning typed scores for each dimension. QA coverage goes from 5% random sampling to 100% automated evaluation. The QA team shifts from scoring calls to coaching on the scores.

QA Evaluation - Call SF-8721

Grounded against: QA Rubric v6

Resolution Quality: 82/100 (Issue identified, Root cause found, Preventive advice)

Empathy & Rapport: 91/100 (Active listening, Acknowledgment, Personalization)

Process Adherence: 74/100 (Verification, Ticket created, Summary provided)

Escalation Handling: 65/100 (Escalated without attempting resolution)

Overall: 78 / 100 - Above threshold (70)

Escalation Risk Prediction

Conversation-Driven Risk Scoring

What Semarize generates

frustration_detected = true
resolution_attempted = false
escalation_risk_score = 0.92
recommended_action = "supervisor_review"

Your support team handles hundreds of calls daily - but you only find out a case is headed for escalation after it happens. Semarize scores every support interaction for frustration signals, failed resolution attempts, and escalation likelihood in real time. Each case gets a risk score and a recommended action - surface high-risk cases for supervisor review before they blow up. After scoring a week of calls, you find that 18% of cases with risk above 0.75 would have escalated without intervention. Salesforce tracks case status fields, but it cannot natively predict escalation risk from what was actually said in the conversation.

Escalation Risk Queue - Today (12 calls flagged)

SF-8834 - Billing dispute - 0.92 - Supervisor review
Frustration detected, no resolution attempted

SF-8836 - Integration error - 0.78 - Re-attempt
Technical issue, agent lacked product knowledge

SF-8839 - Account access - 0.45 - Monitor
Standard request, minor confusion

SF-8841 - Feature request - 0.31 - Route to product
No frustration, feature feedback captured

Predicted to prevent 8 unnecessary escalations today

Cross-Channel Journey Scoring

Touchpoint & Handoff Quality Analysis

What Semarize generates

journey_score = 68
handoff_quality_score = 52%
channel_sequence = "chat > phone > email"
weakest_handoff = "chat_to_phone"

Your customers move between chat, phone, and email - but you have no visibility into whether context carries across those transitions. Semarize scores every interaction across channels and measures handoff quality between consecutive touchpoints. Each customer journey gets a composite score reflecting interaction quality and transition smoothness. After scoring 2,000 multi-channel journeys, you discover that chat-to-phone handoffs lose context 48% of the time - driving repeat explanations and lower satisfaction. Salesforce logs case activity across channels, but it cannot natively score journey quality or measure how well context transfers from one conversation to the next.

Customer Journey - Case SF-9102

Chat (11:30 AM)
Score: 72
Handoff quality: 52%

Phone (2:15 PM)
Score: 61
Handoff quality: 78%

Email (Next day)
Score: 71

Journey score: 68/100

Chat → Phone handoff lost context. Customer repeated issue.

Conversation-Powered Revenue Intelligence

Deal Health & Stakeholder Mapping

Vibe-coded

What Semarize generates

deal_health_score = 0.74
call_velocity = 2.3/week
stakeholders_mapped = 3/5
velocity_trend = "declining"

Your pipeline reviews rely on reps self-reporting deal status - but the CRM fields rarely reflect what is actually happening in conversations. Semarize scores every sales call for deal health signals, tracks call velocity trends across the opportunity lifecycle, and maps stakeholder coverage from who actually shows up and speaks on calls. After scoring a quarter of pipeline activity, you find that deals with declining call velocity and fewer than 3 stakeholders mapped close at one-fifth the rate. Revenue teams now run pipeline reviews against conversation-derived signals instead of CRM fields. Salesforce tracks opportunity stages and activity counts, but it cannot natively derive deal health or stakeholder mapping from what was said on calls.

Revenue Intelligence - Vibe-coded with Retool

Acme Corp - $85k - Negotiation - 0.82
Velocity: 4 calls/week ↑
Stakeholders: 3/5 mapped

Globex Inc - $120k - Proposal - 0.71
Velocity: 2 calls/week →
Stakeholders: 4/5 mapped

Initech - $45k - Discovery - 0.54
Velocity: 1 call/week ↓
Stakeholders: 1/5 mapped

Umbrella Co - $200k - Legal Review - 0.38
Velocity: 0 calls this week ↓
Stakeholders: 2/5 mapped

Joins Semarize signals with SOQL opportunity data · Updated daily

Watch out for

Common Challenges & Gotchas

These are the issues that come up most often when teams start extracting call data from Salesforce at scale.

Multiple call data sources

Salesforce stores call data across Task records, VoiceCall records, and ConversationEntry records for Einstein Conversation Insights. A complete extract usually means querying more than one of these objects and reconciling the results.

Einstein Conversation Insights requires add-on

Full transcript access requires Einstein for Sales or Service. Without it, you only get Task metadata and notes.

API call limits vary by edition

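Salesforce enforces daily API request limits that depend on edition and license count (e.g., 15,000 per 24 hours for Developer Edition; Enterprise Edition starts at 100,000 plus a per-license allowance). Check your org's actual allocation in Setup, and use the Bulk API for bulk operations to conserve limits.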

Recording access varies by provider

Service Cloud Voice uses Amazon Connect. Third-party CTI providers store recordings externally. Each has different access patterns.

SOQL query complexity

Querying related records (call → contact → account → opportunity) requires multiple queries or relationship queries. Plan your data model carefully.
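A relationship query can pull parent names alongside each call in a single request, at the cost of a nested response. The sketch below pairs one such query with a flattening helper. The relationship names used (Who, What, Account) are standard on Task, but treat the exact field list as an assumption to check against your org before use.

```python
CALL_WITH_PARENTS_SOQL = (
    "SELECT Id, Subject, Who.Name, What.Name, Account.Name "
    "FROM Task WHERE TaskSubtype = 'Call'"
)


def flatten_task(record):
    """Flatten nested parent objects from a relationship query result.

    Parent relationships come back as nested objects, or as null when
    the lookup field is empty.
    """
    def name_of(relation):
        parent = record.get(relation)
        return parent.get("Name") if parent else None

    return {
        "call_id": record["Id"],
        "subject": record.get("Subject"),
        "who_name": name_of("Who"),
        "what_name": name_of("What"),
        "account_name": name_of("Account"),
    }
```

Flattening at extraction time keeps downstream steps (warehouse loads, automation tool mappings) working with plain columns instead of nested JSON.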

Sandbox vs. production differences

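Sandboxes authenticate against a different login host (test.salesforce.com rather than login.salesforce.com), use their own instance URLs, and may hold only a partial or outdated copy of production data. Always test in sandbox before deploying to production, and keep per-environment credentials separate.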

Change Data Capture setup

CDC requires admin configuration and has per-org event delivery limits. Monitor event bus capacity for high-volume orgs.
