CallMiner - How to Get Your Conversation Data

CallMiner Eureka is one of the longest-running speech analytics platforms in the contact center space. This guide covers the standards-based APIs and connectors that ship with the platform (including the Open Voice Transcription Standard, OVTS), real-time streaming via Eureka Alert, and the tenant-specific auth model.

Quick answer

CallMiner offers standards-based API access to the Eureka platform via APIs, connectors, and the Open Voice Transcription Standard (OVTS). Authenticate with the credentials issued for your tenant, then pull interactions, transcripts, categories, and analytics scores. For live use cases, real-time streaming transcription is available via Eureka Alert. API and integration scope depends on your CallMiner agreement, so confirm with your account team before building.

What you'll learn

What interaction data you can extract from CallMiner - voice, chat, email, and video transcripts, metadata, and speaker labels
How to access data via the CallMiner REST API - OAuth 2.0 authentication, endpoints, and pagination
Three extraction patterns: historical backfill, incremental polling, and real-time API
How to connect CallMiner data pipelines to Zapier, n8n, and Make
Advanced use cases - custom compliance scoring, attrition prediction, omnichannel consistency, and warehouse analytics

Data

What Data You Can Extract From CallMiner

CallMiner captures interactions across multiple channels - voice, chat, email, and video. Every interaction produces a set of structured assets that can be extracted via API - the transcript, speaker identification, timing metadata, channel type, and contextual information about the interaction and its associated contact.

Common fields teams care about

Full transcript text (audio, chat, email, video)

Speaker labels (agent vs. customer)

Agent ID and agent name

Channel type (voice / chat / email / video)

Interaction date, time, and duration

Customer identifier and contact metadata

Interaction direction (inbound / outbound)

Disposition and wrap-up codes

OVTS-compatible transcript format

Associated campaign or queue metadata

API Access

How to Get Transcripts via the CallMiner API

CallMiner exposes interactions and transcripts through a REST API secured with OAuth 2.0. The workflow is: register an application on the developer portal to get client credentials, exchange them for an access token, list interactions by date range, then fetch the transcript for each interaction ID.

Authenticate with OAuth 2.0

CallMiner uses OAuth 2.0 with client credentials. Register your application at developer.callminer.com to obtain a client_id and client_secret. Exchange them for a Bearer token via the token endpoint.

POST https://auth.callminer.com/oauth/token
Content-Type: application/x-www-form-urlencoded

grant_type=client_credentials
&client_id=<your_client_id>
&client_secret=<your_client_secret>

# Response:
# { "access_token": "eyJ...", "token_type": "Bearer", "expires_in": 3600 }

API access is enterprise or partner-gated. Contact your CallMiner account representative or apply through developer.callminer.com to provision credentials. Tokens expire (after 3600 seconds in the example response above), so have your pipeline request a fresh token before the current one lapses.
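A minimal sketch of token reuse, assuming the response shape shown in the example above (access_token, expires_in). TokenCache and fetch_token are hypothetical names: fetch_token stands for whatever function performs the POST to the token endpoint in your environment, and the 60-second safety margin is an arbitrary choice.

```python
import time
from typing import Any, Callable, Dict, Optional


class TokenCache:
    """Caches a client-credentials token and re-requests it shortly before expiry."""

    def __init__(
        self,
        fetch_token: Callable[[], Dict[str, Any]],  # hypothetical: POSTs to the token endpoint
        clock: Callable[[], float] = time.time,
        safety_margin_s: float = 60.0,  # assumed margin, tune to your job length
    ) -> None:
        self._fetch_token = fetch_token
        self._clock = clock
        self._margin = safety_margin_s
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a valid token, requesting a new one when the old one is near expiry."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            response = self._fetch_token()
            self._token = str(response["access_token"])
            self._expires_at = now + float(response["expires_in"])
        return self._token

    def auth_header(self) -> Dict[str, str]:
        """Header to attach to every API request."""
        return {"Authorization": f"Bearer {self.get()}"}
```

Passing the clock and the fetch function in as arguments keeps the cache testable without touching the network.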

List interactions by date range

Call the GET /v1/interactions endpoint with startDate and endDate query parameters. Results are paginated - each response includes an offset or nextPage token to fetch the next page.

GET https://api.callminer.com/v1/interactions?startDate=2025-01-01T00:00:00Z&endDate=2025-02-01T00:00:00Z&limit=100
Authorization: Bearer <access_token>
Content-Type: application/json

The response returns an array of interaction objects with id, channel, agentId, duration, startTime, and associated metadata. Keep paginating until no more results are returned.
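The pagination loop can be sketched as a generator. The field names interactions and nextPage are assumptions (the text above only says the response carries an offset or nextPage token), and get_page is a hypothetical callable that performs one GET /v1/interactions request for a given token.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_interactions(
    get_page: Callable[[Optional[str]], Page],
    items_key: str = "interactions",  # assumed name of the result array
    token_key: str = "nextPage",      # assumed name of the pagination token
) -> Iterator[Dict[str, Any]]:
    """Yield every interaction, following the pagination token until it runs out."""
    token: Optional[str] = None
    while True:
        page = get_page(token)
        items = page.get(items_key) or []
        yield from items
        token = page.get(token_key)
        # Stop when the server sends no token or an empty page.
        if not token or not items:
            break
```

Because it is a generator, callers can process interactions as they arrive instead of holding the full result set in memory.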

Fetch the transcript

For each interaction ID, request the transcript via GET /v1/interactions/{id}/transcript. The response contains an array of utterances, each with a speaker role, timestamp, and text segment.

GET https://api.callminer.com/v1/interactions/INT-20250115-00482/transcript
Authorization: Bearer <access_token>

Each utterance in the response includes speakerRole (agent / customer), startTime, endTime, and text. Reassemble into plain text by concatenating utterances, or preserve the structured format for per-speaker analysis. CallMiner also supports OVTS format for cross-platform interoperability.
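Reassembling utterances into plain text might look like the sketch below. It uses the field names listed above (speakerRole, startTime, text) and assumes startTime is a numeric offset; if your tenant returns ISO timestamps, they sort correctly as strings as long as every utterance has one.

```python
from typing import Any, Dict, List


def utterances_to_text(utterances: List[Dict[str, Any]]) -> str:
    """Render utterances as 'Role: text' lines, ordered by start time."""
    ordered = sorted(utterances, key=lambda u: u.get("startTime", 0))
    lines = []
    for utt in ordered:
        text = (utt.get("text") or "").strip()
        if not text:
            continue  # skip empty segments
        role = (utt.get("speakerRole") or "unknown").capitalize()
        lines.append(f"{role}: {text}")
    return "\n".join(lines)
```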

Handle rate limits and transcript availability

Rate limits

CallMiner enforces per-endpoint rate limits that vary by access tier. When you receive a 429 response, back off using the Retry-After header. For bulk operations, pace requests and persist your pagination token between runs.
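One way to honour Retry-After, sketched with a caller-supplied send function (hypothetical) that returns a status code, headers, and body. The sketch only handles Retry-After given in seconds, looks the header up with that exact capitalisation, and falls back to an assumed 5-second wait when the header is missing.

```python
import time
from typing import Callable, Dict, Tuple

# (status_code, headers, body), as returned by the caller's request function.
Response = Tuple[int, Dict[str, str], bytes]


def call_with_retry_after(
    send: Callable[[], Response],
    sleep: Callable[[float], None] = time.sleep,
    max_attempts: int = 5,
    default_wait_s: float = 5.0,  # assumed fallback when no header is sent
) -> Response:
    """Call send() and, on HTTP 429, wait for Retry-After seconds before retrying."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts:
            break
        raw = headers.get("Retry-After", "")
        # Retry-After may also be an HTTP date; this sketch only handles seconds.
        wait = float(raw) if raw.strip().isdigit() else default_wait_s
        sleep(wait)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```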

Transcript timing

Audio transcripts are not available the instant an interaction ends. CallMiner processes recordings asynchronously - typical lag varies by interaction length and system load. Text-based channels (chat, email) are generally available faster. Build a buffer into your extraction timing or implement a retry with exponential backoff.
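A retry with exponential backoff could be sketched as follows. The fetch callable is hypothetical, the utterances key is an assumption about the transcript payload, and the default delays (60 seconds doubling up to 30 minutes) are placeholders rather than CallMiner guidance.

```python
import time
from typing import Any, Callable, Dict, List, Optional


def backoff_delays(base_s: float, factor: float, attempts: int, cap_s: float) -> List[float]:
    """Delays before each retry: base, base*factor, ... capped at cap_s."""
    return [min(cap_s, base_s * factor ** i) for i in range(attempts)]


def fetch_when_ready(
    fetch: Callable[[], Optional[Dict[str, Any]]],
    sleep: Callable[[float], None] = time.sleep,
    base_s: float = 60.0,
    factor: float = 2.0,
    retries: int = 5,
    cap_s: float = 1800.0,
) -> Optional[Dict[str, Any]]:
    """Return the transcript once it has utterances, or None if it never shows up."""
    delays = backoff_delays(base_s, factor, retries, cap_s)
    for delay in [0.0] + delays:
        if delay:
            sleep(delay)
        transcript = fetch()
        if transcript and transcript.get("utterances"):
            return transcript
    return None
```

Returning None instead of raising lets the caller park the interaction ID and try again on the next scheduled run.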

Patterns

Key Extraction Flows

There are three practical patterns for getting transcripts out of CallMiner. The right choice depends on whether you're doing a one-off migration, running ongoing extraction, or need near real-time processing via CallMiner's real-time API.

Backfill (Historical Export)

One-off migration of past interactions

Define your date range — typically 6–12 months of historical interactions, or all available data if migrating off CallMiner’s native analytics

Call GET /v1/interactions with startDate and endDate parameters. Paginate through the full result set, collecting all interaction IDs

For each interaction ID, fetch the transcript via GET /v1/interactions/{id}/transcript. Pace requests to stay within rate limits

Store each transcript with its interaction metadata (interaction ID, date, agent, channel, disposition) in your data warehouse or object store

Once the backfill completes, run your analysis pipeline against the stored data in bulk

Tip: Persist your pagination token between batches. If the process is interrupted, you can resume from where you left off instead of re-scanning from the start.
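Persisting the pagination token can be as simple as a small JSON file. This is a sketch under assumptions: the file layout and the next_token key are made up for illustration, and a database row or object-store key would serve equally well.

```python
import json
import os
import tempfile
from typing import Optional


def load_checkpoint(path: str) -> Optional[str]:
    """Return the saved pagination token, or None when starting fresh."""
    if not os.path.exists(path):
        return None
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh).get("next_token")


def save_checkpoint(path: str, next_token: Optional[str]) -> None:
    """Write the token atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"next_token": next_token}, fh)
    os.replace(tmp_path, path)  # atomic rename on the same filesystem
```

Save the checkpoint only after a page's transcripts are safely stored, so a resumed run repeats at most one page.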

Incremental Polling

Ongoing extraction on a schedule

Set a cron job or scheduled trigger (hourly, daily, etc.) that runs your extraction script

On each run, call GET /v1/interactions with startDate set to your last successful poll timestamp

Fetch transcripts for any new interaction IDs returned. Use the interaction ID as a deduplication key to avoid reprocessing

Route each transcript and its metadata to your downstream pipeline — analysis tool, warehouse, or automation platform

Update your stored timestamp to the time the current run started (not the time it finished), so interactions that arrive mid-run are picked up by the next poll cycle

Tip: Account for transcript processing delay on audio channels. An interaction that ended 10 minutes ago may not have a transcript yet. Polling with a 1–2 hour lag reduces empty fetches. Text channels are typically available sooner.
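The lagged window and the deduplication step can be sketched like this. Both helpers are hypothetical; the timestamp format mirrors the startDate and endDate values shown earlier, and the sketch assumes timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Iterable, List, Set, Tuple


def poll_window(last_poll: datetime, now: datetime, lag: timedelta) -> Tuple[str, str]:
    """Return (startDate, endDate) as ISO 8601 UTC strings, ending `lag` before now."""
    start = last_poll.astimezone(timezone.utc)
    end = (now - lag).astimezone(timezone.utc)
    if end <= start:
        raise ValueError("nothing to poll yet; the lagged window is empty")
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return start.strftime(fmt), end.strftime(fmt)


def new_only(interactions: Iterable[Dict[str, Any]], seen_ids: Set[str]) -> List[Dict[str, Any]]:
    """Drop interactions whose ID was already processed, and record the new ones."""
    fresh = []
    for item in interactions:
        if item["id"] not in seen_ids:
            seen_ids.add(item["id"])
            fresh.append(item)
    return fresh
```

Storing the returned end value as the next last_poll keeps consecutive windows contiguous, and the seen-ID set covers any overlap.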

Real-Time API

Near real-time on interaction completion

Configure a real-time API endpoint or webhook listener in your CallMiner admin settings. CallMiner fires events when an interaction is processed and the transcript becomes available

When the event fires, parse the payload to extract the interaction ID and metadata

Immediately fetch the transcript via GET /v1/interactions/{id}/transcript using the interaction ID from the event

Route the transcript and metadata downstream — to your analysis pipeline, CRM updater, or automation tool

Note: Real-time API availability depends on your CallMiner plan and access tier. Not all accounts have access to real-time event triggers. Check with your CallMiner account representative for your plan's capabilities.
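A sketch of the event-handling step. The payload field names (interactionId, id) are hypothetical, since the event schema is not documented here; check a real event from your tenant before relying on them.

```python
import json
from typing import Any, Dict, Optional


def interaction_id_from_event(raw_body: bytes) -> Optional[str]:
    """Extract the interaction ID from an event payload, or None if it is unusable."""
    try:
        event: Dict[str, Any] = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return None
    if not isinstance(event, dict):
        return None
    # Hypothetical field names; replace with whatever your tenant's events carry.
    interaction_id = event.get("interactionId") or event.get("id")
    return str(interaction_id) if interaction_id else None


def transcript_path(interaction_id: str) -> str:
    """Build the transcript path used elsewhere in this guide."""
    return f"/v1/interactions/{interaction_id}/transcript"
```

Returning None for unusable payloads lets the listener acknowledge the event and log it rather than crash.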

Automation

Send CallMiner Transcripts to Automation Tools

Once you can extract transcripts from CallMiner, the next step is routing them through Semarize for structured analysis and into your downstream systems. Below are end-to-end example flows - each showing the full pipeline from CallMiner trigger through Semarize evaluation to CRM, Slack, or database output.

Zapier - No-code automation

CallMiner → Zapier → Semarize → CRM

Detect new CallMiner interactions on a schedule, fetch the transcript, send it to Semarize for structured analysis, then write the scored output - signals, flags, and evidence - directly to your CRM.

Example Zap

Trigger: Schedule (Every Hour)

Polls for new CallMiner interactions

App: Schedule by Zapier

Event: Every Hour

Output: triggers extraction flow

Webhooks by Zapier

List new interactions from CallMiner API

Method: GET

URL: https://api.callminer.com/v1/interactions

Auth: Bearer (OAuth token)

Params: startDate={{last_run}}&limit=50

For each interaction

Webhooks by Zapier

Fetch transcript from CallMiner API

Method: GET

URL: https://api.callminer.com/v1/interactions/{{id}}/transcript

Auth: Bearer (OAuth token)

Transcript returned

Webhooks by Zapier

POST /v1/runs (sync) to Semarize

Method: POST

URL: https://api.semarize.com/v1/runs

Auth: Bearer smz_live_...

Body: { kit_code, mode: "sync", input: { transcript } }

Structured output returned

Formatter by Zapier

Extract brick values from Semarize response

Extract: bricks.compliance_score.value

Extract: bricks.empathy_score.value

Extract: bricks.escalation_risk.value

Salesforce - Update Record

Write scored signals to Contact record

Object: Contact

Compliance Score: {{compliance_score}}

Empathy Score: {{empathy_score}}

Escalation Risk: {{escalation_risk}}

Setup steps

Create a new Zap. Choose Schedule by Zapier as the trigger and set it to run every hour. This avoids needing a direct CallMiner trigger integration.

Add a "Webhooks by Zapier" Action (Custom Request) to list new interactions from CallMiner. Set method to GET, URL to https://api.callminer.com/v1/interactions, add your OAuth Bearer token, and pass startDate as a parameter.

Add another "Webhooks by Zapier" Action to fetch the transcript for each interaction. Set method to GET, URL to https://api.callminer.com/v1/interactions/{{id}}/transcript with the Bearer token.

Add a third "Webhooks by Zapier" Action. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your Semarize API key as a Bearer token. In the body, set kit_code to your Kit, mode to "sync", and map the transcript text into input.transcript.

Add a Formatter step to extract individual brick values from the Semarize JSON response — compliance_score, empathy_score, escalation_risk, etc.

Add a Salesforce (or HubSpot, Sheets, etc.) Action to write the extracted scores and signals to your CRM record.

Test each step end-to-end, then turn on the Zap.

Watch out for: Zapier has step data size limits that can truncate very long transcripts. For interactions over 60 minutes, consider storing the transcript in cloud storage and passing a reference URL instead of inline text. Use mode: "sync" so Semarize returns results inline - Zapier doesn't natively support polling loops.
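For teams scripting the same flow outside Zapier, the Semarize request body and the brick extraction might be sketched as below. The body fields (kit_code, mode, input.transcript) and the bricks.<name>.value path come from the steps above; the assumption that bricks sits at the top level of the response is not confirmed here.

```python
import json
from typing import Any, Dict, List


def build_run_body(kit_code: str, transcript: str) -> str:
    """JSON body for POST /v1/runs in sync mode, as described in the steps above."""
    return json.dumps(
        {"kit_code": kit_code, "mode": "sync", "input": {"transcript": transcript}}
    )


def extract_bricks(response: Dict[str, Any], names: List[str]) -> Dict[str, Any]:
    """Pull bricks.<name>.value for each requested brick; missing bricks become None."""
    bricks = response.get("bricks") or {}  # assumed top-level key
    return {name: (bricks.get(name) or {}).get("value") for name in names}
```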

Learn more about Zapier automation

n8n - Self-hosted workflows

CallMiner → n8n → Semarize → Database

Poll CallMiner for new interactions on a schedule, fetch transcripts, send each one to Semarize for analysis, then write the structured scores and signals to your database. n8n's native loop support handles pagination and batch processing.

Example Workflow

Cron - Every Hour

Triggers the workflow on schedule

Mode: Every Hour

Timezone: UTC

HTTP Request - List Interactions

GET /v1/interactions (CallMiner)

Method: GET

URL: https://api.callminer.com/v1/interactions

Auth: Bearer (OAuth token)

Params: startDate={{$now.minus(1, 'hour')}}&limit=100

For each interaction ID

HTTP Request - Fetch Transcript

GET /v1/interactions/{id}/transcript (CallMiner)

URL: https://api.callminer.com/v1/interactions/{{$json.id}}/transcript

Code - Reassemble Transcript

Concatenate utterances into plain text

Join: utterances[].text by speakerRole

HTTP Request - Semarize

POST /v1/runs (sync)

URL: https://api.semarize.com/v1/runs

Auth: Bearer smz_live_...

Body: { kit_code, mode: "sync", input: { transcript } }

Scores & signals returned

Postgres - Insert Row

Write structured output to database

Table: interaction_evaluations

Columns: interaction_id, agent_id, channel, compliance_score, empathy_score

Setup steps

Add a Cron node as the workflow trigger. Set the interval to your desired polling frequency (hourly works well for most contact center volumes).

Add an HTTP Request node to list new interactions from CallMiner. Set method to GET, URL to https://api.callminer.com/v1/interactions, configure OAuth Bearer auth, and set startDate to one interval ago.

Add a Split In Batches node to iterate over the returned interaction IDs. Inside the loop, add an HTTP Request node to fetch each transcript via GET /v1/interactions/{id}/transcript.

Add a Code node (JavaScript) to reassemble the utterances array into a single transcript string. Join each utterance’s text, prefixed by speaker role.

Add another HTTP Request node to send the transcript to Semarize. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your API key as a Bearer token. Set kit_code, mode to "sync", and map the transcript into input.transcript.

Add a Code node to extract the brick values from the Semarize response — compliance_score, empathy_score, escalation_risk, evidence, confidence.

Add a Postgres (or MySQL / HTTP Request) node to write the structured output. Use interaction_id as the primary key for upserts.

Activate the workflow. Monitor the first few runs to verify Semarize responses are arriving and writing correctly.

Watch out for: Use interaction IDs as deduplication keys to prevent reprocessing. You can also use async mode with n8n's native loop - POST /v1/runs (default async), then poll GET /v1/runs/:runId with a Wait + IF loop until status is "succeeded".
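The async polling loop, sketched outside n8n. get_run is a hypothetical callable wrapping GET /v1/runs/:runId; the "succeeded" status comes from the note above, while the "failed" status name, the 2-second interval, and the 30-poll limit are assumptions.

```python
import time
from typing import Any, Callable, Dict


def wait_for_run(
    get_run: Callable[[str], Dict[str, Any]],
    run_id: str,
    sleep: Callable[[float], None] = time.sleep,
    interval_s: float = 2.0,
    max_polls: int = 30,
) -> Dict[str, Any]:
    """Poll a run until its status is "succeeded"; raise on failure or timeout."""
    for _ in range(max_polls):
        run = get_run(run_id)
        status = run.get("status")
        if status == "succeeded":
            return run
        if status == "failed":  # assumed failure status name
            raise RuntimeError(f"run {run_id} failed")
        sleep(interval_s)
    raise TimeoutError(f"run {run_id} not finished after {max_polls} polls")
```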

Learn more about n8n automation

Make - Visual automation with branching

CallMiner → Make → Semarize → CRM + Slack

Fetch new CallMiner transcripts on a schedule, send each to Semarize for structured analysis, then use a Router to branch the scored output - alert on compliance flags via Slack and write all signals to your CRM.

Example Scenario

Schedule - Every 30 min

Triggers the scenario on interval

Interval: 30 minutes

HTTP - List New Interactions

GET /v1/interactions (CallMiner)

Method: GET

Auth: Bearer (OAuth token)

Params: startDate={{formatDate(...)}}&limit=100

HTTP - Fetch Transcript

GET /v1/interactions/{id}/transcript (per interaction)

Iterator: for each interaction in response

URL: /v1/interactions/{{item.id}}/transcript

HTTP - Semarize

POST /v1/runs (sync)

URL: https://api.semarize.com/v1/runs

Auth: Bearer smz_live_...

Body: { kit_code, mode: "sync", input: { transcript } }

Structured output

Router - Branch on Compliance Flag

Route by Semarize output

Branch 1: IF compliance_score < 0.7

Branch 2: ALL (fallthrough)

Branch 1 - Compliance risk

Slack - Alert Channel

Notify team about flagged interaction

Channel: #compliance-alerts

Message: Low compliance on {{interaction_id}}, score: {{score}}

Branch 2 - All interactions

Salesforce - Update Record

Write all scored signals to Contact

Compliance Score: {{compliance_score}}

Empathy Score: {{empathy_score}}

Escalation Risk: {{escalation_risk}}

Setup steps

Create a new Scenario. Add a Schedule module as the trigger, set to your desired interval (15–60 minutes is typical for contact center volumes).

Add an HTTP module to list new interactions from CallMiner. Set method to GET, URL to https://api.callminer.com/v1/interactions, configure OAuth Bearer auth, and filter by startDate since the last run.

Add an Iterator module to loop through each interaction. For each, add an HTTP module to fetch the transcript via GET /v1/interactions/{id}/transcript.

Add another HTTP module to send the transcript to Semarize. Set URL to https://api.semarize.com/v1/runs, add your Bearer token, and set kit_code, mode to "sync", and input.transcript from the previous step. Parse the response as JSON.

Add a Router module. Define Branch 1 with a filter: bricks.compliance_score.value less than 0.7. Leave Branch 2 as a fallthrough (no filter).

On Branch 1, add a Slack module to alert your compliance team when a low score is detected. Map the score, interaction ID, and agent into the message.

On Branch 2, add a Salesforce module to write all brick values (compliance_score, empathy_score, escalation_risk) to the Contact record.

Set the scenario schedule and activate. Monitor the first few runs in Make’s execution log.

Watch out for: Each API call counts as an operation. A scenario processing 50 interactions uses ~150 operations (one list call, then a transcript fetch, a Semarize run, and a CRM write per interaction), plus one for each Slack alert. Use mode: "sync" to avoid needing a polling loop for each run.
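The Router logic above, expressed as a plain function for readers who prefer to script it. The 0.7 threshold, channel name, and message text come from the example scenario; the action dictionaries are a made-up shape for illustration.

```python
from typing import Any, Dict, List

COMPLIANCE_THRESHOLD = 0.7  # same cut-off as the Router filter above


def route(interaction_id: str, scores: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the actions to run: always a CRM update, plus a Slack alert on low compliance."""
    actions: List[Dict[str, Any]] = []
    compliance = scores.get("compliance_score")
    if compliance is not None and compliance < COMPLIANCE_THRESHOLD:
        actions.append({
            "type": "slack",
            "channel": "#compliance-alerts",
            "text": f"Low compliance on {interaction_id}, score: {compliance}",
        })
    # Branch 2 has no filter, so every interaction gets written to the CRM.
    actions.append({"type": "crm_update", "fields": dict(scores)})
    return actions
```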

Learn more about Make automation

What you can build

What You Can Do With CallMiner Data in Semarize

Semarize delivers portable compliance scoring, attrition prediction, consistent omnichannel measurement, and the ability to build your own analytics on structured conversation signals from CallMiner.

Custom Scoring Framework Portability

Compliance on Your Terms

What Semarize generates

framework_version = "TCPA-v2026.1"
disclosure_compliance = 0.94
prohibited_phrases = 0
evidence_linked = true

Your compliance team needs scores that match your exact regulatory framework — updated on your timeline, against your jurisdiction’s requirements. Pull interaction transcripts from CallMiner and run them through your own compliance kit in Semarize. You define the exact disclosure sequences, consent language, and prohibited phrases for your jurisdiction. When regulations change, you update your Semarize kit the same day. The structured output feeds directly into your compliance database. Auditors get evidence-backed scores against your framework, with every violation linked to the exact transcript evidence.

Learn more about QA & Compliance

Compliance Framework Comparison

CallMiner Default: coverage 68%, update cadence quarterly
Your Custom Framework: coverage 94%, update cadence same day

Rules checked:
Disclosure sequence
Consent language (state-specific)
Prohibited phrases (TCPA v2026.1)
Mini-Miranda compliance

4 rules checked · Custom framework covers all · Default misses 2

Agent Attrition Prediction Model

Workforce Intelligence

What Semarize generates

frustration_trend = "rising"
coaching_receptivity = 0.42
attrition_risk = 0.78
early_warning_weeks = 6

Your workforce planning team wants to predict which agents will leave within 90 days. Pull 12 months of transcripts and score every interaction through an agent wellbeing kit. Semarize extracts frustration_frequency, coaching_receptivity, performance_trend_slope, and customer_escalation_rate per agent per month. Feed the structured output into a gradient boosting model. The model identifies that agents with declining coaching_receptivity AND rising frustration_frequency churn within 90 days with 78% accuracy. HR intervenes with targeted support 6 weeks earlier.

Learn more about Data Science

Agent Attrition Risk - 90 Day Window (78% accuracy)

Agent R. Torres: 78% risk, receptivity 0.42, Intervention Triggered
Agent K. Patel: 61% risk, receptivity 0.55, Watch
Agent M. Chen: 34% risk, receptivity 0.71, Healthy
Agent J. Brooks: 82% risk, receptivity 0.38, Intervention Triggered

HR intervenes 6 weeks earlier with targeted support

Omnichannel Experience Consistency

Unified CX Scoring

What Semarize generates

phone_empathy = 0.81
chat_empathy = 0.59
email_empathy = 0.72
gap = 22%

Your contact center handles calls, chats, and emails through CallMiner. CallMiner scores each channel separately with different models. Your CX team needs one consistent score. Pull transcripts from all channels and run them through the same Semarize experience quality kit. Every interaction — regardless of channel — gets scored for empathy_demonstrated, resolution_clarity, effort_reduction, and brand_alignment. A quarterly report shows that chat interactions score 22% lower on empathy than phone calls. The training team builds a chat-specific empathy module and scores normalise within 8 weeks.

Learn more about Customer Success

Omnichannel Experience Consistency (scored by the same Semarize kit)

Channel   Empathy   Resolution   Effort   Brand
Phone     0.81      0.77         0.69     0.74
Chat      0.59      0.72         0.65     0.68
Email     0.72      0.80         0.74     0.71

Chat empathy is 22% lower than phone - training module deployed, scores normalised in 8 weeks

Custom Speech Analytics Data Lake

Structured Pipeline to Snowflake

Vibe-coded

What Semarize generates

daily_interactions = 2,500+
typed_columns = 7
pipeline_latency = "< 5min"
storage = "Snowflake"

A data engineering lead vibe-codes an Airflow pipeline that exports every CallMiner interaction via API, scores it through Semarize, and lands typed rows in Snowflake. Each interaction becomes a row with: agent_id, channel, compliance_score (float), empathy_score (float), resolution_achieved (bool), escalation_risk (float), topic_primary (varchar). dbt models build agent daily scorecards, compliance trend reports, and CSAT prediction features. The BI team builds Tableau dashboards on conversation data that’s queryable, joinable, and fully owned by the organisation.
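The seven typed columns can be sketched as a dataclass that coerces each value before load. InteractionRow and to_row are hypothetical names, agentId and channel follow the interaction fields listed earlier, and the bricks.<name>.value shape is assumed from the automation examples.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class InteractionRow:
    agent_id: str
    channel: str
    compliance_score: float
    empathy_score: float
    resolution_achieved: bool
    escalation_risk: float
    topic_primary: str


def to_row(interaction: Dict[str, Any], bricks: Dict[str, Any]) -> InteractionRow:
    """Coerce interaction metadata and brick values into the seven typed columns."""

    def value(name: str) -> Any:
        return bricks[name]["value"]  # raises KeyError if a brick is missing

    return InteractionRow(
        agent_id=str(interaction["agentId"]),
        channel=str(interaction["channel"]),
        compliance_score=float(value("compliance_score")),
        empathy_score=float(value("empathy_score")),
        resolution_achieved=bool(value("resolution_achieved")),
        escalation_risk=float(value("escalation_risk")),
        topic_primary=str(value("topic_primary")),
    )
```

Failing loudly on a missing brick keeps half-scored rows out of the warehouse; a pipeline that prefers nulls would swap the KeyError for a default.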

Learn more about RevOps

Speech Analytics Data Lake Pipeline (vibe-coded with Airflow)

CallMiner (REST API) → Semarize (Structured JSON) → Snowflake (Typed Rows) → Tableau (Dashboards)

Snowflake schema · 7 typed columns

agent_id (varchar)
channel (varchar)
compliance_score (float)
empathy_score (float)
resolution_achieved (bool)
escalation_risk (float)
topic_primary (varchar)

2,500+ daily interactions · < 5 min latency · Owned by the organisation

Watch out for

Common Challenges & Gotchas

These are the issues that come up most often when teams start extracting transcripts from CallMiner at scale.

Enterprise / partner-gated access

CallMiner API access is not self-serve. You need to work with your account representative or apply through the developer portal. Budget time for provisioning — it can take days to weeks depending on your agreement.

OAuth 2.0 token management

CallMiner uses OAuth 2.0 for authentication. Access tokens expire and must be replaced. If your automation does not request a new token in time, requests will start being rejected once the token TTL passes (typically with 401 Unauthorized), and an unattended job can keep failing unnoticed.

Multi-channel data shape differences

Audio, chat, email, and video interactions return different metadata fields. A pipeline built for audio transcripts may miss fields from chat interactions or break on missing speaker labels in email threads.
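One defensive approach is to normalise every interaction to the same keys before it enters the pipeline. This is a sketch: the common field names are the ones listed for the interactions response earlier, and which fields actually differ per channel in your tenant is an assumption to verify.

```python
from typing import Any, Dict, Optional

COMMON_FIELDS = ("id", "channel", "agentId", "startTime", "duration")


def normalize_interaction(raw: Dict[str, Any]) -> Dict[str, Optional[Any]]:
    """Return a record with the same keys for every channel; absent fields become None."""
    record: Dict[str, Optional[Any]] = {field: raw.get(field) for field in COMMON_FIELDS}
    record["channel"] = (record["channel"] or "unknown").lower()
    # Anything channel-specific is kept, but out of the way of the common columns.
    record["extra"] = {k: v for k, v in raw.items() if k not in COMMON_FIELDS}
    return record
```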

API rate limits

Exceeding rate limits results in throttled responses. Implement exponential backoff and pace bulk operations to avoid hitting ceilings, especially during large historical backfills.

Transcript processing delays

Audio interactions require transcription before data is available via API. Attempting to fetch a transcript too soon after an interaction ends will return empty or incomplete data. Build in a delay or retry mechanism.

Large payload sizes at scale

Contact centers generate thousands of interactions daily. Fetching all interactions in a single request is not feasible. Plan for pagination, batching, and incremental processing from the start.

Duplicate processing protection

Without idempotency checks, re-running an extraction flow can process the same interaction twice. Use interaction IDs as deduplication keys to ensure each transcript is handled exactly once.

Structured signals

Example structured signals from CallMiner interactions

CallMiner's category and score model is configurable but bespoke per deployment. Semarize emits a normalized signal layer on top - same shape regardless of how categories are defined in your tenant - so cross-team reporting stays consistent. Example: one collections interaction.

Raw CallMiner interaction transcript snippet

Agent: I see your balance has been past due for 60 days. Can you tell me what's changed?
Customer: I lost my job last month. I can pay $200 by Friday but not the full amount.
Agent: We can set up a hardship plan. Let me transfer you to the hardship desk.

Structured signal output

{
  "source": {
    "tool": "CallMiner",
    "ref": "callminer_interaction_77ad21"
  },
  "signals": [
    {
      "signal_type": "next_step",
      "value": "Transfer to hardship desk for partial-payment plan",
      "confidence": 0.96
    },
    {
      "signal_type": "risk_flag",
      "value": "Customer financial hardship - default risk if no plan agreed",
      "confidence": 0.91
    },
    {
      "signal_type": "pain_identified",
      "value": "Customer lost employment last month - reduced ability to pay",
      "confidence": 0.93
    },
    {
      "signal_type": "sentiment",
      "value": "anxious",
      "confidence": 0.84
    }
  ]
}
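Consuming this output downstream might look like the sketch below, which keeps only signals at or above a confidence threshold. The function name and the threshold are illustrative; the field names match the example payload above.

```python
import json
from typing import Any, Dict, List


def signals_above(payload: str, min_confidence: float) -> List[Dict[str, Any]]:
    """Parse the structured output and return confident signals, strongest first."""
    doc = json.loads(payload)
    kept = [s for s in doc.get("signals", []) if s.get("confidence", 0.0) >= min_confidence]
    return sorted(kept, key=lambda s: s["confidence"], reverse=True)
```

With the example above and a threshold of 0.9, the sentiment signal (0.84) is dropped and the remaining three are returned in descending confidence order.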


Complete the pipeline

CallMiner owns the speech analytics layer. Semarize signals are warehoused next to the existing categorization output for joined reporting.

Source - you are here

CallMiner

Get the raw interaction data

Destination

Snowflake

Load signals into the warehouse

Explore

Automation

Make

Trigger workflows from the signals

Explore
