Fireflies.ai - How to Get Your Meeting Data

A practical guide to getting your meeting data out of Fireflies.ai - covering GraphQL API access, transcript extraction, batch polling, webhook-triggered flows, and how to route structured data into your downstream systems.

What you'll learn

What meeting data you can extract from Fireflies.ai - transcripts, sentences, speaker labels, audio/video URLs, and meeting metadata
How to access data via the Fireflies GraphQL API - API key authentication, queries, and pagination
Two extraction patterns: batch polling and webhook-triggered processing
How to connect Fireflies data pipelines to Zapier, n8n, and Make
Advanced use cases - meeting intelligence, deal tracking, speaker analysis, and custom dashboards

Data

What Data You Can Extract From Fireflies.ai

Fireflies.ai is an AI meeting assistant that joins your calls, records them, and generates transcripts automatically. Every meeting produces a rich set of structured data that can be extracted via the GraphQL API - the full transcript, per-sentence speaker labels, timing metadata, audio and video URLs, and contextual meeting information.

Common fields teams care about

Full transcript text (raw_text)

Per-sentence speaker labels (speaker_name)

Sentence-level timestamps (start_time, end_time)

Meeting title and organizer

Meeting date, time, and duration

Participant list and emails

Audio download URL

Video download URL (Business+ plans)

AI-generated summary and action items

Meeting source (Zoom, Meet, Teams, etc.)

API Access

How to Get Transcripts via the Fireflies GraphQL API

Fireflies exposes meeting data through a GraphQL API at api.fireflies.ai/graphql. The workflow is: authenticate with an API key, query transcripts with filters, then extract the fields you need from the response.

Authenticate

Fireflies uses API key authentication. Generate your API key from app.fireflies.ai/integrations/custom/fireflies. Pass it as a Bearer token in the Authorization header on every request.

Authorization: Bearer <your_fireflies_api_key>
Content-Type: application/json

Your API key is tied to your user account and inherits your permissions. On Free/Pro plans you get 50 API requests per day. Business+ plans have higher or unlimited limits.
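As a rough illustration of the authentication step, the sketch below builds an authenticated POST request with Python's standard library. The helper names `build_graphql_request` and `run_query` are hypothetical (not part of any Fireflies SDK); only the endpoint URL and the two headers come from the text above.

```python
import json
import urllib.request

FIREFLIES_URL = "https://api.fireflies.ai/graphql"


def build_graphql_request(api_key, query, variables=None):
    """Build (but do not send) a POST request for the Fireflies GraphQL endpoint."""
    payload = {"query": query}
    if variables:
        payload["variables"] = variables
    return urllib.request.Request(
        FIREFLIES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_query(api_key, query, variables=None, timeout=30):
    """Send the request and return the decoded JSON response (one API request)."""
    request = build_graphql_request(api_key, query, variables)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it easy to inspect or log the payload without spending one of your daily requests.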

List transcripts

Use the transcripts query to list meetings. You can filter by date, organizer email, or other fields. Results are returned as a paginated array.

POST https://api.fireflies.ai/graphql

{
  "query": "query {
    transcripts {
      id
      title
      date
      duration
      organizer_email
      participants
    }
  }"
}

The response returns an array of transcript objects with id, title, date, duration, and participant information. Use the IDs to fetch detailed transcript content in the next step.
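Because results are paginated, a listing loop usually needs to walk pages until one comes back short. The sketch below assumes the `transcripts` query accepts `limit` and `skip` arguments; treat those argument names as assumptions and confirm them against the Fireflies schema. `iter_transcripts` and `fetch_page` are hypothetical names: `fetch_page(query, variables)` stands in for whatever function sends the request and returns the parsed response.

```python
LIST_QUERY = """
query Transcripts($limit: Int, $skip: Int) {
  transcripts(limit: $limit, skip: $skip) {
    id
    title
    date
  }
}
"""


def iter_transcripts(fetch_page, page_size=50, max_pages=None):
    """Yield transcript summaries page by page.

    fetch_page(query, variables) must return the parsed GraphQL response dict.
    Each page costs one API request, so max_pages caps quota usage.
    """
    page = 0
    while max_pages is None or page < max_pages:
        variables = {"limit": page_size, "skip": page * page_size}
        response = fetch_page(LIST_QUERY, variables)
        if response.get("errors"):
            raise RuntimeError(f"GraphQL errors: {response['errors']}")
        items = (response.get("data") or {}).get("transcripts") or []
        yield from items
        if len(items) < page_size:
            break  # a short page means there is nothing more to fetch
        page += 1
```

Passing the fetch function in as a parameter keeps the pagination logic testable without network access.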

Fetch the full transcript

For each transcript ID, query the transcript field with the specific fields you need. The sentences array gives you per-sentence granularity with speaker labels.

POST https://api.fireflies.ai/graphql

{
  "query": "query Transcript($id: String!) {
    transcript(id: $id) {
      id
      title
      date
      duration
      organizer_email
      participants
      audio_url
      video_url
      sentences {
        speaker_name
        text
        start_time
        end_time
      }
    }
  }",
  "variables": {
    "id": "abc123def456"
  }
}

Each object in the sentences array includes speaker_name, text, start_time, and end_time. Concatenate sentence text for a plain transcript, or preserve the structured format for per-speaker analysis.
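The two options described above (plain transcript versus per-speaker structure) can be sketched as follows. Both functions are illustrative helpers, not Fireflies code; the `"Unknown"` fallback label is an assumption, and the talk-time totals are in whatever unit the API uses for `start_time` and `end_time`.

```python
def to_plain_text(sentences):
    """Join sentences into 'Speaker: text' lines, merging consecutive turns."""
    lines = []
    last_speaker = object()  # sentinel that never equals a real speaker name
    for sentence in sentences:
        speaker = sentence.get("speaker_name") or "Unknown"
        text = (sentence.get("text") or "").strip()
        if not text:
            continue
        if speaker == last_speaker:
            lines[-1] += " " + text
        else:
            lines.append(f"{speaker}: {text}")
            last_speaker = speaker
    return "\n".join(lines)


def talk_time_by_speaker(sentences):
    """Sum (end_time - start_time) per speaker, in the API's time units."""
    totals = {}
    for sentence in sentences:
        speaker = sentence.get("speaker_name") or "Unknown"
        duration = float(sentence["end_time"]) - float(sentence["start_time"])
        totals[speaker] = totals.get(speaker, 0.0) + max(duration, 0.0)
    return totals
```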

Handle rate limits and media URLs

Rate limits

Fireflies enforces daily request limits that vary by plan. Free and Pro plans allow 50 requests/day. Business+ plans offer higher or unlimited limits. Each GraphQL query counts as one request regardless of how many fields you request. Plan your extraction to stay within limits, especially during backfills.

Media URL expiration

The audio_url and video_url fields return time-limited download links that expire after approximately 24 hours. If your workflow needs the audio or video files, download them immediately upon retrieval - don't store the URL for later use.
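One way to act on this is to stream the file to your own storage in the same job that fetched the URL. The following is a minimal sketch using only the standard library; `download_media` is a hypothetical helper, and the temporary `.part` suffix is just a convention chosen here so that a half-finished download is never mistaken for a complete file.

```python
import shutil
import urllib.request
from pathlib import Path


def download_media(url, dest_path, timeout=60):
    """Stream a time-limited media URL to disk; return the number of bytes written."""
    dest = Path(dest_path)
    dest.parent.mkdir(parents=True, exist_ok=True)
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url, timeout=timeout) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)  # copies in chunks, not all in memory
    tmp.replace(dest)  # rename only after the download completed
    return dest.stat().st_size
```

Persist the local path (or your object-store key) alongside the transcript ID rather than the original URL.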

Patterns

Key Extraction Flows

There are two practical patterns for getting transcripts out of Fireflies.ai. The right choice depends on whether you're running a one-off migration or ongoing batch polling, or need near real-time processing via webhooks.

Batch Polling (Backfill & Ongoing)

Scheduled extraction of transcripts

Set up a scheduled trigger (daily or hourly) that runs your extraction script. For historical backfills, run in daily batches to stay within API rate limits

Query the transcripts endpoint via GraphQL to list recent meetings. Filter by date to fetch only new transcripts since your last poll

For each transcript ID returned, fetch the full transcript including sentences, speaker labels, and metadata. Each fetch counts as one API request

Store each transcript with its metadata (transcript ID, date, participants, duration) in your data warehouse or object store

Route stored transcripts to your analysis pipeline - Semarize for structured scoring, your CRM for enrichment, or a dashboard for reporting

Tip: On Free/Pro plans with 50 requests/day, listing transcripts uses 1 request plus 1 per transcript detail fetch. Batch your daily extraction to stay under the limit - fetch the list first, then prioritize which transcripts to fully retrieve.
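The arithmetic in the tip can be made explicit. The sketch below assumes a single list request per day (a long backfill may need more than one list page, which would reduce the budget further); both function names are hypothetical.

```python
import math


def plan_daily_fetches(transcript_ids, daily_limit=50, list_requests=1):
    """Split IDs into today's batch and the remainder, given the daily quota."""
    budget = max(daily_limit - list_requests, 0)
    return transcript_ids[:budget], transcript_ids[budget:]


def backfill_days(total_transcripts, daily_limit=50, list_requests=1):
    """Days needed for a backfill when each day spends list_requests on listing."""
    per_day = daily_limit - list_requests
    if per_day <= 0:
        raise ValueError("daily_limit must exceed list_requests")
    return math.ceil(total_transcripts / per_day)
```

With the 50 requests/day limit and one list call, 49 detail fetches fit in a day, so a 300-transcript backfill takes 7 days. Sort the ID list by priority before calling `plan_daily_fetches` so the most important meetings are retrieved first.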

Webhook-Triggered

Near real-time on transcription completion

Register a webhook endpoint in your Fireflies settings or use the Zapier "Transcription Complete" trigger. Fireflies fires an event when a meeting transcript is ready

When the webhook fires, parse the event payload to extract the transcript ID and basic meeting metadata

Immediately fetch the full transcript via the GraphQL API using the transcript ID from the event payload

Route the transcript and metadata downstream - to Semarize for structured analysis, your CRM for enrichment, or Slack for notifications

Note: Fireflies has native Zapier integration with a "Transcription Complete" trigger, making webhook-based flows easy to set up without custom webhook infrastructure. Each transcript fetch still counts against your daily API quota.
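If you run your own webhook endpoint instead of Zapier, the parsing step might look like the sketch below. Two things here are assumptions rather than facts from this guide: that the payload carries the transcript ID in a `meetingId` field, and that requests are signed with an HMAC-SHA256 hex digest of the raw body. Check the Fireflies webhook documentation for the actual payload shape and signature header before relying on either.

```python
import hashlib
import hmac
import json


def verify_signature(raw_body, signature_header, secret):
    """Check an HMAC-SHA256 hex signature of the raw request body (assumed scheme)."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = signature_header.split("=", 1)[-1]  # tolerate a "sha256=" prefix
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


def transcript_id_from_event(raw_body):
    """Return the transcript ID from a webhook body, or None if it is not usable."""
    try:
        event = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return None
    if not isinstance(event, dict):
        return None
    return event.get("meetingId") or None  # "meetingId" is an assumed field name
```

Returning `None` instead of raising lets the endpoint acknowledge unexpected payloads quickly and skip the follow-up API call.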

Automation

Send Fireflies Transcripts to Automation Tools

Once you can extract transcripts from Fireflies, the next step is routing them through Semarize for structured analysis and into your downstream systems. Below are end-to-end example flows - each showing the full pipeline from Fireflies trigger through Semarize evaluation to CRM, Slack, or database output.

Zapier - No-code automation

Fireflies → Zapier → Semarize → CRM

Detect new Fireflies transcriptions, fetch the full transcript, send it to Semarize for structured analysis, then write the scored output - signals, flags, and evidence - directly to your CRM.

Example Zap

Trigger: Transcription Complete

Fires when Fireflies finishes a transcript

App: Fireflies.ai

Event: Transcription Complete

Output: transcript_id, title, date

Webhooks by Zapier

Fetch full transcript from Fireflies API

Method: POST

URL: https://api.fireflies.ai/graphql

Auth: Bearer <api_key>

Body: { query: "{ transcript(id: \"{{id}}\") { sentences { speaker_name text } } }" }

Transcript returned

Webhooks by Zapier

POST /v1/runs (sync) to Semarize

Method: POST

URL: https://api.semarize.com/v1/runs

Auth: Bearer smz_live_...

Body: { kit_code, mode: "sync", input: { transcript } }

Structured output returned

Formatter by Zapier

Extract brick values from Semarize response

Extract: bricks.overall_score.value

Extract: bricks.risk_flag.value

Extract: bricks.action_items.value

Salesforce - Update Record

Write scored signals to Opportunity

Object: Opportunity

AI Score: {{overall_score}}

Risk Flag: {{risk_flag}}

Action Items: {{action_items}}

Setup steps

Create a new Zap. Choose Fireflies.ai as the trigger app and select "Transcription Complete" as the event. Connect your Fireflies account.

Add a "Webhooks by Zapier" Action (Custom Request) to fetch the full transcript from Fireflies. Set method to POST, URL to https://api.fireflies.ai/graphql, add your Bearer token, and pass a GraphQL query for the transcript ID from the trigger.

Add a Code step or Formatter to concatenate the sentences array into a plain text transcript. Join each sentence's text, prefixed by speaker_name.

Add a second "Webhooks by Zapier" Action. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your Semarize API key as a Bearer token. In the body, set kit_code to your Kit, mode to "sync", and map the transcript text into input.transcript.

Add a Formatter step to extract individual brick values from the Semarize JSON response - overall_score, risk_flag, action_items, etc.

Add a Salesforce (or HubSpot, Sheets, etc.) Action to write the extracted scores and signals to your CRM record. Test each step end-to-end, then turn on the Zap.

Watch out for: Zapier has step data size limits that can truncate very long transcripts. For meetings over 60 minutes, consider storing the transcript in cloud storage and passing a reference URL instead of inline text. Use mode: "sync" so Semarize returns results inline - Zapier doesn't natively support polling loops.
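For teams that prefer a script to a Zap, the Semarize call and the brick extraction can be sketched in plain Python. The URL, the body fields (`kit_code`, `mode`, `input.transcript`), and the `bricks.<name>.value` response paths are taken from the steps above; the helper names and the choice to return `None` for a missing brick are assumptions.

```python
import json
import urllib.request

SEMARIZE_RUNS_URL = "https://api.semarize.com/v1/runs"


def build_semarize_request(api_key, kit_code, transcript, mode="sync"):
    """Build (but do not send) the POST /v1/runs request described above."""
    payload = {"kit_code": kit_code, "mode": mode, "input": {"transcript": transcript}}
    return urllib.request.Request(
        SEMARIZE_RUNS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_bricks(response, names):
    """Pull bricks.<name>.value for each requested name; missing bricks become None."""
    bricks = response.get("bricks") or {}
    return {name: (bricks.get(name) or {}).get("value") for name in names}
```

For example, `extract_bricks(result, ["overall_score", "risk_flag", "action_items"])` yields a flat dict that maps directly onto CRM fields.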

n8n - Self-hosted workflows

Fireflies → n8n → Semarize → Database

Poll Fireflies for new transcripts on a schedule, fetch each one via GraphQL, send to Semarize for analysis, then write the structured scores and signals to your database. n8n's native loop support handles pagination and batch processing.

Example Workflow

Cron - Every Hour

Triggers the workflow on schedule

Mode: Every Hour

Timezone: UTC

HTTP Request - List Transcripts

POST GraphQL query (Fireflies)

Method: POST

URL: https://api.fireflies.ai/graphql

Auth: Bearer <api_key>

Body: { query: "{ transcripts { id title date } }" }

For each transcript ID

HTTP Request - Fetch Transcript

GraphQL transcript query (Fireflies)

Body: { query: "query($id: String!) { transcript(id: $id) { sentences { speaker_name text } } }", variables: { id: "{{ $json.id }}" } }

Code - Reassemble Transcript

Concatenate sentences into plain text

Join: sentences[].speaker_name + text

HTTP Request - Semarize

POST /v1/runs (sync)

URL: https://api.semarize.com/v1/runs

Auth: Bearer smz_live_...

Body: { kit_code, mode: "sync", input: { transcript } }

Scores & signals returned

Postgres - Insert Row

Write structured output to database

Table: meeting_evaluations

Columns: transcript_id, score, risk_flag, action_items

Setup steps

Add a Cron node as the workflow trigger. Set the interval to your desired polling frequency (hourly works well for most teams, but daily may be better for Free/Pro plans to conserve API requests).

Add an HTTP Request node to list new transcripts from Fireflies. Set method to POST, URL to https://api.fireflies.ai/graphql, configure Bearer auth, and send a GraphQL query for recent transcripts.

Add a Code node to filter results to only transcripts newer than your last successful poll. Store the last poll timestamp in workflow static data or an external store.

Add a Split In Batches node to iterate over the returned transcript IDs. Inside the loop, add an HTTP Request node to fetch each full transcript via GraphQL.

Add a Code node (JavaScript) to reassemble the sentences array into a single transcript string. Join each sentence's text, prefixed by speaker_name.

Add another HTTP Request node to send the transcript to Semarize. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your API key as a Bearer token. Set kit_code, mode to "sync", and map the transcript into input.transcript.

Add a Code node to extract the brick values from the Semarize response - overall_score, risk_flag, action_items, evidence, confidence.

Add a Postgres (or MySQL / HTTP Request) node to write the structured output. Use transcript_id as the primary key for upserts. Activate the workflow and monitor the first few runs.
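The upsert in the last step can be illustrated with a small runnable sketch. It uses SQLite from the Python standard library as a stand-in for Postgres (the `ON CONFLICT ... DO UPDATE` clause works similarly in both), with the table and column names from the example workflow above. The function names and column types are assumptions.

```python
import sqlite3


def init_db(conn):
    """Create the evaluations table keyed by transcript_id."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS meeting_evaluations (
            transcript_id TEXT PRIMARY KEY,
            score REAL,
            risk_flag INTEGER,
            action_items TEXT
        )
        """
    )


def upsert_evaluation(conn, transcript_id, score, risk_flag, action_items):
    """Insert a row, or overwrite the existing row for the same transcript_id."""
    conn.execute(
        """
        INSERT INTO meeting_evaluations (transcript_id, score, risk_flag, action_items)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(transcript_id) DO UPDATE SET
            score = excluded.score,
            risk_flag = excluded.risk_flag,
            action_items = excluded.action_items
        """,
        (transcript_id, score, int(bool(risk_flag)), action_items),
    )
    conn.commit()
```

Because the write is idempotent per transcript ID, re-running the workflow after a partial failure does not create duplicate rows.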

Watch out for: Use transcript IDs as deduplication keys to prevent reprocessing. On Free/Pro plans, the 50 requests/day limit means you need to be strategic - listing transcripts costs 1 request, plus 1 per detail fetch. You can also use async mode with n8n's native loop - POST /v1/runs (default async), then poll GET /v1/runs/:runId with a Wait + IF loop until status is "succeeded".
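The async Wait + IF loop described above has the following shape when written as code. This is a sketch, not Semarize client code: `get_run` is a hypothetical callable that performs GET /v1/runs/:runId and returns the parsed JSON, and the `"failed"` and `"cancelled"` statuses are assumed names for terminal failure states (only `"succeeded"` is named in this guide).

```python
import time


def wait_for_run(get_run, run_id, interval_seconds=5.0, max_attempts=60, sleep=time.sleep):
    """Poll get_run(run_id) until the run succeeds, fails, or attempts run out."""
    for attempt in range(max_attempts):
        run = get_run(run_id)
        status = run.get("status")
        if status == "succeeded":
            return run
        if status in ("failed", "cancelled"):  # assumed failure status names
            raise RuntimeError(f"run {run_id} ended with status {status!r}")
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before the next poll
    raise TimeoutError(f"run {run_id} not finished after {max_attempts} polls")
```

Capping the number of attempts keeps a stuck run from blocking the rest of the batch.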

Make - Visual automation with branching

Fireflies → Make → Semarize → CRM + Slack

Fetch new Fireflies transcripts on a schedule, send each to Semarize for structured analysis, then use a Router to branch the scored output - alert on risk flags via Slack and write all signals to your CRM.

Example Scenario

Schedule - Every 30 min

Triggers the scenario on interval

Interval: 30 minutes

HTTP - List Transcripts

POST GraphQL query (Fireflies)

Method: POST

URL: https://api.fireflies.ai/graphql

Auth: Bearer <api_key>

Body: { query: "{ transcripts { id title date } }" }

HTTP - Fetch Transcript

GraphQL transcript query (per meeting)

Iterator: for each transcript in response

Body: { query: "{ transcript(id: \"{{item.id}}\") { sentences { speaker_name text } } }" }

HTTP - Semarize

POST /v1/runs (sync)

URL: https://api.semarize.com/v1/runs

Auth: Bearer smz_live_...

Body: { kit_code, mode: "sync", input: { transcript } }

Structured output

Router - Branch on Risk Flag

Route by Semarize output

Branch 1: IF risk_flag.value = true

Branch 2: ALL (fallthrough)

Branch 1 - Risk detected

Slack - Alert Channel

Notify team about flagged meeting

Channel: #meeting-alerts

Message: Risk on {{title}}, score: {{score}}

Branch 2 - All meetings

Salesforce - Update Record

Write all scored signals to Opportunity

AI Score: {{overall_score}}

Risk Flag: {{risk_flag}}

Action Items: {{action_items}}

Setup steps

Create a new Scenario. Add a Schedule module as the trigger, set to your desired interval (15-60 minutes is typical).

Add an HTTP module to list new transcripts from Fireflies. Set method to POST, URL to https://api.fireflies.ai/graphql, configure Bearer auth, and send a GraphQL query for recent transcripts.

Add an Iterator module to loop through each transcript. For each, add an HTTP module to fetch the full transcript via GraphQL with the transcript ID.

Add a Text Aggregator or Tools module to concatenate the sentences array into plain text. Join speaker_name and text for each sentence.

Add another HTTP module to send the transcript to Semarize. Set URL to https://api.semarize.com/v1/runs, add your Bearer token, and set kit_code, mode to "sync", and input.transcript from the previous step. Parse the response as JSON.

Add a Router module. Define Branch 1 with a filter: bricks.risk_flag.value equals true. Leave Branch 2 as a fallthrough (no filter).

On Branch 1, add a Slack module to alert your team when risk is detected. Map the score, risk flag, and meeting title into the message.

On Branch 2, add a Salesforce module to write all brick values (score, risk_flag, action_items) to the Opportunity record. Set the scenario schedule and activate.

Watch out for: Each API call counts as a Make operation. A scenario processing 20 transcripts uses ~60 operations (list + transcript + Semarize per meeting). On Fireflies Free/Pro plans, the 50 requests/day limit is the real bottleneck - not Make operations. Use mode: "sync" to avoid needing a polling loop for each run.
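The Router logic in this scenario is simple enough to express in a few lines, which can help when you later port the flow to code. In this sketch the route names `"slack"` and `"salesforce"` are illustrative labels, and the response shape (`bricks.risk_flag.value`) follows the filter defined in the setup steps.

```python
def select_routes(result):
    """Mirror the Router: Slack only on risk, CRM for every meeting."""
    bricks = result.get("bricks") or {}
    risk_detected = (bricks.get("risk_flag") or {}).get("value") is True
    routes = []
    if risk_detected:
        routes.append("slack")  # Branch 1: filter on risk_flag.value = true
    routes.append("salesforce")  # Branch 2: fallthrough, no filter
    return routes
```

The strict `is True` check means a missing or null risk flag never triggers an alert, which matches a filter of "equals true".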

What you can build

What You Can Do With Fireflies Data in Semarize

Custom scoring frameworks, multi-meeting deal tracking, speaker performance benchmarking, and building your own tools on structured meeting signals.

Pipeline Hygiene - Qualification Signal Extraction

BANT Coverage from Actual Conversations

What Semarize generates

budget_confirmed = false
authority_identified = true
need_articulated = true
timeline_stated = false

Fireflies records and transcribes your meetings - but it cannot extract structured qualification signals from the conversation content. Reps self-report BANT fields in the CRM, and nobody verifies whether those signals were actually discussed. Semarize analyzes every meeting transcript to detect which qualification signals - budget, authority, need, timeline - were genuinely present in the conversation. After processing deals in Proposal Sent stage, the data reveals that 2 of 4 deals have no budget confirmation from any meeting transcript, even though the CRM shows budget as "confirmed." RevOps catches pipeline risk weeks earlier because the evidence comes from what was actually said, not what reps entered.

Pipeline Hygiene - BANT Signal Coverage (4 deals in Proposal Sent)

[Table: Budget, Authority, Need, and Timeline coverage per deal for Acme Corp, Globex Inc, Initech Ltd, and Soylent Co]

2 of 4 deals have no budget confirmation from any meeting transcript

Meeting Outcome Accountability Scoring

Decision Quality & Follow-Through

What Semarize generates

committed_decisions = 3
owner_assigned_rate = 0.45
decision_follow_through = 0.62
accountability_score = 0.54

Your company runs product, engineering, and customer success meetings - all recorded. Leadership wants to know which meetings actually produce outcomes with accountability. Run every meeting through an outcome accountability kit. Semarize scores each for committed_decisions (decisions with explicit owners and deadlines), owner_assignment_rate, decision_evidence_quality (was the decision grounded in data?), and follow_through_score (did the previous meeting's decisions get referenced?). A quarterly report shows that engineering standups produce 3.2 committed decisions per hour but CS team meetings produce 0.8 - and only 45% of CS decisions have assigned owners. The CS director restructures meetings around decision-forcing frameworks with explicit ownership. Accountability scores improve 2x within a month.

Meeting Effectiveness - Q4 Report (scored from Fireflies transcripts)

Eng. Standup: Decisions/hr 3.2, Action clarity 82%

CS Team (flagged: Scope drift): Decisions/hr 0.8, Action clarity 45%

Product Review: Decisions/hr 2.1, Action clarity 71%

CS meetings restructured with decision-forcing framework - scores improved 2x in one month

Stale Battlecard Detection

Competitive Intel Currency Scoring

What Semarize generates

battlecard_claim_current = false
positioning_used = "approved"
missed_differentiator = "security_compliance"
stale_claims_flagged = 2

Competitive battlecards get updated quarterly - but are reps actually using the current version on calls? Run a knowledge-grounded kit against your latest competitive intelligence docs on every sales meeting. Semarize checks each competitive claim against the current battlecard: is the competitor pricing they quoted still accurate? Did they use the approved positioning statement? Did they miss a key differentiator? After 300 meetings, the data shows 2 battlecard sections are cited incorrectly in 40% of competitive conversations. Product marketing updates those sections and measures adoption the following week.

Competitive Landscape - 6 Month Analysis (sales conversations)

Competitor A: 42 mentions, +6% win rate. Topics: Pricing 35%, Features 50%, Support 15%

Competitor B: 67 mentions, -12% win rate. Topics: Pricing 65%, Features 20%, Support 15%

Competitor C: 28 mentions, +3% win rate. Topics: Pricing 20%, Features 55%, Support 25%

Competitor B's “unlimited seats” pricing cited in 28% of lost deals - TCO calculator improved win rate 19%

Custom Meeting ROI Calculator

Cost-per-Decision Analysis

Vibe-coded

What Semarize generates

cost_per_decision = $14,200
meeting_roi = 0.34
annual_meeting_cost = $2.1M
top_meeting_type = "deal_review"

A COO vibe-codes a React app that calculates the actual cost of meetings by combining Semarize scores from Fireflies transcripts with calendar and compensation data. Each meeting gets a decision_density score, a projected_revenue_impact (based on deal signals extracted), and a cost_per_decision (meeting duration × attendee hourly cost ÷ decisions made). The app reveals that the company spends $2.1M/year on meetings, but only 34% produce actionable decisions. “Status update” meetings cost $14,200 per decision vs. $1,800 for deal review meetings. The executive team cuts 30% of status meetings and redirects time to deal reviews.
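The cost-per-decision formula quoted above (meeting duration × attendee hourly cost ÷ decisions made) can be sketched as follows. This reading sums the hourly cost across attendees, which is an interpretation of the formula rather than something the example app is known to do; the function names and the choice to return `None` when no decisions were made are likewise assumptions.

```python
def meeting_cost(duration_hours, attendee_hourly_costs):
    """Total cost of a meeting: duration times the combined hourly cost of attendees."""
    return duration_hours * sum(attendee_hourly_costs)


def cost_per_decision(duration_hours, attendee_hourly_costs, decisions_made):
    """Meeting cost divided by decisions made; None when no decisions were made."""
    if decisions_made <= 0:
        return None  # avoid dividing by zero for decision-free meetings
    return meeting_cost(duration_hours, attendee_hourly_costs) / decisions_made
```

For instance, a 1.5-hour meeting with three attendees costing 100, 80, and 120 per hour costs 450 in total; with 3 decisions made, that is 150 per decision.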

Meeting ROI Dashboard (vibe-coded with React)

Annual cost: $2.1M

Produce decisions: 34%

Savings opportunity: 30%

Cost per decision and time allocation by meeting type

Status Update: $14,200 per decision, 38% of time

Deal Review: $1,800 per decision, 22% of time

Sprint Planning: $3,100 per decision, 18% of time

1:1 Coaching: $2,400 per decision, 12% of time

Other: $5,600 per decision, 10% of time

Cutting 30% of status meetings redirects $240K/yr to high-ROI deal reviews

Watch out for

Common Challenges & Gotchas

These are the issues that come up most often when teams start extracting transcripts from Fireflies.ai at scale.

Daily API request limits

Free and Pro plans are capped at 50 API requests per day. If you need to backfill hundreds of transcripts, you'll burn through the limit quickly. Plan your extraction in daily batches or upgrade to a Business plan for higher limits.

Time-limited media URLs

Audio and video download URLs returned by the API expire after approximately 24 hours. If your pipeline fetches a URL but doesn't download the file immediately, the link will be dead when you try to use it later. Always download media assets right away.

Video access requires Business plan

The video_url field is only populated for accounts on Business plans or higher. If you're on a Free or Pro plan and your workflow depends on video access, the field will be null. Plan your pipeline around audio-only or transcript-only processing if needed.

GraphQL query complexity

Unlike REST APIs, the Fireflies API uses GraphQL. If your team isn't familiar with GraphQL syntax, the learning curve can slow down initial setup. Structure your queries carefully - requesting too many nested fields can also slow down response times.

Speaker identification accuracy

Speaker labels depend on Fireflies correctly mapping participants from calendar invites and platform integrations. Unregistered guests, phone dial-ins, or unnamed participants can appear as generic labels. Validate speaker names before relying on them for per-speaker analysis.

Transcript processing delay

Transcripts are not available instantly after a meeting ends. Processing typically takes 5 to 15 minutes but can be longer during peak hours. If your automation triggers immediately on meeting end, it may fetch incomplete or unavailable data. Build in a retry with delay.
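A retry with delay can be as simple as the sketch below. `fetch` is a hypothetical callable that takes a transcript ID and returns the parsed transcript (or `None`); the delay schedule in seconds is illustrative, and treating an empty `sentences` array as "not ready yet" is an assumption about how an unfinished transcript appears.

```python
import time


def fetch_when_ready(fetch, transcript_id, delays=(60, 120, 300, 600), sleep=time.sleep):
    """Try immediately, then retry after each delay until sentences are present."""
    for delay in (0, *delays):
        if delay:
            sleep(delay)
        transcript = fetch(transcript_id)
        if transcript and transcript.get("sentences"):
            return transcript
    return None  # still not ready after all retries
```

Every attempt is an API request, so on Free/Pro plans a short schedule with longer gaps is kinder to the daily quota than frequent polling.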
