Fireflies.ai - How to Get Your Meeting Data

A practical guide to getting your meeting data out of Fireflies.ai - covering GraphQL API access, transcript extraction, batch polling, webhook-triggered flows, and how to route structured data into your downstream systems.

What you'll learn

  • What meeting data you can extract from Fireflies.ai - transcripts, sentences, speaker labels, audio/video URLs, and meeting metadata
  • How to access data via the Fireflies GraphQL API - API key authentication, queries, and pagination
  • Two extraction patterns: batch polling and webhook-triggered processing
  • How to connect Fireflies data pipelines to Zapier, n8n, and Make
  • Advanced use cases - meeting intelligence, deal tracking, speaker analysis, and custom dashboards

Data

What Data You Can Extract From Fireflies.ai

Fireflies.ai is an AI meeting assistant that joins your calls, records them, and generates transcripts automatically. Every meeting produces a rich set of structured data that can be extracted via the GraphQL API - the full transcript, per-sentence speaker labels, timing metadata, audio and video URLs, and contextual meeting information.

Common fields teams care about

Full transcript text (raw_text)
Per-sentence speaker labels (speaker_name)
Sentence-level timestamps (start_time, end_time)
Meeting title and organizer
Meeting date, time, and duration
Participant list and emails
Audio download URL
Video download URL (Business+ plans)
AI-generated summary and action items
Meeting source (Zoom, Meet, Teams, etc.)

API Access

How to Get Transcripts via the Fireflies GraphQL API

Fireflies exposes meeting data through a GraphQL API at api.fireflies.ai/graphql. The workflow is: authenticate with an API key, query transcripts with filters, then extract the fields you need from the response.

1

Authenticate

Fireflies uses API key authentication. Generate your API key from app.fireflies.ai/integrations/custom/fireflies. Pass it as a Bearer token in the Authorization header on every request.

Authorization: Bearer <your_fireflies_api_key>
Content-Type: application/json
Your API key is tied to your user account and inherits your permissions. On Free/Pro plans you get 50 API requests per day. Business+ plans have higher or unlimited limits.
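As a rough illustration of the headers above, the following Python sketch builds a GraphQL POST request using only the standard library. The helper names (build_graphql_request, run_query) are hypothetical and not part of any Fireflies SDK; the endpoint and header format are taken from this guide.

```python
import json
import urllib.request

FIREFLIES_URL = "https://api.fireflies.ai/graphql"


def build_graphql_request(api_key, query, variables=None):
    """Build (but do not send) a POST request for the Fireflies GraphQL endpoint."""
    payload = {"query": query}
    if variables:
        payload["variables"] = variables
    return urllib.request.Request(
        FIREFLIES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_query(api_key, query, variables=None):
    """Send the request and return the decoded JSON response.

    Each call counts against the daily request quota described above.
    """
    request = build_graphql_request(api_key, query, variables)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the headers and body without spending an API request.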
2

List transcripts

Use the transcripts query to list meetings. You can filter by date, organizer email, or other fields. Results are returned as a paginated array.

POST https://api.fireflies.ai/graphql

{
  "query": "query { transcripts { id title date duration organizer_email participants } }"
}

The response returns an array of transcript objects with id, title, date, duration, and participant information. Use the IDs to fetch detailed transcript content in the next step.
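Because results are paginated, a backfill usually needs a loop. The sketch below assumes the transcripts query accepts limit and skip arguments; those argument names are an assumption here and should be checked against the current Fireflies schema. The fetch_page callable is a hypothetical stand-in for whatever function sends the request and returns parsed JSON.

```python
LIST_QUERY = """
query ListTranscripts($limit: Int, $skip: Int) {
  transcripts(limit: $limit, skip: $skip) {
    id
    title
    date
    duration
    organizer_email
    participants
  }
}
"""


def iter_transcripts(fetch_page, page_size=50):
    """Yield transcript summaries one page at a time.

    fetch_page(query, variables) is supplied by the caller and must return
    the parsed JSON response. Every page fetched costs one API request.
    """
    skip = 0
    while True:
        response = fetch_page(LIST_QUERY, {"limit": page_size, "skip": skip})
        page = (response.get("data") or {}).get("transcripts") or []
        yield from page
        if len(page) < page_size:
            break  # a short page means there is nothing left to fetch
        skip += page_size
```

Since the generator is lazy, a caller can stop iterating early (for example once it reaches transcripts it has already stored) without paying for further pages.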

3

Fetch the full transcript

For each transcript ID, query the transcript field with the specific fields you need. The sentences array gives you per-sentence granularity with speaker labels.

POST https://api.fireflies.ai/graphql

{
  "query": "query Transcript($id: String!) { transcript(id: $id) { id title date duration organizer_email participants audio_url video_url sentences { speaker_name text start_time end_time } } }",
  "variables": {
    "id": "abc123def456"
  }
}

Each object in the sentences array includes speaker_name, text, start_time, and end_time. Concatenate sentence text for a plain transcript, or preserve the structured format for per-speaker analysis.
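A minimal sketch of both options, assuming each sentence is a dict with the four fields listed above. The function names are illustrative, and the "Unknown" fallback for a missing speaker label is an assumption rather than documented Fireflies behaviour.

```python
from collections import defaultdict


def to_plain_text(sentences):
    """Join sentences into a plain transcript, one 'Speaker: text' line each."""
    return "\n".join(
        f"{s.get('speaker_name') or 'Unknown'}: {s['text']}" for s in sentences
    )


def talk_time_by_speaker(sentences):
    """Sum (end_time - start_time) per speaker, in whatever unit the API returns."""
    totals = defaultdict(float)
    for s in sentences:
        speaker = s.get("speaker_name") or "Unknown"
        totals[speaker] += float(s["end_time"]) - float(s["start_time"])
    return dict(totals)
```

The plain-text form suits tools that expect a single string, while the per-speaker totals are one example of analysis that needs the structured format.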

4

Handle rate limits and media URLs

Rate limits

Fireflies enforces daily request limits that vary by plan. Free and Pro plans allow 50 requests/day. Business+ plans offer higher or unlimited limits. Each GraphQL query counts as one request regardless of how many fields you request. Plan your extraction to stay within limits, especially during backfills.

Media URL expiration

The audio_url and video_url fields return time-limited download links that expire after approximately 24 hours. If your workflow needs the audio or video files, download them immediately upon retrieval - don't store the URL for later use.
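One way to follow that advice is to copy the file to your own storage as soon as the URL is returned. This is a sketch using only the standard library; download_media is a hypothetical helper, and it assumes the link can be fetched with a plain GET and no extra headers.

```python
import shutil
import urllib.request
from pathlib import Path


def download_media(url, destination):
    """Stream a time-limited media URL to a local file and return its path."""
    destination = Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=60) as response:
        with destination.open("wb") as out:
            # copyfileobj streams in chunks, so large recordings are not
            # loaded into memory all at once.
            shutil.copyfileobj(response, out)
    return destination
```

Storing the local path (or an object-store key) alongside the transcript ID avoids any later dependence on the expiring link.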

Patterns

Key Extraction Flows

There are two practical patterns for getting transcripts out of Fireflies.ai. The right choice depends on whether you need scheduled batch polling (for a one-off backfill or ongoing extraction) or near real-time processing via webhooks.

Batch Polling (Backfill & Ongoing)

Scheduled extraction of transcripts

1

Set up a scheduled trigger (daily or hourly) that runs your extraction script. For historical backfills, run in daily batches to stay within API rate limits

2

Query the transcripts endpoint via GraphQL to list recent meetings. Filter by date to fetch only new transcripts since your last poll

3

For each transcript ID returned, fetch the full transcript including sentences, speaker labels, and metadata. Each fetch counts as one API request

4

Store each transcript with its metadata (transcript ID, date, participants, duration) in your data warehouse or object store

5

Route stored transcripts to your analysis pipeline - Semarize for structured scoring, your CRM for enrichment, or a dashboard for reporting

Tip: On Free/Pro plans with 50 requests/day, listing transcripts uses 1 request plus 1 per transcript detail fetch. Batch your daily extraction to stay under the limit - fetch the list first, then prioritize which transcripts to fully retrieve.
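The budgeting in the tip above can be sketched as a small planning function. The name plan_daily_fetch and its parameters are illustrative; the default of 50 requests per day and the cost of one request per list call come from this guide.

```python
def plan_daily_fetch(listed_ids, already_stored, daily_limit=50, list_requests=1):
    """Split listed transcript IDs into (fetch_today, deferred).

    already_stored holds IDs fetched on earlier runs, so they are skipped.
    list_requests is the number of requests already spent on listing.
    """
    budget = max(daily_limit - list_requests, 0)
    new_ids = [tid for tid in listed_ids if tid not in already_stored]
    return new_ids[:budget], new_ids[budget:]
```

With 60 listed transcripts, 5 of them already stored, and the default limit, this leaves 49 detail fetches for today and defers the remaining 6 to the next run.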

Webhook-Triggered

Near real-time on transcription completion

1

Register a webhook endpoint in your Fireflies settings or use the Zapier "Transcription Complete" trigger. Fireflies fires an event when a meeting transcript is ready

2

When the webhook fires, parse the event payload to extract the transcript ID and basic meeting metadata

3

Immediately fetch the full transcript via the GraphQL API using the transcript ID from the event payload

4

Route the transcript and metadata downstream - to Semarize for structured analysis, your CRM for enrichment, or Slack for notifications

Note: Fireflies has native Zapier integration with a "Transcription Complete" trigger, making webhook-based flows easy to set up without custom webhook infrastructure. Each transcript fetch still counts against your daily API quota.
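For a custom endpoint, the payload-parsing step might look like the sketch below. The field name meetingId is an assumption about the webhook payload and is exposed as a parameter so it can be changed; confirm the actual field names against a real event before relying on it.

```python
import json


def transcript_id_from_event(raw_body, id_field="meetingId"):
    """Return the transcript ID from a webhook body, or None if it is unusable.

    id_field is an assumed payload key, not a confirmed one.
    """
    try:
        event = json.loads(raw_body)
    except (TypeError, ValueError):
        return None  # not JSON at all
    if not isinstance(event, dict):
        return None  # valid JSON but not an object
    value = event.get(id_field)
    return value if isinstance(value, str) and value else None
```

Returning None instead of raising lets the handler acknowledge malformed requests without crashing, and the caller only spends an API request when an ID is actually present.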

Automation

Send Fireflies Transcripts to Automation Tools

Once you can extract transcripts from Fireflies, the next step is routing them through Semarize for structured analysis and into your downstream systems. Below are end-to-end example flows - each showing the full pipeline from Fireflies trigger through Semarize evaluation to CRM, Slack, or database output.

Zapier - No-code automation

Fireflies → Zapier → Semarize → CRM

Detect new Fireflies transcriptions, fetch the full transcript, send it to Semarize for structured analysis, then write the scored output - signals, flags, and evidence - directly to your CRM.

Example Zap
Trigger: Transcription Complete
Fires when Fireflies finishes a transcript
App: Fireflies.ai
Event: Transcription Complete
Output: transcript_id, title, date
Webhooks by Zapier
Fetch full transcript from Fireflies API
Method: POST
URL: https://api.fireflies.ai/graphql
Auth: Bearer <api_key>
Body: { query: "{ transcript(id: \"{{id}}\") { sentences { speaker_name text } } }" }
Transcript returned
Webhooks by Zapier
POST /v1/runs (sync) to Semarize
Method: POST
URL: https://api.semarize.com/v1/runs
Auth: Bearer smz_live_...
Body: { kit_code, mode: "sync", input: { transcript } }
Structured output returned
Formatter by Zapier
Extract brick values from Semarize response
Extract: bricks.overall_score.value
Extract: bricks.risk_flag.value
Extract: bricks.action_items.value
Salesforce - Update Record
Write scored signals to Opportunity
Object: Opportunity
AI Score: {{overall_score}}
Risk Flag: {{risk_flag}}
Action Items: {{action_items}}

Setup steps

1

Create a new Zap. Choose Fireflies.ai as the trigger app and select "Transcription Complete" as the event. Connect your Fireflies account.

2

Add a "Webhooks by Zapier" Action (Custom Request) to fetch the full transcript from Fireflies. Set method to POST, URL to https://api.fireflies.ai/graphql, add your Bearer token, and pass a GraphQL query for the transcript ID from the trigger.

3

Add a Code step or Formatter to concatenate the sentences array into a plain text transcript. Join each sentence's text, prefixed by speaker_name.

4

Add a second "Webhooks by Zapier" Action. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your Semarize API key as a Bearer token. In the body, set kit_code to your Kit, mode to "sync", and map the transcript text into input.transcript.

5

Add a Formatter step to extract individual brick values from the Semarize JSON response - overall_score, risk_flag, action_items, etc.

6

Add a Salesforce (or HubSpot, Sheets, etc.) Action to write the extracted scores and signals to your CRM record. Test each step end-to-end, then turn on the Zap.

Watch out for: Zapier has step data size limits that can truncate very long transcripts. For meetings over 60 minutes, consider storing the transcript in cloud storage and passing a reference URL instead of inline text. Use mode: "sync" so Semarize returns results inline - Zapier doesn't natively support polling loops.
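The Semarize request body and the brick extraction in the flow above can be sketched in Python as follows. The response shape (a top-level bricks object whose entries carry a value key) is inferred from the paths shown in this guide, such as bricks.overall_score.value, and should be treated as an assumption; the function names are illustrative.

```python
def build_run_payload(kit_code, transcript):
    """Build the JSON body for POST /v1/runs in sync mode."""
    return {
        "kit_code": kit_code,
        "mode": "sync",
        "input": {"transcript": transcript},
    }


def extract_bricks(response, names=("overall_score", "risk_flag", "action_items")):
    """Pull bricks.<name>.value for each requested brick; missing bricks map to None."""
    bricks = response.get("bricks") or {}
    return {name: (bricks.get(name) or {}).get("value") for name in names}
```

Mapping missing bricks to None keeps the downstream CRM step from failing when a Kit does not emit every field.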
n8n - Self-hosted workflows

Fireflies → n8n → Semarize → Database

Poll Fireflies for new transcripts on a schedule, fetch each one via GraphQL, send to Semarize for analysis, then write the structured scores and signals to your database. n8n's native loop support handles pagination and batch processing.

Example Workflow
Cron - Every Hour
Triggers the workflow on schedule
Mode: Every Hour
Timezone: UTC
HTTP Request - List Transcripts
POST GraphQL query (Fireflies)
Method: POST
URL: https://api.fireflies.ai/graphql
Auth: Bearer <api_key>
Body: { query: "{ transcripts { id title date } }" }
For each transcript ID
HTTP Request - Fetch Transcript
GraphQL transcript query (Fireflies)
Body: { query: "query($id: String!) { transcript(id: $id) { sentences { speaker_name text } } }", variables: { id } }
Code - Reassemble Transcript
Concatenate sentences into plain text
Join: sentences[].speaker_name + text
HTTP Request - Semarize
POST /v1/runs (sync)
URL: https://api.semarize.com/v1/runs
Auth: Bearer smz_live_...
Body: { kit_code, mode: "sync", input: { transcript } }
Scores & signals returned
Postgres - Insert Row
Write structured output to database
Table: meeting_evaluations
Columns: transcript_id, score, risk_flag, action_items

Setup steps

1

Add a Cron node as the workflow trigger. Set the interval to your desired polling frequency (hourly works well for most teams, but daily may be better for Free/Pro plans to conserve API requests).

2

Add an HTTP Request node to list new transcripts from Fireflies. Set method to POST, URL to https://api.fireflies.ai/graphql, configure Bearer auth, and send a GraphQL query for recent transcripts.

3

Add a Code node to filter results to only transcripts newer than your last successful poll. Store the last poll timestamp in the workflow's static data or in an external store.

4

Add a Split In Batches node to iterate over the returned transcript IDs. Inside the loop, add an HTTP Request node to fetch each full transcript via GraphQL.

5

Add a Code node (JavaScript) to reassemble the sentences array into a single transcript string. Join each sentence's text, prefixed by speaker_name.

6

Add another HTTP Request node to send the transcript to Semarize. Set method to POST, URL to https://api.semarize.com/v1/runs. Add your API key as a Bearer token. Set kit_code, mode to "sync", and map the transcript into input.transcript.

7

Add a Code node to extract the brick values from the Semarize response - overall_score, risk_flag, action_items, evidence, confidence.

8

Add a Postgres (or MySQL / HTTP Request) node to write the structured output. Use transcript_id as the primary key for upserts. Activate the workflow and monitor the first few runs.
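To illustrate the upsert in the final step, here is a sketch that uses SQLite from the Python standard library in place of Postgres so that it runs without a server. The table and column names follow the example workflow above; the column types are assumptions. The ON CONFLICT clause is also accepted by Postgres, although its drivers typically use different parameter placeholders.

```python
import sqlite3

UPSERT_SQL = """
INSERT INTO meeting_evaluations (transcript_id, score, risk_flag, action_items)
VALUES (?, ?, ?, ?)
ON CONFLICT(transcript_id) DO UPDATE SET
    score = excluded.score,
    risk_flag = excluded.risk_flag,
    action_items = excluded.action_items
"""


def open_store(path=":memory:"):
    """Open the database and create the table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS meeting_evaluations ("
        "transcript_id TEXT PRIMARY KEY, score REAL, "
        "risk_flag INTEGER, action_items TEXT)"
    )
    return conn


def upsert_evaluation(conn, transcript_id, score, risk_flag, action_items):
    """Insert a row, or overwrite the existing row for the same transcript_id."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            UPSERT_SQL,
            (transcript_id, score, int(bool(risk_flag)), action_items),
        )
```

Because the primary key is the transcript ID, rerunning the workflow on the same meeting updates the existing row rather than creating a duplicate.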

Watch out for: Use transcript IDs as deduplication keys to prevent reprocessing. On Free/Pro plans, the 50 requests/day limit means you need to be strategic - listing transcripts costs 1 request, plus 1 per detail fetch. You can also use async mode with n8n's native loop - POST /v1/runs (default async), then poll GET /v1/runs/:runId with a Wait + IF loop until status is "succeeded".
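The async pattern described above (create a run, then poll until it succeeds) can be sketched outside n8n like this. The get_run callable is a hypothetical stand-in for a GET /v1/runs/:runId request; the "succeeded" status comes from this guide, while the "failed" status is an assumption.

```python
import time


def wait_for_run(get_run, run_id, interval_seconds=2.0, max_attempts=30,
                 sleep=time.sleep):
    """Poll get_run(run_id) until the run reports status 'succeeded'.

    get_run must return the run as a dict with a 'status' key.
    """
    for _ in range(max_attempts):
        run = get_run(run_id)
        status = run.get("status")
        if status == "succeeded":
            return run
        if status == "failed":  # assumed terminal status
            raise RuntimeError(f"run {run_id} failed")
        sleep(interval_seconds)
    raise TimeoutError(f"run {run_id} not finished after {max_attempts} polls")
```

The max_attempts cap plays the same role as the exit condition on the Wait + IF loop: without it, a stuck run would keep the workflow polling indefinitely.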
Make - Visual automation with branching

Fireflies → Make → Semarize → CRM + Slack

Fetch new Fireflies transcripts on a schedule, send each to Semarize for structured analysis, then use a Router to branch the scored output - alert on risk flags via Slack and write all signals to your CRM.

Example Scenario
Schedule - Every 30 min
Triggers the scenario on interval
Interval: 30 minutes
HTTP - List Transcripts
POST GraphQL query (Fireflies)
Method: POST
URL: https://api.fireflies.ai/graphql
Auth: Bearer <api_key>
Body: { query: "{ transcripts { id title date } }" }
HTTP - Fetch Transcript
GraphQL transcript query (per meeting)
Iterator: for each transcript in response
Body: { query: "{ transcript(id: \"{{item.id}}\") { sentences { speaker_name text } } }" }
HTTP - Semarize
POST /v1/runs (sync)
URL: https://api.semarize.com/v1/runs
Auth: Bearer smz_live_...
Body: { kit_code, mode: "sync", input: { transcript } }
Structured output
Router - Branch on Risk Flag
Route by Semarize output
Branch 1: IF risk_flag.value = true
Branch 2: ALL (fallthrough)
Branch 1 - Risk detected
Slack - Alert Channel
Notify team about flagged meeting
Channel: #meeting-alerts
Message: Risk on {{title}}, score: {{score}}
Branch 2 - All meetings
Salesforce - Update Record
Write all scored signals to Opportunity
AI Score: {{overall_score}}
Risk Flag: {{risk_flag}}
Action Items: {{action_items}}

Setup steps

1

Create a new Scenario. Add a Schedule module as the trigger, set to your desired interval (15-60 minutes is typical).

2

Add an HTTP module to list new transcripts from Fireflies. Set method to POST, URL to https://api.fireflies.ai/graphql, configure Bearer auth, and send a GraphQL query for recent transcripts.

3

Add an Iterator module to loop through each transcript. For each, add an HTTP module to fetch the full transcript via GraphQL with the transcript ID.

4

Add a Text Aggregator or Tools module to concatenate the sentences array into plain text. Join speaker_name and text for each sentence.

5

Add another HTTP module to send the transcript to Semarize. Set URL to https://api.semarize.com/v1/runs, add your Bearer token, and set kit_code, mode to "sync", and input.transcript from the previous step. Parse the response as JSON.

6

Add a Router module. Define Branch 1 with a filter: bricks.risk_flag.value equals true. Leave Branch 2 as a fallthrough (no filter).

7

On Branch 1, add a Slack module to alert your team when risk is detected. Map the score, risk flag, and meeting title into the message.

8

On Branch 2, add a Salesforce module to write all brick values (score, risk_flag, action_items) to the Opportunity record. Set the scenario schedule and activate.

Watch out for: Each API call counts as a Make operation. A scenario processing 20 transcripts uses ~60 operations (list + transcript + Semarize per meeting). On Fireflies Free/Pro plans, the 50 requests/day limit is the real bottleneck - not Make operations. Use mode: "sync" to avoid needing a polling loop for each run.
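The Router logic in this scenario is small enough to express as a function, which can help when testing the branching rule before building it in Make. This is a sketch: the destination labels "slack" and "crm" are illustrative, and the bricks.risk_flag.value path is taken from the filter described in the setup steps.

```python
def route(result):
    """Return the destinations for one Semarize result.

    Mirrors the scenario: every meeting goes to the CRM (Branch 2), and
    meetings with risk_flag.value == true also trigger a Slack alert (Branch 1).
    """
    risk_flag = ((result.get("bricks") or {}).get("risk_flag") or {}).get("value")
    destinations = []
    if risk_flag is True:
        destinations.append("slack")
    destinations.append("crm")
    return destinations
```

Checking for the literal True value (rather than any truthy value) means a missing or malformed risk_flag never triggers an alert.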

What you can build

What You Can Do With Fireflies Data in Semarize

Custom scoring frameworks, multi-meeting deal tracking, speaker performance benchmarking, and building your own tools on structured meeting signals.

Feature Claim Verification at Scale

Source-of-Truth Grounded QA

What Semarize generates

feature_claim_accurate = false
integration_overstated = true
pricing_tier_correct = true
knowledge_gap_area = "api_limits"

Your team runs sales meetings every day. Feature claims, integration capabilities, and pricing references get stated on every call — but are they accurate? Run a knowledge-grounded kit against your product documentation on every meeting. Semarize verifies each feature claim, integration capability statement, and pricing reference against the source of truth. After 200 meetings, the data shows reps consistently overstate API rate limits and misquote the enterprise tier’s SSO configuration. Product marketing gets a weekly accuracy report targeting the exact knowledge gaps — messaging corrections happen within days, not after the next QBR.

[Figure: Pipeline Hygiene - BANT Signal Coverage, 4 deals in Proposal Sent. Columns: Deal, Budget, Auth., Need, Timeline. Rows: Acme Corp, Globex Inc, Initech Ltd, Soylent Co. Caption: 2 of 4 deals have no budget confirmation from any meeting transcript.]

Meeting Outcome Accountability Scoring

Decision Quality & Follow-Through

What Semarize generates

committed_decisions = 3
owner_assigned_rate = 0.45
decision_follow_through = 0.62
accountability_score = 0.54

Your company runs product, engineering, and customer success meetings — all recorded. Leadership wants to know which meetings actually produce outcomes with accountability. Run every meeting through an outcome accountability kit. Semarize scores each for committed_decisions (decisions with explicit owners and deadlines), owner_assignment_rate, decision_evidence_quality (was the decision grounded in data?), and follow_through_score (did previous meeting’s decisions get referenced?). A quarterly report shows that engineering standups produce 3.2 committed decisions per hour but CS team meetings produce 0.8 — and only 45% of CS decisions have assigned owners. The CS director restructures meetings around decision-forcing frameworks with explicit ownership. Accountability scores improve 2x within a month.

[Figure: Meeting Effectiveness - Q4 Report, scored from Fireflies transcripts]
Eng. Standup: 3.2 decisions/hr, 82% action clarity
CS Team (flagged: scope drift): 0.8 decisions/hr, 45% action clarity
Product Review: 2.1 decisions/hr, 71% action clarity
Caption: CS meetings restructured with decision-forcing framework - scores improved 2x in one month

Stale Battlecard Detection

Competitive Intel Currency Scoring

What Semarize generates

battlecard_claim_current = false
positioning_used = "approved"
missed_differentiator = "security_compliance"
stale_claims_flagged = 2

Competitive battlecards get updated quarterly — but are reps actually using the current version on calls? Run a knowledge-grounded kit against your latest competitive intelligence docs on every sales meeting. Semarize checks each competitive claim against the current battlecard: is the competitor pricing they quoted still accurate? Did they use the approved positioning statement? Did they miss a key differentiator? After 300 meetings, the data shows 2 battlecard sections are cited incorrectly in 40% of competitive conversations. Product marketing updates those sections and measures adoption the following week.

[Figure: Competitive Landscape - 6 Month Analysis, sales conversations]
Competitor A: 42 mentions, +6% win rate. Topics: Pricing 35%, Features 50%, Support 15%
Competitor B: 67 mentions, -12% win rate. Topics: Pricing 65%, Features 20%, Support 15%
Competitor C: 28 mentions, +3% win rate. Topics: Pricing 20%, Features 55%, Support 25%
Caption: Competitor B's “unlimited seats” pricing cited in 28% of lost deals - TCO calculator improved win rate 19%

Custom Meeting ROI Calculator

Cost-per-Decision Analysis

Vibe-coded

What Semarize generates

cost_per_decision = $14,200
meeting_roi = 0.34
annual_meeting_cost = $2.1M
top_meeting_type = "deal_review"

A COO vibe-codes a React app that calculates the actual cost of meetings by combining Semarize scores from Fireflies transcripts with calendar and compensation data. Each meeting gets a decision_density score, a projected_revenue_impact (based on deal signals extracted), and a cost_per_decision (meeting duration × attendee hourly cost ÷ decisions made). The app reveals that the company spends $2.1M/year on meetings, but only 34% produce actionable decisions. “Status update” meetings cost $14,200 per decision vs. $1,800 for deal review meetings. The executive team cuts 30% of status meetings and redirects time to deal reviews.
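The cost_per_decision formula in the paragraph above (meeting duration × attendee hourly cost ÷ decisions made) can be sketched as follows. It assumes that "attendee hourly cost" means the combined hourly cost of everyone in the meeting; the figures in the usage note are made up for illustration and do not come from the dashboard.

```python
def cost_per_decision(duration_hours, attendee_hourly_costs, decisions_made):
    """Meeting duration x combined attendee hourly cost / decisions made.

    Returns None when no decisions were made, since the ratio is undefined.
    """
    if decisions_made <= 0:
        return None
    meeting_cost = duration_hours * sum(attendee_hourly_costs)
    return meeting_cost / decisions_made
```

For example, a 1.5 hour meeting with three attendees costing 120, 95, and 85 per hour has a total cost of 1.5 × 300 = 450; with 3 decisions made, that works out to 150 per decision.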

[Figure: Meeting ROI Dashboard, vibe-coded with React]
Annual cost: $2.1M
Produce decisions: 34%
Savings opportunity: 30%
Cost per decision by meeting type (the percentages appear to show time allocation):
Status Update: $14,200 (38%)
Deal Review: $1,800 (22%)
Sprint Planning: $3,100 (18%)
1:1 Coaching: $2,400 (12%)
Other: $5,600 (10%)
Caption: Cutting 30% of status meetings redirects $240K/yr to high-ROI deal reviews

Watch out for

Common Challenges & Gotchas

These are the issues that come up most often when teams start extracting transcripts from Fireflies.ai at scale.

Daily API request limits

Free and Pro plans are capped at 50 API requests per day. If you need to backfill hundreds of transcripts, you'll burn through the limit quickly. Plan your extraction in daily batches or upgrade to a Business plan for higher limits.

Time-limited media URLs

Audio and video download URLs returned by the API expire after approximately 24 hours. If your pipeline fetches a URL but doesn't download the file immediately, the link will be dead when you try to use it later. Always download media assets right away.

Video access requires Business plan

The video_url field is only populated for accounts on Business plans or higher. If you're on a Free or Pro plan and your workflow depends on video access, the field will be null. Plan your pipeline around audio-only or transcript-only processing if needed.

GraphQL query complexity

Unlike REST APIs, the Fireflies API uses GraphQL. If your team isn't familiar with GraphQL syntax, the learning curve can slow down initial setup. Structure your queries carefully - requesting too many nested fields can also slow down response times.

Speaker identification accuracy

Speaker labels depend on Fireflies correctly mapping participants from calendar invites and platform integrations. Unregistered guests, phone dial-ins, or unnamed participants can appear as generic labels. Validate speaker names before relying on them for per-speaker analysis.

Transcript processing delay

Transcripts are not available instantly after a meeting ends. Processing typically takes 5 to 15 minutes but can be longer during peak hours. If your automation triggers immediately on meeting end, it may fetch incomplete or unavailable data. Build in a retry with delay.
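A retry with delay might look like the sketch below. The fetch callable is a hypothetical stand-in for the GraphQL transcript query, and the default delays (in seconds) are illustrative values chosen to cover the typical 5 to 15 minute processing window mentioned above.

```python
import time


def fetch_when_ready(fetch, transcript_id, delays=(300, 300, 600), sleep=time.sleep):
    """Try immediately, then retry after each delay until sentences are present.

    fetch(transcript_id) must return the transcript dict, or None if it is
    not available yet. Returns None if the transcript never becomes ready.
    """
    for delay in (0, *delays):
        if delay:
            sleep(delay)
        transcript = fetch(transcript_id)
        if transcript and transcript.get("sentences"):
            return transcript
    return None
```

Each attempt costs one API request, so on Free/Pro plans a short list of long delays is usually a better fit than frequent polling.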
