Overview

The transcript dataset provides access to high-accuracy transcripts from regular earnings calls and unique events. Each transcript is associated with an event and updated in the dataset when available.
If we have audio in English for an event, we will provide a transcript.

How it works

Transcript data is provided in JSON format and is derived from our live transcripts. When a live event finishes, the transcript is processed and augmented with confidence scores, paragraph breaks, and more to enhance readability and usability. The transcript is then made available for download and analysis through the API shortly after the event concludes.

Chapters

Use the Chapters endpoint to retrieve the structured segments, or chapters, associated with a transcript. Querying with the appropriate identifier returns a paginated list of chapter objects in the data array. Each chapter object includes a title, a start timestamp, and an end timestamp, which together define a distinct section of the content. See the API reference for more details.
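As an illustration of working with chapter objects, here is a minimal Python sketch. It assumes each response page is a dict with a data array, and that each chapter uses the keys title, start, and end with numeric timestamps in seconds. Those key names and units are assumptions made for this example, not confirmed field names, so check the API reference before relying on them.

```python
def collect_chapters(pages):
    """Flatten the data arrays of several response pages into one sorted list.

    `pages` is any iterable of already-decoded response bodies (dicts).
    The "data", "start" and "end" keys are assumed names for this sketch.
    """
    chapters = []
    for page in pages:
        chapters.extend(page.get("data", []))
    return sorted(chapters, key=lambda chapter: chapter["start"])


def chapter_at(chapters, timestamp):
    """Return the chapter whose [start, end) interval contains `timestamp`."""
    for chapter in chapters:
        if chapter["start"] <= timestamp < chapter["end"]:
            return chapter
    return None


# Hypothetical pages, shaped like the description above.
pages = [
    {"data": [{"title": "Q&A", "start": 600, "end": 1800}]},
    {"data": [{"title": "Prepared remarks", "start": 0, "end": 600}]},
]
chapters = collect_chapters(pages)
print([chapter["title"] for chapter in chapters])
print(chapter_at(chapters, 750)["title"])
```

How the next page is requested is deliberately left out, since the pagination parameters are documented in the API reference rather than here.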

How to access the data

The API allows you to query the transcripts dataset using a variety of parameters, such as ticker, ISIN, date, and more. You can also use the API to retrieve a list of all transcripts available in the dataset. For more information on how to use the API, see the API reference below.
If you have a subscription to all documents we offer, you can retrieve all documents in a single request using the Document endpoints.
The data is served through a global CDN, ensuring low latency and high availability. The URL is unique to each file and customer. Do not share it with others or alter it in any way, as doing so may result in a loss of access.

API Reference

Explore the Transcripts API endpoints.

Speaker identification

Effective mid-April 2025, all new transcripts for regular earnings events include speaker identification, which lets you determine who is speaking at any given time during an event. Older transcripts will not be retroactively updated. There are two types of transcripts available in the API:
  • Standard transcripts (typeId = 15): These do not include speaker identification.
  • In-house transcripts (typeId = 22): These include speaker identification.
There are no breaking changes when switching from Standard to In-house transcripts. However, if your application filters results by typeId, make sure to include typeId 22 in addition to 15, to retrieve the in-house transcripts. See the Data structure section for more details on the JSON structure of both transcript types.
It’s not always possible to identify who is speaking. In such cases, fields might be null in the JSON file. This can happen for various reasons, such as:
  • The speaker is not clearly identified in the audio.
  • The speaker’s voice is not recognized.
  • The speaker is not part of the event’s official roster.
  • We are unable to verify the speaker’s identity or role.
We strive to provide speaker identification across all transcripts as part of our ongoing commitment to quality and accuracy. During periods of high activity—such as earnings season—some events may be prioritized for speaker attribution based on factors like client interest and market relevance. As a result, speaker data may appear on certain transcripts sooner than others. We’re continually enhancing our processes to expand speaker coverage and ensure timely delivery across all events.
To ensure you always have the most up-to-date and accurate data, we recommend querying the API for both transcript types. This allows you to check whether speaker identification is present in the in-house transcript. If it is, use that version; if not, fall back to the standard transcript. You can retrieve both types by specifying the query parameter typeIds=15,22 in your API request. This will return both transcript types for the specified event.
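The fallback logic described above could be sketched in Python as follows. Only typeIds=15,22 and the two type IDs come from this page. The eventIds parameter name, the typeId key on each returned document, and the has_speakers check are assumptions for illustration. Because this page does not specify how missing speaker data appears, the check is passed in by the caller.

```python
from urllib.parse import urlencode

STANDARD_TYPE_ID = 15   # standard transcript, no speaker identification
IN_HOUSE_TYPE_ID = 22   # in-house transcript, with speaker identification


def build_query(event_id):
    """Build a query string requesting both transcript types for one event.

    "eventIds" is a hypothetical parameter name; "typeIds=15,22" is from the text.
    """
    params = {
        "eventIds": event_id,
        "typeIds": f"{STANDARD_TYPE_ID},{IN_HOUSE_TYPE_ID}",
    }
    return urlencode(params, safe=",")  # keep the comma unescaped


def pick_transcript(documents, has_speakers):
    """Prefer the in-house transcript when it carries speaker data.

    `documents` is a list of dicts with an assumed "typeId" key.
    `has_speakers` is a caller-supplied predicate on the in-house document.
    """
    by_type = {doc["typeId"]: doc for doc in documents}
    in_house = by_type.get(IN_HOUSE_TYPE_ID)
    if in_house is not None and has_speakers(in_house):
        return in_house
    # Fall back to the standard transcript, or to whatever exists.
    return by_type.get(STANDARD_TYPE_ID, in_house)


docs = [{"typeId": 15, "id": "std"}, {"typeId": 22, "id": "inh"}]
print(build_query(123456))
print(pick_transcript(docs, lambda doc: True)["id"])
print(pick_transcript(docs, lambda doc: False)["id"])
```

Passing the speaker check as a function keeps the selection logic independent of the exact JSON fields, which are described in the Data structure section.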
If you cannot find a transcript for a specific event, check the following:
  • Make sure the event is a regular earnings event (Earnings Call) in the API.
  • Make sure you are on the v3 API.
  • Make sure you have access to the transcript dataset.
  • Check whether the event took place recently. If so, the transcript may not be available yet.
If you have checked all of the above and still cannot find the transcript, please contact your Quartr API representative for assistance.

Data structure

Transcripts are provided as JSON files, enabling easy access and analysis of recorded event content. Below is an example snippet of a transcript JSON file.
{
  "version": "1.0",
  "event_id": 123456,
  "company_id": 123,
  "transcript": {
    "text": "This is the full transcript text",
    "number_of_speakers": 3,
    "paragraphs": [
      {
        "text": "This is the paragraph text",
        "start": 0, // Start time in seconds from start
        "end": 10, // End time in seconds from start
        "speaker": 0, // Zero-based index of the speaker
        "sentences": [
          {
            "text": "This is the sentence text",
            "start": 0,
            "end": 5,
            "words": [
              {
                "word": "This",
                "punctuated_word": "This",
                "start": 0,
                "end": 5,
                "confidence": 0.9
              }
            ]
          }
        ]
      }
    ]
  }
}
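To show how the nested structure above can be traversed, here is a short Python sketch that sums speaking time per speaker and flags low-confidence words. It uses only the fields shown in the sample. The 0.95 threshold and the function names are arbitrary choices for this example, and the sketch assumes the delivered files are plain JSON without the explanatory // comments used in the snippet.

```python
import json
from collections import defaultdict

# The sample from above, with the explanatory comments removed.
SAMPLE = """
{
  "version": "1.0",
  "event_id": 123456,
  "company_id": 123,
  "transcript": {
    "text": "This is the full transcript text",
    "number_of_speakers": 3,
    "paragraphs": [
      {
        "text": "This is the paragraph text",
        "start": 0,
        "end": 10,
        "speaker": 0,
        "sentences": [
          {
            "text": "This is the sentence text",
            "start": 0,
            "end": 5,
            "words": [
              {"word": "This", "punctuated_word": "This",
               "start": 0, "end": 5, "confidence": 0.9}
            ]
          }
        ]
      }
    ]
  }
}
"""


def speaking_time_by_speaker(doc):
    """Sum paragraph durations (in seconds) per zero-based speaker index."""
    totals = defaultdict(float)
    for paragraph in doc["transcript"]["paragraphs"]:
        totals[paragraph.get("speaker")] += paragraph["end"] - paragraph["start"]
    return dict(totals)


def low_confidence_words(doc, threshold=0.95):
    """List (word, start, confidence) for words scored below `threshold`."""
    flagged = []
    for paragraph in doc["transcript"]["paragraphs"]:
        for sentence in paragraph["sentences"]:
            for word in sentence["words"]:
                if word["confidence"] < threshold:
                    flagged.append(
                        (word["punctuated_word"], word["start"], word["confidence"])
                    )
    return flagged


doc = json.loads(SAMPLE)
print(speaking_time_by_speaker(doc))
print(low_confidence_words(doc))
```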