
Overview

The live transcript dataset gives you access to live text transcripts from events. The dataset is updated in real time as new transcripts become available.

How it works

Our API supports live transcript streaming in JSON Lines (jsonl) format. When you make a live transcript request, the API returns a URL that streams live transcript data in jsonl format. We continuously upgrade the format to improve the accuracy and usability of the transcripts. Clients must therefore ignore unrecognized record types, so that we can add functionality in the future without breaking existing integrations.

Available versions

We currently support two versions of the file format:
  • 1.7 – The latest version. This release introduces refinement instructions. To request this version, set the transcriptVersion=1.7 query parameter. For more information, see the Refinement instructions section below.
  • 1.6 – The default version returned by the API if no version is specified. If your client is already using version 1.6 and ignores unknown record types, it will remain compatible even when new record types are added in version 1.7.
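As a rough illustration of selecting a version, the sketch below adds the transcriptVersion query parameter to an API request URL. The helper name withTranscriptVersion and the example.com host and path are placeholders invented for this example; use the request URL from the API reference instead. The parameter belongs on the API request, not on the stream URL that the API returns.

```javascript
// Hypothetical helper: returns the request URL with transcriptVersion set.
// Existing query parameters (ticker, ISIN, ...) are left untouched.
function withTranscriptVersion(requestUrl, version = "1.7") {
  const url = new URL(requestUrl);
  url.searchParams.set("transcriptVersion", version);
  return url.toString();
}

// The host and path are placeholders, not a real endpoint.
const versionedUrl = withTranscriptVersion(
  "https://api.example.com/live-transcripts?ticker=ABC"
);
console.log(versionedUrl);
```

Omitting the parameter leaves the request on the default version, 1.6.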

File format

{"type": "start", "file_format_version": "1.6"}
{"t": "Hi", "s": 0.1, "e": 0.5, "p": "0", "S": "0"}
{"t": "and", "s": 0.5, "e": 0.6, "p": "0", "S": "0"}
{"type": "interruption", "time": 0.6, "restarting": true}
{"t": "welcome", "s": 0.7, "e": 1.1, "p": "0", "S": "0"}
{"t": "to", "s": 1.09, "e": 2.47, "p": "0", "S": "0"}
{"type": "section", "name": "predicted-speech", "s": 0.1}
{"t": "the", "s": 2.5, "e": 3, "p": "0", "S": "0"}
{"t": "[indiscernible]", "s": 3, "e": 3.01, "p": "0", "c": 0.41, "ot": "first"}
{"t": "quarter.", "s": 3.1, "e": 3.5, "p": "0", "S": "0"}
{"t": "[indiscernible]", "s": 4.1, "e": 5.4, "p": "0", "c": 0.54, "ot": "twenty"}
{"t": "[indiscernible]", "s": 5.4, "e": 6.4, "p": "0", "c": 0.30, "ot": "twenty-four"}
{"type": "keep-alive"}
{"t": "You", "s": 15.1, "e": 16.1, "p": "1", "c": 0.6, "S": "1"}
{"t": "may", "s": 16.1, "e": 17.1, "p": "1", "c": 0.6, "S": "1"}
{"t": "now", "s": 17.1, "e": 18.1, "p": "1", "c": 0.6, "S": "1"}
{"type": "section", "name": "predicted-qna", "s": 1.09}
{"t": "[indiscernible]", "s": 18.1, "e": 18.4, "p": "1", "c": 0.6, "ot": "disconnect."}
{"type": "section", "name": "predicted-speech", "e": 18.4}
{"type": "end"}
For forward compatibility, clients should ignore any record type that they don't recognize. This lets us add functionality in the future, such as paragraph splits and speaker recognition. The record types are:
  • start: This should be the first record in any file. All other keys are metadata for the stream and may be empty.
  • entry: This is the default and most common record type and comes after the start. If no type is specified, the record should be assumed to be of this type. It contains the following fields:
    • s: The start time in seconds. We prefer to send this as a number, but clients should be able to handle a string.
    • e: The end time in seconds. We prefer to send this as a number, but clients should be able to handle a string.
    • p: The phrase id (note that this is a string and is not necessarily parseable as a decimal number). If words are passed line by line, several lines (words) can have the same phrase id so that phrases can be reconstructed in the front-end, if needed.
    • t: The transcript text (in version ≥1.1 this will most often be a single word). If [indiscernible] is passed, the transcription is not confident enough to show this word or phrase. Such records are usually sent with their own phrase id, p, in versions prior to 1.4, see below. During music or poor sound quality, several [indiscernible] records may be sent in succession. Clients do not have to show each one and may collapse them into one, even if they have separate phrase ids.
    • S: The speaker index of the entry. Can be missing, especially for low-confidence phrases. See below for how to handle paragraph-level speakers.
    • ot: The original text, as masked by “[indiscernible]” in t, due to low confidence of transcription.
    • c: The confidence of the original text, if any.
  • keep-alive: This can be sent at any point to indicate that the stream is still going, but nothing is necessarily being said. Clients should ignore these unless they have some logic for stand-off or similar. In practice this is rarely, if ever, sent; a file is regarded as active as long as it has not been closed.
  • end: Indicates an end. Nothing can be added to a file after this. If this exists or appears in the file during the stream, clients should stop polling. It can, but does not have to, contain metadata such as the fields below. If it contains no metadata, a successful exit should be assumed.
    • code: an exit code, 0=success, anything else indicates failure
    • system_reason: a reason for ending. Good for debugging. Clients should not display this to the user.
    • user_reason: a reason for ending. Clients could show this.
  • interruption: Something went wrong with the live transcription. A restart will be attempted 3 times.
    • time: Time, in seconds, from the beginning of the transcript when this occurred.
    • restarting: Indicates whether there is an attempt at restarting the live transcription. true indicates a restart attempt; false indicates that the live transcription can't be recovered.
  • section: A section delimiter that marks the start or end of a section.
    • name: The name/id of the section, e.g. predicted-qna and predicted-speech.
A section record can arrive long after the speech that it refers to. A section is not guaranteed to exist: predicted-speech could end without ever having been marked as started, and it could start without ever being ended.
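The rules above can be sketched as a small record parser. This is only an illustration under stated assumptions: the function names (toSeconds, parseRecord, buildPhrases) and the shape of the returned objects are invented for this example, and records carrying an i field are treated as refinement instructions because the version 1.7 examples show them without a type.

```javascript
const INDISCERNIBLE = "[indiscernible]";

// Times may arrive as a number or a string; return a number, or null if missing or invalid.
function toSeconds(value) {
  if (value === undefined || value === null || value === "") return null;
  const seconds = Number(value);
  return Number.isFinite(seconds) ? seconds : null;
}

// Classify one line of the stream. Unrecognized types come back as "ignored".
function parseRecord(line) {
  const raw = JSON.parse(line);
  const type = raw.type ?? "entry"; // a missing type means an entry
  switch (type) {
    case "entry":
      // Assumption: untyped records with an "i" field are 1.7 refinement instructions.
      if (raw.i !== undefined) return { kind: "instruction", raw };
      return {
        kind: "entry",
        text: raw.t,
        start: toSeconds(raw.s),
        end: toSeconds(raw.e),
        phraseId: raw.p,
        speaker: raw.S ?? null, // the speaker index can be missing
      };
    case "end":
      // No metadata means a successful exit; otherwise code 0 means success.
      return {
        kind: "end",
        ok: raw.code === undefined || Number(raw.code) === 0,
        userReason: raw.user_reason ?? null,
      };
    case "start":
    case "interruption":
    case "section":
    case "keep-alive":
      return { kind: type, raw };
    default:
      return { kind: "ignored" };
  }
}

// Rebuild phrases from entries, collapsing consecutive [indiscernible] records
// into one even when they carry different phrase ids.
function buildPhrases(entries) {
  const phrases = [];
  for (const entry of entries) {
    const last = phrases[phrases.length - 1];
    const lastWord = last ? last.words[last.words.length - 1] : null;
    if (entry.text === INDISCERNIBLE && lastWord === INDISCERNIBLE) continue;
    if (last && last.phraseId === entry.phraseId) last.words.push(entry.text);
    else phrases.push({ phraseId: entry.phraseId, words: [entry.text] });
  }
  return phrases.map((phrase) => phrase.words.join(" "));
}

const sampleLines = [
  '{"type": "start", "file_format_version": "1.6"}',
  '{"t": "Hi", "s": "0.1", "e": 0.5, "p": "0", "S": "0"}',
  '{"t": "[indiscernible]", "s": 4.1, "e": 5.4, "p": "0", "c": 0.54, "ot": "twenty"}',
  '{"t": "[indiscernible]", "s": 5.4, "e": 6.4, "p": "0", "c": 0.30, "ot": "twenty-four"}',
  '{"type": "some-future-type"}',
  '{"t": "You", "s": 15.1, "e": 16.1, "p": "1", "S": "1"}',
  '{"type": "end"}',
];
const sampleRecords = sampleLines.map(parseRecord);
const samplePhrases = buildPhrases(sampleRecords.filter((r) => r.kind === "entry"));
console.log(samplePhrases);
```

Returning "ignored" for anything unknown keeps the client working when new record types are introduced.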

Refinement instructions (version 1.7 and above)

Refinement instructions are records with the fields s, e, rt and i. Not all of them are always present; which fields appear depends on the instruction type.
  • i: The instruction type. Possible values are:
    • word-update: Updates a word based on the word's timestamp. s is the timestamp of the word to update, rt is the new word. Example: {"i": "word-update", "s": 123.45, "rt": "Tomato"}
    • word-insert: Inserts a word at the given timestamp. s is the timestamp of the word to insert, rt is the new word. Example: {"i": "word-insert", "s": 123.45, "rt": "Tomato"}
    • word-delete: Deletes a word at the given timestamp. s is the timestamp of the word to delete. Example: {"i": "word-delete", "s": 123.45}
    • paragraph-insert: Inserts a paragraph break, such that the words with s < the timestamp of this instruction and words with s >= the timestamp of this instruction should be in separate paragraphs. Example: The paragraph insert below should divide the paragraph so that “Hello there.” is in the first paragraph and “My name” is in the second.
      {"s":1,"e":2,"p":"1","t":"Hello","S":"1"}
      {"s":2,"e":3,"p":"1","t":"there.","S":"1"}
      {"s":3,"e":4,"p":"1","t":"My","S":"1"}
      {"s":4,"e":5,"p":"1","t":"name","S":"1"}
      ...
      {"i": "paragraph-insert", "s": 3}
      
The instructions often come in chunks and, when applied one after another, can perform more complex edits. For example, if the word TomHanks needs to be split into two words, this is done with one word-delete instruction followed by two word-insert instructions that divide up the time range of the original word.
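One possible way to apply refinement instructions on the client is sketched below. It is not a reference implementation: the state shape ({ words, breaks }) and the function names are invented for this example, words are matched by exact equality on s, and an inserted word inherits p and S from its nearest neighbor, which is an assumption because the format description does not say which phrase an inserted word belongs to.

```javascript
// Apply one refinement instruction in place.
// state.words is kept sorted by start time; state.breaks is a Set of paragraph-insert timestamps.
function applyInstruction(state, instr) {
  const s = Number(instr.s);
  const index = state.words.findIndex((word) => Number(word.s) === s);
  switch (instr.i) {
    case "word-update":
      if (index !== -1) state.words[index].t = instr.rt;
      break;
    case "word-delete":
      if (index !== -1) state.words.splice(index, 1);
      break;
    case "word-insert": {
      // Insert before the first word that starts later, so the list stays sorted.
      let at = state.words.findIndex((word) => Number(word.s) > s);
      if (at === -1) at = state.words.length;
      // Assumption: inherit phrase id and speaker from the neighboring word.
      const neighbor = state.words[at - 1] ?? state.words[at];
      state.words.splice(at, 0, { s, e: instr.e, t: instr.rt, p: neighbor?.p, S: neighbor?.S });
      break;
    }
    case "paragraph-insert":
      state.breaks.add(s);
      break;
    default:
      break; // unknown instruction types are ignored
  }
}

// Split words into paragraphs on a phrase id change or a paragraph-insert timestamp.
function toParagraphs(state) {
  const paragraphs = [];
  let previous = null;
  for (const word of state.words) {
    const crossesBreak =
      previous !== null &&
      [...state.breaks].some((b) => Number(previous.s) < b && Number(word.s) >= b);
    if (previous === null || word.p !== previous.p || crossesBreak) paragraphs.push([]);
    paragraphs[paragraphs.length - 1].push(word.t);
    previous = word;
  }
  return paragraphs.map((words) => words.join(" "));
}

const refinementState = {
  words: [
    { s: 1, e: 2, p: "1", t: "Hello", S: "1" },
    { s: 2, e: 3, p: "1", t: "there.", S: "1" },
    { s: 3, e: 4, p: "1", t: "My", S: "1" },
    { s: 4, e: 5, p: "1", t: "name", S: "1" },
    { s: 5, e: 6, p: "1", t: "is", S: "1" },
    { s: 6, e: 7, p: "1", t: "TomHanks", S: "1" },
  ],
  breaks: new Set(),
};

const refinementInstructions = [
  { i: "word-delete", s: 6 }, // split TomHanks: one delete ...
  { i: "word-insert", s: 6, rt: "Tom" }, // ... followed by two inserts
  { i: "word-insert", s: 6.5, rt: "Hanks" },
  { i: "word-update", s: 1, rt: "Hi" },
  { i: "paragraph-insert", s: 3 },
];
for (const instr of refinementInstructions) applyInstruction(refinementState, instr);

console.log(toParagraphs(refinementState));
// → [ 'Hi there.', 'My name is Tom Hanks' ]
```

Keeping paragraph breaks in a separate set, rather than rewriting phrase ids, means the original p values stay available if later instructions refer to the same words.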

How to access the data

The API allows you to query the live transcript dataset using a variety of parameters, such as ticker, ISIN, date, and more. You can also use the API to retrieve a list of all live transcripts available in the dataset. For more information on how to use the API, see the API reference.
If you have a subscription to both the audio and transcript live datasets, you can retrieve both the live audio and transcript data simultaneously using the Live endpoints.
The data is served through a global CDN, ensuring low latency and high availability. The URL is unique to each live transcript stream and customer. It should not be shared with others or altered in any way, as doing so may result in a loss of access.

API Reference

Explore the Live Transcripts API endpoints.

Example

Below is an example of how to consume a live transcript. The same example is also available at CodePen. Just make sure to replace the transcriptUrl with your own.
<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Live Transcript</title>
    <style>
      #transcript {
        font-family: Arial, sans-serif;
        font-size: 16px;
        line-height: 1.5;
      }
      .paragraph {
        margin-bottom: 1em;
      }
    </style>
  </head>
  <body>
    <h1>Live Transcript</h1>
    <div id="transcript"></div>

    <script>
      document.addEventListener("DOMContentLoaded", () => {
        const transcriptUrl = "your-transcript.jsonl"; // <<< Replace
        const transcriptElement = document.getElementById("transcript");
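        // Byte offset of the next unread byte. Start at 0: the start record is
        // read like any other line and skipped by type in processTranscriptChunk.
        let lastPosition = 0;
        // Incomplete trailing line carried over between chunks
        // (assumes every record ends with a newline)
        let pendingLine = "";
        // Set once an end record is seen; polling stops after that
        let streamEnded = false;
        // One decoder for the whole stream, so multi-byte characters split
        // across chunks are decoded correctly
        const decoder = new TextDecoder("utf-8");
        let currentParagraphId = null;
        let currentParagraphElement = null;

        async function fetchNewTranscript() {
          try {
            const response = await fetch(transcriptUrl, {
              headers: { Range: `bytes=${lastPosition}-` },
              cache: "no-cache",
            });

            if (response.ok) {
              const reader = response.body.getReader();
              let { value, done } = await reader.read();
              while (!done) {
                const chunk = decoder.decode(value, { stream: true });
                processTranscriptChunk(chunk);
                lastPosition += value.length;
                ({ value, done } = await reader.read());
              }
            } else if (response.status !== 416) {
              // 416 (Range Not Satisfiable) typically just means no new bytes yet
              console.error("Failed to fetch transcript:", response.statusText);
            }
          } catch (error) {
            console.error("Error fetching transcript:", error);
          } finally {
            if (!streamEnded) {
              setTimeout(fetchNewTranscript, 2000); // Adjust the interval as needed
            }
          }
        }

        function processTranscriptChunk(chunk) {
          // A chunk can end in the middle of a line, so keep the unfinished tail
          const lines = (pendingLine + chunk).split("\n");
          pendingLine = lines.pop();
          lines.forEach((line) => {
            if (!line.trim()) return; // Skip blank lines
            try {
              const json = JSON.parse(line);

              if (json.type === "end") {
                streamEnded = true; // Nothing is added to the file after an end record
                return;
              }
              if ((json.type && json.type !== "entry") || json.i !== undefined) {
                // Ignore other record types (start, interruption, section,
                // keep-alive, unknown) and version 1.7 refinement instructions
                return;
              }

              // Start a new paragraph if 'p' changes
              if (json.p !== currentParagraphId) {
                currentParagraphId = json.p;
                currentParagraphElement = document.createElement("div");
                currentParagraphElement.className = "paragraph";
                transcriptElement.appendChild(currentParagraphElement);
              }

              // Append text to the current paragraph
              const textNode = document.createTextNode(json.t + " ");
              currentParagraphElement.appendChild(textNode);
            } catch (e) {
              console.error("Error parsing JSON line:", e);
            }
          });
        }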

        fetchNewTranscript();
      });
    </script>
  </body>
</html>