AssemblyAI Ruby SDK
The AssemblyAI Ruby SDK provides an easy-to-use interface for interacting with the AssemblyAI API, which supports asynchronous transcription, audio intelligence models, and the latest LeMUR models.
The Ruby SDK does not support Streaming STT at this time.
Documentation
Visit the AssemblyAI documentation for step-by-step instructions and a lot more details about our AI models and API.
Quickstart
Install the gem and add it to the application's Gemfile by executing:
bundle add assemblyai
If bundler is not being used to manage dependencies, install the gem by executing:
gem install assemblyai
Require the AssemblyAI gem and create a client with your API key:
require 'assemblyai'
client = AssemblyAI::Client.new(api_key: 'YOUR_API_KEY')
You can now use the client object to interact with the AssemblyAI API.
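Hardcoding the key as in the snippet above is fine for a quick test, but in an application you would usually load it from the environment. The following is a minimal sketch, assuming an environment variable named ASSEMBLYAI_API_KEY; both the variable name and the helper are illustrative and not part of the SDK.

```ruby
# Hypothetical helper: reads the API key from an environment-like hash.
# ASSEMBLYAI_API_KEY is an assumed variable name, not something the SDK requires.
def assemblyai_api_key(env = ENV)
  key = env['ASSEMBLYAI_API_KEY'].to_s.strip
  raise ArgumentError, 'Set ASSEMBLYAI_API_KEY before creating the client' if key.empty?

  key
end

# Usage (requires the assemblyai gem):
#   require 'assemblyai'
#   client = AssemblyAI::Client.new(api_key: assemblyai_api_key)
```

Failing early with a clear message avoids sending requests with an empty key.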
Speech-To-Text
transcript = client.transcripts.transcribe(
  audio_url: 'https://assembly.ai/espn.m4a'
)
transcribe queues a transcription job and polls it until the status is completed or error.
If you don't want to wait until the transcript is ready, you can use submit:
transcript = client.transcripts.submit(
  audio_url: 'https://assembly.ai/espn.m4a'
)
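Because transcribe returns on either completed or error, and submit returns before the job has finished, it can help to check the status before reading the text. The sketch below is a hypothetical helper, not part of the SDK; it assumes the transcript object exposes status, text, and an error message field, mirroring the API's transcript object.

```ruby
# Hypothetical helper: returns the text of a finished transcript or raises.
# Assumes `status`, `text`, and `error` readers on the transcript object.
def transcript_text!(transcript)
  status = transcript.status.to_s
  case status
  when 'completed' then transcript.text
  when 'error' then raise "Transcription failed: #{transcript.error}"
  else raise "Transcript is not finished yet (status: #{status})"
  end
end

# Usage:
#   puts transcript_text!(transcript)
```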
To transcribe a local file, upload it first, then pass the returned upload URL to transcribe:
uploaded_file = client.files.upload(file: '/path/to/your/file')
# You can also pass an IO object or base64 string
# uploaded_file = client.files.upload(file: File.new('/path/to/your/file'))
transcript = client.transcripts.transcribe(audio_url: uploaded_file.upload_url)
puts transcript.text
As with a public URL, transcribe polls the job until the status is completed or error. To return without waiting, use submit:
transcript = client.transcripts.submit(audio_url: uploaded_file.upload_url)
You can extract even more insights from the audio by enabling any of our AI models using transcription options. For example, here's how to enable the speaker diarization model to detect who said what:
transcript = client.transcripts.transcribe(
  audio_url: 'https://assembly.ai/espn.m4a',
  speaker_labels: true
)
# Print each utterance on its own line, prefixed with its speaker label
transcript.utterances.each do |utterance|
  printf("Speaker %<speaker>s: %<text>s\n", speaker: utterance.speaker, text: utterance.text)
end
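Once utterances carry speaker labels, simple per-speaker statistics follow directly. As an illustration, the hypothetical helper below counts the words attributed to each speaker; it relies only on the speaker and text readers used above and treats a missing utterance list as empty.

```ruby
# Hypothetical helper: counts words spoken per speaker.
# Works on any objects that respond to `speaker` and `text`.
def words_per_speaker(utterances)
  Array(utterances).each_with_object(Hash.new(0)) do |utterance, totals|
    totals[utterance.speaker] += utterance.text.to_s.split.size
  end
end

# Usage:
#   words_per_speaker(transcript.utterances).each do |speaker, count|
#     puts "Speaker #{speaker}: #{count} words"
#   end
```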
To fetch an existing transcript, call get. It returns the transcript object in its current state. If the transcript is still processing, the status field will be queued or processing. Once the transcript is complete, the status field will be completed:
transcript = client.transcripts.get(transcript_id: transcript.id)
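The status values described above are enough to build a simple polling loop around get, for example after calling submit. This is a rough sketch rather than the SDK's own polling logic: the helper name, the interval and timeout defaults, and the comparison of status against plain strings are all assumptions.

```ruby
# Hypothetical helper, not part of the SDK. `interval` and `timeout` are in seconds.
# Returns the transcript once its status is completed or error.
def wait_for_transcript(client, transcript_id, interval: 3, timeout: 600)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    transcript = client.transcripts.get(transcript_id: transcript_id)
    return transcript if %w[completed error].include?(transcript.status.to_s)

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "Timed out waiting for transcript #{transcript_id}"
    end

    sleep interval
  end
end

# Usage:
#   transcript = wait_for_transcript(client, transcript.id)
```

A monotonic clock is used for the deadline so that system clock adjustments do not shorten or extend the wait.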
You can also retrieve the transcript broken into sentences or paragraphs:
sentences = client.transcripts.get_sentences(transcript_id: transcript.id)
p sentences
paragraphs = client.transcripts.get_paragraphs(transcript_id: transcript.id)
p paragraphs
Export subtitles in SRT or VTT format. Pass chars_per_caption to limit the length of each caption:
srt = client.transcripts.get_subtitles(
  transcript_id: transcript.id,
  subtitle_format: AssemblyAI::Transcripts::SubtitleFormat::SRT
)
srt = client.transcripts.get_subtitles(
  transcript_id: transcript.id,
  subtitle_format: AssemblyAI::Transcripts::SubtitleFormat::SRT,
  chars_per_caption: 32
)
vtt = client.transcripts.get_subtitles(
  transcript_id: transcript.id,
  subtitle_format: AssemblyAI::Transcripts::SubtitleFormat::VTT
)
vtt = client.transcripts.get_subtitles(
  transcript_id: transcript.id,
  subtitle_format: AssemblyAI::Transcripts::SubtitleFormat::VTT,
  chars_per_caption: 32
)
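Subtitles are usually wanted as a file on disk. The sketch below wraps the get_subtitles call from above in a hypothetical helper that writes the result to a path. It assumes get_subtitles returns the subtitle contents as a String; the helper name and its keyword arguments are illustrative.

```ruby
# Hypothetical helper: fetches subtitles and writes them to `path`.
# Assumes get_subtitles returns the subtitle file contents as a String.
# `format` is one of the SubtitleFormat constants shown above.
def save_subtitles(client, transcript_id:, format:, path:, chars_per_caption: nil)
  options = { transcript_id: transcript_id, subtitle_format: format }
  # Only send chars_per_caption when the caller asked for a limit
  options[:chars_per_caption] = chars_per_caption if chars_per_caption
  subtitles = client.transcripts.get_subtitles(**options)
  File.write(path, subtitles)
  path
end

# Usage:
#   save_subtitles(client, transcript_id: transcript.id,
#                  format: AssemblyAI::Transcripts::SubtitleFormat::SRT,
#                  path: 'transcript.srt')
```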
To list your transcripts, call list:
page = client.transcripts.list
You can pass parameters to .list to filter the transcripts.
To paginate over the remaining pages, pass each page's prev_url to the .list_by_url method:
# Stop once there is no previous page left to fetch
until page.page_details.prev_url.nil?
  page = client.transcripts.list_by_url(url: page.page_details.prev_url)
end
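The pagination loop can be wrapped so that callers receive every page, including the first one. The following is a sketch of one way to do this; the helper is hypothetical and relies only on the calls shown above (list, list_by_url, and page_details.prev_url).

```ruby
# Hypothetical helper: yields the first page, then every page reachable via prev_url.
# Any keyword arguments are forwarded to list as filter parameters.
def each_transcript_page(client, **list_params)
  page = client.transcripts.list(**list_params)
  loop do
    yield page
    prev_url = page.page_details.prev_url
    break if prev_url.nil?

    page = client.transcripts.list_by_url(url: prev_url)
  end
end

# Usage:
#   each_transcript_page(client) { |page| p page }
```

Checking prev_url before each request avoids calling list_by_url with a nil URL when the first page is also the last.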
To delete a transcript, pass its ID to delete:
response = client.transcripts.delete(transcript_id: transcript.id)
Apply LLMs to your audio with LeMUR
Call LeMUR endpoints to apply LLMs to your transcript.
Run a custom prompt with the task endpoint:
response = client.lemur.task(
  transcript_ids: ['0d295578-8c75-421a-885a-2c487f188927'],
  prompt: 'Write a haiku about this conversation.'
)
Summarize the transcript, optionally passing an answer format and context:
response = client.lemur.summary(
  transcript_ids: ['0d295578-8c75-421a-885a-2c487f188927'],
  answer_format: 'one sentence',
  context: {
    speakers: ['Alex', 'Bob']
  }
)
Ask questions about the transcript:
response = client.lemur.question_answer(
  transcript_ids: ['0d295578-8c75-421a-885a-2c487f188927'],
  questions: [
    {
      question: 'What are they discussing?',
      answer_format: 'text'
    }
  ]
)
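To show how a question_answer result might be consumed, here is a small formatting sketch. It assumes the returned object exposes response as a list of items with question and answer readers; that shape is an assumption for illustration, so check it against the SDK before relying on it. The helper name is hypothetical.

```ruby
# Hypothetical helper: renders question/answer pairs as plain text.
# Assumes `qa_response.response` is a list of items with `question` and `answer`.
def format_answers(qa_response)
  pairs = Array(qa_response.response).map do |item|
    "Q: #{item.question}\nA: #{item.answer}"
  end
  pairs.join("\n\n")
end

# Usage:
#   puts format_answers(response)
```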
Extract action items:
response = client.lemur.action_items(
  transcript_ids: ['0d295578-8c75-421a-885a-2c487f188927']
)
To delete the data of a previous LeMUR request, pass its request ID to purge_request_data:
response = client.lemur.task(...)
deletion_response = client.lemur.purge_request_data(request_id: response.request_id)