Convert Video to text in Ruby




Published on April 20, 2020

Setup on Google Cloud Console

  • Start a free trial on Google Cloud Platform. A credit card is required to activate the free trial.
  • Create a project (e.g. My-Project) or select an existing one.
  • Enable the Cloud Speech-to-Text API for that project.
  • Create a service account: IAM -> Service Accounts -> Create Service Account.
  • Download a private key for the service account as JSON.
  • Create a bucket: Google Cloud Storage -> Browse.

Installations on Local System

Run these commands from a terminal.

  • Install Google Cloud Storage gem

    gem install google-cloud-storage

  • Install Google Cloud Speech gem

    gem install google-cloud-speech

  • Install ffmpeg

    sudo apt-get install ffmpeg   # Ubuntu
    brew install ffmpeg           # Mac
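Since the script shells out to ffmpeg, it can help to confirm the binary is reachable before doing anything else. The following is a minimal sketch, not part of the original script; `executable_on_path?` is a hypothetical helper name.

```ruby
# Returns true when an executable called `name` exists in one of the
# directories listed in `path` (defaults to the PATH environment variable).
def executable_on_path?(name, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).reject(&:empty?).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Warn early instead of failing in the middle of the script.
warn "ffmpeg was not found on PATH; install it first" unless executable_on_path?("ffmpeg")
```

The second argument exists only so the check can be pointed at a specific directory; in normal use you would call it with the tool name alone.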

Ruby Code

require "google/cloud/speech"
require "google/cloud/storage"

# Google Cloud project ID
project_id = "google_cloud_project_id"
# Path to the downloaded service account key file
key_file   = "file_name.json"
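Instead of hardcoding the project ID, it could be read from the downloaded key file, since a service account key JSON carries a "project_id" field. This is a minimal sketch under that assumption; `project_id_from_key` is a hypothetical helper, not something the gems provide.

```ruby
require "json"

# Reads the "project_id" field from a service account key file.
# Raises KeyError if the field is missing.
def project_id_from_key(key_path)
  JSON.parse(File.read(key_path)).fetch("project_id")
end

# Possible usage in the script:
#   project_id = project_id_from_key(key_file)
```

Using `fetch` rather than `[]` makes a malformed key file fail loudly at startup instead of passing a nil project ID along.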

Convert video to audio

Convert the video file to an audio file using ffmpeg.

What is FLAC? FLAC stands for Free Lossless Audio Codec, an audio format similar to MP3 but lossless, meaning that audio is compressed in FLAC without any loss in quality.

We will run two commands: the first extracts the audio track from the video as FLAC, and the second downmixes it to a single channel (-ac 1), which Speech-to-Text expects by default.

system "ffmpeg -i video.mp4 audio_temp.flac"
system "ffmpeg -i audio_temp.flac -ac 1 audio_final.flac"
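As an alternative, the two steps can probably be combined into a single ffmpeg call. The sketch below assumes that one pass with -vn (drop video) and -ac 1 (mono) gives an equivalent file; `flac_command` and `extract_audio` are hypothetical helper names.

```ruby
# Builds an ffmpeg argument list that extracts mono FLAC audio in one pass.
#   -y     overwrite the output file if it already exists
#   -vn    drop the video stream
#   -ac 1  downmix to a single (mono) channel
def flac_command(input_path, output_path)
  ["ffmpeg", "-y", "-i", input_path, "-vn", "-ac", "1", output_path]
end

# Passing the arguments separately (no shell) keeps paths with spaces intact.
# system returns true on success, false on a non-zero exit status, and nil
# if ffmpeg could not be started, so anything but true counts as failure.
def extract_audio(input_path, output_path)
  system(*flac_command(input_path, output_path)) == true
end

# Possible usage:
#   extract_audio("video.mp4", "audio_final.flac") or abort "ffmpeg failed"
```

Checking the return value matters here because a failed conversion would otherwise only surface later, when the upload cannot find audio_final.flac.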

Upload audio to Google Storage

First, access the project through the Storage API. Then create a new file in the bucket, which will be a copy of audio_final.flac.

Note: Here I'm using the first bucket in Google Cloud Storage. If you have more than one bucket, you can select whichever one you want.

storage = Google::Cloud::Storage.new project: project_id, keyfile: key_file
bucket_name = storage.buckets.first.name
puts bucket_name
bucket  = storage.bucket bucket_name
local_file_path = 'audio_final.flac'
file = bucket.create_file local_file_path, 'audio_cloud.flac'
puts "Uploaded #{file.name}"
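Once uploaded, the object is addressed by a gs:// URI made of the bucket name and the object name. A small sketch of building that URI from the values already in the script, rather than typing it by hand; `gcs_uri` is a hypothetical helper.

```ruby
# Builds a Cloud Storage URI of the form gs://bucket-name/object-name.
def gcs_uri(bucket_name, object_name)
  raise ArgumentError, "bucket name must not be empty" if bucket_name.to_s.empty?
  raise ArgumentError, "object name must not be empty" if object_name.to_s.empty?

  "gs://#{bucket_name}/#{object_name}"
end

# Possible usage with the variables from the upload step:
#   storage_path = gcs_uri(bucket_name, file.name)
```

Deriving the URI from `bucket_name` and `file.name` avoids a mismatch between the bucket you uploaded to and the one the transcription request points at.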

Transcribe Audio to Text

Now we'll convert the audio that we uploaded to Cloud Storage into text. Refer to the file with a URI of the form gs://bucket-name/file-name. Set language_code to match the spoken language; for example, use "de-DE" for a German-language video.

# Written against the 0.x google-cloud-speech client interface
speech = Google::Cloud::Speech.new
# Replace audio_bucket-1 with the name of your bucket
storage_path = "gs://audio_bucket-1/audio_cloud.flac"

config = { encoding: :FLAC,
           language_code: "en-US" }
audio = { uri: storage_path }
operation = speech.long_running_recognize config, audio

audio_text = ''
puts "Operation started"
# Block until the long-running recognition job has finished
operation.wait_until_done!
raise operation.results.message if operation.error?

operation.response.results.each do |result|
  transcript = result.alternatives.first.transcript
  audio_text << transcript
  puts "Transcription: #{transcript}"
end

puts audio_text
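The loop above appends each transcript directly, so consecutive chunks can run together without a space. Here is a sketch of one way to join them more defensively. The `Alternative` and `Result` structs are stand-ins I define only so the example runs without the gem or network access, and `join_transcripts` is a hypothetical helper.

```ruby
# Stand-ins for the response objects; they only mimic the two attributes
# that the script reads (alternatives and transcript).
Alternative = Struct.new(:transcript)
Result = Struct.new(:alternatives)

# Takes the first alternative of every result, skips results without one,
# trims whitespace, and joins the pieces with a single space.
def join_transcripts(results)
  results
    .map { |result| result.alternatives.first }
    .compact
    .map { |alternative| alternative.transcript.strip }
    .reject(&:empty?)
    .join(" ")
end

sample = [
  Result.new([Alternative.new("hello world ")]),
  Result.new([]),
  Result.new([Alternative.new(" second part")])
]
puts join_transcripts(sample)
```

In the script itself, the same method could be called with `operation.response.results`, assuming those objects respond to `alternatives` and `transcript` as they do in the loop above.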

Before running this script, set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the downloaded key file.

export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/file_name.json"
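As an optional safeguard, the script could verify this variable at startup so that a missing export produces a clear message. This is only a sketch; `credentials_path!` is a hypothetical helper, and the `env` parameter exists so it can be exercised with a plain hash.

```ruby
# Returns the key file path from GOOGLE_APPLICATION_CREDENTIALS, or raises
# with a readable message if the variable is unset or the file is missing.
def credentials_path!(env = ENV)
  path = env["GOOGLE_APPLICATION_CREDENTIALS"]
  raise "GOOGLE_APPLICATION_CREDENTIALS is not set" if path.nil? || path.empty?
  raise "Key file not found: #{path}" unless File.file?(path)

  path
end

# Possible usage at the top of the script:
#   key_file = credentials_path!
```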

Now run this script.

ruby translate_video_to_text.rb

You can find the complete code on GitHub.
