Video Transcription Using Cloudinary


Ekene Eze

Accessibility is one of the most important parts of the modern web. That is why we use transcriptions to enhance the usability of video files online. Transcriptions are one of the most accessible ways to deliver video content because they cater to a wide variety of web users. In this post, we'll look at how to add transcriptions to videos rendered in a Next.js application with Cloudinary.

By the end, we'll have built a web application that uses the Cloudinary API to transcribe a user-uploaded video and return a downloadable URL for the transcribed video.

The Cloudinary API uses the Google AI Video Transcription add-on to generate a subtitle file for the uploaded video, and then we add a transformation that overlays this subtitle file on the video.


To follow along with this tutorial, you will need:

  • A free Cloudinary account.

  • Experience with JavaScript and React.js.

  • Next.js is not a requirement, but it's good to have.


If you'd like to get a head start by looking at the finished demo, I've set it up on CodeSandbox for you. Fork and run it to get started quickly.

To test the demo successfully, make sure the video you upload is smaller than 1MB.

Setup and Installations

First, we will create a Next.js boilerplate with the following command:

```bash
npx create-next-app video-transcription
```

Next, let's navigate into the project's root folder:

```bash
cd video-transcription
```

Next, install the following packages:

  • Cloudinary — a Node.js SDK for interacting with the Cloudinary APIs.

  • File-saver — to help us save the transcribed video locally.

  • Axios — to make HTTP requests.

  • Dotenv — to store our API keys safely.

  • Multiparty — to parse the form data containing the uploaded video.

The following command will install all the above packages:

```bash
npm i cloudinary file-saver axios dotenv multiparty
```
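With `dotenv` installed, the Cloudinary credentials can live in a `.env` file at the project root. A sketch of what that file might look like, with placeholder values (copy the real ones from your Cloudinary dashboard; the variable names here match the ones used later in this post):

```bash
# .env — placeholder values; use your own Cloudinary credentials
CLOUD_NAME=your-cloud-name
API_KEY=your-api-key
API_SECRET=your-api-secret
```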

Set Up the Cloudinary Transcription Add-on

To enable the transcription feature on Cloudinary, we need to follow the process shown below:

Navigate to the Add-ons tab on your Cloudinary account and select the Google AI Video Transcription add-on.

Next, select the free plan, which offers 120 monthly units. For a larger project, you should probably select a paid plan with more units, but this will be sufficient for our demo.

Navigate back into the project folder and start the development server with the command below:

```bash
npm run dev
```

The above command starts a development server at http://localhost:3000. Open that address in the browser to see our demo app running. Next, create a transcribe.js file in the pages/api folder and add the following snippet to it:

```js
// pages/api/transcribe.js
const multiparty = require("multiparty");
const Cloudinary = require("cloudinary").v2;
const pth = require("path");

const uploadVideo = async (req, res) => {
  const form = new multiparty.Form();
  const data = await new Promise((resolve, reject) => {
    form.parse(req, async function (err, fields, files) {
      if (err) reject({ err });
      // "file" is the form field the client posts the video under
      const path = files.file[0].path;
      const filename = pth.parse(files.file[0].originalFilename).name;
      try {
        // config Cloudinary
        // rest of the code here
      } catch (error) {
        console.log(error);
      }
    });
  });
  res.status(200).json({ success: true, data });
};

export default uploadVideo;
```
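One detail worth flagging: Next.js parses request bodies in API routes by default, which can interfere with `multiparty` reading the raw request stream. If you hit parsing issues, API routes let you opt out of the built-in body parser by exporting a `config` object. A minimal sketch, added to the same route file:

```javascript
// pages/api/transcribe.js — disable Next.js's built-in body parsing
// so multiparty can consume the raw request stream itself.
export const config = {
  api: {
    bodyParser: false,
  },
};
```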

In the snippet above, we:

  • Import Cloudinary and other necessary packages.
  • Create an uploadVideo() function to receive the video file from the client
  • Parse the request data with multiparty to retrieve the video's path and filename.

Next, we need to upload the retrieved video file to Cloudinary and transcribe it using the Cloudinary Video Transcription add-on we enabled. Replace the placeholder comments inside the try block with the following:

```js
try {
  // config Cloudinary
  Cloudinary.config({
    cloud_name: process.env.CLOUD_NAME,
    api_key: process.env.API_KEY,
    api_secret: process.env.API_SECRET,
    secure: true
  });

  // upload the video and request an SRT transcript from the add-on
  const VideoTranscribe = Cloudinary.uploader.upload(
    path,
    {
      resource_type: "video",
      public_id: `videos/${filename}`,
      raw_convert: "google_speech:srt"
    },
    function (error, result) {
      if (result) {
        return result;
      }
      return error;
    }
  );

  let { public_id } = await VideoTranscribe;

  // build a delivery URL that overlays the generated subtitle file
  const transcribedVideo = Cloudinary.url(`${public_id}`, {
    resource_type: "video",
    fallback_content: "Your browser does not support HTML5 video tags",
    transformation: [
      {
        overlay: {
          resource_type: "subtitles",
          public_id: `${public_id}.srt`
        }
      }
    ]
  });

  resolve({ transcribedVideo });
} catch (error) {
  console.log(error);
}
```
In the snippet above, we set up a Cloudinary instance to enable communication between our Next.js project and our Cloudinary account. Next, we upload the video to Cloudinary with the `google_speech` raw-convert option, which transcribes it and returns the result. Lastly, we destructure the `public_id` of the uploaded video and use it to build a delivery URL with a Cloudinary transformation that overlays the subtitle file on the video, thereby achieving complete video transcription functionality for the originally uploaded video.
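For intuition, the `overlay` transformation above is serialized into the delivery URL as an `l_subtitles` component, with any `/` in the subtitle file's public ID encoded as `:`. A rough sketch of the shape, where the cloud name and public ID are placeholders and the URL Cloudinary actually generates may include additional components:

```javascript
// Placeholders for illustration only; not a real account or asset.
const cloudName = "demo";
const publicId = "videos/my-clip";

// The subtitle overlay appears as an "l_subtitles" component in the URL path.
const url = `https://res.cloudinary.com/${cloudName}/video/upload/l_subtitles:${publicId.replace(/\//g, ":")}.srt/${publicId}.mp4`;
console.log(url);
// https://res.cloudinary.com/demo/video/upload/l_subtitles:videos:my-clip.srt/videos/my-clip.mp4
```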
> **Note**: The Cloudinary Video Transcription feature can only be triggered during an `upload` or `update` call.
> Also, every 15 seconds of video you transcribe takes 1 unit from your allocated 120 units in the free plan.
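To estimate how far the free plan goes, assuming usage is billed per started 15-second increment, here is a quick back-of-the-envelope helper (hypothetical, just for the arithmetic):

```javascript
// 1 unit per started 15-second increment of video (free plan: 120 units/month).
function unitsFor(durationSeconds) {
  return Math.ceil(durationSeconds / 15);
}

console.log(unitsFor(90));    // 6 — a 90-second clip costs 6 units
console.log((120 * 15) / 60); // 30 — the 120 free units cover ~30 minutes of video
```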
With this, we are finished with our transcription logic.

Next, let's implement the frontend aspect of this application. For this part, we will create a JSX form with an input field of type *file* and a submit button.

Navigate to the `index.js` file in the `pages` folder and add the following code:
```js
// pages/index.js
import Head from 'next/head'
import axios from 'axios'
import { useState } from 'react'
import { saveAs } from 'file-saver'

export default function Home() {
  const [selected, setSelected] = useState(null)
  const [videoUrl, setVideoUrl] = useState('')
  const [downloaded, setDownloaded] = useState(false)

  const handleChange = (e) => {
    if ( &&[0]) {
      const i =[0]
      let reader = new FileReader()
      reader.onload = () => {
        let base64String = reader.result
        setSelected(base64String)
      }
      reader.readAsDataURL(i)
    }
  }

  const handleSubmit = async (e) => {
    e.preventDefault()
    try {
      const body = JSON.stringify(selected)
      const config = {
        headers: {
          "Content-Type": "application/json"
        }
      }
      const response = await'/api/transcribe', body, config)
      const { data } = response
      setVideoUrl(data)
    } catch (error) {
      console.error(error)
    }
  }

  // return statement goes here
}
```

In the snippet above, we've set up the handleChange() and handleSubmit() functions to handle the interaction and submission of our form. You can click the Choose file button to select a video from your local filesystem, and the Upload button to submit the selected video to our Next.js /api/transcribe API route for transcription.
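Since the demo expects uploads under 1MB, you could guard the selected file's size before reading it in handleChange(). A small hypothetical helper (`MAX_BYTES` and `isUnderLimit` are not part of the original code; check `file.size` before handing the file to FileReader):

```javascript
// Hypothetical guard for the ~1MB demo upload limit.
const MAX_BYTES = 1024 * 1024; // 1MB

function isUnderLimit(sizeInBytes) {
  return sizeInBytes <= MAX_BYTES;
}

console.log(isUnderLimit(500 * 1024));      // true — a 500KB file is fine
console.log(isUnderLimit(2 * 1024 * 1024)); // false — 2MB is too large
```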

Next, let's set up the return statement of our index.js file to render the JSX form for choosing and uploading videos for transcription:

```jsx
return (
  <div>
    <Head>
      <title>Create Next App</title>
      <meta name="description" content="Generated by create next app" />
      <link rel="icon" href="/favicon.ico" />
    </Head>
    <header>
      <h1>Video transcription with Cloudinary</h1>
    </header>
    <main>
      <section>
        <form onSubmit={handleSubmit}>
          <label>
            <span>Choose your video file</span>
            <input type="file" onChange={handleChange} required />
          </label>
          <button type='submit'>Upload</button>
        </form>
      </section>
      <section id="video-output">
        {videoUrl ?
          <div>
            <div>
              <video controls width={480}>
                <source src={`${videoUrl}.webm`} type='video/webm' />
                <source src={`${videoUrl}.mp4`} type='video/mp4' />
                <source src={`${videoUrl}.ogv`} type='video/ogg' />
              </video>
            </div>
            <button
              onClick={() => {
                saveAs(videoUrl, "transcribed-video")
                setDownloaded(true)
              }}
              disabled={downloaded}>
              {downloaded ? 'Downloaded' : 'Download'}
            </button>
          </div> :
          <p>Please upload a video file to be transcribed</p>
        }
      </section>
    </main>
  </div>
)
```

And with that, we should be able to upload and transcribe videos. As a bonus, I've added a Download button that lets you save the transcribed video for local use. If you enjoyed this, be sure to come back for more. I look forward to all the things you'll build with this feature.

Ekene Eze

Director of DX at Plasmic

I work at Plasmic as the Director of Developer Experience and I love executing projects that help other software engineers. Passionate about sharing with the community, I often write, present workshops, create courses, and speak at conferences about web development concepts and best practices.