Introduction
This article demonstrates how to build a WebRTC app using Next.js and the Firebase API. The application will use a peer-to-peer connection for real-time video and audio communication, with Firebase acting as a third-party signaling server that stores the data needed for stream negotiation. We will also use Cloudinary for online storage of a user's recording.
Codesandbox
The final version of this project can be viewed on Codesandbox.
You can find the full source code on my GitHub repo.
Prerequisites
This article requires entry-level knowledge and understanding of JavaScript and React/Next.js.
Setting Up the Sample Project
In your preferred directory, generate a Next.js project by running the following command in your terminal:
npx create-next-app videocall
Go to your project directory using: cd videocall
Install the necessary dependencies:

npm install firebase cloudinary dotenv @material-ui/core
We will begin by setting up our backend with the Cloudinary feature.
Cloudinary Credentials Setup
In this project, the end-user will use Cloudinary for media upload and storage. To use it, you will need to create an account and log in. Each Cloudinary user has their own dashboard; you can access yours through this Link. In your dashboard, you will find your Cloud name, API Key, and API Secret. These three values are what we need to integrate online storage into our project.
In your project root directory, create a new file named .env and paste the following environment variables:
CLOUDINARY_CLOUD_NAME=
CLOUDINARY_API_KEY=
CLOUDINARY_API_SECRET=
Fill in the above with the values from your Cloudinary dashboard, then restart your project using npm run dev.
Head to the pages/api folder and create a new file named upload.js.
In this file, configure the Cloudinary library with the environment keys so they are not duplicated across the code:
var cloudinary = require("cloudinary").v2;

cloudinary.config({
  cloud_name: process.env.CLOUDINARY_CLOUD_NAME,
  api_key: process.env.CLOUDINARY_API_KEY,
  api_secret: process.env.CLOUDINARY_API_SECRET,
});
Our backend's POST requests will be handled by a handler function as follows:

export default async function handler(req, res) {
  if (req.method === "POST") {
    let url = "";
    try {
      let fileStr = req.body.data;
      const uploadedResponse = await cloudinary.uploader.upload_large(
        fileStr,
        {
          resource_type: "video",
          chunk_size: 6000000,
        }
      );
      // Capture the uploaded file's Cloudinary URL
      url = uploadedResponse.secure_url;
    } catch (error) {
      res.status(500).json({ error: "Something went wrong" });
      return;
    }

    res.status(200).json("backend complete");
  }
}
In the above code, when a POST request is fired, the fileStr variable stores the request's body data, which is then uploaded to the user's Cloudinary account. The upload response's Cloudinary URL is captured and stored in the url variable. An optional next step would be to send that URL back to the front end as the response, e.g. res.status(200).json({ url }). We will settle for the backend complete message for now.
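One practical note: Next.js API routes cap request bodies at 1 MB by default, and a base64-encoded recording will usually exceed that. A minimal sketch of raising the limit, assuming a 100 MB ceiling suits your recordings, exported from the same pages/api/upload.js file:

// Next.js reads this exported config to adjust the route's body parser
export const config = {
  api: {
    bodyParser: {
      sizeLimit: "100mb", // raise or lower to match your longest recordings
    },
  },
};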
This concludes our backend's Cloudinary integration.
Firebase Integration
From Firebase, we will use Firestore, a recommended choice for WebRTC signaling because it can listen to the database in real time.
First, head to the official Firebase website through this link and click the Console option at the top right of the navbar.
If it's your first time, sign up and log in before you access the console.
In the console, you will be required to create a new project through the Add project option. Proceed through the four guided steps to create your project, then click to register a Firebase app.
After registering your app, you will be shown instructions for using the Firebase SDK, which also include your respective Firebase config values. Now we are ready to integrate Firestore into our app.
In your Next.js app's root directory, head back to the .env file and paste the following. Since these values are needed by client-side code, Next.js requires the NEXT_PUBLIC_ prefix to expose them to the browser:

NEXT_PUBLIC_FIREBASE_API_KEY=
NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN=
NEXT_PUBLIC_FIREBASE_PROJECT_ID=
NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET=
NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID=
NEXT_PUBLIC_FIREBASE_APP_ID=
NEXT_PUBLIC_FIREBASE_MEASUREMENT_ID=
Fill in the blanks with the corresponding values from your Firebase config and restart your Next.js project to load the updated .env file.
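The front end below imports firestore from a utils/firebase.js helper that this article assumes but never shows. Here is a minimal sketch of that file; the use of the v9 compat build is my assumption, chosen because it matches the v8-style firestore.collection(...) calls used throughout:

// utils/firebase.js
import firebase from "firebase/compat/app";
import "firebase/compat/firestore";

const firebaseConfig = {
  apiKey: process.env.NEXT_PUBLIC_FIREBASE_API_KEY,
  authDomain: process.env.NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN,
  projectId: process.env.NEXT_PUBLIC_FIREBASE_PROJECT_ID,
  storageBucket: process.env.NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET,
  messagingSenderId: process.env.NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID,
  appId: process.env.NEXT_PUBLIC_FIREBASE_APP_ID,
};

// Guard against re-initialization during Next.js hot reloads
if (!firebase.apps.length) {
  firebase.initializeApp(firebaseConfig);
}

export const firestore = firebase.firestore();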
Client-side configurations
When building a WebRTC app in Next.js, server-side rendering has to be handled with care because we will use APIs that only exist in the browser, such as the window object.
With this in mind, paste the following in your _app.js:
function SafeHydrate({ children }) {
  return (
    <div suppressHydrationWarning>
      {typeof window === 'undefined' ? null : children}
    </div>
  )
}

function MyApp({ Component, pageProps }) {
  return <SafeHydrate><Component {...pageProps} /></SafeHydrate>
}

export default MyApp
In the above code, we detect that we are on the server by checking whether the window object is undefined. We render a div with the suppressHydrationWarning prop, which silences hydration-mismatch errors. We then wrap our page component in the SafeHydrate component so it only renders in the browser.
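An alternative way to achieve the same effect is Next.js's built-in next/dynamic helper with server-side rendering disabled. A brief sketch, where the component path and name are hypothetical:

import dynamic from "next/dynamic";

// The wrapped component only renders in the browser, so it can safely
// use window, navigator.mediaDevices, and other browser-only APIs
const VideoCall = dynamic(() => import("../components/VideoCall"), {
  ssr: false,
});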
For a better understanding of the above concept, use the following Link.
Now we are ready to proceed with our front end.
Front End
In your pages/index.js, start by including the necessary imports.
import React, { useRef } from "react";
import { firestore } from "../utils/firebase";
import Button from '@material-ui/core/Button';
We then use the useRef hook to create the references that will point to our DOM elements:
const webcamButtonRef = useRef();
const webcamVideoRef = useRef();
const callButtonRef = useRef();
const callInputRef = useRef();
const answerButtonRef = useRef();
const remoteVideoRef = useRef();
const hangupButtonRef = useRef();
const videoDownloadRef = useRef();
Add the following below that code:
let videoUrl = null;

let recordedChunks = [];
videoUrl will hold the Cloudinary link of the recorded file. The recordedChunks array will be populated with data from the media recorder's dataavailable events.
Streaming into our video elements will be managed through global variables for the peer connection. Our peer connection references STUN servers hosted by Google, which help peers discover the suitable IP address and port pairs needed to establish a peer-to-peer connection. We also set iceCandidatePoolSize, a 16-bit integer that specifies the size of the prefetched ICE candidate pool. Connections are established faster when ICE candidates are fetched by the ICE agent before the user tries to connect, and changing the candidate pool size can trigger the start of ICE gathering.
Having said this, paste the following below the recordedChunks variable:
const servers = {
  iceServers: [
    {
      urls: [
        "stun:stun1.l.google.com:19302",
        "stun:stun2.l.google.com:19302",
      ],
    },
  ],
  iceCandidatePoolSize: 10,
};

// Global State
const pc = new RTCPeerConnection(servers);

let localStream = null;
let remoteStream = null;

var options = { mimeType: "video/webm; codecs=vp9" };

let mediaRecorder = null;
The behavior of the stream objects will be defined by the localStream and remoteStream variables. The options variable assigns a MIME type for WebM video. The mediaRecorder variable will provide the functionality to record the media.
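Not every browser supports the vp9 codec in WebM, so a defensive check is worth considering before recording. A minimal optional sketch, assuming a plain video/webm fallback is acceptable; the typeof window guard keeps the module evaluable during server-side rendering:

if (typeof window !== "undefined" && !MediaRecorder.isTypeSupported(options.mimeType)) {
  options = { mimeType: "video/webm" }; // let the browser pick its preferred codec
}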
With the above setup, we proceed to code the webCamHandler function. Paste the following below the mediaRecorder variable:
const webCamHandler = async () => {
  localStream = await navigator.mediaDevices.getUserMedia({
    video: true,
    audio: true,
  });

  remoteStream = new MediaStream();

  localStream.getTracks().forEach((track) => {
    pc.addTrack(track, localStream);
  });

  pc.ontrack = (event) => {
    event.streams[0].getTracks().forEach((track) => {
      remoteStream.addTrack(track);
    });
  };

  webcamVideoRef.current.srcObject = localStream;
  remoteVideoRef.current.srcObject = remoteStream;

  mediaRecorder = new MediaRecorder(localStream, options);
  mediaRecorder.ondataavailable = (event) => {
    if (event.data.size > 0) {
      recordedChunks.push(event.data);
      // console.log("recorded chunks", recordedChunks);
    }
  };
  mediaRecorder.start();
}
The webCamHandler is an async function. It begins by requesting the user's audio and video stream. It then creates a MediaStream for the remote peer and adds each of the local stream's tracks to the RTCPeerConnection; a media stream carries at least one media track, and each track is added individually when transmitting media to the remote peer. The peer connection's ontrack handler then populates the remote stream from incoming track events. Finally, the webcamVideoRef and remoteVideoRef video elements are fed the local and remote streams respectively.

The media recorder provides a media recording interface. Here we register a dataavailable event listener that pushes the event data into the recordedChunks array before starting the recorder.
The next function will be the callHandler function. Start by pasting the following:
const callHandler = async () => {
  const callDoc = firestore.collection("calls").doc();
  const offerCandidates = callDoc.collection("offerCandidates");
  const answerCandidates = callDoc.collection("answerCandidates");

  callInputRef.current.value = callDoc.id;

  pc.onicecandidate = (event) => {
    event.candidate && offerCandidates.add(event.candidate.toJSON());
  };

  const offerDescription = await pc.createOffer();
  await pc.setLocalDescription(offerDescription);

  const offer = {
    sdp: offerDescription.sdp,
    type: offerDescription.type,
  };

  await callDoc.set({ offer });

  callDoc.onSnapshot((snapshot) => {
    const data = snapshot.data();
    if (!pc.currentRemoteDescription && data?.answer) {
      const answerDescription = new RTCSessionDescription(data.answer);
      pc.setRemoteDescription(answerDescription);
    }
  });

  answerCandidates.onSnapshot((snapshot) => {
    snapshot.docChanges().forEach((change) => {
      if (change.type === "added") {
        const candidate = new RTCIceCandidate(change.doc.data());
        pc.addIceCandidate(candidate);
      }
    });
  });

  hangupButtonRef.current.disabled = false;
}
This async function begins by creating references to the relevant Firestore collections, which hold the information needed to establish the connection. Each ICE candidate generated locally is saved to the call document's offerCandidates subcollection. We use the createOffer method to produce an SDP offer for the new WebRTC connection, save it to the call document, and listen to that document through the onSnapshot method. The provided callback fires immediately with the current contents of the document and again each time those contents change; once the remote peer writes an answer, we set it as the remote description. The same pattern adds answer candidates to the peer connection as they arrive. Finally, the hangup button is enabled.
Paste the following below the above function to include the answerHandler:
const answerHandler = async () => {
  const callId = callInputRef.current.value;
  const callDoc = firestore.collection("calls").doc(callId);
  const answerCandidates = callDoc.collection("answerCandidates");
  const offerCandidates = callDoc.collection("offerCandidates");

  pc.onicecandidate = (event) => {
    event.candidate && answerCandidates.add(event.candidate.toJSON());
  };

  const callData = (await callDoc.get()).data();

  const offerDescription = callData.offer;
  await pc.setRemoteDescription(new RTCSessionDescription(offerDescription));

  const answerDescription = await pc.createAnswer();
  await pc.setLocalDescription(answerDescription);

  const answer = {
    type: answerDescription.type,
    sdp: answerDescription.sdp,
  };
  await callDoc.update({ answer });

  offerCandidates.onSnapshot((snapshot) => {
    snapshot.docChanges().forEach((change) => {
      if (change.type === "added") {
        let data = change.doc.data();
        pc.addIceCandidate(new RTCIceCandidate(data));
      }
    });
  });
}
In the code above, we begin by reading the call id from the input element and using it to reference the matching call document and its Firestore subcollections. Each ICE candidate generated while answering is saved to the answerCandidates subcollection. We then fetch the call data, set the stored offer as the remote description, create an answer, set it as the local description, and write it back to the call document. Finally, we listen to the offerCandidates subcollection with onSnapshot; whenever a new candidate document is added, it is added to the peer connection.
Below the above function, paste the hangupHandler function:
const hangupHandler = () => {
  localStream.getTracks().forEach((track) => track.stop());
  remoteStream.getTracks().forEach((track) => track.stop());

  mediaRecorder.onstop = async (event) => {
    let blob = new Blob(recordedChunks, {
      type: "video/webm",
    });

    await readFile(blob).then((encoded_file) => {
      uploadVideo(encoded_file);
    });

    videoDownloadRef.current.href = URL.createObjectURL(blob);
    videoDownloadRef.current.download =
      new Date().getTime() + "-locastream.webm";
  };

  // Stopping the tracks ends the recording; the explicit stop() below makes
  // sure the onstop handler fires even if the recorder is still active
  if (mediaRecorder.state !== "inactive") {
    mediaRecorder.stop();
  }
};
Above, we begin by calling getTracks on the local and remote streams and calling each track's stop method. We then handle the stop event using the media recorder's onstop handler; at this point, we both upload the video stream and prepare it for download.
We use a file reader to encode our blob (the recorded chunks) into a base64-encoded data URL, then pass the result to the upload function. The file reader and upload functions are as follows:
function readFile(file) {
  console.log("readFile()=>", file);
  return new Promise(function (resolve, reject) {
    let fr = new FileReader();

    fr.onload = function () {
      resolve(fr.result);
    };

    fr.onerror = function () {
      reject(fr);
    };

    fr.readAsDataURL(file);
  });
}

const uploadVideo = async (base64) => {
  try {
    fetch("/api/upload", {
      method: "POST",
      body: JSON.stringify({ data: base64 }),
      headers: { "Content-Type": "application/json" },
    }).then((response) => {
      console.log("successful session", response.status);
    });
  } catch (error) {
    console.error(error);
  }
}
Once posted to the /api/upload route, the encoded file is handled by the backend endpoint we created earlier and uploaded to Cloudinary.
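One loose end worth noting: the videoUrl variable declared earlier is never assigned. If you take the optional route of having the API respond with the Cloudinary URL via res.status(200).json({ url }), a possible variation of uploadVideo could capture it. The { url } response shape is an assumption, and React state is used in place of the plain videoUrl variable so the download link actually re-renders:

// Inside the page component, replacing `let videoUrl = null;`
const [videoUrl, setVideoUrl] = React.useState(null);

const uploadVideo = async (base64) => {
  try {
    const response = await fetch("/api/upload", {
      method: "POST",
      body: JSON.stringify({ data: base64 }),
      headers: { "Content-Type": "application/json" },
    });
    // Assumes the API route was modified to respond with { url }
    const { url } = await response.json();
    setVideoUrl(url); // feeds href={videoUrl} on the download anchor
  } catch (error) {
    console.error(error);
  }
};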
User Interface
We begin by adding our component styles to the styles/globals.css file:
div {
  /* leverage cascade for cross-browser gradients */
  background: radial-gradient(
    hsl(100 100% 60%),
    hsl(200 100% 60%)
  ) fixed;
  background: conic-gradient(
    hsl(100 100% 60%),
    hsl(200 100% 60%),
    hsl(100 100% 60%)
  ) fixed;
  -webkit-background-clip: text;
  -webkit-text-fill-color: #00000082;
  text-align: center;
}

body {
  background: hsl(204 100% 5%);
  /* background: conic-gradient(
    hsl(100 100% 60%),
    hsl(200 100% 60%),
    hsl(100 100% 60%)
  ) fixed; */
  color: rgb(0, 0, 0);
  padding: 5vmin;
  box-sizing: border-box;
  display: grid;
  place-content: center;
  font-family: system-ui;
  font-size: min(200%, 5vmin);
}

h1 {
  font-size: 10vmin;
  line-height: 1.1;
  max-inline-size: 15ch;
  margin: auto;
}

p {
  font-family: "Dank Mono", ui-monospace, monospace;
  margin-top: 1ch;
  line-height: 1.35;
  max-inline-size: 40ch;
  margin: auto;
}

html {
  block-size: 100%;
  inline-size: 100%;
  text-align: center;
}

.webcamVideo {
  width: 40vw;
  height: 30vw;
  margin: 2rem;
  background: #2c3e50;
}

.videos {
  display: flex;
  align-items: center;
  justify-content: center;
}
Then in our index component's return statement, add the following code:
return (
  <div id="center">
    <h1>Start Webcam</h1>
    <div className="videos">
      <span>
        <p>Local Stream</p>
        <video
          className="webcamVideo"
          ref={webcamVideoRef}
          autoPlay
          playsInline
        />
      </span>
      <span>
        <p>Remote Stream</p>
        <video
          className="webcamVideo"
          ref={remoteVideoRef}
          autoPlay
          playsInline
        ></video>
      </span>
    </div>
    <Button
      variant="contained"
      color="primary"
      onClick={webCamHandler}
      ref={webcamButtonRef}
    >
      Start Webcam
    </Button>
    <p>Create a new call</p>
    <Button
      variant="contained"
      color="primary"
      onClick={callHandler}
      ref={callButtonRef}
    >
      Create Call(Offer)
    </Button>
    <p>Join a call</p>
    <p>Answer the call from a different browser window or device</p>

    {/* <TextField id="filled-basic" label="id" variant="filled" ref={callInputRef} /> */}
    <input ref={callInputRef} />
    <Button
      color="primary"
      variant="contained"
      onClick={answerHandler}
      ref={answerButtonRef}
    >
      Answer
    </Button>
    <p>Hangup</p>
    <Button
      color="primary"
      variant="contained"
      onClick={hangupHandler}
      ref={hangupButtonRef}
    >
      Hangup
    </Button>
    <a ref={videoDownloadRef} href={videoUrl}>
      Download session video
    </a>
  </div>
);
The code above completes the UI for our implementation. You may now proceed to enjoy your video chat.
Happy coding!