Real-Time Communication Web App

Eugene Musebe

Introduction

This article demonstrates how to build a WebRTC app using Next.js and the Firebase API. The application uses a peer-to-peer protocol to enable real-time video and audio communication, with Firebase acting as the third-party signaling server that stores the data needed for stream negotiation. We will also use Cloudinary for online storage of a user's recording.

Codesandbox

The final version of this project can be viewed on Codesandbox.

You can find the full source code on my GitHub repo.

Prerequisites

This article requires entry-level knowledge and understanding of JavaScript and React/Next.js.

Setting Up the Sample Project

In your preferred directory, generate a Next.js project by running the following command in your terminal:

npx create-next-app videocall

Go to your project directory using: cd videocall

Install the necessary dependencies:

npm install firebase cloudinary dotenv @material-ui/core

We will begin by setting up our backend with the Cloudinary integration.

Cloudinary Credentials Setup

In this project, the end-user will use Cloudinary for media upload and storage. To use it, you will need to create an account and log in. Each Cloudinary user has their own dashboard; you can access yours through this link. In your dashboard, you will find your cloud name, API key, and API secret. These three values are what we need to integrate online storage capabilities into our project.

In your project root directory, create a new file named .env and paste in the following environment variables:

CLOUDINARY_CLOUD_NAME=
CLOUDINARY_API_KEY=
CLOUDINARY_API_SECRET=

Fill in the above variables with the values from your Cloudinary dashboard, then restart your project using npm run dev.

Head to the pages/api folder and create a new file named upload.js.

In this file, configure the Cloudinary library with the environment keys so the configuration lives in one place:

var cloudinary = require("cloudinary").v2;

cloudinary.config({
  cloud_name: process.env.CLOUDINARY_CLOUD_NAME,
  api_key: process.env.CLOUDINARY_API_KEY,
  api_secret: process.env.CLOUDINARY_API_SECRET,
});

Our backend's POST requests will be handled by a handler function as follows:

export default async function handler(req, res) {
  if (req.method === "POST") {
    let url = "";
    try {
      const fileStr = req.body.data;
      const uploadedResponse = await cloudinary.uploader.upload_large(
        fileStr,
        {
          resource_type: "video",
          chunk_size: 6000000,
        }
      );
      // Capture the Cloudinary URL of the uploaded video
      url = uploadedResponse.secure_url;
    } catch (error) {
      return res.status(500).json({ error: "Something went wrong" });
    }
    res.status(200).json("backend complete");
  }
}

In the above code, when a POST request is received, the variable fileStr stores the request's body data, which is then uploaded to the user's Cloudinary account. The uploaded file's Cloudinary URL is captured and stored in the url variable. An optional move would be to send that variable back to the front end as the response; we will settle for the "backend complete" message for now.
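If you take that optional route, the only change is the success response; a minimal sketch, assuming the handler above:

// Return the Cloudinary URL to the front end instead of a plain message
res.status(200).json({ url });

The front end could then read it with response.json() rather than just logging response.status.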

This concludes our backend's Cloudinary integration.

Firebase Integration

From Firebase, we will use Firestore, a recommended choice for WebRTC signaling because of its ability to listen to database changes in real time.
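To illustrate what that real-time listening looks like, here is a minimal sketch using the same v8-style Firestore API the rest of this article uses ("some-call-id" is a placeholder):

// Fires once with the current contents, then again on every change
firestore
  .collection("calls")
  .doc("some-call-id")
  .onSnapshot((snapshot) => {
    console.log("current call data:", snapshot.data());
  });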

First, go to the official Firebase website through this link and click the console option at the top right of the navbar.

If it's your first time, make sure to sign up and log in before accessing the console.

In the console, you will be required to create a new project through the Add project option. Proceed through the guided steps to create your project, then register a web app within it.

After registering your app, you will be provided with instructions on how to use the Firebase SDK, including your project's Firebase configuration values. Now we are ready to integrate Firestore into our app.

In your Next.js app's root directory, head to the .env file and add your Firebase configuration values. Because these values are read in the browser, Next.js requires the NEXT_PUBLIC_ prefix for them to be exposed to client-side code (the variable names below are one convention; match whatever names your Firebase helper reads):

NEXT_PUBLIC_FIREBASE_API_KEY=
NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN=
NEXT_PUBLIC_FIREBASE_PROJECT_ID=
NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET=
NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID=
NEXT_PUBLIC_FIREBASE_APP_ID=
NEXT_PUBLIC_FIREBASE_MEASUREMENT_ID=

Fill in the values with the details from your Firebase project, then restart your Next.js dev server so the updated .env file is loaded.
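Later, our front end will import firestore from a utils/firebase helper that this article does not otherwise show. A minimal sketch of that file, assuming the Firebase v8-style API used throughout this article (on Firebase v9+, use the firebase/compat/app and firebase/compat/firestore imports instead) and the NEXT_PUBLIC_ variable names above:

// utils/firebase.js
import firebase from "firebase/app";
import "firebase/firestore";

const firebaseConfig = {
  apiKey: process.env.NEXT_PUBLIC_FIREBASE_API_KEY,
  authDomain: process.env.NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN,
  projectId: process.env.NEXT_PUBLIC_FIREBASE_PROJECT_ID,
  storageBucket: process.env.NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET,
  messagingSenderId: process.env.NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID,
  appId: process.env.NEXT_PUBLIC_FIREBASE_APP_ID,
  measurementId: process.env.NEXT_PUBLIC_FIREBASE_MEASUREMENT_ID,
};

// Guard against re-initialization on Next.js hot reloads
if (!firebase.apps.length) {
  firebase.initializeApp(firebaseConfig);
}

export const firestore = firebase.firestore();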

Client-side configurations

When building a WebRTC app in Next.js, server-side rendering has to be handled with care because we will use modules that only work in the browser, such as the window object.

With this in mind, paste the following in your _app.js:

function SafeHydrate({ children }) {
  return (
    <div suppressHydrationWarning>
      {typeof window === 'undefined' ? null : children}
    </div>
  )
}

function MyApp({ Component, pageProps }) {
  return <SafeHydrate><Component {...pageProps} /></SafeHydrate>
}

export default MyApp

In the above code, we check whether we are on the server by testing whether the window object is undefined. On the server we render nothing; in the browser we render the children inside a div with the suppressHydrationWarning prop, which silences the errors from the hydration mismatch. We then wrap our page component with the SafeHydrate component so every page renders only on the client.

For a better understanding of the above concept, see this link.

Now we are ready to proceed with our front end.

Front End

In your pages/index.js, start by including the necessary imports.

import React, { useRef } from "react";
import { firestore } from "../utils/firebase";
import Button from '@material-ui/core/Button';

We then use the useRef hook to create variables that reference our DOM elements:

const webcamButtonRef = useRef();
const webcamVideoRef = useRef();
const callButtonRef = useRef();
const callInputRef = useRef();
const answerButtonRef = useRef();
const remoteVideoRef = useRef();
const hangupButtonRef = useRef();
const videoDownloadRef = useRef();

Add the following below that code:

let videoUrl = null;

let recordedChunks = [];

videoUrl will hold the Cloudinary link of the recorded file. The recordedChunks array will be populated with the data emitted by the media recorder's event listener.

Streaming into our video elements will be managed through global variables for the peer connection. Our peer connection references free STUN servers hosted by Google, which help peers discover suitable IP address and port pairs when creating peer-to-peer connections. We also set iceCandidatePoolSize, an unsigned 16-bit integer that specifies the size of the prefetched ICE candidate pool. Connections are established faster when ICE candidates are fetched by the ICE agent before the user tries to connect, and changing the candidate pool size may trigger the beginning of ICE gathering.

Having said this, paste the following below the recordedChunks variable:

const servers = {
  iceServers: [
    {
      urls: [
        "stun:stun1.l.google.com:19302",
        "stun:stun2.l.google.com:19302",
      ],
    },
  ],
  iceCandidatePoolSize: 10,
};

// Global state
const pc = new RTCPeerConnection(servers);

let localStream = null;
let remoteStream = null;
const options = { mimeType: "video/webm; codecs=vp9" };

let mediaRecorder = null;

The localStream and remoteStream variables will define the behavior of the stream objects. The options variable assigns a MIME type for WebM video. The mediaRecorder variable will provide the functionality to record the media.

With the above setup, we proceed to code the webCamHandler function. Paste the following below the mediaRecorder variable:

const webCamHandler = async () => {
  // Request the user's audio and video stream
  localStream = await navigator.mediaDevices.getUserMedia({
    video: true,
    audio: true,
  });

  remoteStream = new MediaStream();

  // Push each local track to the peer connection
  localStream.getTracks().forEach((track) => {
    pc.addTrack(track, localStream);
  });

  // Add incoming remote tracks to the remote stream as they arrive
  pc.ontrack = (event) => {
    event.streams[0].getTracks().forEach((track) => {
      remoteStream.addTrack(track);
    });
  };

  webcamVideoRef.current.srcObject = localStream;
  remoteVideoRef.current.srcObject = remoteStream;

  // Record the local stream, collecting data chunks as they become available
  mediaRecorder = new MediaRecorder(localStream, options);
  mediaRecorder.ondataavailable = (event) => {
    if (event.data.size > 0) {
      recordedChunks.push(event.data);
    }
  };
  mediaRecorder.start();
};

The webCamHandler is an async function. It begins by requesting the user's audio and video stream. It then creates a remote MediaStream and connects the local stream to the RTCPeerConnection: the local stream has at least one media track, and each track is individually added to the RTCPeerConnection so it can be transmitted to the remote peer. The peer connection's incoming track event then populates the remote stream. Finally, webcamVideoRef and remoteVideoRef are populated with the local and remote streams respectively.

The media recorder provides a media recording interface. Here we register a dataavailable event listener that pushes the event data into the recordedChunks array, then start the media recorder.

The next function will be the callHandler function. Start by pasting the following:

const callHandler = async () => {
  // Firestore references used for signaling
  const callDoc = firestore.collection("calls").doc();
  const offerCandidates = callDoc.collection("offerCandidates");
  const answerCandidates = callDoc.collection("answerCandidates");

  callInputRef.current.value = callDoc.id;

  // Save each local ICE candidate to the database
  pc.onicecandidate = (event) => {
    event.candidate && offerCandidates.add(event.candidate.toJSON());
  };

  // Create the SDP offer and store it on the call document
  const offerDescription = await pc.createOffer();
  await pc.setLocalDescription(offerDescription);

  const offer = {
    sdp: offerDescription.sdp,
    type: offerDescription.type,
  };

  await callDoc.set({ offer });

  // Listen for the remote answer
  callDoc.onSnapshot((snapshot) => {
    const data = snapshot.data();
    if (!pc.currentRemoteDescription && data?.answer) {
      const answerDescription = new RTCSessionDescription(data.answer);
      pc.setRemoteDescription(answerDescription);
    }
  });

  // When answer candidates are added, register them with the peer connection
  answerCandidates.onSnapshot((snapshot) => {
    snapshot.docChanges().forEach((change) => {
      if (change.type === "added") {
        const candidate = new RTCIceCandidate(change.doc.data());
        pc.addIceCandidate(candidate);
      }
    });
  });

  hangupButtonRef.current.disabled = false;
};

This async function begins by creating variables that reference the user's Firestore collections, which hold the information needed to establish the connection. Each call document contains an offerCandidates sub-collection, and every offer candidate is saved to the database there. We use the createOffer method to generate an SDP offer for the new WebRTC connection and store it on the call document. We then listen to the document with the onSnapshot method: the callback fires immediately with the current contents of the single document, and again every time the contents change, which is how we pick up the remote answer. At the end of the callHandler function, the hangup button is enabled.

Paste the following below the above function to include the answerHandler:

const answerHandler = async () => {
  // Look up the call document using the id typed into the input
  const callId = callInputRef.current.value;
  const callDoc = firestore.collection("calls").doc(callId);
  const answerCandidates = callDoc.collection("answerCandidates");
  const offerCandidates = callDoc.collection("offerCandidates");

  // Save the callee's ICE candidates to the database
  pc.onicecandidate = (event) => {
    event.candidate && answerCandidates.add(event.candidate.toJSON());
  };

  const callData = (await callDoc.get()).data();

  // Apply the stored offer, then create and store an answer
  const offerDescription = callData.offer;
  await pc.setRemoteDescription(new RTCSessionDescription(offerDescription));

  const answerDescription = await pc.createAnswer();
  await pc.setLocalDescription(answerDescription);

  const answer = {
    type: answerDescription.type,
    sdp: answerDescription.sdp,
  };
  await callDoc.update({ answer });

  // Register the caller's ICE candidates as they are added
  offerCandidates.onSnapshot((snapshot) => {
    snapshot.docChanges().forEach((change) => {
      if (change.type === "added") {
        let data = change.doc.data();
        pc.addIceCandidate(new RTCIceCandidate(data));
      }
    });
  });
};

In the code above, we begin by reading the call id from the input element and using it to reference the Firestore collections for that call. Each call contains an answerCandidates sub-collection, and the answering peer saves its ICE candidates there. We fetch the call data from the database, set the stored offer as the remote description, create an answer, and write it back to the call document. Finally, we use the onSnapshot method to listen for the caller's offer candidates: a snapshot is delivered immediately with the current contents of the collection, and again for every change, and each added candidate is handed to the peer connection.
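To make the signaling flow concrete, this is roughly the shape of the data the two handlers leave behind in Firestore (values abbreviated; the candidate fields are whatever RTCIceCandidate.toJSON() produces):

calls/{callId}
  offer:  { type: "offer",  sdp: "v=0 ..." }
  answer: { type: "answer", sdp: "v=0 ..." }
  offerCandidates/{autoId}:  { candidate: "...", sdpMid: "0", sdpMLineIndex: 0, ... }
  answerCandidates/{autoId}: { candidate: "...", sdpMid: "0", sdpMLineIndex: 0, ... }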

Below the above function, paste the hangupHandler function:

const hangupHandler = () => {
  // Stopping every track also stops the MediaRecorder,
  // which fires the onstop handler below
  localStream.getTracks().forEach((track) => track.stop());
  remoteStream.getTracks().forEach((track) => track.stop());

  mediaRecorder.onstop = async (event) => {
    const blob = new Blob(recordedChunks, {
      type: "video/webm",
    });

    // Encode the recording and send it to the backend for upload
    await readFile(blob).then((encoded_file) => {
      uploadVideo(encoded_file);
    });

    // Also offer the recording as a local download
    videoDownloadRef.current.href = URL.createObjectURL(blob);
    videoDownloadRef.current.download =
      new Date().getTime() + "-localstream.webm";
  };
};

Above, we call the getTracks method on both the local and remote streams, iterate over the tracks, and call each track's stop method. We then handle the recorder's stop event using the media recorder's onstop handler. At this point, we can both upload the video stream and offer it as a download.

We will use a FileReader to encode our blob of recordedChunks as a base64 data URL, then pass the encoded file to the upload function. The readFile and uploadVideo functions are as follows:

function readFile(file) {
  return new Promise(function (resolve, reject) {
    const fr = new FileReader();

    fr.onload = function () {
      resolve(fr.result);
    };

    fr.onerror = function () {
      reject(fr);
    };

    // Read the blob as a base64-encoded data URL
    fr.readAsDataURL(file);
  });
}

const uploadVideo = async (base64) => {
  try {
    fetch("/api/upload", {
      method: "POST",
      body: JSON.stringify({ data: base64 }),
      headers: { "Content-Type": "application/json" },
    }).then((response) => {
      console.log("successful session", response.status);
    });
  } catch (error) {
    console.error(error);
  }
};

The encoded file, once posted to the /api/upload route, is handled by the backend endpoint we created earlier and uploaded to Cloudinary.
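One caveat worth noting: Next.js API routes limit the size of the parsed request body (1mb by default), and a base64-encoded recording can easily exceed that. If uploads fail for longer recordings, you can raise the limit by exporting a config object from pages/api/upload.js; the sketch below assumes a 100mb ceiling, which you should adjust to your expected recording size.

// pages/api/upload.js
// Raise the default body-size limit so large base64 payloads are accepted
export const config = {
  api: {
    bodyParser: {
      sizeLimit: "100mb",
    },
  },
};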

User Interface

We begin by introducing our component's styles in the styles/globals.css file:

div {
  /* leverage cascade for cross-browser gradients */
  background: radial-gradient(
    hsl(100 100% 60%),
    hsl(200 100% 60%)
  ) fixed;
  background: conic-gradient(
    hsl(100 100% 60%),
    hsl(200 100% 60%),
    hsl(100 100% 60%)
  ) fixed;
  -webkit-background-clip: text;
  -webkit-text-fill-color: #00000082;
  text-align: center;
}

body {
  background: hsl(204 100% 5%);
  color: rgb(0, 0, 0);
  padding: 5vmin;
  box-sizing: border-box;
  display: grid;
  place-content: center;
  font-family: system-ui;
  font-size: min(200%, 5vmin);
}

h1 {
  font-size: 10vmin;
  line-height: 1.1;
  max-inline-size: 15ch;
  margin: auto;
}

p {
  font-family: "Dank Mono", ui-monospace, monospace;
  margin-top: 1ch;
  line-height: 1.35;
  max-inline-size: 40ch;
  margin: auto;
}

html {
  block-size: 100%;
  inline-size: 100%;
  text-align: center;
}

.webcamVideo {
  width: 40vw;
  height: 30vw;
  margin: 2rem;
  background: #2c3e50;
}

.videos {
  display: flex;
  align-items: center;
  justify-content: center;
}

Then, in our index component's return statement, add the following code:

return (
  <div id="center">
    <h1>Start Webcam</h1>
    <div className="videos">
      <span>
        <p>Local Stream</p>
        <video
          className="webcamVideo"
          ref={webcamVideoRef}
          autoPlay
          playsInline
        />
      </span>
      <span>
        <p>Remote Stream</p>
        <video
          className="webcamVideo"
          ref={remoteVideoRef}
          autoPlay
          playsInline
        ></video>
      </span>
    </div>
    <Button
      variant="contained"
      color="primary"
      onClick={webCamHandler}
      ref={webcamButtonRef}
    >
      Start Webcam
    </Button>
    <p>Create a new call</p>
    <Button
      variant="contained"
      color="primary"
      onClick={callHandler}
      ref={callButtonRef}
    >
      Create Call (Offer)
    </Button>
    <p>Join a call</p>
    <p>Answer the call from a different browser window or device</p>
    <input ref={callInputRef} />
    <Button
      color="primary"
      variant="contained"
      onClick={answerHandler}
      ref={answerButtonRef}
    >
      Answer
    </Button>
    <p>Hangup</p>
    <Button
      color="primary"
      variant="contained"
      onClick={hangupHandler}
      ref={hangupButtonRef}
    >
      Hangup
    </Button>
    <a ref={videoDownloadRef} href={videoUrl}>
      Download session video
    </a>
  </div>
);

The code above completes the UI for our application. You can now start your webcam, create a call in one browser window, answer it from another, and enjoy your video chat.

Happy coding!

Eugene Musebe

Software Developer

I’m a full-stack software developer, content creator, and tech community builder based in Nairobi, Kenya. I am addicted to learning new technologies and love working with like-minded people.