Create a gray-scale webcam with Next.js

Eugene Musebe

Introduction

This article demonstrates how to convert a webcam's background to grayscale while keeping the detected person in full color, using body segmentation.

Codesandbox

You can find the project demo on Codesandbox.

You can also find the GitHub repo at this link.

Prerequisites

Entry-level knowledge of JavaScript, React, and/or Next.js.

Project setup

Create a new Next.js project with npx create-next-app webcamgrayscale. The project involves both a front end and a back end; we will begin with the back end. Navigate into the project in a terminal with cd webcamgrayscale.

The back end relies on Cloudinary for media storage. Create a new Cloudinary account or log in to your existing account using this link. After logging in, your dashboard will show the environment variables needed to integrate Cloudinary into the project. We will use the cloud name, API key, and API secret.

Head to your project root directory and create a file named .env.local. Inside it, paste the following:

1".env.local"
2
3CLOUDINARY_CLOUD_NAME =
4
5CLOUDINARY_API_KEY =
6
7CLOUDINARY_API_SECRET=

Fill in the values above with the environment variables from your Cloudinary dashboard, then restart your project using npm run dev.

In the pages/api directory, create a new file named upload.js which will contain our backend code.

Begin by installing Cloudinary as a dependency: npm install cloudinary

Paste the following code to configure the environment keys and libraries.

1"pages/api/upload.js"
2
3var cloudinary = require("cloudinary").v2;
4
5cloudinary.config({
6 cloud_name: process.env.CLOUDINARY_NAME,
7 api_key: process.env.CLOUDINARY_API_KEY,
8 api_secret: process.env.CLOUDINARY_API_SECRET,
9});

Conclude the back end by creating a handler function for the API POST request. The function below receives the request body from the front end, uploads it to Cloudinary, and sends back the file's Cloudinary URL as the response.

1"pages/api/upload.js"
2
3export default async function handler(req, res) {
4 if (req.method === "POST") {
5 let url = ""
6 try {
7 let fileStr = req.body.data;
8 const uploadedResponse = await cloudinary.uploader.upload_large(
9 fileStr,
10 {
11 resource_type: "video",
12 chunk_size: 6000000,
13 }
14 );
15 url = uploadedResponse.url
16 } catch (error) {
17 res.status(500).json({ error: "Something wrong" });
18 }
19
20 res.status(200).json({data: url});
21 }
22}
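
One caveat: Next.js API routes limit the request body to roughly 1 MB by default, and a base64-encoded video can easily exceed that. If you run into this limit, you can raise it in the same file; the size below is only an example value.

"pages/api/upload.js"

// Raise the default body size limit so the base64-encoded video is accepted.
// The exact limit is an assumption; adjust it to the recordings you expect.
export const config = {
  api: {
    bodyParser: {
      sizeLimit: "100mb",
    },
  },
};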

For the front end, our code will live in pages/index.js. We will need TensorFlow.js and the BodyPix model for person segmentation.
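
If you have not installed them yet, both are available on npm:

npm install @tensorflow/tfjs @tensorflow-models/body-pix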

Import them in your component

1"pages/index"
2
3import React, { useRef, useEffect, useState } from "react";
4import * as tf from "@tensorflow/tfjs";
5import * as bodyPix from "@tensorflow-models/body-pix";

BodyPix will serve as the neural network. We use the MobileNetV1 architecture because it is faster than ResNet50, although less accurate. We also set the outputStride, multiplier, and quantBytes options to tune segmentation accuracy for our case.

1"pages/index"
2
3const modelConfig = {
4 architecture: "MobileNetV1",
5 outputStride: 16,
6 multiplier: 1,
7 quantBytes: 4,
8};

Inside the root component, declare the following variables. We will use them as we proceed.

1"pages/index"
2
3 let ctx_out, video_in, ctx_tmp, c_tmp, c_out;

Create variables to reference the DOM elements via the ref attribute.

1"pages/index"
2
3
4 const processedVid = useRef();
5 const rawVideo = useRef();
6 const startBtn = useRef();
7 const closeBtn = useRef();
8 const videoDownloadRef = useRef();
9 const [model, setModel] = useState(null);

Use the useEffect hook to perform side effects. The side effect in our case will be to load the bodyPix model with the model configuration.

1"pages/index"
2
3useEffect(() => {
4 if (model) return;
5 const start_time = Date.now() / 1000;
6
7 bodyPix.load(modelConfig).then((m) => {
8 setModel(m);
9 const end_time = Date.now() / 1000;
10 console.log(`model loaded successfully, ${end_time - start_time}`);
11 });
12
13}, []);

Include the segmentation configuration. Set the internal resolution to full, meaning the input image will not be resized. Set the segmentation threshold to 0.1, the minimum confidence required before a pixel is considered part of a person, and the score threshold to 0.4, the minimum confidence required to recognize a person at all. Webcam video is mirrored horizontally by default in our case; to correct this, set flipHorizontal to true. Finally, set maxDetections to 1 to limit the number of detections per image.

1"pages/index"
2
3 const segmentationConfig = {
4 internalResolution: "full",
5 segmentationThreshold: 0.1,
6 scoreThreshold: 0.4,
7 flipHorizontal: true,
8 maxDetections: 1,
9 };

Next, declare the variables below

"pages/index"

let recordedChunks = []; let localStream = null; let options = { mimeType: "video/webm; codecs=vp9" }; let mediaRecorder = null; let videoUrl = null;

Create a new function `startCamHandler` to handle the webcam configuration.

"pages/index"

const startCamHandler = async () => {
  console.log("Starting webcam and mic ..... ");
  localStream = await navigator.mediaDevices.getUserMedia({
    video: true,
    audio: false,
  });

  // Populate the video element with the webcam stream
  rawVideo.current.srcObject = localStream;
  video_in = rawVideo.current;
  rawVideo.current.addEventListener("loadeddata", (ev) => {
    console.log("loaded data.");
    transform();
  });

  // Record the raw stream so the original video can be uploaded later
  mediaRecorder = new MediaRecorder(localStream, options);
  mediaRecorder.ondataavailable = (event) => {
    console.log("data-available");
    if (event.data.size > 0) {
      recordedChunks.push(event.data);
    }
  };
  mediaRecorder.start();
};

When the above function fires, the user receives a prompt to allow webcam access. The webcam stream populates our DOM video element, and once its data has loaded, an event listener fires the transform function. We also use a MediaRecorder to collect chunks of recorded data into an array named recordedChunks.
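
Note that not every browser can record with the vp9 codec specified in options. As a small defensive sketch (not part of the original code), you could fall back to a generic webm recording before creating the MediaRecorder:

// Hypothetical fallback: use a plain webm mimeType if vp9 is unsupported.
if (!MediaRecorder.isTypeSupported(options.mimeType)) {
  options = { mimeType: "video/webm" };
}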

The transform function extracts the canvas context from our output canvas DOM element and also creates a temporary canvas used for computing each frame. When computeFrame is called, it first draws the current webcam frame onto the temporary canvas with drawImage, then retrieves the pixel data using getImageData and assigns it to the frame variable. That frame is passed to the segmentPerson method to determine which pixels belong to a person. Each pixel carries four values, the R, G, and B channels plus the alpha (transparency) channel, so the data array is four times the number of pixels. To read a pixel's channels we therefore multiply its index by 4 and add an offset per channel; R needs no offset because it is the first value of each pixel. We then calculate a gray value from the RGB channels and write it to the output whenever the segmentation map value is not 1 for the pixel in iteration, leaving the person's pixels in their original color.
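
As a quick illustration of that arithmetic (hypothetical values, not part of the project code), pixel number 2 occupies indices 8 through 11 of the data array:

const i = 2;                          // third pixel of the frame
const r = frame.data[i * 4];          // index 8  -> red
const g = frame.data[i * 4 + 1];      // index 9  -> green
const b = frame.data[i * 4 + 2];      // index 10 -> blue
const a = frame.data[i * 4 + 3];      // index 11 -> alpha
const gray = 0.3 * r + 0.59 * g + 0.11 * b; // luminance-weighted gray value
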

"pages/index"

let transform = () => {
  c_out = processedVid.current;
  ctx_out = c_out.getContext("2d");

  // Temporary canvas used to read pixel data from the raw webcam frame
  c_tmp = document.createElement("canvas");
  c_tmp.setAttribute("width", 800);
  c_tmp.setAttribute("height", 450);

  ctx_tmp = c_tmp.getContext("2d");

  computeFrame();
};

let computeFrame = async () => {
  // Draw the current webcam frame onto the temporary canvas
  ctx_tmp.drawImage(
    video_in,
    0,
    0,
    video_in.videoWidth,
    video_in.videoHeight
  );

  // Read the frame's pixel data
  let frame = ctx_tmp.getImageData(
    0,
    0,
    video_in.videoWidth,
    video_in.videoHeight
  );

  // Ask BodyPix which pixels belong to a person
  const { data: segmentation } = await model.segmentPerson(
    frame,
    segmentationConfig
  );

  let output_img = ctx_out.getImageData(
    0,
    0,
    video_in.videoWidth,
    video_in.videoHeight
  );

  for (let i = 0; i < segmentation.length; i++) {
    // Extract data into r, g, b, a from imgData
    const [r, g, b, a] = [
      frame.data[i * 4],
      frame.data[i * 4 + 1],
      frame.data[i * 4 + 2],
      frame.data[i * 4 + 3],
    ];

    // Calculate the gray color
    const gray = 0.3 * r + 0.59 * g + 0.11 * b;

    // Keep the person's pixels in color; gray out everything else
    [
      output_img.data[i * 4],
      output_img.data[i * 4 + 1],
      output_img.data[i * 4 + 2],
      output_img.data[i * 4 + 3],
    ] = !segmentation[i] ? [gray, gray, gray, 255] : [r, g, b, a];
  }

  ctx_out.putImageData(output_img, 0, 0);
  setTimeout(computeFrame, 0);
};
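
The recursive setTimeout(computeFrame, 0) schedules the next frame as soon as possible. If you would rather align processing with the browser's paint cycle, requestAnimationFrame is a common alternative (a sketch, not the approach used in this article):

// Alternative scheduling: process one frame per browser repaint.
requestAnimationFrame(computeFrame);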

The above code should produce the grayscale effect. Your configured video should look like the sample below:

{FINAL IMAGE SAMPLE}

Upon closing the webcam, we stop the local stream and convert recordedChunks to a Blob. Stopping every track makes the stream inactive, which in turn stops the MediaRecorder and fires its onstop handler. The blob makes it easy to obtain a base64 representation of the video through a FileReader, which then passes the encoded media file to the uploadVideo function, our final step.

1"pages/index"
2
3function readFile(file) {
4 console.log("readFile()=>", file);
5 return new Promise(function (resolve, reject) {
6 let fr = new FileReader();
7
8 fr.onload = function () {
9 resolve(fr.result);
10 };
11
12 fr.onerror = function () {
13 reject(fr);
14 };
15
16 fr.readAsDataURL(file);
17 });
18 }
19
20 const stopCamHandler = () => {
21 console.log("Hanging up the call ...");
22 localStream.getTracks().forEach((track) => track.stop());
23
24 mediaRecorder.onstop = async (event) => {
25 let blob = new Blob(recordedChunks, {
26 type: "video/webm",
27 });
28
29 // Save original video to cloudinary
30 await readFile(blob).then((encoded_file) => {
31 uploadVideo(encoded_file);
32 });
33
34 videoDownloadRef.current.href = URL.createObjectURL(blob);
35 videoDownloadRef.current.download =
36 new Date().getTime() + "-locastream.webm";
37 };
38 };

The uploadVideo function posts the received media file to the back end, where it is uploaded to Cloudinary for online storage.

1"pages/index"
2
3 const uploadVideo = async (base64) => {
4 console.log("uploading to backend...");
5 try {
6 fetch("/api/upload", {
7 method: "POST",
8 body: JSON.stringify({ data: base64 }),
9 headers: { "Content-Type": "application/json" },
10 }).then((response) => {
11 console.log("successfull session", response.status);
12 });
13 } catch (error) {
14 console.error(error);
15 }
16 };
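
If you also need the uploaded file's Cloudinary URL on the front end, you can read it from the JSON response, since our handler returns it under the data key. A minimal sketch of that variation:

// Sketch: await the response and read the { data: url } payload from /api/upload
const response = await fetch("/api/upload", {
  method: "POST",
  body: JSON.stringify({ data: base64 }),
  headers: { "Content-Type": "application/json" },
});
const { data: cloudinaryUrl } = await response.json();
console.log("uploaded to:", cloudinaryUrl);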

At this point, we have successfully created our front-end functions. Proceed by pasting the code below to fill in the necessary DOM elements in the return statement.

1"pages/index"
2
3 <div className="container">
4 {model && (
5 <>
6 <div className="card">
7 <div className="videos">
8 <video
9 className="display"
10 width={800}
11 height={450}
12 ref={rawVideo}
13 autoPlay
14 playsInline
15 ></video>
16 </div>
17
18 <canvas
19 className="display"
20 width={800}
21 height={450}
22 ref={processedVid}
23 ></canvas>
24 </div>
25 <div className="buttons">
26 <button className="button" onClick={startCamHandler} ref={startBtn}>
27 Start Webcam
28 </button>
29 <button className="button" onClick={stopCamHandler} ref={closeBtn}>
30 Close and upload original video
31 </button>
32 <button className="button">
33 <a ref={videoDownloadRef} href={videoUrl}>
34 Get Original video
35 </a>
36 </button>
37 </div>
38 </>
39 )}
40 {!model && <div>Loading machine learning models...</div>}
41 </div>

The code above results in the following DOM structure:

{DOM STRUCTURE SAMPLE}

We have successfully created our grayscale webcam project. Go through the steps above once more to enjoy the full experience.

Eugene Musebe

Software Developer

I'm a full-stack software developer, content creator, and tech community builder based in Nairobi, Kenya. I am addicted to learning new technologies and love working with like-minded people.