Nona Blog

How to build a video calling POC with AWS Chime and ReactJS

As remote work and remote socialising become mainstream, the popularity of video calling technology has risen dramatically.

Considering this, I set out on a mission to find the quickest, cheapest and most customisable way to implement a web-based video chat application.

Looking for love in all the wrong places

I started by researching protocols (RTMP, RTSP, Apple HLS, etc.), but I soon realised that building something at such a low level was not worth the time and effort.

I then started looking at some managed solutions and quickly came across Twilio’s video chat API and a tutorial for how to implement it in React – https://www.twilio.com/blog/2018/03/video-chat-react.html

Although this seemed by far like the easiest option, the pricing model for Twilio did not make it a viable option for me.

I soon found AWS Chime – https://aws.amazon.com/chime/ – as a potential alternative. Although there is comprehensive documentation on how to build a server-side application that uses Chime, I found nothing that meaningfully tackled the client side of things and nothing at all in ReactJS.

So I decided to do this myself.

Getting started with the build

In this article I will share my findings on how to build a basic video chat proof of concept using the AWS SDK and Chime, with Express as the server and ReactJS as the client.

On the server side we will be doing two things:

  1. Creating a meeting
  2. Creating a single attendee for the meeting

For this we will need the following npm packages:

https://www.npmjs.com/package/aws-sdk

https://www.npmjs.com/package/express

https://www.npmjs.com/package/cors

https://www.npmjs.com/package/uuid

Before we create a meeting we need to do a bit of setup:

  1. Create a new node project with express and a single “get” endpoint – see here if you are not sure how to go about doing that – https://expressjs.com/en/starter/hello-world.html
  2. Configure the JavaScript aws-sdk – https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/configuring-the-jssdk.html

Creating a meeting

The server code to create a meeting is as follows:

const express = require('express')
const app = express()
const AWS = require('aws-sdk')
const { v4: uuid } = require('uuid')
const cors = require('cors')
const region = 'us-east-1'

const chime = new AWS.Chime({ region })
chime.endpoint = new AWS.Endpoint(
    'https://service.chime.aws.amazon.com/console'
)

app.get('/meeting', cors(), async (req, res) => {
    const response = {}
    try {
        response.meetingResponse = await chime
            .createMeeting({
                ClientRequestToken: uuid(),
                MediaRegion: region,
            })
            .promise()
    } catch (err) {
        // Return here so we don't try to send a second response below
        return res.send(err)
    }

    res.send(response)
})

app.listen(5000, () => console.log(`Video calling POC server listening at http://localhost:5000`))

Let’s go through each part of this to figure out what is going on.

After our imports and setting the AWS region we have to create an instance of AWS Chime.

const chime = new AWS.Chime({ region })
chime.endpoint = new AWS.Endpoint(
    'https://service.chime.aws.amazon.com/console'
)

Here we set the Chime region as well as the endpoint.

“https://service.chime.aws.amazon.com/console” is the same for all applications trying to connect to Chime’s services.

Next we create our endpoint:

app.get('/meeting', cors(), async (req, res) => {

We are doing two important things in the above line:

  1. Adding CORS middleware to the request so that we can call it from our React app, which will likely be on a different domain to the server.
  2. Making the endpoint function async, as we are going to await some promises within this function.
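
To make the first point a little more concrete, here is a rough sketch of what a permissive CORS middleware boils down to. This is not the cors package's actual code: allowAnyOrigin is a hypothetical stand-in written purely for illustration, and it only covers the simplest case of answering every request with Access-Control-Allow-Origin: *.

```javascript
// Hypothetical stand-in for cors(), for illustration only.
// Express middleware receives (req, res, next) and calls next() when done.
function allowAnyOrigin(req, res, next) {
    // Tell the browser that pages from any origin may read this response
    res.setHeader('Access-Control-Allow-Origin', '*')
    next()
}
```

In the real endpoint we keep using cors(). For anything beyond a POC you would probably want to restrict the allowed origin to the domain of your React app.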

The meeting creation logic is as follows:

    try {
        const response = {}
        response.meetingResponse = await chime
            .createMeeting({
                ClientRequestToken: uuid(),
                MediaRegion: region,
            })
            .promise()

        res.send(response)
    } catch (err) {
        res.send(err)
    }

Here we are creating our response object. We are going to use this to send information back to the client when the endpoint is hit.

We then assign the meetingResponse attribute on the response object to the result of our create meeting API call.

The ClientRequestToken parameter uniquely identifies the request to create this meeting (which is why we use uuid). It can be any string, as long as each meeting gets a different one. The MediaRegion is the region in which the meeting will be created.
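
As a small illustration of the parameters described above, here is a sketch that builds them in isolation. buildMeetingParams is a hypothetical helper of my own, and I am assuming Node 14.17 or later so that crypto.randomUUID() can stand in for the uuid package.

```javascript
const { randomUUID } = require('crypto')

// Hypothetical helper: builds the object we pass to chime.createMeeting().
// randomUUID() is used here instead of the uuid package (an assumption).
function buildMeetingParams(region) {
    return {
        ClientRequestToken: randomUUID(), // a fresh token for each new meeting
        MediaRegion: region,
    }
}

const first = buildMeetingParams('us-east-1')
const second = buildMeetingParams('us-east-1')

// Each call gets its own token, so each call describes a different meeting
console.log(first.ClientRequestToken !== second.ClientRequestToken) // true
```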

We then catch and send any errors we may get back from this.

At this point we can either return this as a response or create an attendee right off the bat.

Creating a meeting attendee

In this example I have chosen to include creating the attendee in the same Express endpoint call as creating the meeting for the sake of ease.

This code will then look as follows:

    response.attendee = await chime
        .createAttendee({
            MeetingId: response.meetingResponse.Meeting.MeetingId,
            ExternalUserId: uuid(),
        })
        .promise()

In the code above we make another call on our chime object, passing in the meeting id as well as a user id (ExternalUserId). Like createMeeting, it returns a promise.

ExternalUserId can also be anything, as long as it uniquely identifies the user.

We should also include this in our try/catch block (or make a new one).
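
One way to put both calls under a single try/catch is sketched below. The function names createMeetingWithAttendee and meetingHandler are my own, as is the choice to answer errors with a 500 status; the chime object is passed in as an argument, which I have done only so that the sketch can be tried with a stub instead of real AWS credentials.

```javascript
const { randomUUID } = require('crypto')

// `chime` is any object exposing createMeeting/createAttendee in the
// aws-sdk v2 style, where each call returns an object with .promise().
async function createMeetingWithAttendee(chime, region) {
    const meetingResponse = await chime
        .createMeeting({ ClientRequestToken: randomUUID(), MediaRegion: region })
        .promise()

    const attendee = await chime
        .createAttendee({
            MeetingId: meetingResponse.Meeting.MeetingId,
            ExternalUserId: randomUUID(),
        })
        .promise()

    return { meetingResponse, attendee }
}

// Builds an Express-style handler where one try/catch covers both calls.
function meetingHandler(chime, region) {
    return async (req, res) => {
        try {
            res.send(await createMeetingWithAttendee(chime, region))
        } catch (err) {
            // Assumption: report failures as a 500 with just the message
            res.status(500).send({ message: err.message })
        }
    }
}
```

It could then be wired up with something like app.get('/meeting', cors(), meetingHandler(chime, region)).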

Running the server

Now when we run our Express server and hit the API endpoint we should get the following response:

{
   "meetingResponse":{
      "Meeting":{
         "MeetingId":"some unique meeting id",
         "ExternalMeetingId":null,
         "MediaPlacement":{
            "AudioHostUrl":"some AudioHostUrl",
            "AudioFallbackUrl":"some AudioFallbackUrl",
            "ScreenDataUrl":"some ScreenDataUrl",
            "ScreenSharingUrl":"some ScreenSharingUrl",
            "ScreenViewingUrl":"some ScreenViewingUrl",
            "SignalingUrl":"some SignalingUrl",
            "TurnControlUrl":"some TurnControlUrl"
         },
         "MediaRegion":"us-east-1"
      }
   },
   "attendee":{
      "Attendee":{
         "ExternalUserId":"the uuid we passed in for the user",
         "AttendeeId":"attendee id generated by Chime",
         "JoinToken":"a unique token to join the meeting"
      }
   }
}

We will only be using some of these fields for our POC.
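
If you want a quick sanity check on the response before handing it to the client SDK, something like the sketch below could help. missingJoinFields is a hypothetical helper, and the list of fields it checks is my own assumption about what the client needs in order to join (the meeting id, the media placement URLs, the attendee id and the join token).

```javascript
// Hypothetical helper: returns the names of any fields missing from
// the /meeting response. An empty array means the response looks usable.
function missingJoinFields(data) {
    const meeting = data?.meetingResponse?.Meeting
    const attendee = data?.attendee?.Attendee
    const missing = []

    if (!meeting?.MeetingId) missing.push('Meeting.MeetingId')
    if (!meeting?.MediaPlacement) missing.push('Meeting.MediaPlacement')
    if (!attendee?.AttendeeId) missing.push('Attendee.AttendeeId')
    if (!attendee?.JoinToken) missing.push('Attendee.JoinToken')

    return missing
}
```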

The full code for the server can be found here – https://github.com/richlloydmiles/chime-video-calling-server-poc

Creating the client

A caveat to the client side of things is that it does seem a little more involved than the server side, but once we remove a lot of the boilerplate code it’s really not too bad. For the sake of the POC I am also just going to put everything in the App component.

For the client app we are going to need two additional modules:

https://www.npmjs.com/package/amazon-chime-sdk-js

https://www.npmjs.com/package/axios

I am including axios here as it is an easy way to make api calls.

I am going to include the entire POC code below and then walk through each part of it:

import React, { useRef, useState } from 'react';
import './App.css';
import axios from 'axios'

import * as Chime from 'amazon-chime-sdk-js';

function App() {
  const [meetingResponse, setMeetingResponse] = useState()
  const [attendeeResponse, setAttendeeResponse] = useState()
  const [callCreated, setCallCreated] = useState(false)
  const videoElement = useRef()
  const startCall = async () => { 
    const response = await axios.get('http://localhost:5000/meeting')
    setMeetingResponse(response.data.meetingResponse)
    setAttendeeResponse(response.data.attendee)
    setCallCreated(true)
  }

  const joinVideoCall = async () => { 
    const logger = new Chime.ConsoleLogger('ChimeMeetingLogs', Chime.LogLevel.INFO);
    const deviceController = new Chime.DefaultDeviceController(logger);
    const configuration = new Chime.MeetingSessionConfiguration(meetingResponse, attendeeResponse);
    const meetingSession = new Chime.DefaultMeetingSession(configuration, logger, deviceController);

    const observer = {
      audioVideoDidStart: () => {
        meetingSession.audioVideo.startLocalVideoTile();
      },
      videoTileDidUpdate: tileState => {
        meetingSession.audioVideo.bindVideoElement(tileState.tileId, videoElement.current);
      }
    }

    meetingSession.audioVideo.addObserver(observer);
    const firstVideoDeviceId = (await meetingSession.audioVideo.listVideoInputDevices())[0].deviceId;
    await meetingSession.audioVideo.chooseVideoInputDevice(firstVideoDeviceId);
    meetingSession.audioVideo.start();
  }

  return (
    <div className="App">
      <header className="App-header">
        <video ref={videoElement}></video>
        <button disabled={!callCreated} onClick={joinVideoCall}> join call</button>
        <button onClick={startCall}>start call</button>
      </header>
    </div>
  );
}

export default App;

After the initial imports we have some state and ref attributes set up in our component:

  const [meetingResponse, setMeetingResponse] = useState()
  const [attendeeResponse, setAttendeeResponse] = useState()
  const [callCreated, setCallCreated] = useState(false)
  const videoElement = useRef()

The first two will just be the various parts of the response from our endpoint call.

callCreated is a flag that will be set following a successful meeting creation, and videoElement is an html reference that will contain the stream for our video (this is assigned within the jsx section of our component).

Next we have our startCall function:

  const startCall = async () => { 
    const response = await axios.get('http://localhost:5000/meeting')
    setMeetingResponse(response.data.meetingResponse)
    setAttendeeResponse(response.data.attendee)
    setCallCreated(true)
  }

In this function we are querying our API endpoint and setting our three state variables.
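
The POC version assumes the request always succeeds. A slightly more careful variant might look like the sketch below. loadMeeting is a hypothetical helper, and it takes the get function as an argument (rather than importing axios) purely so that it can be tried on its own.

```javascript
// `get` is any function shaped like axios.get: it resolves to an object
// with a `data` property. Passing it in keeps this sketch free of imports.
async function loadMeeting(get, url) {
  try {
    const { data } = await get(url)
    // Guard against a reply that lacks the two parts we rely on
    if (!data || !data.meetingResponse || !data.attendee) {
      return { ok: false, error: 'Unexpected response from server' }
    }
    return { ok: true, meeting: data.meetingResponse, attendee: data.attendee }
  } catch (err) {
    // With axios, network failures and non-2xx responses end up here
    return { ok: false, error: err.message }
  }
}
```

Inside startCall we could then call loadMeeting(axios.get, 'http://localhost:5000/meeting') and only set callCreated to true when ok is true.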

Next we have the joinVideoCall function (which is significantly bulkier than the other functions, so let’s break it down).

const logger = new Chime.ConsoleLogger('ChimeMeetingLogs', Chime.LogLevel.INFO);

First we set up the logger, along with the log level we want to capture. To test this, open the console in your browser while the app is running to see the logs.

const deviceController = new Chime.DefaultDeviceController(logger);

The device controller is the module that controls the devices connected to the session – we will connect this to the meeting in two lines’ time.

const configuration = new Chime.MeetingSessionConfiguration(meetingResponse, attendeeResponse);

The configuration takes in the two parts of the API response: the meeting and the attendee.

Finally we create our meeting session. This is the object we will use going forward.

const meetingSession = new Chime.DefaultMeetingSession(configuration, logger, deviceController);

The next few lines focus on the VideoTile.

VideoTile is a binding of attendee id, a video stream, and a video element that sends out updates to session observers whenever one of its properties changes.

https://aws.github.io/amazon-chime-sdk-js/interfaces/videotile.html

    const observer = {
      audioVideoDidStart: () => {
        meetingSession.audioVideo.startLocalVideoTile();
      },
      videoTileDidUpdate: tileState => {
        meetingSession.audioVideo.bindVideoElement(tileState.tileId, videoElement.current);
      }
    }

    meetingSession.audioVideo.addObserver(observer);

Here we add an observer that listens for 2 events:

  1. audioVideoDidStart – on this event we start our videoTile
  2. videoTileDidUpdate – Bind our videoTile to the video ref
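
The observer above binds every tile update to the same video element, which is fine while we only ever see our own camera. A slightly stricter variant is sketched below. makeLocalVideoObserver is a hypothetical helper of my own, and it relies on the localTile and boundAttendeeId fields of the tile state, as I understand the SDK's VideoTileState.

```javascript
// Hypothetical helper: builds an observer that only binds the local camera.
// `audioVideo` is meetingSession.audioVideo, and `getVideoElement` returns
// the DOM element to render into (for example () => videoElement.current).
function makeLocalVideoObserver(audioVideo, getVideoElement) {
  return {
    audioVideoDidStart: () => {
      audioVideo.startLocalVideoTile()
    },
    videoTileDidUpdate: (tileState) => {
      // Skip tiles not yet bound to an attendee, and tiles from other people
      if (!tileState.boundAttendeeId || !tileState.localTile) return
      audioVideo.bindVideoElement(tileState.tileId, getVideoElement())
    },
  }
}
```

It would be registered the same way as before, with meetingSession.audioVideo.addObserver(makeLocalVideoObserver(meetingSession.audioVideo, () => videoElement.current)).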

Next we want to:

  • Get a list of our currently attached devices
  • Pick out the first one on our list
  • Use it as the current video input
  • Finally we start the input stream

    const firstVideoDeviceId = (await meetingSession.audioVideo.listVideoInputDevices())[0].deviceId;
    await meetingSession.audioVideo.chooseVideoInputDevice(firstVideoDeviceId);
    meetingSession.audioVideo.start();

This always picks the first camera and assumes at least one is attached, so it will need to be updated if we want to add any complexity to this POC.
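
A slightly more defensive version of the device selection could look like the sketch below. pickVideoDevice and its preferredLabel parameter are hypothetical names of my own. The only thing assumed about the SDK is that listVideoInputDevices() resolves to an array of objects with deviceId and label, as the browser's MediaDeviceInfo does.

```javascript
// Hypothetical helper: picks a camera from the list of video input devices.
// Returns null when no camera is attached, instead of throwing.
function pickVideoDevice(devices, preferredLabel) {
  if (!devices || devices.length === 0) return null

  // Use the preferred camera when one was asked for and it is present
  const preferred = preferredLabel
    ? devices.find((device) => device.label === preferredLabel)
    : undefined

  // Otherwise fall back to the first device, as the POC does
  return (preferred || devices[0]).deviceId
}
```

In joinVideoCall we could pass the result of listVideoInputDevices() to this helper and skip chooseVideoInputDevice when it returns null.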

The JSX part of the React app is pretty straightforward:

    <div className="App">
      <header className="App-header">
        <video ref={videoElement}></video>
        <button disabled={!callCreated} onClick={joinVideoCall}> join call</button>
        <button onClick={startCall}>start call</button>
      </header>
    </div>

Here we are attaching our video ref to the video element, then calling the above created functions – either creating or joining the video call.

callCreated exists to stop us trying to join a call that does not yet exist.

The link to the full source code for this can be found here – https://github.com/richlloydmiles/chime-video-calling-client-poc

For the most part the server side of the POC was taken directly from the AWS documentation on GitHub – https://github.com/aws/amazon-chime-sdk-js

Richard
