The pandemic has pushed the big players in cloud computing to offer communication services, and arguably the biggest of them is Amazon with its Amazon Web Services platform. AWS now offers SDKs and APIs for its pre-existing Amazon Chime video conferencing application. This allows customers to easily build their own real-time communication apps with Amazon’s infrastructure as the backbone.
In this post, we will go through setting up a simple videoconferencing app with the Amazon Chime SDK and explore some of the core functionality. Let’s get into it!
First and foremost, we need to clone the Amazon Chime SDK GitHub repository, as it contains a couple of demo applications for us to work with. The SDK can also be added as a dependency to your own project, but for now we’ll just work with the demo code.
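Assuming you have git installed, that looks like this:

```shell
git clone https://github.com/aws/amazon-chime-sdk-js.git
```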
Perfect! Now change directory into the `demos/browser` folder:
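From the directory you cloned into, that's:

```shell
cd amazon-chime-sdk-js/demos/browser
```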
Please make sure you satisfy the following prerequisites to build, test, and run these demos from source:
If you don’t already have Node and NPM installed, use this link: https://nodejs.org/en/download/.
Now we have to initialize our AWS credentials. Make sure the following environment variables are set:
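Something like the following, with the placeholders replaced by your own keys (the region below is just an example):

```shell
export AWS_ACCESS_KEY_ID=<your-access-key-id>
export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
export AWS_DEFAULT_REGION=us-east-1
```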
You can also initialize your credentials with the AWS command line interface using the `configure` command. This stores your credentials in the `~/.aws` directory, from which the Chime SDK will authenticate the user.
Now run:
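At the time of writing, the browser demo is built and served with npm:

```shell
npm install
npm run start
```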
This will build and run the project at http://localhost:8080. If all goes well, you should see the demo’s landing screen.
We will now discuss the core functionality of the sample application provided by the Chime SDK.
First, we need to initialize the SDK. If you get errors here, please check your AWS credentials and roles.
Any WebRTC-based application starts with token generation. So let’s start with that:
As you can see here, we save the response from `chime.createMeeting` in a table so we can later look up the meeting title to know whether the meeting exists or not. We will later send this token to the client securely.
Now what if the meeting already exists? In that case we have to send a token from a different API call, specifically this one:
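A sketch of that attendee-token step (the helper name `createAttendeeToken` is my own; `chime` is again assumed to be an AWS SDK v2 Chime client, and `meeting` is the stored response from `chime.createMeeting`):

```javascript
async function createAttendeeToken(chime, meeting, userName) {
  const attendee = await chime
    .createAttendee({
      MeetingId: meeting.Meeting.MeetingId,
      // ExternalUserId ties this attendee back to a user in your system
      ExternalUserId: userName,
    })
    .promise();
  // The client needs both objects to build its MeetingSessionConfiguration
  return { JoinInfo: { Meeting: meeting, Attendee: attendee.Attendee } };
}
```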
Simple right? Let’s see what we can do with this token on the client side.
In this block of code, the response from `await this.joinMeeting()` is the object we generated on the server.
We will now use this configuration object to initialize our `meetingSession`, which will be responsible for controlling most of the call actions, like so:
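As a browser-side sketch (assuming the `amazon-chime-sdk-js` package is installed, and `joinInfo` is the `{ Meeting, Attendee }` payload returned by our server), the setup looks roughly like:

```javascript
import {
  ConsoleLogger,
  DefaultDeviceController,
  DefaultMeetingSession,
  LogLevel,
  MeetingSessionConfiguration,
} from 'amazon-chime-sdk-js';

// Build the configuration from the server's join response
const configuration = new MeetingSessionConfiguration(
  joinInfo.Meeting,
  joinInfo.Attendee
);

const logger = new ConsoleLogger('ChimeMeetingLogs', LogLevel.INFO);
const deviceController = new DefaultDeviceController(logger);

// meetingSession controls the call; audioVideo is its main facade
const meetingSession = new DefaultMeetingSession(
  configuration,
  logger,
  deviceController
);
const audioVideo = meetingSession.audioVideo;
```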
Keep an eye out for the `meetingSession` and `audioVideo` objects; they will be used repeatedly to set up the call.
To join, we first have to configure an audio and video device with Chime. This is analogous to the publishing step in most WebRTC applications. We create a dropdown of available devices so that users can choose which devices they want to publish with. Let’s look at how we populate and set audio devices:
Ignoring the UI code, it’s important to recognize the two main calls in action here. First, we populate our audio devices using the `this.audioVideo.listAudioInputDevices()` function. Then, when the user selects a device from the dropdown, we select it in our publisher using the `this.audioVideo.chooseAudioInputDevice()` function.
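A condensed sketch of those two calls, as they would run inside the demo class (where `this.audioVideo` is the session facade; `audioInputSelect` is a hypothetical `<select>` element in the page):

```javascript
// Populate the audio input dropdown with the available microphones
const audioInputDevices = await this.audioVideo.listAudioInputDevices();
for (const device of audioInputDevices) {
  const option = document.createElement('option');
  option.value = device.deviceId;
  option.text = device.label;
  audioInputSelect.appendChild(option);
}

// When the user picks a device, tell Chime to capture and publish it
audioInputSelect.addEventListener('change', async () => {
  await this.audioVideo.chooseAudioInputDevice(audioInputSelect.value);
});
```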
Once the user has initialized their devices, we are ready to start the call with one simple line of code:
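That line, on the session's `audioVideo` facade, is simply:

```javascript
this.audioVideo.start();
```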
As of now, we’ve connected to the session and published our stream. There’s one last thing left to do: subscribe to other streams. In this particular example, that happens in the `setupSubscribeToAttendeeIdPresenceHandler()` function.
This is a rather lengthy function that handles UI operations such as binding a volume indicator to show each participant’s mic activity and detect the active speaker. What’s important is the `this.audioVideo.realtimeSubscribeToVolumeIndicator()` call. Despite a name that suggests it only deals with volume, this is the function the demo uses to track each remote participant’s presence, mute state, and signal strength in your session.
We’ve made some great progress! Our client is now successfully connected to the call. But you may ask, what about the stream UI? How does this connect to stuff you can actually see on the screen?
This is an area in which Amazon Chime offers a lot of flexibility; maybe even too much, one could argue. Let’s keep it simple. In your app, you need to make sure your React class implements `AudioVideoObserver`:
If this class of yours also has code to change devices, you’ll have to implement the `DeviceChangeObserver` methods as well. Now you’ll have to implement one of `AudioVideoObserver`’s callbacks, `videoTileDidUpdate(tileState: VideoTileState)`.
This gets called whenever a new participant joins or whenever a user’s video stream changes state. You will use it to bind the stream to an HTML element in your UI with this function call:
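A sketch of that callback inside the class implementing `AudioVideoObserver` (the `'video-' + tileId` element-id scheme is made up for illustration):

```javascript
videoTileDidUpdate(tileState) {
  // Skip tiles not yet bound to an attendee, and content-share tiles
  if (!tileState.boundAttendeeId || tileState.isContent) {
    return;
  }
  const videoElement = document.getElementById('video-' + tileState.tileId);
  this.audioVideo.bindVideoElement(tileState.tileId, videoElement);
}
```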
Here, `videoElement` is the existing element in your HTML DOM where you would like to display the user’s stream, and `tileId` is contained in the `tileState` object you received in the `videoTileDidUpdate` callback.
And there you have it. You’re an Amazon Chime Pro!
Ready to build audio calling, video calling, or screen sharing capabilities into your business or application? Contact the experts at WebRTC.ventures today!