Generative AI is a powerful tool for call centers. It can be integrated as a standalone AI Sales Agent or as a co-pilot providing Agent Assist to a human sales rep, among other uses. Generative AI is particularly well-suited for industries like travel because it can pull from public information about destinations, climates, and the like in order to provide highly personalized advice for would-be travelers.
In this post, we’ll guide you through building an AI Travel Agent (aka voicebot) with the integration of Symbl.ai’s Nebula Large Language Model (LLM) for natural human conversations, Symbl.ai’s Streaming API for real-time transcription, and Amazon Connect’s robust communication platform.
In a later post, Integrating Agent Assist with Symbl.ai Nebula LLM and Amazon Connect, we show an alternate use of this functionality where the Generative AI serves as a co-pilot to a human agent. Of course, the benefits of both the AI Travel Agent and the AI Travel Agent Assist transcend the travel sector to almost any field.
Let’s start with a look at the main players in this project: the Nebula Chat model, Symbl.ai’s Streaming API, and Amazon Connect. Then we will go through the steps of building our AI Travel Agent.
Nebula, Symbl.ai’s proprietary Large Language Model, is optimized specifically for building Generative AI experiences and workflows that involve human conversations. It is capable of processing various types of conversations such as sales, contact center, recruitment, meetings, emails, chats, and more.
Nebula can perform instruction tasks such as generating summaries, suggesting follow-up questions, drafting emails, and flagging issues to review. In the travel agent scenario, it can also identify customer issues, recommend resolutions, and suggest alternative travel arrangements. All this leading to:
Nebula is ideal for scenarios involving human dialogue and supports two distinct model variations.
In our project, we use the Chat model.
Symbl.ai provides a wide range of services and products that leverage advanced natural language processing and machine learning. These tools are adept at analyzing text and speech data to extract valuable insights and intelligence. The features offered by Symbl.ai are diverse, encompassing:
In this project, we use the Streaming API, which uses the WebSocket protocol to process audio and provide conversation intelligence in real time.
Amazon Connect is the Amazon Web Services (AWS) cloud-based contact center service that is integral to the solutions discussed in this blog. It offers a seamless, scalable, and customizable experience for call centers of all types. Amazon Connect is designed for easy setup and scalability, allowing businesses to quickly establish and adapt their contact center operations as needed, without extensive hardware or complex software. This adaptability is crucial for businesses experiencing growth or fluctuating contact volumes.
A significant advantage of Amazon Connect is its ability to integrate seamlessly with other AWS services and third-party applications, as we do here with Symbl.ai’s Nebula LLM and Streaming API. This integration facilitates real-time data processing and intelligence gathering, enhancing customer service.
Amazon Connect also provides tools for customizing customer experiences, including interactive voice response systems and real-time analytics, enabling businesses to tailor interactions to specific customer needs. Moreover, it adheres to AWS’s rigorous standards for data security and compliance, ensuring secure communications and adherence to data protection regulations.
You will need access to the following:
Make sure to set up your Amazon Connect account by claiming your number, setting up routing profiles, agents, queues, contact flows, etc., as described in the ‘Get started with Amazon Connect’ documentation.
Our AI Travel Agent will work like this:
These steps are depicted in the diagram below:
We use Next.js and TypeScript for the client-side code, along with the amazon-connect-streams library for loading the CCP interface to directly receive calls in our web application. We also download the latest versions of the connect-rtc-js and aws-sdk libraries.
Now load the CCP within the Next.js app by following these steps:
The connect-rtc-js library acts as a wrapper around amazon-connect-streams so that we are able to access the RTCSession in our code. With this we are able to get the remote stream and RTCPeerConnection object.
The code for these steps is shown below:
// 1. Call connect.core.initCCP and 2. Configure the CCP
connect.core.initCCP(containerDivRef.current, {
  ccpUrl: "<amazon_connect_url/ccp-v2/>",
  loginPopup: true,
  loginPopupAutoClose: true,
  loginOptions: { autoClose: true },
  // 3. Prevent amazon-connect-streams from using its default CCP
  softphone: { allowFramedSoftphone: false },
  pageOptions: {
    enableAudioDeviceSettings: true,
    enablePhoneTypeSettings: true,
  },
});

// 4. Initialize connect-rtc-js as the softphone manager
connect.core.initSoftphoneManager({ allowFramedSoftphone: true });

// 5. Register a callback that runs when a call is received
connect.core.onSoftphoneSessionInit((e) => {
  // Returns the softphone manager, which is backed by connect-rtc-js
  const softphoneManager = connect.core.getSoftphoneManager();
  connectionIdRef.current = e.connectionId;
  setSoftphoneManager(softphoneManager);
  if (softphoneManager) {
    const session = softphoneManager.getSession(e.connectionId);
    // Underscore-prefixed members are connect-rtc-js internals and may
    // change between library versions.
    const remoteStream = session?._remoteAudioStream;
    session?._onRemoteStreamAdded((event) => {
      console.log({ remoteStreamAdded: event });
    });
  }
});
To connect to the Symbl.ai Streaming API, we first need an access token to authenticate the requests.
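The token request can be sketched as follows. This is a minimal sketch, assuming Symbl.ai's `https://api.symbl.ai/oauth2/token:generate` endpoint and an `appId`/`appSecret` pair from your Symbl.ai account. The `HttpRequest` type and the `buildTokenRequest` helper are hypothetical names introduced for illustration, so check the endpoint and payload against the Symbl.ai authentication documentation before relying on them.

```typescript
// Hypothetical helper type: a plain description of an HTTP request.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Assumption: Symbl.ai issues access tokens from this endpoint in exchange
// for an application's appId and appSecret.
const SYMBL_TOKEN_URL = "https://api.symbl.ai/oauth2/token:generate";

// Builds (but does not send) the token request, so it can be passed to
// whatever HTTP client the server side uses.
function buildTokenRequest(appId: string, appSecret: string): HttpRequest {
  if (!appId || !appSecret) {
    throw new Error("appId and appSecret are both required");
  }
  return {
    url: SYMBL_TOKEN_URL,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ type: "application", appId, appSecret }),
  };
}
```

Because the request carries the app secret, it is safer to send it from server-side code (for example a Next.js API route) with something like `fetch(req.url, req)` and to return only the resulting token to the browser.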
Once the session is connected, you can bind listeners to the onmessage callback to receive updates such as transcription, trackers, sentiments, action_items, etc. from Symbl.ai.
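As a rough illustration of the connection and the onmessage handling, the sketch below builds the WebSocket URL, the initial `start_request` message, and a parser that pulls the transcript out of an incoming event. The URL pattern, the `start_request` fields, and the `message.punctuated.transcript` / `message.isFinal` event shape are assumptions based on Symbl.ai's Streaming API and should be verified against its reference. The function names are hypothetical.

```typescript
// Assumed WebSocket URL shape; connectionId is any unique ID chosen for the session.
function streamingUrl(connectionId: string, accessToken: string): string {
  return (
    `wss://api.symbl.ai/v1/streaming/${encodeURIComponent(connectionId)}` +
    `?access_token=${encodeURIComponent(accessToken)}`
  );
}

// Assumed shape of the first message sent once the socket is open.
function buildStartRequest(sampleRateHertz: number): string {
  return JSON.stringify({
    type: "start_request",
    config: {
      languageCode: "en-US",
      speechRecognition: { encoding: "LINEAR16", sampleRateHertz },
    },
  });
}

interface Transcript {
  text: string;
  isFinal: boolean;
}

// Returns the transcript carried by an event, or null for any other event
// type (trackers, sentiments, action items, malformed payloads, ...).
function extractTranscript(raw: string): Transcript | null {
  let event: any;
  try {
    event = JSON.parse(raw);
  } catch (_e) {
    return null;
  }
  const message = event?.type === "message" ? event.message : undefined;
  const text = message?.punctuated?.transcript;
  if (typeof text !== "string") return null;
  return { text, isFinal: message.isFinal === true };
}
```

In the browser this would be wired up roughly as `ws.onopen = () => ws.send(buildStartRequest(44100))` and `ws.onmessage = (e) => { const t = extractTranscript(e.data); ... }`, with events other than transcripts simply ignored.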
In our case, we are only interested in the transcriptions, and we call the Nebula LLM only when we detect that the user has paused speaking for a specific number of milliseconds.
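One way to implement this pause detection is sketched below. It is not necessarily how the demo does it: the `PauseDetector` class is a hypothetical helper that buffers transcript fragments and releases them as a single utterance once no new speech has arrived for `pauseMs` milliseconds. Timestamps are passed in explicitly so the logic stays easy to test.

```typescript
// Collects transcript fragments and releases them as one utterance once the
// caller has been silent for at least `pauseMs` milliseconds.
class PauseDetector {
  private readonly pauseMs: number;
  private parts: string[] = [];
  private lastSpeechAt = 0;

  constructor(pauseMs: number) {
    this.pauseMs = pauseMs;
  }

  // Call for every finalized transcript fragment.
  push(text: string, nowMs: number): void {
    const trimmed = text.trim();
    if (trimmed.length === 0) return;
    this.parts.push(trimmed);
    this.lastSpeechAt = nowMs;
  }

  // Call periodically (for example from setInterval). Returns the buffered
  // utterance when the pause threshold has been reached, otherwise null.
  poll(nowMs: number): string | null {
    if (this.parts.length === 0) return null;
    if (nowMs - this.lastSpeechAt < this.pauseMs) return null;
    const utterance = this.parts.join(" ");
    this.parts = [];
    return utterance;
  }
}
```

In the app, `push(text, Date.now())` would be called from the transcription handler and `poll(Date.now())` from a short interval timer; a non-null result is the text to send to Nebula. The right threshold is a trade-off: too short and the agent interrupts the caller, too long and the reply feels sluggish.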
This flow is depicted in the image below.
Now for the part we have been waiting for: the power of Generative AI.
As mentioned above, Nebula provides two models: Instruct and Chat. In this blog post we use the latter because we want to simulate a conversation, which requires the Nebula Chat Model to keep track of previous messages.
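Keeping track of previous messages can be sketched as a small history object that is serialized into every chat request. The `Conversation` class is a hypothetical helper, and the body fields (`system_prompt`, `messages` with `role` and `text`, `max_new_tokens`) are assumptions about the Nebula chat request format that should be checked against the Nebula API reference.

```typescript
type Role = "human" | "assistant";

interface ChatMessage {
  role: Role;
  text: string;
}

// Holds the running dialogue so each request carries the earlier turns.
class Conversation {
  private readonly systemPrompt: string;
  private readonly maxMessages: number;
  private messages: ChatMessage[] = [];

  constructor(systemPrompt: string, maxMessages = 20) {
    this.systemPrompt = systemPrompt;
    this.maxMessages = maxMessages;
  }

  add(role: Role, text: string): void {
    this.messages.push({ role, text });
    // Drop the oldest turns when the history grows too long, and keep
    // dropping until the history starts with a human turn again.
    while (
      this.messages.length > this.maxMessages ||
      (this.messages.length > 0 && this.messages[0]?.role !== "human")
    ) {
      this.messages.shift();
    }
  }

  // Assumed request body shape for the Nebula chat endpoint.
  toRequestBody(maxNewTokens = 256): string {
    return JSON.stringify({
      max_new_tokens: maxNewTokens,
      system_prompt: this.systemPrompt,
      messages: this.messages,
    });
  }
}
```

Each utterance from the caller is added as a `human` message before the request is sent, and Nebula's reply is added as an `assistant` message afterwards, so the next request includes both. Capping the history is a design choice made here to keep request size bounded; it is not something the demo is known to do.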
Also note that there are 2 different endpoints for the Chat Model.
In this demo, we use the second option because it lets us receive the response as it is being generated.
Once that text response is received, we send it to the call agent, which converts the text to speech (TTS) and relays the audio to the Amazon Connect call.
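When the response arrives in pieces, one possible way to hand it to TTS is to buffer the streamed text and release it a sentence at a time, so speech can start before the full reply is finished. The `SentenceBuffer` class below is a hypothetical sketch of that idea, not the demo's implementation, and its sentence splitting is deliberately naive (abbreviations such as "Dr." will cause an early split).

```typescript
// Accumulates streamed text and releases complete sentences.
class SentenceBuffer {
  private pending = "";

  // Appends a streamed fragment and returns any sentences completed so far.
  push(delta: string): string[] {
    this.pending += delta;
    const sentences: string[] = [];
    // A sentence is taken to end at '.', '!' or '?' followed by whitespace.
    const boundary = /[.!?]\s+/;
    let match = boundary.exec(this.pending);
    while (match !== null) {
      const end = match.index + 1; // keep the punctuation mark
      sentences.push(this.pending.slice(0, end).trim());
      this.pending = this.pending.slice(match.index + match[0].length);
      match = boundary.exec(this.pending);
    }
    return sentences;
  }

  // Call when the stream ends to release whatever text is left.
  flush(): string | null {
    const rest = this.pending.trim();
    this.pending = "";
    return rest.length > 0 ? rest : null;
  }
}
```

Each sentence returned by `push` (and the remainder returned by `flush`) would be sent to the call agent for speech synthesis in order.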
If you are interested in integrating an AI Agent into your communications platform, reach out to the experts at WebRTC.ventures. Contact us today!