How to build Extensible and Scalable Bot applications

Azure Serverless is gaining momentum these days, and chatbots are at the centre of business transformation. So, what if we build a chatbot on Azure Serverless? This is the topic I presented at the Global Integration Bootcamp 2019 in Calgary.

Through this topic, I demonstrated two things:

  • A Contextual Bot that adds a bit of empathy to the conversation.
  • How to leverage Azure services to develop a Bot that is highly scalable and extensible and that grows with your future business scenarios.

Let’s take a scenario where a customer who purchased your flagship product is delighted with the experience and takes the time to tell the bot that she is truly happy with the purchase. The Bot responds to the customer “You are awesome. Thanks for the great feedback!”

Now, on the flip side, let’s say a customer who purchased your product regrets the purchase and reports it to the chatbot. Can the bot still say “You are awesome. Thanks for the great feedback!”?

The next thing the customer is going to do is take a screenshot of the conversation with the bot, post it on social media and put a dent in your company’s reputation.

There’s a great phrase, written in the ’70s:

“the definition of today’s AI is a machine that can make a perfect chess move while the room is on fire.”

The ideal bot would understand the situation in context and respond “I am sorry for the inconvenience and we will look into this on priority”.

The cherry on top would be to email the entire conversation transcript to a human agent, so that the agent can follow up with a phone call to calm the customer down and reassure them that someone is promptly looking into the reported issue.

These are the 2 scenarios that I would like to explore in this post from an Azure Serverless perspective.

Scenario 1

Firstly, let’s see how we can make the bot contextually respond based on the situation.

Naturally, we use a Bot Dialog (which I won’t go into in detail in this post) to compose the conversation. At the end of the dialog, we call the Text Analytics Cognitive Services API to get the sentiment score. If the score is less than or equal to 0.5, the bot shows the empathetic message; otherwise it shows “You are awesome. Thanks for the feedback”.
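The decision rule described above is small enough to sketch directly. This is an illustrative Python snippet, not the bot's actual code: the function name `choose_reply` is made up for the example, and it assumes the sentiment score is a number between 0 (negative) and 1 (positive).

```python
# Threshold taken from the text: scores at or below 0.5 get the empathetic reply.
EMPATHY_THRESHOLD = 0.5

POSITIVE_REPLY = "You are awesome. Thanks for the feedback"
EMPATHETIC_REPLY = (
    "I am sorry for the inconvenience and we will look into this on priority"
)


def choose_reply(sentiment_score: float) -> str:
    """Pick the bot's closing message from a sentiment score.

    Assumption: the score is in the range 0.0 (negative) to 1.0 (positive).
    """
    if not 0.0 <= sentiment_score <= 1.0:
        raise ValueError(f"sentiment score out of range: {sentiment_score}")
    if sentiment_score <= EMPATHY_THRESHOLD:
        return EMPATHETIC_REPLY
    return POSITIVE_REPLY
```

Note that a score of exactly 0.5 falls on the empathetic side, which errs towards apologizing when the signal is ambiguous.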

And here is a sample architecture for the same:

But, there are a few problems with this architecture.

  1. If you want to change the bot’s response to the customer, you will have to make modifications to your bot code.
  2. If later you decide to use Google’s NLP instead, you will have to change the bot code.
  3. If later you decide to add additional validations with the Text Analytics API, you need to change the bot code.

Ideally, we would like the bot only to engage in the conversation; it doesn’t have to know who is performing the sentiment analysis in the backend. We want some other component to do the heavy lifting of integrating with the Text Analytics API, validating the sentiment and composing the response message that the bot sends to the customer.

So, in technical parlance, all we are talking about is the “Single Responsibility Principle” and the “Facade” pattern. This way, the architecture stays flexible in the face of future changes.

Here is the corrected architecture.

We create a logic app that connects to the Text Analytics API and based on the sentiment response, it composes the response message.

For those new to it, Logic Apps is an Azure iPaaS (integration platform as a service) offering that enables us to quickly build powerful integration solutions. It has an extensive list of connectors for other applications such as Office 365, Dynamics 365, Salesforce, SharePoint, Oracle DB, BizTalk and SAP, to name a few.

A few scenarios include:

  • Notify me when someone uploads a file to the SharePoint Document list.
  • Every day at 9 AM, retrieve the data from a third party application and store it in Dynamics 365.
  • When I win an opportunity in Dynamics 365 notify my team through Teams.

In short, when something happens, the Logic App is triggered and executes a series of actions. If you have used the “IFTTT” app on iOS or Android, think of Logic Apps as its enterprise equivalent. Visit this link to learn more about the Logic App.

In our case, the bot calls the Logic App and gets the response back which in turn will be sent to the customer.
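To make the Facade idea concrete, here is a minimal in-process sketch in Python. In the architecture above the facade is a Logic App that the bot calls; here it is modelled as a plain class so the example stays self-contained. `FeedbackResponder` and `keyword_analyzer` are hypothetical names, and the keyword check is only a toy stand-in for a real sentiment service.

```python
from typing import Callable

# Any function that maps text to a score in [0, 1] can serve as the backend.
SentimentAnalyzer = Callable[[str], float]


class FeedbackResponder:
    """Facade: hides which sentiment backend is used and how replies are composed."""

    def __init__(self, analyze: SentimentAnalyzer, threshold: float = 0.5) -> None:
        self._analyze = analyze
        self._threshold = threshold

    def reply_to(self, feedback: str) -> str:
        score = self._analyze(feedback)
        if score <= self._threshold:
            return "I am sorry for the inconvenience and we will look into this on priority"
        return "You are awesome. Thanks for the feedback"


def keyword_analyzer(text: str) -> float:
    """Toy stand-in for a sentiment API (illustrative only, not a real model)."""
    negative_words = ("regret", "broken", "refund")
    lowered = text.lower()
    return 0.1 if any(word in lowered for word in negative_words) else 0.9


# The bot only ever sees reply_to(); swapping the analyzer needs no bot changes.
responder = FeedbackResponder(keyword_analyzer)
print(responder.reply_to("I regret this purchase"))
```

Because the bot depends only on `reply_to`, the three problems listed earlier (changing the wording, switching NLP providers, adding validations) are all absorbed by the facade.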

Scenario 2

This scenario is about generating the conversation transcript once the conversation has ended. Review the entire conversation to find the sentiment.

  • If positive sentiment, send the transcript to the customer.
  • If negative sentiment, send the transcript to both the customer and the human agent, who follows up further as a goodwill gesture.

Firstly, we want a database to store the conversation as it happens. For this purpose, we are going to utilize Azure Storage. Note that it usually isn’t a good idea to keep the entire conversation in memory and push it to Azure Storage only after the conversation ends. That approach may not scale as well as you think, because it ties the transcript tightly to the memory of the bot instance handling the conversation.
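As a rough illustration of "store each turn as it happens", the Python sketch below appends every message to durable storage immediately instead of buffering the conversation. A local JSON Lines file stands in for Azure Storage here, and the function names and record fields are assumptions made for the example.

```python
import json
from pathlib import Path
from typing import Dict, List


def append_turn(log_path: Path, conversation_id: str, speaker: str, text: str) -> None:
    """Persist a single turn right away (one JSON object per line)."""
    record = {"conversationId": conversation_id, "speaker": speaker, "text": text}
    with log_path.open("a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record) + "\n")


def load_conversation(log_path: Path, conversation_id: str) -> List[Dict[str, str]]:
    """Read back all turns of one conversation, in the order they were written."""
    with log_path.open(encoding="utf-8") as log_file:
        records = [json.loads(line) for line in log_file if line.strip()]
    return [r for r in records if r["conversationId"] == conversation_id]
```

Since nothing is held in the bot's memory between turns, any component can rebuild the transcript later from the store alone.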

The bot code calls EndDialogAsync to indicate that the conversation has ended.

return await stepContext.EndDialogAsync(cancellationToken: cancellationToken);

Right before the above code, we could call a logic app, which retrieves the conversation from the Azure Storage, calls the Text Analytics API to perform the sentiment analysis and based on this, sends an email out to the intended recipients.
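The routing rule for the transcript can be sketched as follows. This is illustrative Python rather than a Logic App definition; the names are hypothetical, and reusing the 0.5 cut-off from Scenario 1 to decide what counts as negative is an assumption.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Turn:
    speaker: str
    text: str


def format_transcript(turns: Sequence[Turn]) -> str:
    """Render the stored turns as a plain-text transcript, one line per turn."""
    return "\n".join(f"{turn.speaker}: {turn.text}" for turn in turns)


def transcript_recipients(
    sentiment_score: float,
    customer_email: str,
    agent_email: str,
    threshold: float = 0.5,  # assumption: same cut-off as in Scenario 1
) -> List[str]:
    """The customer always gets the transcript; the agent only on negative sentiment."""
    recipients = [customer_email]
    if sentiment_score <= threshold:
        recipients.append(agent_email)
    return recipients
```

For example, `transcript_recipients(0.2, "customer@example.com", "agent@example.com")` includes both addresses, while a score of 0.9 yields only the customer.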

But, there are a few problems with this architecture.

What if our bot application is a huge success and we plan to add more intelligence to it, which pushes us to develop many more logic apps?

What if we introduce a few Azure Functions that are also interested in a few of the events happening in the Bot?

This quickly puts the onus on the bot to manage all of these calls, not to mention that the bot code has to be modified every time you introduce a logic app or an Azure Function.

How can we solve this?

It happens that we have already solved this many centuries ago.

When we go to a restaurant, do we walk straight into the kitchen and talk to the various chefs every time we want to order food? Absolutely not! We talk to one person, and that person in turn manages the communication with the chefs.

So, this is our primary architectural pattern we are going to follow for our scenario. But wait, who is our waiter/waitress here?

Event Grid

Event Grid is our waiter here.

For those new to it, Event Grid is an Azure service for managing and routing events from any source to any destination. It lets us decouple event publishers from event handlers, which allows us to build scalable serverless applications.

There are five concepts in Azure Event Grid:

  • Events – What happened.
  • Event Publisher – Where the event took place.
  • Topics – The endpoint where publishers send events.
  • Event subscriptions – The endpoint or built-in mechanism to route events, sometimes to more than one handler. Subscriptions are also used by handlers to intelligently filter incoming events.
  • Event handlers – The app or service reacting to the event.

To learn more about the Event Grid, visit this link.

We send all the events that the bot generates to Event Grid, and the bot’s responsibility ends there. Logic Apps or Azure Functions subscribe with Event Grid to the events they are interested in. As soon as an event generated by the bot hits Event Grid, Event Grid invokes the Logic App, Azure Function or any other supported event handler that has subscribed to that particular event.
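The publish/subscribe flow can be illustrated with a toy in-process router in Python. This is not the Event Grid SDK: `ToyEventRouter` is a made-up stand-in for the service, and the event type name `Bot.ConversationEnded` is hypothetical. The event's field names (`id`, `eventType`, `subject`, `eventTime`, `data`, `dataVersion`) follow the Event Grid event schema.

```python
import uuid
from collections import defaultdict
from datetime import datetime, timezone
from typing import Any, Callable, DefaultDict, Dict, List

Event = Dict[str, Any]
Handler = Callable[[Event], None]


def make_event(event_type: str, subject: str, data: Dict[str, Any]) -> Event:
    """Build an event shaped like the Event Grid event schema."""
    return {
        "id": str(uuid.uuid4()),
        "eventType": event_type,
        "subject": subject,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "data": data,
        "dataVersion": "1.0",
    }


class ToyEventRouter:
    """In-process stand-in for Event Grid: routes events to subscribers by type."""

    def __init__(self) -> None:
        self._subscriptions: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subscriptions[event_type].append(handler)

    def publish(self, event: Event) -> int:
        """Deliver the event to every matching handler; return how many received it."""
        handlers = self._subscriptions.get(event["eventType"], [])
        for handler in handlers:
            handler(event)
        return len(handlers)
```

The publisher only calls `publish`; adding another handler is one more `subscribe` call and needs no change on the bot's side, which is the decoupling the waiter analogy describes.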

So, our final architecture looks like this. As you might have guessed, the bot is the Event Publisher and the Logic App is the Event Handler that gets the sentiment and sends the email out to the intended recipients.

Final Thoughts

Most businesses are very cautious when it comes to investing in virtual assistants. They cite the huge capital undertaking, or the work gets pushed down the priority list because it is generally not a first-class citizen for the business. Meanwhile, customers are already ahead of the game: they use personal assistants day in and day out and are getting accustomed to them faster than we can keep up with our enterprise solutions.

Organizations have to embrace the idea of earmarking budget to build intelligent solutions in order to provide the experience that their customers deserve and demand. Some organizations have heard this message loud and clear and have started investing, but they test the waters before jumping all in. While they test the waters, they should not fall into the trap of building something that is not extensible and maintainable in the future. The solution must lend itself to future changes and enhancements, and this is the core idea of this post. I hope that I have provided some clarity on how to build extensible bot applications. After all, if bots are to replace humans, let them age very well.
