How do chatbots work?
September 15, 2016, 9 min to read
In a nutshell, chatbots are a new way of interacting with services, a new user experience that creates many opportunities. They already existed 20 years ago on IRC, so why are they back now, on mobile? Because in 2015 we reached the point where people use messaging applications more than social network ones. Most of our time on smartphones is now spent in messaging apps, which makes them a natural channel for connecting with consumers, and bots logically followed. However, messaging applications are not the only reason bots entered our lives. Artificial intelligence is the core of any conversational service, and as we have seen over the last few years, AI has been rising, carrying conversational services with it.
The new bot trend is due to two main factors: artificial intelligence and the rise of messaging applications.
What is behind a bot?
There are three main components in a bot architecture. The UI interface is the way users interact with the bot: it could be Messenger, Slack, SMS and many more. Then there is the engine that processes the users' messages. It is built from the same recipe as our other back ends, including a database to store user data, calls to external services, and so on. In the end, it is not so different from a classic architecture with a front end (website or mobile) and a back end served as an API. The main difference is the need for a Natural Language Processing engine, which transforms the user's input into an action that the back-end engine can execute.
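This pipeline can be sketched in a few lines. The function names and the keyword matching below are illustrative stubs, not any real framework; a real bot would delegate the parsing step to an NLP engine.

```python
# Minimal bot engine pipeline: UI message -> NLP -> action -> response.
# All names here are illustrative stubs, not a real framework.

def parse_intent(text):
    """Stub NLP step: map raw text to an (intent, params) pair."""
    if "play" in text.lower():
        # Naive parameter extraction: take the last word as the artist.
        return "play_music", {"artist": text.split()[-1]}
    return "fallback", {}

def execute(intent, params):
    """Back-end step: run the action the NLP step extracted."""
    if intent == "play_music":
        return "Now playing {}".format(params["artist"])
    return "Sorry, I did not understand."

def handle_message(text):
    intent, params = parse_intent(text)
    return execute(intent, params)
```

In a real deployment, `parse_intent` is the part you would replace with a call to an NLP service, while `execute` stays in your own back end.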
Interacting with them
Slack, Messenger, WeChat, and now Skype: messaging platforms are opening interfaces to developers, allowing them to build advanced conversations. However, those well-known platforms are not the only way to reach users. Indeed, you can use any conversational channel, as long as it exposes an interface; SMS, for example, is a simple way to do so. Platforms only expose an interface, often as an API, but they do not provide any artificial intelligence services. Their power comes from the number of users you can reach through them.
Text messages and much more
Text messages are the first element that comes to mind when talking about conversational bots. However, many other interactions can be used to build great user experiences. Sometimes, clicking a button, sending a GPS location or using a contextual menu is faster than typing a long message. It is actually more accurate to see incoming messages as events, because they can carry meta-information about a conversation or an action linked to a button. For example, you could receive an event when a user comes online or when he has read your messages.
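To make the event framing concrete, here are two hypothetical payloads: a text message and a read receipt. The field names are illustrative; each platform defines its own schema.

```python
# Hypothetical event payloads (shapes vary per platform; these are
# illustrative, not any platform's actual schema).
text_event = {"type": "message", "sender": "user_42",
              "text": "I want a margherita pizza"}
read_event = {"type": "read_receipt", "sender": "user_42",
              "last_read": "2016-09-15T10:00:00Z"}

def is_actionable(event):
    # Only some events carry user input worth sending to the NLP engine;
    # others, like read receipts, are pure conversation metadata.
    return event["type"] == "message"
```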
How to choose?
The differences between services, besides the fact that they target different users, lie in the tools provided to developers to build advanced conversations. For example, Slack recently released its button feature, which allows you to build faster interactions, and Messenger now lets you add authentication to a third-party service inside the conversation. Those features will change the way you design interactions and, moreover, the way you implement them. As there is no standard for conversations, it can be difficult to adapt a conversation designed for one service to another.
Interface your engine to the UI platforms
To receive messages from the service of your choice, your bot engine needs to expose an API. The service will then send a request to your API for every event you subscribed to. To answer a message after processing it, you will either respond to that request or send back a new one, depending on the service you are using.
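The body of such a webhook endpoint can be sketched as a plain function, independent of the HTTP framework you mount it behind. The payload shape and the synchronous-reply convention below are assumptions for illustration; check the platform's own documentation for the real schema.

```python
import json

# Sketch of a webhook endpoint body. The payload shape and the "reply"
# convention are illustrative; each platform defines its own.

def webhook(request_body):
    """Handle one incoming event sent by the messaging platform."""
    event = json.loads(request_body)
    if event.get("type") != "message":
        # An event we subscribed to but do not act on (e.g. a receipt).
        return {"status": "ignored"}
    # Placeholder logic: echo the text back in the synchronous response.
    reply = "You said: " + event["text"]
    return {"status": "ok", "reply": reply}
```

In practice you would wrap this in your web framework's route handler, or skip the synchronous reply entirely and push responses through the platform's send API.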
Making them intelligent
We have seen that one of the reasons why bots are flourishing is the rise of Artificial Intelligence, especially Natural Language Processing (NLP). One of the many subtopics of NLP is Natural Language Understanding (NLU), which combines artificial intelligence, computer science, and linguistics. Basically, it is a meaning analysis that extracts intents and metadata such as sentiment. The result can be a logical expression or even an action. For example:
What is the largest city in California? => argmax(λx.city(x) ∧ loc(x, CA), λx.population(x))
Play a song by Rihanna => play_music(artist: Rihanna)
Concepts of NLU
As we have seen when working with Echo at Work, NLU has some specific concepts and vocabulary. Here is a quick sum-up:
Entities: an entity represents a concept. It could be a song, a date or a pizza sauce if you are building a pizza-delivery bot.
Intents: an intent is basically the action or function that the software should execute when a user says something. An intent can be triggered by different inputs, e.g. "I want to listen to a Flume song", "Listen to Flume" or "Play any song from Flume"; all of these should trigger the music player to launch a song from Flume.
The most important concept in NLU is context. When an NLU algorithm analyzes a sentence, it does not have the history of the conversation: if it receives the answer to a question it has just asked, it will not remember the question. To differentiate the phases of your conversation, you need to store its state. It can be either flags like "listening-music" or parameters like "artist: 'Flume'". With context, you can easily chain intents without needing to know what the previous question was.
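A minimal sketch of such a context store, keyed per user, might look like this. The intent names and context fields are illustrative, reusing the Flume example above.

```python
# Per-user conversation context, so that stateless NLU calls can still
# chain intents. Store shape and intent names are illustrative.
contexts = {}

def handle(user_id, intent, params):
    ctx = contexts.setdefault(user_id, {})
    if intent == "play_music":
        ctx["state"] = "listening-music"
        ctx["artist"] = params["artist"]
        return "Playing {}".format(ctx["artist"])
    if intent == "next_song" and ctx.get("state") == "listening-music":
        # "Next" only makes sense because context remembers the artist.
        return "Next song by {}".format(ctx["artist"])
    return "Sorry?"
```

Without the stored `state` and `artist`, the second utterance ("next song") would be meaningless to the bot, since the NLU call itself carries no memory.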
NLU as a service
A lot of NLU (or NLU-related) services have appeared recently to help build bots. When using one of them, you configure all your entities and intents on the service's website, so the NLU part stays separated from your application logic. To extract the intent and parameters of an input, you send the input to the service through its API; if it matches one of the configured intents, you receive that intent along with its parameters. Most services also let you attach an "action" to your intents, so that your code can call the corresponding method.
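On your side, this usually boils down to parsing a small JSON document returned by the service. The field names below are an assumption for illustration, loosely modeled on what such services return; consult your provider's API reference for the actual schema.

```python
import json

# Parse the JSON an NLU service might return for one user input.
# The field names ("result", "action", "parameters") are illustrative.

def extract(response_text):
    data = json.loads(response_text)
    intent = data["result"]["action"]
    params = data["result"]["parameters"]
    return intent, params

# A sample response body, as it might arrive over the service's API.
sample = '{"result": {"action": "play_music", "parameters": {"artist": "Rihanna"}}}'
```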
Wit.ai and API.ai, the two leaders, are quite similar in terms of performance. However, API.ai seems to be taking the lead today thanks to its clearer interface: you can create entities and intents and easily link them to contexts. API.ai also provides built-in conversations (named Domains) that make it fast to handle requests in pre-defined domains such as sport, weather or user authentication. Finally, API.ai also provides English speech recognition, useful for building a Siri-like bot.
Parlez-vous français ?
While NLU services do a great job of parsing English, using them with another language like French leads to much poorer results. The algorithm that matches an input to an intent fails if the input is not almost exactly one of the few phrases you provided to the service as references.
Connecting NLP service and UI platform
If you have tried to build your own bot, you might have seen that most NLP services offer a way to manage responses from the service itself. That is good for discovering the environment and the NLP service, but as soon as you need to interact with an external service or create a user database, you quickly get stuck. If you want to create a custom application, you will need a back end for that. You can use any technology you want, but it will be easier to choose one that has SDKs or libraries for the UI platform and the NLP service you selected.
As we have seen previously, text messages are not the only way to interact with bots. When designing the back end of a bot, you need to choose the events that are relevant for you and handle them. Therefore, you should first have a dispatcher that triggers the proper controller depending on the event you received.
Once you have identified your event, you can extract the intent out of it, if there is any, along with its parameters. For a text message, this is when you have to call the NLP API. For a button, it is much easier, because there is often a direct match between the action it triggers and the corresponding method.
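The dispatcher plus the two extraction paths can be sketched as a simple lookup table. Event types and controller names here are illustrative, and the NLP call is stubbed out.

```python
# Event dispatcher: route each incoming event to its controller.
# Event types and controller names are illustrative.

def on_text(event):
    # Text needs a round trip to the NLP API (stubbed here as a prefix).
    return "nlp:" + event["text"]

def on_button(event):
    # A button payload maps directly to an action, no NLP call needed.
    return "action:" + event["payload"]

CONTROLLERS = {"message": on_text, "postback": on_button}

def dispatch(event):
    controller = CONTROLLERS.get(event["type"])
    # Events with no registered controller are simply ignored.
    return controller(event) if controller else None
```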
Executing the intent
Then, you just have to execute the code linked to this intent and create (or not) a response to send back to the user. Most of your logic should live here: database actions, third-party service calls, analytics, and so on.
Returning the response
Finally, you can send the user the response he is waiting for. Depending on the service you are using, this means either responding to the request you received or making a new call. The second option is more flexible, as it allows you, for example, to send back more than one message.
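The "new call" style can be sketched as follows. Here `send` is a stand-in for the platform's outgoing send API; the message texts are invented for illustration.

```python
# Sketch of the "new outgoing call" style of replying: the bot can push
# several messages for one incoming event. send() stands in for the
# platform's send API; in reality it would make an HTTP call.

outbox = []

def send(user_id, text):
    outbox.append((user_id, text))  # stand-in for the real API call

def reply(user_id):
    # Splitting the answer keeps each message bubble short and readable,
    # something a single synchronous response could not do.
    send(user_id, "Found 3 pizzas near you.")
    send(user_id, "Reply with a number to order.")
```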
Understanding how a bot works, knowing its abilities and limitations, is crucial when planning to build one. Integrating the existing tools into your project is a first technical challenge, but it is only one facet of the job. Designing the right service, by picking tools carefully and orchestrating them in a meaningful way, is at least equally challenging.
On one side, bots benefit from immediate availability on every platform, with no installation needed. They thus come as a great answer to app fatigue. On the other side, they are not suitable for every situation and will not replace applications. They should rather be seen as a way to complete the range of services you can offer your users.
Today we therefore envision successful bots either as technical breakthroughs, or as cleverly designed, app-completing services. The first category requires months of work from an AI research team, the second a savvy and creative PM/developer duo. Pick your side.