By: Jason Evans, Kore.ai CIO
For most enterprise technology endeavors, innovation comes down to a cost-benefit analysis. Companies routinely must choose between launching quickly and launching correctly. If speed is the primary driver, quality suffers; if quality is the primary driver, speed suffers.
The same is true of chatbot development, specifically the natural language processing (NLP) component. Believe it or not, chatbots don’t come right out of the gate with the ability to understand human speech; they need to be trained, just as a human would, before going out into the real world and conversing. Machine learning (ML) is the most common way developers NL-enable a bot to talk to people, systems, and things. But ML requires a substantial amount of time, work, and, most importantly, data to create a bot that can accurately interpret and respond to predetermined inputs.
When it comes to natural language training, the essential question is “can ML alone solve for both the quality of the chatbot’s NL intelligence and the enterprise’s need for speed to market?” The answer to that question, at least for now, is complicated. Under an ML model, development cycles for complex chatbots can quickly lengthen, and in many instances time to deployment becomes a business issue. The greater the accuracy (that is, quality) the chatbot demands, the longer it takes to train. That’s the conventional wisdom for most enterprises hoping to build intelligent chatbots, but it doesn’t have to be that way.
Analyzing Machine Learning and Chatbots
To fully understand why ML presents a game of give-and-take for chatbot training, it’s important to examine the role it plays in how a bot interprets a user’s input. The common misconception is that ML results in a bot understanding language word-for-word. In fact, ML doesn’t look at the words themselves when processing what the user says. Instead, it uses what the developer has trained it with (patterns, data, algorithms, and statistical modeling) to find a match for an intended goal. In the simplest of terms, it would be like a human learning a phrase like “Where is the train station?” in another language without understanding the language itself. Sure, it might serve a specific purpose for a specific task, but it offers no wiggle room or ability to vary the phrase in any way.
To learn like this – the ML way – requires huge amounts of data and teaching to achieve an acceptable degree of accuracy. With ML, it typically takes around 1,000 examples to develop a degree of accuracy that produces positive user experiences.
When there isn’t enough data during the pre-deployment stage (which is usually the case, since there are no users yet to supply it), bot developers must resort to writing custom rules to identify the intent of a message. The simplest of rules may be something like “if a sentence contains the word ‘forecast,’ then the user is asking about the weather.” It sounds like a simple enough fix, but as conversations get longer and more complex, accuracy drops quickly, the bot becomes prone to false positives, and the user experience suffers. Not good if you’ve spent time and resources on a bot that’s a pain to use!
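To make the keyword-rule problem concrete, here is a minimal Python sketch of the “forecast” rule described above. The rule table, the intent label `get_weather`, and the function name `rule_based_intent` are all hypothetical names chosen for illustration; they do not come from any particular bot platform.

```python
import re

# Hypothetical rule table: keyword -> intent label (illustrative names only).
KEYWORD_RULES = {"forecast": "get_weather"}


def rule_based_intent(message: str) -> str:
    """Return the intent of the first keyword found, or 'unknown'."""
    # Lowercase the message and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", message.lower())
    for token in tokens:
        if token in KEYWORD_RULES:
            return KEYWORD_RULES[token]
    return "unknown"


# The rule works for the request it was written for...
print(rule_based_intent("What's the forecast for tomorrow?"))  # get_weather
# ...but it also fires on an unrelated request: a false positive.
print(rule_based_intent("Send me the sales forecast for Q3"))  # get_weather
```

The second call shows how a single keyword can misroute a request that has nothing to do with the weather, and each extra rule added to patch such cases tends to introduce new collisions of its own.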
Using Fundamental Meaning to Aid Your Chatbot’s NLP
Fundamental Meaning is an approach to NLP that’s all about understanding the words themselves. Each user utterance is broken down word-for-word, as if the chatbot were in school diagramming a sentence on the chalkboard. During this process, it’s looking for two things: intent (what the user is asking it to do) and entities (the data needed to complete the task).
For example, if a user types this request to a shopping chatbot:
“I am trying to find a pair of black dress shoes for my husband”
The chatbot would then break the utterance down to its essentials (verbs and nouns) and determine that the intent is “find shoes.” Since the chatbot now knows what it’s supposed to be doing, it can look for entities. In this scenario, “black” (color), “dress” (category), and “husband” (men’s department) give the bot an idea of where to start.
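The breakdown above can be sketched in a few lines of Python. This is only a toy illustration of the idea, not how any real Fundamental Meaning engine is implemented: the word lists (`ACTION_VERBS`, `PRODUCT_NOUNS`, `ENTITY_LEXICON`) and the function `parse_utterance` are assumptions invented for this example, and a production system would rely on proper linguistic analysis rather than hand-written lookups.

```python
import re

# Hypothetical lexicons, invented for illustration only.
ACTION_VERBS = {"find", "buy", "return"}
PRODUCT_NOUNS = {"shoes", "shirt", "jacket"}
ENTITY_LEXICON = {
    "black": ("color", "black"),
    "brown": ("color", "brown"),
    "dress": ("category", "dress"),
    "husband": ("department", "men's"),
    "wife": ("department", "women's"),
}


def parse_utterance(utterance: str):
    """Return (intent, entities) for a shopping request."""
    tokens = re.findall(r"[a-z]+", utterance.lower())
    # Intent: the first known action verb plus the first known product noun.
    verb = next((t for t in tokens if t in ACTION_VERBS), None)
    noun = next((t for t in tokens if t in PRODUCT_NOUNS), None)
    intent = f"{verb} {noun}" if verb and noun else None
    # Entities: every token that maps to a known slot.
    entities = {}
    for token in tokens:
        if token in ENTITY_LEXICON:
            slot, value = ENTITY_LEXICON[token]
            entities[slot] = value
    return intent, entities


intent, entities = parse_utterance(
    "I am trying to find a pair of black dress shoes for my husband"
)
print(intent)    # find shoes
print(entities)  # {'color': 'black', 'category': 'dress', 'department': "men's"}
```

Because the intent and entities are read from the words themselves, rephrasing the request (for example, “find brown shoes for my wife”) still resolves without any additional training examples, as long as the words are in the lexicons.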