While smart speakers and personal assistants are making big strides in the consumer world, conversational systems are entering enterprises with hesitant steps. Enterprises require different types of conversational systems to deliver the next generation information systems to customers. Goal-driven conversational systems are popular. We have used different approaches in building these systems, depending on use: virtual assistance, knowledge synthesis and help desk automation. We have varied the techniques based on the type of knowledge base required and the kind of interaction expected by users. In each of these cases, advancement in deep learning has improved efficacy. Going forward, we see big developments coming in from two directions: general AI and increasingly natural user interfaces.
Conversational experience with machines is a paradigm shift in human–computer interaction. Smart speakers (Amazon Echo, Google Home, Apple HomePod) and personal assistants (Siri, Google Assistant, Cortana) are becoming a part of everyday life. The humanization of technology has just started, and its profound impact is being felt across the globe in all kinds of activities: listening to music, booking cabs, flights, and hotels, shopping, getting personalized product recommendations and offers, following fitness and wellness regimes, and elderly care, to name a few. In contrast to some early efforts to build general conversational systems, such as ELIZA, modern conversational systems have a specific goal, e.g., answering questions on policies or carrying out transactions for us. The proliferation of goal-oriented chatbots, focusing on speech- and text-based questions and answers, is just the beginning of an exciting journey. Enterprises require different types of conversational systems to deliver the next generation of information systems to customers. Some of them are described in this essay.
Virtual assistants are a typical example of a goal-driven conversational system. Such a system can answer completely self-contained questions with good accuracy. In its most popular form, many differently worded questions are mapped to the same answer, and each such group is referred to as a “predefined intent”. The training data for such a system usually comprises many predefined intents. When a user issues a query, the system matches it to one of the predefined intents and shows the corresponding answer. We observed that when the number of intents in a system increases, the accuracy of standard machine learning approaches falls to a very low level and the system becomes unusable.
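The matching step described above can be illustrated with a minimal sketch. It assumes a plain bag-of-words cosine similarity, which stands in for the kind of standard approach the paragraph mentions and is not the deep learning method discussed later. The intent names, example questions, and answers are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical training data: each predefined intent groups several
# phrasings of a question with a single answer.
INTENTS = {
    "leave_balance": {
        "questions": ["how many leaves do I have", "what is my leave balance"],
        "answer": "Your leave balance is shown on the leave portal.",
    },
    "travel_claim": {
        "questions": ["how do I claim travel expenses", "submit a travel reimbursement"],
        "answer": "Submit travel claims through the expense system.",
    },
}


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_intent(query):
    """Return (intent_name, answer, score) for the closest training question."""
    q = bag_of_words(query)
    best_name, best_score = None, -1.0
    for name, intent in INTENTS.items():
        for example in intent["questions"]:
            score = cosine(q, bag_of_words(example))
            if score > best_score:
                best_name, best_score = name, score
    return best_name, INTENTS[best_name]["answer"], best_score
```

With only two intents a word-overlap score is enough to separate them. As intents multiply and their wording starts to overlap, a score of this kind discriminates less and less, which is consistent with the drop in accuracy noted above.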
With the advent of deep learning, the efficacy of such systems has improved to a level at which they can be deployed for mainstream usage, even with a large number of predefined intents. A system containing a large number of intents gives users the impression of being conversationally intelligent. At TCS Research, we have developed novel deep learning–based algorithms for this. Cara, a digital assistant that answers TCS associates’ queries on HR policies, is a good example of virtual assistance.
With most of the popular platforms available in the market, users cannot make a pronoun reference to a fact present in a previously asked question or a previously shown answer. The platforms also lack the capability to carry on an appropriate conversation beyond the pre-configured QA. Further, it is hard for such systems to identify when they have made a mistake. In such cases, we make the system return a completely self-contained answer; users can decide whether the answer is right or wrong and give feedback to the system. Such feedback is referred to the digital assistant’s teachers, who then train the system to answer correctly. We also developed novel approaches that can handle anaphoric references and guide the conversation through to an answer to the user’s question.
The Talent Engagement team (Human Resources) at TCS realized that it was spending nearly 40% of its time answering routine queries through emails, phone calls, and personal interactions, despite putting up policy documents on the company intranet. These interactions were transactional and did not add real value to the team. In India alone, automating HR query resolution could provide multiple benefits: help HR do value-adding work and help the 302,378 employees get quick answers. TCS’ HR teams, in collaboration with our research teams, created a digital assistant, which we call Cara, to automate answering associate queries on TCS’ HR policies.
To this end, the HR teams gathered frequently asked questions (FAQs) from HR personnel deputed in different cities of India and created close to 35,000 queries, which were used to train Cara. HR SMEs continue to train Cara regularly on the questions it did not answer well.
Cara can handle a large number of intents (approximately 1,500, and that number is growing) and can clearly discriminate between queries that are textually similar but have different meanings. It even answers complex questions such as: “I am based in Delhi, while my project location is Chennai. I want to apply for leave according to Delhi. Can I do so?”
At the time of this writing, Cara has been up for almost a year. Users were skeptical at first but now offer enthusiastic support. Cara has answered close to 60,000 queries from various TCS associates. At its peak operational capacity, Cara has handled 650 queries a day. It now covers all HR policies for India.
The same mechanism is also used to execute transactions, such as booking a flight, booking a table in a restaurant, or applying for leave. Here, instead of defining an answer for a given intent, we associate an API with the intent, and the response shown to the user is based on what was returned by the API. To execute such a transaction, the system needs to ask the user for the various parameters needed by the API. The flight booking intent, for example, will require the destination, date of travel, and other details. Such interactions to elicit missing information from users need to be prescripted. Such a dialogue is often modeled internally as a finite-state automaton.
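A prescripted dialogue of this kind, in which an intent is tied to an API and the system asks for each missing parameter in turn, might be sketched as a small finite-state automaton. This is an illustrative sketch only: the slot names, prompts, and the `book_flight_api` stand-in are assumptions, not part of any actual TCS system.

```python
# Hypothetical parameters for a flight-booking intent, in the order asked.
REQUIRED_SLOTS = ["destination", "travel_date"]
PROMPTS = {
    "destination": "Where would you like to fly?",
    "travel_date": "On which date do you want to travel?",
}


def book_flight_api(destination, travel_date):
    """Stand-in for the real booking API associated with the intent."""
    return f"Booked a flight to {destination} on {travel_date}."


class FlightBookingDialogue:
    """Finite-state dialogue: one state per missing slot, then a final DONE state."""

    def __init__(self):
        self.slots = {}
        self.state = self._next_state()

    def _next_state(self):
        """The first slot still missing, or DONE when every slot is filled."""
        for slot in REQUIRED_SLOTS:
            if slot not in self.slots:
                return slot
        return "DONE"

    def prompt(self):
        """Question for the current state, or None once the dialogue is done."""
        return None if self.state == "DONE" else PROMPTS[self.state]

    def handle(self, user_reply):
        """Store the reply in the pending slot and move to the next state.

        Returns the next prompt, or the API response once all slots are filled.
        """
        if self.state == "DONE":
            raise RuntimeError("dialogue already finished")
        self.slots[self.state] = user_reply.strip()
        self.state = self._next_state()
        if self.state == "DONE":
            return book_flight_api(**self.slots)
        return self.prompt()
```

A conversation would start with `dialogue.prompt()` and then pass each user reply to `dialogue.handle(...)` until the API response comes back. Each state corresponds to exactly one pending parameter, which is why such scripts have to be written out in advance for every transaction.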
When information about a group in an enterprise is required by people from within or outside the group, systems such as the virtual assistance described earlier are not a great fit, mainly due to the effort involved in training and maintenance. Information about a group is mostly factual rather than transactional or measurable, and changes more frequently. It comprises organizational structure, key people, and documents, such as best practices and case studies. The metadata of these documents (authors, publication channel, business domain, keywords, and so on) is found to be more useful than the actual content for retrieval of these documents. Analysts believe that this problem falls in the realm of knowledge management. We created a solution which can store such information in a knowledge graph and which has a deep learning–based component that helps users retrieve factual information from the knowledge graph against a natural language query. This system also proactively engages with specific people in a natural language conversation in order to synthesize knowledge and keep the knowledge graph up to date. We, therefore, call this type of conversational system knowledge synthesis.
One of TCS’ internal systems uses knowledge synthesis to retrieve information about TCS Research; it is made available to users through a digital assistant named Loca. It provides useful information to practitioners and business leaders about all that is happening in a specific technology area and who the right people are to connect with in TCS Research. The knowledge graph of this system comprises information on various research projects, research groups, researchers, and so on. An example query on this knowledge graph is, “Which research areas in TCS use deep learning?” Here “deep learning” is recognized as a keyword. Such keywords are associated with research publications and reusable assets that are, in turn, associated with research projects being executed by research groups. Further, a research area (such as “life sciences”) can include many research groups. To answer such a question effectively, the system traverses the knowledge graph. It is because of this traversal that users tend to think that the system performs logical reasoning. There have been many attempts in the research literature at making open-domain QA over a knowledge graph effective. However, not many of them are capable of performing deep traversal of the knowledge graph. In our experience, deep learning–based approaches have brought significant gains in the efficacy of the system used for querying a knowledge graph.
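The traversal behind the example query can be sketched as a chain of hops from keyword to publication or asset, then to project, group, and research area. In this sketch the graph contents and relation names are hypothetical, and the relation path is hard-coded; in the system described above, a deep learning–based component works from the natural language query instead.

```python
# Hypothetical knowledge graph: (node, relation) -> list of connected nodes.
EDGES = {
    ("deep learning", "keyword_of"): ["paper_1", "asset_7"],
    ("robotics", "keyword_of"): ["paper_2"],
    ("paper_1", "produced_by"): ["project_a"],
    ("asset_7", "produced_by"): ["project_b"],
    ("paper_2", "produced_by"): ["project_c"],
    ("project_a", "executed_by"): ["group_x"],
    ("project_b", "executed_by"): ["group_y"],
    ("project_c", "executed_by"): ["group_z"],
    ("group_x", "belongs_to"): ["life sciences"],
    ("group_y", "belongs_to"): ["analytics"],
    ("group_z", "belongs_to"): ["embedded systems"],
}

# Assumed path for "Which research areas use <keyword>?".
AREA_PATH = ("keyword_of", "produced_by", "executed_by", "belongs_to")


def follow(start_nodes, relation):
    """Follow one relation from every start node, de-duplicating the results."""
    found = []
    for node in start_nodes:
        for target in EDGES.get((node, relation), []):
            if target not in found:
                found.append(target)
    return found


def research_areas_using(keyword):
    """Walk the four-hop path from a keyword to the research areas that use it."""
    nodes = [keyword]
    for relation in AREA_PATH:
        nodes = follow(nodes, relation)
    return nodes
```

Here `research_areas_using("deep learning")` reaches its answer only after four hops, none of which mentions a research area directly. This is the sense in which a deep traversal can look like reasoning to the user.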
It is important to keep the information in the knowledge graph up to date. Loca proactively engages with users to ask questions about their work and, based on their replies, updates the knowledge graph. Such a system has two types of users: those who can update information about their work (teachers) and those who are only allowed to query or retrieve information (users). Proactive questions are put to teachers based on missing information in the knowledge graph and are prioritized based on what users have queried. The system also allows teachers to update information on their own initiative using the natural language interface. A key strength of such a system is that if users’ queries are not appropriately answered due to missing or unavailable information in the knowledge graph, it can identify such failures and initiate a remedial flow to learn from teachers. Adding proactive user engagement to a natural language–based querying mechanism elevates the conversational system from a mere QA system to an artificially intelligent system capable of synthesizing knowledge from multiple stakeholders in an organization. We believe that knowledge synthesis can revolutionize the space of knowledge management.
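The prioritization of proactive questions could look roughly like the following sketch, which ranks missing facts by how often users asked about them. The graph contents, the query log, and the question template are all hypothetical.

```python
from collections import Counter

# Hypothetical facts the graph should hold: (entity, attribute) -> value,
# with None marking information that is still missing.
GRAPH = {
    ("project_a", "lead"): "A. Researcher",
    ("project_a", "domain"): None,
    ("project_b", "lead"): None,
    ("project_c", "lead"): None,
}

# Hypothetical log of the (entity, attribute) pairs users have asked about.
QUERY_LOG = [
    ("project_b", "lead"),
    ("project_a", "domain"),
    ("project_b", "lead"),
    ("project_a", "lead"),
]


def questions_for_teachers(graph, query_log):
    """List questions about missing facts, most frequently queried first."""
    demand = Counter(query_log)
    missing = [fact for fact, value in graph.items() if value is None]
    # list.sort is stable, so facts with equal demand keep their graph order.
    missing.sort(key=lambda fact: -demand[fact])
    return [f"What is the {attribute} of {entity}?" for entity, attribute in missing]
```

Facts that nobody has asked about still appear in the list, only last, so the graph fills in over time while teachers are asked first about what users actually need.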
Help desk automation
Similar to some of the initial efforts of goal-driven conversational systems in the late 90s, we have developed a conversational system for help desk automation.
Normally, in a help desk system, users are first required to choose a multilevel category under which they want to raise a ticket and then provide a textual description of the problem. This multilevel category is actually a path from the root node to a leaf node of a tree, and it is often selected with the help of dynamically populated dropdown fields in the user interface. These category annotations on the tickets are used to assign each ticket to the appropriate support staff. Sometimes, the help desk staff call the requestor and ask a few questions to determine the right category, and then update the system without changing the original ticket description. Therefore, there is a need for a system that asks the requestor the right questions in order to determine the correct category. This can be achieved with engineering-based approaches combined with virtual assistance, but doing so requires significant manual effort to set up and configure the conversational system for every new use case.
With the help of deep learning–based approaches, we have developed a novel system for this. Using these approaches, it is possible to train a conversational system that automatically decides what questions to ask the user, similar to what the help desk staff would have asked, to determine the right category. This also involves a form of root cause analysis. The strength of this approach is that the application can be trained on the historical data of a help desk system and requires very little human effort to set up and configure for new use cases.
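To make the idea of choosing questions concrete, here is a deliberately simple sketch. It does not reproduce the deep learning approach described above: it swaps in a greedy heuristic that always asks the yes/no question that splits the remaining candidate categories most evenly. The category paths, question names, and expected answers are hypothetical, and are written by hand here rather than learned from historical tickets.

```python
# Hypothetical leaf categories, each a root-to-leaf path, with the answer a
# ticket in that category would give to each yes/no question.
CATEGORIES = {
    ("Hardware", "Laptop", "Battery"): {"is_hardware": True, "is_laptop": True, "powers_on": False},
    ("Hardware", "Laptop", "Screen"): {"is_hardware": True, "is_laptop": True, "powers_on": True},
    ("Hardware", "Printer", "Paper jam"): {"is_hardware": True, "is_laptop": False, "powers_on": True},
    ("Software", "Email", "Login"): {"is_hardware": False, "is_laptop": False, "powers_on": True},
}


def yes_count(question, candidates):
    """Number of candidate categories for which the answer would be yes."""
    return sum(CATEGORIES[c][question] for c in candidates)


def best_question(candidates, asked):
    """Pick the unasked question that splits the candidates most evenly."""
    questions = {q for c in candidates for q in CATEGORIES[c]} - set(asked)
    # Keep only questions whose answer actually differs between candidates.
    useful = [q for q in sorted(questions) if 0 < yes_count(q, candidates) < len(candidates)]
    if not useful:
        return None
    return min(useful, key=lambda q: abs(2 * yes_count(q, candidates) - len(candidates)))


def categorize(answer_fn):
    """Ask questions until at most one category remains; return (candidates, asked)."""
    candidates, asked = list(CATEGORIES), []
    while len(candidates) > 1:
        question = best_question(candidates, asked)
        if question is None:
            break
        asked.append(question)
        reply = answer_fn(question)
        candidates = [c for c in candidates if CATEGORIES[c][question] == reply]
    return candidates, asked
```

In this toy tree a battery ticket is identified in two questions instead of three dropdown levels. The heuristic still depends on a hand-built table of expected answers, which is the kind of manual setup effort that training on historical help desk data is meant to avoid.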
The future: natural and intelligent . . .
At TCS, we have not only deployed innovative conversational systems, such as Cara and Loca, for our own internal use, but have also delivered such solutions to our customers. We have also developed a platform that makes it very easy for subject matter experts (SMEs) to develop similar enterprise-class, production-strength systems. The platform internally uses state-of-the-art deep learning and machine learning–based algorithms to render these digital assistants. We expect that the use of these conversational systems in enterprises will be pervasive. Right now, enterprises work with forms, search, and menu-driven knowledge mining. In some cases, we see basic natural language interfaces to systems.
Going forward, we see big developments coming in from two directions: general AI and increasingly natural user interfaces. In the near term, we expect that various purpose-specific conversational systems in an organization will converge into a single conversational system. In general, the conversational intelligence of such digital assistants will also improve.
Today’s conversational systems focus on text/speech recognition and NLP. However, nonverbal communication represents nearly two-thirds of all human communications. We believe that future conversational systems will use vision, gestures, emotions, touch, augmented reality (AR), haptic feedback, and other types of input to provide connected users a truly interactive experience.
Conversation via voice channel
In a rapidly digitizing world, we interact with more and more digital interfaces. The quest is on to make these interactions voice-based. The technology to understand human speech signals is mature and is available for use on standard cellphones. We have also developed a speech-based interface for Cara and are trying to bring it into mainstream usage. For adoption of speech as a primary interface in enterprise systems, some challenges still need to be addressed, such as: What if a user stammers? Will security inspectors accept voice recognition as a means of authentication? We have also done initial pilots with humanoid robots as another channel for users to interact with such systems.