
Natural Language Processing

admin — Wed, 04 Oct 2017 15:59:39 +0000

Natural Language Understanding covers a wide spectrum of Natural Language Processing and Machine Learning applications, ranging from text categorization to complex tasks such as document comprehension. The complexity of the application determines the breadth and depth of the required language understanding; however, it invariably involves mapping the surface form (text) to an application-internal semantic representation (a categorization label, a sequence of domain-specific concepts, domain-independent representations such as rhetorical relations, etc.).

Generating these mappings might require NLP/ML tasks of varying complexity. For some applications sentence-level analysis is sufficient, while others require analysis of language beyond the sentence boundary. Examples of such tasks include semantic parsing, discourse parsing, summarization, etc., some of which are demonstrated below.

Natural Language Understanding for Conversational Systems

Natural Language Understanding for open-ended conversational systems (such as social bots, chat bots and multimodal conversational agents in general) requires a rich annotation of the sentence/utterance at the local level (e.g. entities) and at the global level (e.g. single or multiple sentences). The following video shows an example of an automatic system for open-domain NLU. The system produces the following annotation:

Segmentation into functional units (i.e. discourse segments an input utterance can be divided into)
Dialogue Act: a combination of semantic dimension (category of the user intent), communicative function (the intent) and qualifiers (additional specification of the user behavior, such as sentiment)
Entities (any user-provided content which may be used as a topic of a conversation)
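
As an illustration only, the three annotation layers listed above could be held in a structure like the following Python sketch. The class and field names (`DialogueAct`, `FunctionalUnit`, `AnnotatedUtterance`, `dimension`, and so on) and the example label values are assumptions made for this sketch, not the system's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DialogueAct:
    # Hypothetical field names; label values used below are illustrative only.
    dimension: str   # semantic dimension: category of the user intent
    function: str    # communicative function: the intent itself
    qualifiers: List[str] = field(default_factory=list)  # e.g. sentiment


@dataclass
class FunctionalUnit:
    text: str                  # one discourse segment of the input utterance
    dialogue_act: DialogueAct
    entities: List[str] = field(default_factory=list)  # candidate conversation topics


@dataclass
class AnnotatedUtterance:
    text: str
    units: List[FunctionalUnit]

    def all_entities(self) -> List[str]:
        """Collect the entities of every functional unit, in order."""
        return [e for unit in self.units for e in unit.entities]


utterance = AnnotatedUtterance(
    text="I love jazz. Do you know Miles Davis?",
    units=[
        FunctionalUnit("I love jazz.",
                       DialogueAct("task", "inform", ["positive"]),
                       ["jazz"]),
        FunctionalUnit("Do you know Miles Davis?",
                       DialogueAct("task", "question"),
                       ["Miles Davis"]),
    ],
)
print(utterance.all_entities())
```

Keeping the dialogue act and entities on each functional unit, rather than on the whole utterance, mirrors the local/global distinction described above.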

The annotation pipeline was successfully used for the Roving Mind social bot during the Alexa Prize Challenge 2017.

Discourse Parsing

Discourse analysis is one of the most challenging tasks in Natural Language Processing, with applications in many language technology areas such as opinion mining, summarization, information extraction, etc. Penn Discourse Treebank (PDTB) style discourse parsing is a composite task of detecting explicit and non-explicit discourse relations, their connective and argument spans, and assigning a sense to these relations. The system presented in the video is an end-to-end discourse parser developed for the CoNLL 2015 and 2016 Shared Tasks on Shallow Discourse Parsing, where it ranked 2nd and 3rd respectively.

Conversational Machines

admin — Tue, 03 Oct 2017 11:02:20 +0000

A conversational agent is software that is able to answer specific or generic user requests, to interact with users and to help them accomplish their goals. These goals may be explicitly defined, as in the case of an artificially intelligent agent helping users plan a vacation trip. In other cases the goals may not be clearly defined, as in a problem-solving task or in a socially entertaining robot interaction.

The complexity of such systems varies widely and can include the ability to process multimodal human input, data from acoustic scene sensors and massive amounts of data. We have been developing technology for such agents (spoken dialog systems for telephone, desktop browsers, smartphones, social bots and consumer robots) since the early nineties.

The demos below show various agents and their components developed over the years.

Roving Mind — Amazon Alexa social bot

A group of students from the Signals and Interactive Systems Lab at the University of Trento (Italy) was chosen as one of the 12 sponsored university teams from around the world to participate in the Alexa Prize, a one-year competition sponsored by Amazon to advance conversational Artificial Intelligence. The aim of the competition is to create a social computer (“bot”) able to have coherent and engaging conversations with humans about popular topics such as politics or sport.

Paper: Cervone A., Tortoreto G., Mezza S., Gambi E. and Riccardi G., “Roving Mind: a balancing act between open-domain and engaging dialogue systems”, 1st Alexa Prize Conference, Las Vegas, 2017.

2017

Conversational Agents

2013

2009 (pre-iPhone, pre-SIRI, pre-Alexa, pre-HeyGoogle)

Talk @ AVIOS Conference and Demo:

Dialog Management with Reinforcement Learning using POMDPs

Conversational agents need to choose among a very large set of alternatives in reaction to the user’s input, selecting the response most likely to lead to the accomplishment of a task. In this demo we show the internal belief states of such machines, which are modeled as Partially Observable Markov Decision Processes (POMDPs). The demo displays all the intermediate steps of the decision process associated with the user’s voice input and the system’s prompt.
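
A belief state is a probability distribution over hidden user states that is revised after each observation. The following is a minimal sketch of the standard POMDP belief update for one fixed system action. The state names, probabilities and function name are invented for illustration and are not taken from the demo's implementation.

```python
from typing import Dict


def belief_update(
    belief: Dict[str, float],
    transition: Dict[str, Dict[str, float]],
    obs_likelihood: Dict[str, float],
) -> Dict[str, float]:
    """Standard POMDP belief update for one fixed action and one observation.

    belief[s]          : current probability of hidden state s
    transition[s][s2]  : P(s2 | s, action)
    obs_likelihood[s2] : P(observation | s2, action)
    """
    unnormalized = {}
    for s_next, p_obs in obs_likelihood.items():
        # Prediction step: probability of reaching s_next from any current state.
        predicted = sum(belief[s] * transition[s].get(s_next, 0.0) for s in belief)
        # Correction step: weight by how well s_next explains the observation.
        unnormalized[s_next] = p_obs * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the current belief")
    return {s: p / total for s, p in unnormalized.items()}


# Toy example (hypothetical): two possible user goals that do not change
# during the turn, and a noisy recognizer that heard the word "flight".
belief = {"wants_flight": 0.5, "wants_hotel": 0.5}
transition = {
    "wants_flight": {"wants_flight": 1.0},
    "wants_hotel": {"wants_hotel": 1.0},
}
heard_flight = {"wants_flight": 0.8, "wants_hotel": 0.2}

new_belief = belief_update(belief, transition, heard_flight)
print(new_belief)
```

In this toy case the belief shifts toward `wants_flight` without discarding the alternative, which is what lets a dialog manager recover from recognition errors.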

Varges S., Riccardi G., Quarteroni S. and Ivanov A. V., “POMDP Concept Policies and Task Structures for Hybrid Dialog Management”, ICASSP, Prague, 2011.

2009

Affective Analysis and Summarization of Conversations

admin — Mon, 02 Oct 2017 12:20:39 +0000

With the expansion of the call center industry, spoken conversation data is being generated in overwhelming amounts. In call centers, where the goals are to evaluate the expertise of operators and to understand the content of the call in terms of topics, callers’ concerns and emotions, an automatic summary should contain a range of indicators that address all these aspects and are useful for monitoring call quality.

The SENSEI automatic spoken conversation summary is organized along several dimensions (objective conversation descriptions) such as factual metrics, emotional labels, discourse, and written synopses. Together these dimensions form an affective and content summary of a call and provide a wider perspective on the conversation.

Extractive and Abstractive Summarization

Call-center conversation synopses are short summaries of the events taking place during a conversation between a caller (or user) and one or more agents. Such a synopsis should contain a description of the user need or problem, and how the agent solves that problem. It might also describe the attitude of the caller and the agent.
While extractive synopses are generated by selecting the most important conversation turns, abstractive summarization is template-based. The templates are learned by extracting frequent patterns from hand-written synopses and generalizing them into slot variables; the templates are then filled with entities extracted from a conversation transcript.
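
The generalize-then-fill idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the slot names (`PROBLEM`, `ACTION`), the example synopsis and the helper names are hypothetical, and the frequent-pattern extraction step is not shown.

```python
from string import Template
from typing import Dict


def generalize(synopsis: str, entities: Dict[str, str]) -> str:
    """Turn a hand-written synopsis into a template by replacing
    each known entity mention with a $SLOT variable."""
    template = synopsis
    for slot, value in entities.items():
        template = template.replace(value, "$" + slot)
    return template


def fill(template: str, entities: Dict[str, str]) -> str:
    """Fill a template with entities extracted from a new transcript.
    Raises KeyError if a slot has no matching entity."""
    return Template(template).substitute(entities)


# Hypothetical hand-written synopsis and its annotated entities.
synopsis = "The caller asks about a lost card and the agent blocks the card."
template = generalize(synopsis, {"PROBLEM": "a lost card", "ACTION": "blocks the card"})

# Entities assumed to have been extracted from a different conversation.
filled = fill(template, {"PROBLEM": "a delayed refund", "ACTION": "opens a ticket"})
print(template)
print(filled)
```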

Overlap Discourse

Overlapping speech occurs frequently in human-human conversations and indicates the level of co-operation between the speakers. The description consists of statistics on the competitive and non-competitive overlaps occurring in a conversation; a high number of competitive overlaps signals a problematic call.
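
A rough sketch of such overlap statistics follows, assuming each overlap has already been detected and labeled as competitive or not. The `Overlap` record, the statistic names and the 0.5 threshold are assumptions for illustration, not values from the SENSEI system.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Overlap:
    start: float        # seconds from the beginning of the call
    end: float
    competitive: bool   # True if a speaker tries to take the floor


def overlap_statistics(overlaps: List[Overlap], call_duration: float) -> Dict[str, float]:
    """Summarize the overlaps of one call."""
    n_competitive = sum(1 for o in overlaps if o.competitive)
    total_time = sum(o.end - o.start for o in overlaps)
    return {
        "n_overlaps": len(overlaps),
        "n_competitive": n_competitive,
        "competitive_ratio": n_competitive / len(overlaps) if overlaps else 0.0,
        "overlap_time_fraction": total_time / call_duration if call_duration > 0 else 0.0,
    }


def is_problematic(stats: Dict[str, float], threshold: float = 0.5) -> bool:
    """Flag a call whose share of competitive overlaps exceeds a
    hypothetical threshold."""
    return stats["competitive_ratio"] > threshold


overlaps = [
    Overlap(1.0, 2.0, True),
    Overlap(5.0, 5.5, False),
    Overlap(9.0, 10.5, True),
]
stats = overlap_statistics(overlaps, call_duration=60.0)
print(stats, is_problematic(stats))
```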

Emotion Recognition

Besides providing an affective description of a conversation, the identification of basic and complex emotions such as anger, frustration, empathy and satisfaction has a straightforward application to the evaluation of the call itself, as well as of the operator’s expertise in handling situations. The emotion recognition models detect the presence of empathy on the agent’s side, and the presence of anger, frustration and satisfaction on the client’s side.

Stepanov E. A., Favre B., Alam F., Chowdhury A. S., Singla K., Trione J., Bechet F. and Riccardi G., “Automatic Summarization of Call-Center Conversations”, IEEE ASRU, Scottsdale, 2015.

Alam F., Danieli M. and Riccardi G., “Annotating and Modeling Empathy in Spoken Conversations”, Computer Speech and Language, vol. 50, pp. 40-61, July 2018.

Personal Healthcare Agents (PHA)

admin — Sun, 01 Oct 2017 03:21:56 +0000

“Agent” is an overloaded term in the computer science and artificial intelligence domain.

More recently, researchers and practitioners have become accustomed to interacting with digital agents, speaking to them in natural language and having relatively simple tasks executed. Such agents can retrieve contacts, set alarms, provide driving directions to your work location and occasionally surprise with funny jokes. This current state of the art is a tremendous achievement that follows decades of research over the past 50 years [Riccardi, 2014].

Personal Healthcare Agents (PHA) will change people’s lives and revolutionize the way they manage their wellbeing and health. They will be able to sense the environment, personal and social behavior, as well as the human organ systems. They will elaborate, interpret, summarize and make sense of these signals and share them with you as well as your caregivers. They will play a key role in providing evidence for personalized therapies and in doctors’ decision-making processes. PHAs will support and motivate people to steer their habits towards healthy lifestyles. PHAs will engage patients to behave according to doctors’ recommendations and prescriptions. PHAs may be granted the mission to communicate amongst themselves to share information and make sense of information and trends within demographic groups across geographical and urban areas at different scales.

It will take at least 50 years. It will need new technology and research, changes in people’s habits, disruption of doctors’ and health professionals’ protocols, innovation in healthcare services and the education of next-generation doctors, engineers and professionals.

Let the research and technology journey begin.

For more information see here.

Broadcast News Summarization

admin — Sun, 24 Sep 2017 01:54:23 +0000

With the growing importance of the Internet, a constantly growing amount of multimedia is being generated, such as broadcast TV and radio programs, vlogs, etc.
Being able to efficiently summarize the content of such sources of information has the potential to affect various sectors. The primary carrier of information in such sources is speech. Thus, broadcast news processing is essentially extractive speech summarization. It involves tasks such as Automatic Speech Recognition, Topic Segmentation and Extractive Summarization, as well as other tasks, such as Sentiment Analysis, that enhance the value of the summary.

Broadcast news processing consists of the following automated tasks:

Automatic transcription of the audio of the news
Segmentation of the broadcast into topical segments (news)
Extraction of the key phrases from the news segments
Extractive summarization of the news segments
Sentiment Analysis of news segments
Topic and sentiment trend analysis

Every day one of our servers automatically downloads and processes broadcast news from the LA7 YouTube channel. The extracted information is then presented in a user-friendly interface using sections, color-coding and animation effects. Additionally, a sentiment trend of the news is constructed from the daily news broadcasts.
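
One simple way to build a sentiment trend from per-segment results is to average the segment scores of each day. The sketch below assumes each news segment already carries a numeric sentiment score in [-1, 1]; the score range, the function name and the sample values are assumptions, not details of the deployed system.

```python
from collections import defaultdict
from datetime import date
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def daily_sentiment_trend(
    segments: Iterable[Tuple[date, float]],
) -> List[Tuple[date, float]]:
    """Average the sentiment scores of all news segments broadcast on
    the same day and return the averages in chronological order."""
    by_day: Dict[date, List[float]] = defaultdict(list)
    for day, score in segments:
        by_day[day].append(score)
    return [(day, mean(scores)) for day, scores in sorted(by_day.items())]


# Hypothetical (broadcast date, sentiment score) pairs, one per news segment.
segments = [
    (date(2017, 9, 24), 1.0),
    (date(2017, 9, 23), 0.5),
    (date(2017, 9, 23), -0.5),
]
trend = daily_sentiment_trend(segments)
print(trend)
```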

Social Media Analytics and Summarization

admin — Sat, 23 Sep 2017 12:31:00 +0000

Social Media Analytics is the process of “extracting valuable hidden insights from vast amounts of semi-structured and unstructured social media data to enable informed and insightful decision making” (Khan, 2015). Since a large portion of social media data exists in the form of conversations (e.g. Tweets, forums, etc.), one aspect of analytics is the summarization of these conversations.

Social Media Conversation Summarization

Summarization of social media conversations produces the “Town Hall Summary” that

a) identifies the main issues discussed in a set of reader comments and

b) characterizes opinions offered on these issues, identifying alternative viewpoints, indicating the strength of interest in an issue or support for different viewpoints (aggregation), indicating consensus or agreement among the comments, indicating disagreement among the comments, indicating qualitatively how opinion was distributed (e.g. using phrases like “Many said this; others said that”, “some said”, “most said”), indicating evidence or grounds for a viewpoint and indicating whether the discussion was particularly emotional/heated and, if so, over what.

The challenge of producing such summaries is addressed by article-comment linking, topical clustering, cluster labelling and extractive and template-based summarization techniques.

Riccardi G., Bechet F., Danieli M., Favre B., Gaizauskas R., Kruschwitz and Poesio M., “The SENSEI Project: Making Sense of Human Conversations”, Lecture Notes on Artificial Intelligence, J.F. Quesada et al. (Eds.), vol. 9577, pp. 10-33, 2016.

Brexit Referendum Use Case

In the month preceding the referendum date, SENSEI’s system monitored millions of social media conversations to predict the outcome of the referendum.

Every day, more than 300,000 posts on the topic of the UK EU Referendum, drawn from multilingual media sources, were captured and automatically analysed by the SENSEI technology. Most exit polls showed confidence that the REMAIN side would prevail. In contrast, the SENSEI system predicted the final outcome with very high accuracy.

Celli F., Stepanov E. A., Poesio M. and Riccardi G., “Predicting Brexit: Classifying Agreement is Better than Sentiment and Pollsters”, PEOPLES Workshop at , Osaka 2016.