Natural Language Processing

Natural Language Understanding covers a wide spectrum of Natural Language Processing and Machine Learning applications, ranging from text categorization to complex tasks such as document comprehension. The complexity of the application determines the breadth and depth of the language understanding required; however, it invariably involves mapping the surface form (text) to an application-internal semantic representation (a categorization label, a sequence of domain-specific concepts, domain-independent representations such as rhetorical relations, etc.).

Generating these mappings may require NLP/ML tasks of varying complexity. For some applications sentence-level analysis is sufficient, while others require analysis of language beyond the sentence boundary. Examples of such tasks include semantic parsing, discourse parsing, summarization, etc., some of which are demonstrated below.

Natural Language Understanding for Conversational Systems

Natural Language Understanding for open-ended conversational systems (such as social bots, chat bots and multimodal conversational agents in general) requires rich annotation of each sentence/utterance at both the local level (e.g. entities) and the global level (e.g. single or multiple sentences). The following video shows an example of an automatic system for open-domain NLU. The system produces the following annotation:
  • Segmentation into functional units (i.e. the discourse segments into which an input utterance can be divided)
  • Dialogue Act: a combination of a semantic dimension (the category of the user intent), a communicative function (the intent itself) and qualifiers (additional specification of the user behavior, such as sentiment)
  • Entities (any user-provided content which may be used as a topic of a conversation)
The annotation pipeline was successfully used for the Roving Mind social bot during the Alexa Prize Challenge 2017.
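
The three annotation layers listed above can be pictured as a nested data structure. The following is only a minimal sketch: the class names, field names, and dialogue act labels are hypothetical and chosen for illustration, and they are not the tag set or output format of the system shown in the video.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DialogueAct:
    dimension: str   # semantic dimension: category of the user intent
    function: str    # communicative function: the intent itself
    qualifiers: List[str] = field(default_factory=list)  # e.g. sentiment


@dataclass
class Entity:
    text: str
    start: int  # character offsets into the full utterance, end exclusive
    end: int


@dataclass
class FunctionalUnit:
    text: str
    start: int
    end: int
    dialogue_act: DialogueAct
    entities: List[Entity] = field(default_factory=list)


def locate(utterance: str, phrase: str):
    """Return (start, end) offsets of the first occurrence of phrase."""
    start = utterance.index(phrase)
    return start, start + len(phrase)


def entities_in(units: List[FunctionalUnit]) -> List[str]:
    """Collect entity strings across all functional units, in order."""
    return [e.text for u in units for e in u.entities]


utterance = "I love jazz. Can you recommend a good album?"

# Hand-written annotation; all labels below are illustrative assumptions.
units = [
    FunctionalUnit(
        "I love jazz.", *locate(utterance, "I love jazz."),
        dialogue_act=DialogueAct("opinion", "statement", ["positive"]),
        entities=[Entity("jazz", *locate(utterance, "jazz"))],
    ),
    FunctionalUnit(
        "Can you recommend a good album?",
        *locate(utterance, "Can you recommend a good album?"),
        dialogue_act=DialogueAct("recommendation", "request"),
        entities=[Entity("album", *locate(utterance, "album"))],
    ),
]

print(entities_in(units))
```

Keeping character offsets on both units and entities lets a downstream component recover the exact surface text for any annotation, which is useful when entities are later used as conversation topics.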


Discourse Parsing

Discourse analysis is one of the most challenging tasks in Natural Language Processing, with applications in many language technology areas such as opinion mining, summarization, information extraction, etc. Penn Discourse Treebank (PDTB) style discourse parsing is a composite task of detecting explicit and non-explicit discourse relations, their connective and argument spans, and assigning a sense to these relations. The system presented in the video is an end-to-end discourse parser developed for the CoNLL 2015 and 2016 Shared Tasks on Shallow Discourse Parsing, where it ranked 2nd and 3rd respectively.
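
To make the components of a PDTB-style relation concrete, the sketch below represents a single explicit relation as a small record holding the connective span, the two argument spans, and a sense. The class, its field names, and the helper functions are hypothetical; the example sentence and its hand-assigned sense label are illustrative and are not output of the parser described here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


@dataclass
class DiscourseRelation:
    rel_type: str               # "Explicit", or a non-explicit type such as "Implicit"
    connective: Optional[Span]  # None when no connective appears in the text
    arg1: Span
    arg2: Span
    sense: str


def find_span(text: str, phrase: str) -> Span:
    """Return the offsets of the first occurrence of phrase in text."""
    start = text.index(phrase)
    return (start, start + len(phrase))


def cut(text: str, span: Span) -> str:
    """Return the surface text covered by a span."""
    return text[span[0]:span[1]]


text = "She stayed home because it was raining."

# Hand-built example: Arg2 is the argument attached to the connective.
relation = DiscourseRelation(
    rel_type="Explicit",
    connective=find_span(text, "because"),
    arg1=find_span(text, "She stayed home"),
    arg2=find_span(text, "it was raining"),
    sense="Contingency.Cause.Reason",  # illustrative label, assigned by hand
)

print(cut(text, relation.connective), "|",
      cut(text, relation.arg1), "|",
      cut(text, relation.arg2))
```

For a non-explicit relation, the same record would carry `connective=None`, since the relation is inferred between adjacent text spans rather than signalled by a word in the text.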


