Homepage for CSCI-B 659 and LING-L 715
This research seminar is a large group endeavor to build a semantically and pragmatically enabled AI for interactive communication with advanced dialog memory and specialized knowledge from a specific domain.
To see the people involved, visit the people page.
To generate a knowledge representation for the AI, the system processes natural language text: it extracts semantic relations from unstructured text, enriches these relations with language-specific implicatures and presuppositions using methods from computational semantics and computational pragmatics, and integrates domain-specific knowledge encoded in, for example, ontologies and other knowledge representations.
The results of this knowledge extraction process are stored in a queryable knowledge graph, using a graph database and OWL with a reasoner.
The system connects to existing AI services like Amazon Alexa to allow for spoken language queries, using graph databases like Neo4J, Stardog, or Apache Jena as knowledge representation backends. Spoken language queries are translated to Cypher and SPARQL; results returned from the knowledge graph are translated into natural language text and converted to speech using existing text-to-speech engines.
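As a minimal sketch of the query-translation step, a parsed question could be rendered as a Cypher query. The relation name (`CEO_OF`), node labels, and the assumed parse structure below are illustrative assumptions, not the project's actual schema:

```python
# Hypothetical sketch: a spoken question such as "Who is the CEO of Apple?"
# is assumed to have been parsed into a (relation, subject) pair upstream.

def question_to_cypher(subject: str, relation: str) -> str:
    """Build a Cypher MATCH query for a simple 'who is the X of Y' question."""
    return (
        f"MATCH (p:Person)-[:{relation}]->(c:Company {{name: '{subject}'}}) "
        "RETURN p.name"
    )

query = question_to_cypher("Apple", "CEO_OF")
print(query)
```

A parallel function would emit SPARQL for the Stardog and Apache Jena backends.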
The system resembles the architecture of systems like IBM’s Watson or existing AIs like Google Assistant or Amazon Alexa.
The research goal of this system is to use technologies and NLP components that go beyond existing capabilities, enabling not only reasoning based on ontologies (e.g. OWL with Pellet or FaCT++), but also generation of facts using semantic and pragmatic processing.
For example, our system can be enabled to use an OWL ontology which encodes CEO as a subclass of Person. An assertion that “Tim Cook is a CEO” would automatically extend the retrievable facts about “Tim Cook” to the inherited ones that relate to the class Person, for example that “Tim Cook” has parents and so on.
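The inheritance idea can be sketched with a toy in-memory class hierarchy; a real deployment would delegate this to an OWL reasoner such as Pellet or FaCT++, and the property names below are illustrative assumptions:

```python
# Toy subclass hierarchy standing in for an OWL ontology (CEO ⊑ Person).
SUBCLASS_OF = {"CEO": "Person"}
CLASS_PROPERTIES = {
    "Person": {"hasParents", "hasBirthDate"},  # assumed Person-level facts
    "CEO": {"leadsCompany"},                   # assumed CEO-level facts
}

def inherited_properties(cls):
    """Collect the properties of a class and all of its superclasses."""
    props = set()
    while cls is not None:
        props |= CLASS_PROPERTIES.get(cls, set())
        cls = SUBCLASS_OF.get(cls)
    return props

# The assertion "Tim Cook is a CEO" now exposes Person-level facts as well,
# e.g. hasParents is among the inherited properties of CEO.
print(sorted(inherited_properties("CEO")))
```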
In addition to this, our linguistic analyzers extract deep properties from sentences, clauses, and phrases to allow us to reason in a more complex way over text or utterances. For example, the use of a definite and specific noun phrase (NP) like “the blue car” in an utterance like “John bought the blue car.” implies that in the situation there were more cars and only one of them was blue. We can encode this implication in a knowledge representation and make the general context with implicatures (or implications) and presuppositions more transparent, independent of the specific domain of language that we are processing.
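A sketch of how such an implicature could be turned into storable facts, assuming a tokenized NP as input; the triple format and the definiteness heuristic are illustrative assumptions, not our actual analyzers:

```python
# Hypothetical sketch: derive a uniqueness implicature from a definite,
# modified NP such as "the blue car".

def implicatures_for_np(np_tokens):
    """If an NP is definite ('the') and contains a modifier, record the
    implicature that the context holds multiple referents of the head noun,
    of which only one carries the modifier."""
    facts = []
    if np_tokens and np_tokens[0].lower() == "the" and len(np_tokens) > 2:
        modifier, head = np_tokens[1], np_tokens[2]
        facts.append(("context", "contains_multiple", head))
        facts.append(("context", f"unique_{modifier}", head))
    return facts

print(implicatures_for_np(["the", "blue", "car"]))
```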
Using knowledge representations like the Unified Medical Language System (UMLS) we can process meta-information about medical terminology and names of medical products to generate guesses about the situation of a patient. For example, if a person tweets about a price increase of a medication called Prozac, we might conclude that this person very likely suffers from depression or that she is responsible for someone who does. A statement in a patient report that explains that “the patient was prescribed 50 mg of Codeine elixir” implies that the patient was suffering from some (mild) form of pain. Such domain specific implications can be generated from existing knowledge representations.
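Such a domain-specific implication step can be sketched as a lookup against extracted meta-information; the mapping below is a stand-in for what would actually be derived from UMLS:

```python
# Illustrative stand-in for UMLS-derived meta-information linking a
# medication to the condition its mention implies (assumed entries).
MEDICATION_IMPLIES = {
    "Prozac": "depression",
    "Codeine": "pain",
}

def implied_condition(medication):
    """Guess the condition implied by the mention of a medication."""
    return MEDICATION_IMPLIES.get(medication)

print(implied_condition("Codeine"))
```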
The system consists of:
- Natural Language Processing (NLP) pipelines that run in parallel. The analyses from different NLP components and pipelines are aggregated into one uniform linguistic representation.
- Linguistic representations are mapped onto semantic relations using linguistic principles of semantics and pragmatics.
- Semantic relations are mapped onto abstract graph representations (Knowledge Graphs).
- Knowledge graphs are translated into Cypher (for Neo4J) and SPARQL (for Stardog, Apache Jena) to store them in GraphDBs.
- Natural language queries (text or speech-to-text) are translated into assertions or queries against the GraphDB-based knowledge representations.
- Query results are converted from Cypher, SPARQL, etc. to natural language and via text to speech to spoken language output.
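The step from semantic relation to graph storage can be sketched as follows, assuming a (subject, predicate, object) triple representation; the node labels and property names are illustrative, not the project's actual schema:

```python
# Hypothetical sketch: render one semantic relation as a Cypher MERGE
# statement for Neo4J (a SPARQL INSERT would play the same role for
# Stardog or Apache Jena).

def triple_to_cypher(subj, pred, obj):
    """Render a (subject, predicate, object) relation as idempotent Cypher."""
    return (
        f"MERGE (s:Entity {{name: '{subj}'}}) "
        f"MERGE (o:Entity {{name: '{obj}'}}) "
        f"MERGE (s)-[:{pred}]->(o)"
    )

print(triple_to_cypher("Tim Cook", "IS_A", "CEO"))
```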
This system was designed with the following criteria in mind:
We established each of the stages as independent modules. These modules communicate over XML-RPC APIs, and each module can live on a server that is completely separate from the others (provided network connectivity is available). Most of the modules have modest resource requirements; the largest resource consumers are the external programs we utilize: the graph databases and the natural language processing software.
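A minimal sketch of one such module, using only Python's standard library XML-RPC support; the `analyze` function is a hypothetical stand-in for a real NLP component:

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

def analyze(sentence):
    # Stand-in for a real NLP component: tokenize on whitespace.
    return {"tokens": sentence.split()}

# Bind to port 0 so the OS picks a free port for this demo.
server = SimpleXMLRPCServer(("localhost", 0), logRequests=False, allow_none=True)
server.register_function(analyze, "analyze")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Any other module, in any language with an XML-RPC client, can now call it:
port = server.server_address[1]
proxy = ServerProxy(f"http://localhost:{port}")
result = proxy.analyze("John bought the blue car")
print(result)
server.shutdown()
```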
The system is neutral in two senses: programming language neutral and natural language neutral. It does not rely upon a particular programming language, since each of the modules communicates over XML-RPC; all that is necessary is for a given module to expose the same functionality as another over the RPC interface. (One area where this is not yet true involves the transmission of whole Python objects by pickling an object, passing it into a function, and unpickling it on the other side. However, we are working on solutions that maintain language neutrality in this area as well.)
The system is not biased towards a particular language (however, implicatures that are domain specific do not translate from one language to another and language modules have to be compatible with these other languages).
Because each of the services in the pipeline exists as a single atomic piece, the system should be able to scale up efficiently as more and more tasks pile up, by forking each server process.
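The forking model can be sketched with the standard library's `ForkingMixIn` (POSIX only; a `ThreadingMixIn` would be the analogue elsewhere). This is an illustrative composition, not our deployed server code:

```python
from socketserver import ForkingMixIn
from xmlrpc.server import SimpleXMLRPCServer

class ForkingXMLRPCServer(ForkingMixIn, SimpleXMLRPCServer):
    """An XML-RPC server that serves each incoming request in a forked
    child process, so slow analyses do not block other callers."""
    pass

# Typical use (not started here):
# server = ForkingXMLRPCServer(("localhost", 8001))
# server.register_function(some_module_function, "analyze")
# server.serve_forever()
```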
We strive to make each of the components fault tolerant, allowing them to deal with errors in the generated linguistic analyses as well as errors in the behavior of other modules. All XML-RPC servers are set up as system daemons on Linux that can be controlled by the usual methods.
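As a sketch, one such daemon could be managed with a systemd unit along the following lines; the paths, user, and module name are illustrative assumptions, not the project's actual deployment configuration:

```ini
# Hypothetical systemd unit for one pipeline module (illustrative paths/names).
[Unit]
Description=NLP pipeline XML-RPC module (example)
After=network.target

[Service]
ExecStart=/usr/bin/python3 /opt/pipeline/nlp_module.py
Restart=on-failure
User=pipeline

[Install]
WantedBy=multi-user.target
```

With `Restart=on-failure`, a crashed module is restarted automatically, which complements the fault tolerance built into the modules themselves.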