Natural Language Understanding: Let's Play Dumb

What is the meaning of the word “understanding”? This was a question posed during a particularly enlightening lecture given by Dr. Anupam Basu, a professor with the Department of Computer Science and Engineering at IIT Kharagpur, India.

Understanding something probably relates to being able to answer questions based on it, or perhaps to forming an image or a flow chart in your head. If you can make another human being comprehend the concept with the least amount of effort, that means you truly understand what you are talking about. But what about a computer? How does it understand?

Let’s take a look at some sentences written in various languages:

少年は学校に行く; (Japanese)

The boy is going to school; (English)

Le garçon va à l’école; (French)

लड़का स्कूल जा रहा है; (Hindi)

El niño va a la escuela; (Spanish)

What is happening in the sentences above? If you know any two of these languages you will probably be able to tell the right answer. If you know only one, say English, you will probably answer with respect to that language. Taking a closer look, you will notice two sentences seem to have the same structure—the ones in Spanish and French. So you take an intelligent guess and say yes they mean the same thing. Since these sentences mean the same thing, and the one in English talks about a boy with respect to a school, maybe they all mean the same thing.

But how many sentences have you really understood? For me, only three of these make sense, forming a picture in my head of a boy headed toward a school, connected by the present continuous tense of the verb “go.” English (a language I am well versed in), Hindi (my national language), and French (a language I am learning) all plant the same image in my head. The other translations I have taken from Google Translate, so the Japanese is just symbols to me. If you ask me questions in Japanese based on that sentence, I won’t be able to answer them, and definitely not in Japanese. If you ask me a question in any of the other three languages, however, I will be able to answer quite fluently and even translate between the three without much effort.

During the lecture, Professor Basu introduced the concept of frames, developed by Charles J. Fillmore.

“The idea behind frame semantics is that speakers are aware of possibly quite complex situation types, packages of connected expectations, that go by various names—frames, schemas, scenarios, scripts, cultural narratives, memes—and the words in our language are understood with such frames as their presupposed background.” (Fillmore 2012, p. 712; source: Charles J. Fillmore, Computational Linguistics)

Consider the following sentence:

Tom is going to school in the evening with a basket of muffins from home.

What are the possible ways one can represent this in a computer program?

Let us consider a schema, which defines the various attributes of the verb “go.” The verb go will generally have attributes like:

go {Agent: Tom; Source: home; Destination: school; Time: evening; Object: basket of muffins; …}
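One possible way to hold such a schema in a program is sketched below. This is only an illustration, not the representation used in the lecture: the class name `CaseFrame`, the field name `obj` (chosen because `object` is a Python built-in), and the helper `slot_value` are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaseFrame:
    """A verb together with the slots (attributes) filled in by one sentence."""
    verb: str
    agent: Optional[str] = None
    obj: Optional[str] = None          # the "Object" slot
    source: Optional[str] = None
    destination: Optional[str] = None
    time: Optional[str] = None


def slot_value(frame: CaseFrame, slot: str) -> Optional[str]:
    """Return the filler of a named slot, or None if the slot is empty."""
    return getattr(frame, slot, None)


# "Tom is going to school in the evening with a basket of muffins from home."
go_frame = CaseFrame(
    verb="go",
    agent="Tom",
    obj="basket of muffins",
    source="home",
    destination="school",
    time="evening",
)

print(slot_value(go_frame, "destination"))  # school
```

Once the sentence is in this form, a question such as “Where is Tom going?” reduces to reading the Destination slot of the frame whose verb is “go.”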

Now suppose someone asks the question “What is Tom carrying?”

How can your algorithm answer this? “To carry” is an altogether different verb from “to go.” Let us look again, now that we have a little more information.

Tom is a boy. Mary is a girl. Tom likes muffins. John cannot dance. Tom loves engineering. Jane is beautiful. Tom is going to school in the evening with a basket of muffins from home. Tom sleeps early.

(More attributes can be added as complexity grows; for now I am considering only the simplest possible sentences.)

For this particular data, when the question “What is Tom carrying?” is asked, we can form the case frame for each verb that has Tom as its agent, thus ignoring the other possible agents, that is, Mary, John, and Jane:

to be {Agent: Tom; Object: boy; …}

to like {Agent: Tom; Object: muffins; …}

to love {Agent: Tom; Object: engineering; …}

to sleep {Agent: Tom; Object: none; Time: early; …}

go {Agent: Tom; Object: basket of muffins; …}

Now, based on the generic case frame of the verb “to carry,” it will be easy to answer the question, as “carry” can’t be related to engineering, for example. Related follow-up questions, like “Where is he going with the basket?”, may also be answered. Since the verb is the key here, and not the agent, we just look at the case frame of the verb “to go.”

This is a pretty basic example, and there are of course quite a few loose ends, like the size of the frame and actually relating the verbs to their objects. There’s a need for word embeddings and other things that I am assuming are taken care of for the moment, but it puts forth a method to make your algorithm “understand.”
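The lookup described above can be sketched roughly as follows. This is a toy illustration under loud assumptions: the frames are typed in by hand rather than produced by a parser, and the two tables `PHYSICAL_OBJECTS` and `MOTION_VERBS` are hypothetical stand-ins for the generic case frame of “to carry” (something physical, moved along by a motion verb). The function names are invented for this example.

```python
# Hand-built case frames for the little story about Tom, Mary, John and Jane.
frames = [
    {"verb": "be", "agent": "Tom", "object": "boy"},
    {"verb": "be", "agent": "Mary", "object": "girl"},
    {"verb": "like", "agent": "Tom", "object": "muffins"},
    {"verb": "love", "agent": "Tom", "object": "engineering"},
    {"verb": "sleep", "agent": "Tom", "object": None, "time": "early"},
    {"verb": "go", "agent": "Tom", "object": "basket of muffins",
     "source": "home", "destination": "school", "time": "evening"},
]

# Assumed background knowledge (hypothetical, hand-written):
PHYSICAL_OBJECTS = {"muffins", "basket of muffins"}  # things that can be carried
MOTION_VERBS = {"go"}                                # verbs during which one can carry


def what_is_carried(agent, frames):
    """Answer "What is <agent> carrying?" from the agent's own frames."""
    for frame in frames:
        if frame["agent"] != agent:
            continue  # ignore the other agents
        if frame["verb"] in MOTION_VERBS and frame.get("object") in PHYSICAL_OBJECTS:
            return frame["object"]
    return None


def where_is_going(agent, frames):
    """Answer "Where is <agent> going?" by reading the frame of "go"."""
    for frame in frames:
        if frame["agent"] == agent and frame["verb"] == "go":
            return frame.get("destination")
    return None


print(what_is_carried("Tom", frames))  # basket of muffins
print(where_is_going("Tom", frames))   # school
```

Note that “engineering” is rejected because it is not a physical object, and “muffins” from “to like” is rejected because liking is not a motion verb. Both checks rely entirely on the hand-written tables, which is exactly the background knowledge a real system would have to learn.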

Now consider the following set of sentences:

Father gave a birthday gift to the boy. The boy opened the packet. He found a story book inside.

It was his birthday. Father bought him a packet. The boy opened it. There was a book within. It had pictures.

A book was wrapped in lace. It was his birthday present. The boy’s father presented it to him.

Question: What is the birthday present? The answer is very simple: the book. Yet the sentences vary in structure. The number of words that can be used to describe the same book is very large. Imagine this to be a major news incident, or something that goes viral on Facebook. Each website or user will have their own set of words, set of interviews, opinions, and grammatical structure, correct or incorrect.

Now imagine a search engine trying to crawl through it all to find you the right answer. Or is it the closest match? Did you go through the entire process of forming a case frame before deciding the answer was the book? Or was it a simple image that flashed through your mind?

Father -> Gift -> Boy -> Birthday -> Book

So, our algorithm must be built on background knowledge similar to that possessed by humans, also called an ontology. Just as a human child keeps adding to a huge database of words, syntax, and semantics subconsciously throughout his or her life, the algorithm should be able to learn too. However, designing a schema that is independent of the language it represents, like your brain, is still a challenge. It will require a powerful dependency parser and a very large dictionary of words to actually train the algorithm.
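As a very rough sketch of what such background knowledge buys you, the snippet below maps surface words to a handful of concepts and shows that the three differently worded birthday passages collapse to the same concept set. Everything in the `CONCEPTS` table is a hand-written assumption made for this example (including the choice to treat “packet” as a kind of gift); there is no parsing, coreference resolution, or learning here.

```python
import re

# Hypothetical, hand-written mini-ontology: surface word -> concept.
CONCEPTS = {
    "father": "FATHER",
    "boy": "BOY",
    "birthday": "BIRTHDAY",
    "gift": "GIFT",
    "present": "GIFT",
    "presented": "GIFT",
    "packet": "GIFT",
    "book": "BOOK",
}


def concepts_in(text):
    """Return the set of known concepts mentioned anywhere in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return {CONCEPTS[word] for word in words if word in CONCEPTS}


passages = [
    "Father gave a birthday gift to the boy. The boy opened the packet. "
    "He found a story book inside.",
    "It was his birthday. Father bought him a packet. The boy opened it. "
    "There was a book within. It had pictures.",
    "A book was wrapped in lace. It was his birthday present. "
    "The boy's father presented it to him.",
]

concept_sets = [concepts_in(passage) for passage in passages]

# All three passages reduce to the same five concepts.
print(concept_sets[0] == concept_sets[1] == concept_sets[2])  # True
```

The words differ from passage to passage, but the concepts are the same ones as in the chain Father -> Gift -> Boy -> Birthday -> Book. Working out which concept actually answers the question is the hard part this sketch leaves out.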

Here, I would like to mention the chatterbot ELIZA. It is an early natural language processing computer program created by Joseph Weizenbaum in the 1960s at the MIT Artificial Intelligence Laboratory. ELIZA was created to demonstrate the superficiality of communication between man and machine. ELIZA was capable of engaging in discourse; however, it could not converse with true understanding, even though many early users were convinced of ELIZA’s intelligence and understanding.

This once again brings us to the question: is your machine or algorithm really able to “understand” a language and replicate it without rote learning? Making the algorithm independent of the data set and able to handle unexpected problems is of paramount importance today. For example, an online chat bot designed to solve customer issues through text should comprehend what the customer is asking. However, there is a silent killer lurking, ready to destabilize your algorithm in such situations. When a person introduces a touch of sarcasm into the conversation, the chat bot is unable to detect the subtlety of language. That’s the toughest problem for any NLP algorithm to handle in today’s era, as no data set can teach sarcasm. Language, like the world and the news of the day, continuously evolves. So how can any algorithm keep up?

This is not a new issue. The question was famously posed by Alan Turing in his 1950 paper “Computing Machinery and Intelligence,” which opens by asking, “Can machines think?” A machine that shows true comprehension continues to intrigue and interest computer scientists, who continue to search for a universal answer.
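ELIZA worked by pattern matching and substitution, and a few lines are enough to get a feel for how shallow that is. The sketch below is written in that spirit only: the rules are invented for this example and are not taken from Weizenbaum’s original script, and the names `RULES`, `DEFAULT_REPLY`, and `respond` are hypothetical.

```python
import re

# Invented rules: (pattern to look for, reply template using the captured text).
RULES = [
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]

DEFAULT_REPLY = "Please go on."


def respond(utterance):
    """Reply by echoing part of the input back; nothing is understood."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1))
    return DEFAULT_REPLY


print(respond("I am sad."))                   # Why do you say you are sad?
print(respond("I am a fish made of glass."))  # Why do you say you are a fish made of glass?
print(respond("The weather is nice."))        # Please go on.
```

The second reply is the telling one: the program happily mirrors a nonsensical statement, because it only rearranges the words it was given.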

XRDS

Crossroads – The ACM Magazine for Students
