Posted on October 13, 2008 by Peter Turney
For a few years, I was fascinated by the problem of context in machine learning and artificial intelligence. Context is a popular problem in AI: there is a biennial conference dedicated to it. The problem of context arises when we try to use the past to make predictions about the future. I think of context in terms of foreground and background information. If we describe a situation purely in terms of the foreground information, ignoring the background (the context), and then make predictions about similar situations in the future, we will often make mistakes, especially if the past background is significantly different from the future background.
I first started thinking about context while I was applying standard supervised machine learning techniques to the diagnosis of gas turbine engines. Our data consisted of temperature and pressure sensor readings from turbine engines, which we represented as feature vectors. The task was to classify the vectors into eight different classes: one healthy engine class and seven different fault classes. A twist to the task was that we did not randomly split the data into training and testing sets. To make the task more realistic, we split the data into two sets based on the weather: a cold-weather set and a warm-weather set. The idea was that the diagnostic software should not be sensitive to the weather conditions. The foreground information consisted of the temperature and pressure values of the engine sensors, and the background information consisted of the weather conditions. We quickly found that it was not possible to accurately classify the vectors using only the foreground information; accurate classification required taking the background information into account.
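To see why a foreground-only classifier fails under a weather-based split, here is a toy sketch. All of the numbers, class names, and the one-feature nearest-neighbour classifier are invented for illustration; this is not the actual diagnostic system, just a minimal demonstration of the failure mode.

```python
# Toy model: measured exhaust temperature = class baseline + ambient temp,
# so the raw (foreground-only) feature shifts with the weather.
BASELINES = {"healthy": 500.0, "fault_A": 530.0}  # invented values

def reading(fault, ambient):
    return BASELINES[fault] + ambient

# Train on cold weather (ambient -10), test on warm weather (ambient +25).
train = [(reading(c, -10.0), c) for c in BASELINES]

def nearest_neighbour(x):
    # classify x by the label of the closest training reading
    return min(train, key=lambda tc: abs(tc[0] - x))[1]

# A warm-weather healthy engine reads 525.0, which sits closer to the
# cold-weather fault_A reading (520.0) than to the cold-weather healthy
# reading (490.0), so the foreground-only classifier misdiagnoses it.
pred = nearest_neighbour(reading("healthy", 25.0))
print(pred)  # fault_A
```

The classifier is not wrong about the geometry of its features; the features themselves are wrong, because they silently absorbed the background.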
The problem of word sense disambiguation is often described in terms of context. The word “bank” means one thing in the sentence “I put my money in the bank” (financial bank) and another thing in the sentence “I put my towel on the bank” (river bank). The meaning of “bank” (the foreground) depends on the context of the sentence (the background). The task of word sense disambiguation is to use the context to determine the intended sense of the ambiguous word.
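A simplified Lesk-style disambiguator makes this concrete: pick the sense whose signature words overlap most with the words of the sentence. The two-word signatures below are invented for illustration, not taken from any real lexicon.

```python
# Minimal Lesk-style word sense disambiguation for "bank":
# choose the sense whose signature shares the most words with the context.
SIGNATURES = {
    "financial_bank": {"money", "deposit", "account", "loan"},
    "river_bank": {"river", "water", "towel", "shore"},
}

def disambiguate(sentence):
    context = set(sentence.lower().replace(".", "").split())
    return max(SIGNATURES, key=lambda s: len(SIGNATURES[s] & context))

print(disambiguate("I put my money in the bank"))  # financial_bank
print(disambiguate("I put my towel on the bank"))  # river_bank
```

The foreground (the ambiguous word) is useless on its own; only its relation to the background (the surrounding words) selects a sense.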
I now see that the problem of understanding how to use contextual information is very closely related to the problem of understanding analogy-making. (I am certainly not the first person to see this; Boicho Kokinov noticed it some time ago. I think it first occurred to me a couple of years ago, during a conversation with Daniel Silver about context, analogy, lifelong learning, and knowledge transfer.)
Analogy-making is based on the distinction between attributional similarity and relational similarity. Two things are attributionally similar when they have similar properties (attributes); for example, dog and wolf are attributionally similar. Two pairs of things are relationally similar when they have similar relations; for example, the relations of the pair dog:bark and the pair cat:meow are similar. Analogy-making is based on using relational similarity instead of attributional similarity. The reason for using relational similarity is that it results in more reliable predictions about the future than attributional similarity.
Context is about the relation between foreground and background. The way we use contextual information to make predictions is that we represent the relations between the foreground and the background (e.g., the relation between weather and engine temperature), instead of representing only the properties of the foreground (e.g., the engine temperature out of context). Thus reasoning with contextual information is an instance of analogical reasoning. Looking at it the other way around, relational similarity is superior to attributional similarity because the former is more robust than the latter when there is contextual variation.
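Returning to the toy engine example, representing the relation between foreground and background can be as simple as replacing the raw reading with its deviation from the ambient temperature. Again, every number here is invented; the point is only that the relational feature is invariant to the contextual shift.

```python
# Relational feature: measured temperature MINUS ambient temperature.
# With the toy model (measured = class baseline + ambient), subtracting
# the ambient recovers the class baseline exactly, in any weather.
BASELINES = {"healthy": 500.0, "fault_A": 530.0}  # invented values

def measured(fault, ambient):
    return BASELINES[fault] + ambient

def relational_feature(temp, ambient):
    return temp - ambient  # foreground relative to background

# Train in cold weather (ambient -10) on the relational feature.
train = [(relational_feature(measured(c, -10.0), -10.0), c) for c in BASELINES]

def classify(temp, ambient):
    f = relational_feature(temp, ambient)
    return min(train, key=lambda tc: abs(tc[0] - f))[1]

# Warm-weather readings (ambient +25) are now classified correctly.
pred = classify(measured("healthy", 25.0), 25.0)
print(pred)  # healthy
```

The same nearest-neighbour rule that failed on the raw feature succeeds here, because the feature now encodes a relation that holds across contexts.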
Let’s consider the colour red as an example. We tend to think of red as an attribute (“That car is red.”), but it is really a relation (“That car is red under these lighting conditions.”). We think that red in the morning is the same thing as red in the evening, but a spectrograph would show that this is not true. The spectrum of a red object changes as the lighting conditions change, but somehow we see the same red. How is this possible? This is the problem of colour constancy. Our colour vision automatically adjusts to the lighting conditions, at a basic neural level, far below our conscious awareness. Thus, for millennia, we thought red was a property, but we now know it is really a relation.
Why did evolution give our visual system colour constancy? Because we can make more reliable predictions about the future when we use the relation between an object’s spectrum and the lighting conditions than when we use the object’s spectrum alone. Our visual system takes context into account, without our awareness. Visual perception uses analogy (relational similarity), although we usually assume vision is purely attributional.
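One standard computational account of this adjustment is von Kries chromatic adaptation: divide each sensor response by the illuminant's response in that channel, so that what remains depends on the surface relative to the light rather than on the light itself. The spectra below are toy three-channel numbers, invented for illustration.

```python
# Von Kries-style adaptation sketch: per-channel response is reflectance
# times illuminant power; dividing by the illuminant recovers a value
# that is constant across lighting conditions.
def sensor_response(reflectance, illuminant):
    return [r * i for r, i in zip(reflectance, illuminant)]

red_surface = [0.9, 0.2, 0.1]   # toy RGB reflectance of a red object
morning = [0.8, 1.0, 1.2]       # bluish morning light (invented)
evening = [1.2, 1.0, 0.7]       # reddish evening light (invented)

raw_morning = sensor_response(red_surface, morning)  # [0.72, 0.2, 0.12]
raw_evening = sensor_response(red_surface, evening)  # [1.08, 0.2, 0.07]
# The raw spectra differ, as a spectrograph would show.

def von_kries(response, illuminant):
    return [c / w for c, w in zip(response, illuminant)]

adapted_morning = von_kries(raw_morning, morning)
adapted_evening = von_kries(raw_evening, evening)
# Both adapted responses equal the surface reflectance: the same red.
```

The attribute "red" dissolves under the raw numbers but survives as a relation between the object's response and the illuminant's.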
I believe that context and analogy are pervasive in cognition, but they almost always operate below our conscious awareness, so we have the mistaken impression that we can often ignore context, and that properties without relations are adequate for most problems. This illusion is a major barrier to progress in AI research.