This blog is highly personal, makes no attempt at being politically correct, will occasionally offend your sensibilities, and certainly does not represent the opinions of the people I work with or for.
On mental models

I first wrote this text in May 2006 and did not publish it until May 2007. For a year I wondered whether it was correct (or at least partially correct) or whether there was something really wrong with it. In the end I realized that I may simply have described a part of my own thinking (or at least a simplified version of the part that I can see and understand).

"Never increase, beyond what is necessary, the number of entities required to explain anything" --- William of Ockham (1285-1349)

Part one

Why

In the beginning there is confusion, sometimes frustration, always darkness. Well, at least the feeling that you will never be able to understand. In the end there is a clear, accurate, and working mental model. But what exactly are mental models?

Causality

Causality is, for human beings, one of the best understood general principles. It expresses the fact that events called causes produce other events called effects. For instance, this word is bold (assuming that whatever program --probably a web browser-- you use to read this text is able to correctly interpret HTML tags), because it is surrounded by the two tags <b> and </b>.

An important thing about causes and effects is that causes have deterministic effects, meaning that the same cause always produces the same effect (cf. a and b below).

a. Of course, if you play some nice pop music in the house at lunch time, the effect will probably be that your housemates are happy. But if you produce the same cause at midnight, the effect will probably be that you get killed (especially the day before an important exam). So you might think "Hey, wait! The two effects of the cause are not the same!". Well, the point here is that the cause is not "playing the music", but rather "playing the music at this particular time". So the two apparently similar events are actually distinct events.

b. In everyday life, causes can be very difficult to know fully because so many variables are involved. For instance, if today is my birthday, the music at midnight might actually be welcome (because of the ongoing party -- especially if I invited my housemates to it). Fortunately, in most of the problems this text refers to, the causes are defined using very few variables.

Even if the same cause always produces the same effect, a given effect can be produced by several causes. For instance, a bold "word" can be HTML-encoded as "<b>word</b>" or as "<b>w</b><b>o</b><b>r</b><b>d</b>".

Even though the two causes have the same effect (at least as far as the final rendering is concerned), one of them seems more... well, better. We will come to that later.
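The claim that two different causes produce the same effect can be checked mechanically. The following is only a minimal sketch: it models "rendering" as a list of (character, is-bold) pairs, which is my simplification and not how a real browser works, and the names `BoldTracker` and `render` are made up for the illustration. It uses the HTML parser from Python's standard library.

```python
from html.parser import HTMLParser


class BoldTracker(HTMLParser):
    """Records each text character together with whether it sits inside <b>."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # number of <b> tags currently open
        self.rendered = []   # list of (character, is_bold) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "b" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        for ch in data:
            self.rendered.append((ch, self.depth > 0))


def render(markup):
    """Return the 'effect' of a piece of markup: its styled characters."""
    tracker = BoldTracker()
    tracker.feed(markup)
    tracker.close()
    return tracker.rendered


short_cause = "<b>word</b>"
long_cause = "<b>w</b><b>o</b><b>r</b><b>d</b>"

# Two distinct causes, one and the same effect.
print(render(short_cause) == render(long_cause))
```

In this simplified model the two encodings are indistinguishable once rendered; only the source text tells them apart.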

It is in the interest of human beings to know which effects are produced by given causes, as well as which causes produce given effects. For instance, today I am interested in having this text typed, and I am happy to know which causes can produce this effect. One such cause is me tirelessly hitting little buttons on the keyboard of my laptop. Another is using speech recognition software. Yet another is spending time explaining my thinking to someone and having this person write the article for me. The human mind therefore has to link causes and effects in some way, possibly (but not necessarily) in a way that can be expressed in some language.

Rules

The most usable piece of knowledge encapsulating the relationship between causes and effects is the rule. A rule is a statement which claims a causal relationship between two events. For instance, the following is a rule.

Rule: in HTML, the tags <b> and </b> (in this order) cause the text between them to be rendered in bold. The tags themselves are not displayed.

So one may think "ok, so it is all about rules, right?". Well, not quite.

Unfortunately, the problem with rules is that, on their own, they are nothing but meaningless statements. They only start to make sense if you know the objects that they refer to. For instance, any first-year maths student (at least when I was one) knows that two vector spaces of the same finite dimension over the same field are isomorphic. This is a rule and, more precisely, a theorem of linear algebra.

Now, even if you do not know linear algebra, you can still tell that the rule talks about the following objects or properties: vector spaces (objects), dimension (property of one object), fields (objects), being isomorphic (property of a pair of objects). You can understand the actual causality: if two vector spaces over the same field have the same finite dimension, then they are isomorphic.

It is important to notice that you can actually understand (!), or at least perceive, the logical relationship between the properties of the objects involved, in other words the causality, without having a damn clue what this rule actually means (because the objects involved are unknown to you).

So we have an important principle: No matter how clear or simple (or beautiful) the rules are, the knowledge of the set of rules is of no use to you if you do not know the objects (and their properties) that the rules refer to.

Then you will ask, "Ok, so how do I get to know the objects?". Well, this is where it actually gets interesting...

If it were only about objects of the daily life of 30,000 years ago (i.e., stones, meat, rain, caves, playstations, fruits, mates etc.), it would be enough to see them with your eyes during your childhood (or any other period of your life) and let your brain do the rest. Unfortunately we now live in a world in which some objects of our everyday life are rather invisible (and complex), such as the HTTP protocol.

On many occasions, those objects are nothing else than the closure of their rules: being rather immaterial, they are known to you only through the rules which express their properties (and you assume that they obey no rules other than the ones involved in their formal definition). In particular, some of those rules explain how they must interact with other objects, as in the relationship between my internet browser and the HTTP protocol --kind of, don't blame me for oversimplifying how the web works in this text--.

But wait! Haven't we said that those rules are meaningless without first knowing the objects? By saying that some of those objects are defined by rules, haven't we got a bootstrapping problem here? This is where a wonderful, flawless and very powerful mechanism comes into play. It is embedded as a primitive feature in the mind of every human being: the power to think abstractly.

The power to think abstractly is the ability to consider objects and give sense to their rules when the objects are not concrete, including the particular case in which the objects are actually defined by rules. When you use this ability for its own sake (maybe because it gives you some pleasure -- possibly as a paid job), only on abstract objects and with rules applied only to those purely abstract objects, you are what is commonly called a mathematician.

Abstract objects are nothing else than the sum (or combination if you prefer) of their properties. Hence, as far as abstract objects are concerned, you can evaluate how complex they are by counting their properties. For instance, any given integer has no other property than... its name. The unique property of the abstract object usually denoted "2" is to be 2.

We saw earlier that causality can be a bit difficult to state accurately if the objects are complicated (or if the relationships between them are), hence people interested in the subject of "objects, their properties and their rules" might want to start with simple objects such as the integers. This is why mathematical objects are very simple. Their rules are also simple and, very importantly, 100% accurate. In mathematics, more complex objects are always built upon perfectly known simpler objects, using perfectly known simple methods. Even if after a couple of iterations the objects become more complex, their rules are still 100% accurate. This might actually be a good definition of mathematics: the art of studying the rules of purely abstract objects. Such perfect objects, as already stated, are obtained from perfect simpler ones, and what is more perfect than the integers (objects with only one single clear property)?

One might wonder, where does the idea of 2 come from? I believe that at some point in the past, a caveman (or maybe a monkey before him...) thought that there was something similar between having two children and having two pieces of chicken (possibly to feed them). He knew that children and pieces of chicken are not the same, but his two children seen together and the two pieces of chicken seen together seemed to have a property in common. Later on, we called this property "2", and the object 2 is the object defined by having this single property (of being 2). By inventing the abstract object 2, our caveman decided that there was something interesting about the property shared by the two groups of things (children and pieces of chicken), and that it might be worth keeping in mind as a concept in itself.

The state of confusion arises when one knows some rules that refer to properties (or entire objects) which have not been mentally abstracted yet. So, if you are studying a subject, the first and very important thing to do is to make sure that you know the definitions of all the objects and properties used in the rules that you are trying to understand. For instance, if you are trying to understand what CSS (Cascading Style Sheets) is, you might want to know that there is an object called "contents of an HTML page" and another object called "style sheet" (very often a simple text file next to your HTML file) whose effect is to modify the way some objects in the HTML file are displayed in your web browser (most of the time for aesthetic purposes).

Now comes an important question. Have we understood what CSS is? The answer is that we have at least understood what it is for, but we obviously have not yet understood how to use it. Does this mean that fully understanding something equals knowing how to use it? Well, the answer is no. For instance, many people know how to drive their car without necessarily knowing their car fully. So does it mean that fully understanding something equals knowing how it is built? I think that we get closer to 'yes', but the answer is still no. For instance, I know how my kites are built, but I don't think I understand them fully, because I do not know the calculations that were done in order to decide their particular shapes (they are very expensive and well designed stunt kites).

I would say that fully understanding something equals being able to explain why each of its parts, or each of its properties, is necessary to fulfil its primary function (or any of its secondary functions). For instance, knowing by heart how to rebuild my car doesn't necessarily imply that I know why a particular piece goes where it goes. I would definitely know my car perfectly if I could answer such questions.

The case of fully abstract objects, such as those considered in mathematics, is special, because those objects do not have any primary function. For instance, the number "2" does not have any primary function. It was just given to our conscious thinking by our power of abstraction. Hence fully understanding purely abstract objects equals knowing all their properties, and if the objects are primitive, such as 2, knowing from which concrete situation(s) they have been abstracted helps to fully understand them (but this is not really compulsory; actually this point is subject to debate...).

Most of the time in life we just want to know how to use things such as CSS or a TV remote control, and we use them to solve problems. Problems such as "I want to move one object down by 10 pixels on my web page" or "I want to change the current TV channel".

In both cases we use our mental models of the objects, in other words, their mental abstract equivalents. In the case of the remote control I have a mental model of the TV and a mental model of the remote control, and I know the rule which states that pushing this button changes (increases or decreases) the TV channel. The problem having been solved in my mind, I can solve it in reality (at the first attempt).

This all raises the question of what exactly "solving a problem" is. Well, simple. Given a set of objects, a problem is a pair of configurations (or states) A and B. State A is defined by particular values of the properties of the objects involved, and likewise for state B. Solving a problem abstractly is finding a way to go from state A to state B using the rules. Solving the problem in reality is applying those rules to real objects and taking advantage of causality, i.e., creating the causes and waiting for the effects.

In some cases we know the objects, and we know the rules, but the sequence of rules to apply is very difficult to find. Most mazes (or chess-like games) are built on this principle.
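The definition of "solving a problem abstractly" given above can be sketched as a search over states. The example below is only an illustration under assumptions of mine: a hypothetical TV with ten channels that wrap around, a state that is just the current channel number, and two rules named `channel+` and `channel-`. The search used is a plain breadth-first search, which returns a shortest sequence of rules when one exists.

```python
from collections import deque

# Hypothetical rules for a TV remote: each rule maps a state (the current
# channel) to the state that results from pressing one button.
NUM_CHANNELS = 10  # assumed: channels 0..9, wrapping around

RULES = {
    "channel+": lambda ch: (ch + 1) % NUM_CHANNELS,
    "channel-": lambda ch: (ch - 1) % NUM_CHANNELS,
}


def solve(state_a, state_b, rules):
    """Breadth-first search for a shortest sequence of rules from A to B."""
    queue = deque([(state_a, [])])
    seen = {state_a}
    while queue:
        state, path = queue.popleft()
        if state == state_b:
            return path
        for name, apply_rule in rules.items():
            nxt = apply_rule(state)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [name]))
    return None  # no sequence of known rules leads to B


plan = solve(3, 5, RULES)
print(plan)  # ['channel+', 'channel+']
```

Once the plan has been found in the model, solving the problem in reality amounts to pressing the buttons in that order. With an incomplete set of rules, `solve` may return `None` even though the real remote could do the job.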

In most cases, we know the objects only partially and only a subset of the rules. This type of situation covers 99% of everything that we do in life. In those cases, our mental models are incomplete, and this can be a true nightmare. It's like knowing something, but still not being able to solve problems about it, because we are unaware of some properties of the objects, or of some rules. We may also be unable to understand some rules, because they refer to properties of the objects that are unknown to us.

In those latter cases, the only solution is to complete (update) our mental models. This is the most basic and most important skill after being able to think abstractly. But unlike abstract thinking, which is basically natural (even though training --for instance math studies-- greatly enhances one's efficiency at it), mental model completion is a skill to be learnt, ...and taught.


Part two

In Part one we saw what objects and rules are. In a few words, rules are causality statements linking events called "causes" and other events called "effects"; when causes occur, effects are bound to also occur (at least this is what rules claim).

Some situations involve many rules, and learning them by heart may be a time-consuming as well as boring task. Let us take an example in the XHTML world.

Rule 1: bold is obtained using the two tags <b> and </b>

Rule 2: underlines are obtained using the two tags <u> and </u>

Let us now introduce a principle:

Principle 1: XHTML is an XML language.

You may not know what XML is about, but for our purposes this principle simply says that tags always come in pairs of the form <tagname> and </tagname>.

Given this principle, you do not actually have to remember that the closing tag for bold is </b>. This saves time and space in the mind. If I tell you that the tag for bold is "b", you will guess by yourself that this word has been encoded <b>word</b>. With this principle in your pocket, the previous rules can be stated briefly as "the XHTML tags for bold and underline are respectively 'b' and 'u'".
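The saving can be made concrete. In the sketch below, the principle becomes one small function and each rule shrinks to a single entry in a table; the names `TAG_FOR` and `encode` are made up for the illustration.

```python
# Rules: the only thing left to memorise is the tag name for each style.
TAG_FOR = {"bold": "b", "underline": "u"}


def encode(style, text):
    """Apply Principle 1: a tag named t is written <t> ... </t>."""
    tag = TAG_FOR[style]
    return f"<{tag}>{text}</{tag}>"


print(encode("bold", "word"))       # <b>word</b>
print(encode("underline", "word"))  # <u>word</u>
```

Adding a new style then costs one entry in the table, not a new pair of tags to remember.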

Here is another principle which helps a lot:

Principle 2: Computer languages have been invented by English speaking people.

With principle 2 in mind you can guess what the tag for italic is. Yes, you are right! It is "i". So this word has been encoded <i>word</i>.

Now, let us calm down a bit and think. What are principles, in fact? A principle is a rule about rules: how they are written or constructed. The rules of a given system describe the system's behavior, whereas the principles help humans when learning the rules or using the system. In fact, principles exist to help build better mental models. Hence, even if designing rules (when building a system) is an engineering task, designing principles is a psychological task, during which you are motivated by the artistic concepts of simplicity, efficiency and beauty (at least, you should be).

When you learn (the rules of) something, your mind always tries to find out the underlying principles, even if they are not clearly stated. To the mind, this is actually even more important than the actual rules. The mind does this in an attempt to make its learning more efficient. When I was learning (X)HTML, from the first few rules I encountered, I thought that the following principle was true:

Principle: (X)HTML tags are never complete words.

I was then surprised when I learnt about the tag 'strike', which is used to strike text through. I made the mistake of believing that this principle was true, while it wasn't. This happens all the time. Problems come (and I always want to shoot down some designers when this happens) when a rule comes along which contradicts some (more or less stated) principles. It also happened, for instance, when I understood that in CSS (at least with Safari)

body {background-color:black;background:url('background.jpg') center fixed no-repeat;}

doesn't work, while the following does

body {background:url('background.jpg') center fixed no-repeat;background-color:black;}

This negates the very original idea, the Prime Directive, behind CSS! By understanding why my first attempt didn't work I updated my mental model about the object "background" and learnt a new (and very very unexpected) rule about one of its properties.
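The unexpected rule is consistent with how the CSS specification defines shorthand properties: `background` sets every background sub-property at once, and any sub-property it does not mention is reset to its initial value (`transparent` for `background-color`). Declarations are then applied in source order, so the later one wins. The following is a deliberately simplified model of that behaviour, not a CSS parser: the function name `apply_declarations` is made up, and the shorthand value is written directly as a dictionary of the sub-properties it mentions.

```python
# Initial values of a few background longhands, as given in the CSS
# specification. The model below is a deliberate simplification.
INITIAL = {
    "background-color": "transparent",
    "background-image": "none",
    "background-repeat": "repeat",
    "background-attachment": "scroll",
    "background-position": "0% 0%",
}


def apply_declarations(declarations):
    """Apply (property, value) pairs in source order; later ones win.

    For the 'background' shorthand the value is given here as a dict of
    the longhands it mentions (a real browser parses the value string).
    """
    computed = dict(INITIAL)
    for prop, value in declarations:
        if prop == "background":
            # The shorthand resets every longhand, then sets the given ones.
            computed.update(INITIAL)
            computed.update(value)
        else:
            computed[prop] = value
    return computed


shorthand = {
    "background-image": "url('background.jpg')",
    "background-position": "center",
    "background-attachment": "fixed",
    "background-repeat": "no-repeat",
}

first_attempt = apply_declarations([
    ("background-color", "black"),
    ("background", shorthand),
])
second_attempt = apply_declarations([
    ("background", shorthand),
    ("background-color", "black"),
])

print(first_attempt["background-color"])   # transparent
print(second_attempt["background-color"])  # black
```

Seen through this model, the object "background" has one more property than expected: it silently owns the color as well, which is exactly the kind of hidden property that makes a mental model incomplete.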


Part three

Physics is the part of human knowledge whose primary purpose is to find the objects and rules behind the behavior of the universe.

Human natural languages being of limited power when it comes to expressing those rules precisely, people use mathematics to reach this precision. For instance, in basic human language you would say

Rule (v1): When you throw something in the air it eventually goes down.

But the human language extended with the plugin "mathematical language" can say something already more precise

Rule (v2): The second derivative of the position of an object with respect to time (its acceleration) equals the sum of the forces applied to the object divided by the mass of the object.

Now, you might not be familiar with physics, but rule v2 is one of the deepest truths in this universe. It tells you everything that you need to know if one day you decide to work for NASA. It is known as Newton's second law and can be written purely mathematically as

$\frac{\textstyle \vec{F}}{\textstyle m} = \frac{\textstyle d^{2}\vec{x}}{\textstyle dt^{2}}$
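To see how rule v2 contains rule v1, here is a short worked sketch under assumptions that are mine and not part of the rule itself: the motion is purely vertical, $x(t)$ is the height of the object, the only force is its weight $F = -mg$ with $g$ constant, and the object leaves height $x_0$ with upward speed $v_0 > 0$. Integrating twice with respect to time gives

$\frac{d^{2}x}{dt^{2}} = \frac{F}{m} = -g$

$\frac{dx}{dt} = v_0 - g t$

$x(t) = x_0 + v_0 t - \frac{1}{2} g t^{2}$

The speed is zero at $t = v_0 / g$ and negative afterwards, so the object rises, stops, and then goes down. This is what rule v1 says, only now with numbers attached.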

The "mathematical plugin" has proved extremely powerful for describing the rules of the universe. I will not describe this power here (it is not the purpose of this text), but it is very impressive. Some people have claimed that the rules of the universe follow a unique design principle: mathematics. As if, assuming that God exists and assuming that she has created the Universe, she must have told herself "I will design and create a universe, and I decide (principle 1) that all its rules will be expressed by mathematics." (Leading me to think that God must have been a math student.)

Some people think that she also gave herself another principle (principle 2): that in the end the rules must be very simple (well, simple enough for human physicists to understand them).

God aside, and coming back to reality, nobody in fact understands why the mathematics plugin always works (principle 1) so well (principle 2). This is actually the true mystery of the universe.



