In 1950, Alan Turing proposed a general procedure for testing the intelligence of an agent, now known as the Turing test (TT). The idea is that a human “judge” converses in natural language, using a computer screen and a keyboard, with one machine and one human. All participants are separated from one another and cannot see each other. After an arbitrary amount of time, if the “judge” cannot reliably tell the answers of the human from those of the machine, then the machine is said to have passed the test. In this case, Alan Turing concluded that the machine is as intelligent as the human and, moreover, that the machine is able to think in the same way that humans do.
There has been a great deal of philosophical debate about the validity of the TT. One of its main detractors was John Searle, who proposed a strong argument against the TT known as the "Chinese room" thought experiment. Searle argued that an agent could pass the test simply by manipulating symbols it did not understand. Without this understanding, the agent could not be described as "thinking" in the same sense that people do. In turn, Searle’s argument was widely criticized.
One crucial parameter is the duration of the test. The shorter the test, the easier it is for the machine to pass. Also, if the conversation covers a large number of fields and topics, the challenge is far more difficult than if it focuses on a specific problem, such as a particular area of expertise in chemistry.
Another problem with the TT is that the “judge” knows that there is a machine. As a consequence, the judge’s behavior is very different from what it would be in a natural conversation between two or more people. Most of the time, the “judge” tries to trick the machine using nonsense or unnatural sentences. Sometimes, we have also seen people involved in this kind of test trying to imitate the machine…
My conclusion is that the TT is definitely not a good procedure for testing intelligence. A better approach could be a procedure in which multiple people take part in a natural conversation for a given amount of time, without knowing that one of the participants is a computer program. This could be done using an existing forum or a chat system, for example. Then, after the test, we could analyse the log file to assess the result. This approach is far closer to a real application. To be clear, I am not talking here about useless contests or prizes, but about useful experiments for researchers and developers.
In any case, even if the test is passed, the result cannot establish any sort of equivalence between the human brain and the machine…
1 comment:
At the time Turing proposed the test, monitors were almost unknown. He actually proposed using something equivalent to a teletype.