This website is about the construction of a piece of software that, when run on a sufficiently powerful computer, communicates with the real world that we inhabit using standard peripherals such as a keyboard and a display. Using these facilities it should be able to develop language capabilities sufficient to give it a reasonable chance of passing the Turing test.
Strong AI and weak AI are terms to distinguish between different goals that can be set when trying to implement models of intelligent behavior. Strong AI aims at an artificial system that can be safely designated as "intelligent" in some definition of the word. Weak AI studies simplified models of existing intelligent organisms.
To clarify this distinction we can make an analogy with aerodynamics. Strong aerodynamics would aim at a mechanism that can stay airborne for an extended period of time. Weak aerodynamics is quite happy with a massively heavy wing profile that can be studied in a wind tunnel, but that could never really fly.
There has been some discussion about whether it is possible at all to make an artificial construct that is intelligent. In 1950 Alan Turing published an article in which he proposed a procedure to determine whether something or someone is intelligent. This procedure became known as the Turing test. Basically it proposes to let an expert enquirer interview both the test subject and a real person simultaneously. The questions and answers are typewritten and all participants are isolated from each other. The enquirer puts questions to "A" and "B" but does not know initially which is the test subject and which is the real person. The test subject is said to be intelligent if the expert enquirer cannot pick out the real person more reliably than by guessing, that is, if the enquirer is wrong about half the time.
The best-known objection to this endeavour has been raised by John Searle, who claims that intelligence depends on mental states that can only exist in human beings and hence cannot be shared by artificial systems. That seems to me as absurd as claiming that flight is dependent on the intrinsic levitation powers of feathers, powers that are only present in the feathers of real birds.
It seems a good idea to me to try and build a system that passes the Turing test and see how many real people are prepared to call it intelligent. To me the Turing test serves as a reasonable definition of comparative intelligence.
Let's view the construction of an intelligent software system as an engineering problem. What would be its requirements?
Item 1 is easy. This is standard functionality that is built into most programming languages and operating systems.
Item 2 is a form of learning. This is going to be one of the basic functions of our system.
Item 3 can be a matter of generate-and-test if Item 2 and Item 4 are implemented properly.
Item 4 is the tricky part that provides the context in which the other components have to function. Most Artificial Intelligence research programs set a quite specific goal for their systems. This holds even more so for most Machine Learning research programs. However, I think there is a huge risk in this: a computer system that is good at reaching a particular goal can be very simple and straightforward if the designers of that system learned a lot about the problem of reaching that goal before the system was implemented. Done that way, it is the researchers who learn, and consequently the system is a product of natural intelligence instead of a product with artificial intelligence.
Let's see if we can do better than that. First we'll have a closer look at Item 2. Can we quantify in some way the products of a learning algorithm? It turns out that using the Minimum Description Length principle, we can.
The Minimum Description Length principle tells us to write every model that a learning algorithm could hold at any moment as a computer program that, given an arbitrarily long sequence of symbols, produces a sequence of input/output symbols. One of those programs is, for example, print(x); let's denote it p. Suppose the input/output stream so far is i/o = i/o_1 ... i/o_n; then p(i/o) = i/o. Now let q be an arbitrary program and let m be the length of the shortest sequence d = d_1 ... d_m such that q(d) = i/o. Define the compression that q achieves on the sequence i/o as the description length under p minus the description length under q, that is, (length of the executable for p + n) - (length of the executable for q + m). According to the Minimum Description Length principle, the program q with the highest compression is the best model given the sequence i/o.
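To make the bookkeeping concrete, here is a minimal sketch in Python of how such a compression score could be computed for one candidate model. Everything in it is an assumption made for illustration: q is taken to be a simple run-length decoder, the executable lengths LEN_P and LEN_Q are made-up numbers, and a repeat count is counted as a single symbol. The sign convention is chosen so that a higher score means a shorter total description than the one print(x) gives.

```python
from itertools import groupby

# Assumed executable lengths, in symbols (hypothetical numbers for illustration).
LEN_P = 8    # p: the print(x) program
LEN_Q = 40   # q: a run-length decoder, standing in for "some other model"

def q_decode(d):
    """q: expand [count, symbol, count, symbol, ...] into a symbol sequence."""
    out = []
    for count, symbol in zip(d[0::2], d[1::2]):
        out.extend([symbol] * count)
    return out

def q_shortest_input(io):
    """Shortest d with q_decode(d) == io: one (count, symbol) pair per maximal run."""
    d = []
    for symbol, run in groupby(io):
        d.extend([len(list(run)), symbol])
    return d

def compression(io):
    """Symbols saved by describing io with q instead of p: (|p| + n) - (|q| + m)."""
    d = q_shortest_input(io)
    assert q_decode(d) == list(io)  # q must reproduce the stream exactly
    n, m = len(io), len(d)
    return (LEN_P + n) - (LEN_Q + m)

print(compression("a" * 100))  # highly regular stream
print(compression("abcabc"))   # stream with no runs for q to exploit
```

With these assumed lengths the first call gives 66 and the second gives -38: the run-length model pays off on the regular stream, while on the second stream print(x), whose own score is 0 by definition, remains the better model.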
Now suppose that we use a learning algorithm that uses the Minimum Description Length principle to select a model each time it needs to evaluate its expectations about the future. Then we could set as a goal of the output generator to choose output symbols in such a way that the expected growth of the length of q is as large as possible. In other words, the purpose of the entire system is: learn as fast as possible. To make this work we have to extend requirement Item 2 a bit, because the system not only has to model the input/output stream, but also its own growth.