We shall use the PV-Semaphore to construct a lock. In fact, the construction is completely direct: we shall rename “P” to “Acquire” and “V” to “Release”, set the initial value of the PV-Semaphore to 1, and we are done. We have our lock.

What is a lock? A lock is a software construct that can be acquired and released, and can only be held (a completed act of acquiring) by one thread at a time. If multiple threads seek to acquire the lock concurrently, or a thread attempts to acquire a held lock, the threads that cannot hold it are queued in a wait queue until the thread holding the lock releases it.

A lock is useful for many larger purposes:

- The lock can protect a data structure from being accessed during a period of transition, e.g. a linked list while it is being sorted.
- The lock can serialize a collection of activities, such that multiple threads carry out the activities in some well-defined time sequence.
- The lock might assist in enforcing a rendez-vous where all threads collect at a certain point in their code before any continue beyond that point.

As evidenced by the simplicity of the construction, a PV-Semaphore is very nearly a lock, just with a few constraints on use imposed. The use of new words helps emphasize the application and suggests what should and should not happen.
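The renaming can be tried out directly. The following is a minimal sketch, assuming Python's standard `threading.Semaphore` plays the role of the PV-Semaphore; the names `SemaphoreLock` and `count_with_lock` are made up for this illustration.

```python
import threading


class SemaphoreLock:
    """A lock that is nothing more than a semaphore with initial value 1."""

    def __init__(self):
        self._semaphore = threading.Semaphore(1)

    def acquire(self):
        self._semaphore.acquire()   # the P operation, renamed

    def release(self):
        self._semaphore.release()   # the V operation, renamed


def count_with_lock(threads=4, increments=10_000):
    """Several threads bump a shared counter, each bump protected by the lock."""
    lock, total = SemaphoreLock(), [0]

    def worker():
        for _ in range(increments):
            lock.acquire()
            total[0] += 1           # protected read-modify-write
            lock.release()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return total[0]
```

With the lock in place, `count_with_lock(4, 1000)` returns 4000, since no increment can interleave with another.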

We will prove that the construction does achieve a lock. Proofs in concurrent programming usually consider three aspects of correctness:

- Safety. In this case, safety is the assertion that at most one thread can hold the lock at a time.
- Liveness. In this case, liveness is the assertion that (assuming the holding thread does not hold the lock forever), among the other threads seeking to acquire the lock some will acquire it.
- Fairness. The fairness property is not always assured, and our construction will not assure it. An example fairness property that we could have demanded from our lock is that any thread seeking to acquire the lock will eventually succeed. An even stronger fairness would be demanding that a thread seeking to acquire the lock does so before threads that demand the lock subsequently.

I use the method of __Data Structure Invariants__ in the proof. The counter will maintain the following invariant:

Invariant: If the counter is 1, then there are no waiting threads and the lock is not held. If the counter is 0 or less, then the lock is held by exactly one thread and the number of threads waiting to acquire the lock is the negation of the counter. We shall demand that only a thread holding the lock will release it, and that only a thread not holding the lock will attempt to acquire it.

We use induction on the number of operations upon the lock.

The base case is that there have been no operations on the lock. A newly created lock has a counter of 1, no waiting threads, and no holder, so it satisfies the invariant.

Let us now consider the i-th operation, assuming that the invariant holds at the start of the operation.

If count is 1, the operation can only be an acquire, and there are no lock holders, and there are no threads waiting. After the acquire, the acquiring thread is the only thread holding the lock, the count is 0, and there are no waiting threads. The invariant holds.

If the count is 0 or less, then the operation can be an acquire by another thread or a release by the holding thread. If it is an acquire, the acquiring thread is placed on the wait queue and the count is decremented. The invariant holds.

If the operation is a release and the count is less than 0, then a thread leaves the wait state, becomes the new holder of the lock, and the count is incremented. The invariant holds.

If the operation is a release and the count is 0, there are no waiting threads; after the release there is no holder of the lock, and the count is now 1. The invariant holds.

In all cases, after the i-th operation the invariant holds, given that it held at the start of the i-th operation. It holds for the zeroth operation, therefore the invariant always holds.
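The case analysis can also be checked mechanically. Below is a small sketch that models the counter, the holder and the wait queue, and asserts the invariant after every operation. The class name `PVLock` is invented here, and waking waiters in first-in-first-out order is an assumption of the sketch, since the construction itself does not promise fairness.

```python
from collections import deque


class PVLock:
    """Model of a lock built from a PV-semaphore whose counter starts at 1."""

    def __init__(self):
        self.count = 1          # initial semaphore value
        self.holder = None      # thread currently holding the lock, if any
        self.waiting = deque()  # threads queued on the semaphore (FIFO is assumed)

    def acquire(self, thread):  # the renamed P operation
        assert thread != self.holder and thread not in self.waiting
        self.count -= 1
        if self.count < 0:
            self.waiting.append(thread)   # lock is held: wait
        else:
            self.holder = thread          # lock was free: take it
        self.check_invariant()

    def release(self, thread):  # the renamed V operation
        assert thread == self.holder      # only the holder may release
        self.count += 1
        # If a thread was waiting, it wakes and becomes the new holder.
        self.holder = self.waiting.popleft() if self.waiting else None
        self.check_invariant()

    def check_invariant(self):
        if self.count == 1:
            assert self.holder is None and not self.waiting
        else:
            assert self.count <= 0
            assert self.holder is not None
            assert len(self.waiting) == -self.count
```

Acquiring with threads "t1", "t2" and "t3" in turn leaves the counter at -2 with "t1" holding the lock; each release then hands the lock to the next waiter until the counter returns to 1.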

**The data structures**

Given a set Q of states, let them be indexed by the naturals and the states referred to by their index in the indexing. That is, Q = { q_{1}, q_{2}, … , q_{n} } and we can speak of state i or more correctly the state with index i.

Given a set S of tape symbols, let them be indexed by the naturals and the symbols referred to by their index in the indexing. That is, S = { s_{1}, s_{2}, … , s_{m} } and we can speak of symbol i or more correctly the symbol with index i.

- Encode name-value pair (n,v) ∈ N × N as r_{i} = 2^{n}3^{v}
- Encode a set of name value pairs { r_{i} } as Π_{p_{i}∈P} p_{i}^{r_{i}}, where P is a set of primes larger than some value M.
- A tape T is a set of name value pairs (i,s_{i}) where i is the cell position and s_{i} is the tape symbol index
- A configuration is C = 2^{h}3^{q}T, where h is the head position, q is the state index, and T is the tape contents.
- A computation C^{*} is a set of name value pairs (i,C_{i}) where C_{i} is the i-th step in the computation
- Encode a left or right transition, δ^{L}_{i}, δ^{R}_{i}, as 2^{q}3^{s}5^{q’}7^{s’}, where the transition is from the state of index q to the state of index q’, with the symbol of index s under the head replaced with the symbol of index s’, and the head moves left, or right, respectively
- A TM left or right transition set is a set of left or right transitions, Δ^{L} = Π_{p_{i}∈P} p_{i}^{δ^{L}_{i}}, and Δ^{R} = Π_{p_{i}∈P} p_{i}^{δ^{R}_{i}}, with P as before a set of primes larger than some value M.
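To get a feel for these encodings, here is a small sketch that builds the numbers for a pair, a set of pairs, and a configuration. The function names are invented for the illustration, and choosing M = 7 (so that the set primes stay clear of 2, 3, 5 and 7) is an assumption of the sketch, as is assigning primes to pairs in sorted order.

```python
def primes_above(m):
    """Yield the primes larger than m, in increasing order (trial division)."""
    candidate = max(m + 1, 2)
    while True:
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            yield candidate
        candidate += 1


def encode_pair(n, v):
    """Encode the name-value pair (n, v) as r = 2^n * 3^v."""
    return 2 ** n * 3 ** v


def encode_set(pairs, m=7):
    """Encode a set of pairs as the product of p_i^(r_i) over distinct primes p_i > m."""
    result = 1
    for p, (n, v) in zip(primes_above(m), sorted(pairs)):
        result *= p ** encode_pair(n, v)
    return result


def encode_config(h, q, tape_pairs, m=7):
    """Encode a configuration as C = 2^h * 3^q * T."""
    return 2 ** h * 3 ** q * encode_set(tape_pairs, m)
```

For example, the one-cell tape {(1,1)} becomes 11^6, because the pair (1,1) encodes to 2·3 = 6 and 11 is the first prime above 7. The numbers grow very quickly, which is harmless for the logical argument.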

**Data consistency constraints**

It is necessary that the data representation equations are true. As we are quantifying over the naturals, we can expect to get all sorts of numbers that somewhat fit our notions of a set or a configuration, but not entirely.

- Map of a set is a function, φ_{func}(C) = ∀ n, v, v’ (φ_{isin}(n,v,C) ∧ φ_{isin}(n,v’,C)) ⇒ (v=v’)
- Consistency of a configuration, φ_{conconfig}(C) = φ_{func}(C) ∧ ∃ h, q ( h ≥ 1 ∧ q ≥ 1 ∧ 2^{h}3^{q} | C )
- Consistency of a set of configurations, φ_{conconfigstar}(C^{*}) = ∀ i, C φ_{isin}(i,C,C^{*}) ⇒ φ_{conconfig}(C)
- No gaps constraint on a functional set, φ_{nogap}(F) = ∀ (i < i’ < i”) ( ∃ v, v” φ_{isin}(i,v,F) ∧ φ_{isin}(i”,v”,F) ) ⇒ ( ∃ v’ φ_{isin}(i’,v’,F) )
- Consistency of a computation, φ_{concomp}(C^{*}) = φ_{conconfigstar}(C^{*}) ∧ φ_{nogap}(C^{*})

**The predicates**

- The predicate that name-value pair (n,v) is in set T is given by φ_{isin}(n,v,T) = ∃ p φ_{bigprime}(p) ∧ p^{2^{n}3^{v}} || T.
- The predicate that tuple (q,s,q’,s’) is in a transition directory is given by φ_{isin}(q,s,q’,s’,Δ) = ∃ p φ_{bigprime}(p) ∧ p^{2^{q}3^{s}5^{q’}7^{s’}} || Δ.
- The predicate that the tapes in configurations C and C’ differ only in that the tape symbol under the head in configuration C is s and is replaced with s’ in C’ is given by φ_{repl}(s,s’,C,C’) = ∃ h 2^{h}||C ∧ φ_{isin}(h,s,C) ∧ φ_{isin}(h,s’,C’) ∧ ( ∀ h’, s’ h’≠h ⇒ ( φ_{isin}(h’,s’,C) ⇔ φ_{isin}(h’,s’,C’) ) )
- The predicate that configuration C’ follows from C according to transitions in Δ, ignoring head movement, is given by φ_{fol}(C,C’,Δ) = ∃ s, s’, q, q’ φ_{repl}(s,s’,C,C’) ∧ 3^{q}||C ∧ 3^{q’}||C’ ∧ φ_{isin}(q,s,q’,s’,Δ)
- The predicate that the head between configurations C and C’ moves left is defined as φ_{hl}(C,C’) = ∃ h 2^{h}||C’ ∧ 2^{h+1}||C, and φ_{hr}(C,C’) is defined likewise for a right head movement
- The predicate that C’ follows from C according to transitions in Δ^{*} is φ_{follow}(C,C’,Δ^{L},Δ^{R}) = ( φ_{fol}(C,C’,Δ^{L}) ∧ φ_{hl}(C,C’)) ∨ (φ_{fol}(C,C’,Δ^{R}) ∧ φ_{hr}(C,C’))
- The predicate that C and C’ are consecutive configurations in a computation C^{*} is given by φ_{cc}(C,C’,C^{*}) = ∃ i, j φ_{isin}(i,C,C^{*}) ∧ φ_{isin}(j,C’,C^{*}) ∧ ( j = i+1 )
- Assert that the computation C^{*} follows the Turing Machine transitions as φ_{follows}(C^{*}) = ∀ C, C’ φ_{cc}(C,C’,C^{*}) ⇒ φ_{follow}(C,C’,Δ^{L},Δ^{R})
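Two of these predicates are easy to try on small numbers. The sketch below reads p^k || T as "p^k divides T but p^(k+1) does not". The function names are invented, and the existential quantifier over big primes is replaced by a search through a finite list supplied by the caller, which is an assumption of the sketch rather than part of the formulas.

```python
def exactly_divides(p, k, t):
    """True when p^k || t: p^k divides t but p^(k+1) does not."""
    return t % p ** k == 0 and t % p ** (k + 1) != 0


def is_in(n, v, t, candidate_primes):
    """phi_isin(n, v, T): some big prime p has p^(2^n 3^v) || T."""
    r = 2 ** n * 3 ** v
    return any(exactly_divides(p, r, t) for p in candidate_primes)


def head_moves_left(c, c_next):
    """phi_hl(C, C'): there is an h with 2^h || C' and 2^(h+1) || C."""
    h = 0
    while c_next % 2 ** (h + 1) == 0:
        h += 1                      # on exit, 2^h || C'
    return exactly_divides(2, h + 1, c)
```

With T = 11^6 · 13^12, the pairs (1,1) and (2,1) are reported as members and (1,2) is not. With C = 2^3·3 and C’ = 2^2·3, the head has moved from position 3 to position 2, so `head_moves_left` holds.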

However, as programmers, we don’t program on a Turing Machine. Other formats for computation have arisen. The most typical format is based on an array of storage cells and a processing unit that performs simple logical and arithmetic computations on the values in the cells. Generally the program is not built into the finite machine of the computer. The finite state of the computer is programmed to carry out some actions, and these actions are triggered by the memory contents as *instructions*, and the instructions are sequenced into programs.

We are at a place in the course where it would be clearer if discussions of computability were done in a machine format and a computer language that is familiar. To make sure that the relevance to the Turing Machine is not lost, we will embark on a series of exercises that elevate the Turing Machine to a more typical programming format.

**The RAM Storage Model**

The most typical machine model of computation is called the *RAM Model of Computation*. The data store of a RAM machine is the RAM: a device that can pair a value drawn from a finite range of values with an address (drawn unfortunately from a finite range of addresses). At this moment in history, the basic range of values is the byte range, an integer between 0 and 255. However, the compiler understands values from more expansive ranges, such as integers, and will silently build up the more expansive ranges by ganging together multiple bytes.

The RAM is a box that when given an address, value pair and a store request, will store that value at that address; when given an address and a retrieve request, the RAM will present the value that was last stored at that address.
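As a minimal sketch of this box, assuming byte-sized cells and a made-up class name `RAM`, the store and retrieve requests look like the following. The two helper functions show one way wider integers could be built by ganging bytes together; the little-endian layout is an assumption of the sketch.

```python
class RAM:
    """A byte-addressed store: values 0..255 at addresses 0..size-1."""

    def __init__(self, size):
        self.cells = bytearray(size)   # every cell starts at 0

    def store(self, address, value):
        if not 0 <= value <= 255:
            raise ValueError("value outside the byte range")
        self.cells[address] = value

    def retrieve(self, address):
        return self.cells[address]     # the value last stored here


def store_int(ram, address, value, width=4):
    """Gang `width` bytes together to hold a wider unsigned integer."""
    for offset, byte in enumerate(value.to_bytes(width, "little")):
        ram.store(address + offset, byte)


def retrieve_int(ram, address, width=4):
    """Reassemble the integer stored by store_int."""
    raw = bytes(ram.retrieve(address + k) for k in range(width))
    return int.from_bytes(raw, "little")
```

Storing the integer 1000 at address 8 places the bytes 232 and 3 at addresses 8 and 9, since 1000 = 3·256 + 232.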

The RAM model is suggested by the actual available hardware. There are hardware modules that perform exactly the store and retrieve actions of the model. Compilers are responsible for setting up a correspondence between names in the program and storage addresses. This correspondence is done during compilation and the names of variables are not used in the resulting executable. All the variable names have been changed to fixed addresses, and those addresses are written into the resulting executable.

**Binding Storage Model**

Another model of storage, which is featured in Javascript, is the *binding model*. The binding model considers memory to be a dictionary of name-value pairs. To store a value in location “name”, an entry is made in the dictionary associating the value to the name, or if the name is already in the dictionary, the association is updated.

In this model, the name and value can be arbitrary strings. The hardware is not close to providing a direct implementation of this storage model, so it uses its RAM storage model to simulate the binding storage model. The binding storage model is typical of scripting languages which do not compile the code, so do not have the opportunity to collect all variable names together, and create a layout of these variables in memory.

In Javascript, and other languages adopting this memory model (such as Scheme, a Lisp variant), the dictionaries are further stacked so that a name can occur in multiple dictionaries. When the value of a name is sought, it is first looked up in the dictionary at the top of the stack, and if not found in that dictionary, the next deepest dictionary is searched. This continues until the bottom-most dictionary is searched, at which point the name is considered to be unknown.

The stacking of dictionaries allows for various memory features, such as the easy handling of recursive procedure calls. If function f that defines variable i calls itself, the variable i in the called f is distinct from the variable i in the caller f, even though they have the same name. Also, the caller’s i must reappear when the called f returns. This is accommodated by having a function call push a new, initially empty dictionary of bindings on the dictionary stack, and defining the new i in that dictionary, causing the older i to be hidden until the dictionary stack is popped.
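The push, define, look up and pop cycle just described can be sketched in a few lines. The class `BindingStore` and its method names are invented for this illustration; it is not how any particular Javascript or Scheme implementation is written.

```python
class BindingStore:
    """A stack of dictionaries; the last list element is the top of the stack."""

    def __init__(self):
        self.stack = [{}]               # the bottom-most dictionary

    def push(self):
        self.stack.append({})           # done on a function call

    def pop(self):
        self.stack.pop()                # done on a function return

    def define(self, name, value):
        self.stack[-1][name] = value    # bind in the top dictionary

    def lookup(self, name):
        for bindings in reversed(self.stack):   # top of the stack first
            if name in bindings:
                return bindings[name]
        raise KeyError(name)            # unknown in every dictionary
```

If the caller defines i as 1 and the called function, after a push, defines i as 2, a lookup of i yields 2 until the pop, after which it yields 1 again.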

**Implementing storage on a Turing Machine**

Our goal is to turn a Turing Machine into something that runs typical computer code, so that our discussions of Turing Machines will be on more familiar grounds. To that end, we will implement the Binding Storage model on the Turing Machine, and later use the storage to simplify the programming of Turing Machines.

Either memory model can be implemented, but the Binding Model is easier to implement, and is a very powerful model.

We will make great use of the theorem that a multi-tape Turing Machine is equivalent to a single tape Turing Machine, and freely introduce additional tapes to make the finite state programming simpler.

For the binding model, we will introduce one tape completely dedicated to the dictionary, and another tape dedicated to the transfer of requests and data to the dictionary.

The dictionary tape will be formatted as:

⊣ L ( ⎵^{+} Σ^{+} ⎵^{+} Σ^{+} )^{*} R ⎵^{∞}

Where ⊣ is the left end of the tape, ⎵ is the reserved blank symbol, L and R are reserved tape symbols, and Σ is the alphabet for writing names and values. In each pair of Σ^{+} strings, the leftward one is the name and the rightward one is the value.

The transfer tape will have the format:

⊣ (R|W) ⎵ Σ^{+} [⎵^{+} Σ^{+}] ⎵^{∞}

to request a read or write, respectively (with the leftmost string being the name, and the rightmost being the value, when a write); and returns either

⊣$ ⎵ Σ^{+} ⎵^{∞}

or a blank tape. If the read is successful the string will be a copy of the value; else it will return blank tape.

*Note:* To keep things simple, when updating the value associated with a name, if the new value is too large to fit into the space where the old value was, just truncate the new value. For this reason I have defined the write to allow multiple blanks ⎵ before the value. You can copy those too when the binding is created, thus reserving space for future, larger values. Real computers have these issues too.
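Before drawing state diagrams, it may help to see the intended behavior on strings. The sketch below models only the finite part of the dictionary tape between L and R, writes `_` for the blank ⎵, and assumes Σ contains no `_`, `L` or `R`. The functions `read` and `write` and the `spare` parameter are invented names, and letting a longer value grow leftward into the reserved blanks is one reading of the note above, not the only possible one.

```python
import re

BLANK = "_"   # stand-in for the reserved blank symbol


def read(tape, name):
    """Return the value bound to `name`, or None (standing for the blank-tape reply)."""
    fields = [f for f in tape[1:tape.index("R")].split(BLANK) if f]
    for key, value in zip(fields[0::2], fields[1::2]):
        if key == name:
            return value
    return None


def write(tape, name, value, spare=0):
    """Update `name` within its reserved space, truncating if needed, or append it."""
    tokens = list(re.finditer(r"[^_LR]+", tape))
    for key, old in zip(tokens[0::2], tokens[1::2]):
        if key.group() == name:
            start, stop = key.end() + 1, old.end()   # keep one separating blank
            room = stop - start
            return tape[:start] + value[:room].rjust(room, BLANK) + tape[stop:]
    end = tape.index("R")
    # New binding: copy `spare` extra blanks ahead of the value to reserve space.
    return tape[:end] + BLANK + name + BLANK * (1 + spare) + value + tape[end:]
```

Starting from the empty store `LR`, writing a = 10 with two spare blanks gives `L_a___10R`. A later write of 110011 to a has only four cells available, so the stored value is truncated to 1100.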

**Implementing Compound States**

At each place in the overall programming of a Turing Machine where it needs to read or write to the data store, the TM sets up the transfer tape with the request and then state control transfers to a fresh copy of the conglomeration of states that undertake the read or write. Yes, this means a lot of states. If it takes k states to do a read and there are m reads, there will be k*m states created. That’s fine.

We call the conglomeration of states a *compound state* and can draw it succinctly in the state diagram using the notation:

◯ → ⦾^{R} → ◯

or

◯ → ⦾^{W} → ◯

where the superscripts identify the compound state that is being inserted.

You will be asked to work out the details of another compound state. Consider the compound state which moves the head left erasing everything until a $ is encountered. Assume for simplicity that a $ will be encountered (you might loop infinitely if not, but that’s OK). You will be asked to draw the state diagram for this compound state, ⦾^{E}; the full state diagram of a TM which:

- Writes $abc on its tape;
- Then, with the head over any blank to the right of the c, erases all characters until the $ is encountered;

and (thirdly) the state diagram of this machine but using the notation of the compound state to simplify your diagrams.

**Notation for k-tape machines**

We should fix a notation for k-tape Turing Machines. Let the notation x:y mean that tape x has a y under the head, or that on tape x write a y under the head. Use this notation for left and right as well: x:L or x:R will move the tape head of tape x left or right. And finally, transitions can be any collection of x:y pairs yielding another collection of x:y pairs. The input pairs must all be satisfied, and the output pairs are all effectuated. (Make also a set of common-sense rules that among the output collection there are not contradictory pairs, such as x:R and x:L both appearing.)

Example: A:c,B:d → A:e,C:R is the transition: if on tape A is a c, and on tape B is a d, then write an e to tape A and move the head of tape C right. No other tapes matter as the trigger, and no other tape actions are taken.
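The notation is regular enough to interpret mechanically. Here is a small sketch, with invented function names, in which tapes are Python lists keyed by tape name and `heads` records each head position. It assumes that L and R on the output side always mean head moves, and it applies the output pairs in the order written.

```python
def parse(side):
    """Turn 'A:c,B:d' into [('A', 'c'), ('B', 'd')]."""
    return [tuple(item.split(":")) for item in side.split(",")]


def step(tapes, heads, rule):
    """Apply 'inputs → outputs' if every input pair matches; report whether it fired."""
    inputs, outputs = (parse(s.strip()) for s in rule.split("→"))
    if any(tapes[t][heads[t]] != symbol for t, symbol in inputs):
        return False                    # some input pair is not satisfied
    for t, action in outputs:
        if action == "L":
            heads[t] -= 1               # move the head of tape t left
        elif action == "R":
            heads[t] += 1               # move the head of tape t right
        else:
            tapes[t][heads[t]] = action # write the symbol under the head
    return True
```

Applied to tapes A = c, B = d with all heads at position 0, the example rule fires once, writing e on tape A and moving the head of tape C right; a second application does not fire because tape A no longer shows a c.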

**Exercises for Part One**

- Assignment 1: Write the TM finite state diagram to implement read ⦾^{R} and write ⦾^{W}.
- Assignment 2: Implement the ⦾^{E} compound state. Write the state diagram for the machine which writes $abc to its tape then erases the letters a, b and c. Write the state diagram twice: once showing all states, and a second time using the compound state notation to simplify the diagram.
- Assignment 3: Show the complete Turing Machine to implement a = 1010, using the compound states R and W. You can also assume compound states that write literal strings, such as the compound state ⦾^{“1010”}.

This part will be about how a Turing Machine implements the *Central Processing Unit*, an architectural element of practical computing machinery.

Most CPUs implement a *register machine*, where the CPU contains memory locations which are the source and destination of all computation. To compute on data in RAM memory, the data is first brought into the registers. We will follow that pattern: for a k register machine we will have at least k tapes, and each tape will contain the contents of a register.

The Intel architecture, for instance, descends from the original Intel 8080 chip with seven registers called A, B, C, D, E, H and L. The flexibility in the use of registers varies by architecture even among register machines, with RISC architectures having more flexibility and CISC architectures having less. A typical instruction following the pattern in which register A is a special *accumulator* register would be: “ADD the contents of registers B and C and place the result in A”, or a compound state ⦾^{A=B+C}.

If we suppose integers written in binary, with the least-significant bit leftmost on tapes B and C, you can imagine how to write such a compound state.
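As a hint of what the compound state must do, here is the same left-to-right sweep with a carry, written as a sketch on strings rather than tapes. The function name is invented, and treating a blank past the end of the shorter number as a 0 bit is an assumption of the sketch.

```python
def add_lsb_first(b, c):
    """Add two binary strings whose least-significant bit is leftmost."""
    result, carry = [], 0
    for k in range(max(len(b), len(c))):
        bit_b = int(b[k]) if k < len(b) else 0   # past the end reads as 0
        bit_c = int(c[k]) if k < len(c) else 0
        total = bit_b + bit_c + carry
        result.append(str(total % 2))            # bit written to tape A
        carry = total // 2                       # carry remembered in the state
    if carry:
        result.append("1")
    return "".join(result)
```

For example, 5 is written 101 and 3 is written 11 in this convention, and their sum 8 comes out as 0001. The carry is the only thing the finite state needs to remember while the three heads move right together.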

**Recursion and Dictionary Stacks**

A necessity of general programming is some sort of looping structure. LISP, an early computer language, used the mathematics of the lambda-calculus to implement all control as recursion. While LISP descendants grew in complexity, thereby obscuring the mathematical simplicity, the LISP derivative Scheme maintains the simplicity. We will implement a Scheme-like recursion structure, and program the recursive Fibonacci Algorithm on our Turing Machine.

The Fibonacci Sequence can be generated by the recursive definition:

f_{i+2} = f_{i+1} + f_{i} for i ≥ 1, and f_{2} = f_{1} = 1

A possible implementation is:

⦾^{FIB}: A Turing Machine when started with the number i on tape A (written in binary, least significant bit against the left edge of the tape) halts with the i-th Fibonacci number, f_{i}, written on tape A.

- Append a “Directory Mark” to the end of the variable store.
- Add variables i and f to the variable store, in the current directory
- Decrement the contents of tape A and store a copy of the result into variable i
- Goto ⦾^{FIB}
- (on completion of ⦾^{FIB}) Store a copy of the contents of tape A to variable f
- Read variable i into A and decrement the contents of A
- Goto ⦾^{FIB}
- (on completion of ⦾^{FIB}) Copy the contents of A into B; read the variable f into C; and enter ⦾^{A=B+C}
- Remove all variables in the store right of the Directory Mark, and remove the Directory Mark

The new thing introduced is the idea of a *Directory Mark* and the creation of variables for recursion by writing and reading from the right most variable of the same name, when there are multiple variables of the same name, in the value store.

The changes to your data store are:

- search for names from the rightmost end of the store back to the start of tape, leftwards;
- allow for multiple variables of the same name, with the rightmost one covering any to the left of the same name;
- demark the store with the character “D”, a character outside the alphabet for names, with two compound states: ⦾^{+D} which adds a “D” to the end of the store, and ⦾^{-D} which erases from the right end of the tape leftwards until encountering a “D”, and erases that “D”.
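The whole scheme can be rehearsed in ordinary code before committing to state diagrams. The sketch below uses invented names (`MarkedStore`, `fib`), keeps integers as Python ints rather than binary tapes, and follows the ⦾^{FIB} steps in order. The steps as listed do not spell out when the recursion stops, so the test `a <= 2` is an added assumption taken from f_{1} = f_{2} = 1.

```python
MARK = "D"   # the Directory Mark, a symbol outside the name alphabet


class MarkedStore:
    def __init__(self):
        self.cells = []                    # MARK entries and [name, value] cells

    def add_mark(self):                    # the +D compound state
        self.cells.append(MARK)

    def drop_to_mark(self):                # the -D compound state
        while self.cells.pop() != MARK:    # erase leftwards, the D included
            pass

    def declare(self, name):
        self.cells.append([name, 0])       # new variable at the right end

    def _rightmost(self, name):
        for cell in reversed(self.cells):  # search from the right end leftwards
            if cell != MARK and cell[0] == name:
                return cell
        raise KeyError(name)

    def write(self, name, value):
        self._rightmost(name)[1] = value

    def read(self, name):
        return self._rightmost(name)[1]


def fib(a, store):
    """Return f_a; `a` plays the role of tape A."""
    if a <= 2:                     # assumed base case: f_1 = f_2 = 1
        return 1
    store.add_mark()
    store.declare("i")
    store.declare("f")
    a -= 1
    store.write("i", a)
    a = fib(a, store)              # first Goto FIB
    store.write("f", a)
    a = store.read("i") - 1
    a = fib(a, store)              # second Goto FIB
    b, c = a, store.read("f")
    a = b + c                      # the A=B+C compound state
    store.drop_to_mark()
    return a
```

Calling `fib(10, MarkedStore())` yields 55, and the store is empty again afterwards because every directory that was opened has been removed.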

With this simple addition you have a complete Scheme-like programming language.

**Javascript does it this way**

By the way, this idea of a stack of dictionaries is how Scheme does it. But of perhaps more interest is how Javascript does it. For simple programs, this is not important. But to understand Javascript objects, and its model of Prototype inheritance, it is crucial. Javascript does it this way because Brendan Eich, the creator of Javascript, was bringing these concepts in from Scheme. It is a very simple mechanism for implementing not only recursion but also a thing called a Closure, which is crucial for functional programming.

**Exercises for Part Two**

- Assignment 1: Write the TM finite state diagram to implement ⦾^{A=B+C}.
- Assignment 2: Implement the ⦾^{+D} and ⦾^{-D} compound states.
- Assignment 3: Implement helper compound states ⦾^{x→y}, where x and y are stand-ins for A, B, C, etc., which copy the contents of tape x to tape y.
- Assignment 4: Implement helper compound states ⦾^{T→x} and ⦾^{x→T}, where x is a stand-in for A, B, C, etc., which copy the contents of tape x to and from the transfer tape (called T), for use in read and write. Note that in the copy to the transfer tape, the copy appends to the transfer tape, because for write the contents of T should be name value.
- Assignment 5: Implement the ⦾^{FIB} compound state.

As an example, to store the value in tape C into variable i, the sequence of compound states would be:

⦾^{store C in i} = → ⦾^{“i”} → ⦾^{C→T}→ ⦾^{W} →

For extra credit, write the compound state ⦾^{GCD} which implements a recursive Euclidean Algorithm to find the greatest common divisor of two integers. Given i and j written in binary on tape A, with j separated from i by one blank and all blanks to the right of j,

⊣ ‹i› ⎵ ‹j› ⎵^{∞}

the compound state ⦾^{GCD} will leave tape A with k, the gcd of i and j, written leftmost, all blanks following:

⊣ ‹k› ⎵^{∞}

where ‹i› ∈ {0,1}^{+} is understood as a binary integer with the leftmost 0/1 the one’s place, etc. E.g. ‹25› would be 10011.

Dr. Nim is a plastic toy manufactured in the mid-1960s by E.S.R., also the maker of Digi-Comp, a mechanical computer. Dr. Nim is a Marienbad game, so named because a version of this game is featured in Last Year at Marienbad, a film by Alain Resnais and a landmark in the Nouvelle Vague of French cinema. In Dr. Nim, there are 15 marbles. Players alternate taking 1 to 3 marbles, trying to force the opponent to take the last marble. The plastic Dr. Nim was the opponent, and could actually calculate the winning strategy in the flipping of plastic “gates”.

The gist of the winning strategy is to note that 1 is 1 mod 4, and that it is always possible for exactly 4 marbles to be taken by the combination of the player and Dr. Nim. Therefore, the player that first leaves the opponent with 1 mod 4 marbles can maintain this until there is but one marble.

Hence the plastic computer that is Dr. Nim counts mod 4, and cleverly takes actions to always leave its “state” (the flipping of the plastic gates) in accordance with 1 mod 4 marbles remaining.
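The mod 4 counting is short enough to write out. The following sketch, with invented function names, computes the number of marbles to take so that 1 mod 4 remain, and plays a whole game against a scripted list of player moves. Taking a single marble when no winning move exists is an arbitrary choice made for the sketch, not something taken from the booklet.

```python
def dr_nim_move(marbles):
    """Marbles to take so that the count left is 1 mod 4, when that is possible."""
    take = (marbles - 1) % 4
    return take if take else 1     # already at 1 mod 4: no winning move, take one


def play(marbles, player_moves):
    """Dr. Nim moves first; return whoever is forced to take the last marble."""
    moves = iter(player_moves)
    while True:
        marbles -= dr_nim_move(marbles)
        if marbles == 0:
            return "Dr. Nim"
        marbles -= min(next(moves), marbles)
        if marbles == 0:
            return "player"
```

From 15 marbles Dr. Nim takes 2, leaving 13. Whatever the player then takes, Dr. Nim tops the pair of moves up to 4, so the counts left to the player run 13, 9, 5, 1 and the player takes the last marble.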

I presented this game in class, and you read the Dr. Nim booklet.

The device is a Finite Automaton, and the Dr. Nim booklet is pretty good about describing it as such. What is the regular language of Dr. Nim? Make these assumptions:

- Assume away the “Equalizer”.
- Dr. Nim will go first (and because we have removed the Equalizer, will always win).
- The alphabet for the language is Σ = {1, 2, 3, __1__, __2__, __3__}, where the digit indicates the quantity of marbles taken in the turn, and the overline and underline indicate whether it is Dr. Nim’s or the player’s turn, respectively.
- The game is played with any number 2 mod 4 of marbles (because if it is exactly 15 marbles, the language is finite and therefore boring)
- The accepted strings are those sequences of marble takings that end in a win for Dr. Nim.

For instance, the Language of Dr. Nim, restricted to strings of length two, is {1__1__}.

This course will give you a broader view of computation, and how it relates to mechanical computation and language.

**Hilbert’s 10th Problem**

As this course opens, so does a Major Motion Picture about Alan Turing, called the Imitation Game. So we might as well begin our history here, with Alan Turing’s contribution to the theory of computation.

Alan Turing was very concerned with the nature of computation because it was a pressing, and open problem in the history of mathematics. The nature of computation became fundamental to mathematics because in the growing notion of thought as a mechanical process, all of mathematics must then be a result of this mechanical process. If there were machines to make roads, it stands to reason that before long there should be machines to build mathematics.

At the 1900 Mathematical Congress, David Hilbert gave a series of challenge problems. His tenth problem in the series concerned finding the roots of any multi-dimensional polynomial, given the restriction of the number system to the integers. Such integer restricted problems are called Diophantine. Eventually this problem was found to be impossible to answer. It is impossible to determine, in general, whether or not integers can satisfy a Diophantine equation.

The problem is, just as the final conclusions were being drawn up across all of science and mathematics, the foundations crumbled.

The demand for increasing clarity in mathematics led people such as Friedrich Ludwig Gottlob Frege to set the foundations of mathematics on logic and set theory. By 1910 the philosophers Bertrand Russell and Alfred North Whitehead had gotten as far as a summary opus, *Principia Mathematica*, that was to completely describe the foundations of mathematics. But they couldn’t finish, because the more precisely they thought about logic, the more paradoxes they discovered.

Bertrand Russell, in thinking about sets, asked himself about self-referential sets. His famous question is: *In the town where the barber shaves all who do not shave themselves, who shaves the barber?* It cannot be the barber, for he only shaves those that do not shave themselves, but if it isn’t the barber, then the barber does not shave himself and so it must be the barber.

The careful language of PM made it possible to write down such a self-referential formula, and this led to the ability to define impossible objects. So rather than the age of reason, we are in the age of Alice in Wonderland, where we think as many as six impossible things before breakfast.

In fact, it wasn’t only mathematics that seemed to be crumbling under its own self-contradiction; all the tidy plans for end-of-the-century science were abruptly shredded, as Max Planck suggested Quantum Physics (1900), Einstein the Theory of Relativity (1905) and Werner Heisenberg the Uncertainty Principle (1927).

For mathematics, this movement towards strange new limits, which totally break from any classical understanding of reality, thought, and the mind, was realized in Kurt Gödel’s Incompleteness Theorem (1931). Turing recast Gödel’s problem into one of computation, and in 1936 published his solution to the Halting Problem, which states that no general method exists that will determine if a given program will ever terminate its computation.

The connection between Turing and Gödel, halting and truth, is the following: proof was being considered as a computation. It is a logical computation, finite steps over a finite symbol set, with strictly enforced rules. Just like a computer’s computation. If all computation halts, then proofs eventually terminate with an answer. The statement has been proven true, or proven false.

All statements are certainly true or false, the proof allows us to measure that. No, our measurement can fail to come to a conclusion.

**The Imitation Game**

Along with all these changes in intellectual viewpoint, man’s perceived place in the world was changing. The obvious aspects in which a human is a machine began to impress themselves on human self-knowledge. In 1923 Sigmund Freud published *The Ego and the Id*. In doing so, he recast our spiritual selves as the subject of scientific analysis and explanation. The emotive or sub-conscious mind is also a machine. Combined with the work on computation theory, the world began to wonder about Artificial Intelligence. Can a machine think?

The incompleteness theorem gave a tempting avenue towards the divine. Maybe machines can’t tell the difference between true and false, but somehow systems of logic are created in which those things that can be labeled true and false are all the useful things. Maybe there is another sort of intelligence that makes these over-arching rules, which is not rational but is fed by something uniquely human; and from there the machines take over. Similar to a bulldozer being the thing that can lift the earth, but it is the person that wills that the earth needs to be lifted.

For this, Turing set off to make his Imitation Game, or Turing Test, that would be some sort of detector for this divine quality. However, it had to be played out through language, and language itself was under reconsideration, particularly by Ludwig Wittgenstein, whose *Tractatus Logico-Philosophicus* (1921) could claim that the problems of thought are always problems of language, and therefore the Turing Test skirts the problem by confusing a thought with a discussion of a thought.

I think that code breaking influenced this direction. Prior to the modern age of code breaking, of which Turing was the central figurehead, cryptography was thought of in terms of linguistics. But set to the problem of code making and code breaking, the only obvious starting point was that of a machine obfuscating a symbol stream. Hence, the limitations of codes were the limitations of machine computation. This carries forward to us today in an even more refined manner, by contemplating codes unbreakable not because of a formal, mathematical unsolvability, but because the algorithm needed to solve a code outstrips the available computational resources.

**Language and Computation**

In 1957 Noam Chomsky published *Syntactic Structures*, which proposed a mathematical theory for the grammar of languages. His approach, married with the general concerns of Wittgenstein, put computation within a language framework. You will see this in the course: there will be a continued interest in “languages” as a model of computation.

Noam Chomsky was looking for rule systems that could generate grammatically correct sentences. As with mathematical incompleteness, it is not true that languages easily have a perfect grammar. “All grammars leak”, Edward Sapir 1921. However, Chomsky defined a sort of grammar that seems close to true, and in doing so, created the Chomsky Hierarchy of languages, with increasing complexity, to explain syntax and its relationship to meaning.

These things are important to computer scientists not only to define computation, but to guide the creation of computer languages, which should be of a lesser sort than human languages, as they only need to command, and need not, even should not, express ambiguity. In doing so, a hierarchy of computation is created in parallel, with each level in the hierarchy of computation associated with a level in the hierarchy of language.

A Logical Theory contains *relations*, *variables*, *logical connectives*, and *quantifiers*. The logical connectives are always AND, OR and NOT, and the quantifiers are always EXISTS and FORALL. These elements are combined into *formulas* in simple ways conforming to a simple grammar. It is easy to tell what is a formula, because the grammar rules are simple. How simple? It must be decidable. In addition, there are axioms, which are formulas which are known to be a priori true.

For instance, the theory of addition would have a single relation PLUS, which given three numbers x, y and z, would be true if and only if x+y=z, at least that’s what we would picture in our minds. A *model* for a theory is a map of the relation to some “real world” example such that all the true relations and formulas in the theory are actually true in the real world. So the theory of addition would certainly be true in the model where the variables would take values in the integers, or naturals, and PLUS would agree with addition of integers.

The theory of addition would have axioms asserting the properties about the theory, such as the commutativity of addition, the existence and uniqueness of the sum, and (perhaps) of the zero element. These axioms would be chosen so that they capture accurately the properties of the common model. The axiom of commutativity would be:

- FORALL x, FORALL y, EXISTS z PLUS(x,y,z) AND PLUS(y,x,z)

It is possible for multiple models to exist. Addition over the integers has the same formal properties as addition over the integers mod N (except the set of elements is finite and there is no order relationship … neither of these is important to the theory), or addition over the reals (except that the reals have a continuum of elements, while the integers are denumerable – again this is not important to the theory).
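For a finite model such as the integers mod N, whether an axiom holds can be checked by brute force. The sketch below, with invented function names, interprets PLUS as addition mod n and tests the commutativity axiom over every x and y in the domain. A deliberately wrong interpretation (subtraction) is included only as a contrast.

```python
from itertools import product


def plus_mod(n):
    """The relation PLUS interpreted as addition mod n."""
    return lambda x, y, z: (x + y) % n == z


def minus_mod(n):
    """A non-example: PLUS misinterpreted as subtraction mod n."""
    return lambda x, y, z: (x - y) % n == z


def commutativity_holds(plus, domain):
    """FORALL x, FORALL y, EXISTS z: PLUS(x, y, z) AND PLUS(y, x, z)."""
    return all(
        any(plus(x, y, z) and plus(y, x, z) for z in domain)
        for x, y in product(domain, repeat=2)
    )
```

Over the domain 0..4, addition mod 5 satisfies the axiom, while subtraction mod 5 does not: for x = 1 and y = 0 there is no single z that equals both 1 − 0 and 0 − 1.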

The question is, is the set of all true formulas of the theory Turing Decidable? For some theories, it is, for others, it is not. For those, this means that there is no algorithmic method for deciding the truth of all propositions. Since Proof is an algorithmic method, this means that not all truths are provable.

If we restrict to a theory modeled by the integers and addition, this is decidable. However, if we consider a theory modeled by the integers, addition and multiplication (called the language of Number Theory), this is not decidable, because given a description of a Turing Machine M and an input w, it is possible to write down a formula in the language of integers, addition and multiplication, such that the formula is true if and only if M accepts w.

It’s a bit of a trick to do this, but in a certain way, I’m not surprised. The amount of brains needed to run a Turing Machine (move left, move right, write a zero, write a one) doesn’t seem to be more than what’s needed to solve integer formulas! And indeed, it isn’t.

The set of true formulas is, however, Turing acceptable. The true formulas are those true because they are axioms, or are relations, or are formulas whose truth is entailed by logical consequence from the axioms and relations. Such true formulas will be true in any model of the theory. It is possible that certain formulas will be true in some models and not others. This occurs when essential axioms are removed. For instance, take a theory of geometry with the parallel postulate removed: Euclidean and non-Euclidean geometries are both models for this neutral theory.

Notwithstanding that, it is possible to write down the entailment rules so that a diligent Turing Machine can apply the entailment rules to the axioms and grind out many true formulas, and these are called theorems, and the grinding process is called proof. Rather un-poetic, but there you have it.
