The Stochastic Revolution in Art and Science

Graham Coulter-Smith

Far from finding statistics dry, I am attracted to its mysterious qualities. I remember seeing balls running down a wall-sized quincunx peg board in the science museum in Boston and forming the famous bell curve (Gauss’s ‘normal distribution’); it seemed miraculous that these randomly bouncing balls should end up in such an orderly array. Statistics is the science of finding structure empirically rather than beginning with a predetermined notion of what structure should be. And statistics is also capable of dealing with both complexity and change. In this sense statistics can be understood as the scientific corollary of poststructuralism in philosophy; it also underscores the enduring, but often misunderstood, aesthetics of chance in the arts.
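That quincunx demonstration can be simulated in a few lines of Python. This is a hypothetical toy sketch (not the museum exhibit): each ball takes a series of 50/50 left-or-right bounces, and the bins collect into the familiar bell shape.

```python
import random
from collections import Counter

def galton_board(n_balls=10000, n_rows=12, seed=42):
    """Simulate balls falling through a quincunx peg board.

    Each ball makes n_rows independent 50/50 left-or-right bounces;
    its final bin is simply the number of rightward bounces. The bin
    counts follow the binomial distribution, which approximates the
    Gaussian 'normal distribution' as the number of rows grows.
    """
    rng = random.Random(seed)
    bins = Counter()
    for _ in range(n_balls):
        bins[sum(rng.random() < 0.5 for _ in range(n_rows))] += 1
    return [bins.get(i, 0) for i in range(n_rows + 1)]

counts = galton_board()
# The central bins collect far more balls than the extremes,
# even though every individual bounce is pure chance.
```

Run with different seeds the exact counts vary, but the orderly bell-shaped array always re-emerges: precisely the ‘order out of chance’ the exhibit demonstrates.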

The notion of chance has played a major role in twentieth century avant-gardism from Marcel Duchamp to John Cage. It led to a movement in music, aleatorism, a precursor of minimalism. It also affected literature in the cut-and-paste poetry of Tristan Tzara that laid the basis of Brion Gysin and William Burroughs’s cut-and-paste approach to literature, and we hear it today in the dominance of sampling in popular music over the past two decades. Chance and montage go hand in hand, as do chance and counter-narrative practice in experimental film, video and literature. The essays below will explore the concept of chance in art and in the parallel world of science which, in epistemological terms, is the primary domain of chance since Darwin published On the Origin of Species in 1859. Part One will deal with the roots of the aesthetics of chance in the visual arts; the second, third and fourth parts will tackle the role played by chance in the domains of linguistics, computer science, and cognitive science. A fifth part, on Evolutionary Computing, may be added later.


Dada and Darwin, The Roots of the Aesthetics of Chance

The roots of the aesthetics of chance in visual art can be traced back to the Dada and Surrealist movements in the early twentieth century. And if we examine the cultural context in which Dada and Surrealism came into prominence we realise that one of the most powerful cultural revolutions informing such a concern is most likely connected with the cultural aftershocks following Darwin’s publication of his theory of evolution (1859). Darwin’s theory of evolution was a truly revolutionary moment for the arts because it dealt the coup de grâce to the humanist, anthropic proposition that ‘man’ was made in the image of God.[1] Compared to Darwinian evolution the Copernican revolution was a hiccup. Copernicus’ observation that the entire universe did not revolve around the Earth bruised humanist hubris, but Darwin’s observations dealt a mortal blow to the anthropic ideal that ‘man’ was made in the image of God. One of the first to try to come to terms with the new order of things was the philosopher Friedrich Nietzsche who proclaimed that ‘God is dead’ and formulated a post-anthropic philosophy based on the Will to Power which, according to Gilles Deleuze, also appears in Nietzsche’s writings as the ‘dice thrower’ (aleator): which is to say, as chance.[2]


Nietzsche was not a scientist and his interpretation of Darwin was motivated more by German romanticism than by science. His Will to Power is a species of life force similar to the élan vital that Henri Bergson formulated in the early twentieth century.[3] Nietzsche’s aestheticization of chance in the wake of Darwin’s catastrophic discovery certainly contributed to the intellectual matrix in which Duchamp, Dada and Surrealism formulated their aleatoric aesthetic practices. And it is against this background that the Duchampian Readymade, aleatoric montage (cut and paste poetry and photography), automatic writing and painting, and counter-narrative practices (such as the surrealist object and surrealist film) emerge as key principles. These are the aesthetic tools for a new age in which human beings are no longer made in the image of God but are instead the product of an approximately infinite number of ‘dice throws’.


One recalls Stéphane Mallarmé’s Un coup de dés jamais n’abolira le hasard (‘A Throw of the Dice Will Never Abolish Chance’), 1914, which stems from the period in which Duchamp was creating his first Readymades. Contemplating Mallarmé’s title from the point of view of the Nietzschean dice throw one can note that an infinite number of throws must inevitably lead to what Nietzsche referred to as the eternal return of the same (which incidentally is a poetic echo of the ‘eternal return’ of the mathematician Gauss’s ‘normal distribution’: the infamous ‘bell curve’). To continue the conflation of Nietzsche and Mallarmé, the Nietzschean return of the same can never abolish chance because, although similar, the universes created by the infinite and eternal throws of the dice cannot be identical. One thinks here of the Mandelbrot Set with its infinite difference based on infinite self-similarity. And one might note that the fractal geometry of which the Mandelbrot Set is an instance reflects the self-similar irregularity (regular irregularity) we see in natural forms.[4]
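The Mandelbrot Set is itself generated by an astonishingly simple rule. A minimal Python sketch of the standard escape-time test (assuming the usual bailout radius of 2) shows how little machinery is involved:

```python
def mandelbrot_escape(c, max_iter=100):
    """Iterate z -> z**2 + c from z = 0 and return the number of
    steps taken before |z| exceeds 2, or max_iter if the orbit
    stays bounded (i.e. c is treated as inside the Mandelbrot Set)."""
    z = 0
    for step in range(max_iter):
        if abs(z) > 2:
            return step
        z = z * z + c
    return max_iter

# c = 0 and c = -1 stay bounded; c = 2 + 2j escapes immediately.
```

Sweeping c across the complex plane and colouring each point by its escape count yields the set’s infinitely self-similar boundary.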


Linda Allison 1997 The Mandelbrot Set transformed into cybernetic jewellery

Nietzsche’s principle of the ‘eternal return’ shows quite clearly that his dice-throw metaphor is concerned not with chaos but with the realization that in certain circumstances chance-like behaviour is actually a form of order invisible to common sense. This must be the case or else Darwinian evolution would be impossible and the irregularity of nature would not be regular (a regularity evidenced by the trivial fact that there can be a taxonomy of clouds). Looking back on Nietzsche’s aestheticisation of chance one can perceive that the philosopher in Nietzsche was groping for a new form of reason, a logic that transcends the anthropic common sense that led to myths such as a God in the image of ‘man’, and a universe that rotated around the Earth.

Darwinian evolution thrust an aleatoric, anti-anthropic vision of the universe onto humankind, and Dada and Surrealism were among the first to offer a cultural response to the ensuing crisis of Humanism and the apotheosis of reason evident in the Enlightenment. But Dada and Surrealism were in the minority; chance was not attractive to the majority of artists in the first half of the twentieth century, busy pursuing the rationalistic machine aesthetic. The dominant aesthetic discourse was geometric abstraction which, in retrospect, seems more an expression of eighteenth century Enlightenment rationalism than an expression of twentieth century science.

In contrast, Dada and Surrealism resonate to a much greater degree with developments in twentieth century science. It is worth pointing out that the Surrealist movement was founded in 1924 only three years prior to the revolution in physics heralded by Werner Heisenberg’s principle of indeterminacy (also known as the ‘uncertainty principle’) formulated in 1927.[5] This was the point at which chance took on a key role not only in natural history, as in the case of Darwinian evolution, but also in the most theoretically rigorous field of science: physics. Dada and Surrealist artists probably had no knowledge of Heisenberg, nevertheless, one can acknowledge the degree of parallelism between the heightened role of chance in science and in art.

It is also worth noting that while he worked on The Large Glass, 1915-23, Duchamp became engrossed in popularisations of non-Euclidean geometry while his abstractionist colleagues in the De Stijl and Constructivist movements pursued a formalist vocabulary largely based on the classical discourse of Euclidean geometry.[6] Duchamp was particularly attracted to the non-Euclidean proposition that there could be a fourth dimension of space. The notion of an additional dimension of space suggests that the things in our three-dimensional world are interconnected in a dimension beyond the scope of our senses.

In Duchamp’s Notes and Projects for The Large Glass it becomes apparent that he believed that such interconnections could explain the hidden order that lay behind what we perceive as chance-like events.[7] What appears to us as paradoxical (such as the fact that subatomic phenomena are both wave-like and particle-like) or serendipitous (synchronicity) might be explained by interconnections at a higher spatial dimension. This notion confounds common sense but the concept of higher spatial dimensions has not left the realm of science; instead it has become one of the fundamental premises of the most advanced theoretical physics of the 1990s and 2000s (M-Theory, a development of String Theory, is based on the proposition that there are eleven dimensions).[8]

The Death of Logos

There are many things that differentiate humans from animals, and there are probably more things that we have in common than we care to admit. But the key distinction that is made is language. The philosopher Jacques Derrida has referred to this privileging of language as ‘logocentrism’. He traces its history in Western culture back to ancient classical metaphysics (not surprising for a philosopher) and the concept of Logos, which I am capitalizing and italicising to underscore its alleged importance. In the Judeo-Christian tradition it is said ‘in the beginning was the word’: that is to say, Logos is as much a theological as a metaphysical conception. Which turns the discussion back to Darwin.

The question I would like to ask now is: in the wake of the post-Darwinian ‘death of God’ is there also a ‘death of Logos’? And is its nemesis the anti-metaphysical, anti-theological episteme of chance? I believe it might be; however, the field I am entering now is actual science, which is to say ongoing science, and as such there are no definitive answers. And in any case this is a speculative mission: I want to be suggestive rather than definitive.

Derrida’s philosophical excursion into the roots of the metaphysics of logocentrism came to the conclusion that it arose out of a privileging of speech over writing. For Derrida the privileging of speech is related to a metaphysical privileging of presence. He argues that the ability to speak that delineated the human from other animals was elevated by metaphysics to the point where it suggested a God-like mastery. In Derridean terms presence implies a capacity of mind to wholly encompass and command. One can note here that in the Italian Renaissance the invention of perspective, which led to the capacity to capture the real (a technology continued by photography, film, television, hi-fi etc.), was understood as a reflection of the capacity of the eye of God to encompass (capture) all things. And in The Archaeology of Knowledge (Foucault 1972) Michel Foucault notes a similar grandiosity in the Enlightenment concept of the taxonomic table upon which nature can be laid out in order to be totally captured by the rational gaze. The problem that poststructuralists such as Derrida and Foucault point out is that human beings are patently not like God. We quite simply cannot grasp everything; much of our activity is trial and error: bricolage. And from an evolutionary point of view this modus operandi is distinctly advantageous when faced with a constantly changing environment.


Derrida counters the metaphysics of logocentrism with the counter-privileging of ‘writing’, which is not especially helpful because, although intended to include all forms of inscription, including the visual, it remains much too focused on verbal language. There is also the problem that writing, understood according to its common usage, is actually more prone to logocentrism because it is generally more formal than speech.

Speech is part of the body whereas writing is disembodied. Speech allows for much more play and bricolage; it is capable of much greater mutation than the more rule-bound condition of writing. In comparison to speech, writing is a dead thing that can be laid out for anatomical examination. And in purely physical terms the closest medium to speech would be music, not writing. The etymological root of ‘language’ is the French langue (from the Latin lingua), which means both ‘language’ and ‘tongue’. Like the hands, the tongue is as much a thing of the body as of the mind.

Analogical visualisation of the mapping of the body in the somatosensory cortex. This makes apparent the privileging of the hands and mouth in the neuro-somatic map.

Through speech language, and therefore a considerable part of thought, is implicated in the body.[9] From an evolutionary point of view the primary focus of logocentrism appears to be an attempt to deny the fact that we are animals. But the implications of evolutionism go further than linking us to the apes because they suggest that we have a great deal in common with all creatures (around 60% of fruit fly genes have human counterparts),[10] all the way down to the bacteria with which we have a symbiotic relationship, and ultimately DNA.

A Turning Point: Computational Linguistics

Noam Chomsky created a revolution in linguistics closely related to the field of computational linguistics. In creating a model of language based on Hilbertian mathematical logic Chomsky was instrumental in opening up a situation in which the metaphysics of Logos was ultimately confronted by the anti-Logos of probability and stochastic formations.


LEFT: A Chomskian arrangement of simple phrase-structure grammar into a logical tree. S = sentence, NP = noun phrase, VP = verb phrase, N = noun, V = verb, Art = article, Adj = adjective. RIGHT: Tree structure derived from a simple instance of an L-System computational biology algorithm that belongs to the class of formal ‘context-free’ grammars, as classified by the ‘Chomsky hierarchy’.
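An L-System tree of the kind pictured can be generated by a context-free rewriting system of just a few lines. The rule below is a hypothetical toy (‘F’ standing for a segment, ‘[’ and ‘]’ for a branch point), but it shows how parallel symbol replacement yields self-similar branching:

```python
def l_system(axiom, rules, generations):
    """Apply a context-free L-System: at each generation every symbol
    with a production rule is rewritten in parallel; symbols without
    a rule are copied through unchanged."""
    s = axiom
    for _ in range(generations):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s

# A classic branching rule: each segment sprouts two side branches.
rules = {"F": "F[+F]F[-F]F"}
tree = l_system("F", rules, 2)
```

Feeding the resulting string to a turtle-graphics interpreter draws the branching tree; the grammar itself never mentions ‘tree’ at all.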

Chomsky is no lover of chance, he is first and foremost a rationalist in the structuralist tradition that Derrida is so opposed to. But Derrida also shows in his deconstructions of key structuralists such as Ferdinand de Saussure and Claude Lévi-Strauss that within every apparently watertight structuralism there lies a self-referential ‘decentred’ ‘free play’ which one can argue has practical parallels in the key role played by stochastics in contemporary computational science (stochastic linguistics, evolutionary computing, neural networks).

The Mirror Mind

Since Chomsky first conceived of transformational generative grammar he has claimed that there is a so-called ‘language organ’. Actually, it would be more accurate to describe it as a ‘grammar organ’ and I will use that term from now on. Broca’s area would be an obvious candidate. Wernicke’s and Broca’s areas are the two major zones of the brain currently associated with language. The intimate relationship between speech and the body noted above is also evident in the fact that Broca’s area is not only responsible for language but also for motor functioning, and not just run-of-the-mill motor functioning. Broca’s is populated by so-called ‘mirror neurons’. Mirror neurons activate when we perform an action such as a gesture. This in itself is not unlike any other motor neuron; the difference lies in the fact that the same pattern fires when one sees someone else performing the same action.

One manifestation of the action of mirror neurons is evident in the way in which we unconsciously adopt other people’s bodily positions and gestures if we like them (if we are ‘in tune’ with them). This indicates a social role for Broca’s mirror neurons that fits in neatly with the obvious social role of language, and the evolutionary advantage that working together in a tribe or society brings with it.

In 1999 Mirella Dapretto and Susan Bookheimer, of the University of California at Los Angeles, used functional MRI (fMRI) to locate the area of the brain dealing with grammar and found it in Broca’s area which has long been associated with language.[11] Against this background a case for Broca’s being the Chomskian ‘grammar organ’ is made by the cognitive neuroscientist John R. Skoyles who suggests that ‘brain syntax can get externalised into linguistic syntax through mirror neurons … in the Broca's area’.[12],[13],[14]  One might also add that the internalisation of other people’s gestures could be described as a neurological corollary of the syntax of ‘body language’. There is also the important phenomenon of the tongue and the large somatosensory cortical mapping of the tongue which has obvious implications for the relationship between motor mirroring and speech.

Indeed, A. N. Meltzoff and M. K. Moore (Meltzoff 1989) have observed that newborns imitate tongue movements even though the baby cannot see its own tongue. And one can point to the fact that until the invention of mirrors no human being would have been able to see their own body in the way in which others see it. Yet pre-specular humans and animals are nevertheless very able to empathise and communicate with each other through mimetic body language (one remembers that the pioneering psychologist Theodor Lipps [1851-1914] argued that empathy was based on kinaesthetic ‘feeling-in’: Einfühlung). The social role of mirror neurons seems self-evident.[15] And, of course, oral mimicry plays a major role in the acquisition of language.



But this pre-specular mimesis is problematic because it raises the question as to how one can mimic another’s bodily movement if one cannot see one’s own body. The obvious solution to this problem is to cite the neurological mapping of the body in the somatosensory cortex. But this raises the question of how mirror neurons might communicate with it. Skoyles’ use of the phrase ‘brain syntax’ rather than ‘grammar organ’ is noteworthy in this respect because it consciously or unconsciously deconstructs the notion that we can localize grammar in the highly networked and distributed architecture of the brain.



Skoyles offers no solution to the question of how mirror neurons know what one might call ‘body syntax’ (indeed he does not ask the question) but he suggests that when the mirror neurons in Broca’s area mirror someone else’s behaviour there is a ‘syntactic’ function that resonates with the syntactic function of speech (which involves mimetic mirror-motor control of the tongue). He notes: ‘gestures can carry lexical information [body language] and this, through them, can be syntactically organised by the brain syntax [patterns in the Broca language network] underlying them and their movements.’[16] He adds a hard science dimension when he notes:

Such a link would explain why, the FOX2 [FOXP2] gene which impairs complex motor actions also impairs syntax (and why motor and syntax processes coexist in the Broca’s area). Further, it would explain why, the functional relationships between linguistic subjects, verbs and objects in certain respects parallels the functional one between motor effectors, their actions and their objects.[17]

What is thought provoking about Skoyles’ observations is the blurring of a boundary between ‘brain syntax’, ‘body syntax’ and grammar. In broader terms mirror neurons suggest a blurring of the boundary between ‘body language’ and verbal language. This underscores the profound relationship between language and the body. It also suggests that Broca’s is not simply a ‘grammar organ’. Indeed there probably isn’t a specific ‘grammar organ’, especially if ‘brain syntax’ is a neurological correlate of bodily movement which in turn has its own neurodynamic ‘mirror’. The fact that motor and linguistic functions are so embedded points to the difficulty in identifying specific neurological ‘organs’. It might be better to think in terms of massively parallel and distributed neural networks.

fMRI scans show that language functions are distributed

It could be the case that the fuzzy character of neurological ‘organs’ such as Broca’s is an enormous evolutionary advantage in the sense that new ‘organs’ such as grammars can develop via soft-wiring without the need for the painfully slow process of natural evolution. It should be noted here that the evolution of language is a major theoretical problem to which there is no definitive answer. In her review of Terrence Deacon’s The Symbolic Species (Deacon 1997) Amy Tabor notes:

With language … Nature went from absolute zero to full-blown complexity in a single leap: no other species, so far as we know, uses any symbolic system, however rudimentary, to communicate. The singularity of language is especially puzzling because language is so successful. With its help, a relatively weak, slow, and vulnerable species knocks several thousand species out of the Darwinian game every year. A chain of logic emerges which seems to command an anti-Darwinian [I would prefer to say anti-genetic] account of language: 1) valuable evolutionary adaptations are widespread, and complex ones are slow to develop; 2) language is a complex, unique, and valuable behavior, which developed quickly; therefore, 3) language must not be an evolutionary adaptation.

Could it be that the neural networks of the brain are sufficiently plastic that the ‘grammar organ’ could be socially programmed rather than hardwired? It is at this point that the mimetics of mirror neurons intersects with memetics (evolutionary sociology). It makes sense when one understands brain, body and society as highly interconnected and interactive.

TABULA RASA, The Artificial Mind


One of the key problems in cognitive science is to what extent the brain is structured like a body with organs (which recalls Deleuze and Guattari’s metaphor of a ‘body without organs’) and to what extent it is capable of dynamic construction and reconstruction. There is no definitive answer to this problem at present because there are very practical problems facing research in this area, not least the fact that when one places a brain on a slab for anatomical dissection it is in many respects a useless lump of goo.

EEG allows superficial access to cortical neuron firing, but it is only relatively recently that functional magnetic resonance imaging (fMRI) has been able to chart what is happening deep inside the living brain. Unlike EEG, however, fMRI is an indirect representation (deduction) of what is going on. Magnetic resonance imaging is able to tune into the iron-bearing haemoglobin in the blood, and the images of mental activity derived from this rest on the fact that when neurons fire they consume oxygen and require a compensating surge of oxygenated blood. Accordingly, concentrations of blood in areas of the brain suggest recent neuronal activity. This, combined with poor temporal response, means that what is represented in fMRI is more a diagram than a direct view (which can be said for a great deal of technically assisted perception).

Waves of Probability: The Essence of Mind

Which brings us to the virtual brain, the simulation of neural activity in the computer. Artificial neural networks (ANNs) are artificial, but they do have the very significant advantage that they are simulations of what might be termed the essential medium of mind: patterns of electrical (or electronic) activity that are capable of dynamic adaptation to unpredictable inputs.

Mathematical representation of neuronal activity in an artificial neural network

Artificial neural networks mimic the activity patterns that characterise brain functioning. There are limitations of course: ANN nodes can only switch on and off whereas in the brain ‘connections are not merely on or off, they possess varying strength that allows the influence of a given neuron on one of its neighbours to be either very strong, very weak (perhaps even no influence) or anything in between’.[18] However, the variance in the strength of connection between one neuron and another can be reproduced by dynamically assigning different weights to the connections between the nodes. This is referred to as ‘tuning’ the ANN and the process of tuning has significant resonance with the process of evolutionary ‘selection’.

Currently most ANNs require a human supervisor to train them (or bring them up). But unsupervised self-organizing ANNs are under development (e.g. Kohonen’s ‘self-organizing maps’,[19] and Grossberg’s ‘adaptive resonance theory’[20]) and this promises the possibility of overlaps and synthesis with research into self-organising genetic algorithms in the field of Evolutionary Computing.[21] Admittedly ANNs are primitive in comparison with the naturally evolved brain but the fact that an ANN can achieve pattern recognition via learning rather than depending on a pre-programmed set of rules does bring computational simulation that bit closer to the way in which natural neural networks function.

The significance of the ANN for this discussion lies in the fact that it is described as ‘a probabilistic model for information storage and organization in the brain’ [my emphasis].[22] The ANN begins with random firing, random waves of ‘neuronal’ activity. An input such as an image is provided and initially the output is random but by adjusting the weights of the network linkages the output is tuned to the input.
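This tuning process can be sketched with the simplest possible case, a single-unit perceptron (a toy illustration, not any specific historical network): the connection weights are repeatedly nudged in proportion to the output error until the output matches the target for each input pattern.

```python
def train_perceptron(samples, epochs=20, lr=0.1):
    """Tune connection weights until the unit's output matches each
    target: the classic perceptron learning rule, a toy analogue of
    'tuning' an artificial neural network to its inputs."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            out = 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0
            err = target - out  # zero once the output is tuned
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
            bias += lr * err
    return weights, bias

def predict(weights, bias, x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

# Two tiny 2x2 'letter' bitmaps (hypothetical): a vertical bar versus
# a horizontal bar, flattened to four inputs each.
samples = [([1, 0, 1, 0], 1), ([1, 1, 0, 0], 0)]
weights, bias = train_perceptron(samples)
```

Notably, the learned weights contain no picture of either ‘letter’, only a numerical correlate of the distinction between them.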

An especially interesting feature of this process is that the network does not create an internal electronic analogue of the external stimuli. If the input is the letter ‘A’ there is no electronic image of an ‘A’ inside the network. What there is instead is a stochastic correlate. Until relatively recently it was not possible to ‘see’ inside the ‘hidden layer(s)’ of an ANN, but the new unsupervised ‘self-organizing’ ANNs do allow access to a representation of the ‘synaptic’ weightings required to tune the output to the input and they are remarkable for their apparent lack of representation:


TOP: Input images, 5 × 7 element characters. MIDDLE: Graphical representation of the synaptic weights into the hidden nodes. BOTTOM: The synaptic weights from the hidden nodes.[23]

As in an EEG all we see is what might be termed an encoded pattern. So, in a metaphorical sense the hidden layer (a notion that evokes the large role played by unconscious processes in the brain) remains hidden.


A neuron growing over a microprocessor


Scholars inform us that the quintessentially abstract mathematical concept of zero was born out of the materialist needs of Arab commerce.[24] Similarly, at the turn of the millennium globalized commerce is pushing computational linguistics towards the solution of the problem of machine translation, and one of the most powerful tools appears to be chance: in the form of stochastic linguistics. As Machine Translation advances, even the traditional subject-object-verb (or is it SVO, or VSO?) grammar model is being superseded by more empirical structures based on statistical, materialistic analyses of corpuses of actual language (as opposed to the application of the labours of grammarians and lexicographers, whose products become a language in themselves, a metalanguage, a kind of metaphysical language).



The metaphysics of Logos is antithetical to any suggestion that language could be associated with mere chance. It would be akin to characterizing God as a dice player. And an intellectualist preference for complex structure is particularly evident in responses to stochastic approaches to language. For example, one group of commentators notes:



Interestingly, state of the art language models for speech recognition are based on a very crude linguistic model, namely conditioning the probability of a word on a small fixed number of preceding words. Despite many attempts to incorporate more sophisticated information into the models, the n-gram model remains the state of the art, used in virtually all speech [emphasis added][25]



The term ‘n-gram’ means a sequence of n words: two words form a bigram, three a trigram, and so on. Stochastic grammars work by analysing large corpuses of language usage and finding patterns in the conjunction of words; for example if I write or say ‘have a nice …’ then according to analyses of corpuses of contemporary English usage the next word could very probably be ‘day’. N-gram stochastic grammar is used in continuous speech recognition software with remarkable success, especially when it facilitates customisation by the user as s/he uses his/her specific vocabulary. In the face of rising commercial demand for effective products and the concomitant commercial advantage, voice recognition research largely jettisoned the formal dictionary and grammar approach due to the fact that speech, unlike writing, is not especially well differentiated. It is as much a motile as it is a structural phenomenon, and in this sense among its closest relatives would be music and brain waves.
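The ‘have a nice …’ example can be sketched directly. The corpus below is a hypothetical toy, but the mechanism, counting word pairs and predicting the most frequent successor, is the same one n-gram systems apply to corpuses of millions of words:

```python
from collections import Counter, defaultdict

def bigram_model(corpus):
    """Count word-pair (bigram) frequencies in a corpus and return,
    for each word, its most probable successor: the simplest possible
    n-gram stochastic 'grammar'."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for a, b in zip(words, words[1:]):
            counts[a][b] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

# A hypothetical toy corpus: 'day' follows 'nice' more often than 'trip'.
corpus = ["have a nice day", "have a nice day", "have a nice trip"]
model = bigram_model(corpus)
```

No rule of grammar is consulted anywhere; the model knows nothing but frequencies, yet its predictions track actual usage.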

Common Sense versus Artificial Intelligence

Earlier in this discussion I noted that science has initiated a stochastic revolution, evident in the number of scientific revolutions in which chance plays a central role: Darwinian Evolution, Quantum Mechanics, Chaos and Complexity Theory, Evolutionary Computation (genetic algorithms), and Artificial Life (cellular automata). The next landmark in this revolution might be the resolution of the ‘common sense’ problem that confronts computational linguistics and machine translation.



In the context of the Artificial Intelligence community the common sense problem alludes to the fact that by the time one is five years old one has acquired an enormous amount of information about the world that aids and is aided by a relatively complex articulation of language.[26] By the time one is an adult that corpus of knowledge is exponentially greater. Currently the average computer does not possess the common sense/general knowledge of a one year old (and that is an absurdly generous comparison because there is really no comparison). One way around this problem is to bypass the Herculean task of emulating human common sense and find another way of determining context. This is accomplished in stochastic linguistics by analysing corpuses of data not to attain understanding in a human-like fashion but to find patterns using statistical methods. Currently the most sophisticated technology appears to be Latent Semantic Analysis, of which more later.



To return to the intellectual snobbery regarding stochastic linguistics: I noted above the use of the term ‘crude’ in one group of commentators’ observations on stochastic linguistics; another criticises the application of stochastic methods in machine translation, noting: ‘The main problem of machine implemented algorithms has been the fact that they are almost always based on parametric Markov [n-gram] models of the English language. It seems to be a well understood fact that, as already argued by Chomsky 40 years ago, Markovian models are not adequate linguistic descriptions for natural languages.’[27]



This passage gives the reader the impression that Chomsky wrote off Markovian (n-gram) models decades ago. Maybe he did but the computational linguistics community did not. A prime instance of this concerns a large cutting-edge scientific project called the ‘semantic web’. When implemented (it will take some years) the semantic web will revolutionise the Internet by creating an interface not only between different human languages but also between human and machine languages. The two are intertwined in the realm of web oriented Artificial Intelligence research and development. What follows is a passage from the W3C (World Wide Web Consortium) the agency responsible for ‘the development of interoperable technologies (specifications, guidelines, software, and tools) to lead the Web to its full potential’. The passage is from a document defining a ‘syntax for representing n-gram (Markovian) stochastic grammars within the W3C Speech Interface Framework’:

The use of stochastic N-Gram models has a long and successful history in the research community and is now more and more effecting commercial systems, as the market asks for more robust and flexible solutions. The primary purpose of specifying a stochastic grammar format is to support large vocabulary and open vocabulary applications. In addition, stochastic grammars can be used to represent concepts or semantics. This specification defines the mechanism for combining stochastic and structured (in this case Context-Free) grammars as well as methods for combined semantic definitions.[28] [emphases added]

The reference to ‘context free’ (formal grammar) points to the artificial intelligence/computer science community’s awareness of the need for a formal system that can work around the common-sense problem.
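The bigram (2-gram) case conveys the basic Markovian idea: the model simply counts which words follow which, with no grasp of meaning at all. The sketch below is a minimal illustration of that principle, not the W3C grammar format itself, and the toy corpus is invented for the example.

```python
from collections import defaultdict

def train_bigrams(corpus):
    """Count word-pair frequencies and turn them into conditional probabilities."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        words = sentence.split()
        for prev, curr in zip(words, words[1:]):
            counts[prev][curr] += 1
    model = {}
    for prev, nexts in counts.items():
        total = sum(nexts.values())
        # P(curr | prev) = count(prev, curr) / count(prev, *)
        model[prev] = {w: n / total for w, n in nexts.items()}
    return model

corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = train_bigrams(corpus)
# Two of the three continuations of 'the' are 'cat', so P(cat | the) = 2/3.
print(model["the"])
```

The model knows nothing about cats or dogs; it only knows that after ‘the’ one word is twice as likely as another — which is precisely the ‘common-sense independence’ at issue.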

The Stochastic-Semantic Web

To call stochastic grammar ‘crude’ is to ignore the fact that the leading edge of web technology developing towards globalized information interflow is almost exclusively based on stochastic technologies. For example, the PageRank algorithm at the heart of Google rests on a statistical model of billions of webpages and trillions of indexed terms.[29] Apart from Markovian stochastic grammar, computer scientists at the turn of the millennium are also making increasing use of Bayesian probabilistic inference. Bayesian algorithms play a key role in ‘agent’ technology, which uses artificially intelligent programs to perform tasks such as the automated buying and selling of shares (e.g. tracker funds). Another evolving technology is Latent Semantic Analysis, which one company, Knowledge Analysis Technologies,[30] uses to mark essays. According to the company’s website (and the company has survived in business for more than three years now), LSA is able to match a student’s text against the analysis of a large corpus of information on the relevant subject and even point students to references they might need to improve their work. Like all stochastic technologies LSA is common-sense independent: it just recognizes patterns.[31] It does not tangle with the wicked webs of syntax and semantics. It sidesteps the babel of languages entirely by employing mathematical representation instead.[32] And if one reflects on this, perhaps using language to analyse language is in logical terms much the same as standing between two mirrors.
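The principle behind PageRank can itself be stated in a few lines: a page’s rank is the long-run probability that a ‘random surfer’, who mostly follows links but occasionally jumps to a random page, lands on it. The sketch below is a toy power-iteration version under simplifying assumptions (a three-page web, every page has at least one outlink), not Google’s production algorithm.

```python
import numpy as np

def pagerank(links, damping=0.85, iters=100):
    """Toy PageRank: power iteration on the damped link matrix."""
    n = len(links)
    # Column-stochastic transition matrix: each page splits its vote
    # evenly among the pages it links to.
    M = np.zeros((n, n))
    for page, outlinks in enumerate(links):
        for target in outlinks:
            M[target, page] = 1.0 / len(outlinks)
    rank = np.full(n, 1.0 / n)
    for _ in range(iters):
        # With probability `damping` follow a link; otherwise jump anywhere.
        rank = (1 - damping) / n + damping * M @ rank
    return rank

# Page 0 and page 1 link to each other; page 2 links only to page 0.
ranks = pagerank([[1], [0], [0]])
print(ranks.argmax())  # 0: page 0 collects the most rank
```

No page’s content is ever inspected: the ranking is a purely statistical property of the link structure, which is exactly the stochastic point being made above.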



The people who call stochastic grammar ‘crude’ may be missing the point that in the force-space of the computer, with its billions of iterations per second, simple algorithms become capable of dealing with great complexity. One can draw an analogy with the fact that the literally infinite beauty of the Mandelbrot Set is based on an extremely simple nonlinear equation: Z = Z² + C. Mandelbrot himself has stated that he was astounded that such a simple equation could produce such incredibly intricate complexity. A similar effect became evident in the mid-1980s when, in The Blind Watchmaker (1986), Richard Dawkins described the remarkable results he obtained from a simple computational model of genetic adaptation. He reported:



Nothing in my biologist’s intuition, nothing in my 20 years experience of programming computers, and nothing in my wildest dreams, prepared me for what actually emerged on the screen. I can't remember exactly when in the sequence it first began to dawn on me that an evolved resemblance to something like an insect was possible. With a wild surmise, I began to breed generation after generation, from whichever child looked most like an insect. My incredulity grew in parallel with the evolving resemblance . . . Admittedly they have eight legs like a spider, instead of six like an insect, but even so! I still cannot conceal from you my feeling of exultation as I first watched these exquisite creatures emerging before my eyes.
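The Mandelbrot equation mentioned above is just as easy to state in code: iterate z → z² + c from zero and watch whether the orbit escapes. The following is a minimal sketch; the set’s infinitely intricate boundary emerges from nothing more than the varying escape times of nearby points c.

```python
def mandelbrot_iterations(c, max_iter=100):
    """Iterate z -> z**2 + c from z = 0; return the step at which |z| first
    exceeds 2, or max_iter if the orbit never escapes (c is in the set)."""
    z = 0
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return n
    return max_iter

print(mandelbrot_iterations(0))     # 100: the origin never escapes
print(mandelbrot_iterations(1))     # 2: escapes almost immediately
print(mandelbrot_iterations(-1))    # 100: the orbit cycles 0, -1, 0, -1, ...
```

Colouring each point of the plane by its escape time is all it takes to render the familiar images: one line of arithmetic, iterated, yields unbounded visual complexity.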



Which brings me back to the time I was standing in front of the wall-sized pegboard in the Boston Science Museum watching the balls dropping from the hopper and bouncing randomly around only to end up in a perfect Bell Curve.
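That pegboard (a Galton board, or quincunx) is itself straightforward to simulate: each ball makes a 50/50 left-or-right choice at every row of pegs, and its final bin is simply its count of rightward bounces. The sketch below (ball and row counts are arbitrary choices for the example) shows the binomial counts piling up into the bell shape.

```python
import random

def galton_board(balls=10000, rows=12, seed=1):
    """Drop balls through `rows` rows of pegs; each peg deflects the ball
    left or right with equal probability. Returns the bin counts."""
    rng = random.Random(seed)
    bins = [0] * (rows + 1)
    for _ in range(balls):
        # A ball's final bin is the number of rightward bounces it made.
        position = sum(rng.random() < 0.5 for _ in range(rows))
        bins[position] += 1
    return bins

bins = galton_board()
# The central bins fill up most; the extremes stay nearly empty —
# the binomial distribution approximating the Gaussian bell curve.
print(bins)
```

Each individual ball is pure chance; the orderly curve exists only at the level of the ensemble — which is the ‘miracle’ the Boston pegboard makes visible.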



Corey, Michael Anthony. 1993. God and the New Cosmology: The Anthropic Design Argument. Lanham, Md.: Rowman & Littlefield.


Deacon, Terrence William. 1997. The Symbolic Species: The Co-evolution of Language and the Brain. 1st ed. New York: W.W. Norton.


Deleuze, Gilles. 1986. Nietzsche and Philosophy. London: Athlone Press.


Foucault, Michel. 1972. The Archaeology of Knowledge. 1st American ed. New York: Pantheon Books.


Harris, Errol E. 1992. Cosmos and Theos: Ethical and Theological Implications of the Anthropic Cosmological Principle. New Jersey; London: Humanities Press.


Meltzoff, A. and M. K. Moore. 1989. Imitation in newborn infants: exploring the range of gestures imitated and the underlying mechanisms. Developmental Psychology 25: 954–962.


Pearsall, Judy (ed.). 1998. The New Oxford Dictionary of English. Oxford: Clarendon Press.


Tillers, Imants. 2002. The Beginner’s Guide to Oil Painting [orig. 1973]. In The Postmodern Art of Imants Tillers: Appropriation en abyme, Graham Coulter-Smith, Appendix A. London and Southampton: Paul Holberton and the Fine Art Research Centre, Southampton Institute.



[1] The abbreviation of ‘anthropomorphic’ to ‘anthropic’ is derived from the existing term ‘anthropic principle’, defined by The New Oxford Dictionary of English as ‘The cosmological principle that theories of the universe are constrained by the necessity to allow human existence.’ (Pearsall 1998). As the analysis proceeds it will be shown that Tillers’ antianthropic position is opposed to the Judeo-Christian concept of God. The appropriateness of the term antianthropic to Tillers’ stance is supported by discussions of the anthropic principle that focus on the notion of God, such as God and the New Cosmology: The Anthropic Design Argument (Corey 1993) and Cosmos and Theos: Ethical and Theological Implications of the Anthropic Cosmological Principle (Harris 1992).


[2] (Deleuze 1986)


[3] Teilhard de Chardin is another philosopher inspired by Nietzsche, who produced the science fiction-like notion of a noosphere or metaconsciousness formed by all the minds of the inhabitants of Earth. One can also cite the Freudian unconscious, the Jungian collective unconscious, and the Marxian notion of history as a force that transcends individuals. All seem affected by the decentring of ‘man’ in the universe that began with Darwinian evolution. If Deleuze’s interpretation is correct then Nietzsche’s response to Darwin seems to be conflated with the creative evolutionism of Darwin’s predecessor, the French naturalist Jean-Baptiste Lamarck.


[4] The best way to see the Mandelbrot Set is online: just type ‘Mandelbrot Set’ into Google.


[5] This was accompanied by Niels Bohr’s surrealistic principle of ‘complementarity’, which acknowledges that in the quantum domain a phenomenon can occupy two mutually contradictory states (such as wave-like and particle-like behaviour), a principle that totally defies the commonsensical, classical logic of non-contradiction.


[6] But according to Linda Dalrymple Henderson, Malevich was interested in such theories.


[7] Imants Tillers points this out in ‘The Beginner’s Guide to Oil Painting’ (Tillers 2002).


[8] For an up-to-date account of M-Theory see the following webpage published by the Department of Applied Mathematics & Theoretical Physics (DAMTP), University of Cambridge. URL accessed March 2004:


[9] In evolutionary terms human language developed because the thorax became larger and the larynx lowered away from the mouth cavity, creating a larger acoustic passage than in our cousins the apes. This, combined with alterations in the tongue and mouth cavity, created an instrument capable of a great deal more sound variation than can be achieved by any other animal. The sharper pharyngeal angle that arose from the lowering of the larynx, however, also gave rise to the danger of death from choking. Obviously this serious evolutionary disadvantage was mitigated by the communicative and cognitive advantages afforded by speech.




[12] McGraw Hill sponsored website for John R. Skoyles and Dorion Sagan’s book Up from Dragons: The Evolution of Human Intelligence. URL accessed May 2004:




[14] Skoyles’ use of the phrase ‘brain syntax’ rather than ‘grammar organ’ is noteworthy because it consciously or unconsciously deconstructs the notion that we can localize grammar in the highly networked and distributed architecture of the brain.


[15] One can also point to the massive role played by technologies of representation in our culture: perspective in painting, photography and photomechanical reproduction, film, television.











[21] See for example Neural Network Using Genetic Algorithms by Omri Weisman and Ziv Pollack. There is also a company providing commercial applications of genetic algorithms for neural networks. They state: ‘There may be times when you are not sure how many hidden processing elements your neural network should have or which inputs are providing useful information. NeuroSolutions can adjust parameters and select inputs for you automatically. The technique it uses is genetic optimization, which uses a form of artificial intelligence known as a genetic algorithm to search for the best combination of parameters and inputs. It will train your neural network many times while adjusting the parameters and excluding inputs based on the laws of evolution (survival of the fittest). Once the genetic training is complete, you end up with the settings that produced the lowest error.’ URL accessed May 2004:






[24] Melvyn Bragg, 2004, ‘Zero’ Melvyn Bragg in conversation with Robert Kaplan, Ian Stewart, Lisa Jardine, In Our Time BBC Radio 4: Thursday 13 May.


[25] Eric Brill, Radu Florian, John C. Henderson and Lidia Mangu [Johns Hopkins University, Baltimore]. 1998. ‘Beyond n-grams: can linguistic sophistication improve language modeling?’ Proceedings of the 36th Conference of the Association for Computational Linguistics, Volume 1, Montreal, Quebec, Canada. Accessed March 2004 via the ACM Portal (subscription-access online library): URL


[26] Jean Aitchison, Kidspeak: How Children Acquire Language, Cambridge University Press. Reproduced on Columbia University’s Fathom open online learning resource: URL accessed May 2004






[29] In an account of the PageRank algorithm Stanford computer scientists note: ‘We determine analytically the modulus of the second eigenvalue for the web hyperlink matrix used by Google for computing PageRank. Specifically, we prove the following statement: “For any matrix $A=[cP + (1-c)E]^T$, where $P$ is an $n \times n$ row-stochastic matrix, $E$ is a strictly positive $n \times n$ rank-one row-stochastic matrix, and $0 \leq c \leq 1$, the second eigenvalue of $A$ has modulus $|\lambda_2| \leq c$. Furthermore, if $P$ has at least two irreducible closed subsets, the second eigenvalue $\lambda_2 = c$.” This statement has implications for the convergence rate of the standard PageRank algorithm as the web scales, for the stability of PageRank to perturbations to the link structure of the web, for the detection of Google spammers, and for the design of algorithms to speed up PageRank.’ [emphasis added] URL accessed May 2004:




[31] Steven Mills, Computer Science, Nottingham University, explains: ‘Knowledge Analysis Technologies' Intelligent Essay Assessor scores essay content using Latent Semantic Analysis to identify semantic similarities between human-graded exemplars and submitted text. IEA stems from research by Landauer et al. (1998) and is currently KAT's cornerstone product. As the process requires 1Gb+ RAM, it's a Web-based service giving evaluation and advice on the conceptual content of submitted essays - key features include relatively low unit cost, quick customised feedback, and plagiarism detection. Human/automated score correlations are quoted at 0.85 - 0.91. LSA represents documents and their word contents in a large two dimensional matrix semantic space. Using a matrix algebra technique known as Singular Value Decomposition (SVD), new relationships between words and documents are uncovered, and existing relationships are modified to more accurately represent their true significance. The words and their contexts are represented by a matrix. Each word being considered for the analysis is represented as a row of a matrix, and the columns of the matrix represent the sentences, paragraphs, or other subdivisions of the contexts in which the words occur. The cells contain the frequencies of the words in each context. The SVD is then applied to the matrix. SVD breaks the original matrix into three component matrices, that, when matrix multiplied, reproduce the original matrix. Using a reduced dimension of these three matrices in which the word-context associations can be represented, new relationships between words and contexts are induced when reconstructing a close approximation to the original matrix from the reduced dimension component SVD matrices. These new relationships are made manifest, whereas prior to the SVD, they were hidden or latent.’ URL accessed May 2004:
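The SVD step Mills describes can be sketched with a toy term-context matrix. The vocabulary and counts below are invented for illustration (real LSA operates on matrices with many thousands of rows and columns):

```python
import numpy as np

# Toy term-document matrix: rows are words, columns are contexts (documents),
# cells are word frequencies in each context.
terms = ["cat", "dog", "pet", "equation"]
X = np.array([
    [2, 0, 1, 0],   # 'cat'
    [0, 2, 1, 0],   # 'dog'
    [1, 1, 2, 0],   # 'pet'
    [0, 0, 0, 3],   # 'equation'
], dtype=float)

# SVD factors X into U * diag(s) * Vt; keeping only the k largest singular
# values reconstructs an approximation in which latent relationships surface.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
X_k = U[:, :k] * s[:k] @ Vt[:k, :]

def cosine(a, b):
    """Cosine similarity between two term vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# In the reduced space 'cat' and 'dog' are drawn close together (both are
# associated with 'pet' contexts) while remaining unrelated to 'equation'.
print(cosine(X_k[0], X_k[1]) > cosine(X_k[0], X_k[3]))  # True
```

Reconstructing the matrix from only the two largest singular values pulls ‘cat’ and ‘dog’ together while leaving both orthogonal to ‘equation’ — the relationships that were ‘hidden or latent’ before the decomposition, exactly as Mills describes.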


[32] And it is at this point that most if not all minds in the zone of the arts are truly challenged, as we gape at the sight of a professional mathematical equation. But the imagination of a mathematician is able to see things inside these arcane expressions. Beautiful minds that stared long enough at the mind-boggling equations for string theory began to see not only strings but tubes, sheets and membranes, which have become images of parallel interdimensional worlds that might be extremely close to us.