P-Y. Oudeyer. L'auto-organisation de la parole. University Paris VI, 2003.

Sony CSL authors: Pierre-Yves Oudeyer

Abstract

Human vocalization systems are complex. Vocalizations are digital and compositional: they are built by recombining units which are systematically reused. These units are present at several levels (e.g. the gestures, the coordination of gestures or phonemes, the morphemes). While the articulatory space, which defines the physically possible gestures, is continuous, each language discretizes this space in its own way. While the repertoires of these units vary greatly across the world's languages, there are also strong regularities (e.g. the high frequency of the vowel system /i, e, a, o, u/). The way these units are combined is also very particular: 1) not all sequences of phonemes are allowed in a given language; 2) the set of allowed sequences is organized into patterns. For example, the allowed phoneme combinations in Japanese can be summarized by the pattern "CV": a syllable must be composed of two slots, where only phonemes of a certain category, which we call consonants, can fill the first slot, and only phonemes of another category, which we call vowels, can fill the second. It is then natural to ask where this organization comes from. Two types of answers must be provided. The first type is a functional answer: it establishes the function of sound systems, and shows that human sound systems have an organization which makes them efficient for achieving this function. This has for example been proposed by Lindblom, who showed that statistical regularities of vowel systems could be predicted by searching for the vowel systems with quasi-optimal perceptual distinctiveness. This type of answer is necessary, but not sufficient: it does not explain how evolution (genetic or cultural) may have found these optimal structures. 
In particular, it is possible that "naive" Darwinian search with random variations is not efficient enough to find complex structures like those of speech: the search space is too big. This is why a second type of answer is necessary: we have to account for how natural selection may have found these structures. A possible way to do that is to show how self-organization can constrain the search space and help natural selection. This may be done by showing how a much simpler system can self-organize spontaneously and form the structure we want to explain. We present an artificial system of this type. We use the method of the artificial, which consists in building a society of formal agents. The scientific logic is abductive. This does not show directly which mechanisms gave rise to human speech, but it indicates what types of mechanisms are plausible candidates. Building this artificial system constrains the space of possible theories, in particular by providing examples of mechanisms which are sufficient, and examples of mechanisms which are not necessary. Technically, the artificial system is based on the coupling of generic sensory-motor systems which are initially randomly wired. These neural devices serve as the brains of artificial agents. We show how this system self-organizes so that agents develop vocalization systems which are shared by all members of a community, digital, compositional, and characterized by statistical regularities similar to those of human languages. We also show how these systems develop phonotactic rules and an organization into patterns of the allowed phoneme combinations. Each rule system is shared by the agents of a community, but differs across communities. The type of mechanism illustrated by this system appears to be a necessary complement to the functional explanation. Additionally, it does not require the explicit presence of a functional pressure for efficient communication. 
It does not require any social pressure, and the agents do not have any social capability. While present-day speech codes are obviously influenced by the communicative function for which they are used, the simplicity of the system allows us to propose a new hypothesis about the initial invention of shared vocalization systems: they might be self-organized collateral effects of certain cerebral structures which appeared in humans under the pressure of functions very different from communication. We develop this hypothesis by explaining what these cerebral structures are and what their initial function was.

Keywords: origins of speech, self-organization, evolution, forms, artificial systems, agents, phonetics, phonology, exaptation

BibTeX entry

@PHDTHESIS{oudeyer:03c,
  AUTHOR = "P-Y. Oudeyer",
  SCHOOL = "University Paris VI",
  TITLE = "L'auto-organisation de la parole",
  YEAR = "2003",
}