My research activity is focused on complexity science and its interdisciplinary applications. Over the past years I have been active in several fields, from granular media to complexity and information theory, and from social dynamics to sustainability. My very recent KREYON project (www.kreyon.net) concerned “Unfolding the dynamics of creativity, novelties and innovation”. In this context I am interested in understanding and modelling how the “new” enters our lives in its multiform instantiations: personal novelties or global innovation. To this end I’m blending, in a unitary interdisciplinary effort, three main activities: web-based experiments, data science and theoretical modeling.
Key to this endeavor is to grasp the structure and the dynamics of the “space of possibilities” in order to come up with a solid mathematical modelling of the way systems – biological, technological, social – explore the new at the individual and collective levels. Knowing how the space of possibilities is explored can help to conceive the next generation of Artificial Intelligence algorithms able to cope with the occurrence of novelties, bridging in this way the gap between inference and unanticipated events.
The origin and meaning of facial beauty represent a longstanding puzzle. Despite the profuse literature devoted to facial attractiveness, its very nature, its determinants and the nature of inter-person differences remain controversial issues. Here we tackle such questions proposing a novel experimental approach in which human subjects, instead of rating natural faces, are allowed to efficiently explore the face-space and “sculpt” their favorite variation of a reference facial image. The results reveal that different subjects prefer distinguishable regions of the face-space, highlighting the essential subjectivity of the phenomenon. The different sculpted facial vectors exhibit strong correlations among pairs of facial distances, characterising the underlying universality and complexity of the cognitive processes, and the relative relevance and robustness of the different facial distances.
Creative industries constantly strive for fame and popularity. Though highly desirable, popularity is not the only achievement artistic creations might ever acquire. Leaving a longstanding mark in the global production and influencing future works is an even more important achievement, usually acknowledged by experts and scholars. ‘Significant’ or ‘influential’ works are not always well known to the public or have sometimes been long forgotten by the vast majority. In this paper, we focus on the duality between what is successful and what is significant in the musical context. To this end, we consider a user-generated set of tags collected through an online music platform, whose evolving co-occurrence network mirrors the growing conceptual space underlying music production. We define a set of general metrics aiming at characterizing music albums throughout history, and their relationships with the overall musical production. We show how these metrics allow us to classify albums according to their current popularity or their inclusion in expert-made lists of important albums. In this way, we provide the scientific community and the public at large with quantitative tools to tell apart popular albums from culturally or aesthetically relevant artworks. The generality of the methodology presented here lends itself to use in all those fields where innovation and creativity are in play.
The quest for information is one of the most common activities of human beings. Despite the impressive progress of search engines, finding the needed piece of information can still be very tough, as can acquiring specific competences and knowledge by shaping and following the proper learning paths. Indeed, the need to find sensible paths in information networks is one of the biggest challenges of our societies and, to effectively address it, it is important to investigate the strategies adopted by human users to cope with the cognitive bottleneck of finding their way in a growing sea of information. Here we focus on the case of Wikipedia and investigate a recently released dataset about users’ clicks on the English Wikipedia, namely the English Wikipedia Clickstream. We perform a semantically charged analysis to uncover the general patterns followed by information seekers in the multi-dimensional space of Wikipedia topics/categories. We discover the existence of well-defined strategies in which users tend to start from very general, i.e., semantically broad, pages and progressively narrow down the scope of their navigation, while keeping a growing semantic coherence. This is unlike strategies associated with tasks with predefined search goals, namely the case of the Wikispeedia game. In this case users first move from the ‘particular’ to the ‘universal’ before focusing down again to the required target. The clear picture offered here represents a very important stepping stone towards a better design of information networks and recommendation strategies, as well as the construction of radically new learning paths.
We introduce a Maximum Entropy model able to capture the statistics of melodies in music. The model can be used to generate new melodies that emulate the style of a given musical corpus. Instead of using the n–body interactions of (n−1)–order Markov models, traditionally used in automatic music generation, we use a k-nearest neighbour model with pairwise interactions only. In that way, we keep the number of parameters low and avoid over-fitting problems typical of Markov models. We show that long-range musical phrases don’t need to be explicitly enforced using high-order Markov interactions, but can instead emerge from multiple, competing, pairwise interactions. We validate our Maximum Entropy model by contrasting how much the generated sequences capture the style of the original corpus without plagiarizing it. To this end we use a data-compression approach to discriminate the levels of borrowing and innovation featured by the artificial sequences. Our modelling scheme outperforms both fixed-order and variable-order Markov models. This shows that, despite being based only on pairwise interactions, our scheme opens the possibility to generate musically sensible alterations of the original phrases, providing a way to generate innovation.
The reconstruction of phylogenies of cultural artefacts represents an open problem that mixes theoretical and computational challenges. Existing benchmarks rely on simulated phylogenies, where hypotheses on the underlying evolutionary mechanisms are unavoidable, or on real-data phylogenies, for which no true evolutionary history is known. Here we introduce a web-based game, Copystree, where users create phylogenies of manuscripts, through successive copying actions, in a fully monitored setup. While players enjoy the experience, Copystree allows us to build artificial phylogenies whose evolutionary processes do not obey any pre-defined theoretical mechanisms, being generated instead with the unpredictability of human creativity. We present the analysis of the data gathered during the first set of experiments and use the artificial phylogenies gathered for a first test of existing phylogenetic algorithms.
Rules are an efficient feature of natural languages which allow speakers to use a finite set of instructions to generate a virtually infinite set of utterances. Yet, for many regular rules, there are irregular exceptions. There has been lively debate in cognitive science about how individual learners acquire rules and exceptions; for example, how they learn the past tense of preach is preached, but for teach it is taught. However, for most population or language-level models of language structure, particularly from the perspective of language evolution, the goal has generally been to examine how languages evolve stable structure, and neglects the fact that in many cases, languages exhibit exceptions to structural rules. We examine the dynamics of regularity and irregularity across a population of interacting agents to investigate how, for example, the irregular teach coexists beside the regular preach in a dynamic language system. Models show that in the absence of individual biases towards either regularity or irregularity, the outcome of a system is determined entirely by the initial condition. On the other hand, in the presence of individual biases, rule systems exhibit frequency dependent patterns in regularity reminiscent of patterns found in natural language. We implement individual biases towards regularity in two ways: through child agents who have a preference to generalise using the regular form, and through a memory constraint wherein an agent can only remember an irregular form for a finite time period. We provide theoretical arguments for the prediction of a critical frequency below which irregularity cannot persist in terms of the duration of the finite time period which constrains agent memory. Further, within our framework we also find stable irregularity, arguably a feature of most natural languages not accounted for in many other cultural models of language structure.
The complex organization of syntax in hierarchical structures is one of the core design features of human language. Duality of patterning refers for instance to the organization of the meaningful elements in a language at two distinct levels: a combinatorial level where meaningless forms are combined into meaningful forms and a compositional level where meaningful forms are composed into larger lexical units. The question remains wide open regarding how such a structure could have emerged. Furthermore a clear mathematical framework to quantify this phenomenon is still lacking. The aim of this paper is that of addressing these two aspects in a self-consistent way. First, we introduce suitable measures to quantify the level of combinatoriality and compositionality in a language, and present a framework to estimate these observables in human natural languages. Second, we show that the theoretical predictions of a multi-agents modeling scheme, namely the Blending Game, are in surprisingly good agreement with empirical data. In the Blending Game a population of individuals plays language games aiming at success in communication. It is remarkable that the two sides of duality of patterning emerge simultaneously as a consequence of a pure cultural dynamics in a simulated environment that contains meaningful relations, provided a simple constraint on message transmission fidelity is also considered.
The understanding and the characterisation of individual mobility patterns in urban environments is important in order to improve liveability and planning of big cities. In relatively recent times, the availability of data regarding human movements has fostered the emergence of a new branch of social studies, with the aim to unveil and study those patterns thanks to data collected by means of geolocalisation technologies. In this paper we analyse a large dataset of GPS tracks of cars collected in Rome (Italy). Dividing the drivers into classes according to the number of trips they perform in a day, we show that the sequence of distances travelled between consecutive stops follows a precise pattern: the shortest trips are performed in the middle of the sequence, while the longest occur at the beginning and at the end, when drivers head back home. We show that this behaviour is consistent with the idea of an optimisation process in which the total travel time is minimised, under the effect of spatial constraints such that the starting point is on the border of the space in which the dynamics takes place.
We present a numerical model for the evolution of pathogens organised in discrete antigenic clusters, where individuals in the same cluster have the same fitness. The fitness of each cluster is a decreasing function of the total number of cluster members that have appeared in the population. Cluster transition is modelled with inclusion and exclusion of dynamical epistatic effects. In both cases we observe a continuous transition, driven by the mutation rate, from a dynamics with single clusters alternating in time to the coexistence of many clusters in the population. The transition between the two regimes is investigated in terms of the key parameters of the model. We find that the location and the scaling of this transition can be explained in terms of the time of first appearance of a new cluster in the population. The presence of dynamical epistatic effects results in a shift of the value of the mutation rate where the transition occurs.
It is common opinion that many innovations are triggered by serendipity, a notion associated with fortuitous events leading to unintended consequences. One might argue that this interpretation is due to the poor understanding of the dynamics of innovations. Very little is known, in fact, about how innovations proceed and sample the space of potential novelties. This space is usually referred to as the adjacent possible, a concept originally introduced in the study of biological systems to indicate the set of possibilities that are one step away from what actually exists. In this paper we focus on the problem of defining the adjacent possible space, and analyzing its dynamics, for a particular system, namely the cultural system of the network of movies. To this end we synthesized the graph emerging from the Internet Movie Database (IMDb) and looked at the static and dynamical properties of this network. We deal, in particular, with the subtle mechanism of the adjacent possible by measuring the expansion and the coverage of this elusive space during the global evolution of the system. Finally, we introduce the concept of adjacent possibilities at the level of a single node and try to elucidate its nature by looking at the correlations with topological and user annotation metrics.
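The notion of "one step away from what actually exists" can be made concrete on a graph. The sketch below is only an illustration under stated assumptions, not the definition used in the paper: it takes the adjacent possible to be the set of unvisited neighbours of the visited nodes, and the names `adjacent_possible`, `explore` and `TOY_GRAPH` are hypothetical.

```python
# Hypothetical toy graph: node -> list of neighbouring nodes.
TOY_GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a"],
    "d": ["b", "e"],
    "e": ["d"],
}


def adjacent_possible(graph, visited):
    """Return the unvisited nodes that are one link away from a visited node."""
    frontier = set()
    for node in visited:
        for neighbour in graph.get(node, ()):
            if neighbour not in visited:
                frontier.add(neighbour)
    return frontier


def explore(graph, order):
    """Visit nodes in the given order and record the frontier size after each visit."""
    visited = set()
    sizes = []
    for node in order:
        visited.add(node)
        sizes.append(len(adjacent_possible(graph, visited)))
    return sizes


if __name__ == "__main__":
    print(adjacent_possible(TOY_GRAPH, {"a"}))
    print(explore(TOY_GRAPH, ["a", "b", "d"]))
```

Tracking the frontier size along an exploration order is one simple way to follow the expansion of the space over time; coverage could be estimated analogously as the fraction of past frontier nodes that were eventually visited.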
The dynamics of political votes has been widely studied, both for its practical interest and as a paradigm of the dynamics of mass opinions and collective phenomena, where theoretical predictions can be easily tested. However, the vote outcome is often influenced by many factors beyond the bare opinion on the candidate, and in most cases it is bound to a single preference. The voter perception of the political space is still to be elucidated. We here propose a web experiment (laPENSOcosì) where we explicitly investigate participants’ opinions on political entities (parties, coalitions, individual candidates) of the Italian political scene. As a main result, we show that the political perception follows a Weber-Fechner-like law, i.e., when ranking political entities according to the preferences expressed by the user, the perceived distance of the user from a given entity scales as the logarithm of its rank.
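Written as a formula, the logarithmic scaling stated above reads as follows. The symbols are assumptions introduced for illustration: $d(r)$ is the perceived distance of a user from the entity ranked $r$-th in that user's preferences, and $a$, $b$ are fit constants whose values are not given here.

```latex
d(r) \simeq a + b \, \ln r , \qquad r = 1, 2, \dots
```

Under this form, moving from rank 1 to rank 2 changes the perceived distance by the same amount as moving from rank 5 to rank 10, since both steps double the rank.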
The emergence of novelties and their rise and fall in popularity is a ubiquitous phenomenon in human activities. The coexistence of always popular milestones with novel and sometimes ephemeral trends pervades technological, scientific and artistic production. By introducing suitable statistical measures, we demonstrate that different systems of human activities, i.e. the creation of hashtags in Twitter, the interaction with online program code repositories, the creation of texts and the listening to songs on an on-line platform, exhibit surprisingly similar properties. We then introduce a general framework to explain those regularities. We propose a simple mathematical model based on the expansion into the adjacent possible, which has been proven to be a very general and powerful mechanism able to explain many of the statistical patterns emerging in innovation dynamics, to which we add two crucial elements. On the one hand we quantify the idea that, while exploring a conceptual or physical space, inertia exists towards known, already discovered elements. On the other hand, we highlight the role of the collective dynamics – where many users interact, in a direct or indirect way – in the emergence and diffusion of novelties and innovations.
Rules are an efficient feature of natural languages which allow speakers to use a finite set of instructions to generate a virtually infinite set of utterances. Yet, for many regular rules, there are irregular exceptions. There has been lively debate in cognitive science about how individual learners acquire rules and exceptions; for example, how they learn the past tense of preach is preached, but for teach it is taught. In this paper, we take a different perspective, examining the dynamics of regularity and irregularity across a population of interacting agents to investigate how inflectional rules are applied to verbs. We show that in the absence of biases towards either regularity or irregularity, the outcome is determined by the initial condition, irrespective of the frequency of usage of the given lemma. On the other hand, in presence of biases, rule systems exhibit frequency dependent patterns in regularity reminiscent of patterns in natural language corpora. We examine the case where individuals are biased towards linguistic regularity in two ways: either as child learners, or through a memory constraint wherein irregular forms can only be remembered by an individual agent for a finite time period. We provide theoretical arguments for the prediction of a critical frequency below which irregularity cannot persist in terms of the duration of the finite time period which constrains agent memory.
Studies in literature and narrative have begun to argue more forcefully for considering human evolution as central to understanding stories and storytelling more generally (Sugiyama, 2001; Hernadi, 2002). However, empirical studies in language evolution have focused primarily on language structure or the language faculty, leaving the evolution of stories largely unexplored (although see Von Heiseler, 2014). Stories are unique products of human culture enabled principally by human language. Given this, the dynamics of creativity in stories, and the traits which make successful stories, are of crucial interest to understanding the evolution of language in the context of human evolution more broadly. The current work aims to illuminate how stories emerge, evolve, and change in the context of a collaborative cultural effort. We present results from a novel experimental paradigm centered around a story game where players write short continuations (between 60 and 120 characters) of existing stories. These continuations then become open to other players to continue in turn. Stories are subject to player selection, allowing for variation and speciation of the resulting narratives, and evolve as a result of collaborative effort between players. The game starts with a seed of over 60 potential stories, and players choose which stories to continue, providing a player-driven story selection mechanism. In this way, stories which are creative, intriguing, and open ended spawn more stories, and eventually lead to longer story paths as play continues. The game also introduces further limitations by constraining a player’s view of the story path: players have access only to a story and its parent, meaning knowledge of the existing narrative is limited. We present data from hundreds of players and stories, creating large story trees which explore the space of different possible narratives which grow out of a confined set of starting points.
This data allows us to investigate several aspects of the growing story trees to illuminate not only what makes a story successful, but how creative stories trigger new stories, and what makes individual storytellers successful. Given the selection mechanism central to game play, we identify the most successful stories by their number of offspring. Particularly successful storytellers emerge measured both by how many children their stories have spawned, and also how long their story path extends. We also show that coherent stories often emerge, despite the fact that they are authored by several different players, and any given player only sees a limited snapshot of the story path. We contextualise the results of the game and connect it to language evolution in two ways. First, we look for detectable triggers of innovation and creativity within the story trees, and identify these as expanding the adjacent possible (e.g., new adaptations open the space of other possible adaptations in the future; Tria, Loreto, Servedio, & Strogatz, 2014). We argue that this concept can be extended to stories, using evidence from the game bolstered by evidence from more traditional literature (the Gutenberg Corpus). Second, we frame the results in terms of recurring themes found in storytelling cross-culturally (Tehrani, 2013). We suggest that the most successful triggers of innovation in stories combine original novelty and a firm grounding in existing recurring story frameworks in human culture. This indicates that much like other cultural and biological systems, stories are subject to competing pressures for stability and conservation on the one hand, and innovation and novelty on the other.
Creole languages offer an invaluable opportunity to study the processes leading to the emergence and evolution of Language, thanks to the short – typically a few generations – and reasonably well defined time-scales involved in their emergence. Another well-known case of a very fast emergence of a Language, though referring to a much smaller population size and different ecological conditions, is that of the Nicaraguan Sign Language. What these two phenomena have in common is that in both cases what is emerging is a contact language, i.e., a language born out of the non-trivial interaction of two (or more) parent languages. This is a typical case of what is known in biology as horizontal transmission. In many well-documented cases, creoles emerged in large segregated sugarcane or rice plantations on which the slave labourers were the overwhelming majority. Lacking a common substrate language, slaves were naturally brought to shift to the economically and politically dominant European language (often referred to as the lexifier) to bootstrap an effective communication system among themselves. Here, we focus on the emergence of creole languages that originated in the contacts of European colonists and slaves during the 17th and 18th centuries in exogenous plantation colonies, especially of the Atlantic and Indian Oceans, where detailed census data are available. Those for several US states can be found at http://www.census.gov/history, while those for Central America and the Caribbean can be found at http://www.jamaicanfamilysearch.com/Samples/1790al11.htm. Without entering into the details of creole formation at a fine-grained linguistic level, we aim at uncovering some of the general mechanisms that determine the emergence of contact languages, and that successfully apply to the case of creole formation.
Air Transportation represents a very interesting example of a complex techno-social system whose importance has considerably grown in time and whose management requires a careful understanding of the subtle interplay between technological infrastructure and human behavior. Despite the competition with other transportation systems, a growth of air traffic is still foreseen in Europe for the next years. The increase of traffic load could bring the current Air Traffic Network above its capacity limits so that safety standards and performances might not be guaranteed anymore. Lacking the possibility of a direct investigation of this scenario, we resort to computer simulations in order to quantify the disruptive potential of an increase in traffic load. To this end we model the Air Transportation system as a complex dynamical network of flights controlled by humans who have to solve potentially dangerous conflicts by redirecting aircraft trajectories. The model is driven and validated through historical data of flight schedules in a European national airspace. While correctly reproducing actual statistics of the Air Transportation system, e.g., the distribution of delays, the model allows for theoretical predictions. Upon an increase of the traffic load injected in the system, the model predicts a transition from a phase in which all conflicts can be successfully resolved, to a phase in which many conflicts cannot be resolved anymore. We highlight how the current flight density of the Air Transportation system is well below the transition, provided that controllers make use of a special re-routing procedure. While the congestion transition displays a universal scaling behavior, its threshold depends on the conflict solving strategy adopted. Finally, the generality of the modeling scheme introduced makes it a flexible general tool to simulate and control Air Transportation systems in realistic and synthetic scenarios.
Each sphere of knowledge and information could be depicted as a complex mesh of correlated items. By properly exploiting these connections, innovative and more efficient navigation strategies could be defined, possibly leading to a faster learning process and an enduring retention of information. In this work we investigate how the topological structure embedding the items to be learned can affect the efficiency of the learning dynamics. To this end we introduce a general class of algorithms that simulate the exploration of knowledge/information networks standing on well-established findings on educational scheduling, namely the spacing and lag effects. While constructing their learning schedules, individuals move along connections, periodically revisiting some concepts, and sometimes jumping on very distant ones. In order to investigate the effect of networked information structures on the proposed learning dynamics we focused both on synthetic and real-world graphs such as subsections of Wikipedia and word-association graphs. We highlight the existence of optimal topological structures for the simulated learning dynamics whose efficiency is affected by the balance between hubs and the least connected items. Interestingly, the real-world graphs we considered lead naturally to almost optimal learning performances.
We introduce a model for music generation where melodies are seen as a network of interacting notes. Starting from the principle of maximum entropy we assign to this network a probability distribution, which is learned from an existing musical corpus. We use this model to generate novel musical sequences that mimic the style of the corpus. Our main result is that this model can reproduce high-order patterns despite having a polynomial sample complexity. This is in contrast with the more traditionally used Markov models that have an exponential sample complexity.
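For orientation, a maximum entropy distribution constrained by single-note and pairwise statistics takes a generic exponential form. The expression below is a sketch of that generic form, not necessarily the exact parametrisation of the paper: the fields $h$, the couplings $J_k$, the maximum interaction distance $K$, the sequence length $N$ and the sign convention in the exponent are all assumptions made for illustration.

```latex
P(s_1, \dots, s_N) \;=\; \frac{1}{Z}
\exp\!\Big( \sum_{i=1}^{N} h(s_i)
  \;+\; \sum_{k=1}^{K} \sum_{i=1}^{N-k} J_k(s_i, s_{i+k}) \Big)
```

With an alphabet of $q$ notes, such a model has on the order of $q + K q^{2}$ parameters, whereas a Markov model of order $n-1$ needs on the order of $q^{n}$ transition probabilities; this is the sense in which the first count is polynomial and the second exponential.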
Contact languages are born out of the non-trivial interaction of two (or more) parent languages. Nowadays, the enhanced possibility of mobility and communication allows for a strong mixing of languages and cultures, thus raising the issue of whether there are any pure languages or cultures that are unaffected by contact with others. As with bacteria or viruses in biological evolution, the evolution of languages is marked by horizontal transmission; but to date no reliable quantitative tools to investigate these phenomena have been available. An interesting and well documented example of contact language is the emergence of creole languages, which originated in the contacts of European colonists and slaves during the 17th and 18th centuries in exogenous plantation colonies of especially the Atlantic and Indian Ocean. Here, we focus on the emergence of creole languages to demonstrate a dynamical process that mimics the process of creole formation in American and Caribbean plantation ecologies. Inspired by the Naming Game (NG), our modeling scheme incorporates demographic information about the colonial population in the framework of a non-trivial interaction network including three populations: Europeans, Mulattos/Creoles, and Bozal slaves. We show how this sole information makes it possible to discriminate territories that produced modern creoles from those that did not, with a surprising accuracy. The generality of our approach provides valuable insights for further studies on the emergence of languages in contact ecologies as well as to test specific hypotheses about the peopling and the population structures of the relevant territories. We submit that these tools could be relevant to addressing problems related to contact phenomena in many cultural domains: e.g., emergence of dialects, language competition and hybridization, globalization phenomena.
Empirical evidence shows that the rate of irregular usage of English verbs exhibits discontinuity as a function of their frequency: the most frequent verbs tend to be totally irregular. We aim to qualitatively understand the origin of this feature by studying simple agent-based models of language dynamics, where each agent adopts an inflectional state for a verb and may change it upon interaction with other agents. At the same time, agents are replaced at some rate by new agents adopting the regular form. In models with only two inflectional states (regular and irregular), we observe that either all verbs regularise irrespective of their frequency, or a continuous transition occurs between a low-frequency state, where the lemma becomes fully regular, and a high-frequency one, where both forms coexist. Introducing a third (mixed) state, wherein agents may use either form, we find that a third, qualitatively different behaviour may emerge, namely, a discontinuous transition in frequency. We introduce and solve analytically a very general class of three-state models that allows us to fully understand these behaviours in a unified framework. Realistic sets of interaction rules, including the well-known naming game (NG) model, result in a discontinuous transition, in agreement with recent empirical findings. We also point out that the distinction between speaker and hearer in the interaction has no effect on the collective behaviour. The results for the general three-state model, although discussed in terms of language dynamics, are widely applicable.
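The kind of three-state dynamics described above can be illustrated with a naming-game-like simulation. This is a minimal sketch under stated assumptions and not the general class of models solved in the paper: the interaction rule is the standard naming-game update restricted to two forms, replacement is implemented as a per-step probability `replace_prob` (a hypothetical parameter standing in for the inverse of usage frequency), and all agents start in the irregular state.

```python
import random

REG, IRR = "regular", "irregular"


def simulate(n_agents=100, steps=10000, replace_prob=0.01, seed=0):
    """Run a two-form naming-game-like dynamics with agent replacement.

    Each agent holds a set of forms: {REG}, {IRR}, or the mixed state {REG, IRR}.
    """
    rng = random.Random(seed)
    agents = [{IRR} for _ in range(n_agents)]  # assumed initial condition
    for _ in range(steps):
        if rng.random() < replace_prob:
            # A random agent is replaced by a newcomer using the regular form.
            agents[rng.randrange(n_agents)] = {REG}
            continue
        speaker, hearer = rng.sample(range(n_agents), 2)
        # sorted() makes the choice independent of set iteration order.
        word = rng.choice(sorted(agents[speaker]))
        if word in agents[hearer]:
            # Success: both agents collapse onto the transmitted form.
            agents[speaker] = {word}
            agents[hearer] = {word}
        else:
            # Failure: the hearer adds the form and enters the mixed state.
            agents[hearer] = agents[hearer] | {word}
    return agents


def irregular_fraction(agents):
    """Fraction of agents that use only the irregular form."""
    return sum(1 for a in agents if a == {IRR}) / len(agents)


if __name__ == "__main__":
    for p in (0.001, 0.01, 0.1):
        print(p, irregular_fraction(simulate(replace_prob=p)))
```

Scanning `replace_prob` and plotting `irregular_fraction` is one way to look for a transition in this toy setting; whether it is continuous or discontinuous depends on the interaction rules, which is the question the general three-state analysis addresses.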
Several recent theories have suggested that an increase in the number of non-native speakers in a language can lead to changes in morphological rules. We examine this experimentally by contrasting the performance of native and non-native English speakers in a simple Wug-task, showing that non-native speakers are significantly more likely to provide non -ed (i.e., irregular) past-tense forms for novel verbs than native speakers. Both groups are sensitive to sound similarities between new words and existing words (i.e., are more likely to provide irregular forms for novel words which sound similar to existing irregulars). Among both natives and non-natives, irregularizations are non-random; that is, rather than presenting as truly irregular inflectional strategies, they follow identifiable sub-rules present in the highly frequent set of irregular English verbs. Our results shed new light on how native and non-native learners can affect language structure.
The comprehension of vehicular traffic in urban environments is crucial to achieve good management of the complex processes arising from people’s collective motion. Even allowing for the great complexity of human beings, human behavior turns out to be subject to strong constraints – physical, environmental, social, economic – that induce the emergence of common patterns. The observation and understanding of those patterns is key to setting up effective strategies to optimize the quality of life in cities while not frustrating the natural need for mobility. In this paper we focus on vehicular mobility with the aim to reveal the underlying patterns and uncover the human strategies determining them. To this end we analyze a large dataset of GPS vehicle tracks collected in the Rome (Italy) district over one month. We demonstrate the existence of a local optimization of travel times that vehicle drivers perform while choosing their journey. This finding is mirrored by two additional important facts, i.e., the observation that the average vehicle velocity increases with the travel length and the emergence of a universal scaling law for the distribution of travel times at fixed traveled length. A simple modeling scheme confirms this scenario, opening the way to further predictions.
Language universals have long been attributed to an innate Universal Grammar. An alternative explanation states that linguistic universals emerged independently in every language in response to shared cognitive or perceptual biases. A computational model has recently shown how this could be the case, focusing on the paradigmatic example of the universal properties of colour naming patterns, and producing results in quantitative agreement with the experimental data. Here we investigate the role of an individual perceptual bias in the framework of the model. We study how, and to what extent, the structure of the bias influences the corresponding linguistic universal patterns. We show that the cultural history of a group of speakers introduces population-specific constraints that act against the pressure for uniformity arising from the individual bias, and we clarify the interplay between these two forces.
The introduction of a new SESAR scenario in the European Airspace will impact the functioning and the performance of the current Air Traffic Management (ATM) System. Understanding the features and the limits of the current system could be crucial for improving and designing the structure of the future ATM. In this paper we present some results of the “Assessment of Critical Delay Patterns and Avalanche Dynamics” PhD project from the ComplexWorld Network. During this project we developed a model of Air Traffic Control (ATC), based on Complex Network theory, capable of reproducing the features of the real ATC in three European National Airspaces. We then developed an optimization algorithm based on “Extremal Optimization” in order to build efficient and globally optimized planned trajectories. The ATC model is applied to study the efficiency of these new planned trajectories when subject to external perturbations and to compare them with the current situation.
Novelties are a familiar part of daily life. They are also fundamental to the evolution of biological systems, human society, and technology. By opening new possibilities, one novelty can pave the way for others in a process that Kauffman has called “expanding the adjacent possible”. The dynamics of correlated novelties, however, have yet to be quantified empirically or modeled mathematically. Here we propose a simple mathematical model that mimics the process of exploring a physical, biological, or conceptual space that enlarges whenever a novelty occurs. The model, a generalization of Polya’s urn, predicts statistical laws for the rate at which novelties happen (Heaps’ law) and for the probability distribution on the space explored (Zipf’s law), as well as signatures of the process by which one novelty sets the stage for another. We test these predictions on four data sets of human activity: the edit events of Wikipedia pages, the emergence of tags in annotation systems, the sequence of words in texts, and listening to new songs in online music catalogues. By quantifying the dynamics of correlated novelties, our results provide a starting point for a deeper understanding of the adjacent possible and its role in biological, cultural, and technological evolution.
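As a rough illustration of the kind of urn process described above, the following Python sketch simulates an urn that enlarges whenever a novelty occurs. The abstract does not state the update rule, so the details here are assumptions made for illustration: the names `rho` (number of extra copies of the drawn colour added as reinforcement), `nu` (a novelty adds `nu + 1` balls of brand-new colours) and `n0` (initial number of colours) are hypothetical and not taken from the text.

```python
import random

def urn_with_triggering(steps, rho=4, nu=2, n0=3, seed=0):
    """Simulate a Polya-like urn that grows when a novelty is drawn.

    Assumed rule (illustrative, not taken from the abstract):
      - draw a ball uniformly at random and put it back;
      - add `rho` extra copies of its colour (reinforcement);
      - if the colour was never drawn before, add `nu + 1` balls
        of brand-new colours (the space of possibilities enlarges).
    Returns the sequence of drawn colours and D(t), the number of
    distinct colours drawn up to each step.
    """
    rng = random.Random(seed)
    urn = list(range(n0))        # one ball for each initial colour
    next_colour = n0             # label for the next brand-new colour
    seen = set()
    sequence, distinct = [], []
    for _ in range(steps):
        colour = rng.choice(urn)           # the drawn ball stays in the urn
        sequence.append(colour)
        urn.extend([colour] * rho)         # reinforcement step
        if colour not in seen:             # a novelty occurred
            seen.add(colour)
            urn.extend(range(next_colour, next_colour + nu + 1))
            next_colour += nu + 1
        distinct.append(len(seen))
    return sequence, distinct

if __name__ == "__main__":
    seq, d = urn_with_triggering(10_000)
    # Growth of distinct colours over time (compare with Heaps' law).
    for t in (10, 100, 1_000, 10_000):
        print(t, d[t - 1])
```

Plotting `distinct` against time on log-log axes gives the novelty-rate curve to compare with Heaps' law, and the frequency-rank distribution of `sequence` can be compared with Zipf's law.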
Human languages are rule governed, but almost invariably these rules have exceptions in the form of irregularities. Since rules in language are efficient and productive, the persistence of irregularity is an anomaly. How does irregularity linger in the face of internal (endogenous) and external (exogenous) pressures to conform to a rule? Here we address this problem by taking a detailed look at simple past tense verbs in the Corpus of Historical American English. The data show that the language is open, with many new verbs entering. At the same time, existing verbs might tend to regularize or irregularize as a consequence of internal dynamics, but overall, the amount of irregularity sustained by the language stays roughly constant over time. Despite continuous vocabulary growth, and presumably, an attendant increase in expressive power, there is no corresponding growth in irregularity. We analyze the set of irregulars, showing they may adhere to a set of minority rules, allowing for increased stability of irregularity over time. These findings contribute to the debate on how language systems become rule governed, and how and why they sustain exceptions to rules, providing insight into the interplay between the emergence and maintenance of rules and exceptions in language.
The naming game (NG) describes the agreement dynamics of a population of N agents interacting locally in pairs, leading to the emergence of a shared vocabulary. This model is relevant to the novel field of semiotic dynamics, and specifically to opinion formation and language evolution. Its applications range from wireless sensor networks, as spreading algorithms and leader election algorithms, to user-based social tagging systems. In this paper, we introduce the concept of overhearing (i.e., at every time step of the game, a random set of N·delta individuals is chosen from the population; these individuals overhear the word transmitted by the speaker and reshape their inventories accordingly). When delta = 0 one recovers the behavior of the original NG. As one increases delta, the population of agents reaches agreement faster and with a significantly lower memory requirement. The convergence time to reach global consensus scales as log N as delta approaches 1. Copyright (C) EPLA, 2013
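A minimal Python sketch of a naming game with overhearing may help make the dynamics concrete. It reads the size of the overhearing set as N times delta, since delta = 0 has to recover the original game. The update rule for overhearers is not specified in the abstract, so this sketch assumes they behave like the hearer (keep only the word if they already know it, otherwise add it). The function name `naming_game` and its parameters are hypothetical.

```python
import random

def naming_game(n_agents, delta=0.0, max_steps=200_000, seed=0):
    """Return the number of steps needed to reach global consensus,
    or None if consensus is not reached within `max_steps`.

    Assumption (not from the abstract): overhearers apply the same
    inventory update as the hearer.
    """
    rng = random.Random(seed)
    inventories = [set() for _ in range(n_agents)]
    n_overhear = int(delta * n_agents)     # size of the overhearing set
    next_word = 0
    for step in range(1, max_steps + 1):
        speaker, hearer = rng.sample(range(n_agents), 2)
        if not inventories[speaker]:       # invent a word if none is known
            inventories[speaker].add(next_word)
            next_word += 1
        word = rng.choice(sorted(inventories[speaker]))
        others = [a for a in range(n_agents) if a not in (speaker, hearer)]
        listeners = [hearer] + rng.sample(others, min(n_overhear, len(others)))
        success = word in inventories[hearer]
        for a in listeners:
            if word in inventories[a]:
                inventories[a] = {word}    # known word: drop competitors
            else:
                inventories[a].add(word)   # unknown word: learn it
        if success:
            inventories[speaker] = {word}  # speaker also collapses on success
        first = inventories[0]
        if len(first) == 1 and all(inv == first for inv in inventories):
            return step
    return None

if __name__ == "__main__":
    for delta in (0.0, 0.1, 0.5):
        print(delta, naming_game(50, delta=delta, seed=1))
```

Averaging the returned convergence time over many seeds for increasing N is one way to explore how the scaling changes with delta under these assumptions.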
The lexicons of human languages organize their units at two distinct levels. At a first combinatorial level, meaningless forms (typically referred to as phonemes) are combined into meaningful units (typically referred to as morphemes). Thanks to this, many morphemes can be obtained by relatively simple combinations of a small number of phonemes. At a second compositional level of the lexicon, morphemes are composed into larger lexical units, the meaning of which is related to the individual meanings of the composing morphemes. This duality of patterning is not a necessity for lexicons, and the question remains wide open regarding how a population of individuals is able to bootstrap such a structure and what the evolutionary advantages of its emergence are. Here we address this question in the framework of a multi-agent model, where a population of individuals plays simple naming games in a conceptual environment modeled as a graph. We demonstrate that errors in communication provide conditions for the emergence of duality of patterning, which can thus be explained in a purely cultural way. Compositional lexicons turn out to lead to successful communication faster than purely combinatorial lexicons, suggesting that meaning played a crucial role in the evolution of language.
One of the fundamental problems in cognitive science is how humans categorize the visible color spectrum. The empirical evidence of the existence of universal or recurrent patterns in color naming across cultures is paralleled by the observation that color names begin to be used by individual cultures in a relatively fixed order. The origin of this hierarchy is largely unexplained. Here we resort to multiagent simulations, where a population of individuals, subject to a simple perceptual constraint shared by all humans, namely the human Just Noticeable Difference, categorizes and names colors through a purely cultural negotiation in the form of language games. We found that the time needed for a population to reach consensus on a color name depends on the region of the visible color spectrum. If color spectrum regions are ranked according to this criterion, a hierarchy with [red, (magenta)-red], [violet], [green/yellow], [blue], [orange], and [cyan], appearing in this order, is recovered, featuring an excellent quantitative agreement with the empirical observations of the WCS. Our results demonstrate a clear possible route to the emergence of hierarchical color categories, confirming that the theoretical modeling in this area has now attained the required maturity to make significant contributions to the ongoing debates concerning language universals.
The empirical evidence that human color categorization exhibits some universal patterns beyond superficial discrepancies across different cultures is a major breakthrough in cognitive science. As observed in the World Color Survey (WCS), any two groups of individuals develop quite different categorization patterns, but some universal properties can be identified by a statistical analysis over a large number of populations. Here, we reproduce the WCS in a numerical model in which different populations independently develop their own categorization systems by playing elementary language games. We find that a simple perceptual constraint shared by all humans, namely the human Just Noticeable Difference (JND), is sufficient to trigger the emergence of universal patterns that unconstrained cultural interaction fails to produce. We test the results of our experiment against real data by performing the same statistical analysis proposed to quantify the universal tendencies shown in the WCS [Kay P & Regier T. (2003) Proc. Natl. Acad. Sci. USA 100: 9085-9089], and obtain an excellent quantitative agreement. This work confirms that synthetic modeling has now reached the maturity to contribute significantly to the ongoing debate in cognitive science.
Our social behaviour has evolved primarily through contact with a limited number of other individuals. Yet as a species we exhibit uniformities on a global scale. This kind of emergent behaviour is familiar territory for statistical physicists.
What processes can explain how very large populations are able to converge on the use of a particular word or grammatical construction without global coordination? Answering this question helps to understand why new language constructs usually propagate along an S-shaped curve with a rather sudden transition towards global agreement. It also helps to analyse and design new technologies that support or orchestrate self-organizing communication systems, such as recent social tagging systems for the web. The article introduces and studies a microscopic model of communicating autonomous agents performing language games without any central control. We show that the system undergoes a disorder/order transition, going through a sharp symmetry breaking process to reach a shared set of conventions. Before the transition, the system builds up non-trivial scale-invariant correlations, for instance in the distribution of competing synonyms, which display a Zipf-like law. These correlations make the system ready for the transition towards shared conventions, which, observed on the timescale of collective behaviours, becomes sharper and sharper with system size. This surprising result not only explains why human language can scale up to very large populations but also suggests ways to optimize artificial semiotic dynamics.