---
title: "Individual-based models of cultural evolution"
subtitle: "A step-by-step guide using R"
author:
- Alberto Acerbi
- Alex Mesoudi
- Marco Smolla
date: "`r Sys.Date()`"
site: bookdown::bookdown_site
documentclass: book
bibliography: "biblio.bib"
link-citations: true
---
# Introduction {-#Introduction}
TO DO
<!--chapter:end:index.Rmd-->
# (PART\*) Basics {-}
# Unbiased transmission
We start by simulating a simple case of unbiased cultural transmission. We will detail each step of the simulation and explain the code line-by-line. In the following chapters, we will reuse most of this initial model, building up the complexity of our simulations.
## Initialising the simulation
Here we will simulate a case where $N$ individuals each possess one of two mutually exclusive cultural traits. These alternative traits are denoted $A$ and $B$. For example, $A$ might be eating a vegetarian diet, and $B$ might be eating a non-vegetarian diet. In reality, traits are seldom clear-cut (e.g. what about pescatarians?), but models are designed to cut away all the complexity to give tractable answers to simplified situations.
Our model has non-overlapping generations. In each generation, all $N$ individuals are replaced with $N$ new individuals. Again, this is unlike any real biological group but provides a simple way of simulating change over time. Generations here could correspond to biological generations, but could equally be 'cultural generations' (or learning episodes), which might be much shorter.
Each new individual of each new generation picks a member of the previous generation at random and copies their cultural trait. This is known as unbiased oblique cultural transmission. 'Unbiased' refers to the fact that traits are copied entirely at random. The term 'oblique' means that members of one generation learn from those of the previous, non-overlapping, generation. This is different from, for example, horizontal cultural transmission, where individuals copy members of the same generation, and vertical cultural transmission, where offspring copy their biological parents.
If we assume that the two cultural traits are transmitted in an unbiased way, what does that mean for the average trait frequency in the population? To answer this question, we must track the proportion of individuals who possess trait $A$ over successive generations. We will call this proportion $p$. We could also track the proportion who possess trait $B$, but this will always be $1 - p$ given that the two traits are mutually exclusive. For example, if $70\%$ of the population have trait $A$ $(p=0.7)$, then the remaining $30\%$ must have trait $B$ (i.e. $1-p=1-0.7=0.3$).
The output of the model will be a plot showing $p$ over all generations up to the last generation. Generations (or time steps) are denoted by $t$, where generation one is $t=1$, generation two is $t=2$, up to the last generation $t=t_{\text{max}}$.
First, we need to specify the fixed parameters of the model. These are quantities that we decide on at the start and do not change during the simulation. In this model these are $N$ (the number of individuals) and $t_{\text{max}}$ (the number of generations). Let's start with $N=100$ and $t_{\text{max}}=200$:
```{r 1.1}
N <- 100
t_max <- 200
```
Now we need to create our individuals. The only information we need to keep about our individuals is their cultural trait ($A$ or $B$). We'll make **population** the data structure containing the individuals. The type of data structure we have chosen here is a tibble. This is a more user-friendly version of a dataframe.
Initially, we'll give each individual either an $A$ or $B$ at random, using the `sample()` command. This can be seen in the code chunk below. The `sample()` command takes three arguments (i.e. inputs or options). The first argument lists the elements to pick at random, in our case, the traits $A$ and $B$. The second argument gives the number of times to pick, in our case $N$ times, once for each individual. The final argument says to replace or reuse the elements specified in the first argument after they've been picked (otherwise there would only be one copy of $A$ and one copy of $B$, so we could only give two individuals traits before running out). Within the `tibble()` command, the word $trait$ denotes the name of the variable within the tibble that contains the random $A$s and $B$s, and the whole tibble is assigned the name **population**.
There is one line before `sample()` in the chunk below: we need to call the `tidyverse` library, which we will use throughout this chapter. Here, it allows us to create a tibble using the `tibble()` command.
```{r 1.2, message = FALSE}
library(tidyverse)
population <- tibble(trait = sample(c("A", "B"), N, replace = TRUE))
```
We can see the cultural traits of our population by simply entering its name in the R console:
```{r 1.3}
population
```
As expected, there is a single column called $trait$ containing $A$s and $B$s. The type of the column, in this case 'chr' (i.e. character), is reported below the name.
A specific individual's trait can be retrieved using the square bracket notation in R. For example, individual 4's trait can be retrieved by typing:
```{r 1.4}
population$trait[4]
```
This should match the fourth row in the table above.
We also need a tibble to record the output of our simulation, that is, to track the trait frequency $p$ in each generation. This will have two columns with $t_{\text{max}}$ rows, one row for each generation. The first column is simply a counter of the generations, from 1 to $t_{\text{max}}$. This will be useful to plot the output later. The other column contains the values of $p$ for each generation.
At this stage we don't know what $p$ will be in each generation, so, for now, let's fill the **output** tibble with lots of NAs, which is R's symbol for Not Available, or missing value. We can use the `rep()` (repeat) command to repeat NA $t_{\text{max}}$ times. We're using NA rather than, say, zero, because zero could be misinterpreted as $p = 0$, which would mean that all individuals have trait $B$. This would be misleading, because at the moment we haven't yet calculated $p$, so it's nonexistent, rather than zero.
```{r 1.5}
output <- tibble(generation = 1:t_max, p = rep(NA, t_max))
```
We can, however, fill in the first value of $p$ for our already-created first generation of individuals, held in **population**. The command below sums the number of $A$s in **population** and divides by $N$ to get a proportion out of 1 rather than an absolute number. It then puts this proportion in the first slot of $p$ in **output**, the one for the first generation, $t=1$. We can again write the name of the tibble, `output`, to see that it worked.
```{r 1.6}
output$p[1] <- sum(population$trait == "A") / N
output
```
This first value of $p$ should be around $0.5$, meaning that around 50 individuals have trait $A$, and 50 have trait $B$. Even though `sample()` returns either trait with equal probability, this does not necessarily mean that we will get exactly 50 $A$s and 50 $B$s. This happens with simulations and finite population sizes: they are probabilistic (or stochastic), not deterministic. Analogously, flipping a coin 100 times will not always give exactly 50 heads and 50 tails. Sometimes we will get 51 heads, sometimes 49, etc. To see this in our simulation, re-run the above code a few times: the initial value of $p$ will vary slightly around 0.5.
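To see the same effect in miniature, here is a quick standalone illustration (separate from the model code): counting the heads in five batches of 100 coin flips rarely gives exactly 50 each time.
```{r}
# Illustration only: the number of heads in 100 fair coin flips varies around 50
replicate(5, sum(sample(c("heads", "tails"), 100, replace = TRUE) == "heads"))
```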
## Execute generation turnover many times
Now that we have built the population, we can simulate what individuals do in each generation. We iterate these actions over $t_{\text{max}}$ generations. In each generation, we need to:
* copy the current individuals to a separate tibble called **previous_population** to use as demonstrators for the new individuals; this allows us to implement oblique transmission with its non-overlapping generations, rather than mixing up the generations
* create a new generation of individuals, each of whose trait is picked at random from the **previous_population** tibble
* calculate $p$ for this new generation and store it in the appropriate slot in **output**
To iterate, we'll use a for-loop, using $t$ to track the generation. We've already done generation 1 so we'll start at generation 2. The random picking of demonstrators is done with `sample()` again, but this time picking from the traits held in **previous_population**. Note that we have added comments briefly explaining what each line does. This is perhaps superfluous when the code is this simple, but it's always good practice. Code often gets cut-and-pasted into other places and loses its context. Explaining what each line does lets other people - and a future, forgetful you - know what's going on.
```{r 1.7}
for (t in 2:t_max) {
previous_population <- population # copy the population tibble to previous_population tibble
population <- tibble(trait = sample(previous_population$trait, N, replace = TRUE)) # randomly copy from previous generation's individuals
output$p[t] <- sum(population$trait == "A") / N # get p and put it into the output slot for this generation t
}
```
Now we should have 200 values of $p$ stored in **output**, one for each generation. You can list them by typing **output**, but it is more effective to plot them.
## Plotting the model results
We use `ggplot()` to plot our data. The syntax of ggplot may seem slightly obscure at first, but it forces us to have a clear picture of the data before plotting.
In the first line in the code below, we are telling ggplot that the data we want to plot is in the tibble **output**. Then, with the command `aes()` we declare the 'aesthetics' of the plot, that is, how we want our data mapped in our plot. In this case, we want the values of $p$ on the y-axis, and the values of $generation$ on the x-axis (this is why we earlier created, in the tibble **output**, a column keeping the count of generations).
We then use `geom_line()`. In ggplot, 'geoms' describe what kind of visual representation should be plotted: lines, bars, boxes and so on. This visual representation is independent of the mapping that we declared before with `aes()`. The same data, with the same mapping, can be visually represented in many different ways. In this case, we are asking ggplot to represent the data as a line. You can change `geom_line()` in the code below to `geom_point()`, and see what happens (other geoms have less obvious effects, and we will see some of them in the later chapters).
The other commands are mainly to make the plot look nicer. We want the y-axis to span all the possible values of $p$, from 0 to 1, and we use a particular 'theme' for our plot, in this case, a standard theme with white background. With the command `labs()` we give a more informative label to the y-axis (ggplot automatically labels the axis with the name of the tibble columns that are plotted: this is good for $generation$, but less so for $p$).
```{r 1.8}
ggplot(data = output, aes(y = p, x = generation)) +
geom_line() +
ylim(c(0, 1)) +
theme_bw() +
labs(y = "p (proportion of individuals with trait A)")
```
The proportion of individuals with trait $A$ should start off hovering around 0.5 and then oscillate randomly (in some cases it may also reach 0, meaning that all $A$s have disappeared, or 1, meaning that all $B$s have disappeared). Unbiased transmission, or random copying, is by definition random, so different runs of this simulation will generate different plots. If you rerun all the code you will get something different, and in all likelihood $p$ will go to 0 or 1 at some point. At $p = 0$ there are no $A$s and every individual possesses $B$. At $p = 1$ there are no $B$s and every individual possesses $A$. This is a typical feature of cultural drift, analogous to genetic drift: in small populations, with no selection or other directional processes operating, traits can be lost purely by chance after some generations.
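If you want to check whether, and when, one of the traits was lost in your run, a quick check on the **output** tibble created above is to ask for the first generation at which $p$ reached 0 or 1:
```{r}
# First generation (if any) at which p hit 0 or 1; NA means neither trait was lost
which(output$p == 0 | output$p == 1)[1]
```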
## Write a function to wrap the model code
Ideally, we would like to repeat the simulation to explore this idea in more detail, perhaps changing some of the parameters. For example, if we increase $N$, are we more or less likely to lose one of the traits? As noted above, individual-based models like this one are probabilistic or stochastic, thus it is essential to run simulations many times to understand what happens. With our code scattered about in chunks, it is hard to quickly repeat the simulation. Instead, we can wrap it all up in a function:
```{r 1.9}
unbiased_transmission_1 <- function(N, t_max) {
population <- tibble(trait = sample(c("A", "B"), N, replace = TRUE))
output <- tibble(generation = 1:t_max, p = rep(NA, t_max))
output$p[1] <- sum(population$trait == "A") / N
for (t in 2:t_max) {
previous_population <- population # copy individuals to previous_population tibble
population <- tibble(trait = sample(previous_population$trait, N, replace = TRUE))
# randomly copy from previous generation
output$p[t] <- sum(population$trait == "A") / N # get p and put it into output slot for this generation t
}
output
}
```
This is just all of the code snippets that we already ran above, wrapped in a function that takes $N$ and $t_{\text{max}}$ as arguments. In addition, `unbiased_transmission_1()` ends with the line `output`. This means that this tibble will be exported from the function when it is run. This is useful for storing data from simulations wrapped in functions; otherwise that data is lost after the function is executed.
Nothing will happen when you run the above code, because all you have done is define the function but not actually run it. The point is that we can now call the function in one go, easily changing the values of $N$ and $t_{\text{max}}$. Let's try first with the same values of $N$ and $t_{\text{max}}$ as before, and save the output from the simulation into **data_model**, as a record of what happened.
```{r 1.10}
data_model <- unbiased_transmission_1(N = 100, t_max = 200)
```
We also need to create another function to plot the data, so we do not need to rewrite all the plotting instructions each time. While this may seem unnecessary now, it is convenient to separate the function that runs the simulation from the function that plots the data, for various reasons. With more complicated models, we do not want to rerun a simulation just because we want to change some detail in the plot. It also makes conceptual sense to keep the raw output of the model separate from the various ways we can visualise it, or the further analysis we want to perform on it. As above, the code is identical to what we already wrote:
```{r 1.11}
plot_single_run <- function(data_model) {
ggplot(data = data_model, aes(y = p, x = generation)) +
geom_line() +
ylim(c(0, 1)) +
theme_bw() +
labs(y = "p (proportion of individuals with trait A)")
}
```
At this point, we can visualise the results:
```{r 1.12}
plot_single_run(data_model)
```
As anticipated, the plot is different from the simulation we ran before, even though the code is exactly the same. This is due to the stochastic nature of the simulation.
Now let's try changing the parameters. We can call the simulation and the plotting functions together. The code below reruns and plots the simulation with a much larger $N$.
```{r 1.13}
data_model <- unbiased_transmission_1(N = 10000, t_max = 200)
plot_single_run(data_model)
```
You should see much less fluctuation. Rarely in a population of $N = 10000$ will either trait go to fixation. Try re-running the previous code chunk to explore the effect of $N$ on long-term dynamics.
## Run several independent simulations and plot their results
Wrapping a simulation in a function like this is good because we can easily re-run it with just a single command. However, it's a bit laborious to manually re-run it. Say we wanted to re-run the simulation 10 times with the same parameter values to see how many times $A$ goes to fixation, and how many times $B$ goes to fixation. Currently, we'd have to manually run the `unbiased_transmission_1()` function 10 times and record somewhere else what happened in each run. It would be better to automatically re-run the simulation several times and plot each run as a separate line on the same plot. We could also add a line showing the mean value of $p$ across all runs.
Let's use a new parameter $r_{\text{max}}$ to specify the number of independent runs, and use another for-loop to cycle over the $r_{\text{max}}$ runs. Let's rewrite the `unbiased_transmission_1()` function to handle multiple runs. We will call the new function `unbiased_transmission_2()`.
```{r 1.14}
unbiased_transmission_2 <- function(N, t_max, r_max) {
output <- tibble(generation = rep(1:t_max, r_max), p = as.numeric(rep(NA, t_max * r_max)), run = as.factor(rep(1:r_max, each = t_max))) # create the output tibble
for (r in 1:r_max) { # for each run
population <- tibble(trait = sample(c("A", "B"), N, replace = TRUE))
# create first generation
output[output$generation == 1 & output$run == r, ]$p <- sum(population$trait == "A") / N # add first generation's p for run r
for (t in 2:t_max) {
previous_population <- population # copy individuals to previous_population tibble
population <- tibble(trait = sample(previous_population$trait, N, replace = TRUE))
# randomly copy from previous generation
output[output$generation == t & output$run == r, ]$p <- sum(population$trait == "A") / N # get p and put it into output slot for this generation t and run r
}
}
output # export data from function
}
```
There are a few changes here. First, we need a different **output** tibble, because we need to store data for all the runs. For that, we initialise the same $generation$ and $p$ columns as before, but with space for all the runs. $generation$ is now built by repeating the count of each generation $r_{\text{max}}$ times, and $p$ is NA repeated for all generations, for all runs.
We also need a new column called $run$ that keeps track of which run the data in the other two columns belongs to. Note that the definition of $run$ is preceded by `as.factor()`. This specifies the type of data to put in the $run$ column. We want $run$ to be a 'factor' or categorical variable so that, even if runs are labelled with numbers (1, 2, 3...), this should not be misinterpreted as a continuous, real number: there is no sense in which run 2 is twice as 'runny' as run 1, or run 3 half as 'runny' as run 6. Runs could equally have been labelled using letters, or any other arbitrary scheme. While omitting `as.factor()` does not make any difference when running the simulation, it would create problems when plotting the data because ggplot would treat runs as continuous real numbers rather than discrete categories (you can see this yourself by modifying the definition of **output** in the previous code chunk). This is a good example of how it is important to have a clear understanding of your data before trying to plot or analyse them.
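To see this classification for yourself, here is a minimal standalone illustration (using made-up labels, not the simulation data) of how R treats the same labels with and without `as.factor()`:
```{r}
# Illustration only: the same labels treated as numbers vs as a factor
run_numeric <- rep(1:5, each = 2)
run_factor <- as.factor(run_numeric)
class(run_numeric) # "integer": ggplot would map this to a continuous scale
class(run_factor) # "factor": ggplot maps this to discrete colours, one per run
```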
Going back to the function, we then set up an $r$ loop, which executes once for each run. The code within this loop is mostly the same as before, except that we now use the `[output$generation == t & output$run == r, ]` notation to put $p$ into the right place in **output**.
The plotting function is also changed to handle multiple runs:
```{r 1.15}
plot_multiple_runs <- function(data_model) {
ggplot(data = data_model, aes(y = p, x = generation)) +
geom_line(aes(colour = run)) +
stat_summary(fun = mean, geom = "line", size = 1) +
ylim(c(0, 1)) +
theme_bw() +
labs(y = "p (proportion of individuals with trait A)")
}
```
To understand how the above code works, we need to explain the general functioning of ggplot. As explained above, `aes()` specifies the 'aesthetics', or how the data are mapped in the plot. This is independent of the possible visual representations of this mapping, or 'geoms'. If we declare specific aesthetics when we call `ggplot()`, these aesthetics will be applied to all geoms we call afterwards. Alternatively, we can specify the aesthetics in the geom itself. For example this:
```{r 1.16, eval=FALSE}
ggplot(data = output, aes(y = p, x = generation)) +
geom_line()
```
is equivalent to this:
```{r 1.17, eval=FALSE}
ggplot(data = output) +
geom_line(aes(y = p, x = generation))
```
We can use this property to make more complex plots. The plot created in `plot_multiple_runs` has a first geom, `geom_line()`. This inherits the aesthetics specified in the initial call to `ggplot()` but also has a new mapping specific to `geom_line()`, `colour = run`. This tells ggplot to plot each run line with a different colour. The following command, `stat_summary()`, calculates the mean of all runs. However, this only inherits the mapping specified in the initial `ggplot()` call. If in the aesthetic of `stat_summary()` we had also specified `colour = run`, it would separate the data by run, and it would calculate the mean of each run. This, though, is just the lines we have already plotted with the `geom_line()` command. For this reason, we did not put `colour = run` in the `ggplot()` call, only in `geom_line()`. As always, there are various ways to obtain the same result. This code:
```{r 1.18, eval=FALSE}
ggplot(data = output) +
geom_line(aes(y = p, x = generation, colour = run)) +
stat_summary(aes(y = p, x = generation), fun = mean, geom = "line", size = 1)
```
is equivalent to the code we wrapped in the function above. However, the original code is clearer, as it distinguishes the global mapping, and the mappings specific to each visual representation.
`stat_summary()` is a generic ggplot function which can be used to plot different statistics to summarise our data. In this case, we want to calculate the mean of the data mapped on $y$, plot it with a line, and make this line thicker than the lines for the single runs. The default line size for `geom_line()` is 0.5, so `size = 1` doubles the thickness.
Let's now run the function and plot the results for five runs with the same parameters we used at the beginning ($N=100$ and $t_{\text{max}}=200$):
```{r 1.19}
data_model <- unbiased_transmission_2(N = 100, t_max = 200, r_max = 5)
plot_multiple_runs(data_model)
```
You should be able to see five independent runs of our simulation shown as regular thin lines, along with a thicker line showing the mean of these lines. Some runs have probably gone to 0 or 1, and the mean should be somewhere in between. The data is stored in **data_model**, which we can inspect by writing its name.
```{r 1.20}
data_model
```
Now let's run the `unbiased_transmission_2()` model with $N = 10000$, to compare with $N = 100$.
```{r 1.21}
data_model <- unbiased_transmission_2(N = 10000, t_max = 200, r_max = 5)
plot_multiple_runs(data_model)
```
The mean line should be almost exactly at $p = 0.5$ now, with the five independent runs fairly close to it.
## Varying initial conditions
Let's add one final modification. So far the starting frequencies of $A$ and $B$ have been the same, roughly 0.5 each. But what if we were to start at different initial frequencies of $A$ and $B$? Say, $p = 0.2$ or $p = 0.9$? Would unbiased transmission keep $p$ at these initial values, or would it go to $p = 0.5$ as we have found so far?
To find out, we can add another parameter, $p_0$, which specifies the initial probability of an individual having an $A$ rather than a $B$ in the first generation. Previously this was always $p_0 = 0.5$, but in the new function below we add it to the `sample()` function to weight the initial allocation of traits in $t = 1$.
```{r 1.22}
unbiased_transmission_3 <- function(N, p_0, t_max, r_max) {
output <- tibble(generation = rep(1:t_max, r_max), p = as.numeric(rep(NA, t_max * r_max)), run = as.factor(rep(1:r_max, each = t_max)))
for (r in 1:r_max) {
population <- tibble(trait = sample(c("A", "B"), N, replace = TRUE, prob = c(p_0, 1 - p_0)))
# create first generation
output[output$generation == 1 & output$run == r, ]$p <- sum(population$trait == "A") / N # add first generation's p for run r
for (t in 2:t_max) {
previous_population <- population # copy individuals to previous_population tibble
population <- tibble(trait = sample(previous_population$trait, N, replace = TRUE))
# randomly copy from previous generation
output[output$generation == t & output$run == r, ]$p <- sum(population$trait == "A") / N # get p and put it into output slot for this generation t and run r
}
}
output # export data from function
}
```
`unbiased_transmission_3()` is almost identical to the previous function. The only changes are the addition of $p_0$ as an argument to the function, and the $prob$ argument in the `sample()` command. The $prob$ argument gives the probability of picking each option, in our case $A$ and $B$, in the first generation. The probability of $A$ is now $p_0$, and the probability of $B$ is now $1 - p_0$. We can use the same plotting function as before to visualise the result. Let's see what happens with a different value of $p_0$, for example $p_0 = 0.2$.
```{r 1.23}
data_model <- unbiased_transmission_3(N = 10000, p_0 = 0.2, t_max = 200, r_max = 5)
plot_multiple_runs(data_model)
```
With $p_0 = 0.2$, trait frequencies stay at $p = 0.2$. Unbiased transmission is truly non-directional: it maintains trait frequencies at whatever they were in the previous generation, barring random fluctuations caused by small population sizes.
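We can confirm this numerically with a quick check on the **data_model** tibble just created, averaging $p$ in the final generation across the five runs:
```{r}
# Mean frequency of A in the final generation; this should be close to p_0 = 0.2
data_model %>%
  filter(generation == max(generation)) %>%
  summarise(mean_final_p = mean(p))
```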
***
***
## Analytical model {-}
If $p$ is the frequency of $A$ in one generation, we are interested in calculating $p'$, the frequency of $A$ in the next generation under the assumption of unbiased transmission. Each new individual in the next generation picks a demonstrator at random from among the previous generation. The demonstrator will have $A$ with probability $p$. The frequency of $A$ in the next generation, then, is simply the frequency of $A$ in the previous generation:
$$p' = p \hspace{30 mm}(1.1)$$
Equation 1.1 simply says that under unbiased transmission there is no change in $p$ over time. If, as we assumed above, the initial value of $p$ in a particular population is $p_0$, then the equilibrium value of $p$, $p^*$, at which there is no change in $p$ over time, is just $p_0$.
We can plot this recursion, to recreate the final simulation plot above:
```{r 1.24}
p_0 <- 0.2
t_max <- 200
pop_analytical <- tibble(p = rep(NA, t_max), generation = 1:t_max)
pop_analytical$p[1] <- p_0
for (i in 2:t_max) {
pop_analytical$p[i] <- pop_analytical$p[i - 1]
}
ggplot(data = pop_analytical, aes(y = p, x = generation)) +
geom_line() +
ylim(c(0, 1)) +
theme_bw() +
labs(y = "p (proportion of individuals with trait A)")
```
Here, we use a **for** loop to cycle through each generation, each time updating $p$ according to the recursion equation above. Remember, there is no $N$ here because the recursion is deterministic and assumes an infinite population size; hence there is no stochasticity due to finite population sizes. There is also no need to have multiple runs as each run is identical, hence no $r_{\text{max}}$.
Don't worry, it gets more complicated than this in later chapters. The key point here is that analytical (or deterministic) models assume infinite populations and no stochasticity. Simulations with very large populations should give the same results as analytical models. Basically, the closer we can get in stochastic models to the assumption of infinite populations, the closer the match to infinite-population deterministic models. Deterministic models give the ideal case; stochastic models permit more realistic dynamics based on finite populations.
More generally, creating deterministic recursion-based models can be a good way of verifying simulation models, and vice versa: if the same dynamics occur in both individual-based and recursion-based models, then we can be more confident that those dynamics are genuine and not the result of a programming error or mathematical mistake.
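As a quick sketch of this verification idea, we can overlay the deterministic recursion (the **pop_analytical** tibble created above) on stochastic runs of `unbiased_transmission_3()` with a large $N$; the dashed analytical line should sit on top of the simulation mean:
```{r}
data_model <- unbiased_transmission_3(N = 10000, p_0 = 0.2, t_max = 200, r_max = 5)
plot_multiple_runs(data_model) +
  geom_line(data = pop_analytical, aes(y = p, x = generation), linetype = "dashed")
```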
***
***
## Summary of the model
Even this extremely simple model provides some valuable insights. First, unbiased transmission does not in itself change trait frequencies. As long as populations are large, trait frequencies remain the same.
Second, the smaller the population size, the more likely traits are to be lost by chance. This is a basic insight from population genetics, known there as genetic drift, but it can also be applied to cultural evolution. Many studies have tested (and some supported) the idea that population size and other demographic factors can shape cultural diversity.
Furthermore, generating expectations about cultural change under simple assumptions like random cultural drift can be useful for detecting non-random patterns like selection. If we don't have such a baseline, we won't recognise selection or other directional processes when we see them.
We have also introduced several programming techniques that will be useful in later simulations. We have seen how to use tibbles to hold the characteristics of individuals and the outputs of simulations, how to use loops to cycle through generations and simulation runs, how to use `sample()` to pick randomly from sets of elements, how to wrap simulations in functions to easily re-run them with different parameter values, and how to use `ggplot()` to plot the results of simulations.
## Further reading
@cavalli-sforza_cultural_1981 explored how cultural drift affects cultural evolution, which was extended by @neiman_stylistic_1995 in an archaeological context. @bentley_random_2004 present models of unbiased transmission for several cultural datasets. @lansing_domain_2011 and commentaries explore the underlying assumptions of applying random drift to cultural evolution.
<!--chapter:end:01-Unbiased_transmission.Rmd-->
# Unbiased and biased mutation
Evolution doesn't work without a source of new variation upon which selection, drift and other processes can act. In genetic evolution, mutation is almost always blind with respect to function. Beneficial genetic mutations are no more likely to arise when they are needed than when they are not needed - in fact, most genetic mutations are neutral or detrimental to an organism. Cultural evolution is more interesting, in that novel variation may sometimes be directed to solve specific problems, or systematically biased due to features of our cognition. In the models below, we'll simulate both unbiased and biased mutation.
## Unbiased mutation
First, we will simulate unbiased mutation in the same basic model as used in the previous chapter. We'll remove unbiased transmission to see the effect of unbiased mutation alone.
As in the previous model, we assume $N$ individuals each of whom possesses one of two discrete cultural traits, denoted $A$ and $B$. In each generation, from $t = 1$ to $t = t_{\text{max}}$, the $N$ individuals are replaced with $N$ new individuals. Instead of random copying, each individual now gives rise to a new individual with the same cultural trait as them. (Another way of looking at this is in terms of timesteps, such as years: the same $N$ individuals live for $t_{\text{max}}$ years and keep their cultural trait from one year to the next.)
At each generation, however, there is a probability $\mu$ that each individual mutates from their current trait to the other trait (the Greek letter mu, $\mu$, is the standard notation for the mutation rate in genetic evolution, and it has an analogous function here). For example, vegetarian individuals can decide to eat animal products, and vice versa. Remember, this is not copied from other individuals, as in the previous model, but can be thought of as an individual decision. Another way to see this is that the probability of changing trait applies to each individual independently; whether an individual mutates has no bearing on whether or how many other individuals have mutated. On average, this means that $\mu N$ individuals mutate each generation. As in the previous model, we are interested in tracking the proportion $p$ of agents with trait $A$ over time.
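As a quick numerical check of the claim that, on average, $\mu N$ individuals mutate each generation (an illustration only, using $N = 100$ and $\mu = 0.05$, so the expectation is 5), we can repeat the random draw many times and average the number of mutants:
```{r}
# Average number of mutants over 1000 repeated draws; expected value is mu * N = 5
mean(replicate(1000, sum(sample(c(TRUE, FALSE), 100, prob = c(0.05, 0.95), replace = TRUE))))
```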
We'll wrap this in a function called `unbiased_mutation()`, using much of the same code as `unbiased_transmission_3()`. As before, we need to call the tidyverse library. We also set a seed for the random number generator, so that the results will be the same each time we rerun the code. Of course, if you want to see the stochasticity inherent in the simulation, you can remove the `set.seed()` command, or set it to a different number.
```{r 2.1, message = FALSE}
library(tidyverse)
set.seed(111)
unbiased_mutation <- function(N, mu, p_0, t_max, r_max) {
output <- tibble(generation = rep(1:t_max, r_max), p = as.numeric(rep(NA, t_max * r_max)), run = as.factor(rep(1:r_max, each = t_max))) # create the output tibble
for (r in 1:r_max) {
population <- tibble(trait = sample(c("A", "B"), N, replace = TRUE, prob = c(p_0, 1 - p_0)))
output[output$generation == 1 & output$run == r, ]$p <- sum(population$trait == "A") / N # add first generation's p for run r
for (t in 2:t_max) {
previous_population <- population # copy individuals to previous_population tibble
mutate <- sample(c(TRUE, FALSE), N, prob = c(mu, 1 - mu), replace = TRUE) # determine 'mutant' individuals
if (nrow(population[mutate & previous_population$trait == "A", ]) > 0) { # if there are 'mutants' from A to B
population[mutate & previous_population$trait == "A", ]$trait <- "B" # then flip them to B
}
if (nrow(population[mutate & previous_population$trait == "B", ]) > 0) { # if there are 'mutants' from B to A
population[mutate & previous_population$trait == "B", ]$trait <- "A" # then flip them to A
}
output[output$generation == t & output$run == r, ]$p <- sum(population$trait == "A") / N # get p and put it into output slot for this generation t and run r
}
}
output # export data from function
}
```
The only changes from the previous model are the addition of $\mu$, the parameter that specifies the probability of mutation, in the function definition and new lines of code within the `for` loop on $t$ which replace the random copying command with unbiased mutation. Let's examine these lines to see how they work.
The most obvious way of implementing unbiased mutation - which is not done above - would have been to set up another `for` loop. We would cycle through each individual one by one, each time deciding whether it should mutate or not based on $\mu$. This would certainly work, but R is notoriously slow at loops. It's always preferable in R, where possible, to use 'vectorised' code. That's what is done above in our three added lines, starting from `mutate <- sample()`.
First, we pre-specify for each individual whether it mutates. For this, we again use the function `sample()`, picking `TRUE` (corresponding to being a mutant) or `FALSE` (not mutating, i.e. keeping the same cultural trait) $N$ times. The draw, however, is not uniform: the probability of drawing `TRUE` is equal to $\mu$, and the probability of drawing `FALSE` is $1-\mu$. You can think about the procedure in this way: each individual in the population flips a biased coin that has probability $\mu$ of landing on, say, heads, and $1-\mu$ of landing on tails. If it lands on heads, they change their cultural trait.
After that, in the following lines, we change the traits of the 'mutant' individuals. We need to check whether any individuals change their trait, both from $A$ to $B$ and vice versa, using the two `if` conditionals: if there are no such individuals, assigning a new value to an empty tibble returns an error. To check, we make sure that the number of rows is greater than 0 (using `nrow() > 0` within the `if`).
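For comparison, here is a sketch of what the slower loop-based alternative would look like (illustration only, not run; it assumes `N`, `mu`, **population** and **previous_population** exist as they do inside the function above):
```{r, eval=FALSE}
# Loop-based equivalent of the vectorised mutation step; much slower in R
for (i in 1:N) {
  if (runif(1) < mu) { # individual i mutates with probability mu
    if (previous_population$trait[i] == "A") {
      population$trait[i] <- "B" # A mutates to B
    } else {
      population$trait[i] <- "A" # B mutates to A
    }
  }
}
```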
To plot the results, we can use the same function `plot_multiple_runs()` we wrote in the [previous chapter][Unbiased transmission], reproduced here for convenience.
```{r 2.2, echo=FALSE}
plot_multiple_runs <- function(data_model) {
ggplot(data = data_model, aes(y = p, x = generation)) +
geom_line(aes(colour = run)) +
stat_summary(fun = mean, geom = "line", size = 1) +
ylim(c(0, 1)) +
theme_bw() +
labs(y = "p (proportion of individuals with trait A)")
}
```
Let's now run and plot the model:
```{r 2.3}
data_model <- unbiased_mutation(N = 100, mu = 0.05, p_0 = 0.5, t_max = 200, r_max = 5)
plot_multiple_runs(data_model)
```
Unbiased mutation produces random fluctuations over time and does not alter the overall frequency of $A$, which stays around $p = 0.5$. Because mutations from $A$ to $B$ are just as likely as mutations from $B$ to $A$, there is no overall directional trend.
If you remember from the previous chapter, with unbiased transmission, when populations were small (e.g. $N=100$), one of the traits generally disappeared after a few generations. Here, though, with $N=100$, both traits remain until the end of the simulation. Why this difference? You can think of it in this way: when one trait becomes popular, say when the frequency of $A$ is equal to $0.8$, with unbiased transmission individuals of the new generation are more likely to pick up $A$ when copying at random; the few individuals with trait $B$ each have an 80% probability of copying $A$. With unbiased mutation, on the other hand, since $\mu$ is applied independently to each individual, when $A$ is common there will be more individuals flipping from $A$ to $B$ (on average $\mu p N$ individuals, in our case 4) than flipping from $B$ to $A$ ($\mu (1-p) N$ individuals, in our case 1), keeping the traits at similar frequencies.
But what if we were to start at different initial frequencies of $A$ and $B$? Say, $p=0.1$ or $p=0.9$? Would $A$ disappear? Would unbiased mutation keep $p$ at these initial values, as we saw unbiased transmission does in Model 1?
To find out, let's change $p_0$, which specifies the initial probability of drawing an $A$ rather than a $B$ in the first generation.
```{r 2.4}
data_model <- unbiased_mutation(N = 100, mu = 0.05, p_0 = 0.1, t_max = 200, r_max = 5)
plot_multiple_runs(data_model)
```
You should see $p$ go from 0.1 up to 0.5. In fact, whatever the initial starting frequencies of $A$ and $B$, unbiased mutation always leads to $p = 0.5$, for the reason explained above: unbiased mutation always tends to balance the proportion of $A$s and $B$s.
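The same happens starting from the other extreme: with $p_0 = 0.9$, $p$ should decline to 0.5.
```{r}
data_model <- unbiased_mutation(N = 100, mu = 0.05, p_0 = 0.9, t_max = 200, r_max = 5)
plot_multiple_runs(data_model)
```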
## Biased mutation
A more interesting case is biased mutation. Let's assume now that there is a probability $\mu_b$ that an individual with trait $B$ mutates into $A$, but there is no possibility of trait $A$ mutating into trait $B$. Perhaps trait $A$ is a particularly catchy or memorable version of a story or an intuitive explanation of a phenomenon, and $B$ is difficult to remember or unintuitive to understand.
The function `biased_mutation()` captures this unidirectional mutation.
```{r 2.5}
biased_mutation <- function(N, mu_b, p_0, t_max, r_max) {
output <- tibble(generation = rep(1:t_max, r_max), p = as.numeric(rep(NA, t_max * r_max)), run = as.factor(rep(1:r_max, each = t_max))) # create the output tibble
for (r in 1:r_max) {
population <- tibble(trait = sample(c("A", "B"), N, replace = TRUE, prob = c(p_0, 1 - p_0)))
output[output$generation == 1 & output$run == r, ]$p <- sum(population$trait == "A") / N # add first generation's p for run r
for (t in 2:t_max) {
previous_population <- population # copy individuals to previous_population tibble
mutate <- sample(c(TRUE, FALSE), N, prob = c(mu_b, 1 - mu_b), replace = TRUE) # find 'mutant' individuals
if (nrow(population[mutate & previous_population$trait == "B", ]) > 0) {
population[mutate & previous_population$trait == "B", ]$trait <- "A" # if individual was B and mutates, flip to A
}
output[output$generation == t & output$run == r, ]$p <- sum(population$trait == "A") / N # get p and put it into output slot for this generation t and run r
}
}
output # export data from function
}
```
There are just two changes in this code compared to `unbiased_mutation()`. First, we've replaced $\mu$ with $\mu_b$ to keep the two parameters distinct and avoid confusion. Second, the line in `unbiased_mutation()` which caused individuals with $A$ to mutate to $B$ has been deleted.
Let's see what effect this has by running `biased_mutation()`. We'll start with the population entirely composed of individuals with $B$, i.e. $p_0 = 0$, to see how quickly and in what manner $A$ spreads via biased mutation.
```{r 2.6}
data_model <- biased_mutation(N = 100, mu_b = 0.05, p_0 = 0, t_max = 200, r_max = 5)
plot_multiple_runs(data_model)
```
The plot shows a steep increase that slows and plateaus at $p = 1$ by around generation $t = 100$. There should be a bit of fluctuation in the different runs, but not much. Now let's try a larger population size.
```{r 2.7}
data_model <- biased_mutation(N = 10000, mu_b = 0.05, p_0 = 0, t_max = 200, r_max = 5)
plot_multiple_runs(data_model)
```
With $N = 10000$ the line should be smooth with little (if any) fluctuation across the runs. But notice that it plateaus at about the same generation, around $t = 100$. Population size has little effect on the rate at which a novel trait spreads via biased mutation. $\mu_b$, on the other hand, does affect this speed. Let's double the biased mutation rate to 0.1.
```{r 2.8}
data_model <- biased_mutation(N = 10000, mu_b = 0.1, p_0 = 0, t_max = 200, r_max = 5)
plot_multiple_runs(data_model)
```
Now trait $A$ reaches fixation around generation $t = 50$. Play around with $N$ and $\mu_b$ to confirm that the latter determines the rate of diffusion of trait $A$, and that it takes the same form each time - roughly an 'r' shape with an initial steep increase followed by a plateauing at $p = 1$.
***
***
## Analytical model {-}
If $p$ is the frequency of $A$ in one generation, we are interested in calculating $p'$, the frequency of $A$ in the next generation under the assumption of unbiased mutation. The next generation retains the cultural traits of the previous generation, except that a proportion $\mu$ of them switch to the other trait. There are therefore two sources of $A$ in the next generation: members of the previous generation who had $A$ and didn't mutate, therefore staying $A$, and members of the previous generation who had $B$ and did mutate, therefore switching to $A$. The frequency of $A$ in the next generation is therefore:
$$p' = p(1-\mu) + (1-p)\mu \hspace{30 mm}(2.1)$$
The first term on the right-hand side of Equation 2.1 represents the first group, the $(1 - \mu)$ proportion of the $p$ $A$-carriers who didn't mutate. The second term represents the second group, the $\mu$ proportion of the $1 - p$ $B$-carriers who did mutate.
To calculate the equilibrium value of $p$, $p^*$, we want to know when $p' = p$, or when the frequency of $A$ in one generation is identical to the frequency of $A$ in the next generation. This can be found by setting $p' = p$ in Equation 2.1, which gives:
$$p = p(1-\mu) + (1-p)\mu \hspace{30 mm}(2.2)$$
Expanding the right-hand side of Equation 2.2 gives $p = p - p\mu + \mu - p\mu$. Subtracting $p$ from both sides and factoring out $\mu$ gives:
$$\mu(1 - 2p) = 0 \hspace{30 mm}(2.3)$$
The left-hand side of Equation 2.3 equals zero when either $\mu = 0$, which given our assumption that $\mu > 0$ cannot be the case, or when $1 - 2p = 0$, which after rearranging gives the single equilibrium $p^* = 0.5$. This matches our simulation results above. As we found in the simulations, this does not depend on $\mu$ or the starting frequency of $p$.
We can also plot the recursion in Equation 2.1 like so:
```{r 2.9}
p_0 <- 0
t_max <- 200
mu <- 0.1
pop_analytical <- tibble(p = rep(NA, t_max), generation = 1:t_max)
pop_analytical$p[1] <- p_0
for (i in 2:t_max) {
pop_analytical$p[i] <- pop_analytical$p[i - 1] * (1 - mu) + (1 - pop_analytical$p[i - 1]) * mu
}
ggplot(data = pop_analytical, aes(y = p, x = generation)) +
geom_line() +
ylim(c(0, 1)) +
theme_bw() +
labs(y = "p (proportion of individuals with trait A)")
```
Again, this should resemble the figure generated by the simulations above, and confirm that $p^* = 0.5$.
For biased mutation, assume that only $B$s are switching to $A$, and with probability $\mu_b$ instead of $\mu$. The first term on the right-hand side becomes simply $p$, because $A$s do not switch. The second term remains the same, but with $\mu_b$. Thus,
$$p' = p + (1-p)\mu_b \hspace{30 mm}(2.4)$$
The equilibrium value $p^*$ can be found by again setting $p' = p$, which reduces Equation 2.4 to $(1-p)\mu_b = 0$. Assuming $\mu_b > 0$, this gives the single equilibrium $p^* = 1$, which again matches the simulation results.
We can plot the above recursion like so:
```{r 2.10}
p_0 <- 0
t_max <- 200
mu_b <- 0.1
pop_analytical <- tibble(p = rep(NA, t_max), generation = 1:t_max)
pop_analytical$p[1] <- p_0
for (i in 2:t_max) {
pop_analytical$p[i] <- pop_analytical$p[i - 1] + (1 - pop_analytical$p[i - 1]) * mu_b
}
ggplot(data = pop_analytical, aes(y = p, x = generation)) +
geom_line() +
ylim(c(0, 1)) +
theme_bw() +
labs(y = "p (proportion of individuals with trait A)")
```
Hopefully, this looks identical to the final simulation plot with the same value of $\mu_b$.
Furthermore, we can specify an equation for the change in $p$ from one generation to the next, or $\Delta p$. We do this by subtracting $p$ from both sides of Equation 2.4, giving:
$$\Delta p = p' - p = (1-p)\mu_b \hspace{30 mm}(2.5)$$
Seeing this helps explain two things. First, the $1 - p$ part explains the r-shape of the curve: the smaller $p$ is, the larger $\Delta p$ will be. This explains why $p$ increases very quickly at first, when $p$ is near zero, and why the increase slows as $p$ gets larger. We have already determined that the increase stops altogether ($\Delta p = 0$) when $p = p^* = 1$.
Second, it says that the rate of increase is proportional to $\mu_b$. This explains our observation in the simulations that larger values of $\mu_b$ cause $p$ to reach its maximum value faster.
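Both points can be seen with a quick calculation of $\Delta p$ at a few values of $p$ (an illustration only):
```{r}
mu_b <- 0.1
p <- c(0.1, 0.5, 0.9)
(1 - p) * mu_b # delta p is largest when p is small, and scales with mu_b
```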
***
***
## Summary of the model
With this simple model, we can draw the following insights. Unbiased mutation, which resembles genetic mutation in being non-directional, always leads to an equal mix of the two traits. It introduces and maintains cultural variation in the population. It is interesting to compare unbiased mutation to unbiased transmission from Model 1. While unbiased transmission did not change $p$ over time, unbiased mutation always converges on $p^* = 0.5$, irrespective of the starting frequency. (NB $p^* = 0.5$ assuming there are two traits; more generally, $p^* = 1/v$, where $v$ is the number of traits.)
Biased mutation, which is far more common - perhaps even typical - in cultural evolution, shows different dynamics. Novel traits favoured by biased mutation spread in a characteristic fashion - an r-shaped diffusion curve - with a speed characterised by the mutation rate $\mu_b$. Population size has little effect, whether $N = 100$ or $N = 10000$. Whenever biased mutation is present ($\mu_b > 0$), the favoured trait goes to fixation, even if it is not initially present.
In terms of programming techniques, the major novelty in Model 2 is the use of `sample()` to determine which individuals undergo some event that occurs with a fixed probability (in our case, mutation). This could be done with a loop, but vectorising code in the way we did here is much faster in R than looping.
## Further reading
@boyd_culture_1985 model what they call 'guided variation', which is equivalent to biased mutation as modelled in this chapter. @henrich_cultural_2001 shows how biased mutation / guided variation generates r-shaped curves similar to those generated here.
<!--chapter:end:02-Unbiased_and_biased_mutation.Rmd-->
# Biased transmission: direct bias
So far we have looked at unbiased transmission ([Chapter 1][Unbiased transmission]) and mutation, both unbiased and biased ([Chapter 2][Unbiased and biased mutation]). Let's complete the set by looking at biased transmission. This occurs when one trait is more likely to be copied than another trait. When the choice depends on the features of the trait, it is often called 'direct' or 'content' bias. When the choice depends on features of the demonstrators (the individuals from whom one is copying), it is often called 'indirect' or 'context' bias. Both are sometimes also called 'cultural selection' because one trait is selected to be copied over another trait. In this chapter, we will look at trait-based (direct, content) bias.
(As an aside, there is a confusing array of terminology in the field of cultural evolution, as illustrated by the preceding paragraph. That's why models are so useful. Words and verbal descriptions can be ambiguous. Often the writer doesn't realise that there are hidden assumptions or unrecognised ambiguities in their descriptions. They may not realise that what they mean by 'cultural selection' is entirely different from how someone else uses it. Models are great because they force us to precisely specify exactly what we mean by a particular term or process. We can use the words in the paragraph above to describe biased transmission, but it's only really clear when we model it, making all our assumptions explicit.)
To simulate biased transmission, following the simulations in [Chapter 1][Unbiased transmission], we assume there are two traits $A$ and $B$, and that each individual chooses another individual from the previous generation at random. This time, however, we give the traits two different probabilities of being copied: we can call them $s_a$ and $s_b$, respectively. When an individual encounters another individual with trait $A$, they will copy them with probability $s_a$. When they encounter an individual with trait $B$, they will copy them with probability $s_b$.
With $s_a=s_b$, copying is unbiased: individuals switch to the encountered alternative with the same probability whichever trait it is, reproducing the results of the unbiased transmission simulations. If $s_a=s_b=1$, the model is exactly the same as in [Chapter 1][Unbiased transmission]. The relevant situation in this chapter is when $s_a>s_b$ (or vice versa), so that we have biased transmission. Perhaps $A$ (or $B$) is a more effective tool, a more memorable story, or a more easily pronounced word.
Let's first write the function, and then explore what happens in this case. Below is a function `biased_transmission_direct()` that implements all of these ideas.
```{r 3.1}
library(tidyverse)
set.seed(111)
biased_transmission_direct <- function (N, s_a, s_b, p_0, t_max, r_max) {
output <- tibble(generation = rep(1:t_max, r_max), p = as.numeric(rep(NA, t_max * r_max)), run = as.factor(rep(1:r_max, each = t_max)))
for (r in 1:r_max) {
population <- tibble(trait = sample(c("A", "B"), N, replace = TRUE, prob = c(p_0, 1 - p_0))) # create first generation
output[output$generation == 1 & output$run == r, ]$p <- sum(population$trait == "A") / N # add first generation's p for run r
for (t in 2:t_max) {
previous_population <- population # copy individuals to previous_population tibble
demonstrator_trait <- tibble(trait = sample(previous_population$trait, N, replace = TRUE))
# for each individual, pick a random individual from the previous generation to act as demonstrator and store their trait
# biased probabilities to copy:
copy_a <- sample(c(TRUE, FALSE), N, prob = c(s_a, 1 - s_a), replace = TRUE)
copy_b <- sample(c(TRUE, FALSE), N, prob = c(s_b, 1 - s_b), replace = TRUE)
if (nrow(population[copy_a & demonstrator_trait == "A", ]) > 0) {
population[copy_a & demonstrator_trait == "A", ]$trait <- "A"
}
if (nrow(population[copy_b & demonstrator_trait == "B", ]) > 0) {
population[copy_b & demonstrator_trait == "B", ]$trait <- "B"
}
output[output$generation == t & output$run == r, ]$p <- sum(population$trait == "A") / N # get p and put it into output slot for this generation t and run r
}
}
output # export data from function
}
```
Most of `biased_transmission_direct()` is recycled from the previous models. As before, we initialise the data structure **output** from multiple runs, and in generation $t = 1$, we create a **population** tibble to hold the trait of each individual.
The major change is that we now include biased transmission. We first select at random the demonstrators from the previous generation (using the same code we used in `unbiased_transmission()`) and store their traits in **demonstrator_trait**. Then we get the probabilities of copying $A$ and of copying $B$ for the entire population, using the same code used in `biased_mutation()`, except that this time it produces a probability of copying rather than of mutating. Again using the same code as in `biased_mutation()`, we have the individuals copy the trait at hand with the desired probability.
Let's run our function `biased_transmission_direct()`. As before, to plot the results, we can use the same function `plot_multiple_runs()` we wrote in [Chapter 1][Unbiased transmission].
As noted above, the interesting case is when one trait is favoured over the other. We can assume, for example, $s_a=0.1$ and $s_b=0$. This means that when individuals encounter another individual with trait $A$, they copy them 1 in every 10 times, but when they encounter an individual with trait $B$, they never switch. We also assume that the favoured trait, $A$, is initially rare in the population ($p_0=0.01$) to see how selection favours this initially-rare trait. (Note that $p_0$ needs to be higher than 0: since there is no mutation in this model, we need to include at least some $A$s at the beginning of the simulation, otherwise $A$ could never appear.)
```{r 3.2, echo=FALSE}
plot_multiple_runs <- function(data_model) {
  ggplot(data = data_model, aes(y = p, x = generation)) +
    geom_line(aes(colour = run)) +
    stat_summary(fun = mean, geom = "line", size = 1) +
    ylim(c(0, 1)) +
    theme_bw() +
    labs(y = "p (proportion of individuals with trait A)")
}
```
```{r 3.3}
data_model <- biased_transmission_direct(N = 10000, s_a = 0.1, s_b = 0,
                                         p_0 = 0.01, t_max = 150, r_max = 5)
plot_multiple_runs(data_model)
```
With this moderate selection strength, we can see that $A$ gradually replaces $B$ and goes to fixation. It does so in a characteristic manner: the increase is slow at first, then picks up speed, then plateaus.
Note the difference from biased mutation. Where biased mutation was r-shaped, with a steep initial increase, biased transmission is s-shaped, with an initial slow uptake. This is because the strength of biased transmission (like selection in general) is proportional to the variation in the population. When $A$ is initially rare, there is only a small chance of picking another individual with $A$. As $A$ spreads, the chance of picking an $A$ individual increases. As $A$ becomes very common, there are few $B$ individuals left to switch. With biased mutation, in contrast, the probability of switching is independent of the variation in the population.
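To visualise this contrast, here is a quick sketch (our addition, not part of the model code). It iterates two simple recursions side by side: a one-way biased mutation, assumed here to follow $p' = p + \mu(1-p)$, and the biased transmission rule $p' = p + sp(1-p)$ that we derive analytically below, both with rate 0.1.
```{r}
t_max <- 150
curves <- tibble(generation = 1:t_max,
                 mutation = as.numeric(rep(NA, t_max)),
                 transmission = as.numeric(rep(NA, t_max)))
curves$mutation[1] <- 0.01 # both processes start from the same low frequency
curves$transmission[1] <- 0.01
for (i in 2:t_max) {
  # r-shaped: switching does not depend on the frequency of A
  curves$mutation[i] <- curves$mutation[i - 1] +
    0.1 * (1 - curves$mutation[i - 1])
  # s-shaped: switching requires meeting an A-bearer, so it scales with p
  curves$transmission[i] <- curves$transmission[i - 1] +
    0.1 * curves$transmission[i - 1] * (1 - curves$transmission[i - 1])
}
ggplot(data = pivot_longer(curves, c(mutation, transmission),
                           names_to = "process", values_to = "p"),
       aes(y = p, x = generation, linetype = process)) +
  geom_line() +
  ylim(c(0, 1)) +
  theme_bw() +
  labs(y = "p (proportion of individuals with trait A)")
```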
## Strength of selection
On what does the strength of selection depend? Notably, not on the specific values of $s_a$ and $s_b$: what counts is their relative difference, in this case $s_a-s_b = 0.1$. If we run a simulation with, say, $s_a=0.6$ and $s_b=0.5$, we see the same pattern, albeit with slightly more noise; that is, the single runs differ more from one another than in the previous simulation. This is because switches from $A$ to $B$ are now also possible.
```{r 3.4}
data_model <- biased_transmission_direct(N = 10000, s_a = 0.6, s_b = 0.5,
                                         p_0 = 0.01, t_max = 150, r_max = 5)
plot_multiple_runs(data_model)
```
To change the selection strength, we need to modify the difference between $s_a$ and $s_b$. We can double the strength by setting $s_a = 0.2$, and keeping $s_b=0$.
```{r 3.5}
data_model <- biased_transmission_direct(N = 10000, s_a = 0.2, s_b = 0,
                                         p_0 = 0.01, t_max = 150, r_max = 5)
plot_multiple_runs(data_model)
```
As we might expect, increasing the strength of selection increases the speed with which $A$ goes to fixation. Note, though, that it retains the s-shape.
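As a rough check on this speed-up, here is a small exploration (our addition, not in the original text) that reruns the simulation for a few values of $s_a$ and records, for each run, the first generation in which $p$ exceeds 0.99:
```{r}
for (s in c(0.1, 0.2, 0.4)) {
  data_model <- biased_transmission_direct(N = 10000, s_a = s, s_b = 0,
                                           p_0 = 0.01, t_max = 150, r_max = 5)
  # first generation in which p exceeds 0.99, per run
  # (a run that never got there would give Inf)
  fixation <- data_model %>%
    group_by(run) %>%
    summarise(first_generation = min(generation[p > 0.99]))
  print(c(s = s, mean_first_generation = mean(fixation$first_generation)))
}
```
Doubling the difference between $s_a$ and $s_b$ should roughly halve the number of generations needed to approach fixation.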
***
***
## Analytical model {-}
As before, we have $p$ individuals with trait $A$ and $1 - p$ individuals with trait $B$. Since what matters is the relative difference between the probabilities of copying the two traits, and not their absolute values, we can always set $s_b=0$ and vary $s_a$, which we call simply $s$. Thus, the $p$ individuals with trait $A$ always keep their $A$s. Each of the $1 - p$ individuals with trait $B$ picks another individual at random, who has trait $A$ with probability $p$, and then switches to $A$ with probability $s$. We can therefore write the recursion for $p$ under biased transmission as:
$$p' = p + p(1-p)s \hspace{30 mm}(3.1)$$
The first term on the right-hand side is the unchanged $A$-bearers, and the second term is the $1-p$ $B$-bearers who find one of the $p$ $A$-bearers and switch with probability $s$.
Here is some code to plot this biased transmission recursion:
```{r 3.6}
t_max <- 150
s <- 0.1
pop_analytical <- tibble(p = rep(NA, t_max), generation = 1:t_max)
pop_analytical$p[1] <- 0.01
for (i in 2:t_max) {
  pop_analytical$p[i] <- pop_analytical$p[i - 1] +
    pop_analytical$p[i - 1] * (1 - pop_analytical$p[i - 1]) * s
}
ggplot(data = pop_analytical, aes(y = p, x = generation)) +
  geom_line() +
  ylim(c(0, 1)) +
  theme_bw() +
  labs(y = "p (proportion of individuals with trait A)")
```
This curve should be identical to the one from our first simulation above, which used the same biased transmission strength ($s_a = 0.1$, $s_b = 0$) and a large enough $N$ to minimise stochasticity.
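As a quick visual check (our addition, not in the original text), we can rerun the simulation with those parameters and overlay the analytical recursion on the simulated runs:
```{r}
data_model <- biased_transmission_direct(N = 10000, s_a = 0.1, s_b = 0,
                                         p_0 = 0.01, t_max = 150, r_max = 5)
# dashed line: the analytical recursion computed above
plot_multiple_runs(data_model) +
  geom_line(data = pop_analytical, aes(y = p, x = generation), linetype = "dashed")
```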
From the equation above, we can see how the strength of biased transmission depends on variation in the population: the term $p(1 - p)$ measures that variation, being highest when $p = 0.5$ and shrinking to zero as $p$ approaches 0 or 1. This term determines the shape of the curve, while $s$ determines the speed with which the equilibrium $p^*$ is reached.
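To see this at a glance, here is a short sketch (our addition) plotting the expected one-generation change, $\Delta p = sp(1-p)$, as a function of $p$:
```{r}
s <- 0.1
change <- tibble(p = seq(0, 1, by = 0.01))
# the expected change peaks at p = 0.5, where variation is greatest,
# and vanishes at p = 0 and p = 1
change$delta_p <- s * change$p * (1 - change$p)
ggplot(data = change, aes(y = delta_p, x = p)) +
  geom_line() +
  theme_bw() +
  labs(y = "expected change in p in one generation")
```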
But what is the equilibrium $p^*$ here? In fact, there are two. As before, the equilibrium can be found by setting the change in $p$ to zero, or when:
$$p(1-p)s = 0 \hspace{30 mm}(3.2)$$
There are three ways in which the left-hand side can equal zero: when $p = 0$, when $p = 1$ and when $s = 0$. The last case is uninteresting: it would mean that biased transmission is not occurring. The first two cases simply say that if either trait reaches fixation, then it will stay at fixation. This is to be expected, given that we have no mutation in our model. It contrasts with unbiased and biased mutation, where there is only one equilibrium value of $p$.
We can also say that $p = 0$ is an unstable equilibrium, meaning that any slight perturbation away from $p = 0$ moves $p$ away from that value. This is essentially what we simulated above: a slight perturbation up to $p = 0.01$ went all the way up to $p = 1$. In contrast, $p = 1$ is a stable equilibrium: any slight perturbation from $p = 1$ immediately goes back to $p = 1$.
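We can verify this with a minimal numerical sketch (our addition): apply the recursion once to a small perturbation away from each equilibrium.
```{r}
s <- 0.1
p <- c(0.01, 0.99) # small perturbations away from p = 0 and p = 1
# in both cases p moves towards 1: away from the unstable equilibrium
# at p = 0, and back towards the stable equilibrium at p = 1
p + p * (1 - p) * s
```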
***
***
## Summary of the model
We have seen how biased transmission causes a trait favoured by cultural selection to spread and go to fixation in a population, even when it is initially very rare. Biased transmission differs in its dynamics from biased mutation. Its action is proportional to the variation in the population at the time at which it acts. It is strongest when there is lots of variation (in our model, when there are equal numbers of $A$ and $B$ at $p = 0.5$), and weakest when there is little variation (when $p$ is close to 0 or 1).
## Further reading
@boyd_culture_1985 modelled direct bias, while @henrich_cultural_2001 added directly biased transmission to his guided variation / biased mutation model, showing that this generates s-shaped curves similar to those generated here. Note, though, that subsequent work has shown that s-shaped curves can be generated via other processes (e.g. @reader_distinguishing_2004), so they should not be considered definitive evidence for biased transmission.
<!--chapter:end:03-Biased_transmission_direct_bias.Rmd-->
# Biased transmission: frequency-dependent indirect bias
## The logic of conformity
In [Chapter 3][Biased transmission: direct bias] we looked at the case where one cultural trait is intrinsically more likely to be copied than another. Here we will start looking at the other kind of biased transmission, where traits are equivalent but individuals are more likely to adopt a trait depending on the characteristics of the population, in particular on which other individuals already have it. (As we mentioned previously, these are often called 'indirect' or 'context' biases.)
A first possibility is that individuals are influenced by the frequency of the trait in the population, i.e. how many other individuals already have it. The most studied case is conformity (or 'positive frequency-dependent bias'): individuals are disproportionately more likely to adopt the most common trait in the population, irrespective of its intrinsic characteristics. (The opposite case, anti-conformity or negative frequency-dependent bias, where the least common trait is more likely to be copied, is also possible, but probably less common in real life.)
For example, imagine trait $A$ has a frequency of 0.7 in the population, with the rest possessing trait $B$. An unbiased learner would adopt trait $A$ with a probability exactly equal to 0.7. This is unbiased transmission, and it is what happens in the model described in [Chapter 1][Unbiased transmission]: by picking a member of the previous generation at random, the probability of adoption is equal to the frequency of that trait among the previous generation.
A conformist learner, on the other hand, would adopt trait $A$ with a probability greater than 0.7. In other words, common traits get an 'adoption boost' relative to unbiased transmission. Uncommon traits get an equivalent 'adoption penalty'. The magnitude of this boost or penalty can be controlled by a parameter, which we will call $D$.
Let's keep things simple in our model. Rather than assuming that individuals sample across the entire population, which in any case might be implausible in large populations, let's assume they pick only three demonstrators at random. Why three? This is the minimum number of demonstrators that can yield a majority (i.e. 2 vs 1), which we need to implement conformity. When two demonstrators have one trait and the other demonstrator has a different trait, we want to boost the probability of adoption for the majority trait, and reduce it for the minority trait.
We can specify the probability of adoption as follows:
**Table 1: Probability of adopting trait $A$ for each possible combination of traits amongst three demonstrators**
Demonstrator 1 | Demonstrator 2 | Demonstrator 3 | Probability of adopting trait $A$
-------------- | -------------- | -------------- | --------------------------------- |
$A$ | $A$ | $A$ | 1
| | |
$A$ | $A$ | $B$ | $2/3 + D/3$
$A$ | $B$ | $A$ | $2/3 + D/3$
$B$ | $A$ | $A$ | $2/3 + D/3$
| | |
$A$ | $B$ | $B$ | $1/3 - D/3$
$B$ | $A$ | $B$ | $1/3 - D/3$
$B$ | $B$ | $A$ | $1/3 - D/3$
| | |
$B$ | $B$ | $B$ | 0
The first row says that when all demonstrators have trait $A$, then trait $A$ is definitely adopted. Similarly, the bottom row says that when all demonstrators have trait $B$, then trait $A$ is never adopted, and by implication trait $B$ is always adopted.
For the three combinations where there are two $A$s and one $B$, the probability of adopting trait $A$ is $2/3$, which is what it would be under unbiased transmission (because two out of three demonstrators have $A$), plus the conformist adoption boost specified by $D$. As we want $D$ to vary between 0 and 1, it is divided by three, so that the maximum probability of adoption is 1 (when $D=1$).
Similarly, for the three combinations where there are two $B$s and one $A$, the probability of adopting $A$ is $1/3$ minus the conformist adoption penalty specified by $D$, again divided by three.
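As a quick numerical check (our addition, not in the original text), we can print these adoption probabilities for full and for weak conformity:
```{r}
# probability of adopting A when A is the majority (2 of 3 demonstrators)
# and when it is the minority (1 of 3)
for (D in c(1, 0.1)) {
  print(c(D = D, majority = 2/3 + D/3, minority = 1/3 - D/3))
}
```
With $D=1$ the majority trait is always adopted and the minority trait never is; with $D=0.1$ the probabilities are only slightly distorted away from the unbiased values of $2/3$ and $1/3$.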
Let's implement these assumptions in the kind of individual-based model we've been building so far. As before, assume $N$ individuals each of whom possesses one of two traits $A$ or $B$. The frequency of $A$ is denoted by $p$. The initial frequency of $A$ in generation $t = 1$ is $p_0$. Rather than going straight to a function, let's go step by step.
First, we'll specify our parameters, $N$ and $p_0$ as before, plus the new conformity parameter $D$. We also create the usual **population** tibble and fill it with $A$s and $B$s in the proportion specified by $p_0$, again exactly as before.
```{r 4.1}
library(tidyverse)
set.seed(111)
N <- 100
p_0 <- 0.5
D <- 1
population <- tibble(trait = sample(c("A", "B"), N, replace = TRUE, prob = c(p_0, 1 - p_0))) # create first generation
```
Now we create another tibble, called **demonstrators**, that picks, for each new individual in the next generation, three demonstrators at random from the current population. It therefore needs three columns/variables, one for each demonstrator, and $N$ rows, one for each individual. We fill each column with randomly chosen traits from the **population** tibble. We can have a look at **demonstrators** by entering its name in the R console.
```{r 4.2}
# create dataframe with a set of 3 randomly-picked demonstrators for each agent
demonstrators <- tibble(dem1 = sample(population$trait, N, replace = TRUE),
                        dem2 = sample(population$trait, N, replace = TRUE),
                        dem3 = sample(population$trait, N, replace = TRUE))
demonstrators
```
Think of each row here as containing the traits of the three demonstrators randomly chosen for one new next-generation individual. Now we want to calculate the probability of adopting $A$ for each of these three-trait demonstrator combinations.
First we need to count the number of $A$s in each combination. Then we can replace the traits in **population** based on the probabilities in Table 1. When all demonstrators have $A$, we set the new trait to $A$. When no demonstrators have $A$, we set it to $B$. When two out of three demonstrators have $A$, we set it to $A$ with probability $2/3 + D/3$ and to $B$ otherwise. When one out of three demonstrators has $A$, we set it to $A$ with probability $1/3 - D/3$ and to $B$ otherwise.
```{r 4.3}
# get the number of As in each 3-dem combo
num_As <- rowSums(demonstrators == "A")

population$trait[num_As == 3] <- "A" # for dem combos with all As, set to A
population$trait[num_As == 0] <- "B" # for dem combos with no As, set to B

prob_majority <- sample(c(TRUE, FALSE), N, replace = TRUE,
                        prob = c(2/3 + D/3, 1 - (2/3 + D/3)))
prob_minority <- sample(c(TRUE, FALSE), N, replace = TRUE,
                        prob = c(1/3 - D/3, 1 - (1/3 - D/3)))

# when A is a majority (2 of 3 demonstrators)
if (nrow(population[prob_majority & num_As == 2, ]) > 0) {
  population[prob_majority & num_As == 2, ] <- "A"
}
if (nrow(population[prob_majority == FALSE & num_As == 2, ]) > 0) {
  population[prob_majority == FALSE & num_As == 2, ] <- "B"
}

# when A is a minority (1 of 3 demonstrators)
if (nrow(population[prob_minority & num_As == 1, ]) > 0) {
  population[prob_minority & num_As == 1, ] <- "A"
}
if (nrow(population[prob_minority == FALSE & num_As == 1, ]) > 0) {
  population[prob_minority == FALSE & num_As == 1, ] <- "B"
}
```
To check it works, we can add the new **population** tibble as a column to **demonstrators** and have a look at it. This will let us see the three demonstrators and the resulting new trait side by side.
```{r 4.4}
# for testing only, add the new traits to the demonstrator dataframe and show it
demonstrators <- add_column(demonstrators, new_trait = population$trait)
demonstrators
```
Because we set $D=1$ above, the new trait is always the majority trait among the three demonstrators. This is perfect conformity. We can weaken conformity by reducing $D$. Here is an example with $D=0.1$. All the code is the same as what we already discussed above.
```{r 4.5}
N <- 100
p_0 <- 0.5
D <- 0.1

# create first generation
population <- tibble(trait = sample(c("A", "B"), N, replace = TRUE,
                                    prob = c(p_0, 1 - p_0)))

# create dataframe with a set of 3 randomly-picked demonstrators for each agent
demonstrators <- tibble(dem1 = sample(population$trait, N, replace = TRUE),
                        dem2 = sample(population$trait, N, replace = TRUE),
                        dem3 = sample(population$trait, N, replace = TRUE))

# get the number of As in each 3-dem combo
num_As <- rowSums(demonstrators == "A")

population$trait[num_As == 3] <- "A" # for dem combos with all As, set to A
population$trait[num_As == 0] <- "B" # for dem combos with no As, set to B

prob_majority <- sample(c(TRUE, FALSE), N, replace = TRUE,
                        prob = c(2/3 + D/3, 1 - (2/3 + D/3)))
prob_minority <- sample(c(TRUE, FALSE), N, replace = TRUE,
                        prob = c(1/3 - D/3, 1 - (1/3 - D/3)))

# when A is a majority (2 of 3 demonstrators)
if (nrow(population[prob_majority & num_As == 2, ]) > 0) {
  population[prob_majority & num_As == 2, ] <- "A"
}
if (nrow(population[prob_majority == FALSE & num_As == 2, ]) > 0) {
  population[prob_majority == FALSE & num_As == 2, ] <- "B"
}

# when A is a minority (1 of 3 demonstrators)
if (nrow(population[prob_minority & num_As == 1, ]) > 0) {
  population[prob_minority & num_As == 1, ] <- "A"
}
if (nrow(population[prob_minority == FALSE & num_As == 1, ]) > 0) {
  population[prob_minority == FALSE & num_As == 1, ] <- "B"
}

# for testing only, add the new traits to the demonstrator dataframe and show it
demonstrators <- add_column(demonstrators, new_trait = population$trait)
demonstrators
```
Now that conformity is weaker, sometimes the new trait is not the majority amongst the three demonstrators.
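As a rough check (our addition, not in the original text), we can compute the proportion of individuals whose new trait matches the majority trait among their three demonstrators:
```{r}
# the majority trait in each row of demonstrators
majority_trait <- ifelse(num_As >= 2, "A", "B")
# with D = 1 this proportion would be exactly 1; with D = 0.1 it is lower
mean(demonstrators$new_trait == majority_trait)
```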
## Testing conformist transmission
As in the previous chapters, we can put all this code together into a function to see what happens over multiple generations and in multiple runs. There is nothing new in the code below: it combines the code we already wrote in [Chapter 1][Unbiased transmission] with the new bits of code for conformity introduced above.
```{r 4.6}
conformist_transmission <- function (N, p_0, D, t_max, r_max) {
  output <- tibble(generation = rep(1:t_max, r_max),
                   p = as.numeric(rep(NA, t_max * r_max)),
                   run = as.factor(rep(1:r_max, each = t_max)))
  for (r in 1:r_max) {
    # create first generation
    population <- tibble(trait = sample(c("A", "B"), N, replace = TRUE,
                                        prob = c(p_0, 1 - p_0)))
    # add first generation's p for run r
    output[output$generation == 1 & output$run == r, ]$p <-
      sum(population$trait == "A") / N
    for (t in 2:t_max) {
      # create dataframe with a set of 3 randomly-picked demonstrators for each agent
      demonstrators <- tibble(dem1 = sample(population$trait, N, replace = TRUE),
                              dem2 = sample(population$trait, N, replace = TRUE),
                              dem3 = sample(population$trait, N, replace = TRUE))
      # get the number of As in each 3-dem combo
      num_As <- rowSums(demonstrators == "A")
      population$trait[num_As == 3] <- "A" # for dem combos with all As, set to A
      population$trait[num_As == 0] <- "B" # for dem combos with no As, set to B
      prob_majority <- sample(c(TRUE, FALSE), N, replace = TRUE,
                              prob = c(2/3 + D/3, 1 - (2/3 + D/3)))
      prob_minority <- sample(c(TRUE, FALSE), N, replace = TRUE,
                              prob = c(1/3 - D/3, 1 - (1/3 - D/3)))
      # when A is a majority (2 of 3 demonstrators)
      if (nrow(population[prob_majority & num_As == 2, ]) > 0) {
        population[prob_majority & num_As == 2, ] <- "A"
      }
      if (nrow(population[prob_majority == FALSE & num_As == 2, ]) > 0) {
        population[prob_majority == FALSE & num_As == 2, ] <- "B"
      }
      # when A is a minority (1 of 3 demonstrators)
      if (nrow(population[prob_minority & num_As == 1, ]) > 0) {
        population[prob_minority & num_As == 1, ] <- "A"
      }
      if (nrow(population[prob_minority == FALSE & num_As == 1, ]) > 0) {
        population[prob_minority == FALSE & num_As == 1, ] <- "B"
      }
      # get p and put it into output slot for this generation t and run r
      output[output$generation == t & output$run == r, ]$p <-
        sum(population$trait == "A") / N
    }
  }
  output # export data from function
}
```
We can test the function with perfect conformity ($D=1$) and plot it (again we use the function `plot_multiple_runs()` we wrote in [Chapter 1][Unbiased transmission]).
```{r 4.7, echo=FALSE}
plot_multiple_runs <- function(data_model) {
  ggplot(data = data_model, aes(y = p, x = generation)) +
    geom_line(aes(colour = run)) +
    stat_summary(fun = mean, geom = "line", size = 1) +
    ylim(c(0, 1)) +
    theme_bw() +
    labs(y = "p (proportion of individuals with trait A)")
}
```
```{r 4.8}