vignettes/examples_main_functions_callsync.Rmd

---
title: "Examples main functions callsync"
author: "Simeon Q. Smeele and Stephen A. Tyndel"
date: "`r Sys.Date()`"
output: 
  rmarkdown::html_vignette:
    fig_caption: yes
vignette: >
  %\VignetteIndexEntry{Examples main functions callsync}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
  examples_main_functions_callsync.Rmd
editor_options: 
  chunk_output_type: console
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Introduction

This vignette is based on two artificially created audio files.

We used the following recordings from the Macaulay Library at the Cornell Lab of Ornithology: ML397067 and ML273028 of Spring Peeper (Pseudacris crucifer). Each recording contained a focal individual. 

The artificially created files both contain the same 10 minutes of background noise comprised of six snippets taken from both recordings. The snippets were then concatenated together in random sequences to produce a 10 minute long recording. This recording was than amplified by 4dB. 

Spring Peepers produce tonal calls around 120 ms and have a almost flat fundamental frequency around 1500 Hz with a louder first harmonic. The background noise contains multiple other species and loud low frequency (up to 2 kHz) bursts of noise. All noise has lower amplitude than the Spring Peepers, since we tried to create a situation where focal individuals are recorded with backpack microphones. 

We overlayed 20 frog calls on top of the background noise on each track, 10 calls from each recording ML273028 and ML397067. The recordings were ordered such that the order of calls alternated between calls taken from recording ML273028 and ML397067. This way, the first artificially created track simulated a focal individual (taken from ML273028) and the second artificially created track simulated a different focal individual (taken from ML397067). 

We artificially normalized the calls prior to inserting them using the tuneR package to simulate a focal individual. That is, a given focal individual was normalized at 85% of the total amplitude percentage, while non-focal birds were normalized at 55%. Thus, each track were normalized conversely to one another. 

We added 0.03 seconds of additional noise at each second for the second artificially created audio file to create temporal drift. This resulted in a track that contained approximately 18 seconds of drift. 

<br>

# Load libraries and settings block

The code below will load all required libraries and includes all paths used later on. It also downloads the audio files and generates the results folders. 

```{r libraries-and-paths}
# Install missing libraries and load all libraries
libraries = c('seewave', 'tuneR', 'callsync', 'stringr', 'parallel')
for(lib in libraries){
  if(! lib %in% installed.packages()) lapply(lib, install.packages)
  lapply(libraries, require, character.only = TRUE)
}

# Clean R
rm(list = ls()) 

# Paths to all files
path_recordings = 'audio'
path_chunks = 'results/chunks'
path_calls = 'results/calls'
path_git = 'https://raw.githubusercontent.com'
path_repo = '/simeonqs/callsync/master/vignettes/audio'
file_1 = '/artificial_recording_ID_ind1_ID_REC_rec1_REC.wav'
file_2 = '/artificial_recording_drifted_ID_ind2_ID_REC_rec1_REC.wav'
url_1 = paste0(path_git, path_repo, file_1)
url_2 = paste0(path_git, path_repo, file_2)
local_file_1 = paste0(path_recordings, file_1)
local_file_2 = paste0(path_recordings, file_2)

# Download audio files from GitHub
options(timeout = 600)
if(!dir.exists('audio')) dir.create('audio')
if(!file.exists(local_file_1)) 
  download.file(url_1, destfile = local_file_1, mode = 'wb')
if(!file.exists(local_file_2)) 
  download.file(url_2, destfile = local_file_2, mode = 'wb')

# Clean and create results directories
if(dir.exists('results')) unlink('results', recursive = TRUE)
dir.create(path_chunks, recursive = TRUE)
dir.create(path_calls, recursive = TRUE)
```

<br>

# Major alignment

The following code chunk aligns 2 minute chunks (`chunk_size`) from both audio files. The first and last 0.5 minutes (`blank`) of the recording are discarded. The step loads an additional 0.5 minutes (`wing`) on either side to ensure alignment is possible. The `keys_id` and `keys_rec` contain the bit of text just before and after the individual id and recording id from the file name structure (e.g., `artificial_recording_ID_ind1_ID_REC_rec1_REC`). The function stores a pdf and log file in the specified location (`path_chunks`).

```{r major-alignment}
align(chunk_size = 2,
      step_size = 0.5,
      path_recordings = path_recordings,
      path_chunks = path_chunks, 
      keys_id = c('ID_', '_ID'),
      keys_rec = c('REC_', '_REC'),
      blank = 0.5, 
      wing = 0.5, 
      save_pdf = TRUE, 
      save_log = TRUE)
```

<br>

# Call detection and assignment

The following code chunk runs `call.detect.multiple` and `call.assign` together using the function `detect.and.assign`. Chunks are first filtered with a high-pass filter (`ffilter_from`) and then a spectral envelope is computed. Using a threshold of 0.2 of the maximum (`threshold`) calls are detected. Each detection is then aligned with the other recording and only the loudest detection (focal individual) is stored as separate wav file in the specified location (`path_calls`). Note that calls close to the start and end (`wing`) are not included, because the corresponding detection from the other recording might lie outside the chunk.

```{r call-detection-and-assignment}
detect.and.assign(path_chunks = path_chunks,
                  path_calls = path_calls,
                  ffilter_from = 1000,
                  threshold = 0.2, 
                  msmooth = c(500, 95), 
                  min_dur = 0.05, 
                  max_dur = 0.5,
                  step_size = 0.02, 
                  wing = 10) 
```

<br>

# Trace fundamental frequency

The following code chunk loads all calls and runs an algorithm to detect the fundamental frequency. It first detect the precise start and end time with `call.detect` and then uses the spectrum at specified intervals (`hop`) to detect the fundamental frequency. It also applies a smoothening function to impute and smoothing missing and incorrect detections. The code chunk also plot's an example of the traces. 

```{r trace-fundamental-frequency, fig.cap = "Example of a fundamental frequency trace. Black dashed lines are detected start and end times. Green dashed line is the detected fundamental frequency."}
# Audio files 
calls = list.files(path_calls,  '*wav', full.names = T)

# Load waves
waves = lapply(calls, load.wave, ffilter_from = 700)
names(waves) = basename(calls)

# Detect call
message('Starting detections.')
detections = lapply(waves, call.detect, 
                    threshold = 0.3,
                    msmooth = c(500, 95))

# Extract one call per wave
new_waves = lapply(1:length(waves), function(i) 
  waves[[i]][detections[[i]]$start:detections[[i]]$end])

# Trace fundamental
message('Tracing.')
traces = mclapply(new_waves, function(new_wave)
  trace.fund(new_wave, spar = 0.3, freq_lim = c(0.5, 2), 
             thr = 0.15, hop = 5, noise_factor = 1.5), 
  mc.cores = 1)
names(traces) = basename(calls)

# Plot example of trace
better.spectro(waves[[1]])
abline(v = detections[[1]][c('start', 'end')]/waves[[1]]@samp.rate,
       lty = 2, lwd = 3)
lines(traces[[1]]$time + detections[[1]]$start/waves[[1]]@samp.rate,
      traces[[1]]$fund,
      lty = 2, col = 3, lwd = 3)
```

<br>

# Analyse the traces

The final code chunks runs the default measurements on the traces and plots the mean fundamental frequency vs duration to show that individuals are variable and their calls form fuzzy clusters (red vs blue).

```{r analysis, fig.cap = "Scatter plot of meand fundamental frequency [Hz] vs duration [s]. Dots are individual calls and are coloured by individual."}
# Take measurements
measurements = measure.trace.multiple(traces = traces, 
                                      new_waves = new_waves, 
                                      waves = waves, 
                                      detections = detections)

# Plot
individuals = traces |> names() |> strsplit('@') |> sapply(`[`, 2) |> 
  as.factor() |> as.integer()
colours = c('#d11141', '#00aedb')[individuals]
plot(measurements$mean_fund_hz, 
     measurements$duration_s, 
     col = colours, pch = 16,
     xlab = 'Mean fund freq [Hz]',
     ylab = 'Duration [s]')
```

<br>

# Further examples and maintenance

For a real world example see <https://github.com/simeonqs/callsync_an_R_package_for_alignment_and_analysis_of_multi-microphone_animal_recordings>. For questions and suggestions please contact <simeonqs@hotmail.com>.

<br>

# Acknowledgements

We thank The Macaulay Library at the Cornell Lab of Ornithology for providing the recordings on which this vignette is based.