DataPreparationSpeciesChoice

output	title	author
html_document	ReadMe	Vanessa Haller-Bull

Data collection and computations to prepare the data for the species choice analysis.

1. Step - DataDownload

The data for the traits of concern is downloaded from www.coraltraits.org. For quantitative traits we download the average +/- standard deviation.
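As a rough illustration of the "average +/- standard deviation" summary, the sketch below groups trait observations by species and computes both statistics. It is written in Python for illustration only (the repository's scripts are in R). The species names, the values, and the `summarise` helper are hypothetical and are not taken from "DataPrep.R".

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical observations for one quantitative trait: (species, value).
# The values are made up for illustration.
observations = [
    ("Species A", 1.0),
    ("Species A", 3.0),
    ("Species B", 2.0),
]

def summarise(obs):
    """Return {species: (mean, sd)}; sd is None when only one value exists."""
    grouped = defaultdict(list)
    for species, value in obs:
        grouped[species].append(value)
    return {
        sp: (mean(vals), stdev(vals) if len(vals) > 1 else None)
        for sp, vals in grouped.items()
    }

summary = summarise(observations)
for sp, (avg, sd) in summary.items():
    print(sp, avg, sd)
```

A species with a single observation has no sample standard deviation, so the sketch records `None` for it rather than guessing a spread.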

The main script for the data download is "DataPrep.R". It uses the list of species of concern and their taxonomy, both provided by Josh.

The output is saved as "SpeciesData.csv". Table 1 lists the traits it contains and the number of missing values for each.

Table 1: Number of missing values for the data downloaded

Trait	Missing Data
skeletaldensityMin	382
skeletaldensityMax	382
growthrateMin	357
growthrateMax	357
corallitewidthMin	50
corallitewidthMax	20
colonydiameterMin	254
colonydiameterMax	254
polypfecundityMin	403
polypfecundityMax	403
eggsizeMin	411
eggsizeMax	411
photosynthesisMin	408
photosynthesisMax	408
BRI	202
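Counts like those in Table 1 could be reproduced with a short script such as the one below. This is a minimal sketch in Python, not the repository's R code. The in-memory CSV is a made-up stand-in for "SpeciesData.csv", and both the `species` column and the assumption that missing values appear as `NA` or empty cells are guesses about the file layout.

```python
import csv
import io

# Hypothetical stand-in for "SpeciesData.csv". Trait column names follow
# Table 1; the rows and the "species" column are made up.
sample_csv = """species,growthrateMin,growthrateMax,BRI
Species A,1.2,3.4,NA
Species B,NA,NA,0.5
Species C,NA,NA,
"""

def count_missing(handle, missing=("", "NA")):
    """Count cells per column that are empty or marked as missing."""
    reader = csv.DictReader(handle)
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            if value is None or value.strip() in missing:
                counts[name] += 1
    return counts

missing_counts = count_missing(io.StringIO(sample_csv))
for trait, n in missing_counts.items():
    print(trait, n)
```

To run it against a real file, pass an open file handle (for example `open("SpeciesData.csv", newline="")`) instead of the `io.StringIO` object.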

Next, we download the extra data that will be used to impute missing datapoints for each trait. This is done by the scripts named "[Trait name]Data.R", and the results are saved as "[Trait name]BeforeImputed.RData".

The following traits are used to support each of the traits in the main analysis.

Table 2: Traits used for the imputation for each variable in the final dataset

Variable in final dataset	Variables used for imputation
Skeletal density	Larval swimming speed, Colony shape factor, Substrate attachment, Wave exposure, Wave exposure preference
Growth rate	Calcification rate, Life history strategy, Growth form, Growth form from Veron
Corallite width	Polyps per area, Tissue thickness, Total biomass
Colony diameter	Colony area, Colony shape factor, Coloniality
Reproduction	Polyp fecundity, Egg size, Colony fecundity, Mode of larval development, Propagule size, Sexual system, Symbiodinium
Photosynthesis	Symbiodinium density, Symbiodinium clade, Symbiodinium subclade, Zooxanthellate

2. Step - Imputation

The imputation step trials different methods and repeats each of them multiple times to obtain the best results. The code is saved as "ImputeMissingData[Trait name].R" and the results are saved in multiple dataframes named "[Trait name]ImputedData[Numbering].RData".

3. Step - TestImputation

To test the imputation results, I iteratively remove one known datapoint, refit the model to the reduced dataset, and predict the removed datapoint using the fitted model. I then calculate the mean +/- standard deviation of the predictions for each removed datapoint, which can be compared with its known value.

The code to conduct this analysis is named "CheckImputedData[Trait name].R".
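The leave-one-out check described in this step can be sketched as follows. The example is in Python for illustration only and does not reproduce "CheckImputedData[Trait name].R". The imputer here is a deliberately simple stand-in (the mean of a bootstrap resample of the remaining values), since the README does not name the methods actually used. The function names, the data, the number of repeats, and the seed are all hypothetical.

```python
import random
from statistics import mean, stdev

def impute_once(known, rng):
    """Stand-in imputer (an assumption): mean of a bootstrap resample."""
    resample = [rng.choice(known) for _ in known]
    return mean(resample)

def leave_one_out(values, repeats=20, seed=1):
    """For each known value, hide it, predict it `repeats` times from the
    remaining values, and return (known, mean prediction, sd of predictions)."""
    rng = random.Random(seed)
    results = []
    for i, truth in enumerate(values):
        remaining = values[:i] + values[i + 1:]
        preds = [impute_once(remaining, rng) for _ in range(repeats)]
        results.append((truth, mean(preds), stdev(preds)))
    return results

# Made-up trait values for five species.
values = [2.0, 2.5, 3.0, 3.5, 4.0]
loo = leave_one_out(values)
for truth, pred_mean, pred_sd in loo:
    print(f"known={truth:.2f}  predicted={pred_mean:.2f} +/- {pred_sd:.2f}")
```

Each printed row pairs a known value with the mean +/- standard deviation of its predictions, so a known value far outside that range flags a datapoint the imputation handles poorly.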
