Data information & exploratory analyses

Author

Juliette Archambeau

Published

June 17, 2024

The data

Data information

All the information below is taken from Beaton et al. (2022).

Seeds were collected in March 2007 from ten trees in each of 21 native Scottish P. sylvestris populations. Populations were chosen to represent the species' native range in Scotland and to include three populations from each of the seven seed zones.

Trees from the same population can be considered unrelated (see sampling strategy). Seedlings from the same mother tree are described as a family and are assumed to be half-siblings.

After growing in one of three nurseries (NW, NG and NE), trees were transplanted to one of three field sites in 2012:

  • Yair in the Scottish Borders (field site in the south of Scotland: FS, latitude 55.603625, longitude -2.893025). All trees transplanted to FS were raised in the NG.

  • Glensaugh (field site in the east of Scotland: FE, latitude 56.893567, longitude -2.535736). All but four of the trees transplanted to FE were raised locally in the NE (the remainder were grown in NG).

  • Inverewe (field site in the west of Scotland: FW, latitude 57.775714, longitude -5.597181). FW contains cohorts of trees raised in each of the three nurseries as follows: 290 were grown locally in the NW; 132 were grown in the NG; and 82 were grown in the NE.

At each site, trees were planted in randomised blocks at 3 m × 3 m spacing. There are four randomised blocks in both FS and FE and three in FW.

There are 168 families in total. Each block contains one individual from each of eight families (of the ten sampled) in each of the 21 populations, i.e. 8 × 21 = 168 trees per block.

Although most families (N = 159) were represented at each of the three sites, families with insufficient trees (N = 9) were replaced in one site (FS) with a different family from the same population.
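
As a quick sanity check, the design figures quoted above can be multiplied out. This is only a sketch: the variable names are ours, and the numbers are copied from the description above rather than read from the data files.

```r
# Design figures quoted from the text above (Beaton et al. 2022)
n_populations  <- 21
n_families_pop <- 8                      # families per population planted in each block
n_blocks       <- c(FS = 4, FE = 4, FW = 3)

trees_per_block <- n_populations * n_families_pop
trees_per_site  <- trees_per_block * n_blocks
total_trees     <- sum(trees_per_site)

# Nursery cohorts planted at FW, as listed in the site description
fw_cohorts <- c(NW = 290, NG = 132, NE = 82)

trees_per_block                                  # expected trees per block
trees_per_site                                   # expected trees per site
total_trees                                      # expected trees in the whole trial
sum(fw_cohorts) == trees_per_site[["FW"]]        # do the FW cohorts add up?
```

Under these figures, a full design has 168 trees per block, 672 trees in each of FS and FE, 504 in FW (which matches the sum of the three FW nursery cohorts) and 1848 trees overall.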

Important points

  • potential strong effect of nurseries on the phenotypes

  • potential effects of blocks

  • In FS, some families are not the same as in FE and FW.

  • FE has the harshest climate (it is the coldest site, with the shortest growing season) and FW the most favourable one: it is warmer and wetter, with more growing degree days per year and a much longer growing season than the other two sites (see Table 2 of Beaton et al. 2022).

Data loading

Field site data

Code
data <- read_excel(here("data/ScotsPine/Field.xlsx"), na = "NA")

Nursery data

Code
# Nursery dataset from Perry et al. (2022), described in Beaton et al. (2022)
nursery_data <- read_delim(here("data/ScotsPine/nurserytraits.txt"), show_col_types = FALSE) %>% 
  dplyr::select(PopulationCode, Family,Nursery,FieldCode) %>% 
  drop_na(FieldCode)

# data <- data %>% left_join(nursery_data, by=c("FieldCode","PopulationCode","Family")) 

In the nursery dataset from Perry et al. (2022), 1736 trees have a field code.

Update (02/06/2023): Annika found the field codes of the trees that had none in Perry et al. (2022). The updated file is loaded here:

Code
# Dataset updated by Annika - 02/06/2023
nursery_data <- read_excel(here("data/ScotsPine/SPnurseries_updatedByAnnika02062023.xlsx")) %>% 
  dplyr::rename(FieldCode=Tag)

We can then match the nursery and common garden data using the field code:

Code
data <- data %>% left_join(nursery_data, by=c("FieldCode")) 
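
Before relying on such a join, it can be worth checking that the key is unique in the nursery table and that every field tree finds a match. The sketch below uses base R and two small made-up tables (`field` and `nursery`, with hypothetical codes), not the real files.

```r
# Toy stand-ins for the field and nursery tables (hypothetical values)
field <- data.frame(FieldCode = c("T001", "T002", "T003", "T004"),
                    FieldSite = c("FE", "FE", "FW", "FS"))
nursery <- data.frame(FieldCode = c("T001", "T002", "T003"),
                      Nursery   = c("NE", "NE", "NW"))

# 1. A duplicated key in the right-hand table would duplicate field rows
stopifnot(anyDuplicated(nursery$FieldCode) == 0)

# 2. Field codes without a nursery record get NA in the joined columns
unmatched <- setdiff(field$FieldCode, nursery$FieldCode)

# 3. Base R equivalent of left_join(field, nursery, by = "FieldCode")
joined <- merge(field, nursery, by = "FieldCode", all.x = TRUE)

unmatched                        # codes to investigate
nrow(joined) == nrow(field)      # the join should not add or drop rows
```

In this toy example, `T004` has no nursery record, so its `Nursery` value is `NA` after the join while the number of rows is unchanged.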

Data formatting

We have to change the block names to specify that blocks are nested within sites:

Code
# site-specific blocks
data <- data %>% mutate(Block = paste0(FieldSite,"_",Block))
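
To illustrate why this renaming matters, here is a minimal sketch on a made-up data frame (`toy`, with hypothetical rows): when the same block letters are reused at every site, the raw labels would be treated as the same blocks across sites.

```r
# Toy design: the same block letters are reused at every site (hypothetical rows)
toy <- data.frame(FieldSite = rep(c("FE", "FS", "FW"), each = 2),
                  Block     = rep(c("A", "B"), times = 3))

# Block letters alone: blocks from different sites share a label
n_before <- length(unique(toy$Block))

# Site-specific labels: each block is identified within its site
toy$Block <- paste0(toy$FieldSite, "_", toy$Block)
n_after <- length(unique(toy$Block))

c(before = n_before, after = n_after)
```

Here the two shared letters become six distinct site-specific blocks.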

According to Beaton et al. (2022), there are four randomised blocks in both FS and FE and three in FW. We look at the number of individuals in each block:

Code
data %>% group_by(Block) %>% summarise("Number of individuals"=n()) %>% kable_mydf()
Block Number of individuals
FE_A 168
FE_B 168
FE_C 168
FE_D 168
FS_A 168
FS_B 168
FS_C 168
FS_D 168
FW_A 168
FW_B 168
FW_C 167
FW_c 1

We see that there is one typo in FW (one tree has its block recorded as c instead of C), which we correct:

Code
data <- data %>% mutate(Block=case_when(Block=="FW_c" ~ "FW_C",
                                TRUE ~ Block))

data %>% group_by(Block) %>% summarise("Number of individuals"=n()) %>% kable_mydf()
Block Number of individuals
FE_A 168
FE_B 168
FE_C 168
FE_D 168
FS_A 168
FS_B 168
FS_C 168
FS_D 168
FW_A 168
FW_B 168
FW_C 168
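
A more general way to spot this kind of typo is to look for labels that differ only by letter case. The sketch below is a base R illustration on the block labels listed before the correction; `by_upper` and `collisions` are names introduced here.

```r
# Block labels as they appear before the correction
blocks <- c("FE_A", "FE_B", "FE_C", "FE_D",
            "FS_A", "FS_B", "FS_C", "FS_D",
            "FW_A", "FW_B", "FW_C", "FW_c")

# Group the labels by their upper-case form and keep groups with > 1 spelling
by_upper   <- split(blocks, toupper(blocks))
collisions <- by_upper[lengths(by_upper) > 1]

collisions
```

Only `FW_C` is returned, with its two spellings `FW_C` and `FW_c`. Applying `toupper()` to the whole column would fix it too, but only if all legitimate labels are meant to be upper case.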

We save the dataset for further analyses.

Code
data %>% saveRDS(file=here("data/ScotsPine/formatted_CG_data.rds"))

Phenotypic traits

Meaning of the phenotypic variables:

Code
trait_names <- data %>% 
  dplyr::select(contains("HA"),contains("BD"),contains("BT")) %>% 
  colnames()

trait_names <- lapply(trait_names, function(x){
  
  if(grepl("HA",x)==TRUE){
    year <- str_sub(x,3,-1)
    trait_name <- paste0("Height in 20",year)
    list(code=x,name=trait_name)
  
    } else if(grepl("BD",x)==TRUE){
      year <- str_sub(x,3,-1)
      trait_name <- paste0("Duration of budburst in 20",year)
      list(code=x,name=trait_name)
      
    } else if(grepl("BT",x)==TRUE){
      year <- str_sub(x,3,-3)
      stage <- str_sub(x,6,-1)
      trait_name <- paste0("Time taken to reach stage ",stage," in 20", year)
      list(code=x,name= trait_name)
    }
}) %>% setNames(trait_names)

trait_names %>% 
  bind_rows() %>% 
  setNames(c("Variable code","Variable name")) %>% 
  kable_mydf()
Variable code Variable name
HA13 Height in 2013
HA14 Height in 2014
HA15 Height in 2015
HA16 Height in 2016
HA17 Height in 2017
HA18 Height in 2018
HA19 Height in 2019
HA20 Height in 2020
BD15 Duration of budburst in 2015
BD16 Duration of budburst in 2016
BD17 Duration of budburst in 2017
BD18 Duration of budburst in 2018
BD19 Duration of budburst in 2019
BT15_4 Time taken to reach stage 4 in 2015
BT15_5 Time taken to reach stage 5 in 2015
BT15_6 Time taken to reach stage 6 in 2015
BT16_4 Time taken to reach stage 4 in 2016
BT16_5 Time taken to reach stage 5 in 2016
BT16_6 Time taken to reach stage 6 in 2016
BT17_4 Time taken to reach stage 4 in 2017
BT17_5 Time taken to reach stage 5 in 2017
BT17_6 Time taken to reach stage 6 in 2017
BT18_4 Time taken to reach stage 4 in 2018
BT18_5 Time taken to reach stage 5 in 2018
BT18_6 Time taken to reach stage 6 in 2018
BT19_4 Time taken to reach stage 4 in 2019
BT19_5 Time taken to reach stage 5 in 2019
BT19_6 Time taken to reach stage 6 in 2019

Measurements

We check whether each trait was measured in the three common gardens.

Code
traits <- data %>% 
  dplyr::select(contains("HA"),contains("BD"),contains("BT")) %>% 
  colnames()

lapply(traits, function(x) 
  data %>% 
    dplyr::select(FieldSite, any_of(x)) %>% 
    drop_na(any_of(x)) %>% 
    pull(FieldSite) %>% 
    unique()) %>% 
  setNames(traits)
$HA13
[1] "FE" "FW"

$HA14
[1] "FE" "FW" "FS"

$HA15
[1] "FE" "FW" "FS"

$HA16
[1] "FE" "FW" "FS"

$HA17
[1] "FE" "FW" "FS"

$HA18
[1] "FE" "FW" "FS"

$HA19
[1] "FE" "FW" "FS"

$HA20
[1] "FE" "FW" "FS"

$BD15
[1] "FE" "FW" "FS"

$BD16
[1] "FE" "FW" "FS"

$BD17
[1] "FE" "FW" "FS"

$BD18
[1] "FE" "FW" "FS"

$BD19
[1] "FE" "FW" "FS"

$BT15_4
[1] "FE" "FW" "FS"

$BT15_5
[1] "FE" "FW" "FS"

$BT15_6
[1] "FE" "FW" "FS"

$BT16_4
[1] "FE" "FW" "FS"

$BT16_5
[1] "FE" "FW" "FS"

$BT16_6
[1] "FE" "FW" "FS"

$BT17_4
[1] "FE" "FW" "FS"

$BT17_5
[1] "FE" "FW" "FS"

$BT17_6
[1] "FE" "FW" "FS"

$BT18_4
[1] "FE" "FW" "FS"

$BT18_5
[1] "FE" "FW" "FS"

$BT18_6
[1] "FE" "FW" "FS"

$BT19_4
[1] "FE" "FW" "FS"

$BT19_5
[1] "FE" "FW" "FS"

$BT19_6
[1] "FE" "FW" "FS"

Height in 2013 (HA13) was not measured in FS, the field site in Yair (Scottish Borders). Every other trait has at least one measurement in each of the three sites.
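
The check above only tells us whether a trait has at least one value in a site. A possible complement, sketched here in base R on a made-up data frame (`toy`, with hypothetical values), is to count the non-missing measurements per trait and site, which also reveals traits that are only partly measured.

```r
# Toy data: HA13 is missing for every FS tree, HA14 is partly missing
# (hypothetical values)
toy <- data.frame(
  FieldSite = c("FE", "FE", "FW", "FW", "FS", "FS"),
  HA13      = c(120, 135, 150, NA, NA, NA),
  HA14      = c(160, 170, 190, 200, 155, NA)
)
traits <- c("HA13", "HA14")

# Number of non-missing measurements for each trait (rows) and site (columns)
n_measured <- sapply(split(toy[traits], toy$FieldSite),
                     function(d) colSums(!is.na(d)))
n_measured
```

A zero in this table corresponds to a trait that was not measured at all in that site, as for HA13 in FS here.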

Trait distribution

Code
data %>% 
  dplyr::select(names(trait_names)) %>% 
  pivot_longer(everything(),names_to="variable",values_drop_na = TRUE) %>% 
  ggplot(aes(x=value)) +  
  geom_histogram(aes(y=after_stat(density)), colour="blue",fill="white",bins = 100) +
  geom_density(alpha=.2,fill="pink") +
  xlab("") +
  facet_wrap(~variable,scales="free") + 
  theme_bw() 

References

Beaton, Joan, Annika Perry, Joan Cottrell, Glenn Iason, Jenni Stockan, and Stephen Cavers. 2022. “Phenotypic Trait Variation in a Long-Term Multisite Common Garden Experiment of Scots Pine in Scotland.” Scientific Data 9 (1): 671. https://doi.org/10.1038/s41597-022-01791-8.
Perry, A., J. K. Beaton, J. A. Stockan, J. E. Cottrell, G. R. Iason, and S. Cavers. 2022. “Long-Term Multisite Scots Pine Trial, Scotland: Nursery Phenotypes, 2007-2011.” NERC EDS Environmental Information Data Centre. https://doi.org/10.5285/29ced467-8e03-4132-83b9-dc2aa50537cd.