1 Model equation and code

In this document, we aimed to estimate the relationship between potential drivers (namely, two admixture scores, four indexes of environmental heterogeneity and two indexes of climatic harshness) and the within-population genetic variation.

We modeled each trait \(y_{bpcr}\) such as:

\[\begin{equation} \begin{aligned} y_{bpcr} & \sim \mathcal{N}(\mu_{bpc},\sigma^{2}_{r})\\[3pt] \mu_{bpc} & = \beta_{0} + B_{b} + P_{p} + C_{c(p)}\\[3pt] \end{aligned} \end{equation}\]

where \(\beta_{0}\) is the global intercept, \(B_{b}\) the block intercepts, \(P_{p}\) the population intercepts, \(C_{c(p)}\) the clone intercepts and \(\sigma^{2}_{r}\) the residual variance.

The prior of \(\beta_{0}\) was weakly informative and centered around the mean of the observed values for the trait under considered, as follows:

\[\beta_{0} \sim \mathcal{N}(\mu_{y},2)\]

The population and block intercepts, \(P_{p}\) and \(B_{b}\) were considered normally-distributed with variances \(\sigma^{2}_{P}\) and \(\sigma^{2}_{B}\), such as:

\[\begin{bmatrix} B_{b} \\ P_{p} \end{bmatrix} \sim \mathcal{N}\left(0, \begin{bmatrix}\sigma^{2}_{B}\\[3pt] \sigma^{2}_{P}\\ \end{bmatrix} \right)\\[3pt]\]

The clone intercepts \(C_{c(p)}\) were considered to follow some population-specific normal distributions, such as: \[C_{c(p)} \sim \mathcal{N}(0,\sigma^{2}_{C_{p}})\]

where \(\sigma^{2}_{C_{p}}\) are the population-specific variances among clones.

To partition the total variance, we parameterize our model so that only the total variance, \(\sigma_{tot}^{2}\) has a prior, such that:

\[\begin{equation} \begin{aligned} \sigma_{tot}^{2} & = \sigma_{r}^{2} + \sigma_{B}^{2} + \overline{\sigma_{C_{p}}^{2}} + \sigma_{P}^{2}\\[3pt] \sigma_{r} & = \sigma_{tot} \times \sqrt(\pi_{r})\\[3pt] \sigma_{B} & = \sigma_{tot} \times \sqrt(\pi_{B})\\[3pt] \sigma_{P} & = \sigma_{tot} \times \sqrt(\pi_{P})\\[3pt] \overline{\sigma_{C_{p}}} & = \sigma_{tot} \times \sqrt(\pi_{C})\\[3pt] \sigma_{tot} & \sim \mathcal{S}^{*}(0,1,3) \end{aligned} \end{equation}\]

where \(\overline{\sigma_{C_{p}}}\) and \(\overline{\sigma_{C_{p}}^{2}}\) are the mean of the population-specific among-clones standard deviations (\(\sigma_{C_{p}}\)) and variances (\(\sigma^{2}_{C_{p}}\)), respectively, and \(\sum_{l}^{4}\pi_{l}=1\) (using the simplex function in Stan).

The population-specific among-clones standard deviations \(\sigma_{C_{p}}\) follow a log-normal distribution with mean \(\overline{\sigma_{C_{p}}}\) and variance \(\sigma^{2}_{K}\), such as:

\[\begin{equation*} \begin{aligned} \sigma_{C_{p}} & \sim \mathcal{LN}\left(\ln(\overline{\sigma_{C_{p}}})-\frac{\sigma^{2}_{K}}{2} + \beta_{X}X_{p},\sigma^{2}_{K}\right)\\[3pt] \sigma_{K} & \sim \exp(1)\\[3pt] \end{aligned} \end{equation*}\]

with \(X_{p}\) the potential driver considered and \(\beta_{x}\) its associated coefficient.

Here is the Stan model code:

model = stan_model("scripts/StanModels/HierarchicalModel_VaryingInterClonesSD_OneCovariate.stan")
model

## S4 class stanmodel 'HierarchicalModel_VaryingInterClonesSD_OneCovariate' coded as follows:
## //Model 1:  model with the betaX * Xp on the median of the sigma_clon
## 
## data {
##   int<lower=1> N;                                                              // Number of observations
##   vector[N] y;                                                                 // Response variable
##   int<lower=0> nblock;                                                         // Number of blocks
##   int<lower=0> nprov;                                                          // Number of provenances
##   int<lower=0> nclon;                                                          // Number of clones
##   int<lower=0, upper=nblock> bloc[N];                                          // Blocks
##   int<lower=0, upper=nprov> prov[N];                                           // Provenances
##   int<lower=0, upper=nprov> which_prov[nclon];                                 // Provenances
##   int<lower=0, upper=nclon> clon[N];                                           // Clones
##   vector[nprov] X;                                                             // Potential predictor of sigma_clon
## }
## 
## 
## parameters {
##   real beta0; // global intercept
##   real betaX; // coefficient of Xp
##   simplex[4] pi;
##   real<lower = 0> sigma_tot;
##   real<lower = 0> sigma_K;
##   vector[nprov] z_log_sigma_clon;
##   
##   vector[nblock] z_block;
##   vector[nprov] z_prov;
##   vector[nclon] z_clon;
##   
## }
## 
## 
## transformed parameters {
##   real R_squared;
##   
##   real<lower = 0>  sigma_r;
##   real<lower = 0>  sigma_block;
##   real<lower = 0>  sigma_prov;
##   
##   real mean_sigma_clon;
##   vector[nprov] sigma_clon;
##   
##   vector[nprov] alpha_prov;
##   vector[nclon] alpha_clon;
##   vector[nblock] alpha_block;
##   
##   vector[N] mu; // linear predictor
##   
##   // variance partitioning with the simplex pi
##   sigma_r = sqrt(pi[1]) * sigma_tot;
##   sigma_block = sqrt(pi[2]) * sigma_tot;
##   sigma_prov = sqrt(pi[3]) * sigma_tot;
##   
##   mean_sigma_clon= sqrt(pi[4]) * sigma_tot;
##   sigma_clon = exp(log(mean_sigma_clon) - (square(sigma_K)/2)  + betaX*X + z_log_sigma_clon*sigma_K);
##   
##   alpha_prov = z_prov*sigma_prov;
##   
##   for(c in 1:nclon){
##     alpha_clon[c] =  z_clon[c]*sigma_clon[which_prov[c]];
##   }
##   
##   alpha_block = z_block*sigma_block;
##   
##   mu = rep_vector(beta0, N) + alpha_clon[clon] + alpha_block[bloc] + alpha_prov[prov];
##   R_squared = 1 - variance(y - mu) / variance(y);
## }
## 
## model{
##   //Priors
##   beta0 ~ normal(mean(y),2);
##   betaX ~ std_normal();
##   sigma_tot ~ student_t(3, 0.0, 1.0);
##   
##   z_prov ~ std_normal();
##   z_block ~ std_normal();
##   z_clon ~ std_normal();
##   
##   z_log_sigma_clon ~ std_normal();
##   sigma_K ~ exponential(1);
##   
##   // Likelihood
##   y ~ normal(mu, sigma_r);
## }
## 
## 
## generated quantities {
##   //Variances
##   real<lower=0> sigma2_r;
##   real<lower=0> sigma2_prov;
##   real<lower=0> sigma2_block;
##   vector[nprov] sigma2_clon;
##   
##   // Heritability
##   vector[nprov] h2_prov;
## 
##   // Posterior predictive check
##   vector[N] y_rep;
## 
##   // Log-Likelihood (for WAIC/loo computations)
##   vector[N] log_lik;
##   
##   // Bayesian R2
##   real bayes_R2_res; // residual based R2
##   real bayes_R2;     // model based R2, i.e.  using modelled (approximate) residual variance
## 
##   sigma2_r = square(sigma_r);
##   sigma2_prov = square(sigma_prov);
##   sigma2_clon = square(sigma_clon);
##   for(p in 1:nprov)  h2_prov[p] = sigma2_clon[p]/(sigma2_r+sigma2_clon[p]);
##   sigma2_block = square(sigma_block);
##   for(i in 1:N)  {
##     y_rep[i] = normal_rng(mu[i], sigma_r);
##     log_lik[i] = normal_lpdf(y[i]| mu[i], sigma_r); // log probability density function
##   }
##   
##   bayes_R2_res = variance(mu) / (variance(mu) + variance(y-mu));
##   bayes_R2 = variance(mu) / (variance(mu) + sigma2_r);
##   
## }
##

In the Stan code, we also calculate the posterior distribution of \(\mathcal{R}^2\) in three different ways.

The first way is the classical way of calculating \(\mathcal{R}^2\) in the frequentist framework. This classical \(\mathcal{R}^2\) corresponds to the proportion of variance explained by the model and is a commonly used measure of model fit. It is calculated as follows:

\[\mathcal{R}^2 = \frac{Var_{\mu}}{Var_y} \] where \(Var_{\mu}\) is the fitted variance, i.e. the variance of the modelled predicted means and \(Var_y\) is the total variance, i.e. the variance of the observations. This way of calculating \(\mathcal{R}^2\) was used in the tutorial "Modèle Hiérarchique avec Stan" of Matthieu Authier & Eric Parent.

However, according to Gelman et al. (2019) this definition of \(\mathcal{R}^2\) raises two problems in a Bayesian framework. First, it does not reflect the posterior uncertainty in the coefficients (which should remove or at least reduce the overfitting problem of least squares). Second, under some conditions(e.g. in the presence of strong prior information and weak data), the fitted variance might be higher than the total variance and thus \(\mathcal{R}^2\) might be greater than 1.

Therefore, Gelman et al. (2019) propose a Bayesian version of the classical \(\mathcal{R}^2\), which is defined as:

\[\mathcal{R}^2 = \frac{Var_{\mu}}{Var_{\mu}+Var_{res}} \] where \(Var_{\mu}\) is the variance of modelled predictive means and \(Var_{res}\) is the residual variance, which can be obtained with two different ways.

The residual based \(\mathcal{R}^2\) used draws from the residual distribution, such as:

\[Var_{res}=Var_{n=1}^{N}(y_n-\mu_n)\]

The model based \(\mathcal{R}^2\) used draws from the modelled residual variance. For instance, for linear regression:

\[Var_{res}=\sigma^2\] In this latter case, both \(Var_{\mu}\) and \(Var_{res}\) in the \(\mathcal{R}^2\) formula are computed only using posterior quantities from the fitted model.

2 Setting up

2.1 Functions used in the document

2.1.1 DownloadData()

DownloadData <- function(trait,sub,drivers){

# Trait transformations
#######################
  
    # Phenology-realted traits and delta13C are centered to help convergence
if(trait=="MeanBB"|trait=="MeanDBB"|trait=="d13C"){
    CenterResponseVariable <- TRUE
    LogResponseVariable <- FALSE
    
    # SLA is not centered and we used its logarithm as response variable
  }  else if(trait=="POR_SLA"){ 
    CenterResponseVariable <- FALSE
    LogResponseVariable <- TRUE
    
    # Height is not transformed
  } else { 
    CenterResponseVariable <- FALSE
    LogResponseVariable <- FALSE
  }


# Potential drivers
###################

df.drivers <- readRDS(file=paste0("data/DF_Drivers.rds")) %>% # downloading the dataframe with the drivers
    dplyr::select(all_of(c("prov",drivers))) %>% 
    dplyr::mutate(across(-prov,scale)) # scaling (mean-centering) the potential drivers


  
# Merging with the phenotypic data
##################################

if(trait=="MeanBB"|trait=="MeanDBB"){
      
  df <- readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds") %>% 
    dplyr::filter(!(prov=="MAD")) %>% 
    when(trait=="MeanBB" ~ dplyr::select(.,prov,clon,tree,block,BDX_s32013,BDX_s32014,BDX_s32015,BDX_s32017),
         ~ dplyr::select(.,prov,clon,tree,block,BDX_dbb2014,BDX_dbb2015,BDX_dbb2017)) %>% 
    filter(rowSums(is.na(across(where(is.numeric)))) != ncol(.)-4) %>% 
    dplyr::mutate(trait = rowMeans(dplyr::select(.,contains("BDX")),na.rm=T)) %>%
    execute_if(CenterResponseVariable, mutate(trait=trait-mean(trait))) %>% 
    execute_if(LogResponseVariable, mutate(trait=log(trait))) %>% 
    arrange(by=tree) %>% 
    inner_join(df.drivers,by="prov") 
  
  } else {
    
  df <- readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds") %>% 
    dplyr::filter(!(prov=="MAD")) %>% 
    dplyr::rename(trait=all_of(trait)) %>% 
    dplyr::select(prov,clon,tree,block,trait) %>% 
    drop_na() %>%  
    execute_if(CenterResponseVariable, mutate(trait=trait-mean(trait))) %>% 
    execute_if(LogResponseVariable, mutate(trait=log(trait))) %>% 
    arrange(by=tree) %>% 
    inner_join(df.drivers,by="prov")
  
  }
  
  
# Creating subsets
##################

  if(sub=="sub1"|sub=="sub2"){

  ClonesToKeep <- df %>% 
    dplyr::select(prov,clon) %>% 
    group_by(clon) %>% 
    mutate(NbInd = n()) %>% 
    distinct() %>%
    filter(NbInd>sub.ind)
  
  df <- df %>% semi_join(ClonesToKeep, by="clon") 
  
  PopsToKeep <- df %>%  
    dplyr::select(prov,clon) %>% 
    distinct() %>%  
    group_by(prov) %>%  
    dplyr::summarise(NbClones=n()) %>% 
    filter(NbClones>sub.clones)
  
  df <- df %>% semi_join(PopsToKeep, by="prov") 
  }


# Creating the Stan lists
#########################

list.stan <- lapply(drivers, function(x)    {
    list(N=length(df$tree),
         y=df$trait,
         X=as.numeric(pull(unique(df[,c(x,"prov")]) %>% dplyr::select(contains(x)))),
         nprov = length(unique(df$prov)),
         nclon = length(unique(df$clon)),
         nblock = length(unique(df$block)),
         prov = as.numeric(as.factor(df$prov)),
         which_prov = as.numeric(as.factor(pull(unique(df[c("prov","clon")])[,"prov"]))),
         clon = as.numeric(as.factor(df$clon)),
         bloc = as.numeric(as.factor(df$block)))})
  
names(list.stan) <- drivers
  
return(c(list(df=df),list.stan))
}

2.1.2 DownloadDataCTD()

# Only for all data (sub="alldata")
DownloadDataCTD <- function(trait){

if(trait=="MeanBB"|trait=="MeanDBB"|trait=="d13C"){
    CenterResponseVariable <- TRUE
    LogResponseVariable <- FALSE
    
  }  else if(trait=="POR_SLA"){
    CenterResponseVariable <- FALSE
    LogResponseVariable <- TRUE
  } else {
    CenterResponseVariable <- FALSE
    LogResponseVariable <- FALSE
  }

# Potential drivers
###################

df.drivers <- readRDS(file=paste0("data/DF_Drivers.rds")) %>% # downloading the dataframe with the drivers
    dplyr::select(c("prov",contains("MSP"),contains("MCMT"))) %>% 
    dplyr::mutate(across(-prov,scale)) # scaling (mean-centering) the potential drivers


if(trait=="MeanBB"|trait=="MeanDBB"){
    
  df <- readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds") %>% 
    dplyr::filter(!(prov=="MAD")) %>% 
    when(trait=="MeanBB" ~ dplyr::select(.,prov,clon,tree,block,BDX_s32013,BDX_s32014,BDX_s32015,BDX_s32017),
         ~ dplyr::select(.,prov,clon,tree,block,BDX_dbb2014,BDX_dbb2015,BDX_dbb2017)) %>% 
    dplyr::filter(rowSums(is.na(across(where(is.numeric)))) != ncol(.)-4) %>% 
    dplyr::mutate(trait = rowMeans(dplyr::select(.,contains("BDX")),na.rm=T)) %>%
    execute_if(CenterResponseVariable, mutate(trait=trait-mean(trait))) %>% 
    execute_if(LogResponseVariable, mutate(trait=log(trait))) %>% 
    arrange(by=tree) %>% 
    inner_join(df.drivers,by="prov") 
  
  } else {
    
  df <- readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds") %>% 
    dplyr::filter(!(prov=="MAD")) %>% 
    dplyr::rename(trait=all_of(trait)) %>% 
    dplyr::select(prov,clon,tree,block,trait) %>% 
    drop_na() %>%  
    execute_if(CenterResponseVariable, mutate(trait=trait-mean(trait))) %>% 
    execute_if(LogResponseVariable, mutate(trait=log(trait))) %>% 
    arrange(by=tree) %>% 
    inner_join(df.drivers,by="prov")
  
  }
  
  # Depending on the site where the traits were measured, we do not use the same CTD
if(trait=="AST_htnov12") CTD <- c("CTD_MCMT_height_asturias","CTD_MSP_height_asturias")
if(trait %in% c("POR_SLA","d13C","POR_htoct12")) CTD <- c("CTD_MCMT_height_SLA_d13C_portugal","CTD_MSP_height_SLA_d13C_portugal")
if(trait=="BDX_htnov13") CTD <- c("CTD_MCMT_height_bordeaux25","CTD_MSP_height_bordeaux25")
if(trait=="BDX_htnov18") CTD <- c("CTD_MCMT_height_bordeaux85","CTD_MSP_height_bordeaux85")
if(trait=="meanBB") CTD <- c("CTD_MCMT_meanBB_bordeaux","CTD_MSP_meanBB_bordeaux")
if(trait=="meanDBB") CTD <- c("CTD_MCMT_meanDBB_bordeaux","CTD_MSP_meanDBB_bordeaux")

  list.stan <- lapply(CTD, function(x)    {
    list(N=length(df$tree),
         y=df$trait,
         X=as.numeric(pull(unique(df[,c(x,"prov")]) %>% dplyr::select(contains(x)))),
         nprov = length(unique(df$prov)),
         nclon = length(unique(df$clon)),
         nblock = length(unique(df$block)),
         prov = as.numeric(as.factor(df$prov)),
         which_prov = as.numeric(as.factor(pull(unique(df[c("prov","clon")])[,"prov"]))),
         clon = as.numeric(as.factor(df$clon)),
         bloc = as.numeric(as.factor(df$block)))})
  
  names(list.stan) <- CTD
  
  return(list.stan)
}

2.1.3 ExtractParam() and ExtractPopSpecificParam()

ExtractParam <- function(x)  {
  
  readRDS(file=paste0("outputs/models/",x,"_",sub,".rds")) %>% 
  mclapply(function(x) broom::tidyMCMC(x,pars=param,
                droppars = NULL, 
                estimate.method = point.est, 
                ess = F, 
                rhat = F, 
                conf.int = T,
                conf.level = conf.level)) %>% 
  bind_rows(.id="Drivers") %>% 
  mutate(Traits=x)
}


ExtractPopSpecificParam <- function(trait){

  readRDS(file=paste0("outputs/models/",trait,"_",sub,".rds")) %>% 
  mclapply(function(driver) {
    broom::tidyMCMC(driver,pars=param,
                droppars = NULL, 
                estimate.method = point.est, 
                ess = F, 
                rhat = F, 
                conf.int = T,
                conf.level = conf.level) %>% 
      mutate(Traits=trait,
             prov=provs) }) %>%
  bind_rows(.id="Drivers") %>% 
  dplyr::filter(Drivers %in% selected.drivers) %>% 
  mutate(Drivers=factor(Drivers,levels=selected.drivers)) 
}

2.1.4 PlotParam()

PlotParam <- function(x){
  if(param == "betaX"){yaxis.label <- "$\\beta_{X}$ estimates"}
  
  p <- x %>% 
  ggplot(aes(x = Drivers, y = estimate,ymin = conf.low, ymax = conf.high,colour=Traits,shape=Traits)) +
  geom_pointinterval(position = position_dodge(width = .6),point_size=2.5,show.legend = c(size = TRUE)) +
  facet_grid(.~fct_relevel(Hyp,"Admixture","Climate harshness","Temporal heterogeneity","Spatial heterogeneity"),
             scales="free", space = "free") + 
  ylab(TeX(yaxis.label)) + xlab("") +
  scale_colour_manual(values = trait.colors,labels=trait.labels) +
  scale_shape_manual(values = trait.shapes,labels=trait.labels) +
  theme_bw() +
  theme(axis.text.x = element_text(size=13),
        axis.text.y = element_text(size=16),
        axis.title = element_text(size=16),
      legend.title=element_text(size=13), 
      legend.text=element_text(size=10),
      strip.text.x = element_text(size = 14),
      panel.grid.minor.x=element_blank(),
      panel.grid.major.x=element_blank()) +
    guides(color=guide_legend(ncol=1))
  
  return(p)

}

2.2 Options

# DOCUMENT OPTIONS
##################

# Sampling in Bayesian models
n_chains <- 4 # number of chains (MCMC)
n_iter <- 2500 # number of iterations
n_warm <- 1250 # number of iterations in the warm-up phase
n_thin <- 1 # thinning interval
save_warmup = FALSE 

theme_set(theme_bw(base_size = 20))

# Credible intervals
point.est="median"
conf.level <- 0.95

2.3 Parameters, variables

# For reproductibility
set.seed(444)

########################################
# Running the models on a filtered dataset or not? (modification after the first round of review in Heredity)
 sub = "alldata" # if the dataset is not filtered.

### Two filtered datasets:

# Subset 1: only populations represented by at least 4 clones and clones represented by at least 4 individuals.
# sub = "sub1"
# sub.clones = 3
# sub.ind = 3

# Subset 1: only populations represented by at least 4 clones and clones represented by at least 4 individuals.
# sub = "sub2"
# sub.clones = 10
# sub.ind = 6
########################################

# Dataset in which the potential drivers are
dfdrivers = "DF_Drivers"

# Vector of the names of the potential drivers
drivers = c("A","D", # population admixture indexes
            "SH.20km.PC1","SH.20km.PC2","SH.1km.PC1","SH.1km.PC2", # spatial environmental heterogeneity indexes
            "mean_bio6","mean_MCMT","mean_EMT","mean_bio14","mean_SHM","mean_MSP", # climate harhness indexes
            "var_bio6","var_MCMT","var_EMT","var_bio14","var_SHM","var_MSP") # temporal heterogeneity indexes

selected.drivers <- c("A","D",
                      "mean_MCMT","mean_MSP","var_MCMT","var_MSP",
                      "SH.20km.PC1","SH.20km.PC2","SH.1km.PC1","SH.1km.PC2")

selected.drivers.labels <- c("A","D",
                             "mean(MCMT)","mean(SP)",
                             "var(MCMT)","var(SP)",
                             "SH1[20km]","SH2[20km]","SH1[1.6km]","SH2[1.6km]")

# Downloading and scaling (mean-centering) the potential drivers
df.drivers <- readRDS(file=paste0("data/DF_Drivers.rds")) %>% # downloading the dataframe with the drivers
    dplyr::select(all_of(c("prov",drivers))) %>% 
    dplyr::mutate(across(-prov,scale)) # scaling (mean-centering) the potential drivers


# Traits
########

d13c <- parse(text = TeX("$\\delta^{13}C$ (Fundão)"))

if(sub=="sub2"){
  
  traits <- c("AST_htnov12","BDX_htnov13","BDX_htnov18","MeanBB","MeanDBB")
  trait.labels <- c("Height (Asturias, 21 months)",
                    "Height (Bordeaux, 25 months)",
                    "Height (Bordeaux, 85 months)",
                    "Mean bud burst date (Bordeaux)",
                    "Mean duration of bud burst (Bordeaux)")
  trait.colors <- c("#D73027","#F46D43", "#FDAE61","#5AAE61","#A6DBA0")
  trait.shapes <- c(16,16,16,17, 17)} else {
    
  traits <- c("POR_htoct12","AST_htnov12","BDX_htnov13","BDX_htnov18","MeanBB","MeanDBB","POR_SLA","d13C")
  trait.labels <- c("Height (Fundão, 20 months)",
                    "Height (Asturias, 21 months)",
                    "Height (Bordeaux, 25 months)",
                    "Height (Bordeaux, 85 months)",
                    "Mean bud burst date (Bordeaux)",
                    "Mean duration of bud burst (Bordeaux)",
                    "Specific Leaf Area (Fundão)",
                    d13c)
  trait.colors <- c("#B2182B","#D73027","#F46D43", "#FDAE61","#5AAE61","#A6DBA0" ,"#4575B4", "#ABD9E9")
  trait.shapes <- c(16,16,16,16, 17,17,15,15)}



# Populations
#############

# Extract the population names in the right order
provs <- readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds") %>% 
  dplyr::filter(!(prov=="MAD")) %>% 
  arrange(by=tree) %>% 
  distinct(prov) %>% 
  pull()


# Options for figures
######################

selected.drivers.colors <- c("#4D9221" ,"#7FBC41", "#C51B7D" ,"#DE77AE",
                              "#E08214" ,"#FDB863",
                              "#2D004B","#542788" ,"#8073AC" ,"#B2ABD2")

3 Running and saving the models

For each of the eight trait, we ran eight models, one for each potential driver:

for(trait in traits){

if(sub=="sub2"){
stanlist.drivers <- DownloadData(trait,sub,drivers=selected.drivers)
} else if(sub=="alldata"|sub=="sub1"){
stanlist.drivers <- DownloadData(trait,sub,drivers)
}

list.models <- list()

for(i in names(stanlist.drivers)[-1]){
  
  list.models[[i]] <- sampling(model, 
                               data = stanlist.drivers[[i]], 
                               pars=c("beta0","betaX",
                                      "pi","sigma2_r","sigma2_block","sigma2_prov","sigma2_clon",
                                      "h2_prov",
                                      "sigma_tot","sigma_block","sigma_prov",
                                      "mean_sigma_clon","sigma_clon","sigma_K",
                                      #"alpha_clon","alpha_prov",#"alpha_block",
                                      # "y_rep" not included not to have a too heavy file
                                      "R_squared","bayes_R2_res","bayes_R2","log_lik"),
                               iter = n_iter, 
                               chains = n_chains, 
                               cores = n_chains,
                               save_warmup = save_warmup,
                               thin=n_thin)
}
saveRDS(list.models,file=paste0("outputs/models/",trait,"_",sub,".rds"))
}

We use the loo package to compare the models based on the ELPD.

if(sub=="alldata"){ # we only ran this analysis for the unfiltered dataset
loos <- mclapply(list.models,loo)
loocompare <- loo_compare(x=loos)
saveRDS(loocompare,file=paste0("outputs/loo/",trait,"_",sub,".rds"))}

4 Posterior distributions

4.1 \(\beta_{X}\)

After Heredity second review

param <- "betaX"

df.betaX <- mclapply(traits,ExtractParam) %>% 
  bind_rows() %>%
  mutate(Drivers=case_when(Drivers=="SH.20km.PC1"~"SH1[20km]",
                      Drivers=="SH.20km.PC2"~"SH2[20km]",
                      Drivers=="SH.1km.PC1"~"SH1[1.6km]",
                      Drivers=="SH.1km.PC2"~"SH2[1.6km]",
                      Drivers=="mean_bio6"~"mean(bio6)",
                      Drivers=="mean_MCMT"~"mean(MCMT)",
                      Drivers=="mean_EMT"~"mean(EMT)",
                      Drivers=="mean_bio14"~"mean(bio14)",
                      Drivers=="mean_MSP"~"mean(SP)",
                      Drivers=="mean_SHM"~"mean(SHM)",
                      Drivers=="var_bio6"~"var(bio6)",
                      Drivers=="var_MCMT"~"var(MCMT)",
                      Drivers=="var_EMT"~"var(EMT)",
                      Drivers=="var_bio14"~"var(bio14)",
                      Drivers=="var_MSP"~"var(SP)",
                      Drivers=="var_SHM"~"var(SHM)",
                      TRUE ~ as.character(Drivers)),
         Hyp=case_when(Drivers=="A"~"Admixture",
                       Drivers=="D"~"Admixture",
                       Drivers=="SH1[20km]"~"Spatial heterogeneity",
                       Drivers=="SH2[20km]"~"Spatial heterogeneity",
                       Drivers=="SH1[1.6km]"~"Spatial heterogeneity",
                       Drivers=="SH2[1.6km]"~"Spatial heterogeneity",
                       Drivers=="mean(bio6)"~"Climate harshness",
                       Drivers=="mean(MCMT)"~"Climate harshness",
                       Drivers=="mean(EMT)"~"Climate harshness",
                       Drivers=="mean(bio14)"~"Climate harshness",
                       Drivers=="mean(SHM)"~"Climate harshness",
                       Drivers=="mean(SP)"~"Climate harshness",
                       Drivers=="var(bio6)"~"Temporal heterogeneity",
                       Drivers=="var(MCMT)"~"Temporal heterogeneity",
                       Drivers=="var(EMT)"~"Temporal heterogeneity",
                       Drivers=="var(bio14)"~"Temporal heterogeneity",
                       Drivers=="var(SHM)"~"Temporal heterogeneity",
                       Drivers=="var(SP)"~"Temporal heterogeneity"),
         TraitType=case_when(Traits=="POR_htoct12"~"Morphological",
                             Traits=="BDX_htnov13"~"Morphological",
                             Traits=="AST_htnov12"~"Morphological",
                             Traits=="BDX_htnov18"~"Morphological",
                             Traits=="MeanBB"~"Phenological",
                             Traits=="MeanDBB"~"Phenological",
                             Traits=="POR_SLA"~"Functional",
                             Traits=="d13C"~"Functional"),
         Traits=factor(Traits,levels=c(traits)))


if(sub=="sub2"){width.png.selected.drivers=17} else {
  width.png.selected.drivers=18}


# Generating the plot with all drivers
if(sub=="sub1"|sub=="alldata"){PlotParam(df.betaX) %>% ggsave(file = paste0("figs/PosteriorDistributions/AllTestedDrivers_",sub,".png"),width=28,height=12)}

# Generating the plot with only the selected drivers
df.betaX %>% 
  dplyr::filter(Drivers %in% selected.drivers.labels) %>% 
  dplyr::mutate(Drivers=factor(Drivers,levels=selected.drivers.labels)) %>% 
  PlotParam %>% 
  ggsave(file = paste0("figs/PosteriorDistributions/SelectedDrivers_",sub,".png"),width=width.png.selected.drivers,height=10)

4.2 Total genetic variances \(\sigma^{2}_{C_{p}}\)

param = "sigma2_clon"

mclapply(traits, function(trait){

p <- ExtractPopSpecificParam(trait) %>% 
   ggplot(aes(x = prov, y = estimate,ymin = conf.low, ymax = conf.high,color=Drivers)) +
  geom_pointinterval(position = position_dodge(width = .8),point_size=2,alpha=0.6,size=4) +
  scale_color_manual(values=selected.drivers.colors,
                     labels=selected.drivers.labels) +
  xlab("Populations") +
  ylab(TeX("$\\sigma^{2}_{C_{p}}$ estimates")) +
  labs(color = "Potential drivers") +
  theme(legend.position = "bottom",
        panel.grid.minor.x=element_blank(),
        panel.grid.major.x=element_blank())  +
    guides(color=guide_legend(ncol=5))
  
    ggsave(p,file=paste0("figs/PosteriorDistributions/",trait,"_SigmaClon_",sub,".png"),height=6,width=22) })

4.3 \(\sigma_{C_{p}}\) against \(X_{p}\) (scatterplots)

param = "sigma_clon"

df.sigmaclon <- mclapply(traits, function(trait){

p <- ExtractPopSpecificParam(trait) %>% 
  left_join(df.drivers[,c("prov",selected.drivers)],by="prov") %>% 
  mutate(driver.value = case_when(
    Drivers=="A" ~ A,
    Drivers=="D" ~ D,
    Drivers=="SH.20km.PC1" ~ SH.20km.PC1,
    Drivers=="SH.20km.PC2" ~ SH.20km.PC2,
    Drivers=="SH.1km.PC1" ~ SH.1km.PC1,
    Drivers=="SH.1km.PC2" ~ SH.1km.PC2,
    Drivers=="mean_MCMT" ~ mean_MCMT,
    Drivers=="mean_MSP" ~ mean_MSP,
    Drivers=="var_MCMT" ~ var_MCMT,
    Drivers=="var_MSP" ~ var_MSP,
  )) %>% 
  dplyr::select(-A,-D, -contains("km.PC"),-contains("_bio")) %>% 
  mutate(Drivers= case_when(
    Drivers=="SH.20km.PC1" ~ "SH1[20km]",
    Drivers=="SH.20km.PC2" ~ "SH2[20km]",
    Drivers=="SH.1km.PC1" ~ "SH1[1.6km]",
    Drivers=="SH.1km.PC2" ~ "SH2[1.6km]",
    Drivers=="mean_MCMT" ~ "mean(MCMT)",
    Drivers=="mean_MSP" ~ "mean(SP)",
    Drivers=="var_MCMT" ~ "var(MCMT)",
    Drivers=="var_MSP" ~ "var(SP)",
    Drivers=="A" ~ "A",
    Drivers=="D" ~ "D"))

})

names(df.sigmaclon) <- traits


# Create df with median, high and low 95% credible interval of betaX, sigma_K and mean_sigma_clon
df.regression.lines <- mclapply(traits,function(x){
readRDS(file=paste0("outputs/models/",x,"_",sub,".rds")) %>% 
   mclapply(function(driver) {
     broom::tidyMCMC(driver,pars=c("betaX","sigma_K","mean_sigma_clon"),
                droppars = NULL, 
                estimate.method = point.est, 
                ess = F, 
                rhat = F, 
                conf.int = T,
                conf.level = conf.level) %>% 
      mutate(Traits=x) }) %>% 
  bind_rows(.id="Drivers") %>% 
  dplyr::filter(Drivers %in% selected.drivers) %>% 
  mutate(Drivers=factor(Drivers,levels=selected.drivers)) %>% 
  dplyr::select(-std.error) %>% 
    mutate(Drivers= case_when(
    Drivers=="SH.20km.PC1" ~ "SH1[20km]",
    Drivers=="SH.20km.PC2" ~ "SH2[20km]",
    Drivers=="SH.1km.PC1" ~ "SH1[1.6km]",
    Drivers=="SH.1km.PC2" ~ "SH2[1.6km]",
    Drivers=="mean_MCMT" ~ "mean(MCMT)",
    Drivers=="mean_MSP" ~ "mean(SP)",
    Drivers=="var_MCMT" ~ "var(MCMT)",
    Drivers=="var_MSP" ~ "var(SP)",
    Drivers=="A" ~ "A",
    Drivers=="D" ~ "D")) %>% 
  pivot_wider(names_from = term,values_from=c(estimate,conf.low,conf.high))})

names(df.regression.lines) <- traits

# List with y-axis limits for each trait
list.ylim <- list()
list.ylim[["POR_htoct12"]] <- c(-3,5.5)
list.ylim[["BDX_htnov13"]] <- c(2.5,5.5)
list.ylim[["BDX_htnov18"]] <- c(5.2,7)
list.ylim[["AST_htnov12"]] <- c(2.2,5.2)
list.ylim[["MeanBB"]] <- c(2.5,5)
list.ylim[["MeanDBB"]] <- c(2.8,4.6)
list.ylim[["POR_SLA"]] <- c(-4.8,-1.5)
list.ylim[["d13C"]] <- c(-8,1)



# Plot one graph for each trait (with eight subgraphs corresponding to each driver)
mclapply(traits,function(x){ 


scatterplots <- mclapply(selected.drivers.labels, function(driver){
  
  df2 <- df.regression.lines[[x]]
  mean_sigma_clon <-  as.numeric(df2[df2$Drivers==driver,"estimate_mean_sigma_clon"])
  sigma_K <- as.numeric(df2[df2$Drivers==driver,"estimate_sigma_K"])
  betaX.median <- as.numeric(df2[df2$Drivers==driver,"estimate_betaX"])
  betaX.low <- as.numeric(df2[df2$Drivers==driver,"conf.low_betaX"])
  betaX.high <- as.numeric(df2[df2$Drivers==driver,"conf.high_betaX"])
  
  df.sigmaclon[[x]] %>% 
  dplyr::filter(Drivers== driver) %>% 
  ggplot(aes(x=driver.value, y=log(estimate))) + 
  geom_point()+
  xlab(driver) + ylab(TeX("$log(\\sigma_{C_p})$")) +
  ylim(list.ylim[[x]]) +
  
  # low crebible interval of betaX
  geom_function(colour ="blue", linetype="dashed", size=1, fun = function(x) log(mean_sigma_clon) - ((sigma_K*sigma_K)/2)  + betaX.low*x) +
  
  # high credible interval of betaX
  geom_function(colour ="forestgreen", linetype="dashed", size=1, fun = function(x) log(mean_sigma_clon) - ((sigma_K*sigma_K)/2)  + betaX.high*x) +
    
  # median estimate of betaX
  geom_function(colour= "red", size=1,fun = function(x) log(mean_sigma_clon) - ((sigma_K*sigma_K)/2)  + betaX.median*x) +

  panel_border() # a border around each panel
})

names(scatterplots) <- selected.drivers.labels

scatterplots[[2]] <- scatterplots[[2]] + ylab("")
scatterplots[[3]] <- scatterplots[[3]] + ylab("")
scatterplots[[4]] <- scatterplots[[4]] + ylab("")
scatterplots[[5]] <- scatterplots[[5]] + ylab("")
scatterplots[[7]] <- scatterplots[[7]] + ylab("")
scatterplots[[8]] <- scatterplots[[8]] + ylab("")
scatterplots[[9]] <- scatterplots[[9]] + ylab("")
scatterplots[[10]] <- scatterplots[[10]] + ylab("")

p <- plot_grid(plotlist=scatterplots, nrow=2)

ggsave(p,file=paste0("figs/PosteriorDistributions/",x,"_ScatterPlots_SigmaClon_Drivers_",sub,".png"),height=10,width=18)

  })

4.4 Broad-sense heritabilities \(H^2_{p}\)

We calculated population-specific broad-sense heritabilities, such as:

\[ H^2_{p} = \frac{\sigma^{2}_{C_{p}}}{\sigma^{2}_{C_{p}} + \sigma^{2}_r} \]

param = "h2_prov"

mclapply(traits, function(trait){

p <- ExtractPopSpecificParam(trait) %>% 
     ggplot(aes(x = prov, y = estimate,ymin = conf.low, ymax = conf.high,color=Drivers)) +
  geom_pointinterval(position = position_dodge(width = .8),point_size=2,alpha=0.6,size=4) +
  scale_color_manual(values=selected.drivers.colors,
                     labels=selected.drivers.labels) +
  xlab("Populations") +
  ylab(TeX("$\\H^{2}_{p}$ estimates")) +
  labs(color = "Potential drivers") +
  theme(legend.position = "bottom",
        panel.grid.minor.x=element_blank(),
        panel.grid.major.x=element_blank())  +
    guides(color=guide_legend(ncol=5))

ggsave(p,file=paste0("figs/PosteriorDistributions/",trait,"_H2prov_",sub,".png"),height=6,width=22) })

For the discussion, we calculated the mean heritability values of each trait

param = "h2_prov"

h2means <- mclapply(traits, function(trait){

tab <- ExtractPopSpecificParam(trait)

mean(tab$estimate)

})

names(h2means) <- traits
h2means

## $POR_htoct12
## [1] 0.06930765
## 
## $AST_htnov12
## [1] 0.114654
## 
## $BDX_htnov13
## [1] 0.1516421
## 
## $BDX_htnov18
## [1] 0.2709387
## 
## $MeanBB
## [1] 0.3420657
## 
## $MeanDBB
## [1] 0.2315267
## 
## $POR_SLA
## [1] 0.08540636
## 
## $d13C
## [1] 0.07378368

4.5 Variance partitioning

param="pi"

mclapply(traits, function(trait){

p <- ExtractParam(trait) %>% 
  dplyr::filter(Drivers %in% selected.drivers) %>% 
  mutate(Drivers=case_when(Drivers=="SH.20km.PC1"~"SH1[20km]",
                      Drivers=="SH.20km.PC2"~"SH2[20km]",
                      Drivers=="SH.1km.PC1"~"SH1[1.6km]",
                      Drivers=="SH.1km.PC2"~"SH2[1.6km]",
                      Drivers=="mean_MCMT" ~ "mean(MCMT)",
                      Drivers=="mean_MSP" ~ "mean(SP)",
                      Drivers=="var_MCMT" ~ "var(MCMT)",
                      Drivers=="var_MSP" ~ "var(SP)",
                      TRUE ~ as.character(Drivers)),
         prop.var=rep(c("sigma_r","sigma_block","sigma_prov","mean_sigma_clon"),length(selected.drivers))) %>% 
  mutate(prop.var=factor(prop.var,levels=c("sigma_block","sigma_prov","mean_sigma_clon","sigma_r")),
         Drivers=factor(Drivers,levels=selected.drivers.labels)) %>% 
  
  ggplot(aes(x = prop.var, y = estimate,ymin = conf.low, ymax = conf.high,color=Drivers)) +
  geom_pointinterval(position = position_dodge(width = .8),point_size=3,alpha=0.6,size=5) +
  scale_color_manual(values=selected.drivers.colors) +
  xlab("") +
  ylab("Proportion of variance explained") +
  labs(color = "Potential drivers") +
  scale_x_discrete(labels=c("Blocks",
                            "Populations",
                            "Clones",
                            "Residuals")) +
  theme(axis.text = element_text(size=25),
        panel.grid.minor.x=element_blank(),
        panel.grid.major.x=element_blank())  +
    guides(color=guide_legend(ncol=1))

ggsave(p,file=paste0("figs/PosteriorDistributions/",trait,"_VarPart_",sub,".png"),height=6,width=18)

})

# Variance partitioning for height only
#######################################
heights <- str_subset(traits,"ht")

plots <- mclapply(heights, function(trait){

  ExtractParam(trait) %>% 
  dplyr::filter(Drivers %in% selected.drivers) %>% 
  mutate(Drivers=case_when(Drivers=="SH.20km.PC1"~"SH1[20km]",
                      Drivers=="SH.20km.PC2"~"SH2[20km]",
                      Drivers=="SH.1km.PC1"~"SH1[1.6km]",
                      Drivers=="SH.1km.PC2"~"SH2[1.6km]",
                      Drivers=="mean_MCMT" ~ "mean(MCMT)",
                      Drivers=="mean_MSP" ~ "mean(SP)",
                      Drivers=="var_MCMT" ~ "var(MCMT)",
                      Drivers=="var_MSP" ~ "var(SP)",
                      TRUE ~ as.character(Drivers)),
         prop.var=rep(c("sigma_r","sigma_block","sigma_prov","mean_sigma_clon"),length(selected.drivers))) %>% 
  mutate(prop.var=factor(prop.var,levels=c("sigma_block","sigma_prov","mean_sigma_clon","sigma_r")),
         Drivers=factor(Drivers,levels=selected.drivers.labels)) %>% 
  
  ggplot(aes(x = prop.var, y = estimate,ymin = conf.low, ymax = conf.high,color=Drivers)) +
  geom_pointinterval(position = position_dodge(width = .8),point_size=4,alpha=0.6,size=8) +
  scale_color_manual(values=selected.drivers.colors) +
  xlab("") +
  ylim(0,0.9) +
  ylab("Proportion of variance explained") +
  labs(color = "Potential drivers") +
  scale_x_discrete(labels=c("Blocks",
                            "Populations",
                            "Clones",
                            "Residuals"
                            )) +
  theme(legend.position = c(.32, .76),
        legend.text = element_text(size=15,margin = margin(t = 10)),
        legend.background = element_rect(colour = "grey"),
        axis.text = element_text(size=22),
        panel.grid.minor.x=element_blank(),
        panel.grid.major.x=element_blank())  +
    guides(color=guide_legend(ncol=2))
})
  
plots[[2]] <- plots[[2]] + ylab("") + theme(legend.position = "none")
plots[[3]] <- plots[[3]] + theme(legend.position = "none")
plots[[4]] <- plots[[4]] + ylab("") + theme(legend.position = "none")

p <- plot_grid(plots[[1]],plots[[2]],plots[[3]],plots[[4]],
               labels = c('A)','B)',"C)","D)"),
               label_size = 20 )

p

ggsave(p,file=paste0("figs/PosteriorDistributions/VarPart_AllHeights_",sub,".png"),height=14,width=17)

4.6 \(\mathcal{R}^{2}\)

params <- c("R_squared","bayes_R2_res","bayes_R2")

for(param in params){

p <- traits %>% 
  mclapply(ExtractParam) %>% 
  bind_rows() %>%
  dplyr::filter(Drivers %in% selected.drivers) %>% 
  mutate(trait.class=case_when(Traits=="POR_htoct12"~"Height",
                       Traits=="BDX_htnov13"~"Height",
                       Traits=="BDX_htnov18"~"Height",
                       Traits=="AST_htnov12"~"Height",
                       Traits=="MeanBB"~"Phenology",
                       Traits=="MeanDBB"~"Phenology",
                       Traits=="POR_SLA"~"Functional traits",
                       Traits=="d13C"~"Functional traits"),
         trait.names=case_when(Traits=="POR_htoct12"~"Fundão 20m",
                       Traits=="BDX_htnov13"~"Bordeaux 25m",
                       Traits=="BDX_htnov18"~"Bordeaux 85m",
                       Traits=="AST_htnov12"~"Asturias 21m",
                       Traits=="MeanBB"~"MeanBB",
                       Traits=="MeanDBB"~"MeanDBB",
                       Traits=="POR_SLA"~"SLA",
                       Traits=="d13C"~"d13C"),
         Traits=factor(Traits, levels=traits),
         Drivers=case_when(Drivers=="SH.20km.PC1"~"SH1[20km]",
                      Drivers=="SH.20km.PC2"~"SH2[20km]",
                      Drivers=="SH.1km.PC1"~"SH1[1.6km]",
                      Drivers=="SH.1km.PC2"~"SH2[1.6km]",
                      Drivers=="mean_MCMT" ~ "mean(MCMT)",
                      Drivers=="mean_MSP" ~ "mean(SP)",
                      Drivers=="var_MCMT" ~ "var(MCMT)",
                      Drivers=="var_MSP" ~ "var(SP)",
                      TRUE ~ as.character(Drivers)),
         Drivers=factor(Drivers,levels=selected.drivers.labels)) %>% 
  ggplot(aes(x = trait.names, y = estimate,
                   ymin = conf.low, 
                   ymax = conf.high,
                   color=fct_relevel(Drivers,levels=selected.drivers.labels))) +
  geom_pointinterval(position = position_dodge(width = .6),point_size=2.5,alpha=0.6) +
  facet_grid(.~trait.class,scales="free", space = "free") + 
  ylab(TeX("$R^{2}$")) + 
  xlab("") +
  labs(color = "Potential drivers") +
  scale_color_manual(values=selected.drivers.colors) +
  theme_bw() +
  theme(axis.text.x = element_text(size=13),
        axis.text.y = element_text(size=16),
        axis.title = element_text(size=16),
      legend.title=element_text(size=13), 
      legend.text=element_text(size=10),
      legend.position = c(.87, .19),
      legend.background = element_rect(colour = "grey"),
      strip.text.x = element_text(size = 14),
      panel.grid.minor.x=element_blank(),
      panel.grid.major.x=element_blank()) +
    guides(color=guide_legend(ncol=2))
  
  ggsave(p, file=paste0("figs/PosteriorDistributions/",param,"_",sub,".png"),height=7,width=13)
}

5 Correlation btw the number of clones per pop and \(\sigma^{2}_{C_{p}}\)

param <- "sigma2_clon"

# Extrating the sigma2_clon (i.e. within-population genetic variation)
sigma2_clon <- mclapply(traits, ExtractPopSpecificParam) %>% 
  setNames(traits)
  
# Function to extract the number of clones 
NbClonePerProv <- function(trait){

  df.drivers <- df.drivers %>% 
    dplyr::select(all_of(c("prov",selected.drivers)))
  
  if(trait=="MeanBB"|trait=="MeanDBB"){
      
  df <- readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds") %>% 
    dplyr::filter(!(prov=="MAD")) %>% 
    when(trait=="MeanBB" ~ dplyr::select(.,prov,clon,tree,block,BDX_s32013,BDX_s32014,BDX_s32015,BDX_s32017),
         ~ dplyr::select(.,prov,clon,tree,block,BDX_dbb2014,BDX_dbb2015,BDX_dbb2017)) %>% 
    filter(rowSums(is.na(across(where(is.numeric)))) != ncol(.)-4) %>% 
    dplyr::select(prov,clon) %>% 
    distinct(.) %>% 
    group_by(prov) %>% 
    dplyr::summarise(count=n())
  
  } else {
    
  df <- readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds") %>% 
    dplyr::filter(!(prov=="MAD")) %>% 
    dplyr::rename(trait=all_of(trait)) %>% 
    dplyr::select(prov,clon,tree,block,trait) %>% 
    drop_na() %>%  
    dplyr::select(prov,clon) %>% 
    distinct(.) %>% 
    group_by(prov) %>% 
    dplyr::summarise(count=n())
  
  }}

# Extracting the number of clones per population
count <- mclapply(traits,NbClonePerProv) %>% setNames(traits)

df <- mclapply(traits, function(x){
  mclapply(selected.drivers,function(driver)
  paste(round(cor(count[[x]]$count,sigma2_clon[[x]]$estimate[sigma2_clon[[x]]$Drivers==driver]),3)," [", # median estimate
       round(cor(count[[x]]$count,sigma2_clon[[x]]$conf.low[sigma2_clon[[x]]$Drivers==driver]),3),",",   # low 95% CI
       round(cor(count[[x]]$count,sigma2_clon[[x]]$conf.high[sigma2_clon[[x]]$Drivers==driver]),3),"] ") # high 95% CI
  ) %>% 
  setNames(selected.drivers.labels)}) %>% 
  setNames(traits) %>% 
  bind_rows(.id="Traits")

# Save as a latex table for the Supplementary Information
print(xtable(df, type = "latex",digits=3), 
      file = paste0("tables/Correlation_NbClonesPerProv_Sigma2clon.tex"), 
      include.rownames=FALSE)

6 Climatic transfer distances

CTDbetaX <- lapply(traits, function(trait) {
  
  list.stan <- DownloadDataCTD(trait)
  
  
  output <- lapply(names(list.stan), function(x){
    
    mod <- sampling(model, 
                  data = list.stan[[x]], 
                  pars=c("betaX"),
                  iter = n_iter, 
                  chains = n_chains, 
                  cores = n_chains,
                  save_warmup = save_warmup,
                  thin=n_thin)
  
    broom::tidyMCMC(mod,pars=(c("betaX")),
                  droppars = NULL, estimate.method = "median", ess = T, rhat = T, conf.int = T,conf.level = 0.95) %>% 
      dplyr::mutate(driver=x)
    
  }) %>% bind_rows
  
  
  })

names(CTDbetaX) <- traits

saveRDS(CTDbetaX,file=paste0("outputs/models/CTDbetaX.rds"))

CTDbetaX <- readRDS(file=paste0("outputs/models/CTDbetaX.rds")) %>% 
  bind_rows(.id="Traits") %>%
  mutate(Drivers=case_when(str_detect(driver,"MCMT")==TRUE ~ paste0("CTD(",str_sub(driver,5,8),")"),
                           str_detect(driver,"MSP")==TRUE ~ paste0("CTD(",str_sub(driver,6,7),")")),
         Traits=factor(Traits,levels=traits))


p <- CTDbetaX  %>%   ggplot(aes(x = Drivers, y = estimate,ymin = conf.low, ymax = conf.high,colour=Traits,shape=Traits)) +
  geom_pointinterval(position = position_dodge(width = .6),point_size=2.5,size=3,show.legend = c(size = TRUE)) +
  ylab(TeX("$\\beta_{X}$ estimates")) + xlab("") +
  scale_colour_manual(labels = trait.labels, # 
                     values=trait.colors) +
  scale_shape_manual(labels = trait.labels,
                     values = trait.shapes) +
  theme_bw() +
  theme(axis.text.x = element_text(size=13),
        axis.text.y = element_text(size=16),
        axis.title = element_text(size=16),
      legend.title=element_text(size=13), 
      legend.text=element_text(size=10),
      strip.text.x = element_text(size = 14),
      panel.grid.minor.x=element_blank(),
      panel.grid.major.x=element_blank()) +
    guides(color=guide_legend(ncol=1,override.aes = list(size=2)))
p

ggsave(p,file="figs/PosteriorDistributions/CTD_betaX.png",height=6,width=9)

7 Genetic diversity

We want to estimate the relationship between within-population quantitative genetic variation and genetic diversity (expected Nei’s heterozygosity \(H_e\)), to check that demographic factors do not influence quantitative genetic variation. For that, we take the Nei’s heterozygosity \(H_e\) values from Rodríguez-Quilón et al. (2016), for both nuSSRs (12 nuclear microsatellites) and SNPs (266 SNP markers).

df.drivers <- readRDS(file="data/DF_Drivers.rds") %>% 
    dplyr::select(prov) %>% 
    dplyr::mutate(He.nuSSRs=case_when(prov=="HOU" ~ 0.55,
                                      prov=="VER" ~ 0.57,
                                      prov=="MIM" ~ 0.58,
                                      prov=="OLO" ~ 0.57,
                                      prov=="PET" ~ 0.55,
                                      prov=="PLE" ~ 0.61,
                                      prov=="STJ" ~ 0.60,
                                      prov=="ALT" ~ 0.66,
                                      prov=="ARM" ~ 0.56,
                                      prov=="CAD" ~ 0.55,
                                      prov=="CAS" ~ 0.54,
                                      prov=="LAM" ~ 0.50,
                                      prov=="LEI" ~ 0.62,
                                      prov=="PUE" ~ 0.54,
                                      prov=="SAC" ~ 0.59,
                                      prov=="SEG" ~ 0.56,
                                      prov=="SIE" ~ 0.56,
                                      prov=="ARN" ~ 0.58,
                                      prov=="BAY" ~ 0.58,
                                      prov=="BON" ~ 0.57,
                                      prov=="CAR" ~ 0.65,
                                      prov=="CEN" ~ 0.56,
                                      prov=="COC" ~ 0.60,
                                      prov=="CUE" ~ 0.63,
                                      prov=="OLB" ~ 0.58,
                                      prov=="QUA" ~ 0.64,
                                      prov=="SAL" ~ 0.58,
                                      prov=="VAL" ~ 0.55,
                                      prov=="PIA" ~ 0.52,
                                      prov=="PIE" ~ 0.52,
                                      prov=="COM" ~ 0.76,
                                      prov=="ORI" ~ 0.64,
                                      prov=="TAM" ~ 0.55),
                    He.SNPs=case_when(prov=="HOU" ~ 0.25,
                                      prov=="VER" ~ 0.26,
                                      prov=="MIM" ~ 0.26,
                                      prov=="OLO" ~ 0.25,
                                      prov=="PET" ~ 0.26,
                                      prov=="PLE" ~ 0.27,
                                      prov=="STJ" ~ 0.26,
                                      prov=="ALT" ~ 0.24,
                                      prov=="ARM" ~ 0.23,
                                      prov=="CAD" ~ 0.24,
                                      prov=="CAS" ~ 0.24,
                                      prov=="LAM" ~ 0.23,
                                      prov=="LEI" ~ 0.26,
                                      prov=="PUE" ~ 0.23,
                                      prov=="SAC" ~ 0.24,
                                      prov=="SEG" ~ 0.25,
                                      prov=="SIE" ~ 0.26,
                                      prov=="ARN" ~ 0.29,
                                      prov=="BAY" ~ 0.28,
                                      prov=="BON" ~ 0.30,
                                      prov=="CAR" ~ 0.27,
                                      prov=="CEN" ~ 0.27,
                                      prov=="COC" ~ 0.29,
                                      prov=="CUE" ~ 0.28,
                                      prov=="OLB" ~ 0.29,
                                      prov=="QUA" ~ 0.29,
                                      prov=="SAL" ~ 0.28,
                                      prov=="VAL" ~ 0.28,
                                      prov=="PIA" ~ 0.25,
                                      prov=="PIE" ~ 0.25,
                                      prov=="COM" ~ 0.30,
                                      prov=="ORI" ~ 0.29,
                                      prov=="TAM" ~ 0.19)) %>% 
    dplyr::mutate(dplyr::across(-prov,scale))

FormatGeneticDiversity <- function(trait){

if(trait=="MeanBB"|trait=="MeanDBB"|trait=="d13C"){
    CenterResponseVariable <- TRUE
    LogResponseVariable <- FALSE
    
  }  else if(trait=="POR_SLA"){
    CenterResponseVariable <- FALSE
    LogResponseVariable <- TRUE
  } else {
    CenterResponseVariable <- FALSE
    LogResponseVariable <- FALSE
  }
  
if(trait=="MeanBB"|trait=="MeanDBB"){
    
  df <- readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds") %>% 
    dplyr::filter(!(prov=="MAD")) %>% 
    when(trait=="MeanBB" ~ dplyr::select(.,prov,clon,tree,block,BDX_s32013,BDX_s32014,BDX_s32015,BDX_s32017),
         ~ dplyr::select(.,prov,clon,tree,block,BDX_dbb2014,BDX_dbb2015,BDX_dbb2017)) %>% 
    filter(rowSums(is.na(across(where(is.numeric)))) != ncol(.)-4) %>% 
    dplyr::mutate(trait = rowMeans(dplyr::select(.,contains("BDX")),na.rm=T)) %>%
    execute_if(CenterResponseVariable, mutate(trait=trait-mean(trait))) %>% 
    execute_if(LogResponseVariable, mutate(trait=log(trait))) %>% 
    arrange(by=tree) %>% 
    inner_join(df.drivers,by="prov") 
  
  } else {
    
  df <- readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds") %>% 
    dplyr::filter(!(prov=="MAD")) %>% 
    dplyr::rename(trait=all_of(trait)) %>% 
    dplyr::select(prov,clon,tree,block,trait) %>% 
    drop_na() %>%  
    execute_if(CenterResponseVariable, mutate(trait=trait-mean(trait))) %>% 
    execute_if(LogResponseVariable, mutate(trait=log(trait))) %>% 
    arrange(by=tree) %>% 
    inner_join(df.drivers,by="prov")
  
  }
  
  var.He <- c("He.nuSSRs","He.SNPs")
  
  list.stan <- lapply(var.He, function(x)    {
    list(N=length(df$tree),
         y=df$trait,
         X=as.numeric(pull(unique(df[,c(x,"prov")]) %>% dplyr::select(contains(x)))),
         nprov = length(unique(df$prov)),
         nclon = length(unique(df$clon)),
         nblock = length(unique(df$block)),
         prov = as.numeric(as.factor(df$prov)),
         which_prov = as.numeric(as.factor(pull(unique(df[c("prov","clon")])[,"prov"]))),
         clon = as.numeric(as.factor(df$clon)),
         bloc = as.numeric(as.factor(df$block)))})
  
  names(list.stan) <- var.He
  
  return(list.stan)
}

GenDivbetaX <- lapply(traits, function(trait) {
  
  list.stan <- FormatGeneticDiversity(trait)
  
  
  output <- lapply(names(list.stan), function(x){
    
    mod <- sampling(model, 
                  data = list.stan[[x]], 
                  pars=c("betaX"),
                  iter = n_iter, 
                  chains = n_chains, 
                  cores = n_chains,
                  save_warmup = save_warmup,
                  thin=n_thin)
  
    broom::tidyMCMC(mod,pars=(c("betaX")),
                  droppars = NULL, estimate.method = "median", ess = T, rhat = T, conf.int = T,conf.level = 0.95) %>% 
      dplyr::mutate(driver=x)
    
  }) %>% bind_rows
  
  
  })

names(GenDivbetaX) <- traits

saveRDS(GenDivbetaX,file=paste0("outputs/models/GenDivbetaX.rds"))

GenDivbetaX <- readRDS(file=paste0("outputs/models/GenDivbetaX.rds")) %>% 
  bind_rows(.id="Traits") %>% 
  dplyr::mutate(Traits=factor(Traits,levels=traits))


p <- GenDivbetaX  %>% 
  ggplot(aes(x = driver, y = estimate,ymin = conf.low, ymax = conf.high,colour=Traits,shape=Traits)) +
  geom_pointinterval(position = position_dodge(width = .6),point_size=2.5,size=3,show.legend = c(size = TRUE)) +
  ylab(TeX("$\\beta_{X}$ estimates")) + xlab("") +
  scale_colour_manual(labels = trait.labels, # 
                     values=trait.colors) +
  scale_shape_manual(labels = trait.labels,
                     values = trait.shapes) +
  theme_bw() +
  theme(axis.text.x = element_text(size=13),
        axis.text.y = element_text(size=16),
        axis.title = element_text(size=16),
      legend.title=element_text(size=13), 
      legend.text=element_text(size=10),
      strip.text.x = element_text(size = 14),
      panel.grid.minor.x=element_blank(),
      panel.grid.major.x=element_blank()) +
    guides(color=guide_legend(ncol=1,override.aes = list(size=2)))
p

ggsave(p,file="figs/PosteriorDistributions/GeneticDiversity_betaX.png",height=6,width=9)

8 Geographic marginality

Following the comments of referee #1 (first round of review in Heredity), we calculate an index to quantify (or at least try to capture) the geographic marginality of the populations within the species range.

For that, we use the distribution of maritime pine based on both the Euforgen distribution and the NFI plots of Spain and France. We define the centre of the distribution as the point at mid-distance of the most extreme latitudes and longitudes where the presence of maritime pine has been observed.

Here are some maps showing maritime pine distribution, the population location and the centre of the distribution.

library(raster)

## Loading required package: sp

## 
## Attaching package: 'raster'

## The following object is masked from 'package:rstan':
## 
##     extract

## The following object is masked from 'package:tidyr':
## 
##     extract

## The following object is masked from 'package:dplyr':
## 
##     select

library(rnaturalearth)
library(ggthemes)

## 
## Attaching package: 'ggthemes'

## The following object is masked from 'package:cowplot':
## 
##     theme_map

library(sf)

## Linking to GEOS 3.5.1, GDAL 2.1.2, PROJ 4.9.3; sf_use_s2() is TRUE

# We extract the population coordinates
prov.coord <- readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds") %>% 
  dplyr::select(prov, longitude_prov,latitude_prov) %>% 
  distinct() %>% 
  dplyr::filter(!prov=="MAD") %>% 
  dplyr::rename(longitude=longitude_prov,latitude=latitude_prov)


PinpinDistri  <- shapefile('../../GenomicOffset/GenomicOffsetPinPin/data/maps/MaskPinpinDistri/PinpinDistriEUforgen_NFIplotsBuffer10km.shp')
center.distri.long <- mean(c(extent(PinpinDistri)@xmin,extent(PinpinDistri)@xmax))
center.distri.lat <- mean(c(extent(PinpinDistri)@ymin,extent(PinpinDistri)@ymax))
center.distri <- c(center.distri.long,center.distri.lat)


# Maps with the raster package
##############################

# Create spatial objects for plotting
prov.coord.spatial <- SpatialPoints(prov.coord[,c("longitude","latitude")], 
                        proj4string=CRS("+proj=longlat +datum=WGS84"))
center.distri.spatial <- SpatialPoints(data.frame(longitude=center.distri.long,latitude=center.distri.lat), 
                        proj4string=CRS("+proj=longlat +datum=WGS84"))

plot(PinpinDistri,col="cadetblue1", lwd=0.1)
plot(prov.coord.spatial,add=T,pch=17,col="darkblue")
plot(center.distri.spatial,add=T,pch=4,col="red",lwd=4)

# Maps with the sf and ggplot packages
######################################

# This map is included in the Supplementary Information and the response to the reviewers.

world <- ne_countries(scale = "medium", returnclass = "sf")

PinpinDistri <- st_read('../../GenomicOffset/GenomicOffsetPinPin/data/maps/MaskPinpinDistri/PinpinDistriEUforgen_NFIplotsBuffer10km.shp')

## Reading layer `PinpinDistriEUforgen_NFIplotsBuffer10km' from data source `/home/juliette/Documents/GenomicOffset/GenomicOffsetPinPin/data/maps/MaskPinpinDistri/PinpinDistriEUforgen_NFIplotsBuffer10km.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 1 field
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -9.491658 ymin: 31.18334 xmax: 13.15834 ymax: 50.74848
## CRS:           4326

mapIBD <- ggplot(PinpinDistri) +
  geom_sf(color=alpha("chartreuse4",0.6),size=0.2,fill=alpha("chartreuse4",0.3)) + 
  theme_bw() +
  geom_sf(data = world,color=alpha("gray",0.8),size=0.2,fill=alpha("gray",0.1)) +
    coord_sf(xlim = c(-10, 13), 
             ylim = c(32,50.5),
             crs = "+proj=longlat +datum=WGS84") +
  geom_point(data=prov.coord[,c("longitude","latitude")],aes(x=longitude,y=latitude),size=2,pch=17) +
  geom_point(data=data.frame(longitude=center.distri.long,latitude=center.distri.lat),aes(x=longitude,y=latitude),size=6,pch=8,col="red") +
  xlab("Longitude") + ylab("Latitude")

mapIBD

ggsave(mapIBD,file="maps/mapIBD.png",height=8,width=10)

We then calculate the distance (in meters) between each population location and the centre of the distribution. With the geosphere package.

# calculate the distance to the point with the package geosphere
library(geosphere)
df.drivers <- prov.coord %>% 
  dplyr::mutate(dist.center.pt = distGeo(center.distri, prov.coord[,c("longitude","latitude")])) %>% 
  dplyr::mutate(dist.center.pt.scaled = as.numeric(scale(dist.center.pt)))

DownloadDataIBD <- function(trait){

if(trait=="MeanBB"|trait=="MeanDBB"|trait=="d13C"){
    CenterResponseVariable <- TRUE
    LogResponseVariable <- FALSE
    
  }  else if(trait=="POR_SLA"){
    CenterResponseVariable <- FALSE
    LogResponseVariable <- TRUE
  } else {
    CenterResponseVariable <- FALSE
    LogResponseVariable <- FALSE
  }
  
  if(trait=="MeanBB"|trait=="MeanDBB"){
    
  df <- readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds") %>% 
    dplyr::filter(!(prov=="MAD")) %>% 
    when(trait=="MeanBB" ~ dplyr::select(.,prov,clon,tree,block,BDX_s32013,BDX_s32014,BDX_s32015,BDX_s32017),
         ~ dplyr::select(.,prov,clon,tree,block,BDX_dbb2014,BDX_dbb2015,BDX_dbb2017)) %>% 
    filter(rowSums(is.na(across(where(is.numeric)))) != ncol(.)-4) %>% 
    dplyr::mutate(trait = rowMeans(dplyr::select(.,contains("BDX")),na.rm=T)) %>%
    execute_if(CenterResponseVariable, mutate(trait=trait-mean(trait))) %>% 
    execute_if(LogResponseVariable, mutate(trait=log(trait))) %>% 
    arrange(by=tree) %>% 
    inner_join(df.drivers,by="prov") 
  
  } else {
    
  df <- readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds") %>% 
    dplyr::filter(!(prov=="MAD")) %>% 
    dplyr::rename(trait=all_of(trait)) %>% 
    dplyr::select(prov,clon,tree,block,trait) %>% 
    drop_na() %>%  
    execute_if(CenterResponseVariable, mutate(trait=trait-mean(trait))) %>% 
    execute_if(LogResponseVariable, mutate(trait=log(trait))) %>% 
    arrange(by=tree) %>% 
    inner_join(df.drivers,by="prov")
  
  }
  
  
  list.stan <- list(N=length(df$tree),
         y=df$trait,
         X=unique(df$dist.center.pt.scaled),
         nprov = length(unique(df$prov)),
         nclon = length(unique(df$clon)),
         nblock = length(unique(df$block)),
         prov = as.numeric(as.factor(df$prov)),
         which_prov = as.numeric(as.factor(pull(unique(df[c("prov","clon")])[,"prov"]))),
         clon = as.numeric(as.factor(df$clon)),
         bloc = as.numeric(as.factor(df$block)))
  
  return(list.stan)
}

We run one model per trait (so eight models) and we extract the \(\beta_X\) coefficients standing for the association between the total genetic variances and the distance between each population and the center of the species distribution.

IBDbetaX <- lapply(traits, function(trait) {
  
  list.stan <- DownloadDataIBD(trait)
  
  mod <- sampling(model, 
                  data = list.stan, 
                  pars=c("betaX"),
                  iter = n_iter, 
                  chains = n_chains, 
                  cores = n_chains,
                  save_warmup = save_warmup,
                  thin=n_thin)
  
  broom::tidyMCMC(mod,pars=(c("betaX")),
                  droppars = NULL, estimate.method = "median", ess = T, rhat = T, conf.int = T,conf.level = 0.95) 
    
  }) %>% 
  bind_rows() %>% 
  dplyr::mutate(Traits=traits)
  

saveRDS(IBDbetaX,file=paste0("outputs/models/IBDbetaX.rds"))

We build the figure included in the Supplementary Information with the \(\beta_X\) coefficients.

IBDbetaX <- readRDS(file=paste0("outputs/models/IBDbetaX.rds")) %>% 
  dplyr::mutate(Traits=factor(Traits,levels=traits))

p <- IBDbetaX  %>%   ggplot(aes(x=term,y = estimate,ymin = conf.low, ymax = conf.high,colour=Traits,shape=Traits)) +
  geom_pointinterval(position = position_dodge(width = .6),point_size=2.5,size=3,show.legend = c(size = TRUE)) +
  ylab(TeX("$\\beta_{X}$ estimates")) + xlab("") +
  scale_colour_manual(labels = trait.labels,
                     values=trait.colors) +
  scale_shape_manual(labels = trait.labels,
                     values = trait.shapes) +
  theme_bw() +
  theme(axis.text.x = element_text(size=13),
        axis.text.y = element_text(size=16),
        axis.title = element_text(size=16),
      legend.title=element_text(size=13), 
      legend.text=element_text(size=10),
      strip.text.x = element_text(size = 14),
      panel.grid.minor.x=element_blank(),
      panel.grid.major.x=element_blank()) +
    guides(color=guide_legend(ncol=1,override.aes = list(size=2)))

p

ggsave(p,file="figs/PosteriorDistributions/IBD_betaX.png",height=6,width=6)

In the second round of review, the reviewer answered: 'concerning the central marginal test that authors performed in reply to the reviewers' comments: It doesn't make sense to measure distance from the centroid when the whole range is concave (Mediterranean basin) and there is an island too (Corsica).' So this analysis was removed from the second round of review.

9 Experimental design

For the whole dataset (i.e. sub="alldata).

list.df.traits <- lapply(traits, function(x){
  list.trait <- DownloadData(x,sub=sub,drivers=selected.drivers)
  return(list.trait$df %>% mutate(trait.id=x))})

## Warning: `invoke()` is deprecated as of rlang 0.4.0.
## Please use `exec()` or `inject()` instead.
## This warning is displayed once per session.

lapply(list.df.traits, function(x){
  
# Minimum, maximum and average number of trees per clone in each population
df <- x %>% 
    dplyr::select(prov,clon) %>% 
    group_by(clon) %>% 
    mutate(NbInd = n()) %>% 
    distinct() %>% 
    group_by(prov) %>% 
    mutate(MinNbTreesPerClone=min(NbInd),MaxNbTreesPerClone=max(NbInd),MeanNbTreesPerClone=mean(NbInd)) %>% 
    dplyr::select(-clon,-NbInd) %>% 
    distinct() 

# Number of trees in each population
df <- x %>% 
    group_by(prov) %>% 
    summarise(NbTrees=n()) %>% 
    merge(df,by="prov")
  
# Number of clones in each population  
df <- x %>% 
    dplyr::select(prov,clon) %>% 
    distinct() %>% 
    group_by(prov) %>%  
    summarise(NbClones=n()) %>% 
    merge(df,by="prov")
  
# Export in a latex table for the Supplementary Information
df <- df %>% dplyr::rename(Population=prov) 

print(xtable(df, type = "latex",digits=1),
      file = paste0("tables/ExperimentalDesign/",unique(x$trait.id),".tex"),
      include.rownames=FALSE)
  })

## [[1]]
## [1] "% latex table generated in R 3.6.3 by xtable 1.8-4 package\n% Tue Jan 10 18:23:18 2023\n\\begin{table}[ht]\n\\centering\n\\begin{tabular}{lrrrrr}\n  \\hline\nPopulation & NbClones & NbTrees & MinNbTreesPerClone & MaxNbTreesPerClone & MeanNbTreesPerClone \\\\ \n  \\hline\nALT &  9 & 54 &  4 &  8 & 6.0 \\\\ \n  ARM &  8 & 42 &  4 &  7 & 5.2 \\\\ \n  ARN & 17 & 93 &  3 &  8 & 5.5 \\\\ \n  BAY & 18 & 91 &  2 &  7 & 5.1 \\\\ \n  BON &  9 & 60 &  5 &  8 & 6.7 \\\\ \n  CAD & 10 & 60 &  4 &  8 & 6.0 \\\\ \n  CAR &  6 & 29 &  4 &  5 & 4.8 \\\\ \n  CAS & 10 & 51 &  3 &  7 & 5.1 \\\\ \n  CEN &  9 & 56 &  3 &  8 & 6.2 \\\\ \n  COC & 18 & 92 &  3 &  7 & 5.1 \\\\ \n  COM &  4 & 26 &  5 &  8 & 6.5 \\\\ \n  CUE & 28 & 133 &  1 &  7 & 4.8 \\\\ \n  HOU & 26 & 143 &  2 &  8 & 5.5 \\\\ \n  LAM &  9 & 45 &  2 &  7 & 5.0 \\\\ \n  LEI & 23 & 124 &  3 &  8 & 5.4 \\\\ \n  MIM & 18 & 78 &  2 &  6 & 4.3 \\\\ \n  OLB & 22 & 149 &  5 &  8 & 6.8 \\\\ \n  OLO & 24 & 122 &  1 &  8 & 5.1 \\\\ \n  ORI & 26 & 147 &  3 &  8 & 5.7 \\\\ \n  PET & 24 & 119 &  2 &  8 & 5.0 \\\\ \n  PIA & 16 & 87 &  3 &  8 & 5.4 \\\\ \n  PIE &  9 & 55 &  5 &  7 & 6.1 \\\\ \n  PLE & 20 & 94 &  3 &  7 & 4.7 \\\\ \n  PUE &  8 & 38 &  2 &  7 & 4.8 \\\\ \n  QUA & 17 & 102 &  2 &  8 & 6.0 \\\\ \n  SAC &  9 & 38 &  1 &  7 & 4.2 \\\\ \n  SAL & 14 & 58 &  2 &  7 & 4.1 \\\\ \n  SEG & 21 & 93 &  1 &  6 & 4.4 \\\\ \n  SIE &  7 & 39 &  4 &  7 & 5.6 \\\\ \n  STJ & 28 & 155 &  4 &  8 & 5.5 \\\\ \n  TAM & 15 & 66 &  1 &  7 & 4.4 \\\\ \n  VAL & 12 & 69 &  4 &  8 & 5.8 \\\\ \n  VER & 27 & 138 &  2 &  7 & 5.1 \\\\ \n   \\hline\n\\end{tabular}\n\\end{table}\n"
## 
## [[2]]
## [1] "% latex table generated in R 3.6.3 by xtable 1.8-4 package\n% Tue Jan 10 18:23:18 2023\n\\begin{table}[ht]\n\\centering\n\\begin{tabular}{lrrrrr}\n  \\hline\nPopulation & NbClones & NbTrees & MinNbTreesPerClone & MaxNbTreesPerClone & MeanNbTreesPerClone \\\\ \n  \\hline\nALT &  9 & 70 &  7 &  8 & 7.8 \\\\ \n  ARM &  8 & 63 &  7 &  8 & 7.9 \\\\ \n  ARN & 17 & 132 &  6 &  8 & 7.8 \\\\ \n  BAY & 18 & 136 &  6 &  8 & 7.6 \\\\ \n  BON &  9 & 69 &  7 &  8 & 7.7 \\\\ \n  CAD & 10 & 75 &  6 &  8 & 7.5 \\\\ \n  CAR &  6 & 48 &  8 &  8 & 8.0 \\\\ \n  CAS & 10 & 73 &  6 &  8 & 7.3 \\\\ \n  CEN &  9 & 70 &  7 &  8 & 7.8 \\\\ \n  COC & 18 & 138 &  6 &  8 & 7.7 \\\\ \n  COM &  4 & 31 &  7 &  8 & 7.8 \\\\ \n  CUE & 28 & 214 &  6 &  8 & 7.6 \\\\ \n  HOU & 26 & 195 &  6 &  8 & 7.5 \\\\ \n  LAM &  9 & 69 &  6 &  8 & 7.7 \\\\ \n  LEI & 23 & 172 &  7 &  8 & 7.5 \\\\ \n  MIM & 18 & 139 &  6 &  8 & 7.7 \\\\ \n  OLB & 22 & 172 &  7 &  8 & 7.8 \\\\ \n  OLO & 24 & 172 &  5 &  8 & 7.2 \\\\ \n  ORI & 26 & 204 &  6 &  8 & 7.8 \\\\ \n  PET & 24 & 181 &  5 &  8 & 7.5 \\\\ \n  PIA & 16 & 121 &  5 &  8 & 7.6 \\\\ \n  PIE &  9 & 69 &  6 &  8 & 7.7 \\\\ \n  PLE & 20 & 152 &  6 &  8 & 7.6 \\\\ \n  PUE &  8 & 62 &  7 &  8 & 7.8 \\\\ \n  QUA & 17 & 131 &  6 &  8 & 7.7 \\\\ \n  SAC &  9 & 69 &  6 &  8 & 7.7 \\\\ \n  SAL & 14 & 105 &  6 &  8 & 7.5 \\\\ \n  SEG & 21 & 157 &  6 &  8 & 7.5 \\\\ \n  SIE &  8 & 63 &  7 &  8 & 7.9 \\\\ \n  STJ & 28 & 210 &  5 &  8 & 7.5 \\\\ \n  TAM & 15 & 113 &  6 &  8 & 7.5 \\\\ \n  VAL & 12 & 92 &  6 &  8 & 7.7 \\\\ \n  VER & 27 & 206 &  5 &  8 & 7.6 \\\\ \n   \\hline\n\\end{tabular}\n\\end{table}\n"
## 
## [[3]]
## [1] "% latex table generated in R 3.6.3 by xtable 1.8-4 package\n% Tue Jan 10 18:23:18 2023\n\\begin{table}[ht]\n\\centering\n\\begin{tabular}{lrrrrr}\n  \\hline\nPopulation & NbClones & NbTrees & MinNbTreesPerClone & MaxNbTreesPerClone & MeanNbTreesPerClone \\\\ \n  \\hline\nALT &  7 & 52 &  4 &  8 & 7.4 \\\\ \n  ARM &  8 & 63 &  7 &  8 & 7.9 \\\\ \n  ARN & 14 & 107 &  6 &  8 & 7.6 \\\\ \n  BAY & 18 & 140 &  7 &  8 & 7.8 \\\\ \n  BON &  6 & 47 &  7 &  8 & 7.8 \\\\ \n  CAD &  9 & 67 &  6 &  8 & 7.4 \\\\ \n  CAR &  6 & 47 &  7 &  8 & 7.8 \\\\ \n  CAS & 10 & 73 &  6 &  8 & 7.3 \\\\ \n  CEN &  5 & 37 &  6 &  8 & 7.4 \\\\ \n  COC & 14 & 108 &  6 &  8 & 7.7 \\\\ \n  COM &  4 & 30 &  7 &  8 & 7.5 \\\\ \n  CUE & 25 & 187 &  5 &  8 & 7.5 \\\\ \n  HOU & 21 & 154 &  5 &  8 & 7.3 \\\\ \n  LAM &  8 & 56 &  5 &  8 & 7.0 \\\\ \n  LEI & 19 & 133 &  4 &  8 & 7.0 \\\\ \n  MIM & 16 & 124 &  6 &  8 & 7.8 \\\\ \n  OLB & 15 & 113 &  5 &  8 & 7.5 \\\\ \n  OLO & 18 & 129 &  4 &  8 & 7.2 \\\\ \n  ORI & 23 & 181 &  7 &  8 & 7.9 \\\\ \n  PET & 19 & 144 &  6 &  8 & 7.6 \\\\ \n  PIA & 14 & 109 &  7 &  8 & 7.8 \\\\ \n  PIE &  7 & 52 &  6 &  8 & 7.4 \\\\ \n  PLE & 18 & 130 &  5 &  8 & 7.2 \\\\ \n  PUE &  7 & 51 &  6 &  8 & 7.3 \\\\ \n  QUA & 15 & 115 &  7 &  8 & 7.7 \\\\ \n  SAC &  7 & 50 &  5 &  8 & 7.1 \\\\ \n  SAL &  9 & 69 &  5 &  8 & 7.7 \\\\ \n  SEG & 21 & 161 &  6 &  8 & 7.7 \\\\ \n  SIE &  6 & 46 &  7 &  8 & 7.7 \\\\ \n  STJ & 23 & 177 &  6 &  8 & 7.7 \\\\ \n  TAM &  9 & 69 &  7 &  8 & 7.7 \\\\ \n  VAL &  8 & 61 &  7 &  8 & 7.6 \\\\ \n  VER & 21 & 156 &  6 &  8 & 7.4 \\\\ \n   \\hline\n\\end{tabular}\n\\end{table}\n"
## 
## [[4]]
## [1] "% latex table generated in R 3.6.3 by xtable 1.8-4 package\n% Tue Jan 10 18:23:18 2023\n\\begin{table}[ht]\n\\centering\n\\begin{tabular}{lrrrrr}\n  \\hline\nPopulation & NbClones & NbTrees & MinNbTreesPerClone & MaxNbTreesPerClone & MeanNbTreesPerClone \\\\ \n  \\hline\nALT &  7 & 52 &  4 &  8 & 7.4 \\\\ \n  ARM &  8 & 63 &  7 &  8 & 7.9 \\\\ \n  ARN & 14 & 105 &  6 &  8 & 7.5 \\\\ \n  BAY & 18 & 139 &  7 &  8 & 7.7 \\\\ \n  BON &  6 & 47 &  7 &  8 & 7.8 \\\\ \n  CAD &  9 & 67 &  6 &  8 & 7.4 \\\\ \n  CAR &  6 & 46 &  7 &  8 & 7.7 \\\\ \n  CAS & 10 & 72 &  5 &  8 & 7.2 \\\\ \n  CEN &  5 & 37 &  6 &  8 & 7.4 \\\\ \n  COC & 14 & 105 &  6 &  8 & 7.5 \\\\ \n  COM &  4 & 28 &  6 &  8 & 7.0 \\\\ \n  CUE & 25 & 183 &  5 &  8 & 7.3 \\\\ \n  HOU & 21 & 154 &  5 &  8 & 7.3 \\\\ \n  LAM &  8 & 56 &  5 &  8 & 7.0 \\\\ \n  LEI & 19 & 133 &  4 &  8 & 7.0 \\\\ \n  MIM & 16 & 121 &  6 &  8 & 7.6 \\\\ \n  OLB & 15 & 112 &  5 &  8 & 7.5 \\\\ \n  OLO & 18 & 128 &  4 &  8 & 7.1 \\\\ \n  ORI & 23 & 181 &  7 &  8 & 7.9 \\\\ \n  PET & 19 & 144 &  6 &  8 & 7.6 \\\\ \n  PIA & 14 & 109 &  7 &  8 & 7.8 \\\\ \n  PIE &  7 & 51 &  6 &  8 & 7.3 \\\\ \n  PLE & 18 & 128 &  5 &  8 & 7.1 \\\\ \n  PUE &  7 & 51 &  6 &  8 & 7.3 \\\\ \n  QUA & 15 & 114 &  6 &  8 & 7.6 \\\\ \n  SAC &  7 & 49 &  5 &  8 & 7.0 \\\\ \n  SAL &  9 & 69 &  5 &  8 & 7.7 \\\\ \n  SEG & 21 & 161 &  6 &  8 & 7.7 \\\\ \n  SIE &  6 & 46 &  7 &  8 & 7.7 \\\\ \n  STJ & 23 & 177 &  6 &  8 & 7.7 \\\\ \n  TAM &  9 & 67 &  7 &  8 & 7.4 \\\\ \n  VAL &  8 & 59 &  6 &  8 & 7.4 \\\\ \n  VER & 21 & 155 &  6 &  8 & 7.4 \\\\ \n   \\hline\n\\end{tabular}\n\\end{table}\n"
## 
## [[5]]
## [1] "% latex table generated in R 3.6.3 by xtable 1.8-4 package\n% Tue Jan 10 18:23:18 2023\n\\begin{table}[ht]\n\\centering\n\\begin{tabular}{lrrrrr}\n  \\hline\nPopulation & NbClones & NbTrees & MinNbTreesPerClone & MaxNbTreesPerClone & MeanNbTreesPerClone \\\\ \n  \\hline\nALT &  7 & 52 &  4 &  8 & 7.4 \\\\ \n  ARM &  8 & 63 &  7 &  8 & 7.9 \\\\ \n  ARN & 14 & 48 &  1 &  5 & 3.4 \\\\ \n  BAY & 18 & 140 &  7 &  8 & 7.8 \\\\ \n  BON &  6 & 47 &  7 &  8 & 7.8 \\\\ \n  CAD &  9 & 67 &  6 &  8 & 7.4 \\\\ \n  CAR &  6 & 47 &  7 &  8 & 7.8 \\\\ \n  CAS & 10 & 73 &  6 &  8 & 7.3 \\\\ \n  CEN &  5 & 37 &  6 &  8 & 7.4 \\\\ \n  COC & 14 & 108 &  6 &  8 & 7.7 \\\\ \n  COM &  4 & 30 &  7 &  8 & 7.5 \\\\ \n  CUE & 25 & 187 &  5 &  8 & 7.5 \\\\ \n  HOU & 21 & 154 &  5 &  8 & 7.3 \\\\ \n  LAM &  8 & 56 &  5 &  8 & 7.0 \\\\ \n  LEI & 19 & 133 &  4 &  8 & 7.0 \\\\ \n  MIM & 16 & 123 &  5 &  8 & 7.7 \\\\ \n  OLB & 15 & 112 &  5 &  8 & 7.5 \\\\ \n  OLO & 18 & 129 &  4 &  8 & 7.2 \\\\ \n  ORI & 23 & 181 &  7 &  8 & 7.9 \\\\ \n  PET & 19 & 144 &  6 &  8 & 7.6 \\\\ \n  PIA & 14 & 109 &  7 &  8 & 7.8 \\\\ \n  PIE &  7 & 52 &  6 &  8 & 7.4 \\\\ \n  PLE & 18 & 130 &  5 &  8 & 7.2 \\\\ \n  PUE &  7 & 51 &  6 &  8 & 7.3 \\\\ \n  QUA & 15 & 115 &  7 &  8 & 7.7 \\\\ \n  SAC &  7 & 50 &  5 &  8 & 7.1 \\\\ \n  SAL &  9 & 69 &  5 &  8 & 7.7 \\\\ \n  SEG & 21 & 160 &  6 &  8 & 7.6 \\\\ \n  SIE &  6 & 46 &  7 &  8 & 7.7 \\\\ \n  STJ & 23 & 177 &  6 &  8 & 7.7 \\\\ \n  TAM &  9 & 69 &  7 &  8 & 7.7 \\\\ \n  VAL &  8 & 60 &  6 &  8 & 7.5 \\\\ \n  VER & 21 & 156 &  6 &  8 & 7.4 \\\\ \n   \\hline\n\\end{tabular}\n\\end{table}\n"
## 
## [[6]]
## [1] "% latex table generated in R 3.6.3 by xtable 1.8-4 package\n% Tue Jan 10 18:23:18 2023\n\\begin{table}[ht]\n\\centering\n\\begin{tabular}{lrrrrr}\n  \\hline\nPopulation & NbClones & NbTrees & MinNbTreesPerClone & MaxNbTreesPerClone & MeanNbTreesPerClone \\\\ \n  \\hline\nALT &  7 & 52 &  4 &  8 & 7.4 \\\\ \n  ARM &  8 & 63 &  7 &  8 & 7.9 \\\\ \n  ARN & 14 & 59 &  2 &  5 & 4.2 \\\\ \n  BAY & 18 & 140 &  7 &  8 & 7.8 \\\\ \n  BON &  6 & 47 &  7 &  8 & 7.8 \\\\ \n  CAD &  9 & 67 &  6 &  8 & 7.4 \\\\ \n  CAR &  6 & 47 &  7 &  8 & 7.8 \\\\ \n  CAS & 10 & 73 &  6 &  8 & 7.3 \\\\ \n  CEN &  5 & 37 &  6 &  8 & 7.4 \\\\ \n  COC & 14 & 108 &  6 &  8 & 7.7 \\\\ \n  COM &  4 & 30 &  7 &  8 & 7.5 \\\\ \n  CUE & 25 & 187 &  5 &  8 & 7.5 \\\\ \n  HOU & 21 & 154 &  5 &  8 & 7.3 \\\\ \n  LAM &  8 & 56 &  5 &  8 & 7.0 \\\\ \n  LEI & 19 & 133 &  4 &  8 & 7.0 \\\\ \n  MIM & 16 & 124 &  6 &  8 & 7.8 \\\\ \n  OLB & 15 & 112 &  5 &  8 & 7.5 \\\\ \n  OLO & 18 & 129 &  4 &  8 & 7.2 \\\\ \n  ORI & 23 & 181 &  7 &  8 & 7.9 \\\\ \n  PET & 19 & 144 &  6 &  8 & 7.6 \\\\ \n  PIA & 14 & 109 &  7 &  8 & 7.8 \\\\ \n  PIE &  7 & 52 &  6 &  8 & 7.4 \\\\ \n  PLE & 18 & 130 &  5 &  8 & 7.2 \\\\ \n  PUE &  7 & 51 &  6 &  8 & 7.3 \\\\ \n  QUA & 15 & 115 &  7 &  8 & 7.7 \\\\ \n  SAC &  7 & 50 &  5 &  8 & 7.1 \\\\ \n  SAL &  9 & 69 &  5 &  8 & 7.7 \\\\ \n  SEG & 21 & 160 &  6 &  8 & 7.6 \\\\ \n  SIE &  6 & 46 &  7 &  8 & 7.7 \\\\ \n  STJ & 23 & 177 &  6 &  8 & 7.7 \\\\ \n  TAM &  9 & 69 &  7 &  8 & 7.7 \\\\ \n  VAL &  8 & 60 &  6 &  8 & 7.5 \\\\ \n  VER & 21 & 156 &  6 &  8 & 7.4 \\\\ \n   \\hline\n\\end{tabular}\n\\end{table}\n"
## 
## [[7]]
## [1] "% latex table generated in R 3.6.3 by xtable 1.8-4 package\n% Tue Jan 10 18:23:18 2023\n\\begin{table}[ht]\n\\centering\n\\begin{tabular}{lrrrrr}\n  \\hline\nPopulation & NbClones & NbTrees & MinNbTreesPerClone & MaxNbTreesPerClone & MeanNbTreesPerClone \\\\ \n  \\hline\nALT &  9 & 53 &  4 &  8 & 5.9 \\\\ \n  ARM &  8 & 39 &  3 &  6 & 4.9 \\\\ \n  ARN & 17 & 89 &  3 &  8 & 5.2 \\\\ \n  BAY & 18 & 87 &  2 &  7 & 4.8 \\\\ \n  BON &  9 & 59 &  5 &  8 & 6.6 \\\\ \n  CAD & 10 & 57 &  4 &  7 & 5.7 \\\\ \n  CAR &  6 & 27 &  3 &  5 & 4.5 \\\\ \n  CAS & 10 & 49 &  3 &  7 & 4.9 \\\\ \n  CEN &  9 & 53 &  3 &  7 & 5.9 \\\\ \n  COC & 18 & 90 &  3 &  7 & 5.0 \\\\ \n  COM &  4 & 22 &  4 &  7 & 5.5 \\\\ \n  CUE & 28 & 122 &  1 &  6 & 4.4 \\\\ \n  HOU & 26 & 142 &  2 &  8 & 5.5 \\\\ \n  LAM &  9 & 45 &  2 &  7 & 5.0 \\\\ \n  LEI & 23 & 118 &  3 &  8 & 5.1 \\\\ \n  MIM & 18 & 76 &  1 &  6 & 4.2 \\\\ \n  OLB & 22 & 148 &  5 &  8 & 6.7 \\\\ \n  OLO & 24 & 118 &  1 &  8 & 4.9 \\\\ \n  ORI & 26 & 143 &  2 &  8 & 5.5 \\\\ \n  PET & 24 & 116 &  2 &  8 & 4.8 \\\\ \n  PIA & 16 & 83 &  2 &  8 & 5.2 \\\\ \n  PIE &  9 & 55 &  5 &  7 & 6.1 \\\\ \n  PLE & 20 & 89 &  2 &  7 & 4.5 \\\\ \n  PUE &  8 & 34 &  2 &  6 & 4.2 \\\\ \n  QUA & 17 & 99 &  2 &  8 & 5.8 \\\\ \n  SAC &  9 & 36 &  1 &  7 & 4.0 \\\\ \n  SAL & 14 & 57 &  2 &  7 & 4.1 \\\\ \n  SEG & 21 & 91 &  1 &  6 & 4.3 \\\\ \n  SIE &  7 & 35 &  4 &  6 & 5.0 \\\\ \n  STJ & 28 & 152 &  4 &  8 & 5.4 \\\\ \n  TAM & 14 & 60 &  2 &  7 & 4.3 \\\\ \n  VAL & 12 & 67 &  3 &  7 & 5.6 \\\\ \n  VER & 27 & 131 &  1 &  7 & 4.9 \\\\ \n   \\hline\n\\end{tabular}\n\\end{table}\n"
## 
## [[8]]
## [1] "% latex table generated in R 3.6.3 by xtable 1.8-4 package\n% Tue Jan 10 18:23:18 2023\n\\begin{table}[ht]\n\\centering\n\\begin{tabular}{lrrrrr}\n  \\hline\nPopulation & NbClones & NbTrees & MinNbTreesPerClone & MaxNbTreesPerClone & MeanNbTreesPerClone \\\\ \n  \\hline\nALT &  9 & 35 &  3 &  4 & 3.9 \\\\ \n  ARM &  8 & 30 &  3 &  4 & 3.8 \\\\ \n  ARN & 17 & 65 &  3 &  4 & 3.8 \\\\ \n  BAY & 18 & 69 &  2 &  4 & 3.8 \\\\ \n  BON &  9 & 36 &  4 &  4 & 4.0 \\\\ \n  CAD & 10 & 40 &  4 &  4 & 4.0 \\\\ \n  CAR &  6 & 23 &  3 &  4 & 3.8 \\\\ \n  CAS & 10 & 39 &  3 &  4 & 3.9 \\\\ \n  CEN &  9 & 35 &  3 &  4 & 3.9 \\\\ \n  COC & 18 & 67 &  3 &  4 & 3.7 \\\\ \n  COM &  4 & 16 &  4 &  4 & 4.0 \\\\ \n  CUE & 28 & 100 &  1 &  4 & 3.6 \\\\ \n  HOU & 26 & 100 &  2 &  4 & 3.8 \\\\ \n  LAM &  9 & 34 &  3 &  4 & 3.8 \\\\ \n  LEI & 23 & 88 &  3 &  4 & 3.8 \\\\ \n  MIM & 18 & 65 &  1 &  4 & 3.6 \\\\ \n  OLB & 22 & 88 &  4 &  4 & 4.0 \\\\ \n  OLO & 24 & 88 &  1 &  4 & 3.7 \\\\ \n  ORI & 26 & 101 &  2 &  4 & 3.9 \\\\ \n  PET & 24 & 88 &  2 &  4 & 3.7 \\\\ \n  PIA & 16 & 60 &  2 &  4 & 3.8 \\\\ \n  PIE &  9 & 36 &  4 &  4 & 4.0 \\\\ \n  PLE & 20 & 70 &  2 &  4 & 3.5 \\\\ \n  PUE &  8 & 29 &  2 &  4 & 3.6 \\\\ \n  QUA & 17 & 64 &  2 &  4 & 3.8 \\\\ \n  SAC &  6 & 16 &  1 &  4 & 2.7 \\\\ \n  SAL & 14 & 46 &  2 &  4 & 3.3 \\\\ \n  SEG & 21 & 73 &  1 &  4 & 3.5 \\\\ \n  SIE &  7 & 28 &  4 &  4 & 4.0 \\\\ \n  STJ & 28 & 111 &  3 &  4 & 4.0 \\\\ \n  TAM & 14 & 51 &  2 &  4 & 3.6 \\\\ \n  VAL & 12 & 47 &  3 &  4 & 3.9 \\\\ \n  VER & 27 & 101 &  1 &  4 & 3.7 \\\\ \n   \\hline\n\\end{tabular}\n\\end{table}\n"

10 Model with G*E

10.1 Model equation and code

This model is based on the previous one (so I do not re-explain the meaning of each parameters, except for the new ones).

We model each trait \(y_{bpcrs}\) such as:

\[\begin{equation} \begin{aligned} y_{bpcr} & \sim \mathcal{N}(\mu_{bpc},\sigma^{2}_{r})\\[3pt] \mu_{bpc} & = \beta_{0} + B_{b} + P_{p} + C_{c(p)} + S_{s} + S_{s}*C_{c(p)}\\[3pt] \beta_{0} & \sim \mathcal{N}(\mu_{y},2)\\[3pt] \begin{bmatrix} B_{b} \\ P_{p} \\ S_{s} \\ S_{s}*C_{c(p)} \end{bmatrix} & \sim \mathcal{N}\left(0, \begin{bmatrix}\sigma^{2}_{B}\\[3pt] \sigma^{2}_{P}\\[3pt] \sigma^{2}_{S}\\[3pt] \sigma^{2}_{S*C}\\[3pt] \end{bmatrix} \right)\\[3pt] C_{c(p)} & \sim \mathcal{N}(0,\sigma^{2}_{C_{p}}) \end{aligned} \end{equation}\]

where \(S_{s}\) are the site intercepts, \(\sigma^{2}_{S}\) is the variance among sites and \(\sigma^{2}_{S*C}\) is the variance of the interaction among the sites and clones, that is the G*E variance.

Partitionning of the total variance \(\sigma_{tot}^{2}\):

\[\begin{equation} \begin{aligned} \sigma_{tot}^{2} & = \sigma_{r}^{2} + \sigma_{B}^{2} + \overline{\sigma_{C_{p}}^{2}} + \sigma_{P}^{2} + \sigma_{S}^{2} + \sigma_{S*C}^{2}\\[3pt] \sigma_{r} & = \sigma_{tot} \times \sqrt(\pi_{r})\\[3pt] \sigma_{B} & = \sigma_{tot} \times \sqrt(\pi_{B})\\[3pt] \sigma_{P} & = \sigma_{tot} \times \sqrt(\pi_{P})\\[3pt] \overline{\sigma_{C_{p}}} & = \sigma_{tot} \times \sqrt(\pi_{C})\\[3pt] \sigma_{S} & = \sigma_{tot} \times \sqrt(\pi_{S})\\[3pt] \sigma_{S*C} & = \sigma_{tot} \times \sqrt(\pi_{S*C})\\[3pt] \sigma_{tot} & \sim \mathcal{S}^{*}(0,1,3) \end{aligned} \end{equation}\]

with \(\sum_{l}^{6}\pi_{l}=1\).

The expression of \(\sigma_{C_{p}}\) (and its associated priors) is the same than the main model of the paper (which is trait-specific).

Here is the stan model code:

model = stan_model("scripts/StanModels/HierarchicalModel_VaryingInterClonesSD_OneCovariate_3Sites.stan")
model

## S4 class stanmodel 'HierarchicalModel_VaryingInterClonesSD_OneCovariate_3Sites' coded as follows:
## //Model 1 with G*E:  model with the betaX * Xp on the median of the sigma_clon
## 
## data {
##   int<lower=1> N;                                                              // Number of observations
##   vector[N] y;                                                                 // Response variable
##   int<lower=0> nblock;                                                         // Number of blocks
##   int<lower=0> nsite;                                                          // Number of sites
##   int<lower=0> nprov;                                                          // Number of provenances
##   int<lower=0> nclon;                                                          // Number of clones
##   int<lower=0, upper=nblock> bloc[N];                                          // Blocks
##   int<lower=0, upper=nprov> prov[N];                                           // Provenances
##   int<lower=0, upper=nsite> site[N];                                           // Sites
##   int<lower=0, upper=nsite*nclon> site_clon[N];                                // Sites * Clones
##   int<lower=0, upper=nprov> which_prov[nclon];                                 // Provenances
##   int<lower=0, upper=nclon> clon[N];                                           // Clones
##   vector[nprov] X;                                                             // Potential predictor of sigma_clon
## }
## 
## 
## parameters {
##   real beta0; // global intercept
##   real betaX; // coefficient of Xp
##   simplex[6] pi;
##   real<lower = 0> sigma_tot;
##   real<lower = 0> sigma_K;
##   vector[nprov] z_log_sigma_clon;
##   
##   vector[nblock] z_block;
##   vector[nprov] z_prov;
##   vector[nclon] z_clon;
##   vector[nsite] z_site;
##   vector[nsite*nclon] z_site_clon;
##   
## }
## 
## 
## transformed parameters {
##   real R_squared;
##   
##   real<lower = 0>  sigma_r;
##   real<lower = 0>  sigma_block;
##   real<lower = 0>  sigma_prov;
##   real<lower = 0>  sigma_site;
##   real<lower = 0>  sigma_site_clon;
##   
##   real mean_sigma_clon;
##   vector[nprov] sigma_clon;
##   
##   vector[nprov] alpha_prov;
##   vector[nclon] alpha_clon;
##   vector[nblock] alpha_block;
##   vector[nsite] alpha_site;
##   vector[nsite*nclon] alpha_site_clon;
##   
##   vector[N] mu; // linear predictor
##   
##   // variance partitioning with the simplex pi
##   sigma_r = sqrt(pi[1]) * sigma_tot;
##   sigma_block = sqrt(pi[2]) * sigma_tot;
##   sigma_prov = sqrt(pi[3]) * sigma_tot;
##   sigma_site = sqrt(pi[5]) * sigma_tot;
##   sigma_site_clon = sqrt(pi[6]) * sigma_tot;
##   
##   mean_sigma_clon= sqrt(pi[4]) * sigma_tot;
##   sigma_clon = exp(log(mean_sigma_clon) - (square(sigma_K)/2)  + betaX*X + z_log_sigma_clon*sigma_K);
##   
##   alpha_prov = z_prov*sigma_prov;
##   
##   for(c in 1:nclon){
##     alpha_clon[c] =  z_clon[c]*sigma_clon[which_prov[c]];
##   }
##   
##   alpha_block = z_block*sigma_block;
##   alpha_site = z_site*sigma_site;
##   alpha_site_clon = z_site_clon*sigma_site_clon;
##   
##   mu = rep_vector(beta0, N) + alpha_clon[clon] + alpha_block[bloc]  +  alpha_site[site]  + alpha_prov[prov] + alpha_site_clon[site_clon];
##   R_squared = 1 - variance(y - mu) / variance(y);
## }
## 
## model{
##   //Priors
##   beta0 ~ normal(mean(y),2);
##   betaX ~ std_normal();
##   sigma_tot ~ student_t(3, 0.0, 1.0);
##   
##   z_prov ~ std_normal();
##   z_block ~ std_normal();
##   z_clon ~ std_normal();
##   z_site ~ std_normal();
##   z_site_clon ~ std_normal();
##   
##   z_log_sigma_clon ~ std_normal();
##   sigma_K ~ exponential(1);
##   
##   // Likelihood
##   y ~ normal(mu, sigma_r);
## }
## 
## 
## generated quantities {
##   //Variances
##   real<lower=0> sigma2_r;
##   real<lower=0> sigma2_prov;
##   real<lower=0> sigma2_block;
##   real<lower=0> sigma2_site;
##   vector[nprov] sigma2_clon;
##   real<lower=0> sigma2_site_clon;
##   vector[nprov] h2_prov;
##   
##   // Posterior predictive check
##   vector[N] y_rep;
##   
##   // Log-Likelihood (for WAIC/loo computations)
##   vector[N] log_lik;
##   
##   sigma2_r = square(sigma_r);
##   sigma2_prov = square(sigma_prov);
##   sigma2_clon = square(sigma_clon);
##   sigma2_site = square(sigma_site);
##   sigma2_site_clon = square(sigma_site_clon);
##   for(p in 1:nprov)  h2_prov[p] = sigma2_clon[p]/(sigma2_r+sigma2_clon[p]);
##   sigma2_block = square(sigma_block);
##   for(i in 1:N)  {
##     y_rep[i] = normal_rng(mu[i], sigma_r);
##     log_lik[i] = normal_lpdf(y[i]| mu[i], sigma_r); // log probability density function
##   }
## }
##

10.2 Running the GE model

# The three height measurements we used:
traits <- c("POR_htoct12","BDX_htnov13","AST_htnov12")

# Height is not transformed
CenterResponseVariable <- FALSE
LogResponseVariable <- FALSE


# Downloading and scaling (mean-centering) the potential drivers
df.drivers <- readRDS(file="data/DF_Drivers.rds") %>% 
    dplyr::select(all_of(c("prov",selected.drivers))) %>% 
    dplyr::mutate(across(-prov,scale)) %>% 
    dplyr::mutate(across(-prov,as.numeric))

  
# Merging with the phenotypic data
df <- lapply(traits, function(trait){
   readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds") %>% 
    dplyr::filter(!(prov=="MAD")) %>% 
    dplyr::rename(trait=all_of(trait)) %>% 
    dplyr::select(prov,clon,tree,block,site,trait) %>% 
    dplyr::mutate(site_clon=paste0(site,"_",clon)) %>% 
    drop_na() %>%  
    execute_if(CenterResponseVariable, mutate(trait=trait-mean(trait))) %>% 
    execute_if(LogResponseVariable, mutate(trait=log(trait))) %>% 
    arrange(by=tree) %>% 
    inner_join(df.drivers,by="prov")
}) %>% bind_rows()


list.stan <- lapply(selected.drivers, function(x) {
    list(N=length(df$tree),
         y=df$trait,
         X=as.numeric(pull(unique(df[,c(x,"prov")]) %>% dplyr::select(contains(x)))),
         nprov = length(unique(df$prov)),
         nclon = length(unique(df$clon)),
         nblock = length(unique(df$block)),
         nsite = length(unique(df$site)),
         prov = as.numeric(as.factor(df$prov)),
         which_prov = as.numeric(as.factor(pull(unique(df[c("prov","clon")])[,"prov"]))),
         clon = as.numeric(as.factor(df$clon)),
         bloc = as.numeric(as.factor(df$block)),
         site = as.numeric(as.factor(df$site)),
         site_clon = as.numeric(as.factor(df$site_clon)))})
  
names(list.stan) <- selected.drivers

list.models <- list()

for(i in names(list.stan)){
  
list.models[[i]] <- sampling(model, 
                               data = list.stan[[i]], 
                               pars=c("betaX", "pi", "R_squared", "h2_prov"), # we keep only the parameters of interest to save space
                               iter = n_iter, 
                               chains = n_chains, 
                               cores = n_chains,
                               save_warmup = save_warmup,
                               thin=n_thin)}

saveRDS(list.models,file=paste0("outputs/models/HeightGE.rds"))

10.3 Figures of post. distributions

10.3.1 \(\beta_{X}\)

d <- readRDS(file=paste0("outputs/models/HeightGE.rds")) %>% 
  mclapply(function(x) broom::tidyMCMC(x,pars=("betaX"),
              droppars = NULL, 
              estimate.method = point.est, 
              ess = F, 
              rhat = F, 
              conf.int = T,
              conf.level = conf.level)) %>% 
  bind_rows() %>% 
  mutate(Drivers=selected.drivers) %>% 
  mutate(Drivers=factor(Drivers,levels=selected.drivers))


p <- ggplot(d,aes(x = Drivers, y = estimate,ymin = conf.low, ymax = conf.high,color=Drivers)) +
  geom_pointinterval(position = position_dodge(width = .6),point_size=5,size=5) +
  ylab(TeX("$\\beta_{X}$ estimates")) + xlab("") +
  scale_x_discrete(labels=selected.drivers.labels) +
  scale_color_manual(values=selected.drivers.colors) +
  theme_bw() +
  theme(axis.text.y = element_text(size=16),
        axis.text.x = element_text(size=11),
        axis.title = element_text(size=16),
        legend.position = "none")

p

ggsave(p,file=paste0("figs/PosteriorDistributions/HeightGE_BetaX.png"),height=8,width=12)

10.3.2 Broad-sense heritabilities

p <- readRDS(file=paste0("outputs/models/HeightGE.rds")) %>% 
  mclapply(function(x) {
    broom::tidyMCMC(x,pars=("h2_prov"),
                droppars = NULL, 
                estimate.method = point.est, 
                ess = F, 
                rhat = F, 
                conf.int = T,
                conf.level = conf.level)}) %>%  
  bind_rows(.id="Drivers") %>% 
  mutate(prov=rep(provs,length(selected.drivers))) %>% 
  mutate(Drivers=factor(Drivers,levels=selected.drivers)) %>% 
  ggplot(aes(x = prov, y = estimate,ymin = conf.low, ymax = conf.high,color=Drivers)) +
  geom_pointinterval(position = position_dodge(width = .8),point_size=2,alpha=0.6,size=4) +
  scale_color_manual(values=selected.drivers.colors,
                     labels=selected.drivers.labels) +
  xlab("Populations") +
  ylab(TeX("$\\H^{2}_{p}$ estimates")) +
  labs(color = "Potential drivers") +
  theme(legend.position = "bottom",
        panel.grid.minor.x=element_blank(),
        panel.grid.major.x=element_blank())  +
    guides(color=guide_legend(ncol=5))

p

ggsave(p,file=paste0("figs/PosteriorDistributions/HeightGE_H2prov.png"),height=6,width=22)

10.3.3 Variance partitioning

p <- readRDS(file=paste0("outputs/models/HeightGE.rds")) %>% 
  mclapply(function(x) {
    broom::tidyMCMC(x,pars=("pi"),
                droppars = NULL, estimate.method = point.est, ess = F, rhat = F, conf.int = T,conf.level = 0.95)}) %>%  
  bind_rows(.id="Drivers") %>% 
  mutate(prop.var=rep(c("Residuals","Blocks","Populations","Clones","Sites","Sites*Clones"),length(selected.drivers))) %>% 
  mutate(prop.var=factor(prop.var, 
                         levels=c("Sites","Blocks","Sites*Clones","Populations","Clones","Residuals"))) %>%
  mutate(Drivers=factor(Drivers,levels=selected.drivers)) %>% 
  ggplot(aes(x = prop.var, y = estimate,ymin = conf.low, ymax = conf.high,color=Drivers)) +
  geom_pointinterval(position = position_dodge(width = .8),point_size=3,alpha=0.6,size=5) +
  scale_color_manual(values=selected.drivers.colors,
                     labels=selected.drivers.labels) +
  xlab("") +
  ylab("Proportion of variance explained") +
  labs(color = "Potential drivers") +
  theme(axis.text = element_text(size=25),
        panel.grid.minor.x=element_blank(),
        panel.grid.major.x=element_blank())  +
    guides(color=guide_legend(ncol=1))

p

ggsave(p,file=paste0("figs/PosteriorDistributions/HeightGE_VarPart.png"),height=6,width=18)

Bibliography

Gelman, Andrew, Ben Goodrich, Jonah Gabry, and Aki Vehtari. 2019. “R-Squared for Bayesian Regression Models.” The American Statistician. Taylor & Francis.

Rodríguez-Quilón, Isabel, Luis Santos-del-Blanco, María Jesús Serra-Varela, Jarkko Koskela, Santiago C González-Martínez, and Ricardo Alía. 2016. “Capturing Neutral and Adaptive Genetic Diversity for Conservation in a Highly Structured Tree Species.” Ecological Applications 26 (7). Wiley Online Library: 2254–66.

Bayesian analysis to associate within-population genetic variation with potential underlying drivers

Based on real data

Juliette Archambeau

10 janvier, 2023