Estimating plasticity at the population level

Villemereuil & Chevin method & RDPI of Valladeres

Author

Juliette Archambeau

Published

Invalid Date

1 Introduction

The goal of this report is to determine whether the populations have different plastic responses. For what, we use different approaches presented below.

The focus trait of the present report is (except in Section 6 and Section 7 which are applied to many traits):

Code
trait <- "HA20"

To account for the nursery effects, we estimate plasticity to environment variation originating from the combined effect of the site and the nursery. For that, in the present report, we only use the two common gardens in which the trees come from a unique nursery, i.e. FS (in which all trees come from NG) and FE (in which all trees come from NE, except four trees that come from NG and that we remove to estimate plasticity).

In Section 3, we partitioned the variance of the trait of interest at the species level, i.e., with all populations combined. For that, we follow the methodology in de Villemereuil & Chevin (2025) to partition the phenotypic variance of the traits of interest. Below are sections extracted from the paper:

\[V_P = V_{Plas} + V_{Gen} + V_{Res}\]

\(V_P\) is the total phenotypic variance of the trait. \(V_{Plas}\) is the phenotypic variance arising from changes in the mean reaction norm across environments. \(V_{Gen}\) is the total genetic variance in the trait across environments. \(V_{Res}\) is the residual variance, not explained by the reaction norm.

\[V_{Gen} = V_{Add} + V_{nonAdd}\]

\(V_{Add}\) is the total additive genetic variance in the trait across environments. \(V_{nonAdd}\) is the non additive genetic variance in the trait.

\[V_{Add} = V_A + V_{AxE}\]

\(V_A\) is the environment-blind additive genetic variance of the trait, i.e., based on the mean breeding values across environments. \(V_{AxE}\) is the additive genetic variance arising from plasticity, i.e., variance of the mean-centered breeding values.

From this, it is possible to derive unitless quantities of interest.

\(P_{RN}^2 = V_{Plas}/V_P\) is the proportion of the phenotypic variance arising from average plastic responses to environments.

Variance-standardised additive genetic variances are heritabilities:

\(h_{RN}^2=\frac{V_{Add}}{V_P}=\frac{V_A}{V_P}+\frac{V_{AxE}}{V_P}=h^2+h^2_I\)

In other words, the heritability of the trait when fully accounting for its reaction norm (\(h_{RN}^2\)) is equal to the environment-blind heritability of the trait (\(h^2\), based on the breeding values averaged across environments) plus the heritability from plasticity (\(h_I^2\), based on the breeding values by environment interaction).

\(T_{RN}^2=\frac{V_{Plas}+V_{Gen}}{V_P}=P_{RN}^2+h_{RN}^2\)

\(T_{RN}^2\) is the proportion of the phenotypic variance explained by the (possibly plastic and genetically variable) reaction norm, and thus our ability to predict the individual phenotype from the genotype and the environment.

To apply this methodology, we followed the tutorials associated with the Reacnorm R package: https://github.com/devillemereuil/Reacnorm.

In Section 4, we attempted to partition the variance of the trait of interest for each population individually. However, it appeared that the sample sizes were likely too small to allow the models to converge.

In Section 5.1, we estimated population varying slopes (i.e., random slopes) to quantify the mean change in the trait of interest across two different sites (Yair and Glensaugh) for each population. This approach (if I have understood correctly) would be similar to estimate \(V_{Plas}\) for each population. While this estimate would not indicate the genetic variation in reaction norms within populations, it would allow for comparisons of the mean plastic responses across sites among populations. In Section 5.2, we estimated varying slopes at the climatic cluster level instead of the population level.

In Section 6, we calculate the relative distance plasticity index (RDPI) from Valladares, Sanchez-Gomez, and Zavala (2006) In this paper, the authors propose to quantify phenotypic plasticity based on phenotypic distances among individuals of a given species (in our study, population) exposed to different environments (in our study, planted in different common gardens), which is summarized in the RDPI index that allows for statistical comparisons of phenotypic plasticity between species (in our study, populations).

For a given trait, the relative distance plasticity index \(RDPI_p\) for a given population \(p\) can be calculated as follows:

\[RDPI_p = \frac{\sum_{n=1}^N{\frac{x_{ij}-x_{i'j'}}{x_{ij}+x_{i'j'}}}}{N}\]

where \(N\) is the number of pairwise comparisons among individuals of the same population but from different common gardens. \(x_{ij}\) and \(x_{i'j'}\) are the phenotypic values of individual \(j\) and \(j'\) in common gardens \(i\) and \(i’\). As such, the phenotypic plasticity of a population for a given trait corresponds to the sum of the relative phenotypic distances for all pairs of individuals of the population grown in different common gardens. We calculate the RDPI for height measurements, the duration of bud burst and the number of days to reach of stage 4, 5 or 6 of budburst.

Comment: Note that RDPI may also be calculated at the family level (instead of the population level as above) and then be averaged for each population. However, as some families in FS are not the as in FE and FW, I thought that calculating the RDPI indexes directly at the population level would allow to consider all families (not having to remove some). But this is point I will be happy to discuss!

In Section 7, we calculate the difference in mean trait values between the two studied sites for each population.

Last, we compared the correlations among population slopes obtained in Section 5.1, RDPI calculated in Section 6 and the raw differences calculated in Section 7.

2 The data

We load the phenotypic data from the common gardens.

Code
data <- readRDS(file=here("data/ScotsPine/formatted_CG_data.rds"))

The dataset contains 1848 individuals, 21 populations, 177 families. Here are the first 10 rows:

Code
data[1:10,] %>% kable_mydf()
PopulationCode Family FieldSite FieldCode Block HA13 HA14 HA15 HA16 HA17 HA18 HA19 HA20 HI13 HI14 HI15 HI16 HI17 HI18 HI19 HI20 DA13 DA14 DA15 DA16 DA17 DA18 DA19 DA20 DI14 DI15 DI16 DI17 DI18 DI19 DI20 BD15 BD16 BD17 BD18 BD19 BT15_4 BT15_5 BT15_6 BT16_4 BT16_5 BT16_6 BT17_4 BT17_5 BT17_6 BT18_4 BT18_5 BT18_6 BT19_4 BT19_5 BT19_6 Nursery
GC GC5 FE 5001 FE_A 780 975 1190 1470 1890 2440 2800 3220 140 195 215 280 420 550 360 420 17.99 30.83 42.73 62.9 82 104 130.51 146.42 12.84 11.90 20.17 19.1 22 26.51 15.92 21 37 21 14 28 58 72 79 48 64 85 47 61 68 54 61 68 46 60 74 NE
GE GE9 FE 5002 FE_A 690 875 1080 1400 1730 2010 2400 2960 125 185 205 320 330 280 390 560 19.50 35.38 47.00 67.2 82 127 143.24 165.52 15.88 11.62 20.20 14.8 45 16.24 22.28 15 28 23 21 21 64 72 79 57 69 85 52 61 75 54 61 75 53 60 74 NE
BE BE4 FE 5003 FE_A 630 670 695 730 NA NA NA NA 55 40 25 35 NA NA NA NA 9.08 7.26 8.67 7.6 NA NA NA NA 0.00 1.41 0.00 NA NA NA NA 28 33 NA NA NA 58 79 86 57 85 90 47 75 NA 75 NA NA NA NA NA NE
RM RM7 FE 5004 FE_A 760 1010 1270 1610 2035 2320 2700 3110 110 250 260 340 425 285 380 410 19.73 37.21 49.81 63.7 89 108 136.87 162.34 17.48 12.60 13.89 25.3 19 28.87 25.46 35 30 28 21 14 44 58 79 48 64 78 40 61 68 54 61 75 46 53 60 NE
SO SO2 FE 5005 FE_A 690 870 995 1140 1400 1710 2030 2520 90 180 125 145 260 310 320 490 13.94 24.78 34.99 47.0 61 91 100.27 120.96 10.84 10.21 12.01 14.0 30 9.27 20.69 35 28 35 21 28 44 72 79 57 64 85 47 61 82 54 68 75 46 60 74 NE
MG MG10 FE 5006 FE_A 635 820 1050 1280 1650 2040 2270 2750 95 185 230 230 370 390 230 480 14.83 22.99 35.27 46.2 60 96 103.45 114.59 8.16 12.28 10.93 13.8 36 7.45 11.14 14 21 21 14 14 58 64 72 57 64 78 47 61 68 54 61 68 46 53 60 NE
BB BB8 FE 5007 FE_A 770 1015 1300 1560 1925 2230 2650 3170 140 245 285 260 365 305 420 520 20.50 38.79 49.88 69.3 86 98 120.96 133.69 18.29 11.09 19.42 16.7 12 22.96 12.73 21 28 28 14 14 58 72 79 57 69 85 47 61 75 54 61 68 46 53 60 NE
CR CR5 FE 5008 FE_A 905 1150 1405 1760 2340 2790 3270 3790 140 245 255 355 580 450 480 520 25.10 37.50 52.29 75.5 98 128 151.20 171.89 12.40 14.79 23.21 22.5 30 23.20 20.69 21 28 23 21 14 58 64 79 57 69 85 52 61 75 47 61 68 46 53 60 NE
GT GT8 FE 5009 FE_B 885 1105 1345 1670 2090 2490 2880 3420 190 220 240 325 420 400 390 540 19.38 31.31 45.79 67.1 86 109 138.46 159.15 11.93 14.48 21.31 18.9 23 29.46 20.69 15 26 30 21 28 64 72 79 64 78 90 52 68 82 61 68 82 46 60 74 NE
CC CC7 FE 5010 FE_B 610 720 905 1040 1320 1520 1810 2200 60 110 185 135 280 200 290 390 14.48 23.69 33.37 47.5 57 82 92.31 98.68 9.21 9.68 14.13 9.5 25 10.31 6.37 21 28 28 21 28 58 64 79 57 64 85 47 61 75 47 61 68 46 53 74 NE

We load the genomic relationship matrix.

Code
GRM <- readRDS(file=here("data/ScotsPine/GenomicData/data_update_Fev2025/formatted_data/Gmat_VanRaden.rds"))
GRM[1:10,1:10]
             5001         5002          5004          5005         5006          5007         5008          5009         5010         5011
5001  1.001754392  0.006519614  6.021209e-03 -3.308892e-03 -0.012986334 -0.0103246489 -0.005507989 -0.0099405467  0.014779192  0.006420764
5002  0.006519614  1.005616433  2.987842e-02  9.731069e-03 -0.024867057 -0.0018414560 -0.007189568 -0.0091610829 -0.001410565  0.003708786
5004  0.006021209  0.029878415  1.014206e+00  6.354084e-05 -0.009920033  0.0018114160 -0.001273047 -0.0058235843 -0.015339580 -0.012354560
5005 -0.003308892  0.009731069  6.354084e-05  1.003141e+00 -0.013168210  0.0049938202  0.008473645 -0.0073622696 -0.022385254  0.012409120
5006 -0.012986334 -0.024867057 -9.920033e-03 -1.316821e-02  1.010319132  0.0020051460 -0.016777030 -0.0029030730 -0.005454051  0.001790722
5007 -0.010324649 -0.001841456  1.811416e-03  4.993820e-03  0.002005146  1.0094462709 -0.013189947  0.0003566858 -0.004515659  0.022918155
5008 -0.005507989 -0.007189568 -1.273047e-03  8.473645e-03 -0.016777030 -0.0131899473  0.995407024  0.0026715333  0.002578422 -0.008585459
5009 -0.009940547 -0.009161083 -5.823584e-03 -7.362270e-03 -0.002903073  0.0003566858  0.002671533  1.0080698159 -0.015530987 -0.022970375
5010  0.014779192 -0.001410565 -1.533958e-02 -2.238525e-02 -0.005454051 -0.0045156593  0.002578422 -0.0155309866  1.059251781 -0.006317228
5011  0.006420764  0.003708786 -1.235456e-02  1.240912e-02  0.001790722  0.0229181549 -0.008585459 -0.0229703749 -0.006317228  1.033974344

Matrix dimensions:

Code
dim(GRM)
[1] 1761 1761

3 Species level model with GRM

We filtered the phenotypic data to retain only the trait of interest and measurements from the Glensaugh (FE) and Yair (FS) sites. Additionally, we excluded the four trees measured in FE that originated from the NG nursery.

The remaining trees all originate either from the FE site and the NE nursery or the FS site and the NG nursery.

Code
subdata <- data %>% 
  dplyr::select(FieldSite,Block,FieldCode,PopulationCode,Family,Nursery,all_of(trait)) %>% 
  drop_na() %>% # 1848 to 1765 values
  filter(FieldSite %in% c("FE","FS")) %>% # 1765 to 1311
  filter(!(FieldSite=="FE" & Nursery =="NG")) %>% # 1311 to 1307
  filter((FieldCode%in%rownames(GRM))) %>%  # 1307 to 1293
  dplyr::rename(y=any_of(trait)) %>% 
  arrange(FieldCode)

# filter the grm with the remainng trees
GRM <-  GRM[as.character(subdata$FieldCode), as.character(subdata$FieldCode), drop = FALSE]
# dim(GRM)
# GRM[1:10,1:10]
# 
# all.equal(as.character(subdata$FieldCode),colnames(GRM))
# all.equal(as.character(subdata$FieldCode),rownames(GRM))
Code
n_chains <- 4 # number of chains (MCMC)
n_iter <- 6000 # number of iterations
n_warm <- 3000 # number of iterations in the warm-up phase
n_thin <- 1 # thinning interval
n_seed <- 404

form <- brmsformula(y ~ 0 + FieldSite + (0 + FieldSite | gr(FieldCode, cov = GRM)))

mod <- brm(formula = form,
           data = subdata,
           data2 = list(GRM=GRM),
           save_pars = save_pars(group = FALSE),
           chains = n_chains,
           cores = n_chains,
           seed = n_seed,
           iter = n_iter,
           warmup = n_warm)

saveRDS(mod, file = here(paste0("outputs/ScotsPine/StanModels/",trait,"_TwoCG_SpeciesLevelModel_PartVarVillemereuil.rds")))

# Messages d'avis :
# 1: There were 4 chains where the estimated Bayesian Fraction of Missing Information was low. See
# https://mc-stan.org/misc/warnings.html#bfmi-low 
# 2: Examine the pairs() plot to diagnose sampling problems
#  
# 3: The largest R-hat is 1.06, indicating chains have not mixed.
# Running the chains for more iterations may help. See
# https://mc-stan.org/misc/warnings.html#r-hat 
# 4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
# Running the chains for more iterations may help. See
# https://mc-stan.org/misc/warnings.html#bulk-ess 
# 5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
# Running the chains for more iterations may help. See
# https://mc-stan.org/misc/warnings.html#tail-ess 

Model summary:

Code
mod <- readRDS(here(paste0("outputs/ScotsPine/StanModels/",trait,"_TwoCG_SpeciesLevelModel_PartVarVillemereuil.rds")))
summary(mod)
Warning: Parts of the model have not converged (some Rhats are > 1.05). Be careful when analysing the results! We recommend running more iterations and/or setting stronger priors.
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: y ~ 0 + FieldSite + (0 + FieldSite | gr(FieldCode, cov = GRM)) 
   Data: subdata (Number of observations: 1293) 
  Draws: 4 chains, each with iter = 3000; warmup = 1000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~FieldCode (Number of levels: 1293) 
                             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(FieldSiteFE)                433.77     39.18   356.40   510.54 1.03       81      108
sd(FieldSiteFS)                619.30     37.47   546.37   693.01 1.02       91      175
cor(FieldSiteFE,FieldSiteFS)     0.60      0.12     0.35     0.83 1.01      309      635

Regression Coefficients:
            Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
FieldSiteFE  2722.76     44.62  2635.01  2810.47 1.00     3475     3742
FieldSiteFS  4064.35     87.32  3891.72  4238.29 1.00     4331     5205

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma   323.77     44.47   225.01   403.03 1.05       63       88

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Code
plot(mod, variable=c("b_FieldSiteFE", "b_FieldSiteFS","sigma"))

Code
plot(mod, variable=c("cor_FieldCode__FieldSiteFE__FieldSiteFS", "sd_FieldCode__FieldSiteFE","sd_FieldCode__FieldSiteFS"))

Code
# Getting the mean uncertainty on the fixed-effect parameters
var_uncert <- vcov(mod) |> # extracts the variance-covariance matrix of fixed effects.
  diag() |> # diagonal values represent the variances of each fixed effect parameter.
  mean()  # Calculate mean uncertainty across parameters

# Computing V_plas (variance of the average reaction norm)
post_theta_cs <- fixef(mod, summary = FALSE) |> # extracts posterior samples for fixed effects
  as_draws_df() # formats them as a posterior draws object.

# Select fixed effects corresponding to the sites and compute their row-wise variances.
var_plas_cs <- post_theta_cs |> 
  select(starts_with("FieldSite")) |>
  as.matrix() |> 
  rowVars()

# Correcting V_plas for uncertainty
post_cs <- data.frame(V_Plas = var_plas_cs - var_uncert) |>  # subtract the mean parameter uncertainty (var_uncert) from the raw V_plas values.
  cbind(select(post_theta_cs, starts_with("."))) |>  # combine with MCMC metadata (e.g., chain, iteration, draw) to create a valid posterior object.
  as_draws_df()

# Getting the G-matrix (variance-covariance of random effects)
post_cs[["G"]] <- VarCorr(mod, summary = FALSE)[["FieldCode"]][["cov"]] |> # extracts variance-covariance matrices of random effects
  apply(1, \(mat_) { mat_ }, simplify = FALSE) # apply() to structure the posterior samples into a list of matrices for further processing

# Getting the residual variance
# Extract posterior samples for residual variance
post_cs[["V_R"]] <- VarCorr(mod, summary = FALSE)[["residual__"]][["sd"]][, 1]^2

# Thinning the posterior draws for computational efficiency
# Thin the posterior draws to reduce computation time while maintaining sufficient information
post_cs <- thin_draws(post_cs, thin = length(var_plas_cs) / 1000)

# Save MCMC metadata (iteration, chain, etc.) for further use
post_cs_info <- select(post_cs, starts_with("."))

# Summarising V_Plas (posterior estimates and diagnostics)
# Display summary statistics for V_Plas (mean, median, credible intervals).
summarise_draws(subset_draws(post_cs, variable = "V_Plas"))  %>% kable_mydf()
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
V_Plas 899407.1 892926.2 113381.7 116218.4 725103.6 1091157 1.01 994.11 867.31
Code
# Computing genetic variance decomposition
# Apply function to compute genetic variances from the G-matrix
post_gen_cs <- 
  map(post_cs[["G"]], rn_cs_gen, .progress = TRUE) |> 
  bind_rows() |> 
  select(where(\(col_) { abs(mean(col_)) > 10^-5 })) |>  # Remove negligible components
  cbind(post_cs_info) |> 
  as_draws_df()

# Summarise genetic variance components
summarise_draws(post_gen_cs) %>% kable_mydf()
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
V_Add 287092.21 284848.2 37209.54 36096.99 230634.96 350583.55 1.04 75.56 95.73
V_A 224172.09 222120.3 32074.89 30614.86 174294.76 277512.25 1.02 112.43 196.98
V_AxE 62920.12 61354.6 18430.24 17296.40 34739.81 94787.41 1.01 176.29 363.59
N_eff 1.21 1.2 0.07 0.07 1.10 1.33 1.01 258.65 432.38
Code
# Compute total phenotypic variance and variance-standardised estimates
post_var_tpc_cs <- bind_draws(post_cs, post_gen_cs) |> 
  subset_draws(variable = c("V_Plas", "V_Add", "V_A", "V_AxE", "V_R")) |> 
  mutate_variables(V_Tot = V_Plas + V_Add + V_R)

# Compute standardised variance estimates (P2, H2_RN, H2, H2_I, T2)
post_std_tpc_cs <- post_var_tpc_cs |> 
  transmute(
    P2_RN = V_Plas / V_Tot,       # Proportion of phenotypic variance due to plasticity
    H2_RN = V_Add / V_Tot,        # Heritability of reaction norm
    H2 = V_A / V_Tot,             # Heritability of environment-blind genetic variance
    H2_I = V_AxE / V_Tot,         # Heritability of genetic variance due to plasticity
    T2_RN = (V_Plas + V_Add) / V_Tot # Total variance explained by genetic and plasticity components
  ) |> 
  cbind(post_cs_info) |> 
  as_draws_df()

# Summarising standardised variance estimates
summarise_draws(post_std_tpc_cs) %>% kable_mydf()
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
P2_RN 0.69 0.69 0.03 0.03 0.64 0.74 1.00 1011.07 981.12
H2_RN 0.22 0.22 0.03 0.03 0.18 0.28 1.01 122.40 265.65
H2 0.17 0.17 0.03 0.03 0.13 0.22 1.01 213.30 473.93
H2_I 0.05 0.05 0.01 0.01 0.03 0.07 1.01 192.40 447.42
T2_RN 0.92 0.92 0.02 0.02 0.88 0.96 1.03 69.61 91.77

MCMC traces of the parameters of interest:

Code
mcmc_trace(post_std_tpc_cs)

Posterior distributions of the parameters of interest:

Code
mcmc_areas(post_std_tpc_cs, prob = 0.95, area_method = "scaled height")

Main take-home message: At the species level, the heritability of the reaction norm (\(h_{RN}^2\)) mainly comes from the environment-blind heritability of the trait (\(h^2\), based on the breeding values averaged across environments) and not from the heritability from plasticity (\(h_I^2\), based on the breeding values by environment interaction).

4 Population-level models with GRM

We apply the same model separately to each population.

Code
pop_names <- subdata$PopulationCode %>% unique() %>% sort # sorted by alphabetical number

lapply(pop_names, function(pop_i) {
  
  subdata_pop <- subdata %>% 
    filter(PopulationCode %in% pop_i)
  
  GRM_pop <-  GRM[as.character(subdata_pop$FieldCode), as.character(subdata_pop$FieldCode), drop = FALSE]
  
  form <- brmsformula(y ~ 0 + FieldSite + (0 + FieldSite | gr(FieldCode, cov = GRM_pop)))
  
  mod <- brm(formula = form,
             data = subdata_pop,
             data2 = list(GRM_pop=GRM_pop),
             save_pars = save_pars(group = FALSE),
             chains = n_chains,
             cores = n_chains,
             seed = n_seed,
             iter = n_iter,
             warmup = n_warm,
             control = list(adapt_delta = 0.9))
  
  saveRDS(mod, file = here(paste0("outputs/ScotsPine/StanModels/PopulationLevelModels/",trait,"_TwoCG_PopLevelModel_PartVarVillemereuil_pop",pop_i,".rds")))
  })

Let’s look at the model for the AB population.

Code
pop_i <- "AB"

mod <- readRDS(here(paste0("outputs/ScotsPine/StanModels/PopulationLevelModels/",trait,"_TwoCG_PopLevelModel_PartVarVillemereuil_pop",pop_i,".rds")))
  
summary(mod)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: y ~ 0 + FieldSite + (0 + FieldSite | gr(FieldCode, cov = GRM_pop)) 
   Data: subdata_pop (Number of observations: 61) 
  Draws: 4 chains, each with iter = 8000; warmup = 6000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~FieldCode (Number of levels: 61) 
                             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(FieldSiteFE)                293.78    174.38    15.49   622.24 1.02      316      904
sd(FieldSiteFS)                524.68    227.46    55.75   928.75 1.02      363      983
cor(FieldSiteFE,FieldSiteFS)     0.07      0.57    -0.94     0.95 1.00      584     1661

Regression Coefficients:
            Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
FieldSiteFE  2429.46    121.62  2187.94  2671.43 1.00     3499     3030
FieldSiteFS  3720.41    173.96  3367.54  4061.40 1.00     2889     3374

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma   465.11    138.67   180.66   689.41 1.04      245      298

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Code
plot(mod, nvariables = 3)

Code
# Getting the mean uncertainty on the fixed-effect parameters
var_uncert <- vcov(mod) |> # extracts the variance-covariance matrix of fixed effects
  diag() |> # diagonal values represent the variances of each fixed effect parameter
  mean()  # Calculate mean uncertainty across parameters
  
# Computing V_plas (variance of the average reaction norm)
post_theta_cs <- fixef(mod, summary = FALSE) |> # extracts posterior samples for fixed effects
  as_draws_df() # formats them as a posterior draws object.
  
# Select fixed effects corresponding to the field or environment and compute their row-wise variances.
var_plas_cs <- post_theta_cs |> 
  select(starts_with("FieldSite")) |>  # Match "FieldSite" to your model's fixed-effect prefix
  as.matrix() |> 
  rowVars()
  
# Correcting V_plas for uncertainty
post_cs <- data.frame(V_Plas = var_plas_cs - var_uncert) |>  # subtract the mean parameter uncertainty (var_uncert) from the raw V_plas values.
  cbind(select(post_theta_cs, starts_with("."))) |>  # combine with MCMC metadata (e.g., chain, iteration, draw) to create a valid posterior object.
  as_draws_df()
  
# Getting the G-matrix (variance-covariance of random effects)
post_cs[["G"]] <- VarCorr(mod, summary = FALSE)[["FieldCode"]][["cov"]] |> # extracts variance-covariance matrices of random effects
  apply(1, \(mat_) { mat_ }, simplify = FALSE) # apply() to structure the posterior samples into a list of matrices for further processing
  
# Getting the residual variance
# Extract posterior samples for residual variance
post_cs[["V_R"]] <- VarCorr(mod, summary = FALSE)[["residual__"]][["sd"]][, 1]^2

# Thinning the posterior draws for computational efficiency
# Thin the posterior draws to reduce computation time while maintaining sufficient information
post_cs <- thin_draws(post_cs, thin = length(var_plas_cs) / 1000)

# Save MCMC metadata (iteration, chain, etc.) for further use
post_cs_info <- select(post_cs, starts_with("."))

# Summarising V_Plas (posterior estimates and diagnostics)
# Display summary statistics for V_Plas (mean, median, credible intervals).
summarise_draws(subset_draws(post_cs, variable = "V_Plas")) %>% kable_mydf()
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
V_Plas 843991.2 823829 269086.3 256800 432691 1318222 1 839.38 674.83
Code
# Computing genetic variance decomposition
# Apply function to compute genetic variances from the G-matrix
post_gen_cs <- 
  map(post_cs[["G"]], rn_cs_gen, .progress = TRUE) |> 
  bind_rows() |> 
  select(where(\(col_) { abs(mean(col_)) > 10^-5 })) |>  # Remove negligible components
  cbind(post_cs_info) |> 
  as_draws_df()

# Summarise genetic variance components
summarise_draws(post_gen_cs) %>% kable_mydf()
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
V_Add 221053.51 195732.30 154679.75 168032.11 17974.77 497873.04 1.02 262.67 530.69
V_A 125014.64 88135.03 115710.58 92939.27 7120.32 370332.53 1.01 279.03 304.82
V_AxE 96038.87 71722.10 85708.77 62358.55 6832.29 272565.88 1.01 442.08 498.37
N_eff 1.17 1.12 0.17 0.14 1.00 1.52 1.01 813.99 954.07
Code
# Compute total phenotypic variance and variance-standardised estimates
post_var_tpc_cs <- bind_draws(post_cs, post_gen_cs) |> 
  subset_draws(variable = c("V_Plas", "V_Add", "V_A", "V_AxE", "V_R")) |> 
  mutate_variables(V_Tot = V_Plas + V_Add + V_R)

# Compute standardised variance estimates (P2, H2_RN, H2, H2_I, T2)
post_std_tpc_cs <- post_var_tpc_cs |> 
  transmute(
    P2 = V_Plas / V_Tot,          # Proportion of phenotypic variance due to plasticity
    H2_RN = V_Add / V_Tot,        # Heritability of reaction norm
    H2 = V_A / V_Tot,             # Heritability of environment-blind genetic variance
    H2_I = V_AxE / V_Tot,         # Heritability of genetic variance due to plasticity
    T2 = (V_Plas + V_Add) / V_Tot # Total variance explained by genetic and plasticity components
    ) |> 
  cbind(post_cs_info) |> 
  as_draws_df()

# Summarising standardised variance estimates
summarise_draws(post_std_tpc_cs) %>% kable_mydf()
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
P2 0.64 0.65 0.09 0.09 0.47 0.76 1.00 979.70 1035.54
H2_RN 0.17 0.15 0.12 0.13 0.02 0.39 1.02 294.65 481.90
H2 0.10 0.07 0.09 0.07 0.01 0.29 1.01 297.91 367.47
H2_I 0.08 0.06 0.07 0.05 0.01 0.21 1.01 496.38 630.76
T2 0.81 0.81 0.10 0.11 0.64 0.97 1.03 241.57 342.04
Code
# Plotting MCMC diagnostics and posterior distributions
mcmc_trace(post_std_tpc_cs)

Code
mcmc_areas(post_std_tpc_cs, prob = 0.95, area_method = "scaled height")

MCMC traces of the parameters of interest:

Code
mcmc_trace(post_std_tpc_cs)

Posterior distributions of the parameters of interest:

Code
mcmc_areas(post_std_tpc_cs, prob = 0.95, area_method = "scaled height")

Main take-home message: We do not have enough data to estimate those parameters of interest at the population level.

5 Species level model with varying slopes

5.1 At the population level

Code
subdata <- data %>% 
  dplyr::select(FieldSite,Block,FieldCode,PopulationCode,Family,Nursery,all_of(trait)) %>% 
  drop_na() %>% # 1848 to 1765 values
  filter(FieldSite %in% c("FE","FS")) %>% # 1765 to 1311
  filter(!(FieldSite=="FE" & Nursery =="NG")) %>% # 1311 to 1307
  dplyr::rename(y=any_of(trait)) %>% 
  arrange(FieldCode) %>% 
  mutate(FieldSite_num = as.numeric(as_factor(FieldSite)))
Code
# Model options
n_chains <- 4 # number of chains (MCMC)
n_iter <- 8000 # number of iterations
n_warm <- 4000 # number of iterations in the warm-up phase
n_thin <- 1 # thinning interval
n_seed <- 404

form <- brmsformula(y ~ 1 + FieldSite_num + (1 + FieldSite_num | PopulationCode))

mod <- brm(formula = form,
           data = subdata,
           chains = n_chains,
           cores = n_chains,
           seed = n_seed,
           iter = n_iter,
           warmup = n_warm,
           control = list(adapt_delta = 0.95))

# no warnings!

saveRDS(mod, file = here(paste0("outputs/ScotsPine/StanModels/",trait,"_TwoCG_SpeciesLevelModel_PopulationVaryingSlopes.rds")))

Summary of the model:

Code
mod <- readRDS(file = here(paste0("outputs/ScotsPine/StanModels/",trait,"_TwoCG_SpeciesLevelModel_PopulationVaryingSlopes.rds"))) 

summary(mod)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: y ~ 1 + FieldSite_num + (1 + FieldSite_num | PopulationCode) 
   Data: subdata (Number of observations: 1307) 
  Draws: 4 chains, each with iter = 8000; warmup = 4000; thin = 1;
         total post-warmup draws = 16000

Multilevel Hyperparameters:
~PopulationCode (Number of levels: 21) 
                             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)                  215.02     69.47    94.29   368.63 1.00     5973     7150
sd(FieldSite_num)               80.78     49.54     4.19   188.75 1.00     1747     5314
cor(Intercept,FieldSite_num)    -0.22      0.49    -0.89     0.88 1.00     5615     5919

Regression Coefficients:
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept      1451.71     72.86  1307.33  1593.62 1.00    11668    10884
FieldSite_num  1256.95     39.08  1180.61  1334.74 1.00    16523    11538

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma   600.95     12.07   577.99   625.09 1.00    24531    11256

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Interval plots of the population varying slopes (i.e., random slopes).

Code
# Extract posterior samples for population-level slopes
population_slopes <- as_draws_df(mod) %>%
  spread_draws(r_PopulationCode[PopulationCode, FieldSite_num]) %>% 
  filter(FieldSite_num=="FieldSite_num")

population_slopes %>%
  ggplot(aes(x = r_PopulationCode, y = PopulationCode)) +
  stat_halfeye(
    .width = c(0.80, 0.95),   # Specify intervals
    fill = "skyblue",
    point_interval = "median_qi",
    color = "blue") +
  theme_bw() +
  labs(title = "MCMC areas of population varying slopes (median, 80% and 95% credible intervals)",
       x = "Slope",
       y = "Population")

Global slope of the model and its 95% credible interval.

Code
mod_coeff <- mod %>% 
  broom.mixed::tidyMCMC(
    robust = T, # median and mean absolute deviation are used to compute point estimates and uncertainty
    ess = F, 
    rhat = F, 
    conf.int = T,
    conf.level = 0.95)

global_intercept <- mod_coeff$estimate[["b_Intercept"]]
global_slope <- mod_coeff$estimate[["b_FieldSite_num"]]
global_slope_low95 <- mod_coeff[mod_coeff$term=="b_FieldSite_num","conf.low"] %>% pull()
global_slope_high95 <- mod_coeff[mod_coeff$term=="b_FieldSite_num","conf.high"] %>% pull()

ggplot(data = subdata) +
  geom_point(aes(x = FieldSite_num, y = y), color = "gray") +
  scale_x_continuous(breaks = c(1, 2), 
                     limits = c(0.8, 2.2),
                     labels = c("Glensaugh (FE)", "Yair (FS)")) +  # Set x-axis limits and breaks
    geom_ribbon(aes(x = FieldSite_num, 
                    ymin = global_intercept + global_slope_low95 * FieldSite_num, 
                    ymax = global_intercept + global_slope_high95 * FieldSite_num),  
                inherit.aes = FALSE, 
                fill = "blue", 
                alpha = 0.2) + 
  geom_abline(slope = global_slope, 
              intercept = global_intercept, 
              color = "blue", 
              linetype = "dashed", 
              linewidth = 1) +
  theme_bw() +
  labs(title = "Global slope (median and 95% intervals)",
       x = "",
       y = "Tree height (mm)")

Now we look the estimated intercept and slope of each population. The credible intervals reflect uncertainty in the estimation of the slope only.

Code
pop_names <- subdata$PopulationCode %>% unique %>% sort

lapply(pop_names, function(pop_i){
  
  term_intercept <- paste0("r_PopulationCode[",pop_i,",Intercept]")
  term_slope <- paste0("r_PopulationCode[",pop_i,",FieldSite_num]")
  pop_intercept <- mod_coeff[mod_coeff$term==term_intercept,"estimate"] %>% pull
  pop_slope <- mod_coeff[mod_coeff$term==term_slope,"estimate"] %>% pull
  pop_slope_low95 <- mod_coeff[mod_coeff$term==term_slope,"conf.low"] %>% pull
  pop_slope_high95 <- mod_coeff[mod_coeff$term==term_slope,"conf.high"] %>% pull
  
ggplot(data = subdata) +
  geom_point(aes(x = FieldSite_num, y = y), color="gray") +
  scale_x_continuous(breaks = c(1, 2), 
                     limits = c(0.5, 2.5),
                     labels = c("Glensaugh (FE)", "Yair (FS)")) +  # Set x-axis limits and breaks
  geom_ribbon(aes(x = FieldSite_num, 
                  ymin = global_intercept + global_slope_low95 * FieldSite_num, 
                  ymax = global_intercept + global_slope_high95 * FieldSite_num),  
              inherit.aes = FALSE, 
              fill = "blue", 
              alpha = 0.2) + 
  geom_ribbon(aes(x = FieldSite_num, 
                  ymin = (global_intercept + pop_intercept) + (global_slope + pop_slope_low95) * FieldSite_num, 
                  ymax = (global_intercept + pop_intercept) + (global_slope + pop_slope_high95) * FieldSite_num),  
              inherit.aes = FALSE, 
              fill = "red", 
              alpha = 0.2) + 
  geom_point(data=subdata %>% filter(subdata$PopulationCode==pop_i), aes(x=FieldSite_num, y=y), color="red", alpha=0.8) +
  geom_abline(slope = global_slope, 
              intercept = global_intercept, 
              color = "blue", 
              linetype = "dashed", 
              linewidth = 1) +
  geom_abline(slope=(global_slope + pop_slope) , intercept=(global_intercept + pop_intercept), color="red", alpha=0.8, linewidth=0.5) +
  scale_color_manual(values=c("red",rep("gray",20))) +
  theme_bw() +
  labs(title = paste0("Population ",pop_i),
       x = "",
       y = "Tree height (mm)")
})
[[1]]


[[2]]


[[3]]


[[4]]


[[5]]


[[6]]


[[7]]


[[8]]


[[9]]


[[10]]


[[11]]


[[12]]


[[13]]


[[14]]


[[15]]


[[16]]


[[17]]


[[18]]


[[19]]


[[20]]


[[21]]

We extract the slopes for comparison.

Code
median_slope_df <- mod_coeff %>% 
  filter(stringr::str_detect(term, ",FieldSite_num]$")) %>% 
  mutate(Population = str_extract(term, "\\[([A-Z]{2}),")) %>%
  mutate(Population = str_remove_all(Population, "\\[|,")) %>% 
  dplyr::select(Population, estimate) %>% 
  dplyr::rename(median_slope = estimate)

Main take-home message: Very low variation in the slope across sites among populations, and the differences in slope are not statistically meaningful.

5.2 At the climatic cluster level

Code
subdata <- data %>% 
  dplyr::select(FieldSite,Block,FieldCode,PopulationCode,Family,Nursery,all_of(trait)) %>% 
  drop_na() %>% # 1848 to 1765 values
  filter(FieldSite %in% c("FE","FS")) %>% # 1765 to 1311
  filter(!(FieldSite=="FE" & Nursery =="NG")) %>% # 1311 to 1307
  dplyr::rename(y=any_of(trait)) %>% 
  arrange(FieldCode) %>% 
  mutate(FieldSite_num = as.numeric(as_factor(FieldSite))) %>% 
  mutate(ClimCluster = case_when(PopulationCode %in% c("AB","AC","BB","GD","GT", "RM") ~ "Cairngorms",
                                 PopulationCode %in% c("AM","BW","GA","GC","GE","GL", "MG","RD","SO") ~ "Central Atlantic",
                                 PopulationCode %in% c("BE","CC","CG","CR","LC", "SD") ~ "Hyper-oceanic"))
Code
# Model options
n_chains <- 4 # number of chains (MCMC)
n_iter <- 8000 # number of iterations
n_warm <- 4000 # number of iterations in the warm-up phase
n_thin <- 1 # thinning interval
n_seed <- 404

form <- brmsformula(y ~ 1 + FieldSite_num + (1 + FieldSite_num | ClimCluster))

mod <- brm(formula = form,
           data = subdata,
           chains = n_chains,
           cores = n_chains,
           seed = n_seed,
           iter = n_iter,
           warmup = n_warm,
           control = list(adapt_delta = 0.95))

# Messages d'avis :
# 1: There were 137 divergent transitions after warmup. See
# https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
# to find out why this is a problem and how to eliminate them. 
# 2: Examine the pairs() plot to diagnose sampling problems

saveRDS(mod, file = here(paste0("outputs/ScotsPine/StanModels/",trait,"_TwoCG_SpeciesLevelModel_ClusterVaryingSlopes.rds")))

Summary of the model:

Code
mod <- readRDS(file = here(paste0("outputs/ScotsPine/StanModels/",trait,"_TwoCG_SpeciesLevelModel_ClusterVaryingSlopes.rds"))) 

summary(mod)
Warning: There were 137 divergent transitions after warmup. Increasing adapt_delta above 0.95 may help. See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: y ~ 1 + FieldSite_num + (1 + FieldSite_num | ClimCluster) 
   Data: subdata (Number of observations: 1307) 
  Draws: 4 chains, each with iter = 8000; warmup = 4000; thin = 1;
         total post-warmup draws = 16000

Multilevel Hyperparameters:
~ClimCluster (Number of levels: 3) 
                             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)                  504.86    343.82   138.81  1445.36 1.00     4376     6729
sd(FieldSite_num)              125.08    153.45     3.21   572.58 1.00     2863     2343
cor(Intercept,FieldSite_num)    -0.22      0.57    -0.98     0.89 1.00     8447     8518

Regression Coefficients:
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept      1445.22    311.06   779.81  2090.72 1.00     4072     4867
FieldSite_num  1257.74     99.44  1033.58  1474.61 1.00     3423     2353

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma   608.17     12.05   585.24   632.26 1.00    14701    11049

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Interval plots of the climatic cluster varying slopes (i.e., random slopes).

Code
# Extract posterior samples for population-level slopes
clim_cluster_slopes <- as_draws_df(mod) %>%
  spread_draws(r_ClimCluster[ClimCluster, FieldSite_num]) %>% 
  filter(FieldSite_num=="FieldSite_num")

clim_cluster_slopes %>%
  ggplot(aes(x = r_ClimCluster, y = ClimCluster)) +
  stat_halfeye(
    .width = c(0.80, 0.95),   # Specify intervals
    fill = "skyblue",
    point_interval = "median_qi",
    color = "blue") +
  theme_bw() +
  labs(title = "MCMC areas of climatic cluster varying slopes (median, 80% and 95% credible intervals)",
       x = "Slope",
       y = "Climatic cluster")

Global slope of the model and its 95% credible interval.

Code
mod_coeff <- mod %>% 
  broom.mixed::tidyMCMC(
    robust = T, # median and mean absolute deviation are used to compute point estimates and uncertainty
    ess = F, 
    rhat = F, 
    conf.int = T,
    conf.level = 0.95)

global_intercept <- mod_coeff$estimate[["b_Intercept"]]
global_slope <- mod_coeff$estimate[["b_FieldSite_num"]]
global_slope_low95 <- mod_coeff[mod_coeff$term=="b_FieldSite_num","conf.low"] %>% pull()
global_slope_high95 <- mod_coeff[mod_coeff$term=="b_FieldSite_num","conf.high"] %>% pull()

ggplot(data = subdata) +
  geom_point(aes(x = FieldSite_num, y = y), color = "gray") +
  scale_x_continuous(breaks = c(1, 2), 
                     limits = c(0.8, 2.2),
                     labels = c("Glensaugh (FE)", "Yair (FS)")) +  # Set x-axis limits and breaks
    geom_ribbon(aes(x = FieldSite_num, 
                    ymin = global_intercept + global_slope_low95 * FieldSite_num, 
                    ymax = global_intercept + global_slope_high95 * FieldSite_num),  
                inherit.aes = FALSE, 
                fill = "blue", 
                alpha = 0.2) + 
  geom_abline(slope = global_slope, 
              intercept = global_intercept, 
              color = "blue", 
              linetype = "dashed", 
              linewidth = 1) +
  theme_bw() +
  labs(title = "Global slope (median and 95% intervals)",
       x = "",
       y = "Tree height (mm)")

Now we look the estimated intercept and slope of each climatic cluster The credible intervals reflect uncertainty in the estimation of the slope only.

Code
clim_cluster_names <- subdata$ClimCluster %>% unique %>% sort

lapply(clim_cluster_names, function(clim_cluster_i){
  
  if(clim_cluster_i == "Central Atlantic"){
    term_intercept <- paste0("r_ClimCluster[Central.Atlantic,Intercept]")
    term_slope <- paste0("r_ClimCluster[Central.Atlantic,FieldSite_num]")
  } else {
    term_intercept <- paste0("r_ClimCluster[",clim_cluster_i,",Intercept]")
    term_slope <- paste0("r_ClimCluster[",clim_cluster_i,",FieldSite_num]")
  }
  
  clim_cluster_intercept <- mod_coeff[mod_coeff$term==term_intercept,"estimate"] %>% pull
  clim_cluster_slope <- mod_coeff[mod_coeff$term==term_slope,"estimate"] %>% pull
  clim_cluster_slope_low95 <- mod_coeff[mod_coeff$term==term_slope,"conf.low"] %>% pull
  clim_cluster_slope_high95 <- mod_coeff[mod_coeff$term==term_slope,"conf.high"] %>% pull
  
ggplot(data = subdata) +
  geom_point(aes(x = FieldSite_num, y = y), color="gray") +
  scale_x_continuous(breaks = c(1, 2), 
                     limits = c(0.5, 2.5),
                     labels = c("Glensaugh (FE)", "Yair (FS)")) +  # Set x-axis limits and breaks
  geom_ribbon(aes(x = FieldSite_num, 
                  ymin = global_intercept + global_slope_low95 * FieldSite_num, 
                  ymax = global_intercept + global_slope_high95 * FieldSite_num),  
              inherit.aes = FALSE, 
              fill = "blue", 
              alpha = 0.2) + 
  geom_ribbon(aes(x = FieldSite_num, 
                  ymin = (global_intercept + clim_cluster_intercept) + (global_slope + clim_cluster_slope_low95) * FieldSite_num, 
                  ymax = (global_intercept + clim_cluster_intercept) + (global_slope + clim_cluster_slope_high95) * FieldSite_num),  
              inherit.aes = FALSE, 
              fill = "red", 
              alpha = 0.2) + 
  geom_point(data=subdata %>% filter(subdata$ClimCluster==clim_cluster_i), aes(x=FieldSite_num, y=y), color="red", alpha=0.8) +
  geom_abline(slope = global_slope, 
              intercept = global_intercept, 
              color = "blue", 
              linetype = "dashed", 
              linewidth = 1) +
  geom_abline(slope=(global_slope + clim_cluster_slope) , intercept=(global_intercept + clim_cluster_intercept), color="red", alpha=0.8, linewidth=0.5) +
  scale_color_manual(values=c("red",rep("gray",20))) +
  theme_bw() +
  labs(title = paste0("Climatic cluster: ",clim_cluster_i),
       x = "",
       y = "Tree height (mm)")
})
[[1]]


[[2]]


[[3]]

Main take-home message: As for populations, there is almost not variation in the slope across sites among climatic clusters.

6 RDPI

For that, the first option is to use the two common gardens in which the trees come from a unique nursery, i.e. FS (in which all trees come from NG) and FE (in which all trees come from NE, except four trees that come from NG and that we remove to calculate the RDPI index).

So we calculate the RDPI indexes capturing plasticity to environment variation between FS and FE.

Code
rdpi_data <- data %>% filter(!(FieldSite=="FE" & Nursery =="NG")) # we remove the four trees in FE that come from NG
pop_codes <- unique(rdpi_data$PopulationCode) %>% sort() # population codes
site_codes <- c("FE","FS") # field site codes
site_comb <- combn(site_codes,2)
calc_rel_dist <- function(x){abs(x[1]-x[2])/(x[1]+x[2])}

list_traits <- list(heights = rdpi_data %>% dplyr::select(contains("HA"), -contains("13")) %>% colnames(), # height measurements (except 2013)
                    duration_budburst = rdpi_data %>% dplyr::select(contains("BD")) %>% colnames(), # duration of bud burst
                    timing_budburst = rdpi_data %>% dplyr::select(contains("BT")) %>% colnames()) # time taken to reach budburst stages

lapply(list_traits, function(traits) {
  
  lapply(traits, function(trait){

  lapply(pop_codes, function(pop){
  
  subpop <- rdpi_data %>%
    drop_na(any_of(trait)) %>% 
    dplyr::filter(PopulationCode %in% pop)
  
  n_pop <- nrow(subpop) # nb of individuals in the population

 rel_dist <- lapply(1:dim(site_comb)[[2]], function(nb_comb){
 
    subsite1 <- subpop %>% dplyr::filter(FieldSite %in% site_comb[1,nb_comb])
    subsite2 <- subpop %>% dplyr::filter(FieldSite %in% site_comb[2,nb_comb])
    
    
    lapply(1:nrow(subsite1), function(x) {
      
      
      rbind(subsite1[x,trait],subsite2[,trait]) %>% 
        pull() %>% 
        combn(2, calc_rel_dist) %>% 
        .[1:nrow(subsite2)] %>% 
        as.numeric()
      
      }) %>% unlist()
    
    }) %>% unlist()
 
 N <- length(rel_dist)
 
sum(rel_dist)/N
 
 
 }) %>% 
    setNames(pop_codes) %>% 
    as_tibble() %>% 
    pivot_longer(everything(), names_to = "PopulationCode", values_to = "RDPI") %>% 
    mutate(Trait=trait)
  
  }) %>% bind_rows()

  
}) %>% saveRDS(file=here("outputs/ScotsPine/PhenotypicPlasticity/RDPI_based_on_FE_FS.rds"))
Code
list_titles <- list(list("heights","height"),
                    list("duration_budburst","the duration of budburst"),
                    list("timing_budburst","the time taken to reach budburst stages"))

lapply(list_titles, function(x){
  
  readRDS(file=here("outputs/ScotsPine/PhenotypicPlasticity/RDPI_based_on_FE_FS.rds"))[[x[[1]]]] %>% 
  pivot_wider(values_from = RDPI, names_from = Trait) %>% 
  dplyr::select(-PopulationCode) %>% 
  cor() %>% 
  corrplot(method = 'number',
           type = 'lower', diag = FALSE,
           title=paste0("Correlation among RDPI for ",x[[2]]),
           mar=c(0,0,2,0),
           number.cex=0.9,tl.cex=0.9)
  })

Let’s look at the correlation with the population varying slopes of the model above (median estimates).

Code
df <- readRDS(file=here("outputs/ScotsPine/PhenotypicPlasticity/RDPI_based_on_FE_FS.rds"))$heights %>% 
  filter(Trait == trait) %>% 
  dplyr::rename(Population = PopulationCode) %>% 
  inner_join(median_slope_df, by="Population")

cor(df$RDPI, df$median_slope)
[1] 0.4178298

7 Raw difference betwen population means

Code
diff_data <- data %>% 
  filter(FieldSite %in% c("FE","FS")) %>% # 1765 to 1311
  filter(!(FieldSite=="FE" & Nursery =="NG")) # we remove the four trees in FE that come from NG
pop_codes <- unique(diff_data$PopulationCode) %>% sort() # population codes

list_traits <- list(heights = diff_data %>% dplyr::select(contains("HA"), -contains("13")) %>% colnames(), # height measurements (except 2013)
                    duration_budburst = diff_data %>% dplyr::select(contains("BD")) %>% colnames(), # duration of bud burst
                    timing_budburst = diff_data %>% dplyr::select(contains("BT")) %>% colnames()) # time taken to reach budburst stages

lapply(list_traits, function(traits) {
  
  lapply(traits, function(trait){

  lapply(pop_codes, function(pop){
  
  raw_diff <- diff_data %>%
    drop_na(any_of(trait)) %>% 
    dplyr::filter(PopulationCode %in% pop) %>% 
    arrange(FieldSite) %>% 
    group_by(FieldSite) %>% 
    summarise(mean_trait = mean(!!sym(trait), na.rm = TRUE)) %>% 
    summarise(difference = diff(mean_trait)) %>%
    pull(difference)
    
  tibble(PopulationCode = pop,
         Raw_Diff = raw_diff,
         Trait = trait)
  
  }) %>% bind_rows()
    
  }) %>% bind_rows()

  
}) %>% saveRDS(file=here("outputs/ScotsPine/PhenotypicPlasticity/Raw_Diff_based_on_FE_FS.rds"))
Code
list_titles <- list(list("heights","height"),
                    list("duration_budburst","the duration of budburst"),
                    list("timing_budburst","the time taken to reach budburst stages"))

lapply(list_titles, function(x){
  
  readRDS(file=here("outputs/ScotsPine/PhenotypicPlasticity/Raw_Diff_based_on_FE_FS.rds"))[[x[[1]]]] %>% 
  pivot_wider(values_from = Raw_Diff, names_from = Trait) %>% 
  dplyr::select(-PopulationCode) %>% 
  cor() %>% 
  corrplot(method = 'number',
           type = 'lower', diag = FALSE,
           title=paste0("Correlation among raw differences for ",x[[2]]),
           mar=c(0,0,2,0),
           number.cex=0.9,tl.cex=0.9)
  })

8 Correlations

Correlation among population slopes, RDPI and raw differences

Code
df <- readRDS(file=here("outputs/ScotsPine/PhenotypicPlasticity/RDPI_based_on_FE_FS.rds"))$heights %>% 
  filter(Trait == trait) %>% 
  dplyr::rename(Population = PopulationCode) %>% 
  inner_join(median_slope_df, by="Population")

readRDS(file=here("outputs/ScotsPine/PhenotypicPlasticity/Raw_Diff_based_on_FE_FS.rds"))$heights %>% 
  filter(Trait == trait) %>% 
  dplyr::rename(Population = PopulationCode) %>% 
  inner_join(df, by=c("Population","Trait")) %>% 
  dplyr::select(-Population,-Trait) %>% 
  cor() %>% 
  corrplot(method = 'number',
           type = 'lower', diag = FALSE,
           title=paste0(""),
           mar=c(0,0,2,0),
           number.cex=0.9,tl.cex=0.9)

References

Valladares, Fernando, David Sanchez-Gomez, and Miguel A. Zavala. 2006. “Quantitative Estimation of Phenotypic Plasticity: Bridging the Gap Between the Evolutionary Concept and Its Ecological Applications.” Journal of Ecology 94 (6): 1103–16. https://doi.org/10.1111/j.1365-2745.2006.01176.x.