Estimating plasticity at the population level

Villemereuil & Chevin method & RDPI of Valladeres

Author

Juliette Archambeau

Published

Invalid Date

1 Introduction

The goal of this report is to determine whether the populations have different plastic responses. For what, we use different approaches presented below.

The focus trait of the present report is (except in Section 6 and Section 7 which are applied to many traits):

Code

trait <- "HA20"

To account for the nursery effects, we estimate plasticity to environment variation originating from the combined effect of the site and the nursery. For that, in the present report, we only use the two common gardens in which the trees come from a unique nursery, i.e. FS (in which all trees come from NG) and FE (in which all trees come from NE, except four trees that come from NG and that we remove to estimate plasticity).

In Section 3, we partitioned the variance of the trait of interest at the species level, i.e., with all populations combined. For that, we follow the methodology in de Villemereuil & Chevin (2025) to partition the phenotypic variance of the traits of interest. Below are sections extracted from the paper:

\[V_P = V_{Plas} + V_{Gen} + V_{Res}\]

\(V_P\) is the total phenotypic variance of the trait. \(V_{Plas}\) is the phenotypic variance arising from changes in the mean reaction norm across environments. \(V_{Gen}\) is the total genetic variance in the trait across environments. \(V_{Res}\) is the residual variance, not explained by the reaction norm.

\[V_{Gen} = V_{Add} + V_{nonAdd}\]

\(V_{Add}\) is the total additive genetic variance in the trait across environments. \(V_{nonAdd}\) is the non additive genetic variance in the trait.

\[V_{Add} = V_A + V_{AxE}\]

\(V_A\) is the environment-blind additive genetic variance of the trait, i.e., based on the mean breeding values across environments. \(V_{AxE}\) is the additive genetic variance arising from plasticity, i.e., variance of the mean-centered breeding values.

From this, it is possible to derive unitless quantities of interest.

\(P_{RN}^2 = V_{Plas}/V_P\) is the proportion of the phenotypic variance arising from average plastic responses to environments.

Variance-standardised additive genetic variances are heritabilities:

\(h_{RN}^2=\frac{V_{Add}}{V_P}=\frac{V_A}{V_P}+\frac{V_{AxE}}{V_P}=h^2+h^2_I\)

In other words, the heritability of the trait when fully accounting for its reaction norm (\(h_{RN}^2\)) is equal to the environment-blind heritability of the trait (\(h^2\), based on the breeding values averaged across environments) plus the heritability from plasticity (\(h_I^2\), based on the breeding values by environment interaction).

\(T_{RN}^2=\frac{V_{Plas}+V_{Gen}}{V_P}=P_{RN}^2+h_{RN}^2\)

\(T_{RN}^2\) is the proportion of the phenotypic variance explained by the (possibly plastic and genetically variable) reaction norm, and thus our ability to predict the individual phenotype from the genotype and the environment.

To apply this methodology, we followed the tutorials associated with the Reacnorm R package: https://github.com/devillemereuil/Reacnorm.

In Section 4, we attempted to partition the variance of the trait of interest for each population individually. However, it appeared that the sample sizes were likely too small to allow the models to converge.

In Section 5.1, we estimated population varying slopes (i.e., random slopes) to quantify the mean change in the trait of interest across two different sites (Yair and Glensaugh) for each population. This approach (if I have understood correctly) would be similar to estimate \(V_{Plas}\) for each population. While this estimate would not indicate the genetic variation in reaction norms within populations, it would allow for comparisons of the mean plastic responses across sites among populations. In Section 5.2, we estimated varying slopes at the climatic cluster level instead of the population level.

In Section 6, we calculate the relative distance plasticity index (RDPI) from Valladares, Sanchez-Gomez, and Zavala (2006) In this paper, the authors propose to quantify phenotypic plasticity based on phenotypic distances among individuals of a given species (in our study, population) exposed to different environments (in our study, planted in different common gardens), which is summarized in the RDPI index that allows for statistical comparisons of phenotypic plasticity between species (in our study, populations).

For a given trait, the relative distance plasticity index \(RDPI_p\) for a given population \(p\) can be calculated as follows:

\[RDPI_p = \frac{\sum_{n=1}^N{\frac{x_{ij}-x_{i'j'}}{x_{ij}+x_{i'j'}}}}{N}\]

where \(N\) is the number of pairwise comparisons among individuals of the same population but from different common gardens. \(x_{ij}\) and \(x_{i'j'}\) are the phenotypic values of individual \(j\) and \(j'\) in common gardens \(i\) and \(i’\). As such, the phenotypic plasticity of a population for a given trait corresponds to the sum of the relative phenotypic distances for all pairs of individuals of the population grown in different common gardens. We calculate the RDPI for height measurements, the duration of bud burst and the number of days to reach of stage 4, 5 or 6 of budburst.

Comment: Note that RDPI may also be calculated at the family level (instead of the population level as above) and then be averaged for each population. However, as some families in FS are not the as in FE and FW, I thought that calculating the RDPI indexes directly at the population level would allow to consider all families (not having to remove some). But this is point I will be happy to discuss!

In Section 7, we calculate the difference in mean trait values between the two studied sites for each population.

Last, we compared the correlations among population slopes obtained in Section 5.1, RDPI calculated in Section 6 and the raw differences calculated in Section 7.

2 The data

We load the phenotypic data from the common gardens.

Code

data <- readRDS(file=here("data/ScotsPine/formatted_CG_data.rds"))

The dataset contains 1848 individuals, 21 populations, 177 families. Here are the first 10 rows:

Code

data[1:10,] %>% kable_mydf()

PopulationCode	Family	FieldSite	FieldCode	Block	HA13	HA14	HA15	HA16	HA17	HA18	HA19	HA20	HI13	HI14	HI15	HI16	HI17	HI18	HI19	HI20	DA13	DA14	DA15	DA16	DA17	DA18	DA19	DA20	DI14	DI15	DI16	DI17	DI18	DI19	DI20	BD15	BD16	BD17	BD18	BD19	BT15_4	BT15_5	BT15_6	BT16_4	BT16_5	BT16_6	BT17_4	BT17_5	BT17_6	BT18_4	BT18_5	BT18_6	BT19_4	BT19_5	BT19_6	Nursery
GC	GC5	FE	5001	FE_A	780	975	1190	1470	1890	2440	2800	3220	140	195	215	280	420	550	360	420	17.99	30.83	42.73	62.9	82	104	130.51	146.42	12.84	11.90	20.17	19.1	22	26.51	15.92	21	37	21	14	28	58	72	79	48	64	85	47	61	68	54	61	68	46	60	74	NE
GE	GE9	FE	5002	FE_A	690	875	1080	1400	1730	2010	2400	2960	125	185	205	320	330	280	390	560	19.50	35.38	47.00	67.2	82	127	143.24	165.52	15.88	11.62	20.20	14.8	45	16.24	22.28	15	28	23	21	21	64	72	79	57	69	85	52	61	75	54	61	75	53	60	74	NE
BE	BE4	FE	5003	FE_A	630	670	695	730	NA	NA	NA	NA	55	40	25	35	NA	NA	NA	NA	9.08	7.26	8.67	7.6	NA	NA	NA	NA	0.00	1.41	0.00	NA	NA	NA	NA	28	33	NA	NA	NA	58	79	86	57	85	90	47	75	NA	75	NA	NA	NA	NA	NA	NE
RM	RM7	FE	5004	FE_A	760	1010	1270	1610	2035	2320	2700	3110	110	250	260	340	425	285	380	410	19.73	37.21	49.81	63.7	89	108	136.87	162.34	17.48	12.60	13.89	25.3	19	28.87	25.46	35	30	28	21	14	44	58	79	48	64	78	40	61	68	54	61	75	46	53	60	NE
SO	SO2	FE	5005	FE_A	690	870	995	1140	1400	1710	2030	2520	90	180	125	145	260	310	320	490	13.94	24.78	34.99	47.0	61	91	100.27	120.96	10.84	10.21	12.01	14.0	30	9.27	20.69	35	28	35	21	28	44	72	79	57	64	85	47	61	82	54	68	75	46	60	74	NE
MG	MG10	FE	5006	FE_A	635	820	1050	1280	1650	2040	2270	2750	95	185	230	230	370	390	230	480	14.83	22.99	35.27	46.2	60	96	103.45	114.59	8.16	12.28	10.93	13.8	36	7.45	11.14	14	21	21	14	14	58	64	72	57	64	78	47	61	68	54	61	68	46	53	60	NE
BB	BB8	FE	5007	FE_A	770	1015	1300	1560	1925	2230	2650	3170	140	245	285	260	365	305	420	520	20.50	38.79	49.88	69.3	86	98	120.96	133.69	18.29	11.09	19.42	16.7	12	22.96	12.73	21	28	28	14	14	58	72	79	57	69	85	47	61	75	54	61	68	46	53	60	NE
CR	CR5	FE	5008	FE_A	905	1150	1405	1760	2340	2790	3270	3790	140	245	255	355	580	450	480	520	25.10	37.50	52.29	75.5	98	128	151.20	171.89	12.40	14.79	23.21	22.5	30	23.20	20.69	21	28	23	21	14	58	64	79	57	69	85	52	61	75	47	61	68	46	53	60	NE
GT	GT8	FE	5009	FE_B	885	1105	1345	1670	2090	2490	2880	3420	190	220	240	325	420	400	390	540	19.38	31.31	45.79	67.1	86	109	138.46	159.15	11.93	14.48	21.31	18.9	23	29.46	20.69	15	26	30	21	28	64	72	79	64	78	90	52	68	82	61	68	82	46	60	74	NE
CC	CC7	FE	5010	FE_B	610	720	905	1040	1320	1520	1810	2200	60	110	185	135	280	200	290	390	14.48	23.69	33.37	47.5	57	82	92.31	98.68	9.21	9.68	14.13	9.5	25	10.31	6.37	21	28	28	21	28	58	64	79	57	64	85	47	61	75	47	61	68	46	53	74	NE

We load the genomic relationship matrix.

Code

GRM <- readRDS(file=here("data/ScotsPine/GenomicData/data_update_Fev2025/formatted_data/Gmat_VanRaden.rds"))
GRM[1:10,1:10]

             5001         5002          5004          5005         5006          5007         5008          5009         5010         5011
5001  1.001754392  0.006519614  6.021209e-03 -3.308892e-03 -0.012986334 -0.0103246489 -0.005507989 -0.0099405467  0.014779192  0.006420764
5002  0.006519614  1.005616433  2.987842e-02  9.731069e-03 -0.024867057 -0.0018414560 -0.007189568 -0.0091610829 -0.001410565  0.003708786
5004  0.006021209  0.029878415  1.014206e+00  6.354084e-05 -0.009920033  0.0018114160 -0.001273047 -0.0058235843 -0.015339580 -0.012354560
5005 -0.003308892  0.009731069  6.354084e-05  1.003141e+00 -0.013168210  0.0049938202  0.008473645 -0.0073622696 -0.022385254  0.012409120
5006 -0.012986334 -0.024867057 -9.920033e-03 -1.316821e-02  1.010319132  0.0020051460 -0.016777030 -0.0029030730 -0.005454051  0.001790722
5007 -0.010324649 -0.001841456  1.811416e-03  4.993820e-03  0.002005146  1.0094462709 -0.013189947  0.0003566858 -0.004515659  0.022918155
5008 -0.005507989 -0.007189568 -1.273047e-03  8.473645e-03 -0.016777030 -0.0131899473  0.995407024  0.0026715333  0.002578422 -0.008585459
5009 -0.009940547 -0.009161083 -5.823584e-03 -7.362270e-03 -0.002903073  0.0003566858  0.002671533  1.0080698159 -0.015530987 -0.022970375
5010  0.014779192 -0.001410565 -1.533958e-02 -2.238525e-02 -0.005454051 -0.0045156593  0.002578422 -0.0155309866  1.059251781 -0.006317228
5011  0.006420764  0.003708786 -1.235456e-02  1.240912e-02  0.001790722  0.0229181549 -0.008585459 -0.0229703749 -0.006317228  1.033974344

Matrix dimensions:

Code

dim(GRM)

[1] 1761 1761

3 Species level model with GRM

We filtered the phenotypic data to retain only the trait of interest and measurements from the Glensaugh (FE) and Yair (FS) sites. Additionally, we excluded the four trees measured in FE that originated from the NG nursery.

The remaining trees all originate either from the FE site and the NE nursery or the FS site and the NG nursery.

Code

subdata <- data %>% 
  dplyr::select(FieldSite,Block,FieldCode,PopulationCode,Family,Nursery,all_of(trait)) %>% 
  drop_na() %>% # 1848 to 1765 values
  filter(FieldSite %in% c("FE","FS")) %>% # 1765 to 1311
  filter(!(FieldSite=="FE" & Nursery =="NG")) %>% # 1311 to 1307
  filter((FieldCode%in%rownames(GRM))) %>%  # 1307 to 1293
  dplyr::rename(y=any_of(trait)) %>% 
  arrange(FieldCode)

# filter the grm with the remainng trees
GRM <-  GRM[as.character(subdata$FieldCode), as.character(subdata$FieldCode), drop = FALSE]
# dim(GRM)
# GRM[1:10,1:10]
# 
# all.equal(as.character(subdata$FieldCode),colnames(GRM))
# all.equal(as.character(subdata$FieldCode),rownames(GRM))

Code

n_chains <- 4 # number of chains (MCMC)
n_iter <- 6000 # number of iterations
n_warm <- 3000 # number of iterations in the warm-up phase
n_thin <- 1 # thinning interval
n_seed <- 404

form <- brmsformula(y ~ 0 + FieldSite + (0 + FieldSite | gr(FieldCode, cov = GRM)))

mod <- brm(formula = form,
           data = subdata,
           data2 = list(GRM=GRM),
           save_pars = save_pars(group = FALSE),
           chains = n_chains,
           cores = n_chains,
           seed = n_seed,
           iter = n_iter,
           warmup = n_warm)

saveRDS(mod, file = here(paste0("outputs/ScotsPine/StanModels/",trait,"_TwoCG_SpeciesLevelModel_PartVarVillemereuil.rds")))

# Messages d'avis :
# 1: There were 4 chains where the estimated Bayesian Fraction of Missing Information was low. See
# https://mc-stan.org/misc/warnings.html#bfmi-low 
# 2: Examine the pairs() plot to diagnose sampling problems
#  
# 3: The largest R-hat is 1.06, indicating chains have not mixed.
# Running the chains for more iterations may help. See
# https://mc-stan.org/misc/warnings.html#r-hat 
# 4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
# Running the chains for more iterations may help. See
# https://mc-stan.org/misc/warnings.html#bulk-ess 
# 5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
# Running the chains for more iterations may help. See
# https://mc-stan.org/misc/warnings.html#tail-ess

Model summary:

Code

mod <- readRDS(here(paste0("outputs/ScotsPine/StanModels/",trait,"_TwoCG_SpeciesLevelModel_PartVarVillemereuil.rds")))
summary(mod)

Warning: Parts of the model have not converged (some Rhats are > 1.05). Be careful when analysing the results! We recommend running more iterations and/or setting stronger priors.

 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: y ~ 0 + FieldSite + (0 + FieldSite | gr(FieldCode, cov = GRM)) 
   Data: subdata (Number of observations: 1293) 
  Draws: 4 chains, each with iter = 3000; warmup = 1000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~FieldCode (Number of levels: 1293) 
                             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(FieldSiteFE)                433.77     39.18   356.40   510.54 1.03       81      108
sd(FieldSiteFS)                619.30     37.47   546.37   693.01 1.02       91      175
cor(FieldSiteFE,FieldSiteFS)     0.60      0.12     0.35     0.83 1.01      309      635

Regression Coefficients:
            Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
FieldSiteFE  2722.76     44.62  2635.01  2810.47 1.00     3475     3742
FieldSiteFS  4064.35     87.32  3891.72  4238.29 1.00     4331     5205

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma   323.77     44.47   225.01   403.03 1.05       63       88

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Code

plot(mod, variable=c("b_FieldSiteFE", "b_FieldSiteFS","sigma"))

Code

plot(mod, variable=c("cor_FieldCode__FieldSiteFE__FieldSiteFS", "sd_FieldCode__FieldSiteFE","sd_FieldCode__FieldSiteFS"))

Code

# Getting the mean uncertainty on the fixed-effect parameters
var_uncert <- vcov(mod) |> # extracts the variance-covariance matrix of fixed effects.
  diag() |> # diagonal values represent the variances of each fixed effect parameter.
  mean()  # Calculate mean uncertainty across parameters

# Computing V_plas (variance of the average reaction norm)
post_theta_cs <- fixef(mod, summary = FALSE) |> # extracts posterior samples for fixed effects
  as_draws_df() # formats them as a posterior draws object.

# Select fixed effects corresponding to the sites and compute their row-wise variances.
var_plas_cs <- post_theta_cs |> 
  select(starts_with("FieldSite")) |>
  as.matrix() |> 
  rowVars()

# Correcting V_plas for uncertainty
post_cs <- data.frame(V_Plas = var_plas_cs - var_uncert) |>  # subtract the mean parameter uncertainty (var_uncert) from the raw V_plas values.
  cbind(select(post_theta_cs, starts_with("."))) |>  # combine with MCMC metadata (e.g., chain, iteration, draw) to create a valid posterior object.
  as_draws_df()

# Getting the G-matrix (variance-covariance of random effects)
post_cs[["G"]] <- VarCorr(mod, summary = FALSE)[["FieldCode"]][["cov"]] |> # extracts variance-covariance matrices of random effects
  apply(1, \(mat_) { mat_ }, simplify = FALSE) # apply() to structure the posterior samples into a list of matrices for further processing

# Getting the residual variance
# Extract posterior samples for residual variance
post_cs[["V_R"]] <- VarCorr(mod, summary = FALSE)[["residual__"]][["sd"]][, 1]^2

# Thinning the posterior draws for computational efficiency
# Thin the posterior draws to reduce computation time while maintaining sufficient information
post_cs <- thin_draws(post_cs, thin = length(var_plas_cs) / 1000)

# Save MCMC metadata (iteration, chain, etc.) for further use
post_cs_info <- select(post_cs, starts_with("."))

# Summarising V_Plas (posterior estimates and diagnostics)
# Display summary statistics for V_Plas (mean, median, credible intervals).
summarise_draws(subset_draws(post_cs, variable = "V_Plas"))  %>% kable_mydf()

variable	mean	median	sd	mad	q5	q95	rhat	ess_bulk	ess_tail
V_Plas	899407.1	892926.2	113381.7	116218.4	725103.6	1091157	1.01	994.11	867.31

Code

# Computing genetic variance decomposition
# Apply function to compute genetic variances from the G-matrix
post_gen_cs <- 
  map(post_cs[["G"]], rn_cs_gen, .progress = TRUE) |> 
  bind_rows() |> 
  select(where(\(col_) { abs(mean(col_)) > 10^-5 })) |>  # Remove negligible components
  cbind(post_cs_info) |> 
  as_draws_df()

# Summarise genetic variance components
summarise_draws(post_gen_cs) %>% kable_mydf()

variable	mean	median	sd	mad	q5	q95	rhat	ess_bulk	ess_tail
V_Add	287092.21	284848.2	37209.54	36096.99	230634.96	350583.55	1.04	75.56	95.73
V_A	224172.09	222120.3	32074.89	30614.86	174294.76	277512.25	1.02	112.43	196.98
V_AxE	62920.12	61354.6	18430.24	17296.40	34739.81	94787.41	1.01	176.29	363.59
N_eff	1.21	1.2	0.07	0.07	1.10	1.33	1.01	258.65	432.38

Code

# Compute total phenotypic variance and variance-standardised estimates
post_var_tpc_cs <- bind_draws(post_cs, post_gen_cs) |> 
  subset_draws(variable = c("V_Plas", "V_Add", "V_A", "V_AxE", "V_R")) |> 
  mutate_variables(V_Tot = V_Plas + V_Add + V_R)

# Compute standardised variance estimates (P2, H2_RN, H2, H2_I, T2)
post_std_tpc_cs <- post_var_tpc_cs |> 
  transmute(
    P2_RN = V_Plas / V_Tot,       # Proportion of phenotypic variance due to plasticity
    H2_RN = V_Add / V_Tot,        # Heritability of reaction norm
    H2 = V_A / V_Tot,             # Heritability of environment-blind genetic variance
    H2_I = V_AxE / V_Tot,         # Heritability of genetic variance due to plasticity
    T2_RN = (V_Plas + V_Add) / V_Tot # Total variance explained by genetic and plasticity components
  ) |> 
  cbind(post_cs_info) |> 
  as_draws_df()

# Summarising standardised variance estimates
summarise_draws(post_std_tpc_cs) %>% kable_mydf()

variable	mean	median	sd	mad	q5	q95	rhat	ess_bulk	ess_tail
P2_RN	0.69	0.69	0.03	0.03	0.64	0.74	1.00	1011.07	981.12
H2_RN	0.22	0.22	0.03	0.03	0.18	0.28	1.01	122.40	265.65
H2	0.17	0.17	0.03	0.03	0.13	0.22	1.01	213.30	473.93
H2_I	0.05	0.05	0.01	0.01	0.03	0.07	1.01	192.40	447.42
T2_RN	0.92	0.92	0.02	0.02	0.88	0.96	1.03	69.61	91.77

MCMC traces of the parameters of interest:

Code

mcmc_trace(post_std_tpc_cs)

Posterior distributions of the parameters of interest:

Code

mcmc_areas(post_std_tpc_cs, prob = 0.95, area_method = "scaled height")

Main take-home message: At the species level, the heritability of the reaction norm (\(h_{RN}^2\)) mainly comes from the environment-blind heritability of the trait (\(h^2\), based on the breeding values averaged across environments) and not from the heritability from plasticity (\(h_I^2\), based on the breeding values by environment interaction).

4 Population-level models with GRM

We apply the same model separately to each population.

Code

pop_names <- subdata$PopulationCode %>% unique() %>% sort # sorted by alphabetical number

lapply(pop_names, function(pop_i) {
  
  subdata_pop <- subdata %>% 
    filter(PopulationCode %in% pop_i)
  
  GRM_pop <-  GRM[as.character(subdata_pop$FieldCode), as.character(subdata_pop$FieldCode), drop = FALSE]
  
  form <- brmsformula(y ~ 0 + FieldSite + (0 + FieldSite | gr(FieldCode, cov = GRM_pop)))
  
  mod <- brm(formula = form,
             data = subdata_pop,
             data2 = list(GRM_pop=GRM_pop),
             save_pars = save_pars(group = FALSE),
             chains = n_chains,
             cores = n_chains,
             seed = n_seed,
             iter = n_iter,
             warmup = n_warm,
             control = list(adapt_delta = 0.9))
  
  saveRDS(mod, file = here(paste0("outputs/ScotsPine/StanModels/PopulationLevelModels/",trait,"_TwoCG_PopLevelModel_PartVarVillemereuil_pop",pop_i,".rds")))
  })

Let’s look at the model for the AB population.

Code

pop_i <- "AB"

mod <- readRDS(here(paste0("outputs/ScotsPine/StanModels/PopulationLevelModels/",trait,"_TwoCG_PopLevelModel_PartVarVillemereuil_pop",pop_i,".rds")))
  
summary(mod)

 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: y ~ 0 + FieldSite + (0 + FieldSite | gr(FieldCode, cov = GRM_pop)) 
   Data: subdata_pop (Number of observations: 61) 
  Draws: 4 chains, each with iter = 8000; warmup = 6000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~FieldCode (Number of levels: 61) 
                             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(FieldSiteFE)                293.78    174.38    15.49   622.24 1.02      316      904
sd(FieldSiteFS)                524.68    227.46    55.75   928.75 1.02      363      983
cor(FieldSiteFE,FieldSiteFS)     0.07      0.57    -0.94     0.95 1.00      584     1661

Regression Coefficients:
            Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
FieldSiteFE  2429.46    121.62  2187.94  2671.43 1.00     3499     3030
FieldSiteFS  3720.41    173.96  3367.54  4061.40 1.00     2889     3374

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma   465.11    138.67   180.66   689.41 1.04      245      298

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Code

plot(mod, nvariables = 3)

Code

# Getting the mean uncertainty on the fixed-effect parameters
var_uncert <- vcov(mod) |> # extracts the variance-covariance matrix of fixed effects
  diag() |> # diagonal values represent the variances of each fixed effect parameter
  mean()  # Calculate mean uncertainty across parameters
  
# Computing V_plas (variance of the average reaction norm)
post_theta_cs <- fixef(mod, summary = FALSE) |> # extracts posterior samples for fixed effects
  as_draws_df() # formats them as a posterior draws object.
  
# Select fixed effects corresponding to the field or environment and compute their row-wise variances.
var_plas_cs <- post_theta_cs |> 
  select(starts_with("FieldSite")) |>  # Match "FieldSite" to your model's fixed-effect prefix
  as.matrix() |> 
  rowVars()
  
# Correcting V_plas for uncertainty
post_cs <- data.frame(V_Plas = var_plas_cs - var_uncert) |>  # subtract the mean parameter uncertainty (var_uncert) from the raw V_plas values.
  cbind(select(post_theta_cs, starts_with("."))) |>  # combine with MCMC metadata (e.g., chain, iteration, draw) to create a valid posterior object.
  as_draws_df()
  
# Getting the G-matrix (variance-covariance of random effects)
post_cs[["G"]] <- VarCorr(mod, summary = FALSE)[["FieldCode"]][["cov"]] |> # extracts variance-covariance matrices of random effects
  apply(1, \(mat_) { mat_ }, simplify = FALSE) # apply() to structure the posterior samples into a list of matrices for further processing
  
# Getting the residual variance
# Extract posterior samples for residual variance
post_cs[["V_R"]] <- VarCorr(mod, summary = FALSE)[["residual__"]][["sd"]][, 1]^2

# Thinning the posterior draws for computational efficiency
# Thin the posterior draws to reduce computation time while maintaining sufficient information
post_cs <- thin_draws(post_cs, thin = length(var_plas_cs) / 1000)

# Save MCMC metadata (iteration, chain, etc.) for further use
post_cs_info <- select(post_cs, starts_with("."))

# Summarising V_Plas (posterior estimates and diagnostics)
# Display summary statistics for V_Plas (mean, median, credible intervals).
summarise_draws(subset_draws(post_cs, variable = "V_Plas")) %>% kable_mydf()

variable	mean	median	sd	mad	q5	q95	rhat	ess_bulk	ess_tail
V_Plas	843991.2	823829	269086.3	256800	432691	1318222	1	839.38	674.83

Code

# Computing genetic variance decomposition
# Apply function to compute genetic variances from the G-matrix
post_gen_cs <- 
  map(post_cs[["G"]], rn_cs_gen, .progress = TRUE) |> 
  bind_rows() |> 
  select(where(\(col_) { abs(mean(col_)) > 10^-5 })) |>  # Remove negligible components
  cbind(post_cs_info) |> 
  as_draws_df()

# Summarise genetic variance components
summarise_draws(post_gen_cs) %>% kable_mydf()

variable	mean	median	sd	mad	q5	q95	rhat	ess_bulk	ess_tail
V_Add	221053.51	195732.30	154679.75	168032.11	17974.77	497873.04	1.02	262.67	530.69
V_A	125014.64	88135.03	115710.58	92939.27	7120.32	370332.53	1.01	279.03	304.82
V_AxE	96038.87	71722.10	85708.77	62358.55	6832.29	272565.88	1.01	442.08	498.37
N_eff	1.17	1.12	0.17	0.14	1.00	1.52	1.01	813.99	954.07

Code

# Compute total phenotypic variance and variance-standardised estimates
post_var_tpc_cs <- bind_draws(post_cs, post_gen_cs) |> 
  subset_draws(variable = c("V_Plas", "V_Add", "V_A", "V_AxE", "V_R")) |> 
  mutate_variables(V_Tot = V_Plas + V_Add + V_R)

# Compute standardised variance estimates (P2, H2_RN, H2, H2_I, T2)
post_std_tpc_cs <- post_var_tpc_cs |> 
  transmute(
    P2 = V_Plas / V_Tot,          # Proportion of phenotypic variance due to plasticity
    H2_RN = V_Add / V_Tot,        # Heritability of reaction norm
    H2 = V_A / V_Tot,             # Heritability of environment-blind genetic variance
    H2_I = V_AxE / V_Tot,         # Heritability of genetic variance due to plasticity
    T2 = (V_Plas + V_Add) / V_Tot # Total variance explained by genetic and plasticity components
    ) |> 
  cbind(post_cs_info) |> 
  as_draws_df()

# Summarising standardised variance estimates
summarise_draws(post_std_tpc_cs) %>% kable_mydf()

variable	mean	median	sd	mad	q5	q95	rhat	ess_bulk	ess_tail
P2	0.64	0.65	0.09	0.09	0.47	0.76	1.00	979.70	1035.54
H2_RN	0.17	0.15	0.12	0.13	0.02	0.39	1.02	294.65	481.90
H2	0.10	0.07	0.09	0.07	0.01	0.29	1.01	297.91	367.47
H2_I	0.08	0.06	0.07	0.05	0.01	0.21	1.01	496.38	630.76
T2	0.81	0.81	0.10	0.11	0.64	0.97	1.03	241.57	342.04

Code

# Plotting MCMC diagnostics and posterior distributions
mcmc_trace(post_std_tpc_cs)

Code

mcmc_areas(post_std_tpc_cs, prob = 0.95, area_method = "scaled height")

MCMC traces of the parameters of interest:

Code

mcmc_trace(post_std_tpc_cs)

Posterior distributions of the parameters of interest:

Code

mcmc_areas(post_std_tpc_cs, prob = 0.95, area_method = "scaled height")

Main take-home message: We do not have enough data to estimate those parameters of interest at the population level.

5 Species level model with varying slopes

5.1 At the population level

Code

subdata <- data %>% 
  dplyr::select(FieldSite,Block,FieldCode,PopulationCode,Family,Nursery,all_of(trait)) %>% 
  drop_na() %>% # 1848 to 1765 values
  filter(FieldSite %in% c("FE","FS")) %>% # 1765 to 1311
  filter(!(FieldSite=="FE" & Nursery =="NG")) %>% # 1311 to 1307
  dplyr::rename(y=any_of(trait)) %>% 
  arrange(FieldCode) %>% 
  mutate(FieldSite_num = as.numeric(as_factor(FieldSite)))

Code

# Model options
n_chains <- 4 # number of chains (MCMC)
n_iter <- 8000 # number of iterations
n_warm <- 4000 # number of iterations in the warm-up phase
n_thin <- 1 # thinning interval
n_seed <- 404

form <- brmsformula(y ~ 1 + FieldSite_num + (1 + FieldSite_num | PopulationCode))

mod <- brm(formula = form,
           data = subdata,
           chains = n_chains,
           cores = n_chains,
           seed = n_seed,
           iter = n_iter,
           warmup = n_warm,
           control = list(adapt_delta = 0.95))

# no warnings!

saveRDS(mod, file = here(paste0("outputs/ScotsPine/StanModels/",trait,"_TwoCG_SpeciesLevelModel_PopulationVaryingSlopes.rds")))

Summary of the model:

Code

mod <- readRDS(file = here(paste0("outputs/ScotsPine/StanModels/",trait,"_TwoCG_SpeciesLevelModel_PopulationVaryingSlopes.rds"))) 

summary(mod)

 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: y ~ 1 + FieldSite_num + (1 + FieldSite_num | PopulationCode) 
   Data: subdata (Number of observations: 1307) 
  Draws: 4 chains, each with iter = 8000; warmup = 4000; thin = 1;
         total post-warmup draws = 16000

Multilevel Hyperparameters:
~PopulationCode (Number of levels: 21) 
                             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)                  215.02     69.47    94.29   368.63 1.00     5973     7150
sd(FieldSite_num)               80.78     49.54     4.19   188.75 1.00     1747     5314
cor(Intercept,FieldSite_num)    -0.22      0.49    -0.89     0.88 1.00     5615     5919

Regression Coefficients:
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept      1451.71     72.86  1307.33  1593.62 1.00    11668    10884
FieldSite_num  1256.95     39.08  1180.61  1334.74 1.00    16523    11538

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma   600.95     12.07   577.99   625.09 1.00    24531    11256

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Interval plots of the population varying slopes (i.e., random slopes).

Code

# Extract posterior samples for population-level slopes
population_slopes <- as_draws_df(mod) %>%
  spread_draws(r_PopulationCode[PopulationCode, FieldSite_num]) %>% 
  filter(FieldSite_num=="FieldSite_num")

population_slopes %>%
  ggplot(aes(x = r_PopulationCode, y = PopulationCode)) +
  stat_halfeye(
    .width = c(0.80, 0.95),   # Specify intervals
    fill = "skyblue",
    point_interval = "median_qi",
    color = "blue") +
  theme_bw() +
  labs(title = "MCMC areas of population varying slopes (median, 80% and 95% credible intervals)",
       x = "Slope",
       y = "Population")

Global slope of the model and its 95% credible interval.

Code

mod_coeff <- mod %>% 
  broom.mixed::tidyMCMC(
    robust = T, # median and mean absolute deviation are used to compute point estimates and uncertainty
    ess = F, 
    rhat = F, 
    conf.int = T,
    conf.level = 0.95)

global_intercept <- mod_coeff$estimate[["b_Intercept"]]
global_slope <- mod_coeff$estimate[["b_FieldSite_num"]]
global_slope_low95 <- mod_coeff[mod_coeff$term=="b_FieldSite_num","conf.low"] %>% pull()
global_slope_high95 <- mod_coeff[mod_coeff$term=="b_FieldSite_num","conf.high"] %>% pull()

ggplot(data = subdata) +
  geom_point(aes(x = FieldSite_num, y = y), color = "gray") +
  scale_x_continuous(breaks = c(1, 2), 
                     limits = c(0.8, 2.2),
                     labels = c("Glensaugh (FE)", "Yair (FS)")) +  # Set x-axis limits and breaks
    geom_ribbon(aes(x = FieldSite_num, 
                    ymin = global_intercept + global_slope_low95 * FieldSite_num, 
                    ymax = global_intercept + global_slope_high95 * FieldSite_num),  
                inherit.aes = FALSE, 
                fill = "blue", 
                alpha = 0.2) + 
  geom_abline(slope = global_slope, 
              intercept = global_intercept, 
              color = "blue", 
              linetype = "dashed", 
              linewidth = 1) +
  theme_bw() +
  labs(title = "Global slope (median and 95% intervals)",
       x = "",
       y = "Tree height (mm)")

Now we look the estimated intercept and slope of each population. The credible intervals reflect uncertainty in the estimation of the slope only.

Code

pop_names <- subdata$PopulationCode %>% unique %>% sort

lapply(pop_names, function(pop_i){
  
  term_intercept <- paste0("r_PopulationCode[",pop_i,",Intercept]")
  term_slope <- paste0("r_PopulationCode[",pop_i,",FieldSite_num]")
  pop_intercept <- mod_coeff[mod_coeff$term==term_intercept,"estimate"] %>% pull
  pop_slope <- mod_coeff[mod_coeff$term==term_slope,"estimate"] %>% pull
  pop_slope_low95 <- mod_coeff[mod_coeff$term==term_slope,"conf.low"] %>% pull
  pop_slope_high95 <- mod_coeff[mod_coeff$term==term_slope,"conf.high"] %>% pull
  
ggplot(data = subdata) +
  geom_point(aes(x = FieldSite_num, y = y), color="gray") +
  scale_x_continuous(breaks = c(1, 2), 
                     limits = c(0.5, 2.5),
                     labels = c("Glensaugh (FE)", "Yair (FS)")) +  # Set x-axis limits and breaks
  geom_ribbon(aes(x = FieldSite_num, 
                  ymin = global_intercept + global_slope_low95 * FieldSite_num, 
                  ymax = global_intercept + global_slope_high95 * FieldSite_num),  
              inherit.aes = FALSE, 
              fill = "blue", 
              alpha = 0.2) + 
  geom_ribbon(aes(x = FieldSite_num, 
                  ymin = (global_intercept + pop_intercept) + (global_slope + pop_slope_low95) * FieldSite_num, 
                  ymax = (global_intercept + pop_intercept) + (global_slope + pop_slope_high95) * FieldSite_num),  
              inherit.aes = FALSE, 
              fill = "red", 
              alpha = 0.2) + 
  geom_point(data=subdata %>% filter(subdata$PopulationCode==pop_i), aes(x=FieldSite_num, y=y), color="red", alpha=0.8) +
  geom_abline(slope = global_slope, 
              intercept = global_intercept, 
              color = "blue", 
              linetype = "dashed", 
              linewidth = 1) +
  geom_abline(slope=(global_slope + pop_slope) , intercept=(global_intercept + pop_intercept), color="red", alpha=0.8, linewidth=0.5) +
  scale_color_manual(values=c("red",rep("gray",20))) +
  theme_bw() +
  labs(title = paste0("Population ",pop_i),
       x = "",
       y = "Tree height (mm)")
})

[[1]]


[[2]]


[[3]]


[[4]]


[[5]]


[[6]]


[[7]]


[[8]]


[[9]]


[[10]]


[[11]]


[[12]]


[[13]]


[[14]]


[[15]]


[[16]]


[[17]]


[[18]]


[[19]]


[[20]]


[[21]]

We extract the slopes for comparison.

Code

median_slope_df <- mod_coeff %>% 
  filter(stringr::str_detect(term, ",FieldSite_num]$")) %>% 
  mutate(Population = str_extract(term, "\\[([A-Z]{2}),")) %>%
  mutate(Population = str_remove_all(Population, "\\[|,")) %>% 
  dplyr::select(Population, estimate) %>% 
  dplyr::rename(median_slope = estimate)

Main take-home message: Very low variation in the slope across sites among populations, and the differences in slope are not statistically meaningful.

5.2 At the climatic cluster level

Code

subdata <- data %>% 
  dplyr::select(FieldSite,Block,FieldCode,PopulationCode,Family,Nursery,all_of(trait)) %>% 
  drop_na() %>% # 1848 to 1765 values
  filter(FieldSite %in% c("FE","FS")) %>% # 1765 to 1311
  filter(!(FieldSite=="FE" & Nursery =="NG")) %>% # 1311 to 1307
  dplyr::rename(y=any_of(trait)) %>% 
  arrange(FieldCode) %>% 
  mutate(FieldSite_num = as.numeric(as_factor(FieldSite))) %>% 
  mutate(ClimCluster = case_when(PopulationCode %in% c("AB","AC","BB","GD","GT", "RM") ~ "Cairngorms",
                                 PopulationCode %in% c("AM","BW","GA","GC","GE","GL", "MG","RD","SO") ~ "Central Atlantic",
                                 PopulationCode %in% c("BE","CC","CG","CR","LC", "SD") ~ "Hyper-oceanic"))

Code

# Model options
n_chains <- 4 # number of chains (MCMC)
n_iter <- 8000 # number of iterations
n_warm <- 4000 # number of iterations in the warm-up phase
n_thin <- 1 # thinning interval
n_seed <- 404

form <- brmsformula(y ~ 1 + FieldSite_num + (1 + FieldSite_num | ClimCluster))

mod <- brm(formula = form,
           data = subdata,
           chains = n_chains,
           cores = n_chains,
           seed = n_seed,
           iter = n_iter,
           warmup = n_warm,
           control = list(adapt_delta = 0.95))

# Messages d'avis :
# 1: There were 137 divergent transitions after warmup. See
# https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
# to find out why this is a problem and how to eliminate them. 
# 2: Examine the pairs() plot to diagnose sampling problems

saveRDS(mod, file = here(paste0("outputs/ScotsPine/StanModels/",trait,"_TwoCG_SpeciesLevelModel_ClusterVaryingSlopes.rds")))

Summary of the model:

Code

mod <- readRDS(file = here(paste0("outputs/ScotsPine/StanModels/",trait,"_TwoCG_SpeciesLevelModel_ClusterVaryingSlopes.rds"))) 

summary(mod)

Warning: There were 137 divergent transitions after warmup. Increasing adapt_delta above 0.95 may help. See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup

 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: y ~ 1 + FieldSite_num + (1 + FieldSite_num | ClimCluster) 
   Data: subdata (Number of observations: 1307) 
  Draws: 4 chains, each with iter = 8000; warmup = 4000; thin = 1;
         total post-warmup draws = 16000

Multilevel Hyperparameters:
~ClimCluster (Number of levels: 3) 
                             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)                  504.86    343.82   138.81  1445.36 1.00     4376     6729
sd(FieldSite_num)              125.08    153.45     3.21   572.58 1.00     2863     2343
cor(Intercept,FieldSite_num)    -0.22      0.57    -0.98     0.89 1.00     8447     8518

Regression Coefficients:
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept      1445.22    311.06   779.81  2090.72 1.00     4072     4867
FieldSite_num  1257.74     99.44  1033.58  1474.61 1.00     3423     2353

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma   608.17     12.05   585.24   632.26 1.00    14701    11049

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Interval plots of the climatic cluster varying slopes (i.e., random slopes).

Code

# Extract posterior samples for population-level slopes
clim_cluster_slopes <- as_draws_df(mod) %>%
  spread_draws(r_ClimCluster[ClimCluster, FieldSite_num]) %>% 
  filter(FieldSite_num=="FieldSite_num")

clim_cluster_slopes %>%
  ggplot(aes(x = r_ClimCluster, y = ClimCluster)) +
  stat_halfeye(
    .width = c(0.80, 0.95),   # Specify intervals
    fill = "skyblue",
    point_interval = "median_qi",
    color = "blue") +
  theme_bw() +
  labs(title = "MCMC areas of climatic cluster varying slopes (median, 80% and 95% credible intervals)",
       x = "Slope",
       y = "Climatic cluster")

Global slope of the model and its 95% credible interval.

Code

mod_coeff <- mod %>% 
  broom.mixed::tidyMCMC(
    robust = T, # median and mean absolute deviation are used to compute point estimates and uncertainty
    ess = F, 
    rhat = F, 
    conf.int = T,
    conf.level = 0.95)

global_intercept <- mod_coeff$estimate[["b_Intercept"]]
global_slope <- mod_coeff$estimate[["b_FieldSite_num"]]
global_slope_low95 <- mod_coeff[mod_coeff$term=="b_FieldSite_num","conf.low"] %>% pull()
global_slope_high95 <- mod_coeff[mod_coeff$term=="b_FieldSite_num","conf.high"] %>% pull()

ggplot(data = subdata) +
  geom_point(aes(x = FieldSite_num, y = y), color = "gray") +
  scale_x_continuous(breaks = c(1, 2), 
                     limits = c(0.8, 2.2),
                     labels = c("Glensaugh (FE)", "Yair (FS)")) +  # Set x-axis limits and breaks
    geom_ribbon(aes(x = FieldSite_num, 
                    ymin = global_intercept + global_slope_low95 * FieldSite_num, 
                    ymax = global_intercept + global_slope_high95 * FieldSite_num),  
                inherit.aes = FALSE, 
                fill = "blue", 
                alpha = 0.2) + 
  geom_abline(slope = global_slope, 
              intercept = global_intercept, 
              color = "blue", 
              linetype = "dashed", 
              linewidth = 1) +
  theme_bw() +
  labs(title = "Global slope (median and 95% intervals)",
       x = "",
       y = "Tree height (mm)")

Now we look the estimated intercept and slope of each climatic cluster The credible intervals reflect uncertainty in the estimation of the slope only.

Code

clim_cluster_names <- subdata$ClimCluster %>% unique %>% sort

lapply(clim_cluster_names, function(clim_cluster_i){
  
  if(clim_cluster_i == "Central Atlantic"){
    term_intercept <- paste0("r_ClimCluster[Central.Atlantic,Intercept]")
    term_slope <- paste0("r_ClimCluster[Central.Atlantic,FieldSite_num]")
  } else {
    term_intercept <- paste0("r_ClimCluster[",clim_cluster_i,",Intercept]")
    term_slope <- paste0("r_ClimCluster[",clim_cluster_i,",FieldSite_num]")
  }
  
  clim_cluster_intercept <- mod_coeff[mod_coeff$term==term_intercept,"estimate"] %>% pull
  clim_cluster_slope <- mod_coeff[mod_coeff$term==term_slope,"estimate"] %>% pull
  clim_cluster_slope_low95 <- mod_coeff[mod_coeff$term==term_slope,"conf.low"] %>% pull
  clim_cluster_slope_high95 <- mod_coeff[mod_coeff$term==term_slope,"conf.high"] %>% pull
  
ggplot(data = subdata) +
  geom_point(aes(x = FieldSite_num, y = y), color="gray") +
  scale_x_continuous(breaks = c(1, 2), 
                     limits = c(0.5, 2.5),
                     labels = c("Glensaugh (FE)", "Yair (FS)")) +  # Set x-axis limits and breaks
  geom_ribbon(aes(x = FieldSite_num, 
                  ymin = global_intercept + global_slope_low95 * FieldSite_num, 
                  ymax = global_intercept + global_slope_high95 * FieldSite_num),  
              inherit.aes = FALSE, 
              fill = "blue", 
              alpha = 0.2) + 
  geom_ribbon(aes(x = FieldSite_num, 
                  ymin = (global_intercept + clim_cluster_intercept) + (global_slope + clim_cluster_slope_low95) * FieldSite_num, 
                  ymax = (global_intercept + clim_cluster_intercept) + (global_slope + clim_cluster_slope_high95) * FieldSite_num),  
              inherit.aes = FALSE, 
              fill = "red", 
              alpha = 0.2) + 
  geom_point(data=subdata %>% filter(subdata$ClimCluster==clim_cluster_i), aes(x=FieldSite_num, y=y), color="red", alpha=0.8) +
  geom_abline(slope = global_slope, 
              intercept = global_intercept, 
              color = "blue", 
              linetype = "dashed", 
              linewidth = 1) +
  geom_abline(slope=(global_slope + clim_cluster_slope) , intercept=(global_intercept + clim_cluster_intercept), color="red", alpha=0.8, linewidth=0.5) +
  scale_color_manual(values=c("red",rep("gray",20))) +
  theme_bw() +
  labs(title = paste0("Climatic cluster: ",clim_cluster_i),
       x = "",
       y = "Tree height (mm)")
})

[[1]]


[[2]]


[[3]]

Main take-home message: As for populations, there is almost not variation in the slope across sites among climatic clusters.

6 RDPI

For that, the first option is to use the two common gardens in which the trees come from a unique nursery, i.e. FS (in which all trees come from NG) and FE (in which all trees come from NE, except four trees that come from NG and that we remove to calculate the RDPI index).

So we calculate the RDPI indexes capturing plasticity to environment variation between FS and FE.

Code

rdpi_data <- data %>% filter(!(FieldSite=="FE" & Nursery =="NG")) # we remove the four trees in FE that come from NG
pop_codes <- unique(rdpi_data$PopulationCode) %>% sort() # population codes
site_codes <- c("FE","FS") # field site codes
site_comb <- combn(site_codes,2)
calc_rel_dist <- function(x){abs(x[1]-x[2])/(x[1]+x[2])}

list_traits <- list(heights = rdpi_data %>% dplyr::select(contains("HA"), -contains("13")) %>% colnames(), # height measurements (except 2013)
                    duration_budburst = rdpi_data %>% dplyr::select(contains("BD")) %>% colnames(), # duration of bud burst
                    timing_budburst = rdpi_data %>% dplyr::select(contains("BT")) %>% colnames()) # time taken to reach budburst stages

lapply(list_traits, function(traits) {
  
  lapply(traits, function(trait){

  lapply(pop_codes, function(pop){
  
  subpop <- rdpi_data %>%
    drop_na(any_of(trait)) %>% 
    dplyr::filter(PopulationCode %in% pop)
  
  n_pop <- nrow(subpop) # nb of individuals in the population

 rel_dist <- lapply(1:dim(site_comb)[[2]], function(nb_comb){
 
    subsite1 <- subpop %>% dplyr::filter(FieldSite %in% site_comb[1,nb_comb])
    subsite2 <- subpop %>% dplyr::filter(FieldSite %in% site_comb[2,nb_comb])
    
    
    lapply(1:nrow(subsite1), function(x) {
      
      
      rbind(subsite1[x,trait],subsite2[,trait]) %>% 
        pull() %>% 
        combn(2, calc_rel_dist) %>% 
        .[1:nrow(subsite2)] %>% 
        as.numeric()
      
      }) %>% unlist()
    
    }) %>% unlist()
 
 N <- length(rel_dist)
 
sum(rel_dist)/N
 
 
 }) %>% 
    setNames(pop_codes) %>% 
    as_tibble() %>% 
    pivot_longer(everything(), names_to = "PopulationCode", values_to = "RDPI") %>% 
    mutate(Trait=trait)
  
  }) %>% bind_rows()

  
}) %>% saveRDS(file=here("outputs/ScotsPine/PhenotypicPlasticity/RDPI_based_on_FE_FS.rds"))

Code

list_titles <- list(list("heights","height"),
                    list("duration_budburst","the duration of budburst"),
                    list("timing_budburst","the time taken to reach budburst stages"))

lapply(list_titles, function(x){
  
  readRDS(file=here("outputs/ScotsPine/PhenotypicPlasticity/RDPI_based_on_FE_FS.rds"))[[x[[1]]]] %>% 
  pivot_wider(values_from = RDPI, names_from = Trait) %>% 
  dplyr::select(-PopulationCode) %>% 
  cor() %>% 
  corrplot(method = 'number',
           type = 'lower', diag = FALSE,
           title=paste0("Correlation among RDPI for ",x[[2]]),
           mar=c(0,0,2,0),
           number.cex=0.9,tl.cex=0.9)
  })

Let’s look at the correlation with the population varying slopes of the model above (median estimates).

Code

df <- readRDS(file=here("outputs/ScotsPine/PhenotypicPlasticity/RDPI_based_on_FE_FS.rds"))$heights %>% 
  filter(Trait == trait) %>% 
  dplyr::rename(Population = PopulationCode) %>% 
  inner_join(median_slope_df, by="Population")

cor(df$RDPI, df$median_slope)

[1] 0.4178298

7 Raw difference betwen population means

Code

diff_data <- data %>% 
  filter(FieldSite %in% c("FE","FS")) %>% # 1765 to 1311
  filter(!(FieldSite=="FE" & Nursery =="NG")) # we remove the four trees in FE that come from NG
pop_codes <- unique(diff_data$PopulationCode) %>% sort() # population codes

list_traits <- list(heights = diff_data %>% dplyr::select(contains("HA"), -contains("13")) %>% colnames(), # height measurements (except 2013)
                    duration_budburst = diff_data %>% dplyr::select(contains("BD")) %>% colnames(), # duration of bud burst
                    timing_budburst = diff_data %>% dplyr::select(contains("BT")) %>% colnames()) # time taken to reach budburst stages

lapply(list_traits, function(traits) {
  
  lapply(traits, function(trait){

  lapply(pop_codes, function(pop){
  
  raw_diff <- diff_data %>%
    drop_na(any_of(trait)) %>% 
    dplyr::filter(PopulationCode %in% pop) %>% 
    arrange(FieldSite) %>% 
    group_by(FieldSite) %>% 
    summarise(mean_trait = mean(!!sym(trait), na.rm = TRUE)) %>% 
    summarise(difference = diff(mean_trait)) %>%
    pull(difference)
    
  tibble(PopulationCode = pop,
         Raw_Diff = raw_diff,
         Trait = trait)
  
  }) %>% bind_rows()
    
  }) %>% bind_rows()

  
}) %>% saveRDS(file=here("outputs/ScotsPine/PhenotypicPlasticity/Raw_Diff_based_on_FE_FS.rds"))

Code

list_titles <- list(list("heights","height"),
                    list("duration_budburst","the duration of budburst"),
                    list("timing_budburst","the time taken to reach budburst stages"))

lapply(list_titles, function(x){
  
  readRDS(file=here("outputs/ScotsPine/PhenotypicPlasticity/Raw_Diff_based_on_FE_FS.rds"))[[x[[1]]]] %>% 
  pivot_wider(values_from = Raw_Diff, names_from = Trait) %>% 
  dplyr::select(-PopulationCode) %>% 
  cor() %>% 
  corrplot(method = 'number',
           type = 'lower', diag = FALSE,
           title=paste0("Correlation among raw differences for ",x[[2]]),
           mar=c(0,0,2,0),
           number.cex=0.9,tl.cex=0.9)
  })

8 Correlations

Correlation among population slopes, RDPI and raw differences

Code

df <- readRDS(file=here("outputs/ScotsPine/PhenotypicPlasticity/RDPI_based_on_FE_FS.rds"))$heights %>% 
  filter(Trait == trait) %>% 
  dplyr::rename(Population = PopulationCode) %>% 
  inner_join(median_slope_df, by="Population")

readRDS(file=here("outputs/ScotsPine/PhenotypicPlasticity/Raw_Diff_based_on_FE_FS.rds"))$heights %>% 
  filter(Trait == trait) %>% 
  dplyr::rename(Population = PopulationCode) %>% 
  inner_join(df, by=c("Population","Trait")) %>% 
  dplyr::select(-Population,-Trait) %>% 
  cor() %>% 
  corrplot(method = 'number',
           type = 'lower', diag = FALSE,
           title=paste0(""),
           mar=c(0,0,2,0),
           number.cex=0.9,tl.cex=0.9)

References

Valladares, Fernando, David Sanchez-Gomez, and Miguel A. Zavala. 2006. “Quantitative Estimation of Phenotypic Plasticity: Bridging the Gap Between the Evolutionary Concept and Its Ecological Applications.” Journal of Ecology 94 (6): 1103–16. https://doi.org/10.1111/j.1365-2745.2006.01176.x.