1 Goal

In this document, we calculate some indexes used as proxies of potential drivers of the within-population genetic variation: some population admixture scores, climate harshness indexes, environmental spatial and temporal heterogeneity indexes.

1.1 Population admixture scores

We calculated two indexes, which are both used in the paper.

\(A\) is the raw proportion of admixture for each population (i.e. the proportion of the population genome coming from "foreign" gene pools).
\(D\) is a the proportion of admixture weighted by the divergence between the source and sink gene pools, i.e. the proportion of belonging to each source gene pool weighted by the divergence between the source and sink gene pool.

1.2 Climate harshness indexes

In the first verion of the paper, the climate harhness indexes were \(SHM\) (summer heat moisture index) and \(invEMT\) (inverse of the extreme minimum temperatures) and they were extracted from raster files at 1kmx1km spatial resolution (the same as the one used to calculate the spatial enviromnental heterogeneity, see below). However, Maurizio Marchi (the developper of the Climate Downscaling Tool) told us that it is more accurate to extract point-estimate data with the bilinear interpolation of ClimateDT. So, in the reviewed version of the paper (after the second Heredity review), we use climate harshness indexes extracted with ClimateDT.

To obtain the climate harhness indexes, we calculated the mean over of period 1901-1950 of following annual variables:

\(EMT\) is the extreme minimum temperature. For a given year \(i\), \(EMT_i\) is calculated as follows: \[EMT_i = -18.09776 + 2.140643*tmn01_i +6.834903e-02*tmn01_i^2 + 2.252697e-04*tmn12_i - 4.770242e-02*tmn12_i^2 + 3.062871e-03*TD_i^2\]. To aid vizualisation and interpretation, we use \(invEMT\) (the inverse of \(EMT\)).
\(bio6\) is minimum temperature of the coldest month
\(MCMT\) is the mean coldest month temperature.
\(SHM\) is the summer heat moisture index. The higher the \(SHM\), the drier the weather.
\(bio14\) is the precipitation of the driest month.
\(SP\) is the summer precipitation.

1.3 Environmental spatial heterogeneity indexes

To calculate the environmental spatial heterogeneity indexes, we used climatic, soil and topographic data.

The climatic data were extracted from raster files at 1kmx1km spatial resolution (kindly provided by Maurizio Marchi), in which each variable was averaged over the period 1901-1950. Here is the list of the climatic variables used:

Annual variables
- \(EMT\): extreme minimum temperature (°C)
- \(SHM\): summer heat moisture index (°C/mm)
- \(SP\): summer precipitation (mm) (named \(PRCsum\) in the raster files and \(MSP\) in ClimateDT)
- \(SPR\): mean spring precipitation (mm)
- \(MWMT\): mean warmest month temperature (°C)
- \(MCMT\): mean coldest month temperature (°C)
- \(TD\): temperature difference, Celsius degrees (°C) (\(MWMT-MCMT\))
Monthly variables
- Monthly minimum temperature (°C): \(tmn01\) to \(tmn12\)
- Monthly maximum temperature (°C): \(tmx01\) to \(tmx12\)
- Monthly total precipitation (mm): \(prc01\) to \(prc12\)
- Hargreaves climatic moisture deficit (mm): \(CMD_01\) to \(CMD_12\)

Soil variables used:

\(DepthRoots\): Depth available to roots (cm)
\(Clay\): Clay content in the topsoil (0-30 cm) (%)
\(Sand\): Sand content in the topsoil (0-30 cm) (%)
\(Silt\): Silt content in the topsoil (0-30 cm) (%)

The topographic variable used is the topographic ruggedness index (unitless).

To calculate the environmental spatial heterogeneity indexes, we first extracted raster cell values of the climatic, topographic and soil variables within a 20-km radius around each population location, and we kept only raster cells that fell within forested areas, to avoid including environmental data from non-suitable areas (e.g. lakes, mountain peaks). We then performed a principal component analysis (PCA) on the set of the selected raster cells and extracted the PC1 and PC2 scores of each cell, accounting for 45.2% and 34.1% of the variance, respectively. To obtain the four indexes of environmental spatial heterogeneity, we calculated the variances of the PC1 and PC2 scores in a 20-km and 1.6-km radius around each population location:

SH1[20km] the environmental spatial heterogeneity of the PC1 score in a 20-km radius around the population location.
SH2[20km] the environmental spatial heterogeneity of the PC2 score in a 20-km radius around the population location.
SH1[1.6km] the environmental spatial heterogeneity of the PC1 score in a 1.6-km radius around the population location.
SH2[1.6km] the environmental spatial heterogeneity of the PC2 score in a 1.6-km radius around the population location.

1.4 Environmental temporal heterogeneity indexes

The environmntal temporal heterogeneity indexes corresponf to the variance over of period 1901-1950 of the same annual variables as the one used for climate harshness indexes.

2 Population coordinates

We extract the population (i.e. provenance) coordinates, transform them in a spatial object and extract the population structure data.

# Population coordinates
data <- readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds")
df <- unique(data[,c("prov","longitude_prov","latitude_prov")])

# We remove the ROD population because it has no genomic data,
# and the MAD population because it has only one clone (so the genetic variation can't
# be estimated)
df <- df[!(df$prov=="ROD"|df$prov=="MAD"),]
colnames(df) <- c("prov","longitude","latitude")

# Create a spatial object of the provenance coordinates 
xyprov <- SpatialPoints(df[,c("longitude","latitude")], 
                        proj4string=CRS("+proj=longlat +datum=WGS84"))

# Adding the proportion of belonging to each gene pool (population structure)
gp <- data[!(is.na(data$Q1)),] %>% 
  dplyr::select(prov,clon,paste0(rep("Q",6),1:6)) %>% 
  distinct() %>%  # keep only one row per clone
  group_by(prov) %>% 
  summarise_at(vars(paste0(rep("Q",6),1:6)), mean) 

for (i in 1:length(gp$prov)){gp[i,"mainGP"] <- names(gp[i,2:7])[which.max(apply(gp[i,2:7],MARGIN=2,max))]}
colnames(gp) <- c("prov","gpNA","gpC","gpCS","gpFA","gpIA","gpSES","mainGP")

df <- left_join(df,gp,by="prov")

To extract point-estimate climatic data with ClimateDT, we have to create a csv file with the provenance coordinates and elevation.

I first created a csv with the initial provenance elevation data, which were provided with the CLONAPIN data. However, for the CAS provenance, there were more than 200m difference between the initial provenance elevetion data (391m) and the elevation data from the DEEM used in ClimateDT (158.43m).

Looking at the location of the CAS provenance on a map, we considered the DEEM elevation data more plausible because the locations with an elevation of about 300m were quite far from the GPS point of the provenance. Thus, for the CAS provenance, we will use the DEEM elevation data.

# Building the csv file for ClimateDT
provcoord <- readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds")
provcoord <- unique(provcoord[,c("prov","latitude_prov","longitude_prov","altitude_prov")])

provcoord <- provcoord[!(provcoord$prov=="ROD"|provcoord$prov=="MAD"),] # removing MAD and ROD
colnames(provcoord) <- c("prov","longitude","latitude","altitude")

provcoord[provcoord$prov=="CAS","altitude"] <- 158 # we change the elevation data of the CAS provenance

write_csv(provcoord,"data/Climate/Raw/ClimateDTdownloads/provenance_coordinates_readyforClimateDT.csv")

3 Population admixture scores

We used estimates from Jaramillo-Correa et al. (2015), that were obtained using the Bayesian approach available in Structure (Pritchard, Stephens, and Donnelly 2000). Jaramillo-Correa et al. (2015) estimated:

for each clone, the proportions of ancestry from each of the six known gene pools in maritime pine. In the df dataset, these proportions are averaged by population and correspond to the columns gpNA, gpC, gpCS, gpFA, gpIA and gpSES (see below).
the allele frequency divergence of each gene pool from the common ancestral one (Fk, which should be numerically similar to Fst; Falush et al., 2003):

First, we assigned each clone to the gene pool that contributed more than 50% ancestry (corresponding to the column mainGP in the df dataset, averaged by population).

Then, we calculated the two admixture scores as follows:

The population admixture score \(A\) corresponds to the proportion of ancestry originating from the "foreign" gene pools.
To obtain the admixture score \(D\), we weighted the proportions of ancestry originating from the "foreign" gene pools by the allele frequency divergence between the source and sink gene pools, i.e. the sum of the allele frequency divergence of the source and sink gene pools from the common ancestral one. Thus, the admixture score \(D\) considers both the percentage of ‘foreign’ ancestries and their respective divergence. We developed \(D\) considering that some gene pools are more divergent than others and thus may bring higher genetic diversity to an admixed population at the same level of introgression. To check the weights used to calculate \(D\), we correlate them with the pairwise \(F_{st}\) among gene pools. We also check the correlation between \(D\) and \(D_{fst}\), i.e. an expected similar index obtained by weighting the proportions of ancestry originating from the "foreign" gene pools by the pairwise \(F_{st}\).

\(A\) and \(D\) values were calculated for each clone and then averaged by population.

3.1 Calculating \(A\) and \(D\)

# Estimated divergence between each gene pool and the common ancestral one 
# From Jaramillo-Correa et al. (2015)
divNA <- 0.5268   # Northern Africa
divC <- 0.2739    # Corsica
divCS <- 0.0600   # Central Spain
divFA <- 0.1374   # French Atlantic
divIA <- 0.1944   # Iberian Atlantic 
divSES <- 0.1522  # South-estern Spain


df <- df %>% 
  mutate(mainGP=case_when(mainGP=="Q1"~"gpNA",
                          mainGP=="Q2"~"gpC",
                          mainGP=="Q3"~"gpCS",
                          mainGP=="Q4"~"gpFA",
                          mainGP=="Q5"~"gpIA",
                          mainGP=="Q6"~"gpSES"),
         
         # Admixture score A:
         A = case_when(mainGP=="gpNA"~1-gpNA, 
                       mainGP=="gpC"~1-gpC,
                       mainGP=="gpCS"~1-gpCS,
                       mainGP=="gpFA"~1-gpFA,
                       mainGP=="gpIA"~1-gpIA,
                       mainGP=="gpSES"~1-gpSES),
  
         # Admixture score D:
         D = case_when(mainGP=="gpNA"~ gpC*(divC+divNA) + gpCS*(divCS+divNA) + gpFA*(divFA+divNA) + gpIA*(divIA+divNA) + gpSES*(divSES+divNA),
                       mainGP=="gpC"~ gpNA*(divNA+divC) + gpCS*(divCS+divC) + gpFA*(divFA+divC) + gpIA*(divIA+divC) + gpSES*(divSES+divC),
                       mainGP=="gpCS"~ gpNA*(divNA+divCS) + gpC*(divC+divCS) + gpFA*(divFA+divCS) + gpIA*(divIA+divCS) + gpSES*(divSES+divCS),
                       mainGP=="gpFA"~ gpNA*(divNA+divFA) + gpC*(divC+divFA) + gpCS*(divCS+divFA) + gpIA*(divIA+divFA) + gpSES*(divSES+divFA),
                       mainGP=="gpIA"~ gpNA*(divNA+divIA) + gpC*(divC+divIA) + gpCS*(divCS+divIA) + gpFA*(divFA+divIA) + gpSES*(divSES+divIA),
                       mainGP=="gpSES"~ gpNA*(divNA+divSES) + gpC*(divC+divSES) + gpCS*(divCS+divSES) + gpFA*(divFA+divSES) + gpIA*(divIA+divSES)))

CorAD <- cor(df$A,df$D)

# Show df
df %>%
  kable(digits=3) %>%
  kable_styling(font_size=11,
                bootstrap_options = c("striped","hover", "condensed"), full_width = F) %>%
  column_spec(1:2, bold = T)

prov	longitude	latitude	gpNA	gpC	gpCS	gpFA	gpIA	gpSES	mainGP	A	D
CEN	-4.491	40.278	0.012	0.002	0.884	0.003	0.048	0.051	gpCS	0.116	0.031
ARN	-5.116	40.195	0.010	0.002	0.955	0.008	0.010	0.014	gpCS	0.045	0.014
ALT	-6.494	43.283	0.003	0.000	0.109	0.092	0.793	0.002	gpIA	0.207	0.061
SAL	-3.063	41.835	0.010	0.004	0.944	0.027	0.008	0.007	gpCS	0.056	0.016
COM	-3.954	36.834	0.240	0.028	0.126	0.011	0.039	0.557	gpSES	0.443	0.218
CAD	-6.418	43.540	0.002	0.001	0.050	0.010	0.935	0.002	gpIA	0.065	0.018
VAL	-4.311	40.516	0.012	0.003	0.940	0.007	0.013	0.025	gpCS	0.060	0.018
MIM	-1.303	44.134	0.004	0.002	0.025	0.951	0.012	0.006	gpFA	0.049	0.014
LEI	-8.957	39.783	0.004	0.003	0.511	0.007	0.473	0.002	gpCS	0.489	0.125
BAY	-2.877	41.523	0.003	0.004	0.967	0.013	0.009	0.004	gpCS	0.033	0.009
SIE	-6.493	43.528	0.003	0.001	0.076	0.015	0.903	0.001	gpIA	0.097	0.028
LAM	-6.219	43.559	0.002	0.001	0.004	0.051	0.942	0.000	gpIA	0.058	0.020
TAM	-5.017	33.600	0.932	0.000	0.027	0.000	0.039	0.001	gpNA	0.068	0.045
COC	-4.498	41.255	0.017	0.005	0.831	0.060	0.044	0.044	gpCS	0.169	0.044
OLO	-1.831	46.566	0.003	0.001	0.007	0.979	0.006	0.003	gpFA	0.021	0.007
STJ	-2.029	46.764	0.003	0.002	0.028	0.946	0.017	0.004	gpFA	0.054	0.015
CUE	-4.484	41.375	0.003	0.001	0.872	0.063	0.058	0.002	gpCS	0.128	0.030
PET	-1.300	44.064	0.003	0.001	0.021	0.966	0.004	0.004	gpFA	0.034	0.009
ORI	-2.351	37.531	0.248	0.005	0.028	0.001	0.010	0.708	gpSES	0.292	0.180
SEG	-8.450	42.817	0.003	0.001	0.149	0.013	0.831	0.003	gpIA	0.169	0.046
OLB	-0.623	40.173	0.080	0.009	0.777	0.006	0.003	0.125	gpCS	0.223	0.078
QUA	-0.359	38.972	0.092	0.006	0.499	0.005	0.015	0.382	gpCS	0.501	0.142
CAS	-6.983	43.501	0.001	0.000	0.005	0.001	0.992	0.001	gpIA	0.008	0.003
PLE	-2.344	47.781	0.005	0.001	0.064	0.920	0.005	0.005	gpFA	0.080	0.020
BON	-1.661	39.986	0.164	0.010	0.640	0.003	0.002	0.181	gpCS	0.360	0.139
HOU	-1.150	45.183	0.004	0.001	0.026	0.960	0.007	0.002	gpFA	0.040	0.011
VER	-1.091	45.552	0.003	0.002	0.018	0.972	0.004	0.001	gpFA	0.028	0.008
ARM	-6.458	43.305	0.005	0.007	0.022	0.006	0.959	0.001	gpIA	0.041	0.015
CAR	-4.277	41.172	0.001	0.001	0.904	0.060	0.023	0.011	gpCS	0.096	0.021
PIE	9.038	41.973	0.007	0.969	0.023	0.000	0.001	0.001	gpC	0.031	0.014
PUE	-6.631	43.548	0.002	0.000	0.021	0.001	0.973	0.002	gpIA	0.027	0.008
PIA	9.465	42.021	0.004	0.974	0.010	0.001	0.008	0.003	gpC	0.026	0.012
SAC	-8.364	42.118	0.004	0.001	0.291	0.003	0.699	0.002	gpIA	0.301	0.079

\(A\) and \(D\) are correlated at 0.91.

3.2 Calculating \(D_{fst}\)

We calculate the pairwise \(F_{st}\) matrix among gene pools.

# Script from /home/juliette/Documents/GenomicOffset/GenomicOffsetPinPin/reports/PreparingGenomicData/CalculatingPairwiseFST.Rmd

# File with the genotype names (clone names)
geno_names <- read.delim2("../../GenomicOffset/GenomicOffsetPinPin/data/ClonapinBlups523IndPiMASSJuly2019.txt", row.names=1)

# File with the genotype of each clone for each SNP
geno <- read.csv("../../GenomicOffset/GenomicOffsetPinPin/data/5165snps523genotypesNA.txt", header=FALSE, row.names=1)

# In this file, SNPs have their names, but not the genotypes.
# head(geno[,1:10])

# Removing the first two columns with allele info (A,T, G or C)
geno <- geno[,3:dim(geno)[[2]]]

# Give the genotype name for each column of geno
colnames(geno) <- rownames(geno_names)

# ----

# Preparing genomic data
geno[geno ==1] <- 12
geno[geno ==2] <- 22
geno[geno ==0] <- 11
geno <- t(geno) # SNps in column and genotypes in row

geno <- geno %>%  
  as.data.frame() %>% 
  dplyr::mutate(prov=substr(row.names(geno), 0, 3)) %>% 
   dplyr::filter(!(prov=="MAD")) %>% 
  left_join(df[,c("prov","mainGP")],by="prov") %>% 
  dplyr::select(prov, mainGP, everything())

# Calculate the pairwise Fst with the hierfstat package
fst <- pairwise.WCfst(geno[,-1],diploid=TRUE)
saveRDS(fst,file="data/GenomicData/Gp_Fstmatrix.rds")

We first compare the weights used to calculate \(D\) and the pairwise \(F_{st}\).

fst <- readRDS(file="data/GenomicData/Gp_Fstmatrix.rds")
fst[lower.tri(fst,diag = FALSE)] <- NA

DFcomp <-  as.data.frame(fst) %>% 
  rownames_to_column(var="GP1") %>% 
  melt(value.name = "fst",variable.name="GP2") %>% 
  drop_na(fst) %>% 
  dplyr::mutate(DivGP1 = case_when(GP1=="gpC"~ divC,
                                   GP1=="gpCS"~divCS,
                                   GP1=="gpFA"~divFA,
                                   GP1=="gpIA"~divIA,
                                   GP1=="gpNA"~divNA),
                                 DivGP2 = case_when(GP2=="gpCS"~divCS,
                                   GP2=="gpFA"~divFA,
                                   GP2=="gpIA"~divIA,
                                   GP2=="gpSES"~divSES,
                                   GP2=="gpNA"~divNA),
                                 DivTot=DivGP1+DivGP2) 
CorDivTotfst <- cor(DFcomp$DivTot,DFcomp$fst)

DFcomp %>%
  kable(digits=3) %>%
  kable_styling(font_size=11,
                bootstrap_options = c("striped","hover", "condensed"), full_width = F) %>%
  column_spec(1:2, bold = T)

	GP1	GP2	fst	DivGP1	DivGP2	DivTot
7	gpC	gpCS	0.132	0.274	0.060	0.334
13	gpC	gpFA	0.185	0.274	0.137	0.411
14	gpCS	gpFA	0.071	0.060	0.137	0.197
19	gpC	gpIA	0.202	0.274	0.194	0.468
20	gpCS	gpIA	0.062	0.060	0.194	0.254
21	gpFA	gpIA	0.119	0.137	0.194	0.332
25	gpC	gpNA	0.323	0.274	0.527	0.801
26	gpCS	gpNA	0.244	0.060	0.527	0.587
27	gpFA	gpNA	0.293	0.137	0.527	0.664
28	gpIA	gpNA	0.332	0.194	0.527	0.721
31	gpC	gpSES	0.175	0.274	0.152	0.426
32	gpCS	gpSES	0.083	0.060	0.152	0.212
33	gpFA	gpSES	0.150	0.137	0.152	0.290
34	gpIA	gpSES	0.163	0.194	0.152	0.347
35	gpNA	gpSES	0.157	0.527	0.152	0.679

Weights used to calculate \(D\) and the pairwise \(F_{st}\) are correlated at 0.9.

DFfst <- readRDS(file="data/GenomicData/Gp_Fstmatrix.rds") %>% 
  as.data.frame() %>% 
  rownames_to_column(var="GP1") %>% 
  melt(value.name = "fst",variable.name="GP2") %>% 
  drop_na(fst)

df <- df %>% mutate(Dfst = case_when(mainGP=="gpNA"~ gpC*DFfst[DFfst$GP1=="gpC"&DFfst$GP2=="gpNA","fst"] +
                                       gpCS*DFfst[DFfst$GP1=="gpCS"&DFfst$GP2=="gpNA","fst"] + 
                                       gpFA*DFfst[DFfst$GP1=="gpFA"&DFfst$GP2=="gpNA","fst"] + 
                                       gpIA*DFfst[DFfst$GP1=="gpIA"&DFfst$GP2=="gpNA","fst"] + 
                                       gpSES*DFfst[DFfst$GP1=="gpSES"&DFfst$GP2=="gpNA","fst"],
                       mainGP=="gpC"~ gpNA*DFfst[DFfst$GP1=="gpNA"&DFfst$GP2=="gpC","fst"]+ 
                                       gpCS*DFfst[DFfst$GP1=="gpCS"&DFfst$GP2=="gpC","fst"] + 
                                       gpFA*DFfst[DFfst$GP1=="gpFA"&DFfst$GP2=="gpC","fst"] + 
                                       gpIA*DFfst[DFfst$GP1=="gpIA"&DFfst$GP2=="gpC","fst"] + 
                                       gpSES*DFfst[DFfst$GP1=="gpSES"&DFfst$GP2=="gpC","fst"],
                       mainGP=="gpCS"~ gpNA*DFfst[DFfst$GP1=="gpNA"&DFfst$GP2=="gpCS","fst"] + 
                                       gpC*DFfst[DFfst$GP1=="gpC"&DFfst$GP2=="gpCS","fst"] + 
                                       gpFA*DFfst[DFfst$GP1=="gpFA"&DFfst$GP2=="gpCS","fst"] + 
                                       gpIA*DFfst[DFfst$GP1=="gpIA"&DFfst$GP2=="gpCS","fst"] + 
                                       gpSES*DFfst[DFfst$GP1=="gpSES"&DFfst$GP2=="gpCS","fst"],
                       mainGP=="gpFA"~ gpNA*DFfst[DFfst$GP1=="gpNA"&DFfst$GP2=="gpFA","fst"] + 
                                       gpC*DFfst[DFfst$GP1=="gpC"&DFfst$GP2=="gpFA","fst"] +
                                       gpCS*DFfst[DFfst$GP1=="gpCS"&DFfst$GP2=="gpFA","fst"] + 
                                       gpIA*DFfst[DFfst$GP1=="gpIA"&DFfst$GP2=="gpFA","fst"] + 
                                       gpSES*DFfst[DFfst$GP1=="gpSES"&DFfst$GP2=="gpFA","fst"],
                       mainGP=="gpIA"~ gpNA*DFfst[DFfst$GP1=="gpNA"&DFfst$GP2=="gpIA","fst"] + 
                                       gpC*DFfst[DFfst$GP1=="gpC"&DFfst$GP2=="gpIA","fst"] + 
                                       gpCS*DFfst[DFfst$GP1=="gpCS"&DFfst$GP2=="gpIA","fst"] + 
                                       gpFA*DFfst[DFfst$GP1=="gpFA"&DFfst$GP2=="gpIA","fst"] + 
                                       gpSES*DFfst[DFfst$GP1=="gpSES"&DFfst$GP2=="gpIA","fst"],
                       mainGP=="gpSES"~ gpNA*DFfst[DFfst$GP1=="gpNA"&DFfst$GP2=="gpSES","fst"] + 
                                       gpC*DFfst[DFfst$GP1=="gpC"&DFfst$GP2=="gpSES","fst"] + 
                                       gpCS*DFfst[DFfst$GP1=="gpCS"&DFfst$GP2=="gpSES","fst"] + 
                                       gpFA*DFfst[DFfst$GP1=="gpFA"&DFfst$GP2=="gpSES","fst"]+ 
                                       gpIA*DFfst[DFfst$GP1=="gpIA"&DFfst$GP2=="gpSES","fst"]))

CorDDfst <- cor(df$D,df$Dfst)
CorADfst <- cor(df$A,df$Dfst)

# Show df
df %>%
  kable(digits=3) %>%
  kable_styling(font_size=11,
                bootstrap_options = c("striped","hover", "condensed"), full_width = F) %>%
  column_spec(1, bold = T)

prov	longitude	latitude	gpNA	gpC	gpCS	gpFA	gpIA	gpSES	mainGP	A	D	Dfst
CEN	-4.491	40.278	0.012	0.002	0.884	0.003	0.048	0.051	gpCS	0.116	0.031	0.011
ARN	-5.116	40.195	0.010	0.002	0.955	0.008	0.010	0.014	gpCS	0.045	0.014	0.005
ALT	-6.494	43.283	0.003	0.000	0.109	0.092	0.793	0.002	gpIA	0.207	0.061	0.019
SAL	-3.063	41.835	0.010	0.004	0.944	0.027	0.008	0.007	gpCS	0.056	0.016	0.006
COM	-3.954	36.834	0.240	0.028	0.126	0.011	0.039	0.557	gpSES	0.443	0.218	0.061
CAD	-6.418	43.540	0.002	0.001	0.050	0.010	0.935	0.002	gpIA	0.065	0.018	0.005
VAL	-4.311	40.516	0.012	0.003	0.940	0.007	0.013	0.025	gpCS	0.060	0.018	0.007
MIM	-1.303	44.134	0.004	0.002	0.025	0.951	0.012	0.006	gpFA	0.049	0.014	0.006
LEI	-8.957	39.783	0.004	0.003	0.511	0.007	0.473	0.002	gpCS	0.489	0.125	0.031
BAY	-2.877	41.523	0.003	0.004	0.967	0.013	0.009	0.004	gpCS	0.033	0.009	0.003
SIE	-6.493	43.528	0.003	0.001	0.076	0.015	0.903	0.001	gpIA	0.097	0.028	0.008
LAM	-6.219	43.559	0.002	0.001	0.004	0.051	0.942	0.000	gpIA	0.058	0.020	0.007
TAM	-5.017	33.600	0.932	0.000	0.027	0.000	0.039	0.001	gpNA	0.068	0.045	0.020
COC	-4.498	41.255	0.017	0.005	0.831	0.060	0.044	0.044	gpCS	0.169	0.044	0.015
OLO	-1.831	46.566	0.003	0.001	0.007	0.979	0.006	0.003	gpFA	0.021	0.007	0.003
STJ	-2.029	46.764	0.003	0.002	0.028	0.946	0.017	0.004	gpFA	0.054	0.015	0.006
CUE	-4.484	41.375	0.003	0.001	0.872	0.063	0.058	0.002	gpCS	0.128	0.030	0.009
PET	-1.300	44.064	0.003	0.001	0.021	0.966	0.004	0.004	gpFA	0.034	0.009	0.004
ORI	-2.351	37.531	0.248	0.005	0.028	0.001	0.010	0.708	gpSES	0.292	0.180	0.044
SEG	-8.450	42.817	0.003	0.001	0.149	0.013	0.831	0.003	gpIA	0.169	0.046	0.012
OLB	-0.623	40.173	0.080	0.009	0.777	0.006	0.003	0.125	gpCS	0.223	0.078	0.032
QUA	-0.359	38.972	0.092	0.006	0.499	0.005	0.015	0.382	gpCS	0.501	0.142	0.056
CAS	-6.983	43.501	0.001	0.000	0.005	0.001	0.992	0.001	gpIA	0.008	0.003	0.001
PLE	-2.344	47.781	0.005	0.001	0.064	0.920	0.005	0.005	gpFA	0.080	0.020	0.008
BON	-1.661	39.986	0.164	0.010	0.640	0.003	0.002	0.181	gpCS	0.360	0.139	0.057
HOU	-1.150	45.183	0.004	0.001	0.026	0.960	0.007	0.002	gpFA	0.040	0.011	0.004
VER	-1.091	45.552	0.003	0.002	0.018	0.972	0.004	0.001	gpFA	0.028	0.008	0.003
ARM	-6.458	43.305	0.005	0.007	0.022	0.006	0.959	0.001	gpIA	0.041	0.015	0.005
CAR	-4.277	41.172	0.001	0.001	0.904	0.060	0.023	0.011	gpCS	0.096	0.021	0.007
PIE	9.038	41.973	0.007	0.969	0.023	0.000	0.001	0.001	gpC	0.031	0.014	0.005
PUE	-6.631	43.548	0.002	0.000	0.021	0.001	0.973	0.002	gpIA	0.027	0.008	0.003
PIA	9.465	42.021	0.004	0.974	0.010	0.001	0.008	0.003	gpC	0.026	0.012	0.005
SAC	-8.364	42.118	0.004	0.001	0.291	0.003	0.699	0.002	gpIA	0.301	0.079	0.020

df %>% 
  arrange(prov) %>% 
  dplyr::select(-longitude,-latitude) %>% 
  xtable(type = "latex",digits=2) %>% 
  print(file = "tables/Drivers/PopulationAdmixtureIndexesValues.tex",
      include.rownames=FALSE)
# print(xtable(df, type = "latex",digits=2),
#       file = "tables/Drivers/ClimaticDriversValues.tex",
#       include.rownames=FALSE)

\(A\) and \(D_{fst}\) are correlated at 0.91 and \(D\) and \(D_{fst}\) at 0.96.

4 Spatial heterogeneity

4.1 Visualization buffer areas + maritime pine distribution

For visualization, we create different potential buffer zones around the location of the populations, respectively with a radius of 5km, 10km, 20km, 50km, 75km and 100km.

buff1 <- buffer(xyprov,width=1000,dissolve=F) # 1km
buff5 <- buffer(xyprov,width=5000,dissolve=F) # 5km
buff10 <- buffer(xyprov,width=10000,dissolve=F) # 10km
buff20 <- buffer(xyprov,width=20000,dissolve=F) # 20km
buff50 <- buffer(xyprov,width=50000,dissolve=F) # 50km
buff75 <- buffer(xyprov,width=75000,dissolve=F) # 75km
buff100 <- buffer(xyprov,width=100000,dissolve=F) # 100km

We will project the buffer zones with the maritime pine distribution, which is based on the EUFORGEN distribution (http://www.euforgen.org/) and 10-km radius areas around the National Forest Inventory plots with maritime pines (only for Europe). The problem with using this distribution for further analyses is that it is not precise, including non-forested areas and not including the TAM population (Moroccan population). That's why we will not use it in further analyses, and we will use raster of forested areas instead.

# Maritime pine distribution from EUFORGEN and the National Forest Inventories
distri <- shapefile("data/PinPinDistri/PinpinDistriEUforgen_NFIplotsBuffer10km.shp")

Map of the buffer zones + maritime pine distribution

In the article and the rest of the document, we will use the buffer areas of 20-km radius.

4.2 Creating a mask of the forested areas

# Load a raster of one variable that will be used to set the resolution and extent of the mask
# Here precipitation in January with resolution 1000m 
prc01 <- raster("data/Climate/Raw/FromMarta_B4ESTClimate/1901-1950_CRU/prc01.tif")

4.2.1 In Europe

The forested areas are the "Forest Type 2015" of Copernicus. 100 m resolution. https://land.copernicus.eu/pan-european/high-resolution-layers/forests/forest-type-1/status-maps/2015?tab=metadata

The CRS is EPSG:3035 (ETRS89, LAEA) (+proj=laea +lat_0=52 +lon_0=10 +x_0=4321000 +y_0=3210000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs).

We keep all the forested areas, so broadleaved forests (1), coniferous forests (2) and mixed forests (3).

Legend:

0 all non-tree and non-forest areas
1 broadleaved forest
2 coniferous forest
3 mixed forest (only for aggregated 100m layer)

# Merge the different tiles of Copernicus (100 m resolution)
rast1 <- raster("data/ForestedAreas/RawData/Europe/FTY_2015_100m_eu_03035_d02_E20N20/FTY_2015_100m_eu_03035_d02_E20N20.tif")
rast2 <- raster("data/ForestedAreas/RawData/Europe/FTY_2015_100m_eu_03035_d02_E30N10/FTY_2015_100m_eu_03035_d02_E30N10.tif")
rast3 <- raster("data/ForestedAreas/RawData/Europe/FTY_2015_100m_eu_03035_d02_E30N20/FTY_2015_100m_eu_03035_d02_E30N20.tif")
rast4 <- raster("data/ForestedAreas/RawData/Europe/FTY_2015_100m_eu_03035_d02_E40N20/FTY_2015_100m_eu_03035_d02_E40N20.tif")

# Change the CRS of the buffer areas of 20-km radius in CRS ETRS89-LAEA Europe
buffer <- spTransform(buff20,crs(rast1))

# Keep only raster values within the 20-km radius around each population location
rast1 <- mask(rast1,buffer,updatevalue=NA)
rast2 <- mask(rast2,buffer,updatevalue=NA)
rast3 <- mask(rast3,buffer,updatevalue=NA)
rast4 <- mask(rast4,buffer,updatevalue=NA)


# We keep cells with forested areas, so either broadleaved forests (1), conifer forests (2) or mixed forests (3).
# For that, we attribute 1 to cells with forested areas and 0 to others.

rast1[!(rast1==2|rast1==3|rast1==1)] <- NA
rast2[!(rast2==2|rast2==3|rast2==1)] <- NA
rast3[!(rast3==2|rast3==3|rast3==1)] <- NA
rast4[!(rast4==2|rast4==3|rast4==1)] <- NA

rast1[rast1==2|rast1==3|rast1==1] <- 1
rast2[rast2==2|rast2==3|rast2==1] <- 1
rast3[rast3==2|rast3==3|rast3==1] <- 1
rast4[rast4==2|rast4==3|rast4==1] <- 1

# Save the rasters:
writeRaster(rast1,"data/ForestedAreas/ForestedAreasMasks/EuropeForestAreasRast1.tif",overwrite=T)
writeRaster(rast2,"data/ForestedAreas/ForestedAreasMasks/EuropeForestAreasRast2.tif",overwrite=T)
writeRaster(rast3,"data/ForestedAreas/ForestedAreasMasks/EuropeForestAreasRast3.tif",overwrite=T)
writeRaster(rast4,"data/ForestedAreas/ForestedAreasMasks/EuropeForestAreasRast4.tif",overwrite=T)

We use saga version 2.3.1 to combine the tiles together because I did not manage to run the functions merge and mosaic of the raster package.

cd ~/Documents/H2Pinpin/H2Pinpin/data/ForestedAreas/ForestedAreasMasks/

# Change the file format so that it can be used by SAGA
saga_cmd io_gdal 0 -GRIDS sagafiles/EuropeForestAreasRast1.sgrd -FILES EuropeForestAreasRast1.tif 
saga_cmd io_gdal 0 -GRIDS sagafiles/EuropeForestAreasRast2.sgrd -FILES EuropeForestAreasRast2.tif 
saga_cmd io_gdal 0 -GRIDS sagafiles/EuropeForestAreasRast3.sgrd -FILES EuropeForestAreasRast3.tif 
saga_cmd io_gdal 0 -GRIDS sagafiles/EuropeForestAreasRast4.sgrd -FILES EuropeForestAreasRast4.tif 

# Merging the files (mosaic)
saga_cmd grid_tools 3 -GRIDS sagafiles/EuropeForestAreasRast1.sgrd\;sagafiles/EuropeForestAreasRast2.sgrd\;sagafiles/EuropeForestAreasRast3.sgrd\;sagafiles/EuropeForestAreasRast4.sgrd\
                      -TYPE 7 \
                      -OVERLAP 1 \
                      -BLEND_DIST 10.000000 \
                      -TARGET_OUT_GRID sagafiles/EuropeForestAreas.sgrd
                      
# from format sgrd (SAGA format) to tif
gdal_translate -of GTiff sagafiles/EuropeForestAreas.sdat EuropeForestAreas.tif

# Load the mosaic of forested areas tiles
rast <- raster("data/ForestedAreas/ForestedAreasMasks/EuropeForestAreas.tif")

# Create a mask at 1000 m resolution and CRS ETRS89-LAEA Europe
eu.mask <- crop(rast,extent(prc01)) 
eu.mask <- raster::resample(eu.mask,prc01) 
eu.mask <- mask(prc01,eu.mask)

4.2.2 In Morocco

Data downloaded from here: http://www.fao.org/geonetwork/srv/fr/metadata.show?currTab=simple&id=37195

Legend of the LCCCode:

0003 / 0004 = Mosaic cropland vegetation
0004 // 0003 = Mosaic vegetation / cropland
0010 = Artficial surfaces and associated areas
0011 = Bare areas
11490 // 11494 = Rainfed trees and scrub crops
11498 = Rainfed cropland
20049 // 20058 = Sparse vegetation
20058 = Sparse grassland
21446 // 21450-121340 / 21454 = Mosaic forest or shrubland/grassland
21450 = Closed to open shrubland
21454 // 21446 // 21450 = Mosaic grassland/forest or shrubland
21496 // 21497-15048 = Closed to open broadleaved evergreen or semidecidous forest
21496-121340 // 21497-129401 = Closed (>40%) broadleaved evergreen or semidecidous forest
21497-121340 = Closed (>40%) broadleaved decidous forest
21497-15045 = Closed to open (>15%) mixed broadleaved decidous and needleaved evergreen forest
21499-121340 = Closed (>40%) needeleaved evergreen forest
21518 = Closed to open broadleaved decidous shrubland
41638-60686-R2 // 41898-60686-R2 = Closed broadleaved forest or shrubland permanently flooded - brakish water
6001 = Consolidated bare areas
6004 = Non - consolidated bare areas
6020 = Salt hardpans
7001 // 8001 = Water bodies

We selected cells with the codes in bold.

# Load the data
mor <- shapefile("data/ForestedAreas/RawData/Morocco/mar_gc_adg.shp")

# Keep only the forested areas:
submor <- mor[mor$LCCCode == "21499-121340" |
              mor$LCCCode == "21497-15045" |
              mor$LCCCode == "21454 // 21446 // 21450" |
              mor$LCCCode == "21497-121340" |
              mor$LCCCode == "21496 // 21497-15048" |
              mor$LCCCode == "21496-121340 // 21497-129401" |
              mor$LCCCode == "21446 // 21450-121340 / 21454" |
              mor$LCCCode == "21450" |
              mor$LCCCode == "0003 / 0004" |
              mor$LCCCode == "0004 // 0003" |
              mor$LCCCode == "21518",]

# Change the CRS
submor <- spTransform(submor,crs(rast)) # rast is EuropeForestAreas.tif with CRS ETRS89-LAEA Europe

# Change the CRS of the buffer areas of 20 km radius to the CRS ETRS89-LAEA Europe
buffer <- spTransform(buff20,crs(rast))

# intersection with the 20-km buffer
submor <- gIntersection(submor,buffer, byid=TRUE)
plot(out, col="green", bg="white")
plot(xyprov,add=T)

                    # Create the Moroccan mask at 1000m resolution
mor.mask <- crop(prc01,extent(submor)) 
mor.mask <- mask(mor.mask,submor,updatevalue=NA)

plot(mor.mask)
plot(xyprov,add=T)

4.2.3 Merge masks of Europe and Morocco

mask <- merge(eu.mask,mor.mask)
writeRaster(mask,"data/ForestedAreas/ForestedAreasMasks/ForestedAreasMask.tif",overwrite=T)

4.3 Extracting env. variables in the 20-km radius

mask <- raster("data/ForestedAreas/ForestedAreasMasks/ForestedAreasMask.tif") # the mask based on prec01

# We create a dataframe with the non-NAs values of the mask
val <- getValues(mask) 
n.val <- length(val)
tab <- tibble(cell.num=1:n.val,mask.values=val) %>% drop_na(mask.values) %>% dplyr::select(-mask.values)

Annual climatic indices:

\(EMT\): Extreme minimum temperature over the period considered (1901-1950) (°C)
\(SHM\): Summer heat moisture index (°C/mm)
\(PRCsum\): Mean summer precipitation (mm) (\(SP\) in the manuscript and in ClimateDT)
\(SPR\): Mean spring precipitation (mm)
\(MWMT\): Mean Warmest Month Temperature (°C)
\(MCMT\): Mean Coldest Month Temperature (°C)
\(TD\): Temperature difference, Celsius degrees (°C) \(MWMT-MCMT\)

vars <- c("EMT","MCMT","MWMT","PRCsum","SHM","SPR","TD") # annual indices to extract

for(i in vars){
rast.var <- raster(paste0("data/Climate/Raw/FromMaurizio/Annualndices/",i,".tif"))
#rast.var <- crop(rast.var,extent(mask)) # no need, same extent between the two rasters
#rast.var <- raster::resample(rast.var,mask) # no need, same resolution
rast.var <- mask(rast.var,mask,updatevalue=NA) # apply the mask
rast.values <- getValues(rast.var) 
tab.vars <- tibble(cell.num=1:length(rast.values),values=rast.values) %>% drop_na(values)
colnames(tab.vars)[2] <- i
tab <- full_join(tab,tab.vars,by="cell.num")
}

sapply(tab, function(x) sum(is.na(x))) # no NAs

Monthly indices (12 variables):

Monthly minimum temperature (°C): \(tmn01\) to \(tmn12\)
Monthly maximum temperature (°C): \(tmx01\) to \(tmx12\)
Monthly total precipitation (mm): \(prc01\) to \(prc12\)
Hargreaves climatic moisture deficit (mm): \(CMD_01\) to \(CMD_12\)

vars <- list.files(path="data/Climate/Raw/FromMaurizio/MonthlyIndices",pattern=".tif")
vars <- vars[-grep("PET",vars)] # remove monthly PET
vars <- str_sub(vars,1,-5) # extract variable names

for(i in vars){
rast.var <- raster(paste0("data/Climate/Raw/FromMaurizio/MonthlyIndices/",i,".tif"))
#rast.var <- crop(rast.var,extent(mask)) # no need, same extent between the two rasters
#rast.var <- raster::resample(rast.var,mask) # no need, same resolution
rast.var <- mask(rast.var,mask,updatevalue=NA) # apply the mask

rast.values <- getValues(rast.var) 
tab.vars <- tibble(cell.num=1:length(rast.values),values=rast.values) %>% drop_na(values)
colnames(tab.vars)[2] <- i
tab <- full_join(tab,tab.vars,by="cell.num")
}

sapply(tab, function(x) sum(is.na(x))) # no NAs

Soil variables:

\(DepthRoots\): Depth available to roots (cm)
\(Clay\): Clay content in the topsoil (0-30 cm) (%)
\(Sand\): Sand content in the topsoil (0-30 cm) (%)
\(Silt\): Silt content in the topsoil (0-30 cm) (%)

soil.rast <- list(raster("../../Pinpin_Clonapin/data/raw-data/soil/DERIVED_LAYERS/STU_EU_DEPTH_ROOTS_WGS84.tif"),
                  raster("../../Pinpin_Clonapin/data/raw-data/soil/DERIVED_LAYERS/STU_EU_T_CLAY_WGS84.tif"),
                  raster("../../Pinpin_Clonapin/data/raw-data/soil/DERIVED_LAYERS/STU_EU_T_SAND_WGS84.tif"),
                  raster("../../Pinpin_Clonapin/data/raw-data/soil/DERIVED_LAYERS/STU_EU_T_SILT_WGS84.tif"))
names(soil.rast) <- c("DepthRoots","Clay","Sand","Silt")

for(i in names(soil.rast)){
rast.var <- projectRaster(soil.rast[[i]],mask)
rast.var <- crop(rast.var,extent(mask)) # same extent between the two rasters
#rast.var <- raster::resample(rast.var,mask) # no need, same resolution
rast.var <- mask(rast.var,mask,updatevalue=NA) # apply the mask

rast.values <- getValues(rast.var) 
tab.vars <- tibble(cell.num=1:length(rast.values),values=rast.values) %>% drop_na(values)
colnames(tab.vars)[2] <- i
tab <- full_join(tab,tab.vars,by="cell.num")
}

sapply(tab, function(x) sum(is.na(x)))

Topographic ruggedness index (TRI)

rast.var <- raster("../../GenomicOffset/GenomicOffsetPinPin/data/Topography/TRI/TifsLAEA/TRI_LAEA.tif") # file reprojected from UTM31N to ETRS89 - LAEA in SAGA
rast.var <- crop(rast.var,extent(mask)) # same extent between the two rasters
rast.var <- raster::resample(rast.var,mask) # different resolution because TRI has a 90-m resolution
rast.var <- mask(rast.var,mask,updatevalue=NA) # apply the mask
rast.values <- getValues(rast.var) 
tab.vars <- tibble(cell.num=1:length(rast.values),values=rast.values) %>% drop_na(values)
colnames(tab.vars)[2] <- "TRI"
tab <- full_join(tab,tab.vars,by="cell.num")

saveRDS(tab,file="data/DF_EnvHetero.rds")

4.4 Run the PCA

We perform a PCA on the selected cell values = cell in forested areas and in 20-km radius around a population location.

tab <- readRDS(file="data/DF_EnvHetero.rds")

# We rename PRCsum to SP (to have the same variable names in the paper)
tab <- tab %>% dplyr::rename(SP=PRCsum, SpringP=SPR)

pca <- prcomp(tab[,-1], center = TRUE, scale. = TRUE) # centered and scaled PCA

p <- ggbiplot(pca,
              obs.scale = 1, 
              var.scale = 1, 
              varname.size =5,
              alpha = 0.1,#labels.size = 10,
              varname.adjust = 1.5) + 
  theme_bw() +
  theme(plot.title = element_text(size=18),
        axis.title = element_text(size=18),
        axis.text = element_text(size=12))
p

ggsave(p,file="figs/CalculatingDrivers/PCA_EnvHetero.png",width=15,height=12)


# For PhD Defense:
# pdefense <- ggbiplot(pca,obs.scale = 1, var.scale = 1,
#          varname.size =0,alpha = 0.1,
#          varname.adjust = 1.5) +
#   theme_bw() +
#   theme(plot.title = element_text(size=18),
#         axis.title = element_text(size=45),
#         axis.text = element_blank())
# ggsave(pdefense,file="/home/juliette/Documents/Thesis/DEFENSE/figures/PCA.pdf",device="pdf")

We assign the PC1 and PC2 scores to the cells on the map.

# extract the PC1 and PC2 coordinates of the points (location within 20-km radius around population locations)
tab[,"PC1"] <- pca$x[,"PC1"] 
tab[,"PC2"] <- pca$x[,"PC2"]
range(tab$PC1)

## [1] -13.26144  16.43171

range(tab$PC2)

## [1] -11.84963  11.14772

# Assign PC1 and PC2 values to the raster mask
rast.pc2 <- rast.pc1 <- mask
rast.pc1[tab$cell.num] <- pca$x[,"PC1"]
rast.pc2[tab$cell.num] <- pca$x[,"PC2"]

4.5 Calculate SH and SHW

# xyprov is a spatial object of the provenances coordinates in CRS WGS84
xyprov <- spTransform(xyprov,crs(rast.pc1)) # reproject in ETRS89 - LAEA 
df <- df %>% mutate(long.laea = xyprov@coords[,1],lat.laea = xyprov@coords[,2])

We first calculate \(SH\) for each population as the variance of cell values in the 20-km and 1.6-km radius, such as:

\[ SH = \frac{\sum_{ij}(x_{ij} - \bar{x})^2}{n-1}\]

where \(i\) and \(j\) are the geographical coordinates of the cells being considered, \(x_{ij}\) is the value of the PC1 (or PC2) score at cell \([i,j]\), \(\bar{x}\) is the weighted mean for the 20-km (or 1.6-km) radius region.

4.5.1 20-km radius

# We are going to extract the cell values of the cells in the 20-km radius around the population location

# For PC1
pop.val1 <- extract(rast.pc1, xyprov, buffer=20000,cellnumbers=TRUE) # we obtain a list with 34 elements

# For PC2
pop.val2 <- extract(rast.pc2, xyprov, buffer=20000,cellnumbers=TRUE) # we obtain a list with 34 elements


# We extract for each population:

# Information related to the number of cells with NAs
df[,"Nb.NAs.20km"] <- unlist(lapply(pop.val1, function(x) sum(is.na(x[,"value"])))) # number of NAs
df[,"Nb.val.20km"] <- unlist(lapply(pop.val1, function(x) sum(!is.na(x[,"value"])))) # number of cells with non-NAs values
df[,"Nb.cell.20km"] <- df[,"Nb.NAs.20km"] + df[,"Nb.val.20km"]  # total number of cells

# Environmental heterogeneity index (variance of the values of the cells within the 20-km radius)
df[,"SH.20km.PC1"] <- unlist(lapply(pop.val1, function(x) var(x[,"value"],na.rm=T))) # PC1 score   
df[,"SH.20km.PC2"] <- unlist(lapply(pop.val2, function(x) var(x[,"value"],na.rm=T))) # PC2 score

Comment: Here, we extracted the cells within a 20 km radius around the population location with the extract function, i.e. thus selecting the cells that have their center within the 20 km radius around the population location. In a different way, when the mask was constructed, the cells were selected in a buffer zone around the population location with the buffer function, so that cells were included if part of the cell was included in the 20 km radius, which results in a larger number of selected cells.

We calculate \(SHW\) for each population as the weighted variance of cell values in the 20-km radius region, such as:

\[ SHW = \frac{\sum_{ij}(x_{ij}-\bar{x})^{2}m_{ij}}{\sum_{ij}m_{ij}} \]

where \(m_{ij}\) is the weighting at cell \([i,j]\).

# parameters from De Lucas et al., 2008
c <- 0.2229 # shape parameter
alpha <- 0.0028 # scale parameter

# Visualization of the dispersal kernel with these parameters:
dist = 1:2000 
weight=(c/(2*pi*alpha^2*gamma(2/c)))*exp(-((dist)/alpha)^c)
plot(weight,
     xlab="Distance in meters",
     ylab="Weight",
     pch=16)

# Adding a threshold of 250 meters representing the radius of the samping area
(c/(2*pi*alpha^2*gamma(2/c)))*exp(-((250)/alpha)^c)

## [1] 3.659826e-07

weight <- 1:2000
weight[weight<250]= 3.659826e-07
weight[weight>=250]=(c/(2*pi*alpha^2*gamma(2/c)))*exp(-((weight[weight])/alpha)^c)
plot(weight,
     xlab="Distance in meters",
     ylab="Weight",
     pch=16)

The weights are almost zero at a distance of more than 100 m.

calcWeights <- function(x,pop.nb,rast){
  coords <-  xyFromCell(rast,x[,1])
  coord.x = coords[,1]
  coord.y = coords[,2]
  dist=pointDistance(xyprov@coords[pop.nb,],coords,lonlat = F)
  weight <- dist
  weight[weight<250] <- 3.659826e-07
  for(i in which(weight>=250)) weight[i] <- (c/(2*pi*alpha^2*gamma(2/c)))*exp(-((weight[i])/alpha)^c)
  x <- cbind(x,coord.x,coord.y,dist,weight)
  x <- as.data.frame(x)
  x <- x[!is.na(x$value),]
  return(x)
}

# plot(weight~dist,
#      xlab="Distance in meters",
#      ylab="Weight",
#      pch=16)

# 
# calcWeights <- function(x,pop.nb,rast){
#   coords <-  xyFromCell(rast,x[,1])
#   coord.x = coords[,1]
#   coord.y = coords[,2]
#   dist=pointDistance(xyprov@coords[pop.nb,],coords,lonlat = F)
#   weight=(c/(2*pi*alpha^2*gamma(2/c)))*exp(-((dist+250)/alpha)^c) # if we take 200m, PIA is weird for isntance
#   
#   x <- cbind(x,coord.x,coord.y,dist,weight)
#   x <- as.data.frame(x)
#   x <- x[!is.na(x$value),]
#   return(x)
# }

pop.val1 <-mapply(calcWeights, pop.val1,pop.nb=1:33,MoreArgs=list(rast=rast.pc1), SIMPLIFY=F)
pop.val2 <-mapply(calcWeights, pop.val2,pop.nb=1:33,MoreArgs=list(rast=rast.pc2), SIMPLIFY=F)
# For SH (already done before): 
#df[,"SH"] <- unlist(lapply(pop.val, function(x) var(x$value)))
# df[,"SH"] <- unlist(lapply(pop.val, function(x) sum((x$value-mean(x$value))^2)/(length(x$cell)-1)))

# Calculation SHW
df[,"SHW.20km.PC1"] <- unlist(lapply(pop.val1, function(x) sum(((x$value-mean(x$value))^2)*x$weight)/sum(x$weight)))
df[,"SHW.20km.PC2"] <- unlist(lapply(pop.val2, function(x) sum(((x$value-mean(x$value))^2)*x$weight)/sum(x$weight)))

df[,"minDist"] <- unlist(lapply(pop.val1, function(x) min(x$dist)))
df[,"meanDist"] <- unlist(lapply(pop.val1, function(x) mean(x$dist,na.rm=T)))
df[,"Dist1000"] <- unlist(lapply(pop.val1, function(x) length(x$cell[x$dist<1000])))

4.6 Visualization

4.6.1 PC1

4.6.2 Figures for SI

We generate the figures used in the Supplementary Information to show how PC1 and PC2 scores were attributed to raster cell values within a 20km radius around each population location.

PC1

We sort SH.20km.PC1 in ascending order:

df %>% dplyr::select(prov,SH.20km.PC1) %>% arrange(SH.20km.PC1) %>%  
  kable() %>%
  kable_styling(font_size=11,bootstrap_options = c("striped","hover", "condensed"), full_width = F) %>%
  column_spec(1, bold = T)

prov	SH.20km.PC1
OLO	0.0164592
HOU	0.0199138
VER	0.0561962
STJ	0.0724334
COC	0.1023016
PET	0.1132217
MIM	0.1250154
CAR	0.2887455
PLE	0.3107704
CUE	0.4152351
LEI	0.4538560
BAY	0.6007530
SEG	0.9034542
LAM	1.3488393
SAL	1.8213656
QUA	1.8484792
PUE	1.9342907
CAD	1.9428268
SIE	2.2115856
BON	2.2808226
ARM	2.8348172
ALT	3.1157857
ORI	3.3994551
CAS	3.5633689
TAM	3.7733529
COM	6.7168541
OLB	8.2221548
SAC	8.5797279
CEN	8.9757398
VAL	10.7818411
PIA	11.0349900
ARN	21.9499021
PIE	23.6211453

The four populations with the lowest values are OLO, HOU, VER and STJ (we choose these four for visualization because OLO and STJ are located next to each other and so are VER and HOU). The three populations with the highest values are PIA, PIE and ARN (PIE and PIA, the two Corsican populations, are located next to each other).

PC2

We sort SH.20km.PC2 in ascending order:

df %>% dplyr::select(prov,SH.20km.PC2) %>% arrange(SH.20km.PC2) %>%  
  kable() %>%
  kable_styling(font_size=11,bootstrap_options = c("striped","hover", "condensed"), full_width = F) %>%
  column_spec(1, bold = T)

prov	SH.20km.PC2
HOU	0.0460692
CAR	0.0465328
OLO	0.0777221
COC	0.0836111
STJ	0.0850043
VER	0.0977318
PLE	0.1106614
MIM	0.1430788
PET	0.1942967
LEI	0.2561911
CUE	0.2634282
BAY	0.3459568
SAL	0.8226303
BON	1.2513821
SEG	1.3105301
QUA	2.3982469
LAM	2.8529413
TAM	2.9544629
PUE	3.3209288
CAD	3.9931115
ALT	4.1075309
ORI	4.1103031
ARM	4.2521537
SIE	4.3160372
CAS	4.8389016
CEN	5.2851221
PIA	5.8603367
VAL	6.1634688
OLB	6.4686497
SAC	7.2216946
PIE	11.5244241
ARN	12.9951842
COM	18.0852393

The two populations with the lowest values are HOU and CAR. The two populations with the highest values are ARN and COM.

xy.selected.prov <- df %>% 
  dplyr::filter(prov %in% c("OLO","HOU","STJ","VER","PIE","PIA","ARN","COM","CAR","CUE","COC")) %>% 
  arrange(prov) %>% # ordered by alphabetical order  
  dplyr::select(longitude,latitude) %>% 
  SpatialPoints(proj4string=CRS("+proj=longlat +datum=WGS84"))

buff16 <- buffer(xy.selected.prov,width=1600,dissolve=F) # 1km radius around each population location
buff20 <- buffer(xy.selected.prov,width=20000,dissolve=F) # 20km radius around each population location

# Reprojecting in ETRS89 - LAEA 
xy.selected.prov <- spTransform(xy.selected.prov,crs(rast.pc1))
buff20 <- spTransform(buff20,crs(rast.pc1))
buff16 <- spTransform(buff16,crs(rast.pc1))


# Parameters/option for the different populations
info.pops <- list(ARN=list(code="ARN",
                           extent=c(3016892,2007109,3056892,2050109),
                           title="ARN",
                           legend="FALSE"),
                  PIAandPIE=list(code="PIAandPIE",
                               extent=c(4220000,2070000,4300000,2130000),
                               title="PIA and PIE",
                               legend="FALSE"),
                  OLOandSTJ=list(code="OLOandSTJ",
                           extent=c(3380000,2658000,3437892,2723000),
                           title="OLO and STJ",
                           legend="FALSE"),
                  HOUandVER=list(code="HOUandVER",
                           extent=c(3439892,2495000,3466892,2580000),
                           title="HOU and VER",
                           legend="TRUE"),
                  CAR=list(code="CARandCUEandCOC",
                           extent=c(3083392,2095109,3148392,2170109),
                           title="CAR, CUE and COC",
                           legend="FALSE"),
                  COM=list(code="COM",
                           extent=c(3050392,1617109,3095392,1666109),
                           title="COM",
                           legend="FALSE"))

# Generating figures for PC1 and PC2
list.PCs <- list(PC1=list(code="PC1",
                          rast=rast.pc1,
                          pops=info.pops[c("ARN","PIAandPIE","OLOandSTJ","HOUandVER")],
                          zlim=c(range(tab$PC1)[[1]],range(tab$PC1)[[2]])),
                 PC2=list(code="PC2",
                          rast=rast.pc2,
                          pops=info.pops[c("COM","ARN","HOUandVER","CAR")],
                          zlim=c(range(tab$PC2)[[1]],range(tab$PC2)[[2]])))


mclapply(list.PCs,function(pc){

rastpop <- pc$rast

mclapply(pc$pops,function(x){
  
pdf(paste0("figs/CalculatingDrivers/",pc$code,"maps_",x$code,".pdf"),height=6,width=6)
raster::plot(rastpop,
     ext=extent(matrix(x$extent, nrow=2)),
     legend=x$legend,
     zlim=pc$zlim,
     legend.width=1, legend.shrink=0.8,
     axis.args=list(cex.axis=2),
     legend.args = list(text = code.pc,font = 2, line = 0.2, cex = 2.5),
     axes=F, box=F,
     main=x$title,cex.main=2.5,
     col=colorRampPalette(c("green","blue","yellow","red"))(255),alpha=0.7)
raster::plot(buff20,add=T,lwd=2)
plot(xy.selected.prov,add=T,cex=1)#,pch=18,
plot(buff16,add=T,lwd=2)
dev.off()
  
})

})

4.6.3 1.6-km radius

# We are going to extract the cell values of the cells in the 1.6-km radius around the population location

# For PC1
pop.val1 <- extract(rast.pc1, xyprov, buffer=1600,cellnumbers=TRUE) # we obtain a list with 34 elements

# For PC2
pop.val2 <- extract(rast.pc2, xyprov, buffer=1600,cellnumbers=TRUE) # we obtain a list with 34 elements


# We extract for each population:

# Information related to the number of cells with NAs
df[,"Nb.NAs.1km"] <- unlist(lapply(pop.val1, function(x) sum(is.na(x[,"value"])))) # number of NAs
df[,"Nb.val.1km"] <- unlist(lapply(pop.val1, function(x) sum(!is.na(x[,"value"])))) # number of cells with non-NAs values
df[,"Nb.cell.1km"] <- df[,"Nb.NAs.1km"] + df[,"Nb.val.1km"]  # total number of cells

# Environmental heterogeneity index (variance of the values of the cells within the 1.6-km radius)
df[,"SH.1km.PC1"] <- unlist(lapply(pop.val1, function(x) var(x[,"value"],na.rm=T))) # PC1 score   
df[,"SH.1km.PC2"] <- unlist(lapply(pop.val2, function(x) var(x[,"value"],na.rm=T))) # PC2 score

df %>% 
  dplyr::select(contains("1km")) %>%
  kable(digits=3) %>%
  kable_styling(font_size=11,
                bootstrap_options = c("striped","hover", "condensed"), full_width = F) %>%
  column_spec(1, bold = T)

Nb.NAs.1km	Nb.val.1km	Nb.cell.1km	SH.1km.PC1	SH.1km.PC2
0	8	8	0.374	0.241
0	8	8	0.542	0.387
0	8	8	0.120	0.200
0	8	8	0.066	0.042
0	9	9	0.859	1.763
0	8	8	0.213	0.381
0	9	9	0.835	0.597
0	9	9	0.005	0.011
0	8	8	0.016	0.007
0	9	9	0.064	0.033
0	8	8	0.464	0.774
0	8	8	0.034	0.073
0	8	8	0.078	0.058
0	8	8	0.001	0.000
2	7	9	0.009	0.005
1	7	8	0.006	0.012
0	8	8	0.001	0.002
0	8	8	0.002	0.012
0	8	8	0.300	0.272
0	8	8	0.104	0.092
0	8	8	0.583	0.408
0	8	8	0.311	0.358
0	9	9	0.241	0.287
0	9	9	0.063	0.031
0	8	8	0.447	0.260
0	8	8	0.006	0.012
2	5	7	0.002	0.000
0	8	8	0.147	0.151
0	8	8	0.002	0.012
0	7	7	0.535	0.312
0	8	8	0.080	0.141
3	6	9	0.009	0.011
0	9	9	1.435	0.404

# # Figure for the PhD Defense:
# ext.laea <- extent(matrix(c( 4220000  ,  2070000, 4300000  , 2130000 ), nrow=2))
# buff16 <- buffer(xyprov,width=1600,dissolve=F) # 1km
# rastpop <- rast.pc1
# rastpop <- mask(rastpop,buff16,updatevalue=NA)
# pdf("/home/juliette/Documents/Thesis/DEFENSE/figures/PC1_1kmcells.pdf",height=7.85,width=10)
# plot(rastpop,ext=ext.laea,
#      legend.width=1, legend.shrink=0.8,
#      axis.args=list(cex.axis=2),
#      legend.args = list(text = 'PC1',font = 2, line = 0.2, cex = 2.5),
#      col=colorRampPalette(c("yellow","red"))(255),
#      axes=F)
# plot(xyprov,add=T,pch=18,cex=1)
# plot(buff16,add=T,lwd=1)
# dev.off()
# # 
# rastpop <- rast.pc2
# rastpop <- mask(rastpop,buff16,updatevalue=NA)
# pdf("/home/juliette/Documents/Thesis/DEFENSE/figures/PC2_1kmcells.pdf",height=7.85,width=10)
# plot(rastpop,ext=ext.laea,
#      legend.width=1, legend.shrink=0.8,
#      axis.args=list(cex.axis=2),
#      legend.args = list(text = 'PC2',font = 2, line = 0.2, cex = 2.5),
#      col=colorRampPalette(c("yellow","red"))(255),
#      axes=F)
# plot(xyprov,add=T,pch=18,cex=1)
# plot(buff16,add=T,lwd=1)
# dev.off()

# For PC1
pop.val1 <- extract(rast.pc1, xyprov, buffer=3000,cellnumbers=TRUE) # we obtain a list with 34 elements

# For PC2
pop.val2 <- extract(rast.pc2, xyprov, buffer=3000,cellnumbers=TRUE) # we obtain a list with 34 elements

Keep9cells <- function(x,pop.nb,rast){
  coords <-  xyFromCell(rast,x[,1])
  coord.x = coords[,1]
  coord.y = coords[,2]
  dist=pointDistance(xyprov@coords[pop.nb,],coords,lonlat = F)
  weight <- dist
  weight[weight<250] <- 3.659826e-07
  for(i in which(weight>=250)) weight[i] <- (c/(2*pi*alpha^2*gamma(2/c)))*exp(-((weight[i])/alpha)^c)
  x <- cbind(x,coord.x,coord.y,dist)
  x <- as.data.frame(x)
  x <- x[!is.na(x$value),]
  x <- arrange(x,by=dist) %>% slice(1:9)
  return(x)
}

pop.val1 <-mapply(Keep9cells, pop.val1,pop.nb=1:33,MoreArgs=list(rast=rast.pc1), SIMPLIFY=F)
pop.val2 <-mapply(Keep9cells, pop.val2,pop.nb=1:33,MoreArgs=list(rast=rast.pc2), SIMPLIFY=F)

# We extract for each population:

# Information related to the number of cells with NAs
df[,"Nb.NAs.9cells"] <- unlist(lapply(pop.val1, function(x) sum(is.na(x[,"value"])))) # number of NAs
df[,"Nb.val.9cells"] <- unlist(lapply(pop.val1, function(x) sum(!is.na(x[,"value"])))) # number of cells with non-NAs values
df[,"Nb.cell.9cells"] <- df[,"Nb.NAs.1km"] + df[,"Nb.val.1km"]  # total number of cells

# Environmental heterogeneity index (variance of the values of the cells within the 1.6-km radius)
df[,"SH.9cells.PC1"] <- unlist(lapply(pop.val1, function(x) var(x[,"value"],na.rm=T))) # PC1 score   
df[,"SH.9cells.PC2"] <- unlist(lapply(pop.val2, function(x) var(x[,"value"],na.rm=T))) # PC2 score

df %>% 
  dplyr::select(contains("9cells")) %>%
  kable(digits=3) %>%
  kable_styling(font_size=11,
                bootstrap_options = c("striped","hover", "condensed"), full_width = F) %>%
  column_spec(1, bold = T)

Nb.val.9cells	Nb.cell.9cells	SH.9cells.PC1	SH.9cells.PC2
9	8	0.343	0.224
9	8	0.539	0.373
9	8	0.105	0.175
9	8	0.058	0.037
9	9	0.859	1.763
9	8	0.197	0.345
9	9	0.835	0.597
9	9	0.005	0.011
9	8	0.014	0.007
9	9	0.064	0.033
9	8	0.436	0.719
9	8	0.043	0.086
9	8	0.069	0.054
9	8	0.001	0.000
9	9	0.012	0.005
9	8	0.006	0.012
9	8	0.001	0.011
9	8	0.002	0.013
9	8	0.263	0.239
9	8	0.091	0.088
9	8	0.549	0.379
9	8	0.294	0.329
9	9	0.241	0.287
9	9	0.063	0.031
9	8	0.392	0.228
9	8	0.006	0.011
9	7	0.003	0.002
9	8	0.139	0.138
9	8	0.001	0.011
9	7	0.675	0.382
9	8	0.077	0.133
9	9	0.006	0.009
9	9	1.435	0.404

Here is the correspondence between the variable names in this document and the variable names in the paper:

SH.20km.PC1 => SH1[20km]
SH.20km.PC2 => SH2[20km]
SH.1.6km.PC1 => SH1[1.6km]
SH.1km.PC2 => SH2[1.6km]

4.7 Tables (to check variable plausibility)

Correlations among the environmental heterogeneity indexes:

	SH.20km.PC1	SH.20km.PC2	SHW.20km.PC1	SHW.20km.PC2	SH.1km.PC1	SH.1km.PC2	SH.9cells.PC1	SH.9cells.PC2
SH.20km.PC1	1.000	0.778	0.279	0.137	0.561	0.333	0.600	0.356
SH.20km.PC2	0.778	1.000	0.178	0.216	0.698	0.799	0.717	0.814
SHW.20km.PC1	0.279	0.178	1.000	0.877	0.010	-0.036	-0.006	-0.046
SHW.20km.PC2	0.137	0.216	0.877	1.000	-0.004	0.113	-0.024	0.099
SH.1km.PC1	0.561	0.698	0.010	-0.004	1.000	0.686	0.996	0.689
SH.1km.PC2	0.333	0.799	-0.036	0.113	0.686	1.000	0.677	0.998
SH.9cells.PC1	0.600	0.717	-0.006	-0.024	0.996	0.677	1.000	0.685
SH.9cells.PC2	0.356	0.814	-0.046	0.099	0.689	0.998	0.685	1.000

At 20-km radius, weak correlation between \(SH\) and \(SHW\) and the number of no-NA and NA cells:

	SH.20km.PC1	SH.20km.PC2	SHW.20km.PC1	SHW.20km.PC2
Nb.NAs.20km	-0.329	-0.258	-0.067	0.073
Nb.val.20km	0.326	0.256	0.066	-0.073
Nb.cell.20km	-0.135	-0.044	-0.052	-0.009

At 1.6-km radius, very weak correlation between \(SH\) and \(SHW\) and the number of no-NA and NA cells:

	SH.1km.PC1	SH.1km.PC2
Nb.NAs.1km	-0.253	-0.225
Nb.val.1km	0.358	0.327
Nb.cell.1km	0.228	0.215

\(SH\) at 20-km radius has plausible outputs, while \(SHW\) has highly implausible values. See table below the provenance sorted by \(SH\) values:

prov	SH.20km.PC1	SH.20km.PC2	SHW.20km.PC1	SHW.20km.PC2	SH.1km.PC1	SH.1km.PC2	SH.9cells.PC1	SH.9cells.PC2
OLO	0.0164592	0.0777221	0.0317397	0.2733301	0.0094036	0.0051262	0.0124146	0.0054464
HOU	0.0199138	0.0460692	0.0575290	0.0296516	0.0056728	0.0120469	0.0055989	0.0112477
VER	0.0561962	0.0977318	0.0382893	0.0653611	0.0016021	0.0004732	0.0031340	0.0017005
STJ	0.0724334	0.0850043	0.1723654	0.2262236	0.0063682	0.0120204	0.0064358	0.0124672
COC	0.1023016	0.0836111	0.0403213	0.0192419	0.0012808	0.0004666	0.0011223	0.0004108
PET	0.1132217	0.1942967	0.0376352	0.2054232	0.0024650	0.0124776	0.0022581	0.0134366
MIM	0.1250154	0.1430788	0.0913257	0.2646539	0.0048957	0.0111053	0.0048957	0.0111053
CAR	0.2887455	0.0465328	0.0126181	0.0141989	0.0016033	0.0116461	0.0014398	0.0112454
PLE	0.3107704	0.1106614	0.0263038	0.0182243	0.0629047	0.0311361	0.0629047	0.0311361
CUE	0.4152351	0.2634282	0.4170237	0.0268819	0.0013332	0.0018250	0.0011801	0.0114863
LEI	0.4538560	0.2561911	0.3405906	0.1444059	0.0156472	0.0068048	0.0140602	0.0066829
BAY	0.6007530	0.3459568	0.5287138	0.3859016	0.0637982	0.0325186	0.0637982	0.0325186
SEG	0.9034542	1.3105301	0.0677415	0.0369804	0.1039489	0.0923702	0.0909560	0.0875323
LAM	1.3488393	2.8529413	1.0252085	3.5587077	0.0338634	0.0730391	0.0432421	0.0859804
SAL	1.8213656	0.8226303	0.3200903	0.1758596	0.0655126	0.0418700	0.0578386	0.0367636
QUA	1.8484792	2.3982469	3.3417721	3.8445054	0.3108199	0.3575822	0.2943501	0.3289829
PUE	1.9342907	3.3209288	1.8429107	4.1831515	0.0802020	0.1410220	0.0767704	0.1330363
CAD	1.9428268	3.9931115	1.3624726	3.3910649	0.2129489	0.3807609	0.1967605	0.3454852
SIE	2.2115856	4.3160372	0.2992390	1.2499280	0.4638710	0.7738594	0.4355487	0.7194657
BON	2.2808226	1.2513821	1.0268501	0.6279184	0.4466757	0.2597429	0.3916035	0.2277211
ARM	2.8348172	4.2521537	1.0388269	0.8987438	0.1467888	0.1510096	0.1388638	0.1382230
ALT	3.1157857	4.1075309	2.6243228	2.6275168	0.1202799	0.2000988	0.1052720	0.1754711
ORI	3.3994551	4.1103031	1.5029619	1.7714128	0.3001282	0.2722580	0.2631283	0.2388773
CAS	3.5633689	4.8389016	1.6972816	2.3318921	0.2409040	0.2871943	0.2409040	0.2871943
TAM	3.7733529	2.9544629	3.8312651	2.5875755	0.0779903	0.0581450	0.0693665	0.0537996
COM	6.7168541	18.0852393	0.3749921	1.8162808	0.8590390	1.7627341	0.8590390	1.7627341
OLB	8.2221548	6.4686497	0.5408123	0.3481675	0.5826322	0.4084071	0.5492035	0.3793049
SAC	8.5797279	7.2216946	1.0098261	0.3992617	1.4347528	0.4037317	1.4347528	0.4037317
CEN	8.9757398	5.2851221	11.2774070	7.8438688	0.3742971	0.2408937	0.3428790	0.2240336
VAL	10.7818411	6.1634688	0.4349211	0.3852892	0.8354679	0.5968357	0.8354679	0.5968357
PIA	11.0349900	5.8603367	9.4426607	5.6108793	0.0092793	0.0111288	0.0060671	0.0090379
ARN	21.9499021	12.9951842	1.2670169	0.4598626	0.5421156	0.3869346	0.5393576	0.3726968
PIE	23.6211453	11.5244241	0.4545641	0.1140265	0.5346164	0.3123310	0.6748777	0.3823646

Same table but sorted by \(SH\) at 1.6-km radius:

prov	SH.20km.PC1	SH.20km.PC2	SHW.20km.PC1	SHW.20km.PC2	SH.1km.PC1	SH.1km.PC2	SH.9cells.PC1	SH.9cells.PC2
COC	0.1023016	0.0836111	0.0403213	0.0192419	0.0012808	0.0004666	0.0011223	0.0004108
CUE	0.4152351	0.2634282	0.4170237	0.0268819	0.0013332	0.0018250	0.0011801	0.0114863
VER	0.0561962	0.0977318	0.0382893	0.0653611	0.0016021	0.0004732	0.0031340	0.0017005
CAR	0.2887455	0.0465328	0.0126181	0.0141989	0.0016033	0.0116461	0.0014398	0.0112454
PET	0.1132217	0.1942967	0.0376352	0.2054232	0.0024650	0.0124776	0.0022581	0.0134366
MIM	0.1250154	0.1430788	0.0913257	0.2646539	0.0048957	0.0111053	0.0048957	0.0111053
HOU	0.0199138	0.0460692	0.0575290	0.0296516	0.0056728	0.0120469	0.0055989	0.0112477
STJ	0.0724334	0.0850043	0.1723654	0.2262236	0.0063682	0.0120204	0.0064358	0.0124672
PIA	11.0349900	5.8603367	9.4426607	5.6108793	0.0092793	0.0111288	0.0060671	0.0090379
OLO	0.0164592	0.0777221	0.0317397	0.2733301	0.0094036	0.0051262	0.0124146	0.0054464
LEI	0.4538560	0.2561911	0.3405906	0.1444059	0.0156472	0.0068048	0.0140602	0.0066829
LAM	1.3488393	2.8529413	1.0252085	3.5587077	0.0338634	0.0730391	0.0432421	0.0859804
PLE	0.3107704	0.1106614	0.0263038	0.0182243	0.0629047	0.0311361	0.0629047	0.0311361
BAY	0.6007530	0.3459568	0.5287138	0.3859016	0.0637982	0.0325186	0.0637982	0.0325186
SAL	1.8213656	0.8226303	0.3200903	0.1758596	0.0655126	0.0418700	0.0578386	0.0367636
TAM	3.7733529	2.9544629	3.8312651	2.5875755	0.0779903	0.0581450	0.0693665	0.0537996
PUE	1.9342907	3.3209288	1.8429107	4.1831515	0.0802020	0.1410220	0.0767704	0.1330363
SEG	0.9034542	1.3105301	0.0677415	0.0369804	0.1039489	0.0923702	0.0909560	0.0875323
ALT	3.1157857	4.1075309	2.6243228	2.6275168	0.1202799	0.2000988	0.1052720	0.1754711
ARM	2.8348172	4.2521537	1.0388269	0.8987438	0.1467888	0.1510096	0.1388638	0.1382230
CAD	1.9428268	3.9931115	1.3624726	3.3910649	0.2129489	0.3807609	0.1967605	0.3454852
CAS	3.5633689	4.8389016	1.6972816	2.3318921	0.2409040	0.2871943	0.2409040	0.2871943
ORI	3.3994551	4.1103031	1.5029619	1.7714128	0.3001282	0.2722580	0.2631283	0.2388773
QUA	1.8484792	2.3982469	3.3417721	3.8445054	0.3108199	0.3575822	0.2943501	0.3289829
CEN	8.9757398	5.2851221	11.2774070	7.8438688	0.3742971	0.2408937	0.3428790	0.2240336
BON	2.2808226	1.2513821	1.0268501	0.6279184	0.4466757	0.2597429	0.3916035	0.2277211
SIE	2.2115856	4.3160372	0.2992390	1.2499280	0.4638710	0.7738594	0.4355487	0.7194657
PIE	23.6211453	11.5244241	0.4545641	0.1140265	0.5346164	0.3123310	0.6748777	0.3823646
ARN	21.9499021	12.9951842	1.2670169	0.4598626	0.5421156	0.3869346	0.5393576	0.3726968
OLB	8.2221548	6.4686497	0.5408123	0.3481675	0.5826322	0.4084071	0.5492035	0.3793049
VAL	10.7818411	6.1634688	0.4349211	0.3852892	0.8354679	0.5968357	0.8354679	0.5968357
COM	6.7168541	18.0852393	0.3749921	1.8162808	0.8590390	1.7627341	0.8590390	1.7627341
SAC	8.5797279	7.2216946	1.0098261	0.3992617	1.4347528	0.4037317	1.4347528	0.4037317

In the following analyses, we are going to use \(SH\) at 20-km radius for both PC1 and PC2 scores (SH1[20km] and SH2[20km]) and \(SH\) at 1.6-km radius for both PC1 and PC2 too (SH1[1.6km] and SH2[1.6km]).

4.8 SI Table

df %>% 
  arrange(prov) %>% 
  dplyr::select(prov,
                SH.20km.PC1,SH.20km.PC2, SH.1km.PC1,SH.1km.PC2,
                Nb.val.20km,Nb.val.1km) %>% 
  xtable(type = "latex",digits=2) %>% 
  print(file = "tables/Drivers/SHDriversValues.tex",
      include.rownames=FALSE)

5 Climate harshness and temporal heterogeneity

We initially extracted \(SHM\) and \(EMT\) values from the same raster files used to calculate the spatial environmnental heterogeneity. As the raster files have a 1kmx1km spatial resolution, the extraction from the raster files is less accurate than extracting point-estimate climatic data with ClimateDT (bilinear interpolation).

So, following the second review in Heredity, we used the extracted \(SHM\) and \(EMT\) values from ClimateDT rather than those extracted from the 1km*1km raster files.

# Extracting SHM
################
var <- "SHM"
rast.var <- raster(paste0("data/Climate/Raw/FromMaurizio/Annualndices/",var,".tif"))
df[,var] <- extract(rast.var, xyprov) #
df %>% 
  dplyr::select(prov,SHM) %>% 
  arrange(by=SHM) %>% 
  kable() %>%
  kable_styling(font_size=11,bootstrap_options = c("striped","hover", "condensed"), full_width = F) %>%
  column_spec(1, bold = T)

# Extracting EMT
################
var <- "EMT"
rast.var <- raster(paste0("data/Climate/Raw/FromMaurizio/Annualndices/",var,".tif"))
df[,var] <- extract(rast.var, xyprov) # EMT
df[,paste0("inv",var)] <- extract(rast.var, xyprov) * (-1) # change the sign of EMT to ease interpretation
df %>% 
  dplyr::select(prov,invEMT) %>% 
  arrange(by=invEMT) %>% 
  kable() %>%
  kable_styling(font_size=11,bootstrap_options = c("striped","hover", "condensed"), full_width = F) %>%
  column_spec(1, bold = T)

climDT <- read.csv("data/Climate/Raw/ClimateDTdownloads/ClimateDT_outputs.csv") %>% 
  group_by(ID) %>% 
  dplyr::summarise(mean_bio6=mean(bio6),   # minimum temperature of the coldest month
                   mean_MCMT=mean(MCMT),   # mean coldest month temperature 
                   mean_EMT=mean(EMT),     # extreme minimum temperature
                   mean_bio14=mean(bio14), # precipitation of the driest month
                   mean_SHM=mean(SHM),     # summer heat moisture index
                   mean_MSP=mean(MSP),     # summer precipitation (called mean summer prec in ClimateDT)
                   
                   var_bio6=var(bio6),
                   var_MCMT=var(MCMT),
                   var_EMT=var(EMT),
                   var_bio14=var(bio14),
                   var_SHM=var(SHM),
                   var_MSP=var(MSP)) %>% 
  dplyr::rename(prov=ID)

# Vizualisation of the correlation among variables
climDT %>% pairs.panels(scale=T,hist.col="palegreen1")

# add to df
df <- df %>%  merge(climDT,by="prov")

5.1 SI Table

df %>% 
  arrange(prov) %>% 
  dplyr::select(prov, 
                mean_MCMT,mean_MSP,
                var_MCMT,var_MSP) %>% 
  xtable(type = "latex",digits=2) %>% 
  print(file = "tables/Drivers/ClimateHarshnessTempHeteroDriversValues.tex",
      include.rownames=FALSE)

6 Climatic transfer distances

6.1 Site coordinates

# Script used before the second Heredity review,
# when EMT and SHM values were extracted from raster files

df.site <- unique(data[,c("site","longitude_site","latitude_site")])

# Create a spatial object
xysite <- SpatialPoints(df.site[,c("longitude_site","latitude_site")], 
                        proj4string=CRS("+proj=longlat +datum=WGS84"))

# change CRS
xysite <- spTransform(xysite,crs(mask)) # reproject in ETRS89 - LAEA 

for(x in c("EMT","SHM")){
  rast <- raster(paste0("data/Climate/Raw/FromMaurizio/Annualndices/",x,".tif"))
  df.site[,paste0(x,"_site")] <- raster::extract(rast,xysite)
}

df.site  %>% 
  kable(digits=3) %>%
  kable_styling(font_size=11,
                bootstrap_options = c("striped","hover", "condensed"), full_width = F) %>%
  column_spec(1, bold = T)

# Site coordinates for ClimateDT
df.site <- unique(data[,c("site","latitude_site","longitude_site")]) %>% # !!!! latitude and longitude were inverted!
  setNames(c("site","longitude","latitude"))

# We add the elevation data from the DEEM of ClimateDT
df.site$elevation[df.site$site=="asturias"] <- 449

## Warning: Unknown or uninitialised column: `elevation`.

df.site$elevation[df.site$site=="bordeaux"] <- 58
df.site$elevation[df.site$site=="portugal"] <- 865

write_csv(df.site,"data/Climate/Raw/ClimateDTdownloads/site_coordinates_readyforClimateDT.csv")

Trees were planted in 2011: 02/2011 for Asturias and Fundão and 10/2011 for Bordeaux.

For height:

trees were measured in October 2012 in Fundão (when trees were 20 months old).
trees were measured in November 2012 in Asturias (when the trees were 21 months old).
trees were measured in November 2013 in Bordeaux (when the trees were 25 months old) and in November 2018 (when the trees were 85 months old).

Phenology traits were measured in 2013, 2014, 2015 and 2017 for the mean bud burst date and 2014, 2015 and 2017 for the duration of bud burst.

We do not know when the functional traits were measured in Portugal (the most likely being that they were measured at the same time as height), so we are going to take the years 2011 and 2012 for the calculation of the climate in the common garden.

We now add in df some columns for the climatic transfer distances.

df.site <- read.csv("data/Climate/Raw/ClimateDTdownloads/ClimateDT_outputs_sites.csv")

# Height (Asturias, 21 months)
bio6_height_asturias <- mean(df.site[df.site$Year %in% 2011:2012 & df.site$ID=="asturias","bio6"]) # mean across 2011 and 2012 
bio14_height_asturias <- mean(df.site[df.site$Year %in% 2011:2012 & df.site$ID=="asturias","bio14"]) # mean across 2011 and 2012 

# Height (Fundão, 20 months) + SLA + d13C
bio6_height_SLA_d13C_portugal <- mean(df.site[df.site$Year %in% 2011:2012 & df.site$ID=="portugal","bio6"]) # mean across 2011 and 2012 
bio14_height_SLA_d13C_portugal <- mean(df.site[df.site$Year %in% 2011:2012 & df.site$ID=="portugal","bio14"]) # mean across 2011 and 2012                         

# Height (Bordeaux, 25 months)
bio6_height_bordeaux25 <- mean(df.site[df.site$Year %in% 2011:2013 & df.site$ID=="bordeaux","bio6"]) # mean across 2011, 2012 and 2013
bio14_height_bordeaux25 <- mean(df.site[df.site$Year %in% 2012:2013 & df.site$ID=="bordeaux","bio14"]) # mean across 2012 and 2013 (were not planted during the 2011 summer)                       
# Height (Bordeaux, 85 months)
bio6_height_bordeaux85 <- mean(df.site[df.site$Year %in% 2011:2018 & df.site$ID=="bordeaux","bio6"]) # mean from 2011 to 2018
bio14_height_bordeaux85 <- mean(df.site[df.site$Year %in% 2012:2018 & df.site$ID=="bordeaux","bio14"]) # mean from 2011 to 2018

# Mean bud burst date (Bordeaux)
bio6_meanBB_bordeaux <- mean(df.site[df.site$Year %in% c(2013:2015,2017) & df.site$ID=="bordeaux","bio6"]) # mean across years 2013, 2014, 2015 and 2017
bio14_meanBB_bordeaux <- mean(df.site[df.site$Year %in% c(2013:2015,2017) & df.site$ID=="bordeaux","bio14"]) # same

# Mean duration of bud burst (Bordeaux)
bio6_meanDBB_bordeaux <- mean(df.site[df.site$Year %in% c(2014:2015,2017) & df.site$ID=="bordeaux","bio6"]) #  mean across years 2014, 2015 and 2017
bio14_meanDBB_bordeaux <- mean(df.site[df.site$Year %in% c(2014:2015,2017) & df.site$ID=="bordeaux","bio14"]) # same


df <- df %>% 
  dplyr::mutate(CTD_bio6_height_asturias=(mean_bio6-bio6_height_asturias) %>% abs(),
                CTD_bio14_height_asturias=(mean_bio14-bio14_height_asturias) %>% abs(),
                CTD_bio6_height_SLA_d13C_portugal=(mean_bio6-bio6_height_SLA_d13C_portugal) %>% abs(),
                CTD_bio14_height_SLA_d13C_portugal=(mean_bio14-bio14_height_SLA_d13C_portugal) %>% abs(),
                CTD_bio6_height_bordeaux25=(mean_bio6-bio6_height_bordeaux25) %>% abs(),
                CTD_bio14_height_bordeaux25=(mean_bio14-bio14_height_bordeaux25) %>% abs(),
                CTD_bio6_height_bordeaux85=(mean_bio6-bio6_height_bordeaux85) %>% abs(),
                CTD_bio14_height_bordeaux85=(mean_bio14-bio14_height_bordeaux85) %>% abs(),
                CTD_bio6_meanBB_bordeaux=(mean_bio6-bio6_meanBB_bordeaux) %>% abs(),
                CTD_bio14_meanBB_bordeaux=(mean_bio14-bio14_meanBB_bordeaux) %>% abs(),
                CTD_bio6_meanDBB_bordeaux=(mean_bio6-bio6_meanDBB_bordeaux) %>% abs(),
                CTD_bio14_meanDBB_bordeaux=(mean_bio14-bio14_meanDBB_bordeaux) %>% abs())
                

df %>% 
  dplyr::select(prov,contains("CTD")) %>% 
  kable(digits=2) %>%
  kable_styling(font_size=11,bootstrap_options = c("striped","hover", "condensed"), full_width = F) %>%
  column_spec(1, bold = T)

prov	CTD_bio6_height_asturias	CTD_bio14_height_asturias	CTD_bio6_height_SLA_d13C_portugal	CTD_bio14_height_SLA_d13C_portugal	CTD_bio6_height_bordeaux25	CTD_bio14_height_bordeaux25	CTD_bio6_height_bordeaux85	CTD_bio14_height_bordeaux85	CTD_bio6_meanBB_bordeaux	CTD_bio14_meanBB_bordeaux	CTD_bio6_meanDBB_bordeaux	CTD_bio14_meanDBB_bordeaux
ALT	1.85	0.90	0.10	17.05	0.32	9.05	1.18	4.64	1.55	8.55	1.79	6.09
ARM	1.75	0.04	0.00	17.91	0.22	8.19	1.07	3.77	1.45	7.69	1.68	5.22
ARN	2.97	18.36	1.22	0.41	1.44	26.51	2.30	22.09	2.67	26.01	2.91	23.54
BAY	4.47	13.89	2.72	4.06	2.94	22.04	3.80	17.63	4.17	21.54	4.41	19.08
BON	4.87	16.04	3.12	1.91	3.34	24.19	4.20	19.78	4.57	23.69	4.81	21.23
CAD	1.63	1.09	3.38	16.86	3.17	9.24	2.31	4.82	1.93	8.74	1.70	6.27
CAR	3.97	17.65	2.22	0.30	2.43	25.80	3.29	21.38	3.67	25.30	3.90	22.83
CAS	1.80	1.48	3.55	16.47	3.33	9.63	2.48	5.22	2.10	9.13	1.87	6.67
CEN	5.28	18.40	3.53	0.45	3.75	26.55	4.61	22.13	4.98	26.05	5.22	23.58
COC	3.75	17.96	2.00	0.01	2.22	26.11	3.08	21.70	3.45	25.61	3.68	23.15
COM	0.85	22.31	2.60	4.36	2.38	30.46	1.52	26.05	1.15	29.96	0.92	27.49
CUE	3.76	17.61	2.01	0.34	2.22	25.76	3.08	21.35	3.46	25.26	3.69	22.79
HOU	0.44	2.57	2.19	20.52	1.97	5.58	1.12	1.17	0.74	5.08	0.51	2.61
LAM	2.15	0.65	3.90	18.60	3.69	7.50	2.83	3.08	2.45	7.00	2.22	4.53
LEI	4.57	18.46	6.32	0.51	6.10	26.61	5.24	22.20	4.87	26.11	4.64	23.64
MIM	1.48	9.43	3.23	27.38	3.01	1.28	2.15	5.69	1.78	1.78	1.55	4.24
OLB	3.78	16.06	2.03	1.89	2.24	24.21	3.10	19.79	3.48	23.71	3.71	21.24
OLO	1.16	3.62	2.91	14.33	2.69	11.77	1.83	7.36	1.46	11.27	1.22	8.81
ORI	4.59	21.16	2.84	3.21	3.06	29.31	3.92	24.90	4.29	28.81	4.53	26.35
PET	1.42	11.38	3.17	29.33	2.95	3.23	2.09	7.65	1.72	3.73	1.48	6.20
PIA	2.64	14.42	4.39	3.53	4.18	22.57	3.32	18.16	2.94	22.07	2.71	19.60
PIE	4.52	10.74	2.77	7.21	2.99	18.89	3.84	14.47	4.22	18.39	4.45	15.92
PLE	1.63	2.03	0.12	15.92	0.10	10.18	0.96	5.76	1.33	9.68	1.57	7.21
PUE	2.40	2.95	4.15	15.00	3.93	11.10	3.07	6.68	2.70	10.60	2.47	8.13
QUA	0.19	18.22	1.94	0.27	1.73	26.37	0.87	21.95	0.49	25.87	0.26	23.40
SAC	1.16	1.68	0.59	16.27	0.37	9.83	0.48	5.41	0.86	9.33	1.09	6.86
SAL	5.55	9.55	3.80	8.40	4.01	17.70	4.87	13.28	5.25	17.20	5.48	14.73
SEG	0.40	0.92	1.35	17.03	1.13	9.07	0.27	4.66	0.10	8.57	0.34	6.10
SIE	1.44	1.00	3.19	16.95	2.97	9.15	2.11	4.73	1.74	8.65	1.50	6.18
STJ	0.56	3.93	2.31	14.02	2.10	12.08	1.24	7.67	0.86	11.58	0.63	9.12
TAM	4.90	19.61	3.15	1.66	3.37	27.76	4.22	23.35	4.60	27.26	4.83	24.80
VAL	3.69	17.58	1.94	0.37	2.16	25.73	3.02	21.31	3.39	25.23	3.63	22.76
VER	0.44	0.34	1.31	17.61	1.09	8.49	0.23	4.08	0.14	7.99	0.37	5.53

df %>% 
  dplyr::select(contains("bio6"),contains("bio14")) %>% 
  cor() %>% 
  lower.triangle() %>%
  kable(digits=3) %>%
  kable_styling(font_size=11,
                bootstrap_options = c("striped","hover", "condensed"), full_width = F) %>%
  column_spec(1, bold = T)

	mean_bio6	var_bio6	CTD_bio6_height_asturias	CTD_bio6_height_SLA_d13C_portugal	CTD_bio6_height_bordeaux25	CTD_bio6_height_bordeaux85	CTD_bio6_meanBB_bordeaux	CTD_bio6_meanDBB_bordeaux	mean_bio14	var_bio14	CTD_bio14_height_asturias	CTD_bio14_height_SLA_d13C_portugal	CTD_bio14_height_bordeaux25	CTD_bio14_height_bordeaux85	CTD_bio14_meanBB_bordeaux	CTD_bio14_meanDBB_bordeaux
mean_bio6	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
var_bio6	0.356	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
CTD_bio6_height_asturias	-0.661	-0.467	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
CTD_bio6_height_SLA_d13C_portugal	0.355	-0.177	0.369	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
CTD_bio6_height_bordeaux25	0.216	-0.250	0.508	0.987	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
CTD_bio6_height_bordeaux85	-0.363	-0.434	0.922	0.690	0.795	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
CTD_bio6_meanBB_bordeaux	-0.548	-0.461	0.986	0.508	0.635	0.973	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
CTD_bio6_meanDBB_bordeaux	-0.639	-0.467	0.999	0.399	0.536	0.935	0.991	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
mean_bio14	0.508	0.490	-0.624	-0.155	-0.251	-0.550	-0.605	-0.622	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0
var_bio14	0.432	0.414	-0.584	-0.226	-0.319	-0.562	-0.588	-0.587	0.937	1.000	0.000	0.000	0.000	0.000	0.000	0
CTD_bio14_height_asturias	-0.440	-0.575	0.608	0.266	0.355	0.592	0.616	0.611	-0.844	-0.813	1.000	0.000	0.000	0.000	0.000	0
CTD_bio14_height_SLA_d13C_portugal	0.532	0.450	-0.657	-0.147	-0.249	-0.571	-0.633	-0.654	0.981	0.921	-0.806	1.000	0.000	0.000	0.000	0
CTD_bio14_height_bordeaux25	-0.508	-0.519	0.638	0.178	0.276	0.571	0.623	0.636	-0.993	-0.932	0.895	-0.970	1.000	0.000	0.000	0
CTD_bio14_height_bordeaux85	-0.487	-0.562	0.643	0.225	0.322	0.598	0.638	0.644	-0.941	-0.890	0.971	-0.911	0.972	1.000	0.000	0
CTD_bio14_meanBB_bordeaux	-0.507	-0.525	0.640	0.184	0.282	0.576	0.626	0.638	-0.990	-0.930	0.906	-0.967	1.000	0.978	1.000	0
CTD_bio14_meanDBB_bordeaux	-0.498	-0.551	0.645	0.210	0.308	0.593	0.637	0.645	-0.965	-0.910	0.951	-0.937	0.988	0.997	0.992	1

df.site <- read.csv("data/Climate/Raw/ClimateDTdownloads/ClimateDT_outputs_sites.csv")

# Height (Asturias, 21 months)
MCMT_height_asturias <- mean(df.site[df.site$Year %in% 2011:2012 & df.site$ID=="asturias","MCMT"]) # mean across 2011 and 2012 
MSP_height_asturias <- mean(df.site[df.site$Year %in% 2011:2012 & df.site$ID=="asturias","MSP"]) # mean across 2011 and 2012 

# Height (Fundão, 20 months) + SLA + d13C
MCMT_height_SLA_d13C_portugal <- mean(df.site[df.site$Year %in% 2011:2012 & df.site$ID=="portugal","MCMT"]) # mean across 2011 and 2012
MSP_height_SLA_d13C_portugal <- mean(df.site[df.site$Year %in% 2011:2012 & df.site$ID=="portugal","MSP"]) # mean across 2011 and 2012                         

# Height (Bordeaux, 25 months)
MCMT_height_bordeaux25 <- mean(df.site[df.site$Year %in% 2011:2013 & df.site$ID=="bordeaux","MCMT"]) # mean across 2011, 2012 and 2013
MSP_height_bordeaux25 <- mean(df.site[df.site$Year %in% 2012:2013 & df.site$ID=="bordeaux","MSP"]) # mean across 2012 and 2013 (were not planted during the 2011 summer)                       
# Height (Bordeaux, 85 months)
MCMT_height_bordeaux85 <- mean(df.site[df.site$Year %in% 2011:2018 & df.site$ID=="bordeaux","MCMT"]) # mean from 2011 to 2018
MSP_height_bordeaux85 <- mean(df.site[df.site$Year %in% 2012:2018 & df.site$ID=="bordeaux","MSP"]) # mean from 2012 to 2018

# Mean bud burst date (Bordeaux)
MCMT_meanBB_bordeaux <- mean(df.site[df.site$Year %in% c(2013:2015,2017) & df.site$ID=="bordeaux","MCMT"]) # mean across years 2013, 2014, 2015 and 2017
MSP_meanBB_bordeaux <- mean(df.site[df.site$Year %in% c(2013:2015,2017) & df.site$ID=="bordeaux","MSP"]) # same

# Mean duration of bud burst (Bordeaux)
MCMT_meanDBB_bordeaux <- mean(df.site[df.site$Year %in% c(2014:2015,2017) & df.site$ID=="bordeaux","MCMT"]) #  mean across years 2014, 2015 and 2017
MSP_meanDBB_bordeaux <- mean(df.site[df.site$Year %in% c(2014:2015,2017) & df.site$ID=="bordeaux","MSP"]) # same


df <- df %>% 
  dplyr::mutate(CTD_MCMT_height_asturias=(mean_MCMT-MCMT_height_asturias) %>% abs(),
                CTD_MSP_height_asturias=(mean_MSP-MSP_height_asturias) %>% abs(),
                CTD_MCMT_height_SLA_d13C_portugal=(mean_MCMT-MCMT_height_SLA_d13C_portugal) %>% abs(),
                CTD_MSP_height_SLA_d13C_portugal=(mean_MSP-MSP_height_SLA_d13C_portugal) %>% abs(),
                CTD_MCMT_height_bordeaux25=(mean_MCMT-MCMT_height_bordeaux25) %>% abs(),
                CTD_MSP_height_bordeaux25=(mean_MSP-MSP_height_bordeaux25) %>% abs(),
                CTD_MCMT_height_bordeaux85=(mean_MCMT-MCMT_height_bordeaux85) %>% abs(),
                CTD_MSP_height_bordeaux85=(mean_MSP-MSP_height_bordeaux85) %>% abs(),
                CTD_MCMT_meanBB_bordeaux=(mean_MCMT-MCMT_meanBB_bordeaux) %>% abs(),
                CTD_MSP_meanBB_bordeaux=(mean_MSP-MSP_meanBB_bordeaux) %>% abs(),
                CTD_MCMT_meanDBB_bordeaux=(mean_MCMT-MCMT_meanDBB_bordeaux) %>% abs(),
                CTD_MSP_meanDBB_bordeaux=(mean_MSP-MSP_meanDBB_bordeaux) %>% abs())
                
df %>% 
  dplyr::select(prov,contains("CTD")) %>% 
  kable(digits=2) %>%
  kable_styling(font_size=11,bootstrap_options = c("striped","hover", "condensed"), full_width = F) %>%
  column_spec(1, bold = T)

prov	CTD_bio6_height_asturias	CTD_bio14_height_asturias	CTD_bio6_height_SLA_d13C_portugal	CTD_bio14_height_SLA_d13C_portugal	CTD_bio6_height_bordeaux25	CTD_bio14_height_bordeaux25	CTD_bio6_height_bordeaux85	CTD_bio14_height_bordeaux85	CTD_bio6_meanBB_bordeaux	CTD_bio14_meanBB_bordeaux	CTD_bio6_meanDBB_bordeaux	CTD_bio14_meanDBB_bordeaux	CTD_MCMT_height_asturias	CTD_MSP_height_asturias	CTD_MCMT_height_SLA_d13C_portugal	CTD_MSP_height_SLA_d13C_portugal	CTD_MCMT_height_bordeaux25	CTD_MSP_height_bordeaux25	CTD_MCMT_height_bordeaux85	CTD_MSP_height_bordeaux85	CTD_MCMT_meanBB_bordeaux	CTD_MSP_meanBB_bordeaux	CTD_MCMT_meanDBB_bordeaux	CTD_MSP_meanDBB_bordeaux
ALT	1.85	0.90	0.10	17.05	0.32	9.05	1.18	4.64	1.55	8.55	1.79	6.09	1.80	91.39	0.65	127.24	0.50	10.24	1.29	21.23	1.55	13.66	1.77	1.41
ARM	1.75	0.04	0.00	17.91	0.22	8.19	1.07	3.77	1.45	7.69	1.68	5.22	1.72	100.62	0.57	136.47	0.42	19.47	1.21	30.47	1.47	4.43	1.69	7.82
ARN	2.97	18.36	1.22	0.41	1.44	26.51	2.30	22.09	2.67	26.01	2.91	23.54	2.80	21.97	1.65	13.88	1.50	103.12	2.28	92.13	2.55	127.02	2.76	114.77
BAY	4.47	13.89	2.72	4.06	2.94	22.04	3.80	17.63	4.17	21.54	4.41	19.08	4.14	26.38	2.99	9.47	2.84	107.53	3.62	96.53	3.89	131.43	4.10	119.18
BON	4.87	16.04	3.12	1.91	3.34	24.19	4.20	19.78	4.57	23.69	4.81	21.23	4.36	31.68	3.21	4.17	3.06	112.83	3.84	101.84	4.11	136.73	4.32	124.48
CAD	1.63	1.09	3.38	16.86	3.17	9.24	2.31	4.82	1.93	8.74	1.70	6.27	0.73	65.39	1.88	101.24	2.03	15.76	1.24	4.77	0.98	39.66	0.76	27.41
CAR	3.97	17.65	2.22	0.30	2.43	25.80	3.29	21.38	3.67	25.30	3.90	22.83	3.55	79.76	2.40	43.91	2.25	160.91	3.04	149.92	3.30	184.81	3.52	172.56
CAS	1.80	1.48	3.55	16.47	3.33	9.63	2.48	5.22	2.10	9.13	1.87	6.67	0.94	73.21	2.09	109.06	2.24	7.94	1.46	3.05	1.19	31.84	0.98	19.59
CEN	5.28	18.40	3.53	0.45	3.75	26.55	4.61	22.13	4.98	26.05	5.22	23.58	4.81	69.06	3.66	33.21	3.51	150.21	4.30	139.22	4.56	174.11	4.78	161.86
COC	3.75	17.96	2.00	0.01	2.22	26.11	3.08	21.70	3.45	25.61	3.68	23.15	3.33	83.15	2.18	47.30	2.03	164.30	2.82	153.31	3.08	188.20	3.29	175.95
COM	0.85	22.31	2.60	4.36	2.38	30.46	1.52	26.05	1.15	29.96	0.92	27.49	0.18	117.54	0.97	81.69	1.12	198.69	0.33	187.70	0.07	222.59	0.15	210.34
CUE	3.76	17.61	2.01	0.34	2.22	25.76	3.08	21.35	3.46	25.26	3.69	22.79	3.31	80.24	2.16	44.39	2.01	161.39	2.80	150.39	3.06	185.29	3.28	173.04
HOU	0.44	2.57	2.19	20.52	1.97	5.58	1.12	1.17	0.74	5.08	0.51	2.61	0.67	67.66	0.48	103.51	0.63	13.49	0.16	2.50	0.42	37.39	0.64	25.14
LAM	2.15	0.65	3.90	18.60	3.69	7.50	2.83	3.08	2.45	7.00	2.22	4.53	1.20	87.99	2.35	123.84	2.50	6.84	1.71	17.84	1.45	17.06	1.23	4.81
LEI	4.57	18.46	6.32	0.51	6.10	26.61	5.24	22.20	4.87	26.11	4.64	23.64	3.28	84.89	4.43	49.04	4.58	166.04	3.80	155.05	3.53	189.94	3.32	177.69
MIM	1.48	9.43	3.23	27.38	3.01	1.28	2.15	5.69	1.78	1.78	1.55	4.24	0.49	129.91	1.64	165.76	1.79	48.76	1.00	59.75	0.74	24.86	0.52	37.11
OLB	3.78	16.06	2.03	1.89	2.24	24.21	3.10	19.79	3.48	23.71	3.71	21.24	3.23	12.51	2.08	23.34	1.93	93.66	2.72	82.67	2.98	117.56	3.20	105.31
OLO	1.16	3.62	2.91	14.33	2.69	11.77	1.83	7.36	1.46	11.27	1.22	8.81	0.53	9.67	0.62	45.52	0.77	71.48	0.02	60.49	0.28	95.38	0.50	83.13
ORI	4.59	21.16	2.84	3.21	3.06	29.31	3.92	24.90	4.29	28.81	4.53	26.35	3.63	108.13	2.48	72.28	2.33	189.28	3.12	178.29	3.38	213.18	3.60	200.93
PET	1.42	11.38	3.17	29.33	2.95	3.23	2.09	7.65	1.72	3.73	1.48	6.20	0.47	151.24	1.62	187.09	1.77	70.09	0.98	81.08	0.72	46.19	0.51	58.44
PIA	2.64	14.42	4.39	3.53	4.18	22.57	3.32	18.16	2.94	22.07	2.71	19.60	1.88	50.25	3.03	14.40	3.18	131.40	2.39	120.41	2.13	155.30	1.91	143.05
PIE	4.52	10.74	2.77	7.21	2.99	18.89	3.84	14.47	4.22	18.39	4.45	15.92	3.85	66.98	2.70	102.83	2.55	14.17	3.34	3.17	3.60	38.07	3.81	25.82
PLE	1.63	2.03	0.12	15.92	0.10	10.18	0.96	5.76	1.33	9.68	1.57	7.21	2.20	30.32	1.05	66.17	0.90	50.83	1.69	39.83	1.95	74.73	2.17	62.48
PUE	2.40	2.95	4.15	15.00	3.93	11.10	3.07	6.68	2.70	10.60	2.47	8.13	1.34	44.80	2.49	80.65	2.64	36.35	1.85	25.36	1.59	60.25	1.38	48.00
QUA	0.19	18.22	1.94	0.27	1.73	26.37	0.87	21.95	0.49	25.87	0.26	23.40	0.70	8.37	1.85	27.48	2.00	89.52	1.21	78.53	0.95	113.42	0.73	101.17
SAC	1.16	1.68	0.59	16.27	0.37	9.83	0.48	5.41	0.86	9.33	1.09	6.86	0.71	164.67	0.44	200.52	0.59	83.52	0.19	94.51	0.46	59.62	0.67	71.87
SAL	5.55	9.55	3.80	8.40	4.01	17.70	4.87	13.28	5.25	17.20	5.48	14.73	5.46	6.62	4.31	42.47	4.16	74.53	4.95	63.54	5.21	98.43	5.43	86.18
SEG	0.40	0.92	1.35	17.03	1.13	9.07	0.27	4.66	0.10	8.57	0.34	6.10	0.45	141.12	0.70	176.97	0.85	59.97	0.06	70.96	0.20	36.07	0.42	48.32
SIE	1.44	1.00	3.19	16.95	2.97	9.15	2.11	4.73	1.74	8.65	1.50	6.18	0.56	68.80	1.71	104.65	1.86	12.35	1.08	1.36	0.81	36.25	0.60	24.00
STJ	0.56	3.93	2.31	14.02	2.10	12.08	1.24	7.67	0.86	11.58	0.63	9.12	0.89	10.38	0.26	46.23	0.41	70.77	0.38	59.78	0.64	94.67	0.86	82.42
TAM	4.90	19.61	3.15	1.66	3.37	27.76	4.22	23.35	4.60	27.26	4.83	24.80	3.64	80.47	2.49	44.62	2.34	161.62	3.12	150.63	3.39	185.52	3.60	173.27
VAL	3.69	17.58	1.94	0.37	2.16	25.73	3.02	21.31	3.39	25.23	3.63	22.76	3.59	60.63	2.44	24.78	2.29	141.78	3.08	130.79	3.34	165.68	3.56	153.43
VER	0.44	0.34	1.31	17.61	1.09	8.49	0.23	4.08	0.14	7.99	0.37	5.53	1.05	34.41	0.10	70.26	0.25	46.74	0.53	35.75	0.80	70.64	1.01	58.39

df %>% 
  dplyr::select(contains("MCMT"),contains("MSP")) %>% 
  cor() %>% 
  lower.triangle() %>%
  kable(digits=3) %>%
  kable_styling(font_size=11,
                bootstrap_options = c("striped","hover", "condensed"), full_width = F) %>%
  column_spec(1, bold = T)

	mean_MCMT	var_MCMT	CTD_MCMT_height_asturias	CTD_MCMT_height_SLA_d13C_portugal	CTD_MCMT_height_bordeaux25	CTD_MCMT_height_bordeaux85	CTD_MCMT_meanBB_bordeaux	CTD_MCMT_meanDBB_bordeaux	mean_MSP	var_MSP	CTD_MSP_height_asturias	CTD_MSP_height_SLA_d13C_portugal	CTD_MSP_height_bordeaux25	CTD_MSP_height_bordeaux85	CTD_MSP_meanBB_bordeaux	CTD_MSP_meanDBB_bordeaux
mean_MCMT	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
var_MCMT	0.190	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
CTD_MCMT_height_asturias	-0.777	-0.335	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
CTD_MCMT_height_SLA_d13C_portugal	-0.230	-0.425	0.737	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
CTD_MCMT_height_bordeaux25	-0.114	-0.402	0.649	0.991	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
CTD_MCMT_height_bordeaux85	-0.593	-0.406	0.953	0.896	0.833	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
CTD_MCMT_meanBB_bordeaux	-0.699	-0.365	0.990	0.819	0.741	0.986	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
CTD_MCMT_meanDBB_bordeaux	-0.768	-0.339	1.000	0.749	0.661	0.959	0.992	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
mean_MSP	0.376	0.297	-0.583	-0.484	-0.441	-0.577	-0.578	-0.582	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0
var_MSP	0.253	0.049	-0.394	-0.306	-0.274	-0.397	-0.397	-0.395	0.739	1.000	0.000	0.000	0.000	0.000	0.000	0
CTD_MSP_height_asturias	0.242	-0.322	-0.312	-0.196	-0.174	-0.286	-0.309	-0.312	0.355	0.432	1.000	0.000	0.000	0.000	0.000	0
CTD_MSP_height_SLA_d13C_portugal	0.408	0.049	-0.606	-0.477	-0.433	-0.592	-0.605	-0.607	0.821	0.678	0.787	1.000	0.000	0.000	0.000	0
CTD_MSP_height_bordeaux25	-0.334	-0.429	0.481	0.399	0.365	0.462	0.468	0.479	-0.841	-0.423	0.065	-0.514	1.000	0.000	0.000	0
CTD_MSP_height_bordeaux85	-0.307	-0.464	0.440	0.362	0.329	0.420	0.426	0.438	-0.768	-0.347	0.178	-0.409	0.989	1.000	0.000	0
CTD_MSP_meanBB_bordeaux	-0.361	-0.375	0.539	0.455	0.420	0.526	0.530	0.538	-0.939	-0.561	-0.128	-0.672	0.970	0.927	1.000	0
CTD_MSP_meanDBB_bordeaux	-0.348	-0.396	0.511	0.432	0.399	0.496	0.500	0.510	-0.895	-0.492	-0.040	-0.601	0.991	0.962	0.993	1

climcc <- tibble("Common gardens"=c("Asturias",
                                    "Fundão",
                                    "Bordeaux",
                                    "Bordeaux",
                                    "Bordeaux",
                                    "Bordeaux"),
                 "Traits"=c("Height (21 months)",
                            "Height (20 months), SLA, d13C",
                            "Height (25 months)",
                            "Height (85 months)",
                            "Mean bud burst date",
                            "Mean duration of bud burst"),
                 "Years included"=c("2011,2012",
                                    "2011,2012",
                                    "2011 (MCMT only),2012,2013",
                                    "2011 (MCMT only), 2012 to 2018",
                                    "2013, 2014, 2015 and 2017",
                                    "2014, 2015 and 2017"),
                 "MCMT"=c(MCMT_height_asturias,
                        MCMT_height_SLA_d13C_portugal,
                        MCMT_height_bordeaux25,
                        MCMT_height_bordeaux85,
                        MCMT_meanBB_bordeaux,
                        MCMT_meanDBB_bordeaux),
                 "SP"=c(MSP_height_asturias,
                      MSP_height_SLA_d13C_portugal,
                      MSP_height_bordeaux25,
                      MSP_height_bordeaux85,
                      MSP_meanBB_bordeaux,
                      MSP_meanDBB_bordeaux))

print(xtable(climcc, type = "latex",digits=3),
      file = "tables/Drivers/ClimateCommonGardens.tex",
      include.rownames=FALSE)

7 Export drivers

saveRDS(df,file="data/DF_Drivers.rds")

8 Correlations

Correlations among the potential drivers:

dfcor <- df %>% 
  dplyr::select(A,D,
                mean_MCMT,mean_MSP,var_MCMT,var_MSP,
                SH.20km.PC1,SH.20km.PC2,
                SH.1km.PC1,SH.1km.PC2) %>% 
  cor() %>% 
  lower.triangle()

dfcor %>% 
  kable(digits=3) %>%
  kable_styling(font_size=11,
                bootstrap_options = c("striped","hover", "condensed"), full_width = F) %>%
  column_spec(1, bold = T)

	A	D	mean_MCMT	mean_MSP	var_MCMT	var_MSP	SH.20km.PC1	SH.20km.PC2	SH.1km.PC1	SH.1km.PC2
A	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
D	0.914	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0
mean_MCMT	0.122	0.012	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0
mean_MSP	-0.325	-0.422	0.376	1.000	0.000	0.000	0.000	0.000	0.000	0
var_MCMT	-0.449	-0.445	0.190	0.297	1.000	0.000	0.000	0.000	0.000	0
var_MSP	0.132	0.009	0.253	0.739	0.049	1.000	0.000	0.000	0.000	0
SH.20km.PC1	-0.071	-0.013	-0.244	-0.088	-0.050	0.245	1.000	0.000	0.000	0
SH.20km.PC2	0.197	0.349	-0.035	-0.140	-0.245	0.146	0.778	1.000	0.000	0
SH.1km.PC1	0.375	0.400	-0.142	0.032	-0.321	0.367	0.561	0.698	1.000	0
SH.1km.PC2	0.420	0.565	0.054	-0.191	-0.337	-0.001	0.333	0.799	0.686	1

print(xtable(dfcor, type = "latex",digits=3),
      file = "tables/Drivers/ClimaticDriversCorrelations.tex",
      include.rownames=FALSE)

Correlations between the potential drivers and the altitude of the populations:

dfcoralt <- readRDS(file="data/ClonapinData/PhenoDataNovember2019_AnnualTraits.rds") %>% 
  dplyr::filter(!(prov=="MAD")) %>% 
  dplyr::select(prov,altitude_prov) %>% 
  distinct() %>% 
  inner_join(df,by="prov") %>% 
  dplyr::select(A,D,SH.20km.PC1,SH.20km.PC2,
                SH.1km.PC1,SH.1km.PC2,contains("MSP"),(contains("MCMT")),altitude_prov) %>% 
  cor() %>% 
  upper.triangle() %>% 
  as.data.frame() %>% 
  rownames_to_column(var="Variable") %>% 
  dplyr::select(Variable,altitude_prov) %>% 
  arrange(desc(abs(altitude_prov))) %>% 
  dplyr::filter(!Variable=="altitude_prov") %>% 
  dplyr::rename("Pearson correlation with altitude"=altitude_prov) 


dfcoralt %>% 
  kable(digits=3) %>% 
  kable_styling(font_size=11,
                bootstrap_options = c("striped","hover", "condensed"), full_width = F)

Variable	Pearson correlation with altitude
mean_MCMT	-0.793
CTD_MCMT_height_asturias	0.741
CTD_MCMT_meanDBB_bordeaux	0.737
CTD_MCMT_meanBB_bordeaux	0.707
CTD_MCMT_height_bordeaux85	0.665
var_MCMT	-0.605
CTD_MSP_meanBB_bordeaux	0.585
mean_MSP	-0.580
CTD_MSP_meanDBB_bordeaux	0.576
CTD_MSP_height_bordeaux25	0.567
CTD_MSP_height_bordeaux85	0.548
CTD_MCMT_height_SLA_d13C_portugal	0.440
CTD_MSP_height_SLA_d13C_portugal	-0.439
CTD_MCMT_height_bordeaux25	0.349
D	0.333
var_MSP	-0.273
SH.20km.PC1	0.268
SH.20km.PC2	0.268
SH.1km.PC1	0.258
SH.1km.PC2	0.226
A	0.180
CTD_MSP_height_asturias	-0.079

print(xtable(dfcoralt, type = "latex",digits=3),
      file = "tables/Drivers/ClimaticDriversCorrelationAltitude.tex",
      include.rownames=FALSE)

Bibliography

Jaramillo-Correa, Juan-Pablo, Isabel Rodríguez-Quilón, Delphine Grivet, Camille Lepoittevin, Federico Sebastiani, Myriam Heuertz, Pauline H Garnier-Géré, et al. 2015. “Molecular Proxies for Climate Maladaptation in a Long-Lived Tree (Pinus Pinaster Aiton, Pinaceae).” Genetics 199 (3). Oxford University Press: 793–807.

Pritchard, Jonathan K, Matthew Stephens, and Peter Donnelly. 2000. “Inference of Population Structure Using Multilocus Genotype Data.” Genetics 155 (2). Oxford University Press: 945–59.

Calculating potential drivers of within-population genetic variation

Juliette Archambeau

10 janvier, 2023