Task.1. Blue shark tissue sampling and archiving
Collect and store a representative set of samples following a standardized protocol.
Description of Work
a) Sampling design. Sampling will target two main areas with marked differences in environmental/oceanographic parameters: the Western/Central Western and the Central Eastern/Eastern Mediterranean Sea. A third area included in the sampling design is the North-Eastern Atlantic. Blue shark (BS) sampling in the target areas will be carried out during the first six months of the project by Team partners UNIBO, NKUA, UNICAL, IEO and QUEENS. The sampling strategy is expected to yield a minimum total of 50 sampling locations and 160 BS samples. In addition, up to 50 historical tissue samples, approximately one generation old, collected from the Eastern Mediterranean are available within the Team.
b) Sampling protocol. Muscle or skin tissue will be sampled. Prepare the sampling material in advance: 2 mL tubes with screw cap and O-ring, surgical scissors, tweezers, scalpel, beaker, 96% ethanol, ddH2O, latex gloves and paper towels. All materials and equipment must be carefully cleaned before sampling: washed with water and ethanol and dried with clean paper. Between samples, rinse the instruments in a small beaker or jar, first with ddH2O and then with 96% ethanol, and dry them with a paper towel to reduce the risk of cross-contamination. Collect every sample in duplicate. Using sharp scissors and tweezers, cut a small, thick portion of about 1 cm² of muscle or skin tissue and insert it into a properly labeled O-ring tube previously filled with about 1.5 mL of 96% ethanol. Larger portions of tissue are usually unnecessary and can lower the quality of the DNA because of a poor ethanol-to-tissue ratio. Make sure that the volume of tissue is not more than 10% of the volume of liquid and that the cap is properly closed. If a larger amount of tissue is taken from the same animal, use a second, properly labeled tube. Finally, refrigerate the tubes and shelter them from light as soon as possible, then place them in cryoboxes at -20 °C. Should this not be possible, ensure that the temperature does not exceed +4 °C. The biological data to be collected for each individual are size (in cm) and sex (female/male). The fishery data are date, geographical coordinates (longitude/latitude) and depth (in m).
c) Data archiving. Biological and fishery data will be archived in a GIS-interfaced database to map the collected samples and the genetic variation of BS at the seascape level.
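As an illustration of how the biological and fishery data listed above could be checked before archiving, the following minimal Python sketch defines one record per tube and validates it. The field names, the `sample_id` label and the `validate` helper are hypothetical and are not part of the protocol; only the variables themselves (size, sex, date, coordinates, depth) and the 10% tissue-to-ethanol limit come from the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# Protocol rule: tissue volume must not exceed 10% of the liquid volume.
MAX_TISSUE_FRACTION = 0.10


@dataclass
class SampleRecord:
    sample_id: str            # hypothetical label written on the tube
    size_cm: float            # size of the individual, in cm
    sex: str                  # "female" or "male"
    catch_date: date          # fishery data: date
    longitude: float          # decimal degrees
    latitude: float           # decimal degrees
    depth_m: float            # fishery data: depth, in m
    tissue_ml: float          # approximate tissue volume in the tube
    ethanol_ml: float = 1.5   # about 1.5 mL of 96% ethanol per tube


def validate(record: SampleRecord) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if record.size_cm <= 0:
        problems.append("size must be positive")
    if record.sex not in ("female", "male"):
        problems.append("sex must be 'female' or 'male'")
    if not -180.0 <= record.longitude <= 180.0:
        problems.append("longitude out of range")
    if not -90.0 <= record.latitude <= 90.0:
        problems.append("latitude out of range")
    if record.depth_m < 0:
        problems.append("depth must not be negative")
    if record.ethanol_ml <= 0:
        problems.append("ethanol volume must be positive")
    elif record.tissue_ml > MAX_TISSUE_FRACTION * record.ethanol_ml:
        problems.append("tissue exceeds 10% of ethanol volume")
    return problems
```

A record that fails the last check would indicate that the tissue should be split into a second, properly labeled tube, as the protocol recommends.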
Partner involved: UNIBO, NKUA, IEO, UNICAL, QUEENS
Places of execution of services: On-board work: Mediterranean and NE Atlantic; Lab work: Ravenna, Italy; Athens, Greece; Malaga, Spain; Arcavacata di Rende (CS), Italy; Belfast, UK.
D1: Comprehensive documentation providing a thorough description and an inventory of data collected and archived.
Due: month 7
Task.2. Development of genomic resources for the Blue Shark
To provide genetic markers for the Blue Shark by identifying genome-wide informative Single Nucleotide Polymorphism (SNP) markers.
Description of Work
Genomic tools will be developed for the target species using a modified RAD protocol, i.e. double digest RAD (ddRAD). The procedure is similar to that of Peterson et al. (2012) except for one step: samples will be pooled at the earliest possible stage and then processed as a single-tube preparation. The protocol was originally optimized for pools of 144 samples of species with an average genome size <1 Gbp. Because the genome of Prionace glauca is considerably larger (two independent estimates suggest >4.2 Gbp), the number of individuals multiplexed in the same library will be substantially reduced. A larger number of fragments is expected for the chosen combination of enzymes and fragment size; this will provide more genomic information but will require a higher sequencing effort per library to ensure adequate coverage (sequencing depth) per fragment. Approximately 30 individuals will be pooled in the same library, which means that 6-7 libraries will be prepared and run on a HiSeq2000 with 100 bp paired-end (PE) reads, each library in a separate lane of a flow cell.
DNA will be extracted using the Invitek 96-well tissue DNA extraction kit, quality controlled and quantified using a NanoDrop spectrophotometer and PicoGreen dye on a Qubit fluorometer. Exactly the same amount of genomic DNA will be used for each individual. Individual DNA will be digested with SbfI-HF and SphI-HF (<1 enzyme unit per sample). P1 and P2 barcoded adapters will be added together with T4 ligase. After heat inactivation of the enzymes, individual samples will be pooled and cleaned up. The library will be run on an agarose gel (1.1%) to size-select fragments of 200-300 bp. Size-selected DNA will be eluted from the gel slice and PCR-amplified with generic P1 and P2 primers once PCR conditions have been optimized. The amplified library will be purified using AMPure magnetic beads, and final library quality will be assessed by loading an aliquot on an Agilent Bioanalyzer DNA chip.
Libraries that pass quality control will be sent to an external sequencing centre.
A first test will be carried out with 30 individuals to evaluate the number of fragments obtained, the relative sequencing coverage and the number of polymorphic sites. Approximately 10,000-20,000 fragments are expected.
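The trade-off between multiplexing level, number of fragments and sequencing depth described above can be made explicit with some simple arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and the lane yield of 150 million read pairs is an assumed placeholder, not a figure taken from this work plan or from the sequencing provider.

```python
import math


def libraries_needed(n_samples: int, per_library: int) -> int:
    """Number of libraries when each library pools `per_library` individuals."""
    return math.ceil(n_samples / per_library)


def expected_depth(read_pairs_per_lane: float,
                   n_individuals: int,
                   n_fragments: int,
                   usable_fraction: float = 1.0) -> float:
    """Mean read pairs per fragment per individual, assuming even representation.

    `usable_fraction` is the share of reads kept after quality filtering
    (an assumption; set it from real data once the first test is sequenced).
    """
    return read_pairs_per_lane * usable_fraction / (n_individuals * n_fragments)


# 160 contemporary samples alone, or 160 plus up to 50 historical samples
print(libraries_needed(160, 30), libraries_needed(210, 30))   # 6 7

ASSUMED_READ_PAIRS_PER_LANE = 150_000_000  # placeholder value, not from the source
for n_fragments in (10_000, 20_000):
    depth = expected_depth(ASSUMED_READ_PAIRS_PER_LANE, 30, n_fragments)
    print(n_fragments, depth)   # 10000 500.0, then 20000 250.0
```

Under these assumptions, doubling the number of fragments halves the depth per fragment, which is why the first test on 30 individuals is needed before fixing the multiplexing level.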
The ddRAD methodology has already been applied successfully to other species in the lab. Should initial tests show that it is difficult to use with Blue Shark, a different RAD-like strategy will be used (2bRAD, Wang et al. 2012). 2bRAD produces shorter fragments (37 bp), which are less informative, but it allows individual samples to be processed separately until the end of the protocol, which prevents low-quality samples from being poorly represented, as can happen in pooled-sample libraries. Moreover, the 2bRAD protocol can use adapters with 1-2 selective bases, which reduce the complexity of the digested fragments by nearly 20-fold. This might prove useful for a complex, large genome. The protocol will follow the published one, using the type IIB restriction enzyme AlfI. The multiplexing level will be decided after the first test, but up to 196 barcoded adaptors are already available in the lab. Individual libraries will be pooled in equal amounts to the desired multiplexing level and sent out for sequencing on a HiSeq2000 with 50 bp single-end (SE) reads.
For both RAD-like technologies, a bioinformatic pipeline based on the program STACKS is already in place. This pipeline will be used to filter low-quality reads, assemble reads into contigs and identify polymorphic sites.
A catalogue of loci with their SNPs will be produced. From this catalogue, genomic data will be converted to the appropriate file formats for subsequent genetic analysis (e.g. Genepop, STRUCTURE).
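To illustrate the conversion step, the following Python sketch writes biallelic SNP genotypes in the Genepop text layout (title line, one locus name per line, a `Pop` line before each population, then one line per individual). It is a minimal sketch and not the export routine of the existing pipeline; the function names, sample labels and the two-digit allele coding (A=01, C=02, G=03, T=04, missing=0000) are assumptions chosen for the example.

```python
from typing import List, Optional, Sequence, Tuple

Genotype = Optional[Tuple[str, str]]          # e.g. ("A", "G"); None = missing
Individual = Tuple[str, Sequence[Genotype]]   # (sample name, genotypes per locus)

# Assumed two-digit allele codes for nucleotides
ALLELE_CODES = {"A": "01", "C": "02", "G": "03", "T": "04"}


def encode_genotype(genotype: Genotype) -> str:
    """Encode one diploid genotype as four digits; missing data as 0000."""
    if genotype is None:
        return "0000"
    first, second = genotype
    return ALLELE_CODES[first] + ALLELE_CODES[second]


def to_genepop(title: str,
               loci: Sequence[str],
               populations: Sequence[Sequence[Individual]]) -> str:
    """Return the genotypes as Genepop-formatted text."""
    lines: List[str] = [title]
    lines.extend(loci)                         # one locus name per line
    for individuals in populations:
        lines.append("Pop")                    # starts a new population block
        for name, genotypes in individuals:
            if len(genotypes) != len(loci):
                raise ValueError(f"{name}: expected {len(loci)} genotypes")
            coded = " ".join(encode_genotype(g) for g in genotypes)
            lines.append(f"{name} , {coded}")
    return "\n".join(lines) + "\n"


# Hypothetical example with two loci and two sampling areas
example = to_genepop(
    "BS SNP test",
    ["snp_1", "snp_2"],
    [
        [("WMED_01", [("A", "A"), ("C", "T")]), ("WMED_02", [None, ("T", "T")])],
        [("EMED_01", [("A", "G"), ("C", "C")])],
    ],
)
print(example)
```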
Partner involved: UNIPD
Places of execution of services: Lab-work: Padova, Italy
D2: Interim Progress Report, including a list of inventories on:
- Sequence data available
- Sequence data assembled
- RAD libraries prepared
- RAD sequences available
- Population-informative candidate SNPs identified
Due: month 18
Task.3. Establish genetic baselines of natural populations
Results allowing for estimates and evaluation of spatio-temporal differentiation among natural populations.
Description of Work
A dataset based on >300 SNP loci will be generated and used in a multi-approach analytical framework to disentangle genetic structure within the Mediterranean and between the Mediterranean and the Atlantic (spatial scale analysis), and to test, where feasible, BS temporal genetic variation in some areas over an interval of about one generation (approximately 10 years, the generation time of BS being 12 years; temporal scale analysis).
Spatial scale analysis. BS population structure in the Mediterranean will be investigated using fastSTRUCTURE (Raj et al. 2014), which is suited to large SNP datasets, and the model developed by Hubisz et al. (2009), in which the basic STRUCTURE models are extended to incorporate information on sampling location, as is necessary to properly infer population structure when genetic differences between subpopulations are small. The earlier STRUCTURE model that uses sampling locations was designed to test for the presence of migrants belonging to a different location and is only useful for highly informative data, i.e. when there is strong evidence of population structure and sampling locations correspond almost exactly to the inferred clusters. The Hubisz et al. (2009) model does not require such a strong signal and therefore appears appropriate for testing population structure in epipelagic, highly migratory marine fish such as BS.
An important class of Bayesian clustering models improves on STRUCTURE by including information on individual geographic coordinates.
Spatial Bayesian clustering models implemented in the software GENELAND (Guillot et al. 2005, 2008) will also be employed to address the required seascape genetics approach for spatially clustering BS samples. GENELAND 3.2.4 (Guillot et al. 2005), a package for R 2.12.0 (R Development Core Team 2010), takes individual multi-locus genotype data, searches for the clustering that best fits Hardy-Weinberg equilibrium (HWE) and linkage equilibrium, and incorporates spatial data directly under the assumption that populations are spatially organized. The correlated allele frequency model will be tested with and without spatial information. This model accounts for the situation where some allele frequencies reflect the common ancestry of different populations. The primary difference between the spatial and non-spatial models in GENELAND is the assumption of spatial correlation of genotypes. Any genetic boundaries found are assumed to separate K randomly mating subpopulations, thus subdividing space in a way resembling a Poisson-Voronoi tessellation (Guillot et al. 2005; Manel et al. 2007).
The multi-approach analytical framework will be completed by multivariate methods that have proved highly efficient in extracting information from genetic markers (Cavalli-Sforza 1966; Johnson et al. 1969; Smouse et al. 1982).
Two different ordination methods, Correspondence Analysis (CA; Greenacre 1966; Jombart et al. 2009) and Canonical/constrained Correspondence Analysis (CCA; Legendre & Legendre 1998), will be used to further investigate the spatial pattern of genetic variability among BS samples. CA will be performed using the R packages ADE4 1.4 (Dray & Dufour 2007) and ADEGENET 2.7 (Jombart 2008). CA is an 'ordination in reduced space' method and can be used to analyse tables of allele counts. The method optimizes the χ² distances among observations and can therefore give a stronger weight to a population possessing a rare allele. Consequently, to minimize analysis artefacts, alleles present in a single copy in only one population will be removed (Jombart et al. 2005).
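The filtering step that precedes CA can be sketched in a few lines of Python. The example assumes a table of allele counts with one row per population and one column per allele; the function name and the toy data are hypothetical, and the actual analysis will be run with the R packages named above.

```python
from typing import List, Sequence, Tuple


def drop_singleton_alleles(
    counts: Sequence[Sequence[int]],
    allele_names: Sequence[str],
) -> Tuple[List[str], List[List[int]]]:
    """Remove alleles observed as a single copy in only one population.

    `counts[p][a]` is the number of copies of allele `a` in population `p`.
    An allele with a total count of exactly 1 across all populations is,
    by definition, present in a single copy in one population only.
    """
    totals = [sum(column) for column in zip(*counts)]
    keep = [j for j, total in enumerate(totals) if total != 1]
    kept_names = [allele_names[j] for j in keep]
    kept_counts = [[row[j] for j in keep] for row in counts]
    return kept_names, kept_counts


# Toy table: two populations, three alleles; "snp1.T" is a singleton
names, table = drop_singleton_alleles(
    [[10, 1, 0],
     [12, 0, 3]],
    ["snp1.A", "snp1.T", "snp2.G"],
)
print(names)   # ['snp1.A', 'snp2.G']
print(table)   # [[10, 0], [12, 3]]
```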
Multilocus genotype data will also be analysed under a model of isolation by distance (IBD). A matrix of geographical distances will be obtained by measuring the shortest sea path between each pair of sampling sites using Google Earth version 6.0.2 OOB; genetic distances between populations will be expressed by the ratio FST/(1 - FST) (Rousset 1997). Two additional matrices will be calculated describing salinity and temperature differences between sites. The correlation between distance matrices will be tested by Mantel (1967) tests using the VEGAN package for R (Oksanen et al. 2007).
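The logic of the IBD test can be illustrated with a small, self-contained Python sketch: pairwise FST is linearised as FST/(1 - FST), and the correlation with geographical distance is assessed by permuting the site labels of one matrix. This is a teaching sketch under simplifying assumptions (Pearson correlation, one-sided p-value, hypothetical function names), not the VEGAN implementation that will be used for the analysis.

```python
import random
from math import sqrt
from typing import List, Optional, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def linearised_fst(fst: float) -> float:
    """Genetic distance FST/(1 - FST) used for isolation-by-distance tests."""
    return fst / (1.0 - fst)


def upper_triangle(matrix: Matrix, order: Optional[Sequence[int]] = None) -> List[float]:
    """Flatten the upper triangle, optionally after relabelling the sites."""
    n = len(matrix)
    idx = list(range(n)) if order is None else order
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(dist_a: Matrix, dist_b: Matrix,
           permutations: int = 999, seed: int = 0) -> Tuple[float, float]:
    """Return (observed r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    flat_b = upper_triangle(dist_b)
    r_obs = pearson(upper_triangle(dist_a), flat_b)
    order = list(range(len(dist_a)))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)                       # relabel rows and columns together
        if pearson(upper_triangle(dist_a, order), flat_b) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Toy example: four sites along a line (sea distances in arbitrary units)
geo = [[0, 1, 3, 7], [1, 0, 2, 6], [3, 2, 0, 4], [7, 6, 4, 0]]
fst = [[0.0, 0.01, 0.03, 0.07], [0.01, 0.0, 0.02, 0.06],
       [0.03, 0.02, 0.0, 0.04], [0.07, 0.06, 0.04, 0.0]]
gen = [[linearised_fst(value) for value in row] for row in fst]
r, p = mantel(geo, gen)
```

The same function could be applied to the salinity and temperature difference matrices in place of `geo`.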
Environmental data, i.e. seawater salinity (S, psu) and surface temperature (t, °C) at the sampled sites, will be obtained from the SeaDataNet Climatologies. SeaDataNet is a Pan-European infrastructure for managing, indexing and providing access to ocean and marine data sets and data products acquired via research cruises and other observational activities, in situ and remote sensing. Temperature data will be averaged over the period 1985-2014 and salinity data over the period 1900-2014.
Temporal scale analysis. The assessment of temporal changes in SNP genetic variation will be limited to the Eastern Mediterranean because of the availability of ~50 historical, one-generation-old BS samples from that area. Differentiation between temporal samples (historical, 2003-2005, vs. contemporary, 2014-2015; approximately corresponding to a one-generation interval) will be estimated using the same multi-approach analytical framework used for disentangling spatial structure. In addition, effective population size will be estimated using temporal methods based on maximum-likelihood, Markov Chain Monte Carlo re-sampling, Bayesian and/or coalescence approaches (e.g. Berthier et al. 2002), as well as methods based on linkage or gametic phase disequilibrium between two loci (Waples 2002) or on heterozygote excess (e.g. Luikart and Cornuet 1999).
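To show the principle behind temporal estimates of effective population size (Ne), the Python sketch below uses a simple moment-based estimator: the standardised variance in allele frequency between the two samples (Nei and Tajima's Fc), corrected for sampling error, is converted into Ne. This is a deliberately simpler technique than the likelihood, Bayesian and coalescent methods cited above, and it is swapped in here for illustration only. It assumes biallelic loci, individuals sampled without replacement before reproduction, and no migration or selection; the function names and the example values are hypothetical.

```python
from typing import Sequence


def fc_biallelic(x: float, y: float) -> float:
    """Standardised variance Fc at one biallelic locus.

    x and y are the frequencies of the same allele in the historical
    and contemporary samples.
    """
    return (x - y) ** 2 / ((x + y) / 2.0 - x * y)


def temporal_ne(freqs_past: Sequence[float],
                freqs_now: Sequence[float],
                n_past: int,
                n_now: int,
                generations: float = 1.0) -> float:
    """Moment-based temporal estimate of Ne.

    n_past and n_now are the numbers of individuals in each sample.
    Returns infinity when sampling error alone explains the observed change.
    """
    values = [
        fc_biallelic(x, y)
        for x, y in zip(freqs_past, freqs_now)
        if (x + y) / 2.0 - x * y > 0      # skip loci fixed for the same allele in both samples
    ]
    if not values:
        raise ValueError("no informative loci")
    mean_fc = sum(values) / len(values)
    drift = mean_fc - 1.0 / (2 * n_past) - 1.0 / (2 * n_now)
    if drift <= 0:
        return float("inf")
    return generations / (2.0 * drift)


# Hypothetical single-locus example: frequency 0.5 -> 0.6, 50 individuals per sample
estimate = temporal_ne([0.5], [0.6], n_past=50, n_now=50, generations=1.0)
```

With only about one generation between samples and roughly 50 historical individuals, the sampling correction is of the same order as the expected drift signal, so estimates of this kind are likely to carry wide uncertainty.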
Partner involved: UNIBO, NKUA, UNIPD, QUEENS
Places of execution of services: Lab work: Ravenna, Italy; Padova, Italy; Belfast, UK. Note: NKUA scientist will work at UNIBO and UNIPD.
D3: Comprehensive documentation including an inventory of baseline genetic data for 300-400 analysed Blue Sharks in electronic format; Population Genetic Analysis; a high-quality scientific text on the estimates of spatio-temporal differentiation of populations and on a seascape genetic approach using the genetic data obtained; Final Report and a manuscript submitted to a peer-reviewed journal.
Due: month 24
Task.4. Project Management
Ensure continuous workflow and timely deliverables.
Description of Work
UNIBO will be responsible for project management:
• by maintaining contacts and communications with the JRC, including Project Progress Communication and Project Diary;
• by organizing scheduled Meetings (preferentially by tele-videoconference using the ICT and Web resources available at the Partners);
• by preparing, reviewing and submitting Deliverables and Reports.
Partner involved: UNIBO
Places of execution of services: UNIBO, Bologna, Italy