Traceability in the "ocean to fork" sense relies on efficient, reliable and cost-effective technologies that enable independent control of compliance with rules. In the fisheries sector, this encompasses the ability to determine whether labels on fish and fish products identify the correct species and the correct origin, and whether the fish derive from aquaculture or the wild. Ideally, such methods should be adapted to end-users such as staff of control authorities, be applicable to whole fish as well as processed products, and lead swiftly to results. Moreover, if they are to be utilised for enforcement, these methods should be validated to forensic standards and should generate statistically supported levels of confidence that are considerably higher than those required for purely scientific inference (Murphy and Morrison, 2007).
With the advent of molecular biology, molecular and genetic markers are increasingly employed both for species identification and for origin assignment. It is important to understand that traceability tools for both purposes require comprehensive reference data sets ("baselines"). When control authorities wish to test whether a fish fillet derives from the species indicated on the product label, their analytical data must be comparable to a set of validated data for species identification. This has been achieved to a great extent by the DNA barcoding approach (see below). Likewise, if the origin of a fish or fish product is under investigation, fish deriving from different geographical regions must have been analysed previously, and the distinct features characterising these groups (robust "population-level signatures") must have been recorded.
Genetic tools for species-level identification
For species identification, DNA microarrays ("DNA-Chips") can be of great value. Microarrays consist of a surface with thousands of covalently attached DNA oligonucleotides. This allows thousands of different (i.e. species-identifying) DNA sequences to be monitored simultaneously with one array (size about 1 x 1 cm). Theoretically, just one chip would enable screening for all major economic fish species simultaneously (Kochzius et al., 2008). While the development of DNA microarrays is laborious, the running costs are moderate. Other high-throughput and parallel processing methodologies for fish species identification have also been developed (Dooley et al., 2005). Such technologies might ultimately lead to the development of handheld analytical devices enabling field use, which is critical with respect to the response time (the period between starting an investigation and the receipt of analytical results). For example, inspectors in the fisheries sector carry huge responsibility: if they decide to put landings "on hold" because of suspect content, there can be severe consequences for fishermen and stakeholders. Such devices are already being engineered in support of forensic genetic analysis at crime scenes (Liu et al., 2008). However, while recent publications show that progress has been made in this area (Arnaud, 2008), no cost-effective handheld analytical device supporting fisheries control and enforcement or traceability is currently available.
One of the most commonly employed species-level genetic identification systems is DNA barcoding. Although the approach utilises technologies that have been available for some time within the general field of "molecular systematics", initially based primarily on protein variation or allozyme electrophoresis, DNA barcoding has highly distinctive attributes that enhance its utility within fisheries applications. Hebert et al. (2003) proposed a new approach to species identification, which offered great promise. The approach is based on the premise that sequence analysis of a short fragment of a single gene (cytochrome c oxidase subunit 1, COI) enables unequivocal identification of all animal species. Hence, analogously to the barcodes used on commercial products, the DNA barcode would provide a standardised tool for fast, simple, robust and precise species identification. Such a "barcode region" would also have to evolve at a rate that would distinguish species from each other while remaining more or less identical for all members of the same species. Finally, it would have to be flanked by conserved DNA regions so that it can be amplified by the polymerase chain reaction (PCR), a method of targeted gene replication. With the exception of certain groups of Cnidaria and sponges, studies have now confirmed that the target segment of COI ordinarily provides clear-cut discrimination of most animal species. An international consortium (Consortium for the Barcode of Life, CBOL) was established in 2004 to build support for global implementation of DNA barcoding. The critical mass of institutional and community participation required to progress the DNA barcoding effort for species identification now exists in the form of the International Barcode of Life (iBOL) project.
With the central goal of building a library of DNA barcodes for 5 million specimens and 500,000 eukaryotic species by 2015, iBOL promises rapid progress toward a global identification system accessible to all. DNA barcoding differs in many ways from conventional taxonomic identification tools and approaches, over which it offers several advantages. It permits the identification of species from fragments and from any life-history stage, and it provides a standardised, universal master key in a format that reduces ambiguity and enables direct comparison of specimens to a global reference database.
DNA barcoding provides a standardised tool for describing and monitoring fish species diversity, not only in the wild, but also throughout the food supply chain in relation to legal enforcement and consumer protection. Moreover, a globally accessible, standardised DNA barcoding database means that non-experts may utilise the information to examine species identity and, importantly, it also allows a coordinated and extensive effort to document biodiversity across species' distributions.
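The comparison of a specimen against a reference library, as described above, can be sketched in a few lines. This is a minimal illustration rather than the procedure used by any barcoding database: the function names, the 2% distance threshold and the 20-base reference strings are all assumptions made for the example, and the strings are placeholders, not real COI sequences. Real barcodes are several hundred bases long and must be aligned first.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only sites where both sequences carry an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def identify_species(query, reference_library, max_distance=0.02):
    """Return (species or None, distance) for the closest reference barcode.

    max_distance is an illustrative threshold (an assumption of this sketch);
    a query further than this from every reference is reported as unidentified.
    """
    species, distance = min(
        ((name, p_distance(query, barcode))
         for name, barcode in reference_library.items()),
        key=lambda item: item[1],
    )
    return (species if distance <= max_distance else None), distance


# Hypothetical baseline: placeholder strings, NOT real COI sequences.
REFERENCE_LIBRARY = {
    "Gadus morhua": "ACGTACGTACGTACGTACGT",
    "Clupea harengus": "ACGTTCGAACGTTCGAACGT",
    "Solea solea": "TCGTACCTACGAACGTAGGT",
}

if __name__ == "__main__":
    print(identify_species("ACGTACGTACGTACGTACGT", REFERENCE_LIBRARY))
    print(identify_species("G" * 20, REFERENCE_LIBRARY))
```

Returning no species when the closest match is too distant reflects the dependence on the baseline: a specimen from a species missing in the reference library should be reported as unidentified rather than forced onto its nearest neighbour.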
Genetic tools for population-level identification
Microsatellites, also called Short Tandem Repeats (STRs) in forensics, are tandem sequence repeats of one to six nucleotides (e.g. "cgtacgtacgtacgtacgta") in the genome. Their high polymorphism is characterised by variable repeat numbers (between 5 and 100), even between individuals. Microsatellites are the standard marker for human identity testing by DNA profiling and for forensic genetic crime scene investigation (Butler, 2005). They have also been used extensively in fish population studies, and their potential value as traceability markers for origin assignment is very high. However, despite the widespread application of microsatellites, there are drawbacks, particularly scoring error and lack of comparability among laboratories (Dewoody et al., 2006). Nevertheless, numerous examples exist where microsatellites are used for fish population/stock analysis, management, and also origin assignment (Manel et al., 2005; Hauser and Carvalho, 2008), including Atlantic salmon (Primmer et al., 2000), Pacific salmon (Fisheries and Oceans Canada, DFO) and cod (Nielsen et al., 2001).
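To make the notion of "variable repeat numbers" concrete, the sketch below counts consecutive copies of a motif in a sequence and reports a diploid genotype as a pair of repeat counts. It is an illustration only: in practice microsatellite alleles are scored from fragment lengths rather than from raw sequence, and the function names and flanking sequences here are invented for the example.

```python
import re


def max_repeat_count(sequence: str, motif: str) -> int:
    """Largest number of consecutive copies of motif found in sequence."""
    pattern = re.compile("(?:" + re.escape(motif) + ")+", re.IGNORECASE)
    # Each match is one uninterrupted run of the motif; convert its length to copies.
    runs = [len(match.group(0)) // len(motif) for match in pattern.finditer(sequence)]
    return max(runs, default=0)


def genotype(allele_sequences, motif):
    """Sorted repeat counts for the two alleles of a diploid individual."""
    return tuple(sorted(max_repeat_count(seq, motif) for seq in allele_sequences))


if __name__ == "__main__":
    # The example repeat from the text: five copies of "cgta".
    print(max_repeat_count("cgtacgtacgtacgtacgta", "cgta"))
    # Hypothetical individual with alleles of 8 and 5 repeats (invented flanks).
    alleles = ("ttag" + "cgta" * 8 + "ggac", "ttag" + "cgta" * 5 + "ggac")
    print(genotype(alleles, "cgta"))
```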
Meanwhile, Single Nucleotide Polymorphisms (SNPs) have entered the realm of fisheries genetics, offering great potential for origin assignment (Hudson, 2008). SNPs are genome sites where more than one nucleotide (A, C, G or T) is present in a species. They are the most abundant polymorphism in the genome (Brumfield et al., 2003), but normally only two alleles exist per locus (biallelic markers); they are thus less variable than microsatellites, where many alleles often exist. The lower information content per SNP marker is outweighed by their high abundance. Compared to other genetic markers, for which routine genotyping and transfer of protocols between laboratories prove difficult, the information retrieved from SNPs is categorical, and data can be standardised across laboratories for forensic applications (Sobrino et al., 2005). However, a substantial research effort targeting all commercial marine fish species will be necessary before SNPs can be employed routinely for origin assignment. Despite this, available studies on marine fish using SNPs are encouraging. SNPs used as markers to distinguish stocks of Atlantic cod (Gadus morhua) provided high resolving power for stock identification, comparable to that of microsatellite loci (Wirgin et al., 2007). Another example is the North Pacific Anadromous Fish Commission (NPAFC), which is developing SNP arrays for Pacific salmon (http://www.npafc.org). The application of SNPs to population genetics is not without problems, including so-called "ascertainment bias": the selection of loci based on an unrepresentative sample of individuals. For example, if SNPs have been developed from a few individuals (small ascertainment depth), SNPs with high heterozygosities are preferentially found, providing a false impression of overall genomic polymorphism. Likewise, if SNPs are developed from a biased sample of individuals (e.g. not covering the full range of populations), comparative analysis with respect to population-specific indices of variability can be biased. However, in the context of mixed stock analysis (MSA), for example, ascertainment bias is not expected to create problems. Population-biased ascertainment could result in marginally lower power for MSA in populations not included in the ascertainment sample; however, the high number of markers employed would most likely compensate for this.
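The effect of ascertainment depth on apparent heterozygosity can be illustrated with a small calculation. The sketch assumes a simple model that is not taken from the studies cited: a biallelic site with allele frequency p is "discovered" only if both alleles occur among n randomly sampled chromosomes, and the allele frequencies listed are hypothetical.

```python
def heterozygosity(p: float) -> float:
    """Expected heterozygosity of a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def discovery_probability(p: float, n_chromosomes: int) -> float:
    """Chance that both alleles appear among n sampled chromosomes."""
    return 1.0 - p ** n_chromosomes - (1.0 - p) ** n_chromosomes


def mean_het_of_discovered(freqs, n_chromosomes):
    """Average heterozygosity of sites, weighted by their chance of being discovered."""
    weights = [discovery_probability(p, n_chromosomes) for p in freqs]
    weighted = sum(w * heterozygosity(p) for w, p in zip(weights, freqs))
    return weighted / sum(weights)


# Hypothetical allele frequencies at five variable sites.
SITE_FREQS = [0.01, 0.05, 0.1, 0.3, 0.5]
TRUE_MEAN_HET = sum(heterozygosity(p) for p in SITE_FREQS) / len(SITE_FREQS)

if __name__ == "__main__":
    print("all sites:", round(TRUE_MEAN_HET, 3))
    for depth in (4, 20, 200):
        print(depth, "chromosomes:", round(mean_het_of_discovered(SITE_FREQS, depth), 3))
```

Under this model a shallow discovery panel rarely captures low-frequency variants, so the discovered SNPs over-represent highly heterozygous sites; the bias shrinks as the panel grows.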
Among the most recent applications of SNP markers to fisheries are the outputs of an EU Seventh Framework Project, FishPopTrace (http://fishpoptrace.jrc.ec.europa.eu/). Among the most striking scientific results is the provision of several hundred novel genetic markers in hake, herring and sole. Although these fish represent a major part of the European catch, many aspects of their biology remain unknown, including the number, location and independence of biological populations. The lack of high-resolution genetic data has complicated sustainable management, which should rely on biologically independent units rather than geographically defined "stocks". However, access to new genetic methods, so-called next generation sequencing, has changed the picture in a matter of just a few years. From a dozen genetic markers a few years ago, we now have knowledge of thousands of small genetic differences (genetic variation) at numerous genes, allowing the design of hundreds to thousands of new genetic markers. The unique combinations of this variation make it feasible to assign fish to specific populations and, under some conditions, to identify unique individuals.
It is now possible to assign fish correctly to populations from more areas and with higher certainty than before, reaching standards that can be used in a court of law. By using the genes that are most highly distinct among populations, it has been possible to develop "minimum assays with maximum power" comprising 10 to 30 SNPs. These assays have been developed to target some of the most pertinent needs for traceability tools in European fisheries management. For example, fast, efficient and forensically robust tools are now available to discriminate between cod from Canada, North Sea, Baltic Sea and Northeast Arctic populations, between North Sea and North Atlantic herring, between sole from the Irish Sea and Thames, and between hake from the Mediterranean and Atlantic areas.
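The principle of assigning an individual to a baseline population can be sketched as a likelihood comparison. The example below is a generic assignment test under assumed Hardy-Weinberg proportions and independent loci; it is not claimed to be the FishPopTrace procedure. The population names are borrowed from the text, but the allele frequencies, the four-SNP panel and the function names are invented for illustration.

```python
import math


def genotype_log_likelihood(genotype, allele_freqs, eps=0.01):
    """Log-likelihood of a multilocus SNP genotype in one population.

    genotype: copies (0, 1 or 2) of the reference allele at each locus.
    allele_freqs: reference-allele frequency at each locus in the population.
    eps keeps frequencies away from 0 and 1 so that log(0) cannot occur.
    """
    total = 0.0
    for copies, p in zip(genotype, allele_freqs):
        p = min(max(p, eps), 1.0 - eps)
        # Hardy-Weinberg genotype probabilities for 0, 1 and 2 copies.
        probabilities = ((1.0 - p) ** 2, 2.0 * p * (1.0 - p), p ** 2)
        total += math.log(probabilities[copies])
    return total


def assign(genotype, baseline):
    """Return the best-supported population and the score of every population."""
    scores = {pop: genotype_log_likelihood(genotype, freqs)
              for pop, freqs in baseline.items()}
    return max(scores, key=scores.get), scores


# Hypothetical baseline: invented allele frequencies at four SNPs.
BASELINE = {
    "North Sea": [0.9, 0.8, 0.2, 0.1],
    "Baltic Sea": [0.2, 0.3, 0.7, 0.9],
}

if __name__ == "__main__":
    population, scores = assign([2, 2, 0, 0], BASELINE)
    print(population, scores)
```

For enforcement purposes the difference between the two log-likelihoods, rather than the best match alone, would be the quantity of interest, since it expresses how strongly the data favour one origin over the other.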
One major advantage of using SNPs is the ability to alter the number of markers in relation to the biology of the species (levels of genetic differentiation) and the scale of geographic structuring of interest. Thus, by varying the number of markers on a SNP chip, it is possible, for example, to assign individuals back to their source population across different geographic scales with high levels of certainty and reproducibility. Such outputs are especially significant because previous types of genetic markers either detect levels of population difference that are too low or present inherent difficulties when comparing data generated by different laboratories. The use of a marker system such as SNPs, which is essentially based on the presence or absence of large numbers of single genetic variants, means that data from different sources can be compiled in a much more reliable and high-throughput way. The approach thereby enables the generation of baseline data, and of ongoing additions to them, for subsequent genetic monitoring. Moreover, it is imperative that any such tools can be used in a legal context, which necessitates forensic validation. This has been achieved for SNP markers within FishPopTrace across a range of policy-driven IUU scenarios (see Traceability of Fish Populations and Fish Products: http://fishpoptrace.jrc.ec.europa.eu/).
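One way to picture how a reduced panel might be chosen is to rank loci by how strongly they differ between two populations and keep the top few. The sketch uses a simple per-locus FST for two populations of equal weight; this is a stand-in chosen for illustration, not the selection method reported by FishPopTrace, and the allele frequencies and function names are hypothetical.

```python
def locus_fst(p1: float, p2: float) -> float:
    """Per-locus FST for a biallelic marker in two equally weighted populations."""
    p_mean = (p1 + p2) / 2.0
    h_total = 2.0 * p_mean * (1.0 - p_mean)                      # pooled heterozygosity
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        return 0.0  # locus is monomorphic overall and carries no information
    return (h_total - h_within) / h_total


def top_markers(freqs_a, freqs_b, panel_size):
    """Indices of the panel_size loci with the highest FST, best first."""
    ranked = sorted(range(len(freqs_a)),
                    key=lambda i: locus_fst(freqs_a[i], freqs_b[i]),
                    reverse=True)
    return ranked[:panel_size]


if __name__ == "__main__":
    # Hypothetical reference-allele frequencies at four SNPs in two populations.
    pop_a = [0.5, 0.9, 0.6, 1.0]
    pop_b = [0.5, 0.1, 0.4, 0.0]
    print(top_markers(pop_a, pop_b, panel_size=2))
```

Changing panel_size mirrors the flexibility described above: weakly differentiated populations would call for more markers, strongly differentiated ones for fewer.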