Exploiting Genomes to Identify Organisms


The use of genomic DNA to identify and distinguish organisms began in practice in the early 1980s, coinciding with the advent of recombinant DNA technologies and the ability to sequence DNA.

Apart from the implications for basic research (reconstruction of phylogenetic trees and of the evolution of life on Earth), there are obvious advantages in deriving an identification code from an abundant, chemically stable biological macromolecule that is present as a faithful copy in every cell of the organism.

The arrival of the genomic era, the availability of the polymerase chain reaction (PCR), a simple and robust way of amplifying short DNA sequences, and new methods for whole-genome amplification have now created the conditions for the large-scale use of DNA tracing technologies to improve people's everyday lives. This is the mission of GENTRAS, the first European company with this exclusive focus.

Both non-coding and coding sequences of genomes can be targeted by different approaches to generate discrimination assays with high sensitivity and increasing degrees of specificity. Depending on the genomic sequences targeted and the DNA assay employed, the specificity ranges from individual to interspecies identification, passing through sex, consanguinity, race (or strain or cultivar when the organisms studied are, respectively, microorganisms or plants) and species.

The inverted pyramid of DNA tracing tools on which the GENTRAS technological platform is based. The progressive levels of organism aggregation are shown on the right, with the preferred approach for distinguishing each level aligned on the left.

Eukaryotic genomes are composed of three types of sequences: non-repetitive sequences, which are unique; moderately repetitive sequences, which are dispersed and repeated a small number of times as related but non-identical copies; and highly repetitive sequences, which are short and usually repeated as a tandem array. Most structural genes are located in non-repetitive DNA. The coding sequences of the 23,000 human genes make up about 1% of the genome, and about the same share can be assigned to their regulatory sequences, giving a total of only 2% of the full DNA.

Composition of the human genome. The percentage shares of various functional and non-functional sequences are shown.

The repetitive sequences are divided into interspersed repeats and tandem repeats. The latter include satellites, minisatellites (also called variable number tandem repeats) and microsatellites (also called short tandem repeats, STRs), which occupy 3% of the human genome. STRs are tandem repeats of a short unit, generally 1 to 6 nucleotides long. They tend to be unstable during meiosis and mitosis, because DNA replication errors readily change the number of repeats. In the human genome, a combination of 13 STRs is enough to uniquely identify a single individual.
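The idea of an STR-based profile can be illustrated with a minimal Python sketch. It is an illustration only, not the GENTRAS assay: the locus names (`locus_A`, `locus_B`), the motifs and the toy sequences are hypothetical, and each locus is reduced to a single repeat count, whereas a real diploid profile records two alleles per locus.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive copies of `motif` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


def str_profile(amplicons: dict, motifs: dict) -> dict:
    """Map each locus name to the repeat count found in its amplified sequence."""
    return {
        locus: longest_tandem_run(amplicons[locus], motif)
        for locus, motif in motifs.items()
    }


# Hypothetical loci and repeat units, chosen only for illustration.
motifs = {"locus_A": "GATA", "locus_B": "CA"}

# Toy amplified sequences for two samples (flanking bases + tandem array).
sample_1 = {"locus_A": "TT" + "GATA" * 8 + "CC", "locus_B": "G" + "CA" * 12 + "T"}
sample_2 = {"locus_A": "TT" + "GATA" * 9 + "CC", "locus_B": "G" + "CA" * 12 + "T"}

profile_1 = str_profile(sample_1, motifs)
profile_2 = str_profile(sample_2, motifs)

# Two samples are told apart as soon as one locus differs in repeat number.
same_individual = profile_1 == profile_2
```

Here the two samples share the repeat count at `locus_B` but differ at `locus_A`, so the profiles do not match. Adding more loci makes a chance match between unrelated profiles progressively less likely.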

The human genome is largely uniform: only about 0.3% of its sequence differs between individuals in a population. This small proportion consists of small insertions and deletions but, above all, of single nucleotide polymorphisms (SNPs): changes at a single base, typically with two alternative alleles, that occur on average every 300 bases and number about 10 million in the human genome. SNPs can also be used to distinguish between individuals.
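The SNP figures quoted above are consistent with each other, as a short Python sketch can show. The genome size of roughly 3 billion bases is an assumption not stated in the text, and the SNP identifiers and genotypes are hypothetical toy data.

```python
# Assumption: approximate size of the haploid human genome, in bases.
GENOME_SIZE_BP = 3_000_000_000
# From the text: on average one SNP every 300 bases.
BASES_PER_SNP = 300

# One SNP per 300 bases over ~3 billion bases gives ~10 million SNPs.
expected_snps = GENOME_SIZE_BP // BASES_PER_SNP


def differing_sites(genotype_a: dict, genotype_b: dict) -> list:
    """Return the SNP identifiers, typed in both samples, where genotypes differ."""
    shared = genotype_a.keys() & genotype_b.keys()
    return sorted(snp for snp in shared if genotype_a[snp] != genotype_b[snp])


# Hypothetical genotypes: each SNP is written as its pair of alleles.
individual_1 = {"snp_1": "AA", "snp_2": "AG", "snp_3": "CC"}
individual_2 = {"snp_1": "AA", "snp_2": "GG", "snp_3": "CT"}

mismatches = differing_sites(individual_1, individual_2)
```

In this toy comparison the two individuals agree at `snp_1` and differ at `snp_2` and `snp_3`. Because each SNP typically has only two alleles, a single site carries little information, so many sites need to be typed together to separate individuals reliably.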
