Deoxyribonucleic acid (DNA). This is the basis for the genetic code present
in each cell. It is made of four units called bases:
Adenine (A), Guanine (G), Cytosine (C) and Thymine (T).
A DNA sequence is a particular ordering of A, T, G and C, such as ATTGCCATT.
The whole DNA sequence present in each cell has around 10^9 bases. The
complete sequence is called the genome. If printed on paper,
the print-out would extend from the earth to the moon.
It seems that only 5% of the DNA in the genome codes for
proteins. The portion that codes for a specific protein is called a gene. It
is estimated that there are 50,000 to 100,000 different genes.
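As a rough back-of-the-envelope check, the figures quoted above can be
combined to estimate how much coding DNA an average gene would have. The
sketch below uses only the numbers in the text (about 10^9 bases, 5% coding,
50,000 to 100,000 genes). The constant and function names are illustrative,
and the result is an order-of-magnitude estimate, not a measured value.

```python
# Figures quoted in the text (orders of magnitude, not measurements).
GENOME_BASES = 10**9                 # about 10^9 bases in the genome
CODING_PERCENT = 5                   # about 5% of the DNA codes for proteins
GENE_COUNT_ESTIMATES = (50_000, 100_000)


def coding_bases_per_gene(genome_bases, coding_percent, gene_count):
    """Average number of protein-coding bases per gene (integer arithmetic)."""
    coding_bases = genome_bases * coding_percent // 100
    return coding_bases // gene_count


for gene_count in GENE_COUNT_ESTIMATES:
    average = coding_bases_per_gene(GENOME_BASES, CODING_PERCENT, gene_count)
    print(f"{gene_count} genes -> about {average} coding bases per gene")
```

With these inputs the average comes out between 500 and 1,000 coding bases
per gene.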
To determine where the genes are in the genome, different
approaches are taken.
1) The Physical Map
The genome is very long. Therefore, it has been cut into very small pieces
and these pieces have been put into viruses, bacteria or fungi to replicate
rapidly. The collections of genomic pieces are called Gene Banks.
We also know that the genome is divided into chromosomes that contain big
chunks of coding DNA (humans have 46 chromosomes and pigs have 38).
If we know the gene product, a protein, the corresponding DNA can be
synthesized. A chemical signal is attached to this specific DNA (this is
called a DNA probe) and when this probe is mixed with chromosomes, it binds
with the corresponding gene on a specific chromosome.
The DNA probe is also used with the Gene Banks, and makes it possible to
extract from those banks only the pieces of genome that contain this
specific gene.
Those small portions of DNA are then sequenced to give the exact coding for
the protein (not only the code for the structure of the protein but also the
code for regulating its synthesis).
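The probe screening described above can be pictured as a search through the
Gene Bank for pieces that contain the probe's target. The sketch below is a
toy model under loud assumptions: the clone names and sequences are invented,
each piece is stored as one strand of a double-stranded fragment, and binding
is modeled as an exact match, whereas real hybridization tolerates some
mismatches.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(sequence):
    """Return the sequence of the opposite strand, read in its own direction."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


def screen_gene_bank(gene_bank, probe):
    """Return the names of the pieces that the probe would bind to.

    A piece is double-stranded, so the probe can bind either strand: the
    stored strand must contain the probe's sequence or its reverse complement.
    """
    target = reverse_complement(probe)
    return [
        name
        for name, fragment in gene_bank.items()
        if probe in fragment or target in fragment
    ]


# Hypothetical gene bank: invented clone names and sequences.
gene_bank = {
    "clone_1": "GGATTGCCATTAC",   # contains the probe sequence itself
    "clone_2": "CCCCGGGGTTTT",    # unrelated piece
    "clone_3": "TTAATGGCAATCC",   # contains the reverse complement
}

print(screen_gene_bank(gene_bank, "ATTGCCATT"))  # -> ['clone_1', 'clone_3']
```

Only the pieces returned by the screen would then go on to sequencing.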
Using that approach, out of the expected 100,000 genes, maybe 4,000 have
been sequenced and put on a specific chromosome in humans. Progress is
rapid, and in 10 years all the genes should be localized on human
chromosomes.
Gene Banks in which the genome has been cut differently, so that they
contain overlapping pieces of DNA, have also been made. Using DNA probes for
known genes, it is possible to determine the order of all the overlapping
pieces. This is the Gene Map. It has some landmarks, but most of the content
of each DNA segment is still unknown. It needs to be refined.
For example, the sequence around a specific gene can be determined. In the
genome, the beginning of the coding region of a gene is signaled.
Therefore, the presence of another gene in the vicinity of the first gene
can be determined, and this new gene can then be sequenced. The DNA probe
for that new gene can be used to localize this gene to a chromosome and to
detect new overlapping segments in the gene banks.
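The idea of ordering overlapping pieces with probes can be sketched as
follows. Two pieces that bind the same probe must overlap, so the pieces can
be chained from one end to the other. This is a minimal illustration, not the
method used by any particular mapping project: the clone and probe names are
hypothetical, and the code assumes the pieces form one simple chain in which
each piece overlaps only its immediate neighbours.

```python
def order_clones(probe_hits):
    """Order clones so that neighbours in the list share at least one probe.

    probe_hits maps a clone name to the set of probes that bind to it.
    Raises ValueError if the clones do not form a single simple chain.
    """
    names = list(probe_hits)
    neighbours = {name: set() for name in names}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if probe_hits[first] & probe_hits[second]:   # shared probe = overlap
                neighbours[first].add(second)
                neighbours[second].add(first)

    # The two ends of the chain are the clones with exactly one neighbour.
    ends = sorted(name for name, linked in neighbours.items() if len(linked) == 1)
    if len(ends) != 2:
        raise ValueError("clones do not form a single simple chain")

    order = [ends[0]]          # direction is arbitrary: start at one end
    previous = None
    while len(order) < len(names):
        current = order[-1]
        candidates = neighbours[current] - {previous}
        if len(candidates) != 1:
            raise ValueError("clones do not form a single simple chain")
        previous = current
        order.append(candidates.pop())
    return order


# Hypothetical screening results: which probes bind to which clone.
probe_hits = {
    "clone_C": {"p3", "p4"},
    "clone_A": {"p1", "p2"},
    "clone_D": {"p4", "p5"},
    "clone_B": {"p2", "p3"},
}

print(order_clones(probe_hits))  # -> ['clone_A', 'clone_B', 'clone_C', 'clone_D']
```

The probes act as the landmarks of the map: the order of the pieces is known
even though most of the sequence inside each piece is not.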
2) The Genetic Map
The other way is to use family studies. In each individual, chromosomes
come in pairs, and a specific gene is present on each chromosome of a given
pair. Sometimes, the gene on one chromosome codes for a slightly different
product than the gene on the other chromosome. This is called polymorphism.
During reproduction, one version of the gene or the other is passed on to
the offspring and it is possible to determine the variants present in one
individual by studying enough of his descendants.
If you know the DNA sequence of the gene, you can use DNA probes to
determine the variants. If you know only the protein variants (like the ABO
blood groups), use those to determine the presence or absence of variants in
Genes that are close together in the genome, on the same chromosome, are
passed on together to the offspring. If they are close enough, variant 1 of
gene A and variant 2 of gene B will always be together in the offspring. If
those two genes are not that close, the different variants of gene A will be
found with various variants of gene B in the offspring.
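How close two genes are can be judged by counting how often the parental
combinations of variants are broken up in the offspring. The sketch below
computes that fraction for invented data. It assumes we already know which
variants sit together on each chromosome of the parent, and that the
combination each offspring inherited from that parent can be read directly.
Both are simplifications, and all names are illustrative.

```python
def recombination_fraction(parent_combinations, offspring_combinations):
    """Fraction of offspring that carry a combination not present in the parent.

    parent_combinations: the two (variant of gene A, variant of gene B) pairs
        carried on the parent's two chromosomes.
    offspring_combinations: the pair each offspring inherited from that parent.
    """
    parental = set(parent_combinations)
    recombinants = sum(
        1 for combination in offspring_combinations if combination not in parental
    )
    return recombinants / len(offspring_combinations)


# Hypothetical parent: A1 travels with B2 on one chromosome, A2 with B1 on the other.
parent = [("A1", "B2"), ("A2", "B1")]

# Genes very close together: the parental combinations are never broken up.
tight = [("A1", "B2")] * 5 + [("A2", "B1")] * 5
print(recombination_fraction(parent, tight))   # -> 0.0

# Genes further apart: some offspring carry new combinations.
loose = [("A1", "B2")] * 3 + [("A2", "B1")] * 3 + [("A1", "B1"), ("A2", "B2")]
print(recombination_fraction(parent, loose))   # -> 0.25
```

A fraction near zero suggests the two genes sit close together, while a
larger fraction suggests they are further apart.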
Next, you take a physiological trait or a genetic disease with no known gene
product and check, in family studies, whether it is always associated with a
variant of a known gene. You repeat this with variants of other genes. If
there is an association, you say that these genes form a linkage group.
This is the Genetic Map.
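The search for an association between a trait and gene variants can be
sketched as a scan over the family, gene by gene. The data layout and names
below are hypothetical, and the rule used (one variant carried by every
affected member and by no unaffected member) is a deliberately strict
simplification. Real studies rely on statistical tests across many families.

```python
def markers_associated_with_trait(family):
    """Return {gene: variant} for genes whose variant tracks the trait exactly.

    family is a list of members, each a dict with:
      "affected": True or False
      "variants": {gene name: variant inherited from the affected parent}
    """
    associated = {}
    for gene in family[0]["variants"]:
        in_affected = {m["variants"][gene] for m in family if m["affected"]}
        in_unaffected = {m["variants"][gene] for m in family if not m["affected"]}
        # Keep the gene if all affected members share a single variant
        # that no unaffected member carries.
        if len(in_affected) == 1 and not (in_affected & in_unaffected):
            associated[gene] = next(iter(in_affected))
    return associated


# Hypothetical family study with two known genes.
family = [
    {"affected": True,  "variants": {"gene_A": "A1", "gene_B": "B1"}},
    {"affected": True,  "variants": {"gene_A": "A1", "gene_B": "B2"}},
    {"affected": True,  "variants": {"gene_A": "A1", "gene_B": "B1"}},
    {"affected": False, "variants": {"gene_A": "A2", "gene_B": "B1"}},
    {"affected": False, "variants": {"gene_A": "A2", "gene_B": "B2"}},
    {"affected": False, "variants": {"gene_A": "A2", "gene_B": "B2"}},
]

print(markers_associated_with_trait(family))  # -> {'gene_A': 'A1'}
```

In this invented family, variant A1 of gene_A travels with the trait, so
gene_A would serve as a marker, while gene_B shows no such association.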
You do not know the basis for the physiological trait or the genetic disease
but you know the association. The associated genes are called markers (they
are associated with the unknown gene on the DNA strand, but might not be
responsible for the expression of the unknown gene).
The next step is to use the DNA sequence of one of the markers in the
linkage group to go back to the PHYSICAL MAP and find its chromosome
location. Then, you look on the same chromosome for other genes in the area,
and use them for new family studies to refine the linkage group.
You can also use the DNA sequence of one of the markers as a probe on the
Gene Bank to get the piece of DNA that contains the linkage group, then start
sequencing and finding new genes that can be used for new family studies.
If you find a gene that would have a relation to the physiological trait or
the genetic disease, you call it a candidate gene, and if it happens that a
polymorphism in the DNA sequence or a mutation is always related to what you
are looking for, you have found the gene!
That approach (Family Studies, Markers, Gene Bank, Candidate Gene) was the
one used for finding the gene for Cystic Fibrosis!
When you have identified the new gene, you use it to expand the linkage and
physical maps again and to find new locations for other genes of interest.
As time progresses, the genetic map (family studies on the transmission of
variants of certain genes and their association) and the physical map (use
of DNA probes for determining the presence of a gene in a specific segment
in a gene bank, and its localization on a specific chromosome) will be
merged together to give an overall picture of the genome.
This will provide markers for DNA-assisted selection.