Deoxyribonucleic acid. This is the basis for the genetic code present

in each cell. It is made of units called bases. There are four of them:

Adenine (A), Guanine (G), Cytosine (C) and Thymine (T).

DNA sequence:

This is the order of the bases A, T, G and C along the strand, for example ATTGCCATT.

The whole DNA sequence present in each cell has around 10^9 bases. The

complete sequence is called the genome. If printed on paper,

the print-out would extend from the earth to the moon.
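As a small illustration of the sequence notation above, here is a minimal Python sketch that treats a DNA sequence as a string and counts its bases. The function name `base_counts` is only an illustrative choice, not something taken from a specific library.

```python
from collections import Counter

BASES = set("ATGC")  # the four bases named in the text


def base_counts(sequence: str) -> Counter:
    """Count each base in a DNA sequence written as a string of A, T, G and C."""
    seq = sequence.upper()
    unknown = set(seq) - BASES
    if unknown:
        # Anything outside A, T, G, C is rejected in this simplified sketch.
        raise ValueError(f"not a DNA base: {sorted(unknown)}")
    return Counter(seq)


counts = base_counts("ATTGCCATT")  # the example sequence from the text
print(dict(counts))
```

A genome is the same kind of string, only very much longer.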

Coding DNA:

It seems that only 5% of the DNA in the genome codes for

proteins. The portion that codes for a specific protein is called a gene. It

is estimated that there are 50,000 to 100,000 different genes.

Genetic Mapping:

To determine where the genes are in the genome, different

approaches are taken.

1) The Physical Map

The genome is very long. Therefore, it has been cut into very small pieces

and these pieces have been put into viruses, bacteria or fungi to replicate

rapidly. The collections of genomic pieces are called Gene Banks.

We also know that the genome is divided into chromosomes that contain big

chunks of coding DNA (humans have 46 chromosomes and pigs have 38).


If we know the gene product, a protein, the corresponding DNA can be

synthesized. A chemical signal is attached to this specific DNA (this is

called a DNA probe) and when this probe is mixed with chromosomes, it binds

with the corresponding gene on a specific chromosome.

The DNA probe is also used with the Gene Banks, where it makes it possible to extract

only the pieces of genome that contain this specific gene.

Those small portions of DNA are then sequenced to give the exact coding for

the protein (not only the code for the structure of the protein but also the

code for regulating its synthesis).
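The probe screening described above can be sketched in Python. This is a simplified model, assuming that each piece in the gene bank is given as the sequence of one of its two strands and that the probe binds only to a perfectly complementary stretch (real hybridization tolerates some mismatches). The clone names and sequences are made up for illustration.

```python
COMPLEMENT = str.maketrans("ATGC", "TACG")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read in its own direction."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_binds(probe: str, fragment: str) -> bool:
    """True if the probe can pair with either strand of the fragment.

    The probe pairs with the written strand where that strand contains the
    probe's reverse complement, and with the opposite strand where the
    written strand contains the probe itself.
    """
    return reverse_complement(probe) in fragment or probe in fragment


# Hypothetical gene bank: clone name -> sequence of the cloned piece.
gene_bank = {
    "clone_1": "GGATCCATTGCCATTAAG",
    "clone_2": "TTTTCCCCGGGGAAAA",
    "clone_3": "CTTAATGGCAATGGATCC",
}

probe = "ATTGCCATT"
hits = [name for name, seq in gene_bank.items() if probe_binds(probe, seq)]
print(hits)
```

Only the clones returned in `hits` would then be sequenced.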

Using that approach, out of the expected 100,000 genes, maybe 4,000 have

been sequenced and assigned to a specific chromosome in humans. Progress is being

made rapidly and in 10 years all the genes should be localized on human chromosomes.


Gene Banks have also been made in which the genome has been cut at different

places, so that they contain overlapping pieces of DNA. Using DNA probes for

known genes, it is possible to determine the order of all the overlapping

pieces. This is the Gene Map. It has some landmarks, but most of the content

of each DNA segment is still unknown, and it needs to be refined.
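One way to picture how probes give the order of overlapping pieces: two pieces that bind the same probe must overlap, so in the correct order every probe should hit a run of neighbouring pieces. The Python sketch below simply tries every order and keeps the first one that satisfies this rule. It is a toy approach that only works for a handful of pieces, and the clone and probe names are hypothetical.

```python
from itertools import permutations
from typing import Dict, List, Optional, Set


def hits_are_consecutive(order, hits: Dict[str, Set[str]], probe: str) -> bool:
    """True if the clones that bind this probe sit next to each other in the order."""
    positions = [i for i, clone in enumerate(order) if probe in hits[clone]]
    return positions[-1] - positions[0] + 1 == len(positions)


def order_clones(hits: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one clone order in which every probe hits neighbouring clones.

    Brute force over all orders, so only suitable for very small examples.
    The reverse of a valid order is equally valid.
    """
    probes = set().union(*hits.values())
    for order in permutations(sorted(hits)):
        if all(hits_are_consecutive(order, hits, p) for p in probes):
            return list(order)
    return None


# Hypothetical results: which probes bind to which clone.
probe_hits = {
    "clone_C": {"gene3", "gene4"},
    "clone_A": {"gene1", "gene2"},
    "clone_B": {"gene2", "gene3"},
}

clone_order = order_clones(probe_hits)
print(clone_order)
```

The probes shared between neighbouring clones are the landmarks of the map; what lies between them stays unknown until the pieces are sequenced.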

For example, the sequence around a specific gene can be determined. In the

genome, the beginning of the coding region of a gene is marked by a signal

in the sequence. Therefore, the presence of another gene in the vicinity of

the first gene can be detected, and this new gene can then be sequenced. The

DNA probe for that new gene can be used to localize this gene to a

chromosome and to detect new overlapping segments in the gene banks.

2) The Genetic Map

The other way is to use family studies. In each individual, chromosomes

come in pairs, and a specific gene is present on each chromosome of a given

pair. Sometimes, the gene on one chromosome codes for a slightly different

product than the gene on the other chromosome. This is called polymorphism.

During reproduction, one version of the gene or the other is passed on to

the offspring and it is possible to determine the variants present in one

individual by studying enough of its descendants.

If you know the DNA sequence of the gene, you can use DNA probes to

determine the variants. If you know only the protein variants (like the ABO

blood groups), you use those to determine the presence or absence of variants in

the offspring.

Genes that are close together on the same chromosome are

passed on together to the offspring. If they are close enough, variant 1 of

gene A and variant 2 of gene B will always be together in the offspring. If

those two genes are not that close, the different variants of gene A will be

found with various variants of gene B in the offspring.
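The idea that close genes travel together can be put into numbers by counting how often the parental combination of variants is broken up in the offspring. The sketch below assumes a parent carrying variant 1 of gene A with variant 2 of gene B on one chromosome, and variant 2 of gene A with variant 1 of gene B on the other. The offspring data are invented for illustration.

```python
def recombination_fraction(parental, transmitted):
    """Share of offspring that received a new combination of variants.

    parental:    the two combinations carried on the parent's chromosomes
    transmitted: the combination each offspring received from that parent
    """
    parental_set = set(parental)
    recombinants = sum(1 for combo in transmitted if combo not in parental_set)
    return recombinants / len(transmitted)


# The parent's two chromosomes: (variant of gene A, variant of gene B).
parental = [("A1", "B2"), ("A2", "B1")]

# Hypothetical offspring: nine received a parental combination, one did not.
transmitted = (
    [("A1", "B2")] * 5
    + [("A2", "B1")] * 4
    + [("A1", "B1")]
)

fraction = recombination_fraction(parental, transmitted)
print(fraction)
```

A fraction of 0 means the two variants were always passed on together, as in the text. The further apart the genes are, the larger the fraction becomes.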

Then, in family studies, you look at a physiological trait or a genetic

disease with no known gene product and check whether it is always

associated with a variant of a known gene. You repeat with variants of other

genes. If there is an association, you say that these genes form a linkage

group. This is the Genetic Map.
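The search for an association can be sketched as a scan over known genes: for each one, check whether all offspring showing the trait received the same variant and no offspring without the trait received it. This all-or-nothing rule is a deliberate simplification (real family studies rely on statistics over many families), and the marker names and family data below are hypothetical.

```python
def cosegregating_markers(offspring):
    """Return {marker: variant} for markers whose variant always goes with the trait.

    Each offspring is a dict with:
      "trait":    True if the trait or disease is present
      "variants": {marker name: variant received from the studied parent}
    """
    result = {}
    for marker in offspring[0]["variants"]:
        with_trait = {o["variants"][marker] for o in offspring if o["trait"]}
        without_trait = {o["variants"][marker] for o in offspring if not o["trait"]}
        # One single variant among carriers of the trait, never seen without it.
        if len(with_trait) == 1 and not (with_trait & without_trait):
            result[marker] = next(iter(with_trait))
    return result


# Hypothetical family: the trait follows variant v1 of marker_1 only.
family = [
    {"trait": True, "variants": {"marker_1": "v1", "marker_2": "v1"}},
    {"trait": True, "variants": {"marker_1": "v1", "marker_2": "v2"}},
    {"trait": False, "variants": {"marker_1": "v2", "marker_2": "v1"}},
    {"trait": False, "variants": {"marker_1": "v2", "marker_2": "v2"}},
]

linked = cosegregating_markers(family)
print(linked)
```

Here `marker_1` would be placed in the same linkage group as the unknown gene, while `marker_2` would not.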

You do not know the basis for the physiological trait or the genetic disease

but you know the association. The associated genes are called markers (they

are associated with the unknown gene on the DNA strand, but might not be

responsible for the expression of the unknown gene).

The next step is to use the DNA sequence of one of the markers in the

linkage group to go back to the PHYSICAL MAP and find its chromosome

location. Then, you look on the same chromosome for other genes in the area,

and use them for new family studies to refine the linkage group.

You can also use the DNA sequence of one of the markers as a probe in the Gene Bank to

get the piece of DNA that contains the linkage group, and start sequencing

and finding new genes that can be used for new family studies.

If you find a gene that could be related to the physiological trait or

the genetic disease, you call it a candidate gene, and if it happens that a

polymorphism in the DNA sequence or a mutation is always related to what you

are looking for, you have found the gene!

That approach (Family Studies, Markers, Gene Bank, Candidate Gene) was the

one used for finding the gene for Cystic Fibrosis!

When you have identified the new gene, you use it to further expand the

linkage and physical maps to find new locations for other genes of interest.

As time progresses, the genetic map (family studies on the transmission of

variants of certain genes and their association) and the physical map (use

of DNA probes for determining the presence of a gene in a specific segment

in a gene bank, and its localization on a specific chromosome) will be

merged together to give an overall picture of the genome.

This will provide markers for DNA-assisted selection.
