The Human Genome Project
Yossi Segal, Ph.D.
Foreword
The concept of deciphering the human genome surfaced in the United States in the 1930’s, following the discovery that color blindness and hemophilia are linked to the X chromosome. The Human Genome Project (HGP) originated at the US Department of Energy (DOE) meeting in 1984, when the possible use of DNA analysis in detecting mutations among atomic bomb survivors was contemplated.
Following lengthy deliberations, the US government approved the project and in 1988 the HGP was launched under the supervision of the National Institutes of Health (NIH) and the DOE. In 1990 it was shaped into a fifteen-year program, designed to map and sequence the entire human genome and the genomes of several model organisms. Today, the HGP involves the collaboration and contributions of many countries and several international organizations in nucleotide sequencing, in the dissemination of knowledge and information about it, and in the support of scientific and technological activities.
The following provides a concise description of the evolution of the HGP, its goals, the countries and organizations involved, and the activities and progress thereof.
The expectations of the HGP, as held by the scientific, therapeutic and biotechnological communities, as well as by society as a whole, are enormous, as are the benefits (see Appendix 1). However, the knowledge obtained from the HGP could, and most probably would, result in the establishment of a genetic identification for each person, harboring great risk to the individual and to society if not used prudently and under the strictest of regulations (see Appendix 2).
Evolution
The concept of human genome mapping originated in the 1930’s. In the 1970’s, RFLP (restriction fragment length polymorphism) marker discoveries accelerated the pace, which was boosted again with the discovery of STRP (short tandem repeat polymorphism) markers in the late 1980’s.[1] With the aid of STRPs, the entire genetic map has been saturated by markers – so much so that the new maps incorporate over 3,600 STRPs, 400 genes, and 1,800 other markers.
Genetic maps have helped localize over forty genes, including those for cystic fibrosis, fragile X syndrome, myotonic dystrophy, certain types of colon and breast cancer, ataxia telangiectasia, and Alzheimer’s disease. The application of gene therapy has been in progress since 1990 (see Appendix 1), but by the end of 1997, only approximately three percent of the human genome had been sequenced. The short-term goal of genetic mapping has been accomplished, but there are still too many gaps and insufficient anchor points at chromosomal telomeres and centromeres.
The Process of Deciphering The Human Genome
1. Experimental Procedures
a) Since DNA varies from one individual to another by roughly 1 nucleotide per 500, cutting DNA with restriction enzymes produces a polymorphic pattern of fragments. RFLPs that are inherited together with particular traits can then serve as markers in genetic mapping.[2]
b) Pulsed-field gel electrophoresis (PFGE)[3] enables separation of large DNA fragments up to 10 million bp (base pairs).
c) Polymerase chain reaction (PCR)[4] enables a manifold amplification of a DNA sequence, providing working means for analyzing minute amounts of DNA.
d) Yeast artificial chromosome (YAC)[5] enables cloning of large DNA segments up to 1 million bp.
e) Sequence-tagged site (STS),[6] the common mapping language, is a short (100-1000 bp) DNA segment, unique in the genome, defined by a pair of PCR primers. Genomatron[7] is an automated system that can screen hundreds of STSs in hours.
f) The “positional candidate” strategy is predicted to become the major technique for identifying disease genes. The approach is based on an efficient three-step process: i) localizing a disease gene to a chromosomal subregion (using traditional linkage analysis); ii) searching databases for an attractive candidate gene within that subregion; and iii) testing the candidate gene for disease-causing mutations. By the first quarter of 1995, it is believed to have helped identify more than fifty disease genes.
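The RFLP principle in item (a) can be illustrated with a minimal sketch. The two sequences below are invented for the example, and the function name `digest` is hypothetical; GAATTC is the recognition site of the restriction enzyme EcoRI, which cuts after the first G. A single-base change that destroys one site merges two fragments into one, and that length difference is what shows up as a polymorphic fragment pattern.

```python
# Illustrative only: the alleles are made-up sequences, not real genomic data.

def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return the fragment lengths obtained by cutting at every occurrence of `site`.

    `cut_offset` is the position inside the site where the enzyme cuts
    (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)  # the piece after the last cut
    return fragments


ECORI_SITE = "GAATTC"

# Allele A carries two EcoRI sites.
allele_a = "ATGC" * 5 + "GAATTC" + "TTAGC" * 4 + "GAATTC" + "CCGTA" * 3
# Allele B differs by one nucleotide (A -> G), which destroys the first site.
allele_b = "ATGC" * 5 + "GAGTTC" + "TTAGC" * 4 + "GAATTC" + "CCGTA" * 3

fragments_a = digest(allele_a, ECORI_SITE, cut_offset=1)
fragments_b = digest(allele_b, ECORI_SITE, cut_offset=1)

print("allele A fragments:", fragments_a)  # [21, 26, 20]
print("allele B fragments:", fragments_b)  # [47, 20]
```

In the laboratory the fragments would be separated by gel electrophoresis rather than computed, so the sketch only models the pattern a gel would reveal.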
2. The Undertaking of the Human Genome Project
Following the founding of the HGP in 1984, the effort to sequence the entire human genome began. It was advocated by several scientists, including Robert Sinsheimer (then chancellor of the University of California, Santa Cruz), Charles DeLisi (DOE) and Renato Dulbecco (then president of the Salk Institute).
In September of 1986, a National Research Council committee was asked to determine whether the HGP should be advanced. In February of 1988, the committee recommended its implementation, with the NIH playing a central role. Two months later, another committee, appointed by the US Congress Office of Technology Assessment, released a report supporting the recommendation of the National Research Council committee. That same year, Congress appropriated $17.3 million to the NIH and $11.8 million to the DOE for genome research. An NIH office, the Center for Human Genome Research, was created. It was later renamed the National Center for Human Genome Research (NCHGR).
In early 1990, the NIH and DOE, partners in managing the HGP, presented to Congress a five-year term program, coordinated by the joint Subcommittee on the Human Genome, with seven major goals:
a) To develop maps of human chromosomes.
b) To improve technology for DNA sequencing.
c) To map and sequence the DNA of selected model organisms (mouse, Caenorhabditis elegans, Drosophila melanogaster, Saccharomyces cerevisiae, Escherichia coli).
d) To collect, manage and distribute data (bioinformatics).
e) To study the legal, social and ethical issues involved and to develop policy options.
f) To develop and improve technology.
g) To facilitate the transfer of technology.
A number of bioinformatics databases have been created, such as the Genome Data Base, which specializes in human genetic maps, supported by the NIH and DOE at the Johns Hopkins University Welch Medical Library in Baltimore.[8]
A program for the ethical, legal, and social implications, ELSI, has been launched.
Progress for the first five-year period was right on schedule, especially genetic mapping and sequencing of model organisms, while sequencing techniques are being progressively improved. The results of the linkage map were published in the “Index Marker Catalog” of the NCHGR, and complete mapping with 10-15 cM (centimorgans) resolution was completed in 1993.
National and International Activities in the HGP
The US leads in terms of effort, cost support, and results, followed by France, the UK and, to a lesser extent, Japan. Specific programs in the HGP have been introduced in other countries, including Australia, Belgium, Canada, Denmark, Germany, Israel, Italy, the Netherlands, Russia, and Sweden. In addition, there is growing collaboration of international organizations, especially in dissemination of information and informatics (data banks) and in mapping and sequencing model organisms.
Three major international organizations are involved in the HGP: the Human Genome Organization (HUGO), which is primarily involved in disseminating knowledge in various forms – informatics, information, meetings, and workshops; the Commission of the European Community, which is engaged in processing various functions via its framework programs of science and biotechnology; and the United Nations Educational, Scientific and Cultural Organization (UNESCO), which establishes international bioinformatics networks and, via its International Bioethics Committee, examines the legal, social and ethical aspects of the HGP at the international level.
The support for the activities of the HGP primarily comes from governments and, most notably in the UK and France, from charities.
1. The United States
In the US, where it was initiated, the HGP is jointly coordinated by the NIH and DOE. There are 21 centers involved (see Bibliography, number 1), working on the following projects:
a) Physical mapping of human chromosomes 3, 4, 5, 6, 11, 12, 13, 15, 16, 17, 19, 21, X.
b) Physical mapping of mouse chromosomes 11 and X.
c) Complete mapping of the D. melanogaster genome.
d) Mapping of S. cerevisiae[9] chromosomes IV and V as part of an international effort to sequence the entire yeast genome.
e) The entire genome of E. coli K12 strain MG 1655 (4.7 Mb [million base pairs]).[10]
f) The entire genome of E. coli.
g) The complete genomic sequence of the nematode C. elegans, in collaboration with the Sanger Center (UK).
h) The complete sequence of two bacteria, Haemophilus influenzae[11] and Mycoplasma genitalium[12] (published May 1995).
The NCHGR and DOE are examining the means by which the beneficial, therapeutic aspects of the HGP, along with the legal, social, and ethical problems embodied by it, can be reconciled.
2. France
The combined efforts of two centers, Genethon (annual budget: £6.5 million), an automated laboratory facility supported by the French Muscular Dystrophy Association, and Centre d’Etude Polymorphisme Humain (CEPH; government- and charity-funded), have placed France alongside the US at the front of the human genome research effort. In 1992, Genethon completed mapping 28% of human chromosome 21 and provided about 2,000 genetic markers. In 1993, it published a first-generation map (low resolution) of 90% of the human genome. In collaboration with the US, the CEPH program provides investigators with DNA from cultured lymphoblast cell lines derived from a reference panel of more than sixty nuclear families in France, Venezuela, Pennsylvania, and Utah. Other sources of funding include a television appeal (FF300 million were raised in one night).
The EUREKA Labimap 2001 Genome Project, a four-year project whose goal is to develop and commercialize automated sequencing machines, was initiated in 1988. It is jointly run by France and the UK and involves the CEPH, UK Imperial Cancer Research Fund, and Amersham International, coordinated by Bertin et Cie. It will be the largest center in the world for gene analysis. It aims to collect blood samples from 25,000 disease families and map the disease genes. In 1990, support for this center was £15.6 million.
In 1988, the Ministry of Research and Technology funded the Genome Concerted Actions Programme, which will also support the EUREKA Labimap 2001 Project. In the fall of 1990 it launched the French Human Genome Research Programme, whose main task is to coordinate the efforts of the different bodies, including the automation of molecular biological methods (e.g., through the EUREKA Labimap 2001 Programme).
Other projects include mapping human chromosome 20, currently in progress at the Institut Pasteur but due to transfer to the Genethon Centre. Other human chromosomes may be included later.
Model organisms included mouse genome, E. coli and Bacillus subtilis, C. elegans, and the Arabidopsis thaliana plant.
At the Cold Spring Harbor meeting (May 1995), Jean Morisette of Genethon presented an almost complete human genetic linkage map, covering the entire human genome with 5,300 markers (2.5 times the number available eighteen months prior), and Isabelle Gall of the CEPH presented an improved physical map, comprising overlapping sequences of mega-YACs, covering up to 75% of the human genome with 33,000 YACs. (For detailed information on activities in France see Bibliography, number 2).
3. United Kingdom
The UK assumes an important role in the HGP. Most of the work in the Human Genome Mapping Project (HGMP) is conducted in academic institutions, specific genome centers, and university departments. Some commercial activity is beginning to surface. Support comes from the government, the Wellcome Trust, and the Imperial Cancer Research Fund.
Government support, via the Medical Research Council (MRC), was, in 1989, £11 million for a three-year period and £4.5 million per year in 1992. In 1993, an additional £1.5 million per year were allocated to promote work on comparative mapping.
The MRC-HGMP Resource Centre serves as a bioinformatics center for the human and mouse genomes. Services are free for scientists in return for data deposition, while industry pays a small fee to cover costs.
The Biotechnology and Biological Sciences Research Council is involved in identifying genes of commercial value (such as growth, disease resistance, etc.) and mapping several species, including pig (as part of the European PigMap effort), limited effort on chicken and cattle, and plants (A. thaliana, pea, wheat, barley, grasses, and oilseed rape).
The Wellcome Trust funded the establishment of two key centers: the Sanger Centre in Cambridge (physical mapping and sequencing of the human genome, the worm C. elegans, and yeasts) and a center in Oxford (a £13-million, five-year study on genetic diseases). To oversee and coordinate these activities and their funding, the Wellcome Trust established the Genetic Interest Group.
The Sanger Centre’s aim is to map 70,000 ESTs (expressed sequence tags) at intervals of 0.5 Mb (million base pairs) or less. To facilitate this process, The Institute for Genomic Research donated primers for 15,000 ESTs. The Imperial Cancer Research Fund made a major contribution in establishing clones (YAC libraries as well as cosmid, cDNA, and P1 libraries) for human, mouse, and yeast genomes, and supports the development of sequencing technology.
Primarily, the species evaluated in the UK are mouse and C. elegans; also evaluated are human, pig, chicken, cattle, yeast, and plants (A. thaliana).
The UK’s strength lies in the robotics-supported cloned DNAs immobilized on filters. It is the leader in throughput and the density of clones on the filters.
4. Japan
Japan demonstrates less awareness of and exposure to genetic diseases.
There are several informatics projects currently in progress in Japan. Since 1993, the Science and Technology Agency (STA) has operated a gene database at the Japan Information Center of Science and Technology. The DNA Data Bank of Japan, established at the National Institute of Genetics in 1986, is connected to GenBank (US) and the European Molecular Biology Laboratory (EMBL). The Genome Net is an inter-university network within Japan.
The Human Genome Center (HGC), dedicated to promoting research, was established at the University of Tokyo Medical Research Institute. It is supported by several government ministries, including the STA and the Ministry of Education, Science and Culture (Monbusho), and, to a lesser extent, the Ministry of Health and Welfare and the Ministry of Agriculture, Forestry and Fisheries (the latter because of the extension of the HGP to other organisms, including rice).
In 1981, the HGC began developing automated sequencing machines. Its budget, since 1988, is £7 million, half of which has been used for automated methods.
Overall support in 1991 was £1.2 million from the Ministry of Education, Science and Culture program, £3.7 million from the STA-WADA (automated sequencing machine) project, £0.3 million from the Silver Science Program Yoken Project, and £2.4 million from the Ministry of Agriculture, Forestry and Fisheries Rice Genome Project, totaling £7.6 million.
Model organisms are E. coli and, on a smaller scale, B. subtilis, S. cerevisiae, Schizosaccharomyces pombe, C. elegans, A. thaliana, and plant chloroplasts.
5. Israel
The major body directing the scientific activities pertaining to the HGP in Israel is the Israel Academy of Sciences and Humanities, assisted by its Advisory Committee on the Human Genome, nominated in March of 1991.
Within its capacity of promoting the advance of basic research in Israel, the Academy has been engaged in stimulating HGP activities along four different lines: establishing national centers, supporting and stimulating research through earmarked grants, attracting young scientists to the field through post-doctoral grants, and disseminating knowledge through meetings and workshops.
In 1993, two national centers, whose major tasks are to assist Israeli researchers in the human genome field, were launched: the Bioinformatics Genome Resource Center at the Weizmann Institute of Science in Rehovot and the National Laboratory for the Genetics of Israel Populations at Tel Aviv University.
The Bioinformatics Genome Resource Center serves as the Israel National Node within the EMBL’s International Nodes network and is connected to the European Bioinformatics Institute in Cambridge, England. In this capacity, the Bioinformatics Center provides scientists in Israel with access to the vast and rapidly growing genetic information – primarily that derived from the HGP – produced in the world.
The National Laboratory for the Genetics of Israel Populations will serve as one of the regional centers (the Israel International Center for Human Genetic Diversity) within the International Project of Human Gene Diversity. Israel has a unique advantage owing to the vast immigration from many countries with large genetic diversity. The Center is engaged in blood sampling and preserving (by immortalization) cell lines of various Israeli populations, and in the future, hopefully, of neighboring population groups. The material stored in the Center will be available for qualified scientists, nationally and internationally, to study the diverse genetic traits (normal and pathologic) for the advance of knowledge in various fields, including anthropology, biology, and medicine. Aware of the potential harmful risks involved as a result of misuse of the vast information accumulated there, the Center takes appropriate precautions to ensure the donors’ protection and well-being.
Israel’s main contribution to and activity in the HGP is with human chromosome 17. Professor Doron Lancet of the Weizmann Institute was nominated by HUGO to serve as editor of number 17 and, recently, of number 21; Professor Yoram Groner is joining the International 21st Chromosome Consortium.
Israel is engaged in cloning, mapping, sequencing, and evaluating several human genes, primarily of pathological implications and with Jewish orientation, such as ataxia telangiectasia, cystic fibrosis, Gaucher’s disease, and Tay-Sachs disease.
Expenditures for the various HGP activities during the 1997 fiscal year exceeded $1 million.[13] Israel received support from the government through the Planning and Budgeting Committee of the Council for Higher Education and from several charities.
6. International Organizations
a) The Human Genome Organization (HUGO)
HUGO was established in the US at the Cold Spring Harbor Meeting in 1988. Later, offices were set up in four countries: Switzerland, the US, England, and Japan. It was initially funded by several charity organizations, including the Howard Hughes Medical Institute, the Wesley Foundation, the Imperial Cancer Research Fund, and the Lucille P. Markey Charitable Trust, though now its administrative activities are covered by the governments of the offices’ host countries.
HUGO functions to coordinate the exchange of knowledge and data through seminars, workshops and training programs, and encourages public awareness of the scientific, medical, technological, legal, social, and ethical implications of the HGP.
b) The European Molecular Biology Laboratory (EMBL)
The EMBL is located in Heidelberg and funded by some, but not all, of its member countries. Its focus is in two areas: database coordination (informatics) and the development of instrumentation and techniques.
EMBL’s database activities include the Nucleotide Sequence Data Library (half the size of the US’s GenBank); EMBNET, connected to GenBank and the DNA Data Bank of Japan, which comprises more than fifteen nodes, including the Israeli node at the Weizmann Institute; and SWISS-PROT, a collection of amino acid sequences along with translations of coding sequences in the EMBL Nucleotide Sequence Data Library, developed at the University of Geneva and maintained by EMBL. SWISS-PROT is also connected to the protein sequence database in the US.
c) The United Nations Educational, Scientific and Cultural Organization (UNESCO)
UNESCO’s 1990-91 budget was $350,000. Its main focus is the dissemination of information and ethical and social issues, in pursuit of which it established the International Bioethics Committee in 1993.
d) Informatics
There are three main centers for informatics: EMBL in Heidelberg; the National Center for Biotechnology Information in Washington, DC, specializing in sequences; and the Genome Data Base in Baltimore (to be closed in 1998).
Information Data
1. The human genome consists of 23 pairs of chromosomes – 22 pairs of autosomes and 1 pair of sex chromosomes (X, Y) – comprising 3×10⁹ bp of DNA and about 100,000 genes in total, of which about 5,000 are disease genes.
2. The smallest human chromosome is Y, at 50 million bp. The largest human chromosome is 1, at 250 million bp. The average-sized gene is 30,000 bp, encoding a 1,000-amino acid protein.
3. Karyotype is the analysis of chromosomes through a microscope based on shape (size and banding pattern).
4. Mapping and sequencing the human genome involves dividing each chromosome into small segments which are then characterized and arranged sequentially on the chromosome. A genome map describes the order of genes, other known DNA segments of no known functional protein, and the spacing between them on each chromosome.
A genetic map depicts the order by which genes are arranged along a chromosome. The determination of such a sequence is facilitated by known markers (genes or other DNA stretches). Distances between markers are measured in centimorgans (cM). One genetic cM is about 1 million bp (in physical distance).
A genetic linkage map shows the relative location of a specific DNA marker along the chromosome. Markers must be polymorphic (DNA sequence variations occurring once every 300-500 bp) to be useful in mapping. Most variations occur in introns, whereas variations in exons could result in observable changes such as eye color. The human genetic linkage map is constructed by observing how frequently two markers are inherited together. The closer the markers are to one another on the same chromosome, the more tightly linked they are and the more likely they will be passed on together to the next generation, i.e., they will not be separated by recombination events. Hence, the distance between two markers can be determined. This can also assist in locating a gene, especially that of a genetic disease, on a chromosome. Genetic maps assisted in the chromosomal location of several inherited diseases, such as sickle cell anemia, cystic fibrosis, Tay-Sachs disease, fragile X syndrome, myotonic dystrophy, and ataxia telangiectasia. The goal is a complete, detailed genetic map of 1 cM resolution.
A physical map shows the actual sites of genes on the genome. There are several physical maps with different degrees of resolution. The physical map of DNA, like a topographic map, consists of mapped landmarks, such as restriction enzyme sites and STSs, providing reference points relative to which functional DNA sequences such as genes can be localized.
· The lowest resolution physical map is the cytogenetic map, where chromosomal band patterns are viewed with light microscope.
· A cDNA map shows the location of genes.
· The highest resolution map shows a complete DNA bp sequence.
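The relationship between recombination and genetic distance described in item 4 can be worked through with a small calculation. This is a sketch under stated assumptions: the offspring counts are invented, the function names are hypothetical, map distance is taken to equal the recombination percentage (an approximation that holds only for closely spaced markers), and the conversion to physical distance uses the text's rule of thumb that 1 cM is about 1 million bp.

```python
# Rule of thumb from the text: in humans, 1 cM is about 1 million bp.
BP_PER_CM = 1_000_000


def map_distance_cm(recombinants: int, total_offspring: int) -> float:
    """Estimate genetic distance in centimorgans between two markers.

    One centimorgan corresponds to a 1% chance that the two markers are
    separated by recombination in a single generation. Treating the observed
    recombination percentage as the distance is only reasonable for markers
    that lie close together.
    """
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    if not 0 <= recombinants <= total_offspring:
        raise ValueError("recombinants must lie between 0 and total_offspring")
    return 100 * recombinants / total_offspring


def approx_physical_distance_bp(distance_cm: float) -> float:
    """Convert a genetic distance to an approximate physical distance in bp."""
    return distance_cm * BP_PER_CM


# Hypothetical family data: 4 of 200 offspring show the two markers separated.
distance = map_distance_cm(recombinants=4, total_offspring=200)
physical = approx_physical_distance_bp(distance)

print(f"genetic distance: {distance} cM")       # 2.0 cM
print(f"approximate spacing: {physical:.0f} bp")  # 2000000 bp
```

The conversion is an average only; recombination rates vary along a chromosome, which is one reason genetic and physical maps are built separately.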
Methods
1. Macrorestriction map (top-down mapping): Fragmenting chromosomes with a rare-cutting restriction enzyme into large pieces, which are then ordered, subdivided, and mapped. This results in more continuity and fewer gaps than the contig method, but a lower map resolution.
2. Contig map (bottom-up mapping): Cutting a chromosome into small pieces, each cloned and ordered, forming contiguous DNA blocks.
3. Positional cloning: The markers are used to hunt for the gene. Once the gene is located, physical maps are used to obtain flanking DNA segments for further detailed study (mostly pertaining to regulation of gene function).
4. STS-content mapping: Provides the means to establish these overlaps between each clone and its nearest neighbors. If two clones share even a single STS, they can reliably be assumed to overlap. Using the YAC system and STS-content mapping, physical maps of human chromosomes 21 and Y, and a large part of X, have been published. (YAC gives a much larger segment [contigs] than those observed with cosmid clones.) The problem with YAC (and more so with cosmid) is the orientation of the contig obtained. This is overcome by the FISH method, where a fluorescence-labeled probe that binds to a chromosome is visualized through a light microscope.
5. Radiation hybrid mapping: Fragmenting chromosomes in cultured cells with high doses of X-rays followed by incorporating the fragments into stable cell lines.
6. PCR markers: Based on short, repetitive DNA sequences widely distributed in the human genome, such as (CA)n (n=number of repetitions of the dinucleotide CA). n is highly variable in the different zones of the human genome. The difference in n results in different copy length, detectable by electrophoresis.
7. Positional cloning: Identifying a gene that causes an inheritable disease.
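The overlap rule in method 4 (two clones that share even a single STS are assumed to overlap) can be sketched as a grouping problem. The clone names, STS names, and the function `build_contigs` are hypothetical. The sketch only groups clones into contigs; it does not order the clones within a contig or resolve contig orientation, which the text notes requires other methods such as FISH.

```python
from itertools import combinations


def build_contigs(clone_sts: dict[str, set[str]]) -> list[list[str]]:
    """Group clones into contigs using shared STS content.

    Two clones that share at least one STS are treated as overlapping, and
    overlaps are chained: if A overlaps B and B overlaps C, all three fall
    into the same contig.
    """
    # Simple union-find structure: every clone starts as its own group.
    parent = {clone: clone for clone in clone_sts}

    def find(clone: str) -> str:
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]  # path compression
            clone = parent[clone]
        return clone

    for a, b in combinations(clone_sts, 2):
        if clone_sts[a] & clone_sts[b]:  # at least one STS in common
            parent[find(a)] = find(b)

    groups: dict[str, list[str]] = {}
    for clone in clone_sts:
        groups.setdefault(find(clone), []).append(clone)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical STS-content data for five YAC clones.
clones = {
    "YAC_A": {"sts1", "sts2"},
    "YAC_B": {"sts2", "sts3"},
    "YAC_C": {"sts3", "sts4"},
    "YAC_D": {"sts7"},
    "YAC_E": {"sts7", "sts8"},
}

contigs = build_contigs(clones)
print(contigs)  # [['YAC_A', 'YAC_B', 'YAC_C'], ['YAC_D', 'YAC_E']]
```

The gap between the two resulting contigs corresponds to a region for which no clone in the library carries a linking STS.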
The major technical hindrance in sequencing is that no sequencing machine exists to provide a contiguous, finished DNA sequence on a large scale (>1 Mb). This is a priority goal of the HGP.
Glossary and Abbreviations
Autoradiography – A technique that uses X-ray film to visualize radioactively labeled molecules or fragments of molecules. It is used to analyze the length and number of DNA molecules after separation by gel electrophoresis.
bp – base pair.
cDNA – DNA synthesized from a messenger RNA template; the single strand form is used as a probe in physical mapping.
Centimorgan (cM) – A unit of measure of recombination frequency. One cM is equal to a 1% chance that a marker at one genetic locus will be separated from a marker at another due to recombination in a single generation. In humans, a centimorgan equals, on average, 1 million base pairs.
CEPH – Centre d’Etude Polymorphisme Humain.
Chromosome – A structure of genetic information composed of protein and DNA. Each human cell has 46 chromosomes in pairs. One chromosome of each pair is inherited from each parent.
Contigs – Groups of clones representing overlapping or contiguous regions in a genome.
Cosmid – A 20-40 kb DNA fragment.
Cosmid vectors – Plasmids that also contain specific sequences from the bacterial phage lambda. Cosmids are designed for cloning large fragments (typically 40,000 bp) of eukaryotic DNA.
Database – Repository of information.
DNA (deoxyribonucleic acid) – DNA is normally a double-stranded molecule that encodes genetic information, the two strands being held together by chemical bonds between base pairs of nucleotides. There are four such nucleotides: adenosine, guanosine, cytidine, and thymidine. Base pairs are formed only between adenosine and thymidine and between guanosine and cytidine. It is possible to determine the sequence of either strand from that of its partner.
DNA cloning – A means of isolating individual fragments from a mixture and multiplying each to produce sufficient material for further analysis.
DNA sequence – The relative order of base pairs in a stretch of DNA, a gene, a chromosome, or an entire genome.
DOE – Department of Energy.
Electrophoresis – A method used to separate and characterize large DNA and protein fragments from a mixture of such molecules.
ELSI – ethical, legal, and social implications.
EMBL – European Molecular Biology Laboratory.
Expressed sequence tag (EST) – A short DNA sequence derived from an active gene.
Eukaryote – All organisms except viruses, bacteria and blue-green algae.
Exons – The protein-coding DNA sequences of a gene.
FISH (fluorescence in situ hybridization) – A physical mapping technique employing fluorescein-labeled DNA probes that can detect segments of the human genome by DNA-DNA hybridization on samples of condensed chromosomes of lysed metaphase cells.
Gene – The fundamental unit of inheritance; an ordered sequence of nucleotides that specify a single type of protein (or for some genes, certain RNAs).
Gene therapy – Insertion of normal DNA into cells to correct a given genetic defect.
Genetic linkage map – A map showing the relative positions of gene loci on a chromosome. The distance is measured in centimorgans.
Genetics – The study of patterns of inheritance of specific traits.
Genome – All the genetic material in all the chromosomes of a particular organism. Size is usually denoted in base pairs.
Genomic library – A collection of clones made up of a set of overlapping DNA fragments representing the entire genome.
Germinal cell – A cell destined to become a sex cell.
HGC – Human Genome Center.
HGMP – Human Genome Mapping Project.
HGP – Human Genome Project.
HUGO – Human Genome Organization.
Hybridization – The joining of two complementary strands of DNA, or of DNA and RNA, to form a double-stranded molecule.
Informatics – The application of computer and statistical techniques to the management of information.
Introns – The DNA sequences separating the protein coding sequences of the genes. Introns are transcribed into mRNA (messenger RNA) but are eliminated from the message before it is translated into protein.
kb – kilobase (1000 bases).
Library – A collection of unordered clones whose relationship may be established by physical mapping.
Linkage – The proximity of two or more markers on a chromosome. The closer together the markers are, the lower the probability that they will be separated during meiosis. This gives an idea of the likelihood that they will be inherited together.
Locus – The location of a gene on a chromosome.
Marker – An identifiable physical location on a chromosome whose inheritance can be monitored.
Mb – million base pairs.
MRC – Medical Research Council.
NIH – National Institutes of Health.
Nucleotide – The building block of DNA or RNA. Thousands of nucleotides are linked to form a DNA or RNA molecule.
Oligonucleotide probe – A short DNA probe whose hybridization is sensitive to a single base mismatch.
Physical map – A map showing the identifiable landmarks (genes, RFLP markers, etc.) on DNA, regardless of inheritance. Distance is measured in base pairs. Chromosome banding patterns constitute the lowest resolution physical map; complete nucleotide sequences represent the highest resolution map.
Plasmid vectors – Circular, double-stranded DNA of 1,000 to 10,000 bp that can carry additional DNA sequences in fragment inserts up to 12,000 base pairs. Widely employed in genetic engineering.
Polygenic disorders – Genetic disorders resulting from the combined action of alleles of more than one gene, e.g., heart disease or diabetes. The hereditary patterns of these disorders are more complex than single gene disorders.
Polymerase chain reaction (PCR) – A technique that allows a sequence of interest to be amplified selectively against a background of an excess of irrelevant DNA, up to 10⁶ to 10⁹ times.
Polymorphism – Differences in DNA sequence among individuals. Genetic variations that occur in more than 1% of a given population can be used for linkage analysis.
Pulsed-field gel electrophoresis (PFGE) – A method used to separate large DNA fragments (20 kb to 10 Mb) by applying pulses of electric current to the sample at various angles.
Replication – The synthesis of new DNA strands from existing DNA.
Restriction enzyme – A protein that recognizes specific short nucleotide sequences and cuts DNA at such sites.
Restriction fragment length polymorphism (RFLP) – Sequence variations in DNA sites that can be cleaved by restriction enzymes.
RNA (ribonucleic acid) – Similar in structure to DNA, RNA comprises uracil rather than thymidine nucleotide. Usually a linear single-stranded polymer encoding protein synthesis.
Sequence-tagged site (STS) – A short DNA sequence, readily located and amplified by PCR techniques, that uniquely identifies a physical genomic location.
Sequencing – Ordering of nucleotides.
Short tandem repeat polymorphism (STRP) – A variable number of tandem repeat sequences, most commonly of 2 bp. The advantages of STRPs are that they occur up to thousands of times throughout the genome, they are evenly distributed throughout the genome and are amplifiable by PCR, and the number of repeats varies among individuals. In some cases, an excess of trinucleotide repeats can lead to inherited diseases such as fragile X syndrome, Huntington’s disease, or myotonic dystrophy.
Single gene disorders – Hereditary disorders caused by a single gene, e.g., Duchenne muscular dystrophy.
Single nucleotide polymorphism (SNP) – A disease-associated gene marker. A point mutation in a gene, very stable in time, useful in identifying disease-related genes.
Somatic cell – Any cell of the eukaryotic body other than those destined to become sex cells.
STA – Science and Technology Agency.
Telomere – DNA that forms the ends of chromosomes and is needed for successful replication.
Trait – Any genetic characteristic.
UNESCO – United Nations Educational, Scientific, and Cultural Organization.
Vector – A DNA molecule capable of autonomous replication in a cell and containing restriction enzyme cleavage sites for the insertion of foreign DNA.
Whole-genome radiation hybrid mapping – A complementary technology for constructing highly integrated maps of human chromosomes.
Yeast artificial chromosome (YAC) – Plasmids containing portions of yeast chromosomal DNA that function in replication. Used to clone foreign DNA fragment inserts up to 1 million base pairs in length.
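As an illustration of how several of the glossary terms fit together, the following is a minimal Python sketch that is not part of the original article. It assumes the restriction enzyme EcoRI, which recognizes GAATTC and cuts after the first G. The two short allele sequences and the function names digest and ideal_pcr_fold are hypothetical, chosen only for illustration. A single base change that abolishes a recognition site alters the fragment lengths (an RFLP). Ideal doubling per PCR cycle, an idealization that real reactions only approximate, gives roughly a million-fold amplification after 20 cycles and a billion-fold after 30.

```python
def digest(sequence, site, cut_offset):
    """Return the fragment lengths produced by cutting `sequence`
    at every occurrence of the recognition `site`.
    `cut_offset` is the cut position measured from the start of the site."""
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    # The remainder after the last cut is the final fragment.
    fragments.append(len(sequence) - start)
    return fragments


def ideal_pcr_fold(cycles):
    """Fold amplification assuming the target doubles in every cycle
    (an idealization; real efficiency is lower)."""
    return 2 ** cycles


# Hypothetical toy alleles: allele_b carries a point mutation
# (GAATTC -> GAGTTC) that destroys the second EcoRI site.
allele_a = "ATTGAATTCCGGATGAATTCTTAC"
allele_b = "ATTGAATTCCGGATGAGTTCTTAC"

ECORI_SITE = "GAATTC"
ECORI_OFFSET = 1  # EcoRI cuts between G and AATTC

print(digest(allele_a, ECORI_SITE, ECORI_OFFSET))  # [4, 11, 9]
print(digest(allele_b, ECORI_SITE, ECORI_OFFSET))  # [4, 20]
print(ideal_pcr_fold(20))  # 1048576 (about 10^6)
print(ideal_pcr_fold(30))  # 1073741824 (about 10^9)
```

The differing fragment-length patterns of the two alleles are what would be resolved on a gel in an RFLP analysis; the sketch ignores the complementary strand and all laboratory detail.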
Appendix 1: Gene Therapy
The primary benefit to be derived from the HGP is gene therapy, in which defective genes causing congenital diseases are replaced by functional genes. In 1990 the first approved human gene therapy was successfully performed on Ashanthi De Silva, a three-year-old girl who suffered from congenital adenosine deaminase deficiency, which, untreated, can lead to a fatal malfunction of the immune system. The functional adenosine deaminase gene was successfully introduced.
Since then, more than one hundred clinical trials have been initiated, most with less success. The last five years have witnessed an intensive effort, both in the public and private sectors, involving hundreds of millions of dollars in the US alone, towards the development of tests and techniques for gene therapy. This five-year experimentation period has helped air the safety-related doubts with regard to possible hazards that engineering DNA might impose, such as the promotion of new infectious diseases and cancers.
However, although it is still considered a most rewarding prospect, the future of gene therapy remains unclear at present. There are more questions than answers, concerning, for example, the right vector to be used, side effects (noted in several cases of adenovirus use in cystic fibrosis trials) and, above all, efficacy.
Appendix 2: The Inuyama Declaration – Human Genome Mapping, Genetic Screening and Therapy
In July of 1990, the Council for International Organizations of Medical Sciences held its XXIVth Round Table Conference, “Genetics, Ethics and Human Values: Human Genome Mapping, Genetic Screening and Therapy,” in Tokyo and Inuyama City, Japan. The Conference was held under the auspices of the Science Council of Japan and was co-sponsored by the World Health Organization and the United Nations Educational, Scientific and Cultural Organization. It was the fifth in a series entitled “Health Policy, Ethics and Human Values: An International Dialogue,” begun in Athens in 1984. The 102 participants came from twenty-four different countries, representing all the continents.
In addition to biomedical scientists and physicians, the participants represented a wide range of disciplines including sociology, psychology, epidemiology, law, social policy, philosophy and theology, and brought with them experience in hospital and public health medicine, universities and private industry, and the executive and legislative branches of government. Through presentations and discussions in plenary sessions and working groups, they reached broad agreement on a number of central issues. At its final session the following declaration was agreed upon.
I. Discussion of human genetics is dominated today by the efforts now under way on an international basis to map and sequence the human genome. Such attention is warranted by the scale of the undertaking and its expected contribution to knowledge about human biology and disease. At the same time, the nature of the undertaking, concerned as it is with the basic elements of life, and the potential abuse of the new knowledge which the project will generate are giving rise to anxiety. The Conference agrees that efforts to map the human genome present no inherent ethical problems but are eminently worthwhile, especially as the knowledge revealed will be universally applicable to benefit human health. In terms of ethics and human values, what must be assured are that the manner in which gene mapping efforts are implemented adheres to ethical standards of research and that the knowledge gained will be used appropriately, particularly in genetic screening and gene therapy.
II. Public concern about the growth of genetic knowledge stems in part from the misconception that while the knowledge reveals an essential aspect of humanness it also diminishes human beings by reducing them to mere base pairs of deoxyribonucleic acid (DNA). This misconception can be corrected by education of the public and open discussion, which should reassure the public that plans for the medical use of genetic findings and techniques will be made openly and responsibly.
III. Some types of genetic testing or treatment not yet in prospect could raise novel issues – for example, whether limits should be placed on DNA alterations in human germ cells, because such changes would affect future generations, whose consent cannot be obtained and whose best interests would be difficult to calculate. The Conference concludes, however, that for the most part present genetic research and services do not raise unique or even novel issues, although their connection to private matters such as reproduction and personal health and life prospects, and the rapidity of advances in genetic knowledge and technology, accentuate the need for ethical sensitivity in policy-making.
IV. It is primarily in regard to genetic testing that the human genome project gives rise to concern about ethics and human values. The identification, cloning and sequencing of new genes without first needing to know their protein products greatly expand the possible scope for screening and diagnostic tests. The central objective of genetic screening and diagnosis should always be to safeguard the welfare of the person tested: test results must always be protected against disclosure without consent, confidentiality must be ensured at all costs, and adequate counselling must be provided. Physicians and others who counsel should endeavour to ensure that all those concerned understand the difference between being the carrier of a defective gene and having the corresponding genetic disease. In autosomal recessive conditions, the health of carriers (heterozygotes) is usually not affected by their having a single copy of the disease gene; in dominant disorders, what is of concern is the manifestation of the disease, not the mere presence of the defective gene, especially when years may elapse between the results of a genetic test and the manifestation of the disease.
V. The genome project will produce knowledge of relevance to human gene therapy, which will very soon be clinically applicable to a few rare but very burdensome recessive disorders. Alterations in somatic cells, which will affect only the DNA of the treated individual, should be evaluated like other innovative therapies. Particular attention by independent ethical review committees is necessary, especially when gene therapy involves children, as it will for many of the disorders in question. Interventions should be limited to conditions that cause significant disability and not employed merely to enhance or suppress cosmetic, behavioral or cognitive characteristics unrelated to any recognized human disease.
VI. The modification of human germ cells for therapeutic or preventive purposes would be technically much more difficult than that of somatic cells and is not at present in prospect. Such therapy might, however, be the only means of treating certain conditions, so continued discussion of both its technical and its ethical aspects is essential. Before germ-line therapy is undertaken, its safety must be very well established, for changes in germ cells would affect the descendants of patients.
VII. Genetic researchers and therapists have a strong responsibility to ensure that the techniques they develop are used ethically. By insisting on truly voluntary programmes designed to benefit directly those involved, they can ensure that no precedents are set for eugenic programmes or other misuse of the techniques by the State or by private parties. One means of ensuring the setting and observance of ethical standards is continuous multidisciplinary and transcultural dialogue.
VIII. The needs of developing countries should receive special attention, to ensure that they obtain their due share of the benefits that ensue from the human genome project. In particular, methods and techniques of testing and therapy that are affordable and easily accessible to the populations of such countries should be developed and disseminated whenever possible.
Appendix 3: Genetic Engineering
Genetic engineering will serve as the major therapeutic means of correcting the gene mutations that underlie genetic diseases. Of the many techniques available, several seem promising:
1. Transfection: the introduction of new genes.
2. Homologous recombination: curing certain mutations in situ.
3. Injecting new genes into the nuclei of single-cell embryos.
4. Retroviral vectors: introducing useful gene sequences into defective cells.
There are four potential areas for the application of genetic engineering designed to insert a gene into a human:
1. Somatic cell therapy: This would result in correcting a genetic defect in the body cells of a patient. It is regarded technically as the simplest and ethically as the least controversial form of gene therapy.
2. Germ-line gene therapy: This would require the insertion of the gene into the reproductive cells of the patient so that the disorder would also be corrected in future generations. It will require major advances in present knowledge and raises ethical issues that clearly need to be debated further.
3. Enhancement genetic engineering: This would involve the insertion of a gene to enhance a known characteristic of a person, such as placing an additional growth hormone gene into an otherwise normal child. It presents a host of difficult ethical concerns. Unless this type of therapy can be clearly justified on the grounds of preventive medicine, prevailing opinion suggests that enhancement engineering should not be performed.
4. Eugenic genetic engineering: This would represent the attempt to alter or improve complex human traits which are coded by a large number of genes, involving, for example, personality, intelligence and formation of body organs. It is still prohibited and will probably remain so for the foreseeable future despite all the attention it continuously receives in the public and political arenas.
Bibliography
1. “Human Genome Research: A Review of European and International Contributions,” Medical Research Council, Diane J. McLaren, ed., January 1991.
2. “Report on Genome Research,” The European Science Foundation, 1991.
3. DOE Human Genome Program (primer on molecular genetics), June 1992.
4. Maynard V. Olson, “The Human Genome Project,” PNAS 90 (1993): 4338-44.
5. F. Collins and D. Galas, “A New Five-Year Plan for the US Human Genome Project,” Science 262 (1993): 43-46.
6. L.W. Engel, “The Human Genome Project,” Arch Pathol Lab Med 117 (1993): 459-465.
7. Human Genome News, NIH and DOE, vol. 6, no. 4 (1994).
8. “The Human Genome Mapping Project in the UK: Priorities and Opportunities in Genome Research,” commissioned by the Office of Science and Technology, April 1994.
9. Human Genome 1993, Program Report, US Department of Energy, March 1994.
10. R. Nowak, “Cold Spring Harbor Meeting Report,” Science 268 (1995): 1134-5.
11. “The Genome Directory,” Nature, 377, 6547S (1995).
Source: ASSIA – Jewish Medical Ethics,
Vol. III, No. 2, September 1998, pp. 20-29
1. J. Weber, Marshfield Medical Research Foundation, WI.
2. Introduced by Solomon and Bodmer, 1979; Botstein and co-workers, 1980.
3. Schwartz and Cantor, 1984.
4. Saiki and co-workers, 1985; Mullis and co-workers, 1986.
5. Burke and co-workers, 1987.
6. Olson and co-workers, 1989.
7. Developed by Eric Lander and co-workers, Whitehead Institute, 1994.
8. Support ended in 1998; obsolete – does not comply with future requirements of the HGP.
9. Published 1996.
10. Blattner et al., January 1997.
11. Fleischmann et al., 1995.
12. Fraser et al., 1995.
13. Personal grant not included.