Human genetics


Introduction

Historical background

Genetics is a very new science, changing rapidly by the day with advances in technology. Although humans have long been aware of some form of heredity, the mechanisms have only recently become clear. Early philosophers talked about the male ripening the female, or seeds being produced in various organs to be transmitted to the child. In the early 18th century, scientists were divided as to whether the sperm held the new child, the purpose of the female simply to provide the womb, or whether the egg held the child, already determined as male or female, growth being stimulated by the sperm.

The theory of pangenesis was a major early influence: pangenes were said to be developed in each organ and then passed to the child through blood. This is probably the origin of phrases such as ‘royal blood’ and ‘blood line’. Blending was the most accepted theory at the time: a tall man and a short woman would tend to produce offspring of average height. This clearly did not always work, e.g. when two brown-eyed parents had blue-eyed children.

Gregor Mendel's experiments, conducted over 8 years in the mid-19th century and involving thousands of garden pea plants, produced results that were perhaps all too perfect, but his crossing of plants with different characteristics laid the foundation for modern genetics. Describing the appearance of simple characteristics, he used mathematical principles to show that these traits were passed from parents to offspring through what came to be known as genes. Four fundamental principles of modern genetics were illustrated by his experiments:

  • Each parent contributes only one of a pair of factors, or alleles (tall or short, wrinkled or smooth in Mendel's experiments), for each trait (plant size or seed shape) through a process of separation, or segregation, when the gametes are formed

  • Some factors can be dominant, whereas others (recessive factors) are hidden and expressed only when the dominant factor is not present; both kinds remain unchanged when passed to offspring (they do not blend)

  • Males and females contribute equally to their offspring

  • Different traits are inherited independently of each other – independent assortment.
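The principles of segregation and dominance listed above can be illustrated with a small Punnett-square sketch. This is only an illustration under stated assumptions: the allele letters (`T` for tall, `t` for short) and the function names `cross` and `phenotype_ratio` are hypothetical and do not come from Mendel's work or from this chapter.

```python
from collections import Counter
from itertools import product


def cross(parent1: str, parent2: str) -> Counter:
    """Count offspring genotypes for a single-gene cross.

    Each parent is written as a two-letter genotype such as "Tt".
    Every gamete carries exactly one of the two alleles (segregation),
    and each pairing of gametes is treated as equally likely.
    """
    offspring = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort so that "tT" and "Tt" are counted as the same genotype;
        # the uppercase (dominant) allele sorts first.
        offspring["".join(sorted(allele1 + allele2))] += 1
    return offspring


def phenotype_ratio(genotypes: Counter, dominant: str = "T") -> tuple:
    """Return (dominant phenotype count, recessive phenotype count)."""
    shows_dominant = sum(n for g, n in genotypes.items() if dominant in g)
    return shows_dominant, sum(genotypes.values()) - shows_dominant


# Crossing two heterozygous (Tt) plants
f2 = cross("Tt", "Tt")
print(dict(f2))             # genotype counts: TT 1, Tt 2, tt 1
print(phenotype_ratio(f2))  # (3, 1): three tall for every short plant
```

The recessive allele passes through the heterozygous parents unchanged and reappears in one quarter of the offspring, which is what a blending model cannot explain.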

The later recognition of chromosomes (see later), leading to the chromosomal theory of inheritance, and understanding of the process of meiosis when gametes are formed (see later) helped to explain these observations during the early 20th century. Occasionally, what Mendel's principles predicted for the next generation did not appear; these occurrences were described as mutations. Mendel's principles of inheritance and an acceptance of mutation provided scientific support for the theories put forward in Charles Darwin's On the origin of species in 1859.

Although deoxyribonucleic acid (DNA) had been isolated at about the same time as Darwin's publication, its recognition as the hereditary material, rather than a protein being responsible, was not proved until the 1950s. The publication in 1953 of the structure of DNA marked the start of modern molecular biology, to which the rest of this chapter is devoted.

Basis of modern genetics

The genetic code, located within DNA, provides the instructions for the complex development and organisation of the multicellular human. Genetics has become an essential component of medicine, extending our understanding of disease mechanisms to facilitate preventative, diagnostic, prognostic and therapeutic developments.

The understanding that disease can be caused by chromosomal abnormalities, or can be inherited because of the segregation of mutant genes, has provided the impetus for the development of genomic medicine. Along with the association of specific gene mutations with disease comes a greater understanding of the molecular and cellular mechanisms required for normal human development and homeostasis. A key factor in our current understanding of human genetics has been the technological development that enabled the complete sequencing of the human genome and led to the collection of vast amounts of data that can be ‘mined’ to produce useful information. Because of commercialisation and technological developments, the approximately 13 years and $2.7 billion that were required to sequence the first genome have now been reduced to the equivalent of approximately 10–15 minutes at a cost of $1000, with a promise of approximately $100 per genome.

The genome is the complete collection of genetic material within an organism. Different levels of genetic classification are used to enable the management of an enormous amount of information:

  • Genomics – the DNA structure of genes and their localisation within the genome

  • Gene expression – the mechanisms through which genes are ‘switched on’ and transcribed into messenger RNA (mRNA) to be translated into protein or other types of RNA

  • Proteomics – the characterisation of biological processes from measurement of protein expression, localisation and post-translational modification

  • Epigenetics – ‘above the genome’ – modifications to the DNA molecule that do not alter the genetic sequence but change gene expression and are heritable.

These ‘banks’ of genetic data can be used to provide information for disease therapies. The variable responses to established drugs, and the effects of particular chemicals on gene transcription and protein expression (drug discovery), are collectively described as pharmacogenomics . The combination of pharmacogenomics and the traditional pharmaceutical sciences offers the potential for better and safer drugs and the ability to provide individualised therapies.

Profiling an individual's DNA for the presence of specific gene variants will improve understanding of the inherited basis of many congenital disorders (disorders present at birth) and of the risk of developing particular multifactorial conditions (those with many causes) in later life, such as cancer and neurological disease. With more research, it is likely that our lifetime risks of developing a whole range of diseases could be predicted from our individual ‘genetic barcode’ and, of equal importance, which medicines and lifestyle changes will help to prevent the onset of these diseases or support their early management. Already, commercial companies offer personal disease risk predictions, but the evidence for the reliability of the estimates and the usefulness of the information provided to an individual is presently limited. Most of these predictions rely on genome-wide association studies, but many identified variants do not necessarily produce functional changes, and a better understanding of epigenetic changes may allow better risk predictions in the future.

Most human disorders have a genetic component that is either inherited in the germline or acquired through somatic mutation.

  • The germline refers to the gonadal cells that become eggs or sperm, and also to the genetic material that comes from them.

  • Somatic cells are those that come from the body, but not directly from the gonads.

  • Monogenic disorders are those controlled by a single gene, in which a particular genetic variant (often termed a mutation) underlies the disorder, in contrast to polygenic disorders. In polygenic diseases, a complex interaction of several genes and environmental factors predisposes an individual to, or protects an individual from, a particular disorder.

Genetic disorders can be classified in a number of ways:

  • Single gene disorders – a mutation in a single gene leading to disease (e.g. cystic fibrosis)

  • Chromosomal disorders – a change, gain, loss or exchange of chromosome elements (e.g. trisomy 21 – Down syndrome; Clinical box 5.1 )

    Clinical box 5.1
    Trisomy 21

    Trisomy 21 is seen in approximately 0.1% of live births and produces the phenotype originally described by John Langdon Down in 1866. It was recognised to be due to the presence of an extra copy of chromosome 21 in 1959. The extra chromosome copy appears more often to originate from the mother and prevalence increases significantly with increasing maternal age, from less than 1 in 1000 under the age of 30 years to about 1 in 25 over the age of 45 years. Individuals with Down syndrome have distinctive facial features: a small head, a flat nose bridge, misshapen ears, a broad and short neck, and narrow upward slanting eyes. These are often accompanied by a variety of medical conditions, including intellectual disability, congenital heart defects, gastrointestinal obstruction and leukaemia.

    Trisomy 21 accounts for approximately 95% of cases of Down syndrome. The remaining 5% are due to other abnormalities:

    • Translocation and partial trisomy : in this condition copies of parts of chromosome 21 are translocated to other chromosomes. Although there are no additional chromosomes in this case, there are still three copies of particular genes from chromosome 21 (partial trisomy).

    • Mosaics : in which there is a mixture of cell lines in different tissues within one body, some displaying trisomy 21, others being normal.

  • Polygenic disorders – due to the combined effects of many genes, or in combination with environmental factors (multifactorial) (e.g. neural tube defects)

  • Somatic disorders – disorders of body (non-germline) cells, such as uncontrolled cell growth, or cancer

  • Chromatin disorders – diseases caused by a malfunction of epigenetic control.

Individuals who are affected, or are at risk of being affected, by genetic disorders are likely to be offered prenatal diagnosis and genetic counselling, in which a trained professional provides information, risk assessment and support to patients and their families.

Along with the rapid developments that are coming with the sequencing of the human genome and the associated advances in analytical technologies, there is also a proliferation of ethical, legal and social questions that must be considered by governments. This means that individuals are increasingly faced with important decisions; it is the role of the genetic counsellor to provide information and support to help affected individuals make informed decisions.

Down syndrome can be screened for in early pregnancy through the measurement of various proteins and hormones that are characteristically altered in this condition. Women at high risk are normally offered an examination in early pregnancy that samples foetal cells and allows a detailed examination of all the chromosomes. Small numbers of foetal cells escape into the maternal circulation, and techniques that can separate and enrich these foetal cells are now being developed; non-invasive prenatal genome testing is likely to improve the diagnosis of a large number of congenital disorders in the future.

The human genome

Genetic information is stored within deoxyribonucleic acid (DNA) , its sequence providing the ‘blueprint’ for all the proteins in the body. The Human Genome Project (HGP, see later) began formally in 1990, and the identification of the entire human genome sequence was completed in 2003.

The information is arranged in genes that code for proteins. The genes are located on chromosomes. Forty-six chromosomes, arranged as 22 pairs of autosomal, or non-sex, chromosomes plus two sex chromosomes, are found within each somatic cell in the body. The gametes (sperm or egg cells) contain just one sex chromosome and one member of each of the 22 autosomal pairs.

DNA and chromosomes

DNA is a double-stranded molecule (the double helix), the strands forming a twisted ‘ladder’ with sides of sugar and phosphate molecules joined by strong phosphodiester bonds, connected by ‘rungs’ of nitrogenous bases linked through weaker hydrogen bonds. The order of the nucleotide bases on the strand is called the DNA sequence; it specifies the genetic instructions to produce and maintain an organism. There are four bases in DNA:

  • Adenine (A)

  • Cytosine (C)

  • Guanine (G)

  • Thymine (T).

The strands of DNA are joined by the specific pairing of these nucleotides, A with T and C with G, to form a double-helical structure. Each of the strands is therefore complementary to the other. Each strand has a 3′ (‘3 prime’) and a 5′ end, and the complementary strand runs in the opposite direction. There are approximately 3 billion base pairs making up the human genome sequence. The majority of human DNA is packaged into different-sized chromosomes (Fig. 5.1) located within the nucleus of the cell, with an additional circular piece of DNA located in the mitochondria.
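The pairing rules and the opposite orientation of the two strands can be sketched in a few lines of code. This is a minimal illustration, assuming sequences are written as plain strings in the 5′ to 3′ direction; the names `COMPLEMENT` and `reverse_complement` are chosen here for illustration only.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, also written 5' to 3'.

    The partner strand runs in the opposite direction, so the input
    is read backwards while each base is swapped for its partner.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


print(reverse_complement("ATGC"))  # GCAT
```

Applying the function twice returns the original sequence, which reflects the fact that each strand fully determines the other.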

Fig. 5.1, Organisation of DNA within a chromosome.

In a diploid human cell (having two sets of chromosomes), there are 46 chromosomes, 44 of which form 22 pairs of autosomal chromosomes (numbered 1–22). Each pair of chromosomes is homologous (very similar), and one of the pair is inherited from the father and the other from the mother (see Meiosis, later). The largest is known as chromosome 1 and consists of approximately 250 megabases (Mb) of DNA, chromosome 2 has 240 Mb and then chromosomes descend in size to the smallest autosomes, 21 and 22, which have 55 Mb and 60 Mb of DNA, respectively. The remaining two are the sex chromosomes, which are not homologous and are called the X and Y chromosomes.

In the gametes (egg and sperm cells), the genome is haploid, carrying only one copy of each of the 23 chromosomes. The Y chromosome (60 Mb of DNA) is much smaller than the X chromosome (140 Mb of DNA) (Fig. 5.2).

  • Males have an X and a Y chromosome. The Y chromosome is inherited from the father and contains the primary genetic information for determining some of the male characteristics. The Y chromosome is much smaller than the X chromosome and has very few genes: for this reason, there are not very many diseases associated with defects in the Y chromosome, compared with those associated with the X chromosome.

  • Females have two X chromosomes, although only one of the two is transcriptionally active, so that female cells do not express twice as much of the X chromosome genes as male cells. This random inactivation of one of the female X chromosomes is termed X-inactivation (or lyonisation) and occurs in early embryogenesis via a chromosomal/gene silencing mechanism involving methylation. Once this has occurred in a cell, the same X chromosome remains inactive in all somatic cells descended from it. In contrast, in the germline the inactive X is reactivated, and at meiosis one X is selected at random to enter the egg.

Fig. 5.2, Human chromosome karyotype.

Chromosome karyotypes (see Fig. 5.2 )

Chromosome karyotyping is the process of visually examining the chromosomes, arranged in pairs, for gross abnormalities. To see chromosomes under the microscope, somatic cells (such as white blood cells) are stimulated to divide ( mitogenesis ), triggered by chemical exposure to an agent such as phytohaemagglutinin. The cultured cells are then exposed to spindle toxins, such as colchicine, which arrest the cell cycle in metaphase when the chromosomes are more condensed and thus easiest to see.

Each chromosome has a constriction at the middle (metacentric) or towards one end (acrocentric). This constriction is termed the centromere; in the dividing cell it holds together the two identical copies of a duplicated chromosome (sister chromatids) during mitosis.

The centromere also divides the chromosome into two diagnostic parts, the shorter p arm and the longer q arm. Nuclear staining produces a characteristic pattern of light and dark bands across each chromosome. The dark bands represent condensed supercoiled chromatin (heterochromatin) that is transcriptionally silent. The light-staining bands represent chromatin that is not so tightly coiled (euchromatin), and these are transcriptionally active regions of the chromosome.

At the end of each chromosome arm are the telomeres, which do not contain any coding genes but consist of numerous copies of the hexameric nucleotide sequence TTAGGG. Because DNA replication mechanisms are unable to extend to the very end of the chromosome strand, sequences at the end are lost with each round of replication. Telomeres act to ‘cap’ the ends and protect the coding sequence, being shortened themselves instead. Telomerase enzymes present in germ and stem cells replace the cap. Telomeres maintain the stability of the chromosome, promote complete DNA replication and aid chromosome pairing. At the very end, a single-stranded section forms a ‘T-loop’, which prevents the telomere end from being recognised as a break in the DNA sequence needing repair. Telomeres become progressively shorter in somatic cells as they divide, and this has been linked to the regulation of cell longevity.
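The repetitive structure of the telomere can be made concrete with a short sketch that counts how many uninterrupted copies of TTAGGG sit at the end of a sequence. The function name `count_terminal_repeats` and the example sequence are hypothetical; real telomeres contain thousands of repeats and are measured by laboratory methods, not by this kind of string matching.

```python
TELOMERE_REPEAT = "TTAGGG"  # human telomeric hexamer, written 5' to 3'


def count_terminal_repeats(sequence: str, repeat: str = TELOMERE_REPEAT) -> int:
    """Count consecutive copies of `repeat` at the 3' end of `sequence`."""
    remaining = sequence.upper()
    count = 0
    # Strip one repeat at a time from the end until the pattern breaks
    while remaining.endswith(repeat):
        remaining = remaining[: -len(repeat)]
        count += 1
    return count


# A made-up chromosome end: some unique sequence followed by four repeats
chromosome_end = "ACGTACGT" + TELOMERE_REPEAT * 4
print(count_terminal_repeats(chromosome_end))  # 4
```

Losing a stretch from the end of such a sequence reduces the repeat count but leaves the unique sequence intact, which is the protective role of the cap described above.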

Mitochondrial DNA

The remaining DNA within the genome is packaged in circular DNA molecules (16 569 bases), found in the numerous mitochondria within the cytoplasm of the cell. Mitochondria are thought to have evolved after a prokaryotic cell (bacteria-like single celled organism) was internalised within an anaerobic eukaryotic (nucleus-containing) cell, facilitating aerobic respiration, which is a much more efficient way to produce energy in the cell. The mitochondrial genome encodes 13 proteins, 22 transfer RNA (tRNA) and two ribosomal RNA (rRNA) molecules that are primarily involved in the generation of adenosine triphosphate (ATP) to fulfil the cell's energy requirements. There can be hundreds of mitochondria within each cell. This DNA is always inherited from the mother because, at fertilisation, only the nucleus of the sperm cell is transferred into the egg.

Cell division

Human cell division occurs by two processes, mitosis and meiosis .

  • In mitosis the entire DNA content of the cell is duplicated and the cell divides into two identical diploid daughter cells.

  • Meiosis is a special form of cell division that maintains the correct number of chromosomes in the gamete cells. During fertilisation the egg and sperm cells merge to form a single cell with 23 pairs of chromosomes. To ensure that the gametes only have a single set of chromosomes, in meiosis the DNA content of the cell is halved and the daughter cells are referred to as being haploid .

The cell cycle

The cell cycle is the name given to the series of events by which cells duplicate their DNA and divide, in a process known as mitosis . This process is preceded by the longer interphase in which the cell prepares for the division, synthesising the necessary material ( Fig. 5.3 ) . DNA replication occurs once during each cell cycle, and the process is tightly regulated to ensure that DNA is accurately copied and equally divided between the two daughter cells.

Fig. 5.3, The cell cycle.

The length of the cell cycle varies depending on the cell type: epithelial cells in the human gut may divide every 12 hours, whereas neurons and muscle cells lose their ability to divide altogether. Key molecules in this regulation are the cyclin-dependent kinases (CDKs) that are produced and destroyed at key checkpoints in the cell cycle (see later). An understanding of these and other proteins is an important focus of research into the causes of cancer.

Interphase

The time from a cell completing mitosis to the beginning of the next mitosis is referred to as interphase. Cells can remain in a quiescent stage, the G0 phase, until stimulated to enter the cell cycle by a variety of agents, such as growth factors or intracellular messengers. Three stages follow:

  • G1 (gap 1) phase

    • The step between division and synthesis

    • Transcription factors activated

    • DNA synthesis initiated.

The G1-S checkpoint (restriction checkpoint) is controlled through two cyclin proteins (D and E), which activate their respective kinases to enable inevitable progression into S phase through positive feedback. If errors are detected by kinase sensors then, through a series of reactions, cyclin E kinase activation is inhibited, halting the transition to S phase.

  • S phase (synthesis)

    • Chromosomes duplicated to form the ‘sister’ chromatids

    • Material may be exchanged between sister chromatids ( recombination ).

  • G2 (gap 2) phase

    • The step between synthesis and division

    • DNA repair

    • Division of mitochondria and preparation for mitosis.

The G2 checkpoint (decatenation checkpoint) is also mediated through a series of cyclin–cyclin-dependent kinase complexes that check for replication errors. The mismatch repair system checks for frameshift errors through a series of genes, stimulating apoptosis when the repair fails.

Each cycle starts with 46 chromosomes (23 pairs), and by the end of G2 each of them has been duplicated into two sister chromatids.

Mitosis ( Fig. 5.4 )

Mitosis is a continuous process during which the chromosomes, duplicated during interphase, are separated before the physical division of the cell into two daughter cells. The main physical events that occur during mitosis were first observed in the latter part of the 19th century and elucidated by Flemming in 1882. Flemming divided mitosis into four stages based on morphological changes in the nucleus and cytoplasm of the dividing cell.

  • Prophase : chromosomes condense and become visible. Each duplicated chromosome comprises two chromatids lying side by side and attached at the centromere ( Fig. 5.5 ) , held together by the structural maintenance of chromosome protein, cohesin . A bipolar mitotic spindle starts to form outside the nucleus, radiating out from the centrioles lying at opposite sides of the cell.

    Fig. 5.5, Chromosome structure.

  • Metaphase : metaphase starts ( prometaphase ) with the disappearance of the nuclear membrane. Chromosomes become attached to the mitotic spindle via kinetochore microtubules . At the end of metaphase, the chromosomes are aligned in a plate around the centre of the mitotic spindle apparatus, the equatorial plane . At this point the chromosomes are at their most condensed. Examination of chromosomes for clinical diagnostics is normally done on metaphase chromosomes.

    • The third checkpoint (mitotic spindle checkpoint) detects bipolar tension, which leads to the degradation of cyclin B and to proteolytic cleavage of cohesin molecules, releasing the sister chromatids and allowing them to separate in anaphase.

  • Anaphase: during anaphase the spindle fibres shorten and pull the sister chromatids towards opposite spindle poles. The cohesin molecules that held the chromatids together are cleaved by protease enzymes. This process typically lasts only a few minutes. At this point, there will be 92 chromosomes, with identical sets of 46 located at opposite sides of the cell.

  • Telophase : the daughter chromatids lie at the spindle poles. The kinetochore microtubules disappear and a nuclear envelope forms around each group of daughter chromosomes.

  • Cytokinesis : cleavage of the cytoplasm starts during anaphase. The cell membrane around the centre of the cell is drawn in to form a cleavage furrow and then tightens until it reaches the remains of the mitotic spindle. This is known as the midbody and may persist for some time before it breaks to form the two daughter cells, each containing 46 chromosomes.

Fig. 5.4, Mitosis.

Meiosis ( Fig. 5.6 )

In meiosis DNA replication is followed by two rounds of cell division and leads to the formation of four haploid cells, each containing a single set of chromosomes, i.e. half the normal chromosomal content. This process forms female egg cells or male sperm cells only. Like mitosis, each round of division in meiosis can be divided into four phases based on the nuclear and cell morphology.

Fig. 5.6, Phases in meiosis.

The process begins with interphase I, in which, as in mitosis, each chromosome is duplicated to form two sister chromatids joined at the centromere. Because the cells are diploid, each duplicated chromosome has a homologue (Fig. 5.7).

Fig. 5.7, Diploid chromosomes:

Meiosis I is the phase in which two haploid cells are produced from a single diploid cell. It is during this phase that genetic diversity is generated through crossover (recombination).

  • Prophase I: this is a very different process from mitotic prophase, and approximately 90% of meiotic time is spent in this phase. In females, the developing eggs are already suspended in this stage at birth and remain so until puberty. Prophase I can be subdivided into five stages:

    • Leptonema (thin): the diploid chromosomes condense to form long thin threads. Each chromosome attaches by both ends via an attachment plaque to the nuclear envelope. Individual chromatids are not visible at this stage.

    • Zygonema (yoke shaped): synapsis or intimate pairing marks the beginning of zygonema. Synapsis (pairing) starts when the homologous regions of the two chromosomes come together, starting a zipper-like process during which the two chromosomes become aligned side by side. This can also take place when an X chromosome is paired with a Y chromosome, but is limited to the pseudoautosomal regions, PAR1 and PAR2, located at the tips of the short and long arms, respectively. Synapsis often starts at the nuclear membrane and proceeds inwards, but can also start in the centre of the chromosome and proceed out towards the ends. The paired chromosomes are known as bivalents (one from each parent) and have four chromatids. The whole is known as a tetrad ( Fig. 5.8 ) .

      Fig. 5.8, Tetrad chromosome formation:

    • Pachynema (thick): cells enter pachynema when all the chromosomes are aligned. This stage can last several days. Recombination nodules become visible, which are thought to result in an exchange of chromosomal material between the two non-sister chromatids.

    • Diplonema (double): at this stage the two homologous chromosomes condense, to the position where the sister chromatids are visible, and start to move away from each other. Each tetrad remains attached at chiasmata , which are formed at the point where crossover has occurred. In oocytes, this stage can last for several months or years, but it only takes about 24 days in the human male. The chromosomes de-condense and RNA synthesis starts, to provide storage materials for the egg.

    • Diakinesis (across): RNA synthesis ceases and chromosomes condense, thicken and detach from the nuclear membrane. The four chromatids can be clearly distinguished within each tetrad. Sister chromatids are joined at the centromere. Non-sister chromatids are joined by chiasmata.

  • Metaphase I : in this phase spindles form between centrioles, at opposite poles of the cell, within the nuclear membrane, and the tetrads line up on the spindles on the equatorial plane ( metaphase plate ), with centromeres from the homologous chromosomes lying on opposite sides of the plate ( Fig. 5.9 ) .

    Fig. 5.9, Metaphase I:

  • Anaphase I: the spindles pull the two homologous chromosomes apart, towards opposite ends of the cell. Which homologue of each pair moves to which pole is determined at random (independent assortment), creating further genetic variation. Unlike in mitosis, the sister chromatids do not separate, so only half the original number of chromosomes will lie in each half of the cell, which will contain only one of each pair of autosomes (with crossed-over genetic material) and one sex chromosome (23 chromosomes in total).

  • Telophase I : when the chromosomes reach the poles, a new nuclear membrane develops between each set of chromosomes ( Fig. 5.10 ) . Cytoplasmic division is about equal between the daughter cells in males, whereas in females there is unequal division. The daughter cell with the most cytoplasm goes on to form the egg, the other becoming a polar body that eventually degenerates.

    Fig. 5.10, Telophase I:
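The amount of variation created by independent assortment at anaphase I can be estimated with simple arithmetic: each of the 23 homologous pairs can orient in one of two ways, independently of the others. The sketch below works under that simplifying assumption and deliberately ignores crossover, which increases the variation further; the function name `assortment_combinations` is hypothetical.

```python
HAPLOID_NUMBER = 23  # number of homologous chromosome pairs in humans


def assortment_combinations(pairs: int = HAPLOID_NUMBER) -> int:
    """Number of distinct chromosome sets one parent can place in a gamete.

    Each pair contributes an independent two-way choice (maternal or
    paternal homologue), so the choices multiply: 2 * 2 * ... * 2.
    """
    return 2 ** pairs


per_parent = assortment_combinations()
print(per_parent)       # 8388608 possible gametes from one parent
print(per_parent ** 2)  # 70368744177664 possible combinations from two parents
```

Even without recombination, a single couple could therefore produce more than 70 trillion chromosomally distinct zygotes.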

Meiosis II is the phase in which each haploid cell produces two additional daughter haploid cells. It passes through a series of stages similar to those of meiosis I:

  • Prophase II : this is very similar to mitotic prophase, sister chromatids lying together, joined at the centromere, but the cell nucleus has only a haploid number of chromosomes.

  • Metaphase II : spindle fibres line up the chromosomes on the equatorial plane.

  • Anaphase II: this resembles mitotic anaphase – the centromeres split and sister chromatids are pulled towards separate poles. In meiosis, however, because of the crossing over that has taken place in meiosis I, the separated chromatids, and therefore the daughter cells, may not be identical.

  • Telophase II : this is like telophase I; nuclear membranes again form around each set of chromosomes. Again, the cytoplasmic division is unequal in female gametes, keeping as much cytoplasm as possible with the true egg/future zygote (combination of the female and male gametes in a fertilised egg).

In females, meiosis II is completed only at fertilisation, stimulated by the penetration of a sperm into the cell, with the sperm nucleus uniting with the large egg nucleus to produce the zygote.

At the end of meiosis, the result in males is four daughter haploid cells, two with 22 chromosomes plus an X chromosome and two with 22 chromosomes plus a Y chromosome, whereas in females a single haploid egg plus two polar bodies will form. Occasionally, the polar body formed in meiosis I also divides, and there will be three polar bodies. The mechanism for the asymmetric cell division, required to produce the polar bodies in female cells, is not clear.

Chromosome abnormalities

Chromosome abnormalities are either constitutional or somatic.

  • Constitutional abnormalities are present in all body cells and are the result of something that occurs very early in embryonic development.

  • Somatic abnormalities are only present in some cells, and therefore the person is a mosaic (cells from the same zygote) or a chimera (cells from different zygotes, when twins are aggregated in the early embryo).

Abnormalities in number

Defects in fertilisation or meiosis can result in an additional or reduced number of chromosomes being detected. The body is better able to deal with small excesses of genetic material than a deficit.

  • Polyploidy occurs when an additional set of chromosomes is present; this always happens as a multiple of 23 chromosomes. Triploidy results in 69 chromosomes (69, XXX, XXY or XYY) and tetraploidy leads to 92 chromosomes in each cell. These conditions are rarely compatible with life and triploidy is a common cause of spontaneous miscarriages.

  • Aneuploidy occurs when the number of chromosomes is not an exact multiple of 23. There can be a gain or loss of chromosomes, which can affect either autosomal or sex chromosomes. These conditions are usually caused by non-disjunction during meiosis and can lead to monosomy (one copy of a chromosome in a normally diploid cell) or trisomy (three copies of a chromosome in a normally diploid cell).

  • Autosomal monosomy is rarely compatible with life.

  • Autosomal trisomy – trisomy 21 (Down syndrome) is the most common. Chromosomes 13, 18 and 21 have lower gene density in comparison with other chromosomes, and trisomies of chromosome 13 (Patau syndrome), chromosome 18 (Edwards syndrome) and chromosome 21 (Down syndrome) are the only autosomal trisomies in which embryos generally survive to term.

  • Sex chromosome monosomy – because of normal X inactivation, the loss of a single X chromosome (Turner syndrome) is compatible with life. The surviving X chromosome is normally maternal in origin and often the result of non-disjunction as a result of meiotic errors producing X chromosomes with p arm deletions, or abnormal Y chromosomes in the male.

  • Sex chromosome trisomy – XXY (Klinefelter syndrome), XXX (trisomy X) and XYY syndrome cause fewer problems than the abnormalities mentioned previously.
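The distinction between polyploidy and aneuploidy drawn in the list above comes down to whether the chromosome count is an exact multiple of 23. The sketch below encodes only that rule; the function name `classify_ploidy` is hypothetical, and the check looks at the total count alone, so it cannot detect structural abnormalities or a cell in which a gain of one chromosome is offset by the loss of another.

```python
HAPLOID_SET = 23  # chromosomes in one complete human set

PLOIDY_NAMES = {1: "haploid", 2: "diploid", 3: "triploid", 4: "tetraploid"}


def classify_ploidy(chromosome_count: int) -> str:
    """Classify a cell by its total chromosome count."""
    if chromosome_count <= 0:
        raise ValueError("chromosome count must be positive")
    sets, remainder = divmod(chromosome_count, HAPLOID_SET)
    if remainder != 0:
        # Not an exact multiple of 23, e.g. 45 (monosomy) or 47 (trisomy)
        return "aneuploid"
    return PLOIDY_NAMES.get(sets, "polyploid")


for count in (23, 45, 46, 47, 69, 92):
    print(count, classify_ploidy(count))
```

Counts of 45 and 47 both fall into the aneuploid category, corresponding to the monosomies and trisomies listed above, whereas 69 and 92 are the triploid and tetraploid forms of polyploidy.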

Having different numbers of sex chromosomes is less of a problem than in the autosomes. In males this is because there are not many genes on the Y chromosome, and in females it is because X inactivation lowers the effect of having different numbers of X chromosomes.

Abnormalities in structure

Sometimes, as gametes are formed, structural alterations can occur – pieces can be lost, moved to another chromosome or duplicated. These events can be the result of homologous chromosomes failing to line up properly during meiosis, causing an unequal crossover, or chromosome breakage with an imperfect repair. The alterations that can occur are:

  • Translocations – where genetic material is exchanged between different chromosomes ( reciprocal translocation), or where the short arms of two chromosomes are lost and the long arms fuse at the centromere to make a single large chromosome ( Robertsonian translocation). In the latter, although there is the overall loss of one chromosome, this is usually only seen in chromosomes where the amount of genetic material in the short arm is limited, such as in chromosomes 13, 14, 15, 21 and 22. A carrier of a Robertsonian translocation on chromosome 21 is at risk of producing a child with a trisomy and this is the cause of approximately 3%–4% of Down syndrome cases.

The Philadelphia chromosome involves a translocation between chromosomes 9 and 22, and is characteristically observed in chronic myeloid leukaemia (see Ch. 12 ).

  • Deletions – a loss of genetic material can occur when breaks happen. They can be interstitial or terminal, although in the latter case, a telomere cap must be acquired. The amount of material lost determines the severity of abnormality. Some of these deletions can lead to clinically important abnormalities:

  • Cri-du-chat syndrome, so-called because of the child's distinctive cry, occurs because of the loss of part of the short arm of chromosome 5.

  • Wolf–Hirschhorn syndrome is due to the loss of part of the short arm of chromosome 4. Both this and cri-du-chat syndrome are examples of a microdeletion .

  • Ring chromosomes occur when there is a deletion at both ends of the chromosome and the ends join, forming a ring. This often leads to the loss of that chromosome, resulting in a monosomy. The formation of a ring X chromosome can be another cause of Turner syndrome.

  • WAGR syndrome is due to a deletion of the p arm of chromosome 11, involving a series of genes, resulting in Wilms tumour (a kidney cancer), aniridia (no iris), gonad tumours or other genitourinary abnormalities and retardation. This is an example of a contiguous gene syndrome .

  • Duplications – can occur as a result of unequal crossover or in the children of someone with a reciprocal translocation.

High-resolution banding techniques and fluorescence in situ hybridisation (FISH) ( Information box 5.1 and Fig. 5.11 ), gene array molecular technologies and, more recently, massively parallel sequencing (MPS) technologies have enabled many microdeletions and microduplications, such as Prader–Willi , Angelman and Williams-Beuren syndromes, to be described, and more are likely to follow.

  • Inversions – these can happen when there are two breaks in the chromosome and the free portion rotates. These changes are said to be pericentric if they involve the centromere and paracentric if they do not. Although these abnormalities may not have severe consequences for the carrier, they can interfere with the normal meiotic process and carriers are therefore at higher risk of producing offspring with deletions or duplications. Occasionally, the inversion can be very serious: an inversion that interrupts the factor VIII gene is the cause of almost 50% of cases of severe haemophilia A, resulting in insufficient production of factor VIII for normal clotting processes (see Ch. 12 ).

  • Isochromosome – chromosome pairs normally line up, joining and separating at the centromeres, but retaining their own long and short arms. An isochromosome occurs when the chromosomes split at the centromere to produce two chromosomes, one with only short arms and the other with only long arms. Only isochromosomes involving the X chromosome appear to be compatible with life.

  • Copy number variants (CNVs) are any gain or loss of a stretch of DNA, whatever the size. Although large gains or losses are more likely to have severe or lethal effects, deletions and duplications from a few nucleotides to even 1 million nucleotides have been described in healthy individuals.
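The inversion described in the list above (two breaks, with the free portion rotated) can be modelled on a toy chromosome. This is only a sketch: the chromosome is represented as a list of hypothetical segment labels, with "CEN" standing for the centromere.

```python
def invert(segments, start, end):
    """Return a copy in which segments[start:end] has been reversed."""
    return segments[:start] + segments[start:end][::-1] + segments[end:]


def inversion_type(segments, start, end, centromere="CEN"):
    """Pericentric if the inverted portion includes the centromere, else paracentric."""
    return "pericentric" if centromere in segments[start:end] else "paracentric"


chromosome = ["p3", "p2", "p1", "CEN", "q1", "q2", "q3"]

print(invert(chromosome, 1, 5), inversion_type(chromosome, 1, 5))
print(invert(chromosome, 4, 7), inversion_type(chromosome, 4, 7))
```

The first call inverts a portion spanning the centromere, the second a portion confined to the long arm.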

Information box 5.1
Fluorescence in situ hybridisation (FISH)

Fig. 5.11, Fluorescence in situ hybridisation (FISH).

FISH is a technique that has been developed to map (locate) particular regions of a chromosome, or even whole chromosomes, regardless of whether the cells are actively dividing. The technique requires the use of a piece of single-stranded DNA that matches the genetic sequence in the area of interest – the probe .

Chromosomes prepared on a glass slide are denatured (the DNA complementary strands are separated) in situ (on the slide), allowing the probe DNA to hybridise (bind) to its complementary sequence on the separated strand. Fluorescent dyes are attached to the probes so that the results can be visualised under a fluorescence microscope.

FISH can be used in different ways:

  • Gene location : the genetic sequence of every gene is known and a probe can be developed to show the chromosomal location of the gene. The technique is sensitive enough to identify deletions of very small amounts of DNA from a chromosome.

  • Centromeric probes : probes can be used to identify the repetitive sequence found around the centromere. This can be used to check the number of chromosomes.

  • Spectral karyotyping : this involves the use of a large number of probes, labelled with different-coloured fluorescent dyes, to ‘paint’ a whole chromosome. This technique is particularly useful in detecting chromosomal translocations, in which a part of one chromosome is moved to another chromosome.
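The principle of probe hybridisation can be illustrated in a few lines. The sketch below assumes perfect base pairing and uses made-up sequences; a probe binds wherever the target strand contains the probe's reverse complement, and a deleted region gives no signal.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_binding_sites(target, probe):
    """0-based positions on the target strand where the probe can hybridise."""
    site = reverse_complement(probe)
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


normal = "GGATCCATTGCAAGGATCC"   # hypothetical target strand, 5' to 3'
deleted = "GGATCCAGGATCC"        # same strand with the region of interest lost
probe = "TGCAAT"                 # hypothetical single-stranded probe

print(probe_binding_sites(normal, probe))   # one binding site
print(probe_binding_sites(deleted, probe))  # no binding site
```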

DNA and genes

Chromosomal DNA is located in the nucleus and is composed of regions that actively make proteins and those that are inactive. Genes are the active DNA sequences coding for proteins and are organised into:

  • Regulatory regions ( promoter , enhancer or repressor sequences)

  • Exons – which encode the mRNA and, in most cases, the protein

  • Introns – which are transcribed but removed from the mRNA, and so are not translated into protein.

Some genes are found only at a single locus (chromosome location). They are called single copy genes and share little or no DNA sequence homology with other genes. Other genes are part of large gene families . These families have occurred due to gene duplication, and gene members therefore share a high degree of DNA (and amino acid) sequence homology. They often cluster at specific regions of the genome and have similar but distinct functions in different tissues.

Homeobox (and HOX) genes

Genes can also be classified into gene families based on the presence of highly conserved domains , with the rest of the gene sequence sharing no homology at all with other family members. Homeobox (and HOX) genes are examples of these.

HOX genes are a subgroup of homeobox genes; they are associated in clusters on different chromosomes and control production of body parts. They are organised in similar ways throughout the animal kingdom and are arranged along chromosomes in an order that reflects the body parts that they control. A specific HOX gene from a fly can be almost identical to one from another organism, such as the chicken, reflecting their common ancestry, yet a small mutation can lead to fundamental changes in morphology – called homeotic mutations . For example, additional fingers on the hand can be the result of a mutation in a specific HOX gene.

Non-coding DNA

Genes are also separated by repetitive non-coding regions. Some non-coding sequences code for functional RNA molecules. Despite not being used for making proteins, this DNA is important and has a number of gene regulatory functions, such as regulation of transcription and translation and provision of structural attachment regions.

Unlike human chromosomal DNA, almost all the mitochondrial DNA (mtDNA) sequence is coding, specifying proteins and RNA molecules used to maintain mitochondrial propagation and the functional aspects of oxidative phosphorylation. mtDNA is present in many copies in the mitochondria within the cytoplasm of most cells and is maternally inherited. Sperm contain very few mitochondria and so are overwhelmed by the female contribution, but there also seem to be regulatory mechanisms that inhibit their propagation.

Two processes are involved in making proteins:

  • Transcription – in which DNA is transcribed (copied) into mRNA, which then leaves the nucleus

  • Translation – in which mRNA is used to specify the amino acids required to make the relevant protein.

These processes take place in one direction only, along a nucleic acid strand. Nucleotides are arranged in sequence along a strand and each end is named after the number of the carbon atom in the nucleotide sugar ring that is exposed at that end.

  • The 5′ end has a phosphate group on the fifth carbon in the sugar ring.

  • The 3′ end has a hydroxyl group on the third carbon in the sugar ring.

The distinction between the two ends of the molecule is important because the enzymes that allow nucleic acid synthesis can only act at the 3′ carbon position on the sugar ring. This means that synthesis occurs only in a 5′ to 3′ direction and, by convention, sequences are written in a 5′ to 3′ direction. Sequences lying prior to the 5′ end of a particular region on a DNA strand are said to be upstream , and those lying beyond the 3′ end are said to be downstream .

Transcription ( Fig. 5.12 )

The process of making mRNA is initiated when an RNA polymerase enzyme binds to a promoter site on the DNA. The promoter region is a sequence of DNA close to, but outside, the gene. The position of the promoter determines which strand is used because nucleotides can only be added to a 3′ end. Thus, mRNA is only made in the 5′ to 3′ direction. Several different promoters can exist in different parts of the gene, allowing slightly different proteins to be produced in different locations in the body.

Fig. 5.12, Transcription of DNA to messenger RNA (mRNA):

Transcription begins with the assembly of a complex of transcription factors (the transcription preinitiation complex ) that enables one turn of the DNA helix to unwind, forming a transcription bubble , allowing the polymerase enzyme to bind. In most promoters, the assembly site is determined by the TATA box sequence. This is a conserved sequence of A:T pairs, usually located 25–35 bases upstream of the transcription start site. Other sequences are involved in determining the efficiency of the promoter, such as those found in the CAAT box , normally located further upstream. Many genes also have GC-rich regions, or CpG islands , upstream of the transcription start site. CpG islands are unmethylated regions of the genome that are associated with the 5′ ends of many genes.
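Locating a TATA box at the expected distance upstream can be sketched as a simple window search. The motif "TATAAA" is used here as an assumed consensus, and the sequence, the `start_site` index and the function name are all hypothetical.

```python
def tata_box_offset(sequence, start_site, motif="TATAAA", window=(25, 35)):
    """Return how many bases upstream of start_site the motif begins, or None."""
    nearest, farthest = window
    for offset in range(nearest, farthest + 1):
        begin = start_site - offset
        if begin >= 0 and sequence[begin:begin + len(motif)] == motif:
            return offset
    return None


# Hypothetical promoter: the motif begins 30 bases upstream of index 50
promoter = "G" * 20 + "TATAAA" + "C" * 24 + "ATGGCC"
print(tata_box_offset(promoter, start_site=50))
print(tata_box_offset("G" * 56, start_site=50))
```

Real promoters vary around the consensus, so an exact string match is a deliberate simplification.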

The polymerase enzyme moves along the DNA strand, adding nucleotides. As it does this, it pulls the DNA strands apart, zipper-like, exposing the DNA bases, unwinding the helix ahead of the polymerase and allowing the DNA strands to reanneal behind it as the new RNA strand is displaced. One DNA strand provides a template ( antisense strand), with the enzyme moving in the 3′ to 5′ direction along this template, producing a complementary copy ( sense strand), which eventually becomes the mRNA. Note that this molecule is identical to the DNA strand that has not been used as a template, except that the thymine base is replaced with a uracil base. A methylated guanine molecule – the 5′ cap – is added to the 5′ end of the developing precursor mRNA molecule. This ‘cap’ appears both to protect the molecule from degradation by 5′ nucleases, as the modification makes it look like a functional 3′ end, and to act as a marker for translation to begin. It also enables mRNA to move through a nuclear pore into the cytoplasm. Transcription continues, through a sequence of exons and introns, until the RNA polymerase is released. This may be through a termination sequence that allows hairpin loops to form on the mRNA strand, or through binding of the rho hexameric protein subunits.
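The relationship between the template strand, the non-template strand and the RNA product can be checked with a short sketch. The sequences are invented for illustration, and capping and termination are not modelled.

```python
# Base pairing used when RNA is built against a DNA template
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Build RNA 5'->3' by pairing against a template strand read 3'->5'."""
    return "".join(PAIRING[base] for base in template_3_to_5)


non_template = "ATGGCCTAA"   # hypothetical strand not used as template, 5'->3'
template = "TACCGGATT"       # its complementary strand, written 3'->5'

rna = transcribe(template)
print(rna)
# The RNA matches the non-template strand with thymine replaced by uracil
print(rna == non_template.replace("T", "U"))
```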

Before mRNA leaves the nucleus, introns are excised and the exons spliced together. Splicing can also happen in different ways: intentionally, resulting in different proteins being produced, or unintentionally, resulting in a mutation (error), which can produce genetic disease. A large number of adenine nucleotides are added ( polyadenylation ) to the 3′ end of the mRNA molecule, forming the polyA tail , which stabilises it. Although the tail is later lost, it has an important function in nuclear export and subsequent translation of the mRNA. Fig. 5.13 illustrates the process.
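Splicing and polyadenylation can be sketched as string operations on a precursor transcript. The exon and intron sequences, the exon coordinates and the tail length of 20 are assumptions chosen only to keep the example small.

```python
def splice(pre_mrna, exons):
    """Join the exon intervals (start, end) in order, discarding everything else."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def polyadenylate(mrna, tail_length=20):
    """Append a run of adenine nucleotides to the 3' end."""
    return mrna + "A" * tail_length


# Hypothetical precursor: exon, intron, exon, intron, exon
pre_mrna = "AUGGCC" + "GUAAGUCAG" + "GAACUG" + "GUGAGUCAG" + "UUUUAA"
all_exons = [(0, 6), (15, 21), (30, 36)]
skip_middle = [(0, 6), (30, 36)]  # an alternative splice that omits the second exon

print(polyadenylate(splice(pre_mrna, all_exons)))
print(polyadenylate(splice(pre_mrna, skip_middle)))
```

The two calls show how the same precursor can yield different mature transcripts depending on which exons are joined.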

Fig. 5.13, Production of messenger RNA (mRNA):

Regulation of transcription

The regulation of transcriptional processes is a vital component in the control of gene expression. Genes are normally transcribed in particular tissues at particular times but typically, only approximately 3%–5% of genes are expressed at any given time. In addition, even though all cells have the same DNA sequence, only a few genes are actively transcribed in any one cell. For example, only nucleated red cell precursors transcribe globin, used in the production of haemoglobin. Other genes may be transcribed all the time, to maintain cell integrity ( housekeeping genes ), or at particular times to aid other processes. For example, RNA polymerase needs other proteins to stabilise the polymerase, to help it bind in order to initiate transcription.

Transcription can be modified by enhancers . These are sequences that do not interact with the relevant genes, and may not even lie close, but which bind to transcription factors, known as activators . The whole then binds to coactivators , increasing transcription. Silencers are analogous to enhancers but they repress transcription. The various transcription factors interact with their targets through DNA-binding motifs , particular protein configurations that fit with, interact and modify the secondary and tertiary structure of the target molecules. These are some characteristic motifs:

  • Helix–turn–helix (hth):

    • Two α-helices lie in different planes, one fitting into the groove on the DNA molecule

  • Zinc finger:

    • Zinc ions are used to stabilise protein secondary structure in the form of α-helices and β-sheets, enabling the α-helix to bind with DNA in the major groove

  • Leucine zipper:

    • Two α-helices that form a Y-shaped structure, being held together with amino acid side chains that also bind to the DNA molecule in the major groove.

Translation

This process takes place on ribosomes . Ribosomes are large complex units made of specialised ribosomal RNA and proteins forming two joined subunits: a smaller unit that reads the RNA sequence and a larger unit where amino acids are joined in a polypeptide chain. mRNA provides the template, specifying the amino acid sequence, helped by molecules of tRNA . tRNA consists of about 80 nucleotides that have a clover-leaf structure ( Fig. 5.14 ) . There is at least one type of tRNA molecule for each amino acid, determined by the anticodon sequence in the loop of the molecule.

Fig. 5.14, Structure of tRNA:

The ribosome first binds to mRNA near its 5′ end and translation begins at the start codon. This is the sequence adenine–uracil–guanine (AUG), which codes for the amino acid methionine (see Table 2.8 for a list of codons). Mitochondrial genomes use other start codons – often GUG. The tRNA that binds is determined by the nucleotide sequence of the mRNA: the three nucleotide codon binds to the complementary anticodon on the tRNA molecule. Specific enzymes (aminoacyl tRNA synthetases) pick up and attach the specific amino acid to the 3′ acceptor end of each tRNA, so that the amino acid delivered matches the codon in the mRNA molecule. Ribosomes move along the mRNA sequence in a 5′ to 3′ direction and also provide enzymes that make covalent bonds between each of the amino acids, in order to produce the growing polypeptide chain. Translation continues until reaching a stop codon (a nucleotide triplet, normally UAG, UAA or UGA). The ends of the mRNA molecule are not translated; these are the untranslated regions, or UTRs . Once thought to be ‘junk’, they have now been shown to have important functions in gene expression, and mutations in some of these have been separately associated with colorectal cancer and pre-term birth.
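The reading of codons from the start codon to a stop codon can be sketched as follows. Only a handful of codons from the standard genetic code are included, so the table is deliberately partial, and the mRNA sequence is invented.

```python
# Partial codon table (standard genetic code); not a complete list
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GCC": "Ala", "GAA": "Glu",
    "UGG": "Trp", "AAA": "Lys", "GGC": "Gly",
}
STOP_CODONS = {"UAG", "UAA", "UGA"}


def translate(mrna):
    """Translate from the first AUG until a stop codon; UTRs are ignored."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        peptide.append(CODON_TABLE[codon])  # KeyError if outside the partial table
    return peptide


# Hypothetical mRNA: 5' UTR, coding region, 3' UTR
mrna = "GGCACC" + "AUGGCCGAAUGGUAA" + "GCUUUU"
print(translate(mrna))
```

The sequences before the AUG and after the stop codon play the part of the untranslated regions and contribute no amino acids.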

DNA damage

DNA can be damaged by a variety of factors, such as X-rays, and a large number of chemicals, but damage can also occur naturally during replication. There are two main types of error: physical abnormalities, such as breaks in the DNA molecule, and mutations , where there is a change in the DNA sequence. This base change can alter the codon, and thus the amino acid and subsequent protein being produced. Whereas damage may be repairable, mutations cannot be repaired, and are transmitted each time the cell replicates. Mutant cells will increase or decrease in frequency dependent on the ability of the cell to survive. Although DNA damage and DNA mutation are different, they are linked because damage will often cause errors in DNA synthesis during repair or replication, producing a mutation.
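How a single base change alters a codon, and sometimes the amino acid, can be shown with a very small lookup. The four codons below are taken from the standard genetic code; the function name is an illustrative assumption.

```python
# Minimal codon lookup (standard genetic code), written as mRNA codons
CODONS = {"GAG": "Glu", "GAA": "Glu", "GUG": "Val", "UAG": "Stop"}


def effect_of_base_change(codon, position, new_base):
    """Return (mutated codon, original amino acid, new amino acid)."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    return mutated, CODONS[codon], CODONS[mutated]


print(effect_of_base_change("GAG", 2, "A"))  # amino acid unchanged
print(effect_of_base_change("GAG", 1, "U"))  # amino acid changed
print(effect_of_base_change("GAG", 0, "U"))  # codon becomes a stop signal
```

The three calls illustrate that the consequence of a mutation depends on which base changes: the protein may be unaffected, altered or cut short.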

DNA damage from environmental factors

  • Radiation, from ultraviolet (UV) light (non-ionising), X-rays or gamma rays (ionising) can damage DNA in various ways.

    • UV light can directly damage the DNA by crosslinking adjacent thymine and cytosine bases, producing pyrimidine dimers, or can indirectly damage the DNA through the production of free radicals.

    • Ionising radiation breaks the phosphate backbone by severing the bond between oxygen and phosphate groups. Mechanisms exist to attempt to fix the broken ends, by joining to other pieces of DNA within a cell – a translocation. Where the translocation breakpoint is within or near a gene, then the function of the gene may be affected.

  • Thermal damage increases the loss of purine bases from the DNA backbone (depurination).

  • Human-made mutagenic chemicals, such as vinyl chloride, and polycyclic aromatic hydrocarbons, such as are found in smoke and tars, create a large number of different DNA adducts – changes to the DNA, such as oxidation (from reactive oxygen species), alkylation (often methylation) and hydrolysis (deamination, depurination, depyrimidination) of bases, or DNA crosslinking, or bulky covalently attached compounds, all of which are associated with carcinogenesis.

  • Viruses – viral proteins activate DNA damage pathways.

  • Plant toxins – some food substances (teas and coffees) and smoke flavourings have been associated with high levels of p53. The p53 gene is activated when DNA is damaged.

Spontaneous damage (DNA replication mistakes)

During DNA replication, the polymerase very occasionally makes a mistake, about once every 100 000 000 bases, even after normal repair mechanisms. These errors may also be induced by the presence of reactive oxygen species produced from normal metabolic processes, such as during oxidative deamination.
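The scale of this error rate is easier to appreciate as a count per genome copy. The sketch below assumes a haploid human genome of roughly 3.2 billion bases, a figure not given in the text above.

```python
BASES_PER_ERROR = 100_000_000        # one mistake per this many bases (from the text)
HAPLOID_GENOME_BASES = 3_200_000_000  # assumed approximate haploid genome size


def expected_errors(genome_bases, bases_per_error=BASES_PER_ERROR):
    """Expected number of replication mistakes when copying genome_bases once."""
    return genome_bases / bases_per_error


print(expected_errors(HAPLOID_GENOME_BASES))      # per haploid genome copy
print(expected_errors(2 * HAPLOID_GENOME_BASES))  # per diploid genome copy
```

Under these assumptions, each cell division would introduce a few dozen uncorrected changes across the whole genome.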

Repair mechanisms

A series of genes are involved in producing enzymes that assist in repair mechanisms, recognising altered bases, excising them and replacing them. The process is highly efficient, correcting approximately 99.9% of errors.

Direct reversal of base damage

  • Photolyase is activated by UV light and reverses the pyrimidine dimer formation induced by the light

  • Common damage involves methylation of cytosine, guanine or adenine bases (alkylation) and sustained exposure to this damage produces an adaptive response in which alkylation repair enzymes are upregulated.

Breakage repair

  • Single-strand breaks : these are repaired using the base excision repair (BER), nucleotide excision repair (NER) and mismatch repair (MMR) mechanisms.

Excision repair

  • Base excision repair (BER) : first the glycosylase enzyme removes the damaged base and other enzymes (AP endonucleases) cut the sugar–phosphate backbone at the resulting gap. This is followed by DNA polymerase beta (β) inserting the correct nucleotide; then DNA ligase enzymes mend the break.

  • Nucleotide excision repair (NER) ( Clinical box 5.2 ): protein factors recognise large helix-affecting damage, such as pyrimidine dimer formation, and transcription factor IIH (also involved in normal transcription) unwinds the DNA to produce a ‘bubble’. Cuts are made on the 3′ and 5′ side of the damage, removing a ‘patch’ of nucleotides. DNA polymerases delta (δ) and epsilon (ε) then synthesise the repair using the opposite strand as a template. Finally, DNA ligase binds the new piece into the backbone.

    Clinical box 5.2
    Xeroderma pigmentosum

    Xeroderma pigmentosum (XP) is a rare inherited disease that predisposes individuals to skin lesions (dry skin and freckles) and an increased incidence of skin cancer. The areas affected are particularly those areas exposed to sunlight. Several genes have been implicated, all involved in nucleotide excision repair. Patients are advised to avoid ultraviolet light exposure.

  • Mismatch repair (MMR) : this process corrects mismatches of normal bases and uses enzymes involved in BER and NER. The MSH2 protein recognises the mismatch and the MLH1 protein cuts it out. DNA polymerases δ and ε repair the patch. Mutations in the genes MSH2 and MLH1 predispose to colon cancer and they are therefore referred to as tumour suppressor genes .

  • Double-strand breaks (DSBs) : breakage of both strands of the double helix can result in genome rearrangements and is particularly hazardous to the cell. There are three main repair mechanisms for DSBs: non-homologous end joining (NHEJ), microhomology-mediated end joining and homologous recombination.

Non-homologous end joining

This process enables the direct joining of broken ends – the nucleotides involved do not have to be complementary. It uses short homologous DNA sequences (microhomologies) of 10, present in single-stranded overhangs on the ends of double-stranded breaks. When the overhangs are compatible, the repair is accurate, but otherwise there can be losses of nucleotides, potentially leading to translocations, such as the Philadelphia chromosome (see Ch. 12 ), or telomere fusions seen more frequently in tumour cells. Repair takes place in the G0/G1 and early S phase of the cell cycle.

Microhomology-mediated end joining

This mechanism is used in the S phase of the cycle, when NHEJ is inactive or unsuitable because of the deletions it would introduce, and involves a repair mechanism that is independent of the Ku heterodimer protein and of DNA-PK. It ligates the mismatched hanging strands and removes overhanging nucleotides (flaps), which are replaced with a short homology of complementary base pairs from the strands to fill in the gaps and to realign the molecule. Because the repair relies on microhomologous regions up- or downstream from the break, any important nucleotide sequences lost in the gaps can result in significant coding errors which may create oncogenes.
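Why joining at a microhomology loses sequence can be seen in a highly simplified model. The sketch treats each side of the break as a single string, ignores strand chemistry entirely, and uses hypothetical sequences and length limits; it is not a description of the enzymatic steps.

```python
def join_at_microhomology(left, right, min_len=2, max_len=10):
    """Join two broken ends at their longest shared short sequence.

    Returns (joined sequence, number of nucleotides lost), or None if no
    shared sequence of at least min_len is found.
    """
    for size in range(max_len, min_len - 1, -1):
        # Search the left end from the break backwards
        for i in range(len(left) - size, -1, -1):
            motif = left[i:i + size]
            j = right.find(motif)
            if j != -1:
                joined = left[:i + size] + right[j + size:]
                lost = len(left) + len(right) - len(joined)
                return joined, lost
    return None


left_end = "CCGATTGACTAG"    # hypothetical sequence ending at the break
right_end = "TTCTGACTGGCA"   # hypothetical sequence starting at the break

print(join_at_microhomology(left_end, right_end))
```

Here the two ends share the sequence TGACT; the flaps on either side of it and one copy of the shared sequence are lost, which is why this route is error-prone.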

Homologous recombination

Broken ends can also be repaired using information from the:

  • Homologous chromosome in G1 phase.

  • Sister chromatid from G2 phase.

  • Same chromosome, if there are duplicate copies of the gene on the same chromosome in opposite directions.

The process involves BRCA1 and BRCA2 ; mutations in the genes encoding these proteins predispose to breast and ovarian cancers.

Genes and development

The most common cause of infant death is a birth defect, and up to 3% of babies have a recognisable defect. Many of these defects are due to mutations in developmental genes. The prevalence of genetic abnormalities is even higher among foetuses that miscarry.

After fertilisation, the developing ovum not only undergoes simple multiplication but also changes to form different cell types, tissues and organs in a highly coordinated fashion. Understanding how cells with identical genes ( stem cells ) develop into different cell types has been studied through the discipline of developmental biology.

Many of the genes and pathways involved in human development are the same across a large range of species, and the function of many of these genes has been elucidated using non-human organisms, such as the roundworm, fruit fly, zebrafish, frog and mouse. Many of these organisms have fast generation times and so facilitate research by allowing the examination of a large number of events in a short period.

Embryo development begins with defining the major axes of the body: ventral/dorsal, anterior/posterior, medial/lateral and left/right. Cells are then differentiated and arranged spatially to form the tissues; organs and limbs are then formed through organogenesis. These processes are driven through the production of proteins that provide signals and switches.

Mediators of development

Developmental genes code for a variety of proteins with differing functions, e.g. signalling, DNA transcription and extracellular matrix components.

Signalling molecules

Signalling molecules allow interactions between cells. A protein is secreted by a cell and diffuses across the extracellular space to bind to a receptor on the target cell. These molecules are called paracrine signalling molecules and include:

  • Fibroblast growth factor (FGF) family

  • Hedgehog family

  • Wingless (Wnt) family

  • Transforming growth factor β (TGF-β).

Fibroblast growth factor and fibroblast growth factor receptor

The receptor for FGF (FGFR) is a glycoprotein consisting of peptides and immunoglobulin-type areas on the outside of the cell, with a protein that crosses the cell membrane and an intracellular tyrosine kinase . Different FGFs can bind to the receptor, leading to phosphorylation and activation of the tyrosine kinase. Many FGFs are involved in bone development, and mutations in the FGFR lead to a variety of skeletal problems in children ( Clinical box 5.3 ).

Clinical box 5.3
Achondroplasia is due to a defect in the fibroblast growth factor receptor

Individuals with achondroplasia (ACH) have short limbs in comparison with the trunk, a large head ( macrocephaly ) and a prominent forehead. ACH is usually caused by an amino acid substitution resulting from a mutation in the FGFR3 gene and has autosomal dominant inheritance, although ‘new’ mutations are very common. The gene produces a protein that controls chondrocyte (cartilage cell) proliferation; the mutation causes overactivation of the receptor and leads to inhibition of chondrocyte growth, resulting in bone shortening.

Sonic hedgehog

The hedgehog family of genes was named after a mutant form of hairy fruit fly. Vertebrates have similar forms, the most common of which is called sonic hedgehog (SHH) , involved in specifying the body axis. SHH binds to its receptor, a transmembrane protein called patched (PTC) , which suppresses transcription of Wnt and TGF-β and inhibits cell growth.

  • PTC somatic mutations affect the regulation of cell differentiation and cause cancer, such as basal cell carcinoma.

  • PTC germline mutations cause birth defects – rib anomalies, jaw cysts and cancer in later life – Gorlin syndrome .

Wingless (Wnt)

The Wnt genes were named after wingless mutant flies. A variety of different types are present in humans, and are involved in dorsal/ventral axis specification and development of various organs, binding to frizzled and low-density lipoprotein (LDL) receptors.

Wnt genes are involved in signalling processes throughout the cell and have been linked to the development of B-cell and lymphoid malignancies. Different forms appear to act as both tumour suppressors and tumour activators in cancer formation. Mutations in R-spondin proteins that are involved in Wnt signalling have produced inherited defects, such as anonychia (absence of nails) and sex reversal.

Transforming growth factor-β

TGF-βs are a large number of related genes that are involved with bone formation through the production of bone morphogenetic protein (BMP) .

DNA transcription factors

There are many families of transcription factors. These are genes that produce proteins that activate or repress other genes. They often share a common DNA-binding domain but have a wide variety of effects ( pleiotropy ).

Transcription factor genes are important in human development: HOX, SOX and T-box are important examples of these.

SOX family genes

There are a large number of SOX ( Sry-related HMG (high mobility group) box ) genes that are involved in neuronal development and sex determination. SOX genes have much in common with the SRY (sex-determining region on the Y chromosome) gene found on the human Y chromosome.

  • SRY produces DNA-bending proteins that promote events leading to male differentiation through:

    • Sertoli cell differentiation (these cells are found in the testes and aid sperm development)

    • Production of Müllerian-inhibitory factor (embryonic Müllerian ducts develop into the female reproductive organs).

  • SRY abnormalities can produce sex reversals:

    • XX males and XY females can be the result of a faulty crossover in meiosis between the X and Y chromosome, where the SRY gene is transferred to the X chromosome and the resultant affected offspring depends on the chromosome the father passes to his child

    • XY females can also be the result of a mutation in the SRY gene.

Extracellular matrix proteins

These are a large collection of molecules, including collagens and glycoproteins, that form the extracellular matrix and allow for cell migration.

A mutation in the fibrillin gene, responsible for the development of microfibrils in connective tissue, results in Marfan syndrome , with its multiple pleiotropic effects, leading to skeletal, cardiovascular and ocular defects.

Patterning

The positions of organs and appendages are laid down in patterns defined during embryogenesis under the control of a series of genes. Genes that specify the various body axes are important in this process.

Most of the knowledge of pattern formation has come from experimentation with ‘knockout’ mice and tissue transplantations between regions in the early embryo ( Fig. 5.15 ) . Mutations in many of the early developmental genes are likely to be lethal in humans.

Fig. 5.15, Cross-section of a chick embryo at 45 hours' incubation,

Anterior/posterior axis formation

The developing collection of cells in the fertilised ovum, or blastula, undergoes gastrulation in which the three embryonic layers – ectoderm, mesoderm and endoderm – are defined and an anterior/posterior thickening occurs – the primitive streak – under control of a series of HOX and other developmental genes.

There are a large number of HOX genes on different chromosomes and they are expressed at different times, in an order that specifies the positioning of cells and tissues along the anterior/posterior axis. A mutation in one of these genes results in the replacement of antennae by legs in the Drosophila fly; limb abnormalities have also been observed in humans.

Left/right axis formation

At the anterior end of the primitive streak lies the node , which is the source of the left-side–expressed nodal protein, stimulated by asymmetrical expression of SHH from the notochord .

Dorsal/ventral axis development

BMPs are growth factors influencing the development of bone and cartilage, and are also important in the embryonic development of the heart and central nervous system.

  • BMP-4 , secreted from the dorsal notochord in the mesoderm, along with SHH proteins, from the ventral notochord, establishes the dorsal/ventral axis and stimulates development of the overlying ectoderm.

  • Noggin and chordin are morphogens (proteins that govern pattern development) that spread from a node and are expressed in a concentration gradient across a tissue. They inhibit BMPs. Noggin is needed in embryonic development to form the neural plate that lies opposite the primitive streak, allowing it to develop into the neural tube and subsequently to form the brain and spinal cord.

  • Noggin and chordin bind directly to BMP-4, inhibiting its ventralising signal, promoting dorsalisation in the mesoderm region that experiences high concentrations of the proteins. Noggin and chordin are thus important in defining the back, as opposed to the front, of the developing embryo.

Organogenesis

Organogenesis occurs after gastrulation and involves many of the same proteins, expressed differentially.

Neuronal development

FGFs influence differentiation of the neural plate into the spinal cord. Absence of FGF towards the anterior part of the neural tube allows brain tissue to generate. HOX genes further control the development of the forebrain, midbrain and hindbrain from the neural tube. FGFs are also important in the development of the skull bones and limb formation.

Neural crest cells, located on the lateral edges of the neural folds and formed from ectoderm, are induced by BMP, Wnt and FGF signalling to migrate through the extracellular matrix. They are very important in development and there are four major types:

  • Cranial – important in development of tissues in the facial region.

  • Vagal and sacral – forming parasympathetic neurons involved in gut peristalsis and blood vessel dilation.

    • Hirschsprung disease leads to severe constipation. It is commonly due to mutations in the RET (rearranged during transfection) gene that is important for neural crest cell migration into the distal bowel.

  • Trunk – forms two populations that influence the development of skin pigment cells and neurons of the sympathetic nervous system.

  • Cardiac – influencing the development of the great vessels supplying the heart.

The asymmetrical heart

Although the developed heart shows left/right asymmetry, the embryonic heart is formed inverted and is bilaterally symmetrical. Various influences are needed to transform the heart tube into chambers and establish the asymmetry.

dHAND and eHAND are basic helix–loop–helix transcription factors that are expressed in different areas of the heart tube. Both factors are transcribed initially, but later dHAND predominates in the region destined to become the right ventricle, and eHAND in the region that will become the left ventricle. This, along with the right cardiac forward looping produced by the nodal protein, changes the anterior/posterior orientation into the left/right asymmetry of the developed heart.

Organ formation

Whereas the endoderm gives rise to the gastrointestinal tract, respiratory system, liver and pancreas, the mesoderm produces the circulatory, reproductive and urinary systems, the muscles and connective tissues, through complex interactions mediated by a variety of signalling molecules acting through complex interconnected networks.

Genes that are involved in the development of a particular organ are often also involved in the function of specialised cells within the organ, e.g.:

  • Transcription of the insulin gene in the β-cells of the pancreas is stimulated by binding of insulin promoter factor 1.

  • Mutations in the gene producing this factor prevent development of the pancreas.

Limb development

Much is known about genes that control limb development because, after congenital heart defects, abnormalities of the limbs are the most common birth defect. Many of the pathways and transcription controls involved in limb development are conserved throughout the animal kingdom; experimentation in model organisms, such as Drosophila, has helped in understanding the processes involved.

Limbs develop from the lateral plate mesoderm, which leads to bone and cartilage development, and the somatic mesoderm, which produces the muscles, nerves and blood vessels. The intermediate mesoderm, close to the Wolffian duct, is thought to be the origin of the signal that induces limb production from the apical ectodermal ridge (AER) structure in the ectoderm under control of FGF and Wnt signalling proteins. The protein FGF8, for example, is capable of producing a limb if transplanted into mesoderm tissue (Clinical box 5.4).

Clinical box 5.4
Thalidomide teratogenesis

Thalidomide was marketed in the late 1950s as an anti-sickness drug given to women in early pregnancy. Use within a particular gestational window led to multiple birth defects, predominantly characterised by an absence of arms.

Thalidomide has been shown to increase the production of free radicals, leading to oxidative stress. Nuclear factor κB, an anti-apoptotic transcription factor, is redox sensitive and it is proposed that a species-selective failure to bind to its DNA promoter, because of oxidative damage, leads to a failure of expression of fibroblast growth factor 10 (FGF10), which in turn attenuates expression of FGF8 in the apical ectodermal ridge, essential for limb development.

The AER, destined to produce the skin covering of the limb, sustains the limb formation within the progress zone in the mesoderm, where the limb develops. Both the AER and the progress zone of the mesoderm are needed for limb development. The developmental axes are important:

  • Proximal/distal: The length of time that cells spend in the progress zone determines their position along the proximal–distal axis. Removal of the AER leads to a truncated limb lacking the more distal elements; which elements are formed depends on how late the AER influence was removed.

  • Anterior/posterior: The AER provides signals to the zone of polarising activity (ZPA), at the root of the limb bud, and establishes the anterior/posterior axis (thumb/little finger) in the limb bud. FGF8 is needed to maintain the expression of SHH from the ZPA, which guides the anterior–posterior patterning through the asymmetrical expression of various downstream genes.

Apoptosis leads to separation of the fingers, stimulated by BMP signalling; noggin blocks cell death within the digits.

Epigenetic mechanisms

Epigenetic changes are those that change gene expression without altering the gene sequence. There are several mechanisms, of which the two principal ones are:

  • DNA methylation – in which cytosine residues are methylated where they are immediately followed by a guanine base in the 5′ to 3′ direction (CpG dinucleotides). Not all CpGs are methylated. Methylation patterns are heritable, but can change through life.

  • Histone modification – in which histones in nucleosomes are modified through methylation, acetylation or phosphorylation.
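As a small illustration of the CpG definition above, the following Python sketch lists the positions at which a cytosine is immediately followed by a guanine on a single strand written 5′ to 3′. The function name and the example sequence are invented for illustration; whether a given site is actually methylated cannot be read from the sequence alone.

```python
def find_cpg_sites(sequence):
    """Return 0-based positions of the C in every CpG dinucleotide.

    The sequence is assumed to be one DNA strand written 5' to 3'.
    """
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


# Hypothetical sequence, for illustration only.
example = "ATCGGCGTACCGA"
sites = find_cpg_sites(example)
print(sites)  # positions of candidate methylation sites
```

Each reported position marks a candidate methylation site only; in the laboratory, methylation status is measured experimentally rather than inferred from sequence.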

There are major epigenetic changes in gametogenesis and early embryo development. The DNA of egg cells and sperm cells is extensively methylated but, when these fuse, there is a massive demethylation until the early blastula stage and implantation of the embryo. At that point, a differential remethylation occurs which is more extensive in somatic cells, enabling differing gene expression in different body tissues.

Environmental exposure can also lead to epigenetic modification and there is an overall increase in methylation with age. Even identical twins can be differentiated from an assessment of their global methylation. Epigenetic mechanisms are particularly important in X-inactivation, genetic imprinting and the development of cancer (see later).

Human Genetic Variation

Polymorphisms

EB Ford was a British ecological geneticist who in 1940 defined genetic polymorphism as ‘The occurrence together in the same habitat of two or more discontinuous forms, or phases, of a species in such proportions that the rarest of them cannot be maintained merely by recurrent mutation.’

Polymorphisms are all the result of mutational events:

  • Mutations in the germline cells that produce the gametes are inherited mutations. Some of these will produce deleterious effects, leading to inherited diseases, whereas others will have no effect, and are therefore passed down the generations as a polymorphism that is maintained at a reasonable level, generally in more than 1% of the population.

  • Mutations of somatic cells potentially result in cancer.

The different polymorphisms, or different sequences of DNA at a particular region (locus) of the genome, are referred to as alleles of a gene. Each person inherits half their DNA from one parent and half from the other. Individuals sharing the same allele on both their chromosomes are said to be homozygous for that particular polymorphism; those having different alleles are therefore heterozygous.

Mutation or polymorphism?

The distinction between what is a pathogenic mutation and what is a polymorphic variation is not always clear.

  • A mutation can be defined as an alteration in the (normal) DNA sequence that affects protein function or expression.

  • A polymorphism is defined as a locus where two or more alleles have been seen at a frequency greater than 0.01 (or 1%) in the population. This does not mean, however, that rarer polymorphic changes do not exist, or that polymorphisms have no effect on expression; such effects may simply not yet have been recognised.

This implies therefore that a polymorphism is a mutation that is compatible with life in its heterozygous form.
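The 1% criterion and the homozygous/heterozygous distinction can be made concrete with a short Python sketch. It counts alleles in a made-up sample of diploid genotypes and applies the threshold under one possible reading of the definition (at least two alleles each above 1%). The function names and the sample data are assumptions for illustration.

```python
from collections import Counter

POLYMORPHISM_THRESHOLD = 0.01  # the 1% frequency quoted in the text


def allele_frequencies(genotypes):
    """Count alleles over diploid genotypes given as (allele, allele) pairs."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def is_polymorphic(frequencies, threshold=POLYMORPHISM_THRESHOLD):
    """True if at least two alleles each exceed the threshold frequency."""
    common = [a for a, f in frequencies.items() if f > threshold]
    return len(common) >= 2


def zygosity(genotype):
    """Label a genotype by whether its two alleles are the same."""
    return "homozygous" if genotype[0] == genotype[1] else "heterozygous"


# Made-up sample of 100 people at one locus with alleles "A" and "G".
sample = [("A", "A")] * 90 + [("A", "G")] * 9 + [("G", "G")] * 1
freqs = allele_frequencies(sample)
print(freqs)
print(is_polymorphic(freqs))
print(zygosity(("A", "G")))
```

In this sample the G allele accounts for 11 of 200 chromosomes (5.5%), so the locus meets the frequency criterion even though only one person is homozygous for G.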

Types of mutational events leading to polymorphisms

At the nucleotide level, polymorphisms can be a single base substitution, the insertion or deletion of one or more bases, or repeat length or copy number polymorphisms and rearrangements. They are present throughout the genome and can be subdivided into those that alter a protein sequence (coding polymorphisms) and those that do not (non-coding polymorphisms).

Single nucleotide polymorphisms

The most common type of polymorphism in the human genome is the single nucleotide polymorphism (SNP). Each SNP will have two, or occasionally three, alleles. Approximately 10 million SNPs have been identified with a minor allele frequency of greater than 0.5%. SNPs are not evenly distributed across the genome; the greatest diversity is found at the human leukocyte antigen (HLA) locus, the least on the sex chromosomes. Less than 1% of SNPs are predicted to result in changes in the composition of proteins; however, they have been shown to contribute significantly toward phenotypic diversity.

Sequence variations that do not result in a change in the encoded amino acid are referred to as silent substitutions; those that produce an amino acid change, as missense mutations (e.g. sickle cell disease; Clinical box 5.5); and those that produce stop codons, as nonsense mutations.
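The three categories can be illustrated with a minimal Python sketch that translates a reference codon and a variant codon using the standard genetic code and compares the results. The function name is invented for the example, and the sketch deliberately ignores cases such as the loss of a stop codon.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code in one-letter amino acid symbols; "*" marks a stop codon.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Classify a single-codon change as silent, missense or nonsense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"
    return "missense"  # simplification: loss of a stop codon also lands here


print(classify_substitution("GAG", "GTG"))  # glutamic acid -> valine: missense
print(classify_substitution("GAG", "GAA"))  # both glutamic acid: silent
print(classify_substitution("GAG", "TAG"))  # glutamic acid -> stop: nonsense
```

The first call corresponds to the GAG to GTG change of sickle cell disease described in Clinical box 5.5.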

Clinical box 5.5
Sickle cell disease

Sickle cell disease is an example of a missense mutation in which A is replaced by T at the 17th nucleotide of the β-chain haemoglobin gene. The normal GAG codon for glutamic acid thus becomes GTG, which encodes the amino acid valine. When it is present in two copies, it results in the sickling (a sickle shape) of red blood cells. The consequences of this are numerous, and include severe anaemia and tissue and organ damage due to the accumulation of the rigid red cells in small blood vessels (see Ch. 12 ).

Deletions and insertions (Clinical box 5.6)

Deletions and insertions are defined as the loss or addition of one or more bases from a DNA sequence. Because three nucleotides code for a single amino acid, the loss or gain of a number of bases that is not a multiple of three shifts the reading frame and is more likely to produce major detrimental effects (frameshift mutations).
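The effect on the reading frame can be seen in a short Python sketch that splits a coding sequence into codons before and after a deletion. The sequence is hypothetical and the helper names are invented for the example.

```python
def split_codons(sequence):
    """Split a coding sequence into complete three-base codons."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def is_frameshift(indel_length):
    """An insertion or deletion shifts the frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


original = "ATGGAGGAGAAGTCT"                 # hypothetical coding sequence
one_deleted = original[:4] + original[5:]    # delete a single base
three_deleted = original[:3] + original[6:]  # delete one whole codon

print(split_codons(original))
print(split_codons(one_deleted))    # every codon after the deletion changes
print(split_codons(three_deleted))  # one codon lost, the rest unchanged
print(is_frameshift(1), is_frameshift(3))
```

Deleting one base alters every downstream codon, whereas deleting three bases removes a single codon and leaves the remainder of the frame intact.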

Clinical box 5.6
Examples of deletions and insertions

Cystic fibrosis is a common autosomal recessive genetic disorder among Caucasians, affecting as many as 1 in 2000 in northern Europe, leading to fibrotic lesions throughout the body, but particularly affecting the lungs, pancreas and intestines. The most common mutation involves a three-base deletion that removes the amino acid phenylalanine from the cystic fibrosis transmembrane conductance regulator (CFTR) protein, although more than 1300 mutations have been described involving several different mutational types.

The gene codes for a chloride channel protein controlling the reabsorption of chloride from sweat gland secretions as they travel down the tubule in the dermis to the skin surface where sweat is secreted as a hypotonic solution. The mutant gene needs to be present in two copies (recessive), inherited from both parents, to produce the disease. CFTR mutations lead to the negatively charged chloride ions being trapped in the tubule and this accumulation also prevents the movement of positively charged ions, such as sodium. The ions combine to form salt, which is found in large amounts within sweat glands in this condition. The sweat secretions are hypertonic and this phenomenon forms the basis of the sweat test for cystic fibrosis. CFTR acts as a chloride channel and regulator of ion transport in many organs, particularly the lungs and pancreas, where the salt imbalance leads to loss of water and results in thick obstructive mucus-like secretions. The secretions form an ideal environment for bacterial infection in the lungs and block pancreatic ducts, which reduces the secretion of pancreatic enzymes and limits absorption of fats and some proteins in the digestive system, evidenced by the appearance of fatty stools.

Huntington disease results from increased numbers of repeated CAG (glutamine Q) sequences (polyQ) that produce a mutant Huntingtin protein (mHTT), which interferes with synaptic transmission in the brain. Neuronal transmission worsens with increasing size of the polyQ component.

PolyQ lengths involving more than 36 glutamines lead to increased neuronal death. An early age of onset and more rapid progression of the disease are associated with increased polyQ lengths. The autosomal dominant inheritance is also affected by ‘dynamic’ mutations, in which the number of repeats is not always exactly copied. The disease occurs at higher prevalence in the Afrikaner population of South Africa, which is thought to be the result of a founder effect (see later) traced to a Dutch man who arrived there in 1652 (see also Clinical box 5.7).

Fragile X syndrome results from stretches of CGG repeats in the X chromosome. Many repeats can be present without obvious effect, but the repeat stretch tends to get longer from generation to generation and several thousand repeats have been described. These very long repeats weaken the structure of the X chromosome and can lead to varied conditions, including intellectual disability. Because females carry two X chromosomes, it is predominantly males who are severely affected, though females can show variable clinical symptoms.

Sometimes the proteins produced are truncated or not produced at all. These types of mutation are often associated with recessive diseases. A four-base pair insertion in the HEXA gene results in low activity of an essential lysosomal enzyme, leading to Tay–Sachs disease when inherited in a recessive fashion. The mutation is prevalent in Ashkenazi Jews and some French Canadians and leads to nerve cell deterioration early in life.

Some disorders are caused by insertions of nucleotides in multiples of three, preserving the reading frame. Several trinucleotide-repeat diseases exist, such as Huntington disease (large numbers of CAG repeats leading to polyglutamination) and fragile X syndrome (CGG – arginine repeats).
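Counting a trinucleotide repeat is a simple string operation, sketched below in Python. The code finds the longest uninterrupted run of a repeat unit and compares it with the figure of 36 glutamines quoted above for Huntington disease. The function names and test sequences are invented, and the sketch is not a substitute for laboratory sizing of the repeat tract.

```python
import re

HD_THRESHOLD = 36  # repeat count quoted in the text for Huntington disease


def longest_repeat_run(sequence, unit="CAG"):
    """Return the largest number of back-to-back copies of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def exceeds_hd_threshold(sequence):
    """True if the longest CAG run is longer than the quoted threshold."""
    return longest_repeat_run(sequence, "CAG") > HD_THRESHOLD


# Hypothetical sequences, for illustration only.
short_tract = "ATG" + "CAG" * 20 + "CCGCCA"
long_tract = "ATG" + "CAG" * 45 + "CCGCCA"
print(longest_repeat_run(short_tract), exceeds_hd_threshold(short_tract))
print(longest_repeat_run(long_tract), exceeds_hd_threshold(long_tract))
```

The same function can be called with `unit="CGG"` to count the repeat associated with fragile X syndrome.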

Gene duplications

In meiosis, when the sister chromatids line up, a slight mismatch can result in an unequal crossover, which can leave two copies of a gene on one chromosome, e.g.:

  • Families showing two copies of the aldosterone synthase gene suffer from high blood pressure and are at increased risk of stroke.

  • Charcot–Marie–Tooth disease arises from a duplication on chromosome 17. One of the genes involved produces a protein involved in myelin formation; demyelination is one characteristic of this disease.

The unequal crossover can also interrupt the promoter region of the gene (promoter mutation), which may result in different gene expression.

Consequences of genetic mutation

Not all genetic variation is detrimental to the organism. SNPs can have positive or negative effects, or can be neutral. Sickle cell disease in its homozygous state is almost always lethal and one would expect the allele to be selected against, and therefore be very rare in all populations. The sickle gene is, however, present in approximately 15% of the black population. In its heterozygous state it causes mild anaemia but confers resistance to malaria. Thus, the heterozygous individual is at an advantage in populations where malaria is present (Fig. 5.16). Comparison of African Americans with Africans has shown that the sickle cell gene is much reduced in African Americans, indicating that the gene is being eliminated where there is absence of positive selection pressure. This is an example of a balanced mutation.

Fig. 5.16, Map of the Old World showing regions where Plasmodium falciparum malaria and sickle cell trait are prevalent.
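The balance between selection against homozygotes and the heterozygote advantage described above can be illustrated with the standard one-locus selection model from population genetics. The Python sketch below is a toy calculation: the relative fitness values are invented for illustration and are not estimates from real populations. When the heterozygote is fittest, with fitnesses 1 − s, 1 and 1 − t for the normal homozygote, heterozygote and sickle homozygote, the sickle allele settles at a frequency of s/(s + t); when the advantage is removed, its frequency declines.

```python
def next_allele_frequency(q, w_aa, w_as, w_ss):
    """One generation of selection on the sickle allele frequency q.

    w_aa, w_as and w_ss are relative fitnesses of the normal homozygote,
    the heterozygote and the sickle homozygote (illustrative values only).
    """
    p = 1.0 - q
    mean_fitness = p * p * w_aa + 2 * p * q * w_as + q * q * w_ss
    return (p * q * w_as + q * q * w_ss) / mean_fitness


def simulate(q0, w_aa, w_as, w_ss, generations):
    """Apply selection repeatedly, starting from allele frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_allele_frequency(q, w_aa, w_as, w_ss)
    return q


# Assumed fitnesses where malaria is present: heterozygotes do best.
with_malaria = simulate(0.01, w_aa=0.85, w_as=1.0, w_ss=0.1, generations=500)
# Assumed fitnesses without malaria: no heterozygote advantage remains.
without_malaria = simulate(with_malaria, w_aa=1.0, w_as=1.0, w_ss=0.1,
                           generations=50)

print(round(with_malaria, 4))     # approaches s / (s + t) = 0.15 / 1.05
print(round(without_malaria, 4))  # lower: the allele is being eliminated
```

The second run mirrors the comparison in the text, in which the allele becomes rarer once the positive selection pressure is absent.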

Duplications lie at the core of evolution. Within single organisms, there are many genes with similar sequences (paralogous genes) that have resulted from repeated duplications of an ancestral gene. In contrast, orthologous genes are homologous genes in different species – these are likely to have descended from a common ancestor. For example, there is approximately a 99% similarity between human and chimpanzee DNA, with the differences mostly being seen in genes influencing the nervous system.

Paralogous genes can be beneficial by providing for redundancy. Removing a gene (such as in knockout mice) sometimes has little effect because the function has been taken over by a paralogue. Over time, one of a pair of duplicated genes can also mutate and acquire a new and advantageous function (adaptive evolution).

Founder effects

The founder effect was defined by Ernst Mayr as ‘The establishment of a new population by a few original founders (in an extreme case, by a single fertilised female) which carry only a small fraction of the total genetic variation of the parental population’, and is recognised when a particular polymorphism can be traced back to a single individual.

The reasons for this phenomenon are twofold. First, a particular area may become populated with a small number of individuals, with all subsequent generations originating from these people while the population remains isolated (Fig. 5.17). For example, many individuals living in Tristan da Cunha descend from the original British settlement of 15 people in 1816. One of the settlers was a carrier for retinitis pigmentosa, a disease that leads to premature blindness in affected homozygotes, and as a result the disease remains more prevalent in this small island population than elsewhere.

Fig. 5.17, Founder effect.
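The arithmetic behind the Tristan da Cunha example can be sketched in a few lines of Python. With 15 founders, one of whom carries a single copy of the allele, the allele makes up 1 of 30 chromosomes in the founding group. The source-population frequency used for comparison is a made-up value, and the sketch assumes that all 15 settlers contributed equally to later generations.

```python
def founder_allele_frequency(founders, carrier_founders):
    """Allele frequency among founders when each carrier holds one copy."""
    return carrier_founders / (2 * founders)


def chance_allele_present(source_frequency, founders):
    """Probability that at least one of the 2N founder chromosomes carries the allele."""
    return 1 - (1 - source_frequency) ** (2 * founders)


founder_freq = founder_allele_frequency(founders=15, carrier_founders=1)
assumed_source_freq = 0.005  # hypothetical frequency in the parental population

print(f"founder frequency: {founder_freq:.3f}")
print(f"enrichment over assumed source: {founder_freq / assumed_source_freq:.1f}x")
print(f"chance a founder group carries it: "
      f"{chance_allele_present(assumed_source_freq, 15):.2f}")
```

Most small founder groups would not carry a rare allele at all, but when one does, its starting frequency is far higher than in the parental population.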

The second reason concerns the origin of a particular set of Y chromosome polymorphisms. The male Y chromosome is passed without change (other than rare mutations) through the generations. Thus, males with the same paternal ancestors are very likely to share identical Y chromosome polymorphisms (known as a haplotype, as there is only one Y chromosome).

Genghis Khan and the founder effect

One particular Y chromosome haplotype is found in approximately 8% of the population in the former Mongolian Empire, and has spread throughout the world population. Although the success of this haplotype could be the result of it having some form of biological advantage, scientists have suggested that it could originate from the dynastic family of Genghis Khan and his male relatives, whose dominance accompanied the spread of the Mongolian Empire across the whole of Asia. Social norms were very different at the time and Khan's male descendants appear to have sired many sons through associations with a large number of women.

Bottlenecks

Sometimes the same effect can be the result of a bottleneck, where only a few individuals pass through, or survive, and then expand later. The individuals who pass through the bottleneck may have some polymorphisms that are rare in the original population, but proportionately are not so rare in the second new population.

A bottleneck is one possible reason for the very different mitochondrial sequences seen when comparing African and non-African populations. African mitochondrial DNA shows high divergence, whereas non-African lineages appear to be less divergent and originate from an African branch. This supports an out-of-Africa origin of humans and suggests that there might also have been a bottleneck some 80 000 years ago, with a relatively small population thereafter populating the whole of Europe and Asia (Fig. 5.18).

Fig. 5.18, Bottleneck effect.
