Molecular Genetics of Corneal Disease


Key Concepts

  • Genes represent the fundamental units of heredity.

  • Identification of disease-causing genetic variants provides insight into pathogenesis of corneal disorders.

  • Population-based and family-based studies allow researchers to identify disease-associated genetic variants.

  • Genetic discoveries refine the classification of diseases historically defined by clinical phenotype.

  • Altered gene regulation may contribute to corneal disease.

Review of Genetics and Human Disease

Genes are the fundamental units of heredity. At the most basic level, genes are segments of deoxyribonucleic acid (DNA) molecules that specify the production of proteins, which in turn perform structural or enzymatic functions necessary for development and homeostasis. In this fashion, information stored in the DNA of each cell in an organism is used to create and organize cellular building blocks and on a larger scale to determine recognizable qualities or traits.

Heritable traits (and genetic diseases) are determined by variations in either the sequence or regulation of genes and the proteins they encode.

The DNA sequence of the human genome encodes approximately 20,000 protein-coding genes divided among the 23 pairs of chromosomes (22 pairs of autosomes and one pair of sex chromosomes, X and Y). Each chromosome consists of a single long molecule of DNA that is complexed with proteins, extensively folded, and packaged into a condensed element.

Although all of an organism’s genes are present in each of its cells, only a fraction of these genes are active in a given cell. Some genes that perform basic functions necessary for survival, such as energy production, are active in every cell of the body, whereas others are tissue specific. Tissue-specific activation of genes occurs because proteins are not produced directly from DNA. The DNA sequence of a gene is first converted into ribonucleic acid (RNA) in a process known as transcription. The nucleotide sequence of an RNA molecule specifies a series of amino acids that, when connected in succession, form a protein (translation). In a given cell of a complex organism, most genes lie dormant and are neither transcribed nor translated, and the proteins they encode are not produced. The subset of genes that are transcribed and subsequently translated into proteins give a cell specialized form and function and are the basis of cellular differentiation and tissue formation.
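As a minimal sketch of the two steps just described, the Python below transcribes a short DNA coding-strand sequence into RNA and then translates the RNA codon by codon. The 15-nucleotide sequence is invented for illustration, the function names are hypothetical, and the codon table is deliberately partial: it lists only the standard-code codons that this toy sequence uses.

```python
from typing import List

# Partial codon table (standard genetic code); only the codons used below.
CODON_TABLE = {
    "AUG": "Met",
    "CGG": "Arg",
    "UGG": "Trp",
    "GAA": "Glu",
    "UAA": "Stop",
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching the DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> List[str]:
    """Read the mRNA three nucleotides at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


# Invented toy sequence, not a real gene.
toy_gene = "ATGCGGTGGGAATAA"
# Same sequence with one nucleotide changed (C to T at the fourth position).
toy_variant = "ATGTGGTGGGAATAA"

print(translate(transcribe(toy_gene)))     # Met, Arg, Trp, Glu
print(translate(transcribe(toy_variant)))  # Met, Trp, Trp, Glu
```

The second call shows how a single-nucleotide change in the coding sequence can replace one amino acid in the encoded protein, here arginine with tryptophan.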

The structure of a gene includes several domains necessary for the production of a particular protein in a coordinated fashion ( Fig. 10.1 ). Only a portion of a gene’s DNA sequence (the coding sequence) specifies the amino acids that compose the encoded protein. In eukaryotic cells, the coding sequence is usually divided into several segments called exons, which are separated by noncoding DNA segments called introns.

Fig. 10.1, The structure of the BIGH3 gene. The coding sequence of BIGH3 is divided into 17 exons (represented by the boxes ) that are separated by 16 introns (represented by the space between the boxes ). Upstream (to the left) of the coding sequences is the promoter region that regulates the transcription (expression) of the BIGH3 gene. The positions of the amino acids most commonly mutated in BIGH3 -associated corneal disease (Arg124 and Arg555) are shown with arrows. The drawing is to scale except that introns greater than 500 nucleotides long have been truncated (indicated with a vertical slash mark ).

Although introns have historically been considered less relevant to genetic diseases, recent research suggests that significant regulatory capacity exists in areas not coding for proteins. In some cases, these introns may code for microRNAs (miRNAs), which are RNA molecules destined not for translation into proteins but rather for regulation of that process through the silencing of other RNA molecules. Similarly, small interfering RNAs (siRNAs) are short double-stranded RNAs that serve to prevent translation. Variants in such noncoding sequences have been implicated in disease pathogenesis; one example is a mutation in miR-184 in endothelial dystrophy, iris hypoplasia, congenital cataract, and stromal thinning (EDICT) syndrome.

“Upstream” of the coding sequence is the promoter region of the gene, which regulates transcription. The promoter contains DNA sequences that are binding sites for the enzyme (RNA polymerase) and cofactors that transcribe the gene from DNA into RNA. Tissue-specific activation of gene expression is primarily determined by differential binding of these cofactors to the promoter.

In addition to the effect of alterations in specific coding sequences, recent investigations have begun to explore the role of epigenetics in anterior segment diseases. In one well-studied epigenetic mechanism, the addition of methyl groups to DNA serves as a switch to turn expression of a gene “on” or “off,” thus contributing to alterations in protein levels. Further research will shed light on the importance of this phenomenon for the cornea.

Techniques Used to Identify Disease-Causing Genes

Investigation of the genetic basis of a disease begins with familial aggregation studies to determine if familial clustering exists, and heritability studies, such as twin studies, to provide evidence for genetic effects. Segregation analyses provide insight into the mode of inheritance. These initial studies can be conducted without DNA samples and are based on clinical phenotypes and familial relationships.

To identify genes responsible for heritable corneal diseases, researchers begin with genome-wide scans that localize disease-associated variants to particular chromosomal regions and/or specific genes. Population-based or family-based techniques may be utilized for association studies, whereas large families are particularly helpful for linkage analyses.

In population-based genome-wide association studies, the frequency of different types of genetic variations is compared between large numbers of patients with a disease and normal control subjects. In such studies, cohorts of patients and controls are typed with hundreds of thousands or even millions of genetic markers in search of a cluster of markers that are seen more frequently in patients than controls. These groups of disease-associated markers may define a region in the genome that contains a genetic risk factor for disease. Genome-wide association studies have been successful in mapping the location and identifying important genetic risk factors for common eye diseases such as Fuchs dystrophy, age-related macular degeneration, and exfoliation syndrome.
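The comparison at a single marker can be sketched as a 2 x 2 allelic test. The Python below is an illustrative example rather than the method of any particular study: the allele counts are hypothetical, the function name is an assumption, and the p-value uses the identity that a chi-square statistic with one degree of freedom has tail probability erfc(sqrt(chi2 / 2)).

```python
import math


def allelic_association(case_risk, case_other, ctrl_risk, ctrl_other):
    """Odds ratio, Pearson chi-square (1 df), and p-value for a 2 x 2 allele table."""
    a, b, c, d = case_risk, case_other, ctrl_risk, ctrl_other
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Tail probability of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Hypothetical allele counts at one marker (two alleles per person):
# 500 patients carry 300 copies of the candidate allele, 500 controls carry 200.
odds_ratio, chi2, p_value = allelic_association(300, 700, 200, 800)
print(f"odds ratio = {odds_ratio:.2f}, chi-square = {chi2:.2f}, p = {p_value:.2e}")
```

In a genome-wide study this test is repeated for every marker, so a much stricter significance threshold than 0.05 is needed; 5 x 10^-8 is a commonly used convention.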

Once a region has been identified, hypotheses about the mechanism of disease may be used to select and prioritize candidate genes for mutation screening. Recent advances in high-throughput next-generation sequencing have reduced the cost of screening large numbers of genes for potential variants. Genes are studied as possible causes of corneal disorders based on their function or expression pattern. Although some ocular diseases, such as gyrate atrophy and Leber hereditary optic neuropathy, are caused by genes expressed ubiquitously, many ocular diseases are caused by genes expressed in the eye. Therefore genes expressed in the cornea would be given higher priority as candidate genes for corneal dystrophies than genes not expressed in the cornea. Genes with known functions that suggest an association with a corneal disorder would also be selected for further study. For example, the sulfotransferase gene CHST6 was considered a good candidate for involvement in macular corneal dystrophy after biochemical studies identified a deficit of sulfated glycosaminoglycans (GAGs) in this disorder. Finally, genes that cause corneal disease when disrupted in an animal model are excellent candidates for involvement in human corneal disease. For example, a keratin 12 knockout mouse (i.e., a mouse genetically engineered to have a genome lacking the keratin 12 gene) has a fragile corneal epithelium and is predisposed to corneal abrasions and erosions. This observation suggested that the keratin 12 gene was a good candidate for causing Meesmann corneal dystrophy (MECD).

Candidate gene screening is useful for identifying disease genes when only individuals or small families affected with a disease are available for study. Candidate genes are evaluated by screening large cohorts of patients for disease-causing mutations in a specific gene. In traditional Sanger sequencing, the sequence of DNA in a region is determined by replicating DNA and adding dideoxynucleotide triphosphates (ddNTPs) to the elongating strand. Incorporation of a ddNTP terminates extension, producing fragments that end at each base position, and fluorescent or radioactive labeling of each type of ddNTP (A, C, G, or T) allows automated sequencing machines to read the series of nucleotides. In contrast, the recent widespread adoption of next-generation sequencing has decreased the time and cost necessary for sequencing of genomic data. The term next-generation sequencing refers to methods of high-throughput sequencing in which hundreds of thousands or even millions of small DNA sequences are produced in parallel and assembled based on overlapping regions.
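The idea of assembling short reads from their overlaps can be illustrated with a toy greedy merge. This is a sketch under simplifying assumptions (error-free reads, a single strand, invented sequences, hypothetical function names); production pipelines use far more elaborate methods, often aligning reads to a reference genome instead.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps; leave separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three invented, overlapping reads from a 15-nucleotide toy sequence.
toy_reads = ["GGTGGGAA", "ATGCGGTG", "GGGAATAA"]
print(greedy_assemble(toy_reads))
```

With these reads the merge recovers a single contig; reads that share no overlap of at least `min_len` nucleotides are returned as separate contigs.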

In family-based linkage studies, large families with many affected members are identified among whom disease is transmitted in a Mendelian fashion and linkage analyses are conducted to localize disease variants to a chromosomal location. Family members are typed with thousands of genetic markers with known chromosomal locations in search of markers that are coinherited (or linked) with the disease more often than can be explained by chance. This coinheritance is related to their physical nearness on a chromosome because genetic markers in close proximity to the disease-causing gene are less likely to be separated by a meiotic crossover. The likelihood of a crossover occurring between a marker and a disease gene is proportional to the distance between them. The known position of a linked marker indicates the chromosomal location of the disease-causing gene.

This approach requires no hypothesis regarding the pathogenesis of the disease studied or of the function of the disease-causing gene and is helpful for studying rare disorders for which the disease pathways are only poorly understood. However, common disorders may be multifactorial in origin and may not follow the strict Mendelian inheritance required for linkage analyses.
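Evidence for linkage between a marker and a disease is conventionally summarized as a LOD score, the base-10 logarithm of the odds that the observed coinheritance arose under linkage at recombination fraction theta rather than under free recombination (theta = 0.5). The Python below is a simplified sketch that assumes phase-known meioses that can each be scored as recombinant or nonrecombinant; the counts and function name are hypothetical.

```python
import math


def lod_score(recombinants, nonrecombinants, theta):
    """Two-point LOD score for phase-known meioses at recombination fraction theta."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    n = recombinants + nonrecombinants
    # log10 likelihood of the data if marker and disease gene are linked at theta
    log_linked = (recombinants * math.log10(theta)
                  + nonrecombinants * math.log10(1.0 - theta))
    # log10 likelihood if they are unlinked (theta = 0.5)
    log_unlinked = n * math.log10(0.5)
    return log_linked - log_unlinked


# Hypothetical family data: 20 informative meioses, 1 of them recombinant.
grid = [i / 100 for i in range(1, 51)]
best_theta = max(grid, key=lambda t: lod_score(1, 19, t))
print(best_theta, round(lod_score(1, 19, best_theta), 2))
```

For these counts the score peaks at theta = 0.05 with a LOD of about 4.3; a LOD score above 3 is the conventional threshold for declaring linkage.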

Disease-Causing Mutations Versus Non–Disease-Causing Sequence Variations

The human genome is composed of approximately three billion base pairs of DNA, and millions of DNA sequence variations exist between any two unrelated individuals. The vast majority of these differences are not associated with any detectable phenotype. Consequently one of the challenges of studying the genetics of human disease is differentiating between disease-causing mutations and non–disease-causing sequence variations.

Numerous criteria have been utilized to judge which sequence variations are causative of disease. Generally, a variant associated with disease must be present in patients more frequently than in control subjects and must alter the processing, protein sequence, or expression level of a gene. Statistical methods, sequence analysis, computer modeling, and functional studies have been used to infer which variations are truly pathogenic.

One can use statistical approaches to demonstrate a significant association between a sequence variation and disease, but the statistical methods appropriate for studies of many unrelated patients are different from those used to study individual families wherein many members are affected with a given disease. In population studies, the pathogenicity of gene variations may be strongly supported by demonstrating a significantly higher frequency of a certain variation among a large number of patients as compared with a large number of controls. A crucial aspect of this technique is that the subjects and controls must be well matched. Some non–disease-causing variations are specific to certain ethnic groups, and if the ethnicities of the subjects and controls are not well matched, such variations may incorrectly appear to be associated with disease. If a gene variation is found in affected members of a large family, one can use statistics to show that coinheritance of the variation and the disease occurs more often than can likely be attributed to chance. Analysis of sequence homology can also lend support to the pathogenicity of a particular sequence variation. Alterations of portions of genes that are identical in disparate organisms are considered more likely to cause disease than those that occur in portions of genes that were not conserved during evolution. Similarly, variations within known functional domains of a gene are often considered more likely to cause disease than variations in other portions of a gene.

Perhaps the strongest support of the pathogenicity of a sequence variation is to show directly that the variation harms the function of the protein encoded by the gene. This can be done with in vitro assays as well as with various types of animal models. Using molecular genetic techniques, the pathogenicity of a specific gene variation can be evaluated by creating a cell line or an animal that has the gene defect of interest. If such a model expresses a phenotype similar to the human disease, it is likely that the particular gene variation does cause disease. The MECD-like phenotype of the keratin 12 knockout mouse is an excellent example of this type of evidence for disease causation.

Recent advances in our understanding of epigenetics, trinucleotide repeat diseases, and roles for miRNA in disease pathogenesis shed light on the complexity of the genome, as not all disease-causing genetic variants are to be found in regions that code for proteins.

Terminology

According to the International Committee for the Classification of Corneal Dystrophies (IC3D), corneal dystrophy refers to “a group of inherited corneal diseases that are typically bilateral, symmetric, slowly progressive, and without relationship to environmental or systemic factors.” Although most dystrophies fit this description, variability exists regarding patterns of heritability and symmetry. The IC3D classification system for corneal dystrophies, originally published in 2008, was refined in 2015 on the basis of updates in our understanding of the pathogenesis of corneal dystrophies. These conditions are now classified into four categories: epithelial and subepithelial dystrophies, epithelial-stromal TGFBI dystrophies, stromal dystrophies, and endothelial dystrophies.

Epithelial and Subepithelial Corneal Dystrophies

Epithelial Basement Membrane Dystrophy

Epithelial basement membrane dystrophy (EBMD, OMIM # 121820 ) is caused by poor adhesion of epithelium to the basal lamina, resulting in symptomatic recurrent erosions and visual disturbances of glare and blur. Though EBMD is frequently considered to be sporadic or of traumatic origin, rare familial cases have been associated with variants in TGFBI on chromosome 5q31.

Clinical examination reveals irregular patches of epithelium—hazy islands with scalloped borders (maps) interspersed among round or oval gray inclusions (dots). Edges of overlapping epithelial layers appear as curvilinear lines (“fingerprints”). Episodic, spontaneous erosions produce lacrimation, glare, and pain.

EBMD has also historically been referred to as map-dot-fingerprint dystrophy, Cogan microcystic epithelial dystrophy, or anterior basement membrane dystrophy.

Epithelial Recurrent Erosion Dystrophies

The category of epithelial recurrent erosion dystrophies (ERED, OMIM # 122400 ) includes Franceschetti corneal dystrophy (FRCD), dystrophia Smolandiensis (DS), and dystrophia Helsinglandica (DH). These dystrophies are transmitted in an autosomal dominant pattern, with onset in the first decade of life and recurrent erosions that occur spontaneously or after minimal trauma. Repeated erosions result in subepithelial fibrosis. Particularly in the DS variant, these erosions may result in keloid-type structures prominent enough to require corneal transplantation. The genetic locus and genes remain unknown.

Subepithelial Mucinous Corneal Dystrophy

Subepithelial mucinous corneal dystrophy (SMCD, OMIM # 612867 ) has been described in a single family whose affected members developed bilateral subepithelial haze in childhood involving the whole cornea but most prominently in the center. Opacities occur in the first decade of life and are accompanied by painful recurrent erosions. Beneath the epithelium and anterior to the Bowman layer, fine fibrillar material is deposited, which stains with Alcian blue, is positive with periodic acid–Schiff (PAS), and is sensitive to hyaluronidase. Inheritance is likely autosomal dominant, and associated genetic variants remain unknown.

Meesmann Corneal Dystrophy

In MECD (OMIM # 122100 ), multiple tiny intraepithelial vesicles extend from limbus to limbus with intervening clear areas, primarily in the interpalpebral cornea. The lesions may appear as gray cysts or clear vacuoles on indirect illumination. In the Stocker-Holt variant of MECD, the punctate epithelial opacities stain with fluorescein and fine linear whorls of opacities may be present.

Mutations in the keratin genes, keratin 3 (KRT3) on chromosome 12q13 and keratin 12 (KRT12) on chromosome 17q12, have been implicated in MECD and the Stocker-Holt variant of MECD, respectively. These genes code for cornea-specific keratins, which form K3-K12 heterodimers and polymerize into cytoskeletal filaments that provide stability to the corneal epithelium; mutations in either of these genes may disrupt this process. All known mutations lie in helix initiation or helix termination motifs, suggesting that protein misfolding contributes to pathologic changes; a knockin mouse model demonstrates an increase in markers associated with the unfolded protein response.

Lisch Epithelial Corneal Dystrophy

Localized gray punctate epithelial opacities radiate from the limbus to the center of the cornea in flame-, band-, or feather-like whorls in individuals affected with Lisch epithelial corneal dystrophy (LECD, OMIM # 300778 ). Patients are generally asymptomatic unless the opacities extend to the central cornea. Inheritance is X-linked dominant, with males and females affected equally, and the locus maps to Xp22.3. The disease-causing gene in this locus has not yet been identified.

Gelatinous Drop-Like Corneal Dystrophy (GDLD)

GDLD (OMIM # 204870 ) is a type of subepithelial amyloidosis transmitted with autosomal recessive inheritance and caused by mutations in the tumor-associated calcium signal transducer 2 (TACSTD2) gene. It may present in the first to second decades of life with band keratopathy, a multinodular mulberry-type appearance, or a golden kumquat-like diffuse stromal opacity. These distinct presentations may represent either phenotypic variability or stages of disease progression. Disruption of superficial epithelial tight junctions results in hyperpermeability of the corneal surface and staining with fluorescein. Disease frequently recurs within a few years after transplantation.

Epithelial-Stromal TGFBI Dystrophies
