Transcription


Congenital Heart Disease: Mutations of Transcription Factors

Normal Heart Development

The progenitor cells for mammalian heart development are located in the anterior lateral plate mesoderm. At about day 15 of human embryonic development, the progenitor cells condense into two lateral heart primordia. These include precursors for the myocardial and endocardial lineages. At 3 weeks of human development, the cardiac precursors move toward the center, forming a primitive linear heart tube. This tube loops to the right as cells from the second heart field are added to the inflow and outflow locations. Next, endocardial cushions (a subset of cells involved in septation) begin to form within the outflow tract (OFT); they also form at the common atrioventricular canal during the sixth and seventh weeks of development. These cushions aid in the separation of the heart into four chambers and divide the OFT into the aorta and pulmonary artery. At this time the early conduction system begins developing, and contributions from the neural crest and proepicardium (progenitor cells near the venous pole) occur. Later on, the heart undergoes extensive remodeling before assuming the mature four-chambered structure with divided inflow and outflow. The developing heart also forms valve leaflets and a functional conduction network. The development of the human heart is summarized in Fig. 12.1.

Figure 12.1, Schematic representation of several stages of human heart development. Ao, Aorta; AVC, atrioventricular canal; CCS, cardiac conduction system; CM, cardiac mesoderm; LA, left atrium; NCC, neural crest cells; OFT, outflow tract; PA, pulmonary artery; RV, right ventricle; RA, right atrium; SAN, sinoatrial node; SV, sinus venosus; V, ventricle.

Transcription Factors Control Heart Development

Normal development of the heart depends on the functions of several transcription factors. The central group of factors is the GATA family of zinc-finger proteins (GATA4, 5, and 6). Others are the MADS box proteins (the term MADS is derived from the first letters of the four original members of this group: MCM1, AG, DEFA, and SRF). These proteins recruit other transcription factors into regulatory complexes. Other core factors in heart development are the T-box factors (Tbx1, Tbx2, Tbx3, Tbx5, Tbx18, and Tbx20). Isl1, a LIM-homeodomain (HD) protein, is also essential. These factors interact with themselves and with other transcription factors to regulate the maturation of the cardiac chambers. They are also critical for the development of the conduction system and for remodeling of the endocardial cushion. The most studied cardiac transcription factors are the homeobox protein Nkx2–5, GATA4, and Tbx5; all three are critical for the development of the heart, and congenital heart disease is associated with mutations in their genes. Fig. 12.2A and B shows the transcription factors, and their interactions, that are involved in myocardial development and heart morphogenesis.

Figure 12.2, (A) Transcription factor pathways involved in myocardial development and heart morphogenesis. Hand2, Member of basic helix–loop–helix family of transcription factors; Irx4, homeobox gene product; ISL1, required for motor neuron generation; Jarid2, interacts with AT-rich domains; MEF2C, myocyte-specific enhancer factor 2C; SMADs, cell-signaling proteins. (B) Transcription factor pathways involved in the development, maturation, and function of the CCS. CCS, Cardiac conduction system.

GATA4

GATA4 is a zinc-finger DNA-binding protein of 442 amino acids and a molecular weight of 44,580 Da. It contains two zinc fingers, at positions 217–241 and 271–295. It is a transactivation factor that binds to the sequence 5′-AGATAG-3′. GATA4 plays a key role in the development of the heart. It is involved in the induction of cardiac-specific gene expression mediated by bone morphogenetic protein (BMP) through its binding to the BMP response element DNA sequences within cardiac activating domains. In cooperation with another transcription factor, NKX2.5 (or Nkx2–5), it promotes cardiac myocyte enlargement. GATA4 has several biological functions, but a major one is the morphogenesis of the atrial septum, including the development of the atrioventricular canal and the associated valve formation.

Point mutations, each changing a single amino acid, occur at nine positions in the GATA4 protein. These cause ventricular septal defects (VSDs), which result in abnormal communication between the lower two chambers of the heart. A VSD may occur alone or in combination with other cardiac malformations. If these defects go unrepaired, they can result in enlargement of the heart, congestive heart failure, pulmonary hypertension, arrhythmias, and possibly sudden cardiac death. A schematic of the GATA4 protein showing its functional domains, the locations of mutations, and the associated congenital heart defect phenotypes is shown in Fig. 12.3.

Figure 12.3, Schematic of GATA4 protein indicating the location of mutations and phenotypes of congenital heart defects. The specific point mutations are indicated [e.g., S52F denotes serine at amino acid 52 mutated (in the gene) to phenylalanine]. ASD, Atrial septal defect; AVSD, atrioventricular septal defect; HRV, hypoplastic right ventricle; NLS, nuclear localization signal (containing basic amino acids); PAPVR, partial anomalous pulmonary venous return; PDA, patent ductus arteriosus; PS, pulmonary valve stenosis; PTA, persistent truncus arteriosus; TOF, tetralogy of Fallot; VSD, ventricular septal defect; ZF, zinc-finger domain.

GATA4 is also phosphorylated at serine 105, the fourth residue of the sequence PPVSPRFSF, by mitogen-activated protein kinase (MAPK) (also known as extracellular signal-regulated kinase). The phosphorylated form is apparently more active in binding to DNA.

Nkx2.5

NKX2.5 is a homeobox protein of 324 amino acids and a molecular weight of 34,918 Da that is involved in the differentiation of the myocardial lineage. It is a transcriptional activator of the atrial natriuretic factor in cooperation with GATA4. It is under transcriptional control by PBX1 (pre-B-cell leukemia homeobox 1). The domains of the NKX2.5 protein are shown in Fig. 12.4.

Figure 12.4, Mutations in NKX2.5 that cause various cardiac anomalies. NKX2.5 contains two exons encoding a 324 amino acid protein, including a TN domain (the name derives from Drosophila), a homeodomain (black), and an NK2 domain. Truncation mutations are shown above, missense mutations below. NK2 domain (amino acids 212–234); TN, tinman domain (amino acids 10–21); homeobox domain (amino acids 138–197). Note the clustering of mutations within the homeobox itself.

A point mutation within the homeobox domain occurs at amino acid 145, where phenylalanine is converted to serine (F145S). The homeobox is a DNA-binding domain (DBD) first discovered in Drosophila. It is a sequence of about 180 base pairs encoding about 60 amino acids. The homeobox domain occurs in transcription factors that are involved in the patterns of anatomical development. The HDs of transcription factors have a characteristic DNA-binding fold. The crystal structure of the NKX2.5 homeodomain is known; it binds to two DNA sequence motifs, TGAAGTG and TCAAGAG, occupying both at the same time (Fig. 12.5).
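The mutation notation and the base-pair arithmetic in this paragraph can be checked with a short sketch. The Python below is only illustrative: the names `parse_point_mutation` and `domains_containing` are hypothetical helpers, and the domain boundaries are the ones quoted in the Fig. 12.4 legend (TN 10–21, homeobox 138–197, NK2 212–234).

```python
import re

# NKX2.5 domain boundaries (inclusive residue numbers) as quoted in the Fig. 12.4 legend.
NKX2_5_DOMAINS = {
    "TN": (10, 21),
    "homeodomain": (138, 197),
    "NK2": (212, 234),
}

def parse_point_mutation(label):
    """Split a label such as 'F145S' into (original residue, position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a simple missense label: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement

def domains_containing(position, domains):
    """Return the names of all domains whose inclusive range covers the position."""
    return [name for name, (start, end) in domains.items() if start <= position <= end]

# A homeobox of about 180 base pairs encodes about 60 amino acids (3 bases per codon).
homeobox_bp = 180
homeodomain_residues = homeobox_bp // 3

original, position, replacement = parse_point_mutation("F145S")
affected = domains_containing(position, NKX2_5_DOMAINS)
print(homeodomain_residues, original, position, replacement, affected)
```

With the legend's boundaries, residues 138–197 span 60 positions, which agrees with the 180-base-pair figure, and position 145 falls inside the homeodomain.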

Figure 12.5, NKX2.5 is a homeodomain-containing transactivation factor regulating cardiac formation and function. Its mutations are linked to congenital heart disease. This is the first report of a crystal structure of the NKX2.5 homeodomain in complex with double-stranded DNA of its endogenous target, located within the proximal promoter −242 site of the ANF gene. The crystal structure was determined at 1.8 Å resolution and demonstrates that NKX2.5 homeodomains occupy both DNA-binding sites, which are separated by five nucleotides, without physical interaction between themselves. The two homeodomains show identical conformation despite differences in the DNA sequences they bind, and no significant bending of the DNA was observed. Tyr54, absolutely conserved in NK2 family proteins, mediates sequence-specific interaction with the TAAG motif. This high-resolution crystal structure of NKX2.5 protein provides a detailed picture of protein and DNA interaction, which allows the prediction of DNA binding of mutants identified in human patients. ANF, Atrial natriuretic factor.

NKX2.5 is apparently regulated by phosphorylation–dephosphorylation. The protein occurs both in the cytoplasm and in the nucleus, and its subcellular location is probably determined by its phosphorylation status.

Tbx5

Tbx5 (T-box transcription factor 5) is a protein of 518 amino acids and a molecular weight of 57,711 Da. It is found in the cell nucleus, but there are also reports of its location in the cytosol, cytoskeleton, and Golgi apparatus. It is involved in the transcriptional regulation of genes specifying mesoderm differentiation, especially in heart development and in the differentiation of cardiac progenitors. Mutations in the gene for this transcription factor are associated with the Holt–Oram syndrome, a developmental disorder affecting the heart and upper limbs. Fig. 12.6 shows the gene for Tbx5 (A) and the mutations in the Tbx5 protein that cause the Holt–Oram syndrome (B).

Figure 12.6, Fourteen mutations in TBX5 that cause the Holt–Oram syndrome. (A) TBX5 genomic structure is shown with approximate intron sizes. Exons 1–9 are shown with vertical bars. Exons 1a, 1b, or 1c are alternatively spliced as the first exon of TBX5 cDNA. Alternative splicing of the 3′ region of the gene accounts for the variable addition of exon 9. Arrows indicate the translocation t(5;12)(q15;q24) found in the family IIa proband [designated t(5;12)], which disrupts TBX5 in intron 1a, and the location of intron AS mutations Int2ASC−2A and Int2ASG+1C. Acceptor site residues were numbered from the splice site with the conserved G residue designated as +1. (B) Schematic representation of TBX5 cDNA illustrating the target of the alternatively spliced transcripts. Untranslated sequence (dark shading), exons 1–9 (numbered boxes), and locations of amino acids 1 and 517 are indicated. Codons (gray shading) that encode the T-box DNA-binding domain [residues 56–238 (exons 3–7)] were defined by homology to other T-box gene family members. TBX5 mutations that are predicted to truncate are shown above; missense mutations are indicated below (bold, italics). Mutations are designated by the name and number of the first substituted amino acid residue: Δ, deletion; FSter, indicates frameshift mutations with resultant premature stop codons; Ins, Insertion; ter, indicates nonsense mutation. AS, Acceptor site.

Aside from the errors in the formation of the upper limbs, about 75% of persons with the Holt–Oram syndrome have potentially life-threatening cardiac problems. Usually, there is a defect (hole) in the septum separating the right and left sides of the heart. If the hole occurs in the septum between the upper chambers of the heart (atria), it is an atrial septal defect. If it occurs in the septum between the lower chambers of the heart (ventricles), it is a VSD. In addition, some patients have conduction disease involving abnormalities in the heart electrical system that can lead to bradycardia (slow heart rate) or fibrillation (uncoordinated heart rate). The Holt–Oram syndrome is autosomal dominant, meaning that only one copy of the altered gene is enough to cause the disease. The syndrome occurs in 1 of 100,000 individuals.

Transcription Factors Involved in Cardiac Hypertrophy

Hypertrophy of the heart is the abnormal enlargement of the heart muscle that results from increases in the size of the cardiac myocytes as well as changes in other components, such as the extracellular matrix. The condition results from biomechanical stress (including hypertension) and can progress to heart failure or even sudden death. It is a maladaptive process resulting from the hypertrophic signaling cascade, in which transcription factors are prominent, as summarized in Fig. 12.7.

Figure 12.7, Role of cardiac transcription factors in the regulation of cardiac gene program during cardiac hypertrophy. GATA4 transcriptional activity is stimulated through phosphorylation by ERK1/2 and p38 MAPK, although phosphorylation by GSK3β negatively regulates GATA4 activity. In addition, the transcriptional activity of GATA4 is regulated through physical interaction with NFAT, MEF2, SRF, or a cofactor, p300. Most important, in response to hypertrophic stimulation, NFAT is dephosphorylated by calcineurin and translocates into the nucleus, where it activates gene expression partly through forming a complex with GATA4. MEF2 transcriptional activity is enhanced through phosphorylation by p38 MAPK and ERK5 and physical interaction with GATA4, NFAT, and coactivator p300. In addition, MEF2 might be involved in the PI3K/Akt-mediated hypertrophic signal. Most important, MEF2 factors function as important effectors of Ca2+ signaling. MEF2 activity is stimulated by constitutively active calcineurin or CAMK in vivo. Activation of MEF2 is dependent on dissociation from Class II HDACs. Signal-mediated phosphorylation of HDACs recruits 14-3-3 chaperones to dissociate the HDAC-MEF2 complex, although the endogenous HDAC kinase has not been determined. Csx/Nkx2–5 might regulate cardiac gene expression (1) directly, (2) via association with GATA4 or SRF, and (3) via upregulation of Csl, which activates transcriptional activities of GATA4 and MEF2. Contribution of Csx/Nkx2–5 transcriptional activity to pathophysiological hypertrophic responses remains undefined. CAM, Calcium/calmodulin; CAMK, calcium/calmodulin-dependent kinase; Csl, transcription factor; GSK3β, glycogen synthase kinase 3β; HDAC, histone deacetylase; MAPK, mitogen-activated protein kinase; MEF2, myocyte enhancer factor 2; NFAT, nuclear factor of activated T cells; Rho/ROCK, small GTPase (Rho) kinase; SRF, serum response factor.

Transcription Factors and the Transcription Complex

Transcription is the synthesis of RNA [messenger RNA (mRNA)] from a DNA (gene) template. Initiation, elongation, and termination are the phases of transcription. Double-stranded DNA must be opened to enable RNA polymerase to bind to the gene promoter (the regulatory region usually upstream from the gene). The transcriptional apparatus is complex and requires that transcription factor proteins bind to DNA or associate with other proteins, including RNA polymerase, at the transcription site. A cartoon of the transcriptional complex is shown in Fig. 12.8.

Figure 12.8, The transcriptional complex in human cells. The core promoter DNA contains the TATA box (TATAAAA), which is often about 25–30 bases upstream from the start site. It is bound by TFIID (shown in red). Only TFIID binds directly to the TATA box DNA, while the other proteins, shown in blue, bind to each other and some directly to the TATA-binding protein. The TATA-binding proteins are the first occupants to be situated on the core promoter. Basal transcription factors (A, B, F, E, H) are essential for transcription to occur. These proteins, in their binding to each other, may be responsible for forming the loop of the double-stranded DNA and for positioning RNA polymerase at the transcriptional start site. There are also regulatory molecules: activators or repressors. Activators (yellow) communicate with the basal transcription factors through coactivators (in blue), which are proteins tightly complexed to the TATA-binding protein. TFIID, Transcription factor IID.

Repressors (purple oblong in Fig. 12.8) can bind to silencer sequences in DNA that can be located very far upstream from the core promoter. Enhancer DNA sequences, which may also be located very far upstream (thousands of bases) from the core promoter, form complexes with sequence-specific transcription factors called enhancer-binding proteins. Together with general factors such as transcription factor IIB (TFIIB) and TFIIA, these help to form the transcription complex, bringing the sites into direct contact and increasing the rate of transcription. Another example would be the proteins that bind to the CCAAT box.

A molecule of RNA polymerase II (pol II) binds to the transcriptional start site. The structure of Pol II in Fig. 12.9 shows a complex of 10 different proteins; the complete enzyme has 12 subunits.

Figure 12.9, Schematic of RNA polymerase II showing its 10 subunits in different colors. The dashed line in blue is double-stranded DNA. DNA enters the enzyme complex in a deep cleft formed by two main subunits, Rpb1 and Rpb2. A pair of “jaws” clamps the DNA strands as they enter the complex. Where the cleft ends, the double strand is unwound for a short distance (the “transcription bubble”) where the template (noncoding strand of DNA) forms a hybrid with the transcribed mRNA. The mRNA transcript leaves the complex through two grooves leading away from the active site. Below the active site is an opening that allows the entry of substrate nucleotides (for continuing mRNA formation) and the entry of regulatory transcription factors. Proofreading may also take place at this site. mRNA, Messenger RNA.

A simplified model of RNA pol II in a transcriptional complex is shown in Fig. 12.10 .

Figure 12.10, A simplified model of a complex for RNA Pol II-catalyzed transcription. A bridging protein, such as CBP/p300, would closely contact sequence-specific Tfs and nuclear hormone receptors, TBP, and TFIIB. The latter would not contact DNA but would complex with Pol II. A factor, such as CBP/p300, would form complexes with several other transcription factors without the involvement of DNA, such as nuclear receptors, CREB, AP-1, and Sap-1a. The latter can bind DNA and their activities are modulated by phosphorylation by mitogen-activated protein kinases and protein kinase A, which allows the networking and integration of plasma membrane and nuclear signaling pathways. AP-1, Activator protein 1; CBP, CREB-binding protein; CREB, cyclic AMP-response element-binding protein; Pol II, polymerase II; Sap-1a, serum response factor accessory protein 1a; TBP, TATA box–binding protein; TFIIB, transcription factor IIB; Tfs, transcription factors.

The start site is at the beginning of the information encoding mRNA. The template strand of DNA is the noncoding strand that is used for the formation of mRNA, which becomes a copy of the coding strand of DNA, except that thymine (T) is replaced by uracil (U). Pol II (bound about 35 base pairs upstream from the start site) reads the template strand of DNA in the 3′–5′ direction, and the mRNA is produced in the 5′–3′ direction, matching the coding DNA strand. Transcribing a specific gene requires a specific array of regulatory factors, which allows the transcription of one gene without turning on the expression of other genes that require a different array of transcription factors. In many cases a group of genes will be regulated by the same transcription factors. As mentioned previously, the process of transcription involves three phases: initiation, elongation, and termination. Initiation is the binding of RNA polymerase to the double-stranded DNA molecule. The core transcriptional elements are the TFIIB recognition element (BRE), the TATA box, the initiator element (INR), and the downstream promoter element, as schematically shown in Fig. 12.11.
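The relationship between the template strand, the coding strand, and the mRNA can be made concrete with a minimal sketch. The Python below assumes a made-up nine-base template strand written in the 3′–5′ direction; the function names are hypothetical and the code models only base pairing, not the enzyme.

```python
# Base pairing used when RNA is built on a DNA template (A pairs with U in RNA).
DNA_TO_RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base pairing between the two DNA strands.
DNA_TO_DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Return the mRNA (5'->3') built on a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA_PAIR[base] for base in template_3_to_5)

def coding_strand(template_3_to_5):
    """Return the DNA coding strand (5'->3') that pairs with the template."""
    return "".join(DNA_TO_DNA_PAIR[base] for base in template_3_to_5)

template = "TACGGCATT"            # hypothetical template strand, written 3'->5'
mrna = transcribe(template)       # AUGCCGUAA, read 5'->3'
coding = coding_strand(template)  # ATGCCGTAA, read 5'->3'

# The mRNA matches the coding strand except that T is replaced by U.
print(mrna, coding, mrna == coding.replace("T", "U"))
```

Because the template is written 3′–5′, pairing it base by base gives the mRNA directly in the 5′–3′ direction, so no reversal step is needed in this sketch.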

Figure 12.11, Elements of the basal eukaryotic promoter.

In the figure the sequences defining each motif are shown. The BRE has the sequence G/C,G/C,G/A,CGCC. The 5′ end of the TATA box starts where the 3′ end of the BRE finishes. Sometimes, at a position upstream, a CCAAT box can be found close to the initiator, INR. The CCAAT motif is often found when a TATA box is absent (the TATA box occurs in about 25% of human gene promoters); the CCAAT box specifies the binding of nuclear factor-1 and has the sequence: 5′-GGCCAATCT-3′. A CCAAT box is found in about half of the vertebrate promoters.
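The BRE consensus and its position directly upstream of the TATA box can be expressed as a simple pattern search. The sketch below is illustrative only: the promoter string is invented for the example, and a regular expression is just one convenient way to encode the alternatives G/C, G/C, G/A followed by CGCC.

```python
import re

# BRE consensus from the text: G/C, G/C, G/A, then CGCC (seven bases in all).
BRE = re.compile(r"[GC][GC][GA]CGCC")
TATA_BOX = "TATAAAA"

# Hypothetical promoter fragment (coding strand, 5'->3'), invented for illustration.
promoter = "CTAGGGACGCCTATAAAAGGC"

bre_hit = BRE.search(promoter)
tata_start = promoter.find(TATA_BOX)

if bre_hit is not None and tata_start != -1:
    # The BRE is adjacent to the TATA box when its end index equals the TATA start index.
    adjacent = bre_hit.end() == tata_start
    print(bre_hit.group(), bre_hit.start(), tata_start, adjacent)
```

In this made-up fragment the BRE match ends exactly where the TATA box begins, which is the arrangement the paragraph describes.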

The TATA-binding protein is required by all RNA polymerases (pol I, pol II, and pol III). The TATA box is about 25–30 base pairs upstream from the start site near the INR. However, transcription can occur when the TATA box is absent from the promoter. The transcriptional preinitiation complex is made up of transcription factors, including transcription factor IID (TFIID) during initiation, the TATA-binding protein, and RNA pol II bound to the promoter. The factors that assemble with the TATA-binding protein and its associated factors (TAFs) include TFIIB, TFIIE, TFIIF, and TFIIH (Fig. 12.12).

Figure 12.12, Constituents of the transcriptional preinitiation complex. On the left is a molecular model of the preinitiation complex. RNA pol II is in white; TATA-binding protein in green; TFIIB in yellow. The highlighted base pair in purple and red is the presumed initiation site for DNA strand separation. On the right is a schematic of the complex. The product of the last reaction is shown in Fig. 12.13. TFIIB, Transcription factor IIB.

To permit the binding of RNA pol II and the process of initiation, other factors play a role as shown in Fig. 12.13.

Figure 12.13, RNA pol II in the process of initiation.

RNA pol II is a complex enzyme consisting of 12 subunits (Rpb1 through Rpb12). The subunits 1, 2, 3, 5, 6, 8, 10, 11, and 12 are conserved in RNA pols I, II, and III. The total mass of all 12 subunits is 513.6 kDa.

The bending of DNA occurs when the TATA-binding protein binds to the TATA box. This provides a saddle structure for other transcription factors to form a complex (Fig. 12.12). TAFs involve histone acetyltransferases (HATs), protein kinases, coactivators, and other activities. The INR interacts with the TATA-binding protein and with the coactivator, SP1 (spacing relative to TATA-binding protein), and the largest subunit of RNA pol II. The sequence of the INR is YYA+1N(T/A)YY, where Y is a pyrimidine, N is any nucleotide, and the A at position +1 is the transcriptional start site. Translation, by contrast, begins at the start codon (usually AUG; specifies methionine), which lies immediately after the 3′-end of the 5′-untranslated region, the leader sequence. The leader sequence often includes information for the destiny of the translated protein but is not part of the coding region.
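The INR consensus can be written compactly in IUPAC nucleotide codes as YYANWYY (Y = C or T, W = A or T, N = any base), with the start-site A as the third symbol. The sketch below, which is only illustrative, converts such a consensus into a regular expression and reports where the +1 adenine falls; the test sequence and the helper names are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes needed for the motifs in this section.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}

def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[symbol] for symbol in consensus))

INR_CONSENSUS = "YYANWYY"
INR = consensus_to_regex(INR_CONSENSUS)
PLUS_ONE_OFFSET = 2  # the A at +1 is the third symbol of the consensus

def find_start_sites(sequence):
    """Return 0-based indices of the +1 adenine for every INR match, overlaps included."""
    width = len(INR_CONSENSUS)
    sites = []
    for i in range(len(sequence) - width + 1):
        if INR.fullmatch(sequence, i, i + width):
            sites.append(i + PLUS_ONE_OFFSET)
    return sites

sequence = "GGGTCAGTCTGGG"  # hypothetical fragment containing one INR (TCAGTCT)
print(find_start_sites(sequence))
```

The same converter handles the BRE if its consensus is written as SSRCGCC, so one small table of IUPAC codes covers both core promoter motifs.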

RNA pol II binds directly to the INR box, and initiation occurs when the TATA-binding protein and the TAFs dissociate from RNA pol II and RNA pol II begins a forward movement over the opened strand of DNA (strand opening by helicase and ATPase). Some of the transcription factors are dissociated, but TFIIB remains at the initiation site with other transcription factors. During elongation, RNA pol II uses a clamp mechanism to move forward. It also has a positively charged saddle structure for the binding of DNA and RNA. Three zinc ions stabilize the fold of the clamp that closes on DNA and RNA to trap the DNA template and the RNA transcript (Fig. 12.14). A molecular view of RNA polymerase is shown in Fig. 12.15.

Figure 12.14, Diagram of RNA pol II showing the clamp and the positions of DNA template strand and mRNA emerging.

Figure 12.15, RNA polymerase (blue) stalled while unwinding a nucleosome (orange, with DNA in red). Several elongation factors are in green, and a little piece of the transcribed RNA is seen poking out in magenta.

The active site, which includes the hybrid of DNA and the transcribed RNA being formed, has a preference for ribonucleoside triphosphates over deoxyribonucleoside triphosphates, as would be expected for the growing RNA. Fig. 12.16 shows the addition of a ribonucleoside triphosphate to the growing mRNA chain in the overall RNA pol II reaction.

Figure 12.16, Transcription by RNA polymerase II. The opened template strand of DNA is in pink. The mRNA chain is in green. The molecule incorporated into the growing mRNA chain is a ribonucleoside monophosphate produced by the cleavage and release of pyrophosphate (blue, step 5). mRNA, messenger RNA.
