
Textbooks / Genetic Hearing Loss Willems 2004


Antonarakis and Scott

molecular genetics aspects of the first two phenotypes that are due to defects of the same gene, TMPRSS3, or ECHOS1.

A large Palestinian family from a small town in Israel (BT117) was described with more than 40 deaf individuals segregating an autosomal recessive form of nonsyndromic deafness (1). This kindred showed extensive consanguinity over the last seven generations. Hearing evaluation of affected and unaffected members by pure-tone audiometric tests showed severe deafness in the affected individuals, without any hearing remnants at a level of 75–80 dB. The same level of hearing loss was evident in all affected individuals, ruling out progressive deafness. The diagnosis of sensorineural deafness was confirmed in two 1-week-old girls by a brainstem-evoked potential test. None of the deaf individuals showed any signs of vestibular involvement, defects in ear morphology, mental retardation, or any other aberrations that could indicate that the deafness was part of a syndrome. A genome-wide linkage analysis using short sequence repeat (SSR) polymorphic markers resulted in mapping of the locus to chromosome 21q22.3 between marker D21S1260 and 21qter, a region of 12 cM. Homozygosity of only the most telomeric marker, D21S1259, was observed (1). This family defined DFNB10, an autosomal recessive, nonsyndromic, congenital deafness.

A large consanguineous Pakistani family (1DF) was described that segregated a recessive, nonsyndromic childhood-onset deafness (2). The age of onset of deafness was 10–12 years and hearing was completely lost within 4–5 years. Pure-tone audiometric tests, between 125 and 8000 Hz up to 120 dB, revealed a maximum audio threshold in both ears of affected individuals of 105 dB at 1000 Hz. Linkage analysis using SSR markers mapped this disease locus telomeric to D21S212 on chromosome 21q22.3, a large region of more than 15 cM (2). This family defined DFNB8, an autosomal recessive, nonsyndromic childhood-onset deafness.

As the description of these two families occurred independently in the same year, two different locus numbers were attributed to them. From various linkage and physical maps of distal chromosome 21q22.3 in 1996, it was clear that the two large genomic regions of linkage on chromosome 21 were overlapping. However, as the phenotypes of these two pedigrees were not identical (childhood onset in DFNB8 versus congenital deafness in DFNB10), it was thought they were likely to define two different loci as opposed to allelic variants (3).

III. POSITIONAL CLONING OF THE TMPRSS3 GENE

A. A TMPRSS3 Mutation Causes DFNB10

The group of Shimizu and Kudoh at Keio University in Japan was active in physical and transcription mapping of 21q22.3 as a preliminary to compiling the complete genomic sequence of the region. The advances of the physical map and preliminary genomic sequence of chromosome 21 allowed identification and ordering of a total of 50 SSR markers in 21q22.3, comprising 16 published and 34 new markers, precisely mapped and ordered on BAC/cosmid contigs. The use of these markers in linkage analysis on previously analyzed and additional members of the Palestinian family revealed informative recombinants that narrowed down the genomic mapping of the DFNB10 locus to between markers 1016E7.CA60 and 1151C12.GT45, a critical region (CR) of approximately 1 Mb (Fig. 1) (4).

With the available DNA samples from members of the Pakistani family, it was only possible to further refine the DFNB8 locus as being telomeric to D21S1225 (itself only approximately 500 kb telomeric to D21S212). By assuming DFNB8 and DFNB10 were in fact caused by the same gene, the CR could be refined to approximately 740 kb between D21S1225 and 1151C12.GT45.

Initially, there were six known genes/transcripts in the DFNB10 CR and all these genes (ABCG1, TFF3, TFF2, TFF1, PDE9A, NDUFV3) were excluded as being responsible for DFNB10 (4). By a combination of techniques, ending with detailed analysis of the complete genomic sequence of the CR, seven novel genes were defined (WDR4, SLC37A1, UBASH3A, ZNF295, UMODL1, TMPRSS3, and TSGA2). As transcript mapping and mutation analyses (by direct sequencing) were being performed at the same time, several predicted exons that did not end up being part of the 13 defined genes in the CR were also analyzed for mutations. Mutation analysis was performed in affected members of the BT117 Palestinian family for a total of 166 exons: 48 from the six known genes, 93 from the seven novel genes, and 25 orphan exons.

Figure 1 Schematic representation of the critical region of DFNB10 and DFNB8 on chromosome 21q22.3. The genes and their transcription orientation within the critical region are depicted as arrows.

After the mutation analysis of 163 exons was completed with negative results, an amplicon that had been refractory to amplification using DNA from affected, but not normal, individuals finally yielded a product. However, instead of the normal 476 nt, the amplified product was abnormally large at 1702 nt and was present in homozygosity in all affected individuals and in heterozygosity in all obligate heterozygotes of family BT117. This amplicon was exon 11 of the TMPRSS3 gene. Sequence of the abnormal DNA fragment in the patients revealed a deletion of 8 bp and the insertion of 18 complete β-satellite repeat monomers (~68 bp) in addition to 10 and two nucleotides derived from β-satellite repeats at the 5′ and 3′ ends of the rearrangement, respectively. This would result in a frameshift mutation from G393 within the protease domain of TMPRSS3 and termination at 404 amino acids after the addition of 11 unrelated amino acids. We concluded that this mutation in TMPRSS3 causes DFNB10 (5).
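The sizes reported for the exon 11 amplicon can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in this section; the variable names are illustrative, and the mean monomer length is a derived value, not a figure from the source.

```python
# Consistency check of the amplicon sizes reported for exon 11 of TMPRSS3.
NORMAL_AMPLICON_NT = 476    # expected product
MUTANT_AMPLICON_NT = 1702   # product obtained from affected individuals
DELETED_BP = 8              # deletion found by sequencing
FULL_MONOMERS = 18          # complete beta-satellite monomers inserted
PARTIAL_REPEAT_NT = 10 + 2  # partial repeats at the 5' and 3' ends

# Net size change of the amplicon and the implied length of the insertion.
net_gain = MUTANT_AMPLICON_NT - NORMAL_AMPLICON_NT
inserted_bp = net_gain + DELETED_BP

# Average monomer length implied by the insertion (derived, approximate).
mean_monomer_bp = (inserted_bp - PARTIAL_REPEAT_NT) / FULL_MONOMERS

# A net change that is not a multiple of 3 shifts the reading frame.
shifts_frame = net_gain % 3 != 0

print(net_gain, inserted_bp, round(mean_monomer_bp, 1), shifts_frame)
```

The implied insertion length of 1234 bp matches the figure given for the β-satellite insertion in this section, and the implied monomer length is close to the quoted ~68 bp.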

The sequences of the 18 inserted β-satellite monomers, although highly conserved, were also variable (52–93% divergence between repeats). The basic repetitive unit of β-satellites (or Sau3A repeats) is a monomer of ~68 bp that has been detected on the short arms of all human acrocentric chromosomes (13, 14, 15, 21, 22) in addition to chromosomes 1, 9 (centromeric), 19p, and Y (6–8). Shiels et al. (9) showed that in a 1.5-Mb domain on the short arm of chromosome 22, α-satellites were interspersed between satellite 3 and satellite 1 sequences, between α-satellite repeats and rRNA. At least four families of β-satellites have been defined to date, mainly by their mapping position and the restriction enzyme used to define a higher-order repeat (HOR). pβ4 (AccI HOR) and p21β2 (no HOR) satellites have been detected both distal and proximal to the rRNA genes on the acrocentric chromosomes while the p21β7 (AvaI HOR) satellites have only been detected distal to the rRNA genes. pβ3 satellites are defined by a 2.5-kb EcoRI HOR. In the 1234-bp insertion of β-satellites into exon 11 of TMPRSS3, there are no AvaI or EcoRI sites and two AccI sites. While the length of the β-satellite insertion does not allow definitive identification of the subfamily, it seems likely to be derived from either p21β2 or p21β7 repeats.

The mobile nature of repetitive sequences on the short arms of acrocentric chromosomes is well documented with frequent exchanges between the short arms of the different acrocentric chromosomes (e.g., Ref. 10). Small polydisperse circular DNAs (spcDNA) produced by unequal homologous recombination between or within repetitive sequences are a heterogeneous population of extrachromosomal circular molecules present in a large variety of eukaryotic cells. They contain repetitive sequences including β-satellite repeats (11,12), and many of the other repeats present on the short arms of human acrocentric chromosomes (reviewed in Ref. 13). The β-satellites inserted into TMPRSS3 in the DFNB10 family may be derived from recombination of spcDNA containing β-satellite repeats with a region of minimal homology spanning exon 11 of the TMPRSS3 gene (Fig. 2). While more classic chromosomal rearrangements involving β-satellites, such as Robertsonian translocations and inversions, have been described (14,15), this was the first description of β-satellite insertion into an active gene resulting in a pathogenic state. Our model of insertion of a repeat sequence into an area of a gene with minimal homology implies that other repetitive units may also be involved in similar mutagenic events, and it may be possible to predict potential sites of insertion. Insertions of nonretrotransposon repetitive elements into genes may not previously have been described because they may occur mainly in sporadic cases and may also happen during mitosis, resulting in somatic mutation.

Figure 2 The mechanism of β-satellite insertion. The proposed mechanism for the insertion of β-satellite repeats into the TMPRSS3 gene in the BT117 Palestinian DFNB10 family is shown with unequal crossing over (a) producing circular extrachromosomal DNA fragments containing β-satellite repeats (shaded gray, b). Homologous recombination within exon 11 of TMPRSS3 (c and d) results in the insertion of 18 β-satellite repeats (e).

B. A TMPRSS3 Mutation Also Causes DFNB8

We subsequently tested for TMPRSS3 mutations in the DNA of the patients of the Pakistani family 1DF with DFNB8, since the DFNB8 CR overlapped with the DFNB10 CR on chromosome 21q22.3. A G-to-A mutation at position -6 of IVS4, possibly creating a novel acceptor splice site, was found in homozygosity in the affected members of this family. In vitro splicing analyses of normal and mutant genomic fragments containing exons 4 and 5 of TMPRSS3 revealed a 4-bp insertion between exons 4 and 5, consistent with the use of the putative splice acceptor site created by IVS4-6G>A. The 4-bp insertion would result in a frameshift from C107, and termination at 132 amino acids after the addition of 25 unrelated amino acids. Thus IVS4-6G>A can be considered a pathogenic mutation.
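The link between a change at position -6 and a 4-bp insertion can be illustrated with a toy model. The sketch below is a simplification under stated assumptions: the six-nucleotide intron tail is hypothetical (the source gives only the G at -6; the G at -5 and the other bases are assumed so that the mutation creates an AG dinucleotide), and real acceptor-site choice depends on more than the first AG encountered.

```python
# Hypothetical last six nucleotides of intron 4 (positions -6 to -1).
# Only the G at -6 comes from the text; the native AG at -2/-1 is the
# canonical acceptor, and the bases at -5 to -3 are assumed for illustration.
NORMAL_TAIL = "GGTCAG"

def mutate(intron_tail: str, position: int, new_base: str) -> str:
    """Replace the base at a negative intronic position (-1 = last intron base)."""
    index = len(intron_tail) + position
    return intron_tail[:index] + new_base + intron_tail[index + 1:]

def retained_after_first_ag(intron_tail: str) -> str:
    """Intronic bases left in the mRNA if the first AG in the tail acts as acceptor."""
    start = intron_tail.find("AG")
    if start == -1:
        raise ValueError("no AG dinucleotide in intron tail")
    return intron_tail[start + 2:]

mutant_tail = mutate(NORMAL_TAIL, -6, "A")        # IVS4-6G>A
retained = retained_after_first_ag(mutant_tail)  # bases at -4 to -1

# Four retained bases are not a multiple of 3, so the reading frame shifts.
print(mutant_tail, retained, len(retained), len(retained) % 3 != 0)
```

Under these assumptions, an AG created at positions -6/-5 leaves the four bases at -4 to -1 in the transcript, which agrees with the 4-bp insertion seen in the in vitro splicing assay.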

Splice acceptor site mutations that allow small amounts of normal splicing, and thus of normal protein, and that result in comparatively mild phenotypes relative to other mutations in the same gene have been described (e.g., Ref. 17). Despite the fact that no normally spliced transcript could be detected in the in vitro system, it is likely that IVS4-6G>A allows the production of small amounts of normally spliced transcript and thus TMPRSS3 protein, accounting for the phenotypic difference between the DFNB8 and DFNB10 families, which have childhood-onset and congenital deafness, respectively (3).

IV. THE TMPRSS3 GENE, TRANSCRIPTS AND EXPRESSION

TMPRSS3 stands for transmembrane protease, serine 3, and is the name approved by the human gene nomenclature committee (http://www.gene.ucl.ac.uk/nomenclature/). We also named the gene ECHOS1, from the Greek word echos for sound. The TMPRSS3 gene has 13 exons spanning 24 kb (Fig. 3a). Four alternative transcripts, TMPRSS3a–d, encoding putative polypeptides of 454, 327, 327, and 344 amino acids, respectively, were detected (b and c code for the same peptide). The TMPRSS3a transcript contains all 13 exons with the initiating methionine in exon 2. The TMPRSS3b and c transcripts start in introns 2 and 3, respectively, with putative initiating methionines in exon 5. The 3′ end of TMPRSS3d continues into intron 9 (Fig. 3b).

Detection of TMPRSS3 transcripts by Northern blot analysis was difficult, showing that the gene is expressed at low levels in the 23 gross organs/tissues/cells analyzed. Semiquantitative RT-PCR specific to the four TMPRSS3 transcripts on a multiple-tissue cDNA panel from 27 human tissues and human fetal cochlea cDNA showed that all four transcripts have distinct patterns of expression but that TMPRSS3a, containing all 13 exons, is the most highly and widely expressed transcript. Both TMPRSS3a and TMPRSS3d expression were detected in fetal cochlea.

V. THE TMPRSS3 PROTEIN

The TMPRSS3a transcript encodes a putative 454-amino-acid peptide that contains in order a transmembrane (TM), a low-density lipoprotein receptor A (LDLRA), a scavenger receptor cysteine-rich (SRCR) domain, and a serine protease domain (Fig. 3c). This domain structure has been observed in other proteases including the human transmembrane serine protease TMPRSS2, which shows the highest homology to TMPRSS3. TMPRSS2 also maps on 21q just centromeric of the DFNB10 critical region (18). The domain structure of the TMPRSS3 protein is reflected in the gene structure with the TM domain encoded by exon 3, the LDLRA domain by exon 4, and the SRCR domain by exons 5 and 6.

The serine protease domain of TMPRSS3 (residues 217–444) shows between 38 and 45% identity with other transmembrane serine proteases (Fig. 3f). The TMPRSS3 protease domain is compatible with the S1 family of the SA clan of serine-type peptidases for which the prototype is chymotrypsin (19) (http://www.merops.co.uk/). The serine protease active-site residues (H257, D304, and S401) are conserved and TMPRSS3 is predicted to cleave after K or R residues as it contains D395 at the base of the specificity pocket (S1 subsite) that binds to the substrate. The N-terminus of the protease domain is immediately preceded by the peptide sequence RIVGG. Proteolytic cleavage between R and I would result in protease activation similar to other serine protease zymogens (19), converting TMPRSS3 to a noncatalytic and a catalytic subunit linked by a disulfide bond (probably C207 to C324). The TMPRSS3 serine protease domain contains six conserved cysteine residues, which, by homology to other proteases and 3D modeling, are likely to form the following intrasubunit disulfide bonds: C242–C258, C370–C386, C397–C425.
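The residue coordinates quoted for the protease domain can be checked against one another. The sketch below is only a bookkeeping aid built from the numbers in the text; the constant and function names are illustrative and do not come from any database or library.

```python
# Residue coordinates as quoted in the text (1-based, inclusive).
PROTEASE_DOMAIN = range(217, 445)          # residues 217-444
CATALYTIC_TRIAD = {"H": 257, "D": 304, "S": 401}
S1_POCKET_ASP = 395                        # base of the specificity pocket
INTRASUBUNIT_DISULFIDES = [(242, 258), (370, 386), (397, 425)]
INTERSUBUNIT_DISULFIDE = (207, 324)        # "probably C207 to C324"

def in_protease_domain(residue: int) -> bool:
    """True if the residue number lies within the protease domain."""
    return residue in PROTEASE_DOMAIN

# The catalytic triad and the S1 pocket aspartate lie inside the domain.
triad_inside = all(in_protease_domain(r) for r in CATALYTIC_TRIAD.values())
pocket_inside = in_protease_domain(S1_POCKET_ASP)

# Both cysteines of each intrasubunit bond lie inside the domain.
intrasubunit_inside = all(
    in_protease_domain(a) and in_protease_domain(b)
    for a, b in INTRASUBUNIT_DISULFIDES
)

# The linking bond spans the activation cleavage site: one cysteine
# N-terminal to the domain, one inside it.
n_cys, c_cys = INTERSUBUNIT_DISULFIDE
spans_cleavage_site = (not in_protease_domain(n_cys)) and in_protease_domain(c_cys)

print(triad_inside, pocket_inside, intrasubunit_inside, spans_cleavage_site)
```

The last check reflects why the C207 to C324 bond would keep the noncatalytic and catalytic subunits together after cleavage between R and I: C207 sits upstream of residue 217, while C324 sits within the protease domain.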

As no recognizable leader sequence precedes the predicted hydrophobic TM domain (residues 48–69), TMPRSS3 is likely to be a type II integral membrane protein. Eleven human type II transmembrane serine proteases (TTSPs) have been described to date (many are reviewed in Ref. 20). Where the subcellular localization is described, the TTSPs are anchored to the plasma membrane with a cytosolic N-terminus and extracellular protease domain (e.g., Refs. 21,22). Similarly, TMPRSS3 is predicted to have its N-terminus on the inside of a membrane and the protease domain on the outside of a membrane.

The ~40-amino-acid-long LDLRA domain, which contains six disulfide-bound cysteines (C72, C79, C85, C92, C98, C107), was originally found in the low-density lipoprotein receptor as the binding sites for LDL (23) and calcium (24,25), and has subsequently been described in numerous extracellular and membrane proteins (PDOC00929; http://www.expasy.ch/cgi-bin/get-prodoc-entry?PDOC00929).

A ~100-residue-long putative adhesive extracellular SRCR domain was also identified in TMPRSS3. SRCR domains linked to serine protease domains have been reported in secreted or membrane-bound molecules with diverse biological roles in development and immunity (26) (PDOC00348; http://www.expasy.ch/cgi-bin/get-prodoc-entry?PDOC00348). The LDLRA and SRCR domains of TMPRSS3 are potentially involved in binding with extracellular molecules and/or the cell surface.

Figure 3 The TMPRSS3 gene, transcripts, protein, and mutations. (a) TMPRSS3 contains 13 exons (boxes) spanning 24 kb. (b) There are four different transcripts, TMPRSS3a–d (coding regions in boxes, noncoding regions indicated by lines). (c) A schematic of the TMPRSS3 protein showing the transmembrane (TM), LDLRA, SRCR, and protease domains and their position in the 454-amino-acid peptide. The active-site residues His257, Asp304, and Ser401 are indicated. (d) The position of the nine TMPRSS3 mutations relative to the protein is indicated. (e) Exonic polymorphisms that change amino acids are indicated. (f) Representative protein homologies with other human transmembrane proteases are shown. They are TMPRSS2 (O15393), TMPRSS4 (AAF74526), and TMPRSS5 (AB028140). Domains, as detected in TMPRSS3, are boxed according to their position in TMPRSS3 and labeled underneath, with the active-site residues His257, Asp304, and Ser401 indicated by asterisks (*) above the alignment. Mutations are indicated above the sequence alignment with a number sign (#) while polymorphisms that change amino acids are indicated with a question mark (?). TMPRSS2 and 4 share exactly the same domain structure as TMPRSS3 while TMPRSS5 lacks an LDLRA domain.

The putative peptides encoded by the TMPRSS3b and c transcripts would contain only half the SRCR domain while TMPRSS3d would contain only half the protease domain. The TMPRSS3b and c transcripts may be experimental artifacts or, alternatively, produce soluble forms of the protease as has been observed for the archetypal transmembrane protease, hepsin or TMPRSS1 (27) and other TTSPs (20).

VI. TMPRSS3 MUTATION SPECTRUM IN DEAFNESS

Subsequent to the discovery that TMPRSS3 was mutated in the nonsyndromic autosomal recessive deafness loci DFNB10 and DFNB8, we and other investigators examined the DNA of additional patients in both familial and sporadic cases of deafness. The nine pathogenic changes detected to date are summarized in Table 1 and Figure 3d. In addition to the evidence detailed below, all pathogenic changes were excluded as polymorphisms after examination of a large number of control chromosomes from relevant populations.

A. Familial Mutations

Supportive evidence for linkage to the DFNB8/10 locus was found in 5/159 additional consanguineous Pakistani families segregating profound congenital autosomal recessive deafness.

A missense mutation, R109W, was found in homozygosity in Pakistani family PKSR51a (31). This substitution is in the last amino acid of the LDLRA domain, which is potentially involved in binding of TMPRSS3 with extracellular molecules and/or the cell surface. Two of the other three most closely related TTSPs have either Arg or the similar positively charged Lys at this position (Fig. 3f).

Table 1 Pathogenic Mutations in TMPRSS3

No.  Exon/intron  Nucleotide change   AA level            Origin/ref
1    Exon 4       del207C             Frameshift + STOP   Spanish, Greek (30)
2    Exon 4       308A>G              D103G               Greek (30)
3    Intron 4     IVS4-6G>A           Frameshift + STOP   Pakistani (5)
4    Exon 5       325C>T              R109W               Pakistani (31)
5    Exon 7       581G>T              C194F               Pakistani (31)
6    Exon 8       753G>C              W251C               Tunisian (28)
7    Exon 11      Ins (β-sat) + del   Frameshift + STOP   Palestinian (5)
8    Exon 12      1211C>T             P404L               Tunisian (28)
9    Exon 12      1219T>C             C407R               Pakistani (31)
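The nucleotide and amino acid columns of Table 1 can be reconciled with a short calculation. The sketch below assumes that coding nucleotides are numbered from the A of the initiating ATG (position 1), which the text does not state explicitly; the helper names are illustrative.

```python
# Map a coding-sequence nucleotide position to its codon, assuming
# numbering starts at the A of the initiating ATG (an assumption).
def codon_number(cds_position: int) -> int:
    """1-based codon (amino acid) number containing the nucleotide."""
    return (cds_position + 2) // 3

def position_in_codon(cds_position: int) -> int:
    """1, 2, or 3: which base of the codon the nucleotide occupies."""
    return (cds_position - 1) % 3 + 1

# Missense changes from Table 1: nucleotide position -> reported residue.
MISSENSE = {308: 103, 325: 109, 581: 194, 753: 251, 1211: 404, 1219: 407}

for nt, residue in MISSENSE.items():
    # Every reported residue number should match the computed codon.
    assert codon_number(nt) == residue
    print(nt, codon_number(nt), position_in_codon(nt))
```

Under this numbering all six missense entries agree with their reported residue numbers, and the single-base deletion at nucleotide 207 falls in codon 69.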

A second missense mutation, C194F, was detected in homozygosity in affected members of the Pakistani PKB16 pedigree (31). The mutation is within the SRCR domain and affects a highly conserved Cys residue.

A third missense mutation, C407R, was found in homozygosity in two Pakistani pedigrees, PKSN37 and PKSN18b. This substitution is within the serine protease domain only a few amino acids from the active-site residue S401 within the substrate pocket. Although C407 is not highly conserved, the nonconservative substitution of a small polar uncharged Cys to a large positively charged Arg so close to the S401 active-site residue is expected to alter the geometry of the active-site loop and therefore affect the serine protease activity (31).

Supportive evidence for linkage to the DFNB8/B10 locus was also found in 2/39 Tunisian families segregating profound congenital autosomal recessive deafness.

The W251C missense mutation was found in homozygosity in a consanguineous Tunisian family Z with profound nonsyndromic congenital recessive deafness. This mutation lies in the serine protease domain and affects a Trp residue that is highly conserved among serine proteases of the S1 type (Fig. 3f). Examination of the predicted 3D structure suggests that the W251C mutation might lead to a destabilization of TMPRSS3 as the large side chain of the Trp residue occupies a large hydrophobic pocket on the exterior of the protein, and structural rearrangements caused by substituting the smaller Cys would likely affect the nearby active-site H257 residue and thus the activity of the enzyme (28).

The P404L missense mutation was observed in homozygosity in a consanguineous Tunisian family R with profound, nonsyndromic deafness. This mutation is located within the sequence signature characteristic of serine protease active sites, separated from the catalytic Ser by two Gly residues (-Ser-Gly-Gly-Pro-Leu-). P404 is well conserved among the members of the S1 chymotrypsin family of proteases (Fig. 3f). For an exchange of Pro with Leu at position 404 we would expect a significant alteration of the geometry of the active-site loop affecting the catalytic activity (28).

B. TMPRSS3 Mutations in Sporadic Deafness Cases

A total of 512 sporadic cases of deafness negative for the common 35delG GJB2 mutation (Cx26 gene) (29) have been analyzed for TMPRSS3 mutations. These include 86 Greek, 99 Spanish, 198 Italian, 65 Australian, and 64 North American patients. Definitive mutations were detected in only 2/512 patients (30).
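The cohort sizes and the resulting detection rate can be tallied as follows. This is a small arithmetic sketch using only the counts quoted in this section; the dictionary and variable names are illustrative. The family fractions refer to supportive evidence for linkage, not to confirmed mutations.

```python
# Sporadic cases screened, by origin, as listed in the text.
cohorts = {
    "Greek": 86,
    "Spanish": 99,
    "Italian": 198,
    "Australian": 65,
    "North American": 64,
}

total_sporadic = sum(cohorts.values())
mutation_rate = 2 / total_sporadic   # definitive mutations in sporadic cases

# Families with supportive evidence for linkage to DFNB8/10.
pakistani_linkage = 5 / 159
tunisian_linkage = 2 / 39

print(total_sporadic)
print(f"{mutation_rate:.2%} {pakistani_linkage:.2%} {tunisian_linkage:.2%}")
```

The five cohorts sum to the stated 512 patients, and 2/512 corresponds to roughly 0.4% of the sporadic cases screened.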