
John Wiley & Sons - 2004 - Analysis of Genes and Genomes



PROTEIN PRODUCTION AND PURIFICATION 8

 

 

[Figure 8.9: chemical structure of a histidine-tagged protein coordinated to Ni2+-nitrilotriacetic acid (NTA), which is linked through a spacer to the resin matrix]

Figure 8.9. The binding of proteins tagged with multiple histidine residues to Ni2+-NTA resin

The purification of a his-tagged protein from E. coli cells is shown in Figure 8.10. E. coli cells containing an inducible expression vector were grown and induced to produce the tagged target protein. The cells were broken open and insoluble cell debris was removed by centrifugation. The supernatant was applied to a Ni2+-NTA column. The column was washed with a low concentration (20 mM) of imidazole, which competes with low-affinity histidine–column interactions and so removes proteins, perhaps histidine-rich ones, that are bound non-specifically. Finally, the tagged protein itself was eluted by raising the imidazole concentration to a high level (250 mM). This process results in the single-step purification of the tagged protein to yield a very pure, almost homogeneous, sample. His-tagged proteins from any expression system, including bacteria, yeast, baculovirus and mammalian cells, can be purified to a high degree of homogeneity using this technique. Alternative elution conditions may also be used. For example, lowering the pH from 8 to 4.5 alters the protonation state of the histidine residues and results in the dissociation of the protein from the metal complex. The tagged protein can also be removed by adding chelating agents, such as EDTA, which strip the nickel ions from the column and consequently release the tagged protein.
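The wash and elution buffers described above differ only in their imidazole concentration, so preparing them is a simple dilution calculation (C1 × V1 = C2 × V2). The sketch below is illustrative only: the 1 M imidazole stock, the 50 ml buffer volume and the number of gradient steps are assumptions that do not come from the text, and the function names are hypothetical.

```python
def stock_volume_ml(target_mm: float, final_volume_ml: float, stock_mm: float) -> float:
    """Volume of stock to dilute to final_volume_ml at target_mm (C1*V1 = C2*V2)."""
    if target_mm > stock_mm:
        raise ValueError("target concentration exceeds stock concentration")
    return target_mm * final_volume_ml / stock_mm


def linear_gradient(start_mm: float, end_mm: float, n_steps: int) -> list[float]:
    """Imidazole concentration (mM) at each of n_steps evenly spaced gradient points."""
    if n_steps < 2:
        raise ValueError("need at least two gradient points")
    step = (end_mm - start_mm) / (n_steps - 1)
    return [start_mm + i * step for i in range(n_steps)]


STOCK_MM = 1000.0  # assumed 1 M imidazole stock (not stated in the text)

wash_ml = stock_volume_ml(20, 50, STOCK_MM)     # 20 mM wash buffer, 50 ml (assumed volume)
elute_ml = stock_volume_ml(250, 50, STOCK_MM)   # 250 mM elution buffer, 50 ml (assumed volume)
print(f"wash: {wash_ml} ml stock; elution: {elute_ml} ml stock")

# Five evenly spaced points across the 20-250 mM gradient used in Figure 8.10
print(linear_gradient(20, 250, 5))
```

With these assumed values the wash buffer needs 1.0 ml of stock and the elution buffer 12.5 ml, and the five gradient points are 20, 77.5, 135, 192.5 and 250 mM.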

The small size of the histidine tag means that the tagged recombinant protein often behaves identically to its untagged parent. In some cases, the tagged protein is actually found to be more biologically active than the untagged version of the same protein (Janknecht et al., 1991), although this effect is likely to be due to the speed of the purification process rather than any biological activity of the tag itself. Some proteins have been crystallized in the presence of the his-tag (Kim et al., 1996a). Additionally, the his-tag has extremely low

8.5 PROTEIN PURIFICATION

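[Figure 8.10: chemical structures of histidine and imidazole; SDS–polyacrylamide gel with lanes M (markers: 97, 66, 45, 31, 21 and 15 kDa), uninduced cells, induced cells, cell supernatant, column flow, wash, and fractions eluted at increasing [imidazole]]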

Figure 8.10. The purification of a his-tagged protein. The chemical structures of histidine and imidazole are shown, together with an SDS–polyacrylamide gel of the purification of a his-tagged protein. An E. coli cell extract producing a 14 kDa his-tagged protein was applied to a Ni2+-NTA column. The column was washed with a buffer containing a low concentration (20 mM) of the histidine analogue imidazole prior to elution of the tagged protein with an imidazole gradient (20–250 mM). Proteins were visualized after staining the gel with Coomassie blue

immunogenicity and consequently the recombinant protein containing the tag can be used to produce antibodies. There are some reports of the his-tag altering protein function (see, e.g., Knapp et al., 2000), but, as we will see later, it is more important to remove some other purification tags. An additional advantage of the his-tag is that purification can be performed under denaturing conditions (Reece, Rickles and Ptashne, 1993). The interaction between the histidine residues and the metal ion does not require any special protein structure and will occur even in the presence of strong protein denaturants (e.g. 8 M urea). This is particularly important for the purification of proteins that would otherwise be insoluble.

8.5.2 The GST-tag

The glutathione S-transferases (GSTs) are a family of enzymes that are involved in the cellular defense against electrophilic xenobiotic chemical compounds.

[Figure 8.11: (a) chemical structures of glutathione and of its GST-catalysed conjugate with an electrophilic compound (R); (b) structure of the GST–glutathione complex; (c) gel with lanes M (markers: 97, 66, 45 and 31 kDa), uninduced cells, induced cells, cell extract, cell pellet, column flow and elution]

Figure 8.11. The purification of proteins tagged with GST. (a) The chemical structure of the tripeptide glutathione and the action of GST for its addition to electrophilic compounds (R). Glutathione is composed of three amino acids: glutamic acid, cysteine and glycine. Note that the glutamic acid is joined to the Cys–Gly dipeptide through its γ-carboxyl group. (b) The three-dimensional structure of the GST–glutathione complex. The protein is depicted in a ribbon form and the glutathione as a green stick model (García-Sáez et al., 1994). (c) The purification of a GST-tagged protein from E. coli cells. The tagged protein was bound to a glutathione-affinity column and eluted using free glutathione itself. The tagged protein is indicated by the arrow

They catalyse the addition of glutathione to these electrophilic substrates, which results in their increased solubility in water and promotes their subsequent enzymatic degradation (Strange, Jones and Fryer, 2000). Glutathione is a tripeptide composed of the amino acids glutamic acid, cysteine and glycine (Figure 8.11(a)). GST binds to glutathione with high affinity (Figure 8.11(b)).


 

 

The enzyme from the parasitic flatworm Schistosoma japonicum is a 26 kDa dimeric protein (Walker et al., 1993). The gene encoding this protein is fused, in the correct reading frame, to the target gene and a fusion protein is produced from an expression vector. Host cells producing the fusion protein are broken open and soluble proteins are applied to a column to which glutathione is attached (e.g. glutathione-agarose). The specific interaction between GST and glutathione will result in the binding of the fusion protein to the column, while the majority of host proteins are unable to adhere. The bound protein can then be eluted from the column by washing with a high concentration of glutathione (10 mM) to compete for the interaction with the column (Figure 8.11(c)).

Both the large size of GST and its dimeric nature mean that the tag is more likely to influence the biological activity of the target protein than the his-tag. It is therefore desirable to remove the GST portion of the fusion protein to study the activity of the target protein in isolation. This can be achieved by the inclusion, in the expression vector, of DNA coding for the amino acid sequence of a specific protease cleavage site between GST and the target gene. Treatment of the purified fusion protein with the protease will then result in the generation of two polypeptides – the free target protein and GST itself. GST can then be removed from the target protein by applying the mixture back onto a glutathione column. The GST will, again, bind to the column, but the target protein will not. The column flow-through can be collected and will contain the purified target protein.

A variety of specific proteases have been used to cleave purification tags from target fusion proteins (Table 8.2). Unlike restriction enzymes when they cleave DNA (see Chapter 2), many proteases do not have an absolute sequence requirement for their cleavage sites. For example, the protease Factor Xa cleaves after the arginine residue in its preferred cleavage site Ile–Glu–Gly–Arg. However, it will sometimes cleave at other basic residues, depending on the conformation of the protein substrate, and a number of secondary sites have been sequenced that show cleavage following Gly–Arg dipeptides (Quinlan, Moir and Stewart, 1989). Consequently, the protease may not only cleave the site between the tag and the target protein, but may also cleave the target protein itself. Obviously, this must be avoided to maintain the integrity of the target protein. Other proteases, e.g. the TEV and PreScission proteases, have larger and more specific recognition sequences and are less likely to cleave at alternative sites. The TEV protease has the added advantage that the protease can be produced in a recombinant form from E. coli and is therefore not contaminated with other plasma proteases and factors.
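The risk of secondary Factor Xa cleavage can be screened for before a construct is built. The following is a minimal sketch, assuming only what the text states: the preferred site is Ile–Glu–Gly–Arg (IEGR in one-letter code) and secondary cleavage has been seen after Gly–Arg (GR) dipeptides. It cannot account for protein conformation, which the text notes also matters. The function name and the example sequence are hypothetical.

```python
import re

PREFERRED = re.compile("IEGR")  # Ile-Glu-Gly-Arg, the preferred Factor Xa site
SECONDARY = re.compile("GR")    # Gly-Arg dipeptide, a possible secondary site


def factor_xa_sites(seq: str) -> dict[str, list[int]]:
    """Return 1-based positions of each Arg after which Factor Xa might cleave."""
    seq = seq.upper()
    # m.end() is the 0-based index just past the match, which equals the
    # 1-based position of the final Arg residue in the match.
    preferred = [m.end() for m in PREFERRED.finditer(seq)]
    secondary = [m.end() for m in SECONDARY.finditer(seq) if m.end() not in preferred]
    return {"preferred": preferred, "secondary": secondary}


# Hypothetical fusion: his-tag, Factor Xa site, then a made-up target sequence
fusion = "MHHHHHHIEGRMKTAYGRLLK"
print(factor_xa_sites(fusion))
```

For this made-up sequence the intended site ends at residue 11, and a Gly–Arg dipeptide inside the target (ending at residue 18) is flagged as a potential secondary site.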

Table 8.2. Site-specific proteases. The recognition sequence of each protease is shown, together with the actual site of cleavage, depicted by the arrow

Protease | Recognition and cleavage site | Notes | Reference
Factor Xa | IleGluGlyArg↓ | 42 kDa protein, composed of two disulphide linked chains, purified from bovine plasma | (Nagai, Perutz and Poyart, 1985)
Enterokinase | AspAspAspAspLys↓ | 26 kDa light chain of bovine enterokinase produced in and purified from E. coli | (LaVallie et al., 1993b)
Thrombin | LeuValProArg↓GlySer | Purified from bovine plasma | (Chang, 1985)
TEV | GluAsnLeuTyrPheGln↓Gly | Tobacco etch virus protease | (Dougherty et al., 1989)
PreScission | LeuGluValLeuPheGln↓GlyPro | Protease from the 3C human rhinovirus | (Walker et al., 1994)
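The recognition sequences in Table 8.2 can be written as a small lookup table to predict which fragments a protease would release from a fusion protein. This is a sketch under stated assumptions: the sites are transcribed into one-letter code with "|" marking the cleaved bond, only exact matches are considered (so the relaxed specificity of some proteases is ignored), and the function name and example sequence are hypothetical.

```python
# Recognition sequences from Table 8.2 in one-letter code; "|" marks the cleaved bond.
SITES = {
    "Factor Xa": "IEGR|",
    "Enterokinase": "DDDDK|",
    "Thrombin": "LVPR|GS",
    "TEV": "ENLYFQ|G",
    "PreScission": "LEVLFQ|GP",
}


def cleave(seq: str, protease: str) -> list[str]:
    """Split seq at every exact occurrence of the protease's recognition sequence."""
    before, after = SITES[protease].split("|")
    motif = before + after
    fragments, start = [], 0
    pos = seq.find(motif)
    while pos != -1:
        cut = pos + len(before)          # index of the first residue after the cut
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(motif, pos + 1)
    fragments.append(seq[start:])
    return [f for f in fragments if f]   # drop an empty fragment if the site is terminal


# Hypothetical fusion: made-up tag fragment + TEV site + made-up target
fusion = "MSPILGYWK" + "ENLYFQG" + "MKTAYIAK"
print(cleave(fusion, "TEV"))        # the target fragment keeps the Gly after the cut
print(cleave(fusion, "Factor Xa"))  # no site present, so the fusion is returned intact
```

Because TEV and PreScission cut within their recognition sequences, residues downstream of the arrow remain on the released target, which the fragment output makes explicit.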

 

8.5.3 The MBP-tag

The target gene is inserted downstream from the malE gene of E. coli, which encodes maltose-binding protein (MBP), in an expression vector that results in the production of an MBP fusion protein (Kellermann and Ferenci, 1982). Maltose is a disaccharide composed of two molecules of glucose (Figure 8.12(a)). MBP is a 40 kDa monomeric protein that forms part of the maltose/maltodextrin system of E. coli, which is responsible for the uptake and efficient catabolism of glucose polymers (Boos and Shuman, 1998). The protein undergoes a large conformational change upon binding maltose, which results in the formation of a stable complex (Figure 8.12(b)). One-step purification of fusion proteins is achieved using the affinity of MBP for cross-linked amylose (starch) (di Guan et al., 1988). Bound proteins can be eluted from amylose by including maltose (10 mM) in the column buffer (Figure 8.12(c)).

8.5.4 IMPACT

Intein mediated purification with an affinity chitin binding tag (IMPACT) is an approach to protein purification that uses the protein self-splicing of

[Figure 8.12: (a) chemical structure of maltose; (b) structure of the MBP–maltose complex; (c) gel with lanes uninduced cells, induced cells, amylose elution, protease treatment and amylose flow; labelled bands: MBP–X–target fusion, MBP, target]

Figure 8.12. The purification of proteins tagged with MBP. (a) The chemical structure of maltose, a glucose disaccharide. (b) The three-dimensional structure of the MBP–maltose complex (Quiocho, Spurlino and Rodseth, 1997). The protein is depicted in a ribbon form with α-helices coloured in purple and β-sheets in blue. Maltose is shown as a green stick model. (c) The purification of an MBP-tagged protein. The tagged protein is bound to an amylose column and eluted with maltose. The MBP–target fusion is then cleaved with a protease at a site indicated by the X, and reapplied to the amylose column. The target protein will not adhere to the column when it is separated from MBP. The gel image is reprinted with permission of New England Biolabs, 2002/2003


 

 

inteins to remove the purification tag and give pure isolated protein in one chromatographic step. Inteins are a class of proteins, found in a wide variety of organisms, that excise themselves from a precursor protein and in the process ligate the flanking protein sequences (exteins) (Cooper and Stevens, 1995). The excised intein is a site-specific DNA endonuclease that catalyses genetic mobility of its own DNA coding sequence. The process of polypeptide cleavage and ligation is dependent on specific chemistry involving thiols and a conserved asparagine residue.

Most inteins have a cysteine residue at their amino-terminal end and an asparagine at their carboxy-terminal end (Figure 8.13(a)). All the information required for the splicing reaction is contained within the intein itself, and if these sequences are placed in the context of a target protein they still splice themselves out. The mechanism of splicing is complex, but the reaction is very efficient. The IMPACT expression system exploits this unusual chemistry by mutation of the C-terminal asparagine to alanine in a yeast intein, VMA1 (Chong and Xu, 1997). This mutation prevents the cleavage reaction occurring at the carboxy-terminal side of the intein and traps the protein as a thioester that can be cleaved by β-mercaptoethanol or dithiothreitol (DTT). The target gene is cloned into an expression vector such that a three-component fusion protein is produced: target protein–intein–chitin binding domain.

Chitin is a fibrous insoluble polysaccharide made of β-1,4-linked N-acetyl-D-glucosamine that is found in the cell walls of fungi and algae and in the exoskeletons of arthropods. Chitinase catalyses the hydrolytic degradation of chitin, and the Bacillus circulans enzyme (Mr 74 kDa) is composed of three domains: an amino-terminal catalytic domain (CatD, 417 amino acid residues), a tandem repeat of fibronectin type III-like (FnIII) domains (two copies of 95 residues) and a carboxy-terminal chitin-binding domain (CBD, 45 amino acid residues) (Watanabe et al., 1990). The isolated CBD shows high-affinity binding to chitin.

In the IMPACT system, the fusion protein is made in E. coli and passed down the chitin column, where it binds. The protein can be cleaved off the column by using thiol-containing compounds, such as DTT, at 4 °C. This is a slow process and requires an overnight incubation to complete, which may prove problematic if the target protein is not stable under these conditions. The final target protein produced by this method is native except for the DTT thioester moiety attached at the carboxy-terminal end. The thioester is, however, unstable and will spontaneously hydrolyse to yield a native protein. Other thiols can also be used to initiate the cleavage process, e.g. β-mercaptoethanol and cysteine. Cysteine-induced cleavage results in the insertion of a cysteine amino acid residue at the carboxy-terminal end of the cleaved polypeptide. The cysteine

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

[Figure 8.13: (a) normal splicing of an N-extein–intein–C-extein precursor (intein with amino-terminal Cys and carboxy-terminal Asn) into joined exteins plus free intein; (b) the Asn-to-Ala mutant intein treated with DTT; (c) chemistry of the target protein–intein–CBD fusion: N–S acyl shift, cleavage with DTT and spontaneous hydrolysis to give the free target protein; (d) gel with lanes M, uninduced cells, induced cells, column flow, elution and SDS; labelled bands: target protein–intein–CBD, intein–CBD, target protein]

Figure 8.13. The IMPACT system for the purification of tagged proteins and the subsequent removal of the tag. (a) The normal splicing reaction involves the complete removal of the intein and the joining of the polypeptide sequences on its amino- and carboxy-terminal sides. (b) A mutant form of the intein, in which an essential asparagine is replaced with an alanine, results in partial cleavage and the release of the amino-terminal side polypeptide only. (c) The chemistry of the splicing reaction used to cleave the target protein from the intein–chitin binding domain (CBD) tag. (d) The purification of HhaI methylase using the IMPACT system (Chong et al., 1997). Purified target protein was eluted from the column, while the detergent SDS was used to remove the intein–CBD fusion. The gel image was kindly provided by Ming-Qun Xu (New England Biolabs)


 

 

can be radio-labelled, or it can be a site for chemical modification, especially if it is the only cysteine in the protein, since it is a good site to add protein cross-linkers, fluorescent probes, spin labels or other tags.

8.5.5 TAP-tagging

An extension of tagging over-produced proteins for purification is to tag proteins produced at wild-type levels in their native host cells. Protein purification in these circumstances, if performed under suitably mild conditions, can lead to the isolation of naturally occurring protein complexes. Most proteins do not exist as single entities within cells. They are associated, through non-covalent interactions, with a variety of other proteins that may be involved in the regulation of their function. The over-production of a single protein will not result in the over-production of other proteins in the complex. Therefore, to isolate complexes from cells, protein production should be as close to the natural state as possible.

The DNA encoding what is termed a tandem affinity purification tag (TAP-tag) is cloned at the 3′-end of a target gene so that little disruption is made to its ability to be transcribed, and the fusion protein should be produced at the same level as the wild-type target protein. The TAP-tag encodes two purification elements: a calmodulin binding peptide and Protein A from Staphylococcus aureus. These elements are separated by a TEV protease cleavage site (Puig et al., 2001). Cells containing the tagged protein are gently lysed and then applied to a column containing IgG, which binds with high affinity to Protein A. The fusion protein, and its associated proteins, are removed from the column using TEV protease and then applied directly to a calmodulin bead column, in the presence of Ca2+, and eluted using the chelating agent EDTA. The two-step purification procedure is highly specific and can result in the isolation of contaminant-free protein complexes. The TAP-tag allows the rapid purification of complexes from a relatively small number of cells without prior knowledge of the complex composition, activity or function (Rigaut et al., 1999; Gavin et al., 2002), and, combined with mass spectrometry, the TAP strategy allows for the identification of proteins interacting with a given target protein.

9 Genome sequencing projects

Key concepts

Genetic and physical maps are used to determine the order of genes on a chromosome and their approximate distance apart

DNA sequence determination is performed using dideoxynucleotides that halt replication at a specific base. DNA fragments that differ by a single base can be separated using polyacrylamide gels

Sequencing reactions generate a few hundred bases of sequence

Whole genomes can be sequenced by cloning random small DNA genomic fragments, sequencing them, and then reassembling the genome sequence based on the overlap between the sequenced fragments

Massive computing power is required to assemble the sequenced fragments and determine the locations of genes within the genome
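The idea of reassembling a sequence from overlapping fragments can be illustrated with a toy example. The sketch below uses a simple greedy strategy (repeatedly merging the two reads with the longest suffix-to-prefix overlap). It is an assumption made for illustration, not the method used by any particular genome project, and it ignores sequencing errors, repeats and reverse-complement reads. The reads and function names are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge reads pairwise by largest overlap until no overlaps remain."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return reads  # remaining contigs share no usable overlap
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)] + [merged]


# Three made-up overlapping reads covering a 15-base sequence
reads = ["ATGGCGTA", "GCGTACGTT", "ACGTTAGC"]
print(greedy_assemble(reads))
```

Even in this toy form, every pair of reads is compared on every round, which hints at why assembling millions of real reads demands substantial computing power.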

The ultimate goal of all genome sequencing projects is to determine the precise sequence of bases that make up each DNA molecule within the genome. The knowledge of the sequence of individual genes, and the entire genome, is vital if we are to understand not only how genes and proteins work but also how different gene products influence the activity of each other within the context of the whole organism. The sheer amount of DNA contained within the genome of an organism, however, represents a substantial barrier to attaining this level of analysis. Even in the absence of complete sequence knowledge, however, a variety of methods have been used to map the location of genes and other DNA sequences within the genome. On a small scale, mapping DNA fragments is a relatively straightforward process (Figure 9.1). We have already seen (Chapter 2) that restriction enzymes will cleave DNA at specific sequences, termed recognition sites. The

Analysis of Genes and Genomes

Richard J. Reece

2004 John Wiley & Sons, Ltd

ISBNs: 0-470-84379-9 (HB); 0-470-84380-2 (PB)
