Mass spectrometry is at present the method of choice for fast, accurate characterization of proteins, both for obtaining their molecular masses and for deriving their primary structures, that is, amino acid sequence information. The method is generally applicable to all proteins. At the present state of the art, mass spectrometry can routinely analyze low femtomoles, and increasingly attomoles, of sample, with a dynamic range of four orders of magnitude.
In a broader context, this analytical ability is playing an especially important role in the post-genomic era of biology. In particular, a major focus of attention of biologists is now the cell proteome. The cell proteome, the aggregate of all proteins expressed in a cell, is complex in its composition and changes with time, reflecting the condition of the cell. In the human genome there are 20,300 protein-coding genes; at first consideration, quantification of the whole human proteome would seem achievable. But the complexity of this undertaking is enormous: each such protein may be expressed at a given time in a given cell in one or more of its multiple isoforms, splice variants, and forms carrying post-translational modifications (PTMs). Immense as the problem is, these compositions fully reflect the normal as well as any abnormal states of the cells, and so hold clues to the details of cell biological processes. Determining these compositions, however, requires comprehensive analysis. Only a few analytical methods at present can take on this challenge and deliver satisfactory results. Among these, mass spectrometry, even with its limitations, is definitely the frontrunner. The present availability of a wide array of mass spectrometry platforms, besides the antibody-based Human Protein Atlas project and ProteomeXchange (a single point of submission of proteomics data to integrate proteomics-based knowledge bases), was probably a compelling reason for the recent launch by the Human Proteome Organization (HUPO) of the global Human Proteome Project (HPP). The goal of the HPP is to identify and characterize at least one protein product from each of the protein-coding genes.
The impact of mass spectrometry on biological understanding may be best illustrated by a few examples. Toward higher-order structure determination of membrane proteins and their complexes, hydrogen bonding and stability have been probed via site-directed mutagenesis experiments using the membrane protein bacteriorhodopsin. Myoglobin conformation has been examined in the presence of a lipid bilayer, as has the way it is preserved within reverse micelles in the gas phase of the mass spectrometer. It has been possible to compare the conditions that result in the release of transporters (BtuC2D2, MacB, MexB) from gas-phase micelles with those typically used for soluble complexes of comparable molecular mass. By controlled release of complexes from intact micelles, subunit stoichiometry and lipid-binding properties have been unequivocally determined. There have been experiments on quaternary structures, and the gating mechanism of the ion channel KirBac3.1 has been determined. It has been shown that rotary adenosine triphosphatases (ATPases)/synthases can remain intact in vacuum along with membrane and soluble subunit interactions. Incorporation of protein subunits into the EmrE membrane complex and the effects of their PTM status upon dimer formation have been examined. Using targeted proteomics approaches it has been shown that low-abundance proteins, such as the tau protein in spinal fluid, or cytokines, can be detected in a proteome. A recent mass spectrometric study identified 1,043 gene products from human cells that are dispersed into more than 3,000 protein species created by PTMs, RNA splicing, and proteolysis. The present status of the field has been summarized in several papers.1,2,3,4,5,6,7,8,9,10
Mass spectrometric analysis requires that the sample molecules, or rather their ionic forms, be present in the gaseous state, something that was difficult to achieve for large molecules by traditional means without causing concurrent uncontrolled fragmentation. Two technical innovations about a decade before the turn of the twenty-first century made soft ionization conveniently realizable and ushered in large-molecule, and thus biological, mass spectrometry: one was electrospray ionization (ESI), and the other matrix-assisted laser desorption/ionization (MALDI). In electrospray ionization, an aqueous solution of the sample is passed through a capillary and vaporized in the presence of a strong electric field, which results in protons adhering to the molecules and forming molecular ions that are usually multiply charged. In MALDI, the sample, mixed with a solid matrix that strongly absorbs laser radiation, is exposed to a laser; the energy deposited by the radiation causes both desorption and ionization of the sample molecules, forming mostly singly charged ions. These are the two ionization methods used in most protein mass spectrometry. Laser-induced liquid bead ionization desorption (LILBID) and laser ablation electrospray ionization (LAESI) are more recently introduced methods of ionization.
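Because electrospray produces a series of multiply protonated ions of the same molecule, the neutral mass can be recovered from two neighboring peaks of that series. The following is a minimal sketch of that arithmetic, assuming purely protonated ions of the form [M + zH]z+ and two peaks known to differ by exactly one charge; the function names and the example mass are illustrative and are not taken from any particular software.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz_of_charge_state(neutral_mass, charge):
    """Return m/z of the protonated ion [M + zH]z+."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_from_adjacent_peaks(mz_low, mz_high):
    """Infer charge and neutral mass from two adjacent charge-state peaks.

    mz_high is assumed to carry charge z and mz_low charge z + 1.
    From mz = M/z + proton it follows that
    z = (mz_low - proton) / (mz_high - mz_low).
    """
    z = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    neutral_mass = z * (mz_high - PROTON_MASS)
    return z, neutral_mass


if __name__ == "__main__":
    example_mass = 16951.5  # hypothetical protein mass in Da
    peak_15 = mz_of_charge_state(example_mass, 15)
    peak_16 = mz_of_charge_state(example_mass, 16)
    print(mass_from_adjacent_peaks(peak_16, peak_15))
```

Real deconvolution software fits the whole charge-state envelope rather than a single pair of peaks; the pairwise form is shown only because it makes the relationship explicit.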
In determining protein structure, the top-down method starts with a first stage of mass spectral analysis of the intact protein, or protein mass fingerprinting (PMF). The intact molecular ion is then fragmented in a controlled manner, bit by bit, usually by some form of collisional activation, and the structure of the parent ion is reconstructed from the identity of the fragments thus generated, which is established through a second stage of mass spectral analysis, that is, MS/MS. The advantage of this intact-protein analysis is simpler tracking of the fragmentation process and hence more dependable reconstruction of the parent structure; its disadvantages are that it is more time consuming and that the analyzer often cannot handle ions of very large molecular mass. Its antipode is the far more popular bottom-up method, in which the initial step is enzymatic proteolysis of the sample. The peptides thus generated are fragmented and identified, and from them the structure of the original protein is inferred. Here, the advantage is high speed; the disadvantage is the substantially increased complexity of reconstructing the original structure. An intermediate course, referred to as the middle-down method, is now making its appearance. In the de novo method, the peptide sequence is derived without any prior information on the sequence and without use of a DNA database.
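The idea of reading a sequence back from fragment masses can be illustrated with a small sketch. It assumes the common b/y fragment-ion nomenclature (which the text above does not name), singly charged fragments, unmodified residues, and standard monoisotopic residue masses; the peptide "SAMPLER" and all function names are hypothetical. Consecutive members of an ideal b-ion ladder differ by one residue mass, so the mass differences spell out the sequence, except that leucine and isoleucine have the same mass and cannot be told apart this way.

```python
# Monoisotopic residue masses in Da (standard values, unmodified residues).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def b_ions(peptide):
    """Singly charged b1..b(n-1): N-terminal fragments."""
    total, ions = PROTON, []
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        ions.append(total)
    return ions


def y_ions(peptide):
    """Singly charged y1..y(n-1): C-terminal fragments."""
    total, ions = WATER + PROTON, []
    for aa in reversed(peptide[1:]):
        total += RESIDUE_MASS[aa]
        ions.append(total)
    return ions


def read_ladder(mz_values, tol=0.01):
    """Map consecutive mass differences of a fragment ladder to residues."""
    residues = []
    for lower, upper in zip(mz_values, mz_values[1:]):
        delta = upper - lower
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol]
        residues.append("/".join(matches) if matches else "?")
    return residues


if __name__ == "__main__":
    print(read_ladder(b_ions("SAMPLER")))
```

Measured spectra contain gaps, noise peaks, and mixed ion series, which is why practical de novo sequencing is far harder than this ideal ladder suggests.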
In most routine applications, including analysis of mixtures of proteins, the bottom-up method is used, and the most commonly used protease is trypsin. The bottom-up method coupled with high-performance liquid chromatography (HPLC) before the sample enters the mass spectrometer is referred to as the shotgun method. In MudPIT (multidimensional protein identification technology), strong cation exchange (SCX) and reversed-phase (RP) stationary phases are packed together in the same microcapillary column; the peptides are separated there by 2D chromatography and then eluted directly into the mass spectrometer. Inside the mass spectrometer, the tryptic peptides are first sorted in a mass analysis step, which is followed in tandem by fragmentation studies of individual peptides. Fragmentation is most commonly effected by collisional activation (collision-induced dissociation, CID, also called collisionally activated dissociation, CAD), with helium mostly used as the collision gas, but photodissociation, such as infrared multiphoton dissociation (IRMPD), is also used. Additional fragmentation techniques include electron capture dissociation (ECD), in which low-energy electrons injected from an external source interact with peptide ions and cause fragmentation; in electron transfer dissociation (ETD), radical anions, instead of free electrons, are employed. Also in use is electron detachment dissociation (EDD), which mimics positron capture; here, bombardment of anionic species by moderate-energy electrons causes electron detachment followed by backbone dissociation.
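The tryptic step of the bottom-up workflow is often simulated in software to predict which peptides a protein should yield. The sketch below assumes the commonly used rule that trypsin cleaves on the C-terminal side of lysine (K) or arginine (R) unless the next residue is proline (P); that rule, the optional allowance for missed cleavages, and the example sequence are assumptions for illustration and are not taken from the text.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return in silico tryptic peptides of a protein sequence.

    Cleavage is assumed after K or R, except when followed by P.
    Peptides spanning up to `missed_cleavages` uncut sites are included.
    """
    if not sequence:
        return []

    # Positions at which the chain is cut (start and end included).
    sites = [0]
    for i, aa in enumerate(sequence[:-1]):
        if aa in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for start in range(len(sites) - 1):
        for n_missed in range(missed_cleavages + 1):
            end = start + 1 + n_missed
            if end < len(sites):
                peptides.append(sequence[sites[start]:sites[end]])
    return peptides


if __name__ == "__main__":
    # Hypothetical sequence; the R before P is not cleaved.
    print(tryptic_digest("MKWVTFISLLRPGKAAR"))
    print(tryptic_digest("MKWVTFISLLRPGKAAR", missed_cleavages=1))
```

Allowing one or two missed cleavages is a common choice because enzymatic digestion is rarely complete, at the cost of a larger list of candidate peptides.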
Once the fragmentation data have been obtained, the next and final daunting task is to retrace the steps and infer from the fragment ion data the amino acid sequence of the peptides and then, in turn, the protein structure. This is commonly done using one or another of the available proprietary software packages, mostly commercial, which are built, in the first place, on available data on fragmentation patterns of known peptides. When the mass spectral output is submitted to the software, the additional inputs usually sought are the taxonomy of the sample, the protein mass range, the protease employed, and the database to be searched; the output is a list of names of probable proteins, along with their expectation values, that are compatible with the spectral data. Because of limitations of the software, there is no guarantee that the protein actually present in the sample will turn up at the top of the list. Further, since not all packages point to the same set of proteins, additional considerations are required to arrive at the final conclusion.
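The ranking behavior described above can be made concrete with a deliberately simplified sketch. It is not the scoring scheme of any real search engine: it merely counts how many observed peptide masses fall within a parts-per-million tolerance of the theoretical masses listed for each candidate protein. The protein names, the masses, and the tolerance are all made up for illustration.

```python
def within_ppm(observed, theoretical, ppm):
    """True if observed mass lies within +/- ppm of the theoretical mass."""
    return abs(observed - theoretical) <= theoretical * ppm * 1e-6


def rank_candidates(observed_masses, database, ppm=10.0):
    """Rank candidate proteins by the number of observed masses they explain.

    `database` maps a protein name to its list of theoretical peptide masses.
    """
    ranking = []
    for name, theoretical_masses in database.items():
        matched = sum(
            any(within_ppm(obs, theo, ppm) for theo in theoretical_masses)
            for obs in observed_masses
        )
        ranking.append((name, matched))
    # Highest number of matched masses first.
    ranking.sort(key=lambda item: item[1], reverse=True)
    return ranking


if __name__ == "__main__":
    toy_database = {  # hypothetical entries
        "protein_A": [800.4000, 1200.6000, 1500.7500],
        "protein_B": [800.4000, 950.5000, 2100.0000],
    }
    toy_observed = [800.4004, 1200.6010, 2500.0000]
    print(rank_candidates(toy_observed, toy_database))
```

Even in this toy case one observed mass matches both candidates, which hints at why a simple count can rank the wrong protein first and why real packages report statistical measures such as expectation values instead.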
Often, after the qualitative identification of the proteins in a sample has been accomplished, the next objective is to follow their variations across what may be a wide range of biological conditions of the sample sources. It is then expedient to concentrate attention on specific peptides related to the proteins of interest. To this end, operational modes are set, through software control, to track specific peptides or specific ions that are fragmentation products of those peptides. Narrowing the scans to a much smaller region of the mass spectrum drastically increases the speed of analysis; it also increases the dynamic range. In the selected reaction monitoring (SRM) procedure, no mass spectra are recorded; two mass analyzers are used in tandem, both acting as filters, to track a specific fragment ion arising from its precursor ion. In the multiple reaction monitoring (MRM) procedure, too, no mass spectra are recorded; here the instrument cycles through multiple precursor and fragment ion pairs. Both are targeted approaches; the paramount gains are greater speed of analysis and higher sensitivity.
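The two-filter logic of reaction monitoring can be sketched in a few lines. In this illustrative model each detected ion event is a tuple of precursor m/z, fragment m/z, and intensity, and a transition is a named pair of filter settings; signal is accumulated only for events that pass both filters. The transition names, m/z values, and the tolerance are hypothetical, and a real instrument applies the filters in hardware rather than on recorded events.

```python
from collections import defaultdict


def monitor_transitions(events, transitions, tol=0.5):
    """Sum intensity per monitored (precursor, fragment) transition.

    events      : iterable of (precursor_mz, fragment_mz, intensity)
    transitions : dict mapping a name to (precursor_mz, fragment_mz)
    tol         : half-width of each filter window in m/z units
    """
    signal = defaultdict(float)
    for precursor_mz, fragment_mz, intensity in events:
        for name, (q1, q3) in transitions.items():
            passes_first = abs(precursor_mz - q1) <= tol
            passes_second = abs(fragment_mz - q3) <= tol
            if passes_first and passes_second:
                signal[name] += intensity
    return dict(signal)


if __name__ == "__main__":
    toy_transitions = {  # hypothetical peptide transitions
        "pepA_y5": (523.8, 647.4),
        "pepB_y7": (611.3, 802.4),
    }
    toy_events = [
        (523.9, 647.3, 100.0),  # passes both filters for pepA_y5
        (523.8, 300.2, 50.0),   # right precursor, wrong fragment
        (611.3, 802.5, 30.0),   # pepB_y7
        (611.2, 802.4, 20.0),   # pepB_y7
        (700.0, 647.4, 999.0),  # wrong precursor, rejected
    ]
    print(monitor_transitions(toy_events, toy_transitions))
```

Monitoring a single entry in `transitions` corresponds to the SRM case; several entries correspond to MRM.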
There has also been significant progress in quantitative proteomics. Here only the main methods are named. In the label-free method no isotope labeling is used. ICAT (isotope-coded affinity tags) and SILAC (stable isotope labeling by amino acids in cell culture) are mass-difference methods that use isotope labeling. In the ICAT method, samples from two different cell states are treated with isotopically light and heavy (often 13C or D) ICAT reagents; they are then combined, digested, and analyzed by mass spectrometry. In the SILAC method, two populations of cells are cultivated, prior to mass spectrometric analysis, in growth media whose amino acids carry two different isotopes. iTRAQ (isobaric tag for relative and absolute quantification) and TMT (tandem mass tag) employ isobaric labeling. Here peptides are labeled with isobaric chemical groups, which on MS/MS fragmentation yield reporter ions of different masses. iTRAQ tags are available in 4-plex and 8-plex forms; TMT tags are available in duplex and 6-plex forms. In AQUA (absolute quantification), a synthetic tryptic peptide, which incorporates one stable-isotope-labeled amino acid and is therefore a few daltons heavier than the peptide of interest, is added as an internal standard to the biological sample prior to digestion and HPLC-MS analysis. From the peak ratios of the ion chromatograms, the quantity of the native peptide is obtained. High-attomole detection limits have been claimed for this method. The protein mass spectrometry field has also witnessed extensions in nomenclature. Analysis in which the protein composition of the sample was the primary objective used to be known as analysis in discovery mode; analysis of changes in the concentration of a few specific proteins used to be known as analysis in targeted mode. Currently, data acquisition modes are also referred to as data-dependent acquisition (DDA) or data-independent acquisition (DIA).
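The AQUA calculation mentioned in this paragraph reduces to a ratio of chromatographic peak areas scaled by the known amount of spiked standard. The following sketch assumes that the native and labeled peptides respond identically in the instrument, that each ion chromatogram is a simple list of time and intensity points, and that areas are taken by the trapezoidal rule; the function names and numbers are illustrative only.

```python
def peak_area(times, intensities):
    """Integrate an ion chromatogram peak by the trapezoidal rule."""
    area = 0.0
    for i in range(1, len(times)):
        mean_height = 0.5 * (intensities[i] + intensities[i - 1])
        area += mean_height * (times[i] - times[i - 1])
    return area


def aqua_amount(native_area, standard_area, standard_amount):
    """Amount of native peptide from the native/standard peak-area ratio.

    The result is in the same unit as `standard_amount` (for example fmol).
    """
    if standard_area <= 0:
        raise ValueError("standard peak area must be positive")
    return native_area / standard_area * standard_amount


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0]           # hypothetical retention times
    native = [0.0, 10.0, 0.0]         # native (light) peptide trace
    standard = [0.0, 20.0, 0.0]       # labeled (heavy) standard trace
    amount = aqua_amount(peak_area(times, native),
                         peak_area(times, standard),
                         standard_amount=50.0)
    print(amount)
```

The same ratio arithmetic underlies relative quantification with light and heavy pairs in general, except that without a spiked standard of known amount only a fold change, not an absolute quantity, is obtained.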
Both DDA and DIA can be used for discovery, that is, without testing a hypothesis. In targeted analysis, a set of hypotheses is needed. Usually the experiment is performed by collecting data only for a chosen set of peptides, but DIA data, which are intended to cover all peptides, can also be analyzed in a targeted way once they have been acquired.
The instrumentation in the field is configurationally traditional in that it has the same set-up: ion source (in this case mainly ESI and MALDI), analyzer, and detector. The options in analyzers, however, are many: the time-of-flight, the 2D quadrupole (with its trapping version, the linear trap), the 3D RF trap, the ion cyclotron resonance (ICR) cell, which requires a high magnetic field, and a significant new addition to the array, the electrostatic Orbitrap. Along with these are the implements needed for MS/MS, that is, for techniques such as CAD/CID, ECD, ETD, EDD, and IRMPD. A characteristic of protein mass spectrometry instrumentation is that the designs of the instruments are modular; that is, the overall assembly of a given instrument is one of many possible combinations of the basic implements mentioned above, chosen to achieve a certain desired result. Almost every research laboratory in the field uses commercial instruments, including those laboratories that have demonstrated the capability to design their own original instruments. An advantage of this is that in many cases results from a number of laboratories can be compared conveniently because they use the same or similar instruments; a severe limitation is that the way an experiment can be conducted is entirely determined by what the manufacturer of an instrument makes operationally feasible, so that the user is limited to a relatively narrow range of options. A major recent introduction to this instrumentation field is the coupling of ion mobility spectrometry and mass spectrometry; ions...