Food protein powders are complex mixtures of carbohydrates, fats, and proteins. Even protein powders of the same kind can vary in nutritional and processing characteristics depending on their origin and on the extraction/isolation process. For many food manufacturers, consistent product quality depends on being able to quickly classify protein powders of a specific type, to distinguish nominally similar proteins from different suppliers, and to evaluate lot-to-lot variation.
One of the most significant characteristics of any protein is its secondary structure: the local structural conformations that depend on the pattern of hydrogen bonding between carbonyl oxygen and amide hydrogen atoms in the backbone peptide bonds. FTIR has long been recognized as a practical analytical method for characterizing protein secondary structure. The contributions of distinct secondary structures can be estimated by curve-fitting or deconvolution of the amide I band (~1650 cm⁻¹), which arises from the C=O stretching vibration of the protein’s amide group [1,2], yielding key structural information such as conformation and stability [1,2,3,4,5].
However, the curve-fitting strategy is well known to depend on band assignments for the secondary structures (α-helix and β-sheet) derived from pure proteins. Hence, although this approach succeeds in characterizing isolated single proteins, it is less than optimal for complex mixtures like protein powders, because interactions between multiple proteins and non-protein materials affect the spectral features in the amide I region (1700–1600 cm⁻¹) [6].
This article demonstrates the feasibility of classifying and differentiating food protein powders by combining FTIR spectroscopy with principal component analysis (PCA). By choosing the appropriate spectral ranges for PCA, similarities and differences between various kinds of protein powders, between the same protein powders from different vendors, and between different lots can be evaluated with respect to both overall composition and protein secondary structure.
Samples of pea, rice, milk, and whey protein powder from various vendor sources were made available for analysis. As outlined in Table 1, the number of vendor sources ranged from as many as five (whey protein) to one (milk protein), and the number of lots from a single vendor ranged from three to one.
Table 1. Protein powder samples measured by FTIR spectroscopy
Number of Lots
A1, A2, A3
J1, J2, J3
Spectra of the protein powders were measured in attenuated total reflectance (ATR) mode using the integrated ATR accessory of the Thermo Scientific™ Nicolet™ iS50 FTIR Spectrometer, which uses a monolithic diamond crystal. For each measurement, a small amount of protein powder was placed on the diamond ATR crystal, and good contact between the powder and the crystal was ensured using the pressure tower of the accessory. Three sub-samples from each sample were measured at a resolution of 4 cm⁻¹ with 512 scans.
The Nicolet iS50 spectrometer was purged with nitrogen to avoid the effect of water vapor on the spectra. Following data collection, the advanced ATR-correction feature of Thermo Scientific™ OMNIC™ Software was applied to all spectra. Results given in this article are averages over the three sub-samples. Thermo Scientific™ TQ Analyst™ Software was used to perform the principal component analysis for spectral characterization and classification.
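The averaging over sub-samples can be pictured with a short sketch. It assumes the averaging is done point by point on spectra that share one wavenumber grid, which the text does not state explicitly. The function name `average_spectra` and the absorbance values are made up for illustration and are not taken from the measurements.

```python
def average_spectra(spectra):
    """Point-by-point mean of spectra given as equal-length lists of absorbances."""
    if not spectra:
        raise ValueError("need at least one spectrum")
    n_points = len(spectra[0])
    if any(len(s) != n_points for s in spectra):
        raise ValueError("all spectra must share the same wavenumber grid")
    # zip(*spectra) walks the spectra in parallel, one spectral point at a time
    return [sum(values) / len(spectra) for values in zip(*spectra)]


# Hypothetical absorbances for three sub-samples of one powder (illustrative only)
sub_samples = [
    [0.10, 0.20, 0.30],
    [0.12, 0.22, 0.28],
    [0.11, 0.18, 0.32],
]
mean_spectrum = average_spectra(sub_samples)
print(mean_spectrum)
```

The grid check matters in practice: averaging spectra collected at different resolutions or ranges without resampling would mix absorbances from different wavenumbers.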
Results and Discussion
Protein Powder Spectra
Representative spectra of each protein type are shown in Figure 1A. Milk protein powder was available from only one vendor, but the lot-to-lot reproducibility of this product can be observed in Figure 1B. Figures 1C-E show representative spectra of the samples from different vendors for the remaining protein types, grouped by protein type. There are clear differences among the protein types in both the amide I region (1700–1600 cm⁻¹) and the amide II region (1580–1510 cm⁻¹), due to differences in their secondary structure. The whey protein spectra group (Figure 1E) has the greatest vendor-to-vendor variation; in contrast, the pea protein spectra group (Figure 1C) has the smallest.
Furthermore, variation caused by the non-protein components can also be seen. For instance, the carbohydrate peak at ~1080 cm⁻¹ varies considerably within each protein group. As another example, the lipid peak at ~1743 cm⁻¹ also varies across the samples. Although the lipid peak is very weak in the milk protein samples (Figure 1B), it is clearly visible, with varying intensities, in the pea protein spectra (Figure 1C), the rice protein spectra (Figure 1D), and the whey protein spectra (Figure 1E).
Figure 1. (A) Full-scale ATR-corrected spectra of protein powders
Figure 1. (B) Spectra of the three lots from the one milk protein vendor; (C) Spectra of pea protein powders from three different vendors; (D) Spectra of rice protein powders from three different vendors; and (E) Spectra of whey protein powders from five different vendors.
Classification by PCA Using the Overall Mid-IR Region
Principal component analysis is a statistical procedure widely used to extract the significant variance from a spectral calibration set. PCA calculates factors (principal components) that model the variance among the spectra. The first factor captures the largest variance in the dataset, and each subsequent factor, in turn, captures the highest possible remaining variance under the condition that it is orthogonal to the preceding factors. Each individual spectrum of the calibration set can be reconstructed as a linear combination of the factors. The coefficients for each factor, also known as “scores,” can be plotted in a PCA space to reveal similarities and differences between spectra. In this way, a spectrum with thousands of data points can be reduced to a single point in a two- or three-dimensional space, provided that the overall variance is effectively modeled by the first two or first three factors, respectively.
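To make the roles of factors and scores concrete, here is a minimal, dependency-free sketch of PCA using the NIPALS iteration that is common in chemometrics. It is not the algorithm TQ Analyst uses internally, which the article does not describe. The names `nipals_pca` and `reconstruct` and the five-point toy "spectra" are invented for illustration.

```python
import math


def nipals_pca(spectra, n_factors=2, tol=1e-12, max_iter=1000):
    """Return (means, scores, loadings) for mean-centered spectra.

    scores[i][k]   : score of spectrum i on factor k
    loadings[k][j] : factor k at spectral point j (unit length)
    """
    n_points = len(spectra[0])
    means = [sum(col) / len(spectra) for col in zip(*spectra)]
    resid = [[v - m for v, m in zip(row, means)] for row in spectra]
    scores = [[] for _ in spectra]
    loadings = []
    for _ in range(n_factors):
        # Start from the residual column with the largest sum of squares.
        t = list(max(zip(*resid), key=lambda col: sum(v * v for v in col)))
        for _ in range(max_iter):
            tt = sum(v * v for v in t)
            p = [sum(row[j] * ti for row, ti in zip(resid, t)) / tt
                 for j in range(n_points)]
            norm = math.sqrt(sum(v * v for v in p))
            p = [v / norm for v in p]
            t_new = [sum(a * b for a, b in zip(row, p)) for row in resid]
            change = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if change < tol:
                break
        loadings.append(p)
        for sample_scores, ti in zip(scores, t):
            sample_scores.append(ti)
        # Deflate: remove the variance explained by this factor.
        resid = [[v - ti * pj for v, pj in zip(row, p)]
                 for row, ti in zip(resid, t)]
    return means, scores, loadings


def reconstruct(means, sample_scores, loadings):
    """Rebuild one spectrum as mean + sum over factors of score * loading."""
    return [m + sum(t * p[j] for t, p in zip(sample_scores, loadings))
            for j, m in enumerate(means)]


# Hypothetical toy "spectra": two look-alike pairs (illustrative only)
toy_spectra = [
    [1.0, 2.0, 3.0, 2.0, 1.0],
    [1.1, 2.1, 3.2, 2.0, 0.9],
    [3.0, 1.0, 0.5, 1.0, 3.0],
    [3.1, 0.9, 0.4, 1.2, 3.1],
]
means, scores, loadings = nipals_pca(toy_spectra, n_factors=2)
for label, (pc1, pc2) in zip(["a1", "a2", "b1", "b2"], scores):
    print(f"{label}: PC1={pc1:+.3f}  PC2={pc2:+.3f}")
```

Each printed pair is the coordinate of one spectrum in a two-factor scores plot. With real spectra a numerical library would normally replace the hand-written loops, and the sketch does not guard against requesting more factors than the data can support.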
Before performing principal component analysis, several pretreatments were applied to the ATR-corrected protein spectra. The spectra were converted to second derivatives and then given a standard normal variate (SNV) correction to compensate for intensity variation caused by different packing densities on the ATR crystal. Only relevant spectral regions were used for the analysis in order to extract the most meaningful variation. Here, the whole mid-IR spectral range of 4000–500 cm⁻¹, except the region 2356–1900 cm⁻¹, where the diamond ATR crystal itself absorbs, was used for the PCA. Figure 2 shows the resulting scores plot.
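The pretreatment chain can be sketched as follows. Several choices here are assumptions rather than details from the article: a plain central-difference second derivative stands in for whatever derivative filter the software applies, the excluded region is removed before SNV rather than after, and SNV uses the sample (n-1) standard deviation. All function names and the single-band synthetic spectrum are hypothetical.

```python
import math
import statistics


def second_derivative(wavenumbers, absorbance):
    """Central-difference second derivative on a uniform grid (end points dropped)."""
    step = wavenumbers[1] - wavenumbers[0]
    d2 = [(absorbance[i - 1] - 2.0 * absorbance[i] + absorbance[i + 1]) / step ** 2
          for i in range(1, len(absorbance) - 1)]
    return wavenumbers[1:-1], d2


def exclude_region(wavenumbers, values, low=1900.0, high=2356.0):
    """Drop points inside [low, high] cm^-1, the diamond region named in the text."""
    kept = [(w, v) for w, v in zip(wavenumbers, values) if not (low <= w <= high)]
    return [w for w, _ in kept], [v for _, v in kept]


def snv(values):
    """Standard normal variate: center to zero mean, scale to unit standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def preprocess(wavenumbers, absorbance):
    w, d2 = second_derivative(wavenumbers, absorbance)
    w, d2 = exclude_region(w, d2)
    return w, snv(d2)


# Synthetic one-band spectrum on a 2 cm^-1 grid from 4000 down to 500 (illustrative only)
grid = [4000.0 - 2.0 * i for i in range(1751)]
thin = [math.exp(-((w - 1650.0) / 20.0) ** 2) for w in grid]
thick = [2.0 * a for a in thin]  # same material, twice the overall intensity

w_thin, p_thin = preprocess(grid, thin)
w_thick, p_thick = preprocess(grid, thick)
print(max(abs(a - b) for a, b in zip(p_thin, p_thick)))
```

The two synthetic spectra differ only by an overall intensity factor, mimicking different packing densities, so the printed maximum difference after pretreatment should be negligible. That is the effect the SNV step is meant to achieve.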
Figure 2. Principal component scores plot from protein powder samples using most of mid-IR region.
Figure 2 shows that just two principal components (PCs) are sufficient to classify the various protein types effectively. Each protein type clusters in its own domain of the PC space. Closer inspection of the clusters in Figure 2 reveals the variation within each protein group. Data points from the same vendor, such as A1, A2, and A3 for the milk protein and I1 and I2 for the whey protein, are closely clustered, indicating a reproducible process for each vendor that yields products with minimal lot-to-lot variation.
However, the variation among different vendors is usually more pronounced. For instance, while vendors J, K, and M seem to produce similar whey protein powders, the whey protein products from vendors L and I are distinctly different. The spectrum of the whey protein from vendor L (the green trace in Figure 1E) has a considerably larger absorbance in the 1080 cm⁻¹ region than the rest of the group, signifying a higher carbohydrate content in this product.
In the case of the rice protein powders, the data points from different vendors are comparatively scattered. In the case of the pea protein powders, product B is analogous to D and product C is analogous to E, but the two clusters are distant from each other. It should be noted that the PCA explained above is based on the overall mid-IR spectral range, hence the variance includes the contributions from both protein and non-protein components.
Analysis of the Amide I Region
For the direct comparison of proteins in the products, a PCA based only on the amide I spectral region was carried out. The amide I spectral region was selected since it is specific to the protein secondary structure. Figure 3 shows the amide I region of the spectra for all products, where the shape of the amide I band differs from product to product and from vendor to vendor.
Figure 3. Full-scale, ATR-corrected spectra of protein powders in the amide I spectral region (1700–1600 cm⁻¹) showing each protein type with a spectrum from each vendor source. Since only one milk powder vendor sample was available, this plot includes three different lots.
Figure 4 shows the corresponding PCA scores plot. The general grouping of the protein types is similar to that in the full-range PCA scores plot in Figure 2. Each protein type clusters in its own domain; however, the grouping is somewhat tighter than in the full-range PCA model. This observation confirms that the variations seen in Figure 2 include both protein and non-protein contributions. The whey protein from vendor L is a relevant example. In the full-range PCA model (Figure 2), data points from vendor L (L1 and L2) are far away from the cluster that includes vendors J, K, and M, but in the amide I model (Figure 4) they are much closer.
It is logical to conclude that the whey protein powders from vendor L differ from those from vendors J, K, and M primarily in their non-protein content. On the other hand, products from vendor I remain distinctly different from the other whey protein powders in both PCA models, indicating that the difference between vendor I and the remaining whey proteins is caused, at least to a certain extent, by a difference in protein conformation. This observation is supported by Figure 3, where the trace I1 is evidently different from those of the remaining whey proteins.
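The "far away" versus "much closer" comparison between the two models can be put on a numerical footing. One possible approach, not used in the article, is to express a vendor's distance from a reference cluster in units of that cluster's own spread, so that score plots with different scales become comparable. The score coordinates below are invented placeholders that only mimic the qualitative pattern described above. They are not read from Figures 2 or 4, and the function names are hypothetical.

```python
import math


def centroid(points):
    """Mean position of a set of score-plot points."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def relative_distance(candidate, reference):
    """Distance between centroids, in units of the reference cluster's RMS radius."""
    ref_c = centroid(reference)
    rms = math.sqrt(sum(math.dist(p, ref_c) ** 2 for p in reference) / len(reference))
    return math.dist(centroid(candidate), ref_c) / rms


# PLACEHOLDER (PC1, PC2) scores, invented for illustration only
full_range = {
    "JKM": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "L": [(6.0, 5.0), (6.2, 5.2)],
}
amide_i = {
    "JKM": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "L": [(1.5, 1.5), (1.7, 1.3)],
}

d_full = relative_distance(full_range["L"], full_range["JKM"])
d_amide = relative_distance(amide_i["L"], amide_i["JKM"])
print(f"full range: {d_full:.2f}   amide I: {d_amide:.2f}")
```

A relative distance that drops sharply when only the amide I region is modeled would point to non-protein components as the main source of the difference. A distance that stays large in both models, as described for vendor I, would point to the protein itself.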
Figure 4. Principal component scores plot from protein powder samples using the amide I (1700–1600 cm⁻¹) region.
This article shows that the combination of FTIR spectroscopy and principal component analysis is an effective tool in the classification and discrimination of different food protein powders. The overall mid-IR spectral range can be used to readily observe the vendor formulation differences resulting from non-protein components, which can be used as the basis for classification and discrimination. The differences in the protein secondary structure are evidently reflected by the scores plot of the PCA model based only on the amide I region.
Although the amide I region-based PCA model is less susceptible to non-protein variation, it still effectively classified each protein type and discriminated products from different vendors. Both models enable FTIR to classify and discriminate protein powders by product type as well as by supplier source and lot-to-lot repeatability. The experiments are simple to perform and require no sample preparation. Food manufacturers can readily adopt the methodology for incoming material inspections and QA/QC.
References and Further Reading
1. Elliott, A., Ambrose, E.J. Structure of synthetic polypeptides, Nature (1950) 165, 921-922.
2. Byler, D.M., Susi, H. Examination of the secondary structure of proteins by deconvolved FTIR spectra, Biopolymers (1986) 25, 469-487.
3. Barth, A. Infrared spectroscopy of proteins, Biochim. Biophys. Acta (2007) 1767, 1073-1101.
4. Sukumaran, S. Protein secondary structure elucidation using FTIR spectroscopy, Thermo Fisher Scientific Application Note AN52985.
5. Jackson, M., Mantsch, H.H. The use and misuse of FTIR spectroscopy in the determination of protein structure, Crit. Rev. Biochem. Mol. Biol. (1995) 30, 95-120.
6. Zeeshan, F., Tabassum, M., Jorgensen, L., Medlicott, N. Attenuated Total Reflection Fourier Transform Infrared (ATR FTIR) Spectroscopy as an Analytical Method to Investigate the Secondary Structure of a Model Protein Embedded in Solid Lipid Matrices, Applied Spectroscopy (2018) 72(2), 268-279.
This information has been sourced, reviewed and adapted from materials provided by Thermo Fisher Scientific – Materials & Structural Analysis.