An exploration of chemometric regression techniques to analyze infrared spectra of aqueous sugar mixtures

WCU Author/Contributor (non-WCU co-authors, if there are any, appear on document)
Morgan Elise Cheek (Creator)
Institution
Western Carolina University (WCU)
Web Site: http://library.wcu.edu/
Advisor
Scott Huffman

Abstract: Infrared spectroscopy (IR) is a valuable tool for both qualitative and quantitative studies in chemistry because of its high sensitivity, robustness, short measurement time, and ease of use. However, IR has several disadvantages when it comes to quantitation of mixtures. Most notably, because the technique is so sensitive, spectra of mixture samples become highly convoluted. Additionally, intermolecular forces in a mixture can shift the frequency of a vibration, which complicates analyses that rely on only one wavelength. Aqueous samples, and particularly aqueous sugar samples, are mixtures that exemplify these problems. Sugars form hydrates in water, and these hydrates have IR spectra that differ from those of the pure sugars. Such mixtures violate the assumptions of Beer's Law, the basis for quantitative spectroscopy. Therefore, quantitative analysis of aqueous sugar samples by IR does not give accurate results when using normal regression techniques.

The goal of this project was to improve the accuracy of this type of analysis by using advanced multivariate regression techniques and data preprocessing. Simple linear models like classical least squares regression (CLSR) were expected to give less accurate results than models like principal component analysis regression (PCAR) and partial least squares regression (PLSR), which can use more variables to explain unknown complexes in a mixture, like sugar hydrates. These regression techniques were used to predict sugar concentrations in one-component aqueous fructose mixtures as well as in three-component aqueous sugar mixtures. Data preprocessing was used to optimize the parameters of these techniques. The accuracy of these analyses was assessed using the standard error of prediction (SEP) and the relative standard error of prediction (RSEP).

It was found that for both one-component and three-component mixtures, PLSR was the most accurate regression model. CLSR gave the highest errors, which was expected due to its reliance on Beer's Law assumptions. PCAR performed better than CLSR, but worse than both forms of PLSR. PLSR1 and PLSR2 gave similar error values, and each performed better for different components. This work helped show that it is possible to somewhat accurately model data from IR spectra of aqueous sugar mixtures. This is beneficial because current methods for analyzing these samples are time-consuming and expensive. With more development, these techniques could be applied to more complex samples for use in industry.
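
As an illustration of the workflow the abstract describes, the following is a minimal sketch, not the thesis's actual code or data. It assumes the common definitions SEP = sqrt(sum((predicted - true)^2) / n) and RSEP = 100 * sqrt(sum((predicted - true)^2) / sum(true^2)); the thesis may define them differently. The spectra are synthetic and obey Beer's Law exactly (three made-up Gaussian bands plus noise), so this example does not reproduce the hydrate effects or the reported ranking of models. All function and variable names are hypothetical, and PLSR is omitted for brevity.

```python
import numpy as np

def sep(y_true, y_pred):
    """Standard error of prediction (assumed definition: RMS prediction error)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_pred - y_true) ** 2))

def rsep(y_true, y_pred):
    """Relative SEP in percent (assumed definition: error relative to true values)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.sqrt(np.sum((y_pred - y_true) ** 2) / np.sum(y_true ** 2))

def cls_fit(A, C):
    """CLSR calibration: model A = C K and estimate pure-component spectra K."""
    K, *_ = np.linalg.lstsq(C, A, rcond=None)
    return K

def cls_predict(K, A):
    """CLSR prediction: solve A = C K for the concentrations C."""
    C_T, *_ = np.linalg.lstsq(K.T, A.T, rcond=None)
    return C_T.T

def pcr_fit(A, C, n_components):
    """PCAR calibration: regress concentrations on principal component scores."""
    mean_A, mean_C = A.mean(axis=0), C.mean(axis=0)
    _, _, Vt = np.linalg.svd(A - mean_A, full_matrices=False)
    P = Vt[:n_components].T            # loadings, shape (wavelengths, components)
    T = (A - mean_A) @ P               # scores of the calibration spectra
    B, *_ = np.linalg.lstsq(T, C - mean_C, rcond=None)
    return mean_A, mean_C, P, B

def pcr_predict(model, A):
    mean_A, mean_C, P, B = model
    return (A - mean_A) @ P @ B + mean_C

# Synthetic three-component data (hypothetical; ideal Beer's Law behavior).
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 200)
K_true = np.stack([np.exp(-((grid - c) / 0.08) ** 2) for c in (0.3, 0.5, 0.7)])
C_cal = rng.uniform(0.1, 1.0, size=(30, 3))
C_test = rng.uniform(0.1, 1.0, size=(10, 3))
A_cal = C_cal @ K_true + 0.01 * rng.standard_normal((30, 200))
A_test = C_test @ K_true + 0.01 * rng.standard_normal((10, 200))

C_cls = cls_predict(cls_fit(A_cal, C_cal), A_test)
C_pcr = pcr_predict(pcr_fit(A_cal, C_cal, n_components=3), A_test)

for name, pred in [("CLSR", C_cls), ("PCAR", C_pcr)]:
    print(name, "SEP:", sep(C_test, pred), "RSEP (%):", rsep(C_test, pred))
```

Because the synthetic spectra are strictly linear in concentration, CLSR and PCAR should perform comparably here. The differences reported in the abstract arise when real spectra deviate from Beer's Law, which is where models with additional latent variables are expected to help.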

Additional Information

Publication
Thesis
Language: English
Date: 2019
Keywords
aqueous, classical least squares, infrared, partial least squares, principal component analysis, sugar
