Machine’s conceptual development using FNet

UNCG Author/Contributor (non-UNCG co-authors, if there are any, appear on document)
Deepa Jayanna (Creator)
Institution
The University of North Carolina at Greensboro (UNCG)
Web Site: http://library.uncg.edu/
Advisor
Shanmugathasan Suthaharan

Abstract: Masked language modeling (MLM) is a well-known technique in natural language processing (NLP) in which a model is trained on randomly masked tokens and then used to predict the masked words. FNet is a recently developed Fourier transform-based transformer that addresses the MLM problem. It eschews the widely used attention computation entirely and replaces it with a Fourier transform to perform token mixing. The FNet model reduces the computational complexity of self-attention; however, it does so at the cost of lower accuracy than its counterparts. It is well known that the Fourier transform suffers from the spectral leakage problem, caused by undersampling frequencies from the true infinite frequency domain; as a result, FNet suffers from an aliasing problem that we call text aliasing in our study. Text aliasing, because it results from spectral leakage in the Fourier domain, reduces FNet’s ability to predict the correct word for a masked token. In this thesis, we adapted the concept of learning by exclusion, which is well established in word learning for children’s conceptual development, and introduced a new concept of learning by frequency-exclusion in the Fourier domain to facilitate word learning for a machine’s (e.g., FNet’s) conceptual development. The idea is to detect the effect of word aliasing through the mutual exclusivity of the narrow-band frequencies and pass that information to FNet’s encoding mechanism so that the encoder can learn the masked tokens as its vocabulary grows. To validate and evaluate the proposed approach, we conducted experiments with 15 different sentences as inputs by masking a few words and performing MLM using the pre-trained FNet model parameters. Our finding is that integrating the proposed learning by frequency-exclusion helps FNet improve its performance.
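
As a rough illustration of the token mixing the abstract describes, the sketch below applies a 2D discrete Fourier transform to a small embedding matrix and keeps the real part, which is how the published FNet architecture replaces self-attention. It is a minimal sketch and not the thesis's code: the function name `fourier_mixing`, the array sizes, and the random input are assumptions made for illustration, and the frequency-exclusion step proposed in the thesis is not shown.

```python
import numpy as np


def fourier_mixing(x):
    """Mix tokens with a 2D DFT and keep the real part.

    x: real array of shape (seq_len, hidden_dim), one row per token.
    The transform runs over both axes, so each output entry is a
    weighted sum over every token and every hidden feature.
    """
    return np.fft.fft2(x).real


# Hypothetical toy input: 8 tokens with 4 hidden features each.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4))

mixed = fourier_mixing(x)
print(mixed.shape)
```

The mixing step has no learned parameters, which is the source of the reduced computational cost the abstract mentions relative to self-attention.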

Additional Information

Publication
Thesis
Language: English
Date: 2022
Keywords
Contextual words, FNet, Machine Learning, Natural Language Processing
Subjects
Natural language processing (Computer science)
Fourier transformations -- Computer programs
