Theoretically deriving computational limits of Artificial Neural Networks with bounded precision and time
- Author:
- Mali, Ankur Arjun
- Published:
- [University Park, Pennsylvania] : Pennsylvania State University, 2022.
- Physical Description:
- 1 electronic document
- Additional Creators:
- Giles, C. Lee
- Access Online:
- etda.libraries.psu.edu (connect to this object online)
- Restrictions on Access:
- Restricted (PSU Only).
- Summary:
- In recent years, architectures based on artificial neural networks (ANNs) have succeeded in a wide range of applications, even outperforming humans in a few. The two dominant families of ANNs rely either on recurrence, as in recurrent neural networks (RNNs) and autoregressive models, or on self-attention, as in transformers. These models have shown promising results in many applications, including natural language processing. Yet despite their impressive empirical results, ANNs often lack theoretical grounding and are difficult to interpret. Recent work has attempted to relate ANNs to formal language theory in order to derive their computational power, showing that models such as RNNs and transformers are Turing complete (TC) and can model any language or pattern. However, these results rely on infinite precision in the hidden representation, on positional encodings for transformer models, and on unbounded computation time. Furthermore, they require multiple passes over the data to recognize a Turing-complete grammar, which is impractical for real-time implementations and does not reveal the true computational power of ANNs. Motivated by these issues, this dissertation theoretically derives the computational limits of various categories of ANNs and proposes new foundational models for grammatical inference. It introduces a new family of formal models, with and without external memory, that are Turing complete with bounded precision and operate in real time. Specifically, this dissertation: 1) theoretically shows that there exists a smallest second-order RNN, with only 11 unbounded neurons, that is Turing complete in real time and is more powerful than first-order RNNs and even transformers; 2) develops two new foundational models, the Neural State Turing Machine (NSTM) and the Neural Network Turing Machine (NNTM), which are provably TC with bounded precision and weights and can execute any algorithm in real time; 3) develops computational approaches to train these systems to recognize complex algorithmic patterns and mathematical equations; and 4) develops computationally efficient architectures, inspired by the above theoretical findings, that are scalable and work on large datasets. Finally, this dissertation theoretically shows that the smallest neural networks ever designed, with and without recurrence, are TC with bounded precision while crucially operating in real time. In particular, we show that a neural network with only 7 bounded neurons is TC and can operate in real time, an order of magnitude smaller than any other model in the literature and the first such network to operate in real time. The work carried out in this dissertation is meant to serve as a crucial stepping stone toward tackling one of the most significant challenges facing machine learning and natural language processing research: theoretically deriving the computational limits of ANNs with bounded precision and weights in real time.
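The summary's central object, the second-order RNN, updates its hidden state through multiplicative connections between the previous state and the current input symbol, h_i(t+1) = sigma(sum_{j,k} W_{ijk} h_j(t) x_k(t) + b_i); this tensor coupling is what lets a small number of neurons encode state-transition rules directly, the construction style that such Turing-completeness proofs typically build on. The minimal Python sketch below illustrates that standard update; it is illustrative only, not taken from the dissertation (which is access-restricted), and all parameter shapes and values are hypothetical.

```python
import numpy as np

def second_order_rnn_step(W, b, h, x):
    """One update of a second-order RNN cell.

    W : (n_h, n_h, n_x) weight tensor coupling state and input
    b : (n_h,) bias vector
    h : (n_h,) current hidden state
    x : (n_x,) input symbol (one-hot here)
    """
    # Each new state unit mixes products of every (state, input) pair:
    # pre[i] = sum_{j,k} W[i, j, k] * h[j] * x[k] + b[i]
    pre = np.einsum('ijk,j,k->i', W, h, x) + b
    return 1.0 / (1.0 + np.exp(-pre))  # sigmoid activation

# Tiny usage example with random (hypothetical) parameters.
rng = np.random.default_rng(0)
n_h, n_x = 11, 3   # e.g., an 11-neuron state over a 3-symbol alphabet
W = rng.normal(size=(n_h, n_h, n_x))
b = np.zeros(n_h)
h = np.zeros(n_h)
for symbol in [0, 1, 2]:  # feed a short input string
    h = second_order_rnn_step(W, b, h, np.eye(n_x)[symbol])
print(h.round(3))
```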
- Dissertation Note:
- Ph.D. Pennsylvania State University 2022.
- Technical Details:
- The full text of the dissertation is available as an Adobe Acrobat .pdf file; Adobe Acrobat Reader is required to view the file.
catkey: 38970296