A Holistic Approach to Transformer-Based Models: Architecture, Applications, and Ethical Layers

Authors :Söz Tuana DEMİR, Nida GÖKÇE NARİN
Pages :1-27
Abstract :This chapter presents a detailed description of transformer-based models, which have revolutionized artificial intelligence, particularly in natural language processing, vision, speech, and multimodal applications. It begins with the limitations of earlier architectures, such as Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks, which were widely used before the advent of transformers, and explains how transformers addressed these deficiencies. The transformer model introduced by Vaswani et al. in 2017 overcomes these limitations, particularly through parallel processing and the self-attention mechanism. The chapter covers the essential components of transformers, such as multi-head attention, positional encoding, residual connections, and the encoder-decoder structure. It then turns to current leading models, such as BERT, GPT (through GPT-4), DALL·E, Claude 3, Gemini, LLaMA 3, Mistral, Whisper, and Sora. What powers these systems, where they excel, and the types of output they produce are illustrated with examples. Beyond the technical material, the chapter also examines the social, environmental, and ethical problems associated with transformers. These include hallucinations, in which a model produces content that is not grounded in reality; biased results; copyright infringement; threats to personal data; and the environmental cost of very large models. For example, a model could generate a convincing news headline about an incident that never occurred, which shows how dangerous hallucinations can be in critical applications.
The chapter emphasizes that technology professionals must not only be capable of innovating but also act in accordance with values such as transparency, fairness, ethics, and sustainability. It also offers practical advice on which models may be better suited to particular applications, such as creative writing, classification, multimodal tasks, or speech generation, and sets out key points to keep in mind in order to minimize risks. Finally, it highlights the importance of collaboration among developers, decision-makers, and society so that this growth benefits society as a whole.
Keywords :Transformer architecture, Generative artificial intelligence, Ethical artificial intelligence, Multi-modal tasks, Natural language processing
Doi:10.5281/zenodo.16008405
Pdf URL :https://www.izmirakademi.org/books/The_Age_of_Generative_Artificial_Intelligence/cp1/pdf/cp1.pdf