Generative Artificial Intelligence, Foundation Models and Large Language Models | BULUT BİLİŞİM VE BÜYÜK VERİ ARAŞTIRMA LABORATUVARI

Generative Artificial Intelligence

Generative Artificial Intelligence (GenAI) refers to AI models designed to generate synthetic content, including text, images, audio, or video, that resembles content created by humans [1]. Numerous GenAI models have recently been developed and released to the market. GenAI models analyze large amounts of data to identify patterns and relationships, and then replicate them probabilistically. The resulting synthetic content can resemble existing patterns or show novel elements, depending on the input provided (the prompt) [1].
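
The idea of identifying patterns and replicating them probabilistically can be illustrated with a deliberately tiny sketch. The example below is not how production GenAI models work; it assumes a toy word-level bigram model, and the names `train_bigram_counts`, `generate`, and the sample corpus are hypothetical. It only shows the principle: count which word tends to follow which, then sample new text from those counts, starting from a prompt.

```python
import random
from collections import defaultdict


def train_bigram_counts(text):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(lambda: defaultdict(int))
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, prompt_word, length, seed=0):
    """Sample up to `length` further words, starting from a one-word prompt."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    output = [prompt_word]
    for _ in range(length):
        followers = counts.get(output[-1])
        if not followers:  # the last word was never followed by anything
            break
        words = list(followers)
        weights = [followers[w] for w in words]
        # Frequent continuations are proportionally more likely to be chosen.
        output.append(rng.choices(words, weights=weights, k=1)[0])
    return " ".join(output)


# Toy training data (an assumption made for illustration only).
corpus = "the cat sat on the mat and the dog sat on the rug"
model = train_bigram_counts(corpus)
print(generate(model, "the", length=6))
```

The generated sentence may recombine words in an order that never appeared in the corpus, which is a small-scale analogue of output that resembles existing patterns while also showing novel elements.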

The Norwegian Consumer Council identified the following harms and challenges posed by GenAI:

- Structural challenges of GenAI, including technological solutionism, the concentration of power in big tech, and the opacity of systems and lack of accountability;

- Manipulation, including mistakes and inaccurate output, deepfakes and disinformation, and the use of chatbots to collect personal data;

- Bias, discrimination, and content moderation, including cultural context as a barrier for content moderation, and open source models and the limits of content moderation;

- Privacy and data protection;

- Security vulnerabilities and fraud;

- Replacing humans in consumer-facing applications with generative AI, wholly or in part, including challenges related to combining human and automatic decision-making, such as over-reliance and under-reliance on the outputs of automated systems, and the risks that human interlocutors perceive in overturning decisions;

- Environmental impact, including climate impact, water footprint, and greenwashing;

- Impact on labour, including labour exploitation and ghost work, and labour automation and threats to jobs; and

- Intellectual property, including the training of GenAIs without the consent of copyright owners and the question of who owns the copyright to the output of GenAIs [1].

Another crucial technology that requires careful examination is the foundation model. Currently, many GenAIs available in the market are also considered foundation models.

Foundation Models

The term foundation model was coined by Bommasani et al. [2] to denote “any model that is trained on broad data (generally using self-supervision at scale) that can be adapted (e.g., fine-tuned) to a wide range of downstream tasks”. Transfer learning and scale are especially significant for foundation models. Transfer learning, the process of applying knowledge obtained from one task to a different task, is what makes foundation models possible. Scale, which encompasses advances in computer hardware, the development of the transformer model architecture, and larger amounts of available training data, is what makes them powerful [2].

Transfer learning consists of two phases: a pre-training phase, in which knowledge is acquired from one or more source tasks, and a fine-tuning phase, in which the acquired knowledge is transferred to target tasks [3]. Downstream tasks are thus performed by fine-tuning pre-trained models (PTMs) rather than by training models from scratch. With PTMs, transfer learning can be applied to tasks such as image classification, object detection, and natural language processing (NLP). A foundation model can serve as the basis for AI systems with a specific intended purpose or for general purpose AI systems [4].
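
The two phases can be sketched with a toy model. The example below is only an illustration under strong assumptions: the "model" is a one-variable linear function with a weight and a bias, the source and target tasks are made-up lines that share the same slope, and the helper names `mse` and `gradient_descent` are hypothetical. It pre-trains on a larger source dataset, then compares a few fine-tuning steps on a small target dataset against the same number of steps from scratch.

```python
def mse(w, b, data):
    """Mean squared error of the linear model y = w * x + b on `data`."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def gradient_descent(w, b, data, steps, lr=0.1):
    """Run `steps` full-batch gradient descent updates starting from (w, b)."""
    n = len(data)
    for _ in range(steps):
        grad_w = 2 * sum((w * x + b - y) * x for x, y in data) / n
        grad_b = 2 * sum((w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


# Source task (assumed): plenty of data from y = 2x + 1.
source_data = [(x, 2 * x + 1) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
# Target task (assumed): only three examples from the related line y = 2x + 3.
target_data = [(x, 2 * x + 3) for x in (-1.0, 0.0, 1.0)]

# Phase 1: pre-training on the source task, starting from zero.
w_pre, b_pre = gradient_descent(0.0, 0.0, source_data, steps=500)

# Phase 2: fine-tuning on the target task for a handful of steps.
w_ft, b_ft = gradient_descent(w_pre, b_pre, target_data, steps=5)

# Baseline: the same five steps on the target task without pre-training.
w_sc, b_sc = gradient_descent(0.0, 0.0, target_data, steps=5)

fine_tuned_loss = mse(w_ft, b_ft, target_data)
scratch_loss = mse(w_sc, b_sc, target_data)
print("fine-tuned loss:", fine_tuned_loss)
print("from-scratch loss:", scratch_loss)
```

Because the pre-trained weight already matches the slope shared by both tasks, fine-tuning only has to correct the bias, so it reaches a lower target loss in the same number of steps. Real PTMs have vastly more parameters and more complex tasks, but the division of labour between the two phases is the same.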

While there is a general understanding of the ethical, legal and social implications associated with foundation models, identifying their risks and benefits can be challenging because their impacts can vary depending on the specific context in which they are used or deployed. Foundation models can pose different risks, challenges, and potential harms in different applications, industries, or societal contexts. Each context in which a foundation model is used should be carefully considered to identify the possible risks and benefits of its use.

Large Language Models

Large language models (LLMs), which have recently become very popular, fall under the category of foundation models and are considered a type of GenAI. LLMs are deep neural networks designed for NLP tasks. These models have a large number of parameters and are trained on vast amounts of unlabelled text data [5] using self-supervised learning, which makes it possible to pre-train the model on large-scale data that is not labelled or annotated [3]. After pre-training, the model is adapted to specific tasks through fine-tuning on smaller amounts of labelled data. LLMs can be used for tasks such as content creation, text summarization, and code generation. Examples of LLMs include BERT, PaLM, and GPT-4.
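
A short sketch may help show why self-supervised learning needs no human annotation: the training targets are taken from the text itself. The example below assumes a next-token prediction objective, which is one common self-supervised objective (others, such as masked-token prediction, exist), and uses whitespace splitting as a stand-in for a real tokenizer. The function name `next_token_examples` and the `context_size` parameter are hypothetical.

```python
def next_token_examples(text, context_size=3):
    """Turn raw text into (context, target) pairs without any manual labels."""
    tokens = text.split()  # simplification: real LLMs use subword tokenizers
    examples = []
    for i in range(1, len(tokens)):
        # The preceding tokens are the input; the token at position i is the label.
        context = tokens[max(0, i - context_size):i]
        examples.append((context, tokens[i]))
    return examples


raw_text = "large language models are trained on unlabelled text"
for context, target in next_token_examples(raw_text):
    print(context, "->", target)
```

Every sentence in a corpus yields training pairs in this way, which is why very large amounts of unlabelled text can be used for pre-training.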

As a category of foundation models, LLMs can pose different and possibly unexpected harms and challenges depending on the context in which they are deployed. It is important to consider the specific use case and context when assessing the ethical implications and potential consequences of LLMs. The harms and challenges of GenAI listed above also apply to LLMs, since they are a type of generative AI.

Weidinger et al. classify six areas of risk for LLMs [6], all of which are broad, include different sub-risks, and require further research and deep analysis. These risk areas are:

1) Discrimination, Hate speech and Exclusion, which involves “social stereotypes and unfair discrimination”, “hate speech and offensive language”, “exclusionary norms” and “lower performance for some languages and social groups”;

2) Information Hazards, which involves “compromising privacy by leaking sensitive information” and “compromising privacy or security by correctly inferring sensitive information”;

3) Misinformation Harms, which involves “disseminating false or misleading information” and “causing material harm by disseminating false or poor information”;

4) Malicious Uses, which involves “making disinformation cheaper and more effective”, “assisting code generation for cybersecurity threats”, “facilitating fraud, scams and targeted manipulation” and “illegitimate surveillance and censorship”;

5) Human-Computer Interaction Harms, which involves “promoting harmful stereotypes by implying gender or ethnic identity”, “anthropomorphising systems which can lead to over reliance or unsafe use”, “avenues for exploiting user trust and accessing more private information” and “human-like interaction which may amplify opportunities for user nudging, deception or manipulation”, and

6) Environmental and Socioeconomic Harms, which involves “environmental harms from operating LMs”, “increasing inequality and negative effects on job quality”, “undermining creative economies” and “disparate access to benefits due to hardware, software, skills constraints” [6].

References

[1] Norwegian Consumer Council. (2023). Ghost in the machine: Addressing the consumer harms of generative AI. https://www.adiconsum.it/wp-content/uploads/2023/06/Report_AI_2023.pdf

[2] Bommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M. S., Bohg, J., Bosselut, A., Brunskill, E., Brynjolfsson, E., Buch, S., Card, D., Castellon, R., Chatterji, N., Chen, A., Creel, K., Davis, J. Q., Demszky, D., … Liang, P. (2022, July 12). On the opportunities and risks of Foundation models. arXiv.org. https://arxiv.org/abs/2108.07258

[3] Han, X., Zhang, Z., Ding, N., Gu, Y., Liu, X., Huo, Y., Qiu, J., Yao, Y., Zhang, A., Zhang, L., Han, W., Huang, M., Jin, Q., Lan, Y., Liu, Y., Liu, Z., Lu, Z., Qiu, X., Song, R., … Zhu, J. (2021, August 11). Pre-trained models: Past, present and future. arXiv.org. https://arxiv.org/abs/2106.07139

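[4] European Parliament. (2023, June 14). Amendments adopted by the European Parliament on the proposal for the Artificial Intelligence Act (P9_TA(2023)0236). https://www.europarl.europa.eu/doceo/document/TA-9-2023-0236_EN.pdf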

[5] Zhao, W. X., Zhou, K., Li, J., Tang, T., Wang, X., Hou, Y., Min, Y., Zhang, B., Zhang, J., Dong, Z., Du, Y., Yang, C., Chen, Y., Chen, Z., Jiang, J., Ren, R., Li, Y., Tang, X., Liu, Z., … Wen, J.-R. (2023, May 7). A survey of large language models. arXiv.org. https://arxiv.org/abs/2303.18223

[6] Weidinger, L., Uesato, J., Rauh, M., Griffin, C., Huang, P.-S., Mellor, J., Glaese, A., Cheng, M., Balle, B., … Gabriel, I. (2022, June 1). Taxonomy of risks posed by language models. Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency. https://dl.acm.org/doi/10.1145/3531146.3533088

Writer: Gamze Büşra Kaya