Generative Artificial Intelligence (Generative AI, or GenAI) opens up a world of possibilities when it comes to creating new and original content. This branch of AI is capable of generating text, software code, images, videos, sound, and product designs and structures.
All of this is based on algorithms and machine learning models, such as Generative Adversarial Networks (GANs) or Transformers, which learn by training on large sets of data.
Once trained, these models can generate new content from an initial input or simply produce random samples. This ability to generate previously unseen content makes Generative AI a powerful technology with a multitude of practical applications. However, it also entails certain challenges, as the generated content may have significant social and cultural implications.
For all these reasons, in this article in our technology section we will take a journey through the history of Generative AI, discover different types and examples of applications that use it, and review the possible risks its use may entail.
How we got to Generative AI
The concept of Artificial Intelligence (AI) was first coined in the document titled A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence (1955), written by John McCarthy with the participation of a renowned group of researchers including Marvin Lee Minsky, Nathaniel Rochester, and Claude Shannon.
This text included a formal proposal to carry out a study on artificial intelligence during the summer of 1956 at Dartmouth College in Hanover (New Hampshire), under the premise that "each aspect of learning, or any other characteristic of intelligence, can be described so precisely that it can be simulated by a machine".
In addition, the aim was to answer the question of "how to make machines use language, form abstractions and concepts, solve problems reserved for humans, and improve themselves".
This declaration of intent by the authors of the initiative would crystallize, shortly afterwards, in the definition McCarthy gave of AI: "the science and engineering of making intelligent machines, especially intelligent computer programs".
Years later, in 1966, Joseph Weizenbaum of the MIT AI Laboratory created the first chatbot, or conversational bot, which he named ELIZA. This computer program could simulate a psychotherapist holding conversations with a patient.
In the 1980s and 1990s, some of the most important milestones in the development of generative AI took place, with the emergence of probabilistic generative models such as Bayesian Networks and Hidden Markov Models (HMMs). These allowed AI systems to make more complex decisions and generate more diverse results. However, generating high-quality content remained a challenge well into the 2000s.
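To give a rough idea of how a probabilistic generative model produces new sequences, the following minimal Python sketch uses a toy first-order Markov chain, far simpler than a Bayesian Network or an HMM and purely illustrative: it learns word-to-word transitions from a tiny text and then samples new word sequences from them.

```python
import random
from collections import defaultdict

# Toy first-order Markov chain over words: a minimal, illustrative
# probabilistic generative model (much simpler than an HMM).
corpus = "the cat sat on the mat the dog sat on the rug".split()

# Count the word-to-word transitions observed in the training text.
transitions = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    transitions[current_word].append(next_word)

def generate(start: str, length: int = 8) -> str:
    """Sample a new word sequence by following the learned transitions."""
    word, output = start, [start]
    for _ in range(length - 1):
        candidates = transitions.get(word)
        if not candidates:
            break
        word = random.choice(candidates)  # next word, proportional to observed counts
        output.append(word)
    return " ".join(output)

print(generate("the"))  # e.g. "the cat sat on the rug" (varies per run)
```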
Specifically, it was in the 2010s that generative AI experienced a significant advance with deep learning models, such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs).
GANs, proposed by Ian Goodfellow and his research team in 2014, consist of two neural networks that interact with each other and learn from data to generate completely new content. In the process, a generator network and a discriminator network work together: the generator creates synthetic samples, and the discriminator tries to distinguish the generated samples from the real ones. As these networks compete with each other, the generator improves its ability to create more realistic and convincing content.
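As a rough illustration of this adversarial setup, the sketch below (assuming PyTorch, with a toy two-dimensional distribution standing in for "real" data) alternates between training the discriminator to separate real from generated samples and training the generator to fool it; production GANs use much larger networks and real image datasets.

```python
import torch
import torch.nn as nn

# Minimal GAN sketch: the generator maps random noise to synthetic samples,
# while the discriminator tries to tell them apart from "real" ones.
latent_dim, data_dim, batch = 16, 2, 64

generator = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
discriminator = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(batch, data_dim) * 0.5 + 2.0    # stand-in "real" data
    fake = generator(torch.randn(batch, latent_dim))   # synthetic samples

    # 1) Train the discriminator: real samples should score 1, generated ones 0.
    d_loss = bce(discriminator(real), torch.ones(batch, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(batch, 1))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # 2) Train the generator: it improves by making the discriminator predict 1.
    g_loss = bce(discriminator(fake), torch.ones(batch, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```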
VAEs, on the other hand, are a type of neural network used in an unsupervised learning context. The process begins with a neural network called the "encoder", which takes input data, such as images or text, and transforms it into a numerical representation. A certain level of uncertainty is then introduced into that representation, so that another neural network, called the "decoder", can sample varied versions of it and, from these, generate new and diverse data, allowing the creation of, for example, new images or texts.
Finally, after a training phase with examples of real data, the VAE learns to generate unique and different versions of images or texts related to the original data.
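The following minimal sketch (again assuming PyTorch, with random vectors standing in for flattened images) shows this encoder → noisy latent code → decoder flow and a loss that combines reconstruction error with a term that keeps the latent distribution well behaved; it is a simplified illustration rather than a production VAE.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVAE(nn.Module):
    """Minimal VAE: encoder -> latent distribution -> sample -> decoder."""
    def __init__(self, in_dim=784, latent_dim=8):
        super().__init__()
        self.enc = nn.Linear(in_dim, 64)
        self.mu = nn.Linear(64, latent_dim)       # mean of the latent distribution
        self.logvar = nn.Linear(64, latent_dim)   # log-variance (the "uncertainty")
        self.dec = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, in_dim))

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # sample a noisy latent code
        return self.dec(z), mu, logvar

def vae_loss(x, recon, mu, logvar):
    # Reconstruction error plus a KL term that regularizes the latent space.
    recon_err = F.mse_loss(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_err + kl

vae = TinyVAE()
x = torch.rand(32, 784)                    # stand-in batch of flattened images
recon, mu, logvar = vae(x)
vae_loss(x, recon, mu, logvar).backward()  # gradients for one training step

# After training, brand-new samples come from decoding random latent vectors:
new_samples = vae.dec(torch.randn(4, 8))
```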
In 2017, several researchers from Google Brain, Google Research, and the University of Toronto, including Ashish Vaswani, Noam Shazeer, and Niki Parmar, published Attention Is All You Need, a scientific article that marked a milestone in the history of AI by introducing the neural network architecture known as the Transformer. It revolutionized the field of Natural Language Processing (NLP) and is the basis of the Large Language Models (LLMs) that exist today.
For clarification, it should be noted here that AI language models are the result of combining Natural Language Processing (NLP) and Natural Language Generation (NLG). Specifically, NLP provides computers with tools that allow them to understand and process natural language (the language human beings use to communicate, whether written or spoken), while NLG does the same for the generation of text, or even speech, in natural language.
Thus, LLMs are used to understand and generate human language automatically, performing tasks such as translation, text generation, summarization, text correction, and question answering. To do so, they use machine learning algorithms and large datasets (corpora) to learn linguistic patterns and rules.
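As an illustration of these tasks, the snippet below uses the Hugging Face transformers library, one common way (shown here purely as an example, not the only option) to run pretrained language models for summarization and translation; the default models are downloaded the first time the pipelines are created.

```python
from transformers import pipeline

# Summarization with a pretrained model (a default model is downloaded on first use).
summarizer = pipeline("summarization")
text = (
    "Generative AI models learn linguistic patterns from large corpora and can "
    "then translate, summarize, correct and answer questions about text."
)
print(summarizer(text, max_length=30, min_length=10)[0]["summary_text"])

# Machine translation from English to French with another pretrained pipeline.
translator = pipeline("translation_en_to_fr")
print(translator("Generative AI opens up a world of possibilities.")[0]["translation_text"])
```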
The Transformer architecture proved to be highly effective in a variety of NLP tasks, including machine translation, text generation, question answering, entity recognition, and text disambiguation, among others. Furthermore, unlike traditional architectures based on recurrent layers, this deep learning model relies on an attention mechanism that allows it to weight different parts of an input, such as a text, according to their importance.
Additionally, it learns context, and therefore meaning, by tracking relationships in sequential data, such as the words that make up a sentence. BERT (Bidirectional Encoder Representations from Transformers), created by Google in 2018, was the first LLM based on Transformer networks. Later came GPT, with its successive versions, and Bard.
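The toy sketch below shows the core idea of that attention mechanism, scaled dot-product self-attention: each token weights every other token by relevance and becomes a weighted mix of them. It is a strong simplification (a single head, with no learned query/key/value projections) of what a real Transformer layer does.

```python
import torch
import torch.nn.functional as F

def self_attention(x: torch.Tensor) -> torch.Tensor:
    """x: (sequence_length, model_dim). Toy single-head attention without projections."""
    d = x.size(-1)
    scores = x @ x.transpose(0, 1) / d ** 0.5   # pairwise relevance between tokens
    weights = F.softmax(scores, dim=-1)         # attention weights sum to 1 per token
    return weights @ x                          # each token becomes a weighted mix of all tokens

tokens = torch.randn(5, 8)    # stand-in embeddings for a five-word sentence
context = self_attention(tokens)
print(context.shape)          # torch.Size([5, 8])
```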
On the other hand, the current technology market offers a wide and varied menu of generative AI applications, which we will look at below, grouped into different categories according to the type of content they generate.
Types of Generative AI applications
Text: This category includes applications intended for the generation of creative texts, automatic summaries of long texts, content correction, and chatbots that can hold conversations with users.
- ChatGPT: Developed by OpenAI, this chatbot is capable of generating content much like a human being and answering questions. Additionally, Shikdar offers in its marketplace an integration connector with Azure OpenAI Service that provides access to OpenAI language models while guaranteeing the privacy of the data used, backed by Microsoft Azure. These models can be adapted to perform specific tasks and can also be combined with the functionalities offered by this software platform.
- Copy.ai: An AI-based tool for creating content for e-commerce, blogs, advertisements, social networks, and websites.
- Grammarly: Offers real-time suggestions while you write to improve grammar, punctuation, and style.
Images: Applications for the automatic generation and editing of realistic images, and even the generation of original works of art in any style.
- DALL·E: OpenAI is behind this AI system for creating realistic images and art from natural language descriptions.
- Stable Diffusion: Designed to generate digital images from natural text.
- NVIDIA Canvas: Uses AI to produce photorealistic images from simple sketches.
Code: These generative AI tools are used to streamline the software development process, through the automatic generation of new code.
- GitHub Copilot: Helps programmers write code more quickly. It works with OpenAI Codex, a pretrained generative language model created by the company behind ChatGPT.
- Tabnine: An add-on for multiple integrated development environments (IDEs) that uses AI techniques to generate useful, real-time suggestions while writing programming code.
- DeepCode: Powered by the Snyk platform, it helps developers improve the quality and security of source code by providing suggestions and detecting errors.
Audio: For the creation of musical compositions and improvement of audio quality in recordings.
- Amper Music: One of the easiest-to-use AI music generators on the market, since it does not require knowledge of musical composition.
- Auphonic: Audio post-production web tool to achieve professional quality results.
- Murf.ai: One of the most popular AI voice generators. Any user can convert text to speech (TTS), create voice-overs, and dictate, making it very useful for product developers, customer service teams, podcasters, and marketers, among other professional profiles.
Video: These types of generative AI tools are intended for the generation of synthetic videos (created using algorithms and computer rendering techniques) and deepfakes.
- Synthesia: AI video creation platform that promises professional results in just 15 minutes, without the need for special editing equipment or skills.
- Runway Gen-1: Uses words and images to generate new videos from existing ones. Additionally, Runway Gen-1 was used in some scenes of the seven-time Oscar-winning film Everything Everywhere All at Once.
- Reface: Application to exchange faces in video clips and GIFs for the creation of deepfakes.
Design: Generative AI applications focused on product design, both for the idea generation phase and for design optimization and customization.
- Generative Design: Autodesk is the company responsible for this tool that uses AI algorithms to design and manufacture products based on established requirements.
- Ansys Discovery: Simulation-driven design tool, which reveals critical information in the early phases of the design process (prototyping).
- nTop: Formerly known as nTopology, it is a CAD tool that has revolutionized the way engineers design parts for additive manufacturing, related to the aerospace, medical, construction, automotive, and consumer sectors.
Risks of Generative AI
The final section of this post is dedicated to identifying the main risks of generative AI, an issue that currently generates controversy among various actors and social groups, such as the scientific community, regulatory bodies and legislators, companies, advocacy groups and activists, as well as users and the general public.
This is stated in the quarterly scientific journal Entrepreneurial Business and Economics Review (EBER), published by the Krakow University of Economics, in an article titled "The dark side of generative artificial intelligence: A critical analysis of controversies and risks of ChatGPT" (2023). The text, signed by a large team of researchers from different academic institutions, identifies and provides a comprehensive understanding of the challenges and opportunities associated with the use of generative models, among which are:
- Lack of regulation of the AI market and urgent need to establish a regulatory framework.
- Poor information quality, lack of quality control, misinformation, deepfake content, and algorithmic bias.
- Job losses driven by automation.
- Violation of personal data, social surveillance, and violation of privacy.
- Social manipulation, weakening of ethics and goodwill.
- Increase in socioeconomic inequalities.
- Technological stress (technostress) associated with the use of these tools.
Faced with these threats, some countries have decided to take action. By the end of 2023, the EU is expected to reach an agreement, within the European Council and together with the member states, on the form and structure that the Artificial Intelligence Act will take. The first comprehensive law on AI in the world will thus aim to guarantee favorable conditions for the development and application of this technology in different areas.
A challenge is ensuring that the AI systems used in the EU are safe, transparent, traceable, non-discriminatory, and environmentally friendly.