Generative AI
Generative AI (GenAI) involves the creation of new and unique digital content. Before GenAI, digital content of this level could only be created by human beings. These unique creations are driven by natural language prompts.
GenAI is based on Large Language Models (LLMs). LLMs are trained on terabytes of textual data from the Internet. These models have the capability of delivering complex, high-level responses to human language prompts.
In the case of ChatGPT, one breakthrough was the discovery that, with some additional training, LLMs can be used to produce impressive text-based results.
GenAI is the connection of LLMs with technologies that, in addition to text results, can generate digital content such as images, video, music, or code.
There is exploration of going beyond digital content and using this technology for things as varied as the discovery of new molecules and 3D designs.
Things are changing quickly, and the impact is only just starting to be predicted and felt. Most people expect changes in how they work and live.
Foundation Models
In the past, AI applications were very task specific. The recent technology shift has been to use foundation models to drive a number of different tasks.
The models have a generative ability and were the start of Generative AI. ChatGPT demonstrated the ability to generate text based on natural language prompts.
Using the concept of foundation models, a vast amount of pre-training data can be leveraged with a small amount of additional tuning and prompting. Tuning adds labels into the data. Prompting bridges the gap between training and intention. While training the LLM is expensive, its usage, referred to as inference, is more cost effective.
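As a rough illustration of how prompting can bridge the gap between what a model was trained on and what a user intends, the sketch below assembles a few-shot prompt as plain text. Few-shot prompting is only one way to prompt a model, and it is used here as an example rather than as the method any particular product uses. The function name `build_prompt`, the `Input:`/`Output:` layout, and the sample sentences are all hypothetical, and no real model or API is called.

```python
from typing import List, Tuple


def build_prompt(instruction: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Assemble a few-shot prompt (hypothetical layout, for illustration only)."""
    lines = [instruction, ""]
    # Each worked example shows the model the intended input/output pattern.
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    # The final entry is left open for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Made-up examples for a tone-classification task.
examples = [
    ("I loved this film.", "positive"),
    ("The plot was dull.", "negative"),
]
prompt = build_prompt("Classify the tone of each input.", examples, "What a great soundtrack!")
print(prompt)
```

The resulting string would be sent to a pre-trained model at inference time. No retraining is involved, which is why this step is comparatively cheap.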
The Question of Intelligence?
Artificial Intelligence (AI) is the matching or exceeding of the intelligence of a human being. Things that are often associated with intelligence, such as doing math, playing chess, or remembering vast amounts of data, can already be done more quickly and better by computers.
With LLMs and a technology called diffusion, AI systems can now match or exceed other skills we often associate with intelligence, such as:
- Accurately responding to language inputs and creating new text content.
- Writing original stories and poems.
- Analyzing content for tone and emotions.
- Creating unique images and video.
Is It Art?
In 2022, Jason Allen’s AI-generated work, Théâtre D’opéra Spatial, won first place in the digital category at the Colorado State Fair. The artist did indicate that he used the GenAI platform Midjourney. It was later discovered that the judges weren’t familiar with the tool at the time of judging.
The question “Does GenAI create art?” is now being asked. When photography was introduced, only painting had been considered art. Photography wasn’t considered art and the same discussion took place. Now, few people would argue that photography cannot be art.
Another argument is that since there is no advanced skill required from the operator, it is not art. Some consider art as the application of skills developed over many years. However, some modern art has been about concept and technique - not necessarily skill.
An additional argument against it being art is that the human element is not present in GenAI-created works. The counterargument is that the creation is still driven by a person who has emotion, creativity, and intent.
Applications
Generative AI is a combination of several technologies and has many applications. Effective LLMs are used to process a natural language prompt and technologies such as diffusion models are used in the case of image generation.
Content types GenAI is being used for include code, images, music, and text.
Other Areas
While GenAI is generally discussed in the context of text, images, music, and coding, there are many other active areas. Some of these include:
- High fidelity text to speech
- Scientific Discovery (generating hypotheses and accelerating discovery)
- Chemistry (molecule and drug discovery)
- Medical imaging
- 3D Models (text to space creation)
- Video (movies, animations, transition between real and generated)
- Education (personalizing, customizing lessons, tutoring)
Impact and Possibilities
Legality
The legal aspects of Generative AI are not yet defined. Class action lawsuits are emerging that claim the use of training data infringes on the rights of artists. Related US laws include the First Amendment and the fair use provision of the Copyright Act of 1976. Fair use allows for the limited use of copyrighted material without permission of the copyright holder.
Some uses are allowed, such as commentary, search engines, news reporting, research, and other transformative work. Transformative is defined as bringing a new expression, meaning, or message to the work. It can be argued that GenAI content can be transformative.
Policy
Different companies are quickly setting up policies around Generative AI. Google announced that they are allowing AI-generated advertising content. Shutterstock has announced that they will allow the selling of content generated with OpenAI and are working on a royalty system for creators whose original work is used to produce generated content.
Accuracy
The correctness of the input data used during model training controls the accuracy of the generated content. There is no way to guarantee quality, accuracy, lack of bias, or timeliness. With the increased consumption of generated content, cases of misinformation (unintended and intended) are being considered.
Speed of change
What is different about Generative AI is the speed of change. This is creating fear of the unknown, as it is difficult to predict what will happen next. Digital artists and some prominent tech leaders are calling for a halt to Generative AI until the impacts on society are better understood.
Impact on Jobs
Generative AI is expected to have a significant impact on the working world, with hundreds of millions of jobs projected to be directly affected.
In some sense, the change is expected to be welcomed, as jobs that are currently mundane or have been difficult to automate will go away. Examples include frame editing, common artwork generation, and basic copywriting.
One effect of this reduction will be on how new joiners are mentored. New joiners are often given basic tasks. The approach to mentorship will change with the ability of Generative AI to accomplish these same tasks.
Possibilities
While any new disruptive technology creates a lot of uncertainties, many people are also excited about what the future may bring.
Generative AI may open up entirely new ways to create and consume content. For example, media is currently difficult and time-consuming to create, even in a single length and format. Generative AI may allow different versions of movies, stories, and short videos to be generated from the same source media, enabling new ways of media consumption.
High-fidelity generation of speech from text will allow creators to make content where they were previously limited by voice overlay.
With regard to code generation, automating repetitive work may result in higher quality code. It also gives coders more time to focus on new, creative, user-centric solutions.
Movie and video creation, once expensive and possible only for large studios and digital production groups, can instead be done by individuals and is bound only by their imagination.
Generative AI may help people understand more about how creative processes work in general and how to continually enhance the human process.
Generative AI
- Code: Generative coding is the use of Generative AI (GenAI) to assist in software development. It was one of the first applications of Generative AI technology to be commercialized.
- Images: GenAI can generate new images from existing text prompts and images. Because of a random seed, the images generated are unique creations. GenAI uses diffusion models to create these new unique images.
- Music: GenAI allows for the creation of long-playing high-fidelity music from text descriptions and additional sound input conditions.
- Text: Generative AI can create blogs, write ad copy, create new content based on text input, and is capable of summarizing and changing the style and tone of text content.
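The note on images above says that a random seed is what makes each generated image unique. The sketch below is a toy illustration of that one idea only. It is not a real diffusion model: the function `toy_generate`, the eight-value "image", the way the prompt is turned into a target level, and the step size are all assumptions made up for this example. It starts from seeded noise and nudges it toward a prompt-dependent target over a few steps, loosely echoing how diffusion-style generators refine noise.

```python
import random
from typing import List


def toy_generate(prompt: str, seed: int, size: int = 8, steps: int = 10) -> List[float]:
    """Toy stand-in for an image generator. NOT a real diffusion model."""
    rng = random.Random(seed)
    # Start from pure noise drawn from the seeded generator.
    pixels = [rng.gauss(0.0, 1.0) for _ in range(size)]
    # Hypothetical "conditioning": derive one target level from the prompt text.
    target = (sum(prompt.encode("utf-8")) % 256) / 255.0
    for _ in range(steps):
        # Each step moves every value 20% of the way toward the target,
        # so part of the initial noise survives in the final result.
        pixels = [p + 0.2 * (target - p) for p in pixels]
    return [round(p, 4) for p in pixels]


a = toy_generate("a red fox", seed=1)
b = toy_generate("a red fox", seed=1)
c = toy_generate("a red fox", seed=2)
print(a == b)  # same prompt, same seed
print(a == c)  # same prompt, different seed
```

In this toy setup, reusing a seed reproduces the same output for the same prompt, while changing the seed changes the starting noise and therefore the result.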