Cait

The Center for Artificial Intelligence Transparency

What can it do?

Image generation

Generative image models are packaged neural network architectures that create images from a text prompt. Image models available today include Stable Diffusion, DALL-E 2, Midjourney, and Google's Imagen.

Text prompt (e.g., "A cat wearing a straw hat") ====> Model (e.g., Stable Diffusion) ====> Synthetic image

The generative image model takes a text prompt as input and maps it into a mathematical space that was created during training. It translates the words of the prompt, and their relations to one another, into the numerical values associated with them in that space. It then uses those values to guide the denoising process, in which an image is gradually recovered from random noise.
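The flow described above (prompt, then numerical values, then guided denoising) can be illustrated with a deliberately tiny toy. This is a sketch under loud assumptions, not how Stable Diffusion or any real model is implemented: the names `embed_word`, `embed_prompt`, and `denoise` are hypothetical, the word vectors come from a hash rather than from training, averaging the word vectors ignores the relations between words that a real text encoder captures, and the "image" is a short list of numbers rather than pixels.

```python
import hashlib
import random

EMBED_DIM = 8  # assumption: size of the toy "mathematical space"


def embed_word(word: str) -> list[float]:
    """Map a word to a fixed pseudo-random vector (stand-in for learned values)."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]


def embed_prompt(prompt: str) -> list[float]:
    """Average the word vectors into one conditioning vector for the prompt."""
    words = prompt.split()
    if not words:
        raise ValueError("prompt must contain at least one word")
    vectors = [embed_word(w) for w in words]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def denoise(condition: list[float], steps: int = 50, seed: int = 0) -> list[float]:
    """Start from random noise and remove a fraction of the 'noise' each step."""
    rng = random.Random(seed)
    image = [rng.gauss(0.0, 1.0) for _ in condition]  # pure noise to begin with
    for _ in range(steps):
        # A real model predicts the noise with a trained neural network.
        # Here (toy assumption) the "predicted noise" is simply the gap
        # between the current image and the conditioning vector.
        predicted_noise = [x - c for x, c in zip(image, condition)]
        image = [x - 0.2 * n for x, n in zip(image, predicted_noise)]
    return image


if __name__ == "__main__":
    condition = embed_prompt("A cat wearing a straw hat")
    result = denoise(condition)
    gap = max(abs(x - c) for x, c in zip(result, condition))
    print(f"largest remaining gap after denoising: {gap:.6f}")
```

The point of the toy is only the shape of the process: the same prompt always yields the same conditioning values, and each denoising step moves the noisy starting point a little closer to what those values describe.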

Stable Diffusion

Stable Diffusion is a large text-to-image model created by a company called Stability AI. Initially released in August 2022, it was the first open-source image model with quality comparable to state-of-the-art image models such as DALL-E or Imagen.

Predictive tasks
Generation of information
Writing generation
Image generation
Video generation
3D model generation
Code generation
Audio generation
Identifying and surveilling
Knowledge management and organization
Learn correlations between information