AI Glossary - Complete AI Terms Dictionary 2026

AI Glossary & Dictionary

The most comprehensive guide to artificial intelligence terminology. Master 80+ essential AI terms used in ChatGPT, Midjourney, Claude, Gemini, Stable Diffusion, Kling, Suno, and all major AI platforms.

A

Artificial Intelligence (AI)

Basic

The field of computer science focused on creating systems that can perform tasks requiring human-like intelligence. This includes learning from experience, understanding language, recognizing patterns, making decisions, and solving problems. Modern AI encompasses machine learning, deep learning, and generative AI technologies.

Real-World Examples

ChatGPT (conversation), Midjourney (images), Tesla Autopilot (driving), Siri/Alexa (voice assistants), Netflix recommendations.

AGI (Artificial General Intelligence)

Advanced

A theoretical form of AI that would match or exceed human-level intelligence across all cognitive tasks. Unlike narrow AI (designed for specific tasks like chess or image recognition), AGI would understand, learn, and apply knowledge across any domain just like humans. AGI remains a future goal and does not currently exist.

API (Application Programming Interface)

Intermediate

A set of protocols and tools that allows different software applications to communicate with each other. AI APIs enable developers to integrate AI capabilities into their own applications without building models from scratch. Major providers include OpenAI, Anthropic, Google, and Stability AI.

Example

Using the OpenAI API to add ChatGPT capabilities to your website, or the Stability API to generate images in your app.

Attention Mechanism

Advanced | Text

A neural network technique that lets a model focus on the most relevant parts of the input when generating each piece of output. Attention predates the Transformer, but the 2017 paper "Attention Is All You Need" made self-attention the core building block, enabling models to relate words regardless of their distance in a sentence. This is the foundation of the Transformer architecture used in GPT, BERT, and most modern AI models.
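
As a rough illustration of the idea, here is a minimal sketch of scaled dot-product attention in plain Python. The function names and the toy vectors are assumptions made for this example; real models use optimized tensor libraries, learned projections, and many attention heads.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Toy input: the query matches the first key more closely than the second,
# so the output leans toward the first value vector.
result = attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[10.0], [20.0]])
```

Each output is a blend of the values, weighted by how well the query matches each key.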

Aspect Ratio

Basic | Image

The proportional relationship between an image's width and height, expressed as width:height. Different aspect ratios are suited for different purposes. Common ratios include 1:1 (square, Instagram), 16:9 (widescreen, YouTube), 9:16 (vertical, TikTok/Stories), 4:3 (standard), and 3:2 (photography).

In Midjourney

Use --ar 16:9 for cinematic, --ar 9:16 for phone wallpapers, --ar 1:1 for profile pictures, --ar 2:3 for posters.
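
To turn a ratio into pixel dimensions, a small helper like the one below can be used. This is a sketch: the function name, the 1024-pixel long side, and the rounding to multiples of 8 are assumptions chosen for illustration, not settings required by any particular tool.

```python
def dimensions_for_ratio(ratio, long_side=1024, multiple=8):
    """Return (width, height) for a 'W:H' ratio, with the longer side fixed."""
    w_part, h_part = (int(p) for p in ratio.split(":"))
    if w_part >= h_part:
        width = long_side
        height = long_side * h_part / w_part
    else:
        height = long_side
        width = long_side * w_part / h_part

    def snap(x):
        # Round to the nearest multiple (assumed here to be 8 pixels)
        return max(multiple, int(round(x / multiple)) * multiple)

    return snap(width), snap(height)

widescreen = dimensions_for_ratio("16:9")   # (1024, 576)
vertical = dimensions_for_ratio("9:16")     # (576, 1024)
```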

Adobe Firefly

Tool | Image

Adobe's AI image generation platform, designed to be commercially safe as it's trained on licensed Adobe Stock images. Integrated into Photoshop, Illustrator, and other Creative Cloud apps. Features include text-to-image, generative fill, text effects, and vector recoloring.

B

Batch Size

Advanced

In AI training and inference, the number of samples processed together in one forward/backward pass. Larger batch sizes can speed up training but require more memory. In image generation, batch size determines how many images are generated simultaneously.
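
A minimal sketch of what "processing samples together" means in practice: the dataset is cut into fixed-size groups, and the last group may be smaller. The helper name is an assumption for this example.

```python
def batches(samples, batch_size):
    """Yield consecutive slices of `samples`, each at most `batch_size` long."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

# Ten samples with a batch size of 4 give batches of 4, 4, and 2.
groups = list(batches(list(range(10)), 4))
```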

BERT (Bidirectional Encoder Representations from Transformers)

Advanced | Text

A landmark language model developed by Google in 2018 that reads text bidirectionally, using the words on both the left and the right of each position to understand context. While GPT models are generative (they create text), BERT excels at understanding tasks like search, classification, and question-answering.

Bias (in AI)

Intermediate

Systematic errors in AI outputs that reflect prejudices present in training data or model design. AI bias can lead to unfair or discriminatory results across race, gender, age, or other characteristics. Addressing bias is a major focus of AI ethics and safety research.

C

CFG Scale (Classifier-Free Guidance)

Intermediate | Image

A parameter controlling how closely AI-generated images follow your text prompt. Higher CFG values (10-15) mean stricter prompt adherence but may reduce natural appearance. Lower values (3-7) allow more creative interpretation. A value of 7-9 is a common starting point for balanced results.

Recommended Settings

Stable Diffusion: 7-8 standard, 10-12 for strict prompt following. Leonardo: 7-9. Too high causes artifacts; too low ignores your prompt.
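
Under the hood, classifier-free guidance is usually written as a blend of two noise predictions: one made without the prompt and one made with it. The sketch below shows that arithmetic on plain lists; the function name and the toy numbers are assumptions, and real pipelines apply this to large tensors at every denoising step.

```python
def apply_cfg(uncond, cond, scale):
    """Combine unconditional and prompt-conditioned predictions.

    guided = uncond + scale * (cond - uncond)
    scale = 1 returns the conditioned prediction unchanged;
    larger values push further in the direction of the prompt.
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

# Toy predictions (made-up numbers, not real model output)
guided = apply_cfg([0.0, 1.0], [1.0, 3.0], 7.0)
```

Because the difference between the two predictions is multiplied by the scale, very high values exaggerate it, which is one way to understand the artifacts mentioned above.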

ChatGPT

Tool | Text

OpenAI's conversational AI assistant and one of the most widely used AI chatbots. Built on GPT (Generative Pre-trained Transformer) models, ChatGPT can engage in dialogue, write content and code, analyze data, and more. GPT-4o, released in 2024, is a multimodal version with vision and voice capabilities. ChatGPT launched in November 2022.

Claude

Tool | Text

Anthropic's AI assistant known for thoughtful, nuanced responses and exceptional coding abilities. Features a massive 200K token context window (can process entire books). Claude 3.5 Sonnet is particularly acclaimed for coding and analysis. Anthropic focuses on AI safety using "Constitutional AI" principles.

Best For

Long document analysis, complex coding, thoughtful writing, research tasks, and applications requiring nuanced understanding.

Checkpoint (Model)

Intermediate | Image

A saved state of an AI model's weights at a specific point in training. In Stable Diffusion, checkpoints are complete models fine-tuned for specific styles (realistic, anime, artistic). Users swap checkpoints to change the visual style of their generations.

Popular Checkpoints

Realistic Vision, DreamShaper, SDXL Base, Juggernaut XL (photorealism), Anything V5 (anime), Deliberate (versatile).

CLIP (Contrastive Language-Image Pre-training)

Advanced | Image

An OpenAI model that learns the relationship between images and text by training on hundreds of millions of image-caption pairs. CLIP is used in image generation to guide the creation process, helping models understand what your text prompt means visually.

Context Window

Basic | Text

The maximum amount of text (measured in tokens) an AI model can process and remember in a single session. A larger context window allows for longer conversations and document analysis without "forgetting" earlier content.

Comparison (2025)

GPT-4 Turbo: 128K tokens | Claude 3: 200K tokens | Gemini 1.5 Pro: 1M tokens | GPT-3.5: 16K tokens

ControlNet

Advanced | Image

A neural network architecture that adds precise control to Stable Diffusion image generation. ControlNet uses reference inputs like pose skeletons, depth maps, edge detection, or scribbles to guide the AI while maintaining the base model's quality.

Common ControlNet Types

OpenPose (body poses), Canny (edges), Depth (3D depth), Scribble (sketches), Tile (upscaling), IP-Adapter (style transfer).

Chain-of-Thought (CoT)

Intermediate | Text

A prompting technique that improves AI reasoning by asking the model to show its work step-by-step before giving a final answer. This often improves performance on math, logic, and complex analysis tasks.

How to Use

Add phrases like "Let's think step by step," "Show your reasoning," or "Break this down" to your prompts.

Copilot (Microsoft/GitHub)

Tool

Microsoft's AI assistant brand, appearing across products. GitHub Copilot is an AI coding assistant, originally powered by OpenAI Codex, that suggests code in your IDE. Microsoft Copilot (formerly Bing Chat) is Microsoft's general AI assistant with GPT-4 and DALL-E integration, available in Windows, Edge, and Microsoft 365.

D

DALL-E

Tool | Image

OpenAI's text-to-image AI model that generates images from natural language descriptions. DALL-E 3 (current version) is integrated with ChatGPT, offers excellent prompt understanding, and excels at rendering accurate text within images. Named after Salvador Dalí and WALL-E.

Key Strengths

Best-in-class text rendering in images, natural language prompts (no special syntax), automatic prompt enhancement, ChatGPT integration.

Deep Learning

Basic

A subset of machine learning using artificial neural networks with multiple layers (hence "deep") to learn complex patterns from large amounts of data. Deep learning powers most modern AI including image recognition, natural language processing, speech recognition, and generative AI.

Denoising

Intermediate | Image

The core process in diffusion models where the AI progressively removes noise from a random pattern to create a coherent image. The denoising strength parameter (in img2img) controls how much the original image is altered during generation.

Denoising Strength Guide

0.2-0.4: Subtle changes, preserves original. 0.5-0.7: Moderate changes. 0.8-1.0: Almost completely new image based on prompt.

Diffusion Model

Intermediate | Image

A type of generative AI that creates images by learning to reverse a gradual noise-adding process. Starting from random noise, the model progressively "denoises" it into a coherent image guided by your text prompt. This architecture powers most modern image generators.

Examples

Stable Diffusion, DALL-E 3, Midjourney, Flux, Google Imagen, Adobe Firefly all use diffusion-based architectures.

DreamBooth

Advanced | Image

A fine-tuning technique developed by Google that trains image models to generate specific subjects (people, pets, products) from just a few reference photos. Used to create personalized AI image models that can place your subject in any scene or style.

E

ElevenLabs

Tool | Audio

Industry-leading AI voice synthesis platform offering ultra-realistic text-to-speech, voice cloning, and audio generation. Supports 29+ languages with natural prosody and emotion. Used for voiceovers, audiobooks, podcasts, dubbing, and gaming characters.

Embedding

Advanced

A numerical representation (vector) that captures the meaning of text, images, or other data in a way AI can process. Similar concepts have similar embeddings. Used in search, recommendations, and connecting text prompts to image generation.
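
"Similar concepts have similar embeddings" is usually measured with cosine similarity. The sketch below uses tiny hand-written vectors purely for illustration; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings" (made-up values)
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.1, 0.9]

cat_vs_kitten = cosine_similarity(cat, kitten)
cat_vs_car = cosine_similarity(cat, car)
```

With these toy values, "cat" scores much closer to "kitten" than to "car", which is the behavior search and recommendation systems rely on.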

Epoch

Advanced

One complete pass through the entire training dataset during AI model training. Training typically involves multiple epochs; too few lead to underfitting, while too many cause overfitting. In LoRA training, 10-30 epochs is common.
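
Epochs and batch size together determine how many training steps a run takes. A quick arithmetic sketch, with a hypothetical helper name and example numbers:

```python
import math

def total_steps(num_samples, batch_size, epochs):
    """Number of optimizer steps when every epoch covers the whole dataset."""
    steps_per_epoch = math.ceil(num_samples / batch_size)  # last batch may be partial
    return steps_per_epoch * epochs

# Example: 100 training images, batch size 4, 10 epochs
steps = total_steps(100, 4, 10)
```

Here each epoch takes 25 steps, so 10 epochs come to 250 steps.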

F

Few-Shot Learning/Prompting

Intermediate | Text

A prompting technique where you provide a few examples of the desired input-output pattern before your actual request. This helps the AI understand exactly what format, style, or transformation you want without requiring model fine-tuning.

Example

"Convert to haiku: Summer breeze blowing = Gentle wind caresses skin / Summer's warm embrace. Convert to haiku: Ocean waves..."
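
A few-shot prompt is just text assembled in a consistent pattern, so it is easy to build programmatically. The helper below is a sketch; its name and the "instruction: input = output" layout are assumptions that mirror the example above rather than a required format.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble worked examples followed by the unanswered query."""
    lines = [f"{instruction}: {source} = {target}" for source, target in examples]
    lines.append(f"{instruction}: {query} =")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "English to French",
    [("hello", "bonjour"), ("thank you", "merci")],
    "goodbye",
)
```

Passing an empty list of examples yields a zero-shot prompt with the same layout.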

Fine-Tuning

Intermediate

The process of further training a pre-trained AI model on a specific dataset to specialize it for particular tasks, domains, or styles. Fine-tuning adapts general-purpose models to specific use cases without training from scratch, saving time and compute resources.

Examples

Fine-tuning GPT on legal documents for a law firm, or training Stable Diffusion on product images for consistent brand visuals.

Flux

Tool | Image

A state-of-the-art image generation model by Black Forest Labs (founded by former Stability AI researchers who created Stable Diffusion). Known for exceptional photorealism, anatomical accuracy, and text rendering. Available in Flux Pro (best quality), Flux Dev (development), and Flux Schnell (fastest).

Why It's Popular

Currently considered one of the best open models for photorealism, proper hands/fingers, and text in images. Can run locally with open weights.

Foundation Model

Intermediate

Large AI models trained on broad datasets that can be adapted to many different tasks. Foundation models (like GPT-4, Claude, Llama, Stable Diffusion) serve as starting points for specialized applications through fine-tuning or prompt engineering.

G

Google Gemini

Tool | Text

Google's flagship multimodal AI model family that can process text, images, audio, video, and code. Comes in Ultra (most powerful), Pro (balanced), Flash (fast), and Nano (on-device) variants. Features industry-leading 1 million token context window (Gemini 1.5) and deep Google product integration.

Unique Features

Massive context window for analyzing entire codebases or books, native multimodal understanding, Google Search/Workspace integration.

Generative AI

Basic

AI systems designed to create new content rather than just analyze existing data. Generative AI can produce text, images, music, video, code, and more. This is the category that includes ChatGPT, Midjourney, DALL-E, Stable Diffusion, Suno, and Kling.

GPT (Generative Pre-trained Transformer)

Basic | Text

OpenAI's family of large language models that power ChatGPT and revolutionized AI. "Generative" means it creates text, "Pre-trained" means it learned from vast internet data, and "Transformer" refers to the neural network architecture. GPT-4o and GPT-4 Turbo were the flagship models as of 2024.

Version History

GPT-1 (2018) → GPT-2 (2019) → GPT-3 (2020) → GPT-3.5 (2022) → GPT-4 (2023) → GPT-4o (2024)

Grok

Tool | Text

xAI's (Elon Musk's AI company) conversational AI assistant. Known for real-time X (Twitter) integration, witty personality, and willingness to answer controversial questions. Grok 2 includes image generation capabilities.

Guardrails

Intermediate

Safety mechanisms built into AI systems to prevent harmful, biased, or inappropriate outputs. Guardrails include content filters, topic restrictions, and behavioral guidelines that keep AI responses safe and aligned with intended use cases.

H

Hallucination

Basic | Text

When an AI confidently generates false, fabricated, or nonsensical information that sounds plausible but is factually incorrect. LLMs may "hallucinate" fake citations, invent facts, or create fictional details. This is a known limitation of current AI technology.

Prevention Tips

Always fact-check important claims, ask for sources, use RAG (Retrieval-Augmented Generation), or use tools like Perplexity that cite sources.

Hires Fix (High Resolution Fix)

Intermediate | Image

A technique in Stable Diffusion that generates images at low resolution first, then upscales and refines them. This prevents common issues with generating large images directly (like duplicate subjects) and produces cleaner, more detailed results.

Hyperparameter

Advanced

Configuration settings that control how an AI model learns or generates outputs. Unlike parameters (learned during training), hyperparameters are set by users/developers. Examples include learning rate, batch size, temperature, and CFG scale.

I

Ideogram

Tool | Image

An AI image generator specializing in accurate text rendering within images. Ideogram excels at creating images with readable, properly spelled text - perfect for logos, posters, signs, and designs requiring typography. Also known for creative, stylized outputs.

Img2Img (Image-to-Image)

Basic | Image

A generation mode where you provide an existing image as a starting point, and the AI transforms it based on your text prompt. Used for style transfer, creating variations, editing images, or using sketches as guides for detailed artwork.

Use Cases

Turn rough sketches into polished art, change photo styles, edit specific elements, create variations of a concept, colorize images.

Inpainting

Basic | Image

An AI technique that fills in or replaces selected areas of an image while keeping the rest unchanged. You "paint" a mask over the area you want to modify, and the AI regenerates only that portion, matching the surrounding context.

Common Uses

Remove unwanted objects, fix faces/hands, change clothing, add/remove elements, repair damaged photos, extend backgrounds.

Inference

Intermediate

The process of using a trained AI model to generate outputs or make predictions. When you send a prompt to ChatGPT or generate an image with Midjourney, that's inference - the model applies what it learned during training to produce results.

Instruction Tuning

Advanced | Text

A fine-tuning approach where models are trained on examples of instructions paired with ideal responses. This makes AI better at following user directions. Models like ChatGPT, Claude, and most commercial assistants use instruction tuning.

K

Kling AI

Tool | Video

A revolutionary Chinese AI video generator by Kuaishou capable of creating cinema-quality videos up to 2 minutes long. Known for exceptional motion quality, physics understanding, and character consistency. One of the most powerful text-to-video and image-to-video tools available.

Key Features

Up to 2-minute videos, 1080p output, realistic motion and physics, image-to-video, motion brush for control.

Knowledge Cutoff

Basic | Text

The date after which an AI model has no training data, meaning it lacks knowledge of events, information, or developments that occurred after that point. For example, GPT-4's knowledge cutoff means it doesn't know about events after its training data ended.

L

Latent Space

Advanced | Image

A compressed mathematical representation where AI models perform their core operations. In Stable Diffusion (a "latent" diffusion model), the denoising process happens in this compressed space rather than on full-resolution images, making generation much faster and requiring less memory.

Leonardo AI

Tool | Image

A versatile AI image platform featuring multiple fine-tuned models, real-time generation, and powerful editing tools. Known for its user-friendly interface, generous free tier (150 daily credits), and excellent results for game assets, characters, and illustrations.

Popular Features

Phoenix model, real-time generation, AI Canvas editing, consistent character tools, motion generation.

LLaMA (Large Language Model Meta AI)

Tool | Text

Meta's family of open-source large language models. LLaMA models can be downloaded and run locally, fine-tuned for specific purposes, and used commercially (with some restrictions). LLaMA 3 is competitive with GPT-4 on many benchmarks.

LLM (Large Language Model)

Basic | Text

AI models trained on massive amounts of text to understand and generate human language. "Large" refers to billions of parameters. LLMs power chatbots, writing assistants, code generators, and many AI applications. They predict the most likely next words based on context.

Major LLMs

GPT-4 (OpenAI), Claude 3 (Anthropic), Gemini (Google), LLaMA 3 (Meta), Mistral, Grok (xAI), Command R (Cohere).

LoRA (Low-Rank Adaptation)

Intermediate | Image

A lightweight fine-tuning method that trains small adapter modules instead of modifying the entire model. In image generation, LoRAs add specific styles, characters, or concepts to base models while keeping file sizes small (typically 10-200MB vs. 2-7GB for full checkpoints).

LoRA Types

Character LoRAs (specific people/characters), Style LoRAs (art styles), Concept LoRAs (poses, objects, aesthetics).
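
The small file sizes follow from the "low-rank" idea: instead of updating a full weight matrix of shape d x k, LoRA trains two thin matrices of shapes d x r and r x k. The sketch below compares the parameter counts; the function name and the example dimensions are illustrative assumptions.

```python
def lora_param_counts(d, k, rank):
    """Return (full_matrix_params, lora_adapter_params) for one weight matrix."""
    full = d * k              # every entry of the original matrix
    lora = rank * (d + k)     # d x r matrix plus r x k matrix
    return full, lora

# Hypothetical 4096 x 4096 layer with a rank-8 adapter
full, lora = lora_param_counts(4096, 4096, 8)
reduction = full // lora
```

For this example the adapter holds 65,536 values against 16,777,216 in the full matrix, a 256-fold reduction for that layer.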

Luma AI / Dream Machine

Tool | Video

Luma AI's video generation platform featuring Dream Machine, which creates smooth video clips from text or images. Known for fast generation times, impressive realism, and accessible free tier. Also offers 3D capture and reconstruction tools.

M

Machine Learning (ML)

Basic

A subset of AI where systems learn patterns from data rather than following explicit programming rules. Machine learning algorithms improve their performance through experience, enabling applications like recommendation systems, spam filters, and predictive analytics.

Midjourney

Tool | Image

A leading AI image generator renowned for its artistic, stylized outputs and exceptional aesthetic quality. Accessed primarily through Discord. Known for cinematic, painterly, and creative interpretations. Midjourney V6 offers photorealism and improved prompt understanding.

Key Parameters

--ar (aspect ratio), --v (version), --style raw (less stylized), --stylize (0-1000), --chaos (variation), --no (negative prompt), --cref (character ref).

MiniMax / Hailuo AI

Tool | Video

A powerful Chinese AI company offering Hailuo AI video generator. Creates high-quality AI videos with excellent character consistency, smooth motion, and fast generation. Popular free alternative to Runway and Kling with generous usage limits.

Multimodal

Intermediate

AI systems that can process, understand, and generate multiple types of content (modalities) such as text, images, audio, and video. GPT-4o, Gemini, and Claude 3 are multimodal - they can "see" images and respond about them, not just process text.

Capabilities

Analyze images, describe photos, read documents, transcribe audio, understand video content, generate across formats.

Mistral AI

Tool | Text

A French AI company known for efficient, powerful open-source language models. Mistral and Mixtral models offer impressive performance relative to their size and can run locally. Known for being cost-effective alternatives to larger models.

N

Negative Prompt

Basic | Image

Text that tells image AI what you DON'T want in your generation. Negative prompts help eliminate common issues like bad hands, blurry images, watermarks, and unwanted elements. Essential for quality results in Stable Diffusion, Leonardo, and similar tools.

Common Negative Terms

ugly, blurry, deformed, bad anatomy, extra fingers, mutated hands, low quality, watermark, text, signature, disfigured.

Neural Network

Basic

A computing system inspired by biological brains, consisting of interconnected nodes (neurons) organized in layers that process information. Neural networks can learn patterns from data and form the foundation of modern AI, including deep learning and generative models.

NLP (Natural Language Processing)

Basic | Text

The branch of AI focused on enabling computers to understand, interpret, and generate human language. NLP powers chatbots, translation, sentiment analysis, text summarization, and voice assistants. LLMs represent the latest advancement in NLP.

O

OpenAI

Tool

The AI research company behind ChatGPT, GPT-4, DALL-E, and Sora. Founded in 2015 (with early involvement from Elon Musk), OpenAI has been instrumental in advancing and popularizing AI. They offer both consumer products (ChatGPT) and developer APIs.

Open Source (AI Models)

Intermediate

AI models whose weights and architecture are publicly available for download, modification, and use. Open-source models can run locally without internet, be fine-tuned for specific purposes, and don't require API fees. Examples include Stable Diffusion, LLaMA, Mistral, and Flux.

Outpainting

BasicImage

An AI technique that extends an image beyond its original boundaries, generating new content that seamlessly matches the existing image. Used to change aspect ratios, reveal more of a scene, or create panoramic images from smaller sources.

P

Parameters

Intermediate

The internal values (weights and biases) that an AI model learns during training. Model size is often measured in parameters. GPT-4 is unofficially estimated at about 1.76 trillion parameters; OpenAI has not published the figure. More parameters generally mean more capability but require more compute to train and run.

Model Sizes

GPT-3: 175B | LLaMA 3 70B: 70B | Claude 3 Opus: undisclosed (~200B is an unofficial estimate) | Mistral 7B: 7B | SDXL: ~3.5B
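
Parameter counts translate fairly directly into memory needs: multiply the number of parameters by the bytes used to store each one. The sketch below covers the weights only and assumes 2 bytes per parameter (16-bit precision) by default; the function name is hypothetical, and real usage is higher once activations and other overhead are included.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9

mistral_7b_fp16 = weight_memory_gb(7e9)       # 16-bit weights
llama_70b_8bit = weight_memory_gb(70e9, 1)    # 8-bit quantized weights
```

By this estimate a 7B model needs about 14 GB for 16-bit weights, which is why smaller or quantized models are the usual choice for running locally.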

Perplexity AI

Tool | Text

An AI-powered search engine and research assistant that provides sourced, cited answers. Unlike ChatGPT, Perplexity searches the web in real-time and shows exactly where information comes from. Excellent for research and fact-checking.

Pika Labs

Tool | Video

A user-friendly AI video generator featuring creative tools like lip sync, sound effects, and video-to-video transformation. Great for social media content with fun features that make AI video creation accessible and entertaining.

Prompt

Basic

The text input you give to an AI to generate content. In image AI, prompts describe what you want to create visually. In text AI, prompts are your questions, instructions, or tasks. Prompt quality strongly influences output quality, and writing effective prompts is a skill called "prompt engineering."

Good vs Bad Prompts

Bad: "cat" | Good: "A fluffy orange tabby cat sleeping on a vintage armchair, soft afternoon sunlight through a window, cozy atmosphere, 35mm photography"

Prompt Engineering

Basic

The practice of designing and optimizing prompts to get the best results from AI systems. Involves understanding how different AI models interpret language, using effective structures, providing examples, and iteratively refining prompts for desired outputs.

R

RAG (Retrieval-Augmented Generation)

Intermediate | Text

A technique that enhances LLM responses by first retrieving relevant information from external sources (documents, databases) before generating answers. RAG reduces hallucinations and enables AI to access current or proprietary information not in its training data.

How It Works

1. User asks question → 2. System searches knowledge base → 3. Relevant docs retrieved → 4. LLM generates answer using retrieved context.
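
Steps 1 to 3 above, plus the prompt that step 4 would send, can be sketched in a few lines. This toy version ranks documents by shared words; that scoring, the function names, and the sample documents are all assumptions for illustration. Production systems typically retrieve with embeddings and a vector database, and the final LLM call is left out here.

```python
import re

def words(text):
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    q = words(question)
    ranked = sorted(documents, key=lambda d: len(q & words(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents, top_k=1):
    """Put the retrieved text in front of the question for the LLM."""
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical knowledge base
docs = ["The refund window is 30 days.", "Support is open on weekdays."]
prompt = build_prompt("How long is the refund window?", docs)
```

The resulting prompt contains the refund document but not the support one, so the model answers from retrieved text rather than from memory.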

RLHF (Reinforcement Learning from Human Feedback)

Advanced | Text

A training technique where AI models learn from human preferences. Humans rank model outputs, and this feedback trains a reward model that guides further training. RLHF is key to making AI assistants helpful, harmless, and honest - used by ChatGPT, Claude, and others.

Runway

Tool | Video

An industry-standard AI video platform offering Gen-3 Alpha for text/image-to-video generation with precise motion control. Used by professional filmmakers and content creators for its cinematic quality, camera movement controls, and comprehensive video editing suite.

S

Sampler

Intermediate | Image

The algorithm that guides the denoising process in diffusion models. Different samplers produce different results at different speeds. The sampler determines how the model steps through the generation process.

Popular Samplers

DPM++ 2M Karras (balanced), Euler a (fast, creative), DPM++ SDE Karras (high quality, slower), UniPC (fast, good quality).

Seed

Basic | Image

A number that initializes the random generation process. Using the same seed with identical settings produces identical results. Essential for reproducibility, creating variations, and consistent character generation.

Usage Tip

Set seed to -1 for random results each time. Lock a specific seed number when you want to reproduce or iterate on an image.
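
The same principle can be seen with Python's standard random number generator: a fixed seed always yields the same starting noise. This is only an analogy using a hypothetical helper; image generators use their own generators and much larger noise tensors.

```python
import random

def sample_noise(seed, n=4):
    """Draw n Gaussian noise values from a generator initialized with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

first = sample_noise(42)
second = sample_noise(42)   # same seed, same values
other = sample_noise(43)    # different seed, different values
```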

Sora

Tool | Video

OpenAI's groundbreaking text-to-video model capable of generating up to 1-minute videos with complex scenes, multiple characters, and realistic physics. Sora represents a major leap in video generation quality and coherence.

Stable Diffusion

Tool | Image

The pioneering open-source image generation model by Stability AI that can run locally on personal computers. Highly customizable with checkpoints, LoRAs, ControlNets, and extensions. SDXL and SD3 are the latest major versions offering improved quality.

Popular Interfaces

Automatic1111, ComfyUI, Forge, InvokeAI (local) | Leonardo, Tensor.art, Civitai (cloud-based).

Steps (Sampling Steps)

Basic | Image

The number of denoising iterations the AI performs when generating an image. More steps generally means more detail but takes longer. There are diminishing returns beyond a certain point.

Recommended Steps

20-30 steps for most samplers. Euler a: 20-25. DPM++ 2M: 25-35. More than 50 rarely improves quality significantly.

Suno AI

Tool | Audio

A revolutionary AI music generator that creates full songs with vocals, instruments, and lyrics from text descriptions. Produces radio-quality tracks across all genres in under 2 minutes. One of the most impressive examples of generative AI in music.

Key Features

Full songs with AI vocals, custom lyrics mode, all genres supported, up to 4-minute tracks, free tier available.

System Prompt

Intermediate | Text

Initial instructions given to an AI that define its behavior, personality, capabilities, and constraints for a session. System prompts set the context before user interaction begins. They're how ChatGPT becomes a "helpful assistant" or custom GPTs get their personalities.

T

Temperature

Basic | Text

A parameter controlling randomness in AI text generation. Low temperature (0-0.3) produces focused, consistent, near-deterministic outputs. High temperature (0.7-1.0) increases creativity and variety but may reduce coherence. Adjust based on task type.

When to Use

Low (0.1-0.3): Facts, coding, analysis, technical writing. Medium (0.4-0.6): Balanced content. High (0.7-1.0): Creative writing, brainstorming.
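
Mechanically, temperature divides the model's raw scores (logits) before they are converted to probabilities. The sketch below shows the effect on three made-up logits; the function name is an assumption, and a temperature of exactly 0 is rejected here because it would divide by zero (tools that accept 0 typically treat it as "always pick the top token").

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
focused = softmax_with_temperature(logits, 0.2)
varied = softmax_with_temperature(logits, 1.0)
```

At the lower temperature nearly all probability lands on the top-scoring token, while the higher temperature leaves the alternatives a real chance of being sampled.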

Token

Basic | Text

The basic unit AI uses to process text. A token is roughly 4 characters or 0.75 words in English. Tokens determine context window limits and API pricing. Both input prompts and output responses consume tokens.

Rule of Thumb

1 token ≈ 4 characters ≈ 0.75 words. "Hello world" = 2 tokens. 1,000 tokens ≈ 750 words. 100K tokens ≈ 75,000 words.
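
The rule of thumb is easy to apply in code when you need a quick estimate, for example to check whether a document is likely to fit in a context window. The helpers below are hypothetical and only approximate; an exact count requires the model's own tokenizer, and results can differ (the character rule gives 3 for "Hello world", not 2).

```python
import math

def estimate_tokens(text):
    """Rough token count from characters (about 4 characters per token)."""
    return math.ceil(len(text) / 4)

def estimate_tokens_from_words(word_count):
    """Rough token count from words (about 0.75 words per token)."""
    return math.ceil(word_count / 0.75)

def fits_context(text, context_window, reserved_for_output=0):
    """Check whether the estimated prompt leaves room for the response."""
    return estimate_tokens(text) + reserved_for_output <= context_window

essay_tokens = estimate_tokens_from_words(750)
ok = fits_context("word " * 1000, context_window=2000, reserved_for_output=500)
```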

Transformer

Advanced

The neural network architecture that revolutionized AI, introduced in the 2017 paper "Attention Is All You Need." Transformers use self-attention mechanisms to process sequences in parallel rather than sequentially. This architecture powers GPT, BERT, and most modern AI models.

Training Data

Basic

The dataset used to train an AI model. For LLMs, this includes books, websites, code, and conversations. For image AI, it includes millions of image-caption pairs. The quality, diversity, and biases in training data directly affect model capabilities and limitations.

Txt2Img (Text-to-Image)

Basic | Image

The core AI image generation mode where you describe an image in text and the AI creates it from scratch. This is the default mode in Midjourney, DALL-E, and Stable Diffusion - simply type what you want to see.

U

Udio

Tool | Audio

A powerful AI music generator competing with Suno, known for exceptional audio quality and realistic vocals. Udio excels at complex musical arrangements, genre accuracy, and professional-sounding productions.

Upscaling

Basic | Image

Increasing an image's resolution using AI to add detail and clarity. AI upscalers (like Real-ESRGAN, Topaz) intelligently generate missing pixels rather than simple interpolation, producing much better results than traditional resizing.

V

VAE (Variational Autoencoder)

Advanced | Image

A neural network component that encodes images into latent space and decodes them back. In Stable Diffusion, different VAEs affect color vibrancy, contrast, and detail. Swapping VAEs can improve skin tones or fix washed-out colors.

Veo (Google)

Tool | Video

Google DeepMind's AI video generation model. Veo 3 is notable for being the first major video AI to generate native synchronized audio - including sound effects, ambient noise, and dialogue - alongside video content.

Key Innovation

Native audio generation sets Veo 3 apart from competitors, enabling complete video content with matched sound in one generation.

Vision (AI Vision)

Intermediate

The capability of AI models to understand and analyze images. Vision-enabled models (GPT-4V, Claude 3, Gemini) can describe images, read text in photos, analyze charts, identify objects, and answer questions about visual content.

Z

Zero-Shot Learning/Prompting

Intermediate | Text

Asking an AI to perform a task without providing any examples - relying entirely on the model's pre-trained knowledge and the instructions in your prompt. Contrast with few-shot prompting where you give examples first.

Example Comparison

Zero-shot: "Translate 'hello' to Japanese" | Few-shot: "English to French: hello = bonjour. English to Japanese: hello = ?"
