A
Artificial General Intelligence (AGI): A hypothetical AI system that possesses general intelligence and can perform intellectual tasks as well as or better than a human. AGI does not currently exist.
Attention: A technique used in transformers and other neural networks that allows models to focus on relevant parts of the input when generating text.
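A minimal numpy sketch of the scaled dot-product attention at the heart of transformers; the query, key, and value vectors here are random stand-ins for illustration only:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends over all keys and returns a weighted sum of the values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # similarity of queries to keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax over keys
    return weights @ V

# Three tokens with made-up 4-dimensional query/key/value vectors
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)        # (3, 4)
```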
B
BERT: Bidirectional Encoder Representations from Transformers. A transformer-based language model developed by Google, widely used for natural language understanding tasks.
C
Chatbot: A computer program designed to simulate conversation with human users, especially over the internet. Chatbots like ChatGPT are a type of conversational agent.
ChatGPT: A large language model chatbot created by OpenAI and launched in November 2022. It is built on the GPT series of models.
Context window: The fixed-size span of tokens (such as words or characters) that a model considers at once when interpreting or generating text. The window size determines how many surrounding tokens are taken into account: larger context windows can capture more distant dependencies, but they also increase computational cost. In language modeling, the context window supplies the surrounding text a model needs to make informed predictions about a given token.
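A toy sketch of a fixed-size window around one token (two tokens on each side); real LLM context windows span thousands of tokens, but the idea is the same:

```python
def context_window(tokens, index, size=2):
    """Return the tokens within `size` positions of the token at `index`."""
    start = max(0, index - size)
    end = min(len(tokens), index + size + 1)
    return tokens[start:index] + tokens[index + 1:end]

tokens = "the cat sat on the mat".split()
print(context_window(tokens, index=2))  # ['the', 'cat', 'on', 'the'] -- context around 'sat'
```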
Curriculum learning: Training an AI system on progressively more difficult tasks, similar to how students progress through grade levels.
D
DALL-E: A generative AI system created by OpenAI that can create realistic images and art from text descriptions.
E
ELIZA: An early natural language processing computer program created in the 1960s to simulate conversation.
Embeddings: Representing words, phrases or items as numeric vectors that encode semantic meaning based on context. Allows AI models to understand language.
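A small sketch with made-up vectors showing how cosine similarity over embeddings captures relatedness of meaning; real embeddings have hundreds or thousands of dimensions and are learned from data:

```python
import numpy as np

# Hypothetical 4-dimensional embeddings, for illustration only
embeddings = {
    "king":  np.array([0.8, 0.6, 0.1, 0.2]),
    "queen": np.array([0.7, 0.7, 0.1, 0.3]),
    "apple": np.array([0.1, 0.0, 0.9, 0.8]),
}

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # high: related meanings
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # low: unrelated meanings
```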
F
Federated learning: A distributed machine learning approach where models are trained across decentralized devices or servers holding local data samples, without exchanging their training data.
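A minimal sketch of the federated averaging idea with made-up client weights: each device trains locally, and the server combines only the resulting model weights, never the raw data:

```python
import numpy as np

# Hypothetical model weights produced by local training on three separate devices
client_weights = [
    np.array([0.9, 1.1, 0.4]),
    np.array([1.0, 0.9, 0.5]),
    np.array([1.1, 1.0, 0.6]),
]

# Federated averaging: aggregate the weights on the server without ever seeing the data
global_weights = np.mean(client_weights, axis=0)
print(global_weights)  # [1.  1.  0.5]
```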
Fine-tuning: The process of taking a pretrained machine learning model and customizing it with additional data and training to perform a specific task. This is done with large language models like GPT-3.
G
Generative AI: AI systems capable of generating new content like text, images, video, and audio from scratch. Large language models like GPT-3 are a type of generative AI.
Generative Pretrained Transformer (GPT): A series of natural language processing models developed by OpenAI using the transformer technique. GPT-3 is the third version.
H
Hallucination: When AI systems like generative chatbots produce false information, make up facts or exhibit inconsistent responses. An ongoing challenge.
Human-in-the-loop: A technique in AI where humans work together with AI systems to enhance performance. Used to improve generative models.
Hyperparameters: The variables that govern the training process and model architecture for machine learning algorithms. Must be tuned for optimal results.
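One common way to tune hyperparameters is a grid search over candidate values; a short scikit-learn sketch (assuming scikit-learn is available):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# C and kernel are hyperparameters: chosen before training rather than learned from data
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)  # the best-scoring hyperparameter combination
```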
I
Interpretability: The ability to explain how and why an AI model makes decisions. Important for understanding generative models.
L
Large language model: An AI model, such as GPT-3, trained on massive text datasets to generate human-like text. Key to generative AI.
M
Machine learning: The study of computer algorithms that can improve automatically through experience and data. Powers modern AI like generative models.
N
Natural language processing (NLP): The ability of a computer program to understand, interpret, and manipulate human language. Allows chatbots to converse.
Neural network: A computing system inspired by the human brain’s neurons. Neural nets power deep learning algorithms used to create generative AI.
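A bare-bones numpy forward pass through a two-layer network, with random stand-in weights, to show the layered structure of weighted sums and nonlinearities:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

# Tiny network: 3 inputs -> 4 hidden units -> 2 outputs, with made-up weights
rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)

x = np.array([0.5, -0.2, 0.1])
hidden = relu(x @ W1 + b1)   # each "neuron" is a weighted sum passed through a nonlinearity
output = hidden @ W2 + b2
print(output.shape)          # (2,)
```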
O
OpenAI: A San Francisco AI research company that created important generative AI models like GPT-3 and DALL-E.
Overfitting: When a machine learning model performs very well on its training data but fails to generalize well to new, unseen data. Models need to avoid overfitting.
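A quick scikit-learn illustration: a model that scores far better on its training data than on held-out data is overfitting (exact numbers will vary):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unconstrained decision tree can memorize its training set
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("train accuracy:", model.score(X_train, y_train))  # typically 1.0
print("test accuracy: ", model.score(X_test, y_test))    # noticeably lower
```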
P
Parameter: The internal variables or “knobs” which machine learning models learn from training data in order to make predictions and decisions.
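A tiny worked example: fitting a line y = w·x + b by gradient descent, where the slope w and intercept b are the parameters being learned (the learning rate and step count are hyperparameters chosen by hand):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0            # data generated by the "true" parameters w=2, b=1

w, b = 0.0, 0.0              # parameters, learned from data
lr, steps = 0.05, 2000       # hyperparameters, set by hand
for _ in range(steps):
    pred = w * x + b
    grad_w = np.mean(2 * (pred - y) * x)
    grad_b = np.mean(2 * (pred - y))
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # close to 2.0 and 1.0
```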
Prompt engineering: The crafting of text prompts to get the best results from large language models like GPT-3. More of an art than a science.
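A small sketch of the idea: the two prompts below ask for the same thing, but the second specifies the output format and provides examples, which usually steers a model toward more consistent answers (the wording is illustrative, not a recipe):

```python
# Zero-shot: the model must guess the expected output format
zero_shot = "Classify the sentiment of: 'The battery died after an hour.'"

# Few-shot with explicit instructions: the prompt shows the model exactly what is wanted
few_shot = """Classify the sentiment of each review as Positive or Negative.
Review: 'Great screen and fast delivery.' -> Positive
Review: 'Stopped working after two days.' -> Negative
Review: 'The battery died after an hour.' ->"""
```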
R
Reinforcement learning: An AI technique where models learn through trial and error and positive/negative feedback without labeled training data. Promising for advancing generative AI.
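A minimal tabular Q-learning sketch on a made-up five-state corridor: the agent receives a reward only at the rightmost state and learns the best action for each state purely from that feedback:

```python
import numpy as np

n_states, n_actions = 5, 2            # actions: 0 = move left, 1 = move right
Q = np.zeros((n_states, n_actions))   # value estimates, learned by trial and error
alpha, gamma = 0.1, 0.9               # learning rate and discount factor
rng = np.random.default_rng(0)

for _ in range(300):                  # episodes
    state = 0
    while state != n_states - 1:
        action = int(rng.integers(n_actions))                  # explore by acting randomly
        next_state = max(0, state - 1) if action == 0 else state + 1
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # Update toward the reward plus the discounted value of the best next action
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(Q[:-1].argmax(axis=1))  # learned policy: typically [1 1 1 1], i.e. always move right
```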
S
Supervised learning: A machine learning approach where algorithms are trained on labeled datasets that map inputs to desired outputs. Supervision makes models more specialized.
T
Tokenization: Splitting text into smaller units called tokens, typically words, subwords, or characters, so that it can be processed by natural language AI models.
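A naive word-level tokenizer in a few lines; production LLM tokenizers use learned subword vocabularies (such as byte-pair encoding), but the splitting idea is the same:

```python
import re

text = "Generative AI can't write everything."
tokens = re.findall(r"\w+|[^\w\s]", text)   # words and punctuation as separate tokens
print(tokens)  # ['Generative', 'AI', 'can', "'", 't', 'write', 'everything', '.']
```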
Transformer: A neural network architecture particularly effective for natural language processing. First introduced in 2017.
Transfer learning: A technique where a model pretrained on one machine learning task is reused as the starting point for a related task. Enables training complex AI with less data.
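A rough numpy sketch of the idea, with a random matrix standing in for a pretrained feature extractor: the pretrained part stays frozen, and only a small new classifier head is trained on the new task's limited data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen, pretrained feature extractor (in practice, learned on a large dataset)
W_pretrained = rng.normal(size=(10, 4))
def extract_features(x):
    return np.tanh(x @ W_pretrained)

# Small labeled dataset for the new task
X = rng.normal(size=(30, 10))
y = (X[:, 0] > 0).astype(float)

# Train only the new head (w, b) on top of the frozen representation
feats = extract_features(X)
w, b = np.zeros(4), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(feats @ w + b)))   # logistic classification head
    grad = p - y
    w -= 0.1 * (feats.T @ grad) / len(y)
    b -= 0.1 * grad.mean()

print(((p > 0.5) == y).mean())  # training accuracy of the adapted model
```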
Turing test: A test conceived by Alan Turing to assess whether a machine can exhibit intelligent behavior indistinguishable from that of a human. Modern chatbots are increasingly able to approach this benchmark.
U
Unsupervised learning: A machine learning approach where models learn patterns from unlabeled, unclassified data. A key technique that enabled the development of generative AI.
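A short scikit-learn sketch: k-means clustering groups unlabeled points without ever being told what the groups are:

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled points that happen to form two groups
X = np.array([[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
              [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # e.g. [0 0 0 1 1 1] -- two clusters discovered without any labels
```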
V
Variational autoencoder (VAE): A deep learning technique used to generate new content like images. Important component of models like DALL-E.
W
Word embedding: Representing words as numeric vectors that encode semantic meaning based on context. Allows NLP models to understand language.