A Definitions & Terms

A/B Testing

A statistical way of comparing two (or more) techniques, typically an incumbent against a new rival. It aims to determine which technique performs better, and whether the difference is statistically significant.
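As a rough illustration of checking whether a difference is statistically significant, the sketch below runs a two-proportion z-test on conversion counts. The choice of test, the function name, and the sample numbers are assumptions made for this example; real A/B tests may call for a different test.

```python
import math

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for H0: both variants share one success rate."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled rate under the null hypothesis that A and B perform the same.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical data: incumbent A converts 100/1000, rival B converts 130/1000.
z, p = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value (commonly below 0.05) suggests the difference is unlikely to be due to chance alone.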


Accuracy

The percentage of predictions that the classifier got right.

Activation Function

In artificial neural networks, the activation function of a node defines the output of that node given an input or set of inputs.
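A minimal sketch of this idea, using two common activation functions (ReLU and sigmoid). The `node_output` helper and the example weights are hypothetical and only serve to show where the activation is applied.

```python
import math

def relu(x):
    # Passes positive values through and clips negatives to zero.
    return max(0.0, x)

def sigmoid(x):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def node_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    weighted_sum = sum(i * w for i, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)

print(node_output([1.0, 2.0], [0.5, -1.0], 0.0, relu))
print(node_output([1.0, 2.0], [0.5, -1.0], 0.0, sigmoid))
```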

Active Learning

A machine learning approach in which the learning algorithm chooses which unlabeled examples should be labeled next, so that a model improves with less labeled data.

Adversarial Example

Specialised inputs created to confuse a neural network so that it misclassifies them. The changes are often imperceptible to the human eye, yet they cause the network to fail to identify the contents of the image.

Adversarial Machine Learning

A research field that lies at the intersection of machine learning (ML) and computer security. It enables the safe adoption of ML techniques in adversarial settings like spam filtering, malware detection, etc.

AI Algorithms

Extended subset of machine learning that tells the computer how to learn to operate on its own through a set of rules or instructions.

Anchor Box

The archetypal location, size, and shape for finding bounding boxes in an object detection problem. For example, square anchor boxes are typically used in face detection models.


Annotation

A markup placed on an image (bounding boxes for object detection, polygons or a segmentation map for segmentation) to teach the model the ground truth.

Annotation Format

Particular way of encoding an annotation. There are many ways to describe a bounding box’s size and position (JSON, XML, TXT, etc) and to delineate which annotation goes with which image.

Annotation Group

Describes what types of objects you are identifying. For example, “chess pieces” or “vehicles.”

Application Programming Interface (API)

A set of commands, functions, protocols, and objects that programmers can use to create software or interact with an external system.

An API acts as an intermediary that lets different software programs communicate and exchange data, even if they are not built with the same programming languages or technologies. APIs give programs a way to work together, which helps create a more interconnected and seamless user experience.

Artificial Intelligence (AI)

The intelligence displayed by machines in performing tasks that typically require human intelligence, such as learning, problem-solving, decision-making, and language understanding. AI is achieved by developing algorithms and systems that can process, analyze, and understand large amounts of data and make decisions based on that data.


Architecture

A specific neural network layout (layers, neurons, blocks, etc.). These often come in multiple sizes whose design is similar except for the number of parameters.

Artificial Intelligence

A computational system that simulates parts of human intelligence but focuses on one narrow task.

Artificial Neural Network

A learning model created to act like a human brain that solves tasks that are too difficult for traditional computer systems to solve.

Automatic Speech Recognition (ASR)

A technology that processes human speech into readable text.


AutoML

Automates each step of the ML workflow so that users can build models with minimal effort and machine learning expertise.

Automation Bias

When a human decision maker favors recommendations made by an automated decision-making system over information made without automation, even when the automated decision-making system makes errors.

Audio Transcription Model

Takes audio containing speech and converts it into text. The resulting text can be searched for key terms, and AI models can transmit it over networks instead of audio, which makes the data much smaller and faster to send.

Autonomous AI

The most advanced form of AI is autonomous artificial intelligence, in which processes are automated to generate the intelligence that allows machines, bots and systems to act on their own, independent of human intervention. It is often used in autonomous vehicles. This field of AI is still very new, and researchers are continually refining their algorithms and their approaches to the problem, but it entails multiple layers.

Backward Chaining

A method where the model starts with the desired output and works in reverse to find data that might support it.

Base Workflow

One of Clarifai’s prebuilt models that can be built upon to create a custom model. It pre-indexes inputs for search and provides a default embedding space.


Baseline

A model used as a reference point for comparing how well another model (typically, a more complex one) is performing. Baseline models help developers quantify the minimal expected performance that a new model must achieve to be useful.


Batch

The set of examples used in one iteration (that is, one gradient update) of model training.

Batch Inference

An asynchronous process that runs predictions on a set of existing observations using a trained model and then stores the output.

Batch Size

The number of training examples utilized in one iteration.

Bayes’s Theorem

A famous theorem used by statisticians to describe the probability of an event based on prior knowledge of conditions that might be related to an occurrence.
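In symbols, the theorem states that P(A|B) = P(B|A) × P(A) / P(B). The sketch below applies it to a two-outcome case; the function name, parameter names, and the numbers in the example are assumptions chosen for illustration.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(A | B) for a binary event A, given evidence B.

    prior               = P(A)
    likelihood          = P(B | A)
    false_positive_rate = P(B | not A)
    """
    # Total probability of seeing the evidence B at all.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical example: 1% of emails are spam, a filter flags 90% of spam
# and wrongly flags 5% of legitimate mail.
print(bayes_posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05))
```

Even with a fairly accurate filter, a flagged email in this made-up scenario is still more likely to be legitimate than spam, because spam is rare to begin with.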


Bias

When an AI algorithm produces results that are systematically prejudiced due to erroneous assumptions in the machine learning process.

Big Data

Big data refers to data that is so large, fast or complex that it’s difficult or impossible to process using traditional methods.

Binary Classification

The task of classifying elements of a set into two groups on the basis of a classification rule. For example, a model that evaluates email messages and outputs either “spam” or “not spam” is a binary classifier.

Black Box AI

An AI system whose inputs and operations are not visible to the user. A black box, in a general sense, is an impenetrable system.


Boosting

A machine learning technique that iteratively combines a set of simple and not very accurate classifiers (referred to as “weak” classifiers) into a classifier with high accuracy (a “strong” classifier) by upweighting the examples that the model is currently misclassifying.


Bootstrapping

Any test or metric that uses random sampling with replacement and falls under the broader class of resampling methods. Bootstrapping assigns measures of accuracy (bias, variance, confidence intervals, prediction error, etc.) to sample estimates.
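As one possible illustration, the sketch below estimates a percentile confidence interval for a sample mean by resampling with replacement. The function name, the default of 2,000 resamples, and the sample data are all assumptions for this example.

```python
import random
import statistics

def bootstrap_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)
    # Each resample draws len(sample) values WITH replacement.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=len(sample)))
        for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4]  # hypothetical measurements
low, high = bootstrap_ci(data)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```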

Bounding Box

In an image, the (x, y) coordinates of a rectangle around an area of interest.

Brute Force Search

A search that isn’t limited by clustering or approximations; it searches across all inputs. Often more time-consuming and expensive, but more thorough.

Calibration Layer

A post-prediction adjustment, typically to account for prediction bias. The adjusted predictions and probabilities should match the distribution of an observed set of labels.


Chatbot

Simulates human conversation, using response workflows or artificial intelligence to interact with people based on verbal and written cues. Chatbots have become increasingly sophisticated in recent years and in the future may be indistinguishable from humans.


Checkpoint

Data that captures the state of the variables of a model at a particular time. Checkpoints enable exporting model weights, performing training across multiple sessions and continuing training past errors.


Class

One of a set of enumerated target values for a label. For example, in a binary classification model that detects spam, the two classes are spam and not spam. In a multi-class classification model that identifies dog breeds, the classes would be poodle, beagle, pug, etc.

Class Balance

The relative distribution between the number of examples of each class used to train a model. A model performs better if there are a relatively even number of examples for each class.


Classification

The process of grouping and categorizing objects and ideas so that they can be recognized, differentiated, and understood in data.


Classifier

An algorithm that implements classification. It refers to the mathematical function implemented by a classification algorithm that maps input data to a category.


Cluster

A group of observations that show similarities to each other and are organized by those similarities.


Clustering

A method of unsupervised learning and a common statistical data analysis technique. In this method, observations that show similarities to each other are organized into groups (clusters).

Cognitive Computing

A computerized model that mimics the way the human brain thinks. It involves self learning through the use of data mining, natural language processing, and pattern recognition.

Computer Vision

Field of AI that trains computers to interpret and understand the visual world. Using digital images from cameras and videos and deep learning models, machines can accurately identify and classify objects, and then react to what they “see.”


Concept

Describes an input, similar to a “tag” or “keyword.” There are two types: those that you specify to train a model, and those that a model assigns as a prediction.


Confidence

A model is inherently statistical. Along with its prediction, it also outputs a confidence value that quantifies how “sure” it is that its prediction is correct.

Confidence Threshold

We often discard predictions that fall below a certain bar. This bar is the confidence threshold.

Confusion Matrix

A confusion matrix is a table that is often used to describe the performance of a classification model on a set of test data for which the true values are known. The confusion matrix itself is relatively simple to understand, but the related terminology can be… confusing.
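For a binary classifier, the table has four cells: true positives, false positives, false negatives, and true negatives. The sketch below counts them from two lists of labels; the function name and the sample labels are made up for illustration.

```python
def binary_confusion_matrix(actual, predicted):
    """Count outcomes for binary labels, where 1 (or True) is the positive class."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if a and p:
            counts["tp"] += 1   # Actual Yes, Predicted Yes
        elif p:
            counts["fp"] += 1   # Actual No, Predicted Yes
        elif a:
            counts["fn"] += 1   # Actual Yes, Predicted No
        else:
            counts["tn"] += 1   # Actual No, Predicted No
    return counts

actual    = [1, 1, 1, 0, 0, 0, 0, 1]  # hypothetical true labels
predicted = [1, 0, 1, 0, 1, 0, 0, 1]  # hypothetical model outputs
cm = binary_confusion_matrix(actual, predicted)
accuracy = (cm["tp"] + cm["tn"]) / len(actual)
print(cm, "accuracy:", accuracy)
```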


Container

A virtualized environment that packages an application together with its dependencies so that it is portable. Docker is one common way to create containers.


Control

After a path is chosen, any device must ensure that the motors and steering work to move along the path without being diverted by bumps or small obstacles. In general, information flows from the top layer of the sensors down to the control layer as decisions are made. There are feedback loops, though, that bring information from the lower layers back up to the top to improve sensing, planning and perception.

Convolutional Filter

A convolution is a type of block that helps a model learn information about relationships between nearby pixels.

Convolutional Neural Network

Convolutional neural networks are deep artificial neural networks that are used primarily to classify images (e.g. name what they see), cluster them by similarity (photo search), and perform object recognition within scenes.


CoreML

A proprietary format used to encode weights for Apple devices that takes advantage of the hardware-accelerated neural engine present on iPhone and iPad devices.


CreateML

A no-code training tool created by Apple that trains machine learning models and exports them to CoreML. It supports classification and object detection along with several types of non-computer-vision models (such as sound, activity, and text classification).

Compute Unified Device Architecture (CUDA)

CUDA is a way that computers can work on really hard and big problems by breaking them down into smaller pieces and solving them all at the same time. It helps the computer work faster and better by using special parts inside it called GPUs. It’s like when you have lots of friends help you do a puzzle – it goes much faster than if you try to do it all by yourself. The term “CUDA” is a trademark of NVIDIA Corporation, which developed and popularized the technology.

Curse of Dimensionality

The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience.

Custom Dataset

A set of images and annotations pertaining to a domain-specific problem, in contrast to a research benchmark dataset like COCO or Pascal VOC.

Custom Training

The process of teaching a model to make certain predictions.

Classification Model

Reads an input such as text, image, audio, or video data and generates an output that classifies it into a category. For example, a language classification model might read a sentence and determine whether it’s in French, Spanish, or Italian.

Detection Model

Detection comprises two tasks: listing “what” things appear in an image, and “where” they appear. Results are returned as bounding boxes along with the names of the detected items.

Domain Model

Focuses on understanding a single domain, such as travel, weddings, food, not-safe-for-work (NSFW), etc.


Data

In the data science and AI world, any collection of information in a digital form. It’s important to distinguish between structured and unstructured data: structured data is highly specific and is stored in a predefined format such as a spreadsheet table, whereas unstructured data is a conglomeration of many varied types of data that are stored in their native formats, such as images, video, audio, and text. Data is also a plural, with the singular being “datum”.

Data Annotation

The process of labeling datasets to be used as inputs for machine learning models.

Data Curation

The process of collecting, organizing, cleaning, labeling, and maintaining data for use in training and testing models.

Data Mining

The process by which patterns are discovered within large sets of data with the goal of extracting useful information from it.


Dataset

A collection of data and a ground truth of outputs that you use to train machine learning models by example.


Deduplication

The removal of identical data, or data that is so similar that for all intents and purposes it can be considered duplicate data. Using visual search, a similarity threshold can be set to decide what should be removed.

Deep Learning

The general term for machine learning using layered (or deep) algorithms to learn patterns in data. It is most often used for supervised learning problems.

Deep Neural Network

An artificial neural network (ANN) with multiple layers between the input and output layers. It uses sophisticated mathematical modeling to process data in complex ways.


Deployment

Taking the results of a trained model and using them to do inference on real-world data. This could mean hosting a model on a server or installing it on an edge device.

Detection Mode

Also known as object detection. A model that identifies the presence, location and type of objects within images or video frames.

Diversity, Equity & Inclusion (DEI)

Term used to describe policies and programs that promote the representation and participation of different groups of individuals, including people of different ages, races and ethnicities, abilities and disabilities, genders, religions, cultures, and sexual orientations.

Domain Adaptation

A technique to improve the performance of a model where there is little data in the target domain by using knowledge learned by another model in a related domain. An example could be training a model to recognize taxis using a model that recognizes cars.

Data Processing

The process of preparing raw data for use in a machine learning model, including tasks such as cleaning, transforming, and normalizing the data.

Deep Learning (DL)

A subfield of machine learning that uses deep neural networks with many layers to learn complex patterns from data.

Edge AI

Data is processed on the same device that produces it, or at most on a nearby computer, with no reliance on distant cloud servers or other remote computing nodes. As a result, AI can work faster and respond more accurately to time-sensitive events.

Edge Computing

A distributed computing framework that brings enterprise applications closer to data sources such as IoT devices or local edge servers.


Embedding

A categorical feature represented as a continuous-valued feature. Typically, an embedding is a translation of a high-dimensional vector into a low-dimensional space.

Embedding Space

The d-dimensional vector space that features from a higher-dimensional vector space are mapped to. Ideally, the embedding space contains a structure that yields meaningful mathematical results.

Emotional AI

Technologies that use affective computing and artificial intelligence techniques to sense, learn about and interact with human emotional life.

Ensemble Models

Machine learning approach to combine multiple other models in the prediction process. While the individual models may not perform very well, when combined they can be very powerful indeed.

Extensible Markup Language (XML)

A language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.


Embedding

When we want a computer to understand language, we need to represent the words as numbers because computers can only work with numbers. An embedding is a way of doing that. Here’s how it works: we take a word, like “cat”, and convert it into a numerical representation that captures its meaning. We do this by using a special algorithm that looks at the word in the context of other words around it. The resulting list of numbers (a vector) represents the word’s meaning and can be used by the computer to work out how the word relates to other words. For example, the word “kitten” might have a similar embedding to “cat” because they are related in meaning. Similarly, the word “dog” might have a different embedding than “cat” because they have different meanings. This allows the computer to understand relationships between words and make sense of language.
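One common way to compare two embeddings is cosine similarity, which is close to 1 when two vectors point in nearly the same direction. The three-dimensional vectors below are invented purely for illustration; real embeddings are learned by a model and usually have hundreds of dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings (not produced by any real model).
embeddings = {
    "cat":    [0.90, 0.80, 0.10],
    "kitten": [0.85, 0.75, 0.20],
    "dog":    [0.10, 0.90, 0.80],
}

print(cosine_similarity(embeddings["cat"], embeddings["kitten"]))
print(cosine_similarity(embeddings["cat"], embeddings["dog"]))
```

With these made-up values, “cat” scores higher against “kitten” than against “dog”, mirroring the relationship described above.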

Embedding Model

Computers and models can’t understand images and text like humans do. Embedding models take unstructured input like images, audio, text, and video and transform them into a series of numbers called vectors which can then be input into the prediction models.

Feature Engineering

The process of selecting and creating new features from the raw data that can be used to improve the performance of a machine learning model.


Freemium

You might see the term “Freemium” used often on this site. It simply means that the specific tool that you’re looking at has both free and paid options. Typically there is very minimal, but unlimited, usage of the tool at a free tier with more access and features introduced in paid tiers.

F Score

The harmonic mean of precision and recall (the true positive rate), optionally weighted to favor one over the other.
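A common form is the F-beta score, where `beta = 1` gives the familiar F1 score and larger values of `beta` give recall more weight. The sketch below is a minimal version; the function name and the sample values are assumptions for this example.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score: weighted harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0  # Avoid dividing by zero when both are zero.
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

print(f_score(0.5, 1.0))          # F1
print(f_score(0.5, 1.0, beta=2))  # F2 weights recall more heavily
```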

Facial Recognition

An application capability of identifying or verifying a person from an image or a video frame by comparing selected facial features from the image and a face database.

False Positives

An error where a model falsely predicts the presence of the desired outcome in an input, when in reality it is not present (Actual No, Predicted Yes).

False Negatives

An error where a model falsely predicts an input as not having a desired outcome, when one is actually present (Actual Yes, Predicted No).


FastAI

A library built on top of PyTorch for rapid prototyping and experimentation. There is a companion course that teaches the fundamentals of machine learning.

Feature Extraction

The process by which data that is too large to be processed is transformed into a reduced representation set of features. For images, features at various levels of complexity are extracted from the image data. Typical examples are lines, edges, and ridges; localized interest points such as corners, blobs, or points; and more complex features related to texture, shape, or motion.


Folksonomy

User-generated system of classifying and organizing online content into different categories by the use of metadata such as electronic tags.


Framework

Deep learning frameworks implement neural network concepts. Some are designed for training and inference (TensorFlow, PyTorch, FastAI, etc.), while others are designed particularly for speedy inference (OpenVINO, TensorRT, etc.).


Fusion

The details from the various sensors must be organized into a single, coherent view of what’s happening around the autonomous vehicle. The sensor fusion algorithms must sort through the details and construct a reliable model that can be used in later stages for planning.


Generalization

Refers to a model’s ability to make correct predictions on new, previously unseen data as opposed to the data used to train the model.

Generative Adversarial Networks (GANs)

A class of artificial intelligence algorithms used in unsupervised machine learning, implemented by a system of two neural networks contesting with each other in a zero-sum game framework. This technique can generate photographs that look at least superficially authentic to human observers, having many realistic characteristics (though in tests people can tell real from generated in many cases).

Generative AI

Models that can be trained using existing content like text, audio files, or images to create new original content.

Grid Search

A tuning technique that attempts to compute the optimal values of hyperparameters for training models by performing an exhaustive search through a subset of hyperparameters.
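The exhaustive search can be sketched as a loop over every combination in a grid. Here `toy_score` is a hypothetical stand-in for "train a model with these hyperparameters and return its validation score", and the grid values are arbitrary.

```python
import itertools

def grid_search(param_grid, score_fn):
    """Try every combination in `param_grid`; return the best (params, score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_score(learning_rate, batch_size):
    # Hypothetical scoring function that peaks at learning_rate=0.01, batch_size=32.
    return -(learning_rate - 0.01) ** 2 - ((batch_size - 32) / 1000) ** 2

grid = {"learning_rate": [0.001, 0.01, 0.1], "batch_size": [16, 32, 64]}
best_params, best_score = grid_search(grid, toy_score)
print(best_params)
```

The cost grows multiplicatively with each added hyperparameter, which is why the search is usually restricted to a small, hand-picked subset of values.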

Ground Truth

The “answer key” for your dataset. This is how you judge how well your model is doing and calculate the loss function we use for gradient descent. It’s also what we use to calculate our metrics. Having a good ground truth is extremely important. Your model will learn to predict based on the ground truth you give it to replicate.

Generative Adversarial Network (GAN)

A type of computer program that creates new things, such as images or music, by training two neural networks against each other. One network, called the generator, creates new data, while the other network, called the discriminator, checks the authenticity of the data. The generator learns to improve its data generation through feedback from the discriminator, which becomes better at identifying fake data. This back and forth process continues until the generator is able to create data that is almost impossible for the discriminator to tell apart from real data. GANs can be used for a variety of applications, including creating realistic images, videos, and music, removing noise from pictures and videos, and creating new styles of art.

Generative Art

A form of art that is created using a computer program or algorithm to generate visual or audio output. It often involves the use of randomness or mathematical rules to create unique, unpredictable, and sometimes chaotic results.

Generative Pre-trained Transformer (GPT)

GPT stands for Generative Pretrained Transformer. It is a type of large language model developed by OpenAI.

Giant Language Model Test Room (GLTR)

GLTR is a tool that helps people tell whether a piece of text was written by a computer or a person. It looks at each word in the text and checks how likely a language model would have been to choose that word in that position, then color-codes the result. Green means the word was among the model’s most likely choices, yellow and red mean it was progressively less likely, and violet means it was very unlikely. Text that is mostly green and yellow is more likely to be machine-generated, while human writing tends to contain more red and violet words.


GitHub

A platform for hosting and collaborating on software projects.

Google Colab

Google Colab is an online platform that allows users to share and run Python scripts in the cloud.

Graphics Processing Unit (GPU)

A GPU, or graphics processing unit, is a special type of computer chip that is designed to handle the complex calculations needed to display images and video on a computer or other device. It’s like the brain of your computer’s graphics system, and it’s really good at doing lots of math really fast. GPUs are used in many different types of devices, including computers, phones, and gaming consoles. They are especially useful for tasks that require a lot of processing power, like playing video games, rendering 3D graphics, or running machine learning algorithms.


Hashing

In machine learning, a mechanism for bucketing categorical data, particularly when the number of categories is large, but the number of categories actually appearing in the dataset is comparatively small.
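A minimal sketch of the idea: each category name is hashed and mapped to one of a fixed number of buckets. The bucket count of 8 and the category names are arbitrary choices for this example. MD5 is used only because it is stable across runs (Python's built-in `hash()` for strings is randomized per process), not for security.

```python
import hashlib

def hash_bucket(category, n_buckets):
    """Map a category name to a stable bucket index in [0, n_buckets)."""
    digest = hashlib.md5(category.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets

# Hypothetical categories drawn from a very large vocabulary.
for name in ["sugar maple", "red maple", "silver maple"]:
    print(name, "->", hash_bucket(name, n_buckets=8))
```

Different categories can land in the same bucket (a collision), which is the trade-off for keeping the number of buckets small.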

Hidden Layer

A synthetic layer in a neural network between the input layer (that is, the features) and the output layer (the prediction). Hidden layers typically contain an activation function (such as ReLU) for training. A deep neural network contains more than one hidden layer.

Holdout Data

Examples intentionally not used during training. The validation dataset and test dataset are examples of holdout data. It helps evaluate your model’s ability to generalize to data other than the data on which it was trained.

Hosted Model

A set of trained weights located in the cloud that you can receive predictions from via an API.

Human Workforce (“Labelers”)

Workers who can help to complete work on an as-needed basis, which for our purposes usually means labeling data (images).


Hyperparameter

The levers by which you can tune your model during training. These include things like learning rate and batch size. You can experiment with changing hyperparameters to see which ones perform best with a given model for your dataset.


Inference

Making predictions using the weights you save after training your model.


ImageNet

A large visual database designed for use in visual object recognition software research.

Image Recognition

The ability of software to identify objects, places, people, writing and actions in images.

Implicit Bias

Automatically making an association or assumption based on one’s mental models and memories. Implicit bias can affect how data is collected and classified, and how machine learning systems are designed and developed.

Image Segmentation

The process of dividing a digital image into multiple segments with the goal of simplifying the representation of an image into something that is easier to analyze. Segmentation divides whole images into pixel groupings, which can then be labeled and classified.

Information Retrieval

The area of Computer Science studying the process of searching for information in a document, searching for documents themselves, and also searching for metadata that describes data and for databases of texts, images or sounds.


Input

Any information or data sent to a computer for processing is considered input.

Input Layer

The first layer (the one that receives the input data) in a neural network.

Intelligent Character Recognition (ICR)

Related technology to OCR designed to recognize handwritten characters.


Jetson

An edge computing device created by NVIDIA that includes an onboard GPU.


JSON

A freeform data serialization format originally created as part of JavaScript but now used much more broadly. Many annotation formats use JSON to encode their bounding boxes.

Jupyter Notebook

A common data science tool that enables you to execute Python code visually. Each “cell” contains code that you execute by hitting “Ctrl+Enter”. The results of the execution are displayed below the cell.

Knowledge Graph

Collection of nodes and edges where the nodes represent concepts, entities, relationships, and events, and the edges represent the connections between them.


Label

Assigning a class or category to a specific object in your dataset.


Labeling

Also known as data labeling; the process of annotating datasets to train machine learning models.

Labeling Criteria

A labeling requirements guide which includes instructions for the labeling process itself as well as written definitions and a multitude of visual examples for each concept.


AI-automated tool using end-to-end workflows to label images and video at scale to create high-quality training data.


LangChain

A library that helps users connect artificial intelligence models to external sources of information. The tool allows users to chain together commands or queries across different sources, enabling the creation of agents or chatbots that can perform actions on a user’s behalf. It aims to simplify these connections and so enable more complex and powerful applications of artificial intelligence.

Large Language Model (LLM)

A type of machine learning model that is trained on a very large amount of text data and is able to generate natural-sounding text.

Machine Learning (ML)

A method of teaching computers to learn from data, without being explicitly programmed.

Machine Intelligence

An umbrella term that encompasses machine learning, deep learning and classical learning algorithms.

Machine Learning

A general term for algorithms that can learn patterns from existing data and use these patterns to make predictions or decisions with new data.

Masked Language Model

A language model that predicts the probability of candidate tokens to fill in blanks in a sequence.


Metadata

Information about an analog or digital object, a component of an object, or a coherent collection of objects. Metadata describing digital content is often structured (e.g., with tagging or markup).

Misclassification Rate

Rate used to gauge how often a model’s predictions are wrong.


MLOps

Also known as Machine Learning Operations. Best practices for organizations to operationalize machine learning. Often involves collaboration between data scientists and DevOps professionals to manage production ML.


Modality

A high-level data category. For example, numbers, text, images, video and audio are five different modalities.


Model

The representation of what a machine learning system has learned from the training data.


ModelOps

As defined by Gartner, ModelOps is focused primarily on the governance and life cycle management of a wide range of operationalized artificial intelligence (AI) and decision models, including machine learning, knowledge graphs, rules, optimization, linguistic and agent-based models.

Model Size

The number of parameters (or neurons) a model has. This can also be measured in terms of the size of the weights file on disk.

Masked Language Model

A language model that predicts the probability of which words make the most sense to fill in blanks in a sequence. A simple example could be “Good ___, how are you?” where probable candidate words could be “morning”, “day”, or “evening”.

Multimodal Model

A model whose inputs and/or outputs include more than one modality. For example, consider a model that takes both an image and a text caption (two modalities) as features, and outputs a score indicating how appropriate the text caption is for the image.

Model Training

The process of determining the best model by fitting its parameters to training data.

Monte Carlo Simulation

Used to model the probability of different outcomes in a process that cannot easily be predicted due to the intervention of random variables. It’s a technique used to understand the impact of risk and uncertainty. It was developed by scientists working on nuclear weapons in the 1940s, and was given the code name “Monte Carlo” in reference to the Monte Carlo Casino in Monaco, where the uncle of one of the inventors would borrow money from relatives to gamble.
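A classic toy illustration of the technique is estimating π by drawing random points in a unit square and counting how many fall inside a quarter circle. The function name, sample count, and seed below are arbitrary choices for this sketch.

```python
import random

def estimate_pi(n_samples=100_000, seed=0):
    """Estimate pi from the fraction of random points inside a quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # Area of quarter circle / area of unit square = pi / 4.
    return 4 * inside / n_samples

print(estimate_pi())
```

The estimate becomes more precise as the number of samples grows, which is the general pattern for Monte Carlo methods.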

Multi-class Classification

Classification problems that distinguish between more than two classes. For example, there are approximately 53 species of maple trees, so a model that categorized maple tree species would be multi-class.

Natural Language Processing (NLP)

A subfield of AI that focuses on teaching machines to understand, process, and generate human language.

Neural Networks

A type of machine learning algorithm modeled on the structure and function of the brain.

Neural Radiance Fields (NeRF)

A type of deep learning model that represents a 3D scene with a neural network. Given a position in the scene and a viewing direction, the network predicts the radiance at that point, which is a measure of the amount of light that is emitted or reflected there. Once trained on a set of 2D images of a scene, a NeRF can render that scene from new viewpoints.

Named Entity Recognition Model

A method that is used for recognizing entities such as people, dates, organizations, and locations that are present in a text document.

Natural Language Processing (NLP)

A branch of AI that helps computers understand, interpret, and manipulate human language. This field of study focuses on helping machines understand human language in order to improve human-computer interfaces.

Natural Language Understanding

Determining a user’s intentions based on what the user typed or said. For example, a search engine uses natural language understanding to determine what the user is searching for based on what the user typed or said.


Neuro-Symbolic AI

The combining of neural and symbolic AI architectures to address the complementary strengths and weaknesses of each, providing a robust AI capable of reasoning, learning, and cognitive modeling.

Neural Architecture Search

Automatically trying many variations of model layouts and hyperparameters to find the optimal configuration.


Neuron

A unit in an Artificial Neural Network processing multiple input values to generate a single output value.

Neural Network

Series of algorithms that endeavors to recognize underlying relationships in a set of data through a process that mimics the way the human brain operates.


Noise

Signals with no causal relation to the target function.


Normalization

The process of converting an actual range of values into a standard range of values, typically -1 to +1 or 0 to 1.
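
One common way to do this rescaling is min-max scaling, sketched below. The function name `min_max_scale` and the sample values are assumptions for illustration; other schemes (such as standardizing to zero mean and unit variance) exist as well.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly map values from their actual range onto [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to the midpoint (a choice made for this sketch).
        return [(new_min + new_max) / 2 for _ in values]
    span = hi - lo
    return [new_min + (v - lo) * (new_max - new_min) / span for v in values]

heights_cm = [150.0, 160.0, 170.0, 190.0]             # hypothetical raw values
unit_range = min_max_scale(heights_cm)                # scaled to 0 .. 1
signed_range = min_max_scale(heights_cm, -1.0, 1.0)   # scaled to -1 .. +1
print(unit_range, signed_range)
```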

Object Detection

A computer technology related to computer vision and image processing that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos. This technique also involves localizing the object in question, which differentiates it from classification, which only tells the type of object.

Object Recognition

Also known as object classification. A computer vision technique for identifying objects in images or videos.

Object Tracking

The process of following a specific object of interest, or multiple objects, in a given scene. It traditionally has applications in video and real-world interactions where observations are made following an initial object detection.

On-premise Software

Software that is installed and runs on computers located on the premises of the organization using that software versus at a remote facility such as a server farm or on the cloud.

One Shot Classification

A model that only requires that you have one training example of each class you want to predict on. The model is still trained on several instances, but they only have to be in a similar domain as your training example.


A research institute focused on developing and promoting artificial intelligence technologies that are safe, transparent, and beneficial to society


Overfitting

A common problem in machine learning, in which the model performs well on the training data but poorly on new, unseen data. It occurs when the model is too complex and has learned too many details from the training data, so it doesn’t generalize well.

Open Neural Network Exchange (ONNX)

ONNX is an open format to represent machine learning models.


OpenAI

An AI research organization whose stated mission is to ensure that artificial general intelligence benefits all of humanity.

Optical Character Recognition (OCR)

A computer system that takes images of typed, handwritten, or printed text and converts them into machine-readable text.


Optimization

The selection of the best element (with regard to some criterion) from some set of available alternatives.


Outputs

Predictions that a model produces after processing the input uploaded to or fed into it.

Outsourced Labeling

Paying people to annotate, or label, your data. Its effectiveness can depend on the domain expertise of annotators. Providing comprehensive labeling criteria is crucial for training annotators before beginning a project.


Overfitting

A machine learning problem where an algorithm is unable to discern information that is relevant to its assigned task from information which is irrelevant within training data. Overfitting inhibits the algorithm’s predictive performance when dealing with new data.


Parameter

Any characteristic that can be used to help define or classify a system. In AI, parameters are used to clarify exactly what an algorithm should be seeking to identify as important data when performing its target function.

Pattern Recognition

A branch of machine learning that focuses on the recognition of patterns and regularities in data, although it is in some cases considered to be nearly synonymous with machine learning.


Pipeline

The process of going from raw images to prediction. Usually this encompasses collecting images, annotation, data inspection and quality assurance, transformation, preprocessing and augmentation, training, evaluation, deployment, inference (and then repeating the cycle to improve the predictions).


Perception

After the model is constructed, the system must begin to identify important areas like any roads or paths or moving objects.


Planning

Finding the best path forward requires studying the model and also importing information from other sources like mapping software, weather forecasts, traffic sensors and more.


Prompt

A piece of text that is used to prime a large language model and guide its generation. More generally, a text input used to return a result from an AI model.


Precision

An indicator of a machine learning model’s performance: the quality of a positive prediction made by the model. It refers to the number of true positives divided by the total number of positive predictions.
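
The ratio described above can be sketched in a few lines. The function name, the counts, and the choice to return 0.0 when the model made no positive predictions are assumptions for illustration.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """True positives divided by all positive predictions."""
    predicted_positive = true_positives + false_positives
    if predicted_positive == 0:
        return 0.0  # convention chosen for this sketch; the ratio is undefined here
    return true_positives / predicted_positive

# Hypothetical spam filter: 50 emails flagged, 40 of them really spam.
spam_precision = precision(true_positives=40, false_positives=10)
print(spam_precision)
```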

Predictive Model

A model that uses observations measured in a sample to gauge the probability that a different sample or remainder of the population will exhibit the same behavior or have the same outcome.

Pre-trained Model

A model, or a component of a model, that has been preliminarily trained, generally using another dataset, to learn general features (for example, finding lines, corners, and patterns of colors). Pre-training on a large dataset like Common Objects in Context (COCO), which has 330,000 images with 1.5 million objects to detect, can reduce the number of custom images you need to obtain satisfactory results.


Polygon

A (usually non-rectangular) region defining an object with more detail than a rectangular bounding box. Polygon annotations can be used to train segmentation models or to enhance performance of object-detection models by enabling a more accurate bounding box to be maintained after augmentation.

Positive Predictive Value (PPV)

Very similar to precision, except that it takes prevalence into account. In the case where the classes are perfectly balanced (meaning the prevalence is 50%), the positive predictive value is equivalent to precision.


Prediction

An attempt by a model to replicate the ground truth. A prediction usually contains a confidence value for each class.

Precision (Recognition)

A rate that measures how often a model is correct when it predicts “yes.”


Prevalence

The rate of how often the “yes” condition actually occurs in a sample.


Production

The deployment environment where the model will run in the wild on real-world images (as opposed to the testing environment where the model is developed).


Pruning

The use of a search algorithm to cut off undesirable solutions to a problem in an AI system. It reduces the number of decisions that can be made by the AI system.


Python

A popular, high-level programming language known for its simplicity, readability, and flexibility (many AI tools use it).


PyTorch

A popular open source deep learning framework developed by Facebook. It focuses on accelerating the path from research prototyping to production deployment.

Recall (Sensitivity)

The fraction of relevant instances that have been retrieved over the total amount of relevant instances.
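
A minimal sketch of this fraction, computed from paired lists of actual and predicted labels. The label encoding (1 for relevant, 0 for not relevant) and the sample data are assumptions for illustration.

```python
def recall(actual, predicted):
    """Relevant instances that were retrieved, divided by all relevant instances."""
    true_positives = sum(1 for a, p in zip(actual, predicted) if a and p)
    false_negatives = sum(1 for a, p in zip(actual, predicted) if a and not p)
    relevant = true_positives + false_negatives
    return true_positives / relevant if relevant else 0.0

actual_labels = [1, 1, 1, 1, 0, 0]      # four relevant instances
predicted_labels = [1, 1, 0, 0, 1, 0]   # the model retrieves two of them
model_recall = recall(actual_labels, predicted_labels)
print(model_recall)
```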

Receiver Operating Characteristic (ROC) Curve

This is a commonly used graph that summarizes the performance of a classifier over all possible thresholds. It is generated by plotting the True Positive Rate (y-axis) against the False Positive Rate (x-axis) as you vary the threshold for assigning observations to a given class.
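
The threshold sweep can be sketched as follows. The function name `roc_points`, the sample labels and scores, and the rule that a score at or above the threshold counts as a positive prediction are assumptions; the sketch also assumes that both classes are present in the labels.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    # Sweep thresholds from high to low; infinity yields the (0, 0) corner.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    for threshold in thresholds:
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

labels = [1, 1, 0, 1, 0, 0]               # hypothetical ground truth
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]   # hypothetical model scores
curve = roc_points(labels, scores)
for fpr, tpr in curve:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Plotting the pairs in order traces the curve from the bottom-left corner to the top-right corner.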

Recurrent Neural Network

A type of artificial neural network with loops in it, allowing recorded information, like data and outcomes, to persist by being passed from one step of the network to the next. It can be thought of as multiple copies of the same network, each passing information to its successor.


Regression

A statistical measure used to determine the strength of the relationships between dependent and independent variables.

Reinforcement Learning

A type of machine learning in which machines are “taught” to achieve their target function through a process of experimentation and reward, receiving positive reinforcement when their processes produce the desired result and negative reinforcement when they do not. This is differentiated from supervised learning, which would require an annotation for every individual action the algorithm would take.


ReLU

In the context of artificial neural networks, the ReLU (rectified linear unit) activation function is an activation function which outputs the same as its input if the input is positive, and zero if the input is negative. A related function is the leaky rectified linear unit (leaky ReLU), which assigns a small positive slope for x < 0.
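
Both functions can be sketched for a single scalar input as below. The default slope of 0.01 for the leaky variant is an assumption; the definition above only says the slope is small and positive.

```python
def relu(x: float) -> float:
    """Pass positive inputs through unchanged; output zero otherwise."""
    return x if x > 0 else 0.0

def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    """Like ReLU, but scale non-positive inputs by a small positive slope."""
    return x if x > 0 else negative_slope * x

for value in (3.0, 0.0, -2.0):
    print(value, relu(value), leaky_relu(value))
```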

Responsible AI

Umbrella term for aspects of making appropriate business and ethical choices when adopting AI, including business and societal value, risk, trust, transparency, fairness, bias mitigation, explainability, accountability, safety, privacy, and regulatory compliance.

Reinforcement Learning

A type of machine learning in which the model learns by trial and error, receiving rewards or punishments for its actions and adjusting its behavior accordingly.

Spatial Computing

Spatial computing is the use of technology to add digital information and experiences to the physical world. This can include things like augmented reality, where digital information is added to what you see in the real world, or virtual reality, where you can fully immerse yourself in a digital environment. It has many different uses, such as in education, entertainment, and design, and can change how we interact with the world and with each other.

Stable Diffusion

Stable Diffusion generates complex artistic images based on text prompts. It’s an open source image synthesis AI model available to everyone. Stable Diffusion can be installed locally using code found on GitHub or there are several online user interfaces that also leverage Stable Diffusion models.

Supervised Learning

A type of machine learning in which the training data is labeled and the model is trained to make predictions based on the relationships between the input data and the corresponding labels.

Segmentation Model

Instead of returning bounding boxes for each concept, this model indicates the regions for each concept via a heat map and trace (think of a coloring book).

Search Query

A query that a user feeds into a search engine to satisfy his or her information needs. If the query itself is a piece of visual content then that is what is known as a “visual search query.”


Sensing

Building a model of the constantly shifting world requires a collection of sensors that are usually cameras and often controlled lighting from lasers or other sources. The sensors usually also include position information from GPS or some other independent mechanism.

Selective Filtering

When a model ignores “noise” to focus on valuable information.

Siamese Networks

A different way of classifying images: instead of training one model to classify image inputs, it trains two neural networks that learn simultaneously to find the similarity between images.


Signal

Inputs, information, data.

Software Development Kit (SDK)

A set of software development tools that allows for the creation of applications on a specific platform.


Specificity

The rate of how often a model predicts “no,” when it’s actually “no.”
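
As a small sketch, this rate can be computed from the counts of true negatives and false positives; one minus the result is the false positive rate used on the x-axis of an ROC curve. The function name and the counts are assumptions for illustration.

```python
def specificity(true_negatives: int, false_positives: int) -> float:
    """Correct "no" predictions divided by all cases that are actually "no"."""
    actual_negatives = true_negatives + false_positives
    if actual_negatives == 0:
        return 0.0  # convention chosen for this sketch; the ratio is undefined here
    return true_negatives / actual_negatives

# Hypothetical test: 100 healthy patients, 90 correctly reported as healthy.
spec = specificity(true_negatives=90, false_positives=10)
false_positive_rate = 1 - spec
print(spec, false_positive_rate)
```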

Standard Classification

The process by which an input is assigned to one of a fixed set of categories. In machine learning, this is often achieved by learning a function that maps an input to a score for each potential category.
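
The score-per-category idea can be sketched as below: a hypothetical model has already produced one raw score per category, and the input is assigned to the highest-scoring one. Converting scores to probabilities with softmax is an assumption added here for illustration, as are the category names and score values.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, categories):
    """Assign the input to the category with the highest probability."""
    probs = softmax(scores)
    best = max(range(len(categories)), key=lambda i: probs[i])
    return categories[best], probs[best]

categories = ["cat", "dog", "bird"]   # the fixed set of categories
raw_scores = [2.0, 0.5, -1.0]         # hypothetical model outputs for one input
label, confidence = classify(raw_scores, categories)
print(label, round(confidence, 3))
```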

Strong AI

A theoretical form of AI that replicates human functions, such as reasoning, planning, and problem-solving.

Structured Data

Data that resides in a fixed field within a file or record. Structured data is typically stored in a relational database. It can consist of numbers and text, and sourcing can happen automatically or manually, as long as it’s within an RDBMS structure.

Supervised Learning

A machine learning approach that’s defined by its use of labeled datasets. These datasets are designed to train or “supervise” algorithms into classifying data or predicting outcomes accurately. Using labeled inputs and outputs, the model can measure its accuracy and learn over time.

Symbiotic Intelligence

A combination of human and artificial intelligence. Instead of relying on memory, or having to open a book, or visit a website, an enhanced human could have access to all of the information that is stored on the internet, and an advanced AI could feed the relevant data points to the human brain, enabling the human to be fully in control.

Synthetic Intelligence

An alternative term for artificial intelligence emphasizing that the intelligence of machines need not be an imitation or in any way artificial; it can be a genuine form of intelligence. An analogy can be made with simulated diamonds (such as cubic zirconia) versus synthetic diamonds (real diamonds made of carbon created by humans).

Synthetic Data

Images that are created rather than collected.

Target Function

The end goal of an algorithm.


Taxonomy

In essence, a taxonomy is a model’s worldview, or the framework for how your model sees its training data. In practice, it’s a list of visually-distinct model concepts and the definitions of those concepts.

Temporal Data

Data recorded at different points in time.


TensorFlow

An open-source software library used for machine learning applications such as neural networks.

Test Dataset

The sample of data used to provide an unbiased evaluation of a final model fit on the training dataset.


Torch

A scientific computing framework with wide support for machine learning algorithms, written in C and Lua.


Training

The process of iteratively adjusting your model’s parameters to converge on the weights that optimally mimic the training data.

Training Dataset

An initial dataset used to train machine learning algorithms. Models create and refine their rules using this data. It’s a set of data samples used to fit the parameters of a machine learning model, training it by example.

Transfer Learning

Transferring information from one machine learning task to another. It might involve transferring knowledge from the solution of a simpler task to a more complex one, or involve transferring knowledge from a task where there is more data to one where there is less data.


Transformer

A neural network that transforms a sequence of elements (like words in a sentence) into another sequence to solve sequence-to-sequence tasks.

True Positives

Cases that are actually “Yes” and that the model correctly predicted as “Yes.”

True Negatives

Cases that are actually “No” and that the model correctly predicted as “No.”

Turing Test

A test developed by Alan Turing in 1950, used to identify true artificial intelligence. It tested a machine’s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human.

Unstructured Data

Information that either does not have a pre-defined data model or is not organized in a pre-defined manner. Unstructured data may include documents, images, video and audio.

Unsupervised Learning

A type of machine learning in which the training data is not labeled, and the model is trained to find patterns and relationships in the data on its own. It uses machine learning algorithms to analyze and cluster unlabeled datasets, discovering hidden patterns or data groupings without the need for human intervention. Its ability to discover similarities and differences in information makes it well suited to exploratory data analysis, cross-selling strategies, customer segmentation, and image recognition.


Validation

The model is given new, previously unseen data, and then metrics are collected on how well it performs predictions on that data. This is analogous to a human learning math problems using one set of questions, then being tested with a different set of questions to see if they learned properly.

Validation Data Set

The sample of data used to provide an unbiased evaluation of a model fit on the training dataset while tuning model hyperparameters. The evaluation becomes more biased as skill on the validation dataset is incorporated into the model configuration.
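
A minimal sketch of holding out a validation set alongside the training and test sets. The 70/15/15 proportions, the fixed seed, and the function name `split_dataset` are assumptions for illustration, not a recommended standard.

```python
import random

def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle once, then cut into training, validation, and test subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains is held back for the final test
    return train, val, test

train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))
```

Hyperparameters would be tuned against the validation subset only, leaving the test subset untouched until the final evaluation.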


Variance

The error due to a model’s sensitivity to fluctuations in the training set. In statistics, variance is computed as the expectation of the squared deviation of a random variable from its mean.
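
The statistical computation (the mean of the squared deviations from the mean) can be sketched as below. The sample values are made up, and the sketch computes the population variance, dividing by n rather than n - 1.

```python
import statistics

def variance(samples):
    """Mean of the squared deviations of the samples from their mean."""
    mean = sum(samples) / len(samples)
    return sum((x - mean) ** 2 for x in samples) / len(samples)

samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical measurements
result = variance(samples)
# The standard library's population variance should agree with the sketch.
print(result, statistics.pvariance(samples))
```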

Verify / Verification

The process of verifying that labeled data has been labeled correctly in adherence to the ground truth.

Video Frame Interpolation

Is to synthesize several frames in the middle of two adjacent frames of video. Video Frame Interpolation can be applied.

Visual Dictionary

A document that defines every model concept with a written definition and also a wide array of visual examples. This helps establish ground truth by providing confirmation that each involved party understands the model’s worldview, or taxonomy.

Visual Recognition

The ability of software to identify objects, places, people, writing, and actions in images and videos.

Visual Match

Instead of doing a search which returns the items in the database in sorted order, a visual match could be considered as returning a yes/no answer of whether the query is close enough to any item in the database to be considered a “match.”

Visual Search

The ability of software to find visually similar content based on an image or video query.

Weak AI

Also known as narrow AI, weak AI refers to a non-sentient computer system that operates within a predetermined range of skills and usually focuses on a singular task or small set of tasks. Most AI in use today is weak AI.


Webhook

A webhook is a way for one computer program to send a message or data to another program over the internet in real-time. It works by sending the message or data to a specific URL, which belongs to the other program. Webhooks are often used to automate processes and make it easier for different programs to communicate and work together. They are a useful tool for developers who want to build custom applications or create integrations between different software systems.
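
A sketch of the sending side, using only the Python standard library. The receiver URL, the event name, and the payload fields are hypothetical; the request is built but deliberately not sent, so the example runs offline.

```python
import json
import urllib.request

def build_webhook_request(url: str, event: dict) -> urllib.request.Request:
    """Package an event as a JSON POST request addressed to the receiver's URL."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_webhook_request(
    "https://example.com/hooks/training-finished",             # hypothetical receiver URL
    {"event": "training_finished", "model_id": "demo-model"},  # hypothetical payload
)
# urllib.request.urlopen(request) would deliver the message; it is not called here.
print(request.get_method(), request.full_url)
```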


Weight

A coefficient for a feature in a linear model, or an edge in a deep network. The goal of training a linear model is to determine the ideal weight for each feature. If a weight is 0, then its corresponding feature does not contribute to the model.
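
The effect of a zero weight can be seen in a small sketch of a linear model's prediction. The weights and feature values are made up for illustration; changing the feature whose weight is 0 leaves the output unchanged.

```python
def linear_predict(weights, features, bias=0.0):
    """Weighted sum of the features plus a bias term."""
    return bias + sum(w * x for w, x in zip(weights, features))

weights = [2.0, 0.0, -1.0]  # the second feature has weight 0
first = linear_predict(weights, [1.0, 5.0, 3.0])
second = linear_predict(weights, [1.0, 500.0, 3.0])  # only the second feature differs
print(first, second)
```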


Workflow

Enables users to make predictions on a graph that combines one or more pre-trained, custom models and fixed function model operators using a single API call.


YAML

A human-readable data serialization language that is now commonly used as a format for configuration files.


