// Copyright 2017-2020 The Khronos Group. This work is licensed under a
// Creative Commons Attribution 4.0 International License; see
// http://creativecommons.org/licenses/by/4.0/
[appendix]
[[changes_to_opencl]]
= Changes to OpenCL
Changes to the OpenCL API and OpenCL C between successive versions are
summarized below.
// (Jon) Are these section and table numbers for the current spec, in which
// case they should turn into asciidoctor xrefs, or to older specs?
== Summary of changes from OpenCL 1.0 to OpenCL 1.1
The following features are added to the OpenCL 1.1 platform layer and
runtime (_sections 4 and 5_):
* The following queries to _table 4.3_:
** {CL_DEVICE_NATIVE_VECTOR_WIDTH_CHAR},
{CL_DEVICE_NATIVE_VECTOR_WIDTH_SHORT},
{CL_DEVICE_NATIVE_VECTOR_WIDTH_INT},
{CL_DEVICE_NATIVE_VECTOR_WIDTH_LONG},
{CL_DEVICE_NATIVE_VECTOR_WIDTH_FLOAT},
{CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE},
{CL_DEVICE_NATIVE_VECTOR_WIDTH_HALF}
** {CL_DEVICE_HOST_UNIFIED_MEMORY}
** {CL_DEVICE_OPENCL_C_VERSION}
* {CL_CONTEXT_NUM_DEVICES} to the list of queries specified to
{clGetContextInfo}.
* Optional image formats: {CL_Rx}, {CL_RGx}, and {CL_RGBx}.
* Support for sub-buffer objects: the ability to create a buffer object that
refers to a specific region in another buffer object using
{clCreateSubBuffer}.
* {clEnqueueReadBufferRect}, {clEnqueueWriteBufferRect} and
{clEnqueueCopyBufferRect} APIs to read from, write to and copy a
rectangular region of a buffer object respectively.
* {clSetMemObjectDestructorCallback} API to allow a user to register a
callback function that will be called when the memory object is deleted
and its resources freed.
* Options that <<opencl-c-version, control the OpenCL C version>> used
when building a program executable.
* {CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE} to the list of queries
specified to {clGetKernelWorkGroupInfo}.
* Support for user events.
User events allow applications to enqueue commands that wait on a user
event to finish before the command is executed by the device.
The following new APIs are added: {clCreateUserEvent} and
{clSetUserEventStatus}.
* {clSetEventCallback} API to register a callback function for a specific
command execution status.
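
As an illustration of the rectangular-region commands listed above, the sketch below models in plain host C how a rectangle is addressed inside a linear buffer: the byte offset of the first element is `origin[2] * slice_pitch + origin[1] * row_pitch + origin[0]`.
The names `rect_offset` and `copy_rect_model` are hypothetical and are not part of the OpenCL API; the code only mimics the addressing on ordinary host memory and does not call {clEnqueueCopyBufferRect}.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte offset of a rectangle's first element; origin[0] is in bytes,
   origin[1] in rows and origin[2] in slices. */
size_t rect_offset(const size_t origin[3],
                   size_t row_pitch, size_t slice_pitch)
{
    return origin[2] * slice_pitch + origin[1] * row_pitch + origin[0];
}

/* Hypothetical host-side model of a rectangular copy between two linear
   byte arrays. region[0] is a width in bytes, region[1] a row count and
   region[2] a slice count. */
void copy_rect_model(const unsigned char *src, unsigned char *dst,
                     const size_t src_origin[3],
                     const size_t dst_origin[3],
                     const size_t region[3],
                     size_t src_row_pitch, size_t src_slice_pitch,
                     size_t dst_row_pitch, size_t dst_slice_pitch)
{
    assert(src != NULL && dst != NULL);
    size_t src_base = rect_offset(src_origin, src_row_pitch, src_slice_pitch);
    size_t dst_base = rect_offset(dst_origin, dst_row_pitch, dst_slice_pitch);

    for (size_t z = 0; z < region[2]; ++z) {
        for (size_t y = 0; y < region[1]; ++y) {
            /* Copy one row of the rectangle at a time. */
            memcpy(dst + dst_base + z * dst_slice_pitch + y * dst_row_pitch,
                   src + src_base + z * src_slice_pitch + y * src_row_pitch,
                   region[0]);
        }
    }
}
```

Copying a 2 x 2 byte region starting at column 1, row 1 of a 4 x 4 byte source (row pitch 4) into a tightly packed destination (row pitch 2) reads the source bytes at offsets 5, 6, 9 and 10.
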
The following modifications are made to the OpenCL 1.1 platform layer and
runtime (_sections 4 and 5_):
* The following queries in _table 4.3_:
** The minimum FULL_PROFILE value for {CL_DEVICE_MAX_PARAMETER_SIZE}
increased from 256 to 1024 bytes
** The minimum FULL_PROFILE value for {CL_DEVICE_LOCAL_MEM_SIZE} increased
from 16 KB to 32 KB.
* The _global_work_offset_ argument in {clEnqueueNDRangeKernel} can be a
non-`NULL` value.
* All API calls except {clSetKernelArg} are thread-safe.
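
To make the effect of a non-`NULL` _global_work_offset_ concrete, the following sketch models the global ID of a work-item in one dimension as the offset plus the work-group and local contributions, assuming uniform work-group sizes.
The function name `global_id_model` is hypothetical; in a kernel the corresponding values come from *get_global_offset*, *get_group_id*, *get_local_size* and *get_local_id*.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side model of a one-dimensional global ID when a global work
   offset is in effect. */
size_t global_id_model(size_t global_offset, size_t group_id,
                       size_t local_size, size_t local_id)
{
    assert(local_id < local_size);
    return global_offset + group_id * local_size + local_id;
}
```

With an offset of 100, work-item 3 of work-group 2 in a dimension with a local size of 64 gets global ID 231 rather than 131.
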
The following features are added to the OpenCL C programming language
(_section 6_) in OpenCL 1.1:
* 3-component vector data types.
* New built-in functions
** *get_global_offset* work-item function defined in section _6.15.1_.
** *minmag*, *maxmag* math functions defined in section _6.15.2_.
** *clamp* integer function defined in _section 6.15.3_.
** (vector, scalar) variant of integer functions *min* and *max* in
_section 6.12.3_.
** *async_work_group_strided_copy* defined in section _6.15.11_.
** *vec_step*, *shuffle* and *shuffle2* defined in section _6.15.13_.
* *cl_khr_byte_addressable_store* extension is a core feature.
* *cl_khr_global_int32_base_atomics*,
*cl_khr_global_int32_extended_atomics*,
*cl_khr_local_int32_base_atomics* and
*cl_khr_local_int32_extended_atomics* extensions are core features.
The built-in atomic function names are changed to use the *atomic_*
prefix instead of *atom_*.
* Macros `CL_VERSION_1_0` and `CL_VERSION_1_1`.
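
The semantics of some of the new built-in functions can be sketched in host C.
The model below assumes that *minmag* and *maxmag* return the argument with the smaller or larger magnitude and fall back to the ordinary minimum or maximum on a tie, and that the integer *clamp* is `min(max(x, minval), maxval)`.
The `_model` function names are hypothetical, the code is scalar only, and NaN handling is deliberately left out.

```c
#include <assert.h>

/* Absolute value helper; avoids a dependency on the math library. */
float abs_model(float v) { return v < 0.0f ? -v : v; }

/* minmag: the argument with the smaller magnitude; on a tie, the
   smaller value. */
float minmag_model(float x, float y)
{
    float ax = abs_model(x), ay = abs_model(y);
    if (ax < ay) return x;
    if (ay < ax) return y;
    return x < y ? x : y;
}

/* maxmag: the argument with the larger magnitude; on a tie, the
   larger value. */
float maxmag_model(float x, float y)
{
    float ax = abs_model(x), ay = abs_model(y);
    if (ax > ay) return x;
    if (ay > ax) return y;
    return x > y ? x : y;
}

/* Integer clamp: min(max(x, minval), maxval). The model rejects
   minval > maxval, for which the result is not meaningful. */
int clamp_model(int x, int minval, int maxval)
{
    assert(minval <= maxval);
    int lower_bounded = x > minval ? x : minval;
    return lower_bounded < maxval ? lower_bounded : maxval;
}
```
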
The following features in OpenCL 1.0 are deprecated (see glossary) in OpenCL
1.1:
// Bugzilla 6140
* The {clSetCommandQueueProperty} API is deprecated, which simplifies
implementations and possibly improves performance by enforcing that
command queue properties are invariant.
Applications are encouraged to create multiple command queues with
different properties rather than modifying the properties of a single
command queue.
// Bugzilla 6628
* The `-cl-strict-aliasing` build option has been deprecated.
It is no longer required after defining type-based aliasing rules.
// Bugzilla 5593 and 6068
* The *cl_khr_select_fprounding_mode* extension is deprecated and its
use is no longer recommended.
The following new extensions are added to _section 9_ in OpenCL 1.1:
* *cl_khr_gl_event* for creating a CL event object from a GL sync object.
* *cl_khr_d3d10_sharing* for sharing memory objects with Direct3D 10.
The following modifications are made to the OpenCL ES Profile described in
_section 10_ in OpenCL 1.1:
* 64-bit integer support is optional.
== Summary of changes from OpenCL 1.1 to OpenCL 1.2
The following features are added to the OpenCL 1.2 platform layer and
runtime (_sections 4 and 5_):
* Custom devices and built-in kernels are supported.
{clCreateProgramWithBuiltInKernels} has been added to allow creation of
a {cl_program_TYPE} using built-in kernels.
* Device partitioning that allows a device to be partitioned based on a
number of partitioning schemes supported by the device. This is done by
using {clCreateSubDevices} to create a new {cl_device_id_TYPE} based on a
partitioning.
* {clCompileProgram} and {clLinkProgram} to allow handling these aspects
of {clBuildProgram} separately.
* Extend {cl_mem_flags_TYPE} to describe how the host accesses the data in a
{cl_mem_TYPE} object.
* {clEnqueueFillBuffer} and {clEnqueueFillImage} to support filling a
buffer with a pattern or an image with a color.
* Add {CL_MAP_WRITE_INVALIDATE_REGION} to {cl_map_flags_TYPE}.
Appropriate clarification to the behavior of {CL_MAP_WRITE} has been added
to the spec.
* New image types: 1D images, 1D images from a buffer object, 1D image
arrays and 2D image arrays.
* {clCreateImage} to create an image object.
* {clEnqueueMigrateMemObjects} API that allows a developer to have
explicit control over the location of memory objects or to migrate a
memory object from one device to another.
* Support separate compilation and linking of programs.
* Additional queries to get the number of kernels and kernel names in a
program have been added to {clGetProgramInfo}.
* Additional queries to get the compile and link status and options have
been added to {clGetProgramBuildInfo}.
* {clGetKernelArgInfo} API that returns information about the arguments of
a kernel.
* {clEnqueueMarkerWithWaitList} and {clEnqueueBarrierWithWaitList} APIs.
* {clUnloadPlatformCompiler} to request that a single platform's compiler is
unloaded. This is compatible with the *cl_khr_icd* extension if that is
supported, unlike {clUnloadCompiler}.
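
The pattern-fill behavior added with {clEnqueueFillBuffer} can be pictured with a small host-side model.
This is a sketch on ordinary memory, not a call into the OpenCL runtime: `fill_pattern_model` is a hypothetical name, the `-1` return value merely stands in for an invalid-value error, and the additional restriction on permitted pattern sizes is not modelled.
It assumes that both the offset and the size of the filled region must be multiples of the pattern size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Repeats `pattern` across buf[offset .. offset + size).
   Returns 0 on success, or -1 if offset or size is not a multiple of
   pattern_size. */
int fill_pattern_model(unsigned char *buf, size_t offset, size_t size,
                       const void *pattern, size_t pattern_size)
{
    assert(buf != NULL && pattern != NULL);
    if (pattern_size == 0 || offset % pattern_size != 0 ||
        size % pattern_size != 0) {
        return -1;
    }
    for (size_t pos = 0; pos < size; pos += pattern_size) {
        memcpy(buf + offset + pos, pattern, pattern_size);
    }
    return 0;
}
```
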
The following features are added to the OpenCL C programming language
(_section 6_) in OpenCL 1.2:
* Double-precision is now an optional core feature instead of an
extension.
* New built-in image types: *image1d_t*, *image1d_buffer_t*,
*image1d_array_t*, and *image2d_array_t*.
* New built-in functions
** Functions to read from and write to a 1D image, 1D and 2D image arrays
described in _sections 6.15.15.2_, _6.15.15.3_ and _6.15.15.4_.
** Sampler-less image read functions described in _section 6.15.15.3_.
** *popcount* integer function described in _section 6.15.3_.
** *printf* function described in _section 6.15.14_.
* Storage class specifiers extern and static as described in _section
6.10_.
* Macros `CL_VERSION_1_2` and `+__OPENCL_C_VERSION__+`.
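
For readers unfamiliar with *popcount*, the function returns the number of non-zero bits in its argument.
The following host C sketch models that result for a 32-bit scalar; `popcount_model` is a hypothetical name, and the OpenCL C built-in also accepts other integer widths and vector types, which are not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Counts the non-zero bits of a 32-bit value by clearing the lowest
   set bit on each iteration. */
unsigned popcount_model(uint32_t x)
{
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;
        ++count;
    }
    assert(count <= 32);
    return count;
}
```
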
The following APIs in OpenCL 1.1 are deprecated (see glossary) in OpenCL
1.2:
// Bugzilla 6597
* The {clEnqueueMarker}, {clEnqueueBarrier} and {clEnqueueWaitForEvents}
APIs are deprecated to simplify the API.
The {clEnqueueMarkerWithWaitList} and {clEnqueueBarrierWithWaitList}
APIs provide equivalent functionality and support explicit event
wait lists.
// No Bugzilla
* The {clCreateImage2D}, {clCreateImage3D}, {clCreateFromGLTexture2D} and
{clCreateFromGLTexture3D} APIs are deprecated to simplify the API.
The {clCreateImage} and {clCreateFromGLTexture} APIs provide equivalent
functionality and support additional image types and properties.
// Bugzilla 5391 - cl_khr_icd specification
* The {clUnloadCompiler} and {clGetExtensionFunctionAddress} APIs are deprecated.
The {clUnloadPlatformCompiler} and {clGetExtensionFunctionAddressForPlatform}
APIs provide equivalent functionality and are compatible with the *cl_khr_icd*
extension.
The following queries are deprecated (see glossary) in OpenCL 1.2:
// Bugzilla 7832
* The {CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE} query is deprecated.
The minimum data type alignment can be derived from
{CL_DEVICE_MEM_BASE_ADDR_ALIGN}.
== Summary of changes from OpenCL 1.2 to OpenCL 2.0
The following features are added to the OpenCL 2.0 platform layer and
runtime (_sections 4 and 5_):
* Shared virtual memory. The associated API additions are:
** {clSetKernelArgSVMPointer} to control which shared virtual memory (SVM)
pointer to associate with a kernel instance.
** {clSVMAlloc}, {clSVMFree} and {clEnqueueSVMFree} to allocate and free
memory for use with SVM.
** {clEnqueueSVMMap} and {clEnqueueSVMUnmap} to map and unmap regions of an
SVM buffer so that they can be updated from the host.
** {clEnqueueSVMMemcpy} and {clEnqueueSVMMemFill} to copy or fill SVM memory
regions.
* Device queues used to enqueue kernels on the device.
** {clCreateCommandQueueWithProperties} is added to allow creation of a
command queue with properties that affect both host command queues and
device queues.
* Pipes.
** {clCreatePipe} and {clGetPipeInfo} have been added to the API for host
side creation and querying of pipes.
* Image support for 2D images created from a buffer, depth images and sRGB images.
* {clCreateSamplerWithProperties}.
The following modifications are made to the OpenCL 2.0 platform layer and
runtime (sections 4 and 5):
* All API calls except {clSetKernelArg}, {clSetKernelArgSVMPointer} and
{clSetKernelExecInfo} are thread-safe.
Note that this statement does not imply that other API calls were not
thread-safe in earlier versions of the specification.
The following features are added to the OpenCL C programming language
(_section 6_) in OpenCL 2.0:
* Clang Blocks.
* Kernels enqueuing kernels to a device queue.
* Program scope variables in global address space.
* Generic address space.
* C11 atomics.
* New built-in functions (sections 6.15.10, 6.15.12, and 6.15.16).
* Support images with the read_write qualifier.
* 3D image writes are a core feature.
* The `CL_VERSION_2_0` and `NULL` macros.
The following APIs are deprecated (see glossary) in OpenCL 2.0:
// Bugzilla 7971
* The {clCreateCommandQueue} API has been deprecated to simplify
the API.
The {clCreateCommandQueueWithProperties} API provides equivalent
functionality and supports specifying additional command queue
properties.
// Bugzilla 8093 - cl_khr_mipmap_image specification
* The {clCreateSampler} API has been deprecated to simplify the
API.
The {clCreateSamplerWithProperties} API provides equivalent
functionality and supports specifying additional sampler
properties.
// Bugzilla 10270
* The {clEnqueueTask} API has been deprecated to simplify the API.
The {clEnqueueNDRangeKernel} API provides equivalent functionality.
The following queries are deprecated (see glossary) in OpenCL 2.0:
// Bugzilla 7156
* The {CL_DEVICE_HOST_UNIFIED_MEMORY} query is deprecated.
This query was purely informational and had different meanings
for different implementations.
Its use is no longer recommended.
// Bugzilla 7954
* The {CL_IMAGE_BUFFER} query has been deprecated to simplify the API.
The {CL_MEM_ASSOCIATED_MEMOBJECT} query provides equivalent
functionality.
// Bugzilla 7971
* The {CL_DEVICE_QUEUE_PROPERTIES} query has been deprecated and
replaced by {CL_DEVICE_QUEUE_ON_HOST_PROPERTIES}.
// Bugzilla 8761
* Atomics and Fences
** The Explicit Memory Fence Functions defined in section 6.12.9 of the
OpenCL 1.2 specification have been deprecated to simplify the
programming language.
The *atomic_work_item_fence* function provides equivalent
functionality.
The deprecated functions are still described in section 6.15.9 of this
specification.
** The Atomic Functions defined in section 6.12.11 of the OpenCL 1.2
specification have been deprecated to simplify the programming
language.
The *atomic_fetch* and modify functions provide equivalent
functionality.
The deprecated functions are still described in section 6.15.12.8 of this
specification.
== Summary of changes from OpenCL 2.0 to OpenCL 2.1
The following features are added to the OpenCL 2.1 platform layer and
runtime (_sections 4 and 5_):
* {clGetKernelSubGroupInfo} API call.
* {CL_KERNEL_MAX_NUM_SUB_GROUPS}, {CL_KERNEL_COMPILE_NUM_SUB_GROUPS}
added to table 5.21 of the API specification.
* {clCreateProgramWithIL} API call.
* {clGetHostTimer} and {clGetDeviceAndHostTimer} API calls.
* {clEnqueueSVMMigrateMem} API call.
* {clCloneKernel} API call.
* {clSetDefaultDeviceCommandQueue} API call.
* {CL_PLATFORM_HOST_TIMER_RESOLUTION} added to table 4.1 of the API
specification.
* {CL_DEVICE_IL_VERSION}, {CL_DEVICE_MAX_NUM_SUB_GROUPS},
{CL_DEVICE_SUB_GROUP_INDEPENDENT_FORWARD_PROGRESS} added to table 4.3 of
the API specification.
* {CL_PROGRAM_IL} added to table 5.17 of the API specification.
* {CL_QUEUE_DEVICE_DEFAULT} added to table 5.2 of the API specification.
* Added table 5.22 to the API specification with the enums:
{CL_KERNEL_MAX_SUB_GROUP_SIZE_FOR_NDRANGE},
{CL_KERNEL_SUB_GROUP_COUNT_FOR_NDRANGE} and
{CL_KERNEL_LOCAL_SIZE_FOR_SUB_GROUP_COUNT}
The following modifications are made to the OpenCL 2.1 platform layer and
runtime (sections 4 and 5):
* All API calls except {clSetKernelArg}, {clSetKernelArgSVMPointer},
{clSetKernelExecInfo} and {clCloneKernel} are thread-safe.
Note that this statement does not imply that other API calls were not
thread-safe in earlier versions of the specification.
Note that the OpenCL C kernel language is not updated for OpenCL 2.1.
The OpenCL 2.0 kernel language will still be consumed by OpenCL 2.1
runtimes.
The SPIR-V and OpenCL SPIR-V Environment specifications have been added.
== Summary of changes from OpenCL 2.1 to OpenCL 2.2
The following changes have been made to the OpenCL 2.2 execution model
(section 3):
* Added the third prerequisite (executing non-trivial constructors for
program scope global variables).
The following features are added to the OpenCL 2.2 platform layer and
runtime (_sections 4 and 5_):
* {clSetProgramSpecializationConstant} API call
* {clSetProgramReleaseCallback} API call
* Queries for {CL_PROGRAM_SCOPE_GLOBAL_CTORS_PRESENT} and
{CL_PROGRAM_SCOPE_GLOBAL_DTORS_PRESENT}
The following modifications are made to the OpenCL 2.2 platform layer and
runtime (sections 4 and 5):
* Modified description of {CL_DEVICE_MAX_CLOCK_FREQUENCY} query.
* Added a new error code {CL_MAX_SIZE_RESTRICTION_EXCEEDED} to the
{clSetKernelArg} API call.
Definitions of Deprecation and Specialization constants have been added to
the glossary.
== Summary of changes from OpenCL 2.2 to OpenCL 3.0
OpenCL 3.0 is a major revision that breaks backwards compatibility with
previous versions of OpenCL; see
<<opencl-3.0-backwards-compatibility, OpenCL 3.0 Backwards Compatibility>>
for details.
OpenCL 3.0 adds new queries to determine optional capabilities for a
device:
* {CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES} and
{CL_DEVICE_ATOMIC_FENCE_CAPABILITIES} to determine the
atomic memory and atomic fence capabilities of a device.
* {CL_DEVICE_NON_UNIFORM_WORK_GROUP_SUPPORT} to
determine if a device supports non-uniform work-group sizes.
* {CL_DEVICE_WORK_GROUP_COLLECTIVE_FUNCTIONS_SUPPORT}
to determine whether a device supports optional work-group
collective functions, such as broadcasts, scans, and reductions.
* {CL_DEVICE_GENERIC_ADDRESS_SPACE_SUPPORT} to
determine whether a device supports the generic address space.
* {CL_DEVICE_DEVICE_ENQUEUE_CAPABILITIES} to determine the device-side enqueue
capabilities of a device.
* {CL_DEVICE_PIPE_SUPPORT} to determine whether a device supports
pipe memory objects.
* {CL_DEVICE_PREFERRED_WORK_GROUP_SIZE_MULTIPLE} to determine the
preferred work-group size multiple for a device.
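
One way a host application might use two of these queries together is sketched below.
On a device that does not report non-uniform work-group support, the global work size needs to be a multiple of the local work size, so applications commonly pad the global size upward and guard against the extra work-items inside the kernel; the local size itself may be chosen as a multiple of the preferred work-group size multiple.
The helper `round_up_to_multiple` is a hypothetical name, and the padding strategy is an assumption about application design rather than something the specification mandates.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds `value` up to the next multiple of `multiple`; values that are
   already a multiple are returned unchanged. */
size_t round_up_to_multiple(size_t value, size_t multiple)
{
    assert(multiple > 0);
    size_t remainder = value % multiple;
    return remainder == 0 ? value : value + (multiple - remainder);
}
```

For example, 1000 work-items with a local size of 64 would be padded to a global size of 1024.
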
OpenCL 3.0 adds new queries to conveniently and precisely
describe supported features and versions:
* {CL_PLATFORM_NUMERIC_VERSION} to describe the platform
version as a numeric value.
* {CL_PLATFORM_EXTENSIONS_WITH_VERSION} to describe supported
platform extensions and their supported version.
* {CL_DEVICE_NUMERIC_VERSION} to describe the device version
as a numeric value.
* {CL_DEVICE_EXTENSIONS_WITH_VERSION} to describe supported
device extensions and their supported version.
* {CL_DEVICE_ILS_WITH_VERSION} to describe supported
intermediate languages (ILs) and their supported version.
* {CL_DEVICE_BUILT_IN_KERNELS_WITH_VERSION} to describe supported
built-in kernels and their supported version.
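
The numeric version values returned by these queries pack the major, minor and patch numbers into a single 32-bit integer.
The sketch below assumes the layout used by the `cl_version` type in the OpenCL headers (10 bits major, 10 bits minor, 12 bits patch) and re-implements the packing with hypothetical `_model` names, so that it compiles without the OpenCL headers; real code would normally use the version macros that the headers provide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants mirroring an assumed 10/10/12-bit layout. */
enum {
    VERSION_MAJOR_BITS = 10,
    VERSION_MINOR_BITS = 10,
    VERSION_PATCH_BITS = 12
};

/* Packs major.minor.patch into one 32-bit value. */
uint32_t make_version_model(uint32_t major, uint32_t minor, uint32_t patch)
{
    assert(major < (1u << VERSION_MAJOR_BITS));
    assert(minor < (1u << VERSION_MINOR_BITS));
    assert(patch < (1u << VERSION_PATCH_BITS));
    return (major << (VERSION_MINOR_BITS + VERSION_PATCH_BITS)) |
           (minor << VERSION_PATCH_BITS) | patch;
}

uint32_t version_major_model(uint32_t version)
{
    return version >> (VERSION_MINOR_BITS + VERSION_PATCH_BITS);
}

uint32_t version_minor_model(uint32_t version)
{
    return (version >> VERSION_PATCH_BITS) & ((1u << VERSION_MINOR_BITS) - 1u);
}

uint32_t version_patch_model(uint32_t version)
{
    return version & ((1u << VERSION_PATCH_BITS) - 1u);
}
```

Because the major number occupies the most significant bits, two packed versions can be ordered with a plain integer comparison, which is what makes the numeric form more convenient than parsing a version string.
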
OpenCL 3.0 adds a new API to register a function that will be called
when a context is destroyed, enabling an application to safely free
user data associated with a context callback function.
* {clSetContextDestructorCallback}
OpenCL 3.0 adds two new APIs to support creating buffer and image
memory objects with additional properties.
Although no new properties are added in OpenCL 3.0, these APIs enable
new buffer and image extensions to be added easily and consistently:
* {clCreateBufferWithProperties}
* {clCreateImageWithProperties}
OpenCL 3.0 adds new queries for the properties arrays specified
when creating buffers, images, pipes, samplers, and command queues:
* {CL_MEM_PROPERTIES}
* {CL_PIPE_PROPERTIES}
* {CL_SAMPLER_PROPERTIES}
* {CL_QUEUE_PROPERTIES_ARRAY}
// GitHub issue #348
Program initialization and clean-up kernels are not supported in OpenCL
3.0 due to implementation complexity and lack of demand.
The following APIs and queries for program initialization and clean-up
kernels are deprecated in OpenCL 3.0:
* {CL_PROGRAM_SCOPE_GLOBAL_CTORS_PRESENT}
* {CL_PROGRAM_SCOPE_GLOBAL_DTORS_PRESENT}
* {clSetProgramReleaseCallback}
OpenCL 3.0 adds the OpenCL 3.0 C kernel language, which includes
feature macros to describe OpenCL C language support.
Please refer to the OpenCL C specification for details.
// GitHub issue #178
Scalar input arguments to the *any* and *all* built-in functions have
been deprecated in the OpenCL 3.0 C kernel language.
These functions behaved inconsistently with the C language's use of
scalar integers as logical values.
OpenCL 3.0 adds new queries to determine supported OpenCL C language
versions and supported OpenCL C features:
* {CL_DEVICE_OPENCL_C_ALL_VERSIONS} to determine the set
of OpenCL C language versions supported by a device.
* {CL_DEVICE_OPENCL_C_FEATURES} to determine
optional OpenCL C language features supported by a device.
OpenCL 3.0 adds an event command type to identify events
associated with the OpenCL 2.1 command {clEnqueueSVMMigrateMem}:
* {CL_COMMAND_SVM_MIGRATE_MEM}
OpenCL 3.0 adds a new query to determine the latest version of the conformance
test suite that the device has fully passed in accordance with the official
conformance process:
* {CL_DEVICE_LATEST_CONFORMANCE_VERSION_PASSED}