
 // Copyright 2017-2020 The Khronos Group. This work is licensed under a
 // Creative Commons Attribution 4.0 International License; see
 // http://creativecommons.org/licenses/by/4.0/

 [[opencl-runtime]]
 = The OpenCL Runtime

 This section describes the API calls that manage OpenCL objects such as
 command-queues, memory objects, program objects, and kernel objects for kernel
 functions in a program, and the calls that enqueue commands to a
 command-queue, such as executing a kernel or reading or writing a memory
 object.

 == Command Queues

 OpenCL objects such as memory, program and kernel objects are created using
 a context.
 Operations on these objects are performed using a command-queue.
 The command-queue can be used to queue a set of operations (referred to as
 commands) in order.
 Having multiple command-queues allows applications to queue multiple
 independent commands without requiring synchronization.
 This works as long as the command-queues do not share objects.
 Sharing of objects across multiple command-queues will require the
 application to perform appropriate synchronization.
 This is described in <<shared-opencl-objects, Shared OpenCL Objects>>.

 [open,refpage='clCreateCommandQueueWithProperties',desc='Create a host or device command-queue on a specific device.',type='protos']
 --
 To create a host or device command-queue on a specific device, call the
 function

 include::{generated}/api/protos/clCreateCommandQueueWithProperties.txt[]
 include::{generated}/api/version-notes/clCreateCommandQueueWithProperties.asciidoc[]
 Also see extension *cl_khr_create_command_queue*.

   * _context_ must be a valid OpenCL context.
   * _device_ must be a device or sub-device associated with _context_.
     It can either be in the list of devices and sub-devices specified when
     _context_ is created using {clCreateContext} or be a root device with the
     same device type as specified when _context_ is created using
     {clCreateContextFromType}.
   * _properties_ specifies a list of properties for the command-queue and their
     corresponding values.
     Each property name is immediately followed by the corresponding desired
     value.
     The list is terminated with 0.
     The list of supported properties is described in the
     <<queue-properties-table, table below>>.
     If a supported property and its value are not specified in _properties_, its
     default value will be used.
     _properties_ can be `NULL` in which case the default values for supported
     command-queue properties will be used.
   * _errcode_ret_ will return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.

 [[queue-properties-table]]
 .List of supported queue creation properties by <<clCreateCommandQueueWithProperties>>
 [width="100%",cols="<33%,<17%,<50%",options="header"]
 |====
 | Queue Property | Property Value | Description
 | {CL_QUEUE_PROPERTIES_anchor}

 include::{generated}/api/version-notes/CL_QUEUE_PROPERTIES.asciidoc[]
   | {cl_command_queue_properties_TYPE}
       | This is a bitfield and can be set to a combination of the following
         values:

         {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE_anchor} - Determines whether the
         commands queued in the command-queue are executed in-order or
         out-of-order.
         If set, the commands in the command-queue are executed out-of-order.
         Otherwise, commands are executed in-order.
 include::{generated}/api/version-notes/CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE.asciidoc[]

         {CL_QUEUE_PROFILING_ENABLE_anchor} - Enable or disable profiling of commands
         in the command-queue.
         If set, the profiling of commands is enabled.
         Otherwise profiling of commands is disabled.
 include::{generated}/api/version-notes/CL_QUEUE_PROFILING_ENABLE.asciidoc[]

         {CL_QUEUE_ON_DEVICE_anchor} - Indicates that this is a device queue.
         If {CL_QUEUE_ON_DEVICE} is set,
         {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE}
         footnote:[{fn-out-of-order-device-queue}]
         must also be set.
 include::{generated}/api/version-notes/CL_QUEUE_ON_DEVICE.asciidoc[]

         {CL_QUEUE_ON_DEVICE_DEFAULT_anchor}
         footnote:[{fn-default-device-queue}] -
         indicates that this is the default device queue.
         This can only be used with {CL_QUEUE_ON_DEVICE}.
 include::{generated}/api/version-notes/CL_QUEUE_ON_DEVICE_DEFAULT.asciidoc[]

         If {CL_QUEUE_PROPERTIES} is not specified, an in-order host command
         queue is created for the specified device.
 | {CL_QUEUE_SIZE_anchor}

 include::{generated}/api/version-notes/CL_QUEUE_SIZE.asciidoc[]
   | {cl_uint_TYPE}
       | Specifies the size of the device queue in bytes.

         This can only be specified if {CL_QUEUE_ON_DEVICE} is set in
         {CL_QUEUE_PROPERTIES}.
         This must be a value {leq} {CL_DEVICE_QUEUE_ON_DEVICE_MAX_SIZE}.

         For best performance, this should be {leq}
         {CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE}.

         If {CL_QUEUE_SIZE} is not specified, the device queue is created with
         {CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE} as the size of the queue.
 |====
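
As a rough illustration of the _properties_ layout described above, the following C sketch builds a zero-terminated name/value list for a device queue, walks it pair by pair, and checks the bitfield constraints from the table.
It is only a sketch under stated assumptions: the constants are copied locally from the Khronos `CL/cl.h` header so that it compiles without an OpenCL SDK, and `queue_prop_t`, `build_device_queue_props`, `find_queue_prop`, and `queue_flags_consistent` are hypothetical names that are not part of the OpenCL API.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copies of values from the Khronos CL/cl.h header, so the sketch
 * builds without an OpenCL SDK. Real code would include the header. */
typedef uint64_t queue_prop_t;  /* stands in for cl_queue_properties */
#define CL_QUEUE_PROPERTIES                     0x1093
#define CL_QUEUE_SIZE                           0x1094
#define CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE  (1u << 0)
#define CL_QUEUE_PROFILING_ENABLE               (1u << 1)
#define CL_QUEUE_ON_DEVICE                      (1u << 2)
#define CL_QUEUE_ON_DEVICE_DEFAULT              (1u << 3)

/* Hypothetical helper: fill `out` (at least 5 entries) with the name/value
 * pairs for an out-of-order device queue, ending with the 0 terminator.
 * Returns the number of entries written, terminator included. */
static size_t build_device_queue_props(queue_prop_t out[5], uint32_t size_bytes)
{
    size_t i = 0;
    out[i++] = CL_QUEUE_PROPERTIES;
    out[i++] = CL_QUEUE_ON_DEVICE | CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE;
    out[i++] = CL_QUEUE_SIZE;
    out[i++] = size_bytes;
    out[i++] = 0;  /* list terminator */
    return i;
}

/* Hypothetical helper: walk a 0-terminated list two entries at a time.
 * Returns 1 and stores the value if `name` is present, 0 otherwise.
 * A NULL list means "use all defaults", so nothing is found. */
static int find_queue_prop(const queue_prop_t *props, queue_prop_t name,
                           queue_prop_t *value)
{
    if (props == NULL)
        return 0;
    for (size_t i = 0; props[i] != 0; i += 2) {
        if (props[i] == name) {
            *value = props[i + 1];
            return 1;
        }
    }
    return 0;
}

/* Checks the two bitfield constraints stated in the table: a device queue
 * must be out-of-order, and the default flag requires a device queue. */
static int queue_flags_consistent(queue_prop_t flags)
{
    if ((flags & CL_QUEUE_ON_DEVICE) &&
        !(flags & CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE))
        return 0;
    if ((flags & CL_QUEUE_ON_DEVICE_DEFAULT) && !(flags & CL_QUEUE_ON_DEVICE))
        return 0;
    return 1;
}
```

In a real application the filled array would be passed as the _properties_ argument of {clCreateCommandQueueWithProperties}, and the implementation, not the caller, decides which error is reported for an inconsistent list.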

 // refError

 {clCreateCommandQueueWithProperties} returns a valid non-zero command-queue
 and _errcode_ret_ is set to {CL_SUCCESS} if the command-queue is created
 successfully.
 Otherwise, it returns a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_DEVICE} if _device_ is not a valid device or is not associated
     with _context_.
   * {CL_INVALID_VALUE} if values specified in _properties_ are not valid.
   * {CL_INVALID_QUEUE_PROPERTIES} if values specified in _properties_ are
     valid but are not supported by the device.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clCreateCommandQueue',desc='Create a host command-queue on a specific device.',type='protos']
 --
 To create a host command-queue on a specific device, call the function

 include::{generated}/api/protos/clCreateCommandQueue.txt[]
 include::{generated}/api/version-notes/clCreateCommandQueue.asciidoc[]

   * _context_ must be a valid OpenCL context.
   * _device_ must be a device or sub-device associated with _context_.
     It can either be in the list of devices and sub-devices specified when
     _context_ is created using {clCreateContext} or be a root device with the
     same device type as specified when _context_ is created using
     {clCreateContextFromType}.
   * _properties_ specifies a list of properties for the command-queue.
     This is a bit-field and the supported properties are described in the
     <<legacy-queue-properties-table, table>> below.
     Only command-queue properties specified in this table can be used,
     otherwise the value specified in _properties_ is considered to be not
     valid.
     _properties_ can be 0 in which case the default values for supported
     command-queue properties will be used.

 [[legacy-queue-properties-table]]
 .List of supported `cl_command_queue_property` values by <<clCreateCommandQueue>>
 [width="100%",cols="<50%,<50%",options="header"]
 |====
 | Command-Queue Properties | Description
 | {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE}
   | Determines whether the commands queued in the command-queue are executed
     in-order or out-of-order.
     If set, the commands in the command-queue are executed out-of-order.
     Otherwise, commands are executed in-order.

 | {CL_QUEUE_PROFILING_ENABLE}
   | Enable or disable profiling of commands in the command-queue.
     If set, the profiling of commands is enabled.
     Otherwise profiling of commands is disabled.
 |====

   * _errcode_ret_ will return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.
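
The rule that only the bits listed in the table are valid can be sketched as a simple mask test.
The bit values below are copied locally from the Khronos `CL/cl.h` header, and `legacy_queue_properties_valid` is a hypothetical helper used for illustration only.

```c
#include <stdint.h>

/* Bit values as defined in the Khronos CL/cl.h header (copied locally). */
#define CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE  (1u << 0)
#define CL_QUEUE_PROFILING_ENABLE               (1u << 1)

/* Hypothetical check: returns 1 when `properties` uses only bits that
 * appear in the legacy queue properties table, 0 otherwise. */
static int legacy_queue_properties_valid(uint64_t properties)
{
    const uint64_t allowed = CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE |
                             CL_QUEUE_PROFILING_ENABLE;
    return (properties & ~allowed) == 0;
}
```

A value of 0 passes the check, which matches the statement that 0 selects the default properties.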

 // refError

 {clCreateCommandQueue} returns a valid non-zero command-queue and _errcode_ret_
 is set to {CL_SUCCESS} if the command-queue is created successfully.
 Otherwise, it returns a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_DEVICE} if _device_ is not a valid device or is not associated
     with _context_.
   * {CL_INVALID_VALUE} if values specified in _properties_ are not valid.
   * {CL_INVALID_QUEUE_PROPERTIES} if values specified in _properties_ are
     valid but are not supported by the device.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clSetDefaultDeviceCommandQueue',desc='Replaces the default command queue on the device.',type='protos']
 --
 To replace the default command queue on a device, call the function

 include::{generated}/api/protos/clSetDefaultDeviceCommandQueue.txt[]
 include::{generated}/api/version-notes/clSetDefaultDeviceCommandQueue.asciidoc[]

   * _context_ is the OpenCL context used to create _command_queue_.
   * _device_ is a valid OpenCL device associated with _context_.
   * _command_queue_ specifies a command queue object which replaces the
     default device command queue.

 {clSetDefaultDeviceCommandQueue} may be used to replace a default device
 command queue created with {clCreateCommandQueueWithProperties} and the
 {CL_QUEUE_ON_DEVICE_DEFAULT} flag.

 // refError

 {clSetDefaultDeviceCommandQueue} returns {CL_SUCCESS} if the function is
 executed successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_DEVICE} if _device_ is not a valid device or is not associated
     with _context_.
   * {CL_INVALID_OPERATION} if _device_ does not support a replaceable default on-device queue.
   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid command-queue
     for _device_.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clRetainCommandQueue',desc='Increments the command_queue reference count.',type='protos']
 --
 To retain a command queue, call the function

 include::{generated}/api/protos/clRetainCommandQueue.txt[]
 include::{generated}/api/version-notes/clRetainCommandQueue.asciidoc[]

   * _command_queue_ specifies the command-queue to be retained.

 The _command_queue_ reference count is incremented.

 {clCreateCommandQueueWithProperties} and {clCreateCommandQueue} perform an
 implicit retain.
 This is very helpful for third-party libraries, which typically get a
 command-queue passed to them by the application.
 However, it is possible that the application may delete the command-queue
 without informing the library.
 Allowing functions to attach to (i.e. retain) and release a command-queue
 solves the problem of a command-queue being used by a library no longer
 being valid.

 // refError

 {clRetainCommandQueue} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid
     command-queue.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clReleaseCommandQueue',desc='Decrements the command_queue reference count.',type='protos']
 --
 To release a command queue, call the function

 include::{generated}/api/protos/clReleaseCommandQueue.txt[]
 include::{generated}/api/version-notes/clReleaseCommandQueue.asciidoc[]

   * _command_queue_ specifies the command-queue to be released.

 The _command_queue_ reference count is decremented.

 After the _command_queue_ reference count becomes zero and all commands
 queued to _command_queue_ have finished (e.g. kernel-instances, memory
 object updates, etc.), the command-queue is deleted.

 {clReleaseCommandQueue} performs an implicit flush to issue any previously
 queued OpenCL commands in _command_queue_.
 Using this function to release a reference that was not obtained by creating
 the object or by calling {clRetainCommandQueue} causes undefined behavior.
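
The retain and release rules can be illustrated with a toy model in C.
This is a sketch only: `model_queue` and the `model_*` functions are hypothetical and do not call the OpenCL API, and a real implementation tracks this state internally in whatever way it chooses.
The model assumes that creation leaves the reference count at one, that release flushes and decrements, and that deletion happens only once the count is zero and no queued command is still unfinished.

```c
#include <stdbool.h>

/* Toy model of a command-queue's lifetime; not an OpenCL type. */
typedef struct {
    unsigned ref_count;   /* starts at 1: creation performs an implicit retain */
    unsigned unfinished;  /* queued commands that have not completed yet */
    bool     flushed;     /* set by the implicit flush performed on release */
    bool     deleted;     /* true once the queue has been deleted */
} model_queue;

static model_queue model_create(void)
{
    model_queue q = {1u, 0u, false, false};
    return q;
}

/* Deletion needs both conditions: no references and no unfinished commands. */
static void model_try_delete(model_queue *q)
{
    if (q->ref_count == 0 && q->unfinished == 0)
        q->deleted = true;
}

static void model_retain(model_queue *q)  { q->ref_count++; }
static void model_enqueue(model_queue *q) { q->unfinished++; }

/* Release: implicit flush, then decrement. The guard only keeps the model
 * well defined; in the real API an unmatched release is undefined behavior. */
static void model_release(model_queue *q)
{
    q->flushed = true;
    if (q->ref_count > 0)
        q->ref_count--;
    model_try_delete(q);
}

static void model_command_finished(model_queue *q)
{
    if (q->unfinished > 0)
        q->unfinished--;
    model_try_delete(q);
}
```

In this model a library that retains a queue handed to it by the application keeps the queue alive after the application releases its own reference, which is the situation the retain mechanism is meant to handle.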

 // refError

 {clReleaseCommandQueue} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid
     command-queue.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clGetCommandQueueInfo',desc='Query information about a command-queue.',type='protos']
 --
 To query information about a command-queue, call the function

 include::{generated}/api/protos/clGetCommandQueueInfo.txt[]
 include::{generated}/api/version-notes/clGetCommandQueueInfo.asciidoc[]

   * _command_queue_ specifies the command-queue being queried.
   * _param_name_ specifies the information to query.
   * _param_value_ is a pointer to memory where the appropriate result being
     queried is returned.
     If _param_value_ is `NULL`, it is ignored.
   * _param_value_size_ is used to specify the size in bytes of memory pointed to
     by _param_value_.
     This size must be {geq} size of return type as described in the
     <<command-queue-param-table,Command Queue Parameter>> table.
     If _param_value_ is `NULL`, it is ignored.
   * _param_value_size_ret_ returns the actual size in bytes of data being
     queried by _param_name_.
     If _param_value_size_ret_ is `NULL`, it is ignored.

 The list of supported _param_name_ values and the information returned in
 _param_value_ by {clGetCommandQueueInfo} is described in the
 <<command-queue-param-table,Command Queue Parameter>> table.

 [[command-queue-param-table]]
 .List of supported param_names by <<clGetCommandQueueInfo>>
 [width="100%",cols="<33%,<17%,<50%",options="header"]
 |====
 | Queue Info | Return Type | Description
 | {CL_QUEUE_CONTEXT_anchor}

 include::{generated}/api/version-notes/CL_QUEUE_CONTEXT.asciidoc[]
   | {cl_context_TYPE}
       | Return the context specified when the command-queue is created.
 | {CL_QUEUE_DEVICE_anchor}

 include::{generated}/api/version-notes/CL_QUEUE_DEVICE.asciidoc[]
   | {cl_device_id_TYPE}
       | Return the device specified when the command-queue is created.
 | {CL_QUEUE_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}]

 include::{generated}/api/version-notes/CL_QUEUE_REFERENCE_COUNT.asciidoc[]
   | {cl_uint_TYPE}
       | Return the command-queue reference count.
 | {CL_QUEUE_PROPERTIES_anchor}

 include::{generated}/api/version-notes/CL_QUEUE_PROPERTIES.asciidoc[]
   | {cl_command_queue_properties_TYPE}
       | Return the currently specified properties for the command-queue.
         These properties are specified by the value associated with the
         {CL_QUEUE_PROPERTIES} passed in _properties_ argument in
         {clCreateCommandQueueWithProperties}, or the value of the _properties_
         argument in {clCreateCommandQueue}.

 | {CL_QUEUE_PROPERTIES_ARRAY_anchor}

 include::{generated}/api/version-notes/CL_QUEUE_PROPERTIES_ARRAY.asciidoc[]
   | {cl_queue_properties_TYPE}[]
       | Return the properties argument specified in
         {clCreateCommandQueueWithProperties}.

         If the _properties_ argument specified in
         {clCreateCommandQueueWithProperties} used to create _command_queue_
         was not `NULL`, the implementation must return the values specified in
         the properties argument in the same order and without including
         additional properties.

         If _command_queue_ was created using {clCreateCommandQueue}, or if the
         _properties_ argument specified in {clCreateCommandQueueWithProperties}
         was `NULL`, the implementation must return _param_value_size_ret_
         equal to 0, indicating that there are no properties to be returned.

 | {CL_QUEUE_SIZE_anchor}

 include::{generated}/api/version-notes/CL_QUEUE_SIZE.asciidoc[]
   | {cl_uint_TYPE}
       | Return the size of the device command-queue.
         To be considered valid for this query, _command_queue_ must be a
         device command-queue.

 | {CL_QUEUE_DEVICE_DEFAULT_anchor}

 include::{generated}/api/version-notes/CL_QUEUE_DEVICE_DEFAULT.asciidoc[]
   | {cl_command_queue_TYPE}
       | Return the current default command queue for the underlying device.
 |====
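
Because _param_value_ may be `NULL` while _param_value_size_ret_ still reports the size of the data, a variable-length result such as {CL_QUEUE_PROPERTIES_ARRAY} can be fetched with two calls: one to learn the size and one to copy the data.
The C sketch below shows that idiom against a stand-in function, so that it runs without an OpenCL implementation.
`mock_get_properties_array` is a hypothetical model of the size rules described above, not the real {clGetCommandQueueInfo}, and the error values are copied locally from `CL/cl.h`.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Error codes copied locally from the Khronos CL/cl.h header. */
#define CL_SUCCESS        0
#define CL_INVALID_VALUE  (-30)

/* Hypothetical stand-in for a CL_QUEUE_PROPERTIES_ARRAY query. `stored` is
 * the list the queue was created with (stored_count == 0 models a NULL list). */
static int mock_get_properties_array(const uint64_t *stored, size_t stored_count,
                                     size_t param_value_size, void *param_value,
                                     size_t *param_value_size_ret)
{
    size_t needed = stored_count * sizeof(uint64_t);
    if (param_value != NULL) {
        if (param_value_size < needed)
            return CL_INVALID_VALUE;   /* caller's buffer is too small */
        if (needed > 0)
            memcpy(param_value, stored, needed);
    }
    if (param_value_size_ret != NULL)
        *param_value_size_ret = needed;
    return CL_SUCCESS;
}

/* Two-call idiom: ask for the size, allocate, then ask for the data.
 * Returns a malloc'ed array (caller frees) or NULL when nothing is returned. */
static uint64_t *query_properties_array(const uint64_t *stored, size_t stored_count,
                                        size_t *count_out)
{
    size_t bytes = 0;
    *count_out = 0;
    if (mock_get_properties_array(stored, stored_count, 0, NULL, &bytes) != CL_SUCCESS
        || bytes == 0)
        return NULL;
    uint64_t *buf = malloc(bytes);
    if (buf == NULL)
        return NULL;
    if (mock_get_properties_array(stored, stored_count, bytes, buf, NULL) != CL_SUCCESS) {
        free(buf);
        return NULL;
    }
    *count_out = bytes / sizeof(uint64_t);
    return buf;
}
```

With the real API the two calls would be made to {clGetCommandQueueInfo} with the same _command_queue_ and _param_name_; a reported size of 0 corresponds to the case where no properties are returned.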

 // refError

 {clGetCommandQueueInfo} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid
     command-queue, or if _command_queue_ is not a valid command-queue
     for _param_name_.
   * {CL_INVALID_VALUE} if _param_name_ is not one of the supported values or
     if size in bytes specified by _param_value_size_ is < size of return
     type as specified in the <<command-queue-param-table,Command Queue
     Parameter>> table, and _param_value_ is not a `NULL` value.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clSetCommandQueueProperty',desc='Enable or disable properties of a command-queue.',type='protos']
 --
 To enable or disable the properties of a command-queue, call the function

 include::{generated}/api/protos/clSetCommandQueueProperty.txt[]
 include::{generated}/api/version-notes/clSetCommandQueueProperty.asciidoc[]

   * _command_queue_ specifies the command-queue being modified.
   * _properties_ specifies the new list of properties for the command-queue.
     This is a bit-field and the supported properties are described in the
     <<legacy-queue-properties-table, Command-Queue Properties table>> for
     {clCreateCommandQueue}.
     Only command-queue properties specified in this table can be used,
     otherwise the value specified in _properties_ is considered to be not
     valid.
   * _enable_ determines whether the values specified by _properties_ are
     enabled (if _enable_ is {CL_TRUE}) or disabled (if _enable_ is {CL_FALSE})
     for the command-queue.
   * _old_properties_ returns the command-queue properties before they were
     changed by {clSetCommandQueueProperty}.  If _old_properties_ is `NULL`, it
     is ignored.

 [NOTE]
 ====
 Changing the {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} command-queue property
 will cause the OpenCL implementation to block until all previously queued
 commands in _command_queue_ have completed.  This can be an expensive operation
 and therefore changes to this property should only be done when absolutely
 necessary.
 ====
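
The effect of _enable_ on the queue's property bits can be sketched as a set or clear operation on a bit-field.
`apply_queue_property` below is a hypothetical model written for illustration; it is not an OpenCL function and ignores the blocking behavior described in the note above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: returns the new property bits of a queue whose
 * current bits are `current`. When `enable` is non-zero the bits in
 * `properties` are switched on, otherwise they are switched off. The
 * previous bits are reported through `old_properties` unless it is NULL. */
static uint64_t apply_queue_property(uint64_t current, uint64_t properties,
                                     int enable, uint64_t *old_properties)
{
    if (old_properties != NULL)
        *old_properties = current;
    return enable ? (current | properties) : (current & ~properties);
}
```

For example, enabling bit 0 on a queue that currently has only bit 1 set yields both bits set and reports the earlier value, with only bit 1 set, as the old properties.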

 // refError

 {clSetCommandQueueProperty} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid command-queue.
   * {CL_INVALID_VALUE} if values specified in _properties_ are not valid.
   * {CL_INVALID_QUEUE_PROPERTIES} if values specified in _properties_ are
     valid but are not supported by the device.
 --


 == Buffer Objects

 A _buffer_ object stores a one-dimensional collection of elements.
 Elements of a _buffer_ object can be a scalar data type (such as an int or
 float), a vector data type, or a user-defined structure.


 ===  Creating Buffer Objects

 [open,refpage='clCreateBuffer',desc='Creates a buffer object.',type='protos',alias='clCreateBufferWithProperties']
 --
 A *buffer object* may be created using the function

 include::{generated}/api/protos/clCreateBuffer.txt[]
 include::{generated}/api/version-notes/clCreateBuffer.asciidoc[]

 A *buffer object* may also be created with additional properties using the function

 include::{generated}/api/protos/clCreateBufferWithProperties.txt[]
 include::{generated}/api/version-notes/clCreateBufferWithProperties.asciidoc[]

   * _context_ is a valid OpenCL context used to create the buffer object.
   * _properties_ is an optional list of properties for the buffer object and their corresponding values.
     The list is terminated with the special property `0`.
     If no properties are required, _properties_ may be `NULL`.
     OpenCL 3.0 does not define any optional properties for buffers.
   * _flags_ is a bit-field that is used to specify allocation and usage
     information about the buffer memory object being created and is described in
     the <<memory-flags-table, supported memory flag values>> table.
   * _size_ is the size in bytes of the buffer memory object to be allocated.
   * _host_ptr_ is a pointer to the buffer data that may already be allocated
     by the application.
     The size of the buffer that _host_ptr_ points to must be greater than or equal to _size_
     bytes.
   * _errcode_ret_ may return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.

 The alignment requirements for data stored in buffer objects are described
 in <<alignment-app-data-types>>.

 If {clCreateBuffer} or {clCreateBufferWithProperties} is called with
 {CL_MEM_USE_HOST_PTR} set in its _flags_ argument, the contents of the
 memory pointed to by _host_ptr_ at the time of the {clCreateBuffer} call
 define the initial contents of the buffer object.

 If {clCreateBuffer} or {clCreateBufferWithProperties} is called with a
 pointer returned by {clSVMAlloc} as its _host_ptr_ argument, and
 {CL_MEM_USE_HOST_PTR} is set in its _flags_ argument, {clCreateBuffer} or
 {clCreateBufferWithProperties} will succeed and return a valid non-zero
 buffer object as long as the _size_ argument is no larger than the
 _size_ argument passed in the original {clSVMAlloc} call.
 The new buffer object returned has the shared memory as the underlying
 storage.
 Locations in the buffer's underlying shared memory can be operated on using
 atomic operations to the device's level of support as defined in the memory
 model.

 // refError

 {clCreateBuffer} and {clCreateBufferWithProperties} return a valid non-zero
 buffer object and _errcode_ret_ is set to {CL_SUCCESS} if the buffer object
 is created successfully.
 Otherwise, they return a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_PROPERTY} if a property name in _properties_ is not a
     supported property name, if the value specified for a supported property
     name is not valid, or if the same property name is specified more than
     once.
   * {CL_INVALID_VALUE} if values specified in _flags_ are not valid as defined
     in the <<memory-flags-table,Memory Flags>> table.
   * {CL_INVALID_BUFFER_SIZE} if _size_ is 0 or if _size_ is greater than
     {CL_DEVICE_MAX_MEM_ALLOC_SIZE} for all devices in _context_.
   * {CL_INVALID_HOST_PTR} if _host_ptr_ is `NULL` and {CL_MEM_USE_HOST_PTR} or
     {CL_MEM_COPY_HOST_PTR} is set in _flags_, or if _host_ptr_ is not `NULL`
     but neither {CL_MEM_COPY_HOST_PTR} nor {CL_MEM_USE_HOST_PTR} is set in _flags_.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for buffer object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.

 [[memory-flags-table]]
 .List of supported memory flag values
 [width="100%",cols="<50%,<50%",options="header"]
 |====
 | Memory Flags | Description
 | {CL_MEM_READ_WRITE_anchor}

 include::{generated}/api/version-notes/CL_MEM_READ_WRITE.asciidoc[]
   | This flag specifies that the memory object will be read
     and written by a kernel.
     This is the default.
 | {CL_MEM_WRITE_ONLY_anchor}

 include::{generated}/api/version-notes/CL_MEM_WRITE_ONLY.asciidoc[]
   | This flag specifies that the memory object will be
     written but not read by a kernel.

     Reading from a buffer or image object created with {CL_MEM_WRITE_ONLY}
     inside a kernel is undefined.

     {CL_MEM_READ_WRITE} and {CL_MEM_WRITE_ONLY} are mutually exclusive.
 | {CL_MEM_READ_ONLY_anchor}

 include::{generated}/api/version-notes/CL_MEM_READ_ONLY.asciidoc[]
   | This flag specifies that the memory object is a
     read-only memory object when used inside a kernel.

     Writing to a buffer or image object created with {CL_MEM_READ_ONLY} inside
     a kernel is undefined.

     {CL_MEM_READ_WRITE} or {CL_MEM_WRITE_ONLY} and {CL_MEM_READ_ONLY} are mutually
     exclusive.
 | {CL_MEM_USE_HOST_PTR_anchor}

 include::{generated}/api/version-notes/CL_MEM_USE_HOST_PTR.asciidoc[]
   | This flag is valid only if _host_ptr_ is not `NULL`.
     If specified, it indicates that the application wants the OpenCL
     implementation to use memory referenced by _host_ptr_ as the storage bits
     for the memory object.

     The contents of the memory pointed to by _host_ptr_ at the time of the
     {clCreateBuffer}, {clCreateBufferWithProperties}, {clCreateImage},
     {clCreateImageWithProperties}, {clCreateImage2D}, or {clCreateImage3D}
     call define the initial contents of the memory object.

     OpenCL implementations are allowed to cache the contents pointed
     to by _host_ptr_ in device memory.
     This cached copy can be used when kernels are executed on a device.

     The result of OpenCL commands that operate on multiple buffer objects
     created with the same _host_ptr_ or from overlapping host or SVM regions
     is considered to be undefined.
 | {CL_MEM_ALLOC_HOST_PTR_anchor}

 include::{generated}/api/version-notes/CL_MEM_ALLOC_HOST_PTR.asciidoc[]
   | This flag specifies that the application wants the OpenCL implementation
     to allocate memory from host accessible memory.

     {CL_MEM_ALLOC_HOST_PTR} and {CL_MEM_USE_HOST_PTR} are mutually exclusive.
 | {CL_MEM_COPY_HOST_PTR_anchor}

 include::{generated}/api/version-notes/CL_MEM_COPY_HOST_PTR.asciidoc[]
   | This flag is valid only if _host_ptr_ is not `NULL`.
     If specified, it indicates that the application wants the OpenCL
     implementation to allocate memory for the memory object and copy the
     data from memory referenced by _host_ptr_.
     The implementation will copy the memory immediately and _host_ptr_ is
     available for reuse by the application when the {clCreateBuffer},
     {clCreateBufferWithProperties}, {clCreateImage}, {clCreateImageWithProperties},
     {clCreateImage2D}, or {clCreateImage3D} operation returns.

     {CL_MEM_COPY_HOST_PTR} and {CL_MEM_USE_HOST_PTR} are mutually exclusive.

     {CL_MEM_COPY_HOST_PTR} can be used with {CL_MEM_ALLOC_HOST_PTR} to
     initialize the contents of the {cl_mem_TYPE} object allocated using
     host-accessible (e.g. PCIe) memory.
 | {CL_MEM_HOST_WRITE_ONLY_anchor}

 include::{generated}/api/version-notes/CL_MEM_HOST_WRITE_ONLY.asciidoc[]
   | This flag specifies that the host will only write to the memory object
     (using OpenCL APIs that enqueue a write or a map for write).
     This can be used to optimize write access from the host (e.g. enable
     write-combined allocations for memory objects for devices that
     communicate with the host over a system bus such as PCIe).
 | {CL_MEM_HOST_READ_ONLY_anchor}

 include::{generated}/api/version-notes/CL_MEM_HOST_READ_ONLY.asciidoc[]
   | This flag specifies that the host will only read
     the memory object (using OpenCL APIs that enqueue a read or a map for
     read).

     {CL_MEM_HOST_WRITE_ONLY} and {CL_MEM_HOST_READ_ONLY} are mutually exclusive.
 | {CL_MEM_HOST_NO_ACCESS_anchor}

 include::{generated}/api/version-notes/CL_MEM_HOST_NO_ACCESS.asciidoc[]
   | This flag specifies that the host will not read or
     write the memory object.

     {CL_MEM_HOST_WRITE_ONLY} or {CL_MEM_HOST_READ_ONLY} and
     {CL_MEM_HOST_NO_ACCESS} are mutually exclusive.
 | {CL_MEM_KERNEL_READ_AND_WRITE_anchor}

 include::{generated}/api/version-notes/CL_MEM_KERNEL_READ_AND_WRITE.asciidoc[]
   | This flag is only used by {clGetSupportedImageFormats} to query image
     formats that may be both read from and written to by the same kernel
     instance.
     To create a memory object that may be read from and written to, use
     {CL_MEM_READ_WRITE}.
 |====
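
The mutual-exclusion rules in the table, together with the size and _host_ptr_ errors listed for buffer creation, can be combined into a pre-flight check.
The C sketch below is one possible reading of those rules, not the behavior of any particular implementation: `check_buffer_args` and `more_than_one_bit` are hypothetical helpers, the order in which errors are detected is an assumption, `max_alloc` is assumed to be the largest {CL_DEVICE_MAX_MEM_ALLOC_SIZE} among the devices in the context, and the flag and error values are copied locally from `CL/cl.h`.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag and error values copied locally from the Khronos CL/cl.h header. */
#define CL_MEM_READ_WRITE       (1u << 0)
#define CL_MEM_WRITE_ONLY       (1u << 1)
#define CL_MEM_READ_ONLY        (1u << 2)
#define CL_MEM_USE_HOST_PTR     (1u << 3)
#define CL_MEM_ALLOC_HOST_PTR   (1u << 4)
#define CL_MEM_COPY_HOST_PTR    (1u << 5)
#define CL_MEM_HOST_WRITE_ONLY  (1u << 7)
#define CL_MEM_HOST_READ_ONLY   (1u << 8)
#define CL_MEM_HOST_NO_ACCESS   (1u << 9)
#define CL_SUCCESS               0
#define CL_INVALID_VALUE        (-30)
#define CL_INVALID_HOST_PTR     (-37)
#define CL_INVALID_BUFFER_SIZE  (-61)

/* True when two or more bits are set. */
static int more_than_one_bit(uint64_t bits)
{
    return (bits & (bits - 1)) != 0;
}

/* Hypothetical pre-flight check for buffer creation arguments. */
static int check_buffer_args(uint64_t flags, size_t size, const void *host_ptr,
                             uint64_t max_alloc)
{
    const uint64_t dev_access  = CL_MEM_READ_WRITE | CL_MEM_WRITE_ONLY |
                                 CL_MEM_READ_ONLY;
    const uint64_t host_access = CL_MEM_HOST_WRITE_ONLY | CL_MEM_HOST_READ_ONLY |
                                 CL_MEM_HOST_NO_ACCESS;
    const uint64_t needs_ptr   = CL_MEM_USE_HOST_PTR | CL_MEM_COPY_HOST_PTR;

    /* At most one kernel-access flag and at most one host-access flag. */
    if (more_than_one_bit(flags & dev_access) ||
        more_than_one_bit(flags & host_access))
        return CL_INVALID_VALUE;
    /* USE_HOST_PTR excludes both ALLOC_HOST_PTR and COPY_HOST_PTR. */
    if ((flags & CL_MEM_USE_HOST_PTR) &&
        (flags & (CL_MEM_ALLOC_HOST_PTR | CL_MEM_COPY_HOST_PTR)))
        return CL_INVALID_VALUE;
    if (size == 0 || size > max_alloc)
        return CL_INVALID_BUFFER_SIZE;
    /* host_ptr must be given exactly when USE_HOST_PTR or COPY_HOST_PTR is set. */
    if ((host_ptr == NULL) != ((flags & needs_ptr) == 0))
        return CL_INVALID_HOST_PTR;
    return CL_SUCCESS;
}
```

Note that {CL_MEM_COPY_HOST_PTR} combined with {CL_MEM_ALLOC_HOST_PTR} passes the check, which matches the table's statement that the two can be used together.
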
 --

 [open,refpage='clCreateSubBuffer',desc='Creates a new buffer object (referred to as a sub-buffer object) from an existing buffer object.',type='protos',xrefs='cl_buffer_region']
 --
 To create a new buffer object (referred to as a sub-buffer object) from an
 existing buffer object, call the function

 include::{generated}/api/protos/clCreateSubBuffer.txt[]
 include::{generated}/api/version-notes/clCreateSubBuffer.asciidoc[]

   * _buffer_ must be a valid buffer object and cannot be a sub-buffer object.
   * _flags_ is a bit-field that is used to specify allocation and usage
     information about the sub-buffer memory object being created and is
     described in the <<memory-flags-table,Memory Flags>> table.
     If the {CL_MEM_READ_WRITE}, {CL_MEM_READ_ONLY}, or {CL_MEM_WRITE_ONLY} values are
     not specified in _flags_, they are inherited from the corresponding memory
     access qualifiers associated with _buffer_.
     The {CL_MEM_USE_HOST_PTR}, {CL_MEM_ALLOC_HOST_PTR}, and {CL_MEM_COPY_HOST_PTR}
     values cannot be specified in _flags_ but are inherited from the
     corresponding memory access qualifiers associated with _buffer_.
     If {CL_MEM_COPY_HOST_PTR} is specified in the memory access qualifier values
     associated with _buffer_ it does not imply any additional copies when the
     sub-buffer is created from _buffer_.
     If the {CL_MEM_HOST_WRITE_ONLY}, {CL_MEM_HOST_READ_ONLY}, or
     {CL_MEM_HOST_NO_ACCESS} values are not specified in _flags_, they are
     inherited from the corresponding memory access qualifiers associated with
     _buffer_.
   * _buffer_create_type_ and _buffer_create_info_ describe the type of buffer
     object to be created.
     The list of supported values for _buffer_create_type_ and corresponding
     descriptor that _buffer_create_info_ points to is described in the
     <<subbuffer-create-info-table,SubBuffer Attributes>> table.

 [[subbuffer-create-info-table]]
 .List of supported buffer creation types by <<clCreateSubBuffer>>
 [width="100%",cols="<50%,<50%",options="header"]
 |====
 | Buffer Creation Type | Description
 | {CL_BUFFER_CREATE_TYPE_REGION_anchor}

 include::{generated}/api/version-notes/CL_BUFFER_CREATE_TYPE_REGION.asciidoc[]
   | Create a buffer object that represents a specific region in _buffer_.

     _buffer_create_info_ is a pointer to a {cl_buffer_region_TYPE} structure
     specifying a region of the buffer.

     If _buffer_ is created with {CL_MEM_USE_HOST_PTR}, the _host_ptr_
     associated with the buffer object returned is _host_ptr + origin_.

     The buffer object returned references the data store allocated for
     _buffer_ and points to the region specified by _buffer_create_info_ in
     this data store.
 |====

 // refError

 {clCreateSubBuffer} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors in _errcode_ret_:

   * {CL_INVALID_MEM_OBJECT} if _buffer_ is not a valid buffer object or is a
     sub-buffer object.
   * {CL_INVALID_VALUE} if _buffer_ was created with {CL_MEM_WRITE_ONLY} and
     _flags_ specifies {CL_MEM_READ_WRITE} or {CL_MEM_READ_ONLY}, or if _buffer_
     was created with {CL_MEM_READ_ONLY} and _flags_ specifies
     {CL_MEM_READ_WRITE} or {CL_MEM_WRITE_ONLY}, or if _flags_ specifies
     {CL_MEM_USE_HOST_PTR} or {CL_MEM_ALLOC_HOST_PTR} or {CL_MEM_COPY_HOST_PTR}.
   * {CL_INVALID_VALUE} if _buffer_ was created with {CL_MEM_HOST_WRITE_ONLY} and
     _flags_ specifies {CL_MEM_HOST_READ_ONLY}, or if _buffer_ was created with
     {CL_MEM_HOST_READ_ONLY} and _flags_ specifies {CL_MEM_HOST_WRITE_ONLY}, or if
     _buffer_ was created with {CL_MEM_HOST_NO_ACCESS} and _flags_ specifies
     {CL_MEM_HOST_READ_ONLY} or {CL_MEM_HOST_WRITE_ONLY}.
   * {CL_INVALID_VALUE} if the value specified in _buffer_create_type_ is not
     valid.
   * {CL_INVALID_VALUE} if the value(s) specified in _buffer_create_info_ (for a
     given _buffer_create_type_) are not valid or if _buffer_create_info_ is
     `NULL`.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for the sub-buffer object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 // These errors are specific to the {CL_BUFFER_CREATE_TYPE_REGION} parameter
   * {CL_INVALID_VALUE} if the region specified by the {cl_buffer_region_TYPE}
     structure passed in _buffer_create_info_ is out of bounds in _buffer_.
   * {CL_INVALID_BUFFER_SIZE} if the _size_ field of the {cl_buffer_region_TYPE}
     structure passed in _buffer_create_info_ is 0.
   * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if there are no devices in _context_
     associated with _buffer_ for which the _origin_ field of the
     {cl_buffer_region_TYPE} structure passed in _buffer_create_info_ is
     aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value.
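
 The region-specific errors listed above can be illustrated with a small host-side check.
 This is only a sketch under stated assumptions: `region_t`, `region_status` and `check_region` are hypothetical names that are not part of the OpenCL API, the order of the checks is arbitrary, and the alignment values are assumed to be the per-device {CL_DEVICE_MEM_BASE_ADDR_ALIGN} query results expressed in bits.

```c
#include <stddef.h>

/* Hypothetical stand-in for cl_buffer_region: byte offset and byte size. */
typedef struct {
    size_t origin;
    size_t size;
} region_t;

/* Hypothetical result codes; these are NOT the OpenCL error values. */
typedef enum {
    REGION_OK,
    REGION_ZERO_SIZE,     /* would map to CL_INVALID_BUFFER_SIZE */
    REGION_OUT_OF_BOUNDS, /* would map to CL_INVALID_VALUE */
    REGION_MISALIGNED     /* would map to CL_MISALIGNED_SUB_BUFFER_OFFSET */
} region_status;

/*
 * parent_size: size in bytes of the parent buffer.
 * align_bits:  one alignment value per device in the context, assumed to be
 *              in bits, as returned by a CL_DEVICE_MEM_BASE_ADDR_ALIGN query.
 */
region_status check_region(const region_t *r, size_t parent_size,
                           const unsigned *align_bits, size_t num_devices)
{
    if (r->size == 0)
        return REGION_ZERO_SIZE;

    /* Written to avoid overflow in origin + size. */
    if (r->origin > parent_size || r->size > parent_size - r->origin)
        return REGION_OUT_OF_BOUNDS;

    /* The origin is acceptable if at least one device finds it aligned. */
    for (size_t i = 0; i < num_devices; ++i) {
        size_t align_bytes = align_bits[i] / 8;
        if (align_bytes == 0 || r->origin % align_bytes == 0)
            return REGION_OK;
    }
    return REGION_MISALIGNED;
}
```

 With two devices reporting alignments of 1024 and 4096 bits, an _origin_ of 128 bytes passes because it is aligned for the first device, while an _origin_ of 64 bytes is aligned for neither and is reported as misaligned.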

 [NOTE]
 ====
 Concurrent reading from, writing to and copying between both a buffer object
 and its sub-buffer object(s) is undefined.
 Concurrent reading from, writing to and copying between overlapping
 sub-buffer objects created with the same buffer object is undefined.
 Only reading from both a buffer object and its sub-buffer objects or reading
 from multiple overlapping sub-buffer objects is defined.
 ====
 --

 [open,refpage='cl_buffer_region',desc='',type='structs',xrefs='clCreateSubBuffer']
 --
 The {cl_buffer_region_TYPE} structure specifies a region of a buffer object:

 include::{generated}/api/structs/cl_buffer_region.txt[]

   * _origin_ is the offset in bytes of the region.
   * _size_ is the size in bytes of the region.

 Constraints on the values of _origin_ and _size_ are specified for the
 {clCreateSubBuffer} function to which this structure is passed.
 --


 === Reading, Writing and Copying Buffer Objects

 The following functions enqueue commands to read from a buffer object to
 host memory or write to a buffer object from host memory.

 [open,refpage='clEnqueueReadBuffer',desc='Enqueue commands to read from a buffer object to host memory.',type='protos',alias='clEnqueueWriteBuffer']
 --
 To read from a buffer object to host memory or to write to a buffer object from
 host memory, call one of the functions

 include::{generated}/api/protos/clEnqueueReadBuffer.txt[]
 include::{generated}/api/version-notes/clEnqueueReadBuffer.asciidoc[]

 include::{generated}/api/protos/clEnqueueWriteBuffer.txt[]
 include::{generated}/api/version-notes/clEnqueueWriteBuffer.asciidoc[]

   * _command_queue_ is a valid host command-queue in which the read / write
     command will be queued.
     _command_queue_ and _buffer_ must be created with the same OpenCL context.
   * _buffer_ refers to a valid buffer object.
   * _blocking_read_ and _blocking_write_ indicate if the read and write
     operations are _blocking_ or _non-blocking_ (see below).
   * _offset_ is the offset in bytes in the buffer object to read from or write
     to.
   * _size_ is the size in bytes of data being read or written.
   * _ptr_ is a pointer to the buffer in host memory where data is to be read
     into or written from.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this read / write command
     and can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.

 If _blocking_read_ is {CL_TRUE}, i.e. the read command is blocking,
 {clEnqueueReadBuffer} does not return until the buffer data has been read
 and copied into memory pointed to by _ptr_.

 If _blocking_read_ is {CL_FALSE}, i.e. the read command is non-blocking,
 {clEnqueueReadBuffer} queues a non-blocking read command and returns.
 The contents of the buffer that _ptr_ points to cannot be used until the
 read command has completed.
 The _event_ argument returns an event object which can be used to query the
 execution status of the read command.
 When the read command has completed, the contents of the buffer that _ptr_
 points to can be used by the application.

 If _blocking_write_ is {CL_TRUE}, the write command is blocking and does not
 return until the command is complete, including transfer of the data.
 The memory pointed to by _ptr_ can be reused by the application after the
 {clEnqueueWriteBuffer} call returns.

 If _blocking_write_ is {CL_FALSE}, the OpenCL implementation will use _ptr_ to
 perform a non-blocking write.
 Because the write is non-blocking, the implementation can return immediately,
 so the memory pointed to by _ptr_ cannot be reused by the application as soon
 as the call returns.
 The _event_ argument returns an event object which can be used to query the
 execution status of the write command.
 When the write command has completed, the memory pointed to by _ptr_ can
 then be reused by the application.

 // refError

 {clEnqueueReadBuffer} and {clEnqueueWriteBuffer} return {CL_SUCCESS} if the
 function is executed successfully.
 Otherwise, they return one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
     _buffer_ are not the same or if the context associated with
     _command_queue_ and events in _event_wait_list_ are not the same.
   * {CL_INVALID_MEM_OBJECT} if _buffer_ is not a valid buffer object.
   * {CL_INVALID_VALUE} if the region being read or written specified by
     (_offset_, _size_) is out of bounds or if _ptr_ is a `NULL` value.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _buffer_ is a sub-buffer object and
     the _origin_ specified when the sub-buffer object was created is not
     aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device
     associated with _command_queue_.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the read and write
     operations are blocking and the execution status of any of the events in
     _event_wait_list_ is a negative integer value.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for data store associated with _buffer_.
   * {CL_INVALID_OPERATION} if {clEnqueueReadBuffer} is called on _buffer_
     which has been created with {CL_MEM_HOST_WRITE_ONLY} or
     {CL_MEM_HOST_NO_ACCESS}.
   * {CL_INVALID_OPERATION} if {clEnqueueWriteBuffer} is called on _buffer_
     which has been created with {CL_MEM_HOST_READ_ONLY} or
     {CL_MEM_HOST_NO_ACCESS}.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
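
 The argument-related conditions in this list can be sketched as a plain host-side check.
 The names `args_status` and `check_rw_args` are hypothetical and the buffer size is passed in directly as an assumption; a real implementation also validates the objects themselves, and this sketch does not address a _size_ of 0.

```c
#include <stddef.h>

/* Hypothetical result codes; these are NOT the OpenCL error values. */
typedef enum {
    ARGS_OK,
    ARGS_INVALID_VALUE,          /* would map to CL_INVALID_VALUE */
    ARGS_INVALID_EVENT_WAIT_LIST /* would map to CL_INVALID_EVENT_WAIT_LIST */
} args_status;

/*
 * buffer_size: size in bytes of the buffer object (assumed known here).
 * wait_list:   opaque pointer standing in for the event_wait_list argument.
 */
args_status check_rw_args(size_t buffer_size, size_t offset, size_t size,
                          const void *ptr,
                          unsigned num_events, const void *wait_list)
{
    if (ptr == NULL)
        return ARGS_INVALID_VALUE;

    /* (offset, size) must lie inside the buffer; avoids offset + size overflow. */
    if (offset > buffer_size || size > buffer_size - offset)
        return ARGS_INVALID_VALUE;

    /* The list pointer and the count must agree: both empty or both non-empty. */
    if ((wait_list == NULL && num_events > 0) ||
        (wait_list != NULL && num_events == 0))
        return ARGS_INVALID_EVENT_WAIT_LIST;

    return ARGS_OK;
}
```

 For a 64-byte buffer, reading 16 bytes at offset 16 passes, while reading 16 bytes at offset 60 is rejected as out of bounds.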
 --

 [open,refpage='clEnqueueReadBufferRect',desc='Enqueue command to read from a 2D or 3D rectangular region from a buffer object to host memory.',type='protos',alias='clEnqueueWriteBufferRect']
 --
 The following functions enqueue commands to read a 2D or 3D rectangular
 region from a buffer object to host memory or write a 2D or 3D rectangular
 region to a buffer object from host memory.

 include::{generated}/api/protos/clEnqueueReadBufferRect.txt[]
 include::{generated}/api/version-notes/clEnqueueReadBufferRect.asciidoc[]

 include::{generated}/api/protos/clEnqueueWriteBufferRect.txt[]
 include::{generated}/api/version-notes/clEnqueueWriteBufferRect.asciidoc[]

   * _command_queue_ is a valid host command-queue in which the read /
     write command will be queued.
     _command_queue_ and _buffer_ must be created with the same OpenCL context.
   * _buffer_ refers to a valid buffer object.
   * _blocking_read_ and _blocking_write_ indicate if the read and write
     operations are _blocking_ or _non-blocking_ (see below).
   * _buffer_origin_ defines the (_x_, _y_, _z_) offset in the memory region
     associated with _buffer_.
     For a 2D rectangle region, the _z_ value given by _buffer_origin_[2] should
     be 0.
     The offset in bytes is computed as _buffer_origin_[2] {times}
     _buffer_slice_pitch_ + _buffer_origin_[1] {times} _buffer_row_pitch_ {plus}
     _buffer_origin_[0].
   * _host_origin_ defines the (_x_, _y_, _z_) offset in the memory region
     pointed to by _ptr_.
     For a 2D rectangle region, the _z_ value given by _host_origin_[2] should be
     0.
     The offset in bytes is computed as _host_origin_[2] {times}
     _host_slice_pitch_ + _host_origin_[1] {times} _host_row_pitch_ {plus}
     _host_origin_[0].
   * _region_ defines the (_width_ in bytes, _height_ in rows, _depth_ in slices)
     of the 2D or 3D rectangle being read or written.
     For a 2D rectangle, the _depth_ value given by _region_[2] should be 1.
     The values in _region_ cannot be 0.
   * _buffer_row_pitch_ is the length of each row in bytes to be used for the
     memory region associated with _buffer_.
     If _buffer_row_pitch_ is 0, _buffer_row_pitch_ is computed as _region_[0].
   * _buffer_slice_pitch_ is the length of each 2D slice in bytes to be used for
     the memory region associated with _buffer_.
     If _buffer_slice_pitch_ is 0, _buffer_slice_pitch_ is computed as
     _region_[1] {times} _buffer_row_pitch_.
   * _host_row_pitch_ is the length of each row in bytes to be used for the
     memory region pointed to by _ptr_.
     If _host_row_pitch_ is 0, _host_row_pitch_ is computed as _region_[0].
   * _host_slice_pitch_ is the length of each 2D slice in bytes to be used for
     the memory region pointed to by _ptr_.
     If _host_slice_pitch_ is 0, _host_slice_pitch_ is computed as _region_[1]
     {times} _host_row_pitch_.
   * _ptr_ is a pointer to the buffer in host memory where data is to be read
     into or written from.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this read / write command
     and can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.
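
 The pitch defaults and the byte-offset formula described for _buffer_origin_ and _host_origin_ can be sketched in plain C.
 The helper names below are hypothetical and only restate the arithmetic given in the parameter descriptions.

```c
#include <stddef.h>

/* A row pitch of 0 means the rows are tightly packed: region[0] bytes each. */
size_t effective_row_pitch(size_t row_pitch, const size_t region[3])
{
    return row_pitch != 0 ? row_pitch : region[0];
}

/* A slice pitch of 0 means region[1] rows of the effective row pitch. */
size_t effective_slice_pitch(size_t slice_pitch, size_t row_pitch_eff,
                             const size_t region[3])
{
    return slice_pitch != 0 ? slice_pitch : region[1] * row_pitch_eff;
}

/* Byte offset of the (x, y, z) origin, using already-resolved pitches. */
size_t rect_offset(const size_t origin[3], size_t row_pitch_eff,
                   size_t slice_pitch_eff)
{
    return origin[2] * slice_pitch_eff
         + origin[1] * row_pitch_eff
         + origin[0];
}
```

 For a _region_ of (16, 4, 2) with both pitches passed as 0, the effective pitches are 16 and 64 bytes, so an origin of (4, 1, 1) corresponds to byte offset 1 {times} 64 {plus} 1 {times} 16 {plus} 4 = 84.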


 If _blocking_read_ is {CL_TRUE}, i.e. the read command is blocking,
 {clEnqueueReadBufferRect} does not return until the buffer data has been
 read and copied into memory pointed to by _ptr_.

 If _blocking_read_ is {CL_FALSE}, i.e. the read command is non-blocking,
 {clEnqueueReadBufferRect} queues a non-blocking read command and returns.
 The contents of the buffer that _ptr_ points to cannot be used until the
 read command has completed.
 The _event_ argument returns an event object which can be used to query the
 execution status of the read command.
 When the read command has completed, the contents of the buffer that _ptr_
 points to can be used by the application.

 If _blocking_write_ is {CL_TRUE}, the write command is blocking and does not
 return until the command is complete, including transfer of the data.
 The memory pointed to by _ptr_ can be reused by the application after the
 {clEnqueueWriteBufferRect} call returns.

 If _blocking_write_ is {CL_FALSE}, the OpenCL implementation will use _ptr_ to
 perform a non-blocking write.
 Because the write is non-blocking, the implementation can return immediately,
 so the memory pointed to by _ptr_ cannot be reused by the application as soon
 as the call returns.
 The _event_ argument returns an event object which can be used to query the
 execution status of the write command.
 When the write command has completed, the memory pointed to by _ptr_ can
 then be reused by the application.

 // refError

 {clEnqueueReadBufferRect} and {clEnqueueWriteBufferRect} return {CL_SUCCESS}
 if the function is executed successfully.
 Otherwise, they return one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
     _buffer_ are not the same or if the context associated with
     _command_queue_ and events in _event_wait_list_ are not the same.
   * {CL_INVALID_MEM_OBJECT} if _buffer_ is not a valid buffer object.
   * {CL_INVALID_VALUE} if _buffer_origin_, _host_origin_, or _region_ is `NULL`.
   * {CL_INVALID_VALUE} if the region being read or written specified by
     (_buffer_origin_, _region_, _buffer_row_pitch_, _buffer_slice_pitch_) is
     out of bounds.
   * {CL_INVALID_VALUE} if any _region_ array element is 0.
   * {CL_INVALID_VALUE} if _buffer_row_pitch_ is not 0 and is less than
     _region_[0].
   * {CL_INVALID_VALUE} if _host_row_pitch_ is not 0 and is less than
     _region_[0].
   * {CL_INVALID_VALUE} if _buffer_slice_pitch_ is not 0 and is less than
     _region_[1] {times} _buffer_row_pitch_ and not a multiple of
     _buffer_row_pitch_.
   * {CL_INVALID_VALUE} if _host_slice_pitch_ is not 0 and is less than
     _region_[1] {times} _host_row_pitch_ and not a multiple of
     _host_row_pitch_.
   * {CL_INVALID_VALUE} if _ptr_ is `NULL`.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _buffer_ is a sub-buffer object and
     the _origin_ specified when the sub-buffer object was created is not
     aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device
     associated with _command_queue_.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the read and write
     operations are blocking and the execution status of any of the events in
     _event_wait_list_ is a negative integer value.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for data store associated with _buffer_.
   * {CL_INVALID_OPERATION} if {clEnqueueReadBufferRect} is called on _buffer_
     which has been created with {CL_MEM_HOST_WRITE_ONLY} or
     {CL_MEM_HOST_NO_ACCESS}.
   * {CL_INVALID_OPERATION} if {clEnqueueWriteBufferRect} is called on _buffer_
     which has been created with {CL_MEM_HOST_READ_ONLY} or
     {CL_MEM_HOST_NO_ACCESS}.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
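
 The out-of-bounds condition for a rectangular region is not spelled out as a formula in the list above.
 One plausible reading, sketched here as an assumption rather than as the normative rule, is that the byte just past the last byte touched by the rectangle must not exceed the buffer size.
 The function name `rect_in_bounds` is hypothetical, the pitches are assumed to be already resolved to non-zero values, and arithmetic overflow is not handled.

```c
#include <stddef.h>

/*
 * Returns 1 if the rectangle fits inside buffer_size bytes, otherwise 0.
 * Assumes every region element is non-zero and both pitches are non-zero.
 */
int rect_in_bounds(size_t buffer_size,
                   const size_t origin[3], const size_t region[3],
                   size_t row_pitch, size_t slice_pitch)
{
    /* Offset of the first byte of the rectangle. */
    size_t first = origin[2] * slice_pitch
                 + origin[1] * row_pitch
                 + origin[0];

    /* One past the last byte: last slice, last row, full row width. */
    size_t end = first
               + (region[2] - 1) * slice_pitch
               + (region[1] - 1) * row_pitch
               + region[0];

    return end <= buffer_size;
}
```

 With a 256-byte buffer, a row pitch of 16 and a slice pitch of 64, a region of (16, 4, 4) at origin (0, 0, 0) ends exactly at byte 256 and fits, while the same region at origin (1, 0, 0) does not.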

 [NOTE]
 ====
 Calling {clEnqueueReadBuffer} to read a region of the buffer object with the
 _ptr_ argument value set to _host_ptr_ + _offset_, where _host_ptr_ is a
 pointer to the memory region specified when the buffer object being read is
 created with {CL_MEM_USE_HOST_PTR}, must meet the following requirements in
 order to avoid undefined behavior:

   * All commands that use this buffer object or a memory object (buffer or
     image) created from this buffer object have finished execution before
     the read command begins execution.
   * The buffer object or memory objects created from this buffer object are
     not mapped.
   * The buffer object or memory objects created from this buffer object are
     not used by any command-queue until the read command has finished
     execution.

 Calling {clEnqueueReadBufferRect} to read a region of the buffer object with
 the _ptr_ argument value set to _host_ptr_ and with identical _host_origin_
 and _buffer_origin_ values, where _host_ptr_ is a pointer to the
 memory region specified when the buffer object being read is created with
 {CL_MEM_USE_HOST_PTR}, must meet the same requirements given above for
 {clEnqueueReadBuffer}.

 Calling {clEnqueueWriteBuffer} to update the latest bits in a region of the
 buffer object with the _ptr_ argument value set to _host_ptr_ + _offset_,
 where _host_ptr_ is a pointer to the memory region specified when the buffer
 object being written is created with {CL_MEM_USE_HOST_PTR}, must meet the
 following requirements in order to avoid undefined behavior:

   * The host memory region given by (_host_ptr_ + _offset_, _size_) contains
     the latest bits when the enqueued write command begins execution.
   * The buffer object or memory objects created from this buffer object are
     not mapped.
   * The buffer object or memory objects created from this buffer object are
     not used by any command-queue until the write command has finished
     execution.

 Calling {clEnqueueWriteBufferRect} to update the latest bits in a region of
 the buffer object with the _ptr_ argument value set to _host_ptr_ and with
 identical _host_origin_ and _buffer_origin_ values, where _host_ptr_ is a
 pointer to the memory region specified when the buffer object being written
 is created with {CL_MEM_USE_HOST_PTR}, must meet the following requirements in
 order to avoid undefined behavior:

   * The host memory region given by (_buffer_origin_, _region_) contains the
     latest bits when the enqueued write command begins execution.
   * The buffer object or memory objects created from this buffer object are
     not mapped.
   * The buffer object or memory objects created from this buffer object are
     not used by any command-queue until the write command has finished
     execution.
 ====
 --

 [open,refpage='clEnqueueCopyBuffer',desc='Enqueues a command to copy from one buffer object to another.',type='protos']
 --
 To enqueue a command to copy a buffer object identified by _src_buffer_ to
 another buffer object identified by _dst_buffer_, call the function

 include::{generated}/api/protos/clEnqueueCopyBuffer.txt[]
 include::{generated}/api/version-notes/clEnqueueCopyBuffer.asciidoc[]

   * _command_queue_ refers to a host command-queue in which the copy command
     will be queued.
     The OpenCL context associated with _command_queue_, _src_buffer_ and
     _dst_buffer_ must be the same.
   * _src_offset_ refers to the offset where to begin copying data from
     _src_buffer_.
   * _dst_offset_ refers to the offset where to begin copying data into
     _dst_buffer_.
   * _size_ refers to the size in bytes to copy.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this copy command
     and can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.

 // refError

 {clEnqueueCopyBuffer} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if the context associated with _command_queue_,
     _src_buffer_ and _dst_buffer_ are not the same or if the context
     associated with _command_queue_ and events in _event_wait_list_ are not
     the same.
   * {CL_INVALID_MEM_OBJECT} if _src_buffer_ and _dst_buffer_ are not valid
     buffer objects.
   * {CL_INVALID_VALUE} if _src_offset_, _dst_offset_, _size_, _src_offset_
     {plus} _size_ or _dst_offset_ + _size_ require accessing elements
     outside the _src_buffer_ and _dst_buffer_ buffer objects respectively.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _src_buffer_ is a sub-buffer object
     and the _origin_ specified when the sub-buffer object was created is not
     aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device
     associated with _command_queue_.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _dst_buffer_ is a sub-buffer object
     and the _origin_ specified when the sub-buffer object was created is not
     aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device
     associated with _command_queue_.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_MEM_COPY_OVERLAP} if _src_buffer_ and _dst_buffer_ are the same buffer
     or sub-buffer object and the source and destination regions overlap or
     if _src_buffer_ and _dst_buffer_ are different sub-buffers of the same
     associated buffer object and they overlap.
     The regions overlap if _src_offset_ {leq} _dst_offset_ {leq}
     _src_offset_ + _size_ - 1 or if _dst_offset_ {leq} _src_offset_ {leq}
     _dst_offset_ + _size_ - 1.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for data store associated with _src_buffer_ or _dst_buffer_.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
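
 The overlap condition given for {CL_MEM_COPY_OVERLAP} translates directly into a predicate.
 The sketch below uses a hypothetical function name and assumes that both offsets refer to the same buffer object and that _size_ is greater than 0.

```c
#include <stddef.h>

/*
 * Returns 1 if [src_offset, src_offset + size) and
 * [dst_offset, dst_offset + size) overlap, otherwise 0.
 * Assumes size > 0; with size == 0 the "- 1" below would wrap around.
 */
int linear_regions_overlap(size_t src_offset, size_t dst_offset, size_t size)
{
    return (src_offset <= dst_offset && dst_offset <= src_offset + size - 1) ||
           (dst_offset <= src_offset && src_offset <= dst_offset + size - 1);
}
```

 Copying 8 bytes from offset 0 to offset 8 does not overlap, because the source covers bytes 0 to 7, whereas copying 8 bytes from offset 0 to offset 4 does.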
 --

 [open,refpage='clEnqueueCopyBufferRect',desc='Enqueues a command to copy a 2D or 3D rectangular region from a buffer object to another buffer object.',type='protos']
 --
 To enqueue a command to copy a 2D or 3D rectangular region from the buffer
 object identified by _src_buffer_ to a 2D or 3D region in the buffer object
 identified by _dst_buffer_, call the function

 include::{generated}/api/protos/clEnqueueCopyBufferRect.txt[]
 include::{generated}/api/version-notes/clEnqueueCopyBufferRect.asciidoc[]

   * _command_queue_ refers to the host command-queue in which the copy command
     will be queued.
     The OpenCL context associated with _command_queue_, _src_buffer_ and
     _dst_buffer_ must be the same.
   * _src_origin_ defines the (_x_, _y_, _z_) offset in the memory region
     associated with _src_buffer_.
     For a 2D rectangle region, the _z_ value given by _src_origin_[2] should be
     0.
     The offset in bytes is computed as _src_origin_[2] {times} _src_slice_pitch_
     + _src_origin_[1] {times} _src_row_pitch_ + _src_origin_[0].
   * _dst_origin_ defines the (_x_, _y_, _z_) offset in the memory region
     associated with _dst_buffer_.
     For a 2D rectangle region, the _z_ value given by _dst_origin_[2] should be
     0.
     The offset in bytes is computed as _dst_origin_[2] {times} _dst_slice_pitch_
     + _dst_origin_[1] {times} _dst_row_pitch_ + _dst_origin_[0].
   * _region_ defines the (_width_ in bytes, _height_ in rows, _depth_ in slices)
     of the 2D or 3D rectangle being copied.
     For a 2D rectangle, the _depth_ value given by _region_[2] should be 1.
     The values in _region_ cannot be 0.
   * _src_row_pitch_ is the length of each row in bytes to be used for the memory
     region associated with _src_buffer_.
     If _src_row_pitch_ is 0, _src_row_pitch_ is computed as _region_[0].
   * _src_slice_pitch_ is the length of each 2D slice in bytes to be used for the
     memory region associated with _src_buffer_.
     If _src_slice_pitch_ is 0, _src_slice_pitch_ is computed as _region_[1]
     {times} _src_row_pitch_.
   * _dst_row_pitch_ is the length of each row in bytes to be used for the memory
     region associated with _dst_buffer_.
     If _dst_row_pitch_ is 0, _dst_row_pitch_ is computed as _region_[0].
   * _dst_slice_pitch_ is the length of each 2D slice in bytes to be used for the
     memory region associated with _dst_buffer_.
     If _dst_slice_pitch_ is 0, _dst_slice_pitch_ is computed as _region_[1]
     {times} _dst_row_pitch_.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this copy command
     and can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.

 Copying begins at the source offset and destination offset which are
 computed as described above in the description for _src_origin_ and
 _dst_origin_.
 Each byte of the region's width is copied from the source offset to the
 destination offset.
 After copying each width, the source and destination offsets are incremented
 by their respective source and destination row pitches.
 After copying each 2D rectangle, the source and destination offsets are
 incremented by their respective source and destination slice pitches.
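
 The copy procedure described above can be modelled on the host with ordinary memory.
 The following is a sketch only: `copy_rect_host` is a hypothetical function operating on plain byte arrays, it performs no validation, and it assumes the source and destination ranges do not overlap.

```c
#include <stddef.h>
#include <string.h>

/* Host-side model of a rectangular copy between two byte arrays. */
void copy_rect_host(const unsigned char *src, unsigned char *dst,
                    const size_t src_origin[3], const size_t dst_origin[3],
                    const size_t region[3],
                    size_t src_row_pitch, size_t src_slice_pitch,
                    size_t dst_row_pitch, size_t dst_slice_pitch)
{
    /* Resolve pitches passed as 0, as described for the pitch parameters. */
    if (src_row_pitch == 0)   src_row_pitch = region[0];
    if (src_slice_pitch == 0) src_slice_pitch = region[1] * src_row_pitch;
    if (dst_row_pitch == 0)   dst_row_pitch = region[0];
    if (dst_slice_pitch == 0) dst_slice_pitch = region[1] * dst_row_pitch;

    /* Starting byte offsets computed from the origins. */
    size_t src_base = src_origin[2] * src_slice_pitch
                    + src_origin[1] * src_row_pitch + src_origin[0];
    size_t dst_base = dst_origin[2] * dst_slice_pitch
                    + dst_origin[1] * dst_row_pitch + dst_origin[0];

    /* Copy region[0] bytes per row, stepping by row and slice pitches. */
    for (size_t z = 0; z < region[2]; ++z) {
        for (size_t y = 0; y < region[1]; ++y) {
            memcpy(dst + dst_base + z * dst_slice_pitch + y * dst_row_pitch,
                   src + src_base + z * src_slice_pitch + y * src_row_pitch,
                   region[0]);
        }
    }
}
```

 For example, copying a 2 {times} 2 byte region from origin (1, 1, 0) of a 4 {times} 4 source with a row pitch of 4 into a tightly packed destination picks up source bytes 5, 6, 9 and 10.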

 [NOTE]
 ====
 If _src_buffer_ and _dst_buffer_ are the same buffer object, _src_row_pitch_
 must equal _dst_row_pitch_ and _src_slice_pitch_ must equal
 _dst_slice_pitch_.
 ====

 // refError

 {clEnqueueCopyBufferRect} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if the context associated with _command_queue_,
     _src_buffer_ and _dst_buffer_ are not the same or if the context
     associated with _command_queue_ and events in _event_wait_list_ are not
     the same.
   * {CL_INVALID_MEM_OBJECT} if _src_buffer_ and _dst_buffer_ are not valid
     buffer objects.
   * {CL_INVALID_VALUE} if _src_origin_, _dst_origin_, or _region_ is `NULL`.
   * {CL_INVALID_VALUE} if (_src_origin_, _region_, _src_row_pitch_,
     _src_slice_pitch_) or (_dst_origin_, _region_, _dst_row_pitch_,
     _dst_slice_pitch_) require accessing elements outside the _src_buffer_
     and _dst_buffer_ buffer objects respectively.
   * {CL_INVALID_VALUE} if any _region_ array element is 0.
   * {CL_INVALID_VALUE} if _src_row_pitch_ is not 0 and is less than
     _region_[0].
   * {CL_INVALID_VALUE} if _dst_row_pitch_ is not 0 and is less than
     _region_[0].
   * {CL_INVALID_VALUE} if _src_slice_pitch_ is not 0 and is less than
     _region_[1] {times} _src_row_pitch_ or if _src_slice_pitch_ is not 0 and
     is not a multiple of _src_row_pitch_.
   * {CL_INVALID_VALUE} if _dst_slice_pitch_ is not 0 and is less than
     _region_[1] {times} _dst_row_pitch_ or if _dst_slice_pitch_ is not 0 and
     is not a multiple of _dst_row_pitch_.
   * {CL_INVALID_VALUE} if _src_buffer_ and _dst_buffer_ are the same buffer
     object and _src_slice_pitch_ is not equal to _dst_slice_pitch_ or
     _src_row_pitch_ is not equal to _dst_row_pitch_.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_MEM_COPY_OVERLAP} if _src_buffer_ and _dst_buffer_ are the same buffer
     or sub-buffer object and the source and destination regions overlap or
     if _src_buffer_ and _dst_buffer_ are different sub-buffers of the same
     associated buffer object and they overlap.
     Refer to <<check-copy-overlap, Checking for Memory Copy Overlap>> for
     details on how to determine if source and destination regions overlap.
   * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _src_buffer_ is a sub-buffer object
     and the _offset_ specified when the sub-buffer object is created is not
     aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device
     associated with _command_queue_.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _dst_buffer_ is a sub-buffer object
     and the _offset_ specified when the sub-buffer object is created is not
     aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device
     associated with _command_queue_.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for data store associated with _src_buffer_ or _dst_buffer_.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
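
The pitch-related {CL_INVALID_VALUE} conditions listed for
{clEnqueueCopyBufferRect} can be checked on the host before the call is made.
The following is a minimal sketch in plain C that needs no OpenCL headers;
`rect_pitches_valid` is a hypothetical helper name, not an API function.
It assumes that a row pitch of 0 stands for _region_[0] and that the slice
pitch rules are evaluated against that effective row pitch.
It ignores arithmetic overflow and does not perform the bounds or overlap
checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: applies the region and pitch rules for one side
 * (source or destination) of a rectangular buffer copy.
 * Returns true when none of the listed pitch conditions is violated. */
bool rect_pitches_valid(const size_t region[3],
                        size_t row_pitch, size_t slice_pitch)
{
    if (region == NULL)
        return false;                       /* region must not be NULL */
    if (region[0] == 0 || region[1] == 0 || region[2] == 0)
        return false;                       /* no region element may be 0 */
    if (row_pitch != 0 && row_pitch < region[0])
        return false;                       /* row pitch too small */

    /* Assumption: a row pitch of 0 means "use region[0]". */
    size_t effective_row_pitch = row_pitch != 0 ? row_pitch : region[0];

    if (slice_pitch != 0) {
        if (slice_pitch < region[1] * effective_row_pitch)
            return false;                   /* slice pitch too small */
        if (slice_pitch % effective_row_pitch != 0)
            return false;                   /* not a multiple of row pitch */
    }
    return true;
}
```

An application would call this once with the source pitches and once with the
destination pitches, and would additionally compare the two pairs when
_src_buffer_ and _dst_buffer_ are the same buffer object.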


 === Filling Buffer Objects

 NOTE: Filling buffer objects is <<unified-spec, missing before>> version 1.2.

 [open,refpage='clEnqueueFillBuffer',desc='Enqueues a command to fill a buffer object with a pattern of a given pattern size.',type='protos']
 --
 To enqueue a command to fill a buffer object with a pattern of a given
 pattern size, call the function

 include::{generated}/api/protos/clEnqueueFillBuffer.txt[]
 include::{generated}/api/version-notes/clEnqueueFillBuffer.asciidoc[]

   * _command_queue_ refers to the host command-queue in which the fill command
     will be queued.
     The OpenCL context associated with _command_queue_ and _buffer_ must be the
     same.
   * _buffer_ is a valid buffer object.
   * _pattern_ is a pointer to the data pattern of size _pattern_size_ in bytes.
     _pattern_ will be used to fill a region in _buffer_ that starts at _offset_
     and is _size_ bytes in size.
     The data pattern must be a scalar or vector integer or floating-point data
     type supported by OpenCL as described in <<scalar-data-types, Shared
     Application Scalar Data Types>> and <<vector-data-types, Supported
     Application Vector Data Types>>.
     For example, if _buffer_ is to be filled with a pattern of `float4` values,
     then _pattern_ will be a pointer to a `cl_float4` value and _pattern_size_
     will be `sizeof(cl_float4)`.
     The maximum value of _pattern_size_ is the size of the largest integer or
     floating-point vector data type supported by the OpenCL device.
     The memory associated with _pattern_ can be reused or freed after the
     function returns.
   * _offset_ is the location in bytes of the region being filled in _buffer_ and
     must be a multiple of _pattern_size_.
   * _size_ is the size in bytes of region being filled in _buffer_ and must be a
     multiple of _pattern_size_.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this command
     and can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.
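
The consistency rule between _event_wait_list_ and _num_events_in_wait_list_
can be checked before a command is enqueued.
The following is a minimal sketch in plain C; `wait_list_args_consistent` is a
hypothetical helper, not an OpenCL API function, and it takes the list as an
untyped pointer so that it compiles without the OpenCL headers.
It does not check that the events themselves are valid or that they belong to
the same context as _command_queue_.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: a NULL list requires a count of 0, and a non-NULL
 * list requires a count greater than 0. */
bool wait_list_args_consistent(const void *event_wait_list,
                               unsigned int num_events_in_wait_list)
{
    if (event_wait_list == NULL)
        return num_events_in_wait_list == 0;
    return num_events_in_wait_list > 0;
}
```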

 {clEnqueueFillBuffer} ignores the usage information given by the
 {cl_mem_flags_TYPE} argument value specified when _buffer_ is created, which
 indicates whether the memory object can be read or written by a kernel and/or
 the host.

 // refError

 {clEnqueueFillBuffer} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
     _buffer_ are not the same or if the context associated with
     _command_queue_ and events in _event_wait_list_ are not the same.
   * {CL_INVALID_MEM_OBJECT} if _buffer_ is not a valid buffer object.
   * {CL_INVALID_VALUE} if _offset_ or _offset_ + _size_ requires accessing
     elements outside the _buffer_ buffer object.
   * {CL_INVALID_VALUE} if _pattern_ is `NULL` or if _pattern_size_ is 0 or if
     _pattern_size_ is not one of { 1, 2, 4, 8, 16, 32, 64, 128 }.
   * {CL_INVALID_VALUE} if _offset_ or _size_ is not a multiple of
     _pattern_size_.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _buffer_ is a sub-buffer object and
     the _offset_ specified when the sub-buffer object is created is not
     aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device
     associated with _command_queue_.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for data store associated with _buffer_.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
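
The argument rules for {clEnqueueFillBuffer} (pattern size, alignment of
_offset_ and _size_, and bounds) can be summarized as a host-side check.
The following is a minimal sketch in plain C without the OpenCL headers;
`fill_args_valid` is a hypothetical helper name and `buffer_size` is an assumed
parameter holding the size in bytes of _buffer_.
It covers only the {CL_INVALID_VALUE} conditions listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when the fill arguments satisfy the
 * CL_INVALID_VALUE rules for a buffer of buffer_size bytes. */
bool fill_args_valid(size_t buffer_size, const void *pattern,
                     size_t pattern_size, size_t offset, size_t size)
{
    if (pattern == NULL)
        return false;

    /* pattern_size must be one of 1, 2, 4, 8, 16, 32, 64, 128:
     * a non-zero power of two that does not exceed 128. */
    if (pattern_size == 0 || pattern_size > 128 ||
        (pattern_size & (pattern_size - 1)) != 0)
        return false;

    /* Both offset and size must be multiples of pattern_size. */
    if (offset % pattern_size != 0 || size % pattern_size != 0)
        return false;

    /* The region [offset, offset + size) must lie inside the buffer.
     * Written without computing offset + size to avoid overflow. */
    if (offset > buffer_size || size > buffer_size - offset)
        return false;

    return true;
}
```

For a pattern of four `float` values, _pattern_size_ is 16 bytes, so a fill at
_offset_ 8 is rejected while a fill at _offset_ 16 is accepted.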


 === Mapping Buffer Objects

 [open,refpage='clEnqueueMapBuffer',desc='Enqueues a command to map a region of a buffer object into the host address space and returns a pointer to this mapped region.',type='protos']
 --
 To enqueue a command to map a region of the buffer object given by _buffer_
 into the host address space and return a pointer to this mapped region,
 call the function

 include::{generated}/api/protos/clEnqueueMapBuffer.txt[]
 include::{generated}/api/version-notes/clEnqueueMapBuffer.asciidoc[]

   * _command_queue_ must be a valid host command-queue.
   * _blocking_map_ indicates if the map operation is _blocking_ or
     _non-blocking_.

 If _blocking_map_ is {CL_TRUE}, {clEnqueueMapBuffer} does not return until the
 specified region in _buffer_ is mapped into the host address space and the
 application can access the contents of the mapped region using the pointer
 returned by {clEnqueueMapBuffer}.

 If _blocking_map_ is {CL_FALSE}, i.e. the map operation is non-blocking, the
 pointer to the mapped region returned by {clEnqueueMapBuffer} cannot be used
 until the map command has completed.
 The _event_ argument returns an event object which can be used to query the
 execution status of the map command.
 When the map command is completed, the application can access the contents
 of the mapped region using the pointer returned by {clEnqueueMapBuffer}.

   * _map_flags_ is a bit-field and is described in the
     <<memory-map-flags-table,Memory Map Flags>> table.
   * _buffer_ is a valid buffer object.
     The OpenCL context associated with _command_queue_ and _buffer_ must be the
     same.
   * _offset_ and _size_ are the offset in bytes and the size of the region in
     the buffer object that is being mapped.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this command
     and can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.
   * _errcode_ret_ will return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.

 // refError

 {clEnqueueMapBuffer} returns a pointer to the mapped region and sets
 _errcode_ret_ to {CL_SUCCESS} if the function is executed successfully.
 Otherwise, it returns a `NULL` pointer with one of the following error
 values returned in _errcode_ret_:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
     _buffer_ are not the same or if the context associated with
     _command_queue_ and events in _event_wait_list_ are not the same.
   * {CL_INVALID_MEM_OBJECT} if _buffer_ is not a valid buffer object.
   * {CL_INVALID_VALUE} if region being mapped given by (_offset_, _size_) is
     out of bounds or if _size_ is 0 or if values specified in _map_flags_
     are not valid.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _buffer_ is a sub-buffer object and
     the _offset_ specified when the sub-buffer object is created is not
     aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device
     associated with _command_queue_.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_MAP_FAILURE} if there is a failure to map the requested region into
     the host address space.
     This error cannot occur for buffer objects created with
     {CL_MEM_USE_HOST_PTR} or {CL_MEM_ALLOC_HOST_PTR}.
   * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the map operation is
     blocking and the execution status of any of the events in
     _event_wait_list_ is a negative integer value.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for data store associated with _buffer_.
   * {CL_INVALID_OPERATION} if _buffer_ has been created with
     {CL_MEM_HOST_WRITE_ONLY} or {CL_MEM_HOST_NO_ACCESS} and {CL_MAP_READ} is set
     in _map_flags_ or if _buffer_ has been created with
     {CL_MEM_HOST_READ_ONLY} or {CL_MEM_HOST_NO_ACCESS} and {CL_MAP_WRITE} or
     {CL_MAP_WRITE_INVALIDATE_REGION} is set in _map_flags_.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
   * {CL_INVALID_OPERATION} if mapping would lead to overlapping regions being
     mapped for writing.

 The pointer returned maps a region starting at _offset_ and is at least
 _size_ bytes in size.
 The result of a memory access outside this region is undefined.

 If the buffer object is created with {CL_MEM_USE_HOST_PTR} set in _mem_flags_,
 the following will be true:

   * The _host_ptr_ specified in {clCreateBuffer} or {clCreateBufferWithProperties}
     will contain the latest bits in the region being mapped when the
     {clEnqueueMapBuffer} command has completed.
   * The pointer value returned by {clEnqueueMapBuffer} will be derived from
     the _host_ptr_ specified when the buffer object is created.

 Mapped buffer objects are unmapped using {clEnqueueUnmapMemObject}.
 This is described in <<unmapping-mapped-memory, Unmapping Mapped Memory
 Objects>>.
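
The checks on _map_flags_, on the mapped region, and on the host access flags
of _buffer_ can be sketched as a host-side function.
This is a minimal sketch in plain C; `check_map_request` and the `MAP_*` and
`HOST_*` names are hypothetical, with bit values copied from the corresponding
`CL_MAP_*` and `CL_MEM_HOST_*` definitions in `cl.h` so that the code compiles
without the OpenCL headers.
Treating a mutually exclusive flag combination as an invalid value, and the
order in which the conditions are tested, are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the map flag and host access flag bits (assumed to match
 * cl.h), renamed to avoid clashing with the real header. */
enum { MAP_READ = 1 << 0, MAP_WRITE = 1 << 1,
       MAP_WRITE_INVALIDATE_REGION = 1 << 2 };
enum { HOST_WRITE_ONLY = 1 << 7, HOST_READ_ONLY = 1 << 8,
       HOST_NO_ACCESS = 1 << 9 };

typedef enum { MAP_OK, MAP_INVALID_VALUE, MAP_INVALID_OPERATION } map_check;

/* Hypothetical helper: classifies a map request against the rules for
 * CL_INVALID_VALUE and CL_INVALID_OPERATION. */
map_check check_map_request(uint64_t mem_flags, uint64_t map_flags,
                            size_t buffer_size, size_t offset, size_t size)
{
    const uint64_t known = MAP_READ | MAP_WRITE | MAP_WRITE_INVALIDATE_REGION;

    /* Region must be non-empty and inside the buffer. */
    if (size == 0 || offset > buffer_size || size > buffer_size - offset)
        return MAP_INVALID_VALUE;

    /* Unknown bits, or READ/WRITE combined with WRITE_INVALIDATE_REGION. */
    if ((map_flags & ~known) != 0)
        return MAP_INVALID_VALUE;
    if ((map_flags & MAP_WRITE_INVALIDATE_REGION) != 0 &&
        (map_flags & (MAP_READ | MAP_WRITE)) != 0)
        return MAP_INVALID_VALUE;

    /* Host access restrictions recorded when the buffer was created. */
    if ((map_flags & MAP_READ) != 0 &&
        (mem_flags & (HOST_WRITE_ONLY | HOST_NO_ACCESS)) != 0)
        return MAP_INVALID_OPERATION;
    if ((map_flags & (MAP_WRITE | MAP_WRITE_INVALIDATE_REGION)) != 0 &&
        (mem_flags & (HOST_READ_ONLY | HOST_NO_ACCESS)) != 0)
        return MAP_INVALID_OPERATION;

    return MAP_OK;
}
```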

 [[memory-map-flags-table]]
 .List of supported map flag values
 [width="100%",cols="<50%,<50%",options="header"]
 |====
 | Map Flags | Description
 | {CL_MAP_READ_anchor}

 include::{generated}/api/version-notes/CL_MAP_READ.asciidoc[]
   | This flag specifies that the region being mapped in the memory object is
     being mapped for reading.

     The pointer returned by {clEnqueueMapBuffer} ({clEnqueueMapImage}) is
     guaranteed to contain the latest bits in the region being mapped when
     the {clEnqueueMapBuffer} ({clEnqueueMapImage}) command has completed.
 | {CL_MAP_WRITE_anchor}

 include::{generated}/api/version-notes/CL_MAP_WRITE.asciidoc[]
   | This flag specifies that the region being mapped in the memory object is
     being mapped for writing.

     The pointer returned by {clEnqueueMapBuffer} ({clEnqueueMapImage}) is
     guaranteed to contain the latest bits in the region being mapped when
     the {clEnqueueMapBuffer} ({clEnqueueMapImage}) command has completed.
 | {CL_MAP_WRITE_INVALIDATE_REGION_anchor}

 include::{generated}/api/version-notes/CL_MAP_WRITE_INVALIDATE_REGION.asciidoc[]
   | This flag specifies that the region being mapped in the memory object is
     being mapped for writing.

     The contents of the region being mapped are to be discarded.
     This is typically the case when the region being mapped is overwritten
     by the host.
     This flag allows the implementation to no longer guarantee that the
     pointer returned by {clEnqueueMapBuffer} ({clEnqueueMapImage}) contains
     the latest bits in the region being mapped which can be a significant
     performance enhancement.

     {CL_MAP_WRITE_INVALIDATE_REGION} is mutually exclusive with
     {CL_MAP_READ} and {CL_MAP_WRITE}.
 |====
 --


 == Image Objects

 An _image_ object is used to store a one-, two- or three-dimensional
 texture, frame-buffer or image.
 The elements of an image object are selected from a list of predefined image
 formats.
 The minimum number of elements in a memory object is one.


 === Creating Image Objects

 [open,refpage='clCreateImage',desc='Creates a 1D image, 1D image buffer, 1D image array, 2D image, 2D image array or 3D image object.',type='protos',alias='clCreateImageWithProperties']
 --
 An *image object* may be created using the function

 include::{generated}/api/protos/clCreateImage.txt[]
 include::{generated}/api/version-notes/clCreateImage.asciidoc[]

 An *image object* may also be created with additional properties using the function

 include::{generated}/api/protos/clCreateImageWithProperties.txt[]
 include::{generated}/api/version-notes/clCreateImageWithProperties.asciidoc[]

   * _context_ is a valid OpenCL context used to create the image object.
   * _properties_ is an optional list of properties for the image object and their corresponding values.
     The list is terminated with the special property `0`.
     If no properties are required, _properties_ may be `NULL`.
     OpenCL 3.0 does not define any optional properties for images.
   * _flags_ is a bit-field that is used to specify allocation and usage
     information about the image memory object being created and is described in
     the <<memory-flags-table, supported memory flag values>> table.
   * _image_format_ is a pointer to a structure that describes format properties
     of the image to be allocated.
     A 1D image buffer or 2D image can be created from a buffer by specifying a
     buffer object in the __image_desc__->__mem_object__.
     A 2D image can be created from another 2D image object by specifying an
     image object in the __image_desc__->__mem_object__.
     Refer to the <<image-format-descriptor, Image Format Descriptor>> section
     for a detailed description of the image format descriptor.
   * _image_desc_ is a pointer to a structure that describes type and dimensions
     of the image to be allocated.
     Refer to the <<image-descriptor, Image Descriptor>> section for a detailed
     description of the image descriptor.
   * _host_ptr_ is a pointer to the image data that may already be allocated by
     the application.
     Refer to the <<host-ptr-buffer-size-table, table below>> for a description
     of how large the buffer that _host_ptr_ points to must be.
   * _errcode_ret_ will return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.

 The alignment requirements for data stored in image objects are described
 in <<alignment-app-data-types>>.

 For all image types except {CL_MEM_OBJECT_IMAGE1D_BUFFER}, if the value
 specified for _flags_ is 0, the default is used which is {CL_MEM_READ_WRITE}.

 For {CL_MEM_OBJECT_IMAGE1D_BUFFER} image type, or an image created from
 another memory object (image or buffer), if the {CL_MEM_READ_WRITE},
 {CL_MEM_READ_ONLY} or {CL_MEM_WRITE_ONLY} values are not specified in _flags_,
 they are inherited from the corresponding memory access qualifiers associated
 with __mem_object__.
 The {CL_MEM_USE_HOST_PTR}, {CL_MEM_ALLOC_HOST_PTR} and {CL_MEM_COPY_HOST_PTR}
 values cannot be specified in _flags_ but are inherited from the
 corresponding memory access qualifiers associated with __mem_object__.
 If {CL_MEM_COPY_HOST_PTR} is specified in the memory access qualifier values
 associated with __mem_object__ it does not imply any additional copies when
 the image is created from __mem_object__.
 If the {CL_MEM_HOST_WRITE_ONLY}, {CL_MEM_HOST_READ_ONLY} or
 {CL_MEM_HOST_NO_ACCESS} values are not specified in _flags_, they are
 inherited from the corresponding memory access qualifiers associated with
 __mem_object__.
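
The inheritance rules for an image created from another memory object,
together with the {CL_INVALID_VALUE} conditions listed for this function, can
be sketched as a small host-side function.
This is a minimal sketch in plain C; `derive_image_flags` and the `F_*` names
are hypothetical, with bit values copied from the `CL_MEM_*` definitions in
`cl.h` so that the code compiles without the OpenCL headers.
It assumes that `parent` holds the flags of __mem_object__ with any defaults
already made explicit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the cl_mem_flags bits (assumed to match cl.h). */
enum {
    F_READ_WRITE      = 1 << 0,
    F_WRITE_ONLY      = 1 << 1,
    F_READ_ONLY       = 1 << 2,
    F_USE_HOST_PTR    = 1 << 3,
    F_ALLOC_HOST_PTR  = 1 << 4,
    F_COPY_HOST_PTR   = 1 << 5,
    F_HOST_WRITE_ONLY = 1 << 7,
    F_HOST_READ_ONLY  = 1 << 8,
    F_HOST_NO_ACCESS  = 1 << 9
};

/* Hypothetical helper: computes the effective flags of an image created from
 * a parent memory object. Returns false for the combinations that the
 * specification reports as CL_INVALID_VALUE. */
bool derive_image_flags(uint64_t parent, uint64_t requested, uint64_t *out)
{
    const uint64_t access   = F_READ_WRITE | F_WRITE_ONLY | F_READ_ONLY;
    const uint64_t host_ptr = F_USE_HOST_PTR | F_ALLOC_HOST_PTR |
                              F_COPY_HOST_PTR;
    const uint64_t host_acc = F_HOST_WRITE_ONLY | F_HOST_READ_ONLY |
                              F_HOST_NO_ACCESS;
    assert(out != NULL);

    /* Host pointer flags cannot be specified; they are only inherited. */
    if (requested & host_ptr)
        return false;
    /* Kernel access may not be widened or contradicted. */
    if ((parent & F_WRITE_ONLY) && (requested & (F_READ_WRITE | F_READ_ONLY)))
        return false;
    if ((parent & F_READ_ONLY) && (requested & (F_READ_WRITE | F_WRITE_ONLY)))
        return false;
    /* Host access may not contradict the parent either. */
    if ((parent & F_HOST_WRITE_ONLY) && (requested & F_HOST_READ_ONLY))
        return false;
    if ((parent & F_HOST_READ_ONLY) && (requested & F_HOST_WRITE_ONLY))
        return false;
    if ((parent & F_HOST_NO_ACCESS) &&
        (requested & (F_HOST_READ_ONLY | F_HOST_WRITE_ONLY)))
        return false;

    uint64_t flags = requested;
    if (!(requested & access))
        flags |= parent & access;       /* inherit kernel access */
    if (!(requested & host_acc))
        flags |= parent & host_acc;     /* inherit host access */
    flags |= parent & host_ptr;         /* always inherited */
    *out = flags;
    return true;
}
```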

 For a 3D image or 2D image array, the image data specified by _host_ptr_ is
 stored as a linear sequence of adjacent 2D image slices or 2D images
 respectively.
 Each 2D image is a linear sequence of adjacent scanlines.
 Each scanline is a linear sequence of image elements.

 For a 2D image, the image data specified by _host_ptr_ is stored as a linear
 sequence of adjacent scanlines.
 Each scanline is a linear sequence of image elements.

 For a 1D image array, the image data specified by _host_ptr_ is stored as a
 linear sequence of adjacent 1D images.
 Each 1D image is stored as a single scanline which is a linear sequence of
 adjacent elements.

 For 1D image or 1D image buffer, the image data specified by _host_ptr_ is
 stored as a single scanline which is a linear sequence of adjacent elements.

 Image elements are stored according to their image format as described in the
 <<image-format-descriptor, Image Format Descriptor>> section.

 // refError

 {clCreateImage} and {clCreateImageWithProperties} return a valid non-zero
 image object and set _errcode_ret_ to {CL_SUCCESS} if the image object
 is created successfully.
 Otherwise, they return a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_PROPERTY} if a property name in _properties_ is not a
     supported property name, if the value specified for a supported property
     name is not valid, or if the same property name is specified more than
     once.
   * {CL_INVALID_VALUE} if values specified in _flags_ are not valid.
   * {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR} if values specified in _image_format_
     are not valid or if _image_format_ is `NULL`.
   * {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR} if a 2D image is created from a
     buffer and the row pitch and base address alignment do not follow the
     rules described for creating a 2D image from a buffer.
   * {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR} if a 2D image is created from a 2D
     image object and the rules described above are not followed.
   * {CL_INVALID_IMAGE_DESCRIPTOR} if values specified in _image_desc_ are not
     valid or if _image_desc_ is `NULL`.
   * {CL_INVALID_IMAGE_SIZE} if image dimensions specified in _image_desc_
     exceed the maximum image dimensions described in the
     <<device-queries-table,Device Queries>> table for all devices
     in _context_.
   * {CL_INVALID_HOST_PTR} if _host_ptr_ is `NULL` and {CL_MEM_USE_HOST_PTR} or
     {CL_MEM_COPY_HOST_PTR} are set in _flags_ or if _host_ptr_ is not `NULL`
     but {CL_MEM_COPY_HOST_PTR} or {CL_MEM_USE_HOST_PTR} are not set in _flags_.
   * {CL_INVALID_VALUE} if an image is being created from another memory object
     (buffer or image) under one of the following circumstances: 1)
     __mem_object__ was created with {CL_MEM_WRITE_ONLY} and _flags_ specifies
     {CL_MEM_READ_WRITE} or {CL_MEM_READ_ONLY}, 2) __mem_object__ was created with
     {CL_MEM_READ_ONLY} and _flags_ specifies {CL_MEM_READ_WRITE} or
     {CL_MEM_WRITE_ONLY}, 3) _flags_ specifies {CL_MEM_USE_HOST_PTR} or
     {CL_MEM_ALLOC_HOST_PTR} or {CL_MEM_COPY_HOST_PTR}.
   * {CL_INVALID_VALUE} if an image is being created from another memory object
     (buffer or image) and __mem_object__ was created with
     {CL_MEM_HOST_WRITE_ONLY} and _flags_ specifies {CL_MEM_HOST_READ_ONLY}, or
     if __mem_object__ was created with {CL_MEM_HOST_READ_ONLY} and _flags_
     specifies {CL_MEM_HOST_WRITE_ONLY}, or if __mem_object__ was created with
     {CL_MEM_HOST_NO_ACCESS} and _flags_ specifies {CL_MEM_HOST_READ_ONLY} or
     {CL_MEM_HOST_WRITE_ONLY}.
   * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if there are no devices in _context_ that
     support _image_format_.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for image object.
   * {CL_INVALID_OPERATION} if there are no devices in _context_ that support
     images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
     <<device-queries-table,Device Queries>> table is {CL_FALSE}).
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.

 [[host-ptr-buffer-size-table]]
 .Required _host_ptr_ buffer sizes for images
 [width="100%",cols="<50%,<50%",options="header"]
 |====
 | Image Type | Size of buffer that _host_ptr_ points to
 | {CL_MEM_OBJECT_IMAGE1D_anchor}

 include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE1D.asciidoc[]
   | {geq} image_row_pitch
 | {CL_MEM_OBJECT_IMAGE1D_BUFFER_anchor}

 include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE1D_BUFFER.asciidoc[]
   | {geq} image_row_pitch
 | {CL_MEM_OBJECT_IMAGE2D_anchor}

 include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE2D.asciidoc[]
   | {geq} image_row_pitch {times} image_height
 | {CL_MEM_OBJECT_IMAGE3D_anchor}

 include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE3D.asciidoc[]
   | {geq} image_slice_pitch {times} image_depth
 | {CL_MEM_OBJECT_IMAGE1D_ARRAY_anchor}

 include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE1D_ARRAY.asciidoc[]
   | {geq} image_slice_pitch {times} image_array_size
 | {CL_MEM_OBJECT_IMAGE2D_ARRAY_anchor}

 include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE2D_ARRAY.asciidoc[]
   | {geq} image_slice_pitch {times} image_array_size
 |====
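
The table above translates directly into a size computation.
The following is a minimal sketch in plain C; `image_kind`,
`host_image_layout`, and `min_host_ptr_bytes` are hypothetical names that stand
in for the `CL_MEM_OBJECT_*` image types and the relevant image descriptor
fields, so that the code compiles without the OpenCL headers.
It assumes that the pitches passed in are the effective, non-zero values and
it ignores arithmetic overflow.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the CL_MEM_OBJECT_IMAGE* types (hypothetical names). */
typedef enum {
    IMG_1D, IMG_1D_BUFFER, IMG_2D, IMG_3D, IMG_1D_ARRAY, IMG_2D_ARRAY
} image_kind;

/* The subset of image descriptor fields needed for the size computation. */
typedef struct {
    image_kind kind;
    size_t image_height;
    size_t image_depth;
    size_t image_array_size;
    size_t image_row_pitch;    /* effective row pitch in bytes */
    size_t image_slice_pitch;  /* effective slice pitch in bytes */
} host_image_layout;

/* Hypothetical helper: minimum number of bytes that host_ptr must point to,
 * following the table row for each image type. */
size_t min_host_ptr_bytes(const host_image_layout *l)
{
    assert(l != NULL);
    switch (l->kind) {
    case IMG_1D:
    case IMG_1D_BUFFER:
        return l->image_row_pitch;
    case IMG_2D:
        return l->image_row_pitch * l->image_height;
    case IMG_3D:
        return l->image_slice_pitch * l->image_depth;
    case IMG_1D_ARRAY:
    case IMG_2D_ARRAY:
        return l->image_slice_pitch * l->image_array_size;
    }
    return 0; /* unreachable for valid image_kind values */
}
```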

 --

 [open,refpage='clCreateImage2D',desc='Creates a 2D image.',type='protos']
 --
 A *2D image* object can be created using the following function

 include::{generated}/api/protos/clCreateImage2D.txt[]
 include::{generated}/api/version-notes/clCreateImage2D.asciidoc[]

   * _context_ is a valid OpenCL context on which the image object is to be
     created.
   * _flags_ is a bit-field that is used to specify allocation and usage
     information about the image memory object being created and is described in
     the <<memory-flags-table, supported memory flag values>> table.
     If the value specified for _flags_ is 0, the default is used which is
     {CL_MEM_READ_WRITE}.
   * _image_format_ is a pointer to a structure that describes format properties
     of the image to be allocated.
     Refer to the <<image-format-descriptor, Image Format Descriptor>> section
     for a detailed description of the image format descriptor.
   * _image_width_ and _image_height_ are the width and height of the image in
     pixels.
     These must be values greater than or equal to 1.
   * _image_row_pitch_ is the scan-line pitch in bytes.
     This must be 0 if _host_ptr_ is `NULL` and can be either 0 or {geq}
     _image_width_ {times} size of element in bytes if _host_ptr_ is not `NULL`.
     If _host_ptr_ is not `NULL` and _image_row_pitch_ is 0, _image_row_pitch_
     is calculated as _image_width_ {times} size of element in bytes.
     If _image_row_pitch_ is not 0, it must be a multiple of the image element
     size in bytes.
   * _host_ptr_ is a pointer to the image data that may already be allocated by
     the application.
     Refer to the {CL_MEM_OBJECT_IMAGE2D} entry in the
     <<host-ptr-buffer-size-table, required _host_ptr_ buffer size table>> for a
     description of how large the buffer that _host_ptr_ points to must be.
     The image data specified by _host_ptr_ is stored as a linear sequence of
     adjacent scanlines.
     Each scanline is a linear sequence of image elements.
     Image elements are stored according to their image format as described in
     the <<image-format-descriptor, Image Format Descriptor>> section.
   * _errcode_ret_ will return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.
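
The rules for _image_row_pitch_ can be expressed as a small validation
function.
The following is a minimal sketch in plain C; `resolve_row_pitch` is a
hypothetical helper name, and `elem_size` is an assumed parameter holding the
size in bytes of one image element for the chosen format.
When _host_ptr_ is `NULL` the sketch reports a pitch of 0, because the layout
is then not described by application data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: validates image_row_pitch and stores the pitch that
 * describes the host data in *out. Returns false when the rules in the
 * argument description are violated. */
bool resolve_row_pitch(size_t image_width, size_t elem_size,
                       bool has_host_ptr, size_t image_row_pitch,
                       size_t *out)
{
    assert(out != NULL);
    if (image_width == 0 || elem_size == 0)
        return false;

    size_t tight = image_width * elem_size; /* tightly packed scanline */

    if (!has_host_ptr) {
        /* Without host data the pitch must be 0. */
        if (image_row_pitch != 0)
            return false;
        *out = 0;
        return true;
    }
    if (image_row_pitch == 0) {
        *out = tight;                       /* calculated pitch */
        return true;
    }
    /* An explicit pitch must cover a scanline and be element aligned. */
    if (image_row_pitch < tight || image_row_pitch % elem_size != 0)
        return false;
    *out = image_row_pitch;
    return true;
}
```

For a 64-pixel-wide image with 4-byte elements, a pitch of 0 resolves to 256
bytes, 512 is accepted as padding, and 128 or 258 are rejected.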

 // refError

 {clCreateImage2D} returns a valid non-zero image object and sets
 _errcode_ret_ to {CL_SUCCESS} if the image object is created successfully.
 Otherwise, it returns a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_VALUE} if values specified in _flags_ are not valid.
   * {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR} if values specified in _image_format_
     are not valid or if _image_format_ is `NULL`.
   * {CL_INVALID_IMAGE_SIZE} if _image_width_ or _image_height_ are 0 or if they
     exceed the maximum values specified in {CL_DEVICE_IMAGE2D_MAX_WIDTH} or
     {CL_DEVICE_IMAGE2D_MAX_HEIGHT} respectively for all devices in _context_ or
     if values specified by _image_row_pitch_ do not follow rules described in the
     argument description above.
   * {CL_INVALID_HOST_PTR} if _host_ptr_ is `NULL` and {CL_MEM_USE_HOST_PTR} or
     {CL_MEM_COPY_HOST_PTR} are set in _flags_ or if _host_ptr_ is not `NULL`
     but {CL_MEM_COPY_HOST_PTR} or {CL_MEM_USE_HOST_PTR} are not set in _flags_.
   * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if there are no devices in _context_ that
     support _image_format_.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for image object.
   * {CL_INVALID_OPERATION} if there are no devices in _context_ that support
     images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
     <<device-queries-table,Device Queries>> table is {CL_FALSE}).
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clCreateImage3D',desc='Creates a 3D image.',type='protos']
 --
 A *3D image* object can be created using the following function

 include::{generated}/api/protos/clCreateImage3D.txt[]
 include::{generated}/api/version-notes/clCreateImage3D.asciidoc[]

   * _context_ is a valid OpenCL context on which the image object is to be
     created.
   * _flags_ is a bit-field that is used to specify allocation and usage
     information about the image memory object being created and is described in
     the <<memory-flags-table, supported memory flag values>> table.
     If the value specified for _flags_ is 0, the default is used which is
     {CL_MEM_READ_WRITE}.
   * _image_format_ is a pointer to a structure that describes format properties
     of the image to be allocated.
     Refer to the <<image-format-descriptor, Image Format Descriptor>> section
     for a detailed description of the image format descriptor.
   * _image_width_ and _image_height_ are the width and height of the image in
     pixels.
     These must be values greater than or equal to 1.
   * _image_depth_ is the depth of the image in pixels.  For {clCreateImage3D},
     this must be a value {gt} 1.
   * _image_row_pitch_ is the scan-line pitch in bytes.
     This must be 0 if _host_ptr_ is `NULL` and can be either 0 or {geq}
     _image_width_ {times} size of element in bytes if _host_ptr_ is not `NULL`.
     If _host_ptr_ is not `NULL` and _image_row_pitch_ is 0, _image_row_pitch_
     is calculated as _image_width_ {times} size of element in bytes.
     If _image_row_pitch_ is not 0, it must be a multiple of the image element
     size in bytes.
   * _image_slice_pitch_ is the size in bytes of each 2D slice in the 3D image.
     This must be 0 if _host_ptr_ is `NULL` and can be either 0 or {geq}
     _image_row_pitch_ {times} _image_height_ if _host_ptr_ is not `NULL`.
     If _host_ptr_ is not `NULL` and _image_slice_pitch_ is 0,
     _image_slice_pitch_ is calculated as _image_row_pitch_ {times}
     _image_height_.
     If _image_slice_pitch_ is not 0, it must be a multiple of the
     _image_row_pitch_.
   * _host_ptr_ is a pointer to the image data that may already be allocated by
     the application.
     Refer to the {CL_MEM_OBJECT_IMAGE3D} entry in the
     <<host-ptr-buffer-size-table, required _host_ptr_ buffer size table>> for a
     description of how large the buffer that _host_ptr_ points to must be.
     The image data specified by _host_ptr_ is stored as a linear sequence of
     adjacent 2D slices.
     Each 2D slice is a linear sequence of adjacent scanlines.
     Each scanline is a linear sequence of image elements.
     Image elements are stored according to their image format as described in
     the <<image-format-descriptor, Image Format Descriptor>> section.
   * _errcode_ret_ will return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.
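
The rules for _image_slice_pitch_ and the resulting host memory layout can be
sketched in a few lines.
The following is a minimal sketch in plain C; `resolve_slice_pitch` and
`host_element_offset` are hypothetical helper names, `row_pitch` is assumed to
be the effective, non-zero row pitch, and `elem_size` is an assumed parameter
holding the size in bytes of one image element.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: validates image_slice_pitch and stores the pitch that
 * describes the host data in *out (0 when there is no host data). */
bool resolve_slice_pitch(size_t row_pitch, size_t image_height,
                         bool has_host_ptr, size_t image_slice_pitch,
                         size_t *out)
{
    assert(out != NULL);
    if (row_pitch == 0 || image_height == 0)
        return false;

    size_t tight = row_pitch * image_height; /* tightly packed slice */

    if (!has_host_ptr) {
        if (image_slice_pitch != 0)
            return false;                    /* must be 0 without host data */
        *out = 0;
        return true;
    }
    if (image_slice_pitch == 0) {
        *out = tight;                        /* calculated pitch */
        return true;
    }
    /* An explicit pitch must cover a slice and be a multiple of the row
     * pitch. */
    if (image_slice_pitch < tight || image_slice_pitch % row_pitch != 0)
        return false;
    *out = image_slice_pitch;
    return true;
}

/* Byte offset of element (x, y, z) in host data laid out as adjacent 2D
 * slices, each made of adjacent scanlines. */
size_t host_element_offset(size_t x, size_t y, size_t z, size_t elem_size,
                           size_t row_pitch, size_t slice_pitch)
{
    return z * slice_pitch + y * row_pitch + x * elem_size;
}
```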

 // refError

 {clCreateImage3D} returns a valid non-zero image object and sets
 _errcode_ret_ to {CL_SUCCESS} if the image object is created successfully.
 Otherwise, it returns a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_VALUE} if values specified in _flags_ are not valid.
   * {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR} if values specified in _image_format_
     are not valid or if _image_format_ is `NULL`.
   * {CL_INVALID_IMAGE_SIZE} if _image_width_ or _image_height_ are 0 or if
     _image_depth_ {leq} 1, or if they exceed the maximum values specified in
     {CL_DEVICE_IMAGE3D_MAX_WIDTH}, {CL_DEVICE_IMAGE3D_MAX_HEIGHT} or
     {CL_DEVICE_IMAGE3D_MAX_DEPTH} respectively for all devices in _context_, or
     if values specified by _image_row_pitch_ and _image_slice_pitch_ do not
     follow rules described in the argument description above.
   * {CL_INVALID_HOST_PTR} if _host_ptr_ is `NULL` and {CL_MEM_USE_HOST_PTR} or
     {CL_MEM_COPY_HOST_PTR} are set in _flags_ or if _host_ptr_ is not `NULL`
     but {CL_MEM_COPY_HOST_PTR} or {CL_MEM_USE_HOST_PTR} are not set in _flags_.
   * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if there are no devices in _context_ that
     support _image_format_.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for the image object.
   * {CL_INVALID_OPERATION} if there are no devices in _context_ that support
     images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
     <<device-queries-table,Device Queries>> table is {CL_FALSE}).
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --


 [[image-format-descriptor]]
 ==== Image Format Descriptor

 [open,refpage='cl_image_format',desc='The image format descriptor structure is defined as:',type='structs',xrefs='clCreateImage']
 --
 The {cl_image_format_TYPE} image format descriptor structure describes an image
 format, and is defined as:

 include::{generated}/api/structs/cl_image_format.txt[]

   * `image_channel_order` specifies the number of channels and the channel
     layout i.e. the memory layout in which channels are stored in the image.
     Valid values are described in the <<image-channel-order-table,Image Channel
     Order>> table.
   * `image_channel_data_type` describes the size of the channel data type.
     The list of supported values is described in the
     <<image-channel-data-types-table,Image Channel Data Types>> table.
     The number of bits per element determined by the `image_channel_data_type`
     and `image_channel_order` must be a power of two.
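
As an illustration of the power-of-two rule, the sketch below computes the element size of an unpacked format from a channel count and a per-channel bit width.
Both helper names are hypothetical and not part of the OpenCL API; packed data types such as {CL_UNORM_SHORT_565} are outside the scope of this sketch.

```c
#include <stddef.h>

/* Hypothetical helpers, not part of the OpenCL API. */

/* Number of bits in one image element of an unpacked format. */
static size_t element_bits(size_t num_channels, size_t bits_per_channel)
{
    return num_channels * bits_per_channel;
}

/* Returns 1 when bits is a non-zero power of two, 0 otherwise. */
static int is_power_of_two(size_t bits)
{
    return bits != 0 && (bits & (bits - 1)) == 0;
}
```

Four 8-bit channels give 32 bits per element, which satisfies the rule, whereas three unpacked 8-bit channels would give 24 bits, which does not.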

 [[image-channel-order-table]]
 .List of supported Image Channel Order Values
 [width="100%",cols="<50%,<50%",options="header"]
 |====
 | Image Channel Order | Description
 | {CL_R_anchor}, {CL_A_anchor}
 // The CL_R annotation here is used to convey the same information for both
 // entries in this table row.
 include::{generated}/api/version-notes/CL_R.asciidoc[]
   | Single channel image formats where the single channel represents a `RED` or `ALPHA` component.
 | {CL_DEPTH_anchor}

 include::{generated}/api/version-notes/CL_DEPTH.asciidoc[]
   | A single channel image format where the single channel represents a `DEPTH` component.
 | {CL_LUMINANCE_anchor}

 include::{generated}/api/version-notes/CL_LUMINANCE.asciidoc[]
   | A single channel image format where the single channel represents a `LUMINANCE` value.
     The `LUMINANCE` value is replicated into the `RED`, `GREEN`, and `BLUE` components.
 | {CL_INTENSITY_anchor}

 include::{generated}/api/version-notes/CL_INTENSITY.asciidoc[]
   | A single channel image format where the single channel represents an `INTENSITY` value.
     The `INTENSITY` value is replicated into the `RED`, `GREEN`, `BLUE`, and `ALPHA` components.
 | {CL_RG_anchor}, {CL_RA_anchor}

 // The CL_RG annotation here is used to convey the same information for both
 // entries in this table row.
 include::{generated}/api/version-notes/CL_RG.asciidoc[]
   | Two channel image formats.
     The first channel always represents a `RED` component.
     The second channel represents a `GREEN` component or an `ALPHA` component.
 | {CL_Rx_anchor}

 include::{generated}/api/version-notes/CL_Rx.asciidoc[]
   | A two channel image format, where the first channel represents a `RED` component and the second channel is ignored.
 | {CL_RGB_anchor}

 include::{generated}/api/version-notes/CL_RGB.asciidoc[]
   | A three channel image format, where the three channels represent `RED`, `GREEN`, and `BLUE` components.
 | {CL_RGx_anchor}

 include::{generated}/api/version-notes/CL_RGx.asciidoc[]
   | A three channel image format, where the first two channels represent `RED` and `GREEN` components and the third channel is ignored.
 | {CL_RGBA_anchor}, {CL_ARGB_anchor}, {CL_BGRA_anchor}, {CL_ABGR_anchor}

 // Manually write this annotation as CL_ABGR is the odd-one-out here (all the
 // other entries in this row were in OpenCL 1.0).
   {CL_ABGR} is <<unified-spec, missing before>> version 2.0.
   | Four channel image formats, where the four channels represent `RED`, `GREEN`, `BLUE`, and `ALPHA` components.
 | {CL_RGBx_anchor}

 include::{generated}/api/version-notes/CL_RGBx.asciidoc[]
   | A four channel image format, where the first three channels represent `RED`, `GREEN`, and `BLUE` components and the fourth channel is ignored.
 | {CL_sRGB_anchor}

 include::{generated}/api/version-notes/CL_sRGB.asciidoc[]
   | A three channel image format, where the three channels represent `RED`, `GREEN`, and `BLUE` components in the sRGB color space.
 | {CL_sRGBA_anchor}, {CL_sBGRA_anchor}

 // The CL_sRGBA annotation here is used to convey the same information for both
 // entries in this table row.
 include::{generated}/api/version-notes/CL_sRGBA.asciidoc[]
   | Four channel image formats, where the first three channels represent `RED`, `GREEN`, and `BLUE` components in the sRGB color space.
     The fourth channel represents an `ALPHA` component.
 | {CL_sRGBx_anchor}

 include::{generated}/api/version-notes/CL_sRGBx.asciidoc[]
   | A four channel image format, where the first three channels represent `RED`, `GREEN`, and `BLUE` components in the sRGB color space.
     The fourth channel is ignored.
 |====

 [[image-channel-data-types-table]]
 .List of supported Image Channel Data Types
 [width="100%",cols="<50%,<50%",options="header"]
 |====
 | Image Channel Data Type | Description
 | {CL_SNORM_INT8_anchor}

 include::{generated}/api/version-notes/CL_SNORM_INT8.asciidoc[]
   | Each channel component is a normalized signed 8-bit integer value
 | {CL_SNORM_INT16_anchor}

 include::{generated}/api/version-notes/CL_SNORM_INT16.asciidoc[]
   | Each channel component is a normalized signed 16-bit integer value
 | {CL_UNORM_INT8_anchor}

 include::{generated}/api/version-notes/CL_UNORM_INT8.asciidoc[]
   | Each channel component is a normalized unsigned 8-bit integer value
 | {CL_UNORM_INT16_anchor}

 include::{generated}/api/version-notes/CL_UNORM_INT16.asciidoc[]
   | Each channel component is a normalized unsigned 16-bit integer value
 | {CL_UNORM_SHORT_565_anchor}

 include::{generated}/api/version-notes/CL_UNORM_SHORT_565.asciidoc[]
   | Represents a normalized 5-6-5 3-channel RGB image.
     The channel order must be {CL_RGB} or {CL_RGBx}.
 | {CL_UNORM_SHORT_555_anchor}

 include::{generated}/api/version-notes/CL_UNORM_SHORT_555.asciidoc[]
   | Represents a normalized x-5-5-5 4-channel xRGB image.
     The channel order must be {CL_RGB} or {CL_RGBx}.
 | {CL_UNORM_INT_101010_anchor}

 include::{generated}/api/version-notes/CL_UNORM_INT_101010.asciidoc[]
   | Represents a normalized x-10-10-10 4-channel xRGB image.
     The channel order must be {CL_RGB} or {CL_RGBx}.
 | {CL_UNORM_INT_101010_2_anchor}

 include::{generated}/api/version-notes/CL_UNORM_INT_101010_2.asciidoc[]
   | Represents a normalized 10-10-10-2 four-channel RGBA image.
     The channel order must be {CL_RGBA}.
 | {CL_SIGNED_INT8_anchor}

 include::{generated}/api/version-notes/CL_SIGNED_INT8.asciidoc[]
   | Each channel component is an unnormalized signed 8-bit integer value
 | {CL_SIGNED_INT16_anchor}

 include::{generated}/api/version-notes/CL_SIGNED_INT16.asciidoc[]
   | Each channel component is an unnormalized signed 16-bit integer value
 | {CL_SIGNED_INT32_anchor}

 include::{generated}/api/version-notes/CL_SIGNED_INT32.asciidoc[]
   | Each channel component is an unnormalized signed 32-bit integer value
 | {CL_UNSIGNED_INT8_anchor}

 include::{generated}/api/version-notes/CL_UNSIGNED_INT8.asciidoc[]
   | Each channel component is an unnormalized unsigned 8-bit integer value
 | {CL_UNSIGNED_INT16_anchor}

 include::{generated}/api/version-notes/CL_UNSIGNED_INT16.asciidoc[]
   | Each channel component is an unnormalized unsigned 16-bit integer value
 | {CL_UNSIGNED_INT32_anchor}

 include::{generated}/api/version-notes/CL_UNSIGNED_INT32.asciidoc[]
   | Each channel component is an unnormalized unsigned 32-bit integer value
 | {CL_HALF_FLOAT_anchor}

 include::{generated}/api/version-notes/CL_HALF_FLOAT.asciidoc[]
   | Each channel component is a 16-bit half-float value
 | {CL_FLOAT_anchor}

 include::{generated}/api/version-notes/CL_FLOAT.asciidoc[]
   | Each channel component is a single precision floating-point value
 |====

 For example, to specify a normalized unsigned 8-bit per channel RGBA image,
 set `image_channel_order` to {CL_RGBA} and `image_channel_data_type` to
 {CL_UNORM_INT8}.
 The memory layout of this image format is described below:

 // (Jon) these tables are missing "Byte Offset -> 0 1 2 3 ..."

 [width="60%",cols="<10%,<10%,<10%,<10%,<60%"]
 |====
 | R | G | B | A | ... |
 |====

 with the corresponding byte offsets

 [width="60%",cols="<10%,<10%,<10%,<10%,<60%"]
 |====
 | 0 | 1 | 2 | 3 | ... |
 |====

 Similarly, if `image_channel_order` = {CL_RGBA} and `image_channel_data_type` =
 {CL_SIGNED_INT16}, the memory layout of this image format is described below:

 [width="60%",cols="<10%,<10%,<10%,<10%,<60%"]
 |====
 | R | G | B | A | ... |
 |====

 with the corresponding byte offsets

 [width="60%",cols="<10%,<10%,<10%,<10%,<60%"]
 |====
 | 0 | 2 | 4 | 6 | ... |
 |====
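
The byte offsets in the two layouts above follow one formula, sketched here.
The helper `channel_byte_offset` is hypothetical, and the sketch assumes an unpacked format whose elements are stored back to back within a scanline.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the OpenCL API.
 * Byte offset, from the start of a scanline, of channel number `channel`
 * within element number `element`, for an unpacked format with
 * `num_channels` channels of `bytes_per_channel` bytes each. */
static size_t channel_byte_offset(size_t element, size_t channel,
                                  size_t num_channels,
                                  size_t bytes_per_channel)
{
    return (element * num_channels + channel) * bytes_per_channel;
}
```

With one byte per channel the alpha channel of the first element lands at offset 3, and with two bytes per channel it lands at offset 6, matching the tables.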

 `image_channel_data_type` values of {CL_UNORM_SHORT_565}, {CL_UNORM_SHORT_555},
 {CL_UNORM_INT_101010}, and {CL_UNORM_INT_101010_2} are special cases of packed
 image formats where the channels of each element are packed into a single
 unsigned short or unsigned int.
 For these special packed image formats, the channels are normally packed
 with the first channel in the most significant bits of the bitfield, and
 successive channels occupying progressively less significant locations.
 For {CL_UNORM_SHORT_565}, R is in bits 15:11, G is in bits 10:5 and B is in
 bits 4:0.
 For {CL_UNORM_SHORT_555}, bit 15 is undefined, R is in bits 14:10, G in bits
 9:5 and B in bits 4:0.
 For {CL_UNORM_INT_101010}, bits 31:30 are undefined, R is in bits 29:20, G in
 bits 19:10 and B in bits 9:0.
 For {CL_UNORM_INT_101010_2}, R is in bits 31:22, G in bits 21:12, B in bits
 11:2 and A in bits 1:0.
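
The bit positions listed above can be expressed as shifts and masks.
The following is a minimal sketch; the function names are hypothetical, the inputs are assumed to be already quantized to the channel width, and the undefined bits are simply left as 0.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of the OpenCL API.
 * Each input is an already quantized channel value; bits beyond the
 * channel width are masked off. */

/* R in bits 15:11, G in bits 10:5, B in bits 4:0. */
static uint16_t pack_unorm_short_565(uint32_t r, uint32_t g, uint32_t b)
{
    return (uint16_t)(((r & 0x1Fu) << 11) | ((g & 0x3Fu) << 5) | (b & 0x1Fu));
}

/* Bit 15 undefined (left 0 here), R in 14:10, G in 9:5, B in 4:0. */
static uint16_t pack_unorm_short_555(uint32_t r, uint32_t g, uint32_t b)
{
    return (uint16_t)(((r & 0x1Fu) << 10) | ((g & 0x1Fu) << 5) | (b & 0x1Fu));
}

/* Bits 31:30 undefined (left 0 here), R in 29:20, G in 19:10, B in 9:0. */
static uint32_t pack_unorm_int_101010(uint32_t r, uint32_t g, uint32_t b)
{
    return ((r & 0x3FFu) << 20) | ((g & 0x3FFu) << 10) | (b & 0x3FFu);
}

/* R in bits 31:22, G in 21:12, B in 11:2, A in 1:0. */
static uint32_t pack_unorm_int_101010_2(uint32_t r, uint32_t g, uint32_t b,
                                        uint32_t a)
{
    return ((r & 0x3FFu) << 22) | ((g & 0x3FFu) << 12) |
           ((b & 0x3FFu) << 2) | (a & 0x3u);
}
```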

 OpenCL implementations must maintain the minimum precision specified by the
 number of bits in `image_channel_data_type`.
 If the image format specified by `image_channel_order`, and
 `image_channel_data_type` cannot be supported by the OpenCL implementation,
 then the call to {clCreateImage}, {clCreateImageWithProperties},
 {clCreateImage2D}, or {clCreateImage3D} will return a `NULL` memory object.
 --


 [[image-descriptor]]
 ==== Image Descriptor

 [open,refpage='cl_image_desc',desc='The image descriptor structure describes the type and dimensions of the image or image array and is defined as:',type='structs',xrefs='clCreateImage']
 --
 The {cl_image_desc_TYPE} image descriptor structure describes the image type
 and dimensions of an image or image array when creating an image using
 {clCreateImage} or {clCreateImageWithProperties}, and is defined as:

 include::{generated}/api/structs/cl_image_desc.txt[]

   * `image_type` describes the image type and must be either
     {CL_MEM_OBJECT_IMAGE1D}, {CL_MEM_OBJECT_IMAGE1D_BUFFER},
     {CL_MEM_OBJECT_IMAGE1D_ARRAY}, {CL_MEM_OBJECT_IMAGE2D},
     {CL_MEM_OBJECT_IMAGE2D_ARRAY}, or {CL_MEM_OBJECT_IMAGE3D}.
   * `image_width` is the width of the image in pixels.
     For a 2D image and image array, the image width must be a value {geq} 1 and
     {leq} {CL_DEVICE_IMAGE2D_MAX_WIDTH}.
     For a 3D image, the image width must be a value {geq} 1 and {leq}
     {CL_DEVICE_IMAGE3D_MAX_WIDTH}.
     For a 1D image buffer, the image width must be a value {geq} 1 and {leq}
     {CL_DEVICE_IMAGE_MAX_BUFFER_SIZE}.
     For a 1D image and 1D image array, the image width must be a value {geq} 1
     and {leq} {CL_DEVICE_IMAGE2D_MAX_WIDTH}.
   * `image_height` is the height of the image in pixels.
     This is only used if the image is a 2D or 3D image, or a 2D image array.
     For a 2D image or image array, the image height must be a value {geq} 1 and
     {leq} {CL_DEVICE_IMAGE2D_MAX_HEIGHT}.
     For a 3D image, the image height must be a value {geq} 1 and {leq}
     {CL_DEVICE_IMAGE3D_MAX_HEIGHT}.
   * `image_depth` is the depth of the image in pixels.
     This is only used if the image is a 3D image and must be a value {geq} 1 and
     {leq} {CL_DEVICE_IMAGE3D_MAX_DEPTH}.
   * `image_array_size` footnote:[{fn-image-array-performance}] is the number of
     images in the image array.
     This is only used if the image is a 1D or 2D image array.
     The value of `image_array_size`, if specified, must be {geq} 1 and
     {leq} {CL_DEVICE_IMAGE_MAX_ARRAY_SIZE}.
   * `image_row_pitch` is the scan-line pitch in bytes.
     This must be 0 if _host_ptr_ is `NULL` and can be either 0 or {geq}
     `image_width` {times} size of element in bytes if _host_ptr_ is not `NULL`.
     If _host_ptr_ is not `NULL` and `image_row_pitch` = 0, `image_row_pitch` is
     calculated as `image_width` {times} size of element in bytes.
     If `image_row_pitch` is not 0, it must be a multiple of the image element
     size in bytes.
     For a 2D image created from a buffer, the pitch specified (or computed if
     pitch specified is 0) must be a multiple of the maximum of the
     {CL_DEVICE_IMAGE_PITCH_ALIGNMENT} value for all devices in the context
     associated with the buffer specified by `mem_object` that support images.
   * `image_slice_pitch` is the size in bytes of each 2D slice in the 3D image or
     the size in bytes of each image in a 1D or 2D image array.
     This must be 0 if _host_ptr_ is `NULL`.
     If _host_ptr_ is not `NULL`, `image_slice_pitch` can be either 0 or {geq}
     `image_row_pitch` {times} `image_height` for a 2D image array or 3D image
     and can be either 0 or {geq} `image_row_pitch` for a 1D image array.
     If _host_ptr_ is not `NULL` and `image_slice_pitch` = 0, `image_slice_pitch`
     is calculated as `image_row_pitch` {times} `image_height` for a 2D image
     array or 3D image and `image_row_pitch` for a 1D image array.
     If `image_slice_pitch` is not 0, it must be a multiple of the
     `image_row_pitch`.
   * `num_mip_levels` and `num_samples` must be 0.
   * `mem_object` may refer to a valid buffer or image memory object.
     `mem_object` can be a buffer memory object if `image_type` is
     {CL_MEM_OBJECT_IMAGE1D_BUFFER} or
     {CL_MEM_OBJECT_IMAGE2D} footnote:[{fn-image-from-buffer}].
     `mem_object` can be an image object if `image_type` is
     {CL_MEM_OBJECT_IMAGE2D} footnote:[{fn-image-from-image}].
     Otherwise it must be `NULL`.
     The image pixels are taken from the memory object's data store.
     When the contents of the specified memory object's data store are modified,
     those changes are reflected in the contents of the image object and
     vice-versa at corresponding synchronization points.

 For a 1D image buffer created from a buffer object, the `image_width` {times}
 size of element in bytes must be {leq} size of the buffer object.
 The image data in the buffer object is stored as a single scanline which is
 a linear sequence of adjacent elements.

 For a 2D image created from a buffer object, the `image_row_pitch` {times}
 `image_height` must be {leq} size of the buffer object specified by
 `mem_object`.
 The image data in the buffer object is stored as a linear sequence of
 adjacent scanlines.
 Each scanline is a linear sequence of image elements padded to
 `image_row_pitch` bytes.
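
The two size requirements for images created from a buffer can be checked as sketched below.
The helper names are hypothetical, the sketch ignores arithmetic overflow, and it does not cover the separate pitch and base address alignment requirements.

```c
#include <stddef.h>

/* Hypothetical helpers, not part of the OpenCL API. */

/* 1D image buffer: image_width * element size must not exceed the buffer. */
static int image1d_buffer_fits(size_t image_width, size_t element_size,
                               size_t buffer_size)
{
    return image_width * element_size <= buffer_size;
}

/* 2D image from buffer: image_row_pitch * image_height must not exceed
 * the buffer. */
static int image2d_from_buffer_fits(size_t image_row_pitch,
                                    size_t image_height, size_t buffer_size)
{
    return image_row_pitch * image_height <= buffer_size;
}
```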

 For an image object created from another image object, the values specified
 in the image descriptor except for `mem_object` must match the image
 descriptor information associated with `mem_object`.

 Image elements are stored according to their image format as described in
 <<image-format-descriptor, Image Format Descriptor>>.

 If the buffer object specified by `mem_object` was created with
 {CL_MEM_USE_HOST_PTR}, the _host_ptr_ specified to {clCreateBuffer} or
 {clCreateBufferWithProperties} must be aligned to the maximum of the
 {CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT} value for all devices in the
 context associated with the buffer specified by `mem_object` that
 support images.

 Creating a 2D image object from another 2D image object creates a new
 2D image object that shares the image data store with `mem_object` but views
 the pixels in the image with a different image channel order.
 Restrictions are:

   * All of the values specified in _image_desc_ must match the image descriptor
     information associated with `mem_object`, except for `mem_object`.

   * The image channel data type specified in _image_format_ must match the
     image channel data type associated with `mem_object`.

   * The image channel order specified in _image_format_ must be compatible
     with the image channel order associated with `mem_object`.
     Compatible image channel orders
     footnote:[{fn-compatible-image-channel-orders}] are:
 +
 --
 [width="100%",cols="<50%,<50%",options="header"]
 |====
 | Image Channel Order in _image_format_:
   | Image Channel Order associated with `mem_object`:
 | {CL_sBGRA}
   | {CL_BGRA}
 | {CL_BGRA}
   | {CL_sBGRA}
 | {CL_sRGBA}
   | {CL_RGBA}
 | {CL_RGBA}
   | {CL_sRGBA}
 | {CL_sRGB}
   | {CL_RGB}
 | {CL_RGB}
   | {CL_sRGB}
 | {CL_sRGBx}
   | {CL_RGBx}
 | {CL_RGBx}
   | {CL_sRGBx}
 | {CL_DEPTH}
   | {CL_R}
 |====
 --
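
The compatibility table above can be encoded as a small lookup, as in the following sketch.
The `ORDER_*` enumerators are stand-ins invented for this example and are not the `cl_channel_order` values from the OpenCL headers; the function reflects only the pairs listed in the table.

```c
#include <stddef.h>

/* Stand-in identifiers for this sketch only. */
enum order {
    ORDER_R, ORDER_DEPTH,
    ORDER_RGB, ORDER_sRGB,
    ORDER_RGBx, ORDER_sRGBx,
    ORDER_RGBA, ORDER_sRGBA,
    ORDER_BGRA, ORDER_sBGRA
};

struct order_pair {
    enum order requested;   /* channel order in image_format */
    enum order existing;    /* channel order associated with mem_object */
};

/* One entry per row of the compatibility table. */
static const struct order_pair compatible_orders[] = {
    { ORDER_sBGRA, ORDER_BGRA }, { ORDER_BGRA, ORDER_sBGRA },
    { ORDER_sRGBA, ORDER_RGBA }, { ORDER_RGBA, ORDER_sRGBA },
    { ORDER_sRGB,  ORDER_RGB  }, { ORDER_RGB,  ORDER_sRGB  },
    { ORDER_sRGBx, ORDER_RGBx }, { ORDER_RGBx, ORDER_sRGBx },
    { ORDER_DEPTH, ORDER_R    }
};

/* Returns 1 if the (requested, existing) pair appears in the table. */
static int orders_compatible(enum order requested, enum order existing)
{
    size_t i;
    size_t n = sizeof compatible_orders / sizeof compatible_orders[0];

    for (i = 0; i < n; ++i) {
        if (compatible_orders[i].requested == requested &&
            compatible_orders[i].existing == existing) {
            return 1;
        }
    }
    return 0;
}
```

Note that the last row is listed in one direction only, so the lookup accepts a depth view of an `R` image but not the reverse.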

 [NOTE]
 ====
 Concurrent reading from, writing to and copying between both a buffer object
 and 1D image buffer or 2D image object associated with the buffer object is
 undefined.
 Only reading from both a buffer object and 1D image buffer or 2D image
 object associated with the buffer object is defined.

 Writing to an image created from a buffer and then reading from this buffer
 in a kernel even if appropriate synchronization operations (such as a
 barrier) are performed between the writes and reads is undefined.
 Similarly, writing to the buffer and reading from the image created from
 this buffer with appropriate synchronization between the writes and reads is
 undefined.
 ====
 --


 === Querying List of Supported Image Formats

 [open,refpage='clGetSupportedImageFormats',desc='Get the list of image formats supported by an OpenCL implementation.',type='protos']
 --
 To get the list of image formats supported by an OpenCL implementation for a
 specified context, image type, and allocation information, call the function

 include::{generated}/api/protos/clGetSupportedImageFormats.txt[]
 include::{generated}/api/version-notes/clGetSupportedImageFormats.asciidoc[]

   * _context_ is a valid OpenCL context on which the image object(s) will be
     created.
   * _flags_ is a bit-field that is used to specify usage
     information about the image formats being queried and is described in
     the <<memory-flags-table,Memory Flags>> table.
     _flags_ may be {CL_MEM_READ_WRITE} to query image formats that may be read
     from and written to by different kernel instances when correctly ordered by
     event dependencies, or {CL_MEM_READ_ONLY} to query image formats that may
     be read from by a kernel, or {CL_MEM_WRITE_ONLY} to query image formats that
     may be written to by a kernel, or {CL_MEM_KERNEL_READ_AND_WRITE} to query
     image formats that may be both read from and written to by the same kernel
     instance.
     Please see <<image-format-mapping, Image Format Mapping>> for clarification.
   * _image_type_ describes the image type and must be either
     {CL_MEM_OBJECT_IMAGE1D}, {CL_MEM_OBJECT_IMAGE1D_BUFFER}, {CL_MEM_OBJECT_IMAGE2D},
     {CL_MEM_OBJECT_IMAGE3D}, {CL_MEM_OBJECT_IMAGE1D_ARRAY}, or
     {CL_MEM_OBJECT_IMAGE2D_ARRAY}.
   * _num_entries_ specifies the number of entries that can be returned in the
     memory location given by _image_formats_.
   * _image_formats_ is a pointer to a memory location where the list of
     supported image formats is returned.
     Each entry describes a {cl_image_format_TYPE} structure supported by the OpenCL
     implementation.
     If _image_formats_ is `NULL`, it is ignored.
   * _num_image_formats_ is the actual number of supported image formats for a
     specific _context_ and values specified by _flags_.
     If _num_image_formats_ is `NULL`, it is ignored.

 {clGetSupportedImageFormats} returns a union of image formats supported by
 all devices in the context.

 // The following information about an image memory object
 // is specified:
 //
 //   * Context
 //   * Image type 1D, 2D, or 3D image, 1D image buffer, 1D or 2D image array.
 //   * Image object allocation information

 // refError

 {clGetSupportedImageFormats} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_VALUE} if _flags_ or _image_type_ are not valid, or if
     _num_entries_ is 0 and _image_formats_ is not `NULL`.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.

 If {CL_DEVICE_IMAGE_SUPPORT} specified in the <<device-queries-table,Device
 Queries>> table is {CL_TRUE}, the values assigned to
 {CL_DEVICE_MAX_READ_IMAGE_ARGS}, {CL_DEVICE_MAX_WRITE_IMAGE_ARGS},
 {CL_DEVICE_IMAGE2D_MAX_WIDTH}, {CL_DEVICE_IMAGE2D_MAX_HEIGHT},
 {CL_DEVICE_IMAGE3D_MAX_WIDTH}, {CL_DEVICE_IMAGE3D_MAX_HEIGHT},
 {CL_DEVICE_IMAGE3D_MAX_DEPTH}, and {CL_DEVICE_MAX_SAMPLERS} by the implementation
 must be greater than or equal to the minimum values specified in the
 <<device-queries-table,Device Queries>> table.
 --


 [[minimum-list-of-supported-image-formats]]
 ==== Minimum List of Supported Image Formats

 [open,refpage='supportedImageFormats',desc='Minimum List of Supported Image Formats',type='freeform',anchor='minimum-list-of-supported-image-formats']
 --
 The tables below describe the required minimum lists of supported image
 formats.
 To query all image formats supported by an implementation, call the function {clGetSupportedImageFormats}.

 For full profile devices supporting OpenCL 2.0, 2.1, or 2.2, the minimum
 list of supported image formats for either reading or writing in a kernel
 is:

 [[min-supported-image-formats-2.0]]
 .Minimum list of supported image formats for reading or writing (OpenCL 2.0, 2.1, or 2.2)
 [width="100%",cols="<34%,<33%,<33%",options="header"]
 |====
 | num_channels | channel_order | channel_data_type
 | 1
   | {CL_R}
       | {CL_UNORM_INT8} +
         {CL_UNORM_INT16} +
         {CL_SNORM_INT8} +
         {CL_SNORM_INT16} +
         {CL_SIGNED_INT8} +
         {CL_SIGNED_INT16} +
         {CL_SIGNED_INT32} +
         {CL_UNSIGNED_INT8} +
         {CL_UNSIGNED_INT16} +
         {CL_UNSIGNED_INT32} +
         {CL_HALF_FLOAT} +
         {CL_FLOAT}
 | 1
   | {CL_DEPTH} footnote:[{fn-depth-image-requirements}]
       | {CL_UNORM_INT16} +
         {CL_FLOAT}
 | 2
   | {CL_RG}
       | {CL_UNORM_INT8} +
         {CL_UNORM_INT16} +
         {CL_SNORM_INT8} +
         {CL_SNORM_INT16} +
         {CL_SIGNED_INT8} +
         {CL_SIGNED_INT16} +
         {CL_SIGNED_INT32} +
         {CL_UNSIGNED_INT8} +
         {CL_UNSIGNED_INT16} +
         {CL_UNSIGNED_INT32} +
         {CL_HALF_FLOAT} +
         {CL_FLOAT}
 | 4
   | {CL_RGBA}
       | {CL_UNORM_INT8} +
         {CL_UNORM_INT16} +
         {CL_SNORM_INT8} +
         {CL_SNORM_INT16} +
         {CL_SIGNED_INT8} +
         {CL_SIGNED_INT16} +
         {CL_SIGNED_INT32} +
         {CL_UNSIGNED_INT8} +
         {CL_UNSIGNED_INT16} +
         {CL_UNSIGNED_INT32} +
         {CL_HALF_FLOAT} +
         {CL_FLOAT}
 | 4
   | {CL_BGRA}
       | {CL_UNORM_INT8}
 | 4
   | {CL_sRGBA} footnote:[{fn-srgb-image-requirements}]
       | {CL_UNORM_INT8}
 |====

 For full profile devices supporting other OpenCL versions, such as OpenCL 1.2
 or OpenCL 3.0, the minimum list of supported image formats for either reading
 or writing in a kernel is:

 // Note, differences are:
 // * No {CL_R} images.
 // * No {CL_DEPTH} images.
 // * No {CL_RG} images.
 // * No {CL_sRGBA} images.
 // * For {CL_RGBA} images, no {CL_SNORM_INT8} or {CL_SNORM_INT16} image channel
 //   data types.

 [[min-supported-image-formats]]
 .Minimum list of required image formats for reading or writing
 [width="100%",cols="<34%,<33%,<33%",options="header"]
 |====
 | num_channels | channel_order | channel_data_type
 | 4
   | {CL_RGBA}
       | {CL_UNORM_INT8} +
         {CL_UNORM_INT16} +
         {CL_SIGNED_INT8} +
         {CL_SIGNED_INT16} +
         {CL_SIGNED_INT32} +
         {CL_UNSIGNED_INT8} +
         {CL_UNSIGNED_INT16} +
         {CL_UNSIGNED_INT32} +
         {CL_HALF_FLOAT} +
         {CL_FLOAT}
 | 4
   | {CL_BGRA}
       | {CL_UNORM_INT8}
 |====

 For full profile devices that support reading from and writing to the same
 image object from the same kernel instance (see {CL_DEVICE_MAX_READ_WRITE_IMAGE_ARGS}),
 the minimum list of supported image formats for reading and writing in
 the same kernel instance is:

 [[min-supported-image-formats-read-write]]
 .Minimum list of required image formats for reading and writing
 [width="100%",cols="<34%,<33%,<33%",options="header"]
 |====
 | num_channels | channel_order | channel_data_type
 | 1
   | {CL_R}
       | {CL_UNORM_INT8} +
         {CL_SIGNED_INT8} +
         {CL_SIGNED_INT16} +
         {CL_SIGNED_INT32} +
         {CL_UNSIGNED_INT8} +
         {CL_UNSIGNED_INT16} +
         {CL_UNSIGNED_INT32} +
         {CL_HALF_FLOAT} +
         {CL_FLOAT}
 | 4
   | {CL_RGBA}
       | {CL_UNORM_INT8} +
         {CL_SIGNED_INT8} +
         {CL_SIGNED_INT16} +
         {CL_SIGNED_INT32} +
         {CL_UNSIGNED_INT8} +
         {CL_UNSIGNED_INT16} +
         {CL_UNSIGNED_INT32} +
         {CL_HALF_FLOAT} +
         {CL_FLOAT}
 |====
 --

 [[image-format-mapping]]
 ==== Image format mapping to OpenCL kernel language image access qualifiers

 Image arguments to kernels may have the `read_only`, `write_only` or
 `read_write` qualifier.
 Not all image formats supported by the device and platform are valid to be
 passed to all of these access qualifiers.
 For each access qualifier, only images whose format is in the list of
 formats returned by {clGetSupportedImageFormats} with the given flag
 arguments in the <<image-format-mapping-table,Image Format Mapping>> table
 are permitted.
 It is not valid to pass an image supporting writing as both a `read_only`
 image and a `write_only` image parameter, or to a `read_write` image
 parameter and any other image parameter.

 [[image-format-mapping-table]]
 .Mapping from format flags passed to <<clGetSupportedImageFormats>> to OpenCL kernel language image access qualifiers
 [width="100%",cols="<50%,<50%",options="header"]
 |====
 | Access Qualifier | Memory Flags
 | `read_only`
   | {CL_MEM_READ_ONLY}, +
     {CL_MEM_READ_WRITE}, +
     {CL_MEM_KERNEL_READ_AND_WRITE}
 | `write_only`
   | {CL_MEM_WRITE_ONLY}, +
     {CL_MEM_READ_WRITE}, +
     {CL_MEM_KERNEL_READ_AND_WRITE}
 | `read_write`
   | {CL_MEM_KERNEL_READ_AND_WRITE}
 |====
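
The mapping table can also be read as a predicate: given an access qualifier and the flag used for the format query, does the query result apply to that qualifier?
A minimal sketch follows; the enumerators are stand-ins invented for this example and are not the values defined in the OpenCL headers.

```c
/* Stand-in identifiers for this sketch only. */
enum access_qualifier {
    ACCESS_READ_ONLY,
    ACCESS_WRITE_ONLY,
    ACCESS_READ_WRITE
};

enum query_flag {
    QUERY_MEM_READ_ONLY,
    QUERY_MEM_WRITE_ONLY,
    QUERY_MEM_READ_WRITE,
    QUERY_MEM_KERNEL_READ_AND_WRITE
};

/* Returns 1 if formats queried with `flag` are permitted for an image
 * argument declared with `qualifier`, following the table row for row. */
static int format_query_covers(enum access_qualifier qualifier,
                               enum query_flag flag)
{
    switch (qualifier) {
    case ACCESS_READ_ONLY:
        return flag == QUERY_MEM_READ_ONLY ||
               flag == QUERY_MEM_READ_WRITE ||
               flag == QUERY_MEM_KERNEL_READ_AND_WRITE;
    case ACCESS_WRITE_ONLY:
        return flag == QUERY_MEM_WRITE_ONLY ||
               flag == QUERY_MEM_READ_WRITE ||
               flag == QUERY_MEM_KERNEL_READ_AND_WRITE;
    case ACCESS_READ_WRITE:
        return flag == QUERY_MEM_KERNEL_READ_AND_WRITE;
    }
    return 0;
}
```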


 === Reading, Writing and Copying Image Objects

 [open,refpage='clEnqueueReadImage',desc='Enqueue commands to read from an image or image array object to host memory.',type='protos',alias='clEnqueueWriteImage']
 --
 The following functions enqueue commands to read from an image or image
 array object to host memory or write to an image or image array object from
 host memory.

 include::{generated}/api/protos/clEnqueueReadImage.txt[]
 include::{generated}/api/version-notes/clEnqueueReadImage.asciidoc[]

 include::{generated}/api/protos/clEnqueueWriteImage.txt[]
 include::{generated}/api/version-notes/clEnqueueWriteImage.asciidoc[]

   * _command_queue_ refers to the host command-queue in which the read / write
     command will be queued.
     _command_queue_ and _image_ must be created with the same OpenCL context.
   * _image_ refers to a valid image or image array object.
   * _blocking_read_ and _blocking_write_ indicate if the read and write
     operations are _blocking_ or _non-blocking_.
   * _origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or 3D
     image, the (_x_, _y_) offset and the image index in the 2D image array or
     the (_x_) offset and the image index in the 1D image array.
     If _image_ is a 2D image object, _origin_[2] must be 0.
     If _image_ is a 1D image or 1D image buffer object, _origin_[1] and
     _origin_[2] must be 0.
     If _image_ is a 1D image array object, _origin_[2] must be 0.
     If _image_ is a 1D image array object, _origin_[1] describes the image index
     in the 1D image array.
     If _image_ is a 2D image array object, _origin_[2] describes the image index
     in the 2D image array.
   * _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or
     3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the
     number of images of a 2D image array or the (_width_) in pixels of the 1D
     rectangle and the number of images of a 1D image array.
     If _image_ is a 2D image object, _region_[2] must be 1.
     If _image_ is a 1D image or 1D image buffer object, _region_[1] and
     _region_[2] must be 1.
     If _image_ is a 1D image array object, _region_[2] must be 1.
     The values in _region_ cannot be 0.
   * _row_pitch_ in {clEnqueueReadImage} and _input_row_pitch_ in
     {clEnqueueWriteImage} is the length of each row in bytes.
     This value must be greater than or equal to the element size in bytes
     {times} _width_.
     If _row_pitch_ (or _input_row_pitch_) is set to 0, the appropriate row pitch
     is calculated based on the size of each element in bytes multiplied by
     _width_.
   * _slice_pitch_ in {clEnqueueReadImage} and _input_slice_pitch_ in
     {clEnqueueWriteImage} is the size in bytes of the 2D slice of the 3D region
     of a 3D image or each image of a 1D or 2D image array being read or written
     respectively.
     This must be 0 if _image_ is a 1D or 2D image.
     Otherwise this value must be greater than or equal to _row_pitch_ {times}
     _height_.
     If _slice_pitch_ (or _input_slice_pitch_) is set to 0, the appropriate slice
     pitch is calculated based on the _row_pitch_ {times} _height_.
   * _ptr_ is the pointer to a buffer in host memory where image data is to be
     read from or to be written to.
     The alignment requirements for _ptr_ are specified in
     <<alignment-app-data-types>>.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this read / write command
     and can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.
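
 The pitch rules in the argument list above can be sketched in plain C.
 This is an illustration only, not part of the API: the helper names are
 hypothetical, _width_ and _height_ stand for _region_[0] and _region_[1], and
 the additional rule that the slice pitch must be 0 for 1D and 2D images is
 not checked.

```c
#include <stddef.h>

/* Hypothetical helper: a row pitch of 0 selects the tightly packed value,
 * element size in bytes times width. */
size_t effective_row_pitch(size_t row_pitch, size_t element_size, size_t width)
{
    return row_pitch != 0 ? row_pitch : element_size * width;
}

/* Hypothetical helper: a slice pitch of 0 selects row pitch times height. */
size_t effective_slice_pitch(size_t slice_pitch, size_t resolved_row_pitch,
                             size_t height)
{
    return slice_pitch != 0 ? slice_pitch : resolved_row_pitch * height;
}

/* Returns 1 if the pitches satisfy the minimums stated in the text, else 0. */
int pitches_meet_minimums(size_t row_pitch, size_t slice_pitch,
                          size_t element_size, size_t width, size_t height)
{
    size_t rp = effective_row_pitch(row_pitch, element_size, width);
    size_t sp = effective_slice_pitch(slice_pitch, rp, height);
    return rp >= element_size * width && sp >= rp * height;
}
```

 For a 640 by 480 region of 4-byte elements, passing 0 for both pitches
 resolves to a row pitch of 2560 bytes and a slice pitch of 1228800 bytes.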

 If _blocking_read_ is {CL_TRUE}, i.e. the read command is blocking,
 {clEnqueueReadImage} does not return until the image data has been read and
 copied into memory pointed to by _ptr_.

 If _blocking_read_ is {CL_FALSE} i.e. the read command is non-blocking,
 {clEnqueueReadImage} queues a non-blocking read command and returns.
 The contents of the buffer that _ptr_ points to cannot be used until the
 read command has completed.
 The _event_ argument returns an event object which can be used to query the
 execution status of the read command.
 When the read command has completed, the contents of the buffer that _ptr_
 points to can be used by the application.

 If _blocking_write_ is {CL_TRUE}, the write command is blocking and does not
 return until the command is complete, including transfer of the data.
 The memory pointed to by _ptr_ can be reused by the application after the
 {clEnqueueWriteImage} call returns.

 If _blocking_write_ is {CL_FALSE}, the OpenCL implementation will use _ptr_ to
 perform a non-blocking write.
 As the write is non-blocking, the implementation can return immediately.
 The memory pointed to by _ptr_ cannot be reused by the application
 immediately after the call returns.
 The _event_ argument returns an event object which can be used to query the
 execution status of the write command.
 When the write command has completed, the memory pointed to by _ptr_ can
 then be reused by the application.

 // refError

 {clEnqueueReadImage} and {clEnqueueWriteImage} return {CL_SUCCESS} if the
 function is executed successfully.
 Otherwise, they return one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
     _image_ are not the same or if the context associated with
     _command_queue_ and events in _event_wait_list_ are not the same.
   * {CL_INVALID_MEM_OBJECT} if _image_ is not a valid image object.
   * {CL_INVALID_VALUE} if _origin_ or _region_ is `NULL`.
   * {CL_INVALID_VALUE} if the region being read or written specified by
     _origin_ and _region_ is out of bounds.
   * {CL_INVALID_VALUE} if values in _origin_ and _region_ do not follow rules
     described in the argument description for _origin_ and _region_.
   * {CL_INVALID_VALUE} if _ptr_ is `NULL`.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height,
     specified or computed row and/or slice pitch) for _image_ are not
     supported by the device associated with _command_queue_.
   * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order
     and data type) for _image_ is not supported by the device associated with
     _command_queue_.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for data store associated with _image_.
   * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
     not support images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
     <<device-queries-table,Device Queries>> table is {CL_FALSE}).
   * {CL_INVALID_OPERATION} if {clEnqueueReadImage} is called on _image_ which
     has been created with {CL_MEM_HOST_WRITE_ONLY} or {CL_MEM_HOST_NO_ACCESS}.
   * {CL_INVALID_OPERATION} if {clEnqueueWriteImage} is called on _image_ which
     has been created with {CL_MEM_HOST_READ_ONLY} or {CL_MEM_HOST_NO_ACCESS}.
   * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the read and write
     operations are blocking and the execution status of any of the events in
     _event_wait_list_ is a negative integer value.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
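
 The pairing rule behind {CL_INVALID_EVENT_WAIT_LIST} can be checked by an
 application before it calls any enqueue function.
 The following is a minimal sketch with a hypothetical helper name; it covers
 only the `NULL` / count pairing and makes no attempt to validate the event
 objects themselves.

```c
#include <stddef.h>

/* Hypothetical pre-check: returns 1 when the wait-list pointer and count are
 * consistent with each other, 0 when the combination would be rejected. */
int wait_list_args_consistent(const void *event_wait_list,
                              unsigned int num_events_in_wait_list)
{
    if (event_wait_list == NULL) {
        /* No list: the count must be 0. */
        return num_events_in_wait_list == 0;
    }
    /* A list is given: the count must be greater than 0. */
    return num_events_in_wait_list > 0;
}
```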

 [NOTE]
 ====
 Calling {clEnqueueReadImage} to read a region of the _image_ with the _ptr_
 argument value set to _host_ptr_ + (_origin_[2] {times} _image slice pitch_
 {plus} _origin_[1] {times} _image row pitch_ + _origin_[0] {times} _bytes
 per pixel_), where _host_ptr_ is a pointer to the memory region specified
 when the _image_ being read is created with {CL_MEM_USE_HOST_PTR}, must meet
 the following requirements in order to avoid undefined behavior:

   * All commands that use this image object have finished execution before
     the read command begins execution.
   * The _row_pitch_ and _slice_pitch_ argument values in
     {clEnqueueReadImage} must be set to the image row pitch and slice pitch.
   * The image object is not mapped.
   * The image object is not used by any command-queue until the read command
     has finished execution.

 Calling {clEnqueueWriteImage} to update the latest bits in a region of the
 _image_ with the _ptr_ argument value set to _host_ptr_ + (_origin_[2]
 {times} _image slice pitch_ + _origin_[1] {times} _image row pitch_ {plus}
 _origin_[0] {times} _bytes per pixel_), where _host_ptr_ is a pointer to the
 memory region specified when the _image_ being written is created with
 {CL_MEM_USE_HOST_PTR}, must meet the following requirements in order to avoid
 undefined behavior:

   * The host memory region being written contains the latest bits when the
     enqueued write command begins execution.
   * The _input_row_pitch_ and _input_slice_pitch_ argument values in
     {clEnqueueWriteImage} must be set to the image row pitch and slice
     pitch.
   * The image object is not mapped.
   * The image object is not used by any command-queue until the write
     command has finished execution.
 ====
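
 The pointer arithmetic in the note above can be written out as a small
 helper.
 This sketch mirrors the formula literally; the function name is hypothetical,
 and the pitches are assumed to be the row and slice pitch of the image
 itself, not of the region being transferred.

```c
#include <stddef.h>

/* Hypothetical helper: byte offset of the element at `origin` from the start
 * of the host memory backing the image. */
size_t image_byte_offset(const size_t origin[3], size_t image_row_pitch,
                         size_t image_slice_pitch, size_t bytes_per_pixel)
{
    return origin[2] * image_slice_pitch
         + origin[1] * image_row_pitch
         + origin[0] * bytes_per_pixel;
}
```

 An application would add this offset to _host_ptr_, cast to a byte pointer,
 to obtain the _ptr_ value the note describes.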
 --

 [open,refpage='clEnqueueCopyImage',desc='Enqueues a command to copy image objects.',type='protos']
 --
 To enqueue a command to copy image objects, call the function

 include::{generated}/api/protos/clEnqueueCopyImage.txt[]
 include::{generated}/api/version-notes/clEnqueueCopyImage.asciidoc[]

   * _src_image_ and _dst_image_ can be 1D, 2D, or 3D image objects, or 1D or
     2D image array objects.
     It is possible to copy subregions between any combinations of source and
     destination types, provided that the dimensions of the subregions are the
     same e.g., one can copy a rectangular region from a 2D image to a slice of a
     3D image.
   * _command_queue_ refers to the host command-queue in which the copy command
     will be queued.
     The OpenCL context associated with _command_queue_, _src_image_ and
     _dst_image_ must be the same.
   * _src_origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or
     3D image, the (_x_, _y_) offset and the image index in the 2D image array or
     the (_x_) offset and the image index in the 1D image array.
     If _src_image_ is a 2D image object, _src_origin_[2] must be 0.
     If _src_image_ is a 1D image or 1D image buffer object, _src_origin_[1] and
     _src_origin_[2] must be 0.
     If _src_image_ is a 1D image array object, _src_origin_[2] must be 0.
     If _src_image_ is a 1D image array object, _src_origin_[1] describes the
     image index in the 1D image array.
     If _src_image_ is a 2D image array object, _src_origin_[2] describes the
     image index in the 2D image array.
   * _dst_origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or
     3D image, the (_x_, _y_) offset and the image index in the 2D image array or
     the (_x_) offset and the image index in the 1D image array.
     If _dst_image_ is a 2D image object, _dst_origin_[2] must be 0.
     If _dst_image_ is a 1D image or 1D image buffer object, _dst_origin_[1] and
     _dst_origin_[2] must be 0.
     If _dst_image_ is a 1D image array object, _dst_origin_[2] must be 0.
     If _dst_image_ is a 1D image array object, _dst_origin_[1] describes the
     image index in the 1D image array.
     If _dst_image_ is a 2D image array object, _dst_origin_[2] describes the
     image index in the 2D image array.
   * _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or
     3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the
     number of images of a 2D image array or the (_width_) in pixels of the 1D
     rectangle and the number of images of a 1D image array.
     If _src_image_ or _dst_image_ is a 2D image object, _region_[2] must be 1.
     If _src_image_ or _dst_image_ is a 1D image or 1D image buffer object,
     _region_[1] and _region_[2] must be 1.
     If _src_image_ or _dst_image_ is a 1D image array object, _region_[2] must
     be 1.
     The values in _region_ cannot be 0.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this copy command
     and can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.
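
 The per-type rules for origins and regions listed above can be collected
 into one check.
 The sketch below is illustrative only: the `image_kind` enumeration and the
 function name are hypothetical stand-ins, not OpenCL types.
 For a copy, an application would apply it once to _src_image_ with
 _src_origin_ and once to _dst_image_ with _dst_origin_, using the same
 _region_ both times.

```c
#include <stddef.h>

/* Hypothetical classification of image object types. */
enum image_kind {
    IMG_1D, IMG_1D_BUFFER, IMG_1D_ARRAY, IMG_2D, IMG_2D_ARRAY, IMG_3D
};

/* Returns 1 if origin and region follow the rules for this image type. */
int origin_region_follow_rules(enum image_kind kind, const size_t origin[3],
                               const size_t region[3])
{
    if (origin == NULL || region == NULL)
        return 0;
    /* The values in region cannot be 0. */
    if (region[0] == 0 || region[1] == 0 || region[2] == 0)
        return 0;

    switch (kind) {
    case IMG_1D:
    case IMG_1D_BUFFER:
        return origin[1] == 0 && origin[2] == 0 &&
               region[1] == 1 && region[2] == 1;
    case IMG_1D_ARRAY:   /* origin[1] is the image index */
    case IMG_2D:
        return origin[2] == 0 && region[2] == 1;
    case IMG_2D_ARRAY:   /* origin[2] is the image index */
    case IMG_3D:
        return 1;
    }
    return 0;
}
```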

 Currently, the _src_image_ and _dst_image_ image memory objects for
 {clEnqueueCopyImage} must have exactly the same image format (i.e. the
 {cl_image_format_TYPE} descriptor specified when _src_image_ and
 _dst_image_ are created must match).

 // refError

 {clEnqueueCopyImage} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if the context associated with _command_queue_,
     _src_image_ and _dst_image_ are not the same or if the context
     associated with _command_queue_ and events in _event_wait_list_ are not
     the same.
   * {CL_INVALID_MEM_OBJECT} if _src_image_ and _dst_image_ are not valid image
     objects.
   * {CL_IMAGE_FORMAT_MISMATCH} if _src_image_ and _dst_image_ do not use the
     same image format.
   * {CL_INVALID_VALUE} if _src_origin_, _dst_origin_, or _region_ is `NULL`.
   * {CL_INVALID_VALUE} if the 2D or 3D rectangular region specified by
     _src_origin_ and _src_origin_ {plus} _region_ refers to a region outside
     _src_image_, or if the 2D or 3D rectangular region specified by
     _dst_origin_ and _dst_origin_ {plus} _region_ refers to a region outside
     _dst_image_.
   * {CL_INVALID_VALUE} if values in _src_origin_, _dst_origin_ and _region_ do
     not follow rules described in the argument description for _src_origin_,
     _dst_origin_ and _region_.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height,
     specified or computed row and/or slice pitch) for _src_image_ or
     _dst_image_ are not supported by the device associated with
     _command_queue_.
   * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order
     and data type) for _src_image_ or _dst_image_ is not supported by the
     device associated with _command_queue_.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for data store associated with _src_image_ or _dst_image_.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
   * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
     not support images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
     <<device-queries-table,Device Queries>> table is {CL_FALSE}).
   * {CL_MEM_COPY_OVERLAP} if _src_image_ and _dst_image_ are the same image
     object and the source and destination regions overlap.
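
 An application that copies within a single image can test for overlap
 itself before enqueueing the command.
 The sketch below assumes that two regions overlap exactly when their extents
 intersect in every dimension; this is a plausible reading of the
 {CL_MEM_COPY_OVERLAP} condition, not a statement of how any implementation
 detects it.
 The function names are hypothetical.

```c
#include <stddef.h>

/* Half-open intervals [a, a + len) and [b, b + len) intersect. */
static int intervals_overlap(size_t a, size_t b, size_t len)
{
    return a < b + len && b < a + len;
}

/* Returns 1 if the source and destination boxes of equal size intersect. */
int copy_regions_overlap(const size_t src_origin[3],
                         const size_t dst_origin[3],
                         const size_t region[3])
{
    return intervals_overlap(src_origin[0], dst_origin[0], region[0]) &&
           intervals_overlap(src_origin[1], dst_origin[1], region[1]) &&
           intervals_overlap(src_origin[2], dst_origin[2], region[2]);
}
```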
 --


 === Filling Image Objects

 NOTE: Filling image objects is <<unified-spec, missing before>> version 1.2.

 [open,refpage='clEnqueueFillImage',desc='Enqueues a command to fill an image object with a specified color.',type='protos']
 --
 To enqueue a command to fill an image object with a specified color, call
 the function

 include::{generated}/api/protos/clEnqueueFillImage.txt[]
 include::{generated}/api/version-notes/clEnqueueFillImage.asciidoc[]

   * _command_queue_ refers to the host command-queue in which the fill command
     will be queued.
     The OpenCL context associated with _command_queue_ and _image_ must be the
     same.
   * _image_ is a valid image object.
   * _fill_color_ is the color used to fill the image.
     The fill color is a single floating point value if the channel order is
     {CL_DEPTH}.
     Otherwise, the fill color is a four component RGBA floating-point color
     value if the _image_ channel data type is not an unnormalized signed or
     unsigned integer type, is a four component signed integer value if the
     _image_ channel data type is an unnormalized signed integer type and is a
     four component unsigned integer value if the _image_ channel data type is an
     unnormalized unsigned integer type.
     The fill color will be converted to the appropriate image channel format and
     order associated with _image_.
   * _origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or 3D
     image, the (_x_, _y_) offset and the image index in the 2D image array or
     the (_x_) offset and the image index in the 1D image array.
     If _image_ is a 2D image object, _origin_[2] must be 0.
     If _image_ is a 1D image or 1D image buffer object, _origin_[1] and
     _origin_[2] must be 0.
     If _image_ is a 1D image array object, _origin_[2] must be 0.
     If _image_ is a 1D image array object, _origin_[1] describes the image index
     in the 1D image array.
     If _image_ is a 2D image array object, _origin_[2] describes the image index
     in the 2D image array.
   * _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or
     3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the
     number of images of a 2D image array or the (_width_) in pixels of the 1D
     rectangle and the number of images of a 1D image array.
     If _image_ is a 2D image object, _region_[2] must be 1.
     If _image_ is a 1D image or 1D image buffer object, _region_[1] and
     _region_[2] must be 1.
     If _image_ is a 1D image array object, _region_[2] must be 1.
     The values in _region_ cannot be 0.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this command and
     can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.

 {clEnqueueFillImage} ignores the usage information given by the
 {cl_mem_flags_TYPE} argument value specified when _image_ is created, which
 indicates whether the memory object can be read or written by a kernel and/or
 the host.
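
 The rule for how _fill_color_ is interpreted can be summarized as a small
 decision function.
 This is a sketch only: both enumerations and the function name are
 hypothetical, and the caller is assumed to have already classified the
 channel order and channel data type of _image_.

```c
/* Hypothetical classification of the image channel data type. */
enum channel_class {
    CHANNEL_NORMALIZED_OR_FLOAT,   /* anything that is not an unnormalized integer */
    CHANNEL_UNNORM_SIGNED_INT,
    CHANNEL_UNNORM_UNSIGNED_INT
};

/* Hypothetical description of what fill_color must point to. */
enum fill_kind {
    FILL_ONE_FLOAT,    /* single floating-point value (depth images) */
    FILL_FOUR_FLOAT,   /* four component RGBA floating-point color */
    FILL_FOUR_INT,     /* four component signed integer value */
    FILL_FOUR_UINT     /* four component unsigned integer value */
};

enum fill_kind fill_color_kind(int channel_order_is_depth,
                               enum channel_class cls)
{
    /* The depth channel order takes precedence over the data type. */
    if (channel_order_is_depth)
        return FILL_ONE_FLOAT;

    switch (cls) {
    case CHANNEL_UNNORM_SIGNED_INT:   return FILL_FOUR_INT;
    case CHANNEL_UNNORM_UNSIGNED_INT: return FILL_FOUR_UINT;
    default:                          return FILL_FOUR_FLOAT;
    }
}
```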

 // refError

 {clEnqueueFillImage} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
     _image_ are not the same or if the context associated with
     _command_queue_ and events in _event_wait_list_ are not the same.
   * {CL_INVALID_MEM_OBJECT} if _image_ is not a valid image object.
   * {CL_INVALID_VALUE} if _fill_color_ is `NULL`.
   * {CL_INVALID_VALUE} if _origin_ or _region_ is `NULL`.
   * {CL_INVALID_VALUE} if the region being filled as specified by _origin_ and
     _region_ is out of bounds.
   * {CL_INVALID_VALUE} if values in _origin_ and _region_ do not follow rules
     described in the argument description for _origin_ and _region_.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height,
     specified or computed row and/or slice pitch) for _image_ are not
     supported by the device associated with _command_queue_.
   * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order
     and data type) for _image_ is not supported by the device associated with
     _command_queue_.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for data store associated with _image_.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --


 === Copying between Image and Buffer Objects

 [open,refpage='clEnqueueCopyImageToBuffer',desc='Enqueues a command to copy an image object to a buffer object.',type='protos']
 --
 To enqueue a command to copy an image object to a buffer object, call the
 function

 include::{generated}/api/protos/clEnqueueCopyImageToBuffer.txt[]
 include::{generated}/api/version-notes/clEnqueueCopyImageToBuffer.asciidoc[]

   * _command_queue_ must be a valid host command-queue.
     The OpenCL context associated with _command_queue_, _src_image_ and
     _dst_buffer_ must be the same.
   * _src_image_ is a valid image object.
   * _dst_buffer_ is a valid buffer object.
   * _src_origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or
     3D image, the (_x_, _y_) offset and the image index in the 2D image array or
     the (_x_) offset and the image index in the 1D image array.
     If _src_image_ is a 2D image object, _src_origin_[2] must be 0.
     If _src_image_ is a 1D image or 1D image buffer object, _src_origin_[1] and
     _src_origin_[2] must be 0.
     If _src_image_ is a 1D image array object, _src_origin_[2] must be 0.
     If _src_image_ is a 1D image array object, _src_origin_[1] describes the
     image index in the 1D image array.
     If _src_image_ is a 2D image array object, _src_origin_[2] describes the
     image index in the 2D image array.
   * _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or
     3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the
     number of images of a 2D image array or the (_width_) in pixels of the 1D
     rectangle and the number of images of a 1D image array.
     If _src_image_ is a 2D image object, _region_[2] must be 1.
     If _src_image_ is a 1D image or 1D image buffer object, _region_[1] and
     _region_[2] must be 1.
     If _src_image_ is a 1D image array object, _region_[2] must be 1.
     The values in _region_ cannot be 0.
   * _dst_offset_ refers to the offset at which to begin copying data into
     _dst_buffer_.
     The size in bytes of the region to be copied, referred to as _dst_cb_,
     depends on the type of _src_image_.
     It is computed as
     _width_ {times} _height_ {times} _depth_ {times} _bytes/image element_
     for a 3D image object, as
     _width_ {times} _height_ {times} _bytes/image element_
     for a 2D image, as
     _width_ {times} _height_ {times} _arraysize_ {times} _bytes/image element_
     for a 2D image array object, as
     _width_ {times} _bytes/image element_
     for a 1D image or 1D image buffer object, and as
     _width_ {times} _arraysize_ {times} _bytes/image element_
     for a 1D image array object.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this copy command
     and can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.
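
 The _dst_cb_ computation described for _dst_offset_ above can be sketched as
 follows.
 The enumeration and function name are hypothetical, and _arraysize_ is
 assumed to mean the number of images selected by _region_, that is
 _region_[2] for a 2D image array and _region_[1] for a 1D image array.

```c
#include <stddef.h>

/* Hypothetical classification of image object types. */
enum image_kind {
    IMG_1D, IMG_1D_BUFFER, IMG_1D_ARRAY, IMG_2D, IMG_2D_ARRAY, IMG_3D
};

/* Size in bytes of the region copied between an image and a buffer. */
size_t image_copy_size_bytes(enum image_kind kind, const size_t region[3],
                             size_t bytes_per_element)
{
    size_t width = region[0];

    switch (kind) {
    case IMG_3D:        /* width * height * depth */
        return width * region[1] * region[2] * bytes_per_element;
    case IMG_2D:        /* width * height */
        return width * region[1] * bytes_per_element;
    case IMG_2D_ARRAY:  /* width * height * arraysize */
        return width * region[1] * region[2] * bytes_per_element;
    case IMG_1D:
    case IMG_1D_BUFFER: /* width */
        return width * bytes_per_element;
    case IMG_1D_ARRAY:  /* width * arraysize */
        return width * region[1] * bytes_per_element;
    }
    return 0;
}
```

 Because unused _region_ entries must be 1, every case reduces to the product
 of the three _region_ values and the element size when _region_ follows the
 rules above.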

 // refError

 {clEnqueueCopyImageToBuffer} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if the context associated with _command_queue_,
     _src_image_ and _dst_buffer_ are not the same or if the context
     associated with _command_queue_ and events in _event_wait_list_ are not
     the same.
   * {CL_INVALID_MEM_OBJECT} if _src_image_ is not a valid image object or
     _dst_buffer_ is not a valid buffer object or if _src_image_ is a 1D
     image buffer object created from _dst_buffer_.
   * {CL_INVALID_VALUE} if _src_origin_ or _region_ is `NULL`.
   * {CL_INVALID_VALUE} if the 1D, 2D or 3D rectangular region specified by
     _src_origin_ and _src_origin_ + _region_ refers to a region outside
     _src_image_, or if the region specified by _dst_offset_ and _dst_offset_
     + _dst_cb_ refers to a region outside _dst_buffer_.
   * {CL_INVALID_VALUE} if values in _src_origin_ and _region_ do not follow
     rules described in the argument description for _src_origin_ and
     _region_.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _dst_buffer_ is a sub-buffer object
     and _offset_ specified when the sub-buffer object is created is not
     aligned to {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device associated
     with _command_queue_.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height,
     specified or computed row and/or slice pitch) for _src_image_ are not
     supported by the device associated with _command_queue_.
   * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order
     and data type) for _src_image_ is not supported by the device associated
     with _command_queue_.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for data store associated with _src_image_ or _dst_buffer_.
   * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
     not support images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
     <<device-queries-table,Device Queries>> table is {CL_FALSE}).
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --


 [open,refpage='clEnqueueCopyBufferToImage',desc='Enqueues a command to copy a buffer object to an image object.',type='protos']
 --
 To enqueue a command to copy a buffer object to an image object, call the
 function

 include::{generated}/api/protos/clEnqueueCopyBufferToImage.txt[]
 include::{generated}/api/version-notes/clEnqueueCopyBufferToImage.asciidoc[]

   * _command_queue_ must be a valid host command-queue.
     The OpenCL context associated with _command_queue_, _src_buffer_ and
     _dst_image_ must be the same.
   * _src_buffer_ is a valid buffer object.
   * _dst_image_ is a valid image object.
   * _src_offset_ refers to the offset where to begin copying data from
     _src_buffer_.
   * _dst_origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or
     3D image, the (_x_, _y_) offset and the image index in the 2D image array or
     the (_x_) offset and the image index in the 1D image array.
     If _dst_image_ is a 2D image object, _dst_origin_[2] must be 0.
     If _dst_image_ is a 1D image or 1D image buffer object, _dst_origin_[1] and
     _dst_origin_[2] must be 0.
     If _dst_image_ is a 1D image array object, _dst_origin_[2] must be 0.
     If _dst_image_ is a 1D image array object, _dst_origin_[1] describes the
     image index in the 1D image array.
     If _dst_image_ is a 2D image array object, _dst_origin_[2] describes the
     image index in the 2D image array.
   * _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or
     3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the
     number of images of a 2D image array or the (_width_) in pixels of the 1D
     rectangle and the number of images of a 1D image array.
     If _dst_image_ is a 2D image object, _region_[2] must be 1.
     If _dst_image_ is a 1D image or 1D image buffer object, _region_[1] and
     _region_[2] must be 1.
     If _dst_image_ is a 1D image array object, _region_[2] must be 1.
     The values in _region_ cannot be 0.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this copy command
     and can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.

 The size in bytes of the region to be copied from _src_buffer_, referred to
 as _src_cb_, depends on the type of _dst_image_.
 It is computed as
 _width_ {times} _height_ {times} _depth_ {times} _bytes/image element_
 for a 3D image object, as
 _width_ {times} _height_ {times} _bytes/image element_
 for a 2D image, as
 _width_ {times} _height_ {times} _arraysize_ {times} _bytes/image element_
 for a 2D image array object, as
 _width_ {times} _bytes/image element_
 for a 1D image or 1D image buffer object, and as
 _width_ {times} _arraysize_ {times} _bytes/image element_
 for a 1D image array object.
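
 The byte range read from _src_buffer_ starts at _src_offset_ and spans
 _src_cb_ bytes, so it must lie inside the buffer.
 A minimal sketch of that check follows; the function name and the
 `buffer_size` parameter are hypothetical, and the comparison is arranged so
 that the sum of offset and size is never formed and cannot wrap around.

```c
#include <stddef.h>

/* Returns 1 if [offset, offset + cb) lies within a buffer of buffer_size
 * bytes, 0 otherwise. */
int buffer_range_in_bounds(size_t offset, size_t cb, size_t buffer_size)
{
    return offset <= buffer_size && cb <= buffer_size - offset;
}
```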

 // refError

 {clEnqueueCopyBufferToImage} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if the context associated with _command_queue_,
     _src_buffer_ and _dst_image_ are not the same or if the context
     associated with _command_queue_ and events in _event_wait_list_ are not
     the same.
   * {CL_INVALID_MEM_OBJECT} if _src_buffer_ is not a valid buffer object or
     _dst_image_ is not a valid image object or if _dst_image_ is a 1D image
     buffer object created from _src_buffer_.
   * {CL_INVALID_VALUE} if _dst_origin_ or _region_ is `NULL`.
   * {CL_INVALID_VALUE} if the 1D, 2D or 3D rectangular region specified by
     _dst_origin_ and _dst_origin_ + _region_ refers to a region outside
     _dst_image_, or if the region specified by _src_offset_ and _src_offset_
     + _src_cb_ refers to a region outside _src_buffer_.
   * {CL_INVALID_VALUE} if values in _dst_origin_ and _region_ do not follow
     rules described in the argument description for _dst_origin_ and
     _region_.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _src_buffer_ is a sub-buffer object
     and _offset_ specified when the sub-buffer object is created is not
     aligned to {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device associated
     with _command_queue_.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height,
     specified or computed row and/or slice pitch) for _dst_image_ are not
     supported by the device associated with _command_queue_.
   * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order
     and data type) for _dst_image_ is not supported by the device associated
     with _command_queue_.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for data store associated with _src_buffer_ or _dst_image_.
   * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
     not support images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
     <<device-queries-table,Device Queries>> table is {CL_FALSE}).
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --


 === Mapping Image Objects

 [open,refpage='clEnqueueMapImage',desc='Enqueues a command to map a region of an image object into the host address space and returns a pointer to this mapped region.',type='protos']
 --
 To enqueue a command to map a region in the image object given by _image_
 into the host address space and return a pointer to this mapped region,
 call the function

 include::{generated}/api/protos/clEnqueueMapImage.txt[]
 include::{generated}/api/version-notes/clEnqueueMapImage.asciidoc[]

   * _command_queue_ must be a valid host command-queue.
   * _image_ is a valid image object.
     The OpenCL context associated with _command_queue_ and _image_ must be the
     same.
   * _blocking_map_ indicates if the map operation is _blocking_ or
     _non-blocking_.
   * _map_flags_ is a bit-field and is described in the
     <<memory-map-flags-table,Memory Map Flags>> table.
   * _origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or 3D
     image, the (_x_, _y_) offset and the image index in the 2D image array or
     the (_x_) offset and the image index in the 1D image array.
     If _image_ is a 2D image object, _origin_[2] must be 0.
     If _image_ is a 1D image or 1D image buffer object, _origin_[1] and
     _origin_[2] must be 0.
     If _image_ is a 1D image array object, _origin_[2] must be 0.
     If _image_ is a 1D image array object, _origin_[1] describes the image index
     in the 1D image array.
     If _image_ is a 2D image array object, _origin_[2] describes the image index
     in the 2D image array.
   * _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or
     3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the
     number of images of a 2D image array or the (_width_) in pixels of the 1D
     rectangle and the number of images of a 1D image array.
     If _image_ is a 2D image object, _region_[2] must be 1.
     If _image_ is a 1D image or 1D image buffer object, _region_[1] and
     _region_[2] must be 1.
     If _image_ is a 1D image array object, _region_[2] must be 1.
     The values in _region_ cannot be 0.
   * _image_row_pitch_ returns the scan-line pitch in bytes for the mapped
     region.
     This must be a non-`NULL` value.
   * _image_slice_pitch_ returns the size in bytes of each 2D slice of a 3D image
     or the size of each 1D or 2D image in a 1D or 2D image array for the mapped
     region.
     For a 1D and 2D image, zero is returned if this argument is not `NULL`.
     For a 3D image, 1D and 2D image array, _image_slice_pitch_ must be a
     non-`NULL` value.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before {clEnqueueMapImage} can be executed.
     If _event_wait_list_ is `NULL`, then {clEnqueueMapImage} does not wait on
     any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this command and
     can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.
   * _errcode_ret_ will return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.
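
 The per-image-type constraints on _origin_ and _region_ listed above can be
 checked on the host before the call.
 The following is a minimal sketch, assuming a hypothetical `image_kind` enum
 and a hypothetical `map_args_follow_rules` helper; neither is part of the
 OpenCL API, and the bounds check against the image dimensions is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical enum for this sketch; not an OpenCL type. */
typedef enum {
    IMG_1D, IMG_1D_BUFFER, IMG_1D_ARRAY, IMG_2D, IMG_2D_ARRAY, IMG_3D
} image_kind;

/* Returns true if origin and region follow the rules in the argument
   descriptions for the given image kind. */
bool map_args_follow_rules(image_kind kind, const size_t origin[3],
                           const size_t region[3])
{
    if (origin == NULL || region == NULL)
        return false;
    /* The values in region cannot be 0. */
    if (region[0] == 0 || region[1] == 0 || region[2] == 0)
        return false;

    switch (kind) {
    case IMG_1D:
    case IMG_1D_BUFFER:
        return origin[1] == 0 && origin[2] == 0 &&
               region[1] == 1 && region[2] == 1;
    case IMG_1D_ARRAY:  /* origin[1] is the image index */
    case IMG_2D:
        return origin[2] == 0 && region[2] == 1;
    case IMG_2D_ARRAY:  /* origin[2] is the image index */
    case IMG_3D:
        return true;
    }
    return false; /* unknown kind */
}
```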

 If _blocking_map_ is {CL_TRUE}, {clEnqueueMapImage} does not return until the
 specified region in _image_ is mapped into the host address space and the
 application can access the contents of the mapped region using the pointer
 returned by {clEnqueueMapImage}.

 If _blocking_map_ is {CL_FALSE}, i.e. the map operation is non-blocking, the
 pointer to the mapped region returned by {clEnqueueMapImage} cannot be used
 until the map command has completed.
 The _event_ argument returns an event object which can be used to query the
 execution status of the map command.
 When the map command is completed, the application can access the contents
 of the mapped region using the pointer returned by {clEnqueueMapImage}.

 // refError

 {clEnqueueMapImage} returns a pointer to the mapped region and sets
 _errcode_ret_ to {CL_SUCCESS} if the function is executed successfully.

 Otherwise, it returns a `NULL` pointer with one of the following error
 values returned in _errcode_ret_:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if context associated with _command_queue_ and
     _image_ are not the same or if context associated with _command_queue_
     and events in _event_wait_list_ are not the same.
   * {CL_INVALID_MEM_OBJECT} if _image_ is not a valid image object.
   * {CL_INVALID_VALUE} if _origin_ or _region_ is `NULL`.
   * {CL_INVALID_VALUE} if region being mapped given by (_origin_,
     _origin_ + _region_) is out of bounds or if values specified in _map_flags_
     are not valid.
   * {CL_INVALID_VALUE} if values in _origin_ and _region_ do not follow rules
     described in the argument description for _origin_ and _region_.
   * {CL_INVALID_VALUE} if _image_row_pitch_ is `NULL`.
   * {CL_INVALID_VALUE} if _image_ is a 3D image, 1D or 2D image array object
     and _image_slice_pitch_ is `NULL`.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height,
     specified or computed row and/or slice pitch) for _image_ are not
     supported by the device associated with _command_queue_.
   * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order
     and data type) for _image_ is not supported by the device associated with
     _command_queue_.
   * {CL_MAP_FAILURE} if there is a failure to map the requested region into
     the host address space.
     This error cannot occur for image objects created with
     {CL_MEM_USE_HOST_PTR} or {CL_MEM_ALLOC_HOST_PTR}.
   * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the map operation is
     blocking and the execution status of any of the events in
     _event_wait_list_ is a negative integer value.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for data store associated with _image_.
   * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
     not support images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
     <<device-queries-table,Device Queries>> table is {CL_FALSE}).
   * {CL_INVALID_OPERATION} if _image_ has been created with
     {CL_MEM_HOST_WRITE_ONLY} or {CL_MEM_HOST_NO_ACCESS} and {CL_MAP_READ} is set
     in _map_flags_ or if _image_ has been created with {CL_MEM_HOST_READ_ONLY}
     or {CL_MEM_HOST_NO_ACCESS} and {CL_MAP_WRITE} or
     {CL_MAP_WRITE_INVALIDATE_REGION} is set in _map_flags_.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
   * {CL_INVALID_OPERATION} if mapping would lead to overlapping regions being
     mapped for writing.
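
 The interaction between the host-access flags of _image_ and _map_flags_
 described in the list above can be expressed as a small predicate.
 This is a sketch only: the `SKETCH_` constants are local stand-ins whose bit
 values are intended to mirror the corresponding definitions in `CL/cl.h`
 (verify them against your headers), and `map_flags_allowed` is a
 hypothetical helper, not an OpenCL function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins so the sketch compiles without OpenCL headers. */
#define SKETCH_MEM_HOST_WRITE_ONLY         (1u << 7)
#define SKETCH_MEM_HOST_READ_ONLY          (1u << 8)
#define SKETCH_MEM_HOST_NO_ACCESS          (1u << 9)
#define SKETCH_MAP_READ                    (1u << 0)
#define SKETCH_MAP_WRITE                   (1u << 1)
#define SKETCH_MAP_WRITE_INVALIDATE_REGION (1u << 2)

/* Returns false for the combinations that the error list above reports
   as CL_INVALID_OPERATION, true otherwise. */
bool map_flags_allowed(uint64_t mem_flags, uint64_t map_flags)
{
    bool host_cannot_read = (mem_flags & (SKETCH_MEM_HOST_WRITE_ONLY |
                                          SKETCH_MEM_HOST_NO_ACCESS)) != 0;
    bool host_cannot_write = (mem_flags & (SKETCH_MEM_HOST_READ_ONLY |
                                           SKETCH_MEM_HOST_NO_ACCESS)) != 0;
    bool wants_read = (map_flags & SKETCH_MAP_READ) != 0;
    bool wants_write = (map_flags & (SKETCH_MAP_WRITE |
                                     SKETCH_MAP_WRITE_INVALIDATE_REGION)) != 0;

    if (wants_read && host_cannot_read)
        return false;
    if (wants_write && host_cannot_write)
        return false;
    return true;
}
```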

 The pointer returned maps a 1D, 2D or 3D region starting at _origin_ and is
 at least _region_[0] pixels in size for a 1D image, 1D image buffer or 1D
 image array, (_image_row_pitch_ {times} _region_[1]) pixels in size for a 2D
 image or 2D image array, and (_image_slice_pitch_ {times} _region_[2]) pixels
 in size for a 3D image.
 The result of a memory access outside this region is undefined.
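
 Because _image_row_pitch_ and _image_slice_pitch_ are returned in bytes, an
 element inside the mapped region is usually located with byte arithmetic on
 the returned pointer.
 The helper below is a sketch under assumptions: `mapped_element` is a
 hypothetical name, the coordinates are relative to _origin_, and the
 element size is assumed to be the value reported by {CL_IMAGE_ELEMENT_SIZE}.
 For a 1D image array, passing the image index as `z` with `y` equal to 0 is
 one plausible reading of the pitch descriptions, not a statement of the
 specification.

```c
#include <stddef.h>

/* Address of the element at (x, y, z), relative to the start of the
   mapped region. All pitches and elem_bytes are in bytes. For 1D and 2D
   images slice_pitch is reported as 0, so z must be 0. */
void *mapped_element(void *mapped_ptr, size_t x, size_t y, size_t z,
                     size_t elem_bytes, size_t row_pitch,
                     size_t slice_pitch)
{
    unsigned char *base = mapped_ptr;
    return base + z * slice_pitch + y * row_pitch + x * elem_bytes;
}
```

 Iterating row by row with the returned pitch, rather than assuming tightly
 packed rows, avoids reading padding that an implementation may place at the
 end of each row.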

 If the image object is created with {CL_MEM_USE_HOST_PTR} set in _mem_flags_,
 the following will be true:

   * The _host_ptr_ specified in {clCreateImage}, {clCreateImageWithProperties},
     {clCreateImage2D}, or {clCreateImage3D} is guaranteed to contain the
     latest bits in the region being mapped when the {clEnqueueMapImage} command
     has completed.
   * The pointer value returned by {clEnqueueMapImage} will be derived from
     the _host_ptr_ specified when the image object is created.

 Mapped image objects are unmapped using {clEnqueueUnmapMemObject}.
 This is described in <<unmapping-mapped-memory, Unmapping Mapped Memory
 Objects>>.
 --


 [[image-object-queries]]
 === Image Object Queries

 To get information that is common to all memory objects, use the
 {clGetMemObjectInfo} function described in <<memory-object-queries, Memory
 Object Queries>>.

 [open,refpage='clGetImageInfo',desc='Get information specific to an image object created with clCreateImage.',type='protos']
 --
 To get information specific to an image object created with {clCreateImage},
 {clCreateImageWithProperties}, {clCreateImage2D}, or {clCreateImage3D} call
 the function

 include::{generated}/api/protos/clGetImageInfo.txt[]
 include::{generated}/api/version-notes/clGetImageInfo.asciidoc[]

   * _image_ specifies the image object being queried.
   * _param_name_ specifies the information to query.
     The list of supported _param_name_ types and the information returned in
     _param_value_ by {clGetImageInfo} is described in the
     <<image-info-table,Image Object Queries>> table.
   * _param_value_ is a pointer to memory where the appropriate result being
     queried is returned.
     If _param_value_ is `NULL`, it is ignored.
   * _param_value_size_ is used to specify the size in bytes of memory pointed to
     by _param_value_.
     This size must be {geq} size of return type as described in the
     <<image-info-table,Image Object Queries>> table.
   * _param_value_size_ret_ returns the actual size in bytes of data being
     queried by _param_name_.
     If _param_value_size_ret_ is `NULL`, it is ignored.

 [[image-info-table]]
 .List of supported param_names by <<clGetImageInfo>>
 [width="100%",cols="<33%,<17%,<50%",options="header"]
 |====
 | Image Info | Return type | Description
 | {CL_IMAGE_FORMAT_anchor}

 include::{generated}/api/version-notes/CL_IMAGE_FORMAT.asciidoc[]
   | {cl_image_format_TYPE}
       | Return the image format descriptor specified when _image_ is created
         with {clCreateImage}, {clCreateImageWithProperties},
         {clCreateImage2D} or {clCreateImage3D}.
 | {CL_IMAGE_ELEMENT_SIZE_anchor}

 include::{generated}/api/version-notes/CL_IMAGE_ELEMENT_SIZE.asciidoc[]
   | {size_t_TYPE}
       | Return size of each element of the image memory object given by
         _image_ in bytes.
 | {CL_IMAGE_ROW_PITCH_anchor}

 include::{generated}/api/version-notes/CL_IMAGE_ROW_PITCH.asciidoc[]
   | {size_t_TYPE}
       | Return calculated row pitch in bytes of a row of elements of the
         image object given by _image_.
 | {CL_IMAGE_SLICE_PITCH_anchor}

 include::{generated}/api/version-notes/CL_IMAGE_SLICE_PITCH.asciidoc[]
   | {size_t_TYPE}
       | Return calculated slice pitch in bytes of a 2D slice for the 3D
         image object or size of each image in a 1D or 2D image array given
         by _image_.
         For a 1D image, 1D image buffer and 2D image object return 0.
 | {CL_IMAGE_WIDTH_anchor}

 include::{generated}/api/version-notes/CL_IMAGE_WIDTH.asciidoc[]
   | {size_t_TYPE}
       | Return width of the image in pixels.
 | {CL_IMAGE_HEIGHT_anchor}

 include::{generated}/api/version-notes/CL_IMAGE_HEIGHT.asciidoc[]
   | {size_t_TYPE}
       | Return height of the image in pixels.
         For a 1D image, 1D image buffer and 1D image array object, height =
         0.
 | {CL_IMAGE_DEPTH_anchor}

 include::{generated}/api/version-notes/CL_IMAGE_DEPTH.asciidoc[]
   | {size_t_TYPE}
       | Return depth of the image in pixels.
         For a 1D image, 1D image buffer, 2D image or 1D and 2D image array
         object, depth = 0.
 | {CL_IMAGE_ARRAY_SIZE_anchor}

 include::{generated}/api/version-notes/CL_IMAGE_ARRAY_SIZE.asciidoc[]
   | {size_t_TYPE}
       | Return number of images in the image array.
         If _image_ is not an image array, 0 is returned.
 | {CL_IMAGE_BUFFER_anchor}

 include::{generated}/api/version-notes/CL_IMAGE_BUFFER.asciidoc[]
   | {cl_mem_TYPE}
       | Return buffer object associated with _image_.
 | {CL_IMAGE_NUM_MIP_LEVELS_anchor}

 include::{generated}/api/version-notes/CL_IMAGE_NUM_MIP_LEVELS.asciidoc[]
   | {cl_uint_TYPE}
       | Return `num_mip_levels` associated with _image_.
 | {CL_IMAGE_NUM_SAMPLES_anchor}

 include::{generated}/api/version-notes/CL_IMAGE_NUM_SAMPLES.asciidoc[]
   | {cl_uint_TYPE}
       | Return `num_samples` associated with _image_.
 |====

 // refError

 {clGetImageInfo} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
     specified by _param_value_size_ is < size of return type as described in
     the <<image-info-table,Image Object Queries>> table and _param_value_ is
     not `NULL`.
   * {CL_INVALID_MEM_OBJECT} if _image_ is not a valid image object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --


 == Pipes

 NOTE: Pipes are <<unified-spec, missing before>> version 2.0.

 A _pipe_ is a memory object that stores data organized as a FIFO.
 Pipe objects can only be accessed using built-in functions that read from
 and write to a pipe.
 Pipe objects are not accessible from the host.
 A pipe object encapsulates the following information:

   * Packet size in bytes
   * Maximum capacity in packets
   * Information about the number of packets currently in the pipe
   * Data packets


 === Creating Pipe Objects

 [open,refpage='clCreatePipe',desc='Creates a pipe object.',type='protos']
 --
 To create a *pipe object*, call the function

 include::{generated}/api/protos/clCreatePipe.txt[]
 include::{generated}/api/version-notes/clCreatePipe.asciidoc[]

   * _context_ is a valid OpenCL context used to create the pipe object.
   * _flags_ is a bit-field that is used to specify allocation and usage
     information such as the memory arena that should be used to allocate the
     pipe object and how it will be used.
     The <<memory-flags,Memory Flags>> table describes the possible values for
     _flags_.
     Only {CL_MEM_READ_WRITE} and {CL_MEM_HOST_NO_ACCESS} can be specified when
     creating a pipe object.
     If the value specified for _flags_ is 0, the default is used which is
     {CL_MEM_READ_WRITE} | {CL_MEM_HOST_NO_ACCESS}.
   * _pipe_packet_size_ is the size in bytes of a pipe packet.
   * _pipe_max_packets_ specifies the pipe capacity by specifying the maximum
     number of packets the pipe can hold.
   * _properties_ specifies a list of properties for the pipe and their
     corresponding values.
     Each property name is immediately followed by the corresponding desired
     value.
     The list is terminated with 0.
     Currently, in all OpenCL versions, _properties_ must be `NULL`.
   * _errcode_ret_ will return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.

 // refError

 {clCreatePipe} returns a valid non-zero pipe object and _errcode_ret_ is set
 to {CL_SUCCESS} if the pipe object is created successfully.
 Otherwise, it returns a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_OPERATION} if no devices in _context_ support pipes.
   * {CL_INVALID_VALUE} if values specified in _flags_ are not as defined
     above.
   * {CL_INVALID_VALUE} if _properties_ is not `NULL`.
   * {CL_INVALID_PIPE_SIZE} if _pipe_packet_size_ is 0 or the
     _pipe_packet_size_ exceeds {CL_DEVICE_PIPE_MAX_PACKET_SIZE} value
     specified in the <<device-queries-table,Device Queries>> table for all
     devices in _context_ or if _pipe_max_packets_ is 0.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for the pipe object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.

 Pipes follow the same memory consistency model as defined for buffer and
 image objects.
 The pipe state, i.e. the contents of the pipe across kernel-instances (on the
 same or different devices), is enforced at a synchronization point.
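
 The argument rules for {clCreatePipe} can be summarized as a host-side
 pre-check.
 The sketch below makes several assumptions: the `SKETCH_` constants are
 local stand-ins intended to mirror the values in `CL/cl.h`,
 `check_pipe_args` is a hypothetical helper, `device_max_packet_size` is
 assumed to be the largest {CL_DEVICE_PIPE_MAX_PACKET_SIZE} among the devices
 in the context, and the order in which errors are reported is arbitrary
 because the specification does not define a precedence.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins so the sketch compiles without OpenCL headers. */
#define SKETCH_SUCCESS              0
#define SKETCH_INVALID_VALUE      (-30)
#define SKETCH_INVALID_PIPE_SIZE  (-69)
#define SKETCH_MEM_READ_WRITE      (1u << 0)
#define SKETCH_MEM_HOST_NO_ACCESS  (1u << 9)

int check_pipe_args(uint64_t flags, uint32_t packet_size,
                    uint32_t max_packets, uint32_t device_max_packet_size,
                    const void *properties)
{
    const uint64_t allowed = SKETCH_MEM_READ_WRITE |
                             SKETCH_MEM_HOST_NO_ACCESS;

    /* flags == 0 selects the default and is always acceptable. */
    if ((flags & ~allowed) != 0)
        return SKETCH_INVALID_VALUE;
    /* properties must currently be NULL. */
    if (properties != NULL)
        return SKETCH_INVALID_VALUE;
    if (packet_size == 0 || packet_size > device_max_packet_size ||
        max_packets == 0)
        return SKETCH_INVALID_PIPE_SIZE;
    return SKETCH_SUCCESS;
}
```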
 --


 === Pipe Object Queries

 To get information that is common to all memory objects, use the
 {clGetMemObjectInfo} function described in <<memory-object-queries, Memory
 Object Queries>>.

 [open,refpage='clGetPipeInfo',desc='Get information specific to a pipe object created with clCreatePipe.',type='protos']
 --
 To get information specific to a pipe object created with {clCreatePipe},
 call the function

 include::{generated}/api/protos/clGetPipeInfo.txt[]
 include::{generated}/api/version-notes/clGetPipeInfo.asciidoc[]

   * _pipe_ specifies the pipe object being queried.
   * _param_name_ specifies the information to query.
     The list of supported _param_name_ types and the information returned in
     _param_value_ by {clGetPipeInfo} is described in the <<pipe-info-table,Pipe
     Object Queries>> table.
   * _param_value_ is a pointer to memory where the appropriate result being
     queried is returned.
     If _param_value_ is `NULL`, it is ignored.
   * _param_value_size_ is used to specify the size in bytes of memory pointed to
     by _param_value_.
     This size must be {geq} size of return type as described in the
     <<pipe-info-table,Pipe Object Queries>> table.
   * _param_value_size_ret_ returns the actual size in bytes of data being
     queried by _param_name_.
     If _param_value_size_ret_ is `NULL`, it is ignored.

 // refError

 {clGetPipeInfo} returns {CL_SUCCESS} if the function is executed successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_MEM_OBJECT} if _pipe_ is not a valid pipe object.
   * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
     specified by _param_value_size_ is < size of return type as described in
     the <<pipe-info-table,Pipe Object Queries>> table and _param_value_ is
     not `NULL`.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.

 [[pipe-info-table]]
 .List of supported param_names by <<clGetPipeInfo>>
 [width="100%",cols="<33%,<17%,<50%",options="header"]
 |====
 | Pipe Info | Return type | Description
 | {CL_PIPE_PACKET_SIZE_anchor}

 include::{generated}/api/version-notes/CL_PIPE_PACKET_SIZE.asciidoc[]
   | {cl_uint_TYPE}
       | Return pipe packet size specified when _pipe_ is created with
         {clCreatePipe}.
 | {CL_PIPE_MAX_PACKETS_anchor}

 include::{generated}/api/version-notes/CL_PIPE_MAX_PACKETS.asciidoc[]
   | {cl_uint_TYPE}
       | Return max. number of packets specified when _pipe_ is created with
         {clCreatePipe}.

 | {CL_PIPE_PROPERTIES_anchor}

 include::{generated}/api/version-notes/CL_PIPE_PROPERTIES.asciidoc[]
   | {cl_pipe_properties_TYPE}[]
       | Return the properties argument specified in {clCreatePipe}.

         If the _properties_ argument specified in {clCreatePipe} used to
         create _pipe_ was not `NULL`, the implementation must return the
         values specified in the properties argument in the same order and
         without including additional properties.

         If the _properties_ argument specified in {clCreatePipe} used to
         create _pipe_ was `NULL`, the implementation must return
         _param_value_size_ret_ equal to 0, indicating that there are no
         properties to be returned.
 |====
 --


 == Querying, Unmapping, Migrating, Retaining and Releasing Memory Objects

 // === Handling Memory Objects


 === Retaining and Releasing Memory Objects

 [open,refpage='clRetainMemObject',desc='Increments the memory object reference count.',type='protos']
 --
 To retain a memory object, call the function

 include::{generated}/api/protos/clRetainMemObject.txt[]
 include::{generated}/api/version-notes/clRetainMemObject.asciidoc[]

   * _memobj_ specifies the memory object to be retained.

 The _memobj_ reference count is incremented.

 // refError

 {clRetainMemObject} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_MEM_OBJECT} if _memobj_ is not a valid memory object (buffer
     or image object).
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.

 {clCreateBuffer}, {clCreateBufferWithProperties}, {clCreateSubBuffer},
 {clCreateImage}, {clCreateImageWithProperties}, {clCreateImage2D},
 {clCreateImage3D} and {clCreatePipe} perform an implicit retain.
 --

 [open,refpage='clReleaseMemObject',desc='Decrements the memory object reference count.',type='protos']
 --
 To release a memory object, call the function

 include::{generated}/api/protos/clReleaseMemObject.txt[]
 include::{generated}/api/version-notes/clReleaseMemObject.asciidoc[]

   * _memobj_ specifies the memory object to be released.

 The _memobj_ reference count is decremented.

 // refError

 {clReleaseMemObject} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_MEM_OBJECT} if _memobj_ is not a valid memory object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.

 After the _memobj_ reference count becomes zero and commands queued for
 execution on a command-queue(s) that use _memobj_ have finished, the memory
 object is deleted.
 If _memobj_ is a buffer object, _memobj_ cannot be deleted until all
 sub-buffer objects associated with _memobj_ are deleted.
 Using this function to release a reference that was not obtained by creating
 the object or by calling {clRetainMemObject} causes undefined behavior.
 --

 [open,refpage='clSetMemObjectDestructorCallback',desc='Registers a destructor callback function with a memory object.',type='protos']
 --
 To register a callback function with a memory object that is called when
 the memory object is destroyed, call the function

 include::{generated}/api/protos/clSetMemObjectDestructorCallback.txt[]
 include::{generated}/api/version-notes/clSetMemObjectDestructorCallback.asciidoc[]

   * _memobj_ specifies the memory object to register the callback to.
   * _pfn_notify_ is the callback function to register.
     This callback function may be called asynchronously by the OpenCL
     implementation.
     It is the application's responsibility to ensure that the callback function
     is thread-safe.
     The parameters to this callback function are:
   ** _memobj_ is the memory object being deleted.
      When the callback function is called by the implementation, this memory
      object is no longer valid.
      _memobj_ is only provided for reference purposes.
   ** _user_data_ is a pointer to user-supplied data.
   * _user_data_ will be passed as the _user_data_ argument when _pfn_notify_ is
     called.
     _user_data_ can be `NULL`.

 Each call to {clSetMemObjectDestructorCallback} registers the specified
 callback function on a destructor callback stack associated with _memobj_.
 The registered callback functions are called in the reverse order in
 which they were registered.
 The registered callback functions are called and then the memory object's
 resources are freed and the memory object is deleted.
 Therefore, the memory object destructor callback provides a mechanism for
 an application to safely re-use or free a _host_ptr_ that was specified when
 _memobj_ was created and used as the storage bits for the memory object.
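
 The reverse-order invocation described above behaves like a stack.
 The following toy model illustrates only that ordering; it is not how any
 implementation is required to store callbacks.
 All names (`callback_stack`, `push_callback`, `run_callbacks`,
 `log_callback`) are hypothetical, and the memory object is modeled as a
 plain `void *` rather than a `cl_mem`.

```c
#include <stddef.h>

#define MAX_CALLBACKS 8

typedef void (*destructor_fn)(void *memobj, void *user_data);

typedef struct {
    destructor_fn fn[MAX_CALLBACKS];
    void *user_data[MAX_CALLBACKS];
    size_t count;
} callback_stack;

/* Register a callback; returns 0 on success, -1 if fn is NULL or the
   stack is full. */
int push_callback(callback_stack *s, destructor_fn fn, void *user_data)
{
    if (fn == NULL || s->count == MAX_CALLBACKS)
        return -1;
    s->fn[s->count] = fn;
    s->user_data[s->count] = user_data;
    s->count++;
    return 0;
}

/* Invoke callbacks in reverse registration order, emptying the stack. */
void run_callbacks(callback_stack *s, void *memobj)
{
    while (s->count > 0) {
        s->count--;
        s->fn[s->count](memobj, s->user_data[s->count]);
    }
}

/* Example callback: appends the character pointed to by user_data to a
   log so the call order can be inspected. */
static char call_log[MAX_CALLBACKS + 1];
static size_t call_log_len;

static void log_callback(void *memobj, void *user_data)
{
    (void)memobj; /* provided for reference only */
    if (call_log_len < MAX_CALLBACKS)
        call_log[call_log_len++] = *(const char *)user_data;
}
```

 Registering `log_callback` three times with the characters `A`, `B` and `C`
 and then running the stack records the calls in the order `C`, `B`, `A`.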

 // refError

 {clSetMemObjectDestructorCallback} returns {CL_SUCCESS} if the function is
 executed successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_MEM_OBJECT} if _memobj_ is not a valid memory object.
   * {CL_INVALID_VALUE} if _pfn_notify_ is `NULL`.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.

 [NOTE]
 ====
 When the user callback function is called by the implementation, the
 contents of the memory region pointed to by _host_ptr_ (if the memory object
 is created with {CL_MEM_USE_HOST_PTR}) are undefined.
 The callback function is typically used by the application to either free or
 reuse the memory region pointed to by _host_ptr_.

 The behavior of calling expensive system routines, OpenCL API calls to
 create contexts or command-queues, or blocking OpenCL operations from the
 list below, in a callback is undefined.

   * {clFinish},
   * {clWaitForEvents},
   * blocking calls to {clEnqueueReadBuffer}, {clEnqueueReadBufferRect},
     {clEnqueueWriteBuffer}, {clEnqueueWriteBufferRect},
   * blocking calls to {clEnqueueReadImage} and {clEnqueueWriteImage},
   * blocking calls to {clEnqueueMapBuffer}, {clEnqueueMapImage},
   * blocking calls to {clBuildProgram}, {clCompileProgram} or
     {clLinkProgram}

 If an application needs to wait for completion of a routine from the above
 list in a callback, use the non-blocking form of the function and assign a
 completion callback to it to do the remainder of the work.
 Note that when a callback (or other code) enqueues commands to a
 command-queue, the commands are not required to begin execution until the
 queue is flushed.
 In standard usage, blocking enqueue calls serve this role by implicitly
 flushing the queue.
 Since blocking calls are not permitted in callbacks, those callbacks that
 enqueue commands on a command queue should either call {clFlush} on the
 queue before returning or arrange for {clFlush} to be called later on
 another thread.

 The user callback function must not call OpenCL APIs with the memory object
 for which the callback function is invoked; the behavior of such calls is
 undefined.
 ====
 --


 [[unmapping-mapped-memory]]
 === Unmapping Mapped Memory Objects

 [open,refpage='clEnqueueUnmapMemObject',desc='Enqueues a command to unmap a previously mapped region of a memory object.',type='protos']
 --
 To enqueue a command to unmap a previously mapped region of a memory object,
 call the function

 include::{generated}/api/protos/clEnqueueUnmapMemObject.txt[]
 include::{generated}/api/version-notes/clEnqueueUnmapMemObject.asciidoc[]

   * _command_queue_ must be a valid host command-queue.
   * _memobj_ is a valid memory (buffer or image) object.
     The OpenCL context associated with _command_queue_ and _memobj_ must be the
     same.
   * _mapped_ptr_ is the host address returned by a previous call to
     {clEnqueueMapBuffer}, or {clEnqueueMapImage} for _memobj_.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before {clEnqueueUnmapMemObject} can be executed.
     If _event_wait_list_ is `NULL`, then {clEnqueueUnmapMemObject} does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this command and
     can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.

 Reads or writes from the host using the pointer returned by
 {clEnqueueMapBuffer} or {clEnqueueMapImage} are considered to be complete.

 {clEnqueueMapBuffer} and {clEnqueueMapImage} increment the mapped count of
 the memory object.
 The initial mapped count value of the memory object is zero.
 Multiple calls to {clEnqueueMapBuffer} or {clEnqueueMapImage} on the same
 memory object increment this mapped count once per call.
 {clEnqueueUnmapMemObject} decrements the mapped count of the memory object.

 {clEnqueueMapBuffer}, and {clEnqueueMapImage} act as synchronization points
 for a region of the buffer object being mapped.

 // refError

 {clEnqueueUnmapMemObject} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_MEM_OBJECT} if _memobj_ is not a valid memory object or is a
     pipe object.
   * {CL_INVALID_VALUE} if _mapped_ptr_ is not a valid pointer returned by
     {clEnqueueMapBuffer} or {clEnqueueMapImage} for _memobj_.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or if _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
   * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
     _memobj_ are not the same or if the context associated with
     _command_queue_ and events in _event_wait_list_ are not the same.
 --


 [[accessing-mapped-regions]]
 === Accessing mapped regions of a memory object

 This section describes the behavior of OpenCL commands that access mapped
 regions of a memory object.

 The contents of the region of a memory object and associated memory objects
 (sub-buffer objects or 1D image buffer objects that overlap this region)
 mapped for writing (i.e. {CL_MAP_WRITE} or {CL_MAP_WRITE_INVALIDATE_REGION} is
 set in the _map_flags_ argument to {clEnqueueMapBuffer} or {clEnqueueMapImage})
 are considered to be undefined until this region is unmapped.

 Multiple commands in command-queues can map a region or overlapping regions
 of a memory object and associated memory objects (sub-buffer objects or 1D
 image buffer objects that overlap this region) for reading (i.e. _map_flags_
 = {CL_MAP_READ}).
 The contents of the regions of a memory object mapped for reading can also
 be read by kernels and other OpenCL commands (such as {clEnqueueCopyBuffer})
 executing on one or more devices.

 Mapping (and unmapping) overlapped regions in a memory object and/or
 associated memory objects (sub-buffer objects or 1D image buffer objects
 that overlap this region) for writing is an error and results in a
 {CL_INVALID_OPERATION} error being returned by {clEnqueueMapBuffer} or
 {clEnqueueMapImage}.

 If a memory object is currently mapped for writing, the application must
 ensure that the memory object is unmapped before any enqueued kernels or
 commands that read from or write to this memory object or any of its
 associated memory objects (sub-buffer or 1D image buffer objects) or its
 parent object (if the memory object is a sub-buffer or 1D image buffer
 object) begin execution; otherwise the behavior is undefined.

 If a memory object is currently mapped for reading, the application must
 ensure that the memory object is unmapped before any enqueued kernels or
 commands that write to this memory object or any of its associated memory
 objects (sub-buffer or 1D image buffer objects) or its parent object (if the
 memory object is a sub-buffer or 1D image buffer object) begin execution;
 otherwise the behavior is undefined.

 A memory object is considered mapped if there are one or more active
 mappings for the memory object, irrespective of whether the mapped regions
 span the entire memory object.
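
 The two rules above reduce to a small decision table, sketched here in C.
 The enumerations and the function `command_may_execute` are hypothetical
 helpers introduced for illustration only; they are not OpenCL types.
 A command that both reads and writes is assumed to be classified as
 `CMD_WRITES`, and the state is assumed to cover the memory object together
 with its parent and associated memory objects.

```c
#include <stdbool.h>

/* Hypothetical local enumerations; not OpenCL types or flag values. */
typedef enum { MAPPED_NONE, MAPPED_FOR_READ, MAPPED_FOR_WRITE } map_state;
typedef enum { CMD_READS, CMD_WRITES } cmd_access;

/* Returns true if a kernel or command with the given access may begin
 * execution while the memory object is in the given mapping state
 * without running into undefined behavior. */
static bool command_may_execute(map_state state, cmd_access access)
{
    switch (state) {
    case MAPPED_NONE:
        return true;                 /* no active mapping: no restriction */
    case MAPPED_FOR_READ:
        return access == CMD_READS;  /* readers may coexist, writers may not */
    case MAPPED_FOR_WRITE:
        return false;                /* must be unmapped first */
    }
    return false;
}
```

 An application could apply such a check before releasing the events that
 gate a dependent command, although tracking the mapping state is entirely
 the responsibility of the application.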

 Accessing the contents of the memory region referred to by a mapped pointer
 after that pointer has been unmapped is undefined.

 The mapped pointer returned by {clEnqueueMapBuffer} or {clEnqueueMapImage}
 can be used as the _ptr_ argument value to {clEnqueueReadBuffer},
 {clEnqueueWriteBuffer}, {clEnqueueReadBufferRect},
 {clEnqueueWriteBufferRect}, {clEnqueueReadImage}, or
 {clEnqueueWriteImage} provided the rules described above are adhered to.


 === Migrating Memory Objects

 NOTE: Migrating memory objects is <<unified-spec, missing before>> version 1.2.

 This section describes a mechanism for specifying the device on which an
 OpenCL memory object resides.
 A user may wish to have more explicit control over the location of their
 memory objects on creation.
 This could be used to:

   * Ensure that an object is allocated on a specific device prior to usage.
   * Preemptively migrate an object from one device to another.

 [open,refpage='clEnqueueMigrateMemObjects',desc='Enqueues a command to indicate which device a set of memory objects should be associated with.',type='protos']
 --
 To enqueue a command to indicate which device a set of memory objects should
 be associated with, call the function

 include::{generated}/api/protos/clEnqueueMigrateMemObjects.txt[]
 include::{generated}/api/version-notes/clEnqueueMigrateMemObjects.asciidoc[]

   * _command_queue_ is a valid host command-queue.
     The specified set of memory objects in _mem_objects_ will be migrated to the
     OpenCL device associated with _command_queue_ or to the host if
     {CL_MIGRATE_MEM_OBJECT_HOST} has been specified.
   * _num_mem_objects_ is the number of memory objects specified in
     _mem_objects_.
   * _mem_objects_ is a pointer to a list of memory objects.
   * _flags_ is a bit-field that is used to specify migration options.
     The <<migration-flags-table,Memory Migration Flags>> table describes the
     possible values for _flags_.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this command and
     can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.
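
 The pairing rule for _event_wait_list_ and _num_events_in_wait_list_ can be
 checked before enqueuing a command.
 The following is a minimal sketch of that check; `wait_list_args_consistent`
 is a hypothetical helper, not an OpenCL function, and it does not verify
 that the events themselves are valid or belong to the right context.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the two wait-list arguments agree with each other:
 * a NULL list requires a count of 0, and a non-NULL list requires a
 * count greater than 0. */
static bool wait_list_args_consistent(unsigned int num_events_in_wait_list,
                                      const void *event_wait_list)
{
    if (event_wait_list == NULL)
        return num_events_in_wait_list == 0;
    return num_events_in_wait_list > 0;
}
```

 Arguments that fail this check correspond to the
 {CL_INVALID_EVENT_WAIT_LIST} cases that involve a `NULL` list or a zero
 count.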

 [[migration-flags-table]]
 .List of supported migration flags by <<clEnqueueMigrateMemObjects>>
 [width="100%",cols="<50%,<50%",options="header"]
 |====
 | Memory Migration Flags | Description
 | {CL_MIGRATE_MEM_OBJECT_HOST_anchor}

 include::{generated}/api/version-notes/CL_MIGRATE_MEM_OBJECT_HOST.asciidoc[]
   | This flag indicates that the specified set of memory objects are to be
     migrated to the host, regardless of the target command-queue.
 | {CL_MIGRATE_MEM_OBJECT_CONTENT_UNDEFINED_anchor}

 include::{generated}/api/version-notes/CL_MIGRATE_MEM_OBJECT_CONTENT_UNDEFINED.asciidoc[]
   | This flag indicates that the contents of the set of memory objects are
     undefined after migration.
     The specified set of memory objects are migrated to the device
     associated with _command_queue_ without incurring the overhead of
     migrating their contents.
 |====
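
 How the two flags in the table combine can be summarized in a few lines of
 C.
 This is a sketch under stated assumptions: the `MIGRATE_*` macros are local
 stand-ins for the flags in the table (an application would use the constants
 from the OpenCL headers), the helper functions are hypothetical, and the
 sketch assumes that any combination of the two bits is acceptable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the flags in the table; illustrative only. */
#define MIGRATE_HOST              (UINT64_C(1) << 0)
#define MIGRATE_CONTENT_UNDEFINED (UINT64_C(1) << 1)

typedef enum { TARGET_QUEUE_DEVICE, TARGET_HOST } migrate_target;

/* A flags value is accepted if it is 0 or uses only the known bits. */
static bool migrate_flags_valid(uint64_t flags)
{
    return (flags & ~(MIGRATE_HOST | MIGRATE_CONTENT_UNDEFINED)) == 0;
}

/* The host flag overrides the device associated with the command-queue. */
static migrate_target migrate_destination(uint64_t flags)
{
    return (flags & MIGRATE_HOST) ? TARGET_HOST : TARGET_QUEUE_DEVICE;
}

/* Contents survive the migration unless they are declared undefined. */
static bool migrate_preserves_contents(uint64_t flags)
{
    return (flags & MIGRATE_CONTENT_UNDEFINED) == 0;
}
```

 For example, a _flags_ value of 0 selects the device associated with
 _command_queue_ and keeps the contents of the memory objects.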

 Typically, memory objects are implicitly migrated to a device for which
 enqueued commands, using the memory object, are targeted.
 {clEnqueueMigrateMemObjects} allows this migration to be explicitly
 performed ahead of the dependent commands.
 This allows a user to preemptively change the association of a memory
 object, through regular command queue scheduling, in order to prepare for
 another upcoming command.
 This also permits an application to overlap the placement of memory objects
 with other unrelated operations before these memory objects are needed,
 potentially hiding transfer latencies.
 Once the event returned from {clEnqueueMigrateMemObjects} has been marked
 {CL_COMPLETE}, the memory objects specified in _mem_objects_ have been
 successfully migrated to the device associated with _command_queue_.
 The migrated memory object shall remain resident on the device until another
 command is enqueued that either implicitly or explicitly migrates it away.

 {clEnqueueMigrateMemObjects} can also be used to direct the initial
 placement of a memory object, after creation, possibly avoiding the initial
 overhead of instantiating the object on the first enqueued command to use
 it.

 The user is responsible for managing the event dependencies, associated with
 this command, in order to avoid overlapping access to memory objects.
 Improperly specified event dependencies passed to
 {clEnqueueMigrateMemObjects} could result in undefined results.

 // refError

 {clEnqueueMigrateMemObjects} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
     memory objects in _mem_objects_ are not the same or if the context
     associated with _command_queue_ and events in _event_wait_list_ are not
     the same.
   * {CL_INVALID_MEM_OBJECT} if any of the memory objects in _mem_objects_ is
     not a valid memory object.
   * {CL_INVALID_VALUE} if _num_mem_objects_ is zero or if _mem_objects_ is
     `NULL`.
   * {CL_INVALID_VALUE} if _flags_ is not 0 and is not a valid combination of
     the values described in the table above.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for the specified set of memory objects in _mem_objects_.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --


 [[memory-object-queries]]
 === Memory Object Queries

 [open,refpage='clGetMemObjectInfo',desc='Get information that is common to all memory objects (buffer and image objects).',type='protos']
 --
 To get information that is common to all memory objects (buffer and image
 objects), call the function

 include::{generated}/api/protos/clGetMemObjectInfo.txt[]
 include::{generated}/api/version-notes/clGetMemObjectInfo.asciidoc[]

   * _memobj_ specifies the memory object being queried.
   * _param_name_ specifies the information to query.
     The list of supported _param_name_ types and the information returned in
     _param_value_ by {clGetMemObjectInfo} is described in the
     <<mem-info-table,Memory Object Info>> table.
   * _param_value_ is a pointer to memory where the appropriate result being
     queried is returned.
     If _param_value_ is `NULL`, it is ignored.
   * _param_value_size_ is used to specify the size in bytes of memory pointed to
     by _param_value_.
     This size must be {geq} size of return type as described in the
     <<mem-info-table,Memory Object Info>> table.
   * _param_value_size_ret_ returns the actual size in bytes of data being
     queried by _param_name_.
     If _param_value_size_ret_ is `NULL`, it is ignored.
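
 The conventions for _param_value_, _param_value_size_, and
 _param_value_size_ret_ can be illustrated with a small model of how a query
 result is handed back.
 This is a sketch and not the implementation of {clGetMemObjectInfo}:
 `return_query_result` and the `QUERY_*` codes are hypothetical names used
 only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes for this sketch only. */
enum { QUERY_OK = 0, QUERY_INVALID_VALUE = -1 };

/* Hands back a result of result_size bytes following the conventions
 * described above: a NULL param_value is ignored, a non-NULL param_value
 * must be large enough, and a NULL param_value_size_ret is ignored. */
static int return_query_result(const void *result, size_t result_size,
                               size_t param_value_size, void *param_value,
                               size_t *param_value_size_ret)
{
    if (param_value != NULL) {
        if (param_value_size < result_size)
            return QUERY_INVALID_VALUE;
        if (result_size > 0)
            memcpy(param_value, result, result_size);
    }
    if (param_value_size_ret != NULL)
        *param_value_size_ret = result_size;
    return QUERY_OK;
}
```

 A caller can therefore query the required size first, by passing `NULL` for
 _param_value_, and then repeat the query with a buffer of that size.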

 [[mem-info-table]]
 .List of supported param_names by <<clGetMemObjectInfo>>
 [width="100%",cols="<33%,<17%,<50%",options="header"]
 |====
 | Memory Object Info | Return type | Description
 | {CL_MEM_TYPE_anchor}

 include::{generated}/api/version-notes/CL_MEM_TYPE.asciidoc[]
   | {cl_mem_object_type_TYPE}
       | Returns one of the following values:

         {CL_MEM_OBJECT_BUFFER_anchor} if _memobj_ is created with {clCreateBuffer},
         {clCreateBufferWithProperties}, or {clCreateSubBuffer}.

         {CL_MEM_OBJECT_IMAGE2D} if _memobj_ is created with {clCreateImage2D}.

         {CL_MEM_OBJECT_IMAGE3D} if _memobj_ is created with {clCreateImage3D}.

         The value of __image_desc__->__image_type__ if _memobj_ is created with
         {clCreateImage} or {clCreateImageWithProperties}.

         {CL_MEM_OBJECT_PIPE_anchor} if _memobj_ is created with {clCreatePipe}.
 | {CL_MEM_FLAGS_anchor}

 include::{generated}/api/version-notes/CL_MEM_FLAGS.asciidoc[]
   | {cl_mem_flags_TYPE}
       | Return the flags argument value specified when _memobj_ is created
         with {clCreateBuffer}, +
         {clCreateBufferWithProperties}, +
         {clCreateSubBuffer}, +
         {clCreateImage}, +
         {clCreateImageWithProperties}, +
         {clCreateImage2D}, +
         {clCreateImage3D}, or +
         {clCreatePipe}.

         If _memobj_ is a sub-buffer, the memory access qualifiers inherited
         from the parent buffer are also returned.
 | {CL_MEM_SIZE_anchor}

 include::{generated}/api/version-notes/CL_MEM_SIZE.asciidoc[]
   | {size_t_TYPE}
       | Return actual size of the data store associated with _memobj_ in
         bytes.
 | {CL_MEM_HOST_PTR_anchor}

 include::{generated}/api/version-notes/CL_MEM_HOST_PTR.asciidoc[]
   | {void_TYPE}*
       | If _memobj_ is created with {clCreateBuffer}, {clCreateBufferWithProperties},
         {clCreateImage}, {clCreateImageWithProperties}, {clCreateImage2D}, or
         {clCreateImage3D}, and {CL_MEM_USE_HOST_PTR} is specified in mem_flags,
         return the host_ptr argument value specified when _memobj_ is created.

         Otherwise, if _memobj_ is created with {clCreateSubBuffer}, and
         _memobj_ is created from a buffer that was created with
         {CL_MEM_USE_HOST_PTR} specified in mem_flags, return the host_ptr
         passed to {clCreateBuffer} or {clCreateBufferWithProperties}, plus
         the origin value specified in buffer_create_info when _memobj_ is
         created.

         Otherwise, returns `NULL`.
 | {CL_MEM_MAP_COUNT_anchor} footnote:[{fn-map-count-usage}]

 include::{generated}/api/version-notes/CL_MEM_MAP_COUNT.asciidoc[]
   | {cl_uint_TYPE}
       | Map count.
 | {CL_MEM_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}]

 include::{generated}/api/version-notes/CL_MEM_REFERENCE_COUNT.asciidoc[]
   | {cl_uint_TYPE}
       | Return _memobj_ reference count.
 | {CL_MEM_CONTEXT_anchor}

 include::{generated}/api/version-notes/CL_MEM_CONTEXT.asciidoc[]
   | {cl_context_TYPE}
       | Return context specified when memory object is created.
         If _memobj_ is created using {clCreateSubBuffer}, the context
         associated with the memory object specified as the _buffer_ argument
         to {clCreateSubBuffer} is returned.
 | {CL_MEM_ASSOCIATED_MEMOBJECT_anchor}

 include::{generated}/api/version-notes/CL_MEM_ASSOCIATED_MEMOBJECT.asciidoc[]
   | {cl_mem_TYPE}
       | Return memory object from which _memobj_ is created.

         This returns the memory object specified as the _buffer_ argument to
         {clCreateSubBuffer} if _memobj_ is a sub-buffer object created using
         {clCreateSubBuffer}.

         This returns __image_desc__->__mem_object__ if _memobj_
         is an image object created using {clCreateImage} or
         {clCreateImageWithProperties}.

         Otherwise, returns `NULL`.
 | {CL_MEM_OFFSET_anchor}

 include::{generated}/api/version-notes/CL_MEM_OFFSET.asciidoc[]
   | {size_t_TYPE}
       | Return offset if _memobj_ is a sub-buffer object created using
         {clCreateSubBuffer}.

         This returns 0 if _memobj_ is not a sub-buffer object.
 | {CL_MEM_USES_SVM_POINTER_anchor}

 include::{generated}/api/version-notes/CL_MEM_USES_SVM_POINTER.asciidoc[]
   | {cl_bool_TYPE}
       | Return {CL_TRUE} if _memobj_ is a buffer object that was created with
         {CL_MEM_USE_HOST_PTR} or is a sub-buffer object of a buffer object
         that was created with {CL_MEM_USE_HOST_PTR} and the _host_ptr_
         specified when the buffer object was created is a SVM pointer;
         otherwise returns {CL_FALSE}.
 | {CL_MEM_PROPERTIES_anchor}

 include::{generated}/api/version-notes/CL_MEM_PROPERTIES.asciidoc[]
   | {cl_mem_properties_TYPE}[]
       | Return the properties argument specified in
         {clCreateBufferWithProperties} or {clCreateImageWithProperties}.

         If the _properties_ argument specified in
         {clCreateBufferWithProperties} or {clCreateImageWithProperties}
         used to create _memobj_ was not `NULL`, the implementation must
         return the values specified in the properties argument in the
         same order and without including additional properties.

         If _memobj_ was created using {clCreateBuffer},
         {clCreateSubBuffer}, {clCreateImage}, {clCreateImage2D}, or
         {clCreateImage3D}, or if the _properties_ argument specified
         in {clCreateBufferWithProperties} or
         {clCreateImageWithProperties} was `NULL`, the implementation
         must return _param_value_size_ret_ equal to 0, indicating that
         there are no properties to be returned.
 |====

 // refError

 {clGetMemObjectInfo} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_MEM_OBJECT} if _memobj_ is not a valid memory object.
   * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
     specified by _param_value_size_ is < size of return type as described in
     the <<mem-info-table,Memory Object Info>> table and _param_value_ is not
     `NULL`.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --


 == Shared Virtual Memory

 NOTE: Shared virtual memory is <<unified-spec, missing before>> version 2.0.

 Shared virtual memory (a.k.a. SVM) allows the host and kernels executing on
 devices to directly share complex, pointer-containing data structures such as
 trees and linked lists.
 It also eliminates the need to marshal data between the host and devices.
 As a result, SVM substantially simplifies OpenCL programming and may improve
 performance.


 === SVM sharing granularity: coarse- and fine- grained sharing

 OpenCL maintains memory consistency in a coarse-grained fashion in regions
 of buffers.
 We call this coarse-grained sharing.
 Many platforms such as those with integrated CPU-GPU processors and ones
 using the SVM-related PCI-SIG IOMMU services can do better, and can support
 sharing at a granularity smaller than a buffer.
 We call this fine-grained sharing.

   * Coarse-grained sharing: Coarse-grain sharing may be used for memory and
     virtual pointer sharing between multiple devices as well as between the
     host and one or more devices.
     The shared memory region is a memory buffer allocated using
     {clSVMAlloc}.
     Memory consistency is guaranteed at synchronization points and the host
     can use calls to {clEnqueueSVMMap} and {clEnqueueSVMUnmap} or create a
     {cl_mem_TYPE} buffer object using the SVM pointer and use OpenCL's existing host
     API functions {clEnqueueMapBuffer} and {clEnqueueUnmapMemObject} to
     update regions of the buffer.
     What coarse-grain buffer SVM adds to OpenCL's earlier buffer support are
     the ability to share virtual memory pointers and a guarantee that
     concurrent access to the same memory allocation from multiple kernels on
     a single device is valid.
     The coarse-grain buffer SVM provides a memory consistency model similar
     to the global memory consistency model described in _sections 3.3.1_ and
     _3.4.3_ of the OpenCL 1.2 specification.
     This memory consistency applies to the regions of buffers being shared
     in a coarse-grained fashion.
     It is enforced at the synchronization points between commands enqueued
     to command queues in a single context with the additional consideration
     that multiple kernels concurrently running on the same device may safely
     share the data.
   * Fine-grained sharing: Shared virtual memory where memory consistency is
     maintained at a granularity smaller than a buffer.
     How fine-grained SVM is used depends on whether the device supports SVM
     atomic operations.
   ** If SVM atomic operations are supported, they provide memory consistency
      for loads and stores by the host and kernels executing on devices
      supporting SVM.
      This means that the host and devices can concurrently read and update
      the same memory.
      The consistency provided by SVM atomics is in addition to the
      consistency provided at synchronization points.
      There is no need for explicit calls to {clEnqueueSVMMap} and
      {clEnqueueSVMUnmap} or {clEnqueueMapBuffer} and
      {clEnqueueUnmapMemObject} on a {cl_mem_TYPE} buffer object created using the
      SVM pointer.
   ** If SVM atomic operations are not supported, the host and devices can
      concurrently read the same memory locations and can concurrently update
      non-overlapping memory regions, but attempts to update the same memory
      locations are undefined.
      Memory consistency is guaranteed at synchronization points without the
      need for explicit calls to {clEnqueueSVMMap} and {clEnqueueSVMUnmap}
      or {clEnqueueMapBuffer} and {clEnqueueUnmapMemObject} on a {cl_mem_TYPE}
      buffer object created using the SVM pointer.
   * There are two kinds of fine-grain sharing support.
     Devices may support either fine-grain buffer sharing or fine-grain
     system sharing.
   ** Fine-grain buffer sharing provides fine-grain SVM only within buffers
      and is an extension of coarse-grain sharing.
      To support fine-grain buffer sharing in an OpenCL context, all devices
      in the context must support {CL_DEVICE_SVM_FINE_GRAIN_BUFFER}.
   ** Fine-grain system sharing enables fine-grain sharing of the host's
      entire virtual memory, including memory regions allocated by the system
      *malloc* API.
      OpenCL buffer objects are unnecessary and programmers can pass pointers
      allocated using *malloc* to OpenCL kernels.

 As an illustration of fine-grain SVM using SVM atomic operations to maintain
 memory consistency, consider the following example.
 The host and a set of devices can simultaneously access and update a shared
 work-queue data structure holding work-items to be done.
 The host can use atomic operations to insert new work-items into the queue
 at the same time as the devices use similar atomic operations to remove
 work-items for processing.
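
 The work-queue idea can be sketched on the host side with C11 atomics.
 This is an illustrative model only: `work_queue`, `queue_init`,
 `queue_push`, and `queue_pop` are hypothetical names, the queue is a
 fixed-size ring with a single inserter, and sequentially consistent atomics
 are used for simplicity.
 In an application the structure would be placed in an SVM allocation that
 supports SVM atomic operations, and kernels would use the corresponding
 atomic functions of the OpenCL kernel language.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define QUEUE_CAPACITY 64u /* power of two, so indices wrap cleanly */

typedef struct {
    atomic_int  items[QUEUE_CAPACITY];
    atomic_uint head; /* index of the next work-item to remove */
    atomic_uint tail; /* index of the next free slot */
} work_queue;

static void queue_init(work_queue *q)
{
    for (unsigned int i = 0; i < QUEUE_CAPACITY; ++i)
        atomic_init(&q->items[i], 0);
    atomic_init(&q->head, 0u);
    atomic_init(&q->tail, 0u);
}

/* Inserter side (a single producer, for example the host). */
static bool queue_push(work_queue *q, int item)
{
    unsigned int tail = atomic_load(&q->tail);
    unsigned int head = atomic_load(&q->head);
    if (tail - head == QUEUE_CAPACITY)
        return false;                      /* queue is full */
    atomic_store(&q->items[tail % QUEUE_CAPACITY], item);
    atomic_store(&q->tail, tail + 1u);     /* publish the new work-item */
    return true;
}

/* Remover side (any number of consumers). */
static bool queue_pop(work_queue *q, int *item)
{
    unsigned int head = atomic_load(&q->head);
    for (;;) {
        if (head == atomic_load(&q->tail))
            return false;                  /* queue is empty */
        int value = atomic_load(&q->items[head % QUEUE_CAPACITY]);
        /* Claim slot head; on failure head is refreshed and we retry. */
        if (atomic_compare_exchange_weak(&q->head, &head, head + 1u)) {
            *item = value;
            return true;
        }
    }
}
```

 Because a remover only keeps the value it read when the compare-and-exchange
 on `head` succeeds, two removers never return the same work-item.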

 It is the programmer's responsibility to ensure that no host code or
 executing kernels attempt to access a shared memory region after that memory
 is freed.
 We require the SVM implementation to work with either 32- or 64- bit host
 applications subject to the following requirement: the address space size
 must be the same for the host and all OpenCL devices in the context.

 [open,refpage='clSVMAlloc',desc='Allocates a shared virtual memory (SVM) buffer that can be shared by the host and all devices in an OpenCL context that support shared virtual memory.',type='protos']
 --
 To allocate a shared virtual memory buffer (referred to as a SVM buffer)
 that can be shared by the host and all devices in an OpenCL context that
 support shared virtual memory, call the function

 include::{generated}/api/protos/clSVMAlloc.txt[]
 include::{generated}/api/version-notes/clSVMAlloc.asciidoc[]

   * _context_ is a valid OpenCL context used to create the SVM buffer.
   * _flags_ is a bit-field that is used to specify allocation and usage
     information.
     The <<svm-flags-table,SVM Memory Flags>> table describes the possible values
     for _flags_.
   * _size_ is the size in bytes of the SVM buffer to be allocated.
   * _alignment_ is the minimum alignment in bytes that is required for the newly
     created buffer's memory region.
     It must be a power of two up to the size of the largest data type supported
     by the OpenCL device.
     For the full profile, the largest data type is long16.
     For the embedded profile, it is long16 if the device supports 64-bit
     integers; otherwise it is int16.
     If alignment is 0, a default alignment will be used that is equal to the
     size of largest data type supported by the OpenCL implementation.
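
 The _alignment_ rule can be checked on the host before calling {clSVMAlloc}.
 The following is a minimal sketch: `svm_alignment_acceptable` is a
 hypothetical helper, and the two size macros assume that long16 occupies 128
 bytes (16 elements of 8 bytes) and int16 occupies 64 bytes (16 elements of 4
 bytes).
 Passing this check does not guarantee that the implementation supports the
 alignment on every device in the context.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed sizes in bytes of the largest data types named above. */
#define LONG16_SIZE 128u /* 16 x 8-byte elements */
#define INT16_SIZE   64u /* 16 x 4-byte elements */

/* Returns true if alignment is 0 (default alignment) or a power of two
 * no larger than the size of the largest supported data type. */
static bool svm_alignment_acceptable(size_t alignment,
                                     size_t largest_type_size)
{
    if (alignment == 0)
        return true;
    bool power_of_two = (alignment & (alignment - 1)) == 0;
    return power_of_two && alignment <= largest_type_size;
}
```

 For instance, an alignment of 64 passes the check for a device whose largest
 data type is long16, while 96 does not because it is not a power of two.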

 [[svm-flags-table]]
 .List of supported SVM memory flag values
 [width="100%",cols="<50%,<50%",options="header"]
 |====
 | SVM Memory Flags | Description
 | {CL_MEM_READ_WRITE}
   | This flag specifies that the SVM buffer will be read and written by a
     kernel.
     This is the default.
 | {CL_MEM_WRITE_ONLY}
   | This flag specifies that the SVM buffer will be written but not read by
     a kernel.

     Reading from a SVM buffer created with {CL_MEM_WRITE_ONLY} inside a kernel
     is undefined.

     {CL_MEM_READ_WRITE} and {CL_MEM_WRITE_ONLY} are mutually exclusive.
 | {CL_MEM_READ_ONLY}
   | This flag specifies that the SVM buffer object is a read-only memory
     object when used inside a kernel.

     Writing to a SVM buffer created with {CL_MEM_READ_ONLY} inside a kernel is
     undefined.

     {CL_MEM_READ_WRITE} or {CL_MEM_WRITE_ONLY} and {CL_MEM_READ_ONLY} are mutually
     exclusive.
 | {CL_MEM_SVM_FINE_GRAIN_BUFFER_anchor}

 include::{generated}/api/version-notes/CL_MEM_SVM_FINE_GRAIN_BUFFER.asciidoc[]
   | This specifies that the application wants the OpenCL implementation to
     do a fine-grained allocation.
 | {CL_MEM_SVM_ATOMICS_anchor}

 include::{generated}/api/version-notes/CL_MEM_SVM_ATOMICS.asciidoc[]
   | This flag is valid only if {CL_MEM_SVM_FINE_GRAIN_BUFFER} is specified in
     flags.
     It is used to indicate that SVM atomic operations can control visibility
     of memory accesses in this SVM buffer.
 |====
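
 The constraints in the table can be expressed as a short consistency check
 on _flags_.
 This is a sketch only: the `SVM_*` macros are local stand-ins for the flags
 in the table with bit positions chosen for illustration (an application
 would use the constants from the OpenCL headers), and
 `svm_flags_consistent` is a hypothetical helper.
 It does not check whether the devices in the context support the requested
 capabilities.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the flags in the table; illustrative only. */
#define SVM_READ_WRITE        (UINT64_C(1) << 0)
#define SVM_WRITE_ONLY        (UINT64_C(1) << 1)
#define SVM_READ_ONLY         (UINT64_C(1) << 2)
#define SVM_FINE_GRAIN_BUFFER (UINT64_C(1) << 10)
#define SVM_ATOMICS           (UINT64_C(1) << 11)

static bool svm_flags_consistent(uint64_t flags)
{
    const uint64_t access = SVM_READ_WRITE | SVM_WRITE_ONLY | SVM_READ_ONLY;
    const uint64_t known  = access | SVM_FINE_GRAIN_BUFFER | SVM_ATOMICS;

    if (flags & ~known)
        return false;            /* unknown bit set */

    uint64_t chosen = flags & access;
    if (chosen & (chosen - 1))
        return false;            /* access flags are mutually exclusive */

    if ((flags & SVM_ATOMICS) && !(flags & SVM_FINE_GRAIN_BUFFER))
        return false;            /* atomics require a fine-grain buffer */

    return true;
}
```

 A value of 0 passes the check, which corresponds to the default of a
 coarse-grained allocation that kernels may read and write.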

 If {CL_MEM_SVM_FINE_GRAIN_BUFFER} is not specified, the buffer can be created
 as a coarse-grained SVM allocation.
 Similarly, if {CL_MEM_SVM_ATOMICS} is not specified, the buffer can be created
 without support for SVM atomic operations (refer to the OpenCL kernel
 language specifications).

 Calling {clSVMAlloc} does not itself provide consistency for the shared
 memory region.
 When the host cannot use the SVM atomic operations, it must rely on OpenCL's
 guaranteed memory consistency at synchronization points.

 For SVM to be used efficiently, the host and any devices sharing a buffer
 containing virtual memory pointers should have the same endianness.
 If the context passed to {clSVMAlloc} has devices with mixed endianness and
 the OpenCL implementation is unable to implement SVM because of that mixed
 endianness, {clSVMAlloc} will fail and return `NULL`.

 Although SVM is generally not supported for image objects, {clCreateImage}
 and {clCreateImageWithProperties}
 may create an image from a buffer (a 1D image from a buffer or a 2D image
 from buffer) if the buffer specified in its image description parameter is a
 SVM buffer.
 Such images have a linear memory representation so their memory can be
 shared using SVM.
 However, fine grained sharing and atomics are not supported for image reads
 and writes in a kernel.

 // refError

 {clSVMAlloc} returns a valid non-`NULL` shared virtual memory address if the
 SVM buffer is successfully allocated.
 Otherwise, like *malloc*, it returns a `NULL` pointer value.
 {clSVMAlloc} will fail if

   * _context_ is not a valid context, or no devices in _context_ support SVM.
   * _flags_ does not contain {CL_MEM_SVM_FINE_GRAIN_BUFFER} but does contain
     {CL_MEM_SVM_ATOMICS}.
   * Values specified in _flags_ do not follow rules described for supported
     values in the <<svm-flags-table,SVM Memory Flags>> table.
   * {CL_MEM_SVM_FINE_GRAIN_BUFFER} or {CL_MEM_SVM_ATOMICS} is specified in
     _flags_ and these are not supported by at least one device in _context_.
   * The values specified in _flags_ are not valid, i.e. don't match those
     defined in the <<svm-flags-table,SVM Memory Flags>> table.
   * _size_ is 0 or > {CL_DEVICE_MAX_MEM_ALLOC_SIZE} value for any device in
     _context_.
   * _alignment_ is not a power of two or the OpenCL implementation cannot
     support the specified alignment for at least one device in _context_.
   * There was a failure to allocate resources.
 --

 [open,refpage='clSVMFree',desc='Frees a shared virtual memory buffer allocated using clSVMAlloc.',type='protos']
 --
 To free a shared virtual memory buffer allocated using {clSVMAlloc}, call
 the function

 include::{generated}/api/protos/clSVMFree.txt[]
 include::{generated}/api/version-notes/clSVMFree.asciidoc[]

   * _context_ is a valid OpenCL context used to create the SVM buffer.
     If no devices in _context_ support SVM, no action occurs.
   * _svm_pointer_ must be the value returned by a call to {clSVMAlloc}.
     If a `NULL` pointer is passed in _svm_pointer_, no action occurs.

 Note that {clSVMFree} does not wait for previously enqueued commands that
 may be using _svm_pointer_ to finish before freeing _svm_pointer_.
 It is the responsibility of the application to make sure that enqueued
 commands that use _svm_pointer_ have finished before freeing _svm_pointer_.
 This can be done by calling a blocking operation such as {clFinish},
 {clWaitForEvents}, or a blocking {clEnqueueReadBuffer}, or by registering a
 callback with the events associated with the enqueued commands and freeing
 _svm_pointer_ once the last enqueued command has finished.

 The behavior of using _svm_pointer_ after it has been freed is undefined.
 In addition, if a buffer object is created using {clCreateBuffer} or
 {clCreateBufferWithProperties} with _svm_pointer_, the buffer object must
 first be released before the _svm_pointer_ is freed.
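
 The callback-based approach mentioned above amounts to counting outstanding
 commands and releasing the allocation when the count reaches zero.
 The following sketch models that pattern with a C11 atomic counter.
 `deferred_free`, `deferred_free_init`, and `on_command_complete` are
 hypothetical names; in an application `on_command_complete` would be invoked
 from an event callback (for example one registered with
 `clSetEventCallback`) and the `release` function would wrap {clSVMFree}.
 Here the standard `free` function stands in for it.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical bookkeeping for one allocation shared by several commands. */
typedef struct {
    atomic_uint pending;       /* commands that may still use the memory */
    void *allocation;          /* pointer to release once pending hits 0 */
    void (*release)(void *);   /* stand-in for a wrapper around clSVMFree */
} deferred_free;

static void deferred_free_init(deferred_free *d, void *allocation,
                               unsigned int commands,
                               void (*release)(void *))
{
    atomic_init(&d->pending, commands);
    d->allocation = allocation;
    d->release = release;
}

/* Called once per finished command; the last caller frees the memory. */
static void on_command_complete(deferred_free *d)
{
    if (atomic_fetch_sub(&d->pending, 1u) == 1u) {
        d->release(d->allocation);
        d->allocation = NULL;
    }
}
```

 The atomic decrement ensures that exactly one completion notification
 performs the release, even if notifications arrive on different threads.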

 The {clEnqueueSVMFree} API can also be used to enqueue a callback to free
 the shared virtual memory buffer allocated using {clSVMAlloc} or a shared
 system memory pointer.
 --

 [open,refpage='clEnqueueSVMFree',desc='Enqueues a command to free shared virtual memory allocated using clSVMAlloc or a shared system memory pointer.',type='protos']
 --
 To enqueue a command to free the shared virtual memory allocated using
 {clSVMAlloc} or a shared system memory pointer, call the function

 include::{generated}/api/protos/clEnqueueSVMFree.txt[]
 include::{generated}/api/version-notes/clEnqueueSVMFree.asciidoc[]

   * _command_queue_ is a valid host command-queue.
   * _svm_pointers_ and _num_svm_pointers_ specify shared virtual memory pointers
     to be freed.
     Each pointer in _svm_pointers_ that was allocated using {clSVMAlloc} must
     have been allocated from the same context from which _command_queue_ was
     created.
     The memory associated with _svm_pointers_ can be reused or freed after the
     function returns.
   * _pfn_free_func_ specifies the callback function to be called to free the SVM
     pointers.
     _pfn_free_func_ takes four arguments: _queue_ which is the command queue in
     which {clEnqueueSVMFree} was enqueued, the count and list of SVM pointers to
     free and _user_data_ which is a pointer to user specified data.
     If _pfn_free_func_ is `NULL`, all pointers specified in _svm_pointers_ must
     be allocated using {clSVMAlloc} and the OpenCL implementation will free
     these SVM pointers.
     _pfn_free_func_ must be a valid callback function if any SVM pointer to be
     freed is a shared system memory pointer i.e. not allocated using
     {clSVMAlloc}.
     If _pfn_free_func_ is a valid callback function, the OpenCL implementation
     will call _pfn_free_func_ to free all the SVM pointers specified in
     _svm_pointers_.
   * _user_data_ will be passed as the _user_data_ argument when _pfn_free_func_
     is called.
     _user_data_ can be `NULL`.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before {clEnqueueSVMFree} can be executed.
     If _event_wait_list_ is `NULL`, then {clEnqueueSVMFree} does not wait on any
     event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this command and
     can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.

 // refError

 {clEnqueueSVMFree} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
     not support SVM.
   * {CL_INVALID_VALUE} if _num_svm_pointers_ is 0 and _svm_pointers_ is
     non-`NULL`, _or_ if _svm_pointers_ is `NULL` and _num_svm_pointers_ is
     not 0.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
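
The pointer/count pairing rules above (for _svm_pointers_ with
_num_svm_pointers_, and for _event_wait_list_ with _num_events_in_wait_list_)
can be sketched as host-side pre-checks.
This is a minimal illustration only: the helper names are hypothetical and are
not part of the OpenCL API, and the checks say nothing about whether the
events or pointers themselves are valid.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL function): true when the
 * (svm_pointers, num_svm_pointers) pair is consistent, i.e. both are
 * "empty" or both are "non-empty". */
static bool svm_pointer_args_consistent(void *const *svm_pointers,
                                        unsigned int num_svm_pointers)
{
    if (svm_pointers == NULL)
        return num_svm_pointers == 0;
    return num_svm_pointers > 0;
}

/* Hypothetical helper (not an OpenCL function): true when a NULL wait list
 * comes with a count of 0 and a non-NULL wait list comes with a count
 * greater than 0. Event validity and context matching are not checked. */
static bool wait_list_args_consistent(const void *event_wait_list,
                                      unsigned int num_events_in_wait_list)
{
    if (event_wait_list == NULL)
        return num_events_in_wait_list == 0;
    return num_events_in_wait_list > 0;
}
```

An application could run such checks before calling {clEnqueueSVMFree} to
catch argument mistakes early; the implementation remains the authority on
which error code is actually returned.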

 [open,refpage='clEnqueueSVMMemcpy',desc='Enqueues a command to do a memcpy operation.',type='protos']
 --
 To enqueue a command to do a memcpy operation, call the function

 include::{generated}/api/protos/clEnqueueSVMMemcpy.txt[]
 include::{generated}/api/version-notes/clEnqueueSVMMemcpy.asciidoc[]

   * _command_queue_ refers to the host command-queue in which the copy command
     will be queued.
     If either _dst_ptr_ or _src_ptr_ is allocated using {clSVMAlloc} then it
     must be allocated from the same context from which _command_queue_ was
     created.
   * _blocking_copy_ indicates if the copy operation is _blocking_ or
     _non-blocking_.
   * If _blocking_copy_ is {CL_TRUE} i.e. the copy command is blocking,
     {clEnqueueSVMMemcpy} does not return until the buffer data has been copied
     into memory pointed to by _dst_ptr_.
   * _size_ is the size in bytes of data being copied.
   * _dst_ptr_ is the pointer to a host or SVM memory allocation where data is
     copied to.
   * _src_ptr_ is the pointer to a host or SVM memory allocation where data is
     copied from.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this copy command
     and can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.

 If _blocking_copy_ is {CL_FALSE} i.e. the copy command is non-blocking,
 {clEnqueueSVMMemcpy} queues a non-blocking copy command and returns.
 The contents of the buffer that _dst_ptr_ points to cannot be used until the
 copy command has completed.
 The _event_ argument returns an event object which can be used to query the
 execution status of the copy command.
 When the copy command has completed, the contents of the buffer that
 _dst_ptr_ points to can be used by the application.

 If the memory allocation(s) containing _dst_ptr_ and/or _src_ptr_ are
 allocated using {clSVMAlloc} and either is not allocated from the same
 context from which _command_queue_ was created the behavior is undefined.

 // refError

 {clEnqueueSVMMemcpy} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
     not support SVM.
   * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
     events in _event_wait_list_ are not the same.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the copy operation is
     blocking and the execution status of any of the events in
     _event_wait_list_ is a negative integer value.
   * {CL_INVALID_VALUE} if _dst_ptr_ or _src_ptr_ is `NULL`.
   * {CL_MEM_COPY_OVERLAP} if the values specified for _dst_ptr_, _src_ptr_ and
     _size_ result in an overlapping copy.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
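
The {CL_MEM_COPY_OVERLAP} condition can be approximated on the host before
enqueuing a copy.
The sketch below assumes that "overlapping" means the byte ranges starting at
_dst_ptr_ and _src_ptr_, each _size_ bytes long, share at least one byte, and
that a zero-sized copy never overlaps; both are assumptions of this example
rather than statements from the specification.
The function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an OpenCL function): true when the half-open
 * byte ranges [dst, dst + size) and [src, src + size) intersect. */
static bool svm_copy_overlaps(const void *dst_ptr, const void *src_ptr,
                              size_t size)
{
    uintptr_t dst = (uintptr_t)dst_ptr;
    uintptr_t src = (uintptr_t)src_ptr;

    if (size == 0)
        return false; /* assumption: an empty copy cannot overlap */

    /* Compare the distance between the start addresses with the length;
     * this avoids computing dst + size, which could wrap around. */
    return (dst < src) ? (src - dst < size) : (dst - src < size);
}
```

When such a check reports an overlap, the application would typically stage
the data through a temporary buffer instead of issuing a single copy.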

 [open,refpage='clEnqueueSVMMemFill',desc='Enqueues a command to fill a region in memory with a pattern of a given pattern size.',type='protos']
 --
 To enqueue a command to fill a region in memory with a pattern of a given
 pattern size, call the function

 include::{generated}/api/protos/clEnqueueSVMMemFill.txt[]
 include::{generated}/api/version-notes/clEnqueueSVMMemFill.asciidoc[]

   * _command_queue_ refers to the host command-queue in which the fill command
     will be queued.
     The OpenCL context associated with _command_queue_ and SVM pointer referred
     to by _svm_ptr_ must be the same.
   * _svm_ptr_ is a pointer to a memory region that will be filled with
     _pattern_.
     It must be aligned to _pattern_size_ bytes.
     If _svm_ptr_ is allocated using {clSVMAlloc} then it must be allocated from
     the same context from which _command_queue_ was created.
     Otherwise the behavior is undefined.
   * _pattern_ is a pointer to the data pattern of size _pattern_size_ in bytes.
     _pattern_ will be used to fill a region starting at _svm_ptr_ that is
     _size_ bytes in size.
     The data pattern must be a scalar or vector integer or floating-point data
     type supported by OpenCL as described in <<scalar-data-types, Shared
     Application Scalar Data Types>> and <<vector-data-types, Supported
     Application Vector Data Types>>.
     For example, if region pointed to by _svm_ptr_ is to be filled with a
     pattern of float4 values, then _pattern_ will be a pointer to a cl_float4
     value and _pattern_size_ will be `sizeof(cl_float4)`.
     The maximum value of _pattern_size_ is the size of the largest integer or
     floating-point vector data type supported by the OpenCL device.
     The memory associated with _pattern_ can be reused or freed after the
     function returns.
   * _size_ is the size in bytes of region being filled starting with _svm_ptr_
     and must be a multiple of _pattern_size_.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this command and
     can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.

 // refError

 {clEnqueueSVMMemFill} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
     not support SVM.
   * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
     events in _event_wait_list_ are not the same.
   * {CL_INVALID_VALUE} if _svm_ptr_ is `NULL`.
   * {CL_INVALID_VALUE} if _svm_ptr_ is not aligned to _pattern_size_ bytes.
   * {CL_INVALID_VALUE} if _pattern_ is `NULL` or if _pattern_size_ is 0 or if
     _pattern_size_ is not one of {1, 2, 4, 8, 16, 32, 64, 128}.
   * {CL_INVALID_VALUE} if _size_ is not a multiple of _pattern_size_.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
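
The {CL_INVALID_VALUE} conditions listed for {clEnqueueSVMMemFill} can be
collected into one host-side check.
This is a sketch under the rules stated above (non-`NULL` pointers, a
_pattern_size_ that is a power of two between 1 and 128, _svm_ptr_ aligned to
_pattern_size_, and _size_ a multiple of _pattern_size_); the helper names are
hypothetical and not part of the OpenCL API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true for 1, 2, 4, 8, 16, 32, 64 and 128 only. */
static bool is_valid_pattern_size(size_t pattern_size)
{
    return pattern_size != 0 && pattern_size <= 128 &&
           (pattern_size & (pattern_size - 1)) == 0; /* power of two */
}

/* Hypothetical helper: mirrors the argument rules documented for the fill
 * command. It does not check that svm_ptr really is an SVM allocation. */
static bool svm_fill_args_valid(const void *svm_ptr, const void *pattern,
                                size_t pattern_size, size_t size)
{
    if (svm_ptr == NULL || pattern == NULL)
        return false;
    if (!is_valid_pattern_size(pattern_size))
        return false;
    if ((uintptr_t)svm_ptr % pattern_size != 0)
        return false; /* svm_ptr must be aligned to pattern_size bytes */
    return size % pattern_size == 0; /* whole number of patterns */
}
```

For a fill with a 16-byte pattern such as a `cl_float4`, the region must
therefore start on a 16-byte boundary and its size must be a multiple of 16.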

 [open,refpage='clEnqueueSVMMap',desc='Enqueues a command that will allow the host to update a region of a SVM buffer',type='protos']
 --
 To enqueue a command that will allow the host to update a region of a SVM
 buffer, call the function

 include::{generated}/api/protos/clEnqueueSVMMap.txt[]
 include::{generated}/api/version-notes/clEnqueueSVMMap.asciidoc[]

   * _command_queue_ must be a valid host command-queue.
   * _blocking_map_ indicates if the map operation is _blocking_ or
     _non-blocking_.
   * _map_flags_ is a bit-field and is described in the
     <<memory-map-flags-table,Memory Map Flags>> table.
   * _svm_ptr_ and _size_ are a pointer to a memory region and size in bytes that
     will be updated by the host.
     If _svm_ptr_ is allocated using {clSVMAlloc} then it must be allocated from
     the same context from which _command_queue_ was created.
     Otherwise the behavior is undefined.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this command and
     can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.

 If _blocking_map_ is {CL_TRUE}, {clEnqueueSVMMap} does not return until the
 application can access the contents of the SVM region specified by _svm_ptr_
 and _size_ on the host.

 If _blocking_map_ is {CL_FALSE} i.e. the map operation is non-blocking, the
 region specified by _svm_ptr_ and _size_ cannot be used until the map command
 has completed.
 The _event_ argument returns an event object which can be used to query the
 execution status of the map command.
 When the map command is completed, the application can access the contents
 of the region specified by _svm_ptr_ and _size_.

 Note that since we are enqueuing a command with a SVM buffer, the region is
 already mapped in the host address space.

 // refError

 {clEnqueueSVMMap} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
     not support SVM.
   * {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events
     in _event_wait_list_ are not the same.
   * {CL_INVALID_VALUE} if _svm_ptr_ is `NULL`.
   * {CL_INVALID_VALUE} if _size_ is 0 or if values specified in _map_flags_
     are not valid.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the map operation is
     blocking and the execution status of any of the events in
     _event_wait_list_ is a negative integer value.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clEnqueueSVMUnmap',desc='Enqueues a command to indicate that the host has completed updating the region given by an SVM pointer and which was specified in a previous call to clEnqueueSVMMap.',type='protos']
 --
 To enqueue a command to indicate that the host has completed updating the
 region given by _svm_ptr_ and which was specified in a previous call to
 {clEnqueueSVMMap}, call the function

 include::{generated}/api/protos/clEnqueueSVMUnmap.txt[]
 include::{generated}/api/version-notes/clEnqueueSVMUnmap.asciidoc[]

   * _command_queue_ must be a valid host command-queue.
   * _svm_ptr_ is a pointer that was specified in a previous call to
     {clEnqueueSVMMap}.
     If _svm_ptr_ is allocated using {clSVMAlloc} then it must be allocated from
     the same context from which _command_queue_ was created.
     Otherwise the behavior is undefined.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before {clEnqueueSVMUnmap} can be executed.
     If _event_wait_list_ is `NULL`, then {clEnqueueSVMUnmap} does not wait on any
     event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this command and
     can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.

 {clEnqueueSVMMap} and {clEnqueueSVMUnmap} act as synchronization points for
 the region of the SVM buffer specified in these calls.

 // refError

 {clEnqueueSVMUnmap} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
     not support SVM.
   * {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events
     in _event_wait_list_ are not the same.
   * {CL_INVALID_VALUE} if _svm_ptr_ is `NULL`.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or if _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.

 [NOTE]
 ====
 If a coarse-grained SVM buffer is currently mapped for writing, the
 application must ensure that the SVM buffer is unmapped before any enqueued
 kernels or commands that read from or write to this SVM buffer or any of its
 associated {cl_mem_TYPE} buffer objects begin execution; otherwise the behavior is
 undefined.

 If a coarse-grained SVM buffer is currently mapped for reading, the
 application must ensure that the SVM buffer is unmapped before any enqueued
 kernels or commands that write to this SVM buffer or any of its
 associated {cl_mem_TYPE} buffer objects begin execution; otherwise the behavior is
 undefined.

 A SVM buffer is considered as mapped if there are one or more active
 mappings for the SVM buffer irrespective of whether the mapped regions span
 the entire SVM buffer.

 The above note does not apply to fine-grained SVM buffers (fine-grained
 buffers allocated using {clSVMAlloc} or fine-grained system allocations).
 ====
 --

 [open,refpage='clEnqueueSVMMigrateMem',desc='Enqueues a command to indicate which device a set of ranges of SVM allocations should be associated with.',type='protos']
 --
 To enqueue a command to indicate which device a set of ranges of SVM
 allocations should be associated with, call the function

 include::{generated}/api/protos/clEnqueueSVMMigrateMem.txt[]
 include::{generated}/api/version-notes/clEnqueueSVMMigrateMem.asciidoc[]

   * _command_queue_ is a valid host command-queue.
     The specified set of allocation ranges will be migrated to the OpenCL device
     associated with _command_queue_.
   * _num_svm_pointers_ is the number of pointers in the specified _svm_pointers_
     array, and the number of sizes in the _sizes_ array, if _sizes_ is not
     `NULL`.
   * _svm_pointers_ is a pointer to an array of pointers.
     Each pointer in this array must be within an allocation produced by a call
     to {clSVMAlloc}.
   * _sizes_ is an array of sizes.
     The pair _svm_pointers_[i] and _sizes_[i] together define the starting
     address and number of bytes in a range to be migrated.
     _sizes_ may be `NULL` indicating that every allocation containing any
     _svm_pointers_[i] is to be migrated.
     Also, if _sizes_[i] is zero, then the entire allocation containing
     _svm_pointers_[i] is migrated.
   * _flags_ is a bit-field that is used to specify migration options.
     The <<migration-flags-table,Memory Migration Flags>> table describes the
     possible values for _flags_.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this command and
     can be used to query or queue a wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.

 Once the event returned by {clEnqueueSVMMigrateMem} has become {CL_COMPLETE},
 the ranges specified by _svm_pointers_ and _sizes_ have been successfully
 migrated to the device associated with _command_queue_.

 The user is responsible for managing the event dependencies associated with
 this command in order to avoid overlapping access to SVM allocations.
 Improperly specified event dependencies passed to {clEnqueueSVMMigrateMem}
 could result in undefined results.

 // refError

 {clEnqueueSVMMigrateMem} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
     not support SVM.
   * {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events
     in _event_wait_list_ are not the same.
   * {CL_INVALID_VALUE} if _num_svm_pointers_ is zero or _svm_pointers_ is
     `NULL`.
   * {CL_INVALID_VALUE} if _sizes_[i] is non-zero and the range
     [_svm_pointers_[i], _svm_pointers_[i]+_sizes_[i]) is not contained within
     an existing {clSVMAlloc} allocation.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or if _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
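
The range rule for {clEnqueueSVMMigrateMem} can be sketched as a containment
test against one known allocation.
The example assumes the application tracks the base address and size of each
{clSVMAlloc} allocation itself; the function name and its parameters are
hypothetical and not part of the OpenCL API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when svm_pointer lies inside the allocation
 * [alloc_base, alloc_base + alloc_size) and, for a non-zero size, the range
 * [svm_pointer, svm_pointer + size) stays inside that allocation.
 * A size of 0 selects the entire containing allocation. */
static bool migrate_range_in_allocation(const void *alloc_base,
                                        size_t alloc_size,
                                        const void *svm_pointer,
                                        size_t size)
{
    uintptr_t base = (uintptr_t)alloc_base;
    uintptr_t ptr = (uintptr_t)svm_pointer;

    /* The starting pointer must fall within the allocation. */
    if (ptr < base || ptr - base >= alloc_size)
        return false;

    if (size == 0)
        return true; /* whole allocation is migrated */

    /* Remaining bytes from ptr to the end of the allocation. */
    return size <= alloc_size - (ptr - base);
}
```

A caller would apply this test to every pair of _svm_pointers_[i] and
_sizes_[i] before enqueuing the migration.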


 === Memory consistency for SVM allocations

 To ensure memory consistency in SVM allocations, the program can rely on the
 guaranteed memory consistency at synchronization points.
 This consistency support already exists in OpenCL 1.x and can be used for
 coarse-grained SVM allocations or for fine-grained buffer SVM allocations;
 what SVM adds is the ability to share pointers between the host and all SVM
 devices.

 In addition, sub-buffers can also be used to ensure that each device gets a
 consistent view of a SVM buffer's memory when it is shared by multiple
 devices.
 For example, assume that two devices share a SVM pointer.
 The host can create a {cl_mem_TYPE} buffer object using {clCreateBuffer} or
 {clCreateBufferWithProperties} with {CL_MEM_USE_HOST_PTR} and _host_ptr_ set
 to the SVM pointer and then create two disjoint sub-buffers with starting
 virtual addresses _sb1_ptr_ and _sb2_ptr_.
 These pointers (_sb1_ptr_ and _sb2_ptr_) can be passed to kernels executing
 on the two devices.
 {clEnqueueMapBuffer} and {clEnqueueUnmapMemObject} and the existing
 <<accessing-mapped-regions, access rules for memory objects>> ensure
 consistency for buffer regions (_sb1_ptr_ and _sb2_ptr_) read and written by
 these kernels.

 When the host and devices are able to use SVM atomic operations (i.e.
 {CL_DEVICE_SVM_ATOMICS} is set in {CL_DEVICE_SVM_CAPABILITIES}), these atomic
 operations can be used to provide memory consistency at a fine grain in a
 shared memory region.
 The effect of these operations is visible to the host and all devices with
 which that memory is shared.


 == Sampler Objects

 A sampler object describes how to sample an image when the image is read in
 the kernel.
 The built-in functions to read from an image in a kernel take a sampler as
 an argument.
 The sampler arguments to the image read function can be sampler objects
 created using OpenCL functions and passed as argument values to the kernel
 or can be samplers declared inside a kernel.
 In this section we discuss how sampler objects are created using OpenCL
 functions.


 === Creating Sampler Objects

 [open,refpage='clCreateSamplerWithProperties',desc='Creates a sampler object.',type='protos']
 --
 To create a sampler object, call the function

 include::{generated}/api/protos/clCreateSamplerWithProperties.txt[]
 include::{generated}/api/version-notes/clCreateSamplerWithProperties.asciidoc[]

   * _context_ must be a valid OpenCL context.
   * _sampler_properties_ specifies a list of sampler property names and their
     corresponding values.
     Each sampler property name is immediately followed by the corresponding
     desired value.
     The list is terminated with 0.
     The list of supported properties is described in the
     <<sampler-properties-table,Sampler Properties>> table.
     If a supported property and its value are not specified in
     _sampler_properties_, its default value will be used.
     _sampler_properties_ can be `NULL` in which case the default values for
     supported sampler properties will be used.

 [[sampler-properties-table]]
 .List of supported sampler creation properties by <<clCreateSamplerWithProperties>>
 [width="100%",cols="<33%,<17%,<50%",options="header"]
 |====
 | Sampler Property | Property Value | Description
 | {CL_SAMPLER_NORMALIZED_COORDS_anchor}

 include::{generated}/api/version-notes/CL_SAMPLER_NORMALIZED_COORDS.asciidoc[]
   | {cl_bool_TYPE}
       | A boolean value that specifies whether the image coordinates
         specified are normalized or not.

         The default value (i.e. the value used if this property is not
         specified in sampler_properties) is {CL_TRUE}.
 | {CL_SAMPLER_ADDRESSING_MODE_anchor}

 include::{generated}/api/version-notes/CL_SAMPLER_ADDRESSING_MODE.asciidoc[]
   | {cl_addressing_mode_TYPE}
       | Specifies how out-of-range image coordinates are handled when
         reading from an image.
         Valid values are:

         {CL_ADDRESS_NONE_anchor} - Behavior is undefined for out-of-range
         image coordinates.

         {CL_ADDRESS_CLAMP_TO_EDGE_anchor} - Out-of-range image coordinates
         are clamped to the edge of the image.

         {CL_ADDRESS_CLAMP_anchor} - Out-of-range image coordinates are
         assigned a border color value.

         {CL_ADDRESS_REPEAT_anchor} - Out-of-range image coordinates read
         from the image as-if the image data were replicated in all dimensions.

         {CL_ADDRESS_MIRRORED_REPEAT_anchor} - Out-of-range image coordinates
         read from the image as-if the image data were replicated in all
         dimensions, mirroring the image contents at the edge of each
         replication.

         The default is {CL_ADDRESS_CLAMP}.
 | {CL_SAMPLER_FILTER_MODE_anchor}

 include::{generated}/api/version-notes/CL_SAMPLER_FILTER_MODE.asciidoc[]
   | {cl_filter_mode_TYPE}
       | Specifies the type of filter that is applied when reading an
         image.
         Valid values are:

         {CL_FILTER_NEAREST_anchor} - Returns the image element nearest
         to the image coordinate.

         {CL_FILTER_LINEAR_anchor} - Returns a weighted average of the
         four image elements nearest to the image coordinate.

         The default value is {CL_FILTER_NEAREST}.
 |====
   * _errcode_ret_ will return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.

 // refError

 {clCreateSamplerWithProperties} returns a valid non-zero sampler object and
 _errcode_ret_ is set to {CL_SUCCESS} if the sampler object is created
 successfully.
 Otherwise, it returns a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_VALUE} if the property name in _sampler_properties_ is not a
     supported property name, if the value specified for a supported property
     name is not valid, or if the same property name is specified more than
     once.
   * {CL_INVALID_OPERATION} if images are not supported by any device
     associated with _context_ (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
     <<device-queries-table,Device Queries>> table is {CL_FALSE}).
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
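
The property-list convention described above (name/value pairs terminated by
0, defaults for omitted properties, and rejection of unknown or repeated
names) can be modelled without the OpenCL headers.
In this sketch every identifier is a stand-in chosen for illustration: the
`PROP_*`, `ADDR_*` and `FILTER_*` constants are not the real enum values, and
`uint64_t` is only assumed to be wide enough to play the role of the property
type.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in names and values for illustration only; real code would use the
 * CL_SAMPLER_*, CL_ADDRESS_* and CL_FILTER_* enums from the OpenCL headers. */
enum { PROP_NORMALIZED_COORDS = 1, PROP_ADDRESSING_MODE, PROP_FILTER_MODE };
enum { ADDR_NONE = 10, ADDR_CLAMP_TO_EDGE, ADDR_CLAMP, ADDR_REPEAT,
       ADDR_MIRRORED_REPEAT };
enum { FILTER_NEAREST = 20, FILTER_LINEAR };

typedef struct {
    uint64_t normalized_coords;
    uint64_t addressing_mode;
    uint64_t filter_mode;
} sampler_config;

/* Hypothetical helper: fills *out from a 0-terminated name/value list.
 * Returns false for an unknown name, an invalid value, or a repeated name;
 * in that case *out may be partially updated and should be ignored. */
static bool resolve_sampler_properties(const uint64_t *props,
                                       sampler_config *out)
{
    bool seen[3] = { false, false, false };

    /* Defaults documented in the sampler properties table. */
    out->normalized_coords = 1;          /* stands in for CL_TRUE */
    out->addressing_mode = ADDR_CLAMP;
    out->filter_mode = FILTER_NEAREST;

    if (props == NULL)
        return true;                     /* NULL list: all defaults */

    for (size_t i = 0; props[i] != 0; i += 2) {
        uint64_t name = props[i];
        uint64_t value = props[i + 1];

        if (name < PROP_NORMALIZED_COORDS || name > PROP_FILTER_MODE)
            return false;                /* unsupported property name */
        if (seen[name - 1])
            return false;                /* same name given twice */
        seen[name - 1] = true;

        switch (name) {
        case PROP_NORMALIZED_COORDS:
            if (value > 1)
                return false;
            out->normalized_coords = value;
            break;
        case PROP_ADDRESSING_MODE:
            if (value < ADDR_NONE || value > ADDR_MIRRORED_REPEAT)
                return false;
            out->addressing_mode = value;
            break;
        default: /* PROP_FILTER_MODE */
            if (value != FILTER_NEAREST && value != FILTER_LINEAR)
                return false;
            out->filter_mode = value;
            break;
        }
    }
    return true;
}
```

A list that sets only the filter mode leaves the other two properties at
their defaults, which matches the behavior described for
_sampler_properties_.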

 [open,refpage='clCreateSampler',desc='Creates a sampler object.',type='protos']
 --
 To create a sampler object, call the function

 include::{generated}/api/protos/clCreateSampler.txt[]
 include::{generated}/api/version-notes/clCreateSampler.asciidoc[]

   * _context_ must be a valid OpenCL context.
   * _normalized_coords_ has the same interpretation as
     {CL_SAMPLER_NORMALIZED_COORDS} in the <<sampler-properties-table, sampler
     creation properties table>>.
   * _addressing_mode_ has the same interpretation as
     {CL_SAMPLER_ADDRESSING_MODE} in the <<sampler-properties-table, sampler
     creation properties table>>.
   * _filter_mode_ has the same interpretation as
     {CL_SAMPLER_FILTER_MODE} in the <<sampler-properties-table, sampler
     creation properties table>>.
   * _errcode_ret_ will return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.

 // refError

 {clCreateSampler} returns a valid non-zero sampler object and _errcode_ret_ is
 set to {CL_SUCCESS} if the sampler object is created successfully.
 Otherwise, it returns a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_VALUE} if _addressing_mode_, _filter_mode_, _normalized_coords_
     or a combination of these arguments is not valid.
   * {CL_INVALID_OPERATION} if images are not supported by any device
     associated with _context_ (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
     <<device-queries-table,Device Queries>> table is {CL_FALSE}).
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clRetainSampler',desc='Increments the sampler reference count.',type='protos']
 --
 To retain a sampler object, call the function

 include::{generated}/api/protos/clRetainSampler.txt[]
 include::{generated}/api/version-notes/clRetainSampler.asciidoc[]

   * _sampler_ specifies the sampler to be retained.

 The _sampler_ reference count is incremented.
 {clCreateSamplerWithProperties} and {clCreateSampler} perform an implicit
 retain.

 // refError

 {clRetainSampler} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_SAMPLER} if _sampler_ is not a valid sampler object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clReleaseSampler',desc='Decrements the sampler reference count.',type='protos']
 --
 To release a sampler object, call the function

 include::{generated}/api/protos/clReleaseSampler.txt[]
 include::{generated}/api/version-notes/clReleaseSampler.asciidoc[]

   * _sampler_ specifies the sampler to be released.

 The _sampler_ reference count is decremented.
 The sampler object is deleted after the reference count becomes zero and
 all commands that use _sampler_, queued for execution on any command-queue,
 have finished.

 // refError

 {clReleaseSampler} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_SAMPLER} if _sampler_ is not a valid sampler object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.

 Using this function to release a reference that was not obtained by creating
 the object or by calling {clRetainSampler} causes undefined behavior.
 --
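
The retain and release rules for sampler objects can be pictured with a small
host-side model.
This is only a sketch under stated assumptions: `sketch_sampler`,
`sketch_create`, `sketch_retain`, `sketch_release` and
`sketch_command_finished` are hypothetical names, not OpenCL entry points, and
a real implementation keeps this state internally in whatever form it chooses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state an implementation might keep
 * for one sampler object. */
typedef struct {
    unsigned ref_count;        /* application-visible reference count */
    unsigned pending_commands; /* queued commands that still use the sampler */
    bool     deleted;          /* set once the object has been destroyed */
} sketch_sampler;

/* Delete only when the count is zero and no queued command uses the object. */
static void sketch_maybe_delete(sketch_sampler *s)
{
    if (s->ref_count == 0 && s->pending_commands == 0)
        s->deleted = true;
}

/* Creation performs an implicit retain, so the count starts at 1. */
static sketch_sampler sketch_create(void)
{
    sketch_sampler s = { 1u, 0u, false };
    return s;
}

static void sketch_retain(sketch_sampler *s)
{
    s->ref_count++;
}

static void sketch_release(sketch_sampler *s)
{
    /* Releasing a reference that was never obtained is undefined behavior
     * in the real API; the model simply refuses to go below zero. */
    assert(s->ref_count > 0);
    s->ref_count--;
    sketch_maybe_delete(s);
}

/* Called by the model when one queued command that used the sampler ends. */
static void sketch_command_finished(sketch_sampler *s)
{
    assert(s->pending_commands > 0);
    s->pending_commands--;
    sketch_maybe_delete(s);
}
```

In this model a sampler that is released while a command still uses it stays
alive until `sketch_command_finished` brings the pending count to zero.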


 === Sampler Object Queries

 [open,refpage='clGetSamplerInfo',desc='Returns information about the sampler object.',type='protos']
 --
 To return information about a sampler object, call the function

 include::{generated}/api/protos/clGetSamplerInfo.txt[]
 include::{generated}/api/version-notes/clGetSamplerInfo.asciidoc[]

   * _sampler_ specifies the sampler being queried.
   * _param_name_ specifies the information to query.
     The list of supported _param_name_ types and the information returned in
     _param_value_ by {clGetSamplerInfo} is described in the
     <<sampler-info-table,Sampler Object Queries>> table.
   * _param_value_ is a pointer to memory where the appropriate result being
     queried is returned.
     If _param_value_ is `NULL`, it is ignored.
   * _param_value_size_ is used to specify the size in bytes of memory pointed to
     by _param_value_.
     This size must be {geq} size of return type as described in the
     <<sampler-info-table,Sampler Object Queries>> table.
   * _param_value_size_ret_ returns the actual size in bytes of data being
     queried by _param_name_.
     If _param_value_size_ret_ is `NULL`, it is ignored.

 [[sampler-info-table]]
 .List of supported param_names by <<clGetSamplerInfo>>
 [width="100%",cols="<33%,<17%,<50%",options="header"]
 |====
 | Sampler Info | Return Type | Description
 | {CL_SAMPLER_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}]

 include::{generated}/api/version-notes/CL_SAMPLER_REFERENCE_COUNT.asciidoc[]
   | {cl_uint_TYPE}
       | Return the _sampler_ reference count.
 | {CL_SAMPLER_CONTEXT_anchor}

 include::{generated}/api/version-notes/CL_SAMPLER_CONTEXT.asciidoc[]
   | {cl_context_TYPE}
       | Return the context specified when the sampler is created.
 | {CL_SAMPLER_NORMALIZED_COORDS_anchor}

 include::{generated}/api/version-notes/CL_SAMPLER_NORMALIZED_COORDS.asciidoc[]
   | {cl_bool_TYPE}
       | Return the normalized coords value associated with _sampler_.
 | {CL_SAMPLER_ADDRESSING_MODE_anchor}

 include::{generated}/api/version-notes/CL_SAMPLER_ADDRESSING_MODE.asciidoc[]
   | {cl_addressing_mode_TYPE}
       | Return the addressing mode value associated with _sampler_.
 | {CL_SAMPLER_FILTER_MODE_anchor}

 include::{generated}/api/version-notes/CL_SAMPLER_FILTER_MODE.asciidoc[]
   | {cl_filter_mode_TYPE}
       | Return the filter mode value associated with _sampler_.

 | {CL_SAMPLER_PROPERTIES_anchor}

 include::{generated}/api/version-notes/CL_SAMPLER_PROPERTIES.asciidoc[]
   | {cl_sampler_properties_TYPE}[]
       | Return the properties argument specified in
         {clCreateSamplerWithProperties}.

         If the _properties_ argument specified in {clCreateSamplerWithProperties}
         used to create _sampler_ was not `NULL`, the implementation must return
         the values specified in the properties argument in the same order and
         without including additional properties.

         If _sampler_ was created using {clCreateSampler}, or if the _properties_
         argument specified in {clCreateSamplerWithProperties} was `NULL`, the
         implementation must return _param_value_size_ret_ equal to 0,
         indicating that there are no properties to be returned.
 |====

 // refError

 {clGetSamplerInfo} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
     specified by _param_value_size_ is < size of return type as described in
     the <<sampler-info-table,Sampler Object Queries>> table and
     _param_value_ is not `NULL`.
   * {CL_INVALID_SAMPLER} if _sampler_ is not a valid sampler object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
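
The size rules for _param_value_, _param_value_size_ and
_param_value_size_ret_ can be summarized in a few lines of C.
The helper below is a sketch, not part of OpenCL: `sketch_return_info` and the
`INFO_*` codes are hypothetical, and the real function also validates
_sampler_ and _param_name_.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes standing in for CL_SUCCESS / CL_INVALID_VALUE. */
enum { INFO_OK = 0, INFO_INVALID_VALUE = -1 };

/* Returns a query result of result_size bytes following the size rules
 * described for the three output-related parameters. */
static int sketch_return_info(const void *result, size_t result_size,
                              size_t param_value_size, void *param_value,
                              size_t *param_value_size_ret)
{
    assert(result != NULL || result_size == 0);

    if (param_value != NULL) {
        if (param_value_size < result_size)
            return INFO_INVALID_VALUE;      /* caller's buffer is too small */
        if (result_size > 0)
            memcpy(param_value, result, result_size);
    }
    /* A NULL size pointer is ignored, like a NULL value pointer. */
    if (param_value_size_ret != NULL)
        *param_value_size_ret = result_size;
    return INFO_OK;
}
```

A caller that does not know the result size in advance would typically make
one call with a `NULL` value pointer to learn the size, then a second call
with a buffer of that size.
A reported size of 0 corresponds to the "no properties to be returned" case
described for {CL_SAMPLER_PROPERTIES}.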


 == Program Objects

 An OpenCL program consists of a set of kernels that are identified as
 functions declared with the `+__kernel+` qualifier in the program source.
 OpenCL programs may also contain auxiliary functions and constant data that
 can be used by kernel functions.
 The program executable can be generated _online_ or _offline_ by the OpenCL
 compiler for the appropriate target device(s).

 A program object encapsulates the following information:

   * An associated context.
   * A program source or binary.
   * The latest successfully built program executable, library or compiled
     binary, the list of devices for which the program executable, library or
     compiled binary is built, the build options used and a build log.
   * The number of kernel objects currently attached.


 === Creating Program Objects

 [open,refpage='clCreateProgramWithSource',desc='Creates a program object for a context, and loads source code specified by text strings into the program object.',type='protos']
 --
 To create a program object for a context and load source code into that
 object, call the function

 include::{generated}/api/protos/clCreateProgramWithSource.txt[]
 include::{generated}/api/version-notes/clCreateProgramWithSource.asciidoc[]

   * _context_ must be a valid OpenCL context.
   * _strings_ is an array of _count_ pointers to optionally null-terminated
     character strings that make up the source code.
   * _lengths_ is an array with the number of chars in each string
     (the string length).
     If an element in _lengths_ is zero, its accompanying string is
     null-terminated.
     If _lengths_ is `NULL`, all strings in the _strings_ argument are considered
     null-terminated.
     Any length value passed in that is greater than zero excludes the null
     terminator in its count.
   * _errcode_ret_ will return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.

 The source code specified by _strings_ will be loaded into the program
 object.

 The devices associated with the program object are the devices associated
 with _context_.
 The source code specified by _strings_ is either an OpenCL C program source,
 header or implementation-defined source for custom devices that support an
 online compiler.
 OpenCL {cpp} is not supported as an online-compiled kernel language through
 this interface.

 // refError

 {clCreateProgramWithSource} returns a valid non-zero program object and
 _errcode_ret_ is set to {CL_SUCCESS} if the program object is created
 successfully.
 Otherwise, it returns a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_VALUE} if _count_ is zero or if _strings_ or any entry in
     _strings_ is `NULL`.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
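
The interaction between _strings_ and _lengths_ can be illustrated with a
small helper that computes how many source characters each entry contributes.
This is a sketch only: `sketch_source_length` and
`sketch_total_source_length` are hypothetical names, and how an
implementation actually stores the source is not specified.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of source characters contributed by strings[i], excluding any
 * null terminator. */
static size_t sketch_source_length(const char **strings,
                                   const size_t *lengths, size_t i)
{
    assert(strings != NULL && strings[i] != NULL);
    if (lengths == NULL || lengths[i] == 0)
        return strlen(strings[i]);   /* entry is treated as null-terminated */
    return lengths[i];               /* explicit length, terminator excluded */
}

/* Total number of source characters across all count entries. */
static size_t sketch_total_source_length(const char **strings,
                                         const size_t *lengths, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; ++i)
        total += sketch_source_length(strings, lengths, i);
    return total;
}
```

With an explicit non-zero length, only that many characters of the entry are
used, so a buffer need not be null-terminated and may contain trailing data.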

 [open,refpage='clCreateProgramWithIL',desc='Creates a program object for a context, and loads the IL into the program object.',type='protos']
 --
 To create a program object for a context and load code in an intermediate
 language into that object, call the function

 include::{generated}/api/protos/clCreateProgramWithIL.txt[]
 include::{generated}/api/version-notes/clCreateProgramWithIL.asciidoc[]
 Also see extension *cl_khr_il_program*.

   * _context_ must be a valid OpenCL context.
   * _il_ is a pointer to a _length_-byte block of memory containing SPIR-V or an
     implementation-defined intermediate language.
   * _errcode_ret_ will return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.

 The intermediate language pointed to by _il_ and with length in bytes _length_
 will be loaded into the program object.
 The devices associated with the program object are the devices associated
 with _context_.

 // refError

 {clCreateProgramWithIL} returns a valid non-zero program object and
 _errcode_ret_ is set to {CL_SUCCESS} if the program object is created
 successfully.
 Otherwise, it returns a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_OPERATION} if no devices in _context_ support intermediate
     language programs.
   * {CL_INVALID_VALUE} if _il_ is `NULL` or if _length_ is zero.
   * {CL_INVALID_VALUE} if the _length_-byte memory pointed to by _il_ does not
     contain well-formed intermediate language input that can be consumed by
     the OpenCL runtime.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clCreateProgramWithBinary',desc='Creates a program object for a context, and loads binary bits into the program object.',type='protos']
 --
 To create a program object for a context and load binary bits into that
 object, call the function

 include::{generated}/api/protos/clCreateProgramWithBinary.txt[]
 include::{generated}/api/version-notes/clCreateProgramWithBinary.asciidoc[]

   * _context_ must be a valid OpenCL context.
   * _device_list_ is a pointer to a list of devices that are in _context_.
     _device_list_ must be a non-`NULL` value.
     The binaries are loaded for devices specified in this list.
   * _num_devices_ is the number of devices listed in _device_list_.
   * _lengths_ is an array of the size in bytes of the program binaries to be
     loaded for devices specified by _device_list_.
   * _binaries_ is an array of pointers to program binaries to be loaded for
     devices specified by _device_list_.
     For each device given by _device_list_[i], the pointer to the program binary
     for that device is given by _binaries_[i] and the length of this
     corresponding binary is given by _lengths_[i].
     _lengths_[i] cannot be zero and _binaries_[i] cannot be a `NULL` pointer.

 The devices associated with the program object will be the list of devices
 specified by _device_list_.
 The list of devices specified by _device_list_ must be devices associated
 with _context_.

 The program binaries specified by _binaries_ will be loaded into the program
 object.
 They contain bits that describe one of the following:

   * a program executable to be run on the device(s) associated with
     _context_,
   * a compiled program for device(s) associated with _context_, or
   * a library of compiled programs for device(s) associated with _context_.

 The program binary can consist of either or both of the following:

   * Device-specific code.
   * Implementation-specific intermediate representation (IR) which will be
     converted to the device-specific code.

   * _binary_status_ returns whether the program binary for each device specified
     in _device_list_ was loaded successfully or not.
     It is an array of _num_devices_ entries and returns {CL_SUCCESS} in
     _binary_status_[i] if binary was successfully loaded for device specified by
     _device_list_[i]; otherwise returns {CL_INVALID_VALUE} if _lengths_[i] is zero
     or if _binaries_[i] is a `NULL` value or {CL_INVALID_BINARY} in
     _binary_status_[i] if program binary is not a valid binary for the specified
     device.
     If _binary_status_ is `NULL`, it is ignored.
   * _errcode_ret_ will return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.

 OpenCL allows applications to create a program object using the program
 source or binary and build appropriate program executables.
 This can be very useful as it allows an application to load program source
 and then compile and link it online, the first time the application runs, to
 generate program executables for the appropriate OpenCL devices in the system.
 These executables can then be queried and cached by the application.
 On later runs the application can read and load the cached executables, which
 can significantly reduce its initialization time.

 // refError

 {clCreateProgramWithBinary} returns a valid non-zero program object and
 _errcode_ret_ is set to {CL_SUCCESS} if the program object is created
 successfully.
 Otherwise, it returns a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_VALUE} if _device_list_ is `NULL` or _num_devices_ is zero.
   * {CL_INVALID_DEVICE} if any device in _device_list_ is not in
     the list of devices associated with _context_.
   * {CL_INVALID_VALUE} if _lengths_ or _binaries_ is `NULL` or if any entry
     in _lengths_[i] is zero or _binaries_[i] is `NULL`.
   * {CL_INVALID_BINARY} if an invalid program binary was encountered for any
     device.
     _binary_status_ will return specific status for each device.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
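
The per-device argument checks reported through _binary_status_ can be
sketched as a host-side pre-check.
The names below (`sketch_check_binaries`, `BIN_SUCCESS`, `BIN_INVALID_VALUE`)
are hypothetical stand-ins, and the sketch covers only the zero-length and
`NULL` pointer cases; whether a non-empty binary is valid for its device can
only be decided by the implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes standing in for CL_SUCCESS / CL_INVALID_VALUE. */
enum { BIN_SUCCESS = 0, BIN_INVALID_VALUE = -1 };

/* Fills status[i] for each device slot and returns the number of slots
 * that fail the basic argument checks. */
static size_t sketch_check_binaries(size_t num_devices,
                                    const size_t *lengths,
                                    const unsigned char **binaries,
                                    int *status)
{
    assert(lengths != NULL && binaries != NULL && status != NULL);

    size_t bad = 0;
    for (size_t i = 0; i < num_devices; ++i) {
        if (lengths[i] == 0 || binaries[i] == NULL) {
            status[i] = BIN_INVALID_VALUE;  /* empty or missing binary */
            ++bad;
        } else {
            status[i] = BIN_SUCCESS;        /* passes the argument checks */
        }
    }
    return bad;
}
```

Running a check like this before the real call lets an application report
which cached binary is missing instead of only seeing a single error code.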

 [open,refpage='clCreateProgramWithBuiltInKernels',desc='Creates a program object for a context, and loads the information related to the built-in kernels into a program object.',type='protos']
 --
 To create a program object for a context and load the information related
 to the built-in kernels into that object, call the function

 include::{generated}/api/protos/clCreateProgramWithBuiltInKernels.txt[]
 include::{generated}/api/version-notes/clCreateProgramWithBuiltInKernels.asciidoc[]

   * _context_ must be a valid OpenCL context.
   * _num_devices_ is the number of devices listed in _device_list_.
   * _device_list_ is a pointer to a list of devices that are in _context_.
     _device_list_ must be a non-`NULL` value.
     The built-in kernels are loaded for devices specified in this list.
   * _kernel_names_ is a semicolon-separated list of built-in kernel names.

 The devices associated with the program object will be the list of devices
 specified by _device_list_.
 The list of devices specified by _device_list_ must be devices associated
 with _context_.

 // refError

 {clCreateProgramWithBuiltInKernels} returns a valid non-zero program object
 and _errcode_ret_ is set to {CL_SUCCESS} if the program object is created
 successfully.
 Otherwise, it returns a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_VALUE} if _device_list_ is `NULL` or _num_devices_ is zero.
   * {CL_INVALID_VALUE} if _kernel_names_ is `NULL` or _kernel_names_ contains
     a kernel name that is not supported by any of the devices in
     _device_list_.
   * {CL_INVALID_DEVICE} if any device in _device_list_ is not in the list
     of devices associated with _context_.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
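
An application that holds built-in kernel names in an array has to join them
into the single semicolon-separated string expected by _kernel_names_.
The helper below is one possible sketch; `sketch_join_kernel_names` is a
hypothetical name, and the kernel names used with it are placeholders rather
than kernels any device is known to provide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Joins count names with ';' into out. Returns the length of the joined
 * string (excluding the terminator), or 0 if there are no names or out
 * is too small to hold the result. */
static size_t sketch_join_kernel_names(const char *const *names, size_t count,
                                       char *out, size_t out_size)
{
    assert(names != NULL && out != NULL);
    if (count == 0)
        return 0;

    size_t used = 0;
    for (size_t i = 0; i < count; ++i) {
        size_t len = strlen(names[i]);
        size_t sep = (i > 0) ? 1 : 0;
        /* Always leave room for the separator, the name and a terminator. */
        if (used + sep + len + 1 > out_size)
            return 0;
        if (sep)
            out[used++] = ';';
        memcpy(out + used, names[i], len);
        used += len;
    }
    out[used] = '\0';
    return used;
}
```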


 === Retaining and Releasing Program Objects

 [open,refpage='clRetainProgram',desc='Increments the program reference count.',type='protos']
 --
 To retain a program object, call the function

 include::{generated}/api/protos/clRetainProgram.txt[]
 include::{generated}/api/version-notes/clRetainProgram.asciidoc[]

   * _program_ is the program object to be retained.

 The _program_ reference count is incremented.
 All APIs that create a program do an implicit retain.

 // refError

 {clRetainProgram} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clReleaseProgram',desc='Decrements the program reference count.',type='protos']
 --
 To release a program object, call the function

 include::{generated}/api/protos/clReleaseProgram.txt[]
 include::{generated}/api/version-notes/clReleaseProgram.asciidoc[]

   * _program_ is the program object to be released.

 The _program_ reference count is decremented.
 The program object is deleted after all kernel objects associated with
 _program_ have been deleted and the _program_ reference count becomes zero.

 // refError

 {clReleaseProgram} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.

 Using this function to release a reference that was not obtained by creating
 the object or by calling {clRetainProgram} causes undefined behavior.
 --

 [open,refpage='clSetProgramReleaseCallback',desc='Registers a destructor callback function with a program object.',type='protos']
 --
 To register a callback function with a program object that is called when
 the program object is destroyed, call the function

 include::{generated}/api/protos/clSetProgramReleaseCallback.txt[]
 include::{generated}/api/version-notes/clSetProgramReleaseCallback.asciidoc[]

   * _program_ specifies the program object to register the callback to.
   * _pfn_notify_ is the callback function to register.
     This callback function may be called asynchronously by the OpenCL
     implementation.
     It is the application's responsibility to ensure that the callback function
     is thread safe.
     The parameters to this callback function are:
   ** _program_ is the program being deleted.
      When the callback function is called by the implementation, this program
      object is no longer valid.
      _program_ is only provided for reference purposes.
   ** _user_data_ is a pointer to user supplied data.
   * _user_data_ will be passed as the _user_data_ argument when _pfn_notify_ is
     called.
     _user_data_ can be `NULL`.

 Each call to {clSetProgramReleaseCallback} registers the specified
 callback function on a callback stack associated with _program_.
 The registered callback functions are called in the reverse order in
 which they were registered.
 The registered callback functions are called after destructors (if any) for
 program scope global variables (if any) are called and before the program
 object is deleted.
 This provides a mechanism for an application to be notified when destructors
 for program scope global variables are complete.

 {clSetProgramReleaseCallback} may unconditionally return an error if no
 devices in the context associated with _program_ support destructors for
 program scope global variables.
 Support for constructors and destructors for program scope global variables
 is required only for OpenCL 2.2 devices.

 // refError

 {clSetProgramReleaseCallback} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
   * {CL_INVALID_OPERATION} if no devices in the context associated with
     _program_ support destructors for program scope global variables.
   * {CL_INVALID_VALUE} if _pfn_notify_ is `NULL`.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
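
The callback stack and its reverse calling order can be modeled with a few
lines of C.
Everything here is a sketch: `sketch_callback_stack`, `sketch_register`,
`sketch_run_on_destroy` and `sketch_record` are hypothetical names, the fixed
capacity is an arbitrary choice, and the callback signature is simplified to
take only _user_data_ (the real callback also receives the program).

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_CALLBACKS 8

/* Simplified callback type: the real one also receives the program. */
typedef void (*sketch_notify_fn)(void *user_data);

typedef struct {
    sketch_notify_fn fn[SKETCH_MAX_CALLBACKS];
    void            *user_data[SKETCH_MAX_CALLBACKS];
    size_t           count;
} sketch_callback_stack;

/* Pushes one callback; returns 0 on success, or -1 if fn is NULL or the
 * model's fixed-size stack is full. */
static int sketch_register(sketch_callback_stack *s, sketch_notify_fn fn,
                           void *user_data)
{
    assert(s != NULL);
    if (fn == NULL || s->count == SKETCH_MAX_CALLBACKS)
        return -1;
    s->fn[s->count] = fn;
    s->user_data[s->count] = user_data;
    s->count++;
    return 0;
}

/* Pops and calls every callback, most recently registered first. */
static void sketch_run_on_destroy(sketch_callback_stack *s)
{
    assert(s != NULL);
    while (s->count > 0) {
        s->count--;
        s->fn[s->count](s->user_data[s->count]);
    }
}

/* Example callback: appends the character that user_data points to
 * to a log, so the calling order can be inspected afterwards. */
static char   sketch_log[SKETCH_MAX_CALLBACKS + 1];
static size_t sketch_log_len;

static void sketch_record(void *user_data)
{
    assert(sketch_log_len < SKETCH_MAX_CALLBACKS);
    sketch_log[sketch_log_len++] = *(const char *)user_data;
    sketch_log[sketch_log_len] = '\0';
}
```

Registering `sketch_record` three times with the tags `a`, `b` and `c` and
then running the stack leaves the tags in the log in the order `c`, `b`, `a`.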


 === Setting SPIR-V specialization constants

 NOTE: Specialization constants are <<unified-spec, missing before>> version
 2.2.

 [open,refpage='clSetProgramSpecializationConstant',desc='',type='protos']
 --
 To set the value of a specialization constant, call the function

 include::{generated}/api/protos/clSetProgramSpecializationConstant.txt[]
 include::{generated}/api/version-notes/clSetProgramSpecializationConstant.asciidoc[]

   * _program_ must be a valid OpenCL program created from an intermediate
     language (e.g. SPIR-V).
   * _spec_id_ identifies the specialization constant whose value will be
     set.
   * _spec_size_ specifies the size in bytes of the data pointed to by
     _spec_value_.
     This should be 1 for boolean constants.
     For all other constant types this should match the size of the
     specialization constant in the module.
   * _spec_value_ is a pointer to the memory location that contains the value of
     the specialization constant.
     The data pointed to by _spec_value_ are copied and can be safely reused by
     the application after {clSetProgramSpecializationConstant} returns.
     This specialization value will be used by subsequent calls to
     {clBuildProgram} until another call to {clSetProgramSpecializationConstant}
     changes it.
     If a specialization constant is a boolean constant, _spec_value_ should be a
     pointer to a {cl_uchar_TYPE} value.
     A value of zero will set the specialization constant to false; any other
     value will set it to true.

 Calling this function multiple times for the same specialization constant
 shall cause the last provided value to override any previously specified
 value.
 The values are used by a subsequent {clBuildProgram} call for the _program_.

 An application is not required to provide values for every specialization
 constant contained in the module. If a value is not set by this API
 call, the default value is used during the build.

 // refError

 {clSetProgramSpecializationConstant} returns {CL_SUCCESS} if the function is
 executed successfully.

 Otherwise, it returns one of the following errors:

   * {CL_INVALID_PROGRAM} if _program_ is not a valid program object created
     from an intermediate language (e.g. SPIR-V), or if the intermediate
     language does not support specialization constants.
   * {CL_INVALID_OPERATION} if no devices associated with _program_ support
     intermediate language programs.
   * {CL_COMPILER_NOT_AVAILABLE} if _program_ is created with
     {clCreateProgramWithIL} and a compiler is not
     available, i.e. {CL_DEVICE_COMPILER_AVAILABLE} specified in the
     <<device-queries-table,Device Queries>> table is set to {CL_FALSE}.
   * {CL_INVALID_SPEC_ID} if _spec_id_ is not a valid specialization constant
     identifier.
   * {CL_INVALID_VALUE} if _spec_size_ does not match the size of the
     specialization constant in the module, or if _spec_value_ is
     `NULL`.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
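
The value-handling rules for specialization constants (copy on set, last value
wins, defaults when unset, one-byte booleans) can be sketched with a small
table.
The types and functions below are hypothetical and exist only for
illustration; in particular the 8-byte storage limit and the order in which
the identifier and the size are checked are assumptions of this model.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes, not the OpenCL error values. */
enum { SPEC_OK = 0, SPEC_INVALID_ID = -1, SPEC_INVALID_VALUE = -2 };

/* Hypothetical record for one specialization constant in a module. */
typedef struct {
    unsigned      id;
    size_t        size;      /* size in bytes; 1 for boolean constants */
    unsigned char value[8];  /* starts out holding the module default */
} sketch_spec_constant;

static int sketch_set_spec_constant(sketch_spec_constant *table, size_t count,
                                    unsigned spec_id, size_t spec_size,
                                    const void *spec_value)
{
    assert(table != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (table[i].id != spec_id)
            continue;
        assert(table[i].size <= sizeof table[i].value);
        if (spec_value == NULL || spec_size != table[i].size)
            return SPEC_INVALID_VALUE;
        /* The data is copied, so the caller may reuse its buffer.
         * A later call for the same id overwrites this value. */
        memcpy(table[i].value, spec_value, spec_size);
        return SPEC_OK;
    }
    return SPEC_INVALID_ID;
}

/* Boolean constants: zero means false, any other byte value means true. */
static int sketch_spec_bool(const sketch_spec_constant *c)
{
    assert(c != NULL && c->size == 1);
    return c->value[0] != 0;
}
```

A constant that is never set keeps the contents it was initialized with, which
plays the role of the module default in this model.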


 === Building Program Executables

 [open,refpage='clBuildProgram',desc='Builds (compiles and links) a program executable from the program source or binary.',type='protos']
 --
 To build (compile & link) a program executable, call the function

 include::{generated}/api/protos/clBuildProgram.txt[]
 include::{generated}/api/version-notes/clBuildProgram.asciidoc[]

   * _program_ is the program object.
   * _device_list_ is a pointer to a list of devices associated with _program_.
     If _device_list_ is a `NULL` value, the program executable is built for all
     devices associated with _program_ for which a source or binary has been
     loaded.
     If _device_list_ is a non-`NULL` value, the program executable is built for
     devices specified in this list for which a source or binary has been loaded.
   * _num_devices_ is the number of devices listed in _device_list_.
   * _options_ is a pointer to a null-terminated string of characters that
     describes the build options to be used for building the program executable.
     The list of supported options is described in <<compiler-options, Compiler
     Options>>.
     If the program was created using {clCreateProgramWithBinary} and _options_
     is a `NULL` pointer, the program will be built as if _options_ were the same
     as when the program binary was originally built.
     If the program was created using {clCreateProgramWithBinary} and _options_
     string contains anything other than the same options in the same order
     (whitespace ignored) as when the program binary was originally built, then
     the behavior is implementation defined.
     Otherwise, if _options_ is a `NULL` pointer then it will have the same
     result as the empty string.
   * _pfn_notify_ is a function pointer to a notification routine.
     The notification routine is a callback function that an application can
     register and which will be called when the program executable has been built
     (successfully or unsuccessfully).
     If _pfn_notify_ is not `NULL`, {clBuildProgram} does not need to wait for
     the build to complete and can return immediately once the build operation
     can begin.
     Any state changes of the program object that result from calling
     {clBuildProgram} (e.g. build status or log) will be observable from this
     callback function.
     The build operation can begin if the context, program whose sources are
     being compiled and linked, list of devices and build options specified are
     all valid and appropriate host and device resources needed to perform the
     build are available.
     If _pfn_notify_ is `NULL`, {clBuildProgram} does not return until the build
     has completed.
     This callback function may be called asynchronously by the OpenCL
     implementation.
     It is the application's responsibility to ensure that the callback function
     is thread-safe.
   * _user_data_ will be passed as an argument when _pfn_notify_ is called.
     _user_data_ can be `NULL`.

 The program executable is built from the program source or binary for all
 the devices, or a specific device(s) in the OpenCL context associated with
 _program_.
 OpenCL allows program executables to be built using the source or the
 binary.
 {clBuildProgram} must be called for _program_ created using
 {clCreateProgramWithSource}, {clCreateProgramWithIL} or
 {clCreateProgramWithBinary} to build the program executable for one or more
 devices associated with _program_.
 If _program_ is created with {clCreateProgramWithBinary}, then the program
 binary must be an executable binary (not a compiled binary or library).

 The executable binary can be queried using {clGetProgramInfo}(_program_,
 {CL_PROGRAM_BINARIES}, ...) and can be specified to
 {clCreateProgramWithBinary} to create a new program object.

 // refError

 {clBuildProgram} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
   * {CL_INVALID_VALUE} if _device_list_ is `NULL` and _num_devices_ is greater
     than zero, or if _device_list_ is not `NULL` and _num_devices_ is zero.
   * {CL_INVALID_VALUE} if _pfn_notify_ is `NULL` but _user_data_ is not
     `NULL`.
   * {CL_INVALID_DEVICE} if any device in _device_list_ is not in
     the list of devices associated with _program_.
   * {CL_INVALID_BINARY} if _program_ is created with
     {clCreateProgramWithBinary} and devices listed in _device_list_ do not
     have a valid program binary loaded.
   * {CL_INVALID_BUILD_OPTIONS} if the build options specified by _options_ are
     invalid.
   * {CL_COMPILER_NOT_AVAILABLE} if _program_ is created with
     {clCreateProgramWithSource} or {clCreateProgramWithIL} and a compiler is
     not available, i.e. {CL_DEVICE_COMPILER_AVAILABLE} specified in the
     <<device-queries-table,Device Queries>> table is set to {CL_FALSE}.
   * {CL_BUILD_PROGRAM_FAILURE} if there is a failure to build the program
     executable.
     This error will be returned if {clBuildProgram} does not return until
     the build has completed.
   * {CL_INVALID_OPERATION} if the build of a program executable for any of the
     devices listed in _device_list_ by a previous call to {clBuildProgram}
     for _program_ has not completed.
   * {CL_INVALID_OPERATION} if there are kernel objects attached to _program_.
   * {CL_INVALID_OPERATION} if _program_ was not created with
     {clCreateProgramWithSource}, {clCreateProgramWithIL} or
     {clCreateProgramWithBinary}.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
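
Three of the {CL_INVALID_VALUE} conditions for {clBuildProgram} depend only on
how the arguments relate to each other, so an application can check them
before making the call.
The sketch below uses hypothetical names (`sketch_build_args`,
`sketch_build_args_ok`, `sketch_noop_notify`) and a simplified callback type
that takes only _user_data_; it does not model any of the other error
conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified callback type: the real one also receives the program. */
typedef void (*sketch_build_notify)(void *user_data);

/* Hypothetical bundle of the arguments that the checks below look at. */
typedef struct {
    size_t              num_devices;
    const void         *device_list;
    sketch_build_notify pfn_notify;
    void               *user_data;
} sketch_build_args;

/* Returns true when none of the three invalid combinations is present. */
static bool sketch_build_args_ok(const sketch_build_args *args)
{
    assert(args != NULL);
    if (args->device_list == NULL && args->num_devices > 0)
        return false;   /* a device count without a device list */
    if (args->device_list != NULL && args->num_devices == 0)
        return false;   /* a device list without a device count */
    if (args->pfn_notify == NULL && args->user_data != NULL)
        return false;   /* user data without a callback to receive it */
    return true;
}

/* Example callback that does nothing; only used to form valid arguments. */
static void sketch_noop_notify(void *user_data)
{
    (void)user_data;
}
```

Passing a `NULL` device list with a count of zero is the valid way to request
a build for all devices associated with the program.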


 === Separate Compilation and Linking of Programs

 NOTE: Separate compilation and linking are <<unified-spec, missing before>>
 version 1.2.

 OpenCL programs are compiled and linked to support the following:

   * Separate compilation and link stages.
     Program sources can be compiled to generate a compiled binary object and
     linked in a separate stage with other compiled program objects to the
     program executable.
   * Embedded headers.
     In OpenCL 1.0 and 1.1, the `-I` build option could be used to specify the
     list of directories to be searched for header files that are included
     by a program source(s).
     OpenCL 1.2 extends this by allowing the header sources to come from
     program objects instead of just header files.
   * Libraries.
     The linker can be used to link compiled objects and libraries into a
     program executable or to create a library of compiled binaries.

 [open,refpage='clCompileProgram',desc='Compiles a program\'s source for all the devices or a specific device(s) in the OpenCL context associated with a program.',type='protos']
 --
 To compile a program's source for all the devices or a specific device(s) in
 the OpenCL context associated with the program, call the function

 include::{generated}/api/protos/clCompileProgram.txt[]
 include::{generated}/api/version-notes/clCompileProgram.asciidoc[]

   * _program_ is the program object that is the compilation target.
   * _device_list_ is a pointer to a list of devices associated with _program_.
     If _device_list_ is a `NULL` value, the compile is performed for all devices
     associated with _program_.
     If _device_list_ is a non-`NULL` value, the compile is performed for devices
     specified in this list.
   * _num_devices_ is the number of devices listed in _device_list_.
   * _options_ is a pointer to a null-terminated string of characters that
     describes the compilation options to be used for building the program
     executable.
     If _options_ is a `NULL` pointer then it will have the same result as the
     empty string.
     Certain options are ignored when program is created with IL.
     The list of supported options is as described in <<compiler-options,
     Compiler Options>>.
   * _num_input_headers_ specifies the number of programs that describe headers
     in the array referenced by _input_headers_.
   * _input_headers_ is an array of program embedded headers created with
     {clCreateProgramWithSource}.
   * _header_include_names_ is an array that has a one to one correspondence with
     _input_headers_.
     Each entry in _header_include_names_ specifies the include name used by
     source in _program_ that comes from an embedded header.
     The corresponding entry in _input_headers_ identifies the program object
     which contains the header source to be used.
     The embedded headers are searched before the headers in the list of
     directories specified by the `-I` compile option (as described in
     <<preprocessor-options, Preprocessor options>>).
     If multiple entries in _header_include_names_ refer to the same header name,
     the first one encountered will be used.
   * _pfn_notify_ is a function pointer to a notification routine.
     The notification routine is a callback function that an application can
     register and which will be called when the program executable has been built
     (successfully or unsuccessfully).
     If _pfn_notify_ is not `NULL`, {clCompileProgram} does not need to wait for
     the compiler to complete and can return immediately once the compilation can
     begin.
     Any state changes of the program object that result from calling
     {clCompileProgram} (e.g. compile status or log) will be observable from this
     callback function.
     The compilation can begin if the context, program whose sources are being
     compiled, list of devices, input headers, programs that describe input
     headers and compiler options specified are all valid and appropriate host
     and device resources needed to perform the compile are available.
     If _pfn_notify_ is `NULL`, {clCompileProgram} does not return until the
     compiler has completed.
     This callback function may be called asynchronously by the OpenCL
     implementation.
     It is the application's responsibility to ensure that the callback function
     is thread-safe.
   ** _user_data_ will be passed as an argument when _pfn_notify_ is called.
      _user_data_ can be `NULL`.

 The pre-processor runs before the program sources are compiled.
 The compiled binary is built for all devices associated with _program_ or
 the list of devices specified.
 The compiled binary can be queried using {clGetProgramInfo}(_program_,
 {CL_PROGRAM_BINARIES}, ...) and can be passed to {clCreateProgramWithBinary}
 to create a new program object.

 If _program_ was created using {clCreateProgramWithIL}, then
 _num_input_headers_, _input_headers_, and _header_include_names_ are
 ignored.

 For example, consider the following program source:

 [source,c]
 ----
 #include <foo.h>
 #include <mydir/myinc.h>
 __kernel void
 image_filter (int n, int m,
               __constant float *filter_weights,
               __read_only image2d_t src_image,
               __write_only image2d_t dst_image)
 {
 ...
 }
 ----

 This kernel includes two headers foo.h and mydir/myinc.h.
 The following describes how these headers can be passed as embedded headers
 in program objects:

 [source,c]
 ----
 cl_program foo_pg = clCreateProgramWithSource(context,
     1, &foo_header_src, NULL, &err);
 cl_program myinc_pg = clCreateProgramWithSource(context,
     1, &myinc_header_src, NULL, &err);

 // let's assume the program source described above is given
 // by program_A and is loaded via clCreateProgramWithSource
 cl_program input_headers[2] = { foo_pg, myinc_pg };
 const char * input_header_names[2] = { "foo.h", "mydir/myinc.h" };
 clCompileProgram(program_A,
                  0, NULL, // num_devices & device_list
                  NULL,    // compile_options
                  2,       // num_input_headers
                  input_headers,
                  input_header_names,
                  NULL, NULL); // pfn_notify & user_data
 ----
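
 The lookup order described for _header_include_names_ can be modeled in a few
 lines of plain C.
 This is a minimal sketch, not the implementation's algorithm: the helper name
 `find_embedded_header` is hypothetical, and exact string comparison of include
 names is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an OpenCL API call): models the lookup order for
 * embedded headers. Returns the index of the first entry in
 * header_include_names equal to include_name, or -1 when no embedded header
 * matches and the search would fall through to the -I directories. */
int find_embedded_header(const char *const *header_include_names,
                         size_t num_input_headers,
                         const char *include_name)
{
    assert(include_name != NULL);
    for (size_t i = 0; i < num_input_headers; ++i) {
        /* Assumption: include names are compared as exact strings. */
        if (strcmp(header_include_names[i], include_name) == 0) {
            return (int)i; /* first match wins; later duplicates are ignored */
        }
    }
    return -1;
}
```

 The returned index identifies the matching entry in _input_headers_; a result
 of -1 corresponds to falling back to the `-I` directories.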

 // refError

 {clCompileProgram} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
   * {CL_INVALID_VALUE} if _device_list_ is `NULL` and _num_devices_ is greater
     than zero, or if _device_list_ is not `NULL` and _num_devices_ is zero.
   * {CL_INVALID_VALUE} if _num_input_headers_ is zero and
     _header_include_names_ or _input_headers_ are not `NULL` or if
     _num_input_headers_ is not zero and _header_include_names_ or
     _input_headers_ are `NULL`.
   * {CL_INVALID_VALUE} if _pfn_notify_ is `NULL` but _user_data_ is not
     `NULL`.
   * {CL_INVALID_DEVICE} if any device in _device_list_ is not in
     the list of devices associated with _program_.
   * {CL_INVALID_COMPILER_OPTIONS} if the compiler options specified by
     _options_ are invalid.
   * {CL_INVALID_OPERATION} if the compilation or build of a program executable
     for any of the devices listed in _device_list_ by a previous call to
     {clCompileProgram} or {clBuildProgram} for _program_ has not completed.
   * {CL_COMPILER_NOT_AVAILABLE} if a compiler is not available, i.e.
     {CL_DEVICE_COMPILER_AVAILABLE} specified in the
     <<device-queries-table,Device Queries>> table is set to {CL_FALSE}.
   * {CL_COMPILE_PROGRAM_FAILURE} if there is a failure to compile the program
     source.
     This error will be returned if {clCompileProgram} does not return until
     the compile has completed.
   * {CL_INVALID_OPERATION} if there are kernel objects attached to _program_.
   * {CL_INVALID_OPERATION} if _program_ has no source or IL available, i.e. it
     has not been created with {clCreateProgramWithSource} or
     {clCreateProgramWithIL}.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
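
 An application may want to check the argument combinations that lead to
 {CL_INVALID_VALUE} before calling {clCompileProgram}.
 The following is a sketch under stated assumptions: `compile_args_consistent`
 is a hypothetical helper, pointers are passed as `const void *` so the block
 does not depend on the OpenCL headers, and the callback is represented by a
 boolean flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pre-flight check (not part of OpenCL). Returns false for the
 * argument combinations that the error list maps to CL_INVALID_VALUE. */
bool compile_args_consistent(const void *device_list, unsigned num_devices,
                             unsigned num_input_headers,
                             const void *input_headers,
                             const void *header_include_names,
                             bool has_pfn_notify, const void *user_data)
{
    /* device_list and num_devices must be both "empty" or both "set". */
    if ((device_list == NULL) != (num_devices == 0)) {
        return false;
    }
    if (num_input_headers == 0) {
        /* No headers announced: both arrays must be NULL. */
        if (input_headers != NULL || header_include_names != NULL) {
            return false;
        }
    } else {
        /* Headers announced: both arrays must be provided. */
        if (input_headers == NULL || header_include_names == NULL) {
            return false;
        }
    }
    /* user_data without a callback is rejected. */
    if (!has_pfn_notify && user_data != NULL) {
        return false;
    }
    return true;
}
```

 The check covers only argument consistency; the validity of the program,
 devices, and options is still determined by the implementation.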

 [open,refpage='clLinkProgram',desc='Links a set of compiled program objects and libraries for all the devices or a specific device(s) in the OpenCL context and creates a library or executable.',type='protos']
 --
 To link a set of compiled program objects and libraries for all the devices
 or a specific device(s) in the OpenCL context and create a library or
 executable, call the function

 include::{generated}/api/protos/clLinkProgram.txt[]
 include::{generated}/api/version-notes/clLinkProgram.asciidoc[]

   * _context_ must be a valid OpenCL context.
   * _device_list_ is a pointer to a list of devices that are in _context_.
     If _device_list_ is a `NULL` value, the link is performed for all devices
     associated with _context_ for which a compiled object is available.
     If _device_list_ is a non-`NULL` value, the link is performed for devices
     specified in this list for which a compiled object is available.
   * _num_devices_ is the number of devices listed in _device_list_.
   * _options_ is a pointer to a null-terminated string of characters that
     describes the link options to be used for building the program executable.
     The list of supported options is as described in <<linker-options,Linker
     Options>>.
     If the program was created using {clCreateProgramWithBinary} and _options_
     is a `NULL` pointer, the program will be linked as if _options_ were the
     same as when the program binary was originally built.
     If the program was created using {clCreateProgramWithBinary} and _options_
     string contains anything other than the same options in the same order
     (whitespace ignored) as when the program binary was originally built, then
     the behavior is implementation defined.
     Otherwise, if _options_ is a `NULL` pointer then it will have the same
     result as the empty string.
   * _num_input_programs_ specifies the number of programs in array referenced by
     _input_programs_.
   * _input_programs_ is an array of program objects that are compiled binaries
     or libraries that are to be linked to create the program executable.
     For each device in _device_list_ or if _device_list_ is `NULL` the list of
     devices associated with context, the following cases occur:
   ** All programs specified by _input_programs_ contain a compiled binary or
      library for the device.
      In this case, a link is performed to generate a program executable for
      this device.
   ** None of the programs contain a compiled binary or library for that
      device.
      In this case, no link is performed and there will be no program
      executable generated for this device.
   ** All other cases will return a {CL_INVALID_OPERATION} error.
   * _pfn_notify_ is a function pointer to a notification routine.
     The notification routine is a callback function that an application can
     register and which will be called when the program executable has been built
     (successfully or unsuccessfully).
   ** _user_data_ will be passed as an argument when _pfn_notify_ is called.
      _user_data_ can be `NULL`.

 If _pfn_notify_ is not `NULL`, {clLinkProgram} does not need to wait for the
 linker to complete, and can return immediately once the linking operation can
 begin.
 Once the linker has completed, the _pfn_notify_ callback function is called
 with the program object returned by {clLinkProgram}.
 Any state changes of the program object that result from calling {clLinkProgram}
 (e.g. link status or log) will be observable from this callback function.
 This callback function may be called asynchronously by the OpenCL
 implementation.
 It is the application's responsibility to ensure that the callback function
 is thread-safe.

 If _pfn_notify_ is `NULL`, {clLinkProgram} does not return until the linker
 has completed.

 {clLinkProgram} creates a new program object which contains the library or
 executable.
 The library or executable binary can be queried using
 {clGetProgramInfo}(_program_, {CL_PROGRAM_BINARIES}, ...) and can be specified
 to {clCreateProgramWithBinary} to create a new program object.

 The devices associated with the returned program object will be the list of
 devices specified by _device_list_ or if _device_list_ is `NULL` it will be
 the list of devices associated with _context_.

 The linking operation can begin if the context, list of devices, input
 programs and linker options specified are all valid and appropriate host and
 device resources needed to perform the link are available.
 If the linking operation can begin, {clLinkProgram} returns a valid non-zero
 program object.

 // refError

 If _pfn_notify_ is `NULL`, the _errcode_ret_ will be set to {CL_SUCCESS} if
 the link operation was successful and {CL_LINK_PROGRAM_FAILURE} if there is a
 failure to link the compiled binaries and/or libraries.

 If _pfn_notify_ is not `NULL`, {clLinkProgram} does not have to wait for
 the linker to complete and can return {CL_SUCCESS} in _errcode_ret_ if the
 linking operation can begin.
 The _pfn_notify_ callback function will return a {CL_SUCCESS} or
 {CL_LINK_PROGRAM_FAILURE} if the linking operation was successful or not.

 Otherwise {clLinkProgram} returns a `NULL` program object with an
 appropriate error in _errcode_ret_.
 The application should query the linker status of this program object to
 check if the link was successful or not.
 The list of errors that can be returned are:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_INVALID_VALUE} if _device_list_ is `NULL` and _num_devices_ is greater
     than zero, or if _device_list_ is not `NULL` and _num_devices_ is zero.
   * {CL_INVALID_VALUE} if _num_input_programs_ is zero and _input_programs_ is
     `NULL` or if _num_input_programs_ is zero and _input_programs_ is not
     `NULL` or if _num_input_programs_ is not zero and _input_programs_ is
     `NULL`.
   * {CL_INVALID_PROGRAM} if programs specified in _input_programs_ are not
     valid program objects.
   * {CL_INVALID_VALUE} if _pfn_notify_ is `NULL` but _user_data_ is not
     `NULL`.
   * {CL_INVALID_DEVICE} if any device in _device_list_ is not in
     the list of devices associated with _context_.
   * {CL_INVALID_LINKER_OPTIONS} if the linker options specified by _options_
     are invalid.
   * {CL_INVALID_OPERATION} if the compilation or build of a program executable
     for any of the devices listed in _device_list_ by a previous call to
     {clCompileProgram} or {clBuildProgram} for _program_ has not completed.
   * {CL_INVALID_OPERATION} if the rules for devices containing compiled
     binaries or libraries as described in _input_programs_ argument above
     are not followed.
   * {CL_LINKER_NOT_AVAILABLE} if a linker is not available, i.e.
     {CL_DEVICE_LINKER_AVAILABLE} specified in the
     <<device-queries-table,Device Queries>> table is set to {CL_FALSE}.
   * {CL_LINK_PROGRAM_FAILURE} if there is a failure to link the compiled
     binaries and/or libraries.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
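
 The per-device rule described for the _input_programs_ argument of
 {clLinkProgram} distinguishes three cases.
 The sketch below models that rule in plain C; the enum, the function name
 `classify_device_link`, and the `has_binary` flag array are illustrative
 assumptions and do not appear in the OpenCL API.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative outcome of the per-device rule (names are hypothetical). */
typedef enum {
    LINK_PERFORM,           /* every input program has a binary or library */
    LINK_SKIP,              /* no input program has one: nothing is linked */
    LINK_INVALID_OPERATION  /* mixed case: maps to CL_INVALID_OPERATION */
} link_decision;

/* has_binary[i] is nonzero when input program i contains a compiled binary
 * or library for the device being considered. */
link_decision classify_device_link(const int *has_binary,
                                   size_t num_input_programs)
{
    /* Zero input programs is a CL_INVALID_VALUE case handled elsewhere. */
    assert(has_binary != NULL && num_input_programs > 0);

    size_t available = 0;
    for (size_t i = 0; i < num_input_programs; ++i) {
        if (has_binary[i]) {
            ++available;
        }
    }
    if (available == num_input_programs) {
        return LINK_PERFORM;
    }
    if (available == 0) {
        return LINK_SKIP;
    }
    return LINK_INVALID_OPERATION;
}
```

 Applying the function once per device in _device_list_ (or per device in
 _context_ when _device_list_ is `NULL`) reproduces the three cases listed for
 _input_programs_.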


 [[compiler-options]]
 === Compiler Options

 The compiler options are categorized as pre-processor options, options for
 math intrinsics, options that control optimization and miscellaneous
 options.
 This specification defines a standard set of options that must be supported
 by the compiler when building program executables online or offline from
 OpenCL C/{cpp} or, where relevant, from an IL.
 These may be extended by a set of vendor- or platform-specific options.


 [[preprocessor-options]]
 ==== Preprocessor options

 These options control the OpenCL C/{cpp} preprocessor which is run on each
 program source before actual compilation.
 These options are ignored for programs created with IL.

 `-D name` ::
     Predefine _name_ as a macro, with definition 1.

 `-D name=definition` ::
     The contents of _definition_ are tokenized and processed as if they
     appeared during translation phase three in a `#define` directive.
     In particular, the definition will be truncated by embedded newline
     characters.
 +
 --
 `-D` options are processed in the order they are given in the _options_
 argument to {clBuildProgram} or {clCompileProgram}.
 Note that a space is required between the `-D` option and the symbol it
 defines, otherwise behavior is implementation defined.
 --

 `-I dir` ::
     Add the directory _dir_ to the list of directories to be searched for
     header files.
     _dir_ can optionally be enclosed in double quotes.
 +
 --
 This option is not portable due to its dependency on host file system and
 host operating system.
 It is supported for backwards compatibility with previous OpenCL versions.
 Developers are encouraged to create and use explicit header objects by means
 of {clCompileProgram} followed by {clLinkProgram}.
 --
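
 As an illustration of the `-D` forms above, the following sketch formats an
 options string containing one `-D name` and one `-D name=definition` entry,
 each with the required space after `-D`.
 The function name `format_define_options` and the macro names used with it are
 hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "-D <flag_macro> -D <value_macro>=<value>"
 * into buf. Returns 0 on success, -1 if the string did not fit or
 * formatting failed. */
int format_define_options(char *buf, size_t buf_size,
                          const char *flag_macro,
                          const char *value_macro, int value)
{
    /* A space separates each -D from the symbol it defines. */
    int n = snprintf(buf, buf_size, "-D %s -D %s=%d",
                     flag_macro, value_macro, value);
    return (n >= 0 && (size_t)n < buf_size) ? 0 : -1;
}
```

 The resulting string can be passed as the _options_ argument to
 {clBuildProgram} or {clCompileProgram}; because `-D` options are processed in
 order, a later definition of the same name follows an earlier one.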


 [[math-intrinsics-options]]
 ==== Math Intrinsics Options

 These options control compiler behavior regarding floating-point arithmetic.
 These options trade off between speed and correctness.

 `-cl-single-precision-constant` ::
     Treat double precision floating-point constants as single precision
     constants.
     This option is ignored for programs created with IL.

 `-cl-denorms-are-zero` ::
     This option controls how single precision and double precision
     denormalized numbers are handled.
     If specified as a build option, the single precision denormalized
     numbers may be flushed to zero; double precision denormalized numbers
     may also be flushed to zero if the optional extension for double
     precision is supported.
     This is intended to be a performance hint and the OpenCL compiler can
     choose not to flush denorms to zero if the device supports single
     precision (or double precision) denormalized numbers.
 +
 --
 This option is ignored for single precision numbers if the device does not
 support single precision denormalized numbers i.e. {CL_FP_DENORM} bit is not
 set in {CL_DEVICE_SINGLE_FP_CONFIG}.

 This option is ignored for double precision numbers if the device does not
 support double precision or if it does support double precision but not
 double precision denormalized numbers i.e. {CL_FP_DENORM} bit is not set in
 {CL_DEVICE_DOUBLE_FP_CONFIG}.

 This flag only applies for scalar and vector single precision floating-point
 variables and computations on these floating-point variables inside a
 program.
 It does not apply to reading from or writing to image objects.
 --

 `-cl-fp32-correctly-rounded-divide-sqrt` ::
     The `-cl-fp32-correctly-rounded-divide-sqrt` build option to
     {clBuildProgram} or {clCompileProgram} allows an application to specify
     that single precision floating-point divide (x/y and 1/x) and sqrt used
     in the program source are correctly rounded.
     If this build option is not specified, the minimum numerical accuracy of
     single precision floating-point divide and sqrt are as defined in the
     OpenCL C or OpenCL SPIR-V Environment specifications.
 +
 --
 This build option can only be specified if
 {CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT} is set in {CL_DEVICE_SINGLE_FP_CONFIG} (as
 defined in the <<device-queries-table,Device Queries>> table) for the devices
 for which the program is being built.
 {clBuildProgram} or {clCompileProgram} will fail to compile the program for
 a device if the `-cl-fp32-correctly-rounded-divide-sqrt` option is specified
 and {CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT} is not set for the device.

 Note: This option is <<unified-spec, missing before>> version 1.2.
 --


 [[optimization-options]]
 ==== Optimization Options

 These options control various sorts of optimizations.
 Turning on optimization flags makes the compiler attempt to improve the
 performance and/or code size at the expense of compilation time and possibly
 the ability to debug the program.

 `-cl-opt-disable` ::
     This option disables all optimizations.
     By default, optimizations are enabled.

 `-cl-strict-aliasing` ::
     This option allows the compiler to assume the strictest aliasing rules.
 +
 --
 Note: This option is <<unified-spec, deprecated by>> version 1.1.
 --

 `-cl-uniform-work-group-size` ::
     This requires that the global work-size be a multiple of the work-group
     size specified to {clEnqueueNDRangeKernel}.
     Allow optimizations that are made possible by this restriction.
 +
 --
 Note: This option is <<unified-spec, missing before>> version 2.0.
 --

 `-cl-no-subgroup-ifp` ::
     This indicates that kernels in this program do not require subgroups to
     make independent forward progress.
     Allows optimizations that are made possible by this restriction.
     This option has no effect for devices that do not support independent
     forward progress for subgroups.
 +
 --
 Note: This option is <<unified-spec, missing before>> version 2.1.
 --

 The following options control compiler behavior regarding floating-point
 arithmetic.
 These options trade off between performance and correctness and must be
 specifically enabled.
 These options are not turned on by default since they can result in incorrect
 output for programs which depend on an exact implementation of IEEE 754
 rules/specifications for math functions.

 `-cl-mad-enable` ::
     Allow `a * b + c` to be replaced by a *mad* instruction.
     The *mad* instruction may compute `a * b + c` with reduced accuracy
     in the embedded profile.
     See the OpenCL C or OpenCL SPIR-V Environment specification for accuracy
     details.
     On some hardware the *mad* instruction may provide better performance
     than the expanded computation.

 `-cl-no-signed-zeros` ::
     Allow optimizations for floating-point arithmetic that ignore the
     signedness of zero.
     IEEE 754 arithmetic specifies the distinct behavior of `+0.0` and `-0.0`
     values, which then prohibits simplification of expressions such as `x
     {plus} 0.0` or `0.0 * x` (even with `-cl-finite-math-only`).
     This option implies that the sign of a zero result isn't significant.

 `-cl-unsafe-math-optimizations` ::
     Allow optimizations for floating-point arithmetic that (a) assume that
     arguments and results are valid, (b) may violate the IEEE 754 standard,
     (c) assume relaxed OpenCL numerical compliance requirements as defined
     in the unsafe math optimization section of the OpenCL C or OpenCL SPIR-V
     Environment specifications, and (d) may violate edge case behavior in the
     OpenCL C or OpenCL SPIR-V Environment specifications.
     This option includes the `-cl-no-signed-zeros` and `-cl-mad-enable` options.

 `-cl-finite-math-only` ::
     Allow optimizations for floating-point arithmetic that assume that
     arguments and results are not NaNs, +Inf, -Inf.
     This option may violate the OpenCL numerical compliance requirements for
     single precision and double precision floating-point, as well as edge
     case behavior.

 `-cl-fast-relaxed-math` ::
     Sets the optimization options `-cl-finite-math-only` and
     `-cl-unsafe-math-optimizations`.
     This option causes the preprocessor macro `+__FAST_RELAXED_MATH__+` to
     be defined in the OpenCL program.
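
 The way these floating-point options imply one another can be summarized with
 a small bitmask model.
 This is only a sketch: the `OPT_*` constants and `expand_math_options` are
 hypothetical names for illustration and are not defined by OpenCL.

```c
#include <assert.h>

/* Hypothetical flags, one per option discussed above. */
enum {
    OPT_MAD_ENABLE        = 1u << 0, /* -cl-mad-enable */
    OPT_NO_SIGNED_ZEROS   = 1u << 1, /* -cl-no-signed-zeros */
    OPT_UNSAFE_MATH       = 1u << 2, /* -cl-unsafe-math-optimizations */
    OPT_FINITE_MATH_ONLY  = 1u << 3, /* -cl-finite-math-only */
    OPT_FAST_RELAXED_MATH = 1u << 4, /* -cl-fast-relaxed-math */
    OPT_ALL               = (1u << 5) - 1u
};

/* Returns opts with every implied option also set. */
unsigned expand_math_options(unsigned opts)
{
    assert((opts & ~(unsigned)OPT_ALL) == 0); /* reject unknown bits */

    /* -cl-fast-relaxed-math sets finite-math-only and unsafe-math. */
    if (opts & OPT_FAST_RELAXED_MATH) {
        opts |= OPT_FINITE_MATH_ONLY | OPT_UNSAFE_MATH;
    }
    /* -cl-unsafe-math-optimizations includes no-signed-zeros and mad-enable. */
    if (opts & OPT_UNSAFE_MATH) {
        opts |= OPT_NO_SIGNED_ZEROS | OPT_MAD_ENABLE;
    }
    return opts;
}
```

 Under this model, requesting only `-cl-fast-relaxed-math` yields all five
 options, while `-cl-finite-math-only` on its own implies nothing further.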


 ==== Options to Request or Suppress Warnings

 Warnings are diagnostic messages that report constructions which are not
 inherently erroneous but which are risky or suggest there may have been an
 error.
 The following language-independent options do not enable specific warnings
 but control the kinds of diagnostics produced by the OpenCL compiler.
 These options are ignored for programs created with IL.

 `-w` ::
     Inhibit all warning messages.

 `-Werror` ::
     Make all warnings into errors.


 [[opencl-c-version]]
 ==== Options Controlling the OpenCL C version

 The following option controls the version of OpenCL C that the compiler
 accepts.
 These options are ignored for programs created with IL.

 `-cl-std=` ::
     Determine the OpenCL C language version to use.
     A value for this option must be provided.
     Valid values are:
 +
 --
   * `CL1.1`: Support OpenCL C 1.1 language features defined in _section 6_ of
   the OpenCL 1.1 specification or in the unified OpenCL C specification.
   * `CL1.2`: Support OpenCL C 1.2 language features defined in _section 6_ of
   the OpenCL 1.2 specification or in the unified OpenCL C specification.
   * `CL2.0`: Support OpenCL C 2.0 language features defined in the OpenCL C 2.0
   specification or in the unified OpenCL C specification.
   * `CL3.0`: Support OpenCL C 3.0 language features defined in the unified
   OpenCL C specification.
 --

 Calls to {clBuildProgram} or {clCompileProgram} with the `-cl-std=CL1.1`
 option *will fail* to compile the program for any devices with
 {CL_DEVICE_OPENCL_C_VERSION} equal to OpenCL C 1.0 and when
 {CL_DEVICE_OPENCL_C_ALL_VERSIONS} does not include OpenCL C 1.1.

 Calls to {clBuildProgram} or {clCompileProgram} with the `-cl-std=CL1.2`
 option *will fail* to compile the program for any devices with
 {CL_DEVICE_OPENCL_C_VERSION} equal to OpenCL C 1.1 or earlier
 and when {CL_DEVICE_OPENCL_C_ALL_VERSIONS} does not include OpenCL C 1.2.

 Calls to {clBuildProgram} or {clCompileProgram} with the `-cl-std=CL2.0`
 option *will fail* to compile the program for any devices with
 {CL_DEVICE_OPENCL_C_VERSION} equal to OpenCL C 1.2 or earlier
 and when {CL_DEVICE_OPENCL_C_ALL_VERSIONS} does not include OpenCL C 2.0.

 Calls to {clBuildProgram} or {clCompileProgram} with the `-cl-std=CL3.0`
 option *will fail* to compile the program for any devices with
 {CL_DEVICE_OPENCL_C_VERSION} equal to OpenCL C 2.0 or earlier
 and when {CL_DEVICE_OPENCL_C_ALL_VERSIONS} does not include OpenCL C 3.0.

 If the `-cl-std` build option is not specified, the highest OpenCL C 1.x
 language version supported by each device is used when compiling the program
 for each device.
 Applications are required to specify the `-cl-std=CL2.0` build option to
 compile or build programs with OpenCL C 2.0 and the `-cl-std=CL3.0`
 build option to compile or build programs with OpenCL C 3.0.
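
 The failure rules for `-cl-std` can be read as a two-part test: the build
 fails when the version reported by {CL_DEVICE_OPENCL_C_VERSION} is older than
 the requested one and the requested version is not listed in
 {CL_DEVICE_OPENCL_C_ALL_VERSIONS}.
 The sketch below encodes that reading; `cl_std_must_fail` is a hypothetical
 helper, and the caller is assumed to have already determined whether the
 requested version appears in {CL_DEVICE_OPENCL_C_ALL_VERSIONS}.

```c
#include <stdio.h>

/* Hypothetical helper. device_c_version is the string returned for
 * CL_DEVICE_OPENCL_C_VERSION ("OpenCL C <major>.<minor> <vendor info>").
 * listed_in_all_versions is nonzero when the requested version appears in
 * CL_DEVICE_OPENCL_C_ALL_VERSIONS (pass 0 if that query is unavailable).
 * Returns 1 if the build is expected to fail, 0 if not, and -1 if the
 * version string could not be parsed. */
int cl_std_must_fail(const char *device_c_version,
                     int req_major, int req_minor,
                     int listed_in_all_versions)
{
    int major = 0;
    int minor = 0;
    if (sscanf(device_c_version, "OpenCL C %d.%d", &major, &minor) != 2) {
        return -1;
    }
    int device_is_older = (major < req_major) ||
                          (major == req_major && minor < req_minor);
    return (device_is_older && !listed_in_all_versions) ? 1 : 0;
}
```

 For example, requesting `-cl-std=CL2.0` on a device reporting OpenCL C 1.2
 that does not list OpenCL C 2.0 yields 1 under this model.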


 ==== Options for Querying Kernel Argument Information

 IMPORTANT: Querying for kernel argument information is <<unified-spec, missing
 before>> version 1.2.

 `-cl-kernel-arg-info` ::
     This option allows the compiler to store information about the arguments
     of a kernel(s) in the program executable.
     The argument information stored includes the argument name, its type,
     the address space and access qualifiers used.
     Refer to description of {clGetKernelArgInfo} on how to query this
     information.


 ==== Options for debugging your program

 IMPORTANT: Debugging options are <<unified-spec, missing before>> version 2.0.

 `-g` ::
     This option can currently be used to generate additional errors for the
     built-in functions that allow you to enqueue commands on a device (refer
     to OpenCL kernel languages specifications).


 [[linker-options]]
 === Linker Options

 NOTE: Linker options are <<unified-spec, missing before>> version 1.2.

 This specification defines a standard set of linker options that must be
 supported by the OpenCL C compiler when linking compiled programs online or
 offline.
 These linker options are categorized as library linking options and program
 linking options.
 These may be extended by a set of vendor- or platform-specific options.


 ==== Library Linking Options

 IMPORTANT: Library linking options are <<unified-spec, missing before>> version
 1.2.

 The following options can be specified when creating a library of compiled
 binaries.

 `-create-library` ::
     Create a library of compiled binaries specified in _input_programs_
     argument to {clLinkProgram}.

 `-enable-link-options` ::
     Allows the linker to modify the library behavior based on one or more
     link options (described in <<program-linking-options, Program Linking
     Options>>) when this library is linked with a program executable.
     This option must be specified with the `-create-library` option.
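
 The dependency between the two library linking options can be checked before
 calling {clLinkProgram}.
 This is a rough sketch: `library_options_valid` is a hypothetical helper, and
 it uses plain substring matching, which is an assumption that would need
 refining for option strings containing quoted arguments.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical check: returns 1 when the options string respects the rule
 * that -enable-link-options requires -create-library, 0 otherwise.
 * A NULL options pointer is treated like the empty string. */
int library_options_valid(const char *options)
{
    if (options == NULL) {
        return 1;
    }
    /* Assumption: substring search is sufficient to detect each option. */
    int enable_link = strstr(options, "-enable-link-options") != NULL;
    int create_lib  = strstr(options, "-create-library") != NULL;
    return (!enable_link || create_lib) ? 1 : 0;
}
```

 The check only covers the pairing of the two options; whether the full
 _options_ string is accepted is still decided by the implementation.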


 [[program-linking-options]]
 ==== Program Linking Options

 The following options can be specified when linking a program executable.

 `-cl-denorms-are-zero` +
 `-cl-no-signed-zeros` +
 `-cl-unsafe-math-optimizations` +
 `-cl-finite-math-only` +
 `-cl-fast-relaxed-math` +
 `-cl-no-subgroup-ifp` (<<unified-spec, missing before>> version 2.1)

 The options are described in <<math-intrinsics-options, Math Intrinsics
 Options>> and <<optimization-options, Optimization Options>>.
 The linker may apply these options to all compiled program objects
 specified to {clLinkProgram}.
 The linker may apply these options only to libraries which were created
 with the option `-enable-link-options`.


 === Unloading the OpenCL Compiler

 [open,refpage='clUnloadPlatformCompiler',desc='Allows the implementation to release the resources allocated by the OpenCL compiler for a platform.',type='protos']
 --
 To unload an OpenCL compiler for a platform, call the function

 include::{generated}/api/protos/clUnloadPlatformCompiler.txt[]
 include::{generated}/api/version-notes/clUnloadPlatformCompiler.asciidoc[]

   * _platform_ is the platform to unload.

 This function allows the implementation to release the resources allocated
 by the OpenCL compiler for _platform_.
 This is a hint from the application and does not guarantee that the compiler
 will not be used in the future or that the compiler will actually be
 unloaded by the implementation.
 Calls to {clBuildProgram}, {clCompileProgram} or {clLinkProgram} after
 {clUnloadPlatformCompiler} will reload the compiler, if necessary, to build
 the appropriate program executable.

 // refError

 {clUnloadPlatformCompiler} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_PLATFORM} if _platform_ is not a valid platform.
 --

 [open,refpage='clUnloadCompiler',desc='Allows the implementation to release the resources allocated by the OpenCL compiler.',type='protos']
 --
 Alternatively, if you are not using OpenCL via the ICD loader, you may unload the OpenCL compiler with the function

 include::{generated}/api/protos/clUnloadCompiler.txt[]
 include::{generated}/api/version-notes/clUnloadCompiler.asciidoc[]

 This function allows the implementation to release the resources allocated
 by the OpenCL compiler.
 This is a hint from the application and does not guarantee that the compiler
 will not be used in the future or that the compiler will actually be
 unloaded by the implementation.
 Calls to {clBuildProgram}, {clCompileProgram} or {clLinkProgram} after
 {clUnloadCompiler} will reload the compiler, if necessary, to build
 the appropriate program executable.

 // refError

 {clUnloadCompiler} will always return {CL_SUCCESS}.
 --


 === Program Object Queries

 [open,refpage='clGetProgramInfo',desc='Returns information about the program object.',type='protos']
 --
 To return information about a program object, call the function

 include::{generated}/api/protos/clGetProgramInfo.txt[]
 include::{generated}/api/version-notes/clGetProgramInfo.asciidoc[]

   * _program_ specifies the program object being queried.
   * _param_name_ specifies the information to query.
     The list of supported _param_name_ types and the information returned in
     _param_value_ by {clGetProgramInfo} is described in the
     <<program-info-table,Program Object Queries>> table.
   * _param_value_ is a pointer to memory where the appropriate result being
     queried is returned.
     If _param_value_ is `NULL`, it is ignored.
   * _param_value_size_ is used to specify the size in bytes of memory pointed to
     by _param_value_.
     This size must be {geq} size of return type as described in the
     <<program-info-table,Program Object Queries>> table.
   * _param_value_size_ret_ returns the actual size in bytes of data being
     queried by _param_name_.
     If _param_value_size_ret_ is `NULL`, it is ignored.
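
The parameters above support a common two-call convention: query the size first, then fetch the data.
The following is a minimal sketch of that convention, not normative text.
`query_fn`, `query_alloc` and `fake_source_query` are hypothetical names; `query_fn` only mimics the trailing parameters of {clGetProgramInfo} so that the sketch compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in with the same trailing parameters as
 * clGetProgramInfo: (param_value_size, param_value, param_value_size_ret).
 * Returns 0 on success, non-zero on error. */
typedef int (*query_fn)(void *ctx, size_t param_value_size,
                        void *param_value, size_t *param_value_size_ret);

/* Two-call pattern: ask for the size, allocate, then fetch the data.
 * Returns a malloc'd buffer (caller frees) or NULL on failure. */
static void *query_alloc(query_fn query, void *ctx, size_t *size_out)
{
    size_t size = 0;
    void *buf;

    assert(query != NULL);

    /* Call 1: param_value is NULL, so only the required size is returned. */
    if (query(ctx, 0, NULL, &size) != 0)
        return NULL;

    buf = malloc(size ? size : 1);
    if (buf == NULL)
        return NULL;

    /* Call 2: the buffer is at least as large as the queried result. */
    if (query(ctx, size, buf, NULL) != 0) {
        free(buf);
        return NULL;
    }
    if (size_out != NULL)
        *size_out = size;
    return buf;
}

/* Fake query used only to exercise the sketch: ctx is a C string that is
 * returned with its null terminator, as CL_PROGRAM_SOURCE would be. */
static int fake_source_query(void *ctx, size_t param_value_size,
                             void *param_value, size_t *param_value_size_ret)
{
    const char *src = (const char *)ctx;
    size_t needed = strlen(src) + 1;

    if (param_value != NULL) {
        if (param_value_size < needed)
            return -30; /* same value as CL_INVALID_VALUE */
        memcpy(param_value, src, needed);
    }
    if (param_value_size_ret != NULL)
        *param_value_size_ret = needed;
    return 0;
}
```

With the real API, a thin wrapper that forwards to {clGetProgramInfo} for a fixed _program_ and _param_name_ would take the place of `fake_source_query`.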

 [[program-info-table]]
 .List of supported param_names by <<clGetProgramInfo>>
 [width="100%",cols="<33%,<17%,<50%",options="header"]
 |====
 | Program Info | Return Type | Description
 | {CL_PROGRAM_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}]

 include::{generated}/api/version-notes/CL_PROGRAM_REFERENCE_COUNT.asciidoc[]
   | {cl_uint_TYPE}
       | Return the _program_ reference count.
 | {CL_PROGRAM_CONTEXT_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_CONTEXT.asciidoc[]
   | {cl_context_TYPE}
       | Return the context specified when the program object is created.
 | {CL_PROGRAM_NUM_DEVICES_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_NUM_DEVICES.asciidoc[]
   | {cl_uint_TYPE}
       | Return the number of devices associated with _program_.
 | {CL_PROGRAM_DEVICES_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_DEVICES.asciidoc[]
   | {cl_device_id_TYPE}[]
       | Return the list of devices associated with the program object.
         This can be the devices associated with the context on which the program
         object has been created or can be a subset of devices that are
         specified when a program object is created using
         {clCreateProgramWithBinary}.
 | {CL_PROGRAM_SOURCE_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_SOURCE.asciidoc[]
   | {char_TYPE}[]
       | Return the program source code specified by
         {clCreateProgramWithSource}.
         The source string returned is a concatenation of all source strings
         specified to {clCreateProgramWithSource} with a null terminator.
         The concatenation strips any nulls in the original source strings.

         If _program_ is created using {clCreateProgramWithBinary},
         {clCreateProgramWithIL} or {clCreateProgramWithBuiltInKernels}, a
         null string or the appropriate program source code is returned
         depending on whether or not the program source code is stored in the
         binary.

         The actual number of characters that represents the program source
         code including the null terminator is returned in
         _param_value_size_ret_.
 | {CL_PROGRAM_IL_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_IL.asciidoc[]
 Also see extension *cl_khr_il_program*.
   | {char_TYPE}[]
       | Returns the program IL for programs created with
         {clCreateProgramWithIL}.

         If _program_ is created with {clCreateProgramWithSource},
         {clCreateProgramWithBinary} or {clCreateProgramWithBuiltInKernels}
         the memory pointed to by _param_value_ will be unchanged and
         _param_value_size_ret_ will be set to 0.
 | {CL_PROGRAM_BINARY_SIZES_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_BINARY_SIZES.asciidoc[]
   | {size_t_TYPE}[]
       | Returns an array that contains the size in bytes of the program
         binary (could be an executable binary, compiled binary or library
         binary) for each device associated with _program_.
         The size of the array is the number of devices associated with
         _program_.
         If a binary is not available for a device, a size of zero is
         returned for that device.

         If _program_ is created using {clCreateProgramWithBuiltInKernels},
         the implementation may return zero in any entries of the returned
         array.
 | {CL_PROGRAM_BINARIES_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_BINARIES.asciidoc[]
   | {unsigned_char_TYPE}*[]
       | Return the program binaries (could be an executable binary, compiled
         binary or library binary) for all devices associated with _program_.
         For each device in _program_, the binary returned can be the binary
         specified for the device when _program_ is created with
         {clCreateProgramWithBinary} or it can be the executable binary
         generated by {clBuildProgram} or {clLinkProgram}.
         If _program_ is created with {clCreateProgramWithSource} or
         {clCreateProgramWithIL}, the binary returned is the binary generated
         by {clBuildProgram}, {clCompileProgram} or {clLinkProgram}.
         The bits returned can be an implementation-specific intermediate
         representation (a.k.a. IR) or device specific executable bits or
         both.
         The decision on which information is returned in the binary is up to
         the OpenCL implementation.

         _param_value_ points to an array of `n` pointers allocated by the
         caller, where `n` is the number of devices associated with _program_.
         The buffer sizes needed to allocate the memory that these `n`
         pointers refer to can be queried using the {CL_PROGRAM_BINARY_SIZES}
         query as described in this table.

         Each entry in this array is used by the implementation as the
         location in memory where to copy the program binary for a specific
         device, if there is a binary available.
         To find out which device the program binary in the array refers to,
         use the {CL_PROGRAM_DEVICES} query to get the list of devices.
         There is a one-to-one correspondence between the array of `n` pointers
         returned by {CL_PROGRAM_BINARIES} and the array of devices returned by
         {CL_PROGRAM_DEVICES}.
 | {CL_PROGRAM_NUM_KERNELS_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_NUM_KERNELS.asciidoc[]
   | {size_t_TYPE}
       | Returns the number of kernels declared in _program_ that can be
         created with {clCreateKernel}.
         This information is only available after a successful program
         executable has been built for at least one device in the list of
         devices associated with _program_.
 | {CL_PROGRAM_KERNEL_NAMES_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_KERNEL_NAMES.asciidoc[]
   | {char_TYPE}[]
       | Returns a semicolon-separated list of kernel names in _program_
         that can be created with {clCreateKernel}.
         This information is only available after a successful program
         executable has been built for at least one device in the list of
         devices associated with _program_.
 | {CL_PROGRAM_SCOPE_GLOBAL_CTORS_PRESENT_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_SCOPE_GLOBAL_CTORS_PRESENT.asciidoc[]
   | {cl_bool_TYPE}
       | This indicates that the _program_ object contains non-trivial
         constructor(s) that will be executed by the runtime before any kernel
         from the program is executed.
         This information is only available after a successful program
         executable has been built for at least one device in the list of
         devices associated with _program_.

         Querying {CL_PROGRAM_SCOPE_GLOBAL_CTORS_PRESENT} may unconditionally
         return {CL_FALSE} if no devices associated with _program_ support
         constructors for program scope global variables.
         Support for constructors and destructors for program scope global
         variables is required only for OpenCL 2.2 devices.
 | {CL_PROGRAM_SCOPE_GLOBAL_DTORS_PRESENT_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_SCOPE_GLOBAL_DTORS_PRESENT.asciidoc[]
   | {cl_bool_TYPE}
       | This indicates that the program object contains non-trivial
         destructor(s) that will be executed by the runtime when _program_ is
         destroyed.
         This information is only available after a successful program
         executable has been built for at least one device in the list of
         devices associated with _program_.

         Querying {CL_PROGRAM_SCOPE_GLOBAL_DTORS_PRESENT} may unconditionally
         return {CL_FALSE} if no devices associated with _program_ support
         destructors for program scope global variables.
         Support for constructors and destructors for program scope global
         variables is required only for OpenCL 2.2 devices.
 |====
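
The {CL_PROGRAM_BINARIES} entry asks the caller to allocate an array of `n` pointers whose buffers are sized from {CL_PROGRAM_BINARY_SIZES}.
The following minimal sketch shows only that host-side allocation step.
`alloc_binary_buffers` and `free_binary_buffers` are hypothetical helper names, and leaving the entry `NULL` for a device whose reported size is zero is a choice made by this sketch, not something the table above states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Frees an array of n binary buffers and the array itself. */
static void free_binary_buffers(unsigned char **binaries, size_t n)
{
    size_t i;

    if (binaries == NULL)
        return;
    for (i = 0; i < n; ++i)
        free(binaries[i]); /* free(NULL) is a no-op */
    free(binaries);
}

/* Allocates the array of n pointers that would be passed as param_value
 * for CL_PROGRAM_BINARIES. sizes[i] is the size reported by
 * CL_PROGRAM_BINARY_SIZES for device i; zero means no binary is available.
 * Returns NULL on allocation failure. */
static unsigned char **alloc_binary_buffers(const size_t *sizes, size_t n)
{
    unsigned char **binaries;
    size_t i;

    assert(sizes != NULL && n > 0);

    /* calloc zero-initializes the array, so unused entries stay NULL. */
    binaries = (unsigned char **)calloc(n, sizeof *binaries);
    if (binaries == NULL)
        return NULL;

    for (i = 0; i < n; ++i) {
        if (sizes[i] == 0)
            continue; /* no binary for this device */
        binaries[i] = (unsigned char *)malloc(sizes[i]);
        if (binaries[i] == NULL) {
            free_binary_buffers(binaries, n);
            return NULL;
        }
    }
    return binaries;
}
```

Because the pointer array and the {CL_PROGRAM_DEVICES} list correspond one-to-one, index `i` in the returned array refers to device `i` in that list.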

 // refError

 {clGetProgramInfo} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
     specified by _param_value_size_ is < size of return type as described in
     the <<program-info-table,Program Object Queries>> table and
     _param_value_ is not `NULL`.
   * {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
   * {CL_INVALID_PROGRAM_EXECUTABLE} if _param_name_ is
     {CL_PROGRAM_NUM_KERNELS}, {CL_PROGRAM_KERNEL_NAMES},
     {CL_PROGRAM_SCOPE_GLOBAL_CTORS_PRESENT}, or
     {CL_PROGRAM_SCOPE_GLOBAL_DTORS_PRESENT} and a successful program executable
     has not been built for at least one device in the list of devices
     associated with _program_.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
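
The string returned for {CL_PROGRAM_KERNEL_NAMES} is a semicolon-separated list, so an application typically splits it before calling {clCreateKernel}.
The following minimal sketch assumes the list has no trailing separator and no surrounding whitespace; `count_kernel_names` and `get_kernel_name` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts the names in a semicolon-separated list.
 * An empty string holds no names. */
static size_t count_kernel_names(const char *names)
{
    size_t count;
    const char *p;

    assert(names != NULL);
    if (names[0] == '\0')
        return 0;

    count = 1;
    for (p = names; *p != '\0'; ++p)
        if (*p == ';')
            ++count;
    return count;
}

/* Copies the index-th name into out (capacity out_size bytes) and
 * null-terminates it. Returns 0 on success, -1 if index is out of range
 * or out is too small. */
static int get_kernel_name(const char *names, size_t index,
                           char *out, size_t out_size)
{
    const char *start;
    const char *end;
    size_t len;

    assert(names != NULL && out != NULL);
    if (names[0] == '\0')
        return -1;

    start = names;
    while (index > 0) {
        start = strchr(start, ';');
        if (start == NULL)
            return -1;
        ++start; /* skip the separator */
        --index;
    }

    end = strchr(start, ';');
    len = (end != NULL) ? (size_t)(end - start) : strlen(start);
    if (len + 1 > out_size)
        return -1;

    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```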
 --

 [open,refpage='clGetProgramBuildInfo',desc='Returns build information for each device in the program object.',type='protos']
 --
 To return build information for each device in the program object, call the
 function

 include::{generated}/api/protos/clGetProgramBuildInfo.txt[]
 include::{generated}/api/version-notes/clGetProgramBuildInfo.asciidoc[]

   * _program_ specifies the program object being queried.
   * _device_ specifies the device for which build information is being queried.
     _device_ must be a valid device associated with _program_.
   * _param_name_ specifies the information to query.
     The list of supported _param_name_ types and the information returned in
     _param_value_ by {clGetProgramBuildInfo} is described in the
     <<program-build-info-table,Program Build Queries>> table.
   * _param_value_ is a pointer to memory where the appropriate result being
     queried is returned.
     If _param_value_ is `NULL`, it is ignored.
   * _param_value_size_ is used to specify the size in bytes of memory pointed to
     by _param_value_.
     This size must be {geq} size of return type as described in the
     <<program-build-info-table,Program Build Queries>> table.
   * _param_value_size_ret_ returns the actual size in bytes of data being
     queried by _param_name_.
     If _param_value_size_ret_ is `NULL`, it is ignored.

 [[program-build-info-table]]
 .List of supported param_names by <<clGetProgramBuildInfo>>
 [width="100%",cols="<33%,<17%,<50%",options="header"]
 |====
 | Program Build Info | Return Type | Description
 | {CL_PROGRAM_BUILD_STATUS_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_BUILD_STATUS.asciidoc[]
   | {cl_build_status_TYPE}
       | Returns the build, compile or link status, whichever was performed
         last on the specified _program_ object for _device_.

         This can be one of the following:

         {CL_BUILD_NONE_anchor} - The build status returned if no {clBuildProgram},
         {clCompileProgram} or {clLinkProgram} has been performed on the
         specified _program_ object for _device_.

         {CL_BUILD_ERROR_anchor} - The build status returned if {clBuildProgram},
         {clCompileProgram} or {clLinkProgram} - whichever was performed last
         on the specified _program_ object for _device_ - generated an error.

         {CL_BUILD_SUCCESS_anchor} - The build status returned if {clBuildProgram},
         {clCompileProgram} or {clLinkProgram} - whichever was performed last
         on the specified _program_ object for _device_ - was successful.

         {CL_BUILD_IN_PROGRESS_anchor} - The build status returned if
         {clBuildProgram}, {clCompileProgram} or {clLinkProgram} - whichever
         was performed last on the specified _program_ object for _device_ - has
         not finished.
 | {CL_PROGRAM_BUILD_OPTIONS_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_BUILD_OPTIONS.asciidoc[]
   | {char_TYPE}[]
       | Return the build, compile or link options specified by the options
         argument in {clBuildProgram}, {clCompileProgram} or {clLinkProgram},
         whichever was performed last on the specified _program_ object for
         _device_.

         If build status of the specified _program_ for _device_ is
         {CL_BUILD_NONE}, an empty string is returned.
 | {CL_PROGRAM_BUILD_LOG_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_BUILD_LOG.asciidoc[]
   | {char_TYPE}[]
       | Return the build, compile or link log for {clBuildProgram},
         {clCompileProgram} or {clLinkProgram}, whichever was performed last
         on the specified _program_ object for _device_.

         If build status of the specified _program_ for _device_ is
         {CL_BUILD_NONE}, an empty string is returned.
 | {CL_PROGRAM_BINARY_TYPE_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_BINARY_TYPE.asciidoc[]
   | {cl_program_binary_type_TYPE}
       | Return the program binary type for _device_.
         This can be one of the following values:

         {CL_PROGRAM_BINARY_TYPE_NONE_anchor} - There is no binary associated
         with the specified _program_ object for _device_.

         {CL_PROGRAM_BINARY_TYPE_COMPILED_OBJECT_anchor} - A compiled binary is
         associated with _device_.
         This is the case when the specified _program_ object was created using
         {clCreateProgramWithSource} and compiled using {clCompileProgram}, or
         when a compiled binary was loaded using {clCreateProgramWithBinary}.

         {CL_PROGRAM_BINARY_TYPE_LIBRARY_anchor} - A library binary is
         associated with _device_.
         This is the case when the specified _program_ object was linked by
         {clLinkProgram} using the `-create-library` link option, or when a
         compiled library binary was loaded using {clCreateProgramWithBinary}.

         {CL_PROGRAM_BINARY_TYPE_EXECUTABLE_anchor} - An executable binary is
         associated with _device_.
         This is the case when the specified _program_ object was linked by
         {clLinkProgram} without the `-create-library` link option, or when an
         executable binary was built using {clBuildProgram}.
 | {CL_PROGRAM_BUILD_GLOBAL_VARIABLE_TOTAL_SIZE_anchor}

 include::{generated}/api/version-notes/CL_PROGRAM_BUILD_GLOBAL_VARIABLE_TOTAL_SIZE.asciidoc[]
   | {size_t_TYPE}
       | The total amount of storage, in bytes, used by program variables in
         the global address space.
 |====
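
The four build status values can be handled with a small host-side helper.
The following is a minimal sketch; `build_status_name` and `build_has_finished` are hypothetical names, and the numeric values are repeated from the Khronos `cl.h` header only so that the sketch compiles standalone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Values as defined in cl.h; guarded so the real header takes precedence. */
#ifndef CL_BUILD_SUCCESS
#define CL_BUILD_SUCCESS       0
#define CL_BUILD_NONE         -1
#define CL_BUILD_ERROR        -2
#define CL_BUILD_IN_PROGRESS  -3
#endif

/* Maps a build status to its symbolic name, or NULL if it is unknown. */
static const char *build_status_name(int status)
{
    switch (status) {
    case CL_BUILD_NONE:        return "CL_BUILD_NONE";
    case CL_BUILD_ERROR:       return "CL_BUILD_ERROR";
    case CL_BUILD_SUCCESS:     return "CL_BUILD_SUCCESS";
    case CL_BUILD_IN_PROGRESS: return "CL_BUILD_IN_PROGRESS";
    default:                   return NULL;
    }
}

/* Returns 1 if the last build, compile or link has finished, that is,
 * it either succeeded or generated an error. */
static int build_has_finished(int status)
{
    return status == CL_BUILD_SUCCESS || status == CL_BUILD_ERROR;
}
```

An application might use `build_has_finished` to decide when to read {CL_PROGRAM_BUILD_LOG}, which is an empty string while the status is {CL_BUILD_NONE}.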

 // refError

 {clGetProgramBuildInfo} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_DEVICE} if _device_ is not in the list of devices associated
     with _program_.
   * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
     specified by _param_value_size_ is < size of return type as described in
     the <<program-build-info-table,Program Build Queries>> table and
     _param_value_ is not `NULL`.
   * {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
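
The value returned for {CL_PROGRAM_BINARY_TYPE} indicates what can be done next with the program for a device.
The following minimal sketch classifies the value; `binary_is_executable` and `binary_is_link_input` are hypothetical names, and the numeric values are repeated from the Khronos `cl.h` header only so that the sketch compiles standalone.

```c
#include <assert.h>

/* Values as defined in cl.h; guarded so the real header takes precedence. */
#ifndef CL_PROGRAM_BINARY_TYPE_NONE
#define CL_PROGRAM_BINARY_TYPE_NONE            0x0
#define CL_PROGRAM_BINARY_TYPE_COMPILED_OBJECT 0x1
#define CL_PROGRAM_BINARY_TYPE_LIBRARY         0x2
#define CL_PROGRAM_BINARY_TYPE_EXECUTABLE      0x4
#endif

/* Returns 1 if an executable binary is associated with the device. */
static int binary_is_executable(unsigned int binary_type)
{
    return binary_type == CL_PROGRAM_BINARY_TYPE_EXECUTABLE;
}

/* Returns 1 if the binary is a compiled object or a library, that is,
 * the kind of binary that clLinkProgram consumes as an input. */
static int binary_is_link_input(unsigned int binary_type)
{
    return binary_type == CL_PROGRAM_BINARY_TYPE_COMPILED_OBJECT ||
           binary_type == CL_PROGRAM_BINARY_TYPE_LIBRARY;
}
```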

 [NOTE]
 ====
 A program binary (compiled binary, library binary or executable binary)
 built for a parent device can be used by all its sub-devices.
 If a program binary has not been built for a sub-device, the program binary
 associated with the parent device will be used.

 A program binary for a device specified with {clCreateProgramWithBinary} or
 queried using {clGetProgramInfo} can be used as the binary for the
 associated root device, and all sub-devices created from the root-level
 device or sub-devices thereof.
 ====
 --


 == Kernel Objects

 A kernel is a function declared in a program.
 A kernel is identified by the `+__kernel+` qualifier applied to any function
 in a program.
 A kernel object encapsulates the specific `+__kernel+` function declared in
 a program and the argument values to be used when executing this
 `+__kernel+` function.


 === Creating Kernel Objects

 [open,refpage='clCreateKernel',desc='Creates a kernel object.',type='protos']
 --
 To create a kernel object, use the function

 include::{generated}/api/protos/clCreateKernel.txt[]
 include::{generated}/api/version-notes/clCreateKernel.asciidoc[]

   * _program_ is a program object with a successfully built executable.
   * _kernel_name_ is a function name in the program declared with the
     `+__kernel+` qualifier.
   * _errcode_ret_ will return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.

 // refError

 {clCreateKernel} returns a valid non-zero kernel object and _errcode_ret_ is
 set to {CL_SUCCESS} if the kernel object is created successfully.
 Otherwise, it returns a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
   * {CL_INVALID_PROGRAM_EXECUTABLE} if there is no successfully built
     executable for _program_.
   * {CL_INVALID_KERNEL_NAME} if _kernel_name_ is not found in _program_.
   * {CL_INVALID_KERNEL_DEFINITION} if the function definition for the
     `+__kernel+` function given by _kernel_name_, such as the number of
     arguments or the argument types, is not the same for all devices for which
     the _program_ executable has been built.
   * {CL_INVALID_VALUE} if _kernel_name_ is `NULL`.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clCreateKernelsInProgram',desc='Creates kernel objects for all kernel functions in a program object.',type='protos']
 --
 To create kernel objects for all kernel functions in a program,
 call the function

 include::{generated}/api/protos/clCreateKernelsInProgram.txt[]
 include::{generated}/api/version-notes/clCreateKernelsInProgram.asciidoc[]

   * _program_ is a program object with a successfully built executable.
   * _num_kernels_ is the size of memory pointed to by _kernels_ specified as the
     number of {cl_kernel_TYPE} entries.
   * _kernels_ is the buffer where the kernel objects for kernels in _program_
     will be returned.
     If _kernels_ is `NULL`, it is ignored.
     If _kernels_ is not `NULL`, _num_kernels_ must be greater than or equal to
     the number of kernels in _program_.
   * _num_kernels_ret_ is the number of kernels in _program_.
     If _num_kernels_ret_ is `NULL`, it is ignored.

 Kernel objects are not created for any `+__kernel+` functions in _program_
 that do not have the same function definition across all devices for which a
 program executable has been successfully built.

 Kernel objects can only be created once you have a program object with a
 valid program source or binary loaded into the program object and the
 program executable has been successfully built for one or more devices
 associated with _program_.
 No changes to the program executable are allowed while there are kernel
 objects associated with a program object.
 This means that calls to {clBuildProgram} and {clCompileProgram} return
 {CL_INVALID_OPERATION} if there are kernel objects attached to a program
 object.
 The OpenCL context associated with _program_ will be the context associated
 with _kernel_.
 The devices associated with _program_ are the devices associated
 with _kernel_.
 Devices associated with a program object for which a valid program
 executable has been built can be used to execute kernels declared in the
 program object.

 // refError

 {clCreateKernelsInProgram} will return {CL_SUCCESS} if the kernel objects were
 successfully allocated.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
   * {CL_INVALID_PROGRAM_EXECUTABLE} if there is no successfully built
     executable for any device in _program_.
   * {CL_INVALID_VALUE} if _kernels_ is not `NULL` and _num_kernels_ is less
     than the number of kernels in _program_.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.

 --

 [open,refpage='clRetainKernel',desc='Increments the kernel object reference count.',type='protos']
 --
 To retain a kernel object, call the function

 include::{generated}/api/protos/clRetainKernel.txt[]
 include::{generated}/api/version-notes/clRetainKernel.asciidoc[]

   * _kernel_ is the kernel object to be retained.

 The _kernel_ reference count is incremented.

 // refError

 {clRetainKernel} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.

 {clCreateKernel} and {clCreateKernelsInProgram} perform an implicit retain.
 --

 [open,refpage='clReleaseKernel',desc='Decrements the kernel reference count.',type='protos']
 --
 To release a kernel object, call the function

 include::{generated}/api/protos/clReleaseKernel.txt[]
 include::{generated}/api/version-notes/clReleaseKernel.asciidoc[]

   * _kernel_ is the kernel object to be released.

 The _kernel_ reference count is decremented.

 The kernel object is deleted once the _kernel_ reference count becomes zero
 and the kernel object is no longer needed by any enqueued commands that use
 _kernel_.
 Using this function to release a reference that was not obtained by creating
 the object or by calling {clRetainKernel} causes undefined behavior.

 // refError

 {clReleaseKernel} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
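
The retain and release rules can be illustrated with a toy reference-counted object.
This is a sketch of the counting rule only: `toy_kernel` and its functions are hypothetical, are not OpenCL types or calls, and ignore the additional condition that enqueued commands may keep a real kernel object alive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy object, NOT an OpenCL type: models only the reference count. */
typedef struct toy_kernel {
    unsigned int ref_count;
    int *deleted_flag; /* set to 1 when the object is destroyed */
} toy_kernel;

/* Creation performs the implicit retain, so the count starts at 1. */
static toy_kernel *toy_create(int *deleted_flag)
{
    toy_kernel *k;

    assert(deleted_flag != NULL);
    k = (toy_kernel *)malloc(sizeof *k);
    if (k == NULL)
        return NULL;
    k->ref_count = 1;
    k->deleted_flag = deleted_flag;
    *deleted_flag = 0;
    return k;
}

/* Increments the reference count. */
static void toy_retain(toy_kernel *k)
{
    assert(k != NULL && k->ref_count > 0);
    ++k->ref_count;
}

/* Decrements the reference count and destroys the object at zero. */
static void toy_release(toy_kernel *k)
{
    assert(k != NULL && k->ref_count > 0);
    if (--k->ref_count == 0) {
        *k->deleted_flag = 1;
        free(k);
    }
}
```

Each `toy_retain` must be balanced by one `toy_release`, plus one release for the reference obtained at creation.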
 --


 === Setting Kernel Arguments

 To execute a kernel, the kernel arguments must be set.

 [open,refpage='clSetKernelArg',desc='Set the argument value for a specific argument of a kernel.',type='protos']
 --
 To set the argument value for a specific argument of a kernel, call the
 function

 include::{generated}/api/protos/clSetKernelArg.txt[]
 include::{generated}/api/version-notes/clSetKernelArg.asciidoc[]

   * _kernel_ is a valid kernel object.
   * _arg_index_ is the argument index.
     Arguments to the kernel are referred to by indices that go from 0 for the
     leftmost argument to _n_ - 1, where _n_ is the total number of arguments
     declared by a kernel (see below).
   * _arg_size_ specifies the size of the argument value.
     If the argument is a memory object, the _arg_size_ value must be equal to
     `sizeof({cl_mem_TYPE})`.
     For arguments declared with the `local` qualifier, the size specified will
     be the size in bytes of the buffer that must be allocated for the `local`
     argument.
     If the argument is of type _sampler_t_, the _arg_size_ value must be equal
     to `sizeof({cl_sampler_TYPE})`.
     If the argument is of type _queue_t_, the _arg_size_ value must be equal to
     `sizeof({cl_command_queue_TYPE})`.
     For all other arguments, the size will be the size of argument type.
   * _arg_value_ is a pointer to data that should be used as the argument value
     for argument specified by _arg_index_.
     The argument data pointed to by _arg_value_ is copied and the _arg_value_
     pointer can therefore be reused by the application after {clSetKernelArg}
     returns.
     The argument value specified is the value used by all API calls that enqueue
     _kernel_ ({clEnqueueNDRangeKernel} and {clEnqueueTask}) until the argument
     value is changed by a call to {clSetKernelArg} for _kernel_.

 For example, consider the following kernel:

 [source,c]
 ----
 kernel void image_filter (int n,
                           int m,
                           constant float *filter_weights,
                           read_only image2d_t src_image,
                           write_only image2d_t dst_image)
 {
 ...
 }
 ----

 Argument index values for `image_filter` will be 0 for `n`, 1 for `m`, 2 for
 `filter_weights`, 3 for `src_image` and 4 for `dst_image`.
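
The index and size rules for `image_filter` can be tabulated on the host.
The following minimal sketch lists the _arg_index_ and _arg_size_ that would accompany each argument; `arg_desc` and `find_image_filter_arg` are hypothetical names, and the two typedefs mirror those in the Khronos headers (on most non-Windows platforms) only so that the sketch compiles without them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins mirroring the Khronos headers; not needed when cl.h is used. */
typedef struct _cl_mem *cl_mem;
typedef int32_t cl_int;

/* Describes one kernel argument as it would be passed to clSetKernelArg. */
typedef struct arg_desc {
    unsigned int arg_index;
    const char  *name;
    size_t       arg_size;
} arg_desc;

/* Arguments of image_filter, leftmost first. Memory objects (the buffer
 * and both images) use sizeof(cl_mem); scalars use the size of their type. */
static const arg_desc image_filter_args[] = {
    { 0, "n",              sizeof(cl_int) },
    { 1, "m",              sizeof(cl_int) },
    { 2, "filter_weights", sizeof(cl_mem) },
    { 3, "src_image",      sizeof(cl_mem) },
    { 4, "dst_image",      sizeof(cl_mem) },
};

/* Returns the description for arg_index, or NULL if the index is not a
 * valid argument index for image_filter. */
static const arg_desc *find_image_filter_arg(unsigned int arg_index)
{
    size_t count = sizeof image_filter_args / sizeof image_filter_args[0];

    if (arg_index >= count)
        return NULL;
    assert(image_filter_args[arg_index].arg_index == arg_index);
    return &image_filter_args[arg_index];
}
```

With the real API, each row corresponds to one call of the form `clSetKernelArg(kernel, arg_index, arg_size, &value)`.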

 If the argument is a memory object (buffer, pipe, image or image array), the
 _arg_value_ entry will be a pointer to the appropriate buffer, pipe, image
 or image array object.
 The memory object must be created with the context associated with the
 kernel object.
 If the argument is a buffer object, the _arg_value_ pointer can be `NULL` or
 point to a `NULL` value in which case a `NULL` value will be used as the
 value for the argument declared as a pointer to `global` or `constant`
 memory in the kernel.
 If the argument is declared with the `local` qualifier, the _arg_value_
 entry must be `NULL`.
 If the argument is of type _sampler_t_, the _arg_value_ entry must be a
 pointer to the sampler object.
 If the argument is of type _queue_t_, the _arg_value_ entry must be a
 pointer to the device queue object.

 If the argument is declared to be a pointer to a built-in scalar or vector
 type, or a user defined structure type in the global or constant address
 space, the memory object specified as argument value must be a buffer object
 (or `NULL`).
 If the argument is declared with the `constant` qualifier, the size in bytes
 of the memory object cannot exceed {CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE} and
 the number of arguments declared as pointers to `constant` memory cannot
 exceed {CL_DEVICE_MAX_CONSTANT_ARGS}.

 The memory object specified as argument value must be a pipe object if the
 argument is declared with the _pipe_ qualifier.

 The memory object specified as argument value must be a 2D image object if
 the argument is declared to be of type _image2d_t_.
 The memory object specified as argument value must be a 2D image object with
 image channel order = {CL_DEPTH} if the argument is declared to be of type
 _image2d_depth_t_.
 The memory object specified as argument value must be a 3D image object if
 the argument is declared to be of type _image3d_t_.
 The memory object specified as argument value must be a 1D image object if
 the argument is declared to be of type _image1d_t_.
 The memory object specified as argument value must be a 1D image buffer
 object if the argument is declared to be of type _image1d_buffer_t_.
 The memory object specified as argument value must be a 1D image array
 object if the argument is declared to be of type _image1d_array_t_.
 The memory object specified as argument value must be a 2D image array
 object if the argument is declared to be of type _image2d_array_t_.
 The memory object specified as argument value must be a 2D image array
 object with image channel order = {CL_DEPTH} if the argument is declared to
 be of type _image2d_array_depth_t_.

 For all other kernel arguments, the _arg_value_ entry must be a pointer to
 the actual data to be used as argument value.

 [NOTE]
 ====
 A kernel object does not update the reference count for objects such as
 memory or sampler objects specified as argument values by {clSetKernelArg}.
 Users may not rely on a kernel object to retain objects specified as
 argument values to the kernel.

 Implementations shall not allow {cl_kernel_TYPE} objects to hold reference
 counts to {cl_kernel_TYPE} arguments, because no mechanism is provided for the
 user to tell the kernel to release that ownership right.
 If the kernel holds ownership rights on kernel args, that would make it
 impossible for users to tell with certainty when they may safely
 release user allocated resources associated with OpenCL objects such as
 the {cl_mem_TYPE} backing store used with {CL_MEM_USE_HOST_PTR}.
 ====

 // refError

 {clSetKernelArg} returns {CL_SUCCESS} if the function was executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
   * {CL_INVALID_ARG_INDEX} if _arg_index_ is not a valid argument index.
   * {CL_INVALID_ARG_VALUE} if _arg_value_ specified is not a valid value.
   * {CL_INVALID_MEM_OBJECT} for an argument declared to be a memory object
     when the specified _arg_value_ is not a valid memory object.
   * {CL_INVALID_SAMPLER} for an argument declared to be of type _sampler_t_
     when the specified _arg_value_ is not a valid sampler object.
   * {CL_INVALID_DEVICE_QUEUE} for an argument declared to be of type _queue_t_
     when the specified _arg_value_ is not a valid device queue object.
     This error code is <<unified-spec, missing before>> version 2.0.
   * {CL_INVALID_ARG_SIZE} if _arg_size_ does not match the size of the data
     type for an argument that is not a memory object or if the argument is a
     memory object and _arg_size_ != `sizeof({cl_mem_TYPE})` or if _arg_size_ is
     zero and the argument is declared with the `local` qualifier or if the
     argument is a sampler and _arg_size_ != `sizeof({cl_sampler_TYPE})`.
   * {CL_MAX_SIZE_RESTRICTION_EXCEEDED} if the size in bytes of the memory
     object (if the argument is a memory object) or _arg_size_ (if the
     argument is declared with `local` qualifier) exceeds a
     language-specified maximum size restriction for this argument, such as the
     *MaxByteOffset* SPIR-V decoration.
     This error code is <<unified-spec, missing before>> version 2.2.
   * {CL_INVALID_ARG_VALUE} if the argument is an image declared with the
     `read_only` qualifier and _arg_value_ refers to an image object created
     with _cl_mem_flags_ of {CL_MEM_WRITE_ONLY} or if the image argument is
     declared with the `write_only` qualifier and _arg_value_ refers to an
     image object created with _cl_mem_flags_ of {CL_MEM_READ_ONLY}.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
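
The access-qualifier rule that yields {CL_INVALID_ARG_VALUE} for image arguments can be written as a small predicate.
The following is a minimal sketch; `image_access` and `image_arg_flags_compatible` are hypothetical names, and the flag values are repeated from the Khronos `cl.h` header only so that the sketch compiles standalone.

```c
#include <assert.h>

/* Values as defined in cl.h; guarded so the real header takes precedence. */
#ifndef CL_MEM_READ_WRITE
#define CL_MEM_READ_WRITE (1 << 0)
#define CL_MEM_WRITE_ONLY (1 << 1)
#define CL_MEM_READ_ONLY  (1 << 2)
#endif

/* Hypothetical enum for the access qualifier of an image argument. */
typedef enum image_access {
    IMAGE_ACCESS_READ_ONLY,
    IMAGE_ACCESS_WRITE_ONLY
} image_access;

/* Returns 1 if an image created with mem_flags may be bound to an image
 * argument declared with the given access qualifier, 0 if the combination
 * is one of the two mismatches described above. */
static int image_arg_flags_compatible(image_access access,
                                      unsigned long long mem_flags)
{
    if (access == IMAGE_ACCESS_READ_ONLY && (mem_flags & CL_MEM_WRITE_ONLY))
        return 0;
    if (access == IMAGE_ACCESS_WRITE_ONLY && (mem_flags & CL_MEM_READ_ONLY))
        return 0;
    return 1;
}
```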
 --

 [open,refpage='clSetKernelArgSVMPointer',desc='Set a SVM pointer as the argument value for a specific argument of a kernel.',type='protos']
 --
 To set a SVM pointer as the argument value for a specific argument of a
 kernel, call the function

 include::{generated}/api/protos/clSetKernelArgSVMPointer.txt[]
 include::{generated}/api/version-notes/clSetKernelArgSVMPointer.asciidoc[]

   * _kernel_ is a valid kernel object.
   * _arg_index_ is the argument index.
     Arguments to the kernel are referred to by indices that go from 0 for the
     leftmost argument to _n_ - 1, where _n_ is the total number of arguments
     declared by a kernel.
   * _arg_value_ is the SVM pointer that should be used as the argument value for
     argument specified by _arg_index_.
     The SVM pointer specified is the value used by all API calls that enqueue
     _kernel_ ({clEnqueueNDRangeKernel} and {clEnqueueTask}) until the argument
     value is changed by a call to {clSetKernelArgSVMPointer} for _kernel_.
     The SVM pointer can only be used for arguments that are declared to be a
     pointer to `global` or `constant` memory.
     The SVM pointer value must be aligned according to the argument's type.
     For example, if the argument is declared to be `+global float4 *p+`, the SVM
     pointer value passed for `p` must be at a minimum aligned to a `float4`.
     The SVM pointer value specified as the argument value can be the pointer
     returned by {clSVMAlloc} or can be a pointer offset into the SVM region.

 // refError

 {clSetKernelArgSVMPointer} returns {CL_SUCCESS} if the function was executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
   * {CL_INVALID_OPERATION} if no devices in the context associated with _kernel_ support SVM.
   * {CL_INVALID_ARG_INDEX} if _arg_index_ is not a valid argument index.
   * {CL_INVALID_ARG_VALUE} if _arg_value_ specified is not a valid value.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
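
The alignment rule for _arg_value_ can be checked on the host before the pointer is handed to {clSetKernelArgSVMPointer}.
The following is a minimal sketch rather than part of the API: `svm_ptr_is_aligned` and `svm_elem_offset` are hypothetical helpers, and the 16-byte figure used in the comments assumes an argument declared as `+global float4 *p+`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an OpenCL API): returns 1 if ptr is aligned to
 * `alignment` bytes, 0 otherwise. `alignment` must be a non-zero power of
 * two, e.g. 16 for an argument declared as `global float4 *`. */
static int svm_ptr_is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}

/* Hypothetical helper: byte offset of element `index` in an array whose
 * elements are `elem_size` bytes each. Adding this offset to an aligned
 * base pointer yields a pointer that is still aligned to `elem_size`
 * when `elem_size` is a power of two. */
static size_t svm_elem_offset(size_t index, size_t elem_size)
{
    return index * elem_size;
}
```

A pointer that passes this check, whether it is the base pointer returned by {clSVMAlloc} or the base plus an element offset, would then be passed as _arg_value_.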

 [open,refpage='clSetKernelExecInfo',desc='Pass additional information other than argument values to a kernel.',type='protos']
 --
 To pass additional information other than argument values to a kernel, call
 the function

 include::{generated}/api/protos/clSetKernelExecInfo.txt[]
 include::{generated}/api/version-notes/clSetKernelExecInfo.asciidoc[]

   * _kernel_ specifies the kernel object to which the information is passed.
   * _param_name_ specifies the information to be passed to kernel.
     The list of supported _param_name_ types and the corresponding values passed
     in _param_value_ is described in the <<kernel-exec-info-table,Kernel
     Execution Properties>> table.
   * _param_value_size_ specifies the size in bytes of the memory pointed to by
     _param_value_.
   * _param_value_ is a pointer to memory where the appropriate values determined
     by _param_name_ are specified.

 [[kernel-exec-info-table]]
 .List of supported param_names by <<clSetKernelExecInfo>>
 [width="100%",cols="<33%,<17%,<50%",options="header"]
 |====
 | Kernel Exec Info | Type | Description
 | {CL_KERNEL_EXEC_INFO_SVM_PTRS_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_EXEC_INFO_SVM_PTRS.asciidoc[]
   | {void_TYPE}*[]
       | SVM pointers must reference locations contained entirely within
         buffers that are passed to kernel as arguments, or that are passed
         through the execution information.

         Non-argument SVM buffers must be specified by passing pointers to
         those buffers via {clSetKernelExecInfo} for coarse-grain and
         fine-grain buffer SVM allocations but not for fine-grain system SVM
         allocations.
 | {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM.asciidoc[]
   | {cl_bool_TYPE}
       | This flag indicates whether the kernel uses pointers that are
         fine-grain system SVM allocations.
         These fine-grain system SVM pointers may be passed as arguments or
         defined in SVM buffers that are passed as arguments to _kernel_.
 |====

 // refError

 {clSetKernelExecInfo} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
   * {CL_INVALID_OPERATION} if no devices in the context associated with _kernel_ support SVM.
   * {CL_INVALID_VALUE} if _param_name_ is not valid, if _param_value_ is
     `NULL` or if the size specified by _param_value_size_ is not valid.
   * {CL_INVALID_OPERATION} if _param_name_ is
     {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} and _param_value_ is {CL_TRUE}
     but no devices in context associated with _kernel_ support fine-grain
     system SVM allocations.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [NOTE]
 ====
 Coarse-grain or fine-grain buffer SVM pointers used by a kernel which
 are not passed as kernel arguments must be specified using
 {clSetKernelExecInfo} with {CL_KERNEL_EXEC_INFO_SVM_PTRS}.
 For example, if SVM buffer A contains a pointer to another SVM buffer B,
 and the kernel dereferences that pointer, then a pointer to B must
 either be passed as an argument in the call to that kernel or it must be
 made available to the kernel using {clSetKernelExecInfo}.
 For example, we might pass extra SVM pointers as follows:

 [source,c]
 ----
 clSetKernelExecInfo(kernel,
                     CL_KERNEL_EXEC_INFO_SVM_PTRS,
                     num_ptrs * sizeof(void *),
                     extra_svm_ptr_list);
 ----

 Here `num_ptrs` specifies the number of additional SVM pointers while
 `extra_svm_ptr_list` specifies a pointer to memory containing those SVM
 pointers.

 When calling {clSetKernelExecInfo} with {CL_KERNEL_EXEC_INFO_SVM_PTRS} to
 specify pointers to non-argument SVM buffers as extra arguments to a kernel,
 each of these pointers can be the SVM pointer returned by {clSVMAlloc} or
 can be a pointer + offset into the SVM region.
 It is sufficient to provide one pointer for each SVM buffer used.

 {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} is used to indicate whether
 SVM pointers used by a kernel will refer to system allocations or not.

 {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} = {CL_FALSE} indicates that the
 OpenCL implementation may assume that system pointers are not passed as
 kernel arguments and are not stored inside SVM allocations passed as kernel
 arguments.

 {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} = {CL_TRUE} indicates that the
 OpenCL implementation must assume that system pointers might be passed as
 kernel arguments and/or stored inside SVM allocations passed as kernel
 arguments.
 In this case, if the device to which the kernel is enqueued does not support
 system SVM pointers, {clEnqueueNDRangeKernel} and {clEnqueueTask} will return a
 {CL_INVALID_OPERATION} error.
 If none of the devices in the context associated with kernel support
 fine-grain system SVM allocations, {clSetKernelExecInfo} will return a
 {CL_INVALID_OPERATION} error.

 If {clSetKernelExecInfo} has not been called with a value for
 {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM}, the default value is used for
 this kernel attribute.
 The default value depends on whether the device on which the kernel is
 enqueued supports fine-grain system SVM allocations.
 If so, the default value used is {CL_TRUE} (system pointers might be passed);
 otherwise, the default is {CL_FALSE}.

 A call to {clSetKernelExecInfo} for a given value of _param_name_
 replaces any prior value passed for that value of _param_name_.
 Only one _param_value_ will be stored for each value of _param_name_.
 ====
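
The default-value rule for {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} and the resulting enqueue behavior can be modeled in a few lines.
This is only an illustrative sketch of the rules stated in the note above: the type `fgs_setting` and both functions are hypothetical names, and whether a device supports fine-grain system SVM is passed in as a plain flag instead of being queried from the runtime.

```c
#include <assert.h>

/* Hypothetical tri-state describing what the application has told the
 * runtime. None of these names exist in the OpenCL API. */
typedef enum {
    FGS_UNSET = -1,  /* clSetKernelExecInfo never called for this param */
    FGS_FALSE = 0,   /* called with CL_FALSE */
    FGS_TRUE  = 1    /* called with CL_TRUE */
} fgs_setting;

/* Value the runtime uses for the kernel on a given device: the explicit
 * setting if there is one, otherwise a default that follows the device's
 * support for fine-grain system SVM allocations. */
static int effective_fine_grain_system(fgs_setting setting,
                                       int device_supports_system_svm)
{
    assert(setting == FGS_UNSET || setting == FGS_FALSE || setting == FGS_TRUE);
    if (setting == FGS_UNSET)
        return device_supports_system_svm != 0;
    return setting == FGS_TRUE;
}

/* Returns 1 when, under the rules in the note, an enqueue on this device
 * would be rejected with CL_INVALID_OPERATION. */
static int enqueue_would_be_rejected(fgs_setting setting,
                                     int device_supports_system_svm)
{
    return effective_fine_grain_system(setting, device_supports_system_svm)
           && !device_supports_system_svm;
}
```

Under this model, only an explicit {CL_TRUE} on a device without system SVM support leads to a rejected enqueue; the default can never do so because it follows the device's capability.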


 === Copying Kernel Objects

 NOTE: Copying kernel objects is <<unified-spec, missing before>> version 2.1.

 [open,refpage='clCloneKernel',desc='Make a shallow copy of the kernel object.',type='protos']
 --
 To clone a kernel object, call the function

 include::{generated}/api/protos/clCloneKernel.txt[]
 include::{generated}/api/version-notes/clCloneKernel.asciidoc[]

   * _source_kernel_ is a valid {cl_kernel_TYPE} object that will be copied.
     _source_kernel_ will not be modified in any way by this function.
   * _errcode_ret_ will be assigned an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.

 Cloning is used to make a shallow copy of the kernel object, its arguments
 and any information passed to the kernel object using {clSetKernelExecInfo}.
 If the kernel object was ready to be enqueued before copying it, the clone
 of the kernel object is ready to enqueue.

 The returned kernel object is an exact copy of _source_kernel_, with one
 caveat: the reference count on the returned kernel object is set as if it
 had been returned by {clCreateKernel}.
 The reference count of _source_kernel_ will not be changed.

 The resulting kernel will be in the same state as if {clCreateKernel} had
 been called with the same arguments as those used to create
 _source_kernel_, and then the latest call to {clSetKernelArg} or
 {clSetKernelArgSVMPointer} for each argument index and the last call to
 {clSetKernelExecInfo} for each value of _param_name_ made on
 _source_kernel_ had been applied to the new kernel object.

 All arguments of the new kernel object must be intact, and the new kernel
 object may be correctly used in the same situations as _source_kernel_
 except those that assume a pre-existing reference count.
 Setting arguments on the new kernel object will not affect _source_kernel_
 except insofar as the argument points to a shared underlying entity and in
 that situation behavior is as if two kernel objects had been created and the
 same argument applied to each.
 Only the data stored in the kernel object is copied; data referenced by the
 kernel's arguments are not copied.
 For example, if a buffer or pointer argument is set on a kernel object, the
 pointer is copied but the underlying memory allocation is not.

 // refError

 {clCloneKernel} returns a valid non-zero kernel object and _errcode_ret_ is
 set to {CL_SUCCESS} if the kernel is successfully copied.
 Otherwise it returns a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_KERNEL} if _source_kernel_ is not a valid kernel object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
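
The shallow-copy semantics of {clCloneKernel} can be pictured with a toy model.
The sketch below is not how any implementation stores kernels: `toy_kernel`, `toy_clone` and `TOY_MAX_ARGS` are hypothetical, and the model only keeps a reference count and a table of argument pointers.
It shows the three properties described above: argument values are copied but the memory they refer to is not, the clone starts with a fresh reference count, and later changes to the clone do not affect the source.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ARGS 4  /* arbitrary size chosen for this sketch */

/* Hypothetical stand-in for the state held by a kernel object. */
typedef struct {
    unsigned ref_count;         /* 1 right after creation */
    void *args[TOY_MAX_ARGS];   /* argument values (pointers only) */
} toy_kernel;

/* Shallow copy: the pointers stored in args[] are duplicated, the memory
 * they point to is shared. The clone's reference count is reset as if it
 * had just been created; the source is left untouched. */
static toy_kernel toy_clone(const toy_kernel *src)
{
    toy_kernel copy;
    assert(src != NULL);
    copy = *src;
    copy.ref_count = 1;
    return copy;
}
```

A typical use of the real function is to give each host thread its own clone, so that threads can set different arguments without synchronizing on a single kernel object.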


 === Kernel Object Queries

 [open,refpage='clGetKernelInfo',desc='Returns information about the kernel object.',type='protos']
 --
 To return information about a kernel object, call the function

 include::{generated}/api/protos/clGetKernelInfo.txt[]
 include::{generated}/api/version-notes/clGetKernelInfo.asciidoc[]

   * _kernel_ specifies the kernel object being queried.
   * _param_name_ specifies the information to query.
     The list of supported _param_name_ types and the information returned in
     _param_value_ by {clGetKernelInfo} is described in the
     <<kernel-info-table,Kernel Object Queries>> table.
   * _param_value_ is a pointer to memory where the appropriate result being
     queried is returned.
     If _param_value_ is `NULL`, it is ignored.
   * _param_value_size_ is used to specify the size in bytes of memory pointed to
     by _param_value_.
     This size must be {geq} size of return type as described in the
     <<kernel-info-table,Kernel Object Queries>> table.
   * _param_value_size_ret_ returns the actual size in bytes of data being
     queried by _param_name_.
     If _param_value_size_ret_ is `NULL`, it is ignored.

 [[kernel-info-table]]
 .List of supported param_names by <<clGetKernelInfo>>
 [width="100%",cols="<33%,<17%,<50%",options="header"]
 |====
 | Kernel Info | Return Type | Description
 | {CL_KERNEL_FUNCTION_NAME_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_FUNCTION_NAME.asciidoc[]
   | {char_TYPE}[]
       | Return the kernel function name.
 | {CL_KERNEL_NUM_ARGS_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_NUM_ARGS.asciidoc[]
   | {cl_uint_TYPE}
       | Return the number of arguments to kernel.
 | {CL_KERNEL_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}]

 include::{generated}/api/version-notes/CL_KERNEL_REFERENCE_COUNT.asciidoc[]
   | {cl_uint_TYPE}
       | Return the _kernel_ reference count.
 | {CL_KERNEL_CONTEXT_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_CONTEXT.asciidoc[]
   | {cl_context_TYPE}
       | Return the context associated with _kernel_.
 | {CL_KERNEL_PROGRAM_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_PROGRAM.asciidoc[]
   | {cl_program_TYPE}
       | Return the program object associated with kernel.
 | {CL_KERNEL_ATTRIBUTES_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_ATTRIBUTES.asciidoc[]
   | {char_TYPE}[]
       | Returns any attributes specified using the `+__attribute__+`
         OpenCL C qualifier (or using an OpenCL {cpp} qualifier syntax [[]] )
         with the kernel function declaration in the program source.
         These attributes include attributes described in the earlier OpenCL
         C kernel language specifications and other attributes supported by
         an implementation.

         Attributes are returned as they were declared inside
         `+__attribute__((...))+`, with any surrounding whitespace and
         embedded newlines removed.
         When multiple attributes are present, they are returned as a single,
         space delimited string.

         For kernels not created from OpenCL C source and the
         {clCreateProgramWithSource} API call the string returned from this
         query will be empty.
 |====

 // refError

 {clGetKernelInfo} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
     specified by _param_value_size_ is < size of return type as described in
     the <<kernel-info-table,Kernel Object Queries>> table and _param_value_
     is not `NULL`.
   * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
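
The _param_value_, _param_value_size_ and _param_value_size_ret_ parameters support a common two-call idiom for variable-length results such as {CL_KERNEL_FUNCTION_NAME}: query the size first, allocate, then fetch the value.
The sketch below demonstrates the idiom against a toy stand-in so that it builds without an OpenCL SDK.
`toy_get_name` and `toy_fetch_name` are hypothetical, the return codes 0 and -1 stand in for {CL_SUCCESS} and {CL_INVALID_VALUE}, and the reported size is assumed to include the terminating NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a string query; not an OpenCL API. It follows the
 * parameter rules described above: a NULL param_value is ignored, a
 * non-NULL param_value with too small a size is an error, and the
 * required size is reported through param_value_size_ret if non-NULL. */
static int toy_get_name(const char *name, size_t param_value_size,
                        void *param_value, size_t *param_value_size_ret)
{
    size_t needed = strlen(name) + 1;  /* assumed to include the NUL */
    if (param_value != NULL) {
        if (param_value_size < needed)
            return -1;
        memcpy(param_value, name, needed);
    }
    if (param_value_size_ret != NULL)
        *param_value_size_ret = needed;
    return 0;
}

/* Two-call idiom: ask for the size, allocate, then fetch the value.
 * Returns a heap buffer the caller must free, or NULL on failure. */
static char *toy_fetch_name(const char *name)
{
    size_t size = 0;
    char *buf;
    assert(name != NULL);
    if (toy_get_name(name, 0, NULL, &size) != 0)
        return NULL;
    buf = malloc(size);
    if (buf == NULL)
        return NULL;
    if (toy_get_name(name, size, buf, NULL) != 0) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

With the real API the two `toy_get_name` calls would be calls to {clGetKernelInfo} with the same _param_name_.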

 [open,refpage='clGetKernelWorkGroupInfo',desc='Returns information about the kernel object that may be specific to a device.',type='protos']
 --
 To return information about the kernel object that may be specific to a
 device, call the function

 include::{generated}/api/protos/clGetKernelWorkGroupInfo.txt[]
 include::{generated}/api/version-notes/clGetKernelWorkGroupInfo.asciidoc[]

   * _kernel_ specifies the kernel object being queried.
   * _device_ identifies a specific device in the list of devices associated with
     _kernel_.
     The list of devices is the list of devices in the OpenCL context that is
     associated with _kernel_.
     If the list of devices associated with _kernel_ is a single device, _device_
     can be a `NULL` value.
   * _param_name_ specifies the information to query.
     The list of supported _param_name_ types and the information returned in
     _param_value_ by {clGetKernelWorkGroupInfo} is described in the
     <<kernel-workgroup-info-table,Kernel Object Device Queries>> table.
   * _param_value_ is a pointer to memory where the appropriate result being
     queried is returned.
     If _param_value_ is `NULL`, it is ignored.
   * _param_value_size_ is used to specify the size in bytes of memory pointed to
     by _param_value_.
     This size must be {geq} size of return type as described in the
     <<kernel-workgroup-info-table,Kernel Object Device Queries>> table.
   * _param_value_size_ret_ returns the actual size in bytes of data being
     queried by _param_name_.
     If _param_value_size_ret_ is `NULL`, it is ignored.

 [[kernel-workgroup-info-table]]
 .List of supported param_names by <<clGetKernelWorkGroupInfo>>
 [width="100%",cols="<33%,<17%,<50%",options="header"]
 |====
 | Kernel Work Group Info | Return Type | Description
 | {CL_KERNEL_GLOBAL_WORK_SIZE_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_GLOBAL_WORK_SIZE.asciidoc[]
   | {size_t_TYPE}[3]
       | This provides a mechanism for the application to query the maximum
         global size that can be used to execute a kernel (i.e.
         _global_work_size_ argument to {clEnqueueNDRangeKernel}) on a custom
         device given by device or a built-in kernel on an OpenCL device
         given by device.

         If device is not a custom device and kernel is not a built-in
         kernel, {clGetKernelWorkGroupInfo} returns the error
         {CL_INVALID_VALUE}.
 | {CL_KERNEL_WORK_GROUP_SIZE_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_WORK_GROUP_SIZE.asciidoc[]
   | {size_t_TYPE}
       | This provides a mechanism for the application to query the maximum
         work-group size that can be used to execute the kernel on a specific
         device given by device.
         The OpenCL implementation uses the resource requirements of the
         kernel (register usage etc.) to determine what this work-group size
         should be.

         As a result and unlike {CL_DEVICE_MAX_WORK_GROUP_SIZE} this value may
         vary from one kernel to another as well as one device to another.

         {CL_KERNEL_WORK_GROUP_SIZE} will be less than or equal to
         {CL_DEVICE_MAX_WORK_GROUP_SIZE} for a given kernel object.
 | {CL_KERNEL_COMPILE_WORK_GROUP_SIZE_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_COMPILE_WORK_GROUP_SIZE.asciidoc[]
   | {size_t_TYPE}[3]
       | Returns the work-group size specified in the kernel source or IL.

         If the work-group size is not specified in the kernel source or IL,
         (0, 0, 0) is returned.
 | {CL_KERNEL_LOCAL_MEM_SIZE_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_LOCAL_MEM_SIZE.asciidoc[]
   | {cl_ulong_TYPE}
       | Returns the amount of local memory in bytes being used by a kernel.
         This includes local memory that may be needed by an implementation
         to execute the kernel, variables declared inside the kernel with the
         `+__local+` address qualifier and local memory to be allocated for
         arguments to the kernel declared as pointers with the `+__local+`
         address qualifier and whose size is specified with {clSetKernelArg}.

         If the local memory size, for any pointer argument to the kernel
         declared with the `+__local+` address qualifier, is not specified,
         its size is assumed to be 0.
 | {CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE.asciidoc[]
   | {size_t_TYPE}
       | Returns the preferred multiple of work-group size for launch.
         This is a performance hint.
         Specifying a work-group size that is not a multiple of the value
         returned by this query as the value of the local work size argument
         to {clEnqueueNDRangeKernel} will not fail to enqueue the kernel for
         execution unless the work-group size specified is larger than the
         device maximum.
 | {CL_KERNEL_PRIVATE_MEM_SIZE_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_PRIVATE_MEM_SIZE.asciidoc[]
   | {cl_ulong_TYPE}
       | Returns the minimum amount of private memory, in bytes, used by each
         work-item in the kernel.
         This value may include any private memory needed by an
         implementation to execute the kernel, including that used by the
         language built-ins and variables declared inside the kernel with the
         `+__private+` qualifier.
 |====

 // refError

 {clGetKernelWorkGroupInfo} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_DEVICE} if _device_ is not in the list of devices associated
     with _kernel_ or if _device_ is `NULL` but there is more than one device
     associated with _kernel_.
   * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
     specified by _param_value_size_ is < size of return type as described in
     the <<kernel-workgroup-info-table,Kernel Object Device Queries>> table
     and _param_value_ is not `NULL`.
   * {CL_INVALID_VALUE} if _param_name_ is {CL_KERNEL_GLOBAL_WORK_SIZE} and
     _device_ is not a custom device and _kernel_ is not a built-in kernel.
   * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
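
One common way to use {CL_KERNEL_WORK_GROUP_SIZE} together with {CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE} is to pick a one-dimensional local size that respects both, and then pad the global size to a whole number of work-groups.
The following is a sketch of that arithmetic only: `pick_local_size` and `round_up_global` are hypothetical helpers, and the two query results are passed in as plain numbers instead of being obtained from {clGetKernelWorkGroupInfo}.

```c
#include <assert.h>
#include <stddef.h>

/* Largest multiple of `preferred_multiple` that does not exceed
 * `max_wg_size`. If the kernel's maximum is smaller than one multiple,
 * the maximum itself is returned, since the multiple is only a hint. */
static size_t pick_local_size(size_t max_wg_size, size_t preferred_multiple)
{
    assert(max_wg_size > 0 && preferred_multiple > 0);
    if (max_wg_size < preferred_multiple)
        return max_wg_size;
    return (max_wg_size / preferred_multiple) * preferred_multiple;
}

/* Rounds `global` up to the next multiple of `local`, so that every
 * work-group has the same size. */
static size_t round_up_global(size_t global, size_t local)
{
    assert(local > 0);
    return ((global + local - 1) / local) * local;
}
```

Because the padded global size may exceed the number of elements to process, a kernel launched this way needs its own bounds check on the global ID.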

 [open,refpage='clGetKernelSubGroupInfo',desc='Returns information about the kernel object.',type='protos']
 --
 To return information about a kernel object, call the function

 include::{generated}/api/protos/clGetKernelSubGroupInfo.txt[]
 include::{generated}/api/version-notes/clGetKernelSubGroupInfo.asciidoc[]
 Also see extension *cl_khr_subgroups*.

   * _kernel_ specifies the kernel object being queried.
   * _device_ identifies a specific device in the list of devices associated with
     _kernel_.
     The list of devices is the list of devices in the OpenCL context that is
     associated with _kernel_.
     If the list of devices associated with _kernel_ is a single device, _device_
     can be a `NULL` value.
   * _param_name_ specifies the information to query.
     The list of supported _param_name_ types and the information returned in
     _param_value_ by {clGetKernelSubGroupInfo} is described in the
     <<kernel-subgroup-info-table,Kernel Object Subgroup Queries>> table.
   * _input_value_size_ is used to specify the size in bytes of memory pointed to
     by _input_value_.
     This size must be == size of input type as described in the table below.
   * _input_value_ is a pointer to memory where the appropriate parameterization
     of the query is passed from.
     If _input_value_ is `NULL`, it is ignored.
   * _param_value_ is a pointer to memory where the appropriate result being
     queried is returned.
     If _param_value_ is `NULL`, it is ignored.
   * _param_value_size_ is used to specify the size in bytes of memory pointed to
     by _param_value_.
     This size must be {geq} size of return type as described in the
     <<kernel-subgroup-info-table,Kernel Object Subgroup Queries>> table.
   * _param_value_size_ret_ returns the actual size in bytes of data being
     queried by _param_name_.
     If _param_value_size_ret_ is `NULL`, it is ignored.

 [[kernel-subgroup-info-table]]
 .List of supported param_names by <<clGetKernelSubGroupInfo>>
 [width="100%",cols="<25%,<25%,<25%,<25%",options="header"]
 |====
 | Kernel Subgroup Info | Input Type | Return Type | Description
 | {CL_KERNEL_MAX_SUB_GROUP_SIZE_FOR_NDRANGE_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_MAX_SUB_GROUP_SIZE_FOR_NDRANGE.asciidoc[]
 Also see extension *cl_khr_subgroups*.
   | {size_t_TYPE}*
       | {size_t_TYPE}
           | Returns the maximum sub-group size for this kernel.
             All sub-groups must be the same size, except the last sub-group
             in any work-group (i.e. the sub-group with the maximum index),
             which could be the same or a smaller size.

             The _input_value_ must be an array of {size_t_TYPE} values
             corresponding to the local work size parameter of the intended
             dispatch.
             The number of dimensions in the ND-range will be inferred from
             the value specified for _input_value_size_.
 | {CL_KERNEL_SUB_GROUP_COUNT_FOR_NDRANGE_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_SUB_GROUP_COUNT_FOR_NDRANGE.asciidoc[]
 Also see extension *cl_khr_subgroups*.
   | {size_t_TYPE}*
       | {size_t_TYPE}
           | Returns the number of sub-groups that will be present in each
             work-group for a given local work size.
             All work-groups, apart from the last work-group in each dimension
             in the presence of non-uniform work-group sizes, will have the
             same number of sub-groups.

             The _input_value_ must be an array of {size_t_TYPE} values
             corresponding to the local work size parameter of the intended
             dispatch.
             The number of dimensions in the ND-range will be inferred from
             the value specified for _input_value_size_.
 | {CL_KERNEL_LOCAL_SIZE_FOR_SUB_GROUP_COUNT_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_LOCAL_SIZE_FOR_SUB_GROUP_COUNT.asciidoc[]
 Also see extension *cl_khr_subgroups*.
   | {size_t_TYPE}
       | {size_t_TYPE}[]
           | Returns the local size that will generate the requested number
             of sub-groups for the kernel.
             The output array must be an array of {size_t_TYPE} values corresponding
             to the local size parameter.
             Any returned work-group will have one dimension.
             Other dimensions inferred from the value specified for
             _param_value_size_ will be filled with the value 1.
             The returned value will produce an exact number of sub-groups
             and result in no partial groups for an executing kernel except
             in the case where the last work-group in a dimension has a size
             different from that of the other groups.
             If no work-group size can accommodate the requested number of
             sub-groups, 0 will be returned in each element of the return
             array.
 | {CL_KERNEL_MAX_NUM_SUB_GROUPS_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_MAX_NUM_SUB_GROUPS.asciidoc[]
 Also see extension *cl_khr_subgroups*.
   | ignored
       | {size_t_TYPE}
           | This provides a mechanism for the application to query the
             maximum number of sub-groups that may make up each work-group to
             execute a kernel on a specific device given by device.
             The OpenCL implementation uses the resource requirements of the
             kernel (register usage etc.) to determine what this value
             should be.
             The returned value may be used to compute a work-group size to
             enqueue the kernel with to give a round number of sub-groups for
             an enqueue.
 | {CL_KERNEL_COMPILE_NUM_SUB_GROUPS_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_COMPILE_NUM_SUB_GROUPS.asciidoc[]
 Also see extension *cl_khr_subgroups*.
   | ignored
       | {size_t_TYPE}
           | Returns the number of sub-groups per work-group specified in the kernel
             source or IL. If the sub-group count is not specified then 0 is returned.
 |====

 // refError

 {clGetKernelSubGroupInfo} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_DEVICE} if _device_ is not in the list of devices associated
     with _kernel_ or if _device_ is `NULL` but there is more than one device
     associated with _kernel_.
   * {CL_INVALID_OPERATION} if _device_ does not support subgroups.
   * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
     specified by _param_value_size_ is < size of return type as described in
     the <<kernel-subgroup-info-table,Kernel Object Subgroup Queries>> table
     and _param_value_ is not `NULL`.
   * {CL_INVALID_VALUE} if _param_name_ is
     {CL_KERNEL_MAX_SUB_GROUP_SIZE_FOR_NDRANGE} and the size in bytes specified by
     _input_value_size_ is not valid or if _input_value_ is `NULL`.
   * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --
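
For the `+*_FOR_NDRANGE+` queries the number of dimensions is not passed explicitly; it is inferred from _input_value_size_.
The sketch below shows that inference, plus a rough estimate of the sub-group count for a given local size.
Both functions are hypothetical helpers. The limit of three dimensions is an assumption of this sketch, and the estimate assumes that the implementation uses the sub-group size passed in; the authoritative value is the one returned for {CL_KERNEL_SUB_GROUP_COUNT_FOR_NDRANGE}.

```c
#include <assert.h>
#include <stddef.h>

/* Number of ND-range dimensions implied by input_value_size, or 0 if the
 * size is not a whole number of size_t values between 1 and 3 (the upper
 * limit of 3 is an assumption made for this sketch). */
static unsigned dims_from_input_size(size_t input_value_size)
{
    size_t n;
    if (input_value_size == 0 || input_value_size % sizeof(size_t) != 0)
        return 0;
    n = input_value_size / sizeof(size_t);
    return (n <= 3) ? (unsigned)n : 0;
}

/* Estimated sub-groups per work-group: all sub-groups are full except
 * possibly the last one, so the count is the work-group size divided by
 * the sub-group size, rounded up. */
static size_t sub_groups_per_work_group(const size_t *local, unsigned dims,
                                        size_t sub_group_size)
{
    size_t items = 1;
    unsigned i;
    assert(local != NULL && dims >= 1 && dims <= 3 && sub_group_size > 0);
    for (i = 0; i < dims; ++i)
        items *= local[i];
    return (items + sub_group_size - 1) / sub_group_size;
}
```

For a two-dimensional local size stored in `size_t local[2]`, the value to pass as _input_value_size_ would be `sizeof(local)`, that is, `2 * sizeof(size_t)`.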

 [open,refpage='clGetKernelArgInfo',desc='Returns information about the arguments of a kernel.',type='protos']
 --
 To return information about the arguments of a kernel, call the function

 include::{generated}/api/protos/clGetKernelArgInfo.txt[]
 include::{generated}/api/version-notes/clGetKernelArgInfo.asciidoc[]

   * _kernel_ specifies the kernel object being queried.
   * _arg_index_ is the argument index.
     Arguments to the kernel are referred to by indices that go from 0 for the
     leftmost argument to _n_ - 1, where _n_ is the total number of arguments
     declared by a kernel.
   * _param_name_ specifies the argument information to query.
     The list of supported _param_name_ types and the information returned in
     _param_value_ by {clGetKernelArgInfo} is described in the
     <<kernel-argument-info-table,Kernel Argument Queries>> table.
   * _param_value_ is a pointer to memory where the appropriate result being
     queried is returned.
     If _param_value_ is `NULL`, it is ignored.
   * _param_value_size_ is used to specify the size in bytes of memory pointed to
     by _param_value_.
     This size must be {geq} size of return type as described in the
     <<kernel-argument-info-table,Kernel Argument Queries>> table.
   * _param_value_size_ret_ returns the actual size in bytes of data being
     queried by _param_name_.
     If _param_value_size_ret_ is `NULL`, it is ignored.

 Kernel argument information is only available if the program object
 associated with _kernel_ is created with {clCreateProgramWithSource} and the
 program executable was built with the `-cl-kernel-arg-info` option specified
 in the _options_ argument to {clBuildProgram} or {clCompileProgram}.

 [[kernel-argument-info-table]]
 .List of supported param_names by <<clGetKernelArgInfo>>
 [width="100%",cols="<33%,<17%,<50%",options="header"]
 |====
 | Kernel Arg Info | Return Type | Description
 | {CL_KERNEL_ARG_ADDRESS_QUALIFIER_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_ARG_ADDRESS_QUALIFIER.asciidoc[]
   | {cl_kernel_arg_address_qualifier_TYPE}
       | Returns the address qualifier specified for the argument given by
         _arg_index_.
         This can be one of the following values:

         {CL_KERNEL_ARG_ADDRESS_GLOBAL_anchor} +
         {CL_KERNEL_ARG_ADDRESS_LOCAL_anchor} +
         {CL_KERNEL_ARG_ADDRESS_CONSTANT_anchor} +
         {CL_KERNEL_ARG_ADDRESS_PRIVATE_anchor}

         If no address qualifier is specified, the default address qualifier
         which is {CL_KERNEL_ARG_ADDRESS_PRIVATE} is returned.
 | {CL_KERNEL_ARG_ACCESS_QUALIFIER_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_ARG_ACCESS_QUALIFIER.asciidoc[]
   | {cl_kernel_arg_access_qualifier_TYPE}
       | Returns the access qualifier specified for the argument given by
         _arg_index_.
         This can be one of the following values:

         {CL_KERNEL_ARG_ACCESS_READ_ONLY_anchor} +
         {CL_KERNEL_ARG_ACCESS_WRITE_ONLY_anchor} +
         {CL_KERNEL_ARG_ACCESS_READ_WRITE_anchor} +
         {CL_KERNEL_ARG_ACCESS_NONE_anchor}

         If argument is not an image type and is not declared with the pipe
         qualifier, {CL_KERNEL_ARG_ACCESS_NONE} is returned.
         If argument is an image type, the access qualifier specified or the
         default access qualifier is returned.
 | {CL_KERNEL_ARG_TYPE_NAME_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_ARG_TYPE_NAME.asciidoc[]
   | {char_TYPE}[]
       | Returns the type name specified for the argument given by
         _arg_index_.
         The type name returned will be the argument type name as it was
         declared with any whitespace removed.
         If argument type name is an unsigned scalar type (i.e. unsigned
         char, unsigned short, unsigned int, unsigned long), uchar, ushort,
         uint and ulong will be returned.
         The argument type name returned does not include any type
         qualifiers.
 | {CL_KERNEL_ARG_TYPE_QUALIFIER_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_ARG_TYPE_QUALIFIER.asciidoc[]
   | {cl_kernel_arg_type_qualifier_TYPE}
       | Returns a bitfield describing one or more type qualifiers specified
         for the argument given by _arg_index_.
         The returned values can be:

         {CL_KERNEL_ARG_TYPE_CONST_anchor}
         footnote:[{fn-kernel-arg-type-qualifier}]
         footnote:[{fn-kernel-arg-type-const-addr-space}] +
         {CL_KERNEL_ARG_TYPE_RESTRICT_anchor} +
         {CL_KERNEL_ARG_TYPE_VOLATILE_anchor} +
         {CL_KERNEL_ARG_TYPE_PIPE_anchor}, or +
         {CL_KERNEL_ARG_TYPE_NONE_anchor}

         {CL_KERNEL_ARG_TYPE_NONE} is returned for all parameters passed by
         value.
 | {CL_KERNEL_ARG_NAME_anchor}

 include::{generated}/api/version-notes/CL_KERNEL_ARG_NAME.asciidoc[]
   | {char_TYPE}[]
       | Returns the name specified for the argument given by _arg_index_.
 |====
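
 Because the type qualifier query returns a bitfield, more than one qualifier
 may be reported for a single argument.
 The following is a minimal sketch of how a host program might decode such a
 value.
 The `ARG_TYPE_*` macros are local stand-ins assumed to mirror the
 corresponding `CL_KERNEL_ARG_TYPE_*` values from the OpenCL headers, and
 `describe_type_qualifiers` is a hypothetical helper, not part of the API.

 [source,c]
 ----
 #include <stddef.h>
 #include <stdio.h>
 #include <string.h>

 /* Local stand-ins, assumed to match the CL_KERNEL_ARG_TYPE_* bit values. */
 #define ARG_TYPE_NONE     0u
 #define ARG_TYPE_CONST    (1u << 0)
 #define ARG_TYPE_RESTRICT (1u << 1)
 #define ARG_TYPE_VOLATILE (1u << 2)
 #define ARG_TYPE_PIPE     (1u << 3)

 /* Writes a space-separated list of the qualifiers set in bits into out
  * (capacity cap), or "none" when no known bit is set. Returns out. */
 static char *describe_type_qualifiers(unsigned long long bits,
                                       char *out, size_t cap)
 {
     static const struct { unsigned long long bit; const char *name; } table[] = {
         { ARG_TYPE_CONST,    "const"    },
         { ARG_TYPE_RESTRICT, "restrict" },
         { ARG_TYPE_VOLATILE, "volatile" },
         { ARG_TYPE_PIPE,     "pipe"     },
     };
     size_t used = 0;
     size_t i;
     int any = 0;

     if (cap == 0)
         return out;
     out[0] = '\0';
     for (i = 0; i < sizeof table / sizeof table[0]; ++i) {
         int n;
         if ((bits & table[i].bit) == 0)
             continue;
         any = 1;
         n = snprintf(out + used, cap - used, "%s%s",
                      used ? " " : "", table[i].name);
         if (n < 0 || (size_t)n >= cap - used)
             break; /* output was truncated; stop appending */
         used += (size_t)n;
     }
     if (!any)
         snprintf(out, cap, "none");
     return out;
 }
 ----

 In a real program the bitfield would come from a {clGetKernelArgInfo} query
 with {CL_KERNEL_ARG_TYPE_QUALIFIER}; the sketch only covers the decoding step.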

 {clGetKernelArgInfo} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_ARG_INDEX} if _arg_index_ is not a valid argument index.
   * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
     specified by _param_value_size_ is < size of return type as described in
     the <<kernel-argument-info-table,Kernel Argument Queries>> table and
     _param_value_ is not `NULL`.
   * {CL_KERNEL_ARG_INFO_NOT_AVAILABLE} if the argument information is not
     available for kernel.
   * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
 --


 == Executing Kernels

 [open,refpage='clEnqueueNDRangeKernel',desc='Enqueues a command to execute a kernel on a device.',type='protos']
 --
 To enqueue a command to execute a kernel on a device, call the function

 include::{generated}/api/protos/clEnqueueNDRangeKernel.txt[]
 include::{generated}/api/version-notes/clEnqueueNDRangeKernel.asciidoc[]

   * _command_queue_ is a valid host command-queue.
     The kernel will be queued for execution on the device associated with
     _command_queue_.
   * _kernel_ is a valid kernel object.
     The OpenCL context associated with _kernel_ and _command_queue_ must be the
     same.
   * _work_dim_ is the number of dimensions used to specify the global work-items
     and work-items in the work-group.
     _work_dim_ must be greater than zero and less than or equal to
     {CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS}.
     If _global_work_size_ is `NULL`, or the value in any passed dimension is 0
     then the kernel command will trivially succeed after its event dependencies
     are satisfied and subsequently update its completion event.
     The behavior in this situation is similar to that of an enqueued marker,
     except that unlike a marker, an enqueued kernel with no events passed to
     _event_wait_list_ may run at any time.
   * _global_work_offset_ can be used to specify an array of _work_dim_ unsigned
     values that describe the offset used to calculate the global ID of a
     work-item.
     If _global_work_offset_ is `NULL`, the global IDs start at offset (0, 0, 0).
     _global_work_offset_ must be `NULL` <<unified-spec, before>> version 1.1.
   * _global_work_size_ points to an array of _work_dim_ unsigned values that
     describe the number of global work-items in _work_dim_ dimensions that will
     execute the kernel function.
     The total number of global work-items is computed as _global_work_size_[0]
     {times} ... {times} _global_work_size_[_work_dim_ - 1].
   * _local_work_size_ points to an array of _work_dim_ unsigned values that
     describe the number of work-items that make up a work-group (also referred
     to as the size of the work-group) that will execute the kernel specified by
     _kernel_.
     The total number of work-items in a work-group is computed as
     _local_work_size_[0] {times} ... {times} _local_work_size_[_work_dim_ - 1].
     The total number of work-items in the work-group must be less than or equal
     to the {CL_KERNEL_WORK_GROUP_SIZE} value specified in the
     <<kernel-workgroup-info-table,Kernel Object Device Queries>> table, and the
     number of work-items specified in _local_work_size_[0], ...,
     _local_work_size_[_work_dim_ - 1] must be less than or equal to the
     corresponding values specified by {CL_DEVICE_MAX_WORK_ITEM_SIZES}[0], ...,
     {CL_DEVICE_MAX_WORK_ITEM_SIZES}[_work_dim_ - 1].
     The explicitly specified _local_work_size_ will be used to determine how to
     break the global work-items specified by _global_work_size_ into appropriate
     work-group instances.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this command and
     can be used to query or wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.
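
 The size arithmetic described for _global_work_size_ and _local_work_size_
 can be sketched in plain C as follows.
 The helper names are hypothetical; `max_item_sizes` stands for the values an
 application would obtain by querying {CL_DEVICE_MAX_WORK_ITEM_SIZES}, and
 `max_group_size` for the {CL_KERNEL_WORK_GROUP_SIZE} value.
 Overflow of the product is not checked in this sketch.

 [source,c]
 ----
 #include <stddef.h>

 /* Product sizes[0] * ... * sizes[work_dim - 1]. */
 static size_t total_work_items(unsigned work_dim, const size_t *sizes)
 {
     size_t total = 1;
     unsigned i;
     for (i = 0; i < work_dim; ++i)
         total *= sizes[i];
     return total;
 }

 /* Returns 1 when local_size respects both limits: the per-dimension maxima
  * and the limit on the total number of work-items in a work-group. */
 static int local_size_within_limits(unsigned work_dim,
                                     const size_t *local_size,
                                     const size_t *max_item_sizes,
                                     size_t max_group_size)
 {
     unsigned i;
     for (i = 0; i < work_dim; ++i) {
         if (local_size[i] > max_item_sizes[i])
             return 0;
     }
     return total_work_items(work_dim, local_size) <= max_group_size;
 }
 ----

 For example, a 16 {times} 16 work-group passes when the per-dimension maxima
 are 1024 and the total limit is 256, while a 32 {times} 16 work-group does not.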

 An ND-range kernel command may require uniform work-groups or may support non-uniform work-groups.
 To support non-uniform work-groups:

 . The device associated with _command_queue_ must support non-uniform work-groups.
 . The program object associated with _kernel_ must support non-uniform work-groups.
 Specifically, this means:
 .. If the program was created with {clCreateProgramWithSource}, the program must be compiled or built using the `-cl-std=CL2.0` or `-cl-std=CL3.0` build option and without the `-cl-uniform-work-group-size` build option.
 .. If the program was created with {clCreateProgramWithIL} or {clCreateProgramWithBinary}, the program must be compiled or built without the `-cl-uniform-work-group-size` build option.
 .. If the program was created using {clLinkProgram}, all input programs must support non-uniform work-groups.

 If non-uniform work-groups are supported, any single dimension
 for which the global size is not divisible by the local size will be
 partitioned into two regions.
 One region will have work-groups that have the same number of work-items as
 was specified by the local size parameter in that dimension.
 The other region will have work-groups with fewer work-items than the number
 specified by the local size parameter in that dimension.
 The global IDs and group IDs of the work-items in the first region will be
 numerically lower than those in the second, and the second region will be at
 most one work-group wide in that dimension.
 Work-group sizes could be non-uniform in multiple dimensions, potentially
 producing work-groups of up to 4 different sizes in a 2D range and 8
 different sizes in a 3D range.
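
 The partitioning described above can be illustrated with a small sketch.
 The struct and function names are hypothetical, and the local size in each
 dimension is assumed to be non-zero.

 [source,c]
 ----
 #include <stddef.h>

 struct dim_partition {
     size_t full_groups;    /* work-groups holding exactly `local` work-items */
     size_t remainder_size; /* size of the trailing smaller group, 0 if none */
 };

 /* Splits one dimension into the two regions described in the text. */
 static struct dim_partition partition_dimension(size_t global, size_t local)
 {
     struct dim_partition p;
     p.full_groups = global / local;
     p.remainder_size = global % local;
     return p;
 }

 /* Counts how many different work-group sizes the ND-range produces.
  * A dimension contributes two sizes only when both regions are non-empty. */
 static unsigned distinct_group_sizes(unsigned work_dim,
                                      const size_t *global,
                                      const size_t *local)
 {
     unsigned count = 1;
     unsigned i;
     for (i = 0; i < work_dim; ++i) {
         struct dim_partition p = partition_dimension(global[i], local[i]);
         if (p.full_groups > 0 && p.remainder_size > 0)
             count *= 2;
     }
     return count;
 }
 ----

 With a global size of 10 {times} 7 and a local size of 4 {times} 3, both
 dimensions have a remainder, so the sketch reports 4 different sizes, which
 matches the 2D upper bound stated above.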

 If non-uniform work-groups are supported and _local_work_size_ is `NULL`, the OpenCL runtime may choose a uniform or non-uniform work-group size.

 Otherwise, when non-uniform work-groups are not supported, the size of each work-group must be uniform.
 If _local_work_size_ is specified, the values specified in _global_work_size_[0], ..., _global_work_size_[_work_dim_ - 1] must be evenly divisible by the corresponding values specified in _local_work_size_[0], ..., _local_work_size_[_work_dim_ - 1].
 If _local_work_size_ is `NULL`, the OpenCL runtime must choose a uniform work-group size.
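
 The divisibility requirement for uniform work-groups can be checked on the
 host before enqueuing, as in the sketch below.
 Rounding the global size up to the next multiple of the local size is a
 common application-side practice rather than something this specification
 requires; a kernel used that way must itself ignore the padded work-items.
 Both helper names are hypothetical, and local sizes are assumed to be
 non-zero.

 [source,c]
 ----
 #include <stddef.h>

 /* Returns 1 when every global size is evenly divisible by the
  * corresponding local size. */
 static int is_uniform_range(unsigned work_dim,
                             const size_t *global, const size_t *local)
 {
     unsigned i;
     for (i = 0; i < work_dim; ++i) {
         if (global[i] % local[i] != 0)
             return 0;
     }
     return 1;
 }

 /* Smallest multiple of local that is greater than or equal to global. */
 static size_t round_up_to_multiple(size_t global, size_t local)
 {
     size_t remainder = global % local;
     return remainder == 0 ? global : global + (local - remainder);
 }
 ----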

 The work-group size to be used for _kernel_ can also be specified in the
 program source or intermediate language.
 In this case the size of work-group specified by _local_work_size_ must
 match the value specified in the program source.

 These work-group instances are executed in parallel across multiple compute
 units or concurrently on the same compute unit.

 Each work-item is uniquely identified by a global identifier.
 The global ID, which can be read inside the kernel, is computed using the
 value given by _global_work_size_ and _global_work_offset_.
 In addition, a work-item is also identified within a work-group by a unique
 local ID.
 The local ID, which can also be read by the kernel, is computed using the
 value given by _local_work_size_.
 The starting local ID is always (0, 0, ..., 0).
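
 As an illustration of how these identifiers relate in one dimension, the
 sketch below assumes the usual relationship in which a global ID is the
 work-group ID multiplied by the local size passed in _local_work_size_, plus
 the local ID and the global offset.
 The function names are hypothetical and the local size is assumed to be
 non-zero.

 [source,c]
 ----
 #include <stddef.h>

 /* Global ID of a work-item in one dimension. */
 static size_t global_id_of(size_t group_id, size_t local_size,
                            size_t local_id, size_t global_offset)
 {
     return group_id * local_size + local_id + global_offset;
 }

 /* Inverse mappings: recover the work-group ID and local ID. */
 static size_t group_id_of(size_t global_id, size_t local_size,
                           size_t global_offset)
 {
     return (global_id - global_offset) / local_size;
 }

 static size_t local_id_of(size_t global_id, size_t local_size,
                           size_t global_offset)
 {
     return (global_id - global_offset) % local_size;
 }
 ----

 With a local size of 16 and a global offset of 100, the work-item with local
 ID 5 in work-group 2 has global ID 137.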

 // refError

 {clEnqueueNDRangeKernel} returns {CL_SUCCESS} if the kernel-instance was
 successfully queued.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_PROGRAM_EXECUTABLE} if there is no successfully built program
     executable available for device associated with _command_queue_.
   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
   * {CL_INVALID_CONTEXT} if context associated with _command_queue_ and
     _kernel_ are not the same or if the context associated with
     _command_queue_ and events in _event_wait_list_ are not the same.
   * {CL_INVALID_KERNEL_ARGS} if the kernel argument values have not been
     specified.
   * {CL_INVALID_WORK_DIMENSION} if _work_dim_ is not a valid value (i.e. a
     value between 1 and {CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS}).
   * {CL_INVALID_GLOBAL_WORK_SIZE} if _global_work_size_ is `NULL` or if any of
     the values specified in _global_work_size_[0], ...
     _global_work_size_[_work_dim_ - 1] are 0.
     Returning this error code under these circumstances is <<unified-spec,
     deprecated by>> version 2.1.
   * {CL_INVALID_GLOBAL_WORK_SIZE} if any of the values specified in
     _global_work_size_[0], ... _global_work_size_[_work_dim_ - 1] exceed the
     maximum value representable by {size_t_TYPE} on the device on which the
     kernel-instance will be enqueued.
   * {CL_INVALID_GLOBAL_OFFSET} if the value specified in _global_work_size_
     {plus} the corresponding values in _global_work_offset_ for any
     dimensions is greater than the maximum value representable by {size_t_TYPE} on
     the device on which the kernel-instance will be enqueued, or if
     _global_work_offset_ is non-`NULL` <<unified-spec, before>> version 1.1.
   * {CL_INVALID_WORK_GROUP_SIZE} if _local_work_size_ is specified and does
     not match the required work-group size for _kernel_ in the program
     source.
   * {CL_INVALID_WORK_GROUP_SIZE} if _local_work_size_ is specified and is not
     consistent with the required number of sub-groups for _kernel_ in the
     program source.
   * {CL_INVALID_WORK_GROUP_SIZE} if _local_work_size_ is specified and the
     total number of work-items in the work-group computed as
     _local_work_size_[0] {times} ... {times} _local_work_size_[_work_dim_ - 1] is
     greater than the value specified by {CL_KERNEL_WORK_GROUP_SIZE} in the
     <<kernel-workgroup-info-table,Kernel Object Device Queries>> table.
   * {CL_INVALID_WORK_GROUP_SIZE} if the work-group size must be uniform and
     the _local_work_size_ is not `NULL`, is not equal to the required
     work-group size specified in the kernel source, or the
     _global_work_size_ is not evenly divisible by the _local_work_size_.
   * {CL_INVALID_WORK_ITEM_SIZE} if the number of work-items specified in any
     of _local_work_size_[0], ... _local_work_size_[_work_dim_ - 1] is
     greater than the corresponding values specified by
     {CL_DEVICE_MAX_WORK_ITEM_SIZES}[0], ...,
     {CL_DEVICE_MAX_WORK_ITEM_SIZES}[_work_dim_ - 1].
   * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if a sub-buffer object is specified as
     the value for an argument that is a buffer object and the _offset_
     specified when the sub-buffer object is created is not aligned to
     {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for device associated with
     _command_queue_.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_INVALID_IMAGE_SIZE} if an image object is specified as an argument
     value and the image dimensions (image width, height, specified or
     compute row and/or slice pitch) are not supported by device associated
     with _command_queue_.
   * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if an image object is specified as an
     argument value and the image format (image channel order and data type)
     is not supported by device associated with _command_queue_.
   * {CL_OUT_OF_RESOURCES} if there is a failure to queue the execution
     instance of _kernel_ on the command-queue because of insufficient
     resources needed to execute the kernel.
     For example, the explicitly specified _local_work_size_ causes a failure
     to execute the kernel because of insufficient resources such as
     registers or local memory.
     Another example would be the number of read-only image args used in
     _kernel_ exceeding the {CL_DEVICE_MAX_READ_IMAGE_ARGS} value for the device,
     the number of write-only and read-write image args used in _kernel_
     exceeding the {CL_DEVICE_MAX_READ_WRITE_IMAGE_ARGS} value for the device, or
     the number of samplers used in _kernel_ exceeding {CL_DEVICE_MAX_SAMPLERS}
     for the device.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for data store associated with image or buffer objects specified
     as arguments to _kernel_.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_INVALID_OPERATION} if SVM pointers are passed as arguments to a kernel
     and the device does not support SVM or if system pointers are passed as
     arguments to a kernel and/or stored inside SVM allocations passed as
     kernel arguments and the device does not support fine grain system SVM
     allocations.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
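
 A few of the argument checks listed above can be modelled without any OpenCL
 objects, as in the following sketch.
 The `check_result` enumerators are hypothetical stand-ins for the
 corresponding error codes, the order of the checks is arbitrary, and the
 host `SIZE_MAX` is used as an assumption in place of the maximum value of
 {size_t_TYPE} on the device.

 [source,c]
 ----
 #include <stddef.h>
 #include <stdint.h>

 enum check_result {
     CHECK_OK,
     CHECK_INVALID_WORK_DIMENSION,
     CHECK_INVALID_GLOBAL_OFFSET,
     CHECK_INVALID_EVENT_WAIT_LIST
 };

 static enum check_result check_enqueue_args(unsigned work_dim,
                                             unsigned max_work_dim,
                                             const size_t *global_work_offset,
                                             const size_t *global_work_size,
                                             unsigned num_events_in_wait_list,
                                             const void *event_wait_list)
 {
     unsigned i;

     /* work_dim must lie between 1 and the device maximum. */
     if (work_dim < 1 || work_dim > max_work_dim)
         return CHECK_INVALID_WORK_DIMENSION;

     /* size + offset must not exceed the largest representable value. */
     if (global_work_offset != NULL && global_work_size != NULL) {
         for (i = 0; i < work_dim; ++i) {
             if (global_work_offset[i] > SIZE_MAX - global_work_size[i])
                 return CHECK_INVALID_GLOBAL_OFFSET;
         }
     }

     /* The list pointer and the count must be both empty or both non-empty. */
     if ((event_wait_list == NULL) != (num_events_in_wait_list == 0))
         return CHECK_INVALID_EVENT_WAIT_LIST;

     return CHECK_OK;
 }
 ----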
 --

 [open,refpage='clEnqueueTask',desc='Enqueues a command to execute a kernel, using a single work-item, on a device.',type='protos']
 --
 To enqueue a command to execute a kernel on a device, using a single work-item,
 call the function

 include::{generated}/api/protos/clEnqueueTask.txt[]
 include::{generated}/api/version-notes/clEnqueueTask.asciidoc[]

   * _command_queue_ is a valid host command-queue.
     The kernel will be queued for execution on the device associated with
     _command_queue_.
   * _kernel_ is a valid kernel object.
     The OpenCL context associated with _kernel_ and _command_queue_ must be the
     same.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
     If _event_wait_list_ is `NULL`, then this particular command does not wait
     on any event to complete.
     If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this command and
     can be used to query or wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.

 {clEnqueueTask} is equivalent to calling {clEnqueueNDRangeKernel} with
 _work_dim_ set to 1, _global_work_offset_ set to `NULL`, _global_work_size_[0]
 set to 1, and _local_work_size_[0] set to 1.

 // refError

 {clEnqueueTask} returns {CL_SUCCESS} if the kernel-instance was successfully
 queued.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_PROGRAM_EXECUTABLE} if there is no successfully built program
     executable available for device associated with _command_queue_.
   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
   * {CL_INVALID_CONTEXT} if context associated with _command_queue_ and
     _kernel_ are not the same or if the context associated with
     _command_queue_ and events in _event_wait_list_ are not the same.
   * {CL_INVALID_KERNEL_ARGS} if the kernel argument values have not been
     specified.
   * {CL_INVALID_WORK_GROUP_SIZE} if a work-group size is specified for _kernel_
     in the program source and it is not (1, 1, 1).
 // TODO I'm not sure if the next error makes sense for a 'task'.
   * {CL_INVALID_WORK_GROUP_SIZE} if the required number of sub-groups is
     specified for _kernel_ in the program source and is not consistent with a
     work-group size of (1, 1, 1).
   * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if a sub-buffer object is specified as
     the value for an argument that is a buffer object and the _offset_
     specified when the sub-buffer object is created is not aligned to
     {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for device associated with
     _command_queue_.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_INVALID_IMAGE_SIZE} if an image object is specified as an argument
     value and the image dimensions (image width, height, specified or
     compute row and/or slice pitch) are not supported by device associated
     with _command_queue_.
   * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if an image object is specified as an
     argument value and the image format (image channel order and data type)
     is not supported by device associated with _command_queue_.
   * {CL_OUT_OF_RESOURCES} if there is a failure to queue the execution
     instance of _kernel_ on the command-queue because of insufficient
     resources needed to execute the kernel.  See how this error code is used
     with {clEnqueueNDRangeKernel} for examples.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for data store associated with image or buffer objects specified
     as arguments to _kernel_.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_INVALID_OPERATION} if SVM pointers are passed as arguments to a kernel
     and the device does not support SVM or if system pointers are passed as
     arguments to a kernel and/or stored inside SVM allocations passed as
     kernel arguments and the device does not support fine grain system SVM
     allocations.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clEnqueueNativeKernel',desc='Enqueues a command to execute a native C/C++ function not compiled using the OpenCL compiler.',type='protos']
 --
 To enqueue a command to execute a native C/{cpp} function not compiled using
 the OpenCL compiler, call the function

 include::{generated}/api/protos/clEnqueueNativeKernel.txt[]
 include::{generated}/api/version-notes/clEnqueueNativeKernel.asciidoc[]

   * _command_queue_ is a valid host command-queue.
     A native user function can only be executed on a command-queue created on a
     device that has {CL_EXEC_NATIVE_KERNEL} capability set in
     {CL_DEVICE_EXECUTION_CAPABILITIES} as specified in the
     <<device-queries-table,Device Queries>> table.
   * _user_func_ is a pointer to a host-callable user function.
   * _args_ is a pointer to the args list that _user_func_ should be called with.
   * _cb_args_ is the size in bytes of the args list that _args_ points to.
   * _num_mem_objects_ is the number of buffer objects that are passed in _args_.
   * _mem_list_ is a list of valid buffer objects, if _num_mem_objects_ > 0.
     The buffer object values specified in _mem_list_ are memory object handles
     (`{cl_mem_TYPE}` values) returned by {clCreateBuffer} or {clCreateBufferWithProperties},
     or `NULL`.
   * _args_mem_loc_ is a pointer to appropriate locations that _args_ points to
     where memory object handles ({cl_mem_TYPE} values) are stored.
     Before the user function is executed, the memory object handles are replaced
     by pointers to global memory.
   * _event_wait_list_, _num_events_in_wait_list_ and _event_ are as described in
     {clEnqueueNDRangeKernel}.

 The data pointed to by _args_ and _cb_args_ bytes in size will be copied and
 a pointer to this copied region will be passed to _user_func_.
 The copy needs to be done because the memory objects ({cl_mem_TYPE} values) that
 _args_ may contain need to be modified and replaced by appropriate pointers
 to global memory.
 When {clEnqueueNativeKernel} returns, the memory region pointed to by _args_
 can be reused by the application.
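
 The copy-and-replace step described above can be modelled in plain C.
 This is only a sketch of the idea, not how any particular implementation
 works: `fake_buffer`, `fake_mem` and `copy_and_patch_args` are hypothetical
 stand-ins, a handle is assumed to have the same size as a `void` pointer, and
 every entry of `args_mem_loc` is assumed to point inside the block that
 `args` points to.

 [source,c]
 ----
 #include <stddef.h>
 #include <stdlib.h>
 #include <string.h>

 /* Hypothetical stand-in for a buffer object and its handle. */
 struct fake_buffer { void *data; };
 typedef struct fake_buffer *fake_mem;

 /* Copies cb_args bytes from args and, in the copy only, overwrites each
  * handle slot named by args_mem_loc with the data pointer of the matching
  * entry in mem_list. Returns NULL on bad input or allocation failure.
  * The caller owns the returned block and must free() it. */
 static void *copy_and_patch_args(const void *args, size_t cb_args,
                                  unsigned num_mem_objects,
                                  const fake_mem *mem_list,
                                  const void **args_mem_loc)
 {
     unsigned char *copy;
     unsigned i;

     if (args == NULL || cb_args == 0)
         return NULL;
     copy = malloc(cb_args);
     if (copy == NULL)
         return NULL;
     memcpy(copy, args, cb_args);

     for (i = 0; i < num_mem_objects; ++i) {
         /* Offset of the handle slot inside the caller's block. */
         size_t offset = (size_t)((const unsigned char *)args_mem_loc[i] -
                                  (const unsigned char *)args);
         void *data = mem_list[i] != NULL ? mem_list[i]->data : NULL;

         if (cb_args < sizeof data || offset > cb_args - sizeof data) {
             free(copy);
             return NULL;
         }
         memcpy(copy + offset, &data, sizeof data);
     }
     return copy;
 }
 ----

 Because only the copy is modified, the caller's original block still holds
 the handles afterwards, which is consistent with the statement that the
 region pointed to by _args_ can be reused once the call returns.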

 // refError

 {clEnqueueNativeKernel} returns {CL_SUCCESS} if the user function execution
 instance was successfully queued.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events
     in _event_wait_list_ are not the same.
   * {CL_INVALID_VALUE} if _user_func_ is `NULL`.
   * {CL_INVALID_VALUE} if _args_ is a `NULL` value and _cb_args_ > 0, or if
     _args_ is a `NULL` value and _num_mem_objects_ > 0.
   * {CL_INVALID_VALUE} if _args_ is not `NULL` and _cb_args_ is 0.
   * {CL_INVALID_VALUE} if _num_mem_objects_ > 0 and _mem_list_ or
     _args_mem_loc_ are `NULL`.
   * {CL_INVALID_VALUE} if _num_mem_objects_ = 0 and _mem_list_ or
     _args_mem_loc_ are not `NULL`.
   * {CL_INVALID_OPERATION} if the device associated with _command_queue_
     cannot execute the native kernel.
   * {CL_INVALID_MEM_OBJECT} if one or more memory objects specified in
     _mem_list_ are not valid or are not buffer objects.
   * {CL_OUT_OF_RESOURCES} if there is a failure to queue the execution
     instance of _kernel_ on the command-queue because of insufficient
     resources needed to execute the kernel.
   * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
     memory for data store associated with buffer objects specified as
     arguments to _kernel_.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_INVALID_OPERATION} if SVM pointers are passed as arguments to a kernel
     and the device does not support SVM or if system pointers are passed as
     arguments to a kernel and/or stored inside SVM allocations passed as
     kernel arguments and the device does not support fine grain system SVM
     allocations.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.

 [NOTE]
 ====
 The total number of read-only images specified as arguments to a kernel
 cannot exceed {CL_DEVICE_MAX_READ_IMAGE_ARGS}.
 Each image array argument to a kernel declared with the `read_only`
 qualifier counts as one image.
 The total number of write-only images specified as arguments to a kernel
 cannot exceed {CL_DEVICE_MAX_WRITE_IMAGE_ARGS}.
 Each image array argument to a kernel declared with the `write_only`
 qualifier counts as one image.

 The total number of read-write images specified as arguments to a kernel
 cannot exceed {CL_DEVICE_MAX_READ_WRITE_IMAGE_ARGS}.
 Each image array argument to a kernel declared with the `read_write`
 qualifier counts as one image.
 ====
 --


 [[event-objects]]
 == Event Objects

 // Check: Is this list of event object APIs really necessary?

 //Event objects can be used to refer to a kernel-instance command
 //({clEnqueueNDRangeKernel}, {clEnqueueTask}, {clEnqueueNativeKernel}), read,
 //write, map and copy commands on memory objects (
 //{clEnqueueReadBuffer}, {clEnqueueWriteBuffer},
 //{clEnqueueMapBuffer}, {clEnqueueUnmapMemObject},
 //{clEnqueueReadBufferRect}, {clEnqueueWriteBufferRect},
 //{clEnqueueReadImage}, {clEnqueueWriteImage}, {clEnqueueMapImage},
 //{clEnqueueCopyBuffer}, {clEnqueueCopyImage},
 //{clEnqueueCopyBufferRect},
 //{clEnqueueCopyBufferToImage}, {clEnqueueCopyImageToBuffer}),
 //{clEnqueueSVMMemcpy}, {clEnqueueSVMMemFill},
 //{clEnqueueSVMMap}, {clEnqueueSVMUnmap}, {clEnqueueSVMFree}, {clEnqueueMarker},
 //{clEnqueueMarkerWithWaitList}, {clEnqueueWaitForEvents}, {clEnqueueBarrier},
 //{clEnqueueBarrierWithWaitList},  (refer to
 //<<markers-barriers-waiting-for-events,Markers, Barriers and Waiting for
 //Events>>) or user events.

 An event object can be used to track the execution status of a command.
 The API calls that enqueue commands to a command-queue create a new event
 object that is returned in the _event_ argument.
 In case of an error enqueuing the command in the command-queue the event
 argument does not return an event object.

 The execution status of an enqueued command at any given point in time can
 be one of the following:

   * {CL_QUEUED_anchor}: Indicates that the command has been enqueued in a
     command-queue.
     This is the initial state of all events except user events.
   * {CL_SUBMITTED_anchor}: The initial state for all user events.
     For all other events, indicates that the command has been submitted
     by the host to the device.
   * {CL_RUNNING_anchor}: Indicates that the device has started executing this
     command.
     In order for the execution status of an enqueued command to change from
     {CL_SUBMITTED} to {CL_RUNNING}, all events that this command is waiting on
     must have completed successfully i.e. their execution status must be
     {CL_COMPLETE}.
   * {CL_COMPLETE_anchor}: Indicates that the command has successfully completed.
   * An Error Code: A negative integer value indicating that the command was
     abnormally terminated. Abnormal termination may occur for a number of reasons,
     such as a bad memory access.

 [NOTE]
 ====
 A command is considered to be complete if its execution status is
 {CL_COMPLETE} or is a negative integer value.

 If the execution of a command is terminated, the command-queue associated
 with this terminated command, and the associated context (and all other
 command-queues in this context) may no longer be available.
 The behavior of OpenCL API calls that use this context (and command-queues
 associated with this context) is now considered to be
 implementation-defined.
 The user registered callback function specified when context is created can
 be used to report appropriate error information.
 ====

 [open,refpage='clCreateUserEvent',desc='Creates a user event object.',type='protos']
 --
 To create a user event object, call the function

 include::{generated}/api/protos/clCreateUserEvent.txt[]
 include::{generated}/api/version-notes/clCreateUserEvent.asciidoc[]

   * _context_ must be a valid OpenCL context.
   * _errcode_ret_ will return an appropriate error code.
     If _errcode_ret_ is `NULL`, no error code is returned.

 User events allow applications to enqueue commands that wait on a user event
 to finish before the command is executed by the device.

 // refError

 {clCreateUserEvent} returns a valid non-zero event object and _errcode_ret_
 is set to {CL_SUCCESS} if the user event object is created successfully.
 Otherwise, it returns a `NULL` value with one of the following error values
 returned in _errcode_ret_:

   * {CL_INVALID_CONTEXT} if _context_ is not a valid context.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.

 The initial execution status for the user event object is {CL_SUBMITTED}.
 --

 [open,refpage='clSetUserEventStatus',desc='Sets the execution status of a user event object.',type='protos']
 --
 To set the execution status of a user event object, call the function

 include::{generated}/api/protos/clSetUserEventStatus.txt[]
 include::{generated}/api/version-notes/clSetUserEventStatus.asciidoc[]

   * _event_ is a user event object created using {clCreateUserEvent}.
   * _execution_status_ specifies the new execution status to be set and can be
     {CL_COMPLETE} or a negative integer value to indicate an error.
     A negative integer value causes all enqueued commands that wait on this user
     event to be terminated.
     {clSetUserEventStatus} can only be called once to change the execution
     status of _event_.

 [NOTE]
 ====
 If there are enqueued commands with user events in the _event_wait_list_
 argument of *+clEnqueue*+* commands, the user must ensure that the status of
 each user event being waited on is set using {clSetUserEventStatus}
 before any OpenCL APIs that release OpenCL objects except for event objects
 are called; otherwise the behavior is undefined.

 For example, the following code sequence will result in undefined behavior
 of {clReleaseMemObject}.

 [source,c]
 ----
 ev1 = clCreateUserEvent(ctx, NULL);
 clEnqueueWriteBuffer(cq, buf1, CL_FALSE, ..., 1, &ev1, NULL);
 clEnqueueWriteBuffer(cq, buf2, CL_FALSE, ...);
 clReleaseMemObject(buf2);
 clSetUserEventStatus(ev1, CL_COMPLETE);
 ----

 The following code sequence, however, works correctly.

 [source,c]
 ----
 ev1 = clCreateUserEvent(ctx, NULL);
 clEnqueueWriteBuffer(cq, buf1, CL_FALSE, ..., 1, &ev1, NULL);
 clEnqueueWriteBuffer(cq, buf2, CL_FALSE, ...);
 clSetUserEventStatus(ev1, CL_COMPLETE);
 clReleaseMemObject(buf2);
 ----
 ====

 // refError

 {clSetUserEventStatus} returns {CL_SUCCESS} if the function was executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_EVENT} if _event_ is not a valid user event object.
   * {CL_INVALID_VALUE} if the _execution_status_ is not {CL_COMPLETE} or a
     negative integer value.
   * {CL_INVALID_OPERATION} if the _execution_status_ for _event_ has already
     been changed by a previous call to {clSetUserEventStatus}.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
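
 The rules for setting a user event's status (an initial status of
 {CL_SUBMITTED}, a single permitted change, and only {CL_COMPLETE} or a
 negative value accepted) can be modelled as a small state machine.
 Everything in this sketch is hypothetical: `user_event_model` is not an
 OpenCL type, and the `SET_*` enumerators merely stand in for the error codes
 listed above.

 [source,c]
 ----
 #include <stddef.h>

 /* Local stand-ins, assumed to match CL_COMPLETE and CL_SUBMITTED. */
 #define MODEL_COMPLETE  0
 #define MODEL_SUBMITTED 2

 enum set_status_result {
     SET_OK,
     SET_INVALID_EVENT,
     SET_INVALID_VALUE,
     SET_INVALID_OPERATION
 };

 struct user_event_model {
     int status;         /* current execution status */
     int status_was_set; /* non-zero once the status has been changed */
 };

 static struct user_event_model user_event_model_create(void)
 {
     struct user_event_model ev = { MODEL_SUBMITTED, 0 };
     return ev;
 }

 static enum set_status_result
 user_event_model_set_status(struct user_event_model *ev, int execution_status)
 {
     if (ev == NULL)
         return SET_INVALID_EVENT;
     /* Only MODEL_COMPLETE (zero) or a negative value is acceptable. */
     if (execution_status > MODEL_COMPLETE)
         return SET_INVALID_VALUE;
     /* The status may be changed only once. */
     if (ev->status_was_set)
         return SET_INVALID_OPERATION;
     ev->status = execution_status;
     ev->status_was_set = 1;
     return SET_OK;
 }
 ----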
 --

 [open,refpage='clWaitForEvents',desc='Waits on the host thread for commands identified by event objects to complete.',type='protos']
 --
 To wait for events to complete, call the function

 include::{generated}/api/protos/clWaitForEvents.txt[]
 include::{generated}/api/version-notes/clWaitForEvents.asciidoc[]

   * _num_events_ is the number of events in _event_list_.
   * _event_list_ is a pointer to a list of event object handles.

 This function waits on the host thread for commands identified by event
 objects in _event_list_ to complete.
 A command is considered complete if its execution status is {CL_COMPLETE} or a
 negative value.
 The events specified in _event_list_ act as synchronization points.
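
The completion rule and the success condition described here can be sketched as a small host-side model.
This is not OpenCL code: `command_is_complete` and `wait_result` are hypothetical helper names, `int` stands in for `cl_int`, and the constants are redefined locally (with the values used in `CL/cl.h`) only so that the sketch builds on its own.
It assumes every status passed in is already final, and it does not model resource or host-memory failures.

```c
#include <stddef.h>

/* Values mirror <CL/cl.h>; defined here only so the sketch builds alone. */
#ifndef CL_SUCCESS
#define CL_SUCCESS                                    0
#define CL_COMPLETE                                   0x0
#define CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST  (-14)
#define CL_INVALID_VALUE                              (-30)
#endif

/* Hypothetical helper: a command counts as complete once its execution
 * status is CL_COMPLETE or a negative error code. */
int command_is_complete(int status)
{
    return status == CL_COMPLETE || status < 0;
}

/* Hypothetical helper: given the final execution status of each event,
 * pick the value a wait on those events would report. */
int wait_result(const int *statuses, size_t num_events)
{
    if (num_events == 0 || statuses == NULL)
        return CL_INVALID_VALUE;
    for (size_t i = 0; i < num_events; ++i)
        if (statuses[i] < 0)   /* abnormally terminated command */
            return CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST;
    return CL_SUCCESS;
}
```

In this model a single negative status among otherwise successful events is enough to turn the overall result into an error.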

 // refError

 {clWaitForEvents} returns {CL_SUCCESS} if the execution status of all events
 in _event_list_ is {CL_COMPLETE}.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_VALUE} if _num_events_ is zero or _event_list_ is `NULL`.
   * {CL_INVALID_CONTEXT} if events specified in _event_list_ do not belong to
     the same context.
   * {CL_INVALID_EVENT} if event objects specified in _event_list_ are not
     valid event objects.
   * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the execution status of
     any of the events in _event_list_ is a negative integer value.
     This error code is <<unified-spec, missing before>> version 1.1.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clGetEventInfo',desc='Returns information about the event object.',type='protos']
 --
 To return information about an event object, call the function

 include::{generated}/api/protos/clGetEventInfo.txt[]
 include::{generated}/api/version-notes/clGetEventInfo.asciidoc[]

   * _event_ specifies the event object being queried.
   * _param_name_ specifies the information to query.
     The list of supported _param_name_ types and the information returned in
     _param_value_ by {clGetEventInfo} is described in the
     <<event-info-table,Event Object Queries>> table.
   * _param_value_ is a pointer to memory where the appropriate result being
     queried is returned.
     If _param_value_ is `NULL`, it is ignored.
   * _param_value_size_ is used to specify the size in bytes of memory pointed to
     by _param_value_.
     This size must be {geq} size of return type as described in the
     <<event-info-table,Event Object Queries>> table.
   * _param_value_size_ret_ returns the actual size in bytes of data being
     queried by _param_name_.
     If _param_value_size_ret_ is `NULL`, it is ignored.

 [[event-info-table]]
 .List of supported param_names by <<clGetEventInfo>>
 [width="100%",cols="<33%,<17%,<50%",options="header"]
 |====
 | Event Info | Return Type | Description
 | {CL_EVENT_COMMAND_QUEUE_anchor}

 include::{generated}/api/version-notes/CL_EVENT_COMMAND_QUEUE.asciidoc[]
   | {cl_command_queue_TYPE}
       | Return the command-queue associated with _event_.
         For user event objects, a `NULL` value is returned.
 | {CL_EVENT_CONTEXT_anchor}

 include::{generated}/api/version-notes/CL_EVENT_CONTEXT.asciidoc[]
   | {cl_context_TYPE}
       | Return the context associated with _event_.
 | {CL_EVENT_COMMAND_TYPE_anchor}

 include::{generated}/api/version-notes/CL_EVENT_COMMAND_TYPE.asciidoc[]
   | {cl_command_type_TYPE}
       | Return the command type associated with _event_ as described in the
         <<event-command-type-table,Event Command Types>> table.

 | {CL_EVENT_COMMAND_EXECUTION_STATUS_anchor} footnote:[{fn-event-status-order}]

 include::{generated}/api/version-notes/CL_EVENT_COMMAND_EXECUTION_STATUS.asciidoc[]
   | {cl_int_TYPE}
       | Return the execution status of the command identified by _event_.
         Valid values are:

         {CL_QUEUED} (command has been enqueued in the command-queue),

         {CL_SUBMITTED} (enqueued command has been submitted by the host to the
         device associated with the command-queue),

         {CL_RUNNING} (device is currently executing this command),

         {CL_COMPLETE} (the command has completed), or

         an error code given by a negative integer value (the command was
         abnormally terminated; this may be caused by a bad memory access,
         etc.).
         These error codes come from the same set of error codes that are
         returned from the platform or runtime API calls as return values or
         errcode_ret values.
 | {CL_EVENT_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}]

 include::{generated}/api/version-notes/CL_EVENT_REFERENCE_COUNT.asciidoc[]
   | {cl_uint_TYPE}
       | Return the _event_ reference count.
 |====

 [[event-command-type-table]]
 .List of supported event command types
 [width="100%",cols="2,3",options="header"]
 |====
 | *Events Created By*
 | *Event Command Type*

 | {clEnqueueNDRangeKernel}
 | {CL_COMMAND_NDRANGE_KERNEL_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_NDRANGE_KERNEL.asciidoc[]

 | {clEnqueueTask}
 | {CL_COMMAND_TASK_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_TASK.asciidoc[]

 | {clEnqueueNativeKernel}
 | {CL_COMMAND_NATIVE_KERNEL_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_NATIVE_KERNEL.asciidoc[]

 | {clEnqueueReadBuffer}
 | {CL_COMMAND_READ_BUFFER_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_READ_BUFFER.asciidoc[]

 | {clEnqueueWriteBuffer}
 | {CL_COMMAND_WRITE_BUFFER_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_WRITE_BUFFER.asciidoc[]

 | {clEnqueueCopyBuffer}
 | {CL_COMMAND_COPY_BUFFER_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_COPY_BUFFER.asciidoc[]

 | {clEnqueueReadImage}
 | {CL_COMMAND_READ_IMAGE_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_READ_IMAGE.asciidoc[]

 | {clEnqueueWriteImage}
 | {CL_COMMAND_WRITE_IMAGE_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_WRITE_IMAGE.asciidoc[]

 | {clEnqueueCopyImage}
 | {CL_COMMAND_COPY_IMAGE_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_COPY_IMAGE.asciidoc[]

 | {clEnqueueCopyBufferToImage}
 | {CL_COMMAND_COPY_BUFFER_TO_IMAGE_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_COPY_BUFFER_TO_IMAGE.asciidoc[]

 | {clEnqueueCopyImageToBuffer}
 | {CL_COMMAND_COPY_IMAGE_TO_BUFFER_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_COPY_IMAGE_TO_BUFFER.asciidoc[]

 | {clEnqueueMapBuffer}
 | {CL_COMMAND_MAP_BUFFER_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_MAP_BUFFER.asciidoc[]

 | {clEnqueueMapImage}
 | {CL_COMMAND_MAP_IMAGE_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_MAP_IMAGE.asciidoc[]

 | {clEnqueueUnmapMemObject}
 | {CL_COMMAND_UNMAP_MEM_OBJECT_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_UNMAP_MEM_OBJECT.asciidoc[]

 | {clEnqueueMarker}, +
   {clEnqueueMarkerWithWaitList}
 | {CL_COMMAND_MARKER_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_MARKER.asciidoc[]

 | {clEnqueueReadBufferRect}
 | {CL_COMMAND_READ_BUFFER_RECT_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_READ_BUFFER_RECT.asciidoc[]

 | {clEnqueueWriteBufferRect}
 | {CL_COMMAND_WRITE_BUFFER_RECT_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_WRITE_BUFFER_RECT.asciidoc[]

 | {clEnqueueCopyBufferRect}
 | {CL_COMMAND_COPY_BUFFER_RECT_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_COPY_BUFFER_RECT.asciidoc[]

 | {clCreateUserEvent}
 | {CL_COMMAND_USER_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_USER.asciidoc[]

 | {clEnqueueBarrier}, +
   {clEnqueueBarrierWithWaitList}
 | {CL_COMMAND_BARRIER_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_BARRIER.asciidoc[]

 | {clEnqueueMigrateMemObjects}
 | {CL_COMMAND_MIGRATE_MEM_OBJECTS_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_MIGRATE_MEM_OBJECTS.asciidoc[]

 | {clEnqueueFillBuffer}
 | {CL_COMMAND_FILL_BUFFER_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_FILL_BUFFER.asciidoc[]

 | {clEnqueueFillImage}
 | {CL_COMMAND_FILL_IMAGE_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_FILL_IMAGE.asciidoc[]

 | {clEnqueueSVMFree}
 | {CL_COMMAND_SVM_FREE_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_SVM_FREE.asciidoc[]

 | {clEnqueueSVMMemcpy}
 | {CL_COMMAND_SVM_MEMCPY_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_SVM_MEMCPY.asciidoc[]

 | {clEnqueueSVMMemFill}
 | {CL_COMMAND_SVM_MEMFILL_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_SVM_MEMFILL.asciidoc[]

 | {clEnqueueSVMMap}
 | {CL_COMMAND_SVM_MAP_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_SVM_MAP.asciidoc[]

 | {clEnqueueSVMUnmap}
 | {CL_COMMAND_SVM_UNMAP_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_SVM_UNMAP.asciidoc[]

 | {clEnqueueSVMMigrateMem}
 | {CL_COMMAND_SVM_MIGRATE_MEM_anchor}

 include::{generated}/api/version-notes/CL_COMMAND_SVM_MIGRATE_MEM.asciidoc[]

 Prior to OpenCL 3.0, implementations should return
 {CL_COMMAND_MIGRATE_MEM_OBJECTS}, but may return an implementation-defined
 event command type for {clEnqueueSVMMigrateMem}.

 |====

 Using {clGetEventInfo} to determine if a command identified by _event_ has
 finished execution (i.e. {CL_EVENT_COMMAND_EXECUTION_STATUS} returns
 {CL_COMPLETE}) is not a synchronization point.
 There are no guarantees that the memory objects being modified by the command
 associated with _event_ will be visible to other enqueued commands.
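
When logging the value returned for {CL_EVENT_COMMAND_EXECUTION_STATUS}, a helper along the following lines may be convenient.
It is only a sketch: `exec_status_name` is a hypothetical name, `int` stands in for `cl_int`, and the constants are redefined locally (with the values used in `CL/cl.h`) so the fragment builds without the OpenCL headers.

```c
/* Values mirror <CL/cl.h>; defined here only so the sketch builds alone. */
#ifndef CL_COMPLETE
#define CL_COMPLETE   0x0
#define CL_RUNNING    0x1
#define CL_SUBMITTED  0x2
#define CL_QUEUED     0x3
#endif

/* Hypothetical helper: map an execution status to a printable name.
 * Any negative value means the command was abnormally terminated. */
const char *exec_status_name(int status)
{
    if (status < 0)
        return "error";
    switch (status) {
    case CL_QUEUED:    return "CL_QUEUED";
    case CL_SUBMITTED: return "CL_SUBMITTED";
    case CL_RUNNING:   return "CL_RUNNING";
    case CL_COMPLETE:  return "CL_COMPLETE";
    default:           return "unknown";
    }
}
```

Seeing `CL_COMPLETE` through such a query is informational only; it does not replace a wait on the event.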

 // refError

 {clGetEventInfo} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
     specified by _param_value_size_ is < size of return type as described in
     the <<event-info-table,Event Object Queries>> table and
     _param_value_ is not `NULL`.
   * {CL_INVALID_VALUE} if information to query given in _param_name_ cannot be
     queried for _event_.
   * {CL_INVALID_EVENT} if _event_ is not a valid event object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clSetEventCallback',desc='Registers a user callback function for a specific command execution status.',type='protos']
 --
 To register a user callback function for a specific command execution
 status, call the function

 include::{generated}/api/protos/clSetEventCallback.txt[]
 include::{generated}/api/version-notes/clSetEventCallback.asciidoc[]

   * _event_ is a valid event object.
   * _command_exec_callback_type_ specifies the command execution status for
     which the callback is registered.
     The command execution callback values for which a callback can be registered
     are: {CL_SUBMITTED}, {CL_RUNNING}, or
     {CL_COMPLETE} footnote:[{fn-event-callback-complete}].
     There is no guarantee that the callback functions registered for various
     execution status values for an event will be called in the exact order that
     the execution status of a command changes.
     Furthermore, receiving a callback for an event with a status other than
     {CL_COMPLETE} in no way implies that the memory model or execution model
     as defined by the OpenCL specification has changed.
     For example, it is not valid to assume that a corresponding memory transfer
     has completed unless the event is in a state {CL_COMPLETE}.
   * _pfn_event_notify_ is the event callback function that can be registered by
     the application.
     This callback function may be called asynchronously by the OpenCL
     implementation.
     It is the application's responsibility to ensure that the callback function
     is thread-safe.
     The parameters to this callback function are:
   ** _event_ is the event object for which the callback function is invoked.
   ** _event_command_status_ is equal to the _command_exec_callback_type_
      used while registering the callback.
      Refer to the <<event-info-table,Event Object Queries>>
      table for the command execution status values.
      If the callback is called as the result of the command associated with
      event being abnormally terminated, an appropriate error code for the
      error that caused the termination will be passed to
      _event_command_status_ instead.
   ** _user_data_ is a pointer to user supplied data.
   * _user_data_ will be passed as the _user_data_ argument when _pfn_notify_ is
     called.
     _user_data_ can be `NULL`.

 The registered callback function will be called when the execution status of
 the command associated with _event_ changes to an execution status equal to
 or past the status specified by _command_exec_callback_type_.
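
One way to read "equal to or past" is through the numeric values of the status constants, which decrease as a command advances and become negative on abnormal termination.
The sketch below models that reading; it is not part of the API.
`callback_is_due` and `status_passed_to_callback` are hypothetical names, `int` stands in for `cl_int`, and the constants are redefined locally with the values used in `CL/cl.h`.
Treating an error status as "past" every registered status is an assumption of the sketch.

```c
/* Values mirror <CL/cl.h>; defined here only so the sketch builds alone. */
#ifndef CL_COMPLETE
#define CL_COMPLETE   0x0
#define CL_RUNNING    0x1
#define CL_SUBMITTED  0x2
#define CL_QUEUED     0x3
#endif

/* Hypothetical helper: nonzero when a callback registered for
 * registered_type should already have been triggered by current_status.
 * Later states have smaller values, so "equal to or past" is "<=". */
int callback_is_due(int registered_type, int current_status)
{
    return current_status <= registered_type;
}

/* Hypothetical helper: the value handed to the callback as
 * event_command_status. Normally this is the registered type; after an
 * abnormal termination it is the error code instead. */
int status_passed_to_callback(int registered_type, int current_status)
{
    return current_status < 0 ? current_status : registered_type;
}
```

For example, a callback registered for `CL_RUNNING` is due once the command is observed as `CL_COMPLETE`, and it is still handed `CL_RUNNING` as its status argument.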

 Each call to {clSetEventCallback} registers the specified user callback
 function on a callback stack associated with _event_.
 The order in which the registered user callback functions are called is
 undefined.

 All callbacks registered for an event object must be called before the event
 object is destroyed.
 Callbacks should return promptly.

 Behavior is undefined when calling expensive system routines, OpenCL APIs to
 create contexts or command-queues, or blocking OpenCL APIs in an event callback.
 Rather than calling a blocking OpenCL API in an event callback, applications
 may call a non-blocking OpenCL API, then register a completion callback
 for the non-blocking OpenCL API with the remainder of the work.

 Because commands in a command-queue are not required to begin execution
 until the command-queue is flushed, callbacks that enqueue commands on a
 command-queue should either call {clFlush} on the queue before returning,
 or arrange for the command-queue to be flushed later.

 // refError

 {clSetEventCallback} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_EVENT} if _event_ is not a valid event object.
   * {CL_INVALID_VALUE} if _pfn_event_notify_ is `NULL` or if
     _command_exec_callback_type_ is not {CL_SUBMITTED}, {CL_RUNNING}, or
     {CL_COMPLETE}.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --


 [open,refpage='clRetainEvent',desc='Increments the event reference count.',type='protos']
 --
 To retain an event object, call the function

 include::{generated}/api/protos/clRetainEvent.txt[]
 include::{generated}/api/version-notes/clRetainEvent.asciidoc[]

   * _event_ is the event object to be retained.

 The _event_ reference count is incremented.
 The OpenCL commands that return an event perform an implicit retain.

 // refError

 {clRetainEvent} returns {CL_SUCCESS} if the function is executed successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_EVENT} if _event_ is not a valid event object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clReleaseEvent',desc='Decrements the event reference count.',type='protos']
 --
 To release an event object, call the function

 include::{generated}/api/protos/clReleaseEvent.txt[]
 include::{generated}/api/version-notes/clReleaseEvent.asciidoc[]

   * _event_ is the event object to be released.

 The _event_ reference count is decremented.

 The event object is deleted once the reference count becomes zero, the
 specific command identified by this event has completed (or terminated) and
 there are no commands in the command-queues of a context that require a wait
 for this event to complete.
 Using this function to release a reference that was not obtained by creating
 the object or by calling {clRetainEvent} causes undefined behavior.
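
The three conditions for deleting an event object can be written as a single predicate.
The following is a simplified model rather than implementation code: the `event_model` structure, its fields, and `event_may_be_deleted` are all hypothetical, and the constants are redefined locally with the values used in `CL/cl.h`.

```c
#include <stdbool.h>

/* Values mirror <CL/cl.h>; defined here only so the sketch builds alone. */
#ifndef CL_COMPLETE
#define CL_COMPLETE   0x0
#define CL_RUNNING    0x1
#endif

/* Hypothetical bookkeeping for one event object. */
typedef struct {
    unsigned ref_count;        /* application-visible reference count   */
    int      exec_status;      /* execution status of the command       */
    unsigned pending_waiters;  /* enqueued commands still waiting on it */
} event_model;

/* The event may be deleted only when all three conditions hold. */
bool event_may_be_deleted(const event_model *ev)
{
    bool finished = ev->exec_status == CL_COMPLETE || ev->exec_status < 0;
    return ev->ref_count == 0 && finished && ev->pending_waiters == 0;
}
```

In this model, dropping the reference count to zero while the command is still running, or while other commands wait on the event, does not make the object eligible for deletion.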

 [NOTE]
 ====
 Developers should be careful when releasing their last reference count on
 events created by {clCreateUserEvent} that have not yet been set to status
 of {CL_COMPLETE} or an error.
 If the user event was used in the event_wait_list argument passed to a
 *+clEnqueue*+* API or another application host thread is waiting for it in
 {clWaitForEvents}, those commands and host threads will continue to wait for
 the event status to reach {CL_COMPLETE} or error, even after the application
 has released the object.
 Since in this scenario the application has released its last reference count
 to the user event, it would be in principle no longer valid for the
 application to change the status of the event to unblock all the other
 machinery.
 As a result the waiting tasks will wait forever, and associated events,
 {cl_mem_TYPE} objects, command queues and contexts are likely to leak.
 In-order command queues caught up in this deadlock may cease to do any work.
 ====

 // refError

 {clReleaseEvent} returns {CL_SUCCESS} if the function is executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_EVENT} if _event_ is not a valid event object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --


 [[markers-barriers-waiting-for-events]]
 == Markers, Barriers and Waiting for Events

 [open,refpage='clEnqueueMarkerWithWaitList',desc='Enqueues a marker command which waits for either a list of events to complete, or all previously enqueued commands to complete.',type='protos']
 --
 To enqueue a marker command which waits for events or commands to complete,
 call the function

 include::{generated}/api/protos/clEnqueueMarkerWithWaitList.txt[]
 include::{generated}/api/version-notes/clEnqueueMarkerWithWaitList.asciidoc[]

   * _command_queue_ is a valid host command-queue.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
   * _event_ returns an event object that identifies this command and
     can be used to query or wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.

 If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
 If _event_wait_list_ is not `NULL`, the list of events pointed to by
 _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
 greater than 0.
 The events specified in _event_wait_list_ act as synchronization points.
 The context associated with events in _event_wait_list_ and _command_queue_
 must be the same.
 The memory associated with _event_wait_list_ can be reused or freed after
 the function returns.

 If _event_wait_list_ is `NULL`, then this particular command waits until all
 commands previously enqueued to _command_queue_ have completed.

 The marker command either waits for a list of events to complete, or if the
 list is empty it waits for all commands previously enqueued in
 _command_queue_ to complete before it completes.
 This command returns an _event_ that can be waited on to ensure that all
 events in the _event_wait_list_, or all commands enqueued to
 _command_queue_ before this command, have completed.
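
The pairing rule between _event_wait_list_ and _num_events_in_wait_list_ can be checked on the application side before enqueuing.
The sketch below shows one possible pre-check; `check_wait_list` is a hypothetical helper, the list is passed as an opaque pointer because the sketch does not include the OpenCL headers, and the constants are redefined locally with the values used in `CL/cl.h`.
It does not verify that the individual events are valid.

```c
#include <stddef.h>

/* Values mirror <CL/cl.h>; defined here only so the sketch builds alone. */
#ifndef CL_SUCCESS
#define CL_SUCCESS                  0
#define CL_INVALID_EVENT_WAIT_LIST  (-57)
#endif

/* Hypothetical pre-check: a NULL list must come with a count of 0, and a
 * non-NULL list must come with a count greater than 0. */
int check_wait_list(unsigned num_events_in_wait_list,
                    const void *event_wait_list)
{
    if (event_wait_list == NULL && num_events_in_wait_list > 0)
        return CL_INVALID_EVENT_WAIT_LIST;
    if (event_wait_list != NULL && num_events_in_wait_list == 0)
        return CL_INVALID_EVENT_WAIT_LIST;
    return CL_SUCCESS;
}
```

Passing `0` and `NULL` is the valid way to request a wait on all previously enqueued commands.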

 // refError

 {clEnqueueMarkerWithWaitList} returns {CL_SUCCESS} if the function is
 successfully executed.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events
     in _event_wait_list_ are not the same.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clEnqueueMarker',desc='Enqueues a marker command which waits for all previously enqueued commands to complete.',type='protos']
 --
 To enqueue a marker command which waits for previous commands to complete, call
 the function

 include::{generated}/api/protos/clEnqueueMarker.txt[]
 include::{generated}/api/version-notes/clEnqueueMarker.asciidoc[]

   * _command_queue_ is a valid host command-queue.
   * _event_ returns an event object that identifies this command and
     can be used to query or wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.

 The marker command waits for all commands previously enqueued in
 _command_queue_ to complete before it completes.
 This command returns an _event_ that can be waited on to ensure that all
 commands enqueued to _command_queue_ before this command have completed.

 // refError

 {clEnqueueMarker} returns {CL_SUCCESS} if the function is successfully
 executed.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_VALUE} if _event_ is `NULL`.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clEnqueueWaitForEvents',desc='Enqueues a wait on a list of events to complete.',type='protos']
 --
 To enqueue a wait for a specific event or a list of events to complete before
 any future commands queued in a command-queue are executed, call the function

 include::{generated}/api/protos/clEnqueueWaitForEvents.txt[]
 include::{generated}/api/version-notes/clEnqueueWaitForEvents.asciidoc[]

   * _command_queue_ is a valid host command-queue.
   * _event_list_ and _num_events_ specify events that need to complete before
     this particular command can be executed.

 // Note, this parameter is called event_list (like clWaitForEvents) rather than
 // event_wait_list(like clEnqueueMarkerWithWaitList etc.) because the function
 // predates wait lists (and CL_INVALID_EVENT_WAIT_LIST).

 The events specified in _event_list_ act as synchronization points.
 The context associated with events in _event_list_ and _command_queue_ must be
 the same.
 The memory associated with _event_list_ can be reused or freed after the
 function returns.

 // refError

 {clEnqueueWaitForEvents} returns {CL_SUCCESS} if the function is successfully
 executed.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events
     in _event_list_ are not the same.
   * {CL_INVALID_VALUE} if _num_events_ is 0 or _event_list_ is `NULL`.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clEnqueueBarrierWithWaitList',desc='A synchronization point that enqueues a barrier operation.',type='protos']
 --
 To enqueue a barrier command which waits for events or commands to complete,
 call the function

 include::{generated}/api/protos/clEnqueueBarrierWithWaitList.txt[]
 include::{generated}/api/version-notes/clEnqueueBarrierWithWaitList.asciidoc[]

   * _command_queue_ is a valid host command-queue.
   * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
     complete before this particular command can be executed.
   * If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
     If _event_wait_list_ is not `NULL`, the list of events pointed to by
     _event_wait_list_ must be valid and _num_events_in_wait_list_ must be
     greater than 0.
     The events specified in _event_wait_list_ act as synchronization points.
     The context associated with events in _event_wait_list_ and _command_queue_
     must be the same.
     The memory associated with _event_wait_list_ can be reused or freed after
     the function returns.
   * _event_ returns an event object that identifies this command and
     can be used to query or wait for this command to complete.
     If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
     created and therefore it will not be possible to query the status of this
     command or to wait for this command to complete.
     If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
     to an element of the _event_wait_list_ array.

 If _event_wait_list_ is `NULL`, then this particular command waits until all
 commands previously enqueued to _command_queue_ have completed.

 The barrier command either waits for a list of events to complete, or if the
 list is empty it waits for all commands previously enqueued in
 _command_queue_ to complete before it completes.
 This command blocks command execution, that is, any following commands
 enqueued after it do not execute until it completes.
 This command returns an _event_ that can be waited on to ensure that all
 events in the _event_wait_list_, or all commands enqueued to
 _command_queue_ before this command, have completed.

 // refError

 {clEnqueueBarrierWithWaitList} returns {CL_SUCCESS} if the function is
 successfully executed.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events
     in _event_wait_list_ are not the same.
   * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
     _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
     _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
     are not valid events.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clEnqueueBarrier',desc='A synchronization point that enqueues a barrier operation.',type='protos']
 --
 To enqueue a barrier command which waits for commands to complete, call the
 function

 include::{generated}/api/protos/clEnqueueBarrier.txt[]
 include::{generated}/api/version-notes/clEnqueueBarrier.asciidoc[]

   * _command_queue_ is a valid host command-queue.

 The barrier command waits for all commands previously enqueued in
 _command_queue_ to complete before it completes.
 This command blocks command execution, that is, any following commands
 enqueued after it do not execute until it completes.
 // TODO clEnqueueBarrierWithWaitList doesn't say synchronization point, should
 // it, or should the next line be removed?  The main difference is that
 // clEnqueueBarrierWithWaitList returns an event, which is the synchronization
 // point.
 The barrier command is a synchronization point.

 // refError

 {clEnqueueBarrier} returns {CL_SUCCESS} if the function is successfully
 executed.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 == Out-of-order Execution of Kernels and Memory Object Commands

 The OpenCL functions that are submitted to a command-queue are enqueued in
 the order the calls are made but can be configured to execute in-order or
 out-of-order.
 The _properties_ argument in {clCreateCommandQueueWithProperties} or
 {clCreateCommandQueue} can be used to specify the execution order.

 If the {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} property of a command-queue is
 not set, the commands enqueued to a command-queue execute in order.
 For example, if an application calls {clEnqueueNDRangeKernel} to execute
 kernel A followed by a {clEnqueueNDRangeKernel} to execute kernel B, the
 application can assume that kernel A finishes first and then kernel B is
 executed.
 If the memory objects output by kernel A are inputs to kernel B then kernel
 B will see the correct data in memory objects produced by execution of
 kernel A.
 If the {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} property of a command-queue is
 set, then there is no guarantee that kernel A will finish before kernel B
 starts execution.

 Applications can configure the commands enqueued to a command-queue to
 execute out-of-order by setting the {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE}
 property of the command-queue.
 This can be specified when the command-queue is created.
 In out-of-order execution mode there is no guarantee that the enqueued
 commands will finish execution in the order they were queued.
 As there is no guarantee that kernels will be executed in order, i.e. based
 on when the {clEnqueueNDRangeKernel} or {clEnqueueTask} calls are made within a
 command-queue, it is therefore possible that an earlier
 {clEnqueueNDRangeKernel} call to execute kernel A identified by event A may
 execute and/or finish later than a {clEnqueueNDRangeKernel} call to execute
 kernel B which was called by the application at a later point in time.
 To guarantee a specific order of execution of kernels, a wait on a
 particular event (in this case event A) can be used.
 The wait for event A can be specified in the _event_wait_list_ argument to
 {clEnqueueNDRangeKernel} for kernel B.

 In addition, a marker ({clEnqueueMarker} or {clEnqueueMarkerWithWaitList}) or a
 barrier ({clEnqueueBarrier} or {clEnqueueBarrierWithWaitList}) command can be
 enqueued to the command-queue.
 The marker command ensures that previously enqueued commands identified by
 the list of events to wait for (or all previous commands) have finished.
 A barrier command is similar to a marker command, but additionally
 guarantees that no later-enqueued commands will execute until the waited-for
 commands have executed.
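
The difference between in-order execution, event waits, markers and barriers can be illustrated with a toy host-side model of a queue.
This is only a sketch under simplifying assumptions: the `cmd` structure, `cmd_kind` and `can_start` are hypothetical, each command waits on at most one earlier command, and nothing here calls the OpenCL API.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { CMD_KERNEL, CMD_MARKER, CMD_BARRIER } cmd_kind;

/* Hypothetical record for one enqueued command. */
typedef struct {
    cmd_kind kind;
    int      wait_on;  /* index of an earlier command to wait on, or -1 */
    bool     done;     /* true once the command has completed           */
} cmd;

/* True when every command enqueued before index i has completed. */
bool all_done_before(const cmd *q, size_t i)
{
    for (size_t j = 0; j < i; ++j)
        if (!q[j].done)
            return false;
    return true;
}

/* May command i begin executing, given the state of the queue? */
bool can_start(const cmd *q, size_t i, bool out_of_order)
{
    if (!out_of_order)                 /* in-order: strictly sequential */
        return all_done_before(q, i);
    for (size_t j = 0; j < i; ++j)     /* an unfinished barrier blocks  */
        if (q[j].kind == CMD_BARRIER && !q[j].done)
            return false;
    if (q[i].wait_on >= 0)             /* explicit event dependency     */
        return q[(size_t)q[i].wait_on].done;
    if (q[i].kind != CMD_KERNEL)       /* marker/barrier, empty list    */
        return all_done_before(q, i);
    return true;                       /* independent kernel            */
}
```

In this model a kernel placed after an unfinished marker may still start in an out-of-order queue, whereas a kernel placed after an unfinished barrier may not.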

 Similarly, commands to read, write, copy or map memory objects that are
 enqueued after {clEnqueueNDRangeKernel}, {clEnqueueTask} or
 {clEnqueueNativeKernel} commands are not guaranteed to wait for kernels
 scheduled for execution to have completed (if the
 {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} property is set).
 To ensure correct ordering of commands, the event object returned by
 {clEnqueueNDRangeKernel}, {clEnqueueTask} or {clEnqueueNativeKernel} can be
 used to enqueue a wait for that event, or a barrier command can be enqueued
 that must complete before reads or writes to the memory object(s) occur.


 [[profiling-operations]]
 == Profiling Operations on Memory Objects and Kernels

 This section describes the profiling of OpenCL functions that are enqueued
 as commands to a command-queue. Profiling of OpenCL commands can be enabled
 by using a command-queue created with the {CL_QUEUE_PROFILING_ENABLE}
 flag set in the {CL_QUEUE_PROPERTIES} bitfield in the _properties_ argument to
 {clCreateCommandQueueWithProperties}, or in the _properties_ argument to
 {clCreateCommandQueue}.
 When profiling is enabled, the event objects that are created from
 enqueuing a command store a timestamp for each of their state transitions.

 [open,refpage='clGetEventProfilingInfo',desc='Returns profiling information for the command associated with event if profiling is enabled.',type='protos']
 --
 To return profiling information for a command associated with an event when
 profiling is enabled, call the function

 include::{generated}/api/protos/clGetEventProfilingInfo.txt[]
 include::{generated}/api/version-notes/clGetEventProfilingInfo.asciidoc[]

   * _event_ specifies the event object.
   * _param_name_ specifies the profiling data to query.
     The list of supported _param_name_ types and the information returned in
     _param_value_ by {clGetEventProfilingInfo} is described in the
     <<event-profiling-info-table,Event Profiling Queries>> table.
   * _param_value_ is a pointer to memory where the appropriate result being
     queried is returned.
     If _param_value_ is `NULL`, it is ignored.
   * _param_value_size_ is used to specify the size in bytes of memory pointed to
     by _param_value_.
      This size must be {geq} the size of the return type as described in the
      <<event-profiling-info-table,Event Profiling Queries>> table.
   * _param_value_size_ret_ returns the actual size in bytes of data being
     queried by _param_name_.
     If _param_value_size_ret_ is `NULL`, it is ignored.

 [[event-profiling-info-table]]
 .List of supported param_names by <<clGetEventProfilingInfo>>
 [width="100%",cols="<33%,<17%,<50%",options="header"]
 |====
 | Event Profiling Info | Return Type | Description
 | {CL_PROFILING_COMMAND_QUEUED_anchor}

 include::{generated}/api/version-notes/CL_PROFILING_COMMAND_QUEUED.asciidoc[]
   | {cl_ulong_TYPE}
       | A 64-bit value that describes the current device time counter in
         nanoseconds when the command identified by event is enqueued in a
         command-queue by the host.
 | {CL_PROFILING_COMMAND_SUBMIT_anchor}

 include::{generated}/api/version-notes/CL_PROFILING_COMMAND_SUBMIT.asciidoc[]
   | {cl_ulong_TYPE}
       | A 64-bit value that describes the current device time counter in
         nanoseconds when the command identified by event that has been
         enqueued is submitted by the host to the device associated with the
         command-queue.
 | {CL_PROFILING_COMMAND_START_anchor}

 include::{generated}/api/version-notes/CL_PROFILING_COMMAND_START.asciidoc[]
   | {cl_ulong_TYPE}
       | A 64-bit value that describes the current device time counter in
         nanoseconds when the command identified by event starts execution on
         the device.
 | {CL_PROFILING_COMMAND_END_anchor}

 include::{generated}/api/version-notes/CL_PROFILING_COMMAND_END.asciidoc[]
   | {cl_ulong_TYPE}
       | A 64-bit value that describes the current device time counter in
         nanoseconds when the command identified by event has finished
         execution on the device.
 | {CL_PROFILING_COMMAND_COMPLETE_anchor}

 include::{generated}/api/version-notes/CL_PROFILING_COMMAND_COMPLETE.asciidoc[]
   | {cl_ulong_TYPE}
       | A 64-bit value that describes the current device time counter in
         nanoseconds when the command identified by event and any child
         commands enqueued by this command on the device have finished
         execution.
 |====

 The unsigned 64-bit values returned can be used to measure the time in
 nanoseconds consumed by OpenCL commands.
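
 As a minimal sketch of that measurement, the helpers below subtract
 timestamps that an application is assumed to have already read back with
 {clGetEventProfilingInfo}.
 The `profile_times` struct and the function names are hypothetical,
 `uint64_t` stands in for the 64-bit {cl_ulong_TYPE} values, and the code
 assumes the timestamps are non-decreasing from queued through end.

```c
#include <stdint.h>

/* Hypothetical container for four timestamps, in nanoseconds, as they
 * would be returned for the corresponding profiling queries. */
typedef struct {
    uint64_t queued;  /* value for CL_PROFILING_COMMAND_QUEUED */
    uint64_t submit;  /* value for CL_PROFILING_COMMAND_SUBMIT */
    uint64_t start;   /* value for CL_PROFILING_COMMAND_START  */
    uint64_t end;     /* value for CL_PROFILING_COMMAND_END    */
} profile_times;

/* Nanoseconds the command spent executing on the device.
 * Assumes end >= start; unsigned subtraction would wrap otherwise. */
uint64_t exec_ns(const profile_times *t)
{
    return t->end - t->start;
}

/* Nanoseconds between the host enqueuing the command and the device
 * starting to execute it. Assumes start >= queued. */
uint64_t latency_ns(const profile_times *t)
{
    return t->start - t->queued;
}

/* Convert a nanosecond count to milliseconds for reporting. */
double ns_to_ms(uint64_t ns)
{
    return (double)ns / 1.0e6;
}
```

 Only differences between timestamps are used here; the absolute counter
 values are not compared against any host clock.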

 OpenCL devices are required to correctly track time across changes in device
 frequency and power states.
 The {CL_DEVICE_PROFILING_TIMER_RESOLUTION} device query specifies the
 resolution of the timer, i.e. the number of nanoseconds elapsed before the
 timer is incremented.
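
 One way to read the resolution is as the smallest step a measured interval
 can change by.
 The sketch below, with hypothetical helper names, assumes the resolution has
 already been queried as a nanosecond count and reports how many whole timer
 increments fit in a measured interval; an interval shorter than one increment
 cannot be resolved reliably.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Number of whole timer increments that fit in interval_ns.
 * resolution_ns stands in for the value reported for
 * CL_DEVICE_PROFILING_TIMER_RESOLUTION. Returns 0 if resolution_ns is 0,
 * to avoid dividing by zero. */
uint64_t ticks_in_interval(uint64_t interval_ns, size_t resolution_ns)
{
    if (resolution_ns == 0) {
        return 0;
    }
    return interval_ns / (uint64_t)resolution_ns;
}

/* True when the interval spans at least one full timer increment. */
bool interval_is_resolvable(uint64_t interval_ns, size_t resolution_ns)
{
    return ticks_in_interval(interval_ns, resolution_ns) >= 1;
}
```

 For instance, with a resolution of 80 nanoseconds, a 5000 nanosecond interval
 spans 62 whole increments, while a 30 nanosecond interval spans none.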

 // refError

 {clGetEventProfilingInfo} returns {CL_SUCCESS} if the function is executed
 successfully and the profiling information has been recorded.
 Otherwise, it returns one of the following errors:

   * {CL_PROFILING_INFO_NOT_AVAILABLE} if the {CL_QUEUE_PROFILING_ENABLE} flag is
     not set for the command-queue, if the execution status of the command
     identified by _event_ is not {CL_COMPLETE} or if _event_ is a user event
     object.
   * {CL_INVALID_VALUE} if _param_name_ is not valid, or if the size in bytes
     specified by _param_value_size_ is < the size of the return type as
     described in the <<event-profiling-info-table,Event Profiling Queries>>
     table and _param_value_ is not `NULL`.
   * {CL_INVALID_EVENT} if _event_ is not a valid event object.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --


 == Flush and Finish

 [open,refpage='clFlush',desc='Issues all previously queued OpenCL commands in a command-queue to the device associated with the command-queue.',type='protos']
 --
 To flush commands to a device, call the function

 include::{generated}/api/protos/clFlush.txt[]
 include::{generated}/api/version-notes/clFlush.asciidoc[]

   * _command_queue_ is the command-queue to flush.

 All previously queued OpenCL commands in _command_queue_ are issued to the
 device associated with _command_queue_.
 {clFlush} only guarantees that all queued commands to _command_queue_ will
 eventually be submitted to the appropriate device.
 There is no guarantee that they will be complete after {clFlush} returns.

 Any blocking commands queued in a command-queue and {clReleaseCommandQueue}
 perform an implicit flush of the command-queue.
 These blocking commands are {clEnqueueReadBuffer},
 {clEnqueueReadBufferRect} and {clEnqueueReadImage} with _blocking_read_ set to
 {CL_TRUE}; {clEnqueueWriteBuffer}, {clEnqueueWriteBufferRect} and
 {clEnqueueWriteImage} with _blocking_write_ set to {CL_TRUE};
 {clEnqueueMapBuffer} and {clEnqueueMapImage} with _blocking_map_ set to
 {CL_TRUE}; {clEnqueueSVMMemcpy} with _blocking_copy_ set to {CL_TRUE};
 {clEnqueueSVMMap} with _blocking_map_ set to {CL_TRUE}; and {clWaitForEvents}.

 To use event objects that refer to commands enqueued in one command-queue as
 events to wait on for commands enqueued in a different command-queue, the
 application must call {clFlush}, or any blocking command that performs an
 implicit flush, on the command-queue in which the commands that these event
 objects refer to are enqueued.

 // refError

 {clFlush} returns {CL_SUCCESS} if the function call was executed successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --

 [open,refpage='clFinish',desc='Blocks until all previously queued OpenCL commands in a command-queue are issued to the associated device and have completed.',type='protos']
 --
 To wait for completion of commands on a device, call the function

 include::{generated}/api/protos/clFinish.txt[]
 include::{generated}/api/version-notes/clFinish.asciidoc[]

   * _command_queue_ is the command-queue to wait for.

 All previously queued OpenCL commands in _command_queue_ are issued to the
 associated device, and {clFinish} does not return until all of these commands
 have been processed and completed.
 {clFinish} is also a synchronization point.

 // refError

 {clFinish} returns {CL_SUCCESS} if the function call was executed
 successfully.
 Otherwise, it returns one of the following errors:

   * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
     command-queue.
   * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
     by the OpenCL implementation on the device.
   * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
     required by the OpenCL implementation on the host.
 --