// Copyright 2017-2020 The Khronos Group. This work is licensed under a
// Creative Commons Attribution 4.0 International License; see
// http://creativecommons.org/licenses/by/4.0/
[[opencl-runtime]]
= The OpenCL Runtime
In this section we describe the API calls that manage OpenCL objects such as
command-queues, memory objects, program objects, and kernel objects for kernel
functions in a program, and the calls that allow you to enqueue commands to a
command-queue such as executing a kernel, reading, or writing a memory
object.
== Command Queues
OpenCL objects such as memory, program and kernel objects are created using
a context.
Operations on these objects are performed using a command-queue.
The command-queue can be used to queue a set of operations (referred to as
commands) in order.
Having multiple command-queues allows applications to queue multiple
independent commands without requiring synchronization.
Note that this works only as long as the command-queues do not share objects.
Sharing of objects across multiple command-queues will require the
application to perform appropriate synchronization.
This is described in <<shared-opencl-objects, Shared OpenCL Objects>>.
[open,refpage='clCreateCommandQueueWithProperties',desc='Create a host or device command-queue on a specific device.',type='protos']
--
To create a host or device command-queue on a specific device, call the
function
include::{generated}/api/protos/clCreateCommandQueueWithProperties.txt[]
include::{generated}/api/version-notes/clCreateCommandQueueWithProperties.asciidoc[]
Also see extension *cl_khr_create_command_queue*.
* _context_ must be a valid OpenCL context.
* _device_ must be a device or sub-device associated with _context_.
It can either be in the list of devices and sub-devices specified when
_context_ is created using {clCreateContext} or be a root device with the
same device type as specified when _context_ is created using
{clCreateContextFromType}.
* _properties_ specifies a list of properties for the command-queue and their
corresponding values.
Each property name is immediately followed by the corresponding desired
value.
The list is terminated with 0.
The list of supported properties is described in the
<<queue-properties-table, table below>>.
If a supported property and its value are not specified in _properties_, its
default value will be used.
_properties_ can be `NULL` in which case the default values for supported
command-queue properties will be used.
* _errcode_ret_ will return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
[[queue-properties-table]]
.List of supported queue creation properties by <<clCreateCommandQueueWithProperties>>
[width="100%",cols="<33%,<17%,<50%",options="header"]
|====
| Queue Property | Property Value | Description
| {CL_QUEUE_PROPERTIES_anchor}
include::{generated}/api/version-notes/CL_QUEUE_PROPERTIES.asciidoc[]
| {cl_command_queue_properties_TYPE}
| This is a bitfield and can be set to a combination of the following
values:
{CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE_anchor} - Determines whether the
commands queued in the command-queue are executed in-order or
out-of-order.
If set, the commands in the command-queue are executed out-of-order.
Otherwise, commands are executed in-order.
include::{generated}/api/version-notes/CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE.asciidoc[]
{CL_QUEUE_PROFILING_ENABLE_anchor} - Enable or disable profiling of commands
in the command-queue.
If set, the profiling of commands is enabled.
Otherwise profiling of commands is disabled.
include::{generated}/api/version-notes/CL_QUEUE_PROFILING_ENABLE.asciidoc[]
{CL_QUEUE_ON_DEVICE_anchor} - Indicates that this is a device queue.
If {CL_QUEUE_ON_DEVICE} is set,
{CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE}
footnote:[{fn-out-of-order-device-queue}]
must also be set.
include::{generated}/api/version-notes/CL_QUEUE_ON_DEVICE.asciidoc[]
{CL_QUEUE_ON_DEVICE_DEFAULT_anchor}
footnote:[{fn-default-device-queue}] -
indicates that this is the default device queue.
This can only be used with {CL_QUEUE_ON_DEVICE}.
include::{generated}/api/version-notes/CL_QUEUE_ON_DEVICE_DEFAULT.asciidoc[]
If {CL_QUEUE_PROPERTIES} is not specified, an in-order host command
queue is created for the specified device.
| {CL_QUEUE_SIZE_anchor}
include::{generated}/api/version-notes/CL_QUEUE_SIZE.asciidoc[]
| {cl_uint_TYPE}
| Specifies the size of the device queue in bytes.
This can only be specified if {CL_QUEUE_ON_DEVICE} is set in
{CL_QUEUE_PROPERTIES}.
This must be a value {leq} {CL_DEVICE_QUEUE_ON_DEVICE_MAX_SIZE}.
For best performance, this should be {leq}
{CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE}.
If {CL_QUEUE_SIZE} is not specified, the device queue is created with
{CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE} as the size of the queue.
|====
// refError
{clCreateCommandQueueWithProperties} returns a valid non-zero command-queue
and _errcode_ret_ is set to {CL_SUCCESS} if the command-queue is created
successfully.
Otherwise, it returns a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_DEVICE} if _device_ is not a valid device or is not associated
with _context_.
* {CL_INVALID_VALUE} if values specified in _properties_ are not valid.
* {CL_INVALID_QUEUE_PROPERTIES} if values specified in _properties_ are
valid but are not supported by the device.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
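
The layout of the _properties_ list and the flag rules in the table above can be modelled in a few lines of plain C.
This is only a sketch: `ex_queue_properties`, the `EX_*` names and `ex_parse_queue_properties` are hypothetical stand-ins invented for the example, the `-1` result stands in for whatever error an implementation reports, and the device limits on the queue size are not checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for cl_queue_properties and the CL_QUEUE_* names.
 * A real program would include the OpenCL headers and use those instead. */
typedef uint64_t ex_queue_properties;

enum { EX_QUEUE_PROPERTIES = 0x1093, EX_QUEUE_SIZE = 0x1094 }; /* property names */
enum {                                                         /* bitfield values */
    EX_OUT_OF_ORDER      = 1 << 0,
    EX_PROFILING         = 1 << 1,
    EX_ON_DEVICE         = 1 << 2,
    EX_ON_DEVICE_DEFAULT = 1 << 3
};

typedef struct {
    uint64_t flags;    /* value listed for EX_QUEUE_PROPERTIES, or the default 0 */
    int      has_size; /* nonzero if EX_QUEUE_SIZE was listed */
    uint64_t size;     /* value listed for EX_QUEUE_SIZE */
} ex_queue_config;

/* Walks a well-formed, zero-terminated name/value list.
 * Returns 0 if the list is acceptable, -1 otherwise. */
int ex_parse_queue_properties(const ex_queue_properties *props,
                              ex_queue_config *out)
{
    assert(out != NULL);
    out->flags = 0;      /* default: in-order host queue, profiling off */
    out->has_size = 0;
    out->size = 0;

    if (props != NULL) { /* NULL means "use the defaults for everything" */
        for (size_t i = 0; props[i] != 0; i += 2) {
            switch (props[i]) {
            case EX_QUEUE_PROPERTIES:
                out->flags = props[i + 1];
                break;
            case EX_QUEUE_SIZE:
                out->has_size = 1;
                out->size = props[i + 1];
                break;
            default:
                return -1; /* unknown property name */
            }
        }
    }

    if ((out->flags & EX_ON_DEVICE) && !(out->flags & EX_OUT_OF_ORDER))
        return -1; /* device queues must also be out-of-order */
    if ((out->flags & EX_ON_DEVICE_DEFAULT) && !(out->flags & EX_ON_DEVICE))
        return -1; /* "default" only applies to device queues */
    if (out->has_size && !(out->flags & EX_ON_DEVICE))
        return -1; /* a size only applies to device queues */
    return 0;
}
```

Under this model, a list such as `{ EX_QUEUE_PROPERTIES, EX_PROFILING, 0 }` describes an in-order host queue with profiling enabled, while a list that sets `EX_ON_DEVICE` without `EX_OUT_OF_ORDER` is rejected.
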
[open,refpage='clCreateCommandQueue',desc='Create a host command-queue on a specific device.',type='protos']
--
To create a host command-queue on a specific device, call the function
include::{generated}/api/protos/clCreateCommandQueue.txt[]
include::{generated}/api/version-notes/clCreateCommandQueue.asciidoc[]
* _context_ must be a valid OpenCL context.
* _device_ must be a device or sub-device associated with _context_.
It can either be in the list of devices and sub-devices specified when
_context_ is created using {clCreateContext} or be a root device with the
same device type as specified when _context_ is created using
{clCreateContextFromType}.
* _properties_ specifies a list of properties for the command-queue.
This is a bit-field and the supported properties are described in the
<<legacy-queue-properties-table, table>> below.
Only command-queue properties specified in this table can be used,
otherwise the value specified in _properties_ is considered to be not
valid.
_properties_ can be 0 in which case the default values for supported
command-queue properties will be used.
[[legacy-queue-properties-table]]
.List of supported `cl_command_queue_property` values by <<clCreateCommandQueue>>
[width="100%",cols="<50%,<50%",options="header"]
|====
| Command-Queue Properties | Description
| {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE}
| Determines whether the commands queued in the command-queue are executed
in-order or out-of-order.
If set, the commands in the command-queue are executed out-of-order.
Otherwise, commands are executed in-order.
| {CL_QUEUE_PROFILING_ENABLE}
| Enable or disable profiling of commands in the command-queue.
If set, the profiling of commands is enabled.
Otherwise profiling of commands is disabled.
|====
* _errcode_ret_ will return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
// refError
{clCreateCommandQueue} returns a valid non-zero command-queue and _errcode_ret_
is set to {CL_SUCCESS} if the command-queue is created successfully.
Otherwise, it returns a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_DEVICE} if _device_ is not a valid device or is not associated
with _context_.
* {CL_INVALID_VALUE} if values specified in _properties_ are not valid.
* {CL_INVALID_QUEUE_PROPERTIES} if values specified in _properties_ are
valid but are not supported by the device.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clSetDefaultDeviceCommandQueue',desc='Replaces the default command queue on the device.',type='protos']
--
To replace the default command queue on a device, call the function
include::{generated}/api/protos/clSetDefaultDeviceCommandQueue.txt[]
include::{generated}/api/version-notes/clSetDefaultDeviceCommandQueue.asciidoc[]
* _context_ is the OpenCL context used to create _command_queue_.
* _device_ is a valid OpenCL device associated with _context_.
* _command_queue_ specifies a command queue object which replaces the
default device command queue.
{clSetDefaultDeviceCommandQueue} may be used to replace a default device
command queue created with {clCreateCommandQueueWithProperties} and the
{CL_QUEUE_ON_DEVICE_DEFAULT} flag.
// refError
{clSetDefaultDeviceCommandQueue} returns {CL_SUCCESS} if the function is
executed successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_DEVICE} if _device_ is not a valid device or is not associated
with _context_.
* {CL_INVALID_OPERATION} if _device_ does not support a replaceable default on-device queue.
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid command-queue
for _device_.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clRetainCommandQueue',desc='Increments the command_queue reference count.',type='protos']
--
To retain a command queue, call the function
include::{generated}/api/protos/clRetainCommandQueue.txt[]
include::{generated}/api/version-notes/clRetainCommandQueue.asciidoc[]
* _command_queue_ specifies the command-queue to be retained.
The _command_queue_ reference count is incremented.
{clCreateCommandQueueWithProperties} and {clCreateCommandQueue} perform an
implicit retain.
This is very helpful for 3^rd^ party libraries, which typically get a
command-queue passed to them by the application.
However, it is possible that the application may delete the command-queue
without informing the library.
Allowing functions to attach to (i.e. retain) and release a command-queue
solves the problem of a command-queue being used by a library no longer
being valid.
// refError
{clRetainCommandQueue} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid
command-queue.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clReleaseCommandQueue',desc='Decrements the command_queue reference count.',type='protos']
--
To release a command queue, call the function
include::{generated}/api/protos/clReleaseCommandQueue.txt[]
include::{generated}/api/version-notes/clReleaseCommandQueue.asciidoc[]
* _command_queue_ specifies the command-queue to be released.
The _command_queue_ reference count is decremented.
After the _command_queue_ reference count becomes zero and all commands
queued to _command_queue_ have finished (e.g. kernel-instances, memory
object updates, etc.), the command-queue is deleted.
{clReleaseCommandQueue} performs an implicit flush to issue any previously
queued OpenCL commands in _command_queue_.
Using this function to release a reference that was not obtained by creating
the object or by calling {clRetainCommandQueue} causes undefined behavior.
// refError
{clReleaseCommandQueue} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid
command-queue.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
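
The retain and release rules above can be illustrated with a small reference-counting model.
This is a sketch under stated assumptions, not how any implementation is written: `ex_queue` and the `ex_queue_*` functions are hypothetical, and the `pending` counter is a simplification of "all commands queued have finished".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a command-queue's lifetime; not a real OpenCL type. */
typedef struct {
    unsigned refcount; /* references held by the application and libraries */
    unsigned pending;  /* commands queued but not yet finished */
    int      deleted;  /* set once the queue has been destroyed */
} ex_queue;

static void ex_maybe_delete(ex_queue *q)
{
    /* Deletion needs both conditions: no references and no unfinished work. */
    if (q->refcount == 0 && q->pending == 0)
        q->deleted = 1;
}

/* Creation performs the implicit retain: the creator holds one reference. */
void ex_queue_create(ex_queue *q)
{
    assert(q != NULL);
    q->refcount = 1;
    q->pending = 0;
    q->deleted = 0;
}

void ex_queue_retain(ex_queue *q)
{
    assert(q != NULL && q->refcount > 0); /* must still be a valid queue */
    q->refcount++;
}

void ex_queue_release(ex_queue *q)
{
    /* Releasing a reference that was never obtained is undefined behavior
     * in the real API; the model simply asserts against it. */
    assert(q != NULL && q->refcount > 0);
    q->refcount--;
    ex_maybe_delete(q);
}

void ex_queue_enqueue(ex_queue *q)
{
    assert(q != NULL && !q->deleted);
    q->pending++;
}

void ex_queue_command_finished(ex_queue *q)
{
    assert(q != NULL && q->pending > 0);
    q->pending--;
    ex_maybe_delete(q);
}
```

In this model, a library that calls `ex_queue_retain` on a queue it was handed keeps the queue alive after the application releases its own reference; the queue is only marked deleted once the library has released it as well and the pending commands have finished.
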
[open,refpage='clGetCommandQueueInfo',desc='Query information about a command-queue.',type='protos']
--
To query information about a command-queue, call the function
include::{generated}/api/protos/clGetCommandQueueInfo.txt[]
include::{generated}/api/version-notes/clGetCommandQueueInfo.asciidoc[]
* _command_queue_ specifies the command-queue being queried.
* _param_name_ specifies the information to query.
* _param_value_ is a pointer to memory where the appropriate result being
queried is returned.
If _param_value_ is `NULL`, it is ignored.
* _param_value_size_ is used to specify the size in bytes of memory pointed to
by _param_value_.
This size must be {geq} size of return type as described in the
<<command-queue-param-table,Command Queue Parameter>> table.
If _param_value_ is `NULL`, it is ignored.
* _param_value_size_ret_ returns the actual size in bytes of data being
queried by _param_name_.
If _param_value_size_ret_ is `NULL`, it is ignored.
The list of supported _param_name_ values and the information returned in
_param_value_ by {clGetCommandQueueInfo} is described in the
<<command-queue-param-table,Command Queue Parameter>> table.
[[command-queue-param-table]]
.List of supported param_names by <<clGetCommandQueueInfo>>
[width="100%",cols="<33%,<17%,<50%",options="header"]
|====
| Queue Info | Return Type | Description
| {CL_QUEUE_CONTEXT_anchor}
include::{generated}/api/version-notes/CL_QUEUE_CONTEXT.asciidoc[]
| {cl_context_TYPE}
| Return the context specified when the command-queue is created.
| {CL_QUEUE_DEVICE_anchor}
include::{generated}/api/version-notes/CL_QUEUE_DEVICE.asciidoc[]
| {cl_device_id_TYPE}
| Return the device specified when the command-queue is created.
| {CL_QUEUE_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}]
include::{generated}/api/version-notes/CL_QUEUE_REFERENCE_COUNT.asciidoc[]
| {cl_uint_TYPE}
| Return the command-queue reference count.
| {CL_QUEUE_PROPERTIES_anchor}
include::{generated}/api/version-notes/CL_QUEUE_PROPERTIES.asciidoc[]
| {cl_command_queue_properties_TYPE}
| Return the currently specified properties for the command-queue.
These properties are specified by the value associated with the
{CL_QUEUE_PROPERTIES} passed in _properties_ argument in
{clCreateCommandQueueWithProperties}, or the value of the _properties_
argument in {clCreateCommandQueue}.
| {CL_QUEUE_PROPERTIES_ARRAY_anchor}
include::{generated}/api/version-notes/CL_QUEUE_PROPERTIES_ARRAY.asciidoc[]
| {cl_queue_properties_TYPE}[]
| Return the properties argument specified in
{clCreateCommandQueueWithProperties}.
If the _properties_ argument specified in
{clCreateCommandQueueWithProperties} used to create _command_queue_
was not `NULL`, the implementation must return the values specified in
the properties argument in the same order and without including
additional properties.
If _command_queue_ was created using {clCreateCommandQueue}, or if the
_properties_ argument specified in {clCreateCommandQueueWithProperties}
was `NULL`, the implementation must return _param_value_size_ret_
equal to 0, indicating that there are no properties to be returned.
| {CL_QUEUE_SIZE_anchor}
include::{generated}/api/version-notes/CL_QUEUE_SIZE.asciidoc[]
| {cl_uint_TYPE}
| Return the size of the device command-queue.
To be considered valid for this query, _command_queue_ must be a
device command-queue.
| {CL_QUEUE_DEVICE_DEFAULT_anchor}
include::{generated}/api/version-notes/CL_QUEUE_DEVICE_DEFAULT.asciidoc[]
| {cl_command_queue_TYPE}
| Return the current default command queue for the underlying device.
|====
// refError
{clGetCommandQueueInfo} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid
command-queue, or if _command_queue_ is not a valid command-queue
for _param_name_.
* {CL_INVALID_VALUE} if _param_name_ is not one of the supported values or
if size in bytes specified by _param_value_size_ is < size of return
type as specified in the <<command-queue-param-table,Command Queue
Parameter>> table, and _param_value_ is not a `NULL` value.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
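
The handling of _param_value_, _param_value_size_ and _param_value_size_ret_ described above follows a pattern that can be sketched independently of any particular query.
The helper below is hypothetical (`ex_return_info` is not an OpenCL function) and uses `-1` as a stand-in for the invalid-value error; it only models how a value of a known size is handed back to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies a query result of value_size bytes to the caller, following the
 * size rules described for the info query. Returns 0 on success and -1
 * when the caller's buffer is too small. */
int ex_return_info(const void *value, size_t value_size,
                   size_t param_value_size, void *param_value,
                   size_t *param_value_size_ret)
{
    assert(value != NULL || value_size == 0);

    if (param_value != NULL) {
        /* The size is only checked when a destination was supplied. */
        if (param_value_size < value_size)
            return -1;
        if (value_size > 0)
            memcpy(param_value, value, value_size);
    }
    if (param_value_size_ret != NULL)
        *param_value_size_ret = value_size; /* actual size of the data */
    return 0;
}
```

A common way to use such a query is to call it once with a `NULL` destination to learn the size, then again with a buffer of at least that size.
A reported size of 0, as described for {CL_QUEUE_PROPERTIES_ARRAY}, corresponds to `value_size == 0` in this model.
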
[open,refpage='clSetCommandQueueProperty',desc='Enable or disable properties of a command-queue.',type='protos']
--
To enable or disable the properties of a command-queue, call the function
include::{generated}/api/protos/clSetCommandQueueProperty.txt[]
include::{generated}/api/version-notes/clSetCommandQueueProperty.asciidoc[]
* _command_queue_ specifies the command-queue being modified.
* _properties_ specifies the new list of properties for the command-queue.
This is a bit-field and the supported properties are described in the
<<legacy-queue-properties-table, Command-Queue Properties table>> for
{clCreateCommandQueue}.
Only command-queue properties specified in this table can be used,
otherwise the value specified in _properties_ is considered to be not
valid.
* _enable_ determines whether the values specified by _properties_ are
enabled (if _enable_ is {CL_TRUE}) or disabled (if _enable_ is {CL_FALSE})
for the command-queue.
* _old_properties_ returns the command-queue properties before they were
changed by {clSetCommandQueueProperty}. If _old_properties_ is `NULL`, it
is ignored.
[NOTE]
====
Changing the {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} command-queue property
will cause the OpenCL implementation to block until all previously queued
commands in _command_queue_ have completed. This can be an expensive operation
and therefore changes to this property should only be done when absolutely
necessary.
====
// refError
{clSetCommandQueueProperty} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid command-queue.
* {CL_INVALID_VALUE} if values specified in _properties_ are not valid.
* {CL_INVALID_QUEUE_PROPERTIES} if values specified in _properties_ are
valid but are not supported by the device.
--
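
The effect of _enable_ and _old_properties_ on the property bit-field can be sketched as a pair of bit operations.
The names `ex_queue_flags` and `ex_set_queue_property` are hypothetical, and the sketch deliberately leaves out validation and the blocking behavior described in the note above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t ex_queue_flags; /* stand-in for the property bit-field type */

/* Enables or disables the bits in `properties` on `*current`, optionally
 * reporting the previous value through `old_properties`. */
void ex_set_queue_property(ex_queue_flags *current, ex_queue_flags properties,
                           int enable, ex_queue_flags *old_properties)
{
    assert(current != NULL);

    if (old_properties != NULL)
        *old_properties = *current;    /* value before the change */

    if (enable)
        *current |= properties;        /* turn the listed bits on */
    else
        *current &= ~properties;       /* turn the listed bits off */
}
```

Bits that are not named in `properties` are left unchanged in either direction.
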
== Buffer Objects
A _buffer_ object stores a one-dimensional collection of elements.
Elements of a _buffer_ object can be a scalar data type (such as an int,
float), vector data type, or a user-defined structure.
=== Creating Buffer Objects
[open,refpage='clCreateBuffer',desc='Creates a buffer object.',type='protos',alias='clCreateBufferWithProperties']
--
A *buffer object* may be created using the function
include::{generated}/api/protos/clCreateBuffer.txt[]
include::{generated}/api/version-notes/clCreateBuffer.asciidoc[]
A *buffer object* may also be created with additional properties using the function
include::{generated}/api/protos/clCreateBufferWithProperties.txt[]
include::{generated}/api/version-notes/clCreateBufferWithProperties.asciidoc[]
* _context_ is a valid OpenCL context used to create the buffer object.
* _properties_ is an optional list of properties for the buffer object and their corresponding values.
The list is terminated with the special property `0`.
If no properties are required, _properties_ may be `NULL`.
OpenCL 3.0 does not define any optional properties for buffers.
* _flags_ is a bit-field that is used to specify allocation and usage
information about the buffer memory object being created and is described in
the <<memory-flags-table, supported memory flag values>> table.
* _size_ is the size in bytes of the buffer memory object to be allocated.
* _host_ptr_ is a pointer to the buffer data that may already be allocated
by the application.
The size of the buffer that _host_ptr_ points to must be greater than or equal to _size_
bytes.
* _errcode_ret_ may return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
The alignment requirements for data stored in buffer objects are described
in <<alignment-app-data-types>>.
If {clCreateBuffer} or {clCreateBufferWithProperties} is called with
{CL_MEM_USE_HOST_PTR} set in its _flags_ argument, the contents of the
memory pointed to by _host_ptr_ at the time of the {clCreateBuffer} or
{clCreateBufferWithProperties} call define the initial contents of the
buffer object.
If {clCreateBuffer} or {clCreateBufferWithProperties} is called with a
pointer returned by {clSVMAlloc} as its _host_ptr_ argument, and
{CL_MEM_USE_HOST_PTR} is set in its _flags_ argument, {clCreateBuffer} or
{clCreateBufferWithProperties} will succeed and return a valid non-zero
buffer object as long as the _size_ argument is no larger than the
_size_ argument passed in the original {clSVMAlloc} call.
The new buffer object returned has the shared memory as the underlying
storage.
Locations in the buffer's underlying shared memory can be operated on using
atomic operations to the device's level of support as defined in the memory
model.
// refError
{clCreateBuffer} and {clCreateBufferWithProperties} return a valid non-zero
buffer object and _errcode_ret_ is set to {CL_SUCCESS} if the buffer object
is created successfully.
Otherwise, they return a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_PROPERTY} if a property name in _properties_ is not a
supported property name, if the value specified for a supported property
name is not valid, or if the same property name is specified more than
once.
* {CL_INVALID_VALUE} if values specified in _flags_ are not valid as defined
in the <<memory-flags-table,Memory Flags>> table.
* {CL_INVALID_BUFFER_SIZE} if _size_ is 0 or if _size_ is greater than
{CL_DEVICE_MAX_MEM_ALLOC_SIZE} for all devices in _context_.
* {CL_INVALID_HOST_PTR} if _host_ptr_ is `NULL` and {CL_MEM_USE_HOST_PTR} or
{CL_MEM_COPY_HOST_PTR} are set in _flags_ or if _host_ptr_ is not `NULL`
but {CL_MEM_COPY_HOST_PTR} or {CL_MEM_USE_HOST_PTR} are not set in _flags_.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for buffer object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
[[memory-flags-table]]
.List of supported memory flag values
[width="100%",cols="<50%,<50%",options="header"]
|====
| Memory Flags | Description
| {CL_MEM_READ_WRITE_anchor}
include::{generated}/api/version-notes/CL_MEM_READ_WRITE.asciidoc[]
| This flag specifies that the memory object will be read
and written by a kernel.
This is the default.
| {CL_MEM_WRITE_ONLY_anchor}
include::{generated}/api/version-notes/CL_MEM_WRITE_ONLY.asciidoc[]
| This flag specifies that the memory object will be
written but not read by a kernel.
Reading from a buffer or image object created with {CL_MEM_WRITE_ONLY}
inside a kernel is undefined.
{CL_MEM_READ_WRITE} and {CL_MEM_WRITE_ONLY} are mutually exclusive.
| {CL_MEM_READ_ONLY_anchor}
include::{generated}/api/version-notes/CL_MEM_READ_ONLY.asciidoc[]
| This flag specifies that the memory object is a
read-only memory object when used inside a kernel.
Writing to a buffer or image object created with {CL_MEM_READ_ONLY} inside
a kernel is undefined.
{CL_MEM_READ_WRITE} or {CL_MEM_WRITE_ONLY} and {CL_MEM_READ_ONLY} are mutually
exclusive.
| {CL_MEM_USE_HOST_PTR_anchor}
include::{generated}/api/version-notes/CL_MEM_USE_HOST_PTR.asciidoc[]
| This flag is valid only if host_ptr is not `NULL`.
If specified, it indicates that the application wants the OpenCL
implementation to use memory referenced by host_ptr as the storage bits
for the memory object.
The contents of the memory pointed to by host_ptr at the time of the
{clCreateBuffer}, {clCreateBufferWithProperties}, {clCreateImage},
{clCreateImageWithProperties}, {clCreateImage2D}, or {clCreateImage3D}
call define the initial contents of the memory object.
OpenCL implementations are allowed to cache the contents pointed
to by host_ptr in device memory.
This cached copy can be used when kernels are executed on a device.
The result of OpenCL commands that operate on multiple buffer objects
created with the same host_ptr or from overlapping host or SVM regions
is considered to be undefined.
| {CL_MEM_ALLOC_HOST_PTR_anchor}
include::{generated}/api/version-notes/CL_MEM_ALLOC_HOST_PTR.asciidoc[]
| This flag specifies that the application wants the OpenCL implementation
to allocate memory from host accessible memory.
{CL_MEM_ALLOC_HOST_PTR} and {CL_MEM_USE_HOST_PTR} are mutually exclusive.
| {CL_MEM_COPY_HOST_PTR_anchor}
include::{generated}/api/version-notes/CL_MEM_COPY_HOST_PTR.asciidoc[]
| This flag is valid only if host_ptr is not `NULL`.
If specified, it indicates that the application wants the OpenCL
implementation to allocate memory for the memory object and copy the
data from memory referenced by host_ptr.
The implementation will copy the memory immediately and host_ptr is
available for reuse by the application when the {clCreateBuffer},
{clCreateBufferWithProperties}, {clCreateImage}, {clCreateImageWithProperties},
{clCreateImage2D}, or {clCreateImage3D} operation returns.
{CL_MEM_COPY_HOST_PTR} and {CL_MEM_USE_HOST_PTR} are mutually exclusive.
{CL_MEM_COPY_HOST_PTR} can be used with {CL_MEM_ALLOC_HOST_PTR} to
initialize the contents of the {cl_mem_TYPE} object allocated using
host-accessible (e.g. PCIe) memory.
| {CL_MEM_HOST_WRITE_ONLY_anchor}
include::{generated}/api/version-notes/CL_MEM_HOST_WRITE_ONLY.asciidoc[]
| This flag specifies that the host will only write to the memory object
(using OpenCL APIs that enqueue a write or a map for write).
This can be used to optimize write access from the host (e.g. enable
write-combined allocations for memory objects for devices that
communicate with the host over a system bus such as PCIe).
| {CL_MEM_HOST_READ_ONLY_anchor}
include::{generated}/api/version-notes/CL_MEM_HOST_READ_ONLY.asciidoc[]
| This flag specifies that the host will only read
the memory object (using OpenCL APIs that enqueue a read or a map for
read).
{CL_MEM_HOST_WRITE_ONLY} and {CL_MEM_HOST_READ_ONLY} are mutually exclusive.
| {CL_MEM_HOST_NO_ACCESS_anchor}
include::{generated}/api/version-notes/CL_MEM_HOST_NO_ACCESS.asciidoc[]
| This flag specifies that the host will not read or
write the memory object.
{CL_MEM_HOST_WRITE_ONLY} or {CL_MEM_HOST_READ_ONLY} and
{CL_MEM_HOST_NO_ACCESS} are mutually exclusive.
| {CL_MEM_KERNEL_READ_AND_WRITE_anchor}
include::{generated}/api/version-notes/CL_MEM_KERNEL_READ_AND_WRITE.asciidoc[]
| This flag is only used by {clGetSupportedImageFormats} to query image
formats that may be both read from and written to by the same kernel
instance.
To create a memory object that may be read from and written to, use
{CL_MEM_READ_WRITE}.
|====
--
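
The mutual-exclusion rules in the memory flags table and the _host_ptr_ and _size_ error conditions listed for buffer creation can be collected into a single argument check.
The sketch below is a model only: the `EX_*` names and `ex_check_buffer_args` are hypothetical stand-ins, the order in which the conditions are tested is an assumption, and the comparison of _size_ against the device allocation limit is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the CL_MEM_* flags and the error codes above. */
enum {
    EX_MEM_READ_WRITE      = 1 << 0,
    EX_MEM_WRITE_ONLY      = 1 << 1,
    EX_MEM_READ_ONLY       = 1 << 2,
    EX_MEM_USE_HOST_PTR    = 1 << 3,
    EX_MEM_ALLOC_HOST_PTR  = 1 << 4,
    EX_MEM_COPY_HOST_PTR   = 1 << 5,
    EX_MEM_HOST_WRITE_ONLY = 1 << 7,
    EX_MEM_HOST_READ_ONLY  = 1 << 8,
    EX_MEM_HOST_NO_ACCESS  = 1 << 9
};

typedef enum {
    EX_OK, EX_INVALID_VALUE, EX_INVALID_BUFFER_SIZE, EX_INVALID_HOST_PTR
} ex_status;

/* True if more than one bit is set in `bits`. */
static int ex_multiple_bits(unsigned bits)
{
    return (bits & (bits - 1u)) != 0;
}

ex_status ex_check_buffer_args(unsigned flags, size_t size, const void *host_ptr)
{
    const unsigned kernel_access =
        EX_MEM_READ_WRITE | EX_MEM_WRITE_ONLY | EX_MEM_READ_ONLY;
    const unsigned host_access =
        EX_MEM_HOST_WRITE_ONLY | EX_MEM_HOST_READ_ONLY | EX_MEM_HOST_NO_ACCESS;

    /* At most one flag from each mutually exclusive group. */
    if (ex_multiple_bits(flags & kernel_access)) return EX_INVALID_VALUE;
    if (ex_multiple_bits(flags & host_access))   return EX_INVALID_VALUE;
    /* USE_HOST_PTR excludes both ALLOC_HOST_PTR and COPY_HOST_PTR. */
    if ((flags & EX_MEM_USE_HOST_PTR) &&
        (flags & (EX_MEM_ALLOC_HOST_PTR | EX_MEM_COPY_HOST_PTR)))
        return EX_INVALID_VALUE;

    if (size == 0) return EX_INVALID_BUFFER_SIZE;

    /* host_ptr must be given exactly when USE or COPY is requested. */
    int wants_ptr = (flags & (EX_MEM_USE_HOST_PTR | EX_MEM_COPY_HOST_PTR)) != 0;
    int has_ptr = host_ptr != NULL;
    if (wants_ptr != has_ptr) return EX_INVALID_HOST_PTR;

    return EX_OK;
}

/* Usage: a few argument combinations and the result the model gives. */
void ex_buffer_args_examples(void)
{
    int data[4] = {0};
    assert(ex_check_buffer_args(0, sizeof data, NULL) == EX_OK);
    assert(ex_check_buffer_args(EX_MEM_COPY_HOST_PTR | EX_MEM_ALLOC_HOST_PTR,
                                sizeof data, data) == EX_OK);
    assert(ex_check_buffer_args(EX_MEM_USE_HOST_PTR, sizeof data, NULL)
           == EX_INVALID_HOST_PTR);
}
```

Passing no access flag at all is accepted here because {CL_MEM_READ_WRITE} is the default.
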
[open,refpage='clCreateSubBuffer',desc='Creates a new buffer object (referred to as a sub-buffer object) from an existing buffer object.',type='protos',xrefs='cl_buffer_region']
--
To create a new buffer object (referred to as a sub-buffer object) from an
existing buffer object, call the function
include::{generated}/api/protos/clCreateSubBuffer.txt[]
include::{generated}/api/version-notes/clCreateSubBuffer.asciidoc[]
* _buffer_ must be a valid buffer object and cannot be a sub-buffer object.
* _flags_ is a bit-field that is used to specify allocation and usage
information about the sub-buffer memory object being created and is
described in the <<memory-flags-table,Memory Flags>> table.
If the {CL_MEM_READ_WRITE}, {CL_MEM_READ_ONLY}, or {CL_MEM_WRITE_ONLY} values are
not specified in _flags_, they are inherited from the corresponding memory
access qualifiers associated with _buffer_.
The {CL_MEM_USE_HOST_PTR}, {CL_MEM_ALLOC_HOST_PTR}, and {CL_MEM_COPY_HOST_PTR}
values cannot be specified in _flags_ but are inherited from the
corresponding memory access qualifiers associated with _buffer_.
If {CL_MEM_COPY_HOST_PTR} is specified in the memory access qualifier values
associated with _buffer_ it does not imply any additional copies when the
sub-buffer is created from _buffer_.
If the {CL_MEM_HOST_WRITE_ONLY}, {CL_MEM_HOST_READ_ONLY}, or
{CL_MEM_HOST_NO_ACCESS} values are not specified in _flags_, they are
inherited from the corresponding memory access qualifiers associated with
_buffer_.
* _buffer_create_type_ and _buffer_create_info_ describe the type of buffer
object to be created.
The list of supported values for _buffer_create_type_ and corresponding
descriptor that _buffer_create_info_ points to is described in the
<<subbuffer-create-info-table,SubBuffer Attributes>> table.
[[subbuffer-create-info-table]]
.List of supported buffer creation types by <<clCreateSubBuffer>>
[width="100%",cols="<50%,<50%",options="header"]
|====
| Buffer Creation Type | Description
| {CL_BUFFER_CREATE_TYPE_REGION_anchor}
include::{generated}/api/version-notes/CL_BUFFER_CREATE_TYPE_REGION.asciidoc[]
| Create a buffer object that represents a specific region in _buffer_.
_buffer_create_info_ is a pointer to a {cl_buffer_region_TYPE} structure
specifying a region of the buffer.
If _buffer_ is created with {CL_MEM_USE_HOST_PTR}, the _host_ptr_
associated with the buffer object returned is _host_ptr + origin_.
The buffer object returned references the data store allocated for
buffer and points to the region specified by _buffer_create_info_ in
this data store.
|====
// refError
{clCreateSubBuffer} returns a valid non-zero buffer object and _errcode_ret_
is set to {CL_SUCCESS} if the function is executed successfully.
Otherwise, it returns a `NULL` value with one of the following errors in
_errcode_ret_:
* {CL_INVALID_MEM_OBJECT} if _buffer_ is not a valid buffer object or is a
sub-buffer object.
* {CL_INVALID_VALUE} if _buffer_ was created with {CL_MEM_WRITE_ONLY} and
_flags_ specifies {CL_MEM_READ_WRITE} or {CL_MEM_READ_ONLY}, or if _buffer_
was created with {CL_MEM_READ_ONLY} and _flags_ specifies
{CL_MEM_READ_WRITE} or {CL_MEM_WRITE_ONLY}, or if _flags_ specifies
{CL_MEM_USE_HOST_PTR} or {CL_MEM_ALLOC_HOST_PTR} or {CL_MEM_COPY_HOST_PTR}.
* {CL_INVALID_VALUE} if _buffer_ was created with {CL_MEM_HOST_WRITE_ONLY} and
_flags_ specifies {CL_MEM_HOST_READ_ONLY}, or if _buffer_ was created with
{CL_MEM_HOST_READ_ONLY} and _flags_ specifies {CL_MEM_HOST_WRITE_ONLY}, or if
_buffer_ was created with {CL_MEM_HOST_NO_ACCESS} and _flags_ specifies
{CL_MEM_HOST_READ_ONLY} or {CL_MEM_HOST_WRITE_ONLY}.
* {CL_INVALID_VALUE} if the value specified in _buffer_create_type_ is not
valid.
* {CL_INVALID_VALUE} if any value specified in _buffer_create_info_ (for a
given _buffer_create_type_) is not valid or if _buffer_create_info_ is
`NULL`.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for sub-buffer object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
// These errors are specific to the {CL_BUFFER_CREATE_TYPE_REGION} parameter
* {CL_INVALID_VALUE} if the region specified by the {cl_buffer_region_TYPE}
structure passed in _buffer_create_info_ is out of bounds in _buffer_.
* {CL_INVALID_BUFFER_SIZE} if the _size_ field of the {cl_buffer_region_TYPE}
structure passed in _buffer_create_info_ is 0.
* {CL_MISALIGNED_SUB_BUFFER_OFFSET} if there are no devices in _context_
associated with _buffer_ for which the _origin_ field of the
{cl_buffer_region_TYPE} structure passed in _buffer_create_info_ is
aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value.
[NOTE]
====
Concurrent reading from, writing to and copying between both a buffer object
and its sub-buffer object(s) is undefined.
Concurrent reading from, writing to and copying between overlapping
sub-buffer objects created with the same buffer object is undefined.
Only reading from both a buffer object and its sub-buffer objects or reading
from multiple overlapping sub-buffer objects is defined.
====
--
[open,refpage='cl_buffer_region',desc='',type='structs',xrefs='clCreateSubBuffer']
--
The {cl_buffer_region_TYPE} structure specifies a region of a buffer object:
include::{generated}/api/structs/cl_buffer_region.txt[]
* _origin_ is the offset in bytes of the region.
* _size_ is the size in bytes of the region.
Constraints on the values of _origin_ and _size_ are specified for the
{clCreateSubBuffer} function to which this structure is passed.
--
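The region-specific error conditions listed for {clCreateSubBuffer} can be
illustrated with a small, non-normative host-side sketch.
The type and function names below (`sketch_buffer_region`,
`sketch_check_region`) and the status codes are hypothetical and are not part
of the OpenCL API.
The sketch assumes that the alignment is supplied in bits, as reported for
{CL_DEVICE_MEM_BASE_ADDR_ALIGN}, and it checks a single device only, whereas
the error above is defined over all devices in the context.

```c
#include <stddef.h>

/* Hypothetical mirror of the two cl_buffer_region fields described above. */
typedef struct {
    size_t origin; /* offset in bytes of the region */
    size_t size;   /* size in bytes of the region */
} sketch_buffer_region;

/* Hypothetical status codes; these are not OpenCL error values. */
typedef enum {
    SKETCH_REGION_OK,
    SKETCH_REGION_EMPTY,         /* corresponds to CL_INVALID_BUFFER_SIZE */
    SKETCH_REGION_OUT_OF_BOUNDS, /* corresponds to CL_INVALID_VALUE */
    SKETCH_REGION_MISALIGNED     /* corresponds to CL_MISALIGNED_SUB_BUFFER_OFFSET */
} sketch_region_status;

/* Checks one region against a parent buffer of parent_size bytes and one
 * device whose base address alignment is align_bits (assumed to be in bits). */
sketch_region_status sketch_check_region(sketch_buffer_region r,
                                         size_t parent_size,
                                         size_t align_bits)
{
    size_t align_bytes = align_bits / 8;

    if (r.size == 0)
        return SKETCH_REGION_EMPTY;
    /* Compare against the remaining space so origin + size cannot overflow. */
    if (r.origin > parent_size || r.size > parent_size - r.origin)
        return SKETCH_REGION_OUT_OF_BOUNDS;
    if (align_bytes != 0 && r.origin % align_bytes != 0)
        return SKETCH_REGION_MISALIGNED;
    return SKETCH_REGION_OK;
}
```

An application could run a check of this kind before calling
{clCreateSubBuffer} to report a clearer diagnostic; the implementation remains
the authority on which error is actually returned.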
=== Reading, Writing and Copying Buffer Objects
The following functions enqueue commands to read from a buffer object to
host memory or write to a buffer object from host memory.
[open,refpage='clEnqueueReadBuffer',desc='Enqueue commands to read from a buffer object to host memory.',type='protos',alias='clEnqueueWriteBuffer']
--
To read from a buffer object to host memory or to write to a buffer object from
host memory call one of the functions
include::{generated}/api/protos/clEnqueueReadBuffer.txt[]
include::{generated}/api/version-notes/clEnqueueReadBuffer.asciidoc[]
include::{generated}/api/protos/clEnqueueWriteBuffer.txt[]
include::{generated}/api/version-notes/clEnqueueWriteBuffer.asciidoc[]
* _command_queue_ is a valid host command-queue in which the read / write
command will be queued.
_command_queue_ and _buffer_ must be created with the same OpenCL context.
* _buffer_ refers to a valid buffer object.
* _blocking_read_ and _blocking_write_ indicate if the read and write
operations are _blocking_ or _non-blocking_ (see below).
* _offset_ is the offset in bytes in the buffer object to read from or write
to.
* _size_ is the size in bytes of data being read or written.
* _ptr_ is a pointer to the buffer in host memory where data is to be read
into or to be written from.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this read / write command
and can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
If _blocking_read_ is {CL_TRUE} i.e. the read command is blocking,
{clEnqueueReadBuffer} does not return until the buffer data has been read
and copied into memory pointed to by _ptr_.
If _blocking_read_ is {CL_FALSE} i.e. the read command is non-blocking,
{clEnqueueReadBuffer} queues a non-blocking read command and returns.
The contents of the buffer that _ptr_ points to cannot be used until the
read command has completed.
The _event_ argument returns an event object which can be used to query the
execution status of the read command.
When the read command has completed, the contents of the buffer that _ptr_
points to can be used by the application.
If _blocking_write_ is {CL_TRUE}, the write command is blocking and does not
return until the command is complete, including transfer of the data.
The memory pointed to by _ptr_ can be reused by the application after the
{clEnqueueWriteBuffer} call returns.
If _blocking_write_ is {CL_FALSE}, the OpenCL implementation will use _ptr_ to
perform a non-blocking write.
As the write is non-blocking the implementation can return immediately.
The memory pointed to by _ptr_ cannot be reused by the application after the
call returns.
The _event_ argument returns an event object which can be used to query the
execution status of the write command.
When the write command has completed, the memory pointed to by _ptr_ can
then be reused by the application.
// refError
{clEnqueueReadBuffer} and {clEnqueueWriteBuffer} return {CL_SUCCESS} if the
function is executed successfully.
Otherwise, they return one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
_buffer_ are not the same or if the context associated with
_command_queue_ and events in _event_wait_list_ are not the same.
* {CL_INVALID_MEM_OBJECT} if _buffer_ is not a valid buffer object.
* {CL_INVALID_VALUE} if the region being read or written specified by
(_offset_, _size_) is out of bounds or if _ptr_ is `NULL`.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _buffer_ is a sub-buffer object and
_offset_ specified when the sub-buffer object is created is not aligned
to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device associated with
_command_queue_.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the read and write
operations are blocking and the execution status of any of the events in
_event_wait_list_ is a negative integer value.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for data store associated with _buffer_.
* {CL_INVALID_OPERATION} if {clEnqueueReadBuffer} is called on _buffer_
which has been created with {CL_MEM_HOST_WRITE_ONLY} or
{CL_MEM_HOST_NO_ACCESS}.
* {CL_INVALID_OPERATION} if {clEnqueueWriteBuffer} is called on _buffer_
which has been created with {CL_MEM_HOST_READ_ONLY} or
{CL_MEM_HOST_NO_ACCESS}.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
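The argument checks behind the {CL_INVALID_VALUE} and {CL_INVALID_OPERATION}
errors listed above can be sketched on the host as follows.
This is a non-normative illustration: the enumerations and the
`sketch_check_read` / `sketch_check_write` helpers are hypothetical stand-ins,
not OpenCL API names, and the order in which an implementation evaluates these
conditions is not specified here.

```c
#include <stddef.h>

/* Hypothetical host-access modes standing in for the CL_MEM_HOST_* flags. */
typedef enum {
    SKETCH_HOST_READ_WRITE, /* no host-access restriction */
    SKETCH_HOST_READ_ONLY,  /* stands in for CL_MEM_HOST_READ_ONLY */
    SKETCH_HOST_WRITE_ONLY, /* stands in for CL_MEM_HOST_WRITE_ONLY */
    SKETCH_HOST_NO_ACCESS   /* stands in for CL_MEM_HOST_NO_ACCESS */
} sketch_host_access;

/* Hypothetical status codes; these are not OpenCL error values. */
typedef enum {
    SKETCH_OK,
    SKETCH_INVALID_VALUE,    /* stands in for CL_INVALID_VALUE */
    SKETCH_INVALID_OPERATION /* stands in for CL_INVALID_OPERATION */
} sketch_status;

/* (offset, size) must lie inside the buffer and ptr must not be NULL. */
static sketch_status sketch_check_range(size_t buffer_size, size_t offset,
                                        size_t size, const void *ptr)
{
    if (ptr == NULL)
        return SKETCH_INVALID_VALUE;
    /* Compare against the remaining space so offset + size cannot overflow. */
    if (offset > buffer_size || size > buffer_size - offset)
        return SKETCH_INVALID_VALUE;
    return SKETCH_OK;
}

/* A read is rejected when the host may only write or has no access. */
sketch_status sketch_check_read(sketch_host_access access, size_t buffer_size,
                                size_t offset, size_t size, const void *ptr)
{
    if (access == SKETCH_HOST_WRITE_ONLY || access == SKETCH_HOST_NO_ACCESS)
        return SKETCH_INVALID_OPERATION;
    return sketch_check_range(buffer_size, offset, size, ptr);
}

/* A write is rejected when the host may only read or has no access. */
sketch_status sketch_check_write(sketch_host_access access, size_t buffer_size,
                                 size_t offset, size_t size, const void *ptr)
{
    if (access == SKETCH_HOST_READ_ONLY || access == SKETCH_HOST_NO_ACCESS)
        return SKETCH_INVALID_OPERATION;
    return sketch_check_range(buffer_size, offset, size, ptr);
}
```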
[open,refpage='clEnqueueReadBufferRect',desc='Enqueue command to read from a 2D or 3D rectangular region from a buffer object to host memory.',type='protos',alias='clEnqueueWriteBufferRect']
--
The following functions enqueue commands to read a 2D or 3D rectangular
region from a buffer object to host memory or write a 2D or 3D rectangular
region to a buffer object from host memory.
include::{generated}/api/protos/clEnqueueReadBufferRect.txt[]
include::{generated}/api/version-notes/clEnqueueReadBufferRect.asciidoc[]
include::{generated}/api/protos/clEnqueueWriteBufferRect.txt[]
include::{generated}/api/version-notes/clEnqueueWriteBufferRect.asciidoc[]
* _command_queue_ is a valid host command-queue in which the read /
write command will be queued.
_command_queue_ and _buffer_ must be created with the same OpenCL context.
* _buffer_ refers to a valid buffer object.
* _blocking_read_ and _blocking_write_ indicate if the read and write
operations are _blocking_ or _non-blocking_ (see below).
* _buffer_origin_ defines the (_x_, _y_, _z_) offset in the memory region
associated with _buffer_.
For a 2D rectangle region, the _z_ value given by _buffer_origin_[2] should
be 0.
The offset in bytes is computed as _buffer_origin_[2] {times}
_buffer_slice_pitch_ + _buffer_origin_[1] {times} _buffer_row_pitch_ {plus}
_buffer_origin_[0].
* _host_origin_ defines the (_x_, _y_, _z_) offset in the memory region
pointed to by _ptr_.
For a 2D rectangle region, the _z_ value given by _host_origin_[2] should be
0.
The offset in bytes is computed as _host_origin_[2] {times}
_host_slice_pitch_ + _host_origin_[1] {times} _host_row_pitch_ {plus}
_host_origin_[0].
* _region_ defines the (_width_ in bytes, _height_ in rows, _depth_ in slices)
of the 2D or 3D rectangle being read or written.
For a 2D rectangle copy, the _depth_ value given by _region_[2] should be 1.
The values in region cannot be 0.
* _buffer_row_pitch_ is the length of each row in bytes to be used for the
memory region associated with _buffer_.
If _buffer_row_pitch_ is 0, _buffer_row_pitch_ is computed as _region_[0].
* _buffer_slice_pitch_ is the length of each 2D slice in bytes to be used for
the memory region associated with _buffer_.
If _buffer_slice_pitch_ is 0, _buffer_slice_pitch_ is computed as
_region_[1] {times} _buffer_row_pitch_.
* _host_row_pitch_ is the length of each row in bytes to be used for the
memory region pointed to by _ptr_.
If _host_row_pitch_ is 0, _host_row_pitch_ is computed as _region_[0].
* _host_slice_pitch_ is the length of each 2D slice in bytes to be used for
the memory region pointed to by _ptr_.
If _host_slice_pitch_ is 0, _host_slice_pitch_ is computed as _region_[1]
{times} _host_row_pitch_.
* _ptr_ is a pointer to the buffer in host memory where data is to be read
into or to be written from.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this read / write command
and can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
If _blocking_read_ is {CL_TRUE} i.e. the read command is blocking,
{clEnqueueReadBufferRect} does not return until the buffer data has been
read and copied into memory pointed to by _ptr_.
If _blocking_read_ is {CL_FALSE} i.e. the read command is non-blocking,
{clEnqueueReadBufferRect} queues a non-blocking read command and returns.
The contents of the buffer that _ptr_ points to cannot be used until the
read command has completed.
The _event_ argument returns an event object which can be used to query the
execution status of the read command.
When the read command has completed, the contents of the buffer that _ptr_
points to can be used by the application.
If _blocking_write_ is {CL_TRUE}, the write command is blocking and does not
return until the command is complete, including transfer of the data.
The memory pointed to by _ptr_ can be reused by the application after the
{clEnqueueWriteBufferRect} call returns.
If _blocking_write_ is {CL_FALSE}, the OpenCL implementation will use _ptr_ to
perform a non-blocking write.
As the write is non-blocking the implementation can return immediately.
The memory pointed to by _ptr_ cannot be reused by the application after the
call returns.
The _event_ argument returns an event object which can be used to query the
execution status of the write command.
When the write command has completed, the memory pointed to by _ptr_ can
then be reused by the application.
// refError
{clEnqueueReadBufferRect} and {clEnqueueWriteBufferRect} return {CL_SUCCESS}
if the function is executed successfully.
Otherwise, they return one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
_buffer_ are not the same or if the context associated with
_command_queue_ and events in _event_wait_list_ are not the same.
* {CL_INVALID_MEM_OBJECT} if _buffer_ is not a valid buffer object.
* {CL_INVALID_VALUE} if _buffer_origin_, _host_origin_, or _region_ is `NULL`.
* {CL_INVALID_VALUE} if the region being read or written specified by
(_buffer_origin_, _region_, _buffer_row_pitch_, _buffer_slice_pitch_) is
out of bounds.
* {CL_INVALID_VALUE} if any _region_ array element is 0.
* {CL_INVALID_VALUE} if _buffer_row_pitch_ is not 0 and is less than
_region_[0].
* {CL_INVALID_VALUE} if _host_row_pitch_ is not 0 and is less than
_region_[0].
* {CL_INVALID_VALUE} if _buffer_slice_pitch_ is not 0 and is less than
_region_[1] {times} _buffer_row_pitch_ and not a multiple of
_buffer_row_pitch_.
* {CL_INVALID_VALUE} if _host_slice_pitch_ is not 0 and is less than
_region_[1] {times} _host_row_pitch_ and not a multiple of
_host_row_pitch_.
* {CL_INVALID_VALUE} if _ptr_ is `NULL`.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _buffer_ is a sub-buffer object and
_offset_ specified when the sub-buffer object is created is not aligned
to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device associated with
_command_queue_.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the read and write
operations are blocking and the execution status of any of the events in
_event_wait_list_ is a negative integer value.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for data store associated with _buffer_.
* {CL_INVALID_OPERATION} if {clEnqueueReadBufferRect} is called on _buffer_
which has been created with {CL_MEM_HOST_WRITE_ONLY} or
{CL_MEM_HOST_NO_ACCESS}.
* {CL_INVALID_OPERATION} if {clEnqueueWriteBufferRect} is called on _buffer_
which has been created with {CL_MEM_HOST_READ_ONLY} or
{CL_MEM_HOST_NO_ACCESS}.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
[NOTE]
====
Calling {clEnqueueReadBuffer} to read a region of the buffer object with the
_ptr_ argument value set to _host_ptr_ + _offset_, where _host_ptr_ is a
pointer to the memory region specified when the buffer object being read is
created with {CL_MEM_USE_HOST_PTR}, must meet the following requirements in
order to avoid undefined behavior:
* All commands that use this buffer object or a memory object (buffer or
image) created from this buffer object have finished execution before
the read command begins execution.
* The buffer object or memory objects created from this buffer object are
not mapped.
* The buffer object or memory objects created from this buffer object are
not used by any command-queue until the read command has finished
execution.
Calling {clEnqueueReadBufferRect} to read a region of the buffer object with
the _ptr_ argument value set to _host_ptr_ and with identical _host_origin_
and _buffer_origin_ values, where _host_ptr_ is a pointer to the
memory region specified when the buffer object being read is created with
{CL_MEM_USE_HOST_PTR}, must meet the same requirements given above for
{clEnqueueReadBuffer}.
Calling {clEnqueueWriteBuffer} to update the latest bits in a region of the
buffer object with the _ptr_ argument value set to _host_ptr_ + _offset_,
where _host_ptr_ is a pointer to the memory region specified when the buffer
object being written is created with {CL_MEM_USE_HOST_PTR}, must meet the
following requirements in order to avoid undefined behavior:
* The host memory region given by (_host_ptr_ + _offset_, _size_) contains
the latest bits when the enqueued write command begins execution.
* The buffer object or memory objects created from this buffer object are
not mapped.
* The buffer object or memory objects created from this buffer object are
not used by any command-queue until the write command has finished
execution.
Calling {clEnqueueWriteBufferRect} to update the latest bits in a region of
the buffer object with the _ptr_ argument value set to _host_ptr_ and with
identical _host_origin_ and _buffer_origin_ values, where _host_ptr_ is a
pointer to the memory region specified when the buffer object being written
is created with {CL_MEM_USE_HOST_PTR}, must meet the following requirements in
order to avoid undefined behavior:
* The host memory region given by (_buffer_origin_, _region_) contains the
latest bits when the enqueued write command begins execution.
* The buffer object or memory objects created from this buffer object are
not mapped.
* The buffer object or memory objects created from this buffer object are
not used by any command-queue until the write command has finished
execution.
====
--
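The pitch defaults and the byte-offset formula given above for
_buffer_origin_ and _host_origin_ can be sketched as follows.
This is a non-normative illustration; `sketch_pitches`,
`sketch_resolve_pitches` and `sketch_rect_offset` are hypothetical names, and
the sketch performs no validation of its inputs.

```c
#include <stddef.h>

/* Hypothetical pair of resolved pitches, both in bytes. */
typedef struct {
    size_t row_pitch;
    size_t slice_pitch;
} sketch_pitches;

/* Applies the defaults described above: a row pitch of 0 becomes region[0],
 * and a slice pitch of 0 becomes region[1] times the resolved row pitch. */
sketch_pitches sketch_resolve_pitches(const size_t region[3],
                                      size_t row_pitch, size_t slice_pitch)
{
    sketch_pitches p;
    p.row_pitch = row_pitch != 0 ? row_pitch : region[0];
    p.slice_pitch = slice_pitch != 0 ? slice_pitch : region[1] * p.row_pitch;
    return p;
}

/* Byte offset of an (x, y, z) origin:
 * origin[2] * slice_pitch + origin[1] * row_pitch + origin[0]. */
size_t sketch_rect_offset(const size_t origin[3], sketch_pitches p)
{
    return origin[2] * p.slice_pitch + origin[1] * p.row_pitch + origin[0];
}
```

For example, with a region of (16, 4, 2), an explicit row pitch of 32 bytes
and a slice pitch of 0, the resolved slice pitch is 4 {times} 32 = 128 bytes,
so the origin (4, 2, 1) maps to byte offset 1 {times} 128 + 2 {times} 32 + 4
= 196.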
[open,refpage='clEnqueueCopyBuffer',desc='Enqueues a command to copy from one buffer object to another.',type='protos']
--
To enqueue a command to copy a buffer object identified by _src_buffer_ to
another buffer object identified by _dst_buffer_, call the function
include::{generated}/api/protos/clEnqueueCopyBuffer.txt[]
include::{generated}/api/version-notes/clEnqueueCopyBuffer.asciidoc[]
* _command_queue_ refers to a host command-queue in which the copy command
will be queued.
The OpenCL context associated with _command_queue_, _src_buffer_ and
_dst_buffer_ must be the same.
* _src_offset_ refers to the offset in bytes at which to begin copying data
from _src_buffer_.
* _dst_offset_ refers to the offset in bytes at which to begin copying data
into _dst_buffer_.
* _size_ refers to the size in bytes to copy.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this copy command
and can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
// refError
{clEnqueueCopyBuffer} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if the context associated with _command_queue_,
_src_buffer_ and _dst_buffer_ are not the same or if the context
associated with _command_queue_ and events in _event_wait_list_ are not
the same.
* {CL_INVALID_MEM_OBJECT} if _src_buffer_ and _dst_buffer_ are not valid
buffer objects.
* {CL_INVALID_VALUE} if _src_offset_, _dst_offset_, _size_, _src_offset_
{plus} _size_ or _dst_offset_ + _size_ require accessing elements
outside the _src_buffer_ and _dst_buffer_ buffer objects respectively.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _src_buffer_ is a sub-buffer object
and _offset_ specified when the sub-buffer object is created is not
aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device
associated with _command_queue_.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _dst_buffer_ is a sub-buffer object
and _offset_ specified when the sub-buffer object is created is not
aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device
associated with _command_queue_.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_MEM_COPY_OVERLAP} if _src_buffer_ and _dst_buffer_ are the same buffer
or sub-buffer object and the source and destination regions overlap or
if _src_buffer_ and _dst_buffer_ are different sub-buffers of the same
associated buffer object and they overlap.
The regions overlap if _src_offset_ {leq} _dst_offset_ {leq}
_src_offset_ + _size_ - 1 or if _dst_offset_ {leq} _src_offset_ {leq}
_dst_offset_ + _size_ - 1.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for data store associated with _src_buffer_ or _dst_buffer_.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
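The overlap condition stated for {CL_MEM_COPY_OVERLAP} can be written directly
as a predicate.
The following non-normative sketch uses a hypothetical helper name
(`sketch_copy_overlaps`) and assumes that _size_ is greater than 0, that the
offset sums do not overflow, and that both offsets are expressed relative to
the same buffer object; offsets into different sub-buffers would first need to
be translated to offsets in their common parent.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when [src_offset, src_offset + size) and
 * [dst_offset, dst_offset + size) share at least one byte.
 * Assumes size > 0 and no overflow in offset + size. */
bool sketch_copy_overlaps(size_t src_offset, size_t dst_offset, size_t size)
{
    return (src_offset <= dst_offset && dst_offset <= src_offset + size - 1) ||
           (dst_offset <= src_offset && src_offset <= dst_offset + size - 1);
}
```

Adjacent ranges do not overlap under this condition: copying 4 bytes from
offset 0 to offset 4 is accepted, while copying 4 bytes from offset 0 to
offset 3 is not.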
[open,refpage='clEnqueueCopyBufferRect',desc='Enqueues a command to copy a 2D or 3D rectangular region from a buffer object to another buffer object.',type='protos']
--
To enqueue a command to copy a 2D or 3D rectangular region from the buffer
object identified by _src_buffer_ to a 2D or 3D region in the buffer object
identified by _dst_buffer_, call the function
include::{generated}/api/protos/clEnqueueCopyBufferRect.txt[]
include::{generated}/api/version-notes/clEnqueueCopyBufferRect.asciidoc[]
* _command_queue_ refers to the host command-queue in which the copy command
will be queued.
The OpenCL context associated with _command_queue_, _src_buffer_ and
_dst_buffer_ must be the same.
* _src_origin_ defines the (_x_, _y_, _z_) offset in the memory region
associated with _src_buffer_.
For a 2D rectangle region, the _z_ value given by _src_origin_[2] should be
0.
The offset in bytes is computed as _src_origin_[2] {times} _src_slice_pitch_
+ _src_origin_[1] {times} _src_row_pitch_ + _src_origin_[0].
* _dst_origin_ defines the (_x_, _y_, _z_) offset in the memory region
associated with _dst_buffer_.
For a 2D rectangle region, the _z_ value given by _dst_origin_[2] should be
0.
The offset in bytes is computed as _dst_origin_[2] {times} _dst_slice_pitch_
+ _dst_origin_[1] {times} _dst_row_pitch_ + _dst_origin_[0].
* _region_ defines the (_width_ in bytes, _height_ in rows, _depth_ in slices)
of the 2D or 3D rectangle being copied.
For a 2D rectangle, the _depth_ value given by _region_[2] should be 1.
The values in region cannot be 0.
* _src_row_pitch_ is the length of each row in bytes to be used for the memory
region associated with _src_buffer_.
If _src_row_pitch_ is 0, _src_row_pitch_ is computed as _region_[0].
* _src_slice_pitch_ is the length of each 2D slice in bytes to be used for the
memory region associated with _src_buffer_.
If _src_slice_pitch_ is 0, _src_slice_pitch_ is computed as _region_[1]
{times} _src_row_pitch_.
* _dst_row_pitch_ is the length of each row in bytes to be used for the memory
region associated with _dst_buffer_.
If _dst_row_pitch_ is 0, _dst_row_pitch_ is computed as _region_[0].
* _dst_slice_pitch_ is the length of each 2D slice in bytes to be used for the
memory region associated with _dst_buffer_.
If _dst_slice_pitch_ is 0, _dst_slice_pitch_ is computed as _region_[1]
{times} _dst_row_pitch_.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this copy command
and can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
Copying begins at the source offset and destination offset which are
computed as described above in the description for _src_origin_ and
_dst_origin_.
Each byte of the region's width is copied from the source offset to the
destination offset.
After copying each width, the source and destination offsets are incremented
by their respective source and destination row pitches.
After copying each 2D rectangle, the source and destination offsets are
incremented by their respective source and destination slice pitches.
[NOTE]
====
If _src_buffer_ and _dst_buffer_ are the same buffer object, _src_row_pitch_
must equal _dst_row_pitch_ and _src_slice_pitch_ must equal
_dst_slice_pitch_.
====
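The copy procedure described above (copy one width, advance by the row
pitches, then advance by the slice pitches after each 2D rectangle) can be
emulated on host memory as follows.
This is a non-normative sketch: `sketch_copy_rect` is a hypothetical helper
operating on plain byte arrays rather than buffer objects, it expects the
byte offsets and non-zero pitches to have been computed already, and it
assumes that the source and destination regions do not overlap.

```c
#include <stddef.h>
#include <string.h>

/* Emulates the rectangular copy walk on host memory.
 * src_offset and dst_offset are the starting byte offsets; all pitches are
 * in bytes and must already be non-zero. No validation is performed. */
void sketch_copy_rect(const unsigned char *src, unsigned char *dst,
                      size_t src_offset, size_t dst_offset,
                      const size_t region[3],
                      size_t src_row_pitch, size_t src_slice_pitch,
                      size_t dst_row_pitch, size_t dst_slice_pitch)
{
    for (size_t z = 0; z < region[2]; ++z) {
        /* Each 2D rectangle starts one slice pitch after the previous one. */
        size_t s = src_offset + z * src_slice_pitch;
        size_t d = dst_offset + z * dst_slice_pitch;
        for (size_t y = 0; y < region[1]; ++y) {
            /* Copy one width, then advance by the respective row pitch. */
            memcpy(dst + d, src + s, region[0]);
            s += src_row_pitch;
            d += dst_row_pitch;
        }
    }
}
```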
// refError
{clEnqueueCopyBufferRect} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if the context associated with _command_queue_,
_src_buffer_ and _dst_buffer_ are not the same or if the context
associated with _command_queue_ and events in _event_wait_list_ are not
the same.
* {CL_INVALID_MEM_OBJECT} if _src_buffer_ and _dst_buffer_ are not valid
buffer objects.
* {CL_INVALID_VALUE} if _src_origin_, _dst_origin_, or _region_ is `NULL`.
* {CL_INVALID_VALUE} if (_src_origin_, _region_, _src_row_pitch_,
_src_slice_pitch_) or (_dst_origin_, _region_, _dst_row_pitch_,
_dst_slice_pitch_) require accessing elements outside the _src_buffer_
and _dst_buffer_ buffer objects respectively.
* {CL_INVALID_VALUE} if any _region_ array element is 0.
* {CL_INVALID_VALUE} if _src_row_pitch_ is not 0 and is less than
_region_[0].
* {CL_INVALID_VALUE} if _dst_row_pitch_ is not 0 and is less than
_region_[0].
* {CL_INVALID_VALUE} if _src_slice_pitch_ is not 0 and is less than
_region_[1] {times} _src_row_pitch_ or if _src_slice_pitch_ is not 0 and
is not a multiple of _src_row_pitch_.
* {CL_INVALID_VALUE} if _dst_slice_pitch_ is not 0 and is less than
_region_[1] {times} _dst_row_pitch_ or if _dst_slice_pitch_ is not 0 and
is not a multiple of _dst_row_pitch_.
* {CL_INVALID_VALUE} if _src_buffer_ and _dst_buffer_ are the same buffer
object and _src_slice_pitch_ is not equal to _dst_slice_pitch_ or
_src_row_pitch_ is not equal to _dst_row_pitch_.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_MEM_COPY_OVERLAP} if _src_buffer_ and _dst_buffer_ are the same buffer
or sub-buffer object and the source and destination regions overlap or
if _src_buffer_ and _dst_buffer_ are different sub-buffers of the same
associated buffer object and they overlap.
Refer to <<check-copy-overlap, Checking for Memory Copy Overlap>> for
details on how to determine if source and destination regions overlap.
* {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _src_buffer_ is a sub-buffer object
and _offset_ specified when the sub-buffer object is created is not
aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device
associated with _command_queue_.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _dst_buffer_ is a sub-buffer object
and _offset_ specified when the sub-buffer object is created is not
aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device
associated with _command_queue_.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for data store associated with _src_buffer_ or _dst_buffer_.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
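The row-pitch and slice-pitch conditions in the error list above can be
collected into a single predicate for one side of the copy (source or
destination).
This non-normative sketch uses a hypothetical helper name
(`sketch_pitches_valid`); it also rejects a _region_ containing 0, since the
remaining checks are only meaningful for a non-empty region.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when (row_pitch, slice_pitch) pass the checks listed above
 * for the given region. A pitch of 0 selects the documented default. */
bool sketch_pitches_valid(const size_t region[3],
                          size_t row_pitch, size_t slice_pitch)
{
    if (region[0] == 0 || region[1] == 0 || region[2] == 0)
        return false;
    /* An explicit row pitch must cover at least one row of the region. */
    if (row_pitch != 0 && row_pitch < region[0])
        return false;

    size_t effective_row = row_pitch != 0 ? row_pitch : region[0];
    if (slice_pitch != 0) {
        /* An explicit slice pitch must cover region[1] rows ... */
        if (slice_pitch < region[1] * effective_row)
            return false;
        /* ... and must be a whole number of rows. */
        if (slice_pitch % effective_row != 0)
            return false;
    }
    return true;
}
```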
=== Filling Buffer Objects
NOTE: Filling buffer objects is <<unified-spec, missing before>> version 1.2.
[open,refpage='clEnqueueFillBuffer',desc='Enqueues a command to fill a buffer object with a pattern of a given pattern size.',type='protos']
--
To enqueue a command to fill a buffer object with a pattern of a given
pattern size, call the function
include::{generated}/api/protos/clEnqueueFillBuffer.txt[]
include::{generated}/api/version-notes/clEnqueueFillBuffer.asciidoc[]
* _command_queue_ refers to the host command-queue in which the fill command
will be queued.
The OpenCL context associated with _command_queue_ and _buffer_ must be the
same.
* _buffer_ is a valid buffer object.
* _pattern_ is a pointer to the data pattern of size _pattern_size_ in bytes.
_pattern_ will be used to fill a region in _buffer_ that starts at _offset_
and is _size_ bytes in size.
The data pattern must be a scalar or vector integer or floating-point data
type supported by OpenCL as described in <<scalar-data-types, Shared
Application Scalar Data Types>> and <<vector-data-types, Supported
Application Vector Data Types>>.
For example, if _buffer_ is to be filled with a pattern of `float4` values,
then _pattern_ will be a pointer to a `cl_float4` value and _pattern_size_
will be `sizeof(cl_float4)`.
The maximum value of _pattern_size_ is the size of the largest integer or
floating-point vector data type supported by the OpenCL device.
The memory associated with _pattern_ can be reused or freed after the
function returns.
* _offset_ is the location in bytes of the region being filled in _buffer_ and
must be a multiple of _pattern_size_.
* _size_ is the size in bytes of region being filled in _buffer_ and must be a
multiple of _pattern_size_.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this command
and can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
The usage information given by the {cl_mem_flags_TYPE} argument value
specified when _buffer_ is created, which indicates whether the memory object
can be read or written by a kernel and/or the host, is ignored by
{clEnqueueFillBuffer}.
// refError
{clEnqueueFillBuffer} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
_buffer_ are not the same or if the context associated with
_command_queue_ and events in _event_wait_list_ are not the same.
* {CL_INVALID_MEM_OBJECT} if _buffer_ is not a valid buffer object.
* {CL_INVALID_VALUE} if _offset_ or _offset_ + _size_ require accessing
elements outside the _buffer_ buffer object respectively.
* {CL_INVALID_VALUE} if _pattern_ is `NULL` or if _pattern_size_ is 0 or if
_pattern_size_ is not one of { 1, 2, 4, 8, 16, 32, 64, 128 }.
* {CL_INVALID_VALUE} if _offset_ or _size_ is not a multiple of
_pattern_size_.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _buffer_ is a sub-buffer object and
the offset specified when the sub-buffer object is created is not aligned to
the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device associated with
_command_queue_.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for data store associated with _buffer_.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
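
The argument rules for _pattern_, _pattern_size_, _offset_ and _size_ can be checked on the host before enqueueing the fill command.
The following is a minimal, non-normative sketch of such a check.
The names `fill_check`, `fill_pattern_size_ok` and `check_fill_args` are hypothetical helpers, not part of the OpenCL API, and `buffer_size` is assumed to be the size in bytes of _buffer_ as known to the application.
The order in which the conditions are tested is a choice made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch; a real application would
 * compare against the CL_* error codes from the OpenCL headers. */
typedef enum {
    FILL_OK = 0,
    FILL_BAD_PATTERN,   /* NULL pattern or unsupported pattern_size */
    FILL_OUT_OF_BOUNDS, /* offset or offset + size outside the buffer */
    FILL_MISALIGNED     /* offset or size not a multiple of pattern_size */
} fill_check;

/* True if pattern_size is one of { 1, 2, 4, 8, 16, 32, 64, 128 }. */
bool fill_pattern_size_ok(size_t pattern_size)
{
    return pattern_size != 0 && pattern_size <= 128 &&
           (pattern_size & (pattern_size - 1)) == 0;
}

/* Checks the fill arguments against a buffer of buffer_size bytes. */
fill_check check_fill_args(size_t buffer_size, const void *pattern,
                           size_t pattern_size, size_t offset, size_t size)
{
    if (pattern == NULL || !fill_pattern_size_ok(pattern_size))
        return FILL_BAD_PATTERN;
    /* Written so that offset + size cannot overflow. */
    if (offset > buffer_size || size > buffer_size - offset)
        return FILL_OUT_OF_BOUNDS;
    if (offset % pattern_size != 0 || size % pattern_size != 0)
        return FILL_MISALIGNED;
    return FILL_OK;
}
```

For a `cl_float4` pattern, _pattern_size_ is 16, so an _offset_ of 32 and a _size_ of 64 pass the check, while an _offset_ of 8 does not.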
--
=== Mapping Buffer Objects
[open,refpage='clEnqueueMapBuffer',desc='Enqueues a command to map a region of a buffer object into the host address space and returns a pointer to this mapped region.',type='protos']
--
To enqueue a command to map a region of the buffer object given by _buffer_
into the host address space and return a pointer to this mapped region,
call the function
include::{generated}/api/protos/clEnqueueMapBuffer.txt[]
include::{generated}/api/version-notes/clEnqueueMapBuffer.asciidoc[]
* _command_queue_ must be a valid host command-queue.
* _blocking_map_ indicates if the map operation is _blocking_ or
_non-blocking_.
If _blocking_map_ is {CL_TRUE}, {clEnqueueMapBuffer} does not return until the
specified region in _buffer_ is mapped into the host address space and the
application can access the contents of the mapped region using the pointer
returned by {clEnqueueMapBuffer}.
If _blocking_map_ is {CL_FALSE}, i.e. the map operation is non-blocking, the
pointer to the mapped region returned by {clEnqueueMapBuffer} cannot be used
until the map command has completed.
The _event_ argument returns an event object which can be used to query the
execution status of the map command.
When the map command is completed, the application can access the contents
of the mapped region using the pointer returned by {clEnqueueMapBuffer}.
* _map_flags_ is a bit-field and is described in the
<<memory-map-flags-table,Memory Map Flags>> table.
* _buffer_ is a valid buffer object.
The OpenCL context associated with _command_queue_ and _buffer_ must be the
same.
* _offset_ and _size_ are the offset in bytes and the size of the region in
the buffer object that is being mapped.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this command
and can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
* _errcode_ret_ will return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
// refError
{clEnqueueMapBuffer} returns a pointer to the mapped region and sets
_errcode_ret_ to {CL_SUCCESS} if the function is executed successfully.
Otherwise, it returns a `NULL` pointer with one of the following error
values returned in _errcode_ret_:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
_buffer_ are not the same or if the context associated with
_command_queue_ and events in _event_wait_list_ are not the same.
* {CL_INVALID_MEM_OBJECT} if _buffer_ is not a valid buffer object.
* {CL_INVALID_VALUE} if region being mapped given by (_offset_, _size_) is
out of bounds or if _size_ is 0 or if values specified in _map_flags_
are not valid.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _buffer_ is a sub-buffer object and
the offset specified when the sub-buffer object is created is not aligned
to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device associated with
_command_queue_.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_MAP_FAILURE} if there is a failure to map the requested region into
the host address space.
This error cannot occur for buffer objects created with
{CL_MEM_USE_HOST_PTR} or {CL_MEM_ALLOC_HOST_PTR}.
* {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the map operation is
blocking and the execution status of any of the events in
_event_wait_list_ is a negative integer value.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for data store associated with _buffer_.
* {CL_INVALID_OPERATION} if _buffer_ has been created with
{CL_MEM_HOST_WRITE_ONLY} or {CL_MEM_HOST_NO_ACCESS} and {CL_MAP_READ} is set
in _map_flags_ or if _buffer_ has been created with
{CL_MEM_HOST_READ_ONLY} or {CL_MEM_HOST_NO_ACCESS} and {CL_MAP_WRITE} or
{CL_MAP_WRITE_INVALIDATE_REGION} is set in _map_flags_.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
* {CL_INVALID_OPERATION} if mapping would lead to overlapping regions being
mapped for writing.
The pointer returned maps a region starting at _offset_ and is at least
_size_ bytes in size.
The result of a memory access outside this region is undefined.
If the buffer object is created with {CL_MEM_USE_HOST_PTR} set in _mem_flags_,
the following will be true:
* The _host_ptr_ specified in {clCreateBuffer} or {clCreateBufferWithProperties}
will contain the latest bits in the region being mapped when the
{clEnqueueMapBuffer} command has completed.
* The pointer value returned by {clEnqueueMapBuffer} will be derived from
the _host_ptr_ specified when the buffer object is created.
Mapped buffer objects are unmapped using {clEnqueueUnmapMemObject}.
This is described in <<unmapping-mapped-memory, Unmapping Mapped Memory
Objects>>.
[[memory-map-flags-table]]
.List of supported map flag values
[width="100%",cols="<50%,<50%",options="header"]
|====
| Map Flags | Description
| {CL_MAP_READ_anchor}
include::{generated}/api/version-notes/CL_MAP_READ.asciidoc[]
| This flag specifies that the region being mapped in the memory object is
being mapped for reading.
The pointer returned by {clEnqueueMapBuffer} ({clEnqueueMapImage}) is
guaranteed to contain the latest bits in the region being mapped when
the {clEnqueueMapBuffer} ({clEnqueueMapImage}) command has completed.
| {CL_MAP_WRITE_anchor}
include::{generated}/api/version-notes/CL_MAP_WRITE.asciidoc[]
| This flag specifies that the region being mapped in the memory object is
being mapped for writing.
The pointer returned by {clEnqueueMapBuffer} ({clEnqueueMapImage}) is
guaranteed to contain the latest bits in the region being mapped when
the {clEnqueueMapBuffer} ({clEnqueueMapImage}) command has completed.
| {CL_MAP_WRITE_INVALIDATE_REGION_anchor}
include::{generated}/api/version-notes/CL_MAP_WRITE_INVALIDATE_REGION.asciidoc[]
| This flag specifies that the region being mapped in the memory object is
being mapped for writing.
The contents of the region being mapped are to be discarded.
This is typically the case when the region being mapped is overwritten
by the host.
This flag allows the implementation to no longer guarantee that the
pointer returned by {clEnqueueMapBuffer} ({clEnqueueMapImage}) contains
the latest bits in the region being mapped which can be a significant
performance enhancement.
{CL_MAP_READ} or {CL_MAP_WRITE} and {CL_MAP_WRITE_INVALIDATE_REGION} are
mutually exclusive.
|====
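
The rules for _map_flags_ can be summarized in code.
The following non-normative sketch checks that the write-invalidate flag is not combined with the read or write flags, and that the requested mapping is compatible with the host access flags the buffer was created with.
The enumerants and the functions `map_flags_valid` and `map_allowed_by_host_access` are hypothetical stand-ins for this sketch, with bit values chosen for illustration only.
Treating undefined bits as invalid is also an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for CL_MAP_READ, CL_MAP_WRITE and
 * CL_MAP_WRITE_INVALIDATE_REGION. */
enum {
    MAP_READ                    = 1u << 0,
    MAP_WRITE                   = 1u << 1,
    MAP_WRITE_INVALIDATE_REGION = 1u << 2
};

/* Hypothetical stand-ins for CL_MEM_HOST_WRITE_ONLY,
 * CL_MEM_HOST_READ_ONLY and CL_MEM_HOST_NO_ACCESS. */
enum {
    HOST_WRITE_ONLY = 1u << 0,
    HOST_READ_ONLY  = 1u << 1,
    HOST_NO_ACCESS  = 1u << 2
};

/* Rejects undefined bits and any combination of the write-invalidate
 * flag with the read or write flags. */
bool map_flags_valid(uint32_t map_flags)
{
    const uint32_t known = MAP_READ | MAP_WRITE | MAP_WRITE_INVALIDATE_REGION;
    if (map_flags & ~known)
        return false;
    if ((map_flags & MAP_WRITE_INVALIDATE_REGION) &&
        (map_flags & (MAP_READ | MAP_WRITE)))
        return false;
    return true;
}

/* Mirrors the CL_INVALID_OPERATION condition on host access flags. */
bool map_allowed_by_host_access(uint32_t host_access, uint32_t map_flags)
{
    bool wants_read = (map_flags & MAP_READ) != 0;
    bool wants_write =
        (map_flags & (MAP_WRITE | MAP_WRITE_INVALIDATE_REGION)) != 0;
    if (wants_read && (host_access & (HOST_WRITE_ONLY | HOST_NO_ACCESS)))
        return false;
    if (wants_write && (host_access & (HOST_READ_ONLY | HOST_NO_ACCESS)))
        return false;
    return true;
}
```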
--
== Image Objects
An _image_ object is used to store a one-, two- or three-dimensional
texture, frame-buffer or image.
The elements of an image object are selected from a list of predefined image
formats.
The minimum number of elements in a memory object is one.
=== Creating Image Objects
[open,refpage='clCreateImage',desc='Creates a 1D image, 1D image buffer, 1D image array, 2D image, 2D image array or 3D image object.',type='protos',alias='clCreateImageWithProperties']
--
An *image object* may be created using the function
include::{generated}/api/protos/clCreateImage.txt[]
include::{generated}/api/version-notes/clCreateImage.asciidoc[]
An *image object* may also be created with additional properties using the function
include::{generated}/api/protos/clCreateImageWithProperties.txt[]
include::{generated}/api/version-notes/clCreateImageWithProperties.asciidoc[]
* _context_ is a valid OpenCL context used to create the image object.
* _properties_ is an optional list of properties for the image object and their corresponding values.
The list is terminated with the special property `0`.
If no properties are required, _properties_ may be `NULL`.
OpenCL 3.0 does not define any optional properties for images.
* _flags_ is a bit-field that is used to specify allocation and usage
information about the image memory object being created and is described in
the <<memory-flags-table, supported memory flag values>> table.
* _image_format_ is a pointer to a structure that describes format properties
of the image to be allocated.
A 1D image buffer or 2D image can be created from a buffer by specifying a
buffer object in the __image_desc__->__mem_object__.
A 2D image can be created from another 2D image object by specifying an
image object in the __image_desc__->__mem_object__.
Refer to the <<image-format-descriptor, Image Format Descriptor>> section
for a detailed description of the image format descriptor.
* _image_desc_ is a pointer to a structure that describes type and dimensions
of the image to be allocated.
Refer to the <<image-descriptor, Image Descriptor>> section for a detailed
description of the image descriptor.
* _host_ptr_ is a pointer to the image data that may already be allocated by
the application.
Refer to the <<host-ptr-buffer-size-table, table below>> for a description
of how large the buffer that _host_ptr_ points to must be.
* _errcode_ret_ will return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
The alignment requirements for data stored in image objects are described
in <<alignment-app-data-types>>.
For all image types except {CL_MEM_OBJECT_IMAGE1D_BUFFER}, if the value
specified for _flags_ is 0, the default is used which is {CL_MEM_READ_WRITE}.
For {CL_MEM_OBJECT_IMAGE1D_BUFFER} image type, or an image created from
another memory object (image or buffer), if the {CL_MEM_READ_WRITE},
{CL_MEM_READ_ONLY} or {CL_MEM_WRITE_ONLY} values are not specified in _flags_,
they are inherited from the corresponding memory access qualifiers associated
with __mem_object__.
The {CL_MEM_USE_HOST_PTR}, {CL_MEM_ALLOC_HOST_PTR} and {CL_MEM_COPY_HOST_PTR}
values cannot be specified in _flags_ but are inherited from the
corresponding memory access qualifiers associated with __mem_object__.
If {CL_MEM_COPY_HOST_PTR} is specified in the memory access qualifier values
associated with __mem_object__ it does not imply any additional copies when
the image is created from __mem_object__.
If the {CL_MEM_HOST_WRITE_ONLY}, {CL_MEM_HOST_READ_ONLY} or
{CL_MEM_HOST_NO_ACCESS} values are not specified in _flags_, they are
inherited from the corresponding memory access qualifiers associated with
__mem_object__.
For a 3D image or 2D image array, the image data specified by _host_ptr_ is
stored as a linear sequence of adjacent 2D image slices or 2D images
respectively.
Each 2D image is a linear sequence of adjacent scanlines.
Each scanline is a linear sequence of image elements.
For a 2D image, the image data specified by _host_ptr_ is stored as a linear
sequence of adjacent scanlines.
Each scanline is a linear sequence of image elements.
For a 1D image array, the image data specified by _host_ptr_ is stored as a
linear sequence of adjacent 1D images.
Each 1D image is stored as a single scanline which is a linear sequence of
adjacent elements.
For 1D image or 1D image buffer, the image data specified by _host_ptr_ is
stored as a single scanline which is a linear sequence of adjacent elements.
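
The layouts described above all reduce to one addressing rule.
The following non-normative sketch computes the byte offset of an image element within the memory that _host_ptr_ points to.
The function name `image_element_offset` is hypothetical, and the sketch assumes that `row_pitch` and `slice_pitch` hold the row and slice pitches in bytes and that `element_size` is the size in bytes of one image element.

```c
#include <stddef.h>

/* Byte offset of element (x, y, z) in host memory laid out as adjacent
 * 2D slices, each a sequence of scanlines, each a sequence of elements.
 * For a 2D image pass z = 0; for a 1D image or 1D image buffer pass
 * y = 0 and z = 0.  For a 2D image array pass the array index as z;
 * for a 1D image array pass the array index as z and y = 0. */
size_t image_element_offset(size_t x, size_t y, size_t z,
                            size_t element_size,
                            size_t row_pitch, size_t slice_pitch)
{
    return z * slice_pitch + y * row_pitch + x * element_size;
}
```

For example, with 4-byte elements, a row pitch of 64 bytes and a slice pitch of 1024 bytes, element (3, 2, 1) starts at byte 1024 + 128 + 12 = 1164.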
Image elements are stored according to their image format as described in the
<<image-format-descriptor, Image Format Descriptor>> section.
// refError
{clCreateImage} and {clCreateImageWithProperties} return a valid non-zero
image object and _errcode_ret_ is set to {CL_SUCCESS} if the image object
is created successfully.
Otherwise, they return a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_PROPERTY} if a property name in _properties_ is not a
supported property name, if the value specified for a supported property
name is not valid, or if the same property name is specified more than
once.
* {CL_INVALID_VALUE} if values specified in _flags_ are not valid.
* {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR} if values specified in _image_format_
are not valid or if _image_format_ is `NULL`.
* {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR} if a 2D image is created from a
buffer and the row pitch and base address alignment do not follow the
rules described for creating a 2D image from a buffer.
* {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR} if a 2D image is created from a 2D
image object and the rules described above are not followed.
* {CL_INVALID_IMAGE_DESCRIPTOR} if values specified in _image_desc_ are not
valid or if _image_desc_ is `NULL`.
* {CL_INVALID_IMAGE_SIZE} if image dimensions specified in _image_desc_
exceed the maximum image dimensions described in the
<<device-queries-table,Device Queries>> table for all devices
in _context_.
* {CL_INVALID_HOST_PTR} if _host_ptr_ is `NULL` and {CL_MEM_USE_HOST_PTR} or
{CL_MEM_COPY_HOST_PTR} are set in _flags_ or if _host_ptr_ is not `NULL`
but {CL_MEM_COPY_HOST_PTR} or {CL_MEM_USE_HOST_PTR} are not set in _flags_.
* {CL_INVALID_VALUE} if an image is being created from another memory object
(buffer or image) under one of the following circumstances: 1)
__mem_object__ was created with {CL_MEM_WRITE_ONLY} and _flags_ specifies
{CL_MEM_READ_WRITE} or {CL_MEM_READ_ONLY}, 2) __mem_object__ was created with
{CL_MEM_READ_ONLY} and _flags_ specifies {CL_MEM_READ_WRITE} or
{CL_MEM_WRITE_ONLY}, 3) _flags_ specifies {CL_MEM_USE_HOST_PTR} or
{CL_MEM_ALLOC_HOST_PTR} or {CL_MEM_COPY_HOST_PTR}.
* {CL_INVALID_VALUE} if an image is being created from another memory object
(buffer or image) and __mem_object__ was created with
{CL_MEM_HOST_WRITE_ONLY} and _flags_ specifies {CL_MEM_HOST_READ_ONLY}, or
if __mem_object__ was created with {CL_MEM_HOST_READ_ONLY} and _flags_
specifies {CL_MEM_HOST_WRITE_ONLY}, or if __mem_object__ was created with
{CL_MEM_HOST_NO_ACCESS} and _flags_ specifies {CL_MEM_HOST_READ_ONLY} or
{CL_MEM_HOST_WRITE_ONLY}.
* {CL_IMAGE_FORMAT_NOT_SUPPORTED} if there are no devices in _context_ that
support _image_format_.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for image object.
* {CL_INVALID_OPERATION} if there are no devices in _context_ that support
images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
<<device-queries-table,Device Queries>> table is {CL_FALSE}).
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
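
The inheritance and {CL_INVALID_VALUE} rules for the device access qualifiers of an image created from another memory object can be sketched as a small decision function.
This sketch is non-normative.
The type `device_access` and the function `image_access_from_parent` are hypothetical; they model only the {CL_MEM_READ_WRITE}, {CL_MEM_READ_ONLY} and {CL_MEM_WRITE_ONLY} values and leave the host access and host pointer flags aside.

```c
/* Hypothetical model of the device access qualifier in flags. */
typedef enum {
    ACCESS_UNSPECIFIED, /* none of the three values set in flags */
    ACCESS_READ_WRITE,
    ACCESS_READ_ONLY,
    ACCESS_WRITE_ONLY,
    ACCESS_INVALID      /* combination rejected with CL_INVALID_VALUE */
} device_access;

/* Returns the access qualifier the new image ends up with, given the
 * qualifier of mem_object (parent) and the one requested in flags. */
device_access image_access_from_parent(device_access parent,
                                       device_access requested)
{
    if (requested == ACCESS_UNSPECIFIED)
        return parent; /* inherited from mem_object */
    if (parent == ACCESS_WRITE_ONLY && requested != ACCESS_WRITE_ONLY)
        return ACCESS_INVALID;
    if (parent == ACCESS_READ_ONLY && requested != ACCESS_READ_ONLY)
        return ACCESS_INVALID;
    return requested;
}
```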
[[host-ptr-buffer-size-table]]
.Required _host_ptr_ buffer sizes for images
[width="100%",cols="<50%,<50%",options="header"]
|====
| Image Type | Size of buffer that _host_ptr_ points to
| {CL_MEM_OBJECT_IMAGE1D_anchor}
include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE1D.asciidoc[]
| {geq} image_row_pitch
| {CL_MEM_OBJECT_IMAGE1D_BUFFER_anchor}
include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE1D_BUFFER.asciidoc[]
| {geq} image_row_pitch
| {CL_MEM_OBJECT_IMAGE2D_anchor}
include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE2D.asciidoc[]
| {geq} image_row_pitch {times} image_height
| {CL_MEM_OBJECT_IMAGE3D_anchor}
include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE3D.asciidoc[]
| {geq} image_slice_pitch {times} image_depth
| {CL_MEM_OBJECT_IMAGE1D_ARRAY_anchor}
include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE1D_ARRAY.asciidoc[]
| {geq} image_slice_pitch {times} image_array_size
| {CL_MEM_OBJECT_IMAGE2D_ARRAY_anchor}
include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE2D_ARRAY.asciidoc[]
| {geq} image_slice_pitch {times} image_array_size
|====
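
The table above translates directly into a size computation.
The following non-normative sketch returns the minimum number of bytes that _host_ptr_ must point to for each image type.
The type `image_type` and the function `min_host_ptr_size` are hypothetical, and the pitches passed in are assumed to be the effective, non-zero values in bytes.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the CL_MEM_OBJECT_IMAGE* image types. */
typedef enum {
    IMG_1D, IMG_1D_BUFFER, IMG_2D, IMG_3D, IMG_1D_ARRAY, IMG_2D_ARRAY
} image_type;

/* Minimum size in bytes of the buffer that host_ptr points to.
 * Arguments that do not apply to the given image type are ignored. */
size_t min_host_ptr_size(image_type type,
                         size_t row_pitch, size_t slice_pitch,
                         size_t height, size_t depth, size_t array_size)
{
    switch (type) {
    case IMG_1D:
    case IMG_1D_BUFFER:
        return row_pitch;
    case IMG_2D:
        return row_pitch * height;
    case IMG_3D:
        return slice_pitch * depth;
    case IMG_1D_ARRAY:
    case IMG_2D_ARRAY:
        return slice_pitch * array_size;
    }
    return 0; /* unknown image type */
}
```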
--
[open,refpage='clCreateImage2D',desc='Creates a 2D image.',type='protos']
--
A *2D image* object can be created using the following function
include::{generated}/api/protos/clCreateImage2D.txt[]
include::{generated}/api/version-notes/clCreateImage2D.asciidoc[]
* _context_ is a valid OpenCL context on which the image object is to be
created.
* _flags_ is a bit-field that is used to specify allocation and usage
information about the image memory object being created and is described in
the <<memory-flags-table, supported memory flag values>> table.
If the value specified for _flags_ is 0, the default is used which is
{CL_MEM_READ_WRITE}.
* _image_format_ is a pointer to a structure that describes format properties
of the image to be allocated.
Refer to the <<image-format-descriptor, Image Format Descriptor>> section
for a detailed description of the image format descriptor.
* _image_width_ and _image_height_ are the width and height of the image in
pixels.
These must be values greater than or equal to 1.
* _image_row_pitch_ is the scan-line pitch in bytes.
This must be 0 if _host_ptr_ is `NULL` and can be either 0 or {geq}
_image_width_ {times} size of element in bytes if _host_ptr_ is not `NULL`.
If _host_ptr_ is not `NULL` and _image_row_pitch_ is 0, _image_row_pitch_
is calculated as _image_width_ {times} size of element in bytes.
If _image_row_pitch_ is not 0, it must be a multiple of the image element
size in bytes.
* _host_ptr_ is a pointer to the image data that may already be allocated by
the application.
Refer to the {CL_MEM_OBJECT_IMAGE2D} entry in the
<<host-ptr-buffer-size-table, required _host_ptr_ buffer size table>> for a
description of how large the buffer that _host_ptr_ points to must be.
The image data specified by _host_ptr_ is stored as a linear sequence of
adjacent scanlines.
Each scanline is a linear sequence of image elements.
Image elements are stored according to their image format as described in
the <<image-format-descriptor, Image Format Descriptor>> section.
* _errcode_ret_ will return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
// refError
{clCreateImage2D} returns a valid non-zero image object and sets
_errcode_ret_ to {CL_SUCCESS} if the image object is created
successfully.
Otherwise, it returns a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_VALUE} if values specified in _flags_ are not valid.
* {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR} if values specified in _image_format_
are not valid or if _image_format_ is `NULL`.
* {CL_INVALID_IMAGE_SIZE} if _image_width_ or _image_height_ are 0 or if they
exceed the maximum values specified in {CL_DEVICE_IMAGE2D_MAX_WIDTH} or
{CL_DEVICE_IMAGE2D_MAX_HEIGHT} respectively for all devices in _context_ or
if values specified by _image_row_pitch_ do not follow rules described in the
argument description above.
* {CL_INVALID_HOST_PTR} if _host_ptr_ is `NULL` and {CL_MEM_USE_HOST_PTR} or
{CL_MEM_COPY_HOST_PTR} are set in _flags_ or if _host_ptr_ is not `NULL`
but {CL_MEM_COPY_HOST_PTR} or {CL_MEM_USE_HOST_PTR} are not set in _flags_.
* {CL_IMAGE_FORMAT_NOT_SUPPORTED} if there are no devices in _context_ that
support _image_format_.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for image object.
* {CL_INVALID_OPERATION} if there are no devices in _context_ that support
images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
<<device-queries-table,Device Queries>> table is {CL_FALSE}).
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
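
The rules for _image_row_pitch_ can be expressed as a host-side check.
The following non-normative sketch validates a row pitch and computes the pitch that applies when _host_ptr_ is not `NULL`.
The functions `row_pitch_valid` and `effective_host_row_pitch` are hypothetical, and `element_size` is assumed to be the size in bytes of one image element for the chosen _image_format_.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if image_row_pitch follows the rules for the given arguments. */
bool row_pitch_valid(size_t image_width, size_t element_size,
                     bool has_host_ptr, size_t image_row_pitch)
{
    if (image_width == 0 || element_size == 0)
        return false;
    if (image_row_pitch == 0)
        return true;  /* 0 is always accepted */
    if (!has_host_ptr)
        return false; /* must be 0 when host_ptr is NULL */
    return image_row_pitch >= image_width * element_size &&
           image_row_pitch % element_size == 0;
}

/* Row pitch in bytes used to read host_ptr (host_ptr not NULL). */
size_t effective_host_row_pitch(size_t image_width, size_t element_size,
                                size_t image_row_pitch)
{
    return image_row_pitch != 0 ? image_row_pitch
                                : image_width * element_size;
}
```

For a 640-pixel-wide image with 4-byte elements, a row pitch of 0 resolves to 2560 bytes, and a padded pitch of 3072 bytes is also accepted.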
--
[open,refpage='clCreateImage3D',desc='Creates a 3D image.',type='protos']
--
A *3D image* object can be created using the following function
include::{generated}/api/protos/clCreateImage3D.txt[]
include::{generated}/api/version-notes/clCreateImage3D.asciidoc[]
* _context_ is a valid OpenCL context on which the image object is to be
created.
* _flags_ is a bit-field that is used to specify allocation and usage
information about the image memory object being created and is described in
the <<memory-flags-table, supported memory flag values>> table.
If the value specified for _flags_ is 0, the default is used which is
{CL_MEM_READ_WRITE}.
* _image_format_ is a pointer to a structure that describes format properties
of the image to be allocated.
Refer to the <<image-format-descriptor, Image Format Descriptor>> section
for a detailed description of the image format descriptor.
* _image_width_ and _image_height_ are the width and height of the image in
pixels.
These must be values greater than or equal to 1.
* _image_depth_ is the depth of the image in pixels. For {clCreateImage3D},
this must be a value {gt} 1.
* _image_row_pitch_ is the scan-line pitch in bytes.
This must be 0 if _host_ptr_ is `NULL` and can be either 0 or {geq}
_image_width_ {times} size of element in bytes if _host_ptr_ is not `NULL`.
If _host_ptr_ is not `NULL` and _image_row_pitch_ is 0, _image_row_pitch_
is calculated as _image_width_ {times} size of element in bytes.
If _image_row_pitch_ is not 0, it must be a multiple of the image element
size in bytes.
* _image_slice_pitch_ is the size in bytes of each 2D slice in the 3D image.
This must be 0 if _host_ptr_ is `NULL` and can be 0 or {geq}
_image_row_pitch_ {times} _image_height_ if _host_ptr_ is not `NULL`.
If _host_ptr_ is not `NULL` and _image_slice_pitch_ is 0,
_image_slice_pitch_ is calculated as _image_row_pitch_ {times}
_image_height_.
If _image_slice_pitch_ is not 0, it must be a multiple of the
_image_row_pitch_.
* _host_ptr_ is a pointer to the image data that may already be allocated by
the application.
Refer to the {CL_MEM_OBJECT_IMAGE3D} entry in the
<<host-ptr-buffer-size-table, required _host_ptr_ buffer size table>> for a
description of how large the buffer that _host_ptr_ points to must be.
The image data specified by _host_ptr_ is stored as a linear sequence of
adjacent 2D slices.
Each 2D slice is a linear sequence of adjacent scanlines.
Each scanline is a linear sequence of image elements.
Image elements are stored according to their image format as described in
the <<image-format-descriptor, Image Format Descriptor>> section.
* _errcode_ret_ will return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
// refError
{clCreateImage3D} returns a valid non-zero image object and sets
_errcode_ret_ to {CL_SUCCESS} if the image object is created
successfully.
Otherwise, it returns a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_VALUE} if values specified in _flags_ are not valid.
* {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR} if values specified in _image_format_
are not valid or if _image_format_ is `NULL`.
* {CL_INVALID_IMAGE_SIZE} if _image_width_ or _image_height_ are 0 or if
_image_depth_ {leq} 1, or if they exceed the maximum values specified in
{CL_DEVICE_IMAGE3D_MAX_WIDTH}, {CL_DEVICE_IMAGE3D_MAX_HEIGHT} or
{CL_DEVICE_IMAGE3D_MAX_DEPTH} respectively for all devices in _context_, or
if values specified by _image_row_pitch_ and _image_slice_pitch_ do not
follow rules described in the argument description above.
* {CL_INVALID_HOST_PTR} if _host_ptr_ is `NULL` and {CL_MEM_USE_HOST_PTR} or
{CL_MEM_COPY_HOST_PTR} are set in _flags_ or if _host_ptr_ is not `NULL`
but {CL_MEM_COPY_HOST_PTR} or {CL_MEM_USE_HOST_PTR} are not set in _flags_.
* {CL_IMAGE_FORMAT_NOT_SUPPORTED} if there are no devices in _context_ that
support _image_format_.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for image object.
* {CL_INVALID_OPERATION} if there are no devices in _context_ that support
images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
<<device-queries-table,Device Queries>> table is {CL_FALSE}).
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
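
The dimension and _image_slice_pitch_ rules for 3D images can be checked in the same way as the row pitch.
The following non-normative sketch assumes that `row_pitch` is the effective, non-zero row pitch in bytes.
The functions `image3d_dims_valid`, `slice_pitch_valid` and `effective_host_slice_pitch` are hypothetical helpers and are not part of the OpenCL API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Width and height must be at least 1; depth must be greater than 1. */
bool image3d_dims_valid(size_t width, size_t height, size_t depth)
{
    return width >= 1 && height >= 1 && depth > 1;
}

/* True if image_slice_pitch follows the rules for the given arguments. */
bool slice_pitch_valid(size_t row_pitch, size_t image_height,
                       bool has_host_ptr, size_t image_slice_pitch)
{
    if (row_pitch == 0 || image_height == 0)
        return false;
    if (image_slice_pitch == 0)
        return true;  /* 0 is always accepted */
    if (!has_host_ptr)
        return false; /* must be 0 when host_ptr is NULL */
    return image_slice_pitch >= row_pitch * image_height &&
           image_slice_pitch % row_pitch == 0;
}

/* Slice pitch in bytes used to read host_ptr (host_ptr not NULL). */
size_t effective_host_slice_pitch(size_t row_pitch, size_t image_height,
                                  size_t image_slice_pitch)
{
    return image_slice_pitch != 0 ? image_slice_pitch
                                  : row_pitch * image_height;
}
```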
--
[[image-format-descriptor]]
==== Image Format Descriptor
[open,refpage='cl_image_format',desc='The image format descriptor structure is defined as:',type='structs',xrefs='clCreateImage']
--
The {cl_image_format_TYPE} image format descriptor structure describes an image
format, and is defined as:
include::{generated}/api/structs/cl_image_format.txt[]
* `image_channel_order` specifies the number of channels and the channel
layout i.e. the memory layout in which channels are stored in the image.
Valid values are described in the <<image-channel-order-table,Image Channel
Order>> table.
* `image_channel_data_type` describes the size of the channel data type.
The list of supported values is described in the
<<image-channel-data-types-table,Image Channel Data Types>> table.
The number of bits per element determined by the `image_channel_data_type`
and `image_channel_order` must be a power of two.
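
The power-of-two requirement can be illustrated with a short, non-normative sketch.
It assumes an unpacked channel data type, so that the element size is the number of channels multiplied by the bits per channel; for packed data types the size of the whole element would be passed as a single channel.
The functions `is_power_of_two` and `element_size_allowed` are hypothetical helpers.

```c
#include <stdbool.h>

/* True if v is a non-zero power of two. */
bool is_power_of_two(unsigned v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* True if num_channels * bits_per_channel is a power of two. */
bool element_size_allowed(unsigned num_channels, unsigned bits_per_channel)
{
    return is_power_of_two(num_channels * bits_per_channel);
}
```

Four 8-bit channels give 32 bits per element, which satisfies the rule, whereas three 8-bit channels give 24 bits, which does not.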
[[image-channel-order-table]]
.List of supported Image Channel Order Values
[width="100%",cols="<50%,<50%",options="header"]
|====
| Image Channel Order | Description
| {CL_R_anchor}, {CL_A_anchor}
// The CL_R annotation here is used to convey the same information for both
// entries in this table row.
include::{generated}/api/version-notes/CL_R.asciidoc[]
| Single channel image formats where the single channel represents a `RED` or `ALPHA` component.
| {CL_DEPTH_anchor}
include::{generated}/api/version-notes/CL_DEPTH.asciidoc[]
| A single channel image format where the single channel represents a `DEPTH` component.
| {CL_LUMINANCE_anchor}
include::{generated}/api/version-notes/CL_LUMINANCE.asciidoc[]
| A single channel image format where the single channel represents a `LUMINANCE` value.
The `LUMINANCE` value is replicated into the `RED`, `GREEN`, and `BLUE` components.
| {CL_INTENSITY_anchor}
include::{generated}/api/version-notes/CL_INTENSITY.asciidoc[]
| A single channel image format where the single channel represents an `INTENSITY` value.
The `INTENSITY` value is replicated into the `RED`, `GREEN`, `BLUE`, and `ALPHA` components.
| {CL_RG_anchor}, {CL_RA_anchor}
// The CL_RG annotation here is used to convey the same information for both
// entries in this table row.
include::{generated}/api/version-notes/CL_RG.asciidoc[]
| Two channel image formats.
The first channel always represents a `RED` component.
The second channel represents a `GREEN` component or an `ALPHA` component.
| {CL_Rx_anchor}
include::{generated}/api/version-notes/CL_Rx.asciidoc[]
| A two channel image format, where the first channel represents a `RED` component and the second channel is ignored.
| {CL_RGB_anchor}
include::{generated}/api/version-notes/CL_RGB.asciidoc[]
| A three channel image format, where the three channels represent `RED`, `GREEN`, and `BLUE` components.
| {CL_RGx_anchor}
include::{generated}/api/version-notes/CL_RGx.asciidoc[]
| A three channel image format, where the first two channels represent `RED` and `GREEN` components and the third channel is ignored.
| {CL_RGBA_anchor}, {CL_ARGB_anchor}, {CL_BGRA_anchor}, {CL_ABGR_anchor}
// Manually write this annotation as CL_ABGR is the odd-one-out here (all the
// other entries in this row were in OpenCL 1.0).
{CL_ABGR} is <<unified-spec, missing before>> version 2.0.
| Four channel image formats, where the four channels represent `RED`, `GREEN`, `BLUE`, and `ALPHA` components.
| {CL_RGBx_anchor}
include::{generated}/api/version-notes/CL_RGBx.asciidoc[]
| A four channel image format, where the first three channels represent `RED`, `GREEN`, and `BLUE` components and the fourth channel is ignored.
| {CL_sRGB_anchor}
include::{generated}/api/version-notes/CL_sRGB.asciidoc[]
| A three channel image format, where the three channels represent `RED`, `GREEN`, and `BLUE` components in the sRGB color space.
| {CL_sRGBA_anchor}, {CL_sBGRA_anchor}
// The CL_sRGBA annotation here is used to convey the same information for both
// entries in this table row.
include::{generated}/api/version-notes/CL_sRGBA.asciidoc[]
| Four channel image formats, where the first three channels represent `RED`, `GREEN`, and `BLUE` components in the sRGB color space.
The fourth channel represents an `ALPHA` component.
| {CL_sRGBx_anchor}
include::{generated}/api/version-notes/CL_sRGBx.asciidoc[]
| A four channel image format, where the first three channels represent `RED`, `GREEN`, and `BLUE` components in the sRGB color space.
The fourth channel is ignored.
|====
[[image-channel-data-types-table]]
.List of supported Image Channel Data Types
[width="100%",cols="<50%,<50%",options="header"]
|====
| Image Channel Data Type | Description
| {CL_SNORM_INT8_anchor}
include::{generated}/api/version-notes/CL_SNORM_INT8.asciidoc[]
| Each channel component is a normalized signed 8-bit integer value
| {CL_SNORM_INT16_anchor}
include::{generated}/api/version-notes/CL_SNORM_INT16.asciidoc[]
| Each channel component is a normalized signed 16-bit integer value
| {CL_UNORM_INT8_anchor}
include::{generated}/api/version-notes/CL_UNORM_INT8.asciidoc[]
| Each channel component is a normalized unsigned 8-bit integer value
| {CL_UNORM_INT16_anchor}
include::{generated}/api/version-notes/CL_UNORM_INT16.asciidoc[]
| Each channel component is a normalized unsigned 16-bit integer value
| {CL_UNORM_SHORT_565_anchor}
include::{generated}/api/version-notes/CL_UNORM_SHORT_565.asciidoc[]
| Represents a normalized 5-6-5 3-channel RGB image.
The channel order must be {CL_RGB} or {CL_RGBx}.
| {CL_UNORM_SHORT_555_anchor}
include::{generated}/api/version-notes/CL_UNORM_SHORT_555.asciidoc[]
| Represents a normalized x-5-5-5 4-channel xRGB image.
The channel order must be {CL_RGB} or {CL_RGBx}.
| {CL_UNORM_INT_101010_anchor}
include::{generated}/api/version-notes/CL_UNORM_INT_101010.asciidoc[]
| Represents a normalized x-10-10-10 4-channel xRGB image.
The channel order must be {CL_RGB} or {CL_RGBx}.
| {CL_UNORM_INT_101010_2_anchor}
include::{generated}/api/version-notes/CL_UNORM_INT_101010_2.asciidoc[]
| Represents a normalized 10-10-10-2 4-channel RGBA image.
The channel order must be {CL_RGBA}.
| {CL_SIGNED_INT8_anchor}
include::{generated}/api/version-notes/CL_SIGNED_INT8.asciidoc[]
| Each channel component is an unnormalized signed 8-bit integer value
| {CL_SIGNED_INT16_anchor}
include::{generated}/api/version-notes/CL_SIGNED_INT16.asciidoc[]
| Each channel component is an unnormalized signed 16-bit integer value
| {CL_SIGNED_INT32_anchor}
include::{generated}/api/version-notes/CL_SIGNED_INT32.asciidoc[]
| Each channel component is an unnormalized signed 32-bit integer value
| {CL_UNSIGNED_INT8_anchor}
include::{generated}/api/version-notes/CL_UNSIGNED_INT8.asciidoc[]
| Each channel component is an unnormalized unsigned 8-bit integer value
| {CL_UNSIGNED_INT16_anchor}
include::{generated}/api/version-notes/CL_UNSIGNED_INT16.asciidoc[]
| Each channel component is an unnormalized unsigned 16-bit integer value
| {CL_UNSIGNED_INT32_anchor}
include::{generated}/api/version-notes/CL_UNSIGNED_INT32.asciidoc[]
| Each channel component is an unnormalized unsigned 32-bit integer value
| {CL_HALF_FLOAT_anchor}
include::{generated}/api/version-notes/CL_HALF_FLOAT.asciidoc[]
| Each channel component is a 16-bit half-float value
| {CL_FLOAT_anchor}
include::{generated}/api/version-notes/CL_FLOAT.asciidoc[]
| Each channel component is a single precision floating-point value
|====
For example, to specify a normalized unsigned 8-bit per channel RGBA image,
set `image_channel_order` to {CL_RGBA} and `image_channel_data_type` to
{CL_UNORM_INT8}.
The memory layout of this image format is described below:
// (Jon) these tables are missing "Byte Offset -> 0 1 2 3 ..."
[width="60%",cols="<10%,<10%,<10%,<10%,<60%"]
|====
| R | G | B | A | ... |
|====
with the corresponding byte offsets
[width="60%",cols="<10%,<10%,<10%,<10%,<60%"]
|====
| 0 | 1 | 2 | 3 | ... |
|====
Similarly, if `image_channel_order` = {CL_RGBA} and `image_channel_data_type` =
{CL_SIGNED_INT16}, the memory layout of this image format is described below:
[width="60%",cols="<10%,<10%,<10%,<10%,<60%"]
|====
| R | G | B | A | ... |
|====
with the corresponding byte offsets
[width="60%",cols="<10%,<10%,<10%,<10%,<60%"]
|====
| 0 | 2 | 4 | 6 | ... |
|====
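Both layouts above follow the same rule for unpacked formats: channel _i_ of an element starts at _i_ times the channel size in bytes.
A minimal sketch in C, assuming the hypothetical helper names `element_size_bytes` and `channel_byte_offset`, which are not part of the OpenCL API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only; not part of the OpenCL API.
 * They apply to unpacked formats, where every channel has the same size. */

/* Size in bytes of one image element with num_channels channels. */
size_t element_size_bytes(size_t num_channels, size_t bytes_per_channel)
{
    assert(num_channels >= 1 && bytes_per_channel >= 1);
    return num_channels * bytes_per_channel;
}

/* Byte offset of channel channel_index from the start of its element. */
size_t channel_byte_offset(size_t channel_index, size_t bytes_per_channel)
{
    assert(bytes_per_channel >= 1);
    return channel_index * bytes_per_channel;
}
```

With one byte per channel ({CL_UNORM_INT8}) the `A` channel of an RGBA element is at offset 3, and with two bytes per channel ({CL_SIGNED_INT16}) it is at offset 6, matching the tables above.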
`image_channel_data_type` values of {CL_UNORM_SHORT_565}, {CL_UNORM_SHORT_555},
{CL_UNORM_INT_101010}, and {CL_UNORM_INT_101010_2} are special cases of packed
image formats where the channels of each element are packed into a single
unsigned short or unsigned int.
For these special packed image formats, the channels are normally packed
with the first channel in the most significant bits of the bitfield, and
successive channels occupying progressively less significant locations.
For {CL_UNORM_SHORT_565}, R is in bits 15:11, G is in bits 10:5 and B is in
bits 4:0.
For {CL_UNORM_SHORT_555}, bit 15 is undefined, R is in bits 14:10, G in bits
9:5 and B in bits 4:0.
For {CL_UNORM_INT_101010}, bits 31:30 are undefined, R is in bits 29:20, G in
bits 19:10 and B in bits 9:0.
For {CL_UNORM_INT_101010_2}, R is in bits 31:22, G in bits 21:12, B in bits
11:2 and A in bits 1:0.
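The bit positions listed above can be illustrated with a few packing helpers.
This is only a sketch: the function names are hypothetical and not part of the OpenCL API, the inputs are assumed to be already-quantized channel values rather than normalized floats, and the undefined bits are written as 0 by choice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of the OpenCL API.
 * Inputs are already-quantized channel values, not normalized floats. */

/* CL_UNORM_SHORT_565: R in bits 15:11, G in bits 10:5, B in bits 4:0. */
uint16_t pack_unorm_short_565(unsigned r, unsigned g, unsigned b)
{
    assert(r <= 31u && g <= 63u && b <= 31u);
    return (uint16_t)((r << 11) | (g << 5) | b);
}

/* CL_UNORM_SHORT_555: bit 15 is undefined (written as 0 here),
 * R in bits 14:10, G in bits 9:5, B in bits 4:0. */
uint16_t pack_unorm_short_555(unsigned r, unsigned g, unsigned b)
{
    assert(r <= 31u && g <= 31u && b <= 31u);
    return (uint16_t)((r << 10) | (g << 5) | b);
}

/* CL_UNORM_INT_101010: bits 31:30 are undefined (written as 0 here),
 * R in bits 29:20, G in bits 19:10, B in bits 9:0. */
uint32_t pack_unorm_int_101010(uint32_t r, uint32_t g, uint32_t b)
{
    assert(r <= 1023u && g <= 1023u && b <= 1023u);
    return (r << 20) | (g << 10) | b;
}

/* CL_UNORM_INT_101010_2: R in bits 31:22, G in bits 21:12,
 * B in bits 11:2, A in bits 1:0. */
uint32_t pack_unorm_int_101010_2(uint32_t r, uint32_t g, uint32_t b, uint32_t a)
{
    assert(r <= 1023u && g <= 1023u && b <= 1023u && a <= 3u);
    return (r << 22) | (g << 12) | (b << 2) | a;
}

/* Extracts the green channel (bits 10:5) of a 5-6-5 element. */
unsigned unpack_565_green(uint16_t packed)
{
    return (packed >> 5) & 0x3Fu;
}
```

In each case the first channel lands in the most significant bits, so a 5-6-5 element with only `R` set to its maximum value is `0xF800`.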
OpenCL implementations must maintain the minimum precision specified by the
number of bits in `image_channel_data_type`.
If the image format specified by `image_channel_order` and
`image_channel_data_type` cannot be supported by the OpenCL implementation,
then the call to {clCreateImage}, {clCreateImageWithProperties},
{clCreateImage2D}, or {clCreateImage3D} will return a `NULL` memory object.
--
[[image-descriptor]]
==== Image Descriptor
[open,refpage='cl_image_desc',desc='The image descriptor structure describes the type and dimensions of the image or image array and is defined as:',type='structs',xrefs='clCreateImage']
--
The {cl_image_desc_TYPE} image descriptor structure describes the image type
and dimensions of an image or image array when creating an image using
{clCreateImage} or {clCreateImageWithProperties}, and is defined as:
include::{generated}/api/structs/cl_image_desc.txt[]
* `image_type` describes the image type and must be either
{CL_MEM_OBJECT_IMAGE1D}, {CL_MEM_OBJECT_IMAGE1D_BUFFER},
{CL_MEM_OBJECT_IMAGE1D_ARRAY}, {CL_MEM_OBJECT_IMAGE2D},
{CL_MEM_OBJECT_IMAGE2D_ARRAY}, or {CL_MEM_OBJECT_IMAGE3D}.
* `image_width` is the width of the image in pixels.
For a 2D image and image array, the image width must be a value {geq} 1 and
{leq} {CL_DEVICE_IMAGE2D_MAX_WIDTH}.
For a 3D image, the image width must be a value {geq} 1 and {leq}
{CL_DEVICE_IMAGE3D_MAX_WIDTH}.
For a 1D image buffer, the image width must be a value {geq} 1 and {leq}
{CL_DEVICE_IMAGE_MAX_BUFFER_SIZE}.
For a 1D image and 1D image array, the image width must be a value {geq} 1
and {leq} {CL_DEVICE_IMAGE2D_MAX_WIDTH}.
* `image_height` is the height of the image in pixels.
This is only used if the image is a 2D or 3D image, or a 2D image array.
For a 2D image or image array, the image height must be a value {geq} 1 and
{leq} {CL_DEVICE_IMAGE2D_MAX_HEIGHT}.
For a 3D image, the image height must be a value {geq} 1 and {leq}
{CL_DEVICE_IMAGE3D_MAX_HEIGHT}.
* `image_depth` is the depth of the image in pixels.
This is only used if the image is a 3D image and must be a value {geq} 1 and
{leq} {CL_DEVICE_IMAGE3D_MAX_DEPTH}.
* `image_array_size` footnote:[{fn-image-array-performance}] is the number of
images in the image array.
This is only used if the image is a 1D or 2D image array.
The values for `image_array_size`, if specified, must be a value {geq} 1 and
{leq} {CL_DEVICE_IMAGE_MAX_ARRAY_SIZE}.
* `image_row_pitch` is the scan-line pitch in bytes.
This must be 0 if _host_ptr_ is `NULL` and can be either 0 or {geq}
`image_width` {times} size of element in bytes if _host_ptr_ is not `NULL`.
If _host_ptr_ is not `NULL` and `image_row_pitch` = 0, `image_row_pitch` is
calculated as `image_width` {times} size of element in bytes.
If `image_row_pitch` is not 0, it must be a multiple of the image element
size in bytes.
For a 2D image created from a buffer, the pitch specified (or computed if
pitch specified is 0) must be a multiple of the maximum of the
{CL_DEVICE_IMAGE_PITCH_ALIGNMENT} value for all devices in the context
associated with the buffer specified by `mem_object` that support images.
* `image_slice_pitch` is the size in bytes of each 2D slice in the 3D image or
the size in bytes of each image in a 1D or 2D image array.
This must be 0 if _host_ptr_ is `NULL`.
If _host_ptr_ is not `NULL`, `image_slice_pitch` can be either 0 or {geq}
`image_row_pitch` {times} `image_height` for a 2D image array or 3D image
and can be either 0 or {geq} `image_row_pitch` for a 1D image array.
If _host_ptr_ is not `NULL` and `image_slice_pitch` = 0, `image_slice_pitch`
is calculated as `image_row_pitch` {times} `image_height` for a 2D image
array or 3D image and `image_row_pitch` for a 1D image array.
If `image_slice_pitch` is not 0, it must be a multiple of the
`image_row_pitch`.
* `num_mip_levels` and `num_samples` must be 0.
* `mem_object` may refer to a valid buffer or image memory object.
`mem_object` can be a buffer memory object if `image_type` is
{CL_MEM_OBJECT_IMAGE1D_BUFFER} or
{CL_MEM_OBJECT_IMAGE2D} footnote:[{fn-image-from-buffer}].
`mem_object` can be an image object if `image_type` is
{CL_MEM_OBJECT_IMAGE2D} footnote:[{fn-image-from-image}].
Otherwise it must be `NULL`.
The image pixels are taken from the memory object's data store.
When the contents of the specified memory object's data store are modified,
those changes are reflected in the contents of the image object and
vice-versa at corresponding synchronization points.
For a 1D image buffer created from a buffer object, the `image_width` {times}
size of element in bytes must be {leq} size of the buffer object.
The image data in the buffer object is stored as a single scanline which is
a linear sequence of adjacent elements.
For a 2D image created from a buffer object, the `image_row_pitch` {times}
`image_height` must be {leq} size of the buffer object specified by
`mem_object`.
The image data in the buffer object is stored as a linear sequence of
adjacent scanlines.
Each scanline is a linear sequence of image elements padded to
`image_row_pitch` bytes.
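The row pitch rules and the buffer size requirement for a 2D image created from a buffer can be checked on the host before creating the image.
The following is a rough sketch under stated assumptions: `effective_row_pitch` and `image2d_fits_in_buffer` are hypothetical names, not OpenCL functions, the device pitch alignment requirement is deliberately left out, and arithmetic overflow is not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only; not part of the OpenCL API.
 * The device pitch alignment requirement is not checked here. */

/* Returns the row pitch in bytes that applies to an image that is
 * image_width pixels wide, or 0 if requested_row_pitch breaks the rules.
 * A requested pitch of 0 selects the tightly packed default. */
size_t effective_row_pitch(size_t image_width, size_t element_size,
                           size_t requested_row_pitch)
{
    assert(image_width >= 1 && element_size >= 1);
    size_t min_pitch = image_width * element_size;
    if (requested_row_pitch == 0)
        return min_pitch;               /* computed default */
    if (requested_row_pitch < min_pitch)
        return 0;                       /* too small for one scanline */
    if (requested_row_pitch % element_size != 0)
        return 0;                       /* not a multiple of the element size */
    return requested_row_pitch;
}

/* Returns 1 if row_pitch times image_height does not exceed buffer_size. */
int image2d_fits_in_buffer(size_t row_pitch, size_t image_height,
                           size_t buffer_size)
{
    assert(row_pitch >= 1 && image_height >= 1);
    return row_pitch * image_height <= buffer_size;
}
```

For a 640 pixel wide image with 4-byte elements, a requested pitch of 0 yields 2560 bytes, while a requested pitch of 2562 is rejected because it is not a multiple of the element size.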
For an image object created from another image object, the values specified
in the image descriptor except for `mem_object` must match the image
descriptor information associated with `mem_object`.
Image elements are stored according to their image format as described in
<<image-format-descriptor, Image Format Descriptor>>.
If the buffer object specified by `mem_object` was created with
{CL_MEM_USE_HOST_PTR}, the _host_ptr_ specified to {clCreateBuffer} or
{clCreateBufferWithProperties} must be aligned to the maximum of the
{CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT} value for all devices in the
context associated with the buffer specified by `mem_object` that
support images.
Creating a 2D image object from another 2D image object creates a new
2D image object that shares the image data store with `mem_object` but views
the pixels in the image with a different image channel order.
Restrictions are:
* All of the values specified in _image_desc_ must match the image descriptor
information associated with `mem_object`, except for `mem_object`.
* The image channel data type specified in _image_format_ must match the
image channel data type associated with `mem_object`.
* The image channel order specified in _image_format_ must be compatible
with the image channel order associated with `mem_object`.
Compatible image channel orders
footnote:[{fn-compatible-image-channel-orders}] are:
+
--
[width="100%",cols="<50%,<50%",options="header"]
|====
| Image Channel Order in _image_format_:
| Image Channel Order associated with `mem_object`:
| {CL_sBGRA}
| {CL_BGRA}
| {CL_BGRA}
| {CL_sBGRA}
| {CL_sRGBA}
| {CL_RGBA}
| {CL_RGBA}
| {CL_sRGBA}
| {CL_sRGB}
| {CL_RGB}
| {CL_RGB}
| {CL_sRGB}
| {CL_sRGBx}
| {CL_RGBx}
| {CL_RGBx}
| {CL_sRGBx}
| {CL_DEPTH}
| {CL_R}
|====
--
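The compatibility table above is directional: the first column is the channel order requested in _image_format_ and the second is the channel order of the existing image.
A minimal lookup sketch follows, assuming a hypothetical `channel_order` enumeration that stands in for the real `cl_channel_order` constants and implementing only the pairs listed in the table:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the OpenCL channel order constants, for illustration only.
 * A real application would use the values from the OpenCL headers. */
typedef enum {
    ORDER_R, ORDER_DEPTH,
    ORDER_RGB, ORDER_sRGB,
    ORDER_RGBx, ORDER_sRGBx,
    ORDER_RGBA, ORDER_sRGBA,
    ORDER_BGRA, ORDER_sBGRA
} channel_order;

/* Returns 1 if the pair (requested, existing) appears in the table of
 * compatible image channel orders, and 0 otherwise. */
int channel_orders_compatible(channel_order requested, channel_order existing)
{
    static const channel_order pairs[][2] = {
        { ORDER_sBGRA, ORDER_BGRA  }, { ORDER_BGRA, ORDER_sBGRA },
        { ORDER_sRGBA, ORDER_RGBA  }, { ORDER_RGBA, ORDER_sRGBA },
        { ORDER_sRGB,  ORDER_RGB   }, { ORDER_RGB,  ORDER_sRGB  },
        { ORDER_sRGBx, ORDER_RGBx  }, { ORDER_RGBx, ORDER_sRGBx },
        { ORDER_DEPTH, ORDER_R     }  /* listed in one direction only */
    };
    for (size_t i = 0; i < sizeof(pairs) / sizeof(pairs[0]); ++i) {
        if (pairs[i][0] == requested && pairs[i][1] == existing)
            return 1;
    }
    return 0;
}
```

Note that the table lists a {CL_DEPTH} view of a {CL_R} image but not the reverse, which is why the lookup takes an ordered pair.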
[NOTE]
====
Concurrent reading from, writing to and copying between both a buffer object
and 1D image buffer or 2D image object associated with the buffer object is
undefined.
Only reading from both a buffer object and 1D image buffer or 2D image
object associated with the buffer object is defined.
Writing to an image created from a buffer and then reading from this buffer
in a kernel even if appropriate synchronization operations (such as a
barrier) are performed between the writes and reads is undefined.
Similarly, writing to the buffer and reading from the image created from
this buffer with appropriate synchronization between the writes and reads is
undefined.
====
--
=== Querying List of Supported Image Formats
[open,refpage='clGetSupportedImageFormats',desc='Get the list of image formats supported by an OpenCL implementation.',type='protos']
--
To get the list of image formats supported by an OpenCL implementation for a
specified context, image type, and allocation information, call the function
include::{generated}/api/protos/clGetSupportedImageFormats.txt[]
include::{generated}/api/version-notes/clGetSupportedImageFormats.asciidoc[]
* _context_ is a valid OpenCL context on which the image object(s) will be
created.
* _flags_ is a bit-field that is used to specify usage
information about the image formats being queried and is described in
the <<memory-flags-table,Memory Flags>> table.
_flags_ may be {CL_MEM_READ_WRITE} to query image formats that may be read
from and written to by different kernel instances when correctly ordered by
event dependencies, or {CL_MEM_READ_ONLY} to query image formats that may
be read from by a kernel, or {CL_MEM_WRITE_ONLY} to query image formats that
may be written to by a kernel, or {CL_MEM_KERNEL_READ_AND_WRITE} to query
image formats that may be both read from and written to by the same kernel
instance.
Please see <<image-format-mapping, Image Format Mapping>> for clarification.
* _image_type_ describes the image type and must be either
{CL_MEM_OBJECT_IMAGE1D}, {CL_MEM_OBJECT_IMAGE1D_BUFFER}, {CL_MEM_OBJECT_IMAGE2D},
{CL_MEM_OBJECT_IMAGE3D}, {CL_MEM_OBJECT_IMAGE1D_ARRAY}, or
{CL_MEM_OBJECT_IMAGE2D_ARRAY}.
* _num_entries_ specifies the number of entries that can be returned in the
memory location given by _image_formats_.
* _image_formats_ is a pointer to a memory location where the list of
supported image formats is returned.
Each entry describes a {cl_image_format_TYPE} structure supported by the OpenCL
implementation.
If _image_formats_ is `NULL`, it is ignored.
* _num_image_formats_ is the actual number of supported image formats for a
specific _context_ and values specified by _flags_.
If _num_image_formats_ is `NULL`, it is ignored.
{clGetSupportedImageFormats} returns a union of image formats supported by
all devices in the context.
// The following information about an image memory object
// is specified:
//
// * Context
// * Image type 1D, 2D, or 3D image, 1D image buffer, 1D or 2D image array.
// * Image object allocation information
// refError
{clGetSupportedImageFormats} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_VALUE} if _flags_ or _image_type_ are not valid, or if
_num_entries_ is 0 and _image_formats_ is not `NULL`.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
If {CL_DEVICE_IMAGE_SUPPORT} specified in the <<device-queries-table,Device
Queries>> table is {CL_TRUE}, the values assigned to
{CL_DEVICE_MAX_READ_IMAGE_ARGS}, {CL_DEVICE_MAX_WRITE_IMAGE_ARGS},
{CL_DEVICE_IMAGE2D_MAX_WIDTH}, {CL_DEVICE_IMAGE2D_MAX_HEIGHT},
{CL_DEVICE_IMAGE3D_MAX_WIDTH}, {CL_DEVICE_IMAGE3D_MAX_HEIGHT},
{CL_DEVICE_IMAGE3D_MAX_DEPTH}, and {CL_DEVICE_MAX_SAMPLERS} by the implementation
must be greater than or equal to the minimum values specified in the
<<device-queries-table,Device Queries>> table.
--
[[minimum-list-of-supported-image-formats]]
==== Minimum List of Supported Image Formats
[open,refpage='supportedImageFormats',desc='Minimum List of Supported Image Formats',type='freeform',anchor='minimum-list-of-supported-image-formats']
--
The tables below describe the required minimum lists of supported image
formats.
To query all image formats supported by an implementation, call the function {clGetSupportedImageFormats}.
For full profile devices supporting OpenCL 2.0, 2.1, or 2.2, the minimum
list of supported image formats for either reading or writing in a kernel
is:
[[min-supported-image-formats-2.0]]
.Minimum list of supported image formats for reading or writing (OpenCL 2.0, 2.1, or 2.2)
[width="100%",cols="<34%,<33%,<33%",options="header"]
|====
| num_channels | channel_order | channel_data_type
| 1
| {CL_R}
| {CL_UNORM_INT8} +
{CL_UNORM_INT16} +
{CL_SNORM_INT8} +
{CL_SNORM_INT16} +
{CL_SIGNED_INT8} +
{CL_SIGNED_INT16} +
{CL_SIGNED_INT32} +
{CL_UNSIGNED_INT8} +
{CL_UNSIGNED_INT16} +
{CL_UNSIGNED_INT32} +
{CL_HALF_FLOAT} +
{CL_FLOAT}
| 1
| {CL_DEPTH} footnote:[{fn-depth-image-requirements}]
| {CL_UNORM_INT16} +
{CL_FLOAT}
| 2
| {CL_RG}
| {CL_UNORM_INT8} +
{CL_UNORM_INT16} +
{CL_SNORM_INT8} +
{CL_SNORM_INT16} +
{CL_SIGNED_INT8} +
{CL_SIGNED_INT16} +
{CL_SIGNED_INT32} +
{CL_UNSIGNED_INT8} +
{CL_UNSIGNED_INT16} +
{CL_UNSIGNED_INT32} +
{CL_HALF_FLOAT} +
{CL_FLOAT}
| 4
| {CL_RGBA}
| {CL_UNORM_INT8} +
{CL_UNORM_INT16} +
{CL_SNORM_INT8} +
{CL_SNORM_INT16} +
{CL_SIGNED_INT8} +
{CL_SIGNED_INT16} +
{CL_SIGNED_INT32} +
{CL_UNSIGNED_INT8} +
{CL_UNSIGNED_INT16} +
{CL_UNSIGNED_INT32} +
{CL_HALF_FLOAT} +
{CL_FLOAT}
| 4
| {CL_BGRA}
| {CL_UNORM_INT8}
| 4
| {CL_sRGBA} footnote:[{fn-srgb-image-requirements}]
| {CL_UNORM_INT8}
|====
For full profile devices supporting other OpenCL versions, such as OpenCL 1.2
or OpenCL 3.0, the minimum list of supported image formats for either reading
or writing in a kernel is:
// Note, differences are:
// * No {CL_R} images.
// * No {CL_DEPTH} images.
// * No {CL_RG} images.
// * No {CL_sRGBA} images.
// * For {CL_RGBA} images, no {CL_SNORM_INT8} or {CL_SNORM_INT16} image channel
// data types.
[[min-supported-image-formats]]
.Minimum list of required image formats for reading or writing
[width="100%",cols="<34%,<33%,<33%",options="header"]
|====
| num_channels | channel_order | channel_data_type
| 4
| {CL_RGBA}
| {CL_UNORM_INT8} +
{CL_UNORM_INT16} +
{CL_SIGNED_INT8} +
{CL_SIGNED_INT16} +
{CL_SIGNED_INT32} +
{CL_UNSIGNED_INT8} +
{CL_UNSIGNED_INT16} +
{CL_UNSIGNED_INT32} +
{CL_HALF_FLOAT} +
{CL_FLOAT}
| 4
| {CL_BGRA}
| {CL_UNORM_INT8}
|====
For full profile devices that support reading from and writing to the same
image object from the same kernel instance (see {CL_DEVICE_MAX_READ_WRITE_IMAGE_ARGS}),
the minimum list of supported image formats for reading and writing in
the same kernel instance is:
[[min-supported-image-formats-read-write]]
.Minimum list of required image formats for reading and writing
[width="100%",cols="<34%,<33%,<33%",options="header"]
|====
| num_channels | channel_order | channel_data_type
| 1
| {CL_R}
| {CL_UNORM_INT8} +
{CL_SIGNED_INT8} +
{CL_SIGNED_INT16} +
{CL_SIGNED_INT32} +
{CL_UNSIGNED_INT8} +
{CL_UNSIGNED_INT16} +
{CL_UNSIGNED_INT32} +
{CL_HALF_FLOAT} +
{CL_FLOAT}
| 4
| {CL_RGBA}
| {CL_UNORM_INT8} +
{CL_SIGNED_INT8} +
{CL_SIGNED_INT16} +
{CL_SIGNED_INT32} +
{CL_UNSIGNED_INT8} +
{CL_UNSIGNED_INT16} +
{CL_UNSIGNED_INT32} +
{CL_HALF_FLOAT} +
{CL_FLOAT}
|====
--
[[image-format-mapping]]
==== Image format mapping to OpenCL kernel language image access qualifiers
Image arguments to kernels may have the `read_only`, `write_only` or
`read_write` qualifier.
Not all image formats supported by the device and platform are valid to be
passed to all of these access qualifiers.
For each access qualifier, only images whose format is in the list of
formats returned by {clGetSupportedImageFormats} with the given flag
arguments in the <<image-format-mapping-table,Image Format Mapping>> table
are permitted.
It is not valid to pass an image supporting writing as both a `read_only`
image and a `write_only` image parameter, or to a `read_write` image
parameter and any other image parameter.
[[image-format-mapping-table]]
.Mapping from format flags passed to <<clGetSupportedImageFormats>> to OpenCL kernel language image access qualifiers
[width="100%",cols="<50%,<50%",options="header"]
|====
| Access Qualifier | Memory Flags
| `read_only`
| {CL_MEM_READ_ONLY}, +
{CL_MEM_READ_WRITE}, +
{CL_MEM_KERNEL_READ_AND_WRITE}
| `write_only`
| {CL_MEM_WRITE_ONLY}, +
{CL_MEM_READ_WRITE}, +
{CL_MEM_KERNEL_READ_AND_WRITE}
| `read_write`
| {CL_MEM_KERNEL_READ_AND_WRITE}
|====
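The mapping table above can be read as a lookup from an access qualifier to the set of query flags whose format lists apply to it.
The sketch below assumes hypothetical stand-in names (`FLAG_*`, `access_qualifier`, `query_flags_for_qualifier`, `format_usable_with_qualifier`) with arbitrary bit values; real code would use the `cl_mem_flags` constants from the OpenCL headers.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the memory flag bits, for illustration only.
 * The bit values are arbitrary; they are not the OpenCL header values. */
#define FLAG_READ_WRITE            (UINT64_C(1) << 0)
#define FLAG_WRITE_ONLY            (UINT64_C(1) << 1)
#define FLAG_READ_ONLY             (UINT64_C(1) << 2)
#define FLAG_KERNEL_READ_AND_WRITE (UINT64_C(1) << 3)

/* Kernel language image access qualifiers. */
typedef enum { QUAL_READ_ONLY, QUAL_WRITE_ONLY, QUAL_READ_WRITE } access_qualifier;

/* Returns the set of query flags listed for the qualifier in the table. */
uint64_t query_flags_for_qualifier(access_qualifier q)
{
    switch (q) {
    case QUAL_READ_ONLY:
        return FLAG_READ_ONLY | FLAG_READ_WRITE | FLAG_KERNEL_READ_AND_WRITE;
    case QUAL_WRITE_ONLY:
        return FLAG_WRITE_ONLY | FLAG_READ_WRITE | FLAG_KERNEL_READ_AND_WRITE;
    case QUAL_READ_WRITE:
        return FLAG_KERNEL_READ_AND_WRITE;
    }
    return 0;
}

/* Returns 1 if a format reported for a query made with queried_flag may be
 * passed to an image argument declared with qualifier q. */
int format_usable_with_qualifier(access_qualifier q, uint64_t queried_flag)
{
    return (query_flags_for_qualifier(q) & queried_flag) != 0;
}
```

In particular, a format reported only for a {CL_MEM_READ_WRITE} query is not sufficient for a `read_write` image argument; that qualifier requires the format to be reported for {CL_MEM_KERNEL_READ_AND_WRITE}.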
=== Reading, Writing and Copying Image Objects
[open,refpage='clEnqueueReadImage',desc='Enqueue commands to read from an image or image array object to host memory.',type='protos',alias='clEnqueueWriteImage']
--
The following functions enqueue commands to read from an image or image
array object to host memory or write to an image or image array object from
host memory.
include::{generated}/api/protos/clEnqueueReadImage.txt[]
include::{generated}/api/version-notes/clEnqueueReadImage.asciidoc[]
include::{generated}/api/protos/clEnqueueWriteImage.txt[]
include::{generated}/api/version-notes/clEnqueueWriteImage.asciidoc[]
* _command_queue_ refers to the host command-queue in which the read / write
command will be queued.
_command_queue_ and _image_ must be created with the same OpenCL context.
* _image_ refers to a valid image or image array object.
* _blocking_read_ and _blocking_write_ indicate if the read and write
operations are _blocking_ or _non-blocking_.
* _origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or 3D
image, the (_x_, _y_) offset and the image index in the 2D image array or
the (_x_) offset and the image index in the 1D image array.
If _image_ is a 2D image object, _origin_[2] must be 0.
If _image_ is a 1D image or 1D image buffer object, _origin_[1] and
_origin_[2] must be 0.
If _image_ is a 1D image array object, _origin_[2] must be 0.
If _image_ is a 1D image array object, _origin_[1] describes the image index
in the 1D image array.
If _image_ is a 2D image array object, _origin_[2] describes the image index
in the 2D image array.
* _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or
3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the
number of images of a 2D image array or the (_width_) in pixels of the 1D
rectangle and the number of images of a 1D image array.
If _image_ is a 2D image object, _region_[2] must be 1.
If _image_ is a 1D image or 1D image buffer object, _region_[1] and
_region_[2] must be 1.
If _image_ is a 1D image array object, _region_[2] must be 1.
The values in _region_ cannot be 0.
* _row_pitch_ in {clEnqueueReadImage} and _input_row_pitch_ in
{clEnqueueWriteImage} is the length of each row in bytes.
This value must be greater than or equal to the element size in bytes
{times} _width_.
If _row_pitch_ (or _input_row_pitch_) is set to 0, the appropriate row pitch
is calculated based on the size of each element in bytes multiplied by
_width_.
* _slice_pitch_ in {clEnqueueReadImage} and _input_slice_pitch_ in
{clEnqueueWriteImage} is the size in bytes of the 2D slice of the 3D region
of a 3D image or each image of a 1D or 2D image array being read or written
respectively.
This must be 0 if _image_ is a 1D or 2D image.
Otherwise this value must be greater than or equal to _row_pitch_ {times}
_height_.
If _slice_pitch_ (or _input_slice_pitch_) is set to 0, the appropriate slice
pitch is calculated based on the _row_pitch_ {times} _height_.
* _ptr_ is the pointer to a buffer in host memory where image data is to be
read from or to be written to.
The alignment requirements for _ptr_ are specified in
<<alignment-app-data-types>>.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this read / write command
and can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
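The per-image-type rules for _origin_ and _region_ described above can be checked on the host before enqueuing a read or write.
This is a minimal sketch, assuming a hypothetical `image_kind` enumeration in place of the real image type constants and a hypothetical function name; it does not check that the region lies within the image bounds.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the OpenCL image types, for illustration only. */
typedef enum {
    IMG_1D, IMG_1D_BUFFER, IMG_1D_ARRAY,
    IMG_2D, IMG_2D_ARRAY, IMG_3D
} image_kind;

/* Returns 1 if origin and region follow the rules for the given image
 * type, and 0 otherwise. Bounds against the image size are not checked. */
int origin_region_valid(image_kind kind,
                        const size_t origin[3], const size_t region[3])
{
    if (origin == NULL || region == NULL)
        return 0;
    /* The values in region cannot be 0. */
    if (region[0] == 0 || region[1] == 0 || region[2] == 0)
        return 0;

    switch (kind) {
    case IMG_1D:
    case IMG_1D_BUFFER:
        return origin[1] == 0 && origin[2] == 0 &&
               region[1] == 1 && region[2] == 1;
    case IMG_1D_ARRAY:  /* origin[1] is the image index */
    case IMG_2D:
        return origin[2] == 0 && region[2] == 1;
    case IMG_2D_ARRAY:  /* origin[2] is the image index */
    case IMG_3D:
        return 1;
    }
    return 0;
}
```

A failed check corresponds to the cases where the enqueue functions are specified to return {CL_INVALID_VALUE} for _origin_ and _region_.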
If _blocking_read_ is {CL_TRUE} i.e. the read command is blocking,
{clEnqueueReadImage} does not return until the image data has been read and
copied into memory pointed to by _ptr_.
If _blocking_read_ is {CL_FALSE} i.e. the read command is non-blocking,
{clEnqueueReadImage} queues a non-blocking read command and returns.
The contents of the buffer that _ptr_ points to cannot be used until the
read command has completed.
The _event_ argument returns an event object which can be used to query the
execution status of the read command.
When the read command has completed, the contents of the buffer that _ptr_
points to can be used by the application.
If _blocking_write_ is {CL_TRUE}, the write command is blocking and does not
return until the command is complete, including transfer of the data.
The memory pointed to by _ptr_ can be reused by the application after the
{clEnqueueWriteImage} call returns.
If _blocking_write_ is {CL_FALSE}, the OpenCL implementation will use _ptr_ to
perform a non-blocking write.
As the write is non-blocking the implementation can return immediately.
The memory pointed to by _ptr_ cannot be reused by the application after the
call returns.
The _event_ argument returns an event object which can be used to query the
execution status of the write command.
When the write command has completed, the memory pointed to by _ptr_ can
then be reused by the application.
// refError
{clEnqueueReadImage} and {clEnqueueWriteImage} return {CL_SUCCESS} if the
function is executed successfully.
Otherwise, they return one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
_image_ are not the same or if the context associated with
_command_queue_ and events in _event_wait_list_ are not the same.
* {CL_INVALID_MEM_OBJECT} if _image_ is not a valid image object.
* {CL_INVALID_VALUE} if _origin_ or _region_ is `NULL`.
* {CL_INVALID_VALUE} if the region being read or written specified by
_origin_ and _region_ is out of bounds.
* {CL_INVALID_VALUE} if values in _origin_ and _region_ do not follow rules
described in the argument description for _origin_ and _region_.
* {CL_INVALID_VALUE} if _ptr_ is `NULL`.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height,
specified or computed row and/or slice pitch) for _image_ are not
supported by the device associated with _command_queue_.
* {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order and
data type) for _image_ is not supported by the device associated with
_command_queue_.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for data store associated with _image_.
* {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
not support images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
<<device-queries-table,Device Queries>> table is {CL_FALSE}).
* {CL_INVALID_OPERATION} if {clEnqueueReadImage} is called on _image_ which
has been created with {CL_MEM_HOST_WRITE_ONLY} or {CL_MEM_HOST_NO_ACCESS}.
* {CL_INVALID_OPERATION} if {clEnqueueWriteImage} is called on _image_ which
has been created with {CL_MEM_HOST_READ_ONLY} or {CL_MEM_HOST_NO_ACCESS}.
* {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the read and write
operations are blocking and the execution status of any of the events in
_event_wait_list_ is a negative integer value.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
[NOTE]
====
Calling {clEnqueueReadImage} to read a region of the _image_ with the _ptr_
argument value set to _host_ptr_ + (_origin_[2] {times} _image slice pitch_
{plus} _origin_[1] {times} _image row pitch_ + _origin_[0] {times} _bytes
per pixel_), where _host_ptr_ is a pointer to the memory region specified
when the _image_ being read is created with {CL_MEM_USE_HOST_PTR}, must meet
the following requirements in order to avoid undefined behavior:
* All commands that use this image object have finished execution before
the read command begins execution.
* The _row_pitch_ and _slice_pitch_ argument values in
{clEnqueueReadImage} must be set to the image row pitch and slice pitch.
* The image object is not mapped.
* The image object is not used by any command-queue until the read command
has finished execution.
Calling {clEnqueueWriteImage} to update the latest bits in a region of the
_image_ with the _ptr_ argument value set to _host_ptr_ + (_origin_[2]
{times} _image slice pitch_ + _origin_[1] {times} _image row pitch_ {plus}
_origin_[0] {times} _bytes per pixel_), where _host_ptr_ is a pointer to the
memory region specified when the _image_ being written is created with
{CL_MEM_USE_HOST_PTR}, must meet the following requirements in order to avoid
undefined behavior:
* The host memory region being written contains the latest bits when the
enqueued write command begins execution.
* The _input_row_pitch_ and _input_slice_pitch_ argument values in
{clEnqueueWriteImage} must be set to the image row pitch and slice
pitch.
* The image object is not mapped.
* The image object is not used by any command-queue until the write
command has finished execution.
====
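The _ptr_ value used in the note above is a byte offset from _host_ptr_ computed from _origin_ and the image pitches.
A minimal sketch of that arithmetic, assuming the hypothetical helper names `host_ptr_byte_offset` and `host_ptr_for_origin`, which are not part of the OpenCL API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only; not part of the OpenCL API. */

/* Byte offset of the element at origin from the start of the host memory
 * region: origin[2] * slice pitch + origin[1] * row pitch
 *         + origin[0] * bytes per pixel. */
size_t host_ptr_byte_offset(const size_t origin[3],
                            size_t image_row_pitch,
                            size_t image_slice_pitch,
                            size_t bytes_per_pixel)
{
    assert(origin != NULL);
    return origin[2] * image_slice_pitch
         + origin[1] * image_row_pitch
         + origin[0] * bytes_per_pixel;
}

/* Pointer to the element at origin within the host memory region. */
void *host_ptr_for_origin(void *host_ptr, const size_t origin[3],
                          size_t image_row_pitch,
                          size_t image_slice_pitch,
                          size_t bytes_per_pixel)
{
    assert(host_ptr != NULL);
    return (unsigned char *)host_ptr +
           host_ptr_byte_offset(origin, image_row_pitch,
                                image_slice_pitch, bytes_per_pixel);
}
```

For a 2D image with a 256-byte row pitch and 4 bytes per pixel, the element at _origin_ (2, 3, 0) starts 3 {times} 256 + 2 {times} 4 = 776 bytes into the host memory region.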
--
[open,refpage='clEnqueueCopyImage',desc='Enqueues a command to copy image objects.',type='protos']
--
To enqueue a command to copy image objects, call the function
include::{generated}/api/protos/clEnqueueCopyImage.txt[]
include::{generated}/api/version-notes/clEnqueueCopyImage.asciidoc[]
* _src_image_ and _dst_image_ can be 1D, 2D, or 3D image objects, or 1D or 2D
image array objects.
It is possible to copy subregions between any combinations of source and
destination types, provided that the dimensions of the subregions are the
same e.g., one can copy a rectangular region from a 2D image to a slice of a
3D image.
* _command_queue_ refers to the host command-queue in which the copy command
will be queued.
The OpenCL context associated with _command_queue_, _src_image_ and
_dst_image_ must be the same.
* _src_origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or
3D image, the (_x_, _y_) offset and the image index in the 2D image array or
the (_x_) offset and the image index in the 1D image array.
If _src_image_ is a 2D image object, _src_origin_[2] must be 0.
If _src_image_ is a 1D image object, _src_origin_[1] and _src_origin_[2]
must be 0.
If _src_image_ is a 1D image array object, _src_origin_[2] must be 0.
If _src_image_ is a 1D image array object, _src_origin_[1] describes the
image index in the 1D image array.
If _src_image_ is a 2D image array object, _src_origin_[2] describes the
image index in the 2D image array.
* _dst_origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or
3D image, the (_x_, _y_) offset and the image index in the 2D image array or
the (_x_) offset and the image index in the 1D image array.
If _dst_image_ is a 2D image object, _dst_origin_[2] must be 0.
If _dst_image_ is a 1D image or 1D image buffer object, _dst_origin_[1] and
_dst_origin_[2] must be 0.
If _dst_image_ is a 1D image array object, _dst_origin_[2] must be 0.
If _dst_image_ is a 1D image array object, _dst_origin_[1] describes the
image index in the 1D image array.
If _dst_image_ is a 2D image array object, _dst_origin_[2] describes the
image index in the 2D image array.
* _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or
3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the
number of images of a 2D image array or the (_width_) in pixels of the 1D
rectangle and the number of images of a 1D image array.
If _src_image_ or _dst_image_ is a 2D image object, _region_[2] must be 1.
If _src_image_ or _dst_image_ is a 1D image or 1D image buffer object,
_region_[1] and _region_[2] must be 1.
If _src_image_ or _dst_image_ is a 1D image array object, _region_[2] must
be 1.
The values in _region_ cannot be 0.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this copy command
and can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
It is currently a requirement that the _src_image_ and _dst_image_ image
memory objects for {clEnqueueCopyImage} must have the exact same image
format (i.e. the {cl_image_format_TYPE} descriptor specified when _src_image_ and
_dst_image_ are created must match).
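The per-type rules for _src_origin_, _dst_origin_ and _region_ can be summarized in a small host-side check.
The following is only a sketch in plain C: the `image_kind` enum and the `origin_region_ok` helper are hypothetical names introduced for illustration and are not part of the OpenCL API.
For a copy, the check would be applied once to _src_origin_ with the source image type and once to _dst_origin_ with the destination image type, using the same _region_.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the image object types named in the text. */
typedef enum {
    IMG_1D, IMG_1D_BUFFER, IMG_1D_ARRAY, IMG_2D, IMG_2D_ARRAY, IMG_3D
} image_kind;

/* Returns true if origin and region follow the rules in the argument
 * descriptions. A false result corresponds to the CL_INVALID_VALUE cases. */
bool origin_region_ok(image_kind kind,
                      const size_t origin[3], const size_t region[3])
{
    if (origin == NULL || region == NULL)
        return false;
    /* The values in region cannot be 0. */
    if (region[0] == 0 || region[1] == 0 || region[2] == 0)
        return false;
    switch (kind) {
    case IMG_1D:
    case IMG_1D_BUFFER:
        return origin[1] == 0 && origin[2] == 0 &&
               region[1] == 1 && region[2] == 1;
    case IMG_1D_ARRAY:   /* origin[1] is the image index */
    case IMG_2D:
        return origin[2] == 0 && region[2] == 1;
    case IMG_2D_ARRAY:   /* origin[2] is the image index */
    case IMG_3D:
        return true;
    }
    return false;
}
```

This sketch does not check that the region lies inside the image; that bounds test depends on the image dimensions and is a separate condition.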
// refError
{clEnqueueCopyImage} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if the context associated with _command_queue_,
_src_image_ and _dst_image_ are not the same or if the context
associated with _command_queue_ and events in _event_wait_list_ are not
the same.
* {CL_INVALID_MEM_OBJECT} if _src_image_ or _dst_image_ is not a valid image
object.
* {CL_IMAGE_FORMAT_MISMATCH} if _src_image_ and _dst_image_ do not use the
same image format.
* {CL_INVALID_VALUE} if _src_origin_, _dst_origin_, or _region_ is `NULL`.
* {CL_INVALID_VALUE} if the 2D or 3D rectangular region specified by
_src_origin_ and _src_origin_ {plus} _region_ refers to a region outside
_src_image_, or if the 2D or 3D rectangular region specified by
_dst_origin_ and _dst_origin_ {plus} _region_ refers to a region outside
_dst_image_.
* {CL_INVALID_VALUE} if values in _src_origin_, _dst_origin_ and _region_ do
not follow rules described in the argument description for _src_origin_,
_dst_origin_ and _region_.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height,
specified or computed row and/or slice pitch) for _src_image_ or
_dst_image_ are not supported by the device associated with _command_queue_.
* {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order and
data type) for _src_image_ or _dst_image_ is not supported by the device
associated with _command_queue_.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for data store associated with _src_image_ or _dst_image_.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
* {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
not support images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
<<device-queries-table,Device Queries>> table is {CL_FALSE}).
* {CL_MEM_COPY_OVERLAP} if _src_image_ and _dst_image_ are the same image
object and the source and destination regions overlap.
--
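The {CL_MEM_COPY_OVERLAP} condition for {clEnqueueCopyImage} can be checked on the host before enqueuing a copy within a single image.
The sketch below treats the source and destination regions as axis-aligned boxes and tests whether they intersect; `regions_overlap` is a hypothetical helper name, and an implementation is free to detect overlap in a different way.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the box starting at src_origin and the box starting at
 * dst_origin, both of size region, share at least one pixel.
 * Only meaningful when source and destination are the same image object.
 * Integer overflow of origin + region is not handled in this sketch. */
bool regions_overlap(const size_t src_origin[3],
                     const size_t dst_origin[3],
                     const size_t region[3])
{
    for (int i = 0; i < 3; ++i) {
        /* Half-open intervals [o, o + r) are disjoint along axis i when
         * one ends at or before the point where the other begins. */
        if (src_origin[i] + region[i] <= dst_origin[i] ||
            dst_origin[i] + region[i] <= src_origin[i])
            return false;
    }
    return true;
}
```

Two boxes overlap only if their extents intersect along every axis, so a single disjoint axis is enough to rule out an overlap.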
=== Filling Image Objects
NOTE: Filling image objects is <<unified-spec, missing before>> version 1.2.
[open,refpage='clEnqueueFillImage',desc='Enqueues a command to fill an image object with a specified color.',type='protos']
--
To enqueue a command to fill an image object with a specified color, call
the function
include::{generated}/api/protos/clEnqueueFillImage.txt[]
include::{generated}/api/version-notes/clEnqueueFillImage.asciidoc[]
* _command_queue_ refers to the host command-queue in which the fill command
will be queued.
The OpenCL context associated with _command_queue_ and _image_ must be the
same.
* _image_ is a valid image object.
* _fill_color_ is the color used to fill the image.
The fill color is a single floating point value if the channel order is
{CL_DEPTH}.
Otherwise, the fill color is a four component RGBA floating-point color
value if the _image_ channel data type is not an unnormalized signed or
unsigned integer type, is a four component signed integer value if the
_image_ channel data type is an unnormalized signed integer type and is a
four component unsigned integer value if the _image_ channel data type is an
unnormalized unsigned integer type.
The fill color will be converted to the appropriate image channel format and
order associated with _image_.
* _origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or 3D
image, the (_x_, _y_) offset and the image index in the 2D image array or
the (_x_) offset and the image index in the 1D image array.
If _image_ is a 2D image object, _origin_[2] must be 0.
If _image_ is a 1D image or 1D image buffer object, _origin_[1] and
_origin_[2] must be 0.
If _image_ is a 1D image array object, _origin_[2] must be 0.
If _image_ is a 1D image array object, _origin_[1] describes the image index
in the 1D image array.
If _image_ is a 2D image array object, _origin_[2] describes the image index
in the 2D image array.
* _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or
3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the
number of images of a 2D image array or the (_width_) in pixels of the 1D
rectangle and the number of images of a 1D image array.
If _image_ is a 2D image object, _region_[2] must be 1.
If _image_ is a 1D image or 1D image buffer object, _region_[1] and
_region_[2] must be 1.
If _image_ is a 1D image array object, _region_[2] must be 1.
The values in _region_ cannot be 0.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this command and
can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
The usage information which indicates whether the memory object can be read
or written by a kernel and/or the host and is given by the {cl_mem_flags_TYPE}
argument value specified when _image_ is created is ignored by
{clEnqueueFillImage}.
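The rule that selects how _fill_color_ is interpreted can be written as a small decision function.
This is an illustrative sketch only: the enums below are hypothetical stand-ins covering a small subset of channel orders and channel data types, not the constants defined by the OpenCL headers.

```c
/* Hypothetical subset of channel data types. */
typedef enum {
    CH_SNORM_INT8,      /* normalized */
    CH_UNORM_INT8,      /* normalized */
    CH_SIGNED_INT32,    /* unnormalized signed integer */
    CH_UNSIGNED_INT32,  /* unnormalized unsigned integer */
    CH_FLOAT
} channel_type;

/* Hypothetical subset of channel orders. */
typedef enum { ORDER_RGBA, ORDER_DEPTH } channel_order;

/* How the memory pointed to by fill_color is interpreted. */
typedef enum {
    FILL_ONE_FLOAT,   /* single floating-point value */
    FILL_FOUR_FLOAT,  /* four component RGBA floating-point color */
    FILL_FOUR_INT,    /* four component signed integer value */
    FILL_FOUR_UINT    /* four component unsigned integer value */
} fill_layout;

fill_layout fill_color_layout(channel_order order, channel_type type)
{
    /* A depth image takes a single floating-point value. */
    if (order == ORDER_DEPTH)
        return FILL_ONE_FLOAT;
    switch (type) {
    case CH_SIGNED_INT32:   return FILL_FOUR_INT;
    case CH_UNSIGNED_INT32: return FILL_FOUR_UINT;
    default:                return FILL_FOUR_FLOAT;
    }
}
```

An application would use the result to decide which host type to pass, for example four `float` values for a normalized 8-bit RGBA image.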
// refError
{clEnqueueFillImage} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
_image_ are not the same or if the context associated with
_command_queue_ and events in _event_wait_list_ are not the same.
* {CL_INVALID_MEM_OBJECT} if _image_ is not a valid image object.
* {CL_INVALID_VALUE} if _fill_color_ is `NULL`.
* {CL_INVALID_VALUE} if _origin_ or _region_ is `NULL`.
* {CL_INVALID_VALUE} if the region being filled as specified by _origin_ and
_region_ is out of bounds.
* {CL_INVALID_VALUE} if values in _origin_ and _region_ do not follow rules
described in the argument description for _origin_ and _region_.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height,
specified or computed row and/or slice pitch) for _image_ are not
supported by the device associated with _command_queue_.
* {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order and
data type) for _image_ is not supported by the device associated with
_command_queue_.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for data store associated with _image_.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
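The pairing rules for _event_wait_list_ and _num_events_in_wait_list_, which are shared by the enqueue functions in this section, can be sketched as a host-side predicate.
In the sketch below, `event_handle` is a hypothetical opaque stand-in for an event object, and a `NULL` entry is used to model an invalid event; a real implementation validates event objects by other means.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opaque stand-in for an event object. */
typedef const void *event_handle;

/* Returns true if the pair (num_events, list) is acceptable.
 * A false result corresponds to CL_INVALID_EVENT_WAIT_LIST. */
bool wait_list_ok(unsigned int num_events, const event_handle *list)
{
    /* A NULL list is valid only together with a count of 0. */
    if (list == NULL)
        return num_events == 0;
    /* A non-NULL list requires a count greater than 0. */
    if (num_events == 0)
        return false;
    /* Every entry must be a valid event (modelled here as non-NULL). */
    for (unsigned int i = 0; i < num_events; ++i) {
        if (list[i] == NULL)
            return false;
    }
    return true;
}
```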
=== Copying between Image and Buffer Objects
[open,refpage='clEnqueueCopyImageToBuffer',desc='Enqueues a command to copy an image object to a buffer object.',type='protos']
--
To enqueue a command to copy an image object to a buffer object, call the
function
include::{generated}/api/protos/clEnqueueCopyImageToBuffer.txt[]
include::{generated}/api/version-notes/clEnqueueCopyImageToBuffer.asciidoc[]
* _command_queue_ must be a valid host command-queue.
The OpenCL context associated with _command_queue_, _src_image_ and
_dst_buffer_ must be the same.
* _src_image_ is a valid image object.
* _dst_buffer_ is a valid buffer object.
* _src_origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or
3D image, the (_x_, _y_) offset and the image index in the 2D image array or
the (_x_) offset and the image index in the 1D image array.
If _src_image_ is a 2D image object, _src_origin_[2] must be 0.
If _src_image_ is a 1D image or 1D image buffer object, _src_origin_[1] and
_src_origin_[2] must be 0.
If _src_image_ is a 1D image array object, _src_origin_[2] must be 0.
If _src_image_ is a 1D image array object, _src_origin_[1] describes the
image index in the 1D image array.
If _src_image_ is a 2D image array object, _src_origin_[2] describes the
image index in the 2D image array.
* _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or
3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the
number of images of a 2D image array or the (_width_) in pixels of the 1D
rectangle and the number of images of a 1D image array.
If _src_image_ is a 2D image object, _region_[2] must be 1.
If _src_image_ is a 1D image or 1D image buffer object, _region_[1] and
_region_[2] must be 1.
If _src_image_ is a 1D image array object, _region_[2] must be 1.
The values in _region_ cannot be 0.
* _dst_offset_ refers to the offset where to begin copying data into
_dst_buffer_.
The size in bytes of the region to be copied referred to as _dst_cb_ is
computed as _width_ {times} _height_ {times} _depth_ {times} _bytes/image
element_ if _src_image_ is a 3D image object, is computed as _width_ {times}
_height_ {times} _bytes/image element_ if _src_image_ is a 2D image, is
computed as _width_ {times} _height_ {times} _arraysize_ {times}
_bytes/image element_ if _src_image_ is a 2D image array object, is computed
as _width_ {times} _bytes/image element_ if _src_image_ is a 1D image or 1D
image buffer object and is computed as _width_ {times} _arraysize_ {times}
_bytes/image element_ if _src_image_ is a 1D image array object.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this copy command
and can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
// refError
{clEnqueueCopyImageToBuffer} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if the context associated with _command_queue_,
_src_image_ and _dst_buffer_ are not the same or if the context
associated with _command_queue_ and events in _event_wait_list_ are not
the same.
* {CL_INVALID_MEM_OBJECT} if _src_image_ is not a valid image object or
_dst_buffer_ is not a valid buffer object or if _src_image_ is a 1D
image buffer object created from _dst_buffer_.
* {CL_INVALID_VALUE} if _src_origin_ or _region_ is `NULL`.
* {CL_INVALID_VALUE} if the 1D, 2D or 3D rectangular region specified by
_src_origin_ and _src_origin_ {plus} _region_ refers to a region outside
_src_image_, or if the region specified by _dst_offset_ and _dst_offset_
{plus} _dst_cb_ refers to a region outside _dst_buffer_.
* {CL_INVALID_VALUE} if values in _src_origin_ and _region_ do not follow
rules described in the argument description for _src_origin_ and
_region_.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _dst_buffer_ is a sub-buffer object
and _offset_ specified when the sub-buffer object is created is not
aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device associated
with _command_queue_.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height,
specified or computed row and/or slice pitch) for _src_image_ are not
supported by the device associated with _command_queue_.
* {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order and
data type) for _src_image_ is not supported by the device associated with
_command_queue_.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for data store associated with _src_image_ or _dst_buffer_.
* {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
not support images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
<<device-queries-table,Device Queries>> table is {CL_FALSE}).
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
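The size _dst_cb_ of the region written by {clEnqueueCopyImageToBuffer}, and the check that it fits in _dst_buffer_, can be sketched as follows.
The `image_kind` enum and both helper names are hypothetical, `element_size` stands for _bytes/image element_, and the array size is taken from the _region_ component that holds the number of images.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the image object types named in the text. */
typedef enum {
    IMG_1D, IMG_1D_BUFFER, IMG_1D_ARRAY, IMG_2D, IMG_2D_ARRAY, IMG_3D
} image_kind;

/* Size in bytes of the copied region.
 * Multiplication overflow is not checked in this sketch. */
size_t copy_size_bytes(image_kind kind, const size_t region[3],
                       size_t element_size)
{
    switch (kind) {
    case IMG_3D:         /* width * height * depth */
    case IMG_2D_ARRAY:   /* width * height * arraysize */
        return region[0] * region[1] * region[2] * element_size;
    case IMG_2D:         /* width * height */
    case IMG_1D_ARRAY:   /* width * arraysize */
        return region[0] * region[1] * element_size;
    case IMG_1D:
    case IMG_1D_BUFFER:  /* width */
        return region[0] * element_size;
    }
    return 0;
}

/* Returns true if the byte range [offset, offset + cb) lies inside a
 * buffer of buffer_size bytes. Written to avoid overflow in offset + cb. */
bool buffer_range_ok(size_t offset, size_t cb, size_t buffer_size)
{
    return offset <= buffer_size && cb <= buffer_size - offset;
}
```

For example, a 64 by 32 region of a 2D image with 4 bytes per element needs 8192 bytes in the destination buffer starting at _dst_offset_.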
[open,refpage='clEnqueueCopyBufferToImage',desc='Enqueues a command to copy a buffer object to an image object.',type='protos']
--
To enqueue a command to copy a buffer object to an image object, call the
function
include::{generated}/api/protos/clEnqueueCopyBufferToImage.txt[]
include::{generated}/api/version-notes/clEnqueueCopyBufferToImage.asciidoc[]
* _command_queue_ must be a valid host command-queue.
The OpenCL context associated with _command_queue_, _src_buffer_ and
_dst_image_ must be the same.
* _src_buffer_ is a valid buffer object.
* _dst_image_ is a valid image object.
* _src_offset_ refers to the offset where to begin copying data from
_src_buffer_.
* _dst_origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or
3D image, the (_x_, _y_) offset and the image index in the 2D image array or
the (_x_) offset and the image index in the 1D image array.
If _dst_image_ is a 2D image object, _dst_origin_[2] must be 0.
If _dst_image_ is a 1D image or 1D image buffer object, _dst_origin_[1] and
_dst_origin_[2] must be 0.
If _dst_image_ is a 1D image array object, _dst_origin_[2] must be 0.
If _dst_image_ is a 1D image array object, _dst_origin_[1] describes the
image index in the 1D image array.
If _dst_image_ is a 2D image array object, _dst_origin_[2] describes the
image index in the 2D image array.
* _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or
3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the
number of images of a 2D image array or the (_width_) in pixels of the 1D
rectangle and the number of images of a 1D image array.
If _dst_image_ is a 2D image object, _region_[2] must be 1.
If _dst_image_ is a 1D image or 1D image buffer object, _region_[1] and
_region_[2] must be 1.
If _dst_image_ is a 1D image array object, _region_[2] must be 1.
The values in _region_ cannot be 0.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this copy command
and can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
The size in bytes of the region to be copied from _src_buffer_ referred to
as _src_cb_ is computed as _width_ {times} _height_ {times} _depth_ {times}
_bytes/image element_ if _dst_image_ is a 3D image object, is computed as
_width_ {times} _height_ {times} _bytes/image element_ if _dst_image_ is a
2D image, is computed as _width_ {times} _height_ {times} _arraysize_
{times} _bytes/image element_ if _dst_image_ is a 2D image array object, is
computed as _width_ {times} _bytes/image element_ if _dst_image_ is a 1D
image or 1D image buffer object and is computed as _width_ {times}
_arraysize_ {times} _bytes/image element_ if _dst_image_ is a 1D image array
object.
// refError
{clEnqueueCopyBufferToImage} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if the context associated with _command_queue_,
_src_buffer_ and _dst_image_ are not the same or if the context
associated with _command_queue_ and events in _event_wait_list_ are not
the same.
* {CL_INVALID_MEM_OBJECT} if _src_buffer_ is not a valid buffer object or
_dst_image_ is not a valid image object or if _dst_image_ is a 1D image
buffer object created from _src_buffer_.
* {CL_INVALID_VALUE} if _dst_origin_ or _region_ is `NULL`.
* {CL_INVALID_VALUE} if the 1D, 2D or 3D rectangular region specified by
_dst_origin_ and _dst_origin_ {plus} _region_ refers to a region outside
_dst_image_, or if the region specified by _src_offset_ and _src_offset_
{plus} _src_cb_ refers to a region outside _src_buffer_.
* {CL_INVALID_VALUE} if values in _dst_origin_ and _region_ do not follow
rules described in the argument description for _dst_origin_ and
_region_.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _src_buffer_ is a sub-buffer object
and _offset_ specified when the sub-buffer object is created is not
aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device associated
with _command_queue_.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height,
specified or computed row and/or slice pitch) for _dst_image_ are not
supported by the device associated with _command_queue_.
* {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order and
data type) for _dst_image_ is not supported by the device associated with
_command_queue_.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for data store associated with _src_buffer_ or _dst_image_.
* {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
not support images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
<<device-queries-table,Device Queries>> table is {CL_FALSE}).
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
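The {CL_MISALIGNED_SUB_BUFFER_OFFSET} condition can be anticipated on the host when choosing a sub-buffer offset.
The sketch below assumes that the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value is reported in bits, as described in the <<device-queries-table,Device Queries>> table, and that the sub-buffer offset is given in bytes; `sub_buffer_offset_aligned` is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if a sub-buffer offset in bytes satisfies an alignment
 * requirement expressed in bits (assumption: the device query reports bits). */
bool sub_buffer_offset_aligned(size_t offset_bytes,
                               unsigned int base_addr_align_bits)
{
    size_t align_bytes = base_addr_align_bits / 8;
    /* An alignment below one byte places no constraint on a byte offset. */
    if (align_bytes == 0)
        return true;
    return offset_bytes % align_bytes == 0;
}
```

With a reported alignment of 1024 bits, sub-buffer offsets must be multiples of 128 bytes under this assumption.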
=== Mapping Image Objects
[open,refpage='clEnqueueMapImage',desc='Enqueues a command to map a region of an image object into the host address space and returns a pointer to this mapped region.',type='protos']
--
To enqueue a command to map a region in the image object given by _image_
into the host address space and return a pointer to this mapped region,
call the function
include::{generated}/api/protos/clEnqueueMapImage.txt[]
include::{generated}/api/version-notes/clEnqueueMapImage.asciidoc[]
* _command_queue_ must be a valid host command-queue.
* _image_ is a valid image object.
The OpenCL context associated with _command_queue_ and _image_ must be the
same.
* _blocking_map_ indicates if the map operation is _blocking_ or
_non-blocking_.
* _map_flags_ is a bit-field and is described in the
<<memory-map-flags-table,Memory Map Flags>> table.
* _origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or 3D
image, the (_x_, _y_) offset and the image index in the 2D image array or
the (_x_) offset and the image index in the 1D image array.
If _image_ is a 2D image object, _origin_[2] must be 0.
If _image_ is a 1D image or 1D image buffer object, _origin_[1] and
_origin_[2] must be 0.
If _image_ is a 1D image array object, _origin_[2] must be 0.
If _image_ is a 1D image array object, _origin_[1] describes the image index
in the 1D image array.
If _image_ is a 2D image array object, _origin_[2] describes the image index
in the 2D image array.
* _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or
3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the
number of images of a 2D image array or the (_width_) in pixels of the 1D
rectangle and the number of images of a 1D image array.
If _image_ is a 2D image object, _region_[2] must be 1.
If _image_ is a 1D image or 1D image buffer object, _region_[1] and
_region_[2] must be 1.
If _image_ is a 1D image array object, _region_[2] must be 1.
The values in _region_ cannot be 0.
* _image_row_pitch_ returns the scan-line pitch in bytes for the mapped
region.
This must be a non-`NULL` value.
* _image_slice_pitch_ returns the size in bytes of each 2D slice of a 3D image
or the size of each 1D or 2D image in a 1D or 2D image array for the mapped
region.
For a 1D and 2D image, zero is returned if this argument is not `NULL`.
For a 3D image, 1D and 2D image array, _image_slice_pitch_ must be a
non-`NULL` value.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before {clEnqueueMapImage} can be executed.
If _event_wait_list_ is `NULL`, then {clEnqueueMapImage} does not wait on
any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this command and
can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
* _errcode_ret_ will return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
If _blocking_map_ is {CL_TRUE}, {clEnqueueMapImage} does not return until the
specified region in _image_ is mapped into the host address space and the
application can access the contents of the mapped region using the pointer
returned by {clEnqueueMapImage}.
If _blocking_map_ is {CL_FALSE} i.e. map operation is non-blocking, the
pointer to the mapped region returned by {clEnqueueMapImage} cannot be used
until the map command has completed.
The _event_ argument returns an event object which can be used to query the
execution status of the map command.
When the map command is completed, the application can access the contents
of the mapped region using the pointer returned by {clEnqueueMapImage}.
// refError
{clEnqueueMapImage} returns a pointer to the mapped region and sets
_errcode_ret_ to {CL_SUCCESS} if the function is executed successfully.
Otherwise, it returns a `NULL` pointer and sets _errcode_ret_ to one of the
following error values:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
_image_ are not the same or if the context associated with _command_queue_
and events in _event_wait_list_ are not the same.
* {CL_INVALID_MEM_OBJECT} if _image_ is not a valid image object.
* {CL_INVALID_VALUE} if _origin_ or _region_ is `NULL`.
* {CL_INVALID_VALUE} if the region being mapped given by (_origin_,
_origin_ {plus} _region_) is out of bounds or if values specified in _map_flags_
are not valid.
* {CL_INVALID_VALUE} if values in _origin_ and _region_ do not follow rules
described in the argument description for _origin_ and _region_.
* {CL_INVALID_VALUE} if _image_row_pitch_ is `NULL`.
* {CL_INVALID_VALUE} if _image_ is a 3D image, 1D or 2D image array object
and _image_slice_pitch_ is `NULL`.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height,
specified or computed row and/or slice pitch) for _image_ are not
supported by the device associated with _command_queue_.
* {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order and
data type) for _image_ is not supported by the device associated with
_command_queue_.
* {CL_MAP_FAILURE} if there is a failure to map the requested region into
the host address space.
This error cannot occur for image objects created with
{CL_MEM_USE_HOST_PTR} or {CL_MEM_ALLOC_HOST_PTR}.
* {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the map operation is
blocking and the execution status of any of the events in
_event_wait_list_ is a negative integer value.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for data store associated with _image_.
* {CL_INVALID_OPERATION} if the device associated with _command_queue_ does
not support images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
<<device-queries-table,Device Queries>> table is {CL_FALSE}).
* {CL_INVALID_OPERATION} if _image_ has been created with
{CL_MEM_HOST_WRITE_ONLY} or {CL_MEM_HOST_NO_ACCESS} and {CL_MAP_READ} is set
in _map_flags_ or if _image_ has been created with {CL_MEM_HOST_READ_ONLY}
or {CL_MEM_HOST_NO_ACCESS} and {CL_MAP_WRITE} or
{CL_MAP_WRITE_INVALIDATE_REGION} is set in _map_flags_.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
* {CL_INVALID_OPERATION} if mapping would lead to overlapping regions being
mapped for writing.
The pointer returned maps a 1D, 2D or 3D region starting at _origin_ and is
at least _region_[0] pixels in size for a 1D image, 1D image buffer or 1D
image array, (_image_row_pitch_ {times} _region_[1]) pixels in size for a 2D
image or 2D image array, and (_image_slice_pitch_ {times} _region_[2]) pixels
in size for a 3D image.
The result of a memory access outside this region is undefined.
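Because the returned pointer refers to the pixel at _origin_, and _image_row_pitch_ and _image_slice_pitch_ are returned in bytes, the location of a pixel inside the mapped region can be computed from coordinates relative to _origin_.
The helper below is a minimal sketch; `mapped_pixel_offset` and `element_size` (the size in bytes of one image element) are names introduced here for illustration.

```c
#include <stddef.h>

/* Byte offset, from the pointer returned by the map command, of the pixel
 * at (x, y, z), where x, y and z are relative to origin.
 * For a 1D or 2D image pass z = 0 (the slice pitch is returned as 0).
 * For a 1D image array pass y = 0 and the image index, relative to
 * origin[1], as z, since the slice pitch is the size of each 1D image. */
size_t mapped_pixel_offset(size_t x, size_t y, size_t z,
                           size_t element_size,
                           size_t row_pitch, size_t slice_pitch)
{
    return z * slice_pitch + y * row_pitch + x * element_size;
}
```

The caller remains responsible for keeping x, y and z inside _region_; accesses outside the mapped region are undefined.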
If the image object is created with {CL_MEM_USE_HOST_PTR} set in _mem_flags_,
the following will be true:
* The _host_ptr_ specified in {clCreateImage}, {clCreateImageWithProperties},
{clCreateImage2D}, or {clCreateImage3D} is guaranteed to contain the
latest bits in the region being mapped when the {clEnqueueMapImage} command
has completed.
* The pointer value returned by {clEnqueueMapImage} will be derived from
the _host_ptr_ specified when the image object is created.
Mapped image objects are unmapped using {clEnqueueUnmapMemObject}.
This is described in <<unmapping-mapped-memory, Unmapping Mapped Memory
Objects>>.
--
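The {CL_INVALID_OPERATION} rule that relates _map_flags_ to the host access flags used when _image_ was created can be sketched as a predicate.
The bit values below are arbitrary placeholders chosen for this example and are not the values defined in the OpenCL headers; `map_allowed` is a hypothetical helper name.

```c
#include <stdbool.h>

/* Placeholder bits standing in for the host access flags. */
enum {
    HOST_WRITE_ONLY = 1 << 0,
    HOST_READ_ONLY  = 1 << 1,
    HOST_NO_ACCESS  = 1 << 2
};

/* Placeholder bits standing in for the map flags. */
enum {
    MAP_READ                    = 1 << 0,
    MAP_WRITE                   = 1 << 1,
    MAP_WRITE_INVALIDATE_REGION = 1 << 2
};

/* Returns false in the cases where the map command would fail with
 * CL_INVALID_OPERATION because of the host access flags. */
bool map_allowed(unsigned int host_flags, unsigned int map_flags)
{
    /* Reading is not permitted for write-only or no-access images. */
    if ((map_flags & MAP_READ) &&
        (host_flags & (HOST_WRITE_ONLY | HOST_NO_ACCESS)))
        return false;
    /* Writing is not permitted for read-only or no-access images. */
    if ((map_flags & (MAP_WRITE | MAP_WRITE_INVALIDATE_REGION)) &&
        (host_flags & (HOST_READ_ONLY | HOST_NO_ACCESS)))
        return false;
    return true;
}
```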
[[image-object-queries]]
=== Image Object Queries
To get information that is common to all memory objects, use the
{clGetMemObjectInfo} function described in <<memory-object-queries, Memory
Object Queries>>.
[open,refpage='clGetImageInfo',desc='Get information specific to an image object created with clCreateImage.',type='protos']
--
To get information specific to an image object created with {clCreateImage},
{clCreateImageWithProperties}, {clCreateImage2D}, or {clCreateImage3D} call
the function
include::{generated}/api/protos/clGetImageInfo.txt[]
include::{generated}/api/version-notes/clGetImageInfo.asciidoc[]
* _image_ specifies the image object being queried.
* _param_name_ specifies the information to query.
The list of supported _param_name_ types and the information returned in
_param_value_ by {clGetImageInfo} is described in the
<<image-info-table,Image Object Queries>> table.
* _param_value_ is a pointer to memory where the appropriate result being
queried is returned.
If _param_value_ is `NULL`, it is ignored.
* _param_value_size_ is used to specify the size in bytes of memory pointed to
by _param_value_.
This size must be {geq} size of return type as described in the
<<image-info-table,Image Object Queries>> table.
* _param_value_size_ret_ returns the actual size in bytes of data being
queried by _param_name_.
If _param_value_size_ret_ is `NULL`, it is ignored.
[[image-info-table]]
.List of supported param_names by <<clGetImageInfo>>
[width="100%",cols="<33%,<17%,<50%",options="header"]
|====
| Image Info | Return type | Description
| {CL_IMAGE_FORMAT_anchor}
include::{generated}/api/version-notes/CL_IMAGE_FORMAT.asciidoc[]
| {cl_image_format_TYPE}
| Return the image format descriptor specified when _image_ is created
with {clCreateImage}, {clCreateImageWithProperties},
{clCreateImage2D} or {clCreateImage3D}.
| {CL_IMAGE_ELEMENT_SIZE_anchor}
include::{generated}/api/version-notes/CL_IMAGE_ELEMENT_SIZE.asciidoc[]
| {size_t_TYPE}
| Return size of each element of the image memory object given by
_image_ in bytes.
| {CL_IMAGE_ROW_PITCH_anchor}
include::{generated}/api/version-notes/CL_IMAGE_ROW_PITCH.asciidoc[]
| {size_t_TYPE}
| Return calculated row pitch in bytes of a row of elements of the
image object given by _image_.
| {CL_IMAGE_SLICE_PITCH_anchor}
include::{generated}/api/version-notes/CL_IMAGE_SLICE_PITCH.asciidoc[]
| {size_t_TYPE}
| Return calculated slice pitch in bytes of a 2D slice for the 3D
image object or size of each image in a 1D or 2D image array given
by _image_.
For a 1D image, 1D image buffer and 2D image object return 0.
| {CL_IMAGE_WIDTH_anchor}
include::{generated}/api/version-notes/CL_IMAGE_WIDTH.asciidoc[]
| {size_t_TYPE}
| Return width of the image in pixels.
| {CL_IMAGE_HEIGHT_anchor}
include::{generated}/api/version-notes/CL_IMAGE_HEIGHT.asciidoc[]
| {size_t_TYPE}
| Return height of the image in pixels.
For a 1D image, 1D image buffer and 1D image array object, height =
0.
| {CL_IMAGE_DEPTH_anchor}
include::{generated}/api/version-notes/CL_IMAGE_DEPTH.asciidoc[]
| {size_t_TYPE}
| Return depth of the image in pixels.
For a 1D image, 1D image buffer, 2D image or 1D and 2D image array
object, depth = 0.
| {CL_IMAGE_ARRAY_SIZE_anchor}
include::{generated}/api/version-notes/CL_IMAGE_ARRAY_SIZE.asciidoc[]
| {size_t_TYPE}
| Return number of images in the image array.
If _image_ is not an image array, 0 is returned.
| {CL_IMAGE_BUFFER_anchor}
include::{generated}/api/version-notes/CL_IMAGE_BUFFER.asciidoc[]
| {cl_mem_TYPE}
| Return buffer object associated with _image_.
| {CL_IMAGE_NUM_MIP_LEVELS_anchor}
include::{generated}/api/version-notes/CL_IMAGE_NUM_MIP_LEVELS.asciidoc[]
| {cl_uint_TYPE}
| Return `num_mip_levels` associated with _image_.
| {CL_IMAGE_NUM_SAMPLES_anchor}
include::{generated}/api/version-notes/CL_IMAGE_NUM_SAMPLES.asciidoc[]
| {cl_uint_TYPE}
| Return `num_samples` associated with _image_.
|====
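The zero-valued cases in the table above can be summarised in a small host-side model.
This is only a sketch: the enum, the struct and the `sketch_query_*` functions are hypothetical stand-ins, and a real application would obtain these values through {clGetImageInfo}.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the image types named in the table. */
enum sketch_image_type {
    SKETCH_IMAGE1D,
    SKETCH_IMAGE1D_BUFFER,
    SKETCH_IMAGE1D_ARRAY,
    SKETCH_IMAGE2D,
    SKETCH_IMAGE2D_ARRAY,
    SKETCH_IMAGE3D
};

/* Dimensions as supplied when the modelled image was created. */
struct sketch_image {
    enum sketch_image_type type;
    size_t width, height, depth, array_size;
};

/* Height query: 0 for a 1D image, 1D image buffer and 1D image array. */
static size_t sketch_query_height(const struct sketch_image *img)
{
    assert(img != NULL);
    switch (img->type) {
    case SKETCH_IMAGE1D:
    case SKETCH_IMAGE1D_BUFFER:
    case SKETCH_IMAGE1D_ARRAY:
        return 0;
    default:
        return img->height;
    }
}

/* Depth query: non-zero only for a 3D image. */
static size_t sketch_query_depth(const struct sketch_image *img)
{
    assert(img != NULL);
    return img->type == SKETCH_IMAGE3D ? img->depth : 0;
}

/* Array size query: 0 unless the image is an image array. */
static size_t sketch_query_array_size(const struct sketch_image *img)
{
    assert(img != NULL);
    return (img->type == SKETCH_IMAGE1D_ARRAY ||
            img->type == SKETCH_IMAGE2D_ARRAY) ? img->array_size : 0;
}
```

The model ignores whatever height, depth or array size was stored for an image type to which the query does not apply, mirroring the "return 0" wording of the table.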
// refError
{clGetImageInfo} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
specified by _param_value_size_ is < size of return type as described in
the <<image-info-table,Image Object Queries>> table and _param_value_ is
not `NULL`.
* {CL_INVALID_MEM_OBJECT} if _image_ is not a valid image object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
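The size rule that {clGetImageInfo} applies to _param_value_ and _param_value_size_ can be modelled on the host as follows.
This is a sketch under stated assumptions: `sketch_check_query` and the `query_status` codes are hypothetical names that stand in for the real entry point and for {CL_SUCCESS} and {CL_INVALID_VALUE}.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes standing in for the OpenCL error codes. */
enum query_status { QUERY_SUCCESS, QUERY_INVALID_VALUE };

/* Models the size rule: a non-NULL param_value with a too-small
 * param_value_size is rejected, while a NULL param_value is ignored.
 * When size_ret is non-NULL it receives the size of the return type. */
static enum query_status sketch_check_query(const void *param_value,
                                            size_t param_value_size,
                                            size_t return_type_size,
                                            size_t *size_ret)
{
    assert(return_type_size > 0);
    if (param_value != NULL && param_value_size < return_type_size)
        return QUERY_INVALID_VALUE;
    if (size_ret != NULL)
        *size_ret = return_type_size;
    return QUERY_SUCCESS;
}
```

Passing a `NULL` _param_value_ together with a non-`NULL` size pointer corresponds to asking only for the required size.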
== Pipes
NOTE: Pipes are <<unified-spec, missing before>> version 2.0.
A _pipe_ is a memory object that stores data organized as a FIFO.
Pipe objects can only be accessed using built-in functions that read from
and write to a pipe.
Pipe objects are not accessible from the host.
A pipe object encapsulates the following information:
* Packet size in bytes
* Maximum capacity in packets
* Information about the number of packets currently in the pipe
* Data packets
=== Creating Pipe Objects
[open,refpage='clCreatePipe',desc='Creates a pipe object.',type='protos']
--
To create a *pipe object*, call the function
include::{generated}/api/protos/clCreatePipe.txt[]
include::{generated}/api/version-notes/clCreatePipe.asciidoc[]
* _context_ is a valid OpenCL context used to create the pipe object.
* _flags_ is a bit-field that is used to specify allocation and usage
information such as the memory arena that should be used to allocate the
pipe object and how it will be used.
The <<memory-flags,Memory Flags>> table describes the possible values for
_flags_.
Only {CL_MEM_READ_WRITE} and {CL_MEM_HOST_NO_ACCESS} can be specified when
creating a pipe object.
If the value specified for _flags_ is 0, the default is used which is
{CL_MEM_READ_WRITE} | {CL_MEM_HOST_NO_ACCESS}.
* _pipe_packet_size_ is the size in bytes of a pipe packet.
* _pipe_max_packets_ specifies the pipe capacity by specifying the maximum
number of packets the pipe can hold.
* _properties_ specifies a list of properties for the pipe and their
corresponding values.
Each property name is immediately followed by the corresponding desired
value.
The list is terminated with 0.
Currently, in all OpenCL versions, _properties_ must be `NULL`.
* _errcode_ret_ will return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
// refError
{clCreatePipe} returns a valid non-zero pipe object and _errcode_ret_ is set
to {CL_SUCCESS} if the pipe object is created successfully.
Otherwise, it returns a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_OPERATION} if no devices in _context_ support pipes.
* {CL_INVALID_VALUE} if values specified in _flags_ are not as defined
above.
* {CL_INVALID_VALUE} if _properties_ is not `NULL`.
* {CL_INVALID_PIPE_SIZE} if _pipe_packet_size_ is 0 or the
_pipe_packet_size_ exceeds {CL_DEVICE_PIPE_MAX_PACKET_SIZE} value
specified in the <<device-queries-table,Device Queries>> table for all
devices in _context_ or if _pipe_max_packets_ is 0.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for the pipe object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
Pipes follow the same memory consistency model as defined for buffer and
image objects.
The pipe state, i.e. the contents of the pipe across kernel-instances (on the
same or different devices), is enforced at a synchronization point.
--
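The argument checks listed for {clCreatePipe} can be sketched as a host-side model.
All names below are hypothetical, the flag bit values are placeholders rather than the values in the OpenCL headers, and the order in which the checks are applied is an assumption, since the error list does not define a precedence.
The parameter `largest_device_max_packet_size` stands for the largest {CL_DEVICE_PIPE_MAX_PACKET_SIZE} value among the devices in the context.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder flag bits; not the values used by the OpenCL headers. */
#define SKETCH_MEM_READ_WRITE     (1u << 0)
#define SKETCH_MEM_HOST_NO_ACCESS (1u << 1)

/* Hypothetical status codes standing in for the OpenCL error codes. */
enum pipe_status { PIPE_OK, PIPE_INVALID_VALUE, PIPE_INVALID_PIPE_SIZE };

/* Models the flag, properties and size checks for pipe creation. */
static enum pipe_status sketch_validate_pipe_args(
    unsigned flags, unsigned packet_size, unsigned max_packets,
    const void *properties, unsigned largest_device_max_packet_size)
{
    const unsigned allowed = SKETCH_MEM_READ_WRITE | SKETCH_MEM_HOST_NO_ACCESS;

    assert(largest_device_max_packet_size > 0);
    if ((flags & ~allowed) != 0)
        return PIPE_INVALID_VALUE;      /* flag outside the permitted set */
    if (properties != NULL)
        return PIPE_INVALID_VALUE;      /* properties must be NULL */
    if (packet_size == 0 || max_packets == 0 ||
        packet_size > largest_device_max_packet_size)
        return PIPE_INVALID_PIPE_SIZE;
    return PIPE_OK;
}

/* Flags actually used: 0 selects the default combination. */
static unsigned sketch_effective_pipe_flags(unsigned flags)
{
    return flags == 0 ? (SKETCH_MEM_READ_WRITE | SKETCH_MEM_HOST_NO_ACCESS)
                      : flags;
}
```

The model does not cover context validity, device support for pipes, or allocation failures, which depend on the state of the implementation.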
=== Pipe Object Queries
To get information that is common to all memory objects, use the
{clGetMemObjectInfo} function described in <<memory-object-queries, Memory
Object Queries>>.
[open,refpage='clGetPipeInfo',desc='Get information specific to a pipe object created with clCreatePipe.',type='protos']
--
To get information specific to a pipe object created with {clCreatePipe},
call the function
include::{generated}/api/protos/clGetPipeInfo.txt[]
include::{generated}/api/version-notes/clGetPipeInfo.asciidoc[]
* _pipe_ specifies the pipe object being queried.
* _param_name_ specifies the information to query.
The list of supported _param_name_ types and the information returned in
_param_value_ by {clGetPipeInfo} is described in the <<pipe-info-table,Pipe
Object Queries>> table.
* _param_value_ is a pointer to memory where the appropriate result being
queried is returned.
If _param_value_ is `NULL`, it is ignored.
* _param_value_size_ is used to specify the size in bytes of memory pointed to
by _param_value_.
This size must be {geq} size of return type as described in the
<<pipe-info-table,Pipe Object Queries>> table.
* _param_value_size_ret_ returns the actual size in bytes of data being
queried by _param_name_.
If _param_value_size_ret_ is `NULL`, it is ignored.
// refError
{clGetPipeInfo} returns {CL_SUCCESS} if the function is executed successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_MEM_OBJECT} if _pipe_ is not a valid pipe object.
* {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
specified by _param_value_size_ is < size of return type as described in
the <<pipe-info-table,Pipe Object Queries>> table and _param_value_ is
not `NULL`.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
[[pipe-info-table]]
.List of supported param_names by <<clGetPipeInfo>>
[width="100%",cols="<33%,<17%,<50%",options="header"]
|====
| Pipe Info | Return type | Description
| {CL_PIPE_PACKET_SIZE_anchor}
include::{generated}/api/version-notes/CL_PIPE_PACKET_SIZE.asciidoc[]
| {cl_uint_TYPE}
| Return pipe packet size specified when _pipe_ is created with
{clCreatePipe}.
| {CL_PIPE_MAX_PACKETS_anchor}
include::{generated}/api/version-notes/CL_PIPE_MAX_PACKETS.asciidoc[]
| {cl_uint_TYPE}
| Return the maximum number of packets specified when _pipe_ is created
with {clCreatePipe}.
| {CL_PIPE_PROPERTIES_anchor}
include::{generated}/api/version-notes/CL_PIPE_PROPERTIES.asciidoc[]
| {cl_pipe_properties_TYPE}[]
| Return the properties argument specified in {clCreatePipe}.
If the _properties_ argument specified in {clCreatePipe} used to
create _pipe_ was not `NULL`, the implementation must return the
values specified in the properties argument in the same order and
without including additional properties.
If the _properties_ argument specified in {clCreatePipe} used to
create _pipe_ was `NULL`, the implementation must return
_param_value_size_ret_ equal to 0, indicating that there are no
properties to be returned.
|====
--
== Querying, Unmapping, Migrating, Retaining and Releasing Memory Objects
// === Handling Memory Objects
=== Retaining and Releasing Memory Objects
[open,refpage='clRetainMemObject',desc='Increments the memory object reference count.',type='protos']
--
To retain a memory object, call the function
include::{generated}/api/protos/clRetainMemObject.txt[]
include::{generated}/api/version-notes/clRetainMemObject.asciidoc[]
* _memobj_ specifies the memory object to be retained.
The _memobj_ reference count is incremented.
// refError
{clRetainMemObject} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_MEM_OBJECT} if _memobj_ is not a valid memory object (buffer
or image object).
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
{clCreateBuffer}, {clCreateBufferWithProperties}, {clCreateSubBuffer},
{clCreateImage}, {clCreateImageWithProperties}, {clCreateImage2D},
{clCreateImage3D} and {clCreatePipe} perform an implicit retain.
--
[open,refpage='clReleaseMemObject',desc='Decrements the memory object reference count.',type='protos']
--
To release a memory object, call the function
include::{generated}/api/protos/clReleaseMemObject.txt[]
include::{generated}/api/version-notes/clReleaseMemObject.asciidoc[]
* _memobj_ specifies the memory object to be released.
The _memobj_ reference count is decremented.
// refError
{clReleaseMemObject} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_MEM_OBJECT} if _memobj_ is not a valid memory object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
After the _memobj_ reference count becomes zero and commands queued for
execution on one or more command-queues that use _memobj_ have finished, the
memory object is deleted.
If _memobj_ is a buffer object, _memobj_ cannot be deleted until all
sub-buffer objects associated with _memobj_ are deleted.
Using this function to release a reference that was not obtained by creating
the object or by calling {clRetainMemObject} causes undefined behavior.
--
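The retain, release and deletion rules above can be modelled on the host as a counter plus two deletion conditions.
This is a sketch only: `struct sketch_memobj` and its functions are hypothetical and do not reflect how any implementation tracks memory objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one modelled memory object. */
struct sketch_memobj {
    unsigned refcount;         /* references currently held */
    unsigned pending_commands; /* enqueued commands still using the object */
    unsigned live_sub_buffers; /* associated sub-buffers not yet deleted */
};

/* Creation performs an implicit retain, so the count starts at 1. */
static struct sketch_memobj sketch_create(void)
{
    struct sketch_memobj m = { 1, 0, 0 };
    return m;
}

static void sketch_retain(struct sketch_memobj *m)
{
    assert(m != NULL);
    m->refcount++;
}

static void sketch_release(struct sketch_memobj *m)
{
    /* Releasing a reference that was never obtained is undefined behavior
     * in the real API; the model simply refuses it. */
    assert(m != NULL && m->refcount > 0);
    m->refcount--;
}

/* Deletion requires a zero count, no pending commands and no sub-buffers. */
static bool sketch_may_delete(const struct sketch_memobj *m)
{
    assert(m != NULL);
    return m->refcount == 0 && m->pending_commands == 0 &&
           m->live_sub_buffers == 0;
}
```

In this model a release that brings the count to zero does not by itself delete the object; deletion is deferred until the remaining conditions hold.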
[open,refpage='clSetMemObjectDestructorCallback',desc='Registers a destructor callback function with a memory object.',type='protos']
--
To register a callback function with a memory object that is called when
the memory object is destroyed, call the function
include::{generated}/api/protos/clSetMemObjectDestructorCallback.txt[]
include::{generated}/api/version-notes/clSetMemObjectDestructorCallback.asciidoc[]
* _memobj_ specifies the memory object to register the callback to.
* _pfn_notify_ is the callback function to register.
This callback function may be called asynchronously by the OpenCL
implementation.
It is the application's responsibility to ensure that the callback function
is thread-safe.
The parameters to this callback function are:
** _memobj_ is the memory object being deleted.
When the callback function is called by the implementation, this memory
object is no longer valid.
_memobj_ is only provided for reference purposes.
** _user_data_ is a pointer to user-supplied data.
* _user_data_ will be passed as the _user_data_ argument when _pfn_notify_ is
called.
_user_data_ can be `NULL`.
Each call to {clSetMemObjectDestructorCallback} registers the specified
callback function on a destructor callback stack associated with _memobj_.
The registered callback functions are called in the reverse order in
which they were registered.
The registered callback functions are called and then the memory object's
resources are freed and the memory object is deleted.
Therefore, the memory object destructor callback provides a mechanism for
an application to safely re-use or free a _host_ptr_ that was specified when
_memobj_ was created and used as the storage bits for the memory object.
// refError
{clSetMemObjectDestructorCallback} returns {CL_SUCCESS} if the function is
executed successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_MEM_OBJECT} if _memobj_ is not a valid memory object.
* {CL_INVALID_VALUE} if _pfn_notify_ is `NULL`.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
[NOTE]
====
When the user callback function is called by the implementation, the
contents of the memory region pointed to by _host_ptr_ (if the memory object
is created with {CL_MEM_USE_HOST_PTR}) are undefined.
The callback function is typically used by the application to either free or
reuse the memory region pointed to by _host_ptr_.
The behavior of calling expensive system routines, OpenCL API calls to
create contexts or command-queues, or blocking OpenCL operations from the
list below in a callback is undefined.
* {clFinish},
* {clWaitForEvents},
* blocking calls to {clEnqueueReadBuffer}, {clEnqueueReadBufferRect},
{clEnqueueWriteBuffer}, {clEnqueueWriteBufferRect},
* blocking calls to {clEnqueueReadImage} and {clEnqueueWriteImage},
* blocking calls to {clEnqueueMapBuffer}, {clEnqueueMapImage},
* blocking calls to {clBuildProgram}, {clCompileProgram} or
{clLinkProgram}
If an application needs to wait for completion of a routine from the above
list in a callback, please use the non-blocking form of the function, and
assign a completion callback to it to do the remainder of your work.
Note that when a callback (or other code) enqueues commands to a
command-queue, the commands are not required to begin execution until the
queue is flushed.
In standard usage, blocking enqueue calls serve this role by implicitly
flushing the queue.
Since blocking calls are not permitted in callbacks, those callbacks that
enqueue commands on a command queue should either call {clFlush} on the
queue before returning or arrange for {clFlush} to be called later on
another thread.
The user callback function may not call OpenCL APIs with the memory object
for which the callback function is invoked and for such cases the behavior
of OpenCL APIs is considered to be undefined.
====
--
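The reverse-order invocation of destructor callbacks can be illustrated with a small host-side stack.
This is a model only: the types and functions below are hypothetical, the fixed capacity is an artifact of the sketch, and the callback signature is simplified to plain `void *` arguments.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_CALLBACKS 8

/* Simplified callback type: memory object handle and user data. */
typedef void (*sketch_notify_fn)(void *memobj, void *user_data);

struct sketch_callback_stack {
    sketch_notify_fn fn[SKETCH_MAX_CALLBACKS];
    void *user_data[SKETCH_MAX_CALLBACKS];
    size_t count;
};

/* Returns 0 on success, -1 if fn is NULL or the stack is full. */
static int sketch_push_callback(struct sketch_callback_stack *s,
                                sketch_notify_fn fn, void *user_data)
{
    assert(s != NULL);
    if (fn == NULL || s->count == SKETCH_MAX_CALLBACKS)
        return -1;
    s->fn[s->count] = fn;
    s->user_data[s->count] = user_data;
    s->count++;
    return 0;
}

/* Pops and calls every callback, most recently registered first. */
static void sketch_run_destructors(struct sketch_callback_stack *s,
                                   void *memobj)
{
    assert(s != NULL);
    while (s->count > 0) {
        s->count--;
        s->fn[s->count](memobj, s->user_data[s->count]);
    }
}

/* Example callback: appends the character passed as user_data to a log. */
static char sketch_log[SKETCH_MAX_CALLBACKS];
static size_t sketch_log_len;

static void sketch_record(void *memobj, void *user_data)
{
    (void)memobj; /* provided for reference only; not used here */
    if (sketch_log_len < SKETCH_MAX_CALLBACKS)
        sketch_log[sketch_log_len++] = *(const char *)user_data;
}
```

Registering `sketch_record` three times with different `user_data` values and then running the destructors fills the log in the opposite order of registration.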
[[unmapping-mapped-memory]]
=== Unmapping Mapped Memory Objects
[open,refpage='clEnqueueUnmapMemObject',desc='Enqueues a command to unmap a previously mapped region of a memory object.',type='protos']
--
To enqueue a command to unmap a previously mapped region of a memory object,
call the function
include::{generated}/api/protos/clEnqueueUnmapMemObject.txt[]
include::{generated}/api/version-notes/clEnqueueUnmapMemObject.asciidoc[]
* _command_queue_ must be a valid host command-queue.
* _memobj_ is a valid memory (buffer or image) object.
The OpenCL context associated with _command_queue_ and _memobj_ must be the
same.
* _mapped_ptr_ is the host address returned by a previous call to
{clEnqueueMapBuffer}, or {clEnqueueMapImage} for _memobj_.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before {clEnqueueUnmapMemObject} can be executed.
If _event_wait_list_ is `NULL`, then {clEnqueueUnmapMemObject} does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this command and
can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
Reads or writes from the host using the pointer returned by
{clEnqueueMapBuffer} or {clEnqueueMapImage} are considered to be complete.
{clEnqueueMapBuffer} and {clEnqueueMapImage} increment the mapped count of
the memory object.
The initial mapped count value of the memory object is zero.
Each call to {clEnqueueMapBuffer} or {clEnqueueMapImage} on the same memory
object increments this mapped count by one.
{clEnqueueUnmapMemObject} decrements the mapped count of the memory object.
{clEnqueueMapBuffer} and {clEnqueueMapImage} act as synchronization points
for a region of the buffer object being mapped.
// refError
{clEnqueueUnmapMemObject} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_MEM_OBJECT} if _memobj_ is not a valid memory object or is a
pipe object.
* {CL_INVALID_VALUE} if _mapped_ptr_ is not a valid pointer returned by
{clEnqueueMapBuffer} or {clEnqueueMapImage} for _memobj_.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or if _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
* {CL_INVALID_CONTEXT} if context associated with _command_queue_ and
_memobj_ are not the same or if the context associated with
_command_queue_ and events in _event_wait_list_ are not the same.
--
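The pairing rules for _num_events_in_wait_list_, _event_wait_list_ and _event_ described above can be expressed as two host-side checks.
This is a sketch under assumptions: `sketch_event` is a hypothetical opaque handle, and a `NULL` entry in the list stands in for an event object that is not valid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opaque event handle. */
typedef const void *sketch_event;

/* True when the count and the list pointer are consistent and every
 * entry is a (modelled) valid event. */
static bool sketch_wait_list_ok(unsigned num_events,
                                const sketch_event *wait_list)
{
    if (wait_list == NULL)
        return num_events == 0;
    if (num_events == 0)
        return false;
    for (unsigned i = 0; i < num_events; i++)
        if (wait_list[i] == NULL)
            return false;
    return true;
}

/* True when the output event slot does not refer to an element of the
 * wait list. The wait list must already have passed the check above. */
static bool sketch_event_slot_ok(unsigned num_events,
                                 const sketch_event *wait_list,
                                 const sketch_event *event_out)
{
    assert(sketch_wait_list_ok(num_events, wait_list));
    if (wait_list == NULL || event_out == NULL)
        return true;
    for (unsigned i = 0; i < num_events; i++)
        if (event_out == &wait_list[i])
            return false;
    return true;
}
```

A failure of the first check corresponds to the {CL_INVALID_EVENT_WAIT_LIST} cases listed above; the context comparison between events and the command-queue is not modelled.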
[[accessing-mapped-regions]]
=== Accessing mapped regions of a memory object
This section describes the behavior of OpenCL commands that access mapped
regions of a memory object.
The contents of the region of a memory object and associated memory objects
(sub-buffer objects or 1D image buffer objects that overlap this region)
mapped for writing (i.e. {CL_MAP_WRITE} or {CL_MAP_WRITE_INVALIDATE_REGION} is
set in _map_flags_ argument to {clEnqueueMapBuffer}, or {clEnqueueMapImage})
are considered to be undefined until this region is unmapped.
Multiple commands in command-queues can map a region or overlapping regions
of a memory object and associated memory objects (sub-buffer objects or 1D
image buffer objects that overlap this region) for reading (i.e. _map_flags_
= {CL_MAP_READ}).
The contents of the regions of a memory object mapped for reading can also
be read by kernels and other OpenCL commands (such as {clEnqueueCopyBuffer})
executing on one or more devices.
Mapping (and unmapping) overlapped regions in a memory object and/or
associated memory objects (sub-buffer objects or 1D image buffer objects
that overlap this region) for writing is an error and will result in
{CL_INVALID_OPERATION} error returned by {clEnqueueMapBuffer}, or
{clEnqueueMapImage}.
If a memory object is currently mapped for writing, the application must
ensure that the memory object is unmapped before any enqueued kernels or
commands that read from or write to this memory object or any of its
associated memory objects (sub-buffer or 1D image buffer objects) or its
parent object (if the memory object is a sub-buffer or 1D image buffer
object) begin execution; otherwise the behavior is undefined.
If a memory object is currently mapped for reading, the application must
ensure that the memory object is unmapped before any enqueued kernels or
commands that write to this memory object or any of its associated memory
objects (sub-buffer or 1D image buffer objects) or its parent object (if the
memory object is a sub-buffer or 1D image buffer object) begin execution;
otherwise the behavior is undefined.
A memory object is considered as mapped if there are one or more active
mappings for the memory object irrespective of whether the mapped regions
span the entire memory object.
Accessing the contents of the memory region referred to by the mapped
pointer that has been unmapped is undefined.
The mapped pointer returned by {clEnqueueMapBuffer} or {clEnqueueMapImage}
can be used as the _ptr_ argument value to {clEnqueueReadBuffer},
{clEnqueueWriteBuffer}, {clEnqueueReadBufferRect},
{clEnqueueWriteBufferRect}, {clEnqueueReadImage}, or
{clEnqueueWriteImage} provided the rules described above are adhered to.
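The overlap rule for regions mapped for writing can be sketched for the buffer case as an interval test over byte ranges.
The model below is hypothetical and makes two assumptions: it treats only an overlap between two regions that are both mapped for writing as the error case, which is one reading of the rule above, and it ignores arithmetic overflow of `offset + size`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one active mapping of a buffer. */
struct sketch_mapping {
    size_t offset;  /* first byte of the mapped region */
    size_t size;    /* length of the mapped region in bytes, > 0 */
    bool for_write; /* true for a write or write-invalidate mapping */
};

/* True when the half-open byte ranges of a and b share at least one byte. */
static bool sketch_regions_overlap(const struct sketch_mapping *a,
                                   const struct sketch_mapping *b)
{
    assert(a->size > 0 && b->size > 0);
    return a->offset < b->offset + b->size &&
           b->offset < a->offset + a->size;
}

/* True when candidate can be added to active[0..count) without creating
 * two overlapping regions that are both mapped for writing. */
static bool sketch_map_allowed(const struct sketch_mapping *active,
                               size_t count,
                               const struct sketch_mapping *candidate)
{
    for (size_t i = 0; i < count; i++)
        if (candidate->for_write && active[i].for_write &&
            sketch_regions_overlap(&active[i], candidate))
            return false;
    return true;
}
```

A rejected candidate corresponds to the {CL_INVALID_OPERATION} case; sub-buffers and 1D image buffers that alias the same storage would need their offsets translated to the parent before the test.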
=== Migrating Memory Objects
NOTE: Migrating memory objects is <<unified-spec, missing before>> version 1.2.
This section describes a mechanism for assigning which device an OpenCL
memory object resides on.
A user may wish to have more explicit control over the location of their
memory objects on creation.
This could be used to:
* Ensure that an object is allocated on a specific device prior to usage.
* Preemptively migrate an object from one device to another.
[open,refpage='clEnqueueMigrateMemObjects',desc='Enqueues a command to indicate which device a set of memory objects should be associated with.',type='protos']
--
To enqueue a command to indicate which device a set of memory objects should
be associated with, call the function
include::{generated}/api/protos/clEnqueueMigrateMemObjects.txt[]
include::{generated}/api/version-notes/clEnqueueMigrateMemObjects.asciidoc[]
* _command_queue_ is a valid host command-queue.
The specified set of memory objects in _mem_objects_ will be migrated to the
OpenCL device associated with _command_queue_ or to the host if the
{CL_MIGRATE_MEM_OBJECT_HOST} has been specified.
* _num_mem_objects_ is the number of memory objects specified in
_mem_objects_.
* _mem_objects_ is a pointer to a list of memory objects.
* _flags_ is a bit-field that is used to specify migration options.
The <<migration-flags-table,Memory Migration Flags>> table describes the
possible values for _flags_.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this command and
can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
[[migration-flags-table]]
.List of supported migration flags by <<clEnqueueMigrateMemObjects>>
[width="100%",cols="<50%,<50%",options="header"]
|====
| Memory Migration Flags | Description
| {CL_MIGRATE_MEM_OBJECT_HOST_anchor}
include::{generated}/api/version-notes/CL_MIGRATE_MEM_OBJECT_HOST.asciidoc[]
| This flag indicates that the specified set of memory objects are to be
migrated to the host, regardless of the target command-queue.
| {CL_MIGRATE_MEM_OBJECT_CONTENT_UNDEFINED_anchor}
include::{generated}/api/version-notes/CL_MIGRATE_MEM_OBJECT_CONTENT_UNDEFINED.asciidoc[]
| This flag indicates that the contents of the set of memory objects are
undefined after migration.
The specified set of memory objects are migrated to the device
associated with _command_queue_ without incurring the overhead of
migrating their contents.
|====
Typically, memory objects are implicitly migrated to a device for which
enqueued commands, using the memory object, are targeted.
{clEnqueueMigrateMemObjects} allows this migration to be explicitly
performed ahead of the dependent commands.
This allows a user to preemptively change the association of a memory
object, through regular command queue scheduling, in order to prepare for
another upcoming command.
This also permits an application to overlap the placement of memory objects
with other unrelated operations before these memory objects are needed,
potentially hiding transfer latencies.
Once the event returned from {clEnqueueMigrateMemObjects} has been marked
{CL_COMPLETE}, the memory objects specified in _mem_objects_ have been
successfully migrated to the device associated with _command_queue_.
The migrated memory object shall remain resident on the device until another
command is enqueued that either implicitly or explicitly migrates it away.
{clEnqueueMigrateMemObjects} can also be used to direct the initial
placement of a memory object, after creation, possibly avoiding the initial
overhead of instantiating the object on the first enqueued command to use
it.
The user is responsible for managing the event dependencies, associated with
this command, in order to avoid overlapping access to memory objects.
Improperly specified event dependencies passed to
{clEnqueueMigrateMemObjects} could result in undefined results.
// refError
{clEnqueueMigrateMemObjects} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
memory objects in _mem_objects_ are not the same or if the context
associated with _command_queue_ and events in _event_wait_list_ are not
the same.
* {CL_INVALID_MEM_OBJECT} if any of the memory objects in _mem_objects_ is
not a valid memory object.
* {CL_INVALID_VALUE} if _num_mem_objects_ is zero or if _mem_objects_ is
`NULL`.
* {CL_INVALID_VALUE} if _flags_ is neither 0 nor a combination of the values
described in the table above.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for the specified set of memory objects in _mem_objects_.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
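The flag handling described for {clEnqueueMigrateMemObjects} can be summarised in a host-side model.
All names are hypothetical and the bit values are placeholders, not the values defined in the OpenCL headers; the sketch only covers the {CL_INVALID_VALUE} cases and the choice of migration target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag bits; not the values used by the OpenCL headers. */
#define SKETCH_MIGRATE_HOST              (1u << 0)
#define SKETCH_MIGRATE_CONTENT_UNDEFINED (1u << 1)
#define SKETCH_MIGRATE_KNOWN \
    (SKETCH_MIGRATE_HOST | SKETCH_MIGRATE_CONTENT_UNDEFINED)

enum migrate_target { MIGRATE_TO_QUEUE_DEVICE, MIGRATE_TO_HOST };

/* False for the modelled invalid-value cases: an empty or NULL object
 * list, or a flag bit that is not described in the table. */
static bool sketch_migrate_args_ok(unsigned num_mem_objects,
                                   const void *mem_objects, unsigned flags)
{
    if (num_mem_objects == 0 || mem_objects == NULL)
        return false;
    return (flags & ~SKETCH_MIGRATE_KNOWN) == 0;
}

/* The host flag overrides the device associated with the command-queue. */
static enum migrate_target sketch_migrate_target(unsigned flags)
{
    assert((flags & ~SKETCH_MIGRATE_KNOWN) == 0);
    return (flags & SKETCH_MIGRATE_HOST) ? MIGRATE_TO_HOST
                                         : MIGRATE_TO_QUEUE_DEVICE;
}

/* Contents survive the migration unless the undefined-content flag is set. */
static bool sketch_contents_preserved(unsigned flags)
{
    assert((flags & ~SKETCH_MIGRATE_KNOWN) == 0);
    return (flags & SKETCH_MIGRATE_CONTENT_UNDEFINED) == 0;
}
```

With `flags` equal to 0 the model selects the device associated with the command-queue and keeps the contents, which matches the implicit migration behavior described above.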
[[memory-object-queries]]
=== Memory Object Queries
[open,refpage='clGetMemObjectInfo',desc='Get information that is common to all memory objects (buffer and image objects).',type='protos']
--
To get information that is common to all memory objects (buffer and image
objects), call the function
include::{generated}/api/protos/clGetMemObjectInfo.txt[]
include::{generated}/api/version-notes/clGetMemObjectInfo.asciidoc[]
* _memobj_ specifies the memory object being queried.
* _param_name_ specifies the information to query.
The list of supported _param_name_ types and the information returned in
_param_value_ by {clGetMemObjectInfo} is described in the
<<mem-info-table,Memory Object Info>> table.
* _param_value_ is a pointer to memory where the appropriate result being
queried is returned.
If _param_value_ is `NULL`, it is ignored.
* _param_value_size_ is used to specify the size in bytes of memory pointed to
by _param_value_.
This size must be {geq} size of return type as described in the
<<mem-info-table,Memory Object Info>> table.
* _param_value_size_ret_ returns the actual size in bytes of data being
queried by _param_name_.
If _param_value_size_ret_ is `NULL`, it is ignored.
[[mem-info-table]]
.List of supported param_names by <<clGetMemObjectInfo>>
[width="100%",cols="<33%,<17%,<50%",options="header"]
|====
| Memory Object Info | Return type | Description
| {CL_MEM_TYPE_anchor}
include::{generated}/api/version-notes/CL_MEM_TYPE.asciidoc[]
| {cl_mem_object_type_TYPE}
| Returns one of the following values:
{CL_MEM_OBJECT_BUFFER_anchor} if _memobj_ is created with {clCreateBuffer},
{clCreateBufferWithProperties}, or {clCreateSubBuffer}.
{CL_MEM_OBJECT_IMAGE2D} if _memobj_ is created with {clCreateImage2D}.
{CL_MEM_OBJECT_IMAGE3D} if _memobj_ is created with {clCreateImage3D}.
The value of __image_desc__->__image_type__ if _memobj_ is created with
{clCreateImage} or {clCreateImageWithProperties}.
{CL_MEM_OBJECT_PIPE_anchor} if _memobj_ is created with {clCreatePipe}.
| {CL_MEM_FLAGS_anchor}
include::{generated}/api/version-notes/CL_MEM_FLAGS.asciidoc[]
| {cl_mem_flags_TYPE}
| Return the flags argument value specified when _memobj_ is created
with {clCreateBuffer}, +
{clCreateBufferWithProperties}, +
{clCreateSubBuffer}, +
{clCreateImage}, +
{clCreateImageWithProperties}, +
{clCreateImage2D}, +
{clCreateImage3D}, or +
{clCreatePipe}.
If _memobj_ is a sub-buffer, the memory access qualifiers inherited
from the parent buffer are also returned.
| {CL_MEM_SIZE_anchor}
include::{generated}/api/version-notes/CL_MEM_SIZE.asciidoc[]
| {size_t_TYPE}
| Return actual size of the data store associated with _memobj_ in
bytes.
| {CL_MEM_HOST_PTR_anchor}
include::{generated}/api/version-notes/CL_MEM_HOST_PTR.asciidoc[]
| {void_TYPE}*
| If _memobj_ is created with {clCreateBuffer}, {clCreateBufferWithProperties},
{clCreateImage}, {clCreateImageWithProperties}, {clCreateImage2D}, or
{clCreateImage3D}, and {CL_MEM_USE_HOST_PTR} is specified in mem_flags,
return the host_ptr argument value specified when _memobj_ is created.
Otherwise, if _memobj_ is created with {clCreateSubBuffer}, and
_memobj_ is created from a buffer that was created with
{CL_MEM_USE_HOST_PTR} specified in mem_flags, return the host_ptr
passed to {clCreateBuffer} or {clCreateBufferWithProperties}, plus
the origin value specified in buffer_create_info when _memobj_ is
created.
Otherwise, returns `NULL`.
| {CL_MEM_MAP_COUNT_anchor} footnote:[{fn-map-count-usage}]
include::{generated}/api/version-notes/CL_MEM_MAP_COUNT.asciidoc[]
| {cl_uint_TYPE}
| Map count.
| {CL_MEM_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}]
include::{generated}/api/version-notes/CL_MEM_REFERENCE_COUNT.asciidoc[]
| {cl_uint_TYPE}
| Return _memobj_ reference count.
| {CL_MEM_CONTEXT_anchor}
include::{generated}/api/version-notes/CL_MEM_CONTEXT.asciidoc[]
| {cl_context_TYPE}
| Return context specified when memory object is created.
If _memobj_ is created using {clCreateSubBuffer}, the context
associated with the memory object specified as the _buffer_ argument
to {clCreateSubBuffer} is returned.
| {CL_MEM_ASSOCIATED_MEMOBJECT_anchor}
include::{generated}/api/version-notes/CL_MEM_ASSOCIATED_MEMOBJECT.asciidoc[]
| {cl_mem_TYPE}
| Return memory object from which _memobj_ is created.
This returns the memory object specified as buffer argument to
{clCreateSubBuffer} if _memobj_ is a subbuffer object created using
{clCreateSubBuffer}.
This returns __image_desc__->__mem_object__ if _memobj_
is an image object created using {clCreateImage} or
{clCreateImageWithProperties}.
Otherwise, returns `NULL`.
| {CL_MEM_OFFSET_anchor}
include::{generated}/api/version-notes/CL_MEM_OFFSET.asciidoc[]
| {size_t_TYPE}
| Return offset if _memobj_ is a sub-buffer object created using
{clCreateSubBuffer}.
This returns 0 if _memobj_ is not a sub-buffer object.
| {CL_MEM_USES_SVM_POINTER_anchor}
include::{generated}/api/version-notes/CL_MEM_USES_SVM_POINTER.asciidoc[]
| {cl_bool_TYPE}
| Return {CL_TRUE} if _memobj_ is a buffer object that was created with
{CL_MEM_USE_HOST_PTR} or is a sub-buffer object of a buffer object
that was created with {CL_MEM_USE_HOST_PTR} and the _host_ptr_
specified when the buffer object was created is a SVM pointer;
otherwise returns {CL_FALSE}.
| {CL_MEM_PROPERTIES_anchor}
include::{generated}/api/version-notes/CL_MEM_PROPERTIES.asciidoc[]
| {cl_mem_properties_TYPE}[]
| Return the properties argument specified in
{clCreateBufferWithProperties} or {clCreateImageWithProperties}.
If the _properties_ argument specified in
{clCreateBufferWithProperties} or {clCreateImageWithProperties}
used to create _memobj_ was not `NULL`, the implementation must
return the values specified in the properties argument in the
same order and without including additional properties.
If _memobj_ was created using {clCreateBuffer},
{clCreateSubBuffer}, {clCreateImage}, {clCreateImage2D}, or
{clCreateImage3D}, or if the _properties_ argument specified
in {clCreateBufferWithProperties} or
{clCreateImageWithProperties} was `NULL`, the implementation
must return _param_value_size_ret_ equal to 0, indicating that
there are no properties to be returned.
|====
// refError
{clGetMemObjectInfo} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_MEM_OBJECT} if _memobj_ is not a valid memory object.
* {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
specified by _param_value_size_ is < size of return type as described in
the <<mem-info-table,Memory Object Info>> table and _param_value_ is not
`NULL`.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
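The way _param_value_, _param_value_size_ and _param_value_size_ret_ interact in {clGetMemObjectInfo} can be sketched in plain C.
The helper below is hypothetical: `model_get_info` and the `MODEL_*` codes are illustrative names, not part of the OpenCL API, and the sketch assumes the queried value is already available as a byte array.
It only models the size contract described above: the size check applies when _param_value_ is not `NULL`, and a `NULL` _param_value_ with a non-`NULL` _param_value_size_ret_ reports the required size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative status codes. MODEL_INVALID_VALUE uses the same numeric
 * value as CL_INVALID_VALUE in CL/cl.h; real code would use the CL names. */
enum { MODEL_SUCCESS = 0, MODEL_INVALID_VALUE = -30 };

/* Hypothetical model of the query contract, not an OpenCL entry point.
 * value / value_size stand in for the data the implementation holds. */
static int model_get_info(const void *value, size_t value_size,
                          size_t param_value_size, void *param_value,
                          size_t *param_value_size_ret)
{
    assert(value != NULL);

    /* The size check applies only when a destination is supplied. */
    if (param_value != NULL) {
        if (param_value_size < value_size)
            return MODEL_INVALID_VALUE;
        memcpy(param_value, value, value_size);
    }

    /* A NULL param_value_size_ret is ignored. */
    if (param_value_size_ret != NULL)
        *param_value_size_ret = value_size;

    return MODEL_SUCCESS;
}
```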
== Shared Virtual Memory
NOTE: Shared virtual memory is <<unified-spec, missing before>> version 2.0.
Shared virtual memory (a.k.a. SVM) allows the host and kernels executing on
devices to directly share complex, pointer-containing data structures such as
trees and linked lists.
It also eliminates the need to marshal data between the host and devices.
As a result, SVM substantially simplifies OpenCL programming and may improve
performance.
=== SVM sharing granularity: coarse- and fine- grained sharing
OpenCL maintains memory consistency in a coarse-grained fashion in regions
of buffers.
We call this coarse-grained sharing.
Many platforms such as those with integrated CPU-GPU processors and ones
using the SVM-related PCI-SIG IOMMU services can do better, and can support
sharing at a granularity smaller than a buffer.
We call this fine-grained sharing.
* Coarse-grained sharing: Coarse-grain sharing may be used for memory and
virtual pointer sharing between multiple devices as well as between the
host and one or more devices.
The shared memory region is a memory buffer allocated using
{clSVMAlloc}.
Memory consistency is guaranteed at synchronization points and the host
can use calls to {clEnqueueSVMMap} and {clEnqueueSVMUnmap} or create a
{cl_mem_TYPE} buffer object using the SVM pointer and use OpenCL's existing host
API functions {clEnqueueMapBuffer} and {clEnqueueUnmapMemObject} to
update regions of the buffer.
What coarse-grain buffer SVM adds to OpenCL's earlier buffer support are
the ability to share virtual memory pointers and a guarantee that
concurrent access to the same memory allocation from multiple kernels on
a single device is valid.
The coarse-grain buffer SVM provides a memory consistency model similar
to the global memory consistency model described in _sections 3.3.1_ and
_3.4.3_ of the OpenCL 1.2 specification.
This memory consistency applies to the regions of buffers being shared
in a coarse-grained fashion.
It is enforced at the synchronization points between commands enqueued
to command queues in a single context with the additional consideration
that multiple kernels concurrently running on the same device may safely
share the data.
* Fine-grained sharing: Shared virtual memory where memory consistency is
maintained at a granularity smaller than a buffer.
How fine-grained SVM is used depends on whether the device supports SVM
atomic operations.
** If SVM atomic operations are supported, they provide memory consistency
for loads and stores by the host and kernels executing on devices
supporting SVM.
This means that the host and devices can concurrently read and update
the same memory.
The consistency provided by SVM atomics is in addition to the
consistency provided at synchronization points.
There is no need for explicit calls to {clEnqueueSVMMap} and
{clEnqueueSVMUnmap} or {clEnqueueMapBuffer} and
{clEnqueueUnmapMemObject} on a {cl_mem_TYPE} buffer object created using the
SVM pointer.
** If SVM atomic operations are not supported, the host and devices can
concurrently read the same memory locations and can concurrently update
non-overlapping memory regions, but attempts to update the same memory
locations are undefined.
Memory consistency is guaranteed at synchronization points without the
need for explicit calls to {clEnqueueSVMMap} and {clEnqueueSVMUnmap}
or {clEnqueueMapBuffer} and {clEnqueueUnmapMemObject} on a {cl_mem_TYPE}
buffer object created using the SVM pointer.
* There are two kinds of fine-grain sharing support.
Devices may support either fine-grain buffer sharing or fine-grain
system sharing.
** Fine-grain buffer sharing provides fine-grain SVM only within buffers
and is an extension of coarse-grain sharing.
To support fine-grain buffer sharing in an OpenCL context, all devices
in the context must support {CL_DEVICE_SVM_FINE_GRAIN_BUFFER}.
** Fine-grain system sharing enables fine-grain sharing of the host's
entire virtual memory, including memory regions allocated by the system
*malloc* API.
OpenCL buffer objects are unnecessary and programmers can pass pointers
allocated using *malloc* to OpenCL kernels.
As an illustration of fine-grain SVM using SVM atomic operations to maintain
memory consistency, consider the following example.
The host and a set of devices can simultaneously access and update a shared
work-queue data structure holding work-items to be done.
The host can use atomic operations to insert new work-items into the queue
at the same time as the devices use similar atomic operations to remove
work-items for processing.
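
The work-queue idea can be sketched on the host side with C11 atomics.
This is an illustration under stated assumptions, not an OpenCL API: `work_queue`, `wq_push` and `wq_pop` are hypothetical names, the queue has a single producer and a fixed capacity that is never reused, and a real SVM version would place the structure in a buffer allocated with {CL_MEM_SVM_FINE_GRAIN_BUFFER} and {CL_MEM_SVM_ATOMICS} and use the OpenCL C atomic functions in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define WQ_CAPACITY 64

/* Hypothetical shared work-queue: slots are filled once and never reused. */
typedef struct {
    atomic_size_t head;        /* index of the next item to claim       */
    atomic_size_t tail;        /* one past the last published item      */
    int items[WQ_CAPACITY];    /* payload: a work-item identifier       */
} work_queue;

static void wq_init(work_queue *q)
{
    assert(q != NULL);
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Single producer: write the slot, then publish it with a release store. */
static bool wq_push(work_queue *q, int item)
{
    size_t t = atomic_load_explicit(&q->tail, memory_order_relaxed);
    if (t >= WQ_CAPACITY)
        return false;                      /* queue is full */
    q->items[t] = item;
    atomic_store_explicit(&q->tail, t + 1, memory_order_release);
    return true;
}

/* Any number of consumers: claim a published slot with compare-exchange. */
static bool wq_pop(work_queue *q, int *out)
{
    size_t h = atomic_load_explicit(&q->head, memory_order_relaxed);
    for (;;) {
        size_t t = atomic_load_explicit(&q->tail, memory_order_acquire);
        if (h >= t)
            return false;                  /* nothing published yet */
        /* On failure h is refreshed with the current head and we retry. */
        if (atomic_compare_exchange_weak_explicit(&q->head, &h, h + 1,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed)) {
            *out = q->items[h];
            return true;
        }
    }
}
```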
It is the programmer's responsibility to ensure that no host code or
executing kernels attempt to access a shared memory region after that memory
is freed.
We require the SVM implementation to work with either 32- or 64-bit host
applications subject to the following requirement: the address space size
must be the same for the host and all OpenCL devices in the context.
[open,refpage='clSVMAlloc',desc='Allocates a shared virtual memory (SVM) buffer that can be shared by the host and all devices in an OpenCL context that support shared virtual memory.',type='protos']
--
To allocate a shared virtual memory buffer (referred to as a SVM buffer)
that can be shared by the host and all devices in an OpenCL context that
support shared virtual memory, call the function
include::{generated}/api/protos/clSVMAlloc.txt[]
include::{generated}/api/version-notes/clSVMAlloc.asciidoc[]
* _context_ is a valid OpenCL context used to create the SVM buffer.
* _flags_ is a bit-field that is used to specify allocation and usage
information.
The <<svm-flags-table,SVM Memory Flags>> table describes the possible values
for _flags_.
* _size_ is the size in bytes of the SVM buffer to be allocated.
* _alignment_ is the minimum alignment in bytes that is required for the newly
created buffer's memory region.
It must be a power of two up to the largest data type supported by the
OpenCL device.
For the full profile, the largest data type is long16.
For the embedded profile, it is long16 if the device supports 64-bit
integers; otherwise it is int16.
If _alignment_ is 0, a default alignment will be used that is equal to the
size of the largest data type supported by the OpenCL implementation.
[[svm-flags-table]]
.List of supported SVM memory flag values
[width="100%",cols="<50%,<50%",options="header"]
|====
| SVM Memory Flags | Description
| {CL_MEM_READ_WRITE}
| This flag specifies that the SVM buffer will be read and written by a
kernel.
This is the default.
| {CL_MEM_WRITE_ONLY}
| This flag specifies that the SVM buffer will be written but not read by
a kernel.
Reading from a SVM buffer created with {CL_MEM_WRITE_ONLY} inside a kernel
is undefined.
{CL_MEM_READ_WRITE} and {CL_MEM_WRITE_ONLY} are mutually exclusive.
| {CL_MEM_READ_ONLY}
| This flag specifies that the SVM buffer object is a read-only memory
object when used inside a kernel.
Writing to a SVM buffer created with {CL_MEM_READ_ONLY} inside a kernel is
undefined.
{CL_MEM_READ_WRITE} or {CL_MEM_WRITE_ONLY} and {CL_MEM_READ_ONLY} are mutually
exclusive.
| {CL_MEM_SVM_FINE_GRAIN_BUFFER_anchor}
include::{generated}/api/version-notes/CL_MEM_SVM_FINE_GRAIN_BUFFER.asciidoc[]
| This specifies that the application wants the OpenCL implementation to
do a fine-grained allocation.
| {CL_MEM_SVM_ATOMICS_anchor}
include::{generated}/api/version-notes/CL_MEM_SVM_ATOMICS.asciidoc[]
| This flag is valid only if {CL_MEM_SVM_FINE_GRAIN_BUFFER} is specified in
flags.
It is used to indicate that SVM atomic operations can control visibility
of memory accesses in this SVM buffer.
|====
If {CL_MEM_SVM_FINE_GRAIN_BUFFER} is not specified, the buffer can be created
as a coarse grained SVM allocation.
Similarly, if {CL_MEM_SVM_ATOMICS} is not specified, the buffer can be created
without support for SVM atomic operations (refer to the OpenCL kernel
language specifications).
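
The flag rules in the SVM Memory Flags table can be checked on the host before calling {clSVMAlloc}.
The following is a minimal sketch, not the implementation's own validation: `svm_flags_plausible` is a hypothetical helper, and the `SKETCH_*` constants are local copies assumed to mirror the bit values in `CL/cl.h` so that the example compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins; real code would include CL/cl.h and use the CL_* names. */
typedef uint64_t sketch_svm_mem_flags;
#define SKETCH_MEM_READ_WRITE            (UINT64_C(1) << 0)
#define SKETCH_MEM_WRITE_ONLY            (UINT64_C(1) << 1)
#define SKETCH_MEM_READ_ONLY             (UINT64_C(1) << 2)
#define SKETCH_MEM_SVM_FINE_GRAIN_BUFFER (UINT64_C(1) << 10)
#define SKETCH_MEM_SVM_ATOMICS           (UINT64_C(1) << 11)

/* Returns true if flags follows the rules stated in the table. */
static bool svm_flags_plausible(sketch_svm_mem_flags flags)
{
    const sketch_svm_mem_flags access = SKETCH_MEM_READ_WRITE |
                                        SKETCH_MEM_WRITE_ONLY |
                                        SKETCH_MEM_READ_ONLY;
    const sketch_svm_mem_flags known = access |
                                       SKETCH_MEM_SVM_FINE_GRAIN_BUFFER |
                                       SKETCH_MEM_SVM_ATOMICS;
    const sketch_svm_mem_flags a = flags & access;

    if (flags & ~known)
        return false;              /* bits not defined in the table */
    if (a & (a - 1))
        return false;              /* access qualifiers are mutually exclusive */
    if ((flags & SKETCH_MEM_SVM_ATOMICS) &&
        !(flags & SKETCH_MEM_SVM_FINE_GRAIN_BUFFER))
        return false;              /* atomics require fine-grain buffer */
    return true;
}

/* Usage: 0 selects the default (read-write, coarse-grained). */
static void svm_flags_usage(void)
{
    assert(svm_flags_plausible(0));
    assert(!svm_flags_plausible(SKETCH_MEM_SVM_ATOMICS));
}
```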
Calling {clSVMAlloc} does not itself provide consistency for the shared
memory region.
When the host cannot use the SVM atomic operations, it must rely on OpenCL's
guaranteed memory consistency at synchronization points.
For SVM to be used efficiently, the host and any devices sharing a buffer
containing virtual memory pointers should have the same endianness.
If the context passed to {clSVMAlloc} has devices with mixed endianness and
the OpenCL implementation is unable to implement SVM because of that mixed
endianness, {clSVMAlloc} will fail and return `NULL`.
Although SVM is generally not supported for image objects, {clCreateImage}
and {clCreateImageWithProperties}
may create an image from a buffer (a 1D image from a buffer or a 2D image
from buffer) if the buffer specified in its image description parameter is a
SVM buffer.
Such images have a linear memory representation so their memory can be
shared using SVM.
However, fine grained sharing and atomics are not supported for image reads
and writes in a kernel.
// refError
{clSVMAlloc} returns a valid non-`NULL` shared virtual memory address if the
SVM buffer is successfully allocated.
Otherwise, like *malloc*, it returns a `NULL` pointer value.
{clSVMAlloc} will fail if
* _context_ is not a valid context, or no devices in _context_ support SVM.
* _flags_ does not contain {CL_MEM_SVM_FINE_GRAIN_BUFFER} but does contain
{CL_MEM_SVM_ATOMICS}.
* Values specified in _flags_ do not follow rules described for supported
values in the <<svm-flags-table,SVM Memory Flags>> table.
* {CL_MEM_SVM_FINE_GRAIN_BUFFER} or {CL_MEM_SVM_ATOMICS} is specified in
_flags_ and these are not supported by at least one device in _context_.
* The values specified in _flags_ are not valid, i.e. don't match those
defined in the <<svm-flags-table,SVM Memory Flags>> table.
* _size_ is 0 or > {CL_DEVICE_MAX_MEM_ALLOC_SIZE} value for any device in
_context_.
* _alignment_ is not a power of two or the OpenCL implementation cannot
support the specified alignment for at least one device in _context_.
* There was a failure to allocate resources.
--
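The _size_ and _alignment_ failure conditions listed for {clSVMAlloc} can be pre-checked by the application.
The sketch below uses hypothetical helper names and makes two assumptions: the largest data type is long16 (128 bytes, as stated for the full profile), and `max_mem_alloc_size` is the smallest {CL_DEVICE_MAX_MEM_ALLOC_SIZE} reported by any device in the context.
An implementation may still reject an alignment that passes this check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed size of long16 in bytes: 16 elements of 8 bytes each. */
#define SKETCH_MAX_ALIGNMENT 128u

/* 0 selects the default alignment; otherwise a power of two up to 128. */
static bool svm_alignment_plausible(size_t alignment)
{
    if (alignment == 0)
        return true;
    if (alignment & (alignment - 1))
        return false;                      /* not a power of two */
    return alignment <= SKETCH_MAX_ALIGNMENT;
}

/* size must be non-zero and no larger than the smallest per-device limit. */
static bool svm_size_plausible(size_t size,
                               unsigned long long max_mem_alloc_size)
{
    return size != 0 && size <= max_mem_alloc_size;
}

/* Usage: a 64-byte alignment for a 4 KiB request against a 1 MiB limit. */
static void svm_alloc_args_usage(void)
{
    assert(svm_alignment_plausible(64));
    assert(svm_size_plausible(4096, 1024ull * 1024ull));
}
```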
[open,refpage='clSVMFree',desc='Frees a shared virtual memory buffer allocated using clSVMAlloc.',type='protos']
--
To free a shared virtual memory buffer allocated using {clSVMAlloc}, call
the function
include::{generated}/api/protos/clSVMFree.txt[]
include::{generated}/api/version-notes/clSVMFree.asciidoc[]
* _context_ is a valid OpenCL context used to create the SVM buffer.
If no devices in _context_ support SVM, no action occurs.
* _svm_pointer_ must be the value returned by a call to {clSVMAlloc}.
If a `NULL` pointer is passed in _svm_pointer_, no action occurs.
Note that {clSVMFree} does not wait for previously enqueued commands that
may be using _svm_pointer_ to finish before freeing _svm_pointer_.
It is the responsibility of the application to make sure that enqueued
commands that use _svm_pointer_ have finished before freeing _svm_pointer_.
This can be done by enqueuing a blocking operation such as {clFinish},
{clWaitForEvents}, {clEnqueueReadBuffer} or by registering a callback with
the events associated with enqueued commands and freeing _svm_pointer_ once
the last enqueued command has finished.
The behavior of using _svm_pointer_ after it has been freed is undefined.
In addition, if a buffer object is created using {clCreateBuffer} or
{clCreateBufferWithProperties} with _svm_pointer_, the buffer object must
first be released before the _svm_pointer_ is freed.
The {clEnqueueSVMFree} API can also be used to enqueue a callback to free
the shared virtual memory buffer allocated using {clSVMAlloc} or a shared
system memory pointer.
--
[open,refpage='clEnqueueSVMFree',desc='Enqueues a command to free shared virtual memory allocated using clSVMAlloc or a shared system memory pointer.',type='protos']
--
To enqueue a command to free the shared virtual memory allocated using
{clSVMAlloc} or a shared system memory pointer, call the function
include::{generated}/api/protos/clEnqueueSVMFree.txt[]
include::{generated}/api/version-notes/clEnqueueSVMFree.asciidoc[]
* _command_queue_ is a valid host command-queue.
* _svm_pointers_ and _num_svm_pointers_ specify shared virtual memory pointers
to be freed.
Each pointer in _svm_pointers_ that was allocated using {clSVMAlloc} must
have been allocated from the same context from which _command_queue_ was
created.
The memory associated with _svm_pointers_ can be reused or freed after the
function returns.
* _pfn_free_func_ specifies the callback function to be called to free the SVM
pointers.
_pfn_free_func_ takes four arguments: _queue_ which is the command queue in
which {clEnqueueSVMFree} was enqueued, the count and list of SVM pointers to
free and _user_data_ which is a pointer to user specified data.
If _pfn_free_func_ is `NULL`, all pointers specified in _svm_pointers_ must
be allocated using {clSVMAlloc} and the OpenCL implementation will free
these SVM pointers.
_pfn_free_func_ must be a valid callback function if any SVM pointer to be
freed is a shared system memory pointer i.e. not allocated using
{clSVMAlloc}.
If _pfn_free_func_ is a valid callback function, the OpenCL implementation
will call _pfn_free_func_ to free all the SVM pointers specified in
_svm_pointers_.
* _user_data_ will be passed as the _user_data_ argument when _pfn_free_func_
is called.
_user_data_ can be `NULL`.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before {clEnqueueSVMFree} can be executed.
If _event_wait_list_ is `NULL`, then {clEnqueueSVMFree} does not wait on any
event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this command and
can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
// refError
{clEnqueueSVMFree} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_OPERATION} if the device associated with _command_queue_ does not support SVM.
* {CL_INVALID_VALUE} if _num_svm_pointers_ is 0 and _svm_pointers_ is
non-`NULL`, _or_ if _svm_pointers_ is `NULL` and _num_svm_pointers_ is
not 0.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
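Two rules from {clEnqueueSVMFree} lend themselves to a short host-side sketch: a count must be zero exactly when its list is `NULL`, and shared system memory pointers must be freed by the application's callback.
The names below are hypothetical, and `sketch_command_queue` stands in for the real `cl_command_queue` type so that the example compiles without the OpenCL headers.
A real callback would also carry the `CL_CALLBACK` calling convention and would be passed as _pfn_free_func_.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Opaque stand-in for cl_command_queue (assumption for this sketch). */
typedef struct sketch_queue_tag *sketch_command_queue;

/* Pairing rule shared by (num_svm_pointers, svm_pointers) and
 * (num_events_in_wait_list, event_wait_list). */
static bool count_matches_list(unsigned int count, const void *list)
{
    return (count == 0) == (list == NULL);
}

/* Shaped like pfn_free_func: releases pointers that came from malloc and,
 * if user_data is not NULL, adds the number freed to the counter there. */
static void free_system_pointers(sketch_command_queue queue,
                                 unsigned int num_svm_pointers,
                                 void *svm_pointers[],
                                 void *user_data)
{
    (void)queue;   /* unused in this sketch */
    assert(count_matches_list(num_svm_pointers, svm_pointers));

    for (unsigned int i = 0; i < num_svm_pointers; ++i)
        free(svm_pointers[i]);

    if (user_data != NULL)
        *(unsigned int *)user_data += num_svm_pointers;
}
```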
[open,refpage='clEnqueueSVMMemcpy',desc='Enqueues a command to do a memcpy operation.',type='protos']
--
To enqueue a command to do a memcpy operation, call the function
include::{generated}/api/protos/clEnqueueSVMMemcpy.txt[]
include::{generated}/api/version-notes/clEnqueueSVMMemcpy.asciidoc[]
* _command_queue_ refers to the host command-queue in which the copy command
will be queued.
If either _dst_ptr_ or _src_ptr_ is allocated using {clSVMAlloc} then the
OpenCL context it was allocated from must match that of _command_queue_.
* _blocking_copy_ indicates if the copy operation is _blocking_ or
_non-blocking_.
* If _blocking_copy_ is {CL_TRUE} i.e. the copy command is blocking,
{clEnqueueSVMMemcpy} does not return until the buffer data has been copied
into memory pointed to by _dst_ptr_.
* _size_ is the size in bytes of data being copied.
* _dst_ptr_ is the pointer to a host or SVM memory allocation where data is
copied to.
* _src_ptr_ is the pointer to a host or SVM memory allocation where data is
copied from.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this copy command
and can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
If _blocking_copy_ is {CL_FALSE} i.e. the copy command is non-blocking,
{clEnqueueSVMMemcpy} queues a non-blocking copy command and returns.
The contents of the buffer that _dst_ptr_ points to cannot be used until the
copy command has completed.
The _event_ argument returns an event object which can be used to query the
execution status of the copy command.
When the copy command has completed, the contents of the buffer that
_dst_ptr_ points to can be used by the application.
If the memory allocation(s) containing _dst_ptr_ and/or _src_ptr_ are
allocated using {clSVMAlloc} and either is not allocated from the same
context from which _command_queue_ was created, the behavior is undefined.
// refError
{clEnqueueSVMMemcpy} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_OPERATION} if the device associated with _command_queue_ does not support SVM.
* {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
events in _event_wait_list_ are not the same.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the copy operation is
blocking and the execution status of any of the events in
_event_wait_list_ is a negative integer value.
* {CL_INVALID_VALUE} if _dst_ptr_ or _src_ptr_ is `NULL`.
* {CL_MEM_COPY_OVERLAP} if the values specified for _dst_ptr_, _src_ptr_ and
_size_ result in an overlapping copy.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
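The {CL_MEM_COPY_OVERLAP} condition for {clEnqueueSVMMemcpy} depends only on _dst_ptr_, _src_ptr_ and _size_, so an application can test for it before enqueuing the copy.
The helper below is a hypothetical sketch of such a test; it assumes both pointers are non-`NULL` and compares them as integer addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if [src_ptr, src_ptr + size) and [dst_ptr, dst_ptr + size)
 * share at least one byte. */
static bool svm_copy_overlaps(const void *dst_ptr, const void *src_ptr,
                              size_t size)
{
    const uintptr_t d = (uintptr_t)dst_ptr;
    const uintptr_t s = (uintptr_t)src_ptr;

    assert(dst_ptr != NULL && src_ptr != NULL);

    if (size == 0)
        return false;                      /* empty ranges cannot overlap */

    /* The ranges overlap when the start addresses are closer than size. */
    return (d < s) ? (s - d < size) : (d - s < size);
}
```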
[open,refpage='clEnqueueSVMMemFill',desc='Enqueues a command to fill a region in memory with a pattern of a given pattern size.',type='protos']
--
To enqueue a command to fill a region in memory with a pattern of a given
pattern size, call the function
include::{generated}/api/protos/clEnqueueSVMMemFill.txt[]
include::{generated}/api/version-notes/clEnqueueSVMMemFill.asciidoc[]
* _command_queue_ refers to the host command-queue in which the fill command
will be queued.
The OpenCL context associated with _command_queue_ and SVM pointer referred
to by _svm_ptr_ must be the same.
* _svm_ptr_ is a pointer to a memory region that will be filled with
_pattern_.
It must be aligned to _pattern_size_ bytes.
If _svm_ptr_ is allocated using {clSVMAlloc} then it must be allocated from
the same context from which _command_queue_ was created.
Otherwise the behavior is undefined.
* _pattern_ is a pointer to the data pattern of size _pattern_size_ in bytes.
_pattern_ will be used to fill the region that starts at _svm_ptr_
and is _size_ bytes in size.
The data pattern must be a scalar or vector integer or floating-point data
type supported by OpenCL as described in <<scalar-data-types, Shared
Application Scalar Data Types>> and <<vector-data-types, Supported
Application Vector Data Types>>.
For example, if region pointed to by _svm_ptr_ is to be filled with a
pattern of float4 values, then _pattern_ will be a pointer to a cl_float4
value and _pattern_size_ will be `sizeof(cl_float4)`.
The maximum value of _pattern_size_ is the size of the largest integer or
floating-point vector data type supported by the OpenCL device.
The memory associated with _pattern_ can be reused or freed after the
function returns.
* _size_ is the size in bytes of region being filled starting with _svm_ptr_
and must be a multiple of _pattern_size_.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this command and
can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
// refError
{clEnqueueSVMMemFill} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_OPERATION} if the device associated with _command_queue_ does not support SVM.
* {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and
events in _event_wait_list_ are not the same.
* {CL_INVALID_VALUE} if _svm_ptr_ is `NULL`.
* {CL_INVALID_VALUE} if _svm_ptr_ is not aligned to _pattern_size_ bytes.
* {CL_INVALID_VALUE} if _pattern_ is `NULL` or if _pattern_size_ is 0 or if
_pattern_size_ is not one of {1, 2, 4, 8, 16, 32, 64, 128}.
* {CL_INVALID_VALUE} if _size_ is not a multiple of _pattern_size_.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
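The {CL_INVALID_VALUE} conditions listed for {clEnqueueSVMMemFill} can be expressed as a small host-side check.
This is a sketch with hypothetical helper names, not the implementation's validation; it covers only the pointer, pattern size, alignment and size rules stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* pattern_size must be one of 1, 2, 4, 8, 16, 32, 64 or 128. */
static bool pattern_size_valid(size_t pattern_size)
{
    return pattern_size != 0 && pattern_size <= 128 &&
           (pattern_size & (pattern_size - 1)) == 0;
}

/* Returns true if none of the listed CL_INVALID_VALUE conditions apply. */
static bool svm_fill_args_plausible(const void *svm_ptr, const void *pattern,
                                    size_t pattern_size, size_t size)
{
    if (svm_ptr == NULL || pattern == NULL)
        return false;
    if (!pattern_size_valid(pattern_size))
        return false;
    if ((uintptr_t)svm_ptr % pattern_size != 0)
        return false;                      /* svm_ptr not aligned */
    return size % pattern_size == 0;       /* size is a whole number of patterns */
}

/* Usage: filling 64 bytes with a 16-byte pattern (e.g. a float4 value). */
static void svm_fill_usage(void)
{
    _Alignas(16) unsigned char region[64];
    const float pattern[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
    assert(svm_fill_args_plausible(region, pattern, sizeof pattern,
                                   sizeof region));
}
```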
[open,refpage='clEnqueueSVMMap',desc='Enqueues a command that will allow the host to update a region of a SVM buffer',type='protos']
--
To enqueue a command that will allow the host to update a region of a SVM
buffer, call the function
include::{generated}/api/protos/clEnqueueSVMMap.txt[]
include::{generated}/api/version-notes/clEnqueueSVMMap.asciidoc[]
* _command_queue_ must be a valid host command-queue.
* _blocking_map_ indicates if the map operation is _blocking_ or
_non-blocking_.
* _map_flags_ is a bit-field and is described in the
<<memory-map-flags-table,Memory Map Flags>> table.
* _svm_ptr_ and _size_ are a pointer to a memory region and size in bytes that
will be updated by the host.
If _svm_ptr_ is allocated using {clSVMAlloc} then it must be allocated from
the same context from which _command_queue_ was created.
Otherwise the behavior is undefined.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this command and
can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
If _blocking_map_ is {CL_TRUE}, {clEnqueueSVMMap} does not return until the
application can access the contents of the SVM region specified by _svm_ptr_
and _size_ on the host.
If _blocking_map_ is {CL_FALSE} i.e. the map operation is non-blocking, the region
specified by _svm_ptr_ and _size_ cannot be used until the map command has
completed.
The _event_ argument returns an event object which can be used to query the
execution status of the map command.
When the map command is completed, the application can access the contents
of the region specified by _svm_ptr_ and _size_.
Note that since we are enqueuing a command with a SVM buffer, the region is
already mapped in the host address space.
// refError
{clEnqueueSVMMap} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_OPERATION} if the device associated with _command_queue_ does not support SVM.
* {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events
in _event_wait_list_ are not the same.
* {CL_INVALID_VALUE} if _svm_ptr_ is `NULL`.
* {CL_INVALID_VALUE} if _size_ is 0 or if values specified in _map_flags_
are not valid.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the map operation is
blocking and the execution status of any of the events in
_event_wait_list_ is a negative integer value.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
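
The {CL_INVALID_EVENT_WAIT_LIST} condition on the pointer and count can be checked by an application before it enqueues a command.
The following is a minimal sketch in plain C, not part of the API: the function name `wait_list_args_consistent` is hypothetical, and the sketch does not check whether the events themselves are valid.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not an OpenCL function.
 * Returns true when the pointer/count pair is consistent:
 * either the list is NULL and the count is 0,
 * or the list is not NULL and the count is greater than 0. */
bool wait_list_args_consistent(const void *event_wait_list,
                               unsigned int num_events_in_wait_list)
{
    if (event_wait_list == NULL)
        return num_events_in_wait_list == 0;
    return num_events_in_wait_list > 0;
}
```

The same pointer and count rule is stated for every enqueue function in this section that takes an _event_wait_list_.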
--
[open,refpage='clEnqueueSVMUnmap',desc='Enqueues a command to indicate that the host has completed updating the region given by an SVM pointer and which was specified in a previous call to clEnqueueSVMMap.',type='protos']
--
To enqueue a command to indicate that the host has completed updating the
region given by _svm_ptr_ and which was specified in a previous call to
{clEnqueueSVMMap}, call the function
include::{generated}/api/protos/clEnqueueSVMUnmap.txt[]
include::{generated}/api/version-notes/clEnqueueSVMUnmap.asciidoc[]
* _command_queue_ must be a valid host command-queue.
* _svm_ptr_ is a pointer that was specified in a previous call to
{clEnqueueSVMMap}.
If _svm_ptr_ is allocated using {clSVMAlloc} then it must be allocated from
the same context from which _command_queue_ was created.
Otherwise the behavior is undefined.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before {clEnqueueSVMUnmap} can be executed.
If _event_wait_list_ is `NULL`, then {clEnqueueSVMUnmap} does not wait on any
event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this command and
can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
{clEnqueueSVMMap} and {clEnqueueSVMUnmap} act as synchronization points for
the region of the SVM buffer specified in these calls.
// refError
{clEnqueueSVMUnmap} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_OPERATION} if the device associated with _command_queue_ does not support SVM.
* {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events
in _event_wait_list_ are not the same.
* {CL_INVALID_VALUE} if _svm_ptr_ is `NULL`.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or if _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
[NOTE]
====
If a coarse-grained SVM buffer is currently mapped for writing, the
application must ensure that the SVM buffer is unmapped before any enqueued
kernels or commands that read from or write to this SVM buffer or any of its
associated {cl_mem_TYPE} buffer objects begin execution; otherwise the behavior is
undefined.
If a coarse-grained SVM buffer is currently mapped for reading, the
application must ensure that the SVM buffer is unmapped before any enqueued
kernels or commands that write to this SVM buffer or any of its
associated {cl_mem_TYPE} buffer objects begin execution; otherwise the behavior is
undefined.
An SVM buffer is considered mapped if there are one or more active
mappings for the SVM buffer irrespective of whether the mapped regions span
the entire SVM buffer.
The above note does not apply to fine-grained SVM buffers (fine-grained
buffers allocated using {clSVMAlloc} or fine-grained system allocations).
====
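
The rules in the note above can be summarized as a small decision function.
This is only an illustrative model for coarse-grained SVM buffers: the type `svm_map_state` and both function names are hypothetical bookkeeping that an application might keep for itself, not something the OpenCL runtime exposes.

```c
#include <stdbool.h>

/* Hypothetical application-side bookkeeping for one coarse-grained SVM buffer. */
typedef struct {
    unsigned active_read_maps;   /* active mappings made for reading */
    unsigned active_write_maps;  /* active mappings made for writing */
} svm_map_state;

/* The buffer counts as mapped if any mapping is active,
 * regardless of how much of the buffer the mapped regions cover. */
bool svm_is_mapped(const svm_map_state *s)
{
    return s->active_read_maps > 0 || s->active_write_maps > 0;
}

/* Returns true if a kernel or command may begin execution on the buffer
 * with defined behavior. command_writes is true if it writes the buffer. */
bool svm_device_access_defined(const svm_map_state *s, bool command_writes)
{
    if (s->active_write_maps > 0)
        return false;   /* mapped for writing: neither reads nor writes */
    if (s->active_read_maps > 0 && command_writes)
        return false;   /* mapped for reading: reads only */
    return true;
}
```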
--
[open,refpage='clEnqueueSVMMigrateMem',desc='Enqueues a command to indicate which device a set of ranges of SVM allocations should be associated with.',type='protos']
--
To enqueue a command to indicate which device a set of ranges of SVM
allocations should be associated with, call the function
include::{generated}/api/protos/clEnqueueSVMMigrateMem.txt[]
include::{generated}/api/version-notes/clEnqueueSVMMigrateMem.asciidoc[]
* _command_queue_ must be a valid host command-queue.
The specified set of allocation ranges will be migrated to the OpenCL device
associated with _command_queue_.
* _num_svm_pointers_ is the number of pointers in the specified _svm_pointers_
array, and the number of sizes in the _sizes_ array, if _sizes_ is not
`NULL`.
* _svm_pointers_ is a pointer to an array of pointers.
Each pointer in this array must be within an allocation produced by a call
to {clSVMAlloc}.
* _sizes_ is an array of sizes.
The pair _svm_pointers_[i] and _sizes_[i] together define the starting
address and number of bytes in a range to be migrated.
_sizes_ may be `NULL`, indicating that every allocation containing any
_svm_pointers_[i] is to be migrated.
Also, if _sizes_[i] is zero, then the entire allocation containing
_svm_pointers_[i] is migrated.
* _flags_ is a bit-field that is used to specify migration options.
The <<migration-flags-table,Memory Migration Flags>> describes the possible
values for _flags_.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this command and
can be used to query or queue a wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
Once the event returned by {clEnqueueSVMMigrateMem} has become {CL_COMPLETE},
the ranges specified by _svm_pointers_ and _sizes_ have been successfully
migrated to the device associated with _command_queue_.
The user is responsible for managing the event dependencies associated with
this command in order to avoid overlapping access to SVM allocations.
Improperly specified event dependencies passed to {clEnqueueSVMMigrateMem}
could result in undefined results.
// refError
{clEnqueueSVMMigrateMem} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_OPERATION} if the device associated with _command_queue_ does not support SVM.
* {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events
in _event_wait_list_ are not the same.
* {CL_INVALID_VALUE} if _num_svm_pointers_ is zero or _svm_pointers_ is
`NULL`.
* {CL_INVALID_VALUE} if _sizes_[i] is non-zero and the range
[_svm_pointers_[i], _svm_pointers_[i]+_sizes_[i]) is not contained within an
existing {clSVMAlloc} allocation.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or if _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
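
An application that tracks its own {clSVMAlloc} allocations can test the range condition before enqueuing a migration.
The sketch below is plain C and assumes the application has recorded the base address and size of the allocation; the name `range_within_allocation` and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an OpenCL function.
 * alloc_base/alloc_size describe one allocation recorded by the application.
 * ptr/size correspond to svm_pointers[i] and sizes[i].
 * Returns true if ptr lies inside the allocation and, for a non-zero size,
 * the half-open range [ptr, ptr + size) stays inside it as well. */
bool range_within_allocation(const void *alloc_base, size_t alloc_size,
                             const void *ptr, size_t size)
{
    uintptr_t base = (uintptr_t)alloc_base;
    uintptr_t p = (uintptr_t)ptr;

    if (p < base || p - base >= alloc_size)
        return false;               /* pointer is outside the allocation */
    if (size == 0)
        return true;                /* zero selects the entire allocation */
    return size <= alloc_size - (p - base);
}
```

The comparison is written as `size <= alloc_size - offset` rather than `p + size <= base + alloc_size` so that a very large _size_ cannot wrap around.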
--
=== Memory consistency for SVM allocations
To ensure memory consistency in SVM allocations, the program can rely on the
guaranteed memory consistency at synchronization points.
This consistency support already exists in OpenCL 1.x and can be used for
coarse-grained SVM allocations or for fine-grained buffer SVM allocations;
what SVM adds is the ability to share pointers between the host and all SVM
devices.
In addition, sub-buffers can also be used to ensure that each device gets a
consistent view of an SVM buffer's memory when it is shared by multiple
devices.
For example, assume that two devices share a SVM pointer.
The host can create a {cl_mem_TYPE} buffer object using {clCreateBuffer} or
{clCreateBufferWithProperties} with {CL_MEM_USE_HOST_PTR} and _host_ptr_ set
to the SVM pointer and then create two disjoint sub-buffers with starting
virtual addresses _sb1_ptr_ and _sb2_ptr_.
These pointers (_sb1_ptr_ and _sb2_ptr_) can be passed to kernels executing
on the two devices.
{clEnqueueMapBuffer} and {clEnqueueUnmapMemObject} and the existing
<<accessing-mapped-regions, access rules for memory objects>> ensure
consistency for buffer regions (_sb1_ptr_ and _sb2_ptr_) read and written by
these kernels.
When the host and devices are able to use SVM atomic operations (i.e.
{CL_DEVICE_SVM_ATOMICS} is set in {CL_DEVICE_SVM_CAPABILITIES}), these atomic
operations can be used to provide memory consistency at a fine grain in a
shared memory region.
The effect of these operations is visible to the host and all devices with
which that memory is shared.
== Sampler Objects
A sampler object describes how to sample an image when the image is read in
the kernel.
The built-in functions to read from an image in a kernel take a sampler as
an argument.
The sampler arguments to the image read function can be sampler objects
created using OpenCL functions and passed as argument values to the kernel
or can be samplers declared inside a kernel.
In this section we discuss how sampler objects are created using OpenCL
functions.
=== Creating Sampler Objects
[open,refpage='clCreateSamplerWithProperties',desc='Creates a sampler object.',type='protos']
--
To create a sampler object, call the function
include::{generated}/api/protos/clCreateSamplerWithProperties.txt[]
include::{generated}/api/version-notes/clCreateSamplerWithProperties.asciidoc[]
* _context_ must be a valid OpenCL context.
* _sampler_properties_ specifies a list of sampler property names and their
corresponding values.
Each sampler property name is immediately followed by the corresponding
desired value.
The list is terminated with 0.
The list of supported properties is described in the
<<sampler-properties-table,Sampler Properties>> table.
If a supported property and its value are not specified in
_sampler_properties_, its default value will be used.
_sampler_properties_ can be `NULL` in which case the default values for
supported sampler properties will be used.
[[sampler-properties-table]]
.List of supported sampler creation properties by <<clCreateSamplerWithProperties>>
[width="100%",cols="<33%,<17%,<50%",options="header"]
|====
| Sampler Property | Property Value | Description
| {CL_SAMPLER_NORMALIZED_COORDS_anchor}
include::{generated}/api/version-notes/CL_SAMPLER_NORMALIZED_COORDS.asciidoc[]
| {cl_bool_TYPE}
| A boolean value that specifies whether the image coordinates
specified are normalized or not.
The default value (i.e. the value used if this property is not
specified in _sampler_properties_) is {CL_TRUE}.
| {CL_SAMPLER_ADDRESSING_MODE_anchor}
include::{generated}/api/version-notes/CL_SAMPLER_ADDRESSING_MODE.asciidoc[]
| {cl_addressing_mode_TYPE}
| Specifies how out-of-range image coordinates are handled when
reading from an image.
Valid values are:
{CL_ADDRESS_NONE_anchor} - Behavior is undefined for out-of-range
image coordinates.
{CL_ADDRESS_CLAMP_TO_EDGE_anchor} - Out-of-range image coordinates
are clamped to the edge of the image.
{CL_ADDRESS_CLAMP_anchor} - Out-of-range image coordinates are
assigned a border color value.
{CL_ADDRESS_REPEAT_anchor} - Out-of-range image coordinates read
from the image as-if the image data were replicated in all dimensions.
{CL_ADDRESS_MIRRORED_REPEAT_anchor} - Out-of-range image coordinates
read from the image as-if the image data were replicated in all
dimensions, mirroring the image contents at the edge of each
replication.
The default is {CL_ADDRESS_CLAMP}.
| {CL_SAMPLER_FILTER_MODE_anchor}
include::{generated}/api/version-notes/CL_SAMPLER_FILTER_MODE.asciidoc[]
| {cl_filter_mode_TYPE}
| Specifies the type of filter that is applied when reading an
image.
Valid values are:
{CL_FILTER_NEAREST_anchor} - Returns the image element nearest
to the image coordinate.
{CL_FILTER_LINEAR_anchor} - Returns a weighted average of the
four image elements nearest to the image coordinate.
The default value is {CL_FILTER_NEAREST}.
|====
* _errcode_ret_ will return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
// refError
{clCreateSamplerWithProperties} returns a valid non-zero sampler object and
_errcode_ret_ is set to {CL_SUCCESS} if the sampler object is created
successfully.
Otherwise, it returns a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_VALUE} if the property name in _sampler_properties_ is not a
supported property name, if the value specified for a supported property
name is not valid, or if the same property name is specified more than
once.
* {CL_INVALID_OPERATION} if images are not supported by any device
associated with _context_ (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
<<device-queries-table,Device Queries>> table is {CL_FALSE}).
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
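
The layout of _sampler_properties_ (name and value pairs terminated by 0, with defaults for anything omitted) can be illustrated with a small lookup routine.
This is a sketch of how such a list may be interpreted, not the implementation's algorithm.
The type `prop_t` stands in for {cl_sampler_properties_TYPE}, and the `PROP_*` names and their numeric values are placeholders chosen for this example; real code uses the `CL_SAMPLER_*` enumerants from the OpenCL headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t prop_t;  /* stand-in for the sampler property type */

/* Placeholder property names; the values are NOT the real enumerant values. */
enum {
    PROP_NORMALIZED_COORDS = 1,
    PROP_ADDRESSING_MODE = 2,
    PROP_FILTER_MODE = 3
};

/* Looks up name in a 0-terminated list of (name, value) pairs.
 * Stores the listed value in *out, or default_value if the list is NULL
 * or does not mention name. Returns false if name appears more than once. */
bool lookup_property(const prop_t *list, prop_t name,
                     prop_t default_value, prop_t *out)
{
    bool seen = false;

    *out = default_value;
    if (list == NULL)
        return true;
    /* Step two entries at a time so a value of 0 is not taken
     * for the terminator. */
    for (size_t i = 0; list[i] != 0; i += 2) {
        if (list[i] == name) {
            if (seen)
                return false;
            seen = true;
            *out = list[i + 1];
        }
    }
    return true;
}
```

Walking the list in pairs matters because a property value may legitimately be 0 (for example a false boolean), while only a 0 in a name position terminates the list.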
--
[open,refpage='clCreateSampler',desc='Creates a sampler object.',type='protos']
--
To create a sampler object, call the function
include::{generated}/api/protos/clCreateSampler.txt[]
include::{generated}/api/version-notes/clCreateSampler.asciidoc[]
* _context_ must be a valid OpenCL context.
* _normalized_coords_ has the same interpretation as
{CL_SAMPLER_NORMALIZED_COORDS} in the <<sampler-properties-table, sampler
creation properties table>>.
* _addressing_mode_ has the same interpretation as
{CL_SAMPLER_ADDRESSING_MODE} in the <<sampler-properties-table, sampler
creation properties table>>.
* _filter_mode_ has the same interpretation as
{CL_SAMPLER_FILTER_MODE} in the <<sampler-properties-table, sampler
creation properties table>>.
* _errcode_ret_ will return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
// refError
{clCreateSampler} returns a valid non-zero sampler object and _errcode_ret_ is
set to {CL_SUCCESS} if the sampler object is created successfully.
Otherwise, it returns a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_VALUE} if _addressing_mode_, _filter_mode_, _normalized_coords_
or a combination of these arguments is not valid.
* {CL_INVALID_OPERATION} if images are not supported by any device
associated with _context_ (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the
<<device-queries-table,Device Queries>> table is {CL_FALSE}).
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
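
To give an intuition for the addressing modes that remap out-of-range coordinates, the following sketch maps an integer element index onto a row of _width_ elements.
It is a deliberately simplified model on integer indices, not the normative definition, which is expressed in terms of image coordinates.
The enum and function names are hypothetical, and the modes that do not produce an in-range index ({CL_ADDRESS_NONE} and the border color of {CL_ADDRESS_CLAMP}) are left out.

```c
#include <assert.h>

/* Hypothetical labels for three of the addressing modes. */
typedef enum {
    ADDR_CLAMP_TO_EDGE,
    ADDR_REPEAT,
    ADDR_MIRRORED_REPEAT
} addr_mode;

/* Maps a possibly out-of-range index i into [0, width - 1]. */
int wrap_index(int i, int width, addr_mode mode)
{
    assert(width > 0);
    switch (mode) {
    case ADDR_CLAMP_TO_EDGE:
        /* Out-of-range indices stick to the nearest edge element. */
        return i < 0 ? 0 : (i >= width ? width - 1 : i);
    case ADDR_REPEAT: {
        /* The row is tiled: ... 0 1 2 3 0 1 2 3 ... */
        int m = i % width;
        return m < 0 ? m + width : m;
    }
    case ADDR_MIRRORED_REPEAT: {
        /* Every other tile is reversed: ... 0 1 2 3 3 2 1 0 ... */
        int period = 2 * width;
        int m = i % period;
        if (m < 0)
            m += period;
        return m < width ? m : period - 1 - m;
    }
    }
    return 0;
}
```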
--
[open,refpage='clRetainSampler',desc='Increments the sampler reference count.',type='protos']
--
To retain a sampler object, call the function
include::{generated}/api/protos/clRetainSampler.txt[]
include::{generated}/api/version-notes/clRetainSampler.asciidoc[]
* _sampler_ specifies the sampler to be retained.
The _sampler_ reference count is incremented.
{clCreateSamplerWithProperties} and {clCreateSampler} perform an implicit
retain.
// refError
{clRetainSampler} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_SAMPLER} if _sampler_ is not a valid sampler object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clReleaseSampler',desc='Decrements the sampler reference count.',type='protos']
--
To release a sampler object, call the function
include::{generated}/api/protos/clReleaseSampler.txt[]
include::{generated}/api/version-notes/clReleaseSampler.asciidoc[]
* _sampler_ specifies the sampler to be released.
The _sampler_ reference count is decremented.
The sampler object is deleted after the reference count becomes zero and
commands queued for execution on any command-queue that use _sampler_ have
finished.
// refError
{clReleaseSampler} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_SAMPLER} if _sampler_ is not a valid sampler object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
Using this function to release a reference that was not obtained by creating
the object or by calling {clRetainSampler} causes undefined behavior.
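
The interaction between the reference count and outstanding commands can be pictured with a toy model.
Everything in this sketch (the `toy_object` type and the `toy_*` functions) is hypothetical and only mirrors the rules stated above: creation performs an implicit retain, and deletion happens once the count is zero and no queued command still uses the object.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a reference-counted object; not an OpenCL type. */
typedef struct {
    unsigned ref_count;
    unsigned pending_commands;  /* queued commands still using the object */
    bool deleted;
} toy_object;

static void toy_maybe_delete(toy_object *o)
{
    if (o->ref_count == 0 && o->pending_commands == 0)
        o->deleted = true;
}

/* Creation performs the implicit retain. */
void toy_create(toy_object *o)
{
    o->ref_count = 1;
    o->pending_commands = 0;
    o->deleted = false;
}

void toy_retain(toy_object *o)
{
    o->ref_count++;
}

/* Releasing a reference that was never obtained is a caller error. */
void toy_release(toy_object *o)
{
    assert(o->ref_count > 0);
    o->ref_count--;
    toy_maybe_delete(o);
}

void toy_command_finished(toy_object *o)
{
    assert(o->pending_commands > 0);
    o->pending_commands--;
    toy_maybe_delete(o);
}
```

In this model, two releases after one create and one retain bring the count to zero, but the object is only marked deleted once the last pending command has finished.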
--
=== Sampler Object Queries
[open,refpage='clGetSamplerInfo',desc='Returns information about the sampler object.',type='protos']
--
To return information about a sampler object, call the function
include::{generated}/api/protos/clGetSamplerInfo.txt[]
include::{generated}/api/version-notes/clGetSamplerInfo.asciidoc[]
* _sampler_ specifies the sampler being queried.
* _param_name_ specifies the information to query.
The list of supported _param_name_ types and the information returned in
_param_value_ by {clGetSamplerInfo} is described in the
<<sampler-info-table,Sampler Object Queries>> table.
* _param_value_ is a pointer to memory where the appropriate result being
queried is returned.
If _param_value_ is `NULL`, it is ignored.
* _param_value_size_ is used to specify the size in bytes of memory pointed to
by _param_value_.
This size must be {geq} size of return type as described in the
<<sampler-info-table,Sampler Object Queries>> table.
* _param_value_size_ret_ returns the actual size in bytes of data being
queried by _param_name_.
If _param_value_size_ret_ is `NULL`, it is ignored.
[[sampler-info-table]]
.List of supported param_names by <<clGetSamplerInfo>>
[width="100%",cols="<33%,<17%,<50%",options="header"]
|====
| Sampler Info | Return Type | Description
| {CL_SAMPLER_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}]
include::{generated}/api/version-notes/CL_SAMPLER_REFERENCE_COUNT.asciidoc[]
| {cl_uint_TYPE}
| Return the _sampler_ reference count.
| {CL_SAMPLER_CONTEXT_anchor}
include::{generated}/api/version-notes/CL_SAMPLER_CONTEXT.asciidoc[]
| {cl_context_TYPE}
| Return the context specified when the sampler is created.
| {CL_SAMPLER_NORMALIZED_COORDS_anchor}
include::{generated}/api/version-notes/CL_SAMPLER_NORMALIZED_COORDS.asciidoc[]
| {cl_bool_TYPE}
| Return the normalized coords value associated with _sampler_.
| {CL_SAMPLER_ADDRESSING_MODE_anchor}
include::{generated}/api/version-notes/CL_SAMPLER_ADDRESSING_MODE.asciidoc[]
| {cl_addressing_mode_TYPE}
| Return the addressing mode value associated with _sampler_.
| {CL_SAMPLER_FILTER_MODE_anchor}
include::{generated}/api/version-notes/CL_SAMPLER_FILTER_MODE.asciidoc[]
| {cl_filter_mode_TYPE}
| Return the filter mode value associated with _sampler_.
| {CL_SAMPLER_PROPERTIES_anchor}
include::{generated}/api/version-notes/CL_SAMPLER_PROPERTIES.asciidoc[]
| {cl_sampler_properties_TYPE}[]
| Return the properties argument specified in
{clCreateSamplerWithProperties}.
If the _sampler_properties_ argument specified in {clCreateSamplerWithProperties}
used to create _sampler_ was not `NULL`, the implementation must return
the values specified in the properties argument in the same order and
without including additional properties.
If _sampler_ was created using {clCreateSampler}, or if the _sampler_properties_
argument specified in {clCreateSamplerWithProperties} was `NULL`, the
implementation must return _param_value_size_ret_ equal to 0,
indicating that there are no properties to be returned.
|====
// refError
{clGetSamplerInfo} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
specified by _param_value_size_ is < size of return type as described in
the <<sampler-info-table,Sampler Object Queries>> table and
_param_value_ is not `NULL`.
* {CL_INVALID_SAMPLER} if _sampler_ is not a valid sampler object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
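
The contract between _param_value_, _param_value_size_ and _param_value_size_ret_ can be modeled in a few lines.
This is a sketch of the documented behavior as seen by a caller, assuming a result that is already available as a block of bytes; the function name and the `QUERY_*` status codes are hypothetical and do not correspond to OpenCL error values.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes for this model only. */
enum { QUERY_OK = 0, QUERY_INVALID_VALUE = -1 };

/* Models how a query result of result_size bytes is handed back. */
int return_query_result(const void *result, size_t result_size,
                        size_t param_value_size, void *param_value,
                        size_t *param_value_size_ret)
{
    if (param_value != NULL) {
        /* A non-NULL destination must be large enough for the result. */
        if (param_value_size < result_size)
            return QUERY_INVALID_VALUE;
        if (result_size > 0)
            memcpy(param_value, result, result_size);
    }
    /* The size is reported whenever the caller asks for it, which
     * allows a first call with param_value == NULL to size a buffer. */
    if (param_value_size_ret != NULL)
        *param_value_size_ret = result_size;
    return QUERY_OK;
}
```

A result size of 0, as described for {CL_SAMPLER_PROPERTIES} when no properties were supplied, is reported through _param_value_size_ret_ without copying anything.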
--
== Program Objects
An OpenCL program consists of a set of kernels that are identified as
functions declared with the `+__kernel+` qualifier in the program source.
OpenCL programs may also contain auxiliary functions and constant data that
can be used by kernel functions.
The program executable can be generated _online_ or _offline_ by the OpenCL
compiler for the appropriate target device(s).
A program object encapsulates the following information:
* An associated context.
* A program source or binary.
* The latest successfully built program executable, library or compiled
binary, the list of devices for which the program executable, library or
compiled binary is built, the build options used and a build log.
* The number of kernel objects currently attached.
=== Creating Program Objects
[open,refpage='clCreateProgramWithSource',desc='Creates a program object for a context, and loads source code specified by text strings into the program object.',type='protos']
--
To create a program object for a context and load source code into that
object, call the function
include::{generated}/api/protos/clCreateProgramWithSource.txt[]
include::{generated}/api/version-notes/clCreateProgramWithSource.asciidoc[]
* _context_ must be a valid OpenCL context.
* _strings_ is an array of _count_ pointers to optionally null-terminated
character strings that make up the source code.
* _lengths_ is an array with the number of chars in each string
(the string length).
If an element in _lengths_ is zero, its accompanying string is
null-terminated.
If _lengths_ is `NULL`, all strings in the _strings_ argument are considered
null-terminated.
Any length value passed in that is greater than zero excludes the null
terminator in its count.
* _errcode_ret_ will return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
The source code specified by _strings_ will be loaded into the program
object.
The devices associated with the program object are the devices associated
with _context_.
The source code specified by _strings_ is either an OpenCL C program source,
header or implementation-defined source for custom devices that support an
online compiler.
OpenCL {cpp} is not supported as an online-compiled kernel language through
this interface.
// refError
{clCreateProgramWithSource} returns a valid non-zero program object and
_errcode_ret_ is set to {CL_SUCCESS} if the program object is created
successfully.
Otherwise, it returns a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_VALUE} if _count_ is zero or if _strings_ or any entry in
_strings_ is `NULL`.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
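
The rules for _lengths_ determine how many characters of each entry in _strings_ are taken as source.
The following sketch computes the total number of source characters under those rules; the function name `total_source_length` is hypothetical and the code is an illustration, not how an implementation is required to work.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: total number of source characters selected by
 * (count, strings, lengths), following the rules for lengths:
 *  - lengths == NULL: every string is null-terminated
 *  - lengths[i] == 0: strings[i] is null-terminated
 *  - otherwise: lengths[i] characters, not counting any terminator */
size_t total_source_length(size_t count, const char *const *strings,
                           const size_t *lengths)
{
    size_t total = 0;

    for (size_t i = 0; i < count; i++) {
        if (lengths == NULL || lengths[i] == 0)
            total += strlen(strings[i]);
        else
            total += lengths[i];
    }
    return total;
}
```

A non-zero length lets an application pass a slice of a larger buffer without copying it or adding a terminator.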
--
[open,refpage='clCreateProgramWithIL',desc='Creates a program object for a context, and loads the IL into the program object.',type='protos']
--
To create a program object for a context and load code in an intermediate
language into that object, call the function
include::{generated}/api/protos/clCreateProgramWithIL.txt[]
include::{generated}/api/version-notes/clCreateProgramWithIL.asciidoc[]
Also see extension *cl_khr_il_program*.
* _context_ must be a valid OpenCL context.
* _il_ is a pointer to a _length_-byte block of memory containing SPIR-V or an
implementation-defined intermediate language.
* _errcode_ret_ will return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
The intermediate language pointed to by _il_ and with length in bytes _length_
will be loaded into the program object.
The devices associated with the program object are the devices associated
with _context_.
// refError
{clCreateProgramWithIL} returns a valid non-zero program object and
_errcode_ret_ is set to {CL_SUCCESS} if the program object is created
successfully.
Otherwise, it returns a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_OPERATION} if no devices in _context_ support intermediate
language programs.
* {CL_INVALID_VALUE} if _il_ is `NULL` or if _length_ is zero.
* {CL_INVALID_VALUE} if the _length_-byte memory pointed to by _il_ does not
contain well-formed intermediate language input that can be consumed by
the OpenCL runtime.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clCreateProgramWithBinary',desc='Creates a program object for a context, and loads binary bits into the program object.',type='protos']
--
To create a program object for a context and load binary bits into that
object, call the function
include::{generated}/api/protos/clCreateProgramWithBinary.txt[]
include::{generated}/api/version-notes/clCreateProgramWithBinary.asciidoc[]
* _context_ must be a valid OpenCL context.
* _device_list_ is a pointer to a list of devices that are in _context_.
_device_list_ must be a non-`NULL` value.
The binaries are loaded for devices specified in this list.
* _num_devices_ is the number of devices listed in _device_list_.
* _lengths_ is an array of the size in bytes of the program binaries to be
loaded for devices specified by _device_list_.
* _binaries_ is an array of pointers to program binaries to be loaded for
devices specified by _device_list_.
For each device given by _device_list_[i], the pointer to the program binary
for that device is given by _binaries_[i] and the length of this
corresponding binary is given by _lengths_[i].
_lengths_[i] cannot be zero and _binaries_[i] cannot be a `NULL` pointer.
The devices associated with the program object will be the list of devices
specified by _device_list_.
The list of devices specified by _device_list_ must be devices associated
with _context_.
The program binaries specified by _binaries_ will be loaded into the program
object.
They contain bits that describe one of the following:
* a program executable to be run on the device(s) associated with
_context_,
* a compiled program for device(s) associated with _context_, or
* a library of compiled programs for device(s) associated with _context_.
The program binary can consist of either or both of the following:
* Device-specific code.
* Implementation-specific intermediate representation (IR), which will be
converted to device-specific code.
* _binary_status_ returns whether the program binary for each device specified
in _device_list_ was loaded successfully or not.
It is an array of _num_devices_ entries.
_binary_status_[i] returns {CL_SUCCESS} if the binary was successfully loaded
for the device specified by _device_list_[i], {CL_INVALID_VALUE} if
_lengths_[i] is zero or _binaries_[i] is `NULL`, or {CL_INVALID_BINARY} if
the program binary is not a valid binary for the specified device.
If _binary_status_ is `NULL`, it is ignored.
* _errcode_ret_ will return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
OpenCL allows applications to create a program object using the program
source or binary and build appropriate program executables.
This is useful because an application can load program source, then compile
and link it online the first time it runs to generate a program executable
for the appropriate OpenCL devices in the system.
These executables can then be queried and cached by the application.
The cached executables can be read and loaded by the application, which can
help significantly reduce the application initialization time.
// refError
{clCreateProgramWithBinary} returns a valid non-zero program object and
_errcode_ret_ is set to {CL_SUCCESS} if the program object is created
successfully.
Otherwise, it returns a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_VALUE} if _device_list_ is `NULL` or _num_devices_ is zero.
* {CL_INVALID_DEVICE} if any device in _device_list_ is not in
the list of devices associated with _context_.
* {CL_INVALID_VALUE} if _lengths_ or _binaries_ is `NULL` or if any entry
in _lengths_[i] is zero or _binaries_[i] is `NULL`.
* {CL_INVALID_BINARY} if an invalid program binary was encountered for any
device.
_binary_status_ will return specific status for each device.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
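
An application that caches program executables needs to read each cached binary back into memory before it can supply it through _binaries_ and _lengths_.
The sketch below shows one way to do that with the C standard library, assuming the cache is an ordinary seekable file opened in binary mode; the function name `read_binary` is hypothetical and the cache format itself is up to the application.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole stream into a malloc'ed buffer.
 * On success returns the buffer and stores its size in *length_out;
 * the caller frees the buffer. Returns NULL on failure or if the
 * stream is empty, since a zero-length binary is not accepted. */
unsigned char *read_binary(FILE *f, size_t *length_out)
{
    long end;
    unsigned char *data;

    *length_out = 0;
    if (fseek(f, 0, SEEK_END) != 0)
        return NULL;
    end = ftell(f);
    if (end <= 0)
        return NULL;
    if (fseek(f, 0, SEEK_SET) != 0)
        return NULL;
    data = malloc((size_t)end);
    if (data == NULL)
        return NULL;
    if (fread(data, 1, (size_t)end, f) != (size_t)end) {
        free(data);
        return NULL;
    }
    *length_out = (size_t)end;
    return data;
}
```

The returned pointer and length would then be placed in _binaries_[i] and _lengths_[i] for the device the binary was built for.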
--
[open,refpage='clCreateProgramWithBuiltInKernels',desc='Creates a program object for a context, and loads the information related to the built-in kernels into a program object.',type='protos']
--
To create a program object for a context and load the information related
to the built-in kernels into that object, call the function
include::{generated}/api/protos/clCreateProgramWithBuiltInKernels.txt[]
include::{generated}/api/version-notes/clCreateProgramWithBuiltInKernels.asciidoc[]
* _context_ must be a valid OpenCL context.
* _num_devices_ is the number of devices listed in _device_list_.
* _device_list_ is a pointer to a list of devices that are in _context_.
_device_list_ must be a non-`NULL` value.
The built-in kernels are loaded for devices specified in this list.
* _kernel_names_ is a semicolon-separated list of built-in kernel names.
The devices associated with the program object will be the list of devices
specified by _device_list_.
The list of devices specified by _device_list_ must be devices associated
with _context_.
// refError
{clCreateProgramWithBuiltInKernels} returns a valid non-zero program object
and _errcode_ret_ is set to {CL_SUCCESS} if the program object is created
successfully.
Otherwise, it returns a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_VALUE} if _device_list_ is `NULL` or _num_devices_ is zero.
* {CL_INVALID_VALUE} if _kernel_names_ is `NULL` or _kernel_names_ contains
a kernel name that is not supported by any of the devices in
_device_list_.
* {CL_INVALID_DEVICE} if any device in _device_list_ is not in the list
of devices associated with _context_.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
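The _kernel_names_ argument described above is a single string in which built-in kernel names are separated by semi-colons.
The following is a minimal sketch of how an application might assemble such a string from an array of names.
The helper `join_kernel_names` is hypothetical and not part of the OpenCL API, and the kernel names used with it are placeholders; real names would typically come from querying the device.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the OpenCL API): joins `count` kernel
 * names into `out` as a semi-colon separated list.
 * Returns 0 on success, or -1 if an argument is invalid or `out` is too
 * small to hold the joined string and its terminating NUL. */
static int join_kernel_names(const char *const *names, size_t count,
                             char *out, size_t out_size)
{
    size_t used = 0;

    if (names == NULL || out == NULL || out_size == 0)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; i < count; ++i) {
        if (names[i] == NULL)
            return -1;
        size_t len = strlen(names[i]);
        size_t sep = (i > 0) ? 1 : 0;   /* ';' goes between names only */
        if (used + sep + len + 1 > out_size)
            return -1;                  /* no room for name plus NUL */
        if (sep)
            out[used++] = ';';
        memcpy(out + used, names[i], len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}
```

The resulting buffer could then be passed as _kernel_names_ to {clCreateProgramWithBuiltInKernels}.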
=== Retaining and Releasing Program Objects
[open,refpage='clRetainProgram',desc='Increments the program reference count.',type='protos']
--
To retain a program object, call the function
include::{generated}/api/protos/clRetainProgram.txt[]
include::{generated}/api/version-notes/clRetainProgram.asciidoc[]
* _program_ is the program object to be retained.
The _program_ reference count is incremented.
All APIs that create a program do an implicit retain.
// refError
{clRetainProgram} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clReleaseProgram',desc='Decrements the program reference count.',type='protos']
--
To release a program object, call the function
include::{generated}/api/protos/clReleaseProgram.txt[]
include::{generated}/api/version-notes/clReleaseProgram.asciidoc[]
* _program_ is the program object to be released.
The _program_ reference count is decremented.
The program object is deleted after all kernel objects associated with
_program_ have been deleted and the _program_ reference count becomes zero.
// refError
{clReleaseProgram} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
Using this function to release a reference that was not obtained by creating
the object or by calling {clRetainProgram} causes undefined behavior.
--
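The retain and release rules above can be illustrated with a small model.
This is a sketch of the documented bookkeeping only, not the OpenCL implementation: the `model_program` type and all `model_*` functions are hypothetical stand-ins and do not call the OpenCL API.

```c
#include <stdbool.h>

/* Toy stand-in for a program object; NOT the OpenCL implementation. */
typedef struct {
    unsigned ref_count;     /* program reference count */
    unsigned live_kernels;  /* kernel objects still associated with it */
    bool     deleted;
} model_program;

/* Creating a program performs an implicit retain. */
static void model_create(model_program *p)
{
    p->ref_count = 1;
    p->live_kernels = 0;
    p->deleted = false;
}

/* Deletion requires both: no references and no associated kernels. */
static void model_try_delete(model_program *p)
{
    if (p->ref_count == 0 && p->live_kernels == 0)
        p->deleted = true;
}

static void model_retain(model_program *p) { p->ref_count++; }

/* Releasing more often than retained is undefined behavior in the real
 * API; this model simply ignores the extra release. */
static void model_release(model_program *p)
{
    if (p->ref_count > 0)
        p->ref_count--;
    model_try_delete(p);
}

static void model_kernel_created(model_program *p) { p->live_kernels++; }

static void model_kernel_deleted(model_program *p)
{
    if (p->live_kernels > 0)
        p->live_kernels--;
    model_try_delete(p);
}
```

In this model, a program whose reference count has reached zero stays alive until its last kernel object is deleted, which mirrors the deletion rule stated for {clReleaseProgram}.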
[open,refpage='clSetProgramReleaseCallback',desc='Registers a destructor callback function with a program object.',type='protos']
--
To register a callback function with a program object that is called when
the program object is destroyed, call the function
include::{generated}/api/protos/clSetProgramReleaseCallback.txt[]
include::{generated}/api/version-notes/clSetProgramReleaseCallback.asciidoc[]
* _program_ specifies the program object to register the callback to.
* _pfn_notify_ is the callback function to register.
This callback function may be called asynchronously by the OpenCL
implementation.
It is the application's responsibility to ensure that the callback function
is thread safe.
The parameters to this callback function are:
** _program_ is the program being deleted.
When the callback function is called by the implementation, this program
object is no longer valid.
_program_ is only provided for reference purposes.
** _user_data_ is a pointer to user supplied data.
* _user_data_ will be passed as the _user_data_ argument when _pfn_notify_ is
called.
_user_data_ can be `NULL`.
Each call to {clSetProgramReleaseCallback} registers the specified
callback function on a callback stack associated with _program_.
The registered callback functions are called in the reverse order in
which they were registered.
The registered callback functions are called after destructors (if any) for
program scope global variables (if any) are called and before the program
object is deleted.
This provides a mechanism for an application to be notified when destructors
for program scope global variables are complete.
{clSetProgramReleaseCallback} may unconditionally return an error if no
devices in the context associated with _program_ support destructors for
program scope global variables.
Support for constructors and destructors for program scope global variables
is required only for OpenCL 2.2 devices.
// refError
{clSetProgramReleaseCallback} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
* {CL_INVALID_OPERATION} if no devices in the context associated with
_program_ support destructors for program scope global variables.
* {CL_INVALID_VALUE} if _pfn_notify_ is `NULL`.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
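The callback stack described above behaves as last-in, first-out: the callback registered last runs first.
The sketch below models only that ordering.
It is not the OpenCL implementation; the `model_*` names are hypothetical, and the callback signature is simplified to take just _user_data_ (the real callback also receives the program object).

```c
#include <stddef.h>

#define MODEL_MAX_CALLBACKS 8

/* Simplified callback type for this model (assumption: user_data only). */
typedef void (*model_notify_fn)(void *user_data);

typedef struct {
    model_notify_fn fn[MODEL_MAX_CALLBACKS];
    void           *user_data[MODEL_MAX_CALLBACKS];
    size_t          count;
} model_callback_stack;

/* Pushes a callback; returns 0 on success, -1 if fn is NULL or the
 * stack is full. */
static int model_register(model_callback_stack *s, model_notify_fn fn,
                          void *user_data)
{
    if (fn == NULL || s->count == MODEL_MAX_CALLBACKS)
        return -1;
    s->fn[s->count] = fn;
    s->user_data[s->count] = user_data;
    s->count++;
    return 0;
}

/* Pops and invokes callbacks, most recently registered first. */
static void model_run_on_destroy(model_callback_stack *s)
{
    while (s->count > 0) {
        s->count--;
        s->fn[s->count](s->user_data[s->count]);
    }
}

/* Example callback: appends the character passed via user_data to a log. */
static char   model_log[MODEL_MAX_CALLBACKS + 1];
static size_t model_log_len;

static void model_log_char(void *user_data)
{
    if (model_log_len < MODEL_MAX_CALLBACKS) {
        model_log[model_log_len++] = *(const char *)user_data;
        model_log[model_log_len] = '\0';
    }
}
```

Registering `model_log_char` three times with the characters `A`, `B` and `C` and then running the stack records them in the order `C`, `B`, `A`.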
=== Setting SPIR-V specialization constants
NOTE: Specialization constants are <<unified-spec, missing before>> version
2.2.
[open,refpage='clSetProgramSpecializationConstant',desc='',type='protos']
--
To set the value of a specialization constant, call the function
include::{generated}/api/protos/clSetProgramSpecializationConstant.txt[]
include::{generated}/api/version-notes/clSetProgramSpecializationConstant.asciidoc[]
* _program_ must be a valid OpenCL program created from an intermediate
language (e.g. SPIR-V).
* _spec_id_ identifies the specialization constant whose value will be
set.
* _spec_size_ specifies the size in bytes of the data pointed to by
_spec_value_.
This should be 1 for boolean constants.
For all other constant types this should match the size of the
specialization constant in the module.
* _spec_value_ is a pointer to the memory location that contains the value of
the specialization constant.
The data pointed to by _spec_value_ are copied and can be safely reused by
the application after {clSetProgramSpecializationConstant} returns.
This specialization value will be used by subsequent calls to
{clBuildProgram} until another call to {clSetProgramSpecializationConstant}
changes it.
If a specialization constant is a boolean constant, _spec_value_ should be a
pointer to a {cl_uchar_TYPE} value.
A value of zero will set the specialization constant to false; any other
value will set it to true.
Calling this function multiple times for the same specialization constant
shall cause the last provided value to override any previously specified
value.
The values are used by a subsequent {clBuildProgram} call for the _program_.
The application is not required to provide values for every specialization
constant contained in the module. If a value is not set by this API
call, the default value will be used during the build.
// refError
{clSetProgramSpecializationConstant} returns {CL_SUCCESS} if the function is
executed successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_PROGRAM} if _program_ is not a valid program object created
from an intermediate language (e.g. SPIR-V), or if the intermediate
language does not support specialization constants.
* {CL_INVALID_OPERATION} if no devices associated with _program_ support
intermediate language programs.
* {CL_COMPILER_NOT_AVAILABLE} if _program_ is created with
{clCreateProgramWithIL} and a compiler is not
available, i.e. {CL_DEVICE_COMPILER_AVAILABLE} specified in the
<<device-queries-table,Device Queries>> table is set to {CL_FALSE}.
* {CL_INVALID_SPEC_ID} if _spec_id_ is not a valid specialization constant
identifier.
* {CL_INVALID_VALUE} if _spec_size_ does not match the size of the
specialization constant in the module, or if _spec_value_ is
`NULL`.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
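The value semantics described above (data is copied, the last value set wins, and unset constants fall back to their defaults) can be sketched with a small table.
This is a model under stated assumptions, not the OpenCL implementation: the `model_spec_constant` type, the `model_*` functions, their return codes, and the 8-byte size limit are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model of one specialization constant; NOT the OpenCL implementation. */
typedef struct {
    uint32_t spec_id;
    size_t   size;              /* size in bytes (1 for a boolean) */
    uint8_t  default_value[8];
    uint8_t  value[8];
    int      is_set;
} model_spec_constant;

/* Copies the caller's data, so the caller may reuse its buffer afterwards.
 * Returns 0 on success, -1 for an unknown spec_id, and -2 for a NULL
 * value or a size that does not match the constant. */
static int model_set_spec(model_spec_constant *table, size_t n,
                          uint32_t spec_id, size_t spec_size,
                          const void *spec_value)
{
    for (size_t i = 0; i < n; ++i) {
        if (table[i].spec_id != spec_id)
            continue;
        if (spec_value == NULL || spec_size != table[i].size ||
            spec_size > sizeof table[i].value)
            return -2;
        memcpy(table[i].value, spec_value, spec_size);  /* last call wins */
        table[i].is_set = 1;
        return 0;
    }
    return -1;
}

/* Value a later build would see: the last value set, else the default. */
static const uint8_t *model_build_value(const model_spec_constant *c)
{
    return c->is_set ? c->value : c->default_value;
}
```

Setting the same constant twice and then modifying the caller's buffer leaves the second value in place, which corresponds to the override and copy behavior described for {clSetProgramSpecializationConstant}.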
=== Building Program Executables
[open,refpage='clBuildProgram',desc='Builds (compiles and links) a program executable from the program source or binary.',type='protos']
--
To build (compile & link) a program executable, call the function
include::{generated}/api/protos/clBuildProgram.txt[]
include::{generated}/api/version-notes/clBuildProgram.asciidoc[]
* _program_ is the program object.
* _device_list_ is a pointer to a list of devices associated with _program_.
If _device_list_ is a `NULL` value, the program executable is built for all
devices associated with _program_ for which a source or binary has been
loaded.
If _device_list_ is a non-`NULL` value, the program executable is built for
devices specified in this list for which a source or binary has been loaded.
* _num_devices_ is the number of devices listed in _device_list_.
* _options_ is a pointer to a null-terminated string of characters that
describes the build options to be used for building the program executable.
The list of supported options is described in <<compiler-options, Compiler
Options>>.
If the program was created using {clCreateProgramWithBinary} and _options_
is a `NULL` pointer, the program will be built as if _options_ were the same
as when the program binary was originally built.
If the program was created using {clCreateProgramWithBinary} and the
_options_ string contains anything other than the same options in the same order
(whitespace ignored) as when the program binary was originally built, then
the behavior is implementation defined.
Otherwise, if _options_ is a `NULL` pointer then it will have the same
result as the empty string.
* _pfn_notify_ is a function pointer to a notification routine.
The notification routine is a callback function that an application can
register and which will be called when the program executable has been built
(successfully or unsuccessfully).
If _pfn_notify_ is not `NULL`, {clBuildProgram} does not need to wait for
the build to complete and can return immediately once the build operation
can begin.
Any state changes of the program object that result from calling
{clBuildProgram} (e.g. build status or log) will be observable from this
callback function.
The build operation can begin if the context, program whose sources are
being compiled and linked, list of devices and build options specified are
all valid and appropriate host and device resources needed to perform the
build are available.
If _pfn_notify_ is `NULL`, {clBuildProgram} does not return until the build
has completed.
This callback function may be called asynchronously by the OpenCL
implementation.
It is the application's responsibility to ensure that the callback function
is thread-safe.
** _user_data_ will be passed as an argument when _pfn_notify_ is called.
_user_data_ can be `NULL`.
The program executable is built from the program source or binary for all
the devices, or a specific device(s) in the OpenCL context associated with
_program_.
OpenCL allows program executables to be built using the source or the
binary.
{clBuildProgram} must be called for _program_ created using
{clCreateProgramWithSource}, {clCreateProgramWithIL} or
{clCreateProgramWithBinary} to build the program executable for one or more
devices associated with _program_.
If _program_ is created with {clCreateProgramWithBinary}, then the program
binary must be an executable binary (not a compiled binary or library).
The executable binary can be queried using {clGetProgramInfo}(_program_,
{CL_PROGRAM_BINARIES}, ...) and can be specified to
{clCreateProgramWithBinary} to create a new program object.
// refError
{clBuildProgram} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
* {CL_INVALID_VALUE} if _device_list_ is `NULL` and _num_devices_ is greater
than zero, or if _device_list_ is not `NULL` and _num_devices_ is zero.
* {CL_INVALID_VALUE} if _pfn_notify_ is `NULL` but _user_data_ is not
`NULL`.
* {CL_INVALID_DEVICE} if any device in _device_list_ is not in
the list of devices associated with _program_.
* {CL_INVALID_BINARY} if _program_ is created with
{clCreateProgramWithBinary} and devices listed in _device_list_ do not
have a valid program binary loaded.
* {CL_INVALID_BUILD_OPTIONS} if the build options specified by _options_ are
invalid.
* {CL_COMPILER_NOT_AVAILABLE} if _program_ is created with
{clCreateProgramWithSource} or {clCreateProgramWithIL} and a compiler is
not available, i.e. {CL_DEVICE_COMPILER_AVAILABLE} specified in the
<<device-queries-table,Device Queries>> table is set to {CL_FALSE}.
* {CL_BUILD_PROGRAM_FAILURE} if there is a failure to build the program
executable.
This error will be returned if {clBuildProgram} does not return until
the build has completed.
* {CL_INVALID_OPERATION} if the build of a program executable for any of the
devices listed in _device_list_ by a previous call to {clBuildProgram}
for _program_ has not completed.
* {CL_INVALID_OPERATION} if there are kernel objects attached to _program_.
* {CL_INVALID_OPERATION} if _program_ was not created with
{clCreateProgramWithSource}, {clCreateProgramWithIL} or
{clCreateProgramWithBinary}.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
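Several of the {CL_INVALID_VALUE} conditions above depend only on how the arguments are combined, so an application can check them before calling {clBuildProgram}.
The following sketch encodes just those rules.
The helper `build_args_plausible` is hypothetical; it takes a flag in place of the _pfn_notify_ function pointer and does not validate the devices or the program itself.

```c
#include <stddef.h>

/* Hypothetical pre-check (not part of the OpenCL API).
 * Returns 1 if the argument combination passes the CL_INVALID_VALUE rules
 * listed for clBuildProgram, 0 otherwise. Pointers are only compared with
 * NULL and never dereferenced. `has_notify` is nonzero when a pfn_notify
 * callback would be supplied. */
static int build_args_plausible(size_t num_devices, const void *device_list,
                                int has_notify, const void *user_data)
{
    if (device_list == NULL && num_devices > 0)
        return 0;   /* a count without a list */
    if (device_list != NULL && num_devices == 0)
        return 0;   /* a list without a count */
    if (!has_notify && user_data != NULL)
        return 0;   /* user_data without a callback */
    return 1;
}
```

Passing `0` and `NULL` for the device arguments is accepted by this check, matching the case where the program is built for all devices associated with _program_.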
=== Separate Compilation and Linking of Programs
NOTE: Separate compilation and linking are <<unified-spec, missing before>>
version 1.2.
OpenCL programs are compiled and linked to support the following:
* Separate compilation and link stages.
Program sources can be compiled to generate a compiled binary object and
linked in a separate stage with other compiled program objects to the
program executable.
* Embedded headers.
In OpenCL 1.0 and 1.1, the `-I` build option could be used to specify the
list of directories to be searched for header files that are included
by a program source(s).
OpenCL 1.2 extends this by allowing the header sources to come from
program objects instead of just header files.
* Libraries.
The linker can be used to link compiled objects and libraries into a
program executable or to create a library of compiled binaries.
[open,refpage='clCompileProgram',desc='Compiles a program\'s source for all the devices or a specific device(s) in the OpenCL context associated with a program.',type='protos']
--
To compile a program's source for all the devices or a specific device(s) in
the OpenCL context associated with the program, call the function
include::{generated}/api/protos/clCompileProgram.txt[]
include::{generated}/api/version-notes/clCompileProgram.asciidoc[]
* _program_ is the program object that is the compilation target.
* _device_list_ is a pointer to a list of devices associated with _program_.
If _device_list_ is a `NULL` value, the compile is performed for all devices
associated with _program_.
If _device_list_ is a non-`NULL` value, the compile is performed for devices
specified in this list.
* _num_devices_ is the number of devices listed in _device_list_.
* _options_ is a pointer to a null-terminated string of characters that
describes the compilation options to be used for building the program
executable.
If _options_ is a `NULL` pointer then it will have the same result as the
empty string.
Certain options are ignored when the program is created with IL.
The list of supported options is as described in <<compiler-options,
Compiler Options>>.
* _num_input_headers_ specifies the number of programs that describe headers
in the array referenced by _input_headers_.
* _input_headers_ is an array of program embedded headers created with
{clCreateProgramWithSource}.
* _header_include_names_ is an array that has a one to one correspondence with
_input_headers_.
Each entry in _header_include_names_ specifies the include name used by
source in _program_ that comes from an embedded header.
The corresponding entry in _input_headers_ identifies the program object
which contains the header source to be used.
The embedded headers are searched first, before the headers in the list of
directories specified by the `-I` compile option (as described in
<<preprocessor-options, Preprocessor options>>).
If multiple entries in _header_include_names_ refer to the same header name,
the first one encountered will be used.
* _pfn_notify_ is a function pointer to a notification routine.
The notification routine is a callback function that an application can
register and which will be called when the program executable has been built
(successfully or unsuccessfully).
If _pfn_notify_ is not `NULL`, {clCompileProgram} does not need to wait for
the compiler to complete and can return immediately once the compilation can
begin.
Any state changes of the program object that result from calling
{clCompileProgram} (e.g. compile status or log) will be observable from this
callback function.
The compilation can begin if the context, program whose sources are being
compiled, list of devices, input headers, programs that describe input
headers and compiler options specified are all valid and appropriate host
and device resources needed to perform the compile are available.
If _pfn_notify_ is `NULL`, {clCompileProgram} does not return until the
compiler has completed.
This callback function may be called asynchronously by the OpenCL
implementation.
It is the application's responsibility to ensure that the callback function
is thread-safe.
** _user_data_ will be passed as an argument when _pfn_notify_ is called.
_user_data_ can be `NULL`.
The pre-processor runs before the program sources are compiled.
The compiled binary is built for all devices associated with _program_ or
the list of devices specified.
The compiled binary can be queried using {clGetProgramInfo}(_program_,
{CL_PROGRAM_BINARIES}, ...) and can be passed to {clCreateProgramWithBinary}
to create a new program object.
If _program_ was created using {clCreateProgramWithIL}, then
_num_input_headers_, _input_headers_, and _header_include_names_ are
ignored.
For example, consider the following program source:
[source,c]
----
#include <foo.h>
#include <mydir/myinc.h>
__kernel void
image_filter (int n, int m,
__constant float *filter_weights,
__read_only image2d_t src_image,
__write_only image2d_t dst_image)
{
...
}
----
This kernel includes two headers foo.h and mydir/myinc.h.
The following describes how these headers can be passed as embedded headers
in program objects:
[source,c]
----
cl_program foo_pg = clCreateProgramWithSource(context,
1, &foo_header_src, NULL, &err);
cl_program myinc_pg = clCreateProgramWithSource(context,
1, &myinc_header_src, NULL, &err);
// let's assume the program source described above is given
// by program_A and is loaded via clCreateProgramWithSource
cl_program input_headers[2] = { foo_pg, myinc_pg };
const char * input_header_names[2] = { "foo.h", "mydir/myinc.h" };
clCompileProgram(program_A,
0, NULL, // num_devices & device_list
NULL, // compile_options
2, // num_input_headers
input_headers,
input_header_names,
NULL, NULL); // pfn_notify & user_data
----
// refError
{clCompileProgram} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
* {CL_INVALID_VALUE} if _device_list_ is `NULL` and _num_devices_ is greater
than zero, or if _device_list_ is not `NULL` and _num_devices_ is zero.
* {CL_INVALID_VALUE} if _num_input_headers_ is zero and
_header_include_names_ or _input_headers_ are not `NULL` or if
_num_input_headers_ is not zero and _header_include_names_ or
_input_headers_ are `NULL`.
* {CL_INVALID_VALUE} if _pfn_notify_ is `NULL` but _user_data_ is not
`NULL`.
* {CL_INVALID_DEVICE} if any device in _device_list_ is not in
the list of devices associated with _program_.
* {CL_INVALID_COMPILER_OPTIONS} if the compiler options specified by
_options_ are invalid.
* {CL_INVALID_OPERATION} if the compilation or build of a program executable
for any of the devices listed in _device_list_ by a previous call to
{clCompileProgram} or {clBuildProgram} for _program_ has not completed.
* {CL_COMPILER_NOT_AVAILABLE} if a compiler is not available, i.e.
{CL_DEVICE_COMPILER_AVAILABLE} specified in the
<<device-queries-table,Device Queries>> table is set to {CL_FALSE}.
* {CL_COMPILE_PROGRAM_FAILURE} if there is a failure to compile the program
source.
This error will be returned if {clCompileProgram} does not return until
the compile has completed.
* {CL_INVALID_OPERATION} if there are kernel objects attached to _program_.
* {CL_INVALID_OPERATION} if _program_ has no source or IL available, i.e. it
has not been created with {clCreateProgramWithSource} or
{clCreateProgramWithIL}.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
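The lookup rule for embedded headers described above (the first matching entry in _header_include_names_ is used) can be sketched as a linear search.
The helper `find_embedded_header` is hypothetical, and matching names by exact string comparison is an assumption of this sketch; how an implementation compares include names is not stated here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the OpenCL API).
 * Returns the index of the first entry in header_include_names equal to
 * include_name, or -1 if none matches. On -1 a real compiler would go on
 * to search the directories given with -I. */
static int find_embedded_header(const char *const *header_include_names,
                                size_t num_input_headers,
                                const char *include_name)
{
    for (size_t i = 0; i < num_input_headers; ++i) {
        if (strcmp(header_include_names[i], include_name) == 0)
            return (int)i;  /* first match wins, later duplicates ignored */
    }
    return -1;
}
```

The returned index identifies the corresponding entry in _input_headers_, since the two arrays have a one to one correspondence.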
[open,refpage='clLinkProgram',desc='Links a set of compiled program objects and libraries for all the devices or a specific device(s) in the OpenCL context and creates a library or executable.',type='protos']
--
To link a set of compiled program objects and libraries for all the devices
or a specific device(s) in the OpenCL context and create a library or
executable, call the function
include::{generated}/api/protos/clLinkProgram.txt[]
include::{generated}/api/version-notes/clLinkProgram.asciidoc[]
* _context_ must be a valid OpenCL context.
* _device_list_ is a pointer to a list of devices that are in _context_.
If _device_list_ is a `NULL` value, the link is performed for all devices
associated with _context_ for which a compiled object is available.
If _device_list_ is a non-`NULL` value, the link is performed for devices
specified in this list for which a compiled object is available.
* _num_devices_ is the number of devices listed in _device_list_.
* _options_ is a pointer to a null-terminated string of characters that
describes the link options to be used for building the program executable.
The list of supported options is as described in <<linker-options,Linker
Options>>.
If the program was created using {clCreateProgramWithBinary} and _options_
is a `NULL` pointer, the program will be linked as if _options_ were the
same as when the program binary was originally built.
If the program was created using {clCreateProgramWithBinary} and the
_options_ string contains anything other than the same options in the same order
(whitespace ignored) as when the program binary was originally built, then
the behavior is implementation defined.
Otherwise, if _options_ is a `NULL` pointer then it will have the same
result as the empty string.
* _num_input_programs_ specifies the number of programs in array referenced by
_input_programs_.
* _input_programs_ is an array of program objects that are compiled binaries
or libraries that are to be linked to create the program executable.
For each device in _device_list_ or, if _device_list_ is `NULL`, in the list
of devices associated with _context_, the following cases occur:
** All programs specified by _input_programs_ contain a compiled binary or
library for the device.
In this case, a link is performed to generate a program executable for
this device.
** None of the programs contain a compiled binary or library for that
device.
In this case, no link is performed and there will be no program
executable generated for this device.
** All other cases will return a {CL_INVALID_OPERATION} error.
* _pfn_notify_ is a function pointer to a notification routine.
The notification routine is a callback function that an application can
register and which will be called when the program executable has been built
(successfully or unsuccessfully).
** _user_data_ will be passed as an argument when _pfn_notify_ is called.
_user_data_ can be `NULL`.
If _pfn_notify_ is not `NULL`, {clLinkProgram} does not need to wait for the
linker to complete, and can return immediately once the linking operation can
begin.
Once the linker has completed, the _pfn_notify_ callback function is called
with the program object returned by {clLinkProgram}.
Any state changes of the program object that result from calling {clLinkProgram}
(e.g. link status or log) will be observable from this callback function.
This callback function may be called asynchronously by the OpenCL
implementation.
It is the application's responsibility to ensure that the callback function
is thread-safe.
If _pfn_notify_ is `NULL`, {clLinkProgram} does not return until the linker
has completed.
{clLinkProgram} creates a new program object which contains the library or
executable.
The library or executable binary can be queried using
{clGetProgramInfo}(_program_, {CL_PROGRAM_BINARIES}, ...) and can be specified
to {clCreateProgramWithBinary} to create a new program object.
The devices associated with the returned program object will be the list of
devices specified by _device_list_ or if _device_list_ is `NULL` it will be
the list of devices associated with _context_.
The linking operation can begin if the context, list of devices, input
programs and linker options specified are all valid and appropriate host and
device resources needed to perform the link are available.
If the linking operation can begin, {clLinkProgram} returns a valid non-zero
program object.
// refError
If _pfn_notify_ is `NULL`, the _errcode_ret_ will be set to {CL_SUCCESS} if
the link operation was successful and {CL_LINK_PROGRAM_FAILURE} if there is a
failure to link the compiled binaries and/or libraries.
If _pfn_notify_ is not `NULL`, {clLinkProgram} does not have to wait for
the linker to complete and can return {CL_SUCCESS} in _errcode_ret_ if the
linking operation can begin.
The _pfn_notify_ callback function will return a {CL_SUCCESS} or
{CL_LINK_PROGRAM_FAILURE} if the linking operation was successful or not.
Otherwise {clLinkProgram} returns a `NULL` program object with an
appropriate error in _errcode_ret_.
The application should query the linker status of this program object to
check if the link was successful or not.
The list of errors that can be returned are:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_INVALID_VALUE} if _device_list_ is `NULL` and _num_devices_ is greater
than zero, or if _device_list_ is not `NULL` and _num_devices_ is zero.
* {CL_INVALID_VALUE} if _num_input_programs_ is zero or if _input_programs_
is `NULL`.
* {CL_INVALID_PROGRAM} if programs specified in _input_programs_ are not
valid program objects.
* {CL_INVALID_VALUE} if _pfn_notify_ is `NULL` but _user_data_ is not
`NULL`.
* {CL_INVALID_DEVICE} if any device in _device_list_ is not in
the list of devices associated with _context_.
* {CL_INVALID_LINKER_OPTIONS} if the linker options specified by _options_
are invalid.
* {CL_INVALID_OPERATION} if the compilation or build of a program executable
for any of the devices listed in _device_list_ by a previous call to
{clCompileProgram} or {clBuildProgram} for _program_ has not completed.
* {CL_INVALID_OPERATION} if the rules for devices containing compiled
binaries or libraries as described in _input_programs_ argument above
are not followed.
* {CL_LINKER_NOT_AVAILABLE} if a linker is not available, i.e.
{CL_DEVICE_LINKER_AVAILABLE} specified in the
<<device-queries-table,Device Queries>> table is set to {CL_FALSE}.
* {CL_LINK_PROGRAM_FAILURE} if there is a failure to link the compiled
binaries and/or libraries.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
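The per-device cases listed for _input_programs_ above reduce to a three-way decision: link, skip, or report an error.
The sketch below expresses that decision for a single device.
The `model_link_decision` type and `model_link_rule` function are hypothetical, and the sketch assumes _num_input_programs_ is greater than zero (zero inputs are rejected separately with {CL_INVALID_VALUE}).

```c
#include <stddef.h>

typedef enum {
    MODEL_LINK_PERFORM,  /* every input has a compiled binary or library */
    MODEL_LINK_SKIP,     /* no input has one: no executable for the device */
    MODEL_LINK_INVALID   /* mixed case: corresponds to CL_INVALID_OPERATION */
} model_link_decision;

/* Hypothetical helper (not part of the OpenCL API).
 * has_binary[i] is nonzero if input program i holds a compiled binary or
 * library for the device under consideration.
 * Assumes num_input_programs > 0. */
static model_link_decision model_link_rule(const int *has_binary,
                                           size_t num_input_programs)
{
    size_t with = 0;

    for (size_t i = 0; i < num_input_programs; ++i) {
        if (has_binary[i])
            with++;
    }
    if (with == num_input_programs)
        return MODEL_LINK_PERFORM;
    if (with == 0)
        return MODEL_LINK_SKIP;
    return MODEL_LINK_INVALID;
}
```

An application could run a check like this per device before calling {clLinkProgram} to anticipate which devices will receive a program executable.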
[[compiler-options]]
=== Compiler Options
The compiler options are categorized as pre-processor options, options for
math intrinsics, options that control optimization and miscellaneous
options.
This specification defines a standard set of options that must be supported
by the compiler when building program executables online or offline from
OpenCL C/{cpp} or, where relevant, from an IL.
These may be extended by a set of vendor- or platform-specific options.
[[preprocessor-options]]
==== Preprocessor options
These options control the OpenCL C/{cpp} preprocessor which is run on each
program source before actual compilation.
These options are ignored for programs created with IL.
`-D name` ::
Predefine _name_ as a macro, with definition 1.
`-D name=definition` ::
The contents of _definition_ are tokenized and processed as if they
appeared during translation phase three in a `#define` directive.
In particular, the definition will be truncated by embedded newline
characters.
+
--
`-D` options are processed in the order they are given in the _options_
argument to {clBuildProgram} or {clCompileProgram}.
Note that a space is required between the `-D` option and the symbol it
defines, otherwise behavior is implementation defined.
--
`-I dir` ::
Add the directory _dir_ to the list of directories to be searched for
header files.
_dir_ can optionally be enclosed in double quotes.
+
--
This option is not portable due to its dependency on host file system and
host operating system.
It is supported for backwards compatibility with previous OpenCL versions.
Developers are encouraged to create and use explicit header objects by means
of {clCompileProgram} followed by {clLinkProgram}.
--
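Because a space is required between `-D` and the symbol it defines, it can help to build these options in one place.
The following is a minimal sketch of such a formatter.
The helper `format_define` is hypothetical and not part of the OpenCL API; it does not quote or escape the definition.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of the OpenCL API).
 * Writes "-D name=definition" into out, or "-D name" when definition is
 * NULL. Returns 0 on success, -1 on invalid input or truncation. */
static int format_define(char *out, size_t out_size,
                         const char *name, const char *definition)
{
    int n;

    if (out == NULL || out_size == 0 || name == NULL || name[0] == '\0')
        return -1;
    if (definition != NULL)
        n = snprintf(out, out_size, "-D %s=%s", name, definition);
    else
        n = snprintf(out, out_size, "-D %s", name);
    /* snprintf reports the length it wanted; reject truncated output. */
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

The formatted string can be used as, or appended to, the _options_ argument of {clBuildProgram} or {clCompileProgram}.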
[[math-intrinsics-options]]
==== Math Intrinsics Options
These options control compiler behavior regarding floating-point arithmetic.
These options trade off between speed and correctness.
`-cl-single-precision-constant` ::
Treat double precision floating-point constants as single precision
constants.
This option is ignored for programs created with IL.
`-cl-denorms-are-zero` ::
This option controls how single precision and double precision
denormalized numbers are handled.
If specified as a build option, the single precision denormalized
numbers may be flushed to zero; double precision denormalized numbers
may also be flushed to zero if the optional extension for double
precision is supported.
This is intended to be a performance hint and the OpenCL compiler can
choose not to flush denorms to zero if the device supports single
precision (or double precision) denormalized numbers.
+
--
This option is ignored for single precision numbers if the device does not
support single precision denormalized numbers i.e. {CL_FP_DENORM} bit is not
set in {CL_DEVICE_SINGLE_FP_CONFIG}.
This option is ignored for double precision numbers if the device does not
support double precision or if it does support double precision but not
double precision denormalized numbers i.e. {CL_FP_DENORM} bit is not set in
{CL_DEVICE_DOUBLE_FP_CONFIG}.
This flag only applies for scalar and vector single precision floating-point
variables and computations on these floating-point variables inside a
program.
It does not apply to reading from or writing to image objects.
--
`-cl-fp32-correctly-rounded-divide-sqrt` ::
The `-cl-fp32-correctly-rounded-divide-sqrt` build option to
{clBuildProgram} or {clCompileProgram} allows an application to specify
that single precision floating-point divide (x/y and 1/x) and sqrt used
in the program source are correctly rounded.
If this build option is not specified, the minimum numerical accuracy of
single precision floating-point divide and sqrt are as defined in the
OpenCL C or OpenCL SPIR-V Environment specifications.
+
--
This build option can only be specified if the
{CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT} bit is set in {CL_DEVICE_SINGLE_FP_CONFIG} (as
defined in the <<device-queries-table,Device Queries>> table) for the devices
for which the program is being built.
{clBuildProgram} or {clCompileProgram} will fail to compile the program for
a device if the `-cl-fp32-correctly-rounded-divide-sqrt` option is specified
and {CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT} is not set for the device.
Note: This option is <<unified-spec, missing before>> version 1.2.
--
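The device-capability rules above can be sketched as a small host-side helper that only emits options the device can honor.
This is a minimal illustration, not part of the API: the helper name `append_fp_options` is hypothetical, and the `FP_*` constants are local stand-ins for the `CL_FP_DENORM` and `CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT` bits, so that the sketch compiles without the OpenCL headers.
Real host code would include the OpenCL headers and obtain the bit-field by querying {CL_DEVICE_SINGLE_FP_CONFIG}.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Local stand-ins for the cl_device_fp_config bits (assumption: real code
 * would take these from the OpenCL headers instead). */
#define FP_DENORM                        (1u << 0)
#define FP_CORRECTLY_ROUNDED_DIVIDE_SQRT (1u << 7)

/* Hypothetical helper: writes the floating-point build options that are
 * meaningful for a device with the given single precision fp_config.
 * Returns 0 on success, -1 if the output buffer is too small. */
int append_fp_options(unsigned fp_config, char *out, size_t out_size)
{
    int n;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    /* Passing this option without the capability bit makes the build fail
     * for that device, so it is only added when the bit is set. */
    if (fp_config & FP_CORRECTLY_ROUNDED_DIVIDE_SQRT) {
        n = snprintf(out, out_size, "%s",
                     "-cl-fp32-correctly-rounded-divide-sqrt");
        if (n < 0 || (size_t)n >= out_size)
            return -1;
    }

    /* -cl-denorms-are-zero is ignored for single precision when the device
     * has no denorm support, so it is skipped in that case. */
    if (fp_config & FP_DENORM) {
        size_t len = strlen(out);
        n = snprintf(out + len, out_size - len, "%s-cl-denorms-are-zero",
                     len ? " " : "");
        if (n < 0 || (size_t)n >= out_size - len)
            return -1;
    }
    return 0;
}
```

The resulting string would then be passed as the _options_ argument of {clBuildProgram} or {clCompileProgram}.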
[[optimization-options]]
==== Optimization Options
These options control various sorts of optimizations.
Turning on optimization flags makes the compiler attempt to improve the
performance and/or code size at the expense of compilation time and possibly
the ability to debug the program.
`-cl-opt-disable` ::
This option disables all optimizations.
Optimizations are enabled by default.
`-cl-strict-aliasing` ::
This option allows the compiler to assume the strictest aliasing rules.
+
--
Note: This option is <<unified-spec, deprecated by>> version 1.1.
--
`-cl-uniform-work-group-size` ::
This requires that the global work-size be a multiple of the work-group
size specified to {clEnqueueNDRangeKernel}.
Allows optimizations that are made possible by this restriction.
+
--
Note: This option is <<unified-spec, missing before>> version 2.0.
--
`-cl-no-subgroup-ifp` ::
This indicates that kernels in this program do not require subgroups to
make independent forward progress.
Allows optimizations that are made possible by this restriction.
This option has no effect for devices that do not support independent
forward progress for subgroups.
+
--
Note: This option is <<unified-spec, missing before>> version 2.1.
--
The following options control compiler behavior regarding floating-point
arithmetic.
These options trade off between performance and correctness and must be
specifically enabled.
These options are not turned on by default since they can result in incorrect
output for programs which depend on an exact implementation of IEEE 754
rules/specifications for math functions.
`-cl-mad-enable` ::
Allow `a * b + c` to be replaced by a *mad* instruction.
The *mad* instruction may compute `a * b + c` with reduced accuracy
in the embedded profile.
See the OpenCL C or OpenCL SPIR-V Environment specification for accuracy
details.
On some hardware the *mad* instruction may provide better performance
than the expanded computation.
`-cl-no-signed-zeros` ::
Allow optimizations for floating-point arithmetic that ignore the
signedness of zero.
IEEE 754 arithmetic specifies the distinct behavior of `+0.0` and `-0.0`
values, which then prohibits simplification of expressions such as `x
{plus} 0.0` or `0.0 * x` (even with `-cl-finite-math-only`).
This option implies that the sign of a zero result isn't significant.
`-cl-unsafe-math-optimizations` ::
Allow optimizations for floating-point arithmetic that (a) assume that
arguments and results are valid, (b) may violate the IEEE 754 standard,
(c) assume relaxed OpenCL numerical compliance requirements as defined
in the unsafe math optimization section of the OpenCL C or OpenCL SPIR-V
Environment specifications, and (d) may violate edge case behavior in the
OpenCL C or OpenCL SPIR-V Environment specifications.
This option includes the `-cl-no-signed-zeros` and `-cl-mad-enable` options.
`-cl-finite-math-only` ::
Allow optimizations for floating-point arithmetic that assume that
arguments and results are not NaNs, +Inf, -Inf.
This option may violate the OpenCL numerical compliance requirements for
single precision and double precision floating-point, as well as edge
case behavior.
`-cl-fast-relaxed-math` ::
Sets the optimization options `-cl-finite-math-only` and
`-cl-unsafe-math-optimizations`.
This option causes the preprocessor macro `+__FAST_RELAXED_MATH__+` to
be defined in the OpenCL program.
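The way these options imply one another can be modeled with a small bit-mask expansion.
The sketch below is only an illustration of the relationships stated above; the `OPT_*` names and the function `expand_math_options` are hypothetical and are not part of the OpenCL API.

```c
/* Hypothetical flags, one per option described above. */
enum math_opt {
    OPT_MAD_ENABLE                = 1 << 0, /* -cl-mad-enable */
    OPT_NO_SIGNED_ZEROS           = 1 << 1, /* -cl-no-signed-zeros */
    OPT_UNSAFE_MATH_OPTIMIZATIONS = 1 << 2, /* -cl-unsafe-math-optimizations */
    OPT_FINITE_MATH_ONLY          = 1 << 3, /* -cl-finite-math-only */
    OPT_FAST_RELAXED_MATH         = 1 << 4  /* -cl-fast-relaxed-math */
};

/* Returns the set of options in effect once implied options are added. */
unsigned expand_math_options(unsigned opts)
{
    /* -cl-fast-relaxed-math sets -cl-finite-math-only and
     * -cl-unsafe-math-optimizations. */
    if (opts & OPT_FAST_RELAXED_MATH)
        opts |= OPT_FINITE_MATH_ONLY | OPT_UNSAFE_MATH_OPTIMIZATIONS;

    /* -cl-unsafe-math-optimizations includes -cl-no-signed-zeros and
     * -cl-mad-enable. */
    if (opts & OPT_UNSAFE_MATH_OPTIMIZATIONS)
        opts |= OPT_NO_SIGNED_ZEROS | OPT_MAD_ENABLE;

    return opts;
}

/* The __FAST_RELAXED_MATH__ macro is tied to -cl-fast-relaxed-math itself,
 * not to the options it implies. */
int defines_fast_relaxed_math_macro(unsigned opts)
{
    return (opts & OPT_FAST_RELAXED_MATH) != 0;
}
```

Note that the expansion is one-directional: specifying both implied options individually does not, in this model, define the preprocessor macro.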
==== Options to Request or Suppress Warnings
Warnings are diagnostic messages that report constructions which are not
inherently erroneous but which are risky or suggest there may have been an
error.
The following language-independent options do not enable specific warnings
but control the kinds of diagnostics produced by the OpenCL compiler.
These options are ignored for programs created with IL.
`-w` ::
Inhibit all warning messages.
`-Werror` ::
Make all warnings into errors.
[[opencl-c-version]]
==== Options Controlling the OpenCL C version
The following option controls the version of OpenCL C that the compiler
accepts.
This option is ignored for programs created with IL.
`-cl-std=` ::
Determine the OpenCL C language version to use.
A value for this option must be provided.
Valid values are:
+
--
* `CL1.1`: Support OpenCL C 1.1 language features defined in _section 6_ of
the OpenCL 1.1 specification or in the unified OpenCL C specification.
* `CL1.2`: Support OpenCL C 1.2 language features defined in _section 6_ of
the OpenCL 1.2 specification or in the unified OpenCL C specification.
* `CL2.0`: Support OpenCL C 2.0 language features defined in the OpenCL C 2.0
specification or in the unified OpenCL C specification.
* `CL3.0`: Support OpenCL C 3.0 language features defined in the unified
OpenCL C specification.
--
Calls to {clBuildProgram} or {clCompileProgram} with the `-cl-std=CL1.1`
option *will fail* to compile the program for any devices with
{CL_DEVICE_OPENCL_C_VERSION} equal to OpenCL C 1.0 and when
{CL_DEVICE_OPENCL_C_ALL_VERSIONS} does not include OpenCL C 1.1.
Calls to {clBuildProgram} or {clCompileProgram} with the `-cl-std=CL1.2`
option *will fail* to compile the program for any devices with
{CL_DEVICE_OPENCL_C_VERSION} equal to OpenCL C 1.1 or earlier
and when {CL_DEVICE_OPENCL_C_ALL_VERSIONS} does not include OpenCL C 1.2.
Calls to {clBuildProgram} or {clCompileProgram} with the `-cl-std=CL2.0`
option *will fail* to compile the program for any devices with
{CL_DEVICE_OPENCL_C_VERSION} equal to OpenCL C 1.2 or earlier
and when {CL_DEVICE_OPENCL_C_ALL_VERSIONS} does not include OpenCL C 2.0.
Calls to {clBuildProgram} or {clCompileProgram} with the `-cl-std=CL3.0`
option *will fail* to compile the program for any devices with
{CL_DEVICE_OPENCL_C_VERSION} equal to OpenCL C 2.0 or earlier
and when {CL_DEVICE_OPENCL_C_ALL_VERSIONS} does not include OpenCL C 3.0.
If the `-cl-std` build option is not specified, the highest OpenCL C 1.x
language version supported by each device is used when compiling the program
for each device.
Applications are required to specify the `-cl-std=CL2.0` build option to
compile or build programs with OpenCL C 2.0 and the `-cl-std=CL3.0`
build option to compile or build programs with OpenCL C 3.0.
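The version rules above can be read as a simple predicate.
The following sketch is one literal reading of those rules, not a normative algorithm: versions are encoded as `major * 10 + minor` (an encoding chosen for this example), and the function names are hypothetical.
Real host code would derive the inputs from the {CL_DEVICE_OPENCL_C_VERSION} and {CL_DEVICE_OPENCL_C_ALL_VERSIONS} queries.

```c
#include <stddef.h>

/* Returns 1 if a build with -cl-std=<requested> is expected to fail for a
 * device, 0 otherwise. Versions are encoded as major * 10 + minor, for
 * example 12 for OpenCL C 1.2. */
int cl_std_is_rejected(int requested, int device_c_version,
                       const int *all_versions, size_t num_all_versions)
{
    size_t i;

    /* The device's reported OpenCL C version already covers the request. */
    if (device_c_version >= requested)
        return 0;

    /* Otherwise the request must appear in the list of all versions. */
    for (i = 0; i < num_all_versions; i++) {
        if (all_versions[i] == requested)
            return 0;
    }
    return 1;
}

/* Without -cl-std, the highest OpenCL C 1.x version supported by the device
 * is used. Returns 0 if the list contains no 1.x version. */
int default_cl_std(const int *versions, size_t num_versions)
{
    int best = 0;
    size_t i;

    for (i = 0; i < num_versions; i++) {
        if (versions[i] >= 10 && versions[i] < 20 && versions[i] > best)
            best = versions[i];
    }
    return best;
}
```

For example, a device that reports OpenCL C 1.2 and lists versions 1.0, 1.1, 1.2 and 3.0 would, under this reading, reject `-cl-std=CL2.0` but accept `-cl-std=CL3.0`.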
==== Options for Querying Kernel Argument Information
IMPORTANT: Querying for kernel argument information is <<unified-spec, missing
before>> version 1.2.
`-cl-kernel-arg-info` ::
This option allows the compiler to store information about the arguments
of each kernel in the program executable.
The argument information stored includes the argument name, its type,
the address space and access qualifiers used.
Refer to description of {clGetKernelArgInfo} on how to query this
information.
==== Options for debugging your program
IMPORTANT: Debugging options are <<unified-spec, missing before>> version 2.0.
`-g` ::
This option can currently be used to generate additional errors for the
built-in functions that allow you to enqueue commands on a device (refer
to OpenCL kernel languages specifications).
[[linker-options]]
=== Linker Options
NOTE: Linker options are <<unified-spec, missing before>> version 1.2.
This specification defines a standard set of linker options that must be
supported by the OpenCL C compiler when linking compiled programs online or
offline.
These linker options are categorized as library linking options and program
linking options.
These may be extended by a set of vendor- or platform-specific options.
==== Library Linking Options
IMPORTANT: Library linking options are <<unified-spec, missing before>> version
1.2.
The following options can be specified when creating a library of compiled
binaries.
`-create-library` ::
Create a library of compiled binaries specified in _input_programs_
argument to {clLinkProgram}.
`-enable-link-options` ::
Allows the linker to modify the library behavior based on one or more
link options (described in <<program-linking-options, Program Linking
Options>>) when this library is linked with a program executable.
This option must be specified with the `-create-library` option.
[[program-linking-options]]
==== Program Linking Options
The following options can be specified when linking a program executable.
`-cl-denorms-are-zero` +
`-cl-no-signed-zeros` +
`-cl-unsafe-math-optimizations` +
`-cl-finite-math-only` +
`-cl-fast-relaxed-math` +
`-cl-no-subgroup-ifp` (<<unified-spec, missing before>> version 2.1)
The options are described in <<math-intrinsics-options, Math Intrinsics
Options>> and <<optimization-options, Optimization Options>>.
The linker may apply these options to all compiled program objects
specified to {clLinkProgram}.
The linker may apply these options only to libraries which were created
with the option `-enable-link-options`.
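An application that assembles link options at run time may want to check the one dependency stated above, that `-enable-link-options` is only meaningful together with `-create-library`, before calling {clLinkProgram}.
The sketch below shows one way to do this with whole-token matching; both function names are hypothetical helpers, and the check is an application-side convenience rather than anything the API provides.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the space-separated options string contains name as a whole
 * token, 0 otherwise. */
int has_option(const char *options, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = options;

    while (*p != '\0') {
        size_t len;

        while (*p == ' ')           /* skip separators */
            p++;
        len = strcspn(p, " ");      /* length of the next token */
        if (len == name_len && strncmp(p, name, len) == 0)
            return 1;
        p += len;
    }
    return 0;
}

/* Returns 0 if -enable-link-options is present without -create-library,
 * 1 otherwise. */
int link_options_are_consistent(const char *options)
{
    if (has_option(options, "-enable-link-options") &&
        !has_option(options, "-create-library"))
        return 0;
    return 1;
}
```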
=== Unloading the OpenCL Compiler
[open,refpage='clUnloadPlatformCompiler',desc='Allows the implementation to release the resources allocated by the OpenCL compiler for a platform.',type='protos']
--
To unload an OpenCL compiler for a platform, call the function
include::{generated}/api/protos/clUnloadPlatformCompiler.txt[]
include::{generated}/api/version-notes/clUnloadPlatformCompiler.asciidoc[]
* _platform_ is the platform to unload.
This function allows the implementation to release the resources allocated
by the OpenCL compiler for _platform_.
This is a hint from the application and does not guarantee that the compiler
will not be used in the future or that the compiler will actually be
unloaded by the implementation.
Calls to {clBuildProgram}, {clCompileProgram} or {clLinkProgram} after
{clUnloadPlatformCompiler} will reload the compiler, if necessary, to build
the appropriate program executable.
// refError
{clUnloadPlatformCompiler} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_PLATFORM} if _platform_ is not a valid platform.
--
[open,refpage='clUnloadCompiler',desc='Allows the implementation to release the resources allocated by the OpenCL compiler.',type='protos']
--
Alternatively, if you are not using OpenCL via the ICD loader, you may unload the OpenCL compiler with the function
include::{generated}/api/protos/clUnloadCompiler.txt[]
include::{generated}/api/version-notes/clUnloadCompiler.asciidoc[]
This function allows the implementation to release the resources allocated
by the OpenCL compiler.
This is a hint from the application and does not guarantee that the compiler
will not be used in the future or that the compiler will actually be
unloaded by the implementation.
Calls to {clBuildProgram}, {clCompileProgram} or {clLinkProgram} after
{clUnloadCompiler} will reload the compiler, if necessary, to build
the appropriate program executable.
// refError
{clUnloadCompiler} will always return {CL_SUCCESS}.
--
=== Program Object Queries
[open,refpage='clGetProgramInfo',desc='Returns information about the program object.',type='protos']
--
To return information about a program object, call the function
include::{generated}/api/protos/clGetProgramInfo.txt[]
include::{generated}/api/version-notes/clGetProgramInfo.asciidoc[]
* _program_ specifies the program object being queried.
* _param_name_ specifies the information to query.
The list of supported _param_name_ types and the information returned in
_param_value_ by {clGetProgramInfo} is described in the
<<program-info-table,Program Object Queries>> table.
* _param_value_ is a pointer to memory where the appropriate result being
queried is returned.
If _param_value_ is `NULL`, it is ignored.
* _param_value_size_ is used to specify the size in bytes of memory pointed to
by _param_value_.
This size must be {geq} size of return type as described in the
<<program-info-table,Program Object Queries>> table.
* _param_value_size_ret_ returns the actual size in bytes of data being
queried by _param_name_.
If _param_value_size_ret_ is `NULL`, it is ignored.
[[program-info-table]]
.List of supported param_names by <<clGetProgramInfo>>
[width="100%",cols="<33%,<17%,<50%",options="header"]
|====
| Program Info | Return Type | Description
| {CL_PROGRAM_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}]
include::{generated}/api/version-notes/CL_PROGRAM_REFERENCE_COUNT.asciidoc[]
| {cl_uint_TYPE}
| Return the _program_ reference count.
| {CL_PROGRAM_CONTEXT_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_CONTEXT.asciidoc[]
| {cl_context_TYPE}
| Return the context specified when the program object is created.
| {CL_PROGRAM_NUM_DEVICES_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_NUM_DEVICES.asciidoc[]
| {cl_uint_TYPE}
| Return the number of devices associated with _program_.
| {CL_PROGRAM_DEVICES_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_DEVICES.asciidoc[]
| {cl_device_id_TYPE}[]
| Return the list of devices associated with the program object.
This can be the devices associated with context on which the program
object has been created or can be a subset of devices that are
specified when a program object is created using
{clCreateProgramWithBinary}.
| {CL_PROGRAM_SOURCE_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_SOURCE.asciidoc[]
| {char_TYPE}[]
| Return the program source code specified by
{clCreateProgramWithSource}.
The source string returned is a concatenation of all source strings
specified to {clCreateProgramWithSource} with a null terminator.
The concatenation strips any nulls in the original source strings.
If _program_ is created using {clCreateProgramWithBinary},
{clCreateProgramWithIL} or {clCreateProgramWithBuiltInKernels}, a
null string or the appropriate program source code is returned
depending on whether or not the program source code is stored in the
binary.
The actual number of characters that represents the program source
code including the null terminator is returned in
_param_value_size_ret_.
| {CL_PROGRAM_IL_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_IL.asciidoc[]
Also see extension *cl_khr_il_program*.
| {char_TYPE}[]
| Returns the program IL for programs created with
{clCreateProgramWithIL}.
If _program_ is created with {clCreateProgramWithSource},
{clCreateProgramWithBinary} or {clCreateProgramWithBuiltInKernels}
the memory pointed to by _param_value_ will be unchanged and
_param_value_size_ret_ will be set to 0.
| {CL_PROGRAM_BINARY_SIZES_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_BINARY_SIZES.asciidoc[]
| {size_t_TYPE}[]
| Returns an array that contains the size in bytes of the program
binary (could be an executable binary, compiled binary or library
binary) for each device associated with program.
The size of the array is the number of devices associated with
program.
If a binary is not available for a device(s), a size of zero is
returned.
If _program_ is created using {clCreateProgramWithBuiltInKernels},
the implementation may return zero in any entries of the returned
array.
| {CL_PROGRAM_BINARIES_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_BINARIES.asciidoc[]
| {unsigned_char_TYPE}*[]
| Return the program binaries (could be an executable binary, compiled
binary or library binary) for all devices associated with program.
For each device in program, the binary returned can be the binary
specified for the device when program is created with
{clCreateProgramWithBinary} or it can be the executable binary
generated by {clBuildProgram} or {clLinkProgram}.
If _program_ is created with {clCreateProgramWithSource} or
{clCreateProgramWithIL}, the binary returned is the binary generated
by {clBuildProgram}, {clCompileProgram} or {clLinkProgram}.
The bits returned can be an implementation-specific intermediate
representation (a.k.a. IR) or device specific executable bits or
both.
The decision on which information is returned in the binary is up to
the OpenCL implementation.
param_value points to an array of `n` pointers allocated by the
caller, where `n` is the number of devices associated with program.
The buffer sizes needed to allocate the memory that these `n`
pointers refer to can be queried using the {CL_PROGRAM_BINARY_SIZES}
query as described in this table.
Each entry in this array is used by the implementation as the
location in memory where to copy the program binary for a specific
device, if there is a binary available.
To find out which device the program binary in the array refers to,
use the {CL_PROGRAM_DEVICES} query to get the list of devices.
There is a one-to-one correspondence between the array of n pointers
returned by {CL_PROGRAM_BINARIES} and array of devices returned by
{CL_PROGRAM_DEVICES}.
| {CL_PROGRAM_NUM_KERNELS_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_NUM_KERNELS.asciidoc[]
| {size_t_TYPE}
| Returns the number of kernels declared in _program_ that can be
created with {clCreateKernel}.
This information is only available after a successful program
executable has been built for at least one device in the list of
devices associated with _program_.
| {CL_PROGRAM_KERNEL_NAMES_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_KERNEL_NAMES.asciidoc[]
| {char_TYPE}[]
| Returns a semi-colon separated list of kernel names in _program_
that can be created with {clCreateKernel}.
This information is only available after a successful program
executable has been built for at least one device in the list of
devices associated with _program_.
| {CL_PROGRAM_SCOPE_GLOBAL_CTORS_PRESENT_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_SCOPE_GLOBAL_CTORS_PRESENT.asciidoc[]
| {cl_bool_TYPE}
| This indicates that the _program_ object contains non-trivial
constructor(s) that will be executed by runtime before any kernel
from the program is executed.
This information is only available after a successful program
executable has been built for at least one device in the list of
devices associated with _program_.
Querying {CL_PROGRAM_SCOPE_GLOBAL_CTORS_PRESENT} may unconditionally
return {CL_FALSE} if no devices associated with _program_ support
constructors for program scope global variables.
Support for constructors and destructors for program scope global
variables is required only for OpenCL 2.2 devices.
| {CL_PROGRAM_SCOPE_GLOBAL_DTORS_PRESENT_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_SCOPE_GLOBAL_DTORS_PRESENT.asciidoc[]
| {cl_bool_TYPE}
| This indicates that the program object contains non-trivial
destructor(s) that will be executed by runtime when _program_ is
destroyed.
This information is only available after a successful program
executable has been built for at least one device in the list of
devices associated with _program_.
Querying {CL_PROGRAM_SCOPE_GLOBAL_DTORS_PRESENT} may unconditionally
return {CL_FALSE} if no devices associated with _program_ support
destructors for program scope global variables.
Support for constructors and destructors for program scope global
variables is required only for OpenCL 2.2 devices.
|====
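The caller-side allocation that the {CL_PROGRAM_BINARIES} query expects can be sketched as follows.
This is a minimal illustration under stated assumptions: the function names are hypothetical, the `sizes` array stands for the result of a prior {CL_PROGRAM_BINARY_SIZES} query, and leaving the entry `NULL` for a device with no binary is a choice made in this sketch.

```c
#include <stdlib.h>

/* Frees the first num_devices buffers and the pointer array itself. */
void free_binary_buffers(unsigned char **bins, size_t num_devices)
{
    size_t i;

    if (bins == NULL)
        return;
    for (i = 0; i < num_devices; i++)
        free(bins[i]);
    free(bins);
}

/* Allocates an array of num_devices pointers, one buffer per device, sized
 * from sizes[]. Entries whose size is zero (no binary available) stay NULL.
 * Returns NULL if any allocation fails. */
unsigned char **alloc_binary_buffers(const size_t *sizes, size_t num_devices)
{
    unsigned char **bins;
    size_t i;

    bins = calloc(num_devices ? num_devices : 1, sizeof *bins);
    if (bins == NULL)
        return NULL;

    for (i = 0; i < num_devices; i++) {
        if (sizes[i] == 0)
            continue;
        bins[i] = malloc(sizes[i]);
        if (bins[i] == NULL) {
            free_binary_buffers(bins, i);
            return NULL;
        }
    }
    return bins;
}
```

The returned pointer array would be passed as _param_value_, with _param_value_size_ set to the number of devices multiplied by the size of a pointer; index `i` then corresponds to entry `i` of the {CL_PROGRAM_DEVICES} list.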
// refError
{clGetProgramInfo} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
specified by _param_value_size_ is < size of return type as described in
the <<program-info-table,Program Object Queries>> table and
_param_value_ is not `NULL`.
* {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
* {CL_INVALID_PROGRAM_EXECUTABLE} if _param_name_ is
{CL_PROGRAM_NUM_KERNELS}, {CL_PROGRAM_KERNEL_NAMES},
{CL_PROGRAM_SCOPE_GLOBAL_CTORS_PRESENT}, or
{CL_PROGRAM_SCOPE_GLOBAL_DTORS_PRESENT} and a successful program executable
has not been built for at least one device in the list of devices
associated with _program_.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
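Because {CL_PROGRAM_KERNEL_NAMES} returns a single semi-colon separated string, applications typically split it before use.
The sketch below shows one way to do so in place; `split_kernel_names` is a hypothetical helper, and the kernel names in the usage note are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Splits a semi-colon separated list in place by overwriting each ';' with
 * a null terminator. Stores up to max_names pointers in out and returns the
 * number stored. Empty entries are skipped. */
size_t split_kernel_names(char *names, char **out, size_t max_names)
{
    size_t count = 0;
    char *p = names;

    while (*p != '\0') {
        char *end = strchr(p, ';');

        if (end != NULL)
            *end = '\0';
        if (*p != '\0' && count < max_names)
            out[count++] = p;
        if (end == NULL)
            break;
        p = end + 1;
    }
    return count;
}
```

Given the string `"vec_add;vec_mul;reduce"`, the helper yields three entries, each of which could then be passed as _kernel_name_ to {clCreateKernel}.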
[open,refpage='clGetProgramBuildInfo',desc='Returns build information for each device in the program object.',type='protos']
--
To return build information for each device in the program object, call the
function
include::{generated}/api/protos/clGetProgramBuildInfo.txt[]
include::{generated}/api/version-notes/clGetProgramBuildInfo.asciidoc[]
* _program_ specifies the program object being queried.
* _device_ specifies the device for which build information is being queried.
_device_ must be a valid device associated with _program_.
* _param_name_ specifies the information to query.
The list of supported _param_name_ types and the information returned in
_param_value_ by {clGetProgramBuildInfo} is described in the
<<program-build-info-table,Program Build Queries>> table.
* _param_value_ is a pointer to memory where the appropriate result being
queried is returned.
If _param_value_ is `NULL`, it is ignored.
* _param_value_size_ is used to specify the size in bytes of memory pointed to
by _param_value_.
This size must be {geq} size of return type as described in the
<<program-build-info-table,Program Build Queries>> table.
* _param_value_size_ret_ returns the actual size in bytes of data being
queried by _param_name_.
If _param_value_size_ret_ is `NULL`, it is ignored.
[[program-build-info-table]]
.List of supported param_names by <<clGetProgramBuildInfo>>
[width="100%",cols="<33%,<17%,<50%",options="header"]
|====
| Program Build Info | Return Type | Description
| {CL_PROGRAM_BUILD_STATUS_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_BUILD_STATUS.asciidoc[]
| {cl_build_status_TYPE}
| Returns the build, compile or link status, whichever was performed
last on the specified _program_ object for _device_.
This can be one of the following:
{CL_BUILD_NONE_anchor} - The build status returned if no {clBuildProgram},
{clCompileProgram} or {clLinkProgram} has been performed on the
specified _program_ object for _device_.
{CL_BUILD_ERROR_anchor} - The build status returned if {clBuildProgram},
{clCompileProgram} or {clLinkProgram} - whichever was performed last
on the specified _program_ object for _device_ - generated an error.
{CL_BUILD_SUCCESS_anchor} - The build status returned if {clBuildProgram},
{clCompileProgram} or {clLinkProgram} - whichever was performed last
on the specified _program_ object for _device_ - was successful.
{CL_BUILD_IN_PROGRESS_anchor} - The build status returned if
{clBuildProgram}, {clCompileProgram} or {clLinkProgram} - whichever
was performed last on the specified _program_ object for _device_ - has
not finished.
| {CL_PROGRAM_BUILD_OPTIONS_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_BUILD_OPTIONS.asciidoc[]
| {char_TYPE}[]
| Return the build, compile or link options specified by the options
argument in {clBuildProgram}, {clCompileProgram} or {clLinkProgram},
whichever was performed last on the specified _program_ object for
_device_.
If build status of the specified _program_ for _device_ is
{CL_BUILD_NONE}, an empty string is returned.
| {CL_PROGRAM_BUILD_LOG_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_BUILD_LOG.asciidoc[]
| {char_TYPE}[]
| Return the build, compile or link log for {clBuildProgram},
{clCompileProgram} or {clLinkProgram}, whichever was performed last
on _program_ for _device_.
If build status of the specified _program_ for _device_ is
{CL_BUILD_NONE}, an empty string is returned.
| {CL_PROGRAM_BINARY_TYPE_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_BINARY_TYPE.asciidoc[]
| {cl_program_binary_type_TYPE}
| Return the program binary type for device.
This can be one of the following values:
{CL_PROGRAM_BINARY_TYPE_NONE_anchor} - There is no binary associated
with the specified _program_ object for _device_.
{CL_PROGRAM_BINARY_TYPE_COMPILED_OBJECT_anchor} - A compiled binary is
associated with _device_.
This is the case when the specified _program_ object was created using
{clCreateProgramWithSource} and compiled using {clCompileProgram}, or
when a compiled binary was loaded using {clCreateProgramWithBinary}.
{CL_PROGRAM_BINARY_TYPE_LIBRARY_anchor} - A library binary is
associated with _device_.
This is the case when the specified _program_ object was linked by
{clLinkProgram} using the `-create-library` link option, or when a
compiled library binary was loaded using {clCreateProgramWithBinary}.
{CL_PROGRAM_BINARY_TYPE_EXECUTABLE_anchor} - An executable binary is
associated with _device_.
This is the case when the specified _program_ object was linked by
{clLinkProgram} without the `-create-library` link option, or when an
executable binary was built using {clBuildProgram}.
| {CL_PROGRAM_BUILD_GLOBAL_VARIABLE_TOTAL_SIZE_anchor}
include::{generated}/api/version-notes/CL_PROGRAM_BUILD_GLOBAL_VARIABLE_TOTAL_SIZE.asciidoc[]
| {size_t_TYPE}
| The total amount of storage, in bytes, used by program variables in
the global address space.
|====
// refError
{clGetProgramBuildInfo} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_DEVICE} if _device_ is not in the list of devices associated
with _program_.
* {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
specified by _param_value_size_ is < size of return type as described in
the <<program-build-info-table,Program Build Queries>> table and
_param_value_ is not `NULL`.
* {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
[NOTE]
====
A program binary (compiled binary, library binary or executable binary)
built for a parent device can be used by all its sub-devices.
If a program binary has not been built for a sub-device, the program binary
associated with the parent device will be used.
A program binary for a device specified with {clCreateProgramWithBinary} or
queried using {clGetProgramInfo} can be used as the binary for the
associated root device, and all sub-devices created from the root-level
device or sub-devices thereof.
====
--
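String-valued queries such as the build log are commonly retrieved with two calls: one to obtain the required size through _param_value_size_ret_, and one to fetch the data.
The sketch below illustrates that pattern without depending on the OpenCL headers.
The callback type `string_query_fn` and the functions `query_string` and `fake_build_log_query` are hypothetical; in real host code the callback would be a thin wrapper around {clGetProgramBuildInfo} with `CL_PROGRAM_BUILD_LOG`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback with the same size/value/size_ret shape as the
 * query functions described above. Returns 0 on success. */
typedef int (*string_query_fn)(void *user, size_t size, char *value,
                               size_t *size_ret);

/* Two-call pattern: ask for the size, allocate, then fetch the string.
 * Returns a malloc'ed string the caller must free, or NULL on failure. */
char *query_string(string_query_fn query, void *user)
{
    size_t size = 0;
    char *buf;

    if (query(user, 0, NULL, &size) != 0)   /* first call: size only */
        return NULL;

    buf = malloc(size ? size : 1);
    if (buf == NULL)
        return NULL;
    buf[0] = '\0';

    if (size != 0 && query(user, size, buf, NULL) != 0) {
        free(buf);
        return NULL;
    }
    return buf;
}

/* Stand-in for a real query, used only to exercise the pattern. It follows
 * the parameter rules above: a NULL value or NULL size_ret is ignored, and
 * a non-NULL value with too small a size is an error. */
int fake_build_log_query(void *user, size_t size, char *value,
                         size_t *size_ret)
{
    const char *stored = (const char *)user;
    size_t needed = strlen(stored) + 1;

    if (value != NULL) {
        if (size < needed)
            return -1;              /* would be CL_INVALID_VALUE */
        memcpy(value, stored, needed);
    }
    if (size_ret != NULL)
        *size_ret = needed;
    return 0;
}
```

Allocating from the reported size, rather than guessing a fixed buffer length, avoids the {CL_INVALID_VALUE} error described for undersized buffers.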
== Kernel Objects
A kernel is a function declared in a program.
A kernel is identified by the `+__kernel+` qualifier applied to any function
in a program.
A kernel object encapsulates the specific `+__kernel+` function declared in
a program and the argument values to be used when executing this
`+__kernel+` function.
=== Creating Kernel Objects
[open,refpage='clCreateKernel',desc='Creates a kernel object.',type='protos']
--
To create a kernel object, use the function
include::{generated}/api/protos/clCreateKernel.txt[]
include::{generated}/api/version-notes/clCreateKernel.asciidoc[]
* _program_ is a program object with a successfully built executable.
* _kernel_name_ is a function name in the program declared with the
`+__kernel+` qualifier.
* _errcode_ret_ will return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
// refError
{clCreateKernel} returns a valid non-zero kernel object and _errcode_ret_ is
set to {CL_SUCCESS} if the kernel object is created successfully.
Otherwise, it returns a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
* {CL_INVALID_PROGRAM_EXECUTABLE} if there is no successfully built
executable for _program_.
* {CL_INVALID_KERNEL_NAME} if _kernel_name_ is not found in _program_.
* {CL_INVALID_KERNEL_DEFINITION} if the function definition for the
`+__kernel+` function given by _kernel_name_, such as the number of
arguments or the argument types, is not the same for all devices for which
the _program_ executable has been built.
* {CL_INVALID_VALUE} if _kernel_name_ is `NULL`.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clCreateKernelsInProgram',desc='Creates kernel objects for all kernel functions in a program object.',type='protos']
--
To create kernel objects for all kernel functions in a program,
call the function
include::{generated}/api/protos/clCreateKernelsInProgram.txt[]
include::{generated}/api/version-notes/clCreateKernelsInProgram.asciidoc[]
* _program_ is a program object with a successfully built executable.
* _num_kernels_ is the size of memory pointed to by _kernels_ specified as the
number of {cl_kernel_TYPE} entries.
* _kernels_ is the buffer where the kernel objects for kernels in _program_
will be returned.
If _kernels_ is `NULL`, it is ignored.
If _kernels_ is not `NULL`, _num_kernels_ must be greater than or equal to
the number of kernels in _program_.
* _num_kernels_ret_ is the number of kernels in _program_.
If _num_kernels_ret_ is `NULL`, it is ignored.
Kernel objects are not created for any `+__kernel+` functions in _program_
that do not have the same function definition across all devices for which a
program executable has been successfully built.
Kernel objects can only be created once you have a program object with a
valid program source or binary loaded into the program object and the
program executable has been successfully built for one or more devices
associated with program.
No changes to the program executable are allowed while there are kernel
objects associated with a program object.
This means that calls to {clBuildProgram} and {clCompileProgram} return
{CL_INVALID_OPERATION} if there are kernel objects attached to a program
object.
The OpenCL context associated with _program_ will be the context associated
with _kernel_.
The list of devices associated with _program_ are the devices associated
with _kernel_.
Devices associated with a program object for which a valid program
executable has been built can be used to execute kernels declared in the
program object.
// refError
{clCreateKernelsInProgram} will return {CL_SUCCESS} if the kernel objects were
successfully allocated.
Otherwise, it returns one of the following errors:
* {CL_INVALID_PROGRAM} if _program_ is not a valid program object.
* {CL_INVALID_PROGRAM_EXECUTABLE} if there is no successfully built
executable for any device in _program_.
* {CL_INVALID_VALUE} if _kernels_ is not `NULL` and _num_kernels_ is less
than the number of kernels in _program_.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clRetainKernel',desc='Increments the kernel object reference count.',type='protos']
--
To retain a kernel object, call the function
include::{generated}/api/protos/clRetainKernel.txt[]
include::{generated}/api/version-notes/clRetainKernel.asciidoc[]
* _kernel_ is the kernel object to be retained.
The _kernel_ reference count is incremented.
// refError
{clRetainKernel} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
{clCreateKernel} and {clCreateKernelsInProgram} perform an implicit retain.
--
[open,refpage='clReleaseKernel',desc='Decrements the kernel reference count.',type='protos']
--
To release a kernel object, call the function
include::{generated}/api/protos/clReleaseKernel.txt[]
include::{generated}/api/version-notes/clReleaseKernel.asciidoc[]
* _kernel_ is the kernel object to be released.
The _kernel_ reference count is decremented.
The kernel object is deleted once the _kernel_ reference count becomes zero
and the kernel object is no longer needed by any enqueued commands that use
_kernel_.
Using this function to release a reference that was not obtained by creating
the object or by calling {clRetainKernel} causes undefined behavior.
// refError
{clReleaseKernel} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
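The deletion rule for kernel objects can be illustrated with a small host-side model.
The sketch below is not OpenCL API code and does not describe how any implementation tracks kernel objects; `toy_kernel` and the `toy_*` functions are hypothetical names introduced only for illustration.
It assumes that a kernel is deleted only when both its reference count and the number of enqueued commands still using it have reached zero.
[source,c]
----
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel object; not an OpenCL type. */
typedef struct {
    unsigned ref_count;        /* references held by the application       */
    unsigned pending_commands; /* enqueued commands still using the kernel */
    bool     deleted;
} toy_kernel;

/* Deletion requires both counters to be zero. */
void toy_try_delete(toy_kernel *k)
{
    if (k->ref_count == 0 && k->pending_commands == 0)
        k->deleted = true;
}

/* Models kernel creation, which performs an implicit retain. */
toy_kernel toy_create(void)
{
    toy_kernel k = { 1, 0, false };
    return k;
}

/* Models clRetainKernel. */
void toy_retain(toy_kernel *k)
{
    assert(!k->deleted);
    k->ref_count++;
}

/* Models clReleaseKernel; releasing a reference that is not held is an error. */
void toy_release(toy_kernel *k)
{
    assert(k->ref_count > 0);
    k->ref_count--;
    toy_try_delete(k);
}

/* Models enqueuing a command that uses the kernel. */
void toy_enqueue(toy_kernel *k)
{
    assert(!k->deleted);
    k->pending_commands++;
}

/* Models completion of a command that used the kernel. */
void toy_command_done(toy_kernel *k)
{
    assert(k->pending_commands > 0);
    k->pending_commands--;
    toy_try_delete(k);
}
----
In this model, releasing the last reference while a command is still pending does not delete the kernel; deletion happens only when that command completes.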
=== Setting Kernel Arguments
To execute a kernel, the kernel arguments must be set.
[open,refpage='clSetKernelArg',desc='Set the argument value for a specific argument of a kernel.',type='protos']
--
To set the argument value for a specific argument of a kernel, call the
function
include::{generated}/api/protos/clSetKernelArg.txt[]
include::{generated}/api/version-notes/clSetKernelArg.asciidoc[]
* _kernel_ is a valid kernel object.
* _arg_index_ is the argument index.
Arguments to the kernel are referred to by indices that go from 0 for the
leftmost argument to _n_ - 1, where _n_ is the total number of arguments
declared by the kernel (see below).
* _arg_size_ specifies the size of the argument value.
If the argument is a memory object, the _arg_size_ value must be equal to
`sizeof({cl_mem_TYPE})`.
For arguments declared with the `local` qualifier, the size specified will
be the size in bytes of the buffer that must be allocated for the `local`
argument.
If the argument is of type _sampler_t_, the _arg_size_ value must be equal
to `sizeof({cl_sampler_TYPE})`.
If the argument is of type _queue_t_, the _arg_size_ value must be equal to
`sizeof({cl_command_queue_TYPE})`.
For all other arguments, the size will be the size of the argument type.
* _arg_value_ is a pointer to data that should be used as the argument value
for the argument specified by _arg_index_.
The argument data pointed to by _arg_value_ is copied and the _arg_value_
pointer can therefore be reused by the application after {clSetKernelArg}
returns.
The argument value specified is the value used by all API calls that enqueue
_kernel_ ({clEnqueueNDRangeKernel} and {clEnqueueTask}) until the argument
value is changed by a call to {clSetKernelArg} for _kernel_.
For example, consider the following kernel:
[source,c]
----
kernel void image_filter (int n,
                          int m,
                          constant float *filter_weights,
                          read_only image2d_t src_image,
                          write_only image2d_t dst_image)
{
    ...
}
----
Argument index values for `image_filter` will be 0 for `n`, 1 for `m`, 2 for
`filter_weights`, 3 for `src_image` and 4 for `dst_image`.
If the argument is a memory object (buffer, pipe, image or image array), the
_arg_value_ entry will be a pointer to the appropriate buffer, pipe, image
or image array object.
The memory object must be created with the context associated with the
kernel object.
If the argument is a buffer object, the _arg_value_ pointer can be `NULL` or
point to a `NULL` value in which case a `NULL` value will be used as the
value for the argument declared as a pointer to `global` or `constant`
memory in the kernel.
If the argument is declared with the `local` qualifier, the _arg_value_
entry must be `NULL`.
If the argument is of type _sampler_t_, the _arg_value_ entry must be a
pointer to the sampler object.
If the argument is of type _queue_t_, the _arg_value_ entry must be a
pointer to the device queue object.
If the argument is declared to be a pointer of a built-in scalar or vector
type, or a user defined structure type in the global or constant address
space, the memory object specified as argument value must be a buffer object
(or `NULL`).
If the argument is declared with the `constant` qualifier, the size in bytes
of the memory object cannot exceed {CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE} and
the number of arguments declared as pointers to `constant` memory cannot
exceed {CL_DEVICE_MAX_CONSTANT_ARGS}.
The memory object specified as argument value must be a pipe object if the
argument is declared with the _pipe_ qualifier.
The memory object specified as argument value must be a 2D image object if
the argument is declared to be of type _image2d_t_.
The memory object specified as argument value must be a 2D image object with
image channel order = {CL_DEPTH} if the argument is declared to be of type
_image2d_depth_t_.
The memory object specified as argument value must be a 3D image object if
the argument is declared to be of type _image3d_t_.
The memory object specified as argument value must be a 1D image object if
the argument is declared to be of type _image1d_t_.
The memory object specified as argument value must be a 1D image buffer
object if the argument is declared to be of type _image1d_buffer_t_.
The memory object specified as argument value must be a 1D image array
object if the argument is declared to be of type _image1d_array_t_.
The memory object specified as argument value must be a 2D image array
object if the argument is declared to be of type _image2d_array_t_.
The memory object specified as argument value must be a 2D image array
object with image channel order = {CL_DEPTH} if the argument is declared to
be of type _image2d_array_depth_t_.
For all other kernel arguments, the _arg_value_ entry must be a pointer to
the actual data to be used as argument value.
[NOTE]
====
A kernel object does not update the reference count for objects such as
memory or sampler objects specified as argument values by {clSetKernelArg}.
Users may not rely on a kernel object to retain objects specified as
argument values to the kernel.
Implementations shall not allow {cl_kernel_TYPE} objects to hold reference
counts to {cl_kernel_TYPE} arguments, because no mechanism is provided for the
user to tell the kernel to release that ownership right.
If the kernel holds ownership rights on kernel args, that would make it
impossible for users to tell with certainty when they may safely
release user allocated resources associated with OpenCL objects such as
the {cl_mem_TYPE} backing store used with {CL_MEM_USE_HOST_PTR}.
====
// refError
{clSetKernelArg} returns {CL_SUCCESS} if the function was executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
* {CL_INVALID_ARG_INDEX} if _arg_index_ is not a valid argument index.
* {CL_INVALID_ARG_VALUE} if _arg_value_ specified is not a valid value.
* {CL_INVALID_MEM_OBJECT} for an argument declared to be a memory object
when the specified _arg_value_ is not a valid memory object.
* {CL_INVALID_SAMPLER} for an argument declared to be of type _sampler_t_
when the specified _arg_value_ is not a valid sampler object.
* {CL_INVALID_DEVICE_QUEUE} for an argument declared to be of type _queue_t_
when the specified _arg_value_ is not a valid device queue object.
This error code is <<unified-spec, missing before>> version 2.0.
* {CL_INVALID_ARG_SIZE} if _arg_size_ does not match the size of the data
type for an argument that is not a memory object, or if the argument is a
memory object and _arg_size_ != `sizeof({cl_mem_TYPE})`, or if _arg_size_ is
zero and the argument is declared with the `local` qualifier, or if the
argument is a sampler and _arg_size_ != `sizeof({cl_sampler_TYPE})`.
* {CL_MAX_SIZE_RESTRICTION_EXCEEDED} if the size in bytes of the memory
object (if the argument is a memory object) or _arg_size_ (if the
argument is declared with the `local` qualifier) exceeds a
language-specified maximum size restriction for this argument, such as the
*MaxByteOffset* SPIR-V decoration.
This error code is <<unified-spec, missing before>> version 2.2.
* {CL_INVALID_ARG_VALUE} if the argument is an image declared with the
`read_only` qualifier and _arg_value_ refers to an image object created
with _cl_mem_flags_ of {CL_MEM_WRITE_ONLY} or if the image argument is
declared with the `write_only` qualifier and _arg_value_ refers to an
image object created with _cl_mem_flags_ of {CL_MEM_READ_ONLY}.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
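The _arg_size_ rules for {clSetKernelArg} can be summarized as a small check.
The following sketch is a hypothetical host-side model, not part of the OpenCL API and not how an implementation validates arguments.
The `toy_*` types stand in for the opaque handle types so that the example compiles without the OpenCL headers, and `toy_arg_size_ok` is an illustrative name.
[source,c]
----
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the opaque handle types (cl_mem, cl_sampler,
 * cl_command_queue); assumed here to be pointer-sized handles. */
typedef struct toy_mem_s     *toy_mem;
typedef struct toy_sampler_s *toy_sampler;
typedef struct toy_queue_s   *toy_queue;

/* How the argument is declared in the kernel (hypothetical classification). */
typedef enum {
    TOY_ARG_MEM_OBJECT, /* buffer, pipe, image or image array */
    TOY_ARG_LOCAL,      /* pointer declared with the local qualifier */
    TOY_ARG_SAMPLER,    /* sampler_t */
    TOY_ARG_QUEUE,      /* queue_t */
    TOY_ARG_BY_VALUE    /* any other argument type */
} toy_arg_kind;

/* Returns 1 if arg_size is acceptable for an argument of the given kind.
 * type_size is the size of the declared type and is only used for
 * TOY_ARG_BY_VALUE. */
int toy_arg_size_ok(toy_arg_kind kind, size_t arg_size, size_t type_size)
{
    switch (kind) {
    case TOY_ARG_MEM_OBJECT: return arg_size == sizeof(toy_mem);
    case TOY_ARG_SAMPLER:    return arg_size == sizeof(toy_sampler);
    case TOY_ARG_QUEUE:      return arg_size == sizeof(toy_queue);
    case TOY_ARG_LOCAL:      return arg_size != 0; /* bytes to allocate */
    case TOY_ARG_BY_VALUE:   return arg_size == type_size;
    }
    assert(!"unknown argument kind");
    return 0;
}
----
For the `image_filter` kernel, this model accepts `sizeof(int)` for `n` and `m` and the handle size for `filter_weights`, `src_image` and `dst_image`.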
[open,refpage='clSetKernelArgSVMPointer',desc='Set a SVM pointer as the argument value for a specific argument of a kernel.',type='protos']
--
To set a SVM pointer as the argument value for a specific argument of a
kernel, call the function
include::{generated}/api/protos/clSetKernelArgSVMPointer.txt[]
include::{generated}/api/version-notes/clSetKernelArgSVMPointer.asciidoc[]
* _kernel_ is a valid kernel object.
* _arg_index_ is the argument index.
Arguments to the kernel are referred to by indices that go from 0 for the
leftmost argument to _n_ - 1, where _n_ is the total number of arguments
declared by the kernel.
* _arg_value_ is the SVM pointer that should be used as the argument value for
the argument specified by _arg_index_.
The SVM pointer specified is the value used by all API calls that enqueue
_kernel_ ({clEnqueueNDRangeKernel} and {clEnqueueTask}) until the argument
value is changed by a call to {clSetKernelArgSVMPointer} for _kernel_.
The SVM pointer can only be used for arguments that are declared to be a
pointer to `global` or `constant` memory.
The SVM pointer value must be aligned according to the argument's type.
For example, if the argument is declared to be `+global float4 *p+`, the SVM
pointer value passed for `p` must be at a minimum aligned to a `float4`.
The SVM pointer value specified as the argument value can be the pointer
returned by {clSVMAlloc} or can be a pointer offset into the SVM region.
// refError
{clSetKernelArgSVMPointer} returns {CL_SUCCESS} if the function was executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
* {CL_INVALID_OPERATION} if no devices in the context associated with _kernel_ support SVM.
* {CL_INVALID_ARG_INDEX} if _arg_index_ is not a valid argument index.
* {CL_INVALID_ARG_VALUE} if _arg_value_ specified is not a valid value.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
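The alignment requirement on SVM pointer arguments can be checked on the host before calling {clSetKernelArgSVMPointer}.
The sketch below is a plain C helper, not an OpenCL API; `toy_is_aligned` is a hypothetical name.
It assumes the required alignment is a power of two, which holds for a `float4` argument (16 bytes).
[source,c]
----
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if ptr is aligned to `alignment` bytes, 0 otherwise.
 * `alignment` must be a non-zero power of two. */
int toy_is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}
----
This matters mostly for pointers that are offsets into an SVM region: adding a 4-byte offset to a 16-byte-aligned base yields a pointer that is acceptable for a `+global float *+` argument but not for a `+global float4 *+` argument.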
[open,refpage='clSetKernelExecInfo',desc='Pass additional information other than argument values to a kernel.',type='protos']
--
To pass additional information other than argument values to a kernel, call
the function
include::{generated}/api/protos/clSetKernelExecInfo.txt[]
include::{generated}/api/version-notes/clSetKernelExecInfo.asciidoc[]
* _kernel_ specifies the kernel object to which the information is passed.
* _param_name_ specifies the information to be passed to _kernel_.
The list of supported _param_name_ types and the corresponding values passed
in _param_value_ is described in the <<kernel-exec-info-table,Kernel
Execution Properties>> table.
* _param_value_size_ specifies the size in bytes of the memory pointed to by
_param_value_.
* _param_value_ is a pointer to memory where the appropriate values determined
by _param_name_ are specified.
[[kernel-exec-info-table]]
.List of supported param_names by <<clSetKernelExecInfo>>
[width="100%",cols="<33%,<17%,<50%",options="header"]
|====
| Kernel Exec Info | Type | Description
| {CL_KERNEL_EXEC_INFO_SVM_PTRS_anchor}
include::{generated}/api/version-notes/CL_KERNEL_EXEC_INFO_SVM_PTRS.asciidoc[]
| {void_TYPE}*[]
| SVM pointers must reference locations contained entirely within
buffers that are passed to _kernel_ as arguments, or that are passed
through the execution information.
Non-argument SVM buffers must be specified by passing pointers to
those buffers via {clSetKernelExecInfo} for coarse-grain and
fine-grain buffer SVM allocations but not for fine-grain system SVM
allocations.
| {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM_anchor}
include::{generated}/api/version-notes/CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM.asciidoc[]
| {cl_bool_TYPE}
| This flag indicates whether the kernel uses pointers that are
fine-grain system SVM allocations.
These fine-grain system SVM pointers may be passed as arguments or
defined in SVM buffers that are passed as arguments to _kernel_.
|====
// refError
{clSetKernelExecInfo} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
* {CL_INVALID_OPERATION} if no devices in the context associated with _kernel_ support SVM.
* {CL_INVALID_VALUE} if _param_name_ is not valid, if _param_value_ is
`NULL` or if the size specified by _param_value_size_ is not valid.
* {CL_INVALID_OPERATION} if _param_name_ is
{CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} and _param_value_ is {CL_TRUE}
but no devices in the context associated with _kernel_ support fine-grain
system SVM allocations.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[NOTE]
====
Coarse-grain or fine-grain buffer SVM pointers used by a kernel that
are not passed as kernel arguments must be specified using
{clSetKernelExecInfo} with {CL_KERNEL_EXEC_INFO_SVM_PTRS}.
For example, if SVM buffer A contains a pointer to another SVM buffer B,
and the kernel dereferences that pointer, then a pointer to B must
either be passed as an argument in the call to that kernel or it must be
made available to the kernel using {clSetKernelExecInfo}.
For example, we might pass extra SVM pointers as follows:
[source,c]
----
clSetKernelExecInfo(kernel,
                    CL_KERNEL_EXEC_INFO_SVM_PTRS,
                    num_ptrs * sizeof(void *),
                    extra_svm_ptr_list);
----
Here `num_ptrs` specifies the number of additional SVM pointers while
`extra_svm_ptr_list` specifies a pointer to memory containing those SVM
pointers.
When calling {clSetKernelExecInfo} with {CL_KERNEL_EXEC_INFO_SVM_PTRS} to
specify pointers to non-argument SVM buffers as extra arguments to a kernel,
each of these pointers can be the SVM pointer returned by {clSVMAlloc} or
can be a pointer + offset into the SVM region.
It is sufficient to provide one pointer for each SVM buffer used.
{CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} is used to indicate whether
SVM pointers used by a kernel will refer to system allocations or not.
{CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} = {CL_FALSE} indicates that the
OpenCL implementation may assume that system pointers are not passed as
kernel arguments and are not stored inside SVM allocations passed as kernel
arguments.
{CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} = {CL_TRUE} indicates that the
OpenCL implementation must assume that system pointers might be passed as
kernel arguments and/or stored inside SVM allocations passed as kernel
arguments.
In this case, if the device to which the kernel is enqueued does not support
system SVM pointers, {clEnqueueNDRangeKernel} and {clEnqueueTask} will return a
{CL_INVALID_OPERATION} error.
If none of the devices in the context associated with _kernel_ support
fine-grain system SVM allocations, {clSetKernelExecInfo} will return a
{CL_INVALID_OPERATION} error.
If {clSetKernelExecInfo} has not been called with a value for
{CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM}, the default value is used for
this kernel attribute.
The default value depends on whether the device on which the kernel is
enqueued supports fine-grain system SVM allocations.
If so, the default value used is {CL_TRUE} (system pointers might be passed);
otherwise, the default is {CL_FALSE}.
A call to {clSetKernelExecInfo} for a given value of _param_name_
replaces any prior value passed for that value of _param_name_.
Only one _param_value_ will be stored for each value of _param_name_.
====
=== Copying Kernel Objects
NOTE: Copying kernel objects is <<unified-spec, missing before>> version 2.1.
[open,refpage='clCloneKernel',desc='Make a shallow copy of the kernel object.',type='protos']
--
To clone a kernel object, call the function
include::{generated}/api/protos/clCloneKernel.txt[]
include::{generated}/api/version-notes/clCloneKernel.asciidoc[]
* _source_kernel_ is a valid {cl_kernel_TYPE} object that will be copied.
_source_kernel_ will not be modified in any way by this function.
* _errcode_ret_ will be assigned an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
Cloning is used to make a shallow copy of the kernel object, its arguments
and any information passed to the kernel object using {clSetKernelExecInfo}.
If the kernel object was ready to be enqueued before copying it, the clone
of the kernel object is ready to enqueue.
The returned kernel object is an exact copy of _source_kernel_, with one
caveat: the reference count on the returned kernel object is set as if it
had been returned by {clCreateKernel}.
The reference count of _source_kernel_ will not be changed.
The resulting kernel will be in the same state as if {clCreateKernel} had
been called with the same arguments as those used to create
_source_kernel_, and the latest call to {clSetKernelArg} or
{clSetKernelArgSVMPointer} for each argument index and the latest call to
{clSetKernelExecInfo} for each value of _param_name_ that were applied to
_source_kernel_ had then been applied to the new kernel object.
All arguments of the new kernel object must be intact, and it may be
correctly used in the same situations as _source_kernel_ except those that
assume a pre-existing reference count.
Setting arguments on the new kernel object will not affect _source_kernel_
except insofar as the argument points to a shared underlying entity and in
that situation behavior is as if two kernel objects had been created and the
same argument applied to each.
Only the data stored in the kernel object is copied; data referenced by the
kernel's arguments is not copied.
For example, if a buffer or pointer argument is set on a kernel object, the
pointer is copied but the underlying memory allocation is not.
// refError
{clCloneKernel} returns a valid non-zero kernel object and _errcode_ret_ is
set to {CL_SUCCESS} if the kernel is successfully copied.
Otherwise it returns a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_KERNEL} if _source_kernel_ is not a valid kernel object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
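The shallow-copy semantics of {clCloneKernel} can be illustrated with a plain C model.
This sketch does not call the OpenCL API and does not describe any implementation; `toy_kernel_state` and `toy_clone` are hypothetical names, and the two argument fields are an assumed, simplified kernel signature.
[source,c]
----
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the state stored in a kernel object. */
typedef struct {
    float   *buffer_arg; /* pointer-like argument, e.g. an SVM pointer */
    int      scalar_arg; /* by-value argument */
    unsigned ref_count;
} toy_kernel_state;

/* Models cloning: argument values are copied, the data they refer to is
 * not, and the clone starts with the reference count of a newly created
 * kernel. The source is left unmodified. */
toy_kernel_state toy_clone(const toy_kernel_state *source)
{
    assert(source != NULL);
    toy_kernel_state clone = *source;
    clone.ref_count = 1;
    return clone;
}
----
After cloning, both objects refer to the same underlying allocation through `buffer_arg`, while a later change to `scalar_arg` on the clone leaves the source untouched.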
=== Kernel Object Queries
[open,refpage='clGetKernelInfo',desc='Returns information about the kernel object.',type='protos']
--
To return information about a kernel object, call the function
include::{generated}/api/protos/clGetKernelInfo.txt[]
include::{generated}/api/version-notes/clGetKernelInfo.asciidoc[]
* _kernel_ specifies the kernel object being queried.
* _param_name_ specifies the information to query.
The list of supported _param_name_ types and the information returned in
_param_value_ by {clGetKernelInfo} is described in the
<<kernel-info-table,Kernel Object Queries>> table.
* _param_value_ is a pointer to memory where the appropriate result being
queried is returned.
If _param_value_ is `NULL`, it is ignored.
* _param_value_size_ is used to specify the size in bytes of memory pointed to
by _param_value_.
This size must be {geq} size of return type as described in the
<<kernel-info-table,Kernel Object Queries>> table.
* _param_value_size_ret_ returns the actual size in bytes of data being
queried by _param_name_.
If _param_value_size_ret_ is `NULL`, it is ignored.
[[kernel-info-table]]
.List of supported param_names by <<clGetKernelInfo>>
[width="100%",cols="<33%,<17%,<50%",options="header"]
|====
| Kernel Info | Return Type | Description
| {CL_KERNEL_FUNCTION_NAME_anchor}
include::{generated}/api/version-notes/CL_KERNEL_FUNCTION_NAME.asciidoc[]
| {char_TYPE}[]
| Return the kernel function name.
| {CL_KERNEL_NUM_ARGS_anchor}
include::{generated}/api/version-notes/CL_KERNEL_NUM_ARGS.asciidoc[]
| {cl_uint_TYPE}
| Return the number of arguments to _kernel_.
| {CL_KERNEL_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}]
include::{generated}/api/version-notes/CL_KERNEL_REFERENCE_COUNT.asciidoc[]
| {cl_uint_TYPE}
| Return the _kernel_ reference count.
| {CL_KERNEL_CONTEXT_anchor}
include::{generated}/api/version-notes/CL_KERNEL_CONTEXT.asciidoc[]
| {cl_context_TYPE}
| Return the context associated with _kernel_.
| {CL_KERNEL_PROGRAM_anchor}
include::{generated}/api/version-notes/CL_KERNEL_PROGRAM.asciidoc[]
| {cl_program_TYPE}
| Return the program object associated with _kernel_.
| {CL_KERNEL_ATTRIBUTES_anchor}
include::{generated}/api/version-notes/CL_KERNEL_ATTRIBUTES.asciidoc[]
| {char_TYPE}[]
| Returns any attributes specified using the `+__attribute__+`
OpenCL C qualifier (or using the OpenCL {cpp} `+[[ ]]+` qualifier syntax)
with the kernel function declaration in the program source.
These attributes include attributes described in the earlier OpenCL
C kernel language specifications and other attributes supported by
an implementation.
Attributes are returned as they were declared inside
`+__attribute__((...))+`, with any surrounding whitespace and
embedded newlines removed.
When multiple attributes are present, they are returned as a single,
space delimited string.
For kernels not created from OpenCL C source using the
{clCreateProgramWithSource} API call, the string returned from this
query will be empty.
|====
// refError
{clGetKernelInfo} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
specified by _param_value_size_ is < size of return type as described in
the <<kernel-info-table,Kernel Object Queries>> table and _param_value_
is not `NULL`.
* {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
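The roles of _param_value_, _param_value_size_ and _param_value_size_ret_ lead to a common two-call idiom: first ask for the size, then fetch the value.
The sketch below models that contract in plain C rather than calling {clGetKernelInfo}; `toy_get_info` and the `TOY_*` codes are hypothetical stand-ins, and the numeric values are not the OpenCL error values.
[source,c]
----
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in status codes for this sketch only. */
enum { TOY_SUCCESS = 0, TOY_INVALID_VALUE = -1 };

/* Models the query contract: `result` / `result_size` play the role of the
 * value the implementation would report for a given param_name. */
int toy_get_info(const void *result, size_t result_size,
                 size_t param_value_size, void *param_value,
                 size_t *param_value_size_ret)
{
    assert(result != NULL);
    if (param_value != NULL) {
        /* A non-NULL destination must be large enough for the result. */
        if (param_value_size < result_size)
            return TOY_INVALID_VALUE;
        memcpy(param_value, result, result_size);
    }
    /* The size is reported whenever the caller asks for it. */
    if (param_value_size_ret != NULL)
        *param_value_size_ret = result_size;
    return TOY_SUCCESS;
}
----
A caller querying a variable-length result such as the kernel function name would first pass a `NULL` _param_value_ to learn the required size, allocate that many bytes, and then repeat the query with the allocated buffer.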
[open,refpage='clGetKernelWorkGroupInfo',desc='Returns information about the kernel object that may be specific to a device.',type='protos']
--
To return information about the kernel object that may be specific to a
device, call the function
include::{generated}/api/protos/clGetKernelWorkGroupInfo.txt[]
include::{generated}/api/version-notes/clGetKernelWorkGroupInfo.asciidoc[]
* _kernel_ specifies the kernel object being queried.
* _device_ identifies a specific device in the list of devices associated with
_kernel_.
The list of devices is the list of devices in the OpenCL context that is
associated with _kernel_.
If the list of devices associated with _kernel_ is a single device, _device_
can be a `NULL` value.
* _param_name_ specifies the information to query.
The list of supported _param_name_ types and the information returned in
_param_value_ by {clGetKernelWorkGroupInfo} is described in the
<<kernel-workgroup-info-table,Kernel Object Device Queries>> table.
* _param_value_ is a pointer to memory where the appropriate result being
queried is returned.
If _param_value_ is `NULL`, it is ignored.
* _param_value_size_ is used to specify the size in bytes of memory pointed to
by _param_value_.
This size must be {geq} size of return type as described in the
<<kernel-workgroup-info-table,Kernel Object Device Queries>> table.
* _param_value_size_ret_ returns the actual size in bytes of data being
queried by _param_name_.
If _param_value_size_ret_ is `NULL`, it is ignored.
[[kernel-workgroup-info-table]]
.List of supported param_names by <<clGetKernelWorkGroupInfo>>
[width="100%",cols="<33%,<17%,<50%",options="header"]
|====
| Kernel Work Group Info | Return Type | Description
| {CL_KERNEL_GLOBAL_WORK_SIZE_anchor}
include::{generated}/api/version-notes/CL_KERNEL_GLOBAL_WORK_SIZE.asciidoc[]
| {size_t_TYPE}[3]
| This provides a mechanism for the application to query the maximum
global size that can be used to execute a kernel (i.e. the
_global_work_size_ argument to {clEnqueueNDRangeKernel}) on a custom
device given by _device_ or a built-in kernel on an OpenCL device
given by _device_.
If _device_ is not a custom device and _kernel_ is not a built-in
kernel, {clGetKernelWorkGroupInfo} returns the error
{CL_INVALID_VALUE}.
| {CL_KERNEL_WORK_GROUP_SIZE_anchor}
include::{generated}/api/version-notes/CL_KERNEL_WORK_GROUP_SIZE.asciidoc[]
| {size_t_TYPE}
| This provides a mechanism for the application to query the maximum
work-group size that can be used to execute the kernel on a specific
device given by _device_.
The OpenCL implementation uses the resource requirements of the
kernel (register usage etc.) to determine what this work-group size
should be.
As a result, and unlike {CL_DEVICE_MAX_WORK_GROUP_SIZE}, this value may
vary from one kernel to another as well as from one device to another.
{CL_KERNEL_WORK_GROUP_SIZE} will be less than or equal to
{CL_DEVICE_MAX_WORK_GROUP_SIZE} for a given kernel object.
| {CL_KERNEL_COMPILE_WORK_GROUP_SIZE_anchor}
include::{generated}/api/version-notes/CL_KERNEL_COMPILE_WORK_GROUP_SIZE.asciidoc[]
| {size_t_TYPE}[3]
| Returns the work-group size specified in the kernel source or IL.
If the work-group size is not specified in the kernel source or IL,
(0, 0, 0) is returned.
| {CL_KERNEL_LOCAL_MEM_SIZE_anchor}
include::{generated}/api/version-notes/CL_KERNEL_LOCAL_MEM_SIZE.asciidoc[]
| {cl_ulong_TYPE}
| Returns the amount of local memory in bytes being used by a kernel.
This includes local memory that may be needed by an implementation
to execute the kernel, variables declared inside the kernel with the
`+__local+` address qualifier and local memory to be allocated for
arguments to the kernel declared as pointers with the `+__local+`
address qualifier and whose size is specified with {clSetKernelArg}.
If the local memory size, for any pointer argument to the kernel
declared with the `+__local+` address qualifier, is not specified,
its size is assumed to be 0.
| {CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE_anchor}
include::{generated}/api/version-notes/CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE.asciidoc[]
| {size_t_TYPE}
| Returns the preferred multiple of work-group size for launch.
This is a performance hint.
Specifying a work-group size that is not a multiple of the value
returned by this query as the value of the local work size argument
to {clEnqueueNDRangeKernel} will not fail to enqueue the kernel for
execution unless the work-group size specified is larger than the
device maximum.
| {CL_KERNEL_PRIVATE_MEM_SIZE_anchor}
include::{generated}/api/version-notes/CL_KERNEL_PRIVATE_MEM_SIZE.asciidoc[]
| {cl_ulong_TYPE}
| Returns the minimum amount of private memory, in bytes, used by each
work-item in the kernel.
This value may include any private memory needed by an
implementation to execute the kernel, including that used by the
language built-ins and variables declared inside the kernel with the
`+__private+` qualifier.
|====
// refError
{clGetKernelWorkGroupInfo} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_DEVICE} if _device_ is not in the list of devices associated
with _kernel_ or if _device_ is `NULL` but there is more than one device
associated with _kernel_.
* {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
specified by _param_value_size_ is < size of return type as described in
the <<kernel-workgroup-info-table,Kernel Object Device Queries>> table
and _param_value_ is not `NULL`.
* {CL_INVALID_VALUE} if _param_name_ is {CL_KERNEL_GLOBAL_WORK_SIZE} and
_device_ is not a custom device and _kernel_ is not a built-in kernel.
* {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
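One way an application might combine the results of {CL_KERNEL_WORK_GROUP_SIZE} and {CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE} is to choose the largest multiple of the preferred value that still fits within the kernel's maximum.
The helper below is a sketch of that policy in plain C, with the two query results passed in as parameters; `toy_pick_local_size` is a hypothetical name and the policy itself is an assumption, not something the specification requires.
[source,c]
----
#include <assert.h>
#include <stddef.h>

/* max_wg_size:        value reported for CL_KERNEL_WORK_GROUP_SIZE
 * preferred_multiple: value reported for
 *                     CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE
 * Returns the largest multiple of preferred_multiple that does not exceed
 * max_wg_size, or max_wg_size itself when no whole multiple fits. */
size_t toy_pick_local_size(size_t max_wg_size, size_t preferred_multiple)
{
    assert(max_wg_size > 0 && preferred_multiple > 0);
    if (preferred_multiple > max_wg_size)
        return max_wg_size;
    return (max_wg_size / preferred_multiple) * preferred_multiple;
}
----
Because the preferred multiple is only a performance hint, falling back to a size that is not a multiple remains valid.
The result would not be appropriate for a kernel whose {CL_KERNEL_COMPILE_WORK_GROUP_SIZE} query returns a non-zero size, since such a kernel declares its own work-group size.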
[open,refpage='clGetKernelSubGroupInfo',desc='Returns information about the kernel object.',type='protos']
--
To return information about a kernel object, call the function
include::{generated}/api/protos/clGetKernelSubGroupInfo.txt[]
include::{generated}/api/version-notes/clGetKernelSubGroupInfo.asciidoc[]
Also see extension *cl_khr_subgroups*.
* _kernel_ specifies the kernel object being queried.
* _device_ identifies a specific device in the list of devices associated with
_kernel_.
The list of devices is the list of devices in the OpenCL context that is
associated with _kernel_.
If the list of devices associated with _kernel_ is a single device, _device_
can be a `NULL` value.
* _param_name_ specifies the information to query.
The list of supported _param_name_ types and the information returned in
_param_value_ by {clGetKernelSubGroupInfo} is described in the
<<kernel-subgroup-info-table,Kernel Object Subgroup Queries>> table.
* _input_value_size_ is used to specify the size in bytes of memory pointed to
by _input_value_.
This size must be equal to the size of the input type as described in the
<<kernel-subgroup-info-table,Kernel Object Subgroup Queries>> table.
* _input_value_ is a pointer to memory where the appropriate parameterization
of the query is passed from.
If _input_value_ is `NULL`, it is ignored.
* _param_value_ is a pointer to memory where the appropriate result being
queried is returned.
If _param_value_ is `NULL`, it is ignored.
* _param_value_size_ is used to specify the size in bytes of memory pointed to
by _param_value_.
This size must be {geq} size of return type as described in the
<<kernel-subgroup-info-table,Kernel Object Subgroup Queries>> table.
* _param_value_size_ret_ returns the actual size in bytes of data being
queried by _param_name_.
If _param_value_size_ret_ is `NULL`, it is ignored.
[[kernel-subgroup-info-table]]
.List of supported param_names by <<clGetKernelSubGroupInfo>>
[width="100%",cols="<25%,<25%,<25%,<25%",options="header"]
|====
| Kernel Subgroup Info | Input Type | Return Type | Description
| {CL_KERNEL_MAX_SUB_GROUP_SIZE_FOR_NDRANGE_anchor}
include::{generated}/api/version-notes/CL_KERNEL_MAX_SUB_GROUP_SIZE_FOR_NDRANGE.asciidoc[]
Also see extension *cl_khr_subgroups*.
| {size_t_TYPE}*
| {size_t_TYPE}
| Returns the maximum sub-group size for this kernel.
All sub-groups must be the same size, while the last sub-group in
any work-group (i.e. the sub-group with the maximum index) could
be the same or smaller size.
The _input_value_ must be an array of {size_t_TYPE} values
corresponding to the local work size parameter of the intended
dispatch.
The number of dimensions in the ND-range will be inferred from
the value specified for _input_value_size_.
| {CL_KERNEL_SUB_GROUP_COUNT_FOR_NDRANGE_anchor}
include::{generated}/api/version-notes/CL_KERNEL_SUB_GROUP_COUNT_FOR_NDRANGE.asciidoc[]
Also see extension *cl_khr_subgroups*.
| {size_t_TYPE}*
| {size_t_TYPE}
| Returns the number of sub-groups that will be present in each
work-group for a given local work size.
All workgroups, apart from the last work-group in each dimension
in the presence of non-uniform work-group sizes, will have the
same number of sub-groups.
The _input_value_ must be an array of {size_t_TYPE} values
corresponding to the local work size parameter of the intended
dispatch.
The number of dimensions in the ND-range will be inferred from
the value specified for _input_value_size_.
| {CL_KERNEL_LOCAL_SIZE_FOR_SUB_GROUP_COUNT_anchor}
include::{generated}/api/version-notes/CL_KERNEL_LOCAL_SIZE_FOR_SUB_GROUP_COUNT.asciidoc[]
Also see extension *cl_khr_subgroups*.
| {size_t_TYPE}
| {size_t_TYPE}[]
| Returns the local size that will generate the requested number
of sub-groups for the kernel.
The output array must be an array of {size_t_TYPE} values corresponding
to the local size parameter.
Any returned work-group will have one dimension.
Other dimensions inferred from the value specified for
_param_value_size_ will be filled with the value 1.
The returned value will produce an exact number of sub-groups
and result in no partial groups for an executing kernel except
in the case where the last work-group in a dimension has a size
different from that of the other groups.
If no work-group size can accommodate the requested number of
sub-groups, 0 will be returned in each element of the return
array.
| {CL_KERNEL_MAX_NUM_SUB_GROUPS_anchor}
include::{generated}/api/version-notes/CL_KERNEL_MAX_NUM_SUB_GROUPS.asciidoc[]
Also see extension *cl_khr_subgroups*.
| ignored
| {size_t_TYPE}
| This provides a mechanism for the application to query the
maximum number of sub-groups that may make up each work-group to
execute a kernel on the specific device given by _device_.
The OpenCL implementation uses the resource requirements of the
kernel (register usage etc.) to determine what this work-group
size should be.
The returned value may be used to compute a work-group size to
enqueue the kernel with to give a round number of sub-groups for
an enqueue.
| {CL_KERNEL_COMPILE_NUM_SUB_GROUPS_anchor}
include::{generated}/api/version-notes/CL_KERNEL_COMPILE_NUM_SUB_GROUPS.asciidoc[]
Also see extension *cl_khr_subgroups*.
| ignored
| {size_t_TYPE}
| Returns the number of sub-groups per work-group specified in the kernel
source or IL. If the sub-group count is not specified then 0 is returned.
|====
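
As a non-normative illustration of the two ND-range queries above, the sketch below shows how the number of dimensions follows from _input_value_size_ and how a sub-group count could be estimated on the host.
The helper names and the `max_dims` parameter are hypothetical, and the estimate assumes that every sub-group except possibly the last holds exactly the maximum sub-group size; the authoritative values are those returned by {clGetKernelSubGroupInfo}.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of ND-range dimensions implied by
 * input_value_size, or 0 if the size is not a whole number of size_t
 * values between 1 and max_dims. */
static unsigned dims_from_input_size(size_t input_value_size, unsigned max_dims)
{
    if (input_value_size == 0 || input_value_size % sizeof(size_t) != 0)
        return 0;
    size_t n = input_value_size / sizeof(size_t);
    return (n <= max_dims) ? (unsigned)n : 0;
}

/* Hypothetical estimate of the sub-group count of one work-group.
 * ASSUMPTION: all sub-groups except possibly the last contain exactly
 * max_sub_group_size work-items. */
static size_t estimate_sub_group_count(const size_t *local, unsigned dims,
                                       size_t max_sub_group_size)
{
    assert(local != NULL && dims >= 1 && max_sub_group_size > 0);
    size_t total = 1;
    for (unsigned i = 0; i < dims; ++i)
        total *= local[i];
    /* Round up so that a smaller trailing sub-group is counted. */
    return (total + max_sub_group_size - 1) / max_sub_group_size;
}
```

For a local work size of (8, 4), `sizeof` of the two-element array gives a two-dimensional query, and with a maximum sub-group size of 16 the estimate is two sub-groups per work-group.
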
// refError
{clGetKernelSubGroupInfo} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_DEVICE} if _device_ is not in the list of devices associated
with _kernel_ or if _device_ is `NULL` but there is more than one device
associated with _kernel_.
* {CL_INVALID_OPERATION} if _device_ does not support subgroups.
* {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
specified by _param_value_size_ is < size of return type as described in
the <<kernel-subgroup-info-table,Kernel Object Subgroup Queries>> table
and _param_value_ is not `NULL`.
* {CL_INVALID_VALUE} if _param_name_ is
{CL_KERNEL_MAX_SUB_GROUP_SIZE_FOR_NDRANGE} and the size in bytes specified by
_input_value_size_ is not valid or if _input_value_ is `NULL`.
* {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clGetKernelArgInfo',desc='Returns information about the arguments of a kernel.',type='protos']
--
To return information about the arguments of a kernel, call the function
include::{generated}/api/protos/clGetKernelArgInfo.txt[]
include::{generated}/api/version-notes/clGetKernelArgInfo.asciidoc[]
* _kernel_ specifies the kernel object being queried.
* _arg_index_ is the argument index.
Arguments to the kernel are referred to by indices that go from 0 for the
leftmost argument to _n_ - 1, where _n_ is the total number of arguments
declared by a kernel.
* _param_name_ specifies the argument information to query.
The list of supported _param_name_ types and the information returned in
_param_value_ by {clGetKernelArgInfo} is described in the
<<kernel-argument-info-table,Kernel Argument Queries>> table.
* _param_value_ is a pointer to memory where the appropriate result being
queried is returned.
If _param_value_ is `NULL`, it is ignored.
* _param_value_size_ is used to specify the size in bytes of memory pointed to
by _param_value_.
This size must be {geq} size of return type as described in the
<<kernel-argument-info-table,Kernel Argument Queries>> table.
* _param_value_size_ret_ returns the actual size in bytes of data being
queried by _param_name_.
If _param_value_size_ret_ is `NULL`, it is ignored.
Kernel argument information is only available if the program object
associated with _kernel_ is created with {clCreateProgramWithSource} and the
program executable was built with the `-cl-kernel-arg-info` option specified
in the _options_ argument to {clBuildProgram} or {clCompileProgram}.
[[kernel-argument-info-table]]
.List of supported param_names by <<clGetKernelArgInfo>>
[width="100%",cols="<33%,<17%,<50%",options="header"]
|====
| Kernel Arg Info | Return Type | Description
| {CL_KERNEL_ARG_ADDRESS_QUALIFIER_anchor}
include::{generated}/api/version-notes/CL_KERNEL_ARG_ADDRESS_QUALIFIER.asciidoc[]
| {cl_kernel_arg_address_qualifier_TYPE}
| Returns the address qualifier specified for the argument given by
_arg_index_.
This can be one of the following values:
{CL_KERNEL_ARG_ADDRESS_GLOBAL_anchor} +
{CL_KERNEL_ARG_ADDRESS_LOCAL_anchor} +
{CL_KERNEL_ARG_ADDRESS_CONSTANT_anchor} +
{CL_KERNEL_ARG_ADDRESS_PRIVATE_anchor}
If no address qualifier is specified, the default address qualifier
which is {CL_KERNEL_ARG_ADDRESS_PRIVATE} is returned.
| {CL_KERNEL_ARG_ACCESS_QUALIFIER_anchor}
include::{generated}/api/version-notes/CL_KERNEL_ARG_ACCESS_QUALIFIER.asciidoc[]
| {cl_kernel_arg_access_qualifier_TYPE}
| Returns the access qualifier specified for the argument given by
_arg_index_.
This can be one of the following values:
{CL_KERNEL_ARG_ACCESS_READ_ONLY_anchor} +
{CL_KERNEL_ARG_ACCESS_WRITE_ONLY_anchor} +
{CL_KERNEL_ARG_ACCESS_READ_WRITE_anchor} +
{CL_KERNEL_ARG_ACCESS_NONE_anchor}
If argument is not an image type and is not declared with the pipe
qualifier, {CL_KERNEL_ARG_ACCESS_NONE} is returned.
If argument is an image type, the access qualifier specified or the
default access qualifier is returned.
| {CL_KERNEL_ARG_TYPE_NAME_anchor}
include::{generated}/api/version-notes/CL_KERNEL_ARG_TYPE_NAME.asciidoc[]
| {char_TYPE}[]
| Returns the type name specified for the argument given by
_arg_index_.
The type name returned will be the argument type name as it was
declared with any whitespace removed.
If argument type name is an unsigned scalar type (i.e. unsigned
char, unsigned short, unsigned int, unsigned long), uchar, ushort,
uint and ulong will be returned.
The argument type name returned does not include any type
qualifiers.
| {CL_KERNEL_ARG_TYPE_QUALIFIER_anchor}
include::{generated}/api/version-notes/CL_KERNEL_ARG_TYPE_QUALIFIER.asciidoc[]
| {cl_kernel_arg_type_qualifier_TYPE}
| Returns a bitfield describing one or more type qualifiers specified
for the argument given by _arg_index_.
The returned values can be:
{CL_KERNEL_ARG_TYPE_CONST_anchor}
footnote:[{fn-kernel-arg-type-qualifier}]
footnote:[{fn-kernel-arg-type-const-addr-space}] +
{CL_KERNEL_ARG_TYPE_RESTRICT_anchor} +
{CL_KERNEL_ARG_TYPE_VOLATILE_anchor} +
{CL_KERNEL_ARG_TYPE_PIPE_anchor}, or +
{CL_KERNEL_ARG_TYPE_NONE_anchor}
{CL_KERNEL_ARG_TYPE_NONE} is returned for all parameters passed by
value.
| {CL_KERNEL_ARG_NAME_anchor}
include::{generated}/api/version-notes/CL_KERNEL_ARG_NAME.asciidoc[]
| {char_TYPE}[]
| Returns the name specified for the argument given by _arg_index_.
|====
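
Because {CL_KERNEL_ARG_TYPE_QUALIFIER} returns a bitfield, more than one qualifier may be set at once.
The following non-normative sketch decodes such a value into text.
The `ARG_TYPE_*` constants are local stand-ins that mirror the bit values of the corresponding `CL_KERNEL_ARG_TYPE_*` enums in the OpenCL headers, and `describe_type_qualifier` is a hypothetical helper, not part of the API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-ins mirroring the CL_KERNEL_ARG_TYPE_* bit values. */
enum {
    ARG_TYPE_NONE     = 0,
    ARG_TYPE_CONST    = 1 << 0,
    ARG_TYPE_RESTRICT = 1 << 1,
    ARG_TYPE_VOLATILE = 1 << 2,
    ARG_TYPE_PIPE     = 1 << 3
};

/* Hypothetical helper: writes a space-separated list of the qualifiers
 * set in bits into out (truncating if out is too small) and returns out. */
static const char *describe_type_qualifier(unsigned long long bits,
                                           char *out, size_t out_size)
{
    static const struct { unsigned long long bit; const char *name; } names[] = {
        { ARG_TYPE_CONST,    "const"    },
        { ARG_TYPE_RESTRICT, "restrict" },
        { ARG_TYPE_VOLATILE, "volatile" },
        { ARG_TYPE_PIPE,     "pipe"     }
    };
    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    if (bits == ARG_TYPE_NONE) {
        strncat(out, "none", out_size - 1);
        return out;
    }
    for (size_t i = 0; i < sizeof names / sizeof names[0]; ++i) {
        if (bits & names[i].bit) {
            if (out[0] != '\0')
                strncat(out, " ", out_size - strlen(out) - 1);
            strncat(out, names[i].name, out_size - strlen(out) - 1);
        }
    }
    return out;
}
```

An application would pass the value obtained from {clGetKernelArgInfo} as `bits`; a value with the const and restrict bits set is rendered as `const restrict`.
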
{clGetKernelArgInfo} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_ARG_INDEX} if _arg_index_ is not a valid argument index.
* {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
specified by _param_value_size_ is < size of return type as described in
the <<kernel-argument-info-table,Kernel Argument Queries>> table and
_param_value_ is not `NULL`.
* {CL_KERNEL_ARG_INFO_NOT_AVAILABLE} if the argument information is not
available for kernel.
* {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
--
== Executing Kernels
[open,refpage='clEnqueueNDRangeKernel',desc='Enqueues a command to execute a kernel on a device.',type='protos']
--
To enqueue a command to execute a kernel on a device, call the function
include::{generated}/api/protos/clEnqueueNDRangeKernel.txt[]
include::{generated}/api/version-notes/clEnqueueNDRangeKernel.asciidoc[]
* _command_queue_ is a valid host command-queue.
The kernel will be queued for execution on the device associated with
_command_queue_.
* _kernel_ is a valid kernel object.
The OpenCL context associated with _kernel_ and _command_queue_ must be the
same.
* _work_dim_ is the number of dimensions used to specify the global work-items
and work-items in the work-group.
_work_dim_ must be greater than zero and less than or equal to
{CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS}.
If _global_work_size_ is `NULL`, or the value in any passed dimension is 0
then the kernel command will trivially succeed after its event dependencies
are satisfied and subsequently update its completion event.
The behavior in this situation is similar to that of an enqueued marker,
except that unlike a marker, an enqueued kernel with no events passed to
_event_wait_list_ may run at any time.
* _global_work_offset_ can be used to specify an array of _work_dim_ unsigned
values that describe the offset used to calculate the global ID of a
work-item.
If _global_work_offset_ is `NULL`, the global IDs start at offset (0, 0, 0).
_global_work_offset_ must be `NULL` <<unified-spec, before>> version 1.1.
* _global_work_size_ points to an array of _work_dim_ unsigned values that
describe the number of global work-items in _work_dim_ dimensions that will
execute the kernel function.
The total number of global work-items is computed as _global_work_size_[0]
{times} ... {times} _global_work_size_[_work_dim_ - 1].
* _local_work_size_ points to an array of _work_dim_ unsigned values that
describe the number of work-items that make up a work-group (also referred
to as the size of the work-group) that will execute the kernel specified by
_kernel_.
The total number of work-items in a work-group is computed as
_local_work_size_[0] {times} ... {times} _local_work_size_[_work_dim_ - 1].
The total number of work-items in the work-group must be less than or equal
to the {CL_KERNEL_WORK_GROUP_SIZE} value specified in the
<<kernel-workgroup-info-table,Kernel Object Device Queries>> table, and the
number of work-items specified in _local_work_size_[0], ...,
_local_work_size_[_work_dim_ - 1] must be less than or equal to the
corresponding values specified by {CL_DEVICE_MAX_WORK_ITEM_SIZES}[0], ...,
{CL_DEVICE_MAX_WORK_ITEM_SIZES}[_work_dim_ - 1].
The explicitly specified _local_work_size_ will be used to determine how to
break the global work-items specified by _global_work_size_ into appropriate
work-group instances.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this command and
can be used to query or wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
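
The two limits on _local_work_size_ described above can be checked on the host before enqueuing.
The following is a minimal, non-normative sketch; the function and enumerator names are hypothetical, and the limits are assumed to have been queried beforehand ({CL_DEVICE_MAX_WORK_ITEM_SIZES} from the device and {CL_KERNEL_WORK_GROUP_SIZE} from the kernel).
It does not guard against overflow of the product.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    LOCAL_OK,
    LOCAL_ITEM_TOO_LARGE,   /* would map to CL_INVALID_WORK_ITEM_SIZE  */
    LOCAL_GROUP_TOO_LARGE   /* would map to CL_INVALID_WORK_GROUP_SIZE */
} local_check;

/* Hypothetical pre-check of an explicitly specified local work size. */
static local_check check_local_work_size(unsigned work_dim,
                                         const size_t *local,
                                         const size_t *max_item_sizes,
                                         size_t kernel_work_group_size)
{
    assert(work_dim >= 1 && local != NULL && max_item_sizes != NULL);
    size_t total = 1;
    for (unsigned i = 0; i < work_dim; ++i) {
        /* Per-dimension limit. */
        if (local[i] > max_item_sizes[i])
            return LOCAL_ITEM_TOO_LARGE;
        total *= local[i];
    }
    /* Limit on the total number of work-items in the work-group. */
    return (total <= kernel_work_group_size) ? LOCAL_OK : LOCAL_GROUP_TOO_LARGE;
}
```

With per-dimension limits of (1024, 1024, 64) and a kernel work-group limit of 256, a local size of (16, 16, 1) passes, while (32, 16, 1) exceeds the total and (1, 1, 128) exceeds the limit for the third dimension.
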
An ND-range kernel command may require uniform work-groups or may support non-uniform work-groups.
To support non-uniform work-groups:
. The device associated with _command_queue_ must support non-uniform work-groups.
. The program object associated with _kernel_ must support non-uniform work-groups.
Specifically, this means:
.. If the program was created with {clCreateProgramWithSource}, the program must be compiled or built using the `-cl-std=CL2.0` or `-cl-std=CL3.0` build option and without the `-cl-uniform-work-group-size` build option.
.. If the program was created with {clCreateProgramWithIL} or {clCreateProgramWithBinary}, the program must be compiled or built without the `-cl-uniform-work-group-size` build options.
.. If the program was created using {clLinkProgram}, all input programs must support non-uniform work-groups.
If non-uniform work-groups are supported, any single dimension
for which the global size is not divisible by the local size will be
partitioned into two regions.
One region will have work-groups that have the same number of work-items as
was specified by the local size parameter in that dimension.
The other region will have work-groups with fewer than the number of
work-items specified by the local size parameter in that dimension.
The global IDs and group IDs of the work-items in the first region will be
numerically lower than those in the second, and the second region will be at
most one work-group wide in that dimension.
Work-group sizes could be non-uniform in multiple dimensions, potentially
producing work-groups of up to 4 different sizes in a 2D range and 8
different sizes in a 3D range.
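
The partitioning described above can be reproduced with integer division.
The following non-normative sketch uses hypothetical helper names; it splits one dimension into full work-groups plus a remainder and counts how many distinct work-group sizes result across all dimensions.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t full_groups;  /* work-groups with exactly `local` work-items     */
    size_t remainder;    /* size of the single trailing work-group, or 0    */
} dim_partition;

/* Split one dimension of the global range by the local size. */
static dim_partition partition_dimension(size_t global, size_t local)
{
    assert(local > 0);
    dim_partition p = { global / local, global % local };
    return p;
}

/* Number of distinct work-group sizes produced by the ND-range.
 * Each dimension that has both full groups and a remainder doubles it. */
static unsigned distinct_work_group_sizes(unsigned work_dim,
                                          const size_t *global,
                                          const size_t *local)
{
    unsigned count = 1;
    for (unsigned i = 0; i < work_dim; ++i) {
        dim_partition p = partition_dimension(global[i], local[i]);
        if (p.full_groups != 0 && p.remainder != 0)
            count *= 2;
    }
    return count;
}
```

For a global size of (10, 7) and a local size of (4, 4), both dimensions have a remainder, giving the four sizes 4x4, 2x4, 4x3 and 2x3.
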
If non-uniform work-groups are supported and _local_work_size_ is `NULL`, the OpenCL runtime may choose a uniform or non-uniform work-group size.
Otherwise, when non-uniform work-groups are not supported, the size of each work-group must be uniform.
If _local_work_size_ is specified, the values specified in _global_work_size_[0], ..., _global_work_size_[_work_dim_ - 1] must be evenly divisible by the corresponding values specified in _local_work_size_[0], ..., _local_work_size_[_work_dim_ - 1].
If _local_work_size_ is `NULL`, the OpenCL runtime must choose a uniform work-group size.
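
When work-groups must be uniform and the problem size is not a multiple of the desired local size, a common host-side approach is to pad the global size up to the next multiple.
This is a non-normative sketch with a hypothetical helper name; it assumes the kernel itself ignores work-items whose global ID lies beyond the real problem size.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest multiple of `local` that is greater than or equal to `n`.
 * ASSUMPTION: the kernel guards against the padded work-items. */
static size_t round_up_global(size_t n, size_t local)
{
    assert(local > 0);
    size_t rem = n % local;
    return (rem == 0) ? n : n + (local - rem);
}
```

For example, 1000 work-items with a local size of 64 would be enqueued as a global size of 1024, which is evenly divisible as required.
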
The work-group size to be used for _kernel_ can also be specified in the
program source or intermediate language.
In this case the size of work-group specified by _local_work_size_ must
match the value specified in the program source.
These work-group instances are executed in parallel across multiple compute
units or concurrently on the same compute unit.
Each work-item is uniquely identified by a global identifier.
The global ID, which can be read inside the kernel, is computed using the
value given by _global_work_size_ and _global_work_offset_.
In addition, a work-item is also identified within a work-group by a unique
local ID.
The local ID, which can also be read by the kernel, is computed using the
value given by _local_work_size_.
The starting local ID is always (0, 0, ..., 0).
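
The relationship between these identifiers in a single dimension can be written out as a small, non-normative sketch.
The function name is hypothetical; `enqueued_local_size` stands for the local size given at enqueue time, which is what makes the expression hold for a trailing non-uniform work-group as well.

```c
#include <assert.h>
#include <stddef.h>

/* Global ID of a work-item in one dimension, computed from the global
 * offset, the ID of its work-group and its local ID in that group. */
static size_t global_id_1d(size_t global_offset, size_t group_id,
                           size_t enqueued_local_size, size_t local_id)
{
    assert(local_id < enqueued_local_size);
    return global_offset + group_id * enqueued_local_size + local_id;
}
```

With an offset of 100 and a local size of 16, the work-item with local ID 5 in work-group 3 has global ID 153.
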
// refError
{clEnqueueNDRangeKernel} returns {CL_SUCCESS} if the kernel-instance was
successfully queued.
Otherwise, it returns one of the following errors:
* {CL_INVALID_PROGRAM_EXECUTABLE} if there is no successfully built program
executable available for device associated with _command_queue_.
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
* {CL_INVALID_CONTEXT} if context associated with _command_queue_ and
_kernel_ are not the same or if the context associated with
_command_queue_ and events in _event_wait_list_ are not the same.
* {CL_INVALID_KERNEL_ARGS} if the kernel argument values have not been
specified.
* {CL_INVALID_WORK_DIMENSION} if _work_dim_ is not a valid value (i.e. a
value between 1 and {CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS}).
* {CL_INVALID_GLOBAL_WORK_SIZE} if _global_work_size_ is `NULL` or if any of
the values specified in _global_work_size_[0], ...,
_global_work_size_[_work_dim_ - 1] are 0.
Returning this error code under these circumstances is <<unified-spec,
deprecated by>> version 2.1.
* {CL_INVALID_GLOBAL_WORK_SIZE} if any of the values specified in
_global_work_size_[0], ... _global_work_size_[_work_dim_ - 1] exceed the
maximum value representable by {size_t_TYPE} on the device on which the
kernel-instance will be enqueued.
* {CL_INVALID_GLOBAL_OFFSET} if the value specified in _global_work_size_
{plus} the corresponding values in _global_work_offset_ for any
dimensions is greater than the maximum value representable by {size_t_TYPE} on
the device on which the kernel-instance will be enqueued, or if
_global_work_offset_ is non-`NULL` <<unified-spec, before>> version 1.1.
* {CL_INVALID_WORK_GROUP_SIZE} if _local_work_size_ is specified and does
not match the required work-group size for _kernel_ in the program
source.
* {CL_INVALID_WORK_GROUP_SIZE} if _local_work_size_ is specified and is not
consistent with the required number of sub-groups for _kernel_ in the
program source.
* {CL_INVALID_WORK_GROUP_SIZE} if _local_work_size_ is specified and the
total number of work-items in the work-group computed as
_local_work_size_[0] {times} ... {times} _local_work_size_[_work_dim_ - 1] is
greater than the value specified by {CL_KERNEL_WORK_GROUP_SIZE} in the
<<kernel-workgroup-info-table,Kernel Object Device Queries>> table.
* {CL_INVALID_WORK_GROUP_SIZE} if the work-group size must be uniform and
the _local_work_size_ is not `NULL`, is not equal to the required
work-group size specified in the kernel source, or the
_global_work_size_ is not evenly divisible by the _local_work_size_.
* {CL_INVALID_WORK_ITEM_SIZE} if the number of work-items specified in any
of _local_work_size_[0], ... _local_work_size_[_work_dim_ - 1] is
greater than the corresponding values specified by
{CL_DEVICE_MAX_WORK_ITEM_SIZES}[0], ...,
{CL_DEVICE_MAX_WORK_ITEM_SIZES}[_work_dim_ - 1].
* {CL_MISALIGNED_SUB_BUFFER_OFFSET} if a sub-buffer object is specified as
the value for an argument that is a buffer object and the _offset_
specified when the sub-buffer object is created is not aligned to
{CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for device associated with _command_queue_.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_INVALID_IMAGE_SIZE} if an image object is specified as an argument
value and the image dimensions (image width, height, specified or
compute row and/or slice pitch) are not supported by device associated
with _command_queue_.
* {CL_IMAGE_FORMAT_NOT_SUPPORTED} if an image object is specified as an
argument value and the image format (image channel order and data type)
is not supported by device associated with _command_queue_.
* {CL_OUT_OF_RESOURCES} if there is a failure to queue the execution
instance of _kernel_ on the command-queue because of insufficient
resources needed to execute the kernel.
For example, the explicitly specified _local_work_size_ causes a failure
to execute the kernel because of insufficient resources such as
registers or local memory.
Another example is when the number of read-only image args used in
_kernel_ exceeds the {CL_DEVICE_MAX_READ_IMAGE_ARGS} value for the device,
the number of write-only and read-write image args used in _kernel_
exceeds the {CL_DEVICE_MAX_READ_WRITE_IMAGE_ARGS} value for the device, or
the number of samplers used in _kernel_ exceeds {CL_DEVICE_MAX_SAMPLERS}
for the device.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for data store associated with image or buffer objects specified
as arguments to _kernel_.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_INVALID_OPERATION} if SVM pointers are passed as arguments to a kernel
and the device does not support SVM or if system pointers are passed as
arguments to a kernel and/or stored inside SVM allocations passed as
kernel arguments and the device does not support fine grain system SVM
allocations.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clEnqueueTask',desc='Enqueues a command to execute a kernel, using a single work-item, on a device.',type='protos']
--
To enqueue a command to execute a kernel on a device, using a single work-item,
call the function
include::{generated}/api/protos/clEnqueueTask.txt[]
include::{generated}/api/version-notes/clEnqueueTask.asciidoc[]
* _command_queue_ is a valid host command-queue.
The kernel will be queued for execution on the device associated with
_command_queue_.
* _kernel_ is a valid kernel object.
The OpenCL context associated with _kernel_ and _command_queue_ must be the
same.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
If _event_wait_list_ is `NULL`, then this particular command does not wait
on any event to complete.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this command and
can be used to query or wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
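
The pairing rules for _event_wait_list_, _num_events_in_wait_list_ and _event_ described above (and shared with {clEnqueueNDRangeKernel}) can be summarized in a short, non-normative sketch.
`event_handle` is a hypothetical stand-in for an event object handle, and the helper only checks how the arguments relate to each other; it does not check that the events themselves are valid.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an event object handle. */
typedef struct fake_event *event_handle;

/* Returns 1 if the argument combination follows the rules above. */
static int wait_list_args_valid(const event_handle *event_wait_list,
                                unsigned num_events_in_wait_list,
                                const event_handle *event)
{
    if (event_wait_list == NULL)
        return num_events_in_wait_list == 0;
    if (num_events_in_wait_list == 0)
        return 0;
    /* The returned event must not refer to an element of the wait list. */
    for (unsigned i = 0; i < num_events_in_wait_list; ++i)
        if (event == &event_wait_list[i])
            return 0;
    return 1;
}

/* Usage: the valid and invalid combinations. */
static void wait_list_examples(void)
{
    event_handle list[2] = { NULL, NULL };
    event_handle out = NULL;
    assert(wait_list_args_valid(NULL, 0, &out));       /* no dependencies    */
    assert(!wait_list_args_valid(NULL, 1, &out));      /* count without list */
    assert(!wait_list_args_valid(list, 0, &out));      /* list without count */
    assert(wait_list_args_valid(list, 2, &out));       /* list and count     */
    assert(!wait_list_args_valid(list, 2, &list[1]));  /* event aliases list */
}
```

The first two invalid combinations correspond to the conditions listed for {CL_INVALID_EVENT_WAIT_LIST}.
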
{clEnqueueTask} is equivalent to calling {clEnqueueNDRangeKernel} with
_work_dim_ set to 1, _global_work_offset_ set to `NULL`, _global_work_size[0]_
set to 1, and _local_work_size[0]_ set to 1.
// refError
{clEnqueueTask} returns {CL_SUCCESS} if the kernel-instance was successfully
queued.
Otherwise, it returns one of the following errors:
* {CL_INVALID_PROGRAM_EXECUTABLE} if there is no successfully built program
executable available for device associated with _command_queue_.
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
* {CL_INVALID_CONTEXT} if context associated with _command_queue_ and
_kernel_ are not the same or if the context associated with
_command_queue_ and events in _event_wait_list_ are not the same.
* {CL_INVALID_KERNEL_ARGS} if the kernel argument values have not been
specified.
* {CL_INVALID_WORK_GROUP_SIZE} if a work-group size is specified for _kernel_
in the program source and it is not (1, 1, 1).
// TODO I'm not sure if the next error makes sense for a 'task'.
* {CL_INVALID_WORK_GROUP_SIZE} if the required number of sub-groups is
specified for _kernel_ in the program source and is not consistent with a
work-group size of (1, 1, 1).
* {CL_MISALIGNED_SUB_BUFFER_OFFSET} if a sub-buffer object is specified as
the value for an argument that is a buffer object and the _offset_
specified when the sub-buffer object is created is not aligned to
{CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for device associated with _command_queue_.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_INVALID_IMAGE_SIZE} if an image object is specified as an argument
value and the image dimensions (image width, height, specified or
compute row and/or slice pitch) are not supported by device associated
with _command_queue_.
* {CL_IMAGE_FORMAT_NOT_SUPPORTED} if an image object is specified as an
argument value and the image format (image channel order and data type)
is not supported by device associated with _command_queue_.
* {CL_OUT_OF_RESOURCES} if there is a failure to queue the execution
instance of _kernel_ on the command-queue because of insufficient
resources needed to execute the kernel. See how this error code is used
with {clEnqueueNDRangeKernel} for examples.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for data store associated with image or buffer objects specified
as arguments to _kernel_.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_INVALID_OPERATION} if SVM pointers are passed as arguments to a kernel
and the device does not support SVM or if system pointers are passed as
arguments to a kernel and/or stored inside SVM allocations passed as
kernel arguments and the device does not support fine grain system SVM
allocations.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clEnqueueNativeKernel',desc='Enqueues a command to execute a native C/C++ function not compiled using the OpenCL compiler.',type='protos']
--
To enqueue a command to execute a native C/{cpp} function not compiled using
the OpenCL compiler, call the function
include::{generated}/api/protos/clEnqueueNativeKernel.txt[]
include::{generated}/api/version-notes/clEnqueueNativeKernel.asciidoc[]
* _command_queue_ is a valid host command-queue.
A native user function can only be executed on a command-queue created on a
device that has {CL_EXEC_NATIVE_KERNEL} capability set in
{CL_DEVICE_EXECUTION_CAPABILITIES} as specified in the
<<device-queries-table,Device Queries>> table.
* _user_func_ is a pointer to a host-callable user function.
* _args_ is a pointer to the args list that _user_func_ should be called with.
* _cb_args_ is the size in bytes of the args list that _args_ points to.
* _num_mem_objects_ is the number of buffer objects that are passed in _args_.
* _mem_list_ is a list of valid buffer objects, if _num_mem_objects_ > 0.
The buffer object values specified in _mem_list_ are memory object handles
(`{cl_mem_TYPE}` values) returned by {clCreateBuffer} or {clCreateBufferWithProperties},
or `NULL`.
* _args_mem_loc_ is a pointer to appropriate locations that _args_ points to
where memory object handles ({cl_mem_TYPE} values) are stored.
Before the user function is executed, the memory object handles are replaced
by pointers to global memory.
* _event_wait_list_, _num_events_in_wait_list_ and _event_ are as described in
{clEnqueueNDRangeKernel}.
The data pointed to by _args_ and _cb_args_ bytes in size will be copied and
a pointer to this copied region will be passed to _user_func_.
The copy needs to be done because the memory objects ({cl_mem_TYPE} values) that
_args_ may contain need to be modified and replaced by appropriate pointers
to global memory.
When {clEnqueueNativeKernel} returns, the memory region pointed to by _args_
can be reused by the application.
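
The copy-and-replace step described above can be modelled in plain C.
This is a non-normative sketch of the idea only, not how any implementation is required to work: `fake_mem` is a hypothetical stand-in for a buffer object whose storage is directly addressable, and `copy_and_patch_args` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a buffer object with addressable storage. */
typedef struct { void *storage; } fake_mem;

/* Copies the cb_args bytes at args, then overwrites each handle slot
 * named by args_mem_loc with a pointer to the matching buffer storage.
 * Returns a malloc'ed block that the caller must free, or NULL. */
static void *copy_and_patch_args(const void *args, size_t cb_args,
                                 unsigned num_mem_objects,
                                 fake_mem *const *mem_list,
                                 const void *const *args_mem_loc)
{
    assert(args != NULL && cb_args > 0);
    unsigned char *copy = malloc(cb_args);
    if (copy == NULL)
        return NULL;
    memcpy(copy, args, cb_args);
    for (unsigned i = 0; i < num_mem_objects; ++i) {
        /* Each location points into the original args block, so its
         * byte offset identifies the same slot in the copy. */
        size_t offset = (size_t)((const unsigned char *)args_mem_loc[i] -
                                 (const unsigned char *)args);
        assert(offset + sizeof(void *) <= cb_args);
        void *ptr = (mem_list[i] != NULL) ? mem_list[i]->storage : NULL;
        memcpy(copy + offset, &ptr, sizeof ptr);
    }
    return copy;
}
```

Because only the copy is modified, the original args block still holds the memory object handles after the call, which is why the application may reuse it immediately.
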
// refError
{clEnqueueNativeKernel} returns {CL_SUCCESS} if the user function execution
instance was successfully queued.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events
in _event_wait_list_ are not the same.
* {CL_INVALID_VALUE} if _user_func_ is `NULL`.
* {CL_INVALID_VALUE} if _args_ is a `NULL` value and _cb_args_ > 0, or if
_args_ is a `NULL` value and _num_mem_objects_ > 0.
* {CL_INVALID_VALUE} if _args_ is not `NULL` and _cb_args_ is 0.
* {CL_INVALID_VALUE} if _num_mem_objects_ > 0 and _mem_list_ or
_args_mem_loc_ are `NULL`.
* {CL_INVALID_VALUE} if _num_mem_objects_ = 0 and _mem_list_ or
_args_mem_loc_ are not `NULL`.
* {CL_INVALID_OPERATION} if the device associated with _command_queue_
cannot execute the native kernel.
* {CL_INVALID_MEM_OBJECT} if one or more memory objects specified in
_mem_list_ are not valid or are not buffer objects.
* {CL_OUT_OF_RESOURCES} if there is a failure to queue the execution
instance of _kernel_ on the command-queue because of insufficient
resources needed to execute the kernel.
* {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate
memory for data store associated with buffer objects specified as
arguments to _kernel_.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_INVALID_OPERATION} if SVM pointers are passed as arguments to a kernel
and the device does not support SVM or if system pointers are passed as
arguments to a kernel and/or stored inside SVM allocations passed as
kernel arguments and the device does not support fine grain system SVM
allocations.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
[NOTE]
====
The total number of read-only images specified as arguments to a kernel
cannot exceed {CL_DEVICE_MAX_READ_IMAGE_ARGS}.
Each image array argument to a kernel declared with the `read_only`
qualifier counts as one image.
The total number of write-only images specified as arguments to a kernel
cannot exceed {CL_DEVICE_MAX_WRITE_IMAGE_ARGS}.
Each image array argument to a kernel declared with the `write_only`
qualifier counts as one image.
The total number of read-write images specified as arguments to a kernel
cannot exceed {CL_DEVICE_MAX_READ_WRITE_IMAGE_ARGS}.
Each image array argument to a kernel declared with the `read_write`
qualifier counts as one image.
====
--
[[event-objects]]
== Event Objects
// Check: Is this list of event object APIs really necessary?
//Event objects can be used to refer to a kernel-instance command
//({clEnqueueNDRangeKernel}, {clEnqueueTask}, {clEnqueueNativeKernel}), read,
//write, map and copy commands on memory objects (
//{clEnqueueReadBuffer}, {clEnqueueWriteBuffer},
//{clEnqueueMapBuffer}, {clEnqueueUnmapMemObject},
//{clEnqueueReadBufferRect}, {clEnqueueWriteBufferRect},
//{clEnqueueReadImage}, {clEnqueueWriteImage}, {clEnqueueMapImage},
//{clEnqueueCopyBuffer}, {clEnqueueCopyImage},
//{clEnqueueCopyBufferRect},
//{clEnqueueCopyBufferToImage}, {clEnqueueCopyImageToBuffer}),
//{clEnqueueSVMMemcpy}, {clEnqueueSVMMemFill},
//{clEnqueueSVMMap}, {clEnqueueSVMUnmap}, {clEnqueueSVMFree}, {clEnqueueMarker},
//{clEnqueueMarkerWithWaitList}, {clEnqueueWaitForEvents}, {clEnqueueBarrier},
//{clEnqueueBarrierWithWaitList}, (refer to
//<<markers-barriers-waiting-for-events,Markers, Barriers and Waiting for
//Events>>) or user events.
An event object can be used to track the execution status of a command.
The API calls that enqueue commands to a command-queue create a new event
object that is returned in the _event_ argument.
If an error occurs while enqueuing the command in the command-queue, the
_event_ argument does not return an event object.
The execution status of an enqueued command at any given point in time can
be one of the following:
* {CL_QUEUED_anchor}: Indicates that the command has been enqueued in a
command-queue.
This is the initial state of all events except user events.
* {CL_SUBMITTED_anchor}: The initial state for all user events.
For all other events, indicates that the command has been submitted
by the host to the device.
* {CL_RUNNING_anchor}: Indicates that the device has started executing this
command.
In order for the execution status of an enqueued command to change from
{CL_SUBMITTED} to {CL_RUNNING}, all events that this command is waiting on
must have completed successfully i.e. their execution status must be
{CL_COMPLETE}.
* {CL_COMPLETE_anchor}: Indicates that the command has successfully completed.
* An Error Code: A negative integer value indicating that the command was
abnormally terminated. Abnormal termination may occur for a number of reasons,
such as a bad memory access.
[NOTE]
====
A command is considered to be complete if its execution status is
{CL_COMPLETE} or is a negative integer value.
If the execution of a command is terminated, the command-queue associated
with this terminated command, and the associated context (and all other
command-queues in this context) may no longer be available.
The behavior of OpenCL API calls that use this context (and command-queues
associated with this context) is then considered to be
implementation-defined.
The user registered callback function specified when context is created can
be used to report appropriate error information.
====
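The completion rule in the note above can be captured in a pair of small host-side helpers.
This is only a minimal sketch: the names `command_is_complete` and `command_succeeded` are illustrative and are not part of the OpenCL API.

[source,c]
----
#include <CL/cl.h>

/* Hypothetical helper: returns 1 if a command with this execution status
 * counts as complete (CL_COMPLETE or a negative error code), else 0. */
static int command_is_complete(cl_int status)
{
    return status == CL_COMPLETE || status < 0;
}

/* Hypothetical helper: returns 1 only if the command finished without
 * being abnormally terminated. */
static int command_succeeded(cl_int status)
{
    return status == CL_COMPLETE;
}
----

Keeping the two questions separate lets an application distinguish "no longer pending" from "produced valid results".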
[open,refpage='clCreateUserEvent',desc='Creates a user event object.',type='protos']
--
To create a user event object, call the function
include::{generated}/api/protos/clCreateUserEvent.txt[]
include::{generated}/api/version-notes/clCreateUserEvent.asciidoc[]
* _context_ must be a valid OpenCL context.
* _errcode_ret_ will return an appropriate error code.
If _errcode_ret_ is `NULL`, no error code is returned.
User events allow applications to enqueue commands that wait on a user event
to finish before the command is executed by the device.
// refError
{clCreateUserEvent} returns a valid non-zero event object and _errcode_ret_
is set to {CL_SUCCESS} if the user event object is created successfully.
Otherwise, it returns a `NULL` value with one of the following error values
returned in _errcode_ret_:
* {CL_INVALID_CONTEXT} if _context_ is not a valid context.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
The initial execution status for the user event object is {CL_SUBMITTED}.
--
[open,refpage='clSetUserEventStatus',desc='Sets the execution status of a user event object.',type='protos']
--
To set the execution status of a user event object, call the function
include::{generated}/api/protos/clSetUserEventStatus.txt[]
include::{generated}/api/version-notes/clSetUserEventStatus.asciidoc[]
* _event_ is a user event object created using {clCreateUserEvent}.
* _execution_status_ specifies the new execution status to be set and can be
{CL_COMPLETE} or a negative integer value to indicate an error.
A negative integer value causes all enqueued commands that wait on this user
event to be terminated.
{clSetUserEventStatus} can only be called once to change the execution
status of _event_.
[NOTE]
====
If there are enqueued commands with user events in the _event_wait_list_
argument of *+clEnqueue*+* commands, the user must ensure that the status of
each user event being waited on is set using {clSetUserEventStatus}
before any OpenCL APIs that release OpenCL objects except for event objects
are called; otherwise the behavior is undefined.
For example, the following code sequence will result in undefined behavior
of {clReleaseMemObject}.
[source,c]
----
ev1 = clCreateUserEvent(ctx, NULL);
clEnqueueWriteBuffer(cq, buf1, CL_FALSE, ..., 1, &ev1, NULL);
clEnqueueWriteBuffer(cq, buf2, CL_FALSE, ...);
clReleaseMemObject(buf2);
clSetUserEventStatus(ev1, CL_COMPLETE);
----
The following code sequence, however, works correctly.
[source,c]
----
ev1 = clCreateUserEvent(ctx, NULL);
clEnqueueWriteBuffer(cq, buf1, CL_FALSE, ..., 1, &ev1, NULL);
clEnqueueWriteBuffer(cq, buf2, CL_FALSE, ...);
clSetUserEventStatus(ev1, CL_COMPLETE);
clReleaseMemObject(buf2);
----
====
// refError
{clSetUserEventStatus} returns {CL_SUCCESS} if the function was executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_EVENT} if _event_ is not a valid user event object.
* {CL_INVALID_VALUE} if the _execution_status_ is not {CL_COMPLETE} or a
negative integer value.
* {CL_INVALID_OPERATION} if the _execution_status_ for _event_ has already
been changed by a previous call to {clSetUserEventStatus}.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clWaitForEvents',desc='Waits on the host thread for commands identified by event objects to complete.',type='protos']
--
To wait for events to complete, call the function
include::{generated}/api/protos/clWaitForEvents.txt[]
include::{generated}/api/version-notes/clWaitForEvents.asciidoc[]
* _num_events_ is the number of events in _event_list_.
* _event_list_ is a pointer to a list of event object handles.
This function waits on the host thread for commands identified by event
objects in _event_list_ to complete.
A command is considered complete if its execution status is {CL_COMPLETE} or a
negative value.
The events specified in _event_list_ act as synchronization points.
// refError
{clWaitForEvents} returns {CL_SUCCESS} if the execution status of all events
in _event_list_ is {CL_COMPLETE}.
Otherwise, it returns one of the following errors:
* {CL_INVALID_VALUE} if _num_events_ is zero or _event_list_ is `NULL`.
* {CL_INVALID_CONTEXT} if events specified in _event_list_ do not belong to
the same context.
* {CL_INVALID_EVENT} if event objects specified in _event_list_ are not
valid event objects.
* {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the execution status of
any of the events in _event_list_ is a negative integer value.
This error code is <<unified-spec, missing before>> version 1.1.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
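As an illustration of the error handling described above, the following sketch waits on a list of events and, if any of them terminated abnormally, looks up which one.
The helper name `wait_and_find_failure` and its `failed_index` output parameter are assumptions made for this example and are not part of the OpenCL API.

[source,c]
----
#include <CL/cl.h>

/* Hypothetical helper: waits for all events and, if any of them was
 * abnormally terminated, stores the index of the first such event in
 * *failed_index. Returns the value returned by clWaitForEvents. */
static cl_int wait_and_find_failure(cl_uint num_events,
                                    const cl_event *events,
                                    cl_uint *failed_index)
{
    cl_int err = clWaitForEvents(num_events, events);
    if (err != CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST)
        return err; /* CL_SUCCESS or an unrelated error */

    for (cl_uint i = 0; i < num_events; ++i) {
        cl_int status = CL_COMPLETE;
        cl_int qerr = clGetEventInfo(events[i],
                                     CL_EVENT_COMMAND_EXECUTION_STATUS,
                                     sizeof(status), &status, NULL);
        if (qerr == CL_SUCCESS && status < 0) {
            *failed_index = i; /* first abnormally terminated command */
            break;
        }
    }
    return err;
}
----

The per-event query is needed because the return value of {clWaitForEvents} only reports that at least one event has a negative execution status, not which one.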
--
[open,refpage='clGetEventInfo',desc='Returns information about the event object.',type='protos']
--
To return information about an event object, call the function
include::{generated}/api/protos/clGetEventInfo.txt[]
include::{generated}/api/version-notes/clGetEventInfo.asciidoc[]
* _event_ specifies the event object being queried.
* _param_name_ specifies the information to query.
The list of supported _param_name_ types and the information returned in
_param_value_ by {clGetEventInfo} is described in the
<<event-info-table,Event Object Queries>> table.
* _param_value_ is a pointer to memory where the appropriate result being
queried is returned.
If _param_value_ is `NULL`, it is ignored.
* _param_value_size_ is used to specify the size in bytes of memory pointed to
by _param_value_.
This size must be {geq} size of return type as described in the
<<event-info-table,Event Object Queries>> table.
* _param_value_size_ret_ returns the actual size in bytes of data being
queried by _param_name_.
If _param_value_size_ret_ is `NULL`, it is ignored.
[[event-info-table]]
.List of supported param_names by <<clGetEventInfo>>
[width="100%",cols="<33%,<17%,<50%",options="header"]
|====
| Event Info | Return Type | Description
| {CL_EVENT_COMMAND_QUEUE_anchor}
include::{generated}/api/version-notes/CL_EVENT_COMMAND_QUEUE.asciidoc[]
| {cl_command_queue_TYPE}
| Return the command-queue associated with _event_.
For user event objects, a `NULL` value is returned.
| {CL_EVENT_CONTEXT_anchor}
include::{generated}/api/version-notes/CL_EVENT_CONTEXT.asciidoc[]
| {cl_context_TYPE}
| Return the context associated with _event_.
| {CL_EVENT_COMMAND_TYPE_anchor}
include::{generated}/api/version-notes/CL_EVENT_COMMAND_TYPE.asciidoc[]
| {cl_command_type_TYPE}
| Return the command type associated with _event_ as described in the
<<event-command-type-table,Event Command Types>> table.
| {CL_EVENT_COMMAND_EXECUTION_STATUS_anchor} footnote:[{fn-event-status-order}]
include::{generated}/api/version-notes/CL_EVENT_COMMAND_EXECUTION_STATUS.asciidoc[]
| {cl_int_TYPE}
| Return the execution status of the command identified by _event_.
Valid values are:
{CL_QUEUED} (command has been enqueued in the command-queue),
{CL_SUBMITTED} (enqueued command has been submitted by the host to the
device associated with the command-queue),
{CL_RUNNING} (device is currently executing this command),
{CL_COMPLETE} (the command has completed), or
an error code given by a negative integer value (the command was
abnormally terminated; this may be caused by, for example, a bad memory
access).
These error codes come from the same set of error codes that are
returned from the platform or runtime API calls as return values or
errcode_ret values.
| {CL_EVENT_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}]
include::{generated}/api/version-notes/CL_EVENT_REFERENCE_COUNT.asciidoc[]
| {cl_uint_TYPE}
| Return the _event_ reference count.
|====
[[event-command-type-table]]
.List of supported event command types
[width="100%",cols="2,3",options="header"]
|====
| *Events Created By*
| *Event Command Type*
| {clEnqueueNDRangeKernel}
| {CL_COMMAND_NDRANGE_KERNEL_anchor}
include::{generated}/api/version-notes/CL_COMMAND_NDRANGE_KERNEL.asciidoc[]
| {clEnqueueTask}
| {CL_COMMAND_TASK_anchor}
include::{generated}/api/version-notes/CL_COMMAND_TASK.asciidoc[]
| {clEnqueueNativeKernel}
| {CL_COMMAND_NATIVE_KERNEL_anchor}
include::{generated}/api/version-notes/CL_COMMAND_NATIVE_KERNEL.asciidoc[]
| {clEnqueueReadBuffer}
| {CL_COMMAND_READ_BUFFER_anchor}
include::{generated}/api/version-notes/CL_COMMAND_READ_BUFFER.asciidoc[]
| {clEnqueueWriteBuffer}
| {CL_COMMAND_WRITE_BUFFER_anchor}
include::{generated}/api/version-notes/CL_COMMAND_WRITE_BUFFER.asciidoc[]
| {clEnqueueCopyBuffer}
| {CL_COMMAND_COPY_BUFFER_anchor}
include::{generated}/api/version-notes/CL_COMMAND_COPY_BUFFER.asciidoc[]
| {clEnqueueReadImage}
| {CL_COMMAND_READ_IMAGE_anchor}
include::{generated}/api/version-notes/CL_COMMAND_READ_IMAGE.asciidoc[]
| {clEnqueueWriteImage}
| {CL_COMMAND_WRITE_IMAGE_anchor}
include::{generated}/api/version-notes/CL_COMMAND_WRITE_IMAGE.asciidoc[]
| {clEnqueueCopyImage}
| {CL_COMMAND_COPY_IMAGE_anchor}
include::{generated}/api/version-notes/CL_COMMAND_COPY_IMAGE.asciidoc[]
| {clEnqueueCopyBufferToImage}
| {CL_COMMAND_COPY_BUFFER_TO_IMAGE_anchor}
include::{generated}/api/version-notes/CL_COMMAND_COPY_BUFFER_TO_IMAGE.asciidoc[]
| {clEnqueueCopyImageToBuffer}
| {CL_COMMAND_COPY_IMAGE_TO_BUFFER_anchor}
include::{generated}/api/version-notes/CL_COMMAND_COPY_IMAGE_TO_BUFFER.asciidoc[]
| {clEnqueueMapBuffer}
| {CL_COMMAND_MAP_BUFFER_anchor}
include::{generated}/api/version-notes/CL_COMMAND_MAP_BUFFER.asciidoc[]
| {clEnqueueMapImage}
| {CL_COMMAND_MAP_IMAGE_anchor}
include::{generated}/api/version-notes/CL_COMMAND_MAP_IMAGE.asciidoc[]
| {clEnqueueUnmapMemObject}
| {CL_COMMAND_UNMAP_MEM_OBJECT_anchor}
include::{generated}/api/version-notes/CL_COMMAND_UNMAP_MEM_OBJECT.asciidoc[]
| {clEnqueueMarker}, +
{clEnqueueMarkerWithWaitList}
| {CL_COMMAND_MARKER_anchor}
include::{generated}/api/version-notes/CL_COMMAND_MARKER.asciidoc[]
| {clEnqueueReadBufferRect}
| {CL_COMMAND_READ_BUFFER_RECT_anchor}
include::{generated}/api/version-notes/CL_COMMAND_READ_BUFFER_RECT.asciidoc[]
| {clEnqueueWriteBufferRect}
| {CL_COMMAND_WRITE_BUFFER_RECT_anchor}
include::{generated}/api/version-notes/CL_COMMAND_WRITE_BUFFER_RECT.asciidoc[]
| {clEnqueueCopyBufferRect}
| {CL_COMMAND_COPY_BUFFER_RECT_anchor}
include::{generated}/api/version-notes/CL_COMMAND_COPY_BUFFER_RECT.asciidoc[]
| {clCreateUserEvent}
| {CL_COMMAND_USER_anchor}
include::{generated}/api/version-notes/CL_COMMAND_USER.asciidoc[]
| {clEnqueueBarrier}, +
{clEnqueueBarrierWithWaitList}
| {CL_COMMAND_BARRIER_anchor}
include::{generated}/api/version-notes/CL_COMMAND_BARRIER.asciidoc[]
| {clEnqueueMigrateMemObjects}
| {CL_COMMAND_MIGRATE_MEM_OBJECTS_anchor}
include::{generated}/api/version-notes/CL_COMMAND_MIGRATE_MEM_OBJECTS.asciidoc[]
| {clEnqueueFillBuffer}
| {CL_COMMAND_FILL_BUFFER_anchor}
include::{generated}/api/version-notes/CL_COMMAND_FILL_BUFFER.asciidoc[]
| {clEnqueueFillImage}
| {CL_COMMAND_FILL_IMAGE_anchor}
include::{generated}/api/version-notes/CL_COMMAND_FILL_IMAGE.asciidoc[]
| {clEnqueueSVMFree}
| {CL_COMMAND_SVM_FREE_anchor}
include::{generated}/api/version-notes/CL_COMMAND_SVM_FREE.asciidoc[]
| {clEnqueueSVMMemcpy}
| {CL_COMMAND_SVM_MEMCPY_anchor}
include::{generated}/api/version-notes/CL_COMMAND_SVM_MEMCPY.asciidoc[]
| {clEnqueueSVMMemFill}
| {CL_COMMAND_SVM_MEMFILL_anchor}
include::{generated}/api/version-notes/CL_COMMAND_SVM_MEMFILL.asciidoc[]
| {clEnqueueSVMMap}
| {CL_COMMAND_SVM_MAP_anchor}
include::{generated}/api/version-notes/CL_COMMAND_SVM_MAP.asciidoc[]
| {clEnqueueSVMUnmap}
| {CL_COMMAND_SVM_UNMAP_anchor}
include::{generated}/api/version-notes/CL_COMMAND_SVM_UNMAP.asciidoc[]
| {clEnqueueSVMMigrateMem}
| {CL_COMMAND_SVM_MIGRATE_MEM_anchor}
include::{generated}/api/version-notes/CL_COMMAND_SVM_MIGRATE_MEM.asciidoc[]
Prior to OpenCL 3.0, implementations should return
{CL_COMMAND_MIGRATE_MEM_OBJECTS}, but may return an implementation-defined
event command type for {clEnqueueSVMMigrateMem}.
|====
Using {clGetEventInfo} to determine if a command identified by _event_ has
finished execution (i.e. {CL_EVENT_COMMAND_EXECUTION_STATUS} returns
{CL_COMPLETE}) is not a synchronization point.
There are no guarantees that the memory objects being modified by the command
associated with _event_ will be visible to other enqueued commands.
// refError
{clGetEventInfo} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
specified by _param_value_size_ is < size of return type as described in
the <<event-info-table,Event Object Queries>> table and
_param_value_ is not `NULL`.
* {CL_INVALID_VALUE} if information to query given in _param_name_ cannot be
queried for _event_.
* {CL_INVALID_EVENT} if _event_ is not a valid event object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
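A query for {CL_EVENT_COMMAND_EXECUTION_STATUS} might be wrapped as in the sketch below.
The helper name `query_execution_status` is hypothetical, and the extra size check is a defensive choice made for this example rather than something the specification requires.

[source,c]
----
#include <CL/cl.h>

/* Hypothetical helper: reads the execution status of `event` into *status.
 * Returns the error code from clGetEventInfo. */
static cl_int query_execution_status(cl_event event, cl_int *status)
{
    size_t written = 0;
    cl_int err = clGetEventInfo(event, CL_EVENT_COMMAND_EXECUTION_STATUS,
                                sizeof(*status), status, &written);
    if (err == CL_SUCCESS && written != sizeof(*status))
        return CL_INVALID_VALUE; /* unexpected size; treated as an error here */
    return err;
}
----

Because this query is not a synchronization point, a result of {CL_COMPLETE} should not be used on its own to decide that output memory objects are ready; a call such as {clWaitForEvents} is the safer choice for that purpose.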
--
[open,refpage='clSetEventCallback',desc='Registers a user callback function for a specific command execution status.',type='protos']
--
To register a user callback function for a specific command execution
status, call the function
include::{generated}/api/protos/clSetEventCallback.txt[]
include::{generated}/api/version-notes/clSetEventCallback.asciidoc[]
* _event_ is a valid event object.
* _command_exec_callback_type_ specifies the command execution status for
which the callback is registered.
The command execution callback values for which a callback can be registered
are: {CL_SUBMITTED}, {CL_RUNNING}, or
{CL_COMPLETE} footnote:[{fn-event-callback-complete}].
There is no guarantee that the callback functions registered for various
execution status values for an event will be called in the exact order that
the execution status of a command changes.
Furthermore, receiving a callback for an event with
a status other than {CL_COMPLETE} in no way implies that the memory model or
execution model as defined by the OpenCL specification has changed.
For example, it is not valid to assume that a corresponding memory transfer
has completed unless the event is in a state {CL_COMPLETE}.
* _pfn_event_notify_ is the event callback function that can be registered by
the application.
This callback function may be called asynchronously by the OpenCL
implementation.
It is the application's responsibility to ensure that the callback function
is thread-safe.
The parameters to this callback function are:
** _event_ is the event object for which the callback function is invoked.
** _event_command_status_ is equal to the _command_exec_callback_type_
used while registering the callback.
Refer to the <<event-info-table,Event Object Queries>>
table for the command execution status values.
If the callback is called as the result of the command associated with
event being abnormally terminated, an appropriate error code for the
error that caused the termination will be passed to
_event_command_status_ instead.
** _user_data_ is a pointer to user supplied data.
* _user_data_ will be passed as the _user_data_ argument when
_pfn_event_notify_ is called.
_user_data_ can be `NULL`.
The registered callback function will be called when the execution status of
the command associated with _event_ changes to an execution status equal to or
past the status specified by _command_exec_callback_type_.
Each call to {clSetEventCallback} registers the specified user callback
function on a callback stack associated with _event_.
The order in which the registered user callback functions are called is
undefined.
All callbacks registered for an event object must be called before the event
object is destroyed.
Callbacks should return promptly.
Behavior is undefined when calling expensive system routines, OpenCL APIs to
create contexts or command-queues, or blocking OpenCL APIs in an event callback.
Rather than calling a blocking OpenCL API in an event callback, applications
may call a non-blocking OpenCL API, then register a completion callback
for the non-blocking OpenCL API with the remainder of the work.
Because commands in a command-queue are not required to begin execution
until the command-queue is flushed, callbacks that enqueue commands on a
command-queue should either call {clFlush} on the queue before returning,
or arrange for the command-queue to be flushed later.
// refError
{clSetEventCallback} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_EVENT} if _event_ is not a valid event object.
* {CL_INVALID_VALUE} if _pfn_event_notify_ is `NULL` or if
_command_exec_callback_type_ is not {CL_SUBMITTED}, {CL_RUNNING}, or
{CL_COMPLETE}.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
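The following sketch shows one possible shape for a thread-safe callback that returns promptly, as recommended above.
It assumes a C11 compiler that provides `<stdatomic.h>`; the names `on_complete` and `watch_completion`, and the convention of storing 1 or -1 in the flag, are choices made for this example only.

[source,c]
----
#include <stdatomic.h>
#include <CL/cl.h>

/* Callback sketch: records the outcome in an atomic flag and returns at
 * once. 1 means CL_COMPLETE, -1 means abnormal termination. */
static void CL_CALLBACK on_complete(cl_event event,
                                    cl_int event_command_status,
                                    void *user_data)
{
    atomic_int *flag = (atomic_int *)user_data;
    (void)event; /* not needed in this sketch */
    atomic_store(flag, event_command_status == CL_COMPLETE ? 1 : -1);
}

/* Hypothetical helper: registers on_complete for the CL_COMPLETE status.
 * `flag` must stay valid until the callback has been called. */
static cl_int watch_completion(cl_event event, atomic_int *flag)
{
    return clSetEventCallback(event, CL_COMPLETE, on_complete, flag);
}
----

The callback only stores a value; any follow-up work, such as enqueuing further commands, is left to another host thread that observes the flag.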
--
[open,refpage='clRetainEvent',desc='Increments the event reference count.',type='protos']
--
To retain an event object, call the function
include::{generated}/api/protos/clRetainEvent.txt[]
include::{generated}/api/version-notes/clRetainEvent.asciidoc[]
* _event_ is the event object to be retained.
The _event_ reference count is incremented.
The OpenCL commands that return an event perform an implicit retain.
// refError
{clRetainEvent} returns {CL_SUCCESS} if the function is executed successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_EVENT} if _event_ is not a valid event object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clReleaseEvent',desc='Decrements the event reference count.',type='protos']
--
To release an event object, call the function
include::{generated}/api/protos/clReleaseEvent.txt[]
include::{generated}/api/version-notes/clReleaseEvent.asciidoc[]
* _event_ is the event object to be released.
The _event_ reference count is decremented.
The event object is deleted once the reference count becomes zero, the
specific command identified by this event has completed (or terminated) and
there are no commands in the command-queues of a context that require a wait
for this event to complete.
Using this function to release a reference that was not obtained by creating
the object or by calling {clRetainEvent} causes undefined behavior.
[NOTE]
====
Developers should be careful when releasing their last reference to
events created by {clCreateUserEvent} that have not yet been set to a status
of {CL_COMPLETE} or an error.
If the user event was used in the event_wait_list argument passed to a
*+clEnqueue*+* API or another application host thread is waiting for it in
{clWaitForEvents}, those commands and host threads will continue to wait for
the event status to reach {CL_COMPLETE} or error, even after the application
has released the object.
Since in this scenario the application has released its last reference
to the user event, it is in principle no longer valid for the
application to change the status of the event to unblock the waiting
commands and host threads.
As a result the waiting tasks will wait forever, and associated events,
{cl_mem_TYPE} objects, command queues and contexts are likely to leak.
In-order command queues caught up in this deadlock may cease to do any work.
====
// refError
{clReleaseEvent} returns {CL_SUCCESS} if the function is executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_EVENT} if _event_ is not a valid event object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
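One way to avoid the situation described in the note above is to always give a user event a final status before dropping the last reference to it.
The sketch below assumes the caller holds exactly one reference to the user event; the helper name `retire_user_event` is hypothetical.

[source,c]
----
#include <CL/cl.h>

/* Hypothetical helper: gives a user event a final status, then drops the
 * caller's reference. The event is not released if the status could not be
 * set, so the caller can still unblock waiters later. */
static cl_int retire_user_event(cl_event user_event, cl_int final_status)
{
    cl_int err = clSetUserEventStatus(user_event, final_status);
    /* CL_INVALID_OPERATION means the status was already set by an earlier
     * call, so nothing is left waiting on this event. */
    if (err != CL_SUCCESS && err != CL_INVALID_OPERATION)
        return err;
    return clReleaseEvent(user_event);
}
----

For example, `retire_user_event(ev, CL_COMPLETE)` unblocks any commands waiting on `ev` and then releases it.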
--
[[markers-barriers-waiting-for-events]]
== Markers, Barriers and Waiting for Events
[open,refpage='clEnqueueMarkerWithWaitList',desc='Enqueues a marker command which waits for either a list of events to complete, or all previously enqueued commands to complete.',type='protos']
--
To enqueue a marker command which waits for events or commands to complete,
call the function
include::{generated}/api/protos/clEnqueueMarkerWithWaitList.txt[]
include::{generated}/api/version-notes/clEnqueueMarkerWithWaitList.asciidoc[]
* _command_queue_ is a valid host command-queue.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
* _event_ returns an event object that identifies this command and
can be used to query or wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
If _event_wait_list_ is `NULL`, then this particular command waits until all
commands previously enqueued to _command_queue_ have completed.
The marker command either waits for a list of events to complete, or if the
list is empty it waits for all commands previously enqueued in
_command_queue_ to complete before it completes.
This command returns an _event_ which can be waited on to ensure that all
events in the _event_wait_list_, or all commands enqueued to
_command_queue_ before this command, have completed.
// refError
{clEnqueueMarkerWithWaitList} returns {CL_SUCCESS} if the function is
successfully executed.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events
in _event_wait_list_ are not the same.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
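As a usage sketch, a marker with an empty wait list can serve as a host-side wait for everything enqueued so far on a queue.
The helper name `wait_for_prior_commands` is an assumption made for this example.

[source,c]
----
#include <CL/cl.h>

/* Hypothetical helper: blocks the host until every command enqueued to
 * `queue` before this call has completed, using a marker event. */
static cl_int wait_for_prior_commands(cl_command_queue queue)
{
    cl_event marker = NULL;
    /* An empty wait list (0, NULL) makes the marker wait for all
     * previously enqueued commands. */
    cl_int err = clEnqueueMarkerWithWaitList(queue, 0, NULL, &marker);
    if (err != CL_SUCCESS)
        return err;

    err = clWaitForEvents(1, &marker);
    clReleaseEvent(marker); /* drop the reference returned by the enqueue */
    return err;
}
----

Unlike a barrier, the marker does not prevent commands enqueued after it from starting, so this pattern only affects the waiting host thread.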
--
[open,refpage='clEnqueueMarker',desc='Enqueues a marker command which waits for all previously enqueued commands to complete.',type='protos']
--
To enqueue a marker command which waits for previous commands to complete, call
the function
include::{generated}/api/protos/clEnqueueMarker.txt[]
include::{generated}/api/version-notes/clEnqueueMarker.asciidoc[]
* _command_queue_ is a valid host command-queue.
* _event_ returns an event object that identifies this command and
can be used to query or wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
The marker command waits for all commands previously enqueued in _command_queue_ to complete before it completes.
This command returns an _event_ which can be waited on to ensure that all
commands enqueued to _command_queue_ before this command have completed.
// refError
{clEnqueueMarker} returns {CL_SUCCESS} if the function is successfully
executed.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_VALUE} if _event_ is `NULL`.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clEnqueueWaitForEvents',desc='Enqueues a wait on a list of events to complete.',type='protos']
--
To enqueue a wait for a specific event or a list of events to complete before any future commands queued in a command-queue are executed, call the function
include::{generated}/api/protos/clEnqueueWaitForEvents.txt[]
include::{generated}/api/version-notes/clEnqueueWaitForEvents.asciidoc[]
* _command_queue_ is a valid host command-queue.
* _event_list_ and _num_events_ specify events that need to complete before
this particular command can be executed.
// Note, this parameter is called event_list (like clWaitForEvents) rather than
// event_wait_list(like clEnqueueMarkerWithWaitList etc.) because the function
// predates wait lists (and CL_INVALID_EVENT_WAIT_LIST).
The events specified in _event_list_ act as synchronization points.
The context associated with events in _event_list_ and _command_queue_ must be
the same.
The memory associated with _event_list_ can be reused or freed after the
function returns.
// refError
{clEnqueueWaitForEvents} returns {CL_SUCCESS} if the function is successfully
executed.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events
in _event_list_ are not the same.
* {CL_INVALID_VALUE} if _num_events_ is 0 or _event_list_ is `NULL`.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clEnqueueBarrierWithWaitList',desc='A synchronization point that enqueues a barrier operation.',type='protos']
--
To enqueue a barrier command which waits for events or commands to complete,
call the function
include::{generated}/api/protos/clEnqueueBarrierWithWaitList.txt[]
include::{generated}/api/version-notes/clEnqueueBarrierWithWaitList.asciidoc[]
* _command_queue_ is a valid host command-queue.
* _event_wait_list_ and _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
* If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0.
If _event_wait_list_ is not `NULL`, the list of events pointed to by
_event_wait_list_ must be valid and _num_events_in_wait_list_ must be
greater than 0.
The events specified in _event_wait_list_ act as synchronization points.
The context associated with events in _event_wait_list_ and _command_queue_
must be the same.
The memory associated with _event_wait_list_ can be reused or freed after
the function returns.
* _event_ returns an event object that identifies this command and
can be used to query or wait for this command to complete.
If _event_ is `NULL` or the enqueue is unsuccessful, no event will be
created and therefore it will not be possible to query the status of this
command or to wait for this command to complete.
If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer
to an element of the _event_wait_list_ array.
If _event_wait_list_ is `NULL`, then this particular command waits until all
commands previously enqueued to _command_queue_ have completed.
The barrier command either waits for a list of events to complete, or if the
list is empty it waits for all commands previously enqueued in
_command_queue_ to complete before it completes.
This command blocks command execution, that is, any following commands
enqueued after it do not execute until it completes.
This command returns an _event_ which can be waited on to ensure that all
events in the _event_wait_list_, or all commands enqueued to
_command_queue_ before this command, have completed.
// refError
{clEnqueueBarrierWithWaitList} returns {CL_SUCCESS} if the function is
successfully executed.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events
in _event_wait_list_ are not the same.
* {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and
_num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_
are not valid events.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
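The wait-list rules above mean that a dependency count of 0 must be paired with a `NULL` list.
The following sketch shows a thin wrapper that normalizes this case; the name `enqueue_barrier_after` is hypothetical and the wrapper is only one possible way to structure such a call.

[source,c]
----
#include <CL/cl.h>

/* Hypothetical helper: enqueues a barrier that waits on `deps`, or on all
 * previously enqueued commands when num_deps is 0. */
static cl_int enqueue_barrier_after(cl_command_queue queue,
                                    cl_uint num_deps, const cl_event *deps,
                                    cl_event *barrier_event)
{
    /* Pass NULL when there are no dependencies: a non-NULL list with a
     * count of 0 is reported as CL_INVALID_EVENT_WAIT_LIST. */
    const cl_event *wait_list = (num_deps > 0) ? deps : NULL;
    return clEnqueueBarrierWithWaitList(queue, num_deps, wait_list,
                                        barrier_event);
}
----

On an out-of-order command-queue, such a barrier can separate two phases of work: commands enqueued after it do not execute until it completes.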
--
[open,refpage='clEnqueueBarrier',desc='A synchronization point that enqueues a barrier operation.',type='protos']
--
To enqueue a barrier command which waits for commands to complete, call the
function
include::{generated}/api/protos/clEnqueueBarrier.txt[]
include::{generated}/api/version-notes/clEnqueueBarrier.asciidoc[]
* _command_queue_ is a valid host command-queue.
The barrier command waits for all commands previously enqueued in
_command_queue_ to complete before it completes.
This command blocks command execution, that is, any following commands
enqueued after it do not execute until it completes.
// TODO clEnqueueBarrierWithWaitList doesn't say synchronization point, should
// it, or should the next line be removed? The main difference is that
// clEnqueueBarrierWithWaitList returns an event, which is the synchronization
// point.
The barrier command is a synchronization point.
// refError
{clEnqueueBarrier} returns {CL_SUCCESS} if the function is successfully
executed.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
== Out-of-order Execution of Kernels and Memory Object Commands
The OpenCL functions that are submitted to a command-queue are enqueued in
the order the calls are made but can be configured to execute in-order or
out-of-order.
The _properties_ argument in {clCreateCommandQueueWithProperties} or
{clCreateCommandQueue} can be used to specify the execution order.
If the {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} property of a command-queue is
not set, the commands enqueued to a command-queue execute in order.
For example, if an application calls {clEnqueueNDRangeKernel} to execute
kernel A followed by a {clEnqueueNDRangeKernel} to execute kernel B, the
application can assume that kernel A finishes first and then kernel B is
executed.
If the memory objects output by kernel A are inputs to kernel B then kernel
B will see the correct data in memory objects produced by execution of
kernel A.
If the {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} property of a command-queue is
set, then there is no guarantee that kernel A will finish before kernel B
starts execution.
Applications can configure the commands enqueued to a command-queue to
execute out-of-order by setting the {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE}
property of the command-queue.
This can be specified when the command-queue is created.
In out-of-order execution mode there is no guarantee that the enqueued
commands will finish execution in the order they were queued.
Because kernels are not guaranteed to execute in the order in which the
{clEnqueueNDRangeKernel} or {clEnqueueTask} calls are made within a
command-queue, an earlier {clEnqueueNDRangeKernel} call to execute kernel A,
identified by event A, may execute and/or finish later than a
{clEnqueueNDRangeKernel} call to execute kernel B that the application made
at a later point in time.
To guarantee a specific order of execution of kernels, a wait on a
particular event (in this case event A) can be used.
The wait for event A can be specified in the _event_wait_list_ argument to
{clEnqueueNDRangeKernel} for kernel B.
In addition, a marker ({clEnqueueMarker} or {clEnqueueMarkerWithWaitList}) or a
barrier ({clEnqueueBarrier} or {clEnqueueBarrierWithWaitList}) command can be
enqueued to the command-queue.
The marker command ensures that previously enqueued commands identified by
the list of events to wait for (or all previous commands) have finished.
A barrier command is similar to a marker command, but additionally
guarantees that no later-enqueued commands will execute until the waited-for
commands have executed.
Similarly, commands to read, write, copy or map memory objects that are
enqueued after {clEnqueueNDRangeKernel}, {clEnqueueTask} or
{clEnqueueNativeKernel} commands are not guaranteed to wait for kernels
scheduled for execution to have completed (if the
{CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} property is set).
To ensure correct ordering of commands, the event object returned by
{clEnqueueNDRangeKernel}, {clEnqueueTask} or {clEnqueueNativeKernel} can be
used to enqueue a wait for that event, or a barrier command can be enqueued;
either must complete before reads or writes to the memory object(s) occur.
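The ordering rule described above can be illustrated with a small host-side model.
This is a sketch under stated assumptions and not OpenCL code: the type `model_cmd`, the constant `NO_DEP` and the function `run_out_of_order` are hypothetical names, each modeled command waits on at most one earlier command, and the modeled scheduler deliberately picks the last ready command to make reordering visible.
The `waits_on` field plays the role of an event passed in _event_wait_list_.

```c
#include <stdbool.h>
#include <stddef.h>

#define NO_DEP ((size_t)-1)  /* hypothetical marker: command waits on nothing */

/* Hypothetical model of one enqueued command. */
typedef struct {
    size_t waits_on;  /* index of the command waited on, or NO_DEP */
    bool   done;      /* set once the modeled command has executed */
} model_cmd;

/* Models an out-of-order queue that may run any ready command.
 * It always picks the LAST ready command, writes the execution order
 * into order[], and returns the number of commands that ran.
 * Assumes every waits_on value is NO_DEP or a valid index below n. */
size_t run_out_of_order(model_cmd *cmds, size_t n, size_t *order)
{
    size_t ran = 0;
    bool progress = true;

    while (ran < n && progress) {
        progress = false;
        for (size_t i = n; i-- > 0; ) {
            bool ready = !cmds[i].done &&
                         (cmds[i].waits_on == NO_DEP ||
                          cmds[cmds[i].waits_on].done);
            if (ready) {
                cmds[i].done = true;
                order[ran++] = i;
                progress = true;
                break;
            }
        }
    }
    return ran;
}
```

With two independent commands, this model runs command 1 (kernel B) before command 0 (kernel A).
Once command 1 waits on command 0, command 0 always runs first, which mirrors the effect of passing event A in the wait list for kernel B.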
[[profiling-operations]]
== Profiling Operations on Memory Objects and Kernels
This section describes the profiling of OpenCL functions that are enqueued
as commands to a command-queue. Profiling of OpenCL commands can be enabled
by using a command-queue created with the {CL_QUEUE_PROFILING_ENABLE}
flag set in the {CL_QUEUE_PROPERTIES} bitfield in the _properties_ argument to
{clCreateCommandQueueWithProperties}, or in the _properties_ argument to
{clCreateCommandQueue}.
When profiling is enabled, the event objects that are created from
enqueuing a command store a timestamp for each of their state transitions.
[open,refpage='clGetEventProfilingInfo',desc='Returns profiling information for the command associated with event if profiling is enabled.',type='protos']
--
To return profiling information for a command associated with an event when
profiling is enabled, call the function
include::{generated}/api/protos/clGetEventProfilingInfo.txt[]
include::{generated}/api/version-notes/clGetEventProfilingInfo.asciidoc[]
* _event_ specifies the event object.
* _param_name_ specifies the profiling data to query.
The list of supported _param_name_ types and the information returned in
_param_value_ by {clGetEventProfilingInfo} is described in the
<<event-profiling-info-table,Event Profiling Queries>> table.
* _param_value_ is a pointer to memory where the appropriate result being
queried is returned.
If _param_value_ is `NULL`, it is ignored.
* _param_value_size_ is used to specify the size in bytes of memory pointed to
by _param_value_.
This size must be {geq} size of return type as described in the
<<event-profiling-info-table,Event Profiling Queries>> table.
* _param_value_size_ret_ returns the actual size in bytes of data being
queried by _param_name_.
If _param_value_size_ret_ is `NULL`, it is ignored.
[[event-profiling-info-table]]
.List of supported param_names by <<clGetEventProfilingInfo>>
[width="100%",cols="<33%,<17%,<50%",options="header"]
|====
| Event Profiling Info | Return Type | Description
| {CL_PROFILING_COMMAND_QUEUED_anchor}
include::{generated}/api/version-notes/CL_PROFILING_COMMAND_QUEUED.asciidoc[]
| {cl_ulong_TYPE}
| A 64-bit value that describes the current device time counter in
nanoseconds when the command identified by event is enqueued in a
command-queue by the host.
| {CL_PROFILING_COMMAND_SUBMIT_anchor}
include::{generated}/api/version-notes/CL_PROFILING_COMMAND_SUBMIT.asciidoc[]
| {cl_ulong_TYPE}
| A 64-bit value that describes the current device time counter in
nanoseconds when the command identified by event that has been
enqueued is submitted by the host to the device associated with the
command-queue.
| {CL_PROFILING_COMMAND_START_anchor}
include::{generated}/api/version-notes/CL_PROFILING_COMMAND_START.asciidoc[]
| {cl_ulong_TYPE}
| A 64-bit value that describes the current device time counter in
nanoseconds when the command identified by event starts execution on
the device.
| {CL_PROFILING_COMMAND_END_anchor}
include::{generated}/api/version-notes/CL_PROFILING_COMMAND_END.asciidoc[]
| {cl_ulong_TYPE}
| A 64-bit value that describes the current device time counter in
nanoseconds when the command identified by event has finished
execution on the device.
| {CL_PROFILING_COMMAND_COMPLETE_anchor}
include::{generated}/api/version-notes/CL_PROFILING_COMMAND_COMPLETE.asciidoc[]
| {cl_ulong_TYPE}
| A 64-bit value that describes the current device time counter in
nanoseconds when the command identified by event and any child
commands enqueued by this command on the device have finished
execution.
|====
The unsigned 64-bit values returned can be used to measure the time in
nanoseconds consumed by OpenCL commands.
OpenCL devices are required to correctly track time across changes in device
frequency and power states.
The {CL_DEVICE_PROFILING_TIMER_RESOLUTION} specifies the resolution of the
timer, i.e. the number of nanoseconds elapsed before the timer is
incremented.
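As an illustration of how the returned values might be combined, the sketch below derives three intervals from the timestamps of a single completed event.
It assumes the four values have already been obtained with {clGetEventProfilingInfo}.
The type `profile_times` and the three helper functions are hypothetical names, and `uint64_t` stands in for {cl_ulong_TYPE} so that the arithmetic compiles without the OpenCL headers.

```c
#include <stdint.h>

/* Hypothetical holder for the timestamps of one completed event,
 * all in nanoseconds. uint64_t stands in for cl_ulong. */
typedef struct {
    uint64_t queued;  /* value queried with CL_PROFILING_COMMAND_QUEUED */
    uint64_t submit;  /* value queried with CL_PROFILING_COMMAND_SUBMIT */
    uint64_t start;   /* value queried with CL_PROFILING_COMMAND_START  */
    uint64_t end;     /* value queried with CL_PROFILING_COMMAND_END    */
} profile_times;

/* Time between enqueue by the host and submission to the device. */
uint64_t queue_delay_ns(const profile_times *t)
{
    return t->submit - t->queued;
}

/* Time between submission to the device and start of execution. */
uint64_t launch_delay_ns(const profile_times *t)
{
    return t->start - t->submit;
}

/* Time spent executing on the device. */
uint64_t execution_ns(const profile_times *t)
{
    return t->end - t->start;
}
```

Intervals that are close to the timer resolution should be interpreted with care, since the timer only advances in steps of that size.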
// refError
{clGetEventProfilingInfo} returns {CL_SUCCESS} if the function is executed
successfully and the profiling information has been recorded.
Otherwise, it returns one of the following errors:
* {CL_PROFILING_INFO_NOT_AVAILABLE} if the {CL_QUEUE_PROFILING_ENABLE} flag is
not set for the command-queue, if the execution status of the command
identified by _event_ is not {CL_COMPLETE} or if _event_ is a user event
object.
* {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes
specified by _param_value_size_ is < size of return type as described in
the <<event-profiling-info-table,Event Profiling Queries>> table and
_param_value_ is not `NULL`.
* {CL_INVALID_EVENT} if _event_ is not a valid event object.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
== Flush and Finish
[open,refpage='clFlush',desc='Issues all previously queued OpenCL commands in a command-queue to the device associated with the command-queue.',type='protos']
--
To flush commands to a device, call the function
include::{generated}/api/protos/clFlush.txt[]
include::{generated}/api/version-notes/clFlush.asciidoc[]
* _command_queue_ is the command-queue to flush.
All previously queued OpenCL commands in _command_queue_ are issued to the
device associated with _command_queue_.
{clFlush} only guarantees that all queued commands to _command_queue_ will
eventually be submitted to the appropriate device.
There is no guarantee that they will be complete after {clFlush} returns.
Any blocking commands queued in a command-queue and {clReleaseCommandQueue}
perform an implicit flush of the command-queue.
These blocking commands are {clEnqueueReadBuffer},
{clEnqueueReadBufferRect}, and {clEnqueueReadImage} with _blocking_read_ set
to {CL_TRUE}; {clEnqueueWriteBuffer}, {clEnqueueWriteBufferRect}, and
{clEnqueueWriteImage} with _blocking_write_ set to {CL_TRUE};
{clEnqueueMapBuffer} and {clEnqueueMapImage} with _blocking_map_ set to
{CL_TRUE}; {clEnqueueSVMMemcpy} with _blocking_copy_ set to {CL_TRUE};
{clEnqueueSVMMap} with _blocking_map_ set to {CL_TRUE}; and {clWaitForEvents}.
To use event objects that refer to commands enqueued in one command-queue as
event objects to wait on by commands enqueued in a different command-queue,
the application must call {clFlush}, or any blocking command that performs
an implicit flush, on the command-queue where the commands that refer to
these event objects are enqueued.
// refError
{clFlush} returns {CL_SUCCESS} if the function call was executed successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--
[open,refpage='clFinish',desc='Blocks until all previously queued OpenCL commands in a command-queue are issued to the associated device and have completed.',type='protos']
--
To wait for completion of commands on a device, call the function
include::{generated}/api/protos/clFinish.txt[]
include::{generated}/api/version-notes/clFinish.asciidoc[]
* _command_queue_ is the command-queue to wait for.
All previously queued OpenCL commands in _command_queue_ are issued to the
associated device, and the function blocks until all previously queued
commands have completed.
{clFinish} does not return until all previously queued commands in
_command_queue_ have been processed and completed.
{clFinish} is also a synchronization point.
// refError
{clFinish} returns {CL_SUCCESS} if the function call was executed
successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host
command-queue.
* {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
by the OpenCL implementation on the device.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
required by the OpenCL implementation on the host.
--