| // Copyright 2017-2020 The Khronos Group. This work is licensed under a |
| // Creative Commons Attribution 4.0 International License; see |
| // http://creativecommons.org/licenses/by/4.0/ |
| |
| [[opencl-runtime]] |
| = The OpenCL Runtime |
| |
| This section describes the API calls that manage OpenCL objects such as |
| command-queues, memory objects, program objects, and kernel objects for kernel |
| functions in a program, and the calls that enqueue commands to a |
| command-queue, such as executing a kernel or reading or writing a memory |
| object. |
| |
| == Command Queues |
| |
| OpenCL objects such as memory, program and kernel objects are created using |
| a context. |
| Operations on these objects are performed using a command-queue. |
| The command-queue can be used to queue a set of operations (referred to as |
| commands) in order. |
| Having multiple command-queues allows applications to queue multiple |
| independent commands without requiring synchronization. |
| Note that this works only as long as the command-queues do not share objects. |
| Sharing of objects across multiple command-queues will require the |
| application to perform appropriate synchronization. |
| This is described in <<shared-opencl-objects, Shared OpenCL Objects>>. |
| |
| [open,refpage='clCreateCommandQueueWithProperties',desc='Create a host or device command-queue on a specific device.',type='protos'] |
| -- |
| To create a host or device command-queue on a specific device, call the |
| function |
| |
| include::{generated}/api/protos/clCreateCommandQueueWithProperties.txt[] |
| include::{generated}/api/version-notes/clCreateCommandQueueWithProperties.asciidoc[] |
| Also see extension *cl_khr_create_command_queue*. |
| |
| * _context_ must be a valid OpenCL context. |
| * _device_ must be a device or sub-device associated with _context_. |
| It can either be in the list of devices and sub-devices specified when |
| _context_ is created using {clCreateContext} or be a root device with the |
| same device type as specified when _context_ is created using |
| {clCreateContextFromType}. |
| * _properties_ specifies a list of properties for the command-queue and their |
| corresponding values. |
| Each property name is immediately followed by the corresponding desired |
| value. |
| The list is terminated with 0. |
| The list of supported properties is described in the |
| <<queue-properties-table, table below>>. |
| If a supported property and its value are not specified in _properties_, its |
| default value will be used. |
| _properties_ can be `NULL` in which case the default values for supported |
| command-queue properties will be used. |
| * _errcode_ret_ will return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
| |
| [[queue-properties-table]] |
| .List of supported queue creation properties by <<clCreateCommandQueueWithProperties>> |
| [width="100%",cols="<33%,<17%,<50%",options="header"] |
| |==== |
| | Queue Property | Property Value | Description |
| | {CL_QUEUE_PROPERTIES_anchor} |
| |
| include::{generated}/api/version-notes/CL_QUEUE_PROPERTIES.asciidoc[] |
| | {cl_command_queue_properties_TYPE} |
| | This is a bitfield and can be set to a combination of the following |
| values: |
| |
| {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE_anchor} - Determines whether the |
| commands queued in the command-queue are executed in-order or |
| out-of-order. |
| If set, the commands in the command-queue are executed out-of-order. |
| Otherwise, commands are executed in-order. |
| include::{generated}/api/version-notes/CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE.asciidoc[] |
| |
| {CL_QUEUE_PROFILING_ENABLE_anchor} - Enable or disable profiling of commands |
| in the command-queue. |
| If set, the profiling of commands is enabled. |
| Otherwise profiling of commands is disabled. |
| include::{generated}/api/version-notes/CL_QUEUE_PROFILING_ENABLE.asciidoc[] |
| |
| {CL_QUEUE_ON_DEVICE_anchor} - Indicates that this is a device queue. |
| If {CL_QUEUE_ON_DEVICE} is set, |
| {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} |
| footnote:[{fn-out-of-order-device-queue}] |
| must also be set. |
| include::{generated}/api/version-notes/CL_QUEUE_ON_DEVICE.asciidoc[] |
| |
| {CL_QUEUE_ON_DEVICE_DEFAULT_anchor} |
| footnote:[{fn-default-device-queue}] - |
| indicates that this is the default device queue. |
| This can only be used with {CL_QUEUE_ON_DEVICE}. |
| include::{generated}/api/version-notes/CL_QUEUE_ON_DEVICE_DEFAULT.asciidoc[] |
| |
| If {CL_QUEUE_PROPERTIES} is not specified, an in-order host command |
| queue is created for the specified device. |
| | {CL_QUEUE_SIZE_anchor} |
| |
| include::{generated}/api/version-notes/CL_QUEUE_SIZE.asciidoc[] |
| | {cl_uint_TYPE} |
| | Specifies the size of the device queue in bytes. |
| |
| This can only be specified if {CL_QUEUE_ON_DEVICE} is set in |
| {CL_QUEUE_PROPERTIES}. |
| This must be a value {leq} {CL_DEVICE_QUEUE_ON_DEVICE_MAX_SIZE}. |
| |
| For best performance, this should be {leq} |
| {CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE}. |
| |
| If {CL_QUEUE_SIZE} is not specified, the device queue is created with |
| {CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE} as the size of the queue. |
| |==== |
| |
| // refError |
| |
| {clCreateCommandQueueWithProperties} returns a valid non-zero command-queue |
| and _errcode_ret_ is set to {CL_SUCCESS} if the command-queue is created |
| successfully. |
| Otherwise, it returns a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_DEVICE} if _device_ is not a valid device or is not associated |
| with _context_. |
| * {CL_INVALID_VALUE} if values specified in _properties_ are not valid. |
| * {CL_INVALID_QUEUE_PROPERTIES} if values specified in _properties_ are |
| valid but are not supported by the device. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
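
The rules in the queue properties table can be sketched as a small validity check over a zero-terminated name/value list.
This is a minimal model in plain C, not the behavior of any particular implementation: the `ex_`/`EX_` names are hypothetical stand-ins defined locally so the sketch compiles without OpenCL headers, and `max_device_queue_size` stands in for the value a real program would query as {CL_DEVICE_QUEUE_ON_DEVICE_MAX_SIZE}.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins, not the official definitions.
   A real program takes the CL_QUEUE_* names from CL/cl.h. */
typedef uint64_t ex_queue_properties;

#define EX_QUEUE_PROPERTIES  0x1093u   /* name: a bitfield value follows */
#define EX_QUEUE_SIZE        0x1094u   /* name: a size in bytes follows  */

#define EX_OUT_OF_ORDER      (1u << 0)
#define EX_PROFILING         (1u << 1)
#define EX_ON_DEVICE         (1u << 2)
#define EX_ON_DEVICE_DEFAULT (1u << 3)

/* Returns 1 if the list obeys the rules stated in the table, 0 otherwise.
   A NULL list is accepted: default values apply.
   The caller must pass complete name/value pairs followed by a 0. */
int ex_check_queue_properties(const ex_queue_properties *props,
                              uint64_t max_device_queue_size)
{
    uint64_t bits = 0;
    uint64_t size = 0;
    int has_size = 0;

    if (props != NULL) {
        /* Step by two so that a value of 0 is not mistaken for the terminator. */
        for (size_t i = 0; props[i] != 0; i += 2) {
            if (props[i] == EX_QUEUE_PROPERTIES) {
                bits = props[i + 1];
            } else if (props[i] == EX_QUEUE_SIZE) {
                size = props[i + 1];
                has_size = 1;
            } else {
                return 0; /* property name not known to this sketch */
            }
        }
    }

    /* A device queue must also be out-of-order. */
    if ((bits & EX_ON_DEVICE) && !(bits & EX_OUT_OF_ORDER)) return 0;
    /* The default-device-queue bit is only meaningful for device queues. */
    if ((bits & EX_ON_DEVICE_DEFAULT) && !(bits & EX_ON_DEVICE)) return 0;
    /* A queue size may only be given for device queues, within the maximum. */
    if (has_size && !(bits & EX_ON_DEVICE)) return 0;
    if (has_size && size > max_device_queue_size) return 0;
    return 1;
}
```

For example, the list `{ EX_QUEUE_PROPERTIES, EX_ON_DEVICE | EX_OUT_OF_ORDER, EX_QUEUE_SIZE, 4096, 0 }` passes when the maximum is at least 4096 bytes, while `{ EX_QUEUE_PROPERTIES, EX_ON_DEVICE, 0 }` is rejected because the out-of-order bit is missing.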
| |
| [open,refpage='clCreateCommandQueue',desc='Create a host command-queue on a specific device.',type='protos'] |
| -- |
| To create a host command-queue on a specific device, call the function |
| |
| include::{generated}/api/protos/clCreateCommandQueue.txt[] |
| include::{generated}/api/version-notes/clCreateCommandQueue.asciidoc[] |
| |
| * _context_ must be a valid OpenCL context. |
| * _device_ must be a device or sub-device associated with _context_. |
| It can either be in the list of devices and sub-devices specified when |
| _context_ is created using {clCreateContext} or be a root device with the |
| same device type as specified when _context_ is created using |
| {clCreateContextFromType}. |
| * _properties_ specifies a list of properties for the command-queue. |
| This is a bit-field and the supported properties are described in the |
| <<legacy-queue-properties-table, table>> below. |
| Only command-queue properties specified in this table can be used, |
| otherwise the value specified in _properties_ is considered to be not |
| valid. |
| _properties_ can be 0 in which case the default values for supported |
| command-queue properties will be used. |
| |
| [[legacy-queue-properties-table]] |
| .List of supported `cl_command_queue_property` values by <<clCreateCommandQueue>> |
| [width="100%",cols="<50%,<50%",options="header"] |
| |==== |
| | Command-Queue Properties | Description |
| | {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} |
| | Determines whether the commands queued in the command-queue are executed |
| in-order or out-of-order. |
| If set, the commands in the command-queue are executed out-of-order. |
| Otherwise, commands are executed in-order. |
| |
| | {CL_QUEUE_PROFILING_ENABLE} |
| | Enable or disable profiling of commands in the command-queue. |
| If set, the profiling of commands is enabled. |
| Otherwise profiling of commands is disabled. |
| |==== |
| |
| * _errcode_ret_ will return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
| |
| // refError |
| |
| {clCreateCommandQueue} returns a valid non-zero command-queue and _errcode_ret_ |
| is set to {CL_SUCCESS} if the command-queue is created successfully. |
| Otherwise, it returns a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_DEVICE} if _device_ is not a valid device or is not associated |
| with _context_. |
| * {CL_INVALID_VALUE} if values specified in _properties_ are not valid. |
| * {CL_INVALID_QUEUE_PROPERTIES} if values specified in _properties_ are |
| valid but are not supported by the device. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clSetDefaultDeviceCommandQueue',desc='Replaces the default command queue on the device.',type='protos'] |
| -- |
| To replace the default command queue on a device, call the function |
| |
| include::{generated}/api/protos/clSetDefaultDeviceCommandQueue.txt[] |
| include::{generated}/api/version-notes/clSetDefaultDeviceCommandQueue.asciidoc[] |
| |
| * _context_ is the OpenCL context used to create _command_queue_. |
| * _device_ is a valid OpenCL device associated with _context_. |
| * _command_queue_ specifies a command queue object which replaces the |
| default device command queue. |
| |
| {clSetDefaultDeviceCommandQueue} may be used to replace a default device |
| command queue created with {clCreateCommandQueueWithProperties} and the |
| {CL_QUEUE_ON_DEVICE_DEFAULT} flag. |
| |
| // refError |
| |
| {clSetDefaultDeviceCommandQueue} returns {CL_SUCCESS} if the function is |
| executed successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_DEVICE} if _device_ is not a valid device or is not associated |
| with _context_. |
| * {CL_INVALID_OPERATION} if _device_ does not support a replaceable default on-device queue. |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid command-queue |
| for _device_. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clRetainCommandQueue',desc='Increments the command_queue reference count.',type='protos'] |
| -- |
| To retain a command queue, call the function |
| |
| include::{generated}/api/protos/clRetainCommandQueue.txt[] |
| include::{generated}/api/version-notes/clRetainCommandQueue.asciidoc[] |
| |
| * _command_queue_ specifies the command-queue to be retained. |
| |
| The _command_queue_ reference count is incremented. |
| |
| {clCreateCommandQueueWithProperties} and {clCreateCommandQueue} perform an |
| implicit retain. |
| This is very helpful for 3^rd^ party libraries, which typically get a |
| command-queue passed to them by the application. |
| However, it is possible that the application may delete the command-queue |
| without informing the library. |
| Allowing functions to attach to (i.e. retain) and release a command-queue |
| solves the problem of a command-queue being used by a library no longer |
| being valid. |
| |
| // refError |
| |
| {clRetainCommandQueue} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid |
| command-queue. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clReleaseCommandQueue',desc='Decrements the command_queue reference count.',type='protos'] |
| -- |
| To release a command queue, call the function |
| |
| include::{generated}/api/protos/clReleaseCommandQueue.txt[] |
| include::{generated}/api/version-notes/clReleaseCommandQueue.asciidoc[] |
| |
| * _command_queue_ specifies the command-queue to be released. |
| |
| The _command_queue_ reference count is decremented. |
| |
| After the _command_queue_ reference count becomes zero and all commands |
| queued to _command_queue_ have finished (e.g. |
| kernel-instances, memory object updates etc.), the command-queue is deleted. |
| |
| {clReleaseCommandQueue} performs an implicit flush to issue any previously |
| queued OpenCL commands in _command_queue_. |
| Using this function to release a reference that was not obtained by creating |
| the object or by calling {clRetainCommandQueue} causes undefined behavior. |
| |
| // refError |
| |
| {clReleaseCommandQueue} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid |
| command-queue. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
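
The retain and release semantics described above can be modeled with a small counter.
This is a sketch under stated assumptions, not how any implementation stores its queues: `ex_queue` and its functions are hypothetical, and `pending` is an invented field that counts commands which have been queued but have not yet finished.

```c
#include <stdbool.h>

/* Hypothetical model of a command-queue's lifetime. */
typedef struct {
    unsigned ref_count; /* starts at 1: creation performs an implicit retain */
    unsigned pending;   /* commands queued but not yet finished */
    bool     flushed;   /* set once a release has issued queued commands */
    bool     deleted;   /* set once the queue is considered destroyed */
} ex_queue;

ex_queue ex_queue_create(void)
{
    ex_queue q = { 1, 0, false, false };
    return q;
}

/* Deletion needs both conditions: no references and no unfinished commands. */
static void ex_try_delete(ex_queue *q)
{
    if (q->ref_count == 0 && q->pending == 0) {
        q->deleted = true;
    }
}

void ex_queue_retain(ex_queue *q)
{
    q->ref_count++;
}

void ex_queue_release(ex_queue *q)
{
    if (q->ref_count == 0) {
        return; /* releasing a reference that was never obtained is undefined
                   in the real API; the model simply ignores it */
    }
    q->flushed = true; /* release performs an implicit flush */
    q->ref_count--;
    ex_try_delete(q);
}

void ex_queue_command_finished(ex_queue *q)
{
    if (q->pending > 0) {
        q->pending--;
    }
    ex_try_delete(q);
}
```

A library that receives a queue from an application would call the retain function on entry and the release function when done; in this model the queue is marked deleted only after the last release and after the last pending command has finished.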
| |
| [open,refpage='clGetCommandQueueInfo',desc='Query information about a command-queue.',type='protos'] |
| -- |
| To query information about a command-queue, call the function |
| |
| include::{generated}/api/protos/clGetCommandQueueInfo.txt[] |
| include::{generated}/api/version-notes/clGetCommandQueueInfo.asciidoc[] |
| |
| * _command_queue_ specifies the command-queue being queried. |
| * _param_name_ specifies the information to query. |
| * _param_value_ is a pointer to memory where the appropriate result being |
| queried is returned. |
| If _param_value_ is `NULL`, it is ignored. |
| * _param_value_size_ is used to specify the size in bytes of memory pointed to |
| by _param_value_. |
| This size must be {geq} size of return type as described in the |
| <<command-queue-param-table,Command Queue Parameter>> table. |
| If _param_value_ is `NULL`, it is ignored. |
| * _param_value_size_ret_ returns the actual size in bytes of data being |
| queried by _param_name_. |
| If _param_value_size_ret_ is `NULL`, it is ignored. |
| |
| The list of supported _param_name_ values and the information returned in |
| _param_value_ by {clGetCommandQueueInfo} is described in the |
| <<command-queue-param-table,Command Queue Parameter>> table. |
| |
| [[command-queue-param-table]] |
| .List of supported param_names by <<clGetCommandQueueInfo>> |
| [width="100%",cols="<33%,<17%,<50%",options="header"] |
| |==== |
| | Queue Info | Return Type | Description |
| | {CL_QUEUE_CONTEXT_anchor} |
| |
| include::{generated}/api/version-notes/CL_QUEUE_CONTEXT.asciidoc[] |
| | {cl_context_TYPE} |
| | Return the context specified when the command-queue is created. |
| | {CL_QUEUE_DEVICE_anchor} |
| |
| include::{generated}/api/version-notes/CL_QUEUE_DEVICE.asciidoc[] |
| | {cl_device_id_TYPE} |
| | Return the device specified when the command-queue is created. |
| | {CL_QUEUE_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}] |
| |
| include::{generated}/api/version-notes/CL_QUEUE_REFERENCE_COUNT.asciidoc[] |
| | {cl_uint_TYPE} |
| | Return the command-queue reference count. |
| | {CL_QUEUE_PROPERTIES_anchor} |
| |
| include::{generated}/api/version-notes/CL_QUEUE_PROPERTIES.asciidoc[] |
| | {cl_command_queue_properties_TYPE} |
| | Return the currently specified properties for the command-queue. |
| These properties are specified by the value associated with the |
| {CL_QUEUE_PROPERTIES} property passed in the _properties_ argument of |
| {clCreateCommandQueueWithProperties}, or by the value of the _properties_ |
| argument of {clCreateCommandQueue}. |
| |
| | {CL_QUEUE_PROPERTIES_ARRAY_anchor} |
| |
| include::{generated}/api/version-notes/CL_QUEUE_PROPERTIES_ARRAY.asciidoc[] |
| | {cl_queue_properties_TYPE}[] |
| | Return the properties argument specified in |
| {clCreateCommandQueueWithProperties}. |
| |
| If the _properties_ argument specified in |
| {clCreateCommandQueueWithProperties} used to create _command_queue_ |
| was not `NULL`, the implementation must return the values specified in |
| the properties argument in the same order and without including |
| additional properties. |
| |
| If _command_queue_ was created using {clCreateCommandQueue}, or if the |
| _properties_ argument specified in {clCreateCommandQueueWithProperties} |
| was `NULL`, the implementation must return _param_value_size_ret_ |
| equal to 0, indicating that there are no properties to be returned. |
| |
| | {CL_QUEUE_SIZE_anchor} |
| |
| include::{generated}/api/version-notes/CL_QUEUE_SIZE.asciidoc[] |
| | {cl_uint_TYPE} |
| | Return the size of the device command-queue. |
| To be considered valid for this query, _command_queue_ must be a |
| device command-queue. |
| |
| | {CL_QUEUE_DEVICE_DEFAULT_anchor} |
| |
| include::{generated}/api/version-notes/CL_QUEUE_DEVICE_DEFAULT.asciidoc[] |
| | {cl_command_queue_TYPE} |
| | Return the current default command queue for the underlying device. |
| |==== |
| |
| // refError |
| |
| {clGetCommandQueueInfo} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid |
| command-queue, or if _command_queue_ is not a valid command-queue |
| for _param_name_. |
| * {CL_INVALID_VALUE} if _param_name_ is not one of the supported values or |
| if size in bytes specified by _param_value_size_ is < size of return |
| type as specified in the <<command-queue-param-table,Command Queue |
| Parameter>> table, and _param_value_ is not a `NULL` value. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
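
The interaction between _param_value_, _param_value_size_ and _param_value_size_ret_ can be sketched as a copy routine.
This is a plain-C model of the stated rules only; `ex_copy_info` and the `EX_` codes are hypothetical names, and `value`/`value_size` stand in for whatever result the implementation holds for the requested _param_name_.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in result codes for this sketch only. */
enum { EX_SUCCESS = 0, EX_INVALID_VALUE = -30 };

/* Models the size handling of an info query.
   value / value_size: the result the implementation would return. */
int ex_copy_info(const void *value, size_t value_size,
                 size_t param_value_size, void *param_value,
                 size_t *param_value_size_ret)
{
    if (param_value != NULL) {
        /* The size is only checked when a destination is supplied. */
        if (param_value_size < value_size) {
            return EX_INVALID_VALUE;
        }
        if (value_size > 0) {
            memcpy(param_value, value, value_size);
        }
    }
    if (param_value_size_ret != NULL) {
        *param_value_size_ret = value_size; /* actual size of the result */
    }
    return EX_SUCCESS;
}
```

This is why a common calling pattern is to query twice: first with a `NULL` destination to learn the size, then with a buffer of at least that size. A reported size of 0 corresponds to the {CL_QUEUE_PROPERTIES_ARRAY} case where there are no properties to return.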
| |
| [open,refpage='clSetCommandQueueProperty',desc='Enable or disable properties of a command-queue.',type='protos'] |
| -- |
| To enable or disable the properties of a command-queue, call the function |
| |
| include::{generated}/api/protos/clSetCommandQueueProperty.txt[] |
| include::{generated}/api/version-notes/clSetCommandQueueProperty.asciidoc[] |
| |
| * _command_queue_ specifies the command-queue being modified. |
| * _properties_ specifies the new list of properties for the command-queue. |
| This is a bit-field and the supported properties are described in the |
| <<legacy-queue-properties-table, Command-Queue Properties table>> for |
| {clCreateCommandQueue}. |
| Only command-queue properties specified in this table can be used, |
| otherwise the value specified in _properties_ is considered to be not |
| valid. |
| * _enable_ determines whether the values specified by _properties_ are |
| enabled (if _enable_ is {CL_TRUE}) or disabled (if _enable_ is {CL_FALSE}) |
| for the command-queue. |
| * _old_properties_ returns the command-queue properties before they were |
| changed by {clSetCommandQueueProperty}. If _old_properties_ is `NULL`, it |
| is ignored. |
| |
| [NOTE] |
| ==== |
| Changing the {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} command-queue property |
| will cause the OpenCL implementation to block until all previously queued |
| commands in _command_queue_ have completed. This can be an expensive operation |
| and therefore changes to this property should only be done when absolutely |
| necessary. |
| ==== |
| |
| // refError |
| |
| {clSetCommandQueueProperty} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid command-queue. |
| * {CL_INVALID_VALUE} if values specified in _properties_ are not valid. |
| * {CL_INVALID_QUEUE_PROPERTIES} if values specified in _properties_ are |
| valid but are not supported by the device. |
| -- |
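
The effect of _properties_, _enable_ and _old_properties_ on the queue's bitfield can be sketched as follows.
This models only the bit manipulation; `ex_set_properties` is a hypothetical helper, and the blocking behavior described in the note above is not represented.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the new property bits for a queue whose bits are `current`.
   If old_properties is not NULL, it receives the bits before the change. */
uint64_t ex_set_properties(uint64_t current, uint64_t properties,
                           bool enable, uint64_t *old_properties)
{
    if (old_properties != NULL) {
        *old_properties = current;
    }
    /* enable sets the named bits; disable clears them and leaves the rest. */
    return enable ? (current | properties) : (current & ~properties);
}
```

Saving the value written to `old_properties` lets a caller restore the previous configuration later by enabling or disabling the bits that differ.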
| |
| |
| == Buffer Objects |
| |
| A _buffer_ object stores a one-dimensional collection of elements. |
| Elements of a _buffer_ object can be a scalar data type (such as an int, |
| float), vector data type, or a user-defined structure. |
| |
| |
| === Creating Buffer Objects |
| |
| [open,refpage='clCreateBuffer',desc='Creates a buffer object.',type='protos',alias='clCreateBufferWithProperties'] |
| -- |
| A *buffer object* may be created using the function |
| |
| include::{generated}/api/protos/clCreateBuffer.txt[] |
| include::{generated}/api/version-notes/clCreateBuffer.asciidoc[] |
| |
| A *buffer object* may also be created with additional properties using the function |
| |
| include::{generated}/api/protos/clCreateBufferWithProperties.txt[] |
| include::{generated}/api/version-notes/clCreateBufferWithProperties.asciidoc[] |
| |
| * _context_ is a valid OpenCL context used to create the buffer object. |
| * _properties_ is an optional list of properties for the buffer object and their corresponding values. |
| The list is terminated with the special property `0`. |
| If no properties are required, _properties_ may be `NULL`. |
| OpenCL 3.0 does not define any optional properties for buffers. |
| * _flags_ is a bit-field that is used to specify allocation and usage |
| information about the buffer memory object being created and is described in |
| the <<memory-flags-table, supported memory flag values>> table. |
| * _size_ is the size in bytes of the buffer memory object to be allocated. |
| * _host_ptr_ is a pointer to the buffer data that may already be allocated |
| by the application. |
| The size of the buffer that _host_ptr_ points to must be greater than or equal to _size_ |
| bytes. |
| * _errcode_ret_ may return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
| |
| The alignment requirements for data stored in buffer objects are described |
| in <<alignment-app-data-types>>. |
| |
| If {clCreateBuffer} or {clCreateBufferWithProperties} is called with |
| {CL_MEM_USE_HOST_PTR} set in its _flags_ argument, the contents of the |
| memory pointed to by _host_ptr_ at the time of the {clCreateBuffer} call |
| define the initial contents of the buffer object. |
| |
| If {clCreateBuffer} or {clCreateBufferWithProperties} is called with a |
| pointer returned by {clSVMAlloc} as its _host_ptr_ argument, and |
| {CL_MEM_USE_HOST_PTR} is set in its _flags_ argument, {clCreateBuffer} or |
| {clCreateBufferWithProperties} will succeed and return a valid non-zero |
| buffer object as long as the _size_ argument is no larger than the |
| _size_ argument passed in the original {clSVMAlloc} call. |
| The new buffer object returned has the shared memory as the underlying |
| storage. |
| Locations in the buffer's underlying shared memory can be operated on using |
| atomic operations to the device's level of support as defined in the memory |
| model. |
| |
| // refError |
| |
| {clCreateBuffer} and {clCreateBufferWithProperties} return a valid non-zero |
| buffer object and _errcode_ret_ is set to {CL_SUCCESS} if the buffer object |
| is created successfully. |
| Otherwise, they return a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_PROPERTY} if a property name in _properties_ is not a |
| supported property name, if the value specified for a supported property |
| name is not valid, or if the same property name is specified more than |
| once. |
| * {CL_INVALID_VALUE} if values specified in _flags_ are not valid as defined |
| in the <<memory-flags-table,Memory Flags>> table. |
| * {CL_INVALID_BUFFER_SIZE} if _size_ is 0 or if _size_ is greater than |
| {CL_DEVICE_MAX_MEM_ALLOC_SIZE} for all devices in _context_. |
| * {CL_INVALID_HOST_PTR} if _host_ptr_ is `NULL` and {CL_MEM_USE_HOST_PTR} or |
| {CL_MEM_COPY_HOST_PTR} are set in _flags_ or if _host_ptr_ is not `NULL` |
| but {CL_MEM_COPY_HOST_PTR} or {CL_MEM_USE_HOST_PTR} are not set in _flags_. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for buffer object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
| [[memory-flags-table]] |
| .List of supported memory flag values |
| [width="100%",cols="<50%,<50%",options="header"] |
| |==== |
| | Memory Flags | Description |
| | {CL_MEM_READ_WRITE_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_READ_WRITE.asciidoc[] |
| | This flag specifies that the memory object will be read |
| and written by a kernel. |
| This is the default. |
| | {CL_MEM_WRITE_ONLY_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_WRITE_ONLY.asciidoc[] |
| | This flag specifies that the memory object will be |
| written but not read by a kernel. |
| |
| Reading from a buffer or image object created with {CL_MEM_WRITE_ONLY} |
| inside a kernel is undefined. |
| |
| {CL_MEM_READ_WRITE} and {CL_MEM_WRITE_ONLY} are mutually exclusive. |
| | {CL_MEM_READ_ONLY_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_READ_ONLY.asciidoc[] |
| | This flag specifies that the memory object is a |
| read-only memory object when used inside a kernel. |
| |
| Writing to a buffer or image object created with {CL_MEM_READ_ONLY} inside |
| a kernel is undefined. |
| |
| {CL_MEM_READ_WRITE} or {CL_MEM_WRITE_ONLY} and {CL_MEM_READ_ONLY} are mutually |
| exclusive. |
| | {CL_MEM_USE_HOST_PTR_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_USE_HOST_PTR.asciidoc[] |
| | This flag is valid only if host_ptr is not `NULL`. |
| If specified, it indicates that the application wants the OpenCL |
| implementation to use memory referenced by host_ptr as the storage bits |
| for the memory object. |
| |
| The contents of the memory pointed to by host_ptr at the time of the |
| {clCreateBuffer}, {clCreateBufferWithProperties}, {clCreateImage}, |
| {clCreateImageWithProperties}, {clCreateImage2D}, or {clCreateImage3D} |
| call define the initial contents of the memory object. |
| |
| OpenCL implementations are allowed to cache the contents pointed |
| to by host_ptr in device memory. |
| This cached copy can be used when kernels are executed on a device. |
| |
| The result of OpenCL commands that operate on multiple buffer objects |
| created with the same host_ptr or from overlapping host or SVM regions |
| is considered to be undefined. |
| | {CL_MEM_ALLOC_HOST_PTR_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_ALLOC_HOST_PTR.asciidoc[] |
| | This flag specifies that the application wants the OpenCL implementation |
| to allocate memory from host accessible memory. |
| |
| {CL_MEM_ALLOC_HOST_PTR} and {CL_MEM_USE_HOST_PTR} are mutually exclusive. |
| | {CL_MEM_COPY_HOST_PTR_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_COPY_HOST_PTR.asciidoc[] |
| | This flag is valid only if host_ptr is not `NULL`. |
| If specified, it indicates that the application wants the OpenCL |
| implementation to allocate memory for the memory object and copy the |
| data from memory referenced by host_ptr. |
| The implementation will copy the memory immediately and host_ptr is |
| available for reuse by the application when the {clCreateBuffer}, |
| {clCreateBufferWithProperties}, {clCreateImage}, {clCreateImageWithProperties}, |
| {clCreateImage2D}, or {clCreateImage3D} operation returns. |
| |
| {CL_MEM_COPY_HOST_PTR} and {CL_MEM_USE_HOST_PTR} are mutually exclusive. |
| |
| {CL_MEM_COPY_HOST_PTR} can be used with {CL_MEM_ALLOC_HOST_PTR} to |
| initialize the contents of the {cl_mem_TYPE} object allocated using |
| host-accessible (e.g. PCIe) memory. |
| | {CL_MEM_HOST_WRITE_ONLY_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_HOST_WRITE_ONLY.asciidoc[] |
| | This flag specifies that the host will only write to the memory object |
| (using OpenCL APIs that enqueue a write or a map for write). |
| This can be used to optimize write access from the host (e.g. enable |
| write-combined allocations for memory objects for devices that |
| communicate with the host over a system bus such as PCIe). |
| | {CL_MEM_HOST_READ_ONLY_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_HOST_READ_ONLY.asciidoc[] |
| | This flag specifies that the host will only read |
| the memory object (using OpenCL APIs that enqueue a read or a map for |
| read). |
| |
| {CL_MEM_HOST_WRITE_ONLY} and {CL_MEM_HOST_READ_ONLY} are mutually exclusive. |
| | {CL_MEM_HOST_NO_ACCESS_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_HOST_NO_ACCESS.asciidoc[] |
| | This flag specifies that the host will not read or |
| write the memory object. |
| |
| {CL_MEM_HOST_WRITE_ONLY} or {CL_MEM_HOST_READ_ONLY} and |
| {CL_MEM_HOST_NO_ACCESS} are mutually exclusive. |
| | {CL_MEM_KERNEL_READ_AND_WRITE_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_KERNEL_READ_AND_WRITE.asciidoc[] |
| | This flag is only used by {clGetSupportedImageFormats} to query image |
| formats that may be both read from and written to by the same kernel |
| instance. |
| To create a memory object that may be read from and written to, use |
| {CL_MEM_READ_WRITE}. |
| |==== |
| -- |
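
The mutual-exclusion rules in the memory flags table and the _host_ptr_ rule in the error list can be sketched as one check.
This is a simplified model, not the validation performed by any implementation: the `EX_` names are hypothetical stand-ins defined locally, `EX_BAD_FLAGS` plays the role of {CL_INVALID_VALUE}, and `EX_BAD_HOST_PTR` plays the role of {CL_INVALID_HOST_PTR}.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins; a real program uses the CL_MEM_* names from CL/cl.h. */
#define EX_MEM_READ_WRITE      (1u << 0)
#define EX_MEM_WRITE_ONLY      (1u << 1)
#define EX_MEM_READ_ONLY       (1u << 2)
#define EX_MEM_USE_HOST_PTR    (1u << 3)
#define EX_MEM_ALLOC_HOST_PTR  (1u << 4)
#define EX_MEM_COPY_HOST_PTR   (1u << 5)
#define EX_MEM_HOST_WRITE_ONLY (1u << 7)
#define EX_MEM_HOST_READ_ONLY  (1u << 8)
#define EX_MEM_HOST_NO_ACCESS  (1u << 9)

enum { EX_OK = 0, EX_BAD_FLAGS = 1, EX_BAD_HOST_PTR = 2 };

/* True when more than one bit is set. */
static int ex_more_than_one(uint64_t bits)
{
    return (bits & (bits - 1)) != 0;
}

int ex_check_buffer_flags(uint64_t flags, const void *host_ptr)
{
    uint64_t kernel_access = flags & (EX_MEM_READ_WRITE | EX_MEM_WRITE_ONLY |
                                      EX_MEM_READ_ONLY);
    uint64_t host_access = flags & (EX_MEM_HOST_WRITE_ONLY |
                                    EX_MEM_HOST_READ_ONLY |
                                    EX_MEM_HOST_NO_ACCESS);
    int wants_host_ptr = (flags & (EX_MEM_USE_HOST_PTR |
                                   EX_MEM_COPY_HOST_PTR)) != 0;

    /* At most one kernel access mode and at most one host access mode. */
    if (ex_more_than_one(kernel_access) || ex_more_than_one(host_access)) {
        return EX_BAD_FLAGS;
    }
    /* USE_HOST_PTR excludes both ALLOC_HOST_PTR and COPY_HOST_PTR. */
    if ((flags & EX_MEM_USE_HOST_PTR) &&
        (flags & (EX_MEM_ALLOC_HOST_PTR | EX_MEM_COPY_HOST_PTR))) {
        return EX_BAD_FLAGS;
    }
    /* host_ptr must be given exactly when USE_HOST_PTR or COPY_HOST_PTR is set. */
    if (wants_host_ptr != (host_ptr != NULL)) {
        return EX_BAD_HOST_PTR;
    }
    return EX_OK;
}
```

Note that combining the copy flag with the alloc flag is accepted by the check, which matches the table's statement that the two can be used together to initialize host-accessible memory.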
| |
| [open,refpage='clCreateSubBuffer',desc='Creates a new buffer object (referred to as a sub-buffer object) from an existing buffer object.',type='protos',xrefs='cl_buffer_region'] |
| -- |
| To create a new buffer object (referred to as a sub-buffer object) from an |
| existing buffer object, call the function |
| |
| include::{generated}/api/protos/clCreateSubBuffer.txt[] |
| include::{generated}/api/version-notes/clCreateSubBuffer.asciidoc[] |
| |
| * _buffer_ must be a valid buffer object and cannot be a sub-buffer object. |
| * _flags_ is a bit-field that is used to specify allocation and usage |
| information about the sub-buffer memory object being created and is |
| described in the <<memory-flags-table,Memory Flags>> table. |
| If the {CL_MEM_READ_WRITE}, {CL_MEM_READ_ONLY}, or {CL_MEM_WRITE_ONLY} values are |
| not specified in _flags_, they are inherited from the corresponding memory |
| access qualifiers associated with _buffer_. |
| The {CL_MEM_USE_HOST_PTR}, {CL_MEM_ALLOC_HOST_PTR}, and {CL_MEM_COPY_HOST_PTR} |
| values cannot be specified in _flags_ but are inherited from the |
| corresponding memory access qualifiers associated with _buffer_. |
| If {CL_MEM_COPY_HOST_PTR} is specified in the memory access qualifier values |
| associated with _buffer_ it does not imply any additional copies when the |
| sub-buffer is created from _buffer_. |
| If the {CL_MEM_HOST_WRITE_ONLY}, {CL_MEM_HOST_READ_ONLY}, or |
| {CL_MEM_HOST_NO_ACCESS} values are not specified in _flags_, they are |
| inherited from the corresponding memory access qualifiers associated with |
| _buffer_. |
| * _buffer_create_type_ and _buffer_create_info_ describe the type of buffer |
| object to be created. |
| The list of supported values for _buffer_create_type_ and corresponding |
| descriptor that _buffer_create_info_ points to is described in the |
| <<subbuffer-create-info-table,SubBuffer Attributes>> table. |
| |
| [[subbuffer-create-info-table]] |
| .List of supported buffer creation types by <<clCreateSubBuffer>> |
| [width="100%",cols="<50%,<50%",options="header"] |
| |==== |
| | Buffer Creation Type | Description |
| | {CL_BUFFER_CREATE_TYPE_REGION_anchor} |
| |
| include::{generated}/api/version-notes/CL_BUFFER_CREATE_TYPE_REGION.asciidoc[] |
| | Create a buffer object that represents a specific region in _buffer_. |
| |
| _buffer_create_info_ is a pointer to a {cl_buffer_region_TYPE} structure |
| specifying a region of the buffer. |
| |
| If _buffer_ is created with {CL_MEM_USE_HOST_PTR}, the _host_ptr_ |
| associated with the buffer object returned is _host_ptr + origin_. |
| |
| The buffer object returned references the data store allocated for |
| _buffer_ and points to the region specified by _buffer_create_info_ in |
| this data store. |
| |==== |
| |
| // refError |
| |
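| {clCreateSubBuffer} returns a valid non-zero buffer object and sets |
| _errcode_ret_ to {CL_SUCCESS} if the sub-buffer object is created |
| successfully. |
| Otherwise, it returns a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |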
| |
| * {CL_INVALID_MEM_OBJECT} if _buffer_ is not a valid buffer object or is a |
| sub-buffer object. |
| * {CL_INVALID_VALUE} if _buffer_ was created with {CL_MEM_WRITE_ONLY} and |
| _flags_ specifies {CL_MEM_READ_WRITE} or {CL_MEM_READ_ONLY}, or if _buffer_ |
| was created with {CL_MEM_READ_ONLY} and _flags_ specifies |
| {CL_MEM_READ_WRITE} or {CL_MEM_WRITE_ONLY}, or if _flags_ specifies |
| {CL_MEM_USE_HOST_PTR} or {CL_MEM_ALLOC_HOST_PTR} or {CL_MEM_COPY_HOST_PTR}. |
| * {CL_INVALID_VALUE} if _buffer_ was created with {CL_MEM_HOST_WRITE_ONLY} and |
| _flags_ specify {CL_MEM_HOST_READ_ONLY}, or if _buffer_ was created with |
| {CL_MEM_HOST_READ_ONLY} and _flags_ specify {CL_MEM_HOST_WRITE_ONLY}, or if |
| _buffer_ was created with {CL_MEM_HOST_NO_ACCESS} and _flags_ specify |
| {CL_MEM_HOST_READ_ONLY} or {CL_MEM_HOST_WRITE_ONLY}. |
| * {CL_INVALID_VALUE} if the value specified in _buffer_create_type_ is not |
| valid. |
| * {CL_INVALID_VALUE} if the value(s) specified in _buffer_create_info_ (for a |
| given _buffer_create_type_) are not valid or if _buffer_create_info_ is |
| `NULL`. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for the sub-buffer object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| // These errors are specific to the {CL_BUFFER_CREATE_TYPE_REGION} parameter |
| * {CL_INVALID_VALUE} if the region specified by the {cl_buffer_region_TYPE} |
| structure passed in _buffer_create_info_ is out of bounds in _buffer_. |
| * {CL_INVALID_BUFFER_SIZE} if the _size_ field of the {cl_buffer_region_TYPE} |
| structure passed in _buffer_create_info_ is 0. |
| * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if there are no devices in _context_ |
| associated with _buffer_ for which the _origin_ field of the |
| {cl_buffer_region_TYPE} structure passed in _buffer_create_info_ is |
| aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value. |
| |
| [NOTE] |
| ==== |
| Concurrent reading from, writing to and copying between both a buffer object |
| and its sub-buffer object(s) is undefined. |
| Concurrent reading from, writing to and copying between overlapping |
| sub-buffer objects created with the same buffer object is undefined. |
| Only reading from both a buffer object and its sub-buffer objects or reading |
| from multiple overlapping sub-buffer objects is defined. |
| ==== |
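
The region-specific error conditions listed above can be illustrated with a small host-side check.
This is only a sketch under stated assumptions: `check_sub_buffer_region` and the `region_status` codes are hypothetical names, not part of the OpenCL API; a single device is considered, whereas the specification tests all devices in the context; and the order in which the checks are made is a choice of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes standing in for the OpenCL error names. */
typedef enum {
    REGION_OK,
    REGION_INVALID_BUFFER_SIZE, /* size field is 0 */
    REGION_INVALID_VALUE,       /* region is out of bounds in the buffer */
    REGION_MISALIGNED_OFFSET    /* origin is not aligned for the device */
} region_status;

/* base_addr_align_bits mirrors CL_DEVICE_MEM_BASE_ADDR_ALIGN, which is
 * reported in bits. This sketch assumes it is a non-zero multiple of 8. */
static region_status check_sub_buffer_region(size_t origin, size_t size,
                                             size_t buffer_size,
                                             unsigned base_addr_align_bits)
{
    assert(base_addr_align_bits >= 8 && base_addr_align_bits % 8 == 0);
    size_t align_bytes = base_addr_align_bits / 8;

    if (size == 0)
        return REGION_INVALID_BUFFER_SIZE;
    /* Written so that origin + size is never computed and cannot overflow. */
    if (origin > buffer_size || size > buffer_size - origin)
        return REGION_INVALID_VALUE;
    if (origin % align_bytes != 0)
        return REGION_MISALIGNED_OFFSET;
    return REGION_OK;
}
```

With an alignment of 1024 bits (128 bytes), an _origin_ of 128 passes the alignment check while an _origin_ of 64 does not.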
| -- |
| |
| [open,refpage='cl_buffer_region',desc='',type='structs',xrefs='clCreateSubBuffer'] |
| -- |
| The {cl_buffer_region_TYPE} structure specifies a region of a buffer object: |
| |
| include::{generated}/api/structs/cl_buffer_region.txt[] |
| |
| * _origin_ is the offset in bytes of the region. |
| * _size_ is the size in bytes of the region. |
| |
| Constraints on the values of _origin_ and _size_ are specified for the |
| {clCreateSubBuffer} function to which this structure is passed. |
| -- |
| |
| |
| === Reading, Writing and Copying Buffer Objects |
| |
| The following functions enqueue commands to read from a buffer object to |
| host memory or write to a buffer object from host memory. |
| |
| [open,refpage='clEnqueueReadBuffer',desc='Enqueue commands to read from a buffer object to host memory.',type='protos',alias='clEnqueueWriteBuffer'] |
| -- |
| To read from a buffer object to host memory or to write to a buffer object from |
| host memory, call one of the functions |
| |
| include::{generated}/api/protos/clEnqueueReadBuffer.txt[] |
| include::{generated}/api/version-notes/clEnqueueReadBuffer.asciidoc[] |
| |
| include::{generated}/api/protos/clEnqueueWriteBuffer.txt[] |
| include::{generated}/api/version-notes/clEnqueueWriteBuffer.asciidoc[] |
| |
| * _command_queue_ is a valid host command-queue in which the read / write |
| command will be queued. |
| _command_queue_ and _buffer_ must be created with the same OpenCL context. |
| * _buffer_ refers to a valid buffer object. |
| * _blocking_read_ and _blocking_write_ indicate if the read and write |
| operations are _blocking_ or _non-blocking_ (see below). |
| * _offset_ is the offset in bytes in the buffer object to read from or write |
| to. |
| * _size_ is the size in bytes of data being read or written. |
| * _ptr_ is the pointer to buffer in host memory where data is to be read into |
| or to be written from. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this read / write command |
| and can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| |
| If _blocking_read_ is {CL_TRUE} i.e. the read command is blocking, |
| {clEnqueueReadBuffer} does not return until the buffer data has been read |
| and copied into memory pointed to by _ptr_. |
| |
| If _blocking_read_ is {CL_FALSE} i.e. the read command is non-blocking, |
| {clEnqueueReadBuffer} queues a non-blocking read command and returns. |
| The contents of the buffer that _ptr_ points to cannot be used until the |
| read command has completed. |
| The _event_ argument returns an event object which can be used to query the |
| execution status of the read command. |
| When the read command has completed, the contents of the buffer that _ptr_ |
| points to can be used by the application. |
| |
| If _blocking_write_ is {CL_TRUE}, the write command is blocking and does not |
| return until the command is complete, including transfer of the data. |
| The memory pointed to by _ptr_ can be reused by the application after the |
| {clEnqueueWriteBuffer} call returns. |
| |
| If _blocking_write_ is {CL_FALSE}, the OpenCL implementation will use _ptr_ to |
| perform a non-blocking write. |
| As the write is non-blocking the implementation can return immediately. |
| The memory pointed to by _ptr_ cannot be reused by the application after the |
| call returns. |
| The _event_ argument returns an event object which can be used to query the |
| execution status of the write command. |
| When the write command has completed, the memory pointed to by _ptr_ can |
| then be reused by the application. |
| |
| // refError |
| |
| {clEnqueueReadBuffer} and {clEnqueueWriteBuffer} return {CL_SUCCESS} if the |
| function is executed successfully. |
| Otherwise, they return one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and |
| _buffer_ are not the same or if the context associated with |
| _command_queue_ and events in _event_wait_list_ are not the same. |
| * {CL_INVALID_MEM_OBJECT} if _buffer_ is not a valid buffer object. |
| * {CL_INVALID_VALUE} if the region being read or written specified by |
| (_offset_, _size_) is out of bounds or if _ptr_ is a `NULL` value. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _buffer_ is a sub-buffer object and |
| _offset_ specified when the sub-buffer object is created is not aligned |
| to {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for device associated with |
| _command_queue_. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the read and write |
| operations are blocking and the execution status of any of the events in |
| _event_wait_list_ is a negative integer value. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for data store associated with _buffer_. |
| * {CL_INVALID_OPERATION} if {clEnqueueReadBuffer} is called on _buffer_ |
| which has been created with {CL_MEM_HOST_WRITE_ONLY} or |
| {CL_MEM_HOST_NO_ACCESS}. |
| * {CL_INVALID_OPERATION} if {clEnqueueWriteBuffer} is called on _buffer_ |
| which has been created with {CL_MEM_HOST_READ_ONLY} or |
| {CL_MEM_HOST_NO_ACCESS}. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
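
Several of the argument checks listed above depend only on values the caller already knows, so they can be sketched as a plain host-side function.
The names `check_read_write_args` and `args_status` are hypothetical and exist only for this illustration; an implementation may test the conditions in a different order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes standing in for the OpenCL error names. */
typedef enum {
    ARGS_OK,
    ARGS_INVALID_VALUE,
    ARGS_INVALID_EVENT_WAIT_LIST
} args_status;

static args_status check_read_write_args(size_t offset, size_t size,
                                         size_t buffer_size, const void *ptr,
                                         const void *event_wait_list,
                                         unsigned num_events_in_wait_list)
{
    /* A valid buffer object never has size 0. */
    assert(buffer_size > 0);

    if (ptr == NULL)
        return ARGS_INVALID_VALUE;
    /* (offset, size) must lie inside the buffer; avoids overflow. */
    if (offset > buffer_size || size > buffer_size - offset)
        return ARGS_INVALID_VALUE;
    /* A NULL list requires a count of 0, and a non-NULL list a count > 0. */
    if ((event_wait_list == NULL) != (num_events_in_wait_list == 0))
        return ARGS_INVALID_EVENT_WAIT_LIST;
    return ARGS_OK;
}
```

The validity of the individual events in the wait list is not covered, since it cannot be checked without the OpenCL runtime.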
| -- |
| |
| [open,refpage='clEnqueueReadBufferRect',desc='Enqueue command to read from a 2D or 3D rectangular region from a buffer object to host memory.',type='protos',alias='clEnqueueWriteBufferRect'] |
| -- |
| The following functions enqueue commands to read a 2D or 3D rectangular |
| region from a buffer object to host memory or write a 2D or 3D rectangular |
| region to a buffer object from host memory. |
| |
| include::{generated}/api/protos/clEnqueueReadBufferRect.txt[] |
| include::{generated}/api/version-notes/clEnqueueReadBufferRect.asciidoc[] |
| |
| include::{generated}/api/protos/clEnqueueWriteBufferRect.txt[] |
| include::{generated}/api/version-notes/clEnqueueWriteBufferRect.asciidoc[] |
| |
| * _command_queue_ is a valid host command-queue in which the read / |
| write command will be queued. |
| _command_queue_ and _buffer_ must be created with the same OpenCL context. |
| * _buffer_ refers to a valid buffer object. |
| * _blocking_read_ and _blocking_write_ indicate if the read and write |
| operations are _blocking_ or _non-blocking_ (see below). |
| * _buffer_origin_ defines the (_x_, _y_, _z_) offset in the memory region |
| associated with _buffer_. |
| For a 2D rectangle region, the _z_ value given by _buffer_origin_[2] should |
| be 0. |
| The offset in bytes is computed as _buffer_origin_[2] {times} |
| _buffer_slice_pitch_ + _buffer_origin_[1] {times} _buffer_row_pitch_ {plus} |
| _buffer_origin_[0]. |
| * _host_origin_ defines the (_x_, _y_, _z_) offset in the memory region |
| pointed to by _ptr_. |
| For a 2D rectangle region, the _z_ value given by _host_origin_[2] should be |
| 0. |
| The offset in bytes is computed as _host_origin_[2] {times} |
| _host_slice_pitch_ + _host_origin_[1] {times} _host_row_pitch_ {plus} |
| _host_origin_[0]. |
| * _region_ defines the (_width_ in bytes, _height_ in rows, _depth_ in slices) |
| of the 2D or 3D rectangle being read or written. |
| For a 2D rectangle copy, the _depth_ value given by _region_[2] should be 1. |
| The values in region cannot be 0. |
| * _buffer_row_pitch_ is the length of each row in bytes to be used for the |
| memory region associated with _buffer_. |
| If _buffer_row_pitch_ is 0, _buffer_row_pitch_ is computed as _region_[0]. |
| * _buffer_slice_pitch_ is the length of each 2D slice in bytes to be used for |
| the memory region associated with _buffer_. |
| If _buffer_slice_pitch_ is 0, _buffer_slice_pitch_ is computed as |
| _region_[1] {times} _buffer_row_pitch_. |
| * _host_row_pitch_ is the length of each row in bytes to be used for the |
| memory region pointed to by _ptr_. |
| If _host_row_pitch_ is 0, _host_row_pitch_ is computed as _region_[0]. |
| * _host_slice_pitch_ is the length of each 2D slice in bytes to be used for |
| the memory region pointed to by _ptr_. |
| If _host_slice_pitch_ is 0, _host_slice_pitch_ is computed as _region_[1] |
| {times} _host_row_pitch_. |
| * _ptr_ is the pointer to buffer in host memory where data is to be read into |
| or to be written from. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this read / write command |
| and can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
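
The default pitch rules and the byte-offset formula given for _buffer_origin_ and _host_origin_ can be written out as follows.
This is an illustrative sketch only: `rect_pitches`, `effective_pitches` and `rect_byte_offset` are hypothetical helper names and are not part of the OpenCL API.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t row_pitch;
    size_t slice_pitch;
} rect_pitches;

/* Apply the defaults: a row pitch of 0 becomes region[0], and a slice
 * pitch of 0 becomes region[1] * row_pitch. */
static rect_pitches effective_pitches(const size_t region[3],
                                      size_t row_pitch, size_t slice_pitch)
{
    assert(region != NULL);
    rect_pitches p;
    p.row_pitch = row_pitch ? row_pitch : region[0];
    p.slice_pitch = slice_pitch ? slice_pitch : region[1] * p.row_pitch;
    return p;
}

/* Offset in bytes: origin[2] * slice_pitch + origin[1] * row_pitch + origin[0]. */
static size_t rect_byte_offset(const size_t origin[3], rect_pitches p)
{
    assert(origin != NULL);
    return origin[2] * p.slice_pitch + origin[1] * p.row_pitch + origin[0];
}
```

For a region of 16 bytes by 4 rows with a row pitch of 32 and a slice pitch of 0, the effective slice pitch is 128, and the origin (8, 2, 1) maps to byte offset 1 {times} 128 {plus} 2 {times} 32 {plus} 8 = 200.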
| |
| |
| If _blocking_read_ is {CL_TRUE} i.e. the read command is blocking, |
| {clEnqueueReadBufferRect} does not return until the buffer data has been |
| read and copied into memory pointed to by _ptr_. |
| |
| If _blocking_read_ is {CL_FALSE} i.e. the read command is non-blocking, |
| {clEnqueueReadBufferRect} queues a non-blocking read command and returns. |
| The contents of the buffer that _ptr_ points to cannot be used until the |
| read command has completed. |
| The _event_ argument returns an event object which can be used to query the |
| execution status of the read command. |
| When the read command has completed, the contents of the buffer that _ptr_ |
| points to can be used by the application. |
| |
| If _blocking_write_ is {CL_TRUE}, the write command is blocking and does not |
| return until the command is complete, including transfer of the data. |
| The memory pointed to by _ptr_ can be reused by the application after the |
| {clEnqueueWriteBufferRect} call returns. |
| |
| If _blocking_write_ is {CL_FALSE}, the OpenCL implementation will use _ptr_ to |
| perform a non-blocking write. |
| As the write is non-blocking the implementation can return immediately. |
| The memory pointed to by _ptr_ cannot be reused by the application after the |
| call returns. |
| The _event_ argument returns an event object which can be used to query the |
| execution status of the write command. |
| When the write command has completed, the memory pointed to by _ptr_ can |
| then be reused by the application. |
| |
| // refError |
| |
| {clEnqueueReadBufferRect} and {clEnqueueWriteBufferRect} return {CL_SUCCESS} |
| if the function is executed successfully. |
| Otherwise, they return one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and |
| _buffer_ are not the same or if the context associated with |
| _command_queue_ and events in _event_wait_list_ are not the same. |
| * {CL_INVALID_MEM_OBJECT} if _buffer_ is not a valid buffer object. |
| * {CL_INVALID_VALUE} if _buffer_origin_, _host_origin_, or _region_ is `NULL`. |
| * {CL_INVALID_VALUE} if the region being read or written specified by |
| (_buffer_origin_, _region_, _buffer_row_pitch_, _buffer_slice_pitch_) is |
| out of bounds. |
| * {CL_INVALID_VALUE} if any _region_ array element is 0. |
| * {CL_INVALID_VALUE} if _buffer_row_pitch_ is not 0 and is less than |
| _region_[0]. |
| * {CL_INVALID_VALUE} if _host_row_pitch_ is not 0 and is less than |
| _region_[0]. |
| * {CL_INVALID_VALUE} if _buffer_slice_pitch_ is not 0 and is less than |
| _region_[1] {times} _buffer_row_pitch_ or if _buffer_slice_pitch_ is not 0 |
| and is not a multiple of _buffer_row_pitch_. |
| * {CL_INVALID_VALUE} if _host_slice_pitch_ is not 0 and is less than |
| _region_[1] {times} _host_row_pitch_ or if _host_slice_pitch_ is not 0 and |
| is not a multiple of _host_row_pitch_. |
| * {CL_INVALID_VALUE} if _ptr_ is `NULL`. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _buffer_ is a sub-buffer object and |
| _offset_ specified when the sub-buffer object is created is not aligned |
| to {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for device associated with |
| _command_queue_. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the read and write |
| operations are blocking and the execution status of any of the events in |
| _event_wait_list_ is a negative integer value. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for data store associated with _buffer_. |
| * {CL_INVALID_OPERATION} if {clEnqueueReadBufferRect} is called on _buffer_ |
| which has been created with {CL_MEM_HOST_WRITE_ONLY} or |
| {CL_MEM_HOST_NO_ACCESS}. |
| * {CL_INVALID_OPERATION} if {clEnqueueWriteBufferRect} is called on _buffer_ |
| which has been created with {CL_MEM_HOST_READ_ONLY} or |
| {CL_MEM_HOST_NO_ACCESS}. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
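
The bounds check for a rectangular region can be sketched by computing one past the last byte the command would touch on the buffer side.
The function name `rect_args_valid` is hypothetical, the slice-pitch checks are omitted for brevity, and arithmetic overflow is not handled, so this should be read as an illustration of the rules above rather than as how an implementation validates its arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the arguments pass the checks sketched here, 0 otherwise. */
static int rect_args_valid(const size_t origin[3], const size_t region[3],
                           size_t row_pitch, size_t slice_pitch,
                           size_t buffer_size)
{
    /* A valid buffer object never has size 0. */
    assert(buffer_size > 0);

    if (origin == NULL || region == NULL)
        return 0;
    if (region[0] == 0 || region[1] == 0 || region[2] == 0)
        return 0;
    if (row_pitch != 0 && row_pitch < region[0])
        return 0;

    /* Apply the default pitches. */
    if (row_pitch == 0)
        row_pitch = region[0];
    if (slice_pitch == 0)
        slice_pitch = region[1] * row_pitch;

    /* Offset of the first byte, then one past the last byte accessed. */
    size_t first = origin[2] * slice_pitch + origin[1] * row_pitch + origin[0];
    size_t end = first + (region[2] - 1) * slice_pitch
                       + (region[1] - 1) * row_pitch + region[0];
    return end <= buffer_size;
}
```

Note that the last row contributes only _region_[0] bytes, not a full row pitch, so a tightly sized buffer can still be valid.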
| |
| [NOTE] |
| ==== |
| Calling {clEnqueueReadBuffer} to read a region of the buffer object with the |
| _ptr_ argument value set to _host_ptr_ + _offset_, where _host_ptr_ is a |
| pointer to the memory region specified when the buffer object being read is |
| created with {CL_MEM_USE_HOST_PTR}, must meet the following requirements in |
| order to avoid undefined behavior: |
| |
| * All commands that use this buffer object or a memory object (buffer or |
| image) created from this buffer object have finished execution before |
| the read command begins execution. |
| * The buffer object or memory objects created from this buffer object are |
| not mapped. |
| * The buffer object or memory objects created from this buffer object are |
| not used by any command-queue until the read command has finished |
| execution. |
| |
| Calling {clEnqueueReadBufferRect} to read a region of the buffer object with |
| the _ptr_ argument value set to _host_ptr_ and _host_origin_, |
| _buffer_origin_ values are the same, where _host_ptr_ is a pointer to the |
| memory region specified when the buffer object being read is created with |
| {CL_MEM_USE_HOST_PTR}, must meet the same requirements given above for |
| {clEnqueueReadBuffer}. |
| |
| Calling {clEnqueueWriteBuffer} to update the latest bits in a region of the |
| buffer object with the _ptr_ argument value set to _host_ptr_ + _offset_, |
| where _host_ptr_ is a pointer to the memory region specified when the buffer |
| object being written is created with {CL_MEM_USE_HOST_PTR}, must meet the |
| following requirements in order to avoid undefined behavior: |
| |
| * The host memory region given by (_host_ptr_ + _offset_, _size_) contains |
| the latest bits when the enqueued write command begins execution. |
| * The buffer object or memory objects created from this buffer object are |
| not mapped. |
| * The buffer object or memory objects created from this buffer object are |
| not used by any command-queue until the write command has finished |
| execution. |
| |
| Calling {clEnqueueWriteBufferRect} to update the latest bits in a region of |
| the buffer object with the _ptr_ argument value set to _host_ptr_ and |
| _host_origin_, _buffer_origin_ values are the same, where _host_ptr_ is a |
| pointer to the memory region specified when the buffer object being written |
| is created with {CL_MEM_USE_HOST_PTR}, must meet the following requirements in |
| order to avoid undefined behavior: |
| |
| * The host memory region given by (_buffer_origin_, _region_) contains the |
| latest bits when the enqueued write command begins execution. |
| * The buffer object or memory objects created from this buffer object are |
| not mapped. |
| * The buffer object or memory objects created from this buffer object are |
| not used by any command-queue until the write command has finished |
| execution. |
| ==== |
| -- |
| |
| [open,refpage='clEnqueueCopyBuffer',desc='Enqueues a command to copy from one buffer object to another.',type='protos'] |
| -- |
| To enqueue a command to copy a buffer object identified by _src_buffer_ to |
| another buffer object identified by _dst_buffer_, call the function |
| |
| include::{generated}/api/protos/clEnqueueCopyBuffer.txt[] |
| include::{generated}/api/version-notes/clEnqueueCopyBuffer.asciidoc[] |
| |
| * _command_queue_ refers to a host command-queue in which the copy command |
| will be queued. |
| The OpenCL context associated with _command_queue_, _src_buffer_ and |
| _dst_buffer_ must be the same. |
| * _src_offset_ refers to the offset where to begin copying data from |
| _src_buffer_. |
| * _dst_offset_ refers to the offset where to begin copying data into |
| _dst_buffer_. |
| * _size_ refers to the size in bytes to copy. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this copy command |
| and can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| |
| // refError |
| |
| {clEnqueueCopyBuffer} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_, |
| _src_buffer_ and _dst_buffer_ are not the same or if the context |
| associated with _command_queue_ and events in _event_wait_list_ are not |
| the same. |
| * {CL_INVALID_MEM_OBJECT} if _src_buffer_ and _dst_buffer_ are not valid |
| buffer objects. |
| * {CL_INVALID_VALUE} if _src_offset_, _dst_offset_, _size_, _src_offset_ |
| {plus} _size_ or _dst_offset_ + _size_ require accessing elements |
| outside the _src_buffer_ and _dst_buffer_ buffer objects respectively. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _src_buffer_ is a sub-buffer object |
| and _offset_ specified when the sub-buffer object is created is not |
| aligned to {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for device associated |
| with _command_queue_. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _dst_buffer_ is a sub-buffer object |
| and _offset_ specified when the sub-buffer object is created is not |
| aligned to {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for device associated |
| with _command_queue_. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_MEM_COPY_OVERLAP} if _src_buffer_ and _dst_buffer_ are the same buffer |
| or sub-buffer object and the source and destination regions overlap or |
| if _src_buffer_ and _dst_buffer_ are different sub-buffers of the same |
| associated buffer object and they overlap. |
| The regions overlap if _src_offset_ {leq} _dst_offset_ {leq} |
| _src_offset_ + _size_ - 1 or if _dst_offset_ {leq} _src_offset_ {leq} |
| _dst_offset_ + _size_ - 1. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for data store associated with _src_buffer_ or _dst_buffer_. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
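
The overlap condition stated for {CL_MEM_COPY_OVERLAP} translates directly into a predicate on the two offsets and the size.
The function name `copy_regions_overlap` is hypothetical, and the sketch assumes _size_ is greater than 0 so that _size_ - 1 does not wrap around.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if [src_offset, src_offset + size - 1] and
 * [dst_offset, dst_offset + size - 1] share at least one byte. */
static int copy_regions_overlap(size_t src_offset, size_t dst_offset,
                                size_t size)
{
    assert(size > 0); /* assumption of this sketch */
    return (src_offset <= dst_offset && dst_offset <= src_offset + size - 1) ||
           (dst_offset <= src_offset && src_offset <= dst_offset + size - 1);
}
```

Two 16-byte regions at offsets 0 and 16 are adjacent and do not overlap, while regions at offsets 0 and 15 share one byte and do.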
| -- |
| |
| [open,refpage='clEnqueueCopyBufferRect',desc='Enqueues a command to copy a 2D or 3D rectangular region from a buffer object to another buffer object.',type='protos'] |
| -- |
| To enqueue a command to copy a 2D or 3D rectangular region from the buffer |
| object identified by _src_buffer_ to a 2D or 3D region in the buffer object |
| identified by _dst_buffer_, call the function |
| |
| include::{generated}/api/protos/clEnqueueCopyBufferRect.txt[] |
| include::{generated}/api/version-notes/clEnqueueCopyBufferRect.asciidoc[] |
| |
| * _command_queue_ refers to the host command-queue in which the copy command |
| will be queued. |
| The OpenCL context associated with _command_queue_, _src_buffer_ and |
| _dst_buffer_ must be the same. |
| * _src_origin_ defines the (_x_, _y_, _z_) offset in the memory region |
| associated with _src_buffer_. |
| For a 2D rectangle region, the _z_ value given by _src_origin_[2] should be |
| 0. |
| The offset in bytes is computed as _src_origin_[2] {times} _src_slice_pitch_ |
| + _src_origin_[1] {times} _src_row_pitch_ + _src_origin_[0]. |
| * _dst_origin_ defines the (_x_, _y_, _z_) offset in the memory region |
| associated with _dst_buffer_. |
| For a 2D rectangle region, the _z_ value given by _dst_origin_[2] should be |
| 0. |
| The offset in bytes is computed as _dst_origin_[2] {times} _dst_slice_pitch_ |
| + _dst_origin_[1] {times} _dst_row_pitch_ + _dst_origin_[0]. |
| * _region_ defines the (_width_ in bytes, _height_ in rows, _depth_ in slices) |
| of the 2D or 3D rectangle being copied. |
| For a 2D rectangle, the _depth_ value given by _region_[2] should be 1. |
| The values in region cannot be 0. |
| * _src_row_pitch_ is the length of each row in bytes to be used for the memory |
| region associated with _src_buffer_. |
| If _src_row_pitch_ is 0, _src_row_pitch_ is computed as _region_[0]. |
| * _src_slice_pitch_ is the length of each 2D slice in bytes to be used for the |
| memory region associated with _src_buffer_. |
| If _src_slice_pitch_ is 0, _src_slice_pitch_ is computed as _region_[1] |
| {times} _src_row_pitch_. |
| * _dst_row_pitch_ is the length of each row in bytes to be used for the memory |
| region associated with _dst_buffer_. |
| If _dst_row_pitch_ is 0, _dst_row_pitch_ is computed as _region_[0]. |
| * _dst_slice_pitch_ is the length of each 2D slice in bytes to be used for the |
| memory region associated with _dst_buffer_. |
| If _dst_slice_pitch_ is 0, _dst_slice_pitch_ is computed as _region_[1] |
| {times} _dst_row_pitch_. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this copy command |
| and can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| |
| Copying begins at the source offset and destination offset which are |
| computed as described above in the description for _src_origin_ and |
| _dst_origin_. |
| Each byte of the region's width is copied from the source offset to the |
| destination offset. |
| After copying each width, the source and destination offsets are incremented |
| by their respective source and destination row pitches. |
| After copying each 2D rectangle, the source and destination offsets are |
| incremented by their respective source and destination slice pitches. |
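
The copy procedure described above can be modeled on ordinary host memory, which may help when reasoning about origins and pitches.
This is a sketch, not how an OpenCL implementation performs the copy: `copy_rect_host` is a hypothetical helper, it operates on plain byte arrays rather than buffer objects, and it assumes the source and destination regions do not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static void copy_rect_host(const unsigned char *src, unsigned char *dst,
                           const size_t src_origin[3],
                           const size_t dst_origin[3],
                           const size_t region[3],
                           size_t src_row_pitch, size_t src_slice_pitch,
                           size_t dst_row_pitch, size_t dst_slice_pitch)
{
    assert(src && dst && src_origin && dst_origin && region);

    /* Apply the default pitches. */
    if (src_row_pitch == 0)   src_row_pitch = region[0];
    if (src_slice_pitch == 0) src_slice_pitch = region[1] * src_row_pitch;
    if (dst_row_pitch == 0)   dst_row_pitch = region[0];
    if (dst_slice_pitch == 0) dst_slice_pitch = region[1] * dst_row_pitch;

    /* Starting offsets in bytes. */
    size_t src_off = src_origin[2] * src_slice_pitch
                   + src_origin[1] * src_row_pitch + src_origin[0];
    size_t dst_off = dst_origin[2] * dst_slice_pitch
                   + dst_origin[1] * dst_row_pitch + dst_origin[0];

    /* Copy region[0] bytes per row, stepping by row and slice pitches. */
    for (size_t z = 0; z < region[2]; ++z) {
        for (size_t y = 0; y < region[1]; ++y) {
            memcpy(dst + dst_off + z * dst_slice_pitch + y * dst_row_pitch,
                   src + src_off + z * src_slice_pitch + y * src_row_pitch,
                   region[0]);
        }
    }
}
```

For example, copying a 2 by 2 byte region starting at origin (1, 1, 0) of a 4 by 4 byte source with a row pitch of 4 into a tightly packed destination picks up source bytes 5, 6, 9 and 10.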
| |
| [NOTE] |
| ==== |
| If _src_buffer_ and _dst_buffer_ are the same buffer object, _src_row_pitch_ |
| must equal _dst_row_pitch_ and _src_slice_pitch_ must equal |
| _dst_slice_pitch_. |
| ==== |
| |
| // refError |
| |
| {clEnqueueCopyBufferRect} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_, |
| _src_buffer_ and _dst_buffer_ are not the same or if the context |
| associated with _command_queue_ and events in _event_wait_list_ are not |
| the same. |
| * {CL_INVALID_MEM_OBJECT} if _src_buffer_ or _dst_buffer_ is not a valid |
| buffer object. |
| * {CL_INVALID_VALUE} if _src_origin_, _dst_origin_, or _region_ is `NULL`. |
| * {CL_INVALID_VALUE} if (_src_origin_, _region_, _src_row_pitch_, |
| _src_slice_pitch_) or (_dst_origin_, _region_, _dst_row_pitch_, |
| _dst_slice_pitch_) require accessing elements outside the _src_buffer_ |
| and _dst_buffer_ buffer objects respectively. |
| * {CL_INVALID_VALUE} if any _region_ array element is 0. |
| * {CL_INVALID_VALUE} if _src_row_pitch_ is not 0 and is less than |
| _region_[0]. |
| * {CL_INVALID_VALUE} if _dst_row_pitch_ is not 0 and is less than |
| _region_[0]. |
| * {CL_INVALID_VALUE} if _src_slice_pitch_ is not 0 and is less than |
| _region_[1] {times} _src_row_pitch_ or if _src_slice_pitch_ is not 0 and |
| is not a multiple of _src_row_pitch_. |
| * {CL_INVALID_VALUE} if _dst_slice_pitch_ is not 0 and is less than |
| _region_[1] {times} _dst_row_pitch_ or if _dst_slice_pitch_ is not 0 and |
| is not a multiple of _dst_row_pitch_. |
| * {CL_INVALID_VALUE} if _src_buffer_ and _dst_buffer_ are the same buffer |
| object and _src_slice_pitch_ is not equal to _dst_slice_pitch_ or |
| _src_row_pitch_ is not equal to _dst_row_pitch_. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_MEM_COPY_OVERLAP} if _src_buffer_ and _dst_buffer_ are the same buffer |
| or sub-buffer object and the source and destination regions overlap or |
| if _src_buffer_ and _dst_buffer_ are different sub-buffers of the same |
| associated buffer object and they overlap. |
| Refer to <<check-copy-overlap, Checking for Memory Copy Overlap>> for |
| details on how to determine if source and destination regions overlap. |
| * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _src_buffer_ is a sub-buffer object |
| and _offset_ specified when the sub-buffer object is created is not |
| aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device |
| associated with _command_queue_. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _dst_buffer_ is a sub-buffer object |
| and _offset_ specified when the sub-buffer object is created is not |
| aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device |
| associated with _command_queue_. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for data store associated with _src_buffer_ or _dst_buffer_. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| |
| === Filling Buffer Objects |
| |
| NOTE: Filling buffer objects is <<unified-spec, missing before>> version 1.2. |
| |
| [open,refpage='clEnqueueFillBuffer',desc='Enqueues a command to fill a buffer object with a pattern of a given pattern size.',type='protos'] |
| -- |
| To enqueue a command to fill a buffer object with a pattern of a given |
| pattern size, call the function |
| |
| include::{generated}/api/protos/clEnqueueFillBuffer.txt[] |
| include::{generated}/api/version-notes/clEnqueueFillBuffer.asciidoc[] |
| |
| * _command_queue_ refers to the host command-queue in which the fill command |
| will be queued. |
| The OpenCL context associated with _command_queue_ and _buffer_ must be the |
| same. |
| * _buffer_ is a valid buffer object. |
| * _pattern_ is a pointer to the data pattern of size _pattern_size_ in bytes. |
| _pattern_ will be used to fill a region in _buffer_ that starts at _offset_ |
| and is _size_ bytes in size. |
| The data pattern must be a scalar or vector integer or floating-point data |
| type supported by OpenCL as described in <<scalar-data-types, Shared |
| Application Scalar Data Types>> and <<vector-data-types, Supported |
| Application Vector Data Types>>. |
| For example, if _buffer_ is to be filled with a pattern of `float4` values, |
| then _pattern_ will be a pointer to a `cl_float4` value and _pattern_size_ |
| will be `sizeof(cl_float4)`. |
| The maximum value of _pattern_size_ is the size of the largest integer or |
| floating-point vector data type supported by the OpenCL device. |
| The memory associated with _pattern_ can be reused or freed after the |
| function returns. |
| * _offset_ is the location in bytes of the region being filled in _buffer_ and |
| must be a multiple of _pattern_size_. |
| * _size_ is the size in bytes of region being filled in _buffer_ and must be a |
| multiple of _pattern_size_. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this command |
| and can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| |
| The usage information which indicates whether the memory object can be read |
| or written by a kernel and/or the host and is given by the {cl_mem_flags_TYPE} |
| argument value specified when _buffer_ is created is ignored by |
| {clEnqueueFillBuffer}. |
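
The pattern, offset and size rules for a fill can be illustrated with a small host-memory model. This is a non-normative sketch: `fill_pattern_size_ok` and `fill_model` are hypothetical names, the return values 0 and -1 stand in for success and failure rather than OpenCL error codes, and the buffer is an ordinary byte array rather than a buffer object.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if pattern_size is one of 1, 2, 4, 8, 16, 32, 64, 128. */
static int fill_pattern_size_ok(size_t pattern_size)
{
    return pattern_size != 0 && pattern_size <= 128 &&
           (pattern_size & (pattern_size - 1)) == 0;
}

/* Hypothetical model of a fill on host memory.
 * Returns 0 on success and -1 if an argument rule is violated. */
static int fill_model(unsigned char *buffer, size_t buffer_size,
                      const void *pattern, size_t pattern_size,
                      size_t offset, size_t size)
{
    if (pattern == NULL || !fill_pattern_size_ok(pattern_size))
        return -1;
    /* offset and size must both be multiples of pattern_size */
    if (offset % pattern_size != 0 || size % pattern_size != 0)
        return -1;
    /* the region [offset, offset + size) must lie inside the buffer */
    if (offset > buffer_size || size > buffer_size - offset)
        return -1;

    /* repeat the pattern back to back across the region */
    for (size_t pos = offset; pos < offset + size; pos += pattern_size)
        memcpy(buffer + pos, pattern, pattern_size);
    return 0;
}
```

Because the pattern is copied into the region element by element, the caller's pattern storage is not needed once the model returns, mirroring the statement that the memory associated with _pattern_ can be reused after the call.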
| |
| // refError |
| |
| {clEnqueueFillBuffer} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and |
| _buffer_ are not the same or if the context associated with |
| _command_queue_ and events in _event_wait_list_ are not the same. |
| * {CL_INVALID_MEM_OBJECT} if _buffer_ is not a valid buffer object. |
| * {CL_INVALID_VALUE} if _offset_ or _offset_ + _size_ requires accessing |
| elements outside the _buffer_ buffer object. |
| * {CL_INVALID_VALUE} if _pattern_ is `NULL` or if _pattern_size_ is 0 or if |
| _pattern_size_ is not one of { 1, 2, 4, 8, 16, 32, 64, 128 }. |
| * {CL_INVALID_VALUE} if _offset_ or _size_ is not a multiple of |
| _pattern_size_. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _buffer_ is a sub-buffer object and |
| the offset specified when the sub-buffer object is created is not aligned |
| to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device associated with |
| _command_queue_. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for data store associated with _buffer_. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| |
| === Mapping Buffer Objects |
| |
| [open,refpage='clEnqueueMapBuffer',desc='Enqueues a command to map a region of a buffer object into the host address space and returns a pointer to this mapped region.',type='protos'] |
| -- |
| To enqueue a command to map a region of the buffer object given by _buffer_ |
| into the host address space and return a pointer to this mapped region, |
| call the function |
| |
| include::{generated}/api/protos/clEnqueueMapBuffer.txt[] |
| include::{generated}/api/version-notes/clEnqueueMapBuffer.asciidoc[] |
| |
| * _command_queue_ must be a valid host command-queue. |
| * _blocking_map_ indicates if the map operation is _blocking_ or |
| _non-blocking_. |
| |
| If _blocking_map_ is {CL_TRUE}, {clEnqueueMapBuffer} does not return until the |
| specified region in _buffer_ is mapped into the host address space and the |
| application can access the contents of the mapped region using the pointer |
| returned by {clEnqueueMapBuffer}. |
| |
| If _blocking_map_ is {CL_FALSE}, i.e. the map operation is non-blocking, the |
| pointer to the mapped region returned by {clEnqueueMapBuffer} cannot be used |
| until the map command has completed. |
| The _event_ argument returns an event object which can be used to query the |
| execution status of the map command. |
| When the map command is completed, the application can access the contents |
| of the mapped region using the pointer returned by {clEnqueueMapBuffer}. |
| |
| * _map_flags_ is a bit-field and is described in the |
| <<memory-map-flags-table,Memory Map Flags>> table. |
| * _buffer_ is a valid buffer object. |
| The OpenCL context associated with _command_queue_ and _buffer_ must be the |
| same. |
| * _offset_ and _size_ are the offset in bytes and the size of the region in |
| the buffer object that is being mapped. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this command |
| and can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| * _errcode_ret_ will return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
| |
| // refError |
| |
| {clEnqueueMapBuffer} returns a pointer to the mapped region and sets |
| _errcode_ret_ to {CL_SUCCESS} if the function is executed successfully. |
| |
| Otherwise, it returns a `NULL` pointer with one of the following error |
| values returned in _errcode_ret_: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and |
| _buffer_ are not the same or if the context associated with |
| _command_queue_ and events in _event_wait_list_ are not the same. |
| * {CL_INVALID_MEM_OBJECT} if _buffer_ is not a valid buffer object. |
| * {CL_INVALID_VALUE} if region being mapped given by (_offset_, _size_) is |
| out of bounds or if _size_ is 0 or if values specified in _map_flags_ |
| are not valid. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _buffer_ is a sub-buffer object and |
| _offset_ specified when the sub-buffer object is created is not aligned |
| to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device associated with |
| _command_queue_. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_MAP_FAILURE} if there is a failure to map the requested region into |
| the host address space. |
| This error cannot occur for buffer objects created with |
| {CL_MEM_USE_HOST_PTR} or {CL_MEM_ALLOC_HOST_PTR}. |
| * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the map operation is |
| blocking and the execution status of any of the events in |
| _event_wait_list_ is a negative integer value. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for data store associated with _buffer_. |
| * {CL_INVALID_OPERATION} if _buffer_ has been created with |
| {CL_MEM_HOST_WRITE_ONLY} or {CL_MEM_HOST_NO_ACCESS} and {CL_MAP_READ} is set |
| in _map_flags_ or if _buffer_ has been created with |
| {CL_MEM_HOST_READ_ONLY} or {CL_MEM_HOST_NO_ACCESS} and {CL_MAP_WRITE} or |
| {CL_MAP_WRITE_INVALIDATE_REGION} is set in _map_flags_. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| * {CL_INVALID_OPERATION} if mapping would lead to overlapping regions being |
| mapped for writing. |
| |
| The pointer returned maps a region starting at _offset_ and is at least |
| _size_ bytes in size. |
| The result of a memory access outside this region is undefined. |
| |
| If the buffer object is created with {CL_MEM_USE_HOST_PTR} set in _mem_flags_, |
| the following will be true: |
| |
| * The _host_ptr_ specified in {clCreateBuffer} or {clCreateBufferWithProperties} |
| will contain the latest bits in the region being mapped when the |
| {clEnqueueMapBuffer} command has completed. |
| * The pointer value returned by {clEnqueueMapBuffer} will be derived from |
| the _host_ptr_ specified when the buffer object is created. |
| |
| Mapped buffer objects are unmapped using {clEnqueueUnmapMemObject}. |
| This is described in <<unmapping-mapped-memory, Unmapping Mapped Memory |
| Objects>>. |
| |
| [[memory-map-flags-table]] |
| .List of supported map flag values |
| [width="100%",cols="<50%,<50%",options="header"] |
| |==== |
| | Map Flags | Description |
| | {CL_MAP_READ_anchor} |
| |
| include::{generated}/api/version-notes/CL_MAP_READ.asciidoc[] |
| | This flag specifies that the region being mapped in the memory object is |
| being mapped for reading. |
| |
| The pointer returned by {clEnqueueMapBuffer} ({clEnqueueMapImage}) is |
| guaranteed to contain the latest bits in the region being mapped when |
| the {clEnqueueMapBuffer} ({clEnqueueMapImage}) command has completed. |
| | {CL_MAP_WRITE_anchor} |
| |
| include::{generated}/api/version-notes/CL_MAP_WRITE.asciidoc[] |
| | This flag specifies that the region being mapped in the memory object is |
| being mapped for writing. |
| |
| The pointer returned by {clEnqueueMapBuffer} ({clEnqueueMapImage}) is |
| guaranteed to contain the latest bits in the region being mapped when |
| the {clEnqueueMapBuffer} ({clEnqueueMapImage}) command has completed. |
| | {CL_MAP_WRITE_INVALIDATE_REGION_anchor} |
| |
| include::{generated}/api/version-notes/CL_MAP_WRITE_INVALIDATE_REGION.asciidoc[] |
| | This flag specifies that the region being mapped in the memory object is |
| being mapped for writing. |
| |
| The contents of the region being mapped are to be discarded. |
| This is typically the case when the region being mapped is overwritten |
| by the host. |
| This flag allows the implementation to no longer guarantee that the |
| pointer returned by {clEnqueueMapBuffer} ({clEnqueueMapImage}) contains |
| the latest bits in the region being mapped which can be a significant |
| performance enhancement. |
| |
| {CL_MAP_READ} or {CL_MAP_WRITE} and {CL_MAP_WRITE_INVALIDATE_REGION} are |
| mutually exclusive. |
| |==== |
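
The interaction between map flags and the host-access flags of a buffer can be sketched as a standalone check. This is a non-normative illustration: the `SK_` names are hypothetical stand-ins whose bit values are assumed to mirror the corresponding `CL_MAP_*` and `CL_MEM_HOST_*` constants in the OpenCL headers, and reporting a violation of the mutual-exclusion rule as an invalid value is an assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical stand-ins; bit values assumed to mirror CL/cl.h. */
enum {
    SK_MAP_READ                    = 1 << 0,
    SK_MAP_WRITE                   = 1 << 1,
    SK_MAP_WRITE_INVALIDATE_REGION = 1 << 2,
    SK_MEM_HOST_WRITE_ONLY         = 1 << 7,
    SK_MEM_HOST_READ_ONLY          = 1 << 8,
    SK_MEM_HOST_NO_ACCESS          = 1 << 9
};

typedef enum { SK_OK, SK_INVALID_VALUE, SK_INVALID_OPERATION } sk_status;

/* Checks map_flags against the flags the buffer was created with. */
static sk_status check_map_flags(uint64_t mem_flags, uint64_t map_flags)
{
    const uint64_t known =
        SK_MAP_READ | SK_MAP_WRITE | SK_MAP_WRITE_INVALIDATE_REGION;
    const uint64_t writes = SK_MAP_WRITE | SK_MAP_WRITE_INVALIDATE_REGION;

    if (map_flags & ~known)
        return SK_INVALID_VALUE;            /* unknown bits */

    /* READ or WRITE may not be combined with WRITE_INVALIDATE_REGION
     * (assumed here to be reported as an invalid value). */
    if ((map_flags & SK_MAP_WRITE_INVALIDATE_REGION) &&
        (map_flags & (SK_MAP_READ | SK_MAP_WRITE)))
        return SK_INVALID_VALUE;

    /* mapping for reading needs host read access */
    if ((map_flags & SK_MAP_READ) &&
        (mem_flags & (SK_MEM_HOST_WRITE_ONLY | SK_MEM_HOST_NO_ACCESS)))
        return SK_INVALID_OPERATION;

    /* mapping for writing needs host write access */
    if ((map_flags & writes) &&
        (mem_flags & (SK_MEM_HOST_READ_ONLY | SK_MEM_HOST_NO_ACCESS)))
        return SK_INVALID_OPERATION;

    return SK_OK;
}
```

An application could run a check of this kind before enqueueing a map command to catch flag mistakes early; the implementation remains the authority on which combinations it accepts.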
| -- |
| |
| |
| == Image Objects |
| |
| An _image_ object is used to store a one-, two- or three-dimensional |
| texture, frame-buffer or image. |
| The elements of an image object are selected from a list of predefined image |
| formats. |
| The minimum number of elements in a memory object is one. |
| |
| |
| === Creating Image Objects |
| |
| [open,refpage='clCreateImage',desc='Creates a 1D image, 1D image buffer, 1D image array, 2D image, 2D image array or 3D image object.',type='protos',alias='clCreateImageWithProperties'] |
| -- |
| An *image object* may be created using the function |
| |
| include::{generated}/api/protos/clCreateImage.txt[] |
| include::{generated}/api/version-notes/clCreateImage.asciidoc[] |
| |
| An *image object* may also be created with additional properties using the function |
| |
| include::{generated}/api/protos/clCreateImageWithProperties.txt[] |
| include::{generated}/api/version-notes/clCreateImageWithProperties.asciidoc[] |
| |
| * _context_ is a valid OpenCL context used to create the image object. |
| * _properties_ is an optional list of properties for the image object and their corresponding values. |
| The list is terminated with the special property `0`. |
| If no properties are required, _properties_ may be `NULL`. |
| OpenCL 3.0 does not define any optional properties for images. |
| * _flags_ is a bit-field that is used to specify allocation and usage |
| information about the image memory object being created and is described in |
| the <<memory-flags-table, supported memory flag values>> table. |
| * _image_format_ is a pointer to a structure that describes format properties |
| of the image to be allocated. |
| A 1D image buffer or 2D image can be created from a buffer by specifying a |
| buffer object in the __image_desc__->__mem_object__. |
| A 2D image can be created from another 2D image object by specifying an |
| image object in the __image_desc__->__mem_object__. |
| Refer to the <<image-format-descriptor, Image Format Descriptor>> section |
| for a detailed description of the image format descriptor. |
| * _image_desc_ is a pointer to a structure that describes type and dimensions |
| of the image to be allocated. |
| Refer to the <<image-descriptor, Image Descriptor>> section for a detailed |
| description of the image descriptor. |
| * _host_ptr_ is a pointer to the image data that may already be allocated by |
| the application. |
| Refer to the <<host-ptr-buffer-size-table, table below>> for a description |
| of how large the buffer that _host_ptr_ points to must be. |
| * _errcode_ret_ will return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
| |
| The alignment requirements for data stored in image objects are described |
| in <<alignment-app-data-types>>. |
| |
| For all image types except {CL_MEM_OBJECT_IMAGE1D_BUFFER}, if the value |
| specified for _flags_ is 0, the default is used which is {CL_MEM_READ_WRITE}. |
| |
| For {CL_MEM_OBJECT_IMAGE1D_BUFFER} image type, or an image created from |
| another memory object (image or buffer), if the {CL_MEM_READ_WRITE}, |
| {CL_MEM_READ_ONLY} or {CL_MEM_WRITE_ONLY} values are not specified in _flags_, |
| they are inherited from the corresponding memory access qualifiers associated |
| with __mem_object__. |
| The {CL_MEM_USE_HOST_PTR}, {CL_MEM_ALLOC_HOST_PTR} and {CL_MEM_COPY_HOST_PTR} |
| values cannot be specified in _flags_ but are inherited from the |
| corresponding memory access qualifiers associated with __mem_object__. |
| If {CL_MEM_COPY_HOST_PTR} is specified in the memory access qualifier values |
| associated with __mem_object__ it does not imply any additional copies when |
| the image is created from __mem_object__. |
| If the {CL_MEM_HOST_WRITE_ONLY}, {CL_MEM_HOST_READ_ONLY} or |
| {CL_MEM_HOST_NO_ACCESS} values are not specified in _flags_, they are |
| inherited from the corresponding memory access qualifiers associated with |
| __mem_object__. |
| |
| For a 3D image or 2D image array, the image data specified by _host_ptr_ is |
| stored as a linear sequence of adjacent 2D image slices or 2D images |
| respectively. |
| Each 2D image is a linear sequence of adjacent scanlines. |
| Each scanline is a linear sequence of image elements. |
| |
| For a 2D image, the image data specified by _host_ptr_ is stored as a linear |
| sequence of adjacent scanlines. |
| Each scanline is a linear sequence of image elements. |
| |
| For a 1D image array, the image data specified by _host_ptr_ is stored as a |
| linear sequence of adjacent 1D images. |
| Each 1D image is stored as a single scanline which is a linear sequence of |
| adjacent elements. |
| |
| For 1D image or 1D image buffer, the image data specified by _host_ptr_ is |
| stored as a single scanline which is a linear sequence of adjacent elements. |
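
The host layouts above all reduce to the same addressing rule. As a non-normative sketch, the hypothetical helper `image_element_offset` below computes the byte offset of an element inside the memory that _host_ptr_ points to; it assumes the row pitch, slice pitch and element size are already known in bytes.

```c
#include <stddef.h>

/* Byte offset of element (x, y, z) in host image data.
 *   x: element index within a scanline
 *   y: scanline index within a 2D image or slice (0 for 1D images)
 *   z: slice index of a 3D image, or image index of a 1D or 2D image
 *      array (0 for non-array 1D and 2D images)
 * row_pitch and slice_pitch are in bytes. */
static size_t image_element_offset(size_t x, size_t y, size_t z,
                                   size_t element_size,
                                   size_t row_pitch, size_t slice_pitch)
{
    return z * slice_pitch + y * row_pitch + x * element_size;
}
```

For a 1D image array the second coordinate is unused, so each array entry is reached through the slice pitch alone.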
| |
| Image elements are stored according to their image format as described in the |
| <<image-format-descriptor, Image Format Descriptor>> section. |
| |
| // refError |
| |
| {clCreateImage} and {clCreateImageWithProperties} return a valid non-zero |
| image object and _errcode_ret_ is set to {CL_SUCCESS} if the image object |
| is created successfully. |
| Otherwise, they return a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_PROPERTY} if a property name in _properties_ is not a |
| supported property name, if the value specified for a supported property |
| name is not valid, or if the same property name is specified more than |
| once. |
| * {CL_INVALID_VALUE} if values specified in _flags_ are not valid. |
| * {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR} if values specified in _image_format_ |
| are not valid or if _image_format_ is `NULL`. |
| * {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR} if a 2D image is created from a |
| buffer and the row pitch and base address alignment do not follow the |
| rules described for creating a 2D image from a buffer. |
| * {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR} if a 2D image is created from a 2D |
| image object and the rules described above are not followed. |
| * {CL_INVALID_IMAGE_DESCRIPTOR} if values specified in _image_desc_ are not |
| valid or if _image_desc_ is `NULL`. |
| * {CL_INVALID_IMAGE_SIZE} if image dimensions specified in _image_desc_ |
| exceed the maximum image dimensions described in the |
| <<device-queries-table,Device Queries>> table for all devices |
| in _context_. |
| * {CL_INVALID_HOST_PTR} if _host_ptr_ is `NULL` and {CL_MEM_USE_HOST_PTR} or |
| {CL_MEM_COPY_HOST_PTR} are set in _flags_ or if _host_ptr_ is not `NULL` |
| but {CL_MEM_COPY_HOST_PTR} or {CL_MEM_USE_HOST_PTR} are not set in _flags_. |
| * {CL_INVALID_VALUE} if an image is being created from another memory object |
| (buffer or image) under one of the following circumstances: 1) |
| __mem_object__ was created with {CL_MEM_WRITE_ONLY} and _flags_ specifies |
| {CL_MEM_READ_WRITE} or {CL_MEM_READ_ONLY}, 2) __mem_object__ was created with |
| {CL_MEM_READ_ONLY} and _flags_ specifies {CL_MEM_READ_WRITE} or |
| {CL_MEM_WRITE_ONLY}, 3) _flags_ specifies {CL_MEM_USE_HOST_PTR} or |
| {CL_MEM_ALLOC_HOST_PTR} or {CL_MEM_COPY_HOST_PTR}. |
| * {CL_INVALID_VALUE} if an image is being created from another memory object |
| (buffer or image) and __mem_object__ was created with |
| {CL_MEM_HOST_WRITE_ONLY} and _flags_ specifies {CL_MEM_HOST_READ_ONLY}, or |
| if __mem_object__ was created with {CL_MEM_HOST_READ_ONLY} and _flags_ |
| specifies {CL_MEM_HOST_WRITE_ONLY}, or if __mem_object__ was created with |
| {CL_MEM_HOST_NO_ACCESS} and _flags_ specifies {CL_MEM_HOST_READ_ONLY} or |
| {CL_MEM_HOST_WRITE_ONLY}. |
| * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if there are no devices in _context_ that |
| support _image_format_. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for image object. |
| * {CL_INVALID_OPERATION} if there are no devices in _context_ that support |
| images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the |
| <<device-queries-table,Device Queries>> table is {CL_FALSE}). |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
| [[host-ptr-buffer-size-table]] |
| .Required _host_ptr_ buffer sizes for images |
| [width="100%",cols="<50%,<50%",options="header"] |
| |==== |
| | Image Type | Size of buffer that _host_ptr_ points to |
| | {CL_MEM_OBJECT_IMAGE1D_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE1D.asciidoc[] |
| | {geq} image_row_pitch |
| | {CL_MEM_OBJECT_IMAGE1D_BUFFER_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE1D_BUFFER.asciidoc[] |
| | {geq} image_row_pitch |
| | {CL_MEM_OBJECT_IMAGE2D_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE2D.asciidoc[] |
| | {geq} image_row_pitch {times} image_height |
| | {CL_MEM_OBJECT_IMAGE3D_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE3D.asciidoc[] |
| | {geq} image_slice_pitch {times} image_depth |
| | {CL_MEM_OBJECT_IMAGE1D_ARRAY_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE1D_ARRAY.asciidoc[] |
| | {geq} image_slice_pitch {times} image_array_size |
| | {CL_MEM_OBJECT_IMAGE2D_ARRAY_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_OBJECT_IMAGE2D_ARRAY.asciidoc[] |
| | {geq} image_slice_pitch {times} image_array_size |
| |==== |
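
The table above can be restated as a small function. This is a non-normative sketch: `sk_image_type`, `sk_image_layout` and `min_host_ptr_size` are hypothetical names that stand in for the image type and the relevant descriptor fields, and the pitches are assumed to be already resolved to non-zero byte counts.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the image types in the table. */
typedef enum {
    SK_IMAGE1D,
    SK_IMAGE1D_BUFFER,
    SK_IMAGE2D,
    SK_IMAGE3D,
    SK_IMAGE1D_ARRAY,
    SK_IMAGE2D_ARRAY
} sk_image_type;

/* Only the fields needed for the size computation. */
typedef struct {
    sk_image_type type;
    size_t height;       /* image_height */
    size_t depth;        /* image_depth */
    size_t array_size;   /* image_array_size */
    size_t row_pitch;    /* image_row_pitch, in bytes */
    size_t slice_pitch;  /* image_slice_pitch, in bytes */
} sk_image_layout;

/* Smallest host buffer size, in bytes, that satisfies the table. */
static size_t min_host_ptr_size(const sk_image_layout *img)
{
    switch (img->type) {
    case SK_IMAGE1D:
    case SK_IMAGE1D_BUFFER:
        return img->row_pitch;
    case SK_IMAGE2D:
        return img->row_pitch * img->height;
    case SK_IMAGE3D:
        return img->slice_pitch * img->depth;
    case SK_IMAGE1D_ARRAY:
    case SK_IMAGE2D_ARRAY:
        return img->slice_pitch * img->array_size;
    }
    return 0; /* unknown type */
}
```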
| |
| -- |
| |
| [open,refpage='clCreateImage2D',desc='Creates a 2D image.',type='protos'] |
| -- |
| A *2D image* object can be created using the following function |
| |
| include::{generated}/api/protos/clCreateImage2D.txt[] |
| include::{generated}/api/version-notes/clCreateImage2D.asciidoc[] |
| |
| * _context_ is a valid OpenCL context on which the image object is to be |
| created. |
| * _flags_ is a bit-field that is used to specify allocation and usage |
| information about the image memory object being created and is described in |
| the <<memory-flags-table, supported memory flag values>> table. |
| If the value specified for _flags_ is 0, the default is used which is |
| {CL_MEM_READ_WRITE}. |
| * _image_format_ is a pointer to a structure that describes format properties |
| of the image to be allocated. |
| Refer to the <<image-format-descriptor, Image Format Descriptor>> section |
| for a detailed description of the image format descriptor. |
| * _image_width_ and _image_height_ are the width and height of the image in |
| pixels. |
| These must be values greater than or equal to 1. |
| * _image_row_pitch_ is the scan-line pitch in bytes. |
| This must be 0 if _host_ptr_ is `NULL` and can be either 0 or {geq} |
| _image_width_ {times} size of element in bytes if _host_ptr_ is not `NULL`. |
| If _host_ptr_ is not `NULL` and _image_row_pitch_ is 0, _image_row_pitch_ |
| is calculated as _image_width_ {times} size of element in bytes. |
| If _image_row_pitch_ is not 0, it must be a multiple of the image element |
| size in bytes. |
| * _host_ptr_ is a pointer to the image data that may already be allocated by |
| the application. |
| Refer to the {CL_MEM_OBJECT_IMAGE2D} entry in the |
| <<host-ptr-buffer-size-table, required _host_ptr_ buffer size table>> for a |
| description of how large the buffer that _host_ptr_ points to must be. |
| The image data specified by _host_ptr_ is stored as a linear sequence of |
| adjacent scanlines. |
| Each scanline is a linear sequence of image elements. |
| Image elements are stored according to their image format as described in |
| the <<image-format-descriptor, Image Format Descriptor>> section. |
| * _errcode_ret_ will return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
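
The rules for _image_row_pitch_ can be expressed as a short validation routine. This is a non-normative sketch: `resolve_row_pitch` is a hypothetical helper, the return values 0 and -1 are placeholders rather than OpenCL error codes, and the element size is assumed to be known in bytes and non-zero.

```c
#include <stddef.h>

/* Validates image_row_pitch and computes the pitch of the host data.
 * Returns 0 on success and -1 if a rule is violated.
 * On success *resolved is the scanline pitch of the host data in bytes,
 * or 0 when there is no host data. */
static int resolve_row_pitch(size_t image_width, size_t element_size,
                             int has_host_ptr, size_t image_row_pitch,
                             size_t *resolved)
{
    size_t min_pitch = image_width * element_size;

    if (!has_host_ptr) {
        /* the pitch must be 0 when host_ptr is NULL */
        if (image_row_pitch != 0)
            return -1;
        *resolved = 0;
        return 0;
    }
    if (image_row_pitch == 0) {
        /* a zero pitch means tightly packed scanlines */
        *resolved = min_pitch;
        return 0;
    }
    /* a non-zero pitch must cover a scanline and be a multiple of the
     * element size */
    if (image_row_pitch < min_pitch || image_row_pitch % element_size != 0)
        return -1;
    *resolved = image_row_pitch;
    return 0;
}
```

For example, a 100-pixel-wide image with 4-byte elements resolves a zero pitch to 400 bytes, accepts 512, and rejects both 396 (too small) and 402 (not a multiple of the element size).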
| |
| // refError |
| |
| {clCreateImage2D} returns a valid non-zero image object and sets |
| _errcode_ret_ to {CL_SUCCESS} if the image object is created |
| successfully. |
| Otherwise, it returns a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_VALUE} if values specified in _flags_ are not valid. |
| * {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR} if values specified in _image_format_ |
| are not valid or if _image_format_ is `NULL`. |
| * {CL_INVALID_IMAGE_SIZE} if _image_width_ or _image_height_ are 0 or if they |
| exceed the maximum values specified in {CL_DEVICE_IMAGE2D_MAX_WIDTH} or |
| {CL_DEVICE_IMAGE2D_MAX_HEIGHT} respectively for all devices in _context_ or |
| if values specified by _image_row_pitch_ do not follow rules described in the |
| argument description above. |
| * {CL_INVALID_HOST_PTR} if _host_ptr_ is `NULL` and {CL_MEM_USE_HOST_PTR} or |
| {CL_MEM_COPY_HOST_PTR} are set in _flags_ or if _host_ptr_ is not `NULL` |
| but {CL_MEM_COPY_HOST_PTR} or {CL_MEM_USE_HOST_PTR} are not set in _flags_. |
| * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if there are no devices in _context_ that |
| support _image_format_. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for image object. |
| * {CL_INVALID_OPERATION} if there are no devices in _context_ that support |
| images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the |
| <<device-queries-table,Device Queries>> table is {CL_FALSE}). |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clCreateImage3D',desc='Creates a 3D image.',type='protos'] |
| -- |
| A *3D image* object can be created using the following function |
| |
| include::{generated}/api/protos/clCreateImage3D.txt[] |
| include::{generated}/api/version-notes/clCreateImage3D.asciidoc[] |
| |
| * _context_ is a valid OpenCL context on which the image object is to be |
| created. |
| * _flags_ is a bit-field that is used to specify allocation and usage |
| information about the image memory object being created and is described in |
| the <<memory-flags-table, supported memory flag values>> table. |
| If the value specified for _flags_ is 0, the default is used which is |
| {CL_MEM_READ_WRITE}. |
| * _image_format_ is a pointer to a structure that describes format properties |
| of the image to be allocated. |
| Refer to the <<image-format-descriptor, Image Format Descriptor>> section |
| for a detailed description of the image format descriptor. |
| * _image_width_ and _image_height_ are the width and height of the image in |
| pixels. |
| These must be values greater than or equal to 1. |
| * _image_depth_ is the depth of the image in pixels. For {clCreateImage3D}, |
| this must be a value {gt} 1. |
| * _image_row_pitch_ is the scan-line pitch in bytes. |
| This must be 0 if _host_ptr_ is `NULL` and can be either 0 or {geq} |
| _image_width_ {times} size of element in bytes if _host_ptr_ is not `NULL`. |
| If _host_ptr_ is not `NULL` and _image_row_pitch_ is 0, _image_row_pitch_ |
| is calculated as _image_width_ {times} size of element in bytes. |
| If _image_row_pitch_ is not 0, it must be a multiple of the image element |
| size in bytes. |
| * _image_slice_pitch_ is the size in bytes of each 2D slice in the 3D image. |
| This must be 0 if _host_ptr_ is `NULL` and can be either 0 or {geq} |
| _image_row_pitch_ {times} _image_height_ if _host_ptr_ is not `NULL`. |
| If _host_ptr_ is not `NULL` and _image_slice_pitch_ is 0, |
| _image_slice_pitch_ is calculated as _image_row_pitch_ {times} |
| _image_height_. |
| If _image_slice_pitch_ is not 0, it must be a multiple of the |
| _image_row_pitch_. |
| * _host_ptr_ is a pointer to the image data that may already be allocated by |
| the application. |
| Refer to the {CL_MEM_OBJECT_IMAGE3D} entry in the |
| <<host-ptr-buffer-size-table, required _host_ptr_ buffer size table>> for a |
| description of how large the buffer that _host_ptr_ points to must be. |
| The image data specified by _host_ptr_ is stored as a linear sequence of |
| adjacent 2D slices. |
| Each scanline is a linear sequence of image elements. |
| Image elements are stored according to their image format as described in |
| the <<image-format-descriptor, Image Format Descriptor>> section. |
| * _errcode_ret_ will return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
| |
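The pitch rules above can be illustrated with a small host-side helper.
The following is a non-normative sketch: `resolve_pitches_3d` and its parameters are hypothetical names introduced for illustration and are not part of the OpenCL API, and the element size in bytes is assumed to be known to the caller.
When _host_ptr_ is `NULL` the helper still reports tightly packed pitches for convenience, although the API itself only requires that both pitch arguments be 0 in that case.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL API): validates image_row_pitch and
 * image_slice_pitch for a 3D image and reports the effective values. */
bool resolve_pitches_3d(size_t width, size_t height, size_t elem_size,
                        bool has_host_ptr,
                        size_t row_pitch, size_t slice_pitch,
                        size_t *out_row_pitch, size_t *out_slice_pitch)
{
    if (width == 0 || height == 0 || elem_size == 0)
        return false;

    /* Both pitches must be 0 when host_ptr is NULL. */
    if (!has_host_ptr && (row_pitch != 0 || slice_pitch != 0))
        return false;

    size_t min_row_pitch = width * elem_size;
    if (row_pitch == 0)
        row_pitch = min_row_pitch;          /* default: tightly packed rows */
    else if (row_pitch < min_row_pitch || row_pitch % elem_size != 0)
        return false;

    size_t min_slice_pitch = row_pitch * height;
    if (slice_pitch == 0)
        slice_pitch = min_slice_pitch;      /* default: tightly packed slices */
    else if (slice_pitch < min_slice_pitch || slice_pitch % row_pitch != 0)
        return false;

    *out_row_pitch = row_pitch;
    *out_slice_pitch = slice_pitch;
    return true;
}
```
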
| // refError |
| |
| {clCreateImage3D} returns a valid non-zero image object created and the |
| _errcode_ret_ is set to {CL_SUCCESS} if the image object is created |
| successfully. |
| Otherwise, it returns a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_VALUE} if values specified in _flags_ are not valid. |
| * {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR} if values specified in _image_format_ |
| are not valid or if _image_format_ is `NULL`. |
| * {CL_INVALID_IMAGE_SIZE} if _image_width_ or _image_height_ are 0 or if |
| _image_depth_ {leq} 1, or if they exceed the maximum values specified in |
| {CL_DEVICE_IMAGE3D_MAX_WIDTH}, {CL_DEVICE_IMAGE3D_MAX_HEIGHT} or |
| {CL_DEVICE_IMAGE3D_MAX_DEPTH} respectively for all devices in _context_, or |
| if values specified by _image_row_pitch_ and _image_slice_pitch_ do not |
| follow rules described in the argument description above. |
| * {CL_INVALID_HOST_PTR} if _host_ptr_ is `NULL` and {CL_MEM_USE_HOST_PTR} or |
| {CL_MEM_COPY_HOST_PTR} are set in _flags_ or if _host_ptr_ is not `NULL` |
| but {CL_MEM_COPY_HOST_PTR} or {CL_MEM_USE_HOST_PTR} are not set in _flags_. |
| * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if there are no devices in _context_ that |
| support _image_format_. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for image object. |
| * {CL_INVALID_OPERATION} if there are no devices in _context_ that support |
| images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the |
| <<device-queries-table,Device Queries>> table is {CL_FALSE}). |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| |
| [[image-format-descriptor]] |
| ==== Image Format Descriptor |
| |
| [open,refpage='cl_image_format',desc='The image format descriptor structure is defined as:',type='structs',xrefs='clCreateImage'] |
| -- |
| The {cl_image_format_TYPE} image format descriptor structure describes an image |
| format, and is defined as: |
| |
| include::{generated}/api/structs/cl_image_format.txt[] |
| |
| * `image_channel_order` specifies the number of channels and the channel |
| layout i.e. the memory layout in which channels are stored in the image. |
| Valid values are described in the <<image-channel-order-table,Image Channel |
| Order>> table. |
| * `image_channel_data_type` describes the size of the channel data type. |
| The list of supported values is described in the |
| <<image-channel-data-types-table,Image Channel Data Types>> table. |
| The number of bits per element determined by the `image_channel_data_type` |
| and `image_channel_order` must be a power of two. |
| |
| [[image-channel-order-table]] |
| .List of supported Image Channel Order Values |
| [width="100%",cols="<50%,<50%",options="header"] |
| |==== |
| | Image Channel Order | Description |
| | {CL_R_anchor}, {CL_A_anchor}, |
| // The CL_R annotation here is used to convey the same information for both |
| // entries in this table row. |
| include::{generated}/api/version-notes/CL_R.asciidoc[] |
| | Single channel image formats where the single channel represents a `RED` or `ALPHA` component. |
| | {CL_DEPTH_anchor} |
| |
| include::{generated}/api/version-notes/CL_DEPTH.asciidoc[] |
| | A single channel image format where the single channel represents a `DEPTH` component. |
| | {CL_LUMINANCE_anchor} |
| |
| include::{generated}/api/version-notes/CL_LUMINANCE.asciidoc[] |
| | A single channel image format where the single channel represents a `LUMINANCE` value. |
| The `LUMINANCE` value is replicated into the `RED`, `GREEN`, and `BLUE` components. |
| {CL_INTENSITY_anchor} |
| |
| include::{generated}/api/version-notes/CL_INTENSITY.asciidoc[] |
| | A single channel image format where the single channel represents an `INTENSITY` value. |
| The `INTENSITY` value is replicated into the `RED`, `GREEN`, `BLUE`, and `ALPHA` components. |
| | {CL_RG_anchor}, {CL_RA_anchor} |
| |
| // The CL_RG annotation here is used to convey the same information for both |
| // entries in this table row. |
| include::{generated}/api/version-notes/CL_RG.asciidoc[] |
| | Two channel image formats. |
| The first channel always represents a `RED` component. |
| The second channel represents a `GREEN` component or an `ALPHA` component. |
| | {CL_Rx_anchor} |
| |
| include::{generated}/api/version-notes/CL_Rx.asciidoc[] |
| | A two channel image format, where the first channel represents a `RED` component and the second channel is ignored. |
| | {CL_RGB_anchor} |
| |
| include::{generated}/api/version-notes/CL_RGB.asciidoc[] |
| | A three channel image format, where the three channels represent `RED`, `GREEN`, and `BLUE` components. |
| | {CL_RGx_anchor} |
| |
| include::{generated}/api/version-notes/CL_RGx.asciidoc[] |
| | A three channel image format, where the first two channels represent `RED` and `GREEN` components and the third channel is ignored. |
| | {CL_RGBA_anchor}, {CL_ARGB_anchor}, {CL_BGRA_anchor}, {CL_ABGR_anchor} |
| |
| // Manually write this annotation as CL_ABGR is the odd-one-out here (all the |
| // other entries in this row were in OpenCL 1.0). |
| {CL_ABGR} is <<unified-spec, missing before>> version 2.0. |
| | Four channel image formats, where the four channels represent `RED`, `GREEN`, `BLUE`, and `ALPHA` components. |
| | {CL_RGBx_anchor} |
| |
| include::{generated}/api/version-notes/CL_RGBx.asciidoc[] |
| | A four channel image format, where the first three channels represent `RED`, `GREEN`, and `BLUE` components and the fourth channel is ignored. |
| | {CL_sRGB_anchor} |
| |
| include::{generated}/api/version-notes/CL_sRGB.asciidoc[] |
| | A three channel image format, where the three channels represent `RED`, `GREEN`, and `BLUE` components in the sRGB color space. |
| | {CL_sRGBA_anchor}, {CL_sBGRA_anchor} |
| |
| // The CL_sRGBA annotation here is used to convey the same information for both |
| // entries in this table row. |
| include::{generated}/api/version-notes/CL_sRGBA.asciidoc[] |
| | Four channel image formats, where the first three channels represent `RED`, `GREEN`, and `BLUE` components in the sRGB color space. |
| The fourth channel represents an `ALPHA` component. |
| | {CL_sRGBx_anchor} |
| |
| include::{generated}/api/version-notes/CL_sRGBx.asciidoc[] |
| A four channel image format, where the first three channels represent `RED`, `GREEN`, and `BLUE` components in the sRGB color space. |
| The fourth channel is ignored. |
| |==== |
| |
| [[image-channel-data-types-table]] |
| .List of supported Image Channel Data Types |
| [width="100%",cols="<50%,<50%",options="header"] |
| |==== |
| | Image Channel Data Type | Description |
| | {CL_SNORM_INT8_anchor} |
| |
| include::{generated}/api/version-notes/CL_SNORM_INT8.asciidoc[] |
| | Each channel component is a normalized signed 8-bit integer value |
| | {CL_SNORM_INT16_anchor} |
| |
| include::{generated}/api/version-notes/CL_SNORM_INT16.asciidoc[] |
| | Each channel component is a normalized signed 16-bit integer value |
| | {CL_UNORM_INT8_anchor} |
| |
| include::{generated}/api/version-notes/CL_UNORM_INT8.asciidoc[] |
| | Each channel component is a normalized unsigned 8-bit integer value |
| | {CL_UNORM_INT16_anchor} |
| |
| include::{generated}/api/version-notes/CL_UNORM_INT16.asciidoc[] |
| | Each channel component is a normalized unsigned 16-bit integer value |
| | {CL_UNORM_SHORT_565_anchor} |
| |
| include::{generated}/api/version-notes/CL_UNORM_SHORT_565.asciidoc[] |
| | Represents a normalized 5-6-5 3-channel RGB image. |
| The channel order must be {CL_RGB} or {CL_RGBx}. |
| | {CL_UNORM_SHORT_555_anchor} |
| |
| include::{generated}/api/version-notes/CL_UNORM_SHORT_555.asciidoc[] |
| | Represents a normalized x-5-5-5 4-channel xRGB image. |
| The channel order must be {CL_RGB} or {CL_RGBx}. |
| | {CL_UNORM_INT_101010_anchor} |
| |
| include::{generated}/api/version-notes/CL_UNORM_INT_101010.asciidoc[] |
| | Represents a normalized x-10-10-10 4-channel xRGB image. |
| The channel order must be {CL_RGB} or {CL_RGBx}. |
| | {CL_UNORM_INT_101010_2_anchor} |
| |
| include::{generated}/api/version-notes/CL_UNORM_INT_101010_2.asciidoc[] |
| | Represents a normalized 10-10-10-2 four-channel RGBA image. |
| The channel order must be {CL_RGBA}. |
| | {CL_SIGNED_INT8_anchor} |
| |
| include::{generated}/api/version-notes/CL_SIGNED_INT8.asciidoc[] |
| | Each channel component is an unnormalized signed 8-bit integer value |
| | {CL_SIGNED_INT16_anchor} |
| |
| include::{generated}/api/version-notes/CL_SIGNED_INT16.asciidoc[] |
| | Each channel component is an unnormalized signed 16-bit integer value |
| | {CL_SIGNED_INT32_anchor} |
| |
| include::{generated}/api/version-notes/CL_SIGNED_INT32.asciidoc[] |
| | Each channel component is an unnormalized signed 32-bit integer value |
| | {CL_UNSIGNED_INT8_anchor} |
| |
| include::{generated}/api/version-notes/CL_UNSIGNED_INT8.asciidoc[] |
| | Each channel component is an unnormalized unsigned 8-bit integer value |
| | {CL_UNSIGNED_INT16_anchor} |
| |
| include::{generated}/api/version-notes/CL_UNSIGNED_INT16.asciidoc[] |
| | Each channel component is an unnormalized unsigned 16-bit integer value |
| | {CL_UNSIGNED_INT32_anchor} |
| |
| include::{generated}/api/version-notes/CL_UNSIGNED_INT32.asciidoc[] |
| | Each channel component is an unnormalized unsigned 32-bit integer value |
| | {CL_HALF_FLOAT_anchor} |
| |
| include::{generated}/api/version-notes/CL_HALF_FLOAT.asciidoc[] |
| | Each channel component is a 16-bit half-float value |
| | {CL_FLOAT_anchor} |
| |
| include::{generated}/api/version-notes/CL_FLOAT.asciidoc[] |
| | Each channel component is a single precision floating-point value |
| |==== |
| |
| For example, to specify a normalized unsigned 8-bit per channel RGBA image, |
| set `image_channel_order` = {CL_RGBA} and `image_channel_data_type` = |
| {CL_UNORM_INT8}. |
| The memory layout of this image format is described below: |
| |
| // (Jon) these tables are missing "Byte Offset -> 0 1 2 3 ..." |
| |
| [width="60%",cols="<10%,<10%,<10%,<10%,<60%"] |
| |==== |
| | R | G | B | A | ... | |
| |==== |
| |
| with the corresponding byte offsets |
| |
| [width="60%",cols="<10%,<10%,<10%,<10%,<60%"] |
| |==== |
| | 0 | 1 | 2 | 3 | ... | |
| |==== |
| |
| Similarly, if `image_channel_order` = {CL_RGBA} and `image_channel_data_type` = |
| {CL_SIGNED_INT16}, the memory layout of this image format is described below: |
| |
| [width="60%",cols="<10%,<10%,<10%,<10%,<60%"] |
| |==== |
| | R | G | B | A | ... | |
| |==== |
| |
| with the corresponding byte offsets |
| |
| [width="60%",cols="<10%,<10%,<10%,<10%,<60%"] |
| |==== |
| | 0 | 2 | 4 | 6 | ... | |
| |==== |
| |
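The byte offsets in the two layouts above follow directly from the channel count and the size of one channel.
As a non-normative sketch, the hypothetical helper `channel_byte_offset` below (not part of the OpenCL API) computes the offset of a channel within a tightly packed scanline of an unpacked format; the channel count and bytes per channel are assumed to be supplied by the caller.

```c
#include <stddef.h>

/* Hypothetical helper (not an OpenCL API): byte offset of channel
 * `channel_index` of element `element_index` in a tightly packed scanline
 * of an unpacked image format. */
size_t channel_byte_offset(size_t element_index, size_t channel_index,
                           size_t num_channels, size_t bytes_per_channel)
{
    size_t element_size = num_channels * bytes_per_channel;
    return element_index * element_size + channel_index * bytes_per_channel;
}
```
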
| `image_channel_data_type` values of {CL_UNORM_SHORT_565}, {CL_UNORM_SHORT_555}, |
| {CL_UNORM_INT_101010}, and {CL_UNORM_INT_101010_2} are special cases of packed |
| image formats where the channels of each element are packed into a single |
| unsigned short or unsigned int. |
| For these special packed image formats, the channels are normally packed |
| with the first channel in the most significant bits of the bitfield, and |
| successive channels occupying progressively less significant locations. |
| For {CL_UNORM_SHORT_565}, R is in bits 15:11, G is in bits 10:5 and B is in |
| bits 4:0. |
| For {CL_UNORM_SHORT_555}, bit 15 is undefined, R is in bits 14:10, G in bits |
| 9:5 and B in bits 4:0. |
| For {CL_UNORM_INT_101010}, bits 31:30 are undefined, R is in bits 29:20, G in |
| bits 19:10 and B in bits 9:0. |
| For {CL_UNORM_INT_101010_2}, R is in bits 31:22, G in bits 21:12, B in bits |
| 11:2 and A in bits 1:0. |
| |
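The bit positions listed above can be expressed as shift-and-mask operations.
The following non-normative sketch packs raw channel codes (already quantized integers, not normalized floating-point values) for two of the packed formats; the function names are hypothetical and not part of the OpenCL API.

```c
#include <stdint.h>

/* CL_UNORM_SHORT_565 layout: R in bits 15:11, G in bits 10:5, B in bits 4:0.
 * Hypothetical helper; inputs are raw 5-, 6- and 5-bit channel codes. */
uint16_t pack_unorm_short_565(uint16_t r5, uint16_t g6, uint16_t b5)
{
    return (uint16_t)(((r5 & 0x1Fu) << 11) |
                      ((g6 & 0x3Fu) << 5) |
                      (b5 & 0x1Fu));
}

/* CL_UNORM_INT_101010_2 layout: R in bits 31:22, G in bits 21:12,
 * B in bits 11:2, A in bits 1:0.
 * Hypothetical helper; inputs are raw 10-, 10-, 10- and 2-bit codes. */
uint32_t pack_unorm_int_101010_2(uint32_t r10, uint32_t g10,
                                 uint32_t b10, uint32_t a2)
{
    return ((r10 & 0x3FFu) << 22) |
           ((g10 & 0x3FFu) << 12) |
           ((b10 & 0x3FFu) << 2) |
           (a2 & 0x3u);
}
```
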
| OpenCL implementations must maintain the minimum precision specified by the |
| number of bits in `image_channel_data_type`. |
| If the image format specified by `image_channel_order` and |
| `image_channel_data_type` cannot be supported by the OpenCL implementation, |
| then the call to {clCreateImage}, {clCreateImageWithProperties}, |
| {clCreateImage2D}, or {clCreateImage3D} will return a `NULL` memory object. |
| -- |
| |
| |
| [[image-descriptor]] |
| ==== Image Descriptor |
| |
| [open,refpage='cl_image_desc',desc='The image descriptor structure describes the type and dimensions of the image or image array and is defined as:',type='structs',xrefs='clCreateImage'] |
| -- |
| The {cl_image_desc_TYPE} image descriptor structure describes the image type |
| and dimensions of an image or image array when creating an image using |
| {clCreateImage} or {clCreateImageWithProperties}, and is defined as: |
| |
| include::{generated}/api/structs/cl_image_desc.txt[] |
| |
| * `image_type` describes the image type and must be either |
| {CL_MEM_OBJECT_IMAGE1D}, {CL_MEM_OBJECT_IMAGE1D_BUFFER}, |
| {CL_MEM_OBJECT_IMAGE1D_ARRAY}, {CL_MEM_OBJECT_IMAGE2D}, |
| {CL_MEM_OBJECT_IMAGE2D_ARRAY}, or {CL_MEM_OBJECT_IMAGE3D}. |
| * `image_width` is the width of the image in pixels. |
| For a 2D image and image array, the image width must be a value {geq} 1 and |
| {leq} {CL_DEVICE_IMAGE2D_MAX_WIDTH}. |
| For a 3D image, the image width must be a value {geq} 1 and {leq} |
| {CL_DEVICE_IMAGE3D_MAX_WIDTH}. |
| For a 1D image buffer, the image width must be a value {geq} 1 and {leq} |
| {CL_DEVICE_IMAGE_MAX_BUFFER_SIZE}. |
| For a 1D image and 1D image array, the image width must be a value {geq} 1 |
| and {leq} {CL_DEVICE_IMAGE2D_MAX_WIDTH}. |
| * `image_height` is the height of the image in pixels. |
| This is only used if the image is a 2D or 3D image, or a 2D image array. |
| For a 2D image or image array, the image height must be a value {geq} 1 and |
| {leq} {CL_DEVICE_IMAGE2D_MAX_HEIGHT}. |
| For a 3D image, the image height must be a value {geq} 1 and {leq} |
| {CL_DEVICE_IMAGE3D_MAX_HEIGHT}. |
| * `image_depth` is the depth of the image in pixels. |
| This is only used if the image is a 3D image and must be a value {geq} 1 and |
| {leq} {CL_DEVICE_IMAGE3D_MAX_DEPTH}. |
| * `image_array_size` footnote:[{fn-image-array-performance}] is the number of |
| images in the image array. |
| This is only used if the image is a 1D or 2D image array. |
| The values for `image_array_size`, if specified, must be a value {geq} 1 and |
| {leq} {CL_DEVICE_IMAGE_MAX_ARRAY_SIZE}. |
| * `image_row_pitch` is the scan-line pitch in bytes. |
| This must be 0 if _host_ptr_ is `NULL` and can be either 0 or {geq} |
| `image_width` {times} size of element in bytes if _host_ptr_ is not `NULL`. |
| If _host_ptr_ is not `NULL` and `image_row_pitch` = 0, `image_row_pitch` is |
| calculated as `image_width` {times} size of element in bytes. |
| If `image_row_pitch` is not 0, it must be a multiple of the image element |
| size in bytes. |
| For a 2D image created from a buffer, the pitch specified (or computed if |
| pitch specified is 0) must be a multiple of the maximum of the |
| {CL_DEVICE_IMAGE_PITCH_ALIGNMENT} value for all devices in the context |
| associated with the buffer specified by `mem_object` that support images. |
| * `image_slice_pitch` is the size in bytes of each 2D slice in the 3D image or |
| the size in bytes of each image in a 1D or 2D image array. |
| This must be 0 if _host_ptr_ is `NULL`. |
| If _host_ptr_ is not `NULL`, `image_slice_pitch` can be either 0 or {geq} |
| `image_row_pitch` {times} `image_height` for a 2D image array or 3D image |
| and can be either 0 or {geq} `image_row_pitch` for a 1D image array. |
| If _host_ptr_ is not `NULL` and `image_slice_pitch` = 0, `image_slice_pitch` |
| is calculated as `image_row_pitch` {times} `image_height` for a 2D image |
| array or 3D image and `image_row_pitch` for a 1D image array. |
| If `image_slice_pitch` is not 0, it must be a multiple of the |
| `image_row_pitch`. |
| * `num_mip_levels` and `num_samples` must be 0. |
| * `mem_object` may refer to a valid buffer or image memory object. |
| `mem_object` can be a buffer memory object if `image_type` is |
| {CL_MEM_OBJECT_IMAGE1D_BUFFER} or |
| {CL_MEM_OBJECT_IMAGE2D} footnote:[{fn-image-from-buffer}]. |
| `mem_object` can be an image object if `image_type` is |
| {CL_MEM_OBJECT_IMAGE2D} footnote:[{fn-image-from-image}]. |
| Otherwise it must be `NULL`. |
| The image pixels are taken from the memory object's data store. |
| When the contents of the specified memory object's data store are modified, |
| those changes are reflected in the contents of the image object and |
| vice-versa at corresponding synchronization points. |
| |
| For a 1D image buffer created from a buffer object, the `image_width` {times} |
| size of element in bytes must be {leq} size of the buffer object. |
| The image data in the buffer object is stored as a single scanline which is |
| a linear sequence of adjacent elements. |
| |
| For a 2D image created from a buffer object, the `image_row_pitch` {times} |
| `image_height` must be {leq} size of the buffer object specified by |
| `mem_object`. |
| The image data in the buffer object is stored as a linear sequence of |
| adjacent scanlines. |
| Each scanline is a linear sequence of image elements padded to |
| `image_row_pitch` bytes. |
| |
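The row pitch and buffer size requirements for a 2D image created from a buffer can be checked on the host before the image is created.
The following is a non-normative sketch: `image2d_from_buffer_ok` is a hypothetical helper, not part of the OpenCL API, and it assumes the caller has already taken the maximum {CL_DEVICE_IMAGE_PITCH_ALIGNMENT} value over the relevant devices and converted it to bytes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL API): checks the row pitch and size
 * rules for a 2D image created from a buffer. `pitch_align_bytes` is
 * assumed to be the required pitch alignment, already expressed in bytes. */
bool image2d_from_buffer_ok(size_t width, size_t height, size_t elem_size,
                            size_t row_pitch, size_t pitch_align_bytes,
                            size_t buffer_size)
{
    if (width == 0 || height == 0 || elem_size == 0 || pitch_align_bytes == 0)
        return false;

    if (row_pitch == 0)
        row_pitch = width * elem_size;      /* computed pitch */

    if (row_pitch < width * elem_size)
        return false;                       /* rows would overlap */
    if (row_pitch % elem_size != 0)
        return false;                       /* multiple of element size */
    if (row_pitch % pitch_align_bytes != 0)
        return false;                       /* multiple of pitch alignment */

    /* Equivalent to row_pitch * height <= buffer_size, without overflow. */
    return row_pitch <= buffer_size / height;
}
```
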
| For an image object created from another image object, the values specified |
| in the image descriptor except for `mem_object` must match the image |
| descriptor information associated with `mem_object`. |
| |
| Image elements are stored according to their image format as described in |
| <<image-format-descriptor, Image Format Descriptor>>. |
| |
| If the buffer object specified by `mem_object` was created with |
| {CL_MEM_USE_HOST_PTR}, the _host_ptr_ specified to {clCreateBuffer} or |
| {clCreateBufferWithProperties} must be aligned to the maximum of the |
| {CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT} value for all devices in the |
| context associated with the buffer specified by `mem_object` that |
| support images. |
| |
| Creating a 2D image object from another 2D image object creates a new |
| 2D image object that shares the image data store with `mem_object` but views |
| the pixels in the image with a different image channel order. |
| Restrictions are: |
| |
| * All of the values specified in _image_desc_ must match the image descriptor |
| information associated with `mem_object`, except for `mem_object`. |
| |
| * The image channel data type specified in _image_format_ must match the |
| image channel data type associated with `mem_object`. |
| |
| * The image channel order specified in _image_format_ must be compatible |
| with the image channel order associated with `mem_object`. |
| Compatible image channel orders |
| footnote:[{fn-compatible-image-channel-orders}] are: |
| + |
| -- |
| [width="100%",cols="<50%,<50%",options="header"] |
| |==== |
| | Image Channel Order in _image_format_: |
| | Image Channel Order associated with `mem_object`: |
| | {CL_sBGRA} |
| | {CL_BGRA} |
| | {CL_BGRA} |
| | {CL_sBGRA} |
| | {CL_sRGBA} |
| | {CL_RGBA} |
| | {CL_RGBA} |
| | {CL_sRGBA} |
| | {CL_sRGB} |
| | {CL_RGB} |
| | {CL_RGB} |
| | {CL_sRGB} |
| | {CL_sRGBx} |
| | {CL_RGBx} |
| | {CL_RGBx} |
| | {CL_sRGBx} |
| | {CL_DEPTH} |
| | {CL_R} |
| |==== |
| -- |
| |
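The compatibility table above is directional: the first column is the channel order requested in _image_format_ and the second is the order associated with `mem_object`.
A non-normative lookup sketch follows; the `channel_order` enumeration and `channel_orders_compatible` are hypothetical stand-ins so that the example compiles without the OpenCL headers, and a real program would use the corresponding `CL_*` constants instead.
The function only reports the pairs listed in the table.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins for the channel order constants named in the
 * table; these are not the values defined by the OpenCL headers. */
typedef enum {
    ORDER_R, ORDER_DEPTH,
    ORDER_RGB, ORDER_SRGB,
    ORDER_RGBX, ORDER_SRGBX,
    ORDER_RGBA, ORDER_SRGBA,
    ORDER_BGRA, ORDER_SBGRA
} channel_order;

/* Returns true if `requested` (the order in image_format) is listed as
 * compatible with `source` (the order associated with mem_object). */
bool channel_orders_compatible(channel_order requested, channel_order source)
{
    static const channel_order pairs[][2] = {
        { ORDER_SBGRA, ORDER_BGRA  }, { ORDER_BGRA, ORDER_SBGRA },
        { ORDER_SRGBA, ORDER_RGBA  }, { ORDER_RGBA, ORDER_SRGBA },
        { ORDER_SRGB,  ORDER_RGB   }, { ORDER_RGB,  ORDER_SRGB  },
        { ORDER_SRGBX, ORDER_RGBX  }, { ORDER_RGBX, ORDER_SRGBX },
        { ORDER_DEPTH, ORDER_R     }
    };
    for (size_t i = 0; i < sizeof pairs / sizeof pairs[0]; ++i) {
        if (pairs[i][0] == requested && pairs[i][1] == source)
            return true;
    }
    return false;
}
```
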
| [NOTE] |
| ==== |
| Concurrent reading from, writing to and copying between both a buffer object |
| and 1D image buffer or 2D image object associated with the buffer object is |
| undefined. |
| Only reading from both a buffer object and 1D image buffer or 2D image |
| object associated with the buffer object is defined. |
| |
| Writing to an image created from a buffer and then reading from this buffer |
| in a kernel is undefined, even if appropriate synchronization operations |
| (such as a barrier) are performed between the writes and reads. |
| Similarly, writing to the buffer and then reading from the image created |
| from this buffer is undefined, even with appropriate synchronization between |
| the writes and reads. |
| ==== |
| -- |
| |
| |
| === Querying List of Supported Image Formats |
| |
| [open,refpage='clGetSupportedImageFormats',desc='Get the list of image formats supported by an OpenCL implementation.',type='protos'] |
| -- |
| To get the list of image formats supported by an OpenCL implementation for a |
| specified context, image type, and allocation information, call the function |
| |
| include::{generated}/api/protos/clGetSupportedImageFormats.txt[] |
| include::{generated}/api/version-notes/clGetSupportedImageFormats.asciidoc[] |
| |
| * _context_ is a valid OpenCL context on which the image object(s) will be |
| created. |
| * _flags_ is a bit-field that is used to specify usage |
| information about the image formats being queried and is described in |
| the <<memory-flags-table,Memory Flags>> table. |
| _flags_ may be {CL_MEM_READ_WRITE} to query image formats that may be read |
| from and written to by different kernel instances when correctly ordered by |
| event dependencies, or {CL_MEM_READ_ONLY} to query image formats that may |
| be read from by a kernel, or {CL_MEM_WRITE_ONLY} to query image formats that |
| may be written to by a kernel, or {CL_MEM_KERNEL_READ_AND_WRITE} to query |
| image formats that may be both read from and written to by the same kernel |
| instance. |
| Please see <<image-format-mapping, Image Format Mapping>> for clarification. |
| * _image_type_ describes the image type and must be either |
| {CL_MEM_OBJECT_IMAGE1D}, {CL_MEM_OBJECT_IMAGE1D_BUFFER}, {CL_MEM_OBJECT_IMAGE2D}, |
| {CL_MEM_OBJECT_IMAGE3D}, {CL_MEM_OBJECT_IMAGE1D_ARRAY}, or |
| {CL_MEM_OBJECT_IMAGE2D_ARRAY}. |
| * _num_entries_ specifies the number of entries that can be returned in the |
| memory location given by _image_formats_. |
| * _image_formats_ is a pointer to a memory location where the list of |
| supported image formats is returned. |
| Each entry describes a {cl_image_format_TYPE} structure supported by the OpenCL |
| implementation. |
| If _image_formats_ is `NULL`, it is ignored. |
| * _num_image_formats_ is the actual number of supported image formats for a |
| specific _context_ and values specified by _flags_. |
| If _num_image_formats_ is `NULL`, it is ignored. |
| |
| {clGetSupportedImageFormats} returns a union of image formats supported by |
| all devices in the context. |
| |
| // The following information about an image memory object |
| // is specified: |
| // |
| // * Context |
| // * Image type 1D, 2D, or 3D image, 1D image buffer, 1D or 2D image array. |
| // * Image object allocation information |
| |
| // refError |
| |
| {clGetSupportedImageFormats} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_VALUE} if _flags_ or _image_type_ are not valid, or if |
| _num_entries_ is 0 and _image_formats_ is not `NULL`. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
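Because _image_formats_ and _num_image_formats_ may each be `NULL`, a common calling pattern is to query the count first and then fetch the list.
The following non-normative sketch shows that pattern without depending on the OpenCL headers: `image_format`, `format_query_fn`, `fake_query`, and `query_all_formats` are all hypothetical names.
In a real program the callback would wrap {clGetSupportedImageFormats} with a fixed _context_, _flags_, and _image_type_; here `fake_query` merely mimics the argument rules described above and reports three made-up placeholder formats.

```c
#include <stdlib.h>

/* Local stand-ins so the sketch compiles without the OpenCL headers. */
typedef struct { unsigned order; unsigned data_type; } image_format;
typedef int (*format_query_fn)(unsigned num_entries, image_format *formats,
                               unsigned *num_formats);

/* Stand-in for a wrapper around the real query. It reports three made-up
 * formats and rejects num_entries == 0 with a non-NULL formats pointer. */
int fake_query(unsigned num_entries, image_format *formats,
               unsigned *num_formats)
{
    static const image_format supported[3] = {
        { 1u, 10u }, { 2u, 10u }, { 2u, 20u }
    };
    if (num_entries == 0 && formats != NULL)
        return -1;
    if (formats != NULL) {
        for (unsigned i = 0; i < num_entries && i < 3; ++i)
            formats[i] = supported[i];
    }
    if (num_formats != NULL)
        *num_formats = 3;
    return 0;
}

/* Two-call pattern: ask for the count, allocate, then fetch the list.
 * Returns a malloc'ed array that the caller must free, or NULL. */
image_format *query_all_formats(format_query_fn query, unsigned *count)
{
    unsigned n = 0;
    *count = 0;
    if (query(0, NULL, &n) != 0 || n == 0)
        return NULL;
    image_format *list = malloc(n * sizeof *list);
    if (list == NULL)
        return NULL;
    if (query(n, list, NULL) != 0) {
        free(list);
        return NULL;
    }
    *count = n;
    return list;
}
```
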
| If {CL_DEVICE_IMAGE_SUPPORT} specified in the <<device-queries-table,Device |
| Queries>> table is {CL_TRUE}, the values assigned to |
| {CL_DEVICE_MAX_READ_IMAGE_ARGS}, {CL_DEVICE_MAX_WRITE_IMAGE_ARGS}, |
| {CL_DEVICE_IMAGE2D_MAX_WIDTH}, {CL_DEVICE_IMAGE2D_MAX_HEIGHT}, |
| {CL_DEVICE_IMAGE3D_MAX_WIDTH}, {CL_DEVICE_IMAGE3D_MAX_HEIGHT}, |
| {CL_DEVICE_IMAGE3D_MAX_DEPTH}, and {CL_DEVICE_MAX_SAMPLERS} by the implementation |
| must be greater than or equal to the minimum values specified in the |
| <<device-queries-table,Device Queries>> table. |
| -- |
| |
| |
| [[minimum-list-of-supported-image-formats]] |
| ==== Minimum List of Supported Image Formats |
| |
| [open,refpage='supportedImageFormats',desc='Minimum List of Supported Image Formats',type='freeform',anchor='minimum-list-of-supported-image-formats'] |
| -- |
| The tables below describe the required minimum lists of supported image |
| formats. |
| To query all image formats supported by an implementation, call the function {clGetSupportedImageFormats}. |
| |
| For full profile devices supporting OpenCL 2.0, 2.1, or 2.2, the minimum |
| list of supported image formats for either reading or writing in a kernel |
| is: |
| |
| [[min-supported-image-formats-2.0]] |
| .Minimum list of supported image formats for reading or writing (OpenCL 2.0, 2.1, or 2.2) |
| [width="100%",cols="<34%,<33%,<33%",options="header"] |
| |==== |
| | num_channels | channel_order | channel_data_type |
| | 1 |
| | {CL_R} |
| | {CL_UNORM_INT8} + |
| {CL_UNORM_INT16} + |
| {CL_SNORM_INT8} + |
| {CL_SNORM_INT16} + |
| {CL_SIGNED_INT8} + |
| {CL_SIGNED_INT16} + |
| {CL_SIGNED_INT32} + |
| {CL_UNSIGNED_INT8} + |
| {CL_UNSIGNED_INT16} + |
| {CL_UNSIGNED_INT32} + |
| {CL_HALF_FLOAT} + |
| {CL_FLOAT} |
| | 1 |
| | {CL_DEPTH} footnote:[{fn-depth-image-requirements}] |
| | {CL_UNORM_INT16} + |
| {CL_FLOAT} |
| | 2 |
| | {CL_RG} |
| | {CL_UNORM_INT8} + |
| {CL_UNORM_INT16} + |
| {CL_SNORM_INT8} + |
| {CL_SNORM_INT16} + |
| {CL_SIGNED_INT8} + |
| {CL_SIGNED_INT16} + |
| {CL_SIGNED_INT32} + |
| {CL_UNSIGNED_INT8} + |
| {CL_UNSIGNED_INT16} + |
| {CL_UNSIGNED_INT32} + |
| {CL_HALF_FLOAT} + |
| {CL_FLOAT} |
| | 4 |
| | {CL_RGBA} |
| | {CL_UNORM_INT8} + |
| {CL_UNORM_INT16} + |
| {CL_SNORM_INT8} + |
| {CL_SNORM_INT16} + |
| {CL_SIGNED_INT8} + |
| {CL_SIGNED_INT16} + |
| {CL_SIGNED_INT32} + |
| {CL_UNSIGNED_INT8} + |
| {CL_UNSIGNED_INT16} + |
| {CL_UNSIGNED_INT32} + |
| {CL_HALF_FLOAT} + |
| {CL_FLOAT} |
| | 4 |
| | {CL_BGRA} |
| | {CL_UNORM_INT8} |
| | 4 |
| | {CL_sRGBA} footnote:[{fn-srgb-image-requirements}] |
| | {CL_UNORM_INT8} |
| |==== |
| |
| For full profile devices supporting other OpenCL versions, such as OpenCL 1.2 |
| or OpenCL 3.0, the minimum list of supported image formats for either reading |
| or writing in a kernel is: |
| |
| // Note, differences are: |
| // * No {CL_R} images. |
| // * No {CL_DEPTH} images. |
| // * No {CL_RG} images. |
| // * No {CL_sRGBA} images. |
| // * For {CL_RGBA} images, no {CL_SNORM_INT8} or {CL_SNORM_INT16} image channel |
| // data types. |
| |
| [[min-supported-image-formats]] |
| .Minimum list of required image formats for reading or writing |
| [width="100%",cols="<34%,<33%,<33%",options="header"] |
| |==== |
| | num_channels | channel_order | channel_data_type |
| | 4 |
| | {CL_RGBA} |
| | {CL_UNORM_INT8} + |
| {CL_UNORM_INT16} + |
| {CL_SIGNED_INT8} + |
| {CL_SIGNED_INT16} + |
| {CL_SIGNED_INT32} + |
| {CL_UNSIGNED_INT8} + |
| {CL_UNSIGNED_INT16} + |
| {CL_UNSIGNED_INT32} + |
| {CL_HALF_FLOAT} + |
| {CL_FLOAT} |
| | 4 |
| | {CL_BGRA} |
| | {CL_UNORM_INT8} |
| |==== |
| |
| For full profile devices that support reading from and writing to the same |
| image object from the same kernel instance (see {CL_DEVICE_MAX_READ_WRITE_IMAGE_ARGS}), |
| the minimum list of supported image formats for reading and writing in |
| the same kernel instance is: |
| |
| [[min-supported-image-formats-read-write]] |
| .Minimum list of required image formats for reading and writing |
| [width="100%",cols="<34%,<33%,<33%",options="header"] |
| |==== |
| | num_channels | channel_order | channel_data_type |
| | 1 |
| | {CL_R} |
| | {CL_UNORM_INT8} + |
| {CL_SIGNED_INT8} + |
| {CL_SIGNED_INT16} + |
| {CL_SIGNED_INT32} + |
| {CL_UNSIGNED_INT8} + |
| {CL_UNSIGNED_INT16} + |
| {CL_UNSIGNED_INT32} + |
| {CL_HALF_FLOAT} + |
| {CL_FLOAT} |
| | 4 |
| | {CL_RGBA} |
| | {CL_UNORM_INT8} + |
| {CL_SIGNED_INT8} + |
| {CL_SIGNED_INT16} + |
| {CL_SIGNED_INT32} + |
| {CL_UNSIGNED_INT8} + |
| {CL_UNSIGNED_INT16} + |
| {CL_UNSIGNED_INT32} + |
| {CL_HALF_FLOAT} + |
| {CL_FLOAT} |
| |==== |
| -- |
| |
| [[image-format-mapping]] |
| ==== Image format mapping to OpenCL kernel language image access qualifiers |
| |
| Image arguments to kernels may have the `read_only`, `write_only` or |
| `read_write` qualifier. |
| Not all image formats supported by the device and platform are valid to be |
| passed to all of these access qualifiers. |
| For each access qualifier, only images whose format is in the list of |
| formats returned by {clGetSupportedImageFormats} with the given flag |
| arguments in the <<image-format-mapping-table,Image Format Mapping>> table |
| are permitted. |
| It is not valid to pass an image supporting writing as both a `read_only` |
| image and a `write_only` image parameter, or to a `read_write` image |
| parameter and any other image parameter. |
| |
| [[image-format-mapping-table]] |
| .Mapping from format flags passed to <<clGetSupportedImageFormats>> to OpenCL kernel language image access qualifiers |
| [width="100%",cols="<50%,<50%",options="header"] |
| |==== |
| | Access Qualifier | Memory Flags |
| | `read_only` |
| | {CL_MEM_READ_ONLY}, + |
| {CL_MEM_READ_WRITE}, + |
| {CL_MEM_KERNEL_READ_AND_WRITE} |
| | `write_only` |
| | {CL_MEM_WRITE_ONLY}, + |
| {CL_MEM_READ_WRITE}, + |
| {CL_MEM_KERNEL_READ_AND_WRITE} |
| | `read_write` |
| | {CL_MEM_KERNEL_READ_AND_WRITE} |
| |==== |
| |
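The mapping table above can be read as a predicate: a format reported for a given query flag may be bound to an image argument only if that flag appears in the row for the argument's access qualifier.
A non-normative sketch follows; the `access_qualifier` enumeration, the `FLAG_*` constants, and `flag_permits_qualifier` are hypothetical stand-ins with arbitrary values, and a real program would use the `CL_MEM_*` constants from the OpenCL headers.

```c
#include <stdbool.h>

/* Illustrative stand-ins; not the values defined by the OpenCL headers. */
typedef enum { QUAL_READ_ONLY, QUAL_WRITE_ONLY, QUAL_READ_WRITE } access_qualifier;

enum {
    FLAG_READ_WRITE            = 1 << 0,
    FLAG_WRITE_ONLY            = 1 << 1,
    FLAG_READ_ONLY             = 1 << 2,
    FLAG_KERNEL_READ_AND_WRITE = 1 << 3
};

/* Returns true if an image format reported for `query_flag` may be passed
 * to a kernel image argument declared with qualifier `qual`. */
bool flag_permits_qualifier(int query_flag, access_qualifier qual)
{
    switch (qual) {
    case QUAL_READ_ONLY:
        return query_flag == FLAG_READ_ONLY ||
               query_flag == FLAG_READ_WRITE ||
               query_flag == FLAG_KERNEL_READ_AND_WRITE;
    case QUAL_WRITE_ONLY:
        return query_flag == FLAG_WRITE_ONLY ||
               query_flag == FLAG_READ_WRITE ||
               query_flag == FLAG_KERNEL_READ_AND_WRITE;
    case QUAL_READ_WRITE:
        return query_flag == FLAG_KERNEL_READ_AND_WRITE;
    }
    return false;
}
```
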
| |
| === Reading, Writing and Copying Image Objects |
| |
| [open,refpage='clEnqueueReadImage',desc='Enqueue commands to read from an image or image array object to host memory.',type='protos',alias='clEnqueueWriteImage'] |
| -- |
| The following functions enqueue commands to read from an image or image |
| array object to host memory or write to an image or image array object from |
| host memory. |
| |
| include::{generated}/api/protos/clEnqueueReadImage.txt[] |
| include::{generated}/api/version-notes/clEnqueueReadImage.asciidoc[] |
| |
| include::{generated}/api/protos/clEnqueueWriteImage.txt[] |
| include::{generated}/api/version-notes/clEnqueueWriteImage.asciidoc[] |
| |
| * _command_queue_ refers to the host command-queue in which the read / write |
| command will be queued. |
| _command_queue_ and _image_ must be created with the same OpenCL context. |
| * _image_ refers to a valid image or image array object. |
| * _blocking_read_ and _blocking_write_ indicate if the read and write |
| operations are _blocking_ or _non-blocking_. |
| * _origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or 3D |
| image, the (_x_, _y_) offset and the image index in the 2D image array or |
| the (_x_) offset and the image index in the 1D image array. |
| If _image_ is a 2D image object, _origin_[2] must be 0. |
| If _image_ is a 1D image or 1D image buffer object, _origin_[1] and |
| _origin_[2] must be 0. |
| If _image_ is a 1D image array object, _origin_[2] must be 0. |
| If _image_ is a 1D image array object, _origin_[1] describes the image index |
| in the 1D image array. |
| If _image_ is a 2D image array object, _origin_[2] describes the image index |
| in the 2D image array. |
| * _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or |
| 3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the |
| number of images of a 2D image array or the (_width_) in pixels of the 1D |
| rectangle and the number of images of a 1D image array. |
| If _image_ is a 2D image object, _region_[2] must be 1. |
| If _image_ is a 1D image or 1D image buffer object, _region_[1] and |
| _region_[2] must be 1. |
| If _image_ is a 1D image array object, _region_[2] must be 1. |
| The values in _region_ cannot be 0. |
| * _row_pitch_ in {clEnqueueReadImage} and _input_row_pitch_ in |
| {clEnqueueWriteImage} is the length of each row in bytes. |
| This value must be greater than or equal to the element size in bytes |
| {times} _width_. |
| If _row_pitch_ (or _input_row_pitch_) is set to 0, the appropriate row pitch |
| is calculated based on the size of each element in bytes multiplied by |
| _width_. |
| * _slice_pitch_ in {clEnqueueReadImage} and _input_slice_pitch_ in |
| {clEnqueueWriteImage} is the size in bytes of the 2D slice of the 3D region |
| of a 3D image or each image of a 1D or 2D image array being read or written |
| respectively. |
| This must be 0 if _image_ is a 1D or 2D image. |
| Otherwise this value must be greater than or equal to _row_pitch_ {times} |
| _height_. |
| If _slice_pitch_ (or _input_slice_pitch_) is set to 0, the appropriate slice |
| pitch is calculated based on the _row_pitch_ {times} _height_. |
| * _ptr_ is the pointer to a buffer in host memory where image data is to be |
| read from or to be written to. |
| The alignment requirements for _ptr_ are specified in |
| <<alignment-app-data-types>>. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this read / write command |
| and can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
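
The pitch rules in the argument descriptions above can be sketched as two small helpers. This is an illustration only: `effective_row_pitch` and `effective_slice_pitch` are hypothetical names, not OpenCL functions, and the sketch covers the 3D image and image array case (for 1D and 2D images the slice pitch must simply be 0).

```c
#include <assert.h>
#include <stddef.h>

/* Row pitch actually used: 0 selects element_size * width. */
static size_t effective_row_pitch(size_t row_pitch, size_t element_size,
                                  size_t width)
{
    size_t min_pitch = element_size * width;
    if (row_pitch == 0)
        return min_pitch;
    assert(row_pitch >= min_pitch); /* required by the argument description */
    return row_pitch;
}

/* Slice pitch actually used: 0 selects row_pitch * height.
   `row_pitch` is expected to be the effective (non-zero) row pitch. */
static size_t effective_slice_pitch(size_t slice_pitch, size_t row_pitch,
                                    size_t height)
{
    size_t min_pitch = row_pitch * height;
    if (slice_pitch == 0)
        return min_pitch;
    assert(slice_pitch >= min_pitch);
    return slice_pitch;
}
```

With a 4-byte element and a region 64 pixels wide and 32 pixels high, passing 0 for both pitches gives a row pitch of 256 bytes and a slice pitch of 8192 bytes in this sketch.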
| |
| If _blocking_read_ is {CL_TRUE} i.e. the read command is blocking, |
| {clEnqueueReadImage} does not return until the image data has been read and |
| copied into memory pointed to by _ptr_. |
| |
| If _blocking_read_ is {CL_FALSE} i.e. the read command is non-blocking, |
| {clEnqueueReadImage} queues a non-blocking read command and returns. |
| The contents of the buffer that _ptr_ points to cannot be used until the |
| read command has completed. |
| The _event_ argument returns an event object which can be used to query the |
| execution status of the read command. |
| When the read command has completed, the contents of the buffer that _ptr_ |
| points to can be used by the application. |
| |
| If _blocking_write_ is {CL_TRUE}, the write command is blocking and does not |
| return until the command is complete, including transfer of the data. |
| The memory pointed to by _ptr_ can be reused by the application after the |
| {clEnqueueWriteImage} call returns. |
| |
| If _blocking_write_ is {CL_FALSE}, the OpenCL implementation will use _ptr_ to |
| perform a non-blocking write. |
| As the write is non-blocking the implementation can return immediately. |
| The memory pointed to by _ptr_ cannot be reused by the application until the |
| write command has completed. |
| The _event_ argument returns an event object which can be used to query the |
| execution status of the write command. |
| |
| // refError |
| |
| {clEnqueueReadImage} and {clEnqueueWriteImage} return {CL_SUCCESS} if the |
| function is executed successfully. |
| Otherwise, they return one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and |
| _image_ are not the same or if the context associated with |
| _command_queue_ and events in _event_wait_list_ are not the same. |
| * {CL_INVALID_MEM_OBJECT} if _image_ is not a valid image object. |
| * {CL_INVALID_VALUE} if _origin_ or _region_ is `NULL`. |
| * {CL_INVALID_VALUE} if the region being read or written specified by |
| _origin_ and _region_ is out of bounds. |
| * {CL_INVALID_VALUE} if values in _origin_ and _region_ do not follow rules |
| described in the argument description for _origin_ and _region_. |
| * {CL_INVALID_VALUE} if _ptr_ is `NULL`. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height, |
| specified or computed row and/or slice pitch) for _image_ are not |
| supported by the device associated with _command_queue_. |
| * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order |
| and data type) for _image_ is not supported by the device associated with |
| _command_queue_. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for data store associated with _image_. |
| * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does |
| not support images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the |
| <<device-queries-table,Device Queries>> table is {CL_FALSE}). |
| * {CL_INVALID_OPERATION} if {clEnqueueReadImage} is called on _image_ which |
| has been created with {CL_MEM_HOST_WRITE_ONLY} or {CL_MEM_HOST_NO_ACCESS}. |
| * {CL_INVALID_OPERATION} if {clEnqueueWriteImage} is called on _image_ which |
| has been created with {CL_MEM_HOST_READ_ONLY} or {CL_MEM_HOST_NO_ACCESS}. |
| * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the read and write |
| operations are blocking and the execution status of any of the events in |
| _event_wait_list_ is a negative integer value. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
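
The pairing rule behind {CL_INVALID_EVENT_WAIT_LIST} in the list above can be written as a single predicate. The sketch below is only an illustration: `wait_list_args_consistent` is a hypothetical helper, and it checks only the `NULL` / count pairing, not whether the event objects themselves are valid.

```c
#include <stdbool.h>
#include <stddef.h>

/* True when event_wait_list and num_events_in_wait_list agree:
   a NULL list requires a count of 0, a non-NULL list a count above 0. */
static bool wait_list_args_consistent(const void *event_wait_list,
                                      unsigned num_events_in_wait_list)
{
    if (event_wait_list == NULL)
        return num_events_in_wait_list == 0;
    return num_events_in_wait_list > 0;
}
```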
| |
| [NOTE] |
| ==== |
| Calling {clEnqueueReadImage} to read a region of the _image_ with the _ptr_ |
| argument value set to _host_ptr_ + (_origin_[2] {times} _image slice pitch_ |
| {plus} _origin_[1] {times} _image row pitch_ + _origin_[0] {times} _bytes |
| per pixel_), where _host_ptr_ is a pointer to the memory region specified |
| when the _image_ being read is created with {CL_MEM_USE_HOST_PTR}, must meet |
| the following requirements in order to avoid undefined behavior: |
| |
| * All commands that use this image object have finished execution before |
| the read command begins execution. |
| * The _row_pitch_ and _slice_pitch_ argument values in |
| {clEnqueueReadImage} must be set to the image row pitch and slice pitch. |
| * The image object is not mapped. |
| * The image object is not used by any command-queue until the read command |
| has finished execution. |
| |
| Calling {clEnqueueWriteImage} to update the latest bits in a region of the |
| _image_ with the _ptr_ argument value set to _host_ptr_ + (_origin_[2] |
| {times} _image slice pitch_ + _origin_[1] {times} _image row pitch_ {plus} |
| _origin_[0] {times} _bytes per pixel_), where _host_ptr_ is a pointer to the |
| memory region specified when the _image_ being written is created with |
| {CL_MEM_USE_HOST_PTR}, must meet the following requirements in order to avoid |
| undefined behavior: |
| |
| * The host memory region being written contains the latest bits when the |
| enqueued write command begins execution. |
| * The _input_row_pitch_ and _input_slice_pitch_ argument values in |
| {clEnqueueWriteImage} must be set to the image row pitch and slice |
| pitch. |
| * The image object is not mapped. |
| * The image object is not used by any command-queue until the write |
| command has finished execution. |
| ==== |
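
The pointer arithmetic described in the note above can be sketched as follows. `host_ptr_offset` is a hypothetical helper introduced for illustration; it returns the byte offset that is added to _host_ptr_, assuming the host memory uses the image row pitch and image slice pitch.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of the pixel at `origin` within host memory laid out with the
   image's own row pitch and slice pitch. */
static size_t host_ptr_offset(const size_t origin[3], size_t image_row_pitch,
                              size_t image_slice_pitch, size_t bytes_per_pixel)
{
    assert(origin != NULL);
    return origin[2] * image_slice_pitch
         + origin[1] * image_row_pitch
         + origin[0] * bytes_per_pixel;
}
```

For an origin of (2, 3, 1), a row pitch of 256 bytes, a slice pitch of 8192 bytes and 4 bytes per pixel, the offset in this sketch is 8192 + 768 + 8 = 8968 bytes.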
| -- |
| |
| [open,refpage='clEnqueueCopyImage',desc='Enqueues a command to copy image objects.',type='protos'] |
| -- |
| To enqueue a command to copy image objects, call the function |
| |
| include::{generated}/api/protos/clEnqueueCopyImage.txt[] |
| include::{generated}/api/version-notes/clEnqueueCopyImage.asciidoc[] |
| |
| * _src_image_ and _dst_image_ can be 1D, 2D, or 3D image objects or 1D or 2D |
| image array objects. |
| It is possible to copy subregions between any combinations of source and |
| destination types, provided that the dimensions of the subregions are the |
| same. |
| For example, one can copy a rectangular region from a 2D image to a slice of |
| a 3D image. |
| * _command_queue_ refers to the host command-queue in which the copy command |
| will be queued. |
| The OpenCL context associated with _command_queue_, _src_image_ and |
| _dst_image_ must be the same. |
| * _src_origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or |
| 3D image, the (_x_, _y_) offset and the image index in the 2D image array or |
| the (_x_) offset and the image index in the 1D image array. |
| If _src_image_ is a 2D image object, _src_origin_[2] must be 0. |
| If _src_image_ is a 1D image object, _src_origin_[1] and _src_origin_[2] |
| must be 0. |
| If _src_image_ is a 1D image array object, _src_origin_[2] must be 0. |
| If _src_image_ is a 1D image array object, _src_origin_[1] describes the |
| image index in the 1D image array. |
| If _src_image_ is a 2D image array object, _src_origin_[2] describes the |
| image index in the 2D image array. |
| * _dst_origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or |
| 3D image, the (_x_, _y_) offset and the image index in the 2D image array or |
| the (_x_) offset and the image index in the 1D image array. |
| If _dst_image_ is a 2D image object, _dst_origin_[2] must be 0. |
| If _dst_image_ is a 1D image or 1D image buffer object, _dst_origin_[1] and |
| _dst_origin_[2] must be 0. |
| If _dst_image_ is a 1D image array object, _dst_origin_[2] must be 0. |
| If _dst_image_ is a 1D image array object, _dst_origin_[1] describes the |
| image index in the 1D image array. |
| If _dst_image_ is a 2D image array object, _dst_origin_[2] describes the |
| image index in the 2D image array. |
| * _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or |
| 3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the |
| number of images of a 2D image array or the (_width_) in pixels of the 1D |
| rectangle and the number of images of a 1D image array. |
| If _src_image_ or _dst_image_ is a 2D image object, _region_[2] must be 1. |
| If _src_image_ or _dst_image_ is a 1D image or 1D image buffer object, |
| _region_[1] and _region_[2] must be 1. |
| If _src_image_ or _dst_image_ is a 1D image array object, _region_[2] must |
| be 1. |
| The values in _region_ cannot be 0. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this copy command |
| and can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
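
The per-type constraints on an origin and _region_ described above follow one pattern, which the sketch below encodes. It is an illustration under stated assumptions: the `image_kind` enum and `origin_region_follow_rules` are hypothetical names, and the function checks only the fixed 0 and 1 entries and the non-zero rule, not whether the region lies inside the image.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical image kinds mirroring the object types named in the text. */
enum image_kind {
    IMG_1D, IMG_1D_BUFFER, IMG_1D_ARRAY, IMG_2D, IMG_2D_ARRAY, IMG_3D
};

/* Checks the per-type constraints on origin and region. */
static bool origin_region_follow_rules(enum image_kind kind,
                                       const size_t origin[3],
                                       const size_t region[3])
{
    assert(origin != NULL && region != NULL);
    if (region[0] == 0 || region[1] == 0 || region[2] == 0)
        return false;                 /* values in region cannot be 0 */
    switch (kind) {
    case IMG_1D:
    case IMG_1D_BUFFER:
        return origin[1] == 0 && origin[2] == 0 &&
               region[1] == 1 && region[2] == 1;
    case IMG_1D_ARRAY:                /* origin[1] is the image index */
    case IMG_2D:
        return origin[2] == 0 && region[2] == 1;
    case IMG_2D_ARRAY:                /* origin[2] is the image index */
    case IMG_3D:
        return true;
    }
    return false;
}
```

A copy would apply such a check once to the source image with _src_origin_ and once to the destination image with _dst_origin_, using the same _region_ for both.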
| |
| The _src_image_ and _dst_image_ image memory objects for |
| {clEnqueueCopyImage} must currently have exactly the same image format |
| (i.e. the {cl_image_format_TYPE} descriptor specified when _src_image_ and |
| _dst_image_ are created must match). |
| |
| // refError |
| |
| {clEnqueueCopyImage} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_, |
| _src_image_ and _dst_image_ are not the same or if the context |
| associated with _command_queue_ and events in _event_wait_list_ are not |
| the same. |
| * {CL_INVALID_MEM_OBJECT} if _src_image_ or _dst_image_ is not a valid image |
| object. |
| * {CL_IMAGE_FORMAT_MISMATCH} if _src_image_ and _dst_image_ do not use the |
| same image format. |
| * {CL_INVALID_VALUE} if _src_origin_, _dst_origin_, or _region_ is `NULL`. |
| * {CL_INVALID_VALUE} if the 2D or 3D rectangular region specified by |
| _src_origin_ and _src_origin_ {plus} _region_ refers to a region outside |
| _src_image_, or if the 2D or 3D rectangular region specified by |
| _dst_origin_ and _dst_origin_ {plus} _region_ refers to a region outside |
| _dst_image_. |
| * {CL_INVALID_VALUE} if values in _src_origin_, _dst_origin_ and _region_ do |
| not follow rules described in the argument description for _src_origin_, |
| _dst_origin_ and _region_. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height, |
| specified or computed row and/or slice pitch) for _src_image_ or |
| _dst_image_ are not supported by the device associated with |
| _command_queue_. |
| * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order |
| and data type) for _src_image_ or _dst_image_ is not supported by the |
| device associated with _command_queue_. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for data store associated with _src_image_ or _dst_image_. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does |
| not support images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the |
| <<device-queries-table,Device Queries>> table is {CL_FALSE}). |
| * {CL_MEM_COPY_OVERLAP} if _src_image_ and _dst_image_ are the same image |
| object and the source and destination regions overlap. |
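
One way an application might test for the overlap condition behind {CL_MEM_COPY_OVERLAP} before enqueueing a copy within a single image is an axis-by-axis interval test. This is a sketch, not the check an implementation is required to perform: `ranges_overlap` and `regions_overlap` are hypothetical helpers, and the sketch assumes that origin plus region does not overflow `size_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if the half-open ranges [a, a + len) and [b, b + len) intersect. */
static bool ranges_overlap(size_t a, size_t b, size_t len)
{
    return a < b + len && b < a + len;
}

/* True if two equally sized boxes in the same image share at least one
   pixel, i.e. their ranges intersect on every axis. */
static bool regions_overlap(const size_t src_origin[3],
                            const size_t dst_origin[3],
                            const size_t region[3])
{
    assert(src_origin != NULL && dst_origin != NULL && region != NULL);
    return ranges_overlap(src_origin[0], dst_origin[0], region[0]) &&
           ranges_overlap(src_origin[1], dst_origin[1], region[1]) &&
           ranges_overlap(src_origin[2], dst_origin[2], region[2]);
}
```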
| -- |
| |
| |
| === Filling Image Objects |
| |
| NOTE: Filling image objects is <<unified-spec, missing before>> version 1.2. |
| |
| [open,refpage='clEnqueueFillImage',desc='Enqueues a command to fill an image object with a specified color.',type='protos'] |
| -- |
| To enqueue a command to fill an image object with a specified color, call |
| the function |
| |
| include::{generated}/api/protos/clEnqueueFillImage.txt[] |
| include::{generated}/api/version-notes/clEnqueueFillImage.asciidoc[] |
| |
| * _command_queue_ refers to the host command-queue in which the fill command |
| will be queued. |
| The OpenCL context associated with _command_queue_ and _image_ must be the |
| same. |
| * _image_ is a valid image object. |
| * _fill_color_ is the color used to fill the image. |
| The fill color is a single floating-point value if the channel order is |
| {CL_DEPTH}. |
| Otherwise, the fill color is a four component value whose type depends on |
| the _image_ channel data type: a signed integer value if it is an |
| unnormalized signed integer type, an unsigned integer value if it is an |
| unnormalized unsigned integer type, and an RGBA floating-point color value |
| in all other cases. |
| The fill color will be converted to the appropriate image channel format and |
| order associated with _image_. |
| * _origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or 3D |
| image, the (_x_, _y_) offset and the image index in the 2D image array or |
| the (_x_) offset and the image index in the 1D image array. |
| If _image_ is a 2D image object, _origin_[2] must be 0. |
| If _image_ is a 1D image or 1D image buffer object, _origin_[1] and |
| _origin_[2] must be 0. |
| If _image_ is a 1D image array object, _origin_[2] must be 0. |
| If _image_ is a 1D image array object, _origin_[1] describes the image index |
| in the 1D image array. |
| If _image_ is a 2D image array object, _origin_[2] describes the image index |
| in the 2D image array. |
| * _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or |
| 3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the |
| number of images of a 2D image array or the (_width_) in pixels of the 1D |
| rectangle and the number of images of a 1D image array. |
| If _image_ is a 2D image object, _region_[2] must be 1. |
| If _image_ is a 1D image or 1D image buffer object, _region_[1] and |
| _region_[2] must be 1. |
| If _image_ is a 1D image array object, _region_[2] must be 1. |
| The values in _region_ cannot be 0. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this command and |
| can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
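
The rule for how _fill_color_ is interpreted can be summarized as a small decision function. The sketch below is illustrative only: `channel_class`, `fill_color_kind` and `fill_color_kind_for` are hypothetical names, and the caller is assumed to have already classified the image channel order and channel data type.

```c
#include <stdbool.h>

/* Hypothetical classification of the image channel data type. */
enum channel_class { CH_UNNORM_SIGNED_INT, CH_UNNORM_UNSIGNED_INT, CH_OTHER };

/* How the memory behind fill_color is interpreted. */
enum fill_color_kind { FILL_FLOAT1, FILL_FLOAT4, FILL_INT4, FILL_UINT4 };

static enum fill_color_kind fill_color_kind_for(bool channel_order_is_depth,
                                                enum channel_class cls)
{
    if (channel_order_is_depth)
        return FILL_FLOAT1;           /* single floating-point value */
    switch (cls) {
    case CH_UNNORM_SIGNED_INT:   return FILL_INT4;   /* four signed ints   */
    case CH_UNNORM_UNSIGNED_INT: return FILL_UINT4;  /* four unsigned ints */
    case CH_OTHER:               break;
    }
    return FILL_FLOAT4;               /* four component RGBA float color */
}
```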
| |
| {clEnqueueFillImage} ignores the usage information given by the |
| {cl_mem_flags_TYPE} argument value specified when _image_ is created, which |
| indicates whether the memory object can be read or written by a kernel |
| and/or the host. |
| |
| // refError |
| |
| {clEnqueueFillImage} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and |
| _image_ are not the same or if the context associated with |
| _command_queue_ and events in _event_wait_list_ are not the same. |
| * {CL_INVALID_MEM_OBJECT} if _image_ is not a valid image object. |
| * {CL_INVALID_VALUE} if _fill_color_ is `NULL`. |
| * {CL_INVALID_VALUE} if _origin_ or _region_ is `NULL`. |
| * {CL_INVALID_VALUE} if the region being filled as specified by _origin_ and |
| _region_ is out of bounds. |
| * {CL_INVALID_VALUE} if values in _origin_ and _region_ do not follow rules |
| described in the argument description for _origin_ and _region_. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height, |
| specified or computed row and/or slice pitch) for _image_ are not |
| supported by the device associated with _command_queue_. |
| * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order |
| and data type) for _image_ is not supported by the device associated with |
| _command_queue_. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for data store associated with _image_. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| |
| === Copying between Image and Buffer Objects |
| |
| [open,refpage='clEnqueueCopyImageToBuffer',desc='Enqueues a command to copy an image object to a buffer object.',type='protos'] |
| -- |
| To enqueue a command to copy an image object to a buffer object, call the |
| function |
| |
| include::{generated}/api/protos/clEnqueueCopyImageToBuffer.txt[] |
| include::{generated}/api/version-notes/clEnqueueCopyImageToBuffer.asciidoc[] |
| |
| * _command_queue_ must be a valid host command-queue. |
| The OpenCL context associated with _command_queue_, _src_image_ and |
| _dst_buffer_ must be the same. |
| * _src_image_ is a valid image object. |
| * _dst_buffer_ is a valid buffer object. |
| * _src_origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or |
| 3D image, the (_x_, _y_) offset and the image index in the 2D image array or |
| the (_x_) offset and the image index in the 1D image array. |
| If _src_image_ is a 2D image object, _src_origin_[2] must be 0. |
| If _src_image_ is a 1D image or 1D image buffer object, _src_origin_[1] and |
| _src_origin_[2] must be 0. |
| If _src_image_ is a 1D image array object, _src_origin_[2] must be 0. |
| If _src_image_ is a 1D image array object, _src_origin_[1] describes the |
| image index in the 1D image array. |
| If _src_image_ is a 2D image array object, _src_origin_[2] describes the |
| image index in the 2D image array. |
| * _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or |
| 3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the |
| number of images of a 2D image array or the (_width_) in pixels of the 1D |
| rectangle and the number of images of a 1D image array. |
| If _src_image_ is a 2D image object, _region_[2] must be 1. |
| If _src_image_ is a 1D image or 1D image buffer object, _region_[1] and |
| _region_[2] must be 1. |
| If _src_image_ is a 1D image array object, _region_[2] must be 1. |
| The values in _region_ cannot be 0. |
| * _dst_offset_ is the offset in bytes at which to begin copying data into |
| _dst_buffer_. |
| The size in bytes of the region to be copied, referred to as _dst_cb_, is |
| computed as: |
| ** _width_ {times} _height_ {times} _depth_ {times} _bytes/image element_ |
| if _src_image_ is a 3D image object. |
| ** _width_ {times} _height_ {times} _bytes/image element_ if _src_image_ is |
| a 2D image. |
| ** _width_ {times} _height_ {times} _arraysize_ {times} |
| _bytes/image element_ if _src_image_ is a 2D image array object. |
| ** _width_ {times} _bytes/image element_ if _src_image_ is a 1D image or 1D |
| image buffer object. |
| ** _width_ {times} _arraysize_ {times} _bytes/image element_ if _src_image_ |
| is a 1D image array object. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this copy command |
| and can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
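
Because the unused entries of _region_ must be 1 and the array size occupies the remaining entry, the per-type formulas for _dst_cb_ reduce to a single product when _region_ already follows the rules in its argument description. The sketch below relies on that assumption; `region_size_in_bytes` is a hypothetical helper, not an OpenCL function.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes covered by `region`, assuming it already follows the per-type rules
   (entries that do not apply are 1), so one product covers every image type. */
static size_t region_size_in_bytes(const size_t region[3],
                                   size_t bytes_per_element)
{
    assert(region != NULL);
    assert(region[0] != 0 && region[1] != 0 && region[2] != 0);
    return region[0] * region[1] * region[2] * bytes_per_element;
}
```

For a 2D image region of (16, 8, 1) with 4 bytes per element this gives 512 bytes; for a 1D image array region of (16, 3, 1), where the second entry is the number of images, it gives 192 bytes.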
| |
| // refError |
| |
| {clEnqueueCopyImageToBuffer} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_, |
| _src_image_ and _dst_buffer_ are not the same or if the context |
| associated with _command_queue_ and events in _event_wait_list_ are not |
| the same. |
| * {CL_INVALID_MEM_OBJECT} if _src_image_ is not a valid image object or |
| _dst_buffer_ is not a valid buffer object or if _src_image_ is a 1D |
| image buffer object created from _dst_buffer_. |
| * {CL_INVALID_VALUE} if _src_origin_ or _region_ is `NULL`. |
| * {CL_INVALID_VALUE} if the 1D, 2D or 3D rectangular region specified by |
| _src_origin_ and _src_origin_ {plus} _region_ refers to a region outside |
| _src_image_, or if the region specified by _dst_offset_ and _dst_offset_ |
| {plus} _dst_cb_ refers to a region outside _dst_buffer_. |
| * {CL_INVALID_VALUE} if values in _src_origin_ and _region_ do not follow |
| rules described in the argument description for _src_origin_ and |
| _region_. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _dst_buffer_ is a sub-buffer object |
| and the _offset_ specified when the sub-buffer object is created is not |
| aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device |
| associated with _command_queue_. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height, |
| specified or computed row and/or slice pitch) for _src_image_ are not |
| supported by the device associated with _command_queue_. |
| * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order |
| and data type) for _src_image_ is not supported by the device associated |
| with _command_queue_. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for data store associated with _src_image_ or _dst_buffer_. |
| * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does |
| not support images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the |
| <<device-queries-table,Device Queries>> table is {CL_FALSE}). |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
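
An application can check the buffer side of the {CL_INVALID_VALUE} condition above before enqueueing the copy. The sketch below is one way to do so without computing _dst_offset_ {plus} _dst_cb_ directly, so the test cannot wrap around; `range_fits_in_buffer` is a hypothetical helper and `buffer_size` is assumed to be the size in bytes of _dst_buffer_.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if the byte range [offset, offset + cb) lies inside a buffer of
   buffer_size bytes. The subtraction is safe because offset <= buffer_size
   is tested first. */
static bool range_fits_in_buffer(size_t offset, size_t cb, size_t buffer_size)
{
    return offset <= buffer_size && cb <= buffer_size - offset;
}
```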
| -- |
| |
| |
| [open,refpage='clEnqueueCopyBufferToImage',desc='Enqueues a command to copy a buffer object to an image object.',type='protos'] |
| -- |
| To enqueue a command to copy a buffer object to an image object, call the |
| function |
| |
| include::{generated}/api/protos/clEnqueueCopyBufferToImage.txt[] |
| include::{generated}/api/version-notes/clEnqueueCopyBufferToImage.asciidoc[] |
| |
| * _command_queue_ must be a valid host command-queue. |
| The OpenCL context associated with _command_queue_, _src_buffer_ and |
| _dst_image_ must be the same. |
| * _src_buffer_ is a valid buffer object. |
| * _dst_image_ is a valid image object. |
| * _src_offset_ is the offset in bytes at which to begin copying data from |
| _src_buffer_. |
| * _dst_origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or |
| 3D image, the (_x_, _y_) offset and the image index in the 2D image array or |
| the (_x_) offset and the image index in the 1D image array. |
| If _dst_image_ is a 2D image object, _dst_origin_[2] must be 0. |
| If _dst_image_ is a 1D image or 1D image buffer object, _dst_origin_[1] and |
| _dst_origin_[2] must be 0. |
| If _dst_image_ is a 1D image array object, _dst_origin_[2] must be 0. |
| If _dst_image_ is a 1D image array object, _dst_origin_[1] describes the |
| image index in the 1D image array. |
| If _dst_image_ is a 2D image array object, _dst_origin_[2] describes the |
| image index in the 2D image array. |
| * _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or |
| 3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the |
| number of images of a 2D image array or the (_width_) in pixels of the 1D |
| rectangle and the number of images of a 1D image array. |
| If _dst_image_ is a 2D image object, _region_[2] must be 1. |
| If _dst_image_ is a 1D image or 1D image buffer object, _region_[1] and |
| _region_[2] must be 1. |
| If _dst_image_ is a 1D image array object, _region_[2] must be 1. |
| The values in _region_ cannot be 0. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this copy command |
| and can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
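
The pairing rule for _event_wait_list_ and _num_events_in_wait_list_ described above can be sketched as a small host-side check.
This is only an illustration: the helper name `sketch_wait_list_args_ok` is hypothetical and not part of the OpenCL API, and the check does not validate the event objects themselves.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when the two wait-list arguments are
   consistent with each other (NULL list <=> zero count). An inconsistent
   pair is what the API reports as CL_INVALID_EVENT_WAIT_LIST. */
bool sketch_wait_list_args_ok(const void *event_wait_list,
                              unsigned int num_events_in_wait_list)
{
    if (event_wait_list == NULL)
        return num_events_in_wait_list == 0;
    return num_events_in_wait_list > 0;
}
```

An application could run such a check in debug builds before calling an enqueue function, so that mismatched arguments are caught close to where they were assembled.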
| |
| The size in bytes of the region to be copied from _src_buffer_, referred to |
| as _src_cb_, is computed according to the type of _dst_image_: |
| |
| * _width_ {times} _height_ {times} _depth_ {times} _bytes/image element_ if |
| _dst_image_ is a 3D image object. |
| * _width_ {times} _height_ {times} _bytes/image element_ if _dst_image_ is a |
| 2D image. |
| * _width_ {times} _height_ {times} _arraysize_ {times} _bytes/image element_ |
| if _dst_image_ is a 2D image array object. |
| * _width_ {times} _bytes/image element_ if _dst_image_ is a 1D image or 1D |
| image buffer object. |
| * _width_ {times} _arraysize_ {times} _bytes/image element_ if _dst_image_ is |
| a 1D image array object. |
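
As a rough illustration of the _src_cb_ computation, the following sketch derives the byte count from _region_ and the element size.
The enum and function names are hypothetical (real code would switch on the `cl_mem_object_type` values from `CL/cl.h`), and the sketch assumes _region_ already follows the rules given in the argument description, so that _arraysize_ is _region_[1] for a 1D image array and _region_[2] for a 2D image array.

```c
#include <stddef.h>

/* Hypothetical image kinds used only by this sketch. */
typedef enum {
    SKETCH_IMAGE_1D,        /* also covers a 1D image buffer */
    SKETCH_IMAGE_1D_ARRAY,
    SKETCH_IMAGE_2D,
    SKETCH_IMAGE_2D_ARRAY,
    SKETCH_IMAGE_3D
} sketch_image_kind;

/* Returns src_cb in bytes for a copy described by region[3].
   element_size is the number of bytes per image element. */
size_t sketch_src_cb(sketch_image_kind kind, const size_t region[3],
                     size_t element_size)
{
    switch (kind) {
    case SKETCH_IMAGE_1D:
        return region[0] * element_size;
    case SKETCH_IMAGE_1D_ARRAY:   /* region[1] is the number of images */
    case SKETCH_IMAGE_2D:         /* region[1] is the height */
        return region[0] * region[1] * element_size;
    case SKETCH_IMAGE_2D_ARRAY:   /* region[2] is the number of images */
    case SKETCH_IMAGE_3D:         /* region[2] is the depth */
        return region[0] * region[1] * region[2] * element_size;
    }
    return 0;
}
```

The result can be compared against the size of _src_buffer_ minus _src_offset_ to anticipate the out-of-bounds condition before enqueuing the copy.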
| |
| // refError |
| |
| {clEnqueueCopyBufferToImage} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_, |
| _src_buffer_ and _dst_image_ are not the same or if the context |
| associated with _command_queue_ and events in _event_wait_list_ are not |
| the same. |
| * {CL_INVALID_MEM_OBJECT} if _src_buffer_ is not a valid buffer object or |
| _dst_image_ is not a valid image object or if _dst_image_ is a 1D image |
| buffer object created from _src_buffer_. |
| * {CL_INVALID_VALUE} if _dst_origin_ or _region_ is `NULL`. |
| * {CL_INVALID_VALUE} if the 1D, 2D or 3D rectangular region specified by |
| _dst_origin_ and _dst_origin_ + _region_ refer to a region outside |
| _dst_image_, or if the region specified by _src_offset_ and _src_offset_ |
| + _src_cb_ refer to a region outside _src_buffer_. |
| * {CL_INVALID_VALUE} if values in _dst_origin_ and _region_ do not follow |
| rules described in the argument description for _dst_origin_ and |
| _region_. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if _src_buffer_ is a sub-buffer object |
| and the _offset_ specified when the sub-buffer object was created is not |
| aligned to the {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device |
| associated with _command_queue_. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height, |
| specified or computed row and/or slice pitch) for _dst_image_ are not |
| supported by the device associated with _command_queue_. |
| * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order |
| and data type) for _dst_image_ is not supported by the device associated |
| with _command_queue_. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for data store associated with _src_buffer_ or _dst_image_. |
| * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does |
| not support images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the |
| <<device-queries-table,Device Queries>> table is {CL_FALSE}). |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| |
| === Mapping Image Objects |
| |
| [open,refpage='clEnqueueMapImage',desc='Enqueues a command to map a region of an image object into the host address space and returns a pointer to this mapped region.',type='protos'] |
| -- |
| To enqueue a command to map a region of the image object given by _image_ |
| into the host address space and return a pointer to this mapped region, |
| call the function |
| |
| include::{generated}/api/protos/clEnqueueMapImage.txt[] |
| include::{generated}/api/version-notes/clEnqueueMapImage.asciidoc[] |
| |
| * _command_queue_ must be a valid host command-queue. |
| * _image_ is a valid image object. |
| The OpenCL context associated with _command_queue_ and _image_ must be the |
| same. |
| * _blocking_map_ indicates if the map operation is _blocking_ or |
| _non-blocking_. |
| * _map_flags_ is a bit-field and is described in the |
| <<memory-map-flags-table,Memory Map Flags>> table. |
| * _origin_ defines the (_x_, _y_, _z_) offset in pixels in the 1D, 2D or 3D |
| image, the (_x_, _y_) offset and the image index in the 2D image array or |
| the (_x_) offset and the image index in the 1D image array. |
| If _image_ is a 2D image object, _origin_[2] must be 0. |
| If _image_ is a 1D image or 1D image buffer object, _origin_[1] and |
| _origin_[2] must be 0. |
| If _image_ is a 1D image array object, _origin_[2] must be 0. |
| If _image_ is a 1D image array object, _origin_[1] describes the image index |
| in the 1D image array. |
| If _image_ is a 2D image array object, _origin_[2] describes the image index |
| in the 2D image array. |
| * _region_ defines the (_width_, _height_, _depth_) in pixels of the 1D, 2D or |
| 3D rectangle, the (_width_, _height_) in pixels of the 2D rectangle and the |
| number of images of a 2D image array or the (_width_) in pixels of the 1D |
| rectangle and the number of images of a 1D image array. |
| If _image_ is a 2D image object, _region_[2] must be 1. |
| If _image_ is a 1D image or 1D image buffer object, _region_[1] and |
| _region_[2] must be 1. |
| If _image_ is a 1D image array object, _region_[2] must be 1. |
| The values in _region_ cannot be 0. |
| * _image_row_pitch_ returns the scan-line pitch in bytes for the mapped |
| region. |
| This must be a non-`NULL` value. |
| * _image_slice_pitch_ returns the size in bytes of each 2D slice of a 3D image |
| or the size of each 1D or 2D image in a 1D or 2D image array for the mapped |
| region. |
| For a 1D and 2D image, zero is returned if this argument is not `NULL`. |
| For a 3D image, 1D and 2D image array, _image_slice_pitch_ must be a |
| non-`NULL` value. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before {clEnqueueMapImage} can be executed. |
| If _event_wait_list_ is `NULL`, then {clEnqueueMapImage} does not wait on |
| any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this command and |
| can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| * _errcode_ret_ will return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
| |
| If _blocking_map_ is {CL_TRUE}, {clEnqueueMapImage} does not return until the |
| specified region in _image_ is mapped into the host address space and the |
| application can access the contents of the mapped region using the pointer |
| returned by {clEnqueueMapImage}. |
| |
| If _blocking_map_ is {CL_FALSE}, i.e. the map operation is non-blocking, the |
| pointer to the mapped region returned by {clEnqueueMapImage} cannot be used |
| until the map command has completed. |
| The _event_ argument returns an event object which can be used to query the |
| execution status of the map command. |
| When the map command is completed, the application can access the contents |
| of the mapped region using the pointer returned by {clEnqueueMapImage}. |
| |
| // refError |
| |
| {clEnqueueMapImage} returns a pointer to the mapped region and sets |
| _errcode_ret_ to {CL_SUCCESS} if the function is executed successfully. |
| |
| A `NULL` pointer is returned otherwise with one of the following error |
| values returned in _errcode_ret_: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and |
| _image_ are not the same or if the context associated with _command_queue_ |
| and events in _event_wait_list_ are not the same. |
| * {CL_INVALID_MEM_OBJECT} if _image_ is not a valid image object. |
| * {CL_INVALID_VALUE} if _origin_ or _region_ is `NULL`. |
| * {CL_INVALID_VALUE} if region being mapped given by (_origin_, |
| _origin_ + _region_) is out of bounds or if values specified in _map_flags_ |
| are not valid. |
| * {CL_INVALID_VALUE} if values in _origin_ and _region_ do not follow rules |
| described in the argument description for _origin_ and _region_. |
| * {CL_INVALID_VALUE} if _image_row_pitch_ is `NULL`. |
| * {CL_INVALID_VALUE} if _image_ is a 3D image, 1D or 2D image array object |
| and _image_slice_pitch_ is `NULL`. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_INVALID_IMAGE_SIZE} if image dimensions (image width, height, |
| specified or computed row and/or slice pitch) for _image_ are not |
| supported by the device associated with _command_queue_. |
| * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if the image format (image channel order |
| and data type) for _image_ is not supported by the device associated with |
| _command_queue_. |
| * {CL_MAP_FAILURE} if there is a failure to map the requested region into |
| the host address space. |
| This error cannot occur for image objects created with |
| {CL_MEM_USE_HOST_PTR} or {CL_MEM_ALLOC_HOST_PTR}. |
| * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the map operation is |
| blocking and the execution status of any of the events in |
| _event_wait_list_ is a negative integer value. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for data store associated with _image_. |
| * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does |
| not support images (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the |
| <<device-queries-table,Device Queries>> table is {CL_FALSE}). |
| * {CL_INVALID_OPERATION} if _image_ has been created with |
| {CL_MEM_HOST_WRITE_ONLY} or {CL_MEM_HOST_NO_ACCESS} and {CL_MAP_READ} is set |
| in _map_flags_ or if _image_ has been created with {CL_MEM_HOST_READ_ONLY} |
| or {CL_MEM_HOST_NO_ACCESS} and {CL_MAP_WRITE} or |
| {CL_MAP_WRITE_INVALIDATE_REGION} is set in _map_flags_. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| * {CL_INVALID_OPERATION} if mapping would lead to overlapping regions being |
| mapped for writing. |
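
The interaction between the host-access flags of _image_ and _map_flags_ listed above can be sketched as a predicate.
The `SK_` constants below are local stand-ins so that the sketch compiles without OpenCL headers; their values are believed to match the corresponding `CL_MEM_HOST_*` and `CL_MAP_*` definitions in `CL/cl.h`, but real code should use the header's own constants. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the cl.h bit-field values (assumed, see lead-in). */
#define SK_MEM_HOST_WRITE_ONLY          (1u << 7)
#define SK_MEM_HOST_READ_ONLY           (1u << 8)
#define SK_MEM_HOST_NO_ACCESS           (1u << 9)
#define SK_MAP_READ                     (1u << 0)
#define SK_MAP_WRITE                    (1u << 1)
#define SK_MAP_WRITE_INVALIDATE_REGION  (1u << 2)

/* Returns false for the flag combinations that the error list above
   reports as CL_INVALID_OPERATION. */
bool sketch_map_allowed(uint64_t mem_flags, uint64_t map_flags)
{
    bool host_cannot_read =
        (mem_flags & (SK_MEM_HOST_WRITE_ONLY | SK_MEM_HOST_NO_ACCESS)) != 0;
    bool host_cannot_write =
        (mem_flags & (SK_MEM_HOST_READ_ONLY | SK_MEM_HOST_NO_ACCESS)) != 0;

    if (host_cannot_read && (map_flags & SK_MAP_READ))
        return false;
    if (host_cannot_write &&
        (map_flags & (SK_MAP_WRITE | SK_MAP_WRITE_INVALIDATE_REGION)))
        return false;
    return true;
}
```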
| |
| The pointer returned maps a 1D, 2D or 3D region starting at _origin_ and is |
| at least _region_[0] pixels in size for a 1D image, 1D image buffer or 1D |
| image array, (_image_row_pitch {times} region[1])_ pixels in size for a 2D |
| image or 2D image array, and (_image_slice_pitch {times} region[2])_ pixels |
| in size for a 3D image. |
| The result of a memory access outside this region is undefined. |
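
To show how the returned pitches are typically used, the following sketch computes the byte offset of one image element inside the mapped region.
The function name is hypothetical; it assumes coordinates are given relative to _origin_, that _element_size_ is the value reported for {CL_IMAGE_ELEMENT_SIZE}, and that for image arrays the image index is passed as `z` (with `y` set to 0 for a 1D image array).

```c
#include <stddef.h>

/* Byte offset of element (x, y, z) from the pointer returned by the map.
   row_pitch and slice_pitch are the values returned in image_row_pitch
   and image_slice_pitch; slice_pitch is 0 for 1D and 2D images, in which
   case z must be 0. */
size_t sketch_mapped_offset(size_t x, size_t y, size_t z,
                            size_t element_size,
                            size_t row_pitch, size_t slice_pitch)
{
    return z * slice_pitch + y * row_pitch + x * element_size;
}
```

Using the returned pitches rather than _width_ {times} _element size_ matters because an implementation may pad rows or slices in the mapped region.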
| |
| If the image object is created with {CL_MEM_USE_HOST_PTR} set in _mem_flags_, |
| the following will be true: |
| |
| * The _host_ptr_ specified in {clCreateImage}, {clCreateImageWithProperties}, |
| {clCreateImage2D}, or {clCreateImage3D} is guaranteed to contain the |
| latest bits in the region being mapped when the {clEnqueueMapImage} command |
| has completed. |
| * The pointer value returned by {clEnqueueMapImage} will be derived from |
| the _host_ptr_ specified when the image object is created. |
| |
| Mapped image objects are unmapped using {clEnqueueUnmapMemObject}. |
| This is described in <<unmapping-mapped-memory, Unmapping Mapped Memory |
| Objects>>. |
| -- |
| |
| |
| [[image-object-queries]] |
| === Image Object Queries |
| |
| To get information that is common to all memory objects, use the |
| {clGetMemObjectInfo} function described in <<memory-object-queries, Memory |
| Object Queries>>. |
| |
| [open,refpage='clGetImageInfo',desc='Get information specific to an image object created with clCreateImage.',type='protos'] |
| -- |
| To get information specific to an image object created with {clCreateImage}, |
| {clCreateImageWithProperties}, {clCreateImage2D}, or {clCreateImage3D} call |
| the function |
| |
| include::{generated}/api/protos/clGetImageInfo.txt[] |
| include::{generated}/api/version-notes/clGetImageInfo.asciidoc[] |
| |
| * _image_ specifies the image object being queried. |
| * _param_name_ specifies the information to query. |
| The list of supported _param_name_ types and the information returned in |
| _param_value_ by {clGetImageInfo} is described in the |
| <<image-info-table,Image Object Queries>> table. |
| * _param_value_ is a pointer to memory where the appropriate result being |
| queried is returned. |
| If _param_value_ is `NULL`, it is ignored. |
| * _param_value_size_ is used to specify the size in bytes of memory pointed to |
| by _param_value_. |
| This size must be {geq} size of return type as described in the |
| <<image-info-table,Image Object Queries>> table. |
| * _param_value_size_ret_ returns the actual size in bytes of data being |
| queried by _param_name_. |
| If _param_value_size_ret_ is `NULL`, it is ignored. |
| |
| [[image-info-table]] |
| .List of supported param_names by <<clGetImageInfo>> |
| [width="100%",cols="<33%,<17%,<50%",options="header"] |
| |==== |
| | Image Info | Return type | Description |
| | {CL_IMAGE_FORMAT_anchor} |
| |
| include::{generated}/api/version-notes/CL_IMAGE_FORMAT.asciidoc[] |
| | {cl_image_format_TYPE} |
| | Return the image format descriptor specified when _image_ is created |
| with {clCreateImage}, {clCreateImageWithProperties}, |
| {clCreateImage2D} or {clCreateImage3D}. |
| | {CL_IMAGE_ELEMENT_SIZE_anchor} |
| |
| include::{generated}/api/version-notes/CL_IMAGE_ELEMENT_SIZE.asciidoc[] |
| | {size_t_TYPE} |
| | Return size of each element of the image memory object given by |
| _image_ in bytes. |
| | {CL_IMAGE_ROW_PITCH_anchor} |
| |
| include::{generated}/api/version-notes/CL_IMAGE_ROW_PITCH.asciidoc[] |
| | {size_t_TYPE} |
| | Return calculated row pitch in bytes of a row of elements of the |
| image object given by _image_. |
| | {CL_IMAGE_SLICE_PITCH_anchor} |
| |
| include::{generated}/api/version-notes/CL_IMAGE_SLICE_PITCH.asciidoc[] |
| | {size_t_TYPE} |
| | Return calculated slice pitch in bytes of a 2D slice for the 3D |
| image object or size of each image in a 1D or 2D image array given |
| by _image_. |
| For a 1D image, 1D image buffer and 2D image object return 0. |
| | {CL_IMAGE_WIDTH_anchor} |
| |
| include::{generated}/api/version-notes/CL_IMAGE_WIDTH.asciidoc[] |
| | {size_t_TYPE} |
| | Return width of the image in pixels. |
| | {CL_IMAGE_HEIGHT_anchor} |
| |
| include::{generated}/api/version-notes/CL_IMAGE_HEIGHT.asciidoc[] |
| | {size_t_TYPE} |
| | Return height of the image in pixels. |
| For a 1D image, 1D image buffer and 1D image array object, height = |
| 0. |
| | {CL_IMAGE_DEPTH_anchor} |
| |
| include::{generated}/api/version-notes/CL_IMAGE_DEPTH.asciidoc[] |
| | {size_t_TYPE} |
| | Return depth of the image in pixels. |
| For a 1D image, 1D image buffer, 2D image or 1D and 2D image array |
| object, depth = 0. |
| | {CL_IMAGE_ARRAY_SIZE_anchor} |
| |
| include::{generated}/api/version-notes/CL_IMAGE_ARRAY_SIZE.asciidoc[] |
| | {size_t_TYPE} |
| | Return number of images in the image array. |
| If _image_ is not an image array, 0 is returned. |
| | {CL_IMAGE_BUFFER_anchor} |
| |
| include::{generated}/api/version-notes/CL_IMAGE_BUFFER.asciidoc[] |
| | {cl_mem_TYPE} |
| | Return buffer object associated with _image_. |
| | {CL_IMAGE_NUM_MIP_LEVELS_anchor} |
| |
| include::{generated}/api/version-notes/CL_IMAGE_NUM_MIP_LEVELS.asciidoc[] |
| | {cl_uint_TYPE} |
| | Return `num_mip_levels` associated with _image_. |
| | {CL_IMAGE_NUM_SAMPLES_anchor} |
| |
| include::{generated}/api/version-notes/CL_IMAGE_NUM_SAMPLES.asciidoc[] |
| | {cl_uint_TYPE} |
| | Return `num_samples` associated with _image_. |
| |==== |
| |
| // refError |
| |
| {clGetImageInfo} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes |
| specified by _param_value_size_ is < size of return type as described in |
| the <<image-info-table,Image Object Queries>> table and _param_value_ is |
| not `NULL`. |
| * {CL_INVALID_MEM_OBJECT} if _image_ is not a valid image object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| |
| == Pipes |
| |
| NOTE: Pipes are <<unified-spec, missing before>> version 2.0. |
| |
| A _pipe_ is a memory object that stores data organized as a FIFO. |
| Pipe objects can only be accessed using built-in functions that read from |
| and write to a pipe. |
| Pipe objects are not accessible from the host. |
| A pipe object encapsulates the following information: |
| |
| * Packet size in bytes |
| * Maximum capacity in packets |
| * Information about the number of packets currently in the pipe |
| * Data packets |
| |
| |
| === Creating Pipe Objects |
| |
| [open,refpage='clCreatePipe',desc='Creates a pipe object.',type='protos'] |
| -- |
| To create a *pipe object*, call the function |
| |
| include::{generated}/api/protos/clCreatePipe.txt[] |
| include::{generated}/api/version-notes/clCreatePipe.asciidoc[] |
| |
| * _context_ is a valid OpenCL context used to create the pipe object. |
| * _flags_ is a bit-field that is used to specify allocation and usage |
| information such as the memory arena that should be used to allocate the |
| pipe object and how it will be used. |
| The <<memory-flags,Memory Flags>> table describes the possible values for |
| _flags_. |
| Only {CL_MEM_READ_WRITE} and {CL_MEM_HOST_NO_ACCESS} can be specified when |
| creating a pipe object. |
| If the value specified for _flags_ is 0, the default is used which is |
| {CL_MEM_READ_WRITE} | {CL_MEM_HOST_NO_ACCESS}. |
| * _pipe_packet_size_ is the size in bytes of a pipe packet. |
| * _pipe_max_packets_ specifies the pipe capacity by specifying the maximum |
| number of packets the pipe can hold. |
| * _properties_ specifies a list of properties for the pipe and their |
| corresponding values. |
| Each property name is immediately followed by the corresponding desired |
| value. |
| The list is terminated with 0. |
| Currently, in all OpenCL versions, _properties_ must be `NULL`. |
| * _errcode_ret_ will return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
| |
| // refError |
| |
| {clCreatePipe} returns a valid non-zero pipe object and _errcode_ret_ is set |
| to {CL_SUCCESS} if the pipe object is created successfully. |
| Otherwise, it returns a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_OPERATION} if no devices in _context_ support pipes. |
| * {CL_INVALID_VALUE} if values specified in _flags_ are not as defined |
| above. |
| * {CL_INVALID_VALUE} if _properties_ is not `NULL`. |
| * {CL_INVALID_PIPE_SIZE} if _pipe_packet_size_ is 0 or the |
| _pipe_packet_size_ exceeds {CL_DEVICE_PIPE_MAX_PACKET_SIZE} value |
| specified in the <<device-queries-table,Device Queries>> table for all |
| devices in _context_ or if _pipe_max_packets_ is 0. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for the pipe object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
| Pipes follow the same memory consistency model as defined for buffer and |
| image objects. |
| The pipe state, i.e. the contents of the pipe across kernel-instances (on |
| the same or different devices), is enforced at a synchronization point. |
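
The argument rules for {clCreatePipe} can be summarized in a small validation sketch.
All names are hypothetical, the `SK_MEM_*` values are local stand-ins assumed to match `CL/cl.h`, and the order in which the checks are made is a choice of this sketch rather than something the specification defines. The per-device limits are assumed to come from querying {CL_DEVICE_PIPE_MAX_PACKET_SIZE} for each device in _context_.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the cl.h bit-field values (assumed, see lead-in). */
#define SK_MEM_READ_WRITE      (1u << 0)
#define SK_MEM_HOST_NO_ACCESS  (1u << 9)

typedef enum {
    SK_OK,
    SK_INVALID_VALUE,
    SK_INVALID_PIPE_SIZE
} sk_status;

/* Models the flags and size checks described for clCreatePipe. */
sk_status sketch_check_pipe_args(uint64_t flags,
                                 uint32_t pipe_packet_size,
                                 uint32_t pipe_max_packets,
                                 const uint32_t *device_max_packet_size,
                                 size_t num_devices)
{
    const uint64_t allowed = SK_MEM_READ_WRITE | SK_MEM_HOST_NO_ACCESS;

    /* flags == 0 selects the default and is therefore accepted. */
    if (flags & ~allowed)
        return SK_INVALID_VALUE;

    if (pipe_packet_size == 0 || pipe_max_packets == 0)
        return SK_INVALID_PIPE_SIZE;

    /* The size is invalid only if it exceeds the limit of every device. */
    bool fits_some_device = false;
    for (size_t i = 0; i < num_devices; ++i) {
        if (pipe_packet_size <= device_max_packet_size[i])
            fits_some_device = true;
    }
    return fits_some_device ? SK_OK : SK_INVALID_PIPE_SIZE;
}
```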
| -- |
| |
| |
| === Pipe Object Queries |
| |
| To get information that is common to all memory objects, use the |
| {clGetMemObjectInfo} function described in <<memory-object-queries, Memory |
| Object Queries>>. |
| |
| [open,refpage='clGetPipeInfo',desc='Get information specific to a pipe object created with clCreatePipe.',type='protos'] |
| -- |
| To get information specific to a pipe object created with {clCreatePipe}, |
| call the function |
| |
| include::{generated}/api/protos/clGetPipeInfo.txt[] |
| include::{generated}/api/version-notes/clGetPipeInfo.asciidoc[] |
| |
| * _pipe_ specifies the pipe object being queried. |
| * _param_name_ specifies the information to query. |
| The list of supported _param_name_ types and the information returned in |
| _param_value_ by {clGetPipeInfo} is described in the <<pipe-info-table,Pipe |
| Object Queries>> table. |
| * _param_value_ is a pointer to memory where the appropriate result being |
| queried is returned. |
| If _param_value_ is `NULL`, it is ignored. |
| * _param_value_size_ is used to specify the size in bytes of memory pointed to |
| by _param_value_. |
| This size must be {geq} size of return type as described in the |
| <<pipe-info-table,Pipe Object Queries>> table. |
| * _param_value_size_ret_ returns the actual size in bytes of data being |
| queried by _param_name_. |
| If _param_value_size_ret_ is `NULL`, it is ignored. |
| |
| // refError |
| |
| {clGetPipeInfo} returns {CL_SUCCESS} if the function is executed successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_MEM_OBJECT} if _pipe_ is not a valid pipe object. |
| * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes |
| specified by _param_value_size_ is < size of return type as described in |
| the <<pipe-info-table,Pipe Object Queries>> table and _param_value_ is |
| not `NULL`. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
| [[pipe-info-table]] |
| .List of supported param_names by <<clGetPipeInfo>> |
| [width="100%",cols="<33%,<17%,<50%",options="header"] |
| |==== |
| | Pipe Info | Return type | Description |
| | {CL_PIPE_PACKET_SIZE_anchor} |
| |
| include::{generated}/api/version-notes/CL_PIPE_PACKET_SIZE.asciidoc[] |
| | {cl_uint_TYPE} |
| | Return pipe packet size specified when _pipe_ is created with |
| {clCreatePipe}. |
| | {CL_PIPE_MAX_PACKETS_anchor} |
| |
| include::{generated}/api/version-notes/CL_PIPE_MAX_PACKETS.asciidoc[] |
| | {cl_uint_TYPE} |
| | Return the maximum number of packets specified when _pipe_ is created |
| with {clCreatePipe}. |
| |
| | {CL_PIPE_PROPERTIES_anchor} |
| |
| include::{generated}/api/version-notes/CL_PIPE_PROPERTIES.asciidoc[] |
| | {cl_pipe_properties_TYPE}[] |
| | Return the properties argument specified in {clCreatePipe}. |
| |
| If the _properties_ argument specified in {clCreatePipe} used to |
| create _pipe_ was not `NULL`, the implementation must return the |
| values specified in the properties argument in the same order and |
| without including additional properties. |
| |
| If the _properties_ argument specified in {clCreatePipe} used to |
| create _pipe_ was `NULL`, the implementation must return |
| _param_value_size_ret_ equal to 0, indicating that there are no |
| properties to be returned. |
| |==== |
| -- |
| |
| |
| == Querying, Unmapping, Migrating, Retaining and Releasing Memory Objects |
| |
| // === Handling Memory Objects |
| |
| |
| === Retaining and Releasing Memory Objects |
| |
| [open,refpage='clRetainMemObject',desc='Increments the memory object reference count.',type='protos'] |
| -- |
| To retain a memory object, call the function |
| |
| include::{generated}/api/protos/clRetainMemObject.txt[] |
| include::{generated}/api/version-notes/clRetainMemObject.asciidoc[] |
| |
| * _memobj_ specifies the memory object to be retained. |
| |
| The _memobj_ reference count is incremented. |
| |
| // refError |
| |
| {clRetainMemObject} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_MEM_OBJECT} if _memobj_ is not a valid memory object (buffer |
| or image object). |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
| {clCreateBuffer}, {clCreateBufferWithProperties}, {clCreateSubBuffer}, |
| {clCreateImage}, {clCreateImageWithProperties}, {clCreateImage2D}, |
| {clCreateImage3D} and {clCreatePipe} perform an implicit retain. |
| -- |
| |
| [open,refpage='clReleaseMemObject',desc='Decrements the memory object reference count.',type='protos'] |
| -- |
| To release a memory object, call the function |
| |
| include::{generated}/api/protos/clReleaseMemObject.txt[] |
| include::{generated}/api/version-notes/clReleaseMemObject.asciidoc[] |
| |
| * _memobj_ specifies the memory object to be released. |
| |
| The _memobj_ reference count is decremented. |
| |
| // refError |
| |
| {clReleaseMemObject} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_MEM_OBJECT} if _memobj_ is not a valid memory object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
| After the _memobj_ reference count becomes zero and commands queued for |
| execution on any command-queue that use _memobj_ have finished, the memory |
| object is deleted. |
| If _memobj_ is a buffer object, _memobj_ cannot be deleted until all |
| sub-buffer objects associated with _memobj_ are deleted. |
| Using this function to release a reference that was not obtained by creating |
| the object or by calling {clRetainMemObject} causes undefined behavior. |
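
The deletion conditions described above can be modeled with a toy reference counter.
This is a conceptual sketch only: the `sk_memobj` structure and its functions are hypothetical and do not reflect how any implementation tracks memory objects internally.

```c
#include <stdbool.h>

/* Toy model of the state that decides when a memory object is deleted. */
typedef struct {
    unsigned int refcount;          /* 1 after creation (implicit retain) */
    unsigned int pending_commands;  /* queued commands that still use it */
    unsigned int live_sub_buffers;  /* sub-buffers not yet deleted */
} sk_memobj;

void sk_retain(sk_memobj *m)
{
    m->refcount++;
}

void sk_release(sk_memobj *m)
{
    /* Releasing a reference that was never obtained is undefined behavior
       in OpenCL; this model simply ignores it. */
    if (m->refcount > 0)
        m->refcount--;
}

/* True once all conditions for deletion hold. */
bool sk_can_delete(const sk_memobj *m)
{
    return m->refcount == 0 &&
           m->pending_commands == 0 &&
           m->live_sub_buffers == 0;
}
```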
| -- |
| |
| [open,refpage='clSetMemObjectDestructorCallback',desc='Registers a destructor callback function with a memory object.',type='protos'] |
| -- |
| To register a callback function with a memory object that is called when |
| the memory object is destroyed, call the function |
| |
| include::{generated}/api/protos/clSetMemObjectDestructorCallback.txt[] |
| include::{generated}/api/version-notes/clSetMemObjectDestructorCallback.asciidoc[] |
| |
| * _memobj_ specifies the memory object to register the callback to. |
| * _pfn_notify_ is the callback function to register. |
| This callback function may be called asynchronously by the OpenCL |
| implementation. |
| It is the application's responsibility to ensure that the callback function |
| is thread-safe. |
| The parameters to this callback function are: |
| ** _memobj_ is the memory object being deleted. |
| When the callback function is called by the implementation, this memory |
| object is no longer valid. |
| _memobj_ is only provided for reference purposes. |
| ** _user_data_ is a pointer to user-supplied data. |
| * _user_data_ will be passed as the _user_data_ argument when _pfn_notify_ is |
| called. |
| _user_data_ can be `NULL`. |
| |
| Each call to {clSetMemObjectDestructorCallback} registers the specified |
| callback function on a destructor callback stack associated with _memobj_. |
| The registered callback functions are called in the reverse order in |
| which they were registered. |
| The registered callback functions are called and then the memory object's |
| resources are freed and the memory object is deleted. |
| Therefore, the memory object destructor callback provides a mechanism for |
| an application to safely re-use or free a _host_ptr_ that was specified when |
| _memobj_ was created and used as the storage bits for the memory object. |
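
The reverse-order invocation of destructor callbacks can be pictured as a simple stack.
The sketch below is a model of that ordering, not of the OpenCL API: every `sk_` name is hypothetical, and the callback signature is reduced to a single _user_data_ pointer, whereas the real _pfn_notify_ also receives _memobj_.

```c
#include <stddef.h>

#define SK_MAX_CALLBACKS 8

typedef void (*sk_destructor_fn)(void *user_data);

typedef struct {
    sk_destructor_fn fn[SK_MAX_CALLBACKS];
    void *user_data[SK_MAX_CALLBACKS];
    size_t count;
} sk_callback_stack;

/* Registers a callback; returns 0 on success, -1 if fn is NULL or the
   stack is full. */
int sk_push(sk_callback_stack *s, sk_destructor_fn fn, void *user_data)
{
    if (fn == NULL || s->count == SK_MAX_CALLBACKS)
        return -1;
    s->fn[s->count] = fn;
    s->user_data[s->count] = user_data;
    s->count++;
    return 0;
}

/* Invokes the callbacks most-recently-registered first, then empties
   the stack, mirroring what happens before the object is deleted. */
void sk_run_and_clear(sk_callback_stack *s)
{
    while (s->count > 0) {
        s->count--;
        s->fn[s->count](s->user_data[s->count]);
    }
}

/* Example callback: records the character passed as user_data. */
char sk_log[SK_MAX_CALLBACKS + 1];
size_t sk_log_len;

void sk_record(void *user_data)
{
    if (sk_log_len < SK_MAX_CALLBACKS) {
        sk_log[sk_log_len++] = *(const char *)user_data;
        sk_log[sk_log_len] = '\0';
    }
}
```

Registering `sk_record` three times with the tags `'A'`, `'B'` and `'C'` and then running the stack records them in the order C, B, A.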
| |
| // refError |
| |
| {clSetMemObjectDestructorCallback} returns {CL_SUCCESS} if the function is |
| executed successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_MEM_OBJECT} if _memobj_ is not a valid memory object. |
| * {CL_INVALID_VALUE} if _pfn_notify_ is `NULL`. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
| [NOTE] |
| ==== |
| When the user callback function is called by the implementation, the |
| contents of the memory region pointed to by _host_ptr_ (if the memory object |
| is created with {CL_MEM_USE_HOST_PTR}) are undefined. |
| The callback function is typically used by the application to either free or |
| reuse the memory region pointed to by _host_ptr_. |
| |
| The behavior of calling expensive system routines, OpenCL API calls that |
| create contexts or command-queues, or any of the following blocking OpenCL |
| operations in a callback is undefined: |
| |
| * {clFinish}, |
| * {clWaitForEvents}, |
| * blocking calls to {clEnqueueReadBuffer}, {clEnqueueReadBufferRect}, |
| {clEnqueueWriteBuffer}, {clEnqueueWriteBufferRect}, |
| * blocking calls to {clEnqueueReadImage} and {clEnqueueWriteImage}, |
| * blocking calls to {clEnqueueMapBuffer}, {clEnqueueMapImage}, |
| * blocking calls to {clBuildProgram}, {clCompileProgram} or |
| {clLinkProgram} |
| |
| If an application needs to wait for completion of a routine from the above |
| list in a callback, use the non-blocking form of the function and assign a |
| completion callback to it to do the remainder of the work. |
| Note that when a callback (or other code) enqueues commands to a |
| command-queue, the commands are not required to begin execution until the |
| queue is flushed. |
| In standard usage, blocking enqueue calls serve this role by implicitly |
| flushing the queue. |
| Since blocking calls are not permitted in callbacks, those callbacks that |
| enqueue commands on a command queue should either call {clFlush} on the |
| queue before returning or arrange for {clFlush} to be called later on |
| another thread. |
| |
| The user callback function must not call OpenCL APIs with the memory object |
| for which the callback function is invoked; in such cases the behavior of |
| the OpenCL APIs is undefined. |
| ==== |
| -- |
| |
| |
| [[unmapping-mapped-memory]] |
| === Unmapping Mapped Memory Objects |
| |
| [open,refpage='clEnqueueUnmapMemObject',desc='Enqueues a command to unmap a previously mapped region of a memory object.',type='protos'] |
| -- |
| To enqueue a command to unmap a previously mapped region of a memory object, |
| call the function |
| |
| include::{generated}/api/protos/clEnqueueUnmapMemObject.txt[] |
| include::{generated}/api/version-notes/clEnqueueUnmapMemObject.asciidoc[] |
| |
| * _command_queue_ must be a valid host command-queue. |
| * _memobj_ is a valid memory (buffer or image) object. |
| The OpenCL context associated with _command_queue_ and _memobj_ must be the |
| same. |
| * _mapped_ptr_ is the host address returned by a previous call to |
| {clEnqueueMapBuffer} or {clEnqueueMapImage} for _memobj_. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before {clEnqueueUnmapMemObject} can be executed. |
| If _event_wait_list_ is `NULL`, then {clEnqueueUnmapMemObject} does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this command and |
| can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| |
| Reads or writes from the host using the pointer returned by |
| {clEnqueueMapBuffer} or {clEnqueueMapImage} are considered to be complete. |
| |
| {clEnqueueMapBuffer} and {clEnqueueMapImage} increment the mapped count of |
| the memory object. |
| The initial mapped count value of the memory object is zero. |
| Multiple calls to {clEnqueueMapBuffer} or {clEnqueueMapImage} on the same |
| memory object increment this mapped count once per call. |
| {clEnqueueUnmapMemObject} decrements the mapped count of the memory object. |
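| |
As a rough illustration of the counting rule above, the following self-contained C sketch models the mapped count with a plain counter. The `mem_model` type and the `model_map` and `model_unmap` functions are hypothetical stand-ins introduced for this example and are not OpenCL APIs. The choice to report an error when unmapping an object with no active mapping is an assumption of the sketch.

```c
#include <assert.h>

/* Hypothetical model of a memory object's mapped count (not an OpenCL type). */
typedef struct {
    unsigned map_count; /* starts at zero, as the text states */
} mem_model;

/* Stand-in for a successful clEnqueueMapBuffer or clEnqueueMapImage call. */
static void model_map(mem_model *m) {
    m->map_count += 1;
}

/* Stand-in for clEnqueueUnmapMemObject. Returns 0 on success, or -1 if the
 * object has no active mapping (an assumption of this sketch). */
static int model_unmap(mem_model *m) {
    if (m->map_count == 0) {
        return -1;
    }
    m->map_count -= 1;
    return 0;
}
```

In this model, two map calls followed by two unmap calls return the counter to zero.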
| |
| {clEnqueueMapBuffer} and {clEnqueueMapImage} act as synchronization points |
| for a region of the buffer object being mapped. |
| |
| // refError |
| |
| {clEnqueueUnmapMemObject} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_MEM_OBJECT} if _memobj_ is not a valid memory object or is a |
| pipe object. |
| * {CL_INVALID_VALUE} if _mapped_ptr_ is not a valid pointer returned by |
| {clEnqueueMapBuffer} or {clEnqueueMapImage} for _memobj_. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or if _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| * {CL_INVALID_CONTEXT} if the contexts associated with _command_queue_ and |
| _memobj_ are not the same or if the contexts associated with |
| _command_queue_ and events in _event_wait_list_ are not the same. |
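| |
The pairing rule between _event_wait_list_ and _num_events_in_wait_list_ can be sketched in plain C as shown below. The function name `check_wait_list` and its result codes are hypothetical and local to this example; the check covers only the NULL/count pairing and does not attempt to validate the individual event objects.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative result codes local to this sketch (not the OpenCL constants). */
enum { WAIT_LIST_OK = 0, WAIT_LIST_INVALID = -1 };

/* Checks the pairing rule: a NULL list requires a count of 0, and a
 * non-NULL list requires a count greater than 0. Validity of the
 * individual events is outside the scope of this sketch. */
static int check_wait_list(const void *event_wait_list,
                           unsigned num_events_in_wait_list) {
    if (event_wait_list == NULL && num_events_in_wait_list > 0) {
        return WAIT_LIST_INVALID;
    }
    if (event_wait_list != NULL && num_events_in_wait_list == 0) {
        return WAIT_LIST_INVALID;
    }
    return WAIT_LIST_OK;
}
```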
| -- |
| |
| |
| [[accessing-mapped-regions]] |
| === Accessing mapped regions of a memory object |
| |
| This section describes the behavior of OpenCL commands that access mapped |
| regions of a memory object. |
| |
| The contents of the region of a memory object and associated memory objects |
| (sub-buffer objects or 1D image buffer objects that overlap this region) |
| mapped for writing (i.e. {CL_MAP_WRITE} or {CL_MAP_WRITE_INVALIDATE_REGION} is |
| set in the _map_flags_ argument to {clEnqueueMapBuffer} or {clEnqueueMapImage}) |
| are considered to be undefined until this region is unmapped. |
| |
| Multiple commands in command-queues can map a region or overlapping regions |
| of a memory object and associated memory objects (sub-buffer objects or 1D |
| image buffer objects that overlap this region) for reading (i.e. _map_flags_ |
| = {CL_MAP_READ}). |
| The contents of the regions of a memory object mapped for reading can also |
| be read by kernels and other OpenCL commands (such as {clEnqueueCopyBuffer}) |
| executing on one or more devices. |
| |
| Mapping (and unmapping) overlapped regions in a memory object and/or |
| associated memory objects (sub-buffer objects or 1D image buffer objects |
| that overlap this region) for writing is an error and will result in a |
| {CL_INVALID_OPERATION} error returned by {clEnqueueMapBuffer} or |
| {clEnqueueMapImage}. |
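| |
For buffer objects, whether two mapped regions overlap reduces to a half-open byte-range test. The sketch below is one way an application might check its own map requests before issuing them; the function name `regions_overlap` is hypothetical, the offsets and sizes are assumed to be in bytes relative to the same parent buffer, and image regions (which are multi-dimensional) are not covered.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the byte ranges [a_offset, a_offset + a_size) and
 * [b_offset, b_offset + b_size) share at least one byte, else 0.
 * Assumes offset + size does not overflow size_t. */
static int regions_overlap(size_t a_offset, size_t a_size,
                           size_t b_offset, size_t b_size) {
    if (a_size == 0 || b_size == 0) {
        return 0; /* an empty range cannot overlap anything */
    }
    return a_offset < b_offset + b_size && b_offset < a_offset + a_size;
}
```

Two adjacent regions, such as bytes 0 to 15 and bytes 16 to 31, do not overlap under this test.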
| |
| If a memory object is currently mapped for writing, the application must |
| ensure that the memory object is unmapped before any enqueued kernels or |
| commands that read from or write to this memory object or any of its |
| associated memory objects (sub-buffer or 1D image buffer objects) or its |
| parent object (if the memory object is a sub-buffer or 1D image buffer |
| object) begin execution; otherwise the behavior is undefined. |
| |
| If a memory object is currently mapped for reading, the application must |
| ensure that the memory object is unmapped before any enqueued kernels or |
| commands that write to this memory object or any of its associated memory |
| objects (sub-buffer or 1D image buffer objects) or its parent object (if the |
| memory object is a sub-buffer or 1D image buffer object) begin execution; |
| otherwise the behavior is undefined. |
| |
| A memory object is considered as mapped if there are one or more active |
| mappings for the memory object irrespective of whether the mapped regions |
| span the entire memory object. |
| |
| The behavior of accessing the memory region referred to by a mapped |
| pointer after it has been unmapped is undefined. |
| |
| The mapped pointer returned by {clEnqueueMapBuffer} or {clEnqueueMapImage} |
| can be used as the _ptr_ argument value to {clEnqueueReadBuffer}, |
| {clEnqueueWriteBuffer}, {clEnqueueReadBufferRect}, |
| {clEnqueueWriteBufferRect}, {clEnqueueReadImage}, or |
| {clEnqueueWriteImage} provided the rules described above are adhered to. |
| |
| |
| === Migrating Memory Objects |
| |
| NOTE: Migrating memory objects is <<unified-spec, missing before>> version 1.2. |
| |
| This section describes a mechanism for specifying the device on which an |
| OpenCL memory object resides. |
| A user may wish to have more explicit control over the location of their |
| memory objects on creation. |
| This could be used to: |
| |
| * Ensure that an object is allocated on a specific device prior to usage. |
| * Preemptively migrate an object from one device to another. |
| |
| [open,refpage='clEnqueueMigrateMemObjects',desc='Enqueues a command to indicate which device a set of memory objects should be associated with.',type='protos'] |
| -- |
| To enqueue a command to indicate which device a set of memory objects should |
| be associated with, call the function |
| |
| include::{generated}/api/protos/clEnqueueMigrateMemObjects.txt[] |
| include::{generated}/api/version-notes/clEnqueueMigrateMemObjects.asciidoc[] |
| |
| * _command_queue_ is a valid host command-queue. |
| The specified set of memory objects in _mem_objects_ will be migrated to the |
| OpenCL device associated with _command_queue_ or to the host if |
| {CL_MIGRATE_MEM_OBJECT_HOST} has been specified. |
| * _num_mem_objects_ is the number of memory objects specified in |
| _mem_objects_. |
| * _mem_objects_ is a pointer to a list of memory objects. |
| * _flags_ is a bit-field that is used to specify migration options. |
| The <<migration-flags-table,Memory Migration Flags>> table describes the |
| possible values for _flags_. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this command and |
| can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| |
| [[migration-flags-table]] |
| .List of supported migration flags by <<clEnqueueMigrateMemObjects>> |
| [width="100%",cols="<50%,<50%",options="header"] |
| |==== |
| | Memory Migration Flags | Description |
| | {CL_MIGRATE_MEM_OBJECT_HOST_anchor} |
| |
| include::{generated}/api/version-notes/CL_MIGRATE_MEM_OBJECT_HOST.asciidoc[] |
| | This flag indicates that the specified set of memory objects are to be |
| migrated to the host, regardless of the target command-queue. |
| | {CL_MIGRATE_MEM_OBJECT_CONTENT_UNDEFINED_anchor} |
| |
| include::{generated}/api/version-notes/CL_MIGRATE_MEM_OBJECT_CONTENT_UNDEFINED.asciidoc[] |
| | This flag indicates that the contents of the set of memory objects are |
| undefined after migration. |
| The specified set of memory objects are migrated to the device |
| associated with _command_queue_ without incurring the overhead of |
| migrating their contents. |
| |==== |
| |
| Typically, a memory object is implicitly migrated to the device targeted by |
| the enqueued commands that use it. |
| {clEnqueueMigrateMemObjects} allows this migration to be explicitly |
| performed ahead of the dependent commands. |
| This allows a user to preemptively change the association of a memory |
| object, through regular command queue scheduling, in order to prepare for |
| another upcoming command. |
| This also permits an application to overlap the placement of memory objects |
| with other unrelated operations before these memory objects are needed, |
| potentially hiding transfer latencies. |
| Once the event returned from {clEnqueueMigrateMemObjects} has been marked |
| {CL_COMPLETE}, the memory objects specified in _mem_objects_ have been |
| successfully migrated to the device associated with _command_queue_. |
| The migrated memory object shall remain resident on the device until another |
| command is enqueued that either implicitly or explicitly migrates it away. |
| |
| {clEnqueueMigrateMemObjects} can also be used to direct the initial |
| placement of a memory object, after creation, possibly avoiding the initial |
| overhead of instantiating the object on the first enqueued command to use |
| it. |
| |
| The user is responsible for managing the event dependencies associated with |
| this command in order to avoid overlapping access to memory objects. |
| Improperly specified event dependencies passed to |
| {clEnqueueMigrateMemObjects} could lead to undefined results. |
| |
| // refError |
| |
| {clEnqueueMigrateMemObjects} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and |
| memory objects in _mem_objects_ are not the same or if the context |
| associated with _command_queue_ and events in _event_wait_list_ are not |
| the same. |
| * {CL_INVALID_MEM_OBJECT} if any of the memory objects in _mem_objects_ is |
| not a valid memory object. |
| * {CL_INVALID_VALUE} if _num_mem_objects_ is zero or if _mem_objects_ is |
| `NULL`. |
| * {CL_INVALID_VALUE} if _flags_ is neither 0 nor a combination of the values |
| described in the table above. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for the specified set of memory objects in _mem_objects_. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
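| |
One reading of the _flags_ rule is that a value is acceptable when it is 0 or sets only bits listed in the Memory Migration Flags table. The sketch below encodes that reading. The `SKETCH_` macros and the function `migrate_flags_valid` are hypothetical names local to this example; their bit positions are assumed to match the corresponding definitions in the OpenCL headers and should be checked against the headers in use.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch-local copies of the two migration flag bits. The bit positions are
 * an assumption about the OpenCL headers, not taken from this section. */
#define SKETCH_MIGRATE_HOST              (UINT64_C(1) << 0)
#define SKETCH_MIGRATE_CONTENT_UNDEFINED (UINT64_C(1) << 1)

/* Returns 1 if flags is 0 or any combination of the known bits, else 0. */
static int migrate_flags_valid(uint64_t flags) {
    const uint64_t known =
        SKETCH_MIGRATE_HOST | SKETCH_MIGRATE_CONTENT_UNDEFINED;
    return (flags & ~known) == 0;
}
```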
| -- |
| |
| |
| [[memory-object-queries]] |
| === Memory Object Queries |
| |
| [open,refpage='clGetMemObjectInfo',desc='Get information that is common to all memory objects (buffer and image objects).',type='protos'] |
| -- |
| To get information that is common to all memory objects (buffer and image |
| objects), call the function |
| |
| include::{generated}/api/protos/clGetMemObjectInfo.txt[] |
| include::{generated}/api/version-notes/clGetMemObjectInfo.asciidoc[] |
| |
| * _memobj_ specifies the memory object being queried. |
| * _param_name_ specifies the information to query. |
| The list of supported _param_name_ types and the information returned in |
| _param_value_ by {clGetMemObjectInfo} is described in the |
| <<mem-info-table,Memory Object Info>> table. |
| * _param_value_ is a pointer to memory where the appropriate result being |
| queried is returned. |
| If _param_value_ is `NULL`, it is ignored. |
| * _param_value_size_ is used to specify the size in bytes of memory pointed to |
| by _param_value_. |
| This size must be {geq} size of return type as described in the |
| <<mem-info-table,Memory Object Info>> table. |
| * _param_value_size_ret_ returns the actual size in bytes of data being |
| queried by _param_name_. |
| If _param_value_size_ret_ is `NULL`, it is ignored. |
| |
| [[mem-info-table]] |
| .List of supported param_names by <<clGetMemObjectInfo>> |
| [width="100%",cols="<33%,<17%,<50%",options="header"] |
| |==== |
| | Memory Object Info | Return type | Description |
| | {CL_MEM_TYPE_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_TYPE.asciidoc[] |
| | {cl_mem_object_type_TYPE} |
| | Returns one of the following values: |
| |
| {CL_MEM_OBJECT_BUFFER_anchor} if _memobj_ is created with {clCreateBuffer}, |
| {clCreateBufferWithProperties}, or {clCreateSubBuffer}. |
| |
| {CL_MEM_OBJECT_IMAGE2D} if _memobj_ is created with {clCreateImage2D}. |
| |
| {CL_MEM_OBJECT_IMAGE3D} if _memobj_ is created with {clCreateImage3D}. |
| |
| The value of __image_desc__->__image_type__ if _memobj_ is created with |
| {clCreateImage} or {clCreateImageWithProperties}. |
| |
| {CL_MEM_OBJECT_PIPE_anchor} if _memobj_ is created with {clCreatePipe}. |
| | {CL_MEM_FLAGS_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_FLAGS.asciidoc[] |
| | {cl_mem_flags_TYPE} |
| | Return the flags argument value specified when _memobj_ is created |
| with {clCreateBuffer}, + |
| {clCreateBufferWithProperties}, + |
| {clCreateSubBuffer}, + |
| {clCreateImage}, + |
| {clCreateImageWithProperties}, + |
| {clCreateImage2D}, + |
| {clCreateImage3D}, or + |
| {clCreatePipe}. |
| |
| If _memobj_ is a sub-buffer, the memory access qualifiers inherited |
| from the parent buffer are also returned. |
| | {CL_MEM_SIZE_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_SIZE.asciidoc[] |
| | {size_t_TYPE} |
| | Return actual size of the data store associated with _memobj_ in |
| bytes. |
| | {CL_MEM_HOST_PTR_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_HOST_PTR.asciidoc[] |
| | {void_TYPE}* |
| | If _memobj_ is created with {clCreateBuffer}, {clCreateBufferWithProperties}, |
| {clCreateImage}, {clCreateImageWithProperties}, {clCreateImage2D}, or |
| {clCreateImage3D}, and {CL_MEM_USE_HOST_PTR} is specified in mem_flags, |
| return the host_ptr argument value specified when _memobj_ is created. |
| |
| Otherwise, if _memobj_ is created with {clCreateSubBuffer}, and |
| _memobj_ is created from a buffer that was created with |
| {CL_MEM_USE_HOST_PTR} specified in mem_flags, return the host_ptr |
| passed to {clCreateBuffer} or {clCreateBufferWithProperties}, plus |
| the origin value specified in buffer_create_info when _memobj_ is |
| created. |
| |
| Otherwise, returns `NULL`. |
| | {CL_MEM_MAP_COUNT_anchor} footnote:[{fn-map-count-usage}] |
| |
| include::{generated}/api/version-notes/CL_MEM_MAP_COUNT.asciidoc[] |
| | {cl_uint_TYPE} |
| | Map count. |
| | {CL_MEM_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}] |
| |
| include::{generated}/api/version-notes/CL_MEM_REFERENCE_COUNT.asciidoc[] |
| | {cl_uint_TYPE} |
| | Return _memobj_ reference count. |
| | {CL_MEM_CONTEXT_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_CONTEXT.asciidoc[] |
| | {cl_context_TYPE} |
| | Return context specified when memory object is created. |
| If _memobj_ is created using {clCreateSubBuffer}, the context |
| associated with the memory object specified as the _buffer_ argument |
| to {clCreateSubBuffer} is returned. |
| | {CL_MEM_ASSOCIATED_MEMOBJECT_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_ASSOCIATED_MEMOBJECT.asciidoc[] |
| | {cl_mem_TYPE} |
| | Return memory object from which _memobj_ is created. |
| |
| This returns the memory object specified as the _buffer_ argument to |
| {clCreateSubBuffer} if _memobj_ is a sub-buffer object created using |
| {clCreateSubBuffer}. |
| |
| This returns __image_desc__->__mem_object__ if _memobj_ |
| is an image object created using {clCreateImage} or |
| {clCreateImageWithProperties}. |
| |
| Otherwise, returns `NULL`. |
| | {CL_MEM_OFFSET_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_OFFSET.asciidoc[] |
| | {size_t_TYPE} |
| | Return offset if _memobj_ is a sub-buffer object created using |
| {clCreateSubBuffer}. |
| |
| Returns 0 if _memobj_ is not a sub-buffer object. |
| | {CL_MEM_USES_SVM_POINTER_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_USES_SVM_POINTER.asciidoc[] |
| | {cl_bool_TYPE} |
| | Return {CL_TRUE} if _memobj_ is a buffer object that was created with |
| {CL_MEM_USE_HOST_PTR} or is a sub-buffer object of a buffer object |
| that was created with {CL_MEM_USE_HOST_PTR} and the _host_ptr_ |
| specified when the buffer object was created is a SVM pointer; |
| otherwise returns {CL_FALSE}. |
| | {CL_MEM_PROPERTIES_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_PROPERTIES.asciidoc[] |
| | {cl_mem_properties_TYPE}[] |
| | Return the properties argument specified in |
| {clCreateBufferWithProperties} or {clCreateImageWithProperties}. |
| |
| If the _properties_ argument specified in |
| {clCreateBufferWithProperties} or {clCreateImageWithProperties} |
| used to create _memobj_ was not `NULL`, the implementation must |
| return the values specified in the properties argument in the |
| same order and without including additional properties. |
| |
| If _memobj_ was created using {clCreateBuffer}, |
| {clCreateSubBuffer}, {clCreateImage}, {clCreateImage2D}, or |
| {clCreateImage3D}, or if the _properties_ argument specified |
| in {clCreateBufferWithProperties} or |
| {clCreateImageWithProperties} was `NULL`, the implementation |
| must return _param_value_size_ret_ equal to 0, indicating that |
| there are no properties to be returned. |
| |==== |
| |
| // refError |
| |
| {clGetMemObjectInfo} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_MEM_OBJECT} if _memobj_ is not a valid memory object. |
| * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes |
| specified by _param_value_size_ is < size of return type as described in |
| the <<mem-info-table,Memory Object Info>> table and _param_value_ is not |
| `NULL`. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
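| |
The interaction between _param_value_, _param_value_size_ and _param_value_size_ret_ can be modelled with a small stand-alone function. The sketch below is not {clGetMemObjectInfo} itself: `query_model` and its result codes are hypothetical, and the value being queried is passed in explicitly so that the example needs no OpenCL runtime. Leaving the size output untouched on failure is a choice made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative result codes local to this sketch (not the OpenCL constants). */
enum { QUERY_OK = 0, QUERY_INVALID_VALUE = -1 };

/* Hypothetical query: `value` and `value_size` stand for the information the
 * implementation would return for some param_name. */
static int query_model(const void *value, size_t value_size,
                       size_t param_value_size, void *param_value,
                       size_t *param_value_size_ret) {
    if (param_value != NULL) {
        /* The caller's storage must be at least as large as the result. */
        if (param_value_size < value_size) {
            return QUERY_INVALID_VALUE;
        }
        memcpy(param_value, value, value_size);
    }
    /* A NULL size pointer is ignored. */
    if (param_value_size_ret != NULL) {
        *param_value_size_ret = value_size;
    }
    return QUERY_OK;
}
```

A common calling pattern is to pass `NULL` for _param_value_ first to learn the required size, then call again with storage of that size.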
| -- |
| |
| |
| == Shared Virtual Memory |
| |
| NOTE: Shared virtual memory is <<unified-spec, missing before>> version 2.0. |
| |
| Shared virtual memory (a.k.a. SVM) allows the host and kernels executing on |
| devices to directly share complex, pointer-containing data structures such as |
| trees and linked lists. |
| It also eliminates the need to marshal data between the host and devices. |
| As a result, SVM substantially simplifies OpenCL programming and may improve |
| performance. |
| |
| |
| === SVM sharing granularity: coarse- and fine- grained sharing |
| |
| OpenCL maintains memory consistency in a coarse-grained fashion in regions |
| of buffers. |
| We call this coarse-grained sharing. |
| Many platforms such as those with integrated CPU-GPU processors and ones |
| using the SVM-related PCI-SIG IOMMU services can do better, and can support |
| sharing at a granularity smaller than a buffer. |
| We call this fine-grained sharing. |
| |
| * Coarse-grained sharing: Coarse-grain sharing may be used for memory and |
| virtual pointer sharing between multiple devices as well as between the |
| host and one or more devices. |
| The shared memory region is a memory buffer allocated using |
| {clSVMAlloc}. |
| Memory consistency is guaranteed at synchronization points and the host |
| can use calls to {clEnqueueSVMMap} and {clEnqueueSVMUnmap} or create a |
| {cl_mem_TYPE} buffer object using the SVM pointer and use OpenCL's existing host |
| API functions {clEnqueueMapBuffer} and {clEnqueueUnmapMemObject} to |
| update regions of the buffer. |
| What coarse-grain buffer SVM adds to OpenCL's earlier buffer support are |
| the ability to share virtual memory pointers and a guarantee that |
| concurrent access to the same memory allocation from multiple kernels on |
| a single device is valid. |
| The coarse-grain buffer SVM provides a memory consistency model similar |
| to the global memory consistency model described in _sections 3.3.1_ and |
| _3.4.3_ of the OpenCL 1.2 specification. |
| This memory consistency applies to the regions of buffers being shared |
| in a coarse-grained fashion. |
| It is enforced at the synchronization points between commands enqueued |
| to command queues in a single context with the additional consideration |
| that multiple kernels concurrently running on the same device may safely |
| share the data. |
| * Fine-grained sharing: Shared virtual memory where memory consistency is |
| maintained at a granularity smaller than a buffer. |
| How fine-grained SVM is used depends on whether the device supports SVM |
| atomic operations. |
| ** If SVM atomic operations are supported, they provide memory consistency |
| for loads and stores by the host and kernels executing on devices |
| supporting SVM. |
| This means that the host and devices can concurrently read and update |
| the same memory. |
| The consistency provided by SVM atomics is in addition to the |
| consistency provided at synchronization points. |
| There is no need for explicit calls to {clEnqueueSVMMap} and |
| {clEnqueueSVMUnmap} or {clEnqueueMapBuffer} and |
| {clEnqueueUnmapMemObject} on a {cl_mem_TYPE} buffer object created using the |
| SVM pointer. |
| ** If SVM atomic operations are not supported, the host and devices can |
| concurrently read the same memory locations and can concurrently update |
| non-overlapping memory regions, but attempts to update the same memory |
| locations are undefined. |
| Memory consistency is guaranteed at synchronization points without the |
| need for explicit calls to {clEnqueueSVMMap} and {clEnqueueSVMUnmap} |
| or {clEnqueueMapBuffer} and {clEnqueueUnmapMemObject} on a {cl_mem_TYPE} |
| buffer object created using the SVM pointer. |
| * There are two kinds of fine-grain sharing support. |
| Devices may support either fine-grain buffer sharing or fine-grain |
| system sharing. |
| ** Fine-grain buffer sharing provides fine-grain SVM only within buffers |
| and is an extension of coarse-grain sharing. |
| To support fine-grain buffer sharing in an OpenCL context, all devices |
| in the context must support {CL_DEVICE_SVM_FINE_GRAIN_BUFFER}. |
| ** Fine-grain system sharing enables fine-grain sharing of the host's |
| entire virtual memory, including memory regions allocated by the system |
| *malloc* API. |
| OpenCL buffer objects are unnecessary and programmers can pass pointers |
| allocated using *malloc* to OpenCL kernels. |
| |
| As an illustration of fine-grain SVM using SVM atomic operations to maintain |
| memory consistency, consider the following example. |
| The host and a set of devices can simultaneously access and update a shared |
| work-queue data structure holding work-items to be done. |
| The host can use atomic operations to insert new work-items into the queue |
| while the devices use similar atomic operations to remove work-items for |
| processing. |
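| |
The index bookkeeping for such a work-queue can be sketched with C11 atomics on the host side. This is only an analogy: it uses `<stdatomic.h>` rather than OpenCL SVM atomic operations, the `work_queue` type and the `wq_push` and `wq_pop` functions are hypothetical, and the sketch tracks indices only, leaving out the storage and publication of the work-item payloads.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared queue state: producers advance `tail`, consumers
 * advance `head`. Indices in [head, tail) are waiting to be processed. */
typedef struct {
    atomic_uint head;
    atomic_uint tail;
} work_queue;

/* Reserves the next index for a new work-item and returns it. */
static unsigned wq_push(work_queue *q) {
    return atomic_fetch_add(&q->tail, 1u);
}

/* Claims the next waiting index, or returns -1 when the queue is empty. */
static long wq_pop(work_queue *q) {
    unsigned h = atomic_load(&q->head);
    while (h < atomic_load(&q->tail)) {
        /* On failure, h is refreshed with the current head and we retry. */
        if (atomic_compare_exchange_weak(&q->head, &h, h + 1u)) {
            return (long)h;
        }
    }
    return -1;
}
```

The compare-and-exchange loop ensures that each index is claimed by exactly one consumer, even when several consumers call `wq_pop` concurrently.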
| |
| It is the programmer's responsibility to ensure that no host code or |
| executing kernels attempt to access a shared memory region after that memory |
| is freed. |
| We require the SVM implementation to work with either 32- or 64-bit host |
| applications subject to the following requirement: the address space size |
| must be the same for the host and all OpenCL devices in the context. |
| |
| [open,refpage='clSVMAlloc',desc='Allocates a shared virtual memory (SVM) buffer that can be shared by the host and all devices in an OpenCL context that support shared virtual memory.',type='protos'] |
| -- |
| To allocate a shared virtual memory buffer (referred to as a SVM buffer) |
| that can be shared by the host and all devices in an OpenCL context that |
| support shared virtual memory, call the function |
| |
| include::{generated}/api/protos/clSVMAlloc.txt[] |
| include::{generated}/api/version-notes/clSVMAlloc.asciidoc[] |
| |
| * _context_ is a valid OpenCL context used to create the SVM buffer. |
| * _flags_ is a bit-field that is used to specify allocation and usage |
| information. |
| The <<svm-flags-table,SVM Memory Flags>> table describes the possible values |
| for _flags_. |
| * _size_ is the size in bytes of the SVM buffer to be allocated. |
| * _alignment_ is the minimum alignment in bytes that is required for the newly |
| created buffer's memory region. |
| It must be a power of two up to the largest data type supported by the |
| OpenCL device. |
| For the full profile, the largest data type is long16. |
| For the embedded profile, it is long16 if the device supports 64-bit |
| integers; otherwise it is int16. |
| If _alignment_ is 0, a default alignment will be used that is equal to the |
| size of the largest data type supported by the OpenCL implementation. |
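| |
The _alignment_ rule can be approximated by the check below. The function `svm_alignment_ok` is a hypothetical helper, and the sketch takes the conservative reading that a non-zero alignment must be a power of two no larger than the size of the largest supported data type (128 bytes for long16, 64 bytes for int16). An implementation may accept or reject other values, so this is not a substitute for checking the result of {clSVMAlloc}.

```c
#include <assert.h>
#include <stddef.h>

/* max_type_size: size in bytes of the largest supported data type,
 * for example 128 for long16 (16 elements of 8 bytes each). */
static int svm_alignment_ok(size_t alignment, size_t max_type_size) {
    if (alignment == 0) {
        return 1; /* 0 requests the default alignment */
    }
    if ((alignment & (alignment - 1)) != 0) {
        return 0; /* not a power of two */
    }
    return alignment <= max_type_size;
}
```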
| |
| [[svm-flags-table]] |
| .List of supported SVM memory flag values |
| [width="100%",cols="<50%,<50%",options="header"] |
| |==== |
| | SVM Memory Flags | Description |
| | {CL_MEM_READ_WRITE} |
| | This flag specifies that the SVM buffer will be read and written by a |
| kernel. |
| This is the default. |
| | {CL_MEM_WRITE_ONLY} |
| | This flag specifies that the SVM buffer will be written but not read by |
| a kernel. |
| |
| Reading from a SVM buffer created with {CL_MEM_WRITE_ONLY} inside a kernel |
| is undefined. |
| |
| {CL_MEM_READ_WRITE} and {CL_MEM_WRITE_ONLY} are mutually exclusive. |
| | {CL_MEM_READ_ONLY} |
| | This flag specifies that the SVM buffer object is a read-only memory |
| object when used inside a kernel. |
| |
| Writing to a SVM buffer created with {CL_MEM_READ_ONLY} inside a kernel is |
| undefined. |
| |
| {CL_MEM_READ_WRITE} or {CL_MEM_WRITE_ONLY} and {CL_MEM_READ_ONLY} are mutually |
| exclusive. |
| | {CL_MEM_SVM_FINE_GRAIN_BUFFER_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_SVM_FINE_GRAIN_BUFFER.asciidoc[] |
| | This specifies that the application wants the OpenCL implementation to |
| do a fine-grained allocation. |
| | {CL_MEM_SVM_ATOMICS_anchor} |
| |
| include::{generated}/api/version-notes/CL_MEM_SVM_ATOMICS.asciidoc[] |
| | This flag is valid only if {CL_MEM_SVM_FINE_GRAIN_BUFFER} is specified in |
| flags. |
| It is used to indicate that SVM atomic operations can control visibility |
| of memory accesses in this SVM buffer. |
| |==== |
| |
| If {CL_MEM_SVM_FINE_GRAIN_BUFFER} is not specified, the buffer can be created |
| as a coarse-grained SVM allocation. |
| Similarly, if {CL_MEM_SVM_ATOMICS} is not specified, the buffer can be created |
| without support for SVM atomic operations (refer to an OpenCL kernel |
| language specification). |
| |
| Calling {clSVMAlloc} does not itself provide consistency for the shared |
| memory region. |
| When the host cannot use the SVM atomic operations, it must rely on OpenCL's |
| guaranteed memory consistency at synchronization points. |
| |
| For SVM to be used efficiently, the host and any devices sharing a buffer |
| containing virtual memory pointers should have the same endianness. |
| If the context passed to {clSVMAlloc} has devices with mixed endianness and |
| the OpenCL implementation is unable to implement SVM because of that mixed |
| endianness, {clSVMAlloc} will fail and return `NULL`. |
| |
| Although SVM is generally not supported for image objects, {clCreateImage} |
| and {clCreateImageWithProperties} |
| may create an image from a buffer (a 1D image from a buffer or a 2D image |
| from buffer) if the buffer specified in its image description parameter is a |
| SVM buffer. |
| Such images have a linear memory representation so their memory can be |
| shared using SVM. |
| However, fine-grained sharing and atomics are not supported for image reads |
| and writes in a kernel. |
| |
| // refError |
| |
| {clSVMAlloc} returns a valid non-`NULL` shared virtual memory address if the |
| SVM buffer is successfully allocated. |
| Otherwise, like *malloc*, it returns a `NULL` pointer value. |
| {clSVMAlloc} will fail if |
| |
| * _context_ is not a valid context, or no devices in _context_ support SVM. |
| * _flags_ does not contain {CL_MEM_SVM_FINE_GRAIN_BUFFER} but does contain |
| {CL_MEM_SVM_ATOMICS}. |
| * Values specified in _flags_ do not follow rules described for supported |
| values in the <<svm-flags-table,SVM Memory Flags>> table. |
| * {CL_MEM_SVM_FINE_GRAIN_BUFFER} or {CL_MEM_SVM_ATOMICS} is specified in |
| _flags_ and these are not supported by at least one device in _context_. |
| * The values specified in _flags_ are not valid, i.e. don't match those |
| defined in the <<svm-flags-table,SVM Memory Flags>> table. |
| * _size_ is 0 or greater than the {CL_DEVICE_MAX_MEM_ALLOC_SIZE} value for any |
| device in _context_. |
| * _alignment_ is not a power of two or the OpenCL implementation cannot |
| support the specified alignment for at least one device in _context_. |
| * There was a failure to allocate resources. |
| -- |
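As an illustration of the argument rules listed for {clSVMAlloc}, the following is a minimal sketch of an application-side pre-check.
It is not part of the OpenCL API: the helper name `svm_alloc_args_plausible` and the `SKETCH_` macros are hypothetical, the macro values mirror the corresponding bits in the Khronos `CL/cl.h` header, and `max_alloc_size` is assumed to be the smallest {CL_DEVICE_MAX_MEM_ALLOC_SIZE} value across the devices in the context.
The sketch cannot tell whether the devices actually support the requested flags or alignment.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values mirroring CL_MEM_SVM_FINE_GRAIN_BUFFER and CL_MEM_SVM_ATOMICS in
 * CL/cl.h; redefined here only so the sketch compiles without OpenCL headers. */
#define SKETCH_MEM_SVM_FINE_GRAIN_BUFFER (UINT64_C(1) << 10)
#define SKETCH_MEM_SVM_ATOMICS           (UINT64_C(1) << 11)

/* Hypothetical helper: returns false when the arguments violate one of the
 * device-independent failure conditions listed for clSVMAlloc. */
bool svm_alloc_args_plausible(uint64_t flags, size_t size,
                              uint64_t max_alloc_size, unsigned alignment)
{
    /* Atomics may only be requested together with fine-grain buffer sharing. */
    if ((flags & SKETCH_MEM_SVM_ATOMICS) &&
        !(flags & SKETCH_MEM_SVM_FINE_GRAIN_BUFFER)) {
        return false;
    }
    /* size must be non-zero and no larger than the smallest device limit. */
    if (size == 0 || (uint64_t)size > max_alloc_size) {
        return false;
    }
    /* alignment must be a power of two; 0 requests the default alignment. */
    if (alignment != 0 && (alignment & (alignment - 1)) != 0) {
        return false;
    }
    return true;
}
```

A caller would typically run such a check before calling {clSVMAlloc} and still compare the returned pointer against `NULL`, since the implementation may reject the request for other reasons.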
| |
| [open,refpage='clSVMFree',desc='Frees a shared virtual memory buffer allocated using clSVMAlloc.',type='protos'] |
| -- |
| To free a shared virtual memory buffer allocated using {clSVMAlloc}, call |
| the function |
| |
| include::{generated}/api/protos/clSVMFree.txt[] |
| include::{generated}/api/version-notes/clSVMFree.asciidoc[] |
| |
| * _context_ is a valid OpenCL context used to create the SVM buffer. |
| If no devices in _context_ support SVM, no action occurs. |
| * _svm_pointer_ must be the value returned by a call to {clSVMAlloc}. |
| If a `NULL` pointer is passed in _svm_pointer_, no action occurs. |
| |
| Note that {clSVMFree} does not wait for previously enqueued commands that |
| may be using _svm_pointer_ to finish before freeing _svm_pointer_. |
| It is the responsibility of the application to make sure that enqueued |
| commands that use _svm_pointer_ have finished before freeing _svm_pointer_. |
| This can be done by calling a blocking operation such as {clFinish}, |
| {clWaitForEvents}, or a blocking {clEnqueueReadBuffer}, or by registering a |
| callback with the events associated with the enqueued commands and freeing |
| _svm_pointer_ once the last enqueued command has finished. |
| |
| The behavior of using _svm_pointer_ after it has been freed is undefined. |
| In addition, if a buffer object is created using {clCreateBuffer} or |
| {clCreateBufferWithProperties} with _svm_pointer_, the buffer object must |
| first be released before the _svm_pointer_ is freed. |
| |
| The {clEnqueueSVMFree} API can also be used to enqueue a callback to free |
| the shared virtual memory buffer allocated using {clSVMAlloc} or a shared |
| system memory pointer. |
| -- |
| |
| [open,refpage='clEnqueueSVMFree',desc='Enqueues a command to free shared virtual memory allocated using clSVMAlloc or a shared system memory pointer.',type='protos'] |
| -- |
| To enqueue a command to free the shared virtual memory allocated using |
| {clSVMAlloc} or a shared system memory pointer, call the function |
| |
| include::{generated}/api/protos/clEnqueueSVMFree.txt[] |
| include::{generated}/api/version-notes/clEnqueueSVMFree.asciidoc[] |
| |
| * _command_queue_ is a valid host command-queue. |
| * _svm_pointers_ and _num_svm_pointers_ specify shared virtual memory pointers |
| to be freed. |
| Each pointer in _svm_pointers_ that was allocated using {clSVMAlloc} must |
| have been allocated from the same context from which _command_queue_ was |
| created. |
| The memory associated with _svm_pointers_ can be reused or freed after the |
| function returns. |
| * _pfn_free_func_ specifies the callback function to be called to free the SVM |
| pointers. |
| _pfn_free_func_ takes four arguments: _queue_ which is the command queue in |
| which {clEnqueueSVMFree} was enqueued, the count and list of SVM pointers to |
| free and _user_data_ which is a pointer to user specified data. |
| If _pfn_free_func_ is `NULL`, all pointers specified in _svm_pointers_ must |
| be allocated using {clSVMAlloc} and the OpenCL implementation will free |
| these SVM pointers. |
| _pfn_free_func_ must be a valid callback function if any SVM pointer to be |
| freed is a shared system memory pointer i.e. not allocated using |
| {clSVMAlloc}. |
| If _pfn_free_func_ is a valid callback function, the OpenCL implementation |
| will call _pfn_free_func_ to free all the SVM pointers specified in |
| _svm_pointers_. |
| * _user_data_ will be passed as the _user_data_ argument when _pfn_free_func_ |
| is called. |
| _user_data_ can be `NULL`. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before {clEnqueueSVMFree} can be executed. |
| If _event_wait_list_ is `NULL`, then {clEnqueueSVMFree} does not wait on any |
| event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this command and |
| can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| |
| // refError |
| |
| {clEnqueueSVMFree} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does not support SVM. |
| * {CL_INVALID_VALUE} if _num_svm_pointers_ is 0 and _svm_pointers_ is |
| non-`NULL`, _or_ if _svm_pointers_ is `NULL` and _num_svm_pointers_ is |
| not 0. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
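The pairing rules for the pointer list and the event wait list can be sketched as two small predicates.
This is an illustrative sketch only: the helper names are hypothetical, plain C types stand in for the OpenCL types, and the checks cover only the `NULL` and count consistency rules stated above, not the validity of the individual pointers or events.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: an event wait list is consistent when the pointer and
 * the count are either both "empty" (NULL, 0) or both "non-empty". */
bool wait_list_args_consistent(unsigned num_events_in_wait_list,
                               const void *event_wait_list)
{
    if (event_wait_list == NULL) {
        return num_events_in_wait_list == 0;
    }
    return num_events_in_wait_list > 0;
}

/* Hypothetical helper: clEnqueueSVMFree reports CL_INVALID_VALUE when exactly
 * one of num_svm_pointers and svm_pointers is zero / NULL. */
bool svm_free_args_consistent(unsigned num_svm_pointers,
                              void *const *svm_pointers)
{
    return (num_svm_pointers == 0) == (svm_pointers == NULL);
}
```

The same wait-list rule applies to the other SVM enqueue commands in this section, so a single predicate of this kind could be shared by all call sites.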
| |
| [open,refpage='clEnqueueSVMMemcpy',desc='Enqueues a command to do a memcpy operation.',type='protos'] |
| -- |
| To enqueue a command to do a memcpy operation, call the function |
| |
| include::{generated}/api/protos/clEnqueueSVMMemcpy.txt[] |
| include::{generated}/api/version-notes/clEnqueueSVMMemcpy.asciidoc[] |
| |
| * _command_queue_ refers to the host command-queue in which the copy command |
| will be queued. |
| If either _dst_ptr_ or _src_ptr_ is allocated using {clSVMAlloc} then it must |
| be allocated from the same context from which _command_queue_ was created. |
| * _blocking_copy_ indicates if the copy operation is _blocking_ or |
| _non-blocking_. |
| If _blocking_copy_ is {CL_TRUE} i.e. the copy command is blocking, |
| {clEnqueueSVMMemcpy} does not return until the data has been copied |
| into memory pointed to by _dst_ptr_. |
| * _size_ is the size in bytes of data being copied. |
| * _dst_ptr_ is the pointer to a host or SVM memory allocation where data is |
| copied to. |
| * _src_ptr_ is the pointer to a host or SVM memory allocation where data is |
| copied from. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this copy command |
| and can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| |
| If _blocking_copy_ is {CL_FALSE} i.e. the copy command is non-blocking, |
| {clEnqueueSVMMemcpy} queues a non-blocking copy command and returns. |
| The contents of the buffer that _dst_ptr_ points to cannot be used until the |
| copy command has completed. |
| The _event_ argument returns an event object which can be used to query the |
| execution status of the copy command. |
| When the copy command has completed, the contents of the buffer that |
| _dst_ptr_ points to can be used by the application. |
| |
| If the memory allocation(s) containing _dst_ptr_ and/or _src_ptr_ are |
| allocated using {clSVMAlloc} and either is not allocated from the same |
| context from which _command_queue_ was created, the behavior is undefined. |
| |
| // refError |
| |
| {clEnqueueSVMMemcpy} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does not support SVM. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and |
| events in _event_wait_list_ are not the same. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the copy operation is |
| blocking and the execution status of any of the events in |
| _event_wait_list_ is a negative integer value. |
| * {CL_INVALID_VALUE} if _dst_ptr_ or _src_ptr_ is `NULL`. |
| * {CL_MEM_COPY_OVERLAP} if the values specified for _dst_ptr_, _src_ptr_ and |
| _size_ result in an overlapping copy. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
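The {CL_MEM_COPY_OVERLAP} condition can be made concrete with a short sketch.
The helper `svm_copy_overlaps` below is hypothetical and not part of the OpenCL API; it assumes both pointers can be compared as integers in a single flat address space, which holds for typical hosts but is not guaranteed by the C standard.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when [dst_ptr, dst_ptr + size) and
 * [src_ptr, src_ptr + size) share at least one byte. */
bool svm_copy_overlaps(const void *dst_ptr, const void *src_ptr, size_t size)
{
    uintptr_t d = (uintptr_t)dst_ptr;
    uintptr_t s = (uintptr_t)src_ptr;

    if (size == 0) {
        return false;               /* an empty copy touches no bytes */
    }
    /* Two equally sized ranges overlap exactly when their start addresses
     * are closer together than the range length. */
    return (d < s) ? (s - d) < size : (d - s) < size;
}
```

An application might use a check of this kind to split or reorder a copy before enqueuing it, rather than relying on the error code after the fact.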
| |
| [open,refpage='clEnqueueSVMMemFill',desc='Enqueues a command to fill a region in memory with a pattern of a given pattern size.',type='protos'] |
| -- |
| To enqueue a command to fill a region in memory with a pattern of a given |
| pattern size, call the function |
| |
| include::{generated}/api/protos/clEnqueueSVMMemFill.txt[] |
| include::{generated}/api/version-notes/clEnqueueSVMMemFill.asciidoc[] |
| |
| * _command_queue_ refers to the host command-queue in which the fill command |
| will be queued. |
| The OpenCL context associated with _command_queue_ and SVM pointer referred |
| to by _svm_ptr_ must be the same. |
| * _svm_ptr_ is a pointer to a memory region that will be filled with |
| _pattern_. |
| It must be aligned to _pattern_size_ bytes. |
| If _svm_ptr_ is allocated using {clSVMAlloc} then it must be allocated from |
| the same context from which _command_queue_ was created. |
| Otherwise the behavior is undefined. |
| * _pattern_ is a pointer to the data pattern of size _pattern_size_ in bytes. |
| _pattern_ will be used to fill a region starting at _svm_ptr_ that is _size_ |
| bytes in size. |
| The data pattern must be a scalar or vector integer or floating-point data |
| type supported by OpenCL as described in <<scalar-data-types, Shared |
| Application Scalar Data Types>> and <<vector-data-types, Supported |
| Application Vector Data Types>>. |
| For example, if the region pointed to by _svm_ptr_ is to be filled with a |
| pattern of float4 values, then _pattern_ will be a pointer to a cl_float4 |
| value and _pattern_size_ will be `sizeof(cl_float4)`. |
| The maximum value of _pattern_size_ is the size of the largest integer or |
| floating-point vector data type supported by the OpenCL device. |
| The memory associated with _pattern_ can be reused or freed after the |
| function returns. |
| * _size_ is the size in bytes of the region being filled starting with _svm_ptr_ |
| and must be a multiple of _pattern_size_. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this command and |
| can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| |
| // refError |
| |
| {clEnqueueSVMMemFill} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does not support SVM. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and |
| events in _event_wait_list_ are not the same. |
| * {CL_INVALID_VALUE} if _svm_ptr_ is `NULL`. |
| * {CL_INVALID_VALUE} if _svm_ptr_ is not aligned to _pattern_size_ bytes. |
| * {CL_INVALID_VALUE} if _pattern_ is `NULL` or if _pattern_size_ is 0 or if |
| _pattern_size_ is not one of {1, 2, 4, 8, 16, 32, 64, 128}. |
| * {CL_INVALID_VALUE} if _size_ is not a multiple of _pattern_size_. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
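The {CL_INVALID_VALUE} conditions for {clEnqueueSVMMemFill} combine into a compact check on the fill arguments.
The sketch below is an assumption-laden illustration: `svm_fill_args_valid` is a hypothetical helper name, and it only covers the pointer, pattern size, alignment and size rules listed above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the CL_INVALID_VALUE rules for a fill. */
bool svm_fill_args_valid(const void *svm_ptr, const void *pattern,
                         size_t pattern_size, size_t size)
{
    if (svm_ptr == NULL || pattern == NULL) {
        return false;
    }
    /* pattern_size must be one of 1, 2, 4, 8, 16, 32, 64, 128,
     * i.e. a power of two no larger than 128. */
    if (pattern_size == 0 || pattern_size > 128 ||
        (pattern_size & (pattern_size - 1)) != 0) {
        return false;
    }
    /* svm_ptr must be aligned to pattern_size bytes. */
    if ((uintptr_t)svm_ptr % pattern_size != 0) {
        return false;
    }
    /* size must be a whole number of patterns. */
    return size % pattern_size == 0;
}
```

For the `cl_float4` example above, `pattern_size` would be 16, so the region must start on a 16-byte boundary and its size must be a multiple of 16.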
| |
| [open,refpage='clEnqueueSVMMap',desc='Enqueues a command that will allow the host to update a region of a SVM buffer',type='protos'] |
| -- |
| To enqueue a command that will allow the host to update a region of a SVM |
| buffer, call the function |
| |
| include::{generated}/api/protos/clEnqueueSVMMap.txt[] |
| include::{generated}/api/version-notes/clEnqueueSVMMap.asciidoc[] |
| |
| * _command_queue_ must be a valid host command-queue. |
| * _blocking_map_ indicates if the map operation is _blocking_ or |
| _non-blocking_. |
| * _map_flags_ is a bit-field and is described in the |
| <<memory-map-flags-table,Memory Map Flags>> table. |
| * _svm_ptr_ and _size_ are a pointer to a memory region and size in bytes that |
| will be updated by the host. |
| If _svm_ptr_ is allocated using {clSVMAlloc} then it must be allocated from |
| the same context from which _command_queue_ was created. |
| Otherwise the behavior is undefined. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this command and |
| can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| |
| If _blocking_map_ is {CL_TRUE}, {clEnqueueSVMMap} does not return until the |
| application can access the contents of the SVM region specified by _svm_ptr_ |
| and _size_ on the host. |
| |
| If _blocking_map_ is {CL_FALSE} i.e. the map operation is non-blocking, the region |
| specified by _svm_ptr_ and _size_ cannot be used until the map command has |
| completed. |
| The _event_ argument returns an event object which can be used to query the |
| execution status of the map command. |
| When the map command is completed, the application can access the contents |
| of the region specified by _svm_ptr_ and _size_. |
| |
| Note that since we are enqueuing a command with a SVM buffer, the region is |
| already mapped in the host address space. |
| |
| // refError |
| |
| {clEnqueueSVMMap} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does not support SVM. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and |
| events in _event_wait_list_ are not the same. |
| * {CL_INVALID_VALUE} if _svm_ptr_ is `NULL`. |
| * {CL_INVALID_VALUE} if _size_ is 0 or if values specified in _map_flags_ |
| are not valid. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the map operation is |
| blocking and the execution status of any of the events in |
| _event_wait_list_ is a negative integer value. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clEnqueueSVMUnmap',desc='Enqueues a command to indicate that the host has completed updating the region given by an SVM pointer and which was specified in a previous call to clEnqueueSVMMap.',type='protos'] |
| -- |
| To enqueue a command to indicate that the host has completed updating the |
| region given by _svm_ptr_ and which was specified in a previous call to |
| {clEnqueueSVMMap}, call the function |
| |
| include::{generated}/api/protos/clEnqueueSVMUnmap.txt[] |
| include::{generated}/api/version-notes/clEnqueueSVMUnmap.asciidoc[] |
| |
| * _command_queue_ must be a valid host command-queue. |
| * _svm_ptr_ is a pointer that was specified in a previous call to |
| {clEnqueueSVMMap}. |
| If _svm_ptr_ is allocated using {clSVMAlloc} then it must be allocated from |
| the same context from which _command_queue_ was created. |
| Otherwise the behavior is undefined. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before {clEnqueueSVMUnmap} can be executed. |
| If _event_wait_list_ is `NULL`, then {clEnqueueSVMUnmap} does not wait on any |
| event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this command and |
| can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| |
| {clEnqueueSVMMap} and {clEnqueueSVMUnmap} act as synchronization points for |
| the region of the SVM buffer specified in these calls. |
| |
| // refError |
| |
| {clEnqueueSVMUnmap} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does not support SVM. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and |
| events in _event_wait_list_ are not the same. |
| * {CL_INVALID_VALUE} if _svm_ptr_ is `NULL`. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or if _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
| [NOTE] |
| ==== |
| If a coarse-grained SVM buffer is currently mapped for writing, the |
| application must ensure that the SVM buffer is unmapped before any enqueued |
| kernels or commands that read from or write to this SVM buffer or any of its |
| associated {cl_mem_TYPE} buffer objects begin execution; otherwise the behavior is |
| undefined. |
| |
| If a coarse-grained SVM buffer is currently mapped for reading, the |
| application must ensure that the SVM buffer is unmapped before any enqueued |
| kernels or commands that write to this SVM buffer or any of its |
| associated {cl_mem_TYPE} buffer objects begin execution; otherwise the behavior is |
| undefined. |
| |
| A SVM buffer is considered mapped if there are one or more active |
| mappings for the SVM buffer irrespective of whether the mapped regions span |
| the entire SVM buffer. |
| |
| The above note does not apply to fine-grained SVM buffers (fine-grained |
| buffers allocated using {clSVMAlloc} or fine-grained system allocations). |
| ==== |
| -- |
| |
| [open,refpage='clEnqueueSVMMigrateMem',desc='Enqueues a command to indicate which device a set of ranges of SVM allocations should be associated with.',type='protos'] |
| -- |
| To enqueue a command to indicate which device a set of ranges of SVM |
| allocations should be associated with, call the function |
| |
| include::{generated}/api/protos/clEnqueueSVMMigrateMem.txt[] |
| include::{generated}/api/version-notes/clEnqueueSVMMigrateMem.asciidoc[] |
| |
| * _command_queue_ is a valid host command-queue. |
| The specified set of allocation ranges will be migrated to the OpenCL device |
| associated with _command_queue_. |
| * _num_svm_pointers_ is the number of pointers in the specified _svm_pointers_ |
| array, and the number of sizes in the _sizes_ array, if _sizes_ is not |
| `NULL`. |
| * _svm_pointers_ is a pointer to an array of pointers. |
| Each pointer in this array must be within an allocation produced by a call |
| to {clSVMAlloc}. |
| * _sizes_ is an array of sizes. |
| The pair _svm_pointers_[i] and _sizes_[i] together define the starting |
| address and number of bytes in a range to be migrated. |
| _sizes_ may be `NULL` indicating that every allocation containing any |
| _svm_pointers_[i] is to be migrated. |
| Also, if _sizes_[i] is zero, then the entire allocation containing |
| _svm_pointers_[i] is migrated. |
| * _flags_ is a bit-field that is used to specify migration options. |
| The <<migration-flags-table,Memory Migration Flags>> table describes the |
| possible values for _flags_. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this command and |
| can be used to query or queue a wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| |
| Once the event returned by {clEnqueueSVMMigrateMem} has become {CL_COMPLETE}, |
| the ranges specified by _svm_pointers_ and _sizes_ have been successfully |
| migrated to the device associated with _command_queue_. |
| |
| The user is responsible for managing the event dependencies associated with |
| this command in order to avoid overlapping access to SVM allocations. |
| Improperly specified event dependencies passed to {clEnqueueSVMMigrateMem} |
| could result in undefined results. |
| |
| // refError |
| |
| {clEnqueueSVMMigrateMem} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_OPERATION} if the device associated with _command_queue_ does not support SVM. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and |
| events in _event_wait_list_ are not the same. |
| * {CL_INVALID_VALUE} if _num_svm_pointers_ is zero or _svm_pointers_ is |
| `NULL`. |
| * {CL_INVALID_VALUE} if _sizes_[i] is non-zero and the range |
| [_svm_pointers_[i], _svm_pointers_[i]+_sizes_[i]) is not contained within an |
| existing {clSVMAlloc} allocation. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or if _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
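The range rule for {clEnqueueSVMMigrateMem} can be illustrated with a containment test against a record the application keeps for each SVM allocation.
This is a sketch under stated assumptions: the `svm_allocation` record and the helper `svm_range_in_allocation` are hypothetical bookkeeping on the application side, since OpenCL does not expose such a structure.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record kept by the application for each clSVMAlloc result. */
typedef struct {
    const void *base;   /* pointer returned by clSVMAlloc */
    size_t      size;   /* size in bytes that was requested */
} svm_allocation;

/* Hypothetical helper: true when ptr lies inside the allocation and the
 * range [ptr, ptr + range_size) does not run past its end.
 * A range_size of 0 selects the entire allocation containing ptr. */
bool svm_range_in_allocation(const svm_allocation *alloc,
                             const void *ptr, size_t range_size)
{
    uintptr_t base = (uintptr_t)alloc->base;
    uintptr_t p    = (uintptr_t)ptr;

    if (p < base || p - base >= alloc->size) {
        return false;               /* ptr is outside the allocation */
    }
    if (range_size == 0) {
        return true;                /* whole allocation is migrated */
    }
    return range_size <= alloc->size - (p - base);
}
```

Running this check for every pair of `svm_pointers[i]` and `sizes[i]` before the enqueue gives the application a chance to report a precise diagnostic instead of a bare {CL_INVALID_VALUE}.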
| |
| |
| === Memory consistency for SVM allocations |
| |
| To ensure memory consistency in SVM allocations, the program can rely on the |
| guaranteed memory consistency at synchronization points. |
| This consistency support already exists in OpenCL 1.x and can be used for |
| coarse-grained SVM allocations or for fine-grained buffer SVM allocations; |
| what SVM adds is the ability to share pointers between the host and all SVM |
| devices. |
| |
| In addition, sub-buffers can also be used to ensure that each device gets a |
| consistent view of a SVM buffer's memory when it is shared by multiple |
| devices. |
| For example, assume that two devices share a SVM pointer. |
| The host can create a {cl_mem_TYPE} buffer object using {clCreateBuffer} or |
| {clCreateBufferWithProperties} with {CL_MEM_USE_HOST_PTR} and _host_ptr_ set |
| to the SVM pointer and then create two disjoint sub-buffers with starting |
| virtual addresses _sb1_ptr_ and _sb2_ptr_. |
| These pointers (_sb1_ptr_ and _sb2_ptr_) can be passed to kernels executing |
| on the two devices. |
| {clEnqueueMapBuffer} and {clEnqueueUnmapMemObject} and the existing |
| <<accessing-mapped-regions, access rules for memory objects>> ensure |
| consistency for buffer regions (_sb1_ptr_ and _sb2_ptr_) read and written by |
| these kernels. |
| |
| When the host and devices are able to use SVM atomic operations (i.e. |
| {CL_DEVICE_SVM_ATOMICS} is set in {CL_DEVICE_SVM_CAPABILITIES}), these atomic |
| operations can be used to provide memory consistency at a fine grain in a |
| shared memory region. |
| The effect of these operations is visible to the host and all devices with |
| which that memory is shared. |
| |
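The choice between the two consistency mechanisms described above can be sketched as a decision over the SVM capability bits of the devices that share an allocation.
The names below are hypothetical: the `SKETCH_` macros mirror the {CL_DEVICE_SVM_CAPABILITIES} bit values in the Khronos `CL/cl.h` header, and `choose_svm_sync` assumes the caller has already queried one capability bit-field per device.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit values mirroring the cl_device_svm_capabilities bits in CL/cl.h. */
#define SKETCH_SVM_COARSE_GRAIN_BUFFER (UINT64_C(1) << 0)
#define SKETCH_SVM_FINE_GRAIN_BUFFER   (UINT64_C(1) << 1)
#define SKETCH_SVM_FINE_GRAIN_SYSTEM   (UINT64_C(1) << 2)
#define SKETCH_SVM_ATOMICS             (UINT64_C(1) << 3)

typedef enum {
    SVM_SYNC_NONE,          /* at least one device reports no SVM support */
    SVM_SYNC_POINTS_ONLY,   /* rely on consistency at synchronization points */
    SVM_SYNC_ATOMICS        /* SVM atomic operations are usable everywhere */
} svm_sync_strategy;

/* Hypothetical helper: picks the strongest mechanism that every device in
 * device_caps[0 .. num_devices) can use. */
svm_sync_strategy choose_svm_sync(const uint64_t *device_caps,
                                  size_t num_devices)
{
    svm_sync_strategy result = SVM_SYNC_ATOMICS;

    if (num_devices == 0) {
        return SVM_SYNC_NONE;
    }
    for (size_t i = 0; i < num_devices; ++i) {
        if (device_caps[i] == 0) {
            return SVM_SYNC_NONE;
        }
        if (!(device_caps[i] & SKETCH_SVM_ATOMICS)) {
            result = SVM_SYNC_POINTS_ONLY;
        }
    }
    return result;
}
```

When the result is `SVM_SYNC_POINTS_ONLY`, the application falls back to the synchronization-point guarantees that apply to coarse-grained and fine-grained buffer allocations alike.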
| |
| == Sampler Objects |
| |
| A sampler object describes how to sample an image when the image is read in |
| the kernel. |
| The built-in functions to read from an image in a kernel take a sampler as |
| an argument. |
| The sampler arguments to the image read function can be sampler objects |
| created using OpenCL functions and passed as argument values to the kernel |
| or can be samplers declared inside a kernel. |
| In this section we discuss how sampler objects are created using OpenCL |
| functions. |
| |
| |
| === Creating Sampler Objects |
| |
| [open,refpage='clCreateSamplerWithProperties',desc='Creates a sampler object.',type='protos'] |
| -- |
| To create a sampler object, call the function |
| |
| include::{generated}/api/protos/clCreateSamplerWithProperties.txt[] |
| include::{generated}/api/version-notes/clCreateSamplerWithProperties.asciidoc[] |
| |
| * _context_ must be a valid OpenCL context. |
| * _sampler_properties_ specifies a list of sampler property names and their |
| corresponding values. |
| Each sampler property name is immediately followed by the corresponding |
| desired value. |
| The list is terminated with 0. |
| The list of supported properties is described in the |
| <<sampler-properties-table,Sampler Properties>> table. |
| If a supported property and its value are not specified in |
| _sampler_properties_, its default value will be used. |
| _sampler_properties_ can be `NULL` in which case the default values for |
| supported sampler properties will be used. |
| |
| [[sampler-properties-table]] |
| .List of supported sampler creation properties by <<clCreateSamplerWithProperties>> |
| [width="100%",cols="<33%,<17%,<50%",options="header"] |
| |==== |
| | Sampler Property | Property Value | Description |
| | {CL_SAMPLER_NORMALIZED_COORDS_anchor} |
| |
| include::{generated}/api/version-notes/CL_SAMPLER_NORMALIZED_COORDS.asciidoc[] |
| | {cl_bool_TYPE} |
| | A boolean value that specifies whether the image coordinates |
| specified are normalized or not. |
| |
| The default value (i.e. the value used if this property is not |
| specified in _sampler_properties_) is {CL_TRUE}. |
| | {CL_SAMPLER_ADDRESSING_MODE_anchor} |
| |
| include::{generated}/api/version-notes/CL_SAMPLER_ADDRESSING_MODE.asciidoc[] |
| | {cl_addressing_mode_TYPE} |
| | Specifies how out-of-range image coordinates are handled when |
| reading from an image. |
| Valid values are: |
| |
| {CL_ADDRESS_NONE_anchor} - Behavior is undefined for out-of-range |
| image coordinates. |
| |
| {CL_ADDRESS_CLAMP_TO_EDGE_anchor} - Out-of-range image coordinates |
| are clamped to the edge of the image. |
| |
| {CL_ADDRESS_CLAMP_anchor} - Out-of-range image coordinates are |
| assigned a border color value. |
| |
| {CL_ADDRESS_REPEAT_anchor} - Out-of-range image coordinates read |
| from the image as-if the image data were replicated in all dimensions. |
| |
| {CL_ADDRESS_MIRRORED_REPEAT_anchor} - Out-of-range image coordinates |
| read from the image as-if the image data were replicated in all |
| dimensions, mirroring the image contents at the edge of each |
| replication. |
| |
| The default is {CL_ADDRESS_CLAMP}. |
| | {CL_SAMPLER_FILTER_MODE_anchor} |
| |
| include::{generated}/api/version-notes/CL_SAMPLER_FILTER_MODE.asciidoc[] |
| | {cl_filter_mode_TYPE} |
| | Specifies the type of filter that is applied when reading an |
| image. |
| Valid values are: |
| |
| {CL_FILTER_NEAREST_anchor} - Returns the image element nearest |
| to the image coordinate. |
| |
| {CL_FILTER_LINEAR_anchor} - Returns a weighted average of the |
| four image elements nearest to the image coordinate. |
| |
| The default value is {CL_FILTER_NEAREST}. |
| |==== |
| * _errcode_ret_ will return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
| |
| // refError |
| |
| {clCreateSamplerWithProperties} returns a valid non-zero sampler object and |
| _errcode_ret_ is set to {CL_SUCCESS} if the sampler object is created |
| successfully. |
| Otherwise, it returns a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_VALUE} if the property name in _sampler_properties_ is not a |
| supported property name, if the value specified for a supported property |
| name is not valid, or if the same property name is specified more than |
| once. |
| * {CL_INVALID_OPERATION} if images are not supported by any device |
| associated with _context_ (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the |
| <<device-queries-table,Device Queries>> table is {CL_FALSE}). |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
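
The property-list rules and the {CL_INVALID_VALUE} conditions above can be pictured with a small host-side sketch. This is only an illustration of how a zero-terminated name/value list might be resolved against the defaults; it is not how any particular implementation works. The type `prop_t`, the `PROP_*`, `ADDR_*` and `FILTER_*` constants, and the function `resolve_sampler_properties` are hypothetical stand-ins that do not use the real header values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in type and constants for illustration only. Real code would use
 * cl_sampler_properties and the CL_SAMPLER_*, CL_ADDRESS_* and CL_FILTER_*
 * names from the OpenCL headers; the numeric values here are made up. */
typedef uint64_t prop_t;

enum { PROP_NORMALIZED_COORDS = 1, PROP_ADDRESSING_MODE, PROP_FILTER_MODE };
enum { ADDR_NONE, ADDR_CLAMP_TO_EDGE, ADDR_CLAMP, ADDR_REPEAT,
       ADDR_MIRRORED_REPEAT };
enum { FILTER_NEAREST, FILTER_LINEAR };

typedef struct {
    prop_t normalized_coords;
    prop_t addressing_mode;
    prop_t filter_mode;
} sampler_settings;

/* Returns 0 on success, or -1 for the cases the text maps to
 * CL_INVALID_VALUE: unknown name, invalid value, or repeated name. */
static int resolve_sampler_properties(const prop_t *props,
                                      sampler_settings *out)
{
    unsigned seen = 0;

    /* Defaults from the table: normalized coords, clamp, nearest. */
    out->normalized_coords = 1;
    out->addressing_mode = ADDR_CLAMP;
    out->filter_mode = FILTER_NEAREST;

    if (props == NULL)
        return 0; /* NULL list: every property keeps its default */

    /* Names sit at even indices; only a zero *name* ends the list,
     * so a value of zero does not terminate it. */
    for (size_t i = 0; props[i] != 0; i += 2) {
        prop_t name = props[i];
        prop_t value = props[i + 1];

        if (name < PROP_NORMALIZED_COORDS || name > PROP_FILTER_MODE)
            return -1; /* unsupported property name */

        unsigned bit = 1u << name;
        if (seen & bit)
            return -1; /* same property name given twice */
        seen |= bit;

        switch (name) {
        case PROP_NORMALIZED_COORDS:
            if (value > 1) return -1;
            out->normalized_coords = value;
            break;
        case PROP_ADDRESSING_MODE:
            if (value > ADDR_MIRRORED_REPEAT) return -1;
            out->addressing_mode = value;
            break;
        default: /* PROP_FILTER_MODE */
            if (value > FILTER_LINEAR) return -1;
            out->filter_mode = value;
            break;
        }
    }
    return 0;
}
```

Passing `NULL` leaves all three settings at their defaults, while a list that names only the filter mode changes that one setting and leaves the others untouched.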
| -- |
| |
| [open,refpage='clCreateSampler',desc='Creates a sampler object.',type='protos'] |
| -- |
| To create a sampler object, call the function |
| |
| include::{generated}/api/protos/clCreateSampler.txt[] |
| include::{generated}/api/version-notes/clCreateSampler.asciidoc[] |
| |
| * _context_ must be a valid OpenCL context. |
| * _normalized_coords_ has the same interpretation as |
| {CL_SAMPLER_NORMALIZED_COORDS} in the <<sampler-properties-table, sampler |
| creation properties table>>. |
| * _addressing_mode_ has the same interpretation as |
| {CL_SAMPLER_ADDRESSING_MODE} in the <<sampler-properties-table, sampler |
| creation properties table>>. |
| * _filter_mode_ has the same interpretation as |
| {CL_SAMPLER_FILTER_MODE} in the <<sampler-properties-table, sampler |
| creation properties table>>. |
| * _errcode_ret_ will return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
| |
| // refError |
| |
| {clCreateSampler} returns a valid non-zero sampler object and _errcode_ret_ is |
| set to {CL_SUCCESS} if the sampler object is created successfully. |
| Otherwise, it returns a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_VALUE} if _addressing_mode_, _filter_mode_, _normalized_coords_ |
| or a combination of these arguments is not valid. |
| * {CL_INVALID_OPERATION} if images are not supported by any device |
| associated with _context_ (i.e. {CL_DEVICE_IMAGE_SUPPORT} specified in the |
| <<device-queries-table,Device Queries>> table is {CL_FALSE}). |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clRetainSampler',desc='Increments the sampler reference count.',type='protos'] |
| -- |
| To retain a sampler object, call the function |
| |
| include::{generated}/api/protos/clRetainSampler.txt[] |
| include::{generated}/api/version-notes/clRetainSampler.asciidoc[] |
| |
| * _sampler_ specifies the sampler to be retained. |
| |
| The _sampler_ reference count is incremented. |
| {clCreateSamplerWithProperties} and {clCreateSampler} perform an implicit |
| retain. |
| |
| // refError |
| |
| {clRetainSampler} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_SAMPLER} if _sampler_ is not a valid sampler object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clReleaseSampler',desc='Decrements the sampler reference count.',type='protos'] |
| -- |
| To release a sampler object, call the function |
| |
| include::{generated}/api/protos/clReleaseSampler.txt[] |
| include::{generated}/api/version-notes/clReleaseSampler.asciidoc[] |
| |
| * _sampler_ specifies the sampler to be released. |
| |
| The _sampler_ reference count is decremented. |
| The sampler object is deleted after the reference count becomes zero and |
| commands queued for execution on any command-queue(s) that use _sampler_ have |
| finished. |
| |
| // refError |
| |
| {clReleaseSampler} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_SAMPLER} if _sampler_ is not a valid sampler object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
| Using this function to release a reference that was not obtained by creating |
| the object or by calling {clRetainSampler} causes undefined behavior. |
| -- |
| |
| |
| === Sampler Object Queries |
| |
| [open,refpage='clGetSamplerInfo',desc='Returns information about the sampler object.',type='protos'] |
| -- |
| To return information about a sampler object, call the function |
| |
| include::{generated}/api/protos/clGetSamplerInfo.txt[] |
| include::{generated}/api/version-notes/clGetSamplerInfo.asciidoc[] |
| |
| * _sampler_ specifies the sampler being queried. |
| * _param_name_ specifies the information to query. |
| The list of supported _param_name_ types and the information returned in |
| _param_value_ by {clGetSamplerInfo} is described in the |
| <<sampler-info-table,Sampler Object Queries>> table. |
| * _param_value_ is a pointer to memory where the appropriate result being |
| queried is returned. |
| If _param_value_ is `NULL`, it is ignored. |
| * _param_value_size_ is used to specify the size in bytes of memory pointed to |
| by _param_value_. |
| This size must be {geq} the size of the return type as described in the |
| <<sampler-info-table,Sampler Object Queries>> table. |
| * _param_value_size_ret_ returns the actual size in bytes of data being |
| queried by _param_name_. |
| If _param_value_size_ret_ is `NULL`, it is ignored. |
| |
| [[sampler-info-table]] |
| .List of supported param_names by <<clGetSamplerInfo>> |
| [width="100%",cols="<33%,<17%,<50%",options="header"] |
| |==== |
| | Sampler Info | Return Type | Description |
| | {CL_SAMPLER_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}] |
| |
| include::{generated}/api/version-notes/CL_SAMPLER_REFERENCE_COUNT.asciidoc[] |
| | {cl_uint_TYPE} |
| | Return the _sampler_ reference count. |
| | {CL_SAMPLER_CONTEXT_anchor} |
| |
| include::{generated}/api/version-notes/CL_SAMPLER_CONTEXT.asciidoc[] |
| | {cl_context_TYPE} |
| | Return the context specified when the sampler is created. |
| | {CL_SAMPLER_NORMALIZED_COORDS_anchor} |
| |
| include::{generated}/api/version-notes/CL_SAMPLER_NORMALIZED_COORDS.asciidoc[] |
| | {cl_bool_TYPE} |
| | Return the normalized coords value associated with _sampler_. |
| | {CL_SAMPLER_ADDRESSING_MODE_anchor} |
| |
| include::{generated}/api/version-notes/CL_SAMPLER_ADDRESSING_MODE.asciidoc[] |
| | {cl_addressing_mode_TYPE} |
| | Return the addressing mode value associated with _sampler_. |
| | {CL_SAMPLER_FILTER_MODE_anchor} |
| |
| include::{generated}/api/version-notes/CL_SAMPLER_FILTER_MODE.asciidoc[] |
| | {cl_filter_mode_TYPE} |
| | Return the filter mode value associated with _sampler_. |
| |
| | {CL_SAMPLER_PROPERTIES_anchor} |
| |
| include::{generated}/api/version-notes/CL_SAMPLER_PROPERTIES.asciidoc[] |
| | {cl_sampler_properties_TYPE}[] |
| | Return the properties argument specified in |
| {clCreateSamplerWithProperties}. |
| |
| If the _sampler_properties_ argument specified in |
| {clCreateSamplerWithProperties} used to create _sampler_ was not `NULL`, the |
| implementation must return the values specified in that argument in the same |
| order and without including additional properties. |
| |
| If _sampler_ was created using {clCreateSampler}, or if the |
| _sampler_properties_ argument specified in {clCreateSamplerWithProperties} |
| was `NULL`, the implementation must return _param_value_size_ret_ equal to 0, |
| indicating that there are no properties to be returned. |
| |==== |
| |
| // refError |
| |
| {clGetSamplerInfo} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes |
| specified by _param_value_size_ is < size of return type as described in |
| the <<sampler-info-table,Sampler Object Queries>> table and |
| _param_value_ is not `NULL`. |
| * {CL_INVALID_SAMPLER} if _sampler_ is not a valid sampler object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
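
The interplay of _param_value_, _param_value_size_ and _param_value_size_ret_ described above can be sketched from the implementation's side. The helper `copy_query_result` below is hypothetical and is not part of the OpenCL API; it only models the stated rules (a `NULL` _param_value_ is ignored, a too-small buffer is an error, and the size is reported when requested).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper modelling the query rules in the text.
 * Returns 0 on success and -1 where the text specifies CL_INVALID_VALUE. */
static int copy_query_result(const void *result, size_t result_size,
                             size_t param_value_size, void *param_value,
                             size_t *param_value_size_ret)
{
    if (param_value != NULL) {
        /* The size check applies only when a destination is supplied. */
        if (param_value_size < result_size)
            return -1;
        if (result_size > 0)
            memcpy(param_value, result, result_size);
    }

    /* The actual size is reported whenever the caller asks for it. */
    if (param_value_size_ret != NULL)
        *param_value_size_ret = result_size;

    return 0;
}
```

A caller can therefore pass a `NULL` _param_value_ first to learn the required size, then query again with a buffer of at least that size. A reported size of 0 corresponds to the "no properties to be returned" case in the table.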
| -- |
| |
| |
| == Program Objects |
| |
| An OpenCL program consists of a set of kernels that are identified as |
| functions declared with the `+__kernel+` qualifier in the program source. |
| OpenCL programs may also contain auxiliary functions and constant data that |
| can be used by kernel functions. |
| The program executable can be generated _online_ or _offline_ by the OpenCL |
| compiler for the appropriate target device(s). |
| |
| A program object encapsulates the following information: |
| |
| * An associated context. |
| * A program source or binary. |
| * The latest successfully built program executable, library or compiled |
| binary, the list of devices for which the program executable, library or |
| compiled binary is built, the build options used and a build log. |
| * The number of kernel objects currently attached. |
| |
| |
| === Creating Program Objects |
| |
| [open,refpage='clCreateProgramWithSource',desc='Creates a program object for a context, and loads source code specified by text strings into the program object.',type='protos'] |
| -- |
| To create a program object for a context and load source code into that |
| object, call the function |
| |
| include::{generated}/api/protos/clCreateProgramWithSource.txt[] |
| include::{generated}/api/version-notes/clCreateProgramWithSource.asciidoc[] |
| |
| * _context_ must be a valid OpenCL context. |
| * _strings_ is an array of _count_ pointers to optionally null-terminated |
| character strings that make up the source code. |
| * _lengths_ is an array with the number of chars in each string |
| (the string length). |
| If an element in _lengths_ is zero, its accompanying string is |
| null-terminated. |
| If _lengths_ is `NULL`, all strings in the _strings_ argument are considered |
| null-terminated. |
| Any length value passed in that is greater than zero excludes the null |
| terminator in its count. |
| * _errcode_ret_ will return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
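
The rules for _lengths_ can be made concrete with a short sketch that computes the effective length of each string and joins the pieces into one buffer. Treating the source as a simple concatenation is an assumption made here for illustration, and the helper names `source_length` and `join_sources` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Effective length of strings[i]: a NULL lengths array or a zero entry
 * means the string is null-terminated; otherwise the given count is used
 * and does not include any terminator. */
static size_t source_length(const char **strings, const size_t *lengths,
                            size_t i)
{
    if (lengths == NULL || lengths[i] == 0)
        return strlen(strings[i]);
    return lengths[i];
}

/* Joins count strings into a newly allocated, null-terminated buffer.
 * Returns NULL when count is zero, when strings or any entry is NULL
 * (the CL_INVALID_VALUE cases), or when allocation fails.
 * The caller frees the result. */
static char *join_sources(size_t count, const char **strings,
                          const size_t *lengths)
{
    if (count == 0 || strings == NULL)
        return NULL;

    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        if (strings[i] == NULL)
            return NULL;
        total += source_length(strings, lengths, i);
    }

    char *out = malloc(total + 1);
    if (out == NULL)
        return NULL;

    size_t pos = 0;
    for (size_t i = 0; i < count; i++) {
        size_t n = source_length(strings, lengths, i);
        memcpy(out + pos, strings[i], n);
        pos += n;
    }
    out[pos] = '\0';
    return out;
}
```

With `lengths = { 0, 3 }`, the first string is read up to its terminator and only the first three chars of the second string are used.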
| |
| The source code specified by _strings_ will be loaded into the program |
| object. |
| |
| The devices associated with the program object are the devices associated |
| with _context_. |
| The source code specified by _strings_ is either an OpenCL C program source, |
| header or implementation-defined source for custom devices that support an |
| online compiler. |
| OpenCL {cpp} is not supported as an online-compiled kernel language through |
| this interface. |
| |
| // refError |
| |
| {clCreateProgramWithSource} returns a valid non-zero program object and |
| _errcode_ret_ is set to {CL_SUCCESS} if the program object is created |
| successfully. |
| Otherwise, it returns a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_VALUE} if _count_ is zero or if _strings_ or any entry in |
| _strings_ is `NULL`. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clCreateProgramWithIL',desc='Creates a program object for a context, and loads the IL into the program object.',type='protos'] |
| -- |
| To create a program object for a context and load code in an intermediate |
| language into that object, call the function |
| |
| include::{generated}/api/protos/clCreateProgramWithIL.txt[] |
| include::{generated}/api/version-notes/clCreateProgramWithIL.asciidoc[] |
| Also see extension *cl_khr_il_program*. |
| |
| * _context_ must be a valid OpenCL context. |
| * _il_ is a pointer to a _length_-byte block of memory containing SPIR-V or an |
| implementation-defined intermediate language. |
| * _errcode_ret_ will return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
| |
| The intermediate language pointed to by _il_ and with length in bytes _length_ |
| will be loaded into the program object. |
| The devices associated with the program object are the devices associated |
| with _context_. |
| |
| // refError |
| |
| {clCreateProgramWithIL} returns a valid non-zero program object and |
| _errcode_ret_ is set to {CL_SUCCESS} if the program object is created |
| successfully. |
| Otherwise, it returns a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_OPERATION} if no devices in _context_ support intermediate |
| language programs. |
| * {CL_INVALID_VALUE} if _il_ is `NULL` or if _length_ is zero. |
| * {CL_INVALID_VALUE} if the _length_-byte memory pointed to by _il_ does not |
| contain well-formed intermediate language input that can be consumed by |
| the OpenCL runtime. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clCreateProgramWithBinary',desc='Creates a program object for a context, and loads binary bits into the program object.',type='protos'] |
| -- |
| To create a program object for a context and load binary bits into that |
| object, call the function |
| |
| include::{generated}/api/protos/clCreateProgramWithBinary.txt[] |
| include::{generated}/api/version-notes/clCreateProgramWithBinary.asciidoc[] |
| |
| * _context_ must be a valid OpenCL context. |
| * _device_list_ is a pointer to a list of devices that are in _context_. |
| _device_list_ must be a non-`NULL` value. |
| The binaries are loaded for devices specified in this list. |
| * _num_devices_ is the number of devices listed in _device_list_. |
| * _lengths_ is an array of the size in bytes of the program binaries to be |
| loaded for devices specified by _device_list_. |
| * _binaries_ is an array of pointers to program binaries to be loaded for |
| devices specified by _device_list_. |
| For each device given by _device_list_[i], the pointer to the program binary |
| for that device is given by _binaries_[i] and the length of this |
| corresponding binary is given by _lengths_[i]. |
| _lengths_[i] cannot be zero and _binaries_[i] cannot be a `NULL` pointer. |
| |
| The devices associated with the program object will be the list of devices |
| specified by _device_list_. |
| The list of devices specified by _device_list_ must be devices associated |
| with _context_. |
| |
| The program binaries specified by _binaries_ will be loaded into the program |
| object. |
| They contain bits that describe one of the following: |
| |
| * a program executable to be run on the device(s) associated with |
| _context_, |
| * a compiled program for device(s) associated with _context_, or |
| * a library of compiled programs for device(s) associated with _context_. |
| |
| The program binary can consist of either or both of the following: |
| |
| * Device-specific code. |
| * Implementation-specific intermediate representation (IR), which will be |
| converted to the device-specific code. |
| |
| * _binary_status_ returns whether the program binary for each device specified |
| in _device_list_ was loaded successfully or not. |
| It is an array of _num_devices_ entries. |
| _binary_status_[i] is set to {CL_SUCCESS} if the binary was successfully |
| loaded for the device specified by _device_list_[i], to {CL_INVALID_VALUE} if |
| _lengths_[i] is zero or _binaries_[i] is a `NULL` value, or to |
| {CL_INVALID_BINARY} if the program binary is not a valid binary for the |
| specified device. |
| If _binary_status_ is `NULL`, it is ignored. |
| * _errcode_ret_ will return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
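
The per-device rules for _binary_status_ can be sketched as a loop over the parallel _lengths_ and _binaries_ arrays. Everything below is illustrative: the `ST_*` codes stand in for the real error values, `binary_check_fn` is a hypothetical validity predicate, and returning the first failure as the overall result is an assumption of this sketch, not something the text specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in status codes; real code uses CL_SUCCESS, CL_INVALID_VALUE and
 * CL_INVALID_BINARY from the OpenCL headers. */
enum { ST_SUCCESS = 0, ST_INVALID_VALUE = -1, ST_INVALID_BINARY = -2 };

/* Hypothetical predicate: does this blob look valid for a device? */
typedef int (*binary_check_fn)(const unsigned char *binary, size_t length);

/* Fills binary_status[i] for each device when binary_status is not NULL.
 * Returns the first non-success status (a choice made for this sketch). */
static int fill_binary_status(size_t num_devices, const size_t *lengths,
                              const unsigned char **binaries,
                              binary_check_fn is_valid, int *binary_status)
{
    int result = ST_SUCCESS;

    for (size_t i = 0; i < num_devices; i++) {
        int st;
        if (lengths[i] == 0 || binaries[i] == NULL)
            st = ST_INVALID_VALUE;
        else if (!is_valid(binaries[i], lengths[i]))
            st = ST_INVALID_BINARY;
        else
            st = ST_SUCCESS;

        if (binary_status != NULL) /* a NULL status array is ignored */
            binary_status[i] = st;
        if (st != ST_SUCCESS && result == ST_SUCCESS)
            result = st;
    }
    return result;
}

/* Toy validity rule used only in this sketch: first byte must be 0x7f. */
static int toy_is_valid(const unsigned char *binary, size_t length)
{
    return length > 0 && binary[0] == 0x7f;
}
```

Checking the status array entry by entry lets an application tell a missing binary apart from one that was supplied but rejected for a particular device.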
| |
| OpenCL allows applications to create a program object using the program |
| source or binary and build appropriate program executables. |
| This can be very useful: an application can load program source and then |
| compile and link it online, the first time it runs, to generate program |
| executables for the appropriate OpenCL devices in the system. |
| These executables can then be queried and cached by the application. |
| The cached executables can later be read and loaded by the application, which |
| can help significantly reduce the application initialization time. |
| |
| // refError |
| |
| {clCreateProgramWithBinary} returns a valid non-zero program object and |
| _errcode_ret_ is set to {CL_SUCCESS} if the program object is created |
| successfully. |
| Otherwise, it returns a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_VALUE} if _device_list_ is `NULL` or _num_devices_ is zero. |
| * {CL_INVALID_DEVICE} if any device in _device_list_ is not in |
| the list of devices associated with _context_. |
| * {CL_INVALID_VALUE} if _lengths_ or _binaries_ is `NULL`, or if any |
| _lengths_[i] is zero or any _binaries_[i] is `NULL`. |
| * {CL_INVALID_BINARY} if an invalid program binary was encountered for any |
| device. |
| _binary_status_ will return specific status for each device. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clCreateProgramWithBuiltInKernels',desc='Creates a program object for a context, and loads the information related to the built-in kernels into a program object.',type='protos'] |
| -- |
| To create a program object for a context and load the information related |
| to the built-in kernels into that object, call the function |
| |
| include::{generated}/api/protos/clCreateProgramWithBuiltInKernels.txt[] |
| include::{generated}/api/version-notes/clCreateProgramWithBuiltInKernels.asciidoc[] |
| |
| * _context_ must be a valid OpenCL context. |
| * _num_devices_ is the number of devices listed in _device_list_. |
| * _device_list_ is a pointer to a list of devices that are in _context_. |
| _device_list_ must be a non-`NULL` value. |
| The built-in kernels are loaded for devices specified in this list. |
| * _kernel_names_ is a semicolon-separated list of built-in kernel names. |
| |
| The devices associated with the program object will be the list of devices |
| specified by _device_list_. |
| The list of devices specified by _device_list_ must be devices associated |
| with _context_. |
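
As an illustration of the _kernel_names_ format, the sketch below walks a list of names separated by semicolons. The helpers `count_kernel_names` and `list_contains_kernel` are hypothetical, and skipping empty entries without trimming whitespace is an assumption of this sketch, since the text does not say how such input is treated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts the non-empty entries in a list such as "a;b;c". */
static size_t count_kernel_names(const char *kernel_names)
{
    size_t count = 0;
    const char *p = kernel_names;

    while (*p != '\0') {
        size_t n = strcspn(p, ";"); /* length of the current entry */
        if (n > 0)
            count++;
        p += n;
        if (*p == ';')
            p++; /* step over the separator */
    }
    return count;
}

/* Returns 1 if name matches a whole entry of the list, otherwise 0. */
static int list_contains_kernel(const char *kernel_names, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = kernel_names;

    while (*p != '\0') {
        size_t n = strcspn(p, ";");
        if (n == name_len && memcmp(p, name, n) == 0)
            return 1;
        p += n;
        if (*p == ';')
            p++;
    }
    return 0;
}
```

A check like `list_contains_kernel` could be run against the names an application intends to request before building the list; the kernel names used with it are whatever the target devices report, not anything defined here.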
| |
| // refError |
| |
| {clCreateProgramWithBuiltInKernels} returns a valid non-zero program object |
| and _errcode_ret_ is set to {CL_SUCCESS} if the program object is created |
| successfully. |
| Otherwise, it returns a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_VALUE} if _device_list_ is `NULL` or _num_devices_ is zero. |
| * {CL_INVALID_VALUE} if _kernel_names_ is `NULL` or _kernel_names_ contains |
| a kernel name that is not supported by any of the devices in |
| _device_list_. |
| * {CL_INVALID_DEVICE} if any device in _device_list_ is not in the list |
| of devices associated with _context_. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| |
| === Retaining and Releasing Program Objects |
| |
| [open,refpage='clRetainProgram',desc='Increments the program reference count.',type='protos'] |
| -- |
| To retain a program object, call the function |
| |
| include::{generated}/api/protos/clRetainProgram.txt[] |
| include::{generated}/api/version-notes/clRetainProgram.asciidoc[] |
| |
| * _program_ is the program object to be retained. |
| |
| The _program_ reference count is incremented. |
| All APIs that create a program do an implicit retain. |
| |
| // refError |
| |
| {clRetainProgram} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_PROGRAM} if _program_ is not a valid program object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clReleaseProgram',desc='Decrements the program reference count.',type='protos'] |
| -- |
| To release a program object, call the function |
| |
| include::{generated}/api/protos/clReleaseProgram.txt[] |
| include::{generated}/api/version-notes/clReleaseProgram.asciidoc[] |
| |
| * _program_ is the program object to be released. |
| |
| The _program_ reference count is decremented. |
| The program object is deleted after all kernel objects associated with |
| _program_ have been deleted and the _program_ reference count becomes zero. |
| |
| // refError |
| |
| {clReleaseProgram} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_PROGRAM} if _program_ is not a valid program object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
| Using this function to release a reference that was not obtained by creating |
| the object or by calling {clRetainProgram} causes undefined behavior. |
| -- |
| |
| [open,refpage='clSetProgramReleaseCallback',desc='Registers a destructor callback function with a program object.',type='protos'] |
| -- |
| To register a callback function with a program object that is called when |
| the program object is destroyed, call the function |
| |
| include::{generated}/api/protos/clSetProgramReleaseCallback.txt[] |
| include::{generated}/api/version-notes/clSetProgramReleaseCallback.asciidoc[] |
| |
| * _program_ specifies the program object to register the callback to. |
| * _pfn_notify_ is the callback function to register. |
| This callback function may be called asynchronously by the OpenCL |
| implementation. |
| It is the application's responsibility to ensure that the callback function |
| is thread safe. |
| The parameters to this callback function are: |
| ** _program_ is the program being deleted. |
| When the callback function is called by the implementation, this program |
| object is no longer valid. |
| _program_ is only provided for reference purposes. |
| ** _user_data_ is a pointer to user supplied data. |
| * _user_data_ will be passed as the _user_data_ argument when _pfn_notify_ is |
| called. |
| _user_data_ can be `NULL`. |
| |
| Each call to {clSetProgramReleaseCallback} registers the specified |
| callback function on a callback stack associated with _program_. |
| The registered callback functions are called in the reverse order in |
| which they were registered. |
| The registered callback functions are called after destructors (if any) for |
| program scope global variables (if any) are called and before the program |
| object is deleted. |
| This provides a mechanism for an application to be notified when destructors |
| for program scope global variables are complete. |
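
The callback stack behaviour described above (last registered, first called) can be modelled in a few lines. This is a simplified sketch rather than the implementation's mechanism: the real callback also receives the program handle, and the names `callback_stack`, `push_callback`, `run_callbacks` and the fixed `MAX_CALLBACKS` limit are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CALLBACKS 8 /* arbitrary limit chosen for this sketch */

/* Simplified callback type; the program argument is omitted here. */
typedef void (*release_fn)(void *user_data);

typedef struct {
    release_fn fn[MAX_CALLBACKS];
    void *user_data[MAX_CALLBACKS];
    size_t count;
} callback_stack;

/* Registers a callback; returns -1 for a NULL function (the case the text
 * maps to CL_INVALID_VALUE) or when the sketch's fixed stack is full. */
static int push_callback(callback_stack *s, release_fn fn, void *user_data)
{
    if (fn == NULL || s->count == MAX_CALLBACKS)
        return -1;
    s->fn[s->count] = fn;
    s->user_data[s->count] = user_data; /* may be NULL */
    s->count++;
    return 0;
}

/* Calls the callbacks in reverse order of registration. */
static void run_callbacks(callback_stack *s)
{
    while (s->count > 0) {
        s->count--;
        s->fn[s->count](s->user_data[s->count]);
    }
}

/* Example callback: records the char pointed to by user_data. */
static char order_log[MAX_CALLBACKS + 1];
static size_t order_len;

static void record(void *user_data)
{
    order_log[order_len++] = *(const char *)user_data;
    order_log[order_len] = '\0';
}
```

Registering callbacks tagged `A`, `B` and `C` in that order and then running the stack records them as `C`, `B`, `A`.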
| |
| {clSetProgramReleaseCallback} may unconditionally return an error if no |
| devices in the context associated with _program_ support destructors for |
| program scope global variables. |
| Support for constructors and destructors for program scope global variables |
| is required only for OpenCL 2.2 devices. |
| |
| // refError |
| |
| {clSetProgramReleaseCallback} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_PROGRAM} if _program_ is not a valid program object. |
| * {CL_INVALID_OPERATION} if no devices in the context associated with |
| _program_ support destructors for program scope global variables. |
| * {CL_INVALID_VALUE} if _pfn_notify_ is `NULL`. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| |
| === Setting SPIR-V specialization constants |
| |
| NOTE: Specialization constants are <<unified-spec, missing before>> version |
| 2.2. |
| |
| [open,refpage='clSetProgramSpecializationConstant',desc='',type='protos'] |
| -- |
| To set the value of a specialization constant, call the function |
| |
| include::{generated}/api/protos/clSetProgramSpecializationConstant.txt[] |
| include::{generated}/api/version-notes/clSetProgramSpecializationConstant.asciidoc[] |
| |
| * _program_ must be a valid OpenCL program created from an intermediate |
| language (e.g. SPIR-V). |
| * _spec_id_ identifies the specialization constant whose value will be |
| set. |
| * _spec_size_ specifies the size in bytes of the data pointed to by |
| _spec_value_. |
| This should be 1 for boolean constants. |
| For all other constant types this should match the size of the |
| specialization constant in the module. |
| * _spec_value_ is a pointer to the memory location that contains the value of |
| the specialization constant. |
| The data pointed to by _spec_value_ are copied and can be safely reused by |
| the application after {clSetProgramSpecializationConstant} returns. |
| This specialization value will be used by subsequent calls to |
| {clBuildProgram} until another call to {clSetProgramSpecializationConstant} |
| changes it. |
| If a specialization constant is a boolean constant, _spec_value_ should be a |
| pointer to a {cl_uchar_TYPE} value. |
| A value of zero will set the specialization constant to false; any other |
| value will set it to true. |
| |
| Calling this function multiple times for the same specialization constant |
| shall cause the last provided value to override any previously specified |
| value. |
| The values are used by a subsequent {clBuildProgram} call for the _program_. |
| |
| An application is not required to provide values for every specialization |
| constant contained in the module. If a value is not set by this API |
| call, the default value will be used during the build. |
| |
| // refError |
| |
| {clSetProgramSpecializationConstant} returns {CL_SUCCESS} if the function is |
| executed successfully. |
| |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_PROGRAM} if _program_ is not a valid program object created |
| from an intermediate language (e.g. SPIR-V), or if the intermediate |
| language does not support specialization constants. |
| * {CL_INVALID_OPERATION} if no devices associated with _program_ support |
| intermediate language programs. |
| * {CL_COMPILER_NOT_AVAILABLE} if _program_ is created with |
| {clCreateProgramWithIL} and a compiler is not |
| available, i.e. {CL_DEVICE_COMPILER_AVAILABLE} specified in the |
| <<device-queries-table,Device Queries>> table is set to {CL_FALSE}. |
| * {CL_INVALID_SPEC_ID} if _spec_id_ is not a valid specialization constant |
| identifier. |
| * {CL_INVALID_VALUE} if _spec_size_ does not match the size of the |
| specialization constant in the module, or if _spec_value_ is |
| `NULL`. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| |
| === Building Program Executables |
| |
| [open,refpage='clBuildProgram',desc='Builds (compiles and links) a program executable from the program source or binary.',type='protos'] |
| -- |
| To build (compile and link) a program executable, call the function |
| |
| include::{generated}/api/protos/clBuildProgram.txt[] |
| include::{generated}/api/version-notes/clBuildProgram.asciidoc[] |
| |
| * _program_ is the program object. |
| * _device_list_ is a pointer to a list of devices associated with _program_. |
| If _device_list_ is a `NULL` value, the program executable is built for all |
| devices associated with _program_ for which a source or binary has been |
| loaded. |
| If _device_list_ is a non-`NULL` value, the program executable is built for |
| devices specified in this list for which a source or binary has been loaded. |
| * _num_devices_ is the number of devices listed in _device_list_. |
| * _options_ is a pointer to a null-terminated string of characters that |
| describes the build options to be used for building the program executable. |
| The list of supported options is described in <<compiler-options, Compiler |
| Options>>. |
| If the program was created using {clCreateProgramWithBinary} and _options_ |
| is a `NULL` pointer, the program will be built as if _options_ were the same |
| as when the program binary was originally built. |
| If the program was created using {clCreateProgramWithBinary} and _options_ |
| string contains anything other than the same options in the same order |
| (whitespace ignored) as when the program binary was originally built, then |
| the behavior is implementation defined. |
| Otherwise, if _options_ is a `NULL` pointer then it will have the same |
| result as the empty string. |
| * _pfn_notify_ is a function pointer to a notification routine. |
| The notification routine is a callback function that an application can |
| register and which will be called when the program executable has been built |
| (successfully or unsuccessfully). |
| If _pfn_notify_ is not `NULL`, {clBuildProgram} does not need to wait for |
| the build to complete and can return immediately once the build operation |
| can begin. |
| Any state changes of the program object that result from calling |
| {clBuildProgram} (e.g. build status or log) will be observable from this |
| callback function. |
| The build operation can begin if the context, program whose sources are |
| being compiled and linked, list of devices and build options specified are |
| all valid and appropriate host and device resources needed to perform the |
| build are available. |
| If _pfn_notify_ is `NULL`, {clBuildProgram} does not return until the build |
| has completed. |
| This callback function may be called asynchronously by the OpenCL |
| implementation. |
| It is the application's responsibility to ensure that the callback function |
| is thread-safe. |
| ** _user_data_ will be passed as an argument when _pfn_notify_ is called. |
| _user_data_ can be `NULL`. |
| |
| The program executable is built from the program source or binary for all |
| the devices, or a specific device(s) in the OpenCL context associated with |
| _program_. |
| OpenCL allows program executables to be built using the source or the |
| binary. |
| {clBuildProgram} must be called for _program_ created using |
| {clCreateProgramWithSource}, {clCreateProgramWithIL} or |
| {clCreateProgramWithBinary} to build the program executable for one or more |
| devices associated with _program_. |
| If _program_ is created with {clCreateProgramWithBinary}, then the program |
| binary must be an executable binary (not a compiled binary or library). |
| |
| The executable binary can be queried using {clGetProgramInfo}(_program_, |
| {CL_PROGRAM_BINARIES}, ...) and can be specified to |
| {clCreateProgramWithBinary} to create a new program object. |
| |
| // refError |
| |
| {clBuildProgram} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_PROGRAM} if _program_ is not a valid program object. |
| * {CL_INVALID_VALUE} if _device_list_ is `NULL` and _num_devices_ is greater |
| than zero, or if _device_list_ is not `NULL` and _num_devices_ is zero. |
| * {CL_INVALID_VALUE} if _pfn_notify_ is `NULL` but _user_data_ is not |
| `NULL`. |
| * {CL_INVALID_DEVICE} if any device in _device_list_ is not in |
| the list of devices associated with _program_. |
| * {CL_INVALID_BINARY} if _program_ is created with |
| {clCreateProgramWithBinary} and devices listed in _device_list_ do not |
| have a valid program binary loaded. |
| * {CL_INVALID_BUILD_OPTIONS} if the build options specified by _options_ are |
| invalid. |
| * {CL_COMPILER_NOT_AVAILABLE} if _program_ is created with |
| {clCreateProgramWithSource} or {clCreateProgramWithIL} and a compiler is |
| not available, i.e. {CL_DEVICE_COMPILER_AVAILABLE} specified in the |
| <<device-queries-table,Device Queries>> table is set to {CL_FALSE}. |
| * {CL_BUILD_PROGRAM_FAILURE} if there is a failure to build the program |
| executable. |
| This error will be returned if {clBuildProgram} does not return until |
| the build has completed. |
| * {CL_INVALID_OPERATION} if the build of a program executable for any of the |
| devices listed in _device_list_ by a previous call to {clBuildProgram} |
| for _program_ has not completed. |
| * {CL_INVALID_OPERATION} if there are kernel objects attached to _program_. |
| * {CL_INVALID_OPERATION} if _program_ was not created with |
| {clCreateProgramWithSource}, {clCreateProgramWithIL} or |
| {clCreateProgramWithBinary}. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
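The two {CL_INVALID_VALUE} rules listed above for {clBuildProgram} can be checked by an application before the call is made.
The following is a minimal sketch of such a pre-flight check, not part of the OpenCL API: the names `check_build_args`, `ARGS_OK` and `ARGS_INVALID_VALUE` are hypothetical, and the callback is reduced to a flag so that the sketch does not depend on the OpenCL headers.

```c
#include <stddef.h>

/* Hypothetical result codes for this sketch; a real application would
 * compare against CL_SUCCESS and CL_INVALID_VALUE from the OpenCL headers. */
enum arg_check { ARGS_OK = 0, ARGS_INVALID_VALUE = 1 };

/* Mirrors the two CL_INVALID_VALUE rules listed for clBuildProgram:
 *  - device_list and num_devices must both be empty or both be set;
 *  - user_data must be NULL when no callback is registered. */
enum arg_check check_build_args(const void *device_list,
                                unsigned num_devices,
                                int has_callback,
                                const void *user_data)
{
    if (device_list == NULL && num_devices > 0)
        return ARGS_INVALID_VALUE;
    if (device_list != NULL && num_devices == 0)
        return ARGS_INVALID_VALUE;
    if (!has_callback && user_data != NULL)
        return ARGS_INVALID_VALUE;
    return ARGS_OK;
}
```

The same two rules are listed for {clCompileProgram} and {clLinkProgram}, so the check could plausibly be reused for those calls.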
| |
| |
| === Separate Compilation and Linking of Programs |
| |
| NOTE: Separate compilation and linking are <<unified-spec, missing before>> |
| version 1.2. |
| |
| OpenCL programs are compiled and linked to support the following: |
| |
| * Separate compilation and link stages. |
| Program sources can be compiled to generate a compiled binary object and |
| linked in a separate stage with other compiled program objects to the |
| program executable. |
| * Embedded headers. |
| In OpenCL 1.0 and 1.1, the `-I` build option could be used to specify the |
| list of directories to be searched for header files that are included |
| by the program source(s). |
| OpenCL 1.2 extends this by allowing the header sources to come from |
| program objects instead of just header files. |
| * Libraries. |
| The linker can be used to link compiled objects and libraries into a |
| program executable or to create a library of compiled binaries. |
| |
| [open,refpage='clCompileProgram',desc='Compiles a program\'s source for all the devices or a specific device(s) in the OpenCL context associated with a program.',type='protos'] |
| -- |
| To compile a program's source for all the devices or a specific device(s) in |
| the OpenCL context associated with the program, call the function |
| |
| include::{generated}/api/protos/clCompileProgram.txt[] |
| include::{generated}/api/version-notes/clCompileProgram.asciidoc[] |
| |
| * _program_ is the program object that is the compilation target. |
| * _device_list_ is a pointer to a list of devices associated with _program_. |
| If _device_list_ is a `NULL` value, the compile is performed for all devices |
| associated with _program_. |
| If _device_list_ is a non-`NULL` value, the compile is performed for devices |
| specified in this list. |
| * _num_devices_ is the number of devices listed in _device_list_. |
| * _options_ is a pointer to a null-terminated string of characters that |
| describes the compilation options to be used for building the program |
| executable. |
| If _options_ is a `NULL` pointer then it will have the same result as the |
| empty string. |
| Certain options are ignored when _program_ is created with IL. |
| The list of supported options is as described in <<compiler-options, |
| Compiler Options>>. |
| * _num_input_headers_ specifies the number of programs that describe headers |
| in the array referenced by _input_headers_. |
| * _input_headers_ is an array of program embedded headers created with |
| {clCreateProgramWithSource}. |
| * _header_include_names_ is an array that has a one to one correspondence with |
| _input_headers_. |
| Each entry in _header_include_names_ specifies the include name used by |
| source in _program_ that comes from an embedded header. |
| The corresponding entry in _input_headers_ identifies the program object |
| which contains the header source to be used. |
| The embedded headers are searched before the headers in the list of |
| directories specified by the `-I` compile option (as described in |
| <<preprocessor-options, Preprocessor options>>). |
| If multiple entries in _header_include_names_ refer to the same header name, |
| the first one encountered will be used. |
| * _pfn_notify_ is a function pointer to a notification routine. |
| The notification routine is a callback function that an application can |
| register and which will be called when the program executable has been built |
| (successfully or unsuccessfully). |
| If _pfn_notify_ is not `NULL`, {clCompileProgram} does not need to wait for |
| the compiler to complete and can return immediately once the compilation can |
| begin. |
| Any state changes of the program object that result from calling |
| {clCompileProgram} (e.g. compile status or log) will be observable from this |
| callback function. |
| The compilation can begin if the context, program whose sources are being |
| compiled, list of devices, input headers, programs that describe input |
| headers and compiler options specified are all valid and appropriate host |
| and device resources needed to perform the compile are available. |
| If _pfn_notify_ is `NULL`, {clCompileProgram} does not return until the |
| compiler has completed. |
| This callback function may be called asynchronously by the OpenCL |
| implementation. |
| It is the application's responsibility to ensure that the callback function |
| is thread-safe. |
| ** _user_data_ will be passed as an argument when _pfn_notify_ is called. |
| _user_data_ can be `NULL`. |
| |
| The pre-processor runs before the program sources are compiled. |
| The compiled binary is built for all devices associated with _program_ or |
| the list of devices specified. |
| The compiled binary can be queried using {clGetProgramInfo}(_program_, |
| {CL_PROGRAM_BINARIES}, ...) and can be passed to {clCreateProgramWithBinary} |
| to create a new program object. |
| |
| If _program_ was created using {clCreateProgramWithIL}, then |
| _num_input_headers_, _input_headers_, and _header_include_names_ are |
| ignored. |
| |
| For example, consider the following program source: |
| |
| [source,c] |
| ---- |
| #include <foo.h> |
| #include <mydir/myinc.h> |
| __kernel void |
| image_filter (int n, int m, |
| __constant float *filter_weights, |
| __read_only image2d_t src_image, |
| __write_only image2d_t dst_image) |
| { |
| ... |
| } |
| ---- |
| |
| This kernel includes two headers, `foo.h` and `mydir/myinc.h`. |
| The following describes how these headers can be passed as embedded headers |
| in program objects: |
| |
| [source,c] |
| ---- |
| // foo_header_src and myinc_header_src are assumed to be const char * |
| // strings holding the header sources |
| cl_program foo_pg = clCreateProgramWithSource(context, |
|     1, &foo_header_src, NULL, &err); |
| cl_program myinc_pg = clCreateProgramWithSource(context, |
|     1, &myinc_header_src, NULL, &err); |
| |
| // let's assume the program source described above is given |
| // by program_A and is loaded via clCreateProgramWithSource |
| cl_program input_headers[2] = { foo_pg, myinc_pg }; |
| const char *input_header_names[2] = { "foo.h", "mydir/myinc.h" }; |
| clCompileProgram(program_A, |
|     0, NULL, // num_devices & device_list |
|     NULL, // compile_options |
|     2, // num_input_headers |
|     input_headers, |
|     input_header_names, |
|     NULL, NULL); // pfn_notify & user_data |
| ---- |
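The rule that the first matching entry in _header_include_names_ is used can be illustrated in plain C.
The following is a minimal sketch of that lookup order only, not of how an implementation resolves headers: the function name `find_embedded_header` is hypothetical, and exact string comparison of include names is an assumption made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Returns the index of the first entry in header_include_names that matches
 * include_name, or -1 when no embedded header has that name (in which case
 * the directories given with -I would be searched next). */
int find_embedded_header(const char *const *header_include_names,
                         size_t num_input_headers,
                         const char *include_name)
{
    for (size_t i = 0; i < num_input_headers; ++i) {
        if (strcmp(header_include_names[i], include_name) == 0)
            return (int)i;
    }
    return -1;
}
```

The returned index selects the corresponding entry in _input_headers_, since the two arrays have a one to one correspondence.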
| |
| // refError |
| |
| {clCompileProgram} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_PROGRAM} if _program_ is not a valid program object. |
| * {CL_INVALID_VALUE} if _device_list_ is `NULL` and _num_devices_ is greater |
| than zero, or if _device_list_ is not `NULL` and _num_devices_ is zero. |
| * {CL_INVALID_VALUE} if _num_input_headers_ is zero and |
| _header_include_names_ or _input_headers_ are not `NULL` or if |
| _num_input_headers_ is not zero and _header_include_names_ or |
| _input_headers_ are `NULL`. |
| * {CL_INVALID_VALUE} if _pfn_notify_ is `NULL` but _user_data_ is not |
| `NULL`. |
| * {CL_INVALID_DEVICE} if any device in _device_list_ is not in |
| the list of devices associated with _program_. |
| * {CL_INVALID_COMPILER_OPTIONS} if the compiler options specified by |
| _options_ are invalid. |
| * {CL_INVALID_OPERATION} if the compilation or build of a program executable |
| for any of the devices listed in _device_list_ by a previous call to |
| {clCompileProgram} or {clBuildProgram} for _program_ has not completed. |
| * {CL_COMPILER_NOT_AVAILABLE} if a compiler is not available, i.e. |
| {CL_DEVICE_COMPILER_AVAILABLE} specified in the |
| <<device-queries-table,Device Queries>> table is set to {CL_FALSE}. |
| * {CL_COMPILE_PROGRAM_FAILURE} if there is a failure to compile the program |
| source. |
| This error will be returned if {clCompileProgram} does not return until |
| the compile has completed. |
| * {CL_INVALID_OPERATION} if there are kernel objects attached to _program_. |
| * {CL_INVALID_OPERATION} if _program_ has no source or IL available, i.e. it |
| has not been created with {clCreateProgramWithSource} or |
| {clCreateProgramWithIL}. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clLinkProgram',desc='Links a set of compiled program objects and libraries for all the devices or a specific device(s) in the OpenCL context and creates a library or executable.',type='protos'] |
| -- |
| To link a set of compiled program objects and libraries for all the devices |
| or a specific device(s) in the OpenCL context and create a library or |
| executable, call the function |
| |
| include::{generated}/api/protos/clLinkProgram.txt[] |
| include::{generated}/api/version-notes/clLinkProgram.asciidoc[] |
| |
| * _context_ must be a valid OpenCL context. |
| * _device_list_ is a pointer to a list of devices that are in _context_. |
| If _device_list_ is a `NULL` value, the link is performed for all devices |
| associated with _context_ for which a compiled object is available. |
| If _device_list_ is a non-`NULL` value, the link is performed for devices |
| specified in this list for which a compiled object is available. |
| * _num_devices_ is the number of devices listed in _device_list_. |
| * _options_ is a pointer to a null-terminated string of characters that |
| describes the link options to be used for building the program executable. |
| The list of supported options is as described in <<linker-options,Linker |
| Options>>. |
| If the program was created using {clCreateProgramWithBinary} and _options_ |
| is a `NULL` pointer, the program will be linked as if _options_ were the |
| same as when the program binary was originally built. |
| If the program was created using {clCreateProgramWithBinary} and _options_ |
| string contains anything other than the same options in the same order |
| (whitespace ignored) as when the program binary was originally built, then |
| the behavior is implementation defined. |
| Otherwise, if _options_ is a `NULL` pointer then it will have the same |
| result as the empty string. |
| * _num_input_programs_ specifies the number of programs in the array referenced by |
| _input_programs_. |
| * _input_programs_ is an array of program objects that are compiled binaries |
| or libraries that are to be linked to create the program executable. |
| For each device in _device_list_ or, if _device_list_ is `NULL`, in the list of |
| devices associated with _context_, one of the following cases applies: |
| ** All programs specified by _input_programs_ contain a compiled binary or |
| library for the device. |
| In this case, a link is performed to generate a program executable for |
| this device. |
| ** None of the programs contain a compiled binary or library for that |
| device. |
| In this case, no link is performed and there will be no program |
| executable generated for this device. |
| ** All other cases will return a {CL_INVALID_OPERATION} error. |
| * _pfn_notify_ is a function pointer to a notification routine. |
| The notification routine is a callback function that an application can |
| register and which will be called when the program executable has been built |
| (successfully or unsuccessfully). |
| ** _user_data_ will be passed as an argument when _pfn_notify_ is called. |
| _user_data_ can be `NULL`. |
| |
| If _pfn_notify_ is not `NULL`, {clLinkProgram} does not need to wait for the |
| linker to complete, and can return immediately once the linking operation can |
| begin. |
| Once the linker has completed, the _pfn_notify_ callback function is called |
| and is passed the program object returned by {clLinkProgram}. |
| Any state changes of the program object that result from calling {clLinkProgram} |
| (e.g. link status or log) will be observable from this callback function. |
| This callback function may be called asynchronously by the OpenCL |
| implementation. |
| It is the application's responsibility to ensure that the callback function |
| is thread-safe. |
| |
| If _pfn_notify_ is `NULL`, {clLinkProgram} does not return until the linker |
| has completed. |
| |
| {clLinkProgram} creates a new program object which contains the library or |
| executable. |
| The library or executable binary can be queried using |
| {clGetProgramInfo}(_program_, {CL_PROGRAM_BINARIES}, ...) and can be specified |
| to {clCreateProgramWithBinary} to create a new program object. |
| |
| The devices associated with the returned program object will be the list of |
| devices specified by _device_list_ or if _device_list_ is `NULL` it will be |
| the list of devices associated with _context_. |
| |
| The linking operation can begin if the context, list of devices, input |
| programs and linker options specified are all valid and appropriate host and |
| device resources needed to perform the link are available. |
| If the linking operation can begin, {clLinkProgram} returns a valid non-zero |
| program object. |
| |
| // refError |
| |
| If _pfn_notify_ is `NULL`, the _errcode_ret_ will be set to {CL_SUCCESS} if |
| the link operation was successful and {CL_LINK_PROGRAM_FAILURE} if there is a |
| failure to link the compiled binaries and/or libraries. |
| |
| If _pfn_notify_ is not `NULL`, {clLinkProgram} does not have to wait for |
| the linker to complete and can return {CL_SUCCESS} in _errcode_ret_ if the |
| linking operation can begin. |
| The _pfn_notify_ callback function will return a {CL_SUCCESS} or |
| {CL_LINK_PROGRAM_FAILURE} if the linking operation was successful or not. |
| |
| Otherwise {clLinkProgram} returns a `NULL` program object with an |
| appropriate error in _errcode_ret_. |
| The application should query the linker status of this program object to |
| check if the link was successful or not. |
| The errors that can be returned are: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_INVALID_VALUE} if _device_list_ is `NULL` and _num_devices_ is greater |
| than zero, or if _device_list_ is not `NULL` and _num_devices_ is zero. |
| * {CL_INVALID_VALUE} if _num_input_programs_ is zero and _input_programs_ is |
| `NULL` or if _num_input_programs_ is zero and _input_programs_ is not |
| `NULL` or if _num_input_programs_ is not zero and _input_programs_ is |
| `NULL`. |
| * {CL_INVALID_PROGRAM} if programs specified in _input_programs_ are not |
| valid program objects. |
| * {CL_INVALID_VALUE} if _pfn_notify_ is `NULL` but _user_data_ is not |
| `NULL`. |
| * {CL_INVALID_DEVICE} if any device in _device_list_ is not in |
| the list of devices associated with _context_. |
| * {CL_INVALID_LINKER_OPTIONS} if the linker options specified by _options_ |
| are invalid. |
| * {CL_INVALID_OPERATION} if the compilation or build of a program executable |
| for any of the devices listed in _device_list_ by a previous call to |
| {clCompileProgram} or {clBuildProgram} for _program_ has not completed. |
| * {CL_INVALID_OPERATION} if the rules for devices containing compiled |
| binaries or libraries as described in _input_programs_ argument above |
| are not followed. |
| * {CL_LINKER_NOT_AVAILABLE} if a linker is not available, i.e. |
| {CL_DEVICE_LINKER_AVAILABLE} specified in the |
| <<device-queries-table,Device Queries>> table is set to {CL_FALSE}. |
| * {CL_LINK_PROGRAM_FAILURE} if there is a failure to link the compiled |
| binaries and/or libraries. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
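The three per-device cases described for the _input_programs_ argument of {clLinkProgram} can be summarized in plain C.
The following is a minimal sketch under stated assumptions: the names `link_outcome` and `classify_link_for_device` are hypothetical, and the availability of a compiled binary or library for the device is assumed to have been determined already and passed in as an array of flags.

```c
#include <stddef.h>

/* Hypothetical outcome names for this sketch. */
enum link_outcome {
    LINK_PERFORMED,         /* every input program has a binary for the device */
    LINK_SKIPPED,           /* no input program has a binary for the device */
    LINK_INVALID_OPERATION  /* mixed case: CL_INVALID_OPERATION is returned */
};

/* has_binary[i] is non-zero when input program i contains a compiled binary
 * or library for the device under consideration.
 * Assumes num_input_programs is greater than zero; zero input programs is
 * reported separately as CL_INVALID_VALUE. */
enum link_outcome classify_link_for_device(const int *has_binary,
                                           size_t num_input_programs)
{
    size_t available = 0;
    for (size_t i = 0; i < num_input_programs; ++i) {
        if (has_binary[i])
            ++available;
    }
    if (available == num_input_programs)
        return LINK_PERFORMED;
    if (available == 0)
        return LINK_SKIPPED;
    return LINK_INVALID_OPERATION;
}
```

The classification is applied once per device, so a single call to {clLinkProgram} may link for some devices and skip others, provided that no device falls into the mixed case.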
| |
| |
| [[compiler-options]] |
| === Compiler Options |
| |
| The compiler options are categorized as pre-processor options, options for |
| math intrinsics, options that control optimization and miscellaneous |
| options. |
| This specification defines a standard set of options that must be supported |
| by the compiler when building program executables online or offline from |
| OpenCL C/{cpp} or, where relevant, from an IL. |
| These may be extended by a set of vendor- or platform-specific options. |
| |
| |
| [[preprocessor-options]] |
| ==== Preprocessor options |
| |
| These options control the OpenCL C/{cpp} preprocessor which is run on each |
| program source before actual compilation. |
| These options are ignored for programs created with IL. |
| |
| `-D name` :: |
| Predefine _name_ as a macro, with definition 1. |
| |
| `-D name=definition` :: |
| The contents of _definition_ are tokenized and processed as if they |
| appeared during translation phase three in a `#define` directive. |
| In particular, the definition will be truncated by embedded newline |
| characters. |
| + |
| -- |
| `-D` options are processed in the order they are given in the _options_ |
| argument to {clBuildProgram} or {clCompileProgram}. |
| Note that a space is required between the `-D` option and the symbol it |
| defines, otherwise behavior is implementation defined. |
| -- |
| |
| `-I dir` :: |
| Add the directory _dir_ to the list of directories to be searched for |
| header files. |
| _dir_ can optionally be enclosed in double quotes. |
| + |
| -- |
| This option is not portable due to its dependency on host file system and |
| host operating system. |
| It is supported for backwards compatibility with previous OpenCL versions. |
| Developers are encouraged to create and use explicit header objects by means |
| of {clCompileProgram} followed by {clLinkProgram}. |
| -- |
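Because a space is required between `-D` and the symbol it defines, applications that assemble the _options_ string at run time may prefer to do so through a small helper.
The following is a minimal sketch of such a helper: the name `append_define` is hypothetical, and definitions containing whitespace or quotes are outside its scope.

```c
#include <stdio.h>
#include <string.h>

/* Appends "-D name" or "-D name=definition" to the options string in buf,
 * always with a space between -D and the name.
 * Returns 0 on success, -1 if the result would not fit in cap bytes. */
int append_define(char *buf, size_t cap, const char *name,
                  const char *definition)
{
    size_t used = strlen(buf);
    int n;

    if (used >= cap)
        return -1;
    if (definition != NULL)
        n = snprintf(buf + used, cap - used, "%s-D %s=%s",
                     used ? " " : "", name, definition);
    else
        n = snprintf(buf + used, cap - used, "%s-D %s",
                     used ? " " : "", name);
    if (n < 0 || (size_t)n >= cap - used) {
        buf[used] = '\0'; /* undo the partial append */
        return -1;
    }
    return 0;
}
```

Options are appended in call order, which matches the order in which `-D` options are processed; the resulting string can then be passed as _options_ to {clBuildProgram} or {clCompileProgram}.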
| |
| |
| [[math-intrinsics-options]] |
| ==== Math Intrinsics Options |
| |
| These options control compiler behavior regarding floating-point arithmetic. |
| These options trade off between speed and correctness. |
| |
| `-cl-single-precision-constant` :: |
| Treat double precision floating-point constants as single precision |
| constants. |
| This option is ignored for programs created with IL. |
| |
| `-cl-denorms-are-zero` :: |
| This option controls how single precision and double precision |
| denormalized numbers are handled. |
| If specified as a build option, the single precision denormalized |
| numbers may be flushed to zero; double precision denormalized numbers |
| may also be flushed to zero if the optional extension for double |
| precision is supported. |
| This is intended to be a performance hint and the OpenCL compiler can |
| choose not to flush denorms to zero if the device supports single |
| precision (or double precision) denormalized numbers. |
| + |
| -- |
| This option is ignored for single precision numbers if the device does not |
| support single precision denormalized numbers i.e. {CL_FP_DENORM} bit is not |
| set in {CL_DEVICE_SINGLE_FP_CONFIG}. |
| |
| This option is ignored for double precision numbers if the device does not |
| support double precision or if it does support double precision but not |
| double precision denormalized numbers i.e. {CL_FP_DENORM} bit is not set in |
| {CL_DEVICE_DOUBLE_FP_CONFIG}. |
| |
| This flag only applies for scalar and vector single precision floating-point |
| variables and computations on these floating-point variables inside a |
| program. |
| It does not apply to reading from or writing to image objects. |
| -- |
| |
| `-cl-fp32-correctly-rounded-divide-sqrt` :: |
| The `-cl-fp32-correctly-rounded-divide-sqrt` build option to |
| {clBuildProgram} or {clCompileProgram} allows an application to specify |
| that single precision floating-point divide (x/y and 1/x) and sqrt used |
| in the program source are correctly rounded. |
| If this build option is not specified, the minimum numerical accuracy of |
| single precision floating-point divide and sqrt are as defined in the |
| OpenCL C or OpenCL SPIR-V Environment specifications. |
| + |
| -- |
| This build option can only be specified if |
| {CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT} is set in {CL_DEVICE_SINGLE_FP_CONFIG} (as |
| defined in the <<device-queries-table,Device Queries>> table) for the devices |
| for which the program is being built. |
| {clBuildProgram} or {clCompileProgram} will fail to compile the program for |
| a device if the `-cl-fp32-correctly-rounded-divide-sqrt` option is specified |
| and {CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT} is not set for the device. |
| |
| Note: This option is <<unified-spec, missing before>> version 1.2. |
| -- |
| |
| |
| [[optimization-options]] |
| ==== Optimization Options |
| |
| These options control various sorts of optimizations. |
| Turning on optimization flags makes the compiler attempt to improve the |
| performance and/or code size at the expense of compilation time and possibly |
| the ability to debug the program. |
| |
| `-cl-opt-disable` :: |
| This option disables all optimizations. |
| By default, optimizations are enabled. |
| |
| `-cl-strict-aliasing` :: |
| This option allows the compiler to assume the strictest aliasing rules. |
| + |
| -- |
| Note: This option is <<unified-spec, deprecated by>> version 1.1. |
| -- |
| |
| `-cl-uniform-work-group-size` :: |
| This requires that the global work-size be a multiple of the work-group |
| size specified to {clEnqueueNDRangeKernel}. |
| This allows optimizations that are made possible by this restriction. |
| + |
| -- |
| Note: This option is <<unified-spec, missing before>> version 2.0. |
| -- |
| |
| `-cl-no-subgroup-ifp` :: |
| This indicates that kernels in this program do not require subgroups to |
| make independent forward progress. |
| This allows optimizations that are made possible by this restriction. |
| This option has no effect for devices that do not support independent |
| forward progress for subgroups. |
| + |
| -- |
| Note: This option is <<unified-spec, missing before>> version 2.1. |
| -- |
| |
| The following options control compiler behavior regarding floating-point |
| arithmetic. |
| These options trade off between performance and correctness and must be |
| specifically enabled. |
| These options are not turned on by default since they can result in incorrect |
| output for programs which depend on an exact implementation of IEEE 754 |
| rules/specifications for math functions. |
| |
| `-cl-mad-enable` :: |
| Allow `a * b + c` to be replaced by a *mad* instruction. |
| The *mad* instruction may compute `a * b + c` with reduced accuracy |
| in the embedded profile. |
| See the OpenCL C or OpenCL SPIR-V Environment specification for accuracy |
| details. |
| On some hardware the *mad* instruction may provide better performance |
| than the expanded computation. |
| |
| `-cl-no-signed-zeros` :: |
| Allow optimizations for floating-point arithmetic that ignore the |
| signedness of zero. |
| IEEE 754 arithmetic specifies the distinct behavior of `+0.0` and `-0.0` |
| values, which then prohibits simplification of expressions such as `x |
| {plus} 0.0` or `0.0 * x` (even with `-cl-finite-math-only`). |
| This option implies that the sign of a zero result isn't significant. |
| |
| `-cl-unsafe-math-optimizations` :: |
| Allow optimizations for floating-point arithmetic that (a) assume that |
| arguments and results are valid, (b) may violate the IEEE 754 standard, |
| (c) assume relaxed OpenCL numerical compliance requirements as defined |
| in the unsafe math optimization section of the OpenCL C or OpenCL SPIR-V |
| Environment specifications, and (d) may violate edge case behavior in the |
| OpenCL C or OpenCL SPIR-V Environment specifications. |
| This option includes the `-cl-no-signed-zeros` and `-cl-mad-enable` options. |
| |
| `-cl-finite-math-only` :: |
| Allow optimizations for floating-point arithmetic that assume that |
| arguments and results are not NaNs, +Inf, or -Inf. |
| This option may violate the OpenCL numerical compliance requirements for |
| single precision and double precision floating-point, as well as edge |
| case behavior. |
| |
| `-cl-fast-relaxed-math` :: |
| Sets the optimization options `-cl-finite-math-only` and |
| `-cl-unsafe-math-optimizations`. |
| This option causes the preprocessor macro `+__FAST_RELAXED_MATH__+` to |
| be defined in the OpenCL program. |
| |
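The relationships between the floating-point optimization options described above can be written down as a small expansion function.
The following is a minimal sketch: the `OPT_*` bit flags and the function name `expand_math_opts` are hypothetical and are not part of the OpenCL API.

```c
/* Hypothetical bit flags for this sketch; not part of the OpenCL API. */
enum math_opt {
    OPT_MAD_ENABLE        = 1 << 0, /* -cl-mad-enable */
    OPT_NO_SIGNED_ZEROS   = 1 << 1, /* -cl-no-signed-zeros */
    OPT_UNSAFE_MATH       = 1 << 2, /* -cl-unsafe-math-optimizations */
    OPT_FINITE_MATH_ONLY  = 1 << 3, /* -cl-finite-math-only */
    OPT_FAST_RELAXED_MATH = 1 << 4  /* -cl-fast-relaxed-math */
};

/* Returns opts with every option implied by another option also set:
 *  -cl-fast-relaxed-math sets -cl-finite-math-only and
 *  -cl-unsafe-math-optimizations, and -cl-unsafe-math-optimizations
 *  includes -cl-no-signed-zeros and -cl-mad-enable. */
unsigned expand_math_opts(unsigned opts)
{
    if (opts & OPT_FAST_RELAXED_MATH)
        opts |= OPT_FINITE_MATH_ONLY | OPT_UNSAFE_MATH;
    if (opts & OPT_UNSAFE_MATH)
        opts |= OPT_NO_SIGNED_ZEROS | OPT_MAD_ENABLE;
    return opts;
}
```

The sketch covers only which options imply which; the definition of the `+__FAST_RELAXED_MATH__+` macro by `-cl-fast-relaxed-math` is not modeled.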
| |
| ==== Options to Request or Suppress Warnings |
| |
| Warnings are diagnostic messages that report constructions which are not |
| inherently erroneous but which are risky or suggest there may have been an |
| error. |
| The following language-independent options do not enable specific warnings |
| but control the kinds of diagnostics produced by the OpenCL compiler. |
| These options are ignored for programs created with IL. |
| |
| `-w` :: |
| Inhibit all warning messages. |
| |
| `-Werror` :: |
| Make all warnings into errors. |
| |
| |
| [[opencl-c-version]] |
| ==== Options Controlling the OpenCL C version |
| |
| The following option controls the version of OpenCL C that the compiler |
| accepts. |
| This option is ignored for programs created with IL. |
| |
| `-cl-std=` :: |
| Determine the OpenCL C language version to use. |
| A value for this option must be provided. |
| Valid values are: |
| + |
| -- |
| * `CL1.1`: Support OpenCL C 1.1 language features defined in _section 6_ of |
| the OpenCL 1.1 specification or in the unified OpenCL C specification. |
| * `CL1.2`: Support OpenCL C 1.2 language features defined in _section 6_ of |
| the OpenCL 1.2 specification or in the unified OpenCL C specification. |
| * `CL2.0`: Support OpenCL C 2.0 language features defined in the OpenCL C 2.0 |
| specification or in the unified OpenCL C specification. |
| * `CL3.0`: Support OpenCL C 3.0 language features defined in the unified |
| OpenCL C specification. |
| -- |
| |
| Calls to {clBuildProgram} or {clCompileProgram} with the `-cl-std=CL1.1` |
| option *will fail* to compile the program for any devices with |
| {CL_DEVICE_OPENCL_C_VERSION} equal to OpenCL C 1.0 and when |
| {CL_DEVICE_OPENCL_C_ALL_VERSIONS} does not include OpenCL C 1.1. |
| |
| Calls to {clBuildProgram} or {clCompileProgram} with the `-cl-std=CL1.2` |
| option *will fail* to compile the program for any devices with |
| {CL_DEVICE_OPENCL_C_VERSION} equal to OpenCL C 1.1 or earlier |
| and when {CL_DEVICE_OPENCL_C_ALL_VERSIONS} does not include OpenCL C 1.2. |
| |
| Calls to {clBuildProgram} or {clCompileProgram} with the `-cl-std=CL2.0` |
| option *will fail* to compile the program for any devices with |
| {CL_DEVICE_OPENCL_C_VERSION} equal to OpenCL C 1.2 or earlier |
| and when {CL_DEVICE_OPENCL_C_ALL_VERSIONS} does not include OpenCL C 2.0. |
| |
| Calls to {clBuildProgram} or {clCompileProgram} with the `-cl-std=CL3.0` |
| option *will fail* to compile the program for any devices with |
| {CL_DEVICE_OPENCL_C_VERSION} equal to OpenCL C 2.0 or earlier |
| and when {CL_DEVICE_OPENCL_C_ALL_VERSIONS} does not include OpenCL C 3.0. |
| |
| If the `-cl-std` build option is not specified, the highest OpenCL C 1.x |
| language version supported by each device is used when compiling the program |
| for each device. |
| Applications are required to specify the `-cl-std=CL2.0` build option to |
| compile or build programs with OpenCL C 2.0 and the `-cl-std=CL3.0` |
| build option to compile or build programs with OpenCL C 3.0. |
| |
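As a rough illustration of the version rules above, the following sketch builds a `-cl-std=` option only when the requested language version is one of the valid values and appears in a list of versions that a device is assumed to support.
The `c_version` structure and the `make_cl_std_option` helper are hypothetical names introduced for this example, and the list of supported versions is assumed to have been extracted beforehand from the {CL_DEVICE_OPENCL_C_ALL_VERSIONS} query.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for one entry reported by the
 * CL_DEVICE_OPENCL_C_ALL_VERSIONS query. */
struct c_version {
    unsigned major;
    unsigned minor;
};

/* Writes "-cl-std=CL<major>.<minor>" into out and returns 1 if the wanted
 * version is a valid -cl-std value and is listed as supported.
 * Returns 0, leaving out as an empty string, otherwise. */
static int make_cl_std_option(const struct c_version *supported, size_t count,
                              struct c_version wanted,
                              char *out, size_t out_size)
{
    /* The values accepted by -cl-std= according to this section. */
    static const struct c_version valid[] = { {1, 1}, {1, 2}, {2, 0}, {3, 0} };
    int is_valid = 0;
    size_t i;

    if (out == NULL || out_size == 0)
        return 0;
    out[0] = '\0';

    for (i = 0; i < sizeof valid / sizeof valid[0]; ++i) {
        if (valid[i].major == wanted.major && valid[i].minor == wanted.minor)
            is_valid = 1;
    }
    if (!is_valid)
        return 0;

    for (i = 0; i < count; ++i) {
        if (supported[i].major == wanted.major &&
            supported[i].minor == wanted.minor) {
            int n = snprintf(out, out_size, "-cl-std=CL%u.%u",
                             wanted.major, wanted.minor);
            if (n < 0 || (size_t)n >= out_size) {
                out[0] = '\0';   /* option did not fit */
                return 0;
            }
            return 1;
        }
    }
    return 0;   /* the build would be expected to fail for this device */
}
```

An application could pass the resulting string as the _options_ argument of {clBuildProgram} or {clCompileProgram}, and fall back to omitting `-cl-std` when the helper returns 0.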
| |
| ==== Options for Querying Kernel Argument Information |
| |
| IMPORTANT: Querying for kernel argument information is <<unified-spec, missing |
| before>> version 1.2. |
| |
| `-cl-kernel-arg-info` :: |
| This option allows the compiler to store information about the arguments |
| of the kernel(s) in the program executable. |
| The argument information stored includes the argument name, its type, |
| and the address space and access qualifiers used. |
| Refer to the description of {clGetKernelArgInfo} for how to query this |
| information. |
| |
| |
| ==== Options for Debugging Your Program |
| |
| IMPORTANT: Debugging options are <<unified-spec, missing before>> version 2.0. |
| |
| `-g` :: |
| This option can currently be used to generate additional errors for the |
| built-in functions that allow you to enqueue commands on a device (refer |
| to OpenCL kernel languages specifications). |
| |
| |
| [[linker-options]] |
| === Linker Options |
| |
| NOTE: Linker options are <<unified-spec, missing before>> version 1.2. |
| |
| This specification defines a standard set of linker options that must be |
| supported by the OpenCL C compiler when linking compiled programs online or |
| offline. |
| These linker options are categorized as library linking options and program |
| linking options. |
| These may be extended by a set of vendor- or platform-specific options. |
| |
| |
| ==== Library Linking Options |
| |
| IMPORTANT: Library linking options are <<unified-spec, missing before>> version |
| 1.2. |
| |
| The following options can be specified when creating a library of compiled |
| binaries. |
| |
| `-create-library` :: |
| Create a library of compiled binaries specified in _input_programs_ |
| argument to {clLinkProgram}. |
| |
| `-enable-link-options` :: |
| Allows the linker to modify the library behavior based on one or more |
| link options (described in <<program-linking-options, Program Linking |
| Options>>) when this library is linked with a program executable. |
| This option must be specified with the `-create-library` option. |
| |
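The dependency between the two library linking options can be checked on the application side before calling {clLinkProgram}.
The sketch below is only an illustration: `library_link_options_ok` is a hypothetical helper, and it uses naive substring matching rather than the option parsing an actual implementation performs.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the option string respects the rule that
 * -enable-link-options is only given together with -create-library,
 * and 0 otherwise. A NULL string is treated as "no options". */
static int library_link_options_ok(const char *options)
{
    int has_create;
    int has_enable;

    if (options == NULL)
        return 1;

    /* Naive substring search; assumes option names are not embedded
     * in other tokens of the string. */
    has_create = strstr(options, "-create-library") != NULL;
    has_enable = strstr(options, "-enable-link-options") != NULL;

    return !has_enable || has_create;
}
```

A string such as `-create-library -enable-link-options` passes this check, while `-enable-link-options` on its own does not.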
| |
| [[program-linking-options]] |
| ==== Program Linking Options |
| |
| The following options can be specified when linking a program executable. |
| |
| `-cl-denorms-are-zero` + |
| `-cl-no-signed-zeros` + |
| `-cl-unsafe-math-optimizations` + |
| `-cl-finite-math-only` + |
| `-cl-fast-relaxed-math` + |
| `-cl-no-subgroup-ifp` (<<unified-spec, missing before>> version 2.1) |
| |
| The options are described in <<math-intrinsics-options, Math Intrinsics |
| Options>> and <<optimization-options, Optimization Options>>. |
| The linker may apply these options to all compiled program objects |
| specified to {clLinkProgram}. |
| The linker may apply these options only to libraries which were created |
| with the option `-enable-link-options`. |
| |
| |
| === Unloading the OpenCL Compiler |
| |
| [open,refpage='clUnloadPlatformCompiler',desc='Allows the implementation to release the resources allocated by the OpenCL compiler for a platform.',type='protos'] |
| -- |
| To unload an OpenCL compiler for a platform, call the function |
| |
| include::{generated}/api/protos/clUnloadPlatformCompiler.txt[] |
| include::{generated}/api/version-notes/clUnloadPlatformCompiler.asciidoc[] |
| |
| * _platform_ is the platform to unload. |
| |
| This function allows the implementation to release the resources allocated |
| by the OpenCL compiler for _platform_. |
| This is a hint from the application and does not guarantee that the compiler |
| will not be used in the future or that the compiler will actually be |
| unloaded by the implementation. |
| Calls to {clBuildProgram}, {clCompileProgram} or {clLinkProgram} after |
| {clUnloadPlatformCompiler} will reload the compiler, if necessary, to build |
| the appropriate program executable. |
| |
| // refError |
| |
| {clUnloadPlatformCompiler} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_PLATFORM} if _platform_ is not a valid platform. |
| -- |
| |
| [open,refpage='clUnloadCompiler',desc='Allows the implementation to release the resources allocated by the OpenCL compiler.',type='protos'] |
| -- |
| Alternatively, if you are not using OpenCL via the ICD loader, you may unload the OpenCL compiler with the function |
| |
| include::{generated}/api/protos/clUnloadCompiler.txt[] |
| include::{generated}/api/version-notes/clUnloadCompiler.asciidoc[] |
| |
| This function allows the implementation to release the resources allocated |
| by the OpenCL compiler. |
| This is a hint from the application and does not guarantee that the compiler |
| will not be used in the future or that the compiler will actually be |
| unloaded by the implementation. |
| Calls to {clBuildProgram}, {clCompileProgram} or {clLinkProgram} after |
| {clUnloadCompiler} will reload the compiler, if necessary, to build |
| the appropriate program executable. |
| |
| // refError |
| |
| {clUnloadCompiler} will always return {CL_SUCCESS}. |
| -- |
| |
| |
| === Program Object Queries |
| |
| [open,refpage='clGetProgramInfo',desc='Returns information about the program object.',type='protos'] |
| -- |
| To return information about a program object, call the function |
| |
| include::{generated}/api/protos/clGetProgramInfo.txt[] |
| include::{generated}/api/version-notes/clGetProgramInfo.asciidoc[] |
| |
| * _program_ specifies the program object being queried. |
| * _param_name_ specifies the information to query. |
| The list of supported _param_name_ types and the information returned in |
| _param_value_ by {clGetProgramInfo} is described in the |
| <<program-info-table,Program Object Queries>> table. |
| * _param_value_ is a pointer to memory where the appropriate result being |
| queried is returned. |
| If _param_value_ is `NULL`, it is ignored. |
| * _param_value_size_ is used to specify the size in bytes of memory pointed to |
| by _param_value_. |
| This size must be {geq} size of return type as described in the |
| <<program-info-table,Program Object Queries>> table. |
| * _param_value_size_ret_ returns the actual size in bytes of data being |
| queried by _param_name_. |
| If _param_value_size_ret_ is `NULL`, it is ignored. |
| |
| [[program-info-table]] |
| .List of supported param_names by <<clGetProgramInfo>> |
| [width="100%",cols="<33%,<17%,<50%",options="header"] |
| |==== |
| | Program Info | Return Type | Description |
| | {CL_PROGRAM_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}] |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_REFERENCE_COUNT.asciidoc[] |
| | {cl_uint_TYPE} |
| | Return the _program_ reference count. |
| | {CL_PROGRAM_CONTEXT_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_CONTEXT.asciidoc[] |
| | {cl_context_TYPE} |
| | Return the context specified when the program object is created. |
| | {CL_PROGRAM_NUM_DEVICES_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_NUM_DEVICES.asciidoc[] |
| | {cl_uint_TYPE} |
| | Return the number of devices associated with _program_. |
| | {CL_PROGRAM_DEVICES_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_DEVICES.asciidoc[] |
| | {cl_device_id_TYPE}[] |
| | Return the list of devices associated with the program object. |
| This can be the devices associated with the context on which the program |
| object has been created or can be a subset of devices that are |
| specified when a program object is created using |
| {clCreateProgramWithBinary}. |
| | {CL_PROGRAM_SOURCE_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_SOURCE.asciidoc[] |
| | {char_TYPE}[] |
| | Return the program source code specified by |
| {clCreateProgramWithSource}. |
| The source string returned is a concatenation of all source strings |
| specified to {clCreateProgramWithSource} with a null terminator. |
| The concatenation strips any nulls in the original source strings. |
| |
| If _program_ is created using {clCreateProgramWithBinary}, |
| {clCreateProgramWithIL} or {clCreateProgramWithBuiltInKernels}, a |
| null string or the appropriate program source code is returned |
| depending on whether or not the program source code is stored in the |
| binary. |
| |
| The actual number of characters that represents the program source |
| code including the null terminator is returned in |
| _param_value_size_ret_. |
| | {CL_PROGRAM_IL_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_IL.asciidoc[] |
| Also see extension *cl_khr_il_program*. |
| | {char_TYPE}[] |
| | Returns the program IL for programs created with |
| {clCreateProgramWithIL}. |
| |
| If _program_ is created with {clCreateProgramWithSource}, |
| {clCreateProgramWithBinary} or {clCreateProgramWithBuiltInKernels} |
| the memory pointed to by _param_value_ will be unchanged and |
| _param_value_size_ret_ will be set to 0. |
| | {CL_PROGRAM_BINARY_SIZES_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_BINARY_SIZES.asciidoc[] |
| | {size_t_TYPE}[] |
| | Returns an array that contains the size in bytes of the program |
| binary (could be an executable binary, compiled binary or library |
| binary) for each device associated with _program_. |
| The size of the array is the number of devices associated with |
| _program_. |
| If a binary is not available for a device, a size of zero is |
| returned for that device. |
| |
| If _program_ is created using {clCreateProgramWithBuiltInKernels}, |
| the implementation may return zero in any entries of the returned |
| array. |
| | {CL_PROGRAM_BINARIES_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_BINARIES.asciidoc[] |
| | {unsigned_char_TYPE}*[] |
| | Return the program binaries (could be an executable binary, compiled |
| binary or library binary) for all devices associated with _program_. |
| For each device in _program_, the binary returned can be the binary |
| specified for the device when _program_ is created with |
| {clCreateProgramWithBinary} or it can be the executable binary |
| generated by {clBuildProgram} or {clLinkProgram}. |
| If _program_ is created with {clCreateProgramWithSource} or |
| {clCreateProgramWithIL}, the binary returned is the binary generated |
| by {clBuildProgram}, {clCompileProgram} or {clLinkProgram}. |
| The bits returned can be an implementation-specific intermediate |
| representation (a.k.a. IR) or device specific executable bits or |
| both. |
| The decision on which information is returned in the binary is up to |
| the OpenCL implementation. |
| |
| _param_value_ points to an array of `n` pointers allocated by the |
| caller, where `n` is the number of devices associated with _program_. |
| The buffer sizes needed to allocate the memory that these `n` |
| pointers refer to can be queried using the {CL_PROGRAM_BINARY_SIZES} |
| query as described in this table. |
| |
| Each entry in this array is used by the implementation as the |
| location in memory where to copy the program binary for a specific |
| device, if there is a binary available. |
| To find out which device the program binary in the array refers to, |
| use the {CL_PROGRAM_DEVICES} query to get the list of devices. |
| There is a one-to-one correspondence between the array of `n` pointers |
| returned by {CL_PROGRAM_BINARIES} and the array of devices returned by |
| {CL_PROGRAM_DEVICES}. |
| | {CL_PROGRAM_NUM_KERNELS_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_NUM_KERNELS.asciidoc[] |
| | {size_t_TYPE} |
| | Returns the number of kernels declared in _program_ that can be |
| created with {clCreateKernel}. |
| This information is only available after a successful program |
| executable has been built for at least one device in the list of |
| devices associated with _program_. |
| | {CL_PROGRAM_KERNEL_NAMES_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_KERNEL_NAMES.asciidoc[] |
| | {char_TYPE}[] |
| | Returns a semicolon-separated list of kernel names in _program_ |
| that can be created with {clCreateKernel}. |
| This information is only available after a successful program |
| executable has been built for at least one device in the list of |
| devices associated with _program_. |
| | {CL_PROGRAM_SCOPE_GLOBAL_CTORS_PRESENT_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_SCOPE_GLOBAL_CTORS_PRESENT.asciidoc[] |
| | {cl_bool_TYPE} |
| | This indicates that the _program_ object contains non-trivial |
| constructor(s) that will be executed by the runtime before any kernel |
| from the program is executed. |
| This information is only available after a successful program |
| executable has been built for at least one device in the list of |
| devices associated with _program_. |
| |
| Querying {CL_PROGRAM_SCOPE_GLOBAL_CTORS_PRESENT} may unconditionally |
| return {CL_FALSE} if no devices associated with _program_ support |
| constructors for program scope global variables. |
| Support for constructors and destructors for program scope global |
| variables is required only for OpenCL 2.2 devices. |
| | {CL_PROGRAM_SCOPE_GLOBAL_DTORS_PRESENT_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_SCOPE_GLOBAL_DTORS_PRESENT.asciidoc[] |
| | {cl_bool_TYPE} |
| | This indicates that the _program_ object contains non-trivial |
| destructor(s) that will be executed by the runtime when _program_ is |
| destroyed. |
| This information is only available after a successful program |
| executable has been built for at least one device in the list of |
| devices associated with _program_. |
| |
| Querying {CL_PROGRAM_SCOPE_GLOBAL_DTORS_PRESENT} may unconditionally |
| return {CL_FALSE} if no devices associated with _program_ support |
| destructors for program scope global variables. |
| Support for constructors and destructors for program scope global |
| variables is required only for OpenCL 2.2 devices. |
| |==== |
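The {CL_PROGRAM_BINARIES} query expects the caller to supply one buffer per device, sized from a prior {CL_PROGRAM_BINARY_SIZES} query.
The sketch below shows only the host-side allocation step under that reading; `alloc_binary_buffers` and `free_binary_buffers` are hypothetical helpers, and leaving a `NULL` pointer for a device whose reported size is zero is an assumption of this example rather than something stated in the table.

```c
#include <stddef.h>
#include <stdlib.h>

/* Releases an array created by alloc_binary_buffers. */
static void free_binary_buffers(unsigned char **bufs, size_t n)
{
    size_t i;

    if (bufs == NULL)
        return;
    for (i = 0; i < n; ++i)
        free(bufs[i]);
    free(bufs);
}

/* Allocates one buffer per device. sizes[i] is the size in bytes reported
 * for device i (as a CL_PROGRAM_BINARY_SIZES query would return it).
 * Entries with size zero keep a NULL pointer (assumption: no binary is
 * available for that device). Returns NULL on failure or when n is 0. */
static unsigned char **alloc_binary_buffers(const size_t *sizes, size_t n)
{
    unsigned char **bufs;
    size_t i;

    if (sizes == NULL || n == 0)
        return NULL;

    bufs = calloc(n, sizeof *bufs);   /* all pointers start as NULL */
    if (bufs == NULL)
        return NULL;

    for (i = 0; i < n; ++i) {
        if (sizes[i] == 0)
            continue;
        bufs[i] = malloc(sizes[i]);
        if (bufs[i] == NULL) {
            free_binary_buffers(bufs, n);
            return NULL;
        }
    }
    return bufs;
}
```

The returned array would then be passed as _param_value_ with _param_value_size_ set to `n * sizeof(unsigned char *)`, and entry `i` corresponds to entry `i` of the {CL_PROGRAM_DEVICES} list.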
| |
| // refError |
| |
| {clGetProgramInfo} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes |
| specified by _param_value_size_ is < size of return type as described in |
| the <<program-info-table,Program Object Queries>> table and |
| _param_value_ is not `NULL`. |
| * {CL_INVALID_PROGRAM} if _program_ is not a valid program object. |
| * {CL_INVALID_PROGRAM_EXECUTABLE} if _param_name_ is |
| {CL_PROGRAM_NUM_KERNELS}, {CL_PROGRAM_KERNEL_NAMES}, |
| {CL_PROGRAM_SCOPE_GLOBAL_CTORS_PRESENT}, or |
| {CL_PROGRAM_SCOPE_GLOBAL_DTORS_PRESENT} and a successful program executable |
| has not been built for at least one device in the list of devices |
| associated with _program_. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
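The string returned for {CL_PROGRAM_KERNEL_NAMES} is a single list of names separated by semicolons, so an application usually has to split it before calling {clCreateKernel}.
The following is a minimal sketch of that splitting step, operating on a string assumed to have been retrieved already; the helper names are hypothetical, and the sample names in the usage note are made up.

```c
#include <stddef.h>
#include <string.h>

/* Counts the names in a semicolon-separated list. Empty segments
 * (for example from a trailing semicolon) are ignored. */
static size_t count_kernel_names(const char *names)
{
    size_t count = 0;
    const char *p = names;

    while (p != NULL && *p != '\0') {
        size_t len = strcspn(p, ";");   /* length of the current segment */
        if (len > 0)
            ++count;
        p += len;
        if (*p == ';')
            ++p;                        /* skip the separator */
    }
    return count;
}

/* Copies the index-th name (0-based) into out as a null-terminated string.
 * Returns 1 on success, 0 if there is no such name or out is too small. */
static int get_kernel_name(const char *names, size_t index,
                           char *out, size_t out_size)
{
    size_t seen = 0;
    const char *p = names;

    while (p != NULL && *p != '\0') {
        size_t len = strcspn(p, ";");
        if (len > 0) {
            if (seen == index) {
                if (out == NULL || len + 1 > out_size)
                    return 0;
                memcpy(out, p, len);
                out[len] = '\0';
                return 1;
            }
            ++seen;
        }
        p += len;
        if (*p == ';')
            ++p;
    }
    return 0;
}
```

For a made-up list such as `image_filter;scale;blur`, `count_kernel_names` reports three names and `get_kernel_name` with index 1 yields `scale`.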
| |
| [open,refpage='clGetProgramBuildInfo',desc='Returns build information for each device in the program object.',type='protos'] |
| -- |
| To return build information for each device in the program object, call the |
| function |
| |
| include::{generated}/api/protos/clGetProgramBuildInfo.txt[] |
| include::{generated}/api/version-notes/clGetProgramBuildInfo.asciidoc[] |
| |
| * _program_ specifies the program object being queried. |
| * _device_ specifies the device for which build information is being queried. |
| _device_ must be a valid device associated with _program_. |
| * _param_name_ specifies the information to query. |
| The list of supported _param_name_ types and the information returned in |
| _param_value_ by {clGetProgramBuildInfo} is described in the |
| <<program-build-info-table,Program Build Queries>> table. |
| * _param_value_ is a pointer to memory where the appropriate result being |
| queried is returned. |
| If _param_value_ is `NULL`, it is ignored. |
| * _param_value_size_ is used to specify the size in bytes of memory pointed to |
| by _param_value_. |
| This size must be {geq} size of return type as described in the |
| <<program-build-info-table,Program Build Queries>> table. |
| * _param_value_size_ret_ returns the actual size in bytes of data being |
| queried by _param_name_. |
| If _param_value_size_ret_ is `NULL`, it is ignored. |
| |
| [[program-build-info-table]] |
| .List of supported param_names by <<clGetProgramBuildInfo>> |
| [width="100%",cols="<33%,<17%,<50%",options="header"] |
| |==== |
| | Program Build Info | Return Type | Description |
| | {CL_PROGRAM_BUILD_STATUS_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_BUILD_STATUS.asciidoc[] |
| | {cl_build_status_TYPE} |
| | Returns the build, compile or link status, whichever was performed |
| last on the specified _program_ object for _device_. |
| |
| This can be one of the following: |
| |
| {CL_BUILD_NONE_anchor} - The build status returned if no {clBuildProgram}, |
| {clCompileProgram} or {clLinkProgram} has been performed on the |
| specified _program_ object for _device_. |
| |
| {CL_BUILD_ERROR_anchor} - The build status returned if {clBuildProgram}, |
| {clCompileProgram} or {clLinkProgram} - whichever was performed last |
| on the specified _program_ object for _device_ - generated an error. |
| |
| {CL_BUILD_SUCCESS_anchor} - The build status returned if {clBuildProgram}, |
| {clCompileProgram} or {clLinkProgram} - whichever was performed last |
| on the specified _program_ object for _device_ - was successful. |
| |
| {CL_BUILD_IN_PROGRESS_anchor} - The build status returned if |
| {clBuildProgram}, {clCompileProgram} or {clLinkProgram} - whichever |
| was performed last on the specified _program_ object for _device_ - has |
| not finished. |
| | {CL_PROGRAM_BUILD_OPTIONS_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_BUILD_OPTIONS.asciidoc[] |
| | {char_TYPE}[] |
| | Return the build, compile or link options specified by the options |
| argument in {clBuildProgram}, {clCompileProgram} or {clLinkProgram}, |
| whichever was performed last on the specified _program_ object for |
| _device_. |
| |
| If build status of the specified _program_ for _device_ is |
| {CL_BUILD_NONE}, an empty string is returned. |
| | {CL_PROGRAM_BUILD_LOG_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_BUILD_LOG.asciidoc[] |
| | {char_TYPE}[] |
| | Return the build, compile or link log for {clBuildProgram}, |
| {clCompileProgram} or {clLinkProgram}, whichever was performed last |
| on _program_ for _device_. |
| |
| If build status of the specified _program_ for _device_ is |
| {CL_BUILD_NONE}, an empty string is returned. |
| | {CL_PROGRAM_BINARY_TYPE_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_BINARY_TYPE.asciidoc[] |
| | {cl_program_binary_type_TYPE} |
| | Return the program binary type for _device_. |
| This can be one of the following values: |
| |
| {CL_PROGRAM_BINARY_TYPE_NONE_anchor} - There is no binary associated |
| with the specified _program_ object for _device_. |
| |
| {CL_PROGRAM_BINARY_TYPE_COMPILED_OBJECT_anchor} - A compiled binary is |
| associated with _device_. |
| This is the case when the specified _program_ object was created using |
| {clCreateProgramWithSource} and compiled using {clCompileProgram}, or |
| when a compiled binary was loaded using {clCreateProgramWithBinary}. |
| |
| {CL_PROGRAM_BINARY_TYPE_LIBRARY_anchor} - A library binary is |
| associated with _device_. |
| This is the case when the specified _program_ object was linked by |
| {clLinkProgram} using the `-create-library` link option, or when a |
| compiled library binary was loaded using {clCreateProgramWithBinary}. |
| |
| {CL_PROGRAM_BINARY_TYPE_EXECUTABLE_anchor} - An executable binary is |
| associated with _device_. |
| This is the case when the specified _program_ object was linked by |
| {clLinkProgram} without the `-create-library` link option, or when an |
| executable binary was built using {clBuildProgram}. |
| | {CL_PROGRAM_BUILD_GLOBAL_VARIABLE_TOTAL_SIZE_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROGRAM_BUILD_GLOBAL_VARIABLE_TOTAL_SIZE.asciidoc[] |
| | {size_t_TYPE} |
| | The total amount of storage, in bytes, used by program variables in |
| the global address space. |
| |==== |
| |
| // refError |
| |
| {clGetProgramBuildInfo} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_DEVICE} if _device_ is not in the list of devices associated |
| with _program_. |
| * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes |
| specified by _param_value_size_ is < size of return type as described in |
| the <<program-build-info-table,Program Build Queries>> table and |
| _param_value_ is not `NULL`. |
| * {CL_INVALID_PROGRAM} if _program_ is not a valid program object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
| [NOTE] |
| ==== |
| A program binary (compiled binary, library binary or executable binary) |
| built for a parent device can be used by all its sub-devices. |
| If a program binary has not been built for a sub-device, the program binary |
| associated with the parent device will be used. |
| |
| A program binary for a device specified with {clCreateProgramWithBinary} or |
| queried using {clGetProgramInfo} can be used as the binary for the |
| associated root device, and all sub-devices created from the root-level |
| device or sub-devices thereof. |
| ==== |
| -- |
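When reporting the result of a {CL_PROGRAM_BUILD_STATUS} query, it can help to translate the status into text and to decide whether the build log is final.
The sketch below uses local stand-in constants so that it compiles without the OpenCL headers; their numeric values are assumed to match the Khronos `cl.h` header, and real code should use {CL_BUILD_SUCCESS}, {CL_BUILD_NONE}, {CL_BUILD_ERROR} and {CL_BUILD_IN_PROGRESS} from that header instead.
Both helper functions are hypothetical.

```c
/* Local stand-ins for the cl_build_status values (assumed to mirror
 * the values defined in cl.h). */
enum sketch_build_status {
    SKETCH_BUILD_SUCCESS     =  0,
    SKETCH_BUILD_NONE        = -1,
    SKETCH_BUILD_ERROR       = -2,
    SKETCH_BUILD_IN_PROGRESS = -3
};

/* Returns a short description of a build status value. */
static const char *build_status_name(int status)
{
    switch (status) {
    case SKETCH_BUILD_SUCCESS:     return "success";
    case SKETCH_BUILD_NONE:        return "none";
    case SKETCH_BUILD_ERROR:       return "error";
    case SKETCH_BUILD_IN_PROGRESS: return "in progress";
    default:                       return "unknown";
    }
}

/* Returns 1 once the last build, compile or link has finished, whether
 * it succeeded or failed, and 0 while nothing has run or it is still
 * in progress. */
static int build_has_finished(int status)
{
    return status == SKETCH_BUILD_SUCCESS || status == SKETCH_BUILD_ERROR;
}
```

An application might query {CL_PROGRAM_BUILD_LOG} only after `build_has_finished` returns 1 for the status of the device of interest.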
| |
| |
| == Kernel Objects |
| |
| A kernel is a function declared in a program. |
| A kernel is identified by the `+__kernel+` qualifier applied to any function |
| in a program. |
| A kernel object encapsulates the specific `+__kernel+` function declared in |
| a program and the argument values to be used when executing this |
| `+__kernel+` function. |
| |
| |
| === Creating Kernel Objects |
| |
| [open,refpage='clCreateKernel',desc='Creates a kernel object.',type='protos'] |
| -- |
| To create a kernel object, use the function |
| |
| include::{generated}/api/protos/clCreateKernel.txt[] |
| include::{generated}/api/version-notes/clCreateKernel.asciidoc[] |
| |
| * _program_ is a program object with a successfully built executable. |
| * _kernel_name_ is a function name in the program declared with the |
| `+__kernel+` qualifier. |
| * _errcode_ret_ will return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
| |
| // refError |
| |
| {clCreateKernel} returns a valid non-zero kernel object and _errcode_ret_ is |
| set to {CL_SUCCESS} if the kernel object is created successfully. |
| Otherwise, it returns a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_PROGRAM} if _program_ is not a valid program object. |
| * {CL_INVALID_PROGRAM_EXECUTABLE} if there is no successfully built |
| executable for _program_. |
| * {CL_INVALID_KERNEL_NAME} if _kernel_name_ is not found in _program_. |
| * {CL_INVALID_KERNEL_DEFINITION} if the function definition for the `+__kernel+` |
| function given by _kernel_name_, such as the number of arguments or the |
| argument types, is not the same for all devices for which the _program_ |
| executable has been built. |
| * {CL_INVALID_VALUE} if _kernel_name_ is `NULL`. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clCreateKernelsInProgram',desc='Creates kernel objects for all kernel functions in a program object.',type='protos'] |
| -- |
| To create kernel objects for all kernel functions in a program, |
| call the function |
| |
| include::{generated}/api/protos/clCreateKernelsInProgram.txt[] |
| include::{generated}/api/version-notes/clCreateKernelsInProgram.asciidoc[] |
| |
| * _program_ is a program object with a successfully built executable. |
| * _num_kernels_ is the size of memory pointed to by _kernels_ specified as the |
| number of {cl_kernel_TYPE} entries. |
| * _kernels_ is the buffer where the kernel objects for kernels in _program_ |
| will be returned. |
| If _kernels_ is `NULL`, it is ignored. |
| If _kernels_ is not `NULL`, _num_kernels_ must be greater than or equal to |
| the number of kernels in _program_. |
| * _num_kernels_ret_ is the number of kernels in _program_. |
| If _num_kernels_ret_ is `NULL`, it is ignored. |
| |
| Kernel objects are not created for any `+__kernel+` functions in _program_ |
| that do not have the same function definition across all devices for which a |
| program executable has been successfully built. |
| |
| Kernel objects can only be created once you have a program object with a |
| valid program source or binary loaded into the program object and the |
| program executable has been successfully built for one or more devices |
| associated with _program_. |
| No changes to the program executable are allowed while there are kernel |
| objects associated with a program object. |
| This means that calls to {clBuildProgram} and {clCompileProgram} return |
| {CL_INVALID_OPERATION} if there are kernel objects attached to a program |
| object. |
| The OpenCL context associated with _program_ will be the context associated |
| with _kernel_. |
| The devices associated with _program_ are the devices associated |
| with _kernel_. |
| Devices associated with a program object for which a valid program |
| executable has been built can be used to execute kernels declared in the |
| program object. |
| |
| // refError |
| |
| {clCreateKernelsInProgram} will return {CL_SUCCESS} if the kernel objects were |
| successfully allocated. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_PROGRAM} if _program_ is not a valid program object. |
| * {CL_INVALID_PROGRAM_EXECUTABLE} if there is no successfully built |
| executable for any device in _program_. |
| * {CL_INVALID_VALUE} if _kernels_ is not `NULL` and _num_kernels_ is less |
| than the number of kernels in _program_. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
| -- |
| |
| [open,refpage='clRetainKernel',desc='Increments the kernel object reference count.',type='protos'] |
| -- |
| To retain a kernel object, call the function |
| |
| include::{generated}/api/protos/clRetainKernel.txt[] |
| include::{generated}/api/version-notes/clRetainKernel.asciidoc[] |
| |
| * _kernel_ is the kernel object to be retained. |
| |
| The _kernel_ reference count is incremented. |
| |
| // refError |
| |
| {clRetainKernel} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
| {clCreateKernel} and {clCreateKernelsInProgram} perform an implicit retain. |
| -- |
| |
| [open,refpage='clReleaseKernel',desc='Decrements the kernel reference count.',type='protos'] |
| -- |
| To release a kernel object, call the function |
| |
| include::{generated}/api/protos/clReleaseKernel.txt[] |
| include::{generated}/api/version-notes/clReleaseKernel.asciidoc[] |
| |
| * _kernel_ is the kernel object to be released. |
| |
| The _kernel_ reference count is decremented. |
| |
| The kernel object is deleted once its reference count becomes zero and |
| the kernel object is no longer needed by any enqueued commands that use |
| _kernel_. |
| Using this function to release a reference that was not obtained by creating |
| the object or by calling {clRetainKernel} causes undefined behavior. |
| |
| // refError |
| |
| {clReleaseKernel} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| |
| === Setting Kernel Arguments |
| |
| To execute a kernel, the kernel arguments must be set. |
| |
| [open,refpage='clSetKernelArg',desc='Set the argument value for a specific argument of a kernel.',type='protos'] |
| -- |
| To set the argument value for a specific argument of a kernel, call the |
| function |
| |
| include::{generated}/api/protos/clSetKernelArg.txt[] |
| include::{generated}/api/version-notes/clSetKernelArg.asciidoc[] |
| |
| * _kernel_ is a valid kernel object. |
| * _arg_index_ is the argument index. |
| Arguments to the kernel are referred to by indices that go from 0 for the |
| leftmost argument to _n_ - 1, where _n_ is the total number of arguments |
| declared by a kernel (see below). |
| * _arg_size_ specifies the size of the argument value. |
| If the argument is a memory object, the _arg_size_ value must be equal to |
| `sizeof({cl_mem_TYPE})`. |
| For arguments declared with the `local` qualifier, the size specified will |
| be the size in bytes of the buffer that must be allocated for the `local` |
| argument. |
| If the argument is of type _sampler_t_, the _arg_size_ value must be equal |
| to `sizeof({cl_sampler_TYPE})`. |
| If the argument is of type _queue_t_, the _arg_size_ value must be equal to |
| `sizeof({cl_command_queue_TYPE})`. |
| For all other arguments, the size will be the size of the argument type. |
| * _arg_value_ is a pointer to data that should be used as the argument value |
| for the argument specified by _arg_index_. |
| The argument data pointed to by _arg_value_ is copied and the _arg_value_ |
| pointer can therefore be reused by the application after {clSetKernelArg} |
| returns. |
| The argument value specified is the value used by all API calls that enqueue |
| _kernel_ ({clEnqueueNDRangeKernel} and {clEnqueueTask}) until the argument |
| value is changed by a call to {clSetKernelArg} for _kernel_. |
| |
| For example, consider the following kernel: |
| |
| [source,c] |
| ---- |
| kernel void image_filter (int n, |
| int m, |
| constant float *filter_weights, |
| read_only image2d_t src_image, |
| write_only image2d_t dst_image) |
| { |
| ... |
| } |
| ---- |
| |
| Argument index values for `image_filter` will be 0 for `n`, 1 for `m`, 2 for |
| `filter_weights`, 3 for `src_image` and 4 for `dst_image`. |
| |
| If the argument is a memory object (buffer, pipe, image or image array), the |
| _arg_value_ entry will be a pointer to the appropriate buffer, pipe, image |
| or image array object. |
| The memory object must be created with the context associated with the |
| kernel object. |
| If the argument is a buffer object, the _arg_value_ pointer can be `NULL` or |
| point to a `NULL` value in which case a `NULL` value will be used as the |
| value for the argument declared as a pointer to `global` or `constant` |
| memory in the kernel. |
| If the argument is declared with the `local` qualifier, the _arg_value_ |
| entry must be `NULL`. |
| If the argument is of type _sampler_t_, the _arg_value_ entry must be a |
| pointer to the sampler object. |
| If the argument is of type _queue_t_, the _arg_value_ entry must be a |
| pointer to the device queue object. |
| |
| If the argument is declared to be a pointer to a built-in scalar or vector |
| type, or a user defined structure type in the global or constant address |
| space, the memory object specified as argument value must be a buffer object |
| (or `NULL`). |
| If the argument is declared with the `constant` qualifier, the size in bytes |
| of the memory object cannot exceed {CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE} and |
| the number of arguments declared as pointers to `constant` memory cannot |
| exceed {CL_DEVICE_MAX_CONSTANT_ARGS}. |
| |
| The memory object specified as argument value must be a pipe object if the |
| argument is declared with the _pipe_ qualifier. |
| |
| The memory object specified as argument value must be a 2D image object if |
| the argument is declared to be of type _image2d_t_. |
| The memory object specified as argument value must be a 2D image object with |
| image channel order = {CL_DEPTH} if the argument is declared to be of type |
| _image2d_depth_t_. |
| The memory object specified as argument value must be a 3D image object if |
| the argument is declared to be of type _image3d_t_. |
| The memory object specified as argument value must be a 1D image object if |
| the argument is declared to be of type _image1d_t_. |
| The memory object specified as argument value must be a 1D image buffer |
| object if the argument is declared to be of type _image1d_buffer_t_. |
| The memory object specified as argument value must be a 1D image array |
| object if the argument is declared to be of type _image1d_array_t_. |
| The memory object specified as argument value must be a 2D image array |
| object if the argument is declared to be of type _image2d_array_t_. |
| The memory object specified as argument value must be a 2D image array |
| object with image channel order = {CL_DEPTH} if the argument is declared to |
| be of type _image2d_array_depth_t_. |
| |
| For all other kernel arguments, the _arg_value_ entry must be a pointer to |
| the actual data to be used as argument value. |
| |
| [NOTE] |
| ==== |
| A kernel object does not update the reference count for objects such as |
| memory or sampler objects specified as argument values by {clSetKernelArg}. |
| Users may not rely on a kernel object to retain objects specified as |
| argument values to the kernel. |
| |
| Implementations shall not allow {cl_kernel_TYPE} objects to hold reference |
| counts to {cl_kernel_TYPE} arguments, because no mechanism is provided for the |
| user to tell the kernel to release that ownership right. |
| If the kernel holds ownership rights on kernel args, that would make it |
| impossible for users to tell with certainty when they may safely |
| release user allocated resources associated with OpenCL objects such as |
| the {cl_mem_TYPE} backing store used with {CL_MEM_USE_HOST_PTR}. |
| ==== |
| |
| // refError |
| |
| {clSetKernelArg} returns {CL_SUCCESS} if the function was executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object. |
| * {CL_INVALID_ARG_INDEX} if _arg_index_ is not a valid argument index. |
| * {CL_INVALID_ARG_VALUE} if _arg_value_ specified is not a valid value. |
| * {CL_INVALID_MEM_OBJECT} for an argument declared to be a memory object |
| when the specified _arg_value_ is not a valid memory object. |
| * {CL_INVALID_SAMPLER} for an argument declared to be of type _sampler_t_ |
| when the specified _arg_value_ is not a valid sampler object. |
| * {CL_INVALID_DEVICE_QUEUE} for an argument declared to be of type _queue_t_ |
| when the specified _arg_value_ is not a valid device queue object. |
| This error code is <<unified-spec, missing before>> version 2.0. |
| * {CL_INVALID_ARG_SIZE} if _arg_size_ does not match the size of the data |
| type for an argument that is not a memory object or if the argument is a |
| memory object and _arg_size_ != `sizeof({cl_mem_TYPE})` or if _arg_size_ is |
| zero and the argument is declared with the local qualifier or if the |
| argument is a sampler and _arg_size_ != `sizeof({cl_sampler_TYPE})`. |
| * {CL_MAX_SIZE_RESTRICTION_EXCEEDED} if the size in bytes of the memory |
| object (if the argument is a memory object) or _arg_size_ (if the |
| argument is declared with `local` qualifier) exceeds a language- |
| specified maximum size restriction for this argument, such as the |
| *MaxByteOffset* SPIR-V decoration. |
| This error code is <<unified-spec, missing before>> version 2.2. |
| * {CL_INVALID_ARG_VALUE} if the argument is an image declared with the |
| `read_only` qualifier and _arg_value_ refers to an image object created |
| with _cl_mem_flags_ of {CL_MEM_WRITE_ONLY} or if the image argument is |
| declared with the `write_only` qualifier and _arg_value_ refers to an |
| image object created with _cl_mem_flags_ of {CL_MEM_READ_ONLY}. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clSetKernelArgSVMPointer',desc='Set a SVM pointer as the argument value for a specific argument of a kernel.',type='protos'] |
| -- |
| To set a SVM pointer as the argument value for a specific argument of a |
| kernel, call the function |
| |
| include::{generated}/api/protos/clSetKernelArgSVMPointer.txt[] |
| include::{generated}/api/version-notes/clSetKernelArgSVMPointer.asciidoc[] |
| |
| * _kernel_ is a valid kernel object. |
| * _arg_index_ is the argument index. |
| Arguments to the kernel are referred to by indices that go from 0 for the |
| leftmost argument to _n_ - 1, where _n_ is the total number of arguments |
| declared by a kernel. |
| * _arg_value_ is the SVM pointer that should be used as the argument value for |
| the argument specified by _arg_index_. |
| The SVM pointer specified is the value used by all API calls that enqueue |
| _kernel_ ({clEnqueueNDRangeKernel} and {clEnqueueTask}) until the argument |
| value is changed by a call to {clSetKernelArgSVMPointer} for _kernel_. |
| The SVM pointer can only be used for arguments that are declared to be a |
| pointer to `global` or `constant` memory. |
| The SVM pointer value must be aligned according to the argument's type. |
| For example, if the argument is declared to be `+global float4 *p+`, the SVM |
| pointer value passed for `p` must be at a minimum aligned to a `float4`. |
| The SVM pointer value specified as the argument value can be the pointer |
| returned by {clSVMAlloc} or can be a pointer offset into the SVM region. |
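
As a rough illustration of the alignment requirement above, the following host-side sketch checks a candidate SVM pointer before it is passed to {clSetKernelArgSVMPointer}.
The helper names `svm_ptr_is_aligned` and `svm_offset_elements` are hypothetical and are not part of the OpenCL API; the check only assumes that the required alignment is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an OpenCL API): returns true if ptr is a
 * multiple of `alignment`, which must be a non-zero power of two. */
static bool svm_ptr_is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)ptr & (uintptr_t)(alignment - 1)) == 0;
}

/* Hypothetical helper: pointer `count` elements of `elem_size` bytes past
 * `base`, i.e. an offset into the same SVM region. */
static void *svm_offset_elements(void *base, size_t count, size_t elem_size)
{
    return (unsigned char *)base + count * elem_size;
}
```

For an argument declared as `+global float4 *p+`, an alignment of 16 bytes would be used; an offset of a whole number of `float4` elements from a suitably aligned base keeps the pointer aligned, while an offset of a single `float` generally does not.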
| |
| // refError |
| |
| {clSetKernelArgSVMPointer} returns {CL_SUCCESS} if the function was executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object. |
| * {CL_INVALID_OPERATION} if no devices in the context associated with _kernel_ support SVM. |
| * {CL_INVALID_ARG_INDEX} if _arg_index_ is not a valid argument index. |
| * {CL_INVALID_ARG_VALUE} if _arg_value_ specified is not a valid value. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clSetKernelExecInfo',desc='Pass additional information other than argument values to a kernel.',type='protos'] |
| -- |
| To pass additional information other than argument values to a kernel, call |
| the function |
| |
| include::{generated}/api/protos/clSetKernelExecInfo.txt[] |
| include::{generated}/api/version-notes/clSetKernelExecInfo.asciidoc[] |
| |
| * _kernel_ specifies the kernel object to which the information is passed. |
| * _param_name_ specifies the information to be passed to kernel. |
| The list of supported _param_name_ types and the corresponding values passed |
| in _param_value_ is described in the <<kernel-exec-info-table,Kernel |
| Execution Properties>> table. |
| * _param_value_size_ specifies the size in bytes of the memory pointed to by |
| _param_value_. |
| * _param_value_ is a pointer to memory where the appropriate values determined |
| by _param_name_ are specified. |
| |
| [[kernel-exec-info-table]] |
| .List of supported param_names by <<clSetKernelExecInfo>> |
| [width="100%",cols="<33%,<17%,<50%",options="header"] |
| |==== |
| | Kernel Exec Info | Type | Description |
| | {CL_KERNEL_EXEC_INFO_SVM_PTRS_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_EXEC_INFO_SVM_PTRS.asciidoc[] |
| | {void_TYPE}*[] |
| | SVM pointers must reference locations contained entirely within |
| buffers that are passed to kernel as arguments, or that are passed |
| through the execution information. |
| |
| Non-argument SVM buffers must be specified by passing pointers to |
| those buffers via {clSetKernelExecInfo} for coarse-grain and |
| fine-grain buffer SVM allocations but not for fine-grain system SVM |
| allocations. |
| | {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM.asciidoc[] |
| | {cl_bool_TYPE} |
| | This flag indicates whether the kernel uses pointers that are |
| fine-grain system SVM allocations. |
| These fine-grain system SVM pointers may be passed as arguments or |
| defined in SVM buffers that are passed as arguments to _kernel_. |
| |==== |
| |
| // refError |
| |
| {clSetKernelExecInfo} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object. |
| * {CL_INVALID_OPERATION} if no devices in the context associated with _kernel_ support SVM. |
| * {CL_INVALID_VALUE} if _param_name_ is not valid, if _param_value_ is |
| `NULL` or if the size specified by _param_value_size_ is not valid. |
| * {CL_INVALID_OPERATION} if _param_name_ is |
| {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} and _param_value_ is {CL_TRUE} |
| but no devices in context associated with _kernel_ support fine-grain |
| system SVM allocations. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [NOTE] |
| ==== |
| Coarse-grain or fine-grain buffer SVM pointers used by a kernel which |
| are not passed as kernel arguments must be specified using |
| {clSetKernelExecInfo} with {CL_KERNEL_EXEC_INFO_SVM_PTRS}. |
| For example, if SVM buffer A contains a pointer to another SVM buffer B, |
| and the kernel dereferences that pointer, then a pointer to B must |
| either be passed as an argument in the call to that kernel or it must be |
| made available to the kernel using {clSetKernelExecInfo}. |
| For example, we might pass extra SVM pointers as follows: |
| |
| [source,c] |
| ---- |
| clSetKernelExecInfo(kernel, |
| CL_KERNEL_EXEC_INFO_SVM_PTRS, |
| num_ptrs * sizeof(void *), |
| extra_svm_ptr_list); |
| ---- |
| |
| Here `num_ptrs` specifies the number of additional SVM pointers while |
| `extra_svm_ptr_list` specifies a pointer to memory containing those SVM |
| pointers. |
| |
| When calling {clSetKernelExecInfo} with {CL_KERNEL_EXEC_INFO_SVM_PTRS} to |
| specify pointers to non-argument SVM buffers as extra arguments to a kernel, |
| each of these pointers can be the SVM pointer returned by {clSVMAlloc} or |
| can be a pointer + offset into the SVM region. |
| It is sufficient to provide one pointer for each SVM buffer used. |
| |
| {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} is used to indicate whether |
| SVM pointers used by a kernel will refer to system allocations or not. |
| |
| {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} = {CL_FALSE} indicates that the |
| OpenCL implementation may assume that system pointers are not passed as |
| kernel arguments and are not stored inside SVM allocations passed as kernel |
| arguments. |
| |
| {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} = {CL_TRUE} indicates that the |
| OpenCL implementation must assume that system pointers might be passed as |
| kernel arguments and/or stored inside SVM allocations passed as kernel |
| arguments. |
| In this case, if the device to which the kernel is enqueued does not support |
| system SVM pointers, {clEnqueueNDRangeKernel} and {clEnqueueTask} will return a |
| {CL_INVALID_OPERATION} error. |
| If none of the devices in the context associated with kernel support |
| fine-grain system SVM allocations, {clSetKernelExecInfo} will return a |
| {CL_INVALID_OPERATION} error. |
| |
| If {clSetKernelExecInfo} has not been called with a value for |
| {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM}, the default value is used for |
| this kernel attribute. |
| The default value depends on whether the device on which the kernel is |
| enqueued supports fine-grain system SVM allocations. |
| If so, the default value used is {CL_TRUE} (system pointers might be passed); |
| otherwise, the default is {CL_FALSE}. |
| |
| A call to {clSetKernelExecInfo} for a given value of _param_name_ |
| replaces any prior value passed for that value of _param_name_. |
| Only one _param_value_ will be stored for each value of _param_name_. |
| ==== |
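
The rules above for {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} amount to a small decision table, which can be sketched in plain C.
This is only a model of the described behavior, not implementation code: the `fgs_setting` type and both function names are hypothetical and do not exist in the OpenCL API.

```c
#include <stdbool.h>

/* Hypothetical model of the kernel attribute; not an OpenCL type. */
typedef enum { FGS_NOT_SET, FGS_SET_FALSE, FGS_SET_TRUE } fgs_setting;

/* Value assumed for an enqueue on a device with the given support. */
static bool fgs_effective_value(fgs_setting setting, bool device_supports_fgs)
{
    if (setting == FGS_NOT_SET)
        return device_supports_fgs;   /* default follows device support */
    return setting == FGS_SET_TRUE;
}

/* True when the enqueue is expected to return CL_INVALID_OPERATION:
 * the application set CL_TRUE but the target device lacks support. */
static bool fgs_enqueue_rejected(fgs_setting setting, bool device_supports_fgs)
{
    return setting == FGS_SET_TRUE && !device_supports_fgs;
}
```

In this model, leaving the attribute unset never causes an enqueue to be rejected, because the default tracks what the target device supports.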
| |
| |
| === Copying Kernel Objects |
| |
| NOTE: Copying kernel objects is <<unified-spec, missing before>> version 2.1. |
| |
| [open,refpage='clCloneKernel',desc='Make a shallow copy of the kernel object.',type='protos'] |
| -- |
| To clone a kernel object, call the function |
| |
| include::{generated}/api/protos/clCloneKernel.txt[] |
| include::{generated}/api/version-notes/clCloneKernel.asciidoc[] |
| |
| * _source_kernel_ is a valid {cl_kernel_TYPE} object that will be copied. |
| _source_kernel_ will not be modified in any way by this function. |
| * _errcode_ret_ will be assigned an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
| |
| Cloning is used to make a shallow copy of the kernel object, its arguments |
| and any information passed to the kernel object using {clSetKernelExecInfo}. |
| If the kernel object was ready to be enqueued before copying it, the clone |
| of the kernel object is ready to enqueue. |
| |
| The returned kernel object is an exact copy of _source_kernel_, with one |
| caveat: the reference count on the returned kernel object is set as if it |
| had been returned by {clCreateKernel}. |
| The reference count of _source_kernel_ will not be changed. |
| |
| The resulting kernel will be in the same state as if {clCreateKernel} had |
| been called with the same arguments as those used to create |
| _source_kernel_, and then the latest call to {clSetKernelArg} or |
| {clSetKernelArgSVMPointer} for each argument index and the last call to |
| {clSetKernelExecInfo} for each value of _param_name_ made on |
| _source_kernel_ had been applied to the new kernel object. |
| |
| All arguments of the new kernel object must be intact and it may be |
| correctly used in the same situations as _source_kernel_ except those that |
| assume a pre-existing reference count. |
| Setting arguments on the new kernel object will not affect _source_kernel_ |
| except insofar as the argument points to a shared underlying entity and in |
| that situation behavior is as if two kernel objects had been created and the |
| same argument applied to each. |
| Only the data stored in the kernel object is copied; data referenced by the |
| kernel's arguments is not copied. |
| For example, if a buffer or pointer argument is set on a kernel object, the |
| pointer is copied but the underlying memory allocation is not. |
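
The shallow-copy behavior described above can be pictured with an analogy in plain C.
The `toy_kernel_args` structure and `toy_clone` function below are purely illustrative stand-ins and say nothing about how an implementation stores kernel arguments.

```c
#include <stddef.h>

/* Illustrative stand-in for the argument state held by a kernel object. */
typedef struct {
    int    n;      /* a by-value argument */
    float *data;   /* a pointer-like argument referring to shared memory */
} toy_kernel_args;

/* Shallow copy: the stored values (including the pointer) are copied,
 * but the memory that `data` points to is not duplicated. */
static toy_kernel_args toy_clone(const toy_kernel_args *source)
{
    return *source;
}
```

Changing `n` on the clone leaves the source untouched, whereas both copies still refer to the same underlying `data` allocation, which mirrors the statement that only the data stored in the kernel object is copied.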
| |
| // refError |
| |
| {clCloneKernel} returns a valid non-zero kernel object and _errcode_ret_ is |
| set to {CL_SUCCESS} if the kernel is successfully copied. |
| Otherwise it returns a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_KERNEL} if _source_kernel_ is not a valid kernel object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| |
| === Kernel Object Queries |
| |
| [open,refpage='clGetKernelInfo',desc='Returns information about the kernel object.',type='protos'] |
| -- |
| To return information about a kernel object, call the function |
| |
| include::{generated}/api/protos/clGetKernelInfo.txt[] |
| include::{generated}/api/version-notes/clGetKernelInfo.asciidoc[] |
| |
| * _kernel_ specifies the kernel object being queried. |
| * _param_name_ specifies the information to query. |
| The list of supported _param_name_ types and the information returned in |
| _param_value_ by {clGetKernelInfo} is described in the |
| <<kernel-info-table,Kernel Object Queries>> table. |
| * _param_value_ is a pointer to memory where the appropriate result being |
| queried is returned. |
| If _param_value_ is `NULL`, it is ignored. |
| * _param_value_size_ is used to specify the size in bytes of memory pointed to |
| by _param_value_. |
| This size must be {geq} size of return type as described in the |
| <<kernel-info-table,Kernel Object Queries>> table. |
| * _param_value_size_ret_ returns the actual size in bytes of data being |
| queried by _param_name_. |
| If _param_value_size_ret_ is `NULL`, it is ignored. |
| |
| [[kernel-info-table]] |
| .List of supported param_names by <<clGetKernelInfo>> |
| [width="100%",cols="<33%,<17%,<50%",options="header"] |
| |==== |
| | Kernel Info | Return Type | Description |
| | {CL_KERNEL_FUNCTION_NAME_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_FUNCTION_NAME.asciidoc[] |
| | {char_TYPE}[] |
| | Return the kernel function name. |
| | {CL_KERNEL_NUM_ARGS_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_NUM_ARGS.asciidoc[] |
| | {cl_uint_TYPE} |
| | Return the number of arguments to kernel. |
| | {CL_KERNEL_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}] |
| |
| include::{generated}/api/version-notes/CL_KERNEL_REFERENCE_COUNT.asciidoc[] |
| | {cl_uint_TYPE} |
| | Return the _kernel_ reference count. |
| | {CL_KERNEL_CONTEXT_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_CONTEXT.asciidoc[] |
| | {cl_context_TYPE} |
| | Return the context associated with _kernel_. |
| | {CL_KERNEL_PROGRAM_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_PROGRAM.asciidoc[] |
| | {cl_program_TYPE} |
| | Return the program object associated with kernel. |
| | {CL_KERNEL_ATTRIBUTES_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_ATTRIBUTES.asciidoc[] |
| | {char_TYPE}[] |
| | Returns any attributes specified using the `+__attribute__+` |
| OpenCL C qualifier (or using an OpenCL {cpp} qualifier syntax [[]] ) |
| with the kernel function declaration in the program source. |
| These attributes include attributes described in the earlier OpenCL |
| C kernel language specifications and other attributes supported by |
| an implementation. |
| |
| Attributes are returned as they were declared inside |
| `+__attribute__((...))+`, with any surrounding whitespace and |
| embedded newlines removed. |
| When multiple attributes are present, they are returned as a single, |
| space delimited string. |
| |
| For kernels not created from OpenCL C source and the |
| {clCreateProgramWithSource} API call the string returned from this |
| query will be empty. |
| |==== |
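
Because {CL_KERNEL_ATTRIBUTES} is returned as a single space delimited string, an application that wants the individual attributes has to split it.
The sketch below counts attributes while treating spaces inside parentheses as part of one attribute; that such interior spaces can occur (for example in `reqd_work_group_size(8, 8, 1)`) is an assumption, and `count_kernel_attributes` is a hypothetical helper rather than an OpenCL function.

```c
#include <stddef.h>

/* Hypothetical helper: counts attributes in a string such as the one
 * returned for CL_KERNEL_ATTRIBUTES. A space separates attributes only
 * when it appears outside any parentheses. */
static size_t count_kernel_attributes(const char *attrs)
{
    size_t count = 0;
    int depth = 0;      /* current parenthesis nesting depth */
    int in_token = 0;   /* non-zero while inside an attribute */

    for (const char *p = attrs; *p != '\0'; ++p) {
        if (*p == '(')
            depth++;
        else if (*p == ')' && depth > 0)
            depth--;

        if (*p == ' ' && depth == 0) {
            in_token = 0;           /* separator between attributes */
        } else if (!in_token) {
            in_token = 1;           /* first character of a new attribute */
            count++;
        }
    }
    return count;
}
```

An empty string, as returned for kernels not created from OpenCL C source, yields a count of zero.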
| |
| // refError |
| |
| {clGetKernelInfo} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes |
| specified by _param_value_size_ is < size of return type as described in |
| the <<kernel-info-table,Kernel Object Queries>> table and _param_value_ |
| is not `NULL`. |
| * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clGetKernelWorkGroupInfo',desc='Returns information about the kernel object that may be specific to a device.',type='protos'] |
| -- |
| To return information about the kernel object that may be specific to a |
| device, call the function |
| |
| include::{generated}/api/protos/clGetKernelWorkGroupInfo.txt[] |
| include::{generated}/api/version-notes/clGetKernelWorkGroupInfo.asciidoc[] |
| |
| * _kernel_ specifies the kernel object being queried. |
| * _device_ identifies a specific device in the list of devices associated with |
| _kernel_. |
| The list of devices is the list of devices in the OpenCL context that is |
| associated with _kernel_. |
| If the list of devices associated with _kernel_ is a single device, _device_ |
| can be a `NULL` value. |
| * _param_name_ specifies the information to query. |
| The list of supported _param_name_ types and the information returned in |
| _param_value_ by {clGetKernelWorkGroupInfo} is described in the |
| <<kernel-workgroup-info-table,Kernel Object Device Queries>> table. |
| * _param_value_ is a pointer to memory where the appropriate result being |
| queried is returned. |
| If _param_value_ is `NULL`, it is ignored. |
| * _param_value_size_ is used to specify the size in bytes of memory pointed to |
| by _param_value_. |
| This size must be {geq} size of return type as described in the |
| <<kernel-workgroup-info-table,Kernel Object Device Queries>> table. |
| * _param_value_size_ret_ returns the actual size in bytes of data being |
| queried by _param_name_. |
| If _param_value_size_ret_ is `NULL`, it is ignored. |
| |
| [[kernel-workgroup-info-table]] |
| .List of supported param_names by <<clGetKernelWorkGroupInfo>> |
| [width="100%",cols="<33%,<17%,<50%",options="header"] |
| |==== |
| | Kernel Work Group Info | Return Type | Description |
| | {CL_KERNEL_GLOBAL_WORK_SIZE_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_GLOBAL_WORK_SIZE.asciidoc[] |
| | {size_t_TYPE}[3] |
| | This provides a mechanism for the application to query the maximum |
| global size that can be used to execute a kernel (i.e. |
| _global_work_size_ argument to {clEnqueueNDRangeKernel}) on a custom |
| device given by device or a built-in kernel on an OpenCL device |
| given by device. |
| |
| If device is not a custom device and kernel is not a built-in |
| kernel, {clGetKernelWorkGroupInfo} returns the error |
| {CL_INVALID_VALUE}. |
| | {CL_KERNEL_WORK_GROUP_SIZE_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_WORK_GROUP_SIZE.asciidoc[] |
| | {size_t_TYPE} |
| | This provides a mechanism for the application to query the maximum |
| work-group size that can be used to execute the kernel on a specific |
| device given by device. |
| The OpenCL implementation uses the resource requirements of the |
| kernel (register usage etc.) to determine what this work-group size |
| should be. |
| |
| As a result, and unlike {CL_DEVICE_MAX_WORK_GROUP_SIZE}, this value may |
| vary from one kernel to another as well as from one device to another. |
| |
| {CL_KERNEL_WORK_GROUP_SIZE} will be less than or equal to |
| {CL_DEVICE_MAX_WORK_GROUP_SIZE} for a given kernel object. |
| | {CL_KERNEL_COMPILE_WORK_GROUP_SIZE_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_COMPILE_WORK_GROUP_SIZE.asciidoc[] |
| | {size_t_TYPE}[3] |
| | Returns the work-group size specified in the kernel source or IL. |
| |
| If the work-group size is not specified in the kernel source or IL, |
| (0, 0, 0) is returned. |
| | {CL_KERNEL_LOCAL_MEM_SIZE_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_LOCAL_MEM_SIZE.asciidoc[] |
| | {cl_ulong_TYPE} |
| | Returns the amount of local memory in bytes being used by a kernel. |
| This includes local memory that may be needed by an implementation |
| to execute the kernel, variables declared inside the kernel with the |
| `+__local+` address qualifier and local memory to be allocated for |
| arguments to the kernel declared as pointers with the `+__local+` |
| address qualifier and whose size is specified with {clSetKernelArg}. |
| |
| If the local memory size, for any pointer argument to the kernel |
| declared with the `+__local+` address qualifier, is not specified, |
| its size is assumed to be 0. |
| | {CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE.asciidoc[] |
| | {size_t_TYPE} |
| | Returns the preferred multiple of work-group size for launch. |
| This is a performance hint. |
| Specifying a work-group size that is not a multiple of the value |
| returned by this query as the value of the local work size argument |
| to {clEnqueueNDRangeKernel} will not fail to enqueue the kernel for |
| execution unless the work-group size specified is larger than the |
| device maximum. |
| | {CL_KERNEL_PRIVATE_MEM_SIZE_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_PRIVATE_MEM_SIZE.asciidoc[] |
| | {cl_ulong_TYPE} |
| | Returns the minimum amount of private memory, in bytes, used by each |
| work-item in the kernel. |
| This value may include any private memory needed by an |
| implementation to execute the kernel, including that used by the |
| language built-ins and variables declared inside the kernel with the |
| `+__private+` qualifier. |
| |==== |
| |
| // refError |
| |
| {clGetKernelWorkGroupInfo} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_DEVICE} if _device_ is not in the list of devices associated |
| with _kernel_ or if _device_ is `NULL` but there is more than one device |
| associated with _kernel_. |
| * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes |
| specified by _param_value_size_ is < size of return type as described in |
| the <<kernel-workgroup-info-table,Kernel Object Device Queries>> table |
| and _param_value_ is not `NULL`. |
| * {CL_INVALID_VALUE} if _param_name_ is {CL_KERNEL_GLOBAL_WORK_SIZE} and |
| _device_ is not a custom device and _kernel_ is not a built-in kernel. |
| * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clGetKernelSubGroupInfo',desc='Returns information about the kernel object.',type='protos'] |
| -- |
| To return information about a kernel object, call the function |
| |
| include::{generated}/api/protos/clGetKernelSubGroupInfo.txt[] |
| include::{generated}/api/version-notes/clGetKernelSubGroupInfo.asciidoc[] |
| Also see extension *cl_khr_subgroups*. |
| |
| * _kernel_ specifies the kernel object being queried. |
| * _device_ identifies a specific device in the list of devices associated with |
| _kernel_. |
| The list of devices is the list of devices in the OpenCL context that is |
| associated with _kernel_. |
| If the list of devices associated with _kernel_ is a single device, _device_ |
| can be a `NULL` value. |
| * _param_name_ specifies the information to query. |
| The list of supported _param_name_ types and the information returned in |
| _param_value_ by {clGetKernelSubGroupInfo} is described in the |
| <<kernel-subgroup-info-table,Kernel Object Subgroup Queries>> table. |
| * _input_value_size_ is used to specify the size in bytes of memory pointed to |
| by _input_value_. |
| This size must be == size of input type as described in the table below. |
| * _input_value_ is a pointer to memory where the appropriate parameterization |
| of the query is passed from. |
| If _input_value_ is `NULL`, it is ignored. |
| * _param_value_ is a pointer to memory where the appropriate result being |
| queried is returned. |
| If _param_value_ is `NULL`, it is ignored. |
| * _param_value_size_ is used to specify the size in bytes of memory pointed to |
| by _param_value_. |
| This size must be {geq} size of return type as described in the |
| <<kernel-subgroup-info-table,Kernel Object Subgroup Queries>> table. |
| * _param_value_size_ret_ returns the actual size in bytes of data being |
| queried by _param_name_. |
| If _param_value_size_ret_ is `NULL`, it is ignored. |
| |
| [[kernel-subgroup-info-table]] |
| .List of supported param_names by <<clGetKernelSubGroupInfo>> |
| [width="100%",cols="<25%,<25%,<25%,<25%",options="header"] |
| |==== |
| | Kernel Subgroup Info | Input Type | Return Type | Description |
| | {CL_KERNEL_MAX_SUB_GROUP_SIZE_FOR_NDRANGE_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_MAX_SUB_GROUP_SIZE_FOR_NDRANGE.asciidoc[] |
| Also see extension *cl_khr_subgroups*. |
| | {size_t_TYPE}* |
| | {size_t_TYPE} |
| | Returns the maximum sub-group size for this kernel. |
| All sub-groups must be the same size, except that the last sub-group in |
| any work-group (i.e. the sub-group with the maximum index) may be the |
| same size or smaller. |
| |
| The _input_value_ must be an array of {size_t_TYPE} values |
| corresponding to the local work size parameter of the intended |
| dispatch. |
| The number of dimensions in the ND-range will be inferred from |
| the value specified for _input_value_size_. |
| | {CL_KERNEL_SUB_GROUP_COUNT_FOR_NDRANGE_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_SUB_GROUP_COUNT_FOR_NDRANGE.asciidoc[] |
| Also see extension *cl_khr_subgroups*. |
| | {size_t_TYPE}* |
| | {size_t_TYPE} |
| | Returns the number of sub-groups that will be present in each |
| work-group for a given local work size. |
| All work-groups, apart from the last work-group in each dimension |
| in the presence of non-uniform work-group sizes, will have the |
| same number of sub-groups. |
| |
| The _input_value_ must be an array of {size_t_TYPE} values |
| corresponding to the local work size parameter of the intended |
| dispatch. |
| The number of dimensions in the ND-range will be inferred from |
| the value specified for _input_value_size_. |
| | {CL_KERNEL_LOCAL_SIZE_FOR_SUB_GROUP_COUNT_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_LOCAL_SIZE_FOR_SUB_GROUP_COUNT.asciidoc[] |
| Also see extension *cl_khr_subgroups*. |
| | {size_t_TYPE} |
| | {size_t_TYPE}[] |
| | Returns the local size that will generate the requested number |
| of sub-groups for the kernel. |
| The output array must be an array of {size_t_TYPE} values corresponding |
| to the local size parameter. |
| Any returned work-group will have one dimension. |
| Other dimensions inferred from the value specified for |
| _param_value_size_ will be filled with the value 1. |
| The returned value will produce an exact number of sub-groups |
| and result in no partial groups for an executing kernel except |
| in the case where the last work-group in a dimension has a size |
| different from that of the other groups. |
| If no work-group size can accommodate the requested number of |
| sub-groups, 0 will be returned in each element of the return |
| array. |
| | {CL_KERNEL_MAX_NUM_SUB_GROUPS_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_MAX_NUM_SUB_GROUPS.asciidoc[] |
| Also see extension *cl_khr_subgroups*. |
| | ignored |
| | {size_t_TYPE} |
| | This provides a mechanism for the application to query the |
| maximum number of sub-groups that may make up each work-group to |
| execute a kernel on a specific device given by _device_. |
| The OpenCL implementation uses the resource requirements of the |
| kernel (register usage etc.) to determine what this work-group |
| size should be. |
| The returned value may be used to compute a work-group size to |
| enqueue the kernel with to give a round number of sub-groups for |
| an enqueue. |
| | {CL_KERNEL_COMPILE_NUM_SUB_GROUPS_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_COMPILE_NUM_SUB_GROUPS.asciidoc[] |
| Also see extension *cl_khr_subgroups*. |
| | ignored |
| | {size_t_TYPE} |
| | Returns the number of sub-groups per work-group specified in the kernel |
| source or IL. If the sub-group count is not specified then 0 is returned. |
| |==== |
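
The relationship between a local work size and the sub-group count described in the table above can be illustrated with a non-normative sketch in plain C.
It does not call the OpenCL API and assumes that the work-items of a work-group are packed into sub-groups of a given size, with only the last sub-group allowed to be smaller.
The helper names `work_group_items`, `sub_groups_per_work_group` and `last_sub_group_size` are hypothetical; the values actually used by an implementation should be obtained with {clGetKernelSubGroupInfo}.

```c
#include <assert.h>
#include <stddef.h>

/* Total number of work-items in a work-group with the given local size. */
size_t work_group_items(const size_t *local_size, unsigned work_dim)
{
    size_t total = 1;
    for (unsigned d = 0; d < work_dim; ++d)
        total *= local_size[d];
    return total;
}

/* Assumed packing: full sub-groups first, then at most one smaller one. */
size_t sub_groups_per_work_group(const size_t *local_size,
                                 unsigned work_dim,
                                 size_t sub_group_size)
{
    assert(sub_group_size > 0);
    size_t total = work_group_items(local_size, work_dim);
    return (total + sub_group_size - 1) / sub_group_size; /* round up */
}

/* Size of the sub-group with the maximum index in the work-group. */
size_t last_sub_group_size(const size_t *local_size,
                           unsigned work_dim,
                           size_t sub_group_size)
{
    assert(sub_group_size > 0);
    size_t total = work_group_items(local_size, work_dim);
    assert(total > 0); /* a work-group with no work-items has no sub-groups */
    size_t rem = total % sub_group_size;
    return rem == 0 ? sub_group_size : rem;
}
```

Under this assumption, a local size that is a whole multiple of the sub-group size yields no partial sub-group, which is the property that {CL_KERNEL_LOCAL_SIZE_FOR_SUB_GROUP_COUNT} is described as providing.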
| |
| // refError |
| |
| {clGetKernelSubGroupInfo} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_DEVICE} if _device_ is not in the list of devices associated |
| with _kernel_ or if _device_ is `NULL` but there is more than one device |
| associated with _kernel_. |
| * {CL_INVALID_OPERATION} if _device_ does not support subgroups. |
| * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes |
| specified by _param_value_size_ is < size of return type as described in |
| the <<kernel-subgroup-info-table,Kernel Object Subgroup Queries>> table |
| and _param_value_ is not `NULL`. |
| * {CL_INVALID_VALUE} if _param_name_ is |
| {CL_KERNEL_MAX_SUB_GROUP_SIZE_FOR_NDRANGE} and the size in bytes specified by |
| _input_value_size_ is not valid or if _input_value_ is `NULL`. |
| * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clGetKernelArgInfo',desc='Returns information about the arguments of a kernel.',type='protos'] |
| -- |
| To return information about the arguments of a kernel, call the function |
| |
| include::{generated}/api/protos/clGetKernelArgInfo.txt[] |
| include::{generated}/api/version-notes/clGetKernelArgInfo.asciidoc[] |
| |
| * _kernel_ specifies the kernel object being queried. |
| * _arg_index_ is the argument index. |
| Arguments to the kernel are referred to by indices that go from 0 for the |
| leftmost argument to _n_ - 1, where _n_ is the total number of arguments |
| declared by a kernel. |
| * _param_name_ specifies the argument information to query. |
| The list of supported _param_name_ types and the information returned in |
| _param_value_ by {clGetKernelArgInfo} is described in the |
| <<kernel-argument-info-table,Kernel Argument Queries>> table. |
| * _param_value_ is a pointer to memory where the appropriate result being |
| queried is returned. |
| If _param_value_ is `NULL`, it is ignored. |
| * _param_value_size_ is used to specify the size in bytes of memory pointed to |
| by _param_value_. |
| This size must be >= the size of the return type as described in the |
| <<kernel-argument-info-table,Kernel Argument Queries>> table. |
| * _param_value_size_ret_ returns the actual size in bytes of data being |
| queried by _param_name_. |
| If _param_value_size_ret_ is `NULL`, it is ignored. |
| |
| Kernel argument information is only available if the program object |
| associated with _kernel_ is created with {clCreateProgramWithSource} and the |
| program executable was built with the `-cl-kernel-arg-info` option specified |
| in the _options_ argument to {clBuildProgram} or {clCompileProgram}. |
| |
| [[kernel-argument-info-table]] |
| .List of supported param_names by <<clGetKernelArgInfo>> |
| [width="100%",cols="<33%,<17%,<50%",options="header"] |
| |==== |
| | Kernel Arg Info | Return Type | Description |
| | {CL_KERNEL_ARG_ADDRESS_QUALIFIER_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_ARG_ADDRESS_QUALIFIER.asciidoc[] |
| | {cl_kernel_arg_address_qualifier_TYPE} |
| | Returns the address qualifier specified for the argument given by |
| _arg_index_. |
| This can be one of the following values: |
| |
| {CL_KERNEL_ARG_ADDRESS_GLOBAL_anchor} + |
| {CL_KERNEL_ARG_ADDRESS_LOCAL_anchor} + |
| {CL_KERNEL_ARG_ADDRESS_CONSTANT_anchor} + |
| {CL_KERNEL_ARG_ADDRESS_PRIVATE_anchor} |
| |
| If no address qualifier is specified, the default address qualifier |
| which is {CL_KERNEL_ARG_ADDRESS_PRIVATE} is returned. |
| | {CL_KERNEL_ARG_ACCESS_QUALIFIER_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_ARG_ACCESS_QUALIFIER.asciidoc[] |
| | {cl_kernel_arg_access_qualifier_TYPE} |
| | Returns the access qualifier specified for the argument given by |
| _arg_index_. |
| This can be one of the following values: |
| |
| {CL_KERNEL_ARG_ACCESS_READ_ONLY_anchor} + |
| {CL_KERNEL_ARG_ACCESS_WRITE_ONLY_anchor} + |
| {CL_KERNEL_ARG_ACCESS_READ_WRITE_anchor} + |
| {CL_KERNEL_ARG_ACCESS_NONE_anchor} |
| |
| If argument is not an image type and is not declared with the pipe |
| qualifier, {CL_KERNEL_ARG_ACCESS_NONE} is returned. |
| If argument is an image type, the access qualifier specified or the |
| default access qualifier is returned. |
| | {CL_KERNEL_ARG_TYPE_NAME_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_ARG_TYPE_NAME.asciidoc[] |
| | {char_TYPE}[] |
| | Returns the type name specified for the argument given by |
| _arg_index_. |
| The type name returned will be the argument type name as it was |
| declared with any whitespace removed. |
| If argument type name is an unsigned scalar type (i.e. unsigned |
| char, unsigned short, unsigned int, unsigned long), uchar, ushort, |
| uint and ulong will be returned. |
| The argument type name returned does not include any type |
| qualifiers. |
| | {CL_KERNEL_ARG_TYPE_QUALIFIER_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_ARG_TYPE_QUALIFIER.asciidoc[] |
| | {cl_kernel_arg_type_qualifier_TYPE} |
| | Returns a bitfield describing one or more type qualifiers specified |
| for the argument given by _arg_index_. |
| The returned values can be: |
| |
| {CL_KERNEL_ARG_TYPE_CONST_anchor} |
| footnote:[{fn-kernel-arg-type-qualifier}] |
| footnote:[{fn-kernel-arg-type-const-addr-space}] + |
| {CL_KERNEL_ARG_TYPE_RESTRICT_anchor} + |
| {CL_KERNEL_ARG_TYPE_VOLATILE_anchor} + |
| {CL_KERNEL_ARG_TYPE_PIPE_anchor}, or + |
| {CL_KERNEL_ARG_TYPE_NONE_anchor} |
| |
| {CL_KERNEL_ARG_TYPE_NONE} is returned for all parameters passed by |
| value. |
| | {CL_KERNEL_ARG_NAME_anchor} |
| |
| include::{generated}/api/version-notes/CL_KERNEL_ARG_NAME.asciidoc[] |
| | {char_TYPE}[] |
| | Returns the name specified for the argument given by _arg_index_. |
| |==== |
| |
| {clGetKernelArgInfo} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_ARG_INDEX} if _arg_index_ is not a valid argument index. |
| * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes |
| specified by _param_value_size_ is < size of return type as described in |
| the <<kernel-argument-info-table,Kernel Argument Queries>> table and |
| _param_value_ is not `NULL`. |
| * {CL_KERNEL_ARG_INFO_NOT_AVAILABLE} if the argument information is not |
| available for _kernel_. |
| * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object. |
| -- |
| |
| |
| == Executing Kernels |
| |
| [open,refpage='clEnqueueNDRangeKernel',desc='Enqueues a command to execute a kernel on a device.',type='protos'] |
| -- |
| To enqueue a command to execute a kernel on a device, call the function |
| |
| include::{generated}/api/protos/clEnqueueNDRangeKernel.txt[] |
| include::{generated}/api/version-notes/clEnqueueNDRangeKernel.asciidoc[] |
| |
| * _command_queue_ is a valid host command-queue. |
| The kernel will be queued for execution on the device associated with |
| _command_queue_. |
| * _kernel_ is a valid kernel object. |
| The OpenCL context associated with _kernel_ and _command_queue_ must be the |
| same. |
| * _work_dim_ is the number of dimensions used to specify the global work-items |
| and work-items in the work-group. |
| _work_dim_ must be greater than zero and less than or equal to |
| {CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS}. |
| If _global_work_size_ is `NULL`, or the value in any passed dimension is 0 |
| then the kernel command will trivially succeed after its event dependencies |
| are satisfied and subsequently update its completion event. |
| The behavior in this situation is similar to that of an enqueued marker, |
| except that unlike a marker, an enqueued kernel with no events passed to |
| _event_wait_list_ may run at any time. |
| * _global_work_offset_ can be used to specify an array of _work_dim_ unsigned |
| values that describe the offset used to calculate the global ID of a |
| work-item. |
| If _global_work_offset_ is `NULL`, the global IDs start at offset (0, 0, 0). |
| _global_work_offset_ must be `NULL` <<unified-spec, before>> version 1.1. |
| * _global_work_size_ points to an array of _work_dim_ unsigned values that |
| describe the number of global work-items in _work_dim_ dimensions that will |
| execute the kernel function. |
| The total number of global work-items is computed as _global_work_size_[0] |
| {times} ... {times} _global_work_size_[_work_dim_ - 1]. |
| * _local_work_size_ points to an array of _work_dim_ unsigned values that |
| describe the number of work-items that make up a work-group (also referred |
| to as the size of the work-group) that will execute the kernel specified by |
| _kernel_. |
| The total number of work-items in a work-group is computed as |
| _local_work_size_[0] {times} ... {times} _local_work_size_[_work_dim_ - 1]. |
| The total number of work-items in the work-group must be less than or equal |
| to the {CL_KERNEL_WORK_GROUP_SIZE} value specified in the |
| <<kernel-workgroup-info-table,Kernel Object Device Queries>> table, and the |
| number of work-items specified in _local_work_size_[0], ..., |
| _local_work_size_[_work_dim_ - 1] must be less than or equal to the |
| corresponding values specified by {CL_DEVICE_MAX_WORK_ITEM_SIZES}[0], ..., |
| {CL_DEVICE_MAX_WORK_ITEM_SIZES}[_work_dim_ - 1]. |
| The explicitly specified _local_work_size_ will be used to determine how to |
| break the global work-items specified by _global_work_size_ into appropriate |
| work-group instances. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this command and |
| can be used to query or wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
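
The limits on _local_work_size_ described above can be checked on the host before enqueuing.
The following is a non-normative sketch in plain C that does not call the OpenCL API: `max_work_group_size` stands in for the {CL_KERNEL_WORK_GROUP_SIZE} value and `max_item_sizes` for the {CL_DEVICE_MAX_WORK_ITEM_SIZES} values, both of which an application would obtain through the corresponding queries.
The function and enumerant names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    LOCAL_SIZE_OK,
    LOCAL_SIZE_ITEM_TOO_LARGE,  /* corresponds to CL_INVALID_WORK_ITEM_SIZE */
    LOCAL_SIZE_GROUP_TOO_LARGE  /* corresponds to CL_INVALID_WORK_GROUP_SIZE */
} local_size_status;

/* Checks the per-dimension and total work-group size limits.
   Zero-sized dimensions are not diagnosed by this sketch. */
local_size_status check_local_work_size(unsigned work_dim,
                                        const size_t *local_work_size,
                                        const size_t *max_item_sizes,
                                        size_t max_work_group_size)
{
    assert(local_work_size != NULL && max_item_sizes != NULL);
    size_t total = 1;
    for (unsigned d = 0; d < work_dim; ++d) {
        if (local_work_size[d] > max_item_sizes[d])
            return LOCAL_SIZE_ITEM_TOO_LARGE;
        /* Overflow-safe form of: total * local_work_size[d] > max */
        if (local_work_size[d] != 0 &&
            total > max_work_group_size / local_work_size[d])
            return LOCAL_SIZE_GROUP_TOO_LARGE;
        total *= local_work_size[d];
    }
    return LOCAL_SIZE_OK;
}
```

The division-based comparison avoids overflowing {size_t_TYPE} while the running product is accumulated.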
| |
| An ND-range kernel command may require uniform work-groups or may support non-uniform work-groups. |
| To support non-uniform work-groups: |
| |
| . The device associated with _command_queue_ must support non-uniform work-groups. |
| . The program object associated with _kernel_ must support non-uniform work-groups. |
| Specifically, this means: |
| .. If the program was created with {clCreateProgramWithSource}, the program must be compiled or built using the `-cl-std=CL2.0` or `-cl-std=CL3.0` build option and without the `-cl-uniform-work-group-size` build option. |
| .. If the program was created with {clCreateProgramWithIL} or {clCreateProgramWithBinary}, the program must be compiled or built without the `-cl-uniform-work-group-size` build option. |
| .. If the program was created using {clLinkProgram}, all input programs must support non-uniform work-groups. |
| |
| If non-uniform work-groups are supported, any single dimension |
| for which the global size is not divisible by the local size will be |
| partitioned into two regions. |
| One region will have work-groups that have the same number of work-items as |
| was specified by the local size parameter in that dimension. |
| The other region will have work-groups with fewer than the number of |
| work-items specified by the local size parameter in that dimension. |
| The global IDs and group IDs of the work-items in the first region will be |
| numerically lower than those in the second, and the second region will be at |
| most one work-group wide in that dimension. |
| Work-group sizes could be non-uniform in multiple dimensions, potentially |
| producing work-groups of up to 4 different sizes in a 2D range and 8 |
| different sizes in a 3D range. |
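
The partitioning described above can be illustrated with a non-normative sketch in plain C that does not call the OpenCL API.
For each dimension it computes the number of full work-groups and the size of the trailing smaller work-group, then multiplies the number of regions per dimension to count the distinct work-group sizes in the ND-range.
The type `dim_partition` and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t full_groups;    /* work-groups of exactly local_size work-items */
    size_t remainder_size; /* size of the trailing smaller work-group, or 0 */
} dim_partition;

/* Splits one dimension into its full region and its remainder region. */
dim_partition partition_dimension(size_t global_size, size_t local_size)
{
    assert(local_size > 0);
    dim_partition p;
    p.full_groups = global_size / local_size;
    p.remainder_size = global_size % local_size;
    return p;
}

/* Number of distinct work-group sizes across the whole ND-range. */
unsigned distinct_work_group_sizes(unsigned work_dim,
                                   const size_t *global_work_size,
                                   const size_t *local_work_size)
{
    unsigned count = 1;
    for (unsigned d = 0; d < work_dim; ++d) {
        dim_partition p = partition_dimension(global_work_size[d],
                                              local_work_size[d]);
        unsigned regions = 0;
        if (p.full_groups > 0)
            ++regions;
        if (p.remainder_size > 0)
            ++regions;
        count *= regions; /* at most 2 regions per dimension */
    }
    return count;
}
```

For example, a global size of (10, 7) with a local size of (4, 4) has two regions in each dimension and therefore four distinct work-group sizes: 4x4, 2x4, 4x3 and 2x3.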
| |
| If non-uniform work-groups are supported and _local_work_size_ is `NULL`, the OpenCL runtime may choose a uniform or non-uniform work-group size. |
| |
| Otherwise, when non-uniform work-groups are not supported, the size of each work-group must be uniform. |
| If _local_work_size_ is specified, the values specified in _global_work_size_[0], ..., _global_work_size_[_work_dim_ - 1] must be evenly divisible by the corresponding values specified in _local_work_size_[0], ..., _local_work_size_[_work_dim_ - 1]. |
| If _local_work_size_ is `NULL`, the OpenCL runtime must choose a uniform work-group size. |
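
The divisibility requirement for uniform work-groups can be expressed as a small host-side check.
This is a non-normative sketch in plain C that does not call the OpenCL API; the function name is hypothetical, and `non_uniform_supported` is assumed to have been determined by the application from the device and program conditions listed earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when an explicitly specified local work size is compatible
   with the global work size under the uniform/non-uniform rules. */
bool ndrange_shape_is_acceptable(bool non_uniform_supported,
                                 unsigned work_dim,
                                 const size_t *global_work_size,
                                 const size_t *local_work_size)
{
    assert(global_work_size != NULL && local_work_size != NULL);
    for (unsigned d = 0; d < work_dim; ++d) {
        if (local_work_size[d] == 0)
            return false; /* a zero local size can never form a work-group */
        if (!non_uniform_supported &&
            global_work_size[d] % local_work_size[d] != 0)
            return false; /* uniform work-groups require even division */
    }
    return true;
}
```

A result of `false` in the uniform case corresponds to the condition for which {clEnqueueNDRangeKernel} is specified to return {CL_INVALID_WORK_GROUP_SIZE}.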
| |
| The work-group size to be used for _kernel_ can also be specified in the |
| program source or intermediate language. |
| In this case the size of work-group specified by _local_work_size_ must |
| match the value specified in the program source. |
| |
| These work-group instances are executed in parallel across multiple compute |
| units or concurrently on the same compute unit. |
| |
| Each work-item is uniquely identified by a global identifier. |
| The global ID, which can be read inside the kernel, is computed using the |
| value given by _global_work_size_ and _global_work_offset_. |
| In addition, a work-item is also identified within a work-group by a unique |
| local ID. |
| The local ID, which can also be read by the kernel, is computed using the |
| value given by _local_work_size_. |
| The starting local ID is always (0, 0, ..., 0). |
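
The way a global ID is derived from these values can be illustrated with a non-normative sketch in plain C that does not call the OpenCL API.
It assumes the usual relation in which the global ID in each dimension is the global offset plus the work-group ID times the local size plus the local ID; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Global ID of one work-item in a single dimension. */
size_t global_id_in_dim(size_t offset, size_t group_id,
                        size_t local_size, size_t local_id)
{
    assert(local_id < local_size); /* local IDs start at 0 in every dimension */
    return offset + group_id * local_size + local_id;
}

/* Fills out[0..work_dim-1] with the global ID of one work-item.
   A NULL global_work_offset is treated as an offset of 0 in each dimension. */
void compute_global_id(unsigned work_dim,
                       const size_t *global_work_offset,
                       const size_t *group_id,
                       const size_t *local_work_size,
                       const size_t *local_id,
                       size_t *out)
{
    for (unsigned d = 0; d < work_dim; ++d) {
        size_t offset = global_work_offset ? global_work_offset[d] : 0;
        out[d] = global_id_in_dim(offset, group_id[d],
                                  local_work_size[d], local_id[d]);
    }
}
```

Because a smaller trailing work-group always follows the full-size work-groups in its dimension, the same relation also applies when work-group sizes are non-uniform, provided `local_work_size` holds the size specified at enqueue.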
| |
| // refError |
| |
| {clEnqueueNDRangeKernel} returns {CL_SUCCESS} if the kernel-instance was |
| successfully queued. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_PROGRAM_EXECUTABLE} if there is no successfully built program |
| executable available for device associated with _command_queue_. |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object. |
| * {CL_INVALID_CONTEXT} if context associated with _command_queue_ and |
| _kernel_ are not the same or if the context associated with |
| _command_queue_ and events in _event_wait_list_ are not the same. |
| * {CL_INVALID_KERNEL_ARGS} if the kernel argument values have not been |
| specified. |
| * {CL_INVALID_WORK_DIMENSION} if _work_dim_ is not a valid value (i.e. a |
| value between 1 and {CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS}). |
| * {CL_INVALID_GLOBAL_WORK_SIZE} if _global_work_size_ is `NULL` or if any of |
| the values specified in _global_work_size_[0], ... |
| _global_work_size_[_work_dim_ - 1] are 0. |
| Returning this error code under these circumstances is <<unified-spec, |
| deprecated by>> version 2.1. |
| * {CL_INVALID_GLOBAL_WORK_SIZE} if any of the values specified in |
| _global_work_size_[0], ... _global_work_size_[_work_dim_ - 1] exceed the |
| maximum value representable by {size_t_TYPE} on the device on which the |
| kernel-instance will be enqueued. |
| * {CL_INVALID_GLOBAL_OFFSET} if the value specified in _global_work_size_ |
| {plus} the corresponding value in _global_work_offset_ for any |
| dimension is greater than the maximum value representable by {size_t_TYPE} on |
| the device on which the kernel-instance will be enqueued, or if |
| _global_work_offset_ is non-`NULL` <<unified-spec, before>> version 1.1. |
| * {CL_INVALID_WORK_GROUP_SIZE} if _local_work_size_ is specified and does |
| not match the required work-group size for _kernel_ in the program |
| source. |
| * {CL_INVALID_WORK_GROUP_SIZE} if _local_work_size_ is specified and is not |
| consistent with the required number of sub-groups for _kernel_ in the |
| program source. |
| * {CL_INVALID_WORK_GROUP_SIZE} if _local_work_size_ is specified and the |
| total number of work-items in the work-group computed as |
| _local_work_size_[0] {times} ... {times} _local_work_size_[_work_dim_ - 1] is |
| greater than the value specified by {CL_KERNEL_WORK_GROUP_SIZE} in the |
| <<kernel-workgroup-info-table,Kernel Object Device Queries>> table. |
| * {CL_INVALID_WORK_GROUP_SIZE} if the work-group size must be uniform and |
| the _local_work_size_ is not `NULL`, is not equal to the required |
| work-group size specified in the kernel source, or the |
| _global_work_size_ is not evenly divisible by the _local_work_size_. |
| * {CL_INVALID_WORK_ITEM_SIZE} if the number of work-items specified in any |
| of _local_work_size_[0], ... _local_work_size_[_work_dim_ - 1] is |
| greater than the corresponding values specified by |
| {CL_DEVICE_MAX_WORK_ITEM_SIZES}[0], ..., |
| {CL_DEVICE_MAX_WORK_ITEM_SIZES}[_work_dim_ - 1]. |
| * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if a sub-buffer object is specified as |
| the value for an argument that is a buffer object and the _offset_ |
| specified when the sub-buffer object is created is not aligned to the |
| {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device associated with _command_queue_. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_INVALID_IMAGE_SIZE} if an image object is specified as an argument |
| value and the image dimensions (image width, height, specified or |
| computed row and/or slice pitch) are not supported by the device associated |
| with _command_queue_. |
| * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if an image object is specified as an |
| argument value and the image format (image channel order and data type) |
| is not supported by the device associated with _command_queue_. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to queue the execution |
| instance of _kernel_ on the command-queue because of insufficient |
| resources needed to execute the kernel. |
| For example, the explicitly specified _local_work_size_ causes a failure |
| to execute the kernel because of insufficient resources such as |
| registers or local memory. |
| Another example is when the number of read-only image args used in |
| _kernel_ exceeds the {CL_DEVICE_MAX_READ_IMAGE_ARGS} value for the device, |
| the number of write-only and read-write image args used in _kernel_ |
| exceeds the {CL_DEVICE_MAX_READ_WRITE_IMAGE_ARGS} value for the device, or the |
| number of samplers used in _kernel_ exceeds {CL_DEVICE_MAX_SAMPLERS} for |
| the device. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for data store associated with image or buffer objects specified |
| as arguments to _kernel_. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_INVALID_OPERATION} if SVM pointers are passed as arguments to a kernel |
| and the device does not support SVM or if system pointers are passed as |
| arguments to a kernel and/or stored inside SVM allocations passed as |
| kernel arguments and the device does not support fine grain system SVM |
| allocations. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clEnqueueTask',desc='Enqueues a command to execute a kernel, using a single work-item, on a device.',type='protos'] |
| -- |
| To enqueue a command to execute a kernel on a device, using a single work-item, |
| call the function |
| |
| include::{generated}/api/protos/clEnqueueTask.txt[] |
| include::{generated}/api/version-notes/clEnqueueTask.asciidoc[] |
| |
| * _command_queue_ is a valid host command-queue. |
| The kernel will be queued for execution on the device associated with |
| _command_queue_. |
| * _kernel_ is a valid kernel object. |
| The OpenCL context associated with _kernel_ and _command_queue_ must be the |
| same. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| If _event_wait_list_ is `NULL`, then this particular command does not wait |
| on any event to complete. |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this command and |
| can be used to query or wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| |
| {clEnqueueTask} is equivalent to calling {clEnqueueNDRangeKernel} with |
| _work_dim_ set to 1, _global_work_offset_ set to `NULL`, _global_work_size_[0] |
| set to 1, and _local_work_size_[0] set to 1. |
| |
| // refError |
| |
| {clEnqueueTask} returns {CL_SUCCESS} if the kernel-instance was successfully |
| queued. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_PROGRAM_EXECUTABLE} if there is no successfully built program |
| executable available for device associated with _command_queue_. |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object. |
| * {CL_INVALID_CONTEXT} if context associated with _command_queue_ and |
| _kernel_ are not the same or if the context associated with |
| _command_queue_ and events in _event_wait_list_ are not the same. |
| * {CL_INVALID_KERNEL_ARGS} if the kernel argument values have not been |
| specified. |
| * {CL_INVALID_WORK_GROUP_SIZE} if a work-group size is specified for _kernel_ |
| in the program source and it is not (1, 1, 1). |
| // TODO I'm not sure if the next error makes sense for a 'task'. |
| * {CL_INVALID_WORK_GROUP_SIZE} if the required number of sub-groups is |
| specified for _kernel_ in the program source and is not consistent with a |
| work-group size of (1, 1, 1). |
| * {CL_MISALIGNED_SUB_BUFFER_OFFSET} if a sub-buffer object is specified as |
| the value for an argument that is a buffer object and the _offset_ |
| specified when the sub-buffer object is created is not aligned to the |
| {CL_DEVICE_MEM_BASE_ADDR_ALIGN} value for the device associated with _command_queue_. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_INVALID_IMAGE_SIZE} if an image object is specified as an argument |
| value and the image dimensions (image width, height, specified or |
| computed row and/or slice pitch) are not supported by the device associated |
| with _command_queue_. |
| * {CL_IMAGE_FORMAT_NOT_SUPPORTED} if an image object is specified as an |
| argument value and the image format (image channel order and data type) |
| is not supported by the device associated with _command_queue_. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to queue the execution |
| instance of _kernel_ on the command-queue because of insufficient |
| resources needed to execute the kernel. See how this error code is used |
| with {clEnqueueNDRangeKernel} for examples. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for data store associated with image or buffer objects specified |
| as arguments to _kernel_. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_INVALID_OPERATION} if SVM pointers are passed as arguments to a kernel |
| and the device does not support SVM or if system pointers are passed as |
| arguments to a kernel and/or stored inside SVM allocations passed as |
| kernel arguments and the device does not support fine grain system SVM |
| allocations. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clEnqueueNativeKernel',desc='Enqueues a command to execute a native C/C++ function not compiled using the OpenCL compiler.',type='protos'] |
| -- |
| To enqueue a command to execute a native C/{cpp} function not compiled using |
| the OpenCL compiler, call the function |
| |
| include::{generated}/api/protos/clEnqueueNativeKernel.txt[] |
| include::{generated}/api/version-notes/clEnqueueNativeKernel.asciidoc[] |
| |
| * _command_queue_ is a valid host command-queue. |
| A native user function can only be executed on a command-queue created on a |
| device that has {CL_EXEC_NATIVE_KERNEL} capability set in |
| {CL_DEVICE_EXECUTION_CAPABILITIES} as specified in the |
| <<device-queries-table,Device Queries>> table. |
| * _user_func_ is a pointer to a host-callable user function. |
| * _args_ is a pointer to the args list that _user_func_ should be called with. |
| * _cb_args_ is the size in bytes of the args list that _args_ points to. |
| * _num_mem_objects_ is the number of buffer objects that are passed in _args_. |
| * _mem_list_ is a list of valid buffer objects, if _num_mem_objects_ > 0. |
| The buffer object values specified in _mem_list_ are memory object handles |
| (`{cl_mem_TYPE}` values) returned by {clCreateBuffer} or {clCreateBufferWithProperties}, |
| or `NULL`. |
| * _args_mem_loc_ is a pointer to appropriate locations that _args_ points to |
| where memory object handles ({cl_mem_TYPE} values) are stored. |
| Before the user function is executed, the memory object handles are replaced |
| by pointers to global memory. |
| * _event_wait_list_, _num_events_in_wait_list_ and _event_ are as described in |
| {clEnqueueNDRangeKernel}. |
| |
| The data pointed to by _args_ and _cb_args_ bytes in size will be copied and |
| a pointer to this copied region will be passed to _user_func_. |
| The copy needs to be done because the memory objects ({cl_mem_TYPE} values) that |
| _args_ may contain need to be modified and replaced by appropriate pointers |
| to global memory. |
| When {clEnqueueNativeKernel} returns, the memory region pointed to by _args_ |
| can be reused by the application. |
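
The copy-and-replace step described above can be illustrated with a non-normative sketch in plain C that does not call the OpenCL API.
The function `prepare_native_args` is hypothetical, and `mem_ptrs` is an assumed stand-in for the global memory pointers that an implementation would derive from the buffer objects in _mem_list_; how an implementation actually performs this step is not specified here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Copies cb_args bytes from args and, for each entry of args_mem_loc,
   overwrites the matching slot in the copy with a data pointer.
   Returns the copy, or NULL if the allocation fails. */
void *prepare_native_args(const void *args, size_t cb_args,
                          size_t num_mem_objects,
                          void *const *mem_ptrs,
                          const void *const *args_mem_loc)
{
    assert(args != NULL && cb_args > 0);
    unsigned char *copy = malloc(cb_args);
    if (copy == NULL)
        return NULL;
    memcpy(copy, args, cb_args);

    for (size_t i = 0; i < num_mem_objects; ++i) {
        /* args_mem_loc[i] points into the caller's args; translate it
           into the same byte offset inside the private copy. */
        size_t offset = (size_t)((const unsigned char *)args_mem_loc[i] -
                                 (const unsigned char *)args);
        assert(offset + sizeof(void *) <= cb_args);
        memcpy(copy + offset, &mem_ptrs[i], sizeof(void *));
    }
    return copy; /* in this sketch the caller releases it with free() */
}
```

Because only the copy is patched, the caller's _args_ region is left unmodified, which is consistent with the application being free to reuse it once {clEnqueueNativeKernel} returns.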
| |
| // refError |
| |
| {clEnqueueNativeKernel} returns {CL_SUCCESS} if the user function execution |
| instance was successfully queued. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events |
| in _event_wait_list_ are not the same. |
| * {CL_INVALID_VALUE} if _user_func_ is `NULL`. |
| * {CL_INVALID_VALUE} if _args_ is a `NULL` value and _cb_args_ > 0, or if |
| _args_ is a `NULL` value and _num_mem_objects_ > 0. |
| * {CL_INVALID_VALUE} if _args_ is not `NULL` and _cb_args_ is 0. |
| * {CL_INVALID_VALUE} if _num_mem_objects_ > 0 and _mem_list_ or |
| _args_mem_loc_ are `NULL`. |
| * {CL_INVALID_VALUE} if _num_mem_objects_ = 0 and _mem_list_ or |
| _args_mem_loc_ are not `NULL`. |
| * {CL_INVALID_OPERATION} if the device associated with _command_queue_ |
| cannot execute the native kernel. |
| * {CL_INVALID_MEM_OBJECT} if one or more memory objects specified in |
| _mem_list_ are not valid or are not buffer objects. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to queue the execution |
| instance of _user_func_ on the command-queue because of insufficient |
| resources needed to execute the user function. |
| * {CL_MEM_OBJECT_ALLOCATION_FAILURE} if there is a failure to allocate |
| memory for data store associated with buffer objects specified in |
| _mem_list_. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_INVALID_OPERATION} if SVM pointers are passed as arguments to a kernel |
| and the device does not support SVM or if system pointers are passed as |
| arguments to a kernel and/or stored inside SVM allocations passed as |
| kernel arguments and the device does not support fine grain system SVM |
| allocations. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
| [NOTE] |
| ==== |
| The total number of read-only images specified as arguments to a kernel |
| cannot exceed {CL_DEVICE_MAX_READ_IMAGE_ARGS}. |
| Each image array argument to a kernel declared with the `read_only` |
| qualifier counts as one image. |
| The total number of write-only images specified as arguments to a kernel |
| cannot exceed {CL_DEVICE_MAX_WRITE_IMAGE_ARGS}. |
| Each image array argument to a kernel declared with the `write_only` |
| qualifier counts as one image. |
| |
| The total number of read-write images specified as arguments to a kernel |
| cannot exceed {CL_DEVICE_MAX_READ_WRITE_IMAGE_ARGS}. |
| Each image array argument to a kernel declared with the `read_write` |
| qualifier counts as one image. |
| ==== |
| -- |
| |
| |
| [[event-objects]] |
| == Event Objects |
| |
| // Check: Is this list of event object APIs really necessary? |
| |
| //Event objects can be used to refer to a kernel-instance command |
| //({clEnqueueNDRangeKernel}, {clEnqueueTask}, {clEnqueueNativeKernel}), read, |
| //write, map and copy commands on memory objects ( |
| //{clEnqueueReadBuffer}, {clEnqueueWriteBuffer}, |
| //{clEnqueueMapBuffer}, {clEnqueueUnmapMemObject}, |
| //{clEnqueueReadBufferRect}, {clEnqueueWriteBufferRect}, |
| //{clEnqueueReadImage}, {clEnqueueWriteImage}, {clEnqueueMapImage}, |
| //{clEnqueueCopyBuffer}, {clEnqueueCopyImage}, |
| //{clEnqueueCopyBufferRect}, |
| //{clEnqueueCopyBufferToImage}, {clEnqueueCopyImageToBuffer}), |
| //{clEnqueueSVMMemcpy}, {clEnqueueSVMMemFill}, |
| //{clEnqueueSVMMap}, {clEnqueueSVMUnmap}, {clEnqueueSVMFree}, {clEnqueueMarker}, |
| //{clEnqueueMarkerWithWaitList}, {clEnqueueWaitForEvents}, {clEnqueueBarrier}, |
| //{clEnqueueBarrierWithWaitList}, (refer to |
| //<<markers-barriers-waiting-for-events,Markers, Barriers and Waiting for |
| //Events>>) or user events. |
| |
| An event object can be used to track the execution status of a command. |
| The API calls that enqueue commands to a command-queue create a new event |
| object that is returned in the _event_ argument. |
| If an error occurs while enqueuing the command in the command-queue, the |
| _event_ argument does not return an event object. |
| |
| The execution status of an enqueued command at any given point in time can |
| be one of the following: |
| |
| * {CL_QUEUED_anchor}: Indicates that the command has been enqueued in a |
| command-queue. |
| This is the initial state of all events except user events. |
| * {CL_SUBMITTED_anchor}: The initial state for all user events. |
| For all other events, indicates that the command has been submitted |
| by the host to the device. |
| * {CL_RUNNING_anchor}: Indicates that the device has started executing this |
| command. |
| In order for the execution status of an enqueued command to change from |
| {CL_SUBMITTED} to {CL_RUNNING}, all events that this command is waiting on |
| must have completed successfully i.e. their execution status must be |
| {CL_COMPLETE}. |
| * {CL_COMPLETE_anchor}: Indicates that the command has successfully completed. |
| * An Error Code: A negative integer value indicating that the command was |
| abnormally terminated. Abnormal termination may occur for a number of reasons, |
| such as a bad memory access. |
| |
| [NOTE] |
| ==== |
| A command is considered to be complete if its execution status is |
| {CL_COMPLETE} or is a negative integer value. |
| |
| If the execution of a command is terminated, the command-queue associated |
| with this terminated command, and the associated context (and all other |
| command-queues in this context) may no longer be available. |
| The behavior of OpenCL API calls that use this context (and command-queues |
| associated with this context) is now considered to be |
| implementation-defined. |
| The user registered callback function specified when the context is created can |
| be used to report appropriate error information. |
| ==== |
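
The status rules above can be sketched as two small host-side predicates.
This is only an illustrative sketch: the `STATUS_*` names and the helper functions are hypothetical, and the numeric values are copied from the definitions in `CL/cl.h` so that the block compiles without the OpenCL headers.

```c
#include <stdbool.h>

/* Execution status values, mirroring CL_COMPLETE, CL_RUNNING, CL_SUBMITTED
 * and CL_QUEUED from CL/cl.h. The STATUS_* names are hypothetical and exist
 * only so this sketch builds without the OpenCL headers. */
enum {
    STATUS_COMPLETE  = 0x0,
    STATUS_RUNNING   = 0x1,
    STATUS_SUBMITTED = 0x2,
    STATUS_QUEUED    = 0x3
};

/* True once the command has finished, whether successfully or abnormally. */
bool command_is_complete(int status)
{
    return status == STATUS_COMPLETE || status < 0;
}

/* True only if the command was abnormally terminated (negative status). */
bool command_failed(int status)
{
    return status < 0;
}

/* Human-readable name for a status value, for logging. */
const char *status_name(int status)
{
    switch (status) {
    case STATUS_COMPLETE:  return "CL_COMPLETE";
    case STATUS_RUNNING:   return "CL_RUNNING";
    case STATUS_SUBMITTED: return "CL_SUBMITTED";
    case STATUS_QUEUED:    return "CL_QUEUED";
    default:               return status < 0 ? "error code" : "unknown";
    }
}
```

In an application, the value passed to such helpers would typically come from a {CL_EVENT_COMMAND_EXECUTION_STATUS} query or from the _event_command_status_ argument of an event callback.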
| |
| [open,refpage='clCreateUserEvent',desc='Creates a user event object.',type='protos'] |
| -- |
| To create a user event object, call the function |
| |
| include::{generated}/api/protos/clCreateUserEvent.txt[] |
| include::{generated}/api/version-notes/clCreateUserEvent.asciidoc[] |
| |
| * _context_ must be a valid OpenCL context. |
| * _errcode_ret_ will return an appropriate error code. |
| If _errcode_ret_ is `NULL`, no error code is returned. |
| |
| User events allow applications to enqueue commands that wait on a user event |
| to finish before the command is executed by the device. |
| |
| // refError |
| |
| {clCreateUserEvent} returns a valid non-zero event object and _errcode_ret_ |
| is set to {CL_SUCCESS} if the user event object is created successfully. |
| Otherwise, it returns a `NULL` value with one of the following error values |
| returned in _errcode_ret_: |
| |
| * {CL_INVALID_CONTEXT} if _context_ is not a valid context. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| |
| The initial execution status for the user event object is {CL_SUBMITTED}. |
| -- |
| |
| [open,refpage='clSetUserEventStatus',desc='Sets the execution status of a user event object.',type='protos'] |
| -- |
| To set the execution status of a user event object, call the function |
| |
| include::{generated}/api/protos/clSetUserEventStatus.txt[] |
| include::{generated}/api/version-notes/clSetUserEventStatus.asciidoc[] |
| |
| * _event_ is a user event object created using {clCreateUserEvent}. |
| * _execution_status_ specifies the new execution status to be set and can be |
| {CL_COMPLETE} or a negative integer value to indicate an error. |
| A negative integer value causes all enqueued commands that wait on this user |
| event to be terminated. |
| {clSetUserEventStatus} can only be called once to change the execution |
| status of _event_. |
| |
| [NOTE] |
| ==== |
| If there are enqueued commands with user events in the _event_wait_list_ |
| argument of *+clEnqueue*+* commands, the user must ensure that the status of |
| these user events being waited on is set using {clSetUserEventStatus} |
| before any OpenCL APIs that release OpenCL objects except for event objects |
| are called; otherwise the behavior is undefined. |
| |
| For example, the following code sequence will result in undefined behavior |
| of {clReleaseMemObject}. |
| |
| [source,c] |
| ---- |
| ev1 = clCreateUserEvent(ctx, NULL); |
| clEnqueueWriteBuffer(cq, buf1, CL_FALSE, ..., 1, &ev1, NULL); |
| clEnqueueWriteBuffer(cq, buf2, CL_FALSE, ...); |
| clReleaseMemObject(buf2); // undefined behavior: the status of ev1 has not been set yet |
| clSetUserEventStatus(ev1, CL_COMPLETE); |
| ---- |
| |
| The following code sequence, however, works correctly. |
| |
| [source,c] |
| ---- |
| ev1 = clCreateUserEvent(ctx, NULL); |
| clEnqueueWriteBuffer(cq, buf1, CL_FALSE, ..., 1, &ev1, NULL); |
| clEnqueueWriteBuffer(cq, buf2, CL_FALSE, ...); |
| clSetUserEventStatus(ev1, CL_COMPLETE); // set the user event status first |
| clReleaseMemObject(buf2); // defined: the status of ev1 is already set |
| ---- |
| ==== |
| |
| // refError |
| |
| {clSetUserEventStatus} returns {CL_SUCCESS} if the function was executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_EVENT} if _event_ is not a valid user event object. |
| * {CL_INVALID_VALUE} if the _execution_status_ is not {CL_COMPLETE} or a |
| negative integer value. |
| * {CL_INVALID_OPERATION} if the _execution_status_ for _event_ has already |
| been changed by a previous call to {clSetUserEventStatus}. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
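
The set-once rule and the accepted status values can be illustrated with a small host-side model.
This is a sketch under stated assumptions, not how any implementation stores events: `struct user_event_model` and its functions are hypothetical, the constants copy the values from `CL/cl.h`, and the order in which the two error checks are made is a choice of this sketch that the text above does not specify.

```c
/* Values mirrored from CL/cl.h; the SKETCH_* names are hypothetical. */
enum {
    SKETCH_SUCCESS           =   0,  /* CL_SUCCESS */
    SKETCH_INVALID_VALUE     = -30,  /* CL_INVALID_VALUE */
    SKETCH_INVALID_OPERATION = -59,  /* CL_INVALID_OPERATION */
    SKETCH_COMPLETE          = 0x0,  /* CL_COMPLETE */
    SKETCH_SUBMITTED         = 0x2   /* CL_SUBMITTED */
};

/* Hypothetical model of a user event: its status and a set-once flag. */
struct user_event_model {
    int status;      /* current execution status */
    int status_set;  /* non-zero once the status has been changed */
};

/* A freshly created user event starts in the CL_SUBMITTED state. */
void user_event_model_init(struct user_event_model *ev)
{
    ev->status = SKETCH_SUBMITTED;
    ev->status_set = 0;
}

/* Models the documented return values of clSetUserEventStatus. */
int user_event_model_set_status(struct user_event_model *ev,
                                int execution_status)
{
    /* Only CL_COMPLETE or a negative error value is accepted. */
    if (execution_status != SKETCH_COMPLETE && execution_status >= 0)
        return SKETCH_INVALID_VALUE;
    /* The status may be changed only once. */
    if (ev->status_set)
        return SKETCH_INVALID_OPERATION;
    ev->status = execution_status;
    ev->status_set = 1;
    return SKETCH_SUCCESS;
}
```

A second call on the same model object reports the invalid-operation value, which mirrors the rule that {clSetUserEventStatus} can only be called once per user event.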
| |
| [open,refpage='clWaitForEvents',desc='Waits on the host thread for commands identified by event objects to complete.',type='protos'] |
| -- |
| To wait for events to complete, call the function |
| |
| include::{generated}/api/protos/clWaitForEvents.txt[] |
| include::{generated}/api/version-notes/clWaitForEvents.asciidoc[] |
| |
| * _num_events_ is the number of events in _event_list_. |
| * _event_list_ is a pointer to a list of event object handles. |
| |
| This function waits on the host thread for commands identified by event |
| objects in _event_list_ to complete. |
| A command is considered complete if its execution status is {CL_COMPLETE} or a |
| negative value. |
| The events specified in _event_list_ act as synchronization points. |
| |
| // refError |
| |
| {clWaitForEvents} returns {CL_SUCCESS} if the execution status of all events |
| in _event_list_ is {CL_COMPLETE}. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_VALUE} if _num_events_ is zero or _event_list_ is `NULL`. |
| * {CL_INVALID_CONTEXT} if events specified in _event_list_ do not belong to |
| the same context. |
| * {CL_INVALID_EVENT} if event objects specified in _event_list_ are not |
| valid event objects. |
| * {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST} if the execution status of |
| any of the events in _event_list_ is a negative integer value. |
| This error code is <<unified-spec, missing before>> version 1.1. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
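
As a rough illustration of how the return value relates to the final status of each event, the following sketch maps a list of final execution statuses to the value the text above describes.
The function name and the `WAIT_*` constants are hypothetical (the values are copied from `CL/cl.h`), and the sketch covers only the argument check and the status check, not context or handle validation.

```c
#include <stddef.h>

/* Values mirrored from CL/cl.h; the WAIT_* names are hypothetical. */
enum {
    WAIT_SUCCESS           =   0,  /* CL_SUCCESS */
    WAIT_INVALID_VALUE     = -30,  /* CL_INVALID_VALUE */
    WAIT_EXEC_STATUS_ERROR = -14   /* CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST */
};

/* final_statuses[i] is the execution status of event i after it has
 * completed: 0 (CL_COMPLETE) or a negative error code. */
int wait_result_model(size_t num_events, const int *final_statuses)
{
    if (num_events == 0 || final_statuses == NULL)
        return WAIT_INVALID_VALUE;
    for (size_t i = 0; i < num_events; ++i) {
        if (final_statuses[i] < 0)
            return WAIT_EXEC_STATUS_ERROR;  /* at least one command failed */
    }
    return WAIT_SUCCESS;  /* every command completed successfully */
}
```

The practical consequence is that a caller should not treat a return from the wait as proof of success; the return value distinguishes successful completion from abnormal termination.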
| |
| [open,refpage='clGetEventInfo',desc='Returns information about the event object.',type='protos'] |
| -- |
| To return information about an event object, call the function |
| |
| include::{generated}/api/protos/clGetEventInfo.txt[] |
| include::{generated}/api/version-notes/clGetEventInfo.asciidoc[] |
| |
| * _event_ specifies the event object being queried. |
| * _param_name_ specifies the information to query. |
| The list of supported _param_name_ types and the information returned in |
| _param_value_ by {clGetEventInfo} is described in the |
| <<event-info-table,Event Object Queries>> table. |
| * _param_value_ is a pointer to memory where the appropriate result being |
| queried is returned. |
| If _param_value_ is `NULL`, it is ignored. |
| * _param_value_size_ is used to specify the size in bytes of memory pointed to |
| by _param_value_. |
| This size must be {geq} size of return type as described in the |
| <<event-info-table,Event Object Queries>> table. |
| * _param_value_size_ret_ returns the actual size in bytes of data being |
| queried by _param_name_. |
| If _param_value_size_ret_ is `NULL`, it is ignored. |
| |
| [[event-info-table]] |
| .List of supported param_names by <<clGetEventInfo>> |
| [width="100%",cols="<33%,<17%,<50%",options="header"] |
| |==== |
| | Event Info | Return Type | Description |
| | {CL_EVENT_COMMAND_QUEUE_anchor} |
| |
| include::{generated}/api/version-notes/CL_EVENT_COMMAND_QUEUE.asciidoc[] |
| | {cl_command_queue_TYPE} |
| | Return the command-queue associated with _event_. |
| For user event objects, a `NULL` value is returned. |
| | {CL_EVENT_CONTEXT_anchor} |
| |
| include::{generated}/api/version-notes/CL_EVENT_CONTEXT.asciidoc[] |
| | {cl_context_TYPE} |
| | Return the context associated with _event_. |
| | {CL_EVENT_COMMAND_TYPE_anchor} |
| |
| include::{generated}/api/version-notes/CL_EVENT_COMMAND_TYPE.asciidoc[] |
| | {cl_command_type_TYPE} |
| | Return the command type associated with _event_ as described in the |
| <<event-command-type-table,Event Command Types>> table. |
| |
| | {CL_EVENT_COMMAND_EXECUTION_STATUS_anchor} footnote:[{fn-event-status-order}] |
| |
| include::{generated}/api/version-notes/CL_EVENT_COMMAND_EXECUTION_STATUS.asciidoc[] |
| | {cl_int_TYPE} |
| | Return the execution status of the command identified by _event_. |
| Valid values are: |
| |
| {CL_QUEUED} (command has been enqueued in the command-queue), |
| |
| {CL_SUBMITTED} (enqueued command has been submitted by the host to the |
| device associated with the command-queue), |
| |
| {CL_RUNNING} (device is currently executing this command), |
| |
| {CL_COMPLETE} (the command has completed), or |
| |
| An error code given by a negative integer value (the command was |
| abnormally terminated; this may be caused by a bad memory access, |
| etc.). |
| These error codes come from the same set of error codes that are |
| returned from the platform or runtime API calls as return values or |
| _errcode_ret_ values. |
| | {CL_EVENT_REFERENCE_COUNT_anchor} footnote:[{fn-reference-count-usage}] |
| |
| include::{generated}/api/version-notes/CL_EVENT_REFERENCE_COUNT.asciidoc[] |
| | {cl_uint_TYPE} |
| | Return the _event_ reference count. |
| |==== |
| |
| [[event-command-type-table]] |
| .List of supported event command types |
| [width="100%",cols="2,3",options="header"] |
| |==== |
| | *Events Created By* |
| | *Event Command Type* |
| |
| | {clEnqueueNDRangeKernel} |
| | {CL_COMMAND_NDRANGE_KERNEL_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_NDRANGE_KERNEL.asciidoc[] |
| |
| | {clEnqueueTask} |
| | {CL_COMMAND_TASK_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_TASK.asciidoc[] |
| |
| | {clEnqueueNativeKernel} |
| | {CL_COMMAND_NATIVE_KERNEL_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_NATIVE_KERNEL.asciidoc[] |
| |
| | {clEnqueueReadBuffer} |
| | {CL_COMMAND_READ_BUFFER_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_READ_BUFFER.asciidoc[] |
| |
| | {clEnqueueWriteBuffer} |
| | {CL_COMMAND_WRITE_BUFFER_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_WRITE_BUFFER.asciidoc[] |
| |
| | {clEnqueueCopyBuffer} |
| | {CL_COMMAND_COPY_BUFFER_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_COPY_BUFFER.asciidoc[] |
| |
| | {clEnqueueReadImage} |
| | {CL_COMMAND_READ_IMAGE_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_READ_IMAGE.asciidoc[] |
| |
| | {clEnqueueWriteImage} |
| | {CL_COMMAND_WRITE_IMAGE_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_WRITE_IMAGE.asciidoc[] |
| |
| | {clEnqueueCopyImage} |
| | {CL_COMMAND_COPY_IMAGE_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_COPY_IMAGE.asciidoc[] |
| |
| | {clEnqueueCopyBufferToImage} |
| | {CL_COMMAND_COPY_BUFFER_TO_IMAGE_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_COPY_BUFFER_TO_IMAGE.asciidoc[] |
| |
| | {clEnqueueCopyImageToBuffer} |
| | {CL_COMMAND_COPY_IMAGE_TO_BUFFER_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_COPY_IMAGE_TO_BUFFER.asciidoc[] |
| |
| | {clEnqueueMapBuffer} |
| | {CL_COMMAND_MAP_BUFFER_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_MAP_BUFFER.asciidoc[] |
| |
| | {clEnqueueMapImage} |
| | {CL_COMMAND_MAP_IMAGE_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_MAP_IMAGE.asciidoc[] |
| |
| | {clEnqueueUnmapMemObject} |
| | {CL_COMMAND_UNMAP_MEM_OBJECT_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_UNMAP_MEM_OBJECT.asciidoc[] |
| |
| | {clEnqueueMarker}, + |
| {clEnqueueMarkerWithWaitList} |
| | {CL_COMMAND_MARKER_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_MARKER.asciidoc[] |
| |
| | {clEnqueueReadBufferRect} |
| | {CL_COMMAND_READ_BUFFER_RECT_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_READ_BUFFER_RECT.asciidoc[] |
| |
| | {clEnqueueWriteBufferRect} |
| | {CL_COMMAND_WRITE_BUFFER_RECT_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_WRITE_BUFFER_RECT.asciidoc[] |
| |
| | {clEnqueueCopyBufferRect} |
| | {CL_COMMAND_COPY_BUFFER_RECT_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_COPY_BUFFER_RECT.asciidoc[] |
| |
| | {clCreateUserEvent} |
| | {CL_COMMAND_USER_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_USER.asciidoc[] |
| |
| | {clEnqueueBarrier}, + |
| {clEnqueueBarrierWithWaitList} |
| | {CL_COMMAND_BARRIER_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_BARRIER.asciidoc[] |
| |
| | {clEnqueueMigrateMemObjects} |
| | {CL_COMMAND_MIGRATE_MEM_OBJECTS_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_MIGRATE_MEM_OBJECTS.asciidoc[] |
| |
| | {clEnqueueFillBuffer} |
| | {CL_COMMAND_FILL_BUFFER_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_FILL_BUFFER.asciidoc[] |
| |
| | {clEnqueueFillImage} |
| | {CL_COMMAND_FILL_IMAGE_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_FILL_IMAGE.asciidoc[] |
| |
| | {clEnqueueSVMFree} |
| | {CL_COMMAND_SVM_FREE_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_SVM_FREE.asciidoc[] |
| |
| | {clEnqueueSVMMemcpy} |
| | {CL_COMMAND_SVM_MEMCPY_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_SVM_MEMCPY.asciidoc[] |
| |
| | {clEnqueueSVMMemFill} |
| | {CL_COMMAND_SVM_MEMFILL_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_SVM_MEMFILL.asciidoc[] |
| |
| | {clEnqueueSVMMap} |
| | {CL_COMMAND_SVM_MAP_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_SVM_MAP.asciidoc[] |
| |
| | {clEnqueueSVMUnmap} |
| | {CL_COMMAND_SVM_UNMAP_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_SVM_UNMAP.asciidoc[] |
| |
| | {clEnqueueSVMMigrateMem} |
| | {CL_COMMAND_SVM_MIGRATE_MEM_anchor} |
| |
| include::{generated}/api/version-notes/CL_COMMAND_SVM_MIGRATE_MEM.asciidoc[] |
| |
| Prior to OpenCL 3.0, implementations should return |
| {CL_COMMAND_MIGRATE_MEM_OBJECTS}, but may return an implementation-defined |
| event command type for {clEnqueueSVMMigrateMem}. |
| |
| |==== |
| |
| Using {clGetEventInfo} to determine if a command identified by _event_ has |
| finished execution (i.e. {CL_EVENT_COMMAND_EXECUTION_STATUS} returns |
| {CL_COMPLETE}) is not a synchronization point. |
| There are no guarantees that the memory objects being modified by the command |
| associated with _event_ will be visible to other enqueued commands. |
| |
| // refError |
| |
| {clGetEventInfo} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes |
| specified by _param_value_size_ is < size of return type as described in |
| the <<event-info-table,Event Object Queries>> table and |
| _param_value_ is not `NULL`. |
| * {CL_INVALID_VALUE} if information to query given in _param_name_ cannot be |
| queried for _event_. |
| * {CL_INVALID_EVENT} if _event_ is not a valid event object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
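
The size rule for _param_value_size_ can be written as a one-line predicate.
This is a minimal sketch; the function name is hypothetical, and `return_type_size` stands for the size of the return type listed for the queried _param_name_.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when the (param_value_size, param_value) pair satisfies the
 * size rule for an event query. A NULL param_value is ignored by the query,
 * so no size requirement applies in that case. */
bool event_query_size_ok(size_t param_value_size,
                         const void *param_value,
                         size_t return_type_size)
{
    if (param_value == NULL)
        return true;
    return param_value_size >= return_type_size;
}
```

For example, a {CL_EVENT_REFERENCE_COUNT} query would pass a buffer of at least `sizeof(cl_uint)` bytes.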
| |
| [open,refpage='clSetEventCallback',desc='Registers a user callback function for a specific command execution status.',type='protos'] |
| -- |
| To register a user callback function for a specific command execution |
| status, call the function |
| |
| include::{generated}/api/protos/clSetEventCallback.txt[] |
| include::{generated}/api/version-notes/clSetEventCallback.asciidoc[] |
| |
| * _event_ is a valid event object. |
| * _command_exec_callback_type_ specifies the command execution status for |
| which the callback is registered. |
| The command execution callback values for which a callback can be registered |
| are: {CL_SUBMITTED}, {CL_RUNNING}, or |
| {CL_COMPLETE} footnote:[{fn-event-callback-complete}]. |
| There is no guarantee that the callback functions registered for various |
| execution status values for an event will be called in the exact order that |
| the execution status of a command changes. |
| Furthermore, receiving a callback for an event with a status other than |
| {CL_COMPLETE} in no way implies that the memory model or |
| execution model as defined by the OpenCL specification has changed. |
| For example, it is not valid to assume that a corresponding memory transfer |
| has completed unless the event is in the {CL_COMPLETE} state. |
| * _pfn_event_notify_ is the event callback function that can be registered by |
| the application. |
| This callback function may be called asynchronously by the OpenCL |
| implementation. |
| It is the application's responsibility to ensure that the callback function |
| is thread-safe. |
| The parameters to this callback function are: |
| ** _event_ is the event object for which the callback function is invoked. |
| ** _event_command_status_ is equal to the _command_exec_callback_type_ |
| used while registering the callback. |
| Refer to the <<event-info-table,Event Object Queries>> |
| table for the command execution status values. |
| If the callback is called as the result of the command associated with |
| event being abnormally terminated, an appropriate error code for the |
| error that caused the termination will be passed to |
| _event_command_status_ instead. |
| ** _user_data_ is a pointer to user supplied data. |
| * _user_data_ will be passed as the _user_data_ argument when _pfn_notify_ is |
| called. |
| _user_data_ can be `NULL`. |
| |
| The registered callback function will be called when the execution status of |
| the command associated with _event_ changes to an execution status equal to or |
| past the status specified by _command_exec_callback_type_. |
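
Because the execution status values decrease numerically as a command progresses ({CL_QUEUED} is 3, {CL_SUBMITTED} is 2, {CL_RUNNING} is 1, {CL_COMPLETE} is 0, and errors are negative), "equal to or past" can be modeled as a numeric comparison.
The sketch below is illustrative only: the function names and `CB_*` constants are hypothetical, the values are copied from `CL/cl.h`, and treating an abnormal termination as "past" every registered status is this sketch's reading of the parameter description above.

```c
#include <stdbool.h>

/* Values mirrored from CL/cl.h; the CB_* names are hypothetical. */
enum {
    CB_COMPLETE  = 0x0,  /* CL_COMPLETE */
    CB_RUNNING   = 0x1,  /* CL_RUNNING */
    CB_SUBMITTED = 0x2,  /* CL_SUBMITTED */
    CB_QUEUED    = 0x3   /* CL_QUEUED */
};

/* True when a callback registered for registered_type is due, given the
 * command's current execution status. */
bool callback_is_due(int registered_type, int current_status)
{
    return current_status <= registered_type;
}

/* Value passed as event_command_status: the registered type, or the error
 * code if the command was abnormally terminated. */
int callback_status_argument(int registered_type, int current_status)
{
    return current_status < 0 ? current_status : registered_type;
}
```

Under this model, a callback registered for the running status that is first evaluated after the command has already completed is still due, and it receives the running status as its argument rather than the completed status.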
| |
| Each call to {clSetEventCallback} registers the specified user callback |
| function on a callback stack associated with _event_. |
| The order in which the registered user callback functions are called is |
| undefined. |
| |
| All callbacks registered for an event object must be called before the event |
| object is destroyed. |
| Callbacks should return promptly. |
| |
| Behavior is undefined when calling expensive system routines, OpenCL APIs to |
| create contexts or command-queues, or blocking OpenCL APIs in an event callback. |
| Rather than calling a blocking OpenCL API in an event callback, applications |
| may call a non-blocking OpenCL API, then register a completion callback |
| for the non-blocking OpenCL API with the remainder of the work. |
| |
| Because commands in a command-queue are not required to begin execution |
| until the command-queue is flushed, callbacks that enqueue commands on a |
| command-queue should either call {clFlush} on the queue before returning, |
| or arrange for the command-queue to be flushed later. |
| |
| // refError |
| |
| {clSetEventCallback} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_EVENT} if _event_ is not a valid event object. |
| * {CL_INVALID_VALUE} if _pfn_event_notify_ is `NULL` or if |
| _command_exec_callback_type_ is not {CL_SUBMITTED}, {CL_RUNNING}, or |
| {CL_COMPLETE}. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| |
| [open,refpage='clRetainEvent',desc='Increments the event reference count.',type='protos'] |
| -- |
| To retain an event object, call the function |
| |
| include::{generated}/api/protos/clRetainEvent.txt[] |
| include::{generated}/api/version-notes/clRetainEvent.asciidoc[] |
| |
| * _event_ is the event object to be retained. |
| |
| The _event_ reference count is incremented. |
| The OpenCL commands that return an event perform an implicit retain. |
| |
| // refError |
| |
| {clRetainEvent} returns {CL_SUCCESS} if the function is executed successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_EVENT} if _event_ is not a valid event object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clReleaseEvent',desc='Decrements the event reference count.',type='protos'] |
| -- |
| To release an event object, call the function |
| |
| include::{generated}/api/protos/clReleaseEvent.txt[] |
| include::{generated}/api/version-notes/clReleaseEvent.asciidoc[] |
| |
| * _event_ is the event object to be released. |
| |
| The _event_ reference count is decremented. |
| |
| The event object is deleted once the reference count becomes zero, the |
| specific command identified by this event has completed (or terminated) and |
| there are no commands in the command-queues of a context that require a wait |
| for this event to complete. |
| Using this function to release a reference that was not obtained by creating |
| the object or by calling {clRetainEvent} causes undefined behavior. |
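
The three deletion conditions can be summarized as a single predicate.
This is a conceptual sketch only; the function and its parameters are hypothetical, and real implementations track this state internally rather than exposing it.

```c
#include <stdbool.h>

/* Returns true when all three deletion conditions hold:
 *  - the reference count has reached zero,
 *  - the command has completed (status 0, CL_COMPLETE) or terminated
 *    (negative status), and
 *  - no enqueued command still needs to wait on the event. */
bool event_model_can_be_deleted(unsigned reference_count,
                                int execution_status,
                                unsigned pending_waiters)
{
    bool finished = (execution_status == 0) || (execution_status < 0);
    return reference_count == 0 && finished && pending_waiters == 0;
}
```

The model makes explicit that releasing the last reference does not by itself destroy an event whose command is still pending or that other commands are waiting on.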
| |
| [NOTE] |
| ==== |
| Developers should be careful when releasing their last reference count on |
| events created by {clCreateUserEvent} that have not yet been set to status |
| of {CL_COMPLETE} or an error. |
| If the user event was used in the _event_wait_list_ argument passed to a |
| *+clEnqueue*+* API or another application host thread is waiting for it in |
| {clWaitForEvents}, those commands and host threads will continue to wait for |
| the event status to reach {CL_COMPLETE} or error, even after the application |
| has released the object. |
| Since in this scenario the application has released its last reference count |
| to the user event, it would be in principle no longer valid for the |
| application to change the status of the event to unblock all the other |
| machinery. |
| As a result the waiting tasks will wait forever, and associated events, |
| {cl_mem_TYPE} objects, command queues and contexts are likely to leak. |
| In-order command queues caught up in this deadlock may cease to do any work. |
| ==== |
| |
| // refError |
| |
| {clReleaseEvent} returns {CL_SUCCESS} if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_EVENT} if _event_ is not a valid event object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| |
| [[markers-barriers-waiting-for-events]] |
| == Markers, Barriers and Waiting for Events |
| |
| [open,refpage='clEnqueueMarkerWithWaitList',desc='Enqueues a marker command which waits for either a list of events to complete, or all previously enqueued commands to complete.',type='protos'] |
| -- |
| To enqueue a marker command which waits for events or commands to complete, |
| call the function |
| |
| include::{generated}/api/protos/clEnqueueMarkerWithWaitList.txt[] |
| include::{generated}/api/version-notes/clEnqueueMarkerWithWaitList.asciidoc[] |
| |
| * _command_queue_ is a valid host command-queue. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| * _event_ returns an event object that identifies this command and |
| can be used to query or wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| |
| If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
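
The pairing rule between _event_wait_list_ and _num_events_in_wait_list_ can be checked on the host before enqueuing.
The helper below is a hypothetical sketch: it tests only the `NULL`/count consistency described above, not whether the events themselves are valid or belong to the right context.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when the wait-list arguments are mutually consistent:
 * a NULL list requires a count of 0, a non-NULL list requires a count > 0. */
bool wait_list_args_consistent(size_t num_events_in_wait_list,
                               const void *event_wait_list)
{
    if (event_wait_list == NULL)
        return num_events_in_wait_list == 0;
    return num_events_in_wait_list > 0;
}
```

A combination rejected by this check corresponds to the cases for which the enqueue functions report {CL_INVALID_EVENT_WAIT_LIST}.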
| |
| If _event_wait_list_ is `NULL`, then this particular command waits until all |
| previous enqueued commands to _command_queue_ have completed. |
| |
| The marker command either waits for a list of events to complete, or if the |
| list is empty it waits for all commands previously enqueued in |
| _command_queue_ to complete before it completes. |
| This command returns an _event_ which can be waited on, i.e. this event can |
| be waited on to ensure that all events either in the _event_wait_list_ or |
| all previously enqueued commands, queued before this command to |
| _command_queue_, have completed. |
| |
| // refError |
| |
| {clEnqueueMarkerWithWaitList} returns {CL_SUCCESS} if the function is |
| successfully executed. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if context associated with _command_queue_ and events |
| in _event_wait_list_ are not the same. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clEnqueueMarker',desc='Enqueues a marker command which waits for all previously enqueued commands to complete.',type='protos'] |
| -- |
| To enqueue a marker command which waits for previous commands to complete, call |
| the function |
| |
| include::{generated}/api/protos/clEnqueueMarker.txt[] |
| include::{generated}/api/version-notes/clEnqueueMarker.asciidoc[] |
| |
| * _command_queue_ is a valid host command-queue. |
| * _event_ returns an event object that identifies this command and |
| can be used to query or wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| |
| The marker command waits for all commands previously enqueued in |
| _command_queue_ to complete before it completes. |
| This command returns an _event_ which can be waited on, i.e. this event can be |
| waited on to ensure that all previously enqueued commands, queued before this |
| command to _command_queue_, have completed. |
| |
| // refError |
| |
| {clEnqueueMarker} returns {CL_SUCCESS} if the function is successfully |
| executed. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_VALUE} if _event_ is `NULL`. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clEnqueueWaitForEvents',desc='Enqueues a wait on a list of events to complete.',type='protos'] |
| -- |
| To enqueue a wait for a specific event or a list of events to complete before |
| any future commands queued in a command-queue are executed, call the function |
| |
| include::{generated}/api/protos/clEnqueueWaitForEvents.txt[] |
| include::{generated}/api/version-notes/clEnqueueWaitForEvents.asciidoc[] |
| |
| * _command_queue_ is a valid host command-queue. |
| * _event_list_ and _num_events_ specify events that need to complete before |
| this particular command can be executed. |
| |
| // Note, this parameter is called event_list (like clWaitForEvents) rather than |
| // event_wait_list(like clEnqueueMarkerWithWaitList etc.) because the function |
| // predates wait lists (and CL_INVALID_EVENT_WAIT_LIST). |
| |
| The events specified in _event_list_ act as synchronization points. |
| The context associated with events in _event_list_ and _command_queue_ must be |
| the same. |
| The memory associated with _event_list_ can be reused or freed after the |
| function returns. |
| |
| // refError |
| |
| {clEnqueueWaitForEvents} returns {CL_SUCCESS} if the function is successfully |
| executed. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and the |
| context associated with events in _event_list_ are not the same. |
| * {CL_INVALID_VALUE} if _num_events_ is 0 or _event_list_ is `NULL`. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clEnqueueBarrierWithWaitList',desc='A synchronization point that enqueues a barrier operation.',type='protos'] |
| -- |
| To enqueue a barrier command which waits for events or commands to complete, |
| call the function |
| |
| include::{generated}/api/protos/clEnqueueBarrierWithWaitList.txt[] |
| include::{generated}/api/version-notes/clEnqueueBarrierWithWaitList.asciidoc[] |
| |
| * _command_queue_ is a valid host command-queue. |
| * _event_wait_list_ and _num_events_in_wait_list_ specify events that need to |
| complete before this particular command can be executed. |
| * If _event_wait_list_ is `NULL`, _num_events_in_wait_list_ must be 0. |
| If _event_wait_list_ is not `NULL`, the list of events pointed to by |
| _event_wait_list_ must be valid and _num_events_in_wait_list_ must be |
| greater than 0. |
| The events specified in _event_wait_list_ act as synchronization points. |
| The context associated with events in _event_wait_list_ and _command_queue_ |
| must be the same. |
| The memory associated with _event_wait_list_ can be reused or freed after |
| the function returns. |
| * _event_ returns an event object that identifies this command and |
| can be used to query or wait for this command to complete. |
| If _event_ is `NULL` or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If _event_wait_list_ and _event_ are not `NULL`, _event_ must not refer |
| to an element of the _event_wait_list_ array. |
| |
| If _event_wait_list_ is `NULL`, then this particular command waits until all |
| commands previously enqueued to _command_queue_ have completed. |
| |
| The barrier command either waits for a list of events to complete, or if the |
| list is empty it waits for all commands previously enqueued in |
| _command_queue_ to complete before it completes. |
| This command blocks command execution, that is, any following commands |
| enqueued after it do not execute until it completes. |
| This command returns an _event_ which can be waited on to ensure that either |
| all events in the _event_wait_list_, or all commands enqueued to |
| _command_queue_ before this command, have completed. |
| |
| // refError |
| |
| {clEnqueueBarrierWithWaitList} returns {CL_SUCCESS} if the function is |
| successfully executed. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_INVALID_CONTEXT} if the context associated with _command_queue_ and the |
| context associated with events in _event_wait_list_ are not the same. |
| * {CL_INVALID_EVENT_WAIT_LIST} if _event_wait_list_ is `NULL` and |
| _num_events_in_wait_list_ > 0, or _event_wait_list_ is not `NULL` and |
| _num_events_in_wait_list_ is 0, or if event objects in _event_wait_list_ |
| are not valid events. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
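| |
| The consistency rule for _event_wait_list_ and _num_events_in_wait_list_ can |
| be checked by the application before the call is made. |
| The following is a minimal sketch of such a check and is not part of the |
| OpenCL API: the function name `wait_list_args_consistent` is hypothetical, and |
| `const void *` and `uint32_t` stand in for `const cl_event *` and `cl_uint` so |
| that the sketch does not depend on the OpenCL headers. |
| It does not check whether the event objects in the list are valid. |
| |
```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application-side helper, not an OpenCL function.
 * Returns 1 when the (list, count) pair is consistent:
 *   - a NULL list must be paired with a count of 0;
 *   - a non-NULL list must be paired with a count greater than 0.
 * Returns 0 otherwise. The validity of the events is not examined. */
int wait_list_args_consistent(const void *event_wait_list,
                              uint32_t num_events_in_wait_list)
{
    if (event_wait_list == NULL) {
        return num_events_in_wait_list == 0;
    }
    return num_events_in_wait_list > 0;
}
```
| |
| A pair for which this helper returns 0 corresponds to the first two |
| conditions listed for {CL_INVALID_EVENT_WAIT_LIST} above. |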
| -- |
| |
| [open,refpage='clEnqueueBarrier',desc='A synchronization point that enqueues a barrier operation.',type='protos'] |
| -- |
| To enqueue a barrier command which waits for commands to complete, call the |
| function |
| |
| include::{generated}/api/protos/clEnqueueBarrier.txt[] |
| include::{generated}/api/version-notes/clEnqueueBarrier.asciidoc[] |
| |
| * _command_queue_ is a valid host command-queue. |
| |
| The barrier command waits for all commands previously enqueued in |
| _command_queue_ to complete before it completes. |
| This command blocks command execution, that is, any following commands |
| enqueued after it do not execute until it completes. |
| // TODO clEnqueueBarrierWithWaitList doesn't say synchronization point, should |
| // it, or should the next line be removed? The main difference is that |
| // clEnqueueBarrierWithWaitList returns an event, which is the synchronization |
| // point. |
| The barrier command is a synchronization point. |
| |
| // refError |
| |
| {clEnqueueBarrier} returns {CL_SUCCESS} if the function is successfully |
| executed. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| == Out-of-order Execution of Kernels and Memory Object Commands |
| |
| The OpenCL functions that are submitted to a command-queue are enqueued in |
| the order the calls are made but can be configured to execute in-order or |
| out-of-order. |
| The _properties_ argument in {clCreateCommandQueueWithProperties} or |
| {clCreateCommandQueue} can be used to specify the execution order. |
| |
| If the {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} property of a command-queue is |
| not set, the commands enqueued to a command-queue execute in order. |
| For example, if an application calls {clEnqueueNDRangeKernel} to execute |
| kernel A followed by a {clEnqueueNDRangeKernel} to execute kernel B, the |
| application can assume that kernel A finishes first and then kernel B is |
| executed. |
| If the memory objects output by kernel A are inputs to kernel B then kernel |
| B will see the correct data in memory objects produced by execution of |
| kernel A. |
| If the {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} property of a command-queue is |
| set, then there is no guarantee that kernel A will finish before kernel B |
| starts execution. |
| |
| Applications can configure the commands enqueued to a command-queue to |
| execute out-of-order by setting the {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} |
| property of the command-queue. |
| This can be specified when the command-queue is created. |
| In out-of-order execution mode there is no guarantee that the enqueued |
| commands will finish execution in the order they were queued. |
| Because kernels are not guaranteed to execute in the order in which the |
| {clEnqueueNDRangeKernel} or {clEnqueueTask} calls are made within a |
| command-queue, an earlier {clEnqueueNDRangeKernel} call to execute kernel A, |
| identified by event A, may execute and/or finish later than a |
| {clEnqueueNDRangeKernel} call to execute kernel B that the application made |
| at a later point in time. |
| To guarantee a specific order of execution of kernels, a wait on a |
| particular event (in this case event A) can be used. |
| The wait for event A can be specified in the _event_wait_list_ argument to |
| {clEnqueueNDRangeKernel} for kernel B. |
| |
| In addition, a marker ({clEnqueueMarker} or {clEnqueueMarkerWithWaitList}) or a |
| barrier ({clEnqueueBarrier} or {clEnqueueBarrierWithWaitList}) command can be |
| enqueued to the command-queue. |
| The marker command ensures that previously enqueued commands identified by |
| the list of events to wait for (or all previous commands) have finished. |
| A barrier command is similar to a marker command, but additionally |
| guarantees that no later-enqueued commands will execute until the waited-for |
| commands have executed. |
| |
| Similarly, commands to read, write, copy or map memory objects that are |
| enqueued after {clEnqueueNDRangeKernel}, {clEnqueueTask} or |
| {clEnqueueNativeKernel} commands are not guaranteed to wait for kernels |
| scheduled for execution to have completed (if the |
| {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} property is set). |
| To ensure correct ordering of commands, the event object returned by |
| {clEnqueueNDRangeKernel}, {clEnqueueTask} or {clEnqueueNativeKernel} can be |
| used to enqueue a wait for that event, or a barrier command can be enqueued |
| that must complete before reads or writes to the memory object(s) occur. |
| |
| |
| [[profiling-operations]] |
| == Profiling Operations on Memory Objects and Kernels |
| |
| This section describes the profiling of OpenCL functions that are enqueued |
| as commands to a command-queue. |
| Profiling of OpenCL commands can be enabled by using a command-queue created |
| with the {CL_QUEUE_PROFILING_ENABLE} |
| flag set in the {CL_QUEUE_PROPERTIES} bitfield in the _properties_ argument to |
| {clCreateCommandQueueWithProperties}, or in the _properties_ argument to |
| {clCreateCommandQueue}. |
| When profiling is enabled, the event objects that are created from |
| enqueuing a command store a timestamp for each of their state transitions. |
| |
| [open,refpage='clGetEventProfilingInfo',desc='Returns profiling information for the command associated with event if profiling is enabled.',type='protos'] |
| -- |
| To return profiling information for a command associated with an event when |
| profiling is enabled, call the function |
| |
| include::{generated}/api/protos/clGetEventProfilingInfo.txt[] |
| include::{generated}/api/version-notes/clGetEventProfilingInfo.asciidoc[] |
| |
| * _event_ specifies the event object. |
| * _param_name_ specifies the profiling data to query. |
| The list of supported _param_name_ types and the information returned in |
| _param_value_ by {clGetEventProfilingInfo} is described in the |
| <<event-profiling-info-table,Event Profiling Queries>> table. |
| * _param_value_ is a pointer to memory where the appropriate result being |
| queried is returned. |
| If _param_value_ is `NULL`, it is ignored. |
| * _param_value_size_ is used to specify the size in bytes of memory pointed to |
| by _param_value_. |
| This size must be {geq} size of return type as described in the |
| <<event-profiling-info-table,Event Profiling Queries>> table. |
| * _param_value_size_ret_ returns the actual size in bytes of data being |
| queried by _param_name_. |
| If _param_value_size_ret_ is `NULL`, it is ignored. |
| |
| [[event-profiling-info-table]] |
| .List of supported param_names by <<clGetEventProfilingInfo>> |
| [width="100%",cols="<33%,<17%,<50%",options="header"] |
| |==== |
| | Event Profiling Info | Return Type | Description |
| | {CL_PROFILING_COMMAND_QUEUED_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROFILING_COMMAND_QUEUED.asciidoc[] |
| | {cl_ulong_TYPE} |
| | A 64-bit value that describes the current device time counter in |
| nanoseconds when the command identified by event is enqueued in a |
| command-queue by the host. |
| | {CL_PROFILING_COMMAND_SUBMIT_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROFILING_COMMAND_SUBMIT.asciidoc[] |
| | {cl_ulong_TYPE} |
| | A 64-bit value that describes the current device time counter in |
| nanoseconds when the command identified by event that has been |
| enqueued is submitted by the host to the device associated with the |
| command-queue. |
| | {CL_PROFILING_COMMAND_START_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROFILING_COMMAND_START.asciidoc[] |
| | {cl_ulong_TYPE} |
| | A 64-bit value that describes the current device time counter in |
| nanoseconds when the command identified by event starts execution on |
| the device. |
| | {CL_PROFILING_COMMAND_END_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROFILING_COMMAND_END.asciidoc[] |
| | {cl_ulong_TYPE} |
| | A 64-bit value that describes the current device time counter in |
| nanoseconds when the command identified by event has finished |
| execution on the device. |
| | {CL_PROFILING_COMMAND_COMPLETE_anchor} |
| |
| include::{generated}/api/version-notes/CL_PROFILING_COMMAND_COMPLETE.asciidoc[] |
| | {cl_ulong_TYPE} |
| | A 64-bit value that describes the current device time counter in |
| nanoseconds when the command identified by event and any child |
| commands enqueued by this command on the device have finished |
| execution. |
| |==== |
| |
| The unsigned 64-bit values returned can be used to measure the time in |
| nanoseconds consumed by OpenCL commands. |
| |
| OpenCL devices are required to correctly track time across changes in device |
| frequency and power states. |
| The {CL_DEVICE_PROFILING_TIMER_RESOLUTION} specifies the resolution of the |
| timer, i.e. the number of nanoseconds elapsed before the timer is |
| incremented. |
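| |
| Durations are obtained by subtracting one timestamp from another. |
| The following is a minimal sketch, assuming the four values have already been |
| queried with {clGetEventProfilingInfo} for the same event. |
| The names `command_timestamps`, `execution_time_ns` and `launch_latency_ns` |
| are hypothetical, and `uint64_t` stands in for `cl_ulong` so that the sketch |
| does not depend on the OpenCL headers. |
| The assertions encode the assumption that later state transitions carry |
| timestamps that are not smaller than earlier ones. |
| |
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for timestamps queried with
 * CL_PROFILING_COMMAND_QUEUED, CL_PROFILING_COMMAND_SUBMIT,
 * CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END.
 * All values are device time counter readings in nanoseconds. */
typedef struct {
    uint64_t queued_ns;
    uint64_t submit_ns;
    uint64_t start_ns;
    uint64_t end_ns;
} command_timestamps;

/* Nanoseconds between the start and the end of execution on the device. */
uint64_t execution_time_ns(const command_timestamps *t)
{
    assert(t->end_ns >= t->start_ns); /* assumption: END is not before START */
    return t->end_ns - t->start_ns;
}

/* Nanoseconds between enqueue by the host and start of execution. */
uint64_t launch_latency_ns(const command_timestamps *t)
{
    assert(t->start_ns >= t->queued_ns); /* assumption: START is not before QUEUED */
    return t->start_ns - t->queued_ns;
}
```
| |
| A measured duration cannot be more precise than the timer resolution |
| reported by {CL_DEVICE_PROFILING_TIMER_RESOLUTION}. |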
| |
| // refError |
| |
| {clGetEventProfilingInfo} returns {CL_SUCCESS} if the function is executed |
| successfully and the profiling information has been recorded. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_PROFILING_INFO_NOT_AVAILABLE} if the {CL_QUEUE_PROFILING_ENABLE} flag is |
| not set for the command-queue, if the execution status of the command |
| identified by _event_ is not {CL_COMPLETE} or if _event_ is a user event |
| object. |
| * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes |
| specified by _param_value_size_ is < size of return type as described in |
| the <<event-profiling-info-table,Event Profiling Queries>> table and |
| _param_value_ is not `NULL`. |
| * {CL_INVALID_EVENT} if _event_ is not a valid event object. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| |
| == Flush and Finish |
| |
| [open,refpage='clFlush',desc='Issues all previously queued OpenCL commands in a command-queue to the device associated with the command-queue.',type='protos'] |
| -- |
| To flush commands to a device, call the function |
| |
| include::{generated}/api/protos/clFlush.txt[] |
| include::{generated}/api/version-notes/clFlush.asciidoc[] |
| |
| * _command_queue_ is the command-queue to flush. |
| |
| All previously queued OpenCL commands in _command_queue_ are issued to the |
| device associated with _command_queue_. |
| {clFlush} only guarantees that all queued commands to _command_queue_ will |
| eventually be submitted to the appropriate device. |
| There is no guarantee that they will be complete after {clFlush} returns. |
| |
| Any blocking commands queued in a command-queue and {clReleaseCommandQueue} |
| perform an implicit flush of the command-queue. |
| These blocking commands are {clEnqueueReadBuffer}, |
| {clEnqueueReadBufferRect}, {clEnqueueReadImage} with _blocking_read_ set to |
| {CL_TRUE}; {clEnqueueWriteBuffer}, {clEnqueueWriteBufferRect}, |
| {clEnqueueWriteImage} with _blocking_write_ set to {CL_TRUE}; |
| {clEnqueueMapBuffer}, {clEnqueueMapImage} with _blocking_map_ set to |
| {CL_TRUE}; {clEnqueueSVMMemcpy} with _blocking_copy_ set to {CL_TRUE}; |
| {clEnqueueSVMMap} with _blocking_map_ set to {CL_TRUE}; and {clWaitForEvents}. |
| |
| To use event objects that refer to commands enqueued in one command-queue as |
| event objects to wait on by commands enqueued in a different command-queue, |
| the application must call {clFlush}, or any blocking command that performs an |
| implicit flush, on the command-queue in which the commands associated with |
| these event objects are enqueued. |
| |
| // refError |
| |
| {clFlush} returns {CL_SUCCESS} if the function call was executed successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |
| |
| [open,refpage='clFinish',desc='Blocks until all previously queued OpenCL commands in a command-queue are issued to the associated device and have completed.',type='protos'] |
| -- |
| To wait for completion of commands on a device, call the function |
| |
| include::{generated}/api/protos/clFinish.txt[] |
| include::{generated}/api/version-notes/clFinish.asciidoc[] |
| |
| * _command_queue_ is the command-queue to wait for. |
| |
| All previously queued OpenCL commands in _command_queue_ are issued to the |
| associated device, and the function blocks until all previously queued |
| commands have completed. |
| {clFinish} does not return until all previously queued commands in |
| _command_queue_ have been processed and completed. |
| {clFinish} is also a synchronization point. |
| |
| // refError |
| |
| {clFinish} returns {CL_SUCCESS} if the function call was executed |
| successfully. |
| Otherwise, it returns one of the following errors: |
| |
| * {CL_INVALID_COMMAND_QUEUE} if _command_queue_ is not a valid host |
| command-queue. |
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required |
| by the OpenCL implementation on the device. |
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources |
| required by the OpenCL implementation on the host. |
| -- |