Skip to content

Commit 2d424c5

Browse files
author
Pekka Jääskeläinen
committed
cl_ext_buffer_device_address updates
Changed the CL_MEM_DEVICE_ADDRESS_EXT wording for multi-device cases "all", not "any", covering a case where not all devices can ensure the same address across the context. In that case CL_INVALID_VALUE can be returned. Defined sub-buffer address computation to be 'base_addr + origin'. Added error conditions for clSetKernelExecInfo when the device doesn't support device pointers.
1 parent 5026adb commit 2d424c5

1 file changed

Lines changed: 32 additions & 6 deletions

File tree

extensions/cl_ext_buffer_device_address.asciidoc

Lines changed: 32 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -46,7 +46,7 @@ Draft.
4646
== Version
4747

4848
Built On: {docdate} +
49-
Revision: 0.1.0
49+
Revision: 0.2.0
5050

5151
== Dependencies
5252

@@ -147,7 +147,7 @@ Add new allocation flags <<clCreateBuffer, List of supported memory flag values
147147
include::{generated}/api/version-notes/CL_MEM_DEVICE_ADDRESS_EXT.asciidoc[]
148148
| This flag specifies that the buffer must have a single fixed address
149149
for its lifetime and the address should be unique at least across the devices
150-
of the context, but not necessarily withing the host (virtual) memory.
150+
of the context, but not necessarily within the host (virtual) memory.
151151

152152
The flag might imply that the buffer will be "pinned" permanently to
153153
a device's memory, but might not be necessarily so, as long as the address
@@ -158,18 +158,28 @@ include::{generated}/api/version-notes/CL_MEM_DEVICE_ADDRESS_EXT.asciidoc[]
158158

159159
The device-specific buffer content updates are still performed by
160160
implicit or explicit buffer migrations performed by the runtime or the
161-
client code. If any of the devices in the context does not support
161+
client code. If all of the devices in the context do not support
162162
this type of allocations, an error (CL_INVALID_VALUE) is returned.
163+
164+
The device addresses of sub-buffers derived from CL_MEM_DEVICE_ADDRESS_EXT
165+
allocated buffers can be computed by adding the sub-buffer origin to the
166+
start address.
167+
163168
| {CL_MEM_DEVICE_PRIVATE_EXT_anchor}
164169

165170
include::{generated}/api/version-notes/CL_MEM_DEVICE_PRIVATE_EXT.asciidoc[]
166171
| If this flag is combined with CL_MEM_DEVICE_ADDRESS_EXT, each device in
167-
the context can have their own (fixed) device-side address and copy of
172+
the context can have their own (fixed) device-side address and a copy of
168173
the created buffer which are synchronized implicitly by the runtime.
169174
The main difference to a default cl_mem allocation in that case is then
170175
that the addresses are queriable with CL_MEM_DEVICE_PTRS_EXT and the
171176
per-device address is guaranteed to be the same for the entire lifetime
172177
of the cl_mem.
178+
179+
The device addresses of sub-buffers derived from CL_MEM_DEVICE_PRIVATE_EXT
180+
allocated buffers can be computed by adding the sub-buffer origin to the
181+
device-specific start address.
182+
173183
|====
174184

175185
// refError
@@ -218,7 +228,7 @@ include::{generated}/api/version-notes/clSetKernelArgDevicePointerEXT.asciidoc[]
218228
argument specified by _arg_index_.
219229
The device pointer specified is the value used by all API calls that enqueue
220230
_kernel_ ({clEnqueueNDRangeKernel} and {clEnqueueTask}) until the argument
221-
value is changed by a call to {clSetKernelArgSVMPointer} for _kernel_.
231+
value is changed by a call to {clSetKernelArgDevicePointer} for _kernel_.
222232
The device pointer can only be used for arguments that are declared to be a
223233
pointer to `global` memory allocated with clCreateBuffer() with the
224234
CL_MEM_DEVICE_ADDRESS_EXT flag. The pointer value specified as the argument value
@@ -260,6 +270,14 @@ include::{generated}/api/version-notes/CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT.ascii
260270

261271
// refError
262272

273+
Change the descriptions for when returning CL_INVALID_OPERATION from {clSetKernelExecInfo}
274+
as follows:
275+
276+
* {CL_INVALID_OPERATION} if passing {CL_KERNEL_EXEC_INFO_SVM_PTRS} or
277+
{CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} with _param_value_ set to CL_TRUE
278+
and no device in the context associated with _kernel_ support SVM.
279+
* {CL_INVALID_OPERATION} if passing {CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT} and no
280+
device in the context associated with _kernel_ support device pointers.
263281

264282
== Interactions with Other Extensions
265283

@@ -280,7 +298,15 @@ None.
280298
[grid="rows"]
281299
[options="header"]
282300
|====
283-
| *Version* | *Date* | *Author* | *Changes*
301+
| *Version* | *Date* | *Author* | *Changes*
302+
| 0.2.0 | 2024-09-09 | Pekka Jääskeläinen, Karol Herbst |
303+
Changed the CL_MEM_DEVICE_ADDRESS_EXT wording for multi-device
304+
cases "all", not "any", covering a case where not all devices
305+
can ensure the same address across the context. In that case
306+
CL_INVALID_VALUE can be returned. Defined sub-buffer address
307+
computation to be 'base_addr + origin'. Added error conditions
308+
for clSetKernelExecInfo when the device doesn't support
309+
device pointers.
284310
| 0.1.0 | 2024-05-07 | Pekka Jääskeläinen | First draft text for feedback.
285311
This version describes the first API version that was prototyped
286312
in PoCL and RustiCL using temporary placeholder flag/enum values.

0 commit comments

Comments
 (0)