Commit 161fc79

Ewan Crawford authored and aharon-abramson committed
Refactor command-buffer queue compatibility (#1292)

* Refactor command-buffer queue compatibility

  As proposed in #1142, this PR changes the semantics of the command-queue
  parameters used for command-buffer creation and enqueue. The queues used on
  command-buffer creation now only inform the device and dependencies of
  commands, rather than restricting the properties set on the queues used for
  command-buffer enqueue.

  This is based on top of the change in #850 to add supported queue property
  semantics.

* Address review feedback

  Clarify wording around default list of command-queues used for
  command-buffer enqueue.

* Update XML version
1 parent 1224a1b commit 161fc79

5 files changed

Lines changed: 78 additions & 60 deletions

api/cl_khr_command_buffer.asciidoc

Lines changed: 21 additions & 17 deletions
@@ -12,7 +12,7 @@ include::{generated}/meta/{refprefix}cl_khr_command_buffer.txt[]
 === Other Extension Metadata

 *Last Modified Date*::
-2024-10-02
+2024-12-13
 *IP Status*::
 No known IP claims.
 *Contributors*::
@@ -43,11 +43,6 @@ Command-buffers enable a reduction in overhead when enqueuing the same
 workload multiple times. By separating the command-queue setup from dispatch,
 the ability to replay a set of previously created commands is introduced.

-The command-queues a command-buffer will be executed on can be set on replay via
-parameters to {clEnqueueCommandBufferKHR}, provided they are
-<<compatible, compatible>> with the command-queues used on command-buffer
-recording.
-
 ==== Background

 On embedded devices where building a command stream accounts for a significant
@@ -74,7 +69,7 @@ or writes memory objects; or enqueues a native kernel, is not available for
 command-buffer recording. Finally commands recorded into a command buffer do
 not wait for or return event objects, these are instead replaced with
 device-side synchronization-point identifiers which enable out-of-order
-execution when enqueued on <<compatible, compatible>> command-queues.
+execution of the command-buffer commands.

 Adding new entry-points for individual commands, rather than recording existing
 command-queue APIs with begin/end markers was a design decision made for the
@@ -102,16 +97,22 @@ following reasons:

 ==== Command Synchronization

-Device-side {cl_sync_point_khr_TYPE} synchronization-points can be used within
-command-buffers to define command dependencies. This allows the commands of a
-command-buffer to execute out-of-order on a single <<compatible, compatible>>
-command-queue. The command-buffer itself has no inherent in-order/out-of-order
-property, this ordering is inferred from the command-queue used on command
-recording. {clEnqueueCommandBufferKHR} submissions to an out-of-order queue
-have the same execution semantics are other operations enqueued to an
-out-of-order queue, such as {clEnqueueFillBuffer}, where execution between
-enqueued operations may happen concurrently unless dependencies between the
-operations are expressed with events.
+The command-buffer object has no in-order/out-of-order property set on creation,
+it is out-of-order, and command ordering is defined by the dependencies set when
+commands are created. Command dependencies can be defined in 3 ways:
+
+1. Device-side {cl_sync_point_khr_TYPE} synchronization-points, providing an
+explicit list of the commands to depend on.
+2. Appending a {clCommandBarrierWithWaitListKHR} barrier command.
+3. Passing an in-order queue when creating the command, creating an implicit
+dependency on the previous command created in the command-buffer using
+the same queue.
+
+{clEnqueueCommandBufferKHR} submissions to an out-of-order queue have the same
+execution semantics as other operations enqueued to an out-of-order queue,
+such as {clEnqueueFillBuffer}, where execution between enqueued operations may
+happen concurrently unless dependencies between the operations are expressed
+with events.

 The {cl_sync_point_khr_TYPE} type is defined as a `cl_uint`, giving a hard
 upper limit on the number of commands a command-buffer can hold as
@@ -466,3 +467,6 @@ features:
 (provisional).
 * 0.9.6, 2024-10-02
 ** Add device query for supported queue properties (provisional).
+* 0.9.7, 2024-12-13
+** Refactor queue compatability between command-buffer creation and enqueue
+(provisional).
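
The third dependency mechanism added in this file (an in-order queue creating an implicit dependency on the previous command recorded with the same queue) can be illustrated with a small standalone model. This is only a sketch and does not call the OpenCL API: `model_queue`, `infer_implicit_deps`, and `NO_DEPENDENCY` are hypothetical names introduced for illustration, and sync-point and barrier dependencies are deliberately left out.

```c
#include <stddef.h>

/* Marker meaning "this command has no implicit dependency". Hypothetical. */
#define NO_DEPENDENCY ((size_t)-1)

/* Hypothetical stand-in for a command-queue. Only the ordering flag is
 * modelled, since the revised text says other queue properties are ignored
 * when commands are created. */
typedef struct {
    int out_of_order; /* nonzero models CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE */
} model_queue;

/* For each appended command i, recorded with queue index cmd_queue[i], store
 * in implicit_dep[i] the index of the command it implicitly depends on, or
 * NO_DEPENDENCY if there is none. */
void infer_implicit_deps(const model_queue *queues, const size_t *cmd_queue,
                         size_t num_cmds, size_t *implicit_dep)
{
    for (size_t i = 0; i < num_cmds; ++i) {
        implicit_dep[i] = NO_DEPENDENCY;

        /* Out-of-order queue: no extra dependency on earlier commands. */
        if (queues[cmd_queue[i]].out_of_order)
            continue;

        /* In-order queue: depend on the most recent earlier command that
         * was appended with the same queue, if any. */
        for (size_t j = i; j-- > 0;) {
            if (cmd_queue[j] == cmd_queue[i]) {
                implicit_dep[i] = j;
                break;
            }
        }
    }
}
```

With queue 0 in-order and queue 1 out-of-order, commands recorded with the queue sequence 0, 1, 0, 1, 0 form a chain on queue 0 (command 2 depends on command 0, command 4 on command 2), while the two commands on queue 1 get no implicit dependency.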

api/cl_khr_command_buffer_multi_device.asciidoc

Lines changed: 4 additions & 1 deletion
@@ -6,7 +6,7 @@ include::{generated}/meta/{refprefix}cl_khr_command_buffer_multi_device.txt[]
 === Other Extension Metadata

 *Last Modified Date*::
-2023-04-30
+2024-12-13
 *IP Status*::
 No known IP claims.
 *Contributors*::
@@ -312,3 +312,6 @@ require it.
 * Revision 0.9.1, 2023-04-30
 ** Added clCommandSVMMemcpyKHR and clCommandSVMMemFillKHR as affected
 functions (provisional).
+* Revision 0.9.2, 2024-12-13
+** Update clRemapCommandBufferKHR behavior to match cl_khr_command_buffer
+version 0.9.7 (provisional).

api/opencl_platform_layer.asciidoc

Lines changed: 1 addition & 2 deletions
@@ -240,8 +240,7 @@ include::{generated}/api/version-notes/CL_COMMAND_BUFFER_PLATFORM_UNIVERSAL_SYNC

 {CL_COMMAND_BUFFER_PLATFORM_REMAP_QUEUES_KHR_anchor} - Platform
 supports the ability to create a deep copy of an existing
-command-buffer with the commands explicitly remapped to different,
-potentially <<compatible, incompatible>>, queues.
+command-buffer with the commands explicitly remapped to different queues.

 include::{generated}/api/version-notes/CL_COMMAND_BUFFER_PLATFORM_REMAP_QUEUES_KHR.asciidoc[]

api/opencl_runtime_layer.asciidoc

Lines changed: 50 additions & 38 deletions
@@ -14193,16 +14193,34 @@ of 0 or 1.
 The simultaneous use capability removes this restriction and allows
 command-buffers to have a <<pending_count, Pending Count>> greater than 1.

-[[compatible]]
 Command-buffers are created using an ordered list of command-queues that
-commands are recorded to and execute on by default.
-These command-queues can be replaced on command-buffer enqueue with
-different command-queues, provided for each element in the replacement list
-the substitute command-queue is compatible with the command-queue used on
-command-buffer creation.
-A _compatible_ command-queue is defined as a command-queue with
-identical properties targeting the same device and in the same OpenCL
-context.
+commands are recorded to and execute on by default. All these queue objects
+must share the same context, but may be associated with different devices when
+the {cl_khr_command_buffer_multi_device_EXT} extension is supported.
+
+When constructing a command-buffer by appending commands, the queue parameter
+passed for the command being created is used to set the device with which the
+command will be associated with, and also inform the scheduling of the command.
+If the queue is an in-order queue, then an additional dependency is created on the
+last command appended to the command-buffer using the same queue parameter. If
+the queue is an out-of-order queue, then no extra dependencies on previous
+commands using the same queue are created. All queue properties other than
+{CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} are ignored for the purposes of command
+creation, with the exception of any vendor extension defined queue properties
+that explicitly define semantics for this purpose.
+
+When enqueuing a command-buffer, a list of command-queues to execute the
+command-buffer on can be passed by the user, otherwise the command-queues set
+on command-buffer creation are used by default for execution. A user passed
+list may contain different command-queues, provided for each element the
+substitute command-queue matches the device and context of the command-queue
+used on command-buffer creation. Each command-queue in the enqueue list must
+also have the minimum properties defined by
+{CL_DEVICE_COMMAND_BUFFER_REQUIRED_QUEUE_PROPERTIES_KHR} and no properties
+which are not reported by
+{CL_DEVICE_COMMAND_BUFFER_SUPPORTED_QUEUE_PROPERTIES_KHR}. These queue
+properties have the same execution semantics for {clEnqueueCommandBufferKHR}
+as other operations enqueued to the queue.

 While constructing a command-buffer it is valid for the user to interleave
 calls to the same queue which create commands, such as
@@ -14266,7 +14284,7 @@ target the same device.

 Commands recorded to different command-queues in the same command-buffer may
 be executed concurrently to each other unless synchronized explicitly with
-sync-points.
+sync-points, barrier commands, or in-order queue implicit dependencies.
 Ordering of other commands submitted to the same command-queues as used to
 enqueue a command-buffer is the responsibility of the programmer.
 A command-buffer enqueue spanning multiple queues can return an event to use
@@ -14467,12 +14485,6 @@ returned in _errcode_ret_:

 * {CL_INVALID_COMMAND_QUEUE} if any command-queue in _queues_ is not a
 valid command-queue.
-* {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if the properties of any command-queue
-in _queues_ contains a property not specified by
-{CL_DEVICE_COMMAND_BUFFER_SUPPORTED_QUEUE_PROPERTIES_KHR}.
-* {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if the properties of any
-command-queue in _queues_ does not contain the minimum properties
-specified by {CL_DEVICE_COMMAND_BUFFER_REQUIRED_QUEUE_PROPERTIES_KHR}.
 * {CL_INVALID_CONTEXT} if all the command-queues in _queues_ do not have
 the same OpenCL context.
 * {CL_INVALID_VALUE} if the {cl_khr_command_buffer_multi_device_EXT}
@@ -14605,10 +14617,10 @@ include::{generated}/api/protos/clEnqueueCommandBufferKHR.txt[]
 include::{generated}/api/version-notes/clEnqueueCommandBufferKHR.asciidoc[]

 * _num_queues_ is the number of command-queues listed in _queues_.
-* _queues_ is a pointer to an ordered list of command-queues <<compatible,
-compatible>> with the command-queues used on recording.
-_queues_ can be `NULL`, in which case the default command-queues used on
-command-buffer creation are used and _num_queues_ must be 0.
+* _queues_ is a pointer to an ordered list of command-queues to execute the
+command-buffer on. _queues_ can be `NULL`, in which case the default
+command-queues used on command-buffer creation are used and _num_queues_
+must be 0.
 * _command_buffer_ refers to a valid command-buffer object.
 * _event_wait_list_, _num_events_in_wait_list_ specify events that need to
 complete before this particular command can be executed.
@@ -14653,9 +14665,15 @@ execution was successfully queued, or one of the errors below:
 _num_queues_ set on _command_buffer_ creation.
 * {CL_INVALID_COMMAND_QUEUE} if any element of _queues_ is not a valid
 command-queue.
-* {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if any element of _queues_ is not
-<<compatible, compatible>> with the command-queue set on
-_command_buffer_ creation at the same list index.
+* {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if the properties of any command-queue
+in _queues_ contains a property not specified by
+{CL_DEVICE_COMMAND_BUFFER_SUPPORTED_QUEUE_PROPERTIES_KHR}.
+* {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if the properties of any
+command-queue in _queues_ does not contain the minimum properties
+specified by {CL_DEVICE_COMMAND_BUFFER_REQUIRED_QUEUE_PROPERTIES_KHR}.
+* {CL_INVALID_DEVICE} if any element of _queues_ does not have the same
+device as the command-queue set on _command_buffer_ creation at the
+same list index.
 * {CL_INVALID_CONTEXT} if any element of _queues_ does not have the same
 context as the command-queue set on _command_buffer_ creation at the
 same list index.
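
The enqueue-time checks that this hunk moves onto {clEnqueueCommandBufferKHR} can be read as two bitmask tests plus a device and context comparison per list index. The following is a minimal standalone sketch of that reading, not the specification's or any implementation's code: every type and function name (`model_device`, `model_queue`, `model_result`, `check_enqueue_queues`) is hypothetical, queue properties are modelled as a plain 64-bit mask, and the order in which the checks run is an assumption, as is the `MODEL_COUNT_MISMATCH` result, because the real error code for a queue-count mismatch is not visible in this excerpt.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two per-device queries named in the diff. */
typedef struct {
    uint64_t required_props;  /* models ..._REQUIRED_QUEUE_PROPERTIES_KHR */
    uint64_t supported_props; /* models ..._SUPPORTED_QUEUE_PROPERTIES_KHR */
} model_device;

/* Hypothetical model of a command-queue. */
typedef struct {
    const model_device *device;
    int context_id;
    uint64_t props;
} model_queue;

typedef enum {
    MODEL_OK,
    MODEL_COUNT_MISMATCH,     /* placeholder, real error code not shown here */
    MODEL_INCOMPATIBLE_QUEUE, /* models CL_INCOMPATIBLE_COMMAND_QUEUE_KHR */
    MODEL_INVALID_DEVICE,     /* models CL_INVALID_DEVICE */
    MODEL_INVALID_CONTEXT     /* models CL_INVALID_CONTEXT */
} model_result;

/* Check a user-passed enqueue list against the queues used on creation. */
model_result check_enqueue_queues(const model_queue *created, size_t num_created,
                                  const model_queue *queues, size_t num_queues)
{
    /* NULL means "use the creation queues", and then num_queues must be 0. */
    if (queues == NULL)
        return num_queues == 0 ? MODEL_OK : MODEL_COUNT_MISMATCH;
    if (num_queues != num_created)
        return MODEL_COUNT_MISMATCH;

    for (size_t i = 0; i < num_queues; ++i) {
        const model_queue *q = &queues[i];

        /* Device and context must match the creation queue at this index. */
        if (q->device != created[i].device)
            return MODEL_INVALID_DEVICE;
        if (q->context_id != created[i].context_id)
            return MODEL_INVALID_CONTEXT;

        /* No property outside the supported set. */
        if ((q->props & ~q->device->supported_props) != 0)
            return MODEL_INCOMPATIBLE_QUEUE;
        /* Every required property must be present. */
        if ((q->props & q->device->required_props) != q->device->required_props)
            return MODEL_INCOMPATIBLE_QUEUE;
    }
    return MODEL_OK;
}
```

Note that the model never compares the properties of the enqueue queue with those of the creation queue, which is the point of the refactor: a substitute queue with different properties passes as long as it targets the same device and context and stays within the device's required and supported sets.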
@@ -16039,22 +16057,18 @@ ifdef::cl_khr_command_buffer_multi_device[]
 If the {cl_khr_command_buffer_multi_device_EXT} extension is supported,
 platforms reporting the {CL_COMMAND_BUFFER_PLATFORM_REMAP_QUEUES_KHR}
 capability support generating a deep copy of a command-buffer with its
-commands remapped to a list of command-queues that are potentially
-<<compatible, incompatible>> with the queues used to create the
-command-buffer.
-That is, the remapped command-buffer can execute on queues that differ in
-terms of properties and/or associated device from the original
+commands remapped to different devices than the devices used to create the
+commands. That is, the remapped command-buffer can execute on queues that
+differ in terms of properties and/or associated device from the original
 command-buffer queues.

 This functionality is invoked through a new synchronous entry-point
 {clRemapCommandBufferKHR} which takes a list of queues to which the commands
-should now target.
-It then returns a command-buffer containing the same commands as the
-original, with the same command dependencies, but targeting different
-queues.
-A list of command handles may also be passed to the entry-point, which
-allows handles to the equivalent commands in the remapped command-buffer to
-be returned by an output parameter.
+should now target the associated devices of. It then returns a command-buffer
+containing the same commands as the original, with the same command
+dependencies, but targeting different devices. A list of command handles may
+also be passed to the entry-point, which allows handles to the equivalent
+commands in the remapped command-buffer to be returned by an output parameter.

 Device properties restrict remapping possibilities, as existing commands can
 have a configuration which is not supported by another device, and so
@@ -16077,7 +16091,7 @@ appear and disappear during runtime.
 [open,refpage='clRemapCommandBufferKHR',desc='Create copy of a command-buffer remapped to specified command-queues',type='protos']
 --
 To create a deep copy of the input command-buffer with the copied commands
-remapped to target the passed command-queues, call the function
+remapped to target devices of the passed command-queues, call the function

 include::{generated}/api/protos/clRemapCommandBufferKHR.txt[]
 include::{generated}/api/version-notes/clRemapCommandBufferKHR.asciidoc[]
@@ -16136,8 +16150,6 @@ one of the following error values returned in _errcode_ret_:
 * {CL_INVALID_OPERATION} if the platform does not support the
 {CL_COMMAND_BUFFER_PLATFORM_AUTOMATIC_REMAP_KHR} flag and _automatic_ is
 {CL_TRUE}.
-* {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if such an error would be returned
-by passing _queues_ to {clCreateCommandBufferKHR}.
 * Any error relating to device support that can be returned by a command
 recording entry-point may also be returned.
 As a command in _command_buffer_ can have a configuration that is not

xml/cl.xml

Lines changed: 2 additions & 2 deletions
@@ -7191,7 +7191,7 @@ server's OpenCL/api-docs repository.
 <command name="clSetContentSizeBufferPoCL"/>
 </require>
 </extension>
-<extension name="cl_khr_command_buffer" revision="0.9.6" supported="opencl" depends="CL_VERSION_1_2" ratified="opencl" provisional="true">
+<extension name="cl_khr_command_buffer" revision="0.9.7" supported="opencl" depends="CL_VERSION_1_2" ratified="opencl" provisional="true">
 <require>
 <type name="CL/cl.h"/>
 </require>
@@ -7410,7 +7410,7 @@ server's OpenCL/api-docs repository.
 <enum name="CL_MEM_DEVICE_ID_INTEL"/>
 </require>
 </extension>
-<extension name="cl_khr_command_buffer_multi_device" revision="0.9.1" supported="opencl" depends="cl_khr_command_buffer" ratified="opencl" provisional="true" comment="in sync with version 0.9.1; requires cl_khr_command_buffer 0.9.3 or later">
+<extension name="cl_khr_command_buffer_multi_device" revision="0.9.2" supported="opencl" depends="cl_khr_command_buffer" ratified="opencl" provisional="true" comment="requires cl_khr_command_buffer 0.9.7 or later">
 <require>
 <type name="CL/cl.h"/>
 </require>
