
Commit 2b26248

authored
Extended Subgroups (#529)
* initial version for publication
* add version information
* clarify undefined behavior for non-uniform broadcast and inverse ballot;
  document supported types for subgroup mask built-ins
* update version information
* Apply suggestions from code review
1 parent be5003d commit 2b26248

4 files changed

Lines changed: 1238 additions & 3 deletions


OpenCL_Ext.txt

Lines changed: 2 additions & 0 deletions
@@ -82,6 +82,8 @@ include::ext/cl_khr_async_work_group_copy_fence.asciidoc[]
 include::ext/cl_khr_device_uuid.asciidoc[]
 include::ext/cl_khr_extended_versioning.asciidoc[]

+include::ext/cl_khr_subgroup_extensions.asciidoc[]
+
 // NOTE: To keep meaningful section numbers, new
 // extension documents should be added above here!

env/extensions.asciidoc

Lines changed: 133 additions & 0 deletions
@@ -180,6 +180,139 @@ If the OpenCL environment supports the extension `cl_khr_spirv_no_integer_wrap_d

 If the OpenCL environment supports the extension `cl_khr_spirv_no_integer_wrap_decoration` and use of the SPIR-V extension `SPV_KHR_no_integer_wrap_decoration` is declared in the module via *OpExtension*, then the environment must accept modules that include the *NoSignedWrap* or *NoUnsignedWrap* decorations.

+==== `cl_khr_subgroup_extended_types`
+
+If the OpenCL environment supports the extension `cl_khr_subgroup_extended_types`, then additional types are valid for the following *Groups* instructions with _Scope_ for _Execution_ equal to *Subgroup*:
+
+* *OpGroupBroadcast*
+* *OpGroupIAdd*, *OpGroupFAdd*
+* *OpGroupSMin*, *OpGroupUMin*, *OpGroupFMin*
+* *OpGroupSMax*, *OpGroupUMax*, *OpGroupFMax*
+
+For these instructions, valid types for _Value_ are:
+
+* Scalars of supported types:
+** *OpTypeInt* (equivalent to `char`, `uchar`, `short`, `ushort`, `int`, `uint`, `long`, and `ulong`)
+** *OpTypeFloat* (equivalent to `half`, `float`, and `double`)
+
+Additionally, for *OpGroupBroadcast*, valid types for _Value_ are:
+
+* *OpTypeVectors* with 2, 3, 4, 8, or 16 _Component Count_ components of supported types:
+** *OpTypeInt* (equivalent to `char__n__`, `uchar__n__`, `short__n__`, `ushort__n__`, `int__n__`, `uint__n__`, `long__n__`, and `ulong__n__`)
+** *OpTypeFloat* (equivalent to `half__n__`, `float__n__`, and `double__n__`)
+
+==== `cl_khr_subgroup_non_uniform_vote`
+
+If the OpenCL environment supports the extension `cl_khr_subgroup_non_uniform_vote`, then the environment must accept SPIR-V modules that declare the following SPIR-V capabilities:
+
+* *GroupNonUniform*
+* *GroupNonUniformVote*
+
+For instructions requiring these capabilities, _Scope_ for _Execution_ may be:
+
+* *Subgroup*
+
+For the instruction *OpGroupNonUniformAllEqual*, valid types for _Value_ are:
+
+* Scalars of supported types:
+** *OpTypeInt* (equivalent to `char`, `uchar`, `short`, `ushort`, `int`, `uint`, `long`, and `ulong`)
+** *OpTypeFloat* (equivalent to `half`, `float`, and `double`)
+
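As a rough illustration of what an all-equal vote computes, the host-side C sketch below models a subgroup as a plain array. It is not device code: the function name `model_all_equal` and the `active` array are assumptions made for this example, and only the `int` case is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Host-side model, not device code: active[i] says whether invocation i
 * of the subgroup takes part in the vote. */

/* Returns true when every active invocation holds the same value, which is
 * the behavior this sketch assumes for OpGroupNonUniformAllEqual. */
bool model_all_equal(const bool *active, const int *value, size_t n)
{
    assert(active != NULL && value != NULL);
    bool seen = false;
    int first = 0;
    for (size_t i = 0; i < n; ++i) {
        if (!active[i])
            continue;            /* inactive invocations do not vote */
        if (!seen) {
            first = value[i];    /* remember the first active value */
            seen = true;
        } else if (value[i] != first) {
            return false;
        }
    }
    return true;
}
```

In this model the value held by an inactive invocation never influences the result, which is the point of the non-uniform variants.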
+==== `cl_khr_subgroup_ballot`
+
+If the OpenCL environment supports the extension `cl_khr_subgroup_ballot`, then the environment must accept SPIR-V modules that declare the following SPIR-V capabilities:
+
+* *GroupNonUniformBallot*
+
+For instructions requiring these capabilities, _Scope_ for _Execution_ may be:
+
+* *Subgroup*
+
+For the non-uniform broadcast instruction *OpGroupNonUniformBroadcast*, valid types for _Value_ are:
+
+* Scalars of supported types:
+** *OpTypeInt* (equivalent to `char`, `uchar`, `short`, `ushort`, `int`, `uint`, `long`, and `ulong`)
+** *OpTypeFloat* (equivalent to `half`, `float`, and `double`)
+* *OpTypeVectors* with 2, 3, 4, 8, or 16 _Component Count_ components of supported types:
+** *OpTypeInt* (equivalent to `char__n__`, `uchar__n__`, `short__n__`, `ushort__n__`, `int__n__`, `uint__n__`, `long__n__`, and `ulong__n__`)
+** *OpTypeFloat* (equivalent to `half__n__`, `float__n__`, and `double__n__`)
+
+For the instruction *OpGroupNonUniformBroadcastFirst*, valid types for _Value_ are:
+
+* Scalars of supported types:
+** *OpTypeInt* (equivalent to `char`, `uchar`, `short`, `ushort`, `int`, `uint`, `long`, and `ulong`)
+** *OpTypeFloat* (equivalent to `half`, `float`, and `double`)
+
+For the instruction *OpGroupNonUniformBallot*, the valid _Result Type_ is an *OpTypeVector* with four _Component Count_ components of *OpTypeInt*, with _Width_ equal to 32 and _Signedness_ equal to 0 (equivalent to `uint4`).
+
+For the instructions *OpGroupNonUniformInverseBallot*, *OpGroupNonUniformBallotBitExtract*, *OpGroupNonUniformBallotBitCount*, *OpGroupNonUniformBallotFindLSB*, and *OpGroupNonUniformBallotFindMSB*, the valid type for _Value_ is an *OpTypeVector* with four _Component Count_ components of *OpTypeInt*, with _Width_ equal to 32 and _Signedness_ equal to 0 (equivalent to `uint4`).
+
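The `uint4` ballot type above holds 4 x 32 = 128 bits, one per possible invocation. The following host-side C sketch shows one way to picture that packing and the bit queries named above. It assumes that invocation `i` maps to bit `i % 32` of component `i / 32`; the type `ballot_t` and all function names are hypothetical, and returning -1 for an empty ballot is a choice of this model only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the four-component, 32-bit unsigned vector (uint4). */
typedef struct { uint32_t s[4]; } ballot_t;

/* Bit i is set when invocation i is active and its predicate is true. */
ballot_t ballot_build(const bool *active, const bool *pred,
                      unsigned subgroup_size)
{
    assert(subgroup_size <= 128);
    ballot_t b = {{0, 0, 0, 0}};
    for (unsigned i = 0; i < subgroup_size; ++i)
        if (active[i] && pred[i])
            b.s[i / 32] |= UINT32_C(1) << (i % 32);
    return b;
}

/* Reads the bit that belongs to invocation `index`. */
bool ballot_bit_extract(ballot_t b, unsigned index)
{
    assert(index < 128);
    return ((b.s[index / 32] >> (index % 32)) & 1u) != 0;
}

/* Counts set bits among the first subgroup_size bits. */
unsigned ballot_bit_count(ballot_t b, unsigned subgroup_size)
{
    assert(subgroup_size <= 128);
    unsigned count = 0;
    for (unsigned i = 0; i < subgroup_size; ++i)
        count += ballot_bit_extract(b, i) ? 1u : 0u;
    return count;
}

/* Lowest set bit, or -1 if none (the -1 is a convention of this model). */
int ballot_find_lsb(ballot_t b, unsigned subgroup_size)
{
    assert(subgroup_size <= 128);
    for (unsigned i = 0; i < subgroup_size; ++i)
        if (ballot_bit_extract(b, i))
            return (int)i;
    return -1;
}

/* Highest set bit, or -1 if none (the -1 is a convention of this model). */
int ballot_find_msb(ballot_t b, unsigned subgroup_size)
{
    assert(subgroup_size <= 128);
    for (unsigned i = subgroup_size; i-- > 0;)
        if (ballot_bit_extract(b, i))
            return (int)i;
    return -1;
}
```

For example, with 40 invocations where only invocations 1 and 35 vote true, the model stores bit 1 in the first component and bit 3 in the second.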
+For built-in variables decorated with *SubgroupEqMask*, *SubgroupGeMask*, *SubgroupGtMask*, *SubgroupLeMask*, or *SubgroupLtMask*, the supported variable type is an *OpTypeVector* with four _Component Count_ components of *OpTypeInt*, with _Width_ equal to 32 and _Signedness_ equal to 0 (equivalent to `uint4`).
+
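The five mask built-ins can be pictured as ballots computed from an invocation's own index. The host-side C sketch below assumes that each mask sets the bits of the invocations whose index compares as named (equal, greater-or-equal, and so on) against the caller's index, and that bits at or beyond the subgroup size stay zero. `mask_t`, `mask_kind`, and `subgroup_mask` are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the four-component, 32-bit unsigned vector (uint4). */
typedef struct { uint32_t s[4]; } mask_t;

typedef enum { MASK_EQ, MASK_GE, MASK_GT, MASK_LE, MASK_LT } mask_kind;

/* Builds the mask seen by invocation `id` in a subgroup of the given size. */
mask_t subgroup_mask(mask_kind kind, unsigned id, unsigned subgroup_size)
{
    assert(subgroup_size <= 128 && id < subgroup_size);
    mask_t m = {{0, 0, 0, 0}};
    for (unsigned i = 0; i < subgroup_size; ++i) {
        int set = 0;
        switch (kind) {
        case MASK_EQ: set = (i == id); break;
        case MASK_GE: set = (i >= id); break;
        case MASK_GT: set = (i >  id); break;
        case MASK_LE: set = (i <= id); break;
        case MASK_LT: set = (i <  id); break;
        }
        if (set)
            m.s[i / 32] |= UINT32_C(1) << (i % 32);
    }
    return m;
}
```

Under these assumptions, invocation 3 of an 8-wide subgroup would see `0x08` for the equal mask and `0x07` for the less-than mask in the first component.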
+==== `cl_khr_subgroup_non_uniform_arithmetic`
+
+If the OpenCL environment supports the extension `cl_khr_subgroup_non_uniform_arithmetic`, then the environment must accept SPIR-V modules that declare the following SPIR-V capabilities:
+
+* *GroupNonUniformArithmetic*
+
+For instructions requiring these capabilities, _Scope_ for _Execution_ may be:
+
+* *Subgroup*
+
+For the instructions *OpGroupNonUniformLogicalAnd*, *OpGroupNonUniformLogicalOr*, and *OpGroupNonUniformLogicalXor*, the valid type for _Value_ is *OpTypeBool*.
+
+Otherwise, for the *GroupNonUniformArithmetic* scan and reduction instructions, valid types for _Value_ are:
+
+* Scalars of supported types:
+** *OpTypeInt* (equivalent to `char`, `uchar`, `short`, `ushort`, `int`, `uint`, `long`, and `ulong`)
+** *OpTypeFloat* (equivalent to `half`, `float`, and `double`)
+
+For the *GroupNonUniformArithmetic* scan and reduction instructions, the optional _ClusterSize_ operand must not be present.
+
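To make the difference between a reduction and the two scan flavors concrete, here is a small host-side C model for integer addition. It is a sketch under assumptions: only active invocations contribute and receive results, the exclusive scan starts from the identity 0, and the names `model_reduce_add` and `model_scan_add` are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduction: the sum over all active invocations. */
int model_reduce_add(const bool *active, const int *value, size_t n)
{
    assert(active != NULL && value != NULL);
    int sum = 0;
    for (size_t i = 0; i < n; ++i)
        if (active[i])
            sum += value[i];
    return sum;
}

/* Scan: out[i] is the sum over active invocations up to i, including
 * invocation i itself when `inclusive` is true. Entries that belong to
 * inactive invocations are left untouched. */
void model_scan_add(const bool *active, const int *value, int *out,
                    size_t n, bool inclusive)
{
    assert(active != NULL && value != NULL && out != NULL);
    int running = 0;                 /* identity for addition */
    for (size_t i = 0; i < n; ++i) {
        if (!active[i])
            continue;
        if (inclusive) {
            running += value[i];
            out[i] = running;
        } else {
            out[i] = running;
            running += value[i];
        }
    }
}
```

With invocation 1 inactive and values `{1, 100, 2, 3}`, the model reduces to 6, the inclusive scan yields 1, 3, 6 for the active invocations, and the exclusive scan yields 0, 1, 3.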
+==== `cl_khr_subgroup_shuffle`
+
+If the OpenCL environment supports the extension `cl_khr_subgroup_shuffle`, then the environment must accept SPIR-V modules that declare the following SPIR-V capabilities:
+
+* *GroupNonUniformShuffle*
+
+For instructions requiring these capabilities, _Scope_ for _Execution_ may be:
+
+* *Subgroup*
+
+For the instructions *OpGroupNonUniformShuffle* and *OpGroupNonUniformShuffleXor* requiring these capabilities, valid types for _Value_ are:
+
+* Scalars of supported types:
+** *OpTypeInt* (equivalent to `char`, `uchar`, `short`, `ushort`, `int`, `uint`, `long`, and `ulong`)
+** *OpTypeFloat* (equivalent to `half`, `float`, and `double`)
+
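The two shuffle instructions differ only in how each invocation picks its source. The host-side C sketch below models that choice for `int` values, assuming all invocations are active and every source index is in range (the model asserts this rather than guessing a result). The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* General shuffle: invocation i reads the value held by invocation index[i]. */
void model_shuffle(const int *value, const unsigned *index, int *out,
                   unsigned n)
{
    assert(value != NULL && index != NULL && out != NULL);
    for (unsigned i = 0; i < n; ++i) {
        assert(index[i] < n);        /* out-of-range sources are not modeled */
        out[i] = value[index[i]];
    }
}

/* XOR shuffle: invocation i reads the value held by invocation (i ^ mask). */
void model_shuffle_xor(const int *value, unsigned mask, int *out, unsigned n)
{
    assert(value != NULL && out != NULL);
    for (unsigned i = 0; i < n; ++i) {
        unsigned src = i ^ mask;
        assert(src < n);             /* out-of-range sources are not modeled */
        out[i] = value[src];
    }
}
```

A mask of 1 swaps neighboring pairs, which is the usual first step of a butterfly-style exchange.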
+==== `cl_khr_subgroup_shuffle_relative`
+
+If the OpenCL environment supports the extension `cl_khr_subgroup_shuffle_relative`, then the environment must accept SPIR-V modules that declare the following SPIR-V capabilities:
+
+* *GroupNonUniformShuffleRelative*
+
+For instructions requiring these capabilities, _Scope_ for _Execution_ may be:
+
+* *Subgroup*
+
+For the *GroupNonUniformShuffleRelative* instructions, valid types for _Value_ are:
+
+* Scalars of supported types:
+** *OpTypeInt* (equivalent to `char`, `uchar`, `short`, `ushort`, `int`, `uint`, `long`, and `ulong`)
+** *OpTypeFloat* (equivalent to `half`, `float`, and `double`)
+
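Relative shuffles address the source by an offset from the caller's own index instead of an absolute index. The host-side C sketch below assumes that a shuffle "up" reads from invocation `i - delta` and a shuffle "down" reads from `i + delta`. Where the source would fall outside the subgroup, the model simply leaves `out[i]` unchanged, which is a convention of this example and not a statement about device behavior. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Shuffle up: invocation i reads from invocation (i - delta). */
void model_shuffle_up(const int *value, unsigned delta, int *out, unsigned n)
{
    assert(value != NULL && out != NULL);
    for (unsigned i = 0; i < n; ++i)
        if (i >= delta)              /* source index stays inside the subgroup */
            out[i] = value[i - delta];
}

/* Shuffle down: invocation i reads from invocation (i + delta). */
void model_shuffle_down(const int *value, unsigned delta, int *out, unsigned n)
{
    assert(value != NULL && out != NULL);
    for (unsigned i = 0; i < n; ++i)
        if (delta < n - i)           /* same as i + delta < n, without overflow */
            out[i] = value[i + delta];
}
```

With a delta of 1, the "up" variant moves each value to the next higher invocation and the "down" variant moves it to the next lower one.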
+==== `cl_khr_subgroup_clustered_reduce`
+
+If the OpenCL environment supports the extension `cl_khr_subgroup_clustered_reduce`, then the environment must accept SPIR-V modules that declare the following SPIR-V capabilities:
+
+* *GroupNonUniformClustered*
+
+For instructions requiring these capabilities, _Scope_ for _Execution_ may be:
+
+* *Subgroup*
+
+When the *GroupNonUniformClustered* capability is declared, the *GroupNonUniformArithmetic* scan and reduction instructions may include the optional _ClusterSize_ operand.
+
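The effect of a _ClusterSize_ operand on a reduction can be sketched on the host as follows. The model assumes that the subgroup is split into contiguous clusters of `cluster_size` invocations, that the size is a power of two, and that every invocation is active. Each invocation receives the sum of its own cluster only. The function name is invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Clustered add-reduction: out[i] is the sum of value[] over the cluster
 * of cluster_size contiguous invocations that contains invocation i. */
void model_clustered_reduce_add(const int *value, int *out, unsigned n,
                                unsigned cluster_size)
{
    assert(value != NULL && out != NULL);
    /* The model requires a non-zero power of two. */
    assert(cluster_size > 0 && (cluster_size & (cluster_size - 1)) == 0);

    for (unsigned base = 0; base < n; base += cluster_size) {
        unsigned end = (n - base < cluster_size) ? n : base + cluster_size;
        int sum = 0;
        for (unsigned i = base; i < end; ++i)
            sum += value[i];
        for (unsigned i = base; i < end; ++i)
            out[i] = sum;            /* every member sees its cluster's sum */
    }
}
```

For the values 1 through 8 and a cluster size of 4, the first four invocations receive 10 and the last four receive 26.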
 === Embedded Profile Extensions

 ==== `cles_khr_int64`
