== Modifications to the OpenCL SPIR-V Environment Specification

[NOTE]
====
SPIR-V support was added in extension version 1.1.0.
====

=== Add a new section 5.2.X - `cl_intel_subgroup_matrix_multiply_accumulate`

If the OpenCL environment supports the extension `cl_intel_subgroup_matrix_multiply_accumulate` then the environment must accept modules that declare use of the extension `SPV_INTEL_subgroup_matrix_multiply_accumulate` and that declare the SPIR-V capability *SubgroupMatrixMultiplyAccumulateINTEL*.

For devices where the minimum subgroup size is 8, the following matrix dimensions and types are supported.
For these devices, the subgroup size must be 8 (the minimum subgroup size).
Behavior is undefined if these functions are called on other devices or from kernels with a different subgroup size:

[cols="^1,^1,^1,^2,^2,^2,^2",width="100%"]
[options="header"]
|=====
| M Dimension | N Dimension | K Dimension | Result Type | Matrix A Type | Matrix B Type | Matrix C Type
| `M x int32_t` with *MatrixAPackedInt4INTEL* and *MatrixASignedComponentsINTEL*
| `8 x int32_t` with *MatrixBPackedInt4INTEL* and *MatrixBSignedComponentsINTEL*
| `M x int32_t`

| 1, 2, 4, 8 | 8 | 64 | `M x int32_t`
| `M x int32_t` with *MatrixAPackedInt4INTEL* and *MatrixASignedComponentsINTEL*
| `8 x int32_t` with *MatrixBPackedInt4INTEL*
| `M x int32_t`

| 1, 2, 4, 8 | 8 | 64 | `M x int32_t`
| `M x int32_t` with *MatrixAPackedInt4INTEL*
| `8 x int32_t` with *MatrixBPackedInt4INTEL* and *MatrixBSignedComponentsINTEL*
| `M x int32_t`

| 1, 2, 4, 8 | 8 | 64 | `M x int32_t`
| `M x int32_t` with *MatrixAPackedInt4INTEL*
| `8 x int32_t` with *MatrixBPackedInt4INTEL*
| `M x int32_t`

// f32 = f16 x f16 + f32
7+<| *fp16 matrix sources, fp32 accumulator*:
| 1, 2, 4, 8 | 8 | 16 | `M x float32_t` | `M x int32_t` with *MatrixAPackedFloat16INTEL* | `8 x int32_t` with *MatrixBPackedFloat16INTEL* | `M x float32_t`

// f32 = bf16 x bf16 + f32
7+<| *bf16 matrix sources, fp32 accumulator*:
| 1, 2, 4, 8 | 8 | 16 | `M x float32_t` | `M x int32_t` with *MatrixAPackedBFloat16INTEL* | `8 x int32_t` with *MatrixBPackedBFloat16INTEL* | `M x float32_t`

|=====
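
The K dimensions listed for the subgroup-size-8 rows appear to follow from the operand packing: Matrix B is an `8 x int32_t` vector, and each 32-bit component holds as many elements as fit in 32 bits. The sketch below checks that reading against the rows shown above. The relationship is inferred from the table, not stated normatively, and the names `ELEMENT_BITS` and `k_dimension` are hypothetical helpers used only for illustration.

```python
# Assumption (inferred from the table, not normative): each 32-bit operand
# component packs 32 // element_bits matrix elements, and Matrix B supplies
# 8 such components per work-item (the `8 x int32_t` Matrix B type).
COMPONENT_BITS = 32
MATRIX_B_COMPONENTS = 8

# Element widths for the packed operand formats named in the table rows.
ELEMENT_BITS = {
    "PackedInt4": 4,
    "PackedFloat16": 16,
    "PackedBFloat16": 16,
}


def k_dimension(packing: str) -> int:
    """Return the K dimension implied by a packed operand format."""
    elements_per_component = COMPONENT_BITS // ELEMENT_BITS[packing]
    return MATRIX_B_COMPONENTS * elements_per_component


for name in ELEMENT_BITS:
    print(name, k_dimension(name))
```

Under these assumptions the helper yields K = 64 for the 4-bit integer rows and K = 16 for the fp16 and bf16 rows, matching the K Dimension column.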

For devices where the minimum subgroup size is 16, the following matrix dimensions and types are supported.
For these devices, the subgroup size must be 16 (the minimum subgroup size).
Behavior is undefined if these functions are called on other devices or from kernels with a different subgroup size:

[cols="^1,^1,^1,^2,^2,^2,^2",width="100%"]
[options="header"]
|=====
| M Dimension | N Dimension | K Dimension | Result Type | Matrix A Type | Matrix B Type | Matrix C Type