Commit 274e76e

linehill and pjaaskel authored and committed
Apply suggestions from code review
Co-authored-by: Pekka Jääskeläinen <pekka.jaaskelainen@tuni.fi>
1 parent ac8499f commit 274e76e

1 file changed

Lines changed: 16 additions & 16 deletions

ext/cl_exp_tensor.asciidoc

@@ -44,18 +44,18 @@ Ben Ashbaugh, Intel. +
 === Overview
 
 The new tensor object enables applications to describe N-dimensional
-arrays whose memory layout is abstract to applications. The goal and
-intent of this extension is to give leverage for:
+arrays whose memory layout is opaque to applications. The goals
+of this extension are the following:
 
-* implementations to have freedom of placement data of the tensors for
+* Enable implementations to have freedom of placement data of the tensors for
 improving performance of the kernels which use them. This extension
-should be designed so it allows implementations to determine optimal
+is designed such it allows implementations to determine optimal
 memory layouts for the tensors based on their use cases for
-increasing performance - for example, by analyzing kernels’ access
-patterns - or, in case of built-in kernels, by inspecting tensor
+increased performance, by means of, for example, analyzing kernels’ access
+patterns or, in case of built-in kernels, by inspecting the tensor
 arguments they operate on.
 
-* reduce details and boilerplate needed for porting performant
+* Reduce details and boilerplate needed for performance portable implementation of
 applications by being less dependent on platform or device specifics
 on the memory layout / data arrangements which matters for
 performance. Such specifics may include:
@@ -74,23 +74,23 @@ intent of this extension is to give leverage for:
 cores).
 
 ** arrangement of data into rows separated by a stride in order to
-avoid back conflicts in GPUs.
+avoid bank conflicts in GPUs.
 
-The tensor data type is deemed to be effective with command buffers
-and built-in kernels - including kernels to be provided by defined
-built-in kernel (cl_khr_defined_builtin_kernels) extension under work.
+The tensor data type is designed to be efficiently used together with command buffers (cl_khr_command_buffers)
+and built-in kernels, including kernels to be provided by the Defined
+Built-in Kernels (cl_khr_defined_builtin_kernels) extension that is being prepared together with this extension.
 
 === Modifications to OpenCL
 
 ==== New Section: 5.x Tensor Objects
 
-A tensor object stores a N-dimensional array of elements. The memory
+A tensor object stores an N-dimensional array of elements. The memory
 layout of the tensor is opaque to the application. When a tensor
-object is created it initially does not have storage where the
-elements of the tensor are stored into. A storage is bind to a tensor
+object is created it is initially not associated to any storage for the tensor elements.
+A storage is bound to a tensor
 by creating a memory buffer with CL_MEM_BIND_TO_BUFFER. Tensor objects
 without storage can be set as kernel arguments for kernels which
-accepts them. Kernels which have tensor arguments must have a storage
+accepts them. Kernels which have tensor arguments must have storage
 assigned to them prior enqueuing the kernels for execution.
 
 ==== New OpenCL Functions added to Tensor Objects section
@@ -684,5 +684,5 @@ assert(clEnqueueTranslateFromTensor(..., t0, ...) == CL_INVALID_OPERATION);
 --
 *RESOLVED*: OpenCL C support for tensors can be introduced later in a
 separate extension. Built-in kernels may benefit from this
-extension.
+extension as it is.
 --