@@ -1198,10 +1198,10 @@ The precise semantics of synchronization and the memory orders are formally
11981198defined in <<memory-ordering-rules, Memory Ordering Rules>>.
11991199Here, we give a high level description of how these memory orders apply to
12001200atomic operations on atomic objects shared between units of execution.
1201- OpenCL 2.x memory_order choices are based on those from the ISO C11 standard
1201+ The OpenCL 2.x memory orders are based on those from the ISO C11 standard
12021202memory model.
12031203They are specified in certain OpenCL functions through the following
1204- enumeration constants:
1204+ *memory_order* enumeration constants:
12051205
12061206 * *memory_order_relaxed*: implies no order constraints.
12071207 This memory order can be used safely to increment counters that are
@@ -1239,13 +1239,13 @@ detailed rules for when synchronisation must occur.
12391239 loads and stores from different units of execution appear to be simply
12401240 interleaved.
12411241
1242- Regardless of which memory_order is specified, resolving constraints on
1242+ Regardless of which memory order is specified, resolving constraints on
12431243memory operations across a heterogeneous platform adds considerable overhead
12441244to the execution of a program.
12451245An OpenCL platform may be able to optimize certain operations that depend on
12461246the features of the memory consistency model by restricting the scope of the
12471247memory operations.
1248- Distinct memory scopes are defined by the values of the memory_scope
1248+ Distinct memory scopes are defined by the values of the *memory_scope*
12491249enumeration constant:
12501250
12511251 * *memory_scope_work_item*: memory-ordering constraints only apply within
@@ -1319,17 +1319,15 @@ operations_ and _fences_.
13191319Atomic operations are indivisible.
13201320They either occur completely or not at all.
13211321These operations are used to order memory operations between units of
1322- execution and hence they are parameterized with the memory_order and
1323- memory_scope parameters defined by the OpenCL memory consistency model.
1322+ execution and hence they are parameterized with the memory order and
1323+ memory scope parameters defined by the OpenCL memory consistency model.
13241324The atomic operations for OpenCL kernel languages are similar to the
13251325corresponding operations defined by the C11 standard.
13261326
13271327The OpenCL 2.x atomic operations apply to variables of an atomic type (a
1328- subset of those in the C11 standard) including atomic versions of the int,
1329- uint, long, ulong, float, double, half, intptr_t, uintptr_t, size_t, and
1330- ptrdiff_t types.
1331- However, support for some of these atomic types depends on support for the
1332- corresponding regular types.
1328+ subset of those in the C11 standard).
1329+ Support for some of the atomic types depends on support for the corresponding
1330+ regular non-atomic types.
13331331
13341332An atomic operation on one or more memory locations is either an acquire
13351333operation, a release operation, or both an acquire and release operation.
@@ -1345,40 +1343,41 @@ The orders *memory_order_acquire* (used for reads), *memory_order_release*
13451343(used for writes), and *memory_order_acq_rel* (used for read-modify-write
13461344operations) are used for simple communication between units of execution
13471345using shared variables.
1348- Informally, executing a *memory_order_release* on an atomic object A makes
1346+ Informally, executing a *memory_order_release* on an atomic object *A* makes
13491347all previous side effects visible to any unit of execution that later
1350- executes a *memory_order_acquire* on A.
1348+ executes a *memory_order_acquire* on *A*.
13511349The orders *memory_order_acquire*, *memory_order_release*, and
13521350*memory_order_acq_rel* do not provide sequential consistency for race-free
13531351programs because they will not ensure that atomic stores followed by atomic
13541352loads become visible to other threads in that order.
13551353
13561354[[atomic-fence-orders]]
1357- The fence operation is atomic_work_item_fence, which includes a memory_order
1358- argument as well as the memory_scope and cl_mem_fence_flags arguments.
1359- Depending on the memory_order argument, this operation:
1360-
1361- * has no effects, if *memory_order_relaxed*;
1362- * is an acquire fence, if *memory_order_acquire*;
1363- * is a release fence, if *memory_order_release*;
1364- * is both an acquire fence and a release fence, if *memory_order_acq_rel*;
1355+ The fence operation is *atomic_work_item_fence*, which includes a memory order
1356+ argument as well as memory scope and memory flag arguments.
1357+ Depending on the memory order argument, this operation:
1358+
1359+ * has no effects, if the memory order is *memory_order_relaxed*;
1360+ * is an acquire fence, if the memory order is *memory_order_acquire*;
1361+ * is a release fence, if the memory order is *memory_order_release*;
1362+ * is both an acquire fence and a release fence, if the memory order is
1363+ *memory_order_acq_rel*;
13651364 * is a sequentially-consistent fence with both acquire and release
1366- semantics, if *memory_order_seq_cst*.
1365+ semantics, if the memory order is *memory_order_seq_cst*.
13671366
13681367If specified, the cl_mem_fence_flags argument must be `CLK_IMAGE_MEM_FENCE`,
13691368`CLK_GLOBAL_MEM_FENCE`, `CLK_LOCAL_MEM_FENCE`, or `CLK_GLOBAL_MEM_FENCE |
13701369CLK_LOCAL_MEM_FENCE`.
13711370
1372- The `atomic_work_item_fence(CLK_IMAGE_MEM_FENCE, ...)` built-in function must be
1373- used to make sure that sampler-less writes are visible to later reads by the
1374- same work-item.
1375- Without use of the atomic_work_item_fence function, write-read coherence on
1371+ The *atomic_work_item_fence* built-in function must be used with
1372+ `CLK_IMAGE_MEM_FENCE` to make sure that sampler-less writes are visible to later
1373+ reads by the same work-item.
1374+ Without use of the *atomic_work_item_fence* function, write-read coherence on
13761375image objects is not guaranteed: if a work-item reads from an image to which
1377- it has previously written without an intervening atomic_work_item_fence, it
1376+ it has previously written without an intervening *atomic_work_item_fence*, it
13781377is not guaranteed that those previous writes are visible to the work-item.
13791378
13801379The synchronization operations in OpenCL 2.x can be parameterized by a
1381- memory_scope.
1380+ memory scope.
13821381Memory scopes control the extent that an atomic operation or fence is
13831382visible with respect to the memory model.
13841383These memory scopes may be used when performing atomic operations and fences
@@ -1595,10 +1594,10 @@ C code and the host program contribute to the local- and
15951594global-happens-before relations.
15961595This section discusses ordering rules for OpenCL 2.x atomic operations.
15971596
1598- <<device-side-enqueue, Device-side enqueue>> defines the enumerated type
1599- memory_order.
1597+ The <<memory-consistency-model>> section defines the enumerated type
1598+ *memory_order*.
16001599
1601- * For *memory_order_relaxed*, no operation orders memory.
1600+ * For *memory_order_relaxed*, there is no memory ordering.
16021601 * For *memory_order_release*, *memory_order_acq_rel*, and
16031602 *memory_order_seq_cst*, a store operation performs a release operation
16041603 on the affected memory location.
@@ -1752,7 +1751,7 @@ This section describes how the OpenCL 2.x fence operations contribute to the
17521751local- and global-happens-before relations.
17531752
17541753Earlier, we introduced synchronization primitives called fences.
1755- Fences can utilize the acquire memory_order, release memory_order, or both.
1754+ Fences can utilize the acquire memory order, release memory order, or both.
17561755A fence with acquire semantics is called an acquire fence; a fence with
17571756release semantics is called a release fence. The <<atomic-fence-orders,
17581757overview of atomic and fence operations>> section describes the memory orders
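The informal release/acquire description in the changed text (a release on an atomic object *A* makes previous side effects visible to a unit of execution that later performs an acquire on *A*) can be illustrated with a small host-side sketch. This is plain C11 with POSIX threads, not OpenCL C, chosen because the text states that the OpenCL 2.x memory orders are based on the C11 memory model. The names `payload`, `ready`, `producer`, `consumer`, and `run_message_passing` are hypothetical and exist only for this illustration. In OpenCL C the corresponding explicit atomic functions additionally accept a memory scope argument, which this sketch does not model.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names, for illustration only. */
static int payload;       /* ordinary, non-atomic data */
static atomic_int ready;  /* plays the role of the atomic object A */

static void *producer(void *arg) {
    (void)arg;
    payload = 42;  /* side effect that precedes the release */
    /* Release store: publishes all earlier writes made by this thread. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg) {
    int *seen = arg;
    /* Acquire load: spin until the release store is observed. */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0) {
        /* busy-wait; acceptable for a tiny sketch */
    }
    /* The write to payload happens-before this read. */
    *seen = payload;
    return NULL;
}

/* Runs one producer and one consumer; returns what the consumer read. */
int run_message_passing(void) {
    pthread_t p, c;
    int seen = 0;

    payload = 0;
    atomic_store_explicit(&ready, 0, memory_order_relaxed);

    pthread_create(&c, NULL, consumer, &seen);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

If both operations on `ready` used `memory_order_relaxed` instead, the consumer could still observe `ready == 1`, but its read of `payload` would race with the producer's write, which is why the text reserves relaxed order for cases such as counters that are only incremented.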