|
2 | 2 | Copyright 2023 The Khronos Group Inc. |
3 | 3 | SPDX-License-Identifier: CC-BY-4.0 |
4 | 4 |
|
| 5 | +*********************************** |
| 6 | +Host allocations and memory sharing |
| 7 | +*********************************** |
| 8 | + |
5 | 9 | .. _host-allocations: |
6 | 10 |
|
7 | | -**************** |
| 11 | +================ |
8 | 12 | Host allocations |
9 | | -**************** |
| 13 | +================ |
| 14 | + |
| 15 | +A SYCL runtime may need to allocate temporary objects |
| 16 | +on the host to handle some operations (such as copying |
| 17 | +data from one context to another). |
| 18 | + |
| 19 | +Allocation on the host is managed using an allocator |
| 20 | +object, following the standard C++ allocator class |
| 21 | +definition. |
| 22 | + |
| 23 | +The default allocator for memory objects is |
| 24 | +implementation-defined, but the user can supply |
| 25 | +their own allocator class by passing it as |
| 26 | +template parameter. For example, this is how to provide |
| 27 | +user-defined allocator class to :ref:`buffer`: |
| 28 | + |
| 29 | +:: |
| 30 | + |
| 31 | + { |
| 32 | + sycl::buffer<int, 1, UserDefinedAllocator<int>> b(d); |
| 33 | + } |
| 34 | + |
| 35 | +When an allocator returns a ``nullptr``, the runtime |
| 36 | +cannot allocate data on the host. Note that in this case |
| 37 | +the runtime will raise an error if it requires host memory |
| 38 | +but it is not available (e.g when moving data across SYCL |
| 39 | +backend contexts). |
| 40 | + |
| 41 | +In some cases, the implementation may retain a copy of |
| 42 | +the allocator object even after the buffer is destroyed. |
| 43 | +For example, this can happen when the buffer object is |
| 44 | +destroyed before commands using accessors to the buffer |
| 45 | +have completed. Therefore, the application must be |
| 46 | +prepared for calls to the allocator even after the buffer |
| 47 | +is destroyed. |
| 48 | + |
| 49 | +.. note:: |
| 50 | + |
| 51 | + If the application needs to know when the implementation |
| 52 | + has destroyed all copies of the allocator, it can maintain |
| 53 | + a reference count within the allocator. |
| 54 | + |
| 55 | +The definition of allocators extends the current functionality |
| 56 | +of SYCL, ensuring that users can define allocator functions for |
| 57 | +specific hardware or certain complex shared memory mechanisms, |
| 58 | +and improves interoperability with STL-based libraries. |
| 59 | + |
| 60 | +.. seealso:: |SYCL_SPEC_HOST_ALLOC| |
| 61 | + |
| 62 | +Default allocators |
| 63 | +================== |
| 64 | + |
| 65 | +A default allocator is always defined by the implementation. |
| 66 | +For allocations greater than size zero, when successful |
| 67 | +it is guaranteed to return non-``nullptr`` and new memory |
| 68 | +positions every call. |
| 69 | + |
| 70 | +The default allocator for ``const`` buffers will remove the |
| 71 | +constantness of the type (therefore, the default allocator |
| 72 | +for a buffer of type ``const int`` will be an ``Allocator<int>``). |
| 73 | +This implies that host accessors will not synchronize with the |
| 74 | +pointer given by the user in the :ref:`buffer`/:ref:`image <iface-images>` |
| 75 | +constructor, but will use the memory returned by the ``Allocator`` |
| 76 | +itself for that purpose. |
| 77 | + |
| 78 | +The user can implement an allocator that returns the same address |
| 79 | +as the one passed in the buffer constructor, but it is the |
| 80 | +responsibility of the user to handle the potential race conditions. |
| 81 | + |
| 82 | +The list of SYCL default allocators: |
| 83 | + |
| 84 | +.. list-table:: |
| 85 | + :header-rows: 1 |
| 86 | + |
| 87 | + * - Allocator |
| 88 | + - Description |
| 89 | + * - ``template <class T> sycl::buffer_allocator`` |
| 90 | + - It is the default :ref:`buffer` allocator used by the runtime, |
| 91 | + when no allocator is defined by the user. |
| 92 | + |
| 93 | + Meets the C++ named requirement Allocator. |
| 94 | + |
| 95 | + A buffer of data type ``const T`` uses ``buffer_allocator<T>`` |
| 96 | + by default. |
| 97 | + * - ``sycl::image_allocator`` |
| 98 | + - It is the default allocator used by the runtime for the SYCL :ref:`unsampled_image` |
| 99 | + and :ref:`sampled_image` classes when no allocator is provided by the user. |
| 100 | + |
| 101 | + The ``sycl::image_allocator`` is required to allocate in elements of ``std::byte``. |
| 102 | + |
| 103 | + |
| 104 | +.. _host_memory_sharing: |
| 105 | + |
| 106 | +========================================================= |
| 107 | +Sharing host memory with the SYCL data management classes |
| 108 | +========================================================= |
| 109 | + |
| 110 | +In order to allow the SYCL runtime to do memory management |
| 111 | +and allow for data dependencies, there are :ref:`buffer` |
| 112 | +and :ref:`image <iface-images>` classes defined. |
| 113 | + |
| 114 | +The default behavior for them is that if a “raw” pointer |
| 115 | +is given during the construction of the data management |
| 116 | +class, then full ownership to use it is given to the SYCL |
| 117 | +runtime until the destruction of the SYCL object. |
| 118 | + |
| 119 | +Below you can find details on sharing or explicitly not |
| 120 | +sharing host memory with the SYCL data classes. The same |
| 121 | +rules apply to :ref:`images <iface-images>` as well. |
| 122 | + |
| 123 | +.. seealso:: |SYCL_SPEC_HOST_MEM_SHARING| |
| 124 | + |
| 125 | + |
| 126 | +Default behavior |
| 127 | +================ |
| 128 | + |
| 129 | +When using a :ref:`buffer`, the ownership of a pointer |
| 130 | +passed to the constructor of the class is, by default, |
| 131 | +passed to SYCL runtime, and that pointer cannot be used |
| 132 | +on the host side until the :ref:`buffer` or |
| 133 | +:ref:`image <iface-images>` is destroyed. |
| 134 | + |
| 135 | +A SYCL application can access the contents of the memory |
| 136 | +managed by a SYCL buffer by using a :ref:`host_accessor`. |
| 137 | +However, there is no guarantee that the |
| 138 | +host accessor synchronizes with the original host |
| 139 | +address used in its constructor. |
| 140 | + |
| 141 | +The pointer passed in is the one used to copy data back |
| 142 | +to the host, if needed, before buffer destruction. |
| 143 | +The memory pointed to by the host pointer will not be |
| 144 | +deallocated by the runtime, and the data is copied back |
| 145 | +from the device to the host through the host pointer |
| 146 | +if there is a need for it. |
| 147 | + |
| 148 | + |
| 149 | +SYCL ownership of the host memory |
| 150 | +================================= |
| 151 | + |
| 152 | +In the case where there is host memory to be used for |
| 153 | +initialization of data but there is no intention of using |
| 154 | +that host memory after the :ref:`buffer` is destroyed, |
| 155 | +then the :ref:`buffer` can take full ownership of that |
| 156 | +host memory. |
| 157 | + |
| 158 | +When a :ref:`buffer` owns the host pointer there is no copy back, |
| 159 | +by default. To create this situation, the SYCL application may pass |
| 160 | +a unique pointer to the host data, which will be then used by the |
| 161 | +runtime internally to initialize the data in the device. |
| 162 | + |
| 163 | +For example, the following could be used: |
| 164 | + |
| 165 | +:: |
| 166 | + |
| 167 | + { |
| 168 | + auto ptr = std::make_unique<int>(-1234); |
| 169 | + buffer<int, 1> b { std::move(ptr), range { 1 } }; |
| 170 | + // ptr is not valid anymore. |
| 171 | + // There is nowhere to copy data back |
| 172 | + } |
| 173 | + |
| 174 | +However, optionally the ``sycl::buffer::set_final_data()`` can be |
| 175 | +set to an output iterator (including a “raw” pointer) or to a |
| 176 | +``std::weak_ptr`` to enable copying data back, to another host memory |
| 177 | +address that is going to be valid after :ref:`buffer` destruction. |
| 178 | + |
| 179 | +:: |
| 180 | + |
| 181 | + { |
| 182 | + auto ptr = std::make_unique<int>(-42); |
| 183 | + buffer<int, 1> b { std::move(ptr), range { 1 } }; |
| 184 | + // ptr is not valid anymore. |
| 185 | + // There is nowhere to copy data back. |
| 186 | + // To get copy back, a location can be specified: |
| 187 | + b.set_final_data(std::weak_ptr<int> { .... }) |
| 188 | + } |
| 189 | + |
| 190 | + |
| 191 | +Shared SYCL ownership of the host memory |
| 192 | +======================================== |
| 193 | + |
| 194 | +When an instance of ``std::shared_ptr`` is passed to the |
| 195 | +:ref:`buffer` constructor, then the :ref:`buffer` object |
| 196 | +and the developer's application share the memory region. |
| 197 | + |
| 198 | +Rules of shared ownership: |
| 199 | + |
| 200 | +1. If the ``std::shared_ptr`` is not empty, the contents of the |
| 201 | + referenced memory are used to initialize the :ref:`buffer`. |
| 202 | + |
| 203 | + If the ``std::shared_ptr`` is empty, then the :ref:`buffer` |
| 204 | + is created with uninitialized memory. |
| 205 | + |
| 206 | +2. If the ``std::shared_ptr`` is still used on the application's |
| 207 | + side then the data will be copied back from the :ref:`buffer` |
| 208 | + or :ref:`image <iface-images>` and will be available to the |
| 209 | + application after the :ref:`buffer` or |
| 210 | + :ref:`image <iface-images>` object is destroyed. |
| 211 | + |
| 212 | +3. When the :ref:`buffer` is destroyed and the data have |
| 213 | + potentially been updated, if the number of copies of |
| 214 | + the ``std::shared_ptr`` outside the runtime is 0, |
| 215 | + there is no user-side shared pointer to read the data. |
| 216 | + |
| 217 | + Therefore the data is not copied out, and the :ref:`buffer` |
| 218 | + destructor does not need to wait for the data processes |
| 219 | + to be finished, as the outcome is not needed on the |
| 220 | + application's side. |
| 221 | + |
| 222 | +Example of such behavior: |
| 223 | + |
| 224 | +:: |
| 225 | + |
| 226 | + { |
| 227 | + std::shared_ptr<int> ptr { data }; |
| 228 | + { |
| 229 | + buffer<int, 1> b { ptr, range<2>{ 10, 10 } }; |
| 230 | + // update the data |
| 231 | + [...] |
| 232 | + } // Data is copied back because there is an user side shared_ptr |
| 233 | + } |
| 234 | + |
| 235 | +:: |
| 236 | + |
| 237 | + { |
| 238 | + std::shared_ptr<int> ptr { data }; |
| 239 | + { |
| 240 | + buffer<int, 1> b { ptr, range<2>{ 10, 10 } }; |
| 241 | + // update the data |
| 242 | + [...] |
| 243 | + ptr.reset(); |
| 244 | + } // Data is not copied back, there is no user side shared_ptr. |
| 245 | + } |
| 246 | + |
| 247 | +This behavior can be overridden using the |
| 248 | +``sycl::buffer::set_final_data()`` member function of the |
| 249 | +:ref:`buffer` class, which will by any means force the |
| 250 | +:ref:`buffer` destructor to wait until the data is copied to |
| 251 | +wherever the ``set_final_data()`` member function has put the |
| 252 | +data (or not wait nor copy if set final data is ``nullptr``). |
0 commit comments