|
3 | 3 | [[introduction]] |
4 | 4 | = Introduction |
5 | 5 |
|
6 | | -SYCL (pronounced "`sickle`") is a royalty-free, cross-platform abstraction {cpp} |
7 | | -programming model for heterogeneous computing. |
8 | | -SYCL builds on the underlying concepts, portability and efficiency of parallel |
9 | | -API or standards like OpenCL while adding much of the ease of use and |
10 | | -flexibility of single-source {cpp}. |
11 | | - |
12 | | -Developers using SYCL are able to write standard modern {cpp} code, with many of |
13 | | -the techniques they are accustomed to, such as inheritance and templates. |
14 | | -At the same time, developers have access to the full range of capabilities of |
15 | | -the underlying implementation (such as OpenCL) both through the features of the |
16 | | -SYCL libraries and, where necessary, through interoperation with code written |
17 | | -directly using the underneath implementation, via their APIs. |
18 | | - |
19 | | -To reduce programming effort and increase the flexibility with which developers |
20 | | -can write code, SYCL extends the concepts found in standards like OpenCL model |
21 | | -in a few ways beyond the general use of {cpp} features: |
22 | | - |
23 | | - * execution of parallel kernels on a heterogeneous device is made |
24 | | - simultaneously convenient and flexible. |
25 | | - Common parallel patterns are prioritized with simple syntax, which through a |
26 | | - series {cpp} types allow the programmer to express additional requirements, |
27 | | - such as dependencies, if needed; |
28 | | - * when using buffers and accessors, data access in SYCL is separated from data |
29 | | - storage. |
30 | | - By relying on the {cpp}-style resource acquisition is initialization (RAII) |
31 | | - idiom to capture data dependencies between device code blocks, the runtime |
32 | | - library can track data movement and provide correct behavior without the |
33 | | - complexity of manually managing event dependencies between kernel instances |
34 | | - and without the programmer having to explicitly move data. |
35 | | - This approach enables the data-parallel task-graphs that might be already |
36 | | - part of the execution model to be built up easily and safely by SYCL |
37 | | - programmers; |
38 | | - * Unified Shared Memory (<<usm>>) provides a mechanism for explicit data |
39 | | - allocation and movement. |
40 | | - This approach enables the use of pointer-based algorithms and data |
41 | | - structures on heterogeneous devices, and allows for increased re-use of code |
42 | | - across host and device; |
43 | | - * the hierarchical parallelism syntax offers a way of expressing data |
44 | | - parallelism similar to the OpenCL device or OpenMP target device execution |
45 | | - model in an easy-to-understand modern {cpp} form. |
46 | | - It more cleanly layers parallel loops to avoid fragmentation of code and to |
47 | | - more efficiently map to CPU-style architectures. |
48 | | - |
49 | | -SYCL retains the execution model, runtime feature set and device capabilities |
50 | | -inspired by the OpenCL standard. |
51 | | -This standard imposes some limitations on the full range of {cpp} features that |
52 | | -SYCL is able to support. |
53 | | -This ensures portability of device code across as wide a range of devices as |
54 | | -possible. |
55 | | -As a result, while the code can be written in standard {cpp} syntax with |
56 | | -interoperability with standard {cpp} programs, the entire set of {cpp} features |
57 | | -is not available in SYCL device code. |
58 | | -In particular, SYCL device code, as defined by this specification, does not |
59 | | -support virtual function calls, function pointers in general, exceptions, |
60 | | -runtime type information or the full set of {cpp} libraries that may depend on |
61 | | -these features or on features of a particular host compiler. |
62 | | -Nevertheless, these basic restrictions can be relieved by some specific Khronos |
63 | | -or vendor extensions. |
64 | | - |
65 | | -SYCL implements an <<smcp>> design which offers the power of source integration |
66 | | -while allowing toolchains to remain flexible. |
67 | | -The <<smcp>> design supports embedding of code intended to be compiled for a |
68 | | -device, for example a GPU, inline with host code. |
69 | | -This embedding of code offers three primary benefits: |
70 | | - |
71 | | -Simplicity:: |
72 | | - For novice programmers using frameworks like OpenCL, the separation of host |
73 | | - and device source code in OpenCL can become complicated to deal with, |
74 | | - particularly when similar kernel code is used for multiple different |
75 | | - operations on different data types. |
76 | | - A single compiler flow and integrated tool chain combined with libraries |
77 | | - that perform a lot of simple tasks simplifies initial OpenCL programs to a |
78 | | - minimum complexity. |
79 | | - This reduces the learning curve for programmers new to heterogeneous |
80 | | - programming and allows them to concentrate on parallelization techniques |
81 | | - rather than syntax. |
82 | | -Reuse:: |
83 | | - {cpp}'s type system allows for complex interactions between different code |
84 | | - units and supports efficient abstract interface design and reuse of library |
85 | | - code. |
86 | | - For example, a [keyword]#transform# or [keyword]#map# operation applied to |
87 | | - an array of data may allow specialization on both the operation applied to |
88 | | - each element of the array and on the type of the data. |
89 | | - The <<smcp>> design of SYCL enables this interaction to bridge the host |
90 | | - code/device code boundary such that the device code to be specialized on |
91 | | - both of these factors directly from the host code. |
92 | | -Efficiency:: |
93 | | - Tight integration with the type system and reuse of library code enables a |
94 | | - compiler to perform inlining of code and to produce efficient specialized |
95 | | - device code based on decisions made in the host code without having to |
96 | | - generate kernel source strings dynamically. |
97 | | - |
98 | | -The use of {cpp} features such as generic programming, templated code, |
99 | | -functional programming and inheritance on top of existing heterogeneous |
100 | | -execution model opens a wide scope for innovation in software design for |
101 | | -heterogeneous systems. |
102 | | -Clean integration of device and host code within a single {cpp} type system |
103 | | -enables the development of modern, templated generic and adaptable libraries |
104 | | -that build simple, yet efficient, interfaces to offer more developers access to |
105 | | -heterogeneous computing capabilities and devices. |
106 | | -SYCL is intended to serve as a foundation for innovation in programming models |
107 | | -for heterogeneous systems, that builds on open and widely implemented standard |
108 | | -foundation like OpenCL or Vulkan. |
109 | | - |
110 | | -SYCL is designed to be as close to standard {cpp} as possible. |
111 | | -In practice, this means that as long as no dependence is created on SYCL's |
112 | | -integration with the underlying implementation, a standard {cpp} compiler can |
113 | | -compile SYCL programs and they will run correctly on a host CPU. |
114 | | -Any use of specialized low-level features can be masked using the C preprocessor |
115 | | -in the same way that compiler-specific intrinsics may be hidden to ensure |
116 | | -portability between different host compilers. |
117 | | - |
118 | | -SYCL is designed to allow a compilation flow where the source file is passed |
119 | | -through multiple different compilers, including a standard {cpp} host compiler |
120 | | -of the developer's choice, and where the resulting application combines the |
121 | | -results of these compilation passes. |
122 | | -This is distinct from a single-source flow that might use language extensions |
123 | | -that preclude the use of a standard host compiler. |
124 | | -The SYCL standard does not preclude the use of a single compiler flow, but is |
125 | | -designed to not require it. |
126 | | -SYCL can also be implemented purely as a library, in which case no special |
127 | | -compiler support is required at all. |
128 | | - |
129 | | -The advantages of this design are two-fold. |
130 | | -First, it offers better integration with existing tool chains. |
131 | | -An application that already builds using a chosen compiler can continue to do so |
132 | | -when SYCL code is added. |
133 | | -Using the SYCL tools on a source file within a project will both compile for a |
134 | | -device and let the same source file be compiled using the same host compiler |
135 | | -that the rest of the project is compiled with. |
136 | | -Linking and library relationships are unaffected. |
137 | | -This design simplifies porting of pre-existing applications to SYCL. |
138 | | -Second, the design allows the optimal compiler to be chosen for each device |
139 | | -where different vendors may provide optimized tool-chains. |
140 | | - |
141 | | -To summarize, SYCL enables computational kernels to be written inside {cpp} |
142 | | -source files as normal {cpp} code, leading to the concept of "`single-source`" |
143 | | -programming. |
144 | | -This means that software developers can develop and use generic algorithms and |
145 | | -data structures using standard {cpp} template techniques, while still supporting |
146 | | -multi-platform, multi-device heterogeneous execution. |
147 | | -Access to the low level APIs of an underlying implementation (such as OpenCL) is |
148 | | -also supported. |
149 | | -The specification has been designed to enable implementation across as wide a |
150 | | -variety of platforms as possible as well as ease of integration with other |
151 | | -platform-specific technologies, thereby letting both users and implementers |
152 | | -build on top of SYCL as an open platform for system-wide heterogeneous |
153 | | -processing innovation. |
| 6 | +// What is SYCL? |
| 7 | +SYCL (pronounced "`sickle`") is a royalty-free, cross-platform API for |
| 8 | +heterogeneous computing in {cpp}. |
| 9 | + |
| 10 | +SYCL enables developers to write standard {cpp} code that executes on a wide |
| 11 | +range of devices, using modern techniques such as inheritance, templates, and |
| 12 | +lambda functions. |
| 13 | +All computational kernels to be executed on a device can be written inside {cpp} |
| 14 | +source files as normal {cpp} code, alongside any code intended to be run on a |
| 15 | +system's host processor. |
| 16 | +This concept, known as "`single-source`" programming, reduces the complexity of |
| 17 | +heterogeneous programming for developers and gives compilers greater |
| 18 | +opportunities to analyze and optimize across the host-device boundary. |
| 19 | + |
| 20 | +// How does SYCL relate to C++? |
| 21 | +SYCL is designed to be as close to standard {cpp} as possible, and some |
| 22 | +implementations of SYCL may be able to use a standard {cpp} compiler to target |
| 23 | +CPU devices. |
| 24 | +However, to ensure portability of device code across a wide range of devices, |
| 25 | +SYCL imposes some restrictions on the set of {cpp} features that SYCL |
| 26 | +implementations are required to support within device code. |
| 27 | +These restrictions may not be applicable to all devices and can therefore be |
| 28 | +relaxed by specific Khronos extensions or vendor extensions. |
| 29 | + |
| 30 | +// How does SYCL relate to lower-level APIs? |
| 31 | +SYCL was originally based on OpenCL, and retains an execution model, runtime |
| 32 | +feature set, and device capability set inspired by the OpenCL standard. |
| 33 | +However, SYCL implementations are not required to use OpenCL; SYCL |
| 34 | +implementations are free to support devices via any low-level API (or |
| 35 | +"`backend`") they choose. |
| 36 | + |
| 37 | +// What are some key features of SYCL? |
| 38 | +Some of the key features of SYCL are: |
| 39 | + |
| 40 | + * Common parallel patterns, such as <<sec:reduction, reductions>> and |
| 41 | + <<sec:algorithms, group algorithms>>, are exposed via high-level |
| 42 | + abstractions. |
| 43 | + |
| 44 | + * Interoperability with the lower-level capabilities of specific |
| 45 | + <<sec:backends, backends>> guarantees access to platform-specific |
| 46 | + optimizations. |
| 47 | + |
| 48 | + * <<subsec:buffers, Buffers>> and <<subsec:accessors, accessors>> provide a |
| 49 | + simple way to build task-graphs without manually managing dependencies. |
| 50 | + |
| 51 | + * <<sec:usm, Unified Shared Memory>> (USM) provides an explicit, |
| 52 | + pointer-based mechanism for managing and sharing data. |
| 53 | + |
| 54 | +// How would you summarize SYCL? |
| 55 | +SYCL has been designed to enable implementations on a wide variety of platforms, |
| 56 | +permitting easy integration with other platform-specific technologies. |
| 57 | +Both users and implementers are encouraged to build upon SYCL as an open |
| 58 | +platform for system-wide heterogeneous programming. |
154 | 59 |
|
155 | 60 |
|
156 | 61 | [[sec:normativerefs]] |
|