The wgpu backends
What do backends do?
The heavy lifting (i.e. communication with the hardware) in wgpu is performed by one of its backends.
Backends can be selected explicitly by importing them:
import wgpu.backends.wgpu_native
There is also an auto backend to help keep code portable:
import wgpu.backends.auto
In most cases, however, you don’t need any of the above imports, because a backend is automatically selected in the first call to wgpu.GPU.request_adapter().
Each backend can also provide additional (backend-specific) functionality. To keep the main API clean and portable, this extra functionality is provided as a functional API that has to be imported from the specific backend.
The wgpu_native backend
import wgpu.backends.wgpu_native
This backend wraps wgpu-native, which is a C API for wgpu-core, a Rust library that wraps Vulkan, Metal, DirectX12 and more. This is the main backend for wgpu-py. The only working backend, right now, to be precise. It also works out of the box, because the wgpu-native DLL is shipped with wgpu-py.
The wgpu_native backend provides a few extra functionalities:
- wgpu.backends.wgpu_native.request_device_sync(adapter, trace_path, *, label='', required_features, required_limits, default_queue)
An alternative to wgpu.GPUAdapter.request_device_sync(), that streams a trace of all low-level calls to disk, so the visualization can be replayed (also on other systems), investigated, and debugged.
The trace_path argument is ignored on drivers that do not support tracing.
- Parameters:
adapter – The adapter to create a device for.
trace_path – The path to an (empty) directory. Is created if it does not exist.
label – A human-readable label. Optional.
required_features – The features (extensions) that you need. Default [].
required_limits – The various limits that you need. Default {}.
default_queue – Descriptor for the default queue. Optional.
- Returns:
Device
- Return type:
GPUDevice
The wgpu_native backend provides support for push constants. Since WebGPU does not support this feature, documentation on their use is hard to find. A full explanation of push constants and their use in Vulkan can be found here. Using push constants in wgpu closely follows the Vulkan model.
The advantage of push constants is that they are typically faster to update than uniform buffers. Modifications to push constants are included in the command encoder; updating a uniform buffer involves sending a separate command to the GPU. The disadvantage of push constants is that their size limit is much smaller. The limit is guaranteed to be at least 128 bytes, and 256 bytes is typical.
Given an adapter, first determine if it supports push constants:
>>> "push-constants" in adapter.features
True
If push constants are supported, determine the maximum number of bytes that can be allocated for push constants:
>>> adapter.limits["max-push-constant-size"]
256
You must tell the adapter to create a device that supports push constants, and you must tell it the number of bytes of push constants that you are using. Overestimating is okay:
device = adapter.request_device_sync(
    required_features=["push-constants"],
    required_limits={"max-push-constant-size": 256},
)
Creating a push constant in your shader code is similar to the way you would create a uniform buffer. The fields that are only used in the @vertex shader should be separated from the fields that are only used in the @fragment shader, which in turn should be separated from the fields used in both shaders:
struct PushConstants {
    // vertex shader
    vertex_transform: mat4x4f,
    // fragment shader
    fragment_transform: mat4x4f,
    // used in both
    generic_transform: mat4x4f,
}
var<push_constant> push_constants: PushConstants;
To create the pipeline layout for this shader, use wgpu.backends.wgpu_native.create_pipeline_layout instead of device.create_pipeline_layout. It takes an additional argument, push_constant_layouts, describing the layout of the push constants. For the example above:
push_constant_layouts = [
    {"visibility": ShaderStage.VERTEX, "start": 0, "end": 64},
    {"visibility": ShaderStage.FRAGMENT, "start": 64, "end": 128},
    {"visibility": ShaderStage.VERTEX | ShaderStage.FRAGMENT, "start": 128, "end": 192},
]
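As a sanity check, the start/end byte offsets above can be derived from the field sizes: each mat4x4f field occupies 16 × 4 = 64 bytes. A small sketch in plain Python (the field names mirror the PushConstants struct above):

```python
# Each PushConstants field is a mat4x4f: 16 f32 values of 4 bytes = 64 bytes.
FIELD_SIZE = 16 * 4

fields = ["vertex_transform", "fragment_transform", "generic_transform"]

ranges = {}
offset = 0
for name in fields:
    ranges[name] = (offset, offset + FIELD_SIZE)  # (start, end) in bytes
    offset += FIELD_SIZE

print(ranges)
# → {'vertex_transform': (0, 64), 'fragment_transform': (64, 128), 'generic_transform': (128, 192)}
```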
Finally, you set the value of the push constants by using wgpu.backends.wgpu_native.set_push_constants:
set_push_constants(this_pass, ShaderStage.VERTEX, 0, 64, <64 bytes>)
set_push_constants(this_pass, ShaderStage.FRAGMENT, 64, 64, <64 bytes>)
set_push_constants(this_pass, ShaderStage.VERTEX | ShaderStage.FRAGMENT, 128, 64, <64 bytes>)
Bytes must be set separately for each of the three visibility groups. If the push constants have already been set, on the next use you only need to call set_push_constants on those bytes you wish to change.
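The <64 bytes> placeholders above stand for raw byte buffers. One way to produce such a buffer for a mat4x4f value, a sketch using only Python's standard struct module (the identity matrix here is just an example value):

```python
import struct

# A 4x4 identity matrix as 16 f32 values (WGSL mat4x4f is column-major,
# though for the identity matrix the ordering makes no difference).
identity = [
    1.0, 0.0, 0.0, 0.0,
    0.0, 1.0, 0.0, 0.0,
    0.0, 0.0, 1.0, 0.0,
    0.0, 0.0, 0.0, 1.0,
]
data = struct.pack("<16f", *identity)  # 16 little-endian floats = 64 bytes
assert len(data) == 64
```

The resulting bytes object can then be passed as the data argument of set_push_constants.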
- wgpu.backends.wgpu_native.create_pipeline_layout(device, *, label='', bind_group_layouts, push_constant_layouts=[])
This method provides the same functionality as wgpu.GPUDevice.create_pipeline_layout(), but accepts an extra push_constant_layouts argument. When using push constants, this argument is a list of dictionaries, where each dictionary has three fields: visibility, start, and end.
- Parameters:
device – The device on which we are creating the pipeline layout.
label – An optional label.
bind_group_layouts – As in wgpu.GPUDevice.create_pipeline_layout().
push_constant_layouts – Described above.
- wgpu.backends.wgpu_native.set_push_constants(render_pass_encoder, visibility, offset, size_in_bytes, data, data_offset=0)
This function requires that the underlying GPU implements push constants. These push constants are a buffer of bytes available to the fragment and vertex shaders. They are similar to a bound buffer, but the buffer is set using this function call.
- Parameters:
render_pass_encoder – The render pass encoder to which we are pushing constants.
visibility – The stages (vertex, fragment, or both) to which these constants are visible
offset – The offset into the push constants at which the bytes are to be written
size_in_bytes – The number of bytes to copy from the data
data – The data to copy to the buffer
data_offset – The starting offset in the data at which to begin copying.
There are two functions that allow you to perform multiple draw calls at once. Both require that you enable the feature “multi-draw-indirect”.
Typically, these calls do not reduce work or increase parallelism on the GPU. Rather they reduce driver overhead on the CPU.
- wgpu.backends.wgpu_native.multi_draw_indirect(render_pass_encoder, buffer, *, offset=0, count)
Equivalent to:
for i in range(count):
    render_pass_encoder.draw_indirect(buffer, offset + i * 16)
- Parameters:
render_pass_encoder – The current render pass encoder.
buffer – The indirect buffer containing the arguments.
offset – The byte offset in the indirect buffer containing the first argument.
count – The number of draw operations to perform.
- wgpu.backends.wgpu_native.multi_draw_indexed_indirect(render_pass_encoder, buffer, *, offset=0, count)
Equivalent to:
for i in range(count):
    render_pass_encoder.draw_indexed_indirect(buffer, offset + i * 20)
- Parameters:
render_pass_encoder – The current render pass encoder.
buffer – The indirect buffer containing the arguments.
offset – The byte offset in the indirect buffer containing the first argument.
count – The number of draw operations to perform.
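For reference, the 16-byte stride in multi_draw_indirect comes from the WebGPU indirect-draw argument layout of four 32-bit values (vertex_count, instance_count, first_vertex, first_instance), and the 20-byte stride in multi_draw_indexed_indirect from five values (index_count, instance_count, first_index, base_vertex, first_instance, with base_vertex signed). A sketch of packing such argument buffers on the CPU side with the standard struct module:

```python
import struct

def pack_draw_indirect_args(draws):
    """Pack (vertex_count, instance_count, first_vertex, first_instance)
    tuples into bytes: four u32 values, i.e. 16 bytes per draw."""
    return b"".join(struct.pack("<4I", *d) for d in draws)

def pack_draw_indexed_indirect_args(draws):
    """Pack (index_count, instance_count, first_index, base_vertex,
    first_instance) tuples: five values, base_vertex signed, 20 bytes per draw."""
    return b"".join(struct.pack("<IIIiI", *d) for d in draws)

# Two non-indexed draws -> 32 bytes; two indexed draws -> 40 bytes.
plain = pack_draw_indirect_args([(3, 1, 0, 0), (6, 2, 3, 0)])
indexed = pack_draw_indexed_indirect_args([(6, 1, 0, 0, 0), (3, 1, 6, -2, 1)])
assert len(plain) == 32 and len(indexed) == 40
```

These bytes would be written into a buffer with INDIRECT usage before being passed to the multi-draw functions.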
Some GPUs allow you to collect statistics on their pipelines. GPUs that support this have the feature “pipeline-statistics-query”, and you must enable this feature when requesting the device.
You create a query set using the function wgpu.backends.wgpu_native.create_statistics_query_set.
The possible statistics are:
PipelineStatisticName.VertexShaderInvocations = "vertex-shader-invocations"
    The number of times the vertex shader is called.
PipelineStatisticName.ClipperInvocations = "clipper-invocations"
    The number of triangles generated by the vertex shader.
PipelineStatisticName.ClipperPrimitivesOut = "clipper-primitives-out"
    The number of primitives output by the clipper.
PipelineStatisticName.FragmentShaderInvocations = "fragment-shader-invocations"
    The number of times the fragment shader is called.
PipelineStatisticName.ComputeShaderInvocations = "compute-shader-invocations"
    The number of times the compute shader is called.
The statistics argument is a list or a tuple of statistic names. Each element of the sequence must be either:
- the enumeration, e.g. PipelineStatisticName.FragmentShaderInvocations
- a camel-case string, e.g. "VertexShaderInvocations"
- a hyphenated string, e.g. "vertex-shader-invocations"
- an underscored string, e.g. "vertex_shader_invocations"
You may use any number of these statistics in a query set. Each result is an 8-byte unsigned integer, and the total size of each entry in the query set is 8 times the number of statistics chosen.
The statistics are always output to the query set in the order above, even if they are given in a different order in the list.
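Since each entry in the query set is a fixed-order sequence of 8-byte unsigned integers, decoding the buffer read back from the query set is straightforward. A sketch, assuming two statistics were requested (the synthetic bytes below stand in for a real readback):

```python
import struct

def decode_statistics(raw, stats, count):
    """Split a query-set readback into one dict per entry; each entry is
    len(stats) little-endian u64 values, in the fixed statistics order."""
    entry_size = 8 * len(stats)
    results = []
    for i in range(count):
        values = struct.unpack_from("<%dQ" % len(stats), raw, i * entry_size)
        results.append(dict(zip(stats, values)))
    return results

stats = ["vertex-shader-invocations", "fragment-shader-invocations"]
raw = struct.pack("<4Q", 300, 9000, 150, 4500)  # synthetic readback, 2 entries
entries = decode_statistics(raw, stats, 2)
assert entries[0]["fragment-shader-invocations"] == 9000
```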
- wgpu.backends.wgpu_native.create_statistics_query_set(device, count, statistics):
Create a query set that can hold count entries for the specified statistics. The statistics are specified as a sequence of names, as described above.
- Parameters:
device – The device.
count – Number of entries that go into the query set.
statistics – A sequence of strings giving the desired statistics.
- wgpu.backends.wgpu_native.begin_pipeline_statistics_query(encoder, query_set, index):
Start collecting statistics.
- Parameters:
encoder – The ComputePassEncoder or RenderPassEncoder.
query_set – The query set into which to save the result.
index – The index of the query set into which to write the result.
- wgpu.backends.wgpu_native.end_pipeline_statistics_query(encoder):
Stop collecting statistics and write them into the query set.
- Parameters:
encoder – The ComputePassEncoder or RenderPassEncoder.
The js_webgpu backend
import wgpu.backends.js_webgpu
This backend calls into the JavaScript WebGPU API. For this, the Python code would need access to JavaScript - this backend is intended for use-cases like PScript, PyScript, and RustPython.
This backend is still a stub, see issue #407 for details.