This article is about storage buffers and continues where the
previous article left off.
Storage buffers are similar to uniform buffers in many ways.
If all we did was change UNIFORM to STORAGE in our JavaScript
and var<uniform> to var<storage, read> in our WGSL, the examples
on the previous page would just work.
In fact, here are the differences, without renaming variables to have more
appropriate names.
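As a rough sketch of those two changes: the function name, the buffer label, and the ourStruct/OurStruct names below are assumptions for illustration, not necessarily the names the previous article used.

```javascript
// Sketch only: `device` is a GPUDevice obtained elsewhere. The label and the
// struct/variable names are illustrative assumptions.
function createStaticStorageBuffer(device, size) {
  return device.createBuffer({
    label: 'static storage buffer',
    size,
    // was: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
  });
}

// The matching WGSL declaration, before and after the change.
const uniformDeclaration =
    '@group(0) @binding(0) var<uniform> ourStruct: OurStruct;';
const storageDeclaration =
    uniformDeclaration.replace('var<uniform>', 'var<storage, read>');
```

Everything else (bind groups, writing data with writeBuffer, the draw calls) would stay as it was.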
Differences between uniform buffers and storage buffers
The major differences between uniform buffers and storage buffers are:
Uniform buffers can be faster for their typical use case.
It really depends on the use case. A typical app will need to draw
lots of different things. Say it’s a 3D game. The app might draw
cars, buildings, rocks, bushes, people, etc… Each of those will
require passing in orientations and material properties similar
to what our example above passes in. In this case, using a uniform buffer
is the recommended solution.
Storage buffers can be much larger than uniform buffers.
By default, the maximum size of a uniform buffer is 64 KiB (65536 bytes).
By default, the maximum size of a storage buffer is 128 MiB (134217728 bytes).
All implementations are required to support at least these sizes. We’ll cover how to check for and request larger limits in
detail in another article.
Storage buffers can be read/write; uniform buffers are read-only.
We saw an example of writing to a storage buffer in the compute shader
example in the first article.
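To put the default size limits in perspective, here is a small arithmetic sketch. The 32-byte element size is an assumption (a struct holding a vec4f color and a vec2f offset, padded to a 16-byte multiple); your own structs will differ.

```javascript
// Default limits quoted above, in bytes.
const kDefaultMaxUniformBufferBindingSize = 64 * 1024;          // 65536
const kDefaultMaxStorageBufferBindingSize = 128 * 1024 * 1024;  // 134217728

// Hypothetical per-object struct: vec4f color (16 bytes) + vec2f offset
// (8 bytes) + 8 bytes of padding.
const kAssumedStructSize = 32;

// How many whole elements of a given size fit within a size limit.
function maxElements(limitInBytes, elementSize) {
  return Math.floor(limitInBytes / elementSize);
}

console.log(maxElements(kDefaultMaxUniformBufferBindingSize, kAssumedStructSize));  // 2048
console.log(maxElements(kDefaultMaxStorageBufferBindingSize, kAssumedStructSize));  // 4194304
```

On a real device these values are exposed as the maxUniformBufferBindingSize and maxStorageBufferBindingSize limits.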
Given the first 2 points above, let’s take our last example and change it
to draw all 100 triangles in a single draw call. This is a use-case that
might fit storage buffers. I say might because, again, WebGPU is similar
to programming languages: there are many ways to achieve the same thing.
In JavaScript, array.forEach, for (const elem of array), and for (let i = 0; i < array.length; ++i) all loop over an array, and each has its uses. The same is true of WebGPU. Each thing we try to do
has multiple ways we can achieve it. When it comes to drawing triangles,
all that WebGPU cares about is that we return a value for @builtin(position) from
the vertex shader and return a color/value for @location(0) from the fragment shader.[1]
The first thing we’ll do is change our storage declarations to runtime-sized
arrays.
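A minimal sketch of what that could look like, together with the buffer sizes needed to hold one element per object. The struct names, their fields, and the binding numbers are assumptions made for this sketch.

```javascript
const kNumObjects = 100;

// Assumed struct layouts (names and fields are illustrative):
//   struct OurStruct   { color: vec4f, offset: vec2f }  -> 32 bytes with padding
//   struct OtherStruct { scale: vec2f }                 ->  8 bytes
const staticUnitSize = 4 * 4 + 2 * 4 + 2 * 4;  // color + offset + padding
const changingUnitSize = 2 * 4;                // scale

// One array element per object, so each buffer is unit size * object count.
const staticStorageBufferSize = staticUnitSize * kNumObjects;
const changingStorageBufferSize = changingUnitSize * kNumObjects;

// array<T> with no element count is a runtime-sized array: its length comes
// from the size of the buffer bound at draw time.
const storageDeclarations = `
@group(0) @binding(0) var<storage, read> ourStructs: array<OurStruct>;
@group(0) @binding(1) var<storage, read> otherStructs: array<OtherStruct>;
`;
```

With these assumed layouts, 100 objects need 3200 bytes for the static data and 800 bytes for the scales.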
We added a new parameter to our vertex shader called
instanceIndex and gave it the @builtin(instance_index) attribute
which means it gets its value from WebGPU for each “instance” drawn.
When we call draw, we can pass a second argument for number of instances
and for each instance drawn, the number of the instance being processed
will be passed to our function.
Using instanceIndex, we can get specific struct elements from our arrays
of structs.
We also need to get the color from the correct array element and use
it in our fragment shader. The fragment shader doesn’t have access to
@builtin(instance_index) because that would make no sense. We could pass
it as an inter-stage variable but it
would be more common to look up the color in the vertex shader and just pass
the color.
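Put together, the shader might look roughly like the sketch below. The struct names, field names, and triangle coordinates are assumptions; what matters is that the vertex shader indexes both arrays with instanceIndex and passes the color on through an inter-stage variable.

```javascript
// WGSL source held in a JavaScript string. All names here are illustrative.
const shaderCode = `
struct OurStruct {
  color: vec4f,
  offset: vec2f,
};

struct OtherStruct {
  scale: vec2f,
};

struct VSOutput {
  @builtin(position) position: vec4f,
  @location(0) color: vec4f,   // inter-stage variable carrying the color
};

@group(0) @binding(0) var<storage, read> ourStructs: array<OurStruct>;
@group(0) @binding(1) var<storage, read> otherStructs: array<OtherStruct>;

@vertex fn vs(
  @builtin(vertex_index) vertexIndex: u32,
  @builtin(instance_index) instanceIndex: u32
) -> VSOutput {
  let pos = array(
    vec2f( 0.0,  0.5),  // top center
    vec2f(-0.5, -0.5),  // bottom left
    vec2f( 0.5, -0.5)   // bottom right
  );

  // Pick this instance's data out of the two storage buffers.
  let otherStruct = otherStructs[instanceIndex];
  let ourStruct = ourStructs[instanceIndex];

  var vsOut: VSOutput;
  vsOut.position = vec4f(
      pos[vertexIndex] * otherStruct.scale + ourStruct.offset, 0.0, 1.0);
  vsOut.color = ourStruct.color;
  return vsOut;
}

@fragment fn fs(vsOut: VSOutput) -> @location(0) vec4f {
  return vsOut.color;
}
`;

// `device` is a GPUDevice obtained elsewhere.
function createInstancedShaderModule(device) {
  return device.createShaderModule({
    label: 'instanced triangles (sketch)',
    code: shaderCode,
  });
}
```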
pass.draw(3, kNumObjects);  // call our vertex shader 3 times for each instance
pass.end();
const commandBuffer = encoder.finish();
device.queue.submit([commandBuffer]);
}
The code above draws kNumObjects instances. For each instance,
WebGPU calls the vertex shader 3 times with vertex_index set to 0, 1, and 2,
and with instance_index set to that instance’s number, from 0 to kNumObjects - 1.
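To make that concrete, here is a small CPU-side model of the index pairs the vertex shader receives. It is only an illustration: the function name is made up, the GPU runs these invocations in parallel with no guaranteed order, and the sketch ignores the optional firstVertex and firstInstance arguments of draw.

```javascript
// Lists the (vertex_index, instance_index) pairs for draw(vertexCount, instanceCount).
function listInvocations(vertexCount, instanceCount) {
  const invocations = [];
  for (let instanceIndex = 0; instanceIndex < instanceCount; ++instanceIndex) {
    for (let vertexIndex = 0; vertexIndex < vertexCount; ++vertexIndex) {
      invocations.push({vertexIndex, instanceIndex});
    }
  }
  return invocations;
}

const kNumObjects = 100;
const invocations = listInvocations(3, kNumObjects);
console.log(invocations.length);  // 300
```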
We managed to draw all 100 triangles, each with a different scale, color, and
offset, with a single draw call. For situations where you want to draw lots
of instances of the same object, this is one way to do it.
Using storage buffers for vertex data
Until this point, we’ve used a hard-coded triangle directly in our shader.
One use case of storage buffers is to store vertex data. Just like we indexed
the current storage buffers by instance_index in our example above, we could
index another storage buffer with vertex_index to get vertex data.
Each vertex could be a bare position, but by making it a struct, it would
arguably be easier to add per-vertex data later.
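A minimal sketch of the idea follows, using a single triangle. The Vertex struct, the binding number, the function names, and the coordinates are all assumptions chosen for illustration.

```javascript
// Illustrative triangle: 3 vertices, 2 floats (x, y) each.
function createTriangleVertexData() {
  return new Float32Array([
     0.0,  0.5,  // top center
    -0.5, -0.5,  // bottom left
     0.5, -0.5,  // bottom right
  ]);
}

// `device` is a GPUDevice obtained elsewhere.
function createVertexStorageBuffer(device, vertexData) {
  const buffer = device.createBuffer({
    label: 'storage buffer vertices',
    size: vertexData.byteLength,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
  });
  device.queue.writeBuffer(buffer, 0, vertexData);
  return buffer;
}

// WGSL side: look up the position by vertex_index instead of hard-coding it.
const vertexShaderSnippet = `
struct Vertex {
  position: vec2f,
};

@group(0) @binding(2) var<storage, read> pos: array<Vertex>;

@vertex fn vs(
  @builtin(vertex_index) vertexIndex: u32
) -> @builtin(position) vec4f {
  return vec4f(pos[vertexIndex].position, 0.0, 1.0);
}
`;
```

A struct containing only a vec2f occupies 8 bytes per array element, so the 6 floats above line up with 3 Vertex elements. Adding more fields to the struct would change that stride.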
Passing in vertices via storage buffers is gaining popularity.
I’m told, though, that on some older devices it’s slower than the classic way,
which we’ll cover next in an article on vertex buffers.
[1] We can have multiple color attachments, and then we’ll need to return more colors/values for @location(1), @location(2), etc.