Nice article, though the diagram showing [OpenGL] [WebGL] [WebGPU] being built on Vulkan and then from Vulkan to D3D12 and Metal is wrong. WebGL and WebGPU go directly to D3D and Metal; they do not go through Vulkan first.
Also, Vulkan is labeled as Open Source. It is not open source.
There are other mistakes in that area as well. It claims WebGPU is limited to browsers. It is not. WebGPU is available as both a C++ (Dawn) and a Rust (WGPU) library. Both run on Windows, macOS, Linux, iOS, and Android. It is arguably the most cross-platform library. Tons of native projects use both libraries.
Vulkan is also not really cross-platform any more than DirectX. DirectX runs on 2 platforms (listed in the article). Vulkan runs on 2+ platforms: Android and Linux. It runs on Windows, but not on all Windows. For example, in a business context using Remote Desktop, Vulkan is rarely available. It is not part of Windows and is not installed by default. Graphics card companies (NVIDIA, AMD) include it; Windows itself does not. Vulkan also does not run on macOS or iOS.
To elaborate on this, Vulkan is an open _standard_ whose many implementations (user-mode drivers) may or may not be open source. Vulkan is just a header file; it's up to the various independent hardware vendors (IHVs) to implement it for their platform.
Also a small correction: Vulkan actually _does_ run on Apple platforms (via Vulkan-to-Metal translation) using MoltenVK and the new KosmicKrisp driver, and it works quite well.
> Vulkan is an open _standard_
That too kinda depends on where you draw the line: the spec text is freely available, but all development happens behind closed doors under strict NDA. From an outsider's perspective it doesn't feel very open to get completely stonewalled by Khronos on the status of a feature that debuted in DirectX or CUDA well over a year ago; they won't even confirm whether it's on their roadmap.
Sure, that's true and a pretty valid criticism of the system.
My definition of open is that Khronos doesn't restrict (as far as I know) what platforms Vulkan can be implemented on. The same cannot be said for DX12 or Metal.
Vulkan also runs on Apple Silicon without translation, under Linux.
> Also a small correction: Vulkan actually _does_ run on Apple platforms (via Vulkan-to-Metal translation) using MoltenVK and the new KosmicKrisp driver, and it works quite well.
I think, since they mentioned some enterprise deployments on Windows won't have Vulkan drivers preinstalled, that drivers merely being available is not enough for GP to count them as Vulkan "running". I think GP is only counting platforms (and circumstances) where you can reasonably expect Vulkan support to already be present.
Fair enough! In my humble opinion those specific circumstances outlined are a bit overly pedantic for the target audience of this article and for general practical purposes. The average Windows user can reasonably expect that they'll be able to write a Vulkan application without much fuss (until you run into the driver bugs, of course ;) ).
> Vulkan is also not really cross-platform any more than DirectX. [...]
Vulkan is not entirely cross-platform, but it's still way "more" cross-platform than DirectX by your own point of view.
DirectX:
- Windows
- Xbox

Vulkan:
- Linux
- Android
- Windows
- Nintendo Switch
- Nintendo Switch 2

Metal:
- macOS
- iOS
On modern Windows it depends on DirectX, as the new ICD infrastructure is part of the DirectX runtime component of the OS.
https://learn.microsoft.com/en-us/windows-hardware/drivers/d...
> It claims WebGPU is limited to browsers. It is not. WebGPU is available as both a C++ (Dawn) and a Rust (WGPU) library. Both run on Windows, macOS, Linux, iOS, and Android. It is arguably the most cross-platform library. Tons of native projects use both libraries.
I feel like it's important to mention that WebGPU is a unified API on top of whatever is native to the machine (DirectX, Vulkan or Metal).
Also, all Khronos APIs have endless extensions, many of which are proprietary and never made part of core; thus many applications cannot be ported across graphics card vendors or operating systems.
PlayStation also doesn't do Vulkan, even though people routinely say otherwise.
Also, while the Switch does OpenGL and Vulkan, it is really NVN that everyone who wants to get all the juice out of it uses.
At least Vulkan runs on Windows; that certainly makes it more cross-platform than DirectX. According to Google, some VNCs on Linux don’t support Vulkan, but I don’t think that’s a fair reason to suggest Vulkan doesn’t run on Linux in general, right? You’re right Vulkan isn’t part of the OS. Does that usually matter, if most people get Vulkan support through their driver?
The Vulkan specification is Open Source. Many Vulkan implementations are not.
Vulkan also isn't built on D3D at all, and only MoltenVK (an open-source implementation for Apple platforms) is built on Metal. (Edit: It appears Mesa also now has KosmicKrisp as a Vulkan translation layer for extremely recent (≤5 years) Mac devices.)
> WebGPU is available as both a C++ (Dawn) and a Rust (WGPU) library.
WebGPU is implemented by both a C++ library (Dawn) and a Rust crate (wgpu-rs), but like Vulkan, is itself only a specification. Also I'd still hesitate to even call wgpu-rs an implementation of WebGPU, because it's only based on the WebGPU specification, not actually an exact conforming implementation like Dawn is.
> Vulkan is also not really cross-platform any more than DirectX.
Sure it is: DirectX makes assumptions that it's running on Windows, and uses Windows data structures in its APIs. Vulkan makes no such assumptions, and uses no platform-specific data structures, making it automatically more cross-platform.
Sure, users of DirectX can be cross-platform with the use of translation layers such as Wine or Proton, but the API itself can't be ported natively to other platforms without bringing the Windows baggage with it, while Vulkan can.
Additionally, Vulkan provides a lot more metadata that can be used to adapt to differing situations at runtime, for example the ability to query the graphics devices on the system and select which one to use. DirectX did not have this capability at first, but added it two years after Vulkan did.
Depends on which extensions. The thing is, with Khronos APIs people always forget to mention the extension spaghetti that makes many use cases proprietary to a specific implementation.
For anyone looking for some IDEs to tinker around with shaders:
* shadertoy - in-browser, the most popular and easiest to get started with
* Shadron - my personal preference due to ease of use and high capability, but a bit niche
* SHADERed - the UX can take a bit of getting used to, but it gets the job done
* KodeLife - heard of it, never tried it
Cables[0] is pretty cool too. Kirell Benzi has released some impressive work using it [1].
[0]: https://cables.gl/
[1]: https://youtu.be/CltYdTVH7_A
Had a look in Mint's software manager and found this (flatpak/aur/macports/windows): https://github.com/fralonra/wgshadertoy
Also on macOS (and iPadOS) it's super easy to get started with Metal shaders in Playgrounds.
For swiftUI+metal specifically: https://metal.graphics
There's also bonzomatic which the demo scene uses for shader live coding competitions:
https://github.com/Gargaj/Bonzomatic
I see lots of shader-related videos, but to me the worst part of GPU code is that the setup is archaic and hard to understand.
Does anyone have a good resource for the stages such as:
- What kind of data formats do I design to pipe into the GPU? Describe like I'm five: texels, arrays, buffers, etc.
- Describe the difference between the data formats of a traditional 3D workflow and more modern compute shader data formats.
- Now that I supplied data, obviously I want to supply transformations as well. Transforms are not commutative, so it implies there is a sequential order in which transforms are applied, which seems to contradict this whole article.
- The above point is more abstractly part of "supplying data into the GPU at a later stage". Am I crossing the CPU-GPU boundary multiple times before a frame is complete? If so describe the process and how/why.
- There is some kind of global variable system in GPUs. Explain it. List every variable reachable from a fragment shader program.
You’re partly asking questions about APIs rather than questions about GPUs. It might help to identify any specific goals you have and any specific tools or APIs you intend to use.
In general, you can use just about any data format you want. Since GPUs are SIMT, it’s good to try to keep data access coherent in thread groups; that’s the high-level summary. There are various APIs that come with formats ready for you. Depends on whether you’re talking textures, geometry, audio, fields, NN weights, etc., etc.
I’m not sure I understand what you mean about sequential state contradicting the article. Shaders don’t necessarily need to deal with transforms (though they can). Transforms in shaders are applied sequentially within each thread, and are still parallel across threads.
Crossing the CPU-GPU boundary is an application-specific question. You can do that as many times as you have the budget for. Crossing that boundary might imply synchronization, which can affect performance.
For global variables, you might be thinking of shader uniforms, but I can’t tell. Uniforms are not globals; they are constants passed separately to each thread, even when the value is the same for each thread. You can use globals in CUDA with atomic instructions that cause threads to block when another thread is accessing the global. That costs performance, so people will avoid it when possible.
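To make that concrete, here is a minimal GLSL vertex shader sketch (the uniform and attribute names are made up for illustration): the uniform is the same read-only constant for every invocation, the attribute varies per vertex, and the transform runs sequentially inside each invocation while many invocations run in parallel.

    // Uniform: set once by the CPU per draw call, read-only and identical
    // for every invocation in that draw.
    uniform mat4 uModelViewProjection;

    // Attribute: a different value for each vertex invocation.
    in vec3 aPosition;

    void main() {
        // Applied sequentially within this one invocation,
        // in parallel across all the other vertices.
        gl_Position = uModelViewProjection * vec4(aPosition, 1.0);
    }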
I hope that helps. Answering these is a bigger question than just shaders, it’s more like understanding the pipeline and how GPUs work and knowing what APIs are available. An intro course to WebGL might be a good starting point.
Programming in general is about converting something you understand into something a computer understands, and making sure the computer can execute it fast enough.
This is already hard enough as it is, but GPU programming (at least in its current state) is an order of magnitude worse in my experience. Tons of ways to get tripped up, endless trivial/arbitrary things you need to know or do, a seemingly bottomless pit of abstraction that contains countless bugs or performance pitfalls, hardware disparity, software/platform disparity, etc. Oh right, and a near complete lack of tooling for debugging. What little tooling there is only ever works on one GPU backend, or one OS, or one software stack.
I’m by no means an expert, but I feel our GPU programming “developer experience” standards are woefully out of touch and the community seems happy to keep it that way.
OpenGL and pre-12 DirectX were attempts at unifying video programming in an abstract way. It turned out that trying to abstract away what the low-level hardware was doing was more harmful than beneficial.
> It turned out that trying to abstract away what the low-level hardware was doing was more harmful than beneficial.
Abstraction isn’t inherently problematic, but the _wrong_ abstraction is. Just because abstraction is hard to do well, doesn’t mean we shouldn’t try. Just because abstraction gets in the way of certain applications, doesn’t mean it’s not useful in others.
Not to say nobody is trying, but there’s a bit of a catch-22 where those most qualified to do something about it don’t see a problem with the status quo. This sort of thing happens in many technical fields, but I just have to pick on GPU programming because I’ve felt this pain for decades now and it hasn’t really budged.
Part of the problem is probably that the applications for GPUs have broadened and changed dramatically in the last decade or so, so it’s understandable that this moves slowly. I just want more people on the inside to acknowledge the problem.
Just absolutely beautiful execution, and this is a metric ton of work. I love the diagrams and the scrollbar and the style in general.
There are a few tiny conceptual things that maybe could smooth out and improve the story. I’m a fan of keeping writing for newcomers simple and accessible and not getting lost in details trying to be a Wikipedia on the subject, so I don’t know how much it matters, take these as notes or just pedantic nerdy nitpicks that you can ignore.
Shaders predate the GPU, they run perfectly fine on the CPU, and they’re used for ray tracing, so summarizing them as GPU programs for raster doesn’t explain what they are at all. Similarly, titling this as using an “x y coordinate” misses the point of shading, which is at its most basic to figure out the color of a sample, if we’re talking fragment shaders. Vertex shaders are just unfortunately misnamed; they’re not ‘shading’ anything. In that sense, a shader is just a specific kind of callback function. In OpenGL you get a callback for each vertex, and a callback for each pixel, and the callback’s (shader’s) job is to produce the final value for that element, given whatever inputs you want. In 3d scenes, fragment shaders rarely use x y coordinates. They use material, incoming light direction, and outgoing camera direction to “shade” the surface, i.e., figure out the color.
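To make the callback framing concrete, here is a minimal GLSL sketch (the names and inputs are illustrative, not from the article): a vertex callback that places each vertex, and a fragment callback that shades each sample from a normal and a light direction rather than from an x/y screen coordinate.

    // Vertex "callback": runs once per vertex, outputs that vertex's final position.
    uniform mat4 uMVP;
    in  vec3 aPosition;
    in  vec3 aNormal;
    out vec3 vNormal;

    void main() {
        vNormal = aNormal;
        gl_Position = uMVP * vec4(aPosition, 1.0);
    }

    // Fragment "callback" (a separate stage): runs once per covered sample,
    // produces its color from lighting inputs, not from screen coordinates.
    uniform vec3 uLightDir;   // normalized direction toward the light
    in  vec3 vNormal;
    out vec4 fragColor;

    void main() {
        float diffuse = max(dot(normalize(vNormal), uLightDir), 0.0);
        fragColor = vec4(vec3(diffuse), 1.0);  // simple Lambert shading
    }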
The section on GPUs is kind of missing the most important point about GPUs: the fact that neighboring threads share the same instruction. The important part is the SIMT/SIMD execution model. But shaders actually aren’t conceptually different from CPU programming; they are still (usually) a single-threaded programming model, not a SIMT programming model. They can be run in parallel, and shaders tend to do the same thing (same sequence of instructions) for every pixel, and that’s why they’re great and fast on the GPU, but they are programmed with the same sequential techniques we use on the CPU and they can be run sequentially on the CPU too; there are no special parallel programming techniques needed, nor a different mindset (as is suggested multiple times). The fact that you don’t need a different mindset in order to get massively parallel super fast execution is one of the reasons why shaders are so simple and elegant and effective.
The entire website is incredible. I'm amazed all the illustrations were done in Figma.
These are fantastic follow-up notes, thank you.
That, I think, is the most unintuitive part about writing fragment shaders. The idea that you take a couple of coordinates and output a color. Compared to traditional drawing, as with a pen and paper, you have to think in reverse.
For example, if you want to draw a square with a pen, you put your pen where the square is, draw the outlines, then fill it up. With a shader, for each pixel, you look at where you are, calculate where the pixel is relative to the square, and output the fill color if it is inside the square. If you want to draw another square to the right, with the pen you move your pen to the right, but with the shader you move the reference coordinates to the left. Another way to see it is that you don't manipulate objects; you manipulate the space around the objects.
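A rough shadertoy-style GLSL sketch of that square (the position and size here are made up, purely to illustrate the "look at where you are" idea):

    // Runs once per pixel; fragCoord is that pixel's position on screen.
    void mainImage(out vec4 fragColor, in vec2 fragCoord) {
        vec2 center = vec2(100.0, 100.0);          // where we want the square
        vec2 d = abs(fragCoord - center);          // pixel position relative to the square
        float inside = step(max(d.x, d.y), 50.0);  // 1.0 inside a 100x100 box, else 0.0
        // Moving the square right means shifting 'center', i.e. moving the reference frame.
        fragColor = mix(vec4(0.0, 0.0, 0.0, 1.0),  // background
                        vec4(1.0, 0.0, 0.0, 1.0),  // fill color
                        inside);
    }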
Vertex shaders are more natural as the output is the position of your triangles, like the position of your pen should you be drawing on paper.
I'd say the unintuitive part is mostly a problem only if you abuse fragment shaders for something they weren't meant to be used for. All the fancy drawings that people make on shadertoy are cool tricks, but you would very rarely do something like that in any practical use case. Fragment shaders weren't meant to be used for making arbitrary drawings; that's why you have high-level graphics APIs and content creation software.
They were meant to be a means for a more flexible last stage of the more or less traditional GPU pipeline. A normal shader would do something like sample a pixel from a texture using UV coordinates already interpolated by the GPU (you don't even have to convert x,y screen or world coordinates into texture UVs yourself), maybe from multiple textures (normal map, bump map, roughness, ...), combine it with the light direction, and calculate the final color for that specific pixel of the triangle. But the actual drawing structure comes mostly from geometry and textures, not the fragment shader. With the popularity of PBR and deferred rendering, a large fraction of objects can share the same common PBR shader parametrized by textures, with only some special effects using custom stuff.
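A rough GLSL sketch of that kind of "normal" fragment shader (the texture and uniform names are illustrative, and tangent-space details are omitted):

    uniform sampler2D uAlbedoMap;  // base color texture
    uniform sampler2D uNormalMap;  // normal map
    uniform vec3      uLightDir;   // normalized direction toward the light

    in  vec2 vUV;                  // UVs already interpolated by the rasterizer
    out vec4 fragColor;

    void main() {
        vec3 albedo = texture(uAlbedoMap, vUV).rgb;
        vec3 normal = normalize(texture(uNormalMap, vUV).rgb * 2.0 - 1.0);
        float diffuse = max(dot(normal, uLightDir), 0.0);
        fragColor = vec4(albedo * diffuse, 1.0);  // final color for this pixel of the triangle
    }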
For any programmable system, people will explore how far it can be pushed, but it shouldn't be a surprise that things get inconvenient and not so intuitive once you go beyond the normal use case. I don't think anyone is surprised that computing Fibonacci numbers using C++ templates isn't intuitive.
Yeah, I work professionally as a GPU/graphics programmer and even I have trouble sometimes wrapping my head around some of the fancy shadertoy one-liners. Most of the stuff I work with is conceptually much simpler. The math, if any, is generally really straightforward and derived from a handful of commonly used algorithms or PBR models. The more common work is performance tuning and scaling across multiple architectures/platforms. You might run into this stuff more if working with SDFs though.
I think what you’re describing is the difference between raster and vector graphics, and doesn’t reflect on shaders directly.
It always depends on your goals, of course. The goal of drawing with a pen is to draw outlines, but the goal behind rasterizing or ray tracing, and shading, is not to draw outlines, but often to render 3d scenes with physically based materials. Achieving that goal with a pen is extremely difficult and tedious and time consuming, which is why the way we render scenes doesn’t do that, it is closer to simulating bundles of light particles and approximating their statistical behavior.
Of course, painting is slightly closer to shading than pen drawing is.
I think their explanation is great. The shader is run on all the pixels within the quad and your shader code needs to figure out if the pixel is within the shape you want to draw or not. Compared to just drawing it pixel by pixel if you do it by pen or on the CPU.
For a red line between A and B:
CPU/pen: for each pixel between A and B: draw red
GPU/shader: for all pixels: draw red if it's on the line between A and B
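In shadertoy-style GLSL, the "for all pixels" version might look like this sketch (A, B, and the line thickness are made up):

    // Distance from point p to the segment A-B.
    float segDist(vec2 p, vec2 A, vec2 B) {
        vec2 AB = B - A;
        float t = clamp(dot(p - A, AB) / dot(AB, AB), 0.0, 1.0);
        return length(p - (A + t * AB));
    }

    void mainImage(out vec4 fragColor, in vec2 fragCoord) {
        vec2 A = vec2(50.0, 50.0);
        vec2 B = vec2(300.0, 200.0);
        // Every pixel asks the same question: am I (roughly) on the segment?
        float onLine = step(segDist(fragCoord, A, B), 1.5);
        fragColor = vec4(onLine, 0.0, 0.0, 1.0);  // red on the line, black elsewhere
    }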
Figuring out if a pixel is within a shape, or is on the A-B line, is part of the rasterizing step, not the shading. At least in the parent’s analogy. There are quite a few different ways to draw a red line between two points.
Also using CPU and GPU here isn’t correct. There is no difference in the way CPUs and GPUs draw things unless you choose different drawing algorithms.
While (I presume) technically correct I don't think your clarifications are helpful for someone trying to understand shaders. The only thing that made me understand (fragment) shaders was something similar to the parent's explanation. Do you have anything better?
It's not about the correct way to draw a square or a line, but about using something simple to illustrate the difference. How would you make a shader drawing a 10x10 pixel red square on shadertoy?
You’re asking a strange question that doesn’t get at why shaders exist. If you actually want to understand them, you must understand the bigger picture of how they fit into the pipeline, and what they are designed to do.
You can do line drawing on a CPU or GPU, and you don’t need to reach for shaders to do that. Shaders are not necessarily the right tool for that job, which is why comparing shaders to pen drawing makes it seem like someone is confused about what they want.
ShaderToy is fun and awesome, but it’s fundamentally a confusing abuse of what shaders were intended for. When you ask how to make a 10x10 pixel square, you’re asking how to make a procedural texture with a red square, you’re imposing a non-standard method of rendering on your question, and failing to talk about the way shaders work normally. To draw a red square the easy way, you render a quad (pair of triangles) and you assign a shader that returns red unconditionally. You tell the rasterizer the pixel coordinate corners of your square, and it figures out which pixels are in between the corners, before the shader is ever called.
If you are using fragment shaders to draw squares you're doing something wrong.
Shaders would be more for something like _shading_ the square.
Ray casting shaders are a thing. A very performant thing too.
It actually works the way you are describing it: vertex shader defines the boundaries of things to draw, and its fragment shader fills in the boundaries. For example, if you want to draw 1 billion ellipses, your vertex shader would enumerate 1 billion rectangles where those ellipses are to be drawn, and the fragment shader will fill only the necessary portion of those rectangles. But sometimes, for educational purposes, people omit the vertex shader and the fragment shader fills in the entire screen.
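A hedged GLSL sketch of that split of work, for one ellipse (assuming the vertex stage has already placed a bounding rectangle and passes a local coordinate running from -1 to 1 across it; the names are illustrative):

    // Fragment stage: the bounding rectangle was placed by the vertex stage.
    in  vec2 vLocal;    // -1..1 across the rectangle
    out vec4 fragColor;

    void main() {
        // Fill only the elliptical portion of the rectangle.
        if (dot(vLocal, vLocal) > 1.0) {
            discard;    // outside the ellipse: this pixel is left untouched
        }
        fragColor = vec4(1.0, 0.5, 0.0, 1.0);  // inside: fill color
    }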
Those 10,000 vertices got me acting up.
Wonderful explanation, and delightfully designed website.
This is really, really well made (and beautiful).
100% - Is it all a custom site? Looks like a Next.js app.
Beautifully explained, very well done.
I found the "What is a color space?" chapter even more interesting though, as it contains new things (for me).
Obligatory "this painting is a mathematical formula, a big function on the x and y coordinates of each pixel" video from Iñigo Quilez. https://www.youtube.com/watch?v=8--5LwHRhjk
Does anyone have any resources for learning how to do very beautiful clean technical drawings like this? I have some art skill but not the kind that translates to such clean technical drawings with this nice personality. Would love to be able to make some for my own projects.
From the home page:
Question: How do you make the illustrations?
Answer:
I get asked this more than anything else but honestly, I don't have a good answer.
I make them by hand, in Figma. There's no secret - it's as complicated as it looks.
The Advanced Edition of the book will include a tutorial explaining how I make them, where I get references and inspiration from.
This has some great examples, enough to get started and provide some inspiration, but is sadly incomplete.
https://rougier.github.io/python-opengl/book.html
Beautiful website!
This article is fricking OVERKILL to say "use a framebuffer".
I'm impressed by how well written the articles on this website are.
It seems that only 3 of them are ready though. I'm not sure why it asked me to enter a 'license key' (of what?)... are they paywalled?
Seems like the author is planning to publish a coffee table style book and I could not find a store page for pricing information. Seems a bit early. They have a mailing list though.
Really neat site though. I’ll be following.
You can do some pretty impressive things: https://shadertoy.com/
I skimmed it but didn't see any mention of "ray marching", which is raytracing done in a shader. GPUs are pretty fast now. You can just do that. However you do have to encode the scene geometry analytically in the shader - if you try to raytrace a big bag of triangles, it's still too slow. There's more info on this and other techniques at https://iquilezles.org/articles/
Nitpick: "raymarching" is not "raytracing done in a shader" and it's not polygon-based.
Raymarching is a raytracing technique that takes advantage of Signed Distance Functions to have a minimum bound on the ray's distance to complex surfaces, letting you march rays by discrete amounts using this distance[0]. If the distance is still large after a set number of steps the ray is assumed to have escaped the scene.
This allows tracing complex geometry cheaply because, unlike in traditional raytracing, you don't have to calculate each ray's intersection with a large number of analytical shapes (SDFs are O(1), analytical raytracing is O(n)).
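A minimal GLSL sphere-tracing loop, roughly as described (the scene here is a single sphere for illustration; real scenes combine many SDFs, and the step count and thresholds are arbitrary):

    // Signed distance from point p to a unit sphere at the origin.
    float sceneSDF(vec3 p) {
        return length(p) - 1.0;
    }

    // March from 'origin' along 'dir'; returns distance traveled, or -1.0 if the ray escaped.
    float raymarch(vec3 origin, vec3 dir) {
        float t = 0.0;
        for (int i = 0; i < 128; i++) {
            float d = sceneSDF(origin + t * dir);  // lower bound on distance to any surface
            if (d < 0.001) return t;               // close enough: treat it as a hit
            t += d;                                // safe to advance the ray by d
            if (t > 100.0) break;                  // very far away: assume the ray escaped
        }
        return -1.0;
    }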
There are disadvantages to raymarching. In particular, many useful operations on SDFs only produce a bounded result (actually a pseudo-SDF) that is not differentiable everywhere, might be non-Euclidean, etc., which can introduce artifacts in the rendering.
You can do analytical raytracing in fragment shaders[1].
[0] https://en.wikipedia.org/wiki/Ray_marching#Sphere_tracing Good visualization of raymarching steps
[1] https://www.shadertoy.com/view/WlXcRM Fragment shader using Monte Carlo raytracing (aka "path tracing")
SDFs still scale by geometry complexity, though. It costs instructions to evaluate each SDF component. You could still use something like a BVH (or Matt Keeter’s interval arithmetic trick) to speed things up.
> Raymarching is a raytracing technique…
Did you mean ‘raymarching is a technique…’? Otherwise you’re somewhat contradicting the first sentence, and also ray marching and ray tracing are two different techniques, which is what you’re trying to say, right?
Raymarching can be polygon based, if you want. It’s not usually on ShaderToy, but there’s no technical reason or rule against raymarching polygons. And use of Monte Carlo with ray tracing doesn’t necessarily imply path tracing, FWIW.
Sorry, let me clarify: the terms are used imprecisely.
Some people use "raytracing" only for the ray intersection technique, but some people (me included, in the post above) consider it an umbrella term and raymarching, path tracing, etc. only as specific techniques of raytracing.
So what I meant is "'raymarching' is not 'raytracing in shaders' but just a technique of raytracing, in shaders or not".
I was not correcting OP, just adding clarifications on top.
> Raymarching can be polygon based, if you want
But not polygon-intersection-based; it'd still be an SDF (of an implicit polygon).
Why were you expecting this article to specifically mention ray marching? It looks like a comprehensive beginner article on what shaders are, not an exhaustive list of what you can do with them.
The next chapter is about SDFs, but it is not available yet.
https://www.makingsoftware.com/chapters/rays-and-sdfs
Shaders technically don't even know their X and Y coordinates by default unless you specifically provide those, just as you can provide other coordinates (such as U and V for surfaces) or other values in general (either varying / input attributes, like fragment coordinates typically are, or uniforms, which are the same for every invocation).
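For instance, this is a complete GLSL fragment shader that receives no coordinates at all; it knows nothing about where it is on screen unless you pass that in (or read a built-in like gl_FragCoord where the API provides one):

    out vec4 fragColor;

    void main() {
        fragColor = vec4(1.0, 0.0, 1.0, 1.0);  // same color for every fragment
    }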
>Shaders technically don't even know their X and Y coordinates by default unless you specifically provide those
I'm kind of a boomer regarding shaders, but isn't gl_FragCoord always available?
Is the article about OpenGL? I see a few mentions of OpenGL in a section about graphics APIs that also mentions at least Vulkan, which doesn't automatically provide fragment coordinates, and WebGPU, which also doesn't. Shaders by default have no concept of fragment coordinates; it's OpenGL the API that introduces them by default.
The pixel position has to be known; how else are you rasterizing something?
Rasterizing and shading are two separate stages. You don’t need to know pixel position when shading. You can wire up the pixel coordinates, if you want, and they are often nearby, but it’s not necessary. This gets even more clear when you do deferred shading - storing what you need in a G-buffer, and running the shaders later, long after all rasterization is complete.
Technically, the (pixel) fragment shader stage happens after the rasterization stage.
The view transform doesn't necessarily have to be known to the fragment shader, though. That's usually in the realm of the vertex shader, but even the vertex shader doesn't have to know how things correspond to screen coordinates, for example if your API of choice represents coordinates as floats in [-0.5, 0.5) and all you feed it is vertex positions. (I experienced that with wgpu-rs.) You can rasterize things perfectly fine with just vertex positions; in fact, you can even hardcode vertex positions into the vertex shader and not have to input any coordinates at all.
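For example, a vertex shader can synthesize positions from nothing but the vertex index; this is the common fullscreen-triangle trick (shown in GLSL here rather than WGSL):

    // No vertex buffers at all: positions are derived from gl_VertexID.
    // Drawing 3 vertices with this covers the whole screen with one triangle.
    void main() {
        vec2 pos = vec2(float((gl_VertexID << 1) & 2), float(gl_VertexID & 2));
        gl_Position = vec4(pos * 2.0 - 1.0, 0.0, 1.0);
    }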
What a fantastic website, be sure to check out the rest of the content! makingsoftware.com