sokol_gfx.h — Simple 3D API Wrapper

Getting Started

Do this:

#define SOKOL_IMPL
// or
#define SOKOL_GFX_IMPL

before you include this file in one C or C++ file to create the implementation.

In the same place define one of the following to select the rendering backend:

#define SOKOL_GLCORE
#define SOKOL_GLES3
#define SOKOL_D3D11
#define SOKOL_METAL
#define SOKOL_WGPU
#define SOKOL_VULKAN
#define SOKOL_DUMMY_BACKEND

For example, for desktop GL it should look like this:

#include ...
#include ...
#define SOKOL_IMPL
#define SOKOL_GLCORE
#include "sokol_gfx.h"

The dummy backend replaces the platform-specific backend code with empty stub functions. This is useful for writing tests that need to run on the command line.

Optional Defines

Optionally provide the following defines with your own implementations:

Define Purpose
SOKOL_ASSERT(c) Your own assert macro (default: assert(c))
SOKOL_UNREACHABLE() Guard macro for unreachable code (default: assert(false))
SOKOL_GFX_API_DECL Public function declaration prefix (default: extern)
SOKOL_API_DECL Same as SOKOL_GFX_API_DECL
SOKOL_API_IMPL Public function implementation prefix (default: -)
SOKOL_TRACE_HOOKS Enable trace hook callbacks (search below for TRACE HOOKS)
SOKOL_EXTERNAL_GL_LOADER Indicates that you're using your own GL loader; sokol_gfx.h will not include any platform GL headers and will disable the integrated Win32 GL loader

If sokol_gfx.h is compiled as a DLL, define the following before including the declaration or implementation:

SOKOL_DLL

On Windows, SOKOL_DLL will define SOKOL_GFX_API_DECL as __declspec(dllexport) or __declspec(dllimport) as needed.

Optionally define the following to force debug checks and validations even in release mode:

SOKOL_DEBUG  // by default this is defined if NDEBUG is not defined

System Library Linking

Link with the following system libraries (note that sokol_app.h has additional linker requirements):

  • On macOS/iOS with Metal: Metal
  • On macOS with GL: OpenGL
  • On iOS with GL: OpenGLES
  • On Linux with EGL: GL or GLESv2
  • On Linux with GLX: GL
  • On Linux with Vulkan: vulkan
  • On Android: GLESv3, log, android
  • On Windows:
    • With Vulkan: link with vulkan-1 (this is explicit in case you want to use your own Vulkan loader library)
    • With D3D11:
      • On MSVC or Clang: no action needed, libs are defined in-source via pragma-comment-lib
      • On MINGW/MSYS2 gcc: compile with -mwin32 so that _WIN32 is defined and link with -ld3d11
    • With GL: no linking needed since sokol_gfx.h comes with its own GL loader on Windows

On macOS and iOS, the implementation must be compiled as Objective-C.

For Linux+Vulkan install the following packages (or equivalents):

  • libvulkan-dev
  • vulkan-validationlayers
  • vulkan-tools

For Windows+Vulkan install the Vulkan SDK and in your build system:

  • Add a header search path to $ENV{VULKAN_SDK}/Include
  • Add a link search path to $ENV{VULKAN_SDK}/Lib

On Emscripten:

What sokol_gfx Does NOT Do

  • Create a window, swapchain or the 3D-API context/device. You must do this before sokol_gfx is initialized, and pass any required information (like 3D device pointers) to the sokol_gfx initialization call.
  • Present the rendered frame. How this is done exactly usually depends on how the window and 3D-API context/device were created.
  • Provide a unified shader language. Instead, 3D-API-specific shader source-code or shader-bytecode must be provided. For the "official" offline shader cross-compiler / code-generator, see: https://github.com/floooh/sokol-tools/blob/master/docs/sokol-shdc.md

Step By Step

Initialization

To initialize sokol_gfx, after creating a window and a 3D-API context/device, call:

sg_setup(const sg_desc*)

Depending on the selected 3D backend, sokol-gfx requires some information about its runtime environment, like a GPU device pointer, default swapchain pixel formats and so on. If you are using sokol_app.h for the window system glue, you can use a helper function provided in the sokol_glue.h header:

#include "sokol_gfx.h"
#include "sokol_app.h"
#include "sokol_glue.h"
//...
sg_setup(&(sg_desc){
    .environment = sglue_environment(),
});

To get any logging output for errors and from the validation layer, you need to provide a logging callback. The easiest way is through sokol_log.h:

#include "sokol_log.h"
//...
sg_setup(&(sg_desc){
    //...
    .logger.func = slog_func,
});

Creating Resources

Create resource objects (buffers, images, views, samplers, shaders and pipeline objects):

sg_buffer sg_make_buffer(const sg_buffer_desc*)
sg_image sg_make_image(const sg_image_desc*)
sg_view sg_make_view(const sg_view_desc*)
sg_sampler sg_make_sampler(const sg_sampler_desc*)
sg_shader sg_make_shader(const sg_shader_desc*)
sg_pipeline sg_make_pipeline(const sg_pipeline_desc*)

Render and Compute Passes

Start a render- or compute-pass:

sg_begin_pass(const sg_pass* pass);

Typically, render passes render into an externally provided swapchain which presents the rendering result on the display. Such a 'swapchain pass' is started like this:

sg_begin_pass(&(sg_pass){ .action = { ... }, .swapchain = sglue_swapchain() })

...where .action is an sg_pass_action struct containing actions to be performed at the start and end of a render pass (such as clearing the render surfaces to a specific color), and .swapchain is an sg_swapchain struct with all the required information to render into the swapchain's surfaces.

To start an 'offscreen render pass' into sokol-gfx image objects, populate the sg_pass.attachments nested struct with attachment view objects (1..4 color-attachment-views to render into, a depth-stencil-attachment-view to provide the depth-stencil-buffer, and optionally 1..4 resolve-attachment-views for an MSAA-resolve operation):

sg_begin_pass(&(sg_pass){
    .action = { ... },
    .attachments = {
        .colors[0] = color_attachment_view,
        .resolves[0] = optional_resolve_attachment_view,
        .depth_stencil = depth_stencil_attachment_view,
    },
});

To start a compute-pass, just set the .compute item to true:

sg_begin_pass(&(sg_pass){ .compute = true });

Pipeline State and Bindings

Set the pipeline state for the next draw call with:

sg_apply_pipeline(sg_pipeline pip)

Fill an sg_bindings struct with the resource bindings for the next draw- or dispatch-call (0..N vertex buffers, 0 or 1 index buffer, 0..N views, 0..N samplers), and call:

sg_apply_bindings(const sg_bindings* bindings)

...to update the resource bindings. Note that in a compute pass, no vertex- or index-buffer bindings can be used, and in render passes, no storage-image bindings are allowed. Those restrictions will be checked by the sokol-gfx validation layer.

Uniforms

Optionally update shader uniform data with:

sg_apply_uniforms(int ub_slot, const sg_range* data)

Read the section Uniform Data Layout to learn about the expected memory layout of the uniform data passed into sg_apply_uniforms().

Drawing

Kick off a draw call with:

sg_draw(int base_element, int num_elements, int num_instances)

The sg_draw() function unifies all the different ways to render primitives in a single call (indexed vs non-indexed rendering, and instanced vs non-instanced rendering). In case of indexed rendering, base_element and num_elements specify indices in the currently bound index buffer. In case of non-indexed rendering, base_element and num_elements specify vertices in the currently bound vertex-buffer(s). To perform instanced rendering, the rendering pipeline must be set up for instancing (see sg_pipeline_desc below), a separate vertex buffer containing per-instance data must be bound, and the num_instances parameter must be > 1.

Alternatively, call:

sg_draw_ex(...)

to provide a base-vertex and/or base-instance which allows rendering from different sections of a vertex buffer without rebinding the vertex buffer with a different offset. Note that sg_draw_ex() only has limited portability on OpenGL: check the sg_features struct members .draw_base_vertex and .draw_base_instance for runtime support. Those are generally true on non-GL-backends, and on GL the feature flags are set according to the GL version:

  • On GL, base_instance != 0 is only supported since GL 4.2
  • On GLES3.x, base_instance != 0 is not supported
  • On GLES3.x, base_vertex is only supported since GLES3.2 (e.g. not supported on WebGL2)
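
The portability rules above can be turned into a small guard before calling sg_draw_ex(). The following is only a sketch in plain C: the draw_ex_support struct and the draw_ex_args_supported() helper are hypothetical names (not part of sokol_gfx.h), and the two booleans are assumed to be copied from the .draw_base_vertex and .draw_base_instance runtime flags:

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical mirror of the two runtime flags mentioned above
// (assumption: filled in by the application from the runtime query).
typedef struct {
    bool draw_base_vertex;
    bool draw_base_instance;
} draw_ex_support;

// Returns true if a draw with the given base values should be portable
// on the current backend; zero base values are always fine.
static bool draw_ex_args_supported(const draw_ex_support* s, int base_vertex, int base_instance) {
    assert(s);
    if ((base_vertex != 0) && !s->draw_base_vertex) {
        return false;
    }
    if ((base_instance != 0) && !s->draw_base_instance) {
        return false;
    }
    return true;
}
```

If the check fails, a possible fallback is to rebind the vertex buffer with a different offset and use a regular sg_draw() call instead.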

Compute Dispatch

...or kick off a dispatch call to invoke a compute shader workload:

sg_dispatch(int num_groups_x, int num_groups_y, int num_groups_z)

The dispatch args define the number of 'compute workgroups' processed by the currently applied compute shader.
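
Since the dispatch args count workgroups and not individual work items, the item count usually needs to be divided by the workgroup size and rounded up. A minimal sketch, assuming a one-dimensional workload; num_workgroups() is a hypothetical helper and group_size must match the local workgroup size declared in your compute shader:

```c
#include <assert.h>

// Number of workgroups needed to cover `num_items` work items when each
// workgroup processes `group_size` items (rounds up so no item is skipped).
// `group_size` is an assumption: it must match the shader's local size.
static int num_workgroups(int num_items, int group_size) {
    assert((num_items >= 0) && (group_size > 0));
    return (num_items + group_size - 1) / group_size;
}
```

For 1000 items and a workgroup size of 64 this yields 16 groups, so the call would be sg_dispatch(num_workgroups(1000, 64), 1, 1). The last group then contains 24 surplus invocations, which the shader should skip with a bounds check.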

Ending Passes and Frames

Finish the current pass with:

sg_end_pass()

When done with the current frame, call:

sg_commit()

At the end of your program, shutdown sokol_gfx with:

sg_shutdown()

Destroying Resources

If you need to destroy resources before sg_shutdown(), call:

sg_destroy_buffer(sg_buffer buf)
sg_destroy_image(sg_image img)
sg_destroy_sampler(sg_sampler smp)
sg_destroy_shader(sg_shader shd)
sg_destroy_pipeline(sg_pipeline pip)
sg_destroy_view(sg_view view)

Viewport and Scissor

To set a new viewport rectangle, call:

sg_apply_viewport(int x, int y, int width, int height, bool origin_top_left)

...or with float values:

sg_apply_viewportf(float x, float y, float width, float height, bool origin_top_left)

To set a new scissor rect, call:

sg_apply_scissor_rect(int x, int y, int width, int height, bool origin_top_left)

...or with float values:

sg_apply_scissor_rectf(float x, float y, float width, float height, bool origin_top_left)

Both sg_apply_viewport() and sg_apply_scissor_rect() must be called inside a rendering pass (i.e. not in a compute pass, and not outside a pass).

Note that sg_begin_pass() will reset both the viewport and scissor rectangles to cover the entire framebuffer.
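
The origin_top_left argument decides which corner of the framebuffer the y coordinate is measured from. As a rough illustration in plain C (the helper name is hypothetical and this is not how sokol_gfx.h necessarily handles it internally), the two conventions are related like this:

```c
#include <assert.h>

// Convert the y coordinate of a rectangle given with a top-left origin
// into the equivalent y coordinate for a bottom-left origin.
// `fb_height` is the height of the current framebuffer in pixels.
static int rect_y_top_left_to_bottom_left(int y, int height, int fb_height) {
    assert((height >= 0) && (fb_height >= 0));
    return fb_height - (y + height);
}
```

A 20 pixel high rectangle at y=10 (top-left origin) in a 100 pixel high framebuffer starts at y=70 when measured from the bottom.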

Updating Buffers and Images

To update (overwrite) the content of buffer and image resources, call:

sg_update_buffer(sg_buffer buf, const sg_range* data)
sg_update_image(sg_image img, const sg_image_data* data)

Buffers to be updated must have been created with sg_buffer_desc.usage.dynamic_update or .stream_update, and images with the corresponding sg_image_desc.usage flags.

Only one update per frame is allowed for buffer and image resources when using the sg_update_*() functions. The rationale is to have a simple protection from the CPU scribbling over data the GPU is currently using, or the CPU having to wait for the GPU.
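
If your application cannot easily guarantee the one-update-per-frame rule, it can track updates itself. The following is a sketch of such a guard in plain C; resource_model and try_update() are hypothetical names, and the frame counter is assumed to be maintained by the application (sokol_gfx's own bookkeeping is not described here and may differ):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical stand-in for a dynamic resource; it only remembers the
// frame in which it was last updated.
typedef struct {
    uint32_t last_update_frame;
    bool ever_updated;
} resource_model;

// Returns true if an update is allowed in `frame_index` and records it,
// returns false for a second update attempt within the same frame.
static bool try_update(resource_model* res, uint32_t frame_index) {
    assert(res);
    if (res->ever_updated && (res->last_update_frame == frame_index)) {
        return false;
    }
    res->last_update_frame = frame_index;
    res->ever_updated = true;
    return true;
}
```

The application would call try_update() before sg_update_buffer() or sg_update_image() and increment its frame counter once per sg_commit().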

Buffer and image updates can be partial, as long as a rendering operation only references the valid (updated) data in the buffer or image.

Appending to Buffers

To append a chunk of data to a buffer resource, call:

int sg_append_buffer(sg_buffer buf, const sg_range* data)

The difference to sg_update_buffer() is that sg_append_buffer() can be called multiple times per frame to append new data to the buffer piece by piece, optionally interleaved with draw calls referencing the previously written data.

sg_append_buffer() returns a byte offset to the start of the written data. This offset can be assigned to sg_bindings.vertex_buffer_offsets[n] or sg_bindings.index_buffer_offset.

Code example:

for (...) {
    const void* data = ...;
    const int num_bytes = ...;
    int offset = sg_append_buffer(buf, &(sg_range) { .ptr=data, .size=num_bytes });
    bindings.vertex_buffer_offsets[0] = offset;
    sg_apply_pipeline(pip);
    sg_apply_bindings(&bindings);
    sg_apply_uniforms(...);
    sg_draw(...);
}

A buffer to be used with sg_append_buffer() must have been created with sg_buffer_desc.usage.dynamic_update or .stream_update.

If the application appends more data to the buffer than fits into the buffer, the buffer will go into the "overflow" state for the rest of the frame.

Any draw calls attempting to render an overflown buffer will be silently dropped (in debug mode this will also result in a validation error).

You can also check manually if a buffer is in overflow-state by calling:

bool sg_query_buffer_overflow(sg_buffer buf)

You can manually check to see if an overflow would occur before adding any data to a buffer by calling:

bool sg_query_buffer_will_overflow(sg_buffer buf, size_t size)

NOTE: Due to restrictions in underlying 3D-APIs, appended chunks of data will be 4-byte aligned in the destination buffer. This means that there will be gaps in index buffers containing 16-bit indices when the number of indices in a call to sg_append_buffer() is odd. This isn't a problem when each call to sg_append_buffer() is associated with one draw call, but will be problematic when a single indexed draw call spans several appended chunks of indices.
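
To make the alignment gap and the overflow state more tangible, here is a simplified model of an append cursor in plain C. It is a sketch only: append_model, align4() and append_chunk() are hypothetical names and do not show the actual sokol_gfx.h internals:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical model of an append-buffer cursor (not sokol_gfx internals).
typedef struct {
    size_t capacity;    // total buffer size in bytes
    size_t append_pos;  // byte offset where the next chunk will start
    bool overflow;      // set once a chunk did not fit
} append_model;

// Round `n` up to the next multiple of 4.
static size_t align4(size_t n) {
    return (n + 3u) & ~(size_t)3u;
}

// Append `num_bytes` and return the byte offset where the chunk starts.
// If the chunk does not fit, nothing is appended and the overflow flag is set.
static size_t append_chunk(append_model* buf, size_t num_bytes) {
    assert(buf);
    const size_t start = buf->append_pos;
    if ((start + num_bytes) > buf->capacity) {
        buf->overflow = true;
        return start;
    }
    buf->append_pos = start + align4(num_bytes);
    return start;
}
```

Appending three 16-bit indices (6 bytes) twice returns the offsets 0 and 8, not 0 and 6. The 2 bytes in between are the gap that would be misread as an index by a single indexed draw call spanning both chunks.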

Runtime Queries

To check at runtime for optional features, limits and pixelformat support, call:

sg_features sg_query_features()
sg_limits sg_query_limits()
sg_pixelformat_info sg_query_pixelformat(sg_pixel_format fmt)

If you need to call into the underlying 3D-API directly, you must call:

sg_reset_state_cache()

...before calling sokol_gfx functions again.

You can inspect the original sg_desc structure handed to sg_setup() by calling sg_query_desc(). This will return an sg_desc struct with the default values patched in instead of any zero-initialized values.

You can get a desc struct matching the creation attributes of a specific resource object via:

sg_buffer_desc sg_query_buffer_desc(sg_buffer buf)
sg_image_desc sg_query_image_desc(sg_image img)
sg_sampler_desc sg_query_sampler_desc(sg_sampler smp)
sg_shader_desc sg_query_shader_desc(sg_shader shd)
sg_pipeline_desc sg_query_pipeline_desc(sg_pipeline pip)
sg_view_desc sg_query_view_desc(sg_view view)

...but NOTE that the returned desc structs may be incomplete: only creation attributes that are kept around internally after resource creation will be filled in, and in some cases (like shaders) that's very little. Any missing attributes will be set to zero. The returned desc structs might still be useful as a partial blueprint for creating similar resources if filled up with the missing attributes.

Calling the query-desc functions on an invalid resource will return completely zeroed structs (it makes sense to check the resource state with sg_query_*_state() first).

You can query the default resource creation parameters through the functions:

sg_buffer_desc sg_query_buffer_defaults(const sg_buffer_desc* desc)
sg_image_desc sg_query_image_defaults(const sg_image_desc* desc)
sg_sampler_desc sg_query_sampler_defaults(const sg_sampler_desc* desc)
sg_shader_desc sg_query_shader_defaults(const sg_shader_desc* desc)
sg_pipeline_desc sg_query_pipeline_defaults(const sg_pipeline_desc* desc)
sg_view_desc sg_query_view_defaults(const sg_view_desc* desc)

These functions take a pointer to a desc structure which may contain zero-initialized items for default values. These zero-init values will be replaced with their concrete values in the returned desc struct.

You can inspect various internal resource runtime values via:

sg_buffer_info sg_query_buffer_info(sg_buffer buf)
sg_image_info sg_query_image_info(sg_image img)
sg_sampler_info sg_query_sampler_info(sg_sampler smp)
sg_shader_info sg_query_shader_info(sg_shader shd)
sg_pipeline_info sg_query_pipeline_info(sg_pipeline pip)
sg_view_info sg_query_view_info(sg_view view)

...please note that the returned info-structs are tied quite closely to sokol_gfx.h internals, and may change more often than other public API functions and structs.

You can query the type/flavour and parent resource of a view:

sg_view_type sg_query_view_type(sg_view view)
sg_image sg_query_view_image(sg_view view)
sg_buffer sg_query_view_buffer(sg_view view)

You can query stats and control stats collection via:

sg_query_stats()
sg_enable_stats()
sg_disable_stats()
sg_stats_enabled()

You can ask at runtime what backend sokol_gfx.h has been compiled for:

sg_backend sg_query_backend(void)

Pitch Helper Functions

Call the following helper functions to compute the number of bytes in a texture row or surface for a specific pixel format. These functions might be helpful when preparing image data for consumption by sg_make_image() or sg_update_image():

int sg_query_row_pitch(sg_pixel_format fmt, int width, int row_align_bytes);
int sg_query_surface_pitch(sg_pixel_format fmt, int width, int height, int row_align_bytes);

Width and height are generally in number of pixels, but note that 'row' has different meaning for uncompressed vs compressed pixel formats: for uncompressed formats, a row is identical with a single line of pixels, while in compressed formats, one row is a line of compression blocks.

This is why calling sg_query_surface_pitch() for a compressed pixel format and height N, N+1, N+2, ... may return the same result.

The row_align_bytes parameter is for added flexibility. For image data that goes into the sg_make_image() or sg_update_image() this should generally be 1, because these functions take tightly packed image data as input no matter what alignment restrictions exist in the backend 3D APIs.
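
The difference between uncompressed and compressed rows can be sketched in plain C. This is an illustration under stated assumptions, not the sokol_gfx.h implementation: the helpers below are hypothetical and take the bytes per pixel or per block directly instead of an sg_pixel_format, and the compressed case assumes 4x4 pixel blocks (as used for instance by BC1 with 8 bytes per block):

```c
#include <assert.h>

// Round `val` up to a multiple of `align` (align must be > 0).
static int round_up(int val, int align) {
    assert(align > 0);
    return ((val + align - 1) / align) * align;
}

// Row pitch in bytes for an uncompressed format: one row is a line of pixels.
static int row_pitch_uncompressed(int width, int bytes_per_pixel, int row_align_bytes) {
    return round_up(width * bytes_per_pixel, row_align_bytes);
}

// Row pitch in bytes for a format with 4x4 pixel compression blocks:
// one row is a line of blocks.
static int row_pitch_block4x4(int width, int bytes_per_block, int row_align_bytes) {
    const int blocks_per_row = (width + 3) / 4;
    return round_up(blocks_per_row * bytes_per_block, row_align_bytes);
}

// Surface pitch for the block-compressed case: rows are counted in blocks,
// so up to four different pixel heights map to the same number of rows.
static int surface_pitch_block4x4(int width, int height, int bytes_per_block, int row_align_bytes) {
    const int num_rows = (height + 3) / 4;
    return num_rows * row_pitch_block4x4(width, bytes_per_block, row_align_bytes);
}
```

With 8 bytes per block, a 16 pixel wide surface has the same surface pitch of 128 bytes for the heights 13, 14, 15 and 16, and only grows to 160 bytes at height 17.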

On Initialization

When calling sg_setup(), a pointer to an sg_desc struct must be provided which contains initialization options. These options provide two types of information to sokol-gfx:

(1) Upper bounds and limits needed to allocate various internal data structures:

  • The max number of resources of each type that can be alive at the same time. This is used for allocating internal pools.
  • The max overall size of uniform data that can be updated per frame, including a worst-case alignment per uniform update (this worst-case alignment is 256 bytes).
  • The max size of all dynamic resource updates (sg_update_buffer, sg_append_buffer and sg_update_image) per frame.
  • The max number of compute-dispatch calls in a compute pass.

Not all of those limit values are used by all backends, but it is good practice to provide them nonetheless.
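
When estimating the per-frame uniform data limit, the worst-case alignment matters more than the raw struct sizes. A minimal sketch in plain C, assuming every uniform update is padded to the 256 byte worst case; worst_case_uniform_bytes() is a hypothetical helper, and the resulting number is only an upper-bound estimate for the corresponding sg_desc item:

```c
#include <assert.h>
#include <stddef.h>

// Worst-case alignment per uniform update, as stated above.
#define UNIFORM_ALIGN (256u)

// Upper bound for the per-frame uniform data in bytes, assuming each
// update is padded to a multiple of UNIFORM_ALIGN.
static size_t worst_case_uniform_bytes(size_t num_updates, size_t bytes_per_update) {
    assert(bytes_per_update > 0);
    const size_t padded = ((bytes_per_update + UNIFORM_ALIGN - 1u) / UNIFORM_ALIGN) * UNIFORM_ALIGN;
    return num_updates * padded;
}
```

For example, 1000 updates of a 64 byte uniform block per frame need up to 256000 bytes in the worst case, four times the 64000 bytes of actual payload.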

(2) 3D backend "environment information" in a nested sg_environment struct:

  • Pointers to backend-specific context- or device-objects (for instance the D3D11, WebGPU or Metal device objects).
  • Defaults for external swapchain pixel formats and sample counts. These will be used as default values in image and pipeline objects, and the sg_swapchain struct passed into sg_begin_pass().

Usually you provide a complete sg_environment struct through a helper function — as an example, look at the sglue_environment() function in the sokol_glue.h header.

See the documentation block of the sg_desc struct below for more information.

On Render Passes

A render pass groups rendering commands into a set of render target images (called 'render pass attachments'). Render target images can be used in subsequent passes as textures (it is invalid to use the same image both as render target and as texture in the same pass).

The following sokol-gfx functions must only be called inside a render-pass:

sg_apply_viewport[f]
sg_apply_scissor_rect[f]
sg_draw

The following functions may be called inside a render- or compute-pass, but not outside a pass:

sg_apply_pipeline
sg_apply_bindings
sg_apply_uniforms

Swapchain Passes

A frame must have at least one 'swapchain render pass' which renders into an externally provided swapchain, described by an sg_swapchain struct passed to the sg_begin_pass() function. If you use sokol_gfx.h together with sokol_app.h, just call the sglue_swapchain() helper function in sokol_glue.h to provide the swapchain information. Otherwise the following information must be provided:

  • The color pixel-format of the swapchain's render surface
  • An optional depth/stencil pixel format if the swapchain has a depth/stencil buffer
  • An optional sample-count for MSAA rendering
  • NOTE: the above three values can be zero-initialized, in that case the defaults from the sg_environment struct will be used that had been passed to the sg_setup() function.
  • A number of backend-specific objects:
    • GL/GLES3: just a GL framebuffer handle
    • D3D11:
      • An ID3D11RenderTargetView for the rendering surface
      • If MSAA is used, an ID3D11RenderTargetView as MSAA resolve-target
      • An optional ID3D11DepthStencilView for the depth/stencil buffer
    • WebGPU:
      • A WGPUTextureView object for the rendering surface
      • If MSAA is used, a WGPUTextureView object as MSAA resolve target
      • An optional WGPUTextureView for the depth/stencil buffer
    • Metal (NOTE that the roles of provided surfaces are slightly different in Metal than in D3D11 or WebGPU):
      • A CAMetalDrawable object which is either rendered into directly, or in case of MSAA rendering, serves as MSAA-resolve-target
      • If MSAA is used, a multisampled MTLTexture where rendering goes into
      • An optional MTLTexture for the depth/stencil buffer

An sg_swapchain struct provided to sg_begin_pass() can indicate that the swapchain is in an 'invalid state' via the boolean sg_swapchain.invalid. When this flag is set, all other sg_swapchain members must be zeroed. An invalid swapchain will cause all rendering operations in that pass to be silently skipped.

It's recommended that you create a helper function which returns an initialized sg_swapchain struct by value. This can then be directly plugged into the sg_begin_pass function like this:

sg_begin_pass(&(sg_pass){ .swapchain = sglue_swapchain() });

As an example of such a helper function, check out the function sglue_swapchain() in the sokol_glue.h header.

Offscreen Render Passes

For offscreen render passes, the render target images used in a render pass must be provided as sg_view objects specialized for the specific pass-attachment types:

  • Color-attachment-views for color-rendering
  • Depth-stencil-attachment-views for the depth-stencil-buffer surface
  • Resolve-attachment-views for MSAA-resolve operations

For a simple offscreen scenario with one color-, one depth-stencil-render target and without multisampling, setting up the required image- and view-objects looks like this.

First create two render target images, one with a color pixel format, and one with the depth- or depth-stencil pixel format. Both images must have the same dimensions. Also note the usage flags:

const sg_image color_img = sg_make_image(&(sg_image_desc){
    .usage.color_attachment = true,
    .width = 256,
    .height = 256,
    .pixel_format = SG_PIXELFORMAT_RGBA8,
    .sample_count = 1,
});
const sg_image depth_img = sg_make_image(&(sg_image_desc){
    .usage.depth_stencil_attachment = true,
    .width = 256,
    .height = 256,
    .pixel_format = SG_PIXELFORMAT_DEPTH,
    .sample_count = 1,
});

NOTE: when creating render target images, keep in mind that some default values are aligned with the default environment attributes in the sg_environment struct that was passed into the sg_setup() call:

  • The default value for sg_image_desc.pixel_format is taken from sg_environment.defaults.color_format
  • The default value for sg_image_desc.sample_count is taken from sg_environment.defaults.sample_count
  • The default value for sg_image_desc.num_mipmaps is always 1

Next, create two view objects, one color-attachment-view and one depth-stencil-attachment view:

const sg_view color_att_view = sg_make_view(&(sg_view_desc){
    .color_attachment.image = color_img,
});
const sg_view depth_att_view = sg_make_view(&(sg_view_desc){
    .depth_stencil_attachment.image = depth_img,
});

You'll typically also want to create a texture-view on the color image to sample the color attachment image as texture in a later pass:

const sg_view tex_view = sg_make_view(&(sg_view_desc){
    .texture.image = color_img,
});

The attachment-view objects are then passed into the sg_begin_pass function in place of the nested swapchain struct:

sg_begin_pass(&(sg_pass){
    .attachments = {
        .colors[0] = color_att_view,
        .depth_stencil = depth_att_view,
    },
});

...in a later pass when you want to sample the color attachment image as texture, use the texture view in the sg_apply_bindings() call:

sg_apply_bindings(&(sg_bindings){
    .vertex_buffers[0] = ...,
    .index_buffer = ...,
    .views[VIEW_tex] = tex_view,
    .samplers[SMP_smp] = smp,
});

Swapchain and offscreen passes form dependency trees with a swapchain pass at the root, offscreen passes as nodes, and attachment images as dependencies between passes.

Pass Actions

sg_pass_action structs are used to define actions that should happen at the start and end of render passes (such as clearing pass attachments to a specific color or depth-value, or performing an MSAA resolve operation at the end of a pass).

A typical sg_pass_action object which clears the color attachment to black might look like this:

const sg_pass_action pass_action = {
    .colors[0] = {
        .load_action = SG_LOADACTION_CLEAR,
        .clear_value = { 0.0f, 0.0f, 0.0f, 1.0f }
    }
};

This omits the defaults for the color attachment store action, and the depth-stencil-attachments actions. The same pass action with the defaults explicitly filled in would look like this:

const sg_pass_action pass_action = {
    .colors[0] = {
        .load_action = SG_LOADACTION_CLEAR,
        .store_action = SG_STOREACTION_STORE,
        .clear_value = { 0.0f, 0.0f, 0.0f, 1.0f }
    },
    .depth = {
        .load_action = SG_LOADACTION_CLEAR,
        .store_action = SG_STOREACTION_DONTCARE,
        .clear_value = 1.0f,
    },
    .stencil = {
        .load_action = SG_LOADACTION_CLEAR,
        .store_action = SG_STOREACTION_DONTCARE,
        .clear_value = 0
    }
};

With the sg_pass object and sg_pass_action struct in place, everything is ready now for the actual render pass.

Using such a prepared sg_pass_action in a swapchain pass looks like this:

sg_begin_pass(&(sg_pass){
    .action = pass_action,
    .swapchain = sglue_swapchain()
});
...
sg_end_pass();

...or alternatively in one offscreen pass:

sg_begin_pass(&(sg_pass){
    .action = pass_action,
    .attachments = {
        .colors[0] = color_att_view,
        .depth_stencil = depth_att_view,
    },
});
...
sg_end_pass();

Mipmap and Slice Selection

Offscreen rendering can also go into a mipmap, or a slice/face of a cube-, array- or 3D-image (with some restrictions, for instance it's not possible to create a 3D image with a depth/stencil pixel format — these exceptions are generally caught by the sokol-gfx validation layer).

The mipmap/slice selection is baked into the attachment-view objects. For instance, to create a color-attachment-view for rendering into mip-level 2 and slice 3 of an array texture:

const sg_view color_att_view = sg_make_view(&(sg_view_desc){
    .color_attachment = {
        .image = color_img,
        .mip_level = 2,
        .slice = 3,
    },
});
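
Keep in mind that a pass rendering into a mip level works on a smaller surface than the base image. A small sketch in plain C, following the usual mipmap convention of halving each dimension per level; mip_dim() is a hypothetical helper, not a sokol_gfx.h function:

```c
#include <assert.h>

// Size of one image dimension at a given mip level: halved per level,
// but never smaller than 1.
static int mip_dim(int base_dim, int mip_level) {
    assert((base_dim > 0) && (mip_level >= 0) && (mip_level < 31));
    const int dim = base_dim >> mip_level;
    return (dim > 0) ? dim : 1;
}
```

For a 256x256 image, mip level 2 is a 64x64 surface, which is the size to use when computing custom viewport or scissor rectangles for such a pass.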

MSAA Offscreen Rendering

If MSAA offscreen rendering is desired, the multi-sample rendering result must be 'resolved' into a separate 'resolve image', before that image can be used as texture.

Setting up MSAA offscreen 3D rendering requires three image objects: one color-attachment image with a sample count > 1, a resolve-attachment image with a sample count of 1, and a depth-stencil-attachment image with the same sample count as the color-attachment image:

const sg_image color_img = sg_make_image(&(sg_image_desc){
    .usage.color_attachment = true,
    .width = 256,
    .height = 256,
    .pixel_format = SG_PIXELFORMAT_RGBA8,
    .sample_count = 4,
});
const sg_image resolve_img = sg_make_image(&(sg_image_desc){
    .usage.resolve_attachment = true,
    .width = 256,
    .height = 256,
    .pixel_format = SG_PIXELFORMAT_RGBA8,
    .sample_count = 1,
});
const sg_image depth_img = sg_make_image(&(sg_image_desc){
    .usage.depth_stencil_attachment = true,
    .width = 256,
    .height = 256,
    .pixel_format = SG_PIXELFORMAT_DEPTH,
    .sample_count = 4,
});
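
The sample count rules for the three images can be written down as a simple check. This is a sketch in plain C with hypothetical names (image_attrs, msaa_attachments_ok()), not the sokol-gfx validation layer. The requirement that all three images share the same dimensions is an assumption carried over from the non-MSAA case above:

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical summary of the image attributes relevant for an MSAA pass.
typedef struct {
    int width, height, sample_count;
} image_attrs;

// Checks the MSAA setup rules: color image multisampled, resolve image
// single-sampled, depth-stencil image with the color image's sample count,
// and (assumption) identical dimensions for all three images.
static bool msaa_attachments_ok(const image_attrs* color,
                                const image_attrs* resolve,
                                const image_attrs* depth) {
    assert(color && resolve && depth);
    const bool same_size =
        (color->width == resolve->width) && (color->height == resolve->height) &&
        (color->width == depth->width) && (color->height == depth->height);
    return same_size
        && (color->sample_count > 1)
        && (resolve->sample_count == 1)
        && (depth->sample_count == color->sample_count);
}
```

The three images created above (sample counts 4, 1 and 4 at 256x256) pass this check, while a depth image with a sample count of 1 would not.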

Next you'll need the corresponding attachment-view objects:

const sg_view color_att_view = sg_make_view(&(sg_view_desc){
    .color_attachment.image = color_img,
});
const sg_view resolve_att_view = sg_make_view(&(sg_view_desc){
    .resolve_attachment.image = resolve_img,
});
const sg_view depth_att_view = sg_make_view(&(sg_view_desc){
    .depth_stencil_attachment.image = depth_img,
});

To sample the rendered image as a texture in a later pass you'll also need a texture-view on the resolve-attachment-image (not the color-attachment-image!):

const sg_view tex_view = sg_make_view(&(sg_view_desc){
    .texture.image = resolve_img,
});

Next, start the render pass with all attachment-views. As soon as a resolve-attachment-view is provided, an MSAA resolve operation will happen at the end of the pass. Also note that the content of the MSAA color-attachment-image doesn't need to be preserved, since it's only needed until the MSAA-resolve at the end of the pass, so the .store_action should be set to "don't care":

sg_begin_pass(&(sg_pass){
    .attachments = {
        .colors[0] = color_att_view,
        .resolves[0] = resolve_att_view,
        .depth_stencil = depth_att_view,
    },
    .action = {
        .colors[0] = {
            .load_action = SG_LOADACTION_CLEAR,
            .store_action = SG_STOREACTION_DONTCARE,
            .clear_value = { 0.0f, 0.0f, 0.0f, 1.0f },
        }
    },
});

...in a later pass, use the texture-view that had been created on the resolve-image to use the rendering result as texture:

sg_apply_bindings(&(sg_bindings){
    .vertex_buffers[0] = ...,
    .index_buffer = ...,
    .views[VIEW_tex] = tex_view,
    .samplers[SMP_smp] = smp,
});

On Compute Passes

Compute passes are used to update the content of storage buffers and storage images by running compute shader code on the GPU. Updating storage resources with a compute shader will almost always be more efficient than computing the same data on the CPU and then uploading it via sg_update_buffer() or sg_update_image().

NOTE: Compute passes are only supported on the following platforms and backends:

  • macOS and iOS with Metal
  • Windows with D3D11 and OpenGL
  • Linux with OpenGL or GLES3.1+
  • Web with WebGPU
  • Android with GLES3.1+

...this means compute shaders can't be used on the following platform/backend combos (the same restrictions apply to using storage buffers without compute shaders):

  • macOS with GL
  • iOS with GLES3
  • Web with WebGL2

A compute pass is started with:

sg_begin_pass(&(sg_pass){ .compute = true });

...and finished with a regular:

sg_end_pass();

Typically the following functions will be called inside a compute pass:

sg_apply_pipeline()
sg_apply_bindings()
sg_apply_uniforms()
sg_dispatch()

The following functions are disallowed inside a compute pass and will cause validation layer errors:

sg_apply_viewport[f]()
sg_apply_scissor_rect[f]()
sg_draw()

Only special 'compute shaders' and 'compute pipelines' can be used in compute passes. A compute shader only has a compute-function instead of a vertex- and fragment-function pair, and it doesn't accept vertex- and index-buffers as bindings, only storage-buffer-views (readable and writable), storage-image-views (read/write or writeonly) and texture-views (read-only).

A compute pipeline is created by providing a compute shader object, setting the .compute creation parameter to true and not defining any 'render state':

sg_pipeline pip = sg_make_pipeline(&(sg_pipeline_desc){
    .compute = true,
    .shader = compute_shader,
});

The sg_apply_bindings and sg_apply_uniforms calls are the same as in render passes, with the exception that no vertex- and index-buffers can be bound in the sg_apply_bindings call.

Finally, to kick off a compute workload, call sg_dispatch with the number of workgroups in the x, y and z-dimension:

sg_dispatch(int num_groups_x, int num_groups_y, int num_groups_z)

On Shader Creation

sokol-gfx doesn't come with an integrated shader cross-compiler. Instead, backend-specific shader sources or binary blobs need to be provided when creating a shader object, along with reflection information about the shader resource binding interface needed to bind sokol-gfx resources to the proper shader inputs.

The easiest way to provide all this shader creation data is to use the sokol-shdc shader compiler tool to compile shaders from a common GLSL syntax into backend-specific sources or binary blobs, along with shader interface information and uniform blocks and storage buffer array items mapped to C structs.

To create a shader using a C header which has been code-generated by sokol-shdc:

// include the C header code-generated by sokol-shdc:
#include "myshader.glsl.h"
...

// create shader using a code-generated helper function from the C header:
sg_shader shd = sg_make_shader(myshader_shader_desc(sg_query_backend()));

The samples in the 'sapp' subdirectory of the sokol-samples project also use the sokol-shdc approach:

https://github.com/floooh/sokol-samples/tree/master/sapp

If you're planning to use sokol-shdc, you can stop reading here, and instead continue with the sokol-shdc documentation:

https://github.com/floooh/sokol-tools/blob/master/docs/sokol-shdc.md

Manual Shader Creation

To create shaders with backend-specific shader code or binary blobs, the sg_make_shader() function requires the following information.

Shader Code

Shader code or shader binary blobs for the vertex- and fragment-, or the compute-shader-stage:

  • For the desktop GL backend, source code can be provided in #version 410 or #version 430. Version 430 is required when using storage buffers and compute shaders, but note that this is not available on macOS.
  • For the GLES3 backend, source code must be provided in #version 300 es or #version 310 es syntax (version 310 is required for storage buffer and compute shader support, but note that this is not supported on WebGL2).
  • For the D3D11 backend, shaders can be provided as source or binary blobs. The source code should be in HLSL4.0 (for compatibility with old low-end GPUs) or preferably in HLSL5.0 syntax. Note that when shader source code is provided for the D3D11 backend, sokol-gfx will dynamically load d3dcompiler_47.dll.
  • For the Metal backends, shaders can be provided as source or binary blobs. The MSL version should be in metal-1.1 (other versions may work but are not tested).
  • For the WebGPU backend, shaders must be provided as WGSL source code.
  • Optionally the following shader-code related attributes can be provided:
    • An entry function name (only on D3D11 or Metal, but not OpenGL)
    • On D3D11 only, a compilation target (default is "vs_4_0" and "ps_4_0")

Vertex Attributes

Information about the input vertex attributes used by the vertex shader, most of it backend-specific:

  • An optional 'base type' (float, signed-/unsigned-int) for each vertex attribute. When provided, this is used by the validation layer to check that the CPU-side input vertex format is compatible with the input vertex declaration of the vertex shader.
  • Metal: no location information needed since vertex attributes are always bound by their attribute location defined in the shader via [[attribute(N)]]
  • WebGPU: no location information needed since vertex attributes are always bound by their attribute location defined in the shader via @location(N)
  • GLSL: vertex attribute names can be optionally provided. In that case their location will be looked up by name; otherwise, the vertex attribute location can be defined with layout(location = N)
  • D3D11: a 'semantic name' and 'semantic index' must be provided for each vertex attribute. E.g. if the vertex attribute is defined as TEXCOORD1 in the shader, the semantic name would be TEXCOORD, and the semantic index would be 1.

NOTE: vertex attributes currently must not have gaps. This requirement may be relaxed in the future.

Metal Compute Threads

Specifically for Metal compute shaders, the 'number of threads per threadgroup' must be provided. Normally this is extracted by sokol-shdc from the GLSL shader source code. For instance the following statement in the input GLSL:

layout(local_size_x=64, local_size_y=1, local_size_z=1) in;

...will be communicated to the sokol-gfx Metal backend in the code-generated sg_shader_desc struct:

(sg_shader_desc){
    .mtl_threads_per_threadgroup = { .x = 64, .y = 1, .z = 1 },
}

Uniform Block Bindings

Information about each uniform block binding used in the shader:

  • The shader stage of the uniform block (vertex, fragment or compute)
  • The size of the uniform block in number of bytes
  • A memory layout hint (currently 'native' or 'std140') where 'native' defines a backend-specific memory layout which shouldn't be used for cross-platform code. Only std140 guarantees a backend-agnostic memory layout.
  • A backend-specific bind slot:
    • D3D11/HLSL: the buffer register N (register(bN)) where N is 0..7
    • Metal/MSL: the buffer bind slot N ([[buffer(N)]]) where N is 0..7
    • WebGPU: the binding N in @group(0) @binding(N) where N is 0..15
  • For GLSL only: a description of the internal uniform block layout, which maps member types and their offsets on the CPU side to uniform variable names in the GLSL shader
  • Please also see the documentation sections about Uniform Data Layout and Cross-Backend Common Uniform Data Layout below!

Resource Bindings

A description of each resource binding (texture-, storage-buffer- and storage-image-bindings) which directly map to the sg_bindings.views[] array slots.

Each resource binding slot comes in three flavours.

1. Texture bindings with the following properties:

  • The shader stage of the texture (vertex, fragment or compute)
  • The expected image type:
    • SG_IMAGETYPE_2D
    • SG_IMAGETYPE_CUBE
    • SG_IMAGETYPE_3D
    • SG_IMAGETYPE_ARRAY
  • The expected 'image sample type':
    • SG_IMAGESAMPLETYPE_FLOAT
    • SG_IMAGESAMPLETYPE_DEPTH
    • SG_IMAGESAMPLETYPE_SINT
    • SG_IMAGESAMPLETYPE_UINT
    • SG_IMAGESAMPLETYPE_UNFILTERABLE_FLOAT
  • A flag whether the texture is expected to be multisampled
  • A backend-specific bind slot:
    • D3D11/HLSL: the texture register N (register(tN)) where N is 0..31 (in HLSL, readonly storage buffers and textures share the same bind space)
    • Metal/MSL: the texture bind slot N ([[texture(N)]]) where N is 0..31 (the bind slot must not collide with storage image bindings on the same stage)
    • WebGPU/WGSL: the binding N in @group(1) @binding(N) where N is 0..127

2. Storage buffer bindings with the following properties:

  • The shader stage of the storage buffer
  • A boolean 'readonly' flag, which is used for validation and hazard tracking in some 3D backends. Note that in render passes, only readonly storage buffer bindings are allowed. In compute passes, any read/write storage buffer binding is assumed to be written to by the compute shader.
  • A backend-specific bind slot:
    • D3D11/HLSL:
      • For readonly storage buffer bindings: the texture register N (register(tN)) where N is 0..31 (in HLSL, readonly storage buffers and textures share the same bind space for 'shader resource views')
      • For read/write storage buffer bindings: the UAV register N (register(uN)) where N is 0..31 (in HLSL, readwrite storage buffers use their own bind space for 'unordered access views')
    • Metal/MSL: the buffer bind slot N ([[buffer(N)]]) where N is 8..23
    • WebGPU/WGSL: the binding N in @group(1) @binding(N) where N is 0..127
    • GL/GLSL: the buffer binding N in layout(binding=N) where N is 0..sg_limits.max_storage_buffer_bindings_per_stage
  • Note that storage buffer bindings are not supported on all backends and platforms.

3. Storage image bindings with the following properties:

  • The shader stage (must be compute)
  • The expected image type:
    • SG_IMAGETYPE_2D
    • SG_IMAGETYPE_CUBE
    • SG_IMAGETYPE_3D
    • SG_IMAGETYPE_ARRAY
  • The 'access pixel format', currently limited to:
    • SG_PIXELFORMAT_RGBA8
    • SG_PIXELFORMAT_RGBA8SN/UI/SI
    • SG_PIXELFORMAT_RGBA16UI/SI/F
    • SG_PIXELFORMAT_R32UI/SI/F
    • SG_PIXELFORMAT_RG32UI/SI/F
    • SG_PIXELFORMAT_RGBA32UI/SI/F
  • The access type (readwrite or writeonly)
  • A backend-specific bind slot:
    • D3D11/HLSL: the UAV register N (register(uN)) where N is 0..31, the bind slot must not collide with UAV storage buffer bindings
    • Metal/MSL: the texture bind slot N ([[texture(N)]]) where N is 0..31, the bind slot must not collide with other texture bindings on the same stage
    • WebGPU/WGSL: the binding N in @group(1) @binding(N) where N is 0..127
    • GL/GLSL: the image binding N in layout(binding=N) where N is 0..sg_limits.max_storage_image_bindings_per_stage
  • Note that storage image bindings are not supported on all backends and platforms.

Sampler Bindings

A description of each sampler used in the shader:

  • The shader stage of the sampler (vertex, fragment or compute)
  • The expected sampler type:
    • SG_SAMPLERTYPE_FILTERING
    • SG_SAMPLERTYPE_NONFILTERING
    • SG_SAMPLERTYPE_COMPARISON
  • A backend-specific bind slot:
    • D3D11/HLSL: the sampler register N (register(sN)) where N is 0..SG_MAX_SAMPLER_BINDINGS
    • Metal/MSL: the sampler bind slot N ([[sampler(N)]]) where N is 0..SG_MAX_SAMPLER_BINDINGS
    • WebGPU/WGSL: the binding N in @group(1) @binding(N) where N is 0..127

Texture-Sampler Pairs

An array of 'texture-sampler-pairs' used by the shader to sample textures. For D3D11, Metal and WebGPU this is used for validation purposes to check whether the texture and sampler are compatible with each other (especially WebGPU is very picky about combining the correct texture-sample-type with the correct sampler-type). For GLSL, an additional 'combined-image-sampler name' must be provided because 'OpenGL style GLSL' cannot handle separate texture and sampler objects, but still groups them into a traditional GLSL 'sampler object'.

Compatibility rules for image-sample-type vs sampler-type are as follows:

| Image Sample Type | Compatible Sampler Type |
|---|---|
| SG_IMAGESAMPLETYPE_FLOAT | SG_SAMPLERTYPE_FILTERING or SG_SAMPLERTYPE_NONFILTERING |
| SG_IMAGESAMPLETYPE_UNFILTERABLE_FLOAT | SG_SAMPLERTYPE_NONFILTERING |
| SG_IMAGESAMPLETYPE_SINT | SG_SAMPLERTYPE_NONFILTERING |
| SG_IMAGESAMPLETYPE_UINT | SG_SAMPLERTYPE_NONFILTERING |
| SG_IMAGESAMPLETYPE_DEPTH | SG_SAMPLERTYPE_COMPARISON |
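The compatibility table above can be expressed as a small predicate. This is only a sketch of the rule, not the validation code used by sokol-gfx; the enum and function names are stand-ins invented for illustration (real code would use the SG_IMAGESAMPLETYPE_* and SG_SAMPLERTYPE_* values):

```c
#include <assert.h>
#include <stdbool.h>

// Stand-in enums for illustration only; the real code would use the
// SG_IMAGESAMPLETYPE_* and SG_SAMPLERTYPE_* values from sokol_gfx.h.
typedef enum {
    IMG_FLOAT,
    IMG_UNFILTERABLE_FLOAT,
    IMG_SINT,
    IMG_UINT,
    IMG_DEPTH,
} img_sample_type_t;

typedef enum {
    SMP_FILTERING,
    SMP_NONFILTERING,
    SMP_COMPARISON,
} smp_type_t;

// Returns true if the texture-sampler pair is allowed by the table above.
static bool pair_is_compatible(img_sample_type_t img, smp_type_t smp) {
    switch (img) {
        case IMG_FLOAT:
            return (smp == SMP_FILTERING) || (smp == SMP_NONFILTERING);
        case IMG_UNFILTERABLE_FLOAT:
        case IMG_SINT:
        case IMG_UINT:
            return smp == SMP_NONFILTERING;
        case IMG_DEPTH:
            return smp == SMP_COMPARISON;
    }
    return false;
}
```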

Backend-Specific Bindslot Ranges

Not relevant when using sokol-shdc:

  • D3D11/HLSL:
    • Separate bindslot space per shader stage
    • Uniform block bindings (as cbuffer): register(b0..b7)
    • Texture- and readonly storage buffer bindings: register(t0..t31)
    • Read/write storage buffer and storage image bindings: register(u0..u31)
    • Samplers: register(s0..s11)
  • Metal/MSL:
    • Separate bindslot space per shader stage
    • Uniform blocks: [[buffer(0..7)]]
    • Storage buffers: [[buffer(8..23)]]
    • Textures and storage image bindings: [[texture(0..31)]]
    • Samplers: [[sampler(0..11)]]
  • WebGPU/WGSL:
    • Common bindslot space across shader stages
    • Uniform blocks: @group(0) @binding(0..15)
    • Textures, storage-images, storage-buffers and sampler: @group(1) @binding(0..127)
  • GL/GLSL:
    • Uniforms and image-samplers are bound by name
    • Storage buffer bindings: layout(std430, binding=0..sg_limits.max_storage_buffer_bindings_per_stage) (common bindslot space across shader stages)
    • Storage image bindings: layout(binding=0..sg_limits.max_storage_image_bindings_per_stage, [access_format])

For example code of how to create backend-specific shader objects, please refer to the following samples:

On SG_IMAGESAMPLETYPE_UNFILTERABLE_FLOAT and SG_SAMPLERTYPE_NONFILTERING

The WebGPU backend introduces the concept of 'unfilterable-float' textures, which can only be combined with 'nonfiltering' samplers (this is a restriction specific to WebGPU, but since the same sokol-gfx code should work across all backends, the sokol-gfx validation layer also enforces this restriction — the alternative would be undefined behaviour in some backend APIs on some devices).

The background is that some mobile devices (most notably iOS devices) cannot perform linear filtering when sampling textures with certain pixel formats, most notably the 32F formats:

  • SG_PIXELFORMAT_R32F
  • SG_PIXELFORMAT_RG32F
  • SG_PIXELFORMAT_RGBA32F

The information of whether a shader is going to be used with such an unfilterable-float texture must already be provided in the sg_shader_desc struct when creating the shader (see the above section On Shader Creation).

If you are using the sokol-shdc shader compiler, the information whether a texture/sampler binding expects an 'unfilterable-float/nonfiltering' texture/sampler combination cannot be inferred from the shader source alone — you'll need to provide this hint via annotation-tags. For instance, here is an example from the ozz-skin-sapp.c sample shader which samples an RGBA32F texture with skinning matrices in the vertex shader:

@image_sample_type joint_tex unfilterable_float
uniform texture2D joint_tex;
@sampler_type smp nonfiltering
uniform sampler smp;

This will result in SG_IMAGESAMPLETYPE_UNFILTERABLE_FLOAT and SG_SAMPLERTYPE_NONFILTERING being written to the code-generated sg_shader_desc struct.

On Vertex Formats

sokol-gfx implements strict mapping rules from CPU-side vertex component formats to GPU-side vertex input data types:

  • Float and packed normalized CPU-side formats must be used as floating point base type in the vertex shader
  • Packed signed-integer CPU-side formats must be used as signed integer base type in the vertex shader
  • Packed unsigned-integer CPU-side formats must be used as unsigned integer base type in the vertex shader

These mapping rules are enforced by the sokol-gfx validation layer, but only when sufficient reflection information is provided in sg_shader_desc.attrs[].base_type. This is the case when sokol-shdc is used; otherwise the default base_type will be SG_SHADERATTRBASETYPE_UNDEFINED which causes the sokol-gfx validation check to be skipped (of course you can also provide the per-attribute base type information manually when not using sokol-shdc).

The detailed mapping rules from SG_VERTEXFORMAT_* to GLSL data types are as follows:

| Vertex Format | GLSL Type |
|---|---|
| FLOAT[*] | float, vec* |
| BYTE4N | vec* (scaled to -1.0 .. +1.0) |
| UBYTE4N | vec* (scaled to 0.0 .. +1.0) |
| SHORT[*]N | vec* (scaled to -1.0 .. +1.0) |
| USHORT[*]N | vec* (scaled to 0.0 .. +1.0) |
| INT[*] | int, ivec* |
| UINT[*] | uint, uvec* |
| BYTE4 | ivec* |
| UBYTE4 | uvec* |
| SHORT[*] | ivec* |
| USHORT[*] | uvec* |

NOTE: sokol-gfx only provides vertex formats with sizes of a multiple of 4 (e.g. BYTE4N but not BYTE2N). This is because vertex components must be 4-byte aligned anyway.
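The three mapping rules, together with the "skip the check when no reflection info is available" behaviour, can be sketched as follows. The enums are reduced stand-ins for a handful of SG_VERTEXFORMAT_* and SG_SHADERATTRBASETYPE_* values, and the function names are hypothetical; this is not the actual validation code:

```c
#include <assert.h>
#include <stdbool.h>

// Stand-in enums for illustration only (the real names in sokol_gfx.h are
// the SG_VERTEXFORMAT_* and SG_SHADERATTRBASETYPE_* values).
typedef enum { BASE_UNDEFINED, BASE_FLOAT, BASE_SINT, BASE_UINT } base_type_t;
typedef enum {
    FMT_FLOAT3,   // float format
    FMT_UBYTE4N,  // packed normalized
    FMT_SHORT2N,  // packed normalized
    FMT_BYTE4,    // packed signed integer
    FMT_USHORT2,  // packed unsigned integer
} vertex_format_t;

// Base type the vertex shader input must have for a CPU-side format.
static base_type_t required_base_type(vertex_format_t fmt) {
    switch (fmt) {
        case FMT_FLOAT3:
        case FMT_UBYTE4N:
        case FMT_SHORT2N:
            return BASE_FLOAT;
        case FMT_BYTE4:
            return BASE_SINT;
        case FMT_USHORT2:
            return BASE_UINT;
    }
    return BASE_UNDEFINED;
}

// Mirrors the behaviour described above: the check is skipped when the
// shader provides no reflection info (BASE_UNDEFINED).
static bool attr_is_compatible(vertex_format_t fmt, base_type_t shader_base_type) {
    if (shader_base_type == BASE_UNDEFINED) {
        return true;
    }
    return required_base_type(fmt) == shader_base_type;
}
```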

Uniform Data Layout

NOTE: if you use the sokol-shdc shader compiler tool, you don't need to worry about the following details.

The data that's passed into the sg_apply_uniforms() function must adhere to specific layout rules so that the GPU shader finds the uniform block items at the right offset.

For the D3D11 and Metal backends, sokol-gfx only cares about the size of uniform blocks, but not about the internal layout. The data will just be copied into a uniform/constant buffer in a single operation, and it's up to you to arrange the CPU-side layout so that it matches the GPU side layout. This also means that with the D3D11 and Metal backends you are not limited to a 'cross-platform' subset of uniform variable types.

If you only ever use one of the D3D11, Metal or WebGPU backends, you can stop reading here.

For the GL backends, the internal layout of uniform blocks matters though, and you are limited to a small number of uniform variable types. This is because sokol-gfx must be able to locate the uniform block members in order to upload them to the GPU with glUniformXXX() calls.

To describe the uniform block layout to sokol-gfx, the following information must be passed to the sg_make_shader() call in the sg_shader_desc struct:

  • A hint about the used packing rule (either SG_UNIFORMLAYOUT_NATIVE or SG_UNIFORMLAYOUT_STD140)
  • A list of the uniform block member types in the correct order they appear on the CPU side

For example if the GLSL shader has the following uniform declarations:

uniform mat4 mvp;
uniform vec2 offset0;
uniform vec2 offset1;
uniform vec2 offset2;

...and on the CPU side, there's a similar C struct:

typedef struct {
    float mvp[16];
    float offset0[2];
    float offset1[2];
    float offset2[2];
} params_t;

...the uniform block description in the sg_shader_desc must look like this:

sg_shader_desc desc = {
    .vs.uniform_blocks[0] = {
        .size = sizeof(params_t),
        .layout = SG_UNIFORMLAYOUT_NATIVE,  // this is the default and can be omitted
        .uniforms = {
            // order must be the same as in 'params_t':
            [0] = { .name = "mvp", .type = SG_UNIFORMTYPE_MAT4 },
            [1] = { .name = "offset0", .type = SG_UNIFORMTYPE_FLOAT2 },
            [2] = { .name = "offset1", .type = SG_UNIFORMTYPE_FLOAT2 },
            [3] = { .name = "offset2", .type = SG_UNIFORMTYPE_FLOAT2 },
        }
    }
};

With this information sokol-gfx can now compute the correct offsets of the data items within the uniform block struct.

The SG_UNIFORMLAYOUT_NATIVE packing rule works fine if only the GL backends are used, but for proper cross-backend D3D11/Metal/GL support a subset of the std140 layout must be used, which is described in the next section.

Cross-Backend Common Uniform Data Layout

For cross-platform / cross-3D-backend code it is important that the same uniform block layout on the CPU side can be used for all sokol-gfx backends. To achieve this, a common subset of the std140 layout must be used:

  • The uniform block layout hint in sg_shader_desc must be explicitly set to SG_UNIFORMLAYOUT_STD140.
  • Only the following GLSL uniform types can be used (with their associated sokol-gfx enums):
| GLSL Type | sokol-gfx Enum |
|---|---|
| float | SG_UNIFORMTYPE_FLOAT |
| vec2 | SG_UNIFORMTYPE_FLOAT2 |
| vec3 | SG_UNIFORMTYPE_FLOAT3 |
| vec4 | SG_UNIFORMTYPE_FLOAT4 |
| int | SG_UNIFORMTYPE_INT |
| ivec2 | SG_UNIFORMTYPE_INT2 |
| ivec3 | SG_UNIFORMTYPE_INT3 |
| ivec4 | SG_UNIFORMTYPE_INT4 |
| mat4 | SG_UNIFORMTYPE_MAT4 |
  • Alignment for those types must be as follows (in bytes):
| Type | Alignment (bytes) |
|---|---|
| float | 4 |
| vec2 | 8 |
| vec3 | 16 |
| vec4 | 16 |
| int | 4 |
| ivec2 | 8 |
| ivec3 | 16 |
| ivec4 | 16 |
| mat4 | 16 |
  • Arrays are only allowed for the following types: vec4, ivec4, mat4.
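To see how the alignment rules above turn into member offsets, here is a small sketch that walks a list of member types and rounds each offset up to the required alignment. Everything in it is illustrative: the enum and function names are stand-ins for the SG_UNIFORMTYPE_* values, the member names light_dir and intensity are invented, a vec3 is assumed to occupy 12 bytes, and arrays are not handled:

```c
#include <assert.h>
#include <stddef.h>

// Stand-in type enum for illustration; sokol_gfx.h uses SG_UNIFORMTYPE_*.
typedef enum { U_FLOAT, U_VEC2, U_VEC3, U_VEC4, U_INT, U_IVEC2, U_IVEC3, U_IVEC4, U_MAT4 } utype_t;

// Size in bytes of a single (non-array) item.
static size_t utype_size(utype_t t) {
    switch (t) {
        case U_FLOAT: case U_INT:   return 4;
        case U_VEC2:  case U_IVEC2: return 8;
        case U_VEC3:  case U_IVEC3: return 12;
        case U_VEC4:  case U_IVEC4: return 16;
        case U_MAT4:                return 64;
    }
    return 0;
}

// Alignment in bytes, as listed in the table above.
static size_t utype_align(utype_t t) {
    switch (t) {
        case U_FLOAT: case U_INT:   return 4;
        case U_VEC2:  case U_IVEC2: return 8;
        default:                    return 16;
    }
}

// Computes the byte offset of each member in declaration order.
static void compute_offsets(const utype_t* types, size_t num, size_t* out_offsets) {
    assert(types && out_offsets);
    size_t cur = 0;
    for (size_t i = 0; i < num; i++) {
        const size_t align = utype_align(types[i]);
        cur = (cur + (align - 1)) & ~(align - 1);   // round up to alignment
        out_offsets[i] = cur;
        cur += utype_size(types[i]);
    }
}

// A CPU-side struct laid out to match the members: mat4, vec2, vec3, float.
typedef struct {
    float mvp[16];      // offset 0
    float offset0[2];   // offset 64
    float pad0[2];      // padding so that the vec3 starts at a multiple of 16
    float light_dir[3]; // offset 80
    float intensity;    // offset 92 (a float may directly follow a vec3)
} params_std140_t;
```

For the member list { mat4, vec2, vec3, float } the computed offsets are 0, 64, 80 and 92, which is why the C struct needs the explicit pad0 member between offset0 and light_dir.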

Note that the HLSL cbuffer layout rules are slightly different from the std140 layout rules. This means that the cbuffer declarations in HLSL code must be tweaked so that the layout is compatible with std140.

By far the easiest way to tackle the common uniform block layout problem is to use the sokol-shdc shader cross-compiler tool!

On Storage Buffers

The two main purposes of storage buffers are:

  • To be populated by compute shaders with dynamically generated data
  • For providing random-access data to all shader stages

Storage buffers can be used to pass large amounts of random access structured data from the CPU side to the shaders. They are similar to data textures, but are more convenient to use both on the CPU and shader side since they can be accessed in shaders as a 1-dimensional array of struct items.

Storage buffers are NOT supported on the following platform/backend combos:

  • macOS+GL (because storage buffers require GL 4.3, while macOS only goes up to GL 4.1)
  • Platforms which only support a GLES3.0 context (WebGL2 and iOS)

To use storage buffers, the following steps are required:

  • Write a shader which uses storage buffers (vertex- and fragment-shaders can only read from storage buffers, while compute-shaders can both read and write storage buffers)
  • Create one or more storage buffers via sg_make_buffer() with .usage.storage_buffer = true
  • When creating a shader via sg_make_shader(), populate the sg_shader_desc struct with binding info (when using sokol-shdc, this step will be taken care of automatically):
    • Which storage buffer bind slots on the vertex-, fragment- or compute-stage are occupied
    • Whether the storage buffer on that bind slot is readonly (readonly bindings are required for vertex- and fragment-shaders, and in compute shaders the readonly flag is used to control hazard tracking in some 3D backends)
  • When calling sg_apply_bindings(), apply the matching bind slots with the previously created storage buffers.
  • ...and that's it.

Sample Code

For more details, see the following backend-agnostic sokol samples:

Also see the following backend-specific vertex pulling samples (those also don't use sokol-shdc):

...and the backend-specific compute shader samples:

Storage Buffer Shader Authoring with sokol-shdc

Storage buffer shader authoring caveats when using sokol-shdc:

  • Declare a read-only storage buffer interface block with layout(binding=N) readonly buffer [name] { ... } (where 'N' is the index in sg_bindings.views[N])
  • ...or a read/write storage buffer interface block with layout(binding=N) buffer [name] { ... }
  • Declare a struct which describes a single array item in the storage buffer interface block
  • Only put a single flexible array member into the storage buffer interface block

E.g. a complete example in 'sokol-shdc GLSL':

@vs
// declare a struct:
struct sb_vertex {
    vec3 pos;
    vec4 color;
};
// declare a buffer interface block with a single flexible struct array:
layout(binding=0) readonly buffer vertices {
    sb_vertex vtx[];
};
// in the shader function, access the storage buffer like this:
void main() {
    vec3 pos = vtx[gl_VertexIndex].pos;
    ...
}
@end
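On the CPU side, the data written into the storage buffer must match the layout of the shader-side struct. sokol-shdc normally generates a matching C struct; the following hand-written sketch only illustrates the idea, assuming std430 rules in which a vec4 must start at a multiple of 16 bytes. The names sb_vertex_t, _pad0 and sb_vertex_set() are illustrative:

```c
#include <assert.h>
#include <stddef.h>

// CPU-side mirror of the GLSL 'sb_vertex' struct, assuming std430 layout:
// the vec3 occupies 12 bytes but the vec4 must start at a multiple of 16.
typedef struct {
    float pos[3];    // vec3 pos   -> offset 0
    float _pad0;     // explicit padding up to offset 16
    float color[4];  // vec4 color -> offset 16
} sb_vertex_t;

// Fills one array item and returns the byte stride between array items.
static size_t sb_vertex_set(sb_vertex_t* v, const float pos[3], const float color[4]) {
    assert(v && pos && color);
    for (int i = 0; i < 3; i++) { v->pos[i] = pos[i]; }
    v->_pad0 = 0.0f;
    for (int i = 0; i < 4; i++) { v->color[i] = color[i]; }
    return sizeof(sb_vertex_t);
}
```

An array of such items could then be provided as the initial content of a buffer created with .usage.storage_buffer = true.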

In a compute shader you can read and write the same item in the same storage buffer (but you'll have to be careful with random access since many threads of the same compute function run in parallel):

@cs
struct sb_item {
    vec3 pos;
    vec3 vel;
};
layout(binding=0) buffer items_ssbo {
    sb_item items[];
};
layout(local_size_x=64, local_size_y=1, local_size_z=1) in;
void main() {
    uint idx = gl_GlobalInvocationID.x;
    vec3 pos = items[idx].pos;
    ...
    items[idx].pos = pos;
}
@end

Backend-Specific Storage Buffer Caveats

Not relevant when using sokol-shdc.

D3D11:

Metal:

  • In Metal there is no internal difference between vertex-, uniform- and storage-buffers, all are bound to the same 'buffer bind slots' with the following reserved ranges:
    • Vertex shader stage:
      • Uniform buffers: slots 0..7
      • Storage buffers: slots 8..15
      • Vertex buffers: slots 16..23
    • Fragment shader stage:
      • Uniform buffers: slots 0..7
      • Storage buffers: slots 8..15
  • This means in MSL, storage buffer bindings start at [[buffer(8)]] both in the vertex and fragment stage

GL:

  • The GL backend doesn't use name-lookup to find storage buffer bindings; this means you must annotate buffers with layout(std430, binding=N) in GLSL
  • ...where N is 0..sg_limits.max_storage_buffer_bindings_per_stage

WebGPU:

  • In WGSL, textures, samplers and storage buffers all use a shared bindspace across all shader stages on bindgroup 1: @group(1) @binding(0..127)

On Storage Images

To write pixel data to texture objects in compute shaders, first an image object must be created with storage_image usage:

sg_image storage_image = sg_make_image(&(sg_image_desc){
    .usage.storage_image = true,
    .width = ...,
    .height = ...,
    .pixel_format = ...,
});

Next, a storage-image-view object is required which also allows you to pick a specific mip-level or slice for the compute-shader to access:

sg_view simg_view = sg_make_view(&(sg_view_desc){
    .storage_image = {
        .image = storage_image,
        .mip_level = ...,
        .slice = ...
    },
});

Finally 'bind' the storage-image-view via a regular sg_apply_bindings() call inside a compute pass:

sg_begin_pass(&(sg_pass){ .compute = true });
sg_apply_pipeline(...);
sg_apply_bindings(&(sg_bindings){
    .views[VIEW_simg] = simg_view,
});
sg_dispatch(...);
sg_end_pass();

Currently, storage images can only be used with readwrite or writeonly access in shaders. For readonly access use a regular texture binding instead.

For an example of using storage images in compute shaders see imageblur-sapp:

Trace Hooks

sokol_gfx.h optionally allows you to install "trace hook" callbacks for each public API function. When a public API function is called, and a trace hook callback has been installed for this function, the callback will be invoked with the parameters and result of the function. This is useful for things like debugging- and profiling-tools, or keeping track of resource creation and destruction.

To use the trace hook feature:

  • Define SOKOL_TRACE_HOOKS before including the implementation.
  • Set up an sg_trace_hooks structure with your callback function pointers (keep all function pointers you're not interested in zero-initialized), and optionally set the user_data member in the sg_trace_hooks struct.
  • Install the trace hooks by calling sg_install_trace_hooks(). The return value of this function is another sg_trace_hooks struct which contains the previous set of trace hooks. You should keep this struct around, and call those previous function pointers from your own trace callbacks for proper chaining.
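The chaining pattern from the last step can be sketched in isolation. The code below does not use sokol_gfx.h: trace_hooks_t is a simplified stand-in for sg_trace_hooks with a single callback, install_trace_hooks() stands in for sg_install_trace_hooks(), and all other names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

// Simplified stand-in for sg_trace_hooks with a single callback; the real
// struct has one function pointer per public API function.
typedef struct {
    void* user_data;
    void (*commit)(void* user_data);
} trace_hooks_t;

static trace_hooks_t installed_hooks;   // currently installed hooks
static trace_hooks_t prev_hooks;        // hooks that were installed before ours
static int num_commits_seen;            // bookkeeping done by our own hook
static int num_other_calls;             // bookkeeping done by the earlier hook

// Stand-in for sg_install_trace_hooks(): installs new hooks, returns the old ones.
static trace_hooks_t install_trace_hooks(const trace_hooks_t* hooks) {
    assert(hooks);
    trace_hooks_t old = installed_hooks;
    installed_hooks = *hooks;
    return old;
}

// A hook installed earlier by some other tool.
static void other_tool_hook(void* user_data) {
    (void)user_data;
    num_other_calls++;
}

// Our callback: do our own work, then forward to the previous hook (if any).
static void my_commit_hook(void* user_data) {
    (void)user_data;
    num_commits_seen++;
    if (prev_hooks.commit) {
        prev_hooks.commit(prev_hooks.user_data);
    }
}

// Installs my_commit_hook and remembers the previous hooks for chaining.
static void setup_my_hooks(void) {
    const trace_hooks_t hooks = { .user_data = NULL, .commit = my_commit_hook };
    prev_hooks = install_trace_hooks(&hooks);
}
```

Forwarding to the previously installed callback is what allows several independent tools to observe the same API calls without knowing about each other.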

As an example of how trace hooks are used, have a look at the imgui/sokol_gfx_imgui.h header which implements a realtime debugging UI for sokol_gfx.h on top of Dear ImGui.

Memory Allocation Override

You can override the memory allocation functions at initialization time like this:

void* my_alloc(size_t size, void* user_data) {
    return malloc(size);
}

void my_free(void* ptr, void* user_data) {
    free(ptr);
}

...
    sg_setup(&(sg_desc){
        // ...
        .allocator = {
            .alloc_fn = my_alloc,
            .free_fn = my_free,
            .user_data = ...,
        }
    });
...

If no overrides are provided, malloc and free will be used.

This only affects memory allocation calls done by sokol_gfx.h itself though, not any allocations in OS libraries.
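The user_data pointer makes it possible to attach state to the allocator callbacks. As a sketch, the following pair of functions counts allocations and frees; the alloc_stats_t struct and the function names are hypothetical, only the callback signatures are taken from the example above:

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical bookkeeping struct, passed to the callbacks via user_data.
typedef struct {
    int num_allocs;
    int num_frees;
} alloc_stats_t;

// Same signature as the alloc_fn callback shown above.
static void* tracking_alloc(size_t size, void* user_data) {
    alloc_stats_t* stats = (alloc_stats_t*)user_data;
    assert(stats);
    void* ptr = malloc(size);
    if (ptr) {
        stats->num_allocs++;
    }
    return ptr;
}

// Same signature as the free_fn callback shown above.
static void tracking_free(void* ptr, void* user_data) {
    alloc_stats_t* stats = (alloc_stats_t*)user_data;
    assert(stats);
    if (ptr) {
        stats->num_frees++;
    }
    free(ptr);
}
```

These would be plugged in via .allocator = { .alloc_fn = tracking_alloc, .free_fn = tracking_free, .user_data = &stats }, where stats is an alloc_stats_t that must stay alive for as long as sokol-gfx is in use.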

Error Reporting and Logging

To get any logging information at all you need to provide a logging callback in the setup call. The easiest way is to use sokol_log.h:

#include "sokol_log.h"

sg_setup(&(sg_desc){ .logger.func = slog_func });

To override logging with your own callback, first write a logging function like this:

void my_log(const char* tag,                // e.g. 'sg'
            uint32_t log_level,             // 0=panic, 1=error, 2=warn, 3=info
            uint32_t log_item_id,           // SG_LOGITEM_*
            const char* message_or_null,    // a message string, may be nullptr in release mode
            uint32_t line_nr,               // line number in sokol_gfx.h
            const char* filename_or_null,   // source filename, may be nullptr in release mode
            void* user_data)
{
    ...
}

...and then set up sokol-gfx like this:

sg_setup(&(sg_desc){
    .logger = {
        .func = my_log,
        .user_data = my_user_data,
    }
});

The provided logging function must be reentrant (e.g. be callable from different threads).

If you don't want to provide your own custom logger, it is highly recommended to use the standard logger in sokol_log.h instead, otherwise you won't see any warnings or errors.
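As an illustration of what a custom logging callback might do, the sketch below counts messages per log level and prints them to stderr. The log_stats_t struct and the name counting_log are hypothetical; the signature follows the my_log example above. Note that the counters are not synchronized, so a version meant to be called from several threads would need atomics or a lock:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

// Hypothetical per-level message counters, passed in via user_data.
typedef struct {
    uint32_t count[4];   // indexed by log level: 0=panic, 1=error, 2=warn, 3=info
} log_stats_t;

// Same signature as the logging callback shown above.
static void counting_log(const char* tag, uint32_t log_level, uint32_t log_item_id,
                         const char* message_or_null, uint32_t line_nr,
                         const char* filename_or_null, void* user_data)
{
    log_stats_t* stats = (log_stats_t*)user_data;
    if (stats && (log_level < 4)) {
        stats->count[log_level]++;
    }
    // message and filename may be null in release mode, so substitute placeholders
    fprintf(stderr, "[%s][level %u][id %u] %s:%u: %s\n",
        tag ? tag : "?",
        (unsigned)log_level,
        (unsigned)log_item_id,
        filename_or_null ? filename_or_null : "<no file>",
        (unsigned)line_nr,
        message_or_null ? message_or_null : "<no message>");
}
```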

Commit Listeners

It's possible to hook callback functions into sokol-gfx which are called from inside sg_commit() in unspecified order. This is mainly useful for libraries that build on top of sokol_gfx.h to be notified about the end/start of a frame.

To add a commit listener, call:

static void my_commit_listener(void* user_data) {
    ...
}

bool success = sg_add_commit_listener((sg_commit_listener){
    .func = my_commit_listener,
    .user_data = ...,
});

The function returns false if the internal array of commit listeners is full, or the same commit listener had already been added.

If the function returns true, my_commit_listener() will be called each frame from inside sg_commit().

By default, 1024 distinct commit listeners can be added, but this number can be tweaked in the sg_setup() call:

sg_setup(&(sg_desc){
    .max_commit_listeners = 2048,
});

An sg_commit_listener item is equal to another if both the function pointer and user_data field are equal.

To remove a commit listener:

bool success = sg_remove_commit_listener((sg_commit_listener){
    .func = my_commit_listener,
    .user_data = ...,
});

...where the .func and .user_data fields are equal to a previous sg_add_commit_listener() call. The function returns true if the commit listener item was found and removed, and false otherwise.
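The add/remove semantics described above (equality on both fields, failure on duplicates or a full array) can be modelled with a small fixed-size registry. This is a sketch of the documented behaviour only, not the implementation inside sokol_gfx.h; listener_t stands in for sg_commit_listener and all function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LISTENERS (4)   // small capacity for illustration

// Stand-in for sg_commit_listener.
typedef struct {
    void (*func)(void* user_data);
    void* user_data;
} listener_t;

static listener_t listeners[MAX_LISTENERS];   // a free slot has func == NULL

// Two listeners are equal if both function pointer and user_data are equal.
static bool listener_equal(const listener_t* a, const listener_t* b) {
    return (a->func == b->func) && (a->user_data == b->user_data);
}

// Returns false if the listener was already added or the array is full.
static bool add_listener(listener_t l) {
    assert(l.func);
    int free_slot = -1;
    for (int i = 0; i < MAX_LISTENERS; i++) {
        if (listeners[i].func == NULL) {
            if (free_slot < 0) { free_slot = i; }
        } else if (listener_equal(&listeners[i], &l)) {
            return false;
        }
    }
    if (free_slot < 0) {
        return false;
    }
    listeners[free_slot] = l;
    return true;
}

// Returns true if a matching listener was found and removed.
static bool remove_listener(listener_t l) {
    for (int i = 0; i < MAX_LISTENERS; i++) {
        if (listeners[i].func && listener_equal(&listeners[i], &l)) {
            listeners[i].func = NULL;
            listeners[i].user_data = NULL;
            return true;
        }
    }
    return false;
}

// Would run once per frame (from inside sg_commit() in the real library).
static void notify_listeners(void) {
    for (int i = 0; i < MAX_LISTENERS; i++) {
        if (listeners[i].func) {
            listeners[i].func(listeners[i].user_data);
        }
    }
}

// Example listener: counts frames through its user_data pointer.
static void count_frames(void* user_data) {
    (*(int*)user_data)++;
}
```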

Resource Creation and Destruction in Detail

The 'vanilla' way to create resource objects is with the 'make functions':

sg_buffer sg_make_buffer(const sg_buffer_desc* desc)
sg_image sg_make_image(const sg_image_desc* desc)
sg_sampler sg_make_sampler(const sg_sampler_desc* desc)
sg_shader sg_make_shader(const sg_shader_desc* desc)
sg_pipeline sg_make_pipeline(const sg_pipeline_desc* desc)
sg_view sg_make_view(const sg_view_desc* desc)

This will result in one of three cases:

1. The returned handle is invalid. This happens when there are no more free slots in the resource pool for this resource type. An invalid handle is associated with the INVALID resource state, for instance:

sg_buffer buf = sg_make_buffer(...);
if (sg_query_buffer_state(buf) == SG_RESOURCESTATE_INVALID) {
    // buffer pool is exhausted
}

2. The returned handle is valid, but creating the underlying resource has failed for some reason. This results in a resource object in the FAILED state. The reason why resource creation has failed differs by resource type. Look for log messages with more details. A failed resource state can be checked with:

sg_buffer buf = sg_make_buffer(...);
if (sg_query_buffer_state(buf) == SG_RESOURCESTATE_FAILED) {
    // creating the resource has failed
}

3. And finally, if everything goes right, the returned resource is in resource state VALID and ready to use. This can be checked with:

sg_buffer buf = sg_make_buffer(...);
if (sg_query_buffer_state(buf) == SG_RESOURCESTATE_VALID) {
    // creating the resource has succeeded
}

Resource States

When calling the 'make functions', the created resource goes through a number of states:

  • INITIAL: the resource slot associated with the new resource is currently free (technically, there is no resource yet, just an empty pool slot)
  • ALLOC: a handle for the new resource has been allocated, this just means a pool slot has been reserved.
  • VALID or FAILED: in VALID state any 3D API backend resource objects have been successfully created, otherwise if anything went wrong, the resource will be in FAILED state.

Two-Step Resource Creation

Sometimes it makes sense to first grab a handle, but initialize the underlying resource at a later time. For instance when loading data asynchronously from a slow data source, you may know what buffers and textures are needed at an early stage of the loading process, but actually loading the buffer or texture content can only be completed at a later time.

For such situations, sokol-gfx resource objects can be created in two steps. You can allocate a handle upfront with one of the 'alloc functions':

sg_buffer sg_alloc_buffer(void)
sg_image sg_alloc_image(void)
sg_sampler sg_alloc_sampler(void)
sg_shader sg_alloc_shader(void)
sg_pipeline sg_alloc_pipeline(void)
sg_view sg_alloc_view(void)

This will return a handle with the underlying resource object in the ALLOC state:

sg_image img = sg_alloc_image();
if (sg_query_image_state(img) == SG_RESOURCESTATE_ALLOC) {
    // allocating an image handle has succeeded, otherwise
    // the image pool is full
}

Such an 'incomplete' handle can be used in most sokol-gfx rendering functions without doing any harm — sokol-gfx will simply skip any rendering operation that involves resources which are not in VALID state.

At a later time (for instance once the texture has completed loading asynchronously), the resource creation can be completed by calling one of the 'init functions'. Those functions take an existing resource handle and 'desc struct':

void sg_init_buffer(sg_buffer buf, const sg_buffer_desc* desc)
void sg_init_image(sg_image img, const sg_image_desc* desc)
void sg_init_sampler(sg_sampler smp, const sg_sampler_desc* desc)
void sg_init_shader(sg_shader shd, const sg_shader_desc* desc)
void sg_init_pipeline(sg_pipeline pip, const sg_pipeline_desc* desc)
void sg_init_view(sg_view view, const sg_view_desc* desc)

The init functions expect a resource in ALLOC state, and after the function returns, the resource will be in either the VALID or the FAILED state. Calling an 'alloc function' followed by the matching 'init function' is equivalent to calling the 'make function' alone.

Two-Step Resource Destruction

Destruction can also happen as a two-step process. The 'uninit functions' will put a resource object from the VALID or FAILED state back into the ALLOC state:

void sg_uninit_buffer(sg_buffer buf)
void sg_uninit_image(sg_image img)
void sg_uninit_sampler(sg_sampler smp)
void sg_uninit_shader(sg_shader shd)
void sg_uninit_pipeline(sg_pipeline pip)
void sg_uninit_view(sg_view view)

Calling the 'uninit functions' with a resource that is not in the VALID or FAILED state is a no-op.

To finally free the pool slot for recycling, call the 'dealloc functions':

void sg_dealloc_buffer(sg_buffer buf)
void sg_dealloc_image(sg_image img)
void sg_dealloc_sampler(sg_sampler smp)
void sg_dealloc_shader(sg_shader shd)
void sg_dealloc_pipeline(sg_pipeline pip)
void sg_dealloc_view(sg_view view)

Calling the 'dealloc functions' on a resource that's not in ALLOC state is a no-op, but will generate a warning log message.

Calling an 'uninit function' and a 'dealloc function' in sequence is equivalent to calling the associated 'destroy function':

void sg_destroy_buffer(sg_buffer buf)
void sg_destroy_image(sg_image img)
void sg_destroy_sampler(sg_sampler smp)
void sg_destroy_shader(sg_shader shd)
void sg_destroy_pipeline(sg_pipeline pip)
void sg_destroy_view(sg_view view)

The 'destroy functions' can be called on resources in any state and generally do the right thing (for instance if the resource is in ALLOC state, the destroy function will be equivalent to the 'dealloc function' and skip the 'uninit part').

Fail Functions

And finally to close the circle, the 'fail functions' can be called to manually put a resource in ALLOC state into the FAILED state:

void sg_fail_buffer(sg_buffer buf)
void sg_fail_image(sg_image img)
void sg_fail_sampler(sg_sampler smp)
void sg_fail_shader(sg_shader shd)
void sg_fail_pipeline(sg_pipeline pip)
void sg_fail_view(sg_view view)

This is recommended if anything went wrong outside of sokol-gfx during asynchronous resource setup (for instance a file loading operation failed). In this case, the 'fail function' should be called instead of the 'init function'.

Calling a 'fail function' on a resource that's not in ALLOC state is a no-op, but will generate a warning log message.

NOTE: two-step resource creation usually only makes sense for buffers, images and views, but not for samplers, shaders or pipelines. Most notably, trying to create a pipeline object with a shader that's not in VALID state will trigger a validation layer error, or if the validation layer is disabled, result in a pipeline object in FAILED state.
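
The state transitions described for the make, alloc, init, fail, uninit, dealloc and destroy functions can be summarised in a small stand-alone model. This is only a sketch for reasoning about the lifecycle, not sokol-gfx code: the res_state enum and all model_* functions are hypothetical names. The model also simplifies two things: it treats every call made in the "wrong" state as a silent no-op (the real library may log a warning or run validation checks instead), and it ignores the case of a full resource pool.

```c
#include <stdbool.h>

/* Hypothetical model of the documented resource states (not sokol-gfx code). */
typedef enum { RES_INITIAL, RES_ALLOC, RES_VALID, RES_FAILED } res_state;

/* 'alloc function': reserve a free pool slot (INITIAL -> ALLOC). */
static res_state model_alloc(res_state s) {
    return (s == RES_INITIAL) ? RES_ALLOC : s;
}

/* 'init function': ALLOC -> VALID if backend object creation succeeded,
   otherwise ALLOC -> FAILED. */
static res_state model_init(res_state s, bool backend_ok) {
    if (s != RES_ALLOC) {
        return s;
    }
    return backend_ok ? RES_VALID : RES_FAILED;
}

/* 'fail function': manually move an ALLOC resource into FAILED. */
static res_state model_fail(res_state s) {
    return (s == RES_ALLOC) ? RES_FAILED : s;
}

/* 'uninit function': VALID or FAILED -> ALLOC, otherwise a no-op. */
static res_state model_uninit(res_state s) {
    return (s == RES_VALID || s == RES_FAILED) ? RES_ALLOC : s;
}

/* 'dealloc function': ALLOC -> INITIAL (the pool slot becomes free again). */
static res_state model_dealloc(res_state s) {
    return (s == RES_ALLOC) ? RES_INITIAL : s;
}

/* 'make function' = alloc followed by init. */
static res_state model_make(res_state s, bool backend_ok) {
    return model_init(model_alloc(s), backend_ok);
}

/* 'destroy function' = uninit followed by dealloc, callable in any state. */
static res_state model_destroy(res_state s) {
    return model_dealloc(model_uninit(s));
}
```

In this model, model_destroy(RES_ALLOC) skips the uninit step and only frees the slot, which matches the behaviour described for the 'destroy functions'.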

WebGPU Caveats

For a general overview and design notes of the WebGPU backend see:

https://floooh.github.io/2023/10/16/sokol-webgpu.html

In general, don't expect an automatic speedup when switching from the WebGL2 backend to the WebGPU backend. Some WebGPU functions currently actually have a higher CPU overhead than similar WebGL2 functions, leading to the paradoxical situation that some WebGPU code may be slower than similar WebGL2 code.

WGSL Bind-Slot Convention

When writing WGSL shader code by hand, a specific bind-slot convention must be used.

All uniform block structs must use @group(0) and bindings in the range 0..15:

@group(0) @binding(0..15)

All textures, samplers, storage-buffers and storage-images must use @group(1) and bindings must be in the range 0..127:

@group(1) @binding(0..127)

Note that the number of texture, sampler, storage-buffer and storage-image bindings is still limited despite the large bind range:

  • Up to 16 textures and samplers across all shader stages
  • Up to 8 storage buffers across all shader stages
  • Up to 4 storage images on the compute shader stage

If you use sokol-shdc to generate WGSL shader code, you don't need to worry about the above binding conventions since sokol-shdc will allocate the WGSL bindslots.
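
For hand-written WGSL, the group and binding ranges above can be checked mechanically, for instance in a build-time tool. The following is a minimal sketch of such a check; the wgsl_res_kind enum and the wgsl_slot_ok() function are hypothetical helpers and not part of sokol-gfx. It only checks the group/binding ranges, not the separate limits on the number of bindings.

```c
#include <stdbool.h>

/* Hypothetical classification of WGSL resource bindings (not a sokol-gfx type). */
typedef enum {
    WGSL_UNIFORM_BLOCK,
    WGSL_TEXTURE,
    WGSL_SAMPLER,
    WGSL_STORAGE_BUFFER,
    WGSL_STORAGE_IMAGE
} wgsl_res_kind;

/* Returns true if (group, binding) follows the documented convention:
   uniform blocks:  @group(0), binding 0..15
   everything else: @group(1), binding 0..127 */
static bool wgsl_slot_ok(wgsl_res_kind kind, int group, int binding) {
    if (kind == WGSL_UNIFORM_BLOCK) {
        return group == 0 && binding >= 0 && binding <= 15;
    }
    return group == 1 && binding >= 0 && binding <= 127;
}
```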

Uniform Buffer Size

The sokol-gfx WebGPU backend uses the sg_desc.uniform_buffer_size item to allocate a single per-frame uniform buffer which must be big enough to hold all data written by sg_apply_uniforms() during a single frame, including a worst-case 256-byte alignment (i.e. each sg_apply_uniforms() call will cost at least 256 bytes of uniform buffer size). The default size is 4 MB, which is enough for 16384 sg_apply_uniforms() calls per frame (assuming the uniform data 'payload' is at most 256 bytes per call). These rules are the same as for the Metal backend, so if you are already using the Metal backend you'll be fine.
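
The arithmetic behind the 4 MB default can be sketched as follows. This is a rough worst-case estimate under the assumption that every sg_apply_uniforms() call consumes its payload size rounded up to the next multiple of 256 bytes; the ub_* names are hypothetical helpers, and the backend's actual bookkeeping may differ.

```c
#include <stddef.h>

#define UB_ALIGN ((size_t)256)  /* worst-case alignment named in the text */

/* Round n up to the next multiple of 256 bytes. */
static size_t ub_align_up(size_t n) {
    return (n + (UB_ALIGN - 1)) & ~(UB_ALIGN - 1);
}

/* Assumed cost of one sg_apply_uniforms() call: the payload rounded up
   to 256 bytes, and never less than 256 bytes. */
static size_t ub_call_cost(size_t payload_bytes) {
    size_t cost = ub_align_up(payload_bytes);
    return (cost == 0) ? UB_ALIGN : cost;
}

/* Estimated per-frame uniform buffer size for num_calls calls that each
   upload payload_bytes of uniform data. */
static size_t ub_frame_size(size_t num_calls, size_t payload_bytes) {
    return num_calls * ub_call_cost(payload_bytes);
}
```

With these assumptions, ub_frame_size(16384, 64) yields 4194304 bytes, which is the 4 MB default. A result above the default suggests raising sg_desc.uniform_buffer_size.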

Bindgroup Cache

The sokol-gfx WebGPU backend implements a bindgroups cache to prevent excessive creation and destruction of BindGroup objects when calling sg_apply_bindings(). The number of slots in the bindgroups cache is set via sg_desc.wgpu.bindgroups_cache_size when calling sg_setup(). The cache size must be a power of 2, with the default being 1024. The bindgroups cache behaviour can be observed by calling the function sg_query_stats(), where the following struct items are of interest:

.wgpu.num_bindgroup_cache_hits
.wgpu.num_bindgroup_cache_misses
.wgpu.num_bindgroup_cache_collisions
.wgpu.num_bindgroup_cache_invalidates
.wgpu.num_bindgroup_cache_vs_hash_key_mismatch

The value to pay attention to is .wgpu.num_bindgroup_cache_collisions. If this number is consistently higher than a few percent of the .wgpu.num_set_bindgroup value, it might be a good idea to bump the bindgroups cache size to the next power-of-2.
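
The rule of thumb above can be expressed as a small check. This is a sketch only: the function names are hypothetical, the two counters are passed in as plain integers rather than read from the stats struct, and the threshold is a parameter because "a few percent" is not a fixed number. It evaluates a single sample, so to follow the "consistently higher" advice it should be applied over many frames.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if n is a valid (power-of-2) bindgroups cache size. */
static bool cache_size_is_pow2(uint32_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

/* True if the collision count exceeds threshold_percent of the
   set-bindgroup count. Uses 64-bit math to avoid overflow. */
static bool cache_should_grow(uint32_t collisions,
                              uint32_t set_bindgroup_calls,
                              uint32_t threshold_percent) {
    if (set_bindgroup_calls == 0) {
        return false;
    }
    return (uint64_t)collisions * 100u >
           (uint64_t)set_bindgroup_calls * threshold_percent;
}

/* Next power-of-2 size, assuming cur is already a power of 2. */
static uint32_t cache_next_size(uint32_t cur) {
    return cur * 2u;
}
```

For example, with an assumed threshold of 5 percent, 100 collisions out of 1000 set-bindgroup calls would suggest growing a 1024-slot cache to 2048 slots.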

Other Caveats

  • sg_apply_viewport(): WebGPU currently has a unique restriction that viewport rectangles must be contained entirely within the framebuffer. As a shitty workaround, sokol_gfx.h will clip incoming viewport rectangles against the framebuffer, but this will distort the clipspace-to-screenspace mapping. There's no proper way to handle this inside sokol_gfx.h — this must be fixed in a future WebGPU update (see: https://github.com/gpuweb/gpuweb/issues/373 and https://github.com/gpuweb/gpuweb/pull/5025).
  • The sokol shader compiler generally adds diagnostic(off, derivative_uniformity); into the WGSL output. Currently only the Chrome WebGPU implementation seems to recognize this.
  • The following sokol-gfx pixel formats are not supported in WebGPU: R16, R16SN, RG16, RG16SN, RGBA16, RGBA16SN. Unlike unsupported vertex formats, unsupported pixel formats can be queried in cross-backend code via sg_query_pixelformat() though.
  • The Emscripten WebGPU shim currently doesn't support the Closure minification post-link-step (i.e. the emcc arguments --closure 1 or --closure 2 will currently generate broken JavaScript code).
  • sokol-gfx requires the WebGPU device feature depth32float-stencil8 to be enabled (this should be widely supported).
  • sokol-gfx expects that the WebGPU device feature float32-filterable is not enabled (since this would exclude all iOS devices).
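
To illustrate the viewport restriction mentioned in the sg_apply_viewport() item above, the following sketch clips a viewport rectangle against the framebuffer. It is not the sokol_gfx.h implementation: the rect_i type and the clip_viewport() function are hypothetical, and the code assumes a top-left origin with non-negative width and height.

```c
/* Hypothetical integer rectangle (not a sokol-gfx type). */
typedef struct { int x, y, w, h; } rect_i;

static int clampi(int v, int lo, int hi) {
    return (v < lo) ? lo : ((v > hi) ? hi : v);
}

/* Clip the viewport (x, y, w, h) so that it lies entirely inside a
   framebuffer of size fb_w x fb_h. */
static rect_i clip_viewport(int x, int y, int w, int h, int fb_w, int fb_h) {
    int x0 = clampi(x, 0, fb_w);
    int y0 = clampi(y, 0, fb_h);
    int x1 = clampi(x + w, 0, fb_w);
    int y1 = clampi(y + h, 0, fb_h);
    rect_i r = { x0, y0, x1 - x0, y1 - y0 };
    return r;
}
```

A viewport of (-10, -10, 100, 100) on a 64x64 framebuffer becomes (0, 0, 64, 64). Because the clipped rectangle has a different origin and size than the requested one, the same clip-space coordinates end up at different screen positions, which is the distortion the item above refers to.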