RenderingPipeline

from geometry to pixels

Partially Resident Textures – AMD_sparse_texture

AMD has just released more information about a new feature of the Radeon 7xxx GPUs: Partially Resident Textures.

PRT is a new way of getting hardware support for MegaTextures / virtual texturing, as implemented in Rage right now, for example, and as planned (with PRT support) for Doom 4.

The new details are:

  • The functionality will be exposed as the OpenGL extension AMD_sparse_texture, so there is no vendor-independent extension :-( Let's see what NVIDIA comes up with and hope that it won't be too different (and that we will get an EXT or even ARB version soon). The extension itself is not online yet at the OpenGL registry.
  • Each texture tile is 64 KB, independent of the texture type (compressed textures are also supported). For each tile the GPU must store a 4 DWORD (16 byte) page table entry, so a 128k*128k RGB texture (uncompressed) would need (128k*128k*3)/64k = 786432 entries, or 786432 * 16 bytes = 12 MiB for the page table. The 64 KB per tile is a constant on Southern Islands GPUs; I guess this could change over time. AMD's OpenGL extension for PRTs exposes a way to get the tile size in texels via glGetInternalformativ with the new tokens GL_VIRTUAL_PAGE_SIZE_X_AMD, GL_VIRTUAL_PAGE_SIZE_Y_AMD and GL_VIRTUAL_PAGE_SIZE_Z_AMD.
  • Supporting PRTs will need more work on the application side than I thought: the texture lookup is done with a new fetch command in GLSL, but if the requested tile is not available, an error code is returned. This code has to be transferred to the application (e.g. via an additional render target) and the application has to tell the driver to update the tile (I was hoping for a more automatic way here).
  • The old texture sampling functions in GLSL will still work, but to get the special status codes (e.g. cache miss) you will need the new function versions: sparseTexture() et al. The sampled color is returned through a new inout parameter and the function's return value is the status code:
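    uniform sampler2D prt;
    vec2 coord = ...;
    // classic lookup: still works, but reports no residency status
    vec4 oldStyle = texture( prt, coord );
    // sparse lookup: color comes back via the inout parameter
    vec4 newStyle;
    int returnCode = sparseTexture( prt, coord, newStyle );
    if ( !sparseTexelResident( returnCode ) ) {
      // cache miss: fall back to a LOD that is known to be resident
      returnCode = sparseTextureLod( prt, coord, lastKnownResidentLOD, newStyle );
      // inform the app to swap in more texture tiles!
    }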

    The format of the return code is not defined; it has to be decoded with extra functions to check whether the texture fetch was OK or there was a cache miss.
    (The sparse* variants of the functions will also work with 'classic' textures.)

  • After generating and binding a new texture, the new function glTexStorageSparseAMD() defines this texture as a partially resident texture (example from AMD's slides):
    GLuint tex;
    glGenTextures( 1, &tex );
    glBindTexture( GL_TEXTURE_2D, tex);
    glTexStorageSparseAMD( GL_TEXTURE_2D, GL_RGBA, 1024, 1024, 1, 1, GL_TEXTURE_STORAGE_SPARSE_BIT_AMD );
    glTexSubImage2D( GL_TEXTURE_2D, 0, 0, 0, 1024, 1024, GL_RGBA, GL_UNSIGNED_BYTE, data );

    glTexStorageSparseAMD replaces glTexStorage2D here.

  • To allow the shader to always render something, at least some low LOD levels have to be resident all the time. This way the rendering will just be blurry in case of a cache miss. Instead of figuring out which tiles these are and loading them manually at application start, the sparse_texture extension helps us with this handy call: glTexParameteri( GL_TEXTURE_2D, GL_MIN_WARNING_LOD_AMD, N ), with N being the highest LOD level that should always be resident. If this level gets fetched later in the shader, a warning flag is returned in addition to the texel value, which can be used to prefetch higher MipMap levels in advance (check the return code with sparseTexelMinLodWarning( returnCode )).
  • Rendering to virtual textures will also be possible: just attach a partially resident texture to a framebuffer object like any other render target. If a fragment would be rendered to a tile that is not resident, that fragment gets dropped.
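The page table arithmetic from the tile size point above is easy to check in a few lines of C. This is only a sketch under the numbers quoted from AMD (64 KB per tile, 4 DWORDs per entry); the helper names tile_count and page_table_bytes are made up for this post and are not part of any API.

```c
#include <stdint.h>

/* Assumptions taken from the numbers quoted above (Southern Islands):
   64 KB per tile and a 4 DWORD (16 byte) page table entry per tile. */
#define TILE_BYTES  (64u * 1024u)
#define ENTRY_BYTES (4u * 4u)

/* Hypothetical helper: tiles needed for one mip level of a 2D texture. */
static uint64_t tile_count(uint64_t width, uint64_t height,
                           uint64_t bytes_per_texel)
{
    uint64_t total_bytes = width * height * bytes_per_texel;
    return (total_bytes + TILE_BYTES - 1) / TILE_BYTES; /* round up */
}

/* Hypothetical helper: page table size in bytes for that level. */
static uint64_t page_table_bytes(uint64_t width, uint64_t height,
                                 uint64_t bytes_per_texel)
{
    return tile_count(width, height, bytes_per_texel) * ENTRY_BYTES;
}
```

For the 128k*128k RGB example, tile_count( 131072, 131072, 3 ) gives the 786432 entries from above. Mip levels are ignored here, so a full mip chain would add roughly another third.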

There are some limitations:

  • The texture size has to be a multiple of the tile size (good thing the glGetInternalformativ query described earlier also works before the first PRT has been created).
  • Texture buffer objects can't be PRTs, but AMD also announced that we will get an extension for that as well :-)
  • No depth or stencil formats and no multisampling support for PRTs.
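Because of the first limitation, an application will probably round its texture dimensions up to the next tile boundary before creating the PRT. A minimal sketch in C, assuming the tile size in texels has already been queried; round_up_to_tile is a hypothetical helper name, and the tile sizes in the usage note are placeholders, not values published by AMD.

```c
#include <stdint.h>

/* Hypothetical helper: round a texture dimension (in texels) up to the
   next multiple of the tile dimension. tile must be greater than 0. */
static uint32_t round_up_to_tile(uint32_t size, uint32_t tile)
{
    uint32_t remainder = size % tile;
    return remainder == 0 ? size : size + (tile - remainder);
}
```

With a placeholder tile width of 128 texels, a requested width of 1000 would become 1024, while 1024 stays unchanged.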

 

I hope the actual extension text will be out in the near future as well, but so far it looks quite simple to use. It is more manual work but also very flexible. I was hoping for automatic loading of missing tiles by the GPU, transparent to the application (maybe just as an option): in case of a cache miss the TMU would just return a lower MipMap level until the correct data is available.

Setting up partially resident textures is simple; the bigger job will be getting the cache misses reported back to the CPU and uploading the missing parts. Rendering to an additional render target means readbacks, scanning the buffer, maybe sorting it, etc. Alternatively the shader could write the misses to a texture of limited size (OpenGL 4.2 to the rescue!) and count the misses with an atomic counter (this would limit the uploads per frame, but that's OK).
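To make the CPU side of this a bit more concrete, here is a rough sketch in C of what could happen once the miss list has been read back: sort it, drop duplicates and cap the number of uploads per frame. Everything here is an assumption for illustration: the TileMiss record layout, the function names, and the choice to serve coarser MipMap levels first are mine, not something AMD has specified. The readback itself and the actual tile upload are left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record the shader would write for each cache miss. */
typedef struct {
    uint16_t tile_x;
    uint16_t tile_y;
    uint16_t lod;
} TileMiss;

/* Order: coarser levels (larger lod) first, then by tile_y, then tile_x.
   Serving coarse levels first is an assumption, not part of the extension. */
static int compare_miss(const void *a, const void *b)
{
    const TileMiss *ma = a;
    const TileMiss *mb = b;
    if (ma->lod != mb->lod)       return ma->lod > mb->lod ? -1 : 1;
    if (ma->tile_y != mb->tile_y) return ma->tile_y < mb->tile_y ? -1 : 1;
    if (ma->tile_x != mb->tile_x) return ma->tile_x < mb->tile_x ? -1 : 1;
    return 0;
}

/* Sorts the misses in place, removes duplicates and returns how many of
   the leading entries should be uploaded this frame (at most max_uploads). */
static size_t collect_upload_requests(TileMiss *misses, size_t count,
                                      size_t max_uploads)
{
    if (count == 0) return 0;
    qsort(misses, count, sizeof *misses, compare_miss);

    size_t unique = 1;
    for (size_t i = 1; i < count; ++i) {
        /* After sorting, duplicates are adjacent to the last kept entry. */
        if (compare_miss(&misses[i], &misses[unique - 1]) != 0) {
            misses[unique++] = misses[i];
        }
    }
    return unique < max_uploads ? unique : max_uploads;
}
```

The cap plays the same role as the limited-size miss texture: requests that do not fit into this frame will simply show up again as misses in the next one.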

One thing left is to figure out which name will stick: sparse texture, partially resident texture, virtual texture? AMD didn't help by introducing two of these terms itself within a short period of time…

Update: The spec is now available.
