Low-Level GPU Documentation
While programming graphics applications means programming against an API that abstracts away the actual hardware (OpenGL, Direct3D), it can still be interesting to dig a bit deeper. Having a good understanding of the hardware (and its limitations!) can help you optimize your code and also understand the limits of the API. Or maybe you’re just curious what happens below your API calls.
This is a growing collection of (mostly) low-level GPU documentation I have stumbled across. It is sorted first by vendor, then roughly by hardware release / generation (newest chips at the bottom). To give a rough feeling for each chip’s feature set, I tried to add the highest supported OpenGL version; 4.2+ means that the chip will probably also support newer, not yet defined OpenGL versions.
Note 1: Fabian ‘ryg’ Giesen wrote some very good articles about the graphics pipeline named “A trip through the Graphics Pipeline 2011” (don’t mix that up with “A trip down the graphics pipeline” by Jim Blinn). He describes what goes on in an average DX11 graphics card, from your program through the API and driver down to the bare metal. You might want to read these articles first before looking up the specifics of certain chips below.
Note 2: If all you are interested in is writing better shader code by understanding the hardware better, the talk “Low-Level Thinking in High-Level Shading Languages” given by Emil Persson (aka Humus) at GDC 2013 is a great start.
ATI / AMD GPUs:
R300 register listing (2002, e.g. ATI Radeon 9800, OpenGL 2.0)
ATI Radeon X1950 and others, ca. 2005, OpenGL 2.0:
- R500 very detailed information of the architecture
- M56 register listing (X1600, a R500 chip)
- Radeon 1x00 programming guide
- ATI OpenGL Programming and Optimization Guide (October 2005)
ATI Radeon HD 2900 and others, ca. 2006, OpenGL 3.3:
- R600 overview slides (ppt)
- R600 ISA overview slides (ppt)
- R600 ISA in Detail
- R6xx 3D Register listing
- RV630 register listing
- RS690 register listing
- M76 register listing (Mobility Radeon HD 2600/2700/3600/3800)
- HD2000 Programming Guide – high-level documentation targeted at graphics programmers
ATI Radeon HD 3800/4800, ca. 2008, OpenGL 3.3:
- R700 overview slides (ppt)
- R700 ISA
- R600 / R700 / Evergreen Intermediate Language Spec
- R600 / R700 / Evergreen Assembly Language Format
- R600 / R700 register listing
- R600 / R700 very detailed architecture overview
- Evergreen Family ISA
- Evergreen 3D Registers v2
Heterogeneous Computing OpenCL and the ATI Radeon HD 5870 Architecture – starting on slide 47, there is a high-level introduction to the ‘Evergreen’ / ‘Southern Islands’ architecture
AMD Radeon HD 69xx, 2010, OpenGL 4.2:
AMD Accelerated Parallel Processing OpenCL Programming Guide (OpenCL guides often give useful information about the GPUs that can be used; the same is true for CUDA guides!)
AMD Radeon HD 7xxx, codename Southern Islands (aka Graphics Core Next, GCN, 2011, e.g. AMD 7970, OpenGL 4.2+):
- Whitepaper
- Southern Islands ISA
- Southern Islands 3D Registers
- Sea Islands 3D Registers
- Southern Islands / Sea Islands Programming Guide
Description of the HDA Audio on Radeon GPUs – updated as a PDF here.
AMD Radeon HD 8xxx (GCN Generation 3, codename Volcanic Islands):
NVIDIA GPUs:
NVIDIA GeForce 7 Programming Guide
G80 Programming Guide – high-level documentation targeted at graphics programmers (2006, e.g. GeForce 8800, OpenGL 3.3)
NVIDIA GeForce 8800 GPU Architecture Overview
Fermi:
Fermi Whitepaper (2010, e.g. GeForce 580, OpenGL 4.2)
Kepler:
Kepler Whitepaper (2012, e.g. GeForce 680, OpenGL 4.2+)
Kepler GK110 Compute Whitepaper (2012, CUDA-focused)
Device Control Block 4.0 Specification.
Maxwell:
NVIDIA GeForce GTX 750 Ti Whitepaper
NVIDIA GeForce GTX 980 Whitepaper (GM204)
General information:
Low-level Linux tools to inspect the binary NV drivers
NV GPU docs and wiki by the open source community.
Intel GPUs:
Intel 965 (OpenGL 1.x) (x.org | 01.org):
- Volume One: Graphics Core
- Volume Two: 3D/Media
- Volume Three: Display Registers
- Volume Four: Subsystem and Cores
Intel Graphics Media Accelerator X3000 and 3000 White Paper
GMA X4500 (Intel G45 chipset, OpenGL 3) (x.org | 01.org):
- G45: Volume 1a Graphics Core
- G45: Volume Two: 3D/Media
- G45: Volume Three: Display Registers
- G45: Volume Four: Subsystem and Cores
Intel HD (2010 Core™ i7/i5/i3) (01.org):
- Volume 1 Part 1: Graphics Core
- Volume 1 Part 2: Graphics Core – MMIO, Media Registers & Programming Environment
- Volume 1 Part 3: Graphics Core – Memory Interface and Commands for the Render Engine
- Volume 1 Part 4: Graphics Core – Video Codec Engine
- Volume 1 Part 5: Graphics Core – Blitter Engine
- Volume 2 Part 1: 3D/Media – 3D Pipeline
- Volume 2 Part 2: 3D/Media – Media
- Volume 3 Part 1: Display Registers – VGA Registers
- Volume 3 Part 2: Display Registers – CPU Registers
- Volume 3 Part 3: PCH Display Registers
- Volume 4 Part 1: Subsystem and Cores – Shared Functions
- Volume 4 Part 2: Subsystem and Cores – Message Gateway, URB, Video Motion, and ISA
Sandybridge (Intel HD 2000 / HD 3000, OpenGL 3.2) (x.org | 01.org):
- Volume 1 Part 1: Graphics Core
- Volume 1 Part 2: Graphics Core – MMIO, Media Registers & Programming Environment
- Volume 1 Part 3: Graphics Core – Memory Interface and Commands for the Render Engine
- Volume 1 Part 4: Graphics Core – Video Codec Engine
- Volume 1 Part 5: Graphics Core – Blitter Engine
- Volume 2 Part 1: 3D/Media – 3D Pipeline
- Volume 2 Part 2: 3D/Media – Media
- Volume 3 Part 1: Display Registers – VGA Registers
- Volume 3 Part 2: Display Registers – CPU Registers
- Volume 3 Part 3: PCH Display Registers
- Volume 4 Part 1: Subsystem and Cores – Shared Functions
- Volume 4 Part 2: Subsystem and Cores – Message Gateway, URB, Video Motion, and ISA
Ivy Bridge (Intel HD 2500 / HD 4000, OpenGL 4.0) (01.org):
- Volume 1 Part 1: Graphics Core
- Volume 1 Part 2: Graphics Core – MMIO, Media Registers & Programming Environment
- Volume 1 Part 3: Graphics Core – Memory Interface and Commands for the Render Engine
- Volume 1 Part 4: Graphics Core – Blitter Engine
- Volume 1 Part 5: Graphics Core – Video Codec Engine Command Streamer
- Volume 1 Part 6: GT Interface Register
- Volume 1 Part 7: L3$/URB
- Volume 2 Part 1: 3D/Media – 3D Pipeline
- Volume 2 Part 2: Media and General Purpose Pipeline
- Volume 2 Part 3: Multi-Format Transcoder – MFX
- Volume 3 Part 1: VGA and Extended VGA Registers
- Volume 3 Part 2: PCI Registers
- Volume 3 Part 3: North Display Engine
- Volume 3 Part 4: South Display Engine
- Volume 4 Part 1: Subsystem and Cores – Shared Functions
- Volume 4 Part 2: Subsystem and Cores – Message Gateway, URB, Video Motion Estimation, Pixel Interpolator
- Volume 4 Part 3: Execution Unit ISA
Intel Haswell (2013 Intel HD 4400 / HD 4600 / Intel Iris Pro 5200, OpenGL 4.x) (01.org):
- bspec-live-opensource-hsw_0.pdf
- HSW – Volume 1: Preface and Introduction
- HSW – Volume 2a: Command Reference: Enumerations
- HSW – Volume 2b: Command Reference: Instructions (Command Opcodes)
- HSW – Volume 2c: Command Reference: Registers
- HSW – Volume 2d: Command Reference: Structures
- HSW – Volume 3: GPU Overview
- HSW – Volume 4: Configurations
- HSW – Volume 5: Memory Views
- HSW – Volume 6: Command Stream Programming
- HSW – Volume 7: 3D Media GPGPU
- HSW – Volume 8: Media VDBOX
- HSW – Volume 9: Media VEBOX
- HSW – Volume 10: Blitter
- HSW – Volume 11a: Display
- HSW – Volume 11b: Display Watermark Guide
- HSW – Volume 12: PCIE Configuration Registers
ImgTec:
PowerVR MBX Tile-based rendering architecture overview (ImgTec MBX, OpenGL ES 1.1)
PowerVR SGX Architecture Guide for Developers (ImgTec SGX, OpenGL ES 2.0)
PowerVR performance recommendations (ImgTec SGX, OpenGL ES 2.0)
Various Graphics documents by ImgTec
The PowerVR SDK contains some architecture guides and whitepapers.
Other GPUs:
The Broadcom VideoCore IV, as used in the Raspberry Pi, is being reverse engineered. Broadcom also published the specifications for VideoCore IV.
If you’re interested in a little history lesson, Texas Instruments still has the specs for its TMS34010 microprocessor online – this was the first programmable graphics processor (but a quite unsuccessful one) from 1985. Wikipedia has some more general information about it.
VIA Chrome 9, a Direct3D 9 GPU
The reverse-engineered open source drivers for the Mali 400 have some documentation.
Other collections / sources:
AMD App SDK
AMD OpenGPU Docs
x.org Docs – mirrors a lot of the specs listed above.
More development-oriented collection: