Memory
Location, on/off chip
Cached
Access
Scope
Lifetime
Note
Register
on
n/a
R/W
1 thread
thread
no latency, no sharing, TB/s
Local
off
>=2.0
Shared
all threads in block
block
i.e: 64KB per block
Global
all threads + host
host allocation
GBs, cudamemcpy, cudamalloc
Constant
yes!
R
1500GB/s
Texture
yes
CUDA memory spaces & scopes
global
local (per-thread global memory)
shared
constant
registers
Last updated 5 years ago