To make data accessible to a processor (CPU or GPU), computers contain a hierarchy of memory modules. What all memory types described here have in common is that they are volatile, meaning their contents are lost without a power supply.

The types of memory modules differ in capacity, speed, cost and power consumption. Memory closer to the CPU is faster and more expensive, and thus smaller in capacity. During processing, data is stored in multiple places along the hierarchy. For example, video images that are processed by a video encoder are stored on disk (slow), a copy resides in main memory (fast), another copy in 3rd-level cache (very fast) and copies of pixel blocks in 1st and 2nd-level caches (ultra fast). Program code and data are loaded into memory when needed and evicted when they are no longer used, for example when a program exits.

Memory Types

Caches are very fast memories close to the CPU, often residing on the same silicon as the CPU. Due to their high cost and power consumption they are limited in capacity (tens of kB for 1st level, a few hundred kB for 2nd level and 2-12 MB for 3rd level caches). The purpose of a cache is to hide the latency of the much slower main memory by buffering often-used data for fast access by the CPU. Some cache levels, typically the 3rd, are shared by multiple CPU cores, but the CPU in coordination with the operating system ensures that no data is shared between different programs or users who run programs on the same CPU.

Main memory, also called dynamic random access memory (DRAM) or RAM for short, is the central working memory inside a computer. DRAM is fast and comparatively cheap, so it is possible to equip computers with 16-512 GB of memory today. Access latencies are in the range of 50-200 ns and memory busses are typically 64 data bits wide, meaning a CPU can read 8 bytes of data with every transfer (DDR memory performs two transfers per clock cycle). DRAM is shared between all programs running on a computer. The CPU and the operating system make sure no data is unintentionally accessed, overwritten or shared between programs.

Graphics RAM: Special types of DRAM technology with higher bandwidths are found in high-performance hardware such as graphics cards and game consoles. GDDR5, for example, which is used in high-end GPUs, is based on DDR3 technology, but has lower power requirements, varying clock frequencies and wider memory busses (256 or 384 bits). This translates into sustained transfer rates of 240 GB/s or more.
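The relation between bus width and transfer rate can be checked with a little arithmetic. The following is a minimal sketch, not vendor data: the function name and the per-pin rate of 5 billion transfers per second are assumptions chosen for illustration, and the result is a theoretical peak figure.

```python
def peak_bandwidth_gb_per_s(bus_width_bits: int, transfers_per_second: float) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_second / 1e9


# Assumption for illustration only: 5 billion transfers per second per pin.
ASSUMED_TRANSFERS_PER_SECOND = 5e9

for width in (256, 384):
    gb_per_s = peak_bandwidth_gb_per_s(width, ASSUMED_TRANSFERS_PER_SECOND)
    print(f"{width}-bit bus: {gb_per_s:.0f} GB/s")
```

Under this assumed rate a 384-bit bus moves 48 bytes per transfer, which works out to 240 GB/s, while a 256-bit bus reaches 160 GB/s.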

Memory Performance Table

When looking at memory performance, the three important metrics are size, throughput and access latency. Server and desktop computers allow you to add more physical memory, up to what the CPU can address. The maximum throughput is determined by the type and clock speed of the memory modules. Memory module types and speeds must match the CPU generation and the motherboard in use. New CPUs generally require newer and faster types of memory. Details can be found in CPU handbooks or motherboard documentation.

Memory Labels

When purchasing memory you get modules, so-called DIMMs (dual in-line memory modules), which are assembled from multiple DRAM chips. Chips and modules are labelled differently. For example, DDR3-xxx denotes DDR chips of the 3rd technology generation with a particular data transfer rate, whereas PC3-xxxx denotes an assembled module with its theoretical bandwidth. Bandwidth is calculated by multiplying the transfer width (64 bits or 8 bytes for DDR3) by the number of transfers per second.

Example: How to read DRAM type labels
Label: DDR3-1600 PC3-12800 DDR3 SO-DIMM
DDR3 | Chip technology generation
1600 | Data transfer rate (1600 M transfers per second)
PC3 | Module technology generation
12800 | Module bandwidth (MB/s)
DDR3 SO-DIMM | Module packaging (small outline 204-pin DIMM)
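The link between the chip label and the module label can be sketched in a few lines. This is an illustrative helper, not an official naming tool: the function name `module_label` is hypothetical, and it assumes the 8-byte transfer width mentioned above. Labels printed on real modules are sometimes rounded, so the computed number may not match the marketed label for every speed grade.

```python
import re

BUS_WIDTH_BYTES = 8  # 64-bit module data bus, as described for DDR3


def module_label(chip_label: str) -> str:
    """Derive a module label such as 'PC3-12800' from a chip label such as 'DDR3-1600'."""
    match = re.fullmatch(r"DDR(\d)-(\d+)", chip_label)
    if match is None:
        raise ValueError(f"unrecognized chip label: {chip_label!r}")
    generation = int(match.group(1))
    mega_transfers = int(match.group(2))  # million transfers per second
    bandwidth_mb_per_s = mega_transfers * BUS_WIDTH_BYTES
    return f"PC{generation}-{bandwidth_mb_per_s}"


print(module_label("DDR3-1600"))  # PC3-12800
```

For the label in the example, 1600 million transfers per second times 8 bytes gives 12800 MB/s, which is the number in PC3-12800.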

Besides capacity and speed, DRAM modules also differ in other features. Error-correcting code (ECC) memory (e.g. PC3-xxxxE) can detect and correct in-memory data corruption caused by electrical or magnetic interference. Registered or buffered DRAM (e.g. PC3-xxxxR / F / FB) uses extra registers for improved signal quality, and low-voltage DRAM (e.g. PC3-xxxx-L / U) reduces power consumption in mobile devices. Not every variant matches every motherboard and CPU type, so consult your documentation first.

Memory Pressure

When programs require more memory than is physically available, some data is offloaded from DRAM to slower disk storage. When a program requires the data again, it is reloaded into DRAM, which may take a considerable amount of time during which the program or even the entire system seems unresponsive. To avoid such situations of high memory pressure, you should monitor the memory requirements of the software you use and try not to run multiple memory-heavy programs in parallel.
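One way to monitor memory use is to read the counters the operating system exposes. The sketch below assumes a Linux system, where `/proc/meminfo` lists fields such as `MemTotal`, `MemAvailable`, `SwapTotal` and `SwapFree`; other operating systems need different tools. The function names are hypothetical.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into a dict of numeric values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_report(info: dict) -> dict:
    """Return the fraction of RAM in use and the amount of swap in use (kB)."""
    used_fraction = 1.0 - info["MemAvailable"] / info["MemTotal"]
    swap_used_kb = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return {"used_fraction": used_fraction, "swap_used_kb": swap_used_kb}


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:  # Linux only
            report = memory_report(parse_meminfo(f.read()))
        print(f"RAM in use: {report['used_fraction']:.0%}, "
              f"swap in use: {report['swap_used_kb']} kB")
    except FileNotFoundError:
        print("/proc/meminfo not found; this sketch only works on Linux")
```

A high fraction of RAM in use together with growing swap usage suggests that data is being offloaded to disk, which is the situation described above.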

Some operating systems (e.g. OS X) have started to compress unused data in memory, trading a few CPU cycles for lower memory pressure, which results in an overall perceived performance improvement.
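The trade-off between CPU time and memory footprint can be illustrated with a general-purpose compressor. This is only a stand-in: operating systems use their own in-kernel compression schemes, and the highly repetitive sample buffer is an assumption that compresses far better than typical program data.

```python
import time
import zlib

# Hypothetical stand-in for idle in-memory data: about 4 MB of repetitive bytes.
data = b"unused page contents " * 200_000

start = time.perf_counter()
compressed = zlib.compress(data, 1)  # level 1 favors speed over compression ratio
elapsed = time.perf_counter() - start

# Compression is lossless: the original bytes can be restored when needed.
assert zlib.decompress(compressed) == data

print(f"{len(data)} bytes -> {len(compressed)} bytes in {elapsed * 1000:.1f} ms")
```

The time spent compressing is the CPU cost, and the difference in size is the memory freed for other programs.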