
ARMv8 Synchronization Model

This article introduces the key elements and the exclusive states of the ARMv8 synchronization model.

ARMv8 defines a synchronization model that provides non-blocking synchronization. This model includes the following key elements:

  • The Load-Exclusive/Store-Exclusive instruction pair and the Clear-Exclusive instruction.
  • The memory block accessed by the above instruction pair.
  • The local exclusive monitor, or local monitor, for non-shareable and shareable memory (1).
  • The global exclusive monitor, or global monitor, for shareable memory.

ARMv8 defines two states for memory blocks accessed by a PE (processing element):

  • Exclusive Access state.
  • Open Access state.

When a memory block is marked as Exclusive Access for a PE, that PE can modify the data of the memory block by issuing a Store-Exclusive instruction.

When a memory block is marked as Open Access for a PE, that PE cannot modify the data of the memory block by issuing a Store-Exclusive instruction.

ARMv8 gives separate state machine diagrams for the local monitor and the global monitor. I took these figures directly from the specification because I think they are clear enough to understand on their own.

Before looking at the figures, here is some advice.
  • Note that the Marked_address in the figures is a memory block, not a single address. See Exclusives Reservation Granule for details.
  • The operations with * at the end are IMPLEMENTATION DEFINED, so there is no need to focus on them unless you want to design an ARMv8-based CPU.

The figure below shows the state machine for the local monitor and the effect of each of the operations shown in the figure.

The below figure shows the state machine for PE(n) in a global monitor.

An exception return clears the local monitor[a]. That means that the marked state of all memory blocks monitored by the local monitor will be set to Open Access state (or another unspecified state, which must not be Exclusive Access state).

This section introduces some of the key elements in the synchronization model.

There are three types of synchronization primitives:

  • Load-Exclusive
    When a PE executes such an instruction, the data at the specified memory address is read and the memory is set to, or held in, the Exclusive Access state for the observer(2).
  • Store-Exclusive
    The result of such an instruction depends on the state of the memory specified by the address parameter, as marked by the monitor.
    • If the memory is marked as Exclusive Access, the store takes place and a status value of 0 is returned to a register.
    • If the memory is marked as Open Access, no store takes place and a status value of 1 is returned to a register.
    When a Store-Exclusive instruction is issued, ARMv8 checks both the state of the monitor and whether the memory is marked by the monitor to determine the result[b][c]. Note that this article summarizes the two checks as a single state of the memory. That is,
    • The Exclusive Access state of memory means that the monitor is in the Exclusive Access state and the memory is marked.
    • The Open Access state of memory means that the monitor is in the Open Access state, or the memory is not marked.
  • Clear-Exclusive
    Such an instruction clears the Exclusive Access state of the monitor, that is, it moves the monitor to the Open Access state.
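The behaviour of the three primitives can be sketched as a small software model of a local monitor. This is only an illustration, not an ARM API: the `local_monitor` type and the function names are hypothetical, the model marks a single exact address instead of a granule-sized block, and it assumes that a Store-Exclusive to an unmarked address fails (the specification leaves that case IMPLEMENTATION DEFINED).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of a local monitor; not an ARM API. */
typedef struct {
    bool      exclusive;   /* true: Exclusive Access, false: Open Access */
    uintptr_t marked;      /* marked address, valid only when exclusive  */
} local_monitor;

/* Load-Exclusive: read the data and mark the address. */
uint32_t load_exclusive(local_monitor *m, const uint32_t *addr)
{
    assert(m != NULL && addr != NULL);
    m->exclusive = true;
    m->marked = (uintptr_t)addr;
    return *addr;
}

/* Store-Exclusive: returns 0 if the store took place, 1 otherwise.
 * Assumption: a store to an unmarked address fails. In every case
 * the monitor ends up in the Open Access state. */
int store_exclusive(local_monitor *m, uint32_t *addr, uint32_t value)
{
    assert(m != NULL && addr != NULL);
    bool ok = m->exclusive && m->marked == (uintptr_t)addr;
    m->exclusive = false;
    if (ok)
        *addr = value;
    return ok ? 0 : 1;
}

/* Clear-Exclusive: move the monitor to the Open Access state. */
void clear_exclusive(local_monitor *m)
{
    assert(m != NULL);
    m->exclusive = false;
}
```

In this model a Store-Exclusive succeeds only when it directly follows a Load-Exclusive to the same address with no intervening Clear-Exclusive, which mirrors the status values 0 and 1 described above.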
The following table lists the synchronization primitives.

Transaction Size  Additional Semantics        Load-Exclusive  Store-Exclusive  Others
Byte              -                           LDXRB           STXRB            -
Byte              Load-Acquire/Store-Release  LDAXRB          STLXRB           -
Halfword          -                           LDXRH           STXRH            -
Halfword          Load-Acquire/Store-Release  LDAXRH          STLXRH           -
Register          -                           LDXR            STXR             -
Register          Load-Acquire/Store-Release  LDAXR           STLXR            -
Pair              -                           LDXP            STXP             -
Pair              Load-Acquire/Store-Release  LDAXP           STLXP            -
None              Clear-Exclusive             -               -                CLREX

Except for the row showing the CLREX instruction, the two instructions in each row form a Load-Exclusive/Store-Exclusive instruction pair.
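A Load-Exclusive/Store-Exclusive pair is normally used in a retry loop: load the value, compute the new value, attempt the exclusive store, and start over if the status is 1. The sketch below shows the same loop shape in portable C11, using `atomic_compare_exchange_weak_explicit` as a stand-in rather than the instructions themselves. On AArch64 targets without atomic instruction extensions, compilers commonly lower such a loop to an LDAXR/STLXR-style sequence, but that is a compiler detail and should be treated as an assumption. The helper name `add_bounded` is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add `delta` to *counter, clamping at `limit`.
 * Returns the value that was stored. */
uint32_t add_bounded(_Atomic uint32_t *counter, uint32_t delta, uint32_t limit)
{
    assert(counter != NULL);
    uint32_t old = atomic_load_explicit(counter, memory_order_relaxed);
    uint32_t new_value;
    do {
        /* Compute the new value from the value just observed. */
        new_value = (old >= limit || limit - old < delta) ? limit
                                                          : old + delta;
        /* The weak compare-exchange may fail spuriously, much like a
         * Store-Exclusive returning status 1. On failure `old` is
         * refreshed with the current value and the loop retries. */
    } while (!atomic_compare_exchange_weak_explicit(
                 counter, &old, new_value,
                 memory_order_acq_rel, memory_order_relaxed));
    return new_value;
}
```

The acquire/release ordering chosen here loosely corresponds to the Load-Acquire/Store-Release rows of the table; a relaxed ordering would correspond to the plain rows.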

Memory Type

There are two types of memory:

  • Non-Shareable memory
    Exclusive access to this type of memory requires only the local monitor.
  • Shareable memory
    Exclusive access to this type of memory requires both the local monitor and the global monitor.

Exclusives Reservation Granule

When the PE executes a Load-Exclusive instruction to load 64-bit data from address A1, the memory block that includes A1 is marked for Exclusive Access. The size and alignment of this memory block are given by the exclusives reservation granule defined in the ARMv8 specification. The exclusives reservation granule is IMPLEMENTATION DEFINED in the range 4-512 words (16 to 2048 bytes, a word being 4 bytes). In some implementations it can be read from a CTR register, e.g. CTR_EL0.
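As a sketch of how the granule could be derived from CTR_EL0: to my understanding of the Arm Architecture Reference Manual, the ERG field occupies bits [23:20] and holds log2 of the granule size in words, with 0 meaning that the size is not reported and the 512-word maximum must be assumed. Verify the field position against the manual before relying on it. The function name `erg_bytes_from_ctr` is hypothetical, and the raw register value is passed in as a parameter so that the example stays portable; on AArch64 it would come from an `mrs` read of `ctr_el0`.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the exclusives reservation granule, in bytes, from a raw
 * CTR_EL0 value. Assumes ERG is bits [23:20] and encodes log2 of the
 * granule size in 4-byte words. */
uint32_t erg_bytes_from_ctr(uint64_t ctr_el0)
{
    uint32_t erg = (uint32_t)((ctr_el0 >> 20) & 0xF);
    if (erg == 0)
        return 512u * 4u;          /* not reported: assume the maximum */
    assert(erg >= 2 && erg <= 9);  /* 4..512 words */
    return (1u << erg) * 4u;       /* words to bytes */
}
```

For example, an ERG field of 4 decodes to 16 words, that is 64 bytes.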

For example, assume that the exclusives reservation granule is 8 words (32 bytes). When the PE executes a Load-Exclusive instruction on address 0x30, as shown below, the memory block [0x20, 0x3F] is marked as Exclusive Access.
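The arithmetic behind this example can be sketched as follows: the address is aligned down to the granule size, and the block extends for one granule from there. The `erg_block` type and the `marked_block` function are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t first;   /* lowest address in the marked block  */
    uint64_t last;    /* highest address in the marked block */
} erg_block;

/* Return the granule-aligned block containing `addr`.
 * `erg_words` is the granule size in 4-byte words and must be a
 * power of two in the range 4..512. */
erg_block marked_block(uint64_t addr, uint32_t erg_words)
{
    assert(erg_words >= 4 && erg_words <= 512);
    assert((erg_words & (erg_words - 1)) == 0);   /* power of two */

    uint64_t size = (uint64_t)erg_words * 4;      /* granule size in bytes */
    erg_block b;
    b.first = addr & ~(size - 1);                 /* align down */
    b.last  = b.first + size - 1;
    return b;
}
```

Calling `marked_block(0x30, 8)` yields `first == 0x20` and `last == 0x3F`, matching the block in the example.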

The local monitor is located in the PE. It is designed to monitor the exclusive status of a memory block(3). No PE can directly change the exclusive state of memory blocks monitored by other PEs.

The global monitor can either reside within the PE, or exist as a secondary monitor at the memory interfaces. It is designed for synchronization between the PEs of a multiprocessor system. The global monitor tracks one memory block for each PE separately, as shown below.

The memory blocks M-A and M-B monitored by the global monitor in the following figure may or may not be the same. If they are the same block, each PE is affected by the store operations issued by the other PE to that common block. This is the key to achieving synchronization.
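The interaction between two PEs can be illustrated with a small software model of a global monitor. This is a sketch under assumptions, not the architecture's definition: the `global_monitor` type, the function names, the two-PE system and the 64-byte granule are all hypothetical, and a failed Store-Exclusive is assumed to leave the monitor state unchanged (the specification leaves some of these transitions IMPLEMENTATION DEFINED).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_PES   2      /* hypothetical system with two PEs */
#define ERG_BYTES 64u    /* assumed granule size in bytes    */

/* Hypothetical model: one marked block per PE. */
typedef struct {
    bool     exclusive[NUM_PES];   /* Exclusive Access state per PE */
    uint64_t block[NUM_PES];       /* granule-aligned marked block  */
} global_monitor;

static uint64_t block_of(uint64_t addr)
{
    return addr & ~(uint64_t)(ERG_BYTES - 1);
}

/* Load-Exclusive by PE `pe`: mark the block for that PE only. */
void gm_load_exclusive(global_monitor *g, int pe, uint64_t addr)
{
    assert(pe >= 0 && pe < NUM_PES);
    g->exclusive[pe] = true;
    g->block[pe] = block_of(addr);
}

/* Store-Exclusive by PE `pe`: returns 0 on success, 1 on failure.
 * A successful store clears every PE's mark on the same block. */
int gm_store_exclusive(global_monitor *g, int pe, uint64_t addr)
{
    assert(pe >= 0 && pe < NUM_PES);
    uint64_t blk = block_of(addr);
    if (!g->exclusive[pe] || g->block[pe] != blk)
        return 1;
    for (int i = 0; i < NUM_PES; i++) {
        if (g->exclusive[i] && g->block[i] == blk)
            g->exclusive[i] = false;
    }
    return 0;
}
```

If both PEs mark the same block and PE 1 stores first, the later Store-Exclusive from PE 0 returns 1 and PE 0 has to retry. If the two PEs mark different blocks, both stores succeed independently.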

Table-1. Local monitor in CPU
CPU     Location              Exclusives Reservation Granule
A72     L1 memory system[d]   64 bytes, one cache line.
A78AE   L1 memory system[e]   64 bytes, one cache line.

(1) Todo: Introduce shareable and non-shareable.
(2) Based on my reading of the ARMv8 specification, I think the observer is the PE. The PE observes the state of the memory block marked by the monitor.
(3) I researched the ARMv8 Spec and it seems to support monitoring only one memory block.
[a] B2.9.4 Context switch support. In Arm Architecture Reference Manual for A-profile architecture. ARM, 2022.
[b] B2.9.1 Exclusive access instructions and Non-shareable memory locations. In Arm Architecture Reference Manual for A-profile architecture. ARM, 2022.
[c] B2.9.2 Exclusive access instructions and Shareable memory locations. In Arm Architecture Reference Manual for A-profile architecture. ARM, 2022.
[d] 6.4.5 Synchronization primitives. In ARM Cortex-A72 MPCore Processor Technical Reference Manual. ARM, 2015.
[e] A6.4.2 Internal exclusive monitor. In Arm® Cortex®‑A78AE Core Technical Reference Manual. ARM, 2023.