~~PAGEIMAGE:arch:arm:a:media:img:global-monitor.svg~~
====== ARMv8 Synchronization Model ======
{{template>meta:template:pageinfo#tpl
|desc=Introduce the elements and exclusive states in the ''ARMv8'' synchronization model.}}
ARMv8 defines a synchronization model that provides non-blocking synchronization.
This model includes the following key elements:
* Load-Exclusive / Store-Exclusive
instruction pair and Clear-Exclusive instruction.
* Memory block accessed by the above instruction pair.
* Local exclusive monitor, or local monitor, for non-shareable and shareable memory
((note:>Todo: Introduce shareable and non-shareable.)).
* Global exclusive monitor, or global monitor, for shareable memory.
ARMv8 defines two states for memory blocks accessed by a PE:
* Exclusive Access state.
* Open Access state.
===== Access States =====
==== Exclusive Access State ====
When a memory block is marked as Exclusive Access for a PE,
this means that this PE can modify the data of that memory block by issuing a
Store-Exclusive instruction.
==== Open Access State ====
When a memory block is marked as Open Access for a PE, this
means that this PE cannot modify the data of that memory block by issuing a
Store-Exclusive instruction.
==== State Machine ====
ARMv8 gives different state machine diagrams for the [[#Local Monitor|local monitor]]
and the [[#Global Monitor|global monitor]]. I took these figures directly from the Spec
because I think they are clear enough to understand.
Before looking at the figures, here is some advice.
* It is important to note that the ''Marked_address'' in the figures is a memory
block. You can figure it out by reading
[[#Exclusives Reservation Granule|Exclusives Reservation Granule]].
* The operations with ''*'' at the end are IMPLEMENTATION DEFINED, so I don't think
you need to focus on them unless you are designing an ARMv8-based CPU.
The figure below shows the state machine for the local monitor and the effect of each
operation on it.
\\ \\
{{:arch:arm:a:media:img:20230520-163716.png?600&direct}}
The figure below shows the state machine for PE(n) in a global monitor.
\\ \\
{{:arch:arm:a:media:img:20230520-164158.png?600&direct}}
An exception return clears the local monitor((:arm-ddi0487-i-a-c2-9-4)).
That means that the marked state of all memory blocks monitored by the local monitor will
be set to Open Access state (or another unspecified state, which must
not be Exclusive Access state).
===== Key Elements =====
This section introduces some of the key elements in the synchronization model.
==== Synchronization Primitives ====
There are three types of synchronization primitives:
* **Load-Exclusive** \\
When a PE executes such an instruction, the data at the specified memory address is
read and the memory is set to, or held in, the Exclusive
Access state for the observer((note:>I researched the ARMv8 Spec and
I think ''observer'' represents the PE. The PE observes the state of the memory
block marked by the monitor.)).
* **Store-Exclusive** \\
The result of the execution of such an instruction is determined by the state
of the memory specified by the address parameter, which is marked by the monitor.
* If memory is marked as the Exclusive Access state, the store
takes place and a status value of ''0'' is returned to a register.
* If memory is marked as the Open Access state, no store takes
place and a status value of ''1'' is returned to a register.
When a Store-Exclusive instruction is issued, ARMv8
checks the state of the monitor and whether the memory is marked by the monitor to
determine the execution result. ((:arm-ddi0487-i-a-b2-9-1))((:arm-ddi0487-i-a-b2-9-2))
Note that I am summarizing these two conditions as
a single state of the memory. That is,
* **//The Exclusive Access state of memory//** means that the
monitor is in the Exclusive Access state and the memory is marked.
* **//The Open Access state of memory//** means that the monitor
is in Open Access state, or the memory is not marked.
* **Clear-Exclusive** \\ Such an instruction clears the
Exclusive Access state of the monitor, which means the monitor moves to the
Open Access state.
The following table lists the synchronization primitives. \\ \\
^ Transaction Size ^ Additional Semantics ^ Load-Exclusive ^ Store-Exclusive ^ Others ^
| Byte | - | LDXRB | STXRB | - |
| ::: | Load-Acquire/Store-Release | LDAXRB | STLXRB | - |
| Halfword | - | LDXRH | STXRH | - |
| ::: | Load-Acquire/Store-Release | LDAXRH | STLXRH | - |
| Register | - | LDXR | STXR | - |
| ::: | Load-Acquire/Store-Release | LDAXR | STLXR | - |
| Pair | - | LDXP | STXP | - |
| ::: | Load-Acquire/Store-Release | LDAXP | STLXP | - |
| None | Clear-Exclusive | - | - | CLREX |
Except for the row showing the CLREX instruction, the two
instructions in a single row are a Load-Exclusive/
Store-Exclusive instruction pair.
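The status values described above can be sketched as a toy single-PE model. This is only an illustration, not how hardware works: ''LocalMonitorModel'', ''atomic_add'' and the ''interfere'' hook are names I made up, the model marks a single address instead of a whole granule, and a Store-Exclusive to an unmarked address (IMPLEMENTATION DEFINED in the Spec) is assumed to fail and clear the monitor.

```python
OPEN, EXCLUSIVE = "Open Access", "Exclusive Access"


class LocalMonitorModel:
    """Toy single-PE model (hypothetical): one monitor, one marked address."""

    def __init__(self):
        self.memory = {}      # address -> value
        self.state = OPEN
        self.marked = None    # address marked by the last Load-Exclusive

    def load_exclusive(self, addr):
        # Like LDXR: read the value and mark the address.
        self.state, self.marked = EXCLUSIVE, addr
        return self.memory.get(addr, 0)

    def store_exclusive(self, addr, value):
        # Like STXR: the store happens only in the Exclusive Access state
        # and only for the marked address. Returns 0 on success, 1 on failure.
        ok = self.state == EXCLUSIVE and self.marked == addr
        if ok:
            self.memory[addr] = value
        # Assumption: the monitor goes back to Open Access in every case.
        self.state, self.marked = OPEN, None
        return 0 if ok else 1

    def clear_exclusive(self):
        # Like CLREX (an exception return has the same effect on the monitor).
        self.state, self.marked = OPEN, None


def atomic_add(pe, addr, delta, interfere=None):
    """Retry Load-Exclusive/Store-Exclusive until the store succeeds.

    Returns the number of attempts. `interfere` is a hypothetical hook that
    runs between the load and the store to simulate losing the reservation.
    """
    attempts = 0
    while True:
        attempts += 1
        value = pe.load_exclusive(addr)
        if interfere is not None:
            interfere(pe, attempts)
        if pe.store_exclusive(addr, value + delta) == 0:
            return attempts


pe = LocalMonitorModel()
pe.memory[0x1000] = 5
print(atomic_add(pe, 0x1000, 1))    # 1: no interference, first attempt succeeds
# Clear the monitor during the first attempt only, so one retry is needed.
print(atomic_add(pe, 0x1000, 1,
                 lambda p, n: p.clear_exclusive() if n == 1 else None))  # 2
print(pe.memory[0x1000])            # 7
```

The retry loop is the point: software never assumes a Store-Exclusive succeeds, it checks the status value and starts again from the Load-Exclusive.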
==== Memory ====
=== Memory Type ===
There are two types of memory:
* ''Non-Shareable'' memory \\ For this type of memory, exclusive access to
it requires only the local monitor.
* ''Shareable'' memory \\ Exclusive access to this type of memory requires
both the local monitor and the global monitor.
=== Exclusives Reservation Granule ===
When the PE executes the Load-Exclusive instruction to load
64-bit data from address A1, the memory block that includes A1 will be marked
for Exclusive Access. The size and alignment of this memory block are given
by the exclusives reservation granule specified in the ARMv8 Spec.
The exclusives reservation granule is IMPLEMENTATION DEFINED in
the range ''4''-''512'' words. In some implementations it can be read from
the CTR register, e.g. CTR_EL0.
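As a minimal sketch of decoding the granule from a raw ''CTR_EL0'' value: as I read the Spec, the ''ERG'' field sits in bits ''[23:20]'' and holds log2 of the granule size in words, and a value of ''0'' means the register gives no information, so software must assume the maximum of 512 words. The function name ''erg_words'' and the sample register values are made up for illustration. Reading the register itself is not shown, and reserved encodings are not checked.

```python
def erg_words(ctr_el0):
    """Return the exclusives reservation granule, in words, for a raw CTR_EL0 value."""
    erg = (ctr_el0 >> 20) & 0xF    # ERG field, bits [23:20]
    if erg == 0:
        return 512                 # no information provided: assume the maximum
    return 1 << erg                # the field holds log2(number of words)


sample = 4 << 20                   # made-up register value with ERG = 4
print(erg_words(sample))           # 16 words
print(erg_words(sample) * 4)       # 64 bytes (a word is 4 bytes)
print(erg_words(0))                # 512 words when ERG is 0
```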
For example, assume that the exclusives reservation granule is
8 words (32 bytes). When the PE executes a Load-Exclusive instruction on address
''0x30'' as shown below, the memory block ''[0x20, 0x3F]'' is marked as being in the
Exclusive Access state.
{{:arch:arm:a:media:img:monitered-memory.svg}}
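The arithmetic behind this example can be checked with a small sketch. ''reservation_block'' and ''same_block'' are hypothetical helper names; the only assumptions are that a word is 4 bytes and that the granule is a power of two, which holds for the whole ''4''-''512'' word range when the size is a power of two.

```python
WORD_BYTES = 4


def reservation_block(addr, erg_in_words):
    """Return (first, last) byte address of the granule-aligned block containing addr."""
    size = erg_in_words * WORD_BYTES
    start = addr & ~(size - 1)     # round down to the granule boundary
    return start, start + size - 1


def same_block(addr_a, addr_b, erg_in_words):
    """True if both addresses fall into the same marked block."""
    return reservation_block(addr_a, erg_in_words) == reservation_block(addr_b, erg_in_words)


first, last = reservation_block(0x30, 8)
print(hex(first), hex(last))       # 0x20 0x3f
print(same_block(0x24, 0x3C, 8))   # True: both inside [0x20, 0x3F]
print(same_block(0x1C, 0x30, 8))   # False: 0x1C belongs to [0x00, 0x1F]
```

One practical consequence: two unrelated variables that share a block also share the mark, so an access to one can affect an exclusive sequence on the other.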
==== Local Monitor ====
The local monitor is located in the PE. It is designed to monitor the
exclusive status of a memory block((note:>I researched the ARMv8 Spec and
it seems to support monitoring only one memory block.)). No PE can
**directly** change the exclusive state of memory blocks monitored by other PEs.
==== Global Monitor ====
The global monitor can either reside within the PE, or exist as a secondary
monitor at the memory interfaces. It is designed to synchronize PEs in a
multiprocessor system. The global monitor tracks one memory block for each
PE separately, as shown below.
The memories ''M-A'' and ''M-B'' monitored by the global monitor in the
following figure may or may not be the same. If they are the same, each
PE is affected by store operations that the other PE issues to the common
memory block. This is the key to achieving **synchronization**.
{{:arch:arm:a:media:img:global-monitor.svg}}
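The interaction between two PEs can be sketched with a toy model of the global monitor alone. Everything here is an assumption for illustration: the class ''GlobalMonitorModel'' and its methods are made-up names, the granule is assumed to be 64 bytes, the local monitor is left out, and for the IMPLEMENTATION DEFINED cases I simply assume that a failed Store-Exclusive clears the issuing PE's own mark.

```python
class GlobalMonitorModel:
    """Toy global monitor (hypothetical): one marked block per PE."""

    def __init__(self, num_pes, erg_bytes=64):
        self.memory = {}                  # address -> value
        self.erg_bytes = erg_bytes        # assumed granule size in bytes
        self.marked = [None] * num_pes    # None means Open Access for that PE

    def _block(self, addr):
        return addr // self.erg_bytes

    def load_exclusive(self, pe, addr):
        # Mark the block for this PE and return the current value.
        self.marked[pe] = self._block(addr)
        return self.memory.get(addr, 0)

    def store_exclusive(self, pe, addr, value):
        # Succeeds (status 0) only if this PE still has the block marked.
        block = self._block(addr)
        if self.marked[pe] != block:
            self.marked[pe] = None        # assumption, see lead-in
            return 1
        self.memory[addr] = value
        # A successful store clears every PE's mark on this block.
        self.marked = [None if m == block else m for m in self.marked]
        return 0

    def store(self, pe, addr, value):
        # A plain store clears the marks that other PEs hold on this block.
        block = self._block(addr)
        self.memory[addr] = value
        for other, m in enumerate(self.marked):
            if other != pe and m == block:
                self.marked[other] = None


gm = GlobalMonitorModel(num_pes=2)
v0 = gm.load_exclusive(0, 0x100)              # PE0 marks the block of 0x100
v1 = gm.load_exclusive(1, 0x100)              # PE1 marks the same block
print(gm.store_exclusive(1, 0x100, v1 + 1))   # 0: PE1 wins
print(gm.store_exclusive(0, 0x100, v0 + 1))   # 1: PE0 lost its mark, must retry
print(gm.memory[0x100])                       # 1: only PE1's update landed
```

When ''M-A'' and ''M-B'' are different blocks, neither PE disturbs the other in this model, and both Store-Exclusive operations return ''0''.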
===== CPU Implementation List =====
^ Table-1. Local monitor in CPU |||
^ CPU ^ Location ^ Exclusives Reservation Granule ^
| A72 | L1 memory system((:arm-a72-6-4-5)) | 64 bytes, one cache line. |
| A78AE | L1 memory system((:arm-a78ae-6-4-2)) | 64 bytes, one cache line. |
{{template>meta:template:refnote#note}}
{{template>meta:template:refnote#ref}}
===== Further Reading =====
* [[https://developer.arm.com/documentation/ddi0487/ia/|Arm® Architecture Reference Manual for A-profile architecture]]
* B2.9. Exclusive access instructions and Shareable memory locations
* //[[https://patchwork.kernel.org/project/linux-arm-kernel/patch/1472726820-32959-1-git-send-email-vladimir.murzin@arm.com/]]//
* //[[https://github.com/ARM-software/arm-trusted-firmware/blob/master/lib/locks/exclusive/aarch64/spinlock.S]]//
* //ARM community//
* //[[https://community.arm.com/support-forums/f/architectures-and-processors-forum/51294/ldrex-strex-breaked-by-third-task]]//
* //[[https://community.arm.com/support-forums/f/architectures-and-processors-forum/10361/ldrex-strex-on-the-m3-m4-m7/33045#33045]]//
* //[[https://community.arm.com/support-forums/f/architectures-and-processors-forum/4273/how-to-understand-armv8-sevl-instruction-in-spin-lock]]//