~~PAGEIMAGE:arch:arm:a:media:img:global-monitor.svg~~

====== ARMv8 Synchronization Model ======

{{template>meta:template:pageinfo#tpl |desc=Introduce the elements and exclusive states in the ''ARMv8'' synchronization model.}}

ARMv8 defines a synchronization model to provide non-blocking synchronization. This model includes the following key elements:

  * Load-Exclusive / Store-Exclusive instruction pair and Clear-Exclusive instruction.
  * Memory block accessed by the above instruction pair.
  * Local exclusive monitor, or local monitor, for non-shareable and shareable memory ((note:>Todo: Introduce shareable and non-shareable.)).
  * Global exclusive monitor, or global monitor, for shareable memory.

ARMv8 defines two states for memory blocks accessed by a PE:

  * Exclusive Access state.
  * Open Access state.

===== Access States =====

==== Exclusive Access State ====

When a memory block is marked as Exclusive Access for a PE, that PE can modify the data of that memory block by issuing a Store-Exclusive instruction.

==== Open Access State ====

When a memory block is marked as Open Access for a PE, that PE cannot modify the data of that memory block by issuing a Store-Exclusive instruction.

==== State Machine ====

ARMv8 gives different state machine diagrams for the [[#Local Monitor|local monitor]] and the [[#Global Monitor|global monitor]]. I took these figures directly from the Spec because I think they are clear enough to understand. Before looking at the figures, here is some advice.

  * It is important to note that the ''Marked_address'' in the figures is a memory block. You can figure it out by reading [[#Exclusives Reservation Granule|Exclusives Reservation Granule]].
  * The operations with ''*'' at the end are IMPLEMENTATION DEFINED, so I don't think it's necessary to focus on them unless you want to design an ARMv8-based CPU.
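The two access states can be illustrated with a small software model. The following is only a sketch under stated assumptions, not how any real monitor is implemented: the names ''model_load_exclusive'', ''model_store_exclusive'', ''model_clear_exclusive'' and ''struct monitor'' are hypothetical, the granule is assumed to be 64 bytes, the IMPLEMENTATION DEFINED transitions are ignored, and the model assumes that a Store-Exclusive always leaves the monitor in the Open Access state. The status values follow the Store-Exclusive description under [[#Synchronization Primitives|Synchronization Primitives]]: ''0'' when the store takes place, ''1'' when it does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not an Arm API: one monitor tracking one memory block. */
enum access_state { OPEN_ACCESS, EXCLUSIVE_ACCESS };

struct monitor {
    enum access_state state;
    uint64_t marked_block; /* base of the marked block; meaningful only in EXCLUSIVE_ACCESS */
};

/* Assumed granule for this model only; real sizes are IMPLEMENTATION DEFINED. */
#define MODEL_GRANULE_BYTES 64u

/* Base address of the granule-sized block that contains addr. */
static uint64_t block_of(uint64_t addr)
{
    return addr & ~(uint64_t)(MODEL_GRANULE_BYTES - 1u);
}

/* Load-Exclusive: mark the block containing addr and enter Exclusive Access. */
static void model_load_exclusive(struct monitor *m, uint64_t addr)
{
    assert(m != NULL);
    m->state = EXCLUSIVE_ACCESS;
    m->marked_block = block_of(addr);
}

/* Store-Exclusive: returns 0 if the store took place, 1 if it did not. */
static int model_store_exclusive(struct monitor *m, uint64_t addr,
                                 uint64_t *dst, uint64_t value)
{
    assert(m != NULL && dst != NULL);
    int marked = m->state == EXCLUSIVE_ACCESS &&
                 m->marked_block == block_of(addr);
    if (marked)
        *dst = value;       /* the store takes place only for a marked block */
    m->state = OPEN_ACCESS; /* model assumption: the monitor is cleared either way */
    return marked ? 0 : 1;
}

/* Clear-Exclusive: move the monitor to Open Access. */
static void model_clear_exclusive(struct monitor *m)
{
    assert(m != NULL);
    m->state = OPEN_ACCESS;
}
```

In this model, a Store-Exclusive succeeds only if it follows a Load-Exclusive to the same block with no intervening clear, which is the property that lock-free retry loops rely on.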
The figure below shows the state machine for the local monitor and the effect of each of the operations shown in the figure. \\ \\ {{:arch:arm:a:media:img:20230520-163716.png?600&direct}}

The figure below shows the state machine for PE(n) in a global monitor. \\ \\ {{:arch:arm:a:media:img:20230520-164158.png?600&direct}}

An exception return clears the local monitor((:arm-ddi0487-i-a-c2-9-4)). That means that the marked state of all memory blocks monitored by the local monitor will be set to the Open Access state (or another unspecified state, which must not be the Exclusive Access state).

===== Key Elements =====

This section introduces some of the key elements in the synchronization model.

==== Synchronization Primitives ====

There are three types of synchronization primitives:

  * **Load-Exclusive** \\ When a PE executes such an instruction, the data in the specified memory is read and the memory is set to, or held in, the Exclusive Access state for the observer((note:>I researched the ARMv8 Spec and I think ''observer'' represents the PE. The PE observes the state of the memory block marked by the monitor.)).
  * **Store-Exclusive** \\ The result of executing such an instruction is determined by the state of the memory specified by the address parameter, as marked by the monitor.
    * If the memory is marked as being in the Exclusive Access state, the store takes place and a status value of ''0'' is returned to a register.
    * If the memory is marked as being in the Open Access state, no store takes place and a status value of ''1'' is returned to a register.
    * When a Store-Exclusive instruction is issued, ARMv8 checks both the state of the monitor and whether the memory is marked by the monitor to determine the execution result. ((:arm-ddi0487-i-a-b2-9-1))((:arm-ddi0487-i-a-b2-9-2)) Please note that I summarize the outcome of these two checks as a single state of the memory. That is:
      * **//The Exclusive Access state of memory//** means that the monitor is in the Exclusive Access state and the memory is marked.
      * **//The Open Access state of memory//** means that the monitor is in the Open Access state, or the memory is not marked.
  * **Clear-Exclusive** \\ Such an instruction clears the Exclusive Access state of the monitor, which means that the monitor moves to the Open Access state.

The following table lists the synchronization primitives. \\ \\

^ Transaction Size ^ Additional Semantics ^ Load-Exclusive ^ Store-Exclusive ^ Others ^
| Byte | - | LDXRB | STXRB | - |
| ::: | Load-Acquire/Store-Release | LDAXRB | STLXRB | - |
| Halfword | - | LDXRH | STXRH | - |
| ::: | Load-Acquire/Store-Release | LDAXRH | STLXRH | - |
| Register | - | LDXR | STXR | - |
| ::: | Load-Acquire/Store-Release | LDAXR | STLXR | - |
| Pair | - | LDXP | STXP | - |
| ::: | Load-Acquire/Store-Release | LDAXP | STLXP | - |
| None | Clear-Exclusive | - | - | CLREX |

Except for the row showing the CLREX instruction, the two instructions in a single row form a Load-Exclusive/Store-Exclusive instruction pair.

==== Memory ====

=== Memory Type ===

There are two types of memory:

  * ''Non-Shareable'' memory \\ Exclusive access to this type of memory requires only the local monitor.
  * ''Shareable'' memory \\ Exclusive access to this type of memory requires both the local monitor and the global monitor.

=== Exclusives Reservation Granule ===

When a PE executes a Load-Exclusive instruction to load 64-bit data from address A1, the memory block that includes A1 is marked for Exclusive Access. The size and alignment of this memory block are given by the exclusives reservation granule specified in the ARMv8 Spec. The exclusives reservation granule is IMPLEMENTATION DEFINED in the range ''4''-''512'' words. In some implementations it can be read from the CTR register, e.g. CTR_EL0.

For example, assume that the exclusives reservation granule is 8 words (32 bytes). When the PE executes a Load-Exclusive instruction on address ''0x30'', as shown in the figure below, the memory block ''[0x20, 0x3F]'' is marked as being in the Exclusive Access state.
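The block boundaries in this example follow from simple alignment arithmetic. The sketch below is only an illustration: ''erg_block_for'' and ''struct erg_block'' are hypothetical names, and it assumes a word is 4 bytes and that the granule is a power of two, so the block start can be computed with a mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the block that a Load-Exclusive would mark. */
struct erg_block {
    uint64_t first; /* lowest byte address in the block  */
    uint64_t last;  /* highest byte address in the block */
};

/*
 * addr:      address accessed by the Load-Exclusive
 * erg_words: exclusives reservation granule in 32-bit words,
 *            assumed to be a power of two in the range 4..512
 */
static struct erg_block erg_block_for(uint64_t addr, uint32_t erg_words)
{
    assert(erg_words >= 4 && erg_words <= 512);
    assert((erg_words & (erg_words - 1)) == 0); /* power of two */

    uint64_t erg_bytes = (uint64_t)erg_words * 4; /* one word is 4 bytes */
    struct erg_block b;
    b.first = addr & ~(erg_bytes - 1); /* round down to the granule boundary */
    b.last = b.first + erg_bytes - 1;
    return b;
}
```

With ''addr = 0x30'' and ''erg_words = 8'', the granule is 32 bytes, so the helper yields ''first = 0x20'' and ''last = 0x3F'', matching the block in the example.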
{{:arch:arm:a:media:img:monitered-memory.svg}}

==== Local Monitor ====

The local monitor is located in the PE. It is designed to monitor the exclusive status of a memory block((note:>I researched the ARMv8 Spec and it seems to support monitoring only one memory block.)). No PE can **directly** change the exclusive state of memory blocks monitored by other PEs.

==== Global Monitor ====

The global monitor can either reside within the PE or exist as a secondary monitor at the memory interfaces. It is designed to synchronize the PEs in a multiprocessor system.

The global monitor monitors one memory block for each PE separately, as shown below. The memory blocks ''M-A'' and ''M-B'' monitored by the global monitor in the following figure may or may not be the same. If they are identical, each PE is affected by the store operations issued by the other PE on the common memory block. This is the key to achieving **synchronization**.

{{:arch:arm:a:media:img:global-monitor.svg}}

===== CPU Implementation List =====

^ Table-1. Local monitor in CPU |||
^ CPU ^ Location ^ Exclusives Reservation Granule ^
| A72 | L1 memory system((:arm-a72-6-4-5)) | 64 bytes, one cache line. |
| A78AE | L1 memory system((:arm-a78ae-6-4-2)) | 64 bytes, one cache line. |

{{template>meta:template:refnote#note}}

{{template>meta:template:refnote#ref}}

===== Further Reading =====

  * [[https://developer.arm.com/documentation/ddi0487/ia/|Arm® Architecture Reference Manual for A-profile architecture]]
    * B2.9. Exclusive access instructions and Shareable memory locations
  * //[[https://patchwork.kernel.org/project/linux-arm-kernel/patch/1472726820-32959-1-git-send-email-vladimir.murzin@arm.com/]]//
  * //[[https://github.com/ARM-software/arm-trusted-firmware/blob/master/lib/locks/exclusive/aarch64/spinlock.S]]//
  * //ARM community//
    * //[[https://community.arm.com/support-forums/f/architectures-and-processors-forum/51294/ldrex-strex-breaked-by-third-task]]//
    * //[[https://community.arm.com/support-forums/f/architectures-and-processors-forum/10361/ldrex-strex-on-the-m3-m4-m7/33045#33045]]//
    * //[[https://community.arm.com/support-forums/f/architectures-and-processors-forum/4273/how-to-understand-armv8-sevl-instruction-in-spin-lock]]//