Figure 7-1.
An overview with six tables that show the machine instructions for various atomic operations.
The rows in each table are labeled ARMv8 and x86-64.
The columns are labeled with various memory orderings.
First table: load.
The ARMv8 row shows the LDR instruction is used in both the non-atomic case and relaxed ordering. The LDAR instruction is used for both acquire and sequentially consistent ordering.
The x86-64 row shows the MOV instruction in all cases.
Second table: store.
The ARMv8 row shows the STR instruction is used in both the non-atomic case and relaxed ordering. The STLR instruction is used for both release and sequentially consistent ordering.
The x86-64 row shows the MOV instruction in all cases except for sequentially consistent ordering, which uses XCHG.
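As a rough illustration of the first two tables, the Rust functions below perform the loads and stores in question. The function names are hypothetical and exist only for this sketch, and the instructions named in the comments are the ones listed in the tables; the actual output depends on the compiler version, target, and optimization level.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical helper names, chosen for this illustration only.

/// Relaxed load. Per the load table: LDR on ARMv8, MOV on x86-64.
pub fn load_relaxed(x: &AtomicU64) -> u64 {
    x.load(Ordering::Relaxed)
}

/// Acquire load. Per the load table: LDAR on ARMv8, MOV on x86-64.
pub fn load_acquire(x: &AtomicU64) -> u64 {
    x.load(Ordering::Acquire)
}

/// Release store. Per the store table: STLR on ARMv8, MOV on x86-64.
pub fn store_release(x: &AtomicU64, value: u64) {
    x.store(value, Ordering::Release);
}

/// Sequentially consistent store. Per the store table: STLR on ARMv8,
/// XCHG on x86-64.
pub fn store_seq_cst(x: &AtomicU64, value: u64) {
    x.store(value, Ordering::SeqCst);
}
```

Compiling functions like these with optimizations and inspecting the generated assembly is one way to check the tables against a particular toolchain.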
Third table: swap.
The ARMv8 row shows two instructions for the non-atomic case: LDR, STR. The rest of the row shows a nearly identical three-instruction loop for each memory ordering: LDXR, STXR, CBNZ, with an arrow indicating a branch from the last instruction to the first. In the acquire, acquire-release, and sequentially consistent columns, the LDXR instruction is replaced by LDAXR. In the release, acquire-release, and sequentially consistent columns, the STXR instruction is replaced by STLXR.
An additional row labeled ARMv8.1 does not contain these loops, but instead has a single instruction for each memory ordering: SWP for relaxed, SWPA for acquire, SWPL for release, and SWPAL for both acquire-release and sequentially consistent.
The x86-64 row shows MOV, MOV for the non-atomic case, and XCHG for all atomic memory orderings.
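The swap table corresponds to a call like the one in this minimal sketch. The function name is hypothetical, and whether an ARM build gets the load/store-exclusive loop or a single SWP-family instruction depends on whether the ARMv8.1 atomic instructions are enabled for the target, which is an assumption about the build configuration rather than something the code controls.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Atomically replaces the value and returns the previous one.
/// Per the swap table, with acquire-release ordering this is an
/// LDAXR, STLXR, CBNZ loop on ARMv8, SWPAL on ARMv8.1, and XCHG on x86-64.
/// (Hypothetical function name, for illustration only.)
pub fn swap_acq_rel(x: &AtomicU32, new: u32) -> u32 {
    x.swap(new, Ordering::AcqRel)
}
```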
Fourth table: fetch op.
Op is a placeholder, such that this table represents fetch-add, fetch-sub, and so on.
The ARMv8 row is very similar to the one in the previous table, but with an additional instruction in the middle of each listing labeled op. Non-atomic: LDR, op, STR. Relaxed: LDXR, op, STXR, CBNZ, with an arrow indicating a branch from the last instruction to the first. The rest of the row shows a nearly identical four-instruction loop for the other memory orderings. In the acquire, acquire-release, and sequentially consistent columns, the LDXR instruction is replaced by LDAXR. In the release, acquire-release, and sequentially consistent columns, the STXR instruction is replaced by STLXR.
An additional row labeled ARMv8.1 does not contain these loops, but instead has a single instruction for each memory ordering: LDOP for relaxed, LDOPA for acquire, LDOPL for release, and LDOPAL for both acquire-release and sequentially consistent. The OP part of these instructions is a placeholder. It can stand for ADD, SUB, and so on.
The x86-64 row is split in two. The first half is labeled “op”, and the second is labeled “fetch and op”.
The first part has just a single “op” instruction for the non-atomic case, and “lock op” for all atomic memory orderings.
The second part, labeled “fetch and op”, shows two instructions for the non-atomic case: MOV, op. For the atomic case, regardless of memory ordering, it shows multiple options: either “lock” followed by XADD, BTS, BTR, or BTC, or a four-instruction loop: MOV, op, lock CMPXCHG, JNE, with an arrow indicating a branch from the last instruction to the first.
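The split in the x86-64 row comes down to whether the program uses the value returned by the fetch operation. The sketch below shows one function for each situation; the function names are hypothetical, and the instructions in the comments are what the table suggests an optimizing compiler may choose, not a guarantee.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Hypothetical helper names, chosen for this illustration only.

/// Result ignored: the "op" half of the x86-64 row, for example `lock add`.
pub fn add_ten(x: &AtomicU32) {
    x.fetch_add(10, Ordering::Relaxed);
}

/// Result used: the "fetch and op" half, where `lock xadd` applies.
pub fn fetch_add_ten(x: &AtomicU32) -> u32 {
    x.fetch_add(10, Ordering::Relaxed)
}

/// Result used, but x86-64 has no fetching OR instruction, so this is
/// the case covered by the MOV, op, lock CMPXCHG, JNE loop.
pub fn fetch_or_ten(x: &AtomicU32) -> u32 {
    x.fetch_or(10, Ordering::Relaxed)
}

/// Only a single bit is set and tested, which is the situation in
/// which a compiler may pick `lock bts`.
pub fn test_and_set_bit0(x: &AtomicU32) -> bool {
    x.fetch_or(1, Ordering::Relaxed) & 1 != 0
}
```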
Fifth table: compare exchange.
This table has two columns: weak and strong.
The weak ARMv8 instructions start with either LDXR or LDAXR, followed by CMP, BNE, and then either STXR or STLXR. An arrow from BNE branches off to a separate path with CLREX.
The strong ARMv8 instructions are identical, except for an additional CBNZ at the end, with an arrow indicating a branch back to the first instruction.
An additional row labeled ARMv8.1 shows no difference between the weak and strong cases. It is just a single CAS, CASA, CASL, or CASAL.
The x86-64 row shows the same instruction for both the weak and strong case: lock CMPXCHG.
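The weak and strong columns map to `compare_exchange_weak` and `compare_exchange` in Rust. The following sketch, with hypothetical function names, uses the strong version for a one-shot attempt and the weak version inside a retry loop, where a spurious failure (possible with the ARMv8 LDXR/STXR sequence) simply causes another iteration.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Hypothetical helper names, chosen for this illustration only.

/// One-shot attempt to change 0 into 1. The strong version never fails
/// spuriously, so a failure means the value really was not 0.
pub fn try_claim(x: &AtomicU32) -> bool {
    x.compare_exchange(0, 1, Ordering::Acquire, Ordering::Relaxed)
        .is_ok()
}

/// Doubles the value with a compare-exchange loop and returns the new value.
/// The loop retries anyway, so the weak version is sufficient here.
pub fn double(x: &AtomicU32) -> u32 {
    let mut current = x.load(Ordering::Relaxed);
    loop {
        let new = current.wrapping_mul(2);
        match x.compare_exchange_weak(current, new, Ordering::Relaxed, Ordering::Relaxed) {
            Ok(_) => return new,
            // On failure (real or spurious), retry with the value we observed.
            Err(actual) => current = actual,
        }
    }
}
```

On x86-64 and on ARMv8.1 with CAS available, the table indicates that both versions end up as the same instruction, so the choice mainly matters for the ARMv8 load/store-exclusive sequence.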
Sixth and last table: fence.
The ARMv8 row shows DMB ISHLD for an acquire fence, and DMB ISH for a release, acquire-release, and sequentially consistent fence.
The x86-64 row shows MFENCE for a sequentially consistent fence, and no instructions for the other fences.
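The fences in the last table correspond to calls to `std::sync::atomic::fence`. The sketch below is one possible use, assuming a simple publish/consume pattern with hypothetical names (`DATA`, `READY`, `publish`, `consume`, `run`): a release fence before a relaxed store, paired with an acquire fence after a relaxed load. Even where the table shows no instruction for a fence on x86-64, the fence still restricts how the compiler may reorder the surrounding operations.

```rust
use std::sync::atomic::{fence, AtomicBool, AtomicU64, Ordering};
use std::thread;

// Hypothetical names, chosen for this illustration only.
static DATA: AtomicU64 = AtomicU64::new(0);
static READY: AtomicBool = AtomicBool::new(false);

/// Writes the data, then signals readiness.
pub fn publish(value: u64) {
    DATA.store(value, Ordering::Relaxed);
    // Release fence: DMB ISH on ARMv8, no instruction on x86-64 per the table.
    fence(Ordering::Release);
    READY.store(true, Ordering::Relaxed);
}

/// Spins until the data is ready, then reads it.
pub fn consume() -> u64 {
    while !READY.load(Ordering::Relaxed) {
        std::hint::spin_loop();
    }
    // Acquire fence: DMB ISHLD on ARMv8, no instruction on x86-64 per the table.
    fence(Ordering::Acquire);
    DATA.load(Ordering::Relaxed)
}

/// Runs the publisher on a second thread and returns the consumed value.
pub fn run() -> u64 {
    let publisher = thread::spawn(|| publish(123));
    let value = consume();
    publisher.join().unwrap();
    value
}
```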