memory_barrier.md
Following are the different types of processor requests and bus-side requests:

Processor requests to the Cache include the following operations:
- PrRd: The processor requests to read a Cache block.
- PrWr: The processor requests to write a Cache block.

Bus-side requests are the following:

- BusRd: Snooped request that indicates there is a read request for a Cache block made by another processor.
- BusRdX: Snooped request that indicates there is a write request for a Cache block made by another processor which doesn't already have the block.
- BusUpgr: Snooped request that indicates there is a write request for a Cache block made by another processor, but that processor already has the Cache block resident in its Cache.
- Flush: Snooped request that indicates an entire cache block has been written back to main memory by another processor.
- FlushOpt: Snooped request that indicates an entire cache block is posted on the bus in order to supply it to another processor (a Cache-to-Cache transfer).

(Such Cache-to-Cache transfers can reduce the read-miss latency when fetching the block from another Cache is faster than fetching it from main memory, which is generally the case in bus-based systems. But in multicore architectures where coherence is maintained at the level of the L2 caches and there is an on-chip L3 cache, it may be faster to fetch the missed block from the L3 cache rather than from another L2.)
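The request types above can be sketched as a toy MESI state machine. This is a minimal Python illustration of my own (the function names and return shapes are invented for this sketch, not taken from any real simulator):

```python
# Toy model of MESI state transitions driven by the processor- and
# bus-side requests described above. States: "M", "E", "S", "I".

def pr_rd(state, other_sharers):
    """Processor read (PrRd). Returns (new_state, bus_request)."""
    if state == "I":
        # Read miss: issue BusRd; load Shared if another Cache has the
        # block, Exclusive otherwise.
        return ("S" if other_sharers else "E"), "BusRd"
    return state, None          # M/E/S hit: no bus traffic needed

def pr_wr(state):
    """Processor write (PrWr). Returns (new_state, bus_request)."""
    if state == "I":
        return "M", "BusRdX"    # Write miss: request block for exclusive write
    if state == "S":
        return "M", "BusUpgr"   # Block already resident: only invalidate others
    return "M", None            # E -> M silently; M stays M

def snoop(state, bus_request):
    """React to a snooped bus-side request. Returns (new_state, flush)."""
    if bus_request == "BusRd":
        # Another core reads: downgrade to Shared; Flush if we were dirty.
        return ("S", state == "M") if state in ("M", "E") else (state, False)
    if bus_request in ("BusRdX", "BusUpgr"):
        # Another core writes: invalidate; Flush dirty data first.
        return "I", state == "M"
    return state, False
```

For example, a core holding a line in S that writes it moves to M and broadcasts BusUpgr, while a core in M that snoops a BusRd flushes the line and drops to S.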
On modern CPUs, the LOCK prefix locks the cache line so that the read-modify-write operation happens atomically.
Unlocked increment:
1. Acquire cache line, shareable is fine. Read the value.

2. Add one to the read value.

3. Acquire cache line exclusive (if not already E or M) and lock it.

4. Write the new value to the cache line.

5. Change the cache line to modified and unlock it.
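A deterministic way to see why the unlocked sequence can lose an update is to interleave the steps of two simulated cores by hand. This is a toy Python sketch of the interleaving, not real machine behavior:

```python
# Two "cores" each perform the unlocked sequence: read, add one, write.
# Because the cache line is only held exclusively during the write, both
# cores' reads can happen before either core's write.

memory = {"counter": 0}

# Step 1: both cores read the value (a shared cache line is fine).
r1 = memory["counter"]
r2 = memory["counter"]

# Step 2: each core adds one to its privately read value.
r1 += 1
r2 += 1

# Steps 3-5: each core acquires the line exclusively only for its write.
memory["counter"] = r1   # core 1 writes 1
memory["counter"] = r2   # core 2 writes 1, overwriting core 1's update

print(memory["counter"])  # 1, not 2: one increment was lost
```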
Locked increment:
1. Acquire cache line exclusive (if not already E or M) and lock it.

2. Read value.

3. Add one to it.

4. Write the new value to the cache line.

5. Change the cache line to modified and unlock it.
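The locked sequence can be mimicked with a mutex held across the whole read-modify-write. This is only a sketch of the idea: a real LOCKed instruction needs no OS lock, since the hardware holds the cache line itself:

```python
import threading

memory = {"counter": 0}
line_lock = threading.Lock()   # stands in for holding the locked cache line

def locked_increment():
    # Steps 1-5: the "cache line" is held from the read through the write.
    with line_lock:
        value = memory["counter"]   # read
        value += 1                  # add one
        memory["counter"] = value   # write back, then unlock

threads = [threading.Thread(target=locked_increment) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(memory["counter"])  # 100: no increments are lost
```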
Notice the difference? In the unlocked increment, the cache line is only locked during the write memory operation, just like all writes. In the locked increment, the cache line is held across the entire instruction, all the way from the read operation to the write operation and including during the increment itself.
Also, some CPUs have things other than memory caches that can affect memory visibility. For example, some CPUs have a read prefetcher or a posted write buffer that can result in memory operations executing out of order. Where needed, a LOCK prefix (or equivalent functionality on other CPUs) will also do whatever needs to be done to handle memory operation ordering issues.
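The posted-write-buffer effect can also be modeled deterministically. In this toy Python model (my own illustration, roughly the classic store-buffering litmus test), each core's store sits in a private buffer until it drains; a LOCK prefix or full barrier would force the drain before the next read:

```python
# Toy store-buffer model: each core's writes go into a private buffer
# and only reach shared memory when the buffer is drained later.

memory = {"x": 0, "y": 0}
buf0, buf1 = {}, {}          # per-core posted write buffers

def read(buf, addr):
    # A core sees its own buffered store first, then shared memory.
    return buf.get(addr, memory[addr])

# Core 0 runs: x = 1; r0 = y    Core 1 runs: y = 1; r1 = x
buf0["x"] = 1                # store sits in core 0's buffer
buf1["y"] = 1                # store sits in core 1's buffer
r0 = read(buf0, "y")         # reads 0: core 1's store not yet visible
r1 = read(buf1, "x")         # reads 0: core 0's store not yet visible

memory.update(buf0)          # the buffers drain later
memory.update(buf1)

print((r0, r1))              # (0, 0): forbidden on sequentially
                             # consistent hardware, allowed with buffers
```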