Questions with the in_cksumdata() function in sys/amd64/amd64/in_cksum.c
btw at mail.ustc.edu.cn
Fri Sep 26 14:51:39 UTC 2014
Hi All,
I'm reading the in_cksumdata() function in sys/amd64/amd64/in_cksum.c, and
I have some questions about the following comment and code:
static u_int64_t
in_cksumdata(const void *buf, int len)
{
	......
	/*
	 * access prefilling to start load of next cache line.
	 * then add current cache line
	 * save result of prefilling for loop iteration.
	 */
	prefilled = lw[0];
	while ((len -= 32) >= 4) {
		u_int64_t prefilling = lw[8];
		sum += prefilled + lw[1] + lw[2] + lw[3]
		    + lw[4] + lw[5] + lw[6] + lw[7];
		lw += 8;
		prefilled = prefilling;
	}
	......
}
The comment says the loop adds the current cache line, but each iteration
actually adds only 32 bytes (eight 32-bit words), while on the amd64 platform
a cache line is 64 bytes. So I think the correct code should be something
like this:
static u_int64_t
in_cksumdata(const void *buf, int len)
{
	......
	/*
	 * access prefilling to start load of next cache line.
	 * then add current cache line
	 * save result of prefilling for loop iteration.
	 */
	prefilled = lw[0];
	while ((len -= 64) >= 4) {
		u_int64_t prefilling = lw[16];
		sum += prefilled + lw[1] + lw[2] + lw[3]
		    + lw[4] + lw[5] + lw[6] + lw[7]
		    + lw[8] + lw[9] + lw[10] + lw[11]
		    + lw[12] + lw[13] + lw[14] + lw[15];
		lw += 16;
		prefilled = prefilling;
	}
	......
}
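To make the byte counts behind this concrete, here is a minimal sketch of the arithmetic. It assumes lw points to 32-bit words, which is what the 32-byte step for eight elements implies; bytes_per_iteration() and ASSUMED_CACHE_LINE_BYTES are hypothetical names used only for illustration, not anything from in_cksum.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: lw elements are 32-bit words (8 elements == 32 bytes). */
static_assert(sizeof(uint32_t) == 4, "lw elements are assumed to be 4 bytes");

/* Assumption taken from the post: a cache line on amd64 is 64 bytes. */
#define ASSUMED_CACHE_LINE_BYTES 64

/* Hypothetical helper: bytes consumed when lw advances by `words` elements. */
static size_t
bytes_per_iteration(size_t words)
{
	return words * sizeof(uint32_t);
}
```

With these assumptions, bytes_per_iteration(8) is 32, half of ASSUMED_CACHE_LINE_BYTES, and bytes_per_iteration(16) is 64, one full line.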
The full patch is:
diff --git a/in_cksum.c b/in_cksum.c
index 2ae3a0c..4f141f8 100644
--- a/in_cksum.c
+++ b/in_cksum.c
@@ -140,19 +140,23 @@ in_cksumdata(const void *buf, int len)
* save result of prefilling for loop iteration.
*/
prefilled = lw[0];
- while ((len -= 32) >= 4) {
- u_int64_t prefilling = lw[8];
+ while ((len -= 64) >= 4) {
+ u_int64_t prefilling = lw[16];
sum += prefilled + lw[1] + lw[2] + lw[3]
- + lw[4] + lw[5] + lw[6] + lw[7];
- lw += 8;
+ + lw[4] + lw[5] + lw[6] + lw[7]
+ + lw[8] + lw[9] + lw[10] + lw[11]
+ + lw[12] + lw[13] + lw[14] + lw[15];
+ lw += 16;
prefilled = prefilling;
}
if (len >= 0) {
sum += prefilled + lw[1] + lw[2] + lw[3]
- + lw[4] + lw[5] + lw[6] + lw[7];
- lw += 8;
+ + lw[4] + lw[5] + lw[6] + lw[7]
+ + lw[8] + lw[9] + lw[10] + lw[11]
+ + lw[12] + lw[13] + lw[14] + lw[15];
+ lw += 16;
} else {
- len += 32;
+ len += 64;
}
while ((len -= 16) >= 0) {
sum += (u_int64_t) lw[0] + lw[1] + lw[2] + lw[3];
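As a sanity check on the loop bounds in the patch, the following is a stand-alone sketch of the 64-byte loop plus its tail handling, compared against a plain word-by-word sum. It is not the kernel function: it only sums 32-bit words (no alignment fixups, no folding), it assumes len is a multiple of 4, and the names sum_words_64(), sum_words_simple() and sums_agree() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference: plain sum of len/4 32-bit words. */
static uint64_t
sum_words_simple(const uint32_t *lw, int len)
{
	uint64_t sum = 0;

	for (int i = 0; i < len / 4; i++)
		sum += lw[i];
	return sum;
}

/* Sketch of the patched loop; assumes len is a multiple of 4. */
static uint64_t
sum_words_64(const uint32_t *lw, int len)
{
	uint64_t sum = 0;
	uint64_t prefilled;

	if (len < 4)		/* nothing to read, avoid touching lw[0] */
		return 0;
	prefilled = lw[0];
	/* ">= 4" guarantees lw[16] is still inside the buffer. */
	while ((len -= 64) >= 4) {
		uint64_t prefilling = lw[16];
		sum += prefilled + lw[1] + lw[2] + lw[3]
		    + lw[4] + lw[5] + lw[6] + lw[7]
		    + lw[8] + lw[9] + lw[10] + lw[11]
		    + lw[12] + lw[13] + lw[14] + lw[15];
		lw += 16;
		prefilled = prefilling;
	}
	if (len >= 0) {
		/* Exactly one full 64-byte block is left. */
		sum += prefilled + lw[1] + lw[2] + lw[3]
		    + lw[4] + lw[5] + lw[6] + lw[7]
		    + lw[8] + lw[9] + lw[10] + lw[11]
		    + lw[12] + lw[13] + lw[14] + lw[15];
		lw += 16;
	} else {
		len += 64;	/* undo the last subtraction */
	}
	/* Simplified tail: remaining words one at a time. */
	for (int i = 0; i < len / 4; i++)
		sum += lw[i];
	return sum;
}

/* Returns 1 if both sums match for every length from 0 to 160 bytes. */
static int
sums_agree(void)
{
	uint32_t buf[40];

	for (int i = 0; i < 40; i++)
		buf[i] = 0xfffffff0u + (uint32_t)i;
	for (int len = 0; len <= 160; len += 4)
		if (sum_words_64(buf, len) != sum_words_simple(buf, len))
			return 0;
	return 1;
}
```

The check covers lengths just below, at, and just above multiples of 64, which is where the "len -= 64" and "len += 64" adjustments matter. It says nothing about whether the wider loop is actually faster.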
Sorry for any confusion if I got something wrong.
- twb