3 Mar 2014 by Member 9964804
I have a loop in my Intel vector assembly code. In the loop, the loop counter is used to read from and write to 4 consecutive memory locations. For example,
vmovdqu [r9 + rdx + 64], y0
vmovdqu [r9 + rdx + 96], y1
where is my loop counter. During profiling, I notice that using "r10d"...