Thursday, January 29, 2015

Embed assembly inside a CUDA kernel

If you know the specific asm you need, you could potentially just write the kernel assembly on your own.
Ha, too much work!

Here are two lines I found in the SDK. They use inline PTX to read the %lanemask_lt special register into a C variable.


    unsigned lane_mask_lt;
    asm("mov.u32 %0, %%lanemask_lt;" : "=r"(lane_mask_lt));
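
To see what those two lines give you, here is a minimal sketch that wraps them in a device function and checks the result on one warp. The names lane_mask_lt(), lanemask_kernel and count_mismatches() are my own for illustration, not from the SDK, and I am assuming a 1D block of 32 threads so that threadIdx.x equals the lane index.

```cuda
#include <cuda_runtime.h>

// Wrap the inline PTX in a small device function (hypothetical name).
// %lanemask_lt has a bit set for every lane below the calling lane.
__device__ __forceinline__ unsigned lane_mask_lt()
{
    unsigned mask;
    asm("mov.u32 %0, %%lanemask_lt;" : "=r"(mask));
    return mask;
}

// Each thread stores its own mask so the host can inspect it.
__global__ void lanemask_kernel(unsigned *out)
{
    out[threadIdx.x] = lane_mask_lt();
}

// Hypothetical host helper: launch one warp and count lanes whose mask
// differs from (1 << lane) - 1. Returns -1 on a CUDA error.
int count_mismatches()
{
    const int kWarp = 32;  // assumption: one full warp, 1D block
    unsigned host[kWarp];
    unsigned *dev = nullptr;

    if (cudaMalloc(&dev, sizeof(host)) != cudaSuccess) return -1;
    lanemask_kernel<<<1, kWarp>>>(dev);
    cudaError_t err = cudaMemcpy(host, dev, sizeof(host),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return -1;

    int bad = 0;
    for (int lane = 0; lane < kWarp; ++lane) {
        unsigned expected = (1u << lane) - 1u;  // bits 0..lane-1 set
        if (host[lane] != expected) ++bad;
    }
    return bad;
}
```

Call count_mismatches() from main() and compile with nvcc. The double %% in the asm string is how you escape a literal % for a special register name, and "=r" binds the output to a 32-bit register.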

