Thursday, January 29, 2015

Embed assembly inside a CUDA kernel

If you know the specific assembly (PTX) instructions you need, you could in principle write the whole kernel in assembly yourself.
Ha, too much work!

Here are two lines I found in the SDK that embed a single instruction inline:


    unsigned lane_mask_lt;
    asm("mov.u32 %0, %%lanemask_lt;" : "=r"(lane_mask_lt));
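
To show where those two lines might fit, here is a minimal sketch of a full program. The kernel name `rank_of_odd`, the helper `lanemask_lt()`, and the test data are my own inventions for illustration, not SDK code. It assumes CUDA 9 or later (for `__ballot_sync`) and a single full warp of 32 threads. Each thread holding an odd value counts how many lower-numbered lanes also hold an odd value.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Read the %lanemask_lt special register: a 32-bit mask with one bit set
// for every lane whose ID is lower than the calling thread's lane ID.
__device__ unsigned lanemask_lt() {
    unsigned mask;
    asm("mov.u32 %0, %%lanemask_lt;" : "=r"(mask));
    return mask;
}

// Hypothetical kernel: for each thread with an odd input, store its rank
// among the odd-valued threads of the warp; store -1 for the others.
__global__ void rank_of_odd(const int *in, int *rank) {
    int i = threadIdx.x;
    bool pred = (in[i] & 1) != 0;
    // Bit k of votes is set if lane k has pred == true.
    unsigned votes = __ballot_sync(0xffffffffu, pred);
    // Keep only the votes from lower lanes and count them.
    rank[i] = pred ? __popc(votes & lanemask_lt()) : -1;
}

int main() {
    const int n = 32;  // exactly one warp (assumption of this sketch)
    int h_in[n], h_rank[n];
    for (int i = 0; i < n; ++i) h_in[i] = i;

    int *d_in = nullptr, *d_rank = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_rank, n * sizeof(int));
    cudaMemcpy(d_in, h_in, n * sizeof(int), cudaMemcpyHostToDevice);

    rank_of_odd<<<1, n>>>(d_in, d_rank);
    cudaError_t err = cudaMemcpy(h_rank, d_rank, n * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < n; ++i) printf("lane %2d -> rank %d\n", i, h_rank[i]);

    cudaFree(d_in);
    cudaFree(d_rank);
    return 0;
}
```

Note the doubled `%%` in front of the register name: a single `%` introduces an operand placeholder such as `%0`, so a literal percent sign has to be escaped.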


error while loading shared libraries: libcudart.so.5.5: cannot open shared object file: No such file or directory

This happens when the dynamic linker cannot find the CUDA runtime library at run time, even though you set every environment variable correctly and the program compiled successfully.

The fix is to register the CUDA library directory with the dynamic linker:

32-bit: sudo ldconfig /usr/local/cuda/lib

64-bit: sudo ldconfig /usr/local/cuda/lib64
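
To check that the runtime library now loads, a small test program can help. This is only a sketch: the file name `check_cudart.cu` is made up, and it assumes you build with `nvcc --cudart shared check_cudart.cu -o check_cudart`, because recent `nvcc` versions link the runtime statically by default, which would hide the problem.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // If libcudart.so cannot be found, the program fails to start with the
    // "error while loading shared libraries" message before reaching main().
    int runtime_version = 0;
    cudaError_t err = cudaRuntimeGetVersion(&runtime_version);
    if (err != cudaSuccess) {
        printf("cudaRuntimeGetVersion failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    // The version is encoded as 1000 * major + 10 * minor.
    printf("CUDA runtime version: %d.%d\n",
           runtime_version / 1000, (runtime_version % 1000) / 10);
    return 0;
}
```

If the binary starts and prints a version, the dynamic linker found the library.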

Thanks. (http://stackoverflow.com/questions/10808958/why-cant-libcudart-so-4-be-found-when-compiling-the-cuda-samples-under-ubuntu)