The original title of our paper was "Demystifying GPU Microarchitecture to Tune SGEMM". It has since been changed to "Understanding the GPU Microarchitecture to Achieve Bare-Metal Performance Tuning".
This code corresponds to section 3 (instruction solver and GPU assembler KeplerAs), section 4 (SGEMM implementation and optimizations), and section 5 (SGEMM performance evaluation) of the paper. Follow the instructions in each directory to validate the functionality and performance results.
If you have any questions, please feel free to contact me: zhangxiuxia1@gmail.com