xianyi's comments

Hi scythe,

I am Xianyi, a co-author of the AUGEM paper and the developer of OpenBLAS.

We used the AUGEM-generated assembly code in the OpenBLAS Sandy Bridge kernel (OpenBLAS/kernel/x86_64/dgemm_kernel_4x8_sandy.S).

However, we still needed to add some hand-written code to deal with the tail (the leftover part of a dimension that is not evenly divisible by the block size).
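
To illustrate what "dealing with the tail" means, here is a minimal sketch in plain C. It is not the OpenBLAS kernel and not AUGEM output. The function names, the row-major layout, and the 4x8 tile size (borrowed from the kernel file name) are assumptions made only for this example. Full tiles go through a fixed-size routine, standing in for generated code, and the leftover rows and columns go through a separate variable-size routine, standing in for the hand-written part.

```c
#include <assert.h>

/* Hypothetical tile size, echoing the "4x8" in the kernel file name. */
enum { MR = 4, NR = 8 };

/* Full MR x NR tile: the regular case a generated kernel would cover.
   Computes C_tile += A_panel * B_panel with fixed loop bounds. */
static void tile_full(int k, const double *a, int lda,
                      const double *b, int ldb, double *c, int ldc)
{
    double acc[MR][NR] = {{0.0}};
    for (int p = 0; p < k; ++p)
        for (int i = 0; i < MR; ++i)
            for (int j = 0; j < NR; ++j)
                acc[i][j] += a[i * lda + p] * b[p * ldb + j];
    for (int i = 0; i < MR; ++i)
        for (int j = 0; j < NR; ++j)
            c[i * ldc + j] += acc[i][j];
}

/* Partial tile (mr <= MR, nr <= NR): the "tail" that remains when
   m or n is not a multiple of the tile size. */
static void tile_edge(int mr, int nr, int k, const double *a, int lda,
                      const double *b, int ldb, double *c, int ldc)
{
    for (int i = 0; i < mr; ++i)
        for (int j = 0; j < nr; ++j) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p)
                sum += a[i * lda + p] * b[p * ldb + j];
            c[i * ldc + j] += sum;
        }
}

/* C (m x n) += A (m x k) * B (k x n), all row-major and contiguous. */
void dgemm_blocked(int m, int n, int k,
                   const double *a, const double *b, double *c)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    for (int i = 0; i < m; i += MR) {
        int mr = (m - i < MR) ? m - i : MR;
        for (int j = 0; j < n; j += NR) {
            int nr = (n - j < NR) ? n - j : NR;
            if (mr == MR && nr == NR)
                tile_full(k, a + i * k, k, b + j, n, c + i * n + j, n);
            else
                tile_edge(mr, nr, k, a + i * k, k, b + j, n, c + i * n + j, n);
        }
    }
}
```

With m = 5 and n = 9, for example, only one 4x8 tile takes the full path and the remaining row and column are handled by the edge routine.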

Since the OpenBLAS kernel is itself built from AUGEM-generated code, we didn't compare AUGEM with OpenBLAS. Instead, we compared its performance with Intel MKL and AMD ACML.

Xianyi


Thank you for the support.

