Refactored GEMM-related code. Added GEMM-related optimizations and tests. Added specializations for mul.