PR types

Others

PR changes

OPs

Describe

Modified the Kernel Primitive API and elementwise for XPU2.
The cases that failed in PR-CI-OP-benchmark are as follows:
image
The code changes do not touch softmax, and rerunning the CI job timed out, so the benchmark was verified locally. The results are as follows:

| op name | diff | speed develop (us) | speed new (us) | speed up |
|---|---|---|---|---|
| softmax_0_forward | 0.00E+00 | 3.097 | 3.107 | 0.32% |
| softmax_4_forward | 0.00E+00 | 5.103 | 5.103 | 0.00% |
| softmax_0_backward | 5.61E-10 | 3.151 | 3.127 | -0.76% |
| softmax_4_backward | 1.63E-08 | 3.856 | 3.842 | -0.36% |
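
For readers checking the numbers: the "speed up" column appears to match the relative change in run time, `(new - develop) / develop`, so a positive value means the new build is slightly slower and a negative value means it is slightly faster. The following is a minimal sketch under that assumption; the function name `relative_change_percent` is hypothetical and is not part of the benchmark tooling.

```python
# Timings in microseconds, copied from the table above:
# op name -> (develop, new)
timings_us = {
    "softmax_0_forward": (3.097, 3.107),
    "softmax_4_forward": (5.103, 5.103),
    "softmax_0_backward": (3.151, 3.127),
    "softmax_4_backward": (3.856, 3.842),
}


def relative_change_percent(develop_us, new_us):
    """Relative change in run time, in percent (assumed formula).

    Positive: the new build takes longer than develop.
    Negative: the new build takes less time than develop.
    """
    return (new_us - develop_us) / develop_us * 100.0


for name, (develop_us, new_us) in timings_us.items():
    change = relative_change_percent(develop_us, new_us)
    print(f"{name}: {change:+.2f}%")
```

Under this reading, all four softmax cases stay within 1% of the develop branch.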