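The code is as follows, where mask_pad has shape (16, 1, 319, 319, 16) (the NC1HWC0 device layout reported in the log; the original NHWC shape is (16, 319, 319, 1)) and g_sz = 127:
mask_uf = tf.image.extract_image_patches(mask_pad, ksizes=[1, g_sz, g_sz, 1],
                                         strides=[1, 8, 8, 1],
                                         rates=[1, 1, 1, 1],
                                         padding='VALID')
The error output is as follows: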
[ERROR] TEFUSION(194884,python3.7):2020-12-10-15:11:56.811.220 [tensor_engine/te_fusion/fusion_op.cc:5713]GetFinishedCompilationTask FinishedTask[0]: taskID[281446916809200:1637], status[1], kernel[None] res: prebuild failed. module[impl.extract_image_patches] func[extract_image_patches], compile_info: 0
[ERROR] TEFUSION(194884,python3.7):2020-12-10-15:11:56.811.334 [tensor_engine/te_fusion/fusion_op.cc:5716]GetFinishedCompilationTask Op error args: args:(), input:({'shape': (16, 1, 319, 319, 16), 'ori_shape': (16, 319, 319, 1), 'format': 'NC1HWC0', 'ori_format': 'NHWC', 'dtype': 'float16', 'addr_type': 0, 'valid_shape': (), 'slice_offset': (), 'L1_workspace_size': -1, 'L1_fusion_type': -1, 'L1_addr_offset': 0, 'total_shape': (), 'split_index': 0},), outputs:({'shape': (16, 25, 25, 16129), 'ori_shape': (16, 25, 25, 16129), 'format': 'NHWC', 'ori_format': 'NHWC', 'dtype': 'float16', 'addr_type': 0, 'valid_shape': (), 'slice_offset': (), 'L1_workspace_size': -1, 'L1_fusion_type': -1, 'L1_addr_offset': 0, 'total_shape': (), 'split_index': 0},), attrs:((1, 127, 127, 1), (1, 8, 8, 1), (1, 1, 1, 1), 'VALID')
[ERROR] TEFUSION(194884,python3.7):2020-12-10-15:11:56.811.369 [tensor_engine/te_fusion/fusion_op.cc:5720]GetFinishedCompilationTask Op python exception: Traceback (most recent call last):
[ERROR] TEFUSION(194884,python3.7):2020-12-10-15:11:56.811.390 [tensor_engine/te_fusion/fusion_op.cc:5720]GetFinishedCompilationTask File "/usr/local/lib/python3.7/site-packages/te/platform/parallel_compilation.py", line 1184, in run
[ERROR] TEFUSION(194884,python3.7):2020-12-10-15:11:56.811.408 [tensor_engine/te_fusion/fusion_op.cc:5720]GetFinishedCompilationTask int64_mode=self._int64_mode)
[ERROR] TEFUSION(194884,python3.7):2020-12-10-15:11:56.811.426 [tensor_engine/te_fusion/fusion_op.cc:5720]GetFinishedCompilationTask File "/usr/local/lib/python3.7/site-packages/te/platform/fusion_manager.py", line 866, in build_single_op
[ERROR] TEFUSION(194884,python3.7):2020-12-10-15:11:56.811.443 [tensor_engine/te_fusion/fusion_op.cc:5720]GetFinishedCompilationTask compile_info = call_op()
[ERROR] TEFUSION(194884,python3.7):2020-12-10-15:11:56.811.460 [tensor_engine/te_fusion/fusion_op.cc:5720]GetFinishedCompilationTask File "/usr/local/lib/python3.7/site-packages/te/platform/fusion_manager.py", line 851, in call_op
[ERROR] TEFUSION(194884,python3.7):2020-12-10-15:11:56.811.478 [tensor_engine/te_fusion/fusion_op.cc:5720]GetFinishedCompilationTask opfunc(*inputs, *outputs, *attrs, **kwargs)
[ERROR] TEFUSION(194884,python3.7):2020-12-10-15:11:56.811.496 [tensor_engine/te_fusion/fusion_op.cc:5720]GetFinishedCompilationTask File "/usr/local/lib/python3.7/site-packages/te/utils/op_utils.py", line 597, in _in_wrapper
[ERROR] TEFUSION(194884,python3.7):2020-12-10-15:11:56.811.513 [tensor_engine/te_fusion/fusion_op.cc:5720]GetFinishedCompilationTask return func(*args, **kwargs)
[ERROR] TEFUSION(194884,python3.7):2020-12-10-15:11:56.811.530 [tensor_engine/te_fusion/fusion_op.cc:5720]GetFinishedCompilationTask File "/home/HwHiAiUser/Ascend/ascend-toolkit/latest/arm64-linux/opp/op_impl/built-in/ai_core/tbe/impl/extract_image_patches.py", line 876, in extract_image_patches
[ERROR] TEFUSION(194884,python3.7):2020-12-10-15:11:56.811.548 [tensor_engine/te_fusion/fusion_op.cc:5720]GetFinishedCompilationTask (cut_h_col * fmap_w * fmap_c0 * type_size * DOUBLE_BUFFER))
[ERROR] TEFUSION(194884,python3.7):2020-12-10-15:11:56.811.565 [tensor_engine/te_fusion/fusion_op.cc:5720]GetFinishedCompilationTask RuntimeError: Input size is too large load to L1, while cut h, need size: 3756544
[ERROR] FE(194884,python3.7):2020-12-10-15:11:56.821.805 [fusion_engine/adapter/tbe_adapter/tbe_op_store_adapter.cpp:135]196108 ProcessFailPreCompTask:"tid[281446916809200], taskId[1637], node[model/ExtractImagePatches], precompile failed"
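The numbers in the log above can be checked by hand. The following is a minimal sketch, not the operator's actual code: it recomputes the output shape for VALID padding and back-solves the size expression shown in the traceback. The values type_size = 2 (bytes per float16 element) and DOUBLE_BUFFER = 2 are assumptions; the helper name valid_out_size is hypothetical.

```python
def valid_out_size(in_size, ksize, stride):
    """Output length along one axis for VALID padding."""
    return (in_size - ksize) // stride + 1

# Values copied from the log: ori_shape (16, 319, 319, 1),
# attrs (1, 127, 127, 1) and (1, 8, 8, 1).
batch, height, width, channels = 16, 319, 319, 1
ksize, stride = 127, 8

out_shape = (batch,
             valid_out_size(height, ksize, stride),
             valid_out_size(width, ksize, stride),
             ksize * ksize * channels)
print(out_shape)  # (16, 25, 25, 16129)

# Terms of the expression in the traceback. fmap_w and fmap_c0 come from the
# NC1HWC0 shape in the log; type_size and DOUBLE_BUFFER are assumptions.
need_size = 3756544
fmap_w, fmap_c0 = 319, 16
type_size = 2       # assumed: bytes per float16 element
DOUBLE_BUFFER = 2   # assumed value of the constant
implied_cut_h_col = need_size // (fmap_w * fmap_c0 * type_size * DOUBLE_BUFFER)
print(implied_cut_h_col)            # 184
print(round(need_size / 2**20, 2))  # 3.58 (MiB)
```

Under these assumptions the request is about 3.58 MiB for 184 input rows, which the operator reports as too large to load into L1.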
The tf.not_equal operator also raised an error earlier; I am not sure whether it is simply not implemented.
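From the log you provided, extract_image_patches fails during operator compilation. This looks like an operator generalization problem (the operator does not sufficiently cover the shapes used in your model). Could you provide the details of this operator in the full network (input/output dtypes, formats, and shapes)? Thanks!
Initial diagnosis: this shape exceeds the constraints of the operator's current implementation. Which network scenario does this input shape come from? Thanks @zx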
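@wangyulong It is an object-tracking task using the SiamMask model (open-source PyTorch code). The function used is nn.unfold, which can be viewed as one part of a convolution: it extracts patches with a sliding window, but there is no filter and no patch-by-filter multiplication. The TensorFlow counterpart is the tf.image.extract_image_patches operator.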
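To make the sliding-window description concrete, here is a small pure-Python sketch on a 4 x 4 single-channel image. The function name unfold_2d is hypothetical and the code only illustrates the idea; it is not the PyTorch or TensorFlow implementation.

```python
def unfold_2d(image, ksize, stride):
    """Slide a ksize x ksize window over a 2-D list (VALID padding)
    and return each window flattened in row-major order."""
    h, w = len(image), len(image[0])
    out_h = (h - ksize) // stride + 1
    out_w = (w - ksize) // stride + 1
    patches = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            row.append([image[y][x]
                        for y in range(top, top + ksize)
                        for x in range(left, left + ksize)])
        patches.append(row)
    return patches

image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = unfold_2d(image, ksize=2, stride=2)
print(patches[0][0])  # [0, 1, 4, 5]
print(patches[1][1])  # [10, 11, 14, 15]
```

No weights are involved: each output entry is just a copy of the pixels under the window. Note that the two frameworks lay the result out differently (PyTorch's unfold returns (N, C*k*k, L), TensorFlow returns (N, out_h, out_w, k*k*C)), so a port may need a transpose or reshape.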
The image is currently single-channel; the following code can serve as a reference:
import numpy as np
import tensorflow as tf

def extract_image_patches_v2(image, ksizes=127, strides=8):
    # image: NumPy array in NHWC layout
    b, h, w, c = image.shape
    # Output grid size for VALID padding
    extract_image_h = (h - ksizes) // strides + 1
    extract_image_w = (w - ksizes) // strides + 1
    extract_image = np.zeros([b, extract_image_h, extract_image_w, ksizes * ksizes * c], dtype=np.float32)
    for i in range(extract_image_h):
        for j in range(extract_image_w):
            # Crop one ksizes x ksizes window for the whole batch and flatten it
            patch = image[:, i * strides:i * strides + ksizes, j * strides:j * strides + ksizes, :]
            extract_image[:, i, j, :] = np.reshape(patch, (b, -1))
    return extract_image
It is called as follows:
mask_uf = tf.py_func(extract_image_patches_v2, [mask_pad], tf.float32)
Convolution can also be used, but it runs out of memory when the image is too large:
def tf_extract_image_patches_v2(image=None, ksizes=127, strides=8):
    # Demo only: the input is replaced with random data of shape (8, 255, 255, 1)
    image = np.random.normal(size=(8, 255, 255, 1)) * 255
    feature_map = tf.pad(image, paddings=[[0, 0], [32, 32], [32, 32], [0, 0]], mode="CONSTANT")
    g_sz = 127
    # One-hot filters: output channel i * g_sz + j copies the pixel at window offset (i, j)
    filter_data = np.zeros((g_sz, g_sz, 1, g_sz * g_sz), dtype=np.float64)
    for i in range(g_sz):
        for j in range(g_sz):
            filter_data[i][j][0][i * g_sz + j] = 1.0
    filters = tf.Variable(filter_data, dtype=np.float64)
    result = tf.nn.conv2d(feature_map, filters, strides=(1, 8, 8, 1), padding="VALID")
    print(result.shape)
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        print(sess.run(result))
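A rough estimate suggests why the convolution approach is memory-hungry. The sketch below only counts bytes for the one-hot filter tensor and the output of the demo above (float64, batch 8, 255 + 2 * 32 = 319 pixels per side); it ignores any extra workspace the framework allocates, so actual usage is likely higher.

```python
g_sz = 127
in_channels = 1
out_channels = g_sz * g_sz          # one output channel per window offset
bytes_per_elem = 8                  # float64, as in the demo code

# Filter tensor of shape (g_sz, g_sz, in_channels, out_channels)
filter_bytes = g_sz * g_sz * in_channels * out_channels * bytes_per_elem
print(filter_bytes)                      # 2081157128
print(round(filter_bytes / 2**30, 2))    # 1.94 (GiB)

# Output tensor of shape (8, 25, 25, 16129) for the padded 319 x 319 input
out_side = (319 - g_sz) // 8 + 1
output_bytes = 8 * out_side * out_side * out_channels * bytes_per_elem
print(output_bytes)                      # 645160000
```

The filter alone is close to 2 GiB regardless of the image size, and the output grows with both batch size and image size.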
A loop can also be used:
def tf_extract_image_patches_v3(image=None, ksizes=127, strides=8):
    print("tf_extract_image_patches_v3")
    b, h, w, c = image.shape
    # e.g. b, h, w, c = 8, 319, 319, 1
    print(b, h, w, c)
    extract_image_h = (h - ksizes) // strides + 1
    extract_image_w = (w - ksizes) // strides + 1
    print(b, extract_image_h, extract_image_w, ksizes * ksizes * c)
    output_list = []
    for batch_ in range(b):
        for i in range(extract_image_h):
            for j in range(extract_image_w):
                # Slice one window from one image and flatten it to 1-D
                patch = image[batch_, i * strides:i * strides + ksizes, j * strides:j * strides + ksizes, :]
                output_list.append(tf.reshape(patch, (-1,)))
    # Result is 2-D: (b * extract_image_h * extract_image_w, ksizes * ksizes * c)
    return tf.stack(output_list)
@zx Hello, and thanks for sharing the code. We ported the first function to TensorFlow, but in practice it runs rather slowly. Do you have a TensorFlow version?
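@forechoni It does affect speed. As I understand it, tf.py_func copies the data to the CPU and then back to the NPU, though I am not sure this understanding is correct. Compared with the GPU on our side, it is roughly 3x slower. The TensorFlow version is the tf_extract_image_patches_v2 code above, but we cannot use it because the image is too large.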
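If tf.py_func has to stay, one option is to at least remove the Python double loop inside the callback. The following is a sketch using NumPy's as_strided, under the assumption that the input is an NHWC array; the function name extract_patches_strided is hypothetical. It does not avoid the host/device copies mentioned above, so it only addresses the CPU-side cost.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def extract_patches_strided(image, ksize=127, stride=8):
    """Vectorised NHWC patch extraction with VALID padding (no Python loops)."""
    image = np.ascontiguousarray(image)
    b, h, w, c = image.shape
    out_h = (h - ksize) // stride + 1
    out_w = (w - ksize) // stride + 1
    sb, sh, sw, sc = image.strides
    # Read-only view of shape (b, out_h, out_w, ksize, ksize, c); no data copied yet
    windows = as_strided(
        image,
        shape=(b, out_h, out_w, ksize, ksize, c),
        strides=(sb, sh * stride, sw * stride, sh, sw, sc),
        writeable=False,
    )
    # reshape copies the windows into a dense (b, out_h, out_w, k*k*c) array
    return windows.reshape(b, out_h, out_w, ksize * ksize * c)

# Small check against direct slicing
img = np.arange(2 * 7 * 7 * 1, dtype=np.float32).reshape(2, 7, 7, 1)
out = extract_patches_strided(img, ksize=3, stride=2)
print(out.shape)  # (2, 3, 3, 9)
print(np.array_equal(out[1, 2, 1], img[1, 4:7, 2:5, :].reshape(-1)))  # True
```

The flattening order (window row, window column, channel) matches the loop-based extract_image_patches_v2 above, so it could be passed to tf.py_func in the same way.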
@zx
We found that -1 cannot be used; the downstream network crashes.
@forechoni Modify the code a little.
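The thread does not say which -1 is meant. If it is the inferred dimension in a reshape, one possible modification is to compute every dimension explicitly, as in this sketch. The helper name patch_output_shape is hypothetical, and the suggested uses in the comments are assumptions rather than something confirmed in the thread.

```python
def patch_output_shape(b, h, w, c, ksizes=127, strides=8):
    """Fully specified NHWC output shape for VALID patch extraction (no -1)."""
    out_h = (h - ksizes) // strides + 1
    out_w = (w - ksizes) // strides + 1
    return (b, out_h, out_w, ksizes * ksizes * c)

shape = patch_output_shape(16, 319, 319, 1)
print(shape)  # (16, 25, 25, 16129)

# Possible uses (assumptions, not taken from the thread):
#   np.reshape(patch, (b, shape[-1]))  instead of (b, -1)
#   mask_uf.set_shape(shape)           after tf.py_func, whose output has no static shape
```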
@zx The latest master now includes a version that supports the case in this issue. Have you tried using the ExtractImagePatches operator directly?
@wangyulong Can I simply create a new Docker container and run it directly?
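@zx Hello, extract_image_patches now supports the shape you described above; you can verify this with the latest software version. To verify in Docker, you need to replace the extract_image_patches.py file. Its path can be located by running find from the / directory: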
find / -name extract_image_patches.py
This is now resolved. Thank you all for your support!