1. The expand operator

Usage

oneflow.expand(input, *sizes)
The conventions for setting the sizes argument (called expand_size below) are illustrated next.
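Examples

Example 1:

    input_shape = [4, 3, 1, 2]
    expand_size = [4, 3, 5, 2]
    # All of the following expand_size settings are also valid:
    # [-1, 3, 5, 2]
    # [-1, -1, 5, 2]
    # [-1, -1, 5, -1]
    # [4, -1, 5, 2]
    # [4, -1, 5, -1]
    # [4, 3, 5, -1]
    out_shape = [4, 3, 5, 2]

Example 2:

    input_shape = [1, 4, 3, 5]
    expand_size = [2, 1, 2, 4, 3, 5]
    # All of the following expand_size settings are also valid:
    # [2, 1, 2, -1, 3, 5]
    # [2, 1, 2, -1, -1, 5]
    # [2, 1, 2, -1, -1, -1]
    # [2, 1, 2, 4, -1, 5]
    # [2, 1, 2, 4, -1, -1]
    # [2, 1, 2, 4, 3, -1]
    out_shape = [2, 1, 2, 4, 3, 5]

Single-device implementation

This section describes how expand is implemented from the single-device view. As the previous section shows, every element of the output is a copy of some element of the input, so the implementation comes down to mapping each output index to the corresponding input index.

Before describing how to compute this index mapping, let us first review the concept of a tensor's stride: stride[i] is the distance, in elements, between two consecutive indices along dimension i in the flattened one-dimensional storage.

For example:

    input_shape = [6, 3, 4, 5]
    stride = [60, 20, 5, 1]  # how to compute stride is described below
    input[x, y, z, k] == input_flatten[x * 60 + y * 20 + z * 5 + k * 1]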
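The flattened addressing above can be checked with a short pure-Python sketch. The helpers `nest` and `flat_offset` are hypothetical names introduced here only for illustration; the shape and stride values are the ones from the example.

```python
from itertools import product

shape = [6, 3, 4, 5]
stride = [60, 20, 5, 1]

# Flat row-major storage: each element stores its own offset.
flat = list(range(6 * 3 * 4 * 5))


def nest(buffer, dims):
    """Rebuild a nested list of shape `dims` from a flat row-major buffer."""
    if len(dims) == 1:
        return list(buffer)
    step = len(buffer) // dims[0]
    return [nest(buffer[i * step:(i + 1) * step], dims[1:]) for i in range(dims[0])]


def flat_offset(index, strides):
    """Offset into the flat buffer for a multi-dimensional index."""
    return sum(i * s for i, s in zip(index, strides))


nested = nest(flat, shape)

# Every multi-dimensional access agrees with the stride-based flat access.
all_match = all(
    nested[x][y][z][k] == flat[flat_offset((x, y, z, k), stride)]
    for x, y, z, k in product(*map(range, shape))
)
```

Because the buffer stores offsets, `nested[1][2][3][4]` and `flat_offset((1, 2, 3, 4), stride)` are the same number, which makes the mapping easy to inspect by hand.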
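Sample code:

    # The stride of the last dimension is initialized to 1
    stride = [1]
    # Generate the strides from back to front
    for i in range(len(input_shape) - 2, -1, -1):
        # Insert the new element at the front of the stride list
        stride.insert(0, stride[0] * input_shape[i + 1])

Next, let us see how to compute output_stride. If a dimension of the input tensor has size 1 while the corresponding entry of expand_size is greater than 1, the output must read the same input element at every index along that dimension, so the output stride of that dimension is set to 0. The same applies to newly added leading dimensions. In all other cases the output stride equals the input stride.

Sample code for computing output_stride:

    # input_stride is the stride list of the input tensor computed above
    input_stride = stride
    output_stride = []
    diff = len(expand_size) - len(input_shape)
    for i in range(len(expand_size) - 1, -1, -1):
        if i >= diff:
            if expand_size[i] == -1 or expand_size[i] == input_shape[i - diff]:
                # Dimension is unchanged: reuse the input stride
                output_stride.insert(0, input_stride[i - diff])
            else:
                # Only a dimension of size 1 can be expanded
                assert expand_size[i] >= 1 and input_shape[i - diff] == 1
                output_stride.insert(0, 0)
        else:
            # Newly added leading dimension
            assert expand_size[i] >= 1
            output_stride.insert(0, 0)

For example:

    input_shape = [4, 1, 3, 5]
    stride = [15, 15, 5, 1]
    expand_size = [2, 1, 4, 4, 3, 5]
    output_stride = [0, 0, 15, 0, 5, 1]
    # For an arbitrary index (x, y, z, k, v, w) of the output tensor
    output[x, y, z, k, v, w] = input_flatten[x * 0 + y * 0 + z * 15 + k * 0 + v * 5 + w * 1]
    # Backward computation
    input_grad_flatten[x * 0 + y * 0 + z * 15 + k * 0 + v * 5 + w * 1] += output_grad[x, y, z, k, v, w]

Forward code link: https://github.com/Oneflow-Inc/oneflow/blob/master/oneflow/user/kernels/expand_kernel.cu#L30

Backward code link: https://github.com/Oneflow-Inc/oneflow/blob/master/oneflow/user/kernels/expand_kernel.cu#L43

Multi-device consistent view

Next we look at how adding an operator in OneFlow differs from other frameworks. Besides correctly implementing the computation from the single-device view, the multi-device consistent view must also be considered, including output shape inference, the SBP signature settings, and the actual computation.

First, a brief introduction to the consistent view: it abstracts the whole cluster as a single logical device, so the user writes a program as if it ran on one device.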
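Next, what SBP is: SBP describes how the data of a tensor in the consistent view maps onto the data held by the real physical devices. The name is formed from the initials of split, broadcast and partial.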
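To make the split case concrete, here is a minimal pure-Python sketch of how a logical shape turns into a per-device physical shape and how the contiguous strides change with it. `split_shape` and `contiguous_strides` are hypothetical helpers written for this illustration, not OneFlow APIs, and the sketch assumes the split dimension divides evenly across the devices. Under broadcast, by contrast, every device would simply hold the full logical shape.

```python
def contiguous_strides(shape):
    """Strides (in elements) of a contiguous row-major tensor."""
    strides = [1]
    for size in reversed(shape[1:]):
        strides.insert(0, strides[0] * size)
    return strides


def split_shape(logical_shape, axis, parallel_num):
    """Per-device shape when the tensor is split evenly along `axis`.

    Assumption for this sketch: the dimension is divisible by parallel_num.
    """
    assert logical_shape[axis] % parallel_num == 0, "sketch assumes an even split"
    physical_shape = list(logical_shape)
    physical_shape[axis] //= parallel_num
    return physical_shape


logical_shape = [4, 3, 1, 2]
# Split along the last dimension across two devices.
physical_shape = split_shape(logical_shape, axis=3, parallel_num=2)

logical_stride = contiguous_strides(logical_shape)
physical_stride = contiguous_strides(physical_shape)
```

The point of the sketch is that strides derived from the logical shape are not valid for the shard a device actually holds, so anything computed from them has to be derived again from the physical shape.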
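For more details, see: https://docs./v0.5.0/parallelism/02_sbp.html

So when developing an operator in OneFlow, the developer also needs to specify which SBP signatures the operator's inputs and outputs support.

Under the consistent view, the operator implementation may also need to handle the places where the computation on a real physical device differs from the logical computation (that is, the consistent view).

For example, for the expand operator, the expand_size that the user passes in at the logical level has to be adjusted on each physical device.

Code link: https://github.com/Oneflow-Inc/oneflow/blob/master/oneflow/user/ops/expand_op.cpp#L62

A concrete example:

    logical input_shape = [4, 3, 1, 2]
    logical stride = [6, 2, 2, 1]
    logical expand_size = [2, 4, 3, 4, 2]
    logical output_stride = [0, 6, 2, 0, 1]

Suppose the user sets the sbp of the input tensor to split along the last dimension (dimension 3) and the tensor is placed on two devices. Then on each physical device:

    physical input_shape = [4, 3, 1, 1]
    physical stride = [3, 1, 1, 1]

On the real physical device, expand_size and output_stride therefore become:

    physical expand_size = [2, 4, 3, 4, 1]
    physical output_stride = [0, 3, 1, 0, 1]

Why do they have to change?

First, under the consistent view, when each physical device performs the actual computation, the input it receives has the physical shape obtained after splitting.

In the example above, the physical shape of the input on each device becomes [4, 3, 1, 1], so the logical expand_size and output_stride no longer match it.

Since how the user sets sbp is only known at runtime, expand_size and output_stride must be recomputed from the actual input size before the computation runs on the physical device.

Code link: https://github.com/Oneflow-Inc/oneflow/blob/master/oneflow/user/kernels/expand_kernel.cu#L129

2. The repeat operator

Usage

oneflow.repeat(input, *sizes)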
The conventions for setting the sizes argument (called repeat_size below) are described next.
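The size of each dimension of the output tensor is computed as follows, where diff = len(repeat_size) - len(input_shape):

For a dimension that already exists in the input: output_shape[i] = input_shape[i - diff] * repeat_size[i]

For a newly added dimension: output_shape[i] = repeat_size[i]

Example

    input_shape = [4, 1, 3, 5]
    repeat_size = [2, 1, 2, 4, 1, 1]
    output_shape = [2, 1, 8, 4, 3, 5]

Relationship with the expand operator

On closer inspection, repeat can be expressed in terms of expand: reshape the input, expand it, then reshape the result. Some examples:

Example 1:

    input_shape = [5]
    repeat_size = [3]
    output_shape = [15]
    # Equivalent to the following operations
    input_shape = [5]
    reshaped_input_shape = [1, 5]
    expand_size = [3, 5]
    output_shape = [3, 5]
    reshaped_output_shape = [15]

Example 2:

    input_shape = [3, 1, 5]
    repeat_size = [5, 3, 1]
    output_shape = [15, 3, 5]
    # Equivalent to the following operations
    input_shape = [3, 1, 5]
    reshaped_input_shape = [1, 3, 1, 5]
    expand_size = [5, 3, 3, 5]
    output_shape = [5, 3, 3, 5]
    reshaped_output_shape = [15, 3, 5]

Example 3:

    input_shape = [3, 1, 5]
    repeat_size = [2, 5, 3, 1]
    output_shape = [2, 15, 3, 5]
    # Equivalent to the following operations
    input_shape = [3, 1, 5]
    reshaped_input_shape = [1, 3, 1, 5]
    expand_size = [2, 5, 3, 3, 5]
    output_shape = [2, 5, 3, 3, 5]
    reshaped_output_shape = [2, 15, 3, 5]

From the examples above, repeat can be implemented as reshape, expand, reshape, so the problem reduces to computing input_reshape, expand_size and output_reshape from input_shape and repeat_size.

Sample code:

    import oneflow as flow

    # input, input_shape and repeat_size are assumed to be defined already
    input_reshape = []
    output_reshape = []
    expand_size = []
    diff = len(repeat_size) - len(input_shape)
    for i in range(len(repeat_size) - 1, -1, -1):
        if i >= diff:
            if repeat_size[i] > 1:
                if input_shape[i - diff] > 1:
                    # Insert a size-1 dimension in front and expand along it
                    input_reshape.insert(0, input_shape[i - diff])
                    input_reshape.insert(0, 1)
                    expand_size.insert(0, input_shape[i - diff])
                    expand_size.insert(0, repeat_size[i])
                    output_reshape.insert(0, input_shape[i - diff] * repeat_size[i])
                else:
                    # The dimension already has size 1: expand it directly
                    input_reshape.insert(0, input_shape[i - diff])
                    expand_size.insert(0, repeat_size[i])
                    output_reshape.insert(0, repeat_size[i])
            else:
                # No repetition along this dimension
                input_reshape.insert(0, input_shape[i - diff])
                expand_size.insert(0, input_shape[i - diff])
                output_reshape.insert(0, input_shape[i - diff])
        else:
            # Newly added dimension
            expand_size.insert(0, repeat_size[i])
            output_reshape.insert(0, repeat_size[i])

    new_tensor = flow.reshape(input, input_reshape)
    tmp_tensor = new_tensor.expand(*expand_size)
    out = flow.reshape(tmp_tensor, output_reshape)

That said, this is something of a shortcut implementation.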
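The equivalence between repeat and reshape, expand, reshape can be checked without any framework. The sketch below works on flat row-major lists, where a reshape is a no-op on the buffer, and compares the expand-based result with a direct definition of repeat (each output index taken modulo the input size). All function names here are hypothetical helpers for illustration, and every entry of repeat_size is assumed to be at least 1.

```python
from itertools import product


def contiguous_strides(shape):
    """Strides (in elements) of a contiguous row-major tensor."""
    strides = [1]
    for size in reversed(shape[1:]):
        strides.insert(0, strides[0] * size)
    return strides


def plan_repeat(input_shape, repeat_size):
    """Return (input_reshape, expand_size, output_reshape) for a repeat."""
    input_reshape, expand_size, output_reshape = [], [], []
    diff = len(repeat_size) - len(input_shape)
    for i in range(len(repeat_size) - 1, -1, -1):
        rep = repeat_size[i]
        if i < diff:  # newly added leading dimension
            expand_size.insert(0, rep)
            output_reshape.insert(0, rep)
            continue
        dim = input_shape[i - diff]
        if rep > 1 and dim > 1:
            # Put a size-1 dimension in front and expand along it.
            input_reshape[:0] = [1, dim]
            expand_size[:0] = [rep, dim]
        else:
            # Either dim == 1 (expand it to rep) or rep == 1 (keep dim).
            input_reshape.insert(0, dim)
            expand_size.insert(0, dim * rep)
        output_reshape.insert(0, dim * rep)
    return input_reshape, expand_size, output_reshape


def expand_flat(flat, in_shape, expand_size):
    """Expand a flat row-major buffer by giving expanded dims a stride of 0."""
    in_stride = contiguous_strides(in_shape)
    diff = len(expand_size) - len(in_shape)
    out_stride = [
        0 if (i < diff or in_shape[i - diff] != expand_size[i]) else in_stride[i - diff]
        for i in range(len(expand_size))
    ]
    return [
        flat[sum(i * s for i, s in zip(idx, out_stride))]
        for idx in product(*map(range, expand_size))
    ]


def repeat_via_expand(flat, in_shape, repeat_size):
    """repeat = reshape -> expand -> reshape; reshapes leave a flat buffer as-is."""
    input_reshape, expand_size, output_reshape = plan_repeat(in_shape, repeat_size)
    return expand_flat(flat, input_reshape, expand_size), output_reshape


def repeat_reference(flat, in_shape, repeat_size):
    """Direct definition of repeat: wrap every output index around the input."""
    diff = len(repeat_size) - len(in_shape)
    padded = [1] * diff + list(in_shape)
    out_shape = [d * r for d, r in zip(padded, repeat_size)]
    stride = contiguous_strides(padded)
    return [
        flat[sum((i % d) * s for i, d, s in zip(idx, padded, stride))]
        for idx in product(*map(range, out_shape))
    ]


flat_input = list(range(3 * 1 * 5))
out, out_shape = repeat_via_expand(flat_input, [3, 1, 5], [2, 5, 3, 1])
same = out == repeat_reference(flat_input, [3, 1, 5], [2, 5, 3, 1])
```

Working on flat buffers keeps the sketch dependency-free; with a real tensor library the two reshapes would be explicit calls around the expand.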