
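[CNN in Python #7] Pooling Layer
Purpose of this article
This article implements the pooling layer needed to build a CNN (convolutional neural network) in Python. All of the code here can be reproduced by copying and pasting it.
Contents
1 Overview of max pooling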
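Max pooling slides a window over the input and keeps only the largest value found in each window. As a minimal sketch (not the implementation used later in this article), the loop below pools a single 4x4 array with a 2x2 window and a stride of 2. The function name `max_pool_2d_naive` and the sample values are made up for illustration.

```python
import numpy as np

def max_pool_2d_naive(img, pool):
    """Max-pool one 2D array with a pool x pool window and stride = pool."""
    h, w = img.shape
    out = np.zeros((h // pool, w // pool), dtype=img.dtype)
    for i in range(h // pool):
        for j in range(w // pool):
            # Cut out one non-overlapping window and keep its largest value
            window = img[i*pool:(i+1)*pool, j*pool:(j+1)*pool]
            out[i, j] = window.max()
    return out

img = np.array([[1, 3, 2, 4],
                [5, 6, 7, 8],
                [3, 2, 1, 0],
                [1, 2, 3, 4]])
pooled = max_pool_2d_naive(img, 2)
print(pooled)
# [[6 8]
#  [3 4]]
```

The 4x4 input shrinks to 2x2, and each output value is the maximum of one 2x2 block of the input.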

2 The im2col and col2im functions

# In[1]
import numpy as np
np.random.seed(1)
# In[2]
def im2col(x, fil_size, y_size, stride, pad):
    # x has shape (batch, channel, height, width)
    x_b, x_c, x_h, x_w = x.shape
    fil_h, fil_w = fil_size, fil_size
    y_h, y_w = y_size, y_size
    index = -1
    # Zero-pad only the height and width axes
    x_pad = np.pad(x, [(0, 0), (0, 0), (pad, pad), (pad, pad)], "constant")
    x_col = np.zeros((fil_h*fil_w, x_b, x_c, y_h, y_w))
    # For each offset (h, w) inside the filter, take a strided slice
    # that covers every output position at once
    for h in range(fil_h):
        h2 = h + y_h*stride
        for w in range(fil_w):
            index += 1
            w2 = w + y_w*stride
            x_col[index,:,:,:,:] = x_pad[:,:,h:h2:stride,w:w2:stride]
    # Rows: (channel, filter position); columns: (batch, output row, output column)
    x_col = x_col.transpose(2,0,1,3,4).reshape(x_c*fil_h*fil_w, x_b*y_h*y_w)
    return x_col
def col2im(dx_col, x_shape, fil_size, y_size, stride, pad):
    x_b, x_c, x_h, x_w = x_shape
    fil_h, fil_w = fil_size, fil_size
    y_h, y_w = y_size, y_size
    index = -1
    # Undo the final transpose/reshape of im2col
    dx_col = dx_col.reshape(x_c, fil_h*fil_w, x_b, y_h, y_w).transpose(1,2,0,3,4)
    dx = np.zeros((x_b, x_c, x_h+2*pad+stride-1, x_w+2*pad+stride-1))
    # Add each filter offset back to its place; overlapping windows accumulate
    for h in range(fil_h):
        h2 = h + y_h*stride
        for w in range(fil_w):
            index += 1
            w2 = w + y_w*stride
            dx[:,:,h:h2:stride,w:w2:stride] += dx_col[index,:,:,:,:]
    # Crop the padding so the result has the shape of the original input
    return dx[:,:,pad:x_h+pad, pad:x_w+pad]
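To see the layout that im2col produces, it can help to build the same matrix another way. The sketch below uses `numpy.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later) instead of the loop above, and it only covers the pad=0 case. The helper name `patches_as_columns` and the input `x_demo` are illustrative assumptions, not part of this article's code.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def patches_as_columns(x, fil_size, stride):
    """Gather fil_size x fil_size patches (no padding) into columns."""
    x_b, x_c, x_h, x_w = x.shape
    # win[b, c, i, j, h, w] == x[b, c, i*stride + h, j*stride + w]
    win = sliding_window_view(x, (fil_size, fil_size), axis=(2, 3))
    win = win[:, :, ::stride, ::stride]
    y_h, y_w = win.shape[2], win.shape[3]
    # Rows: (channel, filter position); columns: (batch, output row, output column)
    return win.transpose(1, 4, 5, 0, 2, 3).reshape(
        x_c*fil_size*fil_size, x_b*y_h*y_w)

x_demo = np.arange(16).reshape(1, 1, 4, 4)
cols = patches_as_columns(x_demo, 2, 2)
print(cols)
# [[ 0  2  8 10]
#  [ 1  3  9 11]
#  [ 4  6 12 14]
#  [ 5  7 13 15]]
```

Each column holds one flattened window (column 0 is the top-left 2x2 patch: 0, 1, 4, 5). This is the arrangement that `im2col(x_demo, 2, 2, 2, 0)` should also return, apart from im2col's float dtype.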
3 Implementing pooling

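# In[3]
x = np.random.randint(0,10,2*3*4*4).reshape(2,3,4,4)
x
# In[4]
# One 2x2 window per row (filter size 2, output size 2, stride 2, no padding)
x_col = im2col(x,2,2,2,0).T.reshape(-1,4)
x_col
# In[5]
y = np.max(x_col, axis=1)
y
# In[6]
# Rows are ordered (batch, out_h, out_w, channel); reorder to (batch, channel, out_h, out_w)
y = y.reshape(2, 2, 2, 3).transpose(0,3,1,2)
y
# In[7]
# Position of the maximum inside each window, needed for the backward pass
max_index = np.argmax(x_col, axis=1)
max_index
# In[8]
# Dummy upstream gradient of ones, in (batch, out_h, out_w, channel) order
dy = np.ones(y.shape).transpose(0,2,3,1)
dy
# In[9]
dx = np.zeros((2*2, dy.size))
dx
# In[10]
# Route each gradient to the position that held the maximum
dx[max_index.reshape(-1), np.arange(dy.size)] = dy.reshape(-1)
dx
# In[11]
# Rearrange into the column layout that col2im expects
dx = dx.reshape(2, 2, 2, 2, 2, 3).transpose(5,0,1,2,3,4).reshape(3*2*2, 2*2*2)
dx
# In[12]
col2im(dx, x.shape, 2, 2, 2, 0)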
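The assignment in In[10] is the core of the backward pass: each upstream gradient is written to the one position in its window that held the maximum, and every other position stays at zero. The sketch below isolates that step with three hand-written windows. The arrays `windows` and `upstream` are made-up values for illustration only.

```python
import numpy as np

# Three pooling windows, each flattened to 4 values (one window per row)
windows = np.array([[1, 9, 3, 4],
                    [7, 2, 7, 0],
                    [5, 6, 8, 8]])
# np.argmax returns the first maximum when a window contains ties
max_index = np.argmax(windows, axis=1)
# Hypothetical upstream gradient, one value per window
upstream = np.array([10.0, 20.0, 30.0])

n_windows, n_positions = windows.shape
grad = np.zeros((n_positions, n_windows))
# Integer-array indexing pairs up (max_index[k], k) for every window k
grad[max_index, np.arange(n_windows)] = upstream
print(max_index)  # [1 0 2]
print(grad.T)
# [[ 0. 10.  0.  0.]
#  [20.  0.  0.  0.]
#  [ 0.  0. 30.  0.]]
```

When a window contains tied maxima, only the first one receives the gradient, because `np.argmax` returns the first occurrence.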
4 Pooling layer

# In[13]
class Pooling:
    def __init__(self, pool):
        self.pool = pool  # window size, also used as the stride
    def forward(self, x):
        self.xshape = x.shape
        self.x_b, self.x_c, self.x_h, self.x_w = x.shape
        # Output size is rounded up. Note: im2col is called with pad=0 below,
        # so this appears to work correctly only when x_h and x_w are multiples of pool.
        self.y_h = self.x_h//self.pool if self.x_h%self.pool==0 else self.x_h//self.pool+1
        self.y_w = self.x_w//self.pool if self.x_w%self.pool==0 else self.x_w//self.pool+1
        # One window per row, ordered (batch, out_h, out_w, channel)
        x_col = im2col(x, self.pool, self.y_h, self.pool, 0).T.reshape(-1,self.pool*self.pool)
        y = np.max(x_col, axis=1)
        self.y = y.reshape(self.x_b, self.y_h, self.y_w, self.x_c).transpose(0,3,1,2)
        # Remember where each maximum was for the backward pass
        self.max_index = np.argmax(x_col, axis=1)
        return self.y
    def backward(self, dy):
        dy = dy.transpose(0,2,3,1)
        dx = np.zeros((self.pool*self.pool, dy.size))
        # Route each gradient to the position that held the maximum
        dx[self.max_index.reshape(-1), np.arange(dy.size)] = dy.reshape(-1)
        # Rearrange into the column layout that col2im expects
        dx = dx.reshape(self.pool, self.pool, self.x_b, self.y_h, self.y_w, self.x_c)
        dx = dx.transpose(5,0,1,2,3,4)
        dx = dx.reshape(self.x_c*self.pool*self.pool, self.x_b*self.y_h*self.y_w)
        self.dx = col2im(dx, self.xshape, self.pool, self.y_h, self.pool, 0)
        return self.dx
# In[14]
pool = Pooling(2)
# In[15]
x.shape
# In[16]
y = pool.forward(x)
y
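As a sanity check on the forward pass, the output can be compared with a much simpler formulation. For non-overlapping windows, max pooling can be written as a reshape followed by a maximum over the two in-window axes. This is a sketch under the assumption that the height and width are divisible by the pool size. The function name `max_pool_reference` and the random input `x_ref` are introduced here for illustration.

```python
import numpy as np

def max_pool_reference(x, pool):
    """Non-overlapping max pooling via reshape.

    Assumes x has shape (batch, channel, H, W) with H and W divisible by pool.
    """
    b, c, h, w = x.shape
    # blocks[b, c, i, u, j, v] == x[b, c, i*pool + u, j*pool + v]
    blocks = x.reshape(b, c, h // pool, pool, w // pool, pool)
    return blocks.max(axis=(3, 5))

rng = np.random.default_rng(0)
x_ref = rng.integers(0, 10, size=(2, 3, 4, 4))
y_ref = max_pool_reference(x_ref, 2)
print(y_ref.shape)  # (2, 3, 2, 2)
```

With the Pooling class above defined in the same session, `np.array_equal(Pooling(2).forward(x_ref), max_pool_reference(x_ref, 2))` would be expected to return True.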