Common NumPy Operations for Deep Learning in Python


Author: ClementCJ | Published 2019-05-13 11:27

Arrays in NumPy are also called tensors. Tensors are the basic data structure of machine-learning systems. A tensor's number of dimensions is usually described in terms of axes, and by dimensionality tensors are classified as 0D tensors (scalars), 1D tensors (vectors), 2D tensors (matrices), and 3D or higher-dimensional tensors.
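As a quick illustration (a minimal sketch, not part of the original article), the ndim attribute of a NumPy array reports how many axes a tensor has:

import numpy as np

scalar = np.array(12)                  # 0D tensor (scalar)
vector = np.array([1, 2, 3])           # 1D tensor (vector)
matrix = np.array([[1, 2], [3, 4]])    # 2D tensor (matrix)
tensor3d = np.zeros((2, 3, 4))         # 3D tensor
print(scalar.ndim, vector.ndim, matrix.ndim, tensor3d.ndim)  # 0 1 2 3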

1. The sigmoid function

sigmoid(x) = \frac{1}{1+e^{-x}} is a non-linear function, used for classification in machine learning as logistic regression and as an activation function in deep learning. Below we build the sigmoid function with Python's math module and with the numpy module.

1) Building sigmoid with the math module:
import math

def sigmoid(x):
    s = 1 / (1 + math.exp(-x))
    return s

The math-based sigmoid function accepts a single number, but in deep learning we usually work with matrices and vectors, so we turn to numpy.
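For example (a small sketch, not from the original article), calling the math-based sigmoid on a NumPy array raises a TypeError, because math.exp only accepts a single Python number:

import numpy as np

x = np.array([1, 2, 3])
try:
    sigmoid(x)            # math.exp cannot handle array input
except TypeError as err:
    print(err)            # e.g. "only size-1 arrays can be converted to Python scalars"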

2) Building sigmoid with the numpy module:

numpy.exp(x): the input x can be a scalar, a vector, or a multidimensional array.

When x is a vector:

import numpy as np
x = np.array([1, 2, 3])
print(np.exp(x)) # result is (exp(1), exp(2), exp(3))

# Output
[ 2.71828183 7.3890561 20.08553692]

When x is a matrix:

import numpy as np
x = np.array([[1, 2, 3], [4, 5, 6]])
print(np.exp(x)) # result is [[exp(1), exp(2), exp(3)], [exp(4), exp(5), exp(6)]]

# Output
[[ 2.71828183 7.3890561 20.08553692]
[ 54.59815003 148.4131591  403.42879349]]

The numpy-based sigmoid function:


import numpy as np

def basic_sigmoid(x):
    s = 1 / (1 + np.exp(-x))
    return s
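A quick usage check (illustrative, not from the original article): the numpy-based version applies element-wise to a vector.

import numpy as np

x = np.array([1, 2, 3])
print(basic_sigmoid(x))   # [0.73105858 0.88079708 0.95257413]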

Computing the derivative of the sigmoid function with numpy, using sigmoid'(x) = sigmoid(x)(1 - sigmoid(x)):

import numpy as np
def sigmoid_derivative(x):
    s = 1 / (1 + np.exp(-x))
    ds = s * (1 - s)
    return ds
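For example (a small sketch assuming the function above), the derivative reaches its maximum of 0.25 at x = 0 and falls off on both sides:

import numpy as np

print(sigmoid_derivative(0))                     # 0.25
print(sigmoid_derivative(np.array([-1, 0, 1])))  # [0.19661193 0.25       0.19661193]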

2. Tensor dimensions

In deep learning, NumPy's shape attribute and reshape function are commonly used to inspect and change a tensor's dimensions:

  • shape: get the dimensions of a matrix or vector X
  • reshape: change the dimensions of X
    For example, flatten a 3D image array of shape (length, width, channels) into a vector of shape (length * width * channels, 1):
def image2vector(image):
    v = image.reshape(image.shape[0] * image.shape[1] * image.shape[2], 1)
    return v
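A usage sketch (the toy image below is made up for illustration):

import numpy as np

image = np.random.rand(3, 3, 2)   # a toy "image" with shape (length, width, channels)
v = image2vector(image)
print(image.shape)                # (3, 3, 2)
print(v.shape)                    # (18, 1)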

3. Weight regularization

The principle of Occam's razor: if there are two explanations for something, the one most likely to be correct is the simplest, i.e. the one that makes the fewest assumptions. The principle also applies to the models learned by neural networks: given some training data and a network architecture, many sets of weight values (many models) can explain the data, and simpler models are less likely to overfit than complex ones.
A simple model here means one whose parameter-value distribution has lower entropy (or one with fewer parameters). A common way to reduce overfitting is therefore to force the model's weights to take only small values, which limits the model's complexity and makes the distribution of weight values more regular. This approach is called weight regularization, and it is implemented by adding to the network's loss function a cost associated with large weights. This cost comes in two forms (see the sketch after the list below).

  • L1 regularization: the added cost is proportional to the absolute values of the weight coefficients (the L1 norm of the weights).
  • L2 regularization: the added cost is proportional to the squares of the weight coefficients (the L2 norm of the weights). L2 regularization of neural networks is also called weight decay.
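A rough NumPy sketch of the two penalty terms (illustrative only; the weight matrix W and the regularization strength lam below are hypothetical, and in practice a deep-learning framework adds these terms for you):

import numpy as np

W = np.random.randn(4, 3)              # hypothetical weight matrix of one layer
lam = 0.001                            # hypothetical regularization strength

l1_penalty = lam * np.sum(np.abs(W))   # L1: proportional to the absolute weight values
l2_penalty = lam * np.sum(W ** 2)      # L2: proportional to the squared weight values (weight decay)
# total_loss = data_loss + l1_penalty  (or + l2_penalty)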

For example, for the matrix x = \left[ \begin{matrix} 0 & 3 & 4 \\ 2 & 6 & 4 \\ \end{matrix} \right],
the L2 norm of each row is \| x\| = np.linalg.norm(x, axis = 1, keepdims = True) = \begin{bmatrix} 5 \\ \sqrt{56} \\ \end{bmatrix},
so
x\_normalized = \frac{x}{\| x\|} = \begin{bmatrix} 0 & \frac{3}{5} & \frac{4}{5} \\ \frac{2}{\sqrt{56}} & \frac{6}{\sqrt{56}} & \frac{4}{\sqrt{56}} \\ \end{bmatrix}
Row normalization code:

def normalizeRows(x):
    x_norm = np.linalg.norm(x, axis = 1, keepdims = True)
    x = x / x_norm
    return x

x = np.array([
    [0, 3, 4],
    [1, 6, 4]])
print("normalizeRows(x) = " + str(normalizeRows(x)))

# Output
normalizeRows(x) = [[0.0 0.6 0.8]
 [0.13736056 0.82416338 0.54944226]]

4. The softmax function

\text{for } x \in \mathbb{R}^{1\times n} \text{, } softmax(x) = softmax(\begin{bmatrix} x_1 && x_2 && ... && x_n \end{bmatrix}) = \begin{bmatrix} \frac{e^{x_1}}{\sum_{j}e^{x_j}} && \frac{e^{x_2}}{\sum_{j}e^{x_j}} && ... && \frac{e^{x_n}}{\sum_{j}e^{x_j}} \end{bmatrix}

\text{for a matrix } x \in \mathbb{R}^{m \times n} \text{, } x_{ij} \text{ maps to the element in the } i^{th} \text{ row and } j^{th} \text{ column of } x \text{, thus we have:}

softmax(x) = softmax\begin{bmatrix} x_{11} & x_{12} & x_{13} & \dots & x_{1n} \\ x_{21} & x_{22} & x_{23} & \dots & x_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & x_{m3} & \dots & x_{mn} \end{bmatrix} = \begin{bmatrix} \frac{e^{x_{11}}}{\sum_{j}e^{x_{1j}}} & \frac{e^{x_{12}}}{\sum_{j}e^{x_{1j}}} & \frac{e^{x_{13}}}{\sum_{j}e^{x_{1j}}} & \dots & \frac{e^{x_{1n}}}{\sum_{j}e^{x_{1j}}} \\ \frac{e^{x_{21}}}{\sum_{j}e^{x_{2j}}} & \frac{e^{x_{22}}}{\sum_{j}e^{x_{2j}}} & \frac{e^{x_{23}}}{\sum_{j}e^{x_{2j}}} & \dots & \frac{e^{x_{2n}}}{\sum_{j}e^{x_{2j}}} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \frac{e^{x_{m1}}}{\sum_{j}e^{x_{mj}}} & \frac{e^{x_{m2}}}{\sum_{j}e^{x_{mj}}} & \frac{e^{x_{m3}}}{\sum_{j}e^{x_{mj}}} & \dots & \frac{e^{x_{mn}}}{\sum_{j}e^{x_{mj}}} \end{bmatrix} = \begin{pmatrix} softmax\text{(first row of x)} \\ softmax\text{(second row of x)} \\ ... \\ softmax\text{(last row of x)} \\ \end{pmatrix}

import numpy as np

def softmax(x):
    # x.shape is (m, n)
    x_exp = np.exp(x)
    # x_sum.shape is (m, 1)
    x_sum = np.sum(x_exp, axis = 1, keepdims = True)
    # s.shape is (m, n)
    #  x_exp/x_sum works due to python broadcasting.
    s = x_exp / x_sum

    return s
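A quick usage check (illustrative, assuming the softmax above): every row of the result sums to 1.

import numpy as np

x = np.array([[9, 2, 5, 0, 0],
              [7, 5, 0, 0, 0]])
s = softmax(x)
print(np.sum(s, axis = 1))   # [1. 1.]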

5. Vectorization

In deep learning, when you have to process large amounts of data, a function that has not been computationally optimized can become a huge bottleneck in your algorithm and make the model take a long time to run. Vectorizing the data makes the code much more efficient. For example, compare the running time of the non-vectorized and vectorized implementations of the dot product, outer product, and element-wise product below:

Non-vectorized

import time
import numpy as np

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]

### CLASSIC DOT PRODUCT OF VECTORS IMPLEMENTATION ###
tic = time.process_time()
dot = 0
for i in range(len(x1)):
    dot+= x1[i]*x2[i]
toc = time.process_time()
print ("dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

### CLASSIC OUTER PRODUCT IMPLEMENTATION ###
tic = time.process_time()
outer = np.zeros((len(x1),len(x2))) # we create a len(x1)*len(x2) matrix with only zeros
for i in range(len(x1)):
    for j in range(len(x2)):
        outer[i,j] = x1[i]*x2[j]
toc = time.process_time()
print ("outer = " + str(outer) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

### CLASSIC ELEMENTWISE IMPLEMENTATION ###
tic = time.process_time()
mul = np.zeros(len(x1))
for i in range(len(x1)):
    mul[i] = x1[i]*x2[i]
toc = time.process_time()
print ("elementwise multiplication = " + str(mul) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

### CLASSIC GENERAL DOT PRODUCT IMPLEMENTATION ###
W = np.random.rand(3,len(x1)) # Random 3*len(x1) numpy array
tic = time.process_time()
gdot = np.zeros(W.shape[0])
for i in range(W.shape[0]):
    for j in range(len(x1)):
        gdot[i] += W[i,j]*x1[j]
toc = time.process_time()
print ("gdot = " + str(gdot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

# Output
dot = 278
 ----- Computation time = 0.1380000000000825ms
outer = [[81. 18. 18. 81.  0. 81. 18. 45.  0.  0. 81. 18. 45.  0.  0.]
 [18.  4.  4. 18.  0. 18.  4. 10.  0.  0. 18.  4. 10.  0.  0.]
 [45. 10. 10. 45.  0. 45. 10. 25.  0.  0. 45. 10. 25.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [63. 14. 14. 63.  0. 63. 14. 35.  0.  0. 63. 14. 35.  0.  0.]
 [45. 10. 10. 45.  0. 45. 10. 25.  0.  0. 45. 10. 25.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [81. 18. 18. 81.  0. 81. 18. 45.  0.  0. 81. 18. 45.  0.  0.]
 [18.  4.  4. 18.  0. 18.  4. 10.  0.  0. 18.  4. 10.  0.  0.]
 [45. 10. 10. 45.  0. 45. 10. 25.  0.  0. 45. 10. 25.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]]
 ----- Computation time = 0.3070000000000572ms
elementwise multiplication = [81.  4. 10.  0.  0. 63. 10.  0.  0.  0. 81.  4. 25.  0.  0.]
 ----- Computation time = 0.09599999999998499ms
gdot = [26.31518509 28.05833259 33.88453962]
 ----- Computation time = 0.14199999999997548ms

Vectorized

import time
import numpy as np

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]
W = np.random.rand(3,len(x1)) # Random 3*len(x1) numpy array (same role as in the non-vectorized example)

### VECTORIZED DOT PRODUCT OF VECTORS ###
tic = time.process_time()
dot = np.dot(x1,x2)
toc = time.process_time()
print ("dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

### VECTORIZED OUTER PRODUCT ###
tic = time.process_time()
outer = np.outer(x1,x2)
toc = time.process_time()
print ("outer = " + str(outer) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

### VECTORIZED ELEMENTWISE MULTIPLICATION ###
tic = time.process_time()
mul = np.multiply(x1,x2)
toc = time.process_time()
print ("elementwise multiplication = " + str(mul) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

### VECTORIZED GENERAL DOT PRODUCT ###
tic = time.process_time()
dot = np.dot(W,x1)
toc = time.process_time()
print ("gdot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

# Output
dot = 278
 ----- Computation time = 0.12699999999998823ms
outer = [[81 18 18 81  0 81 18 45  0  0 81 18 45  0  0]
 [18  4  4 18  0 18  4 10  0  0 18  4 10  0  0]
 [45 10 10 45  0 45 10 25  0  0 45 10 25  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [63 14 14 63  0 63 14 35  0  0 63 14 35  0  0]
 [45 10 10 45  0 45 10 25  0  0 45 10 25  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [81 18 18 81  0 81 18 45  0  0 81 18 45  0  0]
 [18  4  4 18  0 18  4 10  0  0 18  4 10  0  0]
 [45 10 10 45  0 45 10 25  0  0 45 10 25  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]]
 ----- Computation time = 0.08399999999997299ms
elementwise multiplication = [81  4 10  0  0 63 10  0  0  0 81  4 25  0  0]
 ----- Computation time = 0.058000000000002494ms
gdot = [26.31518509 28.05833259 33.88453962]
 ----- Computation time = 0.12400000000001299ms

6. Implementing the L1 and L2 loss functions

  • L1 loss: \begin{align*} & L_1(\hat{y}, y) = \sum_{i=0}^m|y^{(i)} - \hat{y}^{(i)}| \end{align*}
import numpy as np

def L1(yhat, y):
    loss = np.sum(np.abs(y - yhat))
    return loss
yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L1 = " + str(L1(yhat,y)))

# Output
L1 = 1.1

  • L2 loss: \begin{align*} & L_2(\hat{y},y) = \sum_{i=0}^m(y^{(i)} - \hat{y}^{(i)})^2 \end{align*}
import numpy as np

def L2(yhat, y):
    loss = np.sum((y - yhat) ** 2)
    return loss
yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L2 = " + str(L2(yhat,y)))
# Output
L2 = 0.43
