Arrays in NumPy are also called tensors. The tensor is the basic data structure of machine learning systems. A tensor's dimensions are usually called axes; by number of dimensions, tensors are divided into 0D tensors (scalars), 1D tensors (vectors), 2D tensors (matrices), and 3D or higher-dimensional tensors.
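As a quick illustration (a minimal sketch, not from the original), the ndim attribute reports the number of axes of each kind of tensor:
import numpy as np
x0 = np.array(12)                # 0D tensor (scalar)
x1 = np.array([1, 2, 3])         # 1D tensor (vector)
x2 = np.array([[1, 2], [3, 4]])  # 2D tensor (matrix)
x3 = np.zeros((2, 3, 4))         # 3D tensor
print(x0.ndim, x1.ndim, x2.ndim, x3.ndim)  # 0 1 2 3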
1. The sigmoid function
sigmoid(x) = 1 / (1 + exp(-x)) is a nonlinear function, used as the classifier in logistic regression and as an activation function in deep learning. Below we build the sigmoid function first with Python's math module and then with numpy.
1) Building sigmoid with the math module:
import math

def sigmoid(x):
    # works on a single number only
    s = 1 / (1 + math.exp(-x))
    return s
The math-based sigmoid accepts a single number, but in deep learning we usually work with matrices and vectors, so we need numpy.
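For example (a quick check, not in the original), calling the math-based sigmoid on a numpy array fails, because math.exp only accepts a scalar:
import numpy as np
x = np.array([1, 2, 3])
sigmoid(x)  # raises TypeError: math.exp cannot convert an array to a scalar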
2) Building sigmoid with the numpy module:
numpy.exp(x): the input x can be a scalar, a vector, or a multi-dimensional array.
When x is a vector:
import numpy as np
x = np.array([1, 2, 3])
print(np.exp(x))  # result is (exp(1), exp(2), exp(3))
# output
[ 2.71828183 7.3890561 20.08553692]
When x is a matrix:
import numpy as np
x = np.array([[1, 2, 3], [4, 5, 6]])
print(np.exp(x))  # result is [[exp(1), exp(2), exp(3)], [exp(4), exp(5), exp(6)]]
# output
[[ 2.71828183 7.3890561 20.08553692]
[ 54.59815003 148.4131591 403.42879349]]
The sigmoid function built with numpy:
import numpy as np

def basic_sigmoid(x):
    # np.exp is elementwise, so x may be a scalar, vector, or matrix
    s = 1 / (1 + np.exp(-x))
    return s
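Now the same call works elementwise on a numpy array:
x = np.array([1, 2, 3])
print(basic_sigmoid(x))  # [0.73105858 0.88079708 0.95257413]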
Computing the derivative of the sigmoid with numpy, using sigmoid'(x) = s * (1 - s) where s = sigmoid(x):
import numpy as np

def sigmoid_derivative(x):
    s = 1 / (1 + np.exp(-x))  # s = sigmoid(x)
    ds = s * (1 - s)          # derivative of sigmoid
    return ds
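As a quick sanity check (values not in the original), the derivative peaks at x = 0 with value 0.25:
x = np.array([-1, 0, 1])
print(sigmoid_derivative(x))  # [0.19661193 0.25 0.19661193]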
2. Tensor dimensions
In deep learning we constantly work with tensor dimensions through two numpy facilities, the shape attribute and the reshape() method:
- shape: get the dimensions of a matrix or vector X
- reshape: change X into a new shape
For example, to unroll a 3D image array of shape (length, width, channels) into a 1D column vector of shape (length * width * channels, 1):
def image2vector(image):
    # unroll a (length, width, channels) image into a column vector
    v = image.reshape(image.shape[0] * image.shape[1] * image.shape[2], 1)
    return v
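A small usage sketch (with a made-up 2 x 2 image of 3 channels):
import numpy as np
image = np.random.rand(2, 2, 3)   # shape (2, 2, 3)
print(image2vector(image).shape)  # (12, 1)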
3. Weight regularization
The principle of Occam's razor: if there are two explanations for something, the explanation most likely to be correct is the simpler one, the one that makes fewer assumptions. The principle also applies to the models learned by neural networks: given some training data and a network architecture, many sets of weight values (i.e. many models) can explain the data, and simple models are less likely to overfit than complex ones.
Here a simple model means a model where the distribution of parameter values has less entropy (or a model with fewer parameters). A common way to reduce overfitting is therefore to force the weights to take only small values, which limits the model's complexity and makes the distribution of weight values more regular. This is called weight regularization, and it is implemented by adding to the network's loss function a cost associated with large weights. This cost comes in two forms (a small sketch of both follows the list below).
- L1 regularization: the added cost is proportional to the absolute value of the weight coefficients (the L1 norm of the weights).
- L2 regularization: the added cost is proportional to the square of the weight coefficients (the L2 norm of the weights). L2 regularization of neural networks is also called weight decay.
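As a minimal sketch (the weight matrix W and strength lam below are made up for illustration), the two penalty terms added to the loss could be computed as:
import numpy as np

W = np.random.rand(4, 3)  # hypothetical weight matrix
lam = 0.01                # hypothetical regularization strength

l1_cost = lam * np.sum(np.abs(W))  # L1 penalty: proportional to |W|
l2_cost = lam * np.sum(W ** 2)     # L2 penalty: proportional to W^2
# total loss = data loss + l1_cost (or + l2_cost)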
A related operation is normalizing each row of a matrix by its norm. For the matrix x = [[0, 3, 4], [1, 6, 4]], the L2 norm of each row is [[5], [sqrt(53)]] (np.linalg.norm computes the L2 norm by default), and dividing x by these row norms gives rows of unit length. Row normalization:
def normalizeRows(x):
    # L2 norm of each row; keepdims=True gives shape (m, 1)
    x_norm = np.linalg.norm(x, axis=1, keepdims=True)
    # divide each row by its norm, broadcasting (m, n) / (m, 1)
    x = x / x_norm
    return x
x = np.array([
    [0, 3, 4],
    [1, 6, 4]])
print("normalizeRows(x) = " + str(normalizeRows(x)))
# output
normalizeRows(x) = [[0. 0.6 0.8]
[0.13736056 0.82416338 0.54944226]]
4. The softmax function
softmax turns each row of a score matrix into a probability distribution: softmax(x)_j = exp(x_j) / sum_k exp(x_k), computed row by row.
import numpy as np

def softmax(x):
    # x.shape is (m, n)
    x_exp = np.exp(x)
    # x_sum.shape is (m, 1)
    x_sum = np.sum(x_exp, axis=1, keepdims=True)
    # s.shape is (m, n); x_exp / x_sum works thanks to broadcasting
    s = x_exp / x_sum
    return s
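A small usage sketch (input values made up for illustration); each row of the result sums to 1:
x = np.array([[9, 2, 5, 0, 0],
              [7, 5, 0, 0, 0]])
print(np.sum(softmax(x), axis=1))  # [1. 1.]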
5. Vectorization
In deep learning you deal with very large datasets, so a function that is not computationally optimized can become a huge bottleneck in your algorithm and make the model take a long time to run. Vectorizing the computation makes the code much more efficient. For example, compare the time taken by non-vectorized and vectorized versions of the dot, outer, and elementwise products:
Non-vectorized:
import time
import numpy as np
x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]
### CLASSIC DOT PRODUCT OF VECTORS IMPLEMENTATION ###
tic = time.process_time()
dot = 0
for i in range(len(x1)):
    dot += x1[i] * x2[i]
toc = time.process_time()
print ("dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")
### CLASSIC OUTER PRODUCT IMPLEMENTATION ###
tic = time.process_time()
outer = np.zeros((len(x1),len(x2))) # we create a len(x1)*len(x2) matrix with only zeros
for i in range(len(x1)):
    for j in range(len(x2)):
        outer[i, j] = x1[i] * x2[j]
toc = time.process_time()
print ("outer = " + str(outer) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")
### CLASSIC ELEMENTWISE IMPLEMENTATION ###
tic = time.process_time()
mul = np.zeros(len(x1))
for i in range(len(x1)):
    mul[i] = x1[i] * x2[i]
toc = time.process_time()
print ("elementwise multiplication = " + str(mul) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")
### CLASSIC GENERAL DOT PRODUCT IMPLEMENTATION ###
W = np.random.rand(3,len(x1)) # Random 3*len(x1) numpy array
tic = time.process_time()
gdot = np.zeros(W.shape[0])
for i in range(W.shape[0]):
    for j in range(len(x1)):
        gdot[i] += W[i, j] * x1[j]
toc = time.process_time()
print ("gdot = " + str(gdot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")
# output
dot = 278
----- Computation time = 0.1380000000000825ms
outer = [[81. 18. 18. 81. 0. 81. 18. 45. 0. 0. 81. 18. 45. 0. 0.]
[18. 4. 4. 18. 0. 18. 4. 10. 0. 0. 18. 4. 10. 0. 0.]
[45. 10. 10. 45. 0. 45. 10. 25. 0. 0. 45. 10. 25. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[63. 14. 14. 63. 0. 63. 14. 35. 0. 0. 63. 14. 35. 0. 0.]
[45. 10. 10. 45. 0. 45. 10. 25. 0. 0. 45. 10. 25. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[81. 18. 18. 81. 0. 81. 18. 45. 0. 0. 81. 18. 45. 0. 0.]
[18. 4. 4. 18. 0. 18. 4. 10. 0. 0. 18. 4. 10. 0. 0.]
[45. 10. 10. 45. 0. 45. 10. 25. 0. 0. 45. 10. 25. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
----- Computation time = 0.3070000000000572ms
elementwise multiplication = [81. 4. 10. 0. 0. 63. 10. 0. 0. 0. 81. 4. 25. 0. 0.]
----- Computation time = 0.09599999999998499ms
gdot = [26.31518509 28.05833259 33.88453962]
----- Computation time = 0.14199999999997548ms
Vectorized:
import time
import numpy as np
x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]
### VECTORIZED DOT PRODUCT OF VECTORS ###
tic = time.process_time()
dot = np.dot(x1,x2)
toc = time.process_time()
print ("dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")
### VECTORIZED OUTER PRODUCT ###
tic = time.process_time()
outer = np.outer(x1,x2)
toc = time.process_time()
print ("outer = " + str(outer) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")
### VECTORIZED ELEMENTWISE MULTIPLICATION ###
tic = time.process_time()
mul = np.multiply(x1,x2)
toc = time.process_time()
print ("elementwise multiplication = " + str(mul) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")
### VECTORIZED GENERAL DOT PRODUCT ###
W = np.random.rand(3, len(x1))  # W from the earlier snippet; redefined here so this snippet runs standalone
tic = time.process_time()
dot = np.dot(W, x1)
toc = time.process_time()
print ("gdot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")
# output
dot = 278
----- Computation time = 0.12699999999998823ms
outer = [[81 18 18 81 0 81 18 45 0 0 81 18 45 0 0]
[18 4 4 18 0 18 4 10 0 0 18 4 10 0 0]
[45 10 10 45 0 45 10 25 0 0 45 10 25 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[63 14 14 63 0 63 14 35 0 0 63 14 35 0 0]
[45 10 10 45 0 45 10 25 0 0 45 10 25 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[81 18 18 81 0 81 18 45 0 0 81 18 45 0 0]
[18 4 4 18 0 18 4 10 0 0 18 4 10 0 0]
[45 10 10 45 0 45 10 25 0 0 45 10 25 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]
----- Computation time = 0.08399999999997299ms
elementwise multiplication = [81 4 10 0 0 63 10 0 0 0 81 4 25 0 0]
----- Computation time = 0.058000000000002494ms
gdot = [26.31518509 28.05833259 33.88453962]
----- Computation time = 0.12400000000001299ms
6. Implementing the L1 and L2 loss functions
- L1 loss function (sum of absolute errors):
import numpy as np

def L1(yhat, y):
    # sum of absolute differences between labels and predictions
    loss = np.sum(np.abs(y - yhat))
    return loss
yhat = np.array([0.9, 0.2, 0.1, 0.4, 0.9])
y = np.array([1, 0, 0, 1, 1])
print("L1 = " + str(L1(yhat,y)))
# output
L1 = 1.1
- L2 loss function (sum of squared errors):
import numpy as np

def L2(yhat, y):
    # sum of squared differences between labels and predictions
    loss = np.sum((y - yhat) ** 2)
    return loss
yhat = np.array([0.9, 0.2, 0.1, 0.4, 0.9])
y = np.array([1, 0, 0, 1, 1])
print("L2 = " + str(L2(yhat,y)))
# output
L2 = 0.43
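Equivalently (a small variation, not in the original), the L2 loss can be written as the dot product of the error vector with itself:
def L2_dot(yhat, y):
    # (y - yhat) . (y - yhat) equals the sum of squared errors
    return np.dot(y - yhat, y - yhat)

print("L2 = " + str(L2_dot(yhat, y)))  # approximately 0.43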