
Softmax Regression

Softmax regression is the generalization of logistic regression (LR) to multi-class classification; like LR, it belongs to the family of Generalized Linear Models.

For each class i, the Softmax output is

S_i = e^{V_i} / \sum_{j=1}^{C} e^{V_j}

where V_i is the output of the unit feeding the classifier for class i, i is the class index, and C is the total number of classes. S_i is the ratio of the exponential of the current element to the sum of the exponentials of all elements. Softmax turns the raw multi-class outputs into relative probabilities, which are easier to interpret and compare. Let's look at the following example.

# What is the Softmax function?
"""
In practice, using Softmax requires caring about numerical overflow. Because of
the exponentiation, if the values in V are large, exp(V) can easily overflow.
So V is preprocessed first: subtract the maximum element of V from every element of V.
"""
import numpy as np

scores = np.array([123, 456, 789])
scores -= np.max(scores)  # shift so the largest score becomes 0
p = np.exp(scores) / np.sum(np.exp(scores))
# print(p)

# The Softmax loss function
# L_i = -log(S_i) = -log( e^{s_{y_i}} / sum_j e^{s_j} )
# where s_{y_i} is the linear score of the correct class and S_i is the Softmax
# output of the correct class.
# Since taking the log does not change monotonicity, we apply log to S_i.
# We want S_i to be as large as possible, i.e. the relative probability of the
# correct class should be as large as possible, so we put a minus sign in front
# of it to turn it into a loss.
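To make the two pieces above concrete, here is a minimal sketch (the helper names softmax and softmax_cross_entropy are mine, not part of the original note) that applies the max-shift trick and then evaluates the -log(S_i) loss on a toy score vector:

import numpy as np

def softmax(v):
    # Shift by the maximum for numerical stability, then normalize the exponentials.
    exps = np.exp(v - np.max(v))
    return exps / np.sum(exps)

def softmax_cross_entropy(v, correct_class):
    # Loss = -log(S_correct): the negative log-probability of the correct class.
    return -np.log(softmax(v)[correct_class])

scores = np.array([1.0, 2.0, 3.0])
print(softmax(scores))                   # roughly [0.090, 0.245, 0.665]
print(softmax_cross_entropy(scores, 2))  # roughly 0.408, i.e. -log(0.665)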
def softmax_loss_naive(W, X, y, reg):
    """
    Softmax loss function, naive implementation (with loops).

    Inputs have dimension D, there are C classes, and we operate on minibatches of N examples.
    :param W: A numpy array of shape (D, C) containing weights
    :param X: A numpy array of shape (N, D) containing a minibatch of data
    :param y: A numpy array of shape (N,) containing training labels; y[i] = c means X[i] has label c
    :param reg: (float) regularization strength
    :return: A tuple of (loss as a single float, gradient with respect to weights W, an array of the same shape as W)
    """
    # initialize the loss and gradient to zero
    loss = 0.0
    dW = np.zeros_like(W)

    num_train = X.shape[0]
    num_classes = W.shape[1]
    for i in range(num_train):
        scores = X[i, :].dot(W)
        scores_shift = scores - np.max(scores)
        right_class = y[i]
        loss += (-scores_shift[right_class] + np.log(np.sum(np.exp(scores_shift))))
        for j in range(num_classes):
            softmax_output = np.exp(scores_shift[j]) / np.sum(np.exp(scores_shift))
            if j == y[i]:
                dW[:, j] += (-1 + softmax_output) * X[i, :]
            else:
                dW[:, j] += softmax_output * X[i, :]
    loss /= num_train
    loss += 0.5 * reg * np.sum(W * W)
    dW /= num_train
    dW += reg * W

    return loss, dW
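
# --- A quick sanity check (my own addition, not part of the original note) ---
# Evaluate the naive loss on small random data and compare one gradient entry
# against a centered finite difference; the two values should agree closely.
import numpy as np

np.random.seed(0)
D, C, N = 5, 3, 10
W = 0.01 * np.random.randn(D, C)
X = np.random.randn(N, D)
y = np.random.randint(C, size=N)

loss, grad = softmax_loss_naive(W, X, y, 0.1)
print(loss)  # with 3 classes and tiny weights, roughly log(3) ~ 1.1

h = 1e-5
W_plus, W_minus = W.copy(), W.copy()
W_plus[0, 0] += h
W_minus[0, 0] -= h
numeric = (softmax_loss_naive(W_plus, X, y, 0.1)[0]
           - softmax_loss_naive(W_minus, X, y, 0.1)[0]) / (2 * h)
print(numeric, grad[0, 0])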


def softmax_loss_vectorized(W, X, y, reg):
    """
    Softmax loss function, vectorized version.

    Inputs and outputs are the same as softmax_loss_naive.
    :param W: A numpy array of shape (D, C) containing weights
    :param X: A numpy array of shape (N, D) containing a minibatch of data
    :param y: A numpy array of shape (N,) containing training labels
    :param reg: (float) regularization strength
    :return: A tuple of (loss, gradient with respect to W)
    """
    loss = 0.0
    dW = np.zeros_like(W)

    # number of training examples
    num_train = X.shape[0]
    # number of classes
    num_classes = W.shape[1]
    # class scores for every example, shape (N, C)
    scores = X.dot(W)
    # subtract the per-row maximum for numerical stability; keepdims keeps the broadcast aligned
    scores_shift = scores - np.max(scores, axis=1, keepdims=True)
    softmax_output = np.exp(scores_shift) / np.sum(np.exp(scores_shift), axis=1, keepdims=True)
    loss = -np.sum(np.log(softmax_output[range(num_train), list(y)]))
    loss /= num_train
    loss += 0.5 * reg * np.sum(W * W)

    dS = softmax_output.copy()
    dS[range(num_train), list(y)] += -1
    dW = (X.T).dot(dS)
    dW = dW / num_train + reg * W

    return loss, dW
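
As a follow-up check (my own sketch, assuming the two functions above are defined in the same scope), the vectorized version can be compared against the naive one on random data; the losses and gradients should agree up to floating point error:

import numpy as np

np.random.seed(1)
D, C, N = 100, 10, 50
W = 0.001 * np.random.randn(D, C)
X = np.random.randn(N, D)
y = np.random.randint(C, size=N)

loss_naive, grad_naive = softmax_loss_naive(W, X, y, 1e-5)
loss_vec, grad_vec = softmax_loss_vectorized(W, X, y, 1e-5)

print(abs(loss_naive - loss_vec))                        # should be ~0
print(np.linalg.norm(grad_naive - grad_vec, ord='fro'))  # should be ~0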