[ML] Machine Learning Techniques: Lecture 13, Deep Learning

ML: Learning the fundamental techniques
Package: scikit-learn
Course: Machine Learning Techniques (機器學習技法)
Summary: Lecture 13, Deep Learning

Comparison

Shallow NNet

Few hidden layers
More efficient to train
Simpler architecture
Powerful enough in theory

Deep NNet

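Very many hidden layers
Hard to train
Hard to decide the architecture
Very powerful; in theory it can fit whatever target you want
Layers are more meaningful

With many layers, each layer only needs to do something simple
Complex features are built up gradually from simple features
For example, in digit recognition: pixels -> simple strokes -> partial regions -> digits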

Deep NNet Challenges

How to decide the architecture

validation

Domain knowledge about the problem, e.g. using a convolutional NNet for images
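
The validation idea above can be sketched with scikit-learn. This is only a minimal sketch: the digits data set, the candidate list of `hidden_layer_sizes`, and the 75/25 split are my own assumptions for illustration, not something from the lecture.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Digits data: 8x8 images with pixel values 0..16, scaled to [0, 1].
X, y = load_digits(return_X_y=True)
X = X / 16.0
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Hypothetical candidate architectures, from shallow to deeper.
candidates = [(64,), (32, 32), (32, 16, 16)]

val_scores = {}
for hidden in candidates:
    net = MLPClassifier(hidden_layer_sizes=hidden, max_iter=1000,
                        random_state=0)
    net.fit(X_train, y_train)
    # Accuracy on held-out data is the validation score.
    val_scores[hidden] = net.score(X_val, y_val)

# Pick the architecture with the best validation accuracy.
best_architecture = max(val_scores, key=val_scores.get)
print(val_scores)
print("best:", best_architecture)
```

A single hold-out split is the simplest choice here; cross-validation would give a less noisy estimate at a higher training cost.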

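High model complexity

Usually not a problem, because there is usually enough data
regularization

dropout

Still works when some neurons fail

denoising

Still works when the input data is corrupted

Add constraints to the architecture, e.g. CNN (convolutional NNet)
weight elimination
early stopping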
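
To make the dropout item above concrete, here is a minimal NumPy sketch. The `dropout` helper and the rescaling by `1 / keep_prob` (often called inverted dropout) are my own assumptions for illustration; the lecture only states the idea that the network should still work when some neurons fail.

```python
import numpy as np


def dropout(activations, drop_prob, rng):
    """Randomly zero out neurons and rescale the survivors.

    Hypothetical helper: each activation is kept with probability
    1 - drop_prob, and kept values are divided by that probability so
    the expected activation stays unchanged.
    """
    keep_prob = 1.0 - drop_prob
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob


rng = np.random.default_rng(0)
hidden = np.ones((1000, 50))          # pretend hidden-layer outputs
dropped = dropout(hidden, drop_prob=0.5, rng=rng)

print("fraction of neurons zeroed:", np.mean(dropped == 0.0))
print("mean activation after dropout:", dropped.mean())
```

The mask is applied only during training; at prediction time all neurons are used.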

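Hard optimization

pre-training

Choose the initial weights carefully to avoid bad local minima

Heavy computation, especially when there is a lot of data

Better hardware and computing architectures, e.g. parallel mini-batch processing with GPUs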
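
The mini-batch idea mentioned above can be sketched on the CPU with NumPy. To keep the sketch short I assume a plain linear model with squared error instead of a deep NNet; the synthetic data, learning rate, and batch size are all made up for illustration. Each update touches only one small batch, which is the part that maps well onto GPUs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data (assumption for illustration only).
N, d = 1000, 3
X = rng.normal(size=(N, d))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=N)

w = np.zeros(d)
learning_rate, batch_size = 0.1, 32

for epoch in range(20):
    order = rng.permutation(N)            # reshuffle every epoch
    for start in range(0, N, batch_size):
        batch = order[start:start + batch_size]
        residual = X[batch] @ w - y[batch]
        # Gradient of the mean squared error on this mini-batch only.
        gradient = 2.0 * X[batch].T @ residual / len(batch)
        w -= learning_rate * gradient

print("learned weights:", w)
```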

Autoencoder

Can be used for pre-training
An unsupervised learning technique
A \(d-\widetilde{d}-d\ \mathrm{NNet}\) trained so that \(g_i(\mathbf{x})\approx x_i\)
\(\widetilde{d}<d\): compresses the data dimension
Approximates the identity function
error function \(\sum_{i=1}^{d}(g_i(\mathbf{x})-x_i)^2\)
\(w_{ij}^{(1)}\): encoding weights
\(w_{ji}^{(2)}\): decoding weights
Usually constrain \(w_{ij}^{(1)}=w_{ji}^{(2)}\) for regularization
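
The list above can be sketched as a small tied-weight autoencoder in NumPy. This is a sketch under my own assumptions, not the lecture's reference implementation: a tanh hidden layer, a linear output layer, no bias terms, standardized iris data, and plain gradient descent with a hand-picked learning rate. A single matrix `W` plays both roles, so \(w_{ij}^{(1)}=w_{ji}^{(2)}\) holds by construction.

```python
import numpy as np
from sklearn.datasets import load_iris

rng = np.random.default_rng(0)

X = load_iris().data
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each feature
N, d = X.shape
d_tilde = 2                                 # d_tilde < d: compress 4 -> 2

# One shared weight matrix: encoder uses W, decoder uses W transposed.
W = 0.1 * rng.normal(size=(d, d_tilde))


def forward(X, W):
    H = np.tanh(X @ W)      # encoding, shape (N, d_tilde)
    G = H @ W.T             # decoding, shape (N, d), should approximate X
    return H, G


def squared_error(X, W):
    _, G = forward(X, W)
    return np.sum((G - X) ** 2) / len(X)


learning_rate = 0.01
losses = [squared_error(X, W)]
for step in range(500):
    H, G = forward(X, W)
    R = 2.0 * (G - X) / N                        # dE/dG
    grad_decoder = R.T @ H                       # W used as decoding weights
    grad_encoder = X.T @ ((R @ W) * (1.0 - H ** 2))  # W used as encoding weights
    W -= learning_rate * (grad_decoder + grad_encoder)
    losses.append(squared_error(X, W))

print("error before training:", losses[0])
print("error after training:", losses[-1])
```

Because the weights are tied, the gradient for `W` is the sum of its contribution as encoder and its contribution as decoder.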

denoising autoencoder

Train an autoencoder on the data \(\left \{ (\widetilde{\mathbf{x}}_1,\mathbf{y}_1=\mathbf{x}_1),(\widetilde{\mathbf{x}}_2,\mathbf{y}_2=\mathbf{x}_2),\cdots,(\widetilde{\mathbf{x}}_N,\mathbf{y}_N=\mathbf{x}_N) \right \}\),
where \(\widetilde{\mathbf{x}}_n=\mathbf{x}_n+\mathrm{noise}\)
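
A minimal sketch of how this training set can be built and used with scikit-learn. The Gaussian noise, the `noise_scale` value, and the use of `MLPRegressor` as the autoencoder are my own assumptions; note that `MLPRegressor` does not tie the encoding and decoding weights.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

X = load_iris().data
X = (X - X.mean(axis=0)) / X.std(axis=0)    # clean, standardized inputs

# Corrupted inputs: x_tilde_n = x_n + noise (Gaussian noise is an assumption).
noise_scale = 0.3
X_noisy = X + noise_scale * rng.normal(size=X.shape)

# A 4-2-4 network trained to map noisy inputs back to the clean inputs.
denoiser = MLPRegressor(hidden_layer_sizes=(2,), activation='tanh',
                        solver='lbfgs', max_iter=2000, random_state=0)
denoiser.fit(X_noisy, X)                    # targets y_n = x_n

reconstruction = denoiser.predict(X_noisy)
mse = np.mean((reconstruction - X) ** 2)
print("mean squared reconstruction error:", mse)
```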

Principal Component Analysis

Linear Autoencoder or PCA
Red marks the difference between the two: PCA centers the data by its mean, while the linear autoencoder does not

Let \(\overline{\mathbf{x}}=\frac{1}{N}\sum_{n=1}^{N}\mathbf{x}_n\)
and update \(\mathbf{x}_n \leftarrow \mathbf{x}_n-\overline{\mathbf{x}}\)
Compute the \(\widetilde{d}\) top eigenvectors \(\mathbf{w}_1,\mathbf{w}_2,\cdots ,\mathbf{w}_{\widetilde{d}}\) of \(\mathbf{X}^T\mathbf{X}\)
Return the feature transform \(\Phi (\mathbf{x})=\mathbf{W}(\mathbf{x}-\color{Red}{\overline{\mathbf{x}}})\)
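
The steps above can be sketched directly in NumPy and compared against `sklearn.decomposition.PCA`. This is a sketch with my own choices: the iris data, \(\widetilde{d}=2\), and storing the eigenvectors as the rows of `W`. The sign of each eigenvector is arbitrary, so the two results are compared up to sign.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
d_tilde = 2

# Step 1: compute the mean and center the data.
x_bar = X.mean(axis=0)
X_centered = X - x_bar

# Step 2: top d_tilde eigenvectors of X^T X (eigh handles symmetric matrices).
eigenvalues, eigenvectors = np.linalg.eigh(X_centered.T @ X_centered)
top = np.argsort(eigenvalues)[::-1][:d_tilde]
W = eigenvectors[:, top].T        # rows are w_1, ..., w_d_tilde


# Step 3: the feature transform Phi(x) = W (x - x_bar).
def phi(x):
    return (x - x_bar) @ W.T


Z = phi(X)
Z_sklearn = PCA(n_components=d_tilde).fit_transform(X)

# Equal up to the sign of each component.
print(np.allclose(np.abs(Z), np.abs(Z_sklearn)))
```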

Code

PCA Demo

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sklearn import datasets
from sklearn.decomposition import PCA

# import some data to play with
iris = datasets.load_iris()
X = iris.data[:, :2]  # we only take the first two features.
y = iris.target

x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5

plt.figure(2, figsize=(8, 6))
plt.clf()

# Plot the original data
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Set1, edgecolor='k')
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')

plt.xlim(x_min, x_max)
plt.ylim(y_min, y_max)
plt.xticks(())
plt.yticks(())

fig = plt.figure(1, figsize=(8, 6))
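# Set up the 3D plot
# elev is the elevation angle above the xy-plane, here 45 degrees
# azim is the rotation angle about the z-axis, here 80 degrees
# (recent Matplotlib no longer attaches Axes3D(fig, ...) to the figure)
ax = fig.add_subplot(projection='3d')
ax.view_init(elev=45, azim=80)
# Fit PCA and return the dimension-reduced result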
X_reduced = PCA(n_components=3).fit_transform(iris.data)
ax.scatter(X_reduced[:, 0], X_reduced[:, 1], X_reduced[:, 2], c=y,
           cmap=plt.cm.Set1, edgecolor='k', s=40)
ax.set_title("First three PCA directions")
ax.set_xlabel("1st eigenvector")
ax.xaxis.set_ticklabels([])
ax.set_ylabel("2nd eigenvector")
ax.yaxis.set_ticklabels([])
ax.set_zlabel("3rd eigenvector")
ax.zaxis.set_ticklabels([])

plt.show()

References

PCA and linear autoencoders: a better proof
Neural networks [6.4] : Autoencoder - linear autoencoder
主成分分析與低秩矩陣近似 (Principal Component Analysis and Low-Rank Matrix Approximation)
Singular value decomposition
sklearn.decomposition.PCA

子風的知識庫
