Build Your Own Model Algorithm: Gradient Descent

01-25

After watching Dr. Ng's Machine Learning course on Coursera, I wrote the Gradient Descent algorithm in R.

Here is the implementation approach.

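1. Create a dataset X with five independent variables plus a constant term, a dataset for the dependent variable y, and a parameter vector $\theta$. The target function is then $y=\theta_{0}+\theta_{1} x_{1}+\theta_{2} x_{2}+\theta_{3} x_{3}+\theta_{4} x_{4}+\theta_{5} x_{5}$.
2. Create the cost function.
3. Use Gradient Descent to find the optimal parameter vector $\theta$.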
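For reference, the cost and update rule that the R code in this post implements can be written in matrix form as below. This is a sketch read off from the code itself: $m$ (the number of rows of $X$, here 800) and the superscript $(i)$ for the $i$-th row are notation introduced here for illustration. Note that the code omits the $1/m$ factor used in the course, so that factor is effectively absorbed into the step size $\alpha$.

$$
J(\theta)=\frac{1}{2}\sum_{i=1}^{m}\left(\theta^{T}x^{(i)}-y^{(i)}\right)^{2}=\frac{1}{2}\lVert X\theta-y\rVert^{2}
$$

$$
\nabla_{\theta}J(\theta)=X^{T}\left(X\theta-y\right)
$$

$$
\theta\leftarrow\theta-\alpha\,X^{T}\left(X\theta-y\right)
$$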

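    # Create a dataset with 5 independent variables and one dependent variable y
    set.seed(1)
    x <- matrix(rnorm(4000), ncol = 5)
    X <- cbind(rep(1, 800), x)  # X is an 800x6 matrix (first column is the constant term)
    y <- rnorm(800)             # y is an 800x1 column vector
    theta <- rep(0, 6)          # theta is a 6x1 column vector

    # Cost function: half the sum of squared residuals
    CostFun <- function(X, y, theta) {
      J <- sum((X %*% theta - y)^2) / 2
      return(J)
    }

    # Gradient descent: n_iter update steps with step size alpha
    GradDescent <- function(X, y, theta, alpha, n_iter) {
      hist <- rep(0, n_iter)
      for (i in seq_len(n_iter)) {
        theta <- theta - alpha * (t(X) %*% (X %*% theta - y))
        hist[i] <- CostFun(X, y, theta)
      }
      results <- list(theta, hist)  # final parameters and cost history
      return(results)
    }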

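t(X): transposes X so that it can be multiplied with the residual vector X%*%theta - y.

hist[i]: stores the cost computed by the cost function above after each update, which builds up the cost history.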
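One way to convince yourself that the transposed product really is the gradient of the cost function is to compare it with a finite-difference approximation. The snippet below is a minimal sketch of that check, not part of the original code: `AnalyticGrad`, `NumericGrad`, `theta_test` and the step `eps` are hypothetical names and values chosen for illustration.

```r
# Sketch: compare the analytic gradient t(X) %*% (X %*% theta - y)
# with a central finite-difference approximation of the cost function.
set.seed(1)
x <- matrix(rnorm(4000), ncol = 5)
X <- cbind(rep(1, 800), x)
y <- rnorm(800)

# Same cost as in the post: half the sum of squared residuals
CostFun <- function(X, y, theta) {
  sum((X %*% theta - y)^2) / 2
}

# Analytic gradient, as used in the update step (hypothetical helper name)
AnalyticGrad <- function(X, y, theta) {
  as.vector(t(X) %*% (X %*% theta - y))
}

# Hypothetical helper: central differences, one coordinate at a time
NumericGrad <- function(X, y, theta, eps = 1e-4) {
  sapply(seq_along(theta), function(j) {
    step <- rep(0, length(theta))
    step[j] <- eps
    (CostFun(X, y, theta + step) - CostFun(X, y, theta - step)) / (2 * eps)
  })
}

theta_test <- rnorm(6)  # arbitrary test point (assumption)
max_diff <- max(abs(AnalyticGrad(X, y, theta_test) -
                    NumericGrad(X, y, theta_test)))
print(max_diff)  # should be tiny relative to the gradient's magnitude
```

Because the cost is quadratic in theta, the central difference should match the analytic gradient up to floating-point rounding, so a large difference would point to a mistake in the formula.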

The next step is to plug in the parameters and fit the model.

    # Find the right fit
    alpha <- .00005
    i <- 200
    results <- GradDescent(X, y, theta, alpha, i)
    theta <- results[[1]]
    cost_hist <- results[[2]]
    print(theta)

    # Plot the cost history
    library(ggplot2)
    plotcosthis <- data.frame(1:i, results[[2]])
    names(plotcosthis) <- c("i", "cost_hist")
    ggplot(plotcosthis, aes(i, cost_hist)) +
      geom_line()

Printing theta gives:

                 [,1]
    [1,] -0.007200831
    [2,] -0.048239943
    [3,] -0.027426464
    [4,] -0.005848687
    [5,]  0.027033944
    [6,]  0.044072725

So the formula we obtain with the Gradient Descent algorithm is:

$y=-0.007200831-0.048239943 x_{1}-0.027426464 x_{2}-0.005848687 x_{3}+0.027033944 x_{4}+0.044072725 x_{5}$
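As a sanity check, the gradient descent estimate can be compared with the least-squares solution computed directly. The sketch below is an illustration rather than part of the original post: it rebuilds the same data, reruns a stripped-down gradient descent loop (without the cost history), and compares the result with the normal equations and with R's built-in `lm()`. The names `theta_gd`, `theta_ne` and `theta_lm` are introduced here for illustration.

```r
# Sketch: compare gradient descent with the closed-form least-squares fit.
set.seed(1)
x <- matrix(rnorm(4000), ncol = 5)
X <- cbind(rep(1, 800), x)
y <- rnorm(800)

# Stripped-down gradient descent (no cost history), same update as in the post
GradDescent <- function(X, y, theta, alpha, n_iter) {
  for (k in seq_len(n_iter)) {
    theta <- theta - alpha * (t(X) %*% (X %*% theta - y))
  }
  theta
}

theta_gd <- as.vector(GradDescent(X, y, rep(0, 6), alpha = 0.00005, n_iter = 200))

# Normal equations: solve t(X) %*% X %*% theta = t(X) %*% y
theta_ne <- as.vector(solve(crossprod(X), crossprod(X, y)))

# Same fit via lm(); "- 1" drops lm's own intercept because X already
# contains a column of ones
theta_lm <- as.vector(coef(lm(y ~ X - 1)))

print(cbind(gd = theta_gd, normal_eq = theta_ne, lm = theta_lm))
print(max(abs(theta_gd - theta_ne)))  # gap left after 200 iterations
```

If the step size is small enough for the iteration to be stable and the number of iterations is large enough, the three columns should agree closely; a visible gap would suggest running more iterations or adjusting alpha.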