Machine Learning Notes 16: Programming Exercise 5, Evaluating a Linear Regression Algorithm





function [J, grad] = linearRegCostFunction(X, y, theta, lambda)
%LINEARREGCOSTFUNCTION Compute cost and gradient for regularized linear
%regression with multiple variables
%   [J, grad] = LINEARREGCOSTFUNCTION(X, y, theta, lambda) computes the
%   cost of using theta as the parameter for linear regression to fit the
%   data points in X and y. Returns the cost in J and the gradient in grad

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost and gradient of regularized linear
%               regression for a particular choice of theta.
%
%               You should set J to the cost and grad to the gradient.
%
h = X * theta;  % predictions for all m examples (m x 1)

% Squared-error term plus the L2 penalty; the bias term theta(1) is not penalized
J = (h - y)' * (h - y) / (2*m) + (lambda/(2*m)) * sum(theta(2:end).^2);

% Gradient: the bias component has no regularization term
grad(1) = (X(:, 1)' * (h - y)) / m;
grad(2:end) = (X(:, 2:end)' * (h - y)) / m + (lambda/m) * theta(2:end);

% =========================================================================

grad = grad(:);

end
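
For reference, the code above appears to implement the usual regularized linear regression cost and gradient. The notation here is an assumption, since the source shows only the code: $\theta_0$ stands for `theta(1)`, $n$ is the number of features, and $h_\theta(x) = \theta^T x$.

```latex
J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2
          + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2

\frac{\partial J}{\partial \theta_0}
  = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}

\frac{\partial J}{\partial \theta_j}
  = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}
  + \frac{\lambda}{m}\theta_j \qquad (j \ge 1)
```

The penalty sum starts at $j = 1$, which is why the code indexes `theta(2:end)` and treats `grad(1)` separately.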


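Because trainLinearReg.m is already provided, we can compute theta and from it plot the hypothesis curve.

Here the program sets $\lambda = 0$. The plot shows that the fitted straight line underfits the data.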


function [error_train, error_val] = ...
    learningCurve(X, y, Xval, yval, lambda)
%LEARNINGCURVE Generates the train and cross validation set errors needed
%to plot a learning curve
%   [error_train, error_val] = ...
%       LEARNINGCURVE(X, y, Xval, yval, lambda) returns the train and
%       cross validation set errors for a learning curve. In particular,
%       it returns two vectors of the same length - error_train and
%       error_val. Then, error_train(i) contains the training error for
%       i examples (and similarly for error_val(i)).
%
%   In this function, you will compute the train and test errors for
%   dataset sizes from 1 up to m. In practice, when working with larger
%   datasets, you might want to do this in larger intervals.
%

% Number of training examples
m = size(X, 1);

% You need to return these values correctly
error_train = zeros(m, 1);
error_val   = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Fill in this function to return training errors in
%               error_train and the cross validation errors in error_val.
%               i.e., error_train(i) and
%               error_val(i) should give you the errors
%               obtained after training on i examples.
%
% Note: You should evaluate the training error on the first i training
%       examples (i.e., X(1:i, :) and y(1:i)).
%
%       For the cross-validation error, you should instead evaluate on
%       the _entire_ cross validation set (Xval and yval).
%
% Note: If you are using your cost function (linearRegCostFunction)
%       to compute the training and cross validation error, you should
%       call the function with the lambda argument set to 0.
%       Do note that you will still need to use lambda when running
%       the training to obtain the theta parameters.
%
% Hint: You can loop over the examples with the following:
%
%       for i = 1:m
%           % Compute train/cross validation errors using training examples
%           % X(1:i, :) and y(1:i), storing the result in
%           % error_train(i) and error_val(i)
%           ....
%
%       end
%

% ---------------------- Sample Solution ----------------------

for i = 1:m
    % Train on the first i examples, using the requested lambda
    theta = trainLinearReg(X(1:i,:), y(1:i), lambda);
    % Report both errors without the penalty term (lambda = 0)
    error_train(i) = linearRegCostFunction(X(1:i,:), y(1:i), theta, 0);
    error_val(i) = linearRegCostFunction(Xval, yval, theta, 0);
end

% -------------------------------------------------------------

% =========================================================================

end
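
To see the shape of a learning curve without the exercise files, here is a minimal self-contained sketch. It is not the exercise's code: the toy data is hypothetical (a quadratic target fitted with a straight line), and `pinv` least squares stands in for `trainLinearReg` with lambda = 0.

```octave
% Hypothetical toy data: y is quadratic in x, but we fit a straight line.
x    = (1:8)';
y    = x .^ 2;
xval = (1.5:1:8.5)';
yval = xval .^ 2;

X    = [ones(numel(x), 1), x];        % add the intercept column
Xval = [ones(numel(xval), 1), xval];

m = size(X, 1);
error_train = zeros(m, 1);
error_val   = zeros(m, 1);

for i = 1:m
  % Least-squares fit on the first i examples (assumption: replaces
  % trainLinearReg with lambda = 0). pinv also handles the i = 1 case.
  theta = pinv(X(1:i, :)) * y(1:i);
  % Training error on the first i examples only
  error_train(i) = sum((X(1:i, :) * theta - y(1:i)) .^ 2) / (2 * i);
  % Validation error on the entire validation set
  error_val(i) = sum((Xval * theta - yval) .^ 2) / (2 * numel(yval));
end

disp([error_train, error_val]);
```

With one or two examples a line passes through every training point, so the training error is zero; from three examples on it becomes positive because a line cannot match the quadratic.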





function [X_poly] = polyFeatures(X, p)
%POLYFEATURES Maps X (1D vector) into the p-th power
%   [X_poly] = POLYFEATURES(X, p) takes a data matrix X (size m x 1) and
%   maps each example into its polynomial features where
%   X_poly(i, :) = [X(i) X(i).^2 X(i).^3 ...  X(i).^p];
%

% You need to return the following variables correctly.
X_poly = zeros(numel(X), p);

% ====================== YOUR CODE HERE ======================
% Instructions: Given a vector X, return a matrix X_poly where the p-th
%               column of X contains the values of X to the p-th power.
%
%
for i = 1:p
    X_poly(:, i) = X .^ i;  % column i holds X raised to the i-th power
end

% =========================================================================

end
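
As a quick check of what the mapping produces, the sketch below builds the same matrix for a small hypothetical input without a loop. It uses `bsxfun(@power, ...)`, which is available in both Octave and MATLAB; this is an alternative formulation for illustration, not the exercise's required solution.

```octave
% Hypothetical input: three examples, mapped up to the 3rd power.
X = [1; 2; 3];
p = 3;

% Raise the m x 1 column X to each exponent in the 1 x p row 1:p,
% giving an m x p matrix whose i-th column is X .^ i.
X_poly = bsxfun(@power, X, 1:p);

disp(X_poly);
```

Each row is `[X(i) X(i)^2 X(i)^3]`, so the row for 2 is `[2 4 8]` and the row for 3 is `[3 9 27]`.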


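This plot shows the learning curves for $J_{train}(\theta)$ and $J_{cv}(\theta)$ (the blue curve, $J_{train}(\theta)$, stays at 0). From it we can see that the underfitting problem is solved, but the model now appears to overfit somewhat.

Since the model is overfitting and we have used $\lambda = 0$ so far, the natural next step is to apply regularization. Open validationCurve.m: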

function [lambda_vec, error_train, error_val] = ...
    validationCurve(X, y, Xval, yval)
%VALIDATIONCURVE Generate the train and validation errors needed to
%plot a validation curve that we can use to select lambda
%   [lambda_vec, error_train, error_val] = ...
%       VALIDATIONCURVE(X, y, Xval, yval) returns the train
%       and validation errors (in error_train, error_val)
%       for different values of lambda. You are given the training set (X,
%       y) and validation set (Xval, yval).
%

% Selected values of lambda (you should not change this)
lambda_vec = [0 0.001 0.003 0.01 0.03 0.1 0.3 1 3 10];

% You need to return these variables correctly.
error_train = zeros(length(lambda_vec), 1);
error_val = zeros(length(lambda_vec), 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Fill in this function to return training errors in
%               error_train and the validation errors in error_val. The
%               vector lambda_vec contains the different lambda parameters
%               to use for each calculation of the errors, i.e,
%               error_train(i), and error_val(i) should give
%               you the errors obtained after training with
%               lambda = lambda_vec(i)
%
% Note: You can loop over lambda_vec with the following:
%
%       for i = 1:length(lambda_vec)
%           lambda = lambda_vec(i);
%           % Compute train / val errors when training linear
%           % regression with regularization parameter lambda
%           % You should store the result in error_train(i)
%           % and error_val(i)
%           ....
%
%       end
%
%
for i = 1:length(lambda_vec)
    lambda = lambda_vec(i);
    % Train with the current lambda
    theta = trainLinearReg(X, y, lambda);
    % Report both errors without the penalty term (lambda = 0)
    error_train(i) = linearRegCostFunction(X, y, theta, 0);
    error_val(i) = linearRegCostFunction(Xval, yval, theta, 0);
end
% =========================================================================

end

Finally, plotting $J_{train}(\theta)$ and $J_{cv}(\theta)$ as $\lambda$ increases confirms the relationship we learned earlier between $\lambda$ and overfitting or underfitting.
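
One common way to use such a validation curve is to pick the lambda with the lowest validation error. The following is a minimal self-contained sketch under stated assumptions: the toy data is hypothetical and not the exercise's dataset, the degree `p = 8` is chosen only for illustration, and the closed-form regularized normal equation stands in for `trainLinearReg`.

```octave
% Hypothetical toy data: a sine curve with a deterministic perturbation.
x    = linspace(-1, 1, 12)';
y    = sin(3 * x) + 0.3 * ((-1) .^ (1:12))';
xval = linspace(-0.9, 0.9, 10)';
yval = sin(3 * xval);

p    = 8;                                   % polynomial degree (illustrative)
X    = [ones(numel(x), 1),    bsxfun(@power, x,    1:p)];
Xval = [ones(numel(xval), 1), bsxfun(@power, xval, 1:p)];

lambda_vec  = [0 0.001 0.003 0.01 0.03 0.1 0.3 1 3 10];
error_train = zeros(numel(lambda_vec), 1);
error_val   = zeros(numel(lambda_vec), 1);

L = eye(p + 1);
L(1, 1) = 0;                                % do not regularize the bias term

for i = 1:numel(lambda_vec)
  lambda = lambda_vec(i);
  % Closed-form minimizer of the regularized cost
  % (assumption: replaces the iterative trainLinearReg).
  theta = (X' * X + lambda * L) \ (X' * y);
  % Errors are reported without the penalty term.
  error_train(i) = sum((X * theta - y) .^ 2) / (2 * numel(y));
  error_val(i)   = sum((Xval * theta - yval) .^ 2) / (2 * numel(yval));
end

% Choose the lambda whose model has the lowest validation error.
[~, idx] = min(error_val);
best_lambda = lambda_vec(idx);
fprintf('lambda with lowest validation error: %g\n', best_lambda);
```

In this setup the training error does not decrease as lambda grows, since a larger penalty pulls theta away from the unregularized least-squares fit; the validation error is what indicates where the trade-off is best.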




