How do I compute the Euclidean distance from each sample point to the hyperplane in libsvm?

01-22

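After classifying with libsvm I can obtain the support vectors, but in the high-dimensional case I don't know how to represent the hyperplane. Could I instead compute the Euclidean distance from a sample point to each support vector and take the average as its distance to the hyperplane? I'm not sure whether that is valid, so I'd appreciate any expert advice.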

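First off, I'm still feeling my way around libsvm myself. For the distance from a point to the hyperplane, here is what I did; corrections welcome~ [bows]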

PS: Environment: win7 + matlab R2014a + libsvm-3.21

--------------------------------------------- divider -----------------------------------------------------

First, check the official documentation --> LIBSVM FAQ

It gives the following information:

The distance is |decision_value| / |w|.

We have |w|^2 = w^Tw = alpha^T Q alpha = 2*(dual_obj + sum alpha_i).

Thus in svm.cpp please find the place where we calculate the dual objective value (i.e., the subroutine Solve()) and add a statement to print w^Tw.

More precisely, here is what you need to do:

a) Search for "calculate objective value" in svm.cpp

b) In that place, si->obj is the variable for the objective value

c) Add a for loop to calculate the sum of alpha

d) Calculate 2*(si->obj + sum of alpha) and print the square root of it. You now get |w|. You need to recompile the code

e) Check an earlier FAQ on printing decision values. You need to recompile the code

f) Then print decision value divided by the |w| value obtained earlier.

This gives two formulas:

|w|^2=2*(dual_obj + sum alpha_i).

distance = |decision_value| / |w|.
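To make the arithmetic concrete, here is a minimal Python sketch of the two formulas. The function names and the toy numbers are my own assumptions for illustration, not anything from libsvm; in practice dual_obj and the alpha values would be whatever the modified svm.cpp prints.

```python
import math


def w_norm_from_dual(dual_obj, alphas):
    """Return |w| using |w|^2 = 2 * (dual_obj + sum(alpha_i))."""
    w_squared = 2.0 * (dual_obj + sum(alphas))
    if w_squared <= 0.0:
        raise ValueError("w^T w must be positive; check dual_obj and alphas")
    return math.sqrt(w_squared)


def distance_to_hyperplane(decision_value, w_norm):
    """Return distance = |decision_value| / |w|."""
    return abs(decision_value) / w_norm


# Toy values (assumed): two support vectors with alpha = 0.5 each and a
# dual objective of -0.5, so |w|^2 = 2 * (-0.5 + 1.0) = 1.0.
toy_w_norm = w_norm_from_dual(-0.5, [0.5, 0.5])
toy_distance = distance_to_hyperplane(-3.0, toy_w_norm)
print(toy_w_norm, toy_distance)
```

The sign of the decision value only says which side of the hyperplane the point is on, which is why the absolute value is taken.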

Below is a record of the process: [Friendly reminder: back up svm.cpp before modifying it ^_^]

1) In svm.cpp (reference path: C:\Program Files\MATLAB\R2014a\toolbox\libsvm-3.21), search for "calculate objective value":

si->obj is the variable for the objective value

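2) Use a for loop to compute the sum of the alpha values, sum_alpha:

3) Compute 2*(si->obj + sum of alpha) and print it (take the square root to get |w|, as in step d of the FAQ).

[Check whether the header math.h is included and add it if it isn't; I can't remember whether I added it by hand...]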

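Now that svm.cpp has been modified, we need to recompile.

Reference path: C:\Program Files\MATLAB\R2014a\toolbox\libsvm-3.21\matlab

Open the path above in MATLAB and run make; the four newly generated mexw64 files are the updated ones.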

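4) decision_value needs no explanation: it is produced when you run predict.

5) All the values the formulas need are now available. What remains is to compute according to the formulas~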
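As a cross-check that needs no recompilation, the identity |w|^2 = alpha^T Q alpha from the FAQ can also be evaluated directly from the support vectors. The sketch below assumes dense feature lists, assumes the coefficients are the signed values alpha_i * y_i (which is how libsvm stores sv_coef), and uses helper names I made up; treat it as an illustration rather than a tested replacement for the steps above.

```python
import math


def linear_kernel(u, v):
    """K(u, v) = u . v for dense feature lists."""
    return sum(a * b for a, b in zip(u, v))


def rbf_kernel(u, v, gamma):
    """K(u, v) = exp(-gamma * ||u - v||^2) for dense feature lists."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def w_norm_from_kernel(support_vectors, sv_coef, kernel):
    """Return |w| = sqrt(sum_i sum_j c_i * c_j * K(sv_i, sv_j)).

    sv_coef holds c_i = alpha_i * y_i, so the double sum equals alpha^T Q alpha.
    """
    total = 0.0
    for ci, xi in zip(sv_coef, support_vectors):
        for cj, xj in zip(sv_coef, support_vectors):
            total += ci * cj * kernel(xi, xj)
    return math.sqrt(total)


# Toy two-class problem (assumed): support vectors (1, 0) and (-1, 0)
# with signed coefficients +0.5 and -0.5, i.e. w = (1, 0).
toy_svs = [[1.0, 0.0], [-1.0, 0.0]]
toy_coef = [0.5, -0.5]
toy_norm = w_norm_from_kernel(toy_svs, toy_coef, linear_kernel)
print(toy_norm)
```

For a non-linear kernel, pass something like `lambda u, v: rbf_kernel(u, v, gamma)` with the same gamma used in training. The double loop is quadratic in the number of support vectors, so this is only meant as a sanity check on the printed |w|.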

[If anything is wrong, corrections welcome~]

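import numpy

def getW(model):
    """
    The argument is the trained model from above. Following the libsvm FAQ,
    w = (model.SVs)' * model.sv_coef (linear kernel, two classes);
    once the vector w is known, |w| is easy to compute.
    """
    # get_SV() returns one sparse dict per support vector, e.g. {1: x, 2: y, 3: z, ...}
    sv_dict = model.get_SV()
    # Coefficients of the support vectors (alpha_i * y_i)
    sv_coef = [float(x[0]) for x in model.get_sv_coef()]

    # Highest feature index over ALL support vectors = dimensionality
    # (the first support vector alone may omit trailing features)
    max_key = max(max(d.keys(), default=0) for d in sv_dict)
    sv_coef_nd = numpy.array(sv_coef)

    sv_nd = []
    for d in sv_dict:
        # Sparse storage: indices missing from the dict are zeros
        sv_nd.append([d.get(i, 0) for i in range(1, max_key + 1)])
    sv_nd = numpy.array(sv_nd)

    w = numpy.dot(sv_nd.T, sv_coef_nd)
    # The FAQ checks model.Label(1) in MATLAB, which is the first label (index 0 in Python)
    if model.get_labels()[0] == -1:
        w = -w
    return w

------------------------ divider ----------------------------------

The above is a Python version of computing the normal vector w; dist = |decision_val| / |w|. I wrote this myself based on the FAQ, so corrections are welcome if anything is wrong.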
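Once w is available, the last step is just a norm and a division. A small sketch in plain Python follows; the function name and sample numbers are my own, and the decision values are assumed to be the per-sample values that predict gives you for a two-class model.

```python
import math


def distances_from_w(w, decision_values):
    """Return |decision_value| / |w| for each decision value."""
    w_norm = math.sqrt(sum(c * c for c in w))
    if w_norm == 0.0:
        raise ValueError("w is the zero vector; the hyperplane is undefined")
    return [abs(d) / w_norm for d in decision_values]


# Toy numbers (assumed): w = (3, 4) has norm 5.
toy_dists = distances_from_w([3.0, 4.0], [10.0, -2.5, 0.0])
print(toy_dists)
```

A numpy array returned by getW can be passed in directly, since the function only iterates over the components of w.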

Hello! Why is the wvalues.txt file I printed out empty? There is nothing in it at all.