One-hot encoding in sklearn

07-17

OneHotEncoder in sklearn.preprocessing takes a column vector of shape (None, 1), treats each entry as an index, and encodes it as a one-hot row vector.

import numpy as np
from sklearn.preprocessing import OneHotEncoder

Convert the row vector to a column vector:

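# List of labels given as non-negative integers
labels = [0, 1, 0, 2]
# Convert the row vector to a column vector of shape (len(labels), 1)
labels = np.array(labels).reshape(len(labels), -1)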

One-hot encoding:

enc = OneHotEncoder()
enc.fit(labels)
targets = enc.transform(labels).toarray()

Encoded result:

array([[ 1., 0., 0.],
       [ 0., 1., 0.],
       [ 1., 0., 0.],
       [ 0., 0., 1.]])

Each row of the result is a one-hot row vector: the component at position index is 1 and all the others are 0.
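The steps above can be collected into one runnable script. This is only a sketch: the use of fit_transform, categories_ and inverse_transform, the variable names recovered and expected, and the NumPy cross-check are additions that the original steps do not contain. It assumes a recent scikit-learn, where the output columns follow the sorted distinct labels, so "column index equals label" holds here only because the labels are 0, 1, 2 with no gaps.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Labels as non-negative integers, reshaped into a column vector of shape (4, 1)
labels = np.array([0, 1, 0, 2]).reshape(-1, 1)

enc = OneHotEncoder()
# fit_transform fits and encodes in one call; the result is sparse by default,
# so toarray() converts it to a dense NumPy array
targets = enc.fit_transform(labels).toarray()
print(targets)

# categories_[0] lists which label each output column stands for
print(enc.categories_[0])

# inverse_transform maps the one-hot rows back to the original labels
recovered = enc.inverse_transform(targets)
print(recovered.ravel())

# Cross-check with plain NumPy (an assumption of this sketch, not part of the
# original steps): row i of the identity matrix is the one-hot vector for index i
expected = np.eye(3)[labels.ravel()]
print(np.array_equal(targets, expected))
```

If the labels had gaps (for example 0, 1, 5), the encoder would still produce three columns, one per distinct label, and categories_ is the place to look up which column belongs to which label.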