Market Basket Association Analysis: Mining with the Apriori Algorithm in R
> summary(Groceries)
transactions as itemMatrix in sparse format with
 9835 rows (elements/itemsets/transactions) and
 169 columns (items) and a density of 0.02609146

most frequent items:
      whole milk other vegetables       rolls/buns             soda           yogurt          (Other)
            2513             1903             1809             1715             1372            34055

element (itemset/transaction) length distribution:
sizes
   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17
2159 1643 1299 1005  855  645  545  438  350  246  182  117   78   77   55   46   29
  18   19   20   21   22   23   24   26   27   28   29   32
  14   14    9   11    4    6    1    1    1    1    3    1

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.000   2.000   3.000   4.409   6.000  32.000

includes extended item information - examples:
       labels  level2           level1
1 frankfurter sausage meet and sausage
2     sausage sausage meet and sausage
3  liver loaf sausage meet and sausage

> inspect(Groceries[1:5])
  items
1 {citrus fruit, semi-finished bread, margarine, ready soups}
2 {tropical fruit, yogurt, coffee}
3 {whole milk}
4 {pip fruit, yogurt, cream cheese , meat spreads}
5 {other vegetables, whole milk, condensed milk, long life bakery product}
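Note that "Groceries" is a built-in dataset that is already stored as transactions, so it needs no conversion. When you use the arules package on real data, you must first convert that data into the transaction format the package recognizes.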
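As a minimal sketch of that conversion, assuming the arules package is installed and that the raw data is already available as one vector of item names per basket, a list can be coerced to the transactions class. The `baskets` list and its contents are hypothetical and serve only as an illustration.

```r
library(arules)  # assumes the arules package is installed

# Hypothetical purchase records: one character vector of item names per basket.
baskets <- list(
  c("whole milk", "yogurt"),
  c("whole milk", "rolls/buns", "soda"),
  c("yogurt", "coffee")
)

# Coerce the list to the "transactions" class that apriori() works on.
trans <- as(baskets, "transactions")

summary(trans)   # same kind of overview as summary(Groceries)
inspect(trans)   # list the individual baskets
```

For data stored in a file, with one basket per line, the package's read.transactions() function is the usual starting point. The right format and separator arguments depend on how the file is laid out.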
The complete code is shown below:
setwd("E:/Rwd_All")
library(arules)            # load the arules package
data(Groceries)            # the Groceries dataset
summary(Groceries)
length(Groceries)
inspect(Groceries[1:20])   # view the first 20 purchase records

# Try different parameter values (support, confidence)
rul=apriori(Groceries,parameter=list(support=0.005, confidence=0.65))
rul
inspect(rul)                   # inspect the association rules
rul_lift=sort(rul,by="lift")   # can also sort by support or confidence
inspect(rul_lift)

## Application: as a store manager who wants to promote chocolate,
## what should we do (bundled promotion)?
rul1=apriori(Groceries,parameter=list(support=0.002,confidence=0.2,maxlen=3),
             appearance=list(rhs="chocolate",default="lhs"))   # at most 3 items per rule
rul1
inspect(rul1)   # choose a bundle that suits the actual situation, e.g. {other vegetables, candy, chocolate}
Partial output:
> rul
set of 3 rules
> inspect(rul)   # inspect the association rules
  lhs                                               rhs          support     confidence lift
1 {butter, whipped/sour cream}                   => {whole milk} 0.006710727 0.660      2.583008
2 {pip fruit, root vegetables, other vegetables} => {whole milk} 0.005490595 0.675      2.641713
3 {tropical fruit, root vegetables, yogurt}      => {whole milk} 0.005693950 0.700      2.739554
> rul_lift=sort(rul,by="lift")   # can also sort by support or confidence
> inspect(rul_lift)
  lhs                                               rhs          support     confidence lift
1 {tropical fruit, root vegetables, yogurt}      => {whole milk} 0.005693950 0.700      2.739554
2 {pip fruit, root vegetables, other vegetables} => {whole milk} 0.005490595 0.675      2.641713
3 {butter, whipped/sour cream}                   => {whole milk} 0.006710727 0.660      2.583008
> rul1
set of 5 rules
> inspect(rul1)   # choose a bundle that suits the actual situation, e.g. {other vegetables, candy, chocolate}
  lhs                                             rhs         support     confidence lift
1 {other vegetables, candy}                    => {chocolate} 0.002033554 0.2941176  5.927555
2 {rolls/buns, waffles}                        => {chocolate} 0.002033554 0.2222222  4.478597
3 {other vegetables, long life bakery product} => {chocolate} 0.002338587 0.2190476  4.414617
4 {whole milk, long life bakery product}       => {chocolate} 0.003050330 0.2255639  4.545945
5 {butter, fruit/vegetable juice}              => {chocolate} 0.002033554 0.2531646  5.102200
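The lift column can be cross-checked by hand, because lift is the rule's confidence divided by the support of its right-hand side. The sketch below uses only base R and figures copied from the output in this post: 9835 transactions, 2513 of which contain whole milk. The variable names are illustrative only.

```r
# Figures copied from the summary(Groceries) output.
n_transactions <- 9835
n_whole_milk   <- 2513

# Support of the right-hand side {whole milk}.
support_rhs <- n_whole_milk / n_transactions

# Confidence of the three {whole milk} rules, as printed by inspect(rul).
confidence <- c(0.660, 0.675, 0.700)

# lift = confidence / support(rhs)
lift <- confidence / support_rhs
print(round(lift, 6))
```

The rounded values should agree with the lift column reported for the three rules. A lift well above 1, as here, means the left-hand-side items appear together with whole milk more often than they would if the purchases were independent.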