Association Rules

1 Data

Here we use the BASKETS.txt data file and mine association rules between products from customers' purchase records.

> shop=read.table("D:/Desktop/BASKETS.txt",header = TRUE,sep = ",")

2 The Apriori Algorithm

> install.packages("arules")
> install.packages("arulesViz")
> library(arules)
> library(arulesViz)
> shopData = as(shop, "transactions")
Warning message:
Column(s) 1, 2, 3, 4, 5, 6, 7 not logical or factor. Applying default discretization (see '? discretizeDF').
> fit_apriori=apriori(shopData,parameter = list(support = 0.1,confidence = 0.8,minlen = 1))
> fit_apriori
set of 53 rules 
> inspectDT(sort(fit_apriori,by = "lift"))

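Note: some rule consequents take the form sex=M, which is clearly not what we want. Besides inspectDT(), which shows the rules in an interactive table, you can also print the rules directly to the console with inspect().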
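The rules above are ranked by lift and thresholded on support and confidence, so it may help to see how these measures are defined. The following is a minimal base R sketch on a made-up basket matrix; the data and the helper name `rule_measures` are assumptions for illustration and are not part of arules.

```r
# Toy basket matrix (hypothetical data): rows are transactions, columns are items.
baskets <- matrix(
  c(TRUE,  TRUE,  FALSE,
    TRUE,  TRUE,  TRUE,
    FALSE, TRUE,  FALSE,
    TRUE,  FALSE, TRUE,
    FALSE, FALSE, TRUE),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("beer", "cannedveg", "fish"))
)

# Hypothetical helper: measures for the rule lhs => rhs.
rule_measures <- function(m, lhs, rhs) {
  # A transaction matches a side when it contains every item on that side.
  has_lhs <- rowSums(m[, lhs, drop = FALSE]) == length(lhs)
  has_rhs <- rowSums(m[, rhs, drop = FALSE]) == length(rhs)
  support    <- mean(has_lhs & has_rhs)   # share of baskets with lhs and rhs
  coverage   <- mean(has_lhs)             # share of baskets with lhs
  confidence <- support / coverage        # P(rhs | lhs)
  lift       <- confidence / mean(has_rhs) # confidence relative to P(rhs)
  c(support = support, confidence = confidence,
    coverage = coverage, lift = lift)
}

res <- rule_measures(baskets, lhs = "beer", rhs = "cannedveg")
print(round(res, 3))
```

A lift above 1 means the consequent is more frequent among baskets that match the left-hand side than in the data overall, which is why the later filters keep only rules with a high lift.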

3 Filtering Rules

> itemLabels(shopData)
 [1] "cardid=[1.02e+04,4.22e+04)" "cardid=[4.22e+04,7.86e+04)"
 [3] "cardid=[7.86e+04,1.1e+05]"  "value=[10,22.6)"           
 [5] "value=[22.6,36.1)"          "value=[36.1,49.9]"         
 [7] "pmethod=CARD"               "pmethod=CASH"              
 [9] "pmethod=CHEQUE"             "sex=F"                     
[11] "sex=M"                      "homeown=NO"                
[13] "homeown=YES"                "income=[1.02e+04,1.68e+04)"
[15] "income=[1.68e+04,2.36e+04)" "income=[2.36e+04,3e+04]"   
[17] "age=[16,26)"                "age=[26,39)"               
[19] "age=[39,50]"                "fruitveg"                  
[21] "freshmeat"                  "dairy"                     
[23] "cannedveg"                  "cannedmeat"                
[25] "frozenmeal"                 "beer"                      
[27] "wine"                       "softdrink"                 
[29] "fish"                       "confectionery" 

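Note: items 1 to 9 (cardid, value, pmethod) should not take part in association rules; items 10 to 19 (sex, homeown, income, age) may appear only on the left-hand side of a rule.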
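One way to act on this earlier in the pipeline is to prepare the data frame before it is converted to transactions: drop the columns that should never appear in a rule, and bin the numeric columns explicitly rather than relying on the default discretization mentioned in the warning. The sketch below uses base R only, on a small made-up data frame (`shop_toy` and its values are hypothetical; the age breaks mirror the item labels printed above). Note that dropping columns changes the item positions, so index ranges such as 20:30 would no longer apply as written.

```r
# Toy stand-in for the basket data (hypothetical values, similar column types).
shop_toy <- data.frame(
  cardid  = c(39808, 67362, 10872, 26748),
  value   = c(42.7, 25.4, 20.6, 23.7),
  pmethod = c("CHEQUE", "CASH", "CASH", "CARD"),
  sex     = c("M", "F", "M", "F"),
  age     = c(46, 28, 36, 26),
  beer    = c(FALSE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Drop the identifier, amount, and payment columns.
keep <- setdiff(names(shop_toy), c("cardid", "value", "pmethod"))
prepared <- shop_toy[, keep, drop = FALSE]

# Make the categorical column an explicit factor.
prepared$sex <- factor(prepared$sex)

# Bin age by hand into [16,26), [26,39), [39,50].
prepared$age <- cut(prepared$age, breaks = c(16, 26, 39, 50),
                    right = FALSE, include.lowest = TRUE)

str(prepared)
# With arules loaded, as(prepared, "transactions") would be the next step.
```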

> rules_sub <- subset(fit_apriori, subset = rhs %in% itemLabels(shopData)[20:30] & lhs %in% itemLabels(shopData)[10:30] & lift>3)
> inspect(rules_sub)
     lhs                              rhs          support confidence coverage     lift count
[1]  {homeown=NO,                                                                            
      age=[16,26),                                                                           
      fish}                        => {fruitveg}     0.111  0.9098361    0.122 3.042930   111
[2]  {homeown=NO,                                                                            
      age=[16,26),                                                                           
      fruitveg}                    => {fish}         0.111  0.9568966    0.116 3.277043   111
[3]  {income=[1.02e+04,1.68e+04),                                                            
      frozenmeal,                                                                            
      beer}                        => {cannedveg}    0.138  0.9718310    0.142 3.207363   138
[4]  {income=[1.02e+04,1.68e+04),                                                            
      cannedveg,                                                                             
      beer}                        => {frozenmeal}   0.138  0.9787234    0.141 3.240806   138
[5]  {income=[1.02e+04,1.68e+04),                                                            
      cannedveg,                                                                             
      frozenmeal}                  => {beer}         0.138  0.9387755    0.147 3.204012   138
[6]  {sex=M,                                                                                 
      frozenmeal,                                                                            
      beer}                        => {cannedveg}    0.141  0.9527027    0.148 3.144233   141
[7]  {sex=M,                                                                                 
      cannedveg,                                                                             
      beer}                        => {frozenmeal}   0.141  0.9400000    0.150 3.112583   141
[8]  {sex=M,                                                                                 
      cannedveg,                                                                             
      frozenmeal}                  => {beer}         0.141  0.9276316    0.152 3.165978   141
[9]  {sex=M,                                                                                 
      income=[1.02e+04,1.68e+04),                                                            
      beer}                        => {frozenmeal}   0.136  0.9714286    0.140 3.216651   136
[10] {sex=M,                                                                                 
      income=[1.02e+04,1.68e+04),                                                            
      frozenmeal}                  => {beer}         0.136  0.9510490    0.143 3.245901   136
[11] {sex=M,                                                                                 
      income=[1.02e+04,1.68e+04),                                                            
      beer}                        => {cannedveg}    0.136  0.9714286    0.140 3.206035   136
[12] {sex=M,                                                                                 
      income=[1.02e+04,1.68e+04),                                                            
      cannedveg}                   => {beer}         0.136  0.9714286    0.140 3.315456   136
[13] {sex=M,                                                                                 
      income=[1.02e+04,1.68e+04),                                                            
      frozenmeal}                  => {cannedveg}    0.137  0.9580420    0.143 3.161855   137
[14] {sex=M,                                                                                 
      income=[1.02e+04,1.68e+04),                                                            
      cannedveg}                   => {frozenmeal}   0.137  0.9785714    0.140 3.240303   137
[15] {sex=M,                                                                                 
      income=[1.02e+04,1.68e+04),                                                            
      frozenmeal,                                                                            
      beer}                        => {cannedveg}    0.136  1.0000000    0.136 3.300330   136
[16] {sex=M,                                                                                 
      income=[1.02e+04,1.68e+04),                                                            
      cannedveg,                                                                             
      beer}                        => {frozenmeal}   0.136  1.0000000    0.136 3.311258   136
[17] {sex=M,                                                                                 
      income=[1.02e+04,1.68e+04),                                                            
      cannedveg,                                                                             
      frozenmeal}                  => {beer}         0.136  0.9927007    0.137 3.388057   136

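Note: subset() extracts the rules that meet our requirements. rhs denotes the right-hand side of a rule and lhs the left-hand side. Keep in mind that lhs %in% x keeps a rule when its left-hand side contains at least one item from x, so this condition on its own does not exclude items 1 to 9 from the left-hand side.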
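As an alternative to filtering after the fact, the same kind of constraint can be imposed while mining through the appearance argument of apriori(). The following is a sketch on a small made-up data set, not the post's own workflow: `toy` and its values are hypothetical, and the item groups are picked by label prefix only for this example.

```r
library(arules)

# Toy data (hypothetical values): a payment column, a demographic column,
# and three product flags, mirroring the column types in the basket data.
toy <- data.frame(
  pmethod    = factor(c("CARD", "CASH", "CARD", "CASH",
                        "CARD", "CASH", "CARD", "CASH")),
  sex        = factor(c("M", "M", "M", "M", "F", "F", "F", "F")),
  beer       = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE),
  cannedveg  = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE),
  frozenmeal = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE)
)
toy_trans <- as(toy, "transactions")
labs <- itemLabels(toy_trans)

excluded <- labs[grepl("^pmethod=", labs)]        # must not appear anywhere
lhs_only <- labs[grepl("^sex=", labs)]            # may appear on the left only
products <- setdiff(labs, c(excluded, lhs_only))  # may appear on either side

rules <- apriori(
  toy_trans,
  parameter  = list(support = 0.25, confidence = 0.8, minlen = 2),
  appearance = list(none = excluded, lhs = lhs_only, default = "both"),
  control    = list(verbose = FALSE)
)
inspect(sort(rules, by = "lift"))
```

Constraining the search this way avoids generating rules that would be discarded later, at the cost of having to re-run apriori() whenever the constraint changes; subset() on an existing rule set is more convenient for exploration.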

> rules_sub1 <- subset(fit_apriori, subset = rhs %in% itemLabels(shopData)[20:30] & lhs %in% "age=[16,26)" & lift>3)
> inspect(rules_sub1)
    lhs                                    rhs        support confidence coverage
[1] {homeown=NO, age=[16,26), fish}     => {fruitveg} 0.111   0.9098361  0.122   
[2] {homeown=NO, age=[16,26), fruitveg} => {fish}     0.111   0.9568966  0.116   
    lift     count
[1] 3.042930 111  
[2] 3.277043 111  

Note: this keeps only the rules whose left-hand side contains age=[16,26).

> rules_sub2 <- subset(fit_apriori, subset = rhs %in% itemLabels(shopData)[20:30] & !(lhs %in% itemLabels(shopData)[1:19]) )
> inspect(rules_sub2)
    lhs                        rhs          support confidence coverage lift     count
[1] {frozenmeal, beer}      => {cannedveg}  0.146   0.8588235  0.170    2.834401 146  
[2] {cannedveg, beer}       => {frozenmeal} 0.146   0.8742515  0.167    2.894873 146  
[3] {cannedveg, frozenmeal} => {beer}       0.146   0.8439306  0.173    2.880309 146

Note: this keeps only the rules between products.
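The filters in this section depend on how itemsets are matched: in arules, `lhs %in% x` is TRUE when the left-hand side contains any item of x, which is why the last filter negates a match on the unwanted items instead of matching the wanted ones. The base R sketch below mimics the "any", "all", and "only" matching modes on plain character vectors. The helper names are hypothetical and are not arules functions; arules documents its own operators for these modes (`%in%`, `%ain%`, and in recent versions `%oin%`), so check the help page of your installed version.

```r
# Three left-hand sides written as plain character vectors (toy examples).
lhs_sets <- list(
  c("sex=M", "beer"),
  c("pmethod=CARD", "beer"),
  c("beer", "cannedveg")
)
allowed <- c("sex=M", "beer", "cannedveg")

# Hypothetical helpers mimicking the three matching modes.
contains_any <- function(sets, table) {
  vapply(sets, function(s) any(s %in% table), logical(1))
}
contains_all <- function(sets, table) {
  vapply(sets, function(s) all(table %in% s), logical(1))
}
contains_only <- function(sets, table) {
  vapply(sets, function(s) all(s %in% table), logical(1))
}

any_match  <- contains_any(lhs_sets, allowed)
only_match <- contains_only(lhs_sets, allowed)
all_match  <- contains_all(lhs_sets, c("beer", "cannedveg"))

print(any_match)
print(only_match)
print(all_match)
```

In this toy case the "any" mode accepts the second set even though it contains pmethod=CARD, while the "only" mode rejects it.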



To get the sample data, follow the WeChat official account and reply: R_dt15

