Efficient way to create a market basket matrix in R
I am trying to create a market basket matrix from data that looks like the following:

    input <- matrix(c(1000001, 1000001, 1000001, 1000001, 1000001, 1000001,
                      1000002, 1000002, 1000002, 1000003, 1000003, 1000003,
                      100001, 100002, 100003, 100004, 100005, 100006,
                      100002, 100003, 100007, 100002, 100003, 100008),
                    ncol = 2)
    colnames(input) <- c("customer", "product")

This represents customer/product pairs, one per row. From this, a matrix is created that has the customers as rows and the products as columns. This can be achieved by first creating a matrix of zeros:
    input <- as.data.frame(input)
    m <- matrix(0, length(unique(input$customer)), length(unique(input$product)))
    rownames(m) <- unique(input$customer)
    colnames(m) <- unique(input$product)

This is fast enough (I have data of 750,000+ rows, creating a 15,000 x 1,500 matrix), but now I want to fill the matrix appropriately:

    for (i in 1:nrow(input)) {
      m[as.character(input[i, 1]), as.character(input[i, 2])] <- 1
    }

I think there has to be a more efficient way to do this, as I have learned on Stack Overflow that for loops can be avoided. So the question is: is there a faster way to do this?
I need the data in a matrix because I would like to use packages like caret. After this I will probably run into the same problem as in "R memory management advice (caret, model matrices, data frames)", but that's a concern for later.
You don't need reshape2 for this; table() is what you are looking for.

    m1 <- as.matrix(as.data.frame.matrix(table(input)))
    all.equal(m, m1)
    # [1] TRUE
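One thing to keep in mind: table() counts occurrences, so if the real data contains the same customer/product pair more than once, the cell will hold 2 or more, whereas the loop in the question always writes 1. The following is a minimal sketch of how that could be handled, together with a loop-free fill that uses a two-column index matrix. It rebuilds the sample data as a data frame; the names basket, customers, products and m2 are illustrative and not from the original post.

```r
# Sample data from the question: one row per (customer, product) pair.
input <- data.frame(
  customer = c(rep(1000001, 6), rep(1000002, 3), rep(1000003, 3)),
  product  = c(100001, 100002, 100003, 100004, 100005, 100006,
               100002, 100003, 100007,
               100002, 100003, 100008)
)

# table() cross-tabulates the pairs; unclass() leaves a plain integer
# matrix with customers as row names and products as column names.
basket <- unclass(table(input$customer, input$product))

# Clamp counts to 0/1 in case a pair occurs more than once.
basket[basket > 1L] <- 1L

# Alternative without table(): build a zero matrix and set all cells in
# one vectorised assignment via a (row, column) index matrix.
customers <- sort(unique(input$customer))
products  <- sort(unique(input$product))
m2 <- matrix(0L, nrow = length(customers), ncol = length(products),
             dimnames = list(as.character(customers), as.character(products)))
m2[cbind(match(input$customer, customers),
         match(input$product, products))] <- 1L

# Both approaches should agree cell by cell.
stopifnot(all(basket == m2))
```

Both variants sort the customers and products, whereas the loop version keeps them in order of first appearance, so row and column order can differ from m on other data even though the contents match.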