CONFIDENTIAL and PROPRIETARY to CVDI Action Association Rules
Action Association Rules Mining
Project proposal for Fall 2016
Jian Chen & Jennifer Lavergne
A National Science Foundation Industry/University Cooperative Research Center

Project Goals
Modify the itemset tree algorithm to:
1. Discover action sets
2. Generate action rules

BACKGROUND

Application: Customer Attrition
Facts:
- On average, most US corporations lose half of their customers every five years.
- The longer a customer stays with the organization, the more profitable he or she becomes.
- Attracting new customers costs five to ten times more than retaining existing ones.
- About 14% to 17% of accounts are closed for controllable reasons such as price or service.
Action:
- Reducing customer outflow by 5% can double a typical company’s profit.

Association Rules
Association rule: an implication {X ⇒ Y, support, confidence}, where X and Y are subsets of the itemset I and X ∩ Y = Ø
– Example: {{bread, milk} ⇒ {cheese}, 30%, 75%}
Support = #occurrences of I in database / #rows in database
– Minsup: the minimum support threshold for an itemset I to be considered frequent
Confidence = Support(X ∪ Y) / Support(X) for itemset I = X ∪ Y
– Minconf: a user-specified threshold that indicates the interestingness of a candidate rule I: conf(I) ≥ minconf

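The support and confidence definitions above can be sketched in a few lines of Python. The 10-row database is hypothetical (it does not come from the slides) and was chosen only so that {bread, milk} ⇒ {cheese} comes out at 30% support and 75% confidence; the helper names `count`, `support` and `confidence` are illustrative as well.

```python
from typing import FrozenSet, List

# Hypothetical 10-row database, invented for illustration only.
DB: List[FrozenSet[str]] = [
    frozenset({"bread", "milk", "cheese"}),
    frozenset({"bread", "milk", "cheese"}),
    frozenset({"bread", "milk", "cheese"}),
    frozenset({"bread", "milk"}),
    frozenset({"bread"}),
    frozenset({"milk", "eggs"}),
    frozenset({"eggs"}),
    frozenset({"cheese", "eggs"}),
    frozenset({"bread", "eggs"}),
    frozenset({"milk"}),
]


def count(itemset: FrozenSet[str], db: List[FrozenSet[str]]) -> int:
    """Number of rows that contain every item of `itemset`."""
    return sum(1 for row in db if itemset <= row)


def support(itemset: FrozenSet[str], db: List[FrozenSet[str]]) -> float:
    """Fraction of rows that contain `itemset`."""
    return count(itemset, db) / len(db)


def confidence(x: FrozenSet[str], y: FrozenSet[str], db: List[FrozenSet[str]]) -> float:
    """Support(X ∪ Y) / Support(X), computed from raw counts."""
    x_count = count(x, db)
    if x_count == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    return count(x | y, db) / x_count


X = frozenset({"bread", "milk"})
Y = frozenset({"cheese"})
print(support(X | Y, DB), confidence(X, Y, DB))  # 0.3 0.75
```

A rule would then be kept only if both values reach the user's minsup and minconf thresholds.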
Itemset Trees
A data structure that helps users query for a specific itemset and its support (targeted association mining).
Items are mapped to numeric values: {bread} = {1}, {cheese} = {2}
– Numbers must be in ascending order within the itemset
– Ex: I = {1, 2, 56, 120}
Note: Can be used to find all rules or specific rules within a dataset.

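A minimal sketch of the item-to-integer mapping described above, assuming ids are handed out in first-seen order (the slides do not say how ids are assigned). `build_item_ids` and `encode` are hypothetical helper names, and the two sample rows are made up.

```python
from typing import Dict, Sequence, Tuple


def build_item_ids(rows: Sequence[Sequence[str]]) -> Dict[str, int]:
    """Give each distinct item a 1-based id in first-seen order."""
    ids: Dict[str, int] = {}
    for row in rows:
        for item in row:
            # setdefault leaves an existing id untouched.
            ids.setdefault(item, len(ids) + 1)
    return ids


def encode(row: Sequence[str], ids: Dict[str, int]) -> Tuple[int, ...]:
    """Encode one row as a strictly ascending tuple of item ids."""
    return tuple(sorted({ids[item] for item in row}))


rows = [["bread", "cheese"], ["milk", "cheese", "bread"]]
ids = build_item_ids(rows)
encoded = [encode(row, ids) for row in rows]
print(ids)      # {'bread': 1, 'cheese': 2, 'milk': 3}
print(encoded)  # [(1, 2), (1, 2, 3)]
```

Sorting inside `encode` is what enforces the ascending-order requirement, regardless of the order in which items appear in the raw row.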
RULE GENERATION

Rule Generation
minsup = 10% and minconf = 30%
Rules generated using this method: {query} ⇒ {I − query}
{2, 4} ⇒ {1}, support = 1/7 ≈ 14%, confidence = 1/3 ≈ 33%
{2, 4} ⇒ {8}, support = 2/7 ≈ 29%, confidence = 2/3 ≈ 67%

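To make the arithmetic concrete, here is a rough sketch of targeted rule generation for a query itemset. Two loud assumptions: the 7-row database is hypothetical, invented so that the counts agree with the two rules above, and a plain scan over the rows stands in for the itemset tree lookup. For brevity only single-item consequents are generated; `rules_for_query` is an illustrative name.

```python
from fractions import Fraction
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical 7-row database, invented so the counts match the slide.
DB: List[FrozenSet[int]] = [
    frozenset({1, 2, 4}),
    frozenset({2, 4, 8}),
    frozenset({2, 4, 8}),
    frozenset({1, 3}),
    frozenset({3, 5}),
    frozenset({5, 8}),
    frozenset({3}),
]

Rule = Tuple[int, Fraction, Fraction]  # (consequent item, support, confidence)


def rules_for_query(query: Iterable[int], db: List[FrozenSet[int]],
                    minsup: Fraction, minconf: Fraction) -> List[Rule]:
    """Return rules {query} => {item} that meet both thresholds."""
    q = frozenset(query)
    covering = [row for row in db if q <= row]  # rows containing the query
    if not covering:
        return []
    candidates = sorted(set().union(*covering) - q)
    rules: List[Rule] = []
    for item in candidates:
        both = sum(1 for row in covering if item in row)
        sup = Fraction(both, len(db))
        conf = Fraction(both, len(covering))
        if sup >= minsup and conf >= minconf:
            rules.append((item, sup, conf))
    return rules


for item, sup, conf in rules_for_query({2, 4}, DB, Fraction(1, 10), Fraction(3, 10)):
    print(f"{{2, 4}} => {{{item}}}: support {sup}, confidence {conf}")
# {2, 4} => {1}: support 1/7, confidence 1/3
# {2, 4} => {8}: support 2/7, confidence 2/3
```

Using `Fraction` keeps the values exact, so 2/3 is not silently truncated when it is later shown as a percentage.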
ACTION ASSOCIATION RULES

Action Association Rules
Action rule: an association rule over flexible and stable attributes.
Flexible attribute: an attribute whose value can potentially change from one state to another, e.g. (interest rate, low → high)
Stable attribute: an attribute whose value remains fixed, e.g. (date of birth)
Action rule example: (a, a1), (b, b1 → b2) ⇒ (d, d2 → d1)(c, c1)

Action Set Discovery
      A    B    C    D
x1    a1   b1   c1   d1
x2    a2   b1   c1   d1
x3    a2   b2   c1   d2
x4    a2   b2   c1   d2
x5    a2   b1   c1   d1
x6    a2   b2   c1   d2
x7    a2   b1   c2   d2
x8    a1   b2   c2   d1
- B and D are flexible
- A and C are stable
- MinSup = 3
(category, item) – support count
(A, a1) - support 2
(A, a2) - support 6
(B, b1) - support 4
(B, b2) - support 4
(C, c1) - support 6
(C, c2) - support 2
(D, d1) - support 4
(D, d2) - support 4

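The single-item support counts can be reproduced directly from the table. This is only a counting sketch over the eight rows shown on the slide; the names `ROWS`, `singleton_supports` and `frequent` are illustrative.

```python
from collections import Counter
from typing import Dict, Tuple

ATTRIBUTES = ("A", "B", "C", "D")

# The eight objects from the slide, one tuple of values per object.
ROWS: Dict[str, Tuple[str, str, str, str]] = {
    "x1": ("a1", "b1", "c1", "d1"),
    "x2": ("a2", "b1", "c1", "d1"),
    "x3": ("a2", "b2", "c1", "d2"),
    "x4": ("a2", "b2", "c1", "d2"),
    "x5": ("a2", "b1", "c1", "d1"),
    "x6": ("a2", "b2", "c1", "d2"),
    "x7": ("a2", "b1", "c2", "d2"),
    "x8": ("a1", "b2", "c2", "d1"),
}
MINSUP = 3


def singleton_supports(rows: Dict[str, Tuple[str, ...]]) -> Counter:
    """Count how many objects carry each (attribute, value) pair."""
    counts: Counter = Counter()
    for values in rows.values():
        for attribute, value in zip(ATTRIBUTES, values):
            counts[(attribute, value)] += 1
    return counts


supports = singleton_supports(ROWS)
# Keep only the pairs that reach the minimum support count.
frequent = {term: n for term, n in supports.items() if n >= MINSUP}

for term in sorted(supports):
    print(term, supports[term])
```

With MinSup = 3, (A, a1) and (C, c2) fall below the threshold, leaving six frequent pairs.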
Action Set Discovery
(category, item change) – support count
(B, b1 → b2) - support 4
(B, b2 → b1) - support 4
(D, d1 → d2) - support 4
(D, d2 → d1) - support 4
The minimum of the two supports is kept and compared against MinSup:
Support(b1 → b2) = min(support(b1), support(b2))

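The min rule for a single transition can be sketched as follows. The support counts are the single-item counts for the slide's table; the function name `transition_supports` and the (attribute, from, to) tuple layout are assumptions made for this example.

```python
from itertools import permutations
from typing import Dict, Set, Tuple

FLEXIBLE: Set[str] = {"B", "D"}

# Single-item support counts for the slide's table.
SINGLETON_SUPPORT: Dict[Tuple[str, str], int] = {
    ("A", "a1"): 2, ("A", "a2"): 6,
    ("B", "b1"): 4, ("B", "b2"): 4,
    ("C", "c1"): 6, ("C", "c2"): 2,
    ("D", "d1"): 4, ("D", "d2"): 4,
}


def transition_supports(singleton_support: Dict[Tuple[str, str], int],
                        flexible: Set[str]) -> Dict[Tuple[str, str, str], int]:
    """Support of every value change on a flexible attribute (min rule)."""
    by_attribute: Dict[str, Dict[str, int]] = {}
    for (attribute, value), n in singleton_support.items():
        if attribute in flexible:  # stable attributes never transition
            by_attribute.setdefault(attribute, {})[value] = n
    result: Dict[Tuple[str, str, str], int] = {}
    for attribute, values in by_attribute.items():
        for src, dst in permutations(sorted(values), 2):
            result[(attribute, src, dst)] = min(values[src], values[dst])
    return result


for (attribute, src, dst), n in sorted(transition_supports(SINGLETON_SUPPORT, FLEXIBLE).items()):
    print(f"({attribute}, {src} -> {dst}) - support {n}")
```

Because min is symmetric, b1 → b2 and b2 → b1 always receive the same support at this stage.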
Action Set Discovery
(A, a2) ・ (B, b1) - support 3
(A, a2) ・ (B, b2) - support 3
(A, a2) ・ (C, c1) - support 5
(A, a2) ・ (D, d1) - support 2
(A, a2) ・ (D, d2) - support 4
(B, b1) ・ (C, c1) - support 3
(B, b1) ・ (D, d1) - support 3
(B, b1) ・ (D, d2) - support 1
(B, b2) ・ (C, c1) - support 3
(B, b2) ・ (D, d1) - support 1
(B, b2) ・ (D, d2) - support 3
(C, c1) ・ (D, d1) - support 3
(C, c1) ・ (D, d2) - support 3

Action Set Discovery
(A, a2) ・ (B, b1 → b2) - support 3
(A, a2) ・ (B, b2 → b1) - support 3
(A, a2) ・ (D, d2 → d1) - support 2
(A, a2) ・ (D, d1 → d2) - support 2
(B, b1 → b2) ・ (D, d1) - support 1
(B, b1 → b2) ・ (D, d2) - support 1
(B, b1 → b2) ・ (D, d2 → d1) - support 1
(B, b1 → b2) ・ (D, d1 → d2) - support 3
(C, c2) ・ (D, d2 → d1) - support 1
(C, c2) ・ (D, d1 → d2) - support 1
And so on.

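One reading of these combined supports, generalizing the min rule from single transitions: count the objects that match every "before" value, count the objects that match every "after" value, and keep the smaller count. The sketch below follows that reading. Encoding an unchanged term as a transition from a value to itself is an assumption of this example, as are the names `count_matching` and `action_set_support`.

```python
from typing import Dict, List, Sequence, Tuple

ATTRIBUTES = ("A", "B", "C", "D")
ROWS: List[Tuple[str, str, str, str]] = [
    ("a1", "b1", "c1", "d1"),  # x1
    ("a2", "b1", "c1", "d1"),  # x2
    ("a2", "b2", "c1", "d2"),  # x3
    ("a2", "b2", "c1", "d2"),  # x4
    ("a2", "b1", "c1", "d1"),  # x5
    ("a2", "b2", "c1", "d2"),  # x6
    ("a2", "b1", "c2", "d2"),  # x7
    ("a1", "b2", "c2", "d1"),  # x8
]

# (attribute, value_before, value_after); before == after means "no change".
Term = Tuple[str, str, str]


def count_matching(rows: Sequence[Sequence[str]], wanted: Dict[str, str]) -> int:
    """Count rows whose value matches `wanted` for every listed attribute."""
    index = {name: i for i, name in enumerate(ATTRIBUTES)}
    return sum(
        1 for row in rows
        if all(row[index[attribute]] == value for attribute, value in wanted.items())
    )


def action_set_support(rows: Sequence[Sequence[str]], terms: Sequence[Term]) -> int:
    """min(#rows matching all 'before' values, #rows matching all 'after' values)."""
    before = {attribute: src for attribute, src, _ in terms}
    after = {attribute: dst for attribute, _, dst in terms}
    return min(count_matching(rows, before), count_matching(rows, after))


print(action_set_support(ROWS, [("A", "a2", "a2"), ("B", "b1", "b2")]))  # 3
print(action_set_support(ROWS, [("B", "b1", "b2"), ("D", "d1", "d2")]))  # 3
print(action_set_support(ROWS, [("B", "b1", "b2"), ("D", "d2", "d1")]))  # 1
```

Under this reading, only action sets whose support reaches MinSup = 3, such as (B, b1 → b2) ・ (D, d1 → d2), would go forward to rule generation.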
Action Rule Generation
Selection of rules generated from this dataset using MinSup and MinConf thresholds:
(B, b1 → b2) ・ (D, d1 → d2) - support 3
(B, b1 → b2) ⇒ (D, d1 → d2)
(D, d1 → d2) ⇒ (B, b1 → b2)
There has to be a transition on each side of the association rule.

PROJECT DESCRIPTION

Project description
Modify the ordered itemset tree algorithm to:
1. Discover action sets
2. Generate action rules
Modify existing code or write your own.
– Needs to be “attribute aware.”
– Needs to keep track of flexible/stable attributes
Read the provided papers on itemset trees and action association rules.

Project description
Need to map datasets to integer values without losing track of which attributes are flexible and which are stable.
Modifications to search:
– Additions to keep track of the flexible/stable attributes during search
– Ability to search for itemsets, transitions, or both simultaneously
Mark nodes and use subtrees to find transitions

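One possible way to keep the integer encoding "attribute aware" is a small catalog that records, for every item id, its attribute, its value, and whether the attribute is flexible. This is a sketch under assumptions, not the project's prescribed design: `ItemInfo`, `build_catalog`, `encode_row` and `is_transition_pair` are hypothetical names, and ids are assigned attribute by attribute in sorted value order.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Set, Tuple


@dataclass(frozen=True)
class ItemInfo:
    item_id: int
    attribute: str
    value: str
    flexible: bool


def build_catalog(attributes: Sequence[str], rows: Sequence[Sequence[str]],
                  flexible: Set[str]) -> Dict[Tuple[str, str], ItemInfo]:
    """Assign ascending ids per (attribute, value) and remember flexibility."""
    catalog: Dict[Tuple[str, str], ItemInfo] = {}
    for position, attribute in enumerate(attributes):
        for value in sorted({row[position] for row in rows}):
            catalog[(attribute, value)] = ItemInfo(
                len(catalog) + 1, attribute, value, attribute in flexible)
    return catalog


def encode_row(attributes: Sequence[str], row: Sequence[str],
               catalog: Dict[Tuple[str, str], ItemInfo]) -> Tuple[int, ...]:
    """Encode one object as an ascending tuple of item ids."""
    return tuple(sorted(catalog[(a, v)].item_id for a, v in zip(attributes, row)))


def is_transition_pair(first: int, second: int, by_id: Dict[int, ItemInfo]) -> bool:
    """True when two ids are different values of the same flexible attribute."""
    a, b = by_id[first], by_id[second]
    return first != second and a.attribute == b.attribute and a.flexible


ATTRIBUTES = ("A", "B", "C", "D")
ROWS = [
    ("a1", "b1", "c1", "d1"), ("a2", "b1", "c1", "d1"),
    ("a2", "b2", "c1", "d2"), ("a2", "b2", "c1", "d2"),
    ("a2", "b1", "c1", "d1"), ("a2", "b2", "c1", "d2"),
    ("a2", "b1", "c2", "d2"), ("a1", "b2", "c2", "d1"),
]
catalog = build_catalog(ATTRIBUTES, ROWS, flexible={"B", "D"})
by_id = {info.item_id: info for info in catalog.values()}

print(encode_row(ATTRIBUTES, ROWS[0], catalog))  # (1, 3, 5, 7)
print(is_transition_pair(3, 4, by_id))           # True: b1 and b2, B is flexible
print(is_transition_pair(1, 2, by_id))           # False: A is stable
```

Because the catalog travels with the encoded data, a search routine can look up any id it meets in the tree and decide whether a pair of ids forms a candidate transition.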
Project description
Modifications to rule generation:
– Produce rules with both stable and flexible attributes
– Has to have a transition on each side of the rule
   (a, a2) (d, d1 → d2) ⇒ (b, b1 → b2)
   (d, d1 → d2) ⇒ (a, a2) (b, b1 → b2)
   (b, b1 → b2) ⇒ (d, d1 → d2)
– Not every flexible attribute has to have a transition

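The "transition on each side" requirement can be sketched as a filter over every way of splitting an action set into a left and a right side. The sketch assumes terms are (attribute, before, after) tuples, with before == after for an unchanged value; `candidate_rules` is an illustrative name, and support and confidence filtering is deliberately left out.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Term = Tuple[str, str, str]  # (attribute, value_before, value_after)
Rule = Tuple[Tuple[Term, ...], Tuple[Term, ...]]  # (left side, right side)


def is_transition(term: Term) -> bool:
    """A term is a transition when its value actually changes."""
    _, before, after = term
    return before != after


def candidate_rules(action_set: Sequence[Term]) -> List[Rule]:
    """All left => right splits with at least one transition on each side."""
    terms = list(action_set)
    rules: List[Rule] = []
    for size in range(1, len(terms)):  # both sides must be non-empty
        for left in combinations(terms, size):
            right = tuple(t for t in terms if t not in left)
            if any(map(is_transition, left)) and any(map(is_transition, right)):
                rules.append((left, right))
    return rules


action_set = [("a", "a2", "a2"), ("b", "b1", "b2"), ("d", "d1", "d2")]
for left, right in candidate_rules(action_set):
    print(left, "=>", right)
```

For this three-term set, four of the six possible splits survive; the two rejected splits are the ones that leave the stable term (a, a2) alone on one side. Stable terms may still appear on either side as long as a transition accompanies them.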
References
- M. Kubat, A. Hafez, V. V. Raghavan, J. Lekkala, and W. K. Chen, "Itemset trees for targeted association mining", IEEE Trans. on Knowledge and Data Engineering, 2002.
- Z. W. Ras, A. Dardzinska, L.-S. Tsay, and H. Wasyluk, "Association Action Rules", IEEE/ICDM Workshop on Mining Complex Data (MCD 2008), Pisa, Italy, ICDM Workshops Proceedings, IEEE Computer Society, 2008, pp. 283-290.
- S. Im and Z. W. Ras, "Action rule extraction from a decision table: ARED", in Foundations of Intelligent Systems, Proceedings of ISMIS'08, A. An et al. (Eds.), Toronto, Canada, LNAI, Vol. 4994, Springer, 2008, pp. 160-168.
- Z. W. Ras and A. Dardzinska, "Action Rules Discovery without pre-existing classification rules", in Proceedings of RSCTC 2008 Conference, Akron, Ohio, LNAI 5306, Springer, 2008, pp. 181-190.
- D. Difallah, R. Benton, T. Johnsten, and V. Raghavan, "FAARM: Frequent Association Action Rules Mining Using FP-Tree", in Workshop on Domain Driven Data Mining, part of 11th IEEE International Conference on Data Mining Workshops, Vancouver, Canada, pp. 398-404, December 11, 2011.

Questions?
Jian Chen, Ph.D. [email protected]
Jennifer Lavergne, Ph.D. [email protected]