Process Mining: The next step in Business Process Management
Process Mining: The next step in Business Process Management
Prof.dr.ir. Wil van der Aalst
Eindhoven University of Technology, Department of Information and Technology, P.O. Box 513, 5600 MB Eindhoven, The Netherlands
& Centre for Information Technology Innovation (CITI), Queensland University of Technology (QUT), Brisbane, Australia
Outline
Motivation
Overview of process mining
– Basic performance metrics
– Process models
– Organizational models
– Social networks
– Performance characteristics
Process Mining: Some of our tools
– EMiT
– Thumb
– MinSocN
Conclusion
Workflow/BPM in The Netherlands
“The Netherlands is the country with the highest density of workflow systems per capita” John O'Connell (CEO Staffware)
(cf. population density per sq. km: 390 versus 2.5 for Australia)
Emphasis on process modeling and analysis (the European way)
Innovative companies like Pallas Athena, Baan,
I&T department, Eindhoven University of Technology
Embedded in research institute BETA joining multiple disciplines
Three subgroups:
– Business Process Management (workflow management, Petri nets, mining, ...)
– ICT Architectures (agents, transactions, ...)
– Software Engineering (software quality, ...)
Team working on process mining: Wil van der Aalst, Ton Weijters, Ana Karla Alves de Medeiros, Boudewijn van Dongen, Eric Verbeek, Minseok Song, Monique Vullers-Jansen, Laura Maruster,
Motivation
Commercial Workflow Systems: 25 years of workflow (Zur Muehlen 2003)
Pioneers like Skip Ellis and Michael Zisman already worked on “office automation” in the 1970s.
The WFM hype is over, but there are more and more applications, it has become a mature technology, and WFM is adopted by many other technologies (ERP, Web Services, etc.).
[Figure: timeline, 1980 to 2000, of commercial workflow systems, including Staffware, FileNet, COSA, FlowMark/MQSeries Workflow, InConcert, and many others]
Let us reverse the process!
Process mining can be used for:
– Process discovery (What is the process?)
– Delta analysis (Are we doing what was specified?)
– Performance analysis (How can we improve?)
Particularly interesting in pre- and post-workflow settings!
[Figure: order process model with steps Start, Register order, Prepare shipment, (Re)send bill, Ship goods, Contact customer, Receive payment, Archive order, End, labelled "process mining"]
Process mining: Overview
Classification of process mining
The following types of process mining can be distinguished:
1) Determine basic performance metrics
2) Determine process model
3) Determine organizational model
4) Analyze social network (i.e., relations between actors)
5) Analyze performance characteristics (i.e., derive rules explaining performance)
[Figure: the five types of process mining arranged around the order process model (Start, Register order, Prepare shipment, (Re)send bill, Ship goods, Contact customer, Receive payment, Archive order, End): 1) basic performance metrics, 2) process model, 3) organizational model, 4) social network, 5) performance characteristics (If ... then ...)]
(1) Determine basic performance metrics
Process/control-flow perspective: flow time, waiting time, processing time and synchronization time. Questions:
– What is the average flow time of orders?
– What is the maximum waiting time for activity approve?
– What percentage of requests is handled within 10 days?
– What is the minimum processing time of activity reject?
– What is the average time between scheduling an activity and actually starting it?
Resource perspective: frequencies, time, utilization, and variability. Questions:
– How many times did Sue complete activity reject claim?
– How many times did John withdraw activity go shopping?
– How many times did Clare suspend some running activity?
– How much time did Peter work on instances of activity reject claim?
– How much time did people with role Manager work on this process?
– What is the utilization of John?
– What is the average utilization of people with role Manager?
– How many times did John work for more than 2 hours without interruption?
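Two of the control-flow questions above (average flow time, percentage handled within 10 days) can be sketched in a few lines of Python. This is a minimal illustration only: the log entries, the tuple layout, and the function names are hypothetical and not taken from any of the tools discussed here.

```python
from datetime import datetime, timedelta

# Hypothetical timed log: (case id, activity, event kind, timestamp).
LOG = [
    ("order-1", "register", "start",    "2003-02-03 09:00"),
    ("order-1", "approve",  "complete", "2003-02-07 09:00"),
    ("order-2", "register", "start",    "2003-02-03 10:00"),
    ("order-2", "approve",  "complete", "2003-02-17 10:00"),
]

def flow_times(log):
    """Flow time per case: last timestamp minus first timestamp."""
    first, last = {}, {}
    for case, _activity, _kind, stamp in log:
        t = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        first[case] = min(first.get(case, t), t)
        last[case] = max(last.get(case, t), t)
    return {case: last[case] - first[case] for case in first}

def average(durations):
    """Arithmetic mean of a list of timedelta values."""
    return sum(durations, timedelta()) / len(durations)

def share_within(durations, limit):
    """Fraction of durations that do not exceed the limit."""
    return sum(d <= limit for d in durations) / len(durations)

times = list(flow_times(LOG).values())
avg_flow = average(times)
pct_10_days = 100 * share_within(times, timedelta(days=10))
print(avg_flow, pct_10_days)
```

Waiting time and processing time would need both schedule/start and complete events per activity, which this sketch does not model.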
Example (ARIS PPM) IDS Scheer's ARIS Process Performance Manager
(2) Determine process model
Discover a process model (e.g., in terms of a PN or EPC) without prior knowledge about the structure of the process.
case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D
[Figure: the discovered Petri net $\alpha(W)$ with tasks A, B, C, D, E, and F]
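Before any model can be discovered, the interleaved log has to be split into one task sequence per case. A minimal Python sketch using the log from this slide; the function and variable names are illustrative assumptions.

```python
from collections import Counter, defaultdict

# The event log from the slide, in recorded order: (case id, task).
LOG = [
    (1, "A"), (2, "A"), (3, "A"), (3, "B"), (1, "B"), (1, "C"),
    (2, "C"), (4, "A"), (2, "B"), (2, "D"), (5, "E"), (4, "C"),
    (1, "D"), (3, "C"), (3, "D"), (4, "B"), (5, "F"), (4, "D"),
]

def traces_by_case(log):
    """Group events by case id, preserving the recorded order of tasks."""
    traces = defaultdict(list)
    for case, task in log:
        traces[case].append(task)
    return {case: "".join(tasks) for case, tasks in traces.items()}

traces = traces_by_case(LOG)
variants = Counter(traces.values())  # how often each distinct sequence occurs
print(traces)
print(variants)
```

Only three distinct sequences remain (ABCD, ACBD, and EF), and these are all a discovery algorithm needs to look at.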
(3) Determine organizational model
Discover the organizational model (i.e., roles, departments, etc.) without prior knowledge about the structure of the organization.

     John  Alex  Lucia  Peter  Mary
A      88     0    112      0     0
B       0   189      0     11     0
C       8     0      0    192     0
D       0     2      0      0   198
E      38     0     62      0     0
F      50     0     40      0     0

e.g., correspondence analysis (typically applied in ecology)
[Figure: correspondence analysis plot "Row Points for Source" (symmetrical normalization), axes Dimension 1 and Dimension 2, with points John, Alex, Lucia, Peter, and Mary]
(4) Analyze social network
Social Network Analysis (SNA)
Based on:
– Handover of work
– Subcontracting
– Working together
– Reassignments
– Doing similar tasks
Example

        John  Alex  Lucia  Peter  Mary
John       0     0      0      0     2
Alex       0     0      0      0     0
Lucia      0     0      0      2     2
Peter      0     0      2      0     2
Mary       2     0      2      2     0
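As an illustration of the "handover of work" metric, the sketch below counts how often one performer directly passes a case to another. The cases and their performer sequences are hypothetical; they are not the data behind the matrix above, and this is not necessarily how MinSocN computes its metrics.

```python
from collections import Counter

# Hypothetical log: per case, the performers of successive activities.
CASES = {
    "case 1": ["John", "Mary", "Peter"],
    "case 2": ["Lucia", "Peter", "Mary"],
    "case 3": ["John", "Mary", "Lucia"],
}

def handover_counts(cases):
    """Count direct handovers (giver, receiver) between different performers."""
    counts = Counter()
    for performers in cases.values():
        for giver, receiver in zip(performers, performers[1:]):
            if giver != receiver:
                counts[(giver, receiver)] += 1
    return counts

handovers = handover_counts(CASES)
for (giver, receiver), n in sorted(handovers.items()):
    print(f"{giver} -> {receiver}: {n}")
```

The resulting counts form a directed, weighted graph that can be handed to standard SNA measures.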
(5) Analyze performance characteristics Each case (process/workflow instance) has a number of properties: – Resource that worked on a specific activity – Value of a characteristic data element (e.g., size of order, age of customer, etc.) – Performance metrics of case (e.g., flow time) Using machine-learning techniques it is possible to find relevant relations between these properties.
Example

caseid  Act A  Act B  ...  Act Z  Data D1  Data D2  ...  Data D9  Proc time  Wait time  Flow time
1       John   Mike   ...  Anne   50       20y      ...  80%      12h        3d         3.5d
2       Clare  Jim    ...  Ike    75       15y      ...  75%      6h         3d         3.25d
3       John   Mike   ...  Clare  55       20y      ...  80%      18h        4d         4.75d
...

If John and Mike work together, it takes longer.
Expensive cases require less processing.
Etc.
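A rule such as "if John and Mike work together, it takes longer" can be checked with a simple group comparison before reaching for a learning algorithm. The sketch below uses only the three visible rows of the table; the field names are illustrative, and a plain mean per group stands in for the machine-learning techniques mentioned above.

```python
from statistics import mean

# The three visible rows of the slide's table (flow time in days).
CASES = [
    {"id": 1, "act_a": "John",  "act_b": "Mike", "flow_days": 3.5},
    {"id": 2, "act_a": "Clare", "act_b": "Jim",  "flow_days": 3.25},
    {"id": 3, "act_a": "John",  "act_b": "Mike", "flow_days": 4.75},
]

def mean_flow_by_pair(cases):
    """Average flow time per (performer of Act A, performer of Act B)."""
    groups = {}
    for case in cases:
        pair = (case["act_a"], case["act_b"])
        groups.setdefault(pair, []).append(case["flow_days"])
    return {pair: mean(values) for pair, values in groups.items()}

by_pair = mean_flow_by_pair(CASES)
print(by_pair)
```

With three cases this is only suggestive; a real analysis would need many more cases and a significance check.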
Process mining: The tools EMiT Thumb MinSocN
Process Mining: Tooling
workflow management systems: Staffware, InConcert, MQ Series
case handling / CRM systems: FLOWer, Vectus, Siebel
ERP systems: SAP R/3, BaaN, Peoplesoft
common XML format for storing/exchanging workflow logs
mining tools: EMiT, Thumb, MinSocN
Example: processing customer orders Example in Staffware: 7 tasks and all basic routing constructs
Fragment of Staffware log

Case 21
Directive Description   Event          User                yyyy/mm/dd hh:mm
---------------------------------------------------------------------------
                        Start          swdemo@staffw edl   2003/02/05 15:00
Register order          Processed To   swdemo@staffw edl   2003/02/05 15:00
Register order          Released By    swdemo@staffw edl   2003/02/05 15:00
Prepare shipment        Processed To   swdemo@staffw edl   2003/02/05 15:00
(Re)send bill           Processed To   swdemo@staffw edl   2003/02/05 15:00
(Re)send bill           Released By    swdemo@staffw edl   2003/02/05 15:01
Receive payment         Processed To   swdemo@staffw edl   2003/02/05 15:01
Prepare shipment        Released By    swdemo@staffw edl   2003/02/05 15:01
Ship goods              Processed To   swdemo@staffw edl   2003/02/05 15:01
Ship goods              Released By    swdemo@staffw edl   2003/02/05 15:02
Receive payment         Released By    swdemo@staffw edl   2003/02/05 15:02
Archive order           Processed To   swdemo@staffw edl   2003/02/05 15:02
Archive order           Released By    swdemo@staffw edl   2003/02/05 15:02
                        Terminated                         2003/02/05 15:02

Case 22
Directive Description   Event          User                yyyy/mm/dd hh:mm
---------------------------------------------------------------------------
                        Start          swdemo@staffw edl   2003/02/05 15:02
Register order          Processed To   swdemo@staffw edl   2003/02/05 15:02
Register order          Released By    swdemo@staffw edl   2003/02/05 15:02
Prepare shipment        Processed To   swdemo@staffw edl   2003/02/05 15:02
Fragment of XML file
<?xml version="1.0"?>
<!DOCTYPE WorkFlow_log SYSTEM "http://www.tm.tue.nl/it/research/workflow/mining/WorkFlow_log.dtd">
<WorkFlow_log>
  <source program="staffware"/>
  <process id="main_process">
    <case id="case_0">
      <log_line>
        <task_name>Case start</task_name>
        <event kind="normal"/>
        <date>05-02-2003</date>
        <time>15:04</time>
      </log_line>
      <log_line>
        <task_name>Register order</task_name>
        <event kind="schedule"/>
        <date>05-02-2003</date>
        <time>15:04</time>
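A log in this format can be read with a standard XML parser. The sketch below is an assumption-laden illustration: the element names (WorkFlow_log, log_line, task_name) are written with underscores, the sample document is a hypothetical completion of the fragment with the DOCTYPE line omitted, and read_events is not part of any of the mining tools.

```python
import xml.etree.ElementTree as ET

# Hypothetical complete document in the shape suggested by the fragment.
DOC = """<?xml version="1.0"?>
<WorkFlow_log>
  <source program="staffware"/>
  <process id="main_process">
    <case id="case_0">
      <log_line>
        <task_name>Case start</task_name>
        <event kind="normal"/>
        <date>05-02-2003</date>
        <time>15:04</time>
      </log_line>
      <log_line>
        <task_name>Register order</task_name>
        <event kind="schedule"/>
        <date>05-02-2003</date>
        <time>15:04</time>
      </log_line>
    </case>
  </process>
</WorkFlow_log>
"""

def read_events(xml_text):
    """Return (case id, task name, event kind, 'date time') per log line."""
    root = ET.fromstring(xml_text)
    events = []
    for case in root.iter("case"):
        for line in case.iter("log_line"):
            events.append((
                case.get("id"),
                line.findtext("task_name").strip(),
                line.find("event").get("kind"),
                line.findtext("date").strip() + " " + line.findtext("time").strip(),
            ))
    return events

events = read_events(DOC)
for event in events:
    print(event)
```

Because every source system is exported to the same format, a reader like this is all a mining tool needs on the input side.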
EMiT Focus on time.
Thumb Focus on noise.
Thumb is able to deal with noise (D/F-graphs)
[Figure: causality graphs mined from a log with no noise and from a log with 10% noise]
Representation in terms of an EPC (collaboration with IDS Scheer)
[Figure: EPC of the order process with steps Start, Register order, Prepare shipment, (Re)send bill, Ship goods, Contact customer, Receive payment, Archive order, End]
MinSocN (Mining Social Networks)
Real case: CJIB Processing of fines 130136 cases 99 different activities
Process in EMiT
Complete process model Validated by CJIB
Conclusion
Conclusion (1)
Process mining is practically relevant and the logical next step in Business Process Management.
[Figure: BPM life cycle with phases process design, implementation/configuration, process enactment, and diagnosis]
Conclusion (2)
Process mining provides many interesting challenges for scientists, customers, users, managers, consultants, and tool developers.
[Figure: the five types of process mining arranged around the order process model: 1) basic performance metrics, 2) process model, 3) organizational model, 4) social network, 5) performance characteristics (If ... then ...)]
More information http://www.tm.tue.nl/it/research/workflow mining.htm http://www.tm.tue.nl/it/research/patterns http://www.tm.tue.nl/it/staff/wvdaalst W.M.P. van der Aalst and K.M. van Hee. Workflow Management: Models, Methods, and Systems. MIT press, Cambridge, MA, 2002.
References BPM (just books and far from complete)
W.M.P. van der Aalst and K.M. van Hee. Workflow Management: Models, Methods, and Systems. MIT Press, Cambridge, MA, 2002.
Stefan Jablonski and Christoph Bussler. Workflow Management: Modeling Concepts, Architecture and Implementation. Paperback, 351 pages. International Thomson Publishing, October 1996.
Frank Leymann, Dieter Roller, and Andreas Reuter. Production Workflow: Concepts and Techniques. Paperback, 479 pages. Prentice Hall PTR, 1st edition, September 1999.
Michael Zur Muehlen. Workflow-Based Process Controlling: Foundation, Design and Application of Workflow-Driven Process Information Systems. Logos, Berlin, 2003.
Wil M.P. van der Aalst, Arthur H.M. ter Hofstede, and Mathias Weske, editors. Proceedings of the International Conference on Business Process Management (BPM), Eindhoven, The Netherlands, June 26-27, 2003. Paperback, 391 pages. Springer Verlag, 2003.
W.M.P. van der Aalst, J. Desel, and A. Oberweis, editors. Business Process Management: Models, Techniques, and Empirical Studies, volume 1806 of Lecture Notes in Computer Science. Springer-Verlag, Berlin, 2000.
References (2)
Dan C. Marinescu. Internet Based Workflow Management: Towards a Semantic Web. Hardcover, 626 pages. John Wiley & Sons, 1st edition, April 2002.
Gustavo Alonso, Fabio Casati, Harumi Kuno, and Vijay Machiraju. Web Services. Hardcover, 480 pages. Springer Verlag, June 2003.
Thomas M. Koulopolous. The Workflow Imperative. Hardcover, 240 pages. Van Nostrand Reinhold, 1st edition, January 1995.
Paul Grefen, Barbara Pernici, and Gabriel Sanchez, editors. Database Support for Workflow Management: The WIDE Project. Hardcover, 296 pages. Kluwer Academic Publishers, February 1999.
Hajo Reijers. Design and Control of Workflow Processes: Business Process Management for the Service Industry (Lecture Notes in Computer Science 2617). Paperback, 320 pages. Springer Verlag, October 2003.
Alan Rickayzen et al. Practical Workflow for SAP: Effective Business Processes using SAP's WebFlow Engine. Hardcover, 52 pages. SAP Press, July 2002.
Alec Sharp and Patrick McDermott. Workflow Modeling: Tools for Process Improvement and Application Development. Hardcover, 345 pages. Artech House, 1st edition, February 2001.
Rob Davis. Business Process Modelling With ARIS: A Practical Guide. Paperback, 545 pages. Springer Verlag, August 2001.
References (3)
Layna Fischer, editor. Workflow Handbook 2003. Hardcover, 384 pages. Future Strategies, April 2003.
Specific for process mining:
W.M.P. van der Aalst, B.F. van Dongen, J. Herbst, L. Maruster, G. Schimm, and A.J.M.M. Weijters. Workflow Mining: A Survey of Issues and Approaches. Data and Knowledge Engineering, 47(2):237-267, 2003.
W.M.P. van der Aalst and B.F. van Dongen. Discovering Workflow Performance Models from Timed Logs. EDCIS 2002, volume 2480 of Lecture Notes in Computer Science, pages 45-63. Springer-Verlag, Berlin, 2002.
A.J.M.M. Weijters and W.M.P. van der Aalst. Rediscovering Workflow Models from Event-Based Data using Little Thumb. Integrated Computer-Aided Engineering, 10(2):151-162, 2003.
W.M.P. van der Aalst and A.J.M.M. Weijters, editors. Process Mining, Special Issue of Computers in Industry. Elsevier Science Publishers, Amsterdam, 2004.
W.M.P. van der Aalst, A.J.M.M. Weijters, and L. Maruster. Workflow Mining: Discovering Process Models from Event Logs. IEEE Transactions on Knowledge and Data Engineering (to appear).
Appendix: A concrete algorithm
Process Mining: The alpha algorithm
[Figure: a process model with 22 numbered steps and Dutch labels (from "1 start" to "22 Opbergen en einde"), labelled "alpha algorithm"]
Process log
Minimal information in log: case id’s and task id’s.
Additional information: event type, time, resources, and data.
In this log there are three possible sequences:
– ABCD
– ACBD
– EF
case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D
$>$, $\rightarrow$, $\parallel$, $\#$ relations
Direct succession: $x > y$ iff for some case x is directly followed by y.
Causality: $x \rightarrow y$ iff $x > y$ and not $y > x$.
Parallel: $x \parallel y$ iff $x > y$ and $y > x$.
Choice: $x \# y$ iff not $x > y$ and not $y > x$.
For the log with sequences ABCD, ACBD, and EF:
– $A > B$, $A > C$, $B > C$, $B > D$, $C > B$, $C > D$, $E > F$
– $B \parallel C$, $C \parallel B$
– $A \rightarrow B$, $A \rightarrow C$, $B \rightarrow D$, $C \rightarrow D$, $E \rightarrow F$
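The four relations follow mechanically from the traces. A minimal Python sketch, using the three sequences of the example log (ABCD, ACBD, EF); the function names are illustrative assumptions.

```python
TRACES = ["ABCD", "ACBD", "EF"]

def direct_succession(traces):
    """All pairs (x, y) such that x is directly followed by y in some trace."""
    return {(x, y) for t in traces for x, y in zip(t, t[1:])}

def relations(traces):
    """Derive causality, parallel, and choice from direct succession."""
    tasks = sorted({task for t in traces for task in t})
    succ = direct_succession(traces)
    causal = {(x, y) for (x, y) in succ if (y, x) not in succ}
    parallel = {(x, y) for (x, y) in succ if (y, x) in succ}
    choice = {(x, y) for x in tasks for y in tasks
              if (x, y) not in succ and (y, x) not in succ}
    return succ, causal, parallel, choice

succ, causal, parallel, choice = relations(TRACES)
print(sorted(causal))
print(sorted(parallel))
```

Note that every pair of tasks falls into exactly one of causality (in one direction), parallel, or choice.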
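Basic idea (1): if $x \rightarrow y$ (causality), then x and y are connected through a place (sequence).
Basic idea (2): if $x \rightarrow y$, $x \rightarrow z$, and $y \parallel z$, then y and z run in parallel after x (AND-split).
Basic idea (3): if $x \rightarrow y$, $x \rightarrow z$, and $y \# z$, then x is followed by either y or z (XOR-split).
Basic idea (4): if $x \rightarrow z$, $y \rightarrow z$, and $x \parallel y$, then z needs both x and y (AND-join).
Basic idea (5): if $x \rightarrow z$, $y \rightarrow z$, and $x \# y$, then z follows either x or y (XOR-join).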
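It is not that simple: Basic alpha algorithm
Let W be a workflow log over T. $\alpha(W)$ is defined as follows.
1. $T_W = \{\, t \in T \mid \exists_{\sigma \in W}\; t \in \sigma \,\}$,
2. $T_I = \{\, t \in T \mid \exists_{\sigma \in W}\; t = \mathit{first}(\sigma) \,\}$,
3. $T_O = \{\, t \in T \mid \exists_{\sigma \in W}\; t = \mathit{last}(\sigma) \,\}$,
4. $X_W = \{\, (A,B) \mid A \subseteq T_W \wedge B \subseteq T_W \wedge \forall_{a \in A} \forall_{b \in B}\; a \rightarrow_W b \wedge \forall_{a_1,a_2 \in A}\; a_1 \#_W a_2 \wedge \forall_{b_1,b_2 \in B}\; b_1 \#_W b_2 \,\}$,
5. $Y_W = \{\, (A,B) \in X_W \mid \forall_{(A',B') \in X_W}\; A \subseteq A' \wedge B \subseteq B' \Rightarrow (A,B) = (A',B') \,\}$,
6. $P_W = \{\, p_{(A,B)} \mid (A,B) \in Y_W \,\} \cup \{ i_W, o_W \}$,
7. $F_W = \{\, (a, p_{(A,B)}) \mid (A,B) \in Y_W \wedge a \in A \,\} \cup \{\, (p_{(A,B)}, b) \mid (A,B) \in Y_W \wedge b \in B \,\} \cup \{\, (i_W, t) \mid t \in T_I \,\} \cup \{\, (t, o_W) \mid t \in T_O \,\}$, and
8. $\alpha(W) = (P_W, T_W, F_W)$.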
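The eight steps can be sketched directly in Python for small logs. This is a brute-force illustration, not the implementation used in the tools: it enumerates all subsets of tasks (exponential in the number of tasks), it assumes A and B are non-empty, and it represents each place p(A,B) simply by the pair (A, B) and the source and sink places by the strings "i_W" and "o_W".

```python
from itertools import combinations

def nonempty_subsets(items):
    """All non-empty subsets of items, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def alpha(traces):
    """Brute-force sketch of steps 1-8 for a small log (list of task sequences)."""
    succ = {(x, y) for t in traces for x, y in zip(t, t[1:])}

    def causal(x, y):
        return (x, y) in succ and (y, x) not in succ

    def choice(x, y):
        return (x, y) not in succ and (y, x) not in succ

    t_w = {task for t in traces for task in t}            # step 1
    t_i = {t[0] for t in traces}                          # step 2
    t_o = {t[-1] for t in traces}                         # step 3

    # Candidate sets whose members are pairwise (and self-) unrelated.
    subsets = [s for s in nonempty_subsets(t_w)
               if all(choice(a, b) for a in s for b in s)]
    x_w = [(a, b) for a in subsets for b in subsets       # step 4
           if all(causal(x, y) for x in a for y in b)]
    y_w = [(a, b) for (a, b) in x_w                       # step 5: keep maximal pairs
           if not any((a, b) != (a2, b2) and a <= a2 and b <= b2
                      for (a2, b2) in x_w)]
    places = set(y_w) | {"i_W", "o_W"}                    # step 6
    flow = ({(x, (a, b)) for (a, b) in y_w for x in a}    # step 7
            | {((a, b), y) for (a, b) in y_w for y in b}
            | {("i_W", t) for t in t_i}
            | {(t, "o_W") for t in t_o})
    return places, t_w, flow                              # step 8

places, transitions, flow = alpha(["ABCD", "ACBD", "EF"])
print(len(places), len(transitions), len(flow))
```

For the example log this yields five internal places, one per causal pair, plus the source and sink places, because B and C are parallel and therefore cannot share a set.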
Results
If the log is complete with respect to relation $>$, it can be used to mine any SWF-net!
Structured Workflow Nets (SWF-nets) have no implicit places and the following two constructs cannot be used:
[Figure: the two constructs that are not allowed in SWF-nets]
(Short loops require some refinement, but this is not a problem.)
Example
Log W:
case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D
[Figure: $\alpha(W)$, the mined Petri net over tasks A, B, C, D, E, and F]