Process Mining

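    Traditional machine learning techniques assume that the input data is a flat table in which columns describe variables and rows represent samples. Process analytics generalizes this to data that describes flows, trajectories, or sequences of activities. Process mining techniques are in particular demand in business process management. Over the past decade, process mining has developed rapidly around business process analysis, and our research team currently focuses on the following topics: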

  • Process mining and intelligence

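    Enterprises today rely heavily on information systems to support their business processes. These systems store large volumes of records and log data containing many kinds of events, each associated with the execution of a task in a business process. Process mining builds on the analysis of event logs to model, improve, and extend operational processes. It therefore sits at the intersection of business process management and data mining.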

(1) Process discovery and identification

    Consider the following event log for a customer credit card application:

Figure: A simple event log for a credit card application process

    The data table is not a flat structure. Instead, it has the following characteristics:

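1. A grouping factor, which gathers multiple rows into one group;
2. An ordering, which sorts the rows globally or locally;
3. Labels, with a labeling method that assigns a label to each row.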
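
The three characteristics above can be illustrated with a minimal sketch. It assumes a hypothetical log whose rows are (case id, timestamp, activity) tuples: the case id plays the role of the grouping factor, the timestamp gives the ordering, and the activity name is the label. The column names and values are invented for illustration and are not taken from the figure.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log rows: (case_id, timestamp, activity).
# The values are illustrative only and are not taken from the figure.
EVENTS = [
    ("app-2", "2014-03-01 09:30", "receive application"),
    ("app-1", "2014-03-01 09:00", "receive application"),
    ("app-1", "2014-03-01 10:15", "check credit"),
    ("app-2", "2014-03-02 11:00", "check credit"),
    ("app-1", "2014-03-03 14:00", "approve"),
    ("app-2", "2014-03-02 16:45", "reject"),
]


def build_traces(events):
    """Group rows by case id, order each group by time, keep the activity labels."""
    grouped = defaultdict(list)
    for case_id, timestamp, activity in events:
        when = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
        grouped[case_id].append((when, activity))
    return {
        case_id: [activity for _, activity in sorted(rows)]
        for case_id, rows in grouped.items()
    }


traces = build_traces(EVENTS)
for case_id in sorted(traces):
    print(case_id, traces[case_id])
```

Each resulting trace is an ordered sequence of activities for one case, which is the unit that most process mining techniques work on.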

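    The first task of process mining is process discovery: extracting a model that represents the real-life situation. Doing so helps answer questions such as "What does the real process look like?", "What are the common paths?", "Where are the bottlenecks?", "How do people work together?", and "Which event and activity attributes affect duration or flow, and how?". The challenge in answering these questions lies not only in abstracting the flow of activities from the event log, but also in making the resulting model understandable, so that the important information is visible at a glance.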

Figure: Process model visualizing the activity flow extracted from an event log
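
One simple way to abstract an activity flow from a log is a directly-follows graph, which counts how often one activity is immediately followed by another. The sketch below is only an illustration of that general idea, not necessarily the discovery technique behind the figure, and the traces are hypothetical.

```python
from collections import Counter

# Hypothetical traces (one activity sequence per case); illustrative only.
TRACES = [
    ["receive application", "check credit", "approve"],
    ["receive application", "check credit", "reject"],
    ["receive application", "check credit", "request documents",
     "check credit", "approve"],
]


def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


dfg = directly_follows(TRACES)
for (a, b), n in dfg.most_common():
    print(f"{a} -> {b}: {n}")
```

The edge counts give a first impression of the common paths. Keeping only the frequent edges is one way to keep a larger model readable.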

   (2) Conformance checking

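    In some cases, end users want to compare the real behavior recorded in the log with a prescribed model or with a set of rules that must be followed. This helps answer questions such as "Where does reality deviate from the model?", "Which rules were violated?", "Which SLAs were not met?", and "Where does reality deviate from the prescribed rules?".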
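
As a rough illustration of this comparison, the sketch below treats the prescribed model as a set of allowed "a is directly followed by b" steps and reports every step in a trace that the model does not allow. This is a deliberate simplification: established conformance checking techniques replay or align traces against a full process model. The model and the log here are hypothetical.

```python
# Hypothetical prescribed model: the allowed "a directly followed by b" steps.
ALLOWED = {
    ("receive application", "check credit"),
    ("check credit", "approve"),
    ("check credit", "reject"),
}

# Hypothetical recorded behavior, one trace per case.
LOG = {
    "app-1": ["receive application", "check credit", "approve"],
    "app-2": ["receive application", "approve"],
}


def deviations(trace, allowed):
    """Return (position, a, b) for each step of the trace the model does not allow."""
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(trace, trace[1:]))
        if (a, b) not in allowed
    ]


report = {case_id: deviations(trace, ALLOWED) for case_id, trace in LOG.items()}
for case_id, found in report.items():
    print(case_id, "conforms" if not found else found)
```

In this example the second case skips the credit check, so the step from receiving the application straight to approval is reported as a deviation.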

   (3) Intersection with data mining

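    In most cases, analysts try to answer the questions above in an exploratory, visual way. We believe, however, that the link between process mining and data mining should be strengthened, which matters most when models become too complex to visualize. We are currently working on questions such as "How have patterns changed over time?", "What are the significant differences between two models?", and "Can we predict the running time or complexity of future events from the past?". Predictive process mining in particular is one of our current priorities.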
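
To make the prediction question concrete, the sketch below fits a very simple baseline: for each activity, it averages the time that historical cases still needed to finish after reaching that activity. It is an assumption-laden illustration of predictive process mining in general, not the group's method, and the history data is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical completed cases: lists of (activity, hours since case start).
HISTORY = [
    [("receive application", 0.0), ("check credit", 2.0), ("approve", 10.0)],
    [("receive application", 0.0), ("check credit", 4.0), ("reject", 8.0)],
]


def fit_remaining_time(history):
    """Average remaining time to case completion, keyed by most recent activity."""
    samples = defaultdict(list)
    for case in history:
        end = case[-1][1]  # elapsed time at the last event of the case
        for activity, elapsed in case:
            samples[activity].append(end - elapsed)
    return {activity: mean(values) for activity, values in samples.items()}


model = fit_remaining_time(HISTORY)
print(model["check credit"])  # average of 8.0 and 4.0 hours
```

A running case whose latest event is "check credit" would then be predicted to need about six more hours. Richer predictors would also use the full prefix of the trace and the event attributes.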

  • Customer "journey flow"


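    Recently, a growing body of research has applied process mining to settings beyond internal processes, such as customer journey mapping and dynamic customer segmentation. Our research group has used process mining to propose a way of exploring how consumer behavior changes dynamically across points in time. The new approach combines clustering techniques from data mining with sequence mining methods to discover typical behavioral processes in a database, and can be called "behavior mining".
    The framework has been applied to event organizers, helping them use behavioral trajectories to explain consumer decisions and improve business processes affected by customer behavior.
    Beyond customer journeys and dynamic segmentation, we are also exploring how process analytics techniques can be applied to other kinds of data streams, such as web server logs.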

 

  • Log analysis

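    Process mining originated in event log analysis. Recent years have shown two trends. First, there is growing interest in process mining beyond business processes. Second, the volume of network log data is growing exponentially: for example, security logs, application logs, and server logs produced by network devices, application software, mobile phones, operating systems, and smart devices, as well as logs produced by interconnected smart devices in the Internet of Things. We expect that over the next few years process mining, stream analytics, log analysis, sequence mining, and predictive data mining will continue to develop and converge, while facing the following challenges: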

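  • How to merge different log streams into one and find suitable methods for matching them;
  • Extracting structured information from unstructured logs;
  • Research on pattern recognition, normalization, correlation analysis, and query tools;
  • Combining predictive machine learning techniques to predict message classes, and developing models that filter out uninteresting messages.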
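
The first two challenges can be sketched in a few lines. The example assumes a hypothetical line format, "<ISO timestamp> <LEVEL> <free text>", extracts structured records with a regular expression, and merges two already time-ordered streams into one. Real logs vary widely in format, so the pattern and the sample lines are assumptions made for illustration.

```python
import heapq
import re
from datetime import datetime

# Hypothetical line format: "<ISO timestamp> <LEVEL> <free text>".
LINE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$")


def parse(source, lines):
    """Yield one structured record per line that matches the assumed format."""
    for line in lines:
        match = LINE.match(line)
        if match:
            yield {
                "ts": datetime.fromisoformat(match["ts"]),
                "source": source,
                "level": match["level"],
                "msg": match["msg"],
            }


# Hypothetical log streams, each already sorted by time.
WEB = [
    "2014-09-07T10:00:00 INFO GET /index.html",
    "2014-09-07T10:00:05 ERROR GET /cart 500",
]
APP = [
    "2014-09-07T10:00:02 INFO session opened",
    "2014-09-07T10:00:04 WARN slow query",
]

# heapq.merge assumes every input stream is already sorted by the key.
merged = list(
    heapq.merge(parse("web", WEB), parse("app", APP), key=lambda r: r["ts"])
)
for record in merged:
    print(record["ts"].time(), record["source"], record["level"], record["msg"])
```

Merging lazily with `heapq.merge` avoids loading every stream into memory, at the cost of requiring each stream to be sorted beforehand. Matching events across streams would additionally need a shared identifier, which this sketch does not attempt.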

Representative publications:

  • vanden Broucke, S., De Weerdt, J., Vanthienen, J., Baesens, B. (2014). Determining process model precision and generalization with weighted artificial negative events. IEEE Transactions on Knowledge and Data Engineering, 26 (8), 1877-1889.
  • Caron, F., vanden Broucke, S., Vanthienen, J., Baesens, B. (2014). Advanced rule-based process analytics: applications for risk response decisions and management control activities. Expert Systems with Applications, accepted.
  • Seret, A., vanden Broucke, S., Baesens, B., Vanthienen, J. (2014). A dynamic understanding of customer behavior processes based on clustering and sequence mining. Expert Systems with Applications, 41 (10), 4648-4657.
  • De Weerdt, J., Schupp, A., Vanderloock, A., Baesens, B. (2013). Process mining for the multi-faceted analysis of business processes – A case study in a financial services organization. Computers in Industry, 64, 57-67.
  • De Weerdt, J., vanden Broucke, S. (2014). SECPI: searching for explanations for clustered process instances. Proceedings of the 12th International Conference on Business Process Management, BPM 2014: Vol. accepted. International Conference (BPM 2014). Haifa (Israel), 7-11 September 2014.
  • De Weerdt, J., De Backer, M., Vanthienen, J., Baesens, B. (2012). A multi-dimensional quality assessment of state-of-the-art process discovery algorithms using real-life event logs. Information Systems, 37 (7), 654-676.
  • vanden Broucke, S., Munoz-Gama, J., Carmona, J., Baesens, B., Vanthienen, J. (2014). Event-based real-time decomposed conformance analysis. On the Move Federated Conferences & Workshops: Vol. accepted. International Conference on Cooperative Information Systems (CoopIS 2014). Amantea, Calabria (Italy), 27-31 October 2014 Springer.
  • vanden Broucke, S., Vanthienen, J., Baesens, B. (2014). Declarative process discovery with evolutionary computing. 2014 IEEE Congress on Evolutionary Computation Proceedings. 2014 IEEE. Beijing (China), 6-11 July 2014 (pp. 2412-2419) IEEE.
  • De Weerdt, J., vanden Broucke, S., Vanthienen, J., Baesens, B. (2012). Active trace clustering for improved process discovery. IEEE Transactions on Knowledge and Data Engineering, accepted.
  • Goedertier, S., De Weerdt, J., Martens, D., Vanthienen, J., Baesens, B. (2011). Process discovery in event logs: an application in the telecom industry. Applied Soft Computing, 11 (2), 1697-1710.
  • Goedertier, S., Martens, D., Vanthienen, J., Baesens, B. (2009). Robust process discovery with artificial negative events. Journal of Machine Learning Research, 10, 1305-1340.
  • vanden Broucke, S., De Weerdt, J., Vanthienen, J., Baesens, B. (2013). A comprehensive benchmarking framework (CoBeFra) for conformance analysis between procedural process models and event logs in ProM. Proceedings of the IEEE Symposium on Computational Intelligence and Data Mining, CIDM 2013, part of the IEEE Symposium Series on Computational Intelligence 2013, SSCI 2013: vol. accepted. IEEE Symposium on Computational Intelligence and Data Mining (CIDM 2013). Singapore, 16-19 April 2013.
  • Caron, F., vanden Broucke, S., Vanthienen, J., Baesens, B. (2012). On the distinction between truthful, invisible, false and unobserved events. Proceedings of the 18th Americas Conference on Information Systems: vol. accepted. Americas Conference on Information Systems. Seattle, Washington (US), 9-12 August 2012.
  • De Weerdt, J., vanden Broucke, S., Vanthienen, J., Baesens, B. (2012). Leveraging process discovery with trace clustering and text mining for intelligent analysis of incident management processes. Evolutionary Computation (CEC), 2012 IEEE Congress on. Congress on Evolutionary Computation (CEC), 2012 IEEE. Brisbane (Australia), 10-15 June 2012 (pp. 1-8).
  • vanden Broucke, S., De Weerdt, J., Baesens, B., Vanthienen, J. (2012). An improved artificial negative event generator to enhance process event logs. Lecture Notes in Computer Science: vol. Accepted. International Conference on Advanced Information Systems Engineering (CAiSE’12). Gdansk (Poland), 25-29 June 2012.
  • De Weerdt, J., De Backer, M., Vanthienen, J., Baesens, B. (2011). A robust F-measure for evaluating discovered process models. CIDM. IEEE Symposium Series in Computational Intelligence 2011 (SSCI 2011). Paris (France), 11-15 April 2011 (pp. 148-155) IEEE.
  • Caron, F., Vanthienen, J., De Weerdt, J., Baesens, B. (2011). Advanced care-flow mining and analysis. In Daniel, F. (Ed.), Barkaoui, K. (Ed.), Dustdar, S. (Ed.), Lecture Notes in Business Information Processing: Vol. 99. Business Process Management Workshops (BPM 2011). Clermont-Ferrand (France), 28 August-2 September 2011 (pp. 167-168).
  • De Weerdt, J., De Backer, M., Vanthienen, J., Baesens, B. (2010). A critical evaluation study of model-log metrics in process discovery. In zur Muehlen, M. (Ed.), Su, J. (Ed.), Business Process Management Workshops: Vol. 66. Workshop on Business Process Intelligence (BPI2010). New Jersey (US), 14-16 September 2010 (pp. 158-169) Springer.
  • Goedertier, S., Martens, D., Baesens, B., Haesen, R., Vanthienen, J. (2008). Process Mining as First-Order Classification Learning on Logs with Negative Events. Lecture Notes in Computer Science: Vol. 4928. Workshop on Business Process Intelligence (BPI 07) at BPM 2007. Brisbane, Australia, 24 September 2007 (pp. 42-53) Springer.