Prof. Dr. Henrik Leopold

Assistant Professor for Data Science and Business Intelligence

Publications

Journal Articles (Peer-Reviewed)

DOI: 10.1007/s10796-017-9823-6 

Abstract: Understanding conceptual models of business domains is a key skill for practitioners tasked with systems analysis and design. Research in this field predominantly uses experiments with specific user proxy cohorts to examine factors that explain how well different types of conceptual models can be comprehended by model viewers. However, the results from these studies are difficult to compare. One key difficulty rests in the unsystematic and fluctuating consideration of model viewer characteristics (MVCs) to date. In this paper, we review MVCs used in prominent prior studies on conceptual model comprehension. We then design an empirical review of the influence of MVCs through a global, cross-sectional experimental study in which over 500 student and practitioner users were asked to answer comprehension questions about a prominent type of conceptual model - BPMN process models. As an experimental treatment, we used good versus bad layout in order to increase the variance of performance. Our results show MVC to be a multi-dimensional construct. Moreover, process model comprehension is related in different ways to different traits of the MVC construct. Based on these findings, we offer guidance for experimental designs in this area of research and provide implications for the study of MVCs.

Title: An Empirical Review of the Connection Between Model Viewer Characteristics and the Comprehension of Conceptual Process Models

DOI: 10.1016/j.is.2018.01.007 

Abstract: Textual process descriptions are widely used in organizations since they can be created and understood by virtually everyone. Because of their widespread use, they also provide a valuable source for process analysis, such as compliance checking. However, the inherent ambiguity of natural language impedes the automated analysis of textual process descriptions. While human readers can use their context knowledge to correctly understand statements with multiple possible interpretations, automated tools currently have to make assumptions about their correct meaning. As a result, compliance-checking techniques are prone to draw incorrect conclusions about the proper execution of a process. To provide a comprehensive solution to these reasoning problems, we use this paper to introduce the concept of a behavioral space as a means to deal with behavioral ambiguity in textual process descriptions. A behavioral space captures all possible interpretations of a textual process description in a systematic manner. Thus, it avoids the problem of focusing on a single, possibly incorrect interpretation. We use a quantitative evaluation with a set of 47 textual process descriptions to demonstrate the usefulness of a behavioral space for compliance checking in the context of ambiguous texts.

Title: Checking process compliance against natural language specifications using behavioral spaces

DOI: 10.1016/j.datak.2018.04.008 

Abstract: Process model matching refers to the automatic identification of corresponding activities between two process models. It represents the basis for many advanced process model analysis techniques such as the identification of similar process parts or process model search. A central problem is how to evaluate the performance of process model matching techniques. Current evaluation methods require a binary gold standard that clearly defines which correspondences are correct. The problem is that often not even humans can agree on a set of correct correspondences. Hence, evaluating the performance of matching techniques based on a binary gold standard does not take the true complexity of the matching problem into account and does not fairly assess the capabilities of a matching technique. In this paper, we propose a novel evaluation procedure for process model matching techniques. In particular, we build on the assessments of multiple annotators to define the notion of a non-binary gold standard. In this way, we avoid the problem of agreeing on a single set of correct correspondences. Based on this non-binary gold standard, we introduce probabilistic versions of precision, recall, and F-measure as well as a distance-based performance measure. We use a dataset from the Process Model Matching Contest 2015 and a total of 16 matching systems to assess and compare the insights that can be obtained by using our evaluation procedure. We find that our probabilistic evaluation procedure allows us to gain more detailed insights into the performance of matching systems than a traditional evaluation based on a binary gold standard.

Title: A probabilistic evaluation procedure for process model matching techniques

DOI: 10.1016/j.is.2016.07.010 

Abstract: Many organizations maintain textual process descriptions alongside graphical process models. The purpose is to make process information accessible to various stakeholders, including those who are not familiar with reading and interpreting the complex execution logic of process models. Despite this merit, there is a clear risk that model and text become misaligned when changes are not applied to both descriptions consistently. For organizations with hundreds of different processes, the effort required to identify and clear up such conflicts is considerable. To support organizations in keeping their process descriptions consistent, we present an approach to automatically identify inconsistencies between a process model and a corresponding textual description. Our approach detects cases where the two process representations describe activities in different orders and detects process model activities not contained in the textual description. A quantitative evaluation with 53 real-life model-text pairs demonstrates that our approach accurately identifies inconsistencies between model and text.

Title: Comparing textual descriptions to process models – The automatic detection of inconsistencies

DOI: 10.1016/j.dss.2017.02.013 

Abstract: In recent years, a considerable number of process model matching techniques have been proposed. The goal of these techniques is to identify correspondences between the activities of two process models. However, the results from the Process Model Matching Contest 2015 reveal that there is still no universally applicable matching technique and that each technique has particular strengths and weaknesses. It is hard or even impossible to choose the best technique for a given matching problem. We propose to cope with this problem by running an ensemble of matching techniques and automatically selecting a subset of the generated correspondences. To this end, we propose a Markov Logic based optimization approach that automatically selects the best correspondences. The approach builds on an adaption of a voting technique from the domain of schema matching and combines it with process model specific constraints. Our experiments show that our approach is capable of generating results that are significantly better than alternative approaches.

Title: Overcoming individual process model matcher weaknesses using ensemble matching

DOI: 10.1016/j.datak.2017.03.010 

Abstract: Process models play an important role for specifying requirements of business-related software. However, the usefulness of process models is highly dependent on their quality. Recognizing this, researchers have proposed various techniques for the automated quality assurance of process models. A considerable shortcoming of these techniques is the assumption that each activity label consistently refers to a single stream of action. If, however, activities textually describe control flow related aspects such as decisions or conditions, the analysis results of these tools are distorted. Due to the ambiguity that is associated with this misuse of natural language, humans also struggle with drawing valid conclusions from such inconsistently specified activities. In this paper, we therefore introduce the notion of canonicity to prevent the mixing of natural language and modeling language. We identify and formalize non-canonical patterns, which we then use to define automated techniques for detecting and refactoring activities that do not comply with it. We evaluated these techniques with the help of four process model collections from industry, which confirmed the applicability and accuracy of these techniques.

Title: Ensuring the canonicity of process models

DOI: 10.1016/j.is.2017.06.005 

Abstract: Monitoring process performance is an important means for organizations to identify opportunities to improve their operations. The definition of suitable Process Performance Indicators (PPIs) is a crucial task in this regard. Because PPIs need to be in line with strategic business objectives, the formulation of PPIs is a managerial concern. Managers typically start out to provide relevant indicators in the form of natural language PPI descriptions. Therefore, considerable time and effort have to be invested to transform these descriptions into PPI definitions that can actually be monitored. This work presents an approach that automates this task. The presented approach transforms an unstructured natural language PPI description into a structured notation that is aligned with the implementation underlying a business process. To do so, we combine Hidden Markov Models and semantic matching techniques. A quantitative evaluation on the basis of a data collection obtained from practice demonstrates that our approach works accurately. Therefore, it represents a viable automated alternative to an otherwise laborious manual endeavor.

Title: Transforming unstructured natural language descriptions into measurable process performance indicators using Hidden Markov Models

DOI: 10.1109/MS.2015.81 

Abstract: Many organizations use business process models to document business operations and formalize business requirements in software-engineering projects. The Business Process Model and Notation (BPMN), a specification by the Object Management Group, has evolved into the leading standard for process modeling. One challenge is BPMN's complexity: it offers a huge variety of elements and often several representational choices for the same semantics. This raises the question of how well modelers can deal with these choices. Empirical insights into BPMN use from the practitioners' perspective are still missing. To close this gap, researchers analyzed 585 BPMN 2.0 process models from six companies. They found that split and join representations, message flow, the lack of proper model decomposition, and labeling were related to quality issues. They give five specific recommendations on how to avoid these issues.

Title: Learning from Quality Issues of BPMN Models from Industry

DOI: 10.1016/j.jss.2015.06.007 

Abstract: Although several approaches for service identification have been defined in research and practice, there is a notable lack of fully automated techniques. In this paper, we address the problem of manual work in the context of service derivation and present an approach for automatically deriving service candidates from business process model repositories. Our approach leverages semantic technology in order to derive ranked lists of useful service candidates. An evaluation of the approach with three large process model collections from practice indicates that the approach can effectively identify useful services with hardly any manual effort. The evaluation further demonstrates that our approach can address varying degrees of service cohesion by applying different aggregation mechanisms. Hence, the presented approach represents a useful artifact for enabling business and IT managers to quickly spot reuse potential in their company. In addition, our approach improves the alignment between business and IT. As the ranked service candidates give a good impression of the relative importance of a business operation, they can provide companies with first clues on where IT support is needed and where it could be reduced.

Title: Automatic service derivation from business process model repositories via semantic technology

DOI: 10.1109/TSE.2015.2396895 

Abstract: System-related engineering tasks are often conducted using process models. In this context, it is essential that these models do not contain structural or terminological inconsistencies. To this end, several automatic analysis techniques have been proposed to support quality assurance. While formal properties of control flow can be checked in an automated fashion, there is a lack of techniques addressing textual quality. More specifically, there is currently no technique available for handling the issue of lexical ambiguity caused by homonyms and synonyms. In this paper, we address this research gap and propose a technique that detects and resolves lexical ambiguities in process models. We evaluate the technique using three process model collections from practice varying in size, domain, and degree of standardization. The evaluation demonstrates that the technique significantly reduces the level of lexical ambiguity and that meaningful candidates are proposed for resolving ambiguity.

Title: Automatic Detection and Resolution of Lexical Ambiguity in Process Models

DOI: 10.1109/TSE.2014.2327044 

Abstract: The design and development of process-aware information systems is often supported by specifying requirements as business process models. Although this approach is generally accepted as an effective strategy, it remains a fundamental challenge to adequately validate these models given the diverging skill set of domain experts and system analysts. As domain experts often do not feel confident in judging the correctness and completeness of process models that system analysts create, the validation often has to regress to a discourse using natural language. In order to support such a discourse appropriately, so-called verbalization techniques have been defined for different types of conceptual models. However, there is currently no sophisticated technique available that is capable of generating natural-looking text from process models. In this paper, we address this research gap and propose a technique for generating natural language texts from business process models. A comparison with manually created process descriptions demonstrates that the generated texts are superior in terms of completeness, structure, and linguistic complexity. An evaluation with users further demonstrates that the texts are very understandable and effectively allow the reader to infer the process model semantics. Hence, the generated texts represent a useful input for process model validation.

Title: Supporting Process Model Validation through Natural Language Generation

DOI: 10.1016/j.is.2013.06.007 

Abstract: The increased adoption of business process management approaches, tools, and practices has led organizations to accumulate large collections of business process models. These collections can easily include from a hundred to a thousand models, especially in the context of multinational corporations or as a result of organizational mergers and acquisitions. A concrete problem is thus how to maintain these large repositories in such a way that their complexity does not hamper their practical usefulness as a means to describe and communicate business operations. This paper proposes a technique to automatically infer suitable names for business process models and fragments thereof. This technique is useful for model abstraction scenarios, as for instance when user-specific views of a repository are required, or as part of a refactoring initiative aimed to simplify the repository’s complexity. The technique is grounded in an adaptation of the theory of meaning to the realm of business process models. We implemented the technique in a prototype tool and conducted an extensive evaluation using three process model collections from practice and a case study involving process modelers with different experience.

Title: Simplifying process model abstraction: Techniques for generating model names

DOI: 10.1016/j.dss.2013.06.014 

Abstract: Companies increasingly use business process modeling for documenting and redesigning their operations. However, due to the size of such modeling initiatives, they often struggle with the quality assurance of their model collections. While many model properties can already be checked automatically, there is a notable gap of techniques for checking linguistic aspects such as naming conventions of process model elements. In this paper, we address this problem by introducing an automatic technique for detecting violations of naming conventions. This technique is based on text corpora and independent of linguistic resources such as WordNet. Therefore, it can be easily adapted to the broad set of languages for which corpora exist. We demonstrate the applicability of the technique by analyzing nine process model collections from practice, including over 27,000 labels and covering three different languages. The results of the evaluation show that our technique yields stable results and can reliably deal with ambiguous cases. In this way, this paper provides an important contribution to the field of automated quality assurance of conceptual models.

Title: Detection of naming convention violations in process models for different languages

DOI: 10.1016/j.is.2012.01.004 

Abstract: Large corporations increasingly utilize business process models for documenting and redesigning their operations. The extent of such modeling initiatives with several hundred models and dozens of often hardly trained modelers calls for automated quality assurance. While formal properties of control flow can easily be checked by existing tools, there is a notable gap for checking the quality of the textual content of models, in particular, its activity labels. In this paper, we address the problem of activity label quality in business process models. We designed a technique for the recognition of labeling styles, and the automatic refactoring of labels with quality issues. More specifically, we developed a parsing algorithm that is able to deal with the shortness of activity labels, which integrates natural language tools like WordNet and the Stanford Parser. Using three business process model collections from practice with differing labeling style distributions, we demonstrate the applicability of our technique. In comparison to a straightforward application of standard natural language tools, our technique provides much more stable results. As an outcome, the technique shifts the boundary of process model quality issues that can be checked automatically from syntactic to semantic aspects.

Title: On the refactoring of activity labels in business process models

DOI: 10.18417/emisa.6.1.2 

Abstract: Quality assurance is a serious issue for large-scale process modelling initiatives. While formal control flow analysis has been extensively studied in prior research, there is little work on how the textual content of a process model and its activity labels can be systematically analysed. In this context, it is a major challenge to systematically identify and to consequently assure high label quality. As many large process model collections contain more than a thousand models, each including several activity labels, there is a strong need for an automatic detection of labels that might be of bad quality. Recent research has shown that different grammatical styles correlate with potential ambiguity of a label. In this paper, we propose an algorithm for recognition of activity labeling styles. The developed algorithm exploits natural language processing techniques, e.g., part of speech tagging and analysis of the grammatical structure. We also study how ontologies, like WordNet, can support the solution. We conduct a thorough evaluation of the developed techniques utilising about 6,000 activity labels from the SAP Reference Model. The evaluation of this algorithm shows that spurious labels can be identified with a significant level of precision and recall. In this way, our approach can be used as a means of quality assurance for process repository management by listing bad quality labels, which a human modeler should correct.

Title: Recognising Activity Labeling Styles in Business Process Models

Journal Articles (Professional)

DOI: 10.1016/j.infsof.2017.08.009 

Abstract: Context: The analysis of requirements for business-related software systems is often supported by using business process models. However, the final requirements are typically still specified in natural language. This means that the knowledge captured in process models must be consistently transferred to the specified requirements. Possible inconsistencies between process models and requirements represent a serious threat for the successful development of the software system and may require the repetition of process analysis activities. Objective: The objective of this paper is to address the problem of inconsistency between process models and natural language requirements in the context of software development. Method: We define a semi-automated approach that consists of a process model-based procedure for capturing execution-related data in requirements models and an algorithm that takes these models as input for generating natural language requirements. We evaluated our approach in the context of a multiple case study with three organizations and a total of 13 software development projects. Results: We found that our approach can successfully generate well-readable requirements, which do not only positively contribute to consistency, but also to the completeness and maintainability of requirements. The practical use of our approach to identify a suitable subcontractor on the market in 11 of the 13 projects further highlights the practical value of our approach. Conclusion: Our approach provides a structured way to obtain high-quality requirements documents from process models and to maintain textual and visual representations of requirements in a consistent way.

Title: A semi-automated approach for generating natural language requirements documents based on business process models

Abstract: Process modeling has become an essential part of many organizations for documenting, analyzing and redesigning their business operations and to support them with suitable information systems. In order to serve this purpose, it is important for process models to be well grounded in formal and precise semantics. While behavioural semantics of process models are well understood, there is a considerable gap of research into the semantic aspects of their text labels and natural language descriptions. The aim of this paper is to make this research gap more transparent. To this end, we clarify the role of textual content in process models and the challenges that are associated with the interpretation, analysis, and improvement of their natural language parts. More specifically, we discuss particular use cases of semantic process modeling to identify 25 challenges. For each challenge, we identify prior research and discuss directions for addressing them.

Title: 25 Challenges of Semantic Process Modeling

Books

Conference Proceedings

Abstract: Many use cases in business process management rely on the identification of correspondences between process models. However, the sparse information in process models makes matching a fundamentally hard problem. Consequently, existing approaches yield a matching quality which is too low to be useful in practice. Therefore, we investigate incorporating user feedback to improve matching quality. To this end, we examine which information is suitable for feedback analysis. On this basis, we design an approach that performs matching in an iterative, mixed-initiative manner: we determine correspondences between two models automatically, let the user correct them, and analyze this input to adapt the matching algorithm. Then, we continue with matching the next two models, and so forth. This approach improves the matching quality, as showcased by a comparative evaluation. From this study, we also derive strategies on how to maximize the quality while limiting the additional effort required from the user.

Title: Listen to Me: Improving Process Model Matching through User Feedback

Book Chapters

DOI: 10.1007/978-3-642-36926-1_34 

Abstract: Process model similarity has developed into a prolific field of investigation. This paper summarizes the research after the CAiSE 2008 paper on this topic. We identify categories of problems and provide an outlook on future directions.

Title: A Short Survey on Process Model Similarity