Analysis of Decision Model and Notation tooling in the Visual Studio Code ecosystem - Masaryk University
Masaryk University Faculty of Informatics Analysis of Decision Model and Notation tooling in the Visual Studio Code ecosystem Master’s Thesis Bc. Marcel Mráz Brno, Spring 2021
This is where a copy of the official signed thesis assignment and a copy of the Statement of an Author is located in the printed version of the document.
Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Bc. Marcel Mráz

Advisor: Bruno Rossi, PhD

Acknowledgements

I would like to express my gratitude to my thesis advisor Bruno Rossi, PhD, for valuable guidance during consultations. Also, I would like to thank Adacta for the opportunity to combine study and work life. Additionally, I would like to thank my family and close ones for their support.
Abstract

Decision Model and Notation (DMN) is a standard specifying a visual notation and a lower-level Friendly Enough Expression Language (FEEL) for the definition of interchangeable and executable decision models. The definition and maintenance of such models is continuously being improved by the emerging tooling ecosystem built around DMN graphical editors. One of the requested features inside such editors is the ability to provide language features, such as DMN model validation on each content change and FEEL completion on request. At the same time, the Language Server Protocol (LSP) offers a convenient way to provide such language features for text-based languages and specifications across a number of different development tools and text-based editors, such as Visual Studio Code. In addition to text-based editors, Visual Studio Code offers a way to create custom graphical editors, which makes it possible to embed a DMN graphical editor inside its user interface. This thesis reviews the concepts of DMN and its validations, LSP and the related Visual Studio Code API for graphical editors, and researches the questions related to providing DMN language features through LSP inside the embedded graphical editors. As a result, an architecture using LSP for providing language features inside embedded DMN graphical editors is proposed. The proposed solution architecture is specifically designed to address related domain problems in the context of commercial, enterprise-level insurance software.
Keywords

DMN, FEEL, LSP, VS Code, Extension API, Custom Editor API, Webview API, Node.js, Software Architecture, Static Code Analysis, Adacta, AdInsure Studio
Contents

1 Introduction
  1.1 Problem domain
  1.2 Thesis statement
  1.3 Research questions
  1.4 Thesis structure
2 Decision Model and Notation
  2.1 Overview
  2.2 Use cases
  2.3 Outline of specification
    2.3.1 FEEL
    2.3.2 DRD and its elements
    2.3.3 Boxed expression types
    2.3.4 Conformance level
  2.4 Related standards
    2.4.1 CMMN
    2.4.2 BPMN
    2.4.3 PMML
3 DMN model validation
  3.1 Market overview
    3.1.1 Vendors and involved parties
  3.2 Validation breakdown
    3.2.1 Validation against the schema
    3.2.2 Validation of DRD and its elements
    3.2.3 Validation during compilation process
    3.2.4 Validation of FEEL expressions
    3.2.5 Validation of a decision table
    3.2.6 Dynamic validation
4 Language Server Protocol
  4.1 Overview
  4.2 Architecture
    4.2.1 Client, server and their capabilities
    4.2.2 JSON-RPC
    4.2.3 Communication flow
    4.2.4 Limitations
    4.2.5 Conclusion
5 Visual Studio Code
  5.1 Overview
  5.2 Electron.js
  5.3 Extension API
    5.3.1 TextDocument
    5.3.2 Text editor
    5.3.3 Webview API
    5.3.4 Custom Editor API
    5.3.5 Language extensions
    5.3.6 Conclusion
6 Proposed solution
  6.1 Requirements
    6.1.1 Functional requirements
    6.1.2 Quality attributes
    6.1.3 Technical constraints
    6.1.4 Business constraints
  6.2 Architecture
    6.2.1 Overview
    6.2.2 LSP language server
    6.2.3 DMN analysis
    6.2.4 DMN custom text editor
    6.2.5 Validation process example
    6.2.6 Possible integrations
7 Conclusion
  7.1 Research questions evaluation
  7.2 Future work
A List of Abbreviations
Bibliography
List of Figures

1.1 Hierarchy of explained AdInsure concepts
1.2 Visualisation of references between the research questions
1.3 Visualisation of the thesis outline
2.1 DRD and its elements
2.2 DMN conformance levels hierarchy
2.3 Linking business automation with machine learning
4.1 Interactions between a user, LSP client and LSP server
5.1 Electron.js application architecture
5.2 Class diagram showing conceptual relations between the mentioned concepts
5.3 SDK for development of VS Code's LSP-based server
6.1 Dependencies between the separate packages
6.2 Main conceptual relations inside and outside of the language server
6.3 Main conceptual infrastructure of DMN analysis package and its dependencies
6.4 DMN analysis package infrastructure and its dependencies
6.5 Integration architecture of utilizing multiple language servers for multiple custom text editors
1 Introduction

This chapter focuses on explaining the problem domain and related context. Based on the context, research questions are defined, and the overall structure of the thesis is explained and visualised.
1.1 Problem domain

The problem is defined in the context of two systems: AdInsure [1] and AdInsure Studio [2]. Both are software products developed by Adacta, a Slovenia-based company with offices across major European cities: its headquarters in Ljubljana and further offices in Maribor, Belgrade, Zagreb, Moscow and Brno. With more than 30 years of experience, Adacta is a software provider for the insurance industry.

AdInsure is an insurance platform whose newest version is designed, among other things, with configurability in mind, allowing configuration of all of its infrastructural and business elements. AdInsure Studio is what makes the configuration of AdInsure and all of its elements quick, convenient and business-user friendly. AdInsure Studio supports the entire configuration lifecycle of insurance products and business processes, mainly through its Visual Studio Code extension client¹, which supports multiple authoring modes and targets a broad audience ranging from developers and testers to business analysts and actuaries.

There are two AdInsure Studio extension authoring modes relevant to the problem²:

• Basic mode (default): A mode focused on business users and configuration of products and processes using custom-made Graphical User Interface-centered (GUI-centered) editors and business-user-friendly explorers.
• Expert mode: A mode focused on AdInsure domain experts, such as configurators and developers, allowing the configuration of products and processes using plain built-in text editors.

Many configuration concepts³ defined by the AdInsure platform are supported by both of these modes, one such concept being a business rule configuration. One way to configure a business rule in AdInsure using the AdInsure Studio extension is the Decision Model and Notation (DMN), a higher-level visual language focused on the definition of business decisions. In terms of AdInsure Studio modes, this means editing the DMN file using either a GUI-centered editor (a custom DMN editor or, in Adacta's Ubiquitous Language⁴, the so-called Rule editor) or a built-in text editor (Monaco editor⁵).

Figure 1.1: Hierarchy of explained AdInsure concepts

¹ The other AdInsure Studio client is a Command Line Interface (CLI) for automated purposes, such as use during continuous integration and continuous delivery (CI/CD).
² The third mode is called "Accelerated mode" and is used for rapid configuration of products and processes using a scaffolding technique.
³ All such concepts are defined as text files, structured mostly using the following formats: .js, .json, .xml, .csv.
⁴ Ubiquitous Language is a term for a common language shared between developers and other stakeholders, used by Eric Evans in Domain Driven Design [3].
⁵ Monaco editor is a standalone text editor that powers VS Code [4].
The problem is that neither the custom Rule editor nor AdInsure Studio provides any DMN model validations. From the configuration perspective, this means that it is possible to create an invalid DMN model (and thus an invalid business rule), which might cause problems during its evaluation in the AdInsure run-time environment. Issues detected only in the run-time environment can be very costly and prolong the development time, testing time, and overall time-to-market for the given product, which are exactly the attributes that DMN and AdInsure Studio aim to improve.

As it is possible to create an invalid DMN model and pass it to the run-time environment, it is necessary to detect potential problems as soon as possible. In terms of AdInsure Studio, this means detecting such problems within the VS Code extension during the DMN model authoring phase in both related modes (Basic and Expert).⁶ Moreover, AdInsure Studio already includes the concept of validations for other configuration concepts specified by the AdInsure platform. Problem detection for the DMN model should therefore be performed in a similar manner and be integrable with the existing validations. However, as these validations are performed within the same process that the AdInsure Studio extension runs in, they are starting to cause performance issues.

At the same time, many VS Code extensions have adopted a standard focused on providing language features for different types of text-based languages. This standard, the Language Server Protocol (LSP), unifies the way language features (such as diagnostics, completion and others) are provided across different text editors and Integrated Development Environments (IDEs). Such language features help with problem detection on multiple levels and provide a much larger set of features than the existing AdInsure Studio validations. Moreover, many editors, such as VS Code, automatically include support for LSP in their built-in text editors (Monaco editor) and other UI parts (such as the Problems panel [5]).

The adoption of LSP within AdInsure Studio would mean providing the validations as LSP diagnostics with automatic support in built-in text editors (Expert mode). It would also open possibilities for implementing other language features (such as completion) for DMN-based business rules and various other AdInsure configuration concepts. Such features could improve problem detection even further (by lowering the number of problems created in the first place) and provide a better user experience in the authoring phase over time. Moreover, an LSP-compliant solution by design processes all of the work in a separate LSP server process. This would automatically address the need to improve overall performance by offloading all AdInsure Studio validations into a separate process.

On the other hand, the adoption of LSP raises multiple questions related to its use within the Basic mode, the CLI use case, DMN model validation, the overall AdInsure Studio validations architecture and others. All of these questions are defined in Section 1.3.

⁶ As AdInsure Studio also supports a CLI client (e.g. for CI/CD purposes), such problems should be detectable on this level too.

1.2 Thesis statement

The main goal of the thesis is to analyse DMN language features, such as validation and completion, in the context of the Visual Studio Code Extension API, its custom Webview-based editors and the Language Server Protocol. The knowledge gathered in the analysis phase should be used to create a proposed solution architecture for providing such features inside VS Code's graphical editors. The proposed solution is specifically designed to address related domain problems in the context of commercial, enterprise-level insurance software named AdInsure Studio.

1.3 Research questions

As described in the problem domain section, LSP solves some major problems automatically. On the other hand, it raises additional questions in the context of AdInsure Studio and the DMN specification. The purpose of this thesis is to answer all of the research questions specified below. It does so by carefully and independently analysing and reviewing the concepts of DMN and its validations, LSP and VS Code. Based on the gathered knowledge, a solution addressing all of the issues is proposed, combining the concepts covered in the previous sections.
Figure 1.2: Visualisation of references between the research questions

Questions

Q1 Can LSP be used to exchange information with GUI-based editors in VS Code, such as the Rule (DMN) editor?
Q2 Is it possible to use LSP to retrieve DMN model validation results?
Q3 What types of DMN model validations can actually be provided?
Q4 Are there any third-party DMN model validators, and is it possible to integrate them?
Q5 While using LSP for communication with the editor, is it still possible to provide validations using the CLI?
1.4 Thesis structure

Figure 1.3: Visualisation of the thesis outline

Chapter 1 provides a detailed examination of the problem domain, specifies the research questions, and outlines the thesis structure.

Chapter 2 reviews the DMN model specification. It focuses on the possible use cases, the key elements specified by the standard and related specifications.

Chapter 3 focuses on DMN model validations. Based on market research of DMN vendors, tools, and involved parties, it provides an overview of DMN model validation types.

Chapter 4 reviews the LSP, its architecture, related concepts, supported features and discovered limitations.

Chapter 5 reviews the VS Code concepts related to the LSP. The first part focuses on the overall architecture and the framework used. The second part reviews the essential parts of the VS Code Extension API used in the proposed solution.

Chapter 6 provides functional and non-functional requirements and other constraints for the end system. Based on the requirements and constraints, a solution that addresses the research questions is proposed.
Chapter 7 concludes the provided results and answers the research questions. This chapter also suggests potential improvements as follow-up activities.
2 Decision Model and Notation

This chapter provides an overview of Decision Model and Notation, its use cases, specification and related standards.
2.1 Overview

"DMN is a modelling language and notation for the precise specification of business decisions and business rules. DMN is easily readable by the different types of people involved in decision management. These include business people who specify the rules and monitor their application; business analysts." [6]

Decision Model and Notation (DMN) is a standard, released in September 2015 and maintained by the Object Management Group (OMG), which stands behind many popular and ISO-ratified standards, such as Business Process Model and Notation (BPMN), Case Management Model and Notation (CMMN), Unified Modeling Language (UML), Common Object Request Broker Architecture (CORBA) and more. DMN is a standard for defining repeatable, interchangeable and executable decision models, focusing on providing a common and easily understandable notation for a wide range of end users.

The primary goal of DMN is to provide a standardized bridge between those who create and maintain business decisions and those who implement and automate those decisions. This removes the gap between technical users and business users and reduces the risk of misinterpretation or other communication misunderstandings, resulting in more effective collaboration and faster production changes. The secondary goal is to provide standardized decision models that can be interchanged and reused across different tools and organizations, thanks to the unified XML specification.
2.2 Use cases

In traditional approaches and processes, any maintenance or change of business decisions kept in code can be a slow process involving many people along the way. A change to a business decision may have to pass through business analysts, domain experts, programmers, testers and possibly other parties. Such a chain of people prolongs the whole design, development and deployment procedure and can lead to a slower time-to-market response, with possible misunderstandings of business needs along the way. DMN tries to shorten this chain and separate the responsibilities so that the involved parties can focus on things within their domain of expertise. For business analysts and domain experts, this could mean designing business decisions within the DMN visual model; for technical users, it could mean operational support and integration of the DMN model with related services.

From a business perspective, DMN could be used in any area requiring rather complex, automated decision making based on business rules. Numerous areas satisfy these conditions, ranging from healthcare applications to blockchain smart contracts [7] [8]. Other examples are financial systems used by institutions such as banks or insurance companies. These institutions require countless automated operational decisions per day, such as calculating the maximum amount of a loan or mortgage that could be granted to a customer, or determining an insurance premium or premium reserves based on the provided data.

Even though DMN was designed to be understandable for a wide range of business users, it offers enough possibilities to cover most business needs without noticeable limitations. For example, DMN enables integration with almost any external service through the business knowledge model element, and compatibility with the mature PMML (Predictive Model Markup Language) standard enables the integration of predictive machine learning models within DMN itself.

From a more technical perspective, DMN could be used on the following levels [9]:

1. Definition of a manual decision-making model.
2. Definition of requirements for an automated decision-making model.
3. Definition and implementation of an executable decision-making model.

The third level is what really differentiates DMN from requirements languages. DMN is not just another requirements language, and compared with classical business rules tools, DMN prescribes much more than just a definition of decision requirements and decision logic. With DMN, there is no need to create a technical specification of the decision model and then translate it manually to a specific programming language. DMN is not designed to produce just documentation for technical users; it is designed with executability in mind. This means that, with the help of an appropriate tool, a DMN model can be deployed, integrated or executed straight after being authored. In other words, DMN strictly specifies execution semantics for a higher-level visual language, which can then be source-to-source compiled to a specific programming language, such as Java or JavaScript, producing self-contained executable code.

"DMN is a business-oriented, tool-independent, executable decision language." [10]

There are a couple of ways to execute a decision model. One possible way is to invoke a decision model as a service and have it integrated internally as part of the system. This approach might be beneficial for applications that lean towards a monolithic architecture. Such integration comes with the typical benefits of monolithic applications: tightly coupled components usually running in one process, resulting in overall fast performance and a small number of deployments. Another possible approach is to use decision models as independently deployed decision services, leaning towards a service-oriented architecture with loosely coupled components, a large number of deployments and benefits such as horizontal scalability, self-maintainability, isolation, resiliency and more. Each decision service can then be hosted in the cloud, exposed over an automatically generated API, deployed directly from a Business Rules Management System (BRMS) and even invoked by a Business Process Management System (BPMS).
2.3 Outline of specification

"DMN's value proposition - Help all stakeholders understand a complex domain of decision-making using easily readable diagrams. Provide a natural basis for discussion and agreement on the scope and nature of business decision-making. Reduce the effort and risk of decision automation projects through graphical decomposition of requirements. Allow business rules to be defined simply and reliably in unambiguous decision tables. Simplify development of decisioning systems using specifications that may be automatically validated and executed. Provide a structured context for the development and management of predictive analytic models. Enable the development of a library of reusable decision-making components." [11]

DMN is a visual language and notation designed for business users, with the logic defined in hierarchical diagrams consisting of inputs, decisions, decision services, business knowledge models and knowledge sources. Together, these form a logical, structural view on top of the whole DMN model, called the Decision Requirements Diagram (DRD). Inputs are oval nodes in the DRD and are simply the data coming into the decisions. Individual decisions are rectangular nodes in the diagram. Each decision node takes some inputs and, based on some decision logic, returns the corresponding outputs. A decision service, the overlay rectangle containing other decisions, is a top-level decision that can be invoked as a standalone service from an external application or from a business process inside BPMN. Business knowledge models, rectangular nodes with clipped corners, introduce the concept of reusability and integration with external services. Last but not least, knowledge sources, note-like nodes, refer to external documents such as documentation, policies, regulations or other real-world factors.
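The DRD described above can be viewed as a directed graph of typed nodes connected by requirement edges. The following is a minimal sketch of such a representation, written in Python purely for illustration. The names `NodeKind`, `Drd` and `requirements_of`, as well as the sample node names, are hypothetical and are not taken from the DMN specification or from any DMN tool; decision services are left out for brevity.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    INPUT_DATA = "input data"
    DECISION = "decision"
    KNOWLEDGE_MODEL = "business knowledge model"
    KNOWLEDGE_SOURCE = "knowledge source"


@dataclass
class Drd:
    """A DRD modelled as a directed graph (hypothetical structure)."""

    nodes: dict[str, NodeKind] = field(default_factory=dict)
    # Maps a node name to the names of the nodes it directly requires.
    requires: dict[str, list[str]] = field(default_factory=dict)

    def add_node(self, name: str, kind: NodeKind) -> None:
        self.nodes[name] = kind
        self.requires.setdefault(name, [])

    def add_requirement(self, node: str, required: str) -> None:
        if node not in self.nodes or required not in self.nodes:
            raise KeyError("both ends of a requirement must be known nodes")
        self.requires[node].append(required)

    def requirements_of(self, name: str) -> set[str]:
        """Return every node that `name` depends on, directly or transitively."""
        seen: set[str] = set()
        stack = list(self.requires[name])
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.requires[current])
        return seen


# Hypothetical insurance-flavoured example.
drd = Drd()
drd.add_node("Applicant age", NodeKind.INPUT_DATA)
drd.add_node("Risk category", NodeKind.DECISION)
drd.add_node("Premium", NodeKind.DECISION)
drd.add_node("Tariff table", NodeKind.KNOWLEDGE_MODEL)
drd.add_requirement("Risk category", "Applicant age")
drd.add_requirement("Premium", "Risk category")
drd.add_requirement("Premium", "Tariff table")
```

With this structure, asking which elements the "Premium" decision depends on amounts to a simple graph traversal, which is the kind of question a DRD answers visually.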
As a part of the DMN standard, a dedicated expression language called Friendly Enough Expression Language (FEEL) was introduced to provide a familiar and business-friendly way of writing simple expressions and decision logic. Like DMN, FEEL is designed for business users. It is not a full-blown programming language, but it is a potent language for the definition of business rules, and it can also help in situations where arithmetic calculations or other formulas are needed. A good analogy to FEEL is the expression language used by Microsoft Excel, which targets a similar, business-oriented audience.

2.3.1 FEEL

As its name reveals, Friendly Enough Expression Language is an expression language specifically designed for business users. It is a lightweight yet powerful language that can be used inside DMN decisions and offers many features found in other expression or programming languages. Some of these features include:

• conditional and loop statements, filtering
• data types supporting booleans, numbers, strings, dates, lists, ranges, contexts and functions
• support for three-valued logic: true, false and null
• built-in functions for operating on the basic data types
• custom function definition and invocation
• four different types of scopes: built-in, global, local and special
• freedom from side effects

Compared to traditional programming languages, in expressions and corresponding expression languages it is impossible to explicitly declare a variable; it is only possible to reference a variable, not to create one¹. Michael Kay, author of a book about XPath, another expression language, describes expressions as follows:

"Every expression takes one or more values as its inputs, and produces a value as its output. . . . One of the things an expression language tries to achieve is that wherever you can use a value, you can replace it with an expression that is evaluated to produce that value. . . . This property is called composability: expressions can be used anywhere that values are permitted." [12]

¹ Unless being defined in the global or local scope. In other words, "variables" can be defined in the context of the whole DMN as an input (decision or data) or a context entry. Both are, however, constant variables.

Another significant distinction from traditional programming languages is the support for spaces in variable and function names, which makes the FEEL grammar context-sensitive and more challenging to parse [13]. However, multiple vendors have already implemented a parser for the FEEL grammar and shared it with the open-source community. The available open-source parsers include:

• an ANTLR4 (ANother Tool for Language Recognition) parser by RedHat (Java) [14],
• a Peg.js (Parsing Expression Grammar) parser by EdgeVerve (JavaScript) [15].

2.3.2 DRD and its elements

Figure 2.1: DRD and its elements (nodes shown: Decision Service, Decision 1, Decision 2, Knowledge source, Knowledge model, Input data)

A decision requirements diagram, also called a decision requirements graph, is a visual representation of a decision model. It shows, from a higher-level perspective and regardless of the actual decision logic, the connections
between individual DMN elements². From the DRD, it is clearly visible which decisions depend on which input data and which decisions depend on other decisions. It is also possible to see external elements influencing particular decisions, such as business knowledge models or knowledge sources. The following is an overview of the available DRD elements³:

• Input data: Input data is a piece of information that is provided to the DMN model at run time, but whose structure is defined at design time. Its data type should, by default, be one supported by FEEL, but it can also be a custom or an imported one [9]. To process input data, it is necessary to connect it to a decision. The arrow connecting input data and a decision (or knowledge model) is called an information requirement and is visible in Figure 2.1.
• Decision: Each decision in DMN is graphically represented by a table called a boxed expression, which defines the decision logic and has its own type and structure. Each boxed expression takes input data, either from the input data element itself or from another decision element, and, based on the specified logic and referenced knowledge models, determines the outputs. The boxed expression types are FEEL expression, Decision table, Context expression, Function, Invocation, Relation and List. More information about boxed expressions and their types is available in the next subsection.
• Decision service: A decision service is a top-level decision that defines a reusable element in the decision model. It can then be published as a standalone service, integrated into an external application, reused in another DMN model or executed in a business process inside BPMN.
• Business knowledge model: A knowledge model, also called the business knowledge model, is another reusable piece of decision logic. Compared with the decision service, it does not define a reusable element but is an endpoint for connecting one. It can depend on some sub-model, sub-decision or sub-inputs and can be used inside the decision logic. Typically, it takes some inputs and, after being internally evaluated, passes the outputs back to the decision. In addition to plain FEEL expressions and a declarative sub-model, it also supports an external function, which can call Java code or a PMML model. The arrow connecting the knowledge model and a decision is called a knowledge requirement and is visible in Figure 2.1.
• Knowledge source: A knowledge source refers to external documents such as documentation, policies, law regulations or other real-world factors. It can reference a wide range of sources, such as documents, web pages or even video or audio content. The arrow connecting a knowledge source and a decision is called an authority requirement and is visible in Figure 2.1.

² The figure showing DRD elements is inspired by a figure available on the RedHat documentation portal [16].
³ This outline of the specification is focused on DMN version 1.3, released in late 2019.

2.3.3 Boxed expression types

In DMN, all decision logic inside decisions is represented by so-called boxed expressions. A boxed expression is defined recursively, meaning that each boxed expression can contain another boxed expression, and the lowest-level expressions inside each boxed expression are FEEL expressions. In order to create a boxed expression, it is necessary to connect it to an appropriate decision. The only way to connect a decision with a boxed expression is by name, meaning that the name of a decision must correspond to the name of the top-level boxed expression. How this interaction is designed and visualized in DMN tools is, however, left to the corresponding tool. A boxed expression can be one of the following types:

• Decision table: A decision table is a graphically defined tabular boxed expression.
The basic form of a decision table contains rows and columns, where the first row specifies the individual inputs and outputs for each column, and the remaining rows specify the corresponding rules for each input (defining the decision logic in prioritised order) and the expected output. A rule's inputs are so-called Unary tests⁴, and a rule's outputs are defined as FEEL expressions. Each row thus graphically represents a simple if [Unary test] then [FEEL expression]. An additional element of a decision table, named Hit policy, can be used to influence the result and return, e.g., just the first matched row or the sum of all matched outputs.
• Boxed context: A boxed context is a map of so-called context entries (key, value pairs), where each key is an output clause name (a constant variable) and each value is another boxed expression, such as a boxed FEEL expression, a decision table or even another boxed context. A boxed context can optionally have a result clause, which can be used for additional calculation on top of the context entries.
• Boxed literal expression: A boxed literal expression, also called a boxed FEEL expression, is simply any FEEL expression used either as a standalone boxed expression or inside other boxed expressions, such as inside a decision table's cells. It is defined by the FEEL grammar as FEEL(e, s), where e stands for the FEEL expression and s stands for the given scope. The scope can include input data elements from the DRD, other input decisions, built-in FEEL functions, custom FEEL functions and variables defined as context entries in a higher scope.
• Boxed function: A boxed function is an element for the definition of a function, which can be either a FEEL function, Java code or a PMML model.
• Boxed invocation: A boxed invocation is an element used for the invocation of a function, such as those defined by a boxed function.
• Boxed list: A boxed list is just a list of n elements.
• Relation: A relation is a list of horizontal contexts with no result clause.

4 Unary tests have an additional layer of grammar defined, which, when simplified, can consist of multiple FEEL expressions separated by commas, which can be wrapped in the not() function or can equal a dash (which always evaluates to true).
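The row semantics of a decision table described above (if [Unary test] then [FEEL expression], combined by a hit policy) can be illustrated with a minimal Python sketch. It is not a FEEL implementation: unary tests are modelled as plain Python predicates, and the names `evaluate`, `dash`, `FIRST` and `COLLECT_SUM` are hypothetical labels chosen for this illustration, covering only two of the hit policies the standard defines.

```python
from typing import Any, Callable, Dict, List, Tuple

# Assumption: a unary test is modelled as a Python predicate over one input value.
UnaryTest = Callable[[Any], bool]
# One rule = one unary test per input column, plus the rule's output value.
Rule = Tuple[Dict[str, UnaryTest], Any]


def dash(_value: Any) -> bool:
    """Stand-in for the '-' unary test, which matches any input value."""
    return True


def evaluate(rules: List[Rule], inputs: Dict[str, Any], hit_policy: str = "FIRST") -> Any:
    """Evaluate the rules top to bottom and combine matches by the hit policy."""
    matched = [
        output
        for tests, output in rules
        if all(test(inputs[name]) for name, test in tests.items())
    ]
    if hit_policy == "FIRST":
        # Return only the output of the first matching row (None if no row matches).
        return matched[0] if matched else None
    if hit_policy == "COLLECT_SUM":
        # Return the sum of the outputs of all matching rows.
        return sum(matched)
    raise ValueError(f"unsupported hit policy: {hit_policy}")


# Hypothetical discount table with two inputs (age, member) and one output.
discount_rules: List[Rule] = [
    ({"age": lambda a: a < 18, "member": dash}, 10),
    ({"age": lambda a: a >= 18, "member": lambda m: m is True}, 5),
    ({"age": dash, "member": dash}, 1),
]
```

With the inputs `{"age": 16, "member": True}`, the first and the third row match, so the result depends on the hit policy: the first matched output under `FIRST`, and the sum of both outputs under `COLLECT_SUM`.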
2.3.4 Conformance level
Figure 2.2: DMN conformance levels hierarchy
The DMN specification defines three conformance levels that divide DMN tools into clearly separated groups based on their support of the standard. Conformance levels act as a certification degree defined by OMG, and each level corresponds to a strictly defined set of functionalities that the end tool should support. The highest possible level is conformance level 3, which automatically includes support for conformance levels 2 and 1 and means full support of the standard. A summary of the three conformance levels can be found below.
• Conformance level 1: First-level conformance specifies that the implementation supports the full DRD and its corresponding elements. However, the model is not executable, which means the decision logic can be defined informally in any language of choice, even an unstructured natural one.
• Conformance level 2: Second-level conformance specifies that the implementation supports everything that conformance level 1 supports. It also supports decision tables, literal expressions and a subset of FEEL expressions, called Simplified Friendly Expression Language (S-FEEL), which are, in short, simple comparisons and arithmetic expressions. This makes the decision logic structured and the model fully executable.
• Conformance level 3: Third-level conformance specifies that the implementation supports everything that conformance level 2 does. In addition, it supports the complete set of FEEL expressions and also provides support for other boxed expressions, such as function invocation. As with the second level, a decision model at this level is fully executable.
2.4 Related standards
"DMN is designed to work alongside BPMN and/or CMMN, providing a mechanism to model the decision-making associated with processes and cases. While BPMN, CMMN and DMN can be used independently, they were carefully designed to be complementary. Indeed, many organizations require a combination of process models for their prescriptive workflows, case models for their reactive activities, and decision models for their more complex, multi-criteria business rules. Those organizations will benefit from using the three standards in combination, selecting which one is most appropriate to each type of activity modelling. This is why BPMN, CMMN and DMN really constitute the "triple crown" of process improvement standards." [11]
Figure 2.3: Linking business automation with machine learning
The DMN standard can be effectively adopted and used standalone, but it is specifically designed to complement two related business-user-oriented graphical standards that are also maintained by OMG: BPMN and CMMN. Together, the three are called the "triple crown" of process improvement standards. The first version of BPMN was introduced in 2004, and it has since gone through many iterations, with the current version being 2.0.2. BPMN's primary focus is on allowing business users to define and automate business processes. CMMN, on the other hand, is a relatively new standard, with the first version introduced in 2014. It is designed explicitly for graphically representing case management, with a secondary goal of case model interchangeability among different tools. On top of that, DMN itself also allows easy integration with the mature PMML standard, allowing
cross-compatibility across various machine learning tools and predictive models. Altogether, these standards link business automation with machine learning, as shown in Figure 2.3 [17].
2.4.1 CMMN
The Case Management Model and Notation is a standard way of expressing a case. A case is a concept derived from case management, with a focus on unpredictable processes where it is impossible to prescribe a process with fixed activities. Like DMN and BPMN, it is a standard maintained by OMG. The initial version was released in 2014, and the latest supported version is currently 1.1. Similarly to DMN and BPMN, the CMMN specification focuses on a broad audience and provides a format for interchanging case models between various tools. CMMN directly expands the boundaries of what can be provided by a BPMN model. In comparison to BPMN, CMMN is not focused on structured processes with a defined set of activities but on unstructured processes with an event-centred approach and a case file concept. For example, tasks such as incident management or consulting fall into this category of dynamic, ad-hoc processes. CMMN thus covers much more than pure BPMN and complements it directly.
2.4.2 BPMN
The Business Process Model and Notation is a mature standard for modelling end-to-end business processes. Like DMN, it is maintained by OMG, and it is also a ratified ISO standard. The initial version was released in 2004 and has since gone through many iterations, with the latest supported version being 2.0.2. Similarly to DMN, the BPMN specification is aimed at a wide audience, ranging from business users to technical people. BPMN also prescribes a mapping between the graphical notation and the underlying execution logic, meaning the whole process can be automated thanks to the mapping to a unified language called Business Process Execution Language (BPEL). The main elements of BPMN are messages, which flow between different participants and activities.
The activity element can then be of multiple types. One of them is a business rule, and each business rule in a BPMN diagram can be represented as a standalone DMN model composed of multiple decisions. Many tools already support this interconnection with seamless interactions, such that clicking on a business rule activity in a BPMN diagram opens a particular DMN model inside a DMN-specific tool.
2.4.3 PMML
"PMML, administered by the Data Mining Group, is not a machine learning algorithm but a common XML format encompassing a wide variety of machine learning algorithms, including decision trees, logistic regression, support vector machine, neural networks, and others." [18]
The Predictive Model Markup Language is a mature XML-based format for the interchange of predictive models, and it is recognized as a de facto standard for representing machine learning models. The first version of PMML was released in 1998; since then, it has been constantly improved, and many tools and companies support it [19]. Besides FEEL expressions, PMML is supported by DMN out of the box, meaning DMN decisions can execute any predictive model defined in the PMML format via an imported business knowledge model element. Compared with DMN decisions, machine learning models often work as black boxes, and their inability to explain why they return certain outputs for specific inputs is increasingly seen as a problem [20]. DMN decisions, such as decision tables, are, on the other hand, extremely transparent, enabling some degree of control over the predictive model. That is one of the reasons why PMML and DMN are often seen as complementary.
3 DMN model validation
This chapter provides a brief market overview of vendors, other parties and tools involved in DMN model validation and related areas. It also provides a summary of the DMN validations provided by the listed tools and vendors.
3.1 Market overview
Since the first released version of the DMN specification, several vendors have tried to adopt the DMN standard and provide appropriate tooling for it. According to Bruce Silver, author of the book DMN Method and Style, many of the vendors did not initially fulfil the promise of a common executable decision model. Instead, they used DMN more or less as a marketing badge, providing no more than simple DRD editors with CRUD operations on top of its elements.
"Unfortunately, many proprietary decision modeling tools have appropriated the DMN name as a marketing decal without conforming to the spirit, much less the letter, of this promise. [21]"
OMG promptly realized that they needed to push the vendors in order to bring all the benefits of DMN to the end-users. The ability to compile and execute the model is no doubt essential, distinguishing DMN from any other requirements language. That is the reason they introduced three levels of conformance into the standard. As a result, vendors and the tooling they provide are not badged as simply supporting DMN but carry a label strictly defining the particular conformance level they have achieved. From a different point of view, conformance levels work as a form of certification degree and divide the available tooling into clearly separated groups.
In order to encourage the vendors to implement full support for DMN, as stated by conformance level 3, a TCK [22] (Technology Compatibility Kit) group was established to provide a set of black-box tests ensuring conformance to the specification. The TCK group is not, however, maintained by a centralized authority but by the community of vendors themselves. Therefore, its list of tools does not include all the possible tools on the market and is not necessarily fully objective. On the other hand, its goals and initiatives are clear, and the involved parties are among the most active ones in the whole DMN community.
That is also confirmed by the fact that many TCK group members can be found on the list of parties contributing to the development of the DMN specification itself [6].
The list of vendors, tools and involved parties below is not a complete market overview of all available DMN vendors and their tools and is specifically structured to achieve this thesis's goal. Nevertheless, the list was carefully filtered and chosen by the following criteria:
1. Tools or their parts are preferred to be open-source projects for the following reasons.
• They can be explored and objectively analyzed from the ground up, without relying too much on available sources or marketing materials.
• The project's overall goal is as transparent as possible, so there are clear plans for development and maintenance in the future.
• It is possible to contribute to the project, allowing one to be a part of the community and shape the project towards common objectives.
2. Tools or their parts can be used in the commercial sector, meaning their license agreement allows free commercial usage and further integrations.
3. Tools or their parts support or prescribe some level of validation on top of the DMN model itself, as it is one of the main focuses of this thesis.
4. Vendors or involved parties actively participated in the development of the DMN standard or noticeably contributed to the community.
5. Vendors or involved parties are part of the TCK group. Their tool passes as many tests as possible, meaning they are operating on the highest possible conformance level and with the latest possible DMN specification.
The first part of this section focuses on providing an overview of vendors and involved parties satisfying most of the conditions above. The second part is mostly focused on providing an overview of the chosen vendors' tools and describing their key features. The last part summarises all possible DMN model validations provided by the listed tools and vendors.
3.1.1 Vendors and involved parties
RedHat
RedHat is the only vendor satisfying all the conditions above and plays one of the main roles not just in the open-source DMN territory but in the whole world of business rules and related Business Rules Management Systems (BRMS).
"A business rule is a compact, atomic, well-formed, declarative statement about an aspect of a business that can be expressed in terms that can be directly related to the business and its collaborators, using simple unambiguous language that is accessible to all interested parties: business owner, business analyst, technical architect, customer, and so on. This simple language may include domain-specific jargon." [23]
"A BRMS or business rule management system is a software system used to define, deploy, execute, monitor and maintain the variety and complexity of decision logic that is used by operational systems within an organization or enterprise." [24]
RedHat has multiple projects related to business rules, decisions and the DMN standard, starting from the ground with the mature Drools BRMS and continuing with the RedHat Decision Manager enterprise platform, the jBPM toolkit and the Kogito project. All of the projects are mostly based on the Java ecosystem. RedHat is also a contributor to the DMN specification itself.
Drools is a mature open-source platform for decision management and business rules, licensed under the Apache License, with its history reaching back to 2001. Version 1.0 was never released due to the rule engine's performance constraints, and thus version 2.0 was actually the first released version of Drools. It contains many components, such as a business rules engine, an optimization engine and a DMN engine. On top of that, it enables end-users to work with higher-level declarative metaphors, which are closer to human language than imperative code. Examples of such metaphors could
be a DRD diagram, a decision table or a lower-level but still declarative Drools Rule Language (DRL) rule.
RedHat Decision Manager is a platform based on Drools, enabling the development and maintenance of containerized microservices that automate business decisions. It focuses on the enterprise sector and comes with the usual RedHat subscription model, offering SLA-based support, regular updates and more.
The jBPM toolkit takes a different, more traditional, code-first imperative approach and offers Java libraries for the development of business decisions. It is an open-source project, licensed under the Apache License and based on Drools.
Kogito is an open-source project, licensed under the Apache License, focused on bringing Drools, together with business automation tools such as jBPM, to the cloud environment. It also provides tooling for business automation, such as standalone BPMN and DMN editors that can run either in a browser or as embedded extensions inside Google Chrome or VS Code.
Camunda
Camunda is a company offering an open-source process and decision automation platform, licensed under the Apache License and called Camunda BPM. It provides a BPMN workflow engine and a DMN decision engine, both implemented in Java. Camunda also develops and maintains open-source, web-based editors supporting DMN and BPMN under the bpmn.io project and is a contributor to the DMN specification itself.
Trisotech
Trisotech is a company offering enterprise software for end-to-end business automation and digital transformation. One of their main products is the so-called Digital Modeling Suite, including, among other things, advanced web-based applications for creating Case models, BPMN processes and DMN Decisions.
Decision Modeler is their application for managing DMN decisions, offering many features, such as advanced static analysis on top of the DMN models, the definition of test cases, data model creation and more. They collaborate closely with RedHat and use the Drools engine behind the scenes. They are also contributors to the DMN specification itself.
EdgeVerve
EdgeVerve is the company behind the open-source project called oeCloud, a digital transformation platform and a framework for quickly building and deploying cloud-native SaaS. They are also behind the open-source js-feel package, licensed under the MIT license. This package is a JavaScript-based rule engine, enabling the execution of DMN decision tables together with full support of FEEL. They are also contributors to the DMN specification itself.
Method and style
Method and style is a trademark created by Bruce Silver, who is a major contributor to the DMN community. He is recognized as a major provider of BPMN and DMN training and certification. He is also the author of popular books, including BPMN Method and Style, DMN Method and Style, BPMN Quick and Easy and DMN Cookbook, the last co-authored with Edson Tirelli of RedHat. Moreover, he is a public speaker, an active contributor to the DMN standard and a principal consultant at Trisotech. His website, methodandstyle.com, provides materials and insights on business process management and decision modelling.
3.2 Validation breakdown
Because DMN offers a fully executable model while the DMN specification does not enforce the model to be complete or consistent, it is crucial to ensure its completeness and correctness by external tools. For example, overlaps or gaps between rules in a decision table can make the model incomplete. On the other hand, wrongly specified FEEL expressions can make the model incorrect. The DMN specification specifically allows incomplete models so that they can be interchanged between different tools or people. In addition, the specification also prescribes a standardized format for such interchange, called DMN DI, which allows, among other things, storing the positions of DRD elements. The position of each element is stored directly in the XML document, so each tool can read it and visualize the diagram exactly as it was designed in another tool.
Checking for model incompleteness is even more important because of the audience interacting with the model: business users and domain experts. These people often do not have a deep IT background and are not familiar with the basic concepts of software verification and assurance. That is why the burden of ensuring completeness and correctness of the model is shifted even more to the corresponding tools. One of the main focuses is detecting all the possible errors and bugs before run-time in a production environment. At the same time, the DMN specification is evolving relatively quickly, bringing new concepts, such as business knowledge models, decision services or data types imported from a different model, making sufficient validation support for newer versions of the specification even harder to achieve. Nevertheless, all vendors, tools and involved parties listed above provide or prescribe some degree of validation features.
The following is a summarised list of DMN model validation approaches across the tools listed in the previous section. It directly answers research question number three.
Question 3: What types of DMN model validations can actually be provided?
3.2.1 Validation against the schema
The first, and possibly the most straightforward, validation that can be performed on the DMN model is a validation of its XML, ensuring compliance and syntactical correctness of the XML file against the standard XML Schema Definition (XSD).
• compliance of the XML file against the XSD schema
3.2.2 Validation of DRD and its elements
Another validation that the vendors often perform is the semantic validation of the DRD and its elements. This type of validation typically checks for:
• correctly set references between elements, detecting cycles, wrongly connected elements or a missing top-level decision
• wrong or missing inputs and outputs between decisions
• wrong or missing input and output types between elements
• duplicate names of elements
• missing decision logic
3.2.3 Validation during compilation process
To make a DMN model executable, vendors compile it source-to-source, or transpile it, into another source language, such as Java or JavaScript. Thus, one type of validation is produced by a FEEL parser, which transforms FEEL into an Abstract Syntax Tree (AST) and reports errors from lexical and syntactic analysis based on the defined grammar. Other validations at this level are reported while transforming DMN elements into a Java or JavaScript-based codebase. These validations are hard to generalize, as they are strictly dependent on the particular transpiler and the target language.
• errors from the lexical and syntactical analysis
• other transpiler-specific errors
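Some of the DRD checks listed in Section 3.2.2 (duplicate names, wrongly set references and cycles) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the approach of any listed vendor: the DRD is assumed to be already parsed into a plain dictionary, and the structure (`name`, `requires`) as well as the function name `validate_drd` are hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Assumption: the DRD is already parsed into {element_id: {"name": ..., "requires": [...]}}.
Drd = Dict[str, dict]


def validate_drd(drd: Drd) -> List[str]:
    """Return human-readable problems found in a (hypothetical) DRD structure."""
    problems: List[str] = []

    # Duplicate names of elements.
    name_counts = Counter(element["name"] for element in drd.values())
    for name, count in sorted(name_counts.items()):
        if count > 1:
            problems.append(f"duplicate name: {name}")

    # References pointing to elements that do not exist.
    for element_id, element in sorted(drd.items()):
        for ref in element["requires"]:
            if ref not in drd:
                problems.append(f"{element_id}: unknown reference {ref}")

    # Cycle detection with a depth-first search over the requirement edges.
    unvisited, in_progress, done = 0, 1, 2
    state = {element_id: unvisited for element_id in drd}

    def reaches_cycle(node: str) -> bool:
        state[node] = in_progress
        for ref in drd[node]["requires"]:
            if ref not in drd:
                continue  # already reported as an unknown reference
            if state[ref] == in_progress:
                return True  # back edge: the requirement chain loops
            if state[ref] == unvisited and reaches_cycle(ref):
                return True
        state[node] = done
        return False

    for element_id in sorted(drd):
        if state[element_id] == unvisited and reaches_cycle(element_id):
            problems.append(f"cycle reachable from: {element_id}")
            break  # one report is enough for this sketch

    return problems


valid_drd: Drd = {
    "in1": {"name": "Age", "requires": []},
    "d1": {"name": "Discount", "requires": ["in1"]},
}
broken_drd: Drd = {
    "d1": {"name": "Discount", "requires": ["d2", "missing"]},
    "d2": {"name": "Discount", "requires": ["d1"]},
}
```

For `valid_drd` the function returns an empty list, while `broken_drd` yields one problem for each of the three checks. A real tool would additionally map each problem back to a position in the diagram or the XML document.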
3.2.4 Validation of FEEL expressions
In FEEL and its corresponding grammar, not all syntactically correct expressions are valid, meaning additional semantic and logical analysis needs to be performed. Validation of FEEL expressions is similar to static analysis performed on any other programming or expression language: after FEEL is parsed and the AST is created, its model can be further processed and analyzed. One of the problems is that FEEL is a dynamic language, and the DMN specification does not enforce the definition of input and output data types. It is then up to the tool to enforce explicit data types for the input and output data of decisions. The definition of data types then influences not just the static analyzer and available validations but also other language features, such as context-aware IntelliSense or go-to definition.
• semantic and logical FEEL problems
• undefined variables or input data in FEEL expressions
• wrongly specified types of built-in FEEL expressions, FEEL functions, input and output data, variables and their usage inside FEEL expressions
• code maintainability enhancements based on specifically defined rules
3.2.5 Validation of a decision table
Validations of decision tables are described in depth in the book DMN Method and Style by Bruce Silver [25], in which he introduced numerous possible validations on top of decision tables, ensuring their completeness and better maintainability. The following is a summary of such validations with accompanying rules.
• gaps between rules
• overlaps between rules
• conflicts between rules
• unused rules
• missing columns
• missing ranges
• cycles in the rules
• subsumption between rules, meaning two rules could be combined and the table contracted
• other best practices, such as guessing the "best" hit policy based on the defined rows
3.2.6 Dynamic validation
Dynamic validation and analysis mainly comprise tests performed either on top of the individual decisions and knowledge models (unit tests) or on the whole DRD diagram (integration tests). Another beneficial type is regression tests, which compare the results of the same model as it changes over time, showing the differences in decision logic between two versions of the same model. Another type of test, proposed by Bruce Silver and supported by Drools [20], is auto-generated Modified Condition/Decision Coverage (MC/DC) tests, which check for as many problems in the decision tables as possible with as few auto-generated tests as possible. Many tools provided by the vendors above have test components directly integrated into their editor user interfaces, enabling a business-user-friendly, declarative definition of test cases.
• individual decision or knowledge model tests
• whole DRD tests, testing the interaction between individual elements
• regression tests between two versions of the same model
• MC/DC auto-generated tests
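The gap and overlap checks from Section 3.2.5 can be illustrated for the simplest case: a decision table with a single numeric input column. The sketch below is a simplification under explicit assumptions, not the algorithm used by any of the listed tools: each rule's unary test is reduced to a half-open interval [low, high) lying inside the input domain, and the function name `find_gaps_and_overlaps` is hypothetical.

```python
from typing import List, Tuple

# Assumption: each rule's unary test is a half-open interval [low, high).
Interval = Tuple[float, float]


def find_gaps_and_overlaps(
    rules: List[Interval], domain: Interval
) -> Tuple[List[Interval], List[Interval]]:
    """Sweep the rule intervals in order and report uncovered and doubly covered ranges."""
    gaps: List[Interval] = []
    overlaps: List[Interval] = []
    cursor = domain[0]  # everything below the cursor is already covered
    for low, high in sorted(rules):
        if low > cursor:
            # No rule matches inputs in [cursor, low): the table is incomplete.
            gaps.append((cursor, low))
        elif low < cursor:
            # This rule starts before the previous coverage ends.
            overlaps.append((low, min(high, cursor)))
        cursor = max(cursor, high)
    if cursor < domain[1]:
        gaps.append((cursor, domain[1]))
    return gaps, overlaps


# Hypothetical "age" column over the domain [0, 100).
age_rules = [(0, 18), (18, 65), (60, 80)]
gaps, overlaps = find_gaps_and_overlaps(age_rules, (0, 100))
```

Here `gaps` contains the range from 80 to 100, which no rule covers, and `overlaps` contains the range from 60 to 65, which two rules cover. Whether an overlap is an actual problem depends on the hit policy of the table; tables with several input columns require comparing combinations of intervals rather than a single sweep.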