środa, 13 grudnia 2017

Be lazy with your data

jBPM comes with really handy feature called pluggable variable persistence strategy (or shorter marshalling strategies). This is the mechanism responsible for persisting process instance and task variables to their data store - whatever kind of data store you use.

An important thing to keep in mind when using marshalling strategies is the fact that each time variable is read it will read that from the data store which might be:

  • file system
  • data base
  • REST service
  • document management system e.g. ECM
  • and more
depending on the accessibility of this service and its performance this might not be a big deal but if that data is loaded many times it might become an issue. Especially if the data is of fair size like documents. 
Imagine process instance operates on bunch of documents (word, pdf, etc). Each of this document is of 2MB size which will give 20 MB for 10 documents. That means every time user interact with this process instance all documents will be loaded from external system. And this is not a problem (or it is actually how it should behave) when these documents are needed for the work user intend to perform. 

But what if (s)he is not going to use all documents or even not a single one? Or if it's not a user but an timer is firing of to send reminder to users?

Should all documents be loaded then? Obviously it does not make much sense though the problem with this decision is how would process instance know if given data will be used or not? And because of that it simply loads all variables whenever process instance (or task) is loaded.
This is default mechanism for process instance and task variables that are stored as part of them as it does not make much sense to separate them since their life cycle is bound to each other - always loaded and stored together. So this does not bring any overhead or additional communication effort.

Though situation looks slightly different for variables that are actually stored externally - like physical documents stored in ECM system. In this case, documents (including their content - 2MB each) will be loaded regardless if the data will be used or not. Moreover, in high volume systems this might put unnecessary load on ECM to constantly load and store documents which actually didn't change.

jBPM 7.6 comes with support for lazy load of variables to resolve these issues. Though it won't magically apply to all your variables that are stored externally, but will provide all the support from the engine side to take advantage of lazy loading. Since the mechanism for loading variables will differ based on the back end data store it's up to user to apply this principle which is rather simple as follows:
  • your variable must implement org.kie.internal.utils.LazyLoaded
  • your marshaler strategy needs to provide kind of service responsible for loading content of the variable - user by load method of the variable
  • your marshalling strategy should not load content by default but set that service on the variable so can be used to lazy load when needed
  • to avoid unnecessary operations to store variable, implement sort of tracking system on your data to identify if variable has changed and store it only if it did, your marshaler should check this tracking mechanism in marshal method and should reset it after loading variable in unmarshal method

jBPM 7.6 provides this mechanism as part of its support for documents (jbpm-document module). The DocumentImpl implements both LazyLoaded and tracking mechanism to resolve both issues (too often load and too often store). This extremely improves overall performance as it reduces number of reads and writes to document service at the same time provides content on demand and only when needed.

I'd like to encourage everyone who stores process or task variables externally to make it lazy loaded and tracked to improve the performance of your system and reduce load on your back end data store.

piątek, 24 listopada 2017

Case management - mention someone in comments and we will notify them

Another post around case management shipped with jBPM 7. This time some additions for case comments that improves communication and collaboration.

jBPM 7.5 brings in case comment notifications.

That means case participants can be notified directly when they are being mentioned in the case comments. What's really worth noting is that this is completely integrated with case roles. In fact when you mention someone you mention by case role, there is no need to know who is actually assigned to the given role.



That enables smooth collaboration and plays nicely with dynamic nature of case instances and case role assignments.

Comment notifications

Comment notification processing is provided as case event listener that is NOT registered by default though it can be easily registered via deployment descriptor (as event listener).



This listener has three properties that are configurable but comes with defaults for easier setup

  • sender (defaults to cases@jbpm.org)
  • template (defaults to mentioned-in-comment)
  • subject (defaults to You have been mentioned in case ({0}) comment)
{0} place holder will be replaced with actual case id that comment belongs to.

So if you would like to use different values for sender, template and subject, you can provide them as constructor parameters when registering the listener in deployment descriptor:

new org.jbpm.casemgmt.impl.event.CommentNotificationEventListener(
"no-reply@domain.org", 
"custom-template", 
"New comment has been added to case {0}");

Comment notifications are published only when there are any roles mentioned. Roles are then resolved to users or groups assigned to the mentioned roles and then used to publish notification.

First it will attempt to publish notification with the template name it has defined, and if not found then it will send the comment body as raw text.

When sending with templates it will provide following parameters that can be used in the template to be replaced with actual values:
  • _CASE_ID_ - case id of the case the comment belongs to
  • _AUTHOR_ - author of the comment
  • _COMMENT_ - comment text (body)
  • _COMMENT_ID_ - unique id of the comment
  • _CREATED_AT_ - date when the comment was added
Then in the template you can refer to these parameters like this:
You have been mentioned in case comment by ${_AUTHOR_} most likely requires your attention

Notification publishers

Notifications are build as pluggable component and by default jBPM 7.5 comes with email based notification publisher. Notification publishers are discovered on runtime based on ServiceLoader mechanism, so as long as you have jbpm-workitems-email on class path then you have email based notification publisher.

Notification publisher interface (org.kie.internal.utils.NotificationPublisher) comes with three methods:
  • publish notification with raw body
  • publish notification built from a template
  • indication if given publisher is active

Email notification publisher

Email based notification publisher uses two important components to do its work:

  • java mail session that can either be retrieved from JNDI of application server (recommended as it hides the complexity needed to properly setup and secure communication with mail server) or based on property file
  • user info implementation to resolve user ids to email addresses
Since email publisher is discovered on runtime via ServiceLoader then there is no easy way to pass in other objects when initialising it. So it relies on some convention:
  • java mail session
    • register mail session in JNDI via application server configuration and then point the JNDI name via system property so jBPM can pick it up
      • -Dorg.kie.mail.session=java:jboss/mail/jbpmMailSession
    • create email.properties file on the root of the class path with following properties
      • mail.smtp.host
      • mail.smtp.port
      • mail.username
      • mail.password
      • mail.tls
  • user info implementation is automatically retrieved from the environment via org.jbpm.runtime.manager.impl.identity.UserDataServiceProvider

To handle templates there is an extra component (TemplateManager) that allows to deal with html based templates managed with free marker to render actual body of the notifications.

Email templates

Emails can be build with templates to allow better looking notifications than just raw content. TemplateManager that comes with email notification publisher can be used to define custom templates and then use them for your notifications and also email (that are send via EmailWorkItemHandler).

To make use of email templates you need to set the location where those templates are stored via system properties:
  • org.jbpm.email.templates.dir - this property should point to an absolute file system path and must resolve to a directory.
  • org.jbpm.email.templates.watcher.enabled - optionally you can enable watcher thread that will load updated, added or removed templates without the need to restart the server
  • org.jbpm.email.templates.watcher.interval - when using watcher you can also control the poll interval for it, it defaults to 5 and can be changed with this property that represents number of seconds
Emails are processed by free marker so make sure you define templates according to rules for free marker. Each time comment notification is issued it will attempt to use template named mentioned-in-comment and then if failed will send the comment as raw notification.

Implementing your custom notification publisher

Since notification publishers are pluggable you might want to implement your own, e.g. SMS notification. To do that follow these simple steps and off you go:
  • implement org.kie.internal.utils.NotificationPublisher your custom class with the mechanism chosen to send notification
  • create file name org.kie.internal.utils.NotificationPublisher under META-INF/services and paste the fully qualified class name of your implementation
  • usually it's a good practice to implement isActive method with system property to allow to disable given publisher on runtime without a need to remove it from class path 
  • package everything in a jar file and place it on class path e.g. kie-server.war/WEB-INF/lib
And that's all, your notification publisher is now active and will be used whenever someone is mentioned in the comments.

See it in action

Below is a screencast showing it in action using Case management showcase app.


As usual stay tuned and comments are more than welcome.

wtorek, 14 listopada 2017

Tune async execution in Kie Server

jBPM comes with handy feature for executing async jobs - essentially it allows to put a wait state almost at any place in the process definition. Either by marking node to be executed asynchronously or by using async work item handler.
For improved performance JMS based trigger has been provided (long time ago ... in version 6.3 - read up more on it here) and since then it's being used more and more.

With this article I'd like to explain how to actually tune it so it performs as expected (instead relying on defaults). It will focus on the JMS tuning as this brings the most powerful execution model.

So let's define simple use case to test this.


Basic process that has two script nodes (Hello and output) and work item node - Task 1. Task 1 is backed by async work item handler meaning it will use jBPM executor to execute that work asynchronously. The output script node will then hold execution (by using Thread.sleep) for 3 seconds for the sake of the test. This aims at making sure that given thread will not be directly available for other async jobs so it will actually illustrate tuned environment compared with defaults.

Default configuration

in WildFly (tested on version 10) there are two sides of the story:
  • The number of sessions that the JCA RA can use concurrently to consume messages.
  • The number of MDB instances available to concurrently receive messages from the JCA RA's sessions.
First is controlled by the "maxSession" activation configuration property on the MDB.
Second is controlled by the bean-instance-pools configured for MDBs in the "ejb3" subsystem in the server configuration (e.g. standalone-full.xml)

So maxSession activation configuration property is not set at all on KIEServer MDBs (none of them) and thus a default value is used which is 15.
Then "ejb3" subsystem default configuration is set to be derived from worker pools. Which will then differ based on your actual environment


The actual values can be inspected using jconsole (make sure you use jconsole from WildFly instead of default that comes with Java).

Below is the default setting for consumer count - refers to maxSession for given MDB



Below is default pool size derived from worker pools.


Execution on default configuration

Knowing what is the default configuration lets test its execution. The test will consists of executing 100 instances of the above process which will then result in executing 100 async jobs. With default settings we know that max number of concurrently executed async jobs will be 15 so total execution time will certainly be longer than expected.

And here are the results:
  • start time:09:44:37,021
  • end time: 09:44:58,513
Total execution time to complete all 100 process instances was around 21 seconds. Quite a lot as each instance sleeps for 3 seconds so the time to complete all 100 seconds should not take 7 times more time then it should.

Though this makes perfect sense as it will process these instances in 15 instances batches. As that is the limit to how many concurrent messages can be consumed.


Tuning time

It's time to do some tuning to make this execute much faster than on the defaults configuration. As mentioned, there are two settings that must be altered:

  • maxSession for KieExecutorMDB
  • max pool size in the "ejb3" subsystem
Activation config property can be set either via annotation on the MDB class or via xml descriptor. Since we don't want to recompile the code we'll go with the xml descriptor that overrides the annotations if they overlap.
To do so, edit kie-server.war/WEB-INF/ejb-jar.xml and add/merge following code:

<ejb-jar id="ejb-jar_ID" version="3.1"
      xmlns="http://java.sun.com/xml/ns/javaee"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
                          http://java.sun.com/xml/ns/javaee/ejb-jar_3_1.xsd">


    <enterprise-beans>    
    <message-driven>
      <ejb-name>KieExecutorMDB</ejb-name>
      <ejb-class>org.kie.server.jms.executor.KieExecutorMDB</ejb-class>
      <transaction-type>Bean</transaction-type>
      <activation-config>
        <activation-config-property>
          <activation-config-property-name>destinationType</activation-config-property-name>
          <activation-config-property-value>javax.jms.Queue</activation-config-property-value>
        </activation-config-property>
        <activation-config-property>
          <activation-config-property-name>destination</activation-config-property-name>
          <activation-config-property-value>java:/queue/KIE.SERVER.EXECUTOR</activation-config-property-value>
        </activation-config-property> 
        <activation-config-property>
          <activation-config-property-name>maxSession</activation-config-property-name>
          <activation-config-property-value>100</activation-config-property-value>
        </activation-config-property> 
      </activation-config>
    </message-driven>
  </enterprise-beans>
</ejb-jar>

adjust as needed value of the maxSession config property - here set to 100.

Edit standalone-full.xml of the server and navigate to "ejb3" subsystem. Configure the strict max pool for pool named mdb-strict-max-pool. To do that add new attribute called max-pool-size with value that you want to have. Please also remove the attribute derive-size as they are mutually exclusive - you cannot have both present in the strict-max-pool element.

 <pools>
    <bean-instance-pools>
        <strict-max-pool name="slsb-strict-max-pool" derive-size="from-worker-pools" instance-acquisition-timeout="5" instance-acquisition-timeout-unit="MINUTES"/>
        <strict-max-pool name="mdb-strict-max-pool" max-pool-size="200" instance-acquisition-timeout="5" instance-acquisition-timeout-unit="MINUTES"/>
    </bean-instance-pools>
</pools>
That's all that is needed to tune the async execution for jBPM in Kie Server.

To validate, let's look at jconsole to see if the actual values are as they were set

First look at consumer count

Next, looking at pool size for mdbs

All looks good, max session that was set to 100 (instead of default 15) is now set as consumer count, and max pool size is now set to 200 (instead of default 128).


Execution on the tuned configuration

so now let's rerun exactly the same scenario as before and measure the results - starting 100 process instances of the above process definition.

And here are the results:
  • start time:09:47:40,091
  • end time: 09:47:45,986
Total execution time to complete all 100 process instances was bit more than 5 seconds. This proves that once the environment is configured properly it will result in expected outcome.

Note that this was tested on rather loaded local laptop so on actual server like machine will produce much better results.

Final words regarding tuning

last but not least I'd like to stress out few other things to keep in mind when going for more throughput for KIE Server:
  • make sure your data source is configured with enough connection - it defaults to 20
  • make sure your JVM is configured properly and will have enough memory (to say the least)
  • make sure number of threads in particular pool is configured properly - according to needs
  • make sure you don't use Singleton runtime strategy for your project (kjar) - use either per process instance or request

And that's it for today. Hope this will be helpful to get the performance and throughput the way you need it.

wtorek, 31 października 2017

Sub cases for case instance and ... process instance

This time I'd like to present support for Sub Cases. It actually provides very powerful concept as it gives option to compose advanced cases that consists of other cases. That way a large and complex case can be split into multiple layers of abstraction (and even multiple case projects). Similar as process can be split into multiple sub processes.

Let's take a look at what case is - or to be more precise, a case definition. Case definition is set of activities that might happen while the case instance is active. It is completely dynamic and thus what's in the definition is not the only activities that might be invoked in case instance. Dynamic activities can also be included on runtime (on already active case instance).

Case instance, is a single instance of case definition and thus should encapsulate given business context. All case instance data is stored in case file which is then accessible to all process instances that might participate in the given case instance. Each case instance is completely isolated from other and same applies for case file. Only owning case instance can access case file.

So with that short recap of case terminology let's take a look at sub cases. Sub case is another case definition that is invoked from within another case instance (or even regular process instance). It has all the capabilities of regular case instance:
  • has dedicated case file
  • isolated from any other case instance
  • its own set of case roles
  • its own case prefix
  • etc
Sub case activity is available in the process designer, in category Cases as shown in the below screenshot


Sub case supports following set of input parameters that allow to properly configure it and start it:
  • CaseDefinitionId - mandatory case definition id that should be started
  • DeploymentId - optional deployment id (container id in context of kie server) that indicates where the case definition (that should be started) belongs to
  • Independent - optional indicator that tells the engine if the case instance is independent and thus main case instance will not wait for its completion - defaults to false
  • DestroyOnAbort - optional indicator that tells engine how to deal when the sub case activity was aborted - cancel case or destroy - defaults to true (will destroy the sub case - remove case file)
  • Data_XXX - optional data mapping from this case instance/process to sub case, where XXX is the name of the data in sub case this mapping should target. Can be given as many times as needed
  • UserRole_XXX - optional user to case role mapping. this case instance role names can be referenced here - meaning an owner of main case can be mapped to an owner of the sub case that way whoever was assigned to main case will be automatically assigned to sub case, where XXX is role name and value is the user role assignment
  • GroupRole_XXX - optional group to case role mapping. this case instance role names can be referenced here - meaning a participants of main case can be mapped to participants of the sub case that way whatever group was assigned to main case will be automatically assigned to sub case, where XXX is role name and value is the group role assignment
  • DataAccess_XXX - optional data access restrictions where XXX is the name of data item and the value is access restrictions.


Regardless of the settings of the Independent flag, there will always be output variable available named:
  • CaseId - this is the case instance id of the sub cases after it was started.
In addition to that, current snapshot of case file data will also be given. And thus can be mapped to main case data. They will be named exactly the same as in sub case instance's case file. As presented in above screen shot, sub case file item named name is mapped upon completion to main case instance case file item named name.

You can take a look at following screencast to see it in action


Sub case support also applies to regular processes. That allows users to start a case from within a process instance. This can be seen in the next screen cast


As usual, comments are more than welcome


piątek, 27 października 2017

Improved KIE Server documentation

In the upcoming 7.5 release, users will finally see more information about the available REST endpoints of KIE Server. The runtime documentation has been completely rewritten and is now based on Swagger. It replaces previous docs and thus is available at exact same location

http://localhost:8080/kie-server/docs

You might need to adjust the host, port and context path depending on your system setup.

So what's new in this documentation?

  • finally it actually documents the endpoints and not only list them
  • presents only enabled KIE server extensions' endpoints
  • is built as KIE Server extension so it can be disabled, e.g. production deployments
  • provides direct possibility to try given endpoint out
  • shows all possible parameters and they expected values
  • gives curl like examples if needed to try it out from command line

Personally I think that the most important features are:
  • possibility to directly try it out
  • shows only endpoints that are active at given server
Second feature is very useful as it does not make unnecessary noise when some of the extensions are disabled. For instance when KIE Server is used as decision service (BRM and DMN extensions only) then there is no need to show endpoints from BPM (which is the most verbose one - in terms of number of endpoints).

Take a look at this short screencast that illustrates it in action - showing how you can actually use the Swagger docs to run processes and user tasks.



An advantage is that now there is a rule in place that makes it mandatory to describe that way every and each REST endpoint being added to KIE Server.

Custom extensions can do that too, just make sure your custom endpoint class has swagger annotation on the type level and once added to KIE Server (and restarted) they will show up in the docs.

@Api(value="NAME OF YOUR RESOURCE")

An example can be found here 

It's all thanks to contribution


I would like to thank Duncan Doyle who was the initiator of this work and actually contributed the replacement of the docs. Another great example that contribution is a win win :)

Note that this is first version of the docs and I'd like to encourage everyone to contribute to improve the docs. Now it's way easier than it was before so ... it's up to you too now :)

wtorek, 17 października 2017

Case management improvements - data authorisation

It's been a while since the last article about case management so I thought it's the time to refresh that topic a bit. And the best way to do it is with the face lift of sample case application - IT Orders App.

So what's new in case management since last time?

  • case file authorisation
  • case comments authorisation
  • case close with comment 
  • index for case file items for searching 

So let's start with the most significant improvements - case file and case comments authorisation. Prior to this, all data and comments were visible to all participants of the case instance. So as long as user has access to case instance (is assigned to any of the roles) she will have access to all the data in the case file. Similar will have access to all comments.

This actually prevents of using rather common mechanism within data driven applications - access control and visibility. Let's take into account situation where there are sensitive data that only certain roles in the case instance should be able to view or maybe there are needs to post private comments. Without the authorisation within the case instance users would not be able to deal with such use cases.

jBPM 7.5 will bring solution to this by allowing users to define restrictions for both comments and data within case file. 

Case comments authorisation 

Case comment can be restricted to roles within the case instance when the comment is:
  • created 
  • updated
Whenever comment is updated or removed authorisation checks are performed to make sure that the action is done by a privileged user. 
When comments are retrieved they are filtered by authorisation to ensure comments eligible for given user will be returned.

Case file authorisation 

Case file data is protected individually, each item in the case file can have its own access restrictions. Similar to comments, whenever data is put into the case file or removed from it, authorisation checks are performed. When case instance or its data is retrieved it is filtered by authorisation to ensure only eligible data will be present in the data set.

Case file data access restrictions can be set in following ways:
  • within case definition by setting custom meta data called customCaseDataAccess that supports multiple case file items with one or more roles for them. It expects following format:

         item:role1,role2;another:role2,role3
  • when creating a case instance (start a case) access restrictions can be given as part of the case file, next to data and role assignments
  • when putting data into a case file access restrictions can be provided

Close case with comment

Case instance can be completed when there are no more activities to be performed and the business goal was achieved or it can be closed prematurely. In the second case, it's quite often needed to make a comment why the case instance is being closed. This feature has now been provided as part of jBPM 7.5 so that can help to keep track of various scenarios that led to case instance being closed.


Indexing of case file items

Case file data is really powerful when case instance is running, rules can be built on top of them, they can be added or removed at any point in time. But when it comes to searching by that data, going through each case instance to find out if it matches given criteria or not does not sound like a good idea. 
To provide more efficient support for such operation, case file data is indexed (similar to how process instance variables are). They are stored in data base table and constantly being updated so it represents the latest state of the case file.

This allows then to easily and efficiently:
  • find case instances that contain given data (by name and value)
  • filter case file by data name or type
  • build up UI based on the index rather than loading all the data - keep in mind that case file might contain any type of information like large documents etc

Some of these features are included in the face lift of the IT Order Case App that you can watch below.


As usual, comments are more than welcome... ideas for improvements and real case studies for case mgmt even more :)

środa, 11 października 2017

where did the authoring go? getting started with workbench and kie server on 7.3 and onwards

Recently there were number of questions on mailing list about authoring perspective missing in recent versions of jBPM and/or drools... that lead me to take few minutes to make this write up and attached screen case to show that ... it was not removed :)

It was actually redesigned for easier access and to improve user experience when using the workbench. Though for people who get used to the "old" way could be a bit lost at first. So here it goes:



What you can see in this screen cast is:

  • where to go to see the projects
  • how to create new project
  • how to import new project 
  • how to import project from external git server
  • how to find the settings and various configuration options of the project
  • how to build and deploy
  • where to find deployed projects into execution servers
  • where to find process definitions
  • where to start new process instance
Nothing fancy but might get you started much easier after moving from previous version.

Comments are more than welcome in terms of improvements and feedback about usability.