Hi All,

Recently I needed a workflow engine for our CakePHP-based project. Even though we only needed relatively straightforward document approval workflows, we decided to go with a workflow engine, because the separation of concerns allows for easy personalization of different workflows for different clients. After searching the web for a bit, we found that ezComponents has by far the most well-developed open source PHP workflow engine. Integrating the workflow engine with CakePHP was not a big issue, as the workflow component is fairly separate from the rest of ezComponents, and its API is well thought out and easy to adapt (in short, thanks to the developers of ezComponents for designing a great system!).

Anyway, along the way I had to make several modifications to the workflow source code to accommodate the type of environment that I envisioned. I think a lot of the changes I had to make came from a different philosophy on how and where a workflow is instantiated. Correct me if I'm wrong, but it seems that the default mode of instantiating a workflow is meant to be from code. This is certainly a valid way, but I prefer to write the workflow directly in XML. The changes I made to accommodate that are:

   * allow for string node IDs
   * do not rewrite/renumber node IDs
   * do not require IDs for nodes with only one input that are only
     connected with the node before it

These 3 new rules make it much easier to write and debug workflows in XML. An example of readability improvement:

before:

<node id="1" type="Start">
<outNode id="2"/>
</node>
<node id="2" type="MultiChoice">
<outNode id="3"/>
<outNode id="4"/>
</node>
<node id="3" type="Action">
<outNode id="5"/>
</node>
<node id="4" type="Action">
<outNode id="5"/>
</node>
<node id="5" type="SimpleMerge">
<outNode id="6"/>
</node>
<node id="6" type="End"/>

after:

<node type="Start"/>
<node type="MultiChoice">
<outNode id="accept"/>
<outNode id="reject"/>
</node>
<node id="accept" type="Action">
<outNode id="exit"/>
</node>
<node id="reject" type="Action">
<outNode id="exit"/>
</node>
<node id="exit" type="SimpleMerge"/>
<node type="End"/>


Even for such a small example, this makes the workflow far more legible and easier to maintain (for example, you can insert a node without having to renumber everything). It does impose one new rule: the order of nodes in the XML file matters, because a node without an explicit outNode is connected to the node that follows it.
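
To make the three rules concrete, here is a minimal sketch in plain PHP of how optional IDs and implicit links could be resolved. It is not the actual change to the ezComponents XML loader: the function name resolveNodes(), the array layout, and the "_auto" ID prefix are all made up for illustration, and it assumes no explicit ID starts with that prefix.

```php
<?php
// Hypothetical helper, not part of ezComponents: takes node definitions in
// document order and fills in the IDs and links that the XML left out.
function resolveNodes(array $nodes)
{
    // Pass 1: keep explicit (string) IDs untouched, generate IDs for the rest.
    foreach ($nodes as $i => $node) {
        if (!isset($node['id'])) {
            $nodes[$i]['id'] = '_auto' . $i;
        }
    }

    // Pass 2: a node without outNodes flows into the next node in the file.
    // End nodes and the last node in the file get no outgoing link.
    $last = count($nodes) - 1;
    foreach ($nodes as $i => $node) {
        if (empty($node['out']) && $node['type'] !== 'End' && $i < $last) {
            $nodes[$i]['out'] = array($nodes[$i + 1]['id']);
        }
    }

    return $nodes;
}

// The "after" example from above, written as plain arrays.
$resolved = resolveNodes(array(
    array('type' => 'Start'),
    array('type' => 'MultiChoice', 'out' => array('accept', 'reject')),
    array('id' => 'accept', 'type' => 'Action', 'out' => array('exit')),
    array('id' => 'reject', 'type' => 'Action', 'out' => array('exit')),
    array('id' => 'exit', 'type' => 'SimpleMerge'),
    array('type' => 'End'),
));

foreach ($resolved as $node) {
    $out = isset($node['out']) ? implode(', ', $node['out']) : '';
    echo $node['id'] . ' (' . $node['type'] . ') -> ' . $out . "\n";
}
```

In this sketch the Start node ends up linked to the MultiChoice node and the SimpleMerge node to the End node, while accept, reject and exit keep the names given in the XML. Moving a node to a different position in the array changes the implicit links, which is the ordering rule mentioned above.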

Other changes:

   * One need we had was that we wanted to present the user a choice
     based on the workflow state and available branches. Although this
     can be parsed in some way from the current node in the workflow
     (by traveling along the out nodes until you get to a
     multiple-choice action), it makes things much more complicated
     than needed and imposes implicit restrictions on the workflow. For
     this I added a new action that has one input node and multiple
     (labeled) outputs and a method to get the list of those outputs.
     The action behaves as an input action and waits for a specified
     workflow variable and uses that value to determine what output to
     take. This allows you to load the current workflow state, and get
     the choices (if the current active node is our new action) to
     display to the user.
     Example:
     <node type="InputSplit">
     <variable name="next_step"/>
     <condition type="IsEqual" value="Accept" >
     <outNode id="accept" />
     </condition>
     <condition type="IsEqual" value="Reject" >
     <outNode id="reject" />
     </condition>
     </node>

     and then

     $choices = array();
     foreach ($execution->getActivatedNodes() as $node) {
         if ($node instanceof ezcWorkflowNodeInputSplit) {
             $choices = array_merge($choices, $node->getChoices());
         }
     }

     leaves array("Accept", "Reject") in $choices.


   * We also found that using the threading paradigm made the workflow
     much more complicated if there are multiple intertwined loops in
     the workflow (e.g. send document back to previous reviewer or send
     the document back to originator in a review loop) but only one
     possible path. For that I made the new choice action described
     above not create new threads ($startNewThreadForBranch = false)
     and I added a converge action (same as simple merge, except that it
     assumes incoming nodes are all on the same thread).

   * I added a new variable node "VariableAppend" that appends a value
     to an (array) workflow variable. This allows you, for example, to
     accumulate comments in a review loop, e.g.:

     <node type="VariableAppend">
     <variable name="comment">
     <string>new comment</string>
     </variable>
     </node>

   * I added a variable type that is an indirect value, i.e. the value
     of the node is the name of a workflow variable, which allows you
     to assign one workflow variable to another, e.g.:

     <node type="VariableSet">
     <variable name="one">
     <wfvariable name="two"/>
     </variable>
     </node>

     assigns the value of workflow variable "two" to the workflow
     variable "one". This one is very useful when using XML-based
     workflows, as it makes it easier to vary the input of certain
     workflow actions (which otherwise would be hard-coded in the XML).

   * I added a new optional label property to the node object (e.g.
     label="Waiting for review"). This allows you to label (input)
     nodes, so it becomes easy to display the current (wait) state of
     the workflow to the user by displaying the label of the currently
     active node.
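
As a rough illustration of the InputSplit behaviour described in the first bullet, here is a standalone sketch. The real node would extend the ezComponents workflow node classes, which are not reproduced here; the class name InputSplitSketch, the method selectOutNode(), and the choice to keep waiting on an unknown value are assumptions made for this example.

```php
<?php
// Standalone illustration only, independent of ezComponents.
class InputSplitSketch
{
    private $variable;   // name of the workflow variable to wait for
    private $branches;   // choice value => out node ID

    public function __construct($variable, array $branches)
    {
        $this->variable = $variable;
        $this->branches = $branches;
    }

    // Choices to offer the user while the node is waiting.
    public function getChoices()
    {
        return array_keys($this->branches);
    }

    // Returns the out node ID to activate, or null while the variable is
    // unset or matches no branch (assumption: the node then keeps waiting).
    public function selectOutNode(array $variables)
    {
        if (!isset($variables[$this->variable])) {
            return null;
        }
        $value = $variables[$this->variable];
        if (!is_string($value) || !isset($this->branches[$value])) {
            return null;
        }
        return $this->branches[$value];
    }
}

// Mirrors the XML example: wait for "next_step", then pick a branch.
$node = new InputSplitSketch('next_step', array(
    'Accept' => 'accept',
    'Reject' => 'reject',
));

print_r($node->getChoices());
var_dump($node->selectOutNode(array()));
echo $node->selectOutNode(array('next_step' => 'Reject')), "\n";
```

Because exactly one branch is selected, a node like this never needs to start a new thread per branch, which is what makes it usable for the single-path review loops mentioned in the second bullet.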

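The two variable additions (VariableAppend and the indirect wfvariable value) boil down to very little logic. The following sketch shows the idea on a plain array standing in for the execution's variable store; the function names appendVariable() and setFromVariable() are hypothetical, and wrapping an existing scalar into an array is an assumption rather than a description of the actual node.

```php
<?php
// Hypothetical helpers; $vars stands in for the workflow's variable store.

// VariableAppend: push a value onto an array variable, creating it if needed.
function appendVariable(array $vars, $name, $value)
{
    if (!isset($vars[$name])) {
        $vars[$name] = array();
    } elseif (!is_array($vars[$name])) {
        // Assumption: an existing scalar becomes the first element.
        $vars[$name] = array($vars[$name]);
    }
    $vars[$name][] = $value;
    return $vars;
}

// VariableSet with <wfvariable>: copy the current value of one workflow
// variable into another (null if the source variable is not set).
function setFromVariable(array $vars, $target, $source)
{
    $vars[$target] = isset($vars[$source]) ? $vars[$source] : null;
    return $vars;
}

$vars = array('two' => 'draft');
$vars = appendVariable($vars, 'comment', 'new comment');
$vars = appendVariable($vars, 'comment', 'looks good');
$vars = setFromVariable($vars, 'one', 'two');
print_r($vars);
```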

If there is any interest in any of these changes, I can of course make the code available. One remaining issue that I am still looking at, and on which I could use advice, is the notion of a current user. Often a workflow involves handing the flow from one user to another. I added a current_user as a first-class citizen to the execution state of a workflow, but it is of very limited use right now. A more generic notion of the current owner(s) of an execution (thread) would be useful, but I haven't come up with a good solution yet.

Marcel
