Introducing the Query by Example addon of VIATRA

We present an illustrated introduction to Query By Example (QBE), an exciting new addon of VIATRA. QBE is a tool that helps you write queries simply by selecting model elements in your favorite editor. This automatic feature is intended to help users who are learning the VIATRA Query Language and/or are unfamiliar with the internal structure of the modeling language (metamodel) they are working with.

The problem: querying what you do not know

Model queries are used for a multitude of reasons. Often, they are developed by modeling tool authors to implement built-in functionality of the language or tool, such as well-formedness checking, derived features or declarative views. But sometimes it is not the developer of the modeling language who specifies the query: e.g. users may define queries themselves to enforce company-specific design rules, or third parties may provide transformation plugins that map a model into a different representation.

There is a hidden obstacle here: usually only the language developer has intimate knowledge of the metamodel (the abstract syntax), while others are familiar with the language merely through views presented to users (the concrete syntax). It is, however, the abstract syntax which is necessary for defining queries in the traditional way.

Motivating case study: defining UML design rules

Imagine, for instance, that you are an engineer at a company that creates UML models using Papyrus, and you wish to define model queries in order to implement validation for an in-house design rule that all your UML Sequence Diagrams should adhere to: "Engine objects can invoke UI methods only in a non-blocking way".

The first challenge would be formulating a query that identifies blocking calls between objects on Sequence Diagrams - such a situation would look like this in the Papyrus editor:

what a synchronous call looks like in the concrete syntax

Expressing this query in the .vql syntax would require you to know the names of the relevant EClasses of the UML metamodel and their features.

There are some easier hurdles to jump - the editor palette tells you that the vertical lines are not actually called "objects" but rather Lifelines. You might also understand from the default name offered by Papyrus that the contents of the diagram are actually represented by a model object of type Interaction.

Other parts of the query are a bit harder to formulate. The Papyrus editor palette tells you that the "arrow" thingy representing a blocking method invocation is a "Message Sync", but the actual model object is of the class Message, and its synchronous nature is expressed by its messageSort attribute being set to MessageSort::synchCall.

However, some aspects turn out to be much more difficult to guess. The UML graphical syntax offers no clues that would let you realize that the message does not directly refer to the lifelines, or vice versa. Instead, there are two invisible objects (of type MessageOccurrenceSpecification) at play that represent the sending or receiving of the message by a lifeline:

A schematic representation of a UML model fragment in abstract syntax

Really, this whole thing is a mess. It is quite difficult to understand the abstract structure and come up with the right type names when writing a query, unless you are an expert in the relevant modeling language (the UML standard, in this case).

The solution: Query by Example at a glance

Wouldn't it be much easier if you could just create an example in concrete syntax using a regular model editor, and then instruct the model query framework to "fetch me stuff that looks like this"? You are in luck - this is pretty much what the new Query By Example (QBE) tool lets you do! (Available from the update sites as a VIATRA Addon since v1.3.)

To get started, you have to select a few elements in the model and initiate the QBE process. The QBE tool will perform a model exploration on the given EMF model to find how the selected anchor elements relate to each other. The Query by Example view will present the results of the model discovery, where you can follow the status of the pattern being generated and fine-tune it (via the Properties view). The pattern code can be exported either to the clipboard or to a .vql file. After subsequent fine-tuning, the Update button can be used to propagate any changes made to the previously saved .vql file.

Screencast

(View video with subtitles/CC turned on.)

Case study walk-through: creating your first query by example

Select the two lifelines and the message in the Papyrus editor:

Sequence Diagram, with message and its source and target lifelines selected

Now it is time to press the "Start" button on the Query By Example tool:

Pressing start on the QBE View

A quick glance at the contents of the QBE view (which will be explained in greater detail soon) immediately tells you that the tool has discovered that the three selected model elements are connected via three additional objects - an Interaction and two MessageOccurrenceSpecifications:

QBE View after selecting an example

The QBE tool has also explored all the attribute values of these six objects, but has no way to know which of them are actually relevant to the query. Most attribute values in the example are incidental, such as the name of the lifeline. So you need to manually go through the list, find the one that has MessageSort::synchCall as the value of the messageSort of the Message (it is quite apparent from the list that none of the other attributes have anything to do with the synchronous nature of the invocation). Then you can simply indicate that it should be added as a condition to the query, by selecting it from the list, and marking it in the Properties view as included.

Finally, another button on the QBE UI lets you export the pattern to the clipboard, or to a .vql file in a VIATRA Query project.

Saving the finished query to the clipboard

If you save the generated query to a .vql file, you will notice that it does not compile at first.

One compiler error comes from the fact that the QBE tool cannot (yet) guess the name of the Java package where you save the file. You can fix this by manually specifying your package: select the "package" entry in the QBE view, and use the Properties view to change the name. While you are at it, you may even change the name of the pattern to something meaningful. If you have previously opted to save the generated query to a file in the workspace, you can now overwrite that file with the new content by a single click on the Update button:

Package and pattern names, and the Update button

There is one more likely reason for your query not to compile: the query project might not have the UML types on its classpath. You can fix this easily by adding the metamodel bundle org.eclipse.uml2.uml to your dependencies.

The generated query should look something like this (small variations possible), with pattern variables created for anchors and intermediate objects (the former appearing as pattern parameters); reference constraints created for the paths connecting them; and the additional attribute constraint that was manually requested:

package org.example.uml.designrules

import "http://www.eclipse.org/uml2/5.0.0/UML"

pattern blockingCall(
    lifeline0 : Lifeline,
    lifeline1 : Lifeline,
    message0 : Message
) {
    MessageOccurrenceSpecification(messageoccurrencespecification0);
    Interaction(interaction0);
    MessageOccurrenceSpecification(messageoccurrencespecification1);
    Lifeline.coveredBy(lifeline0, messageoccurrencespecification0);
    Lifeline.coveredBy(lifeline1, messageoccurrencespecification1);
    Interaction.lifeline(interaction0, lifeline1);
    Lifeline.interaction(lifeline0, interaction0);
    Message.sendEvent(message0, messageoccurrencespecification0);
    MessageOccurrenceSpecification.message(messageoccurrencespecification1, message0);
    Interaction.message(interaction0, message0);
    Message.receiveEvent(message0, messageoccurrencespecification1);
    MessageOccurrenceSpecification.message(messageoccurrencespecification0, message0);

    Message.messageSort(message0, MessageSort::synchCall);
}

You can now load the query into Query Explorer (or the new Query Results view) to verify that it does the right thing - i.e. it matches exactly those elements that you want it to match. If it does not, you can use the QBE UI to make adjustments, and fine-tune the query (e.g. adding or removing additional constraints, see below) to meet your goals.

How it works under the hood

When you select some elements in an editor or viewer, and press the "Start" button of QBE, the tool needs to recognize the selection as a set of EObjects. VIATRA ships with integration components for several popular editor frameworks (in particular, it works out of the box with Papyrus, which is GMF-based), but you might need to contribute a model connector plug-in in order to be able to use QBE with a custom editor.

The model discovery will start separately from each selected EObject (anchor element), and will traverse EMF reference links up to a given exploration depth limit, in order to collect all paths (not longer than the given depth) connecting two anchors.

Initially, the tool automatically selects the smallest exploration depth that makes all anchors connected by the paths discovered. You can use the Exploration depth slider in the QBE view to manually increase this depth limit, so that the tool notices additional, less direct connections between the selected anchors. Changing the exploration depth will re-trigger model discovery, so that the tool can gather new paths.

After model discovery, a first attempt at a pattern will be formed by all paths that were found to connect the anchors. Pattern variables will consist of anchor points (given by the user as part of the selection) as well as the additional objects discovered as intermediate points along the paths. By default, only variables corresponding to anchors will be used as pattern parameters, while the rest will be local variables. References traversed along paths will contribute edge constraints to the pattern body.   

Fine tuning

Sometimes, the QBE tool will not immediately stumble upon the query that you are interested in. You can identify problems with the proposed pattern by directly inspecting the generated query code, or by examining query results on a test model. In these cases, there are still ways you can fine-tune the pattern through the QBE (and Properties) view, to make sure the generated query is useful.

For instance, recall how we originally designated the two Lifelines and the Message as part of the example. Had you selected the two Lifelines only, QBE would have found them connected by a path of length 2 - as they are both lifelines in the same Interaction. The fact that they exchange a Message turns out to be a less direct relationship between the two anchors. In order to arrive at the correct query, you would have had to manually increase the exploration depth to 4, thereby forcing QBE to search for connections between anchors in a wider context.
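
As a rough illustration, here is what the pattern might look like if only the two Lifelines were selected and the exploration depth were left at its initial value. This is a hand-written sketch rather than actual tool output: the pattern name is made up, and the code QBE really generates may also include the opposite Lifeline.interaction references.

```vql
package org.example.uml.designrules

import "http://www.eclipse.org/uml2/5.0.0/UML"

// Hand-written sketch (not actual QBE output): the two anchors are
// connected only through their shared Interaction, a path of length 2.
pattern lifelinesInSameInteraction(
    lifeline0 : Lifeline,
    lifeline1 : Lifeline
) {
    Interaction(interaction0);
    Interaction.lifeline(interaction0, lifeline0);
    Interaction.lifeline(interaction0, lifeline1);
}
```

Nothing in this pattern mentions a Message, which is why the message exchange only shows up once the depth limit is raised.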

Depending on other particularities of the model you use as example, this wider context may have included connections between the Lifelines that are incidental in the example model, and not essential parts of the query. In this case, it is still the responsibility of the query developer to determine which details are relevant; the QBE view allows you to mark connecting paths as well as intermediate objects as excluded from the result. In the previous case study, having a common Interaction was not actually necessary as part of the query - but it is not important to remove it now, as it will not influence the results.

Additional fine-tuning options include:

  • Promoting intermediate objects found during the discovery to act as pattern parameters (along with the anchors).
  • Renaming pattern variables.
  • Adding attribute constraints based on the attribute values of the discovered objects (as demonstrated before).
  • Similarly, adding negative application conditions. When you press the 'Find negative constraints' button, the tool searches for references between pairs of variables (anchors or intermediate objects) that are permitted in the metamodel, but not present in the example instance model. The absence of such references is offered as additional, opt-in 'neg find' constraints that can be individually selected to be included in the query.
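
A negative application condition of this kind could end up looking like the sketch below. It is written by hand, assuming that the example contains no coveredBy reference from the receiving lifeline to the send event. The helper pattern, the pattern names and the variable names are illustrative, and the code QBE actually emits may be structured differently.

```vql
package org.example.uml.designrules

import "http://www.eclipse.org/uml2/5.0.0/UML"

// Illustrative helper wrapping the reference whose absence is required.
pattern covers(lifeline : Lifeline, fragment : InteractionFragment) {
    Lifeline.coveredBy(lifeline, fragment);
}

// Blocking call whose send event is not covered by the receiving lifeline.
pattern blockingCallToOtherLifeline(
    lifeline0 : Lifeline,
    lifeline1 : Lifeline,
    message0 : Message
) {
    MessageOccurrenceSpecification(sendEnd);
    MessageOccurrenceSpecification(receiveEnd);
    Lifeline.coveredBy(lifeline0, sendEnd);
    Lifeline.coveredBy(lifeline1, receiveEnd);
    Message.sendEvent(message0, sendEnd);
    Message.receiveEvent(message0, receiveEnd);
    Message.messageSort(message0, MessageSort::synchCall);

    // Opt-in negative constraint: a reference permitted by the metamodel
    // but absent from the example.
    neg find covers(lifeline1, sendEnd);
}
```

With this particular constraint included, a message that a lifeline sends to itself would no longer match.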

The rest of the case study and conclusion

You might have noticed that the above case study is far from complete. We have only managed to identify the relationship between the sender and receiver Lifelines of a blocking call; we still need to

  1. express using QBE that a given Lifeline represents an instance of a certain Class
  2. express using QBE that a certain Class resides in a Package with the name "Engine" or "GUI"
  3. write a final query that uses pattern composition to combine the previously created QBE queries in order to match a violation of the constraint in question (i.e. "an Engine object invoking a method on a UI class in a blocking way").

Here is what the Class diagram looks like:

Snippet of Class diagram in the concrete syntax

For the first task, one only needs to select a Lifeline and its associated Class. Whoops - the Papyrus UI becomes an obstacle here, as there seems to be no way to simultaneously display these two elements. Fortunately, we can just select one of the two elements as a single anchor, switch over the view to the other element, and then tell the QBE tool to expand the previous selection with the new element by selecting the "Expand" alternative action available from the little drop-down arrow at the "Start" button.
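
Assuming the Lifeline represents a Property that is typed by the Class (a common arrangement, though your model may differ), the query resulting from this two-step selection might resemble the following hand-written sketch. The pattern and variable names are illustrative.

```vql
package org.example.uml.designrules

import "http://www.eclipse.org/uml2/5.0.0/UML"

// Sketch: the anchors are the Lifeline and the Class; the Property that
// the lifeline represents is an intermediate object found by discovery.
pattern lifelineOfClass(
    lifeline0 : Lifeline,
    class0 : Class
) {
    Property(property0);
    Lifeline.represents(lifeline0, property0);
    Property.type(property0, class0);
}
```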

The second task requires familiar steps only: selecting a Class and its Package as the two anchors, and then adding the name of the UML package as an attribute constraint. Note that in a real-world example, the query would probably include a more elaborate condition (such as a regular expression) on the name of the package. As of now, QBE can only generate constraints for exact attribute value filtering; if you need anything more advanced than that, then you will have to consider the query generated by QBE merely as a starting point that you have to modify manually.
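
The difference between the two kinds of condition can be sketched as follows. Both patterns are hand-written illustrations with made-up names; the first approximates what an exact-value constraint looks like, while the second shows one possible manual refinement using a check expression. They assume the Class is owned directly by the Package.

```vql
package org.example.uml.designrules

import "http://www.eclipse.org/uml2/5.0.0/UML"

// What QBE can express: an exact match on the package name.
pattern classInEnginePackage(class0 : Class, package0 : Package) {
    Package.packagedElement(package0, class0);
    Package.name(package0, "Engine");
}

// Manual refinement: bind the name to a variable and test it in a
// check expression (the regular expression is only an example).
pattern classInEngineLikePackage(class0 : Class, package0 : Package) {
    Package.packagedElement(package0, class0);
    Package.name(package0, name);
    check(name.matches("Engine.*"));
}
```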

The third task is best solved by simply manually writing a query that composes the previously obtained patterns, and not by applying QBE, so it is left as an exercise to the reader :) The main purpose of QBE is to help the user discover connections in a model with an unfamiliar abstract syntax; it will not replace regular query engineering in the generic case.
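
For readers who would like a starting point, one possible shape of the composed query is sketched below. It reuses the blockingCall pattern shown earlier; the helper pattern, all names, and the assumption that classes sit directly in packages called "Engine" and "GUI" are illustrative and not prescribed by the tool.

```vql
// Sketch only: append to the .vql file that already contains blockingCall.

// Illustrative helper: the lifeline represents an instance of a Class
// that is owned directly by the given Package.
pattern lifelineInPackage(lifeline : Lifeline, pkg : Package) {
    Class(classifier);
    Lifeline.represents(lifeline, element);
    ConnectableElement.type(element, classifier);
    Package.packagedElement(pkg, classifier);
}

// Violation: an Engine object invokes a GUI object in a blocking way.
pattern engineBlocksOnGui(
    caller : Lifeline,
    callee : Lifeline,
    message : Message
) {
    find blockingCall(caller, callee, message);
    find lifelineInPackage(caller, enginePackage);
    Package.name(enginePackage, "Engine");
    find lifelineInPackage(callee, guiPackage);
    Package.name(guiPackage, "GUI");
}
```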

To put it simply, you should think of Query By Example as a tool for abstracting away the ugly, unknown details of a modeling language before developing more complex queries for that language as usual.