Current Research 

The Object Constraint Language and its relationship to Meta-modeling
Supervisor: Prof. Hans Vangheluwe (Summer 2004)

In Progress...

Section I. Getting Started

            To begin with, it helps to understand the notions of model, formalism and constraint. Figure 1 illustrates the relationships between the main modeling notions. A system is any set of (concrete or abstract) things to be modeled. A model is an abstraction that represents a view of this system. A model represents a system in a formalism (e.g. the UML) using a set of abstract primitives. These primitives are the relevant modeling notions (e.g. a class), which are mapped to forms, i.e. concrete primitives (e.g. a box). The semantics defines the meaning of the notions (e.g. what a class is) using any language, including natural language. The semantics includes constraints that restrict the use of notions in order to ensure that a model has at least one licit interpretation in the formalism. The UML metamodel cannot be changed (this is justified at least for practical reasons). However, greater expressive power is often required, and fortunately the UML abstract primitives include the ingredients needed to define new notions. Pseudo-metaclasses (stereotypes) are defined based on UML metaclasses and stored in profiles. The semantics of the new primitives is defined in the profile using free-text adornments and OCL constraints that restrict the use of these primitives, just as in the metamodel.
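As a rough illustration of a constraint that restricts the use of a notion, the sketch below checks one well-formedness rule on a toy class model: no two attributes of a class may share a name. The types `Attribute` and `ClassModel` and the function `unique_attribute_names` are hypothetical names chosen for this example; they are not part of the UML metamodel or of any OCL tool. In practice such a rule would be written as an OCL invariant rather than in Python.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    """A toy stand-in for an attribute notion (hypothetical type)."""
    name: str


@dataclass
class ClassModel:
    """A toy stand-in for a class notion (hypothetical type)."""
    name: str
    attributes: list = field(default_factory=list)


def unique_attribute_names(cls):
    """Constraint: no two attributes of the same class share a name."""
    names = [a.name for a in cls.attributes]
    return len(names) == len(set(names))


ok = ClassModel("Person", [Attribute("name"), Attribute("age")])
bad = ClassModel("Person", [Attribute("name"), Attribute("name")])

print(unique_attribute_names(ok))   # True
print(unique_attribute_names(bad))  # False
```

A model that violates the constraint, such as `bad` above, has no licit interpretation under this rule and would be rejected.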





1.1. Syntax vs semantics

            A language consists of a syntactic notation (syntax), which is possibly a finite set of elements that can be used in communication, together with their meaning (semantics). Semantics corresponds to the pure notion of information, whereas syntax is its representation as data, the medium used to transport and store information. It is generally agreed in the literature that data is used to communicate and needs an interpretation to extract the information behind it. An interpretation is a mapping that assigns a meaning to each legal piece of data. Because data and the information it carries are so closely linked, the two notions are often mixed up. However, the same piece of information may be encoded in a variety of pieces of data and, vice versa, the same piece of data may have several meanings and may therefore denote different information for different people or for different applications. Thus, it is necessary to distinguish between the two.
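The distinction between data and information can be sketched with numerals. In the example below, the function names `as_decimal`, `as_binary` and `as_hex` are hypothetical and simply wrap Python's built-in `int(text, base)`; each one plays the role of an interpretation that maps a legal piece of data to a meaning.

```python
def as_decimal(text):
    """Interpretation 1: read the data as a base-10 numeral."""
    return int(text, 10)


def as_binary(text):
    """Interpretation 2: read the data as a base-2 numeral."""
    return int(text, 2)


def as_hex(text):
    """Interpretation 3: read the data as a base-16 numeral."""
    return int(text, 16)


# One piece of data, several meanings.
data = "10"
print(as_decimal(data))  # 10
print(as_binary(data))   # 2

# One piece of information (the number ten), several pieces of data.
print(as_decimal("10") == as_binary("1010") == as_hex("a"))  # True
```

The string "10" by itself carries no fixed information; only the chosen interpretation determines which number it denotes.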

            The term syntax is used whenever we refer to the notation of the language, and this includes diagrams too. (Note that syntactic expressions can have both textual and visual representations.) Syntactic issues focus purely on the notational aspects of the language, completely disregarding any meaning. The meaning of a language is described by its semantics. Note that most computerized tools do not allow us to manipulate semantics directly. Instead, everything we see and work with on paper or on the screen is a syntactic representation. Since we are still far from computers that can understand free-flowing natural language, a formal, concise, and rigid set of syntactic rules is essential. However, this is not enough. Without a semantics that assigns an unambiguous meaning to each syntactically allowed phrase in the programming language, the syntax is worthless, because several misinterpretations become possible. In the computer engineering discipline, therefore, any language, textual or visual, must come complete with rigid rules that prescribe the allowed form of a syntactically well-formed program, and also with rules, just as rigid, that prescribe its semantics.

            Figure 1.1 describes the structure of a language. The semantics of a language tells us about the meaning of each of its expressions. That meaning must be an element in some well-defined domain. Unfortunately, the confusion that often exists between syntax and semantics is made worse by the fact that we need a syntactic representation for the semantics itself. Here are two examples. The UML 1.n definition is itself confused about the meaning of semantics: its semantics section is mostly about abstract syntax. In compiler theory, the context conditions are often called "semantic conditions" because they are triggered by semantic considerations. However, they really just constrain the syntax and do not contribute to the definition of the semantics.
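The difference between a context condition and a semantic mapping can be sketched as follows. This is a minimal illustration under stated assumptions: expressions are represented as hypothetical `(operator, left, right)` tuples, the semantic domain is taken to be the integers, and the function names `well_formed` and `meaning` are invented for this example.

```python
def well_formed(expr, declared):
    """Context condition: every identifier used must be declared.

    This only restricts which expressions are allowed; it assigns no meaning.
    """
    _, left, right = expr
    return left in declared and right in declared


def meaning(expr, env):
    """Semantic mapping: send an expression to an element of the domain.

    The domain here is assumed to be the integers; env maps identifiers
    to integer values.
    """
    op, left, right = expr
    if op == "+":
        return env[left] + env[right]
    if op == "*":
        return env[left] * env[right]
    raise ValueError(f"unknown operator: {op!r}")


expr = ("+", "a", "b")
env = {"a": 2, "b": 3}

print(well_formed(expr, env.keys()))  # True
print(meaning(expr, env))             # 5
```

Only `meaning` contributes to the semantics; `well_formed` rejects some expressions but says nothing about what the accepted ones denote.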


1.2. Abstract syntax vs concrete syntax in visual and textual perspectives

    Normally, concrete syntax is part of the definition of a language; it is what a programmer writes to define a program or a piece of data. Abstract syntax is an abstract representation that encodes the concrete syntax. It can also be considered part of a language whenever the language allows the user to access its implementation structures from within the language itself.

    In UML the basic idea of abstract syntax and concrete syntax is the same. Figure 1.2 shows a simple example of how "a + b" is represented at both the meta level and the instance level, in the abstract syntax and the concrete syntax of a language, in both textual and visual representations.

    At the meta level, the abstract syntax of "a + b" is simply an entity relationship, which may also involve other operators. The graph grammar rule maps the LHS and RHS of the operator to the ids of the operands and produces the end result. Such a meta-level abstract syntax can be described using concrete syntax, either textually or visually. The meta-level visual concrete representation of "a + b" can be described, as figure 2 suggests, by a state chart that loops back to itself under some insideness definition. The case becomes trivial at the instance level. There the abstract syntax is simply a tree (a syntax tree) holding the operator as the root and the operands as the leaves. This tree can be expressed concretely as "a + b", "+ (a, b)", or "plus (a, b)". There are clearly several ways to represent this abstract syntax concretely, and so, in a sense, the choice is rather arbitrary.
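The instance-level case can be sketched in a few lines: one abstract syntax tree, three concrete renderings. The class `BinaryOp`, the functions `infix`, `prefix` and `named`, and the operator name "times" are hypothetical choices made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryOp:
    """Abstract syntax: the operator is the root, the operands are the leaves."""
    operator: str
    left: str
    right: str


# Spelled-out operator names; "times" is an assumption added for symmetry.
OPERATOR_NAMES = {"+": "plus", "*": "times"}


def infix(node):
    """Concrete syntax 1: operator between the operands."""
    return f"{node.left} {node.operator} {node.right}"


def prefix(node):
    """Concrete syntax 2: operator symbol in front of the operands."""
    return f"{node.operator} ({node.left}, {node.right})"


def named(node):
    """Concrete syntax 3: operator name in front of the operands."""
    return f"{OPERATOR_NAMES[node.operator]} ({node.left}, {node.right})"


tree = BinaryOp("+", "a", "b")
print(infix(tree))   # a + b
print(prefix(tree))  # + (a, b)
print(named(tree))   # plus (a, b)
```

All three strings are produced from the same `tree` value, which is why the choice among them is a matter of notation only.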

    In implementation, the mapping is mostly from the textual representation of the concrete syntax at the meta level to the abstract syntax at the instance level. The compiler-compiler takes the meta-level expression of the concrete syntax and generates a syntax tree at the instance level. Whenever a user inputs a textual expression, it is compiled and validated against the syntax tree.



Figure 1.2

                  Abstract Syntax     Concrete Syntax
Meta Level        (figure)            EXP ::= <ID> ('+' | '*') <ID>
Instance Level    (figure)            a + b
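The meta-level rule in Figure 1.2 (an identifier, then '+' or '*', then another identifier) can be turned into a small hand-written recognizer that maps instance-level text such as "a + b" to a syntax tree. This is only a sketch and not the output of a compiler-compiler. The function names `tokenize` and `parse_exp`, the tuple form of the tree, and the assumption that an identifier is a letter or underscore followed by letters, digits or underscores are all choices made for this example; the figure does not define <ID>.

```python
import re

# Assumed lexical rule: an ID is a letter or underscore followed by
# letters, digits or underscores; an operator is '+' or '*'.
TOKEN = re.compile(r"\s*(?:([A-Za-z_]\w*)|([+*]))")


def tokenize(text):
    """Split the input into (kind, value) pairs, kind being 'ID' or 'OP'."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input near position {pos}")
        kind = "ID" if match.group(1) else "OP"
        tokens.append((kind, match.group(1) or match.group(2)))
        pos = match.end()
    return tokens


def parse_exp(text):
    """Accept exactly ID OP ID and return the tree (operator, left, right)."""
    tokens = tokenize(text)
    if [kind for kind, _ in tokens] != ["ID", "OP", "ID"]:
        raise SyntaxError(f"not a valid expression: {text!r}")
    (_, left), (_, operator), (_, right) = tokens
    return (operator, left, right)


print(parse_exp("a + b"))  # ('+', 'a', 'b')
print(parse_exp("x*y"))    # ('*', 'x', 'y')
```

Input that does not fit the rule, for example "a +" or "a - b", raises `SyntaxError` instead of producing a tree.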



Section II. The UML 2.0 OCL

2.1. OCL & Its Relationship to Metamodeling presentation slides

Maintained by Victoria Yang.