CSC 530 Lecture Notes Week 5

CSC 530 Lecture Notes Week 5
More on Formal Semantics with Attribute Grammars




  1. Attribute semantics of real programming languages
    1. The examples in last week's notes are illustrative but pretty trivial
    2. In these notes we investigate the attribute semantics of larger language called SIL -- a simple imperative language.

  2. Practical attribute semantics meta-languages
    1. The technique of attribute semantics as originally developed by Donald Knuth is not 100% rigorous.
      1. In particular, the meta-language for expressing the various components of the definition was not fully formalized.
      2. In order to make attribute definitions precise, the meta-language conventions must be defined.
    2. The syntactic meta-language
      1. The meta-language we will use is based on YACC.
      2. The $n notation is used to refer unambiguously to grammar symbols.
      3. The semantic equations are associated with grammar rules using the same format as YACC; viz., the equations are enclosed in curly braces.
    3. The semantic meta-language
      1. Our semantic meta-language is based on ML.
      2. Attributes are defined as ML types.
      3. The RHS of semantic equations are ML expressions (with one enhancement as described below).
      4. Auxiliary functions are defined as ML functions.
    4. Additional meta-language notation
      1. The basic format of a semantic equation is:
        $n.attr = expression
        
        where expression is a standard ML expression, with the addition of attribute access terms, of the form
        $n.attr
        
        which may appear anywhere that an expression is syntactically valid.
      2. In the Yacc-style "$" notation, the term "$$" refers to the LHS non- terminal in a syntax rule and the term "$i" refers to the ith constituent of the RHS of a syntax rule.
      3. ML types are extended with built-in nil and error values. I.e., for each semantic attribute type T, there exists nil_T and error_T, which represent the nil and error values of type T.

  3. Circularity in attribute definitions
    1. Circular definitions often arise in practice when the root of a subtree seeks to modify an inherited attribute, and pass up the modified attribute value as a synthesized attribute of the same name.
    2. E.g.,
      A : B
          {$1.x = $$.x
           $$.x = $1.x}
      
    3. In a standard attribute definition, any circularities render the entire definition ill-formed 1.
    4. Circularities that arise in the course of an attribute definition can eliminated by attribute splitting
      1. A single attribute, say x, is decomposed into two attributes, say x and x', where x is inherited and x' is synthesized.
      2. Using this splitting technique, the above circular definition would be rewritten as follows:
        A : B
            {$1.x = $$.x
             $$.x' = $1.x'}
        
    5. Attribute splitting is used significantly in the SIL definition coming up.

  4. Inherited and synthesized attributes in real programming languages
    1. In practice with real programming languages, attribute dependencies are defined as shown in Figure 1


      Figure 1: Attribute flow in languages like SIL.



    2. With such forms of dependency, the attribute evaluation can be accomplished in one depth-first pass over the tree.
    3. This is the case with the definition of SIL that follows.
    4. Certain advanced language features, overloading for example, require that more than one pass be used; we will discuss features in an upcoming lecture.
    5. A general multi-pass attribute evaluation algorithm is discussed in the Bochman paper.

  5. The attribute semantics of a simple imperative language

    /*
     * This is a Yacc-style attribute grammar for type checking and interpretation
     * of a simple imperative language (SIL).  SIL has the same basic semantics as
     * Lisp with setq.  The differences between SIL and imperative Lisp are: (1)
     * SIL has more Pascal-like syntax; (2) SIL has explicit type declarations for
     * variables and function parameters; (3) SIL distinguishes between statements
     * and expressions, where statements are executed solely for there effect on
     * the store, and do not return a value.
     *
     * The following semantic attributes are used in the SIL definition.  The meta
     * notation for semantic data definitions and auxiliary functions is standard
     * ML, except for the "one of" notation, which is a simplified form of the ML
     * union-defining datatype construct.
     *
     *      NAME    DESCRIPTION
     *      ===================================================================
     *
     *      state   A tuple of the form (env, store).
     *
     *      env     A list of the form [ env_binding ...].  Note that env is
     *              actually two separate attributes, denoted env and env`.  Env is
     *              an inherited attribute representing the incoming environment;
     *              env` is a synthesized attribute representing the outgoing
     *              environment.
     *
     *      store   A list of the form [ act_rec ... ].  As with env, store is
     *              actually two attributes: an inherited store and a synthesized
     *              store`.  Note the store is managed in a LIFO discipline, with
     *              the bottomost (earliest added) act_rec representing the global
     *              store, and the topmost (most recently added) act_rec
     *              representing the current function activation record.
     *
     *      env_binding
     *              A tuple of the form (name, def).
     *
     *      def     One of var_def or fun_def.
     *
     *      var_def
     *              A type.
     *
     *      fun_def
     *              A triple form (type, formals, body).
     *
     *      formals
     *              A list of the form [ var_binding ... ] used in a function def.
     *
     *      var_binding
     *              A tuple of the form (name, var_def).
     *
     *      type    One of "integer", "real", "string", "boolean", or "OK".
     *
     *      body    A function (env*store)->store`.
     *
     *      act_rec
     *              A list of the form [ value_binding, ... ]
     *
     *      value_binding
     *              A tuple of the form (name, value).
     *
     *      value   One of integer or real or string or boolean, where these are
     *              considered primitive value types of the meta-language.
     *
     *      op_fun  A function (value*value)->value representing the built-in
     *              binary operators of the language.
     *
     *      op_fun_1 A function value->value representing the built-in unary
     *              operators of the language.
     *
     *      name    A string.
     *
     *      nil_X, error_X
     *              Attribute values built-in to the metalanguage for each
     *              attribute type X.  nil_X is the nil or empty value for
     *              attributes of type X; error_X is the error value for attributes
     *              of type X.
     *
     * The following auxiliary functions are used in the definition:
     *
     *      fun assoc(name, alist) =
     *          if null(alist) then nil_binding
     *          else if name = #1(hd(alist)) then hd(alist)
     *          else assoc(name, tl(alist))
     *
     *      fun last(l) = hd(nthtail(l, length(l)-1))
     *
     *      fun butlast(l) =
     *          if (null(l) orelse null(tl(l))) then nil
     *          else hd(l) :: butlast(tl(l))
     *
     *      fun reassign(name, value, alist) =
     *          if name = #1(hd(alist)) then (name, value) :: tl(alist)
     *          else hd(alist) :: reassign(name, value, tl(alist)
     *
     *      fun assign(name, value, alist) = (name, value) :: alist
     *
     *      fun chk_apply(fun_name, actual_types, env)
     *          let
     *              val fun_binding = assoc(fun_name, env)
     *              val formals = make_type_list(#2(fun_binding))
     *              val fun-type = #1(fun_binding)
     *          in
     *              if chk_bindings(formals, actuals) then
     *                  if fun_type = nil_type then
     *                      "OK"
     *                  else
     *                      fun_type
     *              else
     *                  error_type
     *          end
     *
     *      fun make_type_list(formals) =
     *          if formals = nil then nil
     *          else #2(hd(formals)) :: make_type_list(tl(formals))
     *
     *      fun chk_bindings(formals, actuals) =
     *          if formals = nil then true
     *          else (hd(formals) = hd(actuals)) and
     *                      chk_bindings(tl(formals), tl(actuals))
     *
     *      fun apply(fun_name, actuals, env, store) =
     *          let
     *              val fun_binding = assoc(fun_name, env)
     *              val fun_body = #3(fun_binding)
     *              val formals = make_name_list(#2(fun_binding))
     *              val bindings = bind(formals, actuals)
     *          in
     *                  fun_body(env, bindings @ store)
     *          end
     *
     *      fun make_name_list(formals) =
     *          if formals = nil then nil
     *          else #1(hd(formals)) :: make_name_list(tl(formals))
     *
     *      fun bind(formals, actuals) =
     *          if formals = nil then nil
     *          else (hd(formals), hd(actuals)) :: bind(tl(formals), tl(actuals))
     *
     *      fun functionize(tree,ins,outs) = a meta-function that transforms an
     *          attributed parse tree denoted by T into a function
     *              fT(ia<1>*...*ia<m>)->(sa<1>*...*sa<n>)
     *          where ia<i> are some or all of the inherited attributes of T and
     *          sa<j> are some or all of the synthesized attributes of T, as
     *          specified by ins and outs, resp.  In practice within this
     *          definition, functionize is used to transform the parse tree of a
     *          function body into a function fn:(env*store)->store`.
     *
     *      fun init_env() = []
     */
    
    %token PROGRAM
    %token END
    %token VAR
    %token INTEGER
    %token REAL
    %token CHAR
    %token BOOLEAN
    %token IDENTIFIER
    %token PROCEDURE
    %token BEGIN
    %token IF
    %token THEN
    %token ENDIF
    %token ELSE
    %token OR
    %token AND
    %token LEQ
    %token GEG
    %token NEW
    %token REALVAL
    %token INTEGERVAL
    %token CHARVAL
    %token BOOLVAL
    
    %left '=' '<' '>' LEQ GEQ NEQ
    %left '+' '-' OR
    %left '*' '/' AND
    
    %%
    
    program         : PROGRAM decls stmts END
                            {$2.env = init_env()
                             $3.env = $2.env`
                             $3.store = []
                             $$.state = if ($2.type != error_type) and
                                                ($3.type != error_type) then
                                            ($2.env`, $3.store`)
                                        else
                                            error_state}
                    ;
    
    decls           : /* empty */
                            {$$.env` = []
                             $$.type = "OK"}
                    | decl ';' decls
                            {$1.env = $$.env
                             $3.env = $1.env`
                             $$.env` = $1.env` @ $3.env`
                             $$.type = if ($1.type != error_type) and
                                               ($3.type != error_type) then
                                           "OK"
                                       else
                                           error_type}
                    ;
    
    decl            : vardecl
                            {$$.env` = $1.env`
                             $$.type = $1.type}
                    | procdecl
                            {$1.env = $$.env
                             $$.env` = $1.env`
                             $$.type = $1.type}
                    ;
    
    vardecl         : VAR vars ':' type
                            {$2.type = $4.type
                             $$.env` = $2.env`
                             $$.type = $4.type}
                    ;
    
    type            : INTEGER
                            {$$.type = "integer"}
                    | REAL
                            {$$.type = "real"}
                    | CHAR
                            {$$.type = "char"}
                    | BOOLEAN
                            {$$.type = "boolean"}
                    ;
    
    vars            : var
                            {$$.env` = [($1.name, $$.type)]}
                    | var ',' vars
                            {$3.type = $$.type
                             $$.env` = [($1.name, $$.type)] @ $3.env`}
                    ;
    
    var             : IDENTIFIER
                            {$$.name = $1.name}
                               /* NOTE: The lexer provides ident string names. */
                    ;
    
    procdecl        : PROCEDURE prochdr procbody
                            {$3.env = $2.formals @ $$.env
                             $$.env` =
                                [($2.name, nil_type, $2.formals, $3.fun_body)]
                             $$.type = $3.type}
                    | PROCEDURE prochdr ':' type procbody
                            {$5.env = $2.formals @ [($2.name, $4.type)] @ $$.env
                             $$.env` =
                                [($2.name, $4.type,
                                     $2.formals @ [($2.name, $4.type)],
                                               /*  ^^^^^^^^^^^^^^^^^^ return val */
                                     $5.fun_body)]
                             $$.type = $5.type}
                    ;
    prochdr         : IDENTIFIER '(' formals ')'
                            {$$.name = $1.name
                             $$.formals = $3.formals}
                    ;
    
    formals :        /* empty */
                            {$$.formals = []}
                    | formal
                            {$$.formals = [$1.env_binding]}
                    | formal ',' formals
                            {$$.formals = $1.env_binding @ $3.formals}
                    ;
    
    formal          : var ':' type
                            {$$.env_binding = ($1.name, $3.type)}
                    ;
    
    procbody        : BEGIN stmts END
                            {$$.type = $2.type
                             $$.fun_body = functionize($2,(env*store),store`)}
                    ;
    
    stmts           : stmt ';'
                            {$1.env = $$.env
                             $1.store = $$.store
                             $$.type = $1.type
                             $$.store` = $1.store`}
                    | stmt ';' stmts
                            {$1.env = $3.env = $$.env
                             $$.type = if $1.(type = "OK") and ($3.type = "OK")
                                            then "OK" else error_type
                             $1.store = $$.store
                             $3.store = $1.store`
                             $$.store` = $3.store`}
                    ;
    
    stmt            : /* empty */
                    | var ':=' expr
                            {$3.env = $$.env
                             $$.type = if #2(assoc($1.name, $$.env)) = $3.type
                                            then "OK" else error_type
                             $3.store = $$.store
                             $$.store` =
                                 if (length($3.store`) > 1) andalso
                                         assoc($1.name, hd($3.store`)) then
                                     reassign($1,name, $3.value, hd($3.store`)) @
                                         tl($3.store`)
                                 else if assoc($1.name, last($3.store`)) then
                                     butlast($3.store`) @
                                       reassign($1.name, $3.value, last($3.store`))
                                 else
                                     butlast($3.store`) @
                                        assign($1.name, $3.value, last($3.store`))}
                    | IDENTIFIER '(' actuals ')'
                            {$$.type = if chk_apply($1,name, $3.types, $$.env)
                             $$.store` = tl(apply(
                                 $1.name, $3.values, $$.env, $$.store))}
                    | IF expr THEN stmts ENDIF
                            {$2.env = $4.env = $$.env
                             $$.type =
                                 if $2.type = "boolean"
                                     then $4.type else error_type
                                                     (* NOTE WEAKNESS HERE *)
                             $4.store = $2.store`
                             $$.store` = if $2.value then $4.store` else $2.store`}
                    | IF expr THEN stmts ELSE stmts ENDIF
                            {$2.env = $4.env = $6.env = $$.env
                             $$.type =
                                 if $2.type = "boolean"
                                     then if $4.type = "OK" and $6.type = "OK"
                                          then "OK"
                                          else error_type (* NOTE WEAKNESS HERE *)
                             $4.store = $6.store = $2.store`
                             $$.store` = if $2.value then $4.store` else $6.store`}
                    ;
    
    expr            : number
                            {$$.type = $1.type
                             $$.store` = $$.store
                             $$.value = $1.value}
                    | char
                            {$$.type = $1.type
                             $$.store` = $$.store
                             $$.value = $1.value}
                    | bool
                            {$$.type = $1.type
                             $$.store` = $$.store
                             $$.value = $1.value}
                    | var
                            {$$.type =
                                 if assoc($1.name, $$.env) then
                                     #2(assoc($1.name, $$.env))
                                 else
                                     error_type
                             $$.store` = $$.store
                             $$.value =
                                 if (length($$.store) > 1) and also
                                        assoc($1.name, hd($$.store)) then
                                    #2(assoc($1.name, hd($$.store)))
                                 else if assoc($1.name, last($$.store)) then
                                    #2(assoc($1.name, last($$.store)))
                                 else
                                    error_value}
                   | IDENTIFIER '(' actuals ')'
                            {$3.env = $$.env
                             $3.store = $$.store
                             $$.type = chk_apply($1,name, $3.types, $$.env)
                             $$.store` = tl(apply(
                                 $1.name, $3.values, $$.env, $$.store))
                             $$.value = last(hd(apply(
                                 $1.name, $3.values, $$.env, $$.store)))}
                    | expr rel_op expr      %prec '<'
                            {$1.env = $3.env = $$.env
                             $$.type =
                                 if ($1.type = $2.type)
                                 then $1.type
                                 else error_type
                             $1.store = $$.store
                             $3.store = $1.store`   (* NOTE *)
                             $$.store` = $3.store `
                             $$.value = $2.op_fun($1.value, $3.value)}
                    | expr add_op expr      %prec '+'
                            {$1.env = $3.env = $$.env
                             $$.type =
                                 if ($1.type = $2.type) and
                                   (($1.type = "real") or ($1.type = "integer"))
                                 then $1.type
                                 else error_type
                             $1.store = $$.store
                             $3.store = $1.store`
                             $$.store` = $3.store `
                             $$.value = $2.op_fun($1.value, $3.value)}
                    | expr mult_op expr     %prec '*'
                            {$1.env = $3.env = $$.env
                             $$.type =
                                 if ($1.type = $2.type) and
                                    (($1.type = "real") or ($1.type = "integer"))
                                 then $1.type
                                 else error_type
                             $1.store = $$.store
                             $3.store = $1.store`
                             $$.store` = $3.store `
                             $$.value = $2.op_fun($1.value, $3.value)}
                    | '(' expr ')'
                            {$2.env = $$.env
                             $$.type = $2.type
                             $2.store = $$.store
                             $$.store` = $2.store`
                             $$.value = $2.value}
                    ;
    
    add_op          : '+'   {$$.op_fun = $1.op_fun}
                    | '-'   {$$.op_fun = $1.op_fun}
                    | OR    {$$.op_fun = $1.op_fun}
                               /* NOTE: The lexer provides function literals. */
                    ;
    
    mult_op         : '*'   {$$.op_fun = $1.op_fun}
                    | '/'   {$$.op_fun = $1.op_fun}
                    | AND   {$$.op_fun = $1.op_fun}
                    ;
    
    rel_op          : '<'   {$$.op_fun = $1.op_fun}
                    | '>'   {$$.op_fun = $1.op_fun}
                    | '='   {$$.op_fun = $1.op_fun}
                    | LEQ   {$$.op_fun = $1.op_fun}
                    | GEQ   {$$.op_fun = $1.op_fun}
                    | NEQ   {$$.op_fun = $1.op_fun}
                    ;
    
    actuals         : /* empty */
                            {$$.types = []
                             $$.store` = $$.store
                             $$.values = []}
                    | actual
                            {$1.env = $$.env
                             $1.store = $$.store
                             $$.types = [$1.type]
                             $$.store` = $1.store`
                             $$.values = [$1.value]}
                    | actual ',' actuals
                            {$1.env = $3.env = $$.env
                             $1.store = $$.store
                             $3.store = $1.store`   /* NOTE sequential eval */
                             $$.store` = $3.store`
                             $$.values = $1.value @ $3.values}
                    ;
    
    actual          : expr
                            {$$.type = $1.type
                             $$.store` = $1.store`
                             $$.value = $1.value}
                    ;
    
    number          : real
                            {$$.type = $1.type
                             $$.value = $1.value}
                    | integer
                            {$$.type = $1.type
                             $$.value = $1.value}
                    ;
    
    real            : REALVAL
                            {$$.type = "real"
                             $$.value = $1.value}
                               /* The lexer provides real literals. */
                    ;
    
    integer         : INTEGERVAL
                            {$$.type = "integer"
                             $$.value = $1.value}
                               /* The lexer provides integer literals. */
                    ;
    
    char            : CHARVAL
                            {$$.type = "char"
                             $$.value = $1.value}
                               /* The lexer provides char literals. */
                    ;
    
    bool            : BOOLVAL
                            {$$.type = "boolean"
                             $$.value = $1.value}
                               /* The lexer provides boolean literals. */
                    ;
    



  6. Tracing the SIL attribute definition
    1. Attached are a couple copies of a parse tree for a sample SIL program (two copies in paper notes, one in online HTML).
    2. We will "decorate" these trees in class with some sample traces of attribute evaluation.





index | lectures | handouts | assignments | examples | doc | solutions | bin

    

Footnotes:

____________________ 1 We will discuss later in the quarter how circular attribute definitions can be meaningfully interpreted; for now they are considered a malformation.