CSC 530 Lecture Notes Week 3, Part 2
Discussion of Assignments 1 and 2
More on Type Theory




  1. The "naive" alist layout, from Assignment 1

    1. To review, the alist is fundamentally a list of bindings of the form ( name value ).
    2. In assignment 1, bindings are subdivided into two categories:

      1. A variable binding is a pair of the form:
        ( var-name data-value )
        

      2. A function binding is a triple of the form:
        ( function-name formal-parms function-body )
        

    3. We distinguish variable from function bindings by their lengths -- two and three, respectively.
    4. In the Lisp of Assignment 1, bindings can be created and modified in three ways:

      1. Variable bindings are created and modified by (setq x v)
      2. Function bindings are created and modified by (defun f parms body), in precisely the same way as variable bindings are by setq; namely, the triple ( f parms body ) is added to the alist.
      3. Function call bindings (a.k.a., activation records) are created and removed by function calls of the form (f a1 ... an)

    5. In all cases above, bindings are added, removed, and searched for using a LIFO discipline (see the sketch at the end of this section).
    6. What is naive about the above alist organization is that it does not accurately represent the scoping rules of Common Lisp, as we shall now see.
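    7. As a concrete illustration, here is a minimal sketch in Lisp of the naive alist and its LIFO handling. The binding formats are the ones above, but the global *alist* and the helpers add-binding and lookup-binding are invented for this sketch, not part of the assignment.
      ;; Naive alist: newest bindings at the front, searched front to back
      ;; (equivalent to appending at the end and searching backward).
      ;; Variable bindings have length two, function bindings length three.
      (defvar *alist* '())
      
      (defun add-binding (binding)
        "Push a binding onto the front of the alist (LIFO addition)."
        (push binding *alist*))
      
      (defun lookup-binding (name)
        "Return the most recently added binding for NAME."
        (assoc name *alist*))
      
      ;; E.g., after (setq x 1), (defun f (y) (g)), and the call (f 10):
      (add-binding '(x 1))        ; variable binding from setq
      (add-binding '(f (y) (g))) ; function binding from defun
      (add-binding '(y 10))       ; activation-record binding for the call
      (lookup-binding 'y)         ; => (Y 10) -- the newest binding wins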

  2. The Lisp alist as an environment and store.

    1. In upcoming lectures, we will discuss the formal definition of two semantic structures called an environment and store.

      1. The environment holds static attributes of a program, such as the types of variables, scoping information, and results of static declarations.
      2. The store holds runtime values that are computed dynamically as the program executes.

        1. A stack store holds values computed by and for function activations.
        2. A state store holds values assigned to global variables via imperative assignment (i.e., setq).

    2. In pure untyped Lisp, the environment and store can be combined into a single alist structure, since both declarations and executable expressions are evaluated dynamically.
    3. Doing so requires imposing some additional structure on the alist, which we now investigate.

  3. A less naive alist layout.

    1. There are a number of ways to physically lay out the environment and store in an alist structure.
    2. Here is one:
      (  ( state-store )
         ( environment )
         ( stack-store ) )
      
      where the stack-store is further subdivided into sub-alists, one per active function invocation.
    3. Each component of the alist holds the same form of bindings as before, but now they are organized into separate areas:

      1. The state-store holds only bindings made by setq.
      2. The environment holds only bindings made by defun.
      3. The stack-store holds only bindings made and removed for function calls by apply and bind.

    4. The simple LIFO management discipline of the alist is now replaced with the following scheme, which accurately reflects Common Lisp scoping rules (a Lisp sketch of the lookup follows the list):

      1. The stack-store is still managed in a LIFO discipline.
      2. The state-store and environment can be managed in any discipline.
      3. When a variable binding is searched for, the search proceeds in two phases:
        1. First, the topmost activation record is searched.
        2. If the binding is not located there, then the state-store is searched.

      4. When a function binding is searched for, only the environment is searched.
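    5. Here is a minimal sketch in Lisp of this two-phase lookup over the structured alist. The three-part layout is the one shown above; the function names and the use of plain assoc are assumptions for illustration, not a required interface.
      ;; Structured alist: (state-store environment stack-store), where
      ;; the stack-store is a list of activation records, newest first.
      (defun lookup-variable (name alist)
        "Phase 1: search the topmost activation record; phase 2: the state-store."
        (let ((state-store (first alist))
              (stack-store (third alist)))
          (or (assoc name (first stack-store))
              (assoc name state-store))))
      
      (defun lookup-function (name alist)
        "Function bindings live only in the environment."
        (assoc name (second alist)))
      
      ;; E.g., with x and y global and y also bound in the topmost record:
      ;; (lookup-variable 'y '(((x 1) (y 2)) ((f (y) (g))) (((y 10)))))
      ;;   => (Y 10) -- the record's binding shadows the global (y 2)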

  4. Comparing the naive versus non-naive alist layouts.

    1. So, what exactly is wrong with the naive alist layout?
    2. In a word (or two words, actually), it defines dynamic scoping instead of static scoping.
    3. Consider:
      (setq x 1)
      (setq y 2)
      (defun f(y) (g))
      (defun g() (+ x y))
      

      1. With static scoping, both (f 10) and (g) return the same result -- 3.
      2. With dynamic scoping, (f 10) returns 11, whereas (g) returns 3.

    4. The cause of this behavior is traceable directly to how the naive alist handling is performed.

      1. When the search for a variable's binding is made, we simply start at the end of the alist, and search backward until we find the nearest (i.e., most recently added) binding.
      2. This means that if there are both stack-store and state-store bindings for a free variable in a function, we'll find the stack binding first, and use it.
      3. But note that this means we may find it in some other function's activation record.
      4. This explains the above behavior when calling function g in a dynamically scoped interpreter, as the following trace shows:
        1. When g is called by itself from the top level, the alist contains only the global bindings of x and y, with values 1 and 2.
        2. When g is called from within f, the alist contains the global bindings plus the dynamically more recent binding of y in f's activation record, with value 10.
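        3. To make the trace concrete, here is the naive alist at the moment g's body evaluates, shown as Lisp data with the newest bindings listed first (the ordering convention is an assumption of this sketch):
          ;; After the four top-level forms, the naive alist holds:
          ;;   ((g () (+ x y)) (f (y) (g)) (y 2) (x 1))
          ;;
          ;; (g) from the top level: no activation-record bindings, so
          ;; the free x and y resolve to the globals: (+ 1 2) => 3
          ;;
          ;; (f 10): the call pushes (y 10), giving
          ;;   ((y 10) (g () (+ x y)) (f (y) (g)) (y 2) (x 1))
          ;; g's free y now finds (y 10) before (y 2): (+ 1 10) => 11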

    5. In contrast to the dynamic behavior of the naive alist, the binding rule for the non-naive alist searches only the topmost activation record and then, if the binding is not found there, the state-store.
    6. Was dynamic scoping ever used in a real programming language?

      1. Sure -- before Common Lisp came along, dynamic scoping was the Lisp standard.
      2. And the reason stems directly from the ease of interpreting dynamic scoping vis-a-vis static scoping.

    7. Some things to consider about dynamic scoping:

      1. While dynamic scoping looks pretty silly from a C programmer's perspective, when we look at the details of evaluation it makes sense -- and good sense at that.
      2. Dynamic scoping and strong typing do not get along well at all (think about this, and we'll discuss further soon).
      3. With the "Pascalization" of the programming language world, dynamic scoping is a relic of the past from a practical standpoint.

  5. Another view of Assignments 1 and 2 -- the fundamental semantics of programming languages.

    1. A continuing major objective of the assignments is to investigate the fundamental semantics of programming languages.
    2. I.e., what do programming constructs really mean?

      1. What does it really mean to define and call a function in a program?
      2. What does an assignment statement mean?
      3. What does it mean to have assignment or not have assignment in a programming language?
      4. What is the difference between the semantics of an imperative language versus an applicative language?
      5. What does it mean for a language to be strongly, weakly, dynamically, statically, and/or polymorphically typed?

    3. In assignments 1 and 2, we are looking at semantics from an operational point of view.

      1. Operational semantics are defined by a program or machine that does what a programming language means.
      2. In this sense, compilers and interpreters are forms of operational semantic definition.

    4. Why interpreters and why Lisp?

      1. We have chosen interpreters for our study of operational semantics since they are quicker and easier to write than compilers.
      2. We have chosen Lisp as our linguistic vehicle since its simple syntax allows us to avoid messy details of syntactic processing (i.e., lexical analysis and parsing).
      3. We have also chosen Lisp since it allows us to compare and contrast applicative versus imperative semantics more easily than with other languages.

  6. Some hints on Assignment 2 (see the writeup).
    1. Put all type bindings in the environment, along with xdefun'd functions.
      1. Type bindings are pairs of the form ( name type ).
      2. They are independent of ( name value ) bindings, which are put in one of the stores (state or stack).
      3. E.g., for the global declaration
        (xdefvar x int 10)
        
        the type binding (x int) would be put in the environment and the value binding (x 10) would be put in the state-store.
    2. Structurally, the xcheck function mirrors what xeval does
      1. You can define functions of the form check-X, each of which is analogous to one of the eval-X functions.
      2. E.g., check-xsetq is the type checking analog of eval-xsetq.
    3. The externally visible equiv function is used internally in the type checker to check if two types are equivalent.
    4. The Assignment 2 test data do not contain any really hairy test cases for overloaded or polymorphic functions; these will come in Assignment 4 when we implement the operational semantics of inheritance.
    5. Type checking the built-in composite type functions (e.g., index, setelem, etc.):
      1. Method 1: implement a separate check-X function for each (e.g., check-index, check-setelem, etc.).
      2. Method 2: preload the alist with the signatures of these built-ins and have them all uniformly checked in check-apply.
    6. Note the two limited contexts of type checking:
      1. xsetq, and function call (of a new built-in or of an xdefun'd function).
      2. And if you preload the alist with signatures of xsetq and the new built-ins, checking can all be done in one place -- viz., check-apply (sketched below).
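    7. As a sketch of that single checking point, here is one possible shape for check-apply in Lisp. The signature format ((formal-types) return-type), the env argument, and the call (xcheck expr env) are assumptions for illustration; only equiv is taken from the assignment writeup.
      ;; Preloaded environment entries pair a name with a signature,
      ;; e.g. (+ ((int int) int)) -- the exact format is assumed here.
      (defun check-apply (fname actuals env)
        "Check a call: each actual's type must be equiv to the declared
      formal type; the type of the call is the signature's return type."
        (let* ((sig (second (assoc fname env)))
               (formal-types (first sig)))
          (unless (and sig
                       (= (length actuals) (length formal-types))
                       (every (lambda (actual formal-type)
                                (equiv (xcheck actual env) formal-type))
                              actuals formal-types))
            (error "Type error in call to ~a" fname))
          (second sig)))   ; the result type of the call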

    Now on to Further Discussion of Type Theory

  7. Relevant readings -- papers 9, 10, and 11 (in addition to paper 8).

  8. Recap of the kinds of typedness
    1. Recall from Notes 3 the following kinds of typedness (subtyping has been added):
      1. Strong versus weak typing
      2. Static versus dynamic typing
      3. Monomorphic versus polymorphic
      4. Encapsulated versus flat
      5. Subtyped versus non-subtyped
      6. Generic versus non-generic
    2. We have already discussed some details of items 1-3; the operational semantics of these typing issues are addressed in Assignment 2.
    3. We will now discuss some details of items 4-6; the operational semantics of these typing issues are addressed in Assignment 4. (FYI, we will take a break from programming in Assignment 3.)

  9. Type encapsulation
    1. Type encapsulation is fundamentally the means used in a PL to construct an abstract datatype.
    2. Recall from your earlier (i.e., undergrad) studies that an abstract data type (ADT) can be defined as follows:
      1. A (hidden) type of values, called the ADT's representation.
      2. A set of functions (with hidden implementations), called the ADT's operations.
    3. This definition of ADT requires that a PL supporting ADTs must provide the following features, in some form:
      1. Support for information hiding, so that some or all of the representation and operations can be hidden from users outside of the ADT's definition.
      2. A packaging construct, in which to house the representation and operation definitions.
    4. Information hiding features in programming languages differ syntactically, but are fundamentally the same semantically.
    5. Packaging constructs differ both syntactically and semantically.
      1. For now, the syntactic distinctions are of no concern to us.
      2. The most important semantic distinction in the encapsulation mechanisms of PLs is whether the ADT construct denotes a type.
      3. When an ADT encapsulation construct is also a type construct, the construct is called a first-class type.
      4. When an ADT encapsulation construct is not a type construct, it is second-class.

  10. A brief survey of information hiding forms in common PLs.
    1. Simula -- hidden/non-hidden, single class body
    2. Modula-2 -- import/export, two-part module body
    3. Ada -- private/non-private, two- or three-part package body
    4. C++ -- public/private/protected, crude one- or two-part class body
    5. Cardelli and Wegner existential types -- quantified variables, one-part representation (as a body)

  11. Examples of second-class encapsulation constructs -- Modula-2 modules, Ada packages, ML modules.
    1. In languages with second-class type encapsulation, the declaration of an ADT is completely independent of any type declaration.
    2. Consider the following example from Modula-2
      definition module Stack;
          type Stack;
          procedure Push(var s: Stack; elem: integer);
          procedure Pop(var s: Stack): integer;
          procedure Peek(s: Stack): integer;
      end Stack.
      
      implementation module Stack;
          const Size = 100;
          type Stack = array[1..Size] of integer;
          var curtop: integer;
          (* ... implementations of Push, Pop, and Peek *)
      end Stack.
      
      (* program *) module TestIntStack;
          import Stack;
          var s: Stack;
              i: integer;
      begin
          Stack.Push(s, 1);
          i := Stack.Pop(s);
      end TestIntStack.
      
    3. The following are noteworthy features of this example:
      1. The module named "Stack" is completely distinct from the type named "Stack" (this is what makes Modula-2's module construct a second-class type constructor).
      2. Information hiding is achieved (at least in part) by a two-part module specification.
        1. All identifiers declared in the definition part of the module are visible outside of the module.
        2. All identifiers declared in the implementation part of the module are hidden outside of the module.
    4. Note the calling form of ADT operations:
      Stack.Push(s, 1);
      i := Stack.Pop(s);
      
      wherein the module name is used to access the function, and an explicit value of the abstract type is the first actual parameter.

  12. Examples of first-class encapsulation constructs -- Simula, Smalltalk, C++, and Java classes.
    1. In languages with first-class type encapsulation, the declaration of an ADT declares a type.
    2. Consider the following C++ example
      class Stack {
        public:
          void Push(int elem);
          int Pop();
          int Peek();
        protected:
          static const int Size = 100;
          int curtop;
          int body[Size];
      };
      
      /* ... implementations of Push, Pop, and Peek */
      
      int main() {
          Stack s;
          int i;
      
          s.Push(1);
          i = s.Pop();
      }
      
    3. The following are noteworthy features of this example:
      1. The class named "Stack" is a type named "Stack" (this is what makes C++'s class construct a first-class type constructor).
      2. Information hiding is achieved by explicit keywords.
      3. Note the calling form of ADT operations:
        s.Push(1);
        i = s.Pop();
        
        wherein a value (a.k.a. object) of the class type is used to access the function, which value serves also as an (implicit) actual parameter.

  13. A theoretical idea from the Danforth and Tomlinson paper -- State-free ADTs.
    1. Consider the following variant of the earlier C++ stack (this is not legal C++, but uses C++ syntax for convenience):
      class Stack {
        public:
          Stack();
          void push(int elem);
          void pop();
          int top();
      
        equations:
          pop(Stack()) = null;
          pop(s.push(e)) = s;
      
          top(Stack()) = null;
          top(s.push(e)) = e;
      };
      
      /* NO implementations of push, pop, and top */
      
      int main() {
          Stack s;
          int i;
      
          s = s.push(1);
          i = s.pop();
      }
      
    2. The following are noteworthy features of this example:
      1. This is still a first-class ADT.
      2. Information hiding of the representation is unnecessary, because there is no representation!
      3. There are no operation implementations necessary.
      4. Note the side-effect-free calling form of ADT operations, which is otherwise the same as standard C++:
        s = s.push(1);
        i = s.pop();
        
    3. Does this mean anything?
      1. Most definitely yes.
      2. It's C++ syntax for the OBJ algebraic language we'll look more closely at later in the quarter.

  14. Object-orientedness, part 1
    1. So, what relationship is there between ADTs and object orientedness (OO)?
    2. Partial answer: there is no debate whatsoever that one aspect of OO is support for ADTs.
    3. There is some debate as to whether a first-class ADT feature is necessary for a language to be truly OO.
    4. Most people say yes (i.e., first-class ADT is necessary).
    5. However, there are prime examples such as Grady Booch's "Object-Oriented Programming in Ada" that suggest a language with only second-class ADTs is just fine for OOP.

  15. Subtyping
    1. A language with a subtyping capability allows one type to be a parent type from which one or more child types may inherit properties (a minimal Lisp sketch of the subtype relation follows at the end of this section).
    2. An immediate question is "Does subtyping require an ADT feature?"
      1. The theoretical answer is definitely no.
      2. However in practice, almost all languages that support subtyping do so in the context of a first-class ADT mechanism.
      3. A notable real-language exception is OBJ, which again we'll look at later in the quarter.
      4. Also, the Cardelli and Danforth papers discuss subtyping at length outside of the context of encapsulated ADTs.
    3. There are a wide variety of issues related to subtyping, as discussed extensively in the Cardelli, Danforth, and Taivalsaari papers; here is an overview:
      1. Multiple or single inheritance.
      2. Representation inheritance allowed or disallowed.
      3. Representation and/or operation overriding allowed or disallowed.
      4. Operation behavior inheritance supported or not supported.
      5. Strong, weak, static, or dynamic typing.
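    4. As a minimal illustration of the basic subtype relation, here is a Lisp sketch assuming single inheritance and an invented *parents* alist mapping each child type to its declared parent; none of these names come from the papers or assignments.
      ;; Assumed parent declarations: each entry is (child parent).
      (defvar *parents* '((circle shape) (square shape) (shape object)))
      
      (defun subtype-p (child parent)
        "True if CHILD is PARENT or inherits from it via the parent chain."
        (or (eq child parent)
            (let ((p (second (assoc child *parents*))))
              (and p (subtype-p p parent)))))
      
      ;; (subtype-p 'circle 'object) => T -- circle -> shape -> object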

  16. A brief survey of subtyping forms in common (and not-so-common) PLs.
    1. Simula -- single rep+ops inheritance, full override, strong static typing
    2. Smalltalk -- single rep+ops inheritance, full override, weakish dynamic typing
    3. Modula-2, Ada (pre 9X) -- no subtyping
    4. C++ -- multiple rep+ops inheritance, full override, weakish mostly static typing
    5. Java -- single reps inheritance (classes), multiple ops inheritance (interfaces), full override, strong mostly static typing
    6. Emerald, Owl -- single ops-only inheritance, no override, strong static typing, behavior inheritance

  17. Generics.
    1. A language with generics is one in which some form of parameterized type is definable.
    2. As with subtyping, almost all languages provide generic typing in the context of an ADT framework:
      1. Unencapsulated parameterized types are perfectly reasonable in theory, as discussed in the papers.
      2. The language Euclid is a notable real-language exception in which parameterized types can be defined outside of the context of an ADT.
    3. Here is a Modula-2esque example, in an invented syntax, to illustrate the basic concepts of generics:
      (* A generic stack module. *)
      definition module Stack(Size: integer, ElemType: type);
          type Stack;
          procedure Push(var S: Stack; Elem: ElemType);
          procedure Pop(var S: Stack): ElemType;
          procedure Peek(S: Stack): ElemType;
      end Stack.
      
      implementation module Stack;
          type Stack = array[1..Size] of ElemType;
          var curtop: integer;
          (* ... implementations of Push, Pop, and Peek *)
      end Stack.
      
      (* program *) module TestIntStack;
          import Stack(100, integer);
          var S: Stack;
              i: integer;
      begin
          Stack.Push(S, 1);
          i := Stack.Pop(S);
      end TestIntStack.
      
      
      (* program *) module TestThreeStacks;
          import Stack(100, integer) as IntStack100;
          import Stack(200, integer) as IntStack200;
          import Stack(200, real) as RealStack200;
      
          var SI100: IntStack100.Stack;  (* A 100-elem integer stack *)
          var SI200: IntStack200.Stack;  (* A 200-elem integer stack *)
          var SR200: RealStack200.Stack; (* A 200-elem real stack *)
      begin
          IntStack100.Push(SI100, 1);    (* push a value on SI100 *)
          IntStack200.Push(SI200, 2);    (* push a value on SI200 *)
          RealStack200.Push(SR200, 2.5); (* push a value on SR200 *)
          (* etc. ... *)
      end TestThreeStacks.

  18. Object-orientedness, the complete picture
    1. So, what exactly constitutes OO?
    2. Here is Danforth and Tomlinson's take:
      If ADTs are so similar to objects, is it not possible that OOP is simply a programming model in which all data are abstract and all data manipulation is implemented via ADT operations? In fact, this is not an unreasonable conjecture ... . However, there is more to OOP than support of ADTs as first-class citizens -- in particular, inheritance.
    3. In summary, OOP = first-class ADTs + inheritance.
    4. It is almost universally accepted that a language does NOT need generics to be truly and completely OO.
    5. As an interesting conclusion for now, we note that Danforth and Tomlinson observe an inherent conflict of OOP, given two fundamentally conflicting goals of ADTs versus inheritance. Viz.,
      1. a primary goal of data abstraction is to hide information
      2. a primary goal of inheritance is to share information
    6. Does this mean that OOP is a crock?



