git @ Cat's Eye Technologies Dieter / 99161ea
Rename .markdown file extension to .md. Chris Pressey 1 year, 24 days ago
2 changed file(s) with 555 addition(s) and 555 deletion(s). Raw diff Collapse all Expand all
+0
-555
README.markdown less more
0 The Dieter Programming Language
1 ===============================
2
3 Description
4 -----------
5
6 Dieter (that's Dieter as in the German masculine given name Dieter, not
7 dieter as in "one who diets") is a little experimental programming
8 language which conflates *type qualifiers* with *modules*. In this
9 article, we'll first describe these mechanisms. Then we'll show how
10 their interaction produces something that resembles object-oriented
11 programming.
12
13 ### Type Qualifiers ###
14
15 A type qualifier is simply a keyword which may be placed before any type
16 expression, further specializing it. Type qualifiers should be familiar
17 to programmers of C, where *storage classes*, such as `const` and
18 `static`, are examples of type qualifiers. The cardinal rule of type
19 qualifiers in Dieter is this: **during assignment (or
20 parameter-passing), a variable (or parameter) whose type possesses some
21 type qualifier *only* accepts values of the same type that possess *at
22 least* that same qualifier.**
23
24 Some basic properties of type qualifiers follow (but first, for
25 concreteness, let's stipulate that Dieter has an assortment of primitive
26 types: `bool`, `int`, `rat`, and `string`. And `void` denoting the type
27 of the absence of any value.) The order that type qualifiers are given
28 in a type expression doesn't matter: if `beefy` and `gnarly` are type
29 qualifiers, then `beefy gnarly int` and `gnarly beefy int` are wholly
30 equivalent types. Neither does the number of occurrences matter:
31 `beefy gnarly beefy int` is equivalent to `beefy gnarly int`.
32 Algebraically speaking, type qualifiers are commmutative and idempotent.
33
34 You can't assign an `int` to a `beefy int` (or pass it to a procedure
35 expecting a `beefy int`,) but you can assign a `beefy int` to an `int`.
36 This might sound counter-intuitive initially, but after you think about
37 it a bit I'm sure you'll conclude it's appropriate: it's just like how
38 you can assign a `const int` to an `int` in C, but not an `int` to a
39 `const int`. `int` is more general, and will tolerate information being
40 lost in the copy, whereas `beefy int` is more specific, and will not
41 accept a value with insufficient information associated with it. Or, if
42 you prefer, while all `beefy int`s are `int`s, there are `int`s that
43 aren't `beefy` and thus can't be assigned to a variable which is
44 restricted to `beefy int`s.
45
46 ### Type Operators ###
47
48 In order to make use of type qualifiers in user programs, Dieter
49 provides type operators, similar to C's type-casting facilities, to
50 manipulate type qualifiers.
51
52 In fact, Dieter has only one explicit type operator, called `bestow`. It
53 can be used in an expression to endow its type with further type
54 qualifiers. `bestow` takes a type qualifier q and a value of some type t
55 and returns a new value of type q·t. (You can think of q·t as a kind of
56 pseudo-product type that obeys the algebraic laws given in the previous
57 section rather than those of conventional products.)
58
59 The complementary operation — stripping q·t back to t — is not
60 explicitly denoted; it happens automatically as needed. (This is the
61 underlying rule that was in effect, when we stated that a `beefy int`
62 can be assigned to an `int`.)
63
64 Note that type operators in no way alter the *value* of the expression
65 they are used in, only its *type*.
66
67 ### Type Variables ###
68
69 Dieter supports uniform polymorphic types. Type expressions can contain
70 type variables (which I'll denote with ♥ for no good reason) which can
71 unify with concrete types or other type variables during instantiation
72 of a language construct. In practice, this means that procedures can
73 have type variables in their parameter and return types; these variables
74 are replaced by types specified at the call site whenever a procedure
75 call is type-checked.
76
77 The scope of each type variable ranges over a single instance of a
78 single procedure. So `♥x` in the parameter list of `grunt` is the same
79 type as every occurrence of `♥x` anywhere in the definition of `grunt`
80 during that same invokation of `grunt`, but may be different from `♥x`
81 in other invokations of `grunt` (calls to `grunt` from different call
82 sites), and different from `♥x` appearing in other procedures entirely.
83
84 A complication for polymorphism in Dieter is that type variables range
85 not only over core type expressions, but *also* over type qualifiers.
86 That is, the types `beefy gnarly int` and `beefy ♥t` unify, and their
87 most general unifier is {`♥t` → `gnarly int`}. This has a few
88 implications for the unification algorithm; see [Appendix
89 A](#appendix_a) for a detailed discussion of these.
90
91 ### Modules ###
92
93 Dieter is a modular language, which has its usual meaning — a Dieter
94 program consists of a set of modules, each of which exposes only its
95 interface, not its implementation, to all other modules. In Dieter,
96 however, there is a notable twist: every type qualifier is defined by
97 some module which bears the same name. The key idea of Dieter is this:
98 **a type operator that affects a given qualifier may *only* be used in
99 the module which defines that qualifier.** The reasoning for this is
100 similar to the argument against using casts in the C language: the goal
101 of typing is to specify constraints on the program's structure and to
102 automatically check that they are met. If programmers are allowed to
103 muck with types willy-nilly, it defeats the purpose of having these
104 constraints in the first place. So the power to `bestow` a particular
105 type qualifier is only given out to the module "in charge" of that type
106 qualifier. This is a form of encapsulation: the internal details of some
107 thing (in this case, a type modifier) can only be manipulated by the
108 implementation of that thing.
109
110 ### Procedures ###
111
112 Each Dieter module consists of a set of procedure declarations. Any
113 procedure can be called from any procedure in any module; they are
114 comparable to `public static` methods in Java in this respect.
115 Procedures are polymorphic, as mentioned; each one's parameter and
116 return types can contain type variables. At each call site, the type
117 variables are unified with the types of the values being passed to them,
118 and resolved ultimately to concrete types.
119
120 A bit of administrivia that we should also mention is that Dieter
121 requires procedures to be declared, though not necessarily defined,
122 before they are called. Much like Pascal, it offers a `forward`
123 declaration construct for this purpose. It can also be used to declare
124 procedures intrinsic to the implementation, procedures external to some
125 set of modules, or completely ficticious procedures for the purpose of
126 demonstrating and testing the type system.
127
128 ### Example ###
129
130 We're now ready to give a very simple example of a Dieter module.
131
132 module beefy
133
134 procedure beef_up(x: ♥t): beefy ♥t
135 begin
136 return (bestow beefy x)
137 end
138
139 end
140
141 The intent of this example is to get the general gist of the language
142 across, not to demonstrate good programming practice. The procedure
143 `beef_up` essentially behaves as `bestow`, except, being a procedure, it
144 may be called from any procedure, including those in other modules. So,
145 it breaks the abstraction we just presented; it implements something
146 akin to a "generic setter method," providing direct access to a data
147 member that is ordinarily private.
148
149 ### Object-oriented Programming ###
150
151 We'll now try to show that this interaction between type qualifiers and
152 modules produces something resembling object-oriented programming by
153 adding three language features to Dieter: a declaration form and two new
154 types.
155
156 - Module-level variables are variables that are instantiated on, well,
157 the level of the module. (This is in contrast to local variables in
158 procedures, which I haven't said anything about, but which I've
159 nevertheless assumed exist in Dieter...) Module-level variables
160 can't be accessed directly from outside the module; they're
161 comparable to `private static` fields in Java.
162 - `ref` is a built-in type. These aren't references *to* anything;
163 each is just a unique value, much like values of `ref` type in the
164 language Erlang. A program can obtain a new `ref` value by calling
165 `new_ref()`; the value thus obtained is guaranteed to be different
166 from all other `ref` values in the program.
167 - `map` is a built-in type constructor, over range (key) and domain
168 (value) types, that defines a data structure that acts like a
169 dictionary. The usual square-bracket syntax `x[y]` is used to
170 reference a location within a map.
171
172 We can now construct a class of objects like so:
173
174 module person
175
176 var name_map: map from person ref to string
177 var age_map: map from person ref to int
178
179 procedure person_new(name: string, age: int): person ref
180 var p: person ref
181 begin
182 p := bestow person new_ref()
183 name_map[p] := name
184 age_map[p] := age
185 return p
186 end
187
188 procedure person_get_name(p: person ref): string
189 begin
190 return name_map[p]
191 end
192
193 procedure person_attend_birthday_party(p: person ref): void
194 begin
195 age_map[p] := succ(age_map[p])
196 end
197
198 end
199
200 Because the type qualifier `person` is defined by the module `person`,
201 which only uses `bestow` in one place, we know exactly what kind of
202 expressions can result in a type qualified by `person`. More to the
203 point, we know that no other procedure in any other module can create a
204 `person`-qualified type without calling `person_new`.
205
206 ### Mixins ###
207
208 If we loosen the constraints on `map`s, and say that the range (key)
209 type can be left unspecified and that values of all types can be used as
210 keys in any given `map`, then we have have a way to make type qualifiers
211 work like mixins:
212
213 module tagged
214
215 var tag_map : map to string
216
217 procedure make_tagged(x: ♥t): tagged ♥t
218 begin
219 return (bestow tagged x)
220 end
221
222 procedure tag(x: tagged ♥t, y: string): void
223 begin
224 tag_map[x] := y
225 end
226
227 procedure get_tag(x: tagged ♥t): string
228 begin
229 return tag_map[x]
230 end
231
232 end
233
234 I think this demonstrates an underrated principle of OO, namely
235 *identity*. If I call `tag(make_tagged(4), "mine")`, do *all*
236 occurrences of 4 have that tag, or just the ephemeral instance of 4
237 passed to `tag`? If there can't be seperate instances of 4, that might
238 be a reason for OO languages to not treat it as an object. And if you
239 believe seperate instances of 4 are silly, that's an argument against
240 "everything is an object".
241
242 ### Inheritance ###
243
244 Since module-level variables are private to each module, they give
245 Dieter something akin to traditional encapsulation. But what about
246 inheritance?
247
248 Well, we can get something like inheritance if we give Dieter another
249 feature. We haven't said much about scope of names so far; in
250 particular, we haven't said what happens when one declares two different
251 procedures with the same name and number of parameters, and what, if
252 anything, should happen when client code tries to call such ambiguously
253 declared procedures.
254
255 Hey, in the spirit of compromise, let's just say that when this happens,
256 *all* those procedures are called! Specifically, all the procedures
257 whose formal parameter types are compatible with the types of the actual
258 parameters are called — procedures with parameters of completely
259 different types, or of more highly qualified types, are not called.
260
261 And, to keep this interesting, let's say that the order in which these
262 procedures are called depends on the generality of their parameter
263 types. If `grind(beefy ♥t):void` and `grind(beefy gnarly ♥t):void` are
264 both defined, and `grind` is called with a variable whose type is
265 qualified with both `beefy` and `gnarly`, then `grind(beefy ♥t):void`
266 (the more general) should be called first, then
267 `grind(beefy gnarly ♥t):void` (the more specific) called second.
268
269 There are a few issues now that need to be dealt with.
270
271 - We've discussed compatibility of parameter types of a procedure, but
272 not its return type. For simplicity, let's just disallow differing
273 return types on different procedures with the same name — it's a
274 compile-time error.
275 - It would be nice if procedures had a way to capture the results of a
276 procedure that was called before them in this calling chain. So,
277 let's say that every procedure has access to an implicit read-only
278 variable that holds the result of the previous procedure (if any)
279 that was executed along this chain. The type of this variable will
280 always be the same as the return type of the current procedure
281 (because by the previous bullet point, the previous procedure must
282 have had the same return type as the current one.) If no previous
283 procedure with this signature was executed, the value will be `nil`.
284 For syntax, let's call this implicit variable `super`.
285 - It would also be nice if a procedure could prevent any further
286 procedures from being called in this chain. Let's give Dieter a
287 special form of `return` for this purpose — say, `return final`.
288 - Finally, what about incomparable signatures? Say that, in the above
289 example, procedures `grind(gnarly ♥t):void` and `grind(♥t):void` are
290 defined too. Then, when `grind` is called with a `beefy gnarly`
291 variable, `grind(♥t):void` gets called first (it's the most
292 general), but which gets called next: `grind(gnarly ♥t):void` or
293 `grind(beefy ♥t):void`? We can let the programmer give some sort of
294 disambiguating assertion, like `order beefy < gnarly`, to enforce a
295 partial ordering between type qualifiers. Then we know that the more
296 general procedure — in this case `grind(gnarly ♥t):void` — will be
297 called before `grind(beefy ♥t):void`, because we just noted that,
298 all other things being equal, we want `gnarly` to be treated as more
299 general than `beefy`.
300
301 Now: do you think it's a coincidence that the last problem looks similar
302 to the problems that come from multiple inheritance, where we don't
303 quite know what we should be inheriting when two of our superclasses are
304 descendants of the same class? Well, I can tell you this much: it's
305 definately *not* a coincidence that I chose `super` and `final` for the
306 names of the other two features!
307
308 This whole thing looks to me very much as if we were approaching
309 object-oriented programming from another direction. Which might not be
310 such a surprise, if one thinks of subclassing as a special form of type
311 qualification.
312
313 Of course, it's not *quite* the same in Dieter as it is in most OO
314 languages. For example, procedures in Dieter do not actually override
315 those with more general (less qualified) signatures, they simply add to
316 them. But, those more specific procedures could be written to ignore the
317 results of, and undo the state changes of, the more general procedures
318 in the chain that were called first, which would accomplish essentially
319 the same thing.
320
321 Background
322 ----------
323
324 Why did I design this language, anyway?
325
326 ### Hungarian Notation ###
327
328 The origins of the central idea in Dieter — **encapsulate type
329 qualifiers in modules that define them** — was an indirect result of
330 reading [Joel Spolsky's explanation of the value of Hungarian
331 notation](http://www.joelonsoftware.com/articles/Wrong.html). He
332 contrasts the original notion of Hungarian notation with the
333 cargo-cultist convention that it somehow degenerated into. He explains
334 how it helps make incorrect programs look incorrect.
335
336 I thought, why stop there? If the purpose of Hungarian notation is to
337 make evident assignment errors between variables that have the same type
338 but different *roles*, then why can't you have the compiler type-check
339 the roles, too? Well, you can, and that's basically what Dieter is doing
340 with type qualifiers — using them as a computer-checkable form of
341 Hungarian notation.
342
343 ### Aliasing Prevention ###
344
345 What further spurred development of the idea was the problem of
346 aliasing. Aliasing is where some part of a program is dealing with two
347 references which might or might not point to the same thing —
348 importantly, you can't make guarantees for the sake of safety (or
349 optimization) that they *don't* point to the same thing. So you have to
350 assume that they might.
351
352 The contribution of type qualifiers to this is that in some situations
353 you might be able to give the two references two different type
354 qualifiers, even though they are basically the same type, thus
355 guaranteeing that they don't in fact refer to the same value.
356
357 There's a significant problem with this, though: it still doesn't give
358 you a hard-and-fast guarantee that no aliasing is occurring, because the
359 module that defines a modifier can still do whatever it likes with it
360 internally. It gives you only a sort of "module-level guarantee"; only
361 if you trust the individual modules involved to not be aliasing values
362 of these types, can you be sure the values won't be aliased "in the
363 large".
364
365 In addition, it's far from certain that there are lots of cases where
366 this would *in practice* support some genuine non-trivial beneficial
367 code pattern while preventing aliasing. It could be that all examples
368 where this works are quite contrived. I suppose that if I were to do
369 further work on Dieter, it would be to try to discover whether this is
370 really the case or not.
371
372 ### Related work ###
373
374 Type qualifiers have been around informally for a long time, probably
375 almost as long as there have been types. Here's [Dennis Ritchie ranting
376 against certain type qualifiers proposed for ANSI
377 C](http://www.lysator.liu.se/c/dmr-on-noalias.html). (One of which,
378 coincidentally, was intended to deal with aliasing, albiet quite
379 myopically.)
380
381 Mark P Jones has written [a
382 paper](http://web.cecs.pdx.edu/~mpj/pubs/esop92.html) and [a
383 thesis-turned-book](http://web.cecs.pdx.edu/~mpj/pubs/thesis.html)
384 describing a theory of qualified types. Dieter can be seen as a concrete
385 instance of Jones's theory (as can many other type systems — it's a very
386 general theory), although I have not explicated it as such here, as the
387 resulting article would likely have been much less accessible.
388
389 Haskell's type classes (another type system easily seen as a concrete
390 instance of qualified types) are very similar to Dieter's type
391 qualifiers. However, in Haskell, every type either belongs to or does
392 not belong to some class: every `Int` is an `Eq Ord`, due to the
393 mathematical properties of `Int`s. In Dieter, every type can potentially
394 be modified with every qualifier: there can be `int`s, `eq int`s,
395 `ord int`s, and `eq ord int`s, all slightly different.
396
397 On the other end of the spectrum, solidly in the domain of application
398 rather than theory, [CQUAL](http://www.cs.umd.edu/~jfoster/cqual/) is an
399 extension of C which adds user-defined type qualifiers. CQUAL is
400 primarily concerned with inferring type qualifiers where there are none,
401 and is not concerned with encapsulating qualifiers within modules.
402
403 Conclusion
404 ----------
405
406 I hope you found Dieter to be an entertaining little diversion through
407 type-qualifier-land (and that you did not expect too much more, or I
408 imagine you'll have been somewhat disappointed.)
409
410 Although there is no reference implementation of Dieter (and it's
411 unlikely that there could be without some more specified semantics,)
412 there is a reference parser and type-checker (not at all guaranteed to
413 be bug-free) written in Python of all things.
414
415 Appendix A
416 ----------
417
418 ### Polymorphic Typing with Type Qualifiers ###
419
420 While Dieter does not support full-blown type inference — all variables
421 must be explicitly notated with their type — it does support uniform
422 polymorphic typing using type variables, and it uses the core mechanism
423 of type inference, namely unification, to give those variables concrete
424 types at each usage instance.
425
426 In place of a concrete type, a type variable may be given. Like a
427 concrete type, this type variable can possess any number of type
428 qualifiers. During each instance of the thing being typed (for example,
429 each call to a procedure,) the type variables are resolved into concrete
430 types by unification.
431
432 The presence of type qualifiers makes this process more complicated.
433
434 The first thing to note is that unification during inference is no
435 longer commutative — it is no longer the case that, if A unifies with B,
436 then B necessarily unifies with A. As mentioned, a procedure with a
437 formal parameter of type `gnarly int` will not accept (i.e. is not
438 compatible with) an actual parameter with type simply `int`. But the
439 reverse situation is acceptable — we think of the `gnarly` modifier
440 being 'dropped'. The convention we will use when describing this
441 non-commutative unification is to call the formal parameter (or variable
442 being assigned to) the receptor and write it on the left, and call the
443 actual parameter (or expression being assigned) the provider and write
444 it on the right.
445
446 The cardinal rule, which applies to every primitive type expression,
447 including those with variables, is that **the qualifiers on the provider
448 must be a superset of the qualifiers on the receptor**.
449
450 As during the conventional algorithm, an unbound type variable in either
451 type expression will unify with (take on the binding of) the type
452 subexpression that corresponds with it in the other type expression. In
453 addition, here it will also take on the type qualifiers of that
454 subexpresion, if it can. This leaves us with the question of which ones,
455 and when.
456
457 If the variable is in the receptor position, it might as well unify with
458 *all* of the type qualifiers on the provider, because those must be a
459 superset of the receptor's in order to unify at all, and because down
460 the road, they can always be 'dropped' from the variable, should it be
461 used as a provider.
462
463 If the variable is in the provider position, the type in receptor
464 position can't possibly have any more qualifiers than the variable — or
465 it wouldn't be able to unify in the first place. So the variable might
466 as well unify with the whole expression in the receptor position.
467
468 If both positions contain variables, the provider's variable should be
469 bound to the receptor's type expression, because it will be the one that
470 is more general.
471
472 The other thing to note that differs from conventional type inference is
473 that **a type variable, once bound to a qualified type, may be
474 *re-bound* to a less qualified type**.
475
476 ### Examples ###
477
478 Module names aren't really important for this, so they're omitted. We'll
479 assume the following intrinsics are available:
480
481 forward and(bool, bool): bool
482 forward equal(♥t, ♥t): bool
483 forward print(string): void
484
485 #### Example 1 ####
486
487 procedure thing(): void
488 var i, j: int
489 var s, t: string
490 begin
491 if and(equal(i, j), equal(s, t)) then print("yes") else print("no")
492 end
493
494 Should typecheck successfully, because the two calls to `equal` are two
495 seperate instances of the type ♥t, but the two parameters inside the
496 type signature of `equal` are the same instance.
497
498 #### Example 2 ####
499
500 forward glunt(beefy gnarly ♥t): gnarly ♥t
501 ...
502 procedure thing(): void
503 var i: beefy gnarly int
504 begin
505 if equal(glunt(i), 4) then print("yes") else print("no")
506 end
507
508 Should typecheck successfully. The call to `glunt` returns a
509 `gnarly int`. Midway through typechecking the call to `equal`, we obtain
510 {♥t → `gnarly int`}. But when we typecheck the second argument we see
511 that it's simply an `int`. We *re-bind* the variable to obtain {♥t →
512 `int`}. This is OK with respect to the first argument — we just consider
513 the `gnarly` to be dropped.
514
515 #### Example 3 ####
516
517 forward traub(beefy gnarly ♥t): bool
518 ...
519 procedure thing(p: beefy ♥s): ♥s
520 begin
521 if traub(p) then print("yes") else print("no")
522 return p
523 end
524
525 Should *not* typecheck. The call to `traub` needs something qualified
526 with both `beefy` and `gnarly`. `beefy ♥s` will fail to unify with it.
527
528 Grammar
529 -------
530
531 Dieter ::= {Module | Ordering | Forward} ".".
532 Ordering ::= "order" Name/qual "<" Name/qual.
533 Module ::= "module" Name/qual {"var" VarDecl} {ProcDecl} "end".
534 Forward ::= "forward" Name/proc "(" [Type {"," Type}] ")" ":" Type.
535 VarDecl ::= Name/var ":" Type.
536 ProcDecl ::= "procedure" Name/proc "(" [VarDecl {"," VarDecl}] ")" ":" Type {"var" VarDecl} Statement.
537 Statement ::= "begin" {Statement} "end"
538 | "if" Expr "then" Statement ["else" Statement]
539 | "while" Expr "do" Statement
540 | Name/var ["[" Expr "]"] ":=" Expr
541 | Name/proc "(" [Expr {"," Expr}] ")"
542 | "return" ["final"] Expr
543 .
544 Expr ::= Name/var ["[" Expr "]"]
545 | Name/proc "(" [Expr {"," Expr}] ")"
546 | "(" Expr ")"
547 | "bestow" Qualifier Expr
548 | "super"
549 .
550 Type ::= {Name/qual} BareType.
551 BareType ::= "map" ["from" Type] "to" Type
552 | "♥" Name/tvar
553 | "bool" | "int" | "rat" | "string" | "ref"
554 .
0 The Dieter Programming Language
1 ===============================
2
3 Description
4 -----------
5
6 Dieter (that's Dieter as in the German masculine given name Dieter, not
7 dieter as in "one who diets") is a little experimental programming
8 language which conflates *type qualifiers* with *modules*. In this
9 article, we'll first describe these mechanisms. Then we'll show how
10 their interaction produces something that resembles object-oriented
11 programming.
12
13 ### Type Qualifiers ###
14
15 A type qualifier is simply a keyword which may be placed before any type
16 expression, further specializing it. Type qualifiers should be familiar
17 to programmers of C, where *storage classes*, such as `const` and
18 `static`, are examples of type qualifiers. The cardinal rule of type
19 qualifiers in Dieter is this: **during assignment (or
20 parameter-passing), a variable (or parameter) whose type possesses some
21 type qualifier *only* accepts values of the same type that possess *at
22 least* that same qualifier.**
23
24 Some basic properties of type qualifiers follow (but first, for
25 concreteness, let's stipulate that Dieter has an assortment of primitive
26 types: `bool`, `int`, `rat`, and `string`. And `void` denoting the type
27 of the absence of any value.) The order that type qualifiers are given
28 in a type expression doesn't matter: if `beefy` and `gnarly` are type
29 qualifiers, then `beefy gnarly int` and `gnarly beefy int` are wholly
30 equivalent types. Neither does the number of occurrences matter:
31 `beefy gnarly beefy int` is equivalent to `beefy gnarly int`.
32 Algebraically speaking, type qualifiers are commmutative and idempotent.
33
34 You can't assign an `int` to a `beefy int` (or pass it to a procedure
35 expecting a `beefy int`,) but you can assign a `beefy int` to an `int`.
36 This might sound counter-intuitive initially, but after you think about
37 it a bit I'm sure you'll conclude it's appropriate: it's just like how
38 you can assign a `const int` to an `int` in C, but not an `int` to a
39 `const int`. `int` is more general, and will tolerate information being
40 lost in the copy, whereas `beefy int` is more specific, and will not
41 accept a value with insufficient information associated with it. Or, if
42 you prefer, while all `beefy int`s are `int`s, there are `int`s that
43 aren't `beefy` and thus can't be assigned to a variable which is
44 restricted to `beefy int`s.
45
46 ### Type Operators ###
47
48 In order to make use of type qualifiers in user programs, Dieter
49 provides type operators, similar to C's type-casting facilities, to
50 manipulate type qualifiers.
51
52 In fact, Dieter has only one explicit type operator, called `bestow`. It
53 can be used in an expression to endow its type with further type
54 qualifiers. `bestow` takes a type qualifier q and a value of some type t
55 and returns a new value of type q·t. (You can think of q·t as a kind of
56 pseudo-product type that obeys the algebraic laws given in the previous
57 section rather than those of conventional products.)
58
59 The complementary operation — stripping q·t back to t — is not
60 explicitly denoted; it happens automatically as needed. (This is the
61 underlying rule that was in effect, when we stated that a `beefy int`
62 can be assigned to an `int`.)
63
64 Note that type operators in no way alter the *value* of the expression
65 they are used in, only its *type*.
66
67 ### Type Variables ###
68
69 Dieter supports uniform polymorphic types. Type expressions can contain
70 type variables (which I'll denote with ♥ for no good reason) which can
71 unify with concrete types or other type variables during instantiation
72 of a language construct. In practice, this means that procedures can
73 have type variables in their parameter and return types; these variables
74 are replaced by types specified at the call site whenever a procedure
75 call is type-checked.
76
77 The scope of each type variable ranges over a single instance of a
78 single procedure. So `♥x` in the parameter list of `grunt` is the same
79 type as every occurrence of `♥x` anywhere in the definition of `grunt`
80 during that same invokation of `grunt`, but may be different from `♥x`
81 in other invokations of `grunt` (calls to `grunt` from different call
82 sites), and different from `♥x` appearing in other procedures entirely.
83
84 A complication for polymorphism in Dieter is that type variables range
85 not only over core type expressions, but *also* over type qualifiers.
86 That is, the types `beefy gnarly int` and `beefy ♥t` unify, and their
87 most general unifier is {`♥t` → `gnarly int`}. This has a few
88 implications for the unification algorithm; see [Appendix
89 A](#appendix_a) for a detailed discussion of these.
90
91 ### Modules ###
92
93 Dieter is a modular language, which has its usual meaning — a Dieter
94 program consists of a set of modules, each of which exposes only its
95 interface, not its implementation, to all other modules. In Dieter,
96 however, there is a notable twist: every type qualifier is defined by
97 some module which bears the same name. The key idea of Dieter is this:
98 **a type operator that affects a given qualifier may *only* be used in
99 the module which defines that qualifier.** The reasoning for this is
100 similar to the argument against using casts in the C language: the goal
101 of typing is to specify constraints on the program's structure and to
102 automatically check that they are met. If programmers are allowed to
103 muck with types willy-nilly, it defeats the purpose of having these
104 constraints in the first place. So the power to `bestow` a particular
105 type qualifier is only given out to the module "in charge" of that type
106 qualifier. This is a form of encapsulation: the internal details of some
107 thing (in this case, a type modifier) can only be manipulated by the
108 implementation of that thing.
109
110 ### Procedures ###
111
112 Each Dieter module consists of a set of procedure declarations. Any
113 procedure can be called from any procedure in any module; they are
114 comparable to `public static` methods in Java in this respect.
115 Procedures are polymorphic, as mentioned; each one's parameter and
116 return types can contain type variables. At each call site, the type
117 variables are unified with the types of the values being passed to them,
118 and resolved ultimately to concrete types.
119
120 A bit of administrivia that we should also mention is that Dieter
121 requires procedures to be declared, though not necessarily defined,
122 before they are called. Much like Pascal, it offers a `forward`
123 declaration construct for this purpose. It can also be used to declare
124 procedures intrinsic to the implementation, procedures external to some
125 set of modules, or completely ficticious procedures for the purpose of
126 demonstrating and testing the type system.
127
128 ### Example ###
129
130 We're now ready to give a very simple example of a Dieter module.
131
132 module beefy
133
134 procedure beef_up(x: ♥t): beefy ♥t
135 begin
136 return (bestow beefy x)
137 end
138
139 end
140
141 The intent of this example is to get the general gist of the language
142 across, not to demonstrate good programming practice. The procedure
143 `beef_up` essentially behaves as `bestow`, except, being a procedure, it
144 may be called from any procedure, including those in other modules. So,
145 it breaks the abstraction we just presented; it implements something
146 akin to a "generic setter method," providing direct access to a data
147 member that is ordinarily private.
148
149 ### Object-oriented Programming ###
150
151 We'll now try to show that this interaction between type qualifiers and
152 modules produces something resembling object-oriented programming by
153 adding three language features to Dieter: a declaration form and two new
154 types.
155
156 - Module-level variables are variables that are instantiated on, well,
157 the level of the module. (This is in contrast to local variables in
158 procedures, which I haven't said anything about, but which I've
159 nevertheless assumed exist in Dieter...) Module-level variables
160 can't be accessed directly from outside the module; they're
161 comparable to `private static` fields in Java.
162 - `ref` is a built-in type. These aren't references *to* anything;
163 each is just a unique value, much like values of `ref` type in the
164 language Erlang. A program can obtain a new `ref` value by calling
165 `new_ref()`; the value thus obtained is guaranteed to be different
166 from all other `ref` values in the program.
167 - `map` is a built-in type constructor, over range (key) and domain
168 (value) types, that defines a data structure that acts like a
169 dictionary. The usual square-bracket syntax `x[y]` is used to
170 reference a location within a map.
171
172 We can now construct a class of objects like so:
173
174 module person
175
176 var name_map: map from person ref to string
177 var age_map: map from person ref to int
178
179 procedure person_new(name: string, age: int): person ref
180 var p: person ref
181 begin
182 p := bestow person new_ref()
183 name_map[p] := name
184 age_map[p] := age
185 return p
186 end
187
188 procedure person_get_name(p: person ref): string
189 begin
190 return name_map[p]
191 end
192
193 procedure person_attend_birthday_party(p: person ref): void
194 begin
195 age_map[p] := succ(age_map[p])
196 end
197
198 end
199
200 Because the type qualifier `person` is defined by the module `person`,
201 which only uses `bestow` in one place, we know exactly what kind of
202 expressions can result in a type qualified by `person`. More to the
203 point, we know that no other procedure in any other module can create a
204 `person`-qualified type without calling `person_new`.
205
206 ### Mixins ###
207
208 If we loosen the constraints on `map`s, and say that the range (key)
209 type can be left unspecified and that values of all types can be used as
210 keys in any given `map`, then we have have a way to make type qualifiers
211 work like mixins:
212
213 module tagged
214
215 var tag_map : map to string
216
217 procedure make_tagged(x: ♥t): tagged ♥t
218 begin
219 return (bestow tagged x)
220 end
221
222 procedure tag(x: tagged ♥t, y: string): void
223 begin
224 tag_map[x] := y
225 end
226
227 procedure get_tag(x: tagged ♥t): string
228 begin
229 return tag_map[x]
230 end
231
232 end
233
234 I think this demonstrates an underrated principle of OO, namely
235 *identity*. If I call `tag(make_tagged(4), "mine")`, do *all*
236 occurrences of 4 have that tag, or just the ephemeral instance of 4
237 passed to `tag`? If there can't be seperate instances of 4, that might
238 be a reason for OO languages to not treat it as an object. And if you
239 believe seperate instances of 4 are silly, that's an argument against
240 "everything is an object".
241
242 ### Inheritance ###
243
244 Since module-level variables are private to each module, they give
245 Dieter something akin to traditional encapsulation. But what about
246 inheritance?
247
248 Well, we can get something like inheritance if we give Dieter another
249 feature. We haven't said much about scope of names so far; in
250 particular, we haven't said what happens when one declares two different
251 procedures with the same name and number of parameters, and what, if
252 anything, should happen when client code tries to call such ambiguously
253 declared procedures.
254
255 Hey, in the spirit of compromise, let's just say that when this happens,
256 *all* those procedures are called! Specifically, all the procedures
257 whose formal parameter types are compatible with the types of the actual
258 parameters are called — procedures with parameters of completely
259 different types, or of more highly qualified types, are not called.
260
261 And, to keep this interesting, let's say that the order in which these
262 procedures are called depends on the generality of their parameter
263 types. If `grind(beefy ♥t):void` and `grind(beefy gnarly ♥t):void` are
264 both defined, and `grind` is called with a variable whose type is
265 qualified with both `beefy` and `gnarly`, then `grind(beefy ♥t):void`
266 (the more general) should be called first, then
267 `grind(beefy gnarly ♥t):void` (the more specific) called second.
268
269 There are a few issues now that need to be dealt with.
270
271 - We've discussed compatibility of parameter types of a procedure, but
272 not its return type. For simplicity, let's just disallow differing
273 return types on different procedures with the same name — it's a
274 compile-time error.
275 - It would be nice if procedures had a way to capture the results of a
276 procedure that was called before them in this calling chain. So,
277 let's say that every procedure has access to an implicit read-only
278 variable that holds the result of the previous procedure (if any)
279 that was executed along this chain. The type of this variable will
280 always be the same as the return type of the current procedure
281 (because by the previous bullet point, the previous procedure must
282 have had the same return type as the current one.) If no previous
283 procedure with this signature was executed, the value will be `nil`.
284 For syntax, let's call this implicit variable `super`.
285 - It would also be nice if a procedure could prevent any further
286 procedures from being called in this chain. Let's give Dieter a
287 special form of `return` for this purpose — say, `return final`.
288 - Finally, what about incomparable signatures? Say that, in the above
289 example, procedures `grind(gnarly ♥t):void` and `grind(♥t):void` are
290 defined too. Then, when `grind` is called with a `beefy gnarly`
291 variable, `grind(♥t):void` gets called first (it's the most
292 general), but which gets called next: `grind(gnarly ♥t):void` or
293 `grind(beefy ♥t):void`? We can let the programmer give some sort of
294 disambiguating assertion, like `order beefy < gnarly`, to enforce a
295 partial ordering between type qualifiers. Then we know that the more
296 general procedure — in this case `grind(gnarly ♥t):void` — will be
297 called before `grind(beefy ♥t):void`, because we just noted that,
298 all other things being equal, we want `gnarly` to be treated as more
299 general than `beefy`.
300
301 Now: do you think it's a coincidence that the last problem looks similar
302 to the problems that come from multiple inheritance, where we don't
303 quite know what we should be inheriting when two of our superclasses are
304 descendants of the same class? Well, I can tell you this much: it's
305 definately *not* a coincidence that I chose `super` and `final` for the
306 names of the other two features!
307
308 This whole thing looks to me very much as if we were approaching
309 object-oriented programming from another direction. Which might not be
310 such a surprise, if one thinks of subclassing as a special form of type
311 qualification.
312
313 Of course, it's not *quite* the same in Dieter as it is in most OO
314 languages. For example, procedures in Dieter do not actually override
315 those with more general (less qualified) signatures, they simply add to
316 them. But, those more specific procedures could be written to ignore the
317 results of, and undo the state changes of, the more general procedures
318 in the chain that were called first, which would accomplish essentially
319 the same thing.
320
321 Background
322 ----------
323
324 Why did I design this language, anyway?
325
326 ### Hungarian Notation ###
327
328 The origins of the central idea in Dieter — **encapsulate type
329 qualifiers in modules that define them** — was an indirect result of
330 reading [Joel Spolsky's explanation of the value of Hungarian
331 notation](http://www.joelonsoftware.com/articles/Wrong.html). He
332 contrasts the original notion of Hungarian notation with the
333 cargo-cultist convention that it somehow degenerated into. He explains
334 how it helps make incorrect programs look incorrect.
335
336 I thought, why stop there? If the purpose of Hungarian notation is to
337 make evident assignment errors between variables that have the same type
338 but different *roles*, then why can't you have the compiler type-check
339 the roles, too? Well, you can, and that's basically what Dieter is doing
340 with type qualifiers — using them as a computer-checkable form of
341 Hungarian notation.
342
343 ### Aliasing Prevention ###
344
345 What further spurred development of the idea was the problem of
346 aliasing. Aliasing is where some part of a program is dealing with two
347 references which might or might not point to the same thing —
348 importantly, you can't make guarantees for the sake of safety (or
349 optimization) that they *don't* point to the same thing. So you have to
350 assume that they might.
351
352 The contribution of type qualifiers to this is that in some situations
353 you might be able to give the two references two different type
354 qualifiers, even though they are basically the same type, thus
355 guaranteeing that they don't in fact refer to the same value.
356
357 There's a significant problem with this, though: it still doesn't give
358 you a hard-and-fast guarantee that no aliasing is occurring, because the
359 module that defines a modifier can still do whatever it likes with it
360 internally. It gives you only a sort of "module-level guarantee"; only
361 if you trust the individual modules involved to not be aliasing values
362 of these types, can you be sure the values won't be aliased "in the
363 large".
364
365 In addition, it's far from certain that there are lots of cases where
366 this would *in practice* support some genuine non-trivial beneficial
367 code pattern while preventing aliasing. It could be that all examples
368 where this works are quite contrived. I suppose that if I were to do
369 further work on Dieter, it would be to try to discover whether this is
370 really the case or not.
371
372 ### Related work ###
373
374 Type qualifiers have been around informally for a long time, probably
375 almost as long as there have been types. Here's [Dennis Ritchie ranting
376 against certain type qualifiers proposed for ANSI
377 C](http://www.lysator.liu.se/c/dmr-on-noalias.html). (One of which,
378 coincidentally, was intended to deal with aliasing, albiet quite
379 myopically.)
380
381 Mark P Jones has written [a
382 paper](http://web.cecs.pdx.edu/~mpj/pubs/esop92.html) and [a
383 thesis-turned-book](http://web.cecs.pdx.edu/~mpj/pubs/thesis.html)
384 describing a theory of qualified types. Dieter can be seen as a concrete
385 instance of Jones's theory (as can many other type systems — it's a very
386 general theory), although I have not explicated it as such here, as the
387 resulting article would likely have been much less accessible.
388
389 Haskell's type classes (another type system easily seen as a concrete
390 instance of qualified types) are very similar to Dieter's type
391 qualifiers. However, in Haskell, every type either belongs to or does
392 not belong to some class: every `Int` is an `Eq Ord`, due to the
393 mathematical properties of `Int`s. In Dieter, every type can potentially
394 be modified with every qualifier: there can be `int`s, `eq int`s,
395 `ord int`s, and `eq ord int`s, all slightly different.
396
397 On the other end of the spectrum, solidly in the domain of application
398 rather than theory, [CQUAL](http://www.cs.umd.edu/~jfoster/cqual/) is an
399 extension of C which adds user-defined type qualifiers. CQUAL is
400 primarily concerned with inferring type qualifiers where there are none,
401 and is not concerned with encapsulating qualifiers within modules.
402
403 Conclusion
404 ----------
405
406 I hope you found Dieter to be an entertaining little diversion through
407 type-qualifier-land (and that you did not expect too much more, or I
408 imagine you'll have been somewhat disappointed.)
409
410 Although there is no reference implementation of Dieter (and it's
411 unlikely that there could be without some more specified semantics,)
412 there is a reference parser and type-checker (not at all guaranteed to
413 be bug-free) written in Python of all things.
414
415 Appendix A
416 ----------
417
418 ### Polymorphic Typing with Type Qualifiers ###
419
420 While Dieter does not support full-blown type inference — all variables
421 must be explicitly notated with their type — it does support uniform
422 polymorphic typing using type variables, and it uses the core mechanism
423 of type inference, namely unification, to give those variables concrete
424 types at each usage instance.
425
426 In place of a concrete type, a type variable may be given. Like a
427 concrete type, this type variable can possess any number of type
428 qualifiers. During each instance of the thing being typed (for example,
429 each call to a procedure,) the type variables are resolved into concrete
430 types by unification.
431
432 The presence of type qualifiers makes this process more complicated.
433
434 The first thing to note is that unification during inference is no
435 longer commutative — it is no longer the case that, if A unifies with B,
436 then B necessarily unifies with A. As mentioned, a procedure with a
437 formal parameter of type `gnarly int` will not accept (i.e. is not
438 compatible with) an actual parameter with type simply `int`. But the
439 reverse situation is acceptable — we think of the `gnarly` modifier
440 being 'dropped'. The convention we will use when describing this
441 non-commutative unification is to call the formal parameter (or variable
442 being assigned to) the receptor and write it on the left, and call the
443 actual parameter (or expression being assigned) the provider and write
444 it on the right.
445
446 The cardinal rule, which applies to every primitive type expression,
447 including those with variables, is that **the qualifiers on the provider
448 must be a superset of the qualifiers on the receptor**.
449
450 As during the conventional algorithm, an unbound type variable in either
451 type expression will unify with (take on the binding of) the type
452 subexpression that corresponds with it in the other type expression. In
453 addition, here it will also take on the type qualifiers of that
454 subexpresion, if it can. This leaves us with the question of which ones,
455 and when.
456
457 If the variable is in the receptor position, it might as well unify with
458 *all* of the type qualifiers on the provider, because those must be a
459 superset of the receptor's in order to unify at all, and because down
460 the road, they can always be 'dropped' from the variable, should it be
461 used as a provider.
462
463 If the variable is in the provider position, the type in receptor
464 position can't possibly have any more qualifiers than the variable — or
465 it wouldn't be able to unify in the first place. So the variable might
466 as well unify with the whole expression in the receptor position.
467
468 If both positions contain variables, the provider's variable should be
469 bound to the receptor's type expression, because it will be the one that
470 is more general.
471
472 The other thing to note that differs from conventional type inference is
473 that **a type variable, once bound to a qualified type, may be
474 *re-bound* to a less qualified type**.
475
476 ### Examples ###
477
478 Module names aren't really important for this, so they're omitted. We'll
479 assume the following intrinsics are available:
480
481 forward and(bool, bool): bool
482 forward equal(♥t, ♥t): bool
483 forward print(string): void
484
485 #### Example 1 ####
486
487 procedure thing(): void
488 var i, j: int
489 var s, t: string
490 begin
491 if and(equal(i, j), equal(s, t)) then print("yes") else print("no")
492 end
493
494 Should typecheck successfully, because the two calls to `equal` are two
495 seperate instances of the type ♥t, but the two parameters inside the
496 type signature of `equal` are the same instance.
497
498 #### Example 2 ####
499
500 forward glunt(beefy gnarly ♥t): gnarly ♥t
501 ...
502 procedure thing(): void
503 var i: beefy gnarly int
504 begin
505 if equal(glunt(i), 4) then print("yes") else print("no")
506 end
507
508 Should typecheck successfully. The call to `glunt` returns a
509 `gnarly int`. Midway through typechecking the call to `equal`, we obtain
510 {♥t → `gnarly int`}. But when we typecheck the second argument we see
511 that it's simply an `int`. We *re-bind* the variable to obtain {♥t →
512 `int`}. This is OK with respect to the first argument — we just consider
513 the `gnarly` to be dropped.
514
515 #### Example 3 ####
516
517 forward traub(beefy gnarly ♥t): bool
518 ...
519 procedure thing(p: beefy ♥s): ♥s
520 begin
521 if traub(p) then print("yes") else print("no")
522 return p
523 end
524
525 Should *not* typecheck. The call to `traub` needs something qualified
526 with both `beefy` and `gnarly`. `beefy ♥s` will fail to unify with it.
527
528 Grammar
529 -------
530
531 Dieter ::= {Module | Ordering | Forward} ".".
532 Ordering ::= "order" Name/qual "<" Name/qual.
533 Module ::= "module" Name/qual {"var" VarDecl} {ProcDecl} "end".
534 Forward ::= "forward" Name/proc "(" [Type {"," Type}] ")" ":" Type.
535 VarDecl ::= Name/var ":" Type.
536 ProcDecl ::= "procedure" Name/proc "(" [VarDecl {"," VarDecl}] ")" ":" Type {"var" VarDecl} Statement.
537 Statement ::= "begin" {Statement} "end"
538 | "if" Expr "then" Statement ["else" Statement]
539 | "while" Expr "do" Statement
540 | Name/var ["[" Expr "]"] ":=" Expr
541 | Name/proc "(" [Expr {"," Expr}] ")"
542 | "return" ["final"] Expr
543 .
544 Expr ::= Name/var ["[" Expr "]"]
545 | Name/proc "(" [Expr {"," Expr}] ")"
546 | "(" Expr ")"
547 | "bestow" Qualifier Expr
548 | "super"
549 .
550 Type ::= {Name/qual} BareType.
551 BareType ::= "map" ["from" Type] "to" Type
552 | "♥" Name/tvar
553 | "bool" | "int" | "rat" | "string" | "ref"
554 .