Merge pull request #1 from catseye/develop-2021
Develop 2021
Chris Pressey authored 3 years ago
GitHub committed 3 years ago
0 | ec3be86c774cb7b26d51bf5c7c9ad8fd532e5897 rel_1_0_2011_1214 | |
1 | 8b6e9dba2149b8016fd3c20f3ae8529953e0a190 rel_1_0_2012_0623 | |
2 | 10ae611db0bb032977baac76ad189f45b8b3b505 rel_1_0_2014_0819 |
0 | The contents of the Dieter distribution are distributed under the following | |
1 | four licenses. | |
0 | BSD 3-Clause License | |
2 | 1 | |
3 | The documentation (in `README.markdown`) is covered by the following | |
4 | BSD-compatible license, modelled after the "Report on the Programming | |
5 | Language Haskell 98" license: | |
2 | Copyright (c) 2009-2021, Chris Pressey, Cat's Eye Technologies. | |
3 | All rights reserved. | |
6 | 4 | |
7 | ----------------------------------------------------------------------------- | |
5 | Redistribution and use in source and binary forms, with or without | |
6 | modification, are permitted provided that the following conditions are met: | |
8 | 7 | |
9 | Copyright (c)2009-2012 Chris Pressey, Cat's Eye Technologies. | |
8 | * Redistributions of source code must retain the above copyright notice, this | |
9 | list of conditions and the following disclaimer. | |
10 | 10 | |
11 | The authors intend this Report to belong to the entire Dieter | |
12 | community, and so we grant permission to copy and distribute it for | |
13 | any purpose, provided that it is reproduced in its entirety, | |
14 | including this Notice. Modified versions of this Report may also be | |
15 | copied and distributed for any purpose, provided that the modified | |
16 | version is clearly presented as such, and that it does not claim to | |
17 | be a definition of the Dieter Programming Language. | |
11 | * Redistributions in binary form must reproduce the above copyright notice, | |
12 | this list of conditions and the following disclaimer in the documentation | |
13 | and/or other materials provided with the distribution. | |
18 | 14 | |
19 | ----------------------------------------------------------------------------- | |
15 | * Neither the name of the copyright holder nor the names of its | |
16 | contributors may be used to endorse or promote products derived from | |
17 | this software without specific prior written permission. | |
20 | 18 | |
21 | The source code for the reference interpreter and supporting tools (in the | |
22 | `src` subdirectory) is covered under the following BSD-style license: | |
23 | ||
24 | ----------------------------------------------------------------------------- | |
25 | ||
26 | Copyright (c)2009-2012, Chris Pressey, Cat's Eye Technologies. | |
27 | All rights reserved. | |
28 | ||
29 | Redistribution and use in source and binary forms, with or without | |
30 | modification, are permitted provided that the following conditions | |
31 | are met: | |
32 | ||
33 | Redistributions of source code must retain the above copyright | |
34 | notices, this list of conditions and the following disclaimer. | |
35 | ||
36 | Redistributions in binary form must reproduce the above copyright | |
37 | notices, this list of conditions, and the following disclaimer in | |
38 | the documentation and/or other materials provided with the | |
39 | distribution. | |
40 | ||
41 | Neither the names of the copyright holders nor the names of their | |
42 | contributors may be used to endorse or promote products derived | |
43 | from this software without specific prior written permission. | |
44 | ||
45 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS | |
46 | ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES INCLUDING, BUT NOT | |
47 | LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS | |
48 | FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE | |
49 | COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, | |
50 | INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, | |
51 | BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; | |
52 | LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER | |
53 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT | |
54 | LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN | |
55 | ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE | |
56 | POSSIBILITY OF SUCH DAMAGE. | |
57 | ||
58 | ----------------------------------------------------------------------------- | |
59 | ||
60 | All other example sources in the `eg` directory were written by Chris Pressey, | |
61 | and are hereby placed in the public domain, as described in the following | |
62 | unlicense: | |
63 | ||
64 | ----------------------------------------------------------------------------- | |
65 | ||
66 | This is free and unencumbered software released into the public domain. | |
67 | ||
68 | Anyone is free to copy, modify, publish, use, compile, sell, or | |
69 | distribute this software, either in source code form or as a compiled | |
70 | binary, for any purpose, commercial or non-commercial, and by any | |
71 | means. | |
72 | ||
73 | In jurisdictions that recognize copyright laws, the author or authors | |
74 | of this software dedicate any and all copyright interest in the | |
75 | software to the public domain. We make this dedication for the benefit | |
76 | of the public at large and to the detriment of our heirs and | |
77 | successors. We intend this dedication to be an overt act of | |
78 | relinquishment in perpetuity of all present and future rights to this | |
79 | software under copyright law. | |
80 | ||
81 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, | |
82 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF | |
83 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. | |
84 | IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR | |
85 | OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, | |
86 | ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR | |
87 | OTHER DEALINGS IN THE SOFTWARE. | |
88 | ||
89 | For more information, please refer to <http://unlicense.org/> | |
19 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" | |
20 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | |
21 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE | |
22 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE | |
23 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL | |
24 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR | |
25 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER | |
26 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, | |
27 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE | |
28 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
0 | The Dieter Programming Language | |
1 | =============================== | |
2 | ||
3 | Description | |
4 | ----------- | |
5 | ||
6 | Dieter (that's Dieter as in the German masculine given name Dieter, not | |
7 | dieter as in "one who diets") is a little experimental programming | |
8 | language which conflates *type qualifiers* with *modules*. In this | |
9 | article, we'll first describe these mechanisms. Then we'll show how | |
10 | their interaction produces something that resembles object-oriented | |
11 | programming. | |
12 | ||
13 | ### Type Qualifiers ### | |
14 | ||
15 | A type qualifier is simply a keyword which may be placed before any type | |
16 | expression, further specializing it. Type qualifiers should be familiar | |
17 | to programmers of C, where *storage classes*, such as `const` and | |
18 | `static`, are examples of type qualifiers. The cardinal rule of type | |
19 | qualifiers in Dieter is this: **during assignment (or | |
20 | parameter-passing), a variable (or parameter) whose type possesses some | |
21 | type qualifier *only* accepts values of the same type that possess *at | |
22 | least* that same qualifier.** | |
23 | ||
24 | Some basic properties of type qualifiers follow (but first, for | |
25 | concreteness, let's stipulate that Dieter has an assortment of primitive | |
26 | types: `bool`, `int`, `rat`, and `string`. And `void` denoting the type | |
27 | of the absence of any value.) The order that type qualifiers are given | |
28 | in a type expression doesn't matter: if `beefy` and `gnarly` are type | |
29 | qualifiers, then `beefy gnarly int` and `gnarly beefy int` are wholly | |
30 | equivalent types. Neither does the number of occurrences matter: | |
31 | `beefy gnarly beefy int` is equivalent to `beefy gnarly int`. | |
32 | Algebraically speaking, type qualifiers are commmutative and idempotent. | |
33 | ||
34 | You can't assign an `int` to a `beefy int` (or pass it to a procedure | |
35 | expecting a `beefy int`,) but you can assign a `beefy int` to an `int`. | |
36 | This might sound counter-intuitive initially, but after you think about | |
37 | it a bit I'm sure you'll conclude it's appropriate: it's just like how | |
38 | you can assign a `const int` to an `int` in C, but not an `int` to a | |
39 | `const int`. `int` is more general, and will tolerate information being | |
40 | lost in the copy, whereas `beefy int` is more specific, and will not | |
41 | accept a value with insufficient information associated with it. Or, if | |
42 | you prefer, while all `beefy int`s are `int`s, there are `int`s that | |
43 | aren't `beefy` and thus can't be assigned to a variable which is | |
44 | restricted to `beefy int`s. | |
45 | ||
46 | ### Type Operators ### | |
47 | ||
48 | In order to make use of type qualifiers in user programs, Dieter | |
49 | provides type operators, similar to C's type-casting facilities, to | |
50 | manipulate type qualifiers. | |
51 | ||
52 | In fact, Dieter has only one explicit type operator, called `bestow`. It | |
53 | can be used in an expression to endow its type with further type | |
54 | qualifiers. `bestow` takes a type qualifier q and a value of some type t | |
55 | and returns a new value of type q·t. (You can think of q·t as a kind of | |
56 | pseudo-product type that obeys the algebraic laws given in the previous | |
57 | section rather than those of conventional products.) | |
58 | ||
59 | The complementary operation — stripping q·t back to t — is not | |
60 | explicitly denoted; it happens automatically as needed. (This is the | |
61 | underlying rule that was in effect, when we stated that a `beefy int` | |
62 | can be assigned to an `int`.) | |
63 | ||
64 | Note that type operators in no way alter the *value* of the expression | |
65 | they are used in, only its *type*. | |
66 | ||
67 | ### Type Variables ### | |
68 | ||
69 | Dieter supports uniform polymorphic types. Type expressions can contain | |
70 | type variables (which I'll denote with ♥ for no good reason) which can | |
71 | unify with concrete types or other type variables during instantiation | |
72 | of a language construct. In practice, this means that procedures can | |
73 | have type variables in their parameter and return types; these variables | |
74 | are replaced by types specified at the call site whenever a procedure | |
75 | call is type-checked. | |
76 | ||
77 | The scope of each type variable ranges over a single instance of a | |
78 | single procedure. So `♥x` in the parameter list of `grunt` is the same | |
79 | type as every occurrence of `♥x` anywhere in the definition of `grunt` | |
80 | during that same invokation of `grunt`, but may be different from `♥x` | |
81 | in other invokations of `grunt` (calls to `grunt` from different call | |
82 | sites), and different from `♥x` appearing in other procedures entirely. | |
83 | ||
84 | A complication for polymorphism in Dieter is that type variables range | |
85 | not only over core type expressions, but *also* over type qualifiers. | |
86 | That is, the types `beefy gnarly int` and `beefy ♥t` unify, and their | |
87 | most general unifier is {`♥t` → `gnarly int`}. This has a few | |
88 | implications for the unification algorithm; see [Appendix | |
89 | A](#appendix_a) for a detailed discussion of these. | |
90 | ||
91 | ### Modules ### | |
92 | ||
93 | Dieter is a modular language, which has its usual meaning — a Dieter | |
94 | program consists of a set of modules, each of which exposes only its | |
95 | interface, not its implementation, to all other modules. In Dieter, | |
96 | however, there is a notable twist: every type qualifier is defined by | |
97 | some module which bears the same name. The key idea of Dieter is this: | |
98 | **a type operator that affects a given qualifier may *only* be used in | |
99 | the module which defines that qualifier.** The reasoning for this is | |
100 | similar to the argument against using casts in the C language: the goal | |
101 | of typing is to specify constraints on the program's structure and to | |
102 | automatically check that they are met. If programmers are allowed to | |
103 | muck with types willy-nilly, it defeats the purpose of having these | |
104 | constraints in the first place. So the power to `bestow` a particular | |
105 | type qualifier is only given out to the module "in charge" of that type | |
106 | qualifier. This is a form of encapsulation: the internal details of some | |
107 | thing (in this case, a type modifier) can only be manipulated by the | |
108 | implementation of that thing. | |
109 | ||
110 | ### Procedures ### | |
111 | ||
112 | Each Dieter module consists of a set of procedure declarations. Any | |
113 | procedure can be called from any procedure in any module; they are | |
114 | comparable to `public static` methods in Java in this respect. | |
115 | Procedures are polymorphic, as mentioned; each one's parameter and | |
116 | return types can contain type variables. At each call site, the type | |
117 | variables are unified with the types of the values being passed to them, | |
118 | and resolved ultimately to concrete types. | |
119 | ||
120 | A bit of administrivia that we should also mention is that Dieter | |
121 | requires procedures to be declared, though not necessarily defined, | |
122 | before they are called. Much like Pascal, it offers a `forward` | |
123 | declaration construct for this purpose. It can also be used to declare | |
124 | procedures intrinsic to the implementation, procedures external to some | |
125 | set of modules, or completely ficticious procedures for the purpose of | |
126 | demonstrating and testing the type system. | |
127 | ||
128 | ### Example ### | |
129 | ||
130 | We're now ready to give a very simple example of a Dieter module. | |
131 | ||
132 | module beefy | |
133 | ||
134 | procedure beef_up(x: ♥t): beefy ♥t | |
135 | begin | |
136 | return (bestow beefy x) | |
137 | end | |
138 | ||
139 | end | |
140 | ||
141 | The intent of this example is to get the general gist of the language | |
142 | across, not to demonstrate good programming practice. The procedure | |
143 | `beef_up` essentially behaves as `bestow`, except, being a procedure, it | |
144 | may be called from any procedure, including those in other modules. So, | |
145 | it breaks the abstraction we just presented; it implements something | |
146 | akin to a "generic setter method," providing direct access to a data | |
147 | member that is ordinarily private. | |
148 | ||
149 | ### Object-oriented Programming ### | |
150 | ||
151 | We'll now try to show that this interaction between type qualifiers and | |
152 | modules produces something resembling object-oriented programming by | |
153 | adding three language features to Dieter: a declaration form and two new | |
154 | types. | |
155 | ||
156 | - Module-level variables are variables that are instantiated on, well, | |
157 | the level of the module. (This is in contrast to local variables in | |
158 | procedures, which I haven't said anything about, but which I've | |
159 | nevertheless assumed exist in Dieter...) Module-level variables | |
160 | can't be accessed directly from outside the module; they're | |
161 | comparable to `private static` fields in Java. | |
162 | - `ref` is a built-in type. These aren't references *to* anything; | |
163 | each is just a unique value, much like values of `ref` type in the | |
164 | language Erlang. A program can obtain a new `ref` value by calling | |
165 | `new_ref()`; the value thus obtained is guaranteed to be different | |
166 | from all other `ref` values in the program. | |
167 | - `map` is a built-in type constructor, over range (key) and domain | |
168 | (value) types, that defines a data structure that acts like a | |
169 | dictionary. The usual square-bracket syntax `x[y]` is used to | |
170 | reference a location within a map. | |
171 | ||
172 | We can now construct a class of objects like so: | |
173 | ||
174 | module person | |
175 | ||
176 | var name_map: map from person ref to string | |
177 | var age_map: map from person ref to int | |
178 | ||
179 | procedure person_new(name: string, age: int): person ref | |
180 | var p: person ref | |
181 | begin | |
182 | p := bestow person new_ref() | |
183 | name_map[p] := name | |
184 | age_map[p] := age | |
185 | return p | |
186 | end | |
187 | ||
188 | procedure person_get_name(p: person ref): string | |
189 | begin | |
190 | return name_map[p] | |
191 | end | |
192 | ||
193 | procedure person_attend_birthday_party(p: person ref): void | |
194 | begin | |
195 | age_map[p] := succ(age_map[p]) | |
196 | end | |
197 | ||
198 | end | |
199 | ||
200 | Because the type qualifier `person` is defined by the module `person`, | |
201 | which only uses `bestow` in one place, we know exactly what kind of | |
202 | expressions can result in a type qualified by `person`. More to the | |
203 | point, we know that no other procedure in any other module can create a | |
204 | `person`-qualified type without calling `person_new`. | |
205 | ||
206 | ### Mixins ### | |
207 | ||
208 | If we loosen the constraints on `map`s, and say that the range (key) | |
209 | type can be left unspecified and that values of all types can be used as | |
210 | keys in any given `map`, then we have have a way to make type qualifiers | |
211 | work like mixins: | |
212 | ||
213 | module tagged | |
214 | ||
215 | var tag_map : map to string | |
216 | ||
217 | procedure make_tagged(x: ♥t): tagged ♥t | |
218 | begin | |
219 | return (bestow tagged x) | |
220 | end | |
221 | ||
222 | procedure tag(x: tagged ♥t, y: string): void | |
223 | begin | |
224 | tag_map[x] := y | |
225 | end | |
226 | ||
227 | procedure get_tag(x: tagged ♥t): string | |
228 | begin | |
229 | return tag_map[x] | |
230 | end | |
231 | ||
232 | end | |
233 | ||
234 | I think this demonstrates an underrated principle of OO, namely | |
235 | *identity*. If I call `tag(make_tagged(4), "mine")`, do *all* | |
236 | occurrences of 4 have that tag, or just the ephemeral instance of 4 | |
237 | passed to `tag`? If there can't be seperate instances of 4, that might | |
238 | be a reason for OO languages to not treat it as an object. And if you | |
239 | believe seperate instances of 4 are silly, that's an argument against | |
240 | "everything is an object". | |
241 | ||
242 | ### Inheritance ### | |
243 | ||
244 | Since module-level variables are private to each module, they give | |
245 | Dieter something akin to traditional encapsulation. But what about | |
246 | inheritance? | |
247 | ||
248 | Well, we can get something like inheritance if we give Dieter another | |
249 | feature. We haven't said much about scope of names so far; in | |
250 | particular, we haven't said what happens when one declares two different | |
251 | procedures with the same name and number of parameters, and what, if | |
252 | anything, should happen when client code tries to call such ambiguously | |
253 | declared procedures. | |
254 | ||
255 | Hey, in the spirit of compromise, let's just say that when this happens, | |
256 | *all* those procedures are called! Specifically, all the procedures | |
257 | whose formal parameter types are compatible with the types of the actual | |
258 | parameters are called — procedures with parameters of completely | |
259 | different types, or of more highly qualified types, are not called. | |
260 | ||
261 | And, to keep this interesting, let's say that the order in which these | |
262 | procedures are called depends on the generality of their parameter | |
263 | types. If `grind(beefy ♥t):void` and `grind(beefy gnarly ♥t):void` are | |
264 | both defined, and `grind` is called with a variable whose type is | |
265 | qualified with both `beefy` and `gnarly`, then `grind(beefy ♥t):void` | |
266 | (the more general) should be called first, then | |
267 | `grind(beefy gnarly ♥t):void` (the more specific) called second. | |
268 | ||
269 | There are a few issues now that need to be dealt with. | |
270 | ||
271 | - We've discussed compatibility of parameter types of a procedure, but | |
272 | not its return type. For simplicity, let's just disallow differing | |
273 | return types on different procedures with the same name — it's a | |
274 | compile-time error. | |
275 | - It would be nice if procedures had a way to capture the results of a | |
276 | procedure that was called before them in this calling chain. So, | |
277 | let's say that every procedure has access to an implicit read-only | |
278 | variable that holds the result of the previous procedure (if any) | |
279 | that was executed along this chain. The type of this variable will | |
280 | always be the same as the return type of the current procedure | |
281 | (because by the previous bullet point, the previous procedure must | |
282 | have had the same return type as the current one.) If no previous | |
283 | procedure with this signature was executed, the value will be `nil`. | |
284 | For syntax, let's call this implicit variable `super`. | |
285 | - It would also be nice if a procedure could prevent any further | |
286 | procedures from being called in this chain. Let's give Dieter a | |
287 | special form of `return` for this purpose — say, `return final`. | |
288 | - Finally, what about incomparable signatures? Say that, in the above | |
289 | example, procedures `grind(gnarly ♥t):void` and `grind(♥t):void` are | |
290 | defined too. Then, when `grind` is called with a `beefy gnarly` | |
291 | variable, `grind(♥t):void` gets called first (it's the most | |
292 | general), but which gets called next: `grind(gnarly ♥t):void` or | |
293 | `grind(beefy ♥t):void`? We can let the programmer give some sort of | |
294 | disambiguating assertion, like `order beefy < gnarly`, to enforce a | |
295 | partial ordering between type qualifiers. Then we know that the more | |
296 | general procedure — in this case `grind(gnarly ♥t):void` — will be | |
297 | called before `grind(beefy ♥t):void`, because we just noted that, | |
298 | all other things being equal, we want `gnarly` to be treated as more | |
299 | general than `beefy`. | |
300 | ||
301 | Now: do you think it's a coincidence that the last problem looks similar | |
302 | to the problems that come from multiple inheritance, where we don't | |
303 | quite know what we should be inheriting when two of our superclasses are | |
304 | descendants of the same class? Well, I can tell you this much: it's | |
305 | definately *not* a coincidence that I chose `super` and `final` for the | |
306 | names of the other two features! | |
307 | ||
308 | This whole thing looks to me very much as if we were approaching | |
309 | object-oriented programming from another direction. Which might not be | |
310 | such a surprise, if one thinks of subclassing as a special form of type | |
311 | qualification. | |
312 | ||
313 | Of course, it's not *quite* the same in Dieter as it is in most OO | |
314 | languages. For example, procedures in Dieter do not actually override | |
315 | those with more general (less qualified) signatures, they simply add to | |
316 | them. But, those more specific procedures could be written to ignore the | |
317 | results of, and undo the state changes of, the more general procedures | |
318 | in the chain that were called first, which would accomplish essentially | |
319 | the same thing. | |
320 | ||
321 | Background | |
322 | ---------- | |
323 | ||
324 | Why did I design this language, anyway? | |
325 | ||
326 | ### Hungarian Notation ### | |
327 | ||
328 | The origins of the central idea in Dieter — **encapsulate type | |
329 | qualifiers in modules that define them** — was an indirect result of | |
330 | reading [Joel Spolsky's explanation of the value of Hungarian | |
331 | notation](http://www.joelonsoftware.com/articles/Wrong.html). He | |
332 | contrasts the original notion of Hungarian notation with the | |
333 | cargo-cultist convention that it somehow degenerated into. He explains | |
334 | how it helps make incorrect programs look incorrect. | |
335 | ||
336 | I thought, why stop there? If the purpose of Hungarian notation is to | |
337 | make evident assignment errors between variables that have the same type | |
338 | but different *roles*, then why can't you have the compiler type-check | |
339 | the roles, too? Well, you can, and that's basically what Dieter is doing | |
340 | with type qualifiers — using them as a computer-checkable form of | |
341 | Hungarian notation. | |
342 | ||
343 | ### Aliasing Prevention ### | |
344 | ||
345 | What further spurred development of the idea was the problem of | |
346 | aliasing. Aliasing is where some part of a program is dealing with two | |
347 | references which might or might not point to the same thing — | |
348 | importantly, you can't make guarantees for the sake of safety (or | |
349 | optimization) that they *don't* point to the same thing. So you have to | |
350 | assume that they might. | |
351 | ||
352 | The contribution of type qualifiers to this is that in some situations | |
353 | you might be able to give the two references two different type | |
354 | qualifiers, even though they are basically the same type, thus | |
355 | guaranteeing that they don't in fact refer to the same value. | |
356 | ||
357 | There's a significant problem with this, though: it still doesn't give | |
358 | you a hard-and-fast guarantee that no aliasing is occurring, because the | |
359 | module that defines a modifier can still do whatever it likes with it | |
360 | internally. It gives you only a sort of "module-level guarantee"; only | |
361 | if you trust the individual modules involved to not be aliasing values | |
362 | of these types, can you be sure the values won't be aliased "in the | |
363 | large". | |
364 | ||
365 | In addition, it's far from certain that there are lots of cases where | |
366 | this would *in practice* support some genuine non-trivial beneficial | |
367 | code pattern while preventing aliasing. It could be that all examples | |
368 | where this works are quite contrived. I suppose that if I were to do | |
369 | further work on Dieter, it would be to try to discover whether this is | |
370 | really the case or not. | |
371 | ||
372 | ### Related work ### | |
373 | ||
374 | Type qualifiers have been around informally for a long time, probably | |
375 | almost as long as there have been types. Here's [Dennis Ritchie ranting | |
376 | against certain type qualifiers proposed for ANSI | |
377 | C](http://www.lysator.liu.se/c/dmr-on-noalias.html). (One of which, | |
378 | coincidentally, was intended to deal with aliasing, albiet quite | |
379 | myopically.) | |
380 | ||
381 | Mark P Jones has written [a | |
382 | paper](http://web.cecs.pdx.edu/~mpj/pubs/esop92.html) and [a | |
383 | thesis-turned-book](http://web.cecs.pdx.edu/~mpj/pubs/thesis.html) | |
384 | describing a theory of qualified types. Dieter can be seen as a concrete | |
385 | instance of Jones's theory (as can many other type systems — it's a very | |
386 | general theory), although I have not explicated it as such here, as the | |
387 | resulting article would likely have been much less accessible. | |
388 | ||
389 | Haskell's type classes (another type system easily seen as a concrete | |
390 | instance of qualified types) are very similar to Dieter's type | |
391 | qualifiers. However, in Haskell, every type either belongs to or does | |
392 | not belong to some class: every `Int` is an `Eq Ord`, due to the | |
393 | mathematical properties of `Int`s. In Dieter, every type can potentially | |
394 | be modified with every qualifier: there can be `int`s, `eq int`s, | |
395 | `ord int`s, and `eq ord int`s, all slightly different. | |
396 | ||
397 | On the other end of the spectrum, solidly in the domain of application | |
398 | rather than theory, [CQUAL](http://www.cs.umd.edu/~jfoster/cqual/) is an | |
399 | extension of C which adds user-defined type qualifiers. CQUAL is | |
400 | primarily concerned with inferring type qualifiers where there are none, | |
401 | and is not concerned with encapsulating qualifiers within modules. | |
402 | ||
403 | Conclusion | |
404 | ---------- | |
405 | ||
406 | I hope you found Dieter to be an entertaining little diversion through | |
407 | type-qualifier-land (and that you did not expect too much more, or I | |
408 | imagine you'll have been somewhat disappointed.) | |
409 | ||
410 | Although there is no reference implementation of Dieter (and it's | |
411 | unlikely that there could be without some more specified semantics,) | |
412 | there is a reference parser and type-checker (not at all guaranteed to | |
413 | be bug-free) written in Python of all things. | |
414 | ||
415 | Appendix A | |
416 | ---------- | |
417 | ||
418 | ### Polymorphic Typing with Type Qualifiers ### | |
419 | ||
420 | While Dieter does not support full-blown type inference — all variables | |
421 | must be explicitly notated with their type — it does support uniform | |
422 | polymorphic typing using type variables, and it uses the core mechanism | |
423 | of type inference, namely unification, to give those variables concrete | |
424 | types at each usage instance. | |
425 | ||
426 | In place of a concrete type, a type variable may be given. Like a | |
427 | concrete type, this type variable can possess any number of type | |
428 | qualifiers. During each instance of the thing being typed (for example, | |
429 | each call to a procedure,) the type variables are resolved into concrete | |
430 | types by unification. | |
431 | ||
432 | The presence of type qualifiers makes this process more complicated. | |
433 | ||
434 | The first thing to note is that unification during inference is no | |
435 | longer commutative — it is no longer the case that, if A unifies with B, | |
436 | then B necessarily unifies with A. As mentioned, a procedure with a | |
437 | formal parameter of type `gnarly int` will not accept (i.e. is not | |
438 | compatible with) an actual parameter with type simply `int`. But the | |
439 | reverse situation is acceptable — we think of the `gnarly` modifier | |
440 | being 'dropped'. The convention we will use when describing this | |
441 | non-commutative unification is to call the formal parameter (or variable | |
442 | being assigned to) the receptor and write it on the left, and call the | |
443 | actual parameter (or expression being assigned) the provider and write | |
444 | it on the right. | |
445 | ||
446 | The cardinal rule, which applies to every primitive type expression, | |
447 | including those with variables, is that **the qualifiers on the provider | |
448 | must be a superset of the qualifiers on the receptor**. | |
449 | ||
450 | As during the conventional algorithm, an unbound type variable in either | |
451 | type expression will unify with (take on the binding of) the type | |
452 | subexpression that corresponds with it in the other type expression. In | |
453 | addition, here it will also take on the type qualifiers of that | |
454 | subexpresion, if it can. This leaves us with the question of which ones, | |
455 | and when. | |
456 | ||
457 | If the variable is in the receptor position, it might as well unify with | |
458 | *all* of the type qualifiers on the provider, because those must be a | |
459 | superset of the receptor's in order to unify at all, and because down | |
460 | the road, they can always be 'dropped' from the variable, should it be | |
461 | used as a provider. | |
462 | ||
463 | If the variable is in the provider position, the type in receptor | |
464 | position can't possibly have any more qualifiers than the variable — or | |
465 | it wouldn't be able to unify in the first place. So the variable might | |
466 | as well unify with the whole expression in the receptor position. | |
467 | ||
468 | If both positions contain variables, the provider's variable should be | |
469 | bound to the receptor's type expression, because it will be the one that | |
470 | is more general. | |
471 | ||
472 | The other thing to note that differs from conventional type inference is | |
473 | that **a type variable, once bound to a qualified type, may be | |
474 | *re-bound* to a less qualified type**. | |
475 | ||
476 | ### Examples ### | |
477 | ||
478 | Module names aren't really important for this, so they're omitted. We'll | |
479 | assume the following intrinsics are available: | |
480 | ||
481 | forward and(bool, bool): bool | |
482 | forward equal(♥t, ♥t): bool | |
483 | forward print(string): void | |
484 | ||
485 | #### Example 1 #### | |
486 | ||
487 | procedure thing(): void | |
488 | var i, j: int | |
489 | var s, t: string | |
490 | begin | |
491 | if and(equal(i, j), equal(s, t)) then print("yes") else print("no") | |
492 | end | |
493 | ||
494 | Should typecheck successfully, because the two calls to `equal` are two | |
495 | seperate instances of the type ♥t, but the two parameters inside the | |
496 | type signature of `equal` are the same instance. | |
497 | ||
498 | #### Example 2 #### | |
499 | ||
500 | forward glunt(beefy gnarly ♥t): gnarly ♥t | |
501 | ... | |
502 | procedure thing(): void | |
503 | var i: beefy gnarly int | |
504 | begin | |
505 | if equal(glunt(i), 4) then print("yes") else print("no") | |
506 | end | |
507 | ||
508 | Should typecheck successfully. The call to `glunt` returns a | |
509 | `gnarly int`. Midway through typechecking the call to `equal`, we obtain | |
510 | {♥t → `gnarly int`}. But when we typecheck the second argument we see | |
511 | that it's simply an `int`. We *re-bind* the variable to obtain {♥t → | |
512 | `int`}. This is OK with respect to the first argument — we just consider | |
513 | the `gnarly` to be dropped. | |
514 | ||
515 | #### Example 3 #### | |
516 | ||
517 | forward traub(beefy gnarly ♥t): bool | |
518 | ... | |
519 | procedure thing(p: beefy ♥s): ♥s | |
520 | begin | |
521 | if traub(p) then print("yes") else print("no") | |
522 | return p | |
523 | end | |
524 | ||
525 | Should *not* typecheck. The call to `traub` needs something qualified | |
526 | with both `beefy` and `gnarly`. `beefy ♥s` will fail to unify with it. | |
527 | ||
528 | Grammar | |
529 | ------- | |
530 | ||
531 | Dieter ::= {Module | Ordering | Forward} ".". | |
532 | Ordering ::= "order" Name/qual "<" Name/qual. | |
533 | Module ::= "module" Name/qual {"var" VarDecl} {ProcDecl} "end". | |
534 | Forward ::= "forward" Name/proc "(" [Type {"," Type}] ")" ":" Type. | |
535 | VarDecl ::= Name/var ":" Type. | |
536 | ProcDecl ::= "procedure" Name/proc "(" [VarDecl {"," VarDecl}] ")" ":" Type {"var" VarDecl} Statement. | |
537 | Statement ::= "begin" {Statement} "end" | |
538 | | "if" Expr "then" Statement ["else" Statement] | |
539 | | "while" Expr "do" Statement | |
540 | | Name/var ["[" Expr "]"] ":=" Expr | |
541 | | Name/proc "(" [Expr {"," Expr}] ")" | |
542 | | "return" ["final"] Expr | |
543 | . | |
544 | Expr ::= Name/var ["[" Expr "]"] | |
545 | | Name/proc "(" [Expr {"," Expr}] ")" | |
546 | | "(" Expr ")" | |
547 | | "bestow" Qualifier Expr | |
548 | | "super" | |
549 | . | |
550 | Type ::= {Name/qual} BareType. | |
551 | BareType ::= "map" ["from" Type] "to" Type | |
552 | | "♥" Name/tvar | |
553 | | "bool" | "int" | "rat" | "string" | "ref" | |
554 | . |
0 | The Dieter Programming Language | |
1 | =============================== | |
2 | ||
3 | _Entry_ [@ catseye.tc](https://catseye.tc/node/Dieter) | |
4 | | _See also:_ [Castile](https://github.com/catseye/Castile#readme) | |
5 | ||
6 | Description | |
7 | ----------- | |
8 | ||
9 | Dieter (that's Dieter as in the German masculine given name Dieter, not | |
10 | dieter as in "one who diets") is a little experimental programming | |
11 | language which conflates *type qualifiers* with *modules*. In this | |
12 | article, we'll first describe these mechanisms. Then we'll show how | |
13 | their interaction produces something that resembles object-oriented | |
14 | programming. | |
15 | ||
16 | ### Type Qualifiers ### | |
17 | ||
18 | A type qualifier is simply a keyword which may be placed before any type | |
19 | expression, further specializing it. Type qualifiers should be familiar | |
20 | to programmers of C, where *storage classes*, such as `const` and | |
21 | `static`, are examples of type qualifiers. The cardinal rule of type | |
22 | qualifiers in Dieter is this: **during assignment (or | |
23 | parameter-passing), a variable (or parameter) whose type possesses some | |
24 | type qualifier *only* accepts values of the same type that possess *at | |
25 | least* that same qualifier.** | |
26 | ||
27 | Some basic properties of type qualifiers follow (but first, for | |
28 | concreteness, let's stipulate that Dieter has an assortment of primitive | |
29 | types: `bool`, `int`, `rat`, and `string`. And `void` denoting the type | |
30 | of the absence of any value.) The order that type qualifiers are given | |
31 | in a type expression doesn't matter: if `beefy` and `gnarly` are type | |
32 | qualifiers, then `beefy gnarly int` and `gnarly beefy int` are wholly | |
33 | equivalent types. Neither does the number of occurrences matter: | |
34 | `beefy gnarly beefy int` is equivalent to `beefy gnarly int`. | |
35 | Algebraically speaking, type qualifiers are commmutative and idempotent. | |
36 | ||
37 | You can't assign an `int` to a `beefy int` (or pass it to a procedure | |
38 | expecting a `beefy int`,) but you can assign a `beefy int` to an `int`. | |
39 | This might sound counter-intuitive initially, but after you think about | |
40 | it a bit I'm sure you'll conclude it's appropriate: it's just like how | |
41 | you can assign a `const int` to an `int` in C, but not an `int` to a | |
42 | `const int`. `int` is more general, and will tolerate information being | |
43 | lost in the copy, whereas `beefy int` is more specific, and will not | |
44 | accept a value with insufficient information associated with it. Or, if | |
45 | you prefer, while all `beefy int`s are `int`s, there are `int`s that | |
46 | aren't `beefy` and thus can't be assigned to a variable which is | |
47 | restricted to `beefy int`s. | |
48 | ||
49 | ### Type Operators ### | |
50 | ||
51 | In order to make use of type qualifiers in user programs, Dieter | |
52 | provides type operators, similar to C's type-casting facilities, to | |
53 | manipulate type qualifiers. | |
54 | ||
55 | In fact, Dieter has only one explicit type operator, called `bestow`. It | |
56 | can be used in an expression to endow its type with further type | |
57 | qualifiers. `bestow` takes a type qualifier q and a value of some type t | |
58 | and returns a new value of type q·t. (You can think of q·t as a kind of | |
59 | pseudo-product type that obeys the algebraic laws given in the previous | |
60 | section rather than those of conventional products.) | |
61 | ||
62 | The complementary operation — stripping q·t back to t — is not | |
63 | explicitly denoted; it happens automatically as needed. (This is the | |
64 | underlying rule that was in effect, when we stated that a `beefy int` | |
65 | can be assigned to an `int`.) | |
66 | ||
67 | Note that type operators in no way alter the *value* of the expression | |
68 | they are used in, only its *type*. | |
69 | ||
70 | ### Type Variables ### | |
71 | ||
72 | Dieter supports uniform polymorphic types. Type expressions can contain | |
73 | type variables (which I'll denote with ♥ for no good reason) which can | |
74 | unify with concrete types or other type variables during instantiation | |
75 | of a language construct. In practice, this means that procedures can | |
76 | have type variables in their parameter and return types; these variables | |
77 | are replaced by types specified at the call site whenever a procedure | |
78 | call is type-checked. | |
79 | ||
80 | The scope of each type variable ranges over a single instance of a | |
81 | single procedure. So `♥x` in the parameter list of `grunt` is the same | |
82 | type as every occurrence of `♥x` anywhere in the definition of `grunt` | |
83 | during that same invokation of `grunt`, but may be different from `♥x` | |
84 | in other invokations of `grunt` (calls to `grunt` from different call | |
85 | sites), and different from `♥x` appearing in other procedures entirely. | |
86 | ||
87 | A complication for polymorphism in Dieter is that type variables range | |
88 | not only over core type expressions, but *also* over type qualifiers. | |
89 | That is, the types `beefy gnarly int` and `beefy ♥t` unify, and their | |
90 | most general unifier is {`♥t` → `gnarly int`}. This has a few | |
91 | implications for the unification algorithm; see [Appendix | |
92 | A](#appendix_a) for a detailed discussion of these. | |
93 | ||
94 | ### Modules ### | |
95 | ||
96 | Dieter is a modular language, which has its usual meaning — a Dieter | |
97 | program consists of a set of modules, each of which exposes only its | |
98 | interface, not its implementation, to all other modules. In Dieter, | |
99 | however, there is a notable twist: every type qualifier is defined by | |
100 | some module which bears the same name. The key idea of Dieter is this: | |
101 | **a type operator that affects a given qualifier may *only* be used in | |
102 | the module which defines that qualifier.** The reasoning for this is | |
103 | similar to the argument against using casts in the C language: the goal | |
104 | of typing is to specify constraints on the program's structure and to | |
105 | automatically check that they are met. If programmers are allowed to | |
106 | muck with types willy-nilly, it defeats the purpose of having these | |
107 | constraints in the first place. So the power to `bestow` a particular | |
108 | type qualifier is only given out to the module "in charge" of that type | |
109 | qualifier. This is a form of encapsulation: the internal details of some | |
110 | thing (in this case, a type modifier) can only be manipulated by the | |
111 | implementation of that thing. | |
112 | ||
113 | ### Procedures ### | |
114 | ||
115 | Each Dieter module consists of a set of procedure declarations. Any | |
116 | procedure can be called from any procedure in any module; they are | |
117 | comparable to `public static` methods in Java in this respect. | |
118 | Procedures are polymorphic, as mentioned; each one's parameter and | |
119 | return types can contain type variables. At each call site, the type | |
120 | variables are unified with the types of the values being passed to them, | |
121 | and resolved ultimately to concrete types. | |
122 | ||
123 | A bit of administrivia that we should also mention is that Dieter | |
124 | requires procedures to be declared, though not necessarily defined, | |
125 | before they are called. Much like Pascal, it offers a `forward` | |
126 | declaration construct for this purpose. It can also be used to declare | |
127 | procedures intrinsic to the implementation, procedures external to some | |
128 | set of modules, or completely ficticious procedures for the purpose of | |
129 | demonstrating and testing the type system. | |
130 | ||
131 | ### Example ### | |
132 | ||
133 | We're now ready to give a very simple example of a Dieter module. | |
134 | ||
135 | module beefy | |
136 | ||
137 | procedure beef_up(x: ♥t): beefy ♥t | |
138 | begin | |
139 | return (bestow beefy x) | |
140 | end | |
141 | ||
142 | end | |
143 | ||
144 | The intent of this example is to get the general gist of the language | |
145 | across, not to demonstrate good programming practice. The procedure | |
146 | `beef_up` essentially behaves as `bestow`, except, being a procedure, it | |
147 | may be called from any procedure, including those in other modules. So, | |
148 | it breaks the abstraction we just presented; it implements something | |
149 | akin to a "generic setter method," providing direct access to a data | |
150 | member that is ordinarily private. | |
151 | ||
152 | ### Object-oriented Programming ### | |
153 | ||
154 | We'll now try to show that this interaction between type qualifiers and | |
155 | modules produces something resembling object-oriented programming by | |
156 | adding three language features to Dieter: a declaration form and two new | |
157 | types. | |
158 | ||
159 | - Module-level variables are variables that are instantiated on, well, | |
160 | the level of the module. (This is in contrast to local variables in | |
161 | procedures, which I haven't said anything about, but which I've | |
162 | nevertheless assumed exist in Dieter...) Module-level variables | |
163 | can't be accessed directly from outside the module; they're | |
164 | comparable to `private static` fields in Java. | |
165 | - `ref` is a built-in type. These aren't references *to* anything; | |
166 | each is just a unique value, much like values of `ref` type in the | |
167 | language Erlang. A program can obtain a new `ref` value by calling | |
168 | `new_ref()`; the value thus obtained is guaranteed to be different | |
169 | from all other `ref` values in the program. | |
170 | - `map` is a built-in type constructor, over range (key) and domain | |
171 | (value) types, that defines a data structure that acts like a | |
172 | dictionary. The usual square-bracket syntax `x[y]` is used to | |
173 | reference a location within a map. | |
174 | ||
175 | We can now construct a class of objects like so: | |
176 | ||
177 | module person | |
178 | ||
179 | var name_map: map from person ref to string | |
180 | var age_map: map from person ref to int | |
181 | ||
182 | procedure person_new(name: string, age: int): person ref | |
183 | var p: person ref | |
184 | begin | |
185 | p := bestow person new_ref() | |
186 | name_map[p] := name | |
187 | age_map[p] := age | |
188 | return p | |
189 | end | |
190 | ||
191 | procedure person_get_name(p: person ref): string | |
192 | begin | |
193 | return name_map[p] | |
194 | end | |
195 | ||
196 | procedure person_attend_birthday_party(p: person ref): void | |
197 | begin | |
198 | age_map[p] := succ(age_map[p]) | |
199 | end | |
200 | ||
201 | end | |
202 | ||
203 | Because the type qualifier `person` is defined by the module `person`, | |
204 | which only uses `bestow` in one place, we know exactly what kind of | |
205 | expressions can result in a type qualified by `person`. More to the | |
206 | point, we know that no other procedure in any other module can create a | |
207 | `person`-qualified type without calling `person_new`. | |
208 | ||
209 | ### Mixins ### | |
210 | ||
211 | If we loosen the constraints on `map`s, and say that the range (key) | |
212 | type can be left unspecified and that values of all types can be used as | |
213 | keys in any given `map`, then we have have a way to make type qualifiers | |
214 | work like mixins: | |
215 | ||
216 | module tagged | |
217 | ||
218 | var tag_map : map to string | |
219 | ||
220 | procedure make_tagged(x: ♥t): tagged ♥t | |
221 | begin | |
222 | return (bestow tagged x) | |
223 | end | |
224 | ||
225 | procedure tag(x: tagged ♥t, y: string): void | |
226 | begin | |
227 | tag_map[x] := y | |
228 | end | |
229 | ||
230 | procedure get_tag(x: tagged ♥t): string | |
231 | begin | |
232 | return tag_map[x] | |
233 | end | |
234 | ||
235 | end | |
236 | ||
237 | I think this demonstrates an underrated principle of OO, namely | |
238 | *identity*. If I call `tag(make_tagged(4), "mine")`, do *all* | |
239 | occurrences of 4 have that tag, or just the ephemeral instance of 4 | |
240 | passed to `tag`? If there can't be seperate instances of 4, that might | |
241 | be a reason for OO languages to not treat it as an object. And if you | |
242 | believe seperate instances of 4 are silly, that's an argument against | |
243 | "everything is an object". | |
244 | ||
245 | ### Inheritance ### | |
246 | ||
247 | Since module-level variables are private to each module, they give | |
248 | Dieter something akin to traditional encapsulation. But what about | |
249 | inheritance? | |
250 | ||
251 | Well, we can get something like inheritance if we give Dieter another | |
252 | feature. We haven't said much about scope of names so far; in | |
253 | particular, we haven't said what happens when one declares two different | |
254 | procedures with the same name and number of parameters, and what, if | |
255 | anything, should happen when client code tries to call such ambiguously | |
256 | declared procedures. | |
257 | ||
258 | Hey, in the spirit of compromise, let's just say that when this happens, | |
259 | *all* those procedures are called! Specifically, all the procedures | |
260 | whose formal parameter types are compatible with the types of the actual | |
261 | parameters are called — procedures with parameters of completely | |
262 | different types, or of more highly qualified types, are not called. | |
263 | ||
264 | And, to keep this interesting, let's say that the order in which these | |
265 | procedures are called depends on the generality of their parameter | |
266 | types. If `grind(beefy ♥t):void` and `grind(beefy gnarly ♥t):void` are | |
267 | both defined, and `grind` is called with a variable whose type is | |
268 | qualified with both `beefy` and `gnarly`, then `grind(beefy ♥t):void` | |
269 | (the more general) should be called first, then | |
270 | `grind(beefy gnarly ♥t):void` (the more specific) called second. | |
271 | ||
272 | There are a few issues now that need to be dealt with. | |
273 | ||
274 | - We've discussed compatibility of parameter types of a procedure, but | |
275 | not its return type. For simplicity, let's just disallow differing | |
276 | return types on different procedures with the same name — it's a | |
277 | compile-time error. | |
278 | - It would be nice if procedures had a way to capture the results of a | |
279 | procedure that was called before them in this calling chain. So, | |
280 | let's say that every procedure has access to an implicit read-only | |
281 | variable that holds the result of the previous procedure (if any) | |
282 | that was executed along this chain. The type of this variable will | |
283 | always be the same as the return type of the current procedure | |
284 | (because by the previous bullet point, the previous procedure must | |
285 | have had the same return type as the current one.) If no previous | |
286 | procedure with this signature was executed, the value will be `nil`. | |
287 | For syntax, let's call this implicit variable `super`. | |
288 | - It would also be nice if a procedure could prevent any further | |
289 | procedures from being called in this chain. Let's give Dieter a | |
290 | special form of `return` for this purpose — say, `return final`. | |
291 | - Finally, what about incomparable signatures? Say that, in the above | |
292 | example, procedures `grind(gnarly ♥t):void` and `grind(♥t):void` are | |
293 | defined too. Then, when `grind` is called with a `beefy gnarly` | |
294 | variable, `grind(♥t):void` gets called first (it's the most | |
295 | general), but which gets called next: `grind(gnarly ♥t):void` or | |
296 | `grind(beefy ♥t):void`? We can let the programmer give some sort of | |
297 | disambiguating assertion, like `order beefy < gnarly`, to enforce a | |
298 | partial ordering between type qualifiers. Then we know that the more | |
299 | general procedure — in this case `grind(gnarly ♥t):void` — will be | |
300 | called before `grind(beefy ♥t):void`, because we just noted that, | |
301 | all other things being equal, we want `gnarly` to be treated as more | |
302 | general than `beefy`. | |
303 | ||
304 | Now: do you think it's a coincidence that the last problem looks similar | |
305 | to the problems that come from multiple inheritance, where we don't | |
306 | quite know what we should be inheriting when two of our superclasses are | |
307 | descendants of the same class? Well, I can tell you this much: it's | |
308 | definately *not* a coincidence that I chose `super` and `final` for the | |
309 | names of the other two features! | |
310 | ||
311 | This whole thing looks to me very much as if we were approaching | |
312 | object-oriented programming from another direction. Which might not be | |
313 | such a surprise, if one thinks of subclassing as a special form of type | |
314 | qualification. | |
315 | ||
316 | Of course, it's not *quite* the same in Dieter as it is in most OO | |
317 | languages. For example, procedures in Dieter do not actually override | |
318 | those with more general (less qualified) signatures, they simply add to | |
319 | them. But, those more specific procedures could be written to ignore the | |
320 | results of, and undo the state changes of, the more general procedures | |
321 | in the chain that were called first, which would accomplish essentially | |
322 | the same thing. | |
323 | ||
324 | Background | |
325 | ---------- | |
326 | ||
327 | Why did I design this language, anyway? | |
328 | ||
329 | ### Hungarian Notation ### | |
330 | ||
331 | The origins of the central idea in Dieter — **encapsulate type | |
332 | qualifiers in modules that define them** — was an indirect result of | |
333 | reading [Joel Spolsky's explanation of the value of Hungarian | |
334 | notation](http://www.joelonsoftware.com/articles/Wrong.html). He | |
335 | contrasts the original notion of Hungarian notation with the | |
336 | cargo-cultist convention that it somehow degenerated into. He explains | |
337 | how it helps make incorrect programs look incorrect. | |
338 | ||
339 | I thought, why stop there? If the purpose of Hungarian notation is to | |
340 | make evident assignment errors between variables that have the same type | |
341 | but different *roles*, then why can't you have the compiler type-check | |
342 | the roles, too? Well, you can, and that's basically what Dieter is doing | |
343 | with type qualifiers — using them as a computer-checkable form of | |
344 | Hungarian notation. | |
345 | ||
346 | ### Aliasing Prevention ### | |
347 | ||
348 | What further spurred development of the idea was the problem of | |
349 | aliasing. Aliasing is where some part of a program is dealing with two | |
350 | references which might or might not point to the same thing — | |
351 | importantly, you can't make guarantees for the sake of safety (or | |
352 | optimization) that they *don't* point to the same thing. So you have to | |
353 | assume that they might. | |
354 | ||
355 | The contribution of type qualifiers to this is that in some situations | |
356 | you might be able to give the two references two different type | |
357 | qualifiers, even though they are basically the same type, thus | |
358 | guaranteeing that they don't in fact refer to the same value. | |
359 | ||
360 | There's a significant problem with this, though: it still doesn't give | |
361 | you a hard-and-fast guarantee that no aliasing is occurring, because the | |
362 | module that defines a modifier can still do whatever it likes with it | |
363 | internally. It gives you only a sort of "module-level guarantee"; only | |
364 | if you trust the individual modules involved to not be aliasing values | |
365 | of these types, can you be sure the values won't be aliased "in the | |
366 | large". | |
367 | ||
368 | In addition, it's far from certain that there are lots of cases where | |
369 | this would *in practice* support some genuine non-trivial beneficial | |
370 | code pattern while preventing aliasing. It could be that all examples | |
371 | where this works are quite contrived. I suppose that if I were to do | |
372 | further work on Dieter, it would be to try to discover whether this is | |
373 | really the case or not. | |
374 | ||
375 | ### Related work ### | |
376 | ||
377 | Type qualifiers have been around informally for a long time, probably | |
378 | almost as long as there have been types. Here's [Dennis Ritchie ranting | |
379 | against certain type qualifiers proposed for ANSI | |
380 | C](http://www.lysator.liu.se/c/dmr-on-noalias.html). (One of which, | |
381 | coincidentally, was intended to deal with aliasing, albiet quite | |
382 | myopically.) | |
383 | ||
384 | Mark P Jones has written [a | |
385 | paper](http://web.cecs.pdx.edu/~mpj/pubs/esop92.html) and [a | |
386 | thesis-turned-book](http://web.cecs.pdx.edu/~mpj/pubs/thesis.html) | |
387 | describing a theory of qualified types. Dieter can be seen as a concrete | |
388 | instance of Jones's theory (as can many other type systems — it's a very | |
389 | general theory), although I have not explicated it as such here, as the | |
390 | resulting article would likely have been much less accessible. | |
391 | ||
392 | Haskell's type classes (another type system easily seen as a concrete | |
393 | instance of qualified types) are very similar to Dieter's type | |
394 | qualifiers. However, in Haskell, every type either belongs to or does | |
395 | not belong to some class: every `Int` is an `Eq Ord`, due to the | |
396 | mathematical properties of `Int`s. In Dieter, every type can potentially | |
397 | be modified with every qualifier: there can be `int`s, `eq int`s, | |
398 | `ord int`s, and `eq ord int`s, all slightly different. | |
399 | ||
400 | On the other end of the spectrum, solidly in the domain of application | |
401 | rather than theory, [CQUAL](http://www.cs.umd.edu/~jfoster/cqual/) is an | |
402 | extension of C which adds user-defined type qualifiers. CQUAL is | |
403 | primarily concerned with inferring type qualifiers where there are none, | |
404 | and is not concerned with encapsulating qualifiers within modules. | |
405 | ||
406 | Conclusion | |
407 | ---------- | |
408 | ||
409 | I hope you found Dieter to be an entertaining little diversion through | |
410 | type-qualifier-land (and that you did not expect too much more, or I | |
411 | imagine you'll have been somewhat disappointed.) | |
412 | ||
413 | Although there is no reference implementation of Dieter (and it's | |
414 | unlikely that there could be without some more specified semantics,) | |
415 | there is a reference parser and type-checker (not at all guaranteed to | |
416 | be bug-free) written in Python of all things. | |
417 | ||
418 | Appendix A | |
419 | ---------- | |
420 | ||
421 | ### Polymorphic Typing with Type Qualifiers ### | |
422 | ||
423 | While Dieter does not support full-blown type inference — all variables | |
424 | must be explicitly notated with their type — it does support uniform | |
425 | polymorphic typing using type variables, and it uses the core mechanism | |
426 | of type inference, namely unification, to give those variables concrete | |
427 | types at each usage instance. | |
428 | ||
429 | In place of a concrete type, a type variable may be given. Like a | |
430 | concrete type, this type variable can possess any number of type | |
431 | qualifiers. During each instance of the thing being typed (for example, | |
432 | each call to a procedure,) the type variables are resolved into concrete | |
433 | types by unification. | |
434 | ||
435 | The presence of type qualifiers makes this process more complicated. | |
436 | ||
437 | The first thing to note is that unification during inference is no | |
438 | longer commutative — it is no longer the case that, if A unifies with B, | |
439 | then B necessarily unifies with A. As mentioned, a procedure with a | |
440 | formal parameter of type `gnarly int` will not accept (i.e. is not | |
441 | compatible with) an actual parameter with type simply `int`. But the | |
442 | reverse situation is acceptable — we think of the `gnarly` modifier | |
443 | being 'dropped'. The convention we will use when describing this | |
444 | non-commutative unification is to call the formal parameter (or variable | |
445 | being assigned to) the receptor and write it on the left, and call the | |
446 | actual parameter (or expression being assigned) the provider and write | |
447 | it on the right. | |
448 | ||
449 | The cardinal rule, which applies to every primitive type expression, | |
450 | including those with variables, is that **the qualifiers on the provider | |
451 | must be a superset of the qualifiers on the receptor**. | |
452 | ||
453 | As during the conventional algorithm, an unbound type variable in either | |
454 | type expression will unify with (take on the binding of) the type | |
455 | subexpression that corresponds with it in the other type expression. In | |
456 | addition, here it will also take on the type qualifiers of that | |
457 | subexpresion, if it can. This leaves us with the question of which ones, | |
458 | and when. | |
459 | ||
460 | If the variable is in the receptor position, it might as well unify with | |
461 | *all* of the type qualifiers on the provider, because those must be a | |
462 | superset of the receptor's in order to unify at all, and because down | |
463 | the road, they can always be 'dropped' from the variable, should it be | |
464 | used as a provider. | |
465 | ||
466 | If the variable is in the provider position, the type in receptor | |
467 | position can't possibly have any more qualifiers than the variable — or | |
468 | it wouldn't be able to unify in the first place. So the variable might | |
469 | as well unify with the whole expression in the receptor position. | |
470 | ||
471 | If both positions contain variables, the provider's variable should be | |
472 | bound to the receptor's type expression, because it will be the one that | |
473 | is more general. | |
474 | ||
475 | The other thing to note that differs from conventional type inference is | |
476 | that **a type variable, once bound to a qualified type, may be | |
477 | *re-bound* to a less qualified type**. | |
478 | ||
479 | ### Examples ### | |
480 | ||
481 | Module names aren't really important for this, so they're omitted. We'll | |
482 | assume the following intrinsics are available: | |
483 | ||
484 | forward and(bool, bool): bool | |
485 | forward equal(♥t, ♥t): bool | |
486 | forward print(string): void | |
487 | ||
488 | #### Example 1 #### | |
489 | ||
490 | procedure thing(): void | |
491 | var i, j: int | |
492 | var s, t: string | |
493 | begin | |
494 | if and(equal(i, j), equal(s, t)) then print("yes") else print("no") | |
495 | end | |
496 | ||
497 | Should typecheck successfully, because the two calls to `equal` are two | |
498 | seperate instances of the type ♥t, but the two parameters inside the | |
499 | type signature of `equal` are the same instance. | |
500 | ||
501 | #### Example 2 #### | |
502 | ||
503 | forward glunt(beefy gnarly ♥t): gnarly ♥t | |
504 | ... | |
505 | procedure thing(): void | |
506 | var i: beefy gnarly int | |
507 | begin | |
508 | if equal(glunt(i), 4) then print("yes") else print("no") | |
509 | end | |
510 | ||
511 | Should typecheck successfully. The call to `glunt` returns a | |
512 | `gnarly int`. Midway through typechecking the call to `equal`, we obtain | |
513 | {♥t → `gnarly int`}. But when we typecheck the second argument we see | |
514 | that it's simply an `int`. We *re-bind* the variable to obtain {♥t → | |
515 | `int`}. This is OK with respect to the first argument — we just consider | |
516 | the `gnarly` to be dropped. | |
517 | ||
518 | #### Example 3 #### | |
519 | ||
520 | forward traub(beefy gnarly ♥t): bool | |
521 | ... | |
522 | procedure thing(p: beefy ♥s): ♥s | |
523 | begin | |
524 | if traub(p) then print("yes") else print("no") | |
525 | return p | |
526 | end | |
527 | ||
528 | Should *not* typecheck. The call to `traub` needs something qualified | |
529 | with both `beefy` and `gnarly`. `beefy ♥s` will fail to unify with it. | |
530 | ||
531 | Grammar | |
532 | ------- | |
533 | ||
534 | Dieter ::= {Module | Ordering | Forward} ".". | |
535 | Ordering ::= "order" Name/qual "<" Name/qual. | |
536 | Module ::= "module" Name/qual {"var" VarDecl} {ProcDecl} "end". | |
537 | Forward ::= "forward" Name/proc "(" [Type {"," Type}] ")" ":" Type. | |
538 | VarDecl ::= Name/var ":" Type. | |
539 | ProcDecl ::= "procedure" Name/proc "(" [VarDecl {"," VarDecl}] ")" ":" Type {"var" VarDecl} Statement. | |
540 | Statement ::= "begin" {Statement} "end" | |
541 | | "if" Expr "then" Statement ["else" Statement] | |
542 | | "while" Expr "do" Statement | |
543 | | Name/var ["[" Expr "]"] ":=" Expr | |
544 | | Name/proc "(" [Expr {"," Expr}] ")" | |
545 | | "return" ["final"] Expr | |
546 | . | |
547 | Expr ::= Name/var ["[" Expr "]"] | |
548 | | Name/proc "(" [Expr {"," Expr}] ")" | |
549 | | "(" Expr ")" | |
550 | | "bestow" Qualifier Expr | |
551 | | "super" | |
552 | . | |
553 | Type ::= {Name/qual} BareType. | |
554 | BareType ::= "map" ["from" Type] "to" Type | |
555 | | "♥" Name/tvar | |
556 | | "bool" | "int" | "rat" | "string" | "ref" | |
557 | . |
74 | 74 | failed = False |
75 | 75 | try: |
76 | 76 | module.typecheck(context) |
77 | except context.TypingError, e: | |
77 | except context.TypingError as e: | |
78 | 78 | logger.info("caught TypingError " + str(e)) |
79 | 79 | failed = True |
80 | except Exception, e: | |
80 | except Exception as e: | |
81 | 81 | logger.info("caught Exception " + str(e)) |
82 | 82 | finally: |
83 | 83 | if not failed: |
63 | 63 | def dump(self): |
64 | 64 | for k, v in self.map.iteritems(): |
65 | 65 | if isinstance(v, Type): |
66 | print k + " : " + str(v) # return string instead? | |
66 | print(k + " : " + str(v)) # return string instead? | |
67 | 67 | else: |
68 | print k + " : " + v | |
68 | print(k + " : " + v) | |
69 | 69 | |
70 | 70 | def assert_equiv(self, inwhat, s, t): |
71 | 71 | if not s.unify(t): |
14 | 14 | Create a new Scanner object that will consume the given |
15 | 15 | UTF-8 encoded input string. |
16 | 16 | """ |
17 | self._input = unicode(input, 'utf-8') | |
17 | try: | |
18 | self._input = unicode(input, 'utf-8') | |
19 | except NameError: | |
20 | self._input = input.decode('utf-8') | |
18 | 21 | self._token = None |
19 | 22 | self.scan() |
20 | 23 | |
68 | 71 | self._token = self._input[0] |
69 | 72 | self._input = self._input[1:] |
70 | 73 | self.toktype = "op" |
71 | #print "scanned: '" + str(self._token) + "'" | |
74 | # print("scanned: '" + str(self._token) + "'") | |
72 | 75 | |
73 | 76 | def get_token(self): |
74 | 77 | """ |
101 | 104 | """ |
102 | 105 | Log the given scan error. |
103 | 106 | """ |
104 | print "error: " + str | |
107 | print("error: " + str) | |
105 | 108 | self.scan() |
15 | 15 | |
16 | 16 | |
17 | 17 | def load(filename, options): |
18 | f = open(filename, "r") | |
19 | scanner = Scanner(f.read()) | |
20 | f.close() | |
18 | with open(filename, "rb") as f: | |
19 | scanner = Scanner(f.read()) | |
21 | 20 | parser = Parser(scanner) |
22 | 21 | ast = parser.Dieter() |
23 | 22 | context = TypingContext(None) |
25 | 24 | logging.basicConfig(level=logging.INFO) |
26 | 25 | ast.typecheck(context) |
27 | 26 | if options.dump_ast: |
28 | print "--- AST: ---" | |
29 | print ast.dump(0) | |
27 | print("--- AST: ---") | |
28 | print(ast.dump(0)) | |
30 | 29 | if options.dump_symtab: |
31 | print "--- Symbol Table: ---" | |
30 | print("--- Symbol Table: ---") | |
32 | 31 | context.dump() |
32 | ||
33 | 33 | |
34 | 34 | def main(argv): |
35 | 35 | optparser = OptionParser("[python] dieter.py {options} {source.dtr}\n" + __doc__) |
0 | 0 | #!/bin/sh |
1 | 1 | |
2 | for FILE in eg/*.dtr; do | |
3 | echo $FILE | |
4 | src/dieter.py $FILE || exit 1 | |
5 | done | |
2 | MISSING="" | |
3 | ||
4 | if command -v python2 > /dev/null 2>&1; then | |
5 | for FILE in eg/*.dtr; do | |
6 | echo "python2 src/dieter.py $FILE" | |
7 | python2 src/dieter.py $FILE || exit 1 | |
8 | done | |
9 | else | |
10 | MISSING="${MISSING}2" | |
11 | fi | |
12 | if command -v python3 > /dev/null 2>&1; then | |
13 | for FILE in eg/*.dtr; do | |
14 | echo "python3 src/dieter.py $FILE" | |
15 | python3 src/dieter.py $FILE || exit 1 | |
16 | done | |
17 | else | |
18 | MISSING="${MISSING}3" | |
19 | fi | |
20 | ||
21 | if [ "x${MISSING}" = "x23" ]; then | |
22 | echo "Neither python2 nor python3 found on executable search path. Aborting." | |
23 | exit 1 | |
24 | fi |