git @ Cat's Eye Technologies Castile / rel_0_3_2021_0625
Merge pull request #1 from catseye/develop-2021 Develop 2021 Chris Pressey authored 3 years ago GitHub committed 3 years ago
20 changed file(s) with 1325 addition(s) and 1337 deletion(s). Raw diff Collapse all Expand all
00 *.pyc
1 __pycache__
+0
-3
.hgignore less more
0 syntax: glob
1
2 *.pyc
+0
-4
.hgtags less more
0 078b8f15f0632cedae29522ad1dcce3095dd61a3 rel_0_1
1 ccc418957b6377337391ee608916fd5f1b0c8d40 rel_0_2
2 1a4234975ba8138d2a4c1aedc29eae55d20a5688 rel_0_3
3 93a9b9b9efde6d41d549482b2ff9d7f2a4e44022 rel_0_3_2015_0101
+0
-1156
README.markdown less more
0 Castile
1 =======
2
3 This is the reference distribution for Castile, an unremarkable programming
4 language.
5
6 The current version of Castile is 0.3. It is not only subject to change,
7 it is pretty much *guaranteed* to change.
8
9 Unlike most of my programming languages, there is nothing that could really
10 be described as innovative or experimental or even particularly unusual
11 about Castile. It is not a particularly comfortable programming experience,
12 often forcing the programmer to be explicit and verbose.
13
14 The reference implementation is slightly less unremarkable than the language
15 itself, if only for the fact that it compiles to four different target
16 languages: Javascript, Ruby, a hypothetical stack machine called
17 "stackmac" (a stackmac emulator ships with this distribution,) and (coming
18 soon) C.
19
20 Castile's influences might include:
21
22 * **C**: Most of Castile's syntax follows C, but it is generally more
23 permissive (semicolons are optional, types of local variables and return
24 types for functions do not have to be declared, etc.) It has a type
25 system (where `struct`s are the only types with name equivalence) which
26 can be applied statically. It has function values, but not closures.
27
28 * **Rust**: There is a union type, to which values must be explicitly
29 promoted (with `as`) and extracted (with `typecase ... is`.) This is
30 like Rust's `Enum`, which is (to quote its tutorial) "much like the
31 'tagged union' pattern in C, but with better static guarantees." Along
32 with structs, this provides something similar to algebraic data typing,
33 as seen in languages such as Haskell, Scala, etc.
34
35 * **Eightebed**: A few years back I realized that pointers that can
36 assume a null value are really a variant type, like Haskell's `Maybe`.
37 Of course, in most languages with pointers, the property of being null
38 isn't captured by the type; you can go ahead and dereference a pointer
39 in C or Java, whether it's valid or not. In Castile, this is captured
40 with a union type which includes `void`, and `typecase` generalizes
41 Eightebed's `ifvalid`.
42
43 * **Python**: The first time a local variable is assigned counts as its
44 declaration as a local.
45
46 * **Ruby**: The last expression in a function body is the return value
47 of that function; no explicit `return` is needed there. (But unlike
48 Ruby, and more like Pascal or linted C, all *other* expressions in
49 statement position within a block must have void type.)
50
51 * **Erlang** (or any other purely functional language): There are no
52 language-level pointers; sharing, if it happens at all, must be
53 orchestrated by the implementation. Global variables and function
54 arguments are not mutable, and neither are the fields of structs.
55 (But unlike Erlang, local variables *are* mutable.)
56
57 Some lines of research underneath all this are, if all we have is a relatively
58 crude language, but we make it typesafe and give it a slightly nicer type
59 system, does it suffice to make programming tolerable? Do tolerable ways of
60 managing memory without a full garbage collector present themselves? Does
61 having a simple compiler which can be target many backends provide any
62 advantages?
63
64 Also unlike most of my programming languages (with the exceptions of ILLGOL
65 and Bhuna), Castile was largely "designed by building" -- I wrote an
66 interpreter, and the language it happens to accept, I called Castile.
67 I wrote the interpreter in a very short span of time; most of it was done
68 within 24 hours of starting (but consider that I ripped off some of the
69 scanning/parsing code from ALPACA.) A few days later, I extended the
70 implementation to also allow compiling to Javascript, and a few days after
71 that, I added a Ruby backend (why not, eh?), and over the next few days,
72 the stackmac backend and emulator.
73
74 This document contains what is as close as there is to a specification of
75 the language, in the form of a Falderal test suite. The interpreter and all
76 compilers pass all the tests, but there are known shortcomings in at least
77 the compilers (no name mangling, etc.)
78
79 The `eg` directory contains a few example Castile programs, including a
80 string tokenizer.
81
82 One area where the Castile implementation is not entirely unremarkable is
83 that the typechecker is not required to be run. Unchecked Castile is
84 technically a different language from Castile; there are Castile programs
85 which would result in an error, where the Unchecked Castile program would
86 *not* (because it never executes the part of the program with a bad type.)
87 However, Unchecked Castile programs should be otherwise well-behaved;
88 any attempt to execute code which would have resulted in a type failure,
89 will result in a crash. Note, however, that this only applies to the
90 evaluator, not any of the compiler backends. Compiling Unchecked Castile
91 will simply not work (the backend will crash when it can't see any types.)
92
93 Grammar
94 -------
95
96 Program ::= {Defn [";"]}.
97 Defn ::= "fun" ident "(" [Arg {"," Arg}] ")" Body
98 | "struct" ident "{" {ident ":" TExpr [";"]} "}"
99 | ident (":" TExpr0 | "=" Literal).
100 Arg ::= ident [":" TExpr1].
101 Body ::= "{" {Stmt [";"]} "}".
102 Stmt ::= "while" Expr0 Block
103 | "typecase" ident "is" TExpr0 Block
104 | "do" Expr0
105 | "return" Expr0
106 | If
107 | Expr0.
108 Block ::= "{" {Stmt [";"]} "}".
109 If ::= "if" Expr0 Block ["else" (Block | If)].
110 Expr0 ::= Expr1 {("and" | "or") Expr1} ["as" TExpr0].
111 Expr1 ::= Expr2 {(">" | ">=" | "<" | "<=" | "==" | "!=") Expr2}.
112 Expr2 ::= Expr3 {("+" | "-") Expr3}.
113 Expr3 ::= Expr4 {("*" | "/") Expr4}.
114 Expr4 ::= Expr5 {"(" [Expr0 {"," Expr0}] ")" | "." ident}.
115 Expr5 ::= "make" ident "(" [ident ":" Expr0 {"," ident ":" Expr0}] ")"
116 | "(" Expr0 ")"
117 | "not" Expr1
118 | Literal
119 | ident ["=" Expr0].
120 Literal ::= strlit
121 | ["-"] intlit
122 | "true" | "false" | "null"
123 | "fun" "(" [Arg {"," Arg}] ")" Body.
124 TExpr0 ::= TExpr1 [{"," TExpr1} "->" TExpr1].
125 TExpr1 ::= TExpr2 {"|" TExpr2}.
126 TExpr2 ::= "integer"
127 | "boolean"
128 | "void"
129 | "(" TExpr0 ")"
130 | ident.
131
132 Examples
133 --------
134
135 -> Tests for functionality "Run Castile Program"
136
137 ### Rudiments ###
138
139 Minimal correct program.
140
141 | fun main() {}
142 =
143
144 A program may evaluate to a value.
145
146 | fun main() { 160 }
147 = 160
148
149 The function named `main` is the one that is evaluated when the
150 program is run.
151
152 | fun foobar(a, b, c) { 100 }
153 | fun main() { 120 }
154 | fun f() { 140 }
155 = 120
156
157 `main` should have no formal arguments.
158
159 | fun main(a, b, c) {
160 | 120
161 | }
162 ? type mismatch
163
164 But other functions may.
165
166 | fun foobar(a, b) { b }
167 | fun main() { foobar(100, 200) }
168 = 200
169
170 Defined function names must be unique.
171
172 | fun dup() { 1 }
173 | fun dup() { 2 }
174 ? duplicate
175
176 Formal argument names must be unique.
177
178 | fun f(g, g) {}
179 | fun main() { 1 }
180 ? defined
181
182 Functions must be defined before they are referenced.
183
184 | fun main() { f(7) }
185 | fun f(g) { g }
186 ? undefined
187
188 Either that, or forward-declared.
189
190 | f : integer -> integer
191 | fun main() { f(7) }
192 | fun f(g) { g * 2 }
193 = 14
194
195 If forward-declared, types must match.
196
197 | f : integer -> string
198 | fun main() { f(7) }
199 | fun f(g) { g * 2 }
200 ? type mismatch
201
202 Arguments must match...
203
204 | fun f(g, h) { g * 2 + h * 2 }
205 | fun main() { f(7) }
206 ? argument mismatch
207
208 | fun f(g, h) { g * 2 + h * 2 }
209 | fun main() { f(7,8,9) }
210 ? argument mismatch
211
212 ### Statements ###
213
214 Statements are commands that have the type void and are executed for their
215 side-effects. So, in general, statements may not be expressions. The
216 exception is that the last statement in a block may be an expression; the
217 result of that expression is the value of the block.
218
219 | fun main() {
220 | 20 * 8
221 | }
222 = 160
223
224 | fun main() {
225 | 20 + 3 * 8;
226 | 20 * 8
227 | }
228 ? type mismatch
229
230 An `if`/`else` lets you make decisions.
231
232 | fun main() {
233 | a = 0;
234 | if 3 > 2 {
235 | a = 70
236 | } else {
237 | a = 80
238 | }
239 | a
240 | }
241 = 70
242
243 An `if` need not have an `else`.
244
245 | fun main() {
246 | a = 60
247 | if 3 > 2 {
248 | a = 70
249 | }
250 | a
251 | }
252 = 70
253
254 `if` always typechecks to void, one branch or two.
255
256 | fun main() {
257 | a = 60
258 | if 3 > 2 {
259 | a = 70
260 | }
261 | }
262 =
263
264 | fun main() {
265 | a = 60
266 | if 3 > 2 {
267 | a = 70
268 | } else {
269 | a = 90
270 | }
271 | }
272 =
273
274 If an `if` does have an `else`, the part after `else` must be either a block
275 (already shown) or another `if`.
276
277 | fun main() {
278 | if 2 > 3 {
279 | return 60
280 | } else if 4 > 5 {
281 | return 0
282 | } else {
283 | return 1
284 | }
285 | }
286 = 1
287
288 No dangling else problem.
289
290 | fun main() {
291 | if 2 > 3 {
292 | return 60
293 | } else if 4 < 5 {
294 | return 99
295 | } else {
296 | return 1
297 | }
298 | }
299 = 99
300
301 `while` loops.
302
303 | fun main() {
304 | a = 0 b = 4
305 | while b > 0 {
306 | a = a + b
307 | b = b - 1
308 | }
309 | a
310 | }
311 = 10
312
313 A `while` itself has void type.
314
315 | fun main() {
316 | a = 0; b = 4;
317 | while b > 0 {
318 | a = a + b;
319 | b = b - 1;
320 | }
321 | }
322 =
323
324 `break` may be used to prematurely exit a `while`.
325
326 | fun main() {
327 | a = 0; b = 0;
328 | while true {
329 | a = a + b;
330 | b = b + 1;
331 | if (b > 4) { break; }
332 | }
333 | a
334 | }
335 = 10
336
337 ### Expressions ###
338
339 Precedence.
340
341 | fun main() {
342 | 2 + 3 * 4 /* not 20 */
343 | }
344 = 14
345
346 Unary negation.
347
348 | fun main() {
349 | -3
350 | }
351 = -3
352
353 | fun main() {
354 | 2+-5
355 | }
356 = -3
357
358 Minus sign must be right in front of a number.
359
360 | fun main() {
361 | -(4)
362 | }
363 ? Expected
364
365 Unary not.
366
367 | fun main() {
368 | not (4 > 3)
369 | }
370 = False
371
372 Precedence of unary not.
373
374 | fun main() {
375 | not true or true
376 | }
377 = True
378
379 | fun main() {
380 | not 3 > 4
381 | }
382 = True
383
384 ### Local Variables ###
385
386 Local variables.
387
388 | fun main() {
389 | a = 6;
390 | b = 7;
391 | a + b
392 | }
393 = 13
394
395 Local variables can be assigned functions.
396
397 | fun ancillary(x) { x * 2 }
398 | fun main() {
399 | a = ancillary;
400 | a(7)
401 | }
402 = 14
403
404 Local variables can be assigned.
405
406 | fun main() {
407 | a = 6;
408 | a = a + 12;
409 | a
410 | }
411 = 18
412
413 | fun main() {
414 | a = 6;
415 | z = 99;
416 | a
417 | }
418 = 6
419
420 | fun main() {
421 | z = 6;
422 | a
423 | }
424 ? undefined
425
426 Local variables cannot occur in expressions until they are defined by an
427 initial assignment.
428
429 | fun main() {
430 | z = a * 10;
431 | a = 10;
432 | z
433 | }
434 ? undefined
435
436 A local variables may not be defined inside an `if` or `while` or `typecase`
437 block, as it might not be executed.
438
439 | fun main() {
440 | if (4 > 5) {
441 | a = 10;
442 | } else {
443 | b = 11;
444 | }
445 | b
446 | }
447 ? within control
448
449 | fun main() {
450 | b = false;
451 | while b {
452 | a = 10;
453 | }
454 | a
455 | }
456 ? within control
457
458 | fun main() {
459 | a = 55 as integer|string;
460 | typecase a is string {
461 | b = 7
462 | }
463 | a
464 | }
465 ? within control
466
467 Assignment, though it syntactically may occur in expressions, has a type of
468 void, so it can only really happen at the statement level.
469
470 | fun main() {
471 | a = 0; b = 0;
472 | a = b = 9;
473 | }
474 ? type mismatch
475
476 Variables in upper scopes may be modified.
477
478 | fun main() {
479 | a = 0
480 | if 3 > 2 {
481 | a = 4;
482 | }
483 | a
484 | }
485 = 4
486
487 ### Non-local Values ###
488
489 Literals may appear at the toplevel. Semicolons are optional at toplevel.
490
491 | factor = 5;
492 | fun main() {
493 | 6 * factor
494 | }
495 = 30
496
497 Toplevel literals may not be updated. (And thus
498
499 | factor = 5
500 | fun main() {
501 | factor = 7
502 | }
503 ? shadows
504
505 Toplevel literals may be function literals (the syntax we've been using is just sugar.)
506
507 | main = fun() {
508 | 7
509 | }
510 = 7
511
512 Truth and falsehood are builtin toplevels.
513
514 | fun main() {
515 | true or false
516 | }
517 = True
518
519 | fun main() {
520 | false and true
521 | }
522 = False
523
524 So is `null`, which is the single value of `void` type.
525
526 | fun wat(x: void) { 3 }
527 | fun main() {
528 | wat(null)
529 | }
530 = 3
531
532 ### More on Functions ###
533
534 Function arguments may not be updated.
535
536 | fun foo(x) {
537 | x = x + 14;
538 | x
539 | }
540 | fun main() {
541 | foo(7)
542 | }
543 ? shadows
544
545 Factorial can be computed.
546
547 | factorial : integer -> integer
548 | fun factorial(a) {
549 | if a == 0 {
550 | return 1
551 | } else {
552 | return a * factorial(a - 1)
553 | }
554 | }
555 | fun main() {
556 | factorial(6)
557 | }
558 = 720
559
560 Literal functions.
561
562 | fun main() {
563 | inc = fun(x) { x + 1 };
564 | inc(7)
565 | }
566 = 8
567
568 | fun main() {
569 | fun(x){ x + 1 }(9)
570 | }
571 = 10
572
573 | fun main() {
574 | a = 99;
575 | a = fun(x){ x + 1 }(9);
576 | a
577 | }
578 = 10
579
580 Literal functions can have local variables, loops, etc.
581
582 | fun main() {
583 | z = 99;
584 | z = fun(x) {
585 | a = x; b = x;
586 | while a > 0 {
587 | b = b + a; a = a - 1;
588 | }
589 | return b
590 | }(9);
591 | z
592 | }
593 = 54
594
595 Literal functions can define other literal functions...
596
597 | fun main() {
598 | fun(x){ fun(y){ fun(z){ z + 1 } } }(4)(4)(10)
599 | }
600 = 11
601
602 Literal functions can access globals.
603
604 | oid = 19
605 | fun main() {
606 | fun(x){ x + oid }(11);
607 | }
608 = 30
609
610 Literal functions cannot access variables declared in enclosing scopes.
611
612 | fun main() {
613 | oid = 19;
614 | fun(x){ x + oid }(11);
615 | }
616 ? undefined
617
618 Literal functions cannot access arguments declared in enclosing scopes.
619
620 | fun main() {
621 | fun(x){ fun(y){ fun(z){ y + 1 } } }(4)(4)(10)
622 | }
623 ? undefined
624
625 Functions can be passed to functions and returned from functions.
626
627 | fun doubble(x) { x * 2 }
628 | fun triple(x) { x * 3 }
629 | fun apply_and_add_one(f: (integer -> integer), x) { f(x) + 1 }
630 | fun sellect(a) { if a > 10 { return doubble } else { return triple } }
631 | fun main() {
632 | t = sellect(5);
633 | d = sellect(15);
634 | p = t(10);
635 | apply_and_add_one(d, p)
636 | }
637 = 61
638
639 To overcome the syntactic ambiguity with commas, function types
640 in function definitions must be in parens.
641
642 | fun add(x, y) { x + y }
643 | fun mul(x, y) { x * y }
644 | fun do_it(f: (integer, integer -> integer), g) {
645 | f(3, g)
646 | }
647 | fun main() {
648 | do_it(mul, 4) - do_it(add, 4)
649 | }
650 = 5
651
652 `return` may be used to prematurely return a value from a function.
653
654 | fun foo(y) {
655 | x = y
656 | while x > 0 {
657 | if x < 5 {
658 | return x;
659 | }
660 | x = x - 1;
661 | }
662 | 17
663 | }
664 | fun main() {
665 | foo(10) + foo(0)
666 | }
667 = 21
668
669 Type of value returned must jibe with value of function's block.
670
671 | fun foo(x) {
672 | return "string";
673 | 17
674 | }
675 | fun main() {
676 | foo(10) + foo(0)
677 | }
678 ? type mismatch
679
680 Type of value returned must jibe with other return statements.
681
682 | fun foo(x) {
683 | if x > 0 {
684 | return "string";
685 | } else {
686 | return 17
687 | }
688 | }
689 | fun main() {
690 | foo(10) + foo(0)
691 | }
692 ? type mismatch
693
694 ### Builtins ###
695
696 The usual.
697
698 | fun main() {
699 | print("Hello, world!")
700 | }
701 = Hello, world!
702
703 Some standard functions are builtin and available as toplevels.
704
705 | fun main() {
706 | a = "hello";
707 | b = len(a);
708 | while b > 0 {
709 | print(a);
710 | b = b - 1;
711 | a = substr(a, 1, b)
712 | }
713 | }
714 = hello
715 = ello
716 = llo
717 = lo
718 = o
719
720 The `+` operator is not string concatenation. `concat` is.
721
722 | fun main() {
723 | print("hello " + "world")
724 | }
725 ? type mismatch
726
727 | fun main() {
728 | print(concat("hello ", "world"))
729 | }
730 = hello world
731
732 The builtin toplevels are functions and functions need parens.
733
734 | fun main() {
735 | print "hi"
736 | }
737 ? type mismatch
738
739 Note that the above was the motivation for requiring statements to have void
740 type; if non-void exprs could be used anywhere, that would just throw away
741 the function value `print` (b/c semicolons are optional) and return 'hi'.
742
743 ### Struct Types ###
744
745 Record types. You can define them:
746
747 | struct person { name: string; age: integer }
748 | main = fun() {}
749 =
750
751 And make them.
752
753 | struct person { name: string; age: integer }
754 | main = fun() {
755 | j = make person(name:"Jake", age:23);
756 | print("ok")
757 | }
758 = ok
759
760 And extract the fields from them.
761
762 | struct person { name: string; age: integer }
763 | main = fun() {
764 | j = make person(name:"Jake", age:23);
765 | print(j.name)
766 | if j.age > 20 {
767 | print("Older than twenty")
768 | } else {
769 | print("Underage")
770 | }
771 | }
772 = Jake
773 = Older than twenty
774
775 Structs must be defined somewhere.
776
777 | main = fun() {
778 | j = make person(name:"Jake", age:23);
779 | j
780 | }
781 ? undefined
782
783 Structs need not be defined before use.
784
785 | main = fun() {
786 | j = make person(name:"Jake", age:23);
787 | j.age
788 | }
789 | struct person { name: string; age: integer }
790 = 23
791
792 Structs may not contain structs which don't exist.
793
794 | struct person { name: string; age: foobar }
795 | main = fun() { 333 }
796 ? undefined
797
798 Types must match when making a struct.
799
800 | struct person { name: string; age: integer }
801 | main = fun() {
802 | j = make person(name:"Jake", age:"Old enough to know better");
803 | j.age
804 | }
805 ? type mismatch
806
807 | struct person { name: string; age: integer }
808 | main = fun() {
809 | j = make person(name:"Jake");
810 | j.age
811 | }
812 ? argument mismatch
813
814 | struct person { name: string }
815 | main = fun() {
816 | j = make person(name:"Jake", age:23);
817 | j.age
818 | }
819 ? argument mismatch
820
821 Order of field initialization when making a struct doesn't matter.
822
823 | struct person { name: string; age: integer }
824 | main = fun() {
825 | j = make person(age: 23, name:"Jake");
826 | j.age
827 | }
828 = 23
829
830 Structs can be tested for equality. (Since structs are immutable, it
831 doesn't matter if this is structural equality or identity.)
832
833 /| struct person { age: integer; name: string }
834 /| main = fun() {
835 /| j = make person(age: 23, name:"Jake");
836 /| k = make person(age: 23, name:"Jake");
837 /| j == k
838 /| }
839 /= True
840
841 /| struct person { name: string; age: integer }
842 /| main = fun() {
843 /| j = make person(age: 23, name:"Jake");
844 /| k = make person(name:"Jake", age: 23);
845 /| j == k
846 /| }
847 /= True
848
849 /| struct person { age: integer; name: string }
850 /| main = fun() {
851 /| j = make person(age: 23, name:"Jake");
852 /| k = make person(age: 23, name:"John");
853 /| j == k
854 /| }
855 /= False
856
857 Structs can be passed to functions.
858
859 | struct person { name: string; age: integer }
860 | fun wat(bouncer: person) { bouncer.age }
861 | main = fun() {
862 | j = make person(name:"Jake", age:23);
863 | wat(j)
864 | }
865 = 23
866
867 Structs have name equivalence, not structural.
868
869 | struct person { name: string; age: integer }
870 | struct city { name: string; population: integer }
871 | fun wat(hometown: city) { hometown }
872 | main = fun() {
873 | j = make person(name:"Jake", age:23);
874 | wat(j)
875 | }
876 ? type mismatch
877
878 Struct fields must all be unique.
879
880 | struct person { name: string; name: string }
881 | main = fun() {
882 | j = make person(name:"Jake", name:"Smith");
883 | }
884 ? defined
885
886 Values can be retrieved from structs.
887
888 | struct person { name: string; age: integer }
889 | fun age(bouncer: person) { bouncer.age }
890 | main = fun() {
891 | j = make person(name:"Jake", age:23);
892 | age(j)
893 | }
894 = 23
895
896 | struct person { name: string }
897 | fun age(bouncer: person) { bouncer.age }
898 | main = fun() {
899 | j = make person(name:"Jake");
900 | age(j)
901 | }
902 ? undefined
903
904 Different structs may have the same field name in different positions.
905
906 | struct person { name: string; age: integer }
907 | struct city { population: integer; name: string }
908 | main = fun() {
909 | j = make person(name:"Jake", age:23);
910 | w = make city(population:600000, name:"Winnipeg");
911 | print(j.name)
912 | print(w.name)
913 | }
914 = Jake
915 = Winnipeg
916
917 Can't define the same struct multiple times.
918
919 | struct person { name: string; age: integer }
920 | struct person { name: string; age: string }
921 | fun main() { 333 }
922 ? duplicate
923
924 Structs may refer to themselves.
925
926 | struct recursive {
927 | next: recursive;
928 | }
929 | fun main() { 333 }
930 = 333
931
932 | struct odd {
933 | next: even;
934 | }
935 | struct even {
936 | next: odd;
937 | }
938 | fun main() { 333 }
939 = 333
940
941 But you can't actually make one of these infinite structs.
942
943 | struct recursive {
944 | next: recursive;
945 | }
946 | fun main() { make recursive(next:make recursive(next:"nooo")) }
947 ? type mismatch
948
949 ### Union Types ###
950
951 Values of union type are created with the type promotion operator,
952 `as ...`. Type promotion has a very low precedence, and can be
953 applied to any expression.
954
955 The type after the `as` must be a union.
956
957 | fun main() {
958 | a = 20;
959 | b = 30;
960 | a + b as integer
961 | }
962 ? bad cast
963
964 The type after the `as` must be one of the types in the union.
965
966 | fun main() {
967 | a = 20;
968 | b = 30;
969 | a + b as string|void
970 | }
971 ? bad cast
972
973 The type after the `as` must be the type of the expression.
974
975 | fun main() {
976 | a = 20;
977 | b = 30;
978 | c = a + b as integer|string
979 | print("ok")
980 | }
981 = ok
982
983 Values of union type can be passed to functions.
984
985 | fun foo(a, b: integer|string) {
986 | a + 1
987 | }
988 | main = fun() {
989 | a = 0;
990 | a = foo(a, 333 as integer|string);
991 | a = foo(a, "hiya" as integer|string);
992 | a
993 | }
994 = 2
995
996 Order of types in a union doesn't matter.
997
998 | fun foo(a, b: integer|string) {
999 | a + 1
1000 | }
1001 | main = fun() {
1002 | a = 0;
1003 | a = foo(a, 333 as integer|string);
1004 | a = foo(a, "hiya" as string|integer);
1005 | a
1006 | }
1007 = 2
1008
1009 The `typecase` construct can operate on the "right" type of a union.
1010
1011 | fun foo(a, b: integer|string) {
1012 | r = a;
1013 | typecase b is integer {
1014 | r = r + b;
1015 | };
1016 | typecase b is string {
1017 | r = r + len(b);
1018 | };
1019 | r
1020 | }
1021 | main = fun() {
1022 | a = 0;
1023 | a = foo(a, 333 as integer|string);
1024 | a = foo(a, "hiya" as integer|string);
1025 | a
1026 | }
1027 = 337
1028
1029 The expression in a `typecase` must be a variable.
1030
1031 | main = fun() {
1032 | a = 333 as integer|string;
1033 | typecase 333 is integer {
1034 | print("what?")
1035 | };
1036 | }
1037 ? identifier
1038
1039 The expression in a `typecase` can be an argument.
1040
1041 | fun wat(j: integer|string) {
1042 | typecase j is integer {
1043 | print("integer")
1044 | };
1045 | }
1046 | main = fun() {
1047 | wat(444 as integer|string)
1048 | }
1049 = integer
1050
1051 The expression in a `typecase` cannot effectively be a global, as globals
1052 must be literals and there is no way (right now) to make a literal of union
1053 type.
1054
1055 Inside a `typecase` the variable cannot be updated.
1056
1057 | main = fun() {
1058 | a = 333 as integer|string;
1059 | typecase a is integer {
1060 | a = 700;
1061 | };
1062 | }
1063 ? cannot assign
1064
1065 The union can include void.
1066
1067 | main = fun() {
1068 | j = null as void|integer;
1069 | typecase j is void {
1070 | print("nothing there")
1071 | };
1072 | }
1073 = nothing there
1074
1075 ### Struct Types + Union Types ###
1076
1077 Union types may be used to make fields of a struct "nullable", so that
1078 you can in actuality create recursive, but finite, data structures.
1079
1080 | struct list {
1081 | value: string;
1082 | next: list|integer;
1083 | }
1084 | main = fun() {
1085 | l = make list(
1086 | value: "first",
1087 | next: make list(
1088 | value: "second",
1089 | next:0 as list|integer
1090 | ) as list|integer)
1091 | s = l.next
1092 | typecase s is list {
1093 | print(s.value)
1094 | }
1095 | }
1096 = second
1097
1098 You may want to use helper functions to hide this ugliness.
1099
1100 | struct list {
1101 | value: string;
1102 | next: list|void;
1103 | }
1104 | fun singleton(v: string) {
1105 | make list(value:v, next:null as list|void)
1106 | }
1107 | fun cons(v: string, l: list) {
1108 | make list(value:v, next:l as list|void)
1109 | }
1110 | fun nth(n, l: list) {
1111 | u = l as list|void;
1112 | v = u;
1113 | k = n;
1114 | while k > 1 {
1115 | typecase u is void { break; }
1116 | typecase u is list { v = u.next; }
1117 | u = v;
1118 | k = k - 1;
1119 | }
1120 | return u
1121 | }
1122 | main = fun() {
1123 | l = cons("first", singleton("second"));
1124 | g = nth(2, l);
1125 | typecase g is list { print(g.value); }
1126 | }
1127 = second
1128
1129 Structs may be empty.
1130
1131 | struct red { }
1132 | fun show(color: red) {
1133 | print("hi")
1134 | }
1135 | main = fun() {
1136 | show(make red());
1137 | }
1138 = hi
1139
1140 In combination with unions, this lets us create "typed enums".
1141
1142 | struct red { }
1143 | struct green { }
1144 | struct blue { }
1145 | fun show(color: red|green|blue) {
1146 | typecase color is red { print("red"); }
1147 | typecase color is green { print("green"); }
1148 | typecase color is blue { print("blue"); }
1149 | }
1150 | main = fun() {
1151 | show(make red() as red|green|blue);
1152 | show(make blue() as red|green|blue);
1153 | }
1154 = red
1155 = blue
0 Castile
1 =======
2
3 This is the reference distribution for Castile, an unremarkable programming
4 language.
5
6 The current version of Castile is 0.3. It is not only subject to change,
7 it is pretty much *guaranteed* to change.
8
9 Unlike most of my programming languages, there is nothing that could really
10 be described as innovative or experimental or even particularly unusual
11 about Castile. It is not a particularly comfortable programming experience,
12 often forcing the programmer to be explicit and verbose.
13
14 The reference implementation is slightly less unremarkable than the language
15 itself, if only for the fact that it compiles to four different target
16 languages: Javascript, Ruby, a hypothetical stack machine called
17 "stackmac" (a stackmac emulator ships with this distribution,) and (coming
18 soon) C.
19
20 Castile's influences might include:
21
22 * **C**: Most of Castile's syntax follows C, but it is generally more
23 permissive (semicolons are optional, types of local variables and return
24 types for functions do not have to be declared, etc.) It has a type
25 system (where `struct`s are the only types with name equivalence) which
26 can be applied statically. It has function values, but not closures.
27
28 * **Rust**: There is a union type, to which values must be explicitly
29 promoted (with `as`) and extracted (with `typecase ... is`.) This is
30 like Rust's `Enum`, which is (to quote its tutorial) "much like the
31 'tagged union' pattern in C, but with better static guarantees." Along
32 with structs, this provides something similar to algebraic data typing,
33 as seen in languages such as Haskell, Scala, etc.
34
35 * **Eightebed**: A few years back I realized that pointers that can
36 assume a null value are really a variant type, like Haskell's `Maybe`.
37 Of course, in most languages with pointers, the property of being null
38 isn't captured by the type; you can go ahead and dereference a pointer
39 in C or Java, whether it's valid or not. In Castile, this is captured
40 with a union type which includes `void`, and `typecase` generalizes
41 Eightebed's `ifvalid`.
42
43 * **Python**: The first time a local variable is assigned counts as its
44 declaration as a local.
45
46 * **Ruby**: The last expression in a function body is the return value
47 of that function; no explicit `return` is needed there. (But unlike
48 Ruby, and more like Pascal or linted C, all *other* expressions in
49 statement position within a block must have void type.)
50
51 * **Erlang** (or any other purely functional language): There are no
52 language-level pointers; sharing, if it happens at all, must be
53 orchestrated by the implementation. Global variables and function
54 arguments are not mutable, and neither are the fields of structs.
55 (But unlike Erlang, local variables *are* mutable.)
56
57 Some lines of research underneath all this are, if all we have is a relatively
58 crude language, but we make it typesafe and give it a slightly nicer type
59 system, does it suffice to make programming tolerable? Do tolerable ways of
60 managing memory without a full garbage collector present themselves? Does
61 having a simple compiler which can be target many backends provide any
62 advantages?
63
64 Also unlike most of my programming languages (with the exceptions of ILLGOL
65 and Bhuna), Castile was largely "designed by building" -- I wrote an
66 interpreter, and the language it happens to accept, I called Castile.
67 I wrote the interpreter in a very short span of time; most of it was done
68 within 24 hours of starting (but consider that I ripped off some of the
69 scanning/parsing code from ALPACA.) A few days later, I extended the
70 implementation to also allow compiling to Javascript, and a few days after
71 that, I added a Ruby backend (why not, eh?), and over the next few days,
72 the stackmac backend and emulator.
73
74 This document contains what is as close as there is to a specification of
75 the language, in the form of a Falderal test suite. The interpreter and all
76 compilers pass all the tests, but there are known shortcomings in at least
77 the compilers (no name mangling, etc.)
78
79 The `eg` directory contains a few example Castile programs, including a
80 string tokenizer.
81
82 One area where the Castile implementation is not entirely unremarkable is
83 that the typechecker is not required to be run. Unchecked Castile is
84 technically a different language from Castile; there are Castile programs
85 which would result in an error, where the Unchecked Castile program would
86 *not* (because it never executes the part of the program with a bad type.)
87 However, Unchecked Castile programs should be otherwise well-behaved;
88 any attempt to execute code which would have resulted in a type failure,
89 will result in a crash. Note, however, that this only applies to the
90 evaluator, not any of the compiler backends. Compiling Unchecked Castile
91 will simply not work (the backend will crash when it can't see any types.)
92
93 Grammar
94 -------
95
96 Program ::= {Defn [";"]}.
97 Defn ::= "fun" ident "(" [Arg {"," Arg}] ")" Body
98 | "struct" ident "{" {ident ":" TExpr [";"]} "}"
99 | ident (":" TExpr0 | "=" Literal).
100 Arg ::= ident [":" TExpr1].
101 Body ::= "{" {Stmt [";"]} "}".
102 Stmt ::= "while" Expr0 Block
103 | "typecase" ident "is" TExpr0 Block
104 | "do" Expr0
105 | "return" Expr0
106 | If
107 | Expr0.
108 Block ::= "{" {Stmt [";"]} "}".
109 If ::= "if" Expr0 Block ["else" (Block | If)].
110 Expr0 ::= Expr1 {("and" | "or") Expr1} ["as" TExpr0].
111 Expr1 ::= Expr2 {(">" | ">=" | "<" | "<=" | "==" | "!=") Expr2}.
112 Expr2 ::= Expr3 {("+" | "-") Expr3}.
113 Expr3 ::= Expr4 {("*" | "/") Expr4}.
114 Expr4 ::= Expr5 {"(" [Expr0 {"," Expr0}] ")" | "." ident}.
115 Expr5 ::= "make" ident "(" [ident ":" Expr0 {"," ident ":" Expr0}] ")"
116 | "(" Expr0 ")"
117 | "not" Expr1
118 | Literal
119 | ident ["=" Expr0].
120 Literal ::= strlit
121 | ["-"] intlit
122 | "true" | "false" | "null"
123 | "fun" "(" [Arg {"," Arg}] ")" Body.
124 TExpr0 ::= TExpr1 [{"," TExpr1} "->" TExpr1].
125 TExpr1 ::= TExpr2 {"|" TExpr2}.
126 TExpr2 ::= "integer"
127 | "boolean"
128 | "void"
129 | "(" TExpr0 ")"
130 | ident.
131
132 Examples
133 --------
134
135 -> Tests for functionality "Run Castile Program"
136
137 ### Rudiments ###
138
139 Minimal correct program.
140
141 | fun main() {}
142 =
143
144 A program may evaluate to a value.
145
146 | fun main() { 160 }
147 = 160
148
149 The function named `main` is the one that is evaluated when the
150 program is run.
151
152 | fun foobar(a, b, c) { 100 }
153 | fun main() { 120 }
154 | fun f() { 140 }
155 = 120
156
157 `main` should have no formal arguments.
158
159 | fun main(a, b, c) {
160 | 120
161 | }
162 ? type mismatch
163
164 But other functions may.
165
166 | fun foobar(a, b) { b }
167 | fun main() { foobar(100, 200) }
168 = 200
169
170 Defined function names must be unique.
171
172 | fun dup() { 1 }
173 | fun dup() { 2 }
174 ? duplicate
175
176 Formal argument names must be unique.
177
178 | fun f(g, g) {}
179 | fun main() { 1 }
180 ? defined
181
182 Functions must be defined before they are referenced.
183
184 | fun main() { f(7) }
185 | fun f(g) { g }
186 ? undefined
187
188 Either that, or forward-declared.
189
190 | f : integer -> integer
191 | fun main() { f(7) }
192 | fun f(g) { g * 2 }
193 = 14
194
195 If forward-declared, types must match.
196
197 | f : integer -> string
198 | fun main() { f(7) }
199 | fun f(g) { g * 2 }
200 ? type mismatch
201
202 Arguments must match...
203
204 | fun f(g, h) { g * 2 + h * 2 }
205 | fun main() { f(7) }
206 ? argument mismatch
207
208 | fun f(g, h) { g * 2 + h * 2 }
209 | fun main() { f(7,8,9) }
210 ? argument mismatch
211
212 ### Statements ###
213
214 Statements are commands that have the type void and are executed for their
215 side-effects. So, in general, statements may not be expressions. The
216 exception is that the last statement in a block may be an expression; the
217 result of that expression is the value of the block.
218
219 | fun main() {
220 | 20 * 8
221 | }
222 = 160
223
224 | fun main() {
225 | 20 + 3 * 8;
226 | 20 * 8
227 | }
228 ? type mismatch
229
230 An `if`/`else` lets you make decisions.
231
232 | fun main() {
233 | a = 0;
234 | if 3 > 2 {
235 | a = 70
236 | } else {
237 | a = 80
238 | }
239 | a
240 | }
241 = 70
242
243 An `if` need not have an `else`.
244
245 | fun main() {
246 | a = 60
247 | if 3 > 2 {
248 | a = 70
249 | }
250 | a
251 | }
252 = 70
253
254 `if` always typechecks to void, one branch or two.
255
256 | fun main() {
257 | a = 60
258 | if 3 > 2 {
259 | a = 70
260 | }
261 | }
262 =
263
264 | fun main() {
265 | a = 60
266 | if 3 > 2 {
267 | a = 70
268 | } else {
269 | a = 90
270 | }
271 | }
272 =
273
274 If an `if` does have an `else`, the part after `else` must be either a block
275 (already shown) or another `if`.
276
277 | fun main() {
278 | if 2 > 3 {
279 | return 60
280 | } else if 4 > 5 {
281 | return 0
282 | } else {
283 | return 1
284 | }
285 | }
286 = 1
287
288 No dangling else problem.
289
290 | fun main() {
291 | if 2 > 3 {
292 | return 60
293 | } else if 4 < 5 {
294 | return 99
295 | } else {
296 | return 1
297 | }
298 | }
299 = 99
300
301 `while` loops.
302
303 | fun main() {
304 | a = 0 b = 4
305 | while b > 0 {
306 | a = a + b
307 | b = b - 1
308 | }
309 | a
310 | }
311 = 10
312
313 A `while` itself has void type.
314
315 | fun main() {
316 | a = 0; b = 4;
317 | while b > 0 {
318 | a = a + b;
319 | b = b - 1;
320 | }
321 | }
322 =
323
324 `break` may be used to prematurely exit a `while`.
325
326 | fun main() {
327 | a = 0; b = 0;
328 | while true {
329 | a = a + b;
330 | b = b + 1;
331 | if (b > 4) { break; }
332 | }
333 | a
334 | }
335 = 10
336
337 ### Expressions ###
338
339 Precedence.
340
341 | fun main() {
342 | 2 + 3 * 4 /* not 20 */
343 | }
344 = 14
345
346 Unary negation.
347
348 | fun main() {
349 | -3
350 | }
351 = -3
352
353 | fun main() {
354 | 2+-5
355 | }
356 = -3
357
358 Minus sign must be right in front of a number.
359
360 | fun main() {
361 | -(4)
362 | }
363 ? Expected
364
365 Unary not.
366
367 | fun main() {
368 | not (4 > 3)
369 | }
370 = False
371
372 Precedence of unary not.
373
374 | fun main() {
375 | not true or true
376 | }
377 = True
378
379 | fun main() {
380 | not 3 > 4
381 | }
382 = True
383
384 ### Local Variables ###
385
386 Local variables.
387
388 | fun main() {
389 | a = 6;
390 | b = 7;
391 | a + b
392 | }
393 = 13
394
395 Local variables can be assigned functions.
396
397 | fun ancillary(x) { x * 2 }
398 | fun main() {
399 | a = ancillary;
400 | a(7)
401 | }
402 = 14
403
404 Local variables can be assigned.
405
406 | fun main() {
407 | a = 6;
408 | a = a + 12;
409 | a
410 | }
411 = 18
412
413 | fun main() {
414 | a = 6;
415 | z = 99;
416 | a
417 | }
418 = 6
419
420 | fun main() {
421 | z = 6;
422 | a
423 | }
424 ? undefined
425
426 Local variables cannot occur in expressions until they are defined by an
427 initial assignment.
428
429 | fun main() {
430 | z = a * 10;
431 | a = 10;
432 | z
433 | }
434 ? undefined
435
436 A local variables may not be defined inside an `if` or `while` or `typecase`
437 block, as it might not be executed.
438
439 | fun main() {
440 | if (4 > 5) {
441 | a = 10;
442 | } else {
443 | b = 11;
444 | }
445 | b
446 | }
447 ? within control
448
449 | fun main() {
450 | b = false;
451 | while b {
452 | a = 10;
453 | }
454 | a
455 | }
456 ? within control
457
458 | fun main() {
459 | a = 55 as integer|string;
460 | typecase a is string {
461 | b = 7
462 | }
463 | a
464 | }
465 ? within control
466
467 Assignment, though it syntactically may occur in expressions, has a type of
468 void, so it can only really happen at the statement level.
469
470 | fun main() {
471 | a = 0; b = 0;
472 | a = b = 9;
473 | }
474 ? type mismatch
475
476 Variables in upper scopes may be modified.
477
478 | fun main() {
479 | a = 0
480 | if 3 > 2 {
481 | a = 4;
482 | }
483 | a
484 | }
485 = 4
486
487 ### Non-local Values ###
488
489 Literals may appear at the toplevel. Semicolons are optional at toplevel.
490
491 | factor = 5;
492 | fun main() {
493 | 6 * factor
494 | }
495 = 30
496
497 Toplevel literals may not be updated. (And thus
498
499 | factor = 5
500 | fun main() {
501 | factor = 7
502 | }
503 ? shadows
504
505 Toplevel literals may be function literals (the syntax we've been using is just sugar.)
506
507 | main = fun() {
508 | 7
509 | }
510 = 7
511
512 Truth and falsehood are builtin toplevels.
513
514 | fun main() {
515 | true or false
516 | }
517 = True
518
519 | fun main() {
520 | false and true
521 | }
522 = False
523
524 So is `null`, which is the single value of `void` type.
525
526 | fun wat(x: void) { 3 }
527 | fun main() {
528 | wat(null)
529 | }
530 = 3
531
532 ### More on Functions ###
533
534 Function arguments may not be updated.
535
536 | fun foo(x) {
537 | x = x + 14;
538 | x
539 | }
540 | fun main() {
541 | foo(7)
542 | }
543 ? shadows
544
545 Factorial can be computed.
546
547 | factorial : integer -> integer
548 | fun factorial(a) {
549 | if a == 0 {
550 | return 1
551 | } else {
552 | return a * factorial(a - 1)
553 | }
554 | }
555 | fun main() {
556 | factorial(6)
557 | }
558 = 720
559
560 Literal functions.
561
562 | fun main() {
563 | inc = fun(x) { x + 1 };
564 | inc(7)
565 | }
566 = 8
567
568 | fun main() {
569 | fun(x){ x + 1 }(9)
570 | }
571 = 10
572
573 | fun main() {
574 | a = 99;
575 | a = fun(x){ x + 1 }(9);
576 | a
577 | }
578 = 10
579
580 Literal functions can have local variables, loops, etc.
581
582 | fun main() {
583 | z = 99;
584 | z = fun(x) {
585 | a = x; b = x;
586 | while a > 0 {
587 | b = b + a; a = a - 1;
588 | }
589 | return b
590 | }(9);
591 | z
592 | }
593 = 54
594
595 Literal functions can define other literal functions...
596
597 | fun main() {
598 | fun(x){ fun(y){ fun(z){ z + 1 } } }(4)(4)(10)
599 | }
600 = 11
601
602 Literal functions can access globals.
603
604 | oid = 19
605 | fun main() {
606 | fun(x){ x + oid }(11);
607 | }
608 = 30
609
610 Literal functions cannot access variables declared in enclosing scopes.
611
612 | fun main() {
613 | oid = 19;
614 | fun(x){ x + oid }(11);
615 | }
616 ? undefined
617
618 Literal functions cannot access arguments declared in enclosing scopes.
619
620 | fun main() {
621 | fun(x){ fun(y){ fun(z){ y + 1 } } }(4)(4)(10)
622 | }
623 ? undefined
624
625 Functions can be passed to functions and returned from functions.
626
627 | fun doubble(x) { x * 2 }
628 | fun triple(x) { x * 3 }
629 | fun apply_and_add_one(f: (integer -> integer), x) { f(x) + 1 }
630 | fun sellect(a) { if a > 10 { return doubble } else { return triple } }
631 | fun main() {
632 | t = sellect(5);
633 | d = sellect(15);
634 | p = t(10);
635 | apply_and_add_one(d, p)
636 | }
637 = 61
638
639 To overcome the syntactic ambiguity with commas, function types
640 in function definitions must be in parens.
641
642 | fun add(x, y) { x + y }
643 | fun mul(x, y) { x * y }
644 | fun do_it(f: (integer, integer -> integer), g) {
645 | f(3, g)
646 | }
647 | fun main() {
648 | do_it(mul, 4) - do_it(add, 4)
649 | }
650 = 5
651
652 `return` may be used to prematurely return a value from a function.
653
654 | fun foo(y) {
655 | x = y
656 | while x > 0 {
657 | if x < 5 {
658 | return x;
659 | }
660 | x = x - 1;
661 | }
662 | 17
663 | }
664 | fun main() {
665 | foo(10) + foo(0)
666 | }
667 = 21
668
669 Type of value returned must jibe with value of function's block.
670
671 | fun foo(x) {
672 | return "string";
673 | 17
674 | }
675 | fun main() {
676 | foo(10) + foo(0)
677 | }
678 ? type mismatch
679
680 Type of value returned must jibe with other return statements.
681
682 | fun foo(x) {
683 | if x > 0 {
684 | return "string";
685 | } else {
686 | return 17
687 | }
688 | }
689 | fun main() {
690 | foo(10) + foo(0)
691 | }
692 ? type mismatch
693
694 ### Builtins ###
695
696 The usual.
697
698 | fun main() {
699 | print("Hello, world!")
700 | }
701 = Hello, world!
702
703 Some standard functions are builtin and available as toplevels.
704
705 | fun main() {
706 | a = "hello";
707 | b = len(a);
708 | while b > 0 {
709 | print(a);
710 | b = b - 1;
711 | a = substr(a, 1, b)
712 | }
713 | }
714 = hello
715 = ello
716 = llo
717 = lo
718 = o
719
720 The `+` operator is not string concatenation. `concat` is.
721
722 | fun main() {
723 | print("hello " + "world")
724 | }
725 ? type mismatch
726
727 | fun main() {
728 | print(concat("hello ", "world"))
729 | }
730 = hello world
731
732 The builtin toplevels are functions and functions need parens.
733
734 | fun main() {
735 | print "hi"
736 | }
737 ? type mismatch
738
739 Note that the above was the motivation for requiring statements to have void
740 type; if non-void exprs could be used anywhere, that would just throw away
741 the function value `print` (b/c semicolons are optional) and return 'hi'.
742
743 ### Struct Types ###
744
745 Record types. You can define them:
746
747 | struct person { name: string; age: integer }
748 | main = fun() {}
749 =
750
751 And make them.
752
753 | struct person { name: string; age: integer }
754 | main = fun() {
755 | j = make person(name:"Jake", age:23);
756 | print("ok")
757 | }
758 = ok
759
760 And extract the fields from them.
761
762 | struct person { name: string; age: integer }
763 | main = fun() {
764 | j = make person(name:"Jake", age:23);
765 | print(j.name)
766 | if j.age > 20 {
767 | print("Older than twenty")
768 | } else {
769 | print("Underage")
770 | }
771 | }
772 = Jake
773 = Older than twenty
774
775 Structs must be defined somewhere.
776
777 | main = fun() {
778 | j = make person(name:"Jake", age:23);
779 | j
780 | }
781 ? undefined
782
783 Structs need not be defined before use.
784
785 | main = fun() {
786 | j = make person(name:"Jake", age:23);
787 | j.age
788 | }
789 | struct person { name: string; age: integer }
790 = 23
791
792 Structs may not contain structs which don't exist.
793
794 | struct person { name: string; age: foobar }
795 | main = fun() { 333 }
796 ? undefined
797
798 Types must match when making a struct.
799
800 | struct person { name: string; age: integer }
801 | main = fun() {
802 | j = make person(name:"Jake", age:"Old enough to know better");
803 | j.age
804 | }
805 ? type mismatch
806
807 | struct person { name: string; age: integer }
808 | main = fun() {
809 | j = make person(name:"Jake");
810 | j.age
811 | }
812 ? argument mismatch
813
814 | struct person { name: string }
815 | main = fun() {
816 | j = make person(name:"Jake", age:23);
817 | j.age
818 | }
819 ? argument mismatch
820
821 Order of field initialization when making a struct doesn't matter.
822
823 | struct person { name: string; age: integer }
824 | main = fun() {
825 | j = make person(age: 23, name:"Jake");
826 | j.age
827 | }
828 = 23
829
830 Structs can be tested for equality. (Since structs are immutable, it
831 doesn't matter if this is structural equality or identity.)
832
833 /| struct person { age: integer; name: string }
834 /| main = fun() {
835 /| j = make person(age: 23, name:"Jake");
836 /| k = make person(age: 23, name:"Jake");
837 /| j == k
838 /| }
839 /= True
840
841 /| struct person { name: string; age: integer }
842 /| main = fun() {
843 /| j = make person(age: 23, name:"Jake");
844 /| k = make person(name:"Jake", age: 23);
845 /| j == k
846 /| }
847 /= True
848
849 /| struct person { age: integer; name: string }
850 /| main = fun() {
851 /| j = make person(age: 23, name:"Jake");
852 /| k = make person(age: 23, name:"John");
853 /| j == k
854 /| }
855 /= False
856
857 Structs can be passed to functions.
858
859 | struct person { name: string; age: integer }
860 | fun wat(bouncer: person) { bouncer.age }
861 | main = fun() {
862 | j = make person(name:"Jake", age:23);
863 | wat(j)
864 | }
865 = 23
866
867 Structs have name equivalence, not structural.
868
869 | struct person { name: string; age: integer }
870 | struct city { name: string; population: integer }
871 | fun wat(hometown: city) { hometown }
872 | main = fun() {
873 | j = make person(name:"Jake", age:23);
874 | wat(j)
875 | }
876 ? type mismatch
877
878 Struct fields must all be unique.
879
880 | struct person { name: string; name: string }
881 | main = fun() {
882 | j = make person(name:"Jake", name:"Smith");
883 | }
884 ? defined
885
886 Values can be retrieved from structs.
887
888 | struct person { name: string; age: integer }
889 | fun age(bouncer: person) { bouncer.age }
890 | main = fun() {
891 | j = make person(name:"Jake", age:23);
892 | age(j)
893 | }
894 = 23
895
896 | struct person { name: string }
897 | fun age(bouncer: person) { bouncer.age }
898 | main = fun() {
899 | j = make person(name:"Jake");
900 | age(j)
901 | }
902 ? undefined
903
904 Different structs may have the same field name in different positions.
905
906 | struct person { name: string; age: integer }
907 | struct city { population: integer; name: string }
908 | main = fun() {
909 | j = make person(name:"Jake", age:23);
910 | w = make city(population:600000, name:"Winnipeg");
911 | print(j.name)
912 | print(w.name)
913 | }
914 = Jake
915 = Winnipeg
916
917 Can't define the same struct multiple times.
918
919 | struct person { name: string; age: integer }
920 | struct person { name: string; age: string }
921 | fun main() { 333 }
922 ? duplicate
923
924 Structs may refer to themselves.
925
926 | struct recursive {
927 | next: recursive;
928 | }
929 | fun main() { 333 }
930 = 333
931
932 | struct odd {
933 | next: even;
934 | }
935 | struct even {
936 | next: odd;
937 | }
938 | fun main() { 333 }
939 = 333
940
941 But you can't actually make one of these infinite structs.
942
943 | struct recursive {
944 | next: recursive;
945 | }
946 | fun main() { make recursive(next:make recursive(next:"nooo")) }
947 ? type mismatch
948
949 ### Union Types ###
950
951 Values of union type are created with the type promotion operator,
952 `as ...`. Type promotion has a very low precedence, and can be
953 applied to any expression.
954
955 The type after the `as` must be a union.
956
957 | fun main() {
958 | a = 20;
959 | b = 30;
960 | a + b as integer
961 | }
962 ? bad cast
963
964 The type after the `as` must be one of the types in the union.
965
966 | fun main() {
967 | a = 20;
968 | b = 30;
969 | a + b as string|void
970 | }
971 ? bad cast
972
973 The type after the `as` must be the type of the expression.
974
975 | fun main() {
976 | a = 20;
977 | b = 30;
978 | c = a + b as integer|string
979 | print("ok")
980 | }
981 = ok
982
983 Values of union type can be passed to functions.
984
985 | fun foo(a, b: integer|string) {
986 | a + 1
987 | }
988 | main = fun() {
989 | a = 0;
990 | a = foo(a, 333 as integer|string);
991 | a = foo(a, "hiya" as integer|string);
992 | a
993 | }
994 = 2
995
996 Order of types in a union doesn't matter.
997
998 | fun foo(a, b: integer|string) {
999 | a + 1
1000 | }
1001 | main = fun() {
1002 | a = 0;
1003 | a = foo(a, 333 as integer|string);
1004 | a = foo(a, "hiya" as string|integer);
1005 | a
1006 | }
1007 = 2
1008
1009 The `typecase` construct can operate on the "right" type of a union.
1010
1011 | fun foo(a, b: integer|string) {
1012 | r = a;
1013 | typecase b is integer {
1014 | r = r + b;
1015 | };
1016 | typecase b is string {
1017 | r = r + len(b);
1018 | };
1019 | r
1020 | }
1021 | main = fun() {
1022 | a = 0;
1023 | a = foo(a, 333 as integer|string);
1024 | a = foo(a, "hiya" as integer|string);
1025 | a
1026 | }
1027 = 337
1028
1029 The expression in a `typecase` must be a variable.
1030
1031 | main = fun() {
1032 | a = 333 as integer|string;
1033 | typecase 333 is integer {
1034 | print("what?")
1035 | };
1036 | }
1037 ? identifier
1038
1039 The expression in a `typecase` can be an argument.
1040
1041 | fun wat(j: integer|string) {
1042 | typecase j is integer {
1043 | print("integer")
1044 | };
1045 | }
1046 | main = fun() {
1047 | wat(444 as integer|string)
1048 | }
1049 = integer
1050
1051 The expression in a `typecase` cannot effectively be a global, as globals
1052 must be literals and there is no way (right now) to make a literal of union
1053 type.
1054
1055 Inside a `typecase` the variable cannot be updated.
1056
1057 | main = fun() {
1058 | a = 333 as integer|string;
1059 | typecase a is integer {
1060 | a = 700;
1061 | };
1062 | }
1063 ? cannot assign
1064
1065 The union can include void.
1066
1067 | main = fun() {
1068 | j = null as void|integer;
1069 | typecase j is void {
1070 | print("nothing there")
1071 | };
1072 | }
1073 = nothing there
1074
1075 ### Struct Types + Union Types ###
1076
1077 Union types may be used to make fields of a struct "nullable", so that
1078 you can in actuality create recursive, but finite, data structures.
1079
1080 | struct list {
1081 | value: string;
1082 | next: list|integer;
1083 | }
1084 | main = fun() {
1085 | l = make list(
1086 | value: "first",
1087 | next: make list(
1088 | value: "second",
1089 | next:0 as list|integer
1090 | ) as list|integer)
1091 | s = l.next
1092 | typecase s is list {
1093 | print(s.value)
1094 | }
1095 | }
1096 = second
1097
1098 You may want to use helper functions to hide this ugliness.
1099
1100 | struct list {
1101 | value: string;
1102 | next: list|void;
1103 | }
1104 | fun singleton(v: string) {
1105 | make list(value:v, next:null as list|void)
1106 | }
1107 | fun cons(v: string, l: list) {
1108 | make list(value:v, next:l as list|void)
1109 | }
1110 | fun nth(n, l: list) {
1111 | u = l as list|void;
1112 | v = u;
1113 | k = n;
1114 | while k > 1 {
1115 | typecase u is void { break; }
1116 | typecase u is list { v = u.next; }
1117 | u = v;
1118 | k = k - 1;
1119 | }
1120 | return u
1121 | }
1122 | main = fun() {
1123 | l = cons("first", singleton("second"));
1124 | g = nth(2, l);
1125 | typecase g is list { print(g.value); }
1126 | }
1127 = second
1128
1129 Structs may be empty.
1130
1131 | struct red { }
1132 | fun show(color: red) {
1133 | print("hi")
1134 | }
1135 | main = fun() {
1136 | show(make red());
1137 | }
1138 = hi
1139
1140 In combination with unions, this lets us create "typed enums".
1141
1142 | struct red { }
1143 | struct green { }
1144 | struct blue { }
1145 | fun show(color: red|green|blue) {
1146 | typecase color is red { print("red"); }
1147 | typecase color is green { print("green"); }
1148 | typecase color is blue { print("blue"); }
1149 | }
1150 | main = fun() {
1151 | show(make red() as red|green|blue);
1152 | show(make blue() as red|green|blue);
1153 | }
1154 = red
1155 = blue
+0
-127
TODO.markdown less more
0 TODO
1 ----
2
3 Don't output final value. Command-line arguments passed to `main`. (`sysmain`?)
4
5 Name mangling for compilers (prepend with `_` most likely.)
6
7 Tests for unions of unions.
8
9 Test for equality of union values.
10
11 Tests for multiple occurrences of same type in a union.
12
13 ### Implementation ###
14
15 Struct equality in Javascript, stackmac backends.
16
17 Figure out a way to do `input`, `read`, and `write` with node.js backend.
18
19 Implement `int`, `str`, `chr`, `ord` for Ruby, Javascript, stackmac.
20
21 TaggedValue -> just a tuple.
22
23 stackmac: store tagged values as two values on the stack.
24 and void types in unions of (void, X) should only be one value.
25 (structs are still boxed though)
26
27 AST nodes should have source line numbers, it would be really nice.
28
29 C backend. Other backends (Python? Java? CIL? Scheme?)
30
31 ### Design ###
32
33 Convenience:
34
35 * Should we have automatic promotion (value tagging?)
36 Since it causes an operation, I think it should be explicit, but the
37 explicit syntax could be more lightweight.
38 * Lua-esque `:` operator: `a:b(c)` -> `a.b(a, c)`
39
40 Type promotion with higher precedence? So that it can be used at toplevel.
41
42 Should there be closures as well as function values?
43
44 Should there be an expression form of `if`?
45
46 Should there be an expression form of `=`? (`:=`?)
47
48 Block scope in blocks; if the first assignment to a variable occurs inside
49 a block, its scope is that block. It cannot be seen after that block closes.
50 (Shadowing is not possible; if the first assignment was before, that outer
51 variable is what gets updated.)
52
53 ### Object Orientation ###
54
55 Castile will likely not have any "real" object-oriented features. Probably,
56 only nominal subtyping, with a Lua-like method call operator. Meaning
57 something like this:
58
59 struct counter {
60 value: integer;
61 bump: counter, integer -> counter
62 }
63
64 struct named_counter < counter {
65 name: string;
66 }
67
68 forward bump: counter, integer -> counter
69
70 bump = fun(c: counter, v: integer) {
71 return make counter(value=c.value + v, bump=bump;
72 }
73
74 new_counter = fun(init) {
75 return make counter(value=init, bump=bump);
76 }
77
78 new_named_counter = fun(init) {
79 g = make named_counter(value=init, bump=bump, name="Jeremy");
80 return g:bump(5);
81 }
82
83 `bump` can operate on a `named_counter` type, but only because it
84 is declared to be a subtype of `counter`, so its fields are known to be
85 a superset of `counter`'s fields. The `bump` method itself is not defined
86 within either structure, and the `:` syntax is used to pass the object as
87 the first argument.
88
89 ### Memory Management ###
90
91 Some notes on memory management.
92
93 Some properties of Castile have some potential for simplifying memory
94 management. For example, structures are immutable. I do not expect to
95 be able to get away with avoiding garbage collection completely, but I
96 suspect there to be ways to tie reference counting to execution in a
97 "nice" fashion. Although this will likely still be difficult.
98
99 I'd like Castile's memory allocation system to be based on the following
100 idea:
101
102 Every time you call a function of type `A -> B`, you have created an
103 object of type `B`.
104
105 If, in some function, your argument `a` to the call of the function of type
106 `A -> B` was your last use of `a`, you have destroyed an object of type `a`.
107 (Unless objects of type `B` contain objects of type `A`.)
108
109 If you call a function of type `A -> A`, you haven't created or destroyed
110 any objects. (Even if you discard the argument and allocate a new `A` to
111 be returned, you can just "allocate" the new `A` over the old one.)
112
113 I don't know enough about linear types to know how well that can be
114 expressed with them.
115
116 Suppose you are `f`, a function `A -> B`, and `B` contains `A`. You
117 actually ignore the `A` given to you, and call another function `g` to
118 allocate a new `B` containing the new `A`.
119
120 It would seem that the caller of `f` must pass "this is the last use of
121 the argument" to you, and that you must pass "it's ok to overwrite
122 my `A` with your new `A`" to `g`. This is complicated enough communication
123 that it would probably be best modelled as a global dictionary of some sort,
124 rather than each function telling each other function what's up. For
125 example, at the point of the last use of the `A` just before calling `f`, the
126 address of that `A` could be placed on a list of "newly available `A`s."
0 TODO
1 ----
2
3 Don't output final value. Command-line arguments passed to `main`. (`sysmain`?)
4
5 Name mangling for compilers (prepend with `_` most likely.)
6
7 Tests for unions of unions.
8
9 Test for equality of union values.
10
11 Tests for multiple occurrences of same type in a union.
12
13 ### Implementation ###
14
15 Struct equality in Javascript, stackmac backends.
16
17 Figure out a way to do `input`, `read`, and `write` with node.js backend.
18
19 Implement `int`, `str`, `chr`, `ord` for Ruby, Javascript, stackmac.
20
21 TaggedValue -> just a tuple.
22
23 stackmac: store tagged values as two values on the stack.
24 and void types in unions of (void, X) should only be one value.
25 (structs are still boxed though)
26
27 AST nodes should have source line numbers, it would be really nice.
28
29 C backend. Other backends (Python? Java? CIL? Scheme?)
30
31 ### Design ###
32
33 Convenience:
34
35 * Should we have automatic promotion (value tagging?)
36 Since it causes an operation, I think it should be explicit, but the
37 explicit syntax could be more lightweight.
38 * Lua-esque `:` operator: `a:b(c)` -> `a.b(a, c)`
39
40 Type promotion with higher precedence? So that it can be used at toplevel.
41
42 Should there be closures as well as function values?
43
44 Should there be an expression form of `if`?
45
46 Should there be an expression form of `=`? (`:=`?)
47
48 Block scope in blocks; if the first assignment to a variable occurs inside
49 a block, its scope is that block. It cannot be seen after that block closes.
50 (Shadowing is not possible; if the first assignment was before, that outer
51 variable is what gets updated.)
52
53 ### Object Orientation ###
54
55 Castile will likely not have any "real" object-oriented features. Probably,
56 only nominal subtyping, with a Lua-like method call operator. Meaning
57 something like this:
58
59 struct counter {
60 value: integer;
61 bump: counter, integer -> counter
62 }
63
64 struct named_counter < counter {
65 name: string;
66 }
67
68 forward bump: counter, integer -> counter
69
70 bump = fun(c: counter, v: integer) {
71 return make counter(value=c.value + v, bump=bump;
72 }
73
74 new_counter = fun(init) {
75 return make counter(value=init, bump=bump);
76 }
77
78 new_named_counter = fun(init) {
79 g = make named_counter(value=init, bump=bump, name="Jeremy");
80 return g:bump(5);
81 }
82
83 `bump` can operate on a `named_counter` type, but only because it
84 is declared to be a subtype of `counter`, so its fields are known to be
85 a superset of `counter`'s fields. The `bump` method itself is not defined
86 within either structure, and the `:` syntax is used to pass the object as
87 the first argument.
88
89 ### Memory Management ###
90
91 Some notes on memory management.
92
93 Some properties of Castile have some potential for simplifying memory
94 management. For example, structures are immutable. I do not expect to
95 be able to get away with avoiding garbage collection completely, but I
96 suspect there to be ways to tie reference counting to execution in a
97 "nice" fashion. Although this will likely still be difficult.
98
99 I'd like Castile's memory allocation system to be based on the following
100 idea:
101
102 Every time you call a function of type `A -> B`, you have created an
103 object of type `B`.
104
105 If, in some function, your argument `a` to the call of the function of type
106 `A -> B` was your last use of `a`, you have destroyed an object of type `a`.
107 (Unless objects of type `B` contain objects of type `A`.)
108
109 If you call a function of type `A -> A`, you haven't created or destroyed
110 any objects. (Even if you discard the argument and allocate a new `A` to
111 be returned, you can just "allocate" the new `A` over the old one.)
112
113 I don't know enough about linear types to know how well that can be
114 expressed with them.
115
116 Suppose you are `f`, a function `A -> B`, and `B` contains `A`. You
117 actually ignore the `A` given to you, and call another function `g` to
118 allocate a new `B` containing the new `A`.
119
120 It would seem that the caller of `f` must pass "this is the last use of
121 the argument" to you, and that you must pass "it's ok to overwrite
122 my `A` with your new `A`" to `g`. This is complicated enough communication
123 that it would probably be best modelled as a global dictionary of some sort,
124 rather than each function telling each other function what's up. For
125 example, at the point of the last use of the `A` just before calling `f`, the
126 address of that `A` could be placed on a list of "newly available `A`s."
2525 Its usage may conflict with the usage of standard output.
2626
2727 """
28 print s
28 print(s)
2929
3030
3131 def builtin_input(s):
1616 class TypeChecker(object):
1717 def __init__(self):
1818 global_context = {}
19 for (name, (value, type)) in BUILTINS.iteritems():
19 for (name, (value, type)) in BUILTINS.items():
2020 global_context[name] = type
2121 self.context = ScopedContext(global_context, level='global')
2222 self.toplevel_context = ScopedContext({}, self.context, level='toplevel')
3232 def set(self, name, type):
3333 self.context[name] = type
3434 if self.verbose:
35 print '%s: %s' % (name, type)
35 print('%s: %s' % (name, type))
3636 return type
3737
3838 def assert_eq(self, t1, t2):
55 """
66 >>> d = ScopedContext({ 'a': 2, 'b': 3 })
77 >>> e = ScopedContext({ 'c': 4 }, parent=d)
8 >>> print e['c']
8 >>> e['c']
99 4
10 >>> print e['b']
10 >>> e['b']
1111 3
12 >>> print 'a' in e
12 >>> 'a' in e
1313 True
14 >>> print 'e' in e
14 >>> 'e' in e
1515 False
1616
1717 """
5353 import doctest
5454 (fails, something) = doctest.testmod()
5555 if fails == 0:
56 print "All tests passed."
56 print("All tests passed.")
5757 sys.exit(0)
5858 else:
5959 sys.exit(1)
3535
3636
3737 class FunctionReturn(Exception):
38 pass
38 @property
39 def message(self):
40 return self.args[0]
3941
4042
4143 class WhileBreak(Exception):
171173 class Program(object):
172174 def __init__(self):
173175 self.stab = {}
174 for (name, (value, type)) in BUILTINS.iteritems():
176 for (name, (value, type)) in BUILTINS.items():
175177 if callable(value):
176178 value = Closure(self, value)
177179 self.stab[name] = value
3939 import doctest
4040 (fails, something) = doctest.testmod()
4141 if fails == 0:
42 print "All tests passed."
42 print("All tests passed.")
4343 sys.exit(0)
4444 else:
4545 sys.exit(1)
4747 p = Parser(f.read())
4848 ast = p.program()
4949 if options.show_ast:
50 print ast.pprint(0)
51 print "-----"
50 print(ast.pprint(0))
51 print("-----")
5252 if options.parse_only:
5353 sys.exit(0)
5454 if options.typecheck:
5959 x = FunctionLifter()
6060 ast = x.lift_functions(ast)
6161 if options.show_ast:
62 print ast.pprint(0)
63 print "-----"
62 print(ast.pprint(0))
63 print("-----")
6464 c = getattr(backends, options.compile_to).Compiler(sys.stdout)
6565 c.compile(ast)
6666 sys.exit(0)
6868 e.load(ast)
6969 r = e.run()
7070 if r is not None:
71 print repr(r)
71 print(repr(r))
3030 while ip < len(program):
3131 (op, arg) = program[ip]
3232 if debug:
33 print ip, op, arg, stack, callstack
33 print(ip, op, arg, stack, callstack)
3434 if op == 'push':
3535 stack.append(arg)
3636 elif op == 'pop':
163163 result = 'False'
164164 if result == -1:
165165 result = 'True'
166 print result
166 print(result)
167167
168168
169169 def main(args):
00 #!/bin/sh
11
2 echo -n '' > test_config
2 APPLIANCES="tests/appliances/castile.md"
33
4 if [ "x$1" = x ]; then
5 cat >>test_config <<EOF
6 -> Functionality "Run Castile Program" is implemented by shell command
7 -> "bin/castile %(test-body-file)"
8
9 EOF
4 if [ ! x`which python3` = x ]; then
5 APPLIANCES="$APPLIANCES tests/appliances/python3-castile.md"
106 fi
117
128 if [ ! x`which node` = x ]; then
13 cat >>test_config <<EOF
14 -> Functionality "Run Castile Program" is implemented by shell command
15 -> "bin/castile -c javascript %(test-body-file) > foo.js && node foo.js"
16
17 EOF
9 APPLIANCES="$APPLIANCES tests/appliances/castile-c-javascript.md"
1810 fi
1911
2012 if [ ! x`which ruby` = x ]; then
21 cat >>test_config <<EOF
22 -> Functionality "Run Castile Program" is implemented by shell command
23 -> "bin/castile -c ruby %(test-body-file) > foo.rb && ruby foo.rb"
24
25 EOF
13 APPLIANCES="$APPLIANCES tests/appliances/castile-c-ruby.md"
2614 fi
2715
2816 if [ -e bin/stackmac ]; then
29 cat >>test_config <<EOF
30 -> Functionality "Run Castile Program" is implemented by shell command
31 -> "bin/castile -c stackmac %(test-body-file) > foo.stack && bin/stackmac foo.stack"
32
33 EOF
17 APPLIANCES="$APPLIANCES tests/appliances/castile-c-stackmac.md"
3418 fi
3519
3620 if [ "x$1" = "xc" ]; then
37 cat >>test_config <<EOF
38 -> Functionality "Run Castile Program" is implemented by shell command
39 -> "bin/castile -c c %(test-body-file) > foo.c && gcc foo.c && ./a.out"
40
41 EOF
21 APPLIANCES="$APPLIANCES tests/appliances/castile-c-c.md"
4222 fi
4323
44 falderal -b test_config README.markdown
24 falderal $APPLIANCES README.md
4525 RESULT=$?
46 rm -f test_config foo.* a.out
26 rm -f foo.* a.out
4727 exit $RESULT
0 -> Functionality "Run Castile Program" is implemented by shell command
1 -> "bin/castile -c c %(test-body-file) > foo.c && gcc foo.c && ./a.out"
0 -> Functionality "Run Castile Program" is implemented by shell command
1 -> "bin/castile -c javascript %(test-body-file) > foo.js && node foo.js"
0 -> Functionality "Run Castile Program" is implemented by shell command
1 -> "bin/castile -c ruby %(test-body-file) > foo.rb && ruby foo.rb"
0 -> Functionality "Run Castile Program" is implemented by shell command
1 -> "bin/castile -c stackmac %(test-body-file) > foo.stack && bin/stackmac foo.stack"
0 -> Functionality "Run Castile Program" is implemented by shell command
1 -> "bin/castile %(test-body-file)"
0 -> Functionality "Run Castile Program" is implemented by shell command
1 -> "python3 bin/castile %(test-body-file)"