Notes on Decoy Modules

It is unclear how much of this belongs to the Decoy language, and how much belongs to the decoy implementation.

We'll start with some simple facts, to try to make the situation clearer.

Evaluating Modules

decoy is an implementation of Decoy. decoy can evaluate modules.

Evaluating a module means two things: creating a new environment based on the definitions in the module, and evaluating any bare expressions and displaying their results.

When you evaluate a module, it needs to have an implementation written in Decoy. (But see below.)

When you evaluate a module, it may import other modules. Or you may supply, on the command line, modules to be (implicitly) imported beforehand.

Importing a module, in this context, means evaluating it and providing any new environment it provides to the module that imported it.

Either way, when you import a module, it too needs to have an implementation. Usually this implementation is in Decoy. However, in the case of "built-in" modules like (presently) stdenv, the module implementation will be in Lua.

(( NOTE: This could be the case for other modules too, but many of the use cases for this -- for example, accessing functionality that is specific to the base language, like Lua bindings for GTK or something -- go against the spirit of Decoy, where functions are pure functions that don't have side effects. But there may still be reasons for doing this, such as efficiency. ))

Compiling Modules

As an implementation of Decoy, decoy can also compile modules.

Compiling a module means translating it to a module in another programming language.

(( NOTE: For the moment we will consider primarily JavaScript, but other languages with similar operating models, such as Lua and Scheme, would be worth considering. The operating model does have an effect on this, since the more a language's operating model differs from Decoy's, the more work the compiler will need to do to translate. For example, to accommodate JavaScript's module system, we cannot simply import all symbols from a module in a non-namespaced way. Also, if we want to do interop (which we do), we need to think about how the data structures map. Everything in Scheme is cons cells, but most things in JavaScript are not built on cons cells. ))

When you compile a module, it needs to have an implementation written in Decoy.

When you compile a module, it may import other modules. Or you may supply, on the command line, modules to be (implicitly) imported beforehand.

Importing a module, in this context, means knowing which references made by your module (the one being compiled) refer to symbols exported by the imported module.

For this to happen, the module being imported does not need to have an implementation written in anything. But there are two things to consider here.

First, when you go to run the compiled code, you will need an implementation of that module, written in the target language.

A common case for having that implementation would be to implement the module in Decoy and use decoy to compile it to the target language. But that need not be the only way. The module could be implemented directly in the target language, for example for efficiency or for it to have access to facilities specific to the target language.

Secondly, when you compile a module, it behooves you to have some information about what each imported module provides. At minimum, a list of the symbols it exports.

In the common case where you already have the module implemented in Decoy, the compiler can read that to glean this information.

However, when the module exists only in the target language, the compiler will need to get this information from somewhere else. To this end, Decoy needs to provide a facility like C's "extern" declarations, where symbols are declared but not defined.

This second point implies that there does need to be a Decoy-language file available for resolving imports when running the compiler, whether the module in the target language has an implementation in Decoy or not.
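The idea that importing, for the compiler, amounts to knowing which references resolve to which imported module can be sketched as below. This is an illustrative Python sketch only, not how decoy does it: the function name `resolve_references` and the representation of export lists as sets keyed by module name are assumptions. An export list could equally come from a Decoy source or from an "extern"-style declaration.

```python
def resolve_references(references, imported_exports):
    """Map each referenced symbol to the imported module that exports it.

    references: iterable of symbol names used by the module being compiled.
    imported_exports: dict of module name -> set of exported symbol names
        (hypothetical representation of what the compiler knows).
    Returns (resolved, unresolved). If two modules export the same symbol,
    the first one in the dict wins; that policy is an assumption.
    """
    resolved = {}
    unresolved = []
    for symbol in references:
        for module, exports in imported_exports.items():
            if symbol in exports:
                resolved[symbol] = module
                break
        else:
            # No imported module exports this symbol.
            unresolved.append(symbol)
    return resolved, unresolved


# "gtk" here stands for a module that exists only in the target language,
# so its export list would have to come from an extern declaration.
resolved, unresolved = resolve_references(
    ["map", "window", "mystery"],
    {"list": {"map", "fold"}, "gtk": {"window"}},
)
```

With these inputs, `map` resolves to `list`, `window` resolves to `gtk`, and `mystery` is left unresolved, which is the point at which a compiler would presumably report an error.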

Evaluating and Compiling

In the common case where you already have the module implemented in Decoy, you can both interpret and compile a module that imports it. You can also compile the module itself to have it available to be imported within the target language.

In the case where a module imports a module that does not have an implementation in Decoy, but does have a (built-in) implementation in Lua, the module can be interpreted, but in order to be compiled, it needs an "extern" module for its dependency.

In the case where a module imports a module that does not have any implementation in Decoy or Lua, the module cannot be interpreted. But as long as there are "extern" modules for its dependencies, it can be compiled, and as long as there are implementations of those modules in the target language, the compiled code can be executed.
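The three cases above can be summarized as predicates over what is available for a dependency. This is a sketch in Python for illustration; the `Dependency` record and its field names are assumptions, and it considers a single dependency rather than a whole import graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    """What exists for an imported module (hypothetical record)."""
    has_decoy_impl: bool = False    # implemented in Decoy
    has_builtin_impl: bool = False  # built in to decoy, i.e. written in Lua
    has_extern: bool = False        # an "extern" module declares its symbols
    has_target_impl: bool = False   # implemented in the target language


def can_interpret(dep):
    # Interpretation needs something decoy itself can load.
    return dep.has_decoy_impl or dep.has_builtin_impl


def can_compile(dep):
    # Compilation only needs to know what the dependency exports.
    return dep.has_decoy_impl or dep.has_extern


def can_run_compiled(dep):
    # Running compiled code needs an implementation in the target language,
    # either written directly or obtained by compiling the Decoy one.
    return can_compile(dep) and (dep.has_decoy_impl or dep.has_target_impl)


builtin_only = Dependency(has_builtin_impl=True)
extern_with_target = Dependency(has_extern=True, has_target_impl=True)
```

Here `builtin_only` can be interpreted but not compiled, while `extern_with_target` cannot be interpreted but can be compiled and then executed.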

Operational Description of Importing during Interpretation

If asked to interpret a module, each time decoy encounters an import-from, it first checks if it has loaded a module by that name. If it has, it adds the environment that was loaded for that module to the working environment, and does nothing else.

If it hasn't, it checks if there is a Decoy source by that name on its module search index. If there is, it evaluates it (so this happens recursively), then finally adds the imported module's environment to the working environment.

If there is no such module in the module search index, it produces an error and interpretation does not proceed.

Built-in modules such as stdenv are handled by being loaded automatically at startup, even if they are never imported. If a built-in module is imported, its environment is added to the working environment (see the paragraph on import-from above).

Also, when decoy is given modules to import on the command line, it simply executes the equivalent of (import-from module *) for each of them, before processing the actual modules on the command line.
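The procedure described in this section can be sketched as follows. This is a minimal Python sketch, not the Lua reference implementation. The class name `Interpreter`, the `search_index` mapping, and the modelling of environments as dictionaries are all assumptions made for illustration, as is the choice that a module provides only its own definitions to its importer.

```python
class ModuleNotFound(Exception):
    """Raised when a module is neither loaded nor on the search index."""


class Interpreter:
    def __init__(self, search_index, builtins=None):
        # search_index: hypothetical mapping of module name to a stand-in
        # for a Decoy source, modelled as (imports, definitions).
        self.search_index = search_index
        # Built-in modules are loaded at startup, even if never imported.
        self.loaded = dict(builtins or {})

    def import_from(self, name, working_env):
        if name not in self.loaded:
            # Not loaded yet: look for a Decoy source by that name.
            if name not in self.search_index:
                # No such module: error, interpretation does not proceed.
                raise ModuleNotFound(name)
            self.loaded[name] = self.evaluate_source(name)
        # Loaded (now or earlier): add its environment, nothing else.
        working_env.update(self.loaded[name])

    def evaluate_source(self, name):
        imports, definitions = self.search_index[name]
        scratch_env = {}
        for dependency in imports:
            # Imports inside the module are handled recursively.
            self.import_from(dependency, scratch_env)
        # Assumption: the module provides only its own definitions.
        return dict(definitions)


index = {
    "list": ([], {"map": "<map>"}),
    "app": (["list", "stdenv"], {"main": "<main>"}),
}
interp = Interpreter(index, builtins={"stdenv": {"car": "<car>"}})
env = {}
for cli_module in ["list"]:  # modules given on the command line
    interp.import_from(cli_module, env)
interp.import_from("app", env)
```

The loop over `cli_module` stands in for the implicit (import-from module *) performed for each module named on the command line.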

Operational Description of Importing during Compilation

If asked to compile a module, each time decoy encounters an import-from, it first checks if it has scanned a module by that name. If it has, it adds the context that was scanned for that module to the working context, and does nothing else.

(( NOTE: indeed this is just the same as for interpretation, except we have substituted "scanned" for "loaded" and "context" for "environment". To scan a Decoy source is to load it but, instead of fully evaluating it, just glean some information from it, information that is useful for compiling. This information is the context. ))

If it hasn't, it checks if there is a Decoy source by that name on its module search index. If there is, it scans it (and this happens recursively), then finally adds the imported module's context to the working context.

If there isn't, it checks if there is a corresponding source which has the file extension .extern.decoy.scm or something. If there is, it scans that instead. This file is not expected to have real content in it, only enough that scanning it produces something sensible (like a list of exported symbols) that can be used as context by the compiler.

If that file doesn't exist either, the compiler produces an error and compilation does not proceed.

Built-in modules such as stdenv are handled by being "scanned" (that is, having their context registered as having been scanned) automatically at startup, even if they are never imported.

Also, when decoy is given modules to import on the command line, it simply executes the equivalent of (import-from module *) for each of them, before processing the actual modules on the command line.
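The compile-time procedure, including the fallback to an extern file, can be sketched in the same style. Again this is an illustrative Python sketch and not the reference implementation. The names `Compiler`, `sources`, and `externs` are assumptions, and so is the shape of a context, modelled here as a mapping from exported symbol to the module that exports it.

```python
class ScanError(Exception):
    """Raised when neither a Decoy source nor an extern file is found."""


class Compiler:
    def __init__(self, sources, externs, builtin_contexts=None):
        # sources: module name -> (imports, exported symbols), standing in
        #   for Decoy sources on the module search index.
        # externs: module name -> exported symbols, standing in for extern
        #   files that declare symbols without defining them.
        self.sources = sources
        self.externs = externs
        # Built-in modules count as scanned from startup.
        self.scanned = dict(builtin_contexts or {})

    def import_from(self, name, working_context):
        if name not in self.scanned:
            self.scanned[name] = self.scan(name)
        # Scanned (now or earlier): add its context, nothing else.
        working_context.update(self.scanned[name])

    def scan(self, name):
        if name in self.sources:
            imports, exports = self.sources[name]
            scratch_context = {}
            for dependency in imports:
                # Imports inside the module are scanned recursively.
                self.import_from(dependency, scratch_context)
            return {symbol: name for symbol in exports}
        if name in self.externs:
            # No Decoy source: fall back to the extern declarations.
            return {symbol: name for symbol in self.externs[name]}
        # Neither exists: error, compilation does not proceed.
        raise ScanError(name)


compiler = Compiler(
    sources={"app": (["gtk", "list"], ["main"]), "list": ([], ["map", "fold"])},
    externs={"gtk": ["window"]},
    builtin_contexts={"stdenv": {"car": "stdenv"}},
)
context = {}
compiler.import_from("app", context)
```

Scanning `app` pulls in `gtk` through its extern declarations and `list` through its Decoy source, without evaluating either.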

Take Two

Now that I've written all that, I think I want to do something completely different.

Define the concept of a "Decoy module". A Decoy module has an interface. It also has one or more instances (implementations). Unlike what the above writing suggests, it cannot have zero instances.

Implementations may be in many languages. Obvious languages include:

  • Decoy
  • Lua
  • JavaScript

If a Decoy module implementation is written in Decoy, then it can be interpreted or compiled equally well.

If a Decoy module implementation is written in the same language as the Decoy implementation itself, then it can be evaluated in all cases (we leave it up to the implementation to figure out how it loads the module from the file), but it can only be compiled if the target language is the same as the Decoy implementation language.

If a Decoy module implementation is written in a language different from the language of the Decoy implementation, then it cannot be evaluated. (Barring some special tricks that we won't ask Decoy implementations to undertake!) But it can be compiled, if the target language matches the language of the Decoy module implementation.
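These three rules reduce to two small predicates. The sketch below is in Python for illustration, and the function names and the use of short strings such as "decoy", "lua" and "js" to name languages are assumptions.

```python
def impl_can_be_evaluated(impl_lang, host_lang):
    """An implementation can be evaluated if it is written in Decoy, or in
    the language the Decoy implementation itself is written in."""
    return impl_lang == "decoy" or impl_lang == host_lang


def impl_can_be_compiled(impl_lang, target_lang):
    """An implementation can be compiled if it is written in Decoy, or is
    already written in the target language."""
    return impl_lang == "decoy" or impl_lang == target_lang


# With a Decoy implementation written in Lua and JavaScript as the target:
lua_impl = (impl_can_be_evaluated("lua", "lua"), impl_can_be_compiled("lua", "js"))
js_impl = (impl_can_be_evaluated("js", "lua"), impl_can_be_compiled("js", "js"))
```

Under those assumptions a Lua implementation of a module can be evaluated but not compiled to JavaScript, and a JavaScript implementation is the reverse.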

And in any given case there can be multiple implementations of the module in the module library. So, for example,

module/
    decoy/
        map.scm
    lua/
        map.lua
    js/
        map.js

When the Decoy implementation sees (import map), what it does depends on whether it is interpreting or compiling.

If it is interpreting, it looks first for an implementation of the module map written in the language of the Decoy implementation itself. For instance, the reference implementation is written in Lua, so it would first try to load lua/map.lua. If that module implementation were not available, it would fall back to trying to load an implementation in Decoy, decoy/map.scm.

If it is compiling, it looks first for an implementation of the module map written in the target language it is compiling to. For instance, compiling to JavaScript, it loads up js/map.js and (as a baseline) includes it verbatim into the generated code. If that module implementation were not available, it would fall back to trying to load an implementation in Decoy, decoy/map.scm, and compiling that instead.
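The lookup order for (import map) in the two modes can be sketched as below. This is a Python sketch for illustration. The paths are relative to the module/ directory shown above, the `EXTENSIONS` table simply mirrors that example layout, and the function names are assumptions.

```python
# Directory name -> file extension, mirroring the example layout above.
EXTENSIONS = {"decoy": "scm", "lua": "lua", "js": "js"}


def candidate_paths(module, mode, host_lang="lua", target_lang=None):
    """Return the paths to try, in order, for the given mode."""
    if mode == "interpret":
        preferred = host_lang    # language decoy itself is written in
    elif mode == "compile":
        preferred = target_lang  # language being compiled to
    else:
        raise ValueError("mode must be 'interpret' or 'compile'")
    if preferred not in EXTENSIONS:
        raise ValueError("unknown language: %r" % (preferred,))
    # Preferred language first, then fall back to the Decoy implementation.
    order = [preferred] if preferred == "decoy" else [preferred, "decoy"]
    return ["%s/%s.%s" % (lang, module, EXTENSIONS[lang]) for lang in order]


def find_implementation(module, mode, available, **languages):
    """Return the first candidate path present in `available`, else None."""
    for path in candidate_paths(module, mode, **languages):
        if path in available:
            return path
    return None


library = {"decoy/map.scm", "lua/map.lua", "js/map.js"}
for_interpreting = find_implementation("map", "interpret", library)
for_compiling = find_implementation("map", "compile", library, target_lang="js")
```

With the full library present, interpretation picks lua/map.lua and compilation to JavaScript picks js/map.js. Removing either of those leaves decoy/map.scm as the fallback in both modes.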

The downside of that baseline is that you can only import entire modules, not individual functions from them. Being able to extract individual functions means that the Decoy compiler would need to be able to understand how functions are bundled into modules in every language it supports. This is "a big ask" as they say nowadays, so it's something of an issue.

One consideration is that module systems in different languages have different restrictions, so there can be a certain amount of "impedance mismatch". For example, ES6 JavaScript does not allow importing all symbols from a module in a non-namespaced way; unqualified names can only be imported by asking for them explicitly. So we can import all symbols, but we need to know what they are. This is a similar issue.
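To make the ES6 point concrete: if the compiler knows a module's exported names, it can emit an explicit import for all of them. The Python sketch below generates such a statement. The function name `es6_import` and the example names are assumptions, and any Decoy names that are not valid JavaScript identifiers would need mangling, which this sketch ignores.

```python
def es6_import(module_path, exported_names):
    """Build an ES6 named-import statement listing every exported name."""
    # Sorted only so the output is stable; order does not matter to ES6.
    names = ", ".join(sorted(exported_names))
    return 'import { %s } from "%s";' % (names, module_path)


statement = es6_import("./map.js", ["map", "filter"])
# statement is: import { filter, map } from "./map.js";
```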

One way to get around it is to have every module consist of only one function. Robin takes an approach similar to this. It's a bit awkward; we end up with a lot of files.

Another option is to mark up the source somehow so that the Decoy implementation can pick it apart, even if it doesn't have a deep understanding of the target language's syntax. ribbit takes an approach similar to this for its compiler targets.
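As a sketch of what such markup might allow, the Python below splits a target-language source into named chunks using marker comments. The markers `@decoy-begin` and `@decoy-end` are invented here purely for illustration; they are not ribbit's annotation format, nor anything decoy currently defines. Because the markers are found by plain text search, the same code works whatever comment syntax the target language uses.

```python
def split_marked_functions(source, begin="@decoy-begin", end="@decoy-end"):
    """Split source text into {name: code} using hypothetical markers.

    A chunk starts on a line containing `begin` followed by a name and ends
    on the next line containing `end`. Lines outside chunks are ignored.
    """
    chunks = {}
    current = None
    for line in source.splitlines():
        if begin in line:
            rest = line.split(begin, 1)[1].split()
            if not rest:
                raise ValueError("begin marker without a name: %r" % line)
            current = rest[0]
            chunks[current] = []
        elif end in line:
            current = None
        elif current is not None:
            chunks[current].append(line)
    return {name: "\n".join(lines) for name, lines in chunks.items()}


js_source = """\
// @decoy-begin map
function map(f, xs) { return xs.map(f); }
// @decoy-end
// @decoy-begin inc
function inc(x) { return x + 1; }
// @decoy-end
"""
parts = split_marked_functions(js_source)
```

A compiler could then include only the chunks for the functions a program actually imports, instead of including the whole file verbatim.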