Guide:Code

From IokeWiki
Revision as of 15:59, 7 February 2009 by Cv (Syntax)

Code

Many of the things you do in Ioke will directly manipulate code. Since the messages that make up code are really easy to get hold of, this manipulation comes easily too. Ioke takes the Lisp philosophy of "code is data" to heart. The basic unit of a piece of code is a Message. A Message has a name, a next and a prev pointer, and any number of arguments. When you manipulate a message, the argument list will contain messages too - and if the next or prev pointers are not nil, they will point to other messages. It is worth remembering that, except for the message itself, all code will be evaluated in the context of a receiver and a ground. The ground is necessary because arguments to be evaluated need to be run in some specific context, even though the current receiver is not the same as the ground.

The current types of code can be divided into three different categories. These are methods, macros and blocks. Native methods are all of the kind JavaMethod, but can have any kind of semantics - including semantics that look like macros. Most native methods do have the same semantics as regular methods, however.

Methods

A method in Ioke is executable code that is activatable. A method can take arguments of several different types. The arguments to a method will always be evaluated before the code in the method starts to execute. An Ioke method is defined using the "method" method. All Ioke methods have the kind DefaultMethod. This leaves room to define other kinds of methods, if need be. DefaultMethods could be implemented using macros, but at this point they aren't. A DefaultMethod can have a name - and will get a name the first time it is assigned to a cell.

It is really easy to define and use a simple method. The easiest case is to define a method that is empty. This method will just return nil:

m = method()
m ;; call the method

Since methods are activatable, when you name a cell that contains a method, that method will be invoked. To stop that behavior, use the "cell" method.
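
As a small sketch of the difference (the cell names m and m2 are only illustrative):

```ioke
m = method(42)

m               ; naming the cell activates the method and returns 42
cell(:m)        ; returns the method object itself, without activating it
cell(:m) kind   ; "DefaultMethod"

; copy the method into another cell without calling it
m2 = cell(:m)
m2              ; activates the copy and returns 42
```

Using cell(:m) on the right hand side is what keeps the assignment to m2 from storing the result of the call instead of the method.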

The definition of a method can take several different pieces. These are a documentation string, definitions of positional required arguments, definitions of positional optional arguments, definitions of keyword arguments, definition of a rest argument, definition of a keyword rest argument and the actual code of the method.

Let's take these one by one. First, if the first element of a call to "method" is a literal text, and there is at least one more argument in the definition, then that text will be the documentation text for the method:

;; a method that returns "foo"
m = method("foo") 

;; a method that returns nil, but
;; has the documentation text "foo"
m = method("foo", nil)

A method can take any number of required positional arguments. These will be checked when a method is called, and if not enough -- or too many -- arguments are provided, an error will be signalled.

m = method(x, x println)
m = method(x, y, z,
  x * y + z)

The first method takes one argument and prints that argument. The second method takes three arguments and returns the product of the first two added to the third.

A method can also have optional positional arguments. In that case the optional arguments must follow the required arguments. Optional arguments need to have a default value -- in fact, that is how you distinguish them from required arguments. The arity of method calls will still be checked, but using minimum and maximum values instead. The default value for an argument should be code that can be executed in the context of the running method, so a default value can refer to earlier positional arguments. A default value can also do quite complex things, if need be, although it's not really recommended.

;; takes zero or one arguments
m = method(x 42, x println)

;; takes one to three arguments
m = method(x, y 42, z 25, 
  x*y + z)

The syntax for optional arguments is to just write a space after the name of the argument, and then write the code to generate the default value after it.
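
To make the point about default values referring to earlier arguments concrete, here is a minimal sketch. The method name m and the particular defaults are made up for illustration:

```ioke
; y defaults to twice x, z defaults to 1
m = method(x, y x * 2, z 1,
  x + y + z)

m(10)        ; x = 10, y = 20, z = 1, returns 31
m(10, 5)     ; y is given, z = 1, returns 16
m(10, 5, 0)  ; nothing defaulted, returns 15
```

Calling m with zero or four arguments would fall outside the minimum and maximum arity and be rejected.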

A method can also have keyword arguments. Keyword arguments are checked, just like regular arguments, and you can't generally give keyword arguments to a method not expecting them. Nor can you give unexpected keyword arguments to a method that takes other keywords. Keyword arguments can never be required. They can have default values, and a keyword argument without an explicit default will be nil if not provided. They can be defined anywhere among the arguments -- the only reason to reorder them is that default values of other optional arguments can use previously defined keyword arguments.

A keyword argument is defined just like a regular argument, except that it ends in a colon.

m = method(foo:, bar: 42,
  foo println
  bar println
)

Just as with regular optional arguments, you supply the default value of the keyword argument after a space. The cells for the keyword arguments will be the same as their names, without the ending colon. The above code would print nil and 42 if no arguments were specified. It's important to remember that keyword arguments and positional arguments do not interact -- except for when calculating default values. When assigning values it's always possible to see what is positional and what is a keyword argument.
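
A short sketch of calling such a method. To make the results easy to inspect, this variant returns the two values in a list instead of printing them:

```ioke
m = method(foo:, bar: 42,
  [foo, bar])

m                      ; returns [nil, 42]
m(bar: 1)              ; returns [nil, 1]
m(bar: 1, foo: "a")    ; keyword order does not matter, returns ["a", 1]
```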

Ioke methods can collect positional arguments into a list. This allows methods to take a variable number of arguments. The rule is that all other positional arguments are first calculated, and the remaining positional arguments will be added to the rest argument. If no positional arguments remain, the rest argument will be empty. A rest argument is defined by preceding it with a plus sign in the argument definition. For clarity a rest argument should be defined last in the list, although it doesn't exactly matter.

m = method(+rest,
  rest println)

m = method(x, y 42, +rest,
  rest println)

The above code defines one method that only takes one rest argument. That means the method can take any number of arguments and all of them will be collected into a list. The second method takes one required argument, one optional argument and any number of extra arguments. So if four arguments are given, the rest argument will contain two.
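
The four-argument case can be sketched like this, with the method returning the rest list rather than printing it:

```ioke
m = method(x, y 42, +rest,
  rest)

m(1)            ; only x is given, returns []
m(1, 2)         ; x and y are given, returns []
m(1, 2, 3, 4)   ; x = 1, y = 2, returns [3, 4]
```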

The final type of argument is keyword rest arguments. Just like positional rest arguments, a keyword rest argument can collect all keywords given to a method, no matter what. If a keyword rest argument is used, no conditions will be signalled if an unknown keyword is given to a method. If other keywords are defined, these keywords will not show up in the keyword rest argument. The keyword rest argument is defined by preceding the name with a +: sigil, and the keyword rest argument will be a Dict instead of a list. The keys will be symbols but without the ending colon.

m = method(+:krest,
  krest println)

m = method(x, y:, +rest, +:krest,
  [x, y, rest, krest])

The above code first creates a method that can take any number of keyword arguments but nothing else. The second method takes one required positional argument, one keyword argument, rest arguments and keyword rest arguments, and returns a new list containing all the arguments given to it.
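
Here is a sketch of what the second method receives for one particular call. The keyword name color is just an example of a keyword the method does not declare:

```ioke
m = method(x, y:, +rest, +:krest,
  [x, y, rest, krest])

result = m(1, 2, 3, y: 10, color: "red")

result[0]           ; x, returns 1
result[1]           ; y, returns 10
result[2]           ; rest, returns [2, 3]
result[3][:color]   ; krest is a Dict keyed by symbol, returns "red"
```

Note that y does not show up in krest, since it is a declared keyword.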

The final argument to the "method" method should always be the code to execute. This code will be executed in the context of a receiver, which is the object the method is activated on. A method execution also happens in the context of the method activation context, where local variables are stored. This activation context contains some predefined variables that can be used. These are "self", "@", "currentMessage" and "surroundingContext". Both "self" and "@" refer to the receiver of the method call. "currentMessage" returns the message that initiated the activation of the method, and "surroundingContext" returns the object that represents the context where this method was called from. Both "self" and "@" can be used to specify that something should be assigned to the receiver, for example.

createNewCell = method(
  @foo = 42
)

The method created above will assign the value 42 to the cell "foo" on the object the method was called on.
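
A minimal sketch of that in action, assuming a fresh object made with "Origin mimic" (the name obj is only illustrative):

```ioke
obj = Origin mimic
obj createNewCell = method(
  @foo = 42
)

obj createNewCell   ; runs with obj as the receiver
obj foo             ; the cell now lives on obj, returns 42
```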

When calling a method, you specify positional arguments separated with commas. You can provide keyword arguments in any order, in any place inside the parentheses:

;; the method foo takes any kind of argument
foo
foo()
foo(1, 2, 3)
foo(blarg: 42, 2, 3, 4)
foo(quux: 42*2)

To give a keyword argument, you just write it exactly like you define keyword arguments - a name followed by a colon.

Sometimes it can be useful to be able to take a list of values and give them as positional arguments. The same can be useful to do with a dict of names. You can do that using splatting. This is done by preceding a list or a dict with an asterisk. This will result in the method getting the values inside of it as if the arguments were given directly. You can splat several things to the same invocation.

dc = {foo: 42, bar: 13}
ls = [1, 2, 3, 4]
ls2 = [42, 43, 44]

foo(*dc)
;; the same as:
foo(foo: 42, bar: 13)

foo(*ls)
;; the same as:
foo(1, 2, 3, 4)

foo(*ls2, 111, *dc, *ls)
;; the same as:
foo(42, 43, 44, 111, foo: 42, bar: 13, 1, 2, 3, 4)

If you try to splat something that can't be splatted, a condition will be signalled.
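
For a version of splatting with visible results, here is a sketch using two made-up methods, sum3 and pair:

```ioke
sum3 = method(a, b, c, a + b + c)
ls = [1, 2, 3]
sum3(*ls)     ; the same as sum3(1, 2, 3), returns 6

pair = method(left:, right:, [left, right])
dc = {left: 1, right: 2}
pair(*dc)     ; the same as pair(left: 1, right: 2), returns [1, 2]
```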

Macros

The main difference between a macro and a method in Ioke is that the arguments to a macro are not evaluated before they are sent to the macro. That means you have to use macros to send raw message chains in an invocation. In most languages, this kind of feature is generally called call-by-name. When a macro gets called, it will get access to a cell called "call" which is a mimic of the kind Call. This gives access to information about the call and makes it possible to evaluate the code sent as arguments, check how many arguments are supplied, and so on.

A macro is created using the "macro" cell on DefaultBehavior. This will return a mimic of DefaultMacro. Since macros can't define arguments, they are a bit easier to describe than methods, but the things that can be done with macros are also a bit more interesting than what can be achieved with methods. One important thing to keep in mind is that most macros can not receive splatted arguments. In most cases keyword arguments aren't available either - but they could be faked if needed. Macros should generally be used to implement control structures and things that need to manipulate code in different ways.

Just like a method, a macro gets evaluated on a specific receiver. It also gets the same kind of method activation context, but its contents are a bit different. Specifically, the context for a macro contains cells named "self", "@", "currentMessage", "surroundingContext" and "call". It's the "call" cell that is most important. It is a mimic of Call, and Call defines several important methods for manipulating the call environment. These are:

arguments
This method returns a list containing the unevaluated arguments given to this message. Any kind of manipulation can be done with these arguments.
ground
Returns the ground in which the call was initiated. This is necessary to evaluate arguments in their own environment.
message
The currently executing message. This is the same as the "currentMessage" cell in the macro activation context.
evaluatedArguments
Returns a list containing all arguments, evaluated according to the regular rules (but not handling splatting or keywords).
resendToMethod
Allows a specific message to be resent to another method, without manually copying lots of information.
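
Two tiny macros can serve as a first sketch of "arguments" and "evaluatedArguments". The names countArgs and evalAll are made up for this illustration:

```ioke
; counts the arguments without evaluating any of them
countArgs = macro(call arguments length)
countArgs(foo bar, 1 + 2)    ; returns 2, and foo is never looked up

; evaluates every argument and returns the results
evalAll = macro(call evaluatedArguments)
evalAll(1 + 2, 3 * 4)        ; returns [3, 12]
```

The first call works even though foo is not defined anywhere, since the argument stays an unevaluated message chain.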

These methods are a bit hard to understand, so I'll take some examples from the implementation of Ioke, and show how macros are used here.

Mixins Enumerable map = macro(
  "takes one or two arguments. if one argument is given,
it will be evaluated as a message chain on each element
in the enumerable, and then the result will be collected
in a new List. if two arguments are given, the first one
should be an unevaluated argument name, which will be
bound inside the scope of executing the second piece of
code. it's important to notice that the one argument
form will establish no context, while the two argument form
establishes a new lexical closure.",
  
  len = call arguments length
  result = list()
  if(len == 1,
    code = call arguments first
    self each(n, result << code evaluateOn(call ground, cell(:n))),

    code = LexicalBlock createFrom(call arguments, call ground)
    self each(n, result << code call(cell(:n))))
  result)

The code above implements map, one of the methods from Enumerable. The map method allows one collection to be mapped in a predefined way into something else. It can take either one or two arguments. If one argument is given, that is a message chain to apply, and then collect the results. If two arguments are given, the first is the argument name to use, and the second is the code to execute for each entry.

The first step is to figure out how many arguments have been given. This is done by checking the length of the "call arguments" cell. If we have a length of one, we know that the first argument is a piece of code to apply, so we assign that argument to a cell called "code". Now, "code" will be a mimic of Message, and Message has a method called "evaluateOn" that can be used to fully evaluate a message chain. And that's exactly what we do for each element in the collection we are in. The result of evaluateOn is added to the result list. We use "call ground" to get the correct ground for the code to be evaluated in.

If we get two arguments, it's possible to take a shortcut and generate a lexical block from those arguments, and then use that. So we call "LexicalBlock createFrom" and send in the arguments and the ground, and then call that piece of code once for each element in the collection.

It is a bit tricky to figure out how macros work. I recommend looking at the implementations of some of the core Ioke methods/macros, since these use much of the functionality.

Blocks

A lexical block allows the execution of a piece of code in the lexical context of some other code, instead of in a dynamic object scope. A lexical block does not have a receiver. Instead, it just establishes a new lexical context, and executes the code in that. The exact effect that has on assignments has been described earlier.

A lexical block can be created using either the "fn" or the "fnx" methods of DefaultBehavior. The main difference between the two is that a block created with "fnx" will be activatable, while something created with "fn" will not. Lexical blocks handle arguments exactly the same way as methods, so a lexical block can take optional arguments, keyword arguments, rest arguments and so on. Both "fn" and "fnx" also take optional documentation text.

A block created with the "fn" method can be invoked using the "call" method of the kind LexicalBlock.

x = fn(z, z println)
x call(42)
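
Since a block runs in the lexical context where it was created, it can also use cells defined around it. A minimal sketch (base and addBase are illustrative names):

```ioke
base = 10
addBase = fn(n, n + base)

addBase call(5)    ; base is found in the surrounding scope, returns 15
```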

If a block created with the "fn" method takes one or more explicit parameters it can also be activated like a regular method. The reason for this is shown in the code snippet below. Here the result of invoking the block referred to by "x" is passed to "y" (which may be a regular method or even another block). If "x" were fully non-activatable, "x" would be passed to "y" as is, with the argument thrown away. In other words, the argument would be dead code. However, you can still refer to the block as "x" without an invocation happening.

x = fn(z, z + 42)
y(x(100)) ;; activates the block with argument 100 and passes the result to y
x ;; refers to the block without activating it

A block created with the "fnx" method is activatable per se and thus can be activated like a regular method. The default is to use "fn" to create inactive blocks though, since blocks are generally used to pass pieces of code around.

y = fnx(z, z println)
y(42)

A lexical block is a regular kind of object that can be assigned to any cell, just like other objects. Lexical blocks mimic LexicalBlock, and blocks don't have names. In contrast to methods and macros, no extra cells will be added to the activation context for a lexical block.

Lecros

A macro works exactly like a method, in that it always has a receiver, and that receiver is available inside the macro as 'self' and '@'. In some circumstances it can be really useful to have a macro that behaves like a lexical block instead - being lexical so it can use cells defined outside of the definition of the macro. These macros won't have access to 'self' or '@', since they don't have a receiver in that way. Where such a macro is called is only based on namespacing.

Ioke supports this kind of macro. They are all mimics of the kind LexicalMacro, and they are created using the method 'lecro'. A LexicalMacro is activatable by default, but a non-activatable lecro can be created using lecrox. The 'lecro' method takes the same arguments as 'macro', and the only real difference is the way it handles outside cells and the receiver value. A lecro also has a cell called outerScope that can be used if you need to explicitly access something in the outer name space - such as call.

Syntax

Ioke supports loads of stuff with the standard macro, but sometimes macros are a bit too low level for commonly used operations. Syntax is one of those cases: you can achieve the same goals with macros, but you don't really want to. Many features in Ioke S are implemented using syntax.

You can define syntax using the syntax method. This returns a mimic of DefaultSyntax. You can use the same kind of cells in a syntax as you can in a macro. What is different with syntax is that syntax can only return one of two things. The first is nil, and the second is a message chain. A syntax will only be executed once at every point in the message chains, because after a syntax executes the first time, it will replace itself with the result of that evaluation. If that evaluation returns nil, syntax will just remove itself from the message chain.

You can use this for many things, but one of the more useful things you can do is translate a high level declarative definition of something into a low level executable version. That is exactly how for comprehensions are implemented.

Syntactic macros are fairly advanced, and take some time to grok. They are incredibly useful though, and they are used all over the standard library to achieve all manner of interesting things. Take a look there and things should hopefully become clearer. It's also a must to read the section on message chain manipulation and quoting in this guide to make syntax macros readable.

Destructuring

A common problem with macros is that you want to take several different combinations of arguments, and do different things depending on how many you get. Say you might want to take one code argument, but also two optional arguments that should be evaluated. All of that code turns out to be highly repetitive, so Ioke contains a collection of syntax macros that make it easier to write these things. These are collectively called destructuring syntax.

Let us say we have a macro that takes either [code], [evaluatedArgument, code], or [evaluatedArgument, code, evaluatedArgument]. What should happen is totally different for each of these cases. With a regular macro the code would look something like this:

foo = macro(
  len = call arguments length
  case(len,
    1,
    code = call arguments[0]
    ; do something with the code
    ,
    2,
    arg1 = call argAt(0)
    code = call arguments[1]
    ; do something with the code and arg
    ,
    3,
    arg1 = call argAt(0)
    code = call arguments[1]
    arg2 = call argAt(2)
    ; do something with the code and args
    ))

As you can see, it takes a lot of code to express what happens here, and it is very imperative in style. But, if I instead use dmacro - which is the destructuring version of macro - it looks like this:

foo = dmacro(
  [code]
  ; do something with the code
  ,
  [>arg1, code]
  ; do something with the code and arg
  ,
  [>arg1, code, >arg2]
  ; do something with the code and args
)

dmacro will automatically check the length and extract the different arguments. The right arrow before the names of arg1 and arg2 marks that these should be evaluated. And what is more, dmacro will generate code that also generates a good condition if no argument matching works out. If you give zero arguments to the first version, it will fail silently. The dmacro will complain immediately. The dmacro destructuring syntax actually supports several more ways of ripping arguments apart. You can find this information in the doks for dmacro. Also, there are equivalent versions of dmacro for lecro, lecrox and syntax, called dlecro, dlecrox and dsyntax. They do the same thing, except they act like lecros or syntax instead.
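
As a hedged sketch of a dmacro with real bodies in place of the comments, the made-up macro below handles only the first two shapes. I am assuming here that each clause returns the value of its body:

```ioke
argKind = dmacro(
  [code]
  ; code is the unevaluated message chain; return the name of its first message
  code name,

  [>arg1, code]
  ; arg1 has already been evaluated; return its value
  arg1)

argKind(foo bar)           ; one argument, should return :foo
argKind(1 + 2, foo bar)    ; two arguments, should return 3
```

Calling argKind with no arguments, or with three, matches neither pattern, so that is where the generated condition should show up.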

Message chains

In many cases a macro will take code that is not wrapped up inside of a method, macro or block. These pieces of code are called message chains, since they are represented as raw Message mimics. The chains are quite flexible, since they can be taken apart, modified and put together again. They can also be left unevaluated and used as data definitions of some kind. That's how the argument handling for methods is implemented, for example. Since the call to "method" can be seen as a regular call to a macro, the argument descriptions are actually just unevaluated message chains that are picked apart to tease out the argument names. The same technique is applicable in any macro usage.

The term message chain fragment is also used to specifically mean a message chain that is meant to be put together with something and evaluated. Picture a daisy chain that gets added at the end of another chain and then executed. That's what happens if you execute something like [1, 2, 3] map(*2). In this case the call to * with the argument 2 will be a message chain fragment that will be put together with a new receiver before execution.
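
Side by side with the two-argument form of map, the fragment version looks like this:

```ioke
; message chain fragment: *(2) is sent to each element in turn
[1, 2, 3] map(*2)          ; returns [2, 4, 6]

; two-argument form: n is bound to each element, then n * 2 is evaluated
[1, 2, 3] map(n, n * 2)    ; returns [2, 4, 6]
```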

To handle syntax correctly - but also to generally handle manipulation of message chains - it is important to know about the available methods to do this. I have added quite a lot of nice stuff that makes it easy to work with message chains.

First, messages are actually Enumerable, so you can use any Enumerable methods on them. The enumeration always starts at the receiver. It will not proceed into arguments, just following the next-pointer. To create a new message or message chain, there are several helpful methods and operators. The first method is called message and takes an evaluated name and returns a new message with that name. Message from takes one argument that will not be evaluated and returns a message chain corresponding to that argument. Message fromText parses text and returns the message chain for it. Message wrap takes an evaluated argument and returns a message that will always return that value. As will be mentioned later, Message has next= and prev= methods that you can use to set the next and previous pointers. Message also has appendArgument and prependArgument that allow you to add new arguments to the message arguments.
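
A short sketch of the creation methods described above. The cell names are illustrative, and I am assuming that "name" returns the message name as a symbol and that "next" returns the following message:

```ioke
; parse text into a message chain
msg = Message fromText("foo bar")
msg name         ; first message in the chain, returns :foo
msg next name    ; following the next pointer, returns :bar

; the same chain, built from an unevaluated argument
quoted = Message from(foo bar)
quoted name      ; returns :foo

; a message that always returns the given value
lit = Message wrap(42)
```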

The most used ways of creating message chains are shortcuts for the above. Let us begin with creation. Instead of Message from you can use '. That is a single quote mark. The message after that will be unevaluated and returned as a message chain. If you use a `, a backtick, that is equivalent to Message wrap. And then we have '', that is two single quotes after each other. This message is generally called metaquote or quasiquote. It works the same as ', except that it will find any place where ` is used, evaluate the message after the `, and insert the resulting value into the current message chain. Finally, '' will replace a `` with a literal ` message.

You can add new arguments to a message by using the << operator. This operator returns the receiver.

If you want to chain together a message chain, using next= and prev= is pretty tedious. You can instead use the -> operator. This will chain together the left hand side and the right hand side messages, and return the right hand side message.

I think it is time for some examples:

; create a new message with name foo
x = 'foo

; add two arguments to the foo message
arg = '(bar quux)
(x << arg) << 'baz

; what we have done so far could be done with:
x = '(foo(bar quux, baz))


y = 'blurg
; chain together x and y
x -> y

; the above is equivalent to
if(y prev,
  y prev next = nil)
x next = y
y prev = x

val = 42

; insert the message chain in x
''(foo bar(`val) `x)

; the above will return the same as
'(foo bar(42) foo(bar quux, baz))

To understand these operators, you need to have a clear understanding of how the internals of message chains work. Once that clicks, these should be fairly straightforward to understand.