Difference between revisions of "Guide"
Revision as of 00:15, 26 January 2009
Introduction
Ioke is a general purpose language. It is a strongly typed, extremely dynamic, prototype-based object-oriented language. It is homoiconic and its closest ancestors are Io, Smalltalk, Ruby and Lisp - but it's quite a distance from all of them. It looks a lot like Io, up to a point.
Ioke is a folding language. This means it folds in on itself. You can create new abstractions covering any of the existing abstractions in the language. You can abstract over these, over and over again, until you have a language that lets you express what you want to express in a succinct and readable way. Ioke allows you to fold your code.
Ioke is targeted at the Java Virtual Machine and is tightly integrated with the platform. Why the JVM? It's available everywhere, it gives several important features such as world class garbage collectors, capable thread schedulers and an amazing JIT compiler. All of these are things that serve Ioke well, without requiring direct development resources from the Ioke team. Access to the Java platform also means access to all existing libraries and functionality, with all that entails. The JVM is just a very pragmatic choice.
You're probably reading this guide at ioke.org. That is the official home of the project, although some of the project functionality is hosted at Kenai, where such things as mailing lists and a bug tracker are available. The canonical source repository for Ioke is on GitHub.
The current version of Ioke is called Ioke P. The naming of Ioke will change regularly with major revisions. There are two different versions in play here. Ioke P is the name and version of the language and core libraries. The initial implementation for Ioke P is called ikj 0.4.0, and the version numbers are not interdependent. The next major version of Ioke will be called Ioke F, and you can find information about it in the chapter on future plans.
This programming guide -- together with the reference for your current version -- should be the complete document needed to understand Ioke P, how to program in it, and how to understand the names and concepts used. It should also give an initial inkling of what I think is good taste.
Note that I will use many names that aren't necessarily the same as the ones traditional programming languages use. These names will be made clear sooner or later in this document, but it might help some to jump forward to Objects, skim that bit, and then start over once words like Origin, cell and mimic make sense.
Vision
The evolution of programming languages is a steady progression of finding new ways to express abstractions naturally - in a way that doesn't stray too far away from the smaller details. A programming language has to make it possible to abstract away common things, while making it easy to customize these abstractions in very detailed ways. A programming language should be able to do this well without sacrificing readability and understandability. This tension lies at the core of programming.
How do you create a language that makes it easy to express powerful concepts in a succinct way, while still making it easy to maintain and work with after the fact, without turning it into a new compression mode? How do you make it easy for a programmer to express high level abstractions that are abstractions of abstractions of abstractions?
There are many open problems in programming language design. Concurrency is one of them, performance another. These are two areas Ioke does not address. Instead, Ioke is a remodeling of the core concepts and ideas embodied in other programming languages.
Are Lisp and Smalltalk still the most powerful languages around, or are there ways of providing more expressiveness without sacrificing understandability? Is there a way to combine all the lessons learned from languages like Ruby and Python, and patch them back into a Lisp and Smalltalk core? Is it possible to do this while taking some of the benefits of Io? Can a language be small, regular, homoiconic, reflective and easy to understand, all at once? I hope that Ioke is just that.
Simplicity doesn't mean lack of power. Small, simple, orthogonal functionality can be more powerful than larger, complicated abstractions that don't fit together.
Io explicitly states that the goal of the language is to refocus attention on expressiveness, and with Ioke I want to take that philosophy one step further.
It's important to realize that an experiment like this doesn't necessarily have to mean the language can't be used for real projects. By wedding Ioke to the Java Virtual Machine, I make it easy to get access to good libraries and existing implementations on most platforms. In that way, Ioke can be used to create real systems, even though the ecosystem will initially be very small. And I think that this is necessary. How can you know if a language really is worthwhile or not, if you can't use it as a general purpose programming language? The Java platform makes this possible.
Getting started
Ioke is very easy to get started with. The first step is to download a package. Which one you choose depends on what platform you're on, and whether you want to build Ioke yourself, or just start using it. This guide will only cover using a prebuilt version. Go to the download page, and grab one of the distributions. At the time of writing the full version of Ioke is Ioke P ikj 0.4.0. Choose the latest download in the 0.4-series for this document to apply.
Once you have downloaded the distribution, you need to unpack it somewhere, and finally add the bin directory to your PATH environment variable. There is also a jar download that can be run directly. If you choose this option you don't get the benefits of having a home for Ioke, which in some cases might be inconvenient. Ioke can be run directly from the jar file, though.
Building Ioke
If you'd like to build Ioke from source, make sure you have a recent version of the Java Development Kit installed (1.5.0 or higher, preferably 1.6.0) and Apache Ant. You must have the ant script reachable from your PATH variable. Then, simply check out the source code from the main repository, and build it using ant. That should run all the compilation steps and tests, and allow the bin/ioke script to run. Just proceed as if you had unpacked the distribution, adding the bin directory to the PATH.
Running scripts
To run an Ioke script, you can generally just use the ioke command:
$ ioke helloWorld.ik
Hello world
You can also execute snippets of code on the command line using the -e argument to the ioke command. You can have several of these in the same line too:
$ ioke -e'"Hello world" println' -e'"Goodbye world" println'
Hello world
Goodbye world
When using -e, be careful about what quoting style you use, since the shell sometimes can munge up your commands if you don't surround them correctly.
The ioke command has several helpful command line options, which can change what happens during execution. These are:
- -Cdirectory
- Switch to directory before executing any files and command line scripts. This will make the directory the initial current working directory for Ioke during the execution of the JVM.
- -d
- Enable debug output.
- -e script
- Execute script, as described above. May occur more than once on a command line.
- -h
--help
- Display help information, including descriptions of these command line options.
- -Idirectory
- Add directory to the load path of Ioke. May occur more than once on a command line.
- -JjvmOptions
- Pass on options to the JVM. This can be used to change any runtime parameters that your JVM takes. May occur more than once. The options are provided directly after the -J, so if you want to change the maximum amount of memory used, you can do that writing -J-Xmx128M.
- --copyright
- Print copyright information and exit.
- --version
- Print version information and exit.
- --server
- Run the JVM in server Hotspot mode.
- --client
- Run the JVM in client Hotspot mode (the default).
- --
- Mark the end of options to the ioke script. Anything after this is passed as options to the code being run.
If you provide the name of a script file on the command line, it should come after all the arguments to the ioke script. Everything after the script will be added as data to the System programArguments cell. You can use both one-line scripts with -e and specify a script file. If so, the script file will be run after the one-line scripts.
Interactive mode
If no code to execute has been specified to the ioke script, IIk - Interactive Ioke - will start. This is a REPL that allows the execution of arbitrary code in a shell that immediately displays the result. The main difference between running Ioke from a file and interactively is that the interactive prompt will show a notice of the result of the last operation after each execution. IIk will also invoke a debugger when a condition is encountered. This debugger gives you the possibility to inspect what happened more closely. The final difference with IIk is that it does not execute code directly in Ground - which the top level inside an Ioke script will do. This difference is crucial, when considering namespacing issues.
IIk will try to use Readline through JLine if your platform supports it.
IIk will be described more closely later, but just to give you a glimpse, this is what a small session could look like:
iik> "hello world" println
hello world
+> nil
iik> 10 * 20
+> 200
iik> 3/2
+> 3/2
iik> 3/2 + 3/2
+> 3
iik> 3/2 * 3
+> 9/2
iik> foo = "hello"
+> "hello"
iik> foo
+> "hello"
iik> exit
Bye.
When you see the prompt "iik>", you know that IIk is waiting for input. The result of a computation is shown after the "+>" sigil. You can exit from IIk by calling either "exit" or "quit". There is also a restart named "quit" that can be invoked to quit IIk.
Syntax
Ioke has no keywords or statements. Everything is an expression composed of a chain of messages. A piece of code is represented as a chain of messages that links to the next message. The result of one message will be the receiver of the next message, until a "." message is received. The "." message is a terminator that throws away the current receiver. A newline will serve as a "." message in the circumstances where it feels natural.
An informal BNF description of Ioke looks like this:
program      ::= messageChain?
messageChain ::= expression+
expression   ::= message | brackets | literal | terminator
literal      ::= Text | Regexp | Number | Decimal | Unit
message      ::= Identifier ( "(" commated? ")" )?
commated     ::= messageChain ( "," messageChain )*
brackets     ::= ( "[" commated? "]" ) | ( "{" commated? "}" )
terminator   ::= "." | "\n"
comment      ::= ";" .* "\n"
What isn't visible here is that all whitespace -- except for newlines -- will work only as separators of messages, and is otherwise ignored. That means that message sending does not use the dot, as in most other languages. A phrase such as foo().bar(quux(42)).baaz() would be expressed as foo() bar(quux(42)) baaz(), or more succinctly foo bar(quux(42)) baaz in Ioke.
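To make the grammar a little more concrete, here is a minimal Python sketch of a recursive-descent reader for a small subset of it: identifiers, integer literals, and parenthesised, comma-separated arguments. This is a model for illustration only, not Ioke code and not how ikj parses. The function name parse_chain and the tuple representation are assumptions made for this example.

```python
import re

def parse_chain(source):
    """Parse a message chain into a list of (name, arguments) tuples.

    Each argument is itself a message chain, i.e. a list of tuples.
    Only identifiers, integers, parentheses and commas are handled.
    """
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[(),]", source)
    pos = 0

    def chain():
        nonlocal pos
        messages = []
        # A chain ends at a comma, a closing parenthesis, or end of input.
        while pos < len(tokens) and tokens[pos] not in (",", ")"):
            name = tokens[pos]
            pos += 1
            args = []
            if pos < len(tokens) and tokens[pos] == "(":
                pos += 1  # skip "("
                if tokens[pos] != ")":
                    args.append(chain())
                    while tokens[pos] == ",":
                        pos += 1  # skip ","
                        args.append(chain())
                pos += 1  # skip ")"
            messages.append((name, args))
        return messages

    return chain()

explicit = parse_chain("foo() bar(quux(42)) baaz()")
succinct = parse_chain("foo bar(quux(42)) baaz")
print(explicit == succinct)  # empty parentheses add nothing to the parsed form
```

Whitespace only separates tokens here, which mirrors the point that message sending needs no dot. Operators, brackets, terminators and text literals are deliberately left out of the sketch.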
All the types of literals are actually turned into a message to create that literal, so the canonical form of the message chain contains no literals, just a message to create that literal. Any message can have zero or more arguments given to it. Arguments are separated with commas. If there are no arguments to a message, the parentheses can be left off, but they need to be there if there are arguments. Almost any combination of characters can be used as an Identifier, with some exceptions.
There used to be a parsing element called operators, but these have now been included into identifiers. They are not parsed differently at all, but the operator shuffling step will handle them differently. Specifically, operators can be used in infix, including having different precedence rules. Assignment is a specific form of operator which gets its own kind of shuffling. These are both described below.
An identifier in Ioke can be one of several things. Ioke takes the rules for Java identifiers, and adds some more to them. All Unicode letters and digits can be part of an identifier, except that a digit cannot come first. Underscores are allowed, just like in Java. Ioke also allows colons in identifiers. Exclamation marks and question marks are allowed anywhere in an identifier except at the beginning.
Identifiers can be broadly classified into regular identifiers and operators, where an operator can be any combination of several sigils. There are also some special operators that have restrictions. These are:
- Opening and closing brackets are not allowed, except together with their counterpart, so [ is not a valid identifier, while [] is. So is {}. () is not valid either.
- Two or more dots form a valid identifier.
- A hash sign can be followed by any operator char, but isn't parsed as an identifier by itself.
- Slash is not an operator char, but can be used as one except in combinations that look like regular expressions.
The operator chars are: +, -, *, %, <, >, !, ?, ~, &, |, ^, $, =, @, ', ` and :. These can be combined together in any order, and any number, except for the caveats noted before. That means the available operator space is infinite. Combinations of letters and operator characters are generally not allowed, except for the exceptions with :, ! and ?. This is to make it possible to have infix operations without spaces in some situations.
The two forms of brackets will get turned into a canonical form. Surrounding comma-separated message chains with square brackets is the same as calling the method [], giving it those message chains as argument. So [foo, bar, quux] is exactly the same as [](foo, bar, quux). The same is true for curly brackets.
Comments start with semicolon and end at the first newline. They can be used mostly anywhere, except inside of literal texts. The hash sign followed by an exclamation mark is also a comment, to allow the shebang line in Unix scripts.
How and when a message is actually evaluated depends on what kind of cell it refers to. If the cell is inactive, the value of that cell will be returned. If it's active, the cell will be activated and the result of that activation returned. How the activation works depends on what kind of code the cell contains. The various kinds of code are described more closely in the chapter about code.
Literal values
Ioke currently contains four different kinds of literals. There is a fifth quasi literal, that isn't exactly parsed as a literal, but will be evaluated differently based on its name. These literals are texts, regular expressions, integers and decimal numbers. Symbols are actually parsed as regular identifiers, but they are handled a bit differently during evaluation.
Text
A literal text in Ioke is what is generally called strings in most languages. As in most languages, text is written inside of double quotes. Any characters are valid inside of those double quotes. That includes newlines - so you can write a literal text that extends to several lines. There is an alternate syntax for text when the value contains a lot of double quotes. As in most other languages, several escapes are valid inside of a text. Escapes are preceded by the backslash, and insert the character corresponding to the escape values. These escapes are:
- \b
- Inserts the backspace character, that is represented in ASCII by the decimal value 8.
- \e
- Inserts the character that is represented in ASCII by the decimal value 27. This value is used for sending escape values to the TTYs in some operating systems.
- \t
- Inserts the TAB character - ASCII decimal 9.
- \n
- Inserts the newline character - ASCII decimal 10.
- \f
- Inserts the form feed character - ASCII decimal 12.
- \r
- Inserts the carriage return character - ASCII decimal 13.
- \"
- Inserts the double quote character - ASCII decimal 34.
- \\
- Inserts the backslash character - ASCII decimal 92.
- \[newline]
- Inserts nothing at all. Used to escape necessary newlines, without having them show up in the output text.
- \#
- Inserts a literal hash character - ASCII decimal 35.
- \uABCD
- Inserts the Unicode codepoint corresponding to the hexadecimal value of the four characters following the "u". All four hexadecimal characters need to be specified.
- \7, \12, \316
- Inserts the Unicode codepoint corresponding to the octal value of the one, two or three octal characters. The maximum value allowed is \377, and the minimum is obviously \0.
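The escape table above can be modelled as a small decoder. The following Python sketch is an illustration under stated assumptions, not Ioke's actual implementation: the name decode_escapes is hypothetical, and the choice to raise an error on unknown escapes is made up for the example.

```python
import re

# Single-character escapes from the table above; backslash-newline inserts nothing.
SIMPLE = {
    "b": "\b", "e": "\x1b", "t": "\t", "n": "\n", "f": "\f", "r": "\r",
    '"': '"', "\\": "\\", "#": "#", "\n": "",
}

# Tried in order: \uABCD, then an octal escape of at most \377, then any single character.
ESCAPE = re.compile(r"\\(u[0-9A-Fa-f]{4}|[0-3]?[0-7]{1,2}|.)", re.DOTALL)

def decode_escapes(raw):
    """Replace each backslash escape in raw with the character it stands for."""
    def replace(match):
        escape = match.group(1)
        if escape[0] == "u" and len(escape) == 5:
            return chr(int(escape[1:], 16))   # four hexadecimal digits
        if escape[0] in "01234567":
            return chr(int(escape, 8))        # one to three octal digits
        if escape in SIMPLE:
            return SIMPLE[escape]
        raise ValueError("unknown escape: \\" + escape)  # sketch-only behaviour
    return ESCAPE.sub(replace, raw)

print(repr(decode_escapes("flax \\\nmux")))
```

The last line feeds in a backslash followed by a newline, the case used to break a long text across source lines without the newline showing up in the value.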
Ioke also supports an alternative text syntax that can be used when the text in question contains many double quotes. The alternative syntax starts with #[ and ends with ]. A right bracket will have to be escaped, but double quotes don't have to be.
The parsing of text will generate a message with name "internal:createText". This message will get one argument that is the raw Java String corresponding to the text.
Ioke allows automatic interpolation of arbitrary values in the same manner as Ruby. It uses the same syntax for this, which is the #{} syntax inside a text. These can be nested in any way. The elements will be parsed and sent as arguments to the message with name "internal:concatenateText". So an Ioke text such as "foo bar#{flux} will #{1+2}" will generate the message internal:concatenateText("foo bar", flux, " will ", 1+(2), ""). As you can see, there is a small amount of waste in the way this is generated -- but the simple model makes it easy to understand. It's not guaranteed that this will remain the same, although the message will definitely remain.
Some examples:
"foo"
"flax \
mux"
"one two #{three} \b four"
#[you don't really "#{1+2+3}" believe that?]
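The way an interpolated text is split into arguments for internal:concatenateText can be imitated in a few lines. This Python sketch is only a model: the function name concatenate_args is an assumption, and it handles neither nested interpolation nor the \# escape, both of which the real parser has to deal with.

```python
import re

def concatenate_args(text):
    """Split a text into alternating literal and interpolated parts.

    Returns a list of ("text", ...) and ("code", ...) pairs in source order.
    """
    # With one capture group, re.split returns literal and captured parts alternately.
    parts = re.split(r"#\{(.*?)\}", text)
    return [("code" if index % 2 else "text", part)
            for index, part in enumerate(parts)]

for kind, part in concatenate_args("foo bar#{flux} will #{1+2}"):
    print(kind, repr(part))
```

Because the example ends with an interpolation, the split leaves an empty literal part at the end, which corresponds to the trailing "" argument mentioned in the description of the generated message.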
Regular expressions
Ioke has very capable regular expressions. Exactly what you can do with them can be found further down in this guide. The literal syntax allows regular expressions to be embedded in code directly. The syntax for this starts with a #/ and ends with another /. The last slash can optionally be followed by some flags that change the behavior of the expression. Regular expressions can also use an alternative syntax that starts with #r[ and ends with ]. Just as with Text, regular expressions can contain interpolation. This interpolation will be transformed into regular expressions and then combined with the outer regular expression. A few examples might be in order here:
#//
#r[]
#/foo/
#r[foo]
#/fo+/x
#r[fo+]x
#/bla #{"foo"} bar/
#r[bla #{"foo"} bar]
The examples come in pairs, one for each syntax. The first pair is an empty regular expression. The second is an expression matching the word "foo". The third matches an "f" followed by one or more "o". It also allows extended regular expression syntax, due to the x flag. The flags supported in Ioke are x, i, u, m and s. The meanings of these match those of the corresponding Ruby flags. Regular expressions allow most of the same escapes as Ioke text. Specifically, these escapes are supported: b, t, n, f, r, /, \ and newline. Unicode and octal escapes also work. The fourth pair shows the insertion of a literal text inside of a regular expression.
Ioke regular expressions will be transformed into a call to internal:createRegexp. This message expects two Java strings, one with the actual pattern, and one with the flags.
Integers
Ioke supports arbitrarily sized numbers. It also contains a numerical tower that can be more closely explored in the reference documentation. The numerical tower is rooted in Number. Number Real mimics Number. Number Rational mimics Number Real, and so does Number Decimal. Finally, Number Integer and Number Ratio both mimic Number Rational. The interesting parts of this tower are Number Integer, which corresponds to integers, Number Ratio, which is any ratio between two integers, and Number Decimal, which corresponds to decimal numbers. These are arbitrarily sized and exact. There are no floats or doubles in Ioke. There is also a potential place for Number Complex at the same layer as Number Real, although complex numbers are not currently implemented. There are also plans for implementing a unit system further down the line. Number Infinity represents the singleton infinity object.
Literal integers can be written using either decimal or hexadecimal notation. Hexadecimal notation begins with 0x or 0X and is then followed by one or more hexadecimal digits. The letters can be either upper or lower case. A decimal literal number is written using one or more decimal digits, but nothing else.
There is no literal to create ratios - these can only be created by division of integers. Negative numbers have no literal syntax, but preceding a number with a minus sign will call the message - on the number and generate the negative value.
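For readers who have not met exact ratios before, Python's fractions.Fraction behaves in a comparable way and can serve as an analogy. This is not Ioke code and says nothing about how Number Ratio is implemented; it only reproduces the arithmetic from the IIk session shown earlier.

```python
from fractions import Fraction

three_halves = Fraction(3) / Fraction(2)   # dividing integers yields a ratio, no rounding
total = three_halves + three_halves        # collapses back to a whole number
scaled = three_halves * 3                  # stays an exact ratio

print(three_halves, total, scaled)
```

The point of the analogy is that no step involves a float, so no precision is lost along the way.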
A literal integer will be transformed into a call to internal:createNumber, which takes one native Java String from which to create the number.
Some examples:
1234444444444444444444444444444444444444235234534534
0
0xFFFFF
Decimals
Literal decimal values can be written either using exponential notation, or using a decimal dot. A decimal dot notation can be combined with exponential notation. Exponential notation starts with a number or a decimal number, followed by lower or upper case E, followed by an optional sign, and then followed by one or more decimal digits.
A literal decimal will be transformed into a call to internal:createDecimal, which takes one native Java String from which to create the decimal.
Some examples:
0.0
1E6
1E-32
23.4445e10
Symbols
Symbols aren't exactly syntax, but they aren't exactly messages either. Or rather, they are messages that evaluate to the symbol they name. Symbol is a kind in Ioke. There are two sorts of symbols - the first is simple symbols that can be parsed as is. The second is symbols that can't be parsed as is. Symbols are preceded by a colon and then directly followed by the symbol text. If the text can't be parsed as is, it should be surrounded by quotes, and this will be turned into a call to the method :, which takes the text as argument. That means that you can actually get dynamic symbols by calling the : method.
Some examples:
:foo
:flaxBarFoo
:""
:"mux mex mox \n ::::::::"
Operator shuffling
One exception to the way message handling works in Ioke is operators. All the so called operators in this section can be called directly in message passing position too -- but to make it possible to use them in a more natural way, the parsing step will handle them a bit differently, and then do a shuffling step that actually takes operator precedence into account. So all the common operators will generally work as you expect them to -- although I recommend adding parentheses when something is possibly unclear.
Ioke has a slightly larger number of operators than most other languages. Most of these are currently unused, but they are certainly available for any purpose the programmer wants to use them for. Many adherents of other languages (Java, I'm looking at you) claim that operator overloading is evil. I don't believe that is true, seeing as how it works so well in Ruby, so Ioke instead allows you quite large freedom with regards to operators.
The precedence rules for regular operators can be found in the cell 'Message OperatorTable operators', which is a regular Dict that can be updated with new values. The new values will obviously not take effect until the current code has run, and a new parse is started.
Note that the list below contains only the operators that have defined precedence rules. As noted in the section on syntax, you can really use any operator you want. It is easy to add new precedences to the table, either temporarily or permanently.
At the time of writing, the available operators - in order of precedence - are these:
- !
- #
- $
- ?
- ~
- **
- %
- *
- /
- +
- -
- ∩
- ∪
- <<
- >>
- <
- <=
- <=>
- <>
- <>>
- >
- >=
- ≤
- ≥
- ⊂
- ⊃
- ⊆
- ⊇
- !=
- !~
- ==
- ===
- =~
- ≠
- &
- ^
- |
- &&
- ?&
- ?|
- ||
- !>
- !>>
- #>
- #>>
- $>
- $>>
- %>
- %>>
- &&>
- &&>>
- &>
- &>>
- **>
- **>>
- *>
- *>>
- +>
- +>>
- ->
- ->>
- ..
- ...
- />
- />>
- <->
- =>
- =>>
- ?>
- ?>>
- @>
- @>>
- ^>
- ^>>
- |>
- |>>
- ||>
- ||>>
- ~>
- ~>>
- ∘
- %=
- &&=
- &=
- **=
- *=
- +=
- -=
- /=
- <<=
- >>=
- ^=
- and
- nand
- nor
- or
- xor
- |=
- ||=
- <-
- import
- return
And as mentioned above, all of these can be used for your own purpose, although some of them already have reserved meanings. This document will cover most of the used operators, while the rest can be found in the reference.
Since this operator shuffling happens, that also means that an Ioke program has a canonical inner form that can differ from the source text. When you use introspection of any kind, you will get back that canonical form which might not look exactly like you expected. Similarly, if you ask some code to print itself, it will use the canonical form instead of the operator skin. Macros that modify message chains should work against the canonical form, and nothing else.
What an operator does depends on the result of sending the message of that name to the receiver, just like regular messages. In fact, to Ioke there really isn't any difference, except that the parsing takes special notice about operators and assignment operators.
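The shuffling step can be pictured as a small precedence-driven rewrite from the readable form to the canonical form. The Python sketch below is a model only: the LEVELS table and its numbers are invented for the example (lower binds tighter) and are not the contents of 'Message OperatorTable operators'. It assumes operands and operators alternate and are separated by spaces, and it treats every operator as left-associative.

```python
# Hypothetical precedence levels for illustration; lower numbers bind tighter.
LEVELS = {"**": 1, "*": 2, "/": 2, "%": 2, "+": 3, "-": 3, "<": 4, "==": 5}

def shuffle(source):
    """Rewrite 'a + b * c' style input into canonical message form."""
    tokens = source.split()
    pos = 0

    def parse(limit):
        # Collect one operand plus every operator that binds tighter than limit.
        nonlocal pos
        parts = [tokens[pos]]
        pos += 1
        while pos < len(tokens):
            operator = tokens[pos]
            level = LEVELS[operator]
            if level >= limit:
                break  # belongs to an enclosing, looser operator
            pos += 1
            argument = parse(level)
            parts.append(operator + "(" + argument + ")")
        return " ".join(parts)

    return parse(float("inf"))

print(shuffle("1 + 2 * 3"))
print(shuffle("1 * 2 + 3"))
```

In the first call the multiplication ends up inside the argument of +, while in the second both operators become messages in the same chain. That is the difference precedence makes once everything is ordinary message passing.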
Assignment shuffling
Much like with regular operators, trinary - assignment - operators are subject to a kind of shuffling. This shuffling differs from regular operator shuffling, in that it will shuffle around two things - the left hand side and the right hand side. This is true for every assignment operator except for the unary ones, which will only reshuffle one message.
A few examples might make the translation easier to perceive. The first item is the readable form, while the second form is the canonical form:
foo = 1 + 2
=(foo, 1 +(2))
Ground foo *= "text"
Ground *=(foo, "text")
bar foo(123) = 42
bar =(foo(123), 42)
flux++
++(flux)
These examples show some more advanced details -- specifically the fact that assignment operators generally work on "places", not on names or cells. This will be explored further in the chapter on assignment. The important thing to notice from the above examples is that for most assignments two things will be rearranged. For the unary operators only one thing will be moved.
Just as with regular operators, the assignment operators have information in the 'Message OperatorTable' cell. The specific cell is 'Message OperatorTable trinaryOperators', and it matches an assignment operator to either the integer 1, or the integer 2. Everything with 1 will be matched as being unary assignment.
The currently available assignment operators are:
- =
- ++
- --
- +=
- -=
- /=
- **=
- *=
- %=
- &=
- &&=
- |=
- ||=
- ^=
- <<=
- >>=
Just as with regular operators, what an assignment operator does depends on the result of sending the message of that name to the receiver object, just like with any type of message.
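The rearrangement done for assignment operators can be sketched as follows. This Python model is an assumption-laden illustration: the ARITY table is a hand-written subset that copies the idea of mapping each operator to 1 or 2, the function name shuffle_assignment is hypothetical, and the right hand side is expected to be in canonical form already.

```python
# Number of messages each assignment operator moves (subset, for illustration).
ARITY = {"=": 2, "+=": 2, "-=": 2, "*=": 2, "++": 1, "--": 1}

def shuffle_assignment(messages):
    """Move the place (and the value, if any) into the operator's argument list.

    messages is a list of already separated messages, e.g.
    ["bar", "foo(123)", "=", "42"].
    """
    index = next(i for i, m in enumerate(messages) if m in ARITY)
    operator = messages[index]
    receiver = messages[:index - 1]      # everything before the place stays in front
    place = messages[index - 1]          # the message directly left of the operator
    if ARITY[operator] == 1:
        call = operator + "(" + place + ")"
    else:
        value = " ".join(messages[index + 1:])
        call = operator + "(" + place + ", " + value + ")"
    return " ".join(receiver + [call])

print(shuffle_assignment(["bar", "foo(123)", "=", "42"]))
print(shuffle_assignment(["flux", "++"]))
```

Only the message directly to the left of the operator is treated as the place. Anything before it remains the receiver of the resulting operator message.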
Inverted operators
In addition to the regular binary operators and the trinary assignment operators, Ioke also sports inverted operators. These aren't actually used anywhere in the core distribution, but they might be useful at some time or another. The basic idea is that sometimes you want to have the right hand side of an expression become the receiver of an operator call, and the left hand side become the argument to the operator. Inverted operators allow this.
As with both the binary and trinary operators, you can find and update information about inverted operators in the cell 'Message OperatorTable invertedOperators'. To make this a little less abstract, let us look at two simple examples and what they translate into:
"foo" :: [1, 2, 3, 4] map(asText)
;; will be translated to
[1, 2, 3, 4] map(asText) ::("foo")
;; provided we have an inverted
;; operator called 'doit'
abc foo quux doit another time
;; will be translated to
another time doit(abc foo quux)
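Expressed as a string rewrite, the inversion is just a swap of the two sides. The Python helper below is a hypothetical illustration, not part of Ioke, and it takes the two sides as already separated strings.

```python
def shuffle_inverted(left, operator, right):
    """The right hand side becomes the receiver, the left hand side the argument."""
    return right + " " + operator + "(" + left + ")"

print(shuffle_inverted("abc foo quux", "doit", "another time"))
```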
Execution model
The way an Ioke program works is very simple. Everything executes based on two things. The first is the context, or the ground, and the second is the receiver. The first message sent in each message chain will have the ground as receiver. The default ground in Ioke source files is an object called Ground. This object is in the mimic chain for most regular objects created in Ioke, which means that things defined at the top level will generally be available in most objects. Inside of methods and blocks, the ground will be different. Exactly in what way is defined by the type of code executing.
Every message in a chain will be sent to the receiver of that message. That receiver is the result of the last message, or the current ground if there was no previous message, or if that previous message was a terminator. So Ioke code like foo bar(flux bar) quux involves 5 different messages.
- The message foo is sent to Ground, which is the current ground and also the default receiver.
- The message bar is sent to the result of the foo message. The value returned will be activated.
- The cell bar contains a method in this case, and that method expects one argument, so that forces evaluation of the arguments.
- The message flux is sent to Ground, since it's the ground and there is no prior message inside of an argument list.
- The message bar is sent to the result of the flux message.
- The result of the bar message is used as the argument value given to the outside bar method.
- The message quux is sent to the result of the initial bar message.
- The result of the quux message is thrown away, unless this code is part of a larger piece of code.
This description generally describes what happens in the case of this code. The more general control flow is this:
1. A message is encountered.
2. If the message is a symbol message, the corresponding symbol will be returned.
3. Otherwise the name of the message will be looked up in the receiver, or in the receiver's mimics.
4. If the name is found and is not activatable, the value of that name (the cell) is returned.
5. If the name is found and is activatable, it will be activated, with the current ground, receiver and message sent to the activatable object.
6. If the name is not found, a second search is done for the name pass. If a pass is found, use that instead of the name of the original message, and go back to 4.
7. If a pass is not found, signal a Condition Error NoSuchCell condition.
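The control flow above can be modelled in a few lines of Python. This is a sketch under stated assumptions rather than a description of the ikj internals: the names Obj, lookup and send are hypothetical, symbols are represented as plain strings starting with a colon, and "activatable" is approximated by "is a Python callable".

```python
MISSING = object()  # sentinel meaning "no such cell"

class NoSuchCell(Exception):
    """Stands in for the Condition Error NoSuchCell condition."""

class Obj:
    def __init__(self, *mimics, **cells):
        self.mimics = list(mimics)
        self.cells = dict(cells)

def lookup(obj, name):
    """Look in the object itself first, then in its mimics."""
    if name in obj.cells:
        return obj.cells[name]
    for mimic in obj.mimics:
        found = lookup(mimic, name)
        if found is not MISSING:
            return found
    return MISSING

def send(ground, receiver, message):
    if message.startswith(":"):            # symbol message: return the symbol itself
        return message
    cell = lookup(receiver, message)
    if cell is MISSING:
        cell = lookup(receiver, "pass")    # second search, for the name pass
        if cell is MISSING:
            raise NoSuchCell(message)
    if callable(cell):                     # activatable: gets ground, receiver and message
        return cell(ground, receiver, message)
    return cell                            # not activatable: return the value as is

base = Obj(answer=42,
           double=lambda ground, receiver, message: lookup(receiver, "answer") * 2)
child = Obj(base)
print(send(None, child, "answer"), send(None, child, "double"))
```

Note that a pass cell receives the original message, which is what lets it decide how to react to a name it has never seen.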
Exactly what happens when an object is activated depends on what kind of code gets activated. It's really up to the method, block or macro to handle evaluation of arguments in any way it likes - including not evaluating them. For a description of the default models available, see the chapter on code.
Objects
The object model of Ioke is quite simple. Everything in Ioke is an object that follows these same rules. An object is something with an identity. It can have zero or more mimics, and zero or more cells. An object can also have a documentation text. Some objects can have a native data component. This acts more or less like a hidden cell that contains information that can't be directly represented in Ioke - for example the actual text in a Text. Or the actual number in a Number. Or the actual regular expression in a Regexp. These objects are the core types that contain primitive information.
A cell is the main way of representing data in Ioke. A cell has a name and a value. Every value in Ioke is a cell - every time you send a message, a cell is looked up for the value of that cell. Cells can contain any kind of data. In other languages, cells are generally called properties or slots. They are quite close to instance variables that also can contain methods. Cells can be added and removed at any time during runtime.
A mimic could also be called the parent of the object. Ioke is a prototype based language, which means that there is no distinction between classes of objects, and the objects themselves. In fact, any object can be used as the "class" of a new object. The word for that is mimicking, since the word "class" loses its meaning in this kind of language. It's most common for an object to mimic one other object, at least initially. It's impossible to create an object that doesn't mimic anything, but you can remove all mimics for an object after the fact. You can also add more mimics. This turns out to be useful to represent shared functionality in the manner of Ruby mixins, for example. The actual effect of a mimic is that when a cell can't be found in the current object, all mimics will be searched for that cell (depth-first). So all cells available in an object's mimics are available to the object too. This is the inheritance part of Object-Oriented Programming.
In many places you will find the word "kind" being used. A Kind is by convention an object that is used primarily as a mimic for other objects. The convention is that kinds are named with an initial upper case letter, while everything else starts with a lower case letter. The assignment process of Ioke also uses this convention to automatically set a cell called "kind" on any object that gets assigned to a name matching this convention.
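A short sketch of mimics and kinds in practice. Animal, Walker, dog, sound and walk are invented names, and mimic! is assumed here to be the message that adds another mimic to an existing object:

```ioke
Animal = Origin mimic        ; upper case name, so the kind cell gets set
Animal sound = "..."

dog = Animal mimic
dog sound                    ; found in the mimic Animal
dog sound = "woof"           ; creates a cell in dog that shadows the one in Animal
Animal sound                 ; unchanged, still "..."
dog kind                     ; "Animal", found through the mimic

;; shared functionality in the manner of a mixin
Walker = Origin mimic
Walker walk = method("walking" println)
dog mimic!(Walker)           ; assumption: mimic! adds Walker as a second mimic
dog walk
```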
The rest of this chapter will discuss the kinds that are the basis of the object system.
Base
The kind called Base is the top of the mimic chain. It's not generally useful in itself as it only defines the bare minimum of cells to make it possible to add new cells to it, mimic it, and so on. But if you want an object that is possible to use but doesn't include most of the other stuff, Base is the place to begin. Be careful when defining methods in Base, since it doesn't have access to most of the namespace. In fact, it doesn't even know about its own name. Base can act as a kind of blank slate, if needed, but it's probably easier to just create a regular object and remove all mimics from it after the fact.
Base defines these cells:
- kind
- returns the kind of the object, which is "Base".
- notice
- returns the short notice of the object, which is "Base". Refer to Introspection for more information about notice.
- =
- Takes two values, the first a place and the second a value, and assigns the place named to that value. Refer to Assignment for more information about it.
- ==
- Compares this object against the argument. Returns true if they are the same, otherwise false.
- cell
- Takes one argument that should be the name of a cell that exists, and returns the value of the cell unactivated.
- cell=
- Sets a cell to a specific value. Used to set cells that can't be set using the regular assignment model. Refer to Assignment for more information about it.
- cell?
- Takes one argument that should be the name of a cell, and checks whether that cell exists in this object's mimic chain.
- cellNames
- Returns a List containing the names of all cells this object contains.
- cells
- Returns a Dict with all cells this object contains. The key is the name and the value is the cell value.
- cellOwner
- Returns the closest mimic that has a cell with the name given as argument to the message. A condition will be signalled if you try to find the owner of a cell that doesn't exist in this mimic tree. This method will only return the closest cell owner for the named cell. It will not use "pass", so it's the responsibility of pass-implementers to make it return a correct result for those names.
- cellOwner?
- Takes the name of a cell and returns true if the receiver of the message defines a cell by that name, otherwise false. Note that there can be more than one cell owner in a mimic chain. This just returns true if the current receiver is the closest one.
- removeCell!
- Removes the named cell from the current object. This means that if the current cell shadowed cells in mimics, those can be called again. It only removes a cell if the receiver is the owner of that cell. Otherwise it is an error to call this method.
- undefineCell!
- Makes it impossible to find a cell from the receiver. In all ways it looks like this cell doesn't exist in the mimic chain at all, even if mimics define several implementations of it. The use of undefining can make an object conceptually totally clean from cells, although it might be hard to use the object after that. An interesting side-effect of the way these methods work is that removeCell! can be used to remove the undefine. So if you call removeCell! with a cell name and a receiver that has been called with undefineCell! earlier, that undefine-status will be removed, and access to mimic versions of the cell will be possible again. Look at the specs for a better understanding.
- documentation
- Returns the documentation text for this object, or nil if no documentation exists for it.
- documentation=
- Sets the documentation text for this object.
- mimic
- Returns a newly created object that has the receiver as mimic. This is the magic way of creating new objects in Ioke. It is also the ONLY way to do it.
- hash
- The base implementation of hash coding - currently the default implementation just returns an identity dependent hash code.
- identity
- Returns the object receiving the message.
All of these methods are described further in the reference.
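To make the cell-related messages above more concrete, here is a minimal sketch. Parent, child and x are invented names, and the comments restate what the descriptions above say should happen rather than output that has been checked:

```ioke
Parent = Origin mimic
Parent x = 1
child = Parent mimic
child x = 2

child cell?(:x)          ; true, x exists in the mimic chain
child cellOwner?(:x)     ; true, child defines x itself
child cellOwner(:x)      ; child, the closest owner

child removeCell!(:x)    ; the shadowed cell in Parent becomes visible again
child x                  ; the value from Parent
child cellOwner(:x)      ; now Parent

child undefineCell!(:x)  ; x can no longer be found from child
child cell?(:x)          ; false, even though Parent still has x
child removeCell!(:x)    ; removes the undefine
child x                  ; the value from Parent again
```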
Ground
As mentioned above, Ground is the default ground/context for evaluation. Ground mimics IokeGround and JavaGround, and IokeGround mimics Base and DefaultBehavior. IokeGround is special in that this is the place where all top level kinds are defined. If you want to create a top level kind, you should put it in IokeGround. If you take a look in IokeGround, you will see that it contains cells for Text, Dict, List, Base, Origin, itself and many others. Ioke doesn't have any global state at all, but IokeGround is as close as it gets. IokeGround and Ground should in most cases not be mimicked directly.
JavaGround is the place where all Java integration support is integrated into Ioke.
Origin
Origin should be the place where most objects in Ioke start from. It is specifically created to be the origin of objects. As such it doesn't contain many cells for itself, but it mimics Ground and has access to everything from Base, DefaultBehavior and Ground in that way. When adding new more or less global functionality, Origin is probably the best place to put it. Currently, the only cells Origin contains are for purposes of printing itself.
Origin also happens to be the point where initialization is defined. This is really done as an aspect on 'mimic'. If you want an object to be able to be initialized every time a new mimic of it is created, just create a method called initialize in your kind. It will be called by the mimic-aspect. Any arguments given to mimic will be ignored by mimic itself and passed along to initialize. An example:
Foo = Origin mimic
Foo initialize = method("New foo created!" println)
Foo mimic
Foo mimic
Foo initialize = method(arg1, key:, self value = [arg1, key])
Foo mimic(42, key: 15)
Foo mimic(key: "blarg", 42)
There is nothing special with the initialize method, so if you want more initialization to happen in a deep hierarchy, you will have to use super-calls and so on.
DefaultBehavior
DefaultBehavior is a mixin - meaning it should never be the sole mimic of an object. Mixins are generally not grounded in Base, and don't contain most of the things you would expect from an object. DefaultBehavior contains almost all the general methods you use when programming Ioke. It contains the internal methods to create values from literals, and most other functionality specified in this document. In short, DefaultBehavior is the workhorse, and you should have a pretty good reason to not have it in the mimic chain of an object. Since Ground mimics DefaultBehavior, any object you create from Origin will have DefaultBehavior in its mimic chain.
The actual implementation of DefaultBehavior is divided into several smaller mixins that are all mixed in to DefaultBehavior. These give more focused pieces of behavior. They are, in alphabetical order:
- DefaultBehavior Aspects
- DefaultBehavior Assignment
- DefaultBehavior BaseBehavior
- DefaultBehavior Boolean
- DefaultBehavior Case
- DefaultBehavior Conditions
- DefaultBehavior Definitions
- DefaultBehavior FlowControl
- DefaultBehavior Internal
- DefaultBehavior Literals
- DefaultBehavior Reflection
The recommended way to add new global behavior to Ioke is to either add a cell to one of these, or create a new mixin and mix it in to the appropriate place. If you're adding new flow control features, mixing these in to DefaultBehavior FlowControl might be appropriate, for example.
nil, true, false
The three values nil, true and false are the only values that are considered kinds, even though they start with lower case letters. They are not like the other kinds in the other important way either - these values can not be mimicked, and you will get a condition if you try it. The reason is that Ioke's basic boolean system revolves around these values. It is not entirely certain that these values will forever be the only boolean values, but for now they are. nil should be used to represent the absence of a value, including the absence of a reasonable return value. false is the quintessential false value, and true is the quintessential true value. The value true isn't strictly necessary since any value except for nil and false is true. This notion of truthiness mimics Ruby. The cells nil, true and false are defined in Ground, and they can actually be overridden or changed - but I don't recommend it. I can guarantee lots of chaos and non-working programs from doing it. More info on how these values interact can be found in the section on Comparison.
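A small sketch of what this rule means in practice, using the if conditional described in the chapter on control flow. Only nil and false select the else branch:

```ioke
if(0, "0 counts as true" println)
if("", "so does an empty text" println)
if(nil, "not printed" println, "nil counts as false" println)
if(false, "not printed" println, "false counts as false" println)
```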
Assignment
Superficially, Ioke's assignment model is quite simple. But there are some subtleties too. One of the main reasons for this is that assigning a cell that doesn't exist will create that cell. Where it gets created is different based on what kind of context the assignment happens in. The main difference here is between a method activation context and a lexical block context.
Ioke also supports assignment of places, which makes assigning much more flexible. A third feature of Ioke assignment is that it will check for the existence of an assignment method before assigning a specific name. This chapter will make all these things clear, and show some examples.
Ioke can also do destructuring assignment, which means you can assign more than one value at the same time. Destructuring features nesting and can also apply places, which gives it a lot of flexibility.
Let's start with a small example of simple assignment:
foo = Origin mimic
foo x = 42
foo y = 13
foo x += 2
The first line creates a new Origin mimic, and then assigns that to the name foo. Since this code executes at the top level, "foo" will be a new cell created in Ground. The second line creates a new cell called "x" inside the "foo" object. It gets assigned the value 42. The third line creates a "y" cell, and the fourth line sends the += message, which will first call +, and then assign using =. So at the end of this program, "foo" will contain two cells: "x" with value 44, and "y" with value 13. As mentioned above, cells get created the first time they are assigned to. If you need to create a cell in a specific object, just namespace it. For example, if you want to make sure that you create a cell in Ground, just do "Ground foo = 42".
Inside of a method, the situation is exactly the same. If you assign something, it will be assigned in the current context, which is the local activation context (meaning it's the place where local variables are available). There are two situations where this doesn't hold true. The first one is within the special method "do". This method will take any code as argument and execute that with the receiver of the "do" message as the ground/context of the code inside it. That means "do" is a good way to create new cells inside an object.
This is a bit academic, so let's take a look at an example:
Foo = Origin mimic
Foo x = method(
  ;; this creates a local variable in the method activation
  foo = 42
)
Foo = Origin mimic
Foo do(
  ;; this creates the cell foo inside of Foo
  foo = 42
)
Here you can see a method defined called x. This method will just create a new local cell, which means calling the method will not make any difference on its receiver at all. The call to "do" in contrast will immediately execute the code inside it, and this code will create the cell "foo" inside of "Foo".
The second exception to the general rule is when executing inside of a lexical context. A lexical context is basically established inside of a block, but can also be created transparently when sending code to a method. A lexical block will try to not create new cells. When you assign a cell without a specific place to assign it, a lexical block will first see if there is any cell with that name further out, and if so it will make the assignment there instead. Only when no such cell exists, a new cell will be created in the lexical context. This code shows this in action:
x = 42
fn(x = 43. y = 42) call
x ;; => 43
y ;; => Condition Error NoSuchCell
The "fn" message creates a new lexical block. The chapter on code will talk more about this. But as you can see, this block assigns 43 to the cell "x", and 42 to the cell "y". But since the cell "y" doesn't exist, it will only be created inside the lexical context, while "x" exists outside, and will be assigned a new value instead. The basic idea is that code like this should behave like you expect it to behave.
The canonical form of assignment is a bit different from the way you usually write code in Ioke. The section on the syntax of assignments talked a bit about this. Specifically, something like "foo = 42" will get translated into "=(foo, 42)". That also means that assignment is just a regular method call, and can be overridden or removed just like any other method. That is exactly how both lexical context, and local method context make it possible to have different logic here. This is true for all assignment operators.
All assignment operators take as their first argument the place to assign to. This place will be unevaluated. Only the second argument to an assignment will be evaluated. In most cases, a place is the same thing as a cell name, but it doesn't have to be. Let's look at the case of assigning a cell with a strange name. Say we want to assign the cell with no name. We can do it like this:
cell("") = 42
What happens here is a bit subtle. Since the left hand side of the assignment takes arguments, the "=" method figures out that the assignment is not to a simple cell name, but to a place. The parsing step will change "cell("") = 42" into "=(cell(""), 42)". Notice here that the argument comes along into the specification of the place. When this happens, the assignment operator will not try to create or assign a cell - instead it will in this case call the method cell=. So "cell("") = 42" will ultimately end up being the same as "cell=("", 42)". This way of transforming the name will work the same for all cases, so you can have as many arguments as you want to the place on the left hand side. The equals sign will be added to the method name, and a message will be sent to that instead.
This makes assignment of places highly flexible, and the only thing you need to do is implement methods with the right names. This feature is used extensively in Lists and Dicts to make it easy to assign to specific indexes. So, say we have a list called x. Then this code: "x[13] = 42" will be transformed into "x =([](13), 42)" which will in turn be transformed into "x []=(13, 42)". Ioke lists also have an at= method, so you can do "x at(13) = 42" which will call at=, of course.
The second transformation that might happen is that if you try to assign a cell that has an assigner, you will call that assigner instead of actually assigning a cell. So, for example, if you do "foo documentation = 42", this will not actually create or assign the cell "documentation". Instead it will find that Base has a cell called "documentation=", and instead send that message. So the prior code would actually be equivalent to "foo documentation=(42)".
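The two transformations can be seen side by side in a short sketch. The names x and foo are only for illustration, and the comments show the rewritten message sends as described above:

```ioke
x = [1, 2, 3]
x[0] = 10          ; ends up as x []=(0, 10)
x at(1) = 20       ; ends up as x at=(1, 20)
x println          ; the list now holds 10, 20 and 3

foo = Origin mimic
foo documentation = "a plain object"   ; ends up as foo documentation=("a plain object")
foo documentation println              ; prints the documentation text
```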
All of these assignment processes together make it really easy to take control over assignment, while still making it very obvious and natural in most cases.
Destructuring assignment
The easiest example of destructuring assignment looks like this:
(x, y) = (42, 44)
Note that the parentheses are necessary on both the left hand and right hand side. This will assign 42 to x and 44 to y in the current context, following the assignment rules given above. This assignment will happen in parallel, which means you can do the obvious swapping of values in one operation:
(x, y) = (y, x)
This also works for more than two simultaneous assignments.
The right hand side of an expression like this is expected to be a regular value that can be converted into a tuple. This includes all Enumerable objects, since asTuple is defined there. That means you can also do something like this:
(x, y) = [42, 44]
If the destructurings don't match up, this is an error. Something like
(x, y) = [42, 44, 46]
will signal a Condition Error DestructuringMismatch condition. This might not always be convenient. Say you don't care about the rest of the elements and want to extract the first two no matter what. You can do that by ignoring the rest - this is done with the underscore:
(x, y, _) = 1..100
You can also use the underscore to ignore specific locations in other places. These will only ignore one element though:
(x, _, y, _) = 1..100
x should == 1
y should == 3
You can nest destructurings. All of the above mechanisms work correctly while doing that. So for example you can do this:
(x, y, (q, p, _), z) = (10, 42, 1..100, 55)
x should == 10
y should == 42
q should == 1
p should == 2
z should == 55
Finally, all of the above work with places as well as with regular names. This means you can for example do list indexed assignment inside one of these elements:
x = (1..10) asList
x ([5], [0], [2]) = (2, 3, 4)
x should == [3,2,4,4,5,2,7,8,9,10]
Let
Sometimes you really need to change the value of something temporarily, but then make sure that the value gets set back to the original value afterwards. Other situations often arise when you want to have a new name bound to something, but maybe not for a complete method. This might be really useful to create closures, and also to create localized helper methods. For example, the Ioke implementations for Enumerable use a helper syntax macro. This macro is bound temporarily, using a let form. This ensures that the syntax doesn't stay around and pollute the namespace.
A let in Ioke can do two different things, that on the surface look mostly the same but are really very different operations. The first one is to introduce new lexical bindings, and the second is to do a dynamic rebind of a specific place. The easiest way of thinking about it is that the lexical binding introduces a local change or addition to the available names you're currently using, while a dynamic rebinding will change the global state temporarily, and then set it back.
This sounds really academic, so let us go for some examples. We begin with lexical bindings.
;; put everything in a method to show explicit scope
foo = method(x, y,
  x println ; => argument value of x
  let(x, 14,
    x println ; => 14
  )
  x println ; => argument value of x
  y println ; => argument value of y
  y = 13
  y println ; => 13
  let(y, 14,
    y println ; => 14
  )
  y println ; => 13
  z println ; will signal condition
  let(z, 42,
    z println ; => 42
  )
  z println ; will signal condition
)
Here a new method is created that has two arguments, x and y. The first let-expression will create a new scope where a binding from x to 14 is established. This binding is valid until the end of the let-form. It can be changed - doing an assignment will set the value to something else - but only until the end of the let form. The same thing is true with y. We can change the value of y outside of the let form. That changes the actual argument variable. But a let form that binds y will only have it active for a limited time. Finally, a let form can also create totally new variables, as when creating z.
I didn't show any example of it, but the first part of a let-name can be any kind of place, not just a simple name. Anything you can use with =, can be used as a name for let. So you could do something like let(cell(""), 42, nil) if you wanted to.
OK, so that's lexical binding. What about dynamic rebinding? The main difference in a dynamic binding is that the scope you work in is something that is referencable from other scopes. In most cases this will be global places, but not necessarily. You can also rebind cells inside of other objects with the dynamic binding feature.
bar = method(
  let(Origin foo, method("haha" println),
    "x" foo
    Origin foo
    [1,2,3] foo
  )
  let(Text something, 42,
    "abc" something println ; => 42
  )
  "abc" something println ; will signal condition
  let(Text inspect, "HAHA",
    "foo bar qux" inspect println ; => "HAHA"
  )
  "foo bar qux" inspect println ; => #["foo bar qux"]
  let(Text asRational, method(42),
    (3 + "haha") println ; => 45
  )
)
This example actually changes things quite a lot. The first and second examples introduce new cells into existing places and use them, but don't do anything else. The third example actually overrides an existing cell in Text - inspect - and then uses it inside of the let code. Finally, after the let block is over, we see that the original method is back. The fourth example shows that our changes with let actually are global. There is no asRational on Text, but we add it temporarily and can then use it in arithmetic with numbers. This is once again a temporary change that will disappear afterwards.
Ioke's let-form is incredibly powerful, and it allows very nice temporary and localized changes. Of course, it's a power that can be abused, but it gives lots of interesting possibilities for expression.
Control flow
Ioke has some standard control flow operators, like most other languages. In Ioke, all of these are regular method calls though, and they can usually be implemented in terms of Ioke. This chapter will chiefly talk about comparisons, conditionals and iteration constructs.
Comparison
There are several comparison operators in Ioke, but the most important is called the spaceship operator. This operator is <=>. It takes one argument and returns -1, 0 or 1 depending on the ordering of the receiver and the argument. If the two objects can't be compared, it returns nil. If you implement this operator and mix in Mixins Comparing, you get the operators ==, !=, <, <=, > and >= implemented in terms of the spaceship operator. There are two other common operators in Ioke. The first is =~, which can also be called the match operator. It's only implemented for Regexp right now. The === operator also exists, but implements matching slightly differently for all different types of objects. It is the basis for the case-expression. The contract of comparison operators is that they should return a true value (not necessarily true itself) if the comparison is true, and otherwise return either false or nil.
The contract for === should be matching or not matching. It is among other things used in Ranges to see if something is included in that range or not.
iik> 1 + 2 < 4
+> true
iik> 3 + 2 < 4
+> false
iik> "foo" <=> "fop"
+> -1
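As a sketch of the contract above, the following defines the spaceship operator on a made-up Version kind and mixes in Mixins Comparing. The names Version, number, a and b are invented for illustration. Using mimic! to add the mixin and cell("<=>") = to define the operator are assumptions about how one might write this, not something this guide prescribes:

```ioke
Version = Origin mimic
Version mimic!(Mixins Comparing)   ; assumption: mimic! adds the mixin

;; define <=> through cell=, delegating to the numbers being wrapped
Version cell("<=>") = method(other, number <=> other number)

a = Version mimic
a number = 1
b = Version mimic
b number = 2

a <=> b    ; delegates to 1 <=> 2
a < b      ; should be a true value, since a <=> b is -1
a >= b     ; should be false or nil
```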
Conditionals
Ioke has two different ways of doing conditionals. The first one is the default, and is also the traditional conditional from other languages. The second version looks more like Smalltalk conditionals.
As with everything else, these conditionals are all methods, and can be overridden and changed if need be. They can also be polymorphic.
The default conditionals are called "if" and "unless". They both take an initial evaluated argument that is used to check which branch should be taken. The "if" method will execute its second argument if the first argument is true, and the third argument if the first argument is false. The "unless" method does the inverse -- executing the second argument if the first argument is false, and the third argument if the first argument is true. One or both of the branches can be left out from the statement. If no else-part is around and the conditional part evaluates to a false value, that false value will be returned.
A few examples are in order:
if(42 < 43,
"wow, math comparison works" println,
"we have some serious trouble" println)
if(42 < 43,
"wow, math comparison works",
"we have some serious trouble") println
unless(42 < 43,
"convoluted math" println)
It is good style to not use "unless" with an else branch. It generally tends to not be so readable that way. Remember that "if" and "unless" return their values, which means they are expressions like everything else. The middle example shows that you can just call println on the result of the if-call, instead of doing it twice inside. This is also good style. Assigning the result of an if-call is likewise not a problem.
In some languages you see a pattern such as "if(foo = someExpensiveMethodCall(), foo println)", where a variable is assigned in the condition evaluation so the value doesn't have to be evaluated twice. This works in Ioke too, but there is a more idiomatic way of doing it. Both "if" and "unless" establish a lexical context, where a variable called "it" is available. This variable will be bound to the result of the conditional. So the above idiom could instead be written "if(someExpensiveMethodCall(), it println)". This is the preferred way of handling regular expression matching.
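A minimal sketch of the "it" idiom. The method findTitle and the texts it returns are invented for illustration:

```ioke
;; returns a text for a known name, otherwise nil
findTitle = method(name, if(name == "admin", "Administrator", nil))

;; the result of the condition is available as "it" inside the branches
if(findTitle("admin"),
  "found #{it}" println,
  "nobody found" println)
```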
The Smalltalk inspired way of doing conditionals rests on the methods called ifTrue and ifFalse. Both of these methods are only defined on true and false, which means they are not as general as the if and unless statements. They can also be chained together, so you can write:
(42 < 43) ifTrue("wowsie!" println) ifFalse("oh noes" println)
As should be obvious from these examples, these conditionals can not return any value. They must only rely on side effects to achieve anything.
Ioke also supports the expected short circuiting boolean evaluators. They are implemented as regular methods and are available on all objects. All of the expected combinators are available, including "and", "&&", "or", "||", "xor", "nor" and "nand".
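A short sketch of short circuiting with || and &&. The variable name is hypothetical, and the comments follow from the truth rules described earlier:

```ioke
name = nil

;; nil is a false value, so the right hand side is evaluated and used
(name || "anonymous") println

;; the left hand side is false, so "name length" is never evaluated
;; and no NoSuchCell condition is signalled for nil
name && name length
```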
cond
Ioke doesn't have any else-if expression, which means that when you want to do several nested checks, you end up with lots of indentation. Cond is a macro that expands to that code. The code using cond will not have more indentation, which means it might be easier to read. A cond has one or more conditional expressions followed by an action part to execute if that condition is true. As soon as a condition has evaluated to true cond will not evaluate any more conditions or actions. If no conditions evaluate to true, nil will be returned, unless there is an action part following the last action-part. In other terms, if the cond-expression has an odd number of arguments, the last argument is the default case to execute if nothing else matches.
Some examples of cond:
cond(
x == 1, "one" println,
x == -1, "minus one" println,
x < 0, "negative" println,
x > 0, "positive" println,
"zero" println
)
As you can see, it becomes quite clear what happens here. Keep in mind that cond is an expression, just like anything else in Ioke, and will return the last value evaluated.
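Since cond returns the last value evaluated, it can be used directly as the body of a method. This is a sketch with a made-up method called sign:

```ioke
sign = method(x,
  cond(
    x < 0, "negative",
    x > 0, "positive",
    "zero"))

sign(-5) println   ; prints negative
sign(0) println    ; prints zero
```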
case
A thing that you very often want to do is to check one value against several different conditions. The case expression allows this to be done succinctly. The core to the case-expression is the === method, that is used for matching. The expression takes one value, then one or more conditionals followed by actions, and then an optional default part. Once something matches, no more conditionals will be executed. The conditional part should not be a complete conditional statement. Instead it should return something that implements a fitting ===. So, a small example follows:
case(value,
Text, "it is a text!" println,
1..10, "it is a low number" println,
:blurg, "it is the symbol blurg" println,
fn(c, (c+2) == 10), "it is 8" println,
"we don't know it!" println)
The above example shows several different things you can match against, including a lexical block. The implementation of === for a lexical block will call the block with the value and then return true or false depending on the truth-value of the result of the call to the block.
A thing that can be inconvenient in some languages is to do combinations of several of these. Say you want to check that something is a Text and matches a regular expression, or it is either 5..10 or 15..20. In most cases you will end up having to write several conditional parts for at least one of those two. But Ioke allows you to use combiners in the conditional part of a case expression. These combiners will be rewritten before being executed, so a combiner called "else" will actually use the method "case:else", that in turn returns an object that responds correctly to ===. The end result is that using combiners reads really well, and you can define your own by prefixing the name with "case:". There are several standard ones. Using a few of them looks like this:
case(value,
and(Text, #/o+/), "it's a text with several oos" println,
or(5..10, 15..20), "numberific" println,
else, "oh no!" println)
Combiners can be combined with each other and nested, so you could do and(foo, or(1, 2, 3), not(x)) if you want.
The available combiners are these:
- and
- Returns a matcher that returns true if all the arguments return true when calling ===. This is short circuiting.
- or
- Returns a matcher that returns true if any of the arguments return true when calling ===. This is short circuiting.
- not
- Takes one argument and returns false if calling === on the argument returns true, and the other way around.
- nand
- The nand operation applied to === combiners.
- nor
- The nor operation applied to === combiners.
- xor
- The xor operation applied to === combiners.
- else, otherwise
- Returns a matcher that always returns true. This is useful to make the default argument read better.
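The combiners also work when case is used for its value instead of for side effects. This sketch assumes that case, like cond, returns the value of the action that was executed. The method name describe is invented:

```ioke
describe = method(value,
  case(value,
    and(Text, #/o+/), "a text with oos",
    or(1..5, 6..10), "a low number",
    otherwise, "something else"))

describe("foo") println   ; the first matcher applies
describe(7) println       ; the second matcher applies
describe(:blurg) println  ; falls through to the default
```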
Iteration
Ioke supports most of the expected control flow operations for iteration. The one thing that is missing is the for-loop. Since the for-loop encourages low level stepping, and can be replaced by other kinds of operations, I don't see any reason in having it in Ioke. In fact, the for-statement in Ruby is generally considered bad form too. And if someone really wants a for-loop it's really easy to implement. The name 'for' is also currently taken for list comprehensions.
loop
For creating infinite loops, the "loop"-method is the thing. It will just take a piece of code and execute it over and over again until some non-local flow control rips the execution up. Using it is as simple as calling it:
loop("hello" println)
x = 0
loop(
if(x > 10, break)
x++
)
The first example will loop forever, printing hello over and over again. The second example will increment a variable until it's larger than 10, and then it will break out of the loop.
while
The Ioke while loop works exactly like while-loops in other languages. It takes one argument that is a condition to reevaluate on each iteration, and another argument that is the code to evaluate each iteration. The result of the while-loop is the result of the last executed expression in the body.
x = 0
while(x < 10,
x println
x++
)
until
The until-loop works the same as the while-loop, except it expects its condition argument to evaluate to false. It will stop iterating when the conditional is true for the first time.
x = 0
until(x == 10,
x println
x++
)
times
A very common need is to iterate something a certain number of times. The Number Integer kind defines a method called "times" that does exactly this. It's got two forms - one with one argument and one with two arguments. With one argument, it will just run the argument code the specified number of times, and with two arguments the first argument should be the name of a cell to assign the current iteration value to and the second is the code to execute.
3 times("hello" println)
4 times(n,
"#{n}: wow" println)
The first example will print hello three times, while the second example will count up from 0 to 3, printing the number followed by "wow".
each
For most iteration needs, you want to traverse a collection in some way. The standard way of doing this is with the "each"-method. It's defined on all central collection classes and is also the basis of the contract for Mixins Enumerable. The contract for each has three different forms, and all should be implemented if you decide to implement the each method.
The each method should -- as the name implies -- do something for each entry in the collection it belongs to. So calling each on a set would do something with each entry, etc. Exactly what that is depends on how many arguments are given to "each".
If one argument is given, it should be a message chain. This message chain will be applied to each element.
[:one, :two, :three] each(inspect println)
;; the above would execute:
:one inspect println
:two inspect println
:three inspect println
Another way of saying it is that the message chain will be executed using each element of the collection as receiver, in turn. The return value will be thrown away in this case, so to achieve anything, the code needs to mutate data somewhere.
The second -- and most common -- form takes two arguments. The first argument should be the name of a cell to assign each element to, and the second argument should be the code to execute. Under the covers, this form will establish a new lexical context for the code to run in. As with the first version, each return value will be thrown away.
[2, 4, 6] each(x, (x*x) println)
Here, the name "x" will be used as the name of each element of the list in turn, while executing the code.
The final form of each takes three arguments, where the first is the name of a cell to assign the current index, and the other two arguments are the same as the above.
[2, 4, 6] each(i, x, "#{i}: #{(x*x)}" println)
The above code would print:
0: 4
1: 16
2: 36
seq
There are two different iterator protocols in Ioke. The first one is based on each, as described in the previous section. The second protocol is slightly more general and is based on external iterators. The seq method is expected to return a Sequence object. This object needs to have two methods, next? and next. The first one returns true if next can be called again and false otherwise. The next method returns the next object in the sequence. This protocol can be used to implement each. If you have a seq method you can mix in Mixins Sequenced. This automatically makes your object Enumerable, gives you an each method and adds several convenience methods. The methods on Mixins Sequenced and Sequence will be described further down.
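A minimal sketch of driving the external iterator protocol by hand could look like this. It assumes that List provides a seq method, which is not stated above.

```ioke
;; assumption: List responds to seq and returns a Sequence
s = [1, 2, 3] seq
while(s next?,
  s next println)
```

The loop only relies on next? and next, so the same code should work with any object that follows the protocol.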
break, continue
When executing loops it is sometimes important to be able to interrupt execution prematurely. In these cases the break and continue methods allow this for "loop", "while" and "until". Both break and continue work lexically, so if you send code to another method that uses these methods, they will generally jump out of a lexically visible loop, just like expected.
The break method takes an optional value to return. If no value is provided it will default to nil. When breaking out of a loop, that loop will return the value given to break. The continue method will not break out of the execution, but will instead jump to the beginning and reevaluate the condition once again.
while(true,
break(42))
This code will immediately return 42 from the while-loop, even though it should have iterated forever.
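The same mechanism gives "loop" a useful return value. The sketch below reuses the counting example from the loop section and captures the value given to break. The cell name result is only for illustration.

```ioke
x = 0
result = loop(
  if(x > 10, break(x))
  x++
)
result println
```

Since the loop is left the first time x is larger than 10, this should print 11.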
i = 0
while(i < 10,
i println
if(i == 5,
i = 7
continue)
i++
)
This code uses continue to jump over a specific number, so it will only print 0 to 5, and 7 to 9.
Comprehensions
Ioke's Enumerable mimic makes it really easy to use higher order operations to transform and work with collections of data. But in some cases the code for doing that might not be as clear as it could be. Comprehensions allow a list, set or dict to be created based on a more abstract definition of what should be done. The specific parts of a comprehension are generators, filters and the mapping. The generators are what data to work on, the filters choose more specifically among the generated data, and the mapping decides what the output should look like.
The following example does three nested iterations and returns all combinations where the product of the numbers is larger than 100:
for(
x <- 1..20,
y <- 1..20,
z <- 1..20,
val = x*y*z,
val > 100,
[x, y, z, val])
This code neatly shows all the things you can do in a comprehension. The final argument will always be the output mapping, which in this case is a list of the three variables and their product. Each generator consists of a name, the <- operator, and an expression that is Enumerable. You can also see that one of the expressions is an assignment, whose value can be used later. Finally, there is a conditional that limits what the output will be. The more or less equivalent expression using Enumerable methods would be 1..20 flatMap(x, 1..20 flatMap(y, 1..20 filter(z, x*y*z > 100) map(z, [x, y, z, x*y*z]))). In my eyes, the for-comprehension is much more readable.
There are two variations on this. The first one is when you want the output to be a Set of things instead of a List. The code is exactly the same, except instead of using for, you use for:set. There is also a for:dict version, for more esoteric usages.
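As a smaller sketch of the same ideas, here is one comprehension with a single generator and a filter, followed by a for:set variant. The input data is made up for illustration.

```ioke
;; a List of the squares of the numbers above 5
for(x <- 1..10, x > 5, x*x)

;; the same kind of comprehension collected into a Set,
;; so duplicate results only show up once
for:set(x <- [1, 2, 2, 3], x*x)
```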
Code
Many of the things you do in Ioke will directly manipulate code. Since the messages that make up code are really easy to get hold of, this manipulation comes easy too. Ioke takes the Lisp philosophy of "code is data" to heart. The basic unit of a piece of code is a Message. A Message has a name, a next and prev pointer, and any number of arguments. When you manipulate a message, the argument list will contain messages too - and if the next or prev pointers are not nil, they will point to other messages. It serves well to remember that except for the message itself, all code will be evaluated in the context of a receiver and a ground. The ground is necessary because arguments to be evaluated need to be run in some specific context, even though the current receiver is not the same as the ground.
The current types of code can be divided into three different categories. These are methods, macros and blocks. Native methods are all of the kind NativeMethod, but can have any kind of semantics - including semantics that look like macros. Most native methods do have the same semantics as regular methods, however.
Methods
A method in Ioke is executable code that is activatable. A method can take arguments of several different types. The arguments to a method will always be evaluated before the code in the method starts to execute. An Ioke method is defined using the "method" method. All Ioke methods have the kind DefaultMethod. This leaves room to define other kinds of methods, if need be. DefaultMethods could be implemented using macros, but at this point they aren't. A DefaultMethod can have a name - and will get a name the first time it is assigned to a cell.
It is really easy to define and use a simple method. The easiest case is to define a method that is empty. This method will just return nil:
m = method()
m ;; call the method
Since methods are activatable, when you name a cell that contains a method, that method will be invoked. To stop that behavior, use the "cell" method.
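A short sketch of the difference between naming a cell and fetching it with "cell":

```ioke
m = method("hello" println)
m           ;; activates the method, which prints hello
cell(:m)    ;; returns the method itself without activating it
```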
The definition of a method can take several different pieces. These are a documentation string, definitions of positional required arguments, definitions of positional optional arguments, definitions of keyword arguments, definition of a rest argument, definition of a keyword rest argument and the actual code of the method.
Let's take these one by one. First, if the first element of a call to "method" is a literal text, and there is at least one more argument in the definition, then that text will be the documentation text for the method:
;; a method that returns "foo"
m = method("foo")
;; a method that returns nil, but
;; has the documentation text "foo"
m = method("foo", nil)
A method can take any number of required positional arguments. These will be checked when a method is called, and if not enough -- or too many -- arguments are provided, an error will be signalled.
m = method(x, x println)
m = method(x, y, z,
x * y + z)
The first method takes one argument and prints that argument. The second method takes three arguments and returns the product of the first two added to the third.
A method can also have optional positional arguments. In that case the optional arguments must follow the required arguments. Optional arguments need to have a default value -- in fact, that is how you distinguish them from required arguments. The arity of method calls will still be checked, but using minimum and maximum values instead. The default value for an argument should be code that can be executed in the context of the running method, so a default value can refer to earlier positional arguments. A default value can also do quite complex things, if need be, although it's not really recommended.
;; takes zero or one arguments
m = method(x 42, x println)
;; takes one to three arguments
m = method(x, y 42, z 25,
x*y + z)
The syntax for optional arguments is to just write a space after the name of the argument, and then write the code to generate the default value after it.
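To make the arity rules concrete, here is a sketch of calling the second method above with one, two and three arguments. The argument values are arbitrary.

```ioke
m = method(x, y 42, z 25,
  x*y + z)

m(2)          ;; y and z use their defaults: 2*42 + 25
m(2, 3)       ;; only z uses its default: 2*3 + 25
m(2, 3, 4)    ;; no defaults are used: 2*3 + 4
```

Calling m without arguments, or with more than three, should signal an error since the arity is still checked.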
A method can also have keyword arguments. Keyword arguments are checked, just like regular arguments, and you can't generally give keyword arguments to a method not expecting it. Nor can you give unexpected keyword arguments to a method that takes other keywords. Keyword arguments can never be required. They can have default values, which will default to nil if not provided. They can be defined anywhere among the arguments -- the only reason to reorder them is that default values of other optional arguments can use prior defined keyword arguments.
A keyword argument is defined just like a regular argument, except that it ends in a colon.
m = method(foo:, bar: 42,
foo println
bar println
)
Just as with regular optional arguments, you supply the default value of the keyword argument after a space. The cells for the keyword arguments will be the same as their names, without the ending colon. The above code would print nil and 42 if no arguments were specified. It's important to remember that keyword arguments and positional arguments do not interact -- except for when calculating default values. When assigning values it's always possible to see what is positional and what is a keyword argument.
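A sketch of calling the method above with keyword arguments. The values are arbitrary.

```ioke
m = method(foo:, bar: 42,
  foo println
  bar println
)

m(foo: 1)            ;; foo is 1, bar keeps its default 42
m(bar: 2, foo: 1)    ;; keywords can be given in any order
```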
Ioke methods can collect positional arguments into a list. This allows methods to take a variable number of arguments. The rule is that all other positional arguments are first calculated, and the remaining positional arguments will be added to the rest argument. If no positional arguments are available, the rest argument will be empty. A rest argument is defined by preceding it with a plus sign in the argument definition. For clarity a rest argument should be defined last in the list, although it doesn't exactly matter anyway.
m = method(+rest,
rest println)
m = method(x, y 42, +rest,
rest println)
The above code defines one method that only takes one rest argument. That means the method can take any number of arguments and all of them will be collected into a list. The second method takes one required argument, one optional argument and any number of extra arguments. So if four arguments are given, the rest argument will contain two.
The final type of argument is keyword rest arguments. Just like positional rest arguments, a keyword rest argument can collect all keywords given to a method, no matter what. If a keyword rest argument is used, no conditions will be signalled if an unknown keyword is given to a method. If other keywords are defined, these keywords will not show up in the keyword rest argument. The keyword rest argument is defined by preceding the name with a +: sigil, and the keyword rest argument will be a Dict instead of a list. The keys will be symbols but without the ending colon.
m = method(+:krest,
krest println)
m = method(x, y:, +rest, +:krest,
[x, y, rest, krest])
The above code first creates a method that can take any number of keyword arguments but nothing else. The second method takes one required positional argument, one keyword argument, rest arguments and keyword rest arguments, and returns a new list containing all the arguments given to it.
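The following sketch shows how one call gets distributed over the different kinds of arguments of the second method. The values and the keyword z are made up for illustration.

```ioke
m = method(x, y:, +rest, +:krest,
  [x, y, rest, krest])

;; x is 1, y is 4, rest collects 2 and 3,
;; and krest collects the unknown keyword z
m(1, 2, 3, y: 4, z: 5)
```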
The final argument to the method method should always be the code to execute. This code will be executed in the context of a receiver, that is the object the method is activated on. A method execution also happens in the context of the method activation context, where local variables are stored. This activation context contains some predefined variables that can be used. These are "self", "@", "currentMessage" and "surroundingContext". Both "self" and "@" refer to the receiver of the method call. "currentMessage" returns the message that initiated the activation of the method, and "surroundingContext" returns the object that represents the context where this method was called from. Both "self" and "@" can be used to specify that something should be assigned to the receiver, for example.
createNewCell = method(
@foo = 42
)
The method above will assign the value 42 to the cell "foo" on the object the method was called on.
When calling a method, you specify positional arguments separated with commas. You can provide keyword arguments in any order, in any place inside the parentheses:
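A slightly fuller sketch of assigning to the receiver through "@" follows. Person, setName and the name cell are hypothetical names used only for this example.

```ioke
;; Person and setName are hypothetical names
Person = Origin mimic
Person setName = method(newName,
  @name = newName
)

p = Person mimic
p setName("Ola")
p name println
```

Since the method is activated on p, the cell is created on p and not on Person, and the last line is expected to print Ola.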
;; the method foo takes any kind of argument
foo
foo()
foo(1, 2, 3)
foo(blarg: 42, 2, 3, 4)
foo(quux: 42*2)
To give a keyword argument, you just write it exactly like you define keyword arguments - a name followed by a colon.
Sometimes it can be useful to be able to take a list of values and give them as positional arguments. The same can be useful to do with a dict of names. You can do that using splatting. This is done by preceding a list or a dict with an asterisk. This will result in the method getting the values inside of it as if the arguments were given directly. You can splat several things to the same invocation.
dc = {foo: 42, bar: 13}
ls = [1, 2, 3, 4]
ls2 = [42, 43, 44]
foo(*dc)
;; the same as:
foo(foo: 42, bar: 13)
foo(*ls)
;; the same as:
foo(1, 2, 3, 4)
foo(*ls2, 111, *dc, *ls)
;; the same as:
foo(42, 43, 44, 111, foo: 42, bar: 13, 1, 2, 3, 4)
If you try to splat something that can't be splatted, a condition will be signalled.
Macros
The main difference between a macro and a method in Ioke is that the arguments to a macro are not evaluated before they are sent to the macro. That means you have to use macros to send raw message chains in an invocation. In most languages, this kind of feature is generally called call-by-name. When a macro gets called, it will get access to a cell called "call" which is a mimic of the kind Call. This gives access to information about the call and makes it possible to evaluate the code sent as arguments, check how many arguments are supplied, and so on.
A macro is created using the "macro" cell on DefaultBehavior. This will return a mimic of DefaultMacro. Since macros can't define arguments, they are a bit easier to describe than methods, but the things that can be done with macros are also a bit more interesting than what can be achieved with methods. One important thing to keep in mind is that most macros cannot receive splatted arguments. In most cases keyword arguments aren't available either - but they could be faked if needed. Macros should generally be used to implement control structures and things that need to manipulate code in different ways.
Just like a method, a macro gets evaluated on a specific receiver. It also gets the same kind of method activation context, but its contents are a bit different. Specifically, the context for a macro contains cells named "self", "@", "currentMessage", "surroundingContext" and "call". It's the "call" cell that is most important. It is a mimic of Call, and Call defines several important methods for manipulating the call environment. These are:
- arguments
- This method returns a list containing the unevaluated arguments given to this message. Any kind of manipulation can be done with these arguments.
- ground
- Returns the ground in which the call was initiated. This is necessary to evaluate arguments in their own environment.
- message
- The currently executing message. This is the same as the "currentMessage" cell in the macro activation context.
- evaluatedArguments
- Returns a list containing all arguments, evaluated according to the regular rules (but not handling splatting or keywords).
- resendToMethod
- Allows a specific message to be resent to another method, without manually copying lots of information.
These methods are a bit hard to understand, so I'll take some examples from the implementation of Ioke, and show how macros are used here.
Mixins Enumerable map = macro(
"takes one or two arguments. if one argument is given,
it will be evaluated as a message chain on each element
in the enumerable, and then the result will be collected
in a new List. if two arguments are given, the first one
should be an unevaluated argument name, which will be
bound inside the scope of executing the second piece of
code. it's important to notice that the one argument
form will establish no context, while the two argument form
establishes a new lexical closure.",
len = call arguments length
result = list()
if(len == 1,
code = call arguments first
self each(n, result << code evaluateOn(call ground, cell(:n))),
code = LexicalBlock createFrom(call arguments, call ground)
self each(n, result << code call(cell(:n))))
result)
The code above implements map, one of the methods from Enumerable. The map method allows one collection to be mapped in a predefined way into something else. It can take either one or two arguments. If one argument is given, that is a message chain to apply, and then collect the results. If two arguments are given, the first is the argument name to use, and the second is the code to execute for each entry.
The first step is to figure out how many arguments have been given. This is done by checking the length of the "call arguments" cell. If we have a length of one, we know that the first argument is a piece of code to apply, so we assign that argument to a cell called "code". Now, "code" will be a mimic of Message, and Message has a method called "evaluateOn", that can be used to fully evaluate a message chain. And that's exactly what we do for each element in the collection we are in. The result of evaluateOn is added to the result list. We use "call ground" to get the correct ground for the code to be evaluated in.
If we get two arguments, it's possible to take a shortcut and generate a lexical block from those arguments, and then use that. So we call "LexicalBlock createFrom" and send in the arguments and the ground, and then call that piece of code once for each element in the collection.
It is a bit tricky to figure out how macros work. I recommend looking at the implementations of some of the core Ioke methods/macros, since these use much of the functionality.
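As a much smaller sketch than map, here is a hypothetical control structure that only evaluates its second argument when the first one is false. It relies on "call argAt", which evaluates the argument at the given index in the ground of the call. The name myUnless is made up for this example.

```ioke
;; myUnless is a hypothetical name
myUnless = macro(
  if(call argAt(0),
    nil,
    call argAt(1)))

myUnless(1 > 2, "the condition was false" println)
```

Because the arguments arrive unevaluated, the println only runs when the macro decides to evaluate the second argument. With a regular method it would have run before the method body even started.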
Blocks
A lexical block allows the execution of a piece of code in the lexical context of some other code, instead of in a dynamic object scope. A lexical block does not have a receiver. Instead, it just establishes a new lexical context, and executes the code in that. The exact effect that has on assignments has been described earlier.
A lexical block can be created using either the "fn" or the "fnx" methods of DefaultBehavior. The main difference between the two is that a block created with "fnx" will be activatable, while something created with "fn" will not. Lexical blocks handle arguments exactly the same way as methods, so a lexical block can take optional arguments, keyword arguments, rest arguments and so on. Both "fn" and "fnx" also take optional documentation text.
A block created with the "fn" method can be invoked using the "call" method of the kind LexicalBlock.
x = fn(z, z println)
x call(42)
If a block created with the "fn" method takes one or more explicit parameters, it can also be activated like a regular method. The reason for this is shown in the code snippet below. Here the result of invoking the block referred to by "x" is passed to "y" (which may be a regular method or even another block). If "x" were fully non-activatable, "x" would be passed to "y" as is, with the argument thrown away. In other words, that would be dead code. However, you can still refer to the block as "x" without invoking it.
x = fn(z, z + 42)
y(x(100)) ;; activates the block with argument 100 and passes the result to y
x ;; refers to the block without activating it
A block created with the "fnx" method is activatable per se and thus can be activated like a regular method. The default is to use "fn" to create inactive blocks though, since blocks are generally used to pass pieces of code around.
y = fnx(z, z println)
y(42)
A lexical block is a regular kind of object that can be assigned to any cell, just like other objects. Lexical blocks mimic LexicalBlock, and blocks don't have names. In contrast to methods and macros, no extra cells will be added to the activation context for a lexical block.
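The lexical part is easiest to see when a block outlives the code that created it. In this sketch the block returned from a method still sees the argument n of that method. makeAdder and add5 are hypothetical names.

```ioke
;; makeAdder is a hypothetical name
makeAdder = method(n,
  fn(x, x + n))

add5 = makeAdder(5)
add5 call(10) println
```

The block adds the captured 5 to its own argument, so this should print 15.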
You can also do several kinds of functional composition of blocks. Some of these combinations only make sense for predicates, while others are more generally applicable. They all expect to work with functions that are OK with only taking one argument, though. In the following examples, f and g are general lexical blocks, while p? and q? are predicates.
- f -> g
- Will return a new block that is the equivalent of g(f(arg))
- f <- g
- Will return a new block that is the equivalent of f(g(arg))
- p? & q?
- Will return a new block that is the equivalent of p?(arg) & q?(arg)
- p? | q?
- Will return a new block that is the equivalent of p?(arg) | q?(arg)
- p? complement
- Will return a new block that is the equivalent of not(p?(arg))
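A small sketch of the first combination, using two made-up blocks:

```ioke
double = fn(x, x * 2)
inc = fn(x, x + 1)

doubleThenInc = double -> inc
doubleThenInc call(5)    ;; the equivalent of inc(double(5))
```

Going by the definition of -> given in the list, the composed block should double its argument first and then add one.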
Lecros
A macro works exactly like a method, in that it always has a receiver, and that receiver is available inside the macro as 'self' and '@'. In some circumstances it can be really useful to have a macro that behaves like a lexical block instead - being lexical so it can use cells defined outside of the definition of the macro. These macros won't have access to 'self' or '@', since they don't have a receiver in that way. Where such a macro is called is only based on namespacing.
Ioke supports this kind of macro. They are all mimics of the kind LexicalMacro, and they are created using the method 'lecro'. A LexicalMacro is activatable by default, but a non-activatable lecro can be created using lecrox. The 'lecro' method takes the same arguments as 'macro', and the only real difference is the way it handles outside cells and the receiver value. A lecro also has a cell called outerScope that can be used if you need to explicitly access something in the outer name space - such as call.
Syntax
Ioke supports loads of stuff with the standard macro, but sometimes these are a bit too low level for commonly used operations. Syntax is one of those cases: you can achieve the same goals with macros, but you don't really want to. Many features in Ioke S are implemented using syntax.
You can define syntax using the syntax method. This returns a mimic of DefaultSyntax. You can use the same kind of cells in a syntax as you can in a macro. What is different with syntax is that syntax can only return one of two things. The first is nil, and the second is a message chain. A syntax will only be executed once at every point in the message chains, because after a syntax executes the first time, it will replace itself with the result of that evaluation. If that evaluation returns nil, syntax will just remove itself from the message chain.
You can use this for many things, but one of the more useful things you can do is translate a high level declarative definition of something into a low level executable version. That is exactly how for comprehensions are implemented.
Syntactic macros are fairly advanced, and take some time to grok. They are incredibly useful though, and they are used all over the standard library to achieve all manner of interesting things. Take a look there and things should hopefully become clearer. It's also a must to read the section on message chain manipulation and quoting in this guide to make syntax macros readable.
Destructuring
A common problem with macros is that you want to take several different combinations of arguments, and do different things depending on how many you get. Say you might want to take one code argument, but also two optional arguments that should be evaluated. All of that code turns out to be highly repetitive, so Ioke contains a collection of syntax macros that make it easier to write these things. These are collectively called destructuring syntax.
Let us say we have a macro that can be called with any of three types of argument list: [code], [evaluatedArgument, code], or [evaluatedArgument, code, evaluatedArgument]. The stuff that should happen is totally different for each of these cases. With a regular macro the code would look something like this:
foo = macro(
len = call arguments length
case(len,
1,
code = call arguments[0]
; do something with the code
,
2,
arg1 = call argAt(0)
code = call arguments[1]
; do something with the code and arg
,
3,
arg1 = call argAt(0)
code = call arguments[1]
arg2 = call argAt(2)
; do something with the code and args
))
As you can see it's really a lot of code to see what happens here, and it is very imperative in style. But, if I instead use dmacro - which is the destructuring version of macro - it looks like this:
foo = dmacro(
[code]
; do something with the code
,
[>arg1, code]
; do something with the code and arg
,
[>arg1, code, >arg2]
; do something with the code and args
)
dmacro will automatically check the length and extract the different arguments. The right arrow before the names of arg1 and arg2 marks that these should be evaluated. And what is more, dmacro will generate code that also generates a good condition if no argument matching works out. If you give zero arguments to the first version, it will fail silently. The dmacro will complain immediately. The dmacro destructuring syntax actually supports several more ways of ripping arguments apart. You can find this information in the doks for dmacro. Also, there are equivalent versions of dmacro for lecro, lecrox and syntax, called dlecro, dlecrox and dsyntax. They do the same thing, except they act like lecros or syntax instead.
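To see the patterns with real bodies filled in, here is a sketch of a hypothetical macro that runs a piece of code once, or a given number of times if a count is supplied first. It assumes that "call" is available inside the bodies of a dmacro in the same way as in a regular macro.

```ioke
;; repeat is a hypothetical name
repeat = dmacro(
  [code]
  code evaluateOn(call ground, call ground),
  [>n, code]
  n times(code evaluateOn(call ground, call ground)))

repeat("once" println)
repeat(3, "three times" println)
```

The count is marked with > so it arrives evaluated, while code stays a raw message chain that the body evaluates explicitly in the ground of the call.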
Message chains
In many cases a macro will take code that is not wrapped up inside of a method, macro or block. These pieces of code are called message chains, since their representation will be a raw Message mimic. The chains are quite flexible, since they can be taken apart, modified and put together again. They can also be unevaluated and used as data definitions of some kind. That's how the argument handling of methods is implemented, for example. Since the call to "method" can be seen as a regular call to a macro, the argument descriptions are actually just unevaluated message chains that are picked apart to tease out the argument names. The same technique is applicable in any macro usage.
The term message chain fragment is also used to specifically mean a message chain that is meant to be put together with something and evaluated. Picture a daisy chain that gets added at the end of another chain and then executed. That's what happens if you execute something like [1, 2, 3] map(*2). In this case the call to * with the argument 2 will be a message chain fragment that will be put together with a new receiver before execution.
To handle syntax correctly - but also to generally handle manipulation of message chains - it is important to know about the available methods to do this. I have added quite a lot of nice stuff that makes it easy to work with message chains.
First, messages are actually Enumerable, so you can use any Enumerable methods on them. The enumeration always starts at the receiver. It will not proceed into arguments, just following the next-pointer. To create a new message or message chain, there are several helpful methods and operators. The first method is called message and takes an evaluated name and returns a new message with that name. Message from takes one argument that will not be evaluated and returns a message chain corresponding to that argument. Message fromText parses text and returns the message chain for it. Message wrap takes an evaluated argument and returns a message that will always return that value. As will be mentioned later, Message has next= and prev= methods that you can use to set the next and previous pointers. Message also has appendArgument and prependArgument that allow you to add new arguments to the message arguments.
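A minimal sketch of treating a message chain as an Enumerable follows. The text being parsed is made up, and the cell name chain is only for illustration.

```ioke
chain = Message fromText("foo bar baz")
chain map(name)    ;; collects the name of each message in the chain
```

Since the enumeration follows the next-pointer, the result should contain one entry for each of the three messages.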
The most used versions for creating message chains are short cuts for the above. Let us begin with creation. Instead of Message from you can use '. That is a single quote mark. The message after that will be unevaluated and returned as a message chain. If you use a `, a backtick, that is equivalent to Message wrap. And then we have '', that is two single quotes after each other. This message is generally called metaquote or quasiquote. It works the same as ', except that it will find any place where ` is used, evaluate the message after the `, and insert the resulting value into the current message chain. Finally, '' will replace a `` with a literal ` message.
You can add new arguments to a message by using the << operator. This operator returns the receiver.
If you want to chain together a message chain, using next= and prev= is pretty tedious. You can instead use the -> operator. This will chain together the left hand side and the right hand side messages, and return the right hand side message.
I think it is time for some examples:
; create a new message with name foo
x = 'foo
; add two arguments to the foo message
arg = '(bar quux)
(x << arg) << 'baz
; what we have done so far could be done with:
x = '(foo(bar quux, baz))
y = 'blurg
; chain together x and y
x -> y
; the above is equivalent to
if(y prev,
y prev next = nil)
x next = y
y prev = x
val = 42
; insert the message chain in x
''(foo bar(`val) `x)
; the above will return the same as
'(foo bar(42) foo(bar quux, baz))
To understand these operators, you need to have a clear understanding of how the internals of message chains work. Once that clicks, these should be fairly straight forward to understand.
Rewriting
One of the problems with manipulating message chains is that the code tends to be fairly imperative and unwieldy. In many cases you might want to do some simple restructurings that are easy to explain but not so easy to encode using the available operations. Since Ioke P, Ioke supports message rewriting that sometimes can make these situations slightly easier.
Say you have a simple message and want to insert some other message surrounding it:
; create a new message with name foo
x = 'foo
; this will return a new message that is the same as writing 'something(foo) originally.
x2 = x rewrite(
'(:x) => '(something(:x))
)
The rewrite support includes quite a lot of interesting capabilities. For a closer look at what is possible, look at the specs for Message#rewrite.