Icon (programming language)
Icon is a very high-level programming language based on the concept of "goal-directed execution" in which code returns a "success" along with valid values, or a "failure", indicating that there is no valid data to return. The success and failure of a given block of code is used to direct further processing, whereas conventional languages would typically use boolean logic written by the programmer to achieve the same ends. Because the logic for basic control structures is often implicit in Icon, common tasks can be completed with less explicit code.
![]() | |
Paradigms | multi-paradigm: structured, text-oriented |
---|---|
Family | SNOBOL |
Designed by | Ralph Griswold |
First appeared | 1977 |
Stable release | 9.5.24a
/ January 17, 2024[1] |
Typing discipline | dynamic |
Website | www |
Major implementations | |
Icon, Jcon | |
Dialects | |
Unicon | |
Influenced by | |
SNOBOL, SL5, ALGOL | |
Influenced | |
Unicon, Python, Goaldi,[2] jq |
Icon was designed by Ralph Griswold after leaving Bell Labs where he was a major contributor to the SNOBOL language. SNOBOL was a string-processing language with what would be considered dated syntax by the standards of the early 1970s. After moving to the University of Arizona, he further developed the underlying SNOBOL concepts in SL5, but considered the result to be a failure. This led to the significantly updated Icon, which blends the short but conceptually dense code of SNOBOL-like languages with the more familiar syntax of ALGOL-inspired languages like C or Pascal.
Like the languages that inspired it, the primary area of use of Icon is managing strings and textual patterns. String operations often fail, for instance, finding "the" in "world". In most languages, this requires testing and branching to avoid using a non-valid result. In Icon most of these sorts of tests are simply unneeded, reducing the amount of code that must be written. Complex pattern handling can be done in a few lines of terse code, similar to more dedicated languages like Perl but retaining a more function-oriented syntax familiar to users of other ALGOL-like languages.
Icon is not object-oriented, but an object-oriented extension named Idol was developed in 1996 which eventually became Unicon. It also inspired other languages, with its simple generators being especially influential; Icon's generators were a major inspiration for the Python language.[3]
History
SNOBOL
The original SNOBOL effort, retroactively known as SNOBOL1, launched in the fall of 1962 at the Bell Labs Programming Research Studies Department.[4] The effort was a reaction to the frustrations of attempting to use the SCL language for polynomial formula manipulation, symbolic integration and studying Markov chains. SCL, written by the department head Chester Lee, was both slow and had a low-level syntax that resulting in volumes of code for even simple projects. After briefly considering the COMIT language, Ivan Polonsky, Ralph Griswold and David Farber, all members of the six-person department, decided to write their own language to solve these problems.[5]
The first versions were running on the IBM 7090 in early 1963, and by the summer had been built out and was being used across Bell. This led almost immediately to SNOBOL2, which added a number of built-in functions, and the ability to link to external assembly language code. It was released in April 1964 and mostly used within Bell, but also saw some use at Project MAC. The introduction of system functions served mostly to indicate the need for user functions, which was the major feature of SNOBOL3, released in July 1964.[6]
SNOBOL3's introduction corresponded with major changes within the Bell Labs computing department, including the addition of the new GE 645 mainframe which would require a rewrite of SNOBOL. Instead, the team suggested writing a new version that would run on a virtual machine, named SIL for SNOBOL Intermediate Language, allowing it to be easily ported to any sufficiently powerful platform. This proposal was accepted as SNOBOL4 in September 1965. By this time, plans for a significantly improved version of the language emerged in August 1966.[7] Further work on the language continued throughout the rest of the 1960s, notably adding the associative array type in later version, which they referred to as a table.
SL5 leads to Icon
Griswold left Bell Labs to become a professor at the University of Arizona in August 1971.[8] He introduced SNOBOL4 as a research tool at that time.[9] He received grants from the National Science Foundation to continue supporting and evolving SNOBOL.[10]
As a language originally developed in the early 1960s, SNOBOL's syntax bears the marks of other early programming languages like FORTRAN and COBOL. In particular, the language is column-dependant, as many of these languages were entered on punch cards where column layout is natural. Additionally, control structures were almost entirely based on branching around code rather than the use of blocks, which were becoming a must-have feature after the introduction of ALGOL 60. By the time he moved to Arizona, the syntax of SNOBOL4 was hopelessly outdated.[11]
Griswold began the effort of implementing SNOBOL's underlying success/failure concept with traditional flow control structures like if/then. This became SL5, short for "SNOBOL Language 5", but the result was unsatisfying.[11] In 1977, he returned to the language to consider a new version. He abandoned the very powerful function system introduced in SL5 with a simpler concept of suspend/resume and developed a new concept for the natural successor to SNOBOL4 with the following principles;[11]
- SNOBOL4's philosophic and sematic basis
- SL5 syntactic basis
- SL5 features, excluding the generalized procedure mechanism
The new language was initially known as SNOBOL5, but as it was significantly different from SNOBOL in all but the underlying concept, a new name was ultimately desired. After considering "s" as a sort of homage to "C", but this was ultimately abandoned due to the problems with typesetting documents using that name. A series of new names were proposed and abandoned; Irving, bard, and "TL" for "The Language". It was at this time that Xerox PARC began publishing about their work on graphical user interfaces and the term "icon" began to enter the computer lexicon. The decision was made to change the name initially to "icon" before finally choosing "Icon".[11][a]
Language
Basic syntax
The Icon language is derived from the ALGOL-class of structured programming languages, and thus has syntax similar to C or Pascal. Icon is most similar to Pascal, using <syntaxhighlight lang="text" class="" style="" inline="1">:=</syntaxhighlight> syntax for assignments, the <syntaxhighlight lang="text" class="" style="" inline="1">procedure</syntaxhighlight> keyword and similar syntax. On the other hand, Icon uses C-style braces for structuring execution groups, and programs start by running a procedure called <syntaxhighlight lang="text" class="" style="" inline="1">main</syntaxhighlight>.[13]
In many ways Icon also shares features with most scripting languages (as well as SNOBOL and SL5, from which they were taken): variables do not have to be declared, types are cast automatically, and numbers can be converted to strings and back automatically.[14] Another feature common to many scripting languages, but not all, is the lack of a line-ending character; in Icon, lines that do not end with a semicolon get ended by an implied semicolon if it makes sense.[15]
Procedures are the basic building blocks of Icon programs. Although they use Pascal naming, they work more like C functions and can return values; there is no <syntaxhighlight lang="text" class="" style="" inline="1">function</syntaxhighlight> keyword in Icon.[16]
<syntaxhighlight lang="icon">
procedure doSomething(aString) write(aString) end
</syntaxhighlight>
Goal-directed execution
One of the key concepts in SNOBOL was that its functions returned the "success" or "failure" as primitives of the language rather than using magic numbers or other techniques.[17][18] For instance, a function that returns the position of a substring within another string is a common routine found in most language runtime systems; in JavaScript one might want to find the position of the word "World" within a "Hello, World!" program, which would be done with <syntaxhighlight lang="icon" class="" style="" inline="1">position = "Hello, World".indexOf("World")</syntaxhighlight>, which would return 7. If one instead asks for the <syntaxhighlight lang="icon" class="" style="" inline="1">position = "Hello, World".indexOf("Goodbye")</syntaxhighlight> the code will "fail", as the search term does not appear in the string. In JavaScript, as in most languages, this will be indicated by returning a magic number, in this case -1.[19]
In SNOBOL a failure of this sort returns a special value, <syntaxhighlight lang="text" class="" style="" inline="1">fail</syntaxhighlight>. SNOBOL's syntax operates directly on the success or failure of the operation, jumping to labelled sections of the code without having to write a separate test. For instance, the following code prints "Hello, world!" five times:[20]
<syntaxhighlight lang=snobol>
- SNOBOL program to print Hello World
I = 1
LOOP OUTPUT = "Hello, world!"
I = I + 1 LE(I, 5) : S(LOOP)
END </syntaxhighlight>
To perform the loop, the less-than-or-equal operator, <syntaxhighlight lang="text" class="" style="" inline="1">LE</syntaxhighlight>, is called on the index variable I, and if it <syntaxhighlight lang="text" class="" style="" inline="1">S</syntaxhighlight>ucceeds, meaning I is less than 5, it branches to the named label <syntaxhighlight lang="text" class="" style="" inline="1">LOOP</syntaxhighlight> and continues.[20]
Icon retained the concept of flow control based on success or failure but developed the language further. One change was the replacement of the labelled <syntaxhighlight lang="text" class="" style="" inline="1">GOTO</syntaxhighlight>-like branching with block-oriented structures in keeping with the structured programming style that was sweeping the computer industry in the late 1960s.[11] The second was to allow "failure" to be passed along the call chain so that entire blocks will succeed or fail as a whole. This is a key concept of the Icon language. Whereas in traditional languages one would have to include code to test the success or failure based on boolean logic and then branch based on the outcome, such tests and branches are inherent to Icon code and do not have to be explicitly written.[21]
For instance, consider this bit of code written in the Java programming language. It calls the function <syntaxhighlight lang="text" class="" style="" inline="1">read()</syntaxhighlight> to read a character from a (previously opened) file, assigns the result to the variable <syntaxhighlight lang="text" class="" style="" inline="1">a</syntaxhighlight>, and then <syntaxhighlight lang="text" class="" style="" inline="1">write</syntaxhighlight>s the value of <syntaxhighlight lang="text" class="" style="" inline="1">a</syntaxhighlight> to another file. The result is to copy one file to another. <syntaxhighlight lang="text" class="" style="" inline="1">read</syntaxhighlight> will eventually run out of characters to read from the file, potentially on its very first call, which would leave <syntaxhighlight lang="text" class="" style="" inline="1">a</syntaxhighlight> in an undetermined state and potentially cause <syntaxhighlight lang="text" class="" style="" inline="1">write</syntaxhighlight> to cause a null pointer exception. To avoid this, <syntaxhighlight lang="text" class="" style="" inline="1">read</syntaxhighlight> returns the special value <syntaxhighlight lang="text" class="" style="" inline="1">EOF</syntaxhighlight> (end-of-file) in this situation, which requires an explicit test to avoid <syntaxhighlight lang="text" class="" style="" inline="1">write</syntaxhighlight>ing it:
<syntaxhighlight lang="java">
while ((a = read()) != EOF) { write(a); }
</syntaxhighlight>
In contrast, in Icon the <syntaxhighlight lang="text" class="" style="" inline="1">read()</syntaxhighlight> function returns a line of text or <syntaxhighlight lang="text" class="" style="" inline="1">&fail</syntaxhighlight>. <syntaxhighlight lang="text" class="" style="" inline="1">&fail</syntaxhighlight> is not simply an analog of <syntaxhighlight lang="text" class="" style="" inline="1">EOF</syntaxhighlight>, as it is explicitly understood by the language to mean "stop processing" or "do the fail case" depending on the context. The equivalent code in Icon is:[18]
<syntaxhighlight lang="icon">
while a := read() do write(a)
</syntaxhighlight>
This means, "as long as read does not fail, call write, otherwise stop".[18] There is no need to specify a test against the magic number as in the Java example, this is implicit, and the resulting code is simplified. Because success and failure are passed up through the call chain, one can embed function calls within others and they stop when the nested function call fails. For instance, the code above can be reduced to:[22]
<syntaxhighlight lang="icon">
while write(read())
</syntaxhighlight>
In this version, if the <syntaxhighlight lang="text" class="" style="" inline="1">read</syntaxhighlight> call fails, the <syntaxhighlight lang="text" class="" style="" inline="1">write</syntaxhighlight> call fails, and the <syntaxhighlight lang="text" class="" style="" inline="1">while</syntaxhighlight> stops.[22] Icon's branching and looping constructs are all based on the success or failure of the code inside them, not on an arbitrary boolean test provided by the programmer. <syntaxhighlight lang="text" class="" style="" inline="1">if</syntaxhighlight> performs the <syntaxhighlight lang="text" class="" style="" inline="1">then</syntaxhighlight> block if its "test" returns a value, and performs the <syntaxhighlight lang="text" class="" style="" inline="1">else</syntaxhighlight> block or moves to the next line if it returns <syntaxhighlight lang="text" class="" style="" inline="1">&fail</syntaxhighlight>. Likewise, <syntaxhighlight lang="text" class="" style="" inline="1">while</syntaxhighlight> continues calling its block until it receives a fail. Icon refers to this concept as goal-directed execution.[23]
It is important to contrast the concept of success and failure with the concept of an exception; exceptions are unusual situations, not expected outcomes. Fails in Icon are expected outcomes; reaching the end of a file is an expected situation and not an exception. Icon does not have exception handling in the traditional sense, although fail is often used in exception-like situations. For instance, if the file being read does not exist, <syntaxhighlight lang="text" class="" style="" inline="1">read</syntaxhighlight> fails without a special situation being indicated.[18] In traditional language, these "other conditions" have no natural way of being indicated; additional magic numbers may be used, but more typically exception handling is used to "throw" a value. For instance, to handle a missing file in the Java code, one might see:
<syntaxhighlight lang="java">
try { while ((a = read()) != EOF) { write(a); } } catch (Exception e) { // something else went wrong, use this catch to exit the loop }
</syntaxhighlight>
This case needs two comparisons: one for EOF and another for all other errors. Since Java does not allow exceptions to be compared as logic elements, as under Icon, the lengthy <syntaxhighlight lang="text" class="" style="" inline="1">try/catch</syntaxhighlight> syntax must be used instead. Try blocks also impose a performance penalty even if no exception is thrown, a distributed cost that Icon normally avoids.
Icon uses this same goal-directed mechanism to perform traditional boolean tests, although with subtle differences. A simple comparison like <syntaxhighlight lang="icon" class="" style="" inline="1">if a < b then write("a is smaller than b")</syntaxhighlight> does not mean, "if the conditional expression evaluation results in or returns a true value" as they would under most languages; instead, it means something more like, "if the conditional expression, here <syntaxhighlight lang="text" class="" style="" inline="1"><</syntaxhighlight> operation, succeeds and does not fail". In this case, the <syntaxhighlight lang="text" class="" style="" inline="1"><</syntaxhighlight> operator succeeds if the comparison is true. The <syntaxhighlight lang="text" class="" style="" inline="1">if</syntaxhighlight> calls its <syntaxhighlight lang="text" class="" style="" inline="1">then</syntaxhighlight> clause if the expression succeeds, or the <syntaxhighlight lang="text" class="" style="" inline="1">else</syntaxhighlight> or the next line if it fails. The result is similar to the traditional if/then seen in other languages, the <syntaxhighlight lang="text" class="" style="" inline="1">if</syntaxhighlight> performs <syntaxhighlight lang="text" class="" style="" inline="1">then</syntaxhighlight> if <syntaxhighlight lang="text" class="" style="" inline="1">a</syntaxhighlight> is less than <syntaxhighlight lang="text" class="" style="" inline="1">b</syntaxhighlight>. The subtlety is that the same comparison expression can be placed anywhere, for instance:
<syntaxhighlight lang="icon">
write(a < b)
</syntaxhighlight>
Another difference is that the <syntaxhighlight lang="text" class="" style="" inline="1"><</syntaxhighlight> operator returns its second argument if it succeeds, which in this example will result in the value of <syntaxhighlight lang="text" class="" style="" inline="1">b</syntaxhighlight> being written if it is larger than <syntaxhighlight lang="text" class="" style="" inline="1">a</syntaxhighlight>, otherwise nothing is written. As this is not a test per se, but an operator that returns a value, they can be strung together allowing things like <syntaxhighlight lang="text" class="" style="" inline="1">if a < b < c</syntaxhighlight>,[22] a common type of comparison that in most languages must be written as a conjunction of two inequalities like <syntaxhighlight lang="text" class="" style="" inline="1">if (a < b) && (b < c)</syntaxhighlight>.
A key aspect of goal-directed execution is that the program may have to rewind to an earlier state if a procedure fails, a task known as backtracking. For instance, consider code that sets a variable to a starting location and then performs operations that may change the value - this is common in string scanning operations for instance, which will advance a cursor through the string as it scans. If the procedure fails, it is important that any subsequent reads of that variable return the original state, not the state as it was being internally manipulated. For this task, Icon has the reversible assignment operator, <syntaxhighlight lang="text" class="" style="" inline="1"><-</syntaxhighlight>, and the reversible exchange, <syntaxhighlight lang="text" class="" style="" inline="1"><-></syntaxhighlight>. For instance, consider some code that is attempting to find a pattern string within a larger string:
<syntaxhighlight lang="icon">
{ (i := 10) & (j := (i < find(pattern, inString))) }
</syntaxhighlight>
This code begins by moving <syntaxhighlight lang="text" class="" style="" inline="1">i</syntaxhighlight> to 10, the starting location for the search. However, if the <syntaxhighlight lang="text" class="" style="" inline="1">find</syntaxhighlight> fails, the block will fail as a whole, which results in the value of <syntaxhighlight lang="text" class="" style="" inline="1">i</syntaxhighlight> being left at 10 as an undesirable side effect. Replacing <syntaxhighlight lang="text" class="" style="" inline="1">i := 10</syntaxhighlight> with <syntaxhighlight lang="text" class="" style="" inline="1">i <- 10</syntaxhighlight> indicates that <syntaxhighlight lang="text" class="" style="" inline="1">i</syntaxhighlight> should be reset to its previous value if the block fails. This provides an analog of atomicity in the execution.
Generators
Expressions in Icon may return a single value, for instance, <syntaxhighlight lang="text" class="" style="" inline="1">5 > x</syntaxhighlight> will evaluate and return x if the value of x is less than 5, or else fail. However, Icon also includes the concept of procedures that do not immediately return success or failure, and instead return new values every time they are called. These are known as generators, and are a key part of the Icon language. Within the parlance of Icon, the evaluation of an expression or function produces a result sequence. A result sequence contains all the possible values that can be generated by the expression or function. When the result sequence is exhausted, the expression or function fails.
Icon allows any procedure to return a single value or multiple values, controlled using the <syntaxhighlight lang="text" class="" style="" inline="1">fail</syntaxhighlight>, <syntaxhighlight lang="text" class="" style="" inline="1">return</syntaxhighlight> and <syntaxhighlight lang="text" class="" style="" inline="1">suspend</syntaxhighlight> keywords. A procedure that lacks any of these keywords returns <syntaxhighlight lang="text" class="" style="" inline="1">&fail</syntaxhighlight>, which occurs whenever execution runs to the <syntaxhighlight lang="text" class="" style="" inline="1">end</syntaxhighlight> of a procedure. For instance:
<syntaxhighlight lang="icon">
procedure f(x) if x > 0 then { return 1 } end
</syntaxhighlight>
Calling <syntaxhighlight lang="text" class="" style="" inline="1">f(5)</syntaxhighlight> will return 1, but calling <syntaxhighlight lang="text" class="" style="" inline="1">f(-1)</syntaxhighlight> will return <syntaxhighlight lang="text" class="" style="" inline="1">&fail</syntaxhighlight>. This can lead to non-obvious behavior, for instance, <syntaxhighlight lang="text" class="" style="" inline="1">write(f(-1))</syntaxhighlight> will output nothing because <syntaxhighlight lang="text" class="" style="" inline="1">f</syntaxhighlight> fails and suspends operation of <syntaxhighlight lang="text" class="" style="" inline="1">write</syntaxhighlight>.[24]
Converting a procedure to be a generator uses the <syntaxhighlight lang="text" class="" style="" inline="1">suspend</syntaxhighlight> keyword, which means "return this value, and when called again, start execution at this point". In this respect it is something like a combination of the <syntaxhighlight lang="text" class="" style="" inline="1">static</syntaxhighlight> concept in C and <syntaxhighlight lang="text" class="" style="" inline="1">return</syntaxhighlight>. For instance:[18]
<syntaxhighlight lang="icon">
procedure ItoJ(i, j) while i <= j do { suspend i i +:= 1 } fail end
</syntaxhighlight>
creates a generator that returns a series of numbers starting at <syntaxhighlight lang="text" class="" style="" inline="1">i</syntaxhighlight> and ending a <syntaxhighlight lang="text" class="" style="" inline="1">j</syntaxhighlight>, and then returns <syntaxhighlight lang="text" class="" style="" inline="1">&fail</syntaxhighlight> after that.[b] The <syntaxhighlight lang="text" class="" style="" inline="1">suspend i</syntaxhighlight> stops execution and returns the value of <syntaxhighlight lang="text" class="" style="" inline="1">i</syntaxhighlight> without reseting any of the state. When another call is made to the same function, execution picks up at that point with the previous values. In this case, that causes it to perform <syntaxhighlight lang="text" class="" style="" inline="1">i +:= 1</syntaxhighlight>, loop back to the start of the while block, and then return the next value and suspend again. This continues until <syntaxhighlight lang="text" class="" style="" inline="1">i <= j</syntaxhighlight> fails, at which point it exits the block and calls <syntaxhighlight lang="text" class="" style="" inline="1">fail</syntaxhighlight>. This allows iterators to be constructed with ease.[18]
Another type of generator-builder is the alternator, which looks and operates like the boolean <syntaxhighlight lang="text" class="" style="" inline="1">or</syntaxhighlight> operator. For instance:
<syntaxhighlight lang="icon">
if y < (x | 5) then write("y=", y)
</syntaxhighlight>
This appears to say "if y is smaller than x or 5 then...", but is actually a short-form for a generator that returns values until it falls off the end of the list. The values of the list are "injected" into the operations, in this case, <syntaxhighlight lang="text" class="" style="" inline="1"><</syntaxhighlight>. So in this example, the system first tests y < x, if x is indeed larger than y it returns the value of x, the test passes, and the value of y is written out in the <syntaxhighlight lang="text" class="" style="" inline="1">then</syntaxhighlight> clause. However, if x is not larger than y it fails, and the alternator continues, performing y < 5. If that test passes, y is written. If y is smaller than neither x or 5, the alternator runs out of tests and fails, the <syntaxhighlight lang="text" class="" style="" inline="1">if</syntaxhighlight> fails, and the <syntaxhighlight lang="text" class="" style="" inline="1">write</syntaxhighlight> is not performed. Thus, the value of y will appear on the console if it is smaller than x or 5, thereby fulfilling the purpose of a boolean <syntaxhighlight lang="text" class="" style="" inline="1">or</syntaxhighlight>. Functions will not be called unless evaluating their parameters succeeds, so this example can be shortened to:
<syntaxhighlight lang="icon">
write("y=", (x | 5) > y)
</syntaxhighlight>
Internally, the alternator is not simply an <syntaxhighlight lang="text" class="" style="" inline="1">or</syntaxhighlight> and one can also use it to construct arbitrary lists of values. This can be used to iterate over arbitrary values, like:
<syntaxhighlight lang="icon">
every i := (1|3|4|5|10|11|23) do write(i)
</syntaxhighlight>
As lists of integers are commonly found in many programming contexts, Icon also includes the <syntaxhighlight lang="text" class="" style="" inline="1">to</syntaxhighlight> keyword to construct ad hoc integer generators:
<syntaxhighlight lang="icon">
every k := i to j do write(k)
</syntaxhighlight>
which can be shortened:
<syntaxhighlight lang="icon">
every write(1 to 10)
</syntaxhighlight>
Icon is not strongly typed, so the alternator lists can contain different types of items:
<syntaxhighlight lang="icon">
every i := (1 | "hello" | x < 5) do write(i)
</syntaxhighlight>
This writes 1, "hello" and maybe 5 depending on the value of x.
Likewise the conjunction operator, <syntaxhighlight lang="text" class="" style="" inline="1">&</syntaxhighlight>, is used in a fashion similar to a boolean <syntaxhighlight lang="text" class="" style="" inline="1">and</syntaxhighlight> operator:[25]
<syntaxhighlight lang="icon">
every x := ItoJ(0,10) & x % 2 == 0 do write(x)
</syntaxhighlight>
This code calls <syntaxhighlight lang="text" class="" style="" inline="1">ItoJ</syntaxhighlight> and returns an initial value of 0 which is assigned to x. It then performs the right-hand side of the conjunction, and since <syntaxhighlight lang="text" class="" style="" inline="1">x % 2</syntaxhighlight> does equal 0, it writes out the value. It then calls the <syntaxhighlight lang="text" class="" style="" inline="1">ItoJ</syntaxhighlight> generator again which assigns 1 to x, which fails the right-hand-side and prints nothing. The result is a list of every even integer from 0 to 10.[25]
The concept of generators is particularly useful and powerful when used with string operations, and is a major underlying basis for Icon's overall design. Consider the <syntaxhighlight lang="text" class="" style="" inline="1">indexOf</syntaxhighlight> operation found in many languages; this function looks for one string within another and returns an index of its location, or a magic number if it is not found. For instance:
<syntaxhighlight lang="java">
s = "All the world's a stage. And all the men and women merely players"; i = indexOf("the", s); write(i);
</syntaxhighlight>
This will scan the string <syntaxhighlight lang="text" class="" style="" inline="1">s</syntaxhighlight>, find the first occurrence of "the", and return that index, in this case 4. The string, however, contains two instances of the string "the", so to return the second example an alternate syntax is used:
<syntaxhighlight lang="java">
j = indexOf("the", s, i+1); write(j);
</syntaxhighlight>
This tells it to scan starting at location 5, so it will not match the first instance we found previously. However, there may not be a second instance of "the" -there may not be a first one either- so the return value from <syntaxhighlight lang="text" class="" style="" inline="1">indexOf</syntaxhighlight> has to be checked against the magic number -1 which is used to indicate no matches. A complete routine that prints out the location of every instance is:
<syntaxhighlight lang="java">
s = "All the world's a stage. And all the men and women merely players"; i = indexOf("the", s); while i != -1 { write(i); i = indexOf("the", s, i+1); }
</syntaxhighlight>
In Icon, the equivalent <syntaxhighlight lang="text" class="" style="" inline="1">find</syntaxhighlight> is a generator, so the same results can be created with a single line:
<syntaxhighlight lang="icon">
s := "All the world's a stage. And all the men and women merely players" every write(find("the", s))
</syntaxhighlight>
Of course there are times where one does want to find a string after some point in input, for instance, if scanning a text file that contains a line number in the first four columns, a space, and then a line of text. Goal-directed execution can be used to skip over the line numbers:
<syntaxhighlight lang="icon">
every write(5 < find("the", s))
</syntaxhighlight>
The position will only be returned if "the" appears after position 5; the comparison will fail otherwise, pass the fail to write, and the write will not occur.
The <syntaxhighlight lang="text" class="" style="" inline="1">every</syntaxhighlight> operator is similar to <syntaxhighlight lang="text" class="" style="" inline="1">while</syntaxhighlight>, looping through every item returned by a generator and exiting on failure:[24]
<syntaxhighlight lang="icon">
every k := i to j do write(someFunction(k))
</syntaxhighlight>
There is a key difference between <syntaxhighlight lang="text" class="" style="" inline="1">every</syntaxhighlight> and <syntaxhighlight lang="text" class="" style="" inline="1">while</syntaxhighlight>; <syntaxhighlight lang="text" class="" style="" inline="1">while</syntaxhighlight> re-evaluates the first result until it fails, whereas <syntaxhighlight lang="text" class="" style="" inline="1">every</syntaxhighlight> fetches the next value from a generator. <syntaxhighlight lang="text" class="" style="" inline="1">every</syntaxhighlight> actually injects values into the function in a fashion similar to blocks under Smalltalk. For instance, the above loop can be re-written this way:[24]
<syntaxhighlight lang="icon">
every write(someFunction(i to j))
</syntaxhighlight>
In this case, the values from i to j will be injected into <syntaxhighlight lang="text" class="" style="" inline="1">someFunction</syntaxhighlight> and (potentially) write multiple lines of output.[24]
Collections
Icon includes several collection types including lists that can also be used as stacks and queues, tables (also known as maps or dictionaries in other languages), sets and others. Icon refers to these as structures. Collections are inherent generators and can be easily called using the bang syntax. For instance:
<syntaxhighlight lang="icon">
lines := [] # create an empty list while line := read() do { # loop reading lines from standard input push(lines, line) # use stack-like syntax to push the line on the list } while line := pop(lines) do { # loop while lines can be popped off the list write(line) # write the line out }
</syntaxhighlight>
Using the fail propagation as seen in earlier examples, we can combine the tests and the loops:
<syntaxhighlight lang="icon">
lines := [] # create an empty list while push(lines, read()) # push until empty while write(pop(lines)) # write until empty
</syntaxhighlight>
Because the list collection is a generator, this can be further simplified with the bang syntax:
<syntaxhighlight lang="icon">
lines := [] every push(lines, !&input) every write(!lines)
</syntaxhighlight>
In this case, the bang in <syntaxhighlight lang="text" class="" style="" inline="1">write</syntaxhighlight> causes Icon to return a line of text one by one from the array and finally fail at the end. <syntaxhighlight lang="text" class="" style="" inline="1">&input</syntaxhighlight> is a generator-based analog of <syntaxhighlight lang="text" class="" style="" inline="1">read</syntaxhighlight> that reads a line from standard input, so <syntaxhighlight lang="text" class="" style="" inline="1">!&input</syntaxhighlight> continues reading lines until the file ends.
As Icon is typeless, lists can contain any different types of values:
<syntaxhighlight lang="icon">aCat := ["muffins", "tabby", 2002, 8]</syntaxhighlight>
The items can included other structures. To build larger lists, Icon includes the <syntaxhighlight lang="text" class="" style="" inline="1">list</syntaxhighlight> generator; <syntaxhighlight lang="text" class="" style="" inline="1">i := list(10, "word")</syntaxhighlight> generates a list containing 10 copies of "word". Like arrays in other languages, Icon allows items to be looked up by position, e.g., <syntaxhighlight lang="text" class="" style="" inline="1">weight := aCat[4]</syntaxhighlight>. Array slicing is included, allowing new lists to be created out of the elements of other lists, for instance, <syntaxhighlight lang="text" class="" style="" inline="1">aCat := Cats[2:4]</syntaxhighlight> produces a new list called aCat that contains "tabby" and 2002.
Tables are essentially lists with arbitrary index keys rather than integers:
<syntaxhighlight lang="icon">
symbols := table(0) symbols["there"] := 1 symbols["here"] := 2
</syntaxhighlight>
This code creates a table that will use zero as the default value of any unknown key. It then adds two items into the table, with the keys "there" and "here", and values 1 and 2.
Sets are also similar to lists but contain only a single member of any given value. Icon includes the <syntaxhighlight lang="text" class="" style="" inline="1">++</syntaxhighlight> to produce the union of two sets, <syntaxhighlight lang="text" class="" style="" inline="1">**</syntaxhighlight> the intersection, and <syntaxhighlight lang="text" class="" style="" inline="1">--</syntaxhighlight> the difference. Icon includes a number of pre-defined "Cset"s, a set containing various characters. There are four standard Csets in Icon, <syntaxhighlight lang="text" class="" style="" inline="1">&ucase</syntaxhighlight>, <syntaxhighlight lang="text" class="" style="" inline="1">&lcase</syntaxhighlight>, <syntaxhighlight lang="text" class="" style="" inline="1">&letters</syntaxhighlight>, and <syntaxhighlight lang="text" class="" style="" inline="1">&digits</syntaxhighlight>. New Csets can be made by enclosing a string in single quotes, for instance, <syntaxhighlight lang="text" class="" style="" inline="1">vowel := 'aeiou'</syntaxhighlight>.
Strings
In Icon, strings are lists of characters. As a list, they are generators and can thus be iterated over using the bang syntax:
<syntaxhighlight lang="icon">
write(!"Hello, world!")
</syntaxhighlight>
Will print out each character of the string on a separate line.
Substrings can be extracted from a string by using a range specification within brackets. A range specification can return a point to a single character, or a slice of the string. Strings can be indexed from either the right or the left. Positions within a string are defined to be between the characters 1A2B3C4 and can be specified from the right −3A−2B−1C0
For example, <syntaxhighlight lang="icon">
"Wikipedia"[1] ==> "W" "Wikipedia"[3] ==> "k" "Wikipedia"[0] ==> "a" "Wikipedia"[1:3] ==> "Wi" "Wikipedia"[-2:0] ==> "ia" "Wikipedia"[2+:3] ==> "iki"
</syntaxhighlight> Where the last example shows using a length instead of an ending position
The subscripting specification can be used as a lvalue within an expression. This can be used to insert strings into another string or delete parts of a string. For example:
<syntaxhighlight lang="icon">
s := "abc" s[2] := "123" s now has a value of "a123c" s := "abcdefg" s[3:5] := "ABCD" s now has a value of "abABCDefg" s := "abcdefg" s[3:5] := "" s now has a value of "abefg"
</syntaxhighlight>
String scanning
A further simplification for handling strings is the scanning system, invoked with <syntaxhighlight lang="text" class="" style="" inline="1">?</syntaxhighlight>, which calls functions on a string:
<syntaxhighlight lang="icon">
s ? write(find("the"))
</syntaxhighlight>
Icon refers to the left-hand-side of the <syntaxhighlight lang="text" class="" style="" inline="1">?</syntaxhighlight> as the subject, and passes it into string functions. Recall the <syntaxhighlight lang="text" class="" style="" inline="1">find</syntaxhighlight> takes two parameters, the search text as parameter one and the string to search in parameter two. Using <syntaxhighlight lang="text" class="" style="" inline="1">?</syntaxhighlight> the second parameter is implicit and does not have to be specified by the programmer. In the common cases when multiple functions are being called on a single string in sequence, this style can significantly reduce the length of the resulting code and improve clarity. Icon function signatures identify the subject parameter in their definitions so the parameter can be hoisted in this fashion.
The <syntaxhighlight lang="text" class="" style="" inline="1">?</syntaxhighlight> is not simply a form of syntactic sugar, it also sets up a "string scanning environment" for any following string operations. This is based on two internal variables, <syntaxhighlight lang="text" class="" style="" inline="1">&subject</syntaxhighlight> and <syntaxhighlight lang="text" class="" style="" inline="1">&pos</syntaxhighlight>; <syntaxhighlight lang="text" class="" style="" inline="1">&subject</syntaxhighlight> is simply a pointer to the original string, while <syntaxhighlight lang="text" class="" style="" inline="1">&pos</syntaxhighlight> is the current position within it, or cursor. Icon's various string manipulation procedures use these two variables so they do not have to be explicitly supplied by the programmer. For example:
<syntaxhighlight lang="icon">
s := "this is a string" s ? write("subject=[",&subject,"], pos=[",&pos,"]")
</syntaxhighlight>
would produce:
<syntaxhighlight lang="text"> subject=[this is a string], pos=[1] </syntaxhighlight>
Built-in and user-defined functions can be used to move around within the string being scanned. All of the built-in functions will default to <syntaxhighlight lang="text" class="" style="" inline="1">&subject</syntaxhighlight> and <syntaxhighlight lang="text" class="" style="" inline="1">&pos</syntaxhighlight> to allow the scanning syntax to be used. The following code will write all blank-delimited "words" in a string:
<syntaxhighlight lang="icon">
s := "this is a string" s ? { # Establish string scanning environment while not pos(0) do { # Test for end of string tab(many(' ')) # Skip past any blanks word := tab(upto(' ') | 0) # the next word is up to the next blank -or- the end of the line write(word) # write the word } }
</syntaxhighlight>
There are a number of new functions introduced in this example. <syntaxhighlight lang="text" class="" style="" inline="1">pos</syntaxhighlight> returns the current value of <syntaxhighlight lang="text" class="" style="" inline="1">&pos</syntaxhighlight>. It may not be immediately obvious why one would need this function and not simply use the value of <syntaxhighlight lang="text" class="" style="" inline="1">&pos</syntaxhighlight> directly; the reason is that <syntaxhighlight lang="text" class="" style="" inline="1">&pos</syntaxhighlight> is a variable and thus cannot take on the value <syntaxhighlight lang="text" class="" style="" inline="1">&fail</syntaxhighlight>, which the procedure <syntaxhighlight lang="text" class="" style="" inline="1">pos</syntaxhighlight> can. Thus <syntaxhighlight lang="text" class="" style="" inline="1">pos</syntaxhighlight> provides a lightweight wrapper on <syntaxhighlight lang="text" class="" style="" inline="1">&pos</syntaxhighlight> that allows Icon's goal-directed flow control to be easily used without having to provide hand-written boolean tests against <syntaxhighlight lang="text" class="" style="" inline="1">&pos</syntaxhighlight>. In this case, the test is "is &pos zero", which, in the odd numbering of Icon's string locations, is the end of the line. If it is not zero, <syntaxhighlight lang="text" class="" style="" inline="1">pos</syntaxhighlight> returns <syntaxhighlight lang="text" class="" style="" inline="1">&fail</syntaxhighlight>, which is inverted with the <syntaxhighlight lang="text" class="" style="" inline="1">not</syntaxhighlight> and the loop continues.
<syntaxhighlight lang="text" class="" style="" inline="1">many</syntaxhighlight> finds one or more examples of the provided Cset parameter starting at the current <syntaxhighlight lang="text" class="" style="" inline="1">&pos</syntaxhighlight>. In this case, it is looking for space characters, so the result of this function is the location of the first non-space character after <syntaxhighlight lang="text" class="" style="" inline="1">&pos</syntaxhighlight>. <syntaxhighlight lang="text" class="" style="" inline="1">tab</syntaxhighlight> moves <syntaxhighlight lang="text" class="" style="" inline="1">&pos</syntaxhighlight> to that location, again with a potential <syntaxhighlight lang="text" class="" style="" inline="1">&fail</syntaxhighlight> in case, for instance, <syntaxhighlight lang="text" class="" style="" inline="1">many</syntaxhighlight> falls off the end of the string. <syntaxhighlight lang="text" class="" style="" inline="1">upto</syntaxhighlight> is essentially the reverse of <syntaxhighlight lang="text" class="" style="" inline="1">many</syntaxhighlight>; it returns the location immediately prior to its provided Cset, which the example then sets the <syntaxhighlight lang="text" class="" style="" inline="1">&pos</syntaxhighlight> to with another <syntaxhighlight lang="text" class="" style="" inline="1">tab</syntaxhighlight>. Alternation is used to also stop at the end of a line.
This example can be made more robust through the use of a more appropriate "word breaking" Cset which might include periods, commas and other punctuation, as well as other whitespace characters like tab and non-breaking spaces. That Cset can then be used in <syntaxhighlight lang="text" class="" style="" inline="1">many</syntaxhighlight> and <syntaxhighlight lang="text" class="" style="" inline="1">upto</syntaxhighlight>.
A more complex example demonstrates the integration of generators and string scanning within the language.
<syntaxhighlight lang="icon">
procedure main() s := "Mon Dec 8" s ? write(Mdate() | "not a valid date") end # Define a matching function that returns # a string that matches a day month dayofmonth procedure Mdate() # Define some initial values static dates static days initial { days := ["Mon","Tue","Wed","Thr","Fri","Sat","Sun"] months := ["Jan","Feb","Mar","Apr","May","Jun", "Jul","Aug","Sep","Oct","Nov","Dec"] } every suspend (retval <- tab(match(!days)) || # Match a day =" " || # Followed by a blank tab(match(!months)) || # Followed by the month =" " || # Followed by a blank matchdigits(2) # Followed by at least 2 digits ) & (=" " | pos(0) ) & # Either a blank or the end of the string retval # And finally return the string end # Matching function that returns a string of n digits procedure matchdigits(n) suspend (v := tab(many(&digits)) & *v <= n) & v end
</syntaxhighlight>
Criticisms
Laurence Tratt wrote a paper on Icon examining its real-world applications and pointing out a number of areas of concern. Among these were a number of practical decisions that derive from their origins in string processing but do not make as much sense in other areas.[24] Among them:
The decision to fail by default at the end of procedures makes sense in the context of generators, but less so in the case of general procedures. Returning to the example noted above, <syntaxhighlight lang="text" class="" style="" inline="1">write(f(-1))</syntaxhighlight> will not output which may be expected. However:[24] <syntaxhighlight lang="icon">
x := 10 (additional lines) x := f(-1) write(x)
</syntaxhighlight> will result in 10 being printed. This sort of issue is not at all obvious as even in an interactive debugger all the code is invoked yet <syntaxhighlight lang="text" class="" style="" inline="1">x</syntaxhighlight> never picks up the expected value. This could be dismissed as one of those "gotchas" that programmers have to be aware of in any language, but Tratt examined a variety of Icon programs and found that the vast majority of procedures are not generators. This means that Icon's default behaviour is only used by a tiny minority of its constructs, yet represents a major source of potential errors in all the others.[24]
Another issue is the lack of a boolean data type[c] and conventional boolean logic. While the success/fail system works in most cases where the ultimate goal is to check a value, this can still lead to some odd behaviour in seemingly simple code:[25] <syntaxhighlight lang="icon">
procedure main() if c then { write("taken") } end
</syntaxhighlight> This program will print "taken". The reason is that the test, <syntaxhighlight lang="text" class="" style="" inline="1">c</syntaxhighlight>, does return a value; that value is <syntaxhighlight lang="text" class="" style="" inline="1">&null</syntaxhighlight>, the default value for all otherwise uninitiated variables.[26] <syntaxhighlight lang="text" class="" style="" inline="1">&null</syntaxhighlight> is a valid value, so <syntaxhighlight lang="text" class="" style="" inline="1">if c</syntaxhighlight> succeeds. To test this, one needs to make the test explicit, <syntaxhighlight lang="text" class="" style="" inline="1">c === &null</syntaxhighlight>. Tratt supposed that it detracts from the self-documenting code, having supposed erroneously that it is testing "is c zero" or "does c exist".[25]
See also
Notes
- ^ According to an interview in 1985, Griswold states that the term 'icon' was not being used until Smalltalk was released to the public some time later. He expressed his annoyance that the term was now confusing people who thought the language had graphical elements.[12]
- ^ The <syntaxhighlight lang="text" class="" style="" inline="1">fail</syntaxhighlight> is not required in this case as it is immediately before the <syntaxhighlight lang="text" class="" style="" inline="1">end</syntaxhighlight>. It has been added for clarity.
- ^ Although, as Tratt points out, K&R C also lacks an explicit boolean type and uses 0 for false and any non-zero for true.[24]
References
Citations
- ^ Townsend, Gregg (January 26, 2024). "Update version to 9.5.22e". GitHub.
- ^ "Goaldi". GitHub.
- ^ Schemenauer, Neil; Peters, Tim; Hetland, Magnus Lie (18 May 2001). "PEP 255 – Simple Generators". Python Enhancement Proposals. Python Software Foundation. Retrieved 9 February 2012.
- ^ Griswold 1981, pp. 601, 602.
- ^ Griswold 1981, pp. 602.
- ^ Griswold 1981, pp. 606.
- ^ Griswold 1981, pp. 608.
- ^ Griswold 1981, pp. 609.
- ^ Griswold 1981, pp. 629.
- ^ Shapiro 1985, pp. 346.
- ^ 11.0 11.1 11.2 11.3 11.4 Griswold & Griswold 1993, p. 53.
- ^ Shapiro 1985, p. 350.
- ^ Griswold & Griswold 2002, p. xv.
- ^ Griswold & Griswold 2002, p. xvi.
- ^ Griswold & Griswold 2002, p. 10.
- ^ Griswold & Griswold 2002, p. 1.
- ^ Griswold & Griswold 2002, p. 4.
- ^ 18.0 18.1 18.2 18.3 18.4 18.5 Tratt 2010, p. 74.
- ^ "Array.prototype.indexOf()". MDN Web Docs. 27 June 2023.
- ^ 20.0 20.1 Lane, Rupert (26 July 2015). "SNOBOL - Introduction". Try MTS.
- ^ Tratt 2010, p. 73.
- ^ 22.0 22.1 22.2 Griswold 1996, p. 2.1.
- ^ Griswold 1996, p. 1.
- ^ 24.0 24.1 24.2 24.3 24.4 24.5 24.6 24.7 Tratt 2010, p. 75.
- ^ 25.0 25.1 25.2 25.3 Tratt 2010, p. 76.
- ^ Griswold & Griswold 2002, p. 128.
Bibliography
- Griswold, Ralph; Griswold, Madge (2002). The Icon Programming Language (third ed.). Peer-to-Peer Communications. ISBN 1-57398-001-3.
- Griswold, Ralph; Griswold, Madge (March 1993). "History of the Icon Programming Language". SIGPLAN Notices. 23 (3): 53–68. doi:10.1145/155360.155363. S2CID 861936.
- Griswold, Ralph (1981). "13". In Wexelblat, Richard (ed.). A History of the SNOBOL Programming Languages. History of Programming Languages. Academic Press.
- Griswold, Ralph (2 March 1996). "An Overview of the Icon Programming Language; Version 9". Department of Computer Science, The University of Arizona.
- Tratt, Laurence (18 October 2010). "Experiences with an icon-like expression evaluation system" (PDF). Proceedings of the 6th symposium on Dynamic languages. pp. 73–80. doi:10.1145/1869631.1869640. ISBN 9781450304054. S2CID 14588067.
- Shapiro, Ezra (July 1985). "SNOBOL and Icon". Byte. pp. 341–350.
External links
- Official website
- Icon on GitHub
- Oral history interview with Stephen Wampler, Charles Babbage Institute, University of Minnesota. Wampler discusses his work on the development Icon in the late 1970s.
- Oral history interview with Robert Goldberg, Charles Babbage Institute, University of Minnesota. Goldberg discusses his interaction with Griswold when working on Icon in the classroom at Illinois Institute of Technology.
- Oral history interview with Kenneth Walker, Charles Babbage Institute, University of Minnesota. Walker describes the work environment of the Icon project, his interactions with Griswold, and his own work on an Icon compiler.
- The Icon Programming Language page on The Rosetta Code comparative programming tasks project site