Changes to Tcl script semantics in Tcl 8.0
This page describes differences in the semantics of Tcl scripts between
Tcl 8.0 and previous releases of Tcl. Please report any problems you
discover with this release on the comp.lang.tcl newsgroup.
A major goal for the Tcl 8.0 compiler is no changes or very few changes
to Tcl/Tk scripts or C extensions. I have tried very hard to ensure
that the behavior documented in the man pages is preserved but I have
made some changes that are visible at the level of Tcl scripts. Most
of these changes allow Tcl programs to execute faster. The key changes
are the following.
- Compilation errors return errors immediately, before executing
a script
Tcl 8.0 now has separate compilation and execution phases. If an error
is found during compilation, an error is returned immediately and the
script is not executed (not even partially). This differs from Tcl
7.6 where a script would be executed up to the first command with an
error.
Compilation errors are now returned for scripts with malformed expression
words (e.g., words with missing close braces or quotes). In addition,
several commands are treated specially by the Tcl bytecode compiler
and are compiled into an inline sequence of instructions for better
execution speed. In Tcl 8.0, these commands are break, catch, continue,
expr, for, foreach, if, incr, set, and while. If a script includes one
of these commands with the wrong number of argument words (e.g.,
set a xxx yyy), a compilation error is returned immediately and the
script is not executed.
For example, the body of the following procedure has a syntax error:
proc p {} {
    global x
    set x 1
    incr x 1 2 ;# error: wrong number of arguments
    set x 2
}
When p is called the first time, its body is compiled.
In Tcl 8.0, a syntax error is returned immediately and the global variable
x is never modified. In Tcl 7.6, x would
be set to 1.
Returning a compilation error immediately lets you find typos and other
syntax errors in your scripts as soon as possible. It also avoids the
Tcl 7.6 problem that a bad procedure body or eval'ed script
could leave your data structures in an inconsistent state because it
executed only partially.
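The following is a minimal sketch of how a caller could check which behavior its interpreter shows. The names brokenBody and state are hypothetical and exist only for this illustration. The call is expected to fail either way; what differs is whether the first set had already run when the error was reported, so it is worth running on the release you actually use rather than assuming either result.

```tcl
# Hypothetical names: brokenBody and state are for illustration only.
proc brokenBody {} {
    global state
    set state 1
    incr state 1 2 ;# error: wrong number of arguments
    set state 2
}

# The call fails; what differs between releases is whether the
# first "set" had already run when the error was reported.
set rc [catch {brokenBody} msg]
puts "catch returned $rc: $msg"
if {[info exists state]} {
    puts "state was modified before the error: $state"
} else {
    puts "state was never modified"
}
```

Under the behavior described above, Tcl 8.0 would take the second branch, while Tcl 7.6 would report that state is 1.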
- Catch no longer catches compilation errors
One consequence of the fact that in Tcl 8.0 compilation errors in scripts
are returned immediately is that catch no longer prevents
compilation errors from aborting command interpretation. For example,
catch {set}
immediately reports the error
wrong # args: should be "set varName ?newValue?"
To have catch handle compile time errors, give it a script
that is only determined at runtime: for example,
catch [concat {set}]
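As a small sketch of the same workaround with the result inspected, the script can be stored in a variable first. The variable names script, rc, and msg are illustrative assumptions, not part of the original example.

```tcl
# Build the script at run time so that it is not a literal argument
# to catch when the enclosing script is compiled.
set script [concat {set}]

# catch now sees the error as an ordinary run-time error.
set rc [catch $script msg]
puts "rc=$rc msg=$msg"
```

Here rc is the return code from catch (1 for an error) and msg holds the error message.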
- Scripts are now completely compiled
Tcl 7.6 would ignore any characters in a script after the last command
executed. This allowed you, for example, to put comments after the
exit command that terminated a script without having to
start each comment line with a #:
initialize ;# the commands
compute ;# of the
finalize ;# program
exit ;# stop execution
This is the first line in the script's change log.
A second line in the change log.
Tcl 8.0 now attempts to compile those lines so you should put a
# at the start of each comment line.
- Lists are now aggressively parsed
Lists are now a real (and significantly faster) data type and not just
strings that are rescanned each time. You still use them the same way,
but strings are converted to a faster internal representation behind
the scenes on the first list operation. This means that lists are now
completely parsed. It used to be the case that if you had a malformed
list but the erroneous part was after the point you inserted or extracted
an element, then you never saw an error. Tcl 8.0 now reports an error.
For example, in Tcl 8.0
lindex {a b "c"d e} 1
returns the error
list element in quotes followed by "d" instead of space
while in Tcl 7.6, it returned b .
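Because list operations now parse the whole string, a script that receives list data from an untrusted source may want to check it before use. The sketch below assumes a hypothetical helper named isWellFormedList; it relies only on llength and catch.

```tcl
# isWellFormedList is a hypothetical helper name.
# llength needs the complete list, so it fails on a malformed
# element wherever that element appears in the string.
proc isWellFormedList {s} {
    return [expr {![catch {llength $s}]}]
}

puts [isWellFormedList {a b c d e}]
puts [isWellFormedList {a b "c"d e}]
```

The first call reports 1 and the second reports 0, since the second string contains the malformed element from the example above.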
- List operations don't preserve the exact white space between
elements
List operations in Tcl 7.6 always retained the white space between
list elements. This is no longer true in Tcl 8.0, where list operations
return lists whose elements are separated by a single space. In the
following example, the first command sets the variable x
to a string containing two tab characters:
set x "one\ttwo\tthree"
lrange $x 0 1
In Tcl 8.0, the lrange command returns one two.
Programs that need to preserve white space should either use string
operations or use a combination of the split and join
commands. For example,
join [lrange [split $x "\t"] 0 1] "\t"
will preserve the tab between the list elements (each "\t" stands for a
tab character). Note that the Tcl man pages never promised
to preserve the white space.
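The split and join approach can be wrapped in a small procedure. This is only a sketch: rangeKeepingSeparator is a hypothetical name, and it assumes the elements are separated by exactly one known separator character.

```tcl
# rangeKeepingSeparator is a hypothetical helper name.
# It extracts elements first..last from a string whose fields are
# separated by sep, and rejoins them with the same separator.
proc rangeKeepingSeparator {str sep first last} {
    return [join [lrange [split $str $sep] $first $last] $sep]
}

set x "one\ttwo\tthree"
set viaList  [lrange $x 0 1]                     ;# single space between elements
set viaSplit [rangeKeepingSeparator $x "\t" 0 1] ;# tab is kept
puts [list $viaList $viaSplit]
```

Note that split treats every occurrence of the separator as a field boundary, so two adjacent tabs produce an empty field; this differs from list parsing, which treats a run of white space as one separator.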
- Fewer floating-to-string conversions (and the associated rounding) may change program behavior
Both Tcl 8.0 and 7.6 use full IEEE floating-point precision
(about 17 decimal digits) when computing expressions.
They both round floating-point values when converting them to strings
(although by default Tcl 8.0 retains 12 digits
while Tcl 7.6 keeps 6 digits).
However, the new object system in Tcl 8.0 causes fewer floating-to-string
conversions (and the associated rounding) to occur than in Tcl 7.6.
These changes mean that floating-point computation
is more accurate and faster in Tcl 8.0,
but there are sometimes behavioral changes.
For example, the command
for {set x 0.0} {$x != 4.0} {set x [expr $x+0.1]} {puts $x}
terminates in Tcl 7.6 but loops forever in Tcl 8.0.
This is because the fraction 0.1 cannot be exactly
represented as an IEEE floating-point value
and so repeatedly adding it to 0.0
will never produce a floating-point value that is exactly 4.0.
This loop terminated in Tcl 7.6 only because of the rounding
that was done at every assignment to the variable x.
The solution in Tcl 8.0 (and most other programming languages)
is to use approximate comparisons for floating-point.
That is, when looping until the value of a floating-point expression
reaches some target value
you need to test whether the expression is "close enough" to the target.
For example,
for {set x 0.0} {abs($x-4.0) > 0.001} {set x [expr $x+0.1]} {puts $x}
stops once the variable x is within 0.001 of 4.0.
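Two common ways to apply this advice are sketched below. The helper name approxEqual and its default tolerance are assumptions made for illustration. The second technique, driving the loop with an integer counter, is an alternative not described above; it avoids comparing floating-point values for termination altogether.

```tcl
# approxEqual is a hypothetical helper; the default tolerance is an
# arbitrary choice for illustration.
proc approxEqual {a b {eps 1e-9}} {
    return [expr {abs($a - $b) <= $eps}]
}

# Alternative: count iterations with an integer and derive the
# floating-point value from it, so termination never depends on a
# floating-point comparison.
set values {}
for {set i 0} {$i < 40} {incr i} {
    lappend values [expr {$i * 0.1}]
}
puts "[llength $values] iterations"

# Accumulate 0.1 forty times, as the loop in the text does, and
# compare the result with the target approximately.
set sum 0.0
for {set i 0} {$i < 40} {incr i} {
    set sum [expr {$sum + 0.1}]
}
puts [approxEqual $sum 4.0]
```

The tolerance should be chosen relative to the step size and the magnitude of the values involved; 1e-9 is far smaller than the 0.1 step used here.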
- The original strings in expressions are retained
For example,
expr {"0y" < "0x3"}
yields 0 (not 1) in Tcl 8.0 because the original string for "0x3" is
not lost. Tcl always tries to convert expression operands to numbers
if possible. In Tcl 7.6, this meant that the string "0x3" was lost
when the expression implementation converted it to a number (3) and
then back into a string when it realized that a string comparison had
to be done. This meant Tcl 7.6 did a string comparison between "0y"
and "3". Since Tcl values are now stored in Tcl objects that hold an
internal form as well as a string, the original string isn't lost and
Tcl 8.0 can do the string comparison between "0y" and "0x3".
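The sketch below contrasts the cases. The variable names are illustrative assumptions. When a string comparison is what you intend, string compare states that explicitly and does not depend on whether the operands happen to look like numbers.

```tcl
# Both operands parse as numbers (hexadecimal 3 and decimal 3), so
# expr compares them numerically.
set numericEqual [expr {"0x3" == "3"}]

# "0y" is not a number, so expr falls back to comparing the
# original strings "0y" and "0x3".
set exprLess [expr {"0y" < "0x3"}]

# Explicit string comparison, independent of numeric conversion:
# returns -1, 0, or 1.
set cmp [string compare "0y" "0x3"]

puts [list $numericEqual $exprLess $cmp]
```

Under the Tcl 8.0 behavior described above, the three values are 1, 0, and 1.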
- info cmdcount is no longer accurate
Compiled commands are "compiled away" and their execution is no longer
counted. In Tcl 8.0, this includes the commands break, catch, continue,
expr, for, foreach, if, incr, set, and while. It was felt that the
benefit of keeping info cmdcount accurate is not worth the cost of
doing so.
- append no longer triggers read traces
The append command no longer triggers read traces when
fetching the old values of variables before doing the append operation.
It only triggers a write trace after each append. It was felt that
these read traces were not very useful and not worth the extra execution
overhead they require.
Note that the lappend command still triggers read traces
in this situation. lappend will be changed to behave like
append.
- lappend triggers only one write trace
lappend triggers only one write trace after appending
all arguments to the list.
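One way to observe this is to count write traces. This is a sketch under stated assumptions: countWrite, writeCount, and items are hypothetical names, and the script tries the "trace add variable" form used by later Tcl releases before falling back to the Tcl 8.0 "trace variable" form.

```tcl
# countWrite and writeCount are hypothetical names for this sketch.
set writeCount 0
proc countWrite {name1 name2 op} {
    global writeCount
    incr writeCount
}

set items {}
# "trace variable" is the Tcl 8.0 form; later releases spell it
# "trace add variable", so try that first and fall back.
if {[catch {trace add variable items write countWrite}]} {
    trace variable items w countWrite
}

# Three elements are appended, but only one write is reported.
lappend items a b c
puts "write traces fired: $writeCount"
```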
- Error tracebacks are shorter
There are fewer recursive calls to Tcl_Eval now since some commands
like while and if have been "compiled away"
into a sequence of bytecoded instructions. This means that tracebacks
often have fewer levels than in the past.
- Comments and white space are "free"
Only the bytecode compiler sees these. They do not affect the execution
speed of scripts.
- The length of (local) variable names doesn't matter.
For most commands, the compiler looks up local variables in a procedure
at compile time and emits instructions that, at execution time, take
the same amount of time regardless of the length of the variable names.
Frequently asked questions
- Will the Tcl 8.0 compiler support the major extensions
to Tcl/Tk?
The compiler will work with any extension that doesn't modify the Tcl
core since Tcl 8.0 includes all the old C APIs and the old Tcl commands.
These extensions just need recompiling. Extensions that require source
changes to the core will be delayed because the Tcl 8.0 core is substantially
different from the Tcl 7.6 core.
- I tie in to the Tcl event loop to support OLE, etc. Can
I use the compiler?
The Tcl event loop won't be changed and will operate just as before.
- Can I redefine the built-in commands?
Yes, but this causes all code that is currently cached for scripts
and procedure bodies to be thrown away. The scripts and procedure bodies
will be recompiled the next time they are used. This means you should
redefine any built-in commands at program startup.
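A minimal sketch of this pattern follows, assuming a hypothetical wrapper around puts that counts calls. The names originalPuts and putsCalls are illustrative. The point is only that the rename and redefinition happen once, at the top of the program, before other code has run.

```tcl
# Hypothetical example: wrap the built-in puts once, at startup,
# so that cached code is discarded before it has been built up.
set putsCalls 0
rename puts originalPuts
proc puts {args} {
    global putsCalls
    incr putsCalls
    # Forward all arguments unchanged to the real command.
    return [eval [linsert $args 0 originalPuts]]
}

puts "application starting"
originalPuts "puts has been called $putsCalls time(s)"
```

Building the forwarded command with linsert keeps each argument as a separate word when eval runs it.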
- What is slower?
Expressions that are not enclosed in braces may be slower in Tcl 8.0.
The compiler must generate additional code for such expressions.
Also, the bytecode interpreter
may need to invoke the expr command for these expressions
at runtime after doing one round of substitutions in order
to preserve Tcl's two level substitution semantics for expressions.
This runtime call to the expr command
causes code to be generated, executed, and then thrown away,
which significantly adds to the execution cost.
The best speed in Tcl 8.0 is always obtained
by enclosing all expressions, even those in control structure commands,
inside braces.
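The difference is only in how the expression is written, as the following sketch shows. The variable names are illustrative. Both forms give the same result here; the braced form is the one the compiler can handle without a second round of parsing at run time.

```tcl
set a 3
set b 4

# Unbraced: Tcl substitutes $a and $b first, and expr then has to
# parse the resulting string at run time.
set unbraced [expr $a * $b + 1]

# Braced: the expression is a constant string, so it can be
# compiled once.
set braced [expr {$a * $b + 1}]

puts [list $unbraced $braced]

# The same applies to conditions in control structure commands.
set i 0
set total 0
while {$i < 5} {
    set total [expr {$total + $i}]
    incr i
}
puts $total
```

The time command can be used to compare the two forms on a particular interpreter, for example by timing each expr call over many iterations.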
Tk bindings and variable traces are often slower in Tcl 8.0 since they
compute new scripts that they then eval. Code is generated
for these scripts, executed once, then thrown away. If the code is
not iterative or doesn't execute for long, the time to compile it can
outweigh the savings in execution time.
- Is this as fast as things will get with the compiler?
No, the C command procedures for many Tcl commands such as open
and all the Tk commands have not been converted to take Tcl
objects instead of strings. Object-based command procedures are faster
than string-based ones since they can keep an appropriate internal
representation in Tcl objects. Also, the bytecode interpreter can call
object-based command procedures directly; string-based command procedures
are called by a wrapper procedure that must first convert Tcl objects
to strings. I expect Tk programs to speed up substantially when Tk
is modified to use objects.
Much of the speedup in Tcl programs to date has resulted from a speedup
in accessing local variables. The compiler allocates an index for each
local variable in an array of variables in each procedure frame. This
avoids having to do expensive hashtable lookups at runtime. Variables
referenced outside of a procedure body would also benefit from doing
the lookup once at compile-time instead of at each use. The compiler
could also do this lookup at compile-time for the variables in most
global commands.
Many programs are slowed down currently because too many small scripts
are being compiled. Many are the result of Tk bindings and variable
traces. Others are the result of programs that must use eval
to execute a command that they have computed, when those programs have
already parsed the command into separate command and argument words;
we will add support to Tcl to avoid unnecessary compilations in this
situation.
In Tcl 8.0, ten commands are compiled into an inline sequence of specialized
instructions. Many additional Tcl commands such as lappend
and lindex would benefit from being compiled inline.
- Can I save the Tcl bytecodes for my program to disk?
Many people have asked for this feature. It would reduce the time
needed to run many programs. It might also help companies and
individuals that want to "hide" their programs.
Unfortunately, Tcl 8.0 does not currently allow you to save Tcl bytecodes.
It is not technically hard to implement this,
but we have been reluctant to do so until the set of Tcl bytecodes
becomes more stable.
- Why doesn't Tcl use the Java bytecodes?
I had originally hoped to use Java bytecodes because they have a mature
design and because Java is widely available. I chose to use my own
Tcl-specific bytecodes because I was concerned that using the Java
virtual machine would be too slow or take too much memory. The basic
problem is the semantic mismatch between Java bytecodes and Tcl. Consider
the Tcl set command. Tcl variables behave very differently
than Java variables. I can't use a Java instruction like astore
(store object reference in local variable) to store a Tcl value into
a Tcl variable since it doesn't handle by itself such Tcl details as
variable traces, unset, or global. The best
I could do would be to translate a Tcl set command into
a sequence of several Java instructions that did the appropriate checks.
Unfortunately, the number of Java instructions to implement each Tcl
command would make the compiled program too big. A more realistic scheme
is to generate Java bytecodes that call one or more Java methods to
do the actual work for each Tcl command. With this number of Java method
calls, acceptable performance will depend on using a Java JIT (bytecode-to-machine
code) compiler. With Java JIT compilers becoming more widely available
this might be realistic. One possibility would be to translate the
relatively high-level Tcl bytecodes into Java bytecodes. However, there
is another problem with using Java bytecodes. Much of the interesting
code in Tcl/Tk and its extensions is in C. Java code can call native
methods implemented in C, and vice-versa, but this is awkward, not
portable, and the capability is disabled in Netscape (and many other
Java implementations) for safety reasons.
Brian Lewis, brian.lewis@eng.sun.com Last edited
August 21, 1997.