This chapter covers the advanced topic of balancing performance and
flexibility in a Dylan program. If you are writing a stand-alone program
and are comfortable with using type constraints as you would in a static
language, you do not need to read this chapter carefully. You may want
to skim the chapter, so that you have an idea of what options are
available to you in the future for larger or more complex projects.
We start out by describing what Dylan’s execution model is, and what we
mean by an efficiency model. The efficiency model can help a
programmer to choose the appropriate language features for a particular
problem. We also explore advanced features of Dylan that will let the
programmer negotiate with the compiler to trade away part of the
flexibility of the execution model for enhanced performance.
Dylan is a dynamic language — everything in Dylan is defined in terms of
a dynamic execution model. As we saw in Method dispatch,
the execution model of how a method is chosen when a generic function is
called with a particular set of arguments is highly dynamic: the
arguments are evaluated; the types of the arguments are determined; the
applicable methods are found and sorted according to specificity; and,
finally, the most specific, applicable method is called. This model
implies that values and types can change, and that methods can be added
right up until the generic function is called, and any of these changes
still have an effect on which method is ultimately chosen. This dynamism
— the model that value, number, and type of arguments; return values;
applicable method; and method choice and execution are all determined at
the last possible moment — is what gives the Dylan language its power.
You might think that this dynamism also means that Dylan must perform
poorly, because the only way to obey its execution model is to do a lot
of extra computation at run time. But not every program makes use of
dynamic features. Most functions accept and return a fixed number of
values (often they return only one), and those values are often of a
fixed or constrained type. Even programs that do use dynamism will not
require it everywhere. So, a good Dylan compiler will identify the
static parts of a program, and will compile them statically (that is, in
a manner that is competitive with what a compiler of any good static
language would do). To do that, the compiler uses a technique called
partial evaluation — operations that the compiler knows can have only one
possible outcome are evaluated at compile time. Thus, even though the
programmer can continue to think and
program in terms of Dylan’s dynamic execution model, the compiler will
generate efficient code when it can show that it can obtain the same
return value without carrying out the full process at run time.
For small projects — projects that can fit in a single library — the
compiler can analyze the entire project and generate code that is
competitive with any static language. If type constraints are used for
all module variables, slots, parameters, and return values (as they
would be in a static language), the compiler can generate code
equivalent to that generated by compilers for static languages. In the
remainder of this chapter, we examine how we can use type constraints,
limited types, open classes, open generic functions, domain sealing, and
primary classes to balance performance and flexibility in Dylan
programs.
Dylan is a powerful language: Many of the built-in, or primitive,
language operations are high-level operations, such as the
method-dispatch mechanism, the collection facility, and the exception
mechanism. Because of Dylan’s powerful features, it can be hard for the
programmer to develop an efficiency model — a model of the absolute or
relative cost of different approaches to a problem.
In contrast, in other languages, such as C, every language construct can
be explained directly in terms of a small number of machine
instructions. Although it may be easy to understand the performance of a
C program in terms of a simple model, programming in C is more work for
the programmer — the higher-level abstractions are not provided, and
must often be built from scratch.
For example, a C programmer expects that the run-time cost of calling a
function is the cost of possibly saving registers on a stack, passing
the arguments, executing a machine instruction for jumping to a
subroutine, and then executing a return instruction at the end of the
function; if it is a call through a function pointer, or a C++ virtual
function, the cost of an indirect jump must be added. In Dylan, the
story is more complicated, because Dylan has a more sophisticated
execution model: A call to a generic function might be much more
expensive in a dynamic situation, because computing the most specific
method could take much longer than would execution of the method itself.
To write efficient programs in Dylan, you have to understand what
constructs in the language can be expensive in time or space, and how
you can reduce those costs in common cases. This understanding is based
on an efficiency model — a conceptual model of how a program in Dylan
runs at a low level.
One problem with developing an efficiency model is that there is no
single way to implement many Dylan operations. Different compilers do
things in different ways, and certain compilers have multiple techniques
for compiling the same piece of code, depending on circumstances.
Nonetheless, we shall try to give an intuitive feel for which features
of Dylan are costly, and which features enable the compiler to make
optimizations.
In Dylan, variables, parameters, return values, and slots can all have
type constraints. Dylan’s dynamic nature means that type constraints can
be looser than is typical of a static language, or can even be deferred
altogether, in support of rapid prototyping or evolutionary development.
Type constraints in a dynamic language serve three primary purposes:
Type constraints are required for method dispatch: the methods of a
generic function are distinguished by the types of their required
arguments. The generic function chooses the applicable methods by
sorting them according to the type constraints of their parameters.
Type constraints can be used optionally to enforce program
restrictions. The compiler ensures that a variable, parameter, return
value, or slot will never take on a value that is incompatible with
the type constraint of the parameter, return value, or slot. (If the
compiler cannot prove at compile time that an incorrect type is
impossible, it inserts a run-time check to enforce the type
constraint.)
Type constraints allow the compiler to generate better code, because
they are a contract between the programmer and the compiler that the
variable, parameter, return value, or slot in question will never
take on a value that is incompatible with its type constraint; hence,
the compiler needs only to generate code for dealing with the
declared type.
Many Dylan compilers use type inferencing to determine the possible
types of variables, parameters, and slots that do not have explicit type
constraints. Within a library, the compiler essentially knows everything
about the variables and functions that are not exported at the library
interface — it can analyze all uses of variables, and all callers and
callees of functions. Through this analysis, the compiler can develop a
worst-case scenario of the possible types of every variable, parameter,
return value, and slot. As a result, these compilers generate efficient
code even if the programmer does not fully declare all types (as would
be required in most static languages).
Some compilers have a facility for generating performance warnings,
which inform you when type inferencing is not able to determine types
sufficiently to generate optimal code. Some compilers have a facility
for generating safety warnings, informing you when type inferencing
is not able to determine types sufficiently to omit run-time type
checking. As an example, consider these definitions (which are similar
to, but not exactly the same as, the definitions on which we settled in
Four Complete Libraries):
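// A sketch of those definitions; the 0 default and the exact method bodies
// here are illustrative, but the <integer> slot type and the truncate/ calls
// are the details that matter for this discussion.
define open abstract class <sixty-unit> (<object>)
  slot total-seconds :: <integer> = 0, init-keyword: total-seconds:;
end class <sixty-unit>;

define method decode-total-seconds
    (sixty-unit :: <sixty-unit>)
 => (max-unit :: <integer>, minutes :: <integer>, seconds :: <integer>)
  decode-total-seconds(abs(sixty-unit.total-seconds));
end method decode-total-seconds;

define method decode-total-seconds
    (total-seconds :: <integer>)
 => (max-unit :: <integer>, minutes :: <integer>, seconds :: <integer>)
  let (total-minutes, seconds) = truncate/(total-seconds, 60);
  let (max-unit, minutes) = truncate/(total-minutes, 60);
  values(max-unit, minutes, seconds);
end method decode-total-seconds;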
Because we made the choice to store total-seconds as an integer, and
because 60 is an integer constant, the compiler can infer that the
truncate/ calls are for an integer divided by an integer. There is no
need to consider whether to use floating-point or integer division.
If we were more concerned with testing out ideas, we might have left
unspecified the type of the total-seconds slot (implicitly, its type
would then be <object>), or, if we wanted to keep the option of
having times more accurate than just seconds, we might have specified
that its type was <real>, allowing for the possibility of using
floating-point numbers, which can express fractional seconds.
If we left the type of the total-seconds slot unspecified, the
compiler would need to check the arguments to truncate/, on the off
chance that an argument was not numeric at all. In some compilers, you
would be able to get a compile-time safety warning stating that a
run-time type error is possible (which, if unhandled, will result
in program failure), and that the check, and the possibility of a
run-time error, could be avoided if the compiler knew that
total-seconds was a <real>.
If we specified the type of the total-seconds slot as <real>,
the compiler would have to dispatch on the type of total-seconds,
using either floating-point or integer division as necessary. In some
compilers, we would be able to get a compile-time performance warning
stating that this dispatch could be omitted if the compiler knew that
total-seconds was of a more restricted type.
Note that the type of the return value of decode-total-seconds can be
inferred: max-unit and minutes must be <integer> (inferred from
the definition of truncate/), and seconds must have the same type
as total-seconds (<integer>, in our example); thus, the compiler does not have to
insert any type checks on the return values of decode-total-seconds.
Dylan enforces declared return types in the same way as it enforces
parameter types, by eliminating the check where type inferencing can
show it is not needed, and using the enforced types to make further
inferences.
From this example, you can see how the compiler can get a lot of mileage
from a small number of constraints, and how it can point you to the
places where further clarification will produce the most performance and
safety benefits. At the same time, Dylan does not require that you have
all your types thought out in advance of compiling the program; the
dynamic nature of the language allows Dylan to defer considering type
information until the program is actually running. In good Dylan
development environments, there is support for resolving and continuing
from run-time type errors during program development (rather than
requiring editing of the code and recompilation).
Remember that your code is more suited to reuse when it has fewer and
more general type constraints. If you have a compiler that can issue
safety and performance notes, try to generalize and minimize your type
constraints, being guided by your safety and performance requirements.
Often, just the constraints required to specify method applicability
will be sufficient for good safety and performance. Declaring the types
of module variables, slots, and return values of functions is also
useful and can help to document your program. Declaring types for
constants and local variables can be useful for enforcing program
correctness, but is unlikely to create optimization opportunities, and
might actually reduce performance, because the compiler will insert type
checks to enforce such constraints if they are overly restrictive.
Some of Dylan’s built-in types are extremely general. When these types
are used, the compiler’s type inferencing is thwarted, and less
efficient code will be generated. The place where this situation is most
obvious is in the <collection> types, where the elements of a
collection are essentially like multiple slots, all with the same type
constraint. For the built-in collections, elements typically have a
general default type (often simply <object>), and there can be an
arbitrary number of them. The limited mechanism is a way to specify
that you expect to store objects of a particular type in the collection,
and to specify how many elements will be in the collection.
As an example, in Vehicle containers, the generate-gates
method returns a <vector>. Without further information, the compiler
must assume that the vector might contain objects of any type. As a
result, the following code in the build-simple-airport method from
The airport-test.dylan file will be inefficient:
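// A sketch of that fragment; the names gates-per-terminal,
// default-gate-capacity, and taxiway are assumed from the airport example.
let gates = generate-gates(gates-per-terminal, default-gate-capacity);
for (gate in gates)
  gate.connected-to := vector(taxiway);
end for;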
Because the compiler can infer only that gates is a <vector>, it
must generate extra code to determine whether each gate has a
connected-to method on it. We can use limited types to constrain
gate-instances as follows:
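// A sketch of the constrained version; the parameter list of generate-gates
// and the use of reduce1 to size the result are assumptions, and the loop
// that fills in the gate instances is elided.
define constant <gate-vector> = limited(<vector>, of: <gate>);

define method generate-gates
    (gates-per-terminal :: <sequence>, default-gate-capacity :: <integer>)
 => (gates :: <gate-vector>)
  let result = make(<gate-vector>, size: reduce1(\+, gates-per-terminal));
  // ... fill result with <gate> instances, exactly as before ...
  result;
end method generate-gates;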
With the limited constraint of the return value of generate-gates,
the compiler can ensure that only gate objects will ever be stored in
the vector; hence, it can be sure that each gate will be a <gate>
and will have a connected-to method.
Note that limited-collection types are instantiable types; that is, you
can make an object of a limited type. This capability is different from
similar constructs in certain other languages, in which those constructs
are only an assertion about the range or type of values to be stored in
the collection. Having declared the return value of generate-gates to
be a <gate-vector>, it would be an error to return a <vector>
instead; hence, we changed the argument to make when constructing
result to be <gate-vector> instead of the original <vector>.
If <gate> and connected-to are not open (as described in
Open generic functions and Open classes), the compiler can infer that
connected-to is used here to set a slot in the gate instance and to
further optimize the code generated. We do not delve into the exact
details of what the compiler has to know to make this optimization, but
it is worth noting that, if either the class or the generic function
were open, the optimization could not be made.
Another use of limited types is to allow compact representations. We can
use limited with the built-in type <integer> to specify numbers with
a limited range that can be stored more compactly than integers. It is
especially useful to use a limited range in combination with a limited
collection; for example,
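// A sketch; the names <byte> and <byte-vector> are illustrative.
define constant <byte> = limited(<integer>, min: 0, max: 255);
define constant <byte-vector> = limited(<vector>, of: <byte>);
A compiler that exploits these declarations can store each element of such a
vector in a single byte, rather than in a full machine word.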
Many languages provide enumeration types both to enforce program
correctness and to provide more compact representation of
multiple-choice values. Dylan does not have a built-in enumeration type,
but you can easily construct enumerations using the type-union and
singleton type constructors.
For example, consider the <latitude> and <longitude> classes, where
there are only two valid values for the direction slot in each class.
Rather than enforcing the restrictions programmatically, as we did in
Virtual slots, we can create types that do the job for us:
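// A sketch of those definitions; the superclass name <directed-angle> and its
// own superclass <angle> follow the classes developed in earlier chapters and
// are assumptions here.
define abstract class <directed-angle> (<angle>)
  constant slot direction :: <symbol>, required-init-keyword: direction:;
end class <directed-angle>;

define constant <latitude-direction>
  = type-union(singleton(#"north"), singleton(#"south"));

define class <latitude> (<directed-angle>)
  keyword direction:, type: <latitude-direction>;
end class <latitude>;

define constant <longitude-direction>
  = type-union(singleton(#"east"), singleton(#"west"));

define class <longitude> (<directed-angle>)
  keyword direction:, type: <longitude-direction>;
end class <longitude>;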
Here, the abstract superclass specifies that the read-only slot
direction must be a <symbol>, and that it must be initialized when
an instance is created with the keyword direction:. The constant
<latitude-direction> is a type specification that permits only the
symbol #"north" or the symbol #"south". The class <latitude>
specifies that, when an instance of <latitude> is made, the initial
value must be of the <latitude-direction> type. We handled the
longitude case similarly.
The use of type-union and singleton to create enumeration types in
this fashion is common enough that the function one-of is usually
available in a utility library as a shorthand:
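// One plausible definition of this utility as a bare method.
define constant one-of
  = method (#rest objects) => (type :: <type>)
      apply(type-union, map(singleton, objects));
    end method;

// With one-of, <latitude-direction> could instead be written as
//   define constant <latitude-direction> = one-of(#"north", #"south");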
Some Dylan compilers will recognize the idiomatic use of type-union
and singleton to represent such enumerations more compactly. For
instance, a compiler could represent the direction slot of a latitude or
longitude as a single bit, using the getter and setter functions to
translate back and forth to the appropriate symbol.
The definition of the one-of constant is a method called a direct
method or bare method. It is the equivalent of a function in other
languages. A bare method does not create an implicit generic function,
and invoking a bare method does not use the method-dispatch procedure, but
rather calls the method directly. We choose to use a bare method here
because we are sure that one-of will never need method dispatch: it
performs the same operation independent of the types of its arguments.
The bare method serves to document this intent. If there were some
possibility of additional methods, it would be more perspicuous to use a
generic function, even if there is initially only one method. Most Dylan
compilers will generate equally efficient code for a bare method and for
a generic function with only one method, so the choice of which to use
should be based on whether or not it would ever make sense to have
additional methods that discriminate on parameter types.
The most important construct in the Dylan execution model is the
function call, because function calls are the most common operation in
the language. Remember that all slot accesses and assignments,
arithmetic operations, and collection accesses obey the execution model
of function calls, even if the syntax for them does not look like that
of function calls.
We have already discussed how Dylan compilers can optimize away run-time
checking of argument types and the overhead of method dispatch, and that
good compilers will generate equally efficient code for calls to
single-method generic functions or direct methods.
There is one additional optimization that good Dylan compilers will
make, which is enabled by a particular style of programming. If the
final operation in a method is a call to another function (called a
tail call) then the calling function can jump directly to the called
function, rather than using a call-and-return sequence. Thus, the return
from the called function returns to its caller’s caller.
As an example, consider this decode-total-seconds method:
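// The method from the earlier definitions that delegates to the method
// on <integer>; the abs call is illustrative.
define method decode-total-seconds
    (sixty-unit :: <sixty-unit>)
 => (max-unit :: <integer>, minutes :: <integer>, seconds :: <integer>)
  decode-total-seconds(abs(sixty-unit.total-seconds));
end method decode-total-seconds;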
The inner call to decode-total-seconds can be a direct jump rather
than a function call, because the compiler can infer which method should
be called and that the return values already have the correct
constraints.
In addition to specifying the types of the parameters and return values
of methods, you can specify the types of the parameters and return
values of a generic function. You usually restrict the parameter types
of a generic function to establish the contract of the generic
function — that is, to define the domain of arguments that the generic
function is intended to handle, and the domain of the values that it
will return.
If we define a method without also defining a generic function, Dylan
creates an implicit generic function with the most general types for
each parameter and return value that are compatible with the method. For
example, assume that we defined a method for next-landing-step, and
did not explicitly create a generic function for it. The method is as
follows:
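// A sketch of such a method; the specializer <sky>, the return types, and the
// names <runway> and $average-flying-time are assumed from the airport
// example (false-or(t) abbreviates type-union(t, singleton(#f))).
define method next-landing-step
    (storage :: <sky>, aircraft :: <aircraft>)
 => (next-storage-class :: false-or(<class>), duration :: false-or(<time-offset>))
  // Decide whether the aircraft can move on to a runway (details elided)
  values(<runway>, $average-flying-time);
end method next-landing-step;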
When we define a method without also defining a generic function, the
compiler will generate an implicit generic function for us, which, in
this case, will be as though we had defined the generic function like
this:
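// The implicit generic function, sketched with the most general compatible types.
define generic next-landing-step
    (storage :: <object>, aircraft :: <object>)
 => (next-storage-class :: <object>, duration :: <object>);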
In The schedule.dylan file, where we did define a generic function, we
used a simple definition, just documenting the number of arguments, and
giving them mnemonic names:
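// A sketch; the mnemonic parameter names are assumptions.
define generic next-landing-step (container, vehicle);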
Because we did not specify types of the arguments or return values, they
default to <object>, just as they did in the preceding implicit
generic function.
Although the generic function that we wrote does prevent us from
defining methods with the wrong number of arguments, it does not
constrain the types of those arguments or the format or type of return
values in any way. A sophisticated compiler may be able to make
inferences based on the methods that we define, but we could both aid
the compiler and more clearly document the protocol of
next-landing-step by specifying the types of the parameters and return
values in the definition of the generic function:
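// A sketch of a fully typed definition; the particular return types shown
// (using the false-or shorthand) are assumptions.
define generic next-landing-step
    (storage :: <vehicle-storage>, aircraft :: <aircraft>)
 => (next-storage-class :: false-or(<class>), duration :: false-or(<time-offset>));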
Now, the compiler can help us. If we define a method whose arguments are
not a subclass of <vehicle-storage> and a subclass of <aircraft>
(for example, if we provided the arguments in the wrong order), the
compiler will report the error. Furthermore, the compiler can use the
value declaration to detect errors in the return values (for example, if
we returned only a single value or returned a value of the wrong type).
Finally, the compiler can be asked to issue a warning if there is a
subclass of the argument types for which no method is applicable.
In addition to establishing a contract, specifying the types of the
parameters and return values of generic functions can allow the compiler
to make additional inferences, as described in Type constraints
with regard to truncate/. In the absence of other information,
the compiler is limited in the optimizations that it can make based
solely on the parameter types in the generic function, so it is
generally best not to restrict artificially the types of a generic
function, but rather to use the restricted types to document the
generic function’s protocol.
By default, generic functions are sealed. When you use define generic, that is the same as using define sealed generic. No other
library can add methods to a sealed generic function — not even on new
classes that they may introduce. Methods cannot be added to, or removed
from, the generic function at run time. The only methods on a sealed
generic function are the methods that are defined in the library where
the generic function itself is defined. Because of the restrictions on a
sealed generic function, the compiler, using type-inference information,
can usually narrow the choice of applicable methods for any particular
call to the generic function, eliminating most or all of the overhead of
run-time dispatching that would normally be expected of a dynamic
language.
We saw in Libraries and Modules that we must define a generic function
that is part of a shared protocol using define open generic, so that
libraries sharing the protocol can implement the protocol for the
classes that they define, by adding methods. If we do not define the
generic function to be open, other libraries are prohibited from adding
methods to the generic function, which would make it useless as a
protocol. Unfortunately, a generic function that is open cannot be
optimized. Even when the compiler can infer the exact types of the
arguments to the generic function at a particular call, it must still
produce code that handles the possibility of methods being added or
removed, even at run time.
Because open generic functions cannot be optimized, you should use them
only when necessary. You need to balance the division of your program
into libraries against the need to export and open more generic
functions if the program is too finely divided. This balance is
illustrated by the considerations we made in designing a protocol in
Protocol design. When we chose to split the time and angle
libraries, we were forced to create the say protocol library and open
the generic function say. In Sealed domains, we show how to regain
certain optimizations when you decide that opening a generic function is required.
Note that generic functions that are defined implicitly in a library —
such as those that are defined when you define only a single method, or
those that are defined for slot accessors — are sealed by default. If
you expect other libraries to add methods to one of these implicit
generic functions, you must define the generic function explicitly to be
open using define open generic.
By default, classes are sealed. When you use define class, that is
the same as using define sealed class. Other libraries cannot
directly subclass a sealed class — they cannot define new classes that
have your sealed class as a direct superclass. The only direct
subclasses of the class are those subclasses that are defined in the
library where the class itself is defined. Extensive optimization
opportunities occur when the methods of a sealed generic function are
specialized on sealed classes. In this case, the compiler can usually
choose the correct method of the generic function to call at compile
time, eliminating any run-time overhead for using a generic function.
We saw in Libraries and Modules that we must define a class that is a
shared substrate, such as <sixty-unit>, using define open class,
if the libraries sharing the substrate are expected to subclass the
class. If we did not define the class to be open, other libraries would
be prevented from subclassing it — which might be reasonable if the
substrate were not intended to be extended by subclassing.
Unlike an open generic function, an open class does not prevent all
optimization. If a generic function has a method applicable to an open
class, but the generic function is sealed, then the compiler might still
be able to optimize method dispatch if that compiler can infer the types
of the arguments to the generic function at a particular call.
Sometimes, the dispatch code will be slightly less optimal, because it
must allow for arbitrary subclasses, rather than a fixed set of
subclasses; in general, however, opening a class is less costly than is
opening a generic function.
Note that, although you cannot directly subclass a sealed class from
another library, you can subclass a sealed class in the library that
defines the sealed class. It may not be obvious, but a corollary of this
rule of sealing is that you can define an open subclass of a sealed
class in the library that defines the sealed class. Using a sealed class
with an open subclass is one simple way to get both flexibility and
efficiency — the classes in the sealed branch will be optimized by the
compiler, while the open subclass can be exported for other libraries to
build on and extend.
When you define a protocol that is meant to be extended by many
libraries, both the base classes and the generic functions that make up
the protocol must be open. This simple exigency might make it seem that
there is no hope of optimizing such a protocol — however, there is hope.
You use the define sealed domain form to seal selectively subsets or
branches of the protocol, permitting the compiler to make all the
optimizations that would be possible if the classes and generic
functions were sealed, but only for the particular subset or branch in
question.
As an example, consider the say protocol as used in the time
library. Because the say generic function is defined to be open, even
if the compiler can infer that the argument to say is a <time> or
<time-offset>, it must insert code to choose the appropriate method
to call at run time on the off chance that some other library has added
or removed methods for say. The solution is to add the following
definition to the time library:
// Declare the say generic function sealed, for all time classes
define sealed domain say (<time>);
This statement is essentially a guarantee to the compiler that the only
methods on say that are applicable to <time> objects (and also to
<time-of-day> and <time-offset> objects, because <time-of-day> and
<time-offset> are subclasses of <time>) are those that are defined
explicitly in the time library (and in any libraries from which that
one imports). Thus, when the compiler can prove that the argument to
say is a <time-offset>, it can call the correct method directly,
without any run-time dispatch overhead.
Another way to get the same effect as a sealed domain, which is also
self-documenting, is to use define sealed method when defining
individual methods on the protocol. So, for instance, in the case of the
time library, we might have defined the two methods on say as
follows:
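// A sketch of the two sealed methods; the bodies (the format-out calls and
// the past? test) follow the time library from earlier chapters and are
// assumptions here.
define sealed method say (time :: <time>) => ()
  let (hours, minutes) = decode-total-seconds(time);
  format-out("%d:%s%d", hours, if (minutes < 10) "0" else "" end, minutes);
end method say;

define sealed method say (time :: <time-offset>) => ()
  format-out("%s ", if (past?(time)) "minus" else "plus" end);
  next-method();
end method say;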
Defining a sealed method is the same as defining the generic function to
be sealed over the domain of the method’s specializers. In effect, this
technique says that you do not intend anyone to add more specific
methods in that domain, or to create classes that would change the
applicability of the sealed methods.
With either the definesealeddomain form or the sealed methods, the
use of say on <time> objects will be as efficient as it would be
were say not an open generic function after all. At the same time,
other libraries that create new classes can still extend the say
protocol to cover those classes.
Sealed domains impose restrictions on the ability of other libraries to
create new methods, to remove existing methods, and to create new classes:
You cannot add methods to an open generic function imported from another
library that would fall into the sealed domain of any other library.
You can avoid this restriction by ensuring that at least one of the
specializers of your method is a subtype of a type defined in your
library.
When you seal a domain of a generic function imported from another
library, you will not cause conflicts with other libraries, as long as
both of the following conditions hold:
At least one of the types in the sealed domain is a subtype of a
class defined in your library
No additional subtypes can be defined for any of the types in the
sealed domain
In the case of a type that is a class, the first condition means that
you must have defined either the class or one of its superclasses in
your library. The second condition means that the classes in the domain
must not have any open subclasses (a degenerate case of which is a leaf
class — a class with no subclasses at all).
If you need to seal a domain over a class that has open subclasses, you
will need a thorough understanding of the sealing constraints detailed
in The Dylan Reference Manual, but these two simple rules should
handle many common cases.
In our example, we obeyed both rules of thumb: our methods for say are
on classes we defined, and our sealing was over classes that will not be
further subclassed. The rules of thumb not only keep you from violating
sealing constraints, they make for good protocol design: a library that
extends a protocol really should extend it only for classes it fully
understands, which usually means classes it creates.
As an example of the restriction on subclassing open classes involved in
a sealed domain, if the <time> class were an open class, we still
could not add the following class in a library that used the time
library:
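// A sketch of the offending class definition.
define class <place-and-time> (<time>, <position>)
end class <place-and-time>;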
As far as the compiler is concerned, it “knows” that the only say
method applicable to a <time> is the one in the time library. (That
is what we have told it with our sealed domain definition.) It would
be valid to pass a <place-and-time> object as an argument to a
function that accepted <time> objects, but within that function the
compiler might have already optimized a call to say to the method for
<time> objects (based on <time> being in the sealed domain of
say). But there is also a method for say on <position>, and,
more important, we probably will want to define a method specifically for
<place-and-time>. Because of this ambiguity, the class
<place-and-time> cannot be defined in a separate library, and the
compiler will signal an error.
Note that the class <place-and-time> could be defined in the time
library. The compiler can deal correctly with classes that may straddle
a sealed domain, if they are known in the library where the sealed
domain is defined. It would also be valid to subclass <time> in any
way that did not change the applicability of methods in any sealed
generic-function domains that include <time>. The actual rule
involved depends on an analysis of the exact methods of the generic
function, and the rule is complicated enough that you should just rely
on your compiler to detect illegal situations.
Dylan does allow you to omit definition of a generic function. As we
mentioned earlier, if you define a method without also defining a
generic function, Dylan implicitly creates a generic function with the
most general types for the parameters and return values that are
compatible with the method. The most common case of implicit generic
functions is for the slot-accessor methods that are created when a new
class is defined. Because these generic functions typically have only a
single method and are sealed by default (see Open generic functions),
the compiler can make extensive optimizations for slot accessors, ideally
making slot access no more expensive than an array reference or
structure-member access in other languages.
Even when a slot is inherited by subclassing, a good Dylan compiler will
use a coloring algorithm to assign slots to the same offset in each
subclass, keeping the cost of slot access to a minimum. You can use
primary classes (see Primary classes) to guarantee efficient slot
access. When a program defines explicit methods for a slot getter or
setter generic function, of course, the overhead is greater.
In the <sixty-unit> class, we specified an initial value for
total-seconds; hence, there is no need to check that the slot has
been initialized before it is accessed. In some situations, it may not
be feasible to give a default or initial value for a slot. Dylan permits
this omission and will ensure that the slot is initialized before that
slot is used; of course, this check does not come for free, so it is
preferable to provide initial values where possible. In fact, because we
always expect to initialize the total-seconds slot when we make a new
<sixty-unit>, it would be more accurate to specify <sixty-unit> as
follows:
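// A sketch of that definition.
define open abstract class <sixty-unit> (<object>)
  slot total-seconds :: <integer>, required-init-keyword: total-seconds:;
end class <sixty-unit>;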
That is, rather than giving the slot an initial value of 0 and an
optional init-keyword:, we simply require that the slot be initialized
when we make a <sixty-unit> object. Of course, the initial value must
obey the type constraint of <integer>. The compiler can still make the
inference that the slot will always be initialized and will always have
an integer value.
Always initializing slots, either with a default value or required
init-keyword, will make slot access efficient.
Finally, in many cases, slots hold values that will not change over the
lifetime of each instance (although they may be different values for
each instance). In the case of the <sixty-unit> class, we never change
the value of total-seconds. When adding two instances, we create a
new one to hold the new value, rather than changing one of the argument
instances (that way, we do not have to worry about changing an instance
that may still be in use by some other part of the program). In such
cases, declaring the slot to be constant both documents and enforces
this intent. Furthermore, the compiler can often make additional
optimizations for slots that are known never to be modified. The final
definition of <sixty-unit> is as follows:
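// A sketch of the final definition, with the slot declared constant.
define open abstract class <sixty-unit> (<object>)
  constant slot total-seconds :: <integer>, required-init-keyword: total-seconds:;
end class <sixty-unit>;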
Classes have one additional variation that you can use to optimize
performance. A class that is defined as primary allows the compiler to
generate the most efficient code for accessing the slots defined in the
primary class (whether the accessor is applied to the primary class or
to one of that class’s subclasses). However, a primary class cannot be
combined with any other primary class (unless one is a subclass of the
other). This restriction implies that you should delay declaring a class
to be primary until you are sure of your inheritance design. Also,
because sealed classes are already highly optimized, the primary
declaration is of most use for open classes.
As an example, consider the class <sixty-unit>, and its slot
total-seconds, as used in this method for decode-total-seconds:
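// The decode-total-seconds method shown earlier.
define method decode-total-seconds
    (sixty-unit :: <sixty-unit>)
 => (max-unit :: <integer>, minutes :: <integer>, seconds :: <integer>)
  decode-total-seconds(abs(sixty-unit.total-seconds));
end method decode-total-seconds;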
Although the generic function for the slot accessor total-seconds is
sealed, and it is trivial for the compiler to infer that its argument is
a <sixty-unit> in the call sixty-unit.total-seconds, the compiler
cannot emit the most efficient code for that call, since
<sixty-unit> is declared open. Because an open class could be mixed
with any number of other classes, there is no guarantee that the slots
of every object that is a <sixty-unit> will always be stored in the
same order — there is no guarantee that total-seconds will always
be the first slot in an object that is an indirect instance of
<sixty-unit>, for instance.
Declaring a class primary is essentially making a guarantee that the
compiler can always put the primary class’s slots in the same place in
an instance, and that any other superclasses will have to adjust:
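// The same definition with the primary adjective added.
define open abstract primary class <sixty-unit> (<object>)
  constant slot total-seconds :: <integer>, required-init-keyword: total-seconds:;
end class <sixty-unit>;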
By adding the primary declaration to the definition, any library that
subclasses <sixty-unit> is guaranteed to put total-seconds at the
same offset. Hence, the compiler can turn the call
sixty-unit.total-seconds into a single machine instruction (load with
constant offset), without concern over which subclass of <sixty-unit>
was passed as an argument.
It is permissible to make subclasses of a primary class also primary,
essentially freezing the assignment of all the slots in the subclass
too. What is not permissible is to multiply inherit from more than one
primary class; as you can see, such behavior would lead to a conflict
between the fixed slot assignments.
Because primary classes restrict extension in this way, you should use
them sparingly in libraries intended to be software components. Primary
classes are of most benefit in large, modular programs, where all the
clients of each component are known, and the need for extensibility is
bounded; typically that occurs toward the end of a project, when you are
tuning for performance.
In this section, we review additional techniques that compilers can use
to generate code that obeys the Dylan execution model, but is more
efficient than a straightforward implementation of that model might
suggest. Knowing about these techniques can help you to evaluate
different vendors’ compilers. You will have to consult the documentation
of your particular implementation to discover whether or not these
techniques are used.
In addition to using type inferencing and sealed domains, another way to
speed up generic function calls when they must dispatch at run time is
to cache the return values of previous calls. So, for example, the first
time that a given generic function is called with certain classes of
arguments, the full sorted sequence of applicable methods is computed;
after that, however, it need only be looked up in a table. Thus, if
the generic function is called often with the same types of
arguments, most calls will be fast. This technique is used in other
object-oriented languages, such as Smalltalk and CLOS, and is useful for
speeding up completely dynamic situations. Most good Dylan compilers
will use some form of cached dispatching.
A second form of cached dispatching is called call-site caching.
Although a generic function may have many calls throughout a program,
often the types of arguments passed are directly related to where (that
is, in what other method) the call is made. Some Dylan compilers will
cache the types and methods of each call at the point of call, and will
use this cache to avoid dispatch if the same types are passed as
arguments in a subsequent call from the same place.
Efficiency of keyword arguments and of multiple values
Keyword arguments are a powerful and flexible, but potentially
expensive, feature of Dylan. The processing of keywords and values at
run time can be an expensive operation, especially if many keywords are
used. A Dylan compiler can pass keyword arguments as efficiently as it
can required arguments, if the called function is known at compilation
time.
Returning multiple values again raises performance issues. In some
implementations of Dylan, there is an extra cost for returning more than
one value; in others, the cost is associated with calling a function
that does not declare how many values it returns. When the compiler
knows what function is being called, these costs usually can be
eliminated, but certain costs may still exist — for example, certain
implementations may not optimize tail calls between functions that
return different numbers of values.
Dylan uses automatic storage-management: programmers explicitly
allocate objects, and hence memory, but deallocation is automatic and
occurs after all references to an object are gone. The process of
reclaiming memory when objects are no longer in use is known as garbage
collection.
There are strong advantages to automatic storage-management. With manual
storage-management, small program bugs, such as freeing of an object
that is still in use, can cause subtle bugs that lead to crashes in
parts of the program unrelated to where the real problem lies. Dylan is
able to guarantee that all programs fail in disciplined ways, usually
with exceptions, because the type system and memory management are safe.
But automatic storage-management may create performance concerns.
Although early implementations of garbage collection were infamously
slow, modern garbage collectors are usually fast enough that using one
should not raise concerns for most programs. But some programs with
specialized or tuned use of memory may run slower with automatic
management.
Whether storage management is automatic or manual, the use of memory
raises performance issues. Every allocation of memory takes time,
including the time to reclaim unused memory; either the programmer must
free it explicitly, or the garbage collector has to do more work.
It is obvious that calling a function such as make, vector, or
pair in Dylan allocates memory, but there are operations that
implicitly use memory. For example, creating a closure (see
Closures) will usually cause Dylan to allocate memory for
the closure.
On the other hand, sometimes the compiler is able to prove that an
object is never used after the function that creates it returns. In a
good compiler, such objects are allocated on the stack, and are
reclaimed automatically when the function exits.
A good Dylan development environment will have tools that help you to
meter and profile memory usage, so that you can adjust your program to
utilize memory efficiently.
Inlining, constant folding, and partial evaluation
One optimization that is common in many computer languages is inlining.
Inlining replaces a call to a known function with the body of the
function. Inlining is an important optimization in Dylan, because almost
all Dylan operations — slot access, array indexing, and collection
iteration — involve function calls.
All good Dylan compilers, when compiling for speed, can be aggressive
about inlining any computations, as long as doing so would not make a
program grow too large. Constant folding (evaluating expressions
involving constant values at compile time) and inlining are just two of
the partial-evaluation techniques that you should expect to find in
any good Dylan compiler.
The quality of type inference can vary greatly among Dylan compilers.
Type inference — like most forms of program analysis — works best with
simple, straightforward code. Some constructs that are typically
difficult for type inference are assignment and the calling of a block
exit function from outside the method that defines it.
One other way in which type constraints can be helpful is that they
permit the compiler to choose efficient representations for objects.
Most Dylan objects contain enough information for Dylan to determine
their class — this is an important feature for the dynamic aspects
of the language. But suppose we have a 1000 x 1000
limited(<array>, of: <single-float>). There is no reason that each of
the numbers in that array should also contain a reference to the
<single-float> class; the one reference in the limited type is
sufficient. (Note that, if we had used of: <real> or of: <float>, we
would have needed
more information, since multiple classes would have been possible.)
When an object is represented in such a way, often many of the
operations on it can be optimized. For example, the conventional
representation of <double-float> will usually require an
indirect-memory-reference machine instruction to get at the actual
number, so adding two such objects is one floating-point machine
instruction and two load-from-memory machine instructions; if a direct
representation is used, just the add machine instruction is needed.
Further, if the return value is saved in a variable for which type
information is not available, it may be necessary to allocate memory
dynamically to store the return value.
Types that may have more efficient representations include certain
integer classes, the floating-point classes, characters, and Booleans.
Precise declarations about these types, especially in slots and limited
collections, can lead to significant improvements in both the time and
memory needed to run a program.
The most important point about performance is that it is important to
pay attention to efficiency during the entire design and development
cycle of a project. During the design phase, try to ensure that the
algorithms chosen have the right asymptotic behavior and constant
factors, and that it is possible to implement the needed operations
efficiently. During the implementation phase, use the language
constructs that most clearly express what the program is doing. Once the
program is working correctly, it is then time to add type and sealing
declarations, and to use metering and profiling tools to find and
rewrite heavily used, slow parts of the program, in order to improve the
performance.
One of the most important considerations when programming is not to
worry about performance too soon. It is always more important that your
design and implementation be clear and correct, first. There is no value
in arriving at an answer with lightning speed, if it turns out to be the
wrong answer.
In this chapter, we covered the following:
We showed how Dylan can balance performance and flexibility to
support a range of programming requirements.
We showed how type constraints affect performance.
We showed how limited types can improve performance.
We showed how open generic functions provide modularity and
flexibility.
We showed how open classes provide modularity and flexibility.
We showed how sealed generic function domains mitigate the
performance penalty of open classes and generic functions.
We showed how primary classes permit efficient slot access.
We presented both an execution model and an efficiency model, which
together provide a conceptual picture of how a program in Dylan runs
and of the relative costs of different program elements.