Performance and Flexibility#
This chapter covers the advanced topic of balancing performance and flexibility in a Dylan program. If you are writing a stand-alone program and are comfortable with using type constraints as you would in a static language, you do not need to read this chapter carefully. You may want to skim the chapter, so that you have an idea of what options are available to you in the future for larger or more complex projects.
We start out by describing what Dylan’s execution model is, and what we mean by an efficiency model. The efficiency model can help a programmer to choose the appropriate language features for a particular problem. We also explore advanced features of Dylan that will let the programmer negotiate with the compiler to trade away part of the flexibility of the execution model for enhanced performance.
Execution model#
Dylan is a dynamic language — everything in Dylan is defined in terms of a dynamic execution model. As we saw in Method dispatch, the execution model of how a method is chosen when a generic function is called with a particular set of arguments is highly dynamic: the arguments are evaluated; the types of the arguments are determined; the applicable methods are found and sorted according to specificity; and, finally, the most specific, applicable method is called. This model implies that values and types can change, and that methods can be added right up until the generic function is called, and any of these changes still have an effect on which method is ultimately chosen. This dynamism — the model that value, number, and type of arguments; return values; applicable method; and method choice and execution are all determined at the last possible moment — is what gives the Dylan language its power.
You might think that this dynamism also means that Dylan must perform poorly, because the only way to obey its execution model is to do a lot of extra computation at run time. But not every program makes use of dynamic features. Most functions accept and return a fixed number of values (often they return only one), and those values are often of a fixed or constrained type. Even programs that do use dynamism will not require it everywhere. So, a good Dylan compiler will identify the static parts of a program, and will compile them statically (that is, in a manner that is competitive with what a compiler of any good static language would do). To do that, the compiler uses a technique called partial evaluation — operations that can be evaluated at compile time (that the compiler knows can have only one outcome), will be done at compile time. Thus, even though the programmer can continue to think and program in terms of Dylan’s dynamic execution model, the compiler will generate efficient code when it can show that it can obtain the same return value without carrying out the full process at run time.
For small projects — projects that can fit in a single library — the compiler can analyze the entire project and generate code that is competitive with any static language. If type constraints are used for all module variables, slots, parameters, and return values (as they would be in a static language), the compiler can generate code equivalent to that generated by compilers for static languages. In the remainder of this chapter, we examine how we can use type constraints, limited types, open classes, open generic functions, domain sealing, and primary classes to balance performance and flexibility in Dylan programs.
Efficiency model#
Dylan is a powerful language: Many of the built-in, or primitive, language operations are high-level operations, such as the method-dispatch mechanism, the collection facility, and the exception mechanism. Because of Dylan’s powerful features, it can be hard for the programmer to develop an efficiency model — a model of the absolute or relative cost of different approaches to a problem.
In contrast, in other languages, such as C, every language construct can be explained directly in terms of a small number of machine instructions. Although it may be easy to understand the performance of a C program in terms of a simple model, programming in C is more work for the programmer — the higher-level abstractions are not provided, and must often be built from scratch.
For example, a C programmer expects that the run-time cost of calling a function is the cost of possibly saving registers on a stack, passing the arguments, executing a machine instruction for jumping to a subroutine, and then executing a return instruction at the end of the function; if it is a call through a function pointer, or a C++ virtual function, the cost of an indirect jump must be added. In Dylan, the story is more complicated, because Dylan has a more sophisticated execution model: A call to a generic function might be much more expensive in a dynamic situation, because computing the most specific method could take much longer than would execution of the method itself.
To write efficient programs in Dylan, you have to understand what constructs in the language can be expensive in time or space, and how you can reduce those costs in common cases. This understanding is based on an efficiency model — a conceptual model of how a program in Dylan runs at a low level.
One problem with developing an efficiency model is that there is no single way to implement many Dylan operations. Different compilers do things in different ways, and certain compilers have multiple techniques for compiling the same piece of code, depending on circumstances. Nonetheless, we shall try to give an intuitive feel for which features of Dylan are costly, and which features enable the compiler to make optimizations.
Type constraints#
In Dylan, variables, parameters, return values, and slots can all have type constraints. Dylan’s dynamic nature means that type constraints can be looser than is typical of a static language, or can even be deferred altogether, in support of rapid prototyping or evolutionary development. Type constraints in a dynamic language serve three primary purposes:
Type constraints are required for method dispatch: the methods of a generic function are distinguished by the types of their required arguments. The generic function chooses the applicable methods by sorting them according to the type constraints of their parameters.
Type constraints can be used optionally to enforce program restrictions. The compiler ensures that a variable, parameter, return value, or slot will never take on a value that is incompatible with the type constraint of the parameter, return value, or slot. (If the compiler cannot prove at compile time that an incorrect type is impossible, it inserts a run-time check to enforce the type constraint.)
Type constraints allow the compiler to generate better code, because they are a contract between the programmer and the compiler that the variable, parameter, return value, or slot in question will never take on a value that is incompatible with its type constraint; hence, the compiler needs only to generate code for dealing with the declared type.
Many Dylan compilers use type inferencing to determine the possible types of variables, parameters, and slots that do not have explicit type constraints. Within a library, the compiler essentially knows everything about the variables and functions that are not exported at the library interface — it can analyze all uses of variables, and all callers and callees of functions. Through this analysis, the compiler can develop a worst-case scenario of the possible types of every variable, parameter, return value, and slot. As a result, these compilers generate efficient code even if the programmer does not fully declare all types (as would be required in most static languages).
Some compilers have a facility for generating performance warnings, which inform you when type inferencing is not able to determine types sufficiently to generate optimal code. Some compilers have a facility for generating safety warnings, informing you when type inferencing is not able to determine types sufficiently to omit run-time type checking. As an example, consider these definitions (which are similar to, but not exactly the same as, the definitions on which we settled in Four Complete Libraries):
define abstract open class <sixty-unit> (<object>)
slot total-seconds :: <integer> = 0, init-keyword: total-seconds:;
end class <sixty-unit>;
define method decode-total-seconds
(sixty-unit :: <sixty-unit>)
=> (hours :: <integer>, minutes :: <integer>, seconds :: <integer>)
let total-seconds = abs(sixty-unit.total-seconds);
let (total-minutes, seconds) = truncate/(total-seconds, 60);
let (max-unit, minutes) = truncate/(total-minutes, 60);
values (max-unit, minutes, seconds);
end method decode-total-seconds;
Because we made the choice to store total-seconds
as an integer, and
because 60 is an integer constant, the compiler can infer that the
truncate/ calls are for an integer divided by integer. There is no
need to consider whether to use floating-point or integer division.
If we were more concerned with testing out ideas, we might have left
unspecified the type of the total-seconds
slot (implicitly, its type
would then be <object>
), or, if we wanted to keep the option of
having times more accurate than just seconds, we might have specified
that its type was <real>
, allowing for the possibility of using
floating-point numbers, which can express fractional seconds.
If we left the type of the total-seconds
slot unspecified, the
compiler would need to check the arguments to truncate/
, on the off
chance that an argument was not numeric at all. In some compilers, you
would be able to get a compile-time safety warning stating that a
run-time type error is possible (which, if unhandled, will result
in program failure), and that the check, and the possibility of a
run-time error, could be avoided if the compiler knew that
total-seconds
was a <real>
.
If we specified the type of the total-seconds
slot as <real>
,
the compiler would have to dispatch on the type of total-seconds
,
using either floating-point or integer division as necessary. In some
compilers, we would be able to get a compile-time performance warning
stating that this dispatch could be omitted if the compiler knew that
total-seconds
was of a more restricted type.
Note that the type of the return value of decode-total-seconds
can be
inferred: max-unit
and minutes
must be <integer>
(inferred from
the definition of truncate/
), and seconds
must have the same type
as total-seconds
(<integer>
, in our example); thus, the compiler does not have to
insert any type checks on the return values of decode-total-seconds
.
Dylan enforces declared return types in the same way as it enforces
parameter types, by eliminating the check where type inferencing can
show it is not needed, and using the enforced types to make further
inferences.
From this example, you can see how the compiler can get a lot of mileage from a small number of constraints, and how it can point you to the places where further clarification will produce the most performance and safety benefits. At the same time, Dylan does not require that you have all your types thought out in advance of compiling the program; the dynamic nature of the language allows Dylan to defer considering type information until the program is actually running. In good Dylan development environments, there is support for resolving and continuing from run-time type errors during program development (rather than requiring editing of the code and recompilation).
Remember that your code is more suited to reuse when it has fewer and more general type constraints. If you have a compiler that can issue safety and performance notes, try to generalize and minimize your type constraints, being guided by your safety and performance requirements. Often, just the constraints required to specify method applicability will be sufficient for good safety and performance. Declaring the types of module variables, slots, and return values of functions is also useful and can help to document your program. Declaring types for constants and local variables can be useful for enforcing program correctness, but is unlikely to create optimization opportunities, and might actually reduce performance, because the compiler will insert type checks to enforce such constraints if they are overly restrictive.
Limited types#
Some of Dylan’s built-in types are extremely general. When these types
are used, the compiler’s type inferencing is thwarted, and less
efficient code will be generated. The place where this situation is most
obvious is in the <collection>
types, where the elements of a
collection are essentially like multiple slots, all with the same type
constraint. For the built-in collections, elements typically have a
general default type (often simply <object>
), and there can be an
arbitrary number of them. The limited
mechanism is a way to specify
that you expect to store objects of a particular type in the collection,
and to specify how many elements will be in the collection.
As an example, in Vehicle containers, the generate-gates
method returns a <vector>
. Without further information, the compiler
must assume that that vector might contain objects of any types. As a
result, the following code in the build-simple-airport
method from
The airport-test.dylan file, will be inefficient:
let gates = generate-gates(gates-per-terminal, capacity);
...
for (gate in gates)
gate.connected-to := taxiway-vector;
end for;
Because the compiler can infer only that gates
is a <vector>
, it
must generate extra code to determine whether each gate
has a
connected-to
method on it. We can use limited types to constrain
gate-instances
as follows:
define constant <gate-vector> = limited(<vector>, of: <gate>);
define method generate-gates
(gates-per-terminal :: <vector>, default-gate-capacity :: <size>)
=> (gates :: <gate-vector>)
let result = make(<gate-vector>, size: reduce1(\+,
gates-per-terminal));
...
values(result);
end method generate-gates;
With the limited constraint of the return value of generate-gates
,
the compiler can ensure that only gate objects will ever be stored in
the vector; hence, it can be sure that each gate
will be a <gate>
and will have a connected-to
method.
Note that limited-collection types are instantiable types; that is, you
can make an object of a limited type. This capability is different from
similar constructs in certain other languages, in which those constructs
are only an assertion about the range or type of values to be stored in
the collection. Having declared the return value of generate-gates
to
be a <gate-vector>
, it would be an error to return a <vector>
instead; hence, we changed the argument to make
when constructing
result
to be <gate-vector>
instead of the original <vector>
.
If <gate>
and connected-to
are not open (as described in
Open generic functions and Open classes), the compiler can infer that
connected-to
is used here to set a slot in the gate instance and to
further optimize the code generated. We do not delve into the exact
details of what the compiler has to know to make this optimization, but
it is worth noting that, if either the class or the generic function
were open, the optimization could not be made.
Another use of limited types is to allow compact representations. We can
use limited
with the built-in type <integer>
to specify numbers with
a limited range that can be stored more compactly than integers. It is
especially useful to use a limited range in combination with a limited
collection; for example,
define constant <signed-byte-vector>
= limited(<simple-vector>,
of: limited(<integer>, min: -128, max 127));
In the preceding example, we define a type that can be represented as a one-dimensional array of 8-bit bytes.
Enumerations#
Many languages provide enumeration types both to enforce program
correctness and to provide more compact representation of
multiple-choice values. Dylan does not have a built-in enumeration type,
but you can easily construct enumerations using the type-union
and
singleton
type constructors.
For example, consider the <latitude>
and <longitude>
classes, where
there are only two valid values for the direction
slot in each class.
Rather than enforcing the restrictions programmatically, as we did in
Virtual slots, we can create types that do the job for us:
define abstract class <directed-angle> (<sixty-unit>)
slot direction :: <symbol>, required-init-keyword: direction:;
end class <directed-angle>;
define constant <latitude-direction>
= type-union(singleton(#"north"), singleton(#"south"));
define class <latitude> (<directed-angle>)
keyword direction:, type: <latitude-direction>;
end class <latitude>;
define constant <longitude-direction>
= type-union(singleton(#"east"), singleton(#"west"));
define class <longitude> (<directed-angle>)
keyword direction:, type: <longitude-direction>;
end class <longitude>;
Here, the abstract superclass specifies that the read-only slot
direction
must be a <symbol>
, and that it must be initialized when
an instance is created with the keyword direction:
. The constant
<latitude-direction>
is a type specification that permits only the
symbol #"north"
or the symbol #"south"
. The class <latitude>
specifies that, when an instance of <latitude>
is made, the initial
value must be of the <latitude-direction>
type. We handled the
longitude case similarly.
The use of type-union
and singleton
to create enumeration types in
this fashion is common enough that the function one-of
is usually
available in a utility library as a shorthand:
define constant one-of
= method (#rest objects)
apply(type-union, map(singleton, objects))
end method;
With this abbreviation, the direction types can be written more compactly:
define constant <latitude-direction> = one-of(#"north", #"south");
define constant <longitude-direction> = one-of(#"east", #"west");
Some Dylan compilers will recognize the idiomatic use of type-union
and singleton
to represent such enumerations more compactly. For
instance, a compiler could represent the direction slot of a latitude or
longitude as a single bit, using the getter and setter functions to
translate back and forth to the appropriate symbol.
Direct methods#
The definition of the one-of
constant is a method called a direct
method or bare method. It is the equivalent of a function in other
languages. A bare method does not create an implicit generic function,
and invoking a bare method does not use method-dispatch procedure, but
rather calls the method directly. We choose to use a bare method here
because we are sure that one-of
will never need method dispatch: it
performs the same operation independent of the types of its arguments.
The bare method serves to document this intent. If there were some
possibility of additional methods, it would be more perspicuous to use a
generic function, even if there is initially only one method. Most Dylan
compilers will generate equally efficient code for a bare method and for
a generic function with only one method, so the choice of which to use
should be based on whether or not it would ever make sense to have
additional methods that discriminate on parameter types.
Tail calls#
The most important construct in the Dylan execution model is the function call, because function calls are the most common operation in the language. Remember that all slot accesses and assignments, arithmetic operations, and collection accesses obey the execution model of function calls, even if the syntax for them does not look like that of function calls.
We have already discussed how Dylan compilers can optimize away run-time checking of argument types and the overhead of method dispatch, and that good compilers will generate equally efficient code for calls to single-method generic functions or direct methods.
There is one additional optimization that good Dylan compilers will make, which is enabled by a particular style of programming. If the final operation in a method is a call to another function (called a tail call) then the calling function can jump directly to the called function, rather than using a call-and-return sequence. Thus, the return from the called function returns to its caller’s caller.
As an example, consider this decode-total-seconds
method:
define method decode-total-seconds
(sixty-unit :: <sixty-unit>)
=> (hours :: <integer>, minutes :: <integer>, seconds :: <integer>)
decode-total-seconds(sixty-unit.total-seconds);
end method decode-total-seconds;
The inner call to decode-total-seconds
can be a direct jump rather
than a function call, because the compiler can infer which method should
be called and that the return values already have the correct
constraints.
Typed generic functions#
In addition to specifying the types of the parameters and return values of methods, you can specify the types of the parameters and return values of a generic function. You usually restrict the parameter types of a generic function to establish the contract of the generic function — that is, to define the domain of arguments that the generic function is intended to handle, and the domain of the values that it will return.
If we define a method without also defining a generic function, Dylan
creates an implicit generic function with the most general types for
each parameter and return value that are compatible with the method. For
example, assume that we defined a method for next-landing-step
, and
did not explicitly create a generic function for it. The method is as
follows:
define method next-landing-step
(storage :: <sky>, aircraft :: <aircraft>)
=> (next-class :: false-or(<class>), duration ::
false-or(<time-offset>))
...
end if;
end method next-landing-step;
When we define a method without also defining a generic function, the compiler will generate an implicit generic function for us, which, in this case, will be as though we had defined the generic function like this:
define generic next-landing-step (o1 :: <object>, o2 :: <object>)
=> (#rest r :: <object>);
In The schedule.dylan file, where we did define a generic function, we used a simple definition, just documenting the number of arguments, and giving them mnemonic names:
define generic next-landing-step (container, vehicle);
Because we did not specify types of the arguments or return values, they
default to <object>
, just as they did in the preceding implicit
generic function.
Although the generic function that we wrote does prevent us from
defining methods with the wrong number of arguments, it does not
constrain the types of those arguments or the format or type of return
values in any way. A sophisticated compiler may be able to make
inferences based on the methods that we define, but we could both aid
the compiler and more clearly document the protocol of
next-landing-step
by specifying the types of the parameters and return
values in the definition of the generic function:
define generic next-landing-step
(storage :: <vehicle-storage>, aircraft :: <aircraft>)
=> (next-storage :: <vehicle-storage>, elapsed-time :: <time-offset>);
Now, the compiler can help us. If we define a method whose arguments are
not a subclass of <vehicle-storage>
and a subclass of <aircraft>
(for example, if we provided the arguments in the wrong order), the
compiler will report the error. Furthermore, the compiler can use the
value declaration to detect errors in the return values (for example, if
we returned only a single value or returned a value of the wrong type).
Finally, the compiler can be asked to issue a warning if there is a
subclass of the argument types for which no method is applicable.
In addition to establishing a contract, specifying the types of the
parameters and return values of generic functions can allow the compiler
to make additional inferences, as described in Type constraints
with regard to truncate/
. In the absence of other information,
the compiler is limited in the optimizations that it can make based
solely on the parameter types in the generic function, so it is
generally best not to restrict artificially the types of a generic
function, but rather to use the restricted types to document the
generic function’s protocol.
Open generic functions#
By default, generic functions are sealed. When you use define
generic
, that is the same as using define sealed generic
. No other
library can add methods to a sealed generic function — not even on new
classes that they may introduce. Methods cannot be added to, or removed
from, the generic function at run time. The only methods on a sealed
generic function are the methods that are defined in the library where
the generic function itself is defined. Because of the restrictions on a
sealed generic function, the compiler, using type-inference information,
can usually narrow the choice of applicable methods for any particular
call to the generic function, eliminating most or all of the overhead of
run-time dispatching that would normally be expected of a dynamic
language.
We saw in Libraries and Modules, that we must define a generic function
that is part of a shared protocol using define open generic
, so that
libraries sharing the protocol can implement the protocol for the
classes that they define, by adding methods. If we do not define the
generic function to be open, other libraries are prohibited from adding
methods to the generic function, which would make it useless as a
protocol. Unfortunately, a generic function that is open cannot be
optimized. Even when the compiler may be able to infer the exact types
of the arguments to the generic function in a particular call, because
an open generic function may have methods added or removed, even at run
time, the compiler must produce code to handle all these possibilities.
Because open generic functions cannot be optimized, you should use them
only when necessary. You need to balance the division of your program
into libraries against the need to export and open more generic
functions if the program is too finely divided. This balance is
illustrated by the considerations we made in designing a protocol in
Protocol design. When we chose to split the time
and angle
libraries, we were forced to create the say
protocol library and open
the generic function say
. In Sealed domains, we show how to regain
certain optimizations when you decide that opening a generic function is required.
Note that generic functions that are defined implicitly in a library —
such as those that are defined when you define only a single method, or
those that are defined for slot accessors — are sealed by default. If
you expect other libraries to add methods to one of these implicit
generic functions, you must define the generic function explicitly to be
open using define open generic
.
Open classes#
By default, classes are sealed
. When you use define class
, that is
the same as using define sealed class
. Other libraries cannot
directly subclass a sealed class — they cannot define new classes that
have your sealed class as a direct superclass. The only direct
subclasses of the class are those subclasses that are defined in the
library where the class itself is defined. Extensive optimization
opportunities occur when the methods of a sealed generic function are
specialized on sealed classes. In this case, the compiler can usually
choose the correct method of the generic function to call at compile
time, eliminating any run-time overhead for using a generic function.
We saw in Libraries and Modules, that we must define a class that is a
shared substrate, such as <sixty-unit>
, using define open class
,
if the libraries sharing the substrate are expected to subclass the
class. If we did not define the class to be open, other libraries would
be prevented from subclassing it — which might be reasonable if the
substrate were not intended to be extended by subclassing.
Unlike an open generic function, an open class does not prevent all optimization. If a generic function has a method applicable to an open class, but the generic function is sealed, then the compiler might still be able to optimize method dispatch if that compiler can infer the types of the arguments to the generic function at a particular call. Sometimes, the dispatch code will be slightly less optimal, because it must allow for arbitrary subclasses, rather than a fixed set of subclasses; in general, however, opening a class is less costly than is opening a generic function.
Note that, although you cannot directly subclass a sealed class from another library, you can subclass a sealed class in the library that defines the sealed class. It may not be obvious, but a corollary of this rule of sealing is that you can define an open subclass of a sealed class in the library that defines the sealed class. Using a sealed class with an open subclass is one simple way to get both flexibility and efficiency — the classes in the sealed branch will be optimized by the compiler, while the open subclass can be exported for other libraries to build on and extend.
Sealed domains#
When you define a protocol that is meant to be extended by many
libraries, both the base classes and the generic functions that make up
the protocol must be open. This simple exigency might make it seem that
there is no hope of optimizing such a protocol — however, there is hope.
You use the define sealed domain
form to seal selectively subsets or
branches of the protocol, permitting the compiler to make all the
optimizations that would be possible if the classes and generic
functions were sealed, but only for the particular subset or branch in
question.
As an example, consider the say
protocol as used in the time
library. Because the say
generic function is defined to be open, even
if the compiler can infer that the argument to say
is a <time>
or
<time-offset>
, it must insert code to choose the appropriate method
to call at run time on the off chance that some other library has added
or removed methods for say
. The solution is to add the following
definition to the time
library:
// Declare the say generic function sealed, for all time classes
define sealed domain say (<time>);
This statement is essentially a guarantee to the compiler that the only
methods on say
that are applicable to <time>
objects (and also to
<time-of-day>
and <time-offset>
objects, because <time-of-day>
and
<time-offset>
are subclasses of <time>
) are those that are defined
explicitly in the time
library (and in any libraries from which that
one imports). Thus, when the compiler can prove that the argument to
say
is a <time-offset>
, it can call the correct method directly,
without any run-time dispatch overhead.
Another way to get the same effect as a sealed domain, which is also
self-documenting, is to use define sealed method
when defining
individual methods on the protocol. So, for instance, in the case of the
time
library, we might have defined the two methods on say
as
follows:
define sealed method say (time :: <time>)
let (hours, minutes) = decode-total-seconds (time);
format-out("%d:%s%d", hours, if (minutes < 10) "0" else " " end,
minutes);
end method say;
define sealed method say (time :: <time-offset>) => ()
format-out("%s ", if (time.past?) "minus" else "plus" end);
next-method();
end method say;
Defining a sealed method is the same as defining the generic function to be sealed over the domain of the method’s specializers. In effect, this technique says that you do not intend anyone to add more specific methods in that domain, or to create classes that would change the applicability of the sealed methods.
With either the define sealed domain
form or the sealed methods, the
use of say
on <time>
objects will be as efficient as it would be
were say
not an open generic function after all. At the same time,
other libraries that create new classes can still extend the say
protocol to cover those classes.
Sealed domains impose restrictions on the ability of other libraries to create new methods, to remove new methods, and to create new classes:
You cannot add methods to an open generic function imported from another library that would fall into the sealed domain of any other library. You can avoid this restriction by ensuring that at least one of the specializers of your method is a subtype of a type defined in your library.
When you seal a domain of a generic function imported from another library, you will not cause conflicts with other libraries, as long as both of the following conditions hold:
At least one of the types in the sealed domain is a subtype of a class defined in your library
No additional subtypes can be defined for any of the types in the sealed domain
In the case of a type that is a class, the first condition means that you must have defined either the class or one of its superclasses in your library. The second condition means that the classes in the domain must not have any open subclasses (a degenerate case of which is a leaf class — a class with no subclasses at all).
If you need to seal a domain over a class that has open subclasses, you will need a thorough understanding of the sealing constraints detailed in The Dylan Reference Manual, but these two simple rules should handle many common cases.
In our example, we obeyed both rules of thumb: our methods for say
are
on classes we defined, and our sealing was over classes that will not be
further subclassed. The rules of thumb not only keep you from violating
sealing constraints, they make for good protocol design: a library that
extends a protocol really should extend it only for classes it fully
understands, which usually means classes it creates.
As an example of the restriction on subclassing open classes involved in
a sealed domain, if the <time>
class were an open class, we still
could not add the following class in a library that used the time
library:
define class <place-and-time> (<position>, <time>)
end class <place-and-time>;
As far as the compiler is concerned, it “knows” that the only say
method applicable to a <time>
is the one in the time
library. (That
is what we have told it with our sealed domain
definition.) It would
be valid to pass a <place-and-time>
object as an argument to a
function that accepted <time>
objects, but within that function the
compiler might have already optimized a call to say
to the method for
<time>
objects (based on <time>
being in the sealed domain of
say
). But there is also a method for say
on <position>
, and,
more important, we probably will want to define a method specifically for
<place-and-time>
. Because of this ambiguity, the class
<place-and-time>
cannot be defined in a separate library, and the
compiler will signal an error.
Note that the class <place-and-time>
could be defined in the time
library. The compiler can deal correctly with classes that may straddle
a sealed domain, if they are known in the library where the sealed
domain is defined. It would also be valid to subclass <time>
in any
way that did not change the applicability of methods in any sealed
generic-function domains that include <time>
. The actual rule
involved depends on an analysis of the exact methods of the generic
function, and the rule is complicated enough that you should just rely
on your compiler to detect illegal situations.
Slot accessors#
Dylan does allow you to omit definition of a generic function. As we mentioned earlier, if you define a method without also defining a generic function, Dylan implicitly creates a generic function with the most general types for the parameters and return values that are compatible with the method. The most common case of implicit generic functions is for the slot-accessor methods that are created when a new class is defined. Because these generic functions typically have only a single method and are sealed by default (see Open generic functions), the compiler can make extensive optimizations for slot accessors, ideally making slot access no more expensive than an array reference or structure-member access in other languages.
Even when a slot is inherited by subclassing, a good Dylan compiler will use a coloring algorithm to assign slots to the same offset in each subclass, keeping the cost of slot access to a minimum. You can use primary classes (see Primary classes) to guarantee efficient slot access. When a program defines explicit methods for a slot getter or setter generic function, of course, the overhead is greater.
In the <sixty-unit>
class, we specified an initial value for
total-seconds
; hence, there is no need to check that the slot has
been initialized before it is accessed. In some situations, it may not
be feasible to give a default or initial value for a slot. Dylan permits
this omission and will ensure that the slot is initialized before that
slot is used; of course, this check does not come for free, so it is
preferable to provide initial values where possible. In fact, because we
always expect to initialize the total-seconds
slot when we make a new
<sixty-unit>
, it would be more accurate to specify <sixty-unit>
as
follows:
define open abstract class <sixty-unit> (<object>)
slot total-seconds :: <integer>,
required-init-keyword: total-seconds:,
end class <sixty-unit>;
That is, rather than giving the slot an initial value of 0
and an
optional init-keyword:
, we simply require that the slot be initialized
when we make a <sixty-unit>
object. Of course, the initial value must
obey the type constraint of <integer>
. The compiler can still make the
inference that the slot will always be initialized and will always have
an integer value.
Always initializing slots, either with a default value or required init-keyword, will make slot access efficient.
Finally, in many cases, slots hold values that will not change over the
lifetime of each instance (although they may be different values for
each instance). In the case of the <sixty-unit>
class, we never change
the value of total-seconds
. When adding two instances, we create a
new one to hold the new value, rather than changing one of the argument
instances (that way, we do not have to worry about changing an instance
that may still be in use by some other part of the program). In such
cases, declaring the slot to be constant
both documents and enforces
this intent. Furthermore, the compiler can often make additional
optimizations for slots that are known never to be modified. The final
definition of <sixty-unit>
is as follows:
define open abstract class <sixty-unit> (<object>)
constant slot total-seconds :: <integer>,
required-init-keyword: total-seconds:,
end class <sixty-unit>;
(The constant
declaration is simply shorthand for the slot option
setter: #f
, meaning that there is no way to set the slot.)
Primary classes#
Classes have one additional variation that you can use to optimize
performance. A class that is defined as primary
allows the compiler to
generate the most efficient code for accessing the slots defined in the
primary class (whether the accessor is applied to the primary class or
to one of that class’s subclasses). However, a primary class cannot be
combined with any other primary class (unless one is a subclass of the
other). This restriction implies that you should delay declaring a class
to be primary until you are sure of your inheritance design. Also,
because sealed classes are already highly optimized, the primary
declaration is of most use for open classes.
As an example, consider the class <sixty-unit>
, and its slot
total-seconds
, as used in this method for decode-total-seconds
:
define method decode-total-seconds
(sixty-unit :: <sixty-unit>)
=> (hours :: <integer>, minutes :: <integer>, seconds :: <integer>)
decode-total-seconds(sixty-unit.total-seconds);
end method decode-total-seconds;
Although the generic function for the slot accessor total-seconds
is
sealed, and it is trivial for the compiler to infer that its argument is
a <sixty-unit>
in the call sixty-unit.total-seconds
, because
<sixty-unit>
is declared open, the compiler cannot emit the most
efficient code for that call. Because an open class could be mixed
with any number of other classes, there is no guarantee that the slots
of every object that is a <sixty-unit>
will always be stored in the
same order — there is no guarantee that total-seconds
will always
be the first slot in an object that is an indirect instance of
<sixty-unit>
, for instance.
Declaring a class primary
is essentially making a guarantee that the
compiler can always put the primary class’s slots in the same place in
an instance, and that any other superclasses will have to adjust:
define abstract open primary class <sixty-unit> (<object>)
constant slot total-seconds :: <integer>,
required-init-keyword: total-seconds:;
end class <sixty-unit>;
By adding the primary
declaration to the definition, any library that
subclasses <sixty-unit>
is guaranteed to put total-seconds
at the
same offset. Hence, the compiler can turn the call
sixty-unit.total-seconds
into a single machine instruction (load with
constant offset), without concern over which subclass of <sixty-unit>
was passed as an argument.
It is permissible to make subclasses of a primary class also primary, essentially freezing the assignment of all the slots in the subclass too. What is not permissible is to multiply inherit from more than one primary class; as you can see, such behavior would lead to a conflict between the fixed slot assignments.
Because primary classes restrict extension in this way, you should use them sparingly in libraries intended to be software components. Primary classes are of most benefit in large, modular programs, where all the clients of each component are known, and the need for extensibility is bounded; typically that occurs toward the end of a project, when you are tuning for performance.
Additional efficiency information#
In this section, we review additional techniques that compilers can use to generate code that obeys the Dylan execution model, but is more efficient than a straightforward implementation of that model might suggest. Knowing about these techniques can help you to evaluate different vendors’ compilers. You will have to consult the documentation of your particular implementation to discover whether or not these techniques are used.
Efficiency of generic function calls#
In addition to using type inferencing and sealed domains, another way to speed up generic function calls when they must dispatch at run time is to cache the return values of previous calls. So, for example, the first time that a given generic function is called with certain classes of arguments, the full sorted sequence of applicable methods is computed; after that, however, it only to be only looked up in a table. Thus, if the generic function is called often with the same type of arguments, most calls will be fast. This technique is used in other object-oriented languages, such as Smalltalk and CLOS, and is useful for speeding up completely dynamic situations. Most good Dylan compilers will use some form of cached dispatching.
A second form of cached dispatching is called call-site caching. Although a generic function may have many calls throughout a program, often the types of arguments passed are directly related to where (that is, in what other method) the call is made. Some Dylan compilers will cache the types and methods of each call at the point of call, and will use this cache to avoid dispatch if the same types are passed as arguments in a subsequent call from the same place.
Efficiency of keyword arguments and of multiple values#
Keyword arguments are a powerful and flexible, but potentially expensive, feature of Dylan. The processing of keywords and values at run time can be an expensive operation, especially if many keywords are used. A Dylan compiler can pass keyword arguments as efficiently as it can required arguments, if the called function is known at compilation time.
Returning multiple values again raises performance issues. In some implementations of Dylan, there is an extra cost for returning more than one value; in others, the cost is associated with calling a function that does not declare how many values it returns. When the compiler knows what function is being called, these costs usually can be eliminated, but certain costs may still exist — for example, certain implementations may not optimize tail calls between functions that return different numbers of arguments.
Memory usage#
Dylan uses automatic storage-management; thus, programmers explicitly allocate objects, and hence memory, but deallocation is automatic and occurs after all references to an object are gone. The process of reclaiming memory when objects are no longer in use is known as garbage collection.
There are strong advantages to automatic storage-management. With manual storage-management, small program bugs, such as freeing of an object that is still in use, can cause subtle bugs that lead to crashes in parts of the program unrelated to where the real problem lies. Dylan is able to guarantee that all programs fail in disciplined ways, usually with exceptions, because the type system and memory management are safe.
But automatic storage-management may create performance concerns. Although early implementations of garbage collection were infamously slow, modern garbage collectors are usually fast enough that using one should not raise concerns for most programs. But some programs with specialized or tuned use of memory may run slower with automatic management.
Whether storage management is automatic or manual, the use of memory raises performance issues. Every allocation of memory takes time, including the time to reclaim unused memory; either the programmer must free it explicitly, or the garbage collector has to do more work.
It is obvious that calling a function such as make
, vector
, or
pair
in Dylan allocates memory, but there are operations that
implicitly use memory. For example, creating a closure (see
Closures) will usually cause Dylan to allocate memory for
the closure.
On the other hand, sometimes the compiler is able to prove that an object is never used after the function that creates it returns. In a good compiler, such objects are allocated on the stack, and are reclaimed automatically when the function exits.
A good Dylan development environment will have tools that help you to meter and profile memory usage, so that you can adjust your program to utilize memory efficiently.
Inlining, constant folding, and partial evaluation#
One optimization that is common in many computer languages is inlining. Inlining replaces a call to a known function with the body of the function. Inlining is an important optimization in Dylan, because almost all Dylan operations — slot access, array indexing, and collection iteration — involve function calls.
All good Dylan compilers, when compiling for speed, can be aggressive about inlining any computations, as long as doing so would not make a program grow too large. Constant folding (evaluating expressions involving constant values at compile time) and inlining are just two of the partial-evaluation techniques that you should expect to find in any good Dylan compiler.
Type inference#
The quality of type inference can vary greatly among Dylan compilers. Type inference — like most forms of program analysis — works best with simple, straightforward code. Some constructs that are typically difficult for type inference are assignment and calling of block exit functions outside of the method that defines the block exit functions.
One other way in which type constraints can be helpful is that they
permit the compiler to choose efficient representations for objects.
Most Dylan objects contain enough information for Dylan to determine
their class — this one is an important feature for the dynamic aspects
of the language. But, suppose we have a 1000 x 1000 limited(<array>,
of: <single-float>)
. There is no reason that each of the numbers in
that array should also contain a reference to the <single-float>
class; the one reference in the limited type is sufficient. (Note that,
if we had used of: <real>
or of: <float>
, we would have needed
more information, since multiple classes would have been possible.)
When an object is represented in such a way, often many of the
operations on it can be optimized. For example, the conventional
representation of <double-float>
will usually require an
indirect-memory-reference machine instruction to get at the actual
number, so adding two such objects is one floating-point machine
instruction and two load-from-memory machine instructions; if a direct
representation is used, just the add machine instruction is needed.
Further, if the return value is saved in a variable for which type
information is not available, it may be necessary to allocate memory
dynamically to store the return value.
Types that may have more efficient representations include certain integer classes, the floating-point classes, characters, and Booleans. Precise declarations about these types, especially in slots and limited collections, can lead to significant improvements in both the time and memory needed to run a program.
Summary#
The most important point about performance is that it is important to pay attention to efficiency during the entire design and development cycle of a project. During the design phase, try to ensure that the algorithms chosen have the right asymptotic behavior and constant factors, and that it is possible to implement the needed operations efficiently. During the implementation phase, use the language constructs that most clearly express what the program is doing. Once the program is working correctly, it is then time to add type and sealing declarations, and to use metering and profiling tools to find and rewrite heavily used, slow parts of the program, in order to improve the performance.
One of the most important considerations when programming is not to worry about performance too soon. It is always more important that your design and implementation be clear and correct, first. There is no value in arriving at an answer with lightning speed, if it turns out to be the wrong answer.
In this chapter, we covered the following:
We showed how Dylan can balance performance and flexibility to support a range of programming requirements.
We showed how type constraints affect performance.
We showed how limited types can improve performance.
We showed how open generic functions provide modularity and flexibility.
We showed how open classes provide modularity and flexibility.
We showed how sealed generic function domains mitigate the performance penalty of open classes and generic functions.
We showed how primary classes permit efficient slot access.
We presented both an execution and efficiency model that provides a conceptual model of how a program in Dylan runs, and what the relative cost of different program elements are.
We examined the method constructs for flexibility and performance available in Dylan; see Methods: flexibility versus performance.
# Construct
Effects
direct method
highly optimizable
no method dispatch
sealed generic function on a sealed class
highly optimizable
not extensible by other libraries
sealed generic function on an open class
optimizable
other libraries can subclass
open generic function on an open class in a sealed domain
highly optimizable
other libraries can add methods
other libraries can subclass
open generic function on an open class
not optimizable
methods can be added at run time
subclasses can be created at run time
We discussed the constructs that can have type constraints, and the influence on performance or flexibility of using such a declaration; see Type constraint: flexibility versus performance..
# Construct
Effects
module constants
enforce program correctness
module variables
permit type inferencing
required parameters
required for method dispatch
permit type inferencing
optional parameters
permit type inferencing
return values
enforce program correctness
permit type inferencing
limited types
permit type inferencing
permit compact data representation
slots
permit type inferencing