Example: Symbolic Algebra

The manipulation of symbolic algebraic expressions is a complex process that illustrates many of the hardest problems that occur in the design of large-scale systems. An algebraic expression, in general, can be viewed as a hierarchical structure, a tree of operators applied to operands. We can construct algebraic expressions by starting with a set of primitive objects, such as constants and variables, and combining these by means of algebraic operators, such as addition and multiplication. As in other languages, we form abstractions that enable us to refer to compound objects in simple terms. Typical abstractions in symbolic algebra are ideas such as linear combination, polynomial, rational function, or trigonometric function. We can regard these as compound "types," which are often useful for directing the processing of expressions. For example, we could describe the expression

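$$x^2 \sin(y^2+1) + x\cos 2y + \cos(y^3 - 2y^2)$$

as a polynomial in *x* with coefficients that are trigonometric functions of polynomials in *y* whose coefficients are integers.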

We will not attempt to develop a complete algebraic-manipulation system here. Such systems are exceedingly complex programs, embodying deep algebraic knowledge and elegant algorithms. What we will do is look at a simple but important part of algebraic manipulation: the arithmetic of polynomials. We will illustrate the kinds of decisions the designer of such a system faces, and how to apply the ideas of abstract data and generic operations to help organize this effort.

Arithmetic on polynomials

Our first task in designing a system for performing arithmetic on
polynomials is to decide just what a polynomial is. Polynomials are
normally defined relative to certain variables (the
*indeterminates* of the polynomial). For simplicity, we will restrict
ourselves to polynomials having just one indeterminate
(*univariate polynomials*). We will define a polynomial to be a
sum of terms, each of which is either a coefficient, a power of the
indeterminate, or a product of a coefficient and a power of the
indeterminate. A coefficient is defined as an algebraic expression
that is not dependent upon the indeterminate of the polynomial. For
example,

$$5x^2 + 3x + 7$$

is a simple polynomial in *x*, and

$$(y^2+1)x^3 + (2y)x + 1$$

is a polynomial in *x* whose coefficients are polynomials in *y*.

Already we are skirting some thorny issues. Is the first of these
polynomials the same as the polynomial
$5y^2 + 3y + 7$, or not? A
reasonable answer might be "yes, if we are considering a polynomial
purely as a mathematical function, but no, if we are considering a
polynomial to be a syntactic form." The second polynomial is
algebraically equivalent to a polynomial in *y* whose coefficients are
polynomials in *x*. Should our system recognize this, or not?
Furthermore, there are other ways to represent a polynomial: for
example, as a product of factors, or (for a univariate polynomial) as
the set of roots, or as a listing of the values of the polynomial at a
specified set of points.
We can finesse these questions by deciding that in our
algebraic-manipulation system a "polynomial" will be a
particular syntactic form, not its underlying mathematical meaning.

Now we must consider how to go about doing arithmetic on polynomials. In this simple system, we will consider only addition and multiplication. Moreover, we will insist that two polynomials to be combined must have the same indeterminate.

We will approach the design of our system by following the familiar
discipline of data abstraction. We will represent polynomials using a
data structure called a
*poly*, which consists of a variable and a
collection of terms. We assume that we have selectors `variable`
and `term-list` that extract those parts from a poly and
a constructor `make-poly` that assembles a
poly from a given variable and a term list.
A variable will be just a symbol, so we can use the
`same-variable?`
procedure defined in an earlier section to compare variables.
The following procedures define addition and multiplication of polys:

    (define (add-poly p1 p2)
      (if (same-variable? (variable p1) (variable p2))
          (make-poly (variable p1)
                     (add-terms (term-list p1)
                                (term-list p2)))
          (error "Polys not in same var - ADD-POLY"
                 (list p1 p2))))

    (define (mul-poly p1 p2)
      (if (same-variable? (variable p1) (variable p2))
          (make-poly (variable p1)
                     (mul-terms (term-list p1)
                                (term-list p2)))
          (error "Polys not in same var - MUL-POLY"
                 (list p1 p2))))

To incorporate polynomials into our generic arithmetic system, we need
to supply them with type tags. We'll use the tag `polynomial`,
and install appropriate operations on tagged polynomials in
the operation table. We'll embed all our code
in an installation procedure for the polynomial package,
similar to the ones shown in
an earlier section:

    (define (install-polynomial-package)
      ;; internal procedures
      ;; representation of poly
      (define (make-poly variable term-list)
        (cons variable term-list))
      (define (variable p) (car p))
      (define (term-list p) (cdr p))
      ;; <procedures same-variable? and variable? from an earlier section>

      ;; representation of terms and term lists
      ;; <procedures adjoin-term ... coeff from text below>

      (define (add-poly p1 p2) ...)
      ;; <procedures used by add-poly>
      (define (mul-poly p1 p2) ...)
      ;; <procedures used by mul-poly>

      ;; interface to rest of the system
      (define (tag p) (attach-tag 'polynomial p))
      (put 'add '(polynomial polynomial)
           (lambda (p1 p2) (tag (add-poly p1 p2))))
      (put 'mul '(polynomial polynomial)
           (lambda (p1 p2) (tag (mul-poly p1 p2))))
      (put 'make 'polynomial
           (lambda (var terms) (tag (make-poly var terms))))
      'done)

Polynomial addition is performed termwise. Terms of the same order (i.e., with the same power of the indeterminate) must be combined. This is done by forming a new term of the same order whose coefficient is the sum of the coefficients of the addends. Terms in one addend for which there are no terms of the same order in the other addend are simply accumulated into the sum polynomial being constructed.

In order to manipulate term lists, we will assume that we have a
constructor `the-empty-termlist` that returns an empty term list
and a constructor `adjoin-term` that adjoins a new term to a term
list. We will also assume that we have a predicate
`empty-termlist?` that tells if a given term list is empty, a selector
`first-term` that extracts the highest-order term from a term
list, and a selector `rest-terms` that returns all but the
highest-order term. To manipulate terms, we will suppose that we have a
constructor `make-term` that constructs a term with given
order and coefficient, and selectors `order` and
`coeff` that return, respectively, the order and the
coefficient of the term. These operations allow us to consider both
terms and term lists as data abstractions, whose concrete
representations we can worry about separately.

Here is the procedure that constructs the term list for the sum of two
polynomials:

    (define (add-terms L1 L2)
      (cond ((empty-termlist? L1) L2)
            ((empty-termlist? L2) L1)
            (else
             (let ((t1 (first-term L1)) (t2 (first-term L2)))
               (cond ((> (order t1) (order t2))
                      (adjoin-term
                       t1 (add-terms (rest-terms L1) L2)))
                     ((< (order t1) (order t2))
                      (adjoin-term
                       t2 (add-terms L1 (rest-terms L2))))
                     (else
                      (adjoin-term
                       (make-term (order t1)
                                  (add (coeff t1) (coeff t2)))
                       (add-terms (rest-terms L1)
                                  (rest-terms L2)))))))))

The most important point to note here is that we used the generic addition
procedure `add` to add together the coefficients of the terms being combined.
This has powerful consequences, as we will see below.
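As a concrete check of the termwise rule, the following sketch runs `add-terms` on two small term lists. It is self-contained only because it makes two assumptions that the text has not committed to at this point: terms are represented as `(order coeff)` lists arranged from highest to lowest order, and the generic `add` and `=zero?` are stood in for by ordinary numeric `+` and `zero?`.

```scheme
;; Assumed stand-ins so the sketch runs on its own (ordinary numbers only).
(define (add x y) (+ x y))
(define (=zero? x) (and (number? x) (zero? x)))

;; Assumed sparse representation: a term is (order coeff).
(define (make-term order coeff) (list order coeff))
(define (order term) (car term))
(define (coeff term) (cadr term))
(define (empty-termlist? term-list) (null? term-list))
(define (first-term term-list) (car term-list))
(define (rest-terms term-list) (cdr term-list))
(define (adjoin-term term term-list)
  (if (=zero? (coeff term))
      term-list
      (cons term term-list)))

(define (add-terms L1 L2)
  (cond ((empty-termlist? L1) L2)
        ((empty-termlist? L2) L1)
        (else
         (let ((t1 (first-term L1)) (t2 (first-term L2)))
           (cond ((> (order t1) (order t2))
                  (adjoin-term t1 (add-terms (rest-terms L1) L2)))
                 ((< (order t1) (order t2))
                  (adjoin-term t2 (add-terms L1 (rest-terms L2))))
                 (else
                  (adjoin-term
                   (make-term (order t1)
                              (add (coeff t1) (coeff t2)))
                   (add-terms (rest-terms L1)
                              (rest-terms L2)))))))))

;; (5x^2 + 3x + 7) + (x^3 - 3x + 1)
(define sum
  (add-terms '((2 5) (1 3) (0 7))
             '((3 1) (1 -3) (0 1))))
;; sum is ((3 1) (2 5) (0 8)), i.e. x^3 + 5x^2 + 8
```

The first-order terms cancel, and `adjoin-term` drops the resulting zero coefficient, so no first-order term appears in the sum.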

In order to multiply two term lists, we multiply each term of the
first list by all the terms of the other list, repeatedly using
`mul-term-by-all-terms`, which multiplies a given term by
all terms in a given term list. The resulting term lists (one for
each term of the first list) are accumulated into a sum. Multiplying
two terms forms a term whose order is the sum of the orders of the
factors and whose coefficient is the product of the coefficients of
the factors:

    (define (mul-terms L1 L2)
      (if (empty-termlist? L1)
          (the-empty-termlist)
          (add-terms (mul-term-by-all-terms (first-term L1) L2)
                     (mul-terms (rest-terms L1) L2))))

    (define (mul-term-by-all-terms t1 L)
      (if (empty-termlist? L)
          (the-empty-termlist)
          (let ((t2 (first-term L)))
            (adjoin-term
             (make-term (+ (order t1) (order t2))
                        (mul (coeff t1) (coeff t2)))
             (mul-term-by-all-terms t1 (rest-terms L))))))
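To see the accumulation at work, here is a minimal runnable sketch of `mul-terms` on a small product. As assumptions for illustration only, coefficients are ordinary numbers (so `add`, `mul`, and `=zero?` are numeric stand-ins for the generic operations) and terms are `(order coeff)` lists from highest to lowest order.

```scheme
;; Assumed stand-ins: numeric coefficients only, sparse (order coeff) terms.
(define (add x y) (+ x y))
(define (mul x y) (* x y))
(define (=zero? x) (and (number? x) (zero? x)))
(define (make-term order coeff) (list order coeff))
(define (order term) (car term))
(define (coeff term) (cadr term))
(define (the-empty-termlist) '())
(define (empty-termlist? term-list) (null? term-list))
(define (first-term term-list) (car term-list))
(define (rest-terms term-list) (cdr term-list))
(define (adjoin-term term term-list)
  (if (=zero? (coeff term)) term-list (cons term term-list)))

(define (add-terms L1 L2)
  (cond ((empty-termlist? L1) L2)
        ((empty-termlist? L2) L1)
        (else
         (let ((t1 (first-term L1)) (t2 (first-term L2)))
           (cond ((> (order t1) (order t2))
                  (adjoin-term t1 (add-terms (rest-terms L1) L2)))
                 ((< (order t1) (order t2))
                  (adjoin-term t2 (add-terms L1 (rest-terms L2))))
                 (else
                  (adjoin-term
                   (make-term (order t1) (add (coeff t1) (coeff t2)))
                   (add-terms (rest-terms L1) (rest-terms L2)))))))))

(define (mul-term-by-all-terms t1 L)
  (if (empty-termlist? L)
      (the-empty-termlist)
      (let ((t2 (first-term L)))
        (adjoin-term
         (make-term (+ (order t1) (order t2))
                    (mul (coeff t1) (coeff t2)))
         (mul-term-by-all-terms t1 (rest-terms L))))))

(define (mul-terms L1 L2)
  (if (empty-termlist? L1)
      (the-empty-termlist)
      (add-terms (mul-term-by-all-terms (first-term L1) L2)
                 (mul-terms (rest-terms L1) L2))))

;; (x + 1)(x - 1): the two first-order cross terms cancel.
(define product (mul-terms '((1 1) (0 1)) '((1 1) (0 -1))))
;; product is ((2 1) (0 -1)), i.e. x^2 - 1
```

Each term of the first list yields one partial term list, and `add-terms` merges those partial lists, which is where the cancellation of the cross terms happens.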

This is really all there is to polynomial addition and multiplication.
Notice that, since we operate on terms using the generic procedures
`add` and `mul`, our polynomial package is automatically able
to handle any type of coefficient that is known about by the generic
arithmetic package. If we include a
coercion mechanism such as one of
those discussed in an earlier section,
then we also are automatically able to handle operations on
polynomials of different coefficient types, such as

$$\left[3x^2 + (2+3i)x + 7\right] \cdot \left[x^4 + \tfrac{2}{3}x^2 + (5+3i)\right]$$

Because we installed the polynomial addition and
multiplication procedures `add-poly` and `mul-poly` in the generic
arithmetic system as the `add` and `mul` operations
for type `polynomial`, our system is also
automatically able to handle polynomial operations such as

$$\left[(y+1)x^2 + (y^2+1)x + (y-1)\right] \cdot \left[(y-2)x + (y^3+7)\right]$$

The reason is that when the system tries to combine coefficients, it
will dispatch through `add` and `mul`. Since the coefficients
are themselves polynomials (in *y*), these will be combined using
`add-poly` and `mul-poly`. The result is a kind of
"data-directed recursion" in which, for example, a call to
`mul-poly` will result
in recursive calls to `mul-poly` in order to multiply the
coefficients. If the coefficients of the coefficients were themselves
polynomials (as might be used to represent polynomials in three
variables), the data direction would ensure that the system would
follow through another level of recursive calls, and so on through as
many levels as the structure of the data dictates.
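The recursion can be watched in a miniature model. The sketch below is not the table-driven system of the text: as a simplifying assumption, `add` and `mul` dispatch with an explicit type test instead of `put`/`get`, variables are compared with `eq?`, there is no coercion, and `=zero?` for polynomials is a crude stand-in that only checks for an empty term list. The names `make-polynomial`, `polynomial?`, `y-plus-1`, and `y-minus-1` are chosen here for illustration.

```scheme
;; A polynomial is tagged as (polynomial var . term-list); anything else
;; handled by this sketch must be an ordinary number.
(define (make-polynomial var terms) (cons 'polynomial (cons var terms)))
(define (polynomial? x) (and (pair? x) (eq? (car x) 'polynomial)))
(define (variable p) (cadr p))
(define (term-list p) (cddr p))

(define (add a b)
  (cond ((and (number? a) (number? b)) (+ a b))
        ((and (polynomial? a) (polynomial? b)) (add-poly a b))
        (else (error "No coercion in this sketch - ADD" (list a b)))))
(define (mul a b)
  (cond ((and (number? a) (number? b)) (* a b))
        ((and (polynomial? a) (polynomial? b)) (mul-poly a b))
        (else (error "No coercion in this sketch - MUL" (list a b)))))
;; Simplified stand-in: a polynomial counts as zero when it has no terms.
(define (=zero? x)
  (if (number? x) (zero? x) (null? (term-list x))))

;; Sparse term lists: (order coeff) terms from highest to lowest order.
(define (make-term order coeff) (list order coeff))
(define (order term) (car term))
(define (coeff term) (cadr term))
(define (adjoin-term term terms)
  (if (=zero? (coeff term)) terms (cons term terms)))

(define (add-terms L1 L2)
  (cond ((null? L1) L2)
        ((null? L2) L1)
        (else
         (let ((t1 (car L1)) (t2 (car L2)))
           (cond ((> (order t1) (order t2))
                  (adjoin-term t1 (add-terms (cdr L1) L2)))
                 ((< (order t1) (order t2))
                  (adjoin-term t2 (add-terms L1 (cdr L2))))
                 (else
                  (adjoin-term
                   (make-term (order t1) (add (coeff t1) (coeff t2)))
                   (add-terms (cdr L1) (cdr L2)))))))))

(define (mul-term-by-all-terms t1 L)
  (if (null? L)
      '()
      (adjoin-term
       (make-term (+ (order t1) (order (car L)))
                  (mul (coeff t1) (coeff (car L)))) ; may re-enter mul-poly
       (mul-term-by-all-terms t1 (cdr L)))))
(define (mul-terms L1 L2)
  (if (null? L1)
      '()
      (add-terms (mul-term-by-all-terms (car L1) L2)
                 (mul-terms (cdr L1) L2))))

(define (add-poly p1 p2)
  (if (eq? (variable p1) (variable p2))
      (make-polynomial (variable p1)
                       (add-terms (term-list p1) (term-list p2)))
      (error "Polys not in same var - ADD-POLY" (list p1 p2))))
(define (mul-poly p1 p2)
  (if (eq? (variable p1) (variable p2))
      (make-polynomial (variable p1)
                       (mul-terms (term-list p1) (term-list p2)))
      (error "Polys not in same var - MUL-POLY" (list p1 p2))))

;; Coefficients y + 1 and y - 1 are themselves polynomials in y.
(define y-plus-1 (make-polynomial 'y '((1 1) (0 1))))
(define y-minus-1 (make-polynomial 'y '((1 1) (0 -1))))

;; [(y + 1)x + (y - 1)] * [(y - 1)x]
(define product
  (mul (make-polynomial 'x (list (make-term 1 y-plus-1)
                                 (make-term 0 y-minus-1)))
       (make-polynomial 'x (list (make-term 1 y-minus-1)))))
;; product represents (y^2 - 1)x^2 + (y^2 - 2y + 1)x
```

The outer call multiplies polynomials in `x`; each coefficient product goes back through `mul`, which finds polynomials in `y` and calls `mul-poly` again, one level down.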

Representing term lists

Finally, we must confront the job of implementing a good
representation for term lists. A term list is, in effect, a set of
coefficients keyed by the order of the term. Hence, any of the
methods for representing sets, as discussed in
an earlier section, can be applied to this task. On
the other hand, our procedures `add-terms` and `mul-terms` always
access term lists sequentially from highest to lowest order. Thus, we
will use some kind of ordered list representation.

How should we structure the list that represents a term list? One
consideration is the "density" of the polynomials we intend to
manipulate. A polynomial is said to be *dense* if it has nonzero
coefficients in terms of most orders. If it has many zero terms it
is said to be *sparse*. For example,

$$A: \quad x^5 + 2x^4 + 3x^2 - 2x - 5$$

is a dense polynomial, whereas

$$B: \quad x^{100} + 2x^2 + 1$$

is sparse.

The term lists of dense polynomials are most efficiently represented
as lists of the coefficients. For example, *A* above would be nicely
represented as `(1 2 0 3 -2 -5)`. The order of a term in this
representation is the length of the sublist beginning with that term's
coefficient, decremented by 1. This would be a terrible representation for a
sparse polynomial such as *B*: There would be a giant list of zeros
punctuated by a few lonely nonzero terms. A more reasonable
representation of the term list of a sparse polynomial is as a list of
the nonzero terms, where each term is a list containing the order of the
term and the coefficient for that order. In such a scheme, polynomial
*B* is efficiently represented as `((100 1) (2 2) (0 1))`. As
most polynomial manipulations are performed on sparse polynomials, we
will use this method. We will assume that term lists are represented
as lists of terms, arranged from highest-order to lowest-order term.
Once we have made this decision, implementing the selectors and
constructors for terms and term lists is straightforward:

    (define (adjoin-term term term-list)
      (if (=zero? (coeff term))
          term-list
          (cons term term-list)))

    (define (the-empty-termlist) '())
    (define (first-term term-list) (car term-list))
    (define (rest-terms term-list) (cdr term-list))
    (define (empty-termlist? term-list) (null? term-list))

    (define (make-term order coeff) (list order coeff))
    (define (order term) (car term))
    (define (coeff term) (cadr term))

where `=zero?` is the generic zero-testing predicate of the arithmetic package (see also the exercise below on installing `=zero?` for polynomials).
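To relate the two representations, the following sketch converts a dense coefficient list into the sparse term list used above, applying the rule that a term's order is the length of the sublist that starts at its coefficient, minus 1. The helper `dense->sparse` is hypothetical (it is not part of the text) and assumes ordinary numeric coefficients.

```scheme
;; Hypothetical helper, not part of the text: convert a dense coefficient
;; list into the sparse (order coeff) term list, skipping zero coefficients.
(define (dense->sparse coeffs)
  (cond ((null? coeffs) '())
        ((zero? (car coeffs)) (dense->sparse (cdr coeffs)))
        (else
         ;; The order is the length of the sublist starting here, minus 1.
         (cons (list (- (length coeffs) 1) (car coeffs))
               (dense->sparse (cdr coeffs))))))

;; The dense list (1 2 0 3 -2 -5) stands for x^5 + 2x^4 + 3x^2 - 2x - 5.
(define sparse-A (dense->sparse '(1 2 0 3 -2 -5)))
;; sparse-A is ((5 1) (4 2) (2 3) (1 -2) (0 -5))
```

Calling `length` at every step keeps the sketch close to the wording of the rule; a version that computes the length once and counts down would do less work.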

Users of the polynomial package will create (tagged) polynomials by means of the procedure:

(define (make-polynomial var terms) ((get 'make 'polynomial) var terms))

**Exercise.**
Install `=zero?` for polynomials in the generic arithmetic
package. This will allow `adjoin-term` to work for polynomials
with coefficients that are themselves polynomials.

**Exercise.**
Extend the polynomial system to include subtraction of polynomials.
(Hint: You may find it helpful to define a generic negation operation.)

**Exercise.**
Define procedures that implement the term-list representation
described above as appropriate for dense polynomials.

**Exercise.**
Suppose we want to have a polynomial system that is efficient for both
sparse and dense polynomials. One way to do this is to allow both
kinds of term-list representations in our system. The situation is
analogous to the complex-number example of an earlier section,
where we allowed both rectangular and polar representations.
To do this we must distinguish different types of term lists and make
the operations on term lists generic. Redesign the polynomial system
to implement this generalization. This is a major effort, not a local
change.

**Exercise.** A univariate polynomial can be divided by another one to produce a
polynomial quotient and a polynomial remainder. For example,

$$\frac{x^5 - 1}{x^2 - 1} = x^3 + x, \quad \text{remainder } x - 1$$

Division can be performed via long division. That is, divide the highest-order term of the dividend by the highest-order term of the divisor. The result is the first term of the quotient. Next, multiply the result by the divisor, subtract that from the dividend, and produce the rest of the answer by recursively dividing the difference by the divisor. Stop when the order of the divisor exceeds the order of the dividend and declare the dividend to be the remainder. Also, if the dividend ever becomes zero, return zero as both quotient and remainder.
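As a worked illustration of these steps (a sketch, taking the dividend $x^5 - 1$ and the divisor $x^2 - 1$ as the example), each row divides the leading terms to get the next quotient term and then subtracts that multiple of the divisor from the current dividend:

```latex
\begin{aligned}
\frac{x^5}{x^2} &= x^3, & (x^5 - 1) - x^3\,(x^2 - 1) &= x^3 - 1 \\
\frac{x^3}{x^2} &= x,   & (x^3 - 1) - x\,(x^2 - 1)   &= x - 1
\end{aligned}
```

The process stops here because the order of the divisor (2) exceeds the order of the remaining dividend $x - 1$ (1). Collecting the quotient terms gives the quotient $x^3 + x$ and the remainder $x - 1$.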

We can design a `div-poly` procedure on the model of `add-poly` and
`mul-poly`. The procedure checks to see if the two polys have
the same variable. If so, `div-poly` strips off the variable and
passes the problem to `div-terms`, which performs the division
operation on term lists. `Div-poly` finally reattaches the variable
to the result supplied by `div-terms`. It is convenient
to design `div-terms` to compute both the quotient and the remainder
of a division. `Div-terms` can take two term lists as arguments and
return a list of the quotient term list and the remainder term list.

Complete the following definition of `div-terms` by filling in the
missing expressions. Use this to implement `div-poly`, which takes
two polys as arguments and returns a list of the quotient and
remainder polys.

    (define (div-terms L1 L2)
      (if (empty-termlist? L1)
          (list (the-empty-termlist) (the-empty-termlist))
          (let ((t1 (first-term L1))
                (t2 (first-term L2)))
            (if (> (order t2) (order t1))
                (list (the-empty-termlist) L1)
                (let ((new-c (div (coeff t1) (coeff t2)))
                      (new-o (- (order t1) (order t2))))
                  (let ((rest-of-result
                         <compute rest of result recursively>
                         ))
                    <form complete result>
                    ))))))

Hierarchies of types in symbolic algebra

Our polynomial system illustrates how objects of one type (polynomials) may in fact be complex objects that have objects of many different types as parts. This poses no real difficulty in defining generic operations. We need only install appropriate generic operations for performing the necessary manipulations of the parts of the compound types. In fact, we saw that polynomials form a kind of "recursive data abstraction," in that parts of a polynomial may themselves be polynomials. Our generic operations and our data-directed programming style can handle this complication without much trouble.

On the other hand, polynomial algebra is a system for which the data
types cannot be naturally arranged in a tower. For instance, it is
possible to have polynomials in *x* whose coefficients are polynomials
in *y*. It is also possible to have polynomials in *y* whose
coefficients are polynomials in *x*. Neither of these types is
"above" the other in any natural way, yet it is often necessary to
add together elements from each set. There are several ways to do
this. One possibility is to convert one polynomial to the type of the
other by expanding and rearranging terms so that both polynomials have
the same principal variable. One can impose a towerlike structure on
this by ordering the variables and thus always converting any
polynomial to a
"canonical form" with the highest-priority variable
dominant and the lower-priority variables buried in the coefficients.
This strategy works fairly well, except that the conversion may expand
a polynomial unnecessarily, making it hard to read and perhaps less
efficient to work with. The tower strategy is certainly not natural
for this domain or for any domain where the user can invent new types
dynamically using old types in various combining forms, such as
trigonometric functions, power series, and integrals.

It should not be surprising that controlling coercion is a serious problem in the design of large-scale algebraic-manipulation systems. Much of the complexity of such systems is concerned with relationships among diverse types. Indeed, it is fair to say that we do not yet completely understand coercion. In fact, we do not yet completely understand the concept of a data type. Nevertheless, what we know provides us with powerful structuring and modularity principles to support the design of large systems.

**Exercise.**
By imposing an ordering on variables, extend the polynomial package so
that addition and multiplication of polynomials works for polynomials
in different variables. (This is not easy!)

Extended exercise: Rational functions
We can extend our generic arithmetic system to include *rational
functions*. These are "fractions" whose numerator and denominator
are polynomials, such as

$$\frac{x+1}{x^3-1}$$

The system should be able to add, subtract, multiply, and divide rational functions, and to perform such computations as

$$\frac{x+1}{x^3-1} + \frac{x}{x^2-1} = \frac{x^3+2x^2+3x+1}{x^4+x^3-x-1}$$

(Here the sum has been simplified by removing common factors. Ordinary "cross multiplication" would have produced a fourth-degree polynomial over a fifth-degree polynomial.)
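The degree claim can be checked by hand. The derivation below is a worked sketch for the sum of $\frac{x+1}{x^3-1}$ and $\frac{x}{x^2-1}$, the two fractions built in the test at the end of this extended exercise. The first line is plain cross multiplication; the second removes the common factor $x - 1$:

```latex
\begin{aligned}
\frac{x+1}{x^3-1} + \frac{x}{x^2-1}
  &= \frac{(x+1)(x^2-1) + x(x^3-1)}{(x^3-1)(x^2-1)}
   = \frac{x^4 + x^3 + x^2 - 2x - 1}{x^5 - x^3 - x^2 + 1} \\
  &= \frac{(x-1)(x^3 + 2x^2 + 3x + 1)}{(x-1)(x^4 + x^3 - x - 1)}
   = \frac{x^3 + 2x^2 + 3x + 1}{x^4 + x^3 - x - 1}
\end{aligned}
```

The unreduced form is a fourth-degree polynomial over a fifth-degree polynomial, and dividing both by $x - 1$ lowers each degree by one.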

If we modify our rational-arithmetic package so that it uses generic operations, then it will do what we want, except for the problem of reducing fractions to lowest terms.

**Exercise.**
Modify the rational-arithmetic package to use generic operations, but
change `make-rat` so that it does not attempt to reduce fractions
to lowest terms. Test your system by calling `make-rational` on
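two polynomials to produce a rational function:

    (define p1 (make-polynomial 'x '((2 1) (0 1))))
    (define p2 (make-polynomial 'x '((3 1) (0 1))))
    (define rf (make-rational p2 p1))

Now add `rf` to itself, using `add`. You will observe that this addition procedure does not reduce fractions to lowest terms.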

We can reduce polynomial fractions to lowest terms using the same idea
we used with integers: modifying `make-rat` to divide both the
numerator and the denominator by their greatest common divisor. The
notion of
"greatest common divisor" makes sense for polynomials. In
fact, we can compute the GCD of two polynomials using essentially the
same Euclid's Algorithm that works for integers. The
integer version is

    (define (gcd a b)
      (if (= b 0)
          a
          (gcd b (remainder a b))))

Using this, we could make the obvious modification to define a GCD operation that works on term lists:

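    (define (gcd-terms a b)
      (if (empty-termlist? b)
          a
          (gcd-terms b (remainder-terms a b))))

where `remainder-terms` picks out the remainder component of the list returned by the term-list division operation `div-terms` from the division exercise above.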

**Exercise.**
Using `div-terms`, implement the procedure `remainder-terms` and
use this to define `gcd-terms` as above. Now write a procedure
`gcd-poly` that computes the polynomial GCD of two polys.
(The procedure should signal an error if the two polys are not
in the same variable.) Install in the system a generic operation
`greatest-common-divisor` that reduces to `gcd-poly` for polynomials
and to ordinary `gcd` for ordinary numbers. As a test, try

    (define p1 (make-polynomial 'x '((4 1) (3 -1) (2 -2) (1 2))))
    (define p2 (make-polynomial 'x '((3 1) (1 -1))))
    (greatest-common-divisor p1 p2)

and check your result by hand.

**Exercise.**
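Define $P_1$, $P_2$, and $P_3$ to be the polynomials

$$\begin{aligned}
P_1 &: \quad x^2 - 2x + 1 \\
P_2 &: \quad 11x^2 + 7 \\
P_3 &: \quad 13x + 5
\end{aligned}$$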

Now define $Q_1$ to be the product of $P_1$ and $P_2$ and $Q_2$ to
be the product of $P_1$ and $P_3$, and use `greatest-common-divisor`
(from the preceding exercise) to
compute the GCD of $Q_1$ and $Q_2$.
Note that the answer is not the same as $P_1$.
This example introduces noninteger
operations into the computation, causing difficulties with the GCD
algorithm.
To understand what is happening,
try tracing `gcd-terms` while computing the GCD or
try performing the division by hand.

We can solve the problem exhibited in the preceding exercise if
we use the following modification of the GCD algorithm (which really
works only in the case of polynomials with integer coefficients).
Before performing any polynomial division in the GCD computation, we
multiply the dividend by an integer constant factor, chosen to
guarantee that no fractions will arise during the division process.
Our answer will thus differ from the actual GCD by an integer constant
factor, but this does not matter in the case of reducing rational
functions to lowest terms; the GCD will be used to divide both the
numerator and denominator, so the integer constant factor will cancel
out.

More precisely, if *P* and *Q* are polynomials, let $O_1$ be the
order of *P* (i.e., the order of the largest term of *P*) and let
$O_2$ be the order of *Q*. Let *c* be the leading coefficient of
*Q*. Then it can be shown that, if we multiply *P* by the
*integerizing factor* $c^{1+O_1-O_2}$, the resulting polynomial can
be divided by *Q* by using the `div-terms` algorithm without
introducing any fractions. The operation of multiplying the dividend
by this constant and then dividing is sometimes called the
*pseudodivision* of *P* by *Q*. The remainder of the division is
called the *pseudoremainder*.
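A small worked case may help. The polynomials $P = x^3 + 1$ and $Q = 2x + 1$ are chosen here purely for illustration and do not come from the text. They give $O_1 = 3$, $O_2 = 1$, and $c = 2$. The first row computes the integerizing factor and scales the dividend; the remaining rows are the subtraction steps of ordinary long division by $Q$:

```latex
\begin{aligned}
c^{1+O_1-O_2} &= 2^{1+3-1} = 8, \qquad 8P = 8x^3 + 8 \\
(8x^3 + 8) - 4x^2\,(2x + 1) &= -4x^2 + 8 \\
(-4x^2 + 8) - (-2x)\,(2x + 1) &= 2x + 8 \\
(2x + 8) - 1\cdot(2x + 1) &= 7
\end{aligned}
```

The quotient terms $4x^2$, $-2x$, and $1$ are all integers, and the pseudoremainder is $7$. Dividing the unscaled $P$ by $Q$ would instead start with the fractional term $\tfrac{1}{2}x^2$.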

**Exercise.**
a. Implement the procedure `pseudoremainder-terms`, which is just like
`remainder-terms` except that it multiplies the dividend by
the integerizing factor described above before calling `div-terms`.
Modify `gcd-terms` to use `pseudoremainder-terms`, and verify
that `greatest-common-divisor` now produces an answer with integer
coefficients on the example in the preceding exercise.

b. The GCD now has integer coefficients, but they are larger than those
of $P_1$. Modify `gcd-terms` so that it removes common factors from
the coefficients of the answer by dividing all the coefficients by their
(integer) greatest common divisor.

Thus, here is how to reduce a rational function to lowest terms:

- Compute the GCD of the numerator and denominator, using
the version of `gcd-terms` from the preceding exercise.
- When you obtain the GCD, multiply both numerator and
denominator by the same integerizing factor before dividing through by
the GCD, so that division by the GCD will not introduce any noninteger
coefficients. As the factor you can use the leading coefficient of
the GCD raised to the power $1+O_1-O_2$, where $O_2$ is the order of
the GCD and $O_1$ is the maximum of the orders of the numerator and
denominator. This will ensure that dividing the numerator and
denominator by the GCD will not introduce any fractions.
- The result of this operation will be a numerator and denominator
with integer coefficients. The coefficients will normally be very
large because of all of the integerizing factors, so the last step is
to remove the redundant factors by computing the (integer) greatest
common divisor of all the coefficients of the numerator and the
denominator and dividing through by this factor.

**Exercise.**
a. Implement this algorithm as a procedure `reduce-terms` that takes two
term lists `n` and `d` as arguments and returns a list
`nn`, `dd`, which are `n` and `d` reduced to lowest terms
via the algorithm given above.
Also write a procedure `reduce-poly`, analogous to `add-poly`,
that checks to see if the two polys have
the same variable. If so, `reduce-poly` strips off the variable and
passes the problem to `reduce-terms`, then reattaches the variable
to the two term lists supplied by `reduce-terms`.

b. Define a procedure analogous to `reduce-terms`
that does what the original `make-rat` did for integers:

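    (define (reduce-integers n d)
      (let ((g (gcd n d)))
        (list (/ n g) (/ d g))))

and define `reduce` as a generic operation that dispatches to `reduce-poly` for polynomials and to `reduce-integers` for ordinary numbers. Have `make-rat` call `reduce` before combining the given numerator and denominator, so that the rational-arithmetic package reduces fractions to lowest terms. To test your program, try the example from the beginning of this extended exercise: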

    (define p1 (make-polynomial 'x '((1 1) (0 1))))
    (define p2 (make-polynomial 'x '((3 1) (0 -1))))
    (define p3 (make-polynomial 'x '((1 1))))
    (define p4 (make-polynomial 'x '((2 1) (0 -1))))

    (define rf1 (make-rational p1 p2))
    (define rf2 (make-rational p3 p4))

    (add rf1 rf2)

See if you get the correct answer, correctly reduced to lowest terms.

The GCD computation is at the heart of any system that does operations
on rational functions. The algorithm used above, although
mathematically straightforward, is extremely slow. The slowness is
due partly to the large number of division operations and partly to
the enormous size of the intermediate coefficients generated by the
pseudodivisions. One of the active areas in the development of
algebraic-manipulation systems is the design of better algorithms for
computing polynomial GCDs.