Abstract
A paper by Shier (J. Res NBS 80B) shows how to partition the graph of a matrix into a tree so as to minimize the number of operations required to invert the matrix. The present paper shows how to economically solve a sparse system of linear equations after the application of Shier’s method to the coefficient matrix.
Keywords: Sparse equations, tree partitions
1. Introduction
In [1]¹ Shier points out that if (a) the graph corresponding to a sparse matrix $A$ is partitioned into subgraphs which themselves can be regarded as nodes of a tree, and (b) the nodes of this tree are suitably numbered, then $A$ can be partitioned as $(A_{ij})$, where the $A_{ij}$ are submatrices and the $i$th row of $A$ is
$$ (A_{i1}, \ldots, A_{i,i-1},\ A_{ii},\ 0, \ldots, 0,\ A_{i,r(i)},\ 0, \ldots, 0) \tag{1} $$
and where node $r(i)$ is the “father” of $i$ in the tree. Also, $A_{ik} = 0$ $(k < i)$ unless $r(k) = i$, and $A$ is block incidence-symmetric. He then describes a relatively efficient way of finding $A^{-1}$, involving the computation of $A_{ii}^{-1}$ and similar sub-matrices by standard methods for dense matrices combined with recursive application of his algorithm. He also describes a method for carrying out the tree partitioning in (a) above.
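As an illustration of the structure in (1), consider a hypothetical tree of four nodes with $r(1) = r(2) = 3$ and $r(3) = 4$ (this small example is an assumption made here, not taken from [1]). Under the conditions just stated, the block pattern of $A$ would be:

```latex
A =
\begin{pmatrix}
A_{11} & 0      & A_{13} & 0      \\
0      & A_{22} & A_{23} & 0      \\
A_{31} & A_{32} & A_{33} & A_{34} \\
0      & 0      & A_{43} & A_{44}
\end{pmatrix}
```

Each row $i < 4$ has exactly one nonzero block to the right of the diagonal, in column $r(i)$, and the sub-diagonal blocks occupy the mirrored positions.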
Unfortunately he does not describe in detail how his method can be applied to the much more common problem of solving sparse equations, although he does mention (p. 252, lines 3–5) that it can be so applied. This will now be done.
2. Solution of Equations
We have to solve
$$ Ax = b \tag{2} $$
where $A$ is partitioned as in (1) and $x$ and $b$ are partitioned conformably into sub-vectors
$$ x = (x_1, \ldots, x_n)^T, \qquad b = (b_1, \ldots, b_n)^T. \tag{3} $$
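We may very efficiently solve (2) by block Gaussian elimination as follows:
(A) (Elimination of sub-diagonal sub-matrices) For $i = 1, \ldots, n-1$ do:
$$ m_{r(i),i} = A_{r(i),i} A_{ii}^{-1} \tag{4} $$
Eliminate $A_{r(i),i}$ by subtracting $m_{r(i),i} \times$ (row $i$) from row $r(i)$, i.e.
$$ A_{r(i),r(i)} \leftarrow A_{r(i),r(i)} - m_{r(i),i} A_{i,r(i)} \tag{5} $$
$$ b_{r(i)} \leftarrow b_{r(i)} - m_{r(i),i} b_i \tag{6} $$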
(B) (Back-Substitution)
$$ x_n = A_{nn}^{-1} b_n \tag{7} $$
For $i = n-1, \ldots, 1$ do:
$$ x_i = A_{ii}^{-1} \left( b_i - A_{i,r(i)} x_{r(i)} \right) \tag{8} $$
(The above simply constitutes Gaussian Elimination with coefficients consisting of submatrices instead of scalars.) The great advantage of this method is that there is no fill-in except within the blocks, i.e., a zero submatrix always remains zero.
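The elimination and back-substitution steps above can be sketched in Python as follows. This is a minimal illustration, not the author's code: the dictionary-of-blocks layout, the helper names (`matmul`, `matsub`, `inverse`, `solve_tree_blocks`) and the small test system are all assumptions made for the example. Step (A) is read as forming $m_{r(i),i} = A_{r(i),i} A_{ii}^{-1}$ and updating row $r(i)$.

```python
def matmul(X, Y):
    """Product of two dense matrices stored as lists of rows."""
    return [[sum(a * c for a, c in zip(row, col)) for col in zip(*Y)] for row in X]


def matsub(X, Y):
    """Elementwise difference X - Y."""
    return [[a - c for a, c in zip(rx, ry)] for rx, ry in zip(X, Y)]


def inverse(M):
    """Dense inverse by Gauss-Jordan elimination with partial pivoting."""
    p = len(M)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(p)]
           for i, row in enumerate(M)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(p):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[p:] for row in aug]


def solve_tree_blocks(A, b, father, n):
    """Solve A x = b for a tree-partitioned block matrix.

    A: {(i, j): block} holding only the nonzero blocks (1-based indices),
    b: {i: column block}, father: {i: r(i)} for i = 1..n-1.
    """
    A, b = dict(A), dict(b)          # blocks are replaced, never mutated
    inv = {}
    # (A) eliminate the sub-diagonal block A[r(i), i] for i = 1..n-1
    for i in range(1, n):
        r = father[i]
        inv[i] = inverse(A[(i, i)])
        m = matmul(A[(r, i)], inv[i])
        A[(r, r)] = matsub(A[(r, r)], matmul(m, A[(i, r)]))
        b[r] = matsub(b[r], matmul(m, b[i]))
    # (B) back-substitution, starting from the root block n
    x = {n: matmul(inverse(A[(n, n)]), b[n])}
    for i in range(n - 1, 0, -1):
        rhs = matsub(b[i], matmul(A[(i, father[i])], x[father[i]]))
        x[i] = matmul(inv[i], rhs)
    return x


# Hypothetical system: three 2x2 blocks, tree with r(1) = r(2) = 3.
I2 = [[1, 0], [0, 1]]
P2 = [[0, 1], [1, 0]]
A = {(1, 1): [[4, 1], [1, 3]], (1, 3): I2, (3, 1): I2,
     (2, 2): [[5, 2], [2, 4]], (2, 3): P2, (3, 2): P2,
     (3, 3): [[6, 1], [1, 7]]}
b = {1: [[1], [2]], 2: [[3], [4]], 3: [[5], [6]]}
x = solve_tree_blocks(A, b, father={1: 3, 2: 3}, n=3)
```

Only the blocks present in the dictionary are ever touched, which mirrors the observation that a zero submatrix remains zero throughout.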
3. A More Economical Method
Further economy can be obtained by omitting the explicit calculation of m in (4). Rather we can perform triangular decomposition
$$ A_{ii} = L_i U_i \quad (L_i \text{ lower triangular},\ U_i \text{ upper triangular}) \tag{9} $$
Then (6) can be replaced by:
| (10) |
| (11) |
| (12) |
| (13) |
| (14) |
Equation (5) can be replaced by similar calculations with each column of $A_{i,r(i)}$ taking the place, in turn, of $b_i$. Equation (7) can be replaced by (9), (10), (11) with $i = n$. Equation (8) can be replaced by:
| (15) |
| (16) |
| (17) |
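One way to read this section: once the triangular factors of $A_{ii}$ are available, every product of the form $A_{ii}^{-1} v$ is obtained by a forward substitution followed by a back substitution, so that neither $A_{ii}^{-1}$ nor $m_{r(i),i}$ is ever formed. The sketch below illustrates that idea under stated assumptions: the names `lu_factor`, `lu_solve` and `eliminate_rhs` are hypothetical, the factorization is a Doolittle scheme without pivoting (nonzero pivots are assumed), and the example block and vectors are invented.

```python
def lu_factor(M):
    """Doolittle factorization M = L U, no pivoting (assumes nonzero pivots)."""
    p = len(M)
    L = [[float(i == j) for j in range(p)] for i in range(p)]
    U = [[float(v) for v in row] for row in M]
    for k in range(p):
        for i in range(k + 1, p):
            L[i][k] = U[i][k] / U[k][k]
            U[i] = [u - L[i][k] * uk for u, uk in zip(U[i], U[k])]
            U[i][k] = 0.0            # clear rounding residue below the diagonal
    return L, U


def lu_solve(L, U, v):
    """Solve L U z = v for a plain list v."""
    p = len(v)
    y = [0.0] * p
    for i in range(p):               # forward substitution: L y = v
        y[i] = v[i] - sum(L[i][j] * y[j] for j in range(i))
    z = [0.0] * p
    for i in reversed(range(p)):     # back substitution: U z = y
        z[i] = (y[i] - sum(U[i][j] * z[j] for j in range(i + 1, p))) / U[i][i]
    return z


def eliminate_rhs(A_ri, L, U, b_i, b_r):
    """Return b_r - A_ri (A_ii^{-1} b_i) using the factors of A_ii only."""
    z = lu_solve(L, U, b_i)
    return [br - sum(a * zj for a, zj in zip(row, z)) for br, row in zip(b_r, A_ri)]


# Hypothetical diagonal block and right-hand sides.
A_ii = [[4, 1], [1, 3]]
b_i = [1, 2]
L, U = lu_factor(A_ii)
z = lu_solve(L, U, b_i)
new_b_r = eliminate_rhs([[1, 0], [0, 1]], L, U, b_i, [5, 6])
```

Applying `lu_solve` to each column of $A_{i,r(i)}$ in turn gives the corresponding update of $A_{r(i),r(i)}$, and the same two substitutions serve for the back-substitution stage.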
4. Operation Count
(I) If the explicit inverse $A_{ii}^{-1}$ and $m_{r(i),i}$ are employed as in §2, using dense matrix techniques, the operation count is as follows, assuming $A_{ii}$ is of order $p_i \times p_i$ and $p_i = p$ for all $i$: the formation of $A_{ii}^{-1}$, the formation of $m_{r(i),i}$, and equation (5) each require $O(p^3)$ multiplications, for a total of $O(3p^3)$; while eq (6) requires $O(p^2)$, (7) requires $O(p^3 + p^2)$ and (8) requires $O(2p^2)$. Thus, the total number of multiplications is approximately
$$ (n-1)(3p^3 + 3p^2) + (p^3 + p^2) \approx 3np^3 + 3np^2 \tag{18} $$
(II) If the method of §3 is used we have: equation (9) requires $O(p^3/3)$ multiplications; equations (10), (11) and (13) together need $O(2p^2)$. The solution of equations (10), (11) and (13) with any column of $A_{i,r(i)}$ in place of $b_i$ requires $O(2p^2)$ for each column, i.e. $O(2p^3)$ in all. Equations (15)–(17) require $O(2p^2)$. Equations (9), (10) and (11) for $i = n$ require $O(p^3/3 + p^2)$. Thus, the total number of multiplications required for this method is approximately
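$$ (n-1)\left(\tfrac{7}{3}p^3 + 4p^2\right) + \left(\tfrac{1}{3}p^3 + p^2\right) \approx \tfrac{7}{3}np^3 + 4np^2 \tag{19} $$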
Thus the method of §3 is more efficient for large $p_i = p$, when we may ignore multiplicative and overhead factors.
(III) If the equations are solved directly without any partitioning, as if they were full, the number of multiplications required is $O(n^3p^3/3)$, which for large $n$ is much greater than either of the totals (18) and (19), both of which grow only linearly in $n$.
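The three tallies can be compared numerically with a short sketch. The function names are hypothetical, and two of the counts are assumptions based on standard dense-matrix figures rather than on expressions legible in the text: $p^3/3 + p^2$ for factoring and solving with the last block, and $(np)^3/3$ for unpartitioned elimination.

```python
def count_explicit_inverse(n, p):
    """Multiplications for the method of section 2, from the per-step tallies."""
    per_step = 3 * p**3 + p**2 + 2 * p**2    # inverse, m and (5); (6); (8)
    last_block = p**3 + p**2                 # (7)
    return (n - 1) * per_step + last_block


def count_triangular(n, p):
    """Multiplications for the method of section 3, from the per-step tallies."""
    per_step = p**3 / 3 + 2 * p**2 + 2 * p**3 + 2 * p**2
    last_block = p**3 / 3 + p**2             # assumed: factor, then two substitutions
    return (n - 1) * per_step + last_block


def count_dense(n, p):
    """Assumed leading term for Gaussian elimination on the full order-np matrix."""
    return (n * p) ** 3 / 3


# Hypothetical sizes: 10 blocks of order 20.
n_blocks, block_order = 10, 20
counts = (count_triangular(n_blocks, block_order),
          count_explicit_inverse(n_blocks, block_order),
          count_dense(n_blocks, block_order))
```

For these sizes the three values come out in increasing order, and the gap to the dense count widens as the number of blocks grows.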
5. Labelling of Tree Nodes
The nodes of the tree must be numbered in such a way that its incidence matrix has the form (1). This can be accomplished for example by a modification of the “Reverse Cuthill-McKee Algorithm” [2]. (This was originally devised as a band-width minimization technique, although that aspect has no relevance in the present context.) Simplified and re-worded for our purposes the algorithm may be described thus:
(A) Suppose there are $N$ nodes in the tree. Choose an arbitrary node and number it $N$ (this is defined as the only member of “level” 1). Set $I = 1$ and $J = N - 1$.
(B) Consider all nodes adjacent to nodes in level $I$ but as yet unnumbered (they will be defined as members of level $I + 1$). Suppose there are $K$ such nodes in all. If $K = 0$ terminate. Otherwise assign to them the numbers $J, J - 1, \ldots, J - K + 1$. Set $J = J - K$ and $I = I + 1$.
(C) Repeat step (B) until $K = 0$.
It is simple to prove that the incidence matrix of a tree thus numbered has the form (1), i.e. each node (numbered i, say) is adjacent to only one node having a higher number (say r(i)).
Proof: Suppose, if possible, that a node numbered $i$ is adjacent to two nodes numbered $r_1$ and $r_2$ such that $r_1 > i$, $r_2 > i$. Then the nodes $r_1$ and $r_2$ belong to lower levels than node $i$. Hence they are both connected, via paths not including node $i$, to node $N$. Thus we have two separate paths connecting nodes $i$ and $N$, i.e. we have a loop. But this contradicts the assumption that the graph is a tree. Hence there must be only one node adjacent to $i$ with number greater than $i$. Q.E.D.
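The numbering procedure of this section can be sketched as a level-by-level traversal. The function name `number_tree`, the adjacency-dictionary input and the four-node example are assumptions made for illustration; the function also returns the father map $r(i)$, which the text does not ask for but which follows directly from the traversal.

```python
def number_tree(adj, root):
    """Number the nodes of a tree level by level, from N downwards.

    adj: {node: list of neighbours}; root: the node that receives number N.
    Returns (number, father), where father maps i to r(i) in the new numbering.
    """
    N = len(adj)
    number = {root: N}               # level 1 holds only the chosen node
    father = {}
    level = [root]
    J = N - 1
    while True:
        next_level = []
        for u in level:
            for v in adj[u]:
                if v not in number:  # adjacent to level I but as yet unnumbered
                    number[v] = J
                    father[J] = number[u]
                    J -= 1
                    next_level.append(v)
        if not next_level:           # K = 0: terminate
            break
        level = next_level
    return number, father


# Hypothetical tree with edges a-b, a-c, c-d, numbered from node "a".
adj = {"a": ["b", "c"], "b": ["a"], "c": ["a", "d"], "d": ["c"]}
number, father = number_tree(adj, "a")
```

In the resulting numbering every node other than the starting one has exactly one higher-numbered neighbour, which is the property established in the proof above.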
Footnotes
Figures in brackets indicate the literature references at the end of this paper.
6. References
- [1] Shier, D. R., Inverting sparse matrices by tree partitioning, J. Res. Nat. Bur. Stand. (U.S.), 80B (Math. Sci.), No. 2, pp. 245–257 (April–June 1976).
- [2] Cuthill, E., Several strategies for reducing the bandwidth of a matrix, in: Sparse Matrices and Their Applications, Ed. D. J. Rose and R. A. Willoughby, pp. 157–160 (Plenum Press, New York, N.Y., 1972).
