Abstract
The integer division of a numerator n by a divisor d gives a quotient q and a remainder r. Optimizing compilers accelerate software by replacing the division of n by d with the division of $c * n$ (or $c * n + c$) by m for convenient integers c and m chosen so that they approximate the reciprocal: $c/m \approx 1/d$. Such techniques are especially advantageous when m is chosen to be a power of two and when d is a constant so that c and m can be precomputed. The literature contains many bounds on the distance between $c/m$ and the reciprocal of the divisor d. Some of these bounds are optimally tight, while others are not. We present optimally tight bounds for quotient and remainder computations.
Keywords: Integer division, Compiler optimization, Tight bounds
Highlights
• Generalizes theory on how to replace a division by a multiplication followed by a shift.
• Includes both the computation of the quotient and remainder.
• Introduces several new tighter bounds.
1. Introduction
The problem of computing the integer division given constant divisors has a long history in computer science [1], [2], [3], [4]. Granlund and Montgomery [5] present the first general-purpose algorithms to divide integers by constants using a multiplication and a division by a power of two: their work was adopted by the GNU Compiler Collection (GCC). Given any non-zero 32-bit divisor known at compile time, the optimizing compiler can replace the division by a multiplication followed by a shift. Warren [6] improved on the Granlund and Montgomery technique by deriving a better bound that gives a wider range of choices. Warren's better approach is found in LLVM's Clang compiler. Many optimizing compilers rely on equivalent techniques, either based on the original Granlund-Montgomery article or on Warren's technique.
Robison [7] describes a slightly superior alternative for some divisors in that we multiply and add the multiplier before dividing by a power of two (henceforth the multiply-add technique). Though it comes at the cost of an addition, it allows one to choose a smaller multiplier, which can be advantageous. Robison's approach is implemented in the popular libdivide library [8].
Most of the literature is focused on the computation of the quotient q of the division of n by d. From the quotient q, we can compute the remainder as $r = n - q * d$. We can also compute the remainder directly [9] without first computing the quotient: we take the remainder of $c * n$ divided by m, multiply it by d, and divide the result by m. However, for the remainder and the quotient to be exact, it is necessary that $c/m$ approximates $1/d$ more closely than if we merely need the quotient.
From the computation of remainders, we can derive a divisibility check, that is check whether d divides n, or, equivalently, check that n is a multiple of d. Though it may seem that computing the remainder and checking whether it is zero is efficient, we can simplify and accelerate the algorithm by avoiding the computation of the remainder.
The literature commonly assumes that m is a power of two. We approach the problem more generally, letting m and c be any integers, and restricting the numerator to an interval $[0, N]$ where N can be any integer. This makes our exposition more general, while simplifying the notation.
Some of our novel contributions are as follows:
• We improve Robison's bound [7], in a manner similar to how Warren improved Granlund and Montgomery's bound. That is, we provide an optimal bound for the multiply-add technique.1
• We derive new tighter bounds for computing the quotient directly and checking the divisibility, thus improving on the work of Lemire et al. [9].
• We show that we can adapt Robison's technique to compute remainders directly and derive a novel bound. We adapt the multiply-add technique for the purpose of a divisibility check. To our knowledge, these results are novel.
All our bounds on how close $c/m$ must be to $1/d$ are optimal and form necessary and sufficient conditions. Table 1 presents our core results in a concise manner.
Table 1.
Summary of main results. Throughout, all values are non-negative integers, the divisor is non-zero (d > 0), and the numerator is bounded by N ≥ d, so that n ∈ [0,N]. We add the constraint that c ∈ [0,m) so that c is as small as possible.
Theorem 1 and Warren [11] | |
statement: | division(n,d)=division(c⁎n,m) for all n ∈ [0,N] |
condition: | $\frac{1}{d} \le \frac{c}{m} < \left(1 + \frac{1}{N - \mathrm{remainder}(N+1, d)}\right)\frac{1}{d}$ |
Theorem 2 (novel, improves Lemire et al. [9, Theorem 1]) | |
statement: | division(n,d)=division(c⁎n,m) and remainder(n,d)=division(remainder(c⁎n,m)⁎d,m) for all n ∈ [0,N] |
condition: | $\frac{1}{d} \le \frac{c}{m} < \left(1 + \frac{1}{N}\right)\frac{1}{d}$ |
Proposition 1 (novel, generalizes Lemire et al. [9]) | |
statement: | d divides n ∈ [0,N] if and only if remainder(c⁎n,m)<c |
condition: | |
Theorem 3 (improves Robison [7]) | |
statement: | division(n,d)=division(c⁎n + c,m) for all n ∈ [0,N] |
condition: | $\left(1 - \frac{1}{N - \mathrm{remainder}(N, d) + 1}\right)\frac{1}{d} \le \frac{c}{m} < \frac{1}{d}$ |
Theorem 4 (novel) | |
statement: | division(n,d)=division(c⁎n + c,m) and remainder(n,d)=division(remainder(c⁎n + c,m)⁎d,m) for all n ∈ [0,N] |
condition: | $\left(1 - \frac{1}{N + 1}\right)\frac{1}{d} \le \frac{c}{m} < \frac{1}{d}$ |
Proposition 2 (novel) | |
statement: | d divides n ∈ [0,N] if and only if remainder(c⁎n + c,m)<c |
condition: | |
2. Other related work
The problem of quickly computing the division by a constant in computers dates back to at least the 1970s. Jacobsohn [2] shows that we can divide by an odd integer by multiplying by a fractional inverse, followed by some rounding. Artzy et al. [1] describe a related algorithm to divide multiples of a known divisor (exact division). Li [3] presents algorithms for integer division by all odd integers up to 55 [11, § 10-18]. Divisions are executed as series of “shift and add” instructions.
Magenheimer et al. [12] describe how to compute the division of integers by odd divisors as a multiplication and an addition followed by a division by a power of two. Their approach was later refined by Robison [7]. Similarly, Granlund and Montgomery's approach [5] (without an intermediate addition) was refined by Cavagnino and Werbrouck [13], and later by Warren [11]. As remarked by Robison [7], the two approaches (with and without an intermediate addition) are complementary: we can choose one or the other depending on the divisor. We review and elaborate on this complementarity in § 6.
To our knowledge, the latest work on the software acceleration of the division by constants was Lemire et al. [9]. They revisited two specific problems: the direct computation of the remainder—without first computing the quotient—and the related divisibility tests. Compared to optimizing compilers that compute the remainder by first computing the quotient, they found that their direct approach could be up to 30% faster. Their divisibility test could be twice as fast as the code produced by popular optimizing compilers and libraries. It can also be up to twice as fast as the state-of-the-art divisibility check proposed by Granlund and Montgomery [5]. They did not consider the multiply-add approach, a gap that we fill with § 5. We also make their main result [9, Theorem 1] tighter (see Theorem 2). We similarly improve mathematically on their divisibility check [9, Proposition 1] (see Proposition 1). Our improvements may not immediately result in improved software performance, but they fill a conceptual gap. The systematic computation of the remainder directly as proposed by Lemire et al., without first computing the quotient, has received attention in the hardware and circuit literature [14], [15], [16], [17] but had never been generally exploited in software as far as we know. One practical exception was the work by Vowels [4] who described the direct computation of both the quotient and remainder, in the special case where we divide by 10.
3. Technical preliminaries
For non-negative real numbers z, $\lfloor z \rfloor$ is the greatest integer no larger than z. It is a monotonic function: if $x \le y$ then $\lfloor x \rfloor \le \lfloor y \rfloor$.
We define and
(1) |
(2) |
for positive real numbers with the constraint that . We have that . If y is an integer and x is not an integer, then . By definition, we always have that .
Lemma 1
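Consider a positive integer d, a non-negative integer n and a non-negative real number x. We have that $\mathrm{division}(n,d) = \lfloor x \rfloor$ and $\mathrm{remainder}(n,d) = \lfloor d \, (x - \lfloor x \rfloor) \rfloor$ if and only if $n/d \le x < (n+1)/d$.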
Proof
(⇐) We can verify that if , the previous two conditions are satisfied.
(⇒) Assume that and . We have
(3)
(4) By expanding out , we have that
(5)
(6) Expanding out and multiplying by d, we have . We establish the lemma by adding this last equation to the previous inequality. □
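As a sanity check, Lemma 1 can be tested exhaustively over small values with exact rational arithmetic. The sketch below assumes the lemma reads: $\mathrm{division}(n,d) = \lfloor x \rfloor$ and $\mathrm{remainder}(n,d) = \lfloor d (x - \lfloor x \rfloor) \rfloor$ if and only if $n/d \le x < (n+1)/d$. The function names and the search ranges are ours and purely illustrative.

```python
from fractions import Fraction
from math import floor


def lemma1_both_sides(n, d, x):
    """Evaluate both sides of the assumed equivalence for an exact rational x."""
    frac = x - floor(x)  # fractional part of x
    lhs = (n // d == floor(x)) and (n % d == floor(d * frac))
    rhs = Fraction(n, d) <= x < Fraction(n + 1, d)
    return lhs, rhs


def check_lemma1(max_d=6, max_n=20, denom=12):
    """Return True if both sides agree for all x = k/denom in the tested range."""
    for d in range(1, max_d + 1):
        for n in range(max_n + 1):
            for k in range((max_n + 2) * denom):
                lhs, rhs = lemma1_both_sides(n, d, Fraction(k, denom))
                if lhs != rhs:
                    return False
    return True
```

A finite search of this kind does not replace the proof; it only guards against transcription mistakes in the statement.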
4. Multiply-divide results
Given a non-negative numerator n and non-zero divisor d, we want to show that by choosing integer constants c and m carefully, we can compute $\mathrm{division}(n,d)$ and $\mathrm{remainder}(n,d)$ by starting from $c * n$ and dividing by m.
4.1. Quotient
We want to find c and m such that . Intuitively, this equation implies that and . Let us formalize this intuition.
For any non-negative real number x and non-negative integer Q, we have that is equivalent to . Letting and , we get . Since , and , we have that is equivalent to .
Consider a range of integer numerators for some maximal integer numerator . We want this equation to hold for all n. The equation is satisfied trivially when and . Suppose that and rewrite the inequalities as .
Given any for some integer , we have that the leftmost expression is largest and equal to 1 when . Meanwhile the rightmost expression is smallest when n is as large as possible with . To prove this bound, partition the possible numerators into sets . Fixing N and d, we seek the value minimizing . For we have which is minimized for the largest member of . Let v be the largest member of ; we have . We see that the values of n in are the minimizing values in each . Among these, we can show v minimizes f.
-
•
Consider any with . Write it as with and and so and . As k increases, the numerator decreases and the denominator increases, we have that the minimum is reached when k is largest (), in which case , since .
-
•
Consider any with . Write n as , so is again . We have . Again, as k increases, the numerator decreases and the denominator increases, we have that the minimum is reached when k is largest () in which case .
Thus minimizes . We have shown Lemma 2 because .
Lemma 2
Given an integer , the value of over is minimized when n is .
Hence we have that is equivalent to for all .
Theorem 1
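Consider an integer divisor $d > 0$ and a range of integer numerators $n \in [0, N]$ where $N \ge d$ is an integer. We have that $\mathrm{division}(n,d) = \mathrm{division}(c * n, m)$ for all integer numerators n in the range if and only if
(7) $\frac{1}{d} \le \frac{c}{m} < \left(1 + \frac{1}{N - \mathrm{remainder}(N+1, d)}\right)\frac{1}{d}$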
Remark 1
Granlund and Montgomery [5] have an upper bound of as a sufficient (but not necessary) condition. A bound equivalent to Theorem 1 is derived by Warren [6].
Once we have a pair of inequalities as in Theorem 1, we can solve for c and m. It is always possible to do so: we can verify that , is always a solution. However, we may have further constraints on c and m: maybe we require m to be a power of two. We can show that as long as we can choose m arbitrarily large, there is always a solution. Letting , we can rewrite the inequalities as . Thus if c is to be as small as possible, we must have that . It remains to solve for m such that . Because , we have that the inequality is always satisfied when . This bound indicates that it is always possible to find a solution, by picking m large enough.
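To illustrate how one might solve for c and m when m must be a power of two, the following Python sketch searches for the smallest exponent that works. It assumes that the bound of Theorem 1 reads $1/d \le c/m < (1 + 1/(N - \mathrm{remainder}(N+1,d)))\,1/d$, which is our reading of the derivation in § 4.1, and it tests it in integer form to avoid rounding. The function names are ours. Note that the sketch does not enforce $c \in [0, m)$, so it returns $c = m$ when $d = 1$.

```python
def theorem1_holds(c, m, d, N):
    """Integer form of 1/d <= c/m < (1 + 1/v)/d with v = N - (N + 1) % d (assumed bound)."""
    v = N - (N + 1) % d
    return c * d >= m and c * d * v < m * (v + 1)


def smallest_power_of_two(d, N):
    """Return (c, k) where m = 2**k is the smallest power of two that works.

    For a given m, c = ceil(m/d) is the smallest c with c*d >= m; if it
    violates the upper bound, every larger c does too, so it is the only
    candidate worth testing.
    """
    k = 0
    while True:
        m = 1 << k
        c = -(-m // d)  # ceil(m / d) using integer arithmetic
        if theorem1_holds(c, m, d, N):
            return c, k
        k += 1


# 32-bit unsigned division by 3: c = 0xAAAAAAAB with a shift by 33 bits.
c3, k3 = smallest_power_of_two(3, 2**32 - 1)
```

Once the pair is known, the quotient of n by d is `(c * n) >> k` for every n in the range.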
4.2. Remainder
From the quotient , we get the quotient of the division of n by d; it is maybe intuitive that we can derive the remainder of the division of n by d from .
Formally, we want to find integer constants and such that for any integer numerator and integer divisor , we have that .
If we find c and m such that is satisfied, then replacing c with or would still work: in fact for any integer k. Thus we require c to be in .
With this constraint (), we are able to show (see Lemma 3) that the ability to compute remainders via implies that the quotient of n divided by d is given by . Intuitively, it is strictly more difficult to compute the remainder than to compute the quotient. Hence, if we just need the remainder, and not the quotient, we cannot relax our conditions when .
Lemma 3
Consider an integer divisor . Suppose that we have integer constants c and m such that and for all numerators then we must have that .
Proof
When , we have that holds trivially. Since then so when increases following an increment of n by one, it must increase by at most one. We just have to show that it happens exactly when .
We have . The left side of this equation increases by c exactly when n is incremented by one. When increases by one, then it contributes m to the right side. Since , we have that an increase of corresponds to a decrease of .
However, we have that . From this equation, we have that whenever increases when we increment n by one, then must also increase. We know that when n is incremented, then either increases by one, or goes back to zero. It is not possible for to increase if decreases: it must therefore be that a decrease in corresponds to . Thus we have that an increase of following an increment of n corresponds . It follows that . □
We still must derive the conditions on c and m. We can expand the condition that as follows:
(8) |
(9) |
(10) |
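Then by Lemma 1, we have that the two constraints ($\mathrm{division}(n,d) = \mathrm{division}(c * n, m)$ and $\mathrm{remainder}(n,d) = \mathrm{division}(\mathrm{remainder}(c * n, m) * d, m)$) are equivalent to $n/d \le c * n/m < (n+1)/d$, or, for $n > 0$, $1/d \le c/m < (1 + 1/n)\,1/d$. This condition should hold for all applicable values of n, and the maximal value of n (i.e., N) provides the tightest bound, so an equivalent expression is $1/d \le c/m < (1 + 1/N)\,1/d$.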
We have derived the following theorem.
Theorem 2
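Consider an integer divisor $d > 0$ and a range of integer numerators $n \in [0, N]$ where $N \ge d$ is an integer. We have that $\mathrm{division}(n,d) = \mathrm{division}(c * n, m)$ and $\mathrm{remainder}(n,d) = \mathrm{division}(\mathrm{remainder}(c * n, m) * d, m)$ for all integer numerators n in the range if and only if
(11) $\frac{1}{d} \le \frac{c}{m} < \left(1 + \frac{1}{N}\right)\frac{1}{d}$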
We can check that the conditions of Theorem 2 are always met with and .
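A minimal Python sketch of the direct computation follows. It assumes that the bound of Theorem 2 reads $1/d \le c/m < (1 + 1/N)\,1/d$, as suggested by the application of Lemma 1. The constants in the example ($N = 255$, $m = 2^{16}$, $c = \lceil m/d \rceil$) are our own illustrative choice and are not taken from the text; the function names are ours as well.

```python
def theorem2_holds(c, m, d, N):
    """Integer form of the assumed bound 1/d <= c/m < (1 + 1/N)/d."""
    return c * d >= m and c * d * N < m * (N + 1)


def quotient_and_remainder(n, c, m, d):
    """Quotient and remainder of n by d from the single product c * n."""
    t = c * n
    quotient = t // m                # division(c * n, m)
    remainder = ((t % m) * d) // m   # division(remainder(c * n, m) * d, m)
    return quotient, remainder


# Illustrative constants for 8-bit numerators and the divisor 7.
N = 255
m = 1 << 16
d = 7
c = -(-m // d)  # ceil(m / d) = 9363
assert theorem2_holds(c, m, d, N)
result = quotient_and_remainder(100, c, m, d)  # (14, 2)
```

When m is a power of two, the division by m is a shift and the remainder modulo m is a mask, so both outputs cost two multiplications in total.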
Remark 2
In previous work [9, Theorem 1], Lemire et al. reported an upper bound of as a sufficient (but not necessary) condition. For the difference to matter, we need that there is an integer in the interval . It happens in some instances, for example if , , and , we have that . However, in the previous work [9], Lemire et al. considered only the case where . We can show that if and then the earlier bound is tight. Indeed, if there is an integer z in then there must be an integer in . Substituting , the interval becomes : because is an integer and , there is no integer in this interval. Hence, there cannot be an integer in and the earlier bound is tight.
If we only desire the remainder, and not the quotient, we can lift the restriction that $c \in [0, m)$: we can replace c by $c + k * m$ for any integer k.
4.3. Check for divisibility
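We have that n is a multiple of d if and only if $\mathrm{remainder}(n,d) = 0$. Given Theorem 2, we can check whether $\mathrm{remainder}(n,d) = 0$ by checking whether $\mathrm{division}(\mathrm{remainder}(c * n, m) * d, m) = 0$. In turn, we have that this last equation holds if and only if $\mathrm{remainder}(c * n, m) * d < m$ or $\mathrm{remainder}(c * n, m) < m/d$. Thus $\mathrm{remainder}(c * n, m) < m/d$ is a divisibility test. However, we show the more elegant result that $\mathrm{remainder}(c * n, m) < c$ is a divisibility test (see Proposition 1).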
By the assumption of Theorem 2, we have that . Thus if n is a multiple of d, then we have that . We need to prove the counterpart, that implies that n is a multiple of d. By Theorem 2, we have that . Hence we have that since n and have the same quotient with respect to d. When two values have the same quotient () then their difference must be captured by their remainders: . In this case, taking and , we have that their difference is . It follows that and therefore . Thus if , we have which implies .
Proposition 1
Consider an integer divisor $d > 0$. We have that d divides $n \in [0, N]$ if and only if $\mathrm{remainder}(c * n, m) < c$ subject to the condition that
(12)
Proposition 1 selects a value of c in when .
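The divisibility test of Proposition 1 is short enough to sketch directly. The example below is only an illustration: it assumes constants that satisfy the Theorem 2 bound, which the argument above relies on, and it reuses the hypothetical choice $N = 255$, $m = 2^{16}$, $c = \lceil m/d \rceil$. The function name is ours.

```python
def is_divisible(n, c, m):
    """Return True when remainder(c * n, m) < c, the test of Proposition 1."""
    return (c * n) % m < c


# Illustrative constants for 8-bit numerators and the divisor 7.
m = 1 << 16
d = 7
c = -(-m // d)  # ceil(m / d) = 9363

multiple = is_divisible(98, c, m)       # True, since 98 = 14 * 7
not_multiple = is_divisible(100, c, m)  # False, since 100 = 14 * 7 + 2
```

Compared with computing the remainder and comparing it with zero, the test saves the second multiplication by d.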
5. Multiply-add-divide results
Some authors [7], [12] have considered the case where we replace the division by a formula of the multiply-add form $\mathrm{division}(c * n + b, m)$ for some b. The benefit of the multiply-add approach is that it may allow one to pick a smaller value of c, compared to the simpler form $\mathrm{division}(c * n, m)$. The derivations are nearly identical to those in § 4, so we just give our results.
Theorem 3
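Consider an integer divisor $d > 0$ and a range of integer numerators $n \in [0, N]$ where $N \ge d$ is an integer. We have that $\mathrm{division}(n,d) = \mathrm{division}(c * n + c, m)$ for all integer numerators n in the range if and only if
(13) $\left(1 - \frac{1}{N - \mathrm{remainder}(N, d) + 1}\right)\frac{1}{d} \le \frac{c}{m} < \frac{1}{d}$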
Remark 3
Robison [7] derived the sufficient condition . When , Robison's bound is suboptimal unlike Theorem 3. Drane et al. [10] derive a similar result to ours for the case where d is odd.
Theorem 4
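Consider an integer divisor $d > 0$ and a range of integer numerators $n \in [0, N]$ where $N \ge d$ is an integer. We have that $\mathrm{division}(n,d) = \mathrm{division}(c * n + c, m)$ and $\mathrm{remainder}(n,d) = \mathrm{division}(\mathrm{remainder}(c * n + c, m) * d, m)$ for all integer numerators n in the range if and only if
(14) $\left(1 - \frac{1}{N + 1}\right)\frac{1}{d} \le \frac{c}{m} < \frac{1}{d}$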
Proposition 2
Consider an integer divisor . We have that d divides if and only if subject to the condition that
(15)
We can check that the conditions of Theorem 3 are met when and as long as m is not divisible by d. Proposition 2 is satisfied with the more stringent inequality .
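Since the derivations are omitted in this section, a brute-force comparison is a useful check for the two theorems. The Python sketch below assumes that the bound of Theorem 3 reads $(1 - 1/(N - \mathrm{remainder}(N,d) + 1))\,1/d \le c/m < 1/d$ and that the bound of Theorem 4 reads $(1 - 1/(N+1))\,1/d \le c/m < 1/d$; both are our reading, obtained by repeating the arguments of § 4 with $c * n + c$ in place of $c * n$. It does not cover Proposition 2. All names and search ranges are ours.

```python
def theorem3_holds(c, m, d, N):
    """Assumed bound for the quotient alone, in integer form."""
    w = N - N % d + 1  # one more than the largest multiple of d in [0, N]
    return m * (w - 1) <= c * d * w and c * d < m


def theorem4_holds(c, m, d, N):
    """Assumed bound for the quotient and the remainder, in integer form."""
    return m * N <= c * d * (N + 1) and c * d < m


def quotient_ok(c, m, d, N):
    """Brute force: does division(c*n + c, m) give the quotient for all n?"""
    return all((c * n + c) // m == n // d for n in range(N + 1))


def both_ok(c, m, d, N):
    """Brute force: are the quotient and the direct remainder both exact?"""
    return all(
        (c * n + c) // m == n // d
        and (((c * n + c) % m) * d) // m == n % d
        for n in range(N + 1)
    )


def bounds_match_brute_force(max_d=4, max_N=10, max_m=24):
    """Compare the assumed bounds with exhaustive search on small inputs."""
    for d in range(1, max_d + 1):
        for N in range(d, max_N + 1):
            for m in range(1, max_m + 1):
                for c in range(m + 1):
                    if theorem3_holds(c, m, d, N) != quotient_ok(c, m, d, N):
                        return False
                    if theorem4_holds(c, m, d, N) != both_ok(c, m, d, N):
                        return False
    return True
```

Agreement on small inputs supports the "if and only if" form of both statements, although it is of course not a proof.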
6. Complementarity
When processing numerators and divisors in , it may be most convenient if the constant c is also in . In this respect, the multiply-shift and multiply-add-shift results are complementary as first shown by Robison [7]. Suppose that we want to divide all integers by d in the case where is a power of two. For example, we may have . We want the constant m to be a power of two.
When d is a power of two, efficient division and remainder routines are available. The quotient requires a single binary shift while the remainder requires selecting the low-weight bits with a mask. Thus suppose that the divisor d is not a power of two.
To satisfy the constraints of Theorem 1, Theorem 2, we can pick and . Unfortunately, is not in which may cause implementation issues. Indeed, if we want to do 64-bit arithmetic on hardware with 64-bit machine words, it is most convenient if all constants fit in 64-bit words. Thus we may try a smaller constant. The choice is convenient since is then an integer . Unfortunately, it is not a valid choice for all divisors d, as per the requirements of Theorem 1, Theorem 2. For example, if and , picking , we get . We have that . We can verify that . Thus the conditions of Theorem 1 are not satisfied.
Thankfully, we can fall back on the multiply-add-shift results. Suppose that setting and fails to satisfy the conditions of Theorem 1, then we have that
(16) |
It may be convenient to simplify this equation further. We can multiply both sides by d. We have that on the left-hand-side. On the right-hand-side, we have m plus some quantity that may not be integer, but we can safely apply the ceiling function since the left-hand-side is an integer. After subtracting m from both sides, we get
(17) |
(E.g., with and , we get .) We want to show that the conditions of Theorem 3 are satisfied when keeping and setting . That is, if Theorem 1 fails us, we can use Theorem 3 so that it is always possible to pick .
We have that and hence and therefore . We assume that d is not a power of two. Since we assume and hence m are powers of two, we have that and thus from (16)
(18) |
We can divide by m and subtract to get
(19) |
(20) |
(21) |
(22) |
(23) |
(24) |
We have shown Proposition 3, which tells us that it is always possible to compute the quotient using ,2 selecting the approach using Equation (17).
Proposition 3
Consider an integer divisor that is not a power of two. Let N be an integer such that is a power of two, then we can compute the quotient of any integer by d using a constant as follows. Let .
- •
if , we let and we have .
- •
Otherwise we let and we have .
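The selection between the two approaches can be sketched in Python as follows. The sketch rests on assumptions that we state explicitly: numerators are W-bit values ($N = 2^W - 1$), the modulus is $m = 2^{W + \lfloor \log_2 d \rfloor}$, the multiply-shift candidate is $c = \lceil m/d \rceil$, and the fallback is $c = \lfloor m/d \rfloor$ with the multiply-add form. Rather than the closed-form criterion of Equation (17), the code tests the Theorem 1 bound directly, in the form $c \, d \, v < m (v + 1)$ with $v = N - \mathrm{remainder}(N+1, d)$. Function names are ours.

```python
def choose_constants(d, W):
    """Return (c, m, add) for W-bit numerators; d must not be a power of two."""
    N = (1 << W) - 1
    L = d.bit_length() - 1   # floor(log2(d))
    m = 1 << (W + L)
    c = -(-m // d)           # ceil(m / d), multiply-shift candidate
    v = N - (N + 1) % d
    if c * d * v < m * (v + 1):
        return c, m, False   # the multiply-shift bound is met
    return m // d, m, True   # otherwise fall back on multiply-add-shift


def divide(n, c, m, add):
    """Quotient via division(c*n, m) or division(c*n + c, m)."""
    return (c * n + (c if add else 0)) // m


# 8-bit example: the divisor 7 needs the multiply-add form.
constants_for_7 = choose_constants(7, 8)  # (146, 1024, True)
```

In both branches the constant c fits in W bits, which is the point of the complementarity.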
The same complementarity exists between the novel theorems for the computation of the quotient and remainder: Theorem 2, Theorem 4. Indeed, if we choose and , but the conditions of Theorem 2 are not met, then we have that
(25) |
We have that since m is not divisible by d and thus
(26) |
(27) |
(28) |
(29) |
(30) |
Thus, again, it is always possible to pick : if not with Theorem 2, then with Theorem 4, using the analog of Equation (17) to select the correct approach. We formalize the result with the novel Proposition 4.
Proposition 4
Consider an integer divisor that is not a power of two. Let N be an integer such that is a power of two, then we can compute the quotient and the remainder of any integer by d using a constant as follows. Let .
- •
if , we let and we have and .
- •
Otherwise we let and we have and .
Ideal divisors When is a power of two, we have shown that it is always possible to pick . However, for some divisors, we can pick an even smaller m, even with the constraint that m be a power of two. Suppose that you would like to compute both the remainder and the quotient, as in Theorem 2. Picking would be especially convenient. (In architectures where the product is stored in a pair of registers, and are essentially free if , where W is the architecture's word size.) We choose and . To satisfy the conditions of Theorem 2, we need that . We have that . Assuming that d does not divide , the inequality holds if and only if which is true if and only if d divides . We refer to any such divisors as being ideal, as they enable us to pick . See Table 2.3
Table 2.
Some ideal divisors.
Range [0,N + 1) | ideal divisor |
---|---|
[0, 2^32) | 641 |
[0, 2^32) | 6700417 |
[0, 2^64) | 274177 |
[0, 2^64) | 67280421310721 |
We can show that we cannot pick m to be a smaller power of two—unless d divides . Indeed, if we pick , then the conditions of Theorem 2 require that but for and since is an integer, we then must have that . Yet we have when d does not divide .
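The divisors of Table 2 can be checked numerically. The sketch below assumes that an ideal divisor is one that divides $N + 2$, that we pick $m = N + 1$ and $c = \lceil m/d \rceil$, and that the Theorem 2 bound reads $1/d \le c/m < (1 + 1/N)\,1/d$; these are our reading of the paragraph on ideal divisors. A decimal toy case ($m = 1000$, $d = 7$, since 7 divides 1001) is included because it can be verified exhaustively. Function names are ours.

```python
def is_ideal(d, m):
    """Assumed criterion: d divides m + 1, where m = N + 1."""
    return (m + 1) % d == 0


def quot_rem(n, c, m, d):
    """Quotient and remainder of n by d from the single product c * n."""
    t = c * n
    return t // m, ((t % m) * d) // m


# The 32-bit entry 641 of Table 2, with m = 2**32.
m32 = 1 << 32
c641 = -(-m32 // 641)   # ceil(m / d), which equals (2**32 + 1) // 641

# Decimal toy case: numerators in [0, 1000), divisor 7, c = 143.
toy_ok = all(quot_rem(n, 143, 1000, 7) == (n // 7, n % 7) for n in range(1000))
```

With a non-ideal divisor the same choice of m is not enough: for $d = 3$ and $m = 1000$, the constant $c = 334$ gives a wrong remainder for $n = 999$.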
7. Rounding
Instead of computing the integer division ($\mathrm{division}(n, d)$), we sometimes wish to round the result to the nearest integer. Unsurprisingly and maybe obviously, it is possible to do so with an expression of the form $\mathrm{division}(z, d)$ for some integer z that depends on n and d. Hence, our efficient integer quotient computations extend to the computation of the rounded division. Indeed, we have that $\mathrm{division}(n + \lfloor d/2 \rfloor, d)$ is the round-to-nearest function; when d is even and n is exactly halfway between two multiples of d, it rounds up. To round down in that case, we can use $\mathrm{division}(n + \lfloor (d-1)/2 \rfloor, d)$ instead.4
It is also possible to handle more complicated scenarios. For example, what if we wish to round to the nearest integer, rounding to the nearest even integer when we are halfway between two integers? It is only relevant when the divisor d is even. Let $z = n + d/2$. Whenever z is a multiple of d and $\mathrm{division}(z, d)$ is odd, we return $\mathrm{division}(z, d) - 1$, else we return $\mathrm{division}(z, d)$. We can check that an integer is a multiple of d efficiently with Proposition 1 for example. The division and the divisibility test can reuse the same intermediate computations.
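The three rounding modes can be sketched in Python with ordinary integer operators. The sketch assumes the round-to-nearest expression $\mathrm{division}(n + \lfloor d/2 \rfloor, d)$ and, for ties rounded down, $\mathrm{division}(n + \lfloor (d-1)/2 \rfloor, d)$; these formulas are our reading of the text. The plain `//` and `%` stand in for the multiply-shift quotient and the divisibility test, and the function names are ours.

```python
def round_half_up(n, d):
    """Nearest integer to n/d, ties rounded up."""
    return (n + d // 2) // d


def round_half_down(n, d):
    """Nearest integer to n/d, ties rounded down."""
    return (n + (d - 1) // 2) // d


def round_half_even(n, d):
    """Nearest integer to n/d, ties rounded to the nearest even integer."""
    z = n + d // 2
    q = z // d
    # A tie can only occur when d is even, and then exactly when d divides z.
    if d % 2 == 0 and z % d == 0 and q % 2 == 1:
        return q - 1
    return q
```

For instance, with $d = 4$ the numerator 6 is halfway between 4 and 8: the three functions return 2, 1 and 2 respectively.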
8. Conclusion
Our work shows that a unified approach, with the same precomputed constants, allows the computation of the quotient and remainder, while further providing fast divisibility checks. Thus, for example, an algorithm could check efficiently whether an integer is divisible by another and, in the negative case, compute the remainder while reusing the prior work.
Future work might address the problem of computing generalized expressions such as for integer values n and real numbers x. For example, we have that can be computed as multiplication and a division by a power of two: for . We find such optimizations hand-coded in highly optimized algorithms [19]: it might prove useful to formalize the derivation of such routines.
Declarations
Author contribution statement
D. Lemire: Conceived and designed the analysis; Wrote the paper.
C. Bartlett, O. Kaser: Conceived and designed the analysis.
Funding statement
The first author was supported by the Natural Sciences and Engineering Research Council of Canada under grant number RGPIN-2017-0391.
Data availability statement
No data was used for the research described in the article.
Declaration of interests statement
The authors declare no conflict of interest.
Additional information
No additional information is available for this paper.
Drane et al. [10] have a related bound, but they also have additional constraints on the divisor d.
References
1. Artzy E., Hinds J.A., Saal H.J. A fast division technique for constant divisors. Commun. ACM. 1976;19:98–101.
2. Jacobsohn D.H. A combinatoric division algorithm for fixed-integer divisors. IEEE Trans. Comput. 1973;100:608–610.
3. Li S.-Y.R. Fast constant division routines. IEEE Trans. Comput. 1985;34:866–869.
4. Vowels R.A. Division by 10. Aust. Comput. J. 1992;24:81–85.
5. Granlund T., Montgomery P.L. Division by invariant integers using multiplication. SIGPLAN Not. 1994;29:61–72.
6. Warren H.S., Jr. Hacker's Delight. 1st edition. Addison-Wesley; Boston: 2002.
7. Robison A.D. N-bit unsigned division via n-bit multiply-add. Proceedings of the 17th IEEE Symposium on Computer Arithmetic; ARITH '05; Washington, DC, USA: IEEE Computer Society; 2005. pp. 131–139.
8. Anonymous author. Labor of division (episode iii): faster unsigned division by constants. Nov. 2020. http://ridiculousfish.com/blog/posts/labor-of-division-episode-iii.html 2011.
9. Lemire D., Kaser O., Kurz N. Faster remainder by direct computation: applications to compilers and software libraries. Softw. Pract. Exp. 2019;49:953–970.
10. Drane T., Cheung W.-c., Constantinides G. Correctly rounded constant integer division via multiply-add. 2012 IEEE International Symposium on Circuits and Systems. IEEE; 2012. pp. 1243–1246.
11. Warren H.S., Jr. Hacker's Delight. 2nd edition. Addison-Wesley; Boston: 2013.
12. Magenheimer D.J., Peters L., Pettis K., Zuras D. Integer multiplication and division on the HP precision architecture. SIGARCH Comput. Archit. News. 1987;15:90–99.
13. Cavagnino D., Werbrouck A.E. Efficient algorithms for integer division by constants using multiplication. Comput. J. 2008;51:470–480.
14. Raghuram P.S., Petry F.E. Constant-division algorithms. IEE Proc., Comput. Digit. Tech. 1994;141:334–340.
15. Doran R.W. Special cases of division. J. Univers. Comput. Sci. 1995;1:176–194.
16. Ugurdag F., Dinechin F.D., Gener Y.S., Gören S., Didier L.-S. Hardware division by small integer constants. IEEE Trans. Comput. 2017;66:2097–2110.
17. de Dinechin F., Didier L.-S. Table-Based Division by Small Integer Constants. Springer; Berlin, Heidelberg: 2012. pp. 53–63.
18. Jaroma J.H., Reddy K.N. Classical and alternative approaches to the Mersenne and Fermat numbers. Am. Math. Mon. 2007;114:677–687.
19. Lemire D. Number parsing at a gigabyte per second. Softw. Pract. Exp. 2021 in press (published online 11 May 2021).