hig.sePublications
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard-cite-them-right
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • sv-SE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • de-DE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Vectorized table driven algorithms for double precision elementary functions using Taylor expansions
University of Gävle, Department of Mathematics, Natural and Computer Sciences, Ämnesavdelningen för datavetenskap.
2009 (English)In: Aplimat - Journal of Applied Mathematics, ISSN 1337-6365, Vol. 2, no 3, 171-188 p.Article in journal (Refereed) Published
Abstract [en]

This paper presents fast implementations of the inverse square root and arcsine, both in double precision. In single precision it is often possible to use a small table and one ordinary Newton-Raphson iteration to compute elementary functions such as the square root. In double precision a substantially larger table is necessary to obtain the desired precision, or, if a smaller table is used, the additional Newton-Raphson iterations required to obtain the precision often requires the evaluation of other expensive elementary functions. Furthermore, large tables use a lot of the cash memory that should have been used for the application code.

Obtaining the desired precision using a small table can instead be realised by using a higher order method than the second order Newton-Raphson method. A generalization of Newton's method to higher order is Householder's method, which unfortunately often results in very complicated expressions requiring many multiplications, additions, and even divisions.

We show how a high-order method can be used, which only requires a few extra additions and multiplications for each degree of higher order. The method starts from the Taylor expansion of the difference of the value of the elementary function and a starting guess value for each iteration. If the Taylor series is truncated after the second term, ordinary Newton iterations are obtained. In several cases it is possible to algebraically simplify the difference between the true value and the starting guess value. In those cases we show that it is advantageous to use the Taylor series to higher order to obtain the fast convergent method. Moreover, we will show how the coefficients of a Chebyshev polynomial can be fitted to give as little error as possible for the functions close to zero and in the same time reduce the terms in the Taylor expansion.

In the paper we benchmark two example implementations of the method on the x86_64 architecture. The first is the inverse square root, where the actual table (to 12 bit precision) is provided by the processor hardware. The inverse square root is important in many application programs, including computer graphics, and explicit particle simulation codes, for instance the Monte Carlo and Molecular Dynamics methods of statistical mechanics. The other example is the arcsine function, which has a slow converging Taylor expansion and where no tables are provided by the hardware. The vectorized versions of the implementations of the inverse square root are 3.5 times faster than compiled code on the Athlon64 and about 5 times faster on the Core 2. The scalar version of the arcsine function is, depending on order and table size, between 2 and 3 times faster than the compiled code, and the vectorized version is between 3 and 4 times faster on the Athlon64, while it is between 4 and 5 times faster than the compiled version on the Core 2.

Place, publisher, year, edition, pages
2009. Vol. 2, no 3, 171-188 p.
Keyword [en]
Vectorized algorithms, Taylor expansion, Double precision
Identifiers
URN: urn:nbn:se:hig:diva-5075OAI: oai:DiVA.org:hig-5075DiVA: diva2:232301
Available from: 2009-08-21 Created: 2009-08-21 Last updated: 2011-03-07Bibliographically approved

Open Access in DiVA

No full text

Other links

Aplimat

Search in DiVA

By author/editor
Hast, Anders
By organisation
Ämnesavdelningen för datavetenskap

Search outside of DiVA

GoogleGoogle Scholar

Total: 58 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard-cite-them-right
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • sv-SE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • de-DE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf