Arduino/AVR and ZX Spectrum: sin() routines.

Introduction

The previous posts on sin() approximations has made me wonder which methods are implemented on 8-bit processors. So I took a look at both the AVR C library for the 8-bit megaAVR (used in the Arduino Uno) and the ZX Spectrum (an early Z80-based 8-bit computer) ROM disassembly.

Sin(x): The AVR C lib version

After downloading the most recent sources from the Subversion repository, I started going through the directory tree to get a sense of what can be found where. The files of interest are located in the avr-libc/libm/fplib/ directory; all of them are written in AVR assembly language.

The first routine to look at is sin which can be found in sin.S (gee, what gave it away?). This routine calls __fp_rempio2, which changes the argument so it is within the right range for calling the actual sin() calculating function, __fp_sinus.

The __fp_sinus function does some more domain manipulation after which it starts setting up pointers to a table holding six coefficients:

Coefficient Value
c_0 1.0000000000
c_1 -0.1666666664
c_2 0.0083333315
c_3 -0.0001984090
c_4 0.0000027526
c_5 -0.0000000239

After the setup code, __fp_powsodd is called, which evaluates the following polynomial with odd powers only:

P(x) = c_0\cdot x + c_1\cdot x^3 + c_2\cdot x^5 + c_3\cdot x^7 + c_4\cdot x^9 + c_5\cdot x^{11}.

The careful observer will have noticed that the coefficients are those of the Taylor series for \sin(x).

Performance

The maximum error of the Taylor approximation is determined by the first truncated term, which is then \frac{(2\pi \cdot 0.25)^{13}}{13!} \approx 5.6922 \cdot 10^{-8}.

Summarizing, the AVR lib C uses an 11th-order polynomial based on the Taylor series and domain symmetry to calculate sin(x).

Sin(x): Z80 / ZX Spectrum version

According to the Complete Spectrum ROM Disassembly the main part of the sin() function is the series generator routine. This routine evaluates Chebyshev polynomials up to a specified order. The calculation of sin() is based on the first six polynomials:

T_0(x) = 1
T_1(x) = x
T_2(x) = 2x^2 - 1
T_3(x) = 4x^3 -3x
T_4(x) = 8x^4 -8x^2 +1
T_5(x) = 16x^5 -20x^3 +5x

These polynomials are weighted using the following coefficients:

Coefficient Value
c_0 1.276278962
c_1 -.142630785
c_2 0.004559008
c_3 -.000068294
c_4 0.000000592
c_5 -.000000003

The approximation polynomial is equal to:

P(z) = c_0 \cdot T_0(z) + 2c_1 \cdot T_1(z) + 2c_2 \cdot T_2(z) + 2c_3 \cdot T_3(z) + 2c_4 \cdot T_4(z) + \ldots
\qquad + 2c_5 \cdot T_5(z)

,where z=2w^2-1 and w=4x is between -1 and 1.

Finally, the approximation is calculated by \sin(2\pi\cdot x) \approx P(z)\cdot w.

The interesting thing about the ZX Spectrum code is that the Chebyshev polynomials T_0 \ldots T_5 are not evaluated one at a time, but that the entire calculation of P(z) is done in one go. This code is almost as efficient as evaluating one polynomial!

Performance

The performance of this approximation is very good, as shown by the following graphs:

The maximum error is approximately -2\cdot 10^{-9}, which is better than the AVR version! Not bad for an oldskool machine!

Conclusion

Sometimes older is better!

Additional information

BOCHS, the IA-32 CPU emulator, uses a 17th-order Taylor polynomial to calculate sin().

Sin(x) using Taylor series: better than expected

Introduction

In the previous posts [1,2] about \sin(2\pi \cdot x) approximations, I quickly discarded the Taylor series believing their approximation error characteristics were not ideal. However, this was based on using the Taylor derived polynomial for the entire domain x \in [0 \ldots 1). In the ‘better sin(x)‘ post, the domain was divided in four parts and the symmetry of sin(x) was exploited to reduce the approximation error. This trick is, of course, also possible with Taylor-based polynomial.

I tried it with the following results..

Taylor series of \sin(2\pi \cdot x)

The Taylor series of \sin (2\pi \cdot x) developed around x=0 is:

P(x) = 2\pi\cdot x - \frac{8\pi^3}{3!} \cdot x^3 + \frac{32\pi^5}{5!} \cdot x^5 - \ldots,

which, given N terms, can be compactly written as:

P(x) = \sum_{i=1}^{N} (-1)^{(i+1)} \cdot {(2\pi)^{(2i - 1)} \over {(2i-1)!}} x^{(2i-1)}.

A very useful property of this particular series is that the sign of each term is the opposite of the one that comes before it. In addition, the next term is always smaller than the previous one. This means that if we truncate the series at a certain term, the total approximation error is always smaller or equal to the first term we left out. So, we can choose the number of terms to include based on the desired approximation error!

The approximation error of our Taylor series is largest at the largest value of x we’re going to use. By diving the domain by four, the largest value of x we’ll encounter is \frac{1}{4}. Knowing this, we can make a table of the approximation error versus the number of terms included in our polynomial:

Terms, N P(x) order max. error
1 1 0.64596
2 3 0.07969
3 5 0.00468
4 7 0.00016
5 9 3.60e-6

Clearly, the maximum approximation error drops rapidly as the order of the polynomial is increased.

Performance evaluation

The third-order Taylor polynomial versus the ‘better sin(x)’ version.

Which approximation is better, the one from the previous post or a 3rd order Taylor approximation?

The following graph shows the spectrum of the approximated sinusoidal wave. It was generated using an 65536-point FFT and 2129 sine wave periods.

The third harmonic can be found at -35.0 dBc, which is 11.9 dB worse. The signal-to-noise ratio (SNR) of the Taylor-based version is 33.2 dB, which is 11.7 dB worse. Summarizing, you’re better off with the ‘better sin(x)’ approximation version.

Ninth-order Taylor performance

If you want a high-purity sine oscillator and have cycles to spare, but don’t want the memory footprint of a table-lookup oscillator, you can use a 9th-order Taylor approximation.

Here are the performance graphs for the approximation:

The following graph shows the spectrum of the approximated sinusoidal wave. It was generated using an 65536-point FFT and 2129 sine wave periods.

The SNR is around 121.2 dB when all calculations are implemented using floating-point arithmetic.

Conclusions

The Taylor series are well suited for approximating \sin(2\pi \cdot x), especially when the domain is kept small. The precision of the approximation is easily determined and can be increased simply by adding additional terms. A 9th order polynomial is capable of reaching professional audio quality, in terms of SNR.

A better sin(x) approximation

Introduction

Last time I showed a simple algorithm for approximating \sin(2 \cdot \pi \cdot x). The third harmonic was -28.6 dBc, which is not good enough for high-quality sound synthesis.

In this post, I will derive a more accurate approximation polynomial.

Method

The first change we will make in the algorithm is to split the domain [0..1) into four pieces, instead of two, and exploit additional symmetry properties of the sine function. This well-known trick allows us to decrease the input range over which our polynomial P(x) must be accurate.

Now, we pick the order of the polynomial to use. Our previous algorithm used a second-order polynomial so we’ll try a third-order one here. The third-order polynomial has four coefficients, so we’ll be able to enforce four constraints:

P(x) = \alpha_3 \cdot x^3 + \alpha_2 \cdot x^2 + \alpha_1 \cdot x + \alpha_0.

Setting constraints

To ensure the curve starts and ends at the correct amplitude, we should constrain the end points, i.e. P(0) and P(\frac{1}{4}). These should be equal to \sin \left( 2 \cdot \pi \cdot 0 \right ) and \sin \left( 2 \cdot \pi \cdot \frac{1}{4} \right ), respectively. This leads to the following constraints:

P(0) = 0 and P(\frac{1}{4}) = 1.

Now, we still have two addition constraints we can set. We could either choose two additional points our polynomial must intersect, or we could set two constraints on the derivative of the polynomial, P'(x):

P'(x) = 3 \cdot \alpha_3 \cdot x^2 + 2 \cdot \alpha_2 \cdot x + \alpha_1.

I chose to constrain the derivatives at the end points because it makes the piece-wise sin approximation continuous in the first derivative, which is desirable from a spectral point-of-view.

The derivative of \sin(2\cdot \pi \cdot x) is 2 \cdot \pi \cdot \cos(2 \cdot \pi \cdot x) and we equate the derivative of the polynomial at x=0 and x=\frac{1}{4} to it:

P'(0) = 2 \cdot \pi and P'(\frac{1}{4}) = 0.

Solving for the coefficients

Given our constraints, we can now solve the system of equations. The system we must solve is:

\alpha_3 \cdot 0^3 + \alpha_2 \cdot 0^2 + \alpha_1 \cdot 0 + \alpha_0 = 0.
\alpha_3 \cdot \frac{1}{4}^3 + \alpha_2 \cdot \frac{1}{4}^2 + \alpha_1 \cdot \frac{1}{4} + \alpha_0 = 1.
3 \cdot \alpha_3 \cdot 0^2 + 2 \cdot \alpha_2 \cdot 0 + \alpha_1 = 2 \cdot \pi.
3 \cdot \alpha_3 \cdot \frac{1}{4}^2 + 2 \cdot \alpha_2 \cdot \frac{1}{4} + \alpha_1 = 0,

The first and third equations give values for \alpha_0 = 0 and \alpha_1 = 2 \cdot \pi. Using these values and simplifying leads to:

\alpha_3 \cdot \frac{1}{64} + \alpha_2 \cdot \frac{1}{16} + \frac{1}{2}\pi = 1.
\alpha_3 \cdot \frac{3}{16} + \alpha_2 \cdot \frac{1}{2} + 2\pi = 0,

which, after some manipulation, gives:

\alpha_3 = 32\pi - 128 \approx -27.469 and \alpha_2 = -16\pi + 48 \approx -2.2655.

How good is it?

The polynomial P(x) follows the \sin(2\cdot \pi \cdot x) curve (for x=[0..\frac{1}{4}]) very accurately:

As can be seen from the error graph below, the error is less than +/- 1.1% (with respect to full-scale):

The following graph shows the spectrum of the approximated sinusoidal wave. It was generated using an 65536-point FFT and 2129 sine wave periods:

The spectrum graph shows that the third harmonic, which is found at -46.9 dBc, is much lower than the simple sine approximation from the previous post.

Show us the code!

/*
  A better sin(2*pi*x) approximation.
  The caller must make sure that the argument lies
  within [0..1).

  Note: this can be optimized further by using Horner's
  rule for evaluating polynomials, see:
  
  http://en.wikipedia.org/wiki/Horner%27s_method
  
  Omitted here for clarity.
*/

float sin_poly(float x)
{
  const float a3 = -27.469f;
  const float a2 = -2.2655f;
  const float a1 = 6.2832f;

  return a3*x*x*x + a2*x*x + a1*x;
}

float better_sin(float x)
{
  if (x > 0.75f)
    return -sin_poly(1.0f - x);
  else if (x  > 0.5f)
    return -sin_poly(x - 0.5f);
  else if (x > 0.25f)
    return sin_poly(0.5f - x);
  else
    return sin_poly(x);
};

Conclusion

This approximation is quite good and it has a spectral purity approximately equivalent to a 7-bit signal.

A simple sin(x) approximation

Introduction

When implementing digital oscillators or filters, a simple way of calculating \sin(2\pi \cdot x) for x=[0..1) is needed. Efficiency, in terms of the number of CPU cycles or hardware complexity, is favoured above accuracy. Several methods are known that produce good approximations to \sin(\pi \cdot x). These methods can be divided into three categories: table-lookup algorithms, approximation functions and recursive resonator algorithms.

Table-lookup algorithms are cycle efficient and their accuracy can be easily increased by increasing the number of table entries. Accuracy can also be increased by introducing interpolation between table entries.

In some memory limited situations, such as small microcontroller and DSP targets, a table is undesirable and an approximation based on functions is preferred.

Here, I will show how to approximate \sin(2 \cdot \pi\cdot x) based on a simple polynomial.

Approximating sin(2*pi*x)

The well known Taylor series seems a good candidate for obtaining a polynomial that approximates \sin(2\pi \cdot x) well. However, the approximation gets worse the further away the function argument gets from the point from which the series was derived. This generates a discontinuity when \sin(2\pi \cdot x) “wraps around”. To avoid such discontinuities, the approximation must produce the same value for x=0 and x=1.

An interesting candidate is the polynomial -16x^2 + 8x, which approximates \sin(2\pi \cdot x) quite well for x=[0..\frac{1}{2}]. The other half of the sinusoid is simply generated by mirroring the polynomial and adjusting for the offset, resulting in 16x^2 - 24x + 8.

How good is it?

The approximation is accurate to within +/- 6% of the total amplitude as is shown by the graphs below:

In sound synthesis and radio applications it is desirable to have low spurious tones. The following graph shows the spectrum of the approximated sinusoidal wave. It was generated using an 65536-point FFT and 2129 sine wave periods.

The spectrum is free of all even harmonics as the approximation is point symmetrical and the “grass” is due to high frequency components aliasing to lower frequencies. The first major spurious response is the third harmonic, which is found at -28.6 dBc.

Show me the code!

/*
  Fast sin(2*pi*x) approximation.
  The caller must make sure that the argument lies
  within [0..1).

  Note: this can be optimized further by using Horner's
  rule for evaluating polynomials, see:
  
  http://en.wikipedia.org/wiki/Horner%27s_method
  
  Omitted here for clarity.
*/

float fast_sin(float x)
{
  if (x < 0.5f)
  {
    return -16.0f*x*x + 8.0f*x;
  }
  else
  {
    return 16.0f*x*x - 24.0f*x + 8.0f;
  }
};

Conclusion

This may not be the world’s best sin approximation, but it sure is simple. Next time we’ll look at a more accurate polynomial-based sin approximation.