Newton iteration¶

The setup is that we have a polynomial $F(x,y)$ (monic in $y$ for simplicity) and want to find a solution $\phi(x)$ such that $F(x, \phi(x)) = 0$. If we can find a good 'starting point', then Newton iteration allows us to get a power series solution.

Requirements from the stating point $\alpha$

\begin{align*} F(0, \alpha) &= 0\\ \partial_y F (0, \alpha) & \neq 0 \end{align*}

If we have such a starting point $\alpha$, then we get the following iterative process: \begin{align*} \phi_0 & = \alpha\\ \phi_{i+1} &= \phi_i + \frac{F(x, \phi_i(x))}{\partial_y F(0, \alpha)} \end{align*}

The above then satisfies $F(x, \phi_i(x)) = 0 \bmod{x^{i+1}}$ for all $i \geq 0$.

In [19]:
R.<x,y> = PolynomialRing(QQ) 
In [20]:
def NewtonIteration(f, alpha, steps: int = 10, debug = False):
    if f(0, alpha) > 1e-5 : raise ValueError(f"f = {f} is not 0 at alpha={alpha}")
    
    dyf = derivative(f,y)
    dyf_alpha = dyf(0, alpha)
    if dyf_alpha < 1e-5: raise ValueError(f"f is degenerate at alpha={alpha}")
    
    curr_phi = alpha
    for i in range(steps):
        if debug: 
            print(f"phi:\t{curr_phi}")
            
        curr_phi = (curr_phi - (f(x, curr_phi)/dyf_alpha)) % x^(i+1)
    return curr_phi

Example 1¶

In [21]:
f = (y - (x^2 + x + 3)) * (y - (x^3 + 2*x^2  + 2))
In [22]:
NewtonIteration(f, 3, debug=True)
phi:	3
phi:	3
phi:	x + 3
phi:	x^2 + x + 3
phi:	x^2 + x + 3
phi:	x^2 + x + 3
phi:	x^2 + x + 3
phi:	x^2 + x + 3
phi:	x^2 + x + 3
phi:	x^2 + x + 3
Out[22]:
x^2 + x + 3

We recover the root.

Example 2¶

In [23]:
f = (y^2 - (x + 1)) * (y^2 + 7*x + 3)
In [24]:
phi = NewtonIteration(f, 1, debug = True)
phi:	1
phi:	1
phi:	1/2*x + 1
phi:	-1/8*x^2 + 1/2*x + 1
phi:	1/16*x^3 - 1/8*x^2 + 1/2*x + 1
phi:	-5/128*x^4 + 1/16*x^3 - 1/8*x^2 + 1/2*x + 1
phi:	7/256*x^5 - 5/128*x^4 + 1/16*x^3 - 1/8*x^2 + 1/2*x + 1
phi:	-21/1024*x^6 + 7/256*x^5 - 5/128*x^4 + 1/16*x^3 - 1/8*x^2 + 1/2*x + 1
phi:	33/2048*x^7 - 21/1024*x^6 + 7/256*x^5 - 5/128*x^4 + 1/16*x^3 - 1/8*x^2 + 1/2*x + 1
phi:	-429/32768*x^8 + 33/2048*x^7 - 21/1024*x^6 + 7/256*x^5 - 5/128*x^4 + 1/16*x^3 - 1/8*x^2 + 1/2*x + 1

What's going on here is that we are recovering the power series of $\sqrt{1 + x}$.

In [25]:
(phi^2 - (x+1)) % x^(10)
Out[25]:
0

Thus, if we calculate the minimal polynomial of $\phi$, then we will recover the factor of $f$ from which $\phi$ came from.

Newton iteration with quadratic convergence¶

With the same condition on the starting point, we could get faster convergence with the following iterative process.

\begin{align*} \phi_0 & = \alpha\\ \phi_{i+1} &= \phi_i + \frac{F(x, \phi_i)}{\partial_y F(x, \phi_i)} \end{align*}

The above is however a rational expression since the denominator is a polynomial, but it can still be interpreted as a power series since the denominator has a nonzero constant term. However, the advantage is that we get much faster convergence: $$ F(x, \phi_i(x)) = 0 \bmod{x^{2^i}} $$

Quadratic convergence without divisions¶

There is yet another trick we can employ to come up with a similar expression that does not use any divisions. This comes from the simple observation that $1/\partial_yF(x, \phi_i)$ is the solution to the equation $H(z) = z \cdot \partial_y F(x,\phi_i)$. Thus, we can do Newton iteration here as well!

If $z_i$ is such that $H(z_i) = 0 \bmod x^i$, then we can define $z_{i+1}$ as follows: \begin{align*} z_{i+1} & = z_i - \frac{H(z_i)}{\partial_z H(z_i)}\\ & = z_i - \frac{z_i \cdot \partial_y F(x, \phi_i) - 1}{\partial_y F(x, \phi_i)}\\ & = z_i - (z_i \cdot \partial_y F(x, \phi_i) - 1) \cdot z_i \bmod{x^{i+1}}\\ & = 2 z_i - z_i^2 \cdot \partial_y F(x, \phi_i). \end{align*}

This can be used to get the faster convergence without requiring any divisions by polynomials.

In [26]:
def NewtonIteration_quadratic_conv(f, alpha, steps: int = 10, debug = False):
    if f(0, alpha) > 1e-5 : raise ValueError(f"f = {f} is not 0 at alpha={alpha}")
    
    dyf = derivative(f,y)
    dyf_alpha = dyf(0, alpha)
    if dyf_alpha < 1e-5: raise ValueError(f"f is degenerate at alpha={alpha}")

    curr_phi = alpha
    curr_z = 1/dyf_alpha

    for i in range(steps):
        if debug:
            print(f"phi:\t{curr_phi}")
            print(f"z:\t{curr_z}")
            print("")

        curr_phi = (curr_phi - f(x,curr_phi) * curr_z) % (x^(2^(i+1)))
        curr_z = (2*curr_z - curr_z * curr_z * dyf(x, curr_phi)) % (x^(2^(i+1)))

    return curr_phi

Example 4¶

In [27]:
f = (y - (x^10 + x^6 + 3*x + 3)) * (y - (x^3 + 2*x^2  + 2))
In [28]:
NewtonIteration_quadratic_conv(f, 3, steps=5, debug=True)
phi:	3
z:	1

phi:	3*x + 3
z:	-3*x + 1

phi:	3*x + 3
z:	-38*x^3 + 11*x^2 - 3*x + 1

phi:	x^6 + 3*x + 3
z:	-5649*x^7 + 1619*x^6 - 464*x^5 + 133*x^4 - 38*x^3 + 11*x^2 - 3*x + 1

phi:	x^10 + x^6 + 3*x + 3
z:	-124079786*x^15 + 35561905*x^14 - 10192225*x^13 + 2921144*x^12 - 837215*x^11 + 239950*x^10 - 68771*x^9 + 19710*x^8 - 5649*x^7 + 1619*x^6 - 464*x^5 + 133*x^4 - 38*x^3 + 11*x^2 - 3*x + 1

Out[28]:
x^10 + x^6 + 3*x + 3

Dealing with degeneracy¶

Suppose we have a polynomial of the form $f(x,y) = g(x,y)^2 \cdot h(x,y)$, then no matter what $\alpha$ we choose that makes $g(0,\alpha) = 0$, that will always be high-multiplicity root and hence degenerate. Therefore, we cannot start our Newton Iteration from $\alpha$ and hope to recover $g$.

However, if we are willing to perturb our polynomial a little bit, then we can recover $g$ from it!

Suppose $\alpha$ was such that $g(0,\alpha) = 0$, and $h(0,\alpha) \neq 0$ and $\partial_y g (0,\alpha) \neq 0$. Then, we can work with the following polynomial: $$F(x,y) = f(x, y + \epsilon) - f(0, \alpha + \epsilon).$$

One can check that $F(0,\alpha) = 0$ and $\partial_y F(0,\alpha) \neq 0$ (although it will be divisible by some power of $\epsilon$). Thus, one can hope to start Newton iteration for $F$ instead of $f$ and hope to get $g$ somehow.

In [29]:
f = (y - (x^3 + 3*x^2 + 32 * x + 1))^2 * (y + x^2 - x + 9)
In [30]:
alpha = 1
f(0,alpha) == 0
Out[30]:
True
In [31]:
eps = 0.001
F = f(x, y + eps) - f(0, alpha + eps)
F(0,alpha)
# Well, it ought to be zero but computers are bad with floating point arithmetic
Out[31]:
1.77635683940025e-15
In [32]:
dy_F = derivative(F, y)
dy_F(0, alpha)
Out[32]:
0.0200029999999991
In [33]:
phi = NewtonIteration(F, alpha, debug=True)
phi:	1
phi:	1.00000000000000
phi:	32.0000000000000*x + 1.00000000000000
phi:	3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000
phi:	x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000
phi:	-4.26300000000000e-11*x^4 + x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000
phi:	1.26100000000000e-10*x^5 - 7.02800000000000e-11*x^4 + x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000
phi:	-3.60500000000000e-10*x^6 + 1.78500000000000e-10*x^5 + 6.00800000000000e-11*x^4 + x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000
phi:	1.04100000000000e-9*x^7 - 5.75400000000000e-10*x^6 - 2.11100000000000e-10*x^5 - 3.89400000000000e-11*x^4 + x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000
phi:	-2.97500000000000e-9*x^8 + 1.87200000000000e-9*x^7 + 6.71900000000000e-10*x^6 + 1.15500000000000e-10*x^5 - 6.47500000000000e-11*x^4 + x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000

Thus if we round off the coefficients, we can see the root we wanted sitting inside $\phi$.

In [34]:
g_prime = 0
for c in phi.coefficients():
    g_prime = g_prime * x + round(c)
print(f"g_prime = {g_prime}")
print(f"f= {factor(f)}")
g_prime = x^3 + 3*x^2 + 32*x + 1
f= (x^2 - x + y + 9) * (-x^3 - 3*x^2 - 32*x + y - 1)^2

Thus from $\phi$, we can extract the coefficients of the factor of $f$ even though the factor has multiplicities.