Newton iteration¶
The setup is that we have a polynomial $F(x,y)$ (monic in $y$ for simplicity) and want to find a solution $\phi(x)$ such that $F(x, \phi(x)) = 0$. If we can find a good 'starting point', then Newton iteration allows us to get a power series solution.
Requirements from the stating point $\alpha$
\begin{align*} F(0, \alpha) &= 0\\ \partial_y F (0, \alpha) & \neq 0 \end{align*}
If we have such a starting point $\alpha$, then we get the following iterative process: \begin{align*} \phi_0 & = \alpha\\ \phi_{i+1} &= \phi_i + \frac{F(x, \phi_i(x))}{\partial_y F(0, \alpha)} \end{align*}
The above then satisfies $F(x, \phi_i(x)) = 0 \bmod{x^{i+1}}$ for all $i \geq 0$.
R.<x,y> = PolynomialRing(QQ)
def NewtonIteration(f, alpha, steps: int = 10, debug = False):
if f(0, alpha) > 1e-5 : raise ValueError(f"f = {f} is not 0 at alpha={alpha}")
dyf = derivative(f,y)
dyf_alpha = dyf(0, alpha)
if dyf_alpha < 1e-5: raise ValueError(f"f is degenerate at alpha={alpha}")
curr_phi = alpha
for i in range(steps):
if debug:
print(f"phi:\t{curr_phi}")
curr_phi = (curr_phi - (f(x, curr_phi)/dyf_alpha)) % x^(i+1)
return curr_phi
Example 1¶
f = (y - (x^2 + x + 3)) * (y - (x^3 + 2*x^2 + 2))
NewtonIteration(f, 3, debug=True)
phi: 3 phi: 3 phi: x + 3 phi: x^2 + x + 3 phi: x^2 + x + 3 phi: x^2 + x + 3 phi: x^2 + x + 3 phi: x^2 + x + 3 phi: x^2 + x + 3 phi: x^2 + x + 3
x^2 + x + 3
We recover the root.
Example 2¶
f = (y^2 - (x + 1)) * (y^2 + 7*x + 3)
phi = NewtonIteration(f, 1, debug = True)
phi: 1 phi: 1 phi: 1/2*x + 1 phi: -1/8*x^2 + 1/2*x + 1 phi: 1/16*x^3 - 1/8*x^2 + 1/2*x + 1 phi: -5/128*x^4 + 1/16*x^3 - 1/8*x^2 + 1/2*x + 1 phi: 7/256*x^5 - 5/128*x^4 + 1/16*x^3 - 1/8*x^2 + 1/2*x + 1 phi: -21/1024*x^6 + 7/256*x^5 - 5/128*x^4 + 1/16*x^3 - 1/8*x^2 + 1/2*x + 1 phi: 33/2048*x^7 - 21/1024*x^6 + 7/256*x^5 - 5/128*x^4 + 1/16*x^3 - 1/8*x^2 + 1/2*x + 1 phi: -429/32768*x^8 + 33/2048*x^7 - 21/1024*x^6 + 7/256*x^5 - 5/128*x^4 + 1/16*x^3 - 1/8*x^2 + 1/2*x + 1
What's going on here is that we are recovering the power series of $\sqrt{1 + x}$.
(phi^2 - (x+1)) % x^(10)
0
Thus, if we calculate the minimal polynomial of $\phi$, then we will recover the factor of $f$ from which $\phi$ came from.
Newton iteration with quadratic convergence¶
With the same condition on the starting point, we could get faster convergence with the following iterative process.
\begin{align*} \phi_0 & = \alpha\\ \phi_{i+1} &= \phi_i + \frac{F(x, \phi_i)}{\partial_y F(x, \phi_i)} \end{align*}
The above is however a rational expression since the denominator is a polynomial, but it can still be interpreted as a power series since the denominator has a nonzero constant term. However, the advantage is that we get much faster convergence: $$ F(x, \phi_i(x)) = 0 \bmod{x^{2^i}} $$
Quadratic convergence without divisions¶
There is yet another trick we can employ to come up with a similar expression that does not use any divisions. This comes from the simple observation that $1/\partial_yF(x, \phi_i)$ is the solution to the equation $H(z) = z \cdot \partial_y F(x,\phi_i)$. Thus, we can do Newton iteration here as well!
If $z_i$ is such that $H(z_i) = 0 \bmod x^i$, then we can define $z_{i+1}$ as follows: \begin{align*} z_{i+1} & = z_i - \frac{H(z_i)}{\partial_z H(z_i)}\\ & = z_i - \frac{z_i \cdot \partial_y F(x, \phi_i) - 1}{\partial_y F(x, \phi_i)}\\ & = z_i - (z_i \cdot \partial_y F(x, \phi_i) - 1) \cdot z_i \bmod{x^{i+1}}\\ & = 2 z_i - z_i^2 \cdot \partial_y F(x, \phi_i). \end{align*}
This can be used to get the faster convergence without requiring any divisions by polynomials.
def NewtonIteration_quadratic_conv(f, alpha, steps: int = 10, debug = False):
if f(0, alpha) > 1e-5 : raise ValueError(f"f = {f} is not 0 at alpha={alpha}")
dyf = derivative(f,y)
dyf_alpha = dyf(0, alpha)
if dyf_alpha < 1e-5: raise ValueError(f"f is degenerate at alpha={alpha}")
curr_phi = alpha
curr_z = 1/dyf_alpha
for i in range(steps):
if debug:
print(f"phi:\t{curr_phi}")
print(f"z:\t{curr_z}")
print("")
curr_phi = (curr_phi - f(x,curr_phi) * curr_z) % (x^(2^(i+1)))
curr_z = (2*curr_z - curr_z * curr_z * dyf(x, curr_phi)) % (x^(2^(i+1)))
return curr_phi
Example 4¶
f = (y - (x^10 + x^6 + 3*x + 3)) * (y - (x^3 + 2*x^2 + 2))
NewtonIteration_quadratic_conv(f, 3, steps=5, debug=True)
phi: 3 z: 1 phi: 3*x + 3 z: -3*x + 1 phi: 3*x + 3 z: -38*x^3 + 11*x^2 - 3*x + 1 phi: x^6 + 3*x + 3 z: -5649*x^7 + 1619*x^6 - 464*x^5 + 133*x^4 - 38*x^3 + 11*x^2 - 3*x + 1 phi: x^10 + x^6 + 3*x + 3 z: -124079786*x^15 + 35561905*x^14 - 10192225*x^13 + 2921144*x^12 - 837215*x^11 + 239950*x^10 - 68771*x^9 + 19710*x^8 - 5649*x^7 + 1619*x^6 - 464*x^5 + 133*x^4 - 38*x^3 + 11*x^2 - 3*x + 1
x^10 + x^6 + 3*x + 3
Dealing with degeneracy¶
Suppose we have a polynomial of the form $f(x,y) = g(x,y)^2 \cdot h(x,y)$, then no matter what $\alpha$ we choose that makes $g(0,\alpha) = 0$, that will always be high-multiplicity root and hence degenerate. Therefore, we cannot start our Newton Iteration from $\alpha$ and hope to recover $g$.
However, if we are willing to perturb our polynomial a little bit, then we can recover $g$ from it!
Suppose $\alpha$ was such that $g(0,\alpha) = 0$, and $h(0,\alpha) \neq 0$ and $\partial_y g (0,\alpha) \neq 0$. Then, we can work with the following polynomial: $$F(x,y) = f(x, y + \epsilon) - f(0, \alpha + \epsilon).$$
One can check that $F(0,\alpha) = 0$ and $\partial_y F(0,\alpha) \neq 0$ (although it will be divisible by some power of $\epsilon$). Thus, one can hope to start Newton iteration for $F$ instead of $f$ and hope to get $g$ somehow.
f = (y - (x^3 + 3*x^2 + 32 * x + 1))^2 * (y + x^2 - x + 9)
alpha = 1
f(0,alpha) == 0
True
eps = 0.001
F = f(x, y + eps) - f(0, alpha + eps)
F(0,alpha)
# Well, it ought to be zero but computers are bad with floating point arithmetic
1.77635683940025e-15
dy_F = derivative(F, y)
dy_F(0, alpha)
0.0200029999999991
phi = NewtonIteration(F, alpha, debug=True)
phi: 1 phi: 1.00000000000000 phi: 32.0000000000000*x + 1.00000000000000 phi: 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000 phi: x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000 phi: -4.26300000000000e-11*x^4 + x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000 phi: 1.26100000000000e-10*x^5 - 7.02800000000000e-11*x^4 + x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000 phi: -3.60500000000000e-10*x^6 + 1.78500000000000e-10*x^5 + 6.00800000000000e-11*x^4 + x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000 phi: 1.04100000000000e-9*x^7 - 5.75400000000000e-10*x^6 - 2.11100000000000e-10*x^5 - 3.89400000000000e-11*x^4 + x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000 phi: -2.97500000000000e-9*x^8 + 1.87200000000000e-9*x^7 + 6.71900000000000e-10*x^6 + 1.15500000000000e-10*x^5 - 6.47500000000000e-11*x^4 + x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000
Thus if we round off the coefficients, we can see the root we wanted sitting inside $\phi$.
g_prime = 0
for c in phi.coefficients():
g_prime = g_prime * x + round(c)
print(f"g_prime = {g_prime}")
print(f"f= {factor(f)}")
g_prime = x^3 + 3*x^2 + 32*x + 1 f= (x^2 - x + y + 9) * (-x^3 - 3*x^2 - 32*x + y - 1)^2
Thus from $\phi$, we can extract the coefficients of the factor of $f$ even though the factor has multiplicities.