{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Newton iteration" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The setup is that we have a polynomial $F(x,y)$ (monic in $y$ for simplicity) and want to find a solution $\\phi(x)$ such that $F(x, \\phi(x)) = 0$. If we can find a good 'starting point', then Newton iteration allows us to get a power series solution. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Requirements from the stating point $\\alpha$**\n", "\n", "\\begin{align*}\n", "F(0, \\alpha) &= 0\\\\\n", "\\partial_y F (0, \\alpha) & \\neq 0\n", "\\end{align*}\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we have such a starting point $\\alpha$, then we get the following iterative process:\n", "\\begin{align*}\n", "\\phi_0 & = \\alpha\\\\\n", "\\phi_{i+1} &= \\phi_i + \\frac{F(x, \\phi_i(x))}{\\partial_y F(0, \\alpha)}\n", "\\end{align*}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above then satisfies $F(x, \\phi_i(x)) = 0 \\bmod{x^{i+1}}$ for all $i \\geq 0$. " ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "R. = PolynomialRing(QQ) " ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "def NewtonIteration(f, alpha, steps: int = 10, debug = False):\n", " if f(0, alpha) > 1e-5 : raise ValueError(f\"f = {f} is not 0 at alpha={alpha}\")\n", " \n", " dyf = derivative(f,y)\n", " dyf_alpha = dyf(0, alpha)\n", " if dyf_alpha < 1e-5: raise ValueError(f\"f is degenerate at alpha={alpha}\")\n", " \n", " curr_phi = alpha\n", " for i in range(steps):\n", " if debug: \n", " print(f\"phi:\\t{curr_phi}\")\n", " \n", " curr_phi = (curr_phi - (f(x, curr_phi)/dyf_alpha)) % x^(i+1)\n", " return curr_phi" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example 1" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "f = (y - (x^2 + x + 3)) * (y - (x^3 + 2*x^2 + 2))" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "phi:\t3\n", "phi:\t3\n", "phi:\tx + 3\n", "phi:\tx^2 + x + 3\n", "phi:\tx^2 + x + 3\n", "phi:\tx^2 + x + 3\n", "phi:\tx^2 + x + 3\n", "phi:\tx^2 + x + 3\n", "phi:\tx^2 + x + 3\n", "phi:\tx^2 + x + 3\n" ] }, { "data": { "text/plain": [ "x^2 + x + 3" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "NewtonIteration(f, 3, debug=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We recover the root." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example 2" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "f = (y^2 - (x + 1)) * (y^2 + 7*x + 3)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "phi:\t1\n", "phi:\t1\n", "phi:\t1/2*x + 1\n", "phi:\t-1/8*x^2 + 1/2*x + 1\n", "phi:\t1/16*x^3 - 1/8*x^2 + 1/2*x + 1\n", "phi:\t-5/128*x^4 + 1/16*x^3 - 1/8*x^2 + 1/2*x + 1\n", "phi:\t7/256*x^5 - 5/128*x^4 + 1/16*x^3 - 1/8*x^2 + 1/2*x + 1\n", "phi:\t-21/1024*x^6 + 7/256*x^5 - 5/128*x^4 + 1/16*x^3 - 1/8*x^2 + 1/2*x + 1\n", "phi:\t33/2048*x^7 - 21/1024*x^6 + 7/256*x^5 - 5/128*x^4 + 1/16*x^3 - 1/8*x^2 + 1/2*x + 1\n", "phi:\t-429/32768*x^8 + 33/2048*x^7 - 21/1024*x^6 + 7/256*x^5 - 5/128*x^4 + 1/16*x^3 - 1/8*x^2 + 1/2*x + 1\n" ] } ], "source": [ "phi = NewtonIteration(f, 1, debug = True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What's going on here is that we are recovering the power series of $\\sqrt{1 + x}$. " ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(phi^2 - (x+1)) % x^(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Thus, if we calculate the _minimal polynomial_ of $\\phi$, then we will recover the factor of $f$ from which $\\phi$ came from. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Newton iteration with quadratic convergence\n", "\n", "With the same condition on the starting point, we could get faster convergence with the following iterative process. \n", "\n", "\\begin{align*}\n", "\\phi_0 & = \\alpha\\\\\n", "\\phi_{i+1} &= \\phi_i + \\frac{F(x, \\phi_i)}{\\partial_y F(x, \\phi_i)}\n", "\\end{align*}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above is however a rational expression since the denominator is a polynomial, but it can still be interpreted as a power series since the denominator has a nonzero constant term. However, the advantage is that we get much faster convergence:\n", "$$\n", "F(x, \\phi_i(x)) = 0 \\bmod{x^{2^i}}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Quadratic convergence without divisions\n", "\n", "There is yet another trick we can employ to come up with a similar expression that does not use any divisions. This comes from the simple observation that $1/\\partial_yF(x, \\phi_i)$ is the solution to the equation $H(z) = z \\cdot \\partial_y F(x,\\phi_i)$. Thus, we can do Newton iteration here as well!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If $z_i$ is such that $H(z_i) = 0 \\bmod x^i$, then we can define $z_{i+1}$ as follows:\n", "\\begin{align*}\n", "z_{i+1} & = z_i - \\frac{H(z_i)}{\\partial_z H(z_i)}\\\\\n", " & = z_i - \\frac{z_i \\cdot \\partial_y F(x, \\phi_i) - 1}{\\partial_y F(x, \\phi_i)}\\\\\n", " & = z_i - (z_i \\cdot \\partial_y F(x, \\phi_i) - 1) \\cdot z_i \\bmod{x^{i+1}}\\\\\n", " & = 2 z_i - z_i^2 \\cdot \\partial_y F(x, \\phi_i).\n", "\\end{align*}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This can be used to get the faster convergence without requiring any divisions by polynomials. " ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "def NewtonIteration_quadratic_conv(f, alpha, steps: int = 10, debug = False):\n", " if f(0, alpha) > 1e-5 : raise ValueError(f\"f = {f} is not 0 at alpha={alpha}\")\n", " \n", " dyf = derivative(f,y)\n", " dyf_alpha = dyf(0, alpha)\n", " if dyf_alpha < 1e-5: raise ValueError(f\"f is degenerate at alpha={alpha}\")\n", "\n", " curr_phi = alpha\n", " curr_z = 1/dyf_alpha\n", "\n", " for i in range(steps):\n", " if debug:\n", " print(f\"phi:\\t{curr_phi}\")\n", " print(f\"z:\\t{curr_z}\")\n", " print(\"\")\n", "\n", " curr_phi = (curr_phi - f(x,curr_phi) * curr_z) % (x^(2^(i+1)))\n", " curr_z = (2*curr_z - curr_z * curr_z * dyf(x, curr_phi)) % (x^(2^(i+1)))\n", "\n", " return curr_phi" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example 4" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "f = (y - (x^10 + x^6 + 3*x + 3)) * (y - (x^3 + 2*x^2 + 2))" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "phi:\t3\n", "z:\t1\n", "\n", "phi:\t3*x + 3\n", "z:\t-3*x + 1\n", "\n", "phi:\t3*x + 3\n", "z:\t-38*x^3 + 11*x^2 - 3*x + 1\n", "\n", "phi:\tx^6 + 3*x + 3\n", "z:\t-5649*x^7 + 1619*x^6 - 464*x^5 + 133*x^4 - 38*x^3 + 11*x^2 - 3*x + 1\n", "\n", "phi:\tx^10 + x^6 + 3*x + 3\n", "z:\t-124079786*x^15 + 35561905*x^14 - 10192225*x^13 + 2921144*x^12 - 837215*x^11 + 239950*x^10 - 68771*x^9 + 19710*x^8 - 5649*x^7 + 1619*x^6 - 464*x^5 + 133*x^4 - 38*x^3 + 11*x^2 - 3*x + 1\n", "\n" ] }, { "data": { "text/plain": [ "x^10 + x^6 + 3*x + 3" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "NewtonIteration_quadratic_conv(f, 3, steps=5, debug=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Dealing with degeneracy\n", "\n", "Suppose we have a polynomial of the form $f(x,y) = g(x,y)^2 \\cdot h(x,y)$, then no matter what $\\alpha$ we choose that makes $g(0,\\alpha) = 0$, that will always be high-multiplicity root and hence degenerate. Therefore, we cannot start our Newton Iteration from $\\alpha$ and hope to recover $g$. \n", "\n", "However, if we are willing to perturb our polynomial a little bit, then we can recover $g$ from it!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Suppose $\\alpha$ was such that $g(0,\\alpha) = 0$, and $h(0,\\alpha) \\neq 0$ and $\\partial_y g (0,\\alpha) \\neq 0$. Then, we can work with the following polynomial:\n", "$$F(x,y) = f(x, y + \\epsilon) - f(0, \\alpha + \\epsilon).$$ \n", "\n", "One can check that $F(0,\\alpha) = 0$ and $\\partial_y F(0,\\alpha) \\neq 0$ (although it will be divisible by some power of $\\epsilon$). Thus, one can hope to start Newton iteration for $F$ instead of $f$ and hope to get $g$ somehow. " ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "f = (y - (x^3 + 3*x^2 + 32 * x + 1))^2 * (y + x^2 - x + 9)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "alpha = 1\n", "f(0,alpha) == 0" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.77635683940025e-15" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eps = 0.001\n", "F = f(x, y + eps) - f(0, alpha + eps)\n", "F(0,alpha)\n", "# Well, it ought to be zero but computers are bad with floating point arithmetic" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.0200029999999991" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dy_F = derivative(F, y)\n", "dy_F(0, alpha)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "phi:\t1\n", "phi:\t1.00000000000000\n", "phi:\t32.0000000000000*x + 1.00000000000000\n", "phi:\t3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000\n", "phi:\tx^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000\n", "phi:\t-4.26300000000000e-11*x^4 + x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000\n", "phi:\t1.26100000000000e-10*x^5 - 7.02800000000000e-11*x^4 + x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000\n", "phi:\t-3.60500000000000e-10*x^6 + 1.78500000000000e-10*x^5 + 6.00800000000000e-11*x^4 + x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000\n", "phi:\t1.04100000000000e-9*x^7 - 5.75400000000000e-10*x^6 - 2.11100000000000e-10*x^5 - 3.89400000000000e-11*x^4 + x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000\n", "phi:\t-2.97500000000000e-9*x^8 + 1.87200000000000e-9*x^7 + 6.71900000000000e-10*x^6 + 1.15500000000000e-10*x^5 - 6.47500000000000e-11*x^4 + x^3 + 3.00000000000000*x^2 + 32.0000000000000*x + 1.00000000000000\n" ] } ], "source": [ "phi = NewtonIteration(F, alpha, debug=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Thus if we round off the coefficients, we can see the root we wanted sitting inside $\\phi$. " ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "g_prime = x^3 + 3*x^2 + 32*x + 1\n", "f= (x^2 - x + y + 9) * (-x^3 - 3*x^2 - 32*x + y - 1)^2\n" ] } ], "source": [ "g_prime = 0\n", "for c in phi.coefficients():\n", " g_prime = g_prime * x + round(c)\n", "print(f\"g_prime = {g_prime}\")\n", "print(f\"f= {factor(f)}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Thus from $\\phi$, we can extract the coefficients of the factor of $f$ even though the factor has multiplicities. " ] } ], "metadata": { "kernelspec": { "display_name": "SageMath 10.5", "language": "sage", "name": "sagemath-10.5" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.5" } }, "nbformat": 4, "nbformat_minor": 4 }