postgres.email

pg17 issues with not-null contraints

  • Jump to comment-1
    Justin Pryzby<pryzby@telsasoft.com>
    Apr 15, 2024, 12:13 PM UTC
    Forking: <20230829172828.5qi2pfbladvfgvsg@alvherre.pgsql>
    Subject: Re: Strange presentaion related to inheritance in \d+
    On Tue, Aug 29, 2023 at 07:28:28PM +0200, Alvaro Herrera wrote:
    On 2023-Aug-29, Kyotaro Horiguchi wrote:
    Attached is the initial version of the patch. It prevents "CREATE
    TABLE" from executing if there is an inconsisntent not-null
    constraint. Also I noticed that "ALTER TABLE t ADD NOT NULL c NO
    INHERIT" silently ignores the "NO INHERIT" part and fixed it.

    Great, thank you. I pushed it after modifying it a bit -- instead of
    throwing the error in MergeAttributes, I did it in
    AddRelationNotNullConstraints(). It seems cleaner this way, mostly
    because we already have to match these two constraints there. (I guess
    you could argue that we waste catalog-insertion work before the error is
    reported and the whole thing is aborted; but I don't think this is a
    serious problem in practice.)
    9b581c5341 can break dump/restore from old versions, including
    pgupgrade.
    postgres=# CREATE TABLE iparent(id serial PRIMARY KEY); CREATE TABLE child (id int) INHERITS (iparent); ALTER TABLE child ALTER id DROP NOT NULL; ALTER TABLE child ADD CONSTRAINT p PRIMARY KEY (id);
    $ pg_dump -h /tmp -p 5678 postgres -Fc |pg_restore -1 -h /tmp -p 5679 -d postgres
    ERROR: cannot change NO INHERIT status of inherited NOT NULL constraint "pgdump_throwaway_notnull_0" on relation "child"
    STATEMENT: ALTER TABLE ONLY public.iparent
            ADD CONSTRAINT iparent_pkey PRIMARY KEY (id);
        ALTER TABLE ONLY public.iparent DROP CONSTRAINT pgdump_throwaway_notnull_0;
    Strangely, if I name the table "parent", it seems to work, which might
    indicate an ordering/dependency issue.
    I think there are other issues related to b0e96f3119 (Catalog not-null
    constraints) - if I dump a v16 server using v17 tools, the backup can't
    be restored into the v16 server. I'm okay ignoring a line or two like
    'unrecognized configuration parameter "transaction_timeout", but not
    'syntax error at or near "NO"'.
    postgres=# CREATE TABLE a(i int not null primary key);
    $ pg_dump -h /tmp -p 5678 postgres |psql -h /tmp -p 5678 -d new
    2024-04-13 21:26:14.510 CDT [475995] ERROR: syntax error at or near "NO" at character 86
    2024-04-13 21:26:14.510 CDT [475995] STATEMENT: CREATE TABLE public.a (
            i integer CONSTRAINT pgdump_throwaway_notnull_0 NOT NULL NO INHERIT
        );
    ERROR: syntax error at or near "NO"
    LINE 2: ...er CONSTRAINT pgdump_throwaway_notnull_0 NOT NULL NO INHERIT
    The other version checks in pg_dump.c are used to construct sql for
    querying the source db, but this is used to create the sql to restore
    the target, using syntax that didn't exist until v17.
                    if (print_notnull)  
                    {
                        if (tbinfo->notnull_constrs[j][0] == '\0')
                            appendPQExpBufferStr(q, " NOT NULL");
                        else
                            appendPQExpBuffer(q, " CONSTRAINT %s NOT NULL",
                                              fmtId(tbinfo->notnull_constrs[j]));
    
                        if (tbinfo->notnull_noinh[j])
                            appendPQExpBufferStr(q, " NO INHERIT");
                    }
    This other thread is 6 years old and forgotten again, but still seems
    relevant.
    https://www.postgresql.org/message-id/flat/b8794d6a-38f0-9d7c-ad4b-e85adf860fc9%40enterprisedb.com
    BTW, these comments are out of date:
    + * In versions 16 and up, we need pg_constraint for explicit NOT NULL
    + if (fout->remoteVersion >= 170000)
    + * that we needn't specify that again for the child. (Versions >= 16 no
    + if (fout->remoteVersion < 170000)
    --
    Justin
    • Jump to comment-1
      Alvaro Herrera<alvherre@alvh.no-ip.org>
      Apr 15, 2024, 1:47 PM UTC
      On 2024-Apr-15, Justin Pryzby wrote:
      9b581c5341 can break dump/restore from old versions, including
      pgupgrade.

      postgres=# CREATE TABLE iparent(id serial PRIMARY KEY); CREATE TABLE child (id int) INHERITS (iparent); ALTER TABLE child ALTER id DROP NOT NULL; ALTER TABLE child ADD CONSTRAINT p PRIMARY KEY (id);

      $ pg_dump -h /tmp -p 5678 postgres -Fc |pg_restore -1 -h /tmp -p 5679 -d postgres
      ERROR: cannot change NO INHERIT status of inherited NOT NULL constraint "pgdump_throwaway_notnull_0" on relation "child"
      STATEMENT: ALTER TABLE ONLY public.iparent
      ADD CONSTRAINT iparent_pkey PRIMARY KEY (id);

      Strangely, if I name the table "parent", it seems to work, which might
      indicate an ordering/dependency issue.
      Hmm, apparently if the table is "iparent", the primary key is created in
      the child first; if the table is "parent", then the PK is created first
      there. I think the problem is that the ADD CONSTRAINT for the PK should
      not be recursing at all in this case ... seeing in particular that the
      command specifies ONLY. Should be a simple fix, looking now.
      I think there are other issues related to b0e96f3119 (Catalog not-null
      constraints) - if I dump a v16 server using v17 tools, the backup can't
      be restored into the v16 server. I'm okay ignoring a line or two like
      'unrecognized configuration parameter "transaction_timeout", but not
      'syntax error at or near "NO"'.
      This doesn't look something that we can handle at all. The assumption
      is that pg_dump's output is going to be fed to a server that's at least
      the same version. Running on older versions is just not supported.
      The other version checks in pg_dump.c are used to construct sql for
      querying the source db, but this is used to create the sql to restore
      the target, using syntax that didn't exist until v17.

      if (print_notnull)
      {
      if (tbinfo->notnull_constrs[j][0] == '\0')
      appendPQExpBufferStr(q, " NOT NULL");
      else
      appendPQExpBuffer(q, " CONSTRAINT %s NOT NULL",
      fmtId(tbinfo->notnull_constrs[j]));
      if (tbinfo->notnull_noinh[j])
      appendPQExpBufferStr(q, " NO INHERIT");
      }
      If you have ideas on what to do about this, I'm all ears, but keep in
      mind that pg_dump doesn't necessarily know what the target version is.

      This other thread is 6 years old and forgotten again, but still seems
      relevant.
      https://www.postgresql.org/message-id/flat/b8794d6a-38f0-9d7c-ad4b-e85adf860fc9%40enterprisedb.com
      I only skimmed very briefly, but it looks related to commit c3709100be73
      that I pushed earlier today. Or if you have some specific case that
      fails to be handled please let me know. (Maybe we should have the
      regress tests leave some tables behind to ensure the pg_upgrade behavior
      is what we want, if we continue to break it.)
      BTW, these comments are out of date:

      + * In versions 16 and up, we need pg_constraint for explicit NOT NULL
      + if (fout->remoteVersion >= 170000)
      + * that we needn't specify that again for the child. (Versions >= 16 no
      + if (fout->remoteVersion < 170000)
      Thanks, will fix. But I'm probably touching this code in the fix for
      Andrew Bille's problem, so I might not do so immediately.
      --
      Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
      "Estoy de acuerdo contigo en que la verdad absoluta no existe...
      El problema es que la mentira sí existe y tu estás mintiendo" (G. Lama)
      • Jump to comment-1
        Alvaro Herrera<alvherre@alvh.no-ip.org>
        Apr 15, 2024, 4:30 PM UTC
        On 2024-Apr-15, Alvaro Herrera wrote:
        On 2024-Apr-15, Justin Pryzby wrote:
        postgres=# CREATE TABLE iparent(id serial PRIMARY KEY); CREATE TABLE child (id int) INHERITS (iparent); ALTER TABLE child ALTER id DROP NOT NULL; ALTER TABLE child ADD CONSTRAINT p PRIMARY KEY (id);

        $ pg_dump -h /tmp -p 5678 postgres -Fc |pg_restore -1 -h /tmp -p 5679 -d postgres
        ERROR: cannot change NO INHERIT status of inherited NOT NULL constraint "pgdump_throwaway_notnull_0" on relation "child"
        STATEMENT: ALTER TABLE ONLY public.iparent
        ADD CONSTRAINT iparent_pkey PRIMARY KEY (id);
        Hmm, apparently if the table is "iparent", the primary key is created in
        the child first; if the table is "parent", then the PK is created first
        there. I think the problem is that the ADD CONSTRAINT for the PK should
        not be recursing at all in this case ... seeing in particular that the
        command specifies ONLY. Should be a simple fix, looking now.
        So the problem is that the ADD CONSTRAINT PRIMARY KEY in the parent
        table wants to recurse to the child, so that a NOT NULL constraint is
        created on each column. If the child is created first, there's already
        a NOT NULL NO INHERIT constraint in it which was created for its own
        primary key, so the internal recursion in the parent's ADD PK fails.
        A fix doesn't look all that simple:
        - As I said in my earlier reply, my first thought was to have ALTER
        TABLE ADD PRIMARY KEY not recurse if the command is ALTER TABLE ONLY.
        This doesn't work, because the point of that recursion is precisely to
        handle this case, so if we do that, we break the other stuff that this
        was added to solve.
        - Second thought was to add a bespoke dependency in pg_dump.c so that
        the child PK is dumped after the parent PK. I looked at the code,
        didn't like the idea of adding such a hack, went looking for other
        ideas.
        - Third thought was to hack AdjustNotNullInheritance1() so that it
        changes the conisnoinherit flag in this particular case. Works great,
        except that once we mark this constraint as inherited, we cannot drop
        it; and since it's a constraint marked "throwaway", pg_dump expects to
        be able to drop it, which means the ALTER TABLE DROP CONSTRAINT throws
        an error, and a constraint named pgdump_throwaway_notnull_0 remains in
        place.
        - Fourth thought: we do as in the third thought, except we also allow
        DROP CONSTRAINT a constraint that's marked "local, inherited" to be
        simply an inherited constraint (remove its "local" marker).
        I'm going to try to implement this fourth idea, which seems promising.
        I think if we do that, the end result will be identical to the case
        where the child is created after the parent.
        However, we'll also need that constraint to have a name better than
        pgdump_throwaway_notnull_NN.
        --
        Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
        • Jump to comment-1
          Alvaro Herrera<alvherre@alvh.no-ip.org>
          Apr 16, 2024, 6:12 PM UTC
          On 2024-Apr-15, Alvaro Herrera wrote:
          - Fourth thought: we do as in the third thought, except we also allow
          DROP CONSTRAINT a constraint that's marked "local, inherited" to be
          simply an inherited constraint (remove its "local" marker).
          Here is an initial implementation of what I was thinking. Can you
          please give it a try and see if it fixes this problem? At least in my
          run of your original test case, it seems to work as expected.
          This is still missing some cleanup and additional tests, of course.
          Speaking of which, I wonder if I should modify pg16's tests so that they
          leave behind tables set up in this way, to immortalize pg_upgrade
          testing.
          --
          Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
          • Jump to comment-1
            Justin Pryzby<pryzby@telsasoft.com>
            Apr 16, 2024, 10:36 PM UTC
            On Tue, Apr 16, 2024 at 08:11:49PM +0200, Alvaro Herrera wrote:
            On 2024-Apr-15, Alvaro Herrera wrote:
            - Fourth thought: we do as in the third thought, except we also allow
            DROP CONSTRAINT a constraint that's marked "local, inherited" to be
            simply an inherited constraint (remove its "local" marker).

            Here is an initial implementation of what I was thinking. Can you
            please give it a try and see if it fixes this problem? At least in my
            run of your original test case, it seems to work as expected.
            Yes, this fixes the issue I reported.
            BTW, that seems to be the same issue Andrew reported in January.
            https://www.postgresql.org/message-id/CAJnzarwkfRu76_yi3dqVF_WL-MpvT54zMwAxFwJceXdHB76bOA%40mail.gmail.com
            --
            Justin
            • Jump to comment-1
              Alvaro Herrera<alvherre@alvh.no-ip.org>
              Apr 17, 2024, 5:45 PM UTC
              On 2024-Apr-16, Justin Pryzby wrote:
              Yes, this fixes the issue I reported.
              Excellent, thanks for confirming.
              BTW, that seems to be the same issue Andrew reported in January.
              https://www.postgresql.org/message-id/CAJnzarwkfRu76_yi3dqVF_WL-MpvT54zMwAxFwJceXdHB76bOA%40mail.gmail.com
              That's really good news -- I was worried it would require much more
              invasive changes. I tested his case and noticed two additional issues,
              first that we fail to acquire locks down the hierarchy, so recursing
              down like ATPrepAddPrimaryKey does fails to pin down the children
              properly; and second, that the constraint left behind by restoring the
              dump preserves the "throaway" name. I made pg_dump use a different name
              when the table has a parent, just in case we end up not dropping the
              constraint.
              I'm going to push this early tomorrow. CI run:
              https://cirrus-ci.com/build/5754149453692928
              --
              Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
          • Jump to comment-1
            Justin Pryzby<pryzby@telsasoft.com>
            Apr 18, 2024, 3:07 PM UTC
            On Tue, Apr 16, 2024 at 08:11:49PM +0200, Alvaro Herrera wrote:
            This is still missing some cleanup and additional tests, of course.
            Speaking of which, I wonder if I should modify pg16's tests so that they
            leave behind tables set up in this way, to immortalize pg_upgrade
            testing.
            That seems like it could be important. I considered but never actually
            test your patch by pg_upgrading across major versions.
            BTW, this works up to v16 (although maybe it should not):
            | CREATE TABLE ip(id int PRIMARY KEY); CREATE TABLE ic(id int) INHERITS (ip); ALTER TABLE ic ALTER id DROP NOT NULL;
            Under v17, this fails. Maybe that's okay, but it should probably be
            called out in the release notes.
            | ERROR: cannot drop inherited constraint "ic_id_not_null" of relation "ic"
            That's the issue that I mentioned in the 6 year old thread. In the
            future (upgrading *from* v17) it won't be possible anymore, right? It'd
            still be nice to detect the issue in advance rather than failing halfway
            through the upgrade. I have a rebased patch while I'll send on that
            thread. I guess it's mostly unrelated to your patch but it'd be nice if
            you could take a look.
            --
            Justin
            • Jump to comment-1
              Alvaro Herrera<alvherre@alvh.no-ip.org>
              Apr 18, 2024, 4:23 PM UTC
              On 2024-Apr-18, Justin Pryzby wrote:
              That seems like it could be important. I considered but never actually
              test your patch by pg_upgrading across major versions.
              It would be a welcome contribution for sure. I've been doing it rather
              haphazardly, which is not great.
              BTW, this works up to v16 (although maybe it should not):

              | CREATE TABLE ip(id int PRIMARY KEY); CREATE TABLE ic(id int) INHERITS (ip); ALTER TABLE ic ALTER id DROP NOT NULL;

              Under v17, this fails. Maybe that's okay, but it should probably be
              called out in the release notes.
              Sure, we should mention that.
              | ERROR: cannot drop inherited constraint "ic_id_not_null" of relation "ic"

              That's the issue that I mentioned in the 6 year old thread. In the
              future (upgrading *from* v17) it won't be possible anymore, right?
              Yeah, trying to drop the constraint in 17 fails as it should; it was one
              of the goals of this whole thing in fact.
              It'd still be nice to detect the issue in advance rather than failing
              halfway through the upgrade.
              Maybe we can have pg_upgrade --check look for cases we might have
              trouble upgrading. (I mean: such cases would fail if you have rows with
              nulls in the affected columns, but the schema upgrade should be
              successful. Is that what you have in mind?)
              I have a rebased patch while I'll send on that thread. I guess it's
              mostly unrelated to your patch but it'd be nice if you could take a
              look.
              Okay.
              --
              Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
              "Industry suffers from the managerial dogma that for the sake of stability
              and continuity, the company should be independent of the competence of
              individual employees." (E. Dijkstra)
              • Jump to comment-1
                Justin Pryzby<pryzby@telsasoft.com>
                Apr 18, 2024, 4:52 PM UTC
                On Thu, Apr 18, 2024 at 06:23:30PM +0200, Alvaro Herrera wrote:
                On 2024-Apr-18, Justin Pryzby wrote:
                BTW, this works up to v16 (although maybe it should not):

                | CREATE TABLE ip(id int PRIMARY KEY); CREATE TABLE ic(id int) INHERITS (ip); ALTER TABLE ic ALTER id DROP NOT NULL;
                It'd still be nice to detect the issue in advance rather than failing
                halfway through the upgrade.

                Maybe we can have pg_upgrade --check look for cases we might have
                trouble upgrading. (I mean: such cases would fail if you have rows with
                nulls in the affected columns, but the schema upgrade should be
                successful. Is that what you have in mind?)
                Before v16, pg_upgrade failed in the middle of restoring the schema,
                without being caught during --check. The patch to implement that was
                forgotten and never progressed.
                I'm not totally clear on what's intended in v17 - maybe it'd be dead
                code, and maybe it shouldn't even be applied to master branch. But I do
                think it's worth patching earlier versions (even though it'll be less
                useful than having done so 5 years ago).
                --
                Justin
                • Jump to comment-1
                  Robert Haas<robertmhaas@gmail.com>
                  Apr 30, 2024, 5:52 PM UTC
                  On Thu, Apr 18, 2024 at 12:52 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
                  I'm not totally clear on what's intended in v17 - maybe it'd be dead
                  code, and maybe it shouldn't even be applied to master branch. But I do
                  think it's worth patching earlier versions (even though it'll be less
                  useful than having done so 5 years ago).
                  This thread is still on the open items list, but I'm not sure whether
                  there's still stuff here that needs to be fixed for the current
                  release. If not, this thread should be moved to the "resolved before
                  17beta1" section. If so, we should try to reach consensus on what the
                  remaining issues are and what we're going to do about them.
                  --
                  Robert Haas
                  EDB: http://www.enterprisedb.com
                  • Jump to comment-1
                    Justin Pryzby<pryzby@telsasoft.com>
                    Apr 30, 2024, 6:53 PM UTC
                    On Tue, Apr 30, 2024 at 01:52:02PM -0400, Robert Haas wrote:
                    On Thu, Apr 18, 2024 at 12:52 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
                    I'm not totally clear on what's intended in v17 - maybe it'd be dead
                    code, and maybe it shouldn't even be applied to master branch. But I do
                    think it's worth patching earlier versions (even though it'll be less
                    useful than having done so 5 years ago).

                    This thread is still on the open items list, but I'm not sure whether
                    there's still stuff here that needs to be fixed for the current
                    release. If not, this thread should be moved to the "resolved before
                    17beta1" section. If so, we should try to reach consensus on what the
                    remaining issues are and what we're going to do about them.
                    I think the only thing that's relevant for v17 is this:
                    On Tue, Apr 16, 2024 at 08:11:49PM +0200, Alvaro Herrera wrote:
                    Speaking of which, I wonder if I should modify pg16's tests so that they
                    leave behind tables set up in this way, to immortalize pg_upgrade testing.
                    The patch on the other thread for pg_upgrade --check is an old issue
                    affecting all stable releases.
                    --
                    Justin
      • Jump to comment-1
        Justin Pryzby<pryzby@telsasoft.com>
        Apr 16, 2024, 12:02 AM UTC
        On Mon, Apr 15, 2024 at 03:47:38PM +0200, Alvaro Herrera wrote:
        On 2024-Apr-15, Justin Pryzby wrote:
        I think there are other issues related to b0e96f3119 (Catalog not-null
        constraints) - if I dump a v16 server using v17 tools, the backup can't
        be restored into the v16 server. I'm okay ignoring a line or two like
        'unrecognized configuration parameter "transaction_timeout", but not
        'syntax error at or near "NO"'.

        This doesn't look something that we can handle at all. The assumption
        is that pg_dump's output is going to be fed to a server that's at least
        the same version. Running on older versions is just not supported.
        You're right - the docs say:
        |Also, it is not guaranteed that pg_dump's output can be loaded into a
        |server of an older major version — not even if the dump was taken from a
        |server of that version
        Here's a couple more issues affecting upgrades from v16 to v17.
        postgres=# CREATE TABLE a(i int NOT NULL); CREATE TABLE b(i int PRIMARY KEY) INHERITS (a);
        pg_restore: error: could not execute query: ERROR: constraint "pgdump_throwaway_notnull_0" of relation "b" does not exist
        postgres=# CREATE TABLE a(i int CONSTRAINT a NOT NULL PRIMARY KEY); CREATE TABLE b()INHERITS(a); ALTER TABLE b ADD CONSTRAINT pkb PRIMARY KEY (i);
        pg_restore: error: could not execute query: ERROR: cannot drop inherited constraint "pgdump_throwaway_notnull_0" of relation "b"
        --
        Justin
        • Jump to comment-1
          Alvaro Herrera<alvherre@alvh.no-ip.org>
          Apr 16, 2024, 6:26 PM UTC
          On 2024-Apr-15, Justin Pryzby wrote:
          Here's a couple more issues affecting upgrades from v16 to v17.

          postgres=# CREATE TABLE a(i int NOT NULL); CREATE TABLE b(i int PRIMARY KEY) INHERITS (a);
          pg_restore: error: could not execute query: ERROR: constraint "pgdump_throwaway_notnull_0" of relation "b" does not exist
          This one requires a separate pg_dump fix, which should --I hope-- be
          pretty simple.
          postgres=# CREATE TABLE a(i int CONSTRAINT a NOT NULL PRIMARY KEY); CREATE TABLE b()INHERITS(a); ALTER TABLE b ADD CONSTRAINT pkb PRIMARY KEY (i);
          pg_restore: error: could not execute query: ERROR: cannot drop inherited constraint "pgdump_throwaway_notnull_0" of relation "b"
          This one seems to be fixed with the patch I just posted.
          --
          Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
          "I am amazed at [the pgsql-sql] mailing list for the wonderful support, and
          lack of hesitasion in answering a lost soul's question, I just wished the rest
          of the mailing list could be like this." (Fotis)
                    https://postgr.es/m/200606261359.k5QDxE2p004593@auth-smtp.hol.gr
        • Jump to comment-1
          Alvaro Herrera<alvherre@alvh.no-ip.org>
          Apr 18, 2024, 1:41 PM UTC
          On 2024-Apr-15, Justin Pryzby wrote:
          Here's a couple more issues affecting upgrades from v16 to v17.

          postgres=# CREATE TABLE a(i int NOT NULL); CREATE TABLE b(i int PRIMARY KEY) INHERITS (a);
          pg_restore: error: could not execute query: ERROR: constraint "pgdump_throwaway_notnull_0" of relation "b" does not exist

          postgres=# CREATE TABLE a(i int CONSTRAINT a NOT NULL PRIMARY KEY); CREATE TABLE b()INHERITS(a); ALTER TABLE b ADD CONSTRAINT pkb PRIMARY KEY (i);
          pg_restore: error: could not execute query: ERROR: cannot drop inherited constraint "pgdump_throwaway_notnull_0" of relation "b"
          I pushed a fix now, and it should also cover these two issues, which
          required only minor changes over what I posted yesterday. Also, thank
          you for pointing out that the patch also fixed Andrew's problem. It
          did, except there was a locking problem which required an additional
          tweak.
          Thanks for reporting these.
          --
          Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
          "At least to kernel hackers, who really are human, despite occasional
          rumors to the contrary" (LWN.net)