Cleanup: remove unused fields from nodes

  • Jump to comment-1
    boekewurm+postgres@gmail.com2024-04-22T15:21:44+00:00
    Hi, While working on the 'reduce nodeToString output' patch, I noticed that my IDE marked one field in the TidScanState node as 'unused'. After checking this seemed to be accurate, and I found a few more such fields in Node structs. PFA some patches that clean this up: 0001 is plain removal of fields that are not accessed anywhere anymore, 0002 and up clean up fields that are effectively write-only, with no effective use inside PostgreSQL's own code, and no effective usage found on Debian code search, nor Github code search. I'm quite confident about the correctness of patches 1 and 3 (no usage at all, and newly introduced with no meaningful uses), while patches 2, 4, and 5 could be considered 'as designed'. For those last ones I have no strong opinion for removal or against keeping them around, this is just to point out we can remove the fields, as nobody seems to be using them. /cc Tom Lane and Etsuro Fujita: 2 and 4 were introduced with your commit afb9249d in 2015. /cc Amit Kapila: 0003 was introduced with your spearheaded commit 6185c973 this year. Kind regards, Matthias van de Meent 0001 removes two old fields that are not in use anywhere anymore, but at some point these were used. 0002/0004 remove fields in ExecRowMark which were added for FDWs to use, but there are no FDWs which use this: I could only find two FDWs who implement RefetchForeignRow, one being BlackHoleFDW, and the other a no-op implementation in kafka_fdw [0]. We also don't seem to have any testing on this feature. 0003 drops the input_finfo field on the new JsonExprState struct. It wasn't read anywhere, so keeping it around makes little sense IMO. 0005 drops field DeallocateStmt.isall: the value of the field is implied by !name, and that field was used as such. [0] https://github.com/cohenjo/kafka_fdw/blob/master/src/kafka_fdw.c#L1793
    • Jump to comment-1
      tgl@sss.pgh.pa.us2024-04-22T15:41:16+00:00
      Matthias van de Meent <boekewurm+postgres@gmail.com> writes: > 0001 removes two old fields that are not in use anywhere anymore, but > at some point these were used. +1. They're not being initialized, so they're useless and confusing. > 0002/0004 remove fields in ExecRowMark which were added for FDWs to > use, but there are no FDWs which use this: I could only find two FDWs > who implement RefetchForeignRow, one being BlackHoleFDW, and the other > a no-op implementation in kafka_fdw [0]. We also don't seem to have > any testing on this feature. I'm kind of down on removing either of these. ermExtra is explicitly intended for extensions to use, and just because we haven't found any users doesn't mean there aren't any, or might not be next week. Similarly, ermActive seems like it's at least potentially useful: is there another way for onlookers to discover that state? > 0003 drops the input_finfo field on the new JsonExprState struct. It > wasn't read anywhere, so keeping it around makes little sense IMO. +1. The adjacent input_fcinfo field has this info if anyone needs it. > 0005 drops field DeallocateStmt.isall: the value of the field is > implied by !name, and that field was used as such. Seems reasonable. I think it would be a good idea to push 0003 for v17, just so nobody grows an unnecessary dependency on that field. 0001 and 0005 could be left for v18, but on the other hand they're so trivial that it could also be sensible to just push them to get them out of the way. regards, tom lane
      • Jump to comment-1
        boekewurm+postgres@gmail.com2024-04-22T16:46:51+00:00
        On Mon, 22 Apr 2024 at 17:41, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Matthias van de Meent <boekewurm+postgres@gmail.com> writes: > > 0002/0004 remove fields in ExecRowMark which were added for FDWs to > > use, but there are no FDWs which use this: I could only find two FDWs > > who implement RefetchForeignRow, one being BlackHoleFDW, and the other > > a no-op implementation in kafka_fdw [0]. We also don't seem to have > > any testing on this feature. > > I'm kind of down on removing either of these. ermExtra is explicitly > intended for extensions to use, and just because we haven't found any > users doesn't mean there aren't any, or might not be next week. That's a good point, and also why I wasn't 100% sure removing it was a good idea. I'm not quite sure why this would be used (rather than the internal state of the FDW, or no state at all), but haven't looked very deep into it either, so I'm quite fine with not channging that. > Similarly, ermActive seems like it's at least potentially useful: > is there another way for onlookers to discover that state? The ermActive field is always true when RefetchForeignRow is called (in ExecLockRows(), in nodeLockRows.c), and we don't seem to care about the value of the field afterwards. Additionally, we always set erm->curCtid to a valid value when ermActive is also first set in that code path. In all, it feels like a duplicative field with no real uses inside PostgreSQL itself. If an extension (FDW) needs it, it should probably use ermExtra instead, as ermActive seemingly doesn't carry any meaningful value into the FDW call. > I think it would be a good idea to push 0003 for v17, just so nobody > grows an unnecessary dependency on that field. 0001 and 0005 could > be left for v18, but on the other hand they're so trivial that it > could also be sensible to just push them to get them out of the way. Beta 1 scheduled to be released for quite some time, so I don't think there are any problems with fixing these kinds of minor issues in the provisional ABI for v17. Kind regards, Matthias van de Meent
        • Jump to comment-1
          michael@paquier.xyz2024-04-23T01:15:06+00:00
          On Mon, Apr 22, 2024 at 06:46:27PM +0200, Matthias van de Meent wrote: > On Mon, 22 Apr 2024 at 17:41, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Matthias van de Meent <boekewurm+postgres@gmail.com> writes: >>> 0002/0004 remove fields in ExecRowMark which were added for FDWs to >>> use, but there are no FDWs which use this: I could only find two FDWs >>> who implement RefetchForeignRow, one being BlackHoleFDW, and the other >>> a no-op implementation in kafka_fdw [0]. We also don't seem to have >>> any testing on this feature. >> >> I'm kind of down on removing either of these. ermExtra is explicitly >> intended for extensions to use, and just because we haven't found any >> users doesn't mean there aren't any, or might not be next week. > > That's a good point, and also why I wasn't 100% sure removing it was a > good idea. I'm not quite sure why this would be used (rather than the > internal state of the FDW, or no state at all), but haven't looked > very deep into it either, so I'm quite fine with not channging that. Custom nodes are one extra possibility? I'd leave ermActive and ermExtra be. >> I think it would be a good idea to push 0003 for v17, just so nobody >> grows an unnecessary dependency on that field. 0001 and 0005 could >> be left for v18, but on the other hand they're so trivial that it >> could also be sensible to just push them to get them out of the way. > > Beta 1 scheduled to be released for quite some time, so I don't think > there are any problems with fixing these kinds of minor issues in the > provisional ABI for v17. Tweaking the APIs should be OK until GA, as long as we agree that the current interfaces can be improved. 0003 is new in v17, so let's apply it now. I don't see much a strong argument in waiting for the removal of 0001 and 0005, either, to keep the interfaces cleaner moving on. However, this is not a regression and these have been around for years, so I'd suggest for v18 to open before moving on with the removal. I was wondering for a bit about how tss_htup could be abused in the open, and the only references I can see come from forks of the pre-2019 area, where this uses TidNext(). As a whole, ripping it out does not stress me much. -- Michael
          • Jump to comment-1
            tgl@sss.pgh.pa.us2024-04-23T17:01:10+00:00
            Michael Paquier <michael@paquier.xyz> writes: > On Mon, Apr 22, 2024 at 06:46:27PM +0200, Matthias van de Meent wrote: >> On Mon, 22 Apr 2024 at 17:41, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I think it would be a good idea to push 0003 for v17, just so nobody >>> grows an unnecessary dependency on that field. 0001 and 0005 could >>> be left for v18, but on the other hand they're so trivial that it >>> could also be sensible to just push them to get them out of the way. > Tweaking the APIs should be OK until GA, as long as we agree that the > current interfaces can be improved. > 0003 is new in v17, so let's apply it now. I don't see much a strong > argument in waiting for the removal of 0001 and 0005, either, to keep > the interfaces cleaner moving on. However, this is not a regression > and these have been around for years, so I'd suggest for v18 to open > before moving on with the removal. I went ahead and pushed 0001 and 0003, figuring there was little point in waiting on 0001. I'd intended to push 0005 (remove "isall") as well, but it failed check-world: diff -U3 /home/postgres/pgsql/contrib/pg_stat_statements/expected/utility.out /home/postgres/pgsql/contrib/pg_stat_statements/results/utility.out --- /home/postgres/pgsql/contrib/pg_stat_statements/expected/utility.out 2023-12-08 15:14:55.689347888 -0500 +++ /home/postgres/pgsql/contrib/pg_stat_statements/results/utility.out 2024-04-23 12:17:22.187721947 -0400 @@ -536,12 +536,11 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; calls | rows | query -------+------+---------------------------------------------------- - 2 | 0 | DEALLOCATE $1 - 2 | 0 | DEALLOCATE ALL + 4 | 0 | DEALLOCATE $1 2 | 2 | PREPARE stat_select AS SELECT $1 AS a 1 | 1 | SELECT $1 as a 1 | 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t -(5 rows) +(4 rows) SELECT pg_stat_statements_reset() IS NOT NULL AS t; That is, query jumbling no longer distinguishes "DEALLOCATE x" from "DEALLOCATE ALL", because the DeallocateStmt.name field is marked query_jumble_ignore. Now maybe that's fine, but it's a point we'd not considered so far in this thread. Thoughts? regards, tom lane
            • Jump to comment-1
              michael@paquier.xyz2024-04-24T02:57:29+00:00
              On Tue, Apr 23, 2024 at 01:01:04PM -0400, Tom Lane wrote: > That is, query jumbling no longer distinguishes "DEALLOCATE x" from > "DEALLOCATE ALL", because the DeallocateStmt.name field is marked > query_jumble_ignore. Now maybe that's fine, but it's a point > we'd not considered so far in this thread. Thoughts? And of course, I've managed to forget about bb45156f342c and the reason behind the addition of the field is to be able to make the difference between the named and ALL cases for DEALLOCATE, around here: https://www.postgresql.org/message-id/ZNq9kRwWbKzvR%2B2a%40paquier.xyz This is new in v17, so perhaps it could be changed, but I think that's important to make the difference here for monitoring purposes as DEALLOCATE ALL could be used as a way to clean up prepared statements in connection poolers (for example, pgbouncer's server_reset_query). And doing this tweak in the Node structure of DeallocateStmt is simpler than having to maintain a new pg_node_attr for query jumbling. -- Michael
              • Jump to comment-1
                tgl@sss.pgh.pa.us2024-04-24T03:03:47+00:00
                Michael Paquier <michael@paquier.xyz> writes: > On Tue, Apr 23, 2024 at 01:01:04PM -0400, Tom Lane wrote: >> That is, query jumbling no longer distinguishes "DEALLOCATE x" from >> "DEALLOCATE ALL", because the DeallocateStmt.name field is marked >> query_jumble_ignore. Now maybe that's fine, but it's a point >> we'd not considered so far in this thread. Thoughts? > And of course, I've managed to forget about bb45156f342c and the > reason behind the addition of the field is to be able to make the > difference between the named and ALL cases for DEALLOCATE, around > here: > https://www.postgresql.org/message-id/ZNq9kRwWbKzvR%2B2a%40paquier.xyz Hah. Seems like the comment for isall needs to explain that it exists for this purpose, so we don't make this mistake again. regards, tom lane
                • Jump to comment-1
                  michael@paquier.xyz2024-04-24T07:28:40+00:00
                  On Tue, Apr 23, 2024 at 11:03:40PM -0400, Tom Lane wrote: > Hah. Seems like the comment for isall needs to explain that it > exists for this purpose, so we don't make this mistake again. How about something like the attached? -- Michael
                  • Jump to comment-1
                    boekewurm+postgres@gmail.com2024-04-24T09:51:14+00:00
                    On Wed, 24 Apr 2024 at 09:28, Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Apr 23, 2024 at 11:03:40PM -0400, Tom Lane wrote: > > Hah. Seems like the comment for isall needs to explain that it > > exists for this purpose, so we don't make this mistake again. > > How about something like the attached? LGTM. Thanks, Matthias
                  • Jump to comment-1
                    tgl@sss.pgh.pa.us2024-04-24T12:32:04+00:00
                    Michael Paquier <michael@paquier.xyz> writes: > On Tue, Apr 23, 2024 at 11:03:40PM -0400, Tom Lane wrote: >> Hah. Seems like the comment for isall needs to explain that it >> exists for this purpose, so we don't make this mistake again. > How about something like the attached? I was thinking about wording like * True if DEALLOCATE ALL. This is redundant with "name == NULL", * but we make it a separate field so that exactly this condition * (and not the precise name) will be accounted for in query jumbling. regards, tom lane
                    • Jump to comment-1
                      michael@paquier.xyz2024-04-25T01:26:02+00:00
                      On Wed, Apr 24, 2024 at 08:31:57AM -0400, Tom Lane wrote: > I was thinking about wording like > > * True if DEALLOCATE ALL. This is redundant with "name == NULL", > * but we make it a separate field so that exactly this condition > * (and not the precise name) will be accounted for in query jumbling. Fine by me. I've just used that and applied a patch to doing so. Thanks. -- Michael