Collation again here

  • Jump to comment-1
    Rihad<grihad@gmail.com>
    Jan 8, 2026, 12:39 PM UTC
    Hi, guys. Just pg_upgraded our PG from 13.x to 18.1, rebuilt all indexes concurrently and issued ALTER DATABASE foo REFRESH COLLATION VERSION
    Everything's fine on the master server, no warnings etc. Then I set up a replica using pg_basebackup, and there when trying to access the DB using psql I get:
    WARNING:  database "foo" has a collation version mismatch
    DETAIL:  The database was created using collation version 43.0, but the operating system provides version 34.0.
    HINT:  Rebuild all objects in this database that use the default collation and run ALTER DATABASE foo REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
    psql (18.1)
    Type "help" for help.
    (and I can't issue those same commands on the read-only replica).
    The OS are both FreeBSD, but the master runs 14.3, the replica runs 13.5. PostgreSQL packages are built with ICU support by default, the versions used are identical: icu-76.1
    Could it be that the OS (libc) needs to be the same version? I hoped it would be enough to use the same ICU.
    Thanks for any tips.
    • Jump to comment-1
      Daniel Verite<daniel@manitou-mail.org>
      Jan 8, 2026, 1:19 PM UTC
      Rihad wrote:
      DETAIL: The database was created using collation version 43.0, but the
      operating system provides version 34.0.
      The OS are both FreeBSD, but the master runs 14.3, the replica runs
      13.5. PostgreSQL packages are built with ICU support by default, the
      versions used are identical: icu-76.1


      Could it be that the OS (libc) needs to be the same version? I hoped it
      would be enough to use the same ICU.
      That would be the case if the database collation provider was ICU,
      that is pg_database.datlocprovider = 'i'
      Otherwise no, it's the libc version that matters, unless using
      the binary-sorting C or C.UTF-8 collations.
      43.0 and 34.0 don't look like ICU collation version numbers, so
      they probably refer to FreeBSD libc (pg_database.datlocprovider = 'c')
      As a general rule, you should indeed run the same OS version
      for the primary and physical replicas.
      Before Postgres version 15 this message did not exist, so the risks
      of mismatch were silent but they existed just the same.
      Best regards,
      --
      Daniel Vérité
      https://postgresql.verite.pro/
      • Jump to comment-1
        Rihad<grihad@gmail.com>
        Jan 8, 2026, 2:52 PM UTC
        On 1/8/26 5:19 PM, Daniel Verite wrote:
        Rihad wrote:
        DETAIL: The database was created using collation version 43.0, but the
        operating system provides version 34.0.
        The OS are both FreeBSD, but the master runs 14.3, the replica runs
        13.5. PostgreSQL packages are built with ICU support by default, the
        versions used are identical: icu-76.1


        Could it be that the OS (libc) needs to be the same version? I hoped it
        would be enough to use the same ICU.
        That would be the case if the database collation provider was ICU,
        that is pg_database.datlocprovider = 'i'
        Otherwise no, it's the libc version that matters, unless using
        the binary-sorting C or C.UTF-8 collations.
        43.0 and 34.0 don't look like ICU collation version numbers, so
        they probably refer to FreeBSD libc (pg_database.datlocprovider = 'c')

        As a general rule, you should indeed run the same OS version
        for the primary and physical replicas.
        Before Postgres version 15 this message did not exist, so the risks
        of mismatch were silent but they existed just the same.


        Best regards,
        Thanks. Both FreeBSD 13.5 & 14.3 are "current", just from different major branches. I wonder why they use such different locale/collate versioning, meaning that they differ roughly by that much as in 34 vs. 43. Maybe someone there mistyped 34?)
        • Jump to comment-1
          Peter 'PMc' Much<pmc@citylink.dinoex.sub.org>
          Jan 8, 2026, 10:45 PM UTC
          On Thu, Jan 08, 2026 at 05:26:47PM +0400, Rihad wrote:
          ! > As a general rule, you should indeed run the same OS version
          ! > for the primary and physical replicas.
          ! > Before Postgres version 15 this message did not exist, so the risks
          ! > of mismatch were silent but they existed just the same.
          ! >
          ! >
          ! > Best regards,
          !
          ! Thanks. Both FreeBSD 13.5 & 14.3 are "current", just from different major
          ! branches. I wonder why they use such different locale/collate versioning,
          ! meaning that they differ roughly by that much as in 34 vs. 43. Maybe someone
          ! there mistyped 34?)
          Hi, I ran into that same (surprizing) error when upgrading FreeBSD
          13.x to 14.x.
          In FreeBSD the major version switches are for exactly that purpose: to
          change interfaces or libaries that may induce incompatibilities
          The version numbers (34.0 vs 43.0) might mean anything, I don't
          know; but they originate from these places in the OS distribution:
          $ hd /usr/share/locale/en_US.UTF-8/LC_COLLATE | head -1
          00000000  42 53 44 20 31 2e 30 0a  00 00 00 00 34 33 2e 30  |BSD 1.0.....43.0|
          In Yor case, apparently the conclusion is that you need the same
          FreeBSD major version throughout Your replication set.
          -- PMc
    • Jump to comment-1
      Dominique Devienne<ddevienne@gmail.com>
      Jan 8, 2026, 12:49 PM UTC
      On Thu, Jan 8, 2026 at 1:39 PM Rihad <grihad@gmail.com> wrote:
      Hi, guys. Just pg_upgraded our PG from 13.x to 18.1, rebuilt all indexes concurrently and issued ALTER DATABASE foo REFRESH COLLATION VERSION
      Everything's fine on the master server, no warnings etc. Then I set up a replica using pg_basebackup, and there when trying to access the DB using psql I get:

      WARNING: database "foo" has a collation version mismatch
      DETAIL: The database was created using collation version 43.0, but the operating system provides version 34.0.
      HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE foo REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
      The OS are both FreeBSD, but the master runs 14.3, the replica runs 13.5. PostgreSQL packages are built with ICU support by default, the versions used are identical: icu-76.1
      Could it be that the OS (libc) needs to be the same version? I hoped it would be enough to use the same ICU.
      Depends what Provider you used in those DBs:
      https://www.postgresql.org/docs/current/locale.html#LOCALE-PROVIDERS
      Having the same ICU is good, but do your DBs use that provider? If
      not, and you're using the libc one, then yes, the libc version
      matters.
      Myself I use the new builtin provider to avoid OS portability issues. --DD
      • Jump to comment-1
        Rihad<grihad@gmail.com>
        Jan 8, 2026, 2:49 PM UTC
        On 1/8/26 4:48 PM, Dominique Devienne wrote:
        On Thu, Jan 8, 2026 at 1:39 PM Rihad<grihad@gmail.com> wrote:
        Hi, guys. Just pg_upgraded our PG from 13.x to 18.1, rebuilt all indexes concurrently and issued ALTER DATABASE foo REFRESH COLLATION VERSION
        Everything's fine on the master server, no warnings etc. Then I set up a replica using pg_basebackup, and there when trying to access the DB using psql I get:

        WARNING: database "foo" has a collation version mismatch
        DETAIL: The database was created using collation version 43.0, but the operating system provides version 34.0.
        HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE foo REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
        The OS are both FreeBSD, but the master runs 14.3, the replica runs 13.5. PostgreSQL packages are built with ICU support by default, the versions used are identical: icu-76.1
        Could it be that the OS (libc) needs to be the same version? I hoped it would be enough to use the same ICU.
        Depends what Provider you used in those DBs:
        https://www.postgresql.org/docs/current/locale.html#LOCALE-PROVIDERS
        Having the same ICU is good, but do your DBs use that provider? If
        not, and you're using the libc one, then yes, the libc version
        matters.
        Myself I use the new builtin provider to avoid OS portability issues. --DD
        Hi, our locale/collate/whatever is enUS.UTF-8 (as set by these pgupgrade flags: --encoding=utf-8 --locale=en_US.UTF-8)
        Looking into pg_collation system table that collation has collprovide="c". First I thought "c" meant libc, but this article states that "c" means PG Internal provider, and libc would have been "l".
        https://medium.com/@adarsh2801/understanding-collations-in-postgresql-648e4fa333e1
        1. /PostgreSQL Internal Provider (‘c’) /: Introduced in Postgres 15.
        This built-in collation support is System/OS agnostic.
        2. /System Library Provider (‘l’) : /Uses GNU C library and hence is
        OS locale dependent.
        3. /ICU — International Components for Unicode (‘i’) : /Uses ICU
        library for unicode-aware collation.
        We only have "i" & "c" in pg_collation. And we aren't using any of "i" it seems. All this locale/encoding/collate stuff is too much for me to handle, sorry)
        So if we are using the internal (builtin) "c" provider how come the PG 18.1 run on FreeBSD 13.5 version shows warnings that the system version is 34.0?
        The article must be wrong I guess.
        Then upgrading 13.5 to 14.3 is our only option.
        • Jump to comment-1
          Daniel Verite<daniel@manitou-mail.org>
          Jan 8, 2026, 4:12 PM UTC
          Rihad wrote:
          Looking into pg_collation system table that collation has
          collprovide="c". First I thought "c" meant libc, but this article states
          that "c" means PG Internal provider, and libc would have been "l".

          https://medium.com/@adarsh2801/understanding-collations-in-postgresql-648e4fa333e1

          1. /PostgreSQL Internal Provider (‘c’) /: Introduced in Postgres 15.
          This built-in collation support is System/OS agnostic.
          2. /System Library Provider (‘l’) : /Uses GNU C library and hence is
          OS locale dependent.
          3. /ICU — International Components for Unicode (‘i’) : /Uses ICU
          library for unicode-aware collation.
          Well, this is quite wrong!
          See the doc at [1]:
          <quote>
          datlocprovider char
          Locale provider for this database: b = builtin, c = libc, i = icu
          </quote>
          Also, the builtin provider has been introduced in pg17, not 15,
          and 'l' has never been used in this field.
          It looks like hallucination from an AI model.
          Quickly reading the rest of the article, it's a mix of good points
          and very confusing assertions that would not be expected
          from a person proficient in the domain
          (for example: writing that en_us.UTF-8 is "case-insensitive")
          Unfortunately nowadays AI slop proliferates, so we need
          to be extra cautious on posts from untrusted sources.
          [1]: https://www.postgresql.org/docs/current/catalog-pg-database.html
          Best regards,
          --
          Daniel Vérité
          https://postgresql.verite.pro/
        • Jump to comment-1
          Dominique Devienne<ddevienne@gmail.com>
          Jan 8, 2026, 2:02 PM UTC
          On Thu, Jan 8, 2026 at 2:18 PM Rihad <grihad@gmail.com> wrote:
          We only have "i" & "c" in pg_collation. And we aren't using any of "i" it seems. All this locale/encoding/collate stuff is too much for me to handle, sorry)
          As Daniel already told you, it's pg_database.datlocprovider that matters.
          So if we are using the internal (builtin) "c" provider how come the PG 18.1 run on FreeBSD 13.5 version shows warnings that the system version is 34.0?
          "c" is NOT built-in. It's implemented thanks to the OS' libc as a dependency.
          "b" is built-in, i.e. entirely within postgres, w/o external
          dependencies (like icu or libc). New in PostgreSQL 18.
          Then upgrading (the OS from) 13.5 to 14.3 is our only option.
          Given you're likely on "c", probably. As recommeded by Daniel. --DD
        • Jump to comment-1
          Adrian Klaver<adrian.klaver@aklaver.com>
          Jan 8, 2026, 4:05 PM UTC
          On 1/8/26 05:18, Rihad wrote:
          On 1/8/26 4:48 PM, Dominique Devienne wrote:
          Looking into pg_collation system table that collation has > collprovide="c". First I thought "c" meant libc, but this article states > that "c" means PG Internal provider, and libc would have been "l".
          https://medium.com/@adarsh2801/understanding-collations-in- > postgresql-648e4fa333e1
          1. /PostgreSQL Internal Provider (‘c’) /: Introduced in Postgres 15.
          This built-in collation support is System/OS agnostic.
          2. /System Library Provider (‘l’) : /Uses GNU C library and hence is
          OS locale dependent.
          3. /ICU — International Components for Unicode (‘i’) : /Uses ICU
          library for unicode-aware collation.
          This is what the docs are for:
          https://www.postgresql.org/docs/current/catalog-pg-collation.html
          "collprovider char
          Provider of the collation: d = database default, b = builtin, c = libc, i = icu
          "
          And
          https://www.postgresql.org/docs/current/locale.html#LOCALE-PROVIDERS
          "
          23.1.4. Locale Providers
          A locale provider specifies which library defines the locale behavior for collations and character classifications.
          The commands and tools that select the locale settings, as described above, each have an option to select the locale provider. Here is an example to initialize a database cluster using the ICU provider:
          initdb --locale-provider=icu --icu-locale=en
          See the description of the respective commands and programs for details. Note that you can mix locale providers at different granularities, for example use libc by default for the cluster but have one database that uses the icu provider, and then have collation objects using either provider within those databases.
          Regardless of the locale provider, the operating system is still used to provide some locale-aware behavior, such as messages (see lc_messages).
          The available locale providers are listed below:
          builtin
          The builtin provider uses built-in operations. Only the C, C.UTF-8, and PG_UNICODE_FAST locales are supported for this provider.
          
          The C locale behavior is identical to the C locale in the libc provider. When using this locale, the behavior may depend on the database encoding.
          
          The C.UTF-8 locale is available only for when the database encoding is UTF-8, and the behavior is based on Unicode. The collation uses the code point values only. The regular expression character classes are based on the "POSIX Compatible" semantics, and the case mapping is the "simple" variant.
          
          The PG_UNICODE_FAST locale is available only when the database encoding is UTF-8, and the behavior is based on Unicode. The collation uses the code point values only. The regular expression character classes are based on the "Standard" semantics, and the case mapping is the "full" variant.
          icu
          The icu provider uses the external ICU library. PostgreSQL must have been configured with support.
          
          ICU provides collation and character classification behavior that is independent of the operating system and database encoding, which is preferable if you expect to transition to other platforms without any change in results. LC_COLLATE and LC_CTYPE can be set independently of the ICU locale.
          Note
          
          For the ICU provider, results may depend on the version of the ICU library used, as it is updated to reflect changes in natural language over time.
          libc
          The libc provider uses the operating system's C library. The collation and character classification behavior is controlled by the settings LC_COLLATE and LC_CTYPE, so they cannot be set independently.
          Note
          
          The same locale name may have different behavior on different platforms when using the libc provider.
          "
          Rather then some made up gibberish.
          We only have "i" & "c" in pg_collation. And we aren't using any of "i" > it seems. All this locale/encoding/collate stuff is too much for me to > handle, sorry)
          So if we are using the internal (builtin) "c" provider how come the PG > 18.1 run on FreeBSD 13.5 version shows warnings that the system version > is 34.0?
          The article must be wrong I guess.
          Then upgrading 13.5 to 14.3 is our only option.
          -- Adrian Klaver
          adrian.klaver@aklaver.com