pg_asciifold/README.md
2020-06-07 13:18:13 -04:00

PostgreSQL ASCII folding

Reasonably fast ASCII folding functions for PostgreSQL, based on Lucene's ASCIIFoldingFilter. Tested on the MusicBrainz dataset, folding is 40% faster than a simple UPPER().

Example:

postgres=# SELECT asciifold('Hello, ⒩ᴐⱤú⒴⁈~!');
      asciifold       
----------------------
 Hello, (n)ORu(y)?!~!
(1 row)

postgres=# SELECT asciifold_lower('Hello, ⒩ᴐⱤú⒴⁈~!');
   asciifold_lower    
----------------------
 hello, (n)oru(y)?!~!
(1 row)
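
To illustrate the idea behind the output above, here is a minimal C sketch of table-driven ASCII folding. It is not the code in asciifolding.c: the names `decode_utf8`, `fold_code_point` and `fold_sketch` are hypothetical, the table covers only the six non-ASCII characters from the example (the real Lucene table is far larger), and the decoder assumes valid, NUL-terminated UTF-8.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode one UTF-8 sequence at s into *cp and
 * return the number of bytes consumed. Assumes the input is valid. */
static size_t decode_utf8(const unsigned char *s, uint32_t *cp)
{
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0) {
        *cp = ((uint32_t)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0) {
        *cp = ((uint32_t)(s[0] & 0x0F) << 12) |
              ((uint32_t)(s[1] & 0x3F) << 6) | (s[2] & 0x3F);
        return 3;
    }
    *cp = ((uint32_t)(s[0] & 0x07) << 18) | ((uint32_t)(s[1] & 0x3F) << 12) |
          ((uint32_t)(s[2] & 0x3F) << 6) | (s[3] & 0x3F);
    return 4;
}

/* Illustrative subset of a folding table: only the characters used in
 * the README example. Returns NULL when no replacement is known. */
static const char *fold_code_point(uint32_t cp)
{
    switch (cp) {
    case 0x00FA: return "u";   /* LATIN SMALL LETTER U WITH ACUTE */
    case 0x1D10: return "O";   /* LATIN LETTER SMALL CAPITAL OPEN O */
    case 0x2C64: return "R";   /* LATIN CAPITAL LETTER R WITH TAIL */
    case 0x24A9: return "(n)"; /* PARENTHESIZED LATIN SMALL LETTER N */
    case 0x24B4: return "(y)"; /* PARENTHESIZED LATIN SMALL LETTER Y */
    case 0x2048: return "?!";  /* QUESTION EXCLAMATION MARK */
    default:     return NULL;
    }
}

/* Fold `in` into `out` (capacity out_size, including the NUL).
 * Unmapped characters are copied through unchanged. Returns the number
 * of bytes written (excluding the NUL), or (size_t)-1 if out is too small. */
static size_t fold_sketch(const char *in, char *out, size_t out_size)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t n = 0;

    if (out_size == 0)
        return (size_t)-1;

    while (*s) {
        uint32_t cp;
        size_t len = decode_utf8(s, &cp);
        const char *rep = fold_code_point(cp);
        const char *src = rep ? rep : (const char *)s;
        size_t src_len = rep ? strlen(rep) : len;

        if (n + src_len >= out_size)   /* keep one byte for the NUL */
            return (size_t)-1;
        memcpy(out + n, src, src_len);
        n += src_len;
        s += len;
    }
    out[n] = '\0';
    return n;
}
```

Calling `fold_sketch` on the sample input with a sufficiently large buffer writes `Hello, (n)ORu(y)?!~!`, matching the asciifold output. A single code point can expand to several ASCII bytes, which is why the output buffer is sized independently of the input.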

The UTF-8 input string is not validated: invalid UTF-8 may lead to undefined behavior.
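
If the input might come from an untrusted source, one option is to check it before folding. The following is a minimal sketch of such a check, not part of this extension: `is_valid_utf8` is a hypothetical helper name, and the rules it applies (no truncated sequences, no overlong encodings, no surrogates, nothing above U+10FFFF) follow the standard UTF-8 definition.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if the len bytes at s form
 * well-formed UTF-8, false otherwise. */
static bool is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char b = s[i];
        size_t need;          /* number of continuation bytes expected */
        unsigned long cp;     /* code point being assembled */
        unsigned long min;    /* smallest code point allowed for this length */

        if (b < 0x80) {       /* plain ASCII */
            i++;
            continue;
        } else if ((b & 0xE0) == 0xC0) {
            need = 1; cp = b & 0x1F; min = 0x80;
        } else if ((b & 0xF0) == 0xE0) {
            need = 2; cp = b & 0x0F; min = 0x800;
        } else if ((b & 0xF8) == 0xF0) {
            need = 3; cp = b & 0x07; min = 0x10000;
        } else {
            return false;     /* stray continuation byte or invalid lead byte */
        }

        if (len - i - 1 < need)
            return false;     /* sequence truncated at end of input */

        for (size_t k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return false; /* not a continuation byte */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;     /* overlong, out of range, or surrogate */

        i += need + 1;
    }
    return true;
}
```

A caller would run this over the raw bytes first and only pass the string on when it returns true.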

Compiling from source (CMake)

apt install postgresql-server-dev-11
cmake .
make

See asciifolding.c and build.sh for more information.