PostgreSQL ASCII folding

Reasonably fast ASCII folding functions for PostgreSQL, based on Lucene's ASCIIFoldingFilter. Tested on the MusicBrainz dataset, they run about 40% faster than a simple UPPER().

Example:

postgres=# SELECT asciifold('Hello, ⒩ᴐⱤú⒴⁈~!');
      asciifold       
----------------------
 Hello, (n)ORu(y)?!~!
(1 row)

postgres=# SELECT asciifold_lower('Hello, ⒩ᴐⱤú⒴⁈~!');
   asciifold_lower    
----------------------
 hello, (n)oru(y)?!~!
(1 row)
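
The folding shown in the examples above can be pictured as a table lookup per code point. The following is only a minimal sketch of that idea, not the code in asciifolding.c: the names `fold_sketch`, `decode_utf8` and `FOLD_TABLE` are hypothetical, the table holds just the six code points needed for the example string, and a linear scan stands in for whatever lookup the extension really uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mapping table: only the code points used in the example. */
struct fold_entry { uint32_t cp; const char *ascii; };

static const struct fold_entry FOLD_TABLE[] = {
    { 0x00FA, "u"   },  /* LATIN SMALL LETTER U WITH ACUTE */
    { 0x1D10, "O"   },  /* LATIN LETTER SMALL CAPITAL OPEN O */
    { 0x2048, "?!"  },  /* QUESTION EXCLAMATION MARK */
    { 0x24A9, "(n)" },  /* PARENTHESIZED LATIN SMALL LETTER N */
    { 0x24B4, "(y)" },  /* PARENTHESIZED LATIN SMALL LETTER Y */
    { 0x2C64, "R"   },  /* LATIN CAPITAL LETTER R WITH TAIL */
};

/* Decode one code point and return the number of bytes consumed (1..4).
 * Malformed sequences yield U+FFFD and consume a single byte. */
static size_t decode_utf8(const unsigned char *s, uint32_t *cp)
{
    size_t len;

    if (s[0] < 0x80)                { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; len = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; len = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; len = 4; }
    else                            { *cp = 0xFFFD; return 1; }

    for (size_t i = 1; i < len; i++) {
        /* Stops at the first non-continuation byte, including the NUL. */
        if ((s[i] & 0xC0) != 0x80) { *cp = 0xFFFD; return 1; }
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return len;
}

/* Fold the NUL-terminated string `in` into `out`.
 * Returns 0 on success, -1 if `out` is too small. */
int fold_sketch(const char *in, char *out, size_t out_size)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t used = 0;

    while (*s) {
        uint32_t cp;
        size_t n = decode_utf8(s, &cp);
        const char *repl = NULL;

        for (size_t i = 0; i < sizeof FOLD_TABLE / sizeof FOLD_TABLE[0]; i++) {
            if (FOLD_TABLE[i].cp == cp) { repl = FOLD_TABLE[i].ascii; break; }
        }

        /* Unmapped code points are copied through unchanged. */
        const char *src = repl ? repl : (const char *)s;
        size_t src_len = repl ? strlen(repl) : n;

        if (used + src_len + 1 > out_size)
            return -1;
        memcpy(out + used, src, src_len);
        used += src_len;
        s += n;
    }

    if (used + 1 > out_size)
        return -1;
    out[used] = '\0';
    return 0;
}
```

With this table, calling `fold_sketch("Hello, ⒩ᴐⱤú⒴⁈~!", buf, sizeof buf)` on a sufficiently large `buf` produces `Hello, (n)ORu(y)?!~!`, the same result as the first example. One replacement can be longer than the character it replaces, which is why the sketch checks the output size on every step.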

Note: the UTF-8 input string is not sanitized. Invalid UTF-8 might lead to undefined behavior.
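
If the input may come from an untrusted source, one option is to validate it before it reaches the folding functions. The sketch below is a standalone strict UTF-8 check, written under the assumption that the caller knows the byte length; `is_valid_utf8` is a hypothetical helper and is not part of this extension.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if the `len` bytes at `s` form well-formed UTF-8.
 * Rejects truncated sequences, stray continuation bytes, overlong
 * encodings, surrogates, and code points above U+10FFFF. */
bool is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char b = s[i];
        size_t need;       /* number of continuation bytes expected */
        uint32_t cp, min;  /* decoded value and smallest legal value */

        if (b < 0x80)                { i++; continue; }
        else if ((b & 0xE0) == 0xC0) { need = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { need = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { need = 3; cp = b & 0x07; min = 0x10000; }
        else return false;  /* stray continuation or invalid lead byte */

        if (len - i - 1 < need)
            return false;   /* sequence runs past the end of the buffer */

        for (size_t k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }

        if (cp < min)                      return false;  /* overlong */
        if (cp > 0x10FFFF)                 return false;  /* out of range */
        if (cp >= 0xD800 && cp <= 0xDFFF)  return false;  /* surrogate */

        i += need + 1;
    }
    return true;
}
```

A caller would run this check first and skip or reject any string for which it returns false, so that only well-formed input is folded.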

Compiling from source (CMake)

apt install postgresql-server-dev-11
cmake .
make

See asciifolding.c and build.sh for more information.

License: GPL-3.0