Help Center/ DataArts Fabric/ Developer Guide/ SQL Syntax Reference/ Functions and Operators/ Character Processing Functions and Operators
Updated on 2025-09-18 GMT+08:00

Character Processing Functions and Operators

The character processing functions and operators provided by DataArts Fabric SQL are mainly used for connections between strings, between strings and non-strings, as well as pattern matching operations on strings.

bit_length(string)

Description: The bit length of a string.

Return type: integer.

Example:

1
2
3
4
5
SELECT bit_length('world');
 bit_length
------------
         40
(1 row)

btrim(string text [, characters text])

Description: Removes the longest string containing only characters from characters (default is whitespace) from the beginning and end of string.

Return type: text.

Example:

1
2
3
4
5
SELECT btrim('sring' , 'ing');
 btrim
-------
 sr
(1 row)

char_length(string) or character_length(string)

Description: The number of characters in a string.

Return type: integer.

Example:

1
2
3
4
5
SELECT char_length('hello');
 char_length
-------------
           5
(1 row)

instr(text,text,int,int)

Description: Position of the string matching function. The first int specifies the starting position for matching, and the second int specifies the number of matches.

Return type: integer.

Example:

1
2
3
4
5
SELECT instr( 'abcdabcdabcd', 'bcd', 2, 2 );
 instr
-------
     6
(1 row)

lengthb(text/bpchar)

Description: Gets the number of bytes of a specified string.

Return type: integer.

Example:

1
2
3
4
5
SELECT lengthb('hello');
 lengthb
---------
       5
(1 row)
  • If there is a line break in the string, such as a string consisting of a line break and a space, the value of LENGTH and LENGTHB in DataArts Fabric SQL is 2.
  • For the CHAR(n) type, n in DataArts Fabric SQL refers to the number of characters. Therefore, for multibyte encoded character sets, the length returned by the LENGTHB function may be larger than n.

left(str text, n int)

Description: Returns the first n characters of a string.

  • In ORA and TD compatibility modes, when n is negative, it returns all characters except the last |n| characters.
  • In MySQL compatibility mode, when n is negative, it returns an empty string.

Return type: text.

Example:

1
2
3
4
5
SELECT left('abcde', 2);
 left
------
 ab
(1 row)

length(string bytea, encoding name )

Description: Specifies the number of characters in the string under the encoding format. Under this encoding format, the string must be valid.

Return type: integer.

Example:

1
2
3
4
5
SELECT length('jose', 'UTF8');
 length
--------
      4
(1 row)

lpad(string text, length int [, fill text])

Description: Pads the string to the specified length using the fill character (a space by default). If the string is already longer than the length, its trailing part is truncated.

Return type: text.

Example:

1
2
3
4
5
SELECT lpad('hi', 5, 'xyza');
 lpad  
-------
 xyzhi
(1 row)

octet_length(string)

Description: Number of bytes in a string.

Return type: integer.

Example:

1
2
3
4
5
SELECT octet_length('jose');
 octet_length
--------------
            4
(1 row)

overlay(string placing string FROM int [for int])

Description: Replaces substrings. FROM int indicates the character from which the first string is replaced. For int indicates the number of characters to be replaced in the first string.

Return type: text.

Example:

1
2
3
4
5
SELECT overlay('hello' placing 'world' from 2 for 3 );
 overlay 
---------
 hworldo
(1 row)

position(substring in string)

Description: Specifies the position of the substring in the string. If the string does not contain any substring, 0 is returned.

Return type: integer.

Example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
SELECT position('ing' in 'string');
 position
----------
        4
(1 row)

SELECT position('ing' in 'strin');
 position
----------
       0
(1 row)

pg_client_encoding()

Description: Current client encoding name.

Return type: name

Example:

1
2
3
4
5
SELECT pg_client_encoding();
 pg_client_encoding
--------------------
 UTF8
(1 row)

quote_ident(string text)

Description: Returns the identifier form (quotation marks are used as required) applicable to the SQL statement. Quotation marks are added only when necessary (the string contains non-identifier characters or characters that can be converted to uppercase or lowercase). The quotation marks embedded in the return value are written twice.

Return type: text.

Example:

1
2
3
4
5
SELECT quote_ident('hello world');
 quote_ident
--------------
 "hello world"
(1 row)

quote_literal(string text)

Description: Returns the form (quotation marks are used as required) suitable for use as text in a SQL statement.

Return type: text.

Example:

1
2
3
4
5
SELECT quote_literal('hello');
 quote_literal 
---------------
 'hello'
(1 row)

If the following information is displayed, the text is escaped.

1
2
3
4
5
SELECT quote_literal(E'O\'hello');
 quote_literal
---------------
 'O''hello'
(1 row)

If the following information is displayed, the backslash is written twice.

1
2
3
4
5
SELECT quote_literal('O\hello');
 quote_literal 
---------------
 E'O\\hello'
(1 row)

If the parameter is NULL, a null value is returned. If the parameter is NULL, the quote_nullable function is recommended.

1
2
3
4
5
SELECT quote_literal(NULL);
 quote_literal 
---------------

(1 row)

quote_literal(value anyelement)

Description: Forcibly converts the given value to text and adds quotation marks as the text.

Return type: text.

Example:

1
2
3
4
5
SELECT quote_literal(42.5);
 quote_literal 
---------------
 '42.5'
(1 row)

If the following information is displayed, the fixed value is escaped.

1
2
3
4
5
SELECT quote_literal(E'O\'42.5');
 quote_literal
---------------
 'O''42.5'
(1 row)

If the following information is displayed, the backslash is written twice.

1
2
3
4
5
SELECT quote_literal('O\42.5');
 quote_literal 
---------------
 E'O\\42.5'
(1 row)

quote_nullable(string text)

Description: Returns a string in a SQL statement (quotation marks are used as required).

Return type: text.

Example:

1
2
3
4
5
SELECT quote_nullable('hello');
 quote_nullable 
----------------
 'hello'
(1 row)

If the following information is displayed, the text is escaped.

1
2
3
4
5
SELECT quote_nullable(E'O\'hello');
 quote_nullable
----------------
 'O''hello'
(1 row)

If the following information is displayed, the backslash is written twice.

1
2
3
4
5
SELECT quote_nullable('O\hello');
 quote_nullable
----------------
 E'O\\hello'
(1 row)

If the parameter is NULL, a null value is returned.

1
2
3
4
5
SELECT quote_nullable(NULL);
 quote_nullable
----------------
 NULL
(1 row)

quote_nullable(value anyelement)

Description: Converts the given parameter value to text and then quotes it as the text.

Return type: text.

Example:

1
2
3
4
5
SELECT quote_nullable(42.5);
 quote_nullable
----------------
 '42.5'
(1 row)

If the following information is displayed, the fixed value is escaped.

1
2
3
4
5
SELECT quote_nullable(E'O\'42.5');
 quote_nullable 
----------------
 'O''42.5'
(1 row)

If the following information is displayed, the backslash is written twice.

1
2
3
4
5
SELECT quote_nullable('O\42.5');
 quote_nullable
----------------
 E'O\\42.5'
(1 row)

If the parameter is NULL, a null value is returned.

1
2
3
4
5
SELECT quote_nullable(NULL);
 quote_nullable
----------------
 NULL
(1 row)

substring(string [from int] [for int])

Description: Truncates a substring. from int indicates the sequence number of the character to be truncated, and for int indicates the number of bytes to be truncated.

Return type: text.

Example:

1
2
3
4
5
SELECT substring('Thomas' from 2 for 3);
 substring
-----------
 hom
(1 row)

substring(string from pattern)

Description: Extracts substring matching POSIX regular expression. If no match is found, a null value is returned. Otherwise, the part of the text that matches the pattern is returned.

Return type: text.

Example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
SELECT substring('Thomas' from '...$');
 substring
-----------
 mas
(1 row)
SELECT substring('foobar' from 'o(.)b');
 substring 
--------
 o
(1 row)
SELECT substring('foobar' from '(o(.)b)');
 substring 
--------
 oob
(1 row)

If the POSIX regular expression pattern contains any parentheses, the text that matches the first pair of subexpressions (corresponding to the first left parenthesis) is returned. If you want to use parentheses in an expression without causing this exception, you can add a pair of parentheses outside the entire expression.

substring(string from pattern for escape)

Description: Extracts substrings matching the SQL regular expression. The declared schema must match the entire data string. Otherwise, the function fails and returns a null value. To identify the part of the pattern that should be returned on success, the pattern must contain two occurrences of the escape character followed by double quotation marks ("). The text matching the pattern between the two tags is returned.

Return type: text.

Example:

1
2
3
4
5
SELECT substring('Thomas' from '%#"o_a#"_' for '#');
 substring
-----------
 oma
(1 row)

rawcat(raw,raw)

Description: String concatenation function.

Return type: raw

Example:

1
2
3
4
5
SELECT rawcat('ab','cd');
 rawcat
--------
 ABCD
(1 row)

regexp_like(text,text,text)

Description: Specifies the pattern matching function of a regular expression.

Return type: bool

Example:

1
2
3
4
5
SELECT regexp_like('str','[ac]');
 regexp_like
-------------
 f
(1 row)

regexp_substr(text,text)

Description: Extracts substrings from a regular expression. Similar to substr, if a regular expression contains multiple parallel parentheses, all the parentheses are processed.

Return type: text.

Example:

1
2
3
4
5
SELECT regexp_substr('str','[ac]');
 regexp_substr
---------------

(1 row)

regexp_matches(string text, pattern text [, flags text])

Description: Returns all substrings that match the POSIX regular expression in string. If the pattern does not match, the function returns no rows. If the pattern contains no parenthesized subexpressions, each row returned is a single-element text array containing the substring matching the whole pattern. If the pattern contains a parenthetical subexpression, this function returns a text array whose nth element is the substring that matches the nth parenthetical subexpression of the pattern.

The flags parameter is optional and contains zero or more single-letter flags that change the behavior of the function. i indicates case-insensitive matching, and g indicates replacing each matched substring, not just the first one.

If the last parameter is provided but the parameter value is an empty string ('') and the database SQL compatibility mode is set to ORA, the returned result is an empty set. This is because the ORA-compatible mode treats '' as NULL. The following methods can be used to avoid such behavior:

  • Change the SQL compatibility mode to TD.
  • Do not provide the last parameter, or ensure the last parameter is not an empty string.

Return type: setof text[]

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
SELECT regexp_matches('foobarbequebaz', '(bar)(beque)');
 regexp_matches
----------------
 {bar,beque}
(1 row)

SELECT regexp_matches('foobarbequebaz', 'barbeque');
 regexp_matches 
----------------
 {barbeque}
(1 row)

SELECT regexp_matches('foobarbequebazilbarfbonk', '(b[^b]+)(b[^b]+)', 'g');
    regexp_matches    
--------------
 {bar,beque}
 {bazil,barf}
(2 rows)

If there is no subquery and the regexp_matches function does not find a match, the data in the table will not be output. This is generally not the desired result and such usage should be avoided. You are advised to use the regexp_substr function to achieve the same functionality.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
SELECT * FROM tab;
 c1  | c2  
-----+-----
 dws | dws
(1 row)

SELECT c1, regexp_matches(c2, '(bar)(beque)') FROM tab;
 c1 | regexp_matches 
----+----------------
(0 rows)

SELECT c1, c2, (SELECT regexp_matches(c2, '(bar)(beque)')) FROM tab;
 c1  | c2  | regexp_matches 
-----+-----+----------------
 dws | dws | 
(1 row)

regexp_split_to_array(string text, pattern text [, flags text ])

Description: Splits the string using POSIX regular expressions as delimiters. Similar to regexp_split_to_table, but regexp_split_to_array returns its results as a text array.

Return type: text[].

Example:

1
2
3
4
5
SELECT regexp_split_to_array('hello world', E'\\s+');
 regexp_split_to_array
-----------------------
 {hello,world}
(1 row)

regexp_split_to_table(string text, pattern text [, flags text])

Description: Splits the string using POSIX regular expressions as delimiters. If there is no match to the pattern, the function returns the string. If there is at least one match, for each match it returns the text from the end of the last match (or the start of the string) to the beginning of the current match. When there are no more matches, it returns the text from the end of the last match to the end of the string.

The flags parameter includes zero or more single-character flags that alter the function's behavior. i indicates case-insensitive matching, and g indicates replacing each matched substring, not just the first one.

Return type: setof text

Example:

1
2
3
4
5
6
SELECT regexp_split_to_table('hello world', E'\\s+');
 regexp_split_to_table
-----------------------
 hello
 world
(2 rows)

If there is no subquery and the regexp_split_to_table function does not find a match, the data in the table will not be output. This is generally not the desired result and such usage should be avoided.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
SELECT * FROM tab;
 c1  | c2  
-----+-----
 dws |
(1 row)

SELECT c1, regexp_split_to_table(c2, E'\\s+') FROM tab;
 c1 | regexp_split_to_table 
----+-----------------------
(0 rows)

SELECT c1, (select regexp_split_to_table(c2, E'\\s+')) FROM tab;
 c1  | regexp_split_to_table 
-----+-----------------------
 dws | 
(1 row)

repeat(string text, number int )

Description: Repeats the string a specified number of times.

Return type: text.

Example:

1
2
3
4
5
SELECT repeat('Pg', 4);
  repeat
----------
 PgPgPgPg
(1 row)

replace(string text, from text, to text)

Description: Replaces all instances of the substring from within the string with the substring to.

Return type: text.

Example:

1
2
3
4
5
SELECT replace('abcdefabcdef', 'cd', 'XXX');
    replace     
----------------
 abXXXefabXXXef
(1 row)

reverse(str)

Description: Returns the string in reverse order.

Return type: text.

Example:

1
2
3
4
5
SELECT reverse('abcde');
 reverse
---------
 edcba
(1 row)

right(str text, n int)

Description: Returns the last n characters of the string.

  • In ORA and TD compatibility modes, when n is negative, it returns all characters excluding the first |n| characters.
  • In MySQL compatibility mode, when n is negative, it returns an empty string.

Return type: text.

Example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
SELECT right('abcde', 2);
 right
-------
 de
(1 row)

SELECT right('abcde', -2);
 right 
-------
 cde
(1 row)

rpad(string text, length int [, fill text])

Description: Pads the string to the specified length using the fill character (a space by default). If the string is already longer than the length, its trailing part is truncated.

Return type: text.

Example:

1
2
3
4
5
SELECT rpad('hi', 5, 'xy');
 rpad
-------
 hixyx
(1 row)

rtrim(string text [, characters text])

Description: Removes the longest string containing only characters from characters (a space by default) from the end of string.

Return type: text.

Example:

1
2
3
4
5
SELECT rtrim('trimxxxx', 'x');
 rtrim
-------
 trim
(1 row)

sys_context ( 'namespace' , 'parameter')

Description: Obtains and returns the value of parameter in a specified namespace.

Return type: text.

Example:

1
2
3
4
5
SELECT SYS_CONTEXT ( 'postgres' , 'archive_mode');
 sys_context
-------------

(1 row)

substrb(text,int,int)

Description: Extracts substrings. The first int indicates the start position of the extraction, and the second int indicates the number of bytes to be extracted.

Return type: text.

Example:

1
2
3
4
5
SELECT substrb('string',2,3);
 substrb
---------
 tri
(1 row)

substrb(text,int)

Description: Extracts substrings. int indicates the start position of the extraction.

Return type: text.

Example:

1
2
3
4
5
SELECT substrb('string',2);
 substrb
---------
 tring
(1 row)

string || string

Description: Concatenates strings.

Return type: text.

Example:

1
2
3
4
5
SELECT 'DA'||'TABASE' AS RESULT;
 result 
----------
 DATABASE
(1 row)

string || non-string or non-string || string

Description: Concatenates strings and non-strings.

Return type: text.

Example:

1
2
3
4
5
SELECT 'Value: '||42 AS RESULT;
  result   
-----------
 Value: 42
(1 row)

split_part(string text, delimiter text, field int)

Description: Splits string on delimiter and returns the field-th substring (based on the text of the first appeared delimiter).

Return type: text.

Example:

1
2
3
4
5
SELECT split_part('abc~@~def~@~ghi', '~@~', 2);
 split_part
------------
 def
(1 row)

strpos(string, substring)

Description: Specifies the position of a substring. Same as position(substring in string), but the parameter sequence is opposite.

Return type: integer.

Example:

1
2
3
4
5
SELECT strpos('source', 'rc');
 strpos
--------
      4
(1 row)

to_hex(number int or bigint)

Description: Converts number to its hexadecimal representation.

Return type: text.

Example:

1
2
3
4
5
SELECT to_hex(2147483647);
  to_hex
----------
 7fffffff
(1 row)

translate(string text, from text, to text)

Description: Converts any character in string that matches a character in from to the corresponding character in to. If from is longer than to, extra characters are removed from from.

Return type: text.

Example:

1
2
3
4
5
SELECT translate('12345', '143', 'ax');
 translate
-----------
 a2x5
(1 row)

length(string)

Description: Gets the number of characters in string.

Return type: integer.

Example:

1
2
3
4
5
SELECT length('abcd');
 length 
--------
      4
(1 row)

lengthb(string)

Description: Gets the number of bytes in string. The value depends on the character set. The same Chinese character may return different byte counts in GBK and UTF8.

Return type: integer.

Example:

1
2
3
4
5
SELECT lengthb('hello');
 lengthb 
---------
       5
(1 row)

substr(string,from)

Description:

Extracts substrings from string.

from indicates the start position of the extraction.

  • When from is 0, it is treated as 1.
  • When from is positive, all characters are extracted from from to the end.
  • When from is negative, the last n characters of the string are extracted, where n is the absolute value of from.

Return type: varchar.

Example:

When from is positive:

1
2
3
4
5
SELECT substr('ABCDEF',2);
 substr
--------
 BCDEF
(1 row)

When from is negative:

1
2
3
4
5
SELECT substr('ABCDEF',-2);
 substr
--------
 EF
(1 row)

substr(string,from,count)

Description:

Extracts substrings from string.

from indicates the start position of the extraction.

count indicates the length of the extracted substring.

  • When from is 0, it is treated as 1.
  • When from is positive, count characters are extracted starting from from.
  • When from is negative, count characters are extracted starting from the nth to last character, where n is the absolute value of from.
  • When count is less than 1, null is returned.

Return type: varchar.

Example:

When from is positive:

1
2
3
4
5
SELECT substr('ABCDEF',2,2);
 substr 
--------
 BC
(1 row)

When from is negative:

1
2
3
4
5
SELECT substr('ABCDEF',-3,2);
 substr 
--------
 DE
(1 row)

substrb(string,from)

Description: This function has the same functionality as SUBSTR(string,from), but the calculation unit is bytes.

Return type: bytea

Example:

1
2
3
4
5
SELECT substrb('ABCDEF',-2);
 substrb 
---------
 EF
(1 row)

substrb(string,from,count)

Description: This function has the same functionality as SUBSTR(string,from,count), but the calculation unit is bytes.

Return type: bytea

Example:

1
2
3
4
5
SELECT substrb('ABCDEF',2,2);
 substrb 
---------
 BC
(1 row)

trim([leading |trailing |both] [characters] from string)

Description: Removes the longest string containing only the characters (a space by default) from the start, end, or both sides of the string.

Return type: varchar.

Example:

1
2
3
4
5
SELECT trim(BOTH 'x' FROM 'xTomxx');
 btrim
-------
 Tom
(1 row)
1
2
3
4
5
SELECT trim(LEADING 'x' FROM 'xTomxx');
 ltrim
-------
 Tomxx
(1 row)
1
2
3
4
5
SELECT trim(TRAILING 'x' FROM 'xTomxx');
 rtrim
-------
 xTom
(1 row)

rtrim(string [, characters])

Description: Removes the longest string containing only characters from characters (a space by default) from the end of string.

Return type: varchar.

Example:

1
2
3
4
5
SELECT rtrim('TRIMxxxx','x');
 rtrim
-------
 TRIM
(1 row)

ltrim(string [, characters])

Description: Removes the longest string containing only characters from characters (a space by default) from the start of string.

Return type: varchar.

Example:

1
2
3
4
5
SELECT ltrim('xxxxTRIM','x');
 ltrim
-------
 TRIM
(1 row)

upper(string)

Description: Converts the string to uppercase.

Return type: varchar.

Example:

1
2
3
4
5
SELECT upper('tom');
 upper
-------
 TOM
(1 row)

ucase(string)

Description: Converts the string to uppercase.

Return type: varchar.

Example:

SELECT ucase('sam');
 ucase
-------
 SAM
(1 row)

lower(string)

Description: Converts the string to lowercase.

Return type: varchar.

Example:

1
2
3
4
5
SELECT lower('TOM');
 lower
-------
 tom
(1 row)

lcase(string)

Description: Converts the string to lowercase.

Return type: varchar.

Example:

SELECT lcase('SAM');
 lcase
-------
 sam
(1 row)

rpad(string varchar, length int [, fill varchar])

Description: Pads the string to the specified length using the fill character (a space by default). If the string is already longer than the length, its trailing part is truncated.

The length parameter indicates the character length in DataArts Fabric SQL. A Chinese character counts as one character.

Return type: varchar.

Example:

1
2
3
4
5
SELECT rpad('hi',5,'xyza');
 rpad
-------
 hixyz
(1 row)
1
2
3
4
5
SELECT rpad('hi',5,'abcdefg');
 rpad  
-------
 hiabc
(1 row)

instr(string,substring[,position,occurrence])

Description: Queries and returns the value of the substring position that occurs the occurrence (first by default) times from the position (1 by default) of the string.

  • When position is set to 0, 0 is returned.
  • If the value of position is a negative number, the system searches for the string from the last nth character. n is the absolute value of position.

This function uses characters as the calculation unit. For example, a Chinese character counts as one character.

Return type: integer.

Example:

1
2
3
4
5
SELECT instr('corporate floor','or', 3);
 instr 
-------
     5
(1 row)
1
2
3
4
5
SELECT instr('corporate floor','or',-3,2);
 instr 
-------
     2
(1 row)

locate(substring,string[,position])

Description: Queries and returns the value of the substring position that appears for the first time from the position specified by position (1 by default) of string. The calculation unit is character. If no substring exists in the string, 0 is returned.

Return type: integer.

Example:

1
2
3
4
5
SELECT locate('ball','football');
 locate 
--------
     5
(1 row)
1
2
3
4
5
SELECT locate('er','soccerplayer','6');
 locate 
--------
    11 
(1 row)

initcap(string)

Description: Converts the first letter of each word in a string to uppercase and other letters to lowercase.

Return type: text.

Example:

1
2
3
4
5
SELECT initcap('hi THOMAS');
  initcap
-----------
 Hi Thomas
(1 row)

ascii(string)

Description: ASCII code of the first character in the string.

Return type: integer.

Example:

1
2
3
4
5
SELECT ascii('xyz');
 ascii 
-------
   120
(1 row)

replace(string varchar, search_string varchar, replacement_string varchar)

Description: Replaces all substrings search_string in string with substrings replacement_string.

Return type: varchar.

Example:

1
2
3
4
5
SELECT replace('jack and jue','j','bl');
    replace     
----------------
 black and blue
(1 row)

lpad(string varchar, length int[, repeat_string varchar])

Description: Adds a series of repeat_strings (a space by default) to the left of string to form a new string whose total length is n.

If the length of the string is longer than the specified length, the function truncates the string and returns the string of the specified length.

Return type: varchar.

Example:

1
2
3
4
5
SELECT lpad('PAGE 1',15,'*.');
      lpad       
-----------------
 *.*.*.*.*PAGE 1
(1 row)
1
2
3
4
5
SELECT lpad('hello world',5,'abcd');
 lpad  
-------
 hello
(1 row)

concat(str1,str2)

Description: Concatenates and returns str1 and str2.

  • In ORA and TD compatibility modes, the returned result is the concatenation of all non-null strings.
  • In MySQL compatibility mode, if the input parameter contains NULL, the return result is NULL.

Return type: varchar.

Example:

1
2
3
4
5
SELECT concat('Hello', ' World!');
    concat    
--------------
 Hello World!
(1 row)

chr(integer)

Description: Specifies the character of the ASCII code.

Return type: varchar.

Example:

1
2
3
4
5
SELECT chr(65);
 chr
-----
 A
(1 row)

regexp_substr(source_char, pattern)

Description: Extracts substrings from a regular expression.

Return type: varchar.

Example:

1
2
3
4
5
SELECT regexp_substr('500 Hello World, Redwood Shores, CA', ',[^,]+,') "REGEXPR_SUBSTR";
  REGEXPR_SUBSTR   
-------------------
 , Redwood Shores,
(1 row)

regexp_replace(string, pattern, replacement [,flags ])

Description: Replaces the substring that matches the POSIX regular expression. If no pattern is matched, the unmodified string is returned. If a match is found, the matching substring in the returned string is replaced with the replacement string.

The replacement string can contain \n, where \n ranges from 1 to 9, indicating that the substring of the nth parenthesized subexpression in the matching pattern within the string should be inserted. Additionally, it can contain \&, indicating that the substring matching the entire pattern should be inserted.

The optional flags parameter includes zero or more single-character flags that alter the function's behavior, as detailed in the following table.

Table 1 flags parameter options

Option

Description

g

Replace every matched substring rather than just the first (by default, only the first matched substring is replaced).

B

By default, use the Henry Spencer regex library and its regex syntax. Specifying the B option prioritizes using the Boost.Regex library and its regex syntax.

The following two scenarios will also automatically switch to using the Henry Spencer regex library and its regex syntax when the B option is specified:

  • Any combination of p, q, w, and x characters is included in the flags.
  • The string or pattern parameters contain multibyte characters.

b

Matches according to POSIX Basic Regular Expressions (BREs) matching rules.

c

Case-sensitive matching.

e

Matches according to the POSIX Extended Regular Expressions (EREs) matching rules. When neither b nor e is specified, if the Henry Spencer regex library is chosen, it follows the Advanced Regular Expressions (AREs), akin to Perl Compatible Regular Expressions (PCREs). If the Boost.Regex library is chosen, it adheres to PCREs.

i

Case-insensitive matching.

m

Newline-sensitive matching, synonymous with the n option.

n

Newline-sensitive matching. When this option is active, the newline character influences the matching of metacharacters (., ^, $, and [^).

p

Partially newline-sensitive matching. When this option is active, the newline character influences the matching of metacharacters (. and [^), partially compared to the n option.

q

Resets the regular expression to a double-quoted text string, treating all characters as literals.

s

Non-newline-sensitive matching.

t

Compact syntax (default). When this option is active, all characters are significant.

w

Inverse partially newline-sensitive matching. When this option is active, the newline character influences the matching of metacharacters (^ and $), inversely partially compared to the n option.

x

Extended syntax. Contrary to compact syntax, in extended syntax, whitespace characters in the regular expression are ignored. Whitespace characters include spaces, horizontal tabs, newlines, and any characters belonging to the space character set.

Return type: varchar.

Example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
SELECT regexp_replace('Thomas', '.[mN]a.', 'M');
 regexp_replace
----------------
 ThM
(1 row)
SELECT regexp_replace('foobarbaz','b(..)', E'X\\1Y', 'g') AS RESULT;                                                    
   result    
-------------
 fooXarYXazY
(1 row)

concat_ws(sep text, str"any" [, str"any" [, ...] ])

Description: Uses the first parameter as the separator to concatenate all parameters following the second one.

Return type: text.

Example:

1
2
3
4
5
SELECT concat_ws(',', 'ABCDE', 2, NULL, 22);
 concat_ws
------------
 ABCDE,2,22
(1 row)

convert(string bytea, src_encoding name, dest_encoding name)

Description: Converts the string bytea using the destination encoding specified by dest_encoding. src_encoding specifies the source encoding format. The string must be valid under this encoding.

Return type: bytea

Example:

1
2
3
4
5
SELECT convert('text_in_utf8', 'UTF8', 'GBK');
          convert        
----------------------------
 \x746578745f696e5f75746638
(1 row)

If the conversion rule from the source encoding format to the destination encoding format does not exist, the string is returned unchanged. For example, there is no conversion rule between GBK and LATIN1. Specific conversion rules can be obtained by querying the pg_conversion system catalog.

Example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
show server_encoding;
 server_encoding 
-----------------
 LATIN1
(1 row)

SELECT convert_from('some text', 'GBK');
 convert_from 
--------------
 some text
(1 row)

db_latin1=# SELECT convert_to('some text', 'GBK');
      convert_to      
----------------------
 \x736f6d652074657874
(1 row)

db_latin1=# SELECT convert('some text', 'GBK', 'LATIN1');
       convert        
----------------------
 \x736f6d652074657874
(1 row)

convert_from(string bytea, src_encoding name)

Description: Converts the string bytea using the database's encoding method.

src_encoding specifies the source encoding format. The string must be valid under this encoding.

Return type: text.

Example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
SELECT convert_from('text_in_utf8', 'UTF8');
 convert_from
--------------
 text_in_utf8
(1 row)
SELECT convert_from('\x6461746162617365','gbk');
 convert_from
--------------
 database
(1 row)

convert_to(string text, dest_encoding name)

Description: Converts the string into the encoding format specified by dest_encoding.

Return type: bytea

Example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
SELECT convert_to('some text', 'UTF8');
      convert_to
----------------------
 \x736f6d652074657874
(1 row)
SELECT convert_to('database', 'gbk');
     convert_to
--------------------
 \x6461746162617365
(1 row)

string [NOT] LIKE pattern [ESCAPE escape-character]

Description: Pattern matching function.

If the pattern contains neither percent signs nor underscores, it stands solely for itself, making LIKE behave like the equality operator. An underscore (_) in the pattern matches any single character, while a percent sign (%) matches zero or more characters.

To match an underscore or percent sign literally, the respective character in the pattern must be preceded by an escape character. By default, the escape character is a backslash, but you may specify another using the ESCAPE clause. To match the escape character itself, use two escape characters.

Return type: boolean.

Example:

1
2
3
4
5
SELECT 'AA_BBCC' LIKE '%A@_B%' ESCAPE '@' AS RESULT;
 result
--------
 t
(1 row)
1
2
3
4
5
SELECT 'AA_BBCC' LIKE '%A@_B%' AS RESULT;
 result
--------
 f
(1 row)
1
2
3
4
5
SELECT 'AA@_BBCC' LIKE '%A@_B%' AS RESULT;
 result
--------
 t
(1 row)

REGEXP_LIKE(source_string, pattern [, match_parameter])

Description: Regular expression pattern matching function.

source_string is the source string, and pattern is the regular expression matching pattern. match_parameter specifies the matching option, which can include:

  • "i": case-insensitive.
  • "c": case-sensitive.
  • "n": allows the regex meta-character '.' to match newlines.
  • "m": treats source_string as multi-line.

If the match_parameter option is omitted, defaults are case sensitivity, '.' does not match newlines, and source_string is treated as a single line.

Return type: boolean.

Example:

1
2
3
4
5
SELECT regexp_like('ABC', '[A-Z]');
 regexp_like
-------------
 t
(1 row)
1
2
3
4
5
SELECT regexp_like('ABC', '[D-Z]');
 regexp_like
-------------
 f
(1 row)
1
2
3
4
5
SELECT regexp_like('abc', '[A-Z]','i');
 regexp_like
-------------
 t
(1 row)
1
2
3
4
5
SELECT regexp_like('abc', '[A-Z]');
 regexp_like
-------------
 f
(1 row)

format(formatstr text [, str"any" [, ...] ])

Description: Formats a string.

Return type: text.

Example:

1
2
3
4
5
SELECT format('Hello %s, %1$s', 'World');
       format       
--------------------
 Hello World, World
(1 row)

md5(string)

Description: Encrypts a string using MD5 and returns a hexadecimal number.

MD5 is insecure and is not recommended.

Return type: text.

Example:

1
2
3
4
5
SELECT md5('ABC');
               md5                
----------------------------------
 902fbdd2b1df0c4f70b4a5d23525e932
(1 row)

decode(string text, format text)

Description: Decodes binary data from text data.

Return type: bytea

Example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
SELECT decode('ZGF0YWJhc2U=', 'base64');
    decode    
--------------
 \x6461746162617365
(1 row)

SELECT convert_from('\x6461746162617365','utf-8');
 convert_from
--------------
 database
(1 row)

encode(data bytea, format text)

Description: Encodes binary data into text data.

Return type: text.

Example:

1
2
3
4
5
SELECT encode('database', 'base64');
  encode  
----------
 ZGF0YWJhc2U=
(1 row)

CONV(n, fromBase, toBase)

Description: Converts the given value or string into the target number system and outputs the result as a string. If the parameter contains NULL, NULL is returned. The value range is [-36, -2] and [2, 36].

Return type: text.

Example:

1
2
3
4
5
SELECT CONV(-1, 10, 16) as result;
      result
------------------
 FFFFFFFFFFFFFFFF
(1 row)

HEX(n)

Description: n can be of the int type or a string. Returns the hexadecimal string of n. If the parameter contains NULL, NULL is returned.

Return type: text.

Example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
SELECT HEX(255) as result;
 result
--------
   FF
(1 row)
SELECT HEX('abc') as result;
 result
--------
 616263
(1 row)

UNHEX(n)

Description: Performs the reverse operation of HEX(n). n can be of the int type or a string. Each pair of hexadecimal digits in a parameter is considered as a number and converted into the character represented by the number. If the parameter contains NULL, NULL is returned. This function is only supported by clusters of version 8.2.0 or later.

Return type: bytea

Example:

1
2
3
4
5
SELECT UNHEX('abc') as result;
 result
--------
 \x0abc
(1 row)

SPACE(n int)

Description: Returns a string consisting of n spaces. If the parameter contains NULL, NULL is returned.

Return type: text.

Example:

1
2
3
4
5
SELECT SPACE(2) as result;
 result
--------
   
(1 row)

STRCMP(text, text)

Description: Compares two strings. If all strings are the same, 0 is returned. If the first string is less than the second string according to the current classification order, -1 is returned. Otherwise, 1 is returned. If the parameter contains NULL, NULL is returned.

Return type: text.

Example:

1
2
3
4
5
SELECT STRCMP('AA', 'AA'), STRCMP('AA', 'AB'), STRCMP('AA', 'A');
STRCMP  |  STRCMP  |  STRCMP
------------------------------
    0   |    -1    |     1
(1 row)

BIN(n bigint)

Description: Converts the bigint type from decimal to binary and returns the result as a string. If the parameter contains NULL, NULL is returned.

Return type: text.

Example:
1
2
3
4
5
SELECT BIN(16) as result;
 result
--------
 10000
(1 row)