String Manipulation Functions¶
We use string manipulation functions quite extensively. Here are some of the important functions which we typically use.
- Case Conversion –
lower
,upper
,initcap
- Getting size of the column value –
length
- Extracting Data –
substr
andsplit_part
- Trimming and Padding functions –
trim
,rtrim
,ltrim
,rpad
andlpad
- Reversing strings –
reverse
- Concatenating multiple strings
concat
andconcat_ws
Case Conversion and Length¶
Let us understand how to perform case conversion of a string and also get length of a string.
%load_ext sql
%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@pg.itversity.com:5432/itversity_retail_db
env: DATABASE_URL=postgresql://itversity_retail_user:retail_password@pg.itversity.com:5432/itversity_retail_db
- Case Conversion Functions –
lower
,upper
,initcap
%%sql
SELECT lower('hEllo wOrlD') AS lower_result,
upper('hEllo wOrlD') AS upper_result,
initcap('hEllo wOrlD') AS initcap_result
1 rows affected.
lower_result | upper_result | initcap_result |
---|---|---|
hello world | HELLO WORLD | Hello World |
- Getting length –
length
%%sql
SELECT length('hEllo wOrlD') AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
11 |
Let us see how to use these functions on top of the table. We will use orders table which was loaded as part of last section.
- order_status for all the orders is in upper case and we will convert every thing to lower case.
%%sql
SELECT * FROM orders LIMIT 10
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
order_id | order_date | order_customer_id | order_status |
---|---|---|---|
1 | 2013-07-25 00:00:00 | 11599 | CLOSED |
2 | 2013-07-25 00:00:00 | 256 | PENDING_PAYMENT |
3 | 2013-07-25 00:00:00 | 12111 | COMPLETE |
4 | 2013-07-25 00:00:00 | 8827 | CLOSED |
5 | 2013-07-25 00:00:00 | 11318 | COMPLETE |
6 | 2013-07-25 00:00:00 | 7130 | COMPLETE |
7 | 2013-07-25 00:00:00 | 4530 | COMPLETE |
8 | 2013-07-25 00:00:00 | 2911 | PROCESSING |
9 | 2013-07-25 00:00:00 | 5657 | PENDING_PAYMENT |
10 | 2013-07-25 00:00:00 | 5648 | PENDING_PAYMENT |
%%sql
SELECT order_id, order_date, order_customer_id,
lower(order_status) AS order_status,
length(order_status) AS order_status_length
FROM orders LIMIT 10
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
order_id | order_date | order_customer_id | order_status | order_status_length |
---|---|---|---|---|
1 | 2013-07-25 00:00:00 | 11599 | closed | 6 |
2 | 2013-07-25 00:00:00 | 256 | pending_payment | 15 |
3 | 2013-07-25 00:00:00 | 12111 | complete | 8 |
4 | 2013-07-25 00:00:00 | 8827 | closed | 6 |
5 | 2013-07-25 00:00:00 | 11318 | complete | 8 |
6 | 2013-07-25 00:00:00 | 7130 | complete | 8 |
7 | 2013-07-25 00:00:00 | 4530 | complete | 8 |
8 | 2013-07-25 00:00:00 | 2911 | processing | 10 |
9 | 2013-07-25 00:00:00 | 5657 | pending_payment | 15 |
10 | 2013-07-25 00:00:00 | 5648 | pending_payment | 15 |
Extracting Data – substr and split_part¶
Let us understand how to extract data from strings using substr
/substring
as well as split_part
.
%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@pg.itversity.com:5432/itversity_retail_db
env: DATABASE_URL=postgresql://itversity_retail_user:retail_password@pg.itversity.com:5432/itversity_retail_db
- We can extract sub string from main string using
substr
orsubstring
position and length. - For example, get first 4 characters from date to get year or get last 4 characters from fixed length unique id.
substring
have broader options (regular expression) and also can be used with different styles (using keywords such asFROM
,FOR
).- Unlike in other relational databases, we cannot pass negative integers to
substr
orsubstring
to get the information from right. We need to use functions likeright
instead.
%%sql
SELECT substr('2013-07-25 00:00:00.0', 1, 4) AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
2013 |
%%sql
SELECT substring('2013-07-25 00:00:00.0', 1, 4) AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
2013 |
%%sql
SELECT substring('2013-07-25 00:00:00.0' FROM 1 FOR 4) AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
2013 |
%%sql
SELECT substring('2013-07-25 00:00:00.0', 6, 2) AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
07 |
%%sql
SELECT substring('2013-07-25 00:00:00.0', 9, 2) AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
25 |
%%sql
SELECT substring('2013-07-25 00:00:00.0' from 12) AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
00:00:00.0 |
%%sql
SELECT substr('2013-07-25 00:00:00.0', 12) AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
00:00:00.0 |
%%sql
SELECT right('123 456 7890', 4) AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
7890 |
%%sql
SELECT left('123 456 7890', 3) AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
123 |
{note}
We can also use combination of `substring` and `length` like below to get last 4 digits or characters from a string.
%%sql
SELECT substring('123 456 7890' FROM length('123 456 7890') - 4) AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
7890 |
%%sql
SELECT substring('123 456 7890' FROM '....
#39;) AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
7890 |
{note}
Getting first 3 characters or digits as well as last 4 characters or digits using `substring`. However, this works only when the strings are of fixed length.
%%sql
WITH unique_ids AS (
SELECT '241-80-7115' AS unique_id UNION
SELECT '694-30-6851' UNION
SELECT '586-92-5361' UNION
SELECT '884-65-284' UNION
SELECT '876-99-585' UNION
SELECT '831-59-5593' UNION
SELECT '399-88-3617' UNION
SELECT '733-17-4217' UNION
SELECT '873-68-9778' UNION
SELECT '48'
) SELECT unique_id,
substring(unique_id FROM 1 FOR 3) AS unique_id_first3,
substring(unique_id FROM '....
#39;) AS unique_id_last4
FROM unique_ids
ORDER BY unique_id
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
unique_id | unique_id_first3 | unique_id_last4 |
---|---|---|
241-80-7115 | 241 | 7115 |
399-88-3617 | 399 | 3617 |
48 | 48 | None |
586-92-5361 | 586 | 5361 |
694-30-6851 | 694 | 6851 |
733-17-4217 | 733 | 4217 |
831-59-5593 | 831 | 5593 |
873-68-9778 | 873 | 9778 |
876-99-585 | 876 | -585 |
884-65-284 | 884 | -284 |
- Let us see how we can extract date part from order_date of orders.
%%sql
SELECT * FROM orders LIMIT 10
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
order_id | order_date | order_customer_id | order_status |
---|---|---|---|
1 | 2013-07-25 00:00:00 | 11599 | CLOSED |
2 | 2013-07-25 00:00:00 | 256 | PENDING_PAYMENT |
3 | 2013-07-25 00:00:00 | 12111 | COMPLETE |
4 | 2013-07-25 00:00:00 | 8827 | CLOSED |
5 | 2013-07-25 00:00:00 | 11318 | COMPLETE |
6 | 2013-07-25 00:00:00 | 7130 | COMPLETE |
7 | 2013-07-25 00:00:00 | 4530 | COMPLETE |
8 | 2013-07-25 00:00:00 | 2911 | PROCESSING |
9 | 2013-07-25 00:00:00 | 5657 | PENDING_PAYMENT |
10 | 2013-07-25 00:00:00 | 5648 | PENDING_PAYMENT |
%%sql
SELECT order_id,
substr(order_date::varchar, 1, 10) AS order_date,
order_customer_id,
order_status
FROM orders
LIMIT 10
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
order_id | order_date | order_customer_id | order_status |
---|---|---|---|
1 | 2013-07-25 | 11599 | CLOSED |
2 | 2013-07-25 | 256 | PENDING_PAYMENT |
3 | 2013-07-25 | 12111 | COMPLETE |
4 | 2013-07-25 | 8827 | CLOSED |
5 | 2013-07-25 | 11318 | COMPLETE |
6 | 2013-07-25 | 7130 | COMPLETE |
7 | 2013-07-25 | 4530 | COMPLETE |
8 | 2013-07-25 | 2911 | PROCESSING |
9 | 2013-07-25 | 5657 | PENDING_PAYMENT |
10 | 2013-07-25 | 5648 | PENDING_PAYMENT |
Let us understand how to extract the information from the string where there is a delimiter.
split_part
can be used to split a string using delimiter and extract the information.- If there is no data in a given position after splitting, it will be represented as empty string ”.
%%sql
SELECT split_part('2013-07-25', '-', 1) AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
2013 |
%%sql
WITH addresses AS (
SELECT '593 Fair Oaks Pass, Frankfort, Kentucky, 40618' AS address UNION
SELECT ', Vancouver, Washington, 98687' UNION
SELECT '83047 Glacier Hill Circle, Sacramento, California, 94237' UNION
SELECT '935 Columbus Junction, Cincinnati, Ohio, 45213' UNION
SELECT '03010 Nevada Crossing, El Paso, Texas, 88579' UNION
SELECT '9 Dunning Circle, , Arizona, 85271' UNION
SELECT '96 Fair Oaks Way, Decatur, Illinois, 62525' UNION
SELECT '999 Caliangt Avenue, Greenville, South Carolina, 29615' UNION
SELECT '2 Saint Paul Trail, Bridgeport, , 06673' UNION
SELECT '3 Reindahl Center, Ogden, Utah'
) SELECT split_part(address, ', ', 1) street,
split_part(address, ', ', 2) city,
split_part(address, ', ', 3) state,
split_part(address, ', ', 4) postal_code
FROM addresses
ORDER BY postal_code
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
street | city | state | postal_code |
---|---|---|---|
3 Reindahl Center | Ogden | Utah | |
2 Saint Paul Trail | Bridgeport | 06673 | |
999 Caliangt Avenue | Greenville | South Carolina | 29615 |
593 Fair Oaks Pass | Frankfort | Kentucky | 40618 |
935 Columbus Junction | Cincinnati | Ohio | 45213 |
96 Fair Oaks Way | Decatur | Illinois | 62525 |
9 Dunning Circle | Arizona | 85271 | |
03010 Nevada Crossing | El Paso | Texas | 88579 |
83047 Glacier Hill Circle | Sacramento | California | 94237 |
Vancouver | Washington | 98687 |
%%sql
WITH addresses AS (
SELECT '593 Fair Oaks Pass, Frankfort, Kentucky, 40618' AS address UNION
SELECT ', Vancouver, Washington, 98687' UNION
SELECT '83047 Glacier Hill Circle, Sacramento, California, 94237' UNION
SELECT '935 Columbus Junction, Cincinnati, Ohio, 45213' UNION
SELECT '03010 Nevada Crossing, El Paso, Texas, 88579' UNION
SELECT '9 Dunning Circle, , Arizona, 85271' UNION
SELECT '96 Fair Oaks Way, Decatur, Illinois, 62525' UNION
SELECT '999 Caliangt Avenue, Greenville, South Carolina, 29615' UNION
SELECT '2 Saint Paul Trail, Bridgeport, , 06673' UNION
SELECT '3 Reindahl Center, Ogden, Utah'
) SELECT split_part(address, ', ', 1) street,
split_part(address, ', ', 2) city,
split_part(address, ', ', 3) state,
split_part(address, ', ', 4) postal_code
FROM addresses
WHERE split_part(address, ', ', 1) = ''
ORDER BY postal_code
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
street | city | state | postal_code |
---|---|---|---|
Vancouver | Washington | 98687 |
%%sql
WITH unique_ids AS (
SELECT '241-80-7115' AS unique_id UNION
SELECT '694-30-6851' UNION
SELECT '586-92-5361' UNION
SELECT '884-65-284' UNION
SELECT '876-99-585' UNION
SELECT '831-59-5593' UNION
SELECT '399-88-3617' UNION
SELECT '733-17-4217' UNION
SELECT '873-68-9778' UNION
SELECT '480-69-032'
) SELECT unique_id,
substring(unique_id FROM 1 FOR 3) AS unique_id_first3,
substring(unique_id FROM '....
#39;) AS unique_id_last4,
CASE WHEN length(split_part(unique_id, '-', 3)) = 4
THEN split_part(unique_id, '-', 3)
ELSE 'Invalid'
END AS unique_id_last
FROM unique_ids
ORDER BY unique_id
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
unique_id | unique_id_first3 | unique_id_last4 | unique_id_last |
---|---|---|---|
241-80-7115 | 241 | 7115 | 7115 |
399-88-3617 | 399 | 3617 | 3617 |
480-69-032 | 480 | -032 | Invalid |
586-92-5361 | 586 | 5361 | 5361 |
694-30-6851 | 694 | 6851 | 6851 |
733-17-4217 | 733 | 4217 | 4217 |
831-59-5593 | 831 | 5593 | 5593 |
873-68-9778 | 873 | 9778 | 9778 |
876-99-585 | 876 | -585 | Invalid |
884-65-284 | 884 | -284 | Invalid |
Using position or strpos¶
At times we might want to get the position of a substring in a main string. For example, we might want to check whether email ids have @ in them. We can use functions such as position
or strpos
.
%%sql
SELECT position('@' IN 'it@versity.com'),
position ('@' IN 'itversity.com')
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
position | position_1 |
---|---|
3 | 0 |
%%sql
SELECT strpos('it@versity.com', '@'),
strpos('itversity.com', '@')
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
strpos | strpos_1 |
---|---|
3 | 0 |
%%sql
WITH email_ids AS (
SELECT 'bsellan0@yellowbook.com' AS email_id UNION
SELECT 'rstelljes1@illinois.edu' UNION
SELECT 'mmalarkey2@webeden.co.uk' UNION
SELECT 'emussared3@redcross.org' UNION
SELECT 'livashin4@bloglovin.com' UNION
SELECT 'gkeach5cbc.ca' UNION
SELECT 'emasham6@xing.com' UNION
SELECT 'rcobbald7@house.gov' UNION
SELECT 'rdrohan8@washingtonpost.com' UNION
SELECT 'aebben9@arstechnica.com'
) SELECT email_id, position('@' IN email_id),
strpos(email_id, '@')
FROM email_ids
ORDER BY 2, 1
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
email_id | position | strpos |
---|---|---|
gkeach5cbc.ca | 0 | 0 |
aebben9@arstechnica.com | 8 | 8 |
bsellan0@yellowbook.com | 9 | 9 |
emasham6@xing.com | 9 | 9 |
rdrohan8@washingtonpost.com | 9 | 9 |
livashin4@bloglovin.com | 10 | 10 |
rcobbald7@house.gov | 10 | 10 |
emussared3@redcross.org | 11 | 11 |
mmalarkey2@webeden.co.uk | 11 | 11 |
rstelljes1@illinois.edu | 11 | 11 |
%%sql
WITH email_ids AS (
SELECT 'bsellan0@yellowbook.com' AS email_id UNION
SELECT 'rstelljes1@illinois.edu' UNION
SELECT 'mmalarkey2@webeden.co.uk' UNION
SELECT 'emussared3@redcross.org' UNION
SELECT 'livashin4@bloglovin.com' UNION
SELECT 'gkeach5cbc.ca' UNION
SELECT 'emasham6@xing.com' UNION
SELECT 'rcobbald7@house.gov' UNION
SELECT 'rdrohan8@washingtonpost.com' UNION
SELECT 'aebben9@arstechnica.com'
) SELECT email_id, position('@' IN email_id),
strpos(email_id, '@')
FROM email_ids
WHERE position ('@' IN email_id) = 0
ORDER BY 1
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
email_id | position | strpos |
---|---|---|
gkeach5cbc.ca | 0 | 0 |
%%sql
WITH unique_ids AS (
SELECT '241-80-7115' AS unique_id UNION
SELECT '694-30-6851' UNION
SELECT '586-92-5361' UNION
SELECT '884-65-284' UNION
SELECT '876-99-585' UNION
SELECT '831-59-5593' UNION
SELECT '399-88-3617' UNION
SELECT '733-17-4217' UNION
SELECT '873-68-9778' UNION
SELECT '48'
) SELECT unique_id,
position('-' IN unique_id) AS pos,
position('-' IN unique_id) + position('-' IN substring(unique_id FROM 5)) AS pos_2nd
FROM unique_ids
ORDER BY unique_id
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
unique_id | pos | pos_2nd |
---|---|---|
241-80-7115 | 4 | 7 |
399-88-3617 | 4 | 7 |
48 | 0 | 0 |
586-92-5361 | 4 | 7 |
694-30-6851 | 4 | 7 |
733-17-4217 | 4 | 7 |
831-59-5593 | 4 | 7 |
873-68-9778 | 4 | 7 |
876-99-585 | 4 | 7 |
884-65-284 | 4 | 7 |
Trimming and Padding Functions¶
Let us understand how to trim or remove leading and/or trailing spaces in a string.
%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@pg.itversity.com:5432/itversity_retail_db
env: DATABASE_URL=postgresql://itversity_retail_user:retail_password@pg.itversity.com:5432/itversity_retail_db
ltrim
is used to remove the spaces on the left side of the string.rtrim
is used to remove the spaces on the right side of the string.trim
is used to remove the spaces on both sides of the string.
%%sql
SELECT ltrim(' Hello World') AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
Hello World |
%%sql
SELECT rtrim(' Hello World ') AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
Hello World |
%%sql
SELECT length(trim(' Hello World ')) AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
11 |
%%sql
SELECT ltrim('----Hello World----', '-') AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
Hello World—- |
%%sql
SELECT rtrim('----Hello World----', '-') AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
—-Hello World |
%%sql
SELECT trim('----Hello World----', '-') AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
Hello World |
Let us understand how to use padding to pad characters to a string.
- Let us assume that there are 3 fields – year, month and date which are of type integer.
- If we have to concatenate all the 3 fields and create a date, we might have to pad month and date with 0.
lpad
is used more often thanrpad
especially when we try to build the date from separate columns.
%%sql
SELECT 2013 AS year, 7 AS month, 25 AS myDate
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
year | month | mydate |
---|---|---|
2013 | 7 | 25 |
%%sql
SELECT lpad(7::varchar, 2, '0') AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
07 |
%%sql
SELECT lpad(10::varchar, 2, '0') AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
10 |
%%sql
SELECT lpad(100::varchar, 2, '0') AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
10 |
Reverse and Concatenating multiple strings¶
Let us understand how to reverse a string as well as concatenate multiple strings.
%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@pg.itversity:5432/itversity_retail_db
env: DATABASE_URL=postgresql://itversity_retail_user:retail_password@pg.itversity:5432/itversity_retail_db
- We can use
reverse
to reverse a string. - We can concatenate multiple strings using
concat
andconcat_ws
. concat_ws
is typically used if we want to have the same string between all the strings that are being concatenated.
%%sql
SELECT reverse('Hello World') AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
dlroW olleH |
%%sql
SELECT concat('Hello ', 'World') AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
Hello World |
%%sql
SELECT concat('Order Status is ', order_status) AS result
FROM orders LIMIT 10
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
result |
---|
Order Status is CLOSED |
Order Status is PENDING_PAYMENT |
Order Status is COMPLETE |
Order Status is CLOSED |
Order Status is COMPLETE |
Order Status is COMPLETE |
Order Status is COMPLETE |
Order Status is PROCESSING |
Order Status is PENDING_PAYMENT |
Order Status is PENDING_PAYMENT |
%%sql
SELECT * FROM (SELECT 2013 AS year, 7 AS month, 25 AS myDate) q
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
year | month | mydate |
---|---|---|
2013 | 7 | 25 |
%%sql
SELECT concat(year, '-', lpad(month::varchar, 2, '0'), '-',
lpad(myDate::varchar, 2, '0')) AS order_date
FROM
(SELECT 2013 AS year, 7 AS month, 25 AS myDate) q
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
order_date |
---|
2013-07-25 |
%%sql
SELECT concat_ws('-', year, lpad(month::varchar, 2, '0'),
lpad(myDate::varchar, 2, '0')) AS order_date
FROM
(SELECT 2013 AS year, 7 AS month, 25 AS myDate) q
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
order_date |
---|
2013-07-25 |
String Replacement¶
Let us go through the details related to string replacement.
%load_ext sql
The sql extension is already loaded. To reload it, use: %reload_ext sql
%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@pg.itversity.com:5432/itversity_retail_db
env: DATABASE_URL=postgresql://itversity_retail_user:retail_password@pg.itversity.com:5432/itversity_retail_db
%%sql
SELECT replace('Hello World', 'alo', 'ello') AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
Hello World |
%%sql
SELECT overlay('Halo World' PLACING 'ello' FROM 2 FOR 3) AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
Hello World |
%%sql
WITH unique_ids AS (
SELECT '241-80-7115' AS unique_id UNION
SELECT '694-30-6851' UNION
SELECT '586-92-5361' UNION
SELECT '884-65-2844' UNION
SELECT '876-99-5856' UNION
SELECT '831-59-5593' UNION
SELECT '399-88-3617' UNION
SELECT '733-17-4217' UNION
SELECT '873-68-9778' UNION
SELECT '487-21-9802'
) SELECT unique_id,
replace(unique_id, '-', ' ') AS unique_id_replaced,
translate(unique_id, '-', ' ') AS unique_id_translated,
overlay(unique_id PLACING ' ' FROM 4 FOR 1) AS unique_id_overlaid
FROM unique_ids
ORDER BY unique_id
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
unique_id | unique_id_replaced | unique_id_translated | unique_id_overlaid |
---|---|---|---|
241-80-7115 | 241 80 7115 | 241 80 7115 | 241 80-7115 |
399-88-3617 | 399 88 3617 | 399 88 3617 | 399 88-3617 |
487-21-9802 | 487 21 9802 | 487 21 9802 | 487 21-9802 |
586-92-5361 | 586 92 5361 | 586 92 5361 | 586 92-5361 |
694-30-6851 | 694 30 6851 | 694 30 6851 | 694 30-6851 |
733-17-4217 | 733 17 4217 | 733 17 4217 | 733 17-4217 |
831-59-5593 | 831 59 5593 | 831 59 5593 | 831 59-5593 |
873-68-9778 | 873 68 9778 | 873 68 9778 | 873 68-9778 |
876-99-5856 | 876 99 5856 | 876 99 5856 | 876 99-5856 |
884-65-2844 | 884 65 2844 | 884 65 2844 | 884 65-2844 |
%%sql
WITH unique_ids AS (
SELECT '241-80-7115' AS unique_id UNION
SELECT '694-30:6851' UNION
SELECT '586-92-5361' UNION
SELECT '884:65-2844' UNION
SELECT '876/99-5856' UNION
SELECT '831-59:5593' UNION
SELECT '399-88-3617' UNION
SELECT '733:17-4217' UNION
SELECT '873:68-9778' UNION
SELECT '487-21/9802'
) SELECT unique_id,
replace(replace(unique_id, '-', ' '), ':', ' ') AS unique_id_replaced,
translate(unique_id, '-:/', ' ') AS unique_id_translated,
overlay(overlay(unique_id PLACING ' ' FROM 4 FOR 1) PLACING ' ' FROM 7 FOR 1) AS unique_id_overlaid
FROM unique_ids
ORDER BY unique_id
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
unique_id | unique_id_replaced | unique_id_translated | unique_id_overlaid |
---|---|---|---|
241-80-7115 | 241 80 7115 | 241 80 7115 | 241 80 7115 |
399-88-3617 | 399 88 3617 | 399 88 3617 | 399 88 3617 |
487-21/9802 | 487 21/9802 | 487 21 9802 | 487 21 9802 |
586-92-5361 | 586 92 5361 | 586 92 5361 | 586 92 5361 |
694-30:6851 | 694 30 6851 | 694 30 6851 | 694 30 6851 |
733:17-4217 | 733 17 4217 | 733 17 4217 | 733 17 4217 |
831-59:5593 | 831 59 5593 | 831 59 5593 | 831 59 5593 |
873:68-9778 | 873 68 9778 | 873 68 9778 | 873 68 9778 |
876/99-5856 | 876/99 5856 | 876 99 5856 | 876 99 5856 |
884:65-2844 | 884 65 2844 | 884 65 2844 | 884 65 2844 |
{note}
In case of `translate`, if we do not have characters for replacement, then those will be replaced with empty string. For example, `translate('+86 (238) 954-9649', '+() -', '0')`will result in **0862389549649**.
%%sql
SELECT translate('+86 (238) 954-9649', '+() -', '0') AS result
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
result |
---|
0862389549649 |
%%sql
WITH phone_numbers AS (
SELECT '+86 (238) 954-9649' AS phone_number UNION
SELECT '+420 (331) 900-5807' UNION
SELECT '+1 (320) 484-4495' UNION
SELECT '+45 (238) 961-9801' UNION
SELECT '+51 (123) 545-6543' UNION
SELECT '+63 (308) 354-2560' UNION
SELECT '+86 (433) 851-1260' UNION
SELECT '+63 (332) 705-0319' UNION
SELECT '+351 (147) 359-3767' UNION
SELECT '+57 (714) 557-0468'
) SELECT phone_number,
translate(phone_number, '+() -', '') phone_number_int
FROM phone_numbers
ORDER BY phone_number
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
phone_number | phone_number_int |
---|---|
+1 (320) 484-4495 | 13204844495 |
+351 (147) 359-3767 | 3511473593767 |
+420 (331) 900-5807 | 4203319005807 |
+45 (238) 961-9801 | 452389619801 |
+51 (123) 545-6543 | 511235456543 |
+57 (714) 557-0468 | 577145570468 |
+63 (308) 354-2560 | 633083542560 |
+63 (332) 705-0319 | 633327050319 |
+86 (238) 954-9649 | 862389549649 |
+86 (433) 851-1260 | 864338511260 |