Monday 1 August 2022

Checking for hyphen and space with regexp_like

 If you need to check that a string only contains a certain set of characters then regexp_like can be very useful.

However you can have some trouble when trying to check for things like spaces and the hyphen symbol.

Through a bit of trial and error is seems like placement of the characters to check in the pattern section of the condition makes a difference.

For example if what we are trying to do is find a string that contains any character not in a set we define (using the ^ to indicate characters NOT matching the pattern) and that set contains a hyphen as an allowable character then the following falsely says that the string violates the regular expression. The hyphen is after the underscore in the pattern.

SELECT 'TRUE'
FROM dual
WHERE regexp_like('This is a string with a - hyphen', q'[[^a-zA-Z0-9._-~@/ ]]') 

This one results in ORA-12728: invalid range in regular expression. The hyphen is after the slash in the pattern.

SELECT 'TRUE'
FROM dual
WHERE regexp_like('This is a string with a - hyphen', q'[[^a-zA-Z0-9._~@/- ]]') 

And this one works as expected. The hyphen is at the start of the pattern. It passes as the string doesn't contain any characters other than those allowed in the pattern.

SELECT 'TRUE'
FROM dual
WHERE regexp_like('This is a string with a - hyphen', q'[[^-a-zA-Z0-9._~@/ ]]') 

I also found that space seems to work when it is defined at the end of the pattern. Originally I had space escaped with \ but it is not needed.