Aries' Blog

Freestyle technical blog

Regex notes

Regex is very useful but hard to remember. I use it infrequently and always forget the syntax.

Regex reserved characters

Chars Description
. Any single character.
( ) Grouping delimiters.
[ ] Character set delimiters. Matches any one of the enclosed characters (OR at the character level).
| OR expression for patterns (one or another).
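{ } Repetition count delimiters, like a{2,3}.
* Zero or more repetitions of the preceding element (a character, set or group).
+ One or more repetitions of the preceding element.
? Zero or one repetition of the preceding element. It’s also used after another quantifier for lazy matches, like .+?
^ Start of the string. You can use it to force a pattern to match only at the start. It’s also used as a NOT when it is the first character inside a character set, like [^0-9].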
$ End of the string.
/ Separator. Many tools and languages (Perl, JavaScript, sed) write regex patterns enclosed in /s, so a literal / has to be escaped there.
- Range definition. Used inside a character set to define a range of consecutive characters, like A-Z
\ Escape character for all reserved characters, so \? will search for a literal ?. It’s also used for other special search patterns (see below).
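To see a few of these reserved characters in action, here is a minimal Java sketch. It assumes the standard java.util.regex package; the class name ReservedCharsDemo and the helper firstMatch are made up for this example. Note that each backslash is doubled inside a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReservedCharsDemo {
    // Returns the first substring of input matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("c.t", "a cat"));         // cat       (. is any single character)
        System.out.println(firstMatch("gr[ae]y", "grey area")); // grey      ([ ] picks one character from the set)
        System.out.println(firstMatch("cat|dog", "hot dog"));   // dog       (| picks one of two patterns)
        System.out.println(firstMatch("a{2,3}", "caaaat"));     // aaa       ({ } bounds the repetition count)
        System.out.println(firstMatch("<.+>", "<b>x</b>"));     // <b>x</b>  (greedy + takes as much as it can)
        System.out.println(firstMatch("<.+?>", "<b>x</b>"));    // <b>       (lazy +? stops at the first >)
        System.out.println(firstMatch("[^0-9]+", "42abc"));     // abc       (^ inside [ ] negates the set)
        System.out.println(firstMatch("\\?", "why?"));          // ?         (\? is a literal question mark)
    }
}
```

The greedy and lazy pair is the one I forget most often: the only difference is the ? after the quantifier.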

The online regex testers mentioned below can be a great help.

Unicode character

Use the \x{FFFF} pattern to search for a Unicode character by its hexadecimal code point. It works in both Java and Notepad++.
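A small Java sketch of the \x{...} form, assuming Java 7 or later (as far as I know, older versions of java.util.regex do not accept the braces). The class UnicodeEscapeDemo and the method containsCodePoint are hypothetical names for this example.

```java
import java.util.regex.Pattern;

final class UnicodeEscapeDemo {
    // True if input contains the character with the given hexadecimal code point.
    // hex is expected to be plain hex digits, e.g. "00E9".
    static boolean containsCodePoint(String hex, String input) {
        // As typed in a tester the regex is \x{00E9}; in Java source the backslash is doubled.
        Pattern p = Pattern.compile("\\x{" + hex + "}");
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        // \u00E9 is the Java source escape for the letter e with an acute accent.
        System.out.println(containsCodePoint("00E9", "caf\u00E9")); // true
        System.out.println(containsCodePoint("00E9", "cafe"));      // false
        // Code points above FFFF work too; in a Java string they are stored as a surrogate pair.
        System.out.println(containsCodePoint("1F600", "\uD83D\uDE00")); // true
    }
}
```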

1. Java

The Java regex engine is similar to Perl's.

I found this Java Regex Tester the most helpful, as it provides a Java string view of the regex that you can use directly in code. This tester is based on Java 6.

The Java regex engine is documented in the java.util.regex.Pattern API doc; this link points to the Java 8 version.
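The "Java string view" matters because every backslash in a regex has to be doubled inside a Java string literal. A minimal sketch of the difference, with a made-up class name (JavaStringViewDemo) and an example pattern of my own choosing:

```java
import java.util.regex.Pattern;

final class JavaStringViewDemo {
    // Regex as typed into a tester:       \d+\.\d+
    // Same regex as a Java string literal: "\\d+\\.\\d+"
    static final Pattern DECIMAL = Pattern.compile("\\d+\\.\\d+");

    // True if the whole input is digits, a literal dot, then digits.
    static boolean isDecimal(String input) {
        return DECIMAL.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(DECIMAL.pattern());  // prints the regex with single backslashes
        System.out.println(isDecimal("3.14"));  // true
        System.out.println(isDecimal("3x14"));  // false, because \. only matches a literal dot
    }
}
```

Printing Pattern.pattern() is a handy way to check what the regex engine actually received after the Java compiler processed the string escapes.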

2. Notepad++

Notepad++ supports regex search when you select Regular expression under Search Mode in the search window. It supports PCRE-style (Perl Compatible Regular Expressions) syntax. Use this tester and select PCRE.

Note that / (forward slash) needs to be escaped in the tester but not in Notepad++.

So the patterns

^http(s?):\/\/my.com\/path\/to\/page and

^http(s?)://my.com/path/to/page

both work in Notepad++, but only the first one is valid in the tester.
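For comparison, Java (java.util.regex) also accepts both forms, since a backslash before a non-alphabetic character is always allowed. The sketch below uses the two patterns above plus a third variant of my own with the dot escaped, because the unescaped . in my.com matches any character. The class and field names are made up for this example.

```java
import java.util.regex.Pattern;

final class UrlPrefixDemo {
    // The two patterns from the notes, written as Java string literals.
    static final Pattern ESCAPED_SLASHES =
            Pattern.compile("^http(s?):\\/\\/my.com\\/path\\/to\\/page");
    static final Pattern PLAIN_SLASHES =
            Pattern.compile("^http(s?)://my.com/path/to/page");
    // Stricter variant (not in the notes): \. only matches a literal dot.
    static final Pattern STRICT_DOT =
            Pattern.compile("^http(s?)://my\\.com/path/to/page");

    // True if the pattern is found in input; ^ pins the match to the start.
    static boolean startsWithUrl(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        String good = "https://my.com/path/to/page?x=1";
        String odd = "http://myxcom/path/to/page";
        System.out.println(startsWithUrl(ESCAPED_SLASHES, good)); // true
        System.out.println(startsWithUrl(PLAIN_SLASHES, good));   // true
        System.out.println(startsWithUrl(PLAIN_SLASHES, odd));    // true, the bare . matches x
        System.out.println(startsWithUrl(STRICT_DOT, odd));       // false
    }
}
```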

3. Grep

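Use grep -E, or simply egrep, to enable Extended Regular Expressions (recent GNU grep releases mark egrep as obsolescent in favor of grep -E). grep will then behave like Notepad++ with respect to the forward slash: escaping it is optional.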

Patterns

^http(s?):\/\/my.com\/path\/to\/page and

^http(s?)://my.com/path/to/page

both work with grep -E or egrep.