In this tutorial, we discuss what a Java regular expression is and how to use a Java regex to match patterns and substrings with the Pattern and Matcher classes, along with several examples. We also cover the special characters that Java regex patterns support.
What is a Regular expression (Java regex)?
A regular expression is a pattern that describes the text we want to search for in a string. It can be a single character or a sequence of characters. We can use a Java regex to perform many kinds of string search and replace operations.
To use regular expressions in Java, we import the java.util.regex package.
java.util.regex package
The java.util.regex package contains 1 interface and 3 classes as listed below:
- MatchResult interface
- Matcher class
- Pattern class
- PatternSyntaxException class
Pattern class
The Pattern class represents a compiled regular expression. Its static compile() method accepts the regular expression as an argument and returns a Pattern object that we can use to perform a pattern match.
Below are the commonly used methods of the Pattern class:
Method | Description |
---|---|
Matcher matcher(CharSequence input) | Creates a matcher that matches the input with the given pattern |
String pattern() | Returns a regular expression from which the pattern was compiled |
String[] split(CharSequence input) | Splits the input sequence around the pattern match |
static Pattern compile(String regex) | Compiles the regular expression into a pattern |
static boolean matches(String regex, CharSequence input) | Compiles the regular expression and checks whether the entire input matches it |
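As a minimal sketch of the Pattern methods listed above, the following class compiles one pattern and calls pattern(), split(), and the static matches(). The class name PatternMethodsDemo, the helper splitOnCommas, and the sample strings are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class PatternMethodsDemo {

    // Hypothetical helper: splits a line on commas, also dropping spaces around each comma.
    static String[] splitOnCommas(String line) {
        Pattern comma = Pattern.compile("\\s*,\\s*");
        return comma.split(line);
    }

    public static void main(String[] args) {
        Pattern comma = Pattern.compile("\\s*,\\s*");

        // pattern() returns the regex string the Pattern was compiled from
        System.out.println(comma.pattern());                                        // \s*,\s*

        // split() breaks the input around every match of the pattern
        System.out.println(Arrays.toString(splitOnCommas("java , regex,pattern"))); // [java, regex, pattern]

        // The static matches() compiles the regex and tests the whole input in one call
        System.out.println(Pattern.matches("\\d+", "2024"));                        // true
    }
}
```

If the same regular expression is applied many times, compiling it once and reusing the Pattern object avoids recompiling it on every call, which is what the static matches() does.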
The compile method has an overload, compile(String regex, int flags), whose flags parameter controls how the pattern match is performed. Commonly used flags include:
- Pattern.CASE_INSENSITIVE: Ignores the case of letters during the pattern search
- Pattern.LITERAL: Treats the special characters as ordinary characters during the pattern search
- Pattern.UNICODE_CASE: Used along with CASE_INSENSITIVE to ignore the case of letters outside the US-ASCII range as well.
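The effect of these flags can be sketched as follows. The class PatternFlagsDemo and its helper found are hypothetical names chosen for this illustration; the flag constants themselves belong to the Pattern class.

```java
import java.util.regex.Pattern;

public class PatternFlagsDemo {

    // Hypothetical helper: true if the regex is found anywhere in the text under the given flags.
    static boolean found(String regex, int flags, String text) {
        return Pattern.compile(regex, flags).matcher(text).find();
    }

    public static void main(String[] args) {
        // Without flags (0) the search is case-sensitive
        System.out.println(found("java", 0, "Learn JAVA"));                        // false
        System.out.println(found("java", Pattern.CASE_INSENSITIVE, "Learn JAVA")); // true

        // With LITERAL the dot is an ordinary character, not "any character"
        System.out.println(found("a.c", 0, "abc"));               // true
        System.out.println(found("a.c", Pattern.LITERAL, "abc")); // false
        System.out.println(found("a.c", Pattern.LITERAL, "a.c")); // true

        // Flags are combined with the bitwise OR operator;
        // \u00e9 is a lowercase e with an acute accent, \u00c9 its uppercase form
        int flags = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
        System.out.println(found("\u00e9", flags, "\u00c9"));     // true
    }
}
```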
Matcher class
The Matcher class implements the MatchResult interface and performs pattern matches on a sequence of characters. We can create a Matcher object using the matcher method on the Pattern object.
Below are the different methods that are present in the Matcher class:
Method | Description |
---|---|
int end() | Returns the offset after the last character that is matched |
boolean find() | Finds the next subsequence of the input that matches the pattern |
boolean find(int start) | Resets the matcher and finds the next subsequence of the input that matches the pattern starting from the specified index |
String group() | Returns the input subsequence that matches the expression |
int groupCount() | Returns the number of capturing groups in the matcher's pattern |
boolean matches() | Attempts to match the entire region against the pattern |
Pattern pattern() | Returns the pattern interpreted by the matcher |
Matcher region(int start, int end) | Sets the limits of the region in which the matcher searches |
String replaceAll(String replacement) | Replaces every subsequence that matches the pattern with the given replacement string |
Matcher reset() | Resets the matcher |
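Several of the Matcher methods in the table can be seen together in a short sketch. The class MatcherMethodsDemo, the helper findPairs, and the "key=value" input are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherMethodsDemo {

    // Hypothetical helper: collects every "key=value" pair found in the input.
    static List<String> findPairs(String input) {
        Matcher m = Pattern.compile("(\\w+)=(\\d+)").matcher(input);
        List<String> pairs = new ArrayList<>();
        while (m.find()) {
            // group(1) and group(2) return the text captured by each pair of parentheses
            pairs.add(m.group(1) + " -> " + m.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String input = "width=10 height=20";
        Matcher m = Pattern.compile("(\\w+)=(\\d+)").matcher(input);

        System.out.println(m.groupCount());   // 2
        System.out.println(findPairs(input)); // [width -> 10, height -> 20]

        // find(int) resets the matcher and searches from the given index
        if (m.find(9)) {
            System.out.println(m.group() + " at " + m.start() + "-" + m.end()); // height=20 at 9-18
        }

        // replaceAll() resets the matcher and replaces every match
        System.out.println(m.replaceAll("?")); // ? ?

        // region() restricts matching to the characters from index 0 up to, not including, 8
        m.region(0, 8);
        System.out.println(m.matches());       // true
    }
}
```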
Regular Expression Patterns
A regular expression pattern can match letters, digits, or any other characters in an input string. The compile method of the Pattern class accepts the regular expression as its first parameter. The table below lists the different forms of character classes:
Pattern | Description |
---|---|
[abc] | Finds a single character that is one of the options in the brackets |
[^abc] | Finds a single character that is not one of the options in the brackets |
[0-9] | Finds a character in the range 0-9 |
[a-zA-Z] | Finds a letter from a to z in either case |
[a-g[k-r]] | Finds a character from a to g or from k to r (union) |
[a-z&&[lmn]] | Finds a character from a to z that is also l, m, or n (intersection) |
[a-z&&[^de]] | Finds a character from a to z except d and e (subtraction) |
[a-z&&[^h-k]] | Finds a character from a to z except those in the range h to k (subtraction) |
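The union, intersection, and subtraction classes above can be checked one character at a time. This is a small sketch; the class CharClassDemo and the helper inClass are hypothetical names.

```java
import java.util.regex.Pattern;

public class CharClassDemo {

    // Hypothetical helper: true if the single character c belongs to the character class.
    static boolean inClass(String charClass, char c) {
        return Pattern.matches(charClass, String.valueOf(c));
    }

    public static void main(String[] args) {
        System.out.println(inClass("[a-g[k-r]]", 'm'));    // true: m lies in k-r
        System.out.println(inClass("[a-g[k-r]]", 'i'));    // false: i lies in neither range
        System.out.println(inClass("[a-z&&[lmn]]", 'm'));  // true: m is in both sets
        System.out.println(inClass("[a-z&&[lmn]]", 'a'));  // false: a is not l, m, or n
        System.out.println(inClass("[a-z&&[^de]]", 'd'));  // false: d is excluded
        System.out.println(inClass("[a-z&&[^h-k]]", 'g')); // true: g is outside h-k
    }
}
```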
Metacharacters
We can also use metacharacters, which are characters that have a special meaning, as part of a regular expression pattern.
Metacharacter | Description |
---|---|
\| | Finds a match for any one of the patterns separated by \| |
. | Finds a single instance of any character |
^ | Finds a match at the beginning of the string |
$ | Finds a match at the end of the string |
\d | Finds a digit |
\s | Finds a whitespace character |
\b | Finds a match at either beginning or end of the word |
\uxxxx | Finds a Unicode character specified by the hexadecimal number xxxx |
\D | Finds any non-digit, which is equivalent to [^0-9] |
\S | Any non-whitespace character which is equivalent to [^\s] |
\w | Any word character which is equivalent to [a-zA-Z_0-9] |
\W | Any non-word character which is equivalent to [^\w] |
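To see a few of these metacharacters in action, the sketch below counts matches in a sample sentence. The class MetacharDemo, the helper count, and the sample text are assumptions for illustration. Note that each backslash is doubled inside a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetacharDemo {

    // Hypothetical helper: counts how many times the regex is found in the text.
    static int count(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "Java 8 and Java 11, not JavaScript";

        System.out.println(count("\\d", text));        // 3: the digits 8, 1, 1
        System.out.println(count("\\s", text));        // 6: the spaces between words
        System.out.println(count("Java", text));       // 3: also counts the start of JavaScript
        System.out.println(count("\\bJava\\b", text)); // 2: only the whole word Java
        System.out.println(count("^Java", text));      // 1: only at the beginning
        System.out.println(count("Script$", text));    // 1: only at the end
        System.out.println(count("8|11", text));       // 2: either alternative
    }
}
```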
Quantifiers
We can use quantifiers to define the quantity or number of occurrences of the specified character in the regular expression pattern.
Quantifier | Description |
---|---|
a+ | a occurs one or more times |
a* | a occurs zero or more times |
a? | a occurs zero times or once |
a{n} | a occurs exactly n times |
a{n,} | a occurs n or more times |
a{m,n} | a occurs at least m times but not more than n times |
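The counted quantifiers are easiest to understand by testing inputs on either side of each bound. The following is a minimal sketch; the class QuantifierBoundsDemo and the helper fullMatch are hypothetical names.

```java
import java.util.regex.Pattern;

public class QuantifierBoundsDemo {

    // Hypothetical helper: true if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("a{2}", "aa"));     // true: exactly 2
        System.out.println(fullMatch("a{2}", "aaa"));    // false: 3 is too many
        System.out.println(fullMatch("a{2,}", "aaaaa")); // true: 2 or more
        System.out.println(fullMatch("a{2,}", "a"));     // false: fewer than 2
        System.out.println(fullMatch("a{2,3}", "aaa"));  // true: the upper bound 3 is included
        System.out.println(fullMatch("a{2,3}", "aaaa")); // false: more than 3
        System.out.println(fullMatch("a?", ""));         // true: zero occurrences are allowed
    }
}
```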
Java regular expression examples
The following examples demonstrate different regular expression patterns in Java.
Example: Find a string
The example below searches the input text for the string “java”. It creates a Matcher with the Pattern.matcher method and calls find(), which returns true if the pattern occurs anywhere in the input and false otherwise. The CASE_INSENSITIVE flag lets “java” match “Java”.
```java
import java.util.regex.*;

public class RegExDemo {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("java", Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher("Welcome to Java tutorial");
        boolean bfound = m.find();
        if (bfound)
            System.out.println("Pattern found");
        else
            System.out.println("Pattern not found");
    }
}
```
```
Pattern found
```
Example: Different ways of writing a regular expression
There are different ways of applying a regular expression pattern in Java. The 1st method uses the Pattern and Matcher classes, calling Pattern.compile, the matcher method, and the matches method in separate statements. The 2nd method uses the same combination in a single statement, while the 3rd method uses only Pattern.matches to check the input against the regular expression.
In this example, we check for a 4-character string whose 2nd character is ‘a’; the other 3 characters can be anything.
```java
import java.util.regex.*;

public class RegExDemo2 {
    public static void main(String[] args) {
        Pattern p = Pattern.compile(".a..");
        Matcher m = p.matcher("java");
        System.out.println(m.matches());

        boolean b = Pattern.compile(".a..").matcher("java").matches();
        System.out.println(b);

        boolean bm = Pattern.matches(".a..", "java");
        System.out.println(bm);
    }
}
```
```
true
true
true
```
Example: Regular expression pattern using . (dot)
The example below demonstrates the . (dot) character, which matches any single character. The 1st output is true since the input has 2 characters and the 2nd is ‘i’. The 2nd output is false since the 2nd character is not ‘i’. The 3rd output is false since the input has more than 3 characters. The last 2 outputs are true since the 1st character is ‘h’ and the last character is ‘e’ respectively, and the input lengths match the patterns.
```java
import java.util.regex.*;

public class RegExDemo3 {
    public static void main(String[] args) {
        System.out.println(Pattern.matches(".i", "hi"));
        System.out.println(Pattern.matches(".i", "at"));
        System.out.println(Pattern.matches(".a.", "java"));
        System.out.println(Pattern.matches("h.", "hi"));
        System.out.println(Pattern.matches("..e", "bye"));
    }
}
```
```
true
false
false
true
true
```
Example: Regular expression character class
In this example, we use character classes in the regular expression pattern. Pattern.matches returns true only if the entire input string matches the pattern, so “bag” does not match [abc], which stands for exactly one character.
```java
import java.util.regex.*;

public class RegExDemo4 {
    public static void main(String[] args) {
        System.out.println(Pattern.matches("[abc]", "bag"));
        System.out.println(Pattern.matches("[abc]", "a"));
        System.out.println(Pattern.matches("[a-c][p-u]", "ar"));
        System.out.println(Pattern.matches(".*come.*", "welcome"));
        System.out.println(Pattern.matches("java", "Java"));
    }
}
```
```
false
true
true
true
false
```
Example: Regular expression quantifier
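In the example below, we use various quantifiers: ‘?’ checks whether the character occurs zero times or once, ‘+’ checks whether it occurs one or more times, and ‘*’ checks whether it occurs zero or more times. Since Pattern.matches tests the entire input, “hello” and “java” return false because they contain characters other than l, m, and n.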
```java
import java.util.regex.*;

public class RegExDemo5 {
    public static void main(String[] args) {
        System.out.println(Pattern.matches("[lmn]?", "l"));
        System.out.println(Pattern.matches("[lmn]?", "hello"));
        System.out.println(Pattern.matches("[lmn]+", "llmmn"));
        System.out.println(Pattern.matches("[lmn]*", "java"));
        System.out.println(Pattern.matches("[lmn]*", "lln"));
    }
}
```
```
true
false
true
false
true
```
Example: Find multiple occurrences using the matcher method
The example below finds multiple occurrences of a pattern in the input string by calling the find() method of the Matcher in a loop. It displays the locations at which the character ‘a’ occurs in the string: start() gives the index of the match and end() gives the index just after it.
```java
import java.util.regex.*;

public class RegExDemo6 {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("a");
        Matcher m = p.matcher("Welcome to java tutorial");
        while (m.find()) {
            System.out.println("Occurs at: " + m.start() + " - " + m.end());
        }
    }
}
```
```
Occurs at: 12 - 13
Occurs at: 14 - 15
Occurs at: 22 - 23
```
Example: Boundary matches
This example checks for boundary matches using the special characters ^ and $, which anchor the pattern to the beginning and the end of the input. The 1st output is true since the input is exactly “Java”, while the 2nd output is false since the input does not begin with “Java”.
```java
import java.util.regex.*;

public class RegExDemo7 {
    public static void main(String[] args) {
        System.out.println(Pattern.matches("^Java$", "Java"));
        System.out.println(Pattern.matches("^Java$", "Welcome to java"));
    }
}
```
```
true
false
```
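Because Pattern.matches always tests the whole input, the individual effect of ^ and $ is easier to observe with find(), which searches anywhere in the input. The sketch below assumes a hypothetical class AnchorDemo with a helper named found.

```java
import java.util.regex.Pattern;

public class AnchorDemo {

    // Hypothetical helper: true if the regex is found anywhere in the text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "Welcome to Java";

        System.out.println(found("Java", text));      // true: occurs somewhere in the text
        System.out.println(found("^Java", text));     // false: the text does not begin with Java
        System.out.println(found("Java$", text));     // true: the text ends with Java
        System.out.println(found("^Java$", text));    // false: the text is more than just Java
        System.out.println(found("^Java$", "Java"));  // true
    }
}
```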
Example: Regular expression with digits
This example uses the digit metacharacter \d in the regular expression. The pattern matches the word “Java” followed by exactly one digit. Hence the first 2 outputs are true, since a digit follows “Java”, while the last output is false, since “JavaScript” does not contain any digit.
```java
import java.util.regex.*;

public class RegExDemo7 {
    public static void main(String[] args) {
        String regex = "Java\\d";
        System.out.println(Pattern.matches(regex, "Java5"));
        System.out.println(Pattern.matches(regex, "Java8"));
        System.out.println(Pattern.matches(regex, "JavaScript"));
    }
}
```
```
true
true
false
```
Example: Using logical operators in regular expression pattern
We can also express logical AND and OR in patterns. Placing characters or character classes one after another acts as an implicit AND, since each must match in sequence. In the code below, the output is true if the 1st character is ‘C’ or ‘c’ and the 2nd character is ‘h’. Hence the first 2 outputs are true and the last output is false.
```java
import java.util.regex.*;

public class RegExDemo8 {
    public static void main(String[] args) {
        String regex = "[Cc][h].*";
        String s = "cheque";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(s);
        System.out.println(m.matches());

        s = "Chart";
        m = p.matcher(s);
        System.out.println(m.matches());

        s = "color";
        m = p.matcher(s);
        System.out.println(m.matches());
    }
}
```
```
true
true
false
```
We can express OR with the ‘|’ symbol, which matches any one of the alternatives. In this example, the output is true if the input string contains either the text “Java” or “JavaScript”.
```java
import java.util.regex.*;

public class RegExDemo8 {
    public static void main(String[] args) {
        String regex = ".*Java.*|.*JavaScript.*";
        String s = "Welcome to Java tutorial";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(s);
        System.out.println(m.matches());

        s = "JavaScript tutorial";
        m = p.matcher(s);
        System.out.println(m.matches());

        s = "C tutorial";
        m = p.matcher(s);
        System.out.println(m.matches());
    }
}
```
```
true
true
false
```
The above two examples also illustrate how to use a Java regex to search for a substring, since we check whether the input string contains a given piece of text.
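A substring search can also be written with find(), which removes the need for the surrounding .* and lets us read back the matched text with group(). This is a minimal sketch under that assumption; the class SubstringDemo and the helper firstMatch are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SubstringDemo {

    // Hypothetical helper: returns the first substring that matches the regex, or null if none does.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Alternatives are tried from left to right, so the longer one is listed first here
        System.out.println(firstMatch("JavaScript|Java", "JavaScript tutorial"));      // JavaScript
        System.out.println(firstMatch("JavaScript|Java", "Welcome to Java tutorial")); // Java

        // With the shorter alternative first, only "Java" is reported
        System.out.println(firstMatch("Java|JavaScript", "JavaScript tutorial"));      // Java

        System.out.println(firstMatch("JavaScript|Java", "C tutorial"));               // null
    }
}
```

The order of the alternatives does not matter when we only need a true or false answer, but it does when we want to know which text was matched.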
Conclusion
In this tutorial, we learned how to perform pattern matching with Java regular expressions using Pattern.matcher and other methods, with examples, and how to use Java regex special characters and substring searches in patterns.