Regular expressions also support a number of special characters, called metacharacters, that affect how a pattern is matched.
For example:
Enter the regex: sri.....
Enter input string to search: srinivas
I found the text srinivas starting at index 0 and ending at index 8.
The match succeeds here even though the input contains no literal dots, because the dot is a metacharacter: each “.” matches any single character, so “.....” matches any 5 characters in the input string (here, “nivas”).
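The behaviour described above can be reproduced with a small sketch. The class name DotMatchDemo and the helper firstMatchSpan are hypothetical names chosen for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotMatchDemo {
    // Hypothetical helper: returns "start-end" of the first match, or "none".
    static String firstMatchSpan(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() + "-" + m.end() : "none";
    }

    public static void main(String[] args) {
        // "sri" followed by five dots: each dot stands for any one character.
        System.out.println(firstMatchSpan("sri.....", "srinivas")); // 0-8
        // Only two characters follow "sri", so five dots cannot be satisfied.
        System.out.println(firstMatchSpan("sri.....", "srini"));    // none
    }
}
```

The end index is exclusive, which is why an 8-character match starting at index 0 ends at index 8.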
The metacharacters supported by Java regular expressions are: <([{\^-=$!|]})?*+.>
If we want a metacharacter to act as an ordinary character, there are 2 ways:
- Precede the metacharacter with a backslash.
- Enclose it within \Q (which starts the quote) and \E (which ends it).
\Q and \E can be placed anywhere in the expression, provided that \Q comes first.
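Both quoting approaches can be sketched as follows. The class QuoteDemo and the helper matchesLiterally are hypothetical names used only for this illustration, and the helper assumes the literal text does not itself contain \E.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Hypothetical helper: wraps 'literal' in \Q...\E so that every character
    // in it is treated as ordinary text. Assumes 'literal' contains no \E.
    static boolean matchesLiterally(String literal, String input) {
        return Pattern.matches("\\Q" + literal + "\\E", input);
    }

    public static void main(String[] args) {
        // Unescaped dot: matches any character, so "1x5" is accepted.
        System.out.println(Pattern.matches("1.5", "1x5"));    // true
        // Backslash escape: "\\." in Java source is the regex \. (a literal dot).
        System.out.println(Pattern.matches("1\\.5", "1x5"));  // false
        System.out.println(Pattern.matches("1\\.5", "1.5"));  // true
        // \Q...\E: both '.' and '+' lose their special meaning.
        System.out.println(matchesLiterally("1.5+", "1.5+")); // true
        System.out.println(matchesLiterally("1.5+", "1x55")); // false
    }
}
```

Note that the backslash has to be doubled inside a Java string literal. For arbitrary text, the standard Pattern.quote(String) method produces a quoted pattern and also copes with input that contains \E.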
Below are some metacharacters with their definitions:
. → Matches any single character (by default, any character except a line terminator)
\d → Matches any digit; equivalent to [0-9]
\D or [^0-9] → Matches any non-digit
\s or [ \t\n\x0B\f\r] → Matches any whitespace character
\S or [^\s] → Matches any non-whitespace character
\w or [a-zA-Z_0-9] → Matches any word character
\W or [^\w] → Matches any non-word character
\b → Matches a word boundary, that is, the position between a word character and a non-word character (or the start or end of the input)
\B → Matches a non-word boundary
\ → Marks the next character as a special character, a literal, a backreference, or an octal escape.
^ → Matches the position at the beginning of the input string.
$ → Matches the position at the end of the input string.
* → Matches the preceding character or subexpression zero or more times.
+ → Matches the preceding character or subexpression one or more times.
? → Matches the preceding character or subexpression zero or one time.
{n} → Placed after a character or subexpression; matches it exactly n times, where n is a non-negative integer.
{n,} → Placed after a character or subexpression; matches it at least n times, where n is a non-negative integer.
{n,m} → Placed after a character or subexpression; matches it at least n and at most m times, where n and m are non-negative integers and n <= m.
(pattern) → A subexpression that matches pattern and captures the match
(?:pattern) → A subexpression that matches pattern but does not capture the match
(?=pattern) → Positive lookahead. Matches at any position where the text that follows matches pattern, without consuming that text.
(?!pattern) → Negative lookahead. Matches at any position where the text that follows does not match pattern, without consuming any text.
a|b → Matches either a or b.
[abc] → A character set. Matches any one of the enclosed characters.
[^abc] → A negative character set. Matches any character that is not enclosed.
[a-x] → A range of characters. Matches any character in the specified range.
[^a-x] → A negative range of characters. Matches any character not in the specified range.
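\n → Matches a newline character
\v → Matches a vertical whitespace character, which includes the vertical tab (\x0B); in Java 8 and later this is [\n\x0B\f\r\x85\u2028\u2029]
\num → A backreference, where num is a positive integer. Matches the same text that capturing group number num matched; for example, (.)\1 matches two identical consecutive characters.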
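A few of the entries in the list above (quantifiers, capturing groups, lookahead, and backreferences) are easier to follow with a short sketch. The class MetacharDemo and its helpers firstGroup and found are hypothetical names introduced for this illustration; the sample inputs are made up as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MetacharDemo {
    // Hypothetical helper: text captured by group 1 in the first match, or null.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Quantifiers: *, +, ? and {n,m}
        System.out.println(Pattern.matches("ab*", "a"));         // true: zero b's allowed
        System.out.println(Pattern.matches("ab+", "a"));         // false: at least one b needed
        System.out.println(Pattern.matches("ab?", "abb"));       // false: at most one b allowed
        System.out.println(Pattern.matches("\\d{2,4}", "123"));  // true: between 2 and 4 digits

        // Capturing group: group 1 holds the digits before the hyphen.
        System.out.println(firstGroup("(\\d+)-\\d+", "call 555-1234"));   // 555

        // Positive lookahead: digits only when "px" follows; "px" is not consumed.
        System.out.println(firstGroup("(\\d+)(?=px)", "10em 20px"));      // 20

        // Negative lookahead: a 'q' that is not followed by 'u'.
        System.out.println(found("q(?!u)", "Iraq"));                      // true
        System.out.println(found("q(?!u)", "quit"));                      // false

        // Backreference \1: the same word repeated after a space.
        System.out.println(firstGroup("\\b(\\w+) \\1\\b", "it is is fine")); // is
    }
}
```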
Regular Expression Metacharacters Example:
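import java.util.regex.*;

class RegexExample {
    public static void main(String args[]) {
        // Pattern.matches returns true only if the entire input matches the regex.
        // In Java source, "\\d" is the regex \d and "\\D" is the regex \D.
        System.out.println(Pattern.matches("\\d", "cdfg"));    // no digit at all
        System.out.println(Pattern.matches("\\d", "9"));       // exactly one digit
        System.out.println(Pattern.matches("\\d", "6789"));    // more than one digit
        System.out.println(Pattern.matches("\\d", "123gcd"));  // digits and letters
        System.out.println(Pattern.matches("\\D", "9"));       // a digit, not a non-digit
        System.out.println(Pattern.matches("\\D", "8976"));    // only digits
        System.out.println(Pattern.matches("\\D", "123cdf"));  // more than one character
        System.out.println(Pattern.matches("\\D*", "srinu"));  // zero or more non-digits
        System.out.println(Pattern.matches("[a-zA-Z0-9]{7}", "srinu89"));     // 7 alphanumerics
        System.out.println(Pattern.matches("[a-zA-Z0-9]{7}", "srinivas89"));  // 10 characters
        System.out.println(Pattern.matches("[a-zA-Z0-9]{6}", "arun$2"));      // contains '$'
    }
}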
Output: false
true
false
false
false
false
false
true
true
false
false
In the above example, the first result is false because the regex \d (written "\\d" in Java source) matches exactly one digit and the input “cdfg” contains no digit. The second result is true because “9” is a single digit.
The third result is false because Pattern.matches requires the whole input to match and “6789” contains four digits, while \d matches only one. The fourth result is false because the input “123gcd” contains both digits and letters, and is also longer than one character.
The fifth result is false because the regex \D matches exactly one non-digit and the input “9” is a digit. The sixth result is false for the same reason: “8976” contains only digits.
The seventh result is false because \D matches a single non-digit, but the input “123cdf” has six characters and starts with digits. The eighth result is true because \D* matches zero or more non-digit characters, and “srinu” consists only of non-digits.
The ninth result is true because the regex [a-zA-Z0-9]{7} requires exactly 7 letters or digits, and “srinu89” is 7 characters long and contains only letters and digits.
The tenth result is false because the regex [a-zA-Z0-9]{7} requires exactly 7 characters and the input “srinivas89” has 10.
The eleventh result is false because the regex [a-zA-Z0-9]{6} requires exactly 6 letters or digits. The input “arun$2” has the right length, but it contains the special character ‘$’, which is not in the character class.
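Several of the false results above come from Pattern.matches requiring the entire input to match. As a rough sketch of the difference, the hypothetical class WholeVsPartialDemo below contrasts it with Matcher.find, which looks for a match anywhere in the input; the helper names wholeMatch and partialMatch are illustrative only.

```java
import java.util.regex.Pattern;

class WholeVsPartialDemo {
    // Hypothetical helper: true only if the entire input matches the regex.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Hypothetical helper: true if any part of the input matches the regex.
    static boolean partialMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("\\d", "6789"));      // false: one digit is not the whole input
        System.out.println(partialMatch("\\d", "6789"));    // true: a digit occurs somewhere
        System.out.println(wholeMatch("\\d+", "6789"));     // true: one or more digits cover the input
        System.out.println(partialMatch("\\D", "123cdf"));  // true: 'c' is a non-digit
    }
}
```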