Java Regular Expressions: Capturing and Non-Capturing Groups
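Capturing and non-capturing groups in Java puzzled me for a while. These are my notes from after I finally figured them out.
Start with an example: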
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class PatternTest {

        public static void main(String[] args) {
            String text = "<textarea rows=\"20\" cols=\"70\">nexus maven repository index properties updating index central</textarea>";
            // No parentheses yet: the pattern matches the whole element, tags included.
            String reg = "<textarea.*?>.*?</textarea>";
            Pattern p = Pattern.compile(reg);
            Matcher m = p.matcher(text);
            while (m.find()) {
                // group() with no argument returns the entire match.
                System.out.println(m.group());
            }
        }

    }
Output:
    <textarea rows="20" cols="70">nexus maven repository index properties updating index central</textarea>
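Now suppose I only want the text inside the <textarea> element, that is, "nexus maven repository index properties updating index central". How do I get it? This is where capturing groups come in. In the pattern "<textarea.*?>.*?</textarea>" above, the ".*?" in the middle is the part that matches the content; wrapping it in parentheses turns it into a capturing group. The code below also wraps the opening tag and the closing tag in parentheses, so that the group numbering becomes visible: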
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class PatternTest {

        public static void main(String[] args) {
            String text = "<textarea rows=\"20\" cols=\"70\">nexus maven repository index properties updating index central</textarea>";

            // Three capturing groups: opening tag, content, closing tag.
            String reg = "(<textarea.*?>)(.*?)(</textarea>)";
            Pattern p = Pattern.compile(reg);
            Matcher m = p.matcher(text);
            while (m.find()) {
                System.out.println(m.group(0)); // the entire match
                System.out.println(m.group(1)); // opening tag
                System.out.println(m.group(2)); // content
                System.out.println(m.group(3)); // closing tag
            }
        }
    }
Output:
    <textarea rows="20" cols="70">nexus maven repository index properties updating index central</textarea>
    <textarea rows="20" cols="70">
    nexus maven repository index properties updating index central
    </textarea>
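The conclusion from the code above: each part of the pattern enclosed in a plain "()" counts as a capturing group. Every capturing group has a number, assigned 1, 2, ... in the order the opening parentheses appear from left to right, and number 0 stands for the entire match.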
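The numbering rule matters most when groups are nested, so here is a small sketch of it. The class name GroupNumbering, the helper groups(), the pattern, and the input string are all made up for illustration and are not part of the example above; only Pattern, Matcher, find(), group(int), and groupCount() come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumbering {

    // Hypothetical helper: returns groups 0..groupCount() of the first match,
    // or an empty array if the pattern does not match at all.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            out[i] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        // Nested groups are numbered by counting opening parentheses from left to right:
        // group 1 = ((\d+)-(\d+)), group 2 = first (\d+), group 3 = second (\d+), group 4 = (\w+)
        String[] g = groups("((\\d+)-(\\d+))@(\\w+)", "10-20@central");
        for (int i = 0; i < g.length; i++) {
            System.out.println(i + ": " + g[i]);
        }
        // prints:
        // 0: 10-20@central
        // 1: 10-20
        // 2: 10
        // 3: 20
        // 4: central
    }
}
```

The outer pair of parentheses opens first, so it gets number 1 even though it closes after groups 2 and 3.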
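As for non-capturing groups, just change the "()" of a capturing group to "(?:)". The subpattern is still grouped, but it gets no number. Let the code speak: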
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class PatternTest {

        public static void main(String[] args) {
            String text = "<textarea rows=\"20\" cols=\"70\">nexus maven repository index properties updating index central</textarea>";

            // The two tags are non-capturing, so the content is the only numbered group.
            String reg = "(?:<textarea.*?>)(.*?)(?:</textarea>)";
            Pattern p = Pattern.compile(reg);
            Matcher m = p.matcher(text);
            while (m.find()) {
                System.out.println(m.group(0)); // the entire match
                System.out.println(m.group(1)); // content
            }
        }
    }
Output:
    <textarea rows="20" cols="70">nexus maven repository index properties updating index central</textarea>
    nexus maven repository index properties updating index central
If you try to run:
System.out.println(m.group(2));
it throws an IndexOutOfBoundsException, because no capturing group numbered 2 exists in this pattern.
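One way to avoid that exception is to check Matcher.groupCount() before asking for a group. The following is only a sketch: the class name SafeGroupAccess, the helper groupOrNull(), and the shortened input text are hypothetical and chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SafeGroupAccess {

    // Hypothetical helper: returns the requested group of the first match, or null
    // if nothing matches or the pattern has no capturing group with that index.
    static String groupOrNull(String regex, String input, int index) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find() || index < 0 || index > m.groupCount()) {
            return null;
        }
        return m.group(index);
    }

    public static void main(String[] args) {
        String text = "<textarea rows=\"20\" cols=\"70\">nexus maven</textarea>";
        String reg = "(?:<textarea.*?>)(.*?)(?:</textarea>)";

        Matcher m = Pattern.compile(reg).matcher(text);
        // groupCount() counts capturing groups only; group 0 and "(?:)" groups are excluded.
        System.out.println(m.groupCount()); // prints 1

        if (m.find()) {
            try {
                m.group(2);
            } catch (IndexOutOfBoundsException e) {
                // Reached because the pattern has only one capturing group.
                System.out.println("group(2) failed: " + e.getMessage());
            }
        }

        System.out.println(groupOrNull(reg, text, 1)); // prints nexus maven
        System.out.println(groupOrNull(reg, text, 2)); // prints null
    }
}
```

Note that groupCount() depends only on the pattern, so it can be called before find().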