Java regular expression optimization - help needed
Hi All,
I am new to Java regular expressions. We use the pattern below to validate a text field. It meets our requirements, but matching has a performance problem.
Pattern: ([a-zA-Z0-9]+[ ]?(([_\-][a-zA-Z0-9 ])*)?[_\-]?)+
1. Input text should start with a-zA-Z0-9.
2. A single space is allowed between words.
3. "_" and "-" are allowed but cannot be consecutive.
Our problem is that for certain input strings the CPU time goes high and the threads hang. We also get exceptions. Can anyone please help me optimize the pattern or suggest a new pattern to solve this issue?
Exception details
============================================
Hung thread details, all the same:
[9/28/11 11:40:07:320 CDT] 00000003 ThreadMonitor W WSVR0605W: Thread "WebContainer : 26" (0000004f) has been active for 709755 milliseconds and may be hung. There is/are 1 thread(s) in total in the server that may be hung.
at java.util.regex.Pattern$GroupCurly.match(Pattern.j ava:3938)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Branch.match(Pattern.java: 4124)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Curly.match0(Pattern.java: 3801)
at java.util.regex.Pattern$Curly.match(Pattern.java:3 756)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Loop.match(Pattern.java:43 07)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$BranchConn.match(Pattern.j ava:4090)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$GroupCurly.match0(Pattern. java:4006)
at java.util.regex.Pattern$GroupCurly.match(Pattern.j ava:3928)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Branch.match(Pattern.java: 4124)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Curly.match0(Pattern.java: 3794)
at java.util.regex.Pattern$Curly.match(Pattern.java:3 756)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Loop.match(Pattern.java:43 07)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Branch.match(Pattern.java: 4124)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Curly.match0(Pattern.java: 3794)
at java.util.regex.Pattern$Curly.match(Pattern.java:3 756)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Loop.match(Pattern.java:43 07)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$BranchConn.match(Pattern.j ava:4090)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$GroupCurly.match0(Pattern. java:4006)
at java.util.regex.Pattern$GroupCurly.match(Pattern.j ava:3928)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Branch.match(Pattern.java: 4124)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Curly.match0(Pattern.java: 3794)
at java.util.regex.Pattern$Curly.match(Pattern.java:3 756)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Loop.match(Pattern.java:43 07)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$BranchConn.match(Pattern.j ava:4090)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$GroupCurly.match0(Pattern. java:4006)
at java.util.regex.Pattern$GroupCurly.match(Pattern.j ava:3928)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Branch.match(Pattern.java: 4124)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Curly.match0(Pattern.java: 3794)
at java.util.regex.Pattern$Curly.match(Pattern.java:3 756)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Loop.match(Pattern.java:43 07)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Branch.match(Pattern.java: 4124)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Curly.match0(Pattern.java: 3794)
at java.util.regex.Pattern$Curly.match(Pattern.java:3 756)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Loop.match(Pattern.java:43 07)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Branch.match(Pattern.java: 4124)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Curly.match0(Pattern.java: 3801)
at java.util.regex.Pattern$Curly.match(Pattern.java:3 756)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Loop.match(Pattern.java:43 07)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$BranchConn.match(Pattern.j ava:4090)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$GroupCurly.match0(Pattern. java:4006)
at java.util.regex.Pattern$GroupCurly.match(Pattern.j ava:3928)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Branch.match(Pattern.java: 4124)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Curly.match0(Pattern.java: 3794)
at java.util.regex.Pattern$Curly.match(Pattern.java:3 756)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Loop.match(Pattern.java:43 07)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Branch.match(Pattern.java: 4124)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Curly.match0(Pattern.java: 3794)
at java.util.regex.Pattern$Curly.match(Pattern.java:3 756)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Loop.match(Pattern.java:43 07)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$BranchConn.match(Pattern.j ava:4090)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$GroupCurly.match0(Pattern. java:4006)
at java.util.regex.Pattern$GroupCurly.match(Pattern.j ava:3928)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Branch.match(Pattern.java: 4124)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Curly.match0(Pattern.java: 3794)
at java.util.regex.Pattern$Curly.match(Pattern.java:3 756)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Loop.match(Pattern.java:43 07)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Branch.match(Pattern.java: 4124)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Curly.match0(Pattern.java: 3794)
at java.util.regex.Pattern$Curly.match(Pattern.java:3 756)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Loop.match(Pattern.java:43 07)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Branch.match(Pattern.java: 4124)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Curly.match0(Pattern.java: 3794)
at java.util.regex.Pattern$Curly.match(Pattern.java:3 756)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Loop.match(Pattern.java:43 07)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$BranchConn.match(Pattern.j ava:4090)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$GroupCurly.match0(Pattern. java:4006)
at java.util.regex.Pattern$GroupCurly.match(Pattern.j ava:3928)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Branch.match(Pattern.java: 4124)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Curly.match0(Pattern.java: 3794)
at java.util.regex.Pattern$Curly.match(Pattern.java:3 756)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Loop.match(Pattern.java:43 07)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$BranchConn.match(Pattern.j ava:4090)
at java.util.regex.Pattern$GroupTail.match(Pattern.ja va:4239)
at java.util.regex.Pattern$GroupCurly.match0(Pattern. java:4006)
at java.util.regex.Pattern$GroupCurly.match(Pattern.j ava:3928)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Branch.match(Pattern.java: 4124)
at java.util.regex.Pattern$Ques.match(Pattern.java:37 03)
at java.util.regex.Pattern$Curly.match0(Pattern.java: 3794)
at java.util.regex.Pattern$Curly.match(Pattern.java:3 756)
at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4180)
at java.util.regex.Pattern$Loop.matchInit(Pattern.jav a:4323)
at java.util.regex.Pattern$Prolog.match(Pattern.java: 4263)
at java.util.regex.Matcher.match(Matcher.java:1139)
at java.util.regex.Matcher.matches(Matcher.java:514)
Thanks
Deepak
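A note on the likely cause, for readers who hit the same hang: the outer (...)+ wraps an inner [a-zA-Z0-9]+, so a run of letters can be split between the inner and outer repetition in many different ways, and a backtracking engine may try all of them when the input ends in a character that cannot match. The repeated GroupCurly/Loop frames in the trace are consistent with that. The sketch below is one possible rewrite, not a drop-in replacement. It assumes the three rules mean "runs of letters or digits, separated by exactly one space, _ or -, ending on a letter or digit", so it rejects some strings the original accepts (for example a trailing _). The class name NameValidator and the constant names are made up for this example.

```java
import java.util.regex.Pattern;

final class NameValidator {
    // The pattern from the post, unchanged. Kept only for comparison:
    // avoid running it on long inputs that end in an invalid character.
    static final Pattern ORIGINAL = Pattern.compile(
            "([a-zA-Z0-9]+[ ]?(([_\\-][a-zA-Z0-9 ])*)?[_\\-]?)+");

    // Assumed rewrite: one alphanumeric run, then zero or more
    // "single separator + alphanumeric run" pairs. Every repetition of the
    // group must start with a separator, so there is only one way to split
    // the input. The possessive quantifiers (++ and *+) additionally tell
    // the engine not to backtrack into text it has already consumed.
    static final Pattern REWRITTEN = Pattern.compile(
            "[a-zA-Z0-9]++(?:[ _-][a-zA-Z0-9]++)*+");

    // Returns true if the whole input satisfies the assumed rules.
    static boolean isValid(String input) {
        return input != null && REWRITTEN.matcher(input).matches();
    }
}
```

With this sketch, NameValidator.isValid("abc def") and NameValidator.isValid("abc_def-ghi") return true, while "abc__def", "abc  def" (two spaces) and "_abc" return false. Compiling the pattern once into a static final field also avoids recompiling it on every request.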
Re: Java regular expression optimization - help needed
Why did you add + at the end?
And one more thing: why did you use [_\-] instead of [_|-]?
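On the second question, it may help to know that inside a character class | is not alternation but a literal pipe character, so [_|-] would also accept |. A small check that shows the difference (the class name CharClassDemo and the helper matchesOne are made up for this example):

```java
import java.util.regex.Pattern;

final class CharClassDemo {
    // Returns true if the single-character string ch is matched
    // by the given character class.
    static boolean matchesOne(String charClass, String ch) {
        return Pattern.matches(charClass, ch);
    }

    public static void main(String[] args) {
        // [_\-] accepts only underscore and hyphen.
        System.out.println(matchesOne("[_\\-]", "|"));
        // [_|-] accepts underscore, pipe and hyphen.
        System.out.println(matchesOne("[_|-]", "|"));
        // A hyphen placed last in the class is literal even without a backslash.
        System.out.println(matchesOne("[_-]", "-"));
    }
}
```

So the escaped form in the original pattern is valid. The backslash is optional when the hyphen is the last character in the class.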
Re: Java regular expression optimization - help needed
Quote:
1. Input text should start with a-zA-Z0-9.
You don't need a Pattern for this.
Quote:
2. Space(single) is allowed between words
Use a pattern for this.
Quote:
3. "_" and "-" are allowed but cannot be consecutive.
Use a separate pattern for this.
It would be crazy to attempt all three in a single Pattern, especially when any one check failing is enough to invalidate your input.
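A minimal sketch of what the separate checks could look like. The class name SeparateChecks, the method names, and the extra "allowed characters" check are assumptions added for illustration. The thread only states the three rules, and this sketch reads them loosely (for example it accepts a trailing _).

```java
import java.util.regex.Pattern;

final class SeparateChecks {
    // Assumption: only letters, digits, space, underscore and hyphen are allowed.
    private static final Pattern ALLOWED = Pattern.compile("[a-zA-Z0-9 _-]+");
    // Rule 2: two spaces in a row are not allowed.
    private static final Pattern DOUBLE_SPACE = Pattern.compile(" {2}");
    // Rule 3: two of "_" / "-" in a row are not allowed.
    private static final Pattern DOUBLE_SEPARATOR = Pattern.compile("[_-]{2}");

    // Rule 1: no regex needed, just look at the first character.
    static boolean startsWithAlphanumeric(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        char c = s.charAt(0);
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    static boolean hasOnlySingleSpaces(String s) {
        return !DOUBLE_SPACE.matcher(s).find();
    }

    static boolean hasNoConsecutiveSeparators(String s) {
        return !DOUBLE_SEPARATOR.matcher(s).find();
    }

    // Each check is a simple linear scan; evaluation stops at the first failure.
    static boolean isValid(String s) {
        return startsWithAlphanumeric(s)
                && ALLOWED.matcher(s).matches()
                && hasOnlySingleSpaces(s)
                && hasNoConsecutiveSeparators(s);
    }
}
```

None of these patterns nests one quantifier inside another, so each check scans the input without the repeated backtracking seen in the stack trace. A side benefit is that each failed rule can produce its own validation message.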
Re: Java regular expression optimization - help needed
Please do not double post the same question. Your other post has been removed.