Regex definition unclear...
I've read the documentation. It's still unclear. I've searched this forum. No luck. So would someone kindly explain in simple terms why the asterisk ("zero or any occurrences") is necessary - as described in the comments above the expression. Just guessing: the asterisk is the "thereafter" part? But how do we arrive at that from the "zero or any occurrences" described in the Pattern documentation?
Here's the example:
Code Java:
// This method verifies that firstName starts with an uppercase
// letter and contains lowercase letters thereafter.
public void setFirstName(String first)
{
    // Validate the incoming argument, not the (possibly null) field.
    if (first.matches("[A-Z][a-z]*"))
    {
        this.firstName = first;
    }
    // and blah blah blah...
After reading the documentation, it seems to me that the expression would say: "the first letter of firstName must be uppercase and the second letter must be lowercase, zero or any number of times" (which of course doesn't make sense). The comments above tell me how the expression should be read, but I don't see the logic in that translation yet.
Re: Regex definition unclear...
The regular expression has 3 parts:
- The first part is [A-Z] which means that the first letter must be present and must be capitalized.
- The next part is [a-z] which will match any lower-case letter.
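- The third part is the asterisk, *, which tells the regex engine that the preceding element (here the character class [a-z], i.e. any lower-case letter) may be repeated 0 or more times. The key is that the asterisk affects only what comes immediately before it, so [a-z]* matches any run of lower-case letters, including an empty one. It does not mean the same letter repeated.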
So this means that valid Strings include A, Hello, John, Xasdflaksdfajse, and Phillip. Because String.matches() only succeeds when the entire String matches the pattern, non-valid Strings are the empty String, any with non-letter characters, any that don't start with an upper-case letter, and any with upper-case letters that are not in the first position.
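If you want to convince yourself, here is a minimal sketch you can run. The class name FirstNameCheck and the helper isValidFirstName are made up for illustration; only String.matches() and the pattern itself come from your example.

```java
public class FirstNameCheck {

    // Hypothetical helper: true only if the WHOLE string is one
    // upper-case letter followed by zero or more lower-case letters.
    static boolean isValidFirstName(String s) {
        return s.matches("[A-Z][a-z]*");
    }

    public static void main(String[] args) {
        String[] samples = { "A", "Hello", "Phillip", "", "john", "JOHN", "McDonald", "Anne-Marie" };
        for (String s : samples) {
            // Print each sample next to the result of the check.
            System.out.println("\"" + s + "\" -> " + isValidFirstName(s));
        }
    }
}
```

The first three samples should come back true and the rest false: "A" passes because the lower-case part is allowed to occur zero times, while "McDonald" fails because the D is an upper-case letter after the first position.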
Re: Regex definition unclear...
curmudgeon,
Thanks for a great explanation!
Re: Regex definition unclear...