\d+ does not work with split, giving the same result as just \d
gives me [, 1, +, 2, +, 1, 3]
What do I need to do if I want: [1, +, 2, +, 13] instead?
Thanks in advance
StackOverflow: How to split a string but also keep the delimiters
The accepted answer may be what you're looking for. Applied here, you just use + instead of ; as the "delimiter".
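As a rough sketch of that idea, assuming Java's String.split and the lookaround approach for keeping delimiters (the class and method names here are made up for illustration):

```java
import java.util.Arrays;

class KeepDelimiterSplit {
    // Zero-width split points: just after a '+' or just before a '+'.
    // Nothing is consumed, so the '+' survives as its own element.
    static String[] splitKeepingPlus(String input) {
        return input.split("(?<=\\+)|(?=\\+)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitKeepingPlus("1+2+13")));
        // prints [1, +, 2, +, 13]
    }
}
```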
Thanks. But why doesn't \d+ work?
gives me [3, +4, -, (2, *3, ), ^16] and not [3, +, 4, -, (2, *, 3, ), ^, 16]
Why \d+ doesn't work:
You're using a look-ahead, which is a zero-width assertion: the actual match is the empty "space" between characters, i.e. match.start() = match.end(). A position is followed by one or more digits exactly when it is followed by a single digit, so \d+ inside the look-ahead matches at the same positions as \d. The split algorithm only sees those positions, so it can't tell the difference between the two and you get the same result.
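One way to see this for yourself is to list where each look-ahead matches. This is only a small sketch; the helper name matchPositions and the sample input are my own, not from your code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadPositions {
    // Collects the start index of every match of regex in input.
    static List<Integer> matchPositions(String regex, String input) {
        List<Integer> positions = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            // For a pure look-ahead the match is empty: m.start() == m.end()
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        String input = "1+2+13";
        System.out.println(matchPositions("(?=\\d)", input));  // prints [0, 2, 4, 5]
        System.out.println(matchPositions("(?=\\d+)", input)); // prints [0, 2, 4, 5]
    }
}
```

Both patterns hit index 4 and index 5, which is why 13 still gets cut into 1 and 3.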
That's a rather long and convoluted regex pattern, and I'm not quite sure where to begin with debugging it.
A better solution is to use a character class that matches all the appropriate operators.
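Something along these lines, as a sketch rather than a drop-in fix. I'm assuming the operator set is + - * / ^ and parentheses, so adjust the class to whatever you actually need:

```java
import java.util.Arrays;

class OperatorSplit {
    // Assumed operator set; '-' comes first in the class so it is a literal.
    // Split just after or just before any operator, never between digits.
    static final String AROUND_OPERATOR = "(?<=[-+*/^()])|(?=[-+*/^()])";

    static String[] tokenize(String expr) {
        return expr.split(AROUND_OPERATOR);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokenize("1+2+13")));
        // prints [1, +, 2, +, 13]
        System.out.println(Arrays.toString(tokenize("3+4-(2*3)^16")));
        // prints [3, +, 4, -, (, 2, *, 3, ), ^, 16]
    }
}
```

Because the split points are defined only by the operators, runs of digits are never broken up. One caveat: if the input starts with an operator, Java versions before 8 give you an empty string as the first element.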
Depending on how complex you're planning to get with your tokenizer, it may be better to use a different technique. Rather than trying to split everything with one regex, try parsing the tokens as they come. This way each regex stays simple, is easy to modify, and is specific to the token you're trying to parse.
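A minimal sketch of that approach, assuming only two token types (integers and single-character operators). The class name, pattern names and error handling are just illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SimpleTokenizer {
    // One small pattern per token type (assumed set; add more as needed).
    static final Pattern NUMBER = Pattern.compile("\\d+");
    static final Pattern OPERATOR = Pattern.compile("[-+*/^()]");
    static final Pattern[] TOKEN_TYPES = { NUMBER, OPERATOR };

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (Character.isWhitespace(input.charAt(pos))) {
                pos++; // skip whitespace between tokens
                continue;
            }
            boolean matched = false;
            for (Pattern type : TOKEN_TYPES) {
                Matcher m = type.matcher(input).region(pos, input.length());
                if (m.lookingAt()) { // the match must begin exactly at pos
                    tokens.add(m.group());
                    pos = m.end(); // continue right after this token
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                throw new IllegalArgumentException(
                    "Unexpected character '" + input.charAt(pos) + "' at index " + pos);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("3+4-(2*3)^16"));
        // prints [3, +, 4, -, (, 2, *, 3, ), ^, 16]
    }
}
```

Supporting something new, say decimals or identifiers, then means adding one more pattern to the array instead of reworking a single large regex.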