寂寞好了 发表于 2016104 03:35:01
95
1
Code golfis an interesting concept to me: to solve a programming challenge, using not the most efficient or readable code, or the most stateoftheart solution, but with the smallest code size possible.
The interest comes from the fact, since it's such a counterintuitive way of thinking, it taps into the creative part of our brain, to come up with new and inventive ways of reducing the code.
I recently participated and won one of these challenges, as part of a pixels.campcontest. This post is about how I came up with my solution.
The challenge
The input for our code (originally published herewas a string representing a Polish Notationaddition of two nonnegative integers. For example:
Our code should consist of a list of regular expression substitutions (of the form /match/replacement/). This list would be applied in a loop to the input string, constantly mutating it until the output consisted of a single number, which should equal the result of the addition. So for the above string, we would have to yield 16.
For added complexity, all numbers were represented in octal notation.
Performing an addition using nothing more than regular expressions might seem puzzling at first. But let's see how we can go about this.
The foundations
After seeing the challenge, I immediately remembered something I had seen before, which ended up being the entire core of my solution: incrementing a numbers using regular expressions.
30 seconds later, I had arrived at the same Stack Overflow answer I had seen before.
As it turns out, incrementing a number with regex’s is a known and solved problem. Let’s use the number 9 as our sample input (I won’t worry convert this to octal notation just yet, because we’re all more used to reasoning about decimal numbers).
The algorithm for incrementing 9 is:
1. Adding a lookup table
This table consists of 01234567890. Each digit must be followed by it’s incremented counterpart, which is why we need 0 at both ends. In order to distinguish the actual number from the lookup table, we can separate them using any nonnumeric character. Let’s go with ~(tilde).
So 9becomes 9~01234567890.
2. Leftpadding the input
We’re actually going to increment each digit individually, and we know 9must become 10, which has an extra digit. So we need to safeguard for that by leftpadding our input with zeros, at least when the input starts with a 9.
We now have 09~01234567890.
3. Find out what to increment
Now it’s time to do the actual “math”. For a 09to become a 10, we need to increment both of it’s digits. But this is not the overall rule. For example, for 10to become 11, only the last digit is incremented.
The overall rule here is that we only increment the last digit, and as long as they are 9’s, we also increment the ones before them. In regular expressions, this could loosely be translated to /(\d)9*$/.
4. Looking ahead
To actually make use of this, we need to capture each of the digits we want to increment, and using a positive lookahead, finding it’s occurrence in the lookup table that follows, so that we can also match against the digit right after it. This one looks a bit more complicated:
 /(\d)(?=9*~\d*?\1(\d))/$2/g
复制代码 In more detail:
 Match a digit: (\d+);
 Within the lookahead, match all following 9’s (due to the previously mentioned rule, match the lookup table separator, and an arbitrary number of digits until the original one is matched: (?=9*~\d*?\1);
 Match the digit right after that: (\d);
 the match (which is only the first digit, as the lookahead doesn’t count) with $2, which matched the incremented digit from the lookup table;
 Use the /gmodifier, so that this matches more than once (i.e.: 1999must become 2000, so we need to match all four digits).
5. Taking out the trash
At this point, 09has become 10, so can just get rid of the lookup table with a simple regex, leaving us with the final result.
Hold on. This just incremented a single number. We didn’t sum anything yet!
Adding numbers, one step at a time
With the logic in place for incrementing a single number, we can just as easily deduce the logic for decrementing a number. It’s pretty much the same, with the lookup table inverted. And we also don’t need to add leading zeros.
Now, remember that the regexes we use are called several times, in a loop?This means we can go back to our primary school days, and increment one number while decrementing the other. When the decrementing one reaches zero, we’re done.
So, knowing all of this, my first submitted solution was:
 /\+ (\d+) (\d+)/z$1~012345670 $2#076543210/
 /z(\d+)~\d* 0*#\d+/$1/
 /z7/z07/
 /(\d)(?=7*~\d*?\1(\d))/$2/g
 /(\d)(?=0*#\d*?\1(\d))/$2/g
复制代码 Let’s go through this one step at a time:
 /\+ (\d+) (\d+)/z$1~012345670 $2#076543210/
复制代码 First, I’m adding lookup tables to both digits (using #in the second one to distinguish them. Also note that they only go up to 7, since the actual problem uses octal numbers only. I’m also removing the plus sign in favor of a z, which saved me the backslach and the unnecessary whitespace.
This will match only when the second number is zero, and replace the whole string with the captured first number. Because if the second number is zero, the final result is the first number alone.
Once this matches, all other regexes will no longer match, and the algorithm is over.
Adding a leading zero, if the first number is a 7.
 /(\d)(?=7*~\d*?\1(\d))/$2/g
 /(\d)(?=0*#\d*?\1(\d))/$2/g
复制代码 The first one was already explained in detail. It increments the first number. The second one is mostly equivalent, but it decrements the second one.
And there we go. This will take long to run. N+1iterations, to be more accurate, where Nis equal to the second number. So + 10 7will take 7 iterations. But it will get there.
Squeezing to the last byte
The above solution gets us to 132 characters. But we can go further.
Notice a few things about the whole solution:
 /\+ (\d+) (\d+)/z$1~012345670 $2#076543210/
 /z(\d+)~\d* 0*#\d+/$1/
 /z7/z07/
 /(\d)(?=7*~\d*?\1(\d))/$2/g
 /(\d)(?=0*#\d*?\1(\d))/$2/g
复制代码
 \dmatches a digit. But the contest creators were nice to us, and told us we could assume the input is valid. So we can just replace this with a .(dot), which matches any single character; Same thing for the \+;
 The zplaceholder was actually an oversight on my part. We can remove it altogether, and match with the beginning of the string ^instead;
 Without that zto worry about, we can now more cleverly handle the second regex, where we replace the whole string with the value of the first number we capture. It can now be written as /~.* 0*#.+//(i.e.: delete everything starting at the tilde, as long as the second number is zero)
 The last two expressions look intriguingly alike. Let’s use the pipe operator to merge them into a single one: /(.)(?=(7*~0*#).*?\1(.))/$3/g
This leaves us with the final, 92 character long solution:
 /. (.+) (.+)/$1~012345670 $2#076543210/
 /~.* 0*#.+//
 /⁷/07/
 /(.)(?=(7*~0*#).*?\1(.))/$3/g
复制代码 Conclusion
As useless as this may seem (and, well, it probably is), I'm fascinated by the thought process required by exercises such as this one. It's a cool feeling, to poke my brain with different sticks from time to time.
If you're also attending pixels.campthis week let me know, or reach me out during the event. I'd love to chat, and I'll have stickers! 
