Saturday, April 27, 2013

2013 Google Code Jam Round 1A

Well that was a bust.  I figured out how to do the first problem within the first 15 minutes, and spent the rest of my time screwing it up.  The problem is as follows...


Maria has been hired by the Ghastly Chemicals Junkies (GCJ) company to help them manufacture bullseyes. A bullseye consists of a number of concentric rings (rings that are centered at the same point), and it usually represents an archery target. GCJ is interested in manufacturing black-and-white bullseyes.

Maria starts with t millilitres of black paint, which she will use to draw rings of thickness 1cm (one centimetre). A ring of thickness 1cm is the space between two concentric circles whose radii differ by 1cm.
Maria draws the first black ring around a white circle of radius r cm. Then she repeats the following process for as long as she has enough paint to do so:
  1. Maria imagines a white ring of thickness 1cm around the last black ring.
  2. Then she draws a new black ring of thickness 1cm around that white ring.
Note that each "white ring" is simply the space between two black rings.
The area of a disk with radius 1cm is π cm2. One millilitre of paint is required to cover area π cm2. What is the maximum number of black rings that Maria can draw? Please note that:
  • Maria only draws complete rings. If the remaining paint is not enough to draw a complete black ring, she stops painting immediately.
  • There will always be enough paint to draw at least one black ring.

My Solution

The key here is writing a generalized expression relating area to number of rings...  Lets start with the area for the ith ring... this will be the area of the circle with a radius going to the outside of the black paint, minus the area of the paint just inside the black line...

Equation 1: Area of ring #i

Here, R is the radius just at the inside of the ith ring, and we know that the ring radius will be one centimeter larger.  Next, lets formalize the radius with respect to the ring number...

Equation 2: Incorporating formalized Ri
Now all we need to do is sum all N of our rings... which means summing from 0 to N-1, the way we formulated our answer...

Equation 3: Total Area of paint as a function of N rings
Equation 3 gives us a total area... and the problem specifies that 1 ml of paint covers an area of pi... we get a final answer as follows...
Equation 4: Quadratic equation for number of rings that can be painted
So this is where I made my big mistake... I have a quadratic equation that I can stuff into the Quadratic Formula... so that is what I did.  Convert everything to double precision and solve.  Then the number of rings would be the floor of the solution to this equation.  But double precision runs into rounding errors... I'd get answers that would be correct MOST of the time, and then fail for edge cases where rounding matters, and you can bet Google checks for rounding.

However, I was convinced that I had made an algebra mistake, and so I went through this math several times, trying to figure out if I had an off by one error.  I was left with only a few minutes to work through the rest of this problem, when it occurred to me that left in the final form in equation 4, I could use a binary search to finish the search in a way that wouldn't induce round-off errors.

Unfortunately, my first attempt at the binary search for the small set didn't pass, and at this point, the round ended.  I'm pretty much convinced that my approach was correct, but I got caught in the technical details of the solution, which killed me.  I'll be back next week, trying to get through to Round 2.  But I'll have to do a lot better than this.  None the less, Google Code Jam is always my favorite programming event of the year, and this is no exception.  Cheers, Google!

Saturday, April 13, 2013

Google Code Jam Qualification Round

Today was the qualification round for the 2013 Google Code Jam... always my favorite programming contest.  I decided to make an attempt at the first three problems, and didn't attack the last problem, given the points I figured a depth first search was too naive to work.

The first problem was a 4x4 tic tac toe game, with wildcard squares.  The goal was just to determine if there was a winner, or if it was a draw, or if the game is not done yet.  I've done many chess games based on bitmasks, so I decided to use that approach here.  I created the masks by hand, they are easy enough to figure out.  The problem is then solved with 10 bitwise checks per side, and an additional draw check.

The second problem was about figuring out if a given lawnmower pattern could be achieved given certain rules... first, the lawnmower could only make straight line cuts perpendicular to one of the edges.  Second, the height of the blade could only be set at the start of an edge.  The lawnmower had to be driven all the way across the grass, at which point you could drive the lawnmower across from another edge starting point.  The key insight was that for any given patch of grass, the height had to be greater than or equal to the max height in the row, or the max height in the column.  Finding the max for each row and each column has to be done only once, and then once through each patch to do the check, leads to O(n*m) time.

The third problem was sneaky .. but I've been messing around with palindromes for fun with some coworkers, so we had figured out how to get them efficiently as a sequence.  The problem asks us to find all numeric palindromes which are also squares of numeric palindromes.  So for instance, 121 is a palindrome, which is 11^2, where 11 is also a palindrome.  The hard part about this problem is the interval bounds.  The small set is only squares less than a thousand, but the middle set could be all such numbers from 0 to 10^14, and the large set was numbers from 0 to 10^100.  I gave up on the large data set immediately, figuring that it would take some mathy trick to get it to pass.  The key insight for the other two data sets is first recognizing that instead of looking for numbers to be taking the square root of, you start with smaller palindromes, and check to see that the squares are also palindromes.  This means we really are only searching for numbers less than 10^7, which is much more tractable.  In fact, there are only 10999 palindromes less than 10^7, so I went with the approach of just finding all of the requested numbers in the full range up front.  Then, for each interval, just test how many in the list fall within the bounds.

I didn't attempt the last problem, but I did notice a couple of things that could have helped improve the performance of a solution.  This problem was described as having a set of locked chests, each which may contain some keys, and each chest can only be unlocked by certain classes of keys.  You start with at least one key, and need to check if you can unlock all the chests.

If you go with a depth first search approach, don't include the empty boxes in the initial search.  They are outside the critical path, and could be added back in to the critical sequence after the fact, maintaining the "lexigraphical order".  Second, at each node of the search, you should probably check if there are enough reachable keys left to unlock any given box... this is a more directed search, and could cut off much more quickly.  Third, it may be possible to identify if the graph of chests is fully connected graph up front, and immediately fail without searching for an unlock order if it isn't connected.  Other than that, I didn't take a crack at this one.

I hope everyone is enjoying the code jam so far, and am interested in hearing other approaches.

Monday, April 8, 2013

A fixed capacity cache with an LRU eviction policy in C#

One of the critical strategies in creating a system that performs well at high scale is effective caching.  In distributed systems, caching can be your best defense against overloading a database, and an effective solution to temporary database failures, so you can keep servicing requests during downtime.  Our services often provide a cache timeout per endpoint, depending on how fresh the data needs to be for the system to function reliably.  However, the cache size can grow unwieldy if there are many unique calls to an endpoint in a short period of time, when only a subset of the calls are likely to benefit from the caching.

The solution is a cache of a fixed size, which evicts items using a least recently used policy when capacity is reached.  Requirements are that the cache support O(1) inserts and reads, and manage the freshness of the data internally.  The user should be able to specify a capacity of the cache, as well as a staleness timeout.  The interface in my example would be a string-string key value store, with functions for set(key, value), and get(key).  Expired or non-existing items return null strings.

What makes this a fun problem is understanding how to mix data structures to achieve these goals.  Constant time lookups and inserts mean you want a hash table to be your primary storage.  To maintain an order, we need to store our nodes in a list... but it is important to make sure removing nodes from the list, as well as adding either to the head or tail of the list occur in constant time.  A dequeue is appropriate for this goal.  In my design, I'll store the dequeue nodes in the hash table indexed by the key, and a tuple containing the key, value, and timeout inside the dequeue nodes.

My solution is below: