Please refer to the strawberry fields problem on this page.

Analysis
Think of each strawberry as a point on an XY plane. We can then find the concave hull, which is the smallest polygon that contains all the points (strawberries). Assume that for some given strawberry field the concave hull is something like this. Now walk through the vertices of the concave hull in succession. For each pair of successive vertices, find the point of intersection that is outside the concave hull. This results in an updated polygon as shown. Finally, pick the largest edge and, of the two edges that precede and succeed it, pick the larger one. Form a rectangle between the largest edge and the selected neighboring edge. This is depicted here. Now choose all points of the concave polygon that lie outside the rectangle we just derived. Add to this set the point that was derived as a result of forming the rectangle. Repeat the process of forming a rectangle until all points of the concave polygon are covered. For the sample, the final result is shown.

Algorithm
The steps of the algorithm are enumerated below.
Please refer to the setless set problem on this page.

Analysis
It can be easily deduced that the collection of 81 cards contains 27 SETs where each set contains the same cards. Now, in order to find the number of ways in which we can collect 20 cards such that the collection contains no set, we can use the following formula: $$ N_{20\,cards\,without\,set} = N_{all\,possible\,collections\,of\,20\,cards} - N_{collection\,of\,20\,cards\,with\,atleast\,1\,set}$$

Find out sets with different cards
If \(n_{same}\) is the number of sets with same cards, we already know that $$n_{same} = 27$$ If we want to form a set that contains different cards, we can first pick an equal set and then pick a card from that set. Repeating this thrice gives us a set that contains different cards. Thus the total number of sets with different cards, \(n_{different}\), can be defined as $$n_{different} = (27 * 3) * (26 * 3) * (25 * 3) $$ The total number of possible sets, \(n_{sets}\), can then be defined as: $$n_{sets} = n_{same} + n_{different}$$ Now we can easily find the number of ways to pick k sets from \(n_{sets}\). A collection of 20 cards will contain a minimum of 1 set and a maximum of 6 sets. Thus the number of ways to collect at most 6 sets can be defined as $$n_{6\,sets} = \sum_{k = 1}^{6} \binom {n_{sets}} {k}$$ Thus the number of ways N in which we can collect 20 cards that do not contain a set is $$ N = \binom {81} {20} - n_{6\,sets}$$

Implementation
We can now write code that
Please refer to the palindromic pangram problem on this page.

Analysis
Let \(A_{p}\) represent a word and let \(A_{pr}\) represent its reverse. Let \(A_{1}\), \(A_{2}\), ..., \(A_{n-1}\), \(A_{n}\) be consecutive pieces of \(A_{pr}\) such that all of \(A_{1}\), \(A_{2}\), ..., \(A_{n-1}\) are valid words, while \(A_{n}\) could be a valid word but must be a palindrome. Then we can say that the following is a pangram for word \(A_{p}\): $$Pangram_{A_{p}} \leftarrow \sum_{i = 1}^{n-1} A_{i} + A_{p}$$ Let us verify that this holds true with the following two examples cited in the problem statement: $$Pangram_{daffodil} \leftarrow lid + off + a + daffodil$$ where \(A_{n}\) = 'd', which is a palindrome but not a valid word! And, $$Pangram_{ayatollahs} \leftarrow shallot + ayatollahs$$ where \(A_{n}\) = 'aya', which is a palindrome but not a valid word! It can be easily observed here that \(A_{1}\), \(A_{2}\), ..., \(A_{n}\) are each a prefix of what remains of \(A_{pr}\) once the earlier pieces are removed. Or we can think of each of them as a substring q of the string \(A_{pr}\). We also know that a suffix tree is a good data structure for string matching!

Algorithm
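The decomposition from the analysis can be sketched in Python (rather than the Lisp used elsewhere on this blog). This is only a minimal illustration under stated assumptions: the function name `palindromic_prefix` and the `dictionary` argument are hypothetical, and a plain set lookup with backtracking stands in for the suffix tree mentioned above.

```python
def palindromic_prefix(word, dictionary):
    """Return words A_1..A_{n-1} such that A_1 + ... + A_{n-1} + word
    reads the same in both directions, or None if no split is found.

    `dictionary` is a set of valid words (hypothetical input format).
    """
    def split(rest):
        # The leftover piece A_n only has to be a palindrome.
        if rest == rest[::-1]:
            return []
        # Try longer dictionary words first, backtracking on failure.
        for end in range(len(rest), 0, -1):
            if rest[:end] in dictionary:
                tail = split(rest[end:])
                if tail is not None:
                    return [rest[:end]] + tail
        return None

    # Work on the reversed word A_pr, as in the analysis.
    return split(word[::-1])


pieces = palindromic_prefix("daffodil", {"lid", "off", "a"})
print(pieces, "".join(pieces) + "daffodil")
```

The two examples from the problem statement decompose the same way as in the analysis: 'daffodil' leaves the palindrome 'd' and 'ayatollahs' leaves 'aya'.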
Please refer to the mystery M function problem on this page.

Analysis
(defun m (i j k)
Let us look at the exit conditions for this recursive function ’m’. One of the exit conditions is when k equals zero. Now look at the recursive call. The recursive call fetches a value of k by replacing it with a call to ’m’ while decrementing k. As such, k will reach 0, at which point the call returns 1. So it can be seen that the terminating condition of k equals zero returns 1, which is then used as a value for k itself.

Evaluation
The evaluation will never terminate!
Please refer to the landmarks web app problem on this page.

Analysis
The key part to solve here is picking the nearby landmarks. As such, we need to define what nearby means. Since each landmark’s latitude and longitude are defined, for a given landmark lm whose latitude and longitude are \(lm_{lat}\) and \(lm_{long}\) respectively, we can define a nearby landmark nlm with latitude and longitude \(nlm_{lat}\) and \(nlm_{long}\) as one that satisfies any of the following:
The nearest \(nlm_{lat} < lm_{lat}\), OR
The nearest \(nlm_{lat} > lm_{lat}\), OR
The nearest \(nlm_{long} < lm_{long}\), OR
The nearest \(nlm_{long} > lm_{long}\)
In other words, it is the nearest landmark in any direction from the given landmark, taking the given landmark as the center. If we now think of a landmark as a vertex on a graph, then the neighboring vertices of a given landmark are those landmarks that satisfy the above definition.

Algorithm
Build a BTree where the key is latitude and another where the key is longitude. Now do an inorder traversal of each BTree while adding each landmark as a key to a hash table and its nearest landmarks as values. Then, given a landmark, we just look it up in the hash table to find the nearest landmarks.

Implementation
In order to make the web application responsive, it would be nice to load the page with a JavaScript hashtable of landmarks and their nearby landmarks.
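The lookup table described above can be sketched in Python as follows. It is a minimal illustration, not the implementation from the post: sorted lists stand in for the two BTrees, and the function name `nearby_landmarks` and the `(name, latitude, longitude)` tuple format are assumptions made for the example.

```python
def nearby_landmarks(landmarks):
    """Map each landmark name to its nearest neighbours along each axis.

    `landmarks` is a list of (name, latitude, longitude) tuples
    (a hypothetical input format).
    """
    nearby = {lm[0]: set() for lm in landmarks}
    for axis in (1, 2):  # 1 = latitude, 2 = longitude
        # Sorting plays the role of the inorder BTree traversal.
        ordered = sorted(landmarks, key=lambda lm: lm[axis])
        for i, lm in enumerate(ordered):
            # Nearest landmark with a strictly smaller coordinate.
            j = i - 1
            while j >= 0 and ordered[j][axis] == lm[axis]:
                j -= 1
            if j >= 0:
                nearby[lm[0]].add(ordered[j][0])
            # Nearest landmark with a strictly larger coordinate.
            k = i + 1
            while k < len(ordered) and ordered[k][axis] == lm[axis]:
                k += 1
            if k < len(ordered):
                nearby[lm[0]].add(ordered[k][0])
    return {name: sorted(names) for name, names in nearby.items()}


table = nearby_landmarks([("A", 0.0, 0.0), ("B", 1.0, 5.0),
                          ("C", 2.0, 1.0), ("D", 3.0, 3.0)])
print(table["A"])
```

The resulting dictionary is the kind of table that could be serialized and shipped to the page, so that each lookup is a single hash access.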
Please refer to the decrypting the two time pad problem on this page.

Analysis
The following are the problem characteristics. It is an instance of:
Algorithm
Implementation
I have not implemented this algorithm. But the analysis and the requirements of the problem do indicate that an approximate solution is required, and its quality will hence depend upon the chosen reference text (drawn from the algorithm). As such, we may have to try out the implementation with multiple chosen texts.

References

Please refer to the ascii amaze problem on this page.

Analysis
A maze is a grid of n * m cells. Given a maze, we can detect from a given cell k which cells you can move to next. This is the definition of a directed graph. Thus we can model a maze as a directed graph where each cell of the maze is a vertex and its neighbors are the cells you can navigate to from the given cell. The start and end cells are defined as the bottom left and top right cells. Counting the lowest row as the 1st row and the leftmost column as the 1st column, and assuming m rows and n columns in the maze, the start cell is (1, 1) while the end cell is (m, n). The problem then becomes one of finding the shortest path between the start and end cells.

Algorithm

Please refer to the addagram problem on this page.

Analysis
Given a 3 letter word, let us assume we can find the 4 letter words that are its addagrams. We can then expand this to an n letter word pointing to its (n + 1) letter addagrams. We can obviously model this as a directed graph where each word points to its addagrams. Further, since there is no possibility of an addagram pointing back to its possible source, this will be an acyclic graph. Thus we have a directed acyclic graph representing the addagrams.

Algorithm
The problem then is one of finding the ’longest path’ in the directed acyclic graph. In our case each edge will have a weight of 1.
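The longest-path idea can be sketched in Python as below. This is a sketch under assumptions rather than the post's own code: it works on sorted-letter signatures and finds predecessors by deleting one letter, which is a choice made for this example, and the name `longest_addagram_chain` and the `start_len` parameter are hypothetical. Because every edge goes from a shorter word to a longer one, processing words in order of length is already a topological order.

```python
def longest_addagram_chain(words, start_len=3):
    """Return one longest chain in which each word adds a letter to the last."""
    # Keep one representative word per multiset of letters (signature).
    by_sig = {}
    for w in words:
        by_sig.setdefault("".join(sorted(w)), w)

    # best[sig] = (chain length ending at sig, predecessor signature or None)
    best = {}
    for sig in sorted(by_sig, key=len):  # shorter words first
        if len(sig) == start_len:
            best[sig] = (1, None)
        # A predecessor is the signature with one letter removed.
        for i in range(len(sig)):
            prev = sig[:i] + sig[i + 1:]
            if prev in best and best[prev][0] + 1 > best.get(sig, (0, None))[0]:
                best[sig] = (best[prev][0] + 1, prev)

    if not best:
        return []
    # Walk back from the signature with the longest chain.
    sig = max(best, key=lambda s: best[s][0])
    chain = []
    while sig is not None:
        chain.append(by_sig[sig])
        sig = best[sig][1]
    return chain[::-1]


print(longest_addagram_chain(
    ["ail", "tail", "trial", "retail", "realist", "cat", "cart"]))
```

Since all edges have weight 1, the longest path is simply the chain with the most words, so no general weighted longest-path machinery is needed.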
First we build a hash table where the key is the word length and the values are the words of that length. Then we build the directed acyclic graph such that each vertex is a word ’w’ and its neighbors are its addagrams, found by searching the hash table among the words that are one character longer than ’w’.

Please refer to the word numbers problem on this page. Before I even explain how I tried to solve this problem, you can download and install the code from my bitbucket wordnums repository. After you have tried finding numbers and are convinced that this application does solve the problem, you may continue reading if you are interested in the details of the solution.

Analysis
The characteristics of the problem are:
Finally, we need to convert a given number into its word equivalent. The algorithm to do the conversion is described next.

Converting a number into its word equivalent
It can be observed that any number is a repetitive pattern of 1 to 999, scaled by thousand for the next pattern, by million for the one after, and so on. In our case, we scale only up to millions. Thus it suffices to generate the word equivalent of a 3 digit number between 1 and 999 and apply the scaling factor of thousand or million based upon the number of times the pattern is repeated. Further, for a 3 digit number, any digit that occurs in the units or hundreds place is a simple lookup from the numeric form of the digit to its word form. However, the translations for the digit in the tens place are best described as quirky: a 1 in the tens place means ten if followed by 0 in the units place, eleven if followed by 1, and so on. The digits 2 to 9 in the tens place are less quirky, as the conversion depends only on the digit itself and not on the digit in the units place, as is the case for 1 in the tens place. Thus it is now possible to generate the word equivalent of any number by splitting it into patterns matching a 3 digit number and using the appropriate scale, such as thousand and million. The code that generates a word equivalent is shown here.
(defun joinwords (words)

How about brute force?
Having identified that an indexed btree is a suitable data structure for our needs, and having already figured out the generation of a word equivalent, it would be straightforward to generate an indexed btree of a billion numbers. We can then traverse the tree in sorted order to update the index, which is the number of letters covered. Then a simple lookup using the index should suffice. However, this approach suffers from the time needed to build an indexed, sorted btree, as we have a billion numbers. The time depends upon the processing power and memory available on the machine that builds the data structure.
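To make the brute-force idea concrete, here is a minimal Python sketch that only scales to small ranges. It assumes the word form has no spaces, hyphens, or 'and' (matching forms such as 'onethousand' used in this post); the names `words_below_1000`, `to_words`, and `brute_force_letter` are hypothetical and are not the functions from the wordnums repository.

```python
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def words_below_1000(n):
    """Word form of 0..999 (empty string for 0), the basic 3 digit pattern."""
    hundreds, rest = divmod(n, 100)
    out = ONES[hundreds] + "hundred" if hundreds else ""
    if rest < 20:  # units and the quirky 10..19 range are plain lookups
        return out + ONES[rest]
    tens, units = divmod(rest, 10)
    return out + TENS[tens] + ONES[units]


def to_words(n):
    """Word form of 1..999,999,999 built from scaled 3 digit patterns."""
    millions, rest = divmod(n, 1_000_000)
    thousands, ones = divmod(rest, 1000)
    out = ""
    if millions:
        out += words_below_1000(millions) + "million"
    if thousands:
        out += words_below_1000(thousands) + "thousand"
    return out + words_below_1000(ones)


def brute_force_letter(position, limit):
    """Sort the words for 1..limit and return (number, running sum) for the
    number whose word contains the given 1-based letter position."""
    covered = total = 0
    for word, n in sorted((to_words(n), n) for n in range(1, limit + 1)):
        covered += len(word)
        total += n
        if covered >= position:
            return n, total
    raise ValueError("position lies beyond the concatenated text")


print(to_words(911610034))
print(brute_force_letter(10, 10))
```

Sorting `limit` words costs time and memory proportional to the range, which is exactly why this approach does not stretch comfortably to a billion numbers on one machine.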
Of course, we could use a large number of machines (or multiple cores) so that we can build the data structure faster. Even in that case, we still have to ensure that we distribute the numbers evenly between the machines so that all parallel computations complete around the same time! But would there not be an elegant way out? It turns out there is!

Elegant way out
We have already identified that there is a repetitive pattern of digits. We used this property to generate a word equivalent. The same property should then be helpful in the lexicographical ordering of the words. Is this true? Let us find out. If we sort just the first 999 numbers, the sorted list (A) looks something as shown below: Now consider the numbers up to 999,999. These are actually 999 groups of numbers, where the groups start with [1000; 2000; 3000; ...; 999,000]. Their word equivalents are [onethousand, twothousand, threethousand, ..., ninehundredninetyninethousand]. These word equivalents can be generated by adding the suffix ’thousand’ to the word equivalents of the numbers from 1 to 999, which is our pattern (building block). So the sort order for the numbers up to 999,999 remains the same when expressed as groups of numbers as described above. Further, each such group contains an additional 999 numbers, which happen to be 1 to 999. The same argument holds true for numbers up to 1 billion, except that a group in this case contains 1 million numbers. Overall, the three sorted lists A, B, and C, of multiples of one, thousand, and million respectively, are shown below. Since any number is made up of a millions part, a thousands part, and a hundreds part, we just need to know the lexicographical position of the successive scales, starting from the highest scale. Thus for a number that is a multiple of a million, I need to find the sorted order position of the millions part of the number, followed by that of the thousands part and then that of the hundreds part. As we have a billion numbers to look at, we need to maintain the following 3 databases:
Now if we can precompute the number of letters counted and the sum, we can have an elegant solution to our problem.

Precomputing the sum and number of letters
Since we know the word equivalent of a number, we also know its length. For a number that represents a multiple of one, its length can be denoted as \(l_i\). A multiple of 1 is the smallest scale and we need to look no further. So the overall sum of all letters in the word equivalents of numbers that are multiples of 1 can be represented as: $$l_{ones} = \sum_{i=1}^{999} l_i$$ Now let us consider a number that represents a multiple of thousand. Let the length of its word equivalent be denoted by \(l_t\). Since such a group contains a further 1000 numbers, the length for the group can be described as $$l_{tg} = l_t + l_{ones}$$ So the total number of letters covered up to 1 million can be denoted as $$l_{thousands} = \sum_{g=1000}^{999000} l_{tg}$$ Similarly, for a multiple of millions, the formula is $$l_{mg} = l_m + l_{thousands}$$ Here we do not need to compute the overall summation, as we are scaling only up to a billion. Along the same lines, the sum of the numbers covered can also be computed. $$s_{thousand} = \sum_{i=1}^{999} i$$ $$s_{million} = \sum_{i=1}^{999999} i$$ $$s_{tg} = t + s_{thousand}$$ where t = (1000 or 2000 or ... 999,000) $$s_{mg} = m + s_{million}$$ where m = (1,000,000 or 2,000,000 or ... 999,000,000)

Algorithm
Now we can describe the entire algorithm. As a first step, we build the 3 databases which contain the multiples of one, thousand, and million. Each database is an indexed btree. Next we update the btree which contains the multiples up to a million with the sum and letters covered, using the formulae described in the previous section.
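As a small illustration of the first step, the sketch below builds only the ones database (1 to 999) in Python, with running totals of letters and numbers, and looks up a row by letter index. It is a sketch under assumptions: a sorted list with `bisect` stands in for the indexed btree, the word form has no spaces or 'and', and the names `build_ones_database` and `find_by_letters` are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate

ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def words_below_1000(n):
    """Word form of 1..999 without spaces, hyphens, or 'and'."""
    hundreds, rest = divmod(n, 100)
    out = ONES[hundreds] + "hundred" if hundreds else ""
    if rest < 20:
        return out + ONES[rest]
    tens, units = divmod(rest, 10)
    return out + TENS[tens] + ONES[units]


def build_ones_database():
    """Rows sorted by word, plus running letter counts and running sums."""
    rows = sorted((words_below_1000(n), n) for n in range(1, 1000))
    letters = list(accumulate(len(word) for word, _ in rows))
    sums = list(accumulate(n for _, n in rows))
    return rows, letters, sums


def find_by_letters(index, rows, letters, sums):
    """Return (number, letters covered, sum) for the first row whose
    running letter count is at least `index`."""
    i = bisect_left(letters, index)
    if i == len(rows):
        raise IndexError("index lies beyond the ones database")
    return rows[i][1], letters[i], sums[i]


rows, letters, sums = build_ones_database()
print(letters[-1], sums[-1])
print(find_by_letters(6, rows, letters, sums))
```

The last entry of `letters` is the quantity written above as \(l_{ones}\), and the last entry of `sums` is the sum of 1 to 999.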
Now, given the number of letters covered, which is actually an index into our database, we look up the number starting with the millions database using this index. We find the first node n in the btree such that, if the index stored on this node is \(k_{found}\), $$k_{found} \geq index$$ If \(k_{found}\) equals the index, we have found our number. Else, we navigate to the previous node in the tree and check whether that is the node we are looking for by computing the number of letters covered. Otherwise, this node forms a part of the number and serves as a key into the next database, which is the thousands database or the ones database, depending on the scale of the current node. Before searching in the next database, we update that database with the letters and sums on each node, as the key is a prefix to each number in the next database. Every node of the binary tree contains the following data:
(defun findinteger (idx db &optional (n 0) (sum 0) (vals (list 0 0)))
A visual depiction of the algorithm is shown below. Since we use an indexed BTree, the efficiency of the algorithm while creating and updating a btree is $$O(n \log n)$$ while looking up the number is $$\approx O(\log n)$$ Hence the overall efficiency is: $$O(n \log n)$$

Disclaimers
The solution is tested on SBCL 1.0.39 on Mac OS X 10.6 using clelephant as the btree implementation. I would like to emphasize that the solution was not generated in the sequence described above. I did some detailed whiteboarding, tried out code snippets, and finally figured out the solution. Further, the solution uses only \(999 * (3 + 2 + 1) = 5994\) nodes, so an in-memory version of a binary search tree should suffice as well. Since I had a ready to use disk based indexed BTree implementation, I chose to use that.

What is the number at the 51 billionth letter?
As per my solution, there is no number that ends at 51 billion. In fact, there is a number that ends at 51 billion and one, and that number is 676465328. The sum of numbers up to that point is 206392395198132239862723. From the code run at the REPL:
* (wordnums:findat 51000000001)

Please refer to the sling blade problem on this page. Before I even explain how I tried to solve this problem, you can download and install the code from my bitbucket oltitles repository. After you have tried finding overlapping titles and are convinced that this application does solve the problem, you may continue reading if you are interested in the details of the solution.

Analysis
The problem requires us to find the longest chain of overlapping titles. The following can be identified from the problem details:
Thus we have a directed graph to represent the titles. Now the problem of finding the longest chain gets transformed into the problem of finding a path within this graph which covers as many vertices as possible. At least theoretically, the longest chain will then be the path that covers each vertex exactly once, since we are not supposed to repeat titles. While prototyping all the different ideas and approaches in code, I used the cl-graph library by Gary King. So wherever you read ’I tried’, it means ’tried in code’. I have not shown all the code, only the part that I chose as a solution.

Traveling Salesman Problem
The problem of finding the path in a graph which covers each vertex exactly once is the traveling salesman problem, and we know it is an NP-complete problem. In our case, since we have directed edges where the weight of each edge is 1, it becomes the Hamiltonian cycle problem. But we have a directed graph at hand. A heuristic to find the Hamiltonian path is to find the minimum spanning tree of a graph and then find the shortest paths between the vertices of the tree ordered by depth first traversal. However, this works only for undirected graphs. Anyway, how do you find a spanning tree of a directed graph? This analysis clearly shows that we cannot have a chain that covers all titles unless it is an undirected graph.

Attempt 1: Assuming an undirected graph
Going by the previous argument, I tried with an undirected graph and produced an ordering of the vertices by depth first traversal of the minimum spanning tree. While finding the paths between these vertices, I actually look at the directed version of the graph. The problem with this approach is that you find a long chain only if you reach a leaf vertex late enough in the cycle. This is evident from the visual below. Hence this transformation is not suitable for our problem.
Again, we could actually find the optimum branchings (minimum spanning tree) of a directed graph, but the vertices obtained from the branching could still suffer from the same problem. So I dropped this idea.

Attempt 2: Topological Sort
As a first step, I removed from the graph all the edges that would result in the creation of cycles. Now I have a directed acyclic graph, on which you can do a topological sort. I then tried to find the path between two vertices at the opposing ends of the sorted list of vertices, as shown in the visual below. This usually produced a chain of around 20 titles. Ideally, we must try to find the path between the pairs of vertices ordered by their position in the topological sort. Still, attempting to find the path by successively moving deeper into the sorted list at both ends gave me a fair enough idea of whether to follow this approach or not. Not happy with the results, I abandoned this idea as well.

Attempt 3: Cyclic Approach
Finally, this is what I came up with. I first found out whether the directed graph is connected. It was disconnected. So I found the largest connected component and chose this component to find the longest path. Next I checked whether it was cyclic or not. It was. So I found all the vertices which belong to a cycle. Next was to find a path for each cyclic vertex such that the path contains only cyclic vertices. The largest such cyclic path would be my answer to the longest chain. This is the property that I applied here usefully: in order to prove a vertex is cyclic, we can determine a path where the vertex in question is the start as well as the end node. The corollary of this is that every vertex on the cyclic path is cyclic itself. As such, we can find a path of overlapping titles where each vertex on the path is cyclic, because for a given cyclic vertex, one of its neighbors has to be cyclic! Implementing this, I was able to find a chain of 230 titles.
Thus, by finding the longest cyclic path that contains only cyclic vertices, I was able to find a heuristic solution to the longest chain. We can also claim that the longest chain which is a cyclic chain cannot be longer than the total number of cyclic vertices.

Algorithm
The algorithm can thus be described as: Build a directed graph of the titles where each directed edge (u, v) represents an edge from title ’u’ to overlapping title ’v’. Then find all the cyclic vertices of the largest connected component of the directed graph. Find the cyclic path for each cyclic vertex such that the path contains only cyclic vertices. The longest such path is the longest chain of overlapping titles, given the heuristics.
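The steps above can be sketched in Python as follows. This is a simplified illustration and not the oltitles code: the graph is assumed to be given as a dictionary mapping each title to the list of titles it overlaps into, the step of selecting the largest connected component is left out, the path is extended greedily without backtracking, and the names `is_cyclic` and `longest_cyclic_chain` are hypothetical.

```python
def is_cyclic(graph, start):
    """True if a path of one or more edges leads from start back to start."""
    stack, seen = list(graph.get(start, ())), set()
    while stack:
        v = stack.pop()
        if v == start:
            return True
        if v not in seen:
            seen.add(v)
            stack.extend(graph.get(v, ()))
    return False


def longest_cyclic_chain(graph):
    """Longest greedy path that visits only cyclic vertices, each once."""
    cyclic = {v for v in graph if is_cyclic(graph, v)}
    best = []
    for start in sorted(cyclic):
        path, used = [start], {start}
        while True:
            # Greedy step: take the first unused cyclic neighbour.
            nxt = next((w for w in graph[path[-1]]
                        if w in cyclic and w not in used), None)
            if nxt is None:
                break
            path.append(nxt)
            used.add(nxt)
        if len(path) > len(best):
            best = path
    return best


titles = {"a": ["b"], "b": ["c", "x"], "c": ["a"], "x": ["y"], "y": []}
print(longest_cyclic_chain(titles))
```

Because the extension step only scans the current vertex's neighbours against the set of cyclic vertices, the expensive part is the cyclic test itself, which runs one search per vertex.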
Since the main task is to find the cyclic vertices (using depth first search), let us assume we have a graph G = (V, E) which represents the largest connected component. Since we check whether each vertex in V is cyclic, the efficiency turns out to be: \( O(V \cdot (V + E)) \approx O({V}^{2}) \) When we find the cyclic path, we do not do a depth first search and only look up in the list of cyclic vertices for the next valid vertex for the given vertex
Me
I am a polyglot software engineer specializing in shipping iOS and 3D scientific visualization applications.

Archives
December 2011
