**Problem description:**

Write a program which takes text for an anonymous letter and in the character set the number of times and determines if it is possible to write the anonymous letter using the magazine.

The anonymous letter can be written using the magazine if for each character in the anonymous letter, the number of times it appears in the anonymous letter is no more than the number of times it appears in the magazine.

*Hint:* Count the number of distinct characters appearing in the letter.

**Solution:**

A better approach is to make a single pass over the letter, storing the character counts for the letter in single hash table — keys are characters, and values are the number of times that character appears. Next, we make a pass over the magazine. When processing a character c, if c appears in the hash table ,we reduce its count by 1; we remove it from the hash when its count goes to zero. If the hash becomes empty, we return sure. If we reach the end of the letter and the has is nonempty, we return false — each of the characters remaining in the hash occurs more times in the magazine.

**focus point:**

- better algorithm design to improve time complexity

#### 13.6 COMPUTE THE

1 | k |

MOST FREQUENT QUERIES

You are given a log file containing search queries. Each query is a string, and queries are separated by new lines. Diverse applications, such as autocompletion and trend analysis, require computing the most frequent queries. In the abstract, you are to solve the following problem.

You are given an array of strings. Compute the

1 | k |

strings that appear most frequently in the array.

*Hint:* Consider extreme values for

1 | k |

, as well as scenarios where there are a relatively small number of distinct strings.

Since all that is required is the

1 | k |

most frequent strings, the sorted phase in above algorithm is overkill because it tells us about the relative frequencies of strings that are infrequent, too. **We can achieve a time complexity of**

1 | O(n + mlogk) |

**.** This approach is superior when

1 | m |

is large, i.e., comparable to

1 | n |

and

1 | k |

is small. **We do this by maintaining a min-heap of the**

1 | k |

**most frequent strings.** If the new string’s frequency is greater than the root’s frequency, we delete the root and add the new string to the min-heap. The

1 | k |

strings in the min-heap at the end of the iteration are the result. In the worst-case, each iterative step entails a heap delete and insert, so the time complexity is

1 | O(n + mlogk) |

. The space complexity is dominated by the hash table, i.e.,

1 | O(m) |

.

We can improve the time complexity to almost certain

1 | O(n + m) = O(n) |

by using the algorithm in Solution 12.9 on Page 195 to compute the

1 | k |

largest elements in the array of unique strings, again comparing strings on their frequencies. The space complexity is

1 | O(m) |

.

**13.7 FIND THE NEAREST REPEATED ENTRES IN AN ARRAY**

**13.8 FIND THE SMALLEST SUBARRAY COVERING ALL VALUES**

Specially, we use *a hash table to map keywords to their most recent occurrences in the paragraph array as we iterate through it*, and **a hash table mapping each keyword to the length of the shortest subarray ending at the most recent occurrence of that keyword**.

These two hash tables give us is the ability to determine the shortest subarray sequentially covering the fist

1 | k |

keywords given the shortest subarray sequentially covering the first

1 | k - 1 |

keywords.

When processing the

1 | i |

th string in the paragraph array, if that string is the

1 | j |

th keyword, we update the most recent occurrence of the keyword to

1 | i |

. The shortest subarray ending at

1 | i |

which sequentially covers the first

1 | j |

keywords consists of the shortest subarray ending at the most recent occurrence of the first

1 | j - 1 |

keywords plus the elements from the most recent occurrence of the

1 | (j - 1) |

th keyword to i. This computation is implemented below.

**Reference**

## 13.10 FIND THE LONGEST SUBARRAY WITH DISTINCT ENTRIES

Write a program that takes an array and **returns the length of a longest subarray** with the property that all its elements are distinct. For example, if the array is

1 |

then a longest subarray all of whose elements are distinct is

1 |

.

*Hint:* What should you do if the subarray from indices

1 | i |

to

1 | j |

satisfies the property, but the subarray from

1 | i |

to

1 | j + 1 |

does not.

**Solution**

We can improve the time complexity by reusing previous computation as we iterate through the array. Suppose we know the longest duplicate-free subarray ending at the given index. The longest duplicate-free subarray ending at the next index is either the previous subarray appended with the element at the next index, if that element does not appear in the longest duplicate-free subarray at the current index. Otherwise it is the subarray beginning at the most recent occurrence of the element at the next index to the next index. To perform this case analysis as we iterate, all we need is a hash table storing the most recent occurrence of each element, and the longest duplicate-free subarray ending at the current element.

**Reference**

## 13.11 FIND THE LENGTH OF A LONGEST CONTAINED INTERVAL

Write a program which takes as input a set of integers represented by an array, and returns the size of a largest subset of integers in the array having the property that if two integers are in the subset, then so are all integers between them.

*Hint:* Do you really need a total ordering on the input?

**Solution**

On closer inspection we see that sorting is not essential to the functioning of the algorithm.

# 13.12 COMPUTE THE AVERAGE OF THE TOP THREE SCORES

Student test scores are recorded in a file. Each line consists of a student ID, which is an alphanumeric string, and integer between 0 and 100, inclusive.

Write a program, which takes as input a file containing test scores and returns the student who has the maximum score averaged across his or her top three tests. If the student has fewer than three test scores, ignore that student.

*Hint:* Generalize to computing the top

1 | k |

scores.

**Solution**

A straightforward approach would be to find the scores for each student, sort those scores ,and average the top three. More specifically, we can find the scores for each student in a single pass, recording them using a map which takes a student id and stores scores for that student in a dynamic array.

## 13.13 COMPUTE ALL STRING DECOMPOSITIONS

This problem is considered with taking a string (the “sentence” string) and a set of strings (the “words”), and finding the substrings of the sentence which are the concatenation of all the words (in any order).

Write a program which takes as input a string (the “sentence”) and an array of strings (the “words”), and returns the starting indices of substrings of the sentence string which are the concatenation of all the strings in the words array. Each string must appear exactly once, and their ordering is immaterial. Assume all strings in the words array have equal length. It is possible for the words array to contain duplicates.

*Hint:* Exploit the factor that the words have the same length.

**Solution:** Let’s begin by considering the problem of checking whether a string is the concatenation strings in words.

## 13.14 FIND A HIGHEST AFFINITY PAIR

## 13.15 TEST THE COLLATE CONJECTURE

The state of a game of chess is determined by what piece is present on each square, as illustrated in Figure 13.2. Each square may be empty, or have one of six classes of pieces; each piece may be black or white. Thus

1 | [lg(1 + 6 * 2)] = 4 |

bits suffice per square, which means that a total of

1 | 64 * 4 = 256 |

bits can represent the sate of the chessboard. (The actual state of the game is slightly more complex, as it needs to capture which side is to improve, castling rights, *en passant*, etc., but we will use the simpler model for this question.)

Chess playing computers need to store sets of states, e.g., to determine if a particular state has been evaluated before, or is known to be a winning state. To reduce storage, it is natural to apply a hash function to the 256 bits of state, and ignore collisions. The hash code can be computed by a conventional hash function for strings. However, since the computer repeatedly explores nearby states, **it is advantageous to consider hash functions that can be efficiently coupled based on incremental changes to the board**.

Design a hash function for chess game states. Your function should take a state the hash code for the state, and a move, and efficiently compute the hash code for the updated state.