Lossy Run-Length Encoding - Microsoft Top Interview Questions
Problem Statement :
You are given a lowercase alphabet string s and an integer k. Consider an operation where we perform a run-length encoding on a string by representing repeated successive characters as a count and character. For example, the string "aabbbc" would be encoded as "2a3bc". Note that we don't put "1c" for "c" since it only appears once successively.

Given that you can first remove any k consecutive characters in s, return the minimum length possible of the resulting run-length encoding.

Constraints

k ≤ n ≤ 100,000 where n is the length of s.

Example 1

Input

s = "aaaaabbaaaaaccaaa"
k = 2

Output

6

Explanation

The two obvious choices are to remove the "bb"s or the "cc"s. If we remove the "bb"s, then we'd get "10a2c3a" which has length of 7. If we remove the "cc"s, then we'd get "5a2b8a" which has length of 6.
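To make the encoding rule concrete, here is a minimal brute-force sketch. It is not the solution given below: it simply tries every window of k consecutive characters and measures the encoded length directly. The names `rle_length` and `brute_force` are illustrative, and the running time is quadratic in n, so it is only useful for checking small inputs.

```python
from itertools import groupby


def rle_length(s):
    """Length of the run-length encoding of s; a run of length 1 costs 1."""
    total = 0
    for _, group in groupby(s):
        run = len(list(group))
        # One character for the letter, plus the digits of the count if run > 1.
        total += 1 if run == 1 else len(str(run)) + 1
    return total


def brute_force(s, k):
    """Try every window of k consecutive characters to remove."""
    return min(rle_length(s[:i] + s[i + k:]) for i in range(len(s) - k + 1))


print(rle_length("aabbbc"))                 # 5, i.e. "2a3bc"
print(brute_force("aaaaabbaaaaaccaaa", 2))  # 6, as in Example 1
```

A checker like this can be compared against a faster solution on short random strings.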
Solution :
Solution in C++ :
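#include <algorithm>
#include <string>
#include <vector>
using namespace std;

// A maximal run of equal characters occupying s[lhs..rhs], plus the encoded
// length of every run stacked up to and including this one.
struct state {
    int lhs, rhs;
    int total;
    state() : lhs(0), rhs(0), total(0) {}
    state(int a, int b) : lhs(a), rhs(b), total(0) {}
};

// Number of digits written for a run of length x (nothing is written for 1).
int lenof(int x) {
    if (x == 1) return 0;
    return static_cast<int>(to_string(x).size());
}

// Recompute curr.total from the run before it: count digits plus one letter.
void update(const state& prev, state& curr) {
    curr.total = prev.total + lenof(curr.rhs - curr.lhs + 1) + 1;
}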
class Solution {
public:
    int solve(string s, int k) {
        int n = s.size();
        if (n == k) return 0;
        vector<state> lhs;
        lhs.emplace_back(-1, -1);
        for (int i = 0; i < n - k; i++) {
            if (i == 0 || s[i] != s[i - 1]) {
                lhs.emplace_back(i, i);
            } else {
                lhs.back().rhs++;
            }
            update(lhs[lhs.size() - 2], lhs.back());
        }
        // initial estimate - delete the entire suffix
        int ret = lhs.back().total;
        vector<state> rhs;
        rhs.emplace_back(n, n);
        for (int i = n - k - 1; i >= 0; i--) {
            // add the rightmost unadded character to the right side
            int add = i + k;
            if (add == n - 1 || s[add] != s[add + 1]) {
                rhs.emplace_back(add, add);
            } else {
                rhs.back().lhs--;
            }
            update(rhs[rhs.size() - 2], rhs.back());
            // remove the rightmost added character from the left
            if (lhs.back().lhs == lhs.back().rhs) {
                lhs.pop_back();
            } else {
                lhs.back().rhs--;
            }
            if (lhs.size() > 1) {
                update(lhs[lhs.size() - 2], lhs.back());
            }
            // new naive estimate, just stick the two together
            ret = min(ret, lhs.back().total + rhs.back().total);
            // is it possible that the two ends can be stuck together?
            if (lhs.size() > 1 && rhs.size() > 1 && s[lhs.back().rhs] == s[rhs.back().lhs]) {
                // add together all the components that are not involved in the merge
                int cand = lhs[lhs.size() - 2].total + rhs[rhs.size() - 2].total;
                // recompute the compressed length
                int tot = rhs.back().rhs - rhs.back().lhs + 1;
                tot += lhs.back().rhs - lhs.back().lhs + 1;
                cand += lenof(tot) + 1;
                ret = min(ret, cand);
            }
        }
        return ret;
    }
};
int solve(string s, int k) {
    // Use a temporary object; the original heap allocation was never freed.
    return Solution().solve(s, k);
}
Solution in Python :
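from itertools import groupby


class Solution:
    def solve(self, S, K):
        N = len(S)
        if N == K:
            return 0
        # left[i]: length of the run of equal characters ending at i
        left = [1] * N
        for i in range(N - 1):
            if S[i] == S[i + 1]:
                left[i + 1] = left[i] + 1
        # right[i]: length of the run of equal characters starting at i
        right = [1] * N
        for i in reversed(range(N - 1)):
            if S[i] == S[i + 1]:
                right[i] = right[i + 1] + 1

        # encoded length of a single run of x equal characters
        def rle(x):
            return x if x <= 1 else len(str(x)) + 1

        R = [len(list(g)) for _, g in groupby(S)]
        # prefix[i]: encoded length of S[:i + 1]
        prefix = [0] * N
        prev = 0
        i = 0
        for x in R:
            for j in range(1, 1 + x):
                prefix[i] = prev + rle(j)
                i += 1
            prev += rle(x)
        # suffix[i]: encoded length of S[i:]
        suffix = [0] * N
        prev = 0
        i = N - 1
        for x in reversed(R):
            for j in range(1, 1 + x):
                suffix[i] = prev + rle(j)
                i -= 1
            prev += rle(x)
        # remove the last K characters, or the first K characters
        ans = min(prefix[~K], suffix[K])
        # remove S[i + 1 : i + K + 1], keeping S[:i + 1] and S[i + K + 1:]
        for i in range(len(S) - K - 1):
            cand = prefix[i] + suffix[i + K + 1]
            lv = left[i]
            rv = right[i + K + 1]
            if S[i] == S[i + K + 1]:
                # the runs on both sides of the gap merge into one
                cand -= rle(lv) + rle(rv)
                cand += rle(lv + rv)
            ans = min(ans, cand)
        return ans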
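
The merge step in the final loop can be looked at in isolation. When the character just before the removed window equals the character just after it, the two runs fuse, so their separate costs are replaced by the cost of one run of the combined length. A minimal sketch, where `run_cost` mirrors the `rle` helper above and `merged_saving` is a hypothetical name introduced only for illustration:

```python
def run_cost(x):
    """Encoded length of a single run of x equal characters (0 for an empty run)."""
    return x if x <= 1 else len(str(x)) + 1


def merged_saving(lv, rv):
    """Characters saved when runs of length lv and rv fuse into one run."""
    return run_cost(lv) + run_cost(rv) - run_cost(lv + rv)


# Example 1, removing "cc": runs of 5 and 3 'a's become "8a".
print(merged_saving(5, 3))  # 2 ("5a" + "3a" costs 4, "8a" costs 2)
# Example 1, removing "bb": runs of 5 and 5 'a's become "10a", which needs two digits.
print(merged_saving(5, 5))  # 1 ("5a" + "5a" costs 4, "10a" costs 3)
```

The second case shows why removing the "bb"s is not automatically best: a merge that pushes the count to an extra digit saves less.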