Lossy Run-Length Encoding - Microsoft Top Interview Questions


Problem Statement :


You are given a string s of lowercase letters and an integer k.

Consider a run-length encoding of a string, in which each run of repeated consecutive characters is represented as a count followed by the character.

For example, the string "aabbbc" would be encoded as "2a3bc". Note that we don't write "1c" for "c", since a run of length one omits the count.
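
The encoding rule above can be sketched in Python as follows. This is only an illustration and is not part of the original solutions; the function name rle_encode is a hypothetical helper chosen for this example.

```python
from itertools import groupby


def rle_encode(s: str) -> str:
    """Run-length encode s, omitting the count for runs of length 1."""
    parts = []
    for ch, group in groupby(s):
        run = len(list(group))
        # A run of one character is written as the bare character.
        parts.append(ch if run == 1 else f"{run}{ch}")
    return "".join(parts)


print(rle_encode("aabbbc"))  # 2a3bc
```

The problem only asks for the length of this string, so the solutions below never build it explicitly.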

After first removing any k consecutive characters of your choice from s, return the minimum possible length of the run-length encoding of the resulting string.

Constraints

k ≤ n ≤ 100,000 where n is the length of s.

Example 1

Input

s = "aaaaabbaaaaaccaaa"

k = 2

Output

6

Explanation

The two obvious choices are to remove the "bb"s or the "cc"s.

If we remove the "bb"s, then we'd get "10a2c3a", which has a length of 7.

If we remove the "cc"s, then we'd get "5a2b8a", which has a length of 6.
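
The example can be checked with a brute-force sketch that tries every window of k consecutive characters. This is an assumption-level reference for small inputs only (it is quadratic in n and far too slow for n = 100,000); the names encoded_length and brute_force are hypothetical and do not appear in the original solutions.

```python
from itertools import groupby


def encoded_length(s: str) -> int:
    """Length of the run-length encoding of s (count omitted for runs of 1)."""
    total = 0
    for _, group in groupby(s):
        run = len(list(group))
        # One character for the letter, plus the digits of the count if run > 1.
        total += 1 + (len(str(run)) if run > 1 else 0)
    return total


def brute_force(s: str, k: int) -> int:
    """Remove each window s[i:i+k] in turn and keep the shortest encoding."""
    return min(encoded_length(s[:i] + s[i + k:]) for i in range(len(s) - k + 1))


print(brute_force("aaaaabbaaaaaccaaa", 2))  # 6
```

A checker like this is mainly useful for comparing against a faster solution on short random strings.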



Solution :







Solution in C++ :

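#include <algorithm>
#include <string>
#include <vector>
using namespace std;

// One run of equal characters covering indices [lhs, rhs], plus the encoded
// length of all runs up to and including this one.
struct state {
    int lhs, rhs;
    int total;
    state() : lhs(0), rhs(0), total(0) {
    }
    state(int a, int b) : lhs(a), rhs(b), total(0) {
    }
};
// Number of digits written for a run of length x; a run of 1 has no count.
int lenof(int x) {
    if (x == 1) return 0;
    return to_string(x).size();
}
// Recompute curr.total from the run before it: count digits plus one character.
void update(state& prev, state& curr) {
    curr.total = prev.total + lenof(curr.rhs - curr.lhs + 1) + 1;
}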
class Solution {
    public:
    int solve(string s, int k) {
        int n = s.size();
        if (n == k) return 0;
        vector<state> lhs;
        lhs.emplace_back(-1, -1);
        for (int i = 0; i < n - k; i++) {
            if (i == 0 || s[i] != s[i - 1]) {
                lhs.emplace_back(i, i);
            } else {
                lhs.back().rhs++;
            }
            update(lhs[lhs.size() - 2], lhs.back());
        }
        // initial estimate - delete the entire suffix
        int ret = lhs.back().total;
        vector<state> rhs;
        rhs.emplace_back(n, n);
        for (int i = n - k - 1; i >= 0; i--) {
            // add the rightmost unadded character to the right side
            int add = i + k;
            if (add == n - 1 || s[add] != s[add + 1]) {
                rhs.emplace_back(add, add);
            } else {
                rhs.back().lhs--;
            }
            update(rhs[rhs.size() - 2], rhs.back());
            // remove the rightmost added character from the left
            if (lhs.back().lhs == lhs.back().rhs) {
                lhs.pop_back();
            } else {
                lhs.back().rhs--;
            }
            if (lhs.size() > 1) {
                update(lhs[lhs.size() - 2], lhs.back());
            }
            // new naive estimate, just stick the two together
            ret = min(ret, lhs.back().total + rhs.back().total);
            // is it possible that the two ends can be stuck together?
            if (lhs.size() > 1 && rhs.size() > 1 && s[lhs.back().rhs] == s[rhs.back().lhs]) {
                // add together all the components that are not involved in the merge
                int cand = lhs[lhs.size() - 2].total + rhs[rhs.size() - 2].total;
                // recompute the compressed length
                int tot = rhs.back().rhs - rhs.back().lhs + 1;
                tot += lhs.back().rhs - lhs.back().lhs + 1;
                cand += lenof(tot) + 1;
                ret = min(ret, cand);
            }
        }
        return ret;
    }
};


int solve(string s, int k) {
    // Use a temporary object; allocating with new here would leak memory.
    return Solution().solve(s, k);
}
                    




Solution in Python :
                            
from itertools import groupby


class Solution:
    def solve(self, S, K):
        N = len(S)
        if N == K:
            return 0

        left = [1] * N
        for i in range(N - 1):
            if S[i] == S[i + 1]:
                left[i + 1] = left[i] + 1

        right = [1] * N
        for i in reversed(range(N - 1)):
            if S[i] == S[i + 1]:
                right[i] = right[i + 1] + 1

        def rle(x):
            return x if x <= 1 else len(str(x)) + 1

        R = [len(list(g)) for _, g in groupby(S)]

        prefix = [0] * N
        prev = 0
        i = 0
        for x in R:
            for j in range(1, 1 + x):
                prefix[i] = prev + rle(j)
                i += 1
            prev += rle(x)

        suffix = [0] * N
        prev = 0
        i = N - 1
        for x in reversed(R):
            for j in range(1, 1 + x):
                suffix[i] = prev + rle(j)
                i -= 1
            prev += rle(x)

        ans = min(prefix[~K], suffix[K])
        for i in range(len(S) - K - 1):
            cand = prefix[i] + suffix[i + K + 1]
            lv = left[i]
            rv = right[i + K + 1]
            if S[i] == S[i + K + 1]:
                cand -= rle(lv) + rle(rv)
                cand += rle(lv + rv)
            ans = min(ans, cand)
        return ans
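
Both solutions rely on the same observation: when the characters on either side of the removed window are equal, the two runs join into one and their cost has to be recomputed. The sketch below illustrates that cost adjustment on its own; run_cost and merged_cost are hypothetical names introduced for this example, not functions from the solutions above.

```python
def run_cost(n: int) -> int:
    """Encoded length of one run of n identical characters (n >= 1)."""
    return 1 if n == 1 else len(str(n)) + 1


def merged_cost(left_run: int, right_run: int) -> int:
    """Cost after two runs of the same character are joined into one."""
    return run_cost(left_run + right_run)


# Runs "aaaaa" and "aaa" cost 2 + 2 = 4 separately, but 2 as the single run "8a".
separate = run_cost(5) + run_cost(3)
joined = merged_cost(5, 3)
print(separate, joined)  # 4 2
```

This mirrors the step in the Python solution where rle(lv) + rle(rv) is subtracted from the candidate and rle(lv + rv) is added back.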
                    

