Ashton and String


Problem Statement :


Ashton appeared for a job interview and is asked the following question. Arrange all the distinct substrings of a given string in lexicographical order and concatenate them. Print the  kth character of the concatenated string. It is assured that given value of k will be valid i.e. there will be a kth character. Can you help Ashton out with this?


Note We have distinct substrings here, i.e. if string is aa, it's distinct substrings are a and aa.

Function Description

Complete the ashtonString function in the editor below. It should return the kth character from the concatenated string, 1-based indexing.

ashtonString has the following parameters:
- s: a string
- k: an integer

Input Format

The first line will contain an integer t, the number of test cases.

Each of the subsequent t pairs of lines is as follows:
- The first line of each test case contains a string, s.
- The second line contains an integer, k.


Constraints


1  <=  t  <=  5
1  <=  | s | <=  10^5
k will be an appropriate integer.

Output Format

Print the kth character (1-based index) of the concatenation of the ordered distinct substrings of s.



Solution :



title-img


                            Solution in C :

In   C++  :







#include <algorithm>
#include <iostream>
#include <cstring>
#include <complex>
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <string>
#include <cmath>
#include <ctime>
#include <queue>
#include <list>
#include <map>
#include <set>

#define all(x) (x).begin(), (x).end()
#define type(x) __typeof((x).begin())
#define foreach(it,x) for(__typeof(x.begin()) it = x.begin() ; it!=x.end() ; it++ )

#ifdef KAZAR
    #define eprintf(...) fprintf(stderr,__VA_ARGS__)
#else
    #define eprintf(...) 0
#endif

using namespace std;

template<class T> inline void umax(T &a,T b){if(a<b) a = b ; }
template<class T> inline void umin(T &a,T b){if(a>b) a = b ; }
template<class T> inline T abs(T a){return a>0 ? a : -a;}
template<class T> inline T gcd(T a,T b){return __gcd(a, b);}
template<class T> inline T lcm(T a,T b){return a/gcd(a,b)*b;}

const int inf = 1e9 + 143;
const long long longinf = 1e18 + 143;

inline int read(){int x;scanf(" %d",&x);return x;}

const int N = 101001;

char s[N];
int rnk[N];
int sa[N];
int lcp[N];
pair<pair<int, int>, int> p[N];

void solve(){

    scanf(" %s",s + 1);
    int n = strlen(s + 1);
    long long kth;
    cin >> kth;

    for(int i = 1; i <= n; i++) rnk[i] = s[i] - 'a' + 1;
    for(int g = 1; g <= n; g <<= 1){
        for(int i = 1; i <= n; i++){
            p[i] = make_pair(make_pair(rnk[i], ((i + g) > n)? 0 : rnk[i + g]), i);
        }
        sort(p + 1, p + 1 + n);
        int ptr = 1;
        for(int i = 1; i <= n; i++){
            if(i > 1 && p[i].first != p[i - 1].first)
                ++ptr;
            rnk[p[i].second] = ptr;
        }
    }

    for(int i = 1; i <= n; i++) sa[rnk[i]] = i;

    int cur = 0;
    for(int i = 1; i <= n; i++){
        if(rnk[i] == n)
            continue;
        int next = sa[rnk[i] + 1];
        while(i + cur <= n && next + cur <= n && s[i + cur] == s[next + cur])
            ++cur;
        lcp[rnk[i]] = cur;
        if(cur > 0)
            --cur;
    }
    lcp[n] = 0;
    lcp[0] = 0;

    long long sum = 0;
    for(int i = 1; i <= n; i++){
        eprintf("::%d\n",sa[i]);
        long long len = n - sa[i] + 1;
        long long d = 0;
        d += len * (len + 1) / 2ll;
        d -= 1ll * lcp[i - 1] * (lcp[i - 1] + 1) / 2ll;
        if(sum + d >= kth){
            for(int j = lcp[i - 1] + 1; j <= len; j++){
                sum += j;
                if(sum >= kth){
                    sum -= j;
                    eprintf("i : %d, j : %d\n",i,j);
                    printf("%c\n",s[sa[i] + (kth - sum) - 1]);
                    return;
                }
            }
        }
        sum += d;
    }

}

int main(){

#ifdef KAZAR
    freopen("f.input","r",stdin);
    freopen("f.output","w",stdout);
    freopen("error","w",stderr);
#endif

    int tc = read();
    while(tc--){
        solve();
    }

    return 0;
}









In   Java :







import java.io.*;
import java.util.*;
import java.text.*;
import java.math.*;
import java.util.regex.*;

public class Solution {

    public static void main(String[] args) throws InterruptedException {
        Scanner in = new Scanner(System.in);
        int testCases = Integer.valueOf(in.next());
        for (int i = 0; i < testCases; i++) {
            String str = in.next();
            int k = Integer.valueOf(in.next());

            System.out.println(findSubstrings(str, k));
        }   
    }
    
     public static char findSubstrings(String string, int k) {
        char[] arr = string.toCharArray();
        Set<String> fastSet = new HashSet<>(100000);
        Object[] letters = new Object[26];

        for (int i = 0; i < arr.length; i++) {
            Bucket bucket = (Bucket) letters[(arr[i]) - 97];
            if (bucket == null) {
                bucket = new Bucket(String.valueOf(arr[i]));
                letters[(arr[i]) - 97] = bucket;
            }

            bucket.positions.add(i);
        }

        Stack<Bucket> bucketStack = new Stack<>();

        for (int i = letters.length - 1; i >= 0; i--) {
            if (letters[i] != null) {
                bucketStack.push((Bucket) letters[i]);
                letters[i] = null;
            }
        }

        int count = 0;
        while (!bucketStack.isEmpty()) {
            Bucket b = bucketStack.pop();
//            if (!fastSet.contains(b.str)) {
  //              fastSet.add(b.str);
                if (count + b.str.length() >= k) {
                    return b.str.charAt(k - count - 1);
                } else {
                    count += b.str.length();
                }
    //        }

            for (int position : b.positions) {
                if (arr.length > position + b.str.length()) {

                    int newCharPosition = position + b.str.length();
                    Bucket bucket = (Bucket) letters[(arr[newCharPosition]) - 97];
                    if (bucket == null) {
                        String currStr = b.str + arr[newCharPosition];
                        bucket = new Bucket(currStr);
                        letters[(arr[newCharPosition]) - 97] = bucket;
                    }

                    bucket.positions.add(position);
                }
            }

            for (int i = letters.length - 1; i >= 0; i--) {
                if (letters[i] != null) {
                    bucketStack.push((Bucket) letters[i]);
                    letters[i] = null;
                }
            }
        }

        return '-';
    }

    public static class Bucket implements Comparable<Bucket> {

        String str;
        List<Integer> positions = new ArrayList<>();

        public Bucket(final String str) {
            this.str = str;
        }

        @Override
        public int compareTo(final Bucket o) {
            return -str.compareTo(o.str);
        }

        @Override
        public int hashCode() {
            return str.hashCode();
        }

        @Override
        public boolean equals(final Object obj) {
            return str.equals(((Bucket) obj).str);
        }
    }
}








In  C :








#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#define N 100001
void merge(int*a,int*left,int*right,int left_size, int right_size);
void sort_a(int*a,int size);
void suffixSort(int n);
void LCP(int n);
char str[N]; //input
int rank[N], pos[N], lcp[N]; //output
int cnt[N], next[N]; //internal
int bh[N], b2h[N];

int main(){
  char cc;
  int T,n,i;
  long long K,c,t,x,tt;
  double xx;
  scanf("%d",&T);
  while(T--){
    scanf("%s%lld",str,&K);
    n=strlen(str);
    suffixSort(n);
    LCP(n);
    c=0;
    for(i=0;i<n;i++){
      tt=c;
      c+=(lcp[i]+1+n-pos[i])*(long long)(n-pos[i]-lcp[i])/2;
      if(K<=c){
        xx=(-1+sqrt(4*(2*(K-tt)+lcp[i]+lcp[i]*(long long)lcp[i])))/2;
        x=(long long)xx;
        t=K-tt-(lcp[i]+1+x)*(x-lcp[i])/2;
        if(!t)
          cc=str[pos[i]+x-1];
        else
          cc=str[pos[i]+t-1];
        break;
      }
    }
    printf("%c\n",cc);
  }
  return 0;
}
void merge(int*a,int*left,int*right,int left_size, int right_size){
    int i = 0, j = 0;
    while (i < left_size|| j < right_size) {
        if (i == left_size) {
            a[i+j] = right[j];
            j++;
        } else if (j == right_size) {
            a[i+j] = left[i];
            i++;
        } else if (str[left[i]] <= str[right[j]]) {
            a[i+j] = left[i];
            i++;                
        } else {
            a[i+j] = right[j];
            j++;
        }
    }
    return;
}
void sort_a(int*a,int size){
  if (size < 2)
    return;
  int m = (size+1)/2,i;
  int *left,*right;
  left=(int*)malloc(m*sizeof(int));
  right=(int*)malloc((size-m)*sizeof(int));
  for(i=0;i<m;i++)
    left[i]=a[i];
  for(i=0;i<size-m;i++)
    right[i]=a[i+m];
  sort_a(left,m);
  sort_a(right,size-m);
  merge(a,left,right,m,size-m);
  free(left);
  free(right);
  return;
}
 
void suffixSort(int n){
  //sort suffixes according to their first characters
  int h,i,j,k;
  for (i=0; i<n; ++i){
    pos[i] = i;
  }
  sort_a(pos, n);
  //{pos contains the list of suffixes sorted by their first character}
 
  for (i=0; i<n; ++i){
    bh[i] = i == 0 || str[pos[i]] != str[pos[i-1]];
    b2h[i] = 0;
  }
 
  for (h = 1; h < n; h <<= 1){
    //{bh[i] == false if the first h characters of pos[i-1] == the first h characters of pos[i]}
    int buckets = 0;
    for (i=0; i < n; i = j){
      j = i + 1;
      while (j < n && !bh[j]) j++;
      next[i] = j;
      buckets++;
    }
    if (buckets == n) break; // We are done! Lucky bastards!
    //{suffixes are separted in buckets containing strings starting with the same h characters}
 
    for (i = 0; i < n; i = next[i]){
      cnt[i] = 0;
      for (j = i; j < next[i]; ++j){
        rank[pos[j]] = i;
      }
    }
 
    cnt[rank[n - h]]++;
    b2h[rank[n - h]] = 1;
    for (i = 0; i < n; i = next[i]){
      for (j = i; j < next[i]; ++j){
        int s = pos[j] - h;
        if (s >= 0){
          int head = rank[s];
          rank[s] = head + cnt[head]++;
          b2h[rank[s]] = 1;
        }
      }
      for (j = i; j < next[i]; ++j){
        int s = pos[j] - h;
        if (s >= 0 && b2h[rank[s]]){
          for (k = rank[s]+1; !bh[k] && b2h[k]; k++) b2h[k] = 0;
        }
      }
    }
    for (i=0; i<n; ++i){
      pos[rank[i]] = i;
      bh[i] |= b2h[i];
    }
  }
  for (i=0; i<n; ++i){
    rank[pos[i]] = i;
  }
}
// End of suffix array algorithm

void LCP(int n){
  int l=0,i,j,k;
  for(i=0;i<n;i++){
    k=rank[i];
    if(!k)
      continue;
    j=pos[k-1];
    while(str[i+l]==str[j+l])
      l++;
    lcp[k]=l;
    if(l>0)
      l--;
  }
  lcp[0]=0;
  return;
}








In  Python3 :








def foo(text, k):
    stack = [("",list(range(len(text))))]
    while stack != []:
        prefix,ii = stack.pop()
        if k<len(prefix):
            return prefix[k]
        k -= len(prefix)
        cs = sorted([(text[i],i+1) for i in ii if i<len(text)], reverse=True)
        i = 0
        while i<len(cs):
            c = cs[i][0]
            ii2 = [cs[i][1]]
            j = i+1
            while j<len(cs) and cs[j][0]==c:
                ii2.append(cs[j][1])
                j+=1
            stack.append((prefix+c, ii2))
            i=j
    return None

T = int(input().strip())
for i in range(T):
    text = input().strip()
    k = int(input().strip())
    print(foo(text,k-1))
                        








View More Similar Problems

Jenny's Subtrees

Jenny loves experimenting with trees. Her favorite tree has n nodes connected by n - 1 edges, and each edge is ` unit in length. She wants to cut a subtree (i.e., a connected part of the original tree) of radius r from this tree by performing the following two steps: 1. Choose a node, x , from the tree. 2. Cut a subtree consisting of all nodes which are not further than r units from node x .

View Solution →

Tree Coordinates

We consider metric space to be a pair, , where is a set and such that the following conditions hold: where is the distance between points and . Let's define the product of two metric spaces, , to be such that: , where , . So, it follows logically that is also a metric space. We then define squared metric space, , to be the product of a metric space multiplied with itself: . For

View Solution →

Array Pairs

Consider an array of n integers, A = [ a1, a2, . . . . an] . Find and print the total number of (i , j) pairs such that ai * aj <= max(ai, ai+1, . . . aj) where i < j. Input Format The first line contains an integer, n , denoting the number of elements in the array. The second line consists of n space-separated integers describing the respective values of a1, a2 , . . . an .

View Solution →

Self Balancing Tree

An AVL tree (Georgy Adelson-Velsky and Landis' tree, named after the inventors) is a self-balancing binary search tree. In an AVL tree, the heights of the two child subtrees of any node differ by at most one; if at any time they differ by more than one, rebalancing is done to restore this property. We define balance factor for each node as : balanceFactor = height(left subtree) - height(righ

View Solution →

Array and simple queries

Given two numbers N and M. N indicates the number of elements in the array A[](1-indexed) and M indicates number of queries. You need to perform two types of queries on the array A[] . You are given queries. Queries can be of two types, type 1 and type 2. Type 1 queries are represented as 1 i j : Modify the given array by removing elements from i to j and adding them to the front. Ty

View Solution →

Median Updates

The median M of numbers is defined as the middle number after sorting them in order if M is odd. Or it is the average of the middle two numbers if M is even. You start with an empty number list. Then, you can add numbers to the list, or remove existing numbers from it. After each add or remove operation, output the median. Input: The first line is an integer, N , that indicates the number o

View Solution →