Structuring the Document C


Problem Statement :


A document is represented as a collection paragraphs, a paragraph is represented as a collection of sentences, a sentence is represented as a collection of words and a word is represented as a collection of lower-case ([a-z]) and upper-case ([A-Z]) English characters.

You will convert a raw text document into its component paragraphs, sentences and words. To test your results, queries will ask you to return a specific paragraph, sentence or word as described below.

Alicia is studying the C programming language at the University of Dunkirk and she represents the words, sentences, paragraphs, and documents using pointers:

    A word is described by  char * .
    A sentence is described by char **  . The words in the sentence are separated by one space (" "). The last word does not end with a space(" ").
    A paragraph is described by char *** . The sentences in the paragraph are separated by one period (".").
   A document is described by char ****  . The paragraphs in the document are separated by one newline("\n"). The last paragraph does not end with a newline.

For example:

    Learning C is fun.
    Learning pointers is more fun.It is good to have pointers.

    The only sentence in the first paragraph could be represented as:

char** first_sentence_in_first_paragraph = {"Learning", "C", "is", "fun"};

    The first paragraph itself could be represented as:

char*** first_paragraph = {{"Learning", "C", "is", "fun"}};

    The first sentence in the second paragraph could be represented as:

char** first_sentence_in_second_paragraph = {"Learning", "pointers", "is", "more", "fun"};

    The second sentence in the second paragraph could be represented as:

char** second_sentence_in_second_paragraph = {"It", "is", "good", "to", "have", "pointers"};

    The second paragraph could be represented as:

char*** second_paragraph = {{"Learning", "pointers", "is", "more", "fun"}, {"It", "is", "good", "to", "have", "pointers"}};

    Finally, the document could be represented as:

char**** document = {{{"Learning", "C", "is", "fun"}}, {{"Learning", "pointers", "is", "more", "fun"}, {"It", "is", "good", "to", "have", "pointers"}}};

The first line contains the integer  paragraph_count. 
Each of the next paragraph_count lines contains a paragraph as a single string.
The next line contains the integer q, the number of queries.
Each of the next q lines or groups of lines contains a query in one of the following formats: 

1 The first line contains 1 k :

    The next line contains an integer  x , the number of sentences in the kth paragraph.
    Each of the next x lines contains an integer a[i], the number of words in the ith sentence.
    This query corresponds to calling the function kth_paragraph. 

2 The first line contains 2 k m:

    The next line contains an integer  x , the number of words in the kth sentence of the mth paragraph.
    This query corresponds to calling the function kth_sentence_in_mth_paragraph. 

3 The only line contains 3 k m n:

    This query corresponds to calling the function kth_word_in_mth_sentence_of_nth_paragraph.

Constraints

    The text which is passed to the get_document has words separated by a space (" "), sentences separated by a period (".") and paragraphs separated by a newline("\n").
    The last word in a sentence does not end with a space.
    The last paragraph does not end with a newline.
    The words contain only upper-case and lower-case English letters.
    1 <= number of characters in the entire document <= 1000
    1 <= number of paragraphs in the entire document <= 5

Output Format

Print the paragraph, sentence or the word corresponding to the query to check the logic of your code.



Solution :



title-img


                            Solution in C :

char* kth_word_in_mth_sentence_of_nth_paragraph(char**** document, int k, int m, int n) {
    return document[n-1][m-1][k-1];
}

char** kth_sentence_in_mth_paragraph(char**** document, int k, int m) {
    return document[m-1][k-1];
}

char*** kth_paragraph(char**** document, int k) {
    return document[k-1];
}

char** split_string(char* text, char delim) {
    assert(text != NULL);
    char** result = malloc(1*sizeof(char*));
    int size = 1;
    
    char* temp = strtok(text, &delim);
    *result = temp;
    
    while(temp != NULL) {
        size++;
        result = realloc(result,size*sizeof(char*));
        temp = strtok(NULL, &delim);
        result[size-1] = temp;
    }
    return result;
}

char**** get_document(char* text) {
    assert(text != NULL);
    
    // split text by '\n' and count number of paragraphs
    char** paragraphs = split_string(text, '\n');
    int npar = 0;
    while (paragraphs[npar] != NULL) {
        npar++;
    }
    
    char**** doc = malloc((npar+1)*sizeof(char***));
    // set last position to NULL for the user
    // to know when the array ends.
    doc[npar] = NULL; 
    
    int i = 0;
    while (paragraphs[i] != NULL) {
        
        // split sentences of paragraph by '.' and count number of sentences
        char** sentences = split_string(paragraphs[i], '.');
        int nsen = 0;
        while(sentences[nsen] != NULL) {
            nsen++;
        }
        
        doc[i] = malloc((nsen+1)*sizeof(char**));
        // set last position to NULL for the user
        // to know when the array ends.
        doc[i][nsen] = NULL; 
        
        int j = 0;
        while (sentences[j] != NULL) {
            
            // remember that doc[0][0] means: paragraph #0,
            // sentence #0 and should act like a pointer to
            // the first element of an array of words (strings)
            
            // split string by ' ' and associate doc[i][j]
            // with the array of strings representing words
            // that is returned by split_string.
            doc[i][j] = split_string(sentences[j], ' ');
            j++;
        }
        i++;
    }
    
    return doc; 
}
                        








View More Similar Problems

Balanced Forest

Greg has a tree of nodes containing integer data. He wants to insert a node with some non-zero integer value somewhere into the tree. His goal is to be able to cut two edges and have the values of each of the three new trees sum to the same amount. This is called a balanced forest. Being frugal, the data value he inserts should be minimal. Determine the minimal amount that a new node can have to a

View Solution →

Jenny's Subtrees

Jenny loves experimenting with trees. Her favorite tree has n nodes connected by n - 1 edges, and each edge is ` unit in length. She wants to cut a subtree (i.e., a connected part of the original tree) of radius r from this tree by performing the following two steps: 1. Choose a node, x , from the tree. 2. Cut a subtree consisting of all nodes which are not further than r units from node x .

View Solution →

Tree Coordinates

We consider metric space to be a pair, , where is a set and such that the following conditions hold: where is the distance between points and . Let's define the product of two metric spaces, , to be such that: , where , . So, it follows logically that is also a metric space. We then define squared metric space, , to be the product of a metric space multiplied with itself: . For

View Solution →

Array Pairs

Consider an array of n integers, A = [ a1, a2, . . . . an] . Find and print the total number of (i , j) pairs such that ai * aj <= max(ai, ai+1, . . . aj) where i < j. Input Format The first line contains an integer, n , denoting the number of elements in the array. The second line consists of n space-separated integers describing the respective values of a1, a2 , . . . an .

View Solution →

Self Balancing Tree

An AVL tree (Georgy Adelson-Velsky and Landis' tree, named after the inventors) is a self-balancing binary search tree. In an AVL tree, the heights of the two child subtrees of any node differ by at most one; if at any time they differ by more than one, rebalancing is done to restore this property. We define balance factor for each node as : balanceFactor = height(left subtree) - height(righ

View Solution →

Array and simple queries

Given two numbers N and M. N indicates the number of elements in the array A[](1-indexed) and M indicates number of queries. You need to perform two types of queries on the array A[] . You are given queries. Queries can be of two types, type 1 and type 2. Type 1 queries are represented as 1 i j : Modify the given array by removing elements from i to j and adding them to the front. Ty

View Solution →