Querying the Document C
Problem Statement :
A document is represented as a collection of paragraphs, a paragraph as a collection of sentences, a sentence as a collection of words, and a word as a collection of lower-case ([a-z]) and upper-case ([A-Z]) English characters. You will convert a raw text document into its component paragraphs, sentences and words. To test your results, queries will ask you to return a specific paragraph, sentence or word as described below.

Alicia is studying the C programming language at the University of Dunkirk and she represents the words, sentences, paragraphs, and documents using pointers:

A word is described by char*.
A sentence is described by char**. The words in the sentence are separated by one space (" "). The last word does not end with a space (" ").
A paragraph is described by char***. The sentences in the paragraph are separated by one period (".").
A document is described by char****. The paragraphs in the document are separated by one newline ("\n"). The last paragraph does not end with a newline.

For example:

Learning C is fun.
Learning pointers is more fun.It is good to have pointers.

The only sentence in the first paragraph could be represented as:
char** first_sentence_in_first_paragraph = {"Learning", "C", "is", "fun"};

The first paragraph itself could be represented as:
char*** first_paragraph = {{"Learning", "C", "is", "fun"}};

The first sentence in the second paragraph could be represented as:
char** first_sentence_in_second_paragraph = {"Learning", "pointers", "is", "more", "fun"};

The second sentence in the second paragraph could be represented as:
char** second_sentence_in_second_paragraph = {"It", "is", "good", "to", "have", "pointers"};

The second paragraph could be represented as:
char*** second_paragraph = {{"Learning", "pointers", "is", "more", "fun"}, {"It", "is", "good", "to", "have", "pointers"}};

Finally, the document could be represented as:
char**** document = {{{"Learning", "C", "is", "fun"}}, {{"Learning", "pointers", "is", "more", "fun"}, {"It", "is", "good", "to", "have", "pointers"}}};

Input Format
The first line contains the integer paragraph_count. Each of the next paragraph_count lines contains a paragraph as a single string. The next line contains the integer q, the number of queries. Each of the next q lines or groups of lines contains a query in one of the following formats:

Query type 1: The first line contains 1 k. The next line contains an integer x, the number of sentences in the kth paragraph. Each of the next x lines contains an integer a[i], the number of words in the ith sentence. This query corresponds to calling the function kth_paragraph.

Query type 2: The first line contains 2 k m. The next line contains an integer x, the number of words in the kth sentence of the mth paragraph. This query corresponds to calling the function kth_sentence_in_mth_paragraph.

Query type 3: The only line contains 3 k m n. This query corresponds to calling the function kth_word_in_mth_sentence_of_nth_paragraph.
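
The brace-initialized declarations in the example representation are shorthand: standard C does not allow a char** to be initialized directly from a brace list of strings. As a minimal sketch of the same structure in compilable form, the example document can be laid out with static arrays. The trailing NULL in each array and the names s00, s10, s11, p0, p1, example_paragraphs and example_document are assumptions made for illustration; they are not part of the problem statement.

```c
#include <stddef.h>

/* Words of each sentence. The trailing NULL is an assumed convention
   (not required by the problem) that marks the end of each array.
   String literals must not be modified through these pointers. */
static char *s00[] = {"Learning", "C", "is", "fun", NULL};
static char *s10[] = {"Learning", "pointers", "is", "more", "fun", NULL};
static char *s11[] = {"It", "is", "good", "to", "have", "pointers", NULL};

/* Sentences of each paragraph. */
static char **p0[] = {s00, NULL};
static char **p1[] = {s10, s11, NULL};

/* Paragraphs of the document. */
static char ***example_paragraphs[] = {p0, p1, NULL};

/* Returns the example document; the array decays to char ****. */
char ****example_document(void)
{
    return example_paragraphs;
}
```

With this layout, example_document()[1][1][0] is the word "It": paragraph index 1, sentence index 1, word index 0, all counted from zero.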

Constraints
The text which is passed to get_document has words separated by a space (" "), sentences separated by a period (".") and paragraphs separated by a newline ("\n").
The last word in a sentence does not end with a space.
The last paragraph does not end with a newline.
The words contain only upper-case and lower-case English letters.
1 <= number of characters in the entire document <= 1000
1 <= number of paragraphs in the entire document <= 5

Output Format
Print the paragraph, sentence or word corresponding to the query to check the logic of your code.
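
The statement does not spell out the exact printing code, so the following is only a sketch of how a sentence or paragraph might be printed. It assumes that each array ends with a NULL pointer, that words are printed with one space between them, and that each sentence is followed by a period. The helper names join_sentence and print_paragraph are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: joins a NULL-terminated sentence into out, with one
   space between words. Returns the length written; the output is truncated
   if cap is too small. */
size_t join_sentence(char **sentence, char *out, size_t cap)
{
    size_t used = 0;
    if (cap == 0) return 0;
    out[0] = '\0';
    for (size_t i = 0; sentence[i] != NULL; i++) {
        int n = snprintf(out + used, cap - used, "%s%s",
                         i ? " " : "", sentence[i]);
        if (n < 0) {                      /* encoding error: keep what we have */
            out[used] = '\0';
            break;
        }
        if ((size_t)n >= cap - used) {    /* output was truncated */
            used = cap - 1;
            break;
        }
        used += (size_t)n;
    }
    return used;
}

/* Prints each sentence of a NULL-terminated paragraph followed by a period
   (an assumed output convention). */
void print_paragraph(char ***paragraph)
{
    char line[1001];  /* the constraints cap the whole document at 1000 chars */
    for (size_t i = 0; paragraph[i] != NULL; i++) {
        join_sentence(paragraph[i], line, sizeof line);
        printf("%s.", line);
    }
    printf("\n");
}
```

Joining into a buffer first keeps the formatting logic separate from the output call, which makes it easier to check than printing word by word.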
Solution :
Solution in C :
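// Headers needed by assert, malloc/realloc/free and strtok.
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Queries use 1-based indices, so each index is shifted by one.
char* kth_word_in_mth_sentence_of_nth_paragraph(char**** document, int k, int m, int n) {
    return document[n-1][m-1][k-1];
}

char** kth_sentence_in_mth_paragraph(char**** document, int k, int m) {
    return document[m-1][k-1];
}

char*** kth_paragraph(char**** document, int k) {
    return document[k-1];
}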
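
// Splits text in place on delim and returns a NULL-terminated array of tokens.
// The tokens point into text, so text must outlive the returned array.
char** split_string(char* text, char delim) {
    assert(text != NULL);
    // strtok expects a null-terminated delimiter string; passing &delim
    // (a pointer to a single char) would read past that character.
    char delims[2] = { delim, '\0' };
    char** result = malloc(1 * sizeof(char*));
    int size = 1;
    char* temp = strtok(text, delims);
    *result = temp;
    while (temp != NULL) {
        size++;
        result = realloc(result, size * sizeof(char*));
        temp = strtok(NULL, delims);
        result[size - 1] = temp;
    }
    return result;
}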

char**** get_document(char* text) {
    assert(text != NULL);
    // split text by '\n' and count number of paragraphs
    char** paragraphs = split_string(text, '\n');
    int npar = 0;
    while (paragraphs[npar] != NULL) {
        npar++;
    }
    char**** doc = malloc((npar + 1) * sizeof(char***));
    // set last position to NULL for the user
    // to know when the array ends.
    doc[npar] = NULL;
    int i = 0;
    while (paragraphs[i] != NULL) {
        // split sentences of paragraph by '.' and count number of sentences
        char** sentences = split_string(paragraphs[i], '.');
        int nsen = 0;
        while (sentences[nsen] != NULL) {
            nsen++;
        }
        doc[i] = malloc((nsen + 1) * sizeof(char**));
        // set last position to NULL for the user
        // to know when the array ends.
        doc[i][nsen] = NULL;
        int j = 0;
        while (sentences[j] != NULL) {
            // remember that doc[0][0] means: paragraph #0,
            // sentence #0 and should act like a pointer to
            // the first element of an array of words (strings)
            // split string by ' ' and associate doc[i][j]
            // with the array of strings representing words
            // that is returned by split_string.
            doc[i][j] = split_string(sentences[j], ' ');
            j++;
        }
        // the temporary pointer array is no longer needed;
        // the words themselves still live inside text.
        free(sentences);
        i++;
    }
    free(paragraphs);
    return doc;
}
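
The solution above allocates the document with malloc and realloc but never releases it. As a minimal sketch, assuming the layout that get_document builds (every level ends with a NULL pointer, and the words point into the caller's text buffer rather than into separately allocated memory), the document could be walked and freed as follows. The names total_words and free_document are hypothetical helpers, not part of the original solution.

```c
#include <stdlib.h>

/* Counts every word in a document whose arrays are NULL-terminated
   at the paragraph, sentence and word level. */
size_t total_words(char ****doc)
{
    size_t count = 0;
    for (size_t i = 0; doc[i] != NULL; i++)
        for (size_t j = 0; doc[i][j] != NULL; j++)
            for (size_t k = 0; doc[i][j][k] != NULL; k++)
                count++;
    return count;
}

/* Frees the pointer arrays of a document. The words themselves are not
   freed because they are assumed to point into the caller's text buffer. */
void free_document(char ****doc)
{
    if (doc == NULL) return;
    for (size_t i = 0; doc[i] != NULL; i++) {
        for (size_t j = 0; doc[i][j] != NULL; j++)
            free(doc[i][j]);   /* word-pointer array of one sentence */
        free(doc[i]);          /* sentence array of one paragraph */
    }
    free(doc);                 /* paragraph array */
}
```

Because the words alias the original text, the text buffer has to stay alive for as long as the document is in use and must be released separately by whoever allocated it.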