Pig Quiz

wordfile_grpd: {group: chararray,wordfile_flat: {t:(wordin: chararray)}}


Add a statement to dump the contents:

DUMP wordfile_grpd;

What is the correct line that you see for the word far?


(far,{(far),(far)})




In the following LOAD command:

wordfile = LOAD '/user/cloudera/pigin/testfile*'

USING PigStorage('\n') AS (linesin:chararray);

Which is the relation and which is the field name?

wordfile is the relation, linesin is the field name

linesin is the relation and wordfile is the field name

chararray is the relation and linesin is the field name



If you enter the following command at the grunt> prompt: DESCRIBE wordfile;

You get the following result: wordfile: {linesin: chararray}

Which of the following is the correct explanation of this result?

The relation name followed by fields

The relation name followed by a bag of tuples


In the statement

wordflat = FOREACH wordfile GENERATE FLATTEN(TOKENIZE(linesin)) AS wordin;

What will happen to the bag of words?

TOKENIZE will remove the bag level

FLATTEN will remove the bag level
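
One way to check this in grunt is to split the statement into its two steps and DESCRIBE each intermediate relation. This is only a sketch: the relation name wordbag and the field name words are made up for illustration and do not appear in the quiz, and the input path is the one from the LOAD question.

```pig
-- Load each line of the input as a single chararray field
wordfile = LOAD '/user/cloudera/pigin/testfile*'
    USING PigStorage('\n') AS (linesin:chararray);

-- Step 1 (hypothetical intermediate relation): TOKENIZE turns each line
-- into a bag holding one single-field tuple per word
wordbag = FOREACH wordfile GENERATE TOKENIZE(linesin) AS words;
DESCRIBE wordbag;

-- Step 2: FLATTEN un-nests that bag, so each word becomes its own tuple
wordflat = FOREACH wordbag GENERATE FLATTEN(words) AS wordin;
DESCRIBE wordflat;
```

Comparing the two DESCRIBE results should show the nested bag in wordbag and a plain chararray field in wordflat.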


Pig is a dataflow language. What does that imply about each statement in a Pig script?

Each statement in Pig creates a new relation, which represents a new set of data.

Each statement creates a relation that can be used like a variable.

Each statement performs calculations, assigns variables, or executes flow control.
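
The statements in this quiz can be read as one dataflow. The sketch below assumes the relation names wordfile_flat and wordfile_grpd from the schema at the top of the quiz (the FLATTEN question calls the flattened relation wordflat instead), and it assumes the grouping is done on wordin, which the quiz does not state explicitly.

```pig
-- Each statement defines a new relation from the one before it
wordfile = LOAD '/user/cloudera/pigin/testfile*'
    USING PigStorage('\n') AS (linesin:chararray);

-- One tuple per word
wordfile_flat = FOREACH wordfile
    GENERATE FLATTEN(TOKENIZE(linesin)) AS wordin;

-- Assumed grouping key: collect identical words into one bag per word
wordfile_grpd = GROUP wordfile_flat BY wordin;

-- Inspect the schema, then print the contents
DESCRIBE wordfile_grpd;
DUMP wordfile_grpd;
```

Nothing is executed until DUMP (or a STORE) asks for output; the earlier lines only describe how each relation is derived from the previous one.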


