Is it possible to detect and handle string collisions among grouped values when grouping in Hadoop Pig?

StackOverflow https://stackoverflow.com/questions/7406392

  •  29-10-2019

Question

Suppose I have tab-separated lines of data like the following, each showing a user name and that user's favorite fruit:

Alice\tApple
Bob\tApple
Charlie\tGuava
Alice\tOrange

I'd like to write a Pig query that shows the favorite fruit of each user. If a user appears more than once, I'd like to show "Multiple" instead. For example, the result for the data above should be:

Alice\tMultiple
Bob\tApple
Charlie\tGuava

In SQL, this could be done with something like the following (although it wouldn't necessarily perform very well):

select user, case when count(fruit) > 1 then 'Multiple' else max(fruit) end
from FruitPreferences
group by user

But I can't figure out the equivalent Pig Latin. Any ideas?
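One possible translation of the SQL above, offered as a minimal sketch rather than a verified answer: group by user, then use Pig's bincond operator (`? :`) to choose between the literal 'Multiple' and the single fruit. The file name `fruit_preferences.tsv` and the aliases `prefs`, `grouped`, `result`, `username`, and `favorite` are hypothetical names chosen for illustration. It assumes the input is tab-delimited with two chararray fields.

```pig
-- Hypothetical input path; each line is <user>\t<fruit>.
prefs = LOAD 'fruit_preferences.tsv' USING PigStorage('\t')
        AS (username:chararray, fruit:chararray);

-- One output tuple per user, carrying a bag of that user's rows.
grouped = GROUP prefs BY username;

-- COUNT gives the number of rows per user; MAX on a chararray bag
-- returns the single fruit when the user has exactly one row.
result = FOREACH grouped GENERATE
    group AS username,
    (COUNT(prefs) > 1 ? 'Multiple' : MAX(prefs.fruit)) AS favorite;

DUMP result;
```

Like the SQL, this counts rows, so a user who lists the same fruit twice is also reported as "Multiple". If only distinct fruits should count as a collision, applying DISTINCT to `prefs.fruit` inside a nested FOREACH before counting might be one way to handle that.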

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow