Hive would be a much more useful and efficient part of the analyst's workflow if it were possible to specify the number of reducers, at least for queries that use neither aggregate (summary) functions nor GROUP BY. As it stands, Hive is poorly suited to creating data subsets under certain conditions, because it can force a large selection through a single reducer when that is not necessary.