I have a question about even distribution when a UPI is defined. Even though the values are unique and a unique row hash is generated for each one, what guarantees that the row hashes will not all be bucketed to the same AMP? For example, if 5, 10, 15, 20, and 25 are the generated row hash values and bucketing takes place on a 5-AMP system, wouldn't all of them go to the same AMP?
Yes, that could happen, but for small tables it would not cause any issues, and for larger tables the probability of it decreases with every additional record. The probability of having 10 rows on the same AMP in TD Express is ~0.004.
@dnoeth Could you please answer the question above? How can unique records under a UPI be distributed equally across all AMPs? Different values in the UPI column produce different hash values, but after passing through the hash map they can still point to the same AMP. So how is equal distribution across all AMPs achieved?
Let's phrase it a bit differently:
UPI will probably result in almost equal distribution among AMPs for larger tables :-)
When you have a 6-AMP system and place rows by rolling a die, you will not expect an even distribution after 6, 12, or 18 rows, but when you roll it 60,000 times the rows will be almost evenly distributed.
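-- Simulate die rolls for samples of 6 to 60,000 rows and compare
-- the actual count per face with the expected count (total / 6).
SELECT id, x
   ,Count(*) AS actual_count
   ,Sum(actual_count) Over (PARTITION BY id) / 6 AS expected_count
FROM
 (
   SELECT SAMPLEID AS id, Random(1,6) AS x
   FROM sys_calendar.CALENDAR
   SAMPLE 6,12,18,60,600,6000,60000
 ) AS dt
GROUP BY 1,2
ORDER BY 1,2;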
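To see where the rows of a real table actually land, the built-in hash functions can be chained together. The following is only a sketch: `my_table` and its UPI column `id` are hypothetical names, so substitute your own. The first query shows the row hash, hash bucket, and owning AMP for each value; the second counts rows per AMP.

```sql
-- Hypothetical table and column: my_table, id (the UPI column).
-- Row hash, hash bucket, and owning AMP for each value.
SELECT id
   ,HASHROW(id)                      AS row_hash
   ,HASHBUCKET(HASHROW(id))          AS hash_bucket
   ,HASHAMP(HASHBUCKET(HASHROW(id))) AS amp_no
FROM my_table;

-- Rows per AMP: the closer the counts, the more even the distribution.
SELECT HASHAMP(HASHBUCKET(HASHROW(id))) AS amp_no
   ,Count(*) AS row_count
FROM my_table
GROUP BY 1
ORDER BY 1;
```

On a small table the per-AMP counts may look lopsided. As with the dice, they should even out as the row count grows.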