I have a table t1 having 20 unique records.UPI is defined in 1st column i.e. on col1.
I have 4 AMP system. All amps hold 5 records.
4 more amp got added. If i insert 12 more records to t1
will it be distributed on newly aded AMPS or to both existing and newly added AMPS. can someone please explain with
hashbucket concept here.
Whenever new AMP are added to the existing system, your hash algorithm changes and the rows from the exisitng AMP may or may not migrate to the new AMP based on the new hash algorithm.
In your case, It will not be like that 12 new records will be added to newly added AMP, these records may go to your old Amp as well as new AMP.
Also, some of the records which were earlier present in the old amps may shift to the new AMP.
To find the number of Rows per Amp , you can use the following query:
Select hashamp(hashbucket(hashrow(col1), count (*) from tableA group by 1
And to find which record is going to which AMP, you can use the following query:
Select hashamp(hashbucket(hashrow(col1), col1 from tableA group by 1
I can find out number of amps and number of records in each amp. But my question is
when PI column value is rowhashed and 32 bit rowhash is generated. The 1st 16 bit is used to identify the destination selection AMP. means the 1st 16 bit is used by the hashmap to find the destination AMP.
And when more AMP is added, structure of hashmap changes. but does it mean the existing records are again re-distributed among all the amps(existing+newly added) according to new hashmap structure?