Dear TD Users,
I am trying to understand if there's an easy way (using queries) to randomly segregate data records in a given table into specific groups.
Basically, I am trying to assign each record a particular (pre-defined) value, at the same time I'd also want to ensure that all the records being selected are (almost) equally distributed in random order between the set of chosen grouping values.
For example: "Table A" has the following data with "Col_1" having unique values...
I would like to assign each row from "Table A" to a specific group value. In this case, let's say that we choose to segregate them into 3 groups: group # 1, 2 & 3. Here's what I am trying to do...
I am trying to come up with a query that is flexible enough to accept "n" value and then segregate the underlying records into 1..n groups equally in a random fashion. However, I am encountering challenges with the following:
Any inputs/suggestions in this regard will be greatly appreciated.
If the number of groups is less than 16 you might apply the existing SAMPLE clause using a two step approach:
Using percentage without calculating numbers (like SAMPLE 0.33,0.33,0.34) might result in a (few) missing rows...
Otherwise you can utilize a ROW_NUMBER over RANDOM:
(ROW_NUMBER() OVER (ORDER BY r) MOD 5) + 1
RANDOM(-2147483648,2147483647) AS r
FROM table AS t
) AS dt
Thank you very much, Dieter!! Both your suggestions were spot on...
I actually didn't realize the SAMPLE clause can be leveraged in this case (I know...long way to go...:)) and was actually trying to work out a solution with RANDOM function and a manually generated range of values (based on # of groups).
Your solution makes total sense and thank you once again for making it so simple and straightforward.
Appreciate your time and inputs!