generalized computing problem which is either going to be obvious as soon as I've figured out how to describe it, or turn out to be one of the Great Unsolved Problems:

I have in my possession a collection of objects, each of which has numerous attributes. I have a blue cold furry cow, a red hot scaly horse, a green hot furry dog, etc. I wish to sort them into somewhere between 3 and 10 boxes. I want the number of things in each box to be as close to equal as possible.

Computer, tell me: Which one of those attributes should I use to sort my objects?


in reply to @pervocracy's post:

Some details might matter, such as the nature of the bins (hot OR scaly vs. hot AND scaly?) and whether it needs to be exhaustive. One interesting path would be the "frequent itemset" discovery algorithms, but they may not really be what you're after.

If the attribute color, for example, is different for every object in your collection, does that make color the best attribute (because now you get to distribute evenly between boxes) or useless (because it doesn't tell you anything about how to distribute between boxes)?

Wanting to make sure I understand the problem: do you want to pick, say, "color" and have one box for each color? If that's the case I think the problem is relatively straightforward; you can just try each attribute in turn and see how evenly the boxes fill for that one, using whatever metric you choose to define "as close to equal as possible" (standard deviation, maximum difference, etc.).
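If that reading is right, the "try each attribute" idea might look something like the sketch below. It assumes every object is a dict with the same keys, that each distinct value of the chosen attribute gets its own box, and that population standard deviation of the box sizes is the evenness metric. The names `box_sizes` and `best_attribute` are made up for illustration, and the sample objects beyond the three from the original post are invented too.

```python
from collections import Counter
from statistics import pstdev

def box_sizes(objects, attribute):
    """Count how many objects land in each box when sorting by one attribute."""
    return Counter(obj[attribute] for obj in objects)

def best_attribute(objects, min_boxes=3, max_boxes=10):
    """Return (attribute, spread) for the most even split, or None if nothing fits.

    Assumption: one box per distinct value, and "even" means the lowest
    population standard deviation of the box sizes.
    """
    best = None
    for attribute in objects[0].keys():
        sizes = box_sizes(objects, attribute)
        # Skip attributes that would need too few or too many boxes.
        if not min_boxes <= len(sizes) <= max_boxes:
            continue
        spread = pstdev(list(sizes.values()))
        if best is None or spread < best[1]:
            best = (attribute, spread)
    return best

# Sample data: the first three objects come from the post, the rest are invented.
objects = [
    {"color": "blue", "temperature": "cold", "texture": "furry", "animal": "cow"},
    {"color": "red", "temperature": "hot", "texture": "scaly", "animal": "horse"},
    {"color": "green", "temperature": "hot", "texture": "furry", "animal": "dog"},
    {"color": "blue", "temperature": "warm", "texture": "furry", "animal": "dog"},
    {"color": "red", "temperature": "hot", "texture": "smooth", "animal": "dog"},
    {"color": "green", "temperature": "hot", "texture": "furry", "animal": "dog"},
]

result = best_attribute(objects)
```

Swapping the metric is a one-line change: for maximum difference, replace the `pstdev` call with `max(sizes.values()) - min(sizes.values())`. Note that this sketch sidesteps the earlier question about an attribute that is different for every object; with more than 10 objects such an attribute is simply filtered out by the box-count limit.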