How to approach a clustering problem based on unsuitability
Author |
Message |
canuckster
|
Posted: Fri Nov 17, 2006 1:46 pm Post subject: How to approach a clustering problem based on unsuitability |
|
|
For those of you rolling your eyes at another first-time poster with a big question to ask -- I hope I don't sound too newbie.
I'll soon be facing a programming challenge for which I feel there must be solutions or at least related algorithms out there already; I don't want to waste time reinventing anything. The nature of it seems to demand a mathematical approach beyond my experience, so I'll likely end up hiring someone to help me -- though if I can offer that person a strategy to begin with, so much the better.
FYI I'm an Actionscript and PHP programmer, so I use a mix of procedural and OOP (though I'd imagine this problem's solution will likely use C or Java). Not really apropos to the topic, but I thought I'd introduce myself.
The problem actually has two main components (the first involving distance-based clustering, which might end up getting combined with the rest of the suitability/unsuitability criteria), but right now I'm most concerned with its second half: I will have a bunch of people (in hundreds) to cluster into groups of 8 according to criteria describing their unsuitability. I say "unsuitability" because what I'm looking for might be called an "anti-matchmaking" algorithm: rather than increase the chances that A and B will end up in the same group of 8 because they share a particular characteristic, I want to decrease the chances.
Continuing with the matchmaking/anti-matchmaking analogy, imagine a small town where there's a limited pool of eligible partners, and once a month the local town hall hosts a singles event, seating people at tables of eight for dinner and drinks. The organizers pay attention to who's attended in the past and with whom they'd been seated, and wish to avoid assigning anyone to a table with a person they'd sat with before (within reason; perhaps only the previous two or three events would be considered).
To my mind, the computational requirements in order to minimize the potential for I-sat-with-you-last-time grow enormously as does the number of participants. A simple brute-force method -- examine every possible combination, assign penalty scores according to seating conflicts (assigning, perhaps, a higher penalty for having sat together one event ago, but a smaller penalty for two events, etc.), then sort the results and take the lowest-penalty-sum result as the answer -- would work, but be at best ungainly and at worst impossible.
I'm sure there are much more elegant approaches to solving this -- some of which I might be able to understand myself, some of which only with the aid of someone who specializes in math or computer science -- but I don't even know where to begin to look for such approaches, much less evaluate their suitability.
So -- if all I get out of posting this message is "just google XXXXXXXX and YYYYYYYYY", then I'll be one big step ahead of where I am now, since right now I don't know what to look for.
Thanks for wading through this post. I'm grateful for any suggestions. |
|
|
|
|
 |
Sponsor Sponsor

|
|
 |
canuckster
|
Posted: Thu Nov 23, 2006 9:23 pm Post subject: (No subject) |
|
|
I'm trying to figure out why no one has replied to this thread.
Have I been presumptuous in asking this type of programming question here? Is it a case of "he's got newbie written all over; leave him be"?
Is it because I identified myself as a 'web programmer', and therefore should not be taken seriously?
Have I asked too tough a question?
I'm just surprised because opinions and suggestions seem to flow freely in all the other forums I'm a member of.
If I have annoyed people here with my post, I apologize, and won't post again. |
|
|
|
|
 |
Clayton

|
Posted: Thu Nov 23, 2006 9:35 pm Post subject: (No subject) |
|
|
I'm sorry, at first I didn't answer to this question because I didn't really think I had any kind of answer that would suit you. however, reading through it again, my suggestion to you would be this, check out matching algorithms, then, find a way to bend it to work backwards (usually means switching a couple of things around) and see what happens. Also, if you don't really have an execution time constraint, you could theoretically brute force it. Although it wouldn't be pretty it is pheasable to an extent.
Please be assured that you didn't offend anybody, most likely they just had the same kind of thoughts as myself. |
|
|
|
|
 |
[Gandalf]

|
Posted: Thu Nov 23, 2006 9:54 pm Post subject: (No subject) |
|
|
Actually, it was a great first post, and I'm sure you haven't annoyed anyone. (How could that be?)
If anything, the reason you haven't recieved any replies is because it's too tough of a question, or possibly, too specific of a question. I'm pretty sure a lot of the posters here are more interested in programming concepts and general computer science skills, not neccessarily specific algorithms and such. We really only have a few true "experts" in here.
To be honest, if this problem were assigned to me I'd go straight for the brute force method you described, but like you said, who knows how successful it would be. I guess it depends on what exactly you need this for?
And, welcome!  |
|
|
|
|
 |
canuckster
|
Posted: Sun Nov 26, 2006 2:57 pm Post subject: (No subject) |
|
|
Thanks for your replies -- really! I seem to have been having an unlucky streak with external developers where they go silent after awhile and just leave me hanging. So that's made me doubly ready to assume that other people "just can't be bothered". Anyway, my faith has recovered a notch or so.
I've considered your suggestion, Freakman, about switching things around in my model so that it could work with existing clustering methods -- perhaps find a way to say that what two people have in common is that they haven't been seated together before.
And the brute force method might have its day, too ... I'd like to do some more research prior to that, because I'd like to avoid both the performance penalties and the inelegance (read: ego-bruising) that would entail.
It's just frustrating when I think that this seems like a problem that's already been solved, perhaps in several ways -- and if only I were, say, a grad student or prof in Comp Sci or Math, I'd know how and where best to continue my research.
BTW I'm not ready for this right now, and can't be 100% certain of when I would be, but can you offer any suggestions of where to look to hire someone to program such a beast? Preferably in southern Ontario (always nice to discuss things in person where possible), but I suppose it could really be anywhere. And if you know of any other forums where I might post my question, please say so.
One last thing -- I'm supposed to be subscribed to this thread, but never received any notification of your posts. Does this surprise you? Maybe I made a mistake, but if not then the site admin should know. |
|
|
|
|
 |
|
|