|
It'd be interesting to see how many people voted for each rating, sort of like MSDN articles[^]. I find, for example, that in cases where there are a lot of 1s and a lot of 9s (5s in the case of Code Project), the 1s are negligible because they're usually just entered by the overzealous or bored. But you wouldn't know that's the case unless you could see the rating distribution the way you can on MSDN; you just see a 3 instead.
|
|
|
|
|
This would be great indeed!
Ivan S Zapreev
cerndan@mail.ru
|
|
|
|
|
|
I personally do not like the voting model wherein you choose how much you like something, from 1 to 5 (or 1 to 10, etc.). Each person interprets this range differently. I, for instance, use it as a quality scale starting at 1 for a letter grade similar to a C−, scaling up to 5 for a 'perfect A+'; others use the middle as 'OK', 1 as 'horrible', and 5 as 'good enough'. The result is that the average of all these numbers does not mean much. It's marginally better than sheer randomness.
The problem is that people interpret the scale differently, and that there are too many grades. We have lots of ‘markers’ on the net, let’s use them!
I think voting on the Internet should be constrained to just three options with well-defined meanings and color-codings:
- Good/Yes/Quality
- Meh/Average/Maybe/‘I agree’
- Poor/No/‘Disagree’
All items start out with an average rating, and this initial rating has not-insignificant weight, which should be about half the average number of ratings an item on the site receives. Starting out in the middle means the first few votes move the average a small amount, instead of a large amount. Authors can then vote ‘Good’ on their own articles, and not affect the ranking so drastically at the beginning.
By voting ‘yes’, you increase the rating. By voting ‘no’, you decrease the rating, and by voting ‘meh’, you are in effect saying: “The current rating is accurate, increase its weight.”
Gold/Silver/Bronze weighting and what-not can be added on top of this model pretty trivially.
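A minimal sketch of how this model could work, assuming a made-up prior weight of 10 votes and an arbitrary mapping of Good/Meh/Poor to 1.0/0.5/0.0 (all names and numbers here are illustrative, not anything the site actually uses):

```python
# Hypothetical sketch of the three-option rating model described above.
# Scale: Good = 1.0, Meh = 0.5, Poor = 0.0. Every item starts at the
# midpoint with a prior weight (here 10 votes' worth, standing in for
# "half the average number of ratings an item receives").

PRIOR_RATING = 0.5
PRIOR_WEIGHT = 10  # assumed site-wide constant

VOTE_VALUES = {"good": 1.0, "poor": 0.0}

def rate(votes):
    """Return the weighted rating for a list of 'good'/'meh'/'poor' votes."""
    total = PRIOR_RATING * PRIOR_WEIGHT
    weight = PRIOR_WEIGHT
    for v in votes:
        if v == "meh":
            # 'Meh' endorses the current average: add weight at the
            # running rating instead of pulling it up or down.
            total += total / weight
        else:
            total += VOTE_VALUES[v]
        weight += 1
    return total / weight
```

Note how a 'meh' vote leaves the running average where it is but makes it harder to move afterwards, which is exactly the "increase its weight" behavior described above, and how the first 'good' vote only nudges the rating from 0.5 to about 0.55.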
Another thing I just thought of is a fourth category of member. This category is only achieved when a member posts a certain number of articles that achieve a good rating from enough other members, or something along those lines. Perhaps these members could have an additional 'double-plus-good' button for voting, which counts as 3 gold-level 'good' ratings. Having the extra option would be a good reward for people who achieve this high level of membership. Maybe this group is short-listed for a draw every month, or something.
|
|
|
|
|
The only fault I find with the system is the assumption that a person is a rookie because of the following:
Newly joined
Low levels of posting
Few or no articles
None of this by itself makes a person either a guru (sh*tty articles, long membership, overposting) or a rookie, so determining a person's weight from guru or rookie status is simply not going to work.
Maybe we should take tests to prove our levels; then true weighting could be established.
And simply posting this in no way makes me a guru.
Which I am definitely not.
Rant finished.
Fear is the mind killer. So most of you have everything to worry about, now don't you!
|
|
|
|
|
Testing raises its own inescapable problems, and it's just not going to happen on the site anyway. I agree with your sentiments completely, though.
The way these issues are often tackled on community sites is to have members "trust" each other; someone gets more trust points the more people trust them, and the more highly trusted those people are themselves. This can result in cliquishness if implemented incorrectly or with too small a group, but with a group the size of the Code Project membership it could work well.
Did you ever read Michael Shermer's column in Scientific American, on being a skeptic? He often talks about the difference between science and pseudoscience. Pseudoscientists often reference each other's work in an attempt to better trick the public; here you have a trust network outside that of science. However, it is neither as large nor as well-established as the trust network populated by genuine scientists.
It helps to have something real to which to "anchor" things; for instance, the pseudoscientists' trust network probably contains very few, if any, Nobel prize winners. In our situation, it could be something like testing as you suggest, or the mechanisms already built into the site: articles, voting, etc. It's not necessary to have "anchors", though; their importance decreases as the population size increases. It's possible for someone like Daniel Stephen Rule to spam day and night and get a respectable score on something just by creating bullshit accounts, but it would be impossible for him to stem the tide of public opinion all on his own, no matter how hard he tried.
A related idea: IBM was the first to create a fully-functional search engine based on ranking ideas drawn from scientific research papers. I'm talking about the references that scientists make to other papers in their own papers. A paper gets more acclaim if more important scientists refer to it in their own papers-- you get the idea. IBM adapted this basic methodology to web searches; sites are either considered to be "hubs" or "authorities" (they can be both, but don't tend to be). Highly-respected hubs point to many highly-respected authorities; authorities are trusted to contain useful information. The "respect" calculations take place in a series of passes over the network; sites with weak connections wind up dropping off.
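If anyone's curious, those repeated passes over the network can be sketched in a few lines. This is a toy version of the hub/authority scheme just described; the page names and link structure are invented, and real implementations iterate to convergence over millions of pages:

```python
# Toy hub/authority ranking. Each pass, a page's authority score is the
# sum of the hub scores of pages linking to it, and a page's hub score
# is the sum of the authority scores of pages it links to.
links = {
    "hub1": ["authA", "authB"],
    "hub2": ["authB", "authC"],
    "lone": ["authA"],
}
pages = set(links) | {p for tgts in links.values() for p in tgts}
hub = {p: 1.0 for p in pages}
auth = {p: 1.0 for p in pages}

for _ in range(20):  # repeated passes over the network
    auth = {p: sum(hub[h] for h, tgts in links.items() if p in tgts)
            for p in pages}
    hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
    # Normalize so the scores don't blow up from pass to pass.
    na = sum(v * v for v in auth.values()) ** 0.5 or 1.0
    nh = sum(v * v for v in hub.values()) ** 0.5 or 1.0
    auth = {p: v / na for p, v in auth.items()}
    hub = {p: v / nh for p, v in hub.items()}
```

Here "authB" ends up with the highest authority score because two hubs point at it, and "hub1" outranks "lone" as a hub because it points at more strong authorities; pages with weak connections wind up with scores near zero.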
Regards,
Jeff Varszegi
|
|
|
|
|
In the case of IBM (remember, I am basically ignorant of these types of things), would you consider a case where a more "important" scientist is disproving another's work? Wouldn't that give the "disproved" scientist more credibility?
I was being sarcastic about the testing because I have met plenty of programmers who have taken certifications and scored high but couldn't really do squat in the real world, i.e., book-smart.
As for the Daniel Stephen syndrome, he is a complete idiot who should be banned by IP address. I wrote one article, but pulled it after I realized that I hadn't delivered what the population expected.
All the best
Eric C. Tomlinson
No comment, Mr. Senator
|
|
|
|
|
That's an interesting question. I don't know as much about the research paper ranking as about the search engine itself, and I'd have to go read up on it again to really have an in-depth conversation. I'm not trying to pass myself off as an authority, just throwing some stuff out there that I found interesting in the past.
I agree totally with you about the certification stuff. I have some certifications myself that I think were too easy to get (I only tout them on my resume), and I've met enough over-certified schmucks in the last few years to last me the rest of my life.
A friend of mine said years ago that the main problem is that programming is not a profession, in the way that being a lawyer or a doctor is a profession. It's missing several criteria:
1) The uniform is usually an indicator of professional status
2) Industry-standard certification required to practice
3) Some sort of code of conduct, often accompanied by a "sacred" oath
4) Legal recognition of the above (often but not always excluding the uniform!)
You get the idea. Policemen are closer to being true professionals than we are. The certifications out there exist so that companies can claim wide acceptance of their respective technologies; it's in the best interests of a company to certify as many people as possible, as long as they can maintain some semblance of professionalism as they go about it. I think this explains why almost all testing processes for IT folks are mediocre.
I think that with the pressures exerted on government and other important projects by ISO certification requirements and the like, we're gradually moving towards professionalism for the industry. I hate to hear everybody prattling on about 6 Sigma certification etc. as if it's some magic bullet, but I guess that's more about project management than development anyway.
Regards,
Jeff Varszegi
|
|
|
|
|
Jeff Varszegi wrote:
1) The uniform is usually an indicator of professional status
2) Industry-standard certification required to practice
3) Some sort of code of conduct, often accompanied by a "sacred" oath
4) Legal recognition of the above (often but not always excluding the uniform!)
This is an excellent point. But how the hell can we get enough programmers to agree on what we need to do to establish our credibility? We can't even decide between multiple OSs.
No comment, Mr. Senator
|
|
|
|
|
multiple OS
You're totally right! And I suspect that if we were to split up the profession properly, so that we had graphics specialists, etc., the way they have in the medical field, we'd probably wind up with the most specializations of any profession yet seen. I think that the OS schism is driven by company loyalty/hatred, which I think you also wouldn't see in other professions very much. We're a motley bunch. I'm sure all of these questions will be answered someday, but not during my career. It's a crappy time to be a programmer-- we're past the heady dot-com times, but without anything good to replace them. Now we suddenly find ourselves "commoditized" in a global marketplace, without any sort of professional standards on the basis of which to compete; but the companies that are getting the big outsourcing contracts are all trumpeting their certifications. (I'm not railing against those companies in places like India-- everybody's gotta eat. It doesn't mean that I have to like what's happening to my livelihood, though.)
Regards,
Jeff Varszegi
|
|
|
|
|
I agree that everyone has the right to eat, but I have been called in after an "overseas" implementation to clean up for several months. The documentation was horrible, the delivered product didn't match the requirements, and there was no plan in place for who would maintain it. The company decided that having either consultants or staff on hand to develop and maintain the system (even though it seems more expensive at first look) was the better choice.
Even the service-related companies, for example Dell, are going to end up bringing their customer service teams back to the US. I had to call for service on my laptop, and I had to call 4 times to finally reach a "chap" who could understand me as well as I understood him. Needless to say, I was pretty hot under the collar and complained to the company. After 3 months they announced that they were re-hiring the displaced US workers and moving the call center back to the US.
I have nothing against Indians or Pakistanis. I have worked with both on multiple projects and have seen nothing less than perfection from them.
No comment, Mr. Senator
|
|
|
|
|
I used to be a mathematician. (I finished the Mechanico-Mathematical department of Moscow State University.)
I was amazed to see complex formulas I managed to forget a long time ago.
George
|
|
|
|
|
Follow this link to get some more complex formulas:
http://www.codeproject.com/useritems/CodeProject_s_Voting.asp
BTW, I graduated from the Mechanico-Mathematical department of Novosibirsk State University.
Ivan S Zapreev
cerndan@mail.ru
|
|
|
|
|
|
Do not worry....
If you read this article, then you wasted your time too.
Ivan S Zapreev
cerndan@mail.ru
|
|
|
|
|
Hi Ivan. Hi all.
First of all, I had a good time reading big words about small problems, especially in the posts! (4th moment and so on)
The temptation to write about what is good and what is bad in the article is there, but I wouldn't like to play the same game (big words for small problems).
Anyway, the assumption that votes and weights are independent is really too strong! (Perhaps they are *uncorrelated*.)
If you say so, you say "all people vote the same way" (same distribution).
But weights are used because experienced people (should) vote better...
P.S. I won't tell you my vote, but don't worry, it has a really small weight!
Luca Piccareta
|
|
|
|
|
You are right:
>>Anyway, the assumption that votes and weights are
>>independent is really too strong!
>>(perhaps they are *uncorrelated*)
is a weaker assumption, and in this case all the conclusions stay the same.
But please note that you have also paid attention to this article, and besides, you also wrote some big words about small problems.
BTW, I really do think that people vote the same way, as I presume that they vote ideally, i.e., depending on the quality of the article.
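For the doubtful, the independence claim is easy to check numerically. In this made-up simulation, each voter's weight is drawn independently of his vote, and the weighted mean comes out the same as the plain mean, up to sampling noise:

```python
# When weights and votes are independent, the weighted mean and the
# plain mean agree in expectation. All numbers here are illustrative.
import random

random.seed(0)
n = 100_000
votes = [random.choice([1, 2, 3, 4, 5]) for _ in range(n)]
weights = [random.choice([1, 2, 3, 4, 5]) for _ in range(n)]  # independent of votes

plain = sum(votes) / n
weighted = sum(w * v for w, v in zip(weights, votes)) / sum(weights)
# Both hover around 3.0; the weights change nothing in expectation.
```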
Ivan S Zapreev
cerndan@mail.ru
|
|
|
|
|
For movie ratings there are no gurus and rookies, but the rating is still useful to estimate the potential interest of a movie based on people's affinity: if someone likes the same movies as me, I should also see the movies he likes that I have not seen yet, and vice versa. Perhaps this affinity theory is also useful for Code Project's article rating?
|
|
|
|
|
Do you mean that if someone likes the same kind of articles as you do, then he will not only read the same articles as you but also vote the same way?
Ivan S Zapreev
cerndan@mail.ru
|
|
|
|
|
Statistically, yes! If I think I am a guru, I will naturally read other gurus' articles first, and vice versa.
|
|
|
|
|
I think that this influences only the distribution of voters' weights for each particular article.
Ivan S Zapreev
cerndan@mail.ru
|
|
|
|
|
Some speculation on an implementation:
Each user's vote would have to be stored, as well as their voting history (normalized, probably).
Then, when you are presented with a list of articles, those who vote similarly to you would have more weight in how your listing is sorted.
This system would depend on how you vote, so for those who vote very little it should fall back on a default algorithm, or the algorithm should introduce the 'similar votes' term as a function of how many votes you've cast. As you vote for articles, the system gets more certain of your similarity to other members of the site. Perhaps an n-space could be used to classify users.
This would all be very expensive in terms of CPU time, but I see several optimizations. One would be to run the algorithm only once every few days, perhaps off-server. The vote history could be discarded after each of these runs.
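One way to sketch that 'similar votes' term: represent each member's vote history as a vector over articles and compare members with cosine similarity. The users, articles, and scores below are all invented for illustration:

```python
# Minimal sketch of the 'similar voters' idea: members whose vote
# vectors point the same way get high similarity, and could then get
# more weight when sorting articles for you.
import math

# Hypothetical vote histories: article id -> score (1-5)
history = {
    "me":  {"a1": 5, "a2": 1, "a3": 4},
    "ann": {"a1": 5, "a2": 2, "a3": 4},  # votes much like "me"
    "bob": {"a1": 1, "a2": 5, "a3": 2},  # votes unlike "me"
}

def cosine(u, v):
    """Cosine similarity over the articles both users voted on."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[a] * v[a] for a in common)
    nu = math.sqrt(sum(u[a] ** 2 for a in common))
    nv = math.sqrt(sum(v[a] ** 2 for a in common))
    return dot / (nu * nv)

sim_ann = cosine(history["me"], history["ann"])
sim_bob = cosine(history["me"], history["bob"])
```

A user with no overlapping votes gets similarity 0, which is where the fallback default algorithm would kick in.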
|
|
|
|
|
I entirely agree with you on all points!
n-space could be used to classify users: Yes! And classifying this n-space would produce a psychological profile of the users! (For example, for movie ratings an n-space can correspond to movie categories: fantasy, science fiction, ...; and for article voting it can correspond to level of expertise: didactic article or expert article.)
CPU time: I think this is the reason why IMDb does not yet have an affinity classifier for voting members: it is very computationally expensive, but it would be worth the cost for predicting movie attractiveness.
|
|
|
|
|
See also: Collaborative Filtering: www.vsnetfr.com/lien.aspx?ID=3923
|
|
|
|
|
No math here:
guru: people with a high membership level
rookie: people with a low membership level
(I) "Wi and Vi independent" implies: gurus and rookies vote in a similar way
(D) "Wi and Vi dependent" implies: gurus and rookies don't vote in a similar way
(Given your level of math, and to keep this post short, I don't think I need to prove or explain the above implications, but I can if you ask.)
If (I) is true (that is your hypothesis when you write E(fg) = E(f)*E(g)), then the weights don't change the vote. That's correct, and nobody needs statistics to understand that if the gurus alone would give the same rating as the rookies alone, the combined rating would be the same.
If (D) is true (that is Code Project's hypothesis), the weight system allows the combined rating to be closer to the opinion of the gurus. And also, if (D) is true, your demonstration is false.
So, rather than a show-off of mathematical formulas, I expected, from the title of the article, a discussion of:
Do the gurus vote so differently from the rookies?
or:
Why weights of 1 2 3 4 5 instead of 10 20 30 40 50?
or:
Why not a rating per membership level in addition to the global one?
or:
Why not discard the lowest and highest 10 percent of ratings to reduce data pollution?
And there are many others.
As a matter of fact, the rating system is an ARBITRARY choice, reflecting Code Project's own opinions on the best way to give feedback on article interest.
Weighted sums as they are used seem OK to me, as they give a weighted mean under (D), and the correct mean under (I) too (you only proved the latter).
But your conclusion is wrong. When you say:
"It was discovered that although the weight of each person in the system is taken into account, the mean value doesn't depend on it."
you make a 'statistician rookie' mistake, forgetting that a demonstration has no value as proof if its hypotheses are not checked.
Of course, it's up to you to claim that it's a matter of personal opinion whether experienced programmers judge an article the same way as inexperienced ones, and up to me to think that as long as the rating system is logically and mathematically correct and makes common-sense assumptions, I won't find it smart or dumb, just fitting for the job.
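To illustrate the difference between (I) and (D) numerically, here is a made-up population where weights and votes are dependent: gurus carry weight 5 and rate high, rookies carry weight 1 and rate low. The weighted mean then visibly departs from the plain mean, which is exactly why the hypothesis matters:

```python
# Case (D): weights and votes are dependent, so the weighted mean is
# pulled toward the high-weight voters. Population numbers are invented.
import random

random.seed(1)
voters = []
for _ in range(50_000):
    if random.random() < 0.2:
        voters.append((5, random.choice([4, 5])))     # guru: weight 5, rates high
    else:
        voters.append((1, random.choice([1, 2, 3])))  # rookie: weight 1, rates low

plain = sum(v for _, v in voters) / len(voters)
weighted = sum(w * v for w, v in voters) / sum(w for w, _ in voters)
# plain lands near 2.5; weighted lands near 3.4, closer to the gurus.
```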
|
|
|
|
|