|
One could be genuine. But the second, on a two-year-old question? Nah - it's spam.
Both "solutions" deleted, spammer on 3
Bad command or file name. Bad, bad command! Sit! Stay! Staaaay...
|
|
|
|
|
|
|
Gone.
--
"My software never has bugs. It just develops random features."
|
|
|
|
|
|
A bright one, this time.
A bloke called William: http://www.codeproject.com/script/Membership/View.aspx?mid=11785769 - talking about what happened to his breasts after he stopped breastfeeding his son, and the wonder drug he found to fix it...
I knew marketeers were a different species, but I didn't know they were a different phylum...
Bad command or file name. Bad, bad command! Sit! Stay! Staaaay...
|
|
|
|
|
|
|
10th kick applied.
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
|
Gone
If the brain were so simple we could understand it, we would be so simple we couldn't. — Lyall Watson
|
|
|
|
|
|
|
The Spam Filter we are using on CodeProject uses probabilities that words and phrases would occur in Spam and non-Spam content to predict if a Message, Question, or Answer is Spam or not.
If the probability that it is Spam is high enough, we send the potential fertilizer to the moderation queue, where a select few have been granted the godly power to bless it so that it can be seen by all, or assign it to a tortuous purgatory.
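As a rough illustration of that scoring-and-routing step, here is a minimal naive Bayes sketch. It is not the actual CodeProject code: the function names, the add-one smoothing, and the `MODERATION_THRESHOLD` value are all assumptions made for the example, and it only looks at single words rather than phrases.

```python
import math
from collections import Counter

# Hypothetical cutoff; the real filter's threshold is not stated in this thread.
MODERATION_THRESHOLD = 0.9


def spam_probability(text, spam_counts, ham_counts, spam_docs, ham_docs):
    """Naive Bayes estimate of P(spam | words), using add-one smoothing."""
    vocab_size = len(set(spam_counts) | set(ham_counts))
    spam_total = sum(spam_counts.values())
    ham_total = sum(ham_counts.values())

    # Start from the smoothed log prior of each class.
    log_spam = math.log((spam_docs + 1) / (spam_docs + ham_docs + 2))
    log_ham = math.log((ham_docs + 1) / (spam_docs + ham_docs + 2))

    # Add the smoothed log likelihood of every word under each class.
    for word in text.lower().split():
        log_spam += math.log(
            (spam_counts.get(word, 0) + 1) / (spam_total + vocab_size + 1))
        log_ham += math.log(
            (ham_counts.get(word, 0) + 1) / (ham_total + vocab_size + 1))

    # Turn the two log scores into a probability; guard against overflow.
    diff = log_ham - log_spam
    return 0.0 if diff > 700 else 1.0 / (1.0 + math.exp(diff))


def route(text, spam_counts, ham_counts, spam_docs, ham_docs):
    """Send likely spam to moderation; publish everything else."""
    p = spam_probability(text, spam_counts, ham_counts, spam_docs, ham_docs)
    return "moderation queue" if p >= MODERATION_THRESHOLD else "published"


# Made-up word counts standing in for the real probability tables.
spam_table = Counter({"wonder": 5, "drug": 5, "cheap": 4})
ham_table = Counter({"filter": 5, "code": 6, "compile": 3})

print(route("wonder drug", spam_table, ham_table, 5, 5))
print(route("compile code", spam_table, ham_table, 5, 5))
```

With these toy tables the first message lands in the moderation queue and the second is published; real tables would hold far more words and phrases.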
Until recently, when content was rejected, it was analyzed and its words and phrases were added to the Spam probability tables. Unfortunately, the accepted content wasn't analyzed and added to the non-Spam probability tables.
This meant that trigger-happy or erroneous rejection of content would gradually weight the probabilities for good content towards being placed in the moderation queue as suspected Spam.
We have changed this, so that accepted content is now analyzed and saved. This means that even if some content is erroneously rejected, eventually (actually very quickly) the acceptance of similar content will change the probabilities so that the good content will no longer be sent to moderation. So keep up the good work and the Spam Filter will correct itself and get better and better.
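To make the effect of that change concrete, here is a small sketch of a learner that is first taught only from rejections and then also from accepted content. The `ProbabilityTables` class, its methods, and the sample messages are hypothetical and only stand in for whatever the real filter stores.

```python
import math
from collections import Counter


class ProbabilityTables:
    """Toy word-count tables for Spam (True) and non-Spam (False)."""

    def __init__(self):
        self.counts = {True: Counter(), False: Counter()}
        self.docs = {True: 0, False: 0}

    def learn(self, text, is_spam):
        # Record a moderator decision in the matching table.
        self.counts[is_spam].update(text.lower().split())
        self.docs[is_spam] += 1

    def score(self, text):
        # Naive Bayes estimate of P(spam | words) with add-one smoothing.
        vocab = len(set(self.counts[True]) | set(self.counts[False]))
        total_docs = self.docs[True] + self.docs[False]
        log_score = {}
        for is_spam in (True, False):
            total = sum(self.counts[is_spam].values())
            value = math.log((self.docs[is_spam] + 1) / (total_docs + 2))
            for word in text.lower().split():
                value += math.log(
                    (self.counts[is_spam][word] + 1) / (total + vocab + 1))
            log_score[is_spam] = value
        diff = log_score[False] - log_score[True]
        return 0.0 if diff > 700 else 1.0 / (1.0 + math.exp(diff))


tables = ProbabilityTables()
tables.learn("buy wonder drug now", True)
tables.learn("my pointer code crashes now", True)  # an erroneous rejection

# Old behaviour: only rejections were ever learned.
before = tables.score("pointer code crashes")

# New behaviour: accepted content is learned as well.
for _ in range(3):
    tables.learn("pointer code crashes on startup", False)
after = tables.score("pointer code crashes")

print(f"before: {before:.2f}, after: {after:.2f}")
```

In this toy run the score for the legitimate message drops once a few similar accepted posts have been learned, which is the self-correction described above.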
However, the current probability tables are a little off for some low-frequency phrases and words, and it's your fault for being too eager to reject. Please read carefully and consider the context of the content before nuking or blessing it. It will make your life easier in the long run.
Also, it's not possible to tweak the tables for 'special' cases. It is a learning algorithm which makes its decisions based on the decisions of the moderators. We didn't set up the probabilities; they were created from your work, and there are literally hundreds of thousands of data points.
This is also why we are not too concerned that some spammer/troll will figure out how to get around the filter. Once you moderators identify enough of any type of spam/abuse, then they are toast. There is only so much they can do to their garbage and still make it convey the message they wish.
Please keep up the great work, and we look forward to any suggestions or questions concerning the Spam Filter (after they have been analyzed for Spam).
|
|
|
|
|
Matthew Dennis wrote: Please read carefully and consider the context of the content before nuking or blessing it.
That's a problem, because we don't get to see the context in which the message is placed (or the question the answer is aimed at) - and sometimes that is relevant. Perhaps we need a link?
Bad command or file name. Bad, bad command! Sit! Stay! Staaaay...
|
|
|
|
|
|
In the last few days, I've had two messages marked as possible spam; they were eventually released, but I'd like to know, if possible, WHY they were marked as possible spam. If there are certain words I should avoid, I'll try to do so, but without knowing, it's hard to adjust behaviour.
Thanks,
Tim
|
|
|
|
|
Telling people on a case-by-case basis would, I guess, be very time-consuming.
Telling everybody would defeat the object, as you would then be educating the spammers.
Tim Carmichael wrote: If there are certain words I should avoid, I'll try to do so, but without knowing, it's hard to adjust behaviour.
That's the point: you shouldn't adjust your behaviour; the filter should learn what normal messages are. The word or words you used were incorrectly flagged as potential Spam indicators. It's this flagging that has been fixed, so you should be able to use these words now.
|
|
|
|
|
Please don't adjust your posting behaviour. The whole point of the spam system is that we teach it what we feel is acceptable and what is not. Post as you will, and if it gets trapped then after it's marked "good" the system will learn and should not bother you again. If it again gets all weird on you then it's time to give it another lesson.
It works for us; we don't work for it.
cheers
Chris Maunder
|
|
|
|
|
Thank you both... it was just odd because I had never had anything marked as spam before.
|
|
|
|
|
For Message Moderation the Message Replied To column contains links to the parent messages. I've turned this into a button to make it more obvious.
I'll look at doing something similar for Q&A.
|
|
|
|
|
Done and waiting for deploy.
|
|
|
|
|
Wouldn't it make sense to bypass the spam filter for messages from members with a certain minimum reputation?
If the brain were so simple we could understand it, we would be so simple we couldn't. — Lyall Watson
|
|
|
|
|
Nope.
As discussed before: spam, sadly, can come from all directions.
cheers
Chris Maunder
|
|
|
|
|
The only thing I could think of might be to not Filter Spam in this Forum as it is likely that posters might be quoting a spam message.
But I am waiting until it becomes an issue, as the Filter might actually figure this out.
|
|
|
|