I want to check the words of a complete sentence against a list of banned words and, if the sentence contains any of them, not proceed to the next step. However, my check also matches partial words.

For example, if my list contains the word 'me' and the sentence contains the word 'some', the sentence is rejected.

What I have tried:

Here is my current check:

Python
if not any(word in text_sentence.lower().strip() for word in banned_words):
    # Proceed to the next step, as this is a valid sentence
    pass


When I pass "Some one like you", the check fails because my list contains the word "me", which matches inside "Some". I verified this by printing the matching words:

Python
for item in banned_words:
    if item in slbot_tweet:
        print(item)
Updated 15-Aug-18 22:14pm

You need to break your sentence down into individual word tokens, using all forms of punctuation as delimiters: single quotes, double quotes, commas, spaces, dots, colons, semicolons, brackets, exclamation marks, hyphens, and anything else your user is likely to type to mask a word. I'd suggest breaking it on anything that isn't a letter or a number!

Once you have the string as an array of word tokens, you can check if any of them are in the "banned" list.
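
As a rough illustration of that suggestion, here is a minimal sketch that splits on anything that isn't an ASCII letter or digit and then checks whole tokens. The function name `contains_banned_word` and the sample list are made up for this example, and it assumes English text (an apostrophe splits "don't" into "don" and "t").

```python
import re

def contains_banned_word(sentence, banned_words):
    """Return True if any whole word of `sentence` is in `banned_words`."""
    # Split on every run of characters that is not an ASCII letter or digit.
    tokens = re.split(r"[^a-z0-9]+", sentence.lower())
    banned = {word.lower() for word in banned_words}
    # Skip the empty strings produced by leading/trailing delimiters.
    return any(token in banned for token in tokens if token)

banned_words = ["me", "spam"]  # hypothetical list, for illustration only
print(contains_banned_word("Some one like you", banned_words))  # False
print(contains_banned_word("Call me, maybe!", banned_words))    # True
```

Using a set for the banned words keeps each lookup cheap when the list is long.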
 
 
Comments
CodingLover 16-Aug-18 1:59am    
Thanks for the comment.

I thought the following segment splits the sentence on spaces.

word in text_sentence.lower().strip()
Your tweet text is not an array of words, but a string, which is an array of characters, so the in operator matches substrings such as "me" inside "some". As OriginalGriff says, you need to break it up into a proper array of tokens (words) first. Use the string.split() method.
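
To make the difference concrete, here is a small sketch comparing the two membership tests. The variable names follow the question, and the banned list is a placeholder.

```python
text_sentence = "Some one like you"
banned_words = ["me"]  # placeholder list, for illustration only

# On a str, `in` is a substring test, so "me" is found inside "some".
substring_hit = any(word in text_sentence.lower() for word in banned_words)

# On a list, `in` compares whole elements, so only complete words match.
tokens = text_sentence.lower().split()
whole_word_hit = any(word in tokens for word in banned_words)

print(substring_hit)   # True
print(whole_word_hit)  # False
```

Note that split() with no arguments only splits on whitespace, so a token such as "me," keeps its comma and would not match "me"; punctuation needs to be handled separately.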
 
 
Comments
CodingLover 16-Aug-18 4:33am    
Yes, it works once tokenized. I had mixed up strip() and split().

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


