Archive for August, 2010
One of my recent PHP projects required filtering inappropriate language out of user-submitted content. After thinking about the problem briefly, I decided that I didn’t want to write the filter myself but would rather find a third-party service that could filter my text for me. By doing this, I eliminated the need to create and maintain a bad-word list, and saved the CPU cycles required to actually perform the search-and-replace (although, arguably, remote API calls are more expensive anyway).
After some searching I stumbled across the free Cdyne Profanity Filter Service. Not only does this service filter out the standard inappropriate language that you would expect, it also avoids false positives (e.g., the word hello isn’t filtered for containing the word hell), and it has fairly robust phonetic character matching to catch things like a$$. The Cdyne service is exposed as a SOAP web service described by a WSDL, so it is easy to call from languages other than PHP.
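For a rough idea of what calling such a service looks like with nothing but PHP’s built-in SoapClient, here is a minimal sketch. The WSDL URL, the SimpleProfanityFilter operation, the Text parameter and the CleanText response field are my assumptions about the service, not something confirmed here, so check them against Cdyne’s published WSDL first. The filterProfanity helper is a hypothetical name used only for illustration.

```php
<?php
// Sketch only. The WSDL URL, operation name, parameter name and response
// fields below are assumptions; verify them against Cdyne's WSDL.
const CDYNE_PROFANITY_WSDL = 'http://ws.cdyne.com/ProfanityWS/Profanity.asmx?wsdl';

/**
 * Hypothetical helper: pass $text through a SOAP client that exposes a
 * SimpleProfanityFilter operation and return the cleaned text.
 *
 * $client is any object with that method, e.g. a SoapClient built from the WSDL.
 */
function filterProfanity($client, string $text): string
{
    $response = $client->SimpleProfanityFilter(['Text' => $text]);
    $result = $response->SimpleProfanityFilterResult ?? null;

    // Fall back to the original text if the response has an unexpected shape.
    return isset($result->CleanText) ? (string) $result->CleanText : $text;
}

// Live usage (requires the soap extension and network access):
// $client = new SoapClient(CDYNE_PROFANITY_WSDL);
// echo filterProfanity($client, 'some user-submitted text');
```

Taking the client as an argument keeps the helper testable without hitting the network: a stub object with the same method can stand in for the real SoapClient.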
I ended up writing a Zend Framework-based SOAP client service class for the Cdyne filter, and I figured I would share it with anyone else looking to do filtering. The following zip contains the service class, along with some unit tests demonstrating the use of the class methods. You should be able to rename the Zext_Service_Cdyne_ProfanityFilter class to one of your choosing if you do not like the pseudo-namespacing I’ve used. Check out Cdyne’s wiki for more info.