by area. There is probably a big difference in accuracy between pornography and the other objectionable areas. There is also a trade-off between false positives and false negatives. The extent to which advanced techniques make a difference depends on where in the trade-off you start out. If I had to give a number, I would expect a 20 to 30 percent improvement in accuracy over the bag-of-words model, assuming you want to let all good content through (that is, you do not want over-blocking).
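The trade-off described above can be made concrete with a small sketch. It assumes a classifier that assigns each page a score between 0 and 1 and blocks pages at or above a threshold; the scores, labels, and the `error_rates` helper below are hypothetical illustrations, not data or code from the workshop.

```python
from typing import List, Tuple


def error_rates(
    scores: List[float], labels: List[bool], threshold: float
) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate) for one threshold.

    A page is blocked when its score is >= threshold.
    labels[i] is True when page i really is objectionable.
    """
    # Good pages that get blocked (over-blocking).
    false_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    # Objectionable pages that get through (under-blocking).
    false_neg = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    good_pages = sum(1 for y in labels if not y)
    bad_pages = sum(1 for y in labels if y)
    return false_pos / good_pages, false_neg / bad_pages


# Made-up toy scores: four good pages followed by four objectionable ones.
scores = [0.05, 0.20, 0.35, 0.60, 0.40, 0.70, 0.85, 0.95]
labels = [False, False, False, False, True, True, True, True]

for threshold in (0.3, 0.5, 0.65):
    fpr, fnr = error_rates(scores, labels, threshold)
    print(f"threshold={threshold:.2f}  over-blocking={fpr:.2f}  under-blocking={fnr:.2f}")
```

On this toy data, raising the threshold until no good page is blocked (0.65) leaves a quarter of the objectionable pages unblocked, while lowering it to catch every objectionable page (0.3) blocks half of the good ones. Where a filter sits on that curve is the starting point the speaker refers to.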
Copyright © National Academy of Sciences. All rights reserved.
"2 Text Categorization and Analysis." Technical, Business, and Legal Dimensions of Protecting Children from Pornography on the Internet: Proceedings of a Workshop. Washington, DC: The National Academies Press.