leg. If I put enough of them together, then there is a person in the picture. If there is a person and there is skin, then they have no clothes on, and there is a problem. We could reason about the arrangement of skin, or we could simply say that any big blob of skin must be a naked person. We did a classification based on kinematics.
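The simpler of the two heuristics mentioned above — flag any image whose largest connected region of skin is big — can be sketched as follows. This is a toy illustration, not the actual program: the boolean skin mask, the tiny grid, and the 30 percent area threshold are all invented for the example.

```python
# Toy sketch of the "big blob of skin" heuristic: flag an image if its
# largest connected skin region covers more than some fraction of it.
# The mask, grid size, and threshold are illustrative assumptions.
from collections import deque

def largest_skin_blob(mask):
    """Return the pixel count of the largest 4-connected True region."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # Breadth-first flood fill over this blob.
                size, queue = 0, deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    size += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                best = max(best, size)
    return best

def looks_rude(mask, threshold=0.30):
    """Flag the image if the biggest blob exceeds the area threshold."""
    total = len(mask) * len(mask[0])
    return largest_skin_blob(mask) / total > threshold

# A 4x4 "skin mask": one 2x2 blob plus one isolated skin pixel.
mask = [
    [True,  True,  False, False],
    [True,  True,  False, False],
    [False, False, False, True ],
    [False, False, False, False],
]
print(looks_rude(mask))                 # 4/16 = 25% of the image is one blob
print(looks_rude(mask, threshold=0.2))
```

Note that this heuristic ignores the arrangement of the skin entirely, which is exactly why it confuses, say, a close-up of an apple pie with a person.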
Performance assessment is complicated. There are two numbers to consider: first, the probability that the program will say a picture is rude when it is not (a false positive) and, second, the probability that the program will say a picture is not rude when it is (a false negative). Although it is desirable to make both numbers as small as possible, the appropriate trade-off between false positives and false negatives depends on the application, as described below. Moreover, false positive and false negative rates can be measured in different ways. Doing the experiments can be embarrassing, because a large number of pictures must be collected and viewed, and other practical difficulties make them tricky as well. The experiments are also hard to compare, because each group uses a different set of data, and people usually report the experiments that display their work in a good light. In view of these phenomena, it is not easy to say what would happen if we dropped one of these programs on the Web.
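The two error rates can be made concrete with a little arithmetic. In this sketch the test-set counts are invented for illustration; real evaluations would use whatever (incomparable) data sets each group happened to collect.

```python
# Toy computation of the two error rates discussed above.
# All counts here are invented for the example.

def error_rates(false_pos, true_neg, false_neg, true_pos):
    """False positive rate: fraction of innocent pictures flagged as rude.
    False negative rate: fraction of rude pictures passed as innocent."""
    fpr = false_pos / (false_pos + true_neg)
    fnr = false_neg / (false_neg + true_pos)
    return fpr, fnr

# Suppose a test set of 1,000 innocent pictures and 200 rude ones:
fpr, fnr = error_rates(false_pos=50, true_neg=950,
                       false_neg=10, true_pos=190)
print(f"false positive rate = {fpr:.1%}")  # 50 of 1,000 innocent flagged
print(f"false negative rate = {fnr:.1%}")  # 10 of 200 rude missed
```

Even this trivial example shows why the two rates must be reported separately: shifting the program's decision threshold drives one rate down only by driving the other up.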
One way to reduce viewing of pornographic images is intimidation. A manager or parent might say to employees or children that Internet traffic will be monitored. They might explain that the image categorization program will store every image it is worried about in a folder and, once a week, the folder will be opened and the contents displayed. If the images are problematic, the manager or parent will have a conversation with the employee or child. This approach might work, because when people are warned about monitoring, they may not behave in a silly way.
But it will work only if there is a low probability of false positives. No one will pay attention to monitoring if each week 1,500 “pornographic” pictures are discovered in the folder, all of them pictures of apple pies that the program has misinterpreted. The security industry usually says that people faced with many false positives get bored and do not want to deal with the problem.1 On the other hand, a high rate of false negatives is not a concern in this context. Typically, in a monitoring application, letting an occasional rude picture slip through undetected does little harm.
1. Milo Medin noted that the Internal Revenue Service (IRS) uses the intimidation approach. In the tax context, many false positives may not be a problem: certain behaviors cause the IRS to expend a great deal of energy in response, and if the consequences of an investigation are severe enough, the IRS needs to conduct only a few of them to induce the behavior it wants.