An Azure service that enables users to identify content that is potentially offensive, risky, or otherwise undesirable. Previously known as Azure Content Moderator.
Quick answer: the model's score didn't cross the severity threshold, so the image was marked Safe. That's a known limitation; tune the thresholds and add extra validation layers.
Yes, this is real and it happens; it's not you doing something wrong. It's a mix of a model limitation, a policy gap, and an outdated sample mismatch.
The Content Safety image model is a probabilistic classifier, not a deterministic one, so edge cases like animal-on-human violence can slip through as Safe depending on training coverage and thresholds, especially if the model is weighted toward human-to-human violence signals. The repo is also outdated, so its examples may not line up with current model versions and taxonomy.
The API and the Studio hit the same backend, so the same result is expected. Technically, what's going on is that the score for each category (Violence, SelfHarm, etc.) stays below the threshold, so the classification is Safe: not because the model sees nothing, but because confidence < cutoff. You can inspect the raw severity scores via the API (not just the label), and you'll see low but non-zero values; that's key for tuning. As for where to report it: open a GitHub issue on the repo for sample problems, and use Azure feedback/support for model false negatives. The real-world fix is to never rely on single-pass moderation.
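To illustrate working with raw scores instead of the label, here's a minimal sketch. The hardcoded `scores` dict is a mocked stand-in for the per-category severities the Analyze Image API returns (image severities come back as 0/2/4/6); with the real SDK you'd read `item.category` / `item.severity` from `response.categories_analysis` after calling `analyze_image()`. The cutoff values are assumptions you'd tune for your domain:

```python
# Sketch: decide based on raw per-category severity scores rather than the
# service's Safe label. The `scores` dict mocks an API response.

def flag_categories(scores, cutoff=2):
    """Return the categories whose severity reaches the (custom) cutoff."""
    return [cat for cat, sev in scores.items() if sev >= cutoff]

# Mocked response: a low but non-zero Violence score, the kind of value a
# default threshold would pass through as Safe.
scores = {"Hate": 0, "SelfHarm": 0, "Sexual": 0, "Violence": 2}

print(flag_categories(scores, cutoff=4))  # default-ish cutoff: nothing flagged
print(flag_categories(scores, cutoff=2))  # lowered cutoff: catches Violence
```

The point is that the non-zero Violence score is visible in the response even when the label says Safe, so your own cutoff can catch what the default one misses.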
Use a multi-layer approach: adjust thresholds (lower the cutoff), add custom classifiers or CV models for domain-specific violence, run secondary validation (an ensemble), and add human-in-the-loop review for borderline cases. Also log the scores, not just the labels, and build your own decision logic on top.
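The multi-layer idea can be sketched like this, under a few assumptions: `primary_scores` is the per-category severity dict from Content Safety, `secondary_flag` stands in for a hypothetical domain-specific classifier's verdict, and the `cutoff`/`review_band` values are illustrative, not recommended settings:

```python
# Sketch of a multi-layer moderation decision: primary scores + a stubbed
# secondary check, a human-review band for borderline signals, and logging
# of raw scores (not just the final label).
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("moderation")

def moderate(primary_scores, secondary_flag, cutoff=4, review_band=2):
    worst = max(primary_scores.values())
    if worst >= cutoff or secondary_flag:
        decision = "block"
    elif worst >= review_band:
        decision = "human_review"  # non-zero signal below the block cutoff
    else:
        decision = "allow"
    # Log raw scores alongside the decision so thresholds can be tuned later.
    log.info("decision=%s scores=%s secondary=%s",
             decision, primary_scores, secondary_flag)
    return decision

# A Violence severity of 2 that a default threshold would label Safe lands
# in the human-review band here; a secondary-classifier hit blocks outright.
print(moderate({"Violence": 2, "Hate": 0}, secondary_flag=False))  # human_review
print(moderate({"Violence": 2, "Hate": 0}, secondary_flag=True))   # block
```

Keeping the score log is what makes tuning possible later: you can replay logged scores against candidate thresholds offline instead of guessing.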
Regards,
Alex