Lemmy realtime CSAM detection tool updated in accuracy

db0@lemmy.dbzer0.com · edit-2 3 days ago

Lemmy realtime CSAM detection tool updated in accuracy

hok@lemmy.dbzer0.com · 3 days ago

Curious, how do you evaluate the performance without breaking the law?

Batman@lemmy.world · 3 days ago

Looking at this it looks like the author just… has csam to evaluate with: https://github.com/Haidra-Org/horde-safety/blob/main/tests/test_csam_checker.py

Guess we don’t know the laws where they live though. Or where they run the program.

chicken@lemmy.dbzer0.com · 2 days ago

It seems like it could be legally problematic, but I’m not sure what the alternative would be other than accepting the privacy/autonomy nightmare of funneling all traffic through a government affiliated centralized service.

Railcar8095@lemm.ee · 2 days ago

I didn’t delve very deep, but it seems it uses a pre trained model that classifies images with anime tags (loli, lewd…) and gives some weights to those. I guess at some point the author will review the results of the real images on Lemmy and use them to tweak those weights, which I understand might be way at least temporary illegal (they could destroy the image and keep only the transformed version, which is “impossible” to turn back into the original image, do it still works as training data)

DeepDanbooru is the model of you’re interested.

db0@lemmy.dbzer0.com · edit-2 2 days ago

Deepdanbooru is one of the two models. We also use open ai clip

Railcar8095@lemm.ee · 2 days ago

Well, if that was my only mistake then your code was surprisingly easy to follow (rule number one of github: never read the readme.md)

How do you deal with tweaking? Do you get random samples from Lemmy? Anything you can share about the legal aspect.of it?

db0@lemmy.dbzer0.com · 2 days ago

horde-safety is there primarily to protect the AI Horde. We have no shortage of creeps and pedos trying to use our crowdsource resources.