commits
Add Content Review Filters by Meta
It came up in our HMA office hours chat today, and I realized we hadn't added them to this list!
Signed-off-by: Cassidy James Blaede <cassidyjames@roost.tools>
Add Google Content Safety API (industry service for CSAM detection)
Since Coop is released as open source and includes Content Safety API integration, this makes more sense to add now, so long as we mention that rationale.
Signed-off-by: Cassidy James Blaede <cassidyjames@roost.tools>
Updated project descriptions and added new datasets related to AI safety and red teaming based on what's available in Microsoft's PyRIT tool.
Signed-off-by: Roman Lutz <romanlutz13@gmail.com>
Rename “AI-powered Guardrails” to “AI for Safety”
Follow-up to #34
Signed-off-by: Cassidy James Blaede <cassidyjames@roost.tools>
Add Risk Atlas Nexus
Signed-off-by: elizabethmdaly <85226478+elizabethmdaly@users.noreply.github.com>
PR template: update formatting and add preference for source code
Signed-off-by: Manish Nagireddy <65432909+mnagired@users.noreply.github.com>
Signed-off-by: Manish Nagireddy <65432909+mnagired@users.noreply.github.com>
Fixes #41 and changes it from one sentence to a list to be a bit more readable. GitHub might have issues generating a preview of the changes because it's all in an HTML comment. 🤷🏻
Signed-off-by: Cassidy James Blaede <cassidyjames@roost.tools>
Update link for openguardrail to GitHub link
Update link for PyRIT to GitHub link
Signed-off-by: Dennis Rall <56480601+dennis-rall@users.noreply.github.com>
Fixes #35
Signed-off-by: Cassidy James Blaede <cassidyjames@roost.tools>
Fixes #32
Signed-off-by: Cassidy James Blaede <cassidyjames@roost.tools>
Maybe a bit limited in how widely useful it is, but it's neat to see others building on HMA, and they're part of the T&S ecosystem.
Signed-off-by: Cassidy James Blaede <cassidyjames@roost.tools>
Removed an empty line before the Privacy Protection section.
Signed-off-by: Dennis Rall <56480601+dennis-rall@users.noreply.github.com>
Signed-off-by: Dennis Rall <56480601+dennis-rall@users.noreply.github.com>
Signed-off-by: Dennis Rall <56480601+dennis-rall@users.noreply.github.com>
This is the link Google uses in public places like blog posts. I also added a note that the service requires registration.
Signed-off-by: Cassidy James Blaede <cassidyjames@roost.tools>
Signed-off-by: Cassidy James Blaede <cassidyjames@roost.tools>
Signed-off-by: Cassidy James Blaede <cassidyjames@roost.tools>
feat: add Promptfoo to AI Red Teaming Tools
Add gpt-oss-safeguard by OpenAI
https://github.com/openai/gpt-oss-safeguard
gpt-oss-safeguard is a set of open-weight safety reasoning models built upon gpt-oss. With these models, you can classify text content based on safety policies that you provide and perform a suite of foundational safety tasks.
gpt-oss-safeguard was released by OpenAI in partnership with ROOST and Hugging Face after months of work, including evaluation and testing from ROOST and Discord.
Signed-off-by: Cassidy James Blaede <cassidyjames@roost.tools>
Added Promptfoo to the list of AI evaluation tools.
Signed-off-by: Dennis Rall <56480601+dennis-rall@users.noreply.github.com>
Added CoPE by Zentropi for content classification.
Signed-off-by: samidh <samidh@users.noreply.github.com>
Added a new entry for Kanana Safeguard under AI safety tools.
Signed-off-by: Youshin Kim <theodore.k@kakaocorp.com>
Updates the license from CC BY 1.0 to the more recent CC BY 4.0, which Creative Commons recommends.
https://creativecommons.org/licenses/by/1.0/deed.en
Signed-off-by: Anne B <16597355+annebdh@users.noreply.github.com>