feat: add Promptfoo to AI Red Teaming Tools
Add gpt-oss-safeguard by OpenAI
https://github.com/openai/gpt-oss-safeguard
gpt-oss-safeguard is a set of open-weight safety reasoning models built upon gpt-oss. With these models, you can classify text content based on safety policies that you provide and perform a suite of foundational safety tasks.
gpt-oss-safeguard was released by OpenAI in partnership with ROOST and Hugging Face after months of work, including evaluation and testing from ROOST and Discord.
Signed-off-by: Cassidy James Blaede <cassidyjames@roost.tools>
Added Promptfoo to the list of AI evaluation tools.
Signed-off-by: Dennis Rall <56480601+dennis-rall@users.noreply.github.com>
Added CoPE by Zentropi for content classification.
Signed-off-by: samidh <samidh@users.noreply.github.com>
Added a new entry for Kanana Safeguard under AI safety tools.
Signed-off-by: Youshin Kim <theodore.k@kakaocorp.com>
Updates the license from CC BY 1.0 to the more recent CC BY 4.0, as recommended by Creative Commons
https://creativecommons.org/licenses/by/1.0/deed.en
Signed-off-by: Anne B <16597355+annebdh@users.noreply.github.com>
Signed-off-by: juansmrad <122411379+juansmrad@users.noreply.github.com>
Aymara lets AI developers automate the creation, administration, and analysis of generative AI evals with a focus on safety, accuracy, and jailbreaking.
Added DFR's interference site and CIB mangotree
Adding several open source resources and datasets from Tattle
The page linked to for FediCheck is gone following IFTAS's reduction in activities, but Jaz did open-source the code to FediCheck.
Add Toxic Prompt RoBERTa, ToxicChat, JailbreakHub
Added a Fediverse spam filter and created new fediverse category
Added Bumble Private Detector
Adding Privacy Section and Fawkes Facial Cloaking Tool
Thanks to @marielpovolny, @dsiegel17, @jenrweed
Added open-source resources and links for trust and safety / online safety / risk tools