[QA] FERRET: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique
FERRET enhances adversarial prompt generation for large language models, improving attack success rates and efficiency over RAINBOW TEAMING while ensuring effective prompts across various model sizes.
https://arxiv.org/abs/2408.10701
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
--- Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/support
1480 episodes