OpenAI launches a content moderation system built on GPT-4

With this new release, the company aims to filter "toxic and harmful" material from the Internet

OpenAI has unveiled a content moderation system based on its GPT-4 technology, which aims to moderate online traffic and filter “toxic and harmful” material from the Internet in order to “lighten the mental load” on the human moderators performing this task.

The company that created ChatGPT highlighted the need to moderate content on digital platforms, describing it as “crucial in maintaining the health” of those media.

In this regard, it noted the “meticulous effort, sensitivity and deep understanding of the context” that moderating online content requires. It also pointed out the need for “rapid adaptation” to new use cases in this area.

OpenAI also remarked that, due to its complexity, content moderation is a “slow and challenging” process for the people dedicated to reviewing content and filtering out harmful or inappropriate material.

Against this backdrop, the company has presented a content moderation system that uses its own GPT-4 technology to filter online content and detect “toxic and harmful” material on digital platforms.

As OpenAI detailed in a post on its blog, the system uses the company’s most powerful AI technology to help moderate online traffic in accordance with the specific policies of the platforms where it is implemented. In fact, any user with access to the OpenAI API can implement this system and create their own AI-assisted moderation process.
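
As a rough illustration, here is a minimal sketch of what such an AI-assisted moderation call might look like with the official openai Python package. The policy wording, the label set, and the one-word reply format are illustrative assumptions, not OpenAI’s published implementation.

```python
# A minimal sketch of an AI-assisted moderation call, assuming the
# official openai Python package (v1+). The policy text and label
# set are illustrative assumptions, not OpenAI's published policy.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

POLICY = (
    "You are a content moderator. Label the user's message as "
    "'allowed' or 'violating'. A message is 'violating' if it "
    "contains harassment, hate speech, or instructions for causing "
    "physical harm. Reply with exactly one word."
)

def moderate(text: str, policy: str = POLICY) -> str:
    """Ask GPT-4 to label a piece of content against the policy."""
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,  # deterministic output for consistent labeling
        messages=[
            {"role": "system", "content": policy},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content.strip().lower()

print(moderate("Have a wonderful day!"))  # expected: allowed
```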

The system is designed to “relieve the mental load of a large number of human moderators”, who can rely on GPT-4 to filter content. In addition, as the company explained, this technology allows “more consistent” labeling of online content, since LLMs (large language models) are more sensitive to wording differences and can adapt more quickly to policy updates to deliver a “consistent content experience.”

On top of all this, it offers a “faster feedback loop” for refining the moderation policies used.

To use the system, the desired moderation rules must first be provided to GPT-4. OpenAI then tests the moderation system’s behavior on a sample of problematic content, judged against these pre-established rules.
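
Concretely, this testing step could look something like the sketch below, which runs the moderate() helper from the earlier example over a small hand-labeled sample and reports how often the model agrees with the human labels. The sample texts and labels are hypothetical.

```python
# Sketch: test the policy on a small hand-labeled sample, reusing
# the moderate() helper from the sketch above. The sample texts
# and labels are hypothetical.
labeled_sample = [
    ("Have a wonderful day!", "allowed"),
    ("I will find you and hurt you.", "violating"),
    ("I strongly disagree with this article.", "allowed"),
]

disagreements = [
    (text, human_label)
    for text, human_label in labeled_sample
    if moderate(text) != human_label
]

agreement = 1 - len(disagreements) / len(labeled_sample)
print(f"Agreement with human labels: {agreement:.0%}")
```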

Human moderators then review the decisions made by the AI; where they find erroneous judgments, the AI’s decision can be corrected, training it to moderate more precisely. “We can repeat the steps until we are satisfied with the quality of the policy,” OpenAI explained. This procedure cuts the content policy development process “from months to hours.”
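
The repeat-until-satisfied loop OpenAI describes could be sketched as follows, continuing from the two sketches above. The quality bar, the round limit, and the interactive clarification step are all assumptions; in practice a human reviewer rewrites the ambiguous part of the policy by hand after inspecting each disagreement.

```python
# Sketch of the review-and-refine loop, continuing from the two
# sketches above. A human inspects each disagreement and appends a
# clarifying sentence to the policy; the loop repeats until the
# labels reach a (hypothetical) quality bar.
TARGET_AGREEMENT = 0.95  # hypothetical quality bar
MAX_ROUNDS = 5           # avoid looping forever

policy = POLICY  # start from the initial policy text
for round_number in range(MAX_ROUNDS):
    disagreements = [
        (text, human_label)
        for text, human_label in labeled_sample
        if moderate(text, policy) != human_label
    ]
    agreement = 1 - len(disagreements) / len(labeled_sample)
    print(f"Round {round_number}: {agreement:.0%} agreement")
    if agreement >= TARGET_AGREEMENT:
        break  # satisfied with the quality of the policy

    # Surface the erroneous judgments for human review.
    for text, human_label in disagreements:
        print(f"Review needed: {text!r} (human label: {human_label})")
    # A reviewer clarifies the policy wording by hand.
    policy += " " + input("Clarifying sentence to append to the policy: ")
```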

Despite all this, OpenAI pointed out that, for the moment, the system has some limitations. For example, it referenced potential “unwanted biases” that may have been introduced into the model during training.

“As we continue to refine and develop this method, we remain committed to transparency and will continue to share our learnings and progress with the community,” OpenAI said in this regard.

Source: dpa

(Reference image source: Jonathan Kemper, Unsplash)
