In popular culture (science fiction movies/books), Artificial Intelligence (AI)/Robots become self-aware and rebel against humanity and decide to destroy it. While this is one possible scenario, it is probably the least likely path to the appearance of dangerous AI. Various high-profile individuals, from physicist Stephen Hawking to Elon Musk, have warned of the danger.

This is why artificial intelligence safety is emerging as an essential discipline. Computer scientists have begun studying the unintended consequences of poorly designed AI systems, ones created with broken ethical frameworks or ones that do not share human values.

But there's a significant omission in this field, according to independent researchers Federico Pistono and Roman Yampolskiy of the University of Louisville in Kentucky. In a report by the MIT Technology Review, they remarked that, to their knowledge, nothing has been published on how to design a rogue machine. That's an important problem because computer security specialists must know the beast they are up against before they can hope to defeat it.

So the question is, how do we fight malevolent or rogue AI models? 

AI models are incapable of doing anything on their own. These malicious actions are the result of harmful data leveraged to train them. Therefore, developers should build more secure and trustworthy generative AI applications.

However, business leaders are trying to strike the right balance between innovation and risk management in the rapidly evolving landscape of generative AI. Prompt injection attacks have emerged as a significant challenge, where malicious actors try to manipulate an AI system into doing something outside its intended purpose, such as producing harmful content or exfiltrating confidential data. In addition to mitigating these security risks, organizations are concerned about quality and reliability. They want to ensure that their AI systems are not generating errors or adding information not substantiated in the application's data sources, which can erode user trust. 

Microsoft's AI tools

Generative AI can be a force multiplier for every department, company, and industry. To help customers meet these AI quality and safety challenges, Microsoft has announced five available tools for generative AI app developers coming soon to Azure AI Studio. Prompt Shields are tools used to detect and block prompt injection attacks, including a new model for identifying indirect prompt attacks before they impact your model. Groundedness detection detects hallucinations in model outputs, and safety system messages steer the model's behaviour toward safe and responsible outputs.

Finally, safety evaluations assess an application's vulnerability to jailbreak attacks and generate content risks, and risk and safety monitoring aid in understanding what model inputs, outputs, and end users are triggering content filters to inform mitigations.

Need for further research

Apart from the possibility of malevolent models, AI already poses some threats. Telangana police's Additional Director General of Police's Women Safety Wing, Shikha Goel, recently stated that cybercriminals are using deep fakes, especially to commit automated disinformation attacks, identity thefts, financial frauds, scams and hoaxes, celebrity pornography, and even election manipulation.

Such AI crimes can be stopped using tools similar to the ones proposed by Microsoft. Therefore, undoubtedly, there is a rise in the need for research in this space. It is essential to understand that the data used in the development of the tech plays a crucial role in the character of the model.

Sources of Article

MIT Technology Review

Azure AI Blog

Image: Unsplash

Want to publish your content?

Publish an article and share your insights to the world.

Get Published Icon
ALSO EXPLORE