What You Need To Know About OpenAI’s Sora
OpenAI, the US-based artificial intelligence research organization, recently introduced Sora, a text-to-video model that can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt.
According to OpenAI’s website, Sora is “becoming available to red teamers to assess critical areas for harms or risks. We are also granting access to a number of visual artists, designers, and filmmakers to gain feedback on how to advance the model to be most helpful for creative professionals.”
Capabilities of Sora
Sora can generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.
The model also has a deep understanding of language, enabling it to accurately interpret prompts and generate compelling characters that express vibrant emotions. Sora can also create multiple shots within a single generated video while keeping characters and visual style consistent.
However, the current model also has weaknesses. According to OpenAI, it may struggle to accurately simulate the physics of a complex scene, and may not understand specific instances of cause and effect.
The model may also confuse spatial details of a prompt, for example, mixing up left and right, and may struggle with precise descriptions of events that take place over time, like following a specific camera trajectory.
The announcement of Sora drew mixed reactions from pundits. Some experts have touted its potential to revolutionize the video industry by making it easier and cheaper to produce video clips. Others have raised concerns that AI-generated content could be used to wrongly influence elections or otherwise sow confusion worldwide.
OpenAI says it is taking several important safety steps ahead of making Sora available to users.
“We are working with red teamers — domain experts in areas like misinformation, hateful content, and bias — who will be adversarially testing the model,” it said.
It also disclosed that it is building tools to help detect misleading content, such as a detection classifier that can tell when a video was generated by Sora.