
OpenAI Introduces AI Agent Operator
OpenAI introduced Operator, a GPT-4o-based agent capable of performing tasks in a web browser. The agent works through a dedicated interface where users can watch the browser window and control the assistant’s actions.
Operator is powered by Computer-Using Agent (CUA), a model that combines GPT-4o’s vision capabilities with advanced reasoning trained through reinforcement learning. CUA achieved a 38.1% success rate on the OSWorld benchmark and 87% on WebVoyager, surpassing previous models.
The agent runs on a remote server over an encrypted connection. Users can take over control to solve a CAPTCHA or enter payment data. Operator supports custom instructions for storing user preferences. You can enter any request, even with photos attached, and the assistant will start browsing: you can delegate food ordering, table reservations, ticket purchases, taxi bookings, and more. Operator also shows a mini-screen with everything it does in real time.
OpenAI heavily emphasizes system security and resistance to attacks. The entire process is monitored by a separate model that can halt execution if something looks wrong. In addition, suspicious cases are sent for manual review.
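To make the control flow described here more concrete, below is a minimal Python sketch of an agent loop with a user handoff and a monitor that can stop execution. It is purely illustrative and not OpenAI's implementation: the names `Step`, `RunLog`, `run_agent`, and the `monitor` callback are all hypothetical, and a simple function stands in for the separate monitoring model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    """One planned browser action (hypothetical structure)."""
    action: str
    sensitive: bool = False  # e.g. CAPTCHA or payment data: the user takes over


@dataclass
class RunLog:
    executed: List[str] = field(default_factory=list)
    handed_to_user: List[str] = field(default_factory=list)
    stopped_at: Optional[str] = None  # action that made the monitor halt the run


def run_agent(steps: List[Step], monitor: Callable[[Step], bool]) -> RunLog:
    """Run steps in order; the monitor may veto any step and stop the run."""
    log = RunLog()
    for step in steps:
        # Assumption: the monitor returns False when a step looks suspicious.
        if not monitor(step):
            log.stopped_at = step.action
            break
        if step.sensitive:
            # The agent does not perform this step itself; the user does.
            log.handed_to_user.append(step.action)
            continue
        log.executed.append(step.action)
    return log


if __name__ == "__main__":
    plan = [
        Step("open_site"),
        Step("solve_captcha", sensitive=True),
        Step("add_to_cart"),
        Step("suspicious_redirect"),
        Step("checkout"),
    ]
    # Toy monitor: flags a single hard-coded action as suspicious.
    result = run_agent(plan, monitor=lambda s: s.action != "suspicious_redirect")
    print(result)
```

In this toy run, the CAPTCHA step is handed to the user, and the monitor stops the session before checkout is ever reached. A real system would presumably inspect screenshots and page content rather than action names.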
The service is available to Pro users in the US and will be added to the Plus subscription in a few weeks, with an API for developers to follow. Although Anthropic and Google showed similar demonstrations earlier, OpenAI was the first to launch a consumer product, even though the Pro subscription is unprofitable. Let’s hope that when Operator learns to make purchases independently, it won’t start ordering gifts for itself on its activation day.