The model underpinning Operator is a Computer-Using Agent (CUA) that combines GPT-4o's vision mode to "see" what's on the user's screen through screenshots with graphical user interfaces (GUIs) that ...
It can also ask follow-up questions to further personalize the tasks it completes, such as login information for other websites. Users can take control of the screen at any time.