- by SAN FRANCISCO
- 01 20, 2025
THE 12 DAYS of Christmas are meant to start on December 25th. But not in the world of artificial intelligence (AI). On December 5th OpenAI, maker of ChatGPT, began a blizzard of product shipments dubbed, gratingly, the “12 days of shipmas”. It has included a full roll-out of Sora, its video-generation tool, as well as Canvas, a writing and coding product.

Not to be outdone, Google has also put the elves to work early. On December 11th it unveiled a new model called Gemini 2.0. And it launched souped-up prototypes of two products based on the model, called Astra and Mariner. These can take actions on a user’s behalf, making them what industry types call “agentic AI”.

The prominence of products over models in both sets of announcements was noteworthy. While the boffins who work on large language models are striving to get to the next frontier of intelligence, developers are under pressure to release clever products that prove there is a market for all this ingenuity.

Developing generative-AI products is fraught with difficulty. Product developers typically work backwards from what the consumer needs. Generative AI, however, is evolving so quickly that the technology is defining the product. “You are normally taught not to be a hammer looking for a nail,” says Kevin Weil, OpenAI’s chief product officer. But “every two months computers can do something that we have never before been able to do.”

These latest product launches, though, have been marred by glitches. OpenAI had to suspend access to Sora shortly after it was released to ChatGPT subscribers because it underestimated demand, according to its boss, Sam Altman. Those who obtained access, even if impressed by what they found, noticed that problems from an earlier demo remained. One of the most glaring was the trouble the tool had portraying complex movements realistically.
Marques Brownlee, a tech reviewer, noted that Sora was almost guaranteed to “mess up” anything walking with four legs, and that objects randomly disappeared.

Google’s agents were not fully polished, either. Astra, which is currently available to only a small group of “trusted testers”, can explain in several languages what it sees through a phone’s camera, and has access to Google sites such as Search and Maps. In a demo that involved taking a video of famous paintings, it spoke knowledgeably about them. Yet it was flummoxed when asked to name the city where most of the originals were on display. Mariner, Google’s other new prototype, can complete tasks on a browser, such as filling a shopping basket in an online supermarket. But it cannot complete the checkout itself.

Silicon Valley has great expectations for agentic AI in particular. The use of agents to advance from “chatting to doing” could be one of the big tech breakthroughs of 2025, says Alex Wang of Scale AI, an AI data company. Already that hope has bolstered the share prices of software giants like Salesforce. It said this month that it had struck deals with more than 200 customers for Agentforce, its workplace agent, within a week of releasing it in October. Microsoft, its bigger rival, has released a variety of agents.

Several factors, however, make it harder to create agents than chatbots. One is data. Unlike chatbots, which scrape information from the web to answer questions, agents require data on the way tasks are performed, including the sequencing of actions and the reasoning behind them. For routine activities, such as processing a customer order, that may be straightforward. In many cases, though, it will be difficult to find sufficient data to train the tool.

A second problem is trust. Checking whether a chatbot has given a right or wrong answer is usually easy. Determining whether an agent has booked the best restaurant or holiday it could within your budget may be more difficult.
Google deliberately prevents Mariner from spending money in case it garbles the decision. Users may also balk at providing agents with sensitive information about, say, their purchase history, which may be required for the tools to function properly.

A final problem is cost. In order to reason, plan and solve problems on behalf of users, agents need access to models that can handle complex tasks. They also require low latency and the ability to interact with other tools such as a web browser, as well as plenty of memory to provide a service tailored to the user. All that is tricky and expensive to build, and requires lots of computing power to operate.

Already cost pressures are starting to mount. On December 5th OpenAI launched a “pro” version of ChatGPT, with unlimited access to all its latest bells and whistles, at $200 a month, ten times the price of its basic subscription. Alphabet, owner of Google, is as rich as Croesus and could be more generous. Still, if agentic AI lives up to the hype, users may find it worth the extra cost.