Every month, we produce the Guide to AI, an editorialized roundup covering geopolitics, hardware, start-ups, research, and fundraising. But so much happens in the AI world that weeks can feel like years. So, on off-weeks for Guide to AI, we’ll be bringing you three things that grabbed our attention from the past few days…
Subscribe to Air Street Press so you don’t miss any of our writing - like our 2024 year in review.
1. Market commentators don’t always agree that AI exists
If you strip out US big tech, then over the last couple of years (since October 2022), the S&P 500 has in fact not outperformed its European counterparts. We’ve seen many versions of this argument articulated routinely, most recently in the Financial Times’ Alphaville section. The FT included this handy chart:
We’re confused by this argument.
First, “October 2022” feels like a pretty arbitrary cut-off to measure market performance, considering that the S&P 500 and MSCI EMU indices have diverged significantly since 2009. One could argue that US growth has been high-growth and tech-led, while European stocks are more mature and tend to pay out dividends. But if you compare accumulation ETFs to get a sense of total shareholder return over the past decade, the trend goes back much further than 2022…
Second, even if the divergence were entirely the result of tech, why would it matter? The US tech sector is made up of highly profitable, dynamic businesses that create products used by billions of people around the world. You may think they are overspending in the short term on AI infrastructure, but these companies can afford to. Microsoft and Alphabet aren’t the manufacturers of VC-backed juicemakers. Talk of AI bubbles often suggests people are working from the wrong mental model.
2. The humanoids are coming
Taking the stage at CES in Las Vegas, a chic, leather-jacket-clad Jensen Huang, flanked by humanoids on either side, presented a menacing image to the world.
Rather than announcing the robot takeover, NVIDIA was in fact releasing a family of generative world foundation models called Cosmos. Trained on tens of millions of hours of humans moving and manipulating objects, the Cosmos family is designed to generate videos and world states for physical AI development that demonstrate an accurate grasp of physics.
As an anecdote about the importance of timing in tech adoption: back in 2018, TwentyBN, a deep-learning start-up where I worked, pushed a research agenda of end-to-end learning on video streams of human actions to learn visual common sense. We had demonstrated that this was a tractable path, but back then the commercial use cases weren’t clear, nor were scalable transformer architectures available…
Current users include humanoid start-ups such as Agility and Figure AI, as well as companies working on self-driving, like Wayve.
What should we make of all this?
Firstly, betting against NVIDIA is generally unwise. Unless some mysterious, unspecified new AI paradigm comes along, every time a potential use case gains promise, the company will almost always be the natural partner of choice for entrepreneurs. Who’s going to say no to GPUs and a ready-made software stack? Best case scenario: they plant themselves at the center of a new use case. Worst case: they make a model that lots of people use. Either way, the house never loses.
Secondly, we’re starting to see signs of an NVIDIA playbook emerging. The company was quick to move on self-driving almost a decade ago now, partnering with all the major players while working on its own autonomy stack in the background. It seems highly likely that with Cosmos, and GR00T, its general-purpose foundation model for robotics, the company is attempting the same move with physical AI. This is the side of NVIDIA’s business that the focus on the (obviously important) CUDA advantage sometimes overlooks.
3. An implicit reward for reaching the end
Work inspired by OpenAI’s o1 reasoning model is understandably popular at the moment. But traditional approaches to improving reasoning in LLMs have often depended on large volumes of annotated data. PRIME (Process Reinforcement through Implicit Rewards) introduces an implicit process reward model (PRM), which learns from overall outcomes (is this solution correct?) rather than needing detailed step-by-step feedback.
This model generates dense feedback for every step in a reasoning process, solving a common problem in reinforcement learning (RL) where rewards are too sparse to guide training effectively. Importantly, the PRM can start from an existing language model and improve itself dynamically during training, reducing the complexity and cost of creating a specialized reward system.
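To make the idea concrete, here is a minimal sketch (toy numbers, not the paper’s implementation) of how an implicit PRM can derive dense, step-level rewards from log-probability ratios between the reward model and a frozen reference model. The coefficient `BETA` and all log-probability values below are illustrative assumptions.

```python
# Toy illustration of implicit process rewards: per-token reward is
# beta * log(pi(token) / pi_ref(token)), and summing tokens within a
# reasoning step yields a dense per-step reward, even though training
# only supervises the outcome (correct / incorrect) at the end.
BETA = 0.05  # reward scaling coefficient (assumed value)

# Hypothetical token log-probs for a 3-step reasoning trace,
# grouped by step, from the implicit PRM and a reference model.
policy_logps = [[-0.9, -1.1], [-0.5], [-2.0, -0.3, -0.7]]
ref_logps    = [[-1.0, -1.0], [-0.8], [-1.5, -0.4, -0.9]]

def step_rewards(pi_steps, ref_steps, beta=BETA):
    """Dense per-step rewards from token-level log-prob ratios."""
    rewards = []
    for pi_tok, ref_tok in zip(pi_steps, ref_steps):
        # log(pi/pi_ref) for each token, summed over the step
        r = beta * sum(p - q for p, q in zip(pi_tok, ref_tok))
        rewards.append(r)
    return rewards

rewards = step_rewards(policy_logps, ref_logps)
# The trajectory-level implicit reward is the sum of step rewards,
# which is what the outcome label supervises during training.
total = sum(rewards)
```

Because the per-step rewards fall out of the model’s own log-probabilities, no human ever has to label individual reasoning steps; the outcome signal at the end of the trace is enough to shape dense feedback.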
This allowed the authors’ model, trained from Qwen2.5-Math-7B-Base, to surpass GPT-4o and Qwen2.5-Math-7B-Instruct - using only 1/10 of the data.
If you want to learn more about Implicit PRM and some of the theory behind it, you should check out the original paper.
See you next week!