Hi everyone!
Welcome to the latest issue of your guide to AI, an editorialized newsletter covering the key developments in AI policy, research, industry, and start-ups over the last month. First up, a few updates:
Congratulations to our friends at Fern Labs, who announced their $3M Pre-Seed round from us at Air Street. Sign up to their waitlist to build that software tool you always wanted :-)
Our events series is in full swing. I’ll be in Zurich this week after events in Berlin and Munich over the last two. Subscribe to our events page to ensure you don’t miss out.
Our annual RAAIS conference is back on 13 June 2025, check it out and register your interest. Announced speakers include Max Jaderberg of Isomorphic Labs, Eiso Kant of Poolside, and Mati Staniszewski of ElevenLabs.
Remember to subscribe to Air Street Press to get all of our news, essays on AI research, startup playbooks, geopolitics, defense, biotech and more directly in your inbox.
If you’re a founder, working on an AI product or doing research in AI and enjoy writing analytical and opinionated pieces on any of the themes you read about here, drop me a line. I’ll be featuring guest essays on Air Street Press.
We love hearing what you’re up to and what’s on your mind, just hit reply or forward to your friends :-)
🌎 The (geo)politics of AI
Europe is living through what is hopefully a profound vibe shift: a renewed focus on doing what it takes to become competitive on the world stage and master its fate (to the extent that it can). The European Commission’s plan ambitiously promises a “bolder, simpler, faster Union.” It features a laundry list of initiatives—from decarbonisation to digitalisation, defense to democracy—and reads like a manifesto for solving every conceivable problem, yet offers little clarity on how these lofty goals will be achieved without drowning in red tape. What caught everyone’s attention is the commitment to cut red tape by 25% (how?), streamline sustainability reporting (is this really what’s holding Europe back?), and efforts (which are so far not super clear) to advance AI and defense. This ambitious agenda faces significant hurdles, such as geopolitical instability, illegal migration, and attacks on core European principles. While the Commission aims to deliver faster and more effectively, its success will depend on strong cooperation among EU institutions and member states in an increasingly complex and polarized world…
Paris recently played host to the AI Action Summit, the latest in a series of conferences hosted by nation states. In contrast to the UK’s first summit, which focused on AI safety, this one was reportedly more of a trade show for AI opportunities. Of note, U.S. Vice President JD Vance delivered a punchy speech (which I loved) to emphasize the importance of embracing AI's potential rather than fixating on its risks. Vance argued that an overly cautious approach could hinder progress and that the "AI future" won't be won by "hand-wringing about safety." Now, you can only imagine the reaction from Team Safety…, which was largely sidelined, both in his speech and at the conference itself.
Meanwhile, Beijing is backing the rapid proliferation of DeepSeek’s models across its hospitals, local governments and heavy industries. Indeed, the FT reported that “All the major cloud service providers, at least six car manufacturers, several local governments, a number of hospitals and a handful of state-owned enterprises have moved to deploy DeepSeek, with the shift among traditionally conservative institutions particularly striking.” In one example, a doctor at a public hospital in Hubei province in central China said “the institution’s leadership had issued a directive that DeepSeek should be used as a third-party arbiter if two doctors have differing views on a patient’s treatment.” Contrasting this approach to technology diffusion with Europe’s approach of drafting press releases about competitiveness and high-level commitments (read: often mumbo jumbo) is quite striking.
And the stakes couldn’t be higher for Europe as the US pulls further away with each passing week. It is now existential for Europe to get serious about enabling rather than hindering the creation of home-grown winners in critical sectors such as AI, defense, and energy. The continent has always had what it takes. But we need policies that accelerate public procurement with significant budgets for new players. We must modernize decaying infrastructure, build and buy anew. We must attract and retain skilled immigrants. We must unshackle the formation of new companies built on our university inventions.
On defense alone, there is an urgent need to build up European home-grown winners to plug the enormous gaps if the US withdraws further. Meanwhile, the US is supporting its own proto-winners at an increasingly large scale. For example, Anduril has swooped in to salvage the U.S. Army’s $22 billion Integrated Visual Augmentation System (IVAS) program, a project Microsoft fumbled over the years. What began as a futuristic vision of augmented reality headsets for soldiers devolved into a comedy of errors—headaches, nausea, and tech that couldn’t handle bad weather. Microsoft, after years of public flogging and $1.5 billion down the drain, quietly exited stage left, leaving Anduril to play the main character.
🍪 Hardware
Despite significant DeepSeek-driven market jitters, caused by people misreading headlines rather than reading the actual research paper, big tech and sovereign nation AI infrastructure investments continue seemingly unabated. Over in Saudi Arabia, US-based NVIDIA challenger Groq announced a $1.5B deal to expand its inference-focused infrastructure across the country.
Meanwhile, Microsoft, Alphabet, Amazon and Meta combined have invested capex worth $246 billion in 2024, up from $151 billion in 2023. Indeed, Amazon now plans $100 billion+ investments in 2025 and Meta too is in talks for a $200 billion data center investment this year. A few weeks after Satya of Microsoft reaffirmed his commitment to spend $80B to build out Azure AI infrastructure in 2025, rumors started circulating that the company cancelled leases worth a few hundred megawatts of data center capacity, citing facility/power delays.
Over in China, Alibaba, the Chinese e-commerce giant, plans to invest a staggering $53 billion in AI infrastructure over the next three years. This marks a major pivot for the company as it aims to become a leader in artificial intelligence. Alibaba envisions partnering with companies to develop and apply AI to real-world problems, providing the necessary computing power as models evolve.
Tesla is making a bold move to challenge ride-hailing giants like Waymo and Uber by seeking approval to operate its own fleet of vehicles in California. The company has applied for a transportation charter-party carrier permit. This comes as Tesla faces declining auto sales and shrinking profit margins, with Elon seemingly far too distracted by DOGE (which is in itself a great initiative) to focus on his day job.
And as the robotics craze continues, Meta is the latest to announce its entry into humanoid robotics, forming a new team within its Reality Labs division.
🏭 Big tech start-ups
The last few weeks have seen a flurry of model drops from large labs. Over at OpenAI, Sam pre-empted the news with his latest blog, “Three Observations”, in which he paints an increasingly clear path towards AGI. The company released a paper, Competitive Programming with Large Reasoning Models, in which they reported another episode of The Bitter Lesson. They built a new o1-ioi system, which used hand-engineered inference strategies, and achieved a gold medal at the 2024 International Olympiad in Informatics under relaxed constraints. However, when OpenAI scaled up their o3 model solely using RL, the system surpassed those results without relying on domain-specific techniques. Impressively, o3 achieved a gold medal at IOI and a Codeforces rating in the 99.8th percentile. This work showed emergent reasoning strategies, such as self-validation through brute-force solutions, without human intervention.
Then came the research preview of GPT 4.5, which OpenAI touted as an example of how improved capabilities can still come from scaling unsupervised learning during pre-training. The system has better EQ, follows instructions better, has a seemingly better world understanding, and hallucinates less. The company further emphasized that future models will combine this approach with reasoning capabilities to create even more powerful AI systems.
Fresh off of xAI’s 100k H100 cluster came Grok 3 Beta, their newest system designed to excel in reasoning, mathematics, coding, and instruction-following tasks. Grok 3 leverages large-scale reinforcement learning to refine its problem-solving strategies, enabling it to think for seconds to minutes, backtrack, and self-correct. Its Elo score of 1402 in the Chatbot Arena and strong performance on benchmarks like AIME’25 (93.3%) and GPQA (84.6%) highlight its dominance in both academic and real-world tasks. Similar to OpenAI, the model introduces "Think" mode, allowing users to inspect its reasoning process, and features a smaller, cost-efficient variant, Grok 3 mini.
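For context on what an Arena Elo of 1402 implies: Elo ratings map directly to expected head-to-head win probabilities via the standard Elo formula. A quick illustrative sketch (not xAI's or LMSYS's evaluation code; the 1360 comparison rating is made up):

```python
def elo_win_prob(rating_a: float, rating_b: float) -> float:
    """Expected probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

# A 42-point gap (e.g. 1402 vs. a hypothetical 1360) translates to
# roughly a 56% expected win rate in pairwise comparisons.
p = elo_win_prob(1402, 1360)
```

In other words, even a leaderboard-topping Elo gap of a few dozen points corresponds to winning only slightly more than half of blind pairwise votes.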
A flurry of companies launched DeepResearch (or similarly named) products for real-time knowledge retrieval: web and document search, reasoning, self-correction and re-searching if results aren’t sufficiently helpful, and finally synthesis into research reports. Many are touting this as the latest “killer feature” for AI models and it certainly does look valuable. It’s interesting to read examples of the prompts, the reports, and how they’re evaluated by users with expertise in the field.
Next came Anthropic’s turn with Claude 3.7 Sonnet and Claude Code. Unlike its peers, the company built a hybrid reasoning model that integrates rapid response capabilities with extended, step-by-step reasoning. Unlike its predecessors, this model allows users to toggle between standard and extended thinking modes, offering flexibility for tasks ranging from quick answers to complex problem-solving. Extended thinking mode, available on paid tiers, enhances performance in math, physics, coding, and real-world applications, reflecting a shift from competition-focused optimization to practical business use. API users can even set token budgets to balance speed and quality. And Amazon should be happy too, as it finally launched Alexa+, featuring Claude as its new brains.
Meanwhile, Claude Code is a command-line tool enabling developers to delegate tasks like debugging, test-driven development, and large-scale refactoring directly to the AI. Early tests show significant time savings, with Claude outperforming competitors in coding benchmarks like SWE-bench and TAU-bench. It is particularly useful for long-horizon agent coding tasks, such as those productised by Fern Labs.
Finally, Google announced major updates to its Gemini AI models, making the powerful fan-favorite Gemini 2.0 Flash generally available to developers. Built using RL techniques, 2.0 Flash offers enhanced reasoning capabilities across vast amounts of multimodal information. The company also unveiled Gemini 2.0 Pro Experimental, their most capable model yet for complex prompts and coding tasks, and 2.0 Flash-Lite, a cost-efficient alternative that outperforms its predecessor.
🔬Research papers
Tissue reassembly with generative AI, EPFL, Meta AI, Swiss Institute of Bioinformatics.
In this paper, the authors introduce LUNA, a generative AI model designed to reconstruct tissue architectures from dissociated single-cell RNA sequencing (scRNA-seq) data. LUNA leverages spatial priors learned from spatial transcriptomics datasets to predict the spatial arrangement of cells based solely on their gene expression profiles.
The model employs a diffusion-based approach, progressively denoising random noise into spatial cell coordinates, and uses an attention mechanism to capture both local and global cellular interactions. LUNA demonstrated strong performance in reconstructing the MERFISH whole mouse brain atlas, accurately predicting spatial locations for over 1.2 million cells, including unseen cell types, with a Pearson correlation of 0.95 for spatial gene expression patterns. However, it is currently limited to 2D spatial reconstructions.
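The diffusion idea can be sketched in a toy form: start each cell at random coordinates and iteratively denoise toward predicted positions. Everything below is illustrative; `denoise_step` uses an oracle in place of LUNA's learned, expression-conditioned attention network:

```python
import numpy as np

rng = np.random.default_rng(0)
true_coords = rng.uniform(0, 100, size=(200, 2))  # ground-truth tissue layout

def denoise_step(coords: np.ndarray) -> np.ndarray:
    # Oracle stand-in for the trained denoiser: LUNA instead predicts each
    # cell's position from its gene expression profile via attention.
    predicted = true_coords
    return coords + 0.2 * (predicted - coords)  # move a fraction toward the prediction

coords = rng.normal(50, 100, size=(200, 2))  # start from pure noise
start = coords.copy()
for _ in range(50):
    coords = denoise_step(coords)
# After iterative denoising, coords closely approximate the tissue layout.
```

The point is the loop structure: spatial coordinates are treated as the thing being denoised, which is what lets a diffusion model output a 2D arrangement rather than an image.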
Multi-megabase scale genome interpretation with genetic language models, GSK, Max Planck, ETHZ.
Phenformer is a deep-learning model designed to predict disease risk directly from whole genome sequences. The model processes up to 88 million base pairs, integrating sequence, cell-type-specific expression, and phenotype data. It identifies disease-relevant cell and tissue types, outperforming state-of-the-art polygenic risk score (PRS) methods in predictive accuracy, particularly for diverse populations.
Uniquely, this model highlights mechanistic hypotheses, such as liver involvement in psoriasis and optic nerve complications in COPD, supported by existing clinical and epidemiological evidence. The authors demonstrate its ability to stratify individuals into molecular subtypes, revealing co-morbidity patterns.
Scaling Pre-training to One Hundred Billion Data for Vision Language Models, Simons Foundation, Cornell University
The authors investigate the potential of pre-training vision-language models on an unprecedented scale of 100 billion examples. They find that while model performance saturates on many common Western-centric benchmarks, culturally diverse tasks see more substantial gains from this massive web-scale data.
The paper also analyzes the model's multilinguality, showing gains in low-resource languages. Interestingly, they observe that reducing pretraining dataset size via quality filters like CLIP may inadvertently reduce cultural diversity, even in large-scale datasets.
These results highlight that the 100-billion example scale is vital for building truly inclusive multimodal systems, even if traditional benchmarks may not benefit significantly.
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach, ELLIS Institute Tübingen, University of Maryland, Lawrence Livermore National Laboratory
In this paper, the authors propose a novel language model architecture that scales test-time computation by reasoning in latent space using a recurrent block. The model is trained on 800 billion tokens and achieves strong performance on reasoning benchmarks, often outperforming larger models.
Key experiments demonstrate the model's ability to improve accuracy on tasks like mathematical and code reasoning by increasing test-time compute. The model also exhibits useful behaviors like zero-shot adaptive compute and continuous chain-of-thought reasoning.
However, the model is still a proof-of-concept trained on a limited compute budget. More optimal training may yield even better results.
This work presents a promising new approach for endowing language models with reasoning capabilities that could enhance their performance on complex real-world tasks. Moving reasoning into high-dimensional latent space, rather than explicit verbalization, opens up new possibilities for developing capable and efficient models.
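A minimal sketch of the test-time loop, with a toy contractive map standing in for the trained transformer block (all weights here are random, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
# Toy stand-in for the trained recurrent block: a contractive map.
# The real model unrolls a transformer block over a latent state.
W = rng.normal(size=(d, d)) * (0.3 / np.sqrt(d))
b = rng.normal(size=d)

def recurrent_step(z: np.ndarray) -> np.ndarray:
    return np.tanh(W @ z + b)

def latent_reason(z0: np.ndarray, n_steps: int) -> np.ndarray:
    """Spend more test-time compute by unrolling the block more times."""
    z = z0
    for _ in range(n_steps):
        z = recurrent_step(z)
    return z

z0 = rng.normal(size=d)
# Successive iterates change less and less: the latent state settles toward
# a fixed point, which is what enables zero-shot adaptive compute (exit
# early once the state stops moving).
```

The choice of `n_steps` at inference is exactly the test-time compute dial the paper scales.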
Less is More for RL Scaling, GAIR-NLP
But could we scale RL with less? This work explores how reducing model complexity and training data requirements can still achieve competitive or superior performance compared to traditional large-scale RL approaches. The authors show that their approach benchmarks favorably with significantly reduced computational resources: “With merely 817 curated training samples, LIMO achieves 57.1% accuracy on AIME and 94.8% on MATH, improving from previous SFT-based models' 6.5% and 59.2% respectively, while only using 1% of the training data required by previous approaches.”
A notable caveat, however, is that the approach may require careful tuning to balance simplicity and performance, which could limit its generalizability across all RL problems.
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks, University of California Berkeley, Carnegie Mellon University, Alibaba
How should reasoning models strike the right balance between thinking and acting on their environment? This paper investigates that question. The authors identify a phenomenon called "overthinking" where LRMs favor extended internal reasoning over interacting with the environment.
The authors analyze 4018 trajectories across 19 models on software engineering tasks. They find that higher overthinking scores correlate with decreased task performance, and reasoning models exhibit stronger overthinking tendencies compared to non-reasoning models.
Notably, selecting the solution with the lowest overthinking score from just 2 samples can improve performance by nearly 30% while reducing computational costs by 43%. The authors suggest that leveraging native function-calling capabilities and selective reinforcement learning could help mitigate overthinking.
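The best-of-k trick is simple enough to sketch. Here `overthinking_score` and the trajectory format are hypothetical stand-ins (the paper scores trajectories with an LLM judge):

```python
from typing import Callable, Sequence

def select_least_overthinking(
    trajectories: Sequence[dict],
    overthinking_score: Callable[[dict], float],
) -> dict:
    """Of k sampled trajectories, keep the one that reasons least before
    acting. The paper reports ~30% better performance and 43% lower cost
    with just k=2."""
    return min(trajectories, key=overthinking_score)

# Toy usage: score = fraction of tokens spent on internal reasoning.
trajs = [{"id": 1, "reason_frac": 0.9}, {"id": 2, "reason_frac": 0.4}]
best = select_least_overthinking(trajs, lambda t: t["reason_frac"])
# best is the trajectory with id 2
```

The cost saving comes from k being small: two samples plus a cheap scoring pass is far less compute than one long, overthought rollout.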
Brain-to-Text Decoding: A Non-Invasive Approach via Typing, Meta AI, Basque Center on Cognition, Brain and Language (BCBL), Rothschild Foundation Hospital.
In this paper, the authors explore the use of AI to decode language from non-invasive brain recordings and to understand the neural mechanisms of language production. Using MEG and EEG, they recorded brain activity from 35 participants typing sentences and trained an AI model to reconstruct sentences from these signals. The model achieved up to 80% character decoding accuracy with MEG, outperforming EEG systems.
The second study analyzed how the brain transforms thoughts into words, syllables, and letters. By interpreting MEG signals, the authors identified a dynamic neural code that chains successive representations while maintaining coherence over time.
Down the line, this kind of work could have applications in restoring communication for individuals with speech impairments, and contributes to understanding the neural basis of language.
Computational design of serine hydrolases, University of Washington; Institute for Protein Design; Howard Hughes Medical Institute
The authors present a computational approach to design serine hydrolases, enzymes with complex active sites that catalyze multistep reactions. Their designed enzymes have catalytic efficiencies up to 220,000 M⁻¹s⁻¹, a substantial improvement over previous computational designs. Moreover, crystal structures closely match the design models with sub-angstrom accuracy.
Of note, the designs have novel folds not seen in natural serine hydrolases, expanding the known structural diversity of this enzyme family. Meanwhile, analysis of the designs provides insights into the geometric basis of catalysis. A popular application of this kind of work includes plastic recycling, where serine hydrolases could break down polyethylene terephthalate (PET).
Robust Autonomy Emerges from Self-Play, Apple
In this paper, the authors explore the use of self-play reinforcement learning to develop robust and naturalistic driving policies for autonomous vehicles. They introduce GIGAFLOW, a simulator capable of training policies on an unprecedented scale, simulating 1.6 billion kilometers of driving in under 10 days using an 8-GPU node. The resulting policy achieves state-of-the-art performance on benchmarks like CARLA, nuPlan, and Waymo Open Motion Dataset, outperforming specialist models trained on benchmark-specific data.
The experiments demonstrate that self-play, combined with minimalistic reward functions, enables the emergence of diverse and realistic driving behaviors without using human driving data. The policy generalizes across various traffic scenarios, maps, and actor behaviors, achieving robustness with an average of 17.5 years of simulated driving between incidents.
Avat3r: Large Animatable Gaussian Reconstruction Model for High-fidelity 3D Head Avatars, Technical University of Munich, Meta Reality Labs Pittsburgh
The authors present Avat3r, a method for creating high-quality, animatable 3D head avatars from just a few input images. The key innovation is an architecture that predicts 3D Gaussians for each pixel, allowing for detailed reconstructions without relying on a fixed template mesh.
Avat3r outperforms state-of-the-art methods in both few-shot and single-shot scenarios. Experiments show it produces more expressive avatars with higher rendering quality, better identity matching, and smoother video renderings.
The method generalizes well to out-of-distribution examples like AI-generated images or antique busts. This opens up potential applications in casual settings where users may only have a few smartphone photos available.
🛢Dataset and benchmark drops
Tahoe-100M: A Giga-Scale Single-Cell Perturbation Atlas for Context-Dependent Gene Function and Cellular Modeling, Vevo Therapeutics; Arc Institute; Parse Biosciences.
In this paper, the authors present Tahoe-100M, a single-cell perturbation atlas comprising over 100 million transcriptomic profiles from 50 cancer cell lines treated with 1,100 small-molecule drugs. Using the Mosaic platform, they reduced batch effects by profiling diverse cell lines in parallel, enabling high-throughput single-cell RNA sequencing. The dataset captures 52,886 unique cell line-drug-dose conditions, with a median of 1,287 cells per condition.
The study highlights drug-induced transcriptional changes, showing that targeted inhibitors like RAS and RAF modulators elicit mutation-specific responses. E-distance metrics quantified the magnitude of drug effects, revealing stronger impacts for cancer-relevant pathways like PI3K/AKT and HDAC inhibitors. Cell cycle analysis demonstrated distinct phase arrests induced by specific drug classes, such as G2/M arrest by HDAC inhibitors.
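For readers unfamiliar with E-distance: one common formulation is the energy distance between treated and control cell populations, computed over expression profiles. A minimal NumPy sketch (plain Euclidean distances; the paper's exact preprocessing, e.g. dimensionality reduction, is omitted):

```python
import numpy as np

def e_distance(x: np.ndarray, y: np.ndarray) -> float:
    """Energy distance between two cell populations (rows = cells,
    columns = genes/features). Larger values = stronger drug effect."""
    def mean_pdist(a: np.ndarray, b: np.ndarray) -> float:
        # mean pairwise Euclidean distance between rows of a and rows of b
        diffs = a[:, None, :] - b[None, :, :]
        return float(np.linalg.norm(diffs, axis=-1).mean())
    return 2 * mean_pdist(x, y) - mean_pdist(x, x) - mean_pdist(y, y)

rng = np.random.default_rng(0)
control = rng.normal(size=(50, 20))
strong = rng.normal(size=(50, 20)) + 2.0  # large expression shift
weak = rng.normal(size=(50, 20))          # no shift
# e_distance(control, strong) is much larger than e_distance(control, weak)
```

Intuitively, the metric is near zero when the two populations overlap and grows as the drug pushes treated cells away from the control distribution.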
This research provides a scalable resource for AI-driven modeling of cellular behavior, with applications in drug discovery, personalized medicine, and understanding treatment resistance mechanisms.
ENIGMA EVAL: A Benchmark of Long Multimodal Reasoning Challenges, Scale AI, Center for AI Safety, MIT
This work introduces ENIGMA EVAL, a benchmark designed to evaluate the reasoning capabilities of large language models (LLMs) through complex, multimodal puzzles. The dataset includes 1,184 puzzles sourced from diverse puzzle-solving events, requiring models to synthesize implicit knowledge and perform multi-step deductive reasoning. These puzzles combine text and images, challenging models to uncover hidden connections and solution paths.
Experiments show that state-of-the-art models achieve only 7.0% accuracy on normal puzzles and 0% on harder ones, highlighting significant gaps in reasoning and problem-solving abilities. The study also reveals that some models struggle with OCR and parsing, though transcription does not drastically improve performance.
This research matters because it pushes the boundaries of AI evaluation, focusing on unstructured, creative problem-solving. By exposing current limitations, ENIGMA EVAL provides a framework for advancing AI systems capable of tackling real-world challenges requiring flexible reasoning and multimodal understanding.
MLGym: A New Framework and Benchmark for Advancing AI Research Agents, Meta, University of California Santa Barbara, University College London
In this paper, the authors introduce MLGym, a framework for evaluating and developing AI research agents, along with MLGym-Bench, a suite of 13 diverse AI research tasks. The tasks span computer vision, NLP, reinforcement learning, and game theory, requiring agents to generate ideas, process data, implement methods, and analyze results.
The authors evaluate frontier language models like GPT-4 and Claude on these tasks. They find that while the models can improve on given baselines by tuning hyperparameters, they do not generate novel hypotheses, algorithms, or architectures.
MLGym enables research on training algorithms like RL for AI research agents. The framework and benchmark are open-sourced to facilitate future work on advancing the AI research capabilities of language model agents.
PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks, FAIR Meta
In this paper, the authors introduce PARTNR, a benchmark for evaluating embodied AI agents in collaborative household tasks. The dataset contains 100,000 diverse, natural language instructions spanning 60 houses and 5,819 objects. Tasks exhibit real-world characteristics like spatial, temporal, and heterogeneous agent capability constraints.
The authors analyze state-of-the-art language models on PARTNR, revealing significant limitations in planning, perception, and skill execution. When paired with humans, the models require 1.5x more steps than human-human teams and 1.1x more than a single human, highlighting room for improvement.
Fine-tuning smaller language models on planning data achieves performance on par with models 9 times larger while being 8.6x faster. PARTNR aims to drive research in collaborative embodied agents, with potential applications in domestic robotics and virtual assistants. The benchmark's realistic tasks and systematic evaluations provide valuable insights for advancing human-robot interaction.
SYNTHETIC-1: Scaling Distributed Synthetic Data Generation for Verified Reasoning, Prime Intellect
The authors introduce SYNTHETIC-1, a large-scale open-source dataset of 1.4 million verified reasoning tasks spanning math, coding, and science. The dataset is designed to improve reasoning model training by leveraging DeepSeek-R1, a model trained with reinforcement learning and fine-tuned using verified reasoning traces.
The experiments involve generating tasks such as verifiable math problems, coding challenges with unit tests, and open-ended STEM questions, verified using LLM judges or programmatic methods. Notably, the dataset includes 61,000 synthetic code understanding tasks, which are particularly challenging for state-of-the-art models.
The research demonstrates that cold-start synthetic data significantly enhances model performance, and distillation from strong teacher models can outperform large-scale reinforcement learning. This work enables globally distributed reinforcement learning with verifiable rewards, allowing anyone to contribute compute.
🚀 Funding highlight reel
Mercor, the AI recruiting startup that automates hiring processes, raised a $100M Series B at a $2B valuation from Felicis, Benchmark, and General Catalyst.
Taktile, the AI-powered risk decisioning platform for financial services, raised a $54M Series B financing round from Balderton Capital, Index Ventures, and Tiger Global.
ElevenLabs*, the AI audio technology company, raised a $180M Series C at a $3.3B valuation from a16z and ICONIQ Growth.
Fal, the generative media platform for developers, raised a $49M Series B financing round led by Notable Capital with participation from Andreessen Horowitz and Bessemer Venture Partners.
Protex AI, the AI-driven workplace safety company, raised a $36M Series B financing round led by Hedosophia with participation from Salesforce Ventures.
Saronic, a company focused on autonomous shipbuilding, raised a $600M Series C at a $4B valuation from Elad Gil and General Catalyst.
Luminance, the AI legaltech company automating contract generation and negotiation, raised a $75M Series C financing round from Point72 Private Investments, Forestay Capital, and RPS Ventures.
Lambda*, the AI cloud platform provider, raised a $480M Series D financing round co-led by Andra Capital and SGW, with participation from NVIDIA and ARK Invest.
Together AI, the AI cloud platform for open source and enterprise AI, raised a $305M Series B financing round led by General Catalyst and co-led by Prosperity7, with participation from Salesforce Ventures and NVIDIA.
Tolan, the AI company behind Embodied Companions, raised a $10M financing round from investors including Lachy Groom, Nat Friedman, and Daniel Gross.
Abridge, the generative AI platform for clinical conversations, raised a $250M Series D from Elad Gil, IVP, and Bessemer Venture Partners.
Nomagic, the Polish startup building AI-powered robotic arms for logistics operations, raised a $44M Series B financing round from the European Bank for Reconstruction and Development (EBRD), Khosla Ventures, and Almaz Capital.
Achira, a biotech company blending AI and physics to model molecules, raised a $33M seed financing round backed by Dimension and NVIDIA.
Prime Intellect, a company building a peer-to-peer protocol for open-source AI, raised $15M in a financing round led by Founders Fund with participation from Menlo Ventures and Andrej Karpathy.
Elicit, the AI platform for evidence-backed decisions, raised a $22M Series A at a $100M valuation from Spark Capital and Footwork.
Enveda*, a techbio company using AI to discover new medicines from nature, raised a $150M Series C financing round with investment from Sanofi.
Unique, the Swiss AI platform for finance, raised a $30M Series A financing round led by DN Capital and CommerzVentures.
Latent Labs, the AI-first frontier bio company, raised a $40M Series A financing round co-led by Radical Ventures and Sofinnova Partners.
Tana, the AI-powered knowledge graph for work, raised a $25M financing round led by Tola Capital with participation from Lightspeed Venture Partners and Northzone.
Fern Labs*, a London-based startup focused on coordinating networks of AI agents to build software autonomously, raised a $3M pre-seed financing round led by us at Air Street Capital.
Prior Labs, a German AI startup focused on building models to analyze tabular data, raised a €9 million pre-seed financing round from Balderton Capital and XTX Ventures.
Tines, the automation platform for enterprise workflows, raised a $125M Series C at a $1.125B valuation from new and existing investors.
Crescendo, the AI-powered customer support platform, raised a $50M Series C financing round at a $500M valuation from General Catalyst and Celesta Capital.
Positron, a Reno-based AI chip startup focused on inference chips, raised $23.5M in a seed financing round from Valor Equity Partners, Atreides Management, and Flume Ventures.
Sesame, developers of an AI voice model and hardware device, raised an undisclosed Series A from a16z.
Verkada, the security systems maker specializing in video security cameras, environmental sensors, and alarms, raised $200M in a financing round at a $4.5B valuation led by General Catalyst with participation from Eclipse Ventures.
Arize, the AI observability platform for monitoring and evaluating AI models, raised a $70M Series C financing round from Adams Street Partners, M12, and SineWave Ventures.
Terrain Biosciences, the RNA design-build company leveraging AI for therapeutics and vaccine development, raised $9M in a seed financing round from Magnetic Ventures, Bruker Corporation, and Josef Feldman of Ex Nihilo.
* denotes companies in which the authors hold shares.
🤫 The rumor mill…
Anduril, the defense-tech company, is raising $2.5 billion in a financing round at a $28 billion valuation led by Founders Fund.
Anthropic, the AI startup behind the Claude chatbot, is raising a $3.5B financing round at a $61.5B valuation from Lightspeed Venture Partners, General Catalyst, and Bessemer Venture Partners.
Thinking Machines Lab, Mira Murati’s new AGI company, is in talks to raise $1 billion.
Safe Superintelligence, an AI company focused on safe AI development, is raising over $1 billion in a financing round at a valuation exceeding $30 billion, with Greenoaks Capital Partners leading the round and investing $500 million.
🤝 Exits
IBM, the technology giant, acquired DataStax, a company specializing in database and generative AI capabilities, including Apache Cassandra and vector databases, to enhance its watsonx AI offerings. The acquisition price was not disclosed. DataStax had previously raised $342.6M and was valued at $1.6B during its most recent funding round in June 2022, with hundreds of paying customers.
Ravelin, the AI-native fraud prevention platform, was acquired by Worldpay. Big congrats to this team, whom I had the pleasure of working with as lead investor after their Seed. The acquisition price was not disclosed.
Accern, an AI-powered insights provider for financial firms, was acquired by Wand AI, a Palo Alto-based startup specializing in AI agents for enterprises, for an undisclosed eight-figure sum. The company had previously raised $40M from investors including Fusion Fund and Mighty Capital.
Applied Intuition, a company providing AI-powered tools for autonomous systems development, acquired EpiSci, an innovator in AI and autonomy software for national security, to enhance U.S. defense capabilities; the acquisition price was not disclosed.
MongoDB, a leading database platform, acquired Voyage AI, a company specializing in embedding and reranking models for AI-powered search and retrieval, to enhance its AI capabilities; the acquisition price was not disclosed.