Z Potentials | Xiaodi Hou, Former TuSimple CEO, Returns: “Takeover Rate Is Dead. Long Live CPM”

Z Potentials invited Xiaodi Hou, founder and CEO of Bot Auto, to give a talk.

May 13, 2025

As the global autonomous driving industry continues its halting march between technological inflection points and elusive commercial returns, one of its earliest and most hard-core visionaries is back, not with a flashier demo but with a sharper knife. Xiaodi Hou, the scientist-engineer who once rang the Nasdaq bell with TuSimple, is now reengineering the value equation with something far less sexy and far more unforgiving: Cost per Mile (CPM). Not the takeover rate. Not cumulative demo miles. But a metric stripped of narrative gloss, calculated to the decimal and grounded in the dirt. From neural computation labs at Caltech to the dust-blown stretches of the I-10 interstate, Hou’s two-decade technical odyssey mirrors the industry’s own paradigm shift, from gazing at the stars to wading through the muck. Early in his career, he pored over more than 300 pitch decks with a Southern California angel network, decoding the venture logic and storytelling scaffolds that built (and often broke) the frontier. A decade later, helming a public company, he confronted the brutal arithmetic beneath Wall Street’s beloved “trucking story”: $40 in costs against $2 in revenue per mile, a terminally inverted business that no amount of vision could right.

As a serial founder who has survived both hype and misdiagnosis, Xiaodi Hou is now taking first principles thinking to its logical extreme: While most autonomous driving companies continue to flex in demo videos featuring night runs through complex traffic, Bot Auto has quietly relocated its headquarters to Houston — the beating heart of America’s freight corridors — and is rebuilding its tech stack from the ground up with unglamorous, boots-on-the-ground discipline. No more sensor arms races. No more theatrical lidar arrays. Instead, Hou is importing the pretraining revolution from generative AI into the world of trucks, slashing data labeling costs to just 2% of what legacy systems require. He’s also rejecting the siren song of platform economics and inflated valuations. Bot Auto isn’t selling a platform — it’s operating as a carrier. That means managing its own fleet, hauling real freight, and riding the tides of the logistics market to map out a cost curve grounded in actual miles, not pitch decks.

In a country where the average truck driver is now over 50, Bot Auto is pursuing a more fundamental question: when the gleam of autonomous tech fades, can it still move freight cheaper than a human ever could — and in doing so, sustain the steel arteries that carry 70% of the nation’s goods? This episode of Z Potentials sits down with Xiaodi Hou, founder and CEO of Bot Auto, for a conversation that cuts against the grain of Silicon Valley orthodoxy. We explore how he’s applying a millimeter-precise operational mindset to rethink the final equation of the trillion-dollar logistics market. Enjoy!

I often ask myself: “If this were the last decision of my life, would I regret not doing it?” If the answer is yes, then I go for it, without hesitation. If it’s no, that tells me it probably doesn’t matter. This mindset has shaped many of the important decisions I’ve made since.
The pace of research is accelerating, and with the rapid rise of scalable compute, we’re clearly headed into a future full of deep uncertainty and inevitable disruption. That’s exactly when you want to take big swings, not tinker around the edges. That’s why we chose autonomous driving, and not just any autonomy: we went straight to L4, the hardest level, because going from L4 down to L2 is manageable. Going the other way? Nearly impossible. That choice reflects who we are: focused on value creation, driven by clarity, and committed to playing the long game.
The first technical breakthrough came with deep learning. From 2016 to 2022, we entered a second phase, where deep learning became commoditized — models for detection and recognition matured, and object detection stopped being a hard problem. Since 2022, we’ve entered a third phase. You start hearing new terms: end-to-end, foundation models, large models. Whether a system is “end-to-end” isn’t really the point. What matters is using a large model to solve the whole problem. For us, that large model is the foundation model.
I see the way maturing adjacent technologies reshape an industry as akin to how the steam engine reshaped social structures: it forces a rethinking of how organizations are built. In the traditional era of autonomy, we spent a lot of time and money on annotation, meaning manual labeling at scale. But large language models (LLMs) have pointed us in a radically different direction: with massive pretraining, you can get a powerful model, and then fine-tune it at a fraction of the cost. We’ve brought down our data labeling costs to just 2% of what they used to be.
What I want is for our company to be one that takes social responsibility seriously. On one hand, we’ve built an internal engine for value creation. On the other hand, we generate strong positive externalities. Our goal isn’t to win a zero-sum game — it’s to become a “coral reef” company: one that expands the ecosystem, enlarges the market, and gives more people a reason and a way to participate.

01 Epiphany on the Edge: A Philosophy of Value Creation and a Life That Actually Feels Good

ZP: Welcome, Xiaodi — to start us off, could you briefly introduce yourself?

Xiaodi: Hello everyone, I’m Xiaodi Hou, founder of Bot Auto. I did my undergrad at Shanghai Jiao Tong University and went on to earn my Ph.D. at Caltech. My research was centered on computer vision, but I drew heavily from psychophysics and neuroscience to build computational models. The core question driving my work was: how can we use computational methods to convert raw pixels into interpretable concepts? At one point, I spent six months running electrophysiological experiments, studying how neurons in a monkey’s brain respond to visual stimuli. That experience led me to an important realization — today’s artificial neural networks, while loosely inspired by biology, are actually quite distant from how the brain really works. The resemblance is largely structural, not functional. The underlying mechanics are fundamentally different.

After graduating, I was faced with two options: pursue an academic career, or start a company. Most of my peers chose faculty positions — but to me, academia didn’t seem like the optimal path for maximizing value creation. So almost instinctively, I chose entrepreneurship. In 2010, I joined Pasadena Angels — one of the most respected and well-connected angel investor networks in Southern California. I participated in weekly pitch reviews and, within a year, evaluated over 300 startup decks. That experience gave me a foundational understanding of the venture ecosystem — and it confirmed for me that I wanted to build something of my own.

In 2014, I co-founded my first company, Cogtu Technology, where I served as CTO. The idea was to use AI to identify images and provide contextual metadata for ad targeting — improving ad relevance and efficiency. But after a year, we realized something important: in the advertising ecosystem, AI was largely a value-add, not a disruptive core. It had to attach itself to platforms rather than replace them. For a startup, that meant competing with entrenched giants — and that wasn't a winnable fight. So we made the hard call to shut it down. One of our investors at the time was Sina Tech, which believed strongly in our technical capabilities. They proposed we start a new company from the ashes — convert the debt into equity in a new entity, and they would commit to leading the next round. So in 2015, with the same core team, we founded TuSimple.

ZP: Were there any pivotal choices or turning points in your life that shaped who you are today?

Xiaodi: The most profound experience came during my undergraduate years. I was misdiagnosed as being terminally ill. In my sophomore year, after running a 400-meter race during the university sports meet, I suddenly went into acute kidney failure. The doctors told me I had three months to live. It turned out to be a misdiagnosis, but in that moment — facing what I thought was certain death — I found I wasn’t actually afraid. What struck me most was this bizarre sense of regret that I hadn’t finished reading a book I was deep into at the time: From Neuron to Brain. That was it — not GPA, not final exams, not grades. Just that book. That moment rewired my perspective on what matters. After I was discharged, I started viewing life through a different lens. Whenever I face a decision now, I ask myself: “If this were the last opportunity of my life, would I regret walking away from it?” If the answer is yes, I do it — no hesitation. If the answer is no, then I know it’s not that important. That framing has guided many of my most important choices since.

My life plan has always revolved around one core idea: value creation. At the very beginning of my Ph.D., my advisor asked me a question that turned out to be pivotal: “Do you see yourself more as someone who wants to discover, or someone who wants to invent?” He told me that how I answered would shape the rest of my research career. I didn’t hesitate. I chose invention. I wasn’t the type to get excited about discovering a new kind of neuron. What motivated me was building something that actually works: something usable, something that moves the needle in the real world. That drive didn’t come from the misdiagnosis I mentioned earlier; it’s always been there, embedded deep in who I am. I’ve always been obsessed with making things that work. If something I build, with my own hands and brain, ends up running successfully in the world, I feel an overwhelming sense of fulfillment. I’ve met many great founders over the years, and almost all of them share this trait. For real entrepreneurs, motivation goes far beyond financial return. The true reward is building something that matters, something that adds real value to society.

ZP: It’s been a decade since TuSimple was founded. What were your expectations for yourself back then? And looking forward, what are they for the next ten years?

Xiaodi: Honestly, I’ve never had particularly grand expectations for myself. What matters most to me is a sense of spiritual fulfillment. Looking back, I feel satisfied with how I’ve grown over the past decade — mainly because I’ve never let myself settle into a comfort zone. If there’s any through-line in my expectations, it’s this: to keep growing, and to keep building things that are interesting. As for wealth — there was a point when, on paper, I became a billionaire. But that didn’t bring me any joy. I never sold my shares. I didn’t want to create distance between myself and the team. I wanted us to stay in it together, through to the very end. Timing, resources, and alignment all matter — but nothing matters more than people. That sense of solidarity — building with your teammates, struggling side by side — that’s what makes entrepreneurship worth it.

TuSimple going public was no doubt an important milestone for the team — but for me personally, it didn’t represent “success” in any real sense. I still feel like I’m on the journey. If I had to share moments that genuinely brought me joy, three come to mind. The first was simple: waking up one morning to an alarm, groggily reaching for my phone — and realizing it was a holiday. That feeling of effortless ease, knowing I could go back to sleep, was pure contentment. The second moment was in late 2021, when our team successfully completed the “Driver Out” milestone — full autonomy on public roads with no driver, no safety operator, no remote control. That was a goal we had set for ourselves and actually achieved. And yes, there was joy in that — but it lasted only ten minutes. Because I knew it wasn’t the finish line. There was still the brutal challenge of operationalizing it at scale. The third was quieter, but it stuck with me more. I was driving from California to our office in Arizona, and on the highway, I started seeing our trucks — our own fleet — coming toward me, one after another. That moment hit different. It wasn’t just a milestone — it was a manifestation of everything we’d been building, piece by piece. That’s when the meaning of value creation really crystallized for me. The effort, the struggle, the persistence — all of it had finally taken shape in the real world.

02 Lessons from TuSimple: First Principles and the Cost Curve as the True Test of L4

ZP: If you had to summarize TuSimple’s positioning in a single sentence, what would it be?

Xiaodi: TuSimple’s core mission was to realize L4-level driverless trucking for long-haul logistics. As CTO, my primary responsibility was to define the technical direction. The business model, on the other hand, was something we explored collectively as a team.

ZP: When TuSimple was first founded, how did you approach the question of direction?

Xiaodi: I laid out three principles for myself at the very beginning. First, the company had to be technology-led — that’s where our strength was. Second, the business model had to be clear: who the customer is, and who’s paying. Third, even on the technical side, I wanted to work on something that reflected long-termism — something with real potential for sustained value creation, where the technical roadmap had enough depth to keep evolving over time.

We explored several directions, and in the end, the two finalists were smart retail and autonomous driving. Here’s how I thought about it: Deep learning had just emerged as a serious field — it was only introduced in late 2012, and by 2015 it had been around for less than three years. At the time, many in academia were still skeptical. But I saw two key signals, and I emphasized both when our team made the decision. First, NVIDIA’s GPUs were on a consistent upward trajectory. Compute power was compounding. As someone who’s also a gamer, I had firsthand intuition for how fast GPUs were improving — and I was convinced that, as new applications emerged, the hardware would keep pace. Second, despite the skepticism — especially from professors who had been entrenched in more traditional machine learning theory — I saw a growing wave of researchers pivoting into deep learning. New, high-quality work was coming out constantly. I believed the pragmatists in the academic community would eventually outnumber the theorists — and maybe even help reshape theory itself. The acceleration of research, combined with the rise of scalable compute, signaled one thing to me: we were heading into a period of massive uncertainty and inevitable transformation. That’s exactly the kind of moment when you want to bet big — when you want to go where the upside is non-linear. So we chose autonomous driving. And not incrementally — we went straight for L4.

ZP: What was the state of the industry’s technology at the time? And why were you so set on pursuing L4 at that particular moment?

Xiaodi: My thinking was: if we were already committing to a space filled with this much uncertainty, then we had no choice but to go deep — really deep. You can’t build serious autonomous driving tech by jumping into the L2 bandwagon and playing catch-up. To create something disruptive, you need a full-system understanding. That’s where the opportunity lies. At the core of our thinking was a very clear recognition: we were the ants. Google X and Mobileye were the elephants. If everyone plays by the same rules, the ants don’t win. That’s why we decided to go straight for L4 — the hardest problem. Because the further you push from the L4 starting point, the more likely you are to break out of the mold. But if you start from L2, it’s almost impossible to make that leap later on. That decision reflected who we were — focused on value creation, clear in intent, unwilling to compromise, and committed to the long game.

When it comes to the debate between L2 and L4, I still firmly believe that L4 is where true value creation happens. L2, for automakers, is more like a luxury feature. A friend of mine who works in the luxury goods industry once told me: luxury is the first thing people give up when money gets tight. That’s exactly how I see L2 — it's not foundational; it's optional. The value of L2 driver assistance is similar to the early work we did at TuSimple providing visual recognition services for ad platforms — it's a value-add, not the core product. Take ZongMu Technology, for example. (ZP note: In early 2025, ZongMu — once a high-profile startup in the intelligent driving space — became embroiled in controversy after reports of its founder disappearing, unpaid salaries, lapsed social security contributions, and executives allegedly wiring out large sums surfaced shortly after the Lunar New Year.) I don’t know the company in detail, but from what I understand, they did generate some early revenue. But as automakers began developing these technologies in-house, ZongMu’s market opportunity shrank dramatically. This isn’t a rare story in the industry. What starts out as a scarce capability gradually becomes commoditized — and as that happens, its value erodes. Whether OEMs choose to build internally or buy from third parties, they ultimately control the pricing power. Suppliers are left in a reactive position, with little room for margin or leverage.

Any business model that depends on someone else’s platform will always be vulnerable. If L4 technology were already fully productized and widely adopted — to the point where any decent team could build a system — then it would be a commodity. At that stage, customers would just comparison-shop, pick the vendor with the best price-performance ratio, and treat autonomy as a line item. But we’re nowhere near that. L4 is not productized yet — let alone commoditized. In the near term, the defining question for L4 is: Can you bring the cost down? If your cost per mile is lower than that of a human driver, then — and only then — do you have a viable L4 product. When we completed our “Driver Out” milestone, we proved it was technically possible. But the cost was $40 per mile, while the revenue was only $2 per mile. That model is dead on arrival. That’s why cost reduction is the crux of L4 commercialization. (ZP note: We’ll explore this in more depth later in the article.)
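
The per-mile arithmetic in this answer can be written out as a minimal Python sketch. The $40 cost and $2 revenue are the figures quoted above. Everything else is an assumption made for illustration: the `MileEconomics` class, its method names, and especially `HUMAN_CPM_PLACEHOLDER`, which is not a number from the interview and should be replaced with a real human-driven cost per mile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MileEconomics:
    """Per-mile unit economics for one operating mode (all figures in USD)."""

    cost_per_mile: float
    revenue_per_mile: float

    @property
    def margin_per_mile(self) -> float:
        # A negative margin means every mile driven loses money.
        return self.revenue_per_mile - self.cost_per_mile

    @property
    def cost_to_revenue_ratio(self) -> float:
        return self.cost_per_mile / self.revenue_per_mile

    def beats_human(self, human_cost_per_mile: float) -> bool:
        # Viability test described in the interview: autonomous CPM below human CPM.
        return self.cost_per_mile < human_cost_per_mile

    def required_cost_cut(self, target_cost_per_mile: float) -> float:
        """Fraction of the current cost that must be removed to hit the target."""
        return max(0.0, 1.0 - target_cost_per_mile / self.cost_per_mile)


# Figures quoted in the interview for the "Driver Out" runs.
driver_out = MileEconomics(cost_per_mile=40.0, revenue_per_mile=2.0)

# Hypothetical placeholder, NOT a figure from the interview.
HUMAN_CPM_PLACEHOLDER = 2.0

print(driver_out.margin_per_mile)        # loss per mile
print(driver_out.cost_to_revenue_ratio)  # cost as a multiple of revenue
print(driver_out.beats_human(HUMAN_CPM_PLACEHOLDER))
print(driver_out.required_cost_cut(driver_out.revenue_per_mile))
```

With the quoted figures, each mile loses $38 and cost is 20 times revenue, so reaching break-even against $2 of revenue alone would mean removing roughly 95% of the per-mile cost.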

ZP: What were your expectations back then for achieving L4? Did you have a concrete timeline in mind, and how did you plan for the inevitable challenges?

Xiaodi: At the time, our expectation was to remove the driver around 2020 — and that turned out to be fairly accurate. We actually achieved it in 2021. We also believed we had a real shot at being the first to commercialize. Even though we couldn’t quantify how big the “elephants” were, we knew one thing for sure: they were still using last-generation technology. Google X and Mobileye hadn’t adopted deep learning yet, while we had already bet on it — and that gave us a chance to leap ahead right at the starting line. That conviction was what gave us the courage to enter the field. We didn’t overthink it. I mean, I once got a terminal diagnosis — what else is there to be afraid of?

ZP: Looking back, what do you think were some of the better decisions made during your time building TuSimple? What made them right in hindsight?

Xiaodi: From a technical roadmap perspective, I think TuSimple actually made some remarkably good calls. Just yesterday, I ran into an investor at a conference, someone who first met me in 2017. He told me that back then, I kept emphasizing the importance of vision and warned against over-reliance on lidar. At the time, he thought I was talking nonsense. Now, he admits I was absolutely right. Back then, our position was that lidar has its strengths, but the DARPA Grand Challenge playbook shouldn’t be copied wholesale. (ZP note: the DARPA Grand Challenge was a U.S. government-sponsored autonomous vehicle competition that heavily influenced early thinking about autonomous vehicles.) In 2015, that competition was still fresh in everyone’s mind. Google and others leaned heavily on lidar, while cameras were used mostly just to read traffic lights. Ironically, public opinion today seems to have swung in the opposite direction: some people now argue that not using lidar is the only way to prove you have advanced algorithms. That kind of binary thinking is misguided. Our attitude toward sensors was, and still is, open-minded. We never held any dogmatic views about what to use or not use. The principle was simple: black cat, white cat, if it catches mice, it’s a good cat. Use whatever works. That flexibility, that pragmatism in our technical approach, was I think one of our most important strengths, and it hasn’t changed.

From a business strategy perspective, one of the key decisions we got right early on was focusing on the trucking sector. It took years for much of the industry to catch up to that realization — that trucks would commercialize long before robotaxis. Today, that view has become almost a consensus. Our bet was that highways and long-haul freight corridors were simply a better fit for autonomy. The traffic environment is more structured, and that matters — because ambiguity in traffic rules is the hardest problem in autonomous driving. That judgment still holds true today.

I believe the reason we were able to make the right calls came down to one thing: first principles. I've always believed that whatever you're doing, you need to be brutally honest about the end goal — and be accountable to that. Whether it was choosing flexible sensor architectures or focusing on highway freight, these weren’t ideological positions. They were just natural outcomes of reasoning from first principles.

ZP: Looking back, what are some of the key learnings from that entrepreneurial chapter?

Xiaodi: One major lesson was that, in trying to meet Wall Street’s expectations, the company adopted some shortsighted strategies. For instance, we scaled operations by essentially buying revenue — spending money to artificially grow topline numbers. It looked good on paper in the short term, but it didn’t create real value. You can see similar patterns in Aurora’s financials — they talk about revenue without addressing cost, which clearly isn’t sustainable. Another learning: there’s a fundamental tension between scale and development velocity. From a systems perspective, fleet size is a double-edged sword. The larger the fleet, the more friction it creates for algorithmic and systems iteration. It’s a complex tradeoff, one that spans business development, hardware operations, and software engineering. And the bigger the organization, the more likely you are to see collaboration break down across teams. Smaller companies don’t have that luxury — they have to work together to survive. That’s why, at Bot Auto, I’ve made it crystal clear that my job as CEO is to define the vision — and to make sure that every team understands that vision in the same way, with precision. Vision misalignment is the root cause of siloed execution, and that’s something we actively work to avoid.

ZP: From the perspective of your personal evolution, how did this entrepreneurial journey shape your growth?

Xiaodi: I went through three major role transitions. The first was as a technical hacker. The second, as a technical manager. And the third — as a true company leader. In the early days, we were a startup with limited ability to hire seasoned talent. So we brought on a group of interns from UCSD and trained them up into full-time engineers. Everyone was figuring things out as we went. I worked closely with one or two other engineers, leading the team hands-on — even writing key pieces of code myself. Over time, I transitioned into the second phase: technical management. I stopped writing code directly and focused instead on guiding the team, especially on strategic technical decisions. Even when I was fully capable of doing something myself, I’d delegate it to give team members space to grow.

The third transition happened after we completed the “Driver Out” milestone in 2021. That project was a major inflection point for TuSimple — it involved everything: software, hardware, algorithms, engineering, even operations. And in the end, all of that pressure converged on me. Once it was done, my perspective fundamentally shifted. I realized that building a company wasn’t just about developing a piece of technology — it was about making the business model work. The gap between cost and revenue was massive, and whoever could close that gap would be the one to truly move the company forward and turn the technology into a viable product. That’s when I knew: I had to become the CEO. Not just to lead the tech, but to make sure the entire company moved forward with coherence — inside and out.

03 Starting Over: Redefining Autonomous Driving Through Transportation as a Service and Cost per Mile

ZP: Let me ask you something pointed — the fact that you're once again building autonomous trucking... is that about obsession?

Xiaodi: Haha, I get misunderstood on this a lot, so let me clarify. Some people assume I’m doing this out of spite, or revenge — like I have something to prove. That’s really not the case. I’ve said in past interviews that “I’m a decision-making machine without emotions.” It’s an easy quote to misinterpret, but it actually reflects how I approach problems — I default to first principles, not emotional drivers. So no — my decision to do another autonomous driving startup wasn’t about ego, or proving a point. That’s not how I make major life decisions. For me, the core motivation has always been value creation. To borrow a quote from Alex Honnold, the free solo climber: “The only time I truly feel alive is when I’m climbing.” That pursuit of meaning — that’s why I build. That’s why I start companies, rather than work for someone else.

The second point is that I did seriously consider other industries. In early March 2023, I stepped down from TuSimple’s board after they decided to make a series of strategic shifts: pulling out of the U.S. market, pivoting toward China, and moving from L4 to L2. At the time, I looked into several potential directions, including embodied AI and foundation models. Some investors even told me they’d back me immediately if I chose to go into large models. But I had real reservations about that space. My concern was: what happens when a dominant platform player decides to bundle AI tools into their existing products? Could I compete with that? Take Google: they could capture massive market share without even trying. Bing, just by being tied to the Edge browser, gets huge traffic. That kind of platform leverage makes the foundation model space feel full of unseen traps. In autonomous driving, on the other hand, I’ve already fallen into most of the traps. That’s exactly what makes it valuable experience. The scars I earned at TuSimple are what give me an edge in building something better this time around. If I had picked robotics, I’d be starting from scratch. I’d be asking myself: should we build a screw-tightening robot or a laundry-folding robot? I don’t even know if two-legged robots are necessary in the next decade. My understanding of that space is too limited. And if, three years from now, I realized I had made the wrong foundational call, I’d deeply regret it. But with autonomous trucks, I’ve already validated that there are no technical dead ends; the remaining challenge is operational. That’s a known battlefield for me. It’s like a game: I lost right before the final level, but now that I’m restarting, I know exactly where the traps are, and where the treasure chests are, too. And one more thing [laughs, with a bit of founder fatigue]: the autonomous vehicle industry has already hit rock bottom. From here, it can only get better.

ZP: So you anticipated how tough the environment would be — and still chose this path?

Xiaodi: I did. I started fundraising in April 2023 and closed the round by July. It took about three to four months, and honestly, it was a rough stretch. A lot of investors were cautious — even skeptical — about autonomous driving. They saw it as high-risk and advised me to pivot to a newer, hotter vertical. But I stayed grounded in first principles. I could still see a future for autonomy. And I still believe this: a year from now, the state of autonomous driving will be better than it is today.

I asked the team a hard question upfront: Do we truly believe we can commercialize autonomous driving with no fallback resources — no internal surplus, no external rescue? I want every person who joins to think seriously about that. If the answer is no, then don’t join. We’re approaching this from the ground up — innovating at every level, relentlessly driving down costs, and becoming radically operations-focused. That’s why we moved our headquarters from the comfort of California to Houston, Texas — where we’re face-to-face with real trucking operations every day. I believe that if we stay disciplined and stick to this path, the road ahead will open up. In this industry, there’s no such thing as a $1 billion mid-sized company. You either go much bigger — or you die. Go big or go home. I’ve also reminded the team: we should be mentally prepared for three to five years of hard days. What I’m proud of this time around is that every person on this team is here because they believe. They’re mission-driven, not convenience-driven. We’re in this together — building the future of autonomy.

ZP: How do you view the technological evolution of autonomous trucking? What have been the key breakthroughs over the past decade?

Xiaodi: Before 2016, we were still in the first phase — most people were approaching the problem using traditional methods. The big, unresolved question at the time was: can we even reliably detect and classify every object in the environment? That alone was incredibly difficult. The first real breakthrough came with deep learning. That marked the transition into the second phase — from 2016 to 2022 — as deep learning became more widely adopted. We started applying it to detection and recognition, and suddenly, object detection was no longer the bottleneck. That progress was like solving the basic problem of “feeding yourself.” Once that was in place, people began aiming for a higher standard — more sophisticated capabilities. Prediction, for example, became a hot topic.

After 2022, we entered what I consider the third phase. You started hearing a flood of new buzzwords — end-to-end, large models, and so on. In my view, whether something is “end-to-end” or not isn’t the real issue. What matters is whether we can use a single, general-purpose model to solve the problem. For us, that means a foundation model — a multi-task, shared-architecture model. It doesn’t have to be a hundred-billion-parameter behemoth to qualify as a large model. In fact, end-to-end often imposes an unnecessary constraint on model design. It forces you to abandon non-end-to-end techniques that are often extremely effective. I’ve always argued that two things — Newtonian mechanics and traffic rules — should not be solved using end-to-end architectures. That’s why I believe we shouldn’t anchor our thinking to whatever buzzword happens to be trending. When you take a step back and look at how technology actually evolves, nothing ever appears fully formed, like Sun Wukong leaping from a rock. Everything builds, incrementally, on what came before. It’s a continuous process — and we should approach it with humility.

In my view, what marks this new wave of foundation models is twofold. First, multi-tasking — the ability to use a single model to handle a full chain of problems, from detection to tracking to prediction. Second, multi-modality — a model that can take in diverse types of input, whether single-frame or multi-frame, whether from cameras or lidar, and process it all through one unified network. That allows us to allocate more compute to a single, scalable model architecture. So yes — this, to me, represents the entry point into the third phase of autonomous driving.
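
To make the two properties concrete, here is a toy, pure-Python sketch of a shared-architecture model: one small encoder per input modality, a single shared trunk, and one lightweight head per task. It is not Bot Auto’s architecture, which the interview does not describe. The class name `ToyFoundationModel`, the dimensions, the random weights, and the averaging fusion are all assumptions chosen only to show the structure. Multi-frame input is left out for brevity.

```python
import random

EMBED_DIM = 8  # hypothetical width of the shared embedding space


def make_matrix(rows, cols, rng):
    """Random weights stand in for trained parameters in this toy."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def linear(matrix, vector):
    """Matrix-vector product; each row of `matrix` yields one output element."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    return [max(0.0, x) for x in vector]


class ToyFoundationModel:
    def __init__(self, input_dims, head_dims, seed=0):
        rng = random.Random(seed)
        # One encoder per modality maps raw features into the shared space.
        self.encoders = {
            name: make_matrix(EMBED_DIM, dim, rng) for name, dim in input_dims.items()
        }
        # A single trunk is shared by every task, so compute is spent once.
        self.trunk = make_matrix(EMBED_DIM, EMBED_DIM, rng)
        # Each task head reads the same trunk features.
        self.heads = {
            task: make_matrix(dim, EMBED_DIM, rng) for task, dim in head_dims.items()
        }

    def forward(self, inputs):
        if not inputs:
            raise ValueError("at least one modality is required")
        embedded = [
            relu(linear(self.encoders[name], feats)) for name, feats in inputs.items()
        ]
        # Fuse whichever modalities are present by averaging their embeddings.
        fused = [sum(values) / len(embedded) for values in zip(*embedded)]
        shared = relu(linear(self.trunk, fused))
        return {task: linear(head, shared) for task, head in self.heads.items()}


model = ToyFoundationModel(
    input_dims={"camera": 6, "lidar": 4},
    head_dims={"detection": 3, "tracking": 2, "prediction": 5},
)
outputs = model.forward({"camera": [0.1] * 6, "lidar": [0.2] * 4})
camera_only = model.forward({"camera": [0.1] * 6})
print({task: len(values) for task, values in outputs.items()})
```

The same trunk serves all three heads and accepts either sensor set, which is the sense in which more compute can be concentrated in one scalable network rather than spread across separate per-task models.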

ZP: As of today, are there still any technical bottlenecks standing in the way of realizing Bot Auto’s goals?

Xiaodi: Honestly, I don’t think so. At this point, many autonomous driving companies are already able to demonstrate strong performance; even some L2 players can now confidently demo complex driving scenarios. I tend to break down technical performance into three categories: best case, average case, and worst case. Right now, the best-case and average-case scenarios have largely been solved. The real focus should be on the lower bound, the worst case: making sure the system still behaves according to traffic laws without human intervention, even in edge conditions. And that’s not just a software problem. It’s a complex intersection of software, hardware, and a deep understanding of traffic regulations. That’s why we can’t fall into the trap of thinking deep learning alone will solve everything. From a narrow, machine-learning perspective, we’re seeing incremental improvements in both best-case and average-case performance, but not breakthroughs. From a broader, engineering perspective, however, there has been meaningful progress. System complexity has been significantly reduced, and development costs have dropped by at least an order of magnitude.

ZP: With the rapid rise of generative AI, how do you view its potential applications in autonomous trucking? What kind of impact could it have on the industry?

Xiaodi: Let me start with an analogy. The rise of the steam engine didn’t lead to every household owning one; it led to the centralization of labor and a complete restructuring of society. That’s how I view the relationship between narrow technical advances and broad organizational change. Just as the steam engine reshaped social structures, certain technologies will reshape how we build and run companies. In the traditional era of autonomy, we spent huge amounts of time and money on manual annotation. But one of the core characteristics of LLMs is that the models are very large, and no single company can afford to manually label data at the scale they require. So the industry started looking for alternative approaches, most notably pretraining on the massive corpus of human-written text accumulated across the internet. That, in a sense, opened the door to pretraining as a methodology, and I think its impact on autonomous driving has been significantly underestimated. Back in my previous company, we spent heavily on annotation. But LLMs have pointed us toward a different path: large-scale pretraining to get strong model performance, followed by minimal-cost fine-tuning. We’ve been able to reduce our data labeling costs to just 2% of what they used to be.

Our first actual spend on manual data annotation at Bot Auto didn’t happen until March 8, 2024. Before that, we hadn’t spent a single dollar on it. What we were doing in late 2023 and early 2024 was pretraining: applying some of our thinking around LLMs and adapting it to the trucking domain. That transition came from real insight, not just hype. I genuinely believe AI will drive significant progress across many industries, but that progress won’t come from simply slapping two buzzwords together. It will be subtle, compounding, and foundational. That’s also something I’ve come to appreciate over the past few years: what makes our small team so effective. On one hand, we’re tight-knit and experienced. On the other, we’re constantly absorbing new developments, and that allows us to do more with fewer people and build a more efficient organization that can punch well above its weight.

ZP: If you had to summarize Bot Auto’s positioning in one sentence, what would it be?

Xiaodi: To answer that, you have to go back to the fundamental question: how do you create value? And the clearest expression of value creation in this space is reducing cost per mile. If autonomous driving is going to matter, it has to lower the operational cost per mile — full stop. That’s why we’ve defined the company’s entire structure around solving autonomy through the lens of CPM. What we’re building is Transportation as a Service, with cost per mile as the core metric guiding everything we do.
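To make the metric concrete, here is a minimal sketch of a cost-per-mile calculation. The interview does not describe Bot Auto's actual methodology, so the cost categories, the `PeriodCosts` and `cost_per_mile` names, and all the dollar figures below are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass
class PeriodCosts:
    """Costs for one operating period, in dollars (hypothetical categories)."""
    fuel: float
    maintenance: float
    labor: float       # drivers, or remote operations staff for an autonomous fleet
    equipment: float   # truck and hardware depreciation or lease
    overhead: float    # insurance, engineering, administration


def cost_per_mile(costs: PeriodCosts, miles: float) -> float:
    """Total cost in the period divided by miles driven in the same period."""
    if miles <= 0:
        raise ValueError("miles must be positive")
    total = (costs.fuel + costs.maintenance + costs.labor
             + costs.equipment + costs.overhead)
    return total / miles


# Made-up numbers, for illustration only.
example = PeriodCosts(fuel=50_000, maintenance=15_000, labor=70_000,
                      equipment=40_000, overhead=25_000)
cpm = cost_per_mile(example, miles=100_000)
print(f"CPM: ${cpm:.2f}")  # prints "CPM: $2.00"
```

The point of the sketch is that every cost, including engineering and operations overhead, lands in the numerator, so the figure is hard to flatter with selective reporting.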

ZP: Why is cost per mile such a critical metric?

Xiaodi: CPM isn’t a technical metric — it’s a business one. And I believe it will become the foundational concept that defines the future of autonomous driving. Previously, the go-to metric was MPI — miles per intervention — which refers specifically to interventions made for safety reasons. But the problem is, every company defines “intervention” differently. The variance is huge. And once your MPI reaches thousands or even tens of thousands of miles per intervention, it becomes a statistical rarity — essentially noise. It’s not a reliable or stable indicator anymore. We’re now at a stage where technology is no longer the bottleneck — business is. That’s where cost per mile comes in. CPM is a benchmark metric. It’s honest. You can’t game it. And it gives everyone in the industry a shared baseline for comparison and accountability.
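The remark that a very high MPI becomes "essentially noise" can be illustrated with a rough statistical sketch. It assumes interventions behave like a Poisson process, which is a modeling assumption and not something stated in the interview; the function name and the 100,000-mile test campaign are hypothetical.

```python
import math


def mpi_relative_error(test_miles: float, true_mpi: float) -> float:
    """Approximate relative standard error of an MPI estimate.

    Assumes interventions follow a Poisson process, so the expected count is
    test_miles / true_mpi and the relative error of that count is
    1 / sqrt(count).
    """
    expected_interventions = test_miles / true_mpi
    return 1.0 / math.sqrt(expected_interventions)


TEST_MILES = 100_000  # hypothetical size of a test campaign

for mpi in (100, 1_000, 10_000, 50_000):
    err = mpi_relative_error(TEST_MILES, mpi)
    print(f"true MPI {mpi:>6}: ~{TEST_MILES / mpi:>5.0f} interventions, "
          f"relative error ~{err:.0%}")
```

Under these assumptions, the same test mileage that pins down an MPI of 100 to within a few percent yields only a handful of events once MPI reaches the tens of thousands, so the estimate swings widely from one campaign to the next.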

ZP: What kind of problem is Bot Auto trying to solve? And how is that need currently being met?

Xiaodi: Let me focus specifically on the U.S. market, since that’s where we’re currently operating. In the U.S., there are quite a few companies working on incremental innovations, for example freight-matching platforms. These companies try to match shippers with available capacity using more modern, IT- or automation-driven approaches. Essentially, they act as middlemen addressing information asymmetry. And yes, there is a real asymmetry in the market. At the macro level, there’s been a persistent supply shortage for years; that’s a basic fact about the U.S. freight industry. But if you zoom in to the micro level, you’ll find a very different reality: many companies struggle with oversupply during slow seasons, when there simply isn’t enough freight to haul. So here’s the dynamic: if someone wants to build a business around macro-level supply shortages, they’ll inevitably have to absorb the micro-level risk of oversupply. That’s what gives rise to one of the most well-known patterns in U.S. trucking, the industry tide. Every few years, trucking companies hit a downturn. Then, a few years later, it’s a boom cycle again.

Autonomous driving has a critical role to play in reforming the supply side of the freight market. One important fact: in the U.S., very few young people are choosing to become truck drivers anymore. There are a couple of reasons for this. First, people born after the year 2000 have a fundamentally different mindset than previous generations. Driving a truck is grueling work — physically and mentally — and fewer people are willing to endure that. Second, as social safety nets and economic options expand, people have more choices. They’re less likely to take on difficult jobs purely for survival. Meanwhile, on the demand side, freight demand is only going up. In the past, cargo could be shipped via rail and arrive in a week. But with today’s e-commerce expectations, those timelines are no longer acceptable — flexibility has become paramount, and trucking has filled that gap. Trucks now move about 70% of all freight in the U.S. At the same time, the average truck driver is 50 years old — and younger generations aren’t stepping in fast enough to replace them. This labor gap is only going to widen.

ZP: Who are our customers? And how big is this market?

Xiaodi: This is an $800 billion to $1 trillion market — and a huge portion of that cost comes from drivers. So yes, it’s an industry with massive potential. Roughly 70% of all freight in the U.S. moves by truck — everything from construction materials and agricultural products to Amazon packages.

ZP: And who are our competitors?

Xiaodi: First, let me explain our business model. In the autonomous driving space, some companies focus on building software and selling it as a product. Others sell a combined software and hardware stack. But in our case, we don’t see the need to sell anything to others — because we are the trucking fleet. That means our customer base doesn’t really overlap with most other autonomous trucking companies. As for competition — in a practical sense, we don’t really have any. This is a market where demand structurally exceeds supply, and that imbalance isn’t going away anytime soon. Even in our most optimistic projections, we won’t come close to filling the shortage of truck drivers by 2030. From an economics perspective, the core problem is diseconomies of scale. The larger a trucking company grows — the more drivers it manages — the more layers of complexity it has to absorb. Operational costs rise with organizational overhead. That’s why even the largest fleets in the U.S., like C.H. Robinson, top out around 20,000 trucks. Most of the market is fragmented — independent owner-operators running small fleets. But that leads to its own inefficiencies: poor information flow, lack of maintenance scale, and limited optimization. On the other hand, companies that grow too large run into human management bottlenecks. The only way to break this structural constraint is through autonomous driving. It introduces a new kind of scalability — one that finally delivers true economies of scale.

From a historical perspective, no one was really working on autonomous trucks in the early days — everyone was focused on passenger cars. So before 2018, there was still a lot of skepticism about self-driving trucks. People would ask: Why not go after a more prestigious, consumer-facing market? But then Robotaxis ran into roadblocks — and Robotrucks started to gain traction. Around that time, three companies in the space went public: TuSimple, Embark, and Aurora. That said, the market tends to swing like a pendulum. In my view, resilient founders shouldn't let themselves get caught up in those swings. You have to see your own path clearly — and keep walking it. As it stands today, most of our peers are still focused on demos. Almost no one in this space has been willing to publicly disclose their cost per mile — and I think that’s unfortunate. In the next month or two, I plan to publish our own CPM calculation methodology. Because that’s the only way this industry moves toward a healthier, more transparent direction.

ZP: What do customers care about most?

Xiaodi: Customers don’t care how sophisticated our neural networks are. What they fundamentally care about is simple: Can you deliver the goods on time? Is the price reasonable? Can you provide consistent, long-term capacity? That’s the baseline. Of course, once our fleet becomes fully autonomous, we’ll be able to offer value-added services that are much harder for human-driven fleets, things like real-time vehicle tracking and delivery confirmations, essentially as built-in features. These kinds of digital services become trivial once the fleet is automated. As for pricing: in the early stages of autonomous trucking, pricing shouldn’t be higher than that of human-driven trucks, but it also shouldn’t be too low. You need to find a way to earn revenue without disrupting the market. If you take the human driver cost as a baseline, pricing above it means you haven’t achieved productization yet. Once you are priced below it, you don’t need to cut prices further anytime soon; the market will naturally shift in your direction. And that shift is inevitable. The supply of human drivers is shrinking; it’s a natural displacement process. At the same time, autonomy creates new kinds of jobs. As more autonomous operations centers come online, we’ll see rising demand for human labor in those roles. The ecosystem evolves rather than disappears.

ZP: Are there any major milestones we’re aiming for this year?

Xiaodi: Here’s a bit of a preview: we actually completed daytime and nighttime point-to-point demos last year. This year, our focus is on bringing our redundancy systems to a level of maturity where we can run an extended period of Driver Out pilot operations. Looking one to two years ahead, our goal is to achieve a cost per mile that’s lower than what it costs to operate with human drivers.

ZP: What’s Bot Auto’s long-term vision? What kind of company do you want it to be by 2030?

Xiaodi: I want Bot Auto to be a company that embraces real social responsibility. On one hand, we have an internal engine for value creation. On the other, we generate strong positive externalities. Our goal is to be a “coral reef” company — one that expands the market and creates opportunities for others, rather than fighting for dominance in a zero-sum game. If we can build a logistics network that functions as infrastructure, it creates pull for the entire ecosystem. That means we can help others build businesses around it — give people a platform to earn income and grow. Think back to what Alibaba did in its early years. Its platform vision — “making it easy to do business anywhere” — ended up supporting countless small businesses. By enabling others, it enabled itself. Now we’re at a similar moment — with the chance to stand at the center of transformation in the logistics industry. And if we want to do that well, the mindset we need isn’t extractive — it’s altruistic.

04 Rapid Fire

ZP: What’s your zodiac sign or MBTI type?

Xiaodi: Gemini, INTJ, blood type O. (ZP note: An unmistakably optimistic founder with a built-in “sunshine filter” — see photo for proof!)

ZP: Besides building companies, what are some of your personal interests?

Xiaodi: I actually have a lot of hobbies. First and foremost — music. I’m a huge fan of heavy metal, and I play guitar myself. I used to run a lot, too. I’ve completed a half-marathon — my best time was 1 hour and 40 minutes. I’m also really into mountain biking — though I’ve broken a bone doing it. Funny enough, despite working in autonomous driving, I personally drive a stick-shift gas-powered car. I’ve always loved machines and the feel of analog control.

I’m also a total foodie. I cook Chinese food — pretty seriously, actually. I’m into tea and coffee, too — I’ve done a lot of deep dives into both. The truth is, I just really love life. If I didn’t have to work, my hobbies alone would keep me so busy I’d probably sleep only four hours a day.

ZP: One book that’s had a big impact on you?

Xiaodi: This probably belongs more in the “slow answer” category, but I’d say Romance of the Three Kingdoms. There’s a lot in that book that I still draw from today. As a kid, I liked Zhuge Liang (Chancellor of Shu Han); he was the mastermind, always ten steps ahead. But in high school, through college, and even during the early stages of my startup journey, the character I identified with most was Jiang Wei (General of Shu Han), the one who never gave up, who kept fighting even when he knew the odds were hopeless. At that time, I needed Jiang Wei’s kind of courage. Later, as the company grew and I started facing harder and more complex decisions, I found myself drawn to Liu Bei (Emperor of Shu Han). He’s the one I relate to most now. Liu Bei stayed true to what was right, even in the face of overwhelming odds, even when others criticized or misunderstood him. There’s a scene I often think about: when Liu Bei stood up to the invasion led by Cao Cao (Chancellor of the Eastern Han), he still insisted on doing the right thing. I’ve faced similar decisions. I know autonomous driving is the right direction, but people talk behind my back. It’s hard, but you still have to do the right thing. That’s what Liu Bei represents: choosing the difficult path because it’s right. That’s given me a lot of strength and clarity. When I was younger, I used to think Liu Bei was kind of weak, always losing battles. But now? I’m a two-time founder, and I’ve lost twice too. And that’s okay. Success or failure isn’t the point. The point is to keep walking forward. You have to let go of ego and hold onto what’s true, kind, and meaningful. That’s why Liu Bei is my hero now, and the person I try to model myself after.

ZP: What are your favorite podcasts?

Xiaodi: I’m especially fond of two: Left-Right (忽左忽右) and Caffè Breve (半拿铁). Both have real depth. Caffè Breve, for instance, tells business history in a style reminiscent of traditional Chinese comic dialogue — almost like a modern-day cross talk. What I love about it is that it forces you to zoom out. By placing individual people and events in their historical context, it pushes you to think across longer time horizons. Decisions start to make more sense when viewed this way — not just as isolated moments, but as part of broader historical patterns. That sense of historical continuity really resonates with me. It’s helped me connect the dots between industry dynamics, global shifts, and past cycles — and to see how often history rhymes. The older I get, the more I feel the importance of learning from history — not just for perspective, but to avoid repeating the same mistakes.

Disclaimer

ZP note: This interview has been edited for clarity and conciseness, and the final content has been reviewed and approved by Xiaodi Hou. The views expressed are solely those of the interviewee. We welcome your thoughts — feel free to share your perspectives in the comments below. To learn more about Bot Auto, please visit their official website: https://bot.auto/


Image Source: Xiaodi Hou
