February 12, 2026

Be Your Own Robot Business Owner

 


The Dawn of Autonomous Robot Businesses:
When Will Owners Replace Staff with Robots?


Robots delight us humans, especially during the honeymoon phase while robots are new. There's a certain charm to having a physical building, running a traditional business like a store or shop, and having all of the labor done by human-shaped robots. Although touch screens, invisible AI, and non-human-shaped robots will continue to be more prevalent, our human minds picture a restaurant with Robot Servers, Robot Cooks, and a Robot Manager.


For centuries, the definition of "business owner" has been synonymous with "people manager." Whether running a manufacturing plant, a logistics company, or a neighborhood coffee shop, scaling a business meant hiring more hands. It meant managing schedules, navigating HR disputes, covering sick days, and training staff.


But tomorrow we will have a new option: Humanoid Robots. We are rapidly approaching a moment where a business owner will no longer hire staff, but purchase them.


The concept of the Autonomous Business is moving from science fiction to a viable economic model. In the near future, an entrepreneur might sign a lease, purchase a "staff package" of five androids, and open for business without ever interviewing a single human applicant.


From Automation to Autonomy

To understand where we are going, we must distinguish between automation and autonomy.

Today, we have automation. A car wash is automated: machines do the work, but humans oversee, maintain, and intervene when things go wrong. A self-checkout kiosk is automated, but it requires a human attendant to swipe an ID or fix a scanning error.

The next phase is autonomy. This is where the machine handles the work, the troubleshooting, and the entire environment.

The missing link has always been hardware that can navigate a world built for humans. Highly specialized robots (like the huge robotic arms in car factories) are expensive and require structured environments. However, the new wave of humanoid robots, such as Tesla's Optimus, Figure AI's robots, and Boston Dynamics' Atlas, is designed to walk on two legs and manipulate objects with human-like dexterity. They don't need a factory built around them; they can walk into a standard kitchen, hold a standard broom, and operate a standard cash register.


The Economics of the Iron Collar Worker
(See what I did there? Not White Collar, Not Blue Collar. Iron Collar.)

Will business owners make the switch? The math might become undeniable.

Currently, labor is often the single highest cost for small-to-medium businesses. In the US, a minimum-wage employee might cost a business $30,000 to $45,000 annually once taxes and benefits are factored in, for roughly 40 hours of work a week.


Mass production should see the price of advanced AI-powered humanoid robots drop to perhaps $15,000, or more conservatively $20,000 to $30,000. Even at the higher figure, payment plans, robot loans, leases, and other financing options would spread that cost out. As a business expense, I think this makes the switch undeniable.


What is the Return On Investment (ROI) for a business owner? (A quick worked example follows below.)

Availability: 160 hours per week (battery charging and human oversight factored in).

Reliability: No sick days, no turnover, no theft, and perfect consistency.
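
To make the math concrete, here is a minimal back-of-the-envelope sketch in Python. The figures (a $25,000 robot amortized over five years, $3,000 per year in upkeep) are illustrative assumptions drawn from the ranges above, not real quotes:

# Back-of-the-envelope ROI sketch for an "iron collar" worker.
# All figures are illustrative assumptions from this article, not real quotes.

HUMAN_COST_PER_YEAR = 37_500   # midpoint of $30k-$45k (wages + taxes + benefits)
HUMAN_HOURS_PER_WEEK = 40

ROBOT_PRICE = 25_000           # midpoint of the assumed $20k-$30k range
ROBOT_LIFESPAN_YEARS = 5       # assumed useful life before replacement
ROBOT_UPKEEP_PER_YEAR = 3_000  # assumed maintenance, electricity, software
ROBOT_HOURS_PER_WEEK = 160     # 168-hour week minus charging and oversight

robot_cost_per_year = ROBOT_PRICE / ROBOT_LIFESPAN_YEARS + ROBOT_UPKEEP_PER_YEAR

human_cost_per_hour = HUMAN_COST_PER_YEAR / (HUMAN_HOURS_PER_WEEK * 52)
robot_cost_per_hour = robot_cost_per_year / (ROBOT_HOURS_PER_WEEK * 52)

print(f"Human: ${human_cost_per_hour:.2f}/hour of labor")
print(f"Robot: ${robot_cost_per_hour:.2f}/hour of labor")
print(f"Robot covers ~{ROBOT_HOURS_PER_WEEK / HUMAN_HOURS_PER_WEEK:.0f} human shifts per week")

Even with generous upkeep assumptions, the cost per labor-hour lands an order of magnitude below the human equivalent, which is the whole argument in one number.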


This means our world will have 24-hour stores and restaurants! I believe that this will serve the three different work shifts, or "cycles," of people: Morning, Afternoon, and Graveyard (overnight).


Las Vegas is famously a city of nightlife. We can expect more adult commerce and entertainment to be available around the clock as robot workers keep venues clean and safe: security bodyguard robots will handle safety, and janitor robots will keep the floors, and indeed all surfaces, clean and disinfected, promoting hygienic hang-outs and slowing the spread of illnesses like colds and the flu.


The Timeline: When Can You Buy Your Staff?

While you can’t buy a robot barista at Home Depot today, the roadmap is clearer than ever.

Phase 1: The Industrial Pilot (2025–2027)
We are here now. Robots are being deployed in "unstructured" but controlled environments. BMW and Mercedes-Benz are testing humanoids on assembly lines. Amazon is deploying Agility Robotics' "Digit" to move totes. At this stage, robots are expensive enterprise tools, not general staff.

Phase 2: The Hybrid Workforce (2028–2032)
As costs drop, early-adopter small businesses will introduce robots for "back of house" tasks. A restaurant owner might buy a robot solely for dishwashing and food prep, keeping humans for customer service. This is the era of "Cobots" (collaborative robots) working alongside people.

Phase 3: The Autonomous Turn-key (2035+)
This is the realization of the vision. A franchisee buys a "Store-in-a-Box." The package includes the real estate lease, the inventory, and four general-purpose robots to run the floor. The owner monitors the business from a laptop at home, stepping in only for high-level strategy or major hardware failure.


These time estimates are a balance between optimistic and conservative.


The Age of the Entrepreneur

This is why more people will become their own business owners. You essentially slash risk and raise your baseline of efficiency, removing one of the potentially money-losing aspects of running a business today. Importance shifts to location and filling market needs, and AI will even help us strategize and surface the best options!

In this new era, the skill set required to own a business changes drastically. The business owner only needs to know the basics of business and enough about the robot technology, gained through standard research.

The role shifts from Managing People to Managing Assets.

  • Instead of making weekly schedules, the owner manages software updates.

  • Instead of conducting performance reviews, the owner analyzes efficiency data.

  • Instead of hiring entry-level workers, the owner hires technicians (Technical Support).


This democratizes entrepreneurship for those who have capital.


The Human Question

Does this mean the end of human staff? Very Unlikely! Instead, this signals a bifurcation of the economy.


We will likely see a split between Commodity Services and Luxury Experiences.

Commodity: Fast food, convenience stores, gas stations, and warehousing will become fully autonomous to drive prices down and speed service up.

Luxury: Fine dining, boutique retail, and caregiving will retain human staff. In a world of robots, human interaction will become a premium product. A sign in a window reading "100% Human Staffed" will justify a 20% higher price point.


The technology is nearly there. The economics are inevitable. The remaining hurdles for the Autonomous Business are legal and social. Who is liable if a robot drops a hot coffee on a customer? How do we insure a robot staff member? And how will society react to local shops that contribute no wages to the local community?

Despite these questions, the trajectory is clear. The "Help Wanted" sign is about to become a relic of the past, replaced by a purchase order for the next generation of workers. The business owners of tomorrow won’t be looking for good help; they’ll be buying it.


This article is augmented by AI.

January 16, 2026

Why Today’s AI Can’t Reliably Explain “Why I Was Wrong”


Image: Kittipong Jirasukhanont via Alamy Stock

With today’s LLMs, “explaining why it was wrong” is usually a second, separate act of text generation - not a direct window into the real causes of the mistake.


Why We Still Can’t Make an LLM That Truly Explains Why It Was Wrong

A modern LLM is trained to produce the most likely next token given context, not to retrieve a ground-truth record of its internal causes. So, when you ask it to explain an error, it often generates a fluent, human-shaped justification that sounds right whether or not it matches what actually drove the output.

Large Language Models are a type of chatbot AI designed, through and through, to produce plausible answers.

When a human makes a mistake, we can often ask them why it happened and get something close to the truth:

  • “I didn’t read the question carefully.”

  • “I assumed X meant Y.”

  • “I forgot one constraint.”


That's different from a plausible narrative that merely resembles an explanation in English. When an LLM makes a mistake and we ask "why did you get that wrong?", we usually get something that sounds intelligent but may not be the real reason at all.

A key insight from interpretability researchers is that LLMs can produce "explanation-shaped text" without it being mechanically tied to the real decision process. Sarkar at Microsoft Research calls these "exoplanations": post-hoc explanations that are outputs like any other.

(Source: https://www.microsoft.com/en-us/research/wp-content/uploads/2024/05/sarkar_2024_llms_cannot_explain.pdf)

An LLM’s explanation is typically just another output that it generates because it’s statistically likely to look like a good explanation.
Not because the model actually accessed a faithful record of what caused the error.

This space between plausible explanation and faithful explanation is one of the biggest reasons LLM transparency in the beginning of 2026 is still mostly an illusion.


You must remember that LLMs were not built to retrieve causes (explanations).

They were built to generate text. They are masters of language (English, in this case). They are a success because they can communicate with us very well. But they can't explain why they did something wrong!


AI Research repeatedly finds that explanation-like artifacts can fail to track model causality:

  • In NLP, even widely used interpretability proxies (like attention) were shown to be unreliable as “explanations” of decision-making.


  • In studies of LLM "Chain-of-Thought" (CoT) reasoning, models have been shown to produce unfaithful step-by-step reasoning that does not reflect the real determinants of the answer, especially when nudged or biased toward a particular outcome.

According to OpenAI's recent work on "chain-of-thought monitorability," Chain-of-Thought may not be a reliable window into the AI's true process, even if it is helpful for AI research.

My core point: token probability can generate an explanation that is statistically plausible, not causally grounded.

An LLM produces tokens one at a time by predicting what’s most likely next, given the context.

So when you ask, “Why did you do that?”, the model doesn’t open a hidden “reason drawer” and pull out the real motive.

It produces an answer using the same machinery it uses for everything else:
predict-the-next-token.
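
Here is a toy sketch of that machinery in Python, with a made-up vocabulary and made-up scores. The point is that an "explanation" is sampled from the same next-token distribution as everything else; nothing in this loop consults a record of the real cause:

import math
import random

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def next_token(vocab, logits):
    """Sample the next token from the distribution - the model's only move."""
    probs = softmax(logits)
    return random.choices(vocab, weights=probs, k=1)[0]

# Toy vocabulary and made-up scores for the prompt "Why did you get that wrong?"
vocab  = ["I", "misread", "assumed", "forgot", "the", "question"]
logits = [2.1, 1.7, 1.5, 1.2, 0.4, 0.3]

# The "explanation" is assembled the same way as any other text:
# token by token, by likelihood, with no lookup of the real internal cause.
print(next_token(vocab, logits))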

In fact, a model's explanation is often very convincing, internally consistent, and aligned with what you, as a person, expect to hear. But it is not necessarily true.


Why “just make it smarter” doesn’t automatically fix it

It’s tempting to assume that if a model becomes strong enough, it will naturally “know what it did.” But there are two structural obstacles:

1. The model’s causes aren’t stored as a neat list of reasons

A transformer’s output arises from distributed activations across layers, attention heads, and representations. Even if the computation is deterministic, the “reason” is not natively encoded as a sentence. Mapping that distributed cause into faithful language is an extra problem.

Interpretability teams at Anthropic describe the basic state of affairs bluntly: we still don’t fully know how these systems work internally, and extracting human-meaningful mechanisms is an active research program.

2. The model has strong incentives to answer anyway

In many deployments, being helpful and confident is rewarded. If uncertainty or refusal is penalized, models learn to “fill in” with plausible completions. This same pressure applies when asked for self-critique: a confident-sounding explanation is often rated better than an honest “I don’t know what internal feature caused this.” (This is also why monitorability work emphasizes incentive effects.) 


Anthropic has documented cases where reasoning traces don’t reliably reflect what the model is “actually using,” which creates a serious problem for anyone trying to treat chain-of-thought as a safety monitor.

And a key insight from research is basically:

If reasoning text is optional for success, the model is free to confabulate it.

So if the model can produce the correct answer without showing its work, the “work” it shows might not be real work at all.


Why “probability of tokens” produces believable explanations (even when wrong)

LLMs are trained on massive text corpora filled with:

  • arguments

  • solutions

  • proofs

  • apologies

  • “here’s why I was wrong” moments

So they learn what human-like error explanation language looks like.

Humans know that they shouldn't simply memorize what an explanation for an error sounds like and then recite it in important conversations, often involving personal affairs or work that will affect the world. An LLM, however, has learned exactly that: the shape of the explanation, not the cause behind it.


LLMs don’t naturally store “the reason” in a readable form

Even if an LLM did behave as if it had internal reasons, those “reasons” are not stored as a neat symbolic structure like:

Mistake cause: assumption #3 failed due to missing information

 The reasons are distributed across billions of parameters and activations inside the AI.


 Meaning:

  • The “cause” may be an interaction between many tiny factors

  • It may not be representable as a short human sentence

  • It may not be stable (the same prompt can route through different internal patterns)

So when we ask for a reason, the model often replies with a compressed story that resembles a cause, even if it’s not the real one.


Another hard truth: models can hide their real process (even accidentally)

Once you introduce optimization pressures (fine-tuning, RLHF, tool-use, safety training), you can create situations where models learn:

  • “this style of reasoning is what evaluators like”

  • “this explanation avoids conflict”

  • “this looks careful and safe”

OpenAI and Anthropic have both investigated cases where a model’s reasoning trace can become unreliable for monitoring, especially when incentives are misaligned.

In extreme agentic setups, researchers have even shown examples where a model can produce misleading rationales in pursuit of a goal.

Even without “intent,” the effect looks the same to the user:

you get a clean explanation… that might not be the real reason.


So why can’t we just train it to be honest about mistakes?

Because “honest” is not a simple label.

To make an AI reliably explain why it was wrong, you need:

  1. A ground-truth definition of “why”

  2. A way to verify it

  3. A training signal that rewards faithfulness over plausibility

But in most tasks, we can verify the answer, not the internal cause.

So we end up in a trap:

  • The model learns to produce explanations that humans approve of

  • Not explanations that are mechanistically accurate

This issue shows up directly in research evaluating faithfulness of self-explanations and rationale methods.


What would it take to solve this?

If you want real “why I was wrong” explanations, you likely need architecture-level changes and/or instrumentation.


Let me say that again. If you want real "why I was wrong" explanations, you need architecture-level changes and/or instrumentation.


Some promising directions include:

1) Faithfulness-focused evaluation and training

Frameworks aimed at explicitly measuring and improving explanation faithfulness are emerging.

2) Mechanistic interpretability (actual internal tracing)

Instead of asking the model to describe its reasoning, you analyze the activations/circuits.

This is hard - but it’s closer to “real cause” than text-generated rationales.

3) Externalized decision logs (tool-assisted transparency)

If a model uses tools (retrieval, code execution, search), you can log the real steps externally, rather than trusting narrative. OpenAI’s work on chain-of-thought monitorability relates to this broader push.
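
A minimal sketch of the idea, with a hypothetical logger and a toy calculator tool standing in for real retrieval or code execution. The key property is that the harness, not the model, writes the trace, so the log of tool calls is faithful by construction:

import json
import time

# Hypothetical externalized decision log: the wrapper, not the model,
# records what actually happened at each step.
class ToolLogger:
    def __init__(self):
        self.trace = []

    def call(self, tool_name, fn, *args):
        result = fn(*args)
        # Logged by the harness, so this record cannot be confabulated.
        self.trace.append({
            "time": time.time(),
            "tool": tool_name,
            "args": args,
            "result": repr(result)[:200],
        })
        return result

logger = ToolLogger()
total = logger.call("calculator", lambda a, b: a + b, 17, 25)
print(total)                               # 42
print(json.dumps(logger.trace, indent=2))  # the faithful part of the story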

4) Counterfactual-based explanations

Asking: “What minimal change would flip your answer?” can sometimes be more faithful than asking for storytime. This idea appears across explanation faithfulness research. 
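
A toy sketch of counterfactual probing. Here answer() is a stand-in decision rule; a real probe would re-query the LLM with each perturbed input and record which minimal change flips the output:

# Toy counterfactual probe. `answer` stands in for a real model call.
def answer(features):
    # A stand-in decision rule; a real probe would query the LLM instead.
    return "approve" if features["score"] >= 0.5 and not features["flagged"] else "deny"

def minimal_flips(features, alternatives):
    """Try one-feature changes and report which ones flip the answer."""
    base = answer(features)
    flips = []
    for key, value in alternatives.items():
        changed = dict(features, **{key: value})
        if answer(changed) != base:
            flips.append((key, features[key], value))
    return base, flips

base, flips = minimal_flips(
    {"score": 0.6, "flagged": True},
    {"score": 0.4, "flagged": False},
)
print(base)   # deny
print(flips)  # [('flagged', True, False)] - the change that actually mattered

The flip tells you which input actually mattered, without asking the model to narrate anything.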


The conclusion: The model is not lying. It’s generating.

This is a very important sentence in this article:

LLMs don’t explain mistakes the way humans do, because they don’t have mistakes the way humans do.

They have statistical failure modes, search failures, context failures, and generalization gaps.

When asked “why,” they respond with the most likely kind of “why-answer” found in their training data.

That’s why we still can’t reliably build an LLM that:

  • identifies the true internal cause of its error

  • expresses it faithfully in language

  • and does so consistently under pressure

Because unless we redesign the system to produce verifiable, faithful traces, the model will keep doing what it does best:

generate plausible text.

December 30, 2025

The Mixture of Titans: Intelligent Model Routing


In the rapidly evolving landscape of artificial intelligence, large language models (LLMs) have become indispensable tools for everything from creative writing to complex problem-solving. Yet no single LLM excels at every task. As of late 2025, Anthropic's Claude series dominates in coding and structured reasoning (and is a contender for strong writing), OpenAI's ChatGPT (powered by GPT variants, with the recent GPT-4.5 known for impressive language generation and creative prose) leads in writing, while Google's Gemini stands out in deep reasoning across math, philosophy, and cosmology and is generally considered the best thinking model as of this writing. This specialization creates an opportunity: why settle for one model when you can harness the strengths of many?

Enter the "Mixture of Titans" - a proposed architecture that combines multiple powerhouse LLMs into a unified system, guided by an intelligent router. This router, itself an AI, analyzes user queries in real-time and dynamically selects the optimal model for the task at hand. By automating model selection, the Mixture of Titans promises superior performance, cost efficiency, and adaptability, mirroring concepts from Mixture of Experts (MoE) but applied at a system level across distinct, full-scale LLMs.


The Roots: From Mixture of Experts to System-Level Routing

The idea draws inspiration from Mixture of Experts (MoE), a longstanding technique in machine learning where multiple specialized sub-models ("experts") work on tasks. A gating or routing network decides which experts to activate for each input, enabling massive scale with efficient computation. Modern LLMs like Mixtral, DeepSeek-MoE, and even rumored components of GPT-4 incorporate MoE layers internally, allowing models with trillions of parameters to activate only a fraction per inference.

However, internal MoE is limited to experts within one model. The Mixture of Titans extends this externally: treating entire proprietary or open-source LLMs as "titans" (experts) and employing a dedicated router to direct queries. This "Mixture of Models" or system-level routing has gained traction in 2025, with frameworks like RouteLLM (from LMSYS), Martian, and open-source projects demonstrating cost savings of 20-97% while maintaining or exceeding single-model quality.

Real-world implementations, such as AWS multi-LLM routing and tools like OpenRouter, already aggregate models from multiple providers. The proposed Mixture of Titans builds on this by specializing routing for task domains, assuming strengths like:

  • Claude Opus/Sonnet: Best for coding, with top scores on benchmarks like SWE-Bench (often 70-77% success rates in 2025 evaluations).
  • (For this example) ChatGPT: Excelling in language writing, creative storytelling, and nuanced prose.
  • Gemini Pro: Leads in reasoning-heavy domains, topping leaderboards in math (e.g., AIME), philosophy, and cosmology with advanced chain-of-thought capabilities.


How the Mixture of Titans Works

At its core, the system features three components:

  1. The Titans (Expert LLMs): A curated ensemble of top models. In this proposal:

    • Claude for programming and technical tasks.
    • ChatGPT for writing, editing, and creative generation.
    • Gemini for philosophical debates, cosmological explanations, advanced math, and logical reasoning.

    These assumptions align with 2025 benchmarks: Claude consistently ranks highest for coding accuracy and explanation depth; ChatGPT for engaging, human-like writing; Gemini for multimodal reasoning and hard science.

  2. The Routing AI: A lightweight, fast model (e.g., a fine-tuned smaller LLM like Llama or a custom classifier) that classifies the query. Techniques include:

    • Semantic embedding comparison.
    • Keyword/intent analysis.
    • LLM-as-a-judge for difficulty estimation.
    • Trained on preference data (e.g., which model wins head-to-head on similar queries).

    Advanced routers, like those in RouteLLM, use matrix factorization or causal LLMs to predict the best model, achieving near-GPT-4 quality at half the cost.

  3. The Orchestrator: Handles query preprocessing, routing, post-processing (e.g., combining outputs if needed), and fallback mechanisms (e.g., escalate to a stronger model if confidence is low).

For example:

  • Query: "Write a Python script to simulate quantum entanglement." → Router detects coding task → Routes to Claude → Returns robust, well-commented code.
  • Query: "Craft a short story about a philosopher pondering the universe's origins." → Router identifies creative writing → Routes to ChatGPT → Delivers vivid, engaging narrative.
  • Query: "Explain the implications of the holographic principle in cosmology, with mathematical derivations." → Router flags deep reasoning/math → Routes to Gemini → Provides rigorous, step-by-step analysis.
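
To make the control flow concrete, here is a minimal router sketch in Python. The endpoint names are placeholders for the titans above, and keyword matching stands in for the embedding- or classifier-based routing that systems like RouteLLM actually use:

# Minimal Mixture-of-Titans router sketch. Endpoint names are placeholders;
# production routers (e.g. RouteLLM) use trained classifiers, not keywords.

TITANS = {
    "coding":    "claude",    # programming and technical tasks
    "writing":   "chatgpt",   # creative and editorial tasks
    "reasoning": "gemini",    # math, philosophy, hard science
}

INTENT_KEYWORDS = {
    "coding":    ["python", "script", "code", "debug", "function"],
    "writing":   ["story", "essay", "poem", "craft", "rewrite"],
    "reasoning": ["explain", "prove", "derivation", "cosmology", "why"],
}

def route(query):
    """Score each intent by keyword hits; fall back to the reasoning titan."""
    q = query.lower()
    scores = {intent: sum(kw in q for kw in kws)
              for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        best = "reasoning"          # fallback when no intent is detected
    return TITANS[best]

print(route("Write a Python script to simulate quantum entanglement"))  # claude
print(route("Craft a short story about a philosopher"))                 # chatgpt
print(route("Explain the holographic principle in cosmology"))          # gemini

In production, the router would also report a confidence score so the orchestrator can fall back to a stronger titan on ambiguous queries.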


Benefits of the Mixture of Titans

This architecture offers compelling advantages:

  • Superior Performance: By selecting the best-suited titan, overall output quality surpasses any single model. Benchmarks from routing systems show ensembles outperforming individual leaders on multi-task evaluations.
  • Cost Efficiency: Route simple queries to cheaper models or APIs. In 2025, routing can reduce expenses by a large percentage, as weaker models handle routine tasks while titans tackle complex ones.
  • Scalability and Flexibility: Easily add/remove titans (e.g., incorporate Grok for real-time data or DeepSeek for math specialization). Supports hybrid open-source/proprietary setups.
  • Reduced Bias and Improved Robustness: Diverse models mitigate individual weaknesses or biases.
  • User Experience: Seamless interface: one entry point, optimal results without manual switching.

Challenges include routing accuracy (misroutes degrade quality), latency from classification, and API management. Solutions like cached embeddings and parallel evaluation mitigate these.

Real-World Parallels and Implementations

The concept isn't hypothetical. In 2025:

  • RouteLLM: Open-source framework for training routers, outperforming commercial alternatives on benchmarks like MT-Bench.
  • Martian and Unify AI: Commercial routers dynamically selecting models for optimal cost/performance.
  • OpenRouter: Unified API aggregating dozens of LLMs with intelligent fallback.
  • Academic work (e.g., TensorOpera Router) explores embedding-based routing across providers.

A system that calls on the best LLMs at any given time should, in theory, yield even greater gains.


Is a Mixture of Titans inevitable? 

The reason we have not left this era of thinking is that the Transformer-based LLM architecture has not yet evolved into the next great AI. Reinforcement-learning-trained CoT reasoning is here, and we have seen AI labs scale that specific training to yield results in benchmarks and in the real world. Current thinking is that going back to the initial pre-training data and improving that first pre-training step will yield a better AI, with xAI notably taking this approach from Grok 4 (currently Grok 4.1 Beta) toward Grok 5. LLMs are proliferating, with over 141,000 open-source models on Hugging Face alone.

Intelligent routing is solid on paper, but we do not see competing AI companies working together to combine their AIs into one system. At least, not for the public. The Mixture of Titans envisions a future where users interact with an AI team conducted by a smart router that utilizes the best performers. Indeed, this is a sort of "swarm," or "many AIs working together." Until the paradigm of AI architecture changes, we will continue to have the coding expert (Anthropic's Claude), the reasoning expert (Gemini 3 Pro), etcetera.

One caveat: if a platform continues to get more sophisticated and an AI lab creates its own internal MoT (Mixture of Titans), we may see a specific lab finally become the best at what we would call everything. That would be an incredibly dominant position. We have seen LLMs rank number one in benchmarks, and we have seen LLMs excel in multiple areas, but we have not really seen one AI be the best at everything. With this in mind: the smartest system won't be the biggest single titan, but the one that knows when to call upon each.

November 26, 2025

Switching Weights on an AI LLM:


A Transformer-based AI Large Language Model has one set of frozen weights for its neural network. The following are systems that switch weights:


1. Mixture of Experts (MoE)

This is the most famous implementation and is likely how GPT-4 and Mixtral (a popular open-source model) work.

  • The Concept: Instead of one giant neural network where every neuron is used for every word, the model is broken into many smaller "expert" sub-networks.

  • How it works: A "gating network" looks at the input (e.g., the word "python") and decides which experts to activate. It might route that word to a "coding expert" set of weights and a "logic expert" set of weights, while ignoring the "creative writing" weights.

  • Why it's used: It allows models to have trillions of parameters (weights) but only use a small fraction of them for any single token. This makes them smarter but much faster and cheaper to run.
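
A toy sketch of that gating mechanism, using NumPy and random toy weights. Real MoE layers learn the gate during training and route among many more experts, but the top-k selection logic is the same:

import numpy as np

# Toy Mixture-of-Experts gating: the gate scores each expert for a token
# embedding, and only the top-k experts actually run.
rng = np.random.default_rng(0)

N_EXPERTS, DIM, TOP_K = 8, 16, 2
gate_weights = rng.normal(size=(DIM, N_EXPERTS))       # learned in practice
experts = [rng.normal(size=(DIM, DIM)) for _ in range(N_EXPERTS)]

def moe_forward(token_embedding):
    scores = token_embedding @ gate_weights            # gate scores per expert
    top = np.argsort(scores)[-TOP_K:]                  # pick the top-k experts
    weights = np.exp(scores[top]) / np.exp(scores[top]).sum()  # softmax over k
    # Only the chosen experts compute; the other six are skipped entirely.
    return sum(w * (token_embedding @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=DIM)
out = moe_forward(token)
print(out.shape)  # (16,) - same output shape, but only 2 of 8 experts did work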


2. LoRA and Adapters (Task-Specific Weight Swapping)

This approach is widely used in the open-source community to customize models without retraining them.

  • The Concept: Imagine you have a frozen base model. You can attach small, separate "adapter" modules—tiny sets of weights—that are trained for specific purposes.

  • How it works:

    • LoRA (Low-Rank Adaptation): You freeze the massive main network. If you want the model to write like Shakespeare, you load a tiny "Shakespeare" file (maybe 100MB) that sits on top of the main model.

    • Hot-Swapping: You can literally swap these adapters in and out instantly. In a single system, one user could be using the "Medical Diagnosis" weights while another user is using the "Fantasy RPG" weights, both sharing the same frozen base brain.
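
A sketch of the LoRA arithmetic with toy dimensions: the frozen matrix W never changes, and each "personality" is just a small low-rank pair (A, B) added on top, which is exactly why hot-swapping is instant:

import numpy as np

rng = np.random.default_rng(1)

DIM, RANK = 512, 8                      # rank 8 vs. a full 512x512 matrix
W = rng.normal(size=(DIM, DIM))         # frozen base weights (never change)

def make_adapter():
    """A LoRA adapter: two tiny matrices whose product is a weight delta."""
    A = rng.normal(size=(DIM, RANK)) * 0.01
    B = rng.normal(size=(RANK, DIM)) * 0.01
    return A, B

adapters = {"shakespeare": make_adapter(), "medical": make_adapter()}

def forward(x, adapter_name):
    A, B = adapters[adapter_name]       # hot-swap: just pick a different pair
    return x @ (W + A @ B)              # frozen base plus low-rank delta

x = rng.normal(size=DIM)
print(forward(x, "shakespeare")[:3])    # same base brain...
print(forward(x, "medical")[:3])        # ...different personality

# Storage: the full matrix has 512*512 = 262,144 weights; each adapter only
# 2 * 512 * 8 = 8,192 - which is why adapter files are so small.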


3. Hypernetworks or HyperNets (The "Network that Writes Networks")

Hypernetworks (or hypernets) are neural networks that produce the weights for another neural network, known as the "target network."

  • The Concept: You have two neural networks. Network A (the Hypernetwork) takes an input and outputs the weights for Network B.

  • How it works: Network B doesn't actually exist until Network A creates it. If you show Network A a picture of a cat, it might generate a set of weights for Network B that are perfectly tuned to detect cats. If you show it a dog, it rewrites Network B to detect dogs.

  • Current State: This is computationally expensive and tricky to train, so it's not yet standard in large LLMs, but it is used in image generation and smaller experimental models.
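
A toy sketch of the two-network setup, assuming tiny dimensions: Network A's output literally is Network B's weight matrix, so B is rewritten for every conditioning input:

import numpy as np

rng = np.random.default_rng(2)

COND_DIM, IN_DIM, OUT_DIM = 4, 3, 2
N_TARGET_WEIGHTS = IN_DIM * OUT_DIM            # size of Network B's weights

# Network A (the hypernetwork): its output IS Network B's weight matrix.
A_weights = rng.normal(size=(COND_DIM, N_TARGET_WEIGHTS))

def hypernetwork(condition):
    """Generate the target network's weights from a conditioning vector."""
    flat = condition @ A_weights
    return flat.reshape(IN_DIM, OUT_DIM)       # Network B, freshly written

def target_network(x, condition):
    B_weights = hypernetwork(condition)        # B doesn't exist until now
    return x @ B_weights

x = rng.normal(size=IN_DIM)
print(target_network(x, np.array([1.0, 0, 0, 0])))  # "cat-tuned" Network B
print(target_network(x, np.array([0, 1.0, 0, 0])))  # "dog-tuned" Network B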


4. Fast Weights (Short-Term Memory)

This is an idea championed by AI pioneers like Geoffrey Hinton and Jürgen Schmidhuber.

  • The Concept: Standard weights represent "long-term memory" (what the model learned during training). "Fast weights" are temporary weights that change rapidly during a conversation to store "short-term memory."

  • The connection to Transformers: Modern Transformers (the architecture behind LLMs) actually use a mechanism called Attention that behaves mathematically very similarly to fast weights. When the model looks at a sentence, it dynamically calculates "attention scores" (temporary weights) that determine how much one word relates to another. In a sense, the model is re-wiring itself for every single sentence it reads.
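
A toy sketch of those per-sentence attention scores, using random stand-in embeddings: the score matrix is computed fresh for each input and discarded afterward, while the long-term weights stay frozen:

import numpy as np

rng = np.random.default_rng(3)

tokens = ["the", "robot", "dropped", "its", "broom"]
DIM = 8
embeddings = rng.normal(size=(len(tokens), DIM))   # toy token vectors

# Attention scores act like temporary "fast weights": recomputed per input.
scores = embeddings @ embeddings.T                 # token-to-token affinities
scores = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # row softmax

# How strongly "its" attends to every token in this sentence:
i = tokens.index("its")
for token, weight in sorted(zip(tokens, scores[i]), key=lambda p: -p[1]):
    print(f"{token:>8}: {weight:.2f}")
# These weights vanish after this sentence; the frozen weights never changed.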


Summarized:

We are moving away from "monolithic" frozen digital brains toward modular, dynamic systems.

  • MoE switches weights per token.

  • Adapters switch weights per task.

  • Hypernetworks generate weights per input.


This is why the most popular term for the current era is "The MoE Era."


Generated by Gemini 3 Pro


November 24, 2025

AI Sustainable Energy Demands Future

I set out to ask the question: Will AI Energy Demands Be Made Sustainable in the future?

The Energy Demands of Modern AI

Generative AI today is built around the concept of the "scaling law," where bigger models trained on vast datasets yield superior results. Now that we have multiple trillion-parameter LLMs running on incredibly powerful AI server hardware, energy has become a natural pain point, and AI scientists are exploring ways to increase efficiency in significant ways.

As the AI industry reaches a crossroads, it will face a critical choice: continue a brute-force expansion that requires reviving nuclear power, or fundamentally redesign the architecture of intelligence to mimic the efficiency of the biological brain.

The Brute Force Solution: The Nuclear Renaissance

Faced with these projections, technology giants are seeking reliable, carbon-free baseload power to guarantee 24/7 uptime. While the International Data Corporation (IDC) advises focusing on renewables like solar and wind for their low levelized costs, the intermittency of weather-dependent energy has led the industry toward a controversial partner: nuclear power.

Major partnerships have recently emerged:

  • Microsoft & Constellation Energy: A 20-year deal to restart the 837 MW Unit 1 reactor at Three Mile Island, providing enough power for 800,000 homes.
  • Amazon & Talen Energy: A secured commitment of 960 MW from the Susquehanna nuclear plant in Pennsylvania.

Proponents argue that nuclear power offers the only viable zero-emission solution for the constant demands of AI. Industry analysts suggest U.S. nuclear capacity could triple from 100 GW to 300 GW by 2050 to meet this need. However, this approach faces significant hurdles, including steep construction costs, lengthy permitting timelines, and public safety concerns rooted in historical incidents.

The Architectural Solution: Brain-Inspired Efficiency

While infrastructure expands, researchers are attacking the problem at its source: the inefficiency of the neural networks themselves. Traditional "Transformer" models process information continuously—like leaving every light in a building on—and suffer from quadratic computational costs that balloon as input data grows.

To solve this, scientists are turning to Spiking Neural Networks (SNNs). Unlike standard models, SNNs mimic biological neurons by communicating through discrete "spikes" only when necessary, rather than continuous signals.

Introducing SpikingBrain

In September 2025, researchers from the Chinese Academy of Sciences unveiled SpikingBrain, a family of large-scale, brain-inspired language models that demonstrate how AI can grow in capability while shrinking its carbon footprint. The project introduces several technical breakthroughs:

  • Hybrid Linear Attention: Standard Transformers struggle with "quadratic self-attention." SpikingBrain replaces this with linear and sliding-window attention mechanisms. By adapting pre-trained Transformer weights into sparse matrices, the team reduced training and inference costs to under 2% of the cost of training from scratch.
  • Mixture-of-Experts (MoE): The architecture activates only the necessary "experts" for a given task, engaging just 15% of parameters per token.
  • Adaptive Threshold Spiking: A core innovation where neurons adjust their firing thresholds based on membrane potential, converting floating-point values into efficient integer spike counts.

The Efficiency Gains

The results of the SpikingBrain initiative suggest a path toward sustainable high-performance AI:

  • Extreme Sparsity: The model achieves 69.15% sparsity, meaning over two-thirds of activations are zeroed out, requiring no computation.
  • Energy Plummet: By combining spiking computation with INT8 quantization, energy consumption per operation drops to 0.034 picojoules. This represents a 97.7% reduction compared to standard floating-point operations.
  • Speed: The 7-billion parameter model (SpikingBrain-7B) maintains constant memory usage and achieves a 100x faster Time to First Token for massive 4-million-token inputs.

The attention redesign blends local focus with low-rank global views. Releases include SpikingBrain-7B (a 7-billion-parameter linear model) and SpikingBrain-76B-A12B (a 76-billion-parameter hybrid with MoE). Both match Transformer benchmarks after pre-training on only 150 billion tokens.

Adaptive Spiking and Coding Methods

A core feature is the adaptive-threshold spiking neuron, turning floating-point values into integer spike counts. The threshold adjusts based on membrane potential averages to avoid extremes. Training converts activations to spikes in one pass for GPU efficiency, while inference expands them into sparse trains for event-based processing. The team tested binary, ternary, and bitwise coding to balance sparsity and detail.
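
A simplified sketch of that conversion, with random stand-in activations. The threshold here is tied to the mean activation magnitude (a crude stand-in for the paper's membrane-potential averaging), which is enough to show how most activations round down to zero spikes:

import numpy as np

rng = np.random.default_rng(4)

# Toy activations standing in for a layer's floating-point outputs.
activations = rng.normal(loc=0.0, scale=1.0, size=10_000)

# Adaptive threshold: tied to the mean magnitude so it tracks the
# scale of the layer's activity (a simplification of the paper's method).
threshold = np.abs(activations).mean()

# Convert floats to integer spike counts: how many thresholds each crosses.
spike_counts = np.floor(np.abs(activations) / threshold).astype(int)

sparsity = (spike_counts == 0).mean()
print(f"threshold:  {threshold:.3f}")
print(f"sparsity:   {sparsity:.1%} of activations produce zero spikes")
print(f"max spikes: {spike_counts.max()} (integers are cheap to process)")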

When linked with asynchronous hardware, SpikingBrain delivers impressive efficiencies:

  • Sparsity gains: Achieving 69.15% sparsity, over two-thirds of activations are zeroed out, slashing computations.
  • Stable memory: SpikingBrain-7B maintains constant memory in inference, with 100x faster Time to First Token for 4M-token inputs.
  • Event-based savings: With spiking and INT8 quantization, energy per multiply-accumulate drops to 0.034 pJ—97.7% less than FP16 and 85.2% less than standard INT8.
  • Hardware flexibility: Trained on hundreds of MetaX C550 GPUs at 23.4% FLOPs utilization, including tools for non-NVIDIA setups.

This demonstrates that brain-mimicking designs can curb LLM energy use without performance hits. Layering MoE sparsity with spiking at the neuron level creates multi-level efficiency, suiting neuromorphic chips for async, low-power operation.

Wider Ramifications

SpikingBrain builds on prior efficient LLM efforts but stands out for its size and non-NVIDIA compatibility. The report maps a path for neuromorphic hardware and edge deployments in areas like manufacturing or smartphones.

I won't go into traditional methods of making current AI more efficient, but there have been some revolutionary ideas put into practice, beyond standard quantization and distillation, that seek to maintain quality while yielding efficiency gains. Personal opinion from alby13: the efficiency era we are in is about making current LLM technology more efficient, a notable example being diffusion-based LLMs that process and output text with more efficiency and speed. This article focuses on the more neuromorphic scientific evolution of mimicking the human brain for efficiency. No doubt other areas will use any idea that can be gained from the human body, for things like memory, to produce the best results.

End Notes:

The solution to the AI energy crisis is not singular. Hardware and power solutions will need to be found, and AI will continue to evolve, or be revolutionized, to produce ultimate efficiency, even if those novel new models and platforms serve specific needs. The research is being done, and it seems that not much of it will be wasted if it can be put into the world as a product or service.



SpikingBrain Paper: https://arxiv.org/pdf/2509.05276


Article is written by AI with Human Oversight. Please check facts.

August 21, 2025

Beyond Objects in 3D: The Next Leap for AI in 2025 and Beyond

 


In 2024, I talked about AI models in the form of LLMs and unique 3D AI Apps having the ability to create 3D objects. Objects are great, but AI must advance. So what's next? Read on!

In 2024 and 2025, the world experienced new AI capabilities in the form of Large Language Models (LLMs) that create 3D content and innovative 3D AI applications capable of generating fully realized 3D objects. These systems marked a major technological milestone: machines that could not only understand human language but also create, in 3D space, tangible digital objects, parts, and characters. Yet, as transformative as this has been, it is only the beginning. The real question now is: what comes next?

Moving Beyond Creation to Interaction and Autonomy

Which way, western man? What we have seen is that creators and builders of AI are recognizing the unique job of having AI craft 3D, and in that there is a fundamental commonality: internal skeletons for 3D characters. Avatars, creatures, and human shapes require bones for games and simulations to enable function... or do they? THEY DON'T! Now, shockingly, surprisingly, a new path has emerged, because AI can now, mostly correctly and properly, figure out how things are supposed to animate, all without a skeleton! That is a revolutionary concept, but it doesn't mean that skeleton rigs are going away, so there are two paths!

The ability to create 3D is powerful, but creation alone is limited without context, purpose, and interaction. The next leap in AI will involve systems that don’t just generate items and people, but understand and operate within dynamic environments. Instead of merely producing a 3D model of a chair, the AI of tomorrow will:

  • Design environments where that chair exists, considering ergonomics, lighting, and user behavior.
  • Adapt in real time to changing conditions, learning from feedback without retraining.

This shift transforms AI from a tool into a crafter, capable of co-creating and co-connecting inside of complex digital and physical worlds.

Indeed, 3D scenes, rooms, buildings, stages, and mini-worlds are coming. Not AI-generated video-like 3D, but crafted and created 3D worlds that are used in virtual reality sims and games. The utility of this is far too great to be missed; it is an important cerebral step for AI to take as it takes on bigger, more complex challenges.

It sounds simple, doesn't it? Create a world. Create an interrogation room for a role-play video game. Create a science lab where the user can do fun experiments. Let's take that example. 3D objects begin to gain complexity because they can have properties, characteristics, and even *scripting*. Let's say you have a glass vial in the 3D lab with a green liquid. Modeling stationary green fluid as a shape inside the glass might be simple enough, but what if you want physics? A physics engine can support that, but the object has to tell the engine what that liquid is. Let's say that when you tilt the vial over, pouring green liquid is generated by a script coded for that behavior. The AI could potentially incorporate that.

I'd like to mention one quick example: let's say an AI models scissors as a 3D object. Our human expectation is that there is a screw or bolt holding the two scissor pieces together, and that in the physics world, when the handles are held and spread, the pieces should rotate on the axis of that bolt or screw. That's a more complex ask than modeling 3D scissors as one part, or even two or three parts. There is an actual physics trait there!
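
Here is a sketch of what an AI-generated object spec carrying such traits might look like. The field names and script hook are invented for illustration and do not correspond to any real engine's API:

# Hypothetical AI-generated object specs; names are invented for illustration
# and do not correspond to any real engine's API.

scissors = {
    "parts": ["blade_left", "blade_right"],
    "constraints": [{
        "type": "hinge",                     # the screw/bolt expectation
        "bodies": ["blade_left", "blade_right"],
        "axis": (0, 0, 1),                   # rotate around the pivot screw
        "limits_degrees": (0, 60),           # how far the handles can spread
    }],
}

vial = {
    "parts": ["glass_body", "green_liquid"],
    "properties": {"green_liquid": {"fluid": True, "viscosity": 0.9}},
    "scripts": [{
        "trigger": "tilt_past_degrees",      # fires when the vial tips over
        "threshold": 80,
        "action": "spawn_fluid_particles",   # engine pours the green liquid
    }],
}

print(scissors["constraints"][0]["type"])    # "hinge" - the physics trait
print(vial["scripts"][0]["action"])          # behavior attached to the model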

3D Advancement for AI

Last year gave us objects - beautiful, functional, and impressive creations. The future, however, will give us complexities of intelligence: creative, adaptive, and purpose-driven AI creations that work for our needs. This next chapter is going to show how AI agents and platforms gain agentic intelligence, 3D modeling intelligence, and related systems intelligence.

First AI shapes our 3D virtual reality, then AI shapes our physical reality.



Article augmented by AI.


July 24, 2025

Growing Human Livers: Current Progress (2025)

Growing Human Livers: Current Progress (Mid-2025) and Future Potential

Scientists are making remarkable progress toward growing functional human livers using bioengineering techniques that share similarities with cloning approaches. While we're not quite at full-scale liver production yet, the field has achieved several groundbreaking milestones that suggest this goal is achievable.



Current Achievements

Miniature Functional Livers

Researchers have successfully created miniature human livers that function like natural organs. Teams at Wake Forest University have engineered livers about an inch in diameter that weigh 0.2 ounces, demonstrating that human liver cells can be used to generate functioning liver tissue. These mini-livers secrete bile acids and urea just like normal livers.

Japanese scientists have made particularly impressive advances by creating 4-millimeter "liver buds" from human stem cells that, when transplanted into mice, work in conjunction with the animals' organs and produce human liver-specific proteins. This represents the first time people have made a solid organ using pluripotent stem cells.


Multiple Bioengineering Approaches

Scientists are pursuing several promising methods:

  • Decellularization: Researchers take animal livers, remove all cells with mild detergent, leaving only the collagen "skeleton," then repopulate it with human liver cells
  • Stem cell conversion: Converting human skin cells into stem cells, then coaxing them to become liver cells
  • 3D bioprinting: Using advanced printing techniques to create liver scaffolds
  • Organoid development: Growing "mini-organs" from stem cells that can repair damaged liver tissue


Breakthrough Human Trials

The field has reached a significant milestone with the first human trial beginning in 2024. A volunteer with severe liver disease received an experimental treatment designed to grow a second "mini liver" in their lymph node. This approach injects healthy liver cells into lymph nodes, where they can develop into functional liver tissue while some cells migrate to help regenerate the existing damaged liver.



Current Limitations and Challenges

Scale Requirements

While current mini-livers are functional, they need significant scaling up. An adult human liver weighs about 4.4 pounds, but researchers estimate that an engineered liver would need to weigh about one pound to sustain human life, since livers functioning at 30% capacity can support the body.


Technical Hurdles

Key challenges that researchers are actively addressing include:

  • Cell production: Learning to grow billions of liver cells simultaneously
  • Vascularization: Creating proper blood vessel networks within the engineered tissue
  • Bile duct construction: Developing fully functional bile drainage systems
  • Long-term functionality: Ensuring engineered livers maintain function over time


Future Timeline and Prospects

The research suggests that patient-specific liver substitutes are achievable through continued optimization and integration of induced pluripotent stem cells. However, scientists emphasize they're still at an early stage, with many technical hurdles requiring resolution before patient treatment becomes routine.

Bioengineered liver tissues currently need "additional rounds of molecular fine tuning before they can be tested in clinical trials", but the rapid advancement in recent years suggests this technology could become clinically viable within the next decade.


Beyond Transplantation

Engineered livers offer additional benefits beyond treating liver disease. They provide platforms for drug safety testing that more closely mimic human liver metabolism compared to animal models, and can serve as disease models for research purposes.

The field of liver bioengineering is advancing rapidly, with multiple successful approaches demonstrating that growing functional human livers is not just theoretically possible but actively being achieved in laboratories worldwide. While full-scale clinical implementation still requires overcoming significant technical challenges, the foundation has been established for what could become a revolutionary treatment for liver disease. 



Created with Perplexity


Sources:

The Conversation - How to grow human mini-livers in the lab to help solve liver disease
https://theconversation.com/how-to-grow-human-mini-livers-in-the-lab-to-help-solve-liver-disease-121297

Wake Forest University School of Medicine - Human Liver
https://school.wakehealth.edu/research/institutes-and-centers/wake-forest-institute-for-regenerative-medicine/research/replacement-organs-and-tissue/human-liver


New Atlas - Researchers grow laboratory-engineered miniature human livers
https://newatlas.com/bioengineered-miniature-human-livers/16790/

UPMC - Lab-Grown Miniature Human Livers Transplanted into Rats
https://www.upmc.com/media/news/052820-lab-grown-miniature-human-livers

CBS News - Researchers create miniature human liver out of stem cells
https://www.cbsnews.com/news/researchers-create-miniature-human-liver-out-of-stem-cells/

National Library of Medicine: Liver Bioengineering: Promise, Pitfalls, and Hurdles to Overcome
https://pubmed.ncbi.nlm.nih.gov/31289714/

University of Cambridge - Lab-grown ‘mini-bile ducts’ used to repair human livers in regenerative medicine first
https://www.cam.ac.uk/research/news/lab-grown-mini-bile-ducts-used-to-repair-human-livers-in-regenerative-medicine-first

MIT Technology Review - This company is about to grow new organs in a person for the first time
https://www.technologyreview.com/2022/08/25/1058652/grow-new-organs/

Springer Nature - ‘Mini liver’ will grow in person’s own lymph node in bold new trial
https://www.nature.com/articles/d41586-024-00975-z

Articles are augmented by AI.