Tiny AI on Phones Reshapes Real-World App Energy Use
On-device AI shifts some cloud work to local compute, reducing latency for certain tasks while increasing energy use for others. The overall effect isn’t a simple battery win or loss: energy costs migrate among data transfer, memory traffic, and compute depending on the task, hardware, and reuse patterns. Real-world apps now behave differently as they trade cloud reliance for local inference, shaping both user experience and battery life.
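The trade-off the piece describes can be made concrete with a back-of-envelope comparison. All constants below are illustrative assumptions (not measurements), and the function names are hypothetical; the point is only that the cheaper option flips as payload size and compute cost change.

```python
# Back-of-envelope device-side energy: cloud round trip vs. local inference.
# Every constant here is an assumption chosen for illustration.

J_PER_MB_RADIO = 0.5         # assumed radio energy to move 1 MB (joules)
J_PER_LOCAL_INFERENCE = 0.3  # assumed NPU energy per local inference (joules)
REQUEST_MB = 0.2             # assumed upload payload per cloud request (MB)
RESPONSE_MB = 0.05           # assumed download payload per cloud response (MB)

def cloud_energy(n_requests: int) -> float:
    """Device-side energy for n cloud round trips (transfer only)."""
    return n_requests * (REQUEST_MB + RESPONSE_MB) * J_PER_MB_RADIO

def local_energy(n_inferences: int) -> float:
    """Device-side energy for n local inferences (compute only)."""
    return n_inferences * J_PER_LOCAL_INFERENCE
```

Under these particular assumptions a cloud call costs 0.125 J of radio energy against 0.3 J of local compute, so offloading wins; double the payload or halve the NPU cost and the conclusion reverses, which is exactly why there is no simple battery win or loss.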
The Energy Paradox of Edge AI
Edge AI promises privacy by keeping data on the device, but it hides a growing energy cost. Continuous on-device inference and cooling turn billions of gadgets into near-constant heat sources, shifting power demand from cloud data centers to living rooms, offices, and pockets. This piece weighs privacy gains against the energy bill and argues for a smarter balance between what we keep private and how much power we burn to keep it private.
The Quiet Logic of Cache Eviction
Cache eviction operates under the hood, yet its decisions shape latency and energy far more than clock speed. Tiny cache-line replacements ripple through L1, L2, and DRAM, altering power use and the feel of everyday software, from buttery scrolling on phones to jittery tail latency on busy servers. Recognizing eviction as a first-class system constraint expands what we count as performance in modern CPUs, making it something to design for rather than a mystery to explain.