Pockot — Founding Note

Founding Note: A Useful Local Mind Has a Power Budget

Published May 2026

The Gap

Cloud AI begins with an assumption: networks, power, accounts, and datacenters keep working. Pockot starts by removing that assumption.

The question is not whether a pocket device can imitate a large datacenter model. It cannot. The question is what useful local intelligence remains when the device has to live inside a real envelope: accelerator throughput, sustained watts, battery watt-hours, memory, storage, model size, thermal limits, and an offline corpus.

The floor is moving. Apple states that the M4 Neural Engine can deliver up to 38 trillion operations per second. Source: Apple M4. Raspberry Pi's AI HAT+ product brief lists accelerator variants at 13 and 26 TOPS. Source: Raspberry Pi AI HAT+ product brief. Qualcomm's Snapdragon X Elite page says the platform can run generative AI models over 13B parameters on-device. Source: Qualcomm Snapdragon X Elite.

Those numbers are useful, but they are not enough. A TOPS figure does not say how long the system runs, how much memory is available to the model, whether the software runtime is mature, how much storage remains for offline knowledge, or what happens once thermal throttling sets in.

Pockot exists to make those constraints visible.

Why This Matters Now

Small models now have an official on-device path. Meta's Llama 3.2 release includes lightweight text-only 1B and 3B models for select edge and mobile devices, and states a 128K-token context length for the 1B and 3B models. Source: Meta Llama 3.2.

That changes the problem. A local device no longer has to be judged only by whether it can run the largest model. It can be judged by task fit. Can it summarize local documents? Search a stored corpus? Help with field notes? Explain a repair manual? Maintain a private log? Draft code against local references? Translate without a network? These are narrower than cloud-scale generality, but narrower does not mean useless.

The power boundary is equally concrete. The FAA page for airline passengers and batteries treats 100 Wh as the standard boundary for spare lithium-ion batteries and power banks, with larger batteries needing airline approval under stated limits. Source: FAA batteries. Pockot uses 100 Wh as a practical reference point, not as a product claim. At 10 sustained watts, a simple estimate gives 100 Wh / 10 W = 10 hours of runtime, before accounting for conversion losses, thermal effects, and workload variation. That is a calculation, not a promise.
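The arithmetic above can be written as a small function. This is a minimal sketch, not Pockot's calculator: the function name and the `efficiency` derating factor are assumptions introduced here to show where conversion losses would enter, and the 0.85 value in the second call is purely illustrative.

```python
def runtime_hours(battery_wh: float, sustained_watts: float,
                  efficiency: float = 1.0) -> float:
    """Estimate runtime in hours from battery capacity and sustained draw.

    `efficiency` is a hypothetical derating factor in (0, 1] standing in
    for conversion losses; the default of 1.0 gives the ideal figure.
    """
    if battery_wh <= 0 or sustained_watts <= 0:
        raise ValueError("battery_wh and sustained_watts must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return battery_wh * efficiency / sustained_watts


# Ideal case from the text: 100 Wh at 10 sustained watts.
print(runtime_hours(100, 10))  # 10.0

# Same battery with an assumed (illustrative) 85% conversion efficiency.
print(runtime_hours(100, 10, efficiency=0.85))
```

Thermal effects and workload variation are not modeled here. They would have to come from measurement, not from a single constant.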

The new work is not to write a survival story. It is to measure what a device can keep doing when cloud dependency is removed.

That measurement has to stay local. A benchmark run in a chilled lab with wall power is useful, but it is not the same as a small enclosure that has to run on a battery, store its own references, and recover cleanly after power loss. A model that answers one prompt quickly and then overheats is a different object from a tool that can work slowly for hours. Pockot will keep those cases separate.

What Pockot Will Study

Pockot starts with six questions.

First: what is the smallest useful model for specific offline tasks? A 1B model, a 3B model, and a 13B model are not interchangeable. The answer depends on task, quantization, context length, latency, and memory.
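One reason these sizes are not interchangeable is weight memory alone. The sketch below uses the common rule of thumb that weights take roughly parameters times bits per weight divided by 8 bytes. The function name is hypothetical, and the estimate deliberately ignores KV cache, activations, and runtime overhead, so real memory use will be higher.

```python
def weights_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes).

    params_billions * 1e9 weights * (bits / 8) bytes, divided by 1e9.
    Weights only: KV cache, activations, and runtime overhead are excluded.
    """
    return params_billions * bits_per_weight / 8


# Model sizes named in the text, at three common weight precisions.
for params in (1, 3, 13):
    for bits in (16, 8, 4):
        print(f"{params}B @ {bits}-bit: ~{weights_gb(params, bits):.1f} GB")
```

By this estimate a 13B model at 4 bits needs about 6.5 GB for weights, while a 1B model at the same precision needs about 0.5 GB, which is why the choice of model cannot be separated from the device's memory.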

Second: what sustained power draw is realistic? Peak accelerator numbers are not sustained system numbers. The device has CPU, memory, storage, display, radios, sensors, and thermal overhead.

Third: how much local knowledge is enough? Offline usefulness depends on the corpus. A device that can run a model but cannot store the relevant manuals, notes, maps, or references is not resilient.

Fourth: which runtimes are measurable? Pockot will prefer reproducible measurements over vague compatibility claims. A useful entry should name the chip, model, quantization, tokens per second if measured, sustained watts if measured, and runtime conditions.
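An entry of that kind could be represented as a small record. This is a sketch under assumptions: the class name, field names, and row layout are hypothetical, and the example values are placeholders, not measurements. The point it illustrates is that unmeasured values stay explicitly unmeasured instead of being filled in.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class MeasurementEntry:
    chip: str
    model: str
    quantization: str
    conditions: str                            # e.g. ambient temperature, power source
    tokens_per_second: Optional[float] = None  # None means "not measured"
    sustained_watts: Optional[float] = None    # None means "not measured"

    def unmeasured(self) -> list[str]:
        """Names of fields that have no recorded value."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

    def to_row(self) -> str:
        """Render the entry as one plain-text table row."""
        def show(value: Optional[float], unit: str) -> str:
            return "not measured" if value is None else f"{value:g} {unit}"

        return " | ".join([
            self.chip, self.model, self.quantization,
            show(self.tokens_per_second, "tok/s"),
            show(self.sustained_watts, "W"),
            self.conditions,
        ])


# Placeholder entry: every value here is a stand-in, not data.
entry = MeasurementEntry(chip="<chip>", model="<model>",
                         quantization="<quantization>",
                         conditions="<runtime conditions>")
print(entry.to_row())
print(entry.unmeasured())
```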

Fifth: what can be updated offline? "Self-improving" has to be treated carefully. A local device may support retrieval updates, note consolidation, tool logs, or small adapter experiments. That is different from unsupported claims about autonomous model improvement.

Sixth: what fails first? In a pocket system, the limiting factor may be battery, thermals, memory bandwidth, storage, model quality, or the user's ability to maintain the corpus.

Our Approach

The first tool is the offline compute feasibility calculator. Version 0 asks for TOPS, sustained watts, battery Wh, model size, quantization, storage, and corpus size. It returns estimated runtime, approximate model memory, remaining storage, storage fit, compute per billion parameters, and a conservative assumption tier.
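A calculator with those inputs and outputs might look roughly like the following. This is an illustrative sketch, not the Version 0 implementation: all names are hypothetical, model memory is approximated as weight size only, the model file on disk is assumed to be the same size as its weights, and the assumption tier is left out because this note does not define how it is assigned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inputs:
    tops: float             # accelerator throughput as stated by the vendor
    sustained_watts: float  # whole-system sustained draw
    battery_wh: float
    params_billions: float  # model size
    bits_per_weight: int    # quantization
    storage_gb: float       # total local storage
    corpus_gb: float        # offline corpus size


@dataclass(frozen=True)
class Estimate:
    runtime_hours: float
    model_memory_gb: float
    remaining_storage_gb: float
    storage_fits: bool
    tops_per_billion_params: float


def estimate(inp: Inputs) -> Estimate:
    if inp.sustained_watts <= 0 or inp.params_billions <= 0:
        raise ValueError("sustained_watts and params_billions must be positive")
    # Weights only; KV cache and runtime overhead are not included.
    model_gb = inp.params_billions * inp.bits_per_weight / 8
    # Assumes the model file occupies the same space on disk as its weights.
    remaining = inp.storage_gb - model_gb - inp.corpus_gb
    return Estimate(
        runtime_hours=inp.battery_wh / inp.sustained_watts,  # before losses
        model_memory_gb=model_gb,
        remaining_storage_gb=remaining,
        storage_fits=remaining >= 0,
        tops_per_billion_params=inp.tops / inp.params_billions,
    )


# 26 TOPS, 10 W, 100 Wh, and a 3B model echo figures quoted in this note;
# 4-bit quantization, 64 GB storage, and a 20 GB corpus are illustrative.
print(estimate(Inputs(tops=26, sustained_watts=10, battery_wh=100,
                      params_billions=3, bits_per_weight=4,
                      storage_gb=64, corpus_gb=20)))
```

Each output is a single line of arithmetic on purpose, so every assumption can be read and challenged.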

The calculator is deliberately modest. It does not certify performance. It does not certify safety. It does not claim autonomy. It only makes the assumptions visible enough to argue with.

The first research notes will compare device classes: phone-class NPUs, laptop-class NPUs, Raspberry Pi plus accelerator modules, and small dedicated edge boxes. The output should be a living table of what can run locally, for how long, and under which constraints.

The project will also track maintenance. Offline usefulness is not only inference. It includes updating a local corpus, checking file integrity, preserving private notes, exporting state, and explaining when confidence is low because the local reference set is incomplete. These are mundane constraints, but they are the difference between a demo and a device a person could actually keep.
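File integrity checking, for example, can be done entirely offline with a hash manifest. The sketch below assumes a corpus stored as ordinary files under one directory and uses SHA-256 from the Python standard library. The function names and the manifest layout are hypothetical choices for illustration.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large corpus files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the corpus on disk against a previously saved manifest."""
    current = build_manifest(root)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "added": sorted(set(current) - set(manifest)),
        "changed": sorted(k for k in manifest.keys() & current.keys()
                          if manifest[k] != current[k]),
    }
```

In use, `build_manifest` would run after each corpus update and its result would be saved alongside the corpus. A later call to `verify` then reports which files have gone missing, appeared, or changed since that point.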

Pockot's operating question is simple: what is the minimum stack that remains useful when the network is gone, and which part of that stack is still missing?