# I Built Analytics That Cannot See You

By [MetaEnd](https://paragraph.com/@metaend) · 2026-06-08

privacy, secops, astro

---

Every analytics tool makes you a promise. "We respect your privacy." "We don't sell your data." "GDPR compliant." The promise is the product, because underneath it, the tool is still doing the thing it always did: writing down a record of your visit and trusting whoever holds it to behave.

I did not want to make that promise. I wanted to build something where the promise is unnecessary, because the math makes individual visits unknowable in the first place. So I built astro-private-count, and as of this week it is running in production on my own site, with a public dashboard you can open right now.

The problem with "privacy-friendly" analytics
---------------------------------------------

Give the privacy-respecting tools their due. They dropped cookies, they stopped shipping your data to ad networks, they anonymized IP addresses. That is real progress and I am glad it happened.

But look at what they still do. For every visit, they derive an identifier: a cookie, or a salted hash of your IP address and your user agent. Even when that hash rotates every day, a per-visit record still exists on the server for that day. There is a row, and that row is about you. The privacy comes from a policy wrapped around that row: a retention window, a promise not to look too closely, a legal basis in a document nobody reads.

Policy is better than nothing. But policy can change, leak, get subpoenaed, or get quietly ignored after an acquisition. I wanted a design where there is simply no row about you to begin with. Where the privacy is a property of the system, not a sentence on a page.

Randomized response, a 60-year-old trick
----------------------------------------

The answer turned out to be older than the web. In 1965 a statistician named Stanley Warner wanted to survey people about embarrassing or illegal behavior and get honest answers. His insight: give every respondent deniability, and they will tell the truth, and you can still recover the group statistics.

The modern version, the foundation of what is now called local differential privacy, works like this. Suppose I want to count how many visits included some action, a yes or no question. Before any answer leaves your browser, your own device flips a weighted coin.

About three times out of four, it sends the true answer. The other one time in four, it sends a random answer, a fair coin flip with no relation to what you actually did.

Now think about what arrives at my server. A "yes" lands. Did that visit actually do the thing? Or did the coin say "lie" and the random flip happened to land on yes? I cannot tell. Nobody can tell. Every single visitor has plausible deniability baked in, before the data ever crosses the network.
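Here is a minimal sketch of that client-side coin flip in TypeScript. The names `randomizeBit`, `P_TRUTH`, and `Rng` are made up for illustration and are not the library's actual API. The 0.75 is taken from the "about three times out of four" above, so treat it as an assumed value.

```typescript
// Hypothetical helper for illustration; not astro-private-count's real API.
type Rng = () => number; // must return a float in [0, 1)

// Assumed probability of reporting the truth ("about three times out of four").
const P_TRUTH = 0.75;

function randomizeBit(
  truth: boolean,
  pTruth: number = P_TRUTH,
  rng: Rng = Math.random,
): 0 | 1 {
  if (pTruth < 0 || pTruth > 1) {
    throw new RangeError("pTruth must be between 0 and 1");
  }
  // First coin: with probability pTruth, send the true answer.
  if (rng() < pTruth) {
    return truth ? 1 : 0;
  }
  // Otherwise send a fresh fair coin flip that ignores the truth entirely.
  return rng() < 0.5 ? 1 : 0;
}

// Only this single bit would leave the device.
const reported: 0 | 1 = randomizeBit(true);
```

The random source is injectable so the logic can be tested deterministically. `Math.random` is not a cryptographic generator, so a real client might prefer something built on `crypto.getRandomValues`.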

So how do you get useful numbers out of noise?
----------------------------------------------

This is the part that feels like magic but is just arithmetic. The individual answers are deniable, but the amount of noise is known exactly, because I chose it. So I can subtract it back out in aggregate.

If a thousand bits arrive and I know that about a quarter of them are random coin flips averaging to half yes, I can solve for how many of the real answers must have been yes. The estimate is unbiased: on average it lands exactly on the true count. The relative error shrinks as traffic grows, roughly in proportion to one over the square root of the number of reports. At a few hundred reports it is within a few percent. At ten thousand it is within two.
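The subtraction can be sketched in a few lines. This assumes the scheme described above, where a "lie" is a fresh fair coin flip and the truth probability is 0.75. The function name `debiasRandomBit` and the `Estimate` shape are hypothetical, and the standard error is a simple binomial approximation rather than anything the library is known to report.

```typescript
interface Estimate {
  count: number;    // estimated number of true "yes" answers
  stdError: number; // approximate one-sigma error on that count
}

// Hypothetical de-biasing helper for the "lie = fresh random bit" scheme.
function debiasRandomBit(total: number, ones: number, pTruth: number): Estimate {
  if (pTruth <= 0 || pTruth > 1) {
    throw new RangeError("pTruth must be in (0, 1]");
  }
  // Expected ones: pTruth * trueYes + (1 - pTruth) * total / 2.
  // Subtract the noise term, then undo the scaling by pTruth.
  const expectedNoiseOnes = ((1 - pTruth) / 2) * total;
  const count = (ones - expectedNoiseOnes) / pTruth;

  // Approximate the spread of the observed ones as binomial,
  // then scale it the same way as the estimate.
  const observedRate = total > 0 ? ones / total : 0;
  const stdError = Math.sqrt(total * observedRate * (1 - observedRate)) / pTruth;

  return { count, stdError };
}

// Worked example: 1000 reports, 350 of them a one.
// Noise contributes 0.125 * 1000 = 125 ones, so (350 - 125) / 0.75 = 300.
const example = debiasRandomBit(1000, 350, 0.75);
console.log(example.count); // 300
```

The estimate is deliberately not clamped. On a quiet day it can come out slightly negative or above the report count, and clamping it would reintroduce bias.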

The trade is simple and honest. I give up the last sliver of precision, especially on low-traffic days, and in exchange no individual visit is ever recoverable, even by me, even with the raw database in hand. For "are people using this, is the funnel converting," that trade is not close. Privacy wins easily.

There is one detail I want to flag because it bit me and it is the kind of thing that quietly ruins these systems. There are two flavors of randomized response. In one, a "lie" sends the opposite of the truth. In mine, a "lie" sends a fresh random bit. They look almost identical, but they need different formulas to de-bias, and the denominators differ. Use the wrong one and every number you report is off by about fifty percent while looking perfectly plausible. I have a test that guards that exact line of math so a future change cannot silently reintroduce the bug.
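To make the difference concrete, here is a sketch of both de-biasing formulas side by side, plus one way the mix-up can look. All function names are hypothetical, and the mixed-up version is my guess at the shape of such a bug, not a copy of the original one. Here `p` is the probability of telling the truth.

```typescript
// Variant A (the one used here): a "lie" is a fresh fair coin flip.
//   E[ones] = p * yes + (1 - p) * n / 2
function debiasFreshBit(n: number, ones: number, p: number): number {
  return (ones - ((1 - p) / 2) * n) / p;
}

// Variant B: a "lie" is the opposite of the truth.
//   E[ones] = p * yes + (1 - p) * (n - yes) = (1 - p) * n + (2p - 1) * yes
function debiasFlipped(n: number, ones: number, p: number): number {
  if (p === 0.5) throw new RangeError("p = 0.5 carries no signal in the flipped variant");
  return (ones - (1 - p) * n) / (2 * p - 1);
}

// An illustrative bug: Variant A's numerator with Variant B's denominator.
function debiasMixedUp(n: number, ones: number, p: number): number {
  return (ones - ((1 - p) / 2) * n) / (2 * p - 1);
}

// Data produced under Variant A: 1000 reports, 300 true "yes", p = 0.75.
// Expected ones = 0.75 * 300 + 0.125 * 1000 = 350.
const n = 1000;
const ones = 350;
const p = 0.75;

console.log(debiasFreshBit(n, ones, p)); // 300, the true count
console.log(debiasMixedUp(n, ones, p));  // 450, fifty percent too high
console.log(debiasFlipped(n, ones, p));  // 200, a third too low
```

All three outputs look like reasonable traffic numbers, which is why a fixed-input regression test on this one line is worth having.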

What it actually stores
-----------------------

Two integers per event per day. How many reports arrived, and how many of them were a one. That is the entire database.
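As a rough picture of how small that is, here is a sketch of the whole storage model. The real project's schema and storage backend are not shown in this post, so the `CounterStore` class, its method names, the key format, and the use of UTC days are all assumptions, with an in-memory object standing in for the database.

```typescript
// The only state kept per event per day: two integers.
interface DailyCounter {
  reports: number; // how many randomized bits arrived
  ones: number;    // how many of them were a 1
}

// Hypothetical in-memory stand-in for the real database.
class CounterStore {
  private counters: Record<string, DailyCounter> = {};

  // Fold one randomized bit into the day's totals, then forget it.
  record(event: string, bit: 0 | 1, now: Date = new Date()): void {
    const day = now.toISOString().slice(0, 10); // UTC day, e.g. "2026-06-08"
    const key = `${event}|${day}`;
    const counter = this.counters[key] ?? { reports: 0, ones: 0 };
    counter.reports += 1;
    counter.ones += bit;
    this.counters[key] = counter;
  }

  get(event: string, day: string): DailyCounter | undefined {
    return this.counters[`${event}|${day}`];
  }
}

const store = new CounterStore();
const noon = new Date("2026-06-08T12:00:00Z");
store.record("signup", 1, noon);
store.record("signup", 0, noon);
store.record("signup", 1, noon);
console.log(store.get("signup", "2026-06-08")); // { reports: 3, ones: 2 }
```

Note what `record` never receives: no request object, no address, no user agent. There is no argument through which an identifier could arrive.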

No cookies. No IP addresses. No user agents. No identifiers. No fingerprints. There is no table that represents a person, which means there is nothing to leak in a breach, nothing to hand over under a subpoena, nothing to de-anonymize three years from now when somebody gets clever. You cannot lose what you never collected.

And because nothing is stored on your device, the usual cookie-consent banner is not even legally required. I ship a short, honest notice instead. No modal, no "we value your privacy" dark pattern, no reject-all maze. Just a plain sentence telling you what is happening.

I publish the epsilon
---------------------

There is one number that controls the whole privacy guarantee. It is called epsilon, and it sets how often the coin tells the truth. Lower epsilon means more noise and stronger deniability. Higher means cleaner numbers and weaker privacy.
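For the fresh-random-bit scheme described earlier, the link between epsilon and the coin can be written down directly. If the truth probability is p, a true "yes" is reported as a one with probability (1 + p) / 2 and a true "no" with probability (1 - p) / 2, and epsilon is the log of that ratio. The sketch below uses hypothetical function names, and the 0.75 is only the approximate figure quoted above, not necessarily the value on the dashboard.

```typescript
// Epsilon for randomized response where a "lie" is a fresh fair coin flip.
//   P(report 1 | truth 1) = (1 + p) / 2
//   P(report 1 | truth 0) = (1 - p) / 2
//   epsilon = ln of the ratio = ln((1 + p) / (1 - p))
function epsilonFromTruthProb(p: number): number {
  if (p < 0 || p >= 1) throw new RangeError("p must be in [0, 1)");
  return Math.log((1 + p) / (1 - p));
}

// Inverse: the truth probability that yields a given epsilon.
function truthProbFromEpsilon(epsilon: number): number {
  if (epsilon < 0) throw new RangeError("epsilon must be non-negative");
  const e = Math.exp(epsilon);
  return (e - 1) / (e + 1);
}

// With p = 0.75 the ratio is 1.75 / 0.25 = 7, so epsilon = ln 7, about 1.95.
console.log(epsilonFromTruthProb(0.75).toFixed(2)); // "1.95"
```

Read the other way, that ratio of 7 is the most any single report can shift an observer's odds about what the visitor actually did.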

Most systems that use differential privacy keep that number quiet, if they use it at all. I put it directly on the public dashboard. Anyone can see exactly how strong the guarantee is, in the open, and check the de-biased totals for themselves. A privacy claim you can audit is worth more than one you have to trust.

What this is, and what it is not
--------------------------------

I am careful here, because the whole point of the project is that the claim is exact.

This is local differential privacy. It is a real, mathematical guarantee that the server learns totals, while any single report reveals only a strictly bounded, deniable amount about the visit behind it. It is tunable and it is auditable.

It is not a zero-knowledge proof. Those are a different cryptographic tool for a different job, and bolting one onto a pageview counter would be theater. I will not call this ZK, because it is not.

And it does not anonymize your network connection. Your browser still opens a connection to my server, so my edge necessarily sees that some address connected at some time. I never log it, never store it, never build anything from it, but I cannot pretend the connection is invisible. True network-level privacy needs something like a mixnet in front of the client, and that is out of scope for an analytics library. Anything else would be overselling, and overselling is exactly what I built this to get away from.

Why I actually built it
-----------------------

This came out of a different project, an agent-readiness grader that scores websites on how well they serve AI agents, including how they handle privacy and data. I was about to add analytics to it, and every option on the table was something I would have docked points for if I found it on someone else's site.

That felt wrong enough to stop me. If I am going to hold the rest of the web to a standard, the tool doing the grading has to clear that same bar first. So the analytics had to be the kind I would give an A. That constraint is the whole reason this exists, and it is why it is open source: the standard only means something if anyone can use it and check it.

It is released under AGPL-3.0, runs on Bun, and installs as a normal Astro integration. Add it to your config, call track(), open the dashboard. That is the whole setup.

Privacy should be a property of the system, not a promise on a page. This is my attempt to build it that way, and to prove it by running it in the open.

You can see it counting live, epsilon and all, at [metaend-grade.fly.dev/stats](http://metaend-grade.fly.dev/stats). The source is at [gitworkshop.dev/ngmi@zaps.lol/astro-private-count](https://gitworkshop.dev/ngmi@zaps.lol/astro-private-count), AGPL-3.0. Take it, run it, check the math.

metaend

---

*Originally published on [MetaEnd](https://paragraph.com/@metaend/analytics-that-cannot-see-you)*
