Monthly Archives: April 2017

USAFacts, Corporate Hagiography and Historical Ignorance

This morning my circles are talking about Steve Ballmer’s new government data initiative USAFacts as reported in this NYT article.

It’s an interesting project, and I am glad that this is how Ballmer is spending his dotage! It’s a lot better than going into VC as a lot of other tech execs seem to do as they age. I wish him the best.


This is not the first time someone has worked on making government data more accessible. I wish that Ballmer and the media coverage around this launch had spent any time at all discussing the many other similar initiatives and how this one fits into the ecosystem.

For example, the mission of “a comprehensive summary” is interesting and different, but it represents a tradeoff compared to deep contextual understanding. Contrast with the “Scarsdale” series by Thomas Levine (/scarsdale/). Also, this is a classic example of the “How Standards Proliferate” process. Everyone who comes along thinks: “If only there were one canonical home for all government data!” And then you end up with 15 different portals.

Most notably, I think, USAFacts doesn’t actually make its data open; it just publishes reports. That’s a major departure from what a lot of other players are doing, and I wish there were some discussion about why they made that choice. Are there legal requirements connected to some of the data? Surely at least some of it could be open. Is it a desire to keep a “moat”? Who knows!

The tone around this launch irks me in the same way most tech coverage irks me. Ballmer is not the first to think of this, not by a long shot. And his effort to understand what was already out there seems... cursory, at best. Googling “open government data” would have been a very good start.

Why was this published in NYT’s DealBook section? It’s not business reporting at all. DealBook seems to exist as a WSJ competitor so the Times can attract the crowd that just wants corporate hagiography.

If you are interested in learning more about different open datasets, this may be a good start: /open-data/better-datasets-about-open-data/

Edit: There are two comment threads on HN (1, 2) about this, and the discussion is pretty good so far. Fave comments: