Decentralization

A very thoughtful recent blog post makes the point that institutions that seem “decentralized” or claim that as a value often exhibit centralizing tendencies over time. Some recommendations:

  • Be specific about what things you want to decentralize, and why. Regard decentralization as an ongoing process that can never be complete.
  • Find checks and balances, so that it is harder for any set of actors to achieve centralizing power. Use multiple forms of decentralization and participation.
  • Consider accountability: often what we really care about is accountability, and a centralized but accountable entity (such as a government antitrust enforcer) can limit the centralized and unaccountable power accumulation that we fear.

Pluribus Skepticism

Is Facebook’s new poker AI really the best in the world?

Facebook released a paper and blog post about a new AI called Pluribus that can beat human pros. The paper title (in Science!) calls it “superhuman”, and the popular media is using words like “unbeatable”.

But I think this is overblown.

If you look at the confidence intervals in the FB blog post above, you’ll see that while Pluribus was definitely better against the human pros on average, Linus Loeliger “was down 0.5 bb/100 (standard error of 1.0 bb/100).” The post also mentions that “Loeliger is considered by many to be the best player in the world at six-player no-limit Hold’em cash games.” Given that prior, and the data, I’d assign something like a 65-75% probability that Pluribus is actually better than Loeliger. That’s certainly impressive. But it’s not “superhuman”.
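As a rough check on that range, here is a minimal sketch of the arithmetic. It assumes the estimate is approximately normal and uses a flat prior, neither of which the Facebook post states, and `prob_positive` is a name made up for this illustration. It ignores the prior that Loeliger may be the best player in the world, so treat it as one end of the range rather than an answer.

```python
import math

def prob_positive(estimate, std_error):
    """P(true value > 0) under a normal approximation with a flat prior (an assumption)."""
    z = estimate / std_error
    # Standard normal CDF evaluated at z, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Loeliger was down 0.5 bb/100, so Pluribus was up 0.5 bb/100 against him,
# with a reported standard error of 1.0 bb/100.
p_pluribus_better = prob_positive(0.5, 1.0)
print(round(p_pluribus_better, 3))
```

Under these assumptions the result is a little under 70%, which is consistent with the 65-75% range above.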

I don’t know enough about poker or the AIVAT technique they used for variance reduction to get much deeper into this. How do people quantify the skill difference across the pros now?

I’m also a bit skeptical about the compensation scheme that was adopted – if the human players were compensated for anything other than the exact inverse of the outcome metric they’re using, I’d find that shady – but the paper didn’t include those details.

Thoughts?

Defensive Randomization

Machine learning is common and its use is growing. As time goes on, most of the options that you face in your life will be chosen by opaque algorithms that are optimizing for corporate profits. For example, the prices you see will be the highest prices at which you’ll still buy, estimated from an enormous amount of data about you and your past decisions.

To counter these tendencies, I expect people to begin adopting “defensive randomization”: introducing noise into their decision-making and forcing corporate algorithms to experiment more broadly with the options they present. You could do this by simple coin flip, or introduce your own bots that make random (or targeted exploratory) decisions on your behalf. For example, you could have a bot log in to your Netflix account and search for a bunch of movies that are far away from Netflix’s recommendations for you.
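The coin-flip version can be sketched in a few lines. This is only an illustration of the idea: `defensive_choice`, `epsilon`, and the sample catalog are all hypothetical names and values, and a real bot would need to talk to an actual service.

```python
import random

def defensive_choice(recommended, catalog, epsilon=0.2, rng=random):
    """With probability epsilon, ignore the recommendation and pick uniformly from the catalog."""
    if rng.random() < epsilon:
        return rng.choice(catalog)
    return recommended

# Hypothetical example: the algorithm recommends one title out of a small catalog.
catalog = ["documentary", "musical", "western", "anime", "thriller"]
rng = random.Random(0)  # seeded only to make the sketch reproducible
picks = [defensive_choice("thriller", catalog, epsilon=0.2, rng=rng) for _ in range(10)]
print(picks)
```

A higher `epsilon` feeds the algorithm more noise, at the cost of more choices you did not actually want.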

One possible future is for these bots to share data between themselves — a guerilla network of computation that is reverse-engineering corporate algorithms and feeding them the information that will make your life more humane.

This is related to:

[mildly inspired by Maximilian Kasy’s Politics of Machine Learning]

 

Police Science

Very much enjoying Jackie Wang’s Carceral Capitalism.

Especially liked this thought in “This Is A Story About Nerds and Cops”:

Given that critics of the police associate law enforcement with the arbitrary use of force, racial domination, and the discretionary power to make decisions about who will live and who will die, the rebranding of policing in a way that foregrounds statistical impersonality and symbolically removes the agency of individual officers is a clever way to cast police activity as neutral, unbiased, and rational.

Contributing to pandas

Very proud to announce today that I had a pull request merged into the pandas library. In version 0.21, pandas will have a new feature: a way to read in line-delimited JSON in small pieces, which can be useful when working with large files or streams.
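For the curious, a minimal sketch of what this looks like in use: passing `lines=True` together with `chunksize` to `pd.read_json` gives back an iterator of DataFrames rather than one big frame. The file name and the records below are made up for illustration.

```python
import json
import os
import tempfile

import pandas as pd

# Hypothetical sample data: one JSON object per line.
records = [{"id": i, "value": i * i} for i in range(5)]
path = os.path.join(tempfile.mkdtemp(), "records.jsonl")
with open(path, "w") as fh:
    for record in records:
        fh.write(json.dumps(record) + "\n")

# Each iteration yields a DataFrame holding at most `chunksize` lines of the file.
chunk_sizes = []
total = 0
for chunk in pd.read_json(path, lines=True, chunksize=2):
    chunk_sizes.append(len(chunk))
    total += int(chunk["value"].sum())

print(chunk_sizes, total)
```

Only one chunk is held in memory at a time, which is the point when the file is large or arrives as a stream.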

This is a fairly small change, technically, but a big deal for me. Pandas is one of the most commonly used tools in the data science world. When I started at TrueAccord they bought me the book on pandas (Volume 2 coming out next month!). This was my first introduction to any programming language other than Stata, an odd proprietary language that languishes on among economists and epidemiologists. Now, writing software is a core part of my career.

Related, I highly recommend The Success of Open Source, in which Steven Weber outlines the varied ways in which open source communities elicit and channel cooperation, and explores the complex set of motivations that leads people to contribute to open source.

What I’ve been reading lately

Rebecca Solnit, River of Shadows. Solnit is a marvelous thinker and historian who moves smoothly between well-researched historical fact and philosophical reverie. Here she traces the life of Eadweard Muybridge, whose motion studies of animals are still familiar today. Muybridge was a first-class photographer, a true artist who also made many technical innovations. Solnit takes his collaboration with Leland Stanford as the jumping-off point for an exploration of the way technology has annihilated time and space, and develops a genealogy from those two to the California of today, dominated by Hollywood and Silicon Valley. In her telling, these two industries named for physical places are at the center of a world that, in large part because of their doing, is increasingly disconnected from the world itself.

Mary Robison, Why Did I Ever. A few years back I made a note to myself to read this novel. I can’t recall why, or at whose urging, but I’m glad I did. Told in over 500 short fragments, the novel is funny and poignant. I was sad to have finished this book.

Diane Coyle, GDP: A Brief but Affectionate History. I’ve been meaning to read this for a while, but I am, so far, disappointed. GDP is the single measure that people associate with economic health and growth, to the extent that people say “the economy grew” when they mean “GDP grew”. How the economy is measured could not be more important, and Coyle lays out some of the history of how GDP developed and some of the ways in which it is flawed. This wasn’t the right level of depth for me (it took some things for granted and was disappointingly shallow elsewhere), but it seems like a good starting point for a deeper read into these ideas.

Nitt Witt Ridge

Art Beal spent 61 years building a house out of found materials at Nitt Witt Ridge in Cambria, CA. He served for a time as the town garbageman, dumping his truck directly into his own backyard and rummaging for salvageable building supplies with which he slowly built a house in the shape of his own mind. There is now little trace of the 20 feet of landfill underneath the hill where his house rests.

Beal, born in Oakland, was a celebrated long-distance swimmer in his youth but decamped in his 20s to Cambria, 200 miles south along the California coast. He built a small house and lived in it with “Gloria”, whose life is otherwise lost to history. At some point she disappeared. He abandoned that house and began constructing his masterwork, the unfinished project of the rest of his life.

There is no place in our world for some men. Through accident of birth some men are born different and they accumulate injuries in the world as they repeatedly are rammed through holes of the wrong shape. Beal was lucky. He found a place for his energy, found a way to preserve himself in a world that has no room for difference of mind.

USAFacts, Corporate Hagiography and Historical Ignorance

This morning my circles are talking about Steve Ballmer’s new government data initiative USAFacts as reported in this NYT article.

It’s an interesting project, and I am glad that this is how Ballmer is spending his dotage! It’s a lot better than going into VC as a lot of other tech execs seem to do as they age. I wish him the best.

HOWEVER

This is not the first time someone has worked on making government data more accessible. I wish that Ballmer and the media coverage around this launch spent any time at all discussing the many other similar initiatives and how this fits into the ecosystem.

For example, the mission of “a comprehensive summary” is interesting and different, but represents a tradeoff compared to deep contextual understanding. Contrast with the “Scarsdale” series by Thomas Levine https://thomaslevine.com/!/scarsdale/, for example. Also, this is a classic example of the “How Standards Proliferate” process. Everyone who comes along thinks: “If only there were one canonical home for all government data!” And then you end up with 15 different portals.

I think most notably, USAFacts doesn’t actually make their data open, they just publish reports. That’s a major departure from what a lot of other players are doing, and I wish there was any discussion about why they made that choice. Are there legal requirements connected to some of the data? Surely at least some of it could be open. Is it a desire to keep a “moat”? Who knows!

The tone around this launch irks me in the same way most tech coverage irks me. Ballmer is not the first to think of it, not by a long shot. And his effort to understand what was already out there seems... cursory, at best. Googling “open government data” would have been a very good start.

Why was this published in NYT’s DealBook section? It’s not business reporting at all. DealBook seems to exist as a WSJ competitor so the Times can attract the crowd that just wants corporate hagiography. Related: https://twitter.com/louispotok/status/423173257110372352

If you are interested in learning more about different open datasets, this may be a good start: https://thomaslevine.com/!/open-data/better-datasets-about-open-data/

Edit: There are two comment threads on HN (1 2) about this; the discussion is pretty good so far. Fave comments:

Python tip: Inspect function signature at runtime

Problem:

I have a list of functions with different signatures. There is some set of possible parameters, and I want to call all these functions with the “appropriate” argument for each parameter.

This is a little hand-wavy, so let’s look at an example:


def half(a):
    return a / 2

def twice(a):
    return 2 * a

def addition(a, b):
    return a + b

def subtraction(a, b):
    return a - b

functions = [half, twice, addition, subtraction]
a = get_a()
b = get_b()

Desired outcome:

[half(a), twice(a), addition(a,b), subtraction(a, b)]

And we want to do this without making our function definitions too ugly.

Solution 1:

One option is to call `get_b()` inside the functions that need it. This is not ideal: if `get_b` is not a pure function (e.g. it makes a network call), we would rather pass `b` into scope than fetch it again every time it’s needed.

Solution 2:

We could change the signature to accept arbitrary kwargs and then pass a dict of args, for example:

def half(**kwargs):
    return kwargs['a'] / 2

def twice(**kwargs):
    return 2 * kwargs['a']

def addition(**kwargs):
    return kwargs['a'] + kwargs['b']

def subtraction(**kwargs):
    return kwargs['a'] - kwargs['b']

functions = [half, twice, addition, subtraction]
payload = {'a': get_a(), 'b': get_b()}
results = [f(**payload) for f in functions]

This works, but makes each of our function definitions uglier.

Solution 3:

Allow each function to have a different signature, inspect the signature at runtime and pass what is needed.
(Adapted from http://stackoverflow.com/a/2677263/3393459)


import inspect

def half(a):
    return a / 2

def twice(a):
    return 2 * a

def addition(a, b):
    return a + b

def subtraction(a, b):
    return a - b

# Wrapper which:
# * accepts a dict of all possible kwargs and their names
# * inspects the signature of the function
# * calls that function with the correct args
def call_func_with_correct_args(f, possible_args):
    # inspect.signature replaces inspect.getargspec, which was removed in Python 3.11
    func_args = inspect.signature(f).parameters
    args_to_pass = {k: possible_args[k] for k in func_args}
    return f(**args_to_pass)

functions = [half, twice, addition, subtraction]
a = get_a()
b = get_b()
full_payload = {'a': a, 'b': b}

results = [call_func_with_correct_args(f, full_payload) for f in functions]

So this is nice and clever, but we need to be careful that our function parameters are named correctly and consistently. Essentially we are passing the burden to the function definitions.

Conclusion

I don’t know what a great solution to this might look like. Is there a better way to do this? If all the parameters are different types, Python3’s type hinting might provide another option. What does this look like in other languages?
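To make the type-hinting idea concrete, here is a rough sketch that picks each argument by its parameter’s annotation instead of its name. The functions `repeat` and `shout`, the helper `call_by_annotation`, and the sample values are all invented for this illustration, and the approach only works when every parameter has a distinct annotated type.

```python
import inspect

def repeat(text: str, times: int) -> str:
    return text * times

def shout(text: str) -> str:
    return text.upper() + "!"

def call_by_annotation(f, values_by_type):
    """Call f, choosing each argument by the annotated type of its parameter."""
    params = inspect.signature(f).parameters
    kwargs = {name: values_by_type[p.annotation] for name, p in params.items()}
    return f(**kwargs)

# One value per type; parameter names no longer need to be consistent across functions.
values_by_type = {str: "ab", int: 3}
results = [call_by_annotation(f, values_by_type) for f in (repeat, shout)]
print(results)
```

This moves the burden from consistent parameter names to consistent annotations, so it trades one convention for another rather than removing the problem.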

Book Review | Earthseed Series | Octavia Butler

Parable of the Sower and Parable of the Talents by Octavia Butler

I first heard about Octavia Butler in early 2014 on Twitter, I think originally from Danilo and/or Holly. I saw allusions to something called Earthseed, to humanity’s destiny in the stars, to the idea that “God Is Change”.

Two years later, with fascism on the rise and afrofuturism enjoying a moment of popularity, I went to a Fusion-backed symposium about Butler’s work. (I learned about this from Alexis Madrigal’s newsletter Five Intriguing Things, which I’ve previously plugged. Alexis, formerly at the Atlantic and now Editor in Chief at Fusion, is one of the most interesting writers I follow.) At this event, I was delighted by the idea that all progressives, all activists, are engaged in acts of science fiction; they imagine alternate worlds that could branch off from this one in a plausible way, societies like ours except governed by different principles of the physical or psychological universe.

Sower and Talents, published in 1993 and 1998 respectively, look more prescient by the day. Butler saw the future with great clarity and with a sense of resignation to the hate, destruction and degradation our world would suffer. In the Parable series, environmental catastrophe and economic inequality have created a desperate underclass driven to violence and drugs, whose life is of no value to a police force interested in protecting the property of the rich. In this fertile ground a white supremacist Christian paramilitary organization flourishes with the winking support of Presidential candidate Andrew Steele Jarret, whose ascendance tears apart the vanishing middle class between liberal values and a frantic need to protect their families and communities from the predations of those even a little less fortunate. Kashmir Hill has already written about the uncanny similarity of this campaign to Trump’s. By the late 80s our future was not murky to a thinker of Butler’s diagnostic precision.

The series follows Lauren Oya Olamina, a teenage girl who shows us the imagination and empathy and ambition that we will need to survive this bleak world. As a teenager in a middle-class enclave in southern California, Olamina begins to develop a practice called Earthseed, rooted in strong communities, individual self-sufficiency and an embrace of the universe’s ever-changing nature. Earthseed demands resilience and adaptability, with a sort of scientific and moral pragmatism, and points humanity towards the stars for its own survival. As she develops her philosophy it is eventually collected into The Books of the Living, which is “excerpted” heavily in the two books.

In these two books we don’t see anyone leave Earth — we are not given the pleasure of Butler articulating what it would be like for a whole society to live by these principles. We see small communities struggle to adopt these practices. We see them try to integrate new members who are grateful for food and shelter and company but skeptical of any indoctrination. We see major setbacks and minor accomplishments.

When we are defeated by Moloch our devastation is global and absolute and permanent. Our victories are usually messy and local and temporary, a momentary respite from an ancient foe that is only getting stronger. If we are to survive, we must connect our small patches of humanity into a resilient and adaptable network. Our power is weak and our time is short, but our destiny is in the stars.