Can AI ever really be "Open Source"?

Of course there are AI projects that have open source code. Many do, even some of the largest and most influential.

My question is somewhat deeper. Does a libre or “open source” AI really come with the ethical and practical benefits of traditional free-software programs?

Does it really mean we know how an AI is computing something?

Does it really mean we can understand the AI’s output?

Does it really mean we have control over the program outside of ad hoc limitations?

No. In all cases.

AI is only peripherally source code.

What I mean by this is that a traditional program’s source code can be read by a competent programmer and changed as needed. There is a one-to-one relationship between the source code and the program’s behavior in a given circumstance.
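To make this concrete, here is a trivial sketch in Python (the function and its rules are invented purely for illustration). If this program ever gives a surprising answer, a programmer can point to the exact line responsible:

    # A traditional program: every behavior traces back to a visible line.
    def recommend_painkiller(age, has_allergy):
        if has_allergy:
            return "consult a doctor"   # this exact line is the "reason"
        if age < 12:
            return "children's dose"
        return "standard dose"

    print(recommend_painkiller(30, False))  # prints "standard dose"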

AI is a bit different. The source code ultimately is a fairly small portion of what defines the behavior of these large AIs. In fact, it is really just raw material, the starting point. More substantial, obviously, is the data that the AIs are trained on, which, in the case of public-facing AIs, is typically not public information.
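Here is a minimal sketch of what I mean, using NumPy (the shapes and values are made-up stand-ins): the entire source code of a small neural network fits in a dozen lines, while everything that actually determines its answers lives in numeric weights produced by training:

    import numpy as np

    # The complete "source code" of a tiny two-layer network.
    def forward(x, w1, w2):
        hidden = np.maximum(0, x @ w1)  # ReLU layer
        return hidden @ w2

    # The behavior lives here, in the weights: opaque numbers that come
    # from training data, not from anything a programmer wrote. Real
    # models have billions of them; these random ones are stand-ins.
    w1 = np.random.randn(4, 8)
    w2 = np.random.randn(8, 2)

    x = np.array([1.0, 0.5, -0.3, 2.0])
    print(forward(x, w1, w2))  # the code is readable; the reason for the output is not

Swap in a different set of weights and those same few lines become, for all practical purposes, a different program.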

Even when we know what data they are trained on (and even those running these systems can scarcely enumerate it), the levels of abstraction and cross-reference involved in any AI or neural net are simply far beyond the intuitive grasp of the human mind. That is, there is a truth there, but we cannot grok it.

In this sense, even the companies writing and training these AIs are more or less just as in the dark about the true workings of their projects as a person trying to reverse-engineer or decompile a binary. But the problem is far more intractable than decompilation, because the learned computation inside an AI, unlike human-written source code, does not unfold into some series of simple causes and effects or if statements.

A Philosophical Question

Why does an AI answer any given question the way it does?

Ultimately, we can never answer this question. If I ask an AI whether I should take some medicine for a headache, how can we say why it answers the way it does?

It is not because of one simple website or some other piece of data it was trained on. We can give general answers to these kinds of questions, but we are utterly unable to troubleshoot an AI in the way we can troubleshoot a computer program. There is just too much going on.
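To sharpen the contrast, a contrived example: when an ordinary program misbehaves, the runtime hands us the guilty line; a model has no equivalent, since right and wrong answers flow through the same code path:

    # Troubleshooting a traditional program: the failure names its own line.
    def dose_for(patient_type):
        table = {"child": 250, "adult": 500}
        return table[patient_type]   # a KeyError points at exactly this line

    try:
        dose_for("teen")
    except KeyError as err:
        print("the bug is on the line that raised:", err)

    # A model offers nothing like this: a wrong answer runs through the
    # very same matrix multiplications as a right one, so there is no
    # guilty line to point at, only guilty weights.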

In this sense, an AI is a black box, even if it is open source.

You could say we can experimentally train an AI on two differing data sets to see the difference, but this is effectively the same kind of blind game as playing with two different compiled binaries without the source code.
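Here is that experiment at toy scale (the model is a minimal logistic regression and the two data sets are invented, differing by a single label): we can observe that the trained models answer differently, but the weights offer no account of why:

    import numpy as np

    def train(X, y, steps=500, lr=0.1):
        # Minimal logistic-regression trainer; the "program" it produces
        # is nothing but a weight vector.
        rng = np.random.default_rng(0)
        w = rng.normal(size=X.shape[1])
        for _ in range(steps):
            p = 1 / (1 + np.exp(-(X @ w)))
            w -= lr * X.T @ (p - y) / len(y)
        return w

    X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
    y_a = np.array([1.0, 0.0, 1.0, 0.0])
    y_b = np.array([1.0, 1.0, 1.0, 0.0])  # one flipped label

    w_a, w_b = train(X, y_a), train(X, y_b)
    probe = np.array([0.5, 0.5])
    for w in (w_a, w_b):
        print(1 / (1 + np.exp(-(probe @ w))))
    # We can see THAT the answers differ; nothing in w_a or w_b tells us
    # WHY, in terms a programmer could inspect or fix.

All we get, just as with the two binaries, is observed differences in behavior.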

Can AI ever be ethical?

So can AI ever be ethical in a Stallmanesque way?

Well, AIs are not necessarily proprietary software, and they might not be designed by their makers to exploit users. Free source code for an AI does aid us in avoiding purposefully malicious code. Stallman is a man of precise and unchanging definitions, so I don’t think he would lump AI fundamentally into the same category.

But we are in new territory, with a new risk. All AIs function similarly to proprietary software in that their workings are still a black box. They might not be intentionally evil on the part of the programmer, but when we train them on data from the mainstream media or the like, we will not only find that they are evil, but also that the root of their evil is not something programmatically specifiable.

Due to the way that AIs by necessity function, we cannot truly understand, bit by bit, why they behave the way they do.