15 September 2010

Comment on Poor Richard's comment on data deluge, Google's Black Box, etc.

This is yet another case of comment-blogorrhea, in this case in reply to this. Any second person pronouns in the text below are references to Poor Richard.


I don't necessarily agree with your contention that the myriad species of utility in your taxonomy of utilities can or should be rolled into one framework of general utility. I don't necessarily disagree with the contention, either. Like Mr. Anderson, I'm 'agnostic,' preferring to wait and see what the data say. Pubwan, of course, is my proposed methodology for gathering the (hopefully) relevant data. Like any human, I have an idea of what I'd like the data to prove, and it seems not to coincide with your framework of general utility. More specifically, I question the assumption, held by economists at least since Walras, that utility is a scalar quantity, conveniently measurable in units of currency. One reason I want to believe either that utility is BS or else that utility is vector-valued or irreducible or otherwise non-scalar is that I've taken a liking to various ecological and other groups and causes that are questioning the legitimacy of GDP as a measure of quality of life, or even of economic development. As someone whose career never really took off, I also have a personal vested interest in the idea that 'money isn't everything.'

My take on the notion of vector-valued utility is here. I had a little trouble finding it because I misremembered the wording; I now see that I had phrased it as "quality of life as vector-valued function." This little bit of serendipity led me to the discovery that Google Blog Search knows of exactly one instance of the phrase "vector valued utility" in the blogosphere. (Discussion of your discussion on Google later, BTW.) For your amusement, here's what's been said about vector-valued utility. I think the concept of vector-valued utility (or at least vector-valued income) underlies the notion of 'multiple bottom line' accounting that has been applied to various types of politically correct businesses. I seriously doubt this practice is sufficient for capitalism to buy its soul back, but I highly value the empirical opportunities that this practice should open up, if combined with the introduction of radical transparency to accounting.
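
For the programmers in the audience, here is a rough sketch of what I mean by scalar versus vector-valued utility. Everything in it is an assumption made up for illustration: the firm names, the three 'bottom lines' (financial, social, ecological), the scores and the weights. The point is only that a scalar always gives you a ranking, while a vector gives you a ranking only when one option is at least as good on every component; otherwise the two are simply incomparable until somebody picks weights.

```python
from typing import Sequence


def pareto_dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    if len(a) != len(b):
        raise ValueError("utility vectors must have the same length")
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def compare(a: Sequence[float], b: Sequence[float]) -> str:
    """Compare two utility vectors under the (partial) Pareto ordering."""
    if pareto_dominates(a, b):
        return "first dominates"
    if pareto_dominates(b, a):
        return "second dominates"
    if tuple(a) == tuple(b):
        return "equal"
    return "incomparable"


def weighted_sum(v: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse a vector to a scalar; the ranking now depends on the chosen weights."""
    return sum(x * w for x, w in zip(v, weights))


# Hypothetical (financial, social, ecological) scores, invented for illustration.
firm_a = (5.0, 2.0, 1.0)
firm_b = (3.0, 4.0, 1.0)
firm_c = (3.0, 2.0, 1.0)

print(compare(firm_a, firm_b))  # neither firm is better on every component
print(compare(firm_a, firm_c))  # firm_a is at least as good everywhere

# Two different weightings reverse the scalar ranking of firm_a and firm_b.
money_only = (1.0, 0.0, 0.0)
social_only = (0.0, 1.0, 0.0)
print(weighted_sum(firm_a, money_only) > weighted_sum(firm_b, money_only))
print(weighted_sum(firm_a, social_only) > weighted_sum(firm_b, social_only))
```

Measuring everything in currency amounts to choosing one particular set of weights, which is exactly the choice I'd rather see the data speak to.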

Now for Wired editor Chris Anderson and his data deluge. I'll read the PDF later (really). Right now I'm commenting on your comments on it.

"At the petabyte scale, information is not a matter of simple three- and four-dimensional taxonomy and order but of dimensionally agnostic statistics."

That's called factor analysis, and statisticians have been talking about it since before I was born. Yawn. I do agree with Mr. Norvig that the point is that data are available (to some) with unprecedented fidelity. That is exactly the point.

Using Google again to trace my own activity log (another example of the pervasiveness of cloud computing), I retrieve the quote "plotting high-resolution demand curves" from pubwan wiki, and in the process discover exactly one other page containing (at the time of indexing, anyway) the word "pubwan" and the phrase "high resolution". Needless to say, I've been aware for some time that 'pubwan' is a word in the Thai language. I had never gotten a round tuit, so I decided to finally satisfy that curiosity. Unfortunately, Google Translate doesn't yet include Google Transliterate, at least for Thai->English. The other page about pubwan and high resolution contained a Facebook link, so I inquired there. The person on the other end might think I'm an idiot, or an example of why Americans are dumb, but maybe not, and I'm overcome with curiosity.

Aaaaanyhoooo, combined with the emerging instrumented ecosystem, etc., yes, it is both technically exciting (for the few) and politically chilling (for the many). Interesting you should mention dupermarkets, BTW. Did you get that one from me? :-) More or less, I would say that I would like to see as much of these data as possible make it into the public domain. As for privacy, transparency and democratic regulation, well, as a nominal anarchist sympathizer, I'm supposed to believe in neither democracy nor regulation. At any rate, I think they are both irrelevant to the issue of symmetric transparency. Otherwise, I would say yes to transparency and no to privacy. Privacy no, not because I don't believe in it in the normative sense, but because I don't believe in it in the positive sense, and transparency YES, because the cause of transparency (but only if it's symmetric transparency) has become the one cause to which I am most fervently committed. Informationally, I'm a militant communist.

ONE MORE THING: You are eminently qualified to edit pubwan wiki. Everyone is eligible to edit pubwan wiki. That is the whole point. It runs on MediaWiki, so in theory it's impervious to both vandalism and incompetence, neither of which applies to you, anyway. The thing is absolutely dying on the vine as a one-person wiki, which is one thing it is not meant to be.

this affects you
