Introduction
Like Love Is the Killer App, this was an assigned reading, and admittedly it might never have shown up on my radar otherwise. As with Sanders' book, I was a bit rushed in my reading of The Numerati. I did read through the book in a very short period of time; as my sister can tell you, it's all I did on the Saturday and Sunday of her visit.
It wasn't a hard read. Baker is a business journalist and has been one for over 20 years, and he knows how to write a good book as well.
As I said with the other assigned reading, I honestly try not to judge an assigned reading too harshly. At the very least I look for what the professor saw in the book and what they may have wanted me to glean from it.
Let me give you a few terms worth knowing:
- data mining: the process of gathering (mining) the information (data) of a collection of individuals.
- DNA: as used in the book, the basic structure of a person described in measurable, quantifiable units.
- cookies: small (relatively benign) files placed on your computer by the websites you visit.
- noise: pieces of data that skew or disrupt your ability to process a person's data-based DNA.
Summary
The book begins with an introduction and a short discussion of a quandary faced by a data-compiling firm. Essentially, this firm has been gathering millions of pieces of data on people who visit its client sites and trying to understand how and where to place an ad for a particular product or service. The question is this: "Why is it that people who read romance novels are the second most likely people to click on a rent-a-car ad?" The most likely are people reading online obituaries. That correlation is clear enough, since a person reading an obituary would probably want to make travel arrangements to get to the funeral they've just read about. But what do you do about those romance novel fans? What the firm determined (upon further investigation) is that these people were clicking on a particular banner ad for romantic getaway packages the rent-a-car company was offering. Makes sense now, right?
I share this story because it is part and parcel of the entire book. This book discusses the uses and effects of "data mining" on our modern-day culture. In particular, Baker focuses on seven groups of people that are affected:
- Workers
- Shoppers
- Voters
- Bloggers
- Terrorists
- Patients
- Lovers
He discusses how technology will affect how we are viewed by our employers, describing a system that would rank us much like stocks (in terms of value to the company). Heavy observation could lead to a Big Brother style push for efficiency, but it could also be used to foster better employees and create more cohesive work groups.
Any online shopper is familiar with things like the Amazon.com method. Using various cookies, they track the items you buy and shop for and attempt to recommend similar products to you. There are firms that do this on a massive scale, with thousands of website clients allowing them to plant cookies on users. Those cookies let the firm compile information about what links you clicked, what websites you visited, and even the path you took from one to another. Baker discusses the possibility of connecting this data to your person, such that a cart at your local supercenter can identify you and tell you which of the products you buy often are on sale, or even help you develop a shopping list of the things you are most likely to buy.
An interesting discussion happens in this chapter. Of course there is the question of how you remove the noise (a 40-year-old woman buying a video game for her son won't respond to future ads for video games), but a better question is how you get rid of the customers that cost you money. Business lingo calls them barnacles. These are the people you see buying only sale items and buying them with coupons to further their savings. They never pay full price, they only buy items that have free shipping offers, and they lose businesses money. These individuals can be tracked as easily as anyone else online, and once a shopper is tagged as a barnacle, that shopper's most likely purchases can be buried in a sea of regular-priced items in search results. Interesting to think about.
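For the curious, here is a rough sketch in Python of how a store might tag a barnacle from purchase history. This is my own illustration, not a method from the book: the field names (`on_sale`, `coupon`), the function names, and the 90% cutoff are all assumptions I made up.

```python
def discount_share(orders):
    """Fraction of a customer's orders bought on sale or with a coupon."""
    if not orders:
        return 0.0
    discounted = sum(1 for o in orders if o["on_sale"] or o["coupon"])
    return discounted / len(orders)


def is_barnacle(orders, threshold=0.9):
    """Tag a customer whose discounted share meets a made-up threshold."""
    return discount_share(orders) >= threshold


# Hypothetical purchase histories.
bargain_hunter = [
    {"on_sale": True, "coupon": True},
    {"on_sale": True, "coupon": False},
    {"on_sale": False, "coupon": True},
]
regular_shopper = [
    {"on_sale": False, "coupon": False},
    {"on_sale": True, "coupon": False},
    {"on_sale": False, "coupon": False},
]

print(is_barnacle(bargain_hunter))   # True: every order was discounted
print(is_barnacle(regular_shopper))  # False: only one order in three was
```

A real retailer would presumably look at profit per customer rather than a simple share of discounted orders, but the idea of reducing a shopper to a number and a cutoff is the same.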
How do you gather voter data? After all, with campaign costs continuously rising, you need to focus your efforts on winnable markets. If you're on the far left, it won't do you much good to send campaign brochures to that staunchly far-right neighborhood. The same goes for phone calls and door knockers. One firm described how they sort people into one of five major categories (based on the values they hold most dearly) and then assess where each person falls on a spectrum from high dedication to low. Once they have broken down the individuals who answer their surveys this way, they can extrapolate that information to whole neighborhoods and counties. (This is important since political consulting isn't reserved exclusively for the general election.)
Obviously, I held some interest in the blogger section. Most of the discussion focuses on exactly how you break down very human information into something people will pay for. The answer? Not easily. One group is developing an algorithm that can (sometimes) determine what a blog article is about, what sort of person wrote it (demographically), and what they think (positively or negatively) about the topics included. However, as Baker shows, this is still a rather inaccurate system. Linguists have helped determine how men and women write and how older and younger writers differ in terms of word choice. Even with that help, a man in his upper 30s who is an active skateboarder was identified as a teenager, and a woman who writes aggressively could be identified as a man. Syntax is something computers struggle with; for example, a sarcastic sentence that was in favor of something was read as a negative statement. Since the system only works by feeding in examples of blogs that fit the definitions of the categories it has, it is limited by the human-supplied definitions.
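To show why sarcasm trips up this kind of system, here is a minimal sketch in Python of a classifier that learns only from human-labeled examples. It is not the algorithm from the book; the function names (`train`, `classify`) and the tiny training posts are invented for illustration, and a real system would be far more sophisticated.

```python
from collections import Counter


def train(labeled_posts):
    """Count how often each word appears under each human-supplied label."""
    counts = {}
    for text, label in labeled_posts:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts


def classify(counts, text):
    """Pick the label whose training words overlap most with the text."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)


# Hypothetical training data: the labels are whatever a person decided.
examples = [
    ("i love this phone it is great", "positive"),
    ("great camera love the screen", "positive"),
    ("i hate this phone it is terrible", "negative"),
    ("terrible battery hate the screen", "negative"),
]
model = train(examples)

print(classify(model, "love the camera"))  # positive
# A sarcastic endorsement: the writer likes the phone, but the words say otherwise.
print(classify(model, "i just hate how this phone never breaks how terrible"))  # negative
```

The second post is praise, but the program only sees "hate" and "terrible," so it files the post under negative. It can never do better than the examples and labels a person gave it.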
The terrorist section focuses on technology for identifying terrorists, which actually began with casinos developing systems to keep Danny Ocean and the rest of his team from getting away with anything. The system identifies individual crooks, and if more crooks arrive it searches its databases to predict who the next most likely crook is. The NORA (non-obvious relationship awareness) system was eventually adopted to monitor the movements of known terrorists and predict what other terrorists to keep an eye on. Other NSA systems look at spending patterns, patterns of association, and demographic information to determine which mild-mannered citizen is a future terrorist. Basically, they are looking for the exceptions to the norm, those factors that separate us (the average citizens) from the potential terrorists. The concern is that the other systems of identification Baker has discussed are allowed to be inaccurate to some degree, because the worst-case scenario is that you send a campaign flyer to the wrong door or advertise death metal to a grandma. But the NSA's task involves putting people in jail, treating them as enemy combatants, and stripping them of their livelihood (on the basis of what this data tells them). In an increasingly diverse and ever more observable society like ours, how do you define the norms? And what about those innocent shop owners who thought they were supporting a charity but accidentally funded a terrorist cell?
The patients section was, more or less, a discussion of interesting medical monitoring gadgets. Most of them are meant to gather data over a long period of time. One of the biggest flaws in the medical system is that most of us spend 15 minutes (or less) with our doctor each year, but many conditions (such as Parkinson's) are identifiable only through long-term information. In the case of Parkinson's, researchers have developed a floor that detects footsteps (a major sign of Parkinson's is decreased stride length). They imagine one day entire houses (they have the prototype) filled with similar sensors in your bed, your floors, your chairs, etc. Together these sensors would gather data, and if they detected a problem they could alert your doctor. But what does this mean for insurance? Will insurers allow you to BE insured without these devices? Will you pay a higher premium if you live in a house without them? Could your premium spike if the insurance company becomes aware of a condition first? The next question for these monitors is who gets to see the information.
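The floor idea boils down to comparing recent readings against a long-term baseline. Here is a minimal sketch in Python of that comparison. It is my own illustration, not the researchers' method: the function name `stride_alert`, the window sizes, and the 10% drop cutoff are all made-up assumptions with no clinical basis.

```python
from statistics import mean


def stride_alert(stride_lengths_cm, baseline_days=30, recent_days=7, drop_ratio=0.9):
    """Return True when the recent average stride falls below
    drop_ratio times the average over the first baseline_days."""
    if len(stride_lengths_cm) < baseline_days + recent_days:
        return False  # not enough history to compare
    baseline = mean(stride_lengths_cm[:baseline_days])
    recent = mean(stride_lengths_cm[-recent_days:])
    return recent < drop_ratio * baseline


# Hypothetical daily average stride lengths in centimeters.
steady = [70.0] * 60
shrinking = [70.0] * 30 + [70.0 - 0.5 * day for day in range(30)]

print(stride_alert(steady))     # False: no change from the baseline
print(stride_alert(shrinking))  # True: last week averages 57 cm against a 70 cm baseline
```

Notice that this only works with weeks of readings, which is exactly the kind of information a 15-minute office visit can't provide.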
The final chapter focuses on the burgeoning online dating scene, specifically the sites that use algorithms to determine compatibility. Anthropologists, sociologists, biochemists, and countless others are part of the process of trying to understand compatibility on various levels, and they are the ones recruited by dating websites that opt to use surveys to gather the important data. Another idea was to use cellphones as mini broadcasters of our profile pages. If two people who match on similar interests (or whatever compatibility measures the program uses) are in proximity to one another, then they will receive one another's personal profile page. It's an interesting thought. I mean, if you are already broadcasting your information on the internet for people in Norway to view, why not use that same data to find someone a little closer? Admittedly, the idea sounded a bit like those nanopet battlers that alerted you when another was nearby (so that you could battle strangers, I guess).
Baker's conclusion can be summed up simply: "Life is Complex." From that launching point he adds: "And yet, bit by bit the Numerati make progress. No, they don't truly know us and they never will. But in each domain, they understand and predict our behavior a bit better today than they did last week." Baker is impressed with the technology, but truly believes that it can never comprehensively understand us.
Criticisms
Negatives:
- Balance of Facts:
Some chapters have a lot to say about the various technologies used or in development for an industry. Other chapters (the terrorist chapter being a prime example) have little to say about the technologies themselves and are busier conjecturing about the implications. I'm not saying those aren't important, but it feels like more effort could have gone into researching those areas.
- Kind of a slow read:
Despite the fascinating information, Baker seems to trudge through some portions. It's obvious from his openings and closings that he's a journalist; those are excellent, but the in-between is rather weak.
Positives:
- Style:
As I said, the pace does slow down in the middle, but I do like the style with which Baker writes. After all, he really does a great job of setting a scene for his interviews, and he does a superb job of simplifying the talk of several PhDs into something that the average reader can grasp.
- Interesting information:
I found the subject matter legitimately interesting. It was interesting to learn more about how some of the technologies I interact with operate, and what people are doing to improve them.
- The Future:
The discussions of the future implications of these technologies are stimulating and fascinating. What does all of this data mining mean for the future of medicine, advertising, voting, privacy and security, and even personal relationships? Baker does an excellent job of looking far ahead in his vision of these technologies as they may be used once fully applied. He often sees what the less altruistic people will do with the technology someday. If Oppenheimer had had this level of foresight, who knows how the Second World War would have turned out.
Final Thoughts:
Not the greatest book I've ever read, but not the worst. It's only of temporary value, though, so if it interests you, read it soon; at the rate of technological development we have in America, the book is becoming more outdated with every second.
That said, if you're more interested in the gadgetry, then I'd recommend giving this one a miss and just reading tech journals. For better or worse, this book will fade into obscurity eventually. It wasn't half bad for an assigned reading, but I can't say I'd hunt it down if I heard about it.
That's about it really.
Ti Voglio Bene,
-matt