Data Mining

It should surprise no one that Bush’s warrantless phone data collection will result in more than “spying on al Qaeda.” The project’s designers say the National Security Agency’s electronic warehousing of trillions of phone records from calls made by some 200 million Americans is intended to seek out “patterns” from conversations involving alleged terrorists and then to apply the digital outline to the stockpiled records.

The human brain excels at pattern recognition. The mischief lies in the meaning we make of those patterns and the action we take based upon those “meanings.” For example, you might see a pattern that resembles a face in a grilled cheese sandwich, but are you willing to call the face that of the Virgin Mary? Do you take that as a sign from God? To me the face resembles my third-grade teacher, Mrs. Merrell. I recall having a crush on Mrs. Merrell, so that might account for it. In other words, the patterns we see and the meanings we make of those patterns are largely dependent on our prior experiences and our expectations. How many “degrees of separation” are you from a known terrorist? Now that the NSA has the database and the computer programs that allow it to conduct “social-network analysis,” we should all be worried. The process is described in an article from the Washington Post:

“Let’s say lots [of data] comes in and we don’t see anything interesting,” the source said. “Tomorrow we find out someone is communicating with a known terrorist. When you go back and look at the past data, there may be information that you missed. A pattern that was meaningless suddenly makes sense.”

Unlike the human brain, which sees patterns in holistic “Gestalt” moments of recognition, social-network analysis makes connections based upon proximity and frequency. Consequently, an innocuous phone call you made to your cousin in Portland, who happened to call a friend in Egypt, who called a friend in Saudi Arabia, who knows someone who has possible terrorist ties, could put you into a social network that makes you a potential suspect.
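To make that chain of calls concrete, here is a minimal sketch in Python of how “degrees of separation” could be counted from a list of call records. It is only an illustration, not a description of the NSA's actual software: the function name, the people, and the call list are all hypothetical, and it measures proximity alone, ignoring how frequently two parties talk.

```python
from collections import deque


def degrees_of_separation(calls, start, flagged):
    """Return the fewest call hops from `start` to anyone in `flagged`.

    `calls` is a list of (caller, callee) pairs. All names are hypothetical
    and exist only for illustration. Returns None when no chain exists.
    """
    if start in flagged:
        return 0

    # Treat every call as a two-way link between the parties.
    graph = {}
    for caller, callee in calls:
        graph.setdefault(caller, set()).add(callee)
        graph.setdefault(callee, set()).add(caller)

    # Breadth-first search reaches the nearest flagged person first.
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        if person in flagged:
            return hops
        for contact in graph.get(person, ()):
            if contact not in seen:
                seen.add(contact)
                queue.append((contact, hops + 1))
    return None


# The chain from the paragraph above, with made-up labels.
calls = [
    ("you", "cousin_in_portland"),
    ("cousin_in_portland", "friend_in_egypt"),
    ("friend_in_egypt", "friend_in_saudi_arabia"),
    ("friend_in_saudi_arabia", "possible_suspect"),
]

print(degrees_of_separation(calls, "you", {"possible_suspect"}))  # 4
```

Four hops is all it takes to link you to the suspect, and nothing in the arithmetic knows or cares that your own call was innocuous.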
I have been thinking about this potential for “mischief making” after playing around with a powerful new “data mining” tool that Google has just made available called “Google Trends.” Of course, Google has a massive database of Internet searches that can be mined in the same way as the NSA phone records. Google Trends is an analysis tool that allows you to see how often specific search terms are being entered into the Google search engine. Data can be sorted by time, language and geographic location. For example, if you do a search for “NSA,” you get a graph of search volume over time and a number of articles that you can access by clicking their links. This has the potential to be a very powerful research tool.


If you have a powerful tool, it is only natural that you will use it. Like the NSA social-network analysis, Google Trends shows you all sorts of interesting “patterns” and the potential for mischievous meaning making is somewhat overwhelming. For example, if you do a search for “pornography” and “top cities,” you get the following results:

So, sorting the volume of searches for the term “pornography” based upon cities, Salt Lake City ranked third behind two cities in India. I will leave it to you to “make meaning” of that particular pattern.

