The Ghost Files

Connelly had an idea: could he use data mining to infer what types of information were being left out of the public record? In theory, this seemed plausible, if he could compile enough materials to work with.

David J. Craig – The Ghost Files

This is what big data is good for. And, with all the talk of government collecting data on us, I think turnabout is more than fair play.

The Shortcomings of Becoming the App

When the original iPad was released in 2010, a single phrase by Adam Engst described the device perfectly: “The iPad becomes the app…” This transformational ability explains the heavy physicality of iOS before iOS 7. Skeuomorphic software design helps the device identify itself as the thing it is emulating, be it a book, a map, a game table, or an audio player. Whether this was actually necessary is a matter of debate, as is whether iOS 7 went too far in the other direction. The logic behind the skeuomorphic design is that familiarity reduces the learning curve, especially when your device is nothing but a blank surface that can become any interface.

Except that it can’t. Whatever the iPad, or any other touch-based digital device, tries to do, it’s still pictures under glass. It’s important to think about this when we consider how we relate to a device that can become anything, versus a physical object that is only one thing.

The digital object doesn’t so much have “a function” as a series of functions under an umbrella of one or two metafunctions… The association between object and function that was often one-to-one has become multiplied, perhaps receding into infinity…

Navneet Alang – “Calculating the Weight of the Object” – Snarkmarket

Like skeuomorphic software design, the tension between physicality and multiple functions puts the cultural divide around eBooks and physical books into perspective. Even a dedicated eBook reader, like the e-ink Amazon Kindle, has other functions beyond reading books. A book doesn’t let you browse the web, look up definitions of words without leaving the text, or play word games. Those limitations are encoded in its very form. There is precious little you can do with a book and have it still be a book. Door stops and leveling out extremely wobbly tables come to mind. Those in the physical book camp have internalized this disconnect and summarized it as “I like the physicality of a real book.”

We relate to single-purpose objects in a very different way, and it’s a relationship defined by simplicity. These objects represent a singularity of purpose, and their function is embodied in their form. When we reach for a physical book over an iPad or a Kindle, we are making a deliberate restriction of our possible choices. Even if we would have been reaching for that iPad or Kindle so we could read something on it, the physical book is a commitment to just reading. When a writer uses a mechanical typewriter to bang out their first draft, they make a similar commitment. While it’s possible to use a multi-functional digital device for a single task, it takes a lot more willpower not to switch modes. It’s even harder when the device is set up to bother you whenever something else demands your attention.

Believers in the idea that physical media will go away and that print will vanish ignore the value in the simple relationship between a thing and its purpose. Believers in the idea that digital media is a fad, and that multi-function devices are more trouble than they’re worth, are giving up a wealth of power and new tools that can be used to create previously impossible new ideas. There’s room in our lives for multi-purpose devices and the single-purpose objects they purport to supplant. Those of us living in this transition have the opportunity to not only find the balance that works for us, but also define this balance for future generations.

If you’re old enough to remember life before we could read a thousand-page book on a device that weighs only a pound (or less), it’s hard to imagine a world where you never experience a single-purpose, non-digital device. I don’t expect this will come to pass, but I do expect that multi-function digital devices will be the primary way people born today will interact with media. We may save the physical stuff for things we want a special connection with. A part of the human psyche demands to have tangibility, and it’s best we don’t forget it.

The Genre Tag Problem

Those who listen to Crush On Radio may know that I am somewhat meticulous about the metadata in my iTunes library. I make sure dates, artist names, and song titles are 100% correct. I try to get album art that’s at least 500×500 pixels. For titles in non-Roman writing systems, I make sure to use the correct text. I use sort tags so that artists are sorted last name first (e.g., David Bowie is under “B”). Even when I buy albums from iTunes, I still find myself tweaking the metadata just to get everything how I like it. The only part of my library that’s out of whack is my genre tags, and I doubt I’m the only one.

Part of this is a weakness of the ID3 metadata specification. The original specification limited you to a choice of only 80 pre-programmed genres, with an additional 46 in a later revision. Some of those pre-assigned genres weren’t even ones you’d be likely to use, except as a joke. ID3v2 changed the genre tag from a numeric value to a free-form text field, which is a blessing and a curse because of the second part of the problem. That is, music genres are exceedingly difficult to pin down.
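To make the numeric genre concrete, here is a minimal sketch in Python, assuming the commonly documented ID3v1 layout: a 128-byte block that starts with "TAG" and ends with a single genre byte. The function names and the five-entry genre table are illustrative only, not part of any library, and the table is a small excerpt of the original 80-entry list.

```python
# A few entries from the original 80-genre ID3v1 list (excerpt only).
ID3V1_GENRES = {0: "Blues", 9: "Metal", 13: "Pop", 17: "Rock", 20: "Alternative"}


def build_id3v1_tag(title, artist, genre_index):
    """Pack a minimal 128-byte ID3v1 tag (hypothetical helper for illustration)."""
    def field(text, size):
        # Fixed-width, null-padded Latin-1 field.
        return text.encode("latin-1")[:size].ljust(size, b"\x00")

    return (
        b"TAG"
        + field(title, 30)
        + field(artist, 30)
        + field("", 30)   # album
        + field("", 4)    # year
        + field("", 30)   # comment
        + bytes([genre_index])  # the entire genre "field" is this one byte
    )


def read_id3v1_genre(tag):
    """Return (index, name) for the genre byte of an ID3v1 tag."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        raise ValueError("not an ID3v1 tag")
    index = tag[127]
    return index, ID3V1_GENRES.get(index, "unknown")


tag = build_id3v1_tag("Heroes", "David Bowie", 17)
print(read_id3v1_genre(tag))
```

With one byte to work with, there is no room for anything the list's authors didn't anticipate, which is the limitation the free-form text field in ID3v2 was meant to lift.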

Ask any passionate music fan about their favorite genre or genres of music, and you’ll be in for a graduate-level course in their passion. And that’s before you dive into the various subgenres of music, in details that would overwhelm any sort of systematic organization in a store. For a passionate music fan, it’s not nearly as simple. Just look at this list of subgenres for Heavy Metal, itself a subgenre of Rock music. And forget about using the iTunes genre images if you get really specific. And, if it’s not easily classifiable, odds are, in the iTunes Store it’s classified as “Alternative”. This is a limitation by design.

The latest version of the ID3v2 specification allows for any free text field, including Genre, to contain multiple values. However, player support for multiple genres is non-existent. Any solution is unlikely to come from the top down. Neither Apple nor the record labels are going to put a lot of thought into a detailed classification system, largely because it doesn’t affect how they do business. Even the MusicBrainz database doesn’t bother with Genre tags. It’s just up to us as music fans to decide what genre criteria we want to use, or if we even want to bother.
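As a rough illustration of what multiple values look like on disk, here is a small Python sketch. It assumes the ID3v2.4 convention that a text frame body is one encoding byte followed by strings separated by a null terminator, and it only handles the two single-byte-terminator encodings (ISO-8859-1 and UTF-8). The function name is hypothetical, and real files would need a proper tag parser around it.

```python
# Encoding byte -> Python codec. UTF-16 variants are left out of this sketch.
SIMPLE_ENCODINGS = {0: "latin-1", 3: "utf-8"}


def split_genre_frame(body):
    """Split an ID3v2.4-style text frame body into its individual values."""
    if not body:
        return []
    codec = SIMPLE_ENCODINGS.get(body[0])
    if codec is None:
        raise ValueError("encoding not handled in this sketch")
    text = body[1:].decode(codec)
    # Values are separated by a null character; drop any empty trailing piece.
    return [value for value in text.split("\x00") if value]


frame_body = b"\x03" + "Heavy Metal\x00Doom Metal".encode("utf-8")
print(split_genre_frame(frame_body))
```

A player that reads the field as a single string would see only one run-together value here, which is roughly what "non-existent" support means in practice.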

Towards Good News

A few months ago, I had the pleasure of attending a lecture by Carlos Castillo of the Qatar Computing Research Institute on news and social media. I’d come under the impression that it would be about using social media to discover and generate news, but it was more interesting than that. Mr. Castillo’s presentation and research occupy an intersection between Computational Linguistics, social media, and news. He uses social media signals to identify the lifecycle of a news article, and its relevance to an audience.

It left me wondering how we can use the signals in both computational language analysis and social media analysis to keep people better informed. I don’t mean this in terms of volume of information, but accuracy of information. Right now, I can see the insights of Mr. Castillo’s work being mis-applied to increase the reach—and ad views—of a story rather than promote real journalism. To put it another way:

Truth has never been an essential ingredient of viral content on the Internet. But in the stepped-up competition for readers, digital news sites are increasingly blurring the line between fact and fiction, and saying that it is all part of doing business in the rough-and-tumble world of online journalism.

— If A Story is Viral, Truth May Be Taking a Beating

A number of the fake news stories in the New York Times piece I just quoted and linked to are “soft news” at best. They succeed in their mission of getting attention, measured in tweets, shares, likes, and the all-important Page View, but they are not journalism in any legitimate sense.

But there is a desire among people to read real news. In Mr. Castillo’s lecture, he noted that long-form journalistic pieces have a long timeframe of relevance and traffic, up to a week, while breaking stories tend to have a lifespan of about nine hours with an intense first hour. For both, the amount of traffic and social media attention a news piece gets in the first hour is the best indicator of its relevance. QCRI’s demo site provides a good visual explanation. Green bars show the predicted page views for an Al-Jazeera news article based on existing traffic and social media signals. The articles are primarily hard news, as that’s the bulk of what Al-Jazeera produces, but the source doesn’t matter. The same algorithm would work for Huffington Post, MSNBC, Buzzfeed, or Fox News.

While people will share and click for hard news and soft news alike, soft news has the risk of spreading misinformation. This can’t be good for society, right? Well, to quote a quote, “Even if it’s fake, it’s real.” Is there value in the fake news, engineered for pure virality, to spawn discussion? Potentially. I haven’t seen much discussion around viral news stories except for people complaining about the viral news stories in their feed. This could just be a function of the online circles I travel in: jaded and cynical tech people, often former news junkies themselves. [1]

The New Yorker recently published a piece trying to determine what stories go viral, and why. At the risk of spoiling the article, Aristotle may have had the answer already. “The answer, he argued, was three principles: ethos, pathos, and logos. Content should have an ethical appeal, an emotional appeal, or a logical appeal. A rhetorician strong on all three was likely to leave behind a persuaded audience.” I’m not so sure about persuasion in the Internet age. It’s entirely possible to live online in an echo chamber of voices that are similar enough to yours that almost nothing counter to your worldview can permeate.

It’s that “almost” that makes things interesting. If viral news stories have a spread that can transcend, or at least bypass, the social filters in our online lives, and they can spawn constructive discussion, we may be on to something. In the technology world, The Verge’s Fanboys piece is extremely viral, and the discussion surrounding it constructive. The Verge could be creating a template for a story that forces people to think about a contentious issue, and if it gets even one obnoxious online “fanboy” to think about their loyalties and behavior with a little more nuance, it’s a win.

What is clear is that virality cannot be forced, but it can be engineered. Fanboys may not be as engineered for virality as the stories on Upworthy and Buzzfeed, though the clever layout tricks it employs show that a lot of thought was put into how people will see it. This brings me full circle, to the research from QCRI and Carlos Castillo. Predictive analytics can be a valuable tool for an editorial team focused on elevating discussion and making an impact, should they want to engineer a story that can go viral and spawn real discussion. The cynic in me, however, expects it to only drive pageviews and increase ad revenue. There’s no reason it has to be either one or the other, though.

Either way, you’ll still get jaded former news junkies complaining. Just maybe not as many.

[1] I’ve mostly weaned myself off of trying to keep up with the news. I get NPR’s 7AM morning news update for the real world, 5by5’s The News podcast for the tech stuff, and I figure I’ll hear about anything important I miss through other channels. I get all the news I need in about ten minutes per day.

“On Demand” and The Stream

There used to be a dream that all your entertainment would come to you on demand. Instead of needing to be at a certain place at a certain time to catch whatever monocultural touchstone was being broadcast, anything we would want could be a button push, or ten, away. This dream has come to fruition for anything that would have been broadcast in the past. Only sports are exempt. Parallel to the rise of on demand media, a new form of media that cannot be simply stored and caught up on has evolved. These are the streams, and you either keep up with them as they come in, or you accept that you’ll never catch up with what you missed. Or both.

Both old broadcast media and our new streams make demands on us. At least broadcast media’s demands were concrete in time and space. If you weren’t home, and you didn’t set up your VCR or DVR, you missed what happened, and that was it. Streams are in our pockets, inescapable wherever there’s a signal to our phones. We don’t want to miss a moment, so we’re always pulling out our phones, distracting ourselves from whatever we’re doing, just to catch up. If certain companies have their way, our streams will be on and in our faces as well. People already get hit by cars while checking their streams on phones. Face-mounted streams won’t be much of an improvement.

Even worse is when our streams actively demand our attention. They make our phones buzz and beep with each new activity. Another Pavlovian stimulus to deal with, the sound and sensation are our trigger to salivate and check our streams. We can turn the notifications off, but apps for streams come with the alerts turned on right out of the App Store. When the optimal is not the default, the default wins out for almost everything. Changing settings is a power-user move.

Of course, streams are in their infancy. We are still learning how to handle them: figuring out who is worth following, when is right to check, and what is right to say. Even as we learn, however, they’re still updating. Radio and television stations, in their infancy, at least had the courtesy to sign off at the end of the day. Though now they’re on the air constantly, there’s no need to stay up all hours just for one program when you can watch it at your leisure the next day. Streams never sleep. The operators of streams can have algorithms drop fresh new content into your stream at any hour of the day, multiple times a day. You never have to be without something new to see, and you never are.

We all could benefit from thinking about the streams we let into our lives and what we let into our streams. How up-to-date do we need to be about the things people do and say? What do we truly need to be informed about? The nature of a stream precludes being truly “on demand,” but judicious pruning of what we allow in can make it easier if we want to bother with catching up. There are a fixed number of hours in our day. We all could be more judicious about what we let consume our time, and when. And when it all gets too overwhelming, broadcast media and streams alike have an off switch we shouldn’t be afraid to press.

Sanspoint.

Essays on Technology and Culture