Data and Evidence, Brick by Brick

Every weekday morning, I drive my kids to school. I often see an officer in a patrol car parked in the fire station parking lot. I never see the car pull someone over. I've been thinking about what that officer is doing.

As I understand it, there is technology that lets a patrol car quickly compile a list of passing license plates. But I imagine a near future where that list is compared against the data from every prior day, and outliers are identified--a brown Buick NEVER drives past this station at 7am--better flag that car. And while I have the urge to go crazy over privacy rights and other civil liberties issues and OH NO! MINORITY REPORT!, I also recognize that the data collected through CompStat helped turn the NYPD into a model for cutting crime in an urban area.
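To make that day-over-day comparison concrete, here is a minimal sketch in Python. I have no idea how any real plate-reader system works, so everything here is my own invention: the function name, the (day, hour, plate) record shape, and the sample plates are all hypothetical.

```python
from collections import defaultdict

def flag_unfamiliar_plates(history, today, min_days_seen=1):
    """Return today's (hour, plate) sightings that appeared at that same
    hour on fewer than `min_days_seen` prior days.

    history: iterable of (day, hour, plate) tuples from earlier days
    today:   iterable of (hour, plate) tuples from this morning
    """
    # Count the distinct prior days on which each plate passed at each hour.
    days_seen = defaultdict(set)
    for day, hour, plate in history:
        days_seen[(hour, plate)].add(day)

    flagged = []
    for hour, plate in today:
        if len(days_seen.get((hour, plate), ())) < min_days_seen:
            flagged.append((hour, plate))
    return flagged

# Made-up data: the same car passes at 7am on two earlier days.
history = [
    ("day1", 7, "ABC123"),
    ("day2", 7, "ABC123"),
    ("day2", 7, "XYZ789"),
]
today = [(7, "ABC123"), (7, "BUICK1")]

print(flag_unfamiliar_plates(history, today))
```

With this made-up data, only the never-before-seen plate is flagged. Raising `min_days_seen` makes the check stricter, which also means more ordinary drivers get flagged for nothing more than a change of routine.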

But the real wizardry of data collection happens when different types of data are combined to get really specific. For instance, what if the comings and goings of those license plates (along with make and model) were matched against the schedules of local schools and businesses? Then the police could really get an understanding of what a person's daily life is like. I imagine this is how an identity thief or a sophisticated burglar works, too.
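Combining the two kinds of data could be as simple as a lookup. This is a rough sketch under my own assumptions, not a description of any real system: the function, the one-hour window, and the plates and places are all made up for illustration.

```python
def match_schedules(usual_hour_by_plate, opening_hour_by_place, window=1):
    """Pair each plate with the places that open within `window` hours
    after the hour the plate is usually seen."""
    matches = {}
    for plate, hour in usual_hour_by_plate.items():
        matches[plate] = sorted(
            place
            for place, opens in opening_hour_by_place.items()
            if 0 <= opens - hour <= window
        )
    return matches

# Hypothetical inputs: when each plate usually passes, and when places open.
usual_hour_by_plate = {"ABC123": 7, "XYZ789": 16}
opening_hour_by_place = {
    "Elementary school": 8,
    "Hardware store": 9,
    "Night diner": 17,
}

print(match_schedules(usual_hour_by_plate, opening_hour_by_place))
```

Here the 7am car gets paired with the school that opens at 8, which is a guess about a person's life built from nothing more than two timetables.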

Every day, we hear of real cases in civil and criminal court in which data leads people to "reach conclusions." Not so long ago, a woman sued Netflix, claiming that the viewing data it released for its movie recommendation contest could be used to out her. Just a few weeks ago, the Amazon Echo factored into a murder case. Police, suspecting that the device's microphones picked up crucial evidence, demanded access to anything recorded. But the case doesn't end there.

When a case is built against a defendant in a criminal or civil suit, there is rarely a "smoking gun." Instead, a plaintiff's attorney or a prosecutor must assiduously build--brick by brick--a narrative from various pieces of evidence. Data makes compelling evidence. In the Amazon Echo murder case, there was another key piece of data that may make the story feel more complete. The suspect, apparently a fan of the Internet of Things, also had a smart water meter, which recorded an enormous amount of water use. The conclusion that may be reached is that the homeowner washed away a lot of evidence.

Data collection provides a startling amount of new evidence to sift through and link together to build persuasive arguments. And yet... what if all this data just makes it easier to lend credence to junk science? What if people's minds wander too much and reach fanciful conclusions? What if the police officer parked in the fire station parking lot just wants to drink a cup of coffee and read the paper before starting the midday shift?