Language detectives say the key clues to who wrote the anonymous New York Times opinion piece slamming President Donald Trump may not be the odd and glimmering “lodestar,” but the itty-bitty words that people usually read right over: “I,” “of” and “but.”
And lodestar? That could be a red herring meant to throw sleuths off track, some experts say.
Experts use a combination of language use, statistics and computer science to help figure out who wrote documents that are anonymous or possibly plagiarized. They’ve even solved crimes and historical mysteries that way. Some call the field forensic linguistics; others call it stylometry or simply “author attribution.”
The field is suddenly at center stage after an unidentified “senior administration official” wrote in the Times’ opinion pages that he or she was part of a “resistance” movement working from within the administration to curb Trump’s perceived dangerous impulses.
One political scientist figures there are about 50 people in the Trump administration who fit the Times’ description as a senior administration official and could be the author. The key would be to look at how they write, the words they use, what words they put next to each other, spelling, punctuation and even tenses, experts say.
“Language is a set of choices. What to say, how to say and when to say it,” Duquesne University computer and language scientist Patrick Juola said. “And there’s a lot of different options.”
One of the favorite techniques of Juola and other experts is to look at what are called “function words.” These are words people use all the time but that are hard to define because they provide function more than meaning. Some examples are “of,” “with,” “the,” “a,” “over” and “and.”
“We all use them but we don’t use them in the same way,” Juola says. “We don’t use them in the same frequency.” Same goes with apostrophes and other punctuation.
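To make the idea concrete, here is a minimal sketch of a function-word comparison. It is not the method Juola or any other expert quoted here actually uses. The word list is just the six examples named above, and the function names, the simple distance measure and the `samples` dictionary of known writing are all assumptions made for illustration.

```python
import re
from collections import Counter

# Illustrative list only: the six example function words named in the article.
FUNCTION_WORDS = ["of", "with", "the", "a", "over", "and"]

def function_word_profile(text, words=FUNCTION_WORDS):
    """Return each function word's share of all word tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return {w: counts[w] / total for w in words}

def profile_distance(p, q):
    """Sum of absolute differences between two profiles (smaller means more alike)."""
    return sum(abs(p[w] - q[w]) for w in p)

def closest_author(unknown_text, samples):
    """Pick the hypothetical author whose sample profile is nearest the unknown text.

    `samples` maps an author label to a known writing sample (assumed input).
    """
    target = function_word_profile(unknown_text)
    return min(
        samples,
        key=lambda name: profile_distance(target, function_word_profile(samples[name])),
    )
```

A real analysis would likely use hundreds of function words, punctuation habits and much longer samples, along with a statistical test of how confident the match is. This sketch only shows the basic shape: turn each text into a frequency profile, then see which known profile sits closest.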
For example, do you say “different from” or “different than”? Women tend to use first- and second-person pronouns more — “I,” “me” and “you” — and more present tense, Illinois Institute of Technology computer science and data expert Shlomo Argamon said; men use “the,” “of,” “this” and “that” more often.
“You look for clues and you try to assess the usefulness of those clues,” Argamon said. But he is less optimistic that the Trump opinion piece case will be cracked, for several reasons. Those include the New York Times’ editing for style and possible efforts to fool language detectives by planting words that someone else likes to use, such as “lodestar.” Mostly, he’s pessimistic because a proper comparison requires gathering samples from all the suspects, and those samples have to be of a similar type: all opinion columns, for example, rather than novels, speeches or magazine stories.
Still, this would be far from the first time language could finger a culprit. The Unabomber’s brother identified him because of his distinctive writing style. Field pioneers helped find a kidnapper who used the unique term “devil strip” for the grassy area between the sidewalk and road. That phrase is only used in parts of Ohio.
Literary sleuthing goes back to the founding of the republic. Historians had a hard time figuring out which specific Federalist Papers were written by Alexander Hamilton and which were by James Madison. A 1963 statistical analysis figured it out: One of the many clues came down to usage of the words “while” and “whilst.” Madison used “whilst”; Hamilton preferred “while.”
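The “while” versus “whilst” clue can be sketched as a simple rate count. This is only an illustration under stated assumptions, not a reconstruction of the 1963 analysis, which relied on many clues. The function name `marker_rates` and the per-1,000-words scale are choices made here for the example.

```python
import re

def marker_rates(text, markers=("while", "whilst")):
    """Return occurrences of each marker word per 1,000 word tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return {m: 1000 * tokens.count(m) / total for m in markers}
```

Run on an essay of disputed authorship, a rate that leans heavily toward “whilst” would point toward Madison and one that leans toward “while” would point toward Hamilton, though a single marker is weak evidence on its own.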
“The science is very good,” Juola said. “It’s not quite DNA. It’s actually considered by some scientists to be the second-most accurate form of forensic identification we have because it is so good.”
On Thursday, the president urged the Times to publish the author’s name “for the sake of our national security.” First lady Melania Trump said the author is “sabotaging” the country through “cowardly actions.” And several top officials issued direct denials, including Secretary of State Mike Pompeo: “It’s not mine.”