There is a conference on voice biometrics in New York City April 3-4. Will any media be there?
Jeralyn Merritt, the Talk Left defense attorney, often finds herself in odd positions during the media meltdowns involving politically fashionable defendants. While Dirty Harry conservatives are normally the ones who want to avoid being overly solicitous of defendants' rights, sometimes (Duke Lacrosse, Kobe Bryant) roles shift. In any case, if I were in a jam I would hire her in a second; we could talk politics later.
Today she has a post that is supportive of facts and the law, but not so much of the pro-Martin crowd. She highlights a conference on Voice Biometrics opening in New York City on April 3-4, wonders whether anyone's media budget can afford the $699 registration fee, and looks at the legal standards for admissibility of the 911 Scream evidence in the Trayvon Martin killing.
SEND IN THE EXPERTS:
Let me aggregate the links I have found to various firms that provide expert witness services in audio forensics, as well as excerpts of the discussion they offer on its limitations. Do keep in mind that the Orlando Sentinel had available as evidence (a) the background screams on the gunshot 911 call; (b) George Zimmerman's 911 call (spoken voice); and (c) nothing at all for Trayvon Martin; down the road, that may change, just as Martin's voice surely changed with the years.
We can start with Tom Owen, the lead expert in the Orlando Sentinel piece.
The examiner can only work with speech samples which are the same as the text of the unknown recording. Under the best of circumstances the suspects will repeat, several times, the text of the recording of the unknown speaker and these words will be recorded in a similar manner to the recording of the unknown speaker. For example, if the recording of the unknown speaker was a bomb threat made to a recorded telephone line then each of the suspects would repeat the threat, word for word, to a recorded telephone line. This will provide the examiner with not only the same speech sounds for comparison but also with valuable information about the way each speech sound completes the transition to the next sound.
There are those times when a voice sample must be obtained without the knowledge of the suspect. It is possible to make an identification from a surreptitious recording but the amount of speech necessary to do the comparison is usually much greater. If the suspect is being engaged in conversation for the purpose of obtaining a voice sample, the conversation must be manipulated in such a way so as to have the suspect repeat as many of the words and phrases found in the text of the unknown recording as possible.
The worst exemplar recordings with which an examiner must work are those of random speech. It is necessary to obtain a large sample of speech to improve the chances of obtaining a sufficient amount of comparable speech.
Yet he told the Orlando Sentinel he could get a meaningful comparison of screams to speech. Ms. Merritt notes his financial stake in this - he just rolled out his new voice matching software to police departments everywhere. Normally the media loves this sort of conflict of interest story - time will tell.
The second Orlando Sentinel expert was Mr. Primeau, who was quoted as follows:
"I believe that's Trayvon Martin in the background, without a doubt," Primeau says, stressing that the tone of the voice is a giveaway. "That's a young man screaming."
Yet he admits to never having heard a tape of Martin's voice. For all he or I know, Martin would bring down the house on Saturday night doing his impression of Barry White, yet he is sure that voice is Martin. Extraordinary. Not admissible, but extraordinary. Let's go to his website:
4. When conducting voice identification, it is important to create an exemplar of the accused for audio comparison using as exact conditions and equipment as close as possible to the measurements taken from the evidence as outlined above. The speech must be the same as the speech on the evidence in order for the testing to be accurate. As an audio forensic expert, I often have to coach the accused into the same energetic voice tone and inflection as the evidence recording. However, it is still possible to compare speech if the exemplar is not as close to the evidence as I would like.
So far it seems as if the defense counsel for Zimmerman could hire either of these guys to disqualify their own story to the Orlando Sentinel.
Another expert - Stutchman Forensic Laboratory, Advocate for Evidence Since 1992:
It is recommended that the exemplar of the known voice must be collected in as close to the same manor as the recording of the unknown voice was recorded. For example, if the recording of the unknown voice was recorded over the phone, the exemplar of the known voice should be collected over the phone, etc. When the exemplar is collected, the suspect is asked by the examiner to stay the same words in the same way as they were spoken by the unknown person. In other words, in a normal, natural voice.
The spectrographic voice identification analysis has two steps. The sound of speech is first transformed into a three dimensional (time - frequency - volume) graphic pictures which do reveal numerous acoustical features of an individual’s voice. The second step involves the pattern comparison of the same phrases/sentences from the unknown sample and the suspect’s sample. The results of analysis are expressed as:
- Probably the same speaker (high level of confidence).
- Possibly the same speaker (intermediate level of confidence).
- Inconclusive (due to the insufficient number of comparison words, poor quality of recordings, too high variability of the voice, possible disguise).
- Possibly not same speaker (intermediate level of confidence).
- Probably not the same speaker (high level of confidence).
The results depend on quality of recordings, the total number of comparison words, speakers’ condition, and individual speakers’ voice variability. There is a requirement for a minimum number of 20 comparison words in a ‘connected speech’. The suspect should provide the comparison sample by reading three times the transcript of the unknown voice sample.
That is four experts, all on the same side of the issue. And (trust me or not!) I am not cherry-picking here - finding more firms that describe their methods and requirements at their website should be possible, but I hoovered up everyone I could find (hence, Canada). We welcome more experts!
Now, Jacob Sullum wondered whether an aggressive prosecutor could, in order to establish probable cause or secure an indictment, use evidence he knew/suspected would not be admissible in court. IANAL, but extensive review of Law & Order as well as comments from people who *are* lawyers reminds me that an ethical prosecutor won't use evidence he knows can't be used at trial.
What the sanctions are, and what constitutes "knowledge of inadmissibility" in the case of this sort of audio evidence I do not know. For example, would an indictment that relied on audio evidence later found to be inadmissible be dismissed? Would the prosecutor be sanctioned if he could not convince a judge it was a good faith mistake? I don't know.
But my *GUESS* is that neither Federal nor State prosecutors will be able to find credible experts to green light this evidence. So no audio ex machina for the prosecution.
UPDATE: CNN talks to some experts, including Mr. Stutchman, linked above:
And standards set by the American Board of Recorded Evidence indicate "there must be at least 10 comparable words between two voice samples to reach a minimal decision criteria." While Zimmerman says more than that many words on his 911 call, the only one heard on the second is a cry for "help."
But that board's current chairman Gregg Stutchman -- who described Owens as a friend and well-respected in their field -- said that exact metric doesn't necessarily apply to the software Owens used.
David Faigman, a professor of law at the University of California-Hastings and an expert on the admissibility of scientific evidence, said courts and the overall scientific community have mixed opinions about the reliability of such "voiceprint" analysis.
Because one goal in the Martin case might be ruling out Zimmerman as the source of the screams, rather than precisely identifying who actually was yelling, it could lower the bar for getting such evidence into court, he said.
"I have no audio anywhere of Zimmerman screaming but I am certain that couldn't be Zimmerman". Really? That seems roughly as dubious as saying it is Zimmerman. Well, they will have to show the science, and I bet they can't.
Still, he said, it wouldn't be too hard for Zimmerman's attorneys to find an audio expert to offer an opposing opinion.
"These expert witnesses come out of the woodwork when money is concerned," he said.
Hmm - right now the only guy with an obvious financial stake is Tom Owen. Too bad CNN missed that. Kidding - it's a good thing CNN missed that or they would have dropped that quote. Here is CNN failing to follow the money:
He cited software that is widely used in Europe and has become recently accepted in the United States that examines characteristics like pitch and the space between spoken words to analyze voices....
A bit more from Mr. Stutchman:
Stutchman acknowledged there are some "quacks" who pass themselves off as experts, but insisted that certified audio practitioners like Owens can be effective. In fact, he said such experts could analyze if the screaming voice on the 911 call is that of Martin -- assuming they can get a sample of him speaking, perhaps from a voice mail message.
And after they get a no-match, citing technical limitations, cell phone compression, or the sorts of obstacles described at Mr. Stutchman's website, where does the court go? The screaming is neither Zimmerman nor Martin, so let's start looking for a third mystery witness?
Back to Mr. Owen:
Using it, he found a 48% likelihood the voice is Zimmerman's. At least 60% is necessary to feel confident two samples are from the same source, he told CNN on Monday -- meaning it's unlikely it was Zimmerman who can be heard yelling.
"Unlikely?" What happened to "reasonable scientific certainty" in the Orlando Sentinel?
The software compared that audio to Zimmerman's voice. It returned a 48 percent match. Owen said to reach a positive match with audio of this quality, he'd expect higher than 90 percent.
"As a result of that, you can say with reasonable scientific certainty that it's not Zimmerman," Owen says, stressing that he cannot confirm the voice as Trayvon's, because he didn't have a sample of the teen's voice to compare.
Huh? The poor match means due to some combination of circumstances, such as poor or compressed audio and badly matched voice samples, and reality, such as "It wasn't Zimmerman", he can't claim a positive match. To then say that all the uncertainty is resolved by concluding it is not Zimmerman and giving the technical side a pass is absurd. As to 60% versus 90%, what is the science behind that?
My *GUESS* is that he is saying that in his professional experience, the bad audio quality, small sample and "Screams versus speech" problem should have knocked a perfect match down to, hmm, 90%, if in fact it was Zimmerman screaming. Or 60%. Either way, higher than 48%, so it must not be Zimmerman. Keep in mind that with two different tapes of Richard Nixon he got an 86% match, thereby proving with reasonable scientific certainty that there was a New Nixon.
In any case, we are not talking about how good a sample must be to provide a positive match, i.e., ruling someone in. We are talking about how bad a sample can be and still provide a negative match, i.e., exclude someone.
In theory that is not un-doable - Mr. Owen could get audio of himself and a few neighbors screaming "Help" in pain and fear, get spoken audio from everyone, and see what sorts of matches he gets. If he consistently gets a 70% match from the Known Screamer and 40% matches from the wrong screamers, he has the foundation for a good journal article. But if that article hasn't already been written, let's not expect a judge to allow this.
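To make the idea concrete, here is a minimal sketch of how such a validation study might be tallied. Everything in it is an assumption for illustration: the function name, the "exclude if the score falls below a cutoff" rule, and especially the scores, which are invented and do not come from Mr. Owen's software or any real test.

```python
def exclusion_error_rates(same_speaker_scores, different_speaker_scores, cutoff):
    """Tally how often an 'exclude if score < cutoff' rule goes wrong.

    Returns (false_exclusion_rate, missed_exclusion_rate):
      - false exclusion: the true screamer scores below the cutoff
      - missed exclusion: a wrong screamer scores at or above the cutoff
    """
    false_exclusions = sum(score < cutoff for score in same_speaker_scores)
    missed_exclusions = sum(score >= cutoff for score in different_speaker_scores)
    return (
        false_exclusions / len(same_speaker_scores),
        missed_exclusions / len(different_speaker_scores),
    )


# Hypothetical scream-versus-speech match percentages (made up for illustration).
known_screamer_scores = [72, 68, 48, 70, 66]   # scream and speech from the same person
wrong_screamer_scores = [41, 38, 45, 62, 36]   # scream and speech from different people

false_excl, missed_excl = exclusion_error_rates(
    known_screamer_scores, wrong_screamer_scores, cutoff=60
)
print(f"True screamer wrongly excluded: {false_excl:.0%}")
print(f"Wrong screamer not excluded:    {missed_excl:.0%}")
```

With these invented numbers, one of the five true-screamer trials comes in at 48%, so a 60% cutoff would wrongly exclude the right person a fifth of the time. That is the kind of error rate a judge would presumably want to see measured on real screams before anyone says "reasonable scientific certainty".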
MAINTAINING STANDARDS: Ms. Merritt explains the 'Daubert' standard for the admissibility of technical evidence in Federal courts. We have been advised by Sue that... well, here it is:
I just looked it up, Florida is one of the states that uses Frye not Daubert. Either way, this guy probably wouldn't pass as an expert.
This Hahvahd mahn says Florida is a Frye state for purposes of a state trial. Would that apply to federal charges brought in Florida? I can't afford the lawyers to sort this out, but one presumes a uniformity of Federal standards, which means Daubert would be it for the Feds.
LOOKING FOR LONGSHOTS: Any chance Trayvon Martin sang in a church or school choir (being a choirboy and all)? If he was a bass, or a soprano, that might be contemporary and objective evidence of his voice timbre, as opposed to audio from a few years ago, for example. That is an easy question for his family to answer, although I doubt Ben Crump will think it serves the family's interests to stand in front of a mike and tell us that Trayvon Martin was a basso profundo.
I DEPLORE HER PESSIMISTIC CONCLUSION: Ms. Merritt on the upcoming expert conference:
In the meantime, maybe the case will be brought up at the biometrics conference tomorrow and there will be some tweeting about it.
Tweeting? I want to see interviews on the mean streets of the greatest city in the world. LIVE! I know what I smell, but these newsies ought to smell a story.
YEAH, YEAH: Obviously, if an audio expert opined that his enhancement of a cryptic audio confirmed that Iran was going nuclear we would see plenty of lefties doubting his technology and techniques. But today they are in a "Trust The Expert" mode. Today. And who among us has a problem with faith-based communities?