CHICAGO – Connecting the dots is difficult, but for homeland security agents, the real trick is figuring out where the dots are and which ones need connecting.

That analogy may be at the core of the federal government’s interest in keeping tabs on all the telephone calls Americans make to each other every day.

Government agents reportedly hope that computers can sift through the mountains of phone data to extract nuggets of information revealing terrorist plotters.

Only within the past decade has a subset of computer science called link mining even become available to attempt such a daunting task, though some researchers believe that even the most powerful computers will never deliver the answers that the government seeks.

Congressional leaders were demanding answers from the Bush administration Thursday about a specific type of connecting-the-dots activity: whether the National Security Agency had collected extensive phone call records from America’s three largest telecommunications carriers, and whether the privacy rights of individuals had been violated.

Behind those questions is the arcane science of using super-powerful computers to mine data of all types for information.

“It’s a massive data problem, but you can do it,” said Kris Hammond, Northwestern University professor of electrical engineering and computer science. “If it were impossible to get specific answers to specific questions from a huge database, Google couldn’t exist.”

The likelihood of success, Hammond said, is higher if agents have specific questions, such as what mobile phones in Washington, D.C., made calls to Tehran during a given period, and whether calls were made from those phones to San Francisco during another period. For that query, a database could be quite useful.
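To make Hammond's example concrete, here is a minimal sketch of such a two-part query in Python. Everything in it is hypothetical and chosen only for illustration: the record layout, the phone numbers, the date ranges and the function names. Nothing here is known about how any agency stores or searches call records.

```python
from datetime import datetime

# Hypothetical call records: (caller, caller_city, callee_city, timestamp).
CALLS = [
    ("202-555-0101", "Washington", "Tehran", datetime(2006, 3, 2, 14, 0)),
    ("202-555-0101", "Washington", "San Francisco", datetime(2006, 4, 10, 9, 30)),
    ("202-555-0102", "Washington", "Tehran", datetime(2006, 3, 5, 11, 0)),
    ("202-555-0103", "Washington", "San Francisco", datetime(2006, 4, 12, 16, 0)),
    ("312-555-0104", "Chicago", "Tehran", datetime(2006, 3, 7, 8, 0)),
]


def callers(calls, origin, destination, start, end):
    """Return the set of phones in `origin` that called `destination`
    at some time in the half-open interval [start, end)."""
    return {
        caller
        for caller, city, dest, when in calls
        if city == origin and dest == destination and start <= when < end
    }


def two_stage_query(calls):
    """Phones in Washington that called Tehran in March 2006 and
    also called San Francisco in April 2006 (assumed periods)."""
    first = callers(calls, "Washington", "Tehran",
                    datetime(2006, 3, 1), datetime(2006, 4, 1))
    second = callers(calls, "Washington", "San Francisco",
                     datetime(2006, 4, 1), datetime(2006, 5, 1))
    return first & second


print(two_stage_query(CALLS))
```

The point of the sketch is that a narrowly framed question reduces to two filters and a set intersection, which is the kind of work a database does well.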

But if security agents don’t know what they’re looking for, they can’t expect a data mining program to connect all the dots for them.

“If you approach the data without specific questions and just look for patterns, you can find hundreds of millions of patterns,” Hammond said. “But only a few would be significant. Your chances of finding anything are nil.”

Despite advances in artificial intelligence, computers aren’t like human detectives who can make inferences and shift assumptions on the fly, said Yali Amit, a University of Chicago professor of statistics and computer science.

Federal security agents may not understand this limitation, he said.

“They have records from millions of innocent people and perhaps a few thousand terrorists who might make phone calls,” said Amit. “The size of the data set of interest – the terrorists – is too small. You get reliability rates that make the whole endeavor pretty ridiculous.”
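Amit's objection can be illustrated with some simple arithmetic. The figures in the sketch below are assumptions picked for illustration, not numbers from Amit or from any agency: a population of 200 million callers, 2,000 actual plotters, and a hypothetical screening program that flags 99 percent of plotters while wrongly flagging 1 percent of everyone else.

```python
def flagged_precision(population, targets, hit_rate, false_alarm_rate):
    """Fraction of flagged people who are real targets.

    hit_rate: share of real targets the program flags.
    false_alarm_rate: share of innocent people it flags by mistake.
    """
    true_hits = targets * hit_rate
    false_alarms = (population - targets) * false_alarm_rate
    return true_hits / (true_hits + false_alarms)


# All four inputs are assumed values, chosen only to show the effect.
precision = flagged_precision(
    population=200_000_000,
    targets=2_000,
    hit_rate=0.99,
    false_alarm_rate=0.01,
)
print(f"Share of flagged callers who are real targets: {precision:.4%}")
```

Under these assumptions, only about one flagged caller in a thousand would be a real target, because even a small error rate applied to millions of innocent people swamps the handful of true hits.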

The White House hasn’t confirmed what the NSA has been doing with phone records, but last December, an official of DARPA, a Defense Department agency that funds advanced research, published a paper in an academic journal that suggests an ambitious role for link mining.

“Metaphorically, link mining offers the potential not only for connecting the dots, but for determining which dots to connect, a far more difficult task,” wrote Ted E. Senator, who stipulated he was expressing his own views, not those of DARPA or the federal government.

The science of connecting the dots dates back decades, to when sociologists began studying social networks, charting people’s connections with each other by hand, said Karrie Karahalios, an assistant professor of computer science at the University of Illinois at Urbana-Champaign. They looked at things like how gossip spread through a community.

“There’s a lot you can learn just from looking at simple connections,” she said.

In recent years, with the rise of instant messaging, buddy lists and other Internet-related phenomena, computer researchers have turned their attention to social networking.

“Previously, we only analyzed a person in the context of themselves, but now we’re looking at links between that person and others,” said Paul Bradley, data mining principal for Apollo Data Technologies LLC, a Chicago-based data mining firm.

The NSA presumably is looking for patterns among the millions of phone calls Americans place to one another daily. With A calling B and B calling C, there might be some interest if C later calls A. Such data tied to phone numbers could be combined with census data, driver’s license listings and other databases to produce useful information.
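The A-calls-B, B-calls-C, C-calls-A pattern is what graph researchers call a directed triangle. The Python sketch below shows one simple way to find such loops in a list of calls. It is an illustration under stated assumptions, not a description of any NSA method: the function name and sample data are hypothetical, and for brevity the sketch ignores when each call was made, so it does not check that C called A after the other two calls.

```python
from collections import defaultdict


def find_call_triangles(calls):
    """Return every loop A -> B -> C -> A found in (caller, callee) pairs.

    Each loop is reported once, rotated so the smallest name comes first.
    """
    outgoing = defaultdict(set)
    for caller, callee in calls:
        if caller != callee:  # ignore a number calling itself
            outgoing[caller].add(callee)

    triangles = set()
    for a in list(outgoing):
        for b in outgoing[a]:
            for c in outgoing.get(b, ()):
                if c != a and a in outgoing.get(c, ()):
                    cycle = (a, b, c)
                    k = cycle.index(min(cycle))
                    triangles.add(cycle[k:] + cycle[:k])
    return triangles


# Hypothetical sample: A, B and C form a loop; D and E do not.
sample = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "D"), ("D", "E")]
print(find_call_triangles(sample))
```

On a real calling network this brute-force search would turn up enormous numbers of loops, most of them among friends and family, which is why the pattern alone says little without other information.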

“They might look at how frequently, how recently the calls were made and where they were geographically,” said Bradley. “You could make some generalizations from that, but they’d be weak ones.”

When commercial firms do link mining, their problems are usually well defined and far less ambitious than sifting through the nation’s phone calling records.

“A portal company like Yahoo, Google, or AOL might look at patterns among people using their e-mail services and instant messaging,” said Jeff Kaplan, client services principal for Apollo Data. “They might look at how frequently they use e-mail or look at who is sending e-mails to a friend, but not using instant messaging.”

The goal might be to personalize a customer’s experience or to market more products to him, said Kaplan.

While national security agents may use the same basic software as private enterprise, they have more resources.

“Unless you have some other information about a person beyond just calling patterns, there’s not that much you can say,” said Bradley. “But there are a lot of other databases out there that provide personal information about people. And the government has access to databases that people in the private sector don’t even know exist.”

(c) 2006, Chicago Tribune.

Distributed by Knight Ridder/Tribune Information Services.

AP-NY-05-11-06 1949EDT