Suggested Citation: "Data Integration and Fusion." National Research Council. 2004. The Mathematical Sciences' Role in Homeland Security: Proceedings of a Workshop. Washington, DC: The National Academies Press. doi: 10.17226/10940.

Alexander Levis, "Introduction by Session Chair"

Alexander Levis is chief scientist of the U.S. Air Force, Washington, D.C. He serves as chief scientific adviser to the Chief of Staff and the Secretary of the Air Force and provides assessments on a wide range of scientific and technical issues affecting the Air Force mission. Dr. Levis received his professional education at the Massachusetts Institute of Technology. Prior to his current position, he was University Professor of Electrical, Computer and Systems Engineering at George Mason University in Fairfax, Virginia, and head of the System Architectures Laboratory of the C3I Center. For the last 20 years, Dr. Levis has conducted basic and applied research in and taught many aspects of command and control, from organization design for command centers, to operational and system architectures, to decision support systems. Dr. Levis has served as a senior officer in national and international professional societies, is on the editorial board of several professional journals, and is on the Board of Directors of the AFCEA Education Foundation. He has held two appointments to the Air Force Scientific Advisory Board, where he participated in several summer studies and several ad hoc studies.

DR. LEVIS: Just on a serious note on that, the term steganography was mentioned in the first session, and people asked what it is. As far as I know, al Qaeda has used it already. So they are fairly sophisticated.

Good afternoon. My name is Alex Levis, and I will be chairing this session. My current position -- I have never worn a uniform in my life, but my current position is with the Air Force. I am chief scientist. It is the best job in the world. It has no specification. What I am supposed to do is give advice to the chief of the Air Force and the Secretary of the Air Force. The Secretary of the Air Force did his Ph.D. on decision analysis at Harvard. He does not need any advice from me on matters technical. The chief has a war to fight; he needs no advice from me. So I am having a great time coming to workshops, and I really appreciate the invitation.

Somewhere in my distant past, actually, I do have a degree in mathematics, but it is an undergraduate one. I noted when David asked earlier -- there was a question asked yesterday, how many people have a degree in mathematics, and four or five people raised their hands. The question today that David asked is, how many mathematicians, and two-thirds of the people raised their hands. I wonder what the significance of this is.

I would like to make some comments first. I have learned a lot of interesting things yesterday and today. Information security I am a little more familiar with. But the thing that hasn't come up very much is the problem. Those who want to contribute in mathematics -- and I believe strongly in that, and I'll try to show a couple of examples -- need to bring mathematics to the problem. I am an engineer, and I need to understand the problem before I apply solutions to it. The problem has not been defined in this workshop. We all took it for granted. From the discussion it is apparent that our knowledge is what we all hear from the newspapers, et cetera. I don't think even the White House has defined homeland security, let alone -- I don't know the difference between homeland defense and homeland security, by the way, but different people use them in different ways.

So let me try to define the problem in non-technical terms. We don't really know who the adversaries are. We know some individuals, we know some organizations, but we don't have a complete knowledge of the adversaries. We don't know where they are. You see every night on TV that we really don't know where they are, and we don't know where they are going to attack, whether it is going to be in the U.S.

The call just went out again, after being prepared for a year and a half. We don't know when they will attack. We do not know what they will attack, and we don't know how they are going to attack. We know bits and pieces of that, but this is the classic journalism, the five questions. So when we talk about homeland security, we have to look at all those aspects. You can characterize the uncertainty. We talked a lot about probability yesterday and a little bit today, but there are different kinds of uncertainty that are associated with this. You have temporal uncertainties, you have locational uncertainties, you have all sorts of things that somehow have to mix and match together. This session, remember, is about integration and fusion. Now, the strategies that you can consider fall into three categories: reactive strategies --

PARTICIPANT: By the way, you forgot one question. We don't know why they are attacking us.

DR. LEVIS: No, I'm sorry, that we know. They don't like us. We can start a debate, but I come from that part of the world, and I can tell you at great length why they don't like us.

We have concentrated mostly, in terms of what people have been discussing and the approaches they are indicating, on the reactive problem: what happens if somebody does that. Much of the work that was presented in epidemiology would go into the reactive part: once there is an attack, what can we do about it. There is also the anticipatory part. Some of the work yesterday relates to that. We know that somebody is crossing the border. We don't know where, but we are going to look for it; we are expecting an attack on the bridges of San Francisco during that week. This is anticipatory. The other one is the proactive: get them before they get a chance to get us, which is what we are doing right now, what the Air Force is actually doing. The term that the Air Force uses for this kind of a notion, that is, the last one, is predictive battle space awareness. It doesn't really mean anything. This is a bumper sticker to put at the end of the car. It tells you it is great to know what they have done. It is great to have all the sensors to tell me what happened, but that really is of very limited value. What I really want to know is what they are going to do next, so I can go and get them before they do it.

That is the hard problem. With the expansion of sensors, information technology, et cetera, as many of you indicated, we have a wealth of data, but our difficulty is to take that data and start determining intent, which takes us away from just projection of the data. We have to understand what is going on, because you need to know intent, to understand intent, to make the predictions. The first part I have already mentioned. Yesterday we talked a lot about searching for a needle in a haystack. I am the last person to tell you how to do that. I have no idea how to do that. But there are alternative approaches, how to think about the problem. First of all, somebody mentioned yesterday getting a magnet or changing the properties of the needle so that you can find it more easily, given the kind of sensors that you have. A lot of our intelligence community is worrying about things like that. There are ways of trying to mark things that we eventually may want to find. That is one way of approaching it. The last one is, find the needle before it gets in the haystack. As an engineer, this appeals to me very much. I am lazy. I want to solve the easy problem, I don't want to solve the hard one.

We need to keep this in mind. There is a whole spectrum of activities over here. To get to this predictive integration notion, given that there are large gaps in knowledge -- because you have to get to intent, and intent is human, and you can't really model all that stuff; you cannot go to databases to find those things -- you need to do some modeling and simulation. This is now a case study, to respond to a question that was raised: you have all those good solutions, how do you go and give them to people to use? That is a very hard problem. You cannot just say, I have a wonderful solution. I think it was mentioned, you have to understand the science of the problem, you have to understand the application of the problem, because otherwise nobody is going to pay attention. You expect somebody else to understand the solution and then convert it and apply it. Not anymore; nobody has the patience for that. One needs to go out. One example over here, very quickly, which is relevant to the presentations today -- that is why I chose it -- has to do with influence nets. After 9/11, actually a couple of weeks later in September, I went to Air Force Studies and Analysis. These are the people who do the actual math and do modeling and so forth for the Air Force. Originally they were the whiz kids. Some of you who are my age will remember McNamara; this is the whiz kids group of McNamara's.

They still exist, with a fancy name. I gave them some software from the lab, research software, to start modeling al Qaeda. They did it, and it has been used since then to do a number of analyses. That software was certainly not for prime time. It was homemade by my master's level students. Since then, we have got people from the Air Force Research Laboratory -- this is from Rome Lab up in upstate New York, the information directorate -- and they were developing software over there which had some relation to mathematics, but it was not very explicit what the relationship between the software and the mathematics was. It was very heuristic in a way, but it did approximately the same things, and it is much more user friendly. Once we established from the laboratory the validity of the approach using more rigorous tools, the heuristic approach was adopted and is used currently. Now we need the data. Some of the data came from databases, as described yesterday, but some of the data had to do with judgments. We brought in the analysts from many of the three-letter agencies that were discussed yesterday to provide that information.

The moment the analysts got that stuff, after they learned within two or three days how to use the software and how to model in that way, immediately they started asking for more. All this does is, you wiggle the inputs to see what the outputs are. This is good for planning, but how do I go to the execution? This is what we showed to the generals. This way they understood the influence nets and Bayesian networks -- except for the Secretary, who does know Bayesian networks. Fortunately, I did my homework, and before I discussed those things I knew that he knew Bayesian networks, and I didn't say silly things to him. But the idea of how to use them is a little bit different here. You have to put yourself in the mind of the adversary and define a set of effects that you want to achieve. We want Milosevic to change his mind, we want Saddam Hussein to stop bothering us and go on vacation or something like this. You put that stuff over here, and then you start looking at what are the things that influence his decisions. This blob here is where the modeling takes place. You want to bring them to the point that you have what are called actional events, the things that we can do -- we are the good guys, the blue -- that will eventually have an effect here.

This is the kind of models -- now we have made a bunch of models of those things. Actually we have been using this kind of stuff in DoD since 1994, in the intelligence part of DoD. They use it that way. But now the question came up -- and I wish I had mathematicians working with me to solve the problem, because I don't know the answer, and my students don't know the answer, and this is not the field that we work in. Suppose now that they start the engagement. Bombs start dropping or things are happening in Afghanistan. I start having observations that come over here from various intermediate nodes: this thing occurred, that thing occurred. Can I propagate forward? Sure, I can propagate forward, except I have to update first -- update the priors and do all those things -- and then do it. That is brute force; one can do it. It takes a few weeks to implement. But then there is a more sophisticated question. If I look at this node and I see what has happened, and that improves my information over here, how am I doing in reaching the outcomes? Can I solve that problem? Can you tell me where I should put sensors to see what is happening, so I can get a better understanding of whether I am achieving my goals or not?
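The kind of computation being described can be sketched very simply. Below is a minimal illustration in Python of forward propagation and updating in a toy three-node chain (an actionable blue event, an intermediate node, and the desired effect), with inference done by brute-force enumeration. It is not the Air Force software, and every node name and probability in it is an assumption made up for illustration.

    # Minimal sketch (illustrative only): forward propagation and evidence
    # updating in a tiny influence-net-style Bayesian network, computed by
    # brute-force enumeration over all joint states.
    #
    # Chain:  action (blue actionable event) -> disrupt (intermediate node)
    #                                        -> effect (desired outcome)

    from itertools import product

    # Conditional probability tables; every number here is an assumption.
    p_action = {True: 1.0, False: 0.0}                 # we decide to act
    p_disrupt_given_action = {True: 0.7, False: 0.1}   # P(disrupt | action)
    p_effect_given_disrupt = {True: 0.6, False: 0.05}  # P(effect | disrupt)

    def joint(action, disrupt, effect):
        """P(action, disrupt, effect) under the chain factorization."""
        p = p_action[action]
        p *= p_disrupt_given_action[action] if disrupt else 1 - p_disrupt_given_action[action]
        p *= p_effect_given_disrupt[disrupt] if effect else 1 - p_effect_given_disrupt[disrupt]
        return p

    def prob_effect(evidence=None):
        """P(effect = True | evidence), with evidence a dict of observed nodes."""
        evidence = evidence or {}
        numerator = denominator = 0.0
        for action, disrupt, effect in product([True, False], repeat=3):
            state = {"action": action, "disrupt": disrupt, "effect": effect}
            if any(state[name] != value for name, value in evidence.items()):
                continue
            p = joint(action, disrupt, effect)
            denominator += p
            if effect:
                numerator += p
        return numerator / denominator

    # Forward propagation (planning): how likely is the desired effect?
    print("P(effect), no observations:", prob_effect())

    # Execution monitoring: an intermediate observation arrives and we update.
    # Comparing the two conditional answers also hints at how informative a
    # sensor on this node would be (the "where should I put sensors" question).
    print("P(effect | disrupt observed true): ", prob_effect({"disrupt": True}))
    print("P(effect | disrupt observed false):", prob_effect({"disrupt": False}))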

That is a lot harder problem, but from what I know, mathematicians will know how to do them correctly, as opposed to trying to ad hoc them, the way it is occurring right now. But you cannot do it in two years. You need to be ready to go and work and do it in a couple of weeks and put it in. I hope it will be over before it is actually needed. This is a real transition, but it took a lot of people. That is one of the good things when you are chief scientist; I could call people and cash in all my IOUs. This is how it has been established now: Studies and Analysis now has the capability to do modeling using Bayesian nets. We never had that capability before. The research laboratory -- the Air Force Office of Scientific Research, indeed, some of you in the audience probably know it, that is the basic research component of the Air Force research establishment, and this is the applied research, the Air Force Research Laboratory -- they are providing the algorithms and so forth. Here is the alphabet soup of the intelligence community, providing the data and the updates. We are using it, responding to decision makers. What we need here is better math, because I am math-limited in this area.

So are the analysts. It is not their job to be mathematicians; their job is to understand that and to provide the answers. Opportunities, every day. But how do you exploit them? Because I have good people like Kathy and Tod, whom I have known a long time and who know about those areas, and because a long time ago, in 1994, I started learning about this stuff, I could make the transition. It is not my technology, but at least I could see the possibility of a transition. It takes a lot of work and it takes involvement to get to know the problem. You need to understand the science. The devil is in the details, and you have got to understand. I think I have said enough as an introduction to this session. We have three speakers. We are going to continue straight through with two discussers, and then an open discussion. The intent is to leave here by 6 o'clock. No reaction by anybody? You are all awake, I see, we are doing fine. With this, I would like to ask the first presenter, Tod Levitt. I will ask each of the presenters to make appropriate comments about themselves. Although I know them, they do not want me to say what I know about them.

It is all good stuff, but it will take a long time to describe their accomplishments.

Introduction by Session Chair
Alexander Levis

There are three types of strategies for handling an attack:

1. Reactive. Once there is an attack, how is it handled?
2. Anticipatory. What are the procedures when we are expecting an attack?
3. Proactive. What will happen next, so that it can be prevented?

In a proactive strategy, mathematics is used to create models and simulations of the problem. One idea, from the Air Force's Studies and Analysis group, is to use Bayesian networks to put ourselves in the mind of the adversary, and to define a set of effects that we then want to achieve. The model helps identify the things that influence our decisions, as well as the actional events, the things we can do. We can also "propagate forward" to see how well we're doing in reaching the desired outcomes, and test the outcomes' sensitivities to what happened at key nodes along the way.

Tod Levitt, "Reasoning About Rare Events"

Tod Levitt's research has focused on the development of advanced capabilities for multisensor fusion; SAR, IR, and EO image understanding; ground robot vision; air-to-ground surveillance systems; and C4ISR systems supporting multiple military intelligence, planning, and command and control applications. He has authored over 50 publications in the fields of multisensor fusion, image understanding, robotics, and artificial intelligence. Dr. Levitt has a track record of unique developments in evidential reasoning in large-scale, high-dimensional data, qualitative navigation for mobile robots, encryption of algorithms, and the foundations of model-based software system design. Dr. Levitt has led the development of a diverse family of advanced information software systems built to handle real-world data under complex operating conditions. The breadth of these applications reflects his unique ability to transfer technology capabilities across sensor modalities and to demonstrate the application of advanced fusion techniques to dynamically meld the resulting multisensor observations into actionable intelligence. These systems include a fully automated Middle Eastern armor unit detector for the U.S. Army that was evaluated to perform at expert imagery analyst levels on wide-area, low-resolution Desert Storm SAR; a system for automated diagnostic measurement from digital x-rays of the hand that was employed in clinical care at the San Francisco Veterans Administration Medical Center; a system for semiautomated, three-dimensional construction of neural processes from multiple hundreds of two-dimensional cross-sectional confocal microscope images that was used in developmental anatomy studies at Stanford University's Department of Neurosciences; and development and installation of an automated hot steel slab inspection system at U.S. Steel in Gary, Indiana. Dr. Levitt founded Information Extraction and Transport, Inc. (IET) in September 1991, aiming to achieve large-scale, breakthrough technology development of automated systems supporting scientific discovery. IET provides products and services that manage uncertainty to add bottom-line value in enterprise computing across business, defense, and engineering industries. IET brings together a world-class team of technologists with a shared technology vision who produce best-of-breed tools and state-of-the-art solutions for dynamic Bayesian inference and decision making. Prior to founding IET, Dr. Levitt was employed as a research scientist at Advanced Decision Systems from 1983 to 1991 and at the Honeywell Systems and Research Center from 1978 to 1983. Dr. Levitt is a co-founder and member of the board of directors of the Association for Uncertainty in Artificial Intelligence. He received his Ph.D. in mathematics from the University of Minnesota.

DR. LEVITT: Thanks, Alex. I have an advantage, in that I have few accomplishments, because I have spent the last 12 years running a company that deals with applied research, largely in information fusion for tactical battle space applications, with a lot of work for DARPA and other military agencies. As such, we have a wealth of experience in knocking our heads on problems that the government typically states vaguely on purpose -- impossible problems -- to try and get smart people like yourselves to come forward with answers to questions they can't even pose. Certainly anything called homeland defense fulfills this in spades.

The title is reasoning about rare events. I meant the term rare in two senses. One is unlikely, the traditional sense that we simply don't expect any given thing to happen with any great likelihood, and the other is more technical, a concept due to David Schum. In fact, everything I will talk about is somebody else's work. In that sense of rareness, we have the concept of rare evidence, which can occur even when a rare event is happening with certainty. Rare evidence turns out to be very important for detecting anomaly. In this way, we can hope to model things that we might expect and still detect that something else is going on within the context of those models.

I will get to that eventually. But first, to wave hands at what homeland defense might actually mean: any notion of it has to immediately acknowledge that it is a stupendous panorama, any way you look at it. This is one little cut Alex gave on it. Here is another: attempt to trot out taxonomies of what might be, multiply them out, and come up with possibilities for what we are trying to defend against. So you are talking about people and organizations that want to hurt the United States, of all varieties, wherever they might come from. They don't necessarily come from overseas; they can be recruited here. Of course, there are all kinds of different attacks, especially conventional ones. TNT is readily available in the United States. You can talk about hitting all sorts of supply chains, from the electrical grid to information to the waterways and actual transportation; agriculture and livestock is a huge issue. It would be very easy to cause a terrific economic impact, not to mention effective terror, and political impact, and information itself. Information is very easily spread and can cause a lot of problems in the United States.

I had an experience in December. I heard a noted scholar remark that since we had a machine that could play chess at a grandmaster level, doing automated intelligence assessment for tactical warfare surely must be solvable now.

I was kind of shocked, having spent most of my life trying to do that problem. But it struck me after the fact that a lot of people think that way. When you talk about automation -- which I insist, given the size of the problem, is where we want to focus, at least from the point of view of where mathematical breakthroughs are needed, and brilliant applications are needed in order to make any progress at all -- you start looking at what chess looks like versus tactical warfare, which I am very familiar with. These numbers are just WAGs. They can be substantiated, but I won't take your time. If you just casually look at what you are talking about here, you see basically two orders of magnitude in size as you move to the right each time. Most importantly, you notice that there is no uncertainty in what you observe in chess. It is all right there in front of you all the time. That of course is spectacularly not the case in tactical warfare. It is the source of what makes data fusion necessary. There is terrific uncertainty in what is out there, where it is at any given point in time, what it is doing, or, as Alex said -- the buzzword du jour, predictive awareness -- what it will do.

Then when you actually do see something, 65 percent is the typical number that turns up as to how certain you are of what you are looking at. That is in tactical warfare, and that is very constrained. You might be a little surprised at the number of organizations in tactical warfare -- random number, just pick one, eight is typical. It is not commonly known when you go looking at these current types of tactical conflicts that go on. For instance, in Bosnia there were three major armies -- the Serbian, the Croatian Muslim, and the Croatian Christian -- which were all more or less conventional armies, meaning they had big armor and artillery and things that they had to haul around with big logistics trails. There were 28 paramilitary groups, including mujahideen coming in from all over the world, mercenaries, all kinds of different religious factions, all doing crazy things for strange reasons, joining up with the army, switching sides. This is pretty typical of war situations. There are many actual organizations to consider. When you go to homeland defense, everything is much worse. You are talking about the territory of the United States, including Alaska and Hawaii and such. You don't even know what to look for. What are we looking for?

If the answer is that you have to know about box cutters, this is a little frightening. Is it completely impossible? No, I don't think so. Mostly impossible, maybe, but there is actually a great deal of information that is being collected all the time, and that has been. There are many standing organizations: the Centers for Disease Control, the USDA; Plum Island spends all its time on this, as has been discussed at the workshop. The Treasury Department is always auditing and tracking all sorts of things. And of course, the intelligence agencies are doing this. One can imagine trying to -- these are efforts going on, more or less. This is a cartoon of what could be. You look on the right-hand side; you have different sources of expertise on the left, different areas gathering statistics. These can be the same as they are at CDC, but they don't have to be. If you start asking to consciously integrate these, so that the information is being automatically sifted and compared, and regionally focused and then hierarchically fused up from regions, it is conceivable that a great deal more might be known. It makes it obvious, I hope, that automation is absolutely critical to sift and compare in an attempt to match against models, and that this needs to be done from a strong scientific perspective, with any hope of not just generating spaghetti and garbage.

Quite basically, we are going to have automated reasoning about what is going on, where one formulates hypotheses on the basis of evidence. Just to nail things down a little bit: if we had a hypothesis that there was an anthrax attack at a location in the United States -- and that is either true or false -- there is evidence of the sort the Centers for Disease Control gets. They get reports from physicians, who have to report certain occurrences such as anthrax. So they get anthrax diagnosis incidents at a location. There is a baseline, whatever that may be -- zero most places, but it is not actually zero in the absence of attack; anthrax is naturally occurring. Then anything else would be considered high. Then we have a standard one-stage Bayesian inference here. It is important to notice that in modeling this, typically the way we think of this in building actual applications, it need not be causal, but causality is a powerful way to think about the model. If there is an attack going on, then obviously one should expect to see an incidence of disease. That is, after all, the purpose of the attack. The inference goes the other way: you want to observe the incidence and infer whether or not an attack is going on. The Bayesian rules are built to do that. In a one-stage inference like this, the likelihood ratio has the information. Once you observe the state of the evidence -- in this case it would be either high incidence or normal -- the likelihood ratio has all the relevant information in a one-stage inference.
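Here is a minimal numeric sketch of that one-stage inference, in odds form (posterior odds equal prior odds times the likelihood ratio), with the hypothesis "anthrax attack at this location" and the evidence "high reported incidence." All of the probabilities are assumptions chosen only for illustration; none come from the talk.

    # One-stage Bayesian inference in odds form; all numbers are assumed
    # for illustration only.  Hypothesis H: anthrax attack at this location.
    # Evidence E: reported anthrax incidence is "high" rather than baseline.

    prior_attack = 1e-6              # P(H): an attack at this location is rare
    p_high_given_attack = 0.90       # P(E | H)
    p_high_given_no_attack = 1e-4    # P(E | not H): anthrax occurs naturally

    # In a one-stage inference, the likelihood ratio carries the information.
    likelihood_ratio = p_high_given_attack / p_high_given_no_attack

    prior_odds = prior_attack / (1.0 - prior_attack)
    posterior_odds = prior_odds * likelihood_ratio
    posterior_attack = posterior_odds / (1.0 + posterior_odds)

    print(f"likelihood ratio:           {likelihood_ratio:,.0f}")
    print(f"P(attack | high incidence): {posterior_attack:.4f}")
    # Even strong evidence for a very rare hypothesis leaves a small posterior,
    # which is part of what makes reasoning about rare events hard.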

As you build complex networks, one-stage inference is no longer so simple. We change to real-life modeling in terms of all kinds of webs. There is actually a very large literature since the 1980s, centered in the uncertainty in artificial intelligence community. There has been a proceedings of an annual conference since 1985, and those are available since 1990 through Morgan Kaufmann; I highly recommend it to any mathematician who wants to get up quickly with the state of the art in what is understood in automated Bayesian learning and computation. There are two basic algorithms that have been discovered. In 1988, Spiegelhalter developed the so-called junction tree algorithm, which automatically solves any Bayesian network that is well structured. Then in 1991, Bruce D'Ambrosio and Brendon Claverill developed the probabilistic inference algorithm, which also computes the exact solution for any Bayesian network, based on an approach of heuristic rearrangement of polynomials, all probabilities here being modeled as discrete.

That is the website of the organization. Fortunately, Professor Laskey is going to talk a great deal about this, so I won't spend any more time on it. The next one, to get back to how we get to these complex models that require these sorts of algorithms: as we look at what really goes on and where the evidence comes from, this is a little more realistic here. In this case, what you actually observe is a report at the Centers for Disease Control. That report may or may not accurately reflect the diagnosis that is supposed to be reported by a doctor, because errors happen when reporting goes on, especially when it is done on a massive scale. Then of course, the doctor's diagnosis may or may not be correct. So there is the observability -- was the report made, was it made accurately? -- and the credibility of the doctor: doctors may have no experience. If there is an attack in the heartland or something, it is very likely that most of the physicians who are seeing ill patients have never seen a case of anthrax.

There can be delays; there can be all kinds of different sorts of things. Together, if we look at the hypothesis node, it is that people are actually ill from anthrax, and that is hidden. It is not observed at all. But together, we see that we are modeling the veracity of that illness, based on factoring over the credibility of the physicians who are diagnosing, and the accuracy and timeliness of the reporting. I stuck another random variable in there, the dispersion. This is to get a sense of what is relevant. One might think off the top of one's head that the weather is not relevant to whether an anthrax attack is going to happen, but it certainly is if a crop duster is being used to disperse it. It is going to be timed so that the wind is blowing and is likely to continue blowing and such, to cause the most damage. So what one might think of as ancillary factors might become critical first-order variables. This issue of what is relevant plagues modeling, especially for something as broad as homeland defense might be defined to be.
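The deeper chain just described -- attack leading to illness, illness to a physician's diagnosis, diagnosis to a report at the CDC -- can be sketched the same way, again with made-up numbers. The point of the sketch is that the credibility and accuracy of the intermediate layers attenuate the evidential force the report carries back to the hidden attack hypothesis.

    # Illustrative sketch of a deeper evidence chain:
    #   attack -> illness -> diagnosis -> report received at CDC
    # Every probability below is an assumption for illustration only.

    from itertools import product

    p_attack = 1e-6
    p_ill_given_attack = {True: 0.95, False: 1e-4}   # P(illness | attack)
    p_diag_given_ill = {True: 0.60, False: 0.01}     # P(anthrax diagnosis | illness)
    p_report_given_diag = {True: 0.80, False: 0.02}  # P(report | diagnosis): noisy reporting

    def p_attack_given_report(report_cpt=p_report_given_diag):
        """P(attack | report received), summing out the hidden illness and
        diagnosis layers by enumeration."""
        numerator = denominator = 0.0
        for attack, ill, diag in product([True, False], repeat=3):
            p = p_attack if attack else 1.0 - p_attack
            p *= p_ill_given_attack[attack] if ill else 1.0 - p_ill_given_attack[attack]
            p *= p_diag_given_ill[ill] if diag else 1.0 - p_diag_given_ill[ill]
            p *= report_cpt[diag]                 # condition on report = True
            denominator += p
            if attack:
                numerator += p
        return numerator / denominator

    print("P(attack | report), noisy reporting:  ", p_attack_given_report())
    print("P(attack | report), perfect reporting:",
          p_attack_given_report({True: 1.0, False: 0.0}))
    # The noisier the reporting (and diagnosis) layers, the weaker the inference
    # back to the hidden attack node -- the attenuation described above.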

481 what we might call standard, Pascalian-type probability, and focus on conditional independence, whether or not it is true, between random variables, and the weight of evidence that a piece of evidence carries, a priori or a posteriori, toward an intermediate hypothesis or another piece of evidence, or the target hypotheses about what is going on that one is interested in. But as this example suggests, there are actually a lot of other issues that come into play when one goes to do large scale modeling of these scarce events, as opposed to, we are going to drive tanks into somebody, and we have a pretty good idea of how they are going to respond when we do that. They are not going to like it, and they are either going to run away or they are going to shoot back. Whereas here, things are much more wide open. So the issue, as suggested here, is evidential semantics and how that applies to how one should model things mathematically. The granularity of reasoning, how fine is it, how does one define the relevance of information, how fine does it have to be in order to get sufficient accuracy to draw conclusions to take action on? Again, even naive reasoning, I hope, when one looks over the breadth of the land suggests that any action taken is going to be 481

482 enormously expensive. So you want to be careful about what you actually choose to do. How finely do you have to reason in order to get good enough probabilities, whatever calculus might be used, so that your conclusions are likely to be robust? Is that possible? That leads to the question of completeness, whether you are spanning enough of the world or not. These are all different approaches which are amenable, most of them explicitly so, to graphical representation. The importance of that is that it leads naturally from the mathematics supporting the fusion of the information to algorithms that software engineers can actually build and compute. By the way, I might mention that exact Bayesian inference for arbitrary directed acyclic graphs of random variables is NP-hard. It has been proven so by Greg Cooper in 1992. It is in the uncertainty AI literature. That raises the issue of approximations. For instance, variational mean field approximation is an explicit attempt by Tommi Jaakkola and Michael Jordan -- not the basketball player -- to come up with a robust computable approximation to very large scale Bayesian networks, et cetera. 482
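To make the computational point concrete, here is a minimal sketch, not from the talk, of exact inference by brute-force enumeration on a three-variable chain in the spirit of the attack-illness-report example; the structure and all probability values are hypothetical, chosen only for illustration. The sum has one term per joint configuration, so its cost grows exponentially with the number of variables, which is the blow-up that junction tree, symbolic, and variational methods try to manage.

```python
# Hypothetical sketch: exact inference in a tiny discrete Bayesian network by
# brute-force enumeration of the joint distribution.
# Chain: Attack -> Illness -> Report, all variables binary (True/False).
from itertools import product

p_attack = {True: 0.001, False: 0.999}                 # prior on an attack
p_ill_given_attack = {True: 0.7, False: 0.01}          # P(illness | attack)
p_report_given_ill = {True: 0.8, False: 0.05}          # P(report | illness)

def joint(attack, ill, report):
    """P(attack, illness, report) from the chain factorization."""
    p = p_attack[attack]
    p *= p_ill_given_attack[attack] if ill else 1 - p_ill_given_attack[attack]
    p *= p_report_given_ill[ill] if report else 1 - p_report_given_ill[ill]
    return p

# P(attack | report = True), summing the joint over the hidden illness variable.
num = sum(joint(True, ill, True) for ill in (True, False))
den = sum(joint(a, ill, True) for a, ill in product((True, False), repeat=2))
print("P(attack | report) =", num / den)
# With n binary variables the denominator has 2**n terms, which is the
# exponential blow-up behind the NP-hardness result mentioned above.
```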

On the bottom here, the so-called belief functions, also called Dempster-Shafer functions and networks, and fuzzy set theory, are two examples that deviate from the Pascalian tradition in explicitly being motivated by other modeling considerations. In particular, we look at what some people in philosophy have called non-Pascalian probability: modeling ignorance. When you look at homeland security, this raises its head as a big issue. We simply don't know an awful lot. It is forbidding to think of actually modeling everything one might. So this suggests looking at three-valued logics, where you have supporting, denying, and don't know. Then there is work pioneered by, among others, Professor Dempster, on upper-lower probabilities and interval-valued probabilities, motivated by some of the same issues. Fuzzy theory is motivated by lack of precision in being able to express things, by lack of ability to obtain calibratable estimates. Of course, none of these calculi are particularly miscible with each other, and there are other issues. The one I would point at from a computational perspective and an automation perspective that has plagued the use of these theories is, in order to take action, ultimately we end up having to rank things. And certainly in automation you do. Interval valued or pair valued 483

calculi of course are not rankable, and it is easy to prove that if you do have a mathematically consistent way to rank them, then they have to flatten down to be equivalent to probability. Nevertheless, these are important issues to look at. Something that is much less well known than the work of Professors Dempster and Shafer and Zadeh is work by Jonathan Cohen, and more recently in causality, with Glenn Shafer and Judea Pearl making very significant contributions. These approaches -- Baconian, as they have been called, or Humean approaches -- address, in the first case, instead of a weight-of-evidence approach to reasoning, eliminative reasoning: attempting to come up with exhaustive explanations that essentially deny a hypothesis on logical grounds, and counting them. This may seem silly when you talk about enumerating all the possible explanations. But in fact, that is more or less what you do when you model in computer systems. You have got to go build models, and when you are done, there is at least an enumerable number of them in some sense. They may be parameterized, but there typically is a finite number of models. So even with the parameterization, you are only modeling so much of the world. 484
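As a concrete, hypothetical illustration of the belief-function idea mentioned above -- mass can be assigned to supporting a hypothesis, to denying it, or explicitly to a "don't know" state on the whole frame -- here is a minimal sketch of Dempster's rule of combination. The mass assignments are invented for illustration and do not come from the talk.

```python
# Minimal sketch of Dempster's rule of combination for two belief functions
# over the frame {attack, no_attack}; all mass values are hypothetical.
def dempster_combine(m1, m2):
    """Combine two mass functions whose focal elements are frozensets."""
    combined, conflict = {}, 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb          # mass assigned to contradictory pairs
    # Renormalize by the non-conflicting mass.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

ATTACK, NO_ATTACK = frozenset({"attack"}), frozenset({"no_attack"})
THETA = ATTACK | NO_ATTACK               # "don't know": mass on the whole frame
m_sensor = {ATTACK: 0.5, THETA: 0.5}     # one source: some support, much ignorance
m_doctor = {ATTACK: 0.2, NO_ATTACK: 0.3, THETA: 0.5}
print(dempster_combine(m_sensor, m_doctor))
```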

Jonathan Cohen's calculi address the issue, not sufficiently for automation, but it is certainly an excellent place to start; how do you know when you've got enough, certainly a critical issue in homeland defense. Can we infer causality? Going back to Reichenbach and Fisher, there was a tradition in 20th century science of saying that causality could not be inferred, only correlation. Now both Shafer and Pearl have challenged that, and have algorithms and approaches that address this question of, can we, from the observables, infer the causes behind them? This has the possibility then of discovery, without having to do so much explicit modeling. All the systems that we build at IET are based on hierarchical Bayesian inference, because there is tremendous power there to exploit, and there are fascinating issues that arise in attempting to do that exploitation in weight-of-evidence reasoning. One of the things that happens when you look at homeland defense is wide-area fusion. You are talking about evidence arriving from all over at computers that are then going to do fusion. In any national scheme, one would have to have a hierarchy of those or a web of them. Evidence will come not necessarily in the temporal 485

486 order of occurrence or in relevance, and it may not be conditionally independent from evidence -- in fact, it is unlikely to be conditionally independent from evidence that has already been fused in models that have been spawned, or hypotheses that have been generated to this point. I would strongly point you at David Schum's book, The Evidential Foundations of Probabilistic Reasoning, which gives the first scientific treatise of these issues in toto. I am going to borrow heavily from his results in the following. He distinguishes weak versus rare evidence, and this is the evidence of which I was speaking at the beginning. Weak evidence is when a piece of evidence does not either strongly confirm or deny the hypothesis that it purports to support. That means that the likelihood ratio of that evidence relative to the hypothesis is close to one. Rare evidence on the other hand is evidence where the absolute likelihood does not provide strong support or denial. That is to say, it is close to zero. In other words, what that is saying is that the evidence doesn't occur very often, no matter what the state of the world. The significance of rare evidence is that one could choose to model it on purpose, because when it occurs, it is going to 486

487 suggest that the hypothesis to which it might be attached in a model based sense is probably not the right one. So by explicitly attempting to model rare evidential states relative to hypotheses that we expect, we can build in anomaly detection and hope to essentially have an inference power that is far greater than the model space that we can explicitly spawn. Kind of a subtle point, actually. What is interesting is that, as Schum points out in his book -- and this is not as widely known as it ought to be -- when you go to multi-stage hierarchical inference, the likelihood ratio no longer tells the story. In fact, the difference of the likelihoods is evidential in two or more states, so rare evidence becomes very salient. You can have weak evidence that is rare or weak evidence that is not rare, and rare evidence that is not weak, et cetera. What is interesting is that with weak evidence in hierarchical Bayesian inference and solution algorithms and Bayesian nets and such, given a static set of evidence to accrue, the order of that accrual will not significantly affect the results except up to precision. With rare evidence, the result varies a lot with the inference order. 487
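A small numeric sketch may help fix the weak-versus-rare distinction; the numbers below are hypothetical and are not Schum's. Weak evidence has a likelihood ratio close to one, while rare evidence has small absolute likelihoods under every hypothesis, which is why, in multi-stage inference, its effect depends on where it sits in the chain rather than on the ratio alone.

```python
# Hypothetical sketch of the weak/rare distinction.
# H = "attack under way"; E = a particular evidential state being observed.
def describe(p_e_given_h, p_e_given_not_h):
    lr = p_e_given_h / p_e_given_not_h                 # likelihood ratio
    weak = 0.5 < lr < 2.0                              # ratio close to one
    rare = max(p_e_given_h, p_e_given_not_h) < 0.01    # tiny absolute likelihoods
    return {"likelihood_ratio": round(lr, 2), "weak": weak, "rare": rare}

# Weak evidence: both likelihoods sizable but nearly equal.
print("weak example:", describe(0.40, 0.35))
# Rare evidence: the state almost never occurs under either hypothesis, so the
# ratio can still be large even though each absolute likelihood is tiny.
print("rare example:", describe(0.008, 0.001))
```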

488 This is just a conceptual example, or an explicit way that I stuck in this idea that we could have a medium amount, whatever that would mean, of illness in the population. One would expect that in an effective anthrax attack, you are going to have a high incidence, where high doesn't have to be very many, of course, in the human population in the United States. For the normal baseline -- by the way, those baseline statistics are available for all regions all over the United States from the CDC, so these are things that exist off the shelf -- you can expect to see low or none. So explicitly modeling a medium amount of anthrax is a way to build in an anomaly detector. For instance, what actually happened with the anthrax being distributed by mail turned up exactly that kind of behavior. It was not like you would expect with an anthrax attack, but it was much higher than the baseline. It was like a medium amount. That should suggest right away that it is not like an aerosol attack or something of that sort that one would get. The weak evidence here is, if you had a doctor for instance who consistently misdiagnosed anthrax, then his saying things are normal is not particularly evidential to whether or not an attack is going on. That is an 488

489 example of weak evidence. So the credibility of a doctor is one way to introduce that. This is an example again drawn from Schum and applied to this example. The weak evidence here is, if you have a report here of a given type, if you look down at the lower right, at the red numbers, it is saying that the probability of seeing those reports given that the doctor is diagnosing it is low, and the probability of seeing the report if he is not diagnosing it is of course much lower. In that case, this would be modeling that as rare evidence, not that it necessarily is; the one I pointed out as rare is in case three. But the point here is just to show how the numbers work, if you insert them in and just do hierarchical Bayesian accrual up. You see down at the bottom here, we have the likelihood of the report, being that you do receive the reports for anthrax, given the concatenation of all the hypotheses above it, which is more or less the same as asking what is the force of this report on the final hypothesis at the top. As you can see, what is fascinating here is that when the rare evidence is close to the base of the argument, of the inference, then you get the same result as when there is not any rare evidence, in terms of the force 489

490 of that, the likelihood of it, on the final hypothesis, if you will. But as the rare evidence is moved up the chain, that force of the base report, the thing you actually observe, becomes weaker and weaker. In other words, the whole argument weakens. It is a very unexpected result. What this says is that when you start saying, okay, we are going to start distributing some kind of sensors around -- those are the green things. This could be people, it could be the current CDC, it could be hardware, whatever it is, any kind of way to make some kind of observations. You are going to distribute them spatially. They are going to report based on some protocol. They are going to report to some machine that is going to fuse that with other information. The first thing to note is that when you distribute this sort of thing, this issue of rare evidence, and when it arrives, is going to affect your conclusions. That is a problem. There is no answer to that. That is a math problem, if you will. Now I am changing direction here. The evidence itself, for the sorts of things that are often modeled for homeland attack as being the critical large-destruction type of issues, is often diffusive. So you have 490

491 aerosols, you have disease incidents. Foot and mouth for instance can be trivially transmitted. It doesn't hurt people, you can carry it in your pocket, you can walk it right through Customs. None of the things that screen you detect foot and mouth when you go through the airport. They will say, you have a lot of grease in your pocket. You can rub it on an animal, and that herd will be infected in a day. It is a little-known fact that I found out, little-known to me, anyway, that there are only about 22 facilities in the U.S. where all the processing from live animals to stuff on the shelf takes place. So if you go rub foot and mouth all over those facilities all at once, you are going to have a trillion dollar impact on the livestock industry in the United States. It is not that hard to do. That is a simple example; how could you detect this? So you get these examples which have to do with combinatorial problems. Optimal placement. Again, the happy medium here is cost. You absolutely want to minimize any active deployment, because most of the time it is not going to be doing you any good. Optimal placement of the sensors and the assessors. So I have the computers here, the stuff that is fusing. How do you do that, where do you put them? You have the issues of going over telecommunications networks, you have the issue of the 491

492 latency of flight. You are trying to infer here -- this gets to some of the issues Alex raised of space and time -- you are trying to infer whether you are getting a hypothesis that this is an aerosol or some other thing, a subway release, or mail if it was anthrax. But this also applies to other distributed sorts of things like worms and viruses in computers. You want to do this in such a way that you get a rapid convergence of such evidence when something is going on. I'm going to blitz through this. This is an example of the kind of modeling work that Jorgensen has been doing; I can supply the reference by e-mail. The idea is to apply population ecology models, mathematical models, dynamical systems models, to detecting worms and viruses in computer environments, and also to proactively attempt to intervene in their design. In this case, you look at a computer network and look at what goes on in terms of a work flow model, which corresponds to the growth and death of things in a population. You model that. You then can do a standard differential equation model and do the standard dynamical systems analysis. You can play with that in a what-if scenario to say, what if we changed that, what if we change the protocol, so that you get different effects in terms of 492

493 stability, in terms of how it would be taken over in the introduction of a worm. So this is an example, just one in a fairly tractable space, of the sort of thing that I mean when you talk about modeling how evidence is gathered, and what can be done about it. DR. BORGS: (Comments off mike.) DR. LEVITT: This is not a Bayesian network. This is a diagram that indicates the work flow. DR. CHAYES: But if it has loops on it, you can't solve it. DR. LEVITT: No, it's not a Bayesian network. It is a work flow diagram that is used to model the differential equations for the varying quantities that are being exchanged in the work going on in the LAN. Then when you start having worms and things, that affects the parameters in the equations. That is not a fusion technique, it is a modeling technique for the evidence itself. Just a quick left turn here. If you now take a proactive defense view from a building point of view, instead of looking at the diffusion outdoors or whatever, or through computer systems, if you look at a building that is purposefully being attacked -- and they do this with State 493

494 Department buildings and such now. You say, if there was an aerosol release outside, how would it penetrate, and how would we deploy sensors to detect that. What you get is amenable to art gallery theorems. You have vents, windows and doors and things where this can come in. That is a completely different approach, but very valid in the sense of defending. Finally, these two things have to come together. The evidence has to map ultimately into the inference chains. There is no such math. These are different bodies of mathematics that simply aren't miscible at the current time. This is the punchline. Where should we focus? Well, certainly calculi for automated inference for fusion, and issues in accounting for granularity, completeness, and causality. We need to solve the problem of the asynchronous arrival of rare evidence, and the optimal placement combinatorial problem of sensors and assessors, especially from the point of view -- and this is the twist that hasn't been done -- of the rapid convergence of evidence in the fusion algorithms that are used, not just the placement to meet some objective function. Finally, the integration of these mathematical representations, or the interoperation of them, if you 494

495 will, to transform from the continuous dynamical systems, boundary value type problems that you get when you look at diffuse phenomena, to the inference webs and chains that have to occur if you are going to do fusion on machines. Thank you. 495
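As a concrete illustration of the population-ecology style of modeling of worm spread that was described above, here is a minimal sketch assuming a simple susceptible-infected (logistic growth) model for the number of infected hosts on a LAN, integrated with an Euler step. The parameters, the time step, and the what-if comparison are all hypothetical and are not the model from the talk.

```python
# Hypothetical sketch: logistic (susceptible-infected) growth of a worm on a LAN,
#   dI/dt = beta * I * (N - I) / N,
# integrated with a simple Euler step so the effect of a protocol change on the
# contact rate beta can be explored in a what-if comparison.
def simulate_worm(n_hosts=1000, beta=0.3, i0=1.0, dt=0.1, steps=600):
    infected, history = float(i0), []
    for _ in range(steps):
        history.append(infected)
        infected += dt * beta * infected * (n_hosts - infected) / n_hosts
    return history

baseline = simulate_worm()
throttled = simulate_worm(beta=0.15)   # what if a protocol change halves contacts?
print("infected at step 300:", round(baseline[300]),
      "vs. throttled:", round(throttled[300]))
```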

496 Reasoning About Rare Events Tod Levitt The goal of data fusion is to allow us to consciously integrate data from organizations such as the Centers for Disease Control, the U.S. Department of Agriculture, the Treasury Department, and the intelligence agencies, so that the information is automatically sifted and compared, regionally focused, and then hierarchically fused up from the regions. It is absolutely critical to build an automated system in order to sift and compare collected data with highly complex mathematical models. Automated reasoning is a function of how observed data may depart from baseline conditions. Researchers are faced with several challenges in the area of data fusion. First, it is crucial to verify the accuracy and timeliness of data. Also, given the breadth of the United States, any action taken is going to be enormously expensive. Some areas of future work include the following: Developing calculi for automated inference for fusion; Determining when information is relevant and how complete it must be in order to give a level of accuracy necessary for drawing conclusions; Handling the problem of asynchronous arrival of rare evidence; and Transitioning from the mathematical models that arise in studying diffuse phenomena to inference webs and chains that are required for data fusion. 496

497 Kathryn Laskey "Knowledge Representation and Inference for Multisource Fusion" Transcript of Presentation Summary of Presentation Power Point Slides Video Presentation Kathryn Laskey is a professor in the Systems Engineering and Operations Research Department at George Mason University. She received a B.S. in mathematics from the University of Pittsburgh, a master's in mathematics from the University of Michigan, and a Ph.D. in statistics and public affairs from Carnegie Mellon University. Her research interests include Bayesian inference and decision theory, multisource fusion, uncertainty in artificial intelligence, and situation assessment. 497

498 DR. LEVIS: Thank you. We will proceed right away with Kathryn Laskey, who is on the faculty of the systems engineering and operations research department at George Mason University. That is where I also am in my real life. Kathy. DR. LASKEY: I am going to talk about multi-source integration. DR. LEVIS: While Kathy is working on the hardware problem, I mentioned before some of the work my lab is doing on Bayesian networks, et cetera. I send my students to Kathy to learn that stuff. That gets me off the hook. DR. LASKEY: Before I plunge into my talk, I want to stop and say, these days it is hard to talk about multi-source information fusion without talking about Bayesian networks. The inventor of Bayesian networks was the father of Danny Pearl, so I would like to dedicate my talk to Danny Pearl and his father, and to say that I hope that Danny Pearl's memory will live on in those of us who apply his father's research to the problem of homeland security. I am going to talk about some basic requirements for inference and decision support for homeland security, which you already heard some of in my comments at the microphone earlier. Then I am going to talk about -- most 498

499 of my talk is focused on knowledge representation, which is my area of research. In particular, I liked David's remarks yesterday about the cathedrals. I tell freshmen who are coming into my university, in our information technology engineering school, that they are living in a very exciting time. If you think about the course of human history, there was the agricultural revolution, where we learned to apply technology to food production. It enabled the emergence of cities. Then there was the Industrial Revolution, where we learned to apply technology to the development of the production of physical systems. Now we are moving into information technology, where we are learning to apply technology to information processing. I believe that the impact on human society will be every bit as fundamental as in the agricultural and industrial revolutions. So I want to talk about representation of knowledge for the ability to do information technology. Then I will talk briefly about open challenges. This is a quote from the Washington Post on September 12, which pretty much encapsulates what the challenges are in front of us. We want to be proactive rather than reactive. I'm not going to spend much time on this Vu-graph, because everybody said this already, but my 499

500 favorite phrase that I coined to speak about the issues that confront us with this needle and haystack problem is, data, data everywhere and not the time to think. The Joint Directors of Laboratories has developed a taxonomy of information fusion, broken it into levels. Level zero is pre-detection. That is, we are observing things that are observable in the environment, but we haven't yet said there is something out there, and can we fuse different sources to enable us to do a better job of detecting things that are out there. Level one is detecting individual entities, like, I see a tank there, or I saw an animal with anthrax there. Level two is to support situation assessment, which is militarily significant for the Joint Directors of Laboratories, but for our problem, a significant configuration of entities in the environment that has some meaning. So for example, I have a company of vehicles, about 30 of them, or I have a pattern of unusually high incidence of anthrax. Level three is support of threat assessment, which is, what is the intention of my adversary, so there is a bioterrorist attack. Level four is sensor management, tasking with collection assets. That is the question of where do I put 500

501 my sensing resources so that I can do a better job of detecting what is out there. What is really needed in practice, and what mathematicians can help with, is a unified, theoretically based technology that spans all five of these levels. Right now, it is pretty ad hoc and things are separate. We need to be able to span all the levels, and we need to be able to have a theoretically justified, unified approach. If we are going to do a good job of this problem, we need to support and not replace the human. That means that we need to combine expert judgment with prior knowledge; what is important is that we need to be able to apply expertise. In the homeland security problem, it is essential to be able to fuse both hard and soft types of knowledge. So not just the biology of disease transmission, but also, for example, the knowledge of the political scientist about how conflicts evolve, the anthropologist, the social psychologist, the clinical psychologist on the mind of the terrorist. We need to look at -- a lot of different kinds of knowledge have to be brought to bear, and they are not always easily quantifiable, but they have to be brought together in a common framework. 501

502 We need to be able to present results so that humans can use them. As has been mentioned before, we need to handle huge volumes of data. We need to span the five levels of fusion. We need to perform effectively although there is uncertainty: there are types of threats that we haven't seen before, configurations of things that are meaningless in isolation but meaningful when you put them together, and huge numbers of possibilities to consider. We also need to be able to learn from our experience. Not just humans learning and not just data mining, but we need to be able to combine the human strengths of pattern recognition and the computer strengths of sifting through large quantities of data to have an effective learning capability that makes use of both human expertise and automated processing. Let me talk a little bit -- this is where I am going to talk about Danny Pearl's father. I am going to talk a little bit about knowledge representation, but first let me talk about a paradigm shift that I see occurring in computing technology. The old paradigm is, you think of a computer as an algorithm running on a machine, executing deterministic steps, and it either gets the right or the wrong answer to the problem that you are solving. We are using logic, but 502

503 a computer is an automaton that processes preprogrammed steps. There is a new paradigm that is occurring, actually. We hear these buzz words like agent-based systems. In fact, there is a recipe in recent years for getting a Ph.D. thesis, I have noticed. You take your problem, some very complex and high dimensional optimization or statistical inference problem, and you map your objective function if it is an optimization problem, or your measure of your log likelihood if it is a statistics problem. You call that energy or action, and then you invent a fictitious physical system, and you have these particles wandering around in the physical system, and you import methods from statistical physics to solve your problem. We have got the explosion of Markov chain Monte Carlo methods in statistics, and variational methods. This is a recipe for getting a Ph.D. thesis. Hopfield nets were one of them; Radford Neal's work was another. If you think about this agent-based computing, we think of agents making decisions to achieve their objectives, and instead of writing programs, we are designing dynamic systems which by the laws of physics will evolve in such a way that we will optimize whatever we want to optimize, or infer whatever we want to infer. Then 503

504 we can put these in hardware. People are building neural net chips in hardware. In decision theory, we maximize expected utility or minimize expected loss. In physics, we minimize action. So let's think about building agent-based systems, and this paradigm shift actually is occurring. Instead of programming computers, we are designing economies of interactive agents. Just like any paradigm shift, the old paradigm is a limiting case of the new paradigm. Let me tell you very briefly what a knowledge based system is, because we are talking about representing knowledge. It is a computer program that is supposed to behave intelligently, so if you are going to design intelligent agents, you have got to use knowledge based systems. So you need to have a representation language, which has a syntax as its grammar. The ontology defines what I am allowed to talk about in my domain. It is like my vocabulary, and it has the semantics, which defines the meaning. I now have a knowledge base and an inference engine. The knowledge base is the domain-specific knowledge. The basic idea with the knowledge based system is to separate representation of a problem from solution of a 504

505 problem. And mathematicians are good at transforming from one representation to another, so if I can represent a problem in terms that are meaningful to an analyst, but put it into a computable representation, it might take until the end of the universe to compute, but it is mathematically well specified, then a mathematician can take that representation and either approximate the solution or come up with a different way of representing it that is more efficient to solve it, and this is the kind of thing that we need to be able to do. We need to have a unified theory for knowledge based systems. We see a synthesis occurring between traditional artificial intelligence, which looked at structured representations for knowledge, and complex search algorithms, probability and decision theory, which can deal with uncertainty and can deal with objectives and values. Bayesian statistics can bring observations to bear as a theory of belief dynamics, of updating with evidence, and then database management, which taught us to separate the logical view of the data from the physical representation on the computer, methods of access update, administration security and shared data repositories. We can put all these things together into a theory of knowledge based systems. 505

506 I am going to talk now about a Bayesian network. Then I am going to talk about how Bayesian networks are evolving into graphical modeling language for specifying agent-based systems. This is an example that has nothing to do with homeland security, but I have used it as a classroom example that illustrates the basic ideas. Maria starts sneezing. This is a Bayesian network about Maria's sneezing problem. She might be sneezing because she is having an allergic reaction, or because she has a cold. She wants to go visit her grandmother -- this is the utility value -- but she cares about the health of her grandmother, and she cares about pleasure from the visit. If she has a cold, she doesn't want to go because she cares about the health of her grandmother. This is called an influence diagram or decision graph. The probability part of it is called a Bayesian network. So she starts sneezing, and she draws a plausible inference, so her probability of having a cold went from about eight percent to about 50-50 from sneezing. So the utility shifts here, and she shouldn't go and see her grandmother. 506
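The single-node update described above can be reproduced directly with Bayes rule. The following minimal sketch uses hypothetical conditional probabilities chosen only so that the numbers match the quoted figures -- a prior of about eight percent for a cold rising to roughly 50-50 once sneezing is observed; the actual network in the talk is richer, and the later explaining-away step (scratches suggesting a cat, which suggests allergy) is what the distributed propagation algorithms compute over the full graph.

```python
# Hypothetical sketch of the sneezing update: numbers chosen to reproduce the
# quoted figures (~8% prior on a cold, ~50-50 after observing sneezing).
p_cold = 0.08                       # prior probability that Maria has a cold
p_sneeze_given_cold = 0.90          # likelihood of sneezing if she has a cold
p_sneeze_given_no_cold = 0.08       # likelihood of sneezing otherwise

# Bayes rule: P(cold | sneeze) = P(sneeze | cold) P(cold) / P(sneeze)
p_sneeze = (p_sneeze_given_cold * p_cold
            + p_sneeze_given_no_cold * (1 - p_cold))
p_cold_given_sneeze = p_sneeze_given_cold * p_cold / p_sneeze
print("P(cold | sneezing) =", round(p_cold_given_sneeze, 2))   # about 0.49
```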

507 But then she sees scratches on the furniture -- this updating, as we have already talked about, is done with distributed propagation algorithms. It makes her think there is a cat nearby, which makes her think she is having an allergic reaction, which explains away the sneezing, so she doesn't need cold as an explanation anymore, so now it is okay to visit her grandmother again. This is a trivial example. We have built much more complicated systems than this on much more interesting applications. But it shows you how you can use these Bayesian networks to represent commonsense knowledge. Let's talk about what happened here when Maria was sneezing. The decision graph is both a knowledge representation and a computational architecture. So it formulates somebody's model in terms that are meaningful to a human being. We have worked with domain experts, and we can build a fairly complex domain model within a couple of days with an expert. The expert can play with a model and modify it in a day or two. I have had junior high kids playing with Bayesian networks, and they understand them and they like them. I can get students after one or two classes to build passable Bayesian networks. So it really is a human understandable language for specifying very complex, hundreds of 507

508 variables, probability and decision models, and the output accords with human intuition. If there is just the probability nodes, that is a probabilistic inference problem. You can also put decision nodes, and it is a decision theoretic optimization model. There are inference algorithms for computing the marginal distribution of one variable, given evidence on the other variables. So we learn the probability. We updated it, learned that Maria was sneezing and that changed our probability, and that was Bayes rule, and it was a distributed algorithm for applying Bayes rule. There are also what they call learning algorithms, which is parameter estimation, where you use data sets to update the parameters. Most of the methods that are applied are getting increasingly sophisticated, but in the artificial intelligence field, most of the algorithms that are applied are fairly straightforward from a statistics perspective. They tend to be very simple models and independent identically distributed observations with no missing data, things like that. They are becoming more sophisticated now. From a mathematician's perspective, there are a lot of fascinating problems there. In geometry, in inference, for example, if you take a Bayesian network and 508

509 you don't know the structure so you don't know what the arcs are, then you don't know what dimension your manifold is. If you do these mixture models, where I have one probability of this model, one probability of that model, one probability of that model, there may be hundreds of thousands of them, you get something called a stratified exponential family, which is an interesting mathematical structure. It would be interesting to do some theory on the properties of those things and the estimators that you get. What I just showed you was pretty much the state of the art up until about 1990, and is really exploding now and being applied all over the place. There are hundreds of papers on Bayesian networks now coming out every year. But that was a one-size-fits-all model, in the sense of the types of applications people do. You might have a clinic where you have thousands upon thousands of patients coming in, but you have exactly the same symptoms for every patient, exactly the same background variables, exactly the same diseases that you are worried about. That is not good enough for this kind of problem. We don't know how many terrorists there are out there. There are lots of different stockyards that they might be 509

infecting. There are lots of different plans that they might have. There are just too many things to have one grand Bayesian network to cover them all. There is also the temporal dimension, which makes things much more complex. Let me talk about Maria again. I am going to go from the sublime to the ridiculous. Our first variation is a very trivial variation. We have Tran. He is sneezing and he saw scratches, but he was recently exposed to a cold. Unlike Maria, he is not allergy prone. That is very easy. The second variation is, he is the one who saw the scratches, Maria didn't see the scratches, and he is in the room with her. That is a little bit more complex. The final variation I am going to talk about is that they are both sneezing, and this time they are both allergy prone, and we don't know about their cold history, and they both saw scratches, but they are a continent apart. Those three variations are going to illustrate slightly more complex types of reasoning that come up in the homeland defense area. Variation one I can do just by adding a background variable, that somebody is or is not allergy 510

511 prone. Maria is, and we don't know about Tran, so this one is not gray, and it is not either zero or one. Then there is exposure to cold. We don't know it on Maria and -- we do know it on Tran; these guys are mirror images of each other. It turns out that Maria should visit her grandmother and Tran shouldn't visit his. All we have done is added a few extra variables to cover all these situational factors. That is what people have been doing with Bayesian networks. You add variables that cover all the situational factors that you might want to model. In variation two though, that doesn't quite work. We have repeated sub-structures. I have replaced cat nearby with Maria being near a cat and Tran being near a cat, and there is a cat here related to the location of Maria and the location of Tran. Maria is near Tran, so we find out if Maria is near Tran, then they both have to be near the cat. I've got two copies of the same model, and I have pasted them together like Legos. That is the direction that things are moving. Notice as a statistician, there is replication here for learning that this piece and that piece are the same, or close to the same or whatever. It 511

512 is an interesting statistical model and problem to learn these kinds of models. This is what happens if we try to apply that model, but then we try to put them a continent apart, and have them both see scratches. What happens is, they are both 50-50 near a cat. That illustrates something that in the fusion literature is called the hypothesis management problem. We haven't enumerated enough cats to cover the situation, so we have the hypothesized entities. Here I have all this spaghetti up here, where I have got two possible cats, one of which might be the cat that is near Maria and one of which might be the cat that is near Tran. It turns out that this model is essentially the same as variation one, variation two and variation three done right. In other words, I can use this model for everything, but I don't need it if I only have one cat, and I don't need it if Maria and Tran aren't related at all. So the question is, how much of the model do I construct and how much do I prune away. That is something that has got very interesting mathematical challenges. I like to let my experts specify the model in conceptually meaningful units. So I've got my cats and allergies fragment, that talks about allergic reactions. I've got a spatial reasoning fragment, which is rather 512

513 rudimentary for this problem. This just says that if I am near you and you are near somebody else, then they are near me, too. I've got my evolution of colds and time fragment, which is a very simple Markov chain here. I've got my value fragment, which says I care about my grandmother's health and I care about my pleasure. I can paste these things together, and from a logician's standpoint this kind of model has first-order expressive power, whereas the previous kind of model had propositional expressive power. This is the way things are moving in artificial intelligence. A simpler model gives the same results as the more complex model. From a mathematician's standpoint, we are building an infinite dimensional Bayesian network implicitly in our knowledge base. We need to think about the mathematical properties of these algorithms. This is an architecture for a system that retrieves Bayesian network or decision graph fragments and pastes them together. I will just show you that there is this architecture. Let's get to the challenges. This is a bunch of applications that I have been involved in with various students and colleagues. I can make these Vu-graphs 513

514 available to you. I can let you read through some of these, but I want to move to a few Vu-graphs on challenges, and I am running out of time, so let's go ahead here. Model life cycle management, speaking as a systems engineer. We want to be building reusable models. So we need to have a library of fragments. We need to have something that the database people call metadata, which describes the model's capabilities, inputs and outputs, and lets different models pass information between them. From that perspective, the modeling language provides a universal representation language that describes generic decision and inference problems. We need software tools that are theory based. Inference and learning technology is probably more meaty for mathematicians. Efficient solution methods: temporal reasoning is intractable. Decisions over time are even more intractable. Value of information, as Tod mentioned, is even more intractable. Then there is value of computation. Standard probability theory assumes that I know all the logical consequences of my beliefs, but I don't know the optimum computation before I have done it. People have modeled -- Eric Horvitz from Microsoft has done a lot of work in value of computation, 514

515 applying value of information to decide whether to perform a computation or not. We need addual modeling. That is a buzz word that came from a DARPA program in addual modeling, the idea of being able to take these model pieces and break out when we get new problems that we haven't seen before. Some parts of the world don't change at all, and we want to leave those fixed, and we want to change only those parts that we need to change, and how do we structure our models so that they are easily changeable. Multi-resolution modeling. I want to model at different resolutions because of computational efficiency or other factors. The really important thing is deception. If they know I am modeling them, how do I model the fact that they know I am modeling them. Game theory is relevant there. Support for human-computer interaction. We want to combine expert knowledge with designed experiments and observational data. Most of the learning methods cannot handle anything more complex than independent, identically distributed observations and a conjugate prior. You have got to get better than that. Human-computer interaction design for model input, model output, sensitivity and uncertainty analysis, 515

and collaboration among heterogenous and geographically dispersed analysts. I think that is the end. Summary, one more. We are moving from hand-crafted special purpose models to reconfigurable model pieces that humans work on parts of the problem and then piece together with other parts of the problem. It is something that we really need theory to understand what is going on. That is the direction things are going. There is lots of useful application experience. By the way, Clippy in your Microsoft office products is a Bayesian network, in case you didn't know. DR. CHAYES: But the interface was not written by the Bayesian. DR. LEVITT: Eric does not like the obnoxious little -- he says if they had used his decision theory on when to bother you, as well as the Bayesian stuff on what do you want to be doing, that people would like Clippy a lot better. DR. CHAYES: Eric is very embarrassed by this. This is used as the example of his work, and he didn't have anything to do with the obnoxious part. 516

517 Knowledge Representation and Inference for Multisource Fusion Kathryn Laskey Excellent inference and decision support for homeland security requires supporting expert, human judgment with data; extracting key conclusions from volumes of heterogeneous data; and effective performance in the presence of uncertainty and ambiguity. There are many recent advances in knowledge representation and inference technology, such as dynamic computing systems, whose solutions improve over time. We need to have a unified theory for knowledge-based systems, and in that spirit it was noted that a synthesis is occurring between traditional artificial intelligence (which looks at structured representations of knowledge) and complex search algorithms involving probability and decision theory (which can deal with uncertainty, as well as with objectives and values). The main links between the two arenas are Bayesian statistics, which can bring observations to bear in updating evidence, and database management. Dr. Laskey discussed Bayesian networks, which essentially are causal graphs associated with underlying probability distributions. She offered some examples to illustrate Bayesian networks' usefulness for representing common-sense knowledge and assisting in decision-making. The decision graph is both a knowledge representation and a computational architecture, so it formulates somebody's model in terms that are meaningful to a human being. It can be used for specifying very complex probability and decision models, with as many as hundreds of variables, and the output will still accord with human intuition. 517

518 Valen Johnson "A Hierarchical Model for Estimating the Reliability of Complex Systems" Transcript of Presentation Summary of Presentation Power Point Slides Video Presentation Valen Johnson is a professor of biostatistics at the University of Michigan. He received a Ph.D. in Statistics from the University of Chicago in 1989 and was a professor of statistics and decision sciences at Duke University prior to moving to Ann Arbor in the fall of 2002. He is a fellow of the American Statistical Association, past treasurer of the International Society of Bayesian Analysis, has served as an associate editor for the Journal of the American Statistical Association and IEEE Transactions for Medical Imaging, and is a Savage Award winner. He is coauthor of Ordinal Data Modeling with James Albert and author of Grade Inflation: A Crisis in College Education. His research interests include ordinal and rank data modeling, Bayesian image analysis, Bayesian reliability modeling, convergence diagnostics for Markov chain Monte Carlo algorithms, Bayesian goodness-of-fit diagnostics, and educational assessment. 518

519 DR. JOHNSON: The title of the talk is Estimating the Reliability of Complex Systems. This is joint work with Todd Graves and Mike Hamada at Los Alamos. Before I get into the talk, I would like to acknowledge some of the people we have been working with. A lot of the work started as an application in the nuclear weapons program at Los Alamos. We were trying to model the reliability of nuclear weapons that can't be tested anymore. But we have also been applying some of these methods to some Air Force problems, the F-22 safety program. We have been working with the Army, and the assessment division of the engineering directorate has provided some funding for this. We are also trying to apply this type of methodology to the Ballistic Missile Defense -- or I guess, the Missile Defense Agency now. Here is a non-classified example of what we are doing. This represents the system diagram for an anti-aircraft missile. This particular weapons system has 17 components. The fault diagram -- in this case, the way these fault diagrams work, this notation means that for the guided missile on this weapons system to work, all of the sub-systems down here have to work. The idea here is that we are going to in general, and in particular for nuclear weapons, we are going to have 519

520 prior expert opinion about the probability that any one of these particular sub-systems works. The prior expert opinion can be very important in the nuclear weapons system and in the anti-aircraft missiles case, because you can't test or you can't get as much information from actual data as you would like. So we need to use expert opinion. The expert opinion is going to come at different system levels. You may have a physicist who can give you information at the systems level. You might have an engineer who can give you information at a valve or something level. So we need to incorporate all these different sources of expert opinion. We also may have binomial data that has also been collected at different levels in the system. It may also be collected from different systems. So we want to incorporate that into estimation of system reliability. We need to handle sparse data. I'll show you why that is important in a little bit. Then in the weapons program, we may have five or six different weapons systems that are all similar, and we want to model the similarity across these different systems. My talk is going to be a little bit different from some of the previous talks, in that I am going to become somewhat specific in how we are modeling this 520

521 reliability. Just a little bit of notation. In some of the slides I am going to show in a minute, Ci is going to denote the component or the sub-component I have in the system. The pi will denote the probability that that sub- component functions. These are the numbers that we are really interested in. We want to do inference on those. When we have data for a system, xi will denote the number of successful trials, and ni's will denote the total number of trials. So different types of information we might have. PARTICIPANT: (Comments off mike.) DR. JOHNSON: That is the probability that that component functions. We are not going to know that, we are going to estimate that. Generally you don't know that. DR. CHAYES: So it is the real problem. DR. JOHNSON: Real, but unknown. The first sort of information is specific prior expert opinion concerning pi's. So you go to an engineer and you ask him, what is the probability that this valve functions. The engineer in this case is going to tell me, the probability that that valve functions is A. So maybe it is .93. We model this in a beta type density function, but we haven't fixed the K. So A is the maximum value of this beta density. We are going to model the precision of 521

522 this beta density. As K gets bigger, the beta density collapses around this estimate. We are going to see how consistent the expert is with the other data we have and with the other experts. For specific prior information, let me just emphasize that the A is fixed, but we are going to estimate the K, and we are going to use the beta density to do that. PARTICIPANT : (Comments off mike.) DR. JOHNSON: We have flexibility on how we can do that, but we have been assuming that the K is the same for the same expert. So if an expert comes along and gives us guesses that are not consistent with the data, we down weight his guesses for all of the components, whether we have data on them or not. Generic expert opinion, if we go back to the fault diagram, what we want to do is, we want to allow experts to come in and say, maybe this control assembly and this guide assembly have similar success probability or similar reliabilities, but we don't know what they are. He won't say it is .93, but he will say these two things are similar. If we do a similar thing here, again we are going to use a beta density to estimate all the probabilities within a group of components that an expert has said should 522

523 be similar. The difference now is that we don't fix C. So we are going to have a precision parameter to say how tightly these components are grouped together, and we are going to have a C which we are going to also estimate from the data. So we generally will put a Jeffreys prior on C and we'll put a gamma prior on M. But we are going to estimate both of those parameters. Then finally what we do -- and we don't have data on all the nodes in this system. We can't for example compute a system-wide probability if we only have data on five or six of these terminal nodes. So we are going to assume another grouping type prior on the terminal nodes in our fault tree. So that is the final piece of the puzzle until we get to the data. Here we assume a beta distribution as well. So we take a hierarchical specification, a lot of priors on the terminal nodes. For all of the terminal nodes, those are the nodes on the tree that don't have any sub-components. We take this type of beta. We say we don't know what the parameters of the beta are, but we have a parameter B which we estimate, which is the mean reliability of all the terminal nodes. The M is going to 523

524 estimate how tightly they are grouped together, whether they are all precisely the same or very much different. PARTICIPANT: (Comments off mike.) DR. JOHNSON: What we do here, we say the terminal nodes have some probabilities. We don't know what the mean probability is. We assume that B is drawn, in the next stage of the hierarchy, from a beta density. Generally what we assume here is a Jeffreys prior. So C and D are taken to be one half. PARTICIPANT: (Comments off mike.) DR. JOHNSON: M also has a prior. PARTICIPANT: M is not drawn from -- M would be determined by -- DR. JOHNSON: No. We take a gamma prior typically for M. As Kathryn mentioned, this is sort of like a Bayesian network. A lot of these are conjugate priors -- we use conjugate priors because they are convenient, but it is not necessary. The data terms of course are coming in as binomial observations. I'll mention at the end how we are extending those also to time-varying cases for this anti-aircraft missile data. Then finally, if you look in the reliability literature, you will find that there has traditionally been a consistency problem in these types of nets, because people 524

525 have introduced different probabilities in different levels in the system. Then they use different priors and combine data in ways that give you incoherent results. We have avoided this problem in our model by using the fault tree specification to relate the parameters. So for example, for component four, I am not going to model the p sub four directly, because of the fault tree -- let me first say, if I had ten successes and one failure, then the binomial probability at p4 would be this, except for the fault tree. When you look at p4, p4 only works if component 8 and component 9 work. So we just re-parameterized. We said p4 is identically equal to p8 times p9. We are developing diagnostics to see when that actually holds as well. That is sort of an important point, because if you have one system level test, all this information is transferring itself all the way up and down the fault tree now. So for example, if I have N trials at the system level, the likelihood that comes from those N trials takes this form, where we have the product of the probabilities of all the terminal nodes, and one minus that same product. So now we have information about all of the terminal nodes from one system test. 525
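A minimal, hypothetical sketch of that reparameterization may help: in a series system the reliability of a parent node is the product of its children's reliabilities, so a batch of system-level binomial tests contributes a likelihood term that involves every terminal node at once. The component values and test counts below are invented for illustration and are not the actual data.

```python
# Hypothetical sketch of the fault-tree reparameterization: p4 is not a free
# parameter, it is identically p8 * p9, and a system-level binomial test ties
# every terminal node together through the product of their reliabilities.
from math import comb

def series_reliability(terminal_ps):
    """A series system works only if every terminal component works."""
    p = 1.0
    for pi in terminal_ps:
        p *= pi
    return p

def binomial_likelihood(p, successes, trials):
    return comb(trials, successes) * p**successes * (1 - p)**(trials - successes)

p8, p9 = 0.97, 0.95
p4 = p8 * p9                                  # the reparameterization p4 = p8 * p9
terminal_ps = [p8, p9, 0.99, 0.98]            # all terminal nodes in this toy tree
p_system = series_reliability(terminal_ps)

print("p4 =", round(p4, 4), " system reliability =", round(p_system, 4))
print("likelihood of 19 successes in 20 system tests =",
      round(binomial_likelihood(p_system, 19, 20), 4))
```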

526 The joint posterior distribution which is the thing that we are going to use to do inference on all these probabilities, is just obtained by multiplying the specific expert opinion, the generic expert opinion, the hierarchical specification on the terminal nodes and the data terms all together, so we are assuming essentially independence there. We get a joint probability density on all the nodes in our system. Ten years ago, that would have been a hard problem to address, but now of course with Monte Carlo algorithms, we can go through and sample these probabilities very easily. We can update all the precision parameters, all the probability parameters, and even the hierarchical means and the generic means. Let me go through a quick example of how this works. We had some very unique data for testing this model. We had data from an anti-aircraft missile system, approximately 350 component level tests were performed on the system. I can't show you the actual data because it is proprietary. But it also turns out that they have done 1400 system level tests. So I have gone through and I have done the analysis just using the 350 component level tests, and then also doing it using the 350 component level test plus 526

527 the 1400 system level test to see how the two results compare. Here is what we ended up with. We do have the 1400 system level tests for component one, where I have drawn X's through components where we had no component level data. You can see where we get rid of the entire weapon test. There may be problems in predicting the missile round and various other components if you look through this. For example, component 8 and component 9 may not be identified, because we just have the test up here. So it is a fairly complicated problem, and it is also representative of a lot of the nuclear weapons system problems that we have, where we don't have test data for certain components. Here are the posterior distributions. For the reliability of the system as a whole -- and again, I apologize for not being able to display the actual axes here. These axes are between zero and one. We are using the 1400 system level tests and the component level tests. We get a posterior distribution that looks like this black line, so that is the posterior distribution. When we throw away all of the system level tests and just use the component level tests, we get the red line. As you expect, the posterior distribution based on 527

528 just the component level tests is quite a bit broader than it is using system level data, but it is reassuring, in that it seems to cover the posterior distribution that we would have gotten using all of the system level tests. So we were heartened by this. We had expected a priori that when we did this analysis, the posterior for the full system would be down here somewhere, just because of assembly problems, the fact that the components might not interact together. We had anticipated having to put an assembly node into our fault diagram to account for that. But in this case, the data just didn't support that. We tried this to see what the effects were on components where we have a lot of data. For one of the sub-systems that had a lot of test data, the posterior didn't change much when we got system level tests and when we incorporated that information, whereas for components where there was no data, the posteriors did change somewhat, because the system level tests did provide that additional information. DR. BORGS: (Comments off mike.) DR. JOHNSON: That is a good question, and we wondered why that was happening. It turns out, when we talked to the people who provided us with this data, they 528

had made improvements in the manufacture of the weapons system over time. You can actually see that if you go through and do a logistic regression just at the component level, where you have the time for each of these tests. Did I answer that question? I think I answered a different question.

Here are some extensions that we are currently pursuing. For some missile test data that we had, there was a time associated with all the binomial data, so instead of just having a beta distribution at each node where we have data, we are now looking at a logistic regression model. In other applications we are incorporating data from computational physics models, where they run computer codes, for example, to predict outputs of systems; we have to incorporate the uncertainties in the input parameters to these codes. There are also material degradation models and different types of engineering data.

I think an important point to make is also that the model provides guidance on where additional information can best be obtained. We can look at the posterior distributions of the different components and see where we would like to get more data if we wanted to better estimate the reliability of the system as a whole.
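The logistic-regression extension mentioned above can be sketched as follows; this is an illustration of the general idea, not the Los Alamos implementation, and the yearly pass/fail data are simulated. The per-node success probability is allowed to drift with test date, which is one way to capture manufacturing improvements over time.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical time-indexed test results for one component: year of test and pass/fail.
years = np.repeat(np.arange(10), 30).astype(float)
true_p = 1.0 / (1.0 + np.exp(-(0.8 + 0.15 * years)))    # reliability improving over time
y = rng.binomial(1, true_p)

# Fit p(t) = logistic(b0 + b1 * t) by Newton-Raphson on the Bernoulli likelihood.
X = np.column_stack([np.ones_like(years), years])
beta = np.zeros(2)
for _ in range(25):
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
    W = mu * (1.0 - mu)
    grad = X.T @ (y - mu)                 # score vector
    hess = X.T @ (X * W[:, None])         # observed information
    beta += np.linalg.solve(hess, grad)

print("intercept %.2f, slope per year %.2f" % (beta[0], beta[1]))
print("estimated reliability at year 0: %.3f, at year 9: %.3f"
      % (1 / (1 + np.exp(-beta[0])), 1 / (1 + np.exp(-(beta[0] + 9 * beta[1])))))
```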

So I'll stop there. There was a question that you asked.

DR. BORGS: (Comments off mike.)

DR. JOHNSON: That issue is involved in something we are working on actively, which is fitting these time-varying models. It turns out the component-level tests were done at a different time than the system-level tests, so there was a time effect on these curves. But I think the more appropriate answer here is just to say that there is uncertainty in the estimates of these probabilities. Based on just component-level tests, in general we are going to have a broader distribution for the system, and when we get the system-level information, it is not always going to fall right in the middle of that distribution. There are certain diagnostics we would like to do, for example, omitting some of the data at nodes where we have data and seeing whether we predict well what that data would have shown. But the assembly error in putting these components together is an important thing to be considered, as is the independence of the expert opinions from each other and from data that the experts have already seen.

PARTICIPANT: I have got a practical question that is less related to the --

DR. LEVIS: Will you please go to the microphone?

DR. FREEDMAN: (Comments off mike.) Are there other experiences in our society or in the world like this -- that is, where something was tested for some period of time and then, for a long period thereafter, we didn't really test it? We had to rely in some way on expert opinion that we then modeled and put together to try to infer how a system worked. My apprehension is that the farther away we get from actually testing something, the more we are deluded into believing that the aggregation of expert opinion and wonderful methodology still speaks truth. What is the empirical basis to give us confidence that this speaks truth about how a system will perform?

DR. JOHNSON: I think your question is basically, if the test data and the expert opinion were all collected before 1992, when we stopped doing nuclear tests, do we really want to use them now? And of course, at the lab, that is a major focus of investigation: what is happening with these weapons systems as they age and begin to exceed their intended lifetime.

DR. CHAYES: He is saying that the test data itself is --

DR. JOHNSON: That is what I am trying to answer. There is a major effort in trying to tie the output from

computer codes, for example, that have been designed to run very accurate simulations of what happens in a nuclear reaction, back to the test data, as well as to any other data that they can collect now -- for instance, critical experiments, where they may do hydrodynamic tests. It is an extremely important problem, and people are not naive enough to think that they can simply take data collected 15 years ago, put it into the model, and expect it to work. But they are trying to use the data collected 15 years ago to validate their codes. So in the last slide, where I said we are trying to incorporate different sources of information into this model, this binomial system, we are really trying to look at some fairly sophisticated ways of putting information into these different nodes that isn't really in conjugate form.

A Hierarchical Model for Estimating the Reliability of Complex Systems

Valen Johnson

Engineering problems may involve estimating the reliability of a system that cannot be tested. For example, at Los Alamos, efforts are under way to model the reliability of nuclear weapons. This issue also arises in handling Air Force projects, such as the F-22 safety program, and in working with the Army and the Missile Defense Agency. Consider a multicomponent system in which every component must work in order for the system to work. In order to estimate the probability that any one of these particular subsystems will work, as in the case of nuclear-weapons systems, prior expert opinion can be very important. This is because we can't test, or we can't get as much information from actual data as we would like, so we need to use expert opinion. Dr. Johnson described in detail his method for using different sources of expert opinion in the estimation of system reliability.
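For reference, here is a minimal statement of the series-system assumption that underlies the model described above; the notation is the editor's and is not taken from the proceedings.

```latex
% System reliability for k independent components in series,
% each with unknown reliability p_i:
\[
  R_{\text{system}} \;=\; \Pr(\text{all components work}) \;=\; \prod_{i=1}^{k} p_i .
\]
% Expert opinion enters as a prior on each p_i (and on higher-level nodes),
% and component- or system-level binomial test data update these priors.
```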

Arthur Dempster
"Remarks on Data Integration and Fusion"
Transcript of Presentation
Summary of Presentation
Video Presentation

Arthur Dempster is a professor of theoretical statistics at Harvard University. His research interests include the methodology and logic of applied statistics; computational aspects of Bayesian and belief function inference; modeling and analysis of dynamic processes; and statistical analysis of medical, social, and physical phenomena.

DR. DEMPSTER: That's okay. I notice that the talks have been getting shorter and shorter. I think it has something to do with Saturday afternoon. The schedule says I have five minutes, and then I pass it on to Alberto. I don't know what he is going to do.

I have a few topics I wanted to touch on very briefly that I planned in advance. Then I can come back, perhaps, to the three talks, again very briefly.

First of all, on defense, I am completely non-expert in this area. I have heard lots of wonderful things, beginning with our chairman. I am glad to hear that there are positive activities as well as passive ones, seeking out and pre-empting possible dangers and thinking about strategies for doing this. A big theme is of course modeling the whole complex system and all its actors and characteristics and environments and dynamics and so on and so forth. I perhaps would have liked to hear a little bit more about gaming: the other side is developing strategies, and we have to cope with their strategies, back and forth, with multiple feedbacks and so on. Anyway, it goes much beyond passive description of the situation.

One thing about complexity. Jimmy Savage used to say, make your models as big as elephants. Another thing

that I used to hear: Bill Cochran, who was my colleague from the late '50s to the late '70s, would quote R. A. Fisher with the advice to make your causal thinking complicated, to bring in all the causal mechanisms before you try to understand what is going on.

Obviously, the last question occurred to me when I listened to it, the idea that we are well past the period of testing. There are all kinds of dynamic changes that have gone on in the active weapons and the newly constructed ones and so on. One would have to think of all the causal mechanisms that were operating there and build them into the mathematical models as they get more and more dependent on scientific expertise and such things. So that is one set of comments on defense.

A second comment isn't perhaps very much on today's topic, but I'll back up a little bit -- I think discussants have been allowed to do this by the chairs -- and that is to say that data are dumb. What I mean by that is that data in themselves, if they are just a string of bits, mean nothing at all, and that is obvious. But even if you have lots of structures that are set up wonderfully from the computer science point of view for access and information flow and manipulation, that is not getting at what is really going on here. So the

necessary complement is scientific understanding and meaning. This aspect I think was brought up by George Papanicolaou in the seismic problem, where the databases of course are huge, but the real understanding came about through working through the science and improving that. So in all these homeland security models, I think the science is somehow primary, and we should always keep that in mind.

This session has been on fusion, and that is what I should try to address a little bit. We are talking about fusing different measurement sources that are measuring objective things with error. That is the kind of model that is important to develop and to try to pool. The other aspect of it, though, is that we need to be fusing the models that refer to measurement with the models of the underlying phenomenon, the models that represent the science. That is an even bigger task. I think, again, we should be deliberately focusing on those kinds of issues.

As to mechanisms for fusion, the way I look at it, and am conditioned to look at it, is that we have had 250 years now since Bayes presented the formal tool of conditional probability. It was 200 years ago, roughly, that Gauss deliberately used that tool in order to propose least squares for models with normal errors.

Interestingly, it has been 150 years, roughly, since Boole set out the basic ideas of propositional logic, which is essentially a formal fusion tool for deterministic information. About 35 years ago, I presented what seemed to me to be the natural combination of these two ideas in a single mechanism, and then 25 years ago Glenn Shafer gave that conception a new name, the theory of belief functions, in his Theory of Evidence book. For me, it has always been just that, the fusion of Boolean logic with probability, so it is not different from probability. But one does have to emphasize the word logic here, logic in the broad sense of trying to reason. I do want to emphasize that I think of it as a set of tools for reasoning about scientific phenomena that include deterministic models, uncertainty, probabilistic uncertainty, and so on.

One comment about it, though, is that this fusion idea rests very heavily on a concept of independence. The different information sources, or evidential sources, have to have within them some representation that is independent from source to source before either Boolean logic or probabilistic combination, or whatever you are doing, Bayesian combination, can operate.
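As a small illustration of the combination mechanism Dempster is describing, here is a sketch, with made-up numbers, of Dempster's rule applied to two independent simple support functions on a two-element frame. The independence assumption he emphasizes is exactly what licenses multiplying the masses.

```python
from itertools import combinations

# Frame of discernment: possible states of the system.
frame = frozenset(["works", "fails"])

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def dempster_combine(m1, m2):
    """Dempster's rule: multiply masses of independent sources, intersect focal
    elements, discard conflict (mass landing on the empty set), and renormalize."""
    combined = {A: 0.0 for A in powerset(frame)}
    for A, a in m1.items():
        for B, b in m2.items():
            combined[A & B] += a * b
    conflict = combined.pop(frozenset())
    return {A: v / (1.0 - conflict) for A, v in combined.items() if v > 0}

# Two independent simple support functions (illustrative numbers only):
# source 1 supports "works" to degree 0.8, the rest is ignorance (mass on the frame);
# source 2 supports "works" to degree 0.6.
m1 = {frozenset(["works"]): 0.8, frame: 0.2}
m2 = {frozenset(["works"]): 0.6, frame: 0.4}

for A, v in dempster_combine(m1, m2).items():
    print(sorted(A), round(v, 3))
# Combined support for "works" is 1 - 0.2 * 0.4 = 0.92, with 0.08 left as ignorance.
```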

If we take a specific example like Val Johnson's, one should be thinking carefully about whether the pieces of evidence that come in at higher levels in the tree are independent of one another. He and I have discussed this; he is well aware, of course, of that kind of issue. So that is a technical thing that should be of great concern.

I'll say a little bit about graphical structures. It seems to me that this technology, certainly the one I am familiar with, the belief function technology, has been closely tied to join trees, tree structures, that kind of thing, since the mid-'80s, when Shafer and Shenoy wrote a paper about it, and my student, Augustine Kong, did a thesis in 1986 developing that subject. That is not different from the better-known Bayesian network theory; in fact, it subsumes it. In some ways, these conditional independence ideas go back to Phil Dawid's 1979 paper. I regard Judea Pearl as a good friend and very stimulating, and he has certainly done a lot to move this theory out into the computer science world. I am a little surprised that Kathy ascribes it to him, since some of us have known about it rather longer than that.

A little bit about mathematics. I don't profess to be a mathematician, although I do have a math stat degree from the Princeton math department way back. I like

mathematics, and I especially like beautiful mathematics, which I think is part of its great power. I don't have a good sense that mathematical thinking has come to grips with computing and the data revolution and all those complexities and so on. Maybe it is just the reductive aspect of the way mathematicians work, but to me, what I would like to see mathematics do is provide new conceptions or frameworks for addressing scientific issues. I include all of these homeland security issues as scientific issues, in the sense that they need to be described, and what you know about them needs to be set up and formalized as much as possible.

Anyway, there is certainly plenty to do in understanding computations. You can set up a mathematical model and define an algorithm that will do it by brute force, but one big thing mathematics can do is provide clever ways around the difficulties and gain you orders of magnitude of speed. This MCMC approach that I am familiar with, and that Val has used in his example, is another area where there is a tremendous amount of mathematical development that can go on. Mathematicians in mathematical statistics have emphasized properties of procedures, and there continues to be a lot of new work going on there, very

interesting, very important, concerning the efficiency not only of methods of calculation but also of scientific methods, issues of robustness, and all that kind of thing.

So perhaps I should turn a little to the speakers. I think Alexander gave a very nice outline of one set of ways of thinking about these things, and Tod and Kathy did the same. I find that they are presenting very complex stories. I didn't have them in advance; maybe that was a blessing. One reaction I have is that I would have to make a mathematical model of each of their presentations, and then I could analyze it and draw some conclusions about what I thought. But at the moment, I am left with trying to react to a very few different things.

I certainly thought Tod's talk was extremely interesting. He threw out all these different approaches to uncertainty. I'm not so sure that what Lotfi Zadeh has been trying to do and what I am interested in doing are not miscible; I think they may be miscible in certain ways. One kind of thing that I am thinking about these days is a mathematical model of object recognition, very abstract, away from the real problem, from what the features are, the complexities or dynamics, that kind of thing. But there was an example there of a "medium" characteristic. This

object I am looking at might have a membership value of .9 for being medium. I can treat that as a simple support function in Dempster-Shafer theory, and I can discuss with him, as I have done and will do more of in the next few months, exactly what the interactions are, since he is more interested in getting probability mixed in with his technologies. So there is some hope, I think, of some of these different approaches coming closer together.

I did have the advantage of being at Los Alamos last week and talking to Tod Graves and Val Johnson about the paper, and I had a chance to look at it. I think of it as a very nice case study that can scale up to very complex examples. It does make use of the graphical model formulation in a very beautiful and non-trivial way. It brings in several different kinds of modeling. One is the traditional binomial sampling model. What I found when I thought about it was that the Bayesian aspect -- it has got a Bayesian label on it, but what the standard Bayesian would do would be to take the samples, the binomial samples and the 13 unknown probabilities, and say, how am I going to assign a prior to those 13 probabilities that I can mix with the data? But those authors didn't do that. They put in prior

information here and there, and I regard that as very much in the spirit of the belief function approach to things. So I don't think of it as belonging to Bayes in the classical sense; it is moving much more towards the belief function approach.

I thought the way they used beta priors with those capital K and capital N values in them was quite ingenious, although it seemed to me to mix a little bit the notion of expert opinion -- the "U" of I. J. Good, the expert as a source of data -- with the other aspect of it, the hierarchical thing, which was just shrinking towards a common mean. That aspect gets buried in the beta priors, and it was a little hard to see unless you are very familiar with what is going on there; I misinterpreted it the first time I read the paper. The assumption that the true values of p are drawn from some population is a judgment, not an expert opinion, not a sampling or likelihood term, but a kind of judgment on the part of the analyst who is constructing the model, to treat these things at that level as exchangeable. That is a judgment by what I. J. Good called "U," in quotes, or what Jimmy Savage someplace else called "thou."
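One schematic reading of "the expert as a source of data," supplied by the editor rather than taken from the paper under discussion: an expert judgment is encoded as K pseudo-successes in N pseudo-trials, so it combines with real binomial data exactly as additional observations would.

```latex
% Expert opinion as pseudo-data (schematic; the K, N parameterization here is
% illustrative, not the paper's exact construction):
\[
  p \sim \mathrm{Beta}\bigl(K,\; N-K\bigr), \qquad
  p \mid x \sim \mathrm{Beta}\bigl(K + x,\; (N-K) + (n-x)\bigr)
\]
% after observing x successes in n real trials, i.e., the expert contributes
% the equivalent of N extra trials to the posterior.
```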

So there is a kind of philosophical attitude going on here, whereby there is objective science -- something that I think Glenn Shafer called nature, nature as the source of causation, in the book The Art of Causal Conjecture that Tod mentioned. There is objective science of that nature. There is expert opinion, which is almost like data, and there is the "U" or the "thou" that makes some judgment that puts it all together into a model that you can have some belief in. So you need to have some kind of working philosophy there, which I think very few people have. Perhaps one of the reasons that what I regard as this very simple combination of probability and Boolean logic, Bayes and Boole, is not better understood is that what it is really talking about is this "U" performing some kind of logic. If people started to think that way, they would understand this technology a little better.

Remarks on Data Integration and Fusion

Arthur Dempster

Dr. Dempster spoke on a number of related topics relevant to homeland security.

Game theory. In order to devise an effective method for seeking out and preempting possible dangers, perhaps there should be more focus on game theory. We have to cope with the attacker's strategies, and then back and forth with multiple feedbacks. It goes much beyond passive description of the situation.

The primacy of science. Data by themselves are just strings of bits that mean nothing at all without the necessary complement of scientific understanding and meaning. The science is primary, and we should always keep that in mind.

Fusion. Dr. Dempster spoke of the need to fuse together measurement sources that measure objective quantities with some amount of error. In addition, there was a need to develop ways of fusing together the models that refer to measurement with the models of the underlying phenomena, the models that represent the science. That is an even bigger task, but one researchers should be deliberately focusing on.

Mathematics. "I don't have a good sense that mathematical thinking has come to grips with computing and the data revolution," Dr. Dempster said. "Maybe it's just the reductive aspect of the way mathematicians work, but what I would like to see mathematics do is have new conceptions or frameworks for addressing scientific issues, including homeland security issues." Although one can set up a model and define an algorithm that will do a computation problem by brute force, one big thing mathematics can do is provide clever ways around the difficulties and gain you orders of magnitude of speed.

Alberto Grunbaum
"Remarks on Data Integration and Fusion"
Transcript of Presentation
Summary of Presentation
Video Presentation

Alberto Grunbaum is a professor of mathematics at the University of California, Berkeley. His research interests include analysis, probability, integrable systems, and medical imaging. Dr. Grunbaum's current research is in medical imaging. He is studying the use of an infrared laser in place of X-rays.

DR. GRUNBAUM: Being the last speaker in this fantastic couple of days is a mixed thing. On the one hand, you are going to remember everything that I say the best, because there is nobody speaking after me. On the other hand, the real question you have is, can you get me a taxi to go to the airport? So I am not sure.

I really enjoyed this tremendously. I wouldn't claim to know anything about information fusion, even before or after the great talks. But I am very, very interested in the issue that I think brought some of these people together here, coming from so many different areas. The issue is, is there any way of bringing mathematical types in general, and that includes all sorts of different things, to think about this new and unfortunate national effort of homeland defense?

I share fully what George mentioned yesterday, and Dave McLaughlin and some other people, that it is very important to keep in mind the physics, the biology, the chemistry, the computational background that are part of these problems. Even when we formulate them as mathematical problems, that should be a very, very important element all the time. On the other hand, mathematics has this amazing, almost poetic ability of looking at a problem in a certain

context and then taking a step back, playing with the data, and all of a sudden turning around to some other problem and saying, by the way, maybe I can use these tools in this different area. That is the feeling that I have. I will be pushing for some of the community that I think I represent now, which is the community of inverse problems. There are a number of people that have worked in medical imaging and in geophysics. Those are two good success stories of uses of mathematics in areas where, if you go back say 20, 25, 30 years ago, nobody would believe that math could contribute at all. If you had tried to convince a medical doctor 30 years ago that mathematics could be useful, you would have been laughed out of any such medical meeting. Things have changed a lot.

Now, as we move in this poetic fashion from one application to a different one, I think it is very important to remind ourselves that we have to go back, start from scratch, and talk to the experts in the new field, because the problems may look the same from the mathematical point of view, but they all have very different features.

So I was actually very, very pleased when I heard our chairman talking about inverse problems, because I had prepared a slide about that ahead of time. What I will do

-- I'm sure you all want to stay here until 9 p.m., but I want to get out before that, so what I will do is use the principle of the hammer and the nail and talk a little bit about a problem that I have been playing with for a while, and hopefully the audience will tell me if it has any meaning whatsoever in terms of looking at a very complicated network and solving what from my point of view is actually an inverse problem. I already mentioned that I could have picked another application, but this is one that I am more familiar with.

A number of issues have been mentioned in these past two days that have appeared from the very beginning in the area of medical imaging, the question of high dimensionality being the main one. Back in the '70s, maybe even earlier than that, we finally realized something that by now everyone realizes is real, but it wasn't real at all back then. You have some function of two variables; think of it as the density of human tissue as seen by X-rays. If you measure all the line integrals, putting in sources and detectors in the arrangement that I have here, with an X-ray source here and a measurement over there, you realize that the intensities are not the same. So there is a model that tells you that what takes the intensity

here to the intensity there is the integral of this unknown function along the line. Do that, and then eventually, by doing this with all the different lines, you recover the function. Nowadays this is part of the culture.

An element that was mentioned yesterday in the very first session has to do with the training of analysts at NSA. The issue of training radiologists is actually very similar; people look at the same set of pictures and draw very different conclusions. So this is just an example that some of these things -- what is my thesis? I am trying to say that you should go around and look for the mathematicians that have already looked at some of these issues.

I have been fooling around for the last few years trying to use lasers instead of X-rays. Now you have to be able to perform a discovery. The phenomenon is that, instead of having linear equations from the very beginning, the physics is so complicated -- again, you cannot ignore the physics, which determines everything. If the source is over here and is applied repeatedly, and you put on goggles to be able to see any of this, you are going to see light coming through your hand on the other side.
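Here is a toy, discretized version of the X-ray picture just described; it is an editorial sketch, not anything presented at the workshop. The unknown attenuation is a function on a small grid, each measurement is a sum of pixel values along a line, and reconstruction becomes a linear inverse problem. With only a few families of lines the system is underdetermined, which is exactly why one wants measurements along "all the different lines."

```python
import numpy as np

n = 8
truth = np.zeros((n, n))
truth[2:6, 3:6] = 1.0          # a made-up "object" inside the tissue

rows = []
def add_line(pixels):
    """One measurement row: a 0/1 indicator of the pixels the line passes through."""
    a = np.zeros(n * n)
    for (i, j) in pixels:
        a[i * n + j] = 1.0
    rows.append(a)

for i in range(n):
    add_line([(i, j) for j in range(n)])                           # horizontal lines
for j in range(n):
    add_line([(i, j) for i in range(n)])                           # vertical lines
for c in range(-n + 1, n):
    add_line([(i, i + c) for i in range(n) if 0 <= i + c < n])     # diagonals j - i = c
for s in range(2 * n - 1):
    add_line([(i, s - i) for i in range(n) if 0 <= s - i < n])     # anti-diagonals i + j = s

A = np.vstack(rows)            # 46 line sums for 64 unknown pixel values
y = A @ truth.ravel()          # "measured" line integrals (noise-free here)

recon, *_ = np.linalg.lstsq(A, y, rcond=None)
# The measurements are matched essentially exactly, but with only four families of
# lines the image itself is not pinned down; more view directions are needed.
print("data misfit:           %.2e" % np.linalg.norm(A @ recon - y))
print("image error (max abs): %.3f" % np.abs(recon.reshape(n, n) - truth).max())
```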

So I want to take this as a study problem. I want to make propaganda for this journal published in England; I do this because I happen to be the editor right now. We are finding that a whole community of people coming from many different areas, some of whom even call themselves mathematicians, have dealt with a number of issues that are very, very -- the way somebody was talking about, at what level should we move in matters of analysis? This is a very classical problem in medical imaging: should we go for higher spatial resolution or quantum resolution? The truth is that we would like to do both at the same time. In a very rare event, there is a difference in attenuation on the order of one or two percent; at the same time, when you go to reconstruction, you have fantastic spatial resolution, and you want to be able to use it in an effective way. Some of you may have seen a movie called The Brain Attack Team. Nowadays, if you suffer a stroke, they actually go into your brain and they clean you up. That requires very good temporal and spatial resolution.

What I want to do in the last slide -- let's start from the beginning. Here is a problem that I want to share with you in the hope that it may be of some use in some of these

extremely complicated networks. What I have drawn here is supposed to be a Markov chain with a finite state space. The states are of three different kinds. Some of them, denoted S, are the source positions. Then there are all these states that are inside; an arrow connecting two states means that you can make a transition from one to the other in one unit of time. Then, finally, there are some detector states.

What is it that I actually consider here? I imagine that I have these one-step transition probabilities; I didn't want to clutter this any further. These probabilities are not known. In the next step I keep going, wandering around, and I can only make boundary measurements. So, as in the very first transparency, we determine which are the new ones, the good ones, and which are the red or orange ones on the other side. Think of the source positions as the new ones. There is nothing that you can observe inside. By not assuming that the probabilities out of a given node to all the others sum to one, you have a way of modeling absorption.

The fact is, one can solve the forward problem completely, which is not entirely trivial, but it can be done.
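The forward problem for the kind of absorbing Markov chain Dr. Grunbaum describes can be written down directly. Here is a small numerical sketch with invented transition probabilities, where source-to-detector exit probabilities come from the fundamental matrix (I - Q)^{-1} and times of flight from the derivative of the corresponding generating function.

```python
import numpy as np

# Toy network: 2 source nodes feed 4 interior nodes, which can leak (absorption)
# and eventually exit at 2 detector nodes. All numbers are made up.
# Q: one-step transitions among interior nodes (rows may sum to < 1 = absorption).
Q = np.array([
    [0.0, 0.5, 0.2, 0.0],
    [0.3, 0.0, 0.4, 0.1],
    [0.1, 0.3, 0.0, 0.4],
    [0.0, 0.1, 0.3, 0.0],
])
# R: one-step transitions from interior nodes to the 2 detector (exit) nodes.
R = np.array([
    [0.1, 0.0],
    [0.0, 0.1],
    [0.1, 0.0],
    [0.0, 0.4],
])
# S: where each source injects its particle among the interior nodes.
S = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
])

# Forward problem: the probability that a particle injected at source i is eventually
# detected at detector j is the (i, j) entry of S (I - Q)^{-1} R, since
# (I - Q)^{-1} = I + Q + Q^2 + ... sums over paths of every length.
N = np.linalg.inv(np.eye(4) - Q)
exit_prob = S @ N @ R
print("source-to-detector exit probabilities:\n", np.round(exit_prob, 3))
print("absorbed (never detected):", np.round(1 - exit_prob.sum(axis=1), 3))

# Mean time of flight (steps before exit), conditional on exiting at detector j:
# differentiate the generating function G(z) = S (I - z Q)^{-1} (z R) at z = 1,
# here approximated numerically.
def mean_steps(eps=1e-6):
    f = lambda z: S @ np.linalg.inv(np.eye(4) - z * Q) @ (z * R)
    d = (f(1.0) - f(1.0 - eps)) / eps        # derivative of E[z^T; exit at j]
    return d / exit_prob                      # E[T | exit at j]
print("mean time of flight by source and detector:\n", np.round(mean_steps(), 2))
```

Recovering Q and R from such boundary data (exit probabilities and times of flight) is precisely the inverse problem he goes on to discuss.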

What I have managed to do in some very simple cases is describe the complete class of all solutions for the inverse problem, if we don't make any assumptions. Of course, everything depends on the number of things that you have, but even the simplest cases are complicated, like this one -- actually a little bit more complicated than this one -- with its source positions and detector ones. In that case there are 64 parameters. By measuring the probability of going from every incoming node to every outgoing one, and by measuring the time of flight, you can solve analytically, in terms of explicit formulas, for 56 out of the 64.

So I want to stop here. Thank you for the fantastic two days. If somebody would tell me, "I have a problem that looks a little like that," please do it soon. That's all.

DR. LEVIS: Thank you very much. Could you join us here at the table? I believe the schedule now is to open the discussion. But first I thank the speakers and the discussants for the last session. There are two kinds of discussion scheduled, but my suggestion is that we merge the two, both specific and general discussion, right now, if that is agreeable. So please come to the microphone and identify yourself. I am working with a handicap; those two things are blinding me,

so I can't even see you. So please come to the microphone, right there.

PARTICIPANT: I had a general question about your introduction in the first talk. We saw the diagram of the possible threats to the United States and the five questions to which we knew none of the answers. It makes it sound as if homeland defense is not really a tenable proposition. Is defense the right strategic paradigm? After all, the number of people who seem to be committed to doing these things is very small. You are not talking about millions of people, perhaps a few thousand or a few tens of thousands. Maybe the whole idea of trying to wall the country in is not strategically the right approach, and one should look, for instance, at the possibilities of an offense.

DR. LEVIS: I will answer diplomatically on that one. I think I already gave the answer. Many people call this homeland security, as opposed to homeland defense. I said there are different meanings to the words defense and security. But we are doing all of the above, as far as I know, though I don't know everything that we do.

PARTICIPANT: Certainly the emphasis in the talks that we had today was very much passive waiting, lying

around and waiting to be hit next, and trying to prevent attacks. I'm not deprecating any of that. But looking at the diagram that the first speaker showed us, an unimaginable number of threats coming from an unimaginable number of places, and looking at all the vulnerabilities of a complex society, I have to say, it didn't look very promising. I just wanted to posit the possibility that this is maybe just the wrong way to think about it strategically.

DR. LEVIS: I will give a very brief response and then refer it to the other speakers or members of the audience. To begin with -- and this is from limited knowledge; I am not an academic in real life -- we are doing very well, actually, if you consider how many things have not happened. This is an issue of metrics. Let me put it a different way. We have the police. The real metric for the police, whether I am happy with my police department, is not how many people they arrested; it is how many crimes they prevented by their mere existence. By the time you get to the arrest, the crime has already occurred. But it is very difficult to get measures, as you understand very well, of the things that have not happened.

So the problem is very complex, it is very difficult, and it is a large country. There is no way that you can seal its borders totally, disallow anybody from entering it, and so forth, given the way of life that has been discussed. We have a huge intelligence establishment for which we pay a lot of taxpayers' dollars. Part of their job is to do the anticipatory work and, without even making suggestions, just bring the data up front for the appropriate decision makers to decide whether to be proactive or not. I am speaking as an individual, but I don't think we are doing that badly, when you consider how many real instances of terrorism we have had in this country. It is the most open country in the world that I know.

PARTICIPANT: I wasn't suggesting that we have done badly. It was just a concern of mine about the sort of strategy around a lot of what we heard today. There was a sense that we could not --

PARTICIPANT: You are not by the microphone.

PARTICIPANT: There was just a very defensive kind of mindset, and I was just concerned that this feeling of vulnerability could itself -- I was just concerned about the strategy part.

DR. DEMPSTER: It seems to me the new thing is that there are events of extremely low probability that

have extremely high costs.

DR. LEVITT: Alex mentioned something in his opening. He said, I could tell you, but then I'd have to kill you. I think the problem with the offensive stuff is that it tends to be handled by the intelligence community. I think there is an enormous amount going on, but I don't know much of it, and therefore I wouldn't know, as far as is germane to this workshop, where the mathematical challenges lie in offensive activity, which I think are highly focused. However, I completely agree about the scope of homeland defense. There is the newspaper version, or what I consider the public political version. On the other hand, where I see the real payoff is that there is an enormous amount of data, and there has been for many years, in automation and collection, with critical things going on that really can be exploited. I think that we don't know what the limits of automation might be; we don't have to have people, you are talking PCs and memory. If we could deal with the inferential issues, so that we didn't generate false alarms all the time, there might be a tremendous payoff. That is what maybe we should be focusing on a bit.

DR. LASKEY: I wanted to comment on that, and say that I agree thoroughly with the points you made about not sitting around and just passively observing. In fact, I'm sorry if my talk came across that way, because that wasn't how it was intended. Inference can be used to passively observe, but you will notice that level five of the five levels of fusion that I talked about was placing your sensors. Part of what you are doing is not just passively observing: you are watching and anticipating what is going to happen and preparing yourself to respond to it. And at the same time, if we discover that there is an al-Qaeda cell that is planning something, I'm sure that our military people would go in and stop them before they had the chance to do anything. So it is not just waiting until the horse is gone to close the barn door. The technologies that I was talking about would apply much more broadly.

DR. LEVIS: Other comments? Questions for the speakers? The timetable for the airplanes? I'd like to make one comment and address a question, and then we can adjourn, if you like. Earlier today there was a question about funding. This is Washington, so I would like to address it very briefly.

For mathematics, the Air Force has the Office of Scientific Research, and there is a Mathematics and Computer Science Division within it. I think one person from there was present here today. They are looking into those problems, and it is basic research. However, things have changed a little bit, and you have to write more than one paragraph or one opening sentence -- the old joke, where everything looks like a worm. You have to write a little more and show some understanding of how the work will have an impact. Of course, good mathematics has a good impact, but that is not sufficient; there is a lot of scrutiny for relevance. So understanding the problem also makes for a much better proposal, even though you are doing the mathematics that you want.

People mentioned DARPA. I have known DARPA for many, many years. For DARPA you have to know the problem. There is a very small part of the DARPA budget that is in basic research; most of it is in applied research and demonstrations. In the process of running those large three-to-five-year demonstrations, a lot of research is being done, and I am sure some of you have participated. But those programs are fairly focused on trying to solve a particular problem and make a demonstration. So in order

to play, you don't have to know every aspect of the problem. They are usually large themes, and you spend an awful lot of time going to meetings and coordinating on those things. But there is a substantial amount of funding that allows basic research to be done in the context of the applied research. You have to know, though -- the quick kill there is if you are going to have a new algorithm that is going to do it faster, cheaper, better, and all the other things. There is a substantial amount of funding. The Defense Information Systems Agency, together with DARPA, has a joint program office called the JPO. That office has kicked off a major homeland security initiative. It is not exactly new money yet -- there will be new money in the future -- but right now they are earmarking existing money everywhere that could be pointed in that direction.

I have been through that exercise; I write a lot of proposals. I can see how we can take some of the work that we are doing, dress it up appropriately, and look at that problem. But in order to get funding for that, I will have to be credible that I understand at least some aspects that are relevant to homeland security, and not leave it to the reviewer to make the connection. I have seen enough from

the inside, of the paperwork, to know that somebody will have to write a paragraph that long on a form, explaining why your work is relevant to the problem. So you are passing the buck to somebody else, and when you do that, things don't work very well. It is much better if you write it yourself -- if I write it myself and submit it and discuss it. It helps the proposal along. But there is funding developing within the various -- I haven't checked with the Office of Naval Research, but I am sure they are looking at that problem in their mathematics section -- but one has to be a little more closely aligned with the problem than was discussed in here, at a fairly abstract and high level. Thank you. I don't have any money, by the way. Whoever would like to close --

DR. KELLER-McNULTY: I guess I was voted in on this. I am the only other board member besides you two here. First of all, I just want to thank everybody for coming to the workshop. I think it has been pretty exciting. I know that I have learned a lot, and I have tried to listen carefully. I think that there are clearly great challenges, mathematically, for us to try to address that can be very

supportive of homeland security. I also think that a fair amount of what we already do can almost be directly applied to some of the immediate problems, and it behooves us to seek out those opportunities and try to do that.

We heard a lot of things in the last couple of days. Some of the things that resonated with me were the following. First of all, we have to keep the human in the loop. No one is talking about doing otherwise -- in fact, some of our scientific colleagues, when we talk to them, get really nervous that we are trying to develop methods that somehow take them out of the decision-making loop. I think it was pretty clear from everything we heard that the human has to be in the loop, the science has got to be incorporated into the problem, and the domain knowledge has got to be included in the solutions and in the frameworks we build to solve these problems. That of course points to one of the very first things that Alex said when he opened the final session, which was that you have got to know the problem. Clearly, if we want to get collective funding to work in this area, we had better darn well be willing to admit that we are trying to solve a real problem, and try to figure out how to pull the pieces together.

The other thing that came out really clearly to me -- and something that we sometimes forget to talk about when we are telling people about the problems and the work we are doing -- is that quantification underlies almost everything we are doing here. We tend to forget about it, because it is part of our being as mathematical scientists and as computer scientists. But everything is about uncertainty quantification. The question that came up about nuclear weapons certification: it is not just about predicting a point value, whether or not we think these systems will work. It is all about uncertainty quantification and knowing when we have to go back and start testing again, or collect other types of information. That is true throughout all the homeland security problems.

A lot of people talked about how we are in a stage and an era of being swamped with data; we just have so much of it. What was it that Art said? That data are dumb, which is true. Kathy commented that we have all this data, data, data, and we don't have time to think. That is true on a certain level, but as soon as we ratchet these problems up to the incredible dimensions that they have -- and Tod showed us what the dimensionality of the space is -- there actually is a lot of data sparsity.

In the last session, I think that Val's talk tried to show how we can put things together when there is a lot of data along some slices through the problem, but not a lot of data or information along other slices through that problem space. So it is really important that we remember as well that it is not just about massive data mining; it is about how to integrate all of this information.

The final thing I want to say is that what we are talking about is modeling complex systems. Perhaps we do have to put on our hats a bit like engineers and take a systems approach to looking at these problems and figuring out how we can build and develop the mathematical frameworks to actually make progress. I don't think the problems are impossible, and I actually disagree with one of the last comments made from the floor. A lot of what I heard at this workshop is about prediction and forecasting: how do we take what we have learned and what we are seeing, and project ahead and forecast ahead, to try to understand what the next thing is that needs to be done or where the next vulnerability is? These are not easy problems. We have to come together and work on them. I hope that we do. I have been

pretty energized by this, and I look forward to continuing to interact with many of you on these. That's it. Safe travel.

Remarks on Data Integration and Fusion

Alberto Grunbaum

Mathematics has an amazing, almost poetic, ability to look at a problem in a certain area and then take a step back and suddenly realize that maybe we can use these tools for some other problem in a different area. For example, there are a number of people who have worked first in geophysics and then in medical imaging. He cautioned, however, that as he and his colleagues moved in this poetic fashion from one area of application to a different one, it was very important to remind themselves that they had to go back and start from scratch and talk to the experts in the new field. The problems might look the same from the mathematical point of view, but they all have very different features.


Mathematical sciences play a key role in many important areas of homeland security, including data mining, image analysis, and voice recognition for intelligence analysis; encryption and decryption for intelligence gathering and computer security; detection and epidemiology of bioterrorist attacks to determine their scope; and data fusion to analyze information coming simultaneously from several sources.

This report presents the results of a workshop focusing on mathematical methods and techniques for addressing these areas. The goal of the workshop is to help mathematical scientists and policy makers understand the connections between mathematical sciences research and these homeland security applications.
