Many existing information retrieval (IR) systems are surprisingly ineffective at finding documents relevant to particular topics. Traditional systems are extremely brittle, failing to retrieve relevant documents unless the user's exact search string is found. They support only the most primitive trial-and-error interaction with their users and are also static. Even systems with so-called "relevance feedback" are incapable of learning from experience with users. SCALIR (a Symbolic and Connectionist Approach to Legal Information Retrieval) -- a system for assisting research on copyright law -- has been designed to address these problems. By using a hybrid of symbolic and connectionist artificial intelligence techniques, SCALIR develops a conceptual representation of document relationships without explicit knowledge engineering. SCALIR's direct manipulation interface encourages users to browse through the space of documents. It then uses these browsing patterns to improve its performance by modifying its representation, resulting in a communal repository of expertise for all of its users. SCALIR's representational scheme also mirrors the hybrid nature of the Anglo-American legal system. While certain legal concepts are precise and rule-like, others -- which legal scholars call "open-textured" -- are subject to interpretation. The meaning of legal text is established through the parallel and distributed precedence-based judicial appeal system. SCALIR represents documents and terms as nodes in a network, capturing the duality of the legal system by using symbolic (semantic network) and connectionist links. The former correspond to a priori knowledge such as the fact that one case overturned another on appeal. The latter correspond to statistical inferences such as the relevance of a term describing a case. SCALIR's text corpus includes all federal cases on copyright law. 
The hybrid representation also suggests a way to resolve the apparent incompatibility between the two prominent paradigms in artificial intelligence, the "classical" symbol-manipulation approach and the neurally-inspired connectionist approach. Part of the book focuses on a characterization of the two paradigms and an investigation of when and how -- as in the legal research domain -- they can be effectively combined.

- 336 pages
- English
A Symbolic and Connectionist Approach To Legal Information Retrieval
Chapter 1
Introduction
Last year, the United States Supreme Court wrote over 5,000 pages of opinions, published in 5 volumes. Federal Courts of Appeal accounted for another 40,000 pages reported in 27 volumes. District Courts added about the same amount. In addition to a smaller body of statutes and regulations, these documents, together with the thousands of volumes that preceded them, constitute much of the law in this country. Moreover, every attorney is expected to be able to "find the law" (usually court decisions) that applies to the situation of his or her client. This research task is difficult and time consuming. Yet this is just one example of a widespread problem: how to find information in large bodies of text. It is faced by scientists looking for research articles, historians looking for newspaper stories, and managers looking for corporate documents.
Many of the documents in these diverse fields have recently become available on-line. Some of these documents, such as office memoranda or business correspondence, are created using word processors and continue to exist in electronic form long after their paper copies have been printed out. Others, like books or journals, are converted to electronic form (often through manual rekeying) by publishers, even if the initial manuscripts were produced with ordinary typewriters. Still other documents are being added to on-line databases through the use of scanners and character-recognition software. In short, the volume of text available on-line is already huge, and will only get larger as the use of computers continues to grow.
There are clear advantages to having text corpora available on-line. Physical access to the text is significantly easier; in some cases the researcher may be able to find the desired document without leaving his or her desk. If the text is easily accessible on-line, the user may not require bound copies, thus reducing unnecessary duplication. On-line documents may be passed across large distances from one person to another in minutes. Computerized indexing methods allow rapid access to nearly any part of the text corpus, eliminating the need for hunting through library stacks, scanning microfiche cards, or flipping through journal pages.
What the availability of on-line text does not change, however, is the difficulty of finding the desired information. Many lawyers, for example, continue to prefer manual research methods when given the choice. This suggests two conclusions: First, existing on-line search methods are not meeting the needs of users. Second, manual search methods may have strengths unnoticed by designers of on-line text retrieval systems. To address this problem, new approaches to finding information on-line are needed. The research described in this book is one such approach.
1.1 An Interdisciplinary Approach to Finding Information
Suppose one wanted to design a computer system to assist the research process in a certain domain. Ideally, the system should be able to search huge databases of text rapidly, and be able to differentiate between documents that seem more relevant to the user's problem and those that seem less so. It should have some information or "knowledge" about the problem being researched, about the domain generally, and about the context of the search -- for example, what the user has previously considered relevant, what has already been found, and so on. Finally, it should be designed in such a way that it facilitates the original research task.
Each of these three goals is a major research problem in its own right, drawing on the techniques and tools of three different subfields of computer science and cognitive science: information retrieval (IR), artificial intelligence (AI), and human-computer interaction (HCI). A system designed to meet only one of these goals is likely to have difficulty satisfying the others. In contrast, a combination of approaches from the three disciplines may offer a collective solution not available to any one in isolation.
This book describes an attempt to bring the tools of IR, AI, and HCI to bear on the information-finding problem. The attempt takes the form of a system called SCALIR, for Symbolic and Connectionist Approach to Legal Information Retrieval, which is designed to assist research on copyright law. Although SCALIR represents only one way to address the problem, I believe it demonstrates the value and feasibility of an interdisciplinary approach.
Like any partnership, the union of AI, IR, and HCI requires some compromises. Although IR has contributed statistical text indexing techniques to the SCALIR system, the SCALIR research rejects many of the traditional IR assumptions about retrieval, relevance, and evaluation. AI has supplied the basic structures and mechanisms from which SCALIR is constructed (in particular, spreading activation search in semantic and connectionist networks), but they are applied to an "industrial grade" problem and thus are less theoretically pure than most AI researchers advocate. HCI has informed the design of SCALIR's methods for interacting with users; the user/system interface has evolved in response to users' suggestions. Yet many in HCI may find SCALIR's design process to be rather ad hoc and its interface compromised by the system's other goals. If SCALIR has broken any new ground, it probably has not done so without disturbing its neighbors.
1.2 Central Themes
Several important issues recur throughout this research. This section briefly highlights these central themes.
1.2.1 Law as a Problem Domain
Law is a fascinating institution. Its principles emerge from its large body of natural language text -- court decisions -- and evolve over time. Trying to get a system to represent this text is a major undertaking. The legal reasoning process is also complex, relying heavily on analogies between previous cases and the current situation. Thus legal research (and hence legal information retrieval) itself plays a role in legal reasoning. In short, law serves as an ideal laboratory for AI. It is a tightly constrained and decomposable domain, yet it involves many of AI's hardest problems.
Widespread research on applying AI techniques to the law is fairly recent, at least relative to a domain like medicine. The first ACM-sponsored International Conference on AI and Law (ICAIL) was not until 1987, and the first journal for the field is due out this year. In some ways this is surprising, because there are so many ways in which law could benefit from intelligent computer systems. It may be due to the unavailability of data for research. It may also reflect the reluctance of an old and somewhat conservative profession to embrace new technologies. In any case, the result is a fairly wide-open field with much work to be done, an inviting prospect for any AI researcher.
1.2.2 Connectionist/Symbolic Hybrids
Connectionism -- the use of neurally inspired, massively parallel networks of primitive processing units, and the attendant implications for modeling human cognition -- had a major rebirth in the mid-1980s. Connectionist techniques have since been applied to many problems previously thought amenable only to traditional, "symbolic" AI. This has led some researchers to conclude that connectionism is the correct approach to most problems, whereas others still believe that only symbolic techniques are correct.
I believe that both approaches have a continuing role to play in AI, and that further gains can be achieved by combining the two to create new hybrid systems. In the information retrieval domain in particular, connectionist models facilitate learning and the gradual combination of evidence, which are helpful in overcoming some of the brittleness of traditional IR systems. Symbolic techniques allow us to encode a priori knowledge of the domain, such as constraints representing the "physics" of the world in question. A hybrid system may thus be capable of performing a task more effectively (perhaps more efficiently, or more perspicuously, or more robustly) than a system using only a single paradigm.
There is an additional reason why this hybrid methodology is particularly useful in SCALIR: The legal system itself has some characteristics of both symbolic and connectionist models, making each approach especially appropriate for representing different aspects of the problem.
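As a rough illustration of the hybrid idea, the sketch below builds a toy network in which term and document nodes are joined by two kinds of links: weighted (connectionist) associations and labeled (symbolic) relations such as "overturned". It is not SCALIR's implementation. The class name, the node names, the fixed `decay` applied to symbolic links, and the two-step propagation are all assumptions made for this example.

```python
from collections import defaultdict


class HybridNetwork:
    """Toy network with weighted and labeled links (hypothetical design)."""

    def __init__(self):
        # Connectionist links: adjustable numeric associations.
        self.weighted = defaultdict(dict)   # source -> {target: weight}
        # Symbolic links: fixed, labeled relations known a priori.
        self.symbolic = defaultdict(list)   # source -> [(label, target)]

    def add_weighted(self, source, target, weight):
        self.weighted[source][target] = weight

    def add_symbolic(self, source, label, target):
        self.symbolic[source].append((label, target))

    def spread(self, query_terms, follow_labels=(), steps=2, decay=0.5):
        """Propagate activation outward from the query terms."""
        activation = defaultdict(float)
        frontier = {term: 1.0 for term in query_terms}
        for _ in range(steps):
            next_frontier = defaultdict(float)
            for node, act in frontier.items():
                activation[node] += act
                # Weighted links pass activation scaled by their weight.
                for target, weight in self.weighted[node].items():
                    next_frontier[target] += act * weight
                # Symbolic links pass activation only for requested labels.
                for label, target in self.symbolic[node]:
                    if label in follow_labels:
                        next_frontier[target] += act * decay
            frontier = next_frontier
        # Nodes reached on the final step still receive their activation.
        for node, act in frontier.items():
            activation[node] += act
        return dict(activation)


# Hypothetical nodes: one term, three cases.
net = HybridNetwork()
net.add_weighted("term:software", "case:A", 0.8)
net.add_weighted("term:software", "case:B", 0.25)
net.add_symbolic("case:A", "overturned", "case:C")

scores = net.spread(["term:software"], follow_labels={"overturned"})
ranked = sorted(
    (item for item in scores.items() if item[0].startswith("case:")),
    key=lambda item: item[1],
    reverse=True,
)
print(ranked)
```

In this toy setup, `case:C` has no direct association with the query term. It is reached only because the symbolic "overturned" link from `case:A` is followed, which is one way the two link types might complement each other.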
1.2.3 Text as Knowledge Representation
Text has traditionally played a central role in IR systems -- their goal is to retrieve text documents -- but not in AI. AI researchers who have attempted to capture knowledge extant in text often constructed formal representations of the knowledge and then threw away the text itself. The resulting representations necessarily lost some of the information initially present in the text. Thus the systems constructed were limited to the features of the text considered important by the programmer.
In a text-based intelligent system, the text itself may be viewed as the knowledge base. Text has many virtues as a representation, not the least of which is that, in many cases, it already exists on-line. It is capable of conveying vast amounts of information, some of which gradually emerges through continual reinterpretation. It is also easily comprehended by humans, which is not the case for many formal AI representations. The difficulty, of course, is accessing the knowledge in the text.
Rather than explicitly trying to extract the textual knowledge, SCALIR begins with intelligent tools for letting users access the text directly, bringing them documents of interest. By changing its representation of these documents in response to user feedback, the system eventually provides its users with a shared repository of knowledge about the text.
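One simple way to picture a representation that changes with user feedback is a table of term-to-document weights that is nudged each time a user judges a retrieved document. The sketch below is only an assumption-laden illustration: the function name `apply_feedback`, the `rate` parameter, and the update rule itself are hypothetical and are not taken from SCALIR.

```python
def apply_feedback(weights, query_terms, doc, relevant, rate=0.1):
    """Nudge term->document weights toward 1.0 (relevant) or 0.0 (not relevant).

    `weights` maps (term, doc) pairs to floats and is updated in place.
    The update rule is a hypothetical stand-in, not SCALIR's learning rule.
    """
    target = 1.0 if relevant else 0.0
    for term in query_terms:
        old = weights.get((term, doc), 0.0)
        # Move a fraction `rate` of the way from the old weight to the target.
        weights[(term, doc)] = old + rate * (target - old)
    return weights


# Hypothetical shared weight table, updated as users give feedback.
shared_weights = {("software", "case:A"): 0.5}
apply_feedback(shared_weights, ["software", "license"], "case:A", relevant=True)
print(shared_weights)
```

Because every user updates the same table, later searches benefit from earlier judgments, which loosely mirrors the idea of a shared repository of knowledge about the text.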
1.2.4 The Role of the User
A great deal of information retrieval research has focused on improving performance of IR systems, where "performance" is defined as a score on some measure of accuracy or completeness of retrieval. Actual user satisfaction with these systems is difficult to quantify and is rarely measured. In fact, AI techniques are often viewed as a way to eliminate more of the user's "burden," without consideration of whether the user would prefer to maintain tighter control over the search process.
Other research, often from schools of library and information science, has examined the problems library patrons have locating information, both in traditional media and with on-line catalogs. Furthermore, human-computer interaction studies have looked at the kinds of problems users have with a variety of computer systems.
Yet there have been relatively few attempts to combine the results of performance-oriented and user-oriented research. In other words, few people have asked how what we know about users' search behavior can be used to design more effective IR systems. SCALIR represents an attempt to do just that; it is based on a premise of keeping users "in the loop" at all times and improving its performance by observing its users' behavior.
1.3 Goals of the Research
The SCALIR project began with three goals:
Improving Legal Information Retrieval. The first goal was simply to construct an IR system for the legal research task that, by using an unusual combination of techniques from artificial intelligence, could overcome many of the problems facing more traditional systems.
Feasibility of Hybrid AI Systems. The second goal was to demonstrate that there were problems for which the approaches of connectionist and traditional AI could be usefully combined. In particular, I claimed that learning was possible in a hybrid system, and that such a system would constitute a whole greater than the sum of its parts.
IR for Natural Language. Finally, the SCALIR research was intended to show how information retrieval systems might provide an "end-run" around the problem of natural language processing, and the symbol grounding problem in particular.
During the course of the research, the focus expanded somewhat to include an analysis of the role of human-computer interaction in the IR task. At the same time, the third initial goal became a perspective from which to view the research rather than a concrete research program. Though the concept of IR for natural language processing is discussed in chapter 3, one will not find specific results or experiments pertaining to it. Chapter 12 summarizes the results presented throughout the book, and tries to assess how well each goal was satisfied.
1.4 A Sketch of the SCALIR Approach
The goal of an information retrieval system is to find documents relevant to a user's search request, documents that are said to satisfy the user's "information need." This requires performing some sort of matching operation between the request and the documents in the corpus, and (preferably) ordering the responses from most to lea...
Table of contents
- Cover
- Halftitle
- Title
- Copyright
- Contents
- List of Figures
- List of Tables
- Preface
- 1 Introduction
- 2 Humans, Computers, and Finding Information
- 3 Knowledge Representation, Meaning, and Text in AI
- 4 Approaches to Information Retrieval
- 5 Some Perspectives on the Law and Legal Research
- 6 Hybrid Vigor
- 7 The Structure of SCALIR
- 8 The Retrieval Process
- 9 Feedback and Learning
- 10 Interacting With SCALIR
- 11 Performance Evaluation
- 12 Discussion
- References
- Author Index
- Subject Index
A Symbolic and Connectionist Approach To Legal Information Retrieval, by Daniel E. Rose.