
Textual Information Access
Statistical Models
- English
About this book
This book presents statistical models recently developed within several research communities to access the information contained in text collections. The problems considered arise in applications that aim to facilitate information access:
- information extraction and retrieval;
- text classification and clustering;
- opinion mining;
- comprehension aids (automatic summarization, machine translation, visualization).
In order to give the reader as complete a description as possible, the focus is placed on the probability models used in the applications concerned, by highlighting the relationship between models and applications and by illustrating the behavior of each model on real collections.
Textual Information Access is organized around four themes: information retrieval and ranking models, classification and clustering (logistic regression, kernel methods, Markov fields, etc.), multilingualism and machine translation, and emerging applications such as information exploration.
PART 1
Information Retrieval
Chapter 1
Probabilistic Models for Information Retrieval
1.1. Introduction
| Notation | Description |
|---|---|
| RSV(q, d) | Retrieval status value: score of document d for query q |
| q_w | Number of occurrences of term w in query q |
| [symbol missing in source] | Number of occurrences of term w in document d |
| N | Number of documents in the collection |
| M | Number of indexing terms |
| F_w | Average frequency of [symbol missing in source] |
| N_w | Document frequency of w [formula missing in source] |
| z_w | z_w = F_w or z_w = N_w |
| l_d | Length of document d |
| l_c | Length of the collection |
| m | Average document length |
| X_w | Random variable associated with word w |
| X_d | Multivariate random variable associated with document d |
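As a rough illustration of the count-based quantities in the notation table, the sketch below computes them for a toy collection. It assumes documents are already tokenized into lists of terms, that document length is measured in tokens, and that document frequency is the number of documents containing a term; the function name `collection_statistics` and the dictionary keys are hypothetical, chosen only for this example. The quantity F_w is left out because its definition is incomplete in the table.

```python
from collections import Counter


def collection_statistics(docs):
    """Compute basic collection statistics for a list of tokenized documents.

    Assumptions (not taken from the book): each document is a list of
    term strings, and lengths are counted in tokens.
    """
    # Per-document term counts: occurrences of each term w in document d.
    term_counts = [Counter(doc) for doc in docs]

    # Indexing terms: every distinct term seen in the collection.
    vocabulary = set().union(*docs)

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(w for counts in term_counts for w in counts)

    doc_lengths = [len(doc) for doc in docs]   # length of each document
    collection_length = sum(doc_lengths)       # length of the collection
    n_docs = len(docs)                         # N
    avg_length = collection_length / n_docs if n_docs else 0.0  # m

    return {
        "N": n_docs,
        "M": len(vocabulary),
        "term_counts": term_counts,
        "doc_freq": doc_freq,
        "doc_lengths": doc_lengths,
        "collection_length": collection_length,
        "avg_length": avg_length,
    }


# Toy collection of three short documents (invented for illustration).
docs = [
    ["information", "retrieval", "models"],
    ["retrieval", "models", "rank", "documents"],
    ["information", "access"],
]
stats = collection_statistics(docs)
print(stats["N"], stats["M"], stats["collection_length"], stats["avg_length"])
```

A real indexing pipeline would typically add tokenization, stop-word removal, and stemming before counting; those steps are omitted here to keep the mapping to the notation visible.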
Table of contents
- Cover
- Title Page
- Copyright
- Introduction
- Part 1: Information Retrieval
- Part 2: Classification and Clustering
- Part 3: Multilingualism
- Part 4: Emerging Applications
- Appendix A: Probabilistic Models: An Introduction
- List of Authors
- Index


