eBook - ePub

Mastering Apache Solr 7.x

Name: Mastering Apache Solr 7.x
Author: Sandeep Nair, Chintan Mehta, Dharmesh Vasoya

About this book

Accelerate your enterprise search engine and bring relevancy to your search analytics

About This Book

• A practical guide to building expertise with indexing, faceting, clustering, and pagination
• Master the management and administration of enterprise search applications and services seamlessly
• Handle multiple data inputs such as JSON, XML, PDF, DOC, XLS, PPT, CSV, and much more

Who This Book Is For

The book will appeal to developers, software engineers, data engineers, and database architects who are building or seeking to build effective enterprise-wide search engines for business intelligence. Prior experience with Apache Solr or Java programming is a must to get the best out of this book.

What You Will Learn

• Design schema using schema API to access data in the database
• Advanced querying and fine-tuning techniques for better performance
• Get to grips with indexing using Client API
• Set up a fault-tolerant and highly available server with newer distributed capabilities, SolrCloud
• Explore Apache Tika to upload data with Solr Cell
• Understand different data operations that can be done while indexing
• Master advanced querying through Velocity Search UI, faceting and query re-ranking, pagination, and spatial search
• Learn to use JavaScript, Python, SolrJ, and Ruby for interacting with Solr

In Detail

Apache Solr is the only standalone enterprise search server with a REST-like application interface, providing highly scalable, distributed search and index replication for many of the world's largest internet sites.

To begin with, you will be introduced to how to perform full-text search, multiple filter search, dynamic clustering, and so on, helping you to brush up on the basics of Apache Solr. You will also explore the new features and advanced options released in Apache Solr 7.x, which improve numerous performance aspects and make data investigation simpler, easier, and more powerful.

You will learn to build complex queries and extensive filters, and how they are compiled in your system to bring relevance to your search tools. You will learn about Solr scoring, the elements affecting the document score, and how you can optimize or tune the score for the application at hand. You will learn to extract features of documents and to write complex queries for re-ranking documents. You will also learn advanced options that help you know what content is indexed and how the extracted content is indexed. Throughout the book, you will go through complex problems with solutions, along with varied approaches to tackle your business needs. By the end of this book, you will have gained advanced proficiency in building out-of-the-box smart search solutions for your enterprise demands.

Style and approach

An advanced guide that takes you through complex problems with solutions, along with varied approaches to tackle your business needs using Apache Solr 7.x.

Information

Publisher: Packt Publishing
Year: 2018
ISBN: 9781788831550
Subject: Computer Science
Sub-subject: Data Processing

Advanced Queries – Part I

In the previous chapter, we learned how to build indexes using various methods. In this chapter, we will see how Solr's search works. Solr comes with a large search toolkit; by configuring components from this toolkit, we can give users a rich search experience and return relevant results through a helpful interface.

Here is a list of the search features that make Solr a desirable search engine:

Highlighting
Spell checking
Reranking
Transformation of results
Suggested words
Pagination on results
Expand and collapse
Grouping and clustering
Spatial search
More Like This
Autocomplete

We will look at some of these features in detail later in this chapter, but first let's understand the components that play an important role during a search and shape its results.

Search relevance

Relevance is a measure of the user's satisfaction with the response to their search query. It depends entirely on the context of the search. The same document can be searched for by different classes of people in different contexts. For example, the search query "higher tax payer in India" might be issued by:

An income tax department in the context of their duty
Chartered accountants in the context of their professional interest
Students in the context of gaining knowledge

The comprehensiveness required of a response depends on the context of the search. Sometimes the need is high, as when searching for legal information; sometimes it is low, as when someone is searching for specific dance steps. We need to account for this when configuring Solr.

There are two terms that play an important role in relevance:

Precision: Precision is the percentage of documents in the returned results that are relevant.
Recall: Recall is the percentage of relevant results returned out of all relevant results in the system. Achieving perfect recall on its own is trivial: simply return every document for every query.

From these definitions, we can conclude that the right balance of precision and recall depends on the context of the search. Sometimes we need 100% recall, say when searching for legal information; here, all the relevant documents should be returned in the response. In other scenarios, there is no need to return every relevant document. For example, when searching for dance steps, returning all the documents would overwhelm the user.

Through faceting, query filters, and other search components, the application can be configured to help end users refine their searches so that the most relevant results are returned. We can configure Solr to balance precision and recall to meet the needs of a particular user community.
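The definitions of precision and recall given above can be made concrete with a small calculation. The following is only an illustrative sketch in Python: the function name precision_recall and the document IDs are hypothetical, and nothing here calls Solr itself.

```python
def precision_recall(returned_ids, relevant_ids):
    """Return (precision, recall) for one query as fractions in [0, 1]."""
    returned = set(returned_ids)
    relevant = set(relevant_ids)
    hits = returned & relevant  # relevant documents that were actually returned

    # Precision: share of the returned documents that are relevant.
    precision = len(hits) / len(returned) if returned else 0.0
    # Recall: share of all relevant documents that were returned.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: 10 documents in the index, 4 of them relevant to the query.
all_docs = {f"doc{i}" for i in range(1, 11)}
relevant = {"doc1", "doc2", "doc3", "doc4"}

# A selective result set: 3 documents returned, 2 of them relevant.
print(precision_recall({"doc1", "doc2", "doc9"}, relevant))

# Returning every document gives perfect recall but poor precision.
print(precision_recall(all_docs, relevant))
```

The second call illustrates why perfect recall alone is not a useful goal: recall reaches 1.0 while precision drops to the share of relevant documents in the whole index.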

Velocity search UI

Solr provides a user interface that makes it easy to understand the Solr search mechanism. Using the Velocity search UI, we can explore search features such as faceting, highlighting, autocomplete, and geospatial searching. We have already seen the techproducts example; let's browse its products through the Velocity UI. You can access the UI at http://localhost:8983/solr/techproducts/browse.

Solr uses a response writer to generate an organized response. Here, the Velocity UI uses the Velocity response writer. We will explore response writers later in this chapter.

Query parsing and syntax

In this section, we will explore some query parsers, their features, and how to configure them with Solr. Solr supports several query parsers; we will cover the following:

Standard query parser
DisMax query parser
Extended DisMax (eDisMax) query parser
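As a rough sketch of how one of the parsers listed above is chosen for a request, the following Python snippet builds a query URL that sets the defType parameter (described under the common query parameters). It only constructs the URL and sends nothing. The /select endpoint on the local techproducts example and the helper name build_select_url are assumptions made for illustration, not something taken from the text.

```python
from urllib.parse import urlencode

# Assumption: a local Solr instance running the techproducts example,
# queried through its /select request handler.
SOLR_SELECT_URL = "http://localhost:8983/solr/techproducts/select"

# defType values for the three parsers listed above.
PARSERS = ("lucene", "dismax", "edismax")


def build_select_url(query, parser="lucene"):
    """Build a query URL that asks Solr to use the given query parser."""
    if parser not in PARSERS:
        raise ValueError(f"unknown query parser: {parser!r}")
    # urlencode escapes the query text so it is safe to place in a URL.
    params = {"q": query, "defType": parser}
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


# Hypothetical query term; prints the URL without contacting Solr.
print(build_select_url("ipod", parser="dismax"))
```

Omitting the parser argument falls back to lucene, which mirrors the default of the standard query parser.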

Each parser has its own configuration parameters. However, some parameters are common to all of these parsers. Let's take a look at these common parameters first.

Common query parameters

The following are the common query parameters supported by standard query parser, DisMax query parser, and extended DisMax query parser:

Parameter	Behavior	Default value
defType	Selects the query parser, for example defType=dismax.	lucene (standard query parser)
sort	Sorts the search results in either ascending or descending order. The direction can be given as asc or ASC, and desc or DESC. Sorting is supported on numerical or alphabetical content. Solr supports sorting by field clones. Examples: salary asc sorts by salary (low to high). name desc sorts by name (z → a). salary asc, name desc sorts by salary (low to high) first and then, among documents with the same salary, by name (z → a).	score desc
start	Specifies the starting point from where the results should begin displaying.	0
rows	Specifies the maximum number of do...