eBook - ePub

Apache Solr High Performance

Name: Apache Solr High Performance
Author: Surendra Mohan


124 pages
English
ePUB (available in the app)
Available on iOS and Android


About the book

In Detail

Apache Solr is one of the most popular open source search servers available on the web. However, simply setting up Apache Solr is not enough to ensure the success of your web product. To maximize efficiency, you need to use techniques to boost Solr performance in order to return relevant results faster. You need to implement robust techniques that focus on optimizing the performance of your Solr instances and also troubleshoot issues that are prone to arise while maintaining Solr.

Apache Solr High Performance is a practical guide that will help you explore and take full advantage of the robust nature of Apache Solr so as to achieve optimized Solr instances, especially in terms of performance.

You will learn everything you need to know in order to achieve a high performing Solr instance or set of instances, as well as how to troubleshoot the common problems you are prone to face while working with single or multiple Solr servers.

This book begins with an introduction that explains the prerequisites of Apache Solr and how to install it and integrate it with the required additional components, then gradually progresses into the features that make Solr flexible enough to achieve high performance in various circumstances. Moving forward, the book covers several clear and highly practical concepts that will help you further optimize your Solr instances' performance on both single and multiple servers, and shows how to troubleshoot common problems that are prone to arise while using your Solr instance. By the end of the book, you will also have learned how to set up, configure, and deploy ZooKeeper, along with other applications of ZooKeeper.

You will also learn how to handle data in multiple-server environments, run searches based on specific geographical coordinates, apply different caching techniques, and use various algorithms and formulae that enable better performance, among other topics.

Approach

This book is an easy-to-follow guide, full of hands-on, real-world examples. Each topic is explained and demonstrated in a specific and user-friendly flow, from search optimization using Solr to the deployment of ZooKeeper applications.

Who this book is for

This book is ideal for Apache Solr developers who want to learn different techniques to optimize Solr performance efficiently, and to troubleshoot effectively the problems that usually occur while trying to boost performance. Familiarity with search servers and database querying is expected.


Information

Publisher

Packt Publishing

Year

2014

ISBN

9781782164821

Edition

Topic

Computer Science

Category

Data Processing

Apache Solr High Performance

Credits

About the Author

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers and more

Why Subscribe?

Free Access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

1. Installing Solr

Prerequisites for Solr

Installing components

Summary

2. Boost Your Search

Scoring

Query-time and index-time boosting

Index-time boosting

Query-time boosting

Troubleshoot queries and scores

The dismax query parser

Lucene DisjunctionMaxQuery

Autophrase boosting

Configuring autophrase boosting

Configuring the phrase slop

Boosting a partial phrase

Boost queries

Boost functions

Boost addition and multiplication

Function queries

Field references

Function references

Mathematical operations

The ord() and rord() functions

Other functions

Boosting the function query

Logarithm

Reciprocal

Linear

Inverse reciprocal

Summary

3. Performance Optimization

Solr performance factors

Solr caching

Document caching

Query result caching

Filter caching

Result pages caching

Using SolrCloud

Creating a SolrCloud cluster

Multiple collections within a cluster

Managing a SolrCloud cluster

Distributed indexing and searching

Stopping automatic document distribution

Near real-time search

Summary

4. Additional Performance Optimization Techniques

Documents similar to those returned in the search result

Sorting results by function values

Searching for homophones

Ignore the defined words from being searched

Summary

5. Troubleshooting

Dealing with the corrupt index

Reducing the file count in the index

Dealing with the locked index

Truncating the index size

Dealing with a huge count of open files

Dealing with out-of-memory issues

Dealing with an infinite loop exception in shards

Dealing with expensive garbage collection

Bulk updating a single field without full indexation

Summary

6. Performance Optimization with ZooKeeper

Getting familiar with ZooKeeper

Prerequisites for a distributed server

Aid your distributed system using ZooKeeper

Setting an ideal node count for ZooKeeper

Setting up, configuring, and deploying ZooKeeper

Setting up ZooKeeper

Configuring ZooKeeper

Deploying ZooKeeper

Applications of ZooKeeper

Summary

A. Resources

Index

Apache Solr High Performance

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: March 2014

Production Reference: 1180314

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78216-482-1

www.packtpub.com

Cover Image by Glain Clarrie

Credits

Author

Surendra Mohan

Reviewers

Azaz Desai

Ankit Jain

Mark Kerzner

Ruben Teijeiro

Acquisition Editor

Neha Nagwekar

Content Development Editor

Poonam Jain

Technical Editor

Krishnaveni Haridas

Copy Editors

Mradula Hegde

Alfida Paiva

Adithi Shetty

Project Coordinator

Puja Shukla

Proofreaders

Simran Bhogal

Ameesha Green

Maria Gould

Indexers

Monica Ajmera Mehta

Mariammal Chettiyar

Graphics

Abhinash Sahu

Production Coordinator

Saiprasad Kadam

Cover Work

Saiprasad Kadam

About the Author

Surendra Mohan, who has served a few top-notch software organizations in varied roles, is currently a freelance software consultant. He has been working on various cutting-edge technologies such as Drupal and Moodle for more than nine years. He also delivers technical talks at various community events such as Drupal meet-ups and Drupal camps. To learn more about him, his write-ups, and his technical blogs, visit http://www.surendramohan.info/.

He has also authored the book Administrating Solr, Packt Publishing, and has reviewed other technical books, including Drupal 7 Multi Sites Configuration and Drupal Search Engine Optimization, Packt Publishing, as well as titles on Drupal Commerce and ElasticSearch, Drupal-related video tutorials, a title on Opsview, and many more.

About the Reviewers

Azaz Desai has more than three years of experience in Mule ESB, jBPM, and Liferay technology. He is responsible for implementing, deploying, integrating, and optimizing services and business processes using ESB and BPM tools. He was a lead writer of Mule ESB Cookbook, Packt Publishing, and also played a vital role as a trainer on ESB. He currently provides training on Mule ESB to global clients. He has done various integrations of Mule ESB with Liferay, Alfresco, jBPM, and Drools. He was part of a key project on Mule ESB integration as a messaging system. He has worked on various web services and standards and frameworks such as CXF, AXIS, SOAP, and REST.

Ankit Jain holds a bachelor's degree in Computer Science Engineering from RGPV University, Bhopal, India. He has three years of experience in designing and architecting solutions for the Big Data domain and has been involved with several complex engagements. His technical strengths include Hadoop, Storm, S4, HBase, Hive, Sqoop, Flume, ElasticSearch, Machine Learning, Kafka, Spring, Java, and J2EE.

He also shares his thoughts on his personal blog at http://ankitasblogger.blogspot.in/. You can follow him on Twitter at @mynameisanky. He spends most of his time reading books and playing with different technologies. When not at work, Ankit spends time with his family and friends, watching movies, and playing games.

Mark Kerzner holds degrees in Law, Maths, and Computer Science. He has been designing software for many years and Hadoop-based systems since 2008. He is the President of SHMsoft, a provider of Hadoop applications for various verticals, a cofounder of the Hadoop Illuminated training and consulting firm, and the coauthor of the Hadoop Illuminated open source book. He has authored and coauthored several books and patents.
