Instant Pentaho Data Integration Kitchen
eBook - ePub

Instant Pentaho Data Integration Kitchen

Sergio Ramazzina

Share book
  1. 68 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android
eBook - ePub

Instant Pentaho Data Integration Kitchen

Sergio Ramazzina

Book details
Book preview
Table of contents
Citations

About This Book

In Detail

Pentaho PDI is a modern, powerful, and easy-to-use ETL system that lets you develop ETL processes with simplicity. Explore and gain the experience and skills that you need to run processes from the command line or schedule them by using an extensive description and a good set of samples.

Instant Pentaho Data Integration Kitchen How-to will help you to understand the correct way to deal with PDI command line tools. We start with a recipe about how to configure your memory requirements to run your processes effectively and then move forward with a set of recipes that show you the different ways to start PDI processes.

We start with a recap about how transformations and jobs are designed using spoon and then move forward to configure memory requirements to properly run your processes from the command line.

We dive into the various flags that control the logging system by specifying the logging output and the log verbosity. We focus and deliver all the knowledge you require to run the ETL processes using command line tools with ease and in a proficient manner.

Approach

Filled with practical, step-by-step instructions and clear explanations for the most important and useful tasks. A practical guide with easy-to-follow recipes helping developers to quickly and effectively collect data from disparate sources such as databases, files, and applications, and turn the data into a unified format that is accessible and relevant to end users.

Who this book is for

Any IT professional working on PDI and is a valid support for either learning how to use the command line tools efficiently or for going deeper on some aspects of the command line tools to help you work better.

Frequently asked questions

How do I cancel my subscription?
Simply head over to the account section in settings and click on ā€œCancel Subscriptionā€ - itā€™s as simple as that. After you cancel, your membership will stay active for the remainder of the time youā€™ve paid for. Learn more here.
Can/how do I download books?
At the moment all of our mobile-responsive ePub books are available to download via the app. Most of our PDFs are also available to download and we're working on making the final remaining ones downloadable now. Learn more here.
What is the difference between the pricing plans?
Both plans give you full access to the library and all of Perlegoā€™s features. The only differences are the price and subscription period: With the annual plan youā€™ll save around 30% compared to 12 months on the monthly plan.
What is Perlego?
We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, weā€™ve got you covered! Learn more here.
Do you support text-to-speech?
Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.
Is Instant Pentaho Data Integration Kitchen an online PDF/ePUB?
Yes, you can access Instant Pentaho Data Integration Kitchen by Sergio Ramazzina in PDF and/or ePUB format, as well as other popular books in Informatica & Archiviazione di dati. We have over one million books available in our catalogue for you to explore.

Information

Year
2013
ISBN
9781849696906

Instant Pentaho Data Integration Kitchen


Instant Pentaho Data Integration Kitchen

Copyright Ā© 2013 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: July 2013
Production Reference: 1240713
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-84969-690-6
www.packtpub.com

Credits

Author
Sergio Ramazzina
Reviewer
Joel Latino
Acquisition Editor
Erol Staveley
Commissioning Editor
Shreerang Deshpande
Technical Editor
Sampreshita Maheshwari
Copy Editor
Insiya Morbiwala
Project Coordinator
Suraj Bist
Proofreader
Paul Hindle
Production Coordinator
Zahid Shaikh
Cover Work
Prachali Bhiwandkar
Cover Image
Aditi Gajjar

About the Author

Sergio Ramazzina is a software architect/trainer with over 20 years of experience working on a large number of projects for banks and major Italian companies as well as designing complex enterprise solutions in Java/JavaEE and Ruby. He started using Pentaho products from the very beginning (late 2003), gaining vast experience by deploying Pentaho as an open source, standalone BI solution. He also deeply integrated Pentaho as the analytics engine of choice in other applications he designed. Starting from 2009, based on his experience in the Java/JavaEE world and because of his appreciation for the open source world and its principles, he began participating actively as a contributor to some Pentaho projects, such as JPivot, Saiku, CDF, and CDA, and he has achieved the title of Pentaho Active Contributor.
In late 2010, he founded Serasoft, a young Italian consulting company specialized in the design and delivery of open source business intelligence solutions, and he started participating as a BI architect and Pentaho expert on a wide number of projects where open source BI and Pentaho were the main heroes. He is also the CTO of Athilab (Athirat Innovation Lab), sharing his experience in the design and delivery of high-value innovative enterprise solutions. He is always looking for innovative solutions that can help users make their work more efficient. He is also passionate about skiing, tennis, and photography.

About the Reviewer

Joel Latino was born in Ponte de Lima, Portugal, in 1989. He has been working in the IT industry since 2010, mostly as a software developer and BI developer.
He started his career at Xpand-ITā€”a Portuguese company specialized in strategic planning, consulting, implementation, and the maintenance of enterprise software that is fully adapted to the customer's needsā€”and earned his graduate degree in Informatics Engineering at the School of Technology and Management of the Viana do Castelo Polytechnic Institute.
Joel mainly focuses on open source web technology, databases, and business intelligence, and has some fascination with mobile technologies. He is responsible for developing some plugins to Pentaho Data Integration, such as Android and Apple push notification steps.

www.PacktPub.com

Support files, eBooks, discount offers and more

You might want to visit www.PacktPub.com for support files and downloads related to your book.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
Support files, eBooks, discount offers and more
http://PacktLib.PacktPub.com
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can access, read and search across Packt's entire library of books.

Why Subscribe?

  • Fully searchable across every book published by Packt
  • Copy and paste, print and bookmark content
  • On demand and accessible via web browser

Free Access for Packt account holders

If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view nine entirely free books. Simply use your login credentials for immediate access.

Preface

Pentaho Data Integration (PDI) is an ETL tool that was born 10 years ago. Its creator, Matt Caster, celebrated the 10th anniversary of this product, originally named Kettle (you can read the celebratory post on Matt's blog at: http://www.ibridge.be/?p=211), this year on March 8th 2013. The term K. E. T. T. L. E. is an acronym that stands for Kettle Extraction Transformation Transport Load Environment. When Pentaho acquired Kettle, its name was changed to Pentaho Data Integration, but actually, many developers continue to call it by the old name: Kettle.

How the story beganā€¦

The history of Kettle began in 2001 when Matt Caster, Pentaho Data Integration's chief architect and creator of Kettle, was working as a BI consultant. He had the idea of writing his own ETL tool to have a better and cheaper way to transfer data from one place to another. He was looking for a different solution, something that was better than inventing ugly data warehouse solutions written in PL/SQL, VB, or Shell scripts. He spent two years doing a thorough analysis of the problem. Because he was busy all the time with his work as a consultant, he worked on this project either during the weekends or at night. A...

Table of contents