Pentaho Data Integration 4 Cookbook
eBook - ePub

  • 352 pages
  • English
  • ePUB

About this book

In Detail

Pentaho Data Integration (PDI, also called Kettle), one of the leading data integration tools, is widely used for all kinds of data manipulation, such as migrating data between applications or databases, exporting data from databases to flat files, data cleansing, and much more. Do you need quick solutions to the problems you face while using Kettle?

Pentaho Data Integration 4 Cookbook explains Kettle features in detail through clear and practical recipes that you can quickly apply to your solutions. The recipes cover a broad range of topics including processing files, working with databases, understanding XML structures, integrating with Pentaho BI Suite, and more.

Pentaho Data Integration 4 Cookbook shows you how to take advantage of all aspects of Kettle through a set of practical recipes, organized so that you can quickly find solutions to your needs. The initial chapters explain in detail how to work with databases, files, and XML structures. Then you will see different ways of searching data, executing and reusing jobs and transformations, and manipulating streams. Further, you will learn all the available options for integrating Kettle with other Pentaho tools.

Pentaho Data Integration 4 Cookbook has plenty of recipes with easy step-by-step instructions for accomplishing specific tasks. The examples and code are ready to be adapted to individual needs.

Learn to solve data manipulation problems using the Pentaho Data Integration tool Kettle.

Approach

This book has step-by-step instructions to solve data manipulation problems using PDI in the form of recipes. It has plenty of well-organized tips, screenshots, tables, and examples to aid quick and easy understanding.

Who this book is for

If you are a software developer, or anyone involved or interested in developing ETL solutions or, more generally, in doing any kind of data manipulation, this book is for you. It does not cover PDI basics, SQL basics, or database concepts. You are expected to have a basic understanding of the PDI tool, the SQL language, and databases.

Table of Contents

Pentaho Data Integration 4 Cookbook
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Working with Databases
Introduction
Sample databases
Pentaho BI platform databases
Connecting to a database
Getting ready
How to do it...
How it works...
There's more...
Avoiding creating the same database connection over and over again
Avoiding modifying jobs and transformations every time a connection changes
Specifying advanced connection properties
Connecting to a database not supported by Kettle
Checking the database connection at run-time
Getting data from a database
Getting ready
How to do it...
How it works...
There's more...
See also
Getting data from a database by providing parameters
Getting ready
How to do it...
How it works...
There's more...
Parameters coming in more than one row
Executing the SELECT statement several times, each for a different set of parameters
See also
Getting data from a database by running a query built at runtime
Getting ready
How to do it...
How it works...
There's more...
See also
Inserting or updating rows in a table
Getting ready
How to do it...
How it works...
There's more...
Alternative solution if you just want to insert records
Alternative solution if you just want to update rows
Alternative way for inserting and updating
See also
Inserting new rows where a simple primary key has to be generated
Getting ready
How to do it...
How it works...
There's more...
Using the Combination lookup/update for looking up
See also
Inserting new rows where the primary key has to be generated based on stored values
Getting ready
How to do it...
How it works...
There's more...
See also
Deleting data from a table
Getting ready
How to do it...
How it works...
See also
Creating or altering a database table from PDI (design time)
Getting ready
How to do it...
How it works...
There's more...
See also
Creating or altering a database table from PDI (runtime)
How to do it...
How it works...
There's more...
See also
Inserting, deleting, or updating a table depending on a field
Getting ready
How to do it...
How it works...
There's more...
Insert, update, and delete all-in-one
Synchronizing after merge
See also
Changing the database connection at runtime
Getting ready
How to do it...
How it works...
There's more...
See also
Loading a parent-child table
Getting ready
How to do it...
How it works...
See also
2. Reading and Writing Files
Introduction
Reading a simple file
Getting ready
How to do it...
How it works...
There's more...
Alternative notation for a separator
About file format and encoding
About data types and formats
Altering the names, order, or metadata of the fields coming from the file
Reading files with fixed width fields
Reading several files at the same time
Getting ready
How to do it...
How it works...
There's more...
Reading unstructured files
Getting ready
How to do it...
How it works...
There's more...
Master/detail files
Log files
Reading files having one field by row
Getting ready
How to do it...
How it works...
There's more...
See also
Reading files with some fields occupying two or more rows
Getting ready
How to do it...
How it works...
See also
Writing a simple file
Getting ready
How to do it...
How it works...
There's more...
Changing headers
Giving the output fields a format
Writing an unstructured file
Getting ready
How to do it...
How it works...
There's more...
Providing the name of a file (for reading or writing) dynamically
Getting ready
How to do it...
How it works...
There's more...
Get System Info
Generating several files simultaneously with the same structure, but different names
Using the name of a file (or part of it) as a field
Gett...

Pentaho Data Integration 4 Cookbook, by Adrian Sergio Pulvirenti and Maria Carina Roldan, is available in PDF and/or ePUB format. It is listed under Computer Science & Data Processing.