Delve into the SAP Data Services environment to efficiently prepare, implement, and develop ETL processes
About This Book
- Install and configure the SAP Data Services environment
- Develop ETL techniques in the Data Services environment
- Implement real-life examples of Data Services uses through step-by-step instructions to perform specific ETL development tasks
Who This Book Is For
This book is for IT technical engineers who want to get familiar with the EIM solutions provided by SAP for ETL development and data quality management. The book requires familiarity with basic programming concepts and basic knowledge of the SQL language.
What You Will Learn
- Install, configure, and administer the SAP Data Services components
- Run through the ETL design basics
- Maximize the performance of your ETL with the advanced patterns in Data Services
- Extract data from various databases and systems
- Get familiar with the transformation methods available in SAP Data Services
- Load data into various databases and systems
- Code with the Data Services scripting language
- Validate and cleanse your data, applying the data quality methods of Information Steward
Want to cost-effectively deliver trusted information to all of your crucial business functions? SAP Data Services delivers one enterprise-class solution for data integration, data quality, data profiling, and text data processing. It boosts productivity with a single solution for data quality and data integration. SAP Data Services also enables you to move, improve, govern, and unlock big data.
This book will lead you through the SAP Data Services environment to efficiently develop ETL processes. To begin with, you’ll learn to install, configure, and prepare the ETL development environment. You will become familiar with the concepts of developing ETL processes with SAP Data Services. Starting from the smallest unit of work, the data flow, the chapters will lead you to the highest organizational unit, the Data Services job, revealing the advanced techniques of ETL design.
You will learn to import XML files by creating and implementing real-time jobs. The book will then guide you through the ETL development patterns that enable the most effective performance when extracting, transforming, and loading data. You will also find out how to create validation functions and transforms.
Finally, the book will show you the benefits of data quality management with the help of another SAP solution, Information Steward.
Style and approach
This book is an easy-to-follow guide with step-by-step instructions to perform specific ETL development tasks.
Chapter 1, Introduction to ETL Development, explains what Extract, Transform, and Load (ETL) processes are, and what role Data Services plays in ETL development. It includes the steps to configure the database environment used in the book's recipes.
Chapter 2, Configuring the Data Services Environment, explains how to install and configure all Data Services components and applications. It introduces the Data Services development GUI, the Designer tool, with a simple “Hello World” ETL example.
Chapter 3, Data Services Basics – Data Types, Scripting Language, and Functions, introduces the reader to the Data Services internal scripting language. It explains the various categories of functions that are available in Data Services, and gives the reader an example of how the scripting language can be used to create custom functions.
Chapter 4, Dataflow – Extract, Transform, and Load, introduces the most important processing unit in Data Services, the dataflow object, and the most useful types of transformations that can be performed inside a dataflow. It gives the reader examples of extracting data from source systems and loading data into target data structures.
Chapter 5, Workflow – Controlling Execution Order, introduces another Data Services object, the workflow, which is used to group other workflows, dataflows, and script objects into execution units. It explains the conditional and loop structures available in Data Services.
Chapter 6, Job – Building the ETL Architecture, brings the reader to the job object level and reviews the steps used in the development process to make a successful and robust ETL solution. It covers the monitoring and debugging functionality available in Data Services, as well as the embedded audit features.
Chapter 7, Validating and Cleansing Data, introduces the validation methods that can be applied to the data passing through the ETL processes in order to cleanse and conform it according to the defined Data Quality standards.
Chapter 8, Optimizing ETL Performance, is the first of the advanced chapters, which explain complex ETL development techniques. This chapter helps the user understand how existing processes can be optimized further in Data Services so that they run quickly and efficiently, consuming as few computing resources and as little execution time as possible.
Chapter 9, Advanced Design Techniques, guides the reader through advanced data transformation techniques. It introduces the Change Data Capture methods that are available in Data Services, pivoting transformations, and automatic recovery concepts.
Chapter 10, Developing Real-time Jobs, introduces the concept of nested structures and the transforms that work with nested structures. It covers the main aspects of how they can be created and used in Data Services real-time jobs. It also introduces a new Data Services component, Access Server.
Chapter 11, Working with SAP Applications, is dedicated to reading data from and loading data into SAP systems, using the SAP ERP system as an example. It presents a real-life use case of loading data into an SAP ERP system module.
Chapter 12, Introduction to Information Steward, covers another SAP product, Information Steward, which accompanies Data Services, provides a comprehensive view of the organization’s data, and helps validate and cleanse it by applying Data Quality methods.