IBM DB2 Analytics Accelerator for z/OS (IDAA)

IBM DB2 Analytics Accelerator for z/OS (IDAA): BI in the 201x decade has found a home
Enrico Caraffi, Architect, IBM Software Group, enrico.caraffi@it.ibm.com
Milan and Rome, 13-14 March 2012

Agenda
- Introduction: threads of BI in the 201x decade
- DB2 + IDAA architecture
- Some results from the beta program
- DB2 + IDAA internals
- Proposal: workload assessment

Business Intelligence scenarios and related topics: a few threads to help the discussion

Thread 1) The IT schism
Once upon a time there was one IT system...
[Diagram labels: Acquire Information, Transform Information, Present Information; Data Warehouse, OLAP, Reports, Applications; Operational applications and users, Legacy sources, Data integration, Warehouse, Datamarts; Reference data, Master data, Operational data, Analytical data, Enterprise content.]

Thread 1) The IT schism: the cost of the schism
- Diseconomies in managing multi-platform environments
  - Inconsistent processes for security, storage, and scheduling
  - Limited workload management, monitoring, and high availability
  - Limits on the ability to consolidate hardware
- Problems in moving data from one platform to another
  - Propagation delays
  - Storage inefficiencies
  - Unstable network performance
- Difficulty in closing the data-information-decisions loop
  - Complications in creating returns from BI for the front-line business
  - Difficulty in certifying BI processes and data
  - Users find it hard to trust the data

Thread 1) The IT schism: the cost of the schism, by type of impact
The same list is repeated with each item tagged by one or more markers: O = missed opportunities, $ = diseconomies, R = risks.

Thread 2) Business Intelligence and its users: a service that brings value to every user
(year: number of requests; users; number of users)
- 1990: occasional; board room and executives (KPI dashboard); <50
- 1995: tens; managers; <500
- 2000: hundreds; business analysts (risk analysis); <1,000
- 2005: thousands; customer-facing staff, e.g. branch, service center, call center (cross selling); n × 1,000
- 201X: millions; customers; millions

Diversified workload: teamwork, each with its own task
- DB2 for z/OS: optimized for point accesses and for concurrent accesses. It serves the more numerous queries, focused on specific areas.
- IDAA: optimized for massive processing, scans, and aggregations. It serves the more complex, broad, and historically deep queries.
[Diagram: queries from BI users reach DB2 for z/OS, where the optimizer routes them.]


Deep DB2 integration within zEnterprise
[Diagram: applications, DBA tools, the z/OS console, ... reach DB2 for z/OS through application interfaces (standard SQL dialects) and operational interfaces (e.g. DB2 commands). DB2 for z/OS (Data Manager, Buffer Manager, ..., IRLM, Log Manager) runs on z/OS on System z and offers superior availability, reliability, security, and workload management. The IBM DB2 Analytics Accelerator, built on Netezza, offers superior performance on analytic queries.]

Query execution flow: faster answers, faster reports
[Diagram: the application reaches DB2 for z/OS through the application interface. The optimizer sends each query either to the DB2 query execution run-time (for queries that cannot or should not be off-loaded to IDAA) or, through the IDAA DRDA requestor, to the IBM DB2 Analytics Accelerator, made of an SMP host and several SPUs, each with CPU, FPGA, and memory. A heartbeat carries DB2 Analytics Accelerator availability and performance indicators. Legend: queries executed without DB2 Analytics Accelerator; queries executed with DB2 Analytics Accelerator.]

DB2 Analytics Accelerator: powered by Netezza TwinFin(TM) hardware and software technology
Disk enclosures
- Built-in storage: 8 enclosures with 12 disks each (3.5-inch, 1 TB, 7200 RPM, SAS 3 Gb/s), streaming at up to 116 MB/s on heavily compressed data.
- Example, TF12: 8 enclosures hold 96 HDDs. 1/3 is dedicated to data (32 TB of physical space), 1/3 to mirroring, 1/3 to workspace. Assuming an average compression of 4:1, it hosts 128 TB of data.
Front end: SMP server (IDAA server)
- SQL Compiler, Query Plan, Optimize, Administration
- 2 front-end hosts, IBM 3650M3, clustered active-passive
- 2 Nehalem-EP quad-core 2.4 GHz processors per host
Snippet Blades(TM) (S-Blades, SPUs)
- Processors and data-handling logic optimized for database streaming, aggregations, and massively parallel joins.
- At most 6+6 blades per chassis, with 1+1 spare blades.
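
The TF12 capacity example above can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted on the slide; the function and constant names are illustrative, and the 4:1 compression is the slide's own assumption.

```python
# Capacity of the TF12 example, using only the figures quoted on the slide.
ENCLOSURES = 8
DISKS_PER_ENCLOSURE = 12
DISK_TB = 1
DATA_SHARE_DIVISOR = 3   # 1/3 data, 1/3 mirroring, 1/3 workspace
COMPRESSION_RATIO = 4    # assumed average compression of 4:1


def tf12_capacity_tb():
    """Return (physical TB reserved for data, TB of user data hosted)."""
    disks = ENCLOSURES * DISKS_PER_ENCLOSURE            # 96 HDDs
    physical_tb = disks * DISK_TB // DATA_SHARE_DIVISOR
    return physical_tb, physical_tb * COMPRESSION_RATIO


physical, hosted = tf12_capacity_tb()
print(f"{physical} TB physical for data, about {hosted} TB of data hosted")
```

The hosted figure scales linearly with the compression ratio, so real capacity depends on how well the actual data compresses.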

Asymmetric Massively Parallel Processing: Netezza TwinFin Appliance
[Diagram: SQL enters the SMP host (front end), which contains the SQL Compiler, Query Plan, Optimize, Admin, Execution Engine, High-speed Loader/Unloader, and DBOS. The host connects over 10 Gigabit Ethernet to the logical processing units (labeled 1, 2, 3, ... 960), each a processor plus streaming DB logic. Together they form a high-performance database engine (streaming joins, aggregations, sorts, etc.) over massively parallel intelligent storage.]

Asymmetric Massively Parallel Processing: logical path of the query
[Diagram: same Netezza TwinFin Appliance layout as the previous slide. The SQL Compiler turns the SQL into snippets 1, 2, 3, and each logical processing unit receives snippets 1, 2, 3.]

Do you like to win easily? The Field Programmable Gate Array (FPGA)
What FPGAs are:
- Circuits based on very fast, reconfigurable logic gates ("sea of gates")
- Very efficient at streaming processing
- Reconfigured specifically for each query
- They take on up to 90% of the work the CPU usually does to obtain the data it needs in usable form
- Netezza has used this approach since 2003
The performance obtained with FPGAs enables:
- Better service levels on I/O- and CPU-intensive queries, with very predictable times
- Less work and fewer resources:
  - no indexes to define and maintain
  - less cache memory needed
  - no need to precompute MQTs or cubes

The Netezza secret sauce
  select DISTRICT, PRODUCTGRP, sum(revenue)
  from SALES_DATA
  where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'DAIRY'
[Diagram: a slice of table SALES_DATA (compressed) streams through the FPGA core (Uncompress; Project, i.e. the select list; Restrict and Visibility, i.e. the where clause) and then through the CPU core (complex joins, aggregations, etc., here sum(revenue)).]
116 MB/s compressed × compression factor 4 = 464 MB of raw data per second

Asymmetric Massively Parallel Processing: path of the query result data
[Diagram: same Netezza TwinFin Appliance layout. Each logical processing unit (labeled 1, 2, 3, ... 96) returns its partial results to the SMP host, which consolidates them.]
TF12 gross data scan speed: 464 MB/s × 8 cores × 12 blades = about 44 GB/s, i.e. 1 TB in about 23 seconds.
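
The scan-speed arithmetic on the last two slides can be reproduced as follows. This is a sketch that only multiplies the figures quoted on the slides; the names are illustrative, and 1 TB is taken as 1,000,000 MB, which is an assumption about the slide's units.

```python
# Gross scan rate of a TF12, from the per-disk figures quoted on the slides.
COMPRESSED_MB_PER_S = 116   # streaming rate on compressed data
COMPRESSION_FACTOR = 4      # assumed average compression
CORES_PER_BLADE = 8
BLADES = 12


def gross_scan_rate_mb_s():
    """Raw (uncompressed) data scanned per second across all cores, in MB."""
    per_core = COMPRESSED_MB_PER_S * COMPRESSION_FACTOR   # 464 MB/s
    return per_core * CORES_PER_BLADE * BLADES


def seconds_to_scan(megabytes):
    """Time needed to scan the given amount of raw data once."""
    return megabytes / gross_scan_rate_mb_s()


rate = gross_scan_rate_mb_s()
print(f"{rate / 1000:.1f} GB/s, 1 TB in {seconds_to_scan(1_000_000):.1f} s")
```

The exact product is slightly above the slide's rounded 44 GB/s, which is why the computed time for 1 TB comes out a little under the 23 seconds quoted.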


Beta test experience: environment*
- HW: IBM z196 (one partition defined with 2 processors); IDAA with 12 blades
- SW: z/OS version 1.12; DB2 version 9
* DB2 + IDAA prerequisites:
- Hardware: IBM z196 or z114 (latest generation)
- Operating system: z/OS V1.12 or later (1.11 also works, with the appropriate PTFs)
- DB2 V9 or V10

Beta test experience: test cases
Test case 1
- 13 months of data: 353 GB of raw data, 92.8 GB on IDAA
- Compression ratio: 3.8
- Fact table: Tpcl, 3,600 million rows
- Dimension tables: 12 other tables, at most 114 million records
Test case 2
- 6 months of data: 2,867 GB of raw data, 265 GB on IDAA
- Compression ratio: 10.8
- Fact tables: T20, 2,491 million rows; T43, 4,767 million rows
- Dimension tables: 15 other tables, at most 10 million records
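
The compression ratios reported for the two test cases follow directly from the raw and on-accelerator sizes. A minimal check, with illustrative names:

```python
def compression_ratio(raw_gb, stored_gb):
    """Ratio between the raw data size and the size stored on IDAA."""
    return raw_gb / stored_gb


# (raw GB, GB on IDAA) as reported on the slide
cases = {"test case 1": (353, 92.8), "test case 2": (2867, 265)}
ratios = {name: round(compression_ratio(*sizes), 1)
          for name, sizes in cases.items()}
print(ratios)
```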

Analysis of real test cases 1/4
Query on 4 months of data, about 1.4 billion records in the fact table: about 19 seconds of pure compute time, plus about 16 seconds of network time.

Analysis of real test cases 2/4
Query on 1 month of data, about 350 million records in the fact table. Very selective search: the data of a single customer.

Analysis of real test cases 2/4, with indexed DB2
After the DB2 indexes are defined on the star schema, the optimizer knows that a very selective search runs faster on DB2.


Performance outcomes at the end of the tests
AUTHOR'S NOTE: The direct comparison with competing technologies on the market is NOT a benchmark. The data used for the tests are IDENTICAL, but the part of the benchmark shown on the left dates back to late 2009.
Among the 3 competing technologies, NONE prevails more than 50% of the time.
15 46 66 26 31 41 28
DB2 + IDAA (sec): 5.89 4.59 0.82 1.70 8.52 13.08 0.39

Performance outcomes at the end of the tests (continued)
The competing technologies span 3 orders of magnitude on the logarithmic scale of response times.
15 46 66 26 31 41 28
DB2 + IDAA (sec): 5.89 4.59 0.82 1.70 8.52 13.08 0.39
DB2 and IDAA keep the variability of the response time within 1 order of magnitude, and break the one-second barrier in 2 cases out of 7.


IBM DB2 Analytics Accelerator: product components
[Diagram: on the client side, Data Studio Foundation with the IDAA plug-in, plus users and applications (data warehouse application). On zEnterprise, DB2 for z/OS enabled for IBM DB2 Analytics Accelerator. OSA-Express3 10 GbE links it over a private service network (primary and backup, 10 Gb) to the IBM DB2 Analytics Accelerator (BladeCenter, Netezza technology).]

Managing data in the IDAA accelerator
Some principles:
1. To be consistent and efficient, DB2 and IDAA must always contain the same data. This is because the internal organization of the data in DB2 and in IDAA is profoundly antithetical, and because each engine must be able to be fully independent of the other.
2. DB2 for z/OS remains the owner of the data and of security, backup, change, etc.
3. IDAA is always mediated by DB2, so it can NOT receive data directly from anything else.
How it works:
1. All data and metadata management functions on IDAA are implemented as a set of standard DB2 stored procedures.
2. The stored procedures can be launched in several ways, as needed:
- They are bound to the buttons of the IDAA Studio GUI.
- They can be called from z/OS JCL.
- They can be integrated into other tools such as ETL and scripting.

Loading and aligning data in the IDAA accelerator
Data life cycle on IDAA:
1. Tables are added to the accelerator with the stored procedure ACCEL_ADD_TABLES, which receives the list of tables to bring onto the accelerator.
2. Data is loaded with the stored procedure ACCEL_LOAD_TABLES, which receives the list of tables or partitions to refresh.
3. At present, IDAA can refresh whole tables or single partitions (range partitioning only). ADD and ROTATE of partitions are supported, but ALTER PARTITION RANGE is not.
[Diagram: on z/OS, partitions p1, p2, p3 of table T1 are unloaded in parallel, each through a USS pipe, over the 10 Gb link to the IDAA SMP server, which feeds Blade 1, Blade 2, ... Blade N, each with 8 cores + 8 FPGAs.]
IDAA alignment process:
1. Loading data into IDAA is handled as a stored procedure and is based on the DB2 unload.
2. The work is parallelized across multiple tables and multiple partitions, up to a configurable limit.
3. Data flows from DB2 into a USS pipe buffer, which IDAA reads immediately.
4. All IDAA worker nodes take part in loading the data, with the efficiency provided by the FPGAs.
5. Loading does not suspend the accelerator's query service.
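
The idea of parallelizing the unload across partitions up to a configurable limit can be sketched with a bounded worker pool. This is a toy model only: it does not call ACCEL_LOAD_TABLES or any DB2 interface, and `unload_partition`, `load_tables`, and `max_parallel` are hypothetical names introduced for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def unload_partition(table, partition):
    """Hypothetical stand-in for unloading one partition through a USS pipe."""
    return f"{table}.p{partition}"


def load_tables(work, max_parallel=2):
    """Process every (table, partition) pair with at most max_parallel streams.

    max_parallel plays the role of the configurable limit on parallelism.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map keeps the input order, whatever the completion order.
        return list(pool.map(lambda item: unload_partition(*item), work))


loaded = load_tables([("T1", 1), ("T1", 2), ("T1", 3)], max_parallel=2)
print(loaded)
```

With three partitions and a limit of two, the third unload starts only when one of the first two streams has finished.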

DB2 components affected
- New system parameters and special registers
- New catalog tables and columns
- Optimization and routing criteria
- EXPLAIN of queries with IDAA options
- New DB2 commands
- New stored procedures specific to IDAA management

System parameters
- ACCEL. Possible values: NO, AUTO, COMMAND.
- QUERY_ACCELERATION. Sets the initial value for the CURRENT QUERY ACCELERATION special register. Possible values: NONE (default), ENABLE, and ENABLE WITH FAILBACK.
Special register CURRENT QUERY ACCELERATION
It can be set implicitly, by inheriting the value of the system parameter, or explicitly, by SET CURRENT QUERY ACCELERATION.
Values:
- NONE: No query is routed to the accelerator.
- ENABLE: A query is routed to the accelerator if it satisfies the acceleration criteria. If there is an accelerator failure while running the query, or the accelerator returns an error, DB2 will return a negative SQL code to the application.
- ENABLE WITH FAILBACK: A query is routed to the accelerator if it satisfies the acceleration criteria. Under certain conditions the query will run on DB2 after it fails in the accelerator. In particular, any negative SQLCODE will cause a failback to DB2 during PREPARE or first OPEN. No failback is possible after a successful OPEN of a query.
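
The difference between the three register values can be illustrated with a small model. It is a simplification under stated assumptions: the failure is taken to happen at PREPARE or first OPEN (the only point where failback applies), and `run_query`, `outcome`, and `AcceleratorError` are hypothetical names, not DB2 interfaces.

```python
class AcceleratorError(Exception):
    """Hypothetical stand-in for a failure or error reported by the accelerator."""


def run_query(mode, eligible, on_accelerator, on_db2):
    """Model of routing under CURRENT QUERY ACCELERATION = mode."""
    if mode == "NONE" or not eligible:
        return on_db2()
    try:
        return on_accelerator()
    except AcceleratorError:
        if mode == "ENABLE WITH FAILBACK":
            return on_db2()        # failback to DB2
        raise                      # ENABLE: the error reaches the application


def outcome(mode, accelerator_up):
    """Describe what an eligible query experiences in the given mode."""
    def on_accelerator():
        if not accelerator_up:
            raise AcceleratorError("accelerator failure")
        return "accelerator"
    try:
        return run_query(mode, True, on_accelerator, lambda: "DB2")
    except AcceleratorError:
        return "negative SQLCODE"


for mode in ("NONE", "ENABLE", "ENABLE WITH FAILBACK"):
    print(mode, "->", outcome(mode, accelerator_up=False))
```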

A query is routed to IDAA if:
- It arrives as dynamic SQL.
- All tables referenced by the query have been copied to the accelerator.
- The SQL contains no unsupported constructs (see the next slide).
- The query contains NO write statements (e.g. INSERT INTO ... SELECT).
- The associated cursor is not defined as scrollable or rowset.
- The whole query is handled as one atomic, indivisible unit: it runs entirely on DB2 or entirely on the accelerator. Individual query blocks are not considered for acceleration.
- It does not use the private protocol, which is not supported (and already deprecated as of DB2 V9).
And above all: running the query on IDAA must be judged more convenient than running it on DB2. The optimizer makes this decision.

Limitations: SQL that cannot be accelerated
- Some data types are not allowed, such as LOBs, ROWID, XML. Columns of these types are not brought to IDAA. Queries that use them are not accelerated; the others are.
- Not all DB2 functions are supported. Excluded are:
  - Trigonometric functions such as SIN, COS, TAN
  - User-defined functions
  - Advanced string functions such as LOCATE, LEFT, OVERLAY
  - Some typically OLAP features such as RANK, ROLLUP, CUBE
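
A rough pre-check for the functions named above can be written as a text scan. This is only a sketch: the list contains just the examples given on the slide (which is not exhaustive), the scan is a plain regular expression rather than an SQL parser, so it can be fooled by string literals, and `unsupported_functions` is a hypothetical helper name.

```python
import re

# Only the examples named on the slide; the real list is longer.
UNSUPPORTED = ("SIN", "COS", "TAN", "LOCATE", "LEFT", "OVERLAY",
               "RANK", "ROLLUP", "CUBE")
_PATTERN = re.compile(r"\b(" + "|".join(UNSUPPORTED) + r")\s*\(",
                      re.IGNORECASE)


def unsupported_functions(sql):
    """Return the sorted names of listed functions that the text calls."""
    return sorted({name.upper() for name in _PATTERN.findall(sql)})


print(unsupported_functions(
    "select sin(x), LEFT(name, 3) from t left join u on t.id = u.id"))
```

Requiring an opening parenthesis after the name keeps `LEFT JOIN` from being mistaken for the `LEFT` string function.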

Inside the optimizer 1/2
To choose which path to route the query to, the optimizer must rely on the query and on the inferred database.
The data in the statistics tables of the DB2 catalog make it possible to obtain or estimate:
- the size of the input tables
- the amount of data to process
- the expected size of the result
The decision rule is heuristic: faced with a very complex problem, it must necessarily be quick to compute.
(1) http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=/com.ibm.db2z10.doc.perf/src/tpc/db2z_profiles.htm

Inside the optimizer 2/2
Under the heuristic rule, a number of cases stay with the DB2 core engine rather than going to IDAA. For example, these queries stay on DB2:
- Queries estimated to be very selective (OLTP-like), for example access through highly selective key fields, or access to data samples.
- Queries with no aggregation or selection at all (no WHERE, GROUP BY, ORDER BY).
- Queries where all referenced tables are classed as small. "Small" refers to a threshold number of pages that the table fits in, normally set to 50. The parameter can be changed; the value -1 skips this check.
- Queries estimated to produce a potentially large result. "Large" is likewise governed by a parameter, tied to the number of rows. The parameter can be changed; the value -1 skips this check.
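
The eligibility criteria and the heuristic described on these slides can be put together in a short sketch. It is not the DB2 optimizer: the field names, the `route` function, the comparison used for "small" (pages at or below the threshold), and the default of 1,000,000 rows for "large" are all assumptions made for illustration. Only the 50-page default and the meaning of -1 come from the slides.

```python
from dataclasses import dataclass


@dataclass
class QueryProfile:
    """Facts the optimizer knows or estimates (hypothetical field names)."""
    dynamic_sql: bool
    read_only: bool
    all_tables_on_accelerator: bool
    uses_unsupported_sql: bool
    scrollable_or_rowset_cursor: bool
    highly_selective: bool
    has_where_group_or_order: bool
    table_pages: list            # pages of each referenced table
    estimated_result_rows: int


def route(q, small_table_pages=50, large_result_rows=1_000_000):
    """Return "IDAA" or "DB2". A threshold of -1 skips that check."""
    eligible = (q.dynamic_sql and q.read_only
                and q.all_tables_on_accelerator
                and not q.uses_unsupported_sql
                and not q.scrollable_or_rowset_cursor)
    if not eligible:
        return "DB2"
    if q.highly_selective or not q.has_where_group_or_order:
        return "DB2"
    if small_table_pages != -1 and all(
            pages <= small_table_pages for pages in q.table_pages):
        return "DB2"
    if large_result_rows != -1 and q.estimated_result_rows > large_result_rows:
        return "DB2"
    return "IDAA"


big_scan = QueryProfile(True, True, True, False, False,
                        False, True, [120_000, 40], 500)
print(route(big_scan))
```

The whole query gets a single answer, which mirrors the rule that a query runs entirely on DB2 or entirely on the accelerator.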


Quick Workload Test
Customer:
- Collecting information from the dynamic statement cache, supported by step-by-step instructions and a REXX script (small effort for the customer)
- Uploading a compressed file (up to some MB) to the IBM FTP server
IBM / Center of Excellence:
- Importing the data into a local database
- Quick analysis based on known DB2 Analytics Accelerator capabilities
Report for a first assessment:
- Acceleration potential for queries
- Estimated time, CP cost
[Diagram: (1) from the customer database, using the documentation and the REXX procedure, a data package is produced (mainly unload data sets); (2) the package is pre-processed and loaded into a database in the IBM lab; (3) the Quick Workload Test Tool produces the report and the assessment.]

In depth: Workload Analysis
- Step 1: Activate Dynamic Statement Cache
- Step 2: Activate relevant IFCIDs 316, 317, 318
- Step 3: Create objects for collecting workload information
- Step 4: Collect workload information from the Dynamic Statement Cache (EXPLAIN STMT CACHE); this populates the table DSN_STATEMENT_CACHE_TABLE
- Step 5: Explain the statements captured in the table DSN_STATEMENT_CACHE_TABLE
- Step 6: Unload workload, explain, and catalog information to data sets
- Step 7: Prepare tersed data sets for sending
- Step 8: Send unload files to IBM Boeblingen DWHz CoE

Contacts, for the next steps
If you already know how DB2 and IDAA can bring value to your company, tell us! Along the way we have met other customers who had interesting ideas.
Not sure? IBM brings some ideas of its own. The value is certainly there, and we would like to try to discover it together, starting with the Workload Analysis, which can also be run on DB2 V8.
Information Management on z/OS sales team:
- Angela Ascione (Center-South), angela_ascione@it.ibm.com
- Elisabetta Curci (North), e_curci@it.ibm.com
IDAA technical team:
- Mario Biffi, mario_biffi@it.ibm.com
- Enrico Caraffi, enrico.caraffi@it.ibm.com
- Massimiliano Castellini, MAX_CASTELLINI@it.ibm.com
- Paola Zornig, paola_zornig@it.ibm.com


IDAA preserves DB2 key value propositions
DB2 continues to own data (both OLTP and DW)
- Access to data (authorization, privileges, ...)
- Data consistency and integrity (backup, recovery, ...)
- Enables extending System z QoS characteristics to BI/DW data as well
Applications access data (both OLTP and DW) only through DB2
- DB2 controls whether to execute a query in DB2 mainline or route it to IDAA
- DB2 returns results directly to the calling application
- Enables mixed workloads and selection of the optimal access path (within DB2 mainline or IDAA) depending on access pattern
IDAA continues to be implemented as a DB2 internal component
- DB2 provides key IDAA status and performance indicators, as well as typical administration tasks, by standard DB2 interfaces and means
- No direct access (log-on) to IDAA
- Enables operational cost reduction through skills, tools, and processes consolidation

IDAA administrative stored procedures
- ACCEL_ADD_ACCELERATOR: Pairing an accelerator to a DB2 subsystem
- ACCEL_TEST_CONNECTION: Check of the connectivity from DB2 procedures to the accelerator
- ACCEL_REMOVE_ACCELERATOR: Removing an accelerator from a DB2 subsystem and cleaning up resources on the accelerator
- ACCEL_UPDATE_CREDENTIALS: Renewing the credentials (authentication token) in the accelerator
- ACCEL_ADD_TABLES: Add a set of tables to the accelerator
- ACCEL_ALTER_TABLES: Alter table definitions for a set of tables on the accelerator (only distribution and organizing keys)
- ACCEL_REMOVE_TABLES: Remove a set of tables from the accelerator
- ACCEL_GET_TABLES_INFO: List a set of tables on the accelerator together with detail information
- ACCEL_LOAD_TABLES: Load data from DB2 into a set of tables on the accelerator
- ACCEL_SET_TABLES_ACCELERATION: Enable or disable a set of tables for query off-loading
- ACCEL_CONTROL_ACCELERATOR: Controlling the accelerator tracing, collecting trace and detail of the accelerator (software level etc.)
- ACCEL_UPDATE_SOFTWARE: Update software on the accelerator (transfer versioned software packages or apply an already transferred package; new: also list software on both the z/OS and the accelerator side)
- ACCEL_GET_QUERY_DETAILS: Retrieve statement text and query plan for a running or completed Netezza query
- ACCEL_GET_QUERY_EXPLAIN: Generate and retrieve Netezza explain output for a query explained by DB2
- ACCEL_GET_QUERIES: Retrieve active and/or history query information from the accelerator

EXPLAIN
The DB2 EXPLAIN function is enhanced to provide basic information about accelerator usage:
- Whether the query qualifies for acceleration and, if not, why
- The access path details associated with the query execution by Netezza are provided independently of DB2 EXPLAIN, by the IDAA Studio.
For each query (irrespective of the number of query blocks) a row is inserted in the following tables:
- In both PLAN_TABLE and DSN_QUERYINFO_TABLE, if the query is re-routed. PLAN_TABLE's ACCESSTYPE column is set to a value of 'A'. DSN_QUERYINFO_TABLE's QI_DATA column shows the converted query text.
- In DSN_QUERYINFO_TABLE only, if the query is not re-routed. The REASON_CODE and QI_DATA columns provide details.
Note that the EXPLAIN tables can be populated with the information described above even if there is no accelerator connected to DB2. Specifying EXPLAINONLY on the START ACCEL command does not establish any communication with an actual accelerator, but enables DB2 to consider its presence in the access path selection process.

DSN_QUERYINFO_TABLE
Column name: column contents
- QUERYNO: The statement identification, the same value as in PLAN_TABLE. Use it with EXPLAIN_TIME to correlate DSN_QUERYINFO_TABLE and PLAN_TABLE.
- QBLOCKNO
- QINAME1: If REASON_CODE = 0, the name of the accelerator.
- QINAME2: If REASON_CODE = 0, the location of the accelerator.
- APPLNAME: The name of the application plan for the row. Applies only to embedded EXPLAIN statements that are executed from a plan or to statements that are explained when binding a plan. A blank indicates that the column is not applicable.
- PROGNAME: The name of the program or package containing the statement being explained. Applies only to embedded EXPLAIN statements and to statements explained as the result of binding a plan or package. A blank indicates that the column is not applicable.
- VERSION: The version identifier for the package. Applies only to an embedded EXPLAIN statement executed from a package or to a statement that is explained when binding a package. A blank indicates that the column is not applicable.
- COLLID: The collection ID for the package. Applies only to an embedded EXPLAIN statement that is executed from a package or to a statement that is explained when binding a package. A blank indicates that the column is not applicable.
- GROUP_MEMBER: The member name of the DB2 that executed EXPLAIN. The column is blank for non-data sharing.
- SECTNOI: The section number of the statement.
- SEQNO
- EXPLAIN_TIME: The time at which the statement is processed. This time is the same as the BIND_TIME column in PLAN_TABLE.
- TYPE: 'A' identifies a query that is considered for acceleration. REASON_CODE identifies if the query qualifies for acceleration or not.
- REASON_CODE: If 0, the query qualifies for acceleration. Otherwise, the query cannot be accelerated. More details on the next chart.
- QI_DATA: If REASON_CODE = 0, the text of the converted SQL statement (sent to IDAA). Otherwise, the description of the reason for not qualifying for acceleration.
- SERVICE_INFO: IBM internal use only
- QB_INFO_ROWID: IBM internal use only

REASON_CODE values
Value: description
- 0: Query qualifies for acceleration.
- 1: No active accelerator was found when EXPLAIN was executed.
- 2: The special register CURRENT QUERY ACCELERATION is set to NONE.
- 3: The query is a DB2 short running query or re-routing to the accelerator is not considered advantageous.
- 4: The query is not read-only.
- 5: The query is running under the private protocol.
- 6: The cursor is defined as scrollable or rowset cursor.
- 7: The query refers to multiple encoding schemes.
- 8: The query FROM clause specifies a data-change-table-reference.
- 9: The query contains a correlated table expression.
- 10: The query contains a common table expression reference.
- 11: The query contains an unsupported expression. QI_DATA contains the expression text.
- 12: The query references table table-name that is either not defined in accelerator, or the table is defined, but is not enabled for query re-routing.
- 13: The accelerator accelerator-name containing the tables of the query is not started.
- 14: The column column-name referenced in the query is altered in DB2 after the data is loaded in the accelerator.
- 900 through 999: IBM internal use
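
When post-processing EXPLAIN output, the reason codes above can be turned into a small lookup. This is a sketch with shortened descriptions; `describe_reason` and `qualifies` are hypothetical helper names, and reading the code out of DSN_QUERYINFO_TABLE is left out.

```python
# Shortened descriptions of the REASON_CODE values listed on the slide.
REASON_CODES = {
    0: "query qualifies for acceleration",
    1: "no active accelerator found when EXPLAIN was executed",
    2: "CURRENT QUERY ACCELERATION is set to NONE",
    3: "short running query, or re-routing not considered advantageous",
    4: "query is not read-only",
    5: "query is running under the private protocol",
    6: "cursor is defined as scrollable or rowset",
    7: "query refers to multiple encoding schemes",
    8: "FROM clause specifies a data-change-table-reference",
    9: "query contains a correlated table expression",
    10: "query contains a common table expression reference",
    11: "query contains an unsupported expression",
    12: "table not defined in accelerator or not enabled for re-routing",
    13: "accelerator containing the tables is not started",
    14: "column altered in DB2 after the data was loaded",
}


def describe_reason(code):
    """Return a short description for a REASON_CODE value."""
    if 900 <= code <= 999:
        return "IBM internal use"
    return REASON_CODES.get(code, "unknown reason code")


def qualifies(code):
    """Only REASON_CODE 0 means the query qualifies for acceleration."""
    return code == 0


for code in (0, 4, 950):
    print(code, describe_reason(code), qualifies(code))
```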

Connectivity options
- Multiple DB2 systems can connect to a single IDAA.
- A single DB2 system can connect to multiple IDAAs.
- Multiple DB2 systems can connect to multiple IDAAs.
This gives better utilization of IDAA resources, scalability, and high availability.
Full flexibility for DB2 systems:
- residing in the same LPAR
- residing in different LPARs
- residing in different CECs
- being independent (non-data sharing)
- belonging to the same data sharing group
- belonging to different data sharing groups