Monthly Archives: December, 2015

IBM Netezza Interview questions (3/3)

61. What is a Sequence? How do you create a new sequence?

These are used to generate surrogate key.

62. How do you delete all rows in a Netezza table at once?

Truncate table :  We can’t rollback

Delete from table: We can rollback using DELETEXID column

63. Do we have private synonyms in Netezza?

No. All synonyms are public.

64. How do you find out table size?

select objname, sum(used_bytes)/1024/1024 size_in_MB from _v_obj_relation_xdb

join _v_sys_object_dslice_info on (objid = tblid) where objname like ‘T1%’;

How to run?

nzsql -db SYSTEM -f c.sql  (or) nzsql -db RAVI -f c.sql

(or) run the above SQL from GUI

[nz@netezza ravi]$ nzsql -db ravi -f c.sql

OBJNAME | SIZE_IN_MB

———+————

T1      |   0.750000

(1 row)

 

65. How do you find out a database size?

select orx.database::nvarchar(64) as “databasename” ,

case when sum(sod.used_bytes) is null then 0 else sum(sod.used_bytes)/1073741824 end as “usedspace_gb”,

case when sum(sod.allocated_bytes) is null then 0 else sum(sod.allocated_bytes)/1073741824 end as “allocatedspace_gb”

from _v_sys_object_dslice_info sod inner join _v_obj_relation_xdb orx on orx.objid = sod.tblid

group by “databasename” order by “databasename”;

Output will be like below:

**************************

[nz@netezza ravi]$ nzsql -f b.sql

databasename | usedspace_gb | allocatedspace_gb

————–+————–+——————-

RAVI         |     0.000122 |          0.002930

TESTDB       |     0.000244 |          0.005859

UAT          |     0.000244 |          0.005859

(3 rows)

66. What data types are most suited for Zone maps?

Integer Datatypes (Byteint, Smallint, Int, Bigint)

Date

67. List which options are prioritized when join operation is required.

Netezza evaluates joins in this order of preference:

1) Colocated joins: All data for joins are located on the same SPU.

2) Redistribute: All required data is not located in the same SPU, send data to corresponding SPU where driving table data is located.

3) Broadcast: Mentioned as replication above; Send all data from SPU to host which collates all that data, sends it to all SPUs. Each SPU has entire table data. That is, if Netezza machine has 32 SPU, there will be 32 physical tables, one each SPU.

 

Coming to joins, typically Netezza prefers in this order:

A) Hash join in memory

B) Hash join on disk

C) Sort Merge join

D) Nested loops

E) Cross join

Oracle preference closely resembles the same or similar order.

68. What are CLI commands to stop and start Netezza?

nzstop

nzstart

nzsystem -restart

69. Can we stop and start Netezza using NZSYSTEM?

nzsystem -restart

70. Why integer data types are preferred in Netezza?

They give best performance in Joins.

 

71. Where do you find Netezza system parameters?

/nz/data/postgresql.conf

 

72. In Netezza, can we do cross database inserts/updates/deletes?

Nope. Only selects

 

73. In Netezza, can we do cross database selects?

Yes

 

74. What is the difference between active session and active transaction?

A session can have multiple transactions.

We can see active sessions using “nzsession”

[nz@netezza kit]$ nzsession -h

Usage: nzsession [-h|-rev] [-hc] <subcmd> [<subcmd options>]

Options:

-h or -? or –help   display this help

-[rR]ev or -V        print the software revision of this program

-hc            display help for <subcmd>

 

Subcommands:

show              show the list of current sessions (default)

abort             abort a running user session

abortTxn          abort the transaction of a current SQL session

priority          change the priority of an SQL session

listSessionTypes  list the session types supported by show

 

75. Get Netezza version running

select version();

76. How do find out running port number in Netezza?

netstat | grep port

 

77. What is the use of NZPASSWORD cli command?

To cache passwords; Avoids the need of typing the password every time.

 

78. How do you find locks in Netezza?

Netezza follows Higher isolation level called serializable.

nz_show_locks DB_NAME  TABLE_NAME

 

79. What are version tables?

https://www-304.ibm.com/connections/forums/html/topic?id=c9f19ff4-c267-4e9f-bcf3-2c4dd9619a39

 

80. Alter table

https://www-304.ibm.com/support/knowledgecenter/SSULQD_7.2.0/com.ibm.nz.dbu.doc/r_dbuser_alter_table.html

 

81. Alter table change database of a column

https://www-304.ibm.com/connections/forums/html/topic?id=49f96c09-fd83-4e72-a118-aaadbc74c4a0&ps=25

 

Thanks,
Ravi

Advertisements

IBM Netezza Interview questions (2/3)

31. How do you remove logically deleted records?

Using Groom Table/Nzreclaim. Before we apply groom, the delete record must be part of at least one backup set id.

Groom (vs) Nzreclaim:

Nzreclaim: CLI. This can be only run from NZSQL interface.

Groom: We can run this from SQL interface. For ex: we can run this from Aginity.

32. How do you restore deleted records?

Using DELETEXID.

set show_deleted_records = TRUE

33. What are various ways to load tables in Netezza?

  1. External Table
  2. NZLOAD (Wrapper to External Table)
  3. CTAS

 

34. What are various ways to unload a table or create a table level backup?

  • External Table

35. What are external tables in Netezza?

  • In Netezza we have 3 types of tables:
  • System Tables: These are stored on Netezza host disks
  • User tables: These are stored on Netezza data slices
  • External Tables: These are stored outside Netezza system. If we have file on Windows box, then we have to use REMOTESOURCE ‘odbc’ option

 

36. How do you update a row/record in Netezza table?

  • If we are trying to update distribution column, it fails
  • In Netezza update = Delete current + Insert new

37. Can you explain what happens in Netezza when you try to update a row/record in Netezza?

Same as (6) above

38. What are the fundamental principles on which Netezza is built?

  • AMPP (Asynchronous Massive Parallel Processing): Gives massive performance
  • Because of its distribution capabilities
  • FPGA (Secret sauce of Netezza’s massive performance)
  • Easy maintenance. For example: No indexes, Limited space management. High Availability
  • Scalability: We can easily plug in another Netezza boxes to current Netezza appliance.

39. What is FPGA?

Field programmable Gate Array. It takes care of de-compression, Restrictions, & Projections and sends the filtered data back to CPU.

 

40. What are transactions in Netezza?

Transaction: All or None. It follows ACID properties.

A: Atomicity

C: Consistency

I: Isolation

D: Durability

 

In Oracle, we have to explicitly commit a transaction. But In Netezza, Transactions are auto commit unless we explicitly use BEGIN.

41. What are various datatypes in Netezza?

SELECT * FROM _V_DATATYPE;

42. List 4 hidden datatypes in Netezza

ROWID, CREATEXID, DELETEXID, Datasliceid

Rowid, Createxid, Deletexid are stored in Netezza Data slices.

But, Datasliceid is stored on Netezza host disks

43. How do you create a new user?

Using CREATE USER

\h create user   ==> Gives us the help

44. How do you create a new group?

Using CREATE GROUP

\h create group   ==> Gives us the help

45. How do you add existing users to an already create group?

Alter user (or) Alter Group

46. What are resource sharing groups (RSG)?

Normal Group (vs) Resource Sharing Group:

By default every group belongs to Normal Group/Public.

RSG’s are nothing but we specify resource usage limits. For example: We add CEO, CIO, Senior Leadership userid’s into HIGHPRIORITY groups. So that, when ever they login to check any dashboards, more resources will be allocated to them.

47. List 2 types of privileges in IBM Netezza

  • Object Level Privileges: Select, insert
  • System Level Privileges: Create database, create user

48. What is a primary key? Does Netezza have Primary Key constraints?

Primary key  = Unique key + No nulls

In Netezza we can create primary key, but Netezza doesn’t enforce primary keys.

49. What is a foreign key? Does Netezza has foreign Key constraints?

Foreign Key: Parent-child relation

In Netezza we can create foreign key, but Netezza doesn’t enforce foreign key constraints

50. What is a NOT NULL constraint? Why NOT NULL is better for Netezza’s performance

Every row in Netezza has a NULL VECTOR which says about the columns null/not null details.

If row header says a particular column is NOT NULL, we don’t have to do some extra checks for that column at the run time.

51. What is the difference between SPU and S-Blade?

Both are same. Where “S” stands for Snippet.

In old Netezza versions – such as Mustang – we use SPU terminology.

52. What is a dataslice and disk?

Disk: Physical drive on which data resides.

Host disks: hold Netezza software, host operating system, database metadata

SPU disks: hold user databases and tables

Data slice: Logical representation of the data that is saved on a disk

Data partition: Logical representation of a data slice that is managed by a specific SPU

 

53. How may disks, FPGA’s, CPU Processors are available in Twinfin 12?

96 each

54. How may disks, FPGA’s, CPU Processors are available in Twinfin 6?

96/4 = 24 each

55. How may disks, FPGA’s, CPU Processors are available in Twinfin 24?

96 *2 ==> 192 each

56. What are clustered base tables and how they are different from normal tables?

Clustering is nothing but grouping of similar things.

In clustered base tables, we group a tables records using a column/columns.

These columns are called as organizing key.

For a table, we can specify a maximum of 4 columns as an Organizing Key.

 

57. List the log file name that captured day-to-day Netezza’s diagnostic messages

sysmgr.log

58. What are various things collected as part of statistics?

  • Number of rows
  • Number of columns & it’s diversity
  • How data is deviant from standard deviation
  • About nulls/not nulls

 

59. What are views? List few system views?

Logical representation of a table. _v_datatype, _v_table. Anything starts with “_v”

 

60. What is a Join? List various Joins in Netezza?

Join: Combining result from multiple tables.

Ex: Inner join, self join, Outer join, Left Outer Join, Right Outer Join, Cross Join

 

Thanks,
Ravi

IBM Netezza Interview questions (1/3)

  1. What is Netezza?

Netezza is a datawarehouse appliance in which both h/w and s/w are tightly coupled to deliver best performance to the business.

Appliance is like a black box. For ex: Washing machine

 

  1. List various datawarehouse appliances available in the market? (or) who are all Netezza’s competitors?
    • Oracle Exadata
    • Teradata
    • EMC Greenplum
    • HP Vertica
    • Microsoft Parallel machine.

Netezza (vs) Hadoop:

Hadoop: For Unstructured data. For ex: Someone wishing you on your birthday

Netezza: Structured data. Registering for a facebook account.

 

  1. List various appliance components
  • SMP Hosts
  • Disks
  • Snippet Blades
  • Sub components: Disk Enclosures, SPA, SPU, FPGA

S-Blade = SPU

Storage Array 1, Storage Array 2.

Each Storage Array has 4 disk Enclosures. Each Disk Enclosure contains 12 disks of 1TB size.

Each S-Blade owns several disks, which reside in a storage array within the same rack.

Storage Array = Storage Disk = N disk enclosures

====

Storage Array 1  +  Chassis 1 => Snippet Processing Array (SPA) 1

Sorage Array2    + Chassis 2 => SPA 2

 

4. List few ETL tools?

Informatica, Datastage, Ab intio

 

5. List few BI tools?

Cognos, OBIEE, SSIS

 6. Where system catalog information is stored? In Netezza host or Netezza disks?

Netezza hosts.

For example: when we create a table, table definition is stored in host disks (df -k) and rows/records of this table are stored in Netezza disks

7. What are default userid’s and passwords in Netezza?

root, nz, admin

8. What is clustering in Netezza?

Clustering is nothing but grouping of similar things.  In Netezza, we have 2 SMP hosts. These 2 hosts are part of a cluster. In this cluster, one host is active and another host is passive.

9. List few back slash commands in Netezza

Back slash commands: To avoid basic day to day SQL

For example: \du is to display users. \l is to list the database users.

 

10. What is the difference between “nzsql” and “nzsql -E”

-E ==> To see the underlying query generated by back slash command

 

11. List 2 default databases in Netezza

SYSTEM, MASTER_DB

MASTER_DB: Is template database; helps us to create new databases faster.

12. What is the extent and page size in Netezza?

Extent size: 3MB

Page size: 128 KB

Note: The above are fixed. we can’t change them.

 

13. What are various maximums in Netezza?

Columns: 1600 per table or view

Maximum columns per distribution: 4

Max Length of database and column names: 128 bytes

Max number of characters in a char/varchar field: 64,000

Max connections to the server: 2000. Default: 500

Row size: 65,535 bytes

 

14. List few environment variables in Netezza

NZ_DATABASE

NZ_HOST

NZ_USER

NZ_PASSWORD

15. List constraints in Netezza

Primary Key (Not enforced)

Unique key (Not enforced)

Referential Integrity (Not enforced)

Not Null (Enforced)

Data type verification (Enforced)

16. Can you insert duplicate rows in Netezza?

Yes

17. Techniques to improve Netezza appliance performance?

As a developer:

– At design time, have a best distribution key. To avoid data skew

– Use appropriate data types. For example: If INT2 is good, then don’t use INT4. For age, Byteint (Int1) is good enough

18. What is a Snippet?

Snippet is a part of given query. When we submit a query to Netezza, it is divided into multiple snippets. These snippets are executed by Snippet(s)-Blades.

19. What are zonemaps?

Automatically created persistent internal tables.  Minimum & Maximum for  Integer & Date columns.

Helps Netezza to avoid unnecessary scanning of Extents (3MB)

Till Netezza 6.0: Zonemaps are till only at extent level

In Netezza 7.0, Zonemaps are even added at Page(128KB) level

Zonemaps (vs) Index: Zonemaps are like Anti-Indexes.

Index points to wherever the data is. But, Zonemaps point to where data doesn’t exist.

20. Generate Statistics (vs) Generate Express Statistics (vs) Automatic Statistics

When a user submits a query, Netezza optimizer decides on best execution path for that query.

For example: If I want to travel from Newyork to Chicago, what is my best mode of transportation in the resources(money & Time) available to me.

 

Generate statistics: Helps Netezza optimizer to know the details of tables such as number of rows, null/not nulls, etc in advance

 

Generate Statistics = Generate Express Statistics

Automatic Statistics: Automatically generated statistics.

Generate Statistics (Vs) Automatic Statistics:

With respect to sampling. Automatic statistics only gathers using little sampling of data. Where as Generate statistics, takes complete data as sample.

21. What are materialized views and advantages of creating them?

In Netezza, Materialized views are also called as SPM views.

S: Sorted

P: Projected

M: Materialized

 

The objective of SPM views is to avoid unnecessary I/O

Advantages:

If we are using only few columns of a major table in most of the queries, we can create a materialized view on these commonly used columns.

For example: create materialized view as select c_custname, c_age, c_address from custinfo order by c_custname.

If anyone fires a select statement to select customer name, the query gets the result from materialized view instead of main table.

The major advantage here is: Amount of data transferred from storage layer to database layer.

 

Can’t we use View?

View: Still has to go and get data from main table. The advantage with SPM views is: data is ordered

22. List 2 distribution methods in Netezza

Hashing Distribution

Round Robin

Advantage of Hashing: Co-located joins will happen here

 

23. What is the default distribution method in Netezza?

Hashing on first column

24. How many maximum columns you can use while distributing a table?

4

 

25. Can we update column(s) used in distribution key?

Nope.  If you have to update, create another table using “CTAS: Create table as … distribution key”

26. How do you change distribution key of already create table?

CTAS: Create table as …

27. What is CTAS? What is the default distribution that you get when you create table using CTAS?

CTAS: Create table as

The default is: whatever we have on original table

 

28. What is a collocated join?

The joins that might happen on Hashing distributed tables in which we don’t have to distribute data of tables at run time.

29. What is table skew? How do you check table is skewed or not?

Table skew: Un even distribution of data of a table across data slices.

SELECT datasliceid, count(datasliceid) from <tablename> group by  datasliceid;

 

30. What is broadcasting in Netezza? When Netezza uses broadcasting?

SELECT …

FROM customer, sales

Where customer.cid = sales.sid

If both tables in join doesn’t have same distribution key, then one or both tables have to be re-distributed at run time.

The decision of broadcasting is taken by Netezza: if one of the tables is way smaller than other table.

 

Thanks,
Ravi

Determine active, standby, cluster information in Netezza

[root@nzdev2a ~]# crm_mon -i5

[root@nzdev3a ~]# /usr/sbin/crm_resource -r nps -W

crm_resource[12322]: 2013/01/19_05:46:37 info: Invoked: /usr/sbin/crm_resource -r nps -W

resource nps is running on: nzdev3a

Finding object/database ID of a Netezza database

There are two ways you can find out object/database ID of a Netezza database.

Please note the below examples are run on pre-7.0.3 environment where we don’t have multi-schema support. However, the same commands works on 7.0.3 onwards too.

(1) Way1:

[nz@nzdev2a 1]$ nzsql

Welcome to nzsql, the IBM Netezza SQL interactive terminal.

Type:  \h for help with SQL commands

\? for help on internal slash commands

\g or terminate with semicolon to execute query

\q to quit

SYSTEM(ADMIN)=> \d  _v_database

View “_V_DATABASE”

Attribute     |              Type               | Modifier | Default Value

——————+———————————+———-+—————

OBJID            | OID                             |          |

DATABASE         | NAME                            |          |

OWNER            | NAME                            |          |

CREATEDATE       | ABSTIME                         |          |

DB_CHARSET       | NAME                            |          |

DB_COLLATION     | NAME                            |          |

DBCHARSET        | OID                             |          |

DBCOLLATION      | OID                             |          |

DBOWNERID        | OID                             |          |

DBLOCKPID        | INTEGER                         |          |

DBSTATUS         | NATIONAL CHARACTER VARYING(255) |          |

BACKUPGROUP      | NAME                            |          |

OBJDELIM         | BOOLEAN                         |          |

DBCOLLECTHISTORY | BOOLEAN                         |          |

ENCODING         | INTEGER                         |          |

View definition: SELECT OD.OBJID, OD.OBJNAME AS “DATABASE”, OD.”OWNER”, OD.CREATEDATE, CASE WHEN (“d”.ENCODING = 0) THEN “c”.COLCHARSET ELSE “NAME”((‘IBM’::”VARCHAR” || “VARCHAR”(“d”.ENCODING))) END AS DB_CHARSET, “c”.COLCOLLATION AS DB_COLLATION, “d”.DBCHARSET, “d”.DBCOLLATE AS DBCOLLATION, OD.DBOWNER AS DBOWNERID, “d”.DBLOCKPID, CASE WHEN (“d”.DBLOCKTYPE = ‘U’::”CHAR”) THEN (‘READ-ONLY’::”NVARCHAR”)::NVARCHAR(255) WHEN (“d”.DBLOCKTYPE = ‘E’::”CHAR”) THEN (‘EXCLUSIVE’::”NVARCHAR”)::NVARCHAR(255) ELSE ‘NULL’::”NVARCHAR” END AS DBSTATUS, GO.OBJNAME AS BACKUPGROUP, OD.OBJDELIM, “d”.DBCOLLECTHISTORY, “d”.ENCODING FROM (((ADMIN.”_V_OBJ_DATABASE” OD JOIN ADMIN.”_T_DATABASE” “d” ON ((“d”.OID = OD.OBJID))) LEFT JOIN ADMIN.”_T_COLLATION” “c” ON ((“c”.OID = “d”.DBCOLLATE))) LEFT JOIN ADMIN.”_T_OBJECT” GO ON (((“d”.DBBACKUPGRP NOTNULL) AND (“d”.DBBACKUPGRP = GO.OBJID))));

 

SYSTEM(ADMIN)=> select OBJID, DATABASE, OWNER from _v_database;

OBJID  | DATABASE  | OWNER

——–+———–+——-

219958 | EDWQA     | ADMIN

218470 | EDWQAHIST | ADMIN

2 | MASTER_DB | ADMIN

1 | SYSTEM    | ADMIN

(4 rows)

 

(2) Way2

[nz@nzdev2a 1]$ nzstats -type database

DB Id  DB Name   Create Date         Owner Id Num Tables Num Views Num Active Users

—— ——— ——————- ——– ———- ——— —————-

1 SYSTEM    2013-01-15 16:50:16      500          0         0                6

2 MASTER_DB 2013-01-15 16:50:16      500          0         0                0

218470 EDWQAHIST 2013-01-15 17:32:53      500         35         0                0

219958 EDWQA     2013-01-16 01:33:20      500       1006       241                0

Common IBM Netezza Administration tasks

The administration tasks of IBM Netezza generally fall into the below categories:

(a) Deploying and installing Netezza clients

(b) Managing the Netezza appliance

(c) Managing system notifications and events

(d) Managing Netezza users and groups

(e) Database Management

(f) Loading Data

(g) Database backup and restore

(h) Query History

(i) Workload management

Thanks,
Ravi