Oracle Administration: Data Pump Import

Let's begin our discussion with a quick review of a sample Data Pump Import command:

impdp foot/foot directory=data_pump_dir1 dumpfile=expfull.dmp logfile=impfull.log full=y

"impdp" tells the operating system to run the Data Pump Import utility. The parameters that follow are used to control the job's execution. In our example, the account and password are both "foot", the file that will contain the input data is "expfull.dmp", the logfile is"impfull.log" , and a full database import will be performed

We learned previously that you must precreate a directory using a SQL DDL statement on the database being accessed. If you don't pre-create the directory, Oracle provides a default directory called DATA_PUMP_DIR. That means that users are unable to fully qualify the output file and log file as they were able to do in the non Data Pump versions of Export and Import. You specify a fully qualified directory on the server when you create the new database directory object and then you must grant access to that directory. My last blog contains more information on the creation and administration of Oracle directories.
Data Pump provides the following import modes:

Full - Activated by specifying "FULL=Y". Notifies Import to import the entire content of the input file. The amount of data actually imported is dependent upon the data contained in the input file. If the input file contains all of the objects and data from a full database export, all of those objects will be imported. If the input file contains a schema export, all of the objects in that schema will be imported.
Schema Import - Activated by specifying "SCHEMAS=schema, schema, schema….). Imports the entire schema and all dependent objects (tables, indexes, grants, synonyms, PL/SQL programs, views). The problem with the older version of Import was that because it didn't recreate the user (just the user's objects), the administrator was forced to save the user creation and grant statements to recreate the user with the same privileges it had before the drop was performed. A Data Pump schema import will recreate the user and execute all of the associated grants, user password history, etc..
Table Import - Activated by specifying "TABLES=[schemas].tablename, [schemas].tablename,…. Imports a table and all dependent objects.
Tablespace Import - Activated by specifying "TABLESPACES=tablespacename, tablespacename, tablespacename… All of the objects contained in the specified tablespace are imported. If a table is imported, its dependent objects are also imported regardless of the tablespace they were stored in.
Transportable Tablespace Import - Activated by specifying "TRANSPORT TABLESPACES=tablespacename… Here's how this feature works. The administrator first runs Data Pump Export with the Transportable Tablespace feature activated. Data Pump exports only the metadata for the objects contained in the specified tablespaces. The objects in the tablespace set being exported can not have any references to objects contained in tablespaces outside of the tablespace set. In addition there can't be any objects in the tablespace set that are referenced by objects outside of the tablespace set being exported. Having a index in a tablespace that is not contained in the same tablespace set that contains the table it is built upon would be an example of a dependency reference problem. Administrators use the TRANSPORT_FULL_CHECK parameter to perform the dependency verification.

The administrator then uses an operating system command to copy the tablespace datafile(s) to the target database. The transportable tablespace import feature is then used to attach the datafile (and its contents) to the database. Transferring the datafile using operating system commands is much faster than exporting and importing the data using the traditional Data Pump Export and Import operations. Administrators are only limited by hardware and network performance.

It is important to note that you are not forced into using the same import mode as you did for the export. You can use the schema and table option using an export file that was created by a full database export. You can also import one table from a schema level export (and so on…). Lastly, Data Pump Import can only read data that was created by a Data Pump Export.
Let's continue by reviewing a few of the 10G Data Pump Import parameters that you may find interesting. Many of these parameters will be the same as for Data Pump Export.
Data Pump Import Parameters

JOB_NAME=jobname_string - You are able to assign a job name of up to 30 bytes to a Data Pump operation. This allows the job to be more easily identified when querying the DBA_DATAPUMP_JOBS and DBA_DATAPUMP_SESSIONS data dictionary views that provide information on Data Pump operations. If you do not specify a job name, Oracle will create a default job name for you. The form will be SYS_IMPORT_mode_nn, where mode is the type of export (FULL, TABLE, etc.) and "nn" is a incrementing number starting at 01. The job name is used in the ATTACH command (shown below) to attach, and reattach, to running jobs.
ATTACH [=[schema_name.]job_name] - Used to attach a client session to an existing Data Pump operation. An example of the ATTACH command is provided below:

> impdp foot/foot ATTACH=foot.importfulljob
DIRECTORY=directory_object - All of the dumpfile set I/O is generated by Data Pump processes that run within the constructs of the database engine. As a result, the O/S account performing the I/O is Oracle, which usually has a higher level of privileges than most other accounts. Oracle attempts to prevent unwarranted operations from occurring by using Oracle directory objects that require read and write privileges granted by the DBA. Users running Data Pump operations are only able to read and write files in the directories that they have been granted access to.

Before a user is able to run any Data Pump operation, the administrator must pre-create a directory and grant privileges to the user on that directory. This means that users are unable to fully qualify the output file and log file as they were able to do in the non Data Pump versions of Export and Import. If you don't pre-create the directory, Oracle provides a default directory called DATA_PUMP_DIR.
DUMPFILE=[directory_object:]file_name [, ...] - Contains the metadata and/or data output from a Data Pump Export operation that Data Pump Import is using as input. The directory object specification is not required to be specified if a directory is specified in the DIRECTORY parameter. If one is specified in the DIRECTORY parameter and the DUMPFILE= parameter, the DUMPFILE=directory: specification will take precedence.
ESTIMATE={BLOCKS | STATISTICS} - Used to estimate the amount of data a Data Pump Import network operation will generate. This differs from the estimate option that is used with Data Pump Export. Using the ESTIMATE parameter with Data Pump Export will ask Data Pump to estimate how large the output dumpfile will be. I'll provide more information on Data Pump network import operations later in this blog.
Fine Grained Selectivity Parameters (Exclude, Include, Query, Flashback, Content) - Allows the administrator to include or exclude objects.
HELP = {y | n} - Displays a listing of all the parameters that can be specified for the Data Pump operation. For example:

> impdp help=y

Would provide a listing of parameters for a Data Pump Import operation.
LOGFILE=[directory_object:]file_name - Contains work-in-progress, work completed and errors generated during a Data Pump Operation. For a real time update on a Data Pump operation's activities, you are able to attach to a running Data Pump operation using the ATTACH command and then execute the STATUS command. Like its dumpfile counterpart, The directory object specification is not required to be specified if a directory is specified in the DIRECTORY parameter. If one is specified in the DIRECTORY parameter and the DUMPFILE= parameter, the DUMPFILE=directory: specification will take precedence.
NETWORK_LINK=database_link_name - OK, I must admit that this is one of my favorite new features provided by Data Pump Import. The network transfer option allows administrators to attach to any source database that can be accessed from the network, transfer the data and write it to the target database. No dumpfile required! Administrators use the NETWORK_LINK parameter to specify a pre-created database link that points to the source database.

This parameter must be specified when using the FLASHBACK_SCN, FLASHBACK_TIME, ESTIMATE, or TRANSPORT_TABLESPACES parameters.
PARALLEL=integer - Multiple worker processes are able to perform inter-table and inter-partition work in parallel. This allows administrators to load and unload database tables using parallel processing. Data Pump Import takes advantage of parallelism to build indexes and load package bodies.

The most beneficial aspect of Data Pump's parallel processing feature is that it is dynamic. Administrators are able to attach (connect) to a Data Pump job that is currently executing and dynamically alter the number of parallel processes it uses.

Oracle documentation states that the parallel option may cause file contention if too many parallel processes are used to read the same input dumpfile. As a result, you'll need to be careful when specifying the level of parallelism.
REUSE_DATAFILES={y | n} - Tells Data Pump Import to reinitialize existing tablespace datafiles. Specifying "N" tells Data Pump Import to generate an error message from the failing create tablespace execution. The Import will then continue executing. Specifying "Y" tells Import to issue a warning and reinitialize the tablespace datafile(s) which results in a total loss of existing data in the target tablespace.
STATUS=[integer] - Specifies the number of seconds between each status information display.
TABLE_EXISTS_ACTION={SKIP | APPEND | TRUNCATE | REPLACE} - Provides Data Pump Import with instructions to perform when it finds a table in the target database with the same name as a table contained in the input file. TABLE_EXISTS_ACTION=SKIP - Existing tables are bypassed if they already exist. No action is taken.
TABLE_EXISTS_ACTION=APPEND -Data Pump Import will append rows to the existing table.
TABLE_EXISTS_ACTION=TRUNCATE - Data Pump Import will truncate the existing table and load rows from the source.
TABLE_EXISTS_ACTION=REPLACE - Existing tables will be dropped, recreated and then loaded from the source.
TRANSFORM = transform_name:value[:object_type] - Allows the administrator to alter object creation DDL for specified objects .

Commands Available in Interactive Mode
We learned previously that administrators are able to attach and detach from actively running Data Pump operations. Administrators use the ATTACH and EXIT_CLIENT commands to connect and disconnect from an active Data Pump job. This output file provides a demonstration of detaching and attaching from an active Data Pump operation. You'll notice that I dynamically change the parallel setting after I reattach.
Let's take a look at some of the other parameters for Data Pump Import that you can dynamically change during its execution:

CONTINUE_CLIENT - Switches from interactive mode to status (logging) mode. An example of this command follows:

impdp> CONTINUE_CLIENT
EXIT_CLIENT - Detaches the client from an active Data Pump operation. The job continues to run. The administrator is then able to use the ATTACH command to reattach to the job at any time. The attach can be performed if the job is executing or in a stopped state. An error message will be returned if an attempt is made to attach to a completed Data Pump Operation.
HELP - Provides help on interactive mode Data Pump commands.
KILL_JOB - Kills the active Data Pump operation and detaches all client sessions. This command differs from the STOP_JOB command which is discussed below. The KILL_JOB will remove DUMPFILEs that were created which prevents the user from taking advantage of Data Pump's restart capabilities. The log files will remain.
PARALLEL=integer - Change the number of parallel processes for a Data Pump operation.
START_JOB / STOP _JOB - Restarts and stops the Data Pump operation the client is currently attached to. Data Pump jobs update a master table with all pertinent information pertaining to the job being executed. The master table knows about the objects being exported or imported and their location in the dump file set.

As a result, a Data Pump job can be restarted in the event of a planned or unplanned stoppage. The Data Pump job accesses the master table to determine where to start and picks up right where it left off. Administrators are able to use the STOP_JOB AND START_JOB commands on the client to temporarily stop jobs and restart them when desired.

Oracle provides the SKIP_CURRENT option of the START_JOB parameter to allow administrators to restart a Data Pump Import operation that failed to restart because of a failing DDL statement. SKIP_CURRENT tells Data Pump Import to skip the object having the problem and restart processing on the next object.
STATUS=[integer] - Specifies the number of seconds between each status information display.

Oracle Administration

Pages

Friday, 11 November 2011

Data Pump Import

No comments:

Post a Comment