Friday 11 November 2011

Data Pump Export

Data Pump Export
The ability to transfer data out of an Oracle database to a flat file provides numerous benefits. A couple of the more common benefits are as follows:
  • Database Backup - Export can be used to take a time consistent snapshot of the data in an Oracle database. The data contained in the flat files can be reloaded back into the database at any time. A common question I received from students during my classes was if redo logs could be used to roll the database forward to the current point-in-time after using Import to reload the data. Sorry, Export and Import work within the database data files. The synchronization information required by Oracle during a database restore and recovery relies upon the data contained in the control files and data file headers. Since Export/Import does not change any of the synchronization data in the file headers, the utility can not be used to perform traditional Oracle recoveries executed using the RECOVER statement.

  • Data Archival - I have seen several shops that used the transportable tablespace feature to archive data to offline storage. If the data was required for reporting purposes, the tablespace was reattached to a reporting database for quick access.

  • Database Upgrades - Export/Import is one of the upgrade options that is almost always specified in the Oracle migration manuals. It is often the only method if the difference between the old source release and the new target release is substantial. Since it requires that a new database be created using the new release, backoff is certainly simplified.

  • Data Transfer - Administrators have been using FROM USER/TO USER to transfer data from one schema to another since SCOTT/TIGER was a cub.
Let's begin our discussion with a quick review of a sample Data Pump Export command that I provided in a previous blog:
expdp foot/foot directory=export_dir dumpfile=expfull.dmp logfile=expfull.log full=y

"expdp" tells the operating system to run the Data Pump Export utility. The parameters that follow are used to control the job's execution. In our example, the account and password are both "foot", the file that will contain the output data is "expfull.dmp", the logfile is"expfull.log" , and a full database export will be performed. 

We learned previously that you must precreate a directory using a SQL DDL statement on the database being accessed. If you don't pre-create the directory, Oracle provides a default directory called DATA_PUMP_DIR. That means that users are unable to fully qualify the output file and log file as they were able to do in the non Data Pump versions of Export and Import. You specify a fully qualified directory on the server when you create the new database directory object and then you must grant access to that directory. You will need to do this for both Data Pump Export and Import operations.
Creating the directory:
CREATE DIRECTORY data_pump_dir1 as '/u01/oradata/prod1/datapump';
We can then view all of the directories by executing the select statement below:

SQL> select directory_name, directory_path from dba_directories

DIRECTORY_NAME DIRECTORY_PATH
--------------- -------------------------------------------------------------
SUBDIR /app/oracle/product/10.2/db/demo/schema/order_entry//2002/Sep
XMLDIR /app/oracle/product/10.2/db/demo/schema/order_entry/
MEDIA_DIR /app/oracle/product/10.2/db/demo/schema/product_media/
LOG_FILE_DIR /app/oracle/product/10.2/db/demo/schema/log/
WORK_DIR /ade/aime_10.2_lnx_push/oracle/work
DATA_FILE_DIR /app/oracle/product/10.2/db/demo/schema/sales_history/
DATA_PUMP_DIR /app/oracle/product/10.2/db/admin/ctitst/dpdump/
ADMIN_DIR /ade/aime_10.2_lnx_push/oracle/md/admin
DATA_PUMP_DIR1 /u01/oradata/prod1/datapump

Before we are able to begin using the directory for Data Pump Operations, we'll need to grant access to it:
GRANT READ, WRITE ON DIRECTORY data_pump_dir1 TO foot;
There is one important point to remember. Even though you have granted the database user the authority to read and write to that directory, the user does not have the authority to access those files outside of the Oracle database. If the only task the user needs to do is execute the Data Pump Export and Import utilities, the database privileges will suffice. But if the user needs to view or manipulate the output and log files after they are created, they will need to have an operating system account created on the database server with the proper operating system privileges granted to it.
In addition, you will have to ensure that the Oracle server itself has the appropriate read/write operating system privileges on the directory structure used in the CREATE DIRECTORY statement. If the Oracle engine can't access the directory, the Data Pump Export and Import won't be able to access it.
Data Pump provides the following export modes:
  • Full Export - Activated by specifying "FULL=Y". Notifies Export to export the entire database. Everything you need to recreate all of the application data objects will be included in the output file including tablespaces, tables, indexes, views, synonyms, PL/SQL programs, users, grants - the works.

  • Schema Export - Activated by specifying "SCHEMAS=schema, schema, schema….). Exports the entire schema and all dependent objects (tables, indexes, grants, synonyms, PL/SQL programs, views). The problem with the older version of Import was that because it didn't recreate the user (just the user's objects), the administrator was forced to save the user creation and grant statements to recreate the user with the same privileges it had before the drop was performed. A Data Pump schema import will recreate the user and execute all of the associated grants, user password history, etc..

  • Table Export - Activated by specifying "TABLES=[schemas].tablename, [schemas].tablename,…. Exports a table and all dependent objects.

  • Tablespace Export - Activated by specifying "TABLESPACES=tablespacename, tablespacename, tablespacename… All of the objects contained in the specified tablespace are exported. If a table is exported, its dependent objects are also exported regardless of the tablespace they are stored in.

  • Transportable Tablespace Export - Activated by specifying "TRANSPORT TABLESPACES=tablespacename… Only the metadata is exported for objects contained in the specified tablespaces. The administrator then uses an operating system command to copy the tablespace datafile(s) to the target database. The transportable tablespace Import feature is then used to attach the datafile (and its contents) to the database.
I provided a listing of parameters that I thought readers might find interesting in a previous blog titled "10G Data Pump Part 1". It may be beneficial for those of you that haven't read that blog to do so before continuing.
Here are a few other parameters I thought you may find to be beneficial during your day-to-day usage of Data Pump Export:
  • JOB_NAME=jobname_string - You are able to assign a job name to a Data Pump operation. This allows the job to be more easily identified when querying the DBA_DATAPUMP_JOBS and DBA_DATAPUMP_SESSIONS data dictionary views. If you do not specify a job name, Oracle will create a default job name for you. The form will be SYS_EXPORT_mode_nn, where mode is the type of export (FULL, TABLE, etc.) and "nn" is a incrementing number starting at 01. The job name is used in the ATTACH command below to attach, and reattach, to running jobs.

  • ATTACH [=[schema_name.]job_name] - Used to attach a client session to an existing Data Pump operation. An example of the ATTACH command is provided below:

    > expdp foot/foot ATTACH=foot.export_full

  • DIRECTORY=directory_object - All of the dumpfile set I/O is generated by Data Pump processes that run within the constructs of the database engine. As a result, the O/S account performing the I/O is Oracle, which usually has a higher level of privileges than most other accounts. Oracle attempts to prevent unwarranted operations from occurring by using Oracle directory objects that require read and write privileges granted by the DBA. Users running Data Pump operations are only able to read and write files in the directories that they have been granted access to.

    Before a user is able to run a Data Pump operation, the administrator must pre-create a directory and grant privileges to the user on that directory. This means that users are unable to fully qualify the output file and log file as they were able to do in the non Data Pump versions of Export and Import. If you don't pre-create the directory, Oracle provides a default directory called DATA_PUMP_DIR.

  • DUMPFILE=[directory_object:]file_name [, ...] - Contains the metadata and/or data output from a Data Pump Export operation. The directory object specification is not required to be specified if a directory is specified in the DIRECTORY parameter. If one is specified in the DIRECTORY parameter and the DUMPFILE= parameter, the DUMPFILE=directory: specification will take precedence.

  • ESTIMATE={BLOCKS | STATISTICS} - Used in conjunction with the ESTIMATE_ONLY parameter to estimate the amount of space a Data Pump Export dumpfile will consume.

  • Fine Grained Selectivity Parameters (Exclude, Include, Query, Flashback, Content) - Allows the administrator to include or exclude objects. For more information on these parameters.
  • HELP = {y | n} - Displays a listing of all the parameters that can be specified for the Data Pump operation. For example:

    > expdp help=y

    Would provide a listing of parameters for a Data Pump Export operation.

  • LOGFILE=[directory_object:]file_name - Contains work-in-progress, work completed and errors generated during a Data Pump Operation. For a real time update on a Data Pump operation's activities, you are able to attach to a running Data Pump operation using the ATTACH command and then execute the STATUS command. Like its dumpfile counterpart, The directory object specification is not required to be specified if a directory is specified in the DIRECTORY parameter. If one is specified in the DIRECTORY parameter and the LOGFILE= parameter, the LOGFILE=directory: specification will take precedence.

  • PARALLEL=integer - Multiple worker processes are able to perform inter-table and inter-partition work in parallel. This allows administrators to load and unload database tables using parallel processing.

    The most beneficial aspect of Data Pump's parallel processing feature is that it is dynamic. Administrators are able to attach (connect) to a Data Pump job that is currently executing and dynamically alter the number of parallel processes it uses.

  • SAMPLE=[[schema_name.]table_name:]sample_percent - Allows an administrator to sample a percentage of the object specified. If you don't specify a table name, the percentage will be applied to the entire Data Pump operation.

  • STATUS=[integer] - Specifies the number of seconds between each status information display.
Commands Available in Interactive Mode
We learned previously that administrators are able to attach and detach from actively running Data Pump operations. Administrators use the ATTACH and EXIT_CLIENT commands to connect and disconnect from an active Data Pump job. This output file provides a demonstration of detaching and attaching from an active Data Pump operation. You'll notice that I dynamically change the parallel setting after I reattach.
Let's take a look at some of the other parameters you can dynamically change:
  • ADD_FILE=[directory_object]file_name [,...] - This feature allows the administrator to add additional files to a Data Pump Export operation. The ability to add data files "on the fly" would come in very handy when the first dumpfile's target drive begins to fill up. The administrator would be able to add a new file on a different drive.

  • CONTINUE_CLIENT - Switches from interactive mode to status mode. An example of this command follows:

    export> CONTINUE_CLIENT

  • EXIT_CLIENT - Detaches the client from an active Data Pump operation.

  • HELP - Provides help on interactive mode Data Pump commands.

  • KILL_JOB - Kills the active Data Pump operation and detaches all client sessions. This command differs from the STOP_JOB command which is discussed below. The KILL_JOB will remove DUMPFILEs that were created which prevents the user from taking advantage of Data Pump's restart capabilities. The log files will remain.

  • PARALLEL=integer - Change the number of parallel processes for a Data Pump operation.

  • START_JOB / STOP _JOB - Restarts and stops the Data Pump operation the client is currently attached to. Data Pump jobs update a master table with all pertinent information pertaining to the job being executed. The master table knows about the objects being exported or imported and their location in the dump file set.

    As a result, a Data Pump job can be restarted in the event of a planned or unplanned stoppage. The Data Pump job accesses the master table to determine where to start and picks up right where it left off. Administrators are able to use the STOP_JOB AND START_JOB commands on the client to temporarily stop jobs and restart them when desired.

  • STATUS=[integer] - Specifies the number of seconds between each status information display.

No comments:

Post a Comment