Data Pump Components
Data Pump provides four client utilities that execute the procedures provided in the DBMS_DATA_PUMP PL/SQL package. The client utilities accept input in the form of parameters which are passed to the DBMS_DATA_PUMP PL/SQL package to enable the exporting and importing of metadata and data into and out of an Oracle database.
Data Pump provides four client utilities that execute the procedures provided in the DBMS_DATA_PUMP PL/SQL package. The client utilities accept input in the form of parameters which are passed to the DBMS_DATA_PUMP PL/SQL package to enable the exporting and importing of metadata and data into and out of an Oracle database.
A brief overview    of the four client utilities is provided below:
- EXPDP - Client interface to the Data Pump Export utility. The interface closely resembles its predecessor, Oracle Export.
- IMPDP - Client interface to the Data Pump Import utility. The interface closely resembles its predecessor, Oracle Import.
- OEM Interface - Provides access to the DBMS_DATA_PUMP PL/SQL package from the DB and Grid Control web interfaces.
- Custom Interface - Allows for the movement of complex data mining models.
Because  Data Pump    is made available through the DBMS_DATA_PUMP PL/SQL  package API, users are able    to call Data Pump natively from  application programs. 
Data Access    Methods
Data Pump provides two access methods to load and unload data:
Data Pump provides two access methods to load and unload data:
- Direct Path       - Direct Path unload has been available since Oracle 7.3 and its  direct load      counterpart has been around since Oracle 8i. Oracle  will attempt to use the      Direct Path first but there are a few  situations that can force Data Pump      to switch to the External  Tables access method. Oracle documentation provides      a listing of  all the reasons why Data Pump was unable to use the Direct Path       data access method. Since Data Pump will automatically switch from the  Direct      Path access method to External Tables, it is transparent to  the client.
 
 
- External Tables - The capability of reading data from external sources has been available since Oracle9i. Oracle10G also support writing data to external destinations. Data Pump provides the ORACLE_DATAPUMP access driver that allows it to read and write to external files. Data Pump is required to use this access method if parallel processing is activated to increase performance.
The  user is not    required to determine which data access method best fits  each job's requirements.    The Data Pump utility itself will  automatically choose the most appropriate    data access method for each  table being accessed. No user intervention required!    The output  created by both the Direct Path and External Table data access methods     has the same external format. This allows Data Pump to choose the best  access    method to load data, regardless of which data access method  created the output. 
Metadata API
Data Pump uses the DBMS_METADATA package to load and unload metadata. The Metadata API stores the metadata as XML documents. This differs from the original Export utility which stored the metadata as DDL. The Data Pump Import utility takes advantage of the flexibility provided by the XML format by allowing users to change object ownership, storage characteristics and tablespace destinations.
Data Pump uses the DBMS_METADATA package to load and unload metadata. The Metadata API stores the metadata as XML documents. This differs from the original Export utility which stored the metadata as DDL. The Data Pump Import utility takes advantage of the flexibility provided by the XML format by allowing users to change object ownership, storage characteristics and tablespace destinations.
Data Pump Data    Flow 
In order for us to truly understand how data pump works; we need to learn how the processes and memory structures interact with each other.
In order for us to truly understand how data pump works; we need to learn how the processes and memory structures interact with each other.
The diagram  closely resembles the one that Oracle uses in many of its presentations     on the Data Pump Export data flow. Let's break down the data flow  diagram into    its individual components:
Shadow Processes
This should be an easy one for us to understand. The shadow process is the foreground process that is created when a client logs on to an Oracle database. The shadow process creates most of the other components found on this diagram. After receiving the Data Pump initiation request, the shadow process creates the master table, advanced queue structures for the command and control and status queues and the master control process.
This should be an easy one for us to understand. The shadow process is the foreground process that is created when a client logs on to an Oracle database. The shadow process creates most of the other components found on this diagram. After receiving the Data Pump initiation request, the shadow process creates the master table, advanced queue structures for the command and control and status queues and the master control process.
After  the Data    Pump job is executing, the shadow process is delegated the  responsibility of    services the GET_STATUS requests from the client.  If the client detaches, the    process is no longer needed and is  removed. We learned previously that detaching    a client from Data Pump  does not cause the Data Pump job to stop executing.    Users are able  to reattach to a Data Pump operation as often as required during    the  job's life cycle. Each time the reattach is performed, Oracle will start     a new shadow process on the client's behalf. In addition, multiple  clients can    be attached to the same Data Pump job concurrently, with  each client having    its own shadow process.
Status Queue
Oracle takes advantage of its Advanced Queuing (AQ) feature to provide communication paths between the different Data Pump processes. The Data Pump master control process writes work progress and error messages to the status queue. Shadow processes subscribe to this queue to retrieve the status information.
Oracle takes advantage of its Advanced Queuing (AQ) feature to provide communication paths between the different Data Pump processes. The Data Pump master control process writes work progress and error messages to the status queue. Shadow processes subscribe to this queue to retrieve the status information.
Master Control    Process
The master control process is the "traffic cop" of the Data Pump environment. It controls the execution of Data Pump jobs by delegating work requests to worker processes based on the particular execution phase (metadata capture, data unloading, data loading, etc.) the job is currently performing. The master control proces updates the master table with high-level information (job's description, state, restart and dumpfile info.) required to restart stopped Data Pump jobs. The process is also responsible for communicating to the clients through the status queue and performing logging operations.
The master control process is the "traffic cop" of the Data Pump environment. It controls the execution of Data Pump jobs by delegating work requests to worker processes based on the particular execution phase (metadata capture, data unloading, data loading, etc.) the job is currently performing. The master control proces updates the master table with high-level information (job's description, state, restart and dumpfile info.) required to restart stopped Data Pump jobs. The process is also responsible for communicating to the clients through the status queue and performing logging operations.
Master Table
The master table contains information about the details of the current Data Pump operation being performed. You can equate it to a running log book that provides information on each object being exported or imported and their locations in the dumpfile set. In addition, the master table also records the parameters supplied by the calling program, worker status information and output file information.
The master table contains information about the details of the current Data Pump operation being performed. You can equate it to a running log book that provides information on each object being exported or imported and their locations in the dumpfile set. In addition, the master table also records the parameters supplied by the calling program, worker status information and output file information.
The  table provides    all of the information required to restart a Data  Pump job that has stopped    because of a planned, or unplanned,  failure. The table is owned by the account    running the Data Pump job  and is written to the dump file at the end of each    Data Pump Export  operation. Data Pump Import read's the master table and loads    it into  the user's schema during the initial phase of execution. The table is     then used to keep track of the Data Pump Import operation (job status,  object    being imported, etc.).
Command and    Control Queue
The command and control queue provides an avenue of communications between the master control process and worker processes. All of the work requests created by the master process and the associated responses are passed through the command and control queue.
The command and control queue provides an avenue of communications between the master control process and worker processes. All of the work requests created by the master process and the associated responses are passed through the command and control queue.
Worker Process
During the initial phases of a Data Pump job, the master control process creates worker process to perform the actual data and metadata load and unload operations. The worker processes are responsible for updating the master table with information on the status (pending, completed, failed) of each object being processed. This information is used to provide the detailed information required to restart stopped Data Pump jobs.
During the initial phases of a Data Pump job, the master control process creates worker process to perform the actual data and metadata load and unload operations. The worker processes are responsible for updating the master table with information on the status (pending, completed, failed) of each object being processed. This information is used to provide the detailed information required to restart stopped Data Pump jobs.
Parallel Query    Process
For those of you that have used Oracle's Parallel Query Option (PQO), these processes should be familiar to you. Data Pump's Parallel Query Processes are standard parallel query slave processes that are controlled by the worker process which assumes the role of the parallel query coordinator.
For those of you that have used Oracle's Parallel Query Option (PQO), these processes should be familiar to you. Data Pump's Parallel Query Processes are standard parallel query slave processes that are controlled by the worker process which assumes the role of the parallel query coordinator.
Dump Datafile
Data Pump dump files are created by EXPDP and used as input by IMPDP. Since their format is incompatible with their predecessors, Export and Import, the utilities are not interchangeable. Import can only read a file created by Export and IMPDP can only read a file created by EXPDP.
Data Pump dump files are created by EXPDP and used as input by IMPDP. Since their format is incompatible with their predecessors, Export and Import, the utilities are not interchangeable. Import can only read a file created by Export and IMPDP can only read a file created by EXPDP.
The  dumpfile contains    data and/or metadata and the master table which is  written to the dumpfile at    the end of each EXPDP operation and read  at the beginning of each IMPDP operation. 
All  of the dumpfile    set I/O is generated by Data Pump processes that run  within the constructs of    the database engine. As a result, the O/S  account performing the I/O is Oracle,    which usually has a higher  level of privileges than most other accounts. Oracle    attempts to  prevent unwarranted operations from occurring by using Oracle directory     objects that require read and write privileges granted by the DBA.  Users running    Data Pump operations are only able to read and write  files in the directories    that they have been granted access to.
Before  a user is    able to run a Data Pump operation, the administrator must  pre-create a directory    and grant privileges to the user on that  directory. This means that users are    unable to fully qualify the  output file and log file as they were able to do    in the non Data Pump  versions of Export and Import. If you don't pre-create    the  directory, Oracle provides a default directory called DATA_PUMP_DIR. 
You  specify a fully    qualified directory on the server when you create  the new database directory    object and then grant access on that  directory to the user running the data    pump operation:
CREATE DIRECTORY data_pump_dir1 as '/u03/oradata/datapump';
GRANT READ, WRITE ON DIRECTORY data_pump_dir1 TO foot;
The user running    the Data Pump Operation must specify the directory in one of two ways:
By specifying the    DIRECTORY parameter on the command line:
EXPDP foot/foot DIRECTORY=data_pump_dir1 dumpfile=users.dmp schemas=eul, kap, mal logfile=users.log
Or by specifying    it in the dumpfile and logfile parameters:
EXPDP foot/foot dumpfile= data_pump_dir1:users.dmp schemas=eul, kap, mal logfile=data_pump_directory:users.log
Notice  that in    the above command I have chosen to write the logfile to a  different directory    than the Data Pump output file. If you specify  the DIRECTORY parameter, the    directory specifications embedded in the  dumpfile and logfile parameters are    optional. Directory  specifications embedded in the file parameters override    the directory  specified in the DIRECTORY parameter.

 
 
No comments:
Post a Comment