7.2.3. Free-form Query Imports
Sqoop can also import the result set of an arbitrary SQL query. Instead of using the --table, --columns and --where arguments, you can specify a SQL statement with the --query argument.
When importing a free-form query, you must specify a destination directory with --target-dir.
If you want to import the results of a query in parallel, then each map task will need to execute a copy of the query, with results partitioned by bounding conditions inferred by Sqoop. Your query must include the token $CONDITIONS, which each Sqoop process will replace with a unique condition expression. You must also select a splitting column with --split-by.
For example:
$ sqoop import \
  --query 'SELECT a.*, b.* FROM a JOIN b ON (a.id = b.id) WHERE $CONDITIONS' \
  --split-by a.id --target-dir /user/foo/joinresults
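To make the substitution concrete, the following is a rough sketch of how the $CONDITIONS token could be filled in for each map task. It is not Sqoop's actual splitting code: the helper split_conditions is hypothetical, and the bounds (0 and 1000) and the task count (4) are made-up values chosen for illustration. Sqoop determines the real bounds itself and its split logic depends on the column type.

```shell
# Hypothetical helper (not part of Sqoop): prints one range condition per
# map task for an integer split column, given assumed min/max bounds.
split_conditions() {
  col=$1; min=$2; max=$3; tasks=$4
  step=$(( (max - min) / tasks ))
  i=0
  while [ "$i" -lt "$tasks" ]; do
    lo=$(( min + i * step ))
    if [ "$i" -eq $(( tasks - 1 )) ]; then
      # The last range is closed so the maximum value is included.
      echo "$col >= $lo AND $col <= $max"
    else
      echo "$col >= $lo AND $col < $(( lo + step ))"
    fi
    i=$(( i + 1 ))
  done
}

# Single quotes keep $CONDITIONS literal, as in the sqoop command line.
query='SELECT a.*, b.* FROM a JOIN b ON (a.id = b.id) WHERE $CONDITIONS'

# Print the query each task would run, with the token replaced.
split_conditions a.id 0 1000 4 | while IFS= read -r cond; do
  printf '%s\n' "$query" | sed "s/\\\$CONDITIONS/$cond/"
done
```

Each printed line is one copy of the query restricted to a disjoint range of a.id, which is why a splitting column is needed for a parallel import.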
Alternatively, the query can be executed once and imported serially by specifying a single map task with -m 1:
$ sqoop import \
  --query 'SELECT a.*, b.* FROM a JOIN b ON (a.id = b.id) WHERE $CONDITIONS' \
  -m 1 --target-dir /user/foo/joinresults
Note | |
---|---|
If you are issuing the query wrapped with double quotes ("), you will have to use \$CONDITIONS instead of just $CONDITIONS so that your shell does not treat it as a shell variable. |
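The quoting issue in this note comes from ordinary shell variable expansion, which can be checked without running Sqoop at all. The snippet below is a small illustration; the table name x and the variable names single, double and broken are placeholders.

```shell
# Single quotes: the shell passes $CONDITIONS through literally.
single='SELECT * FROM x WHERE $CONDITIONS'

# Double quotes: escape the dollar sign so the shell does not expand it.
double="SELECT * FROM x WHERE \$CONDITIONS"

# Double quotes without the escape: the shell substitutes the value of the
# CONDITIONS variable (empty here), so the token never reaches Sqoop.
unset CONDITIONS
broken="SELECT * FROM x WHERE $CONDITIONS"

printf '%s\n' "$single" "$double" "$broken"
```

The first two strings are identical and still contain the token, while the third ends right after WHERE.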
Note | |
---|---|
The facility of using free-form query in the current version of Sqoop is limited to simple queries where there are no ambiguous projections and no |