#2 'ON DATA CONVERSION ERROR SKIP' does not work

open
5
2009-03-06
2009-02-17
No

I have a data conversion problem with the 'ON ERROR' clause.
I'd like to skip dirty rows using the 'ON DATA CONVERSION ERROR SKIP' clause, but it doesn't work.

DDL:
create table MYLOGS
(anonid int, query varchar, querytime datetime format 'yyyy-MM-dd HH:mm:ss', itemrank int, clickurl varchar )
column sep '\t';

Query:
select
year(querytime) as qy, month(querytime) as qm, count(*) as qcnt
from MYLOGS
group by qy, qm ON DATA CONVERSION ERROR SKIP
;

In the reduce phase, I got this error:
java.io.IOException: Could not convert to date:null
    at com.business.cloudbase.hadoop.job.AggFunHandler$AggFunReducer.reduce(Unknown Source)
    at com.business.cloudbase.hadoop.job.AggFunHandler$AggFunReducer.reduce(Unknown Source)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:318)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2198)

Thanks,
Youngwoo

Discussion

  • Tarandeep Singh

    Tarandeep Singh - 2009-03-06
    • labels: --> CloudBase Server
    • assigned_to: nobody --> tsingh
     
  • Tarandeep Singh

    Tarandeep Singh - 2009-03-17

    Can you post a few lines from your log file? You can mask the fields; I just want to see why NULL is returned for the date column.

     
  • Nobody/Anonymous

    Hi Taran,

    It's a simple table.

    My DDL:
    create table MYLOGS
    (anonid int, query varchar, querytime datetime format 'yyyy-MM-dd HH:mm:ss', itemrank int, clickurl varchar )
    column sep '\t';

    These are search logs, and the log file contains dirty rows. I just wanted to skip the dirty rows when using the year() and month() functions.

    Thanks,
    Youngwoo
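
One way to find the offending rows outside CloudBase (either to post a sample, as requested above, or to pre-filter the file until SKIP behaves as expected) is to check each line's querytime field against the format declared in the DDL. The following is only a sketch under assumptions: it treats a row as dirty when its third tab-separated field is missing or does not strictly parse as 'yyyy-MM-dd HH:mm:ss'. The class and method names (DirtyRowFilter, isClean, keepClean) are hypothetical and are not part of CloudBase.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of CloudBase: flags rows whose querytime
// field cannot be parsed with the format declared in the DDL.
public class DirtyRowFilter {
    // Same pattern as the DDL's "datetime format" clause.
    static final String PATTERN = "yyyy-MM-dd HH:mm:ss";
    // querytime is the third column (index 2) in the MYLOGS DDL.
    static final int QUERYTIME_COLUMN = 2;

    // Returns true when the line has a querytime field that parses strictly.
    static boolean isClean(String line) {
        // Limit -1 keeps trailing empty fields so column positions stay stable.
        String[] fields = line.split("\t", -1);
        if (fields.length <= QUERYTIME_COLUMN) {
            return false; // row is too short to contain querytime
        }
        String value = fields[QUERYTIME_COLUMN];
        // SimpleDateFormat is not thread-safe, so create one per call.
        SimpleDateFormat format = new SimpleDateFormat(PATTERN);
        format.setLenient(false); // reject values such as month 13
        ParsePosition position = new ParsePosition(0);
        // parse(String, ParsePosition) returns null on failure instead of throwing.
        return format.parse(value, position) != null
                && position.getIndex() == value.length();
    }

    // Returns only the lines whose querytime field is parseable.
    static List<String> keepClean(List<String> lines) {
        List<String> clean = new ArrayList<>();
        for (String line : lines) {
            if (isClean(line)) {
                clean.add(line);
            }
        }
        return clean;
    }
}
```

Inverting the check (keeping the lines for which isClean returns false) would give the dirty rows themselves, which may help show why NULL reaches the reducer.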

     
