etl - Pentaho Data Integration (DI): Get Last File in a Directory of an SFTP Server
I am building a transformation in Pentaho Data Integration and have a list of files in a directory on an SFTP server. The files are named in the format file_yyyymmddhhiiss.txt, and the directory looks like this:
- mydirectory
- file_20130701090000.txt
- file_20130701170000.txt
- file_20130702090000.txt
- file_20130702170000.txt
- file_20130703090000.txt
- file_20130703170000.txt
My problem is that I need the last file in the list according to its creation date, so that I can pass it to another transformation step...
How can I do this in Pentaho Data Integration?
In fact this is quite simple, because the file names can be sorted textually, and the maximum of the sorted list is the most recent file.
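The claim that textual order matches chronological order for these names can be checked outside PDI. The following is a minimal Python sketch, not PDI code; the helper `parse_timestamp` is a hypothetical name introduced here, and it assumes every name follows the fixed-width file_yyyymmddhhiiss.txt pattern from the question.

```python
from datetime import datetime

# File names taken from the directory listing in the question.
file_names = [
    "file_20130701090000.txt",
    "file_20130701170000.txt",
    "file_20130702090000.txt",
    "file_20130702170000.txt",
    "file_20130703090000.txt",
    "file_20130703170000.txt",
]


def parse_timestamp(name):
    """Hypothetical helper: parse the timestamp embedded in a file name."""
    stamp = name[len("file_"):-len(".txt")]
    return datetime.strptime(stamp, "%Y%m%d%H%M%S")


# Plain string comparison: works because the timestamp is fixed-width
# and ordered from most significant (year) to least significant (second).
latest_by_text = max(file_names)

# Comparison on the parsed timestamp, for cross-checking.
latest_by_time = max(file_names, key=parse_timestamp)

print(latest_by_text)
print(latest_by_text == latest_by_time)
```

The shortcut relies on the zero-padded, year-first layout; names with variable-width or day-first dates would not sort correctly as text.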
Since the list of files is short, you can use a Memory Group by step. A grouping step needs a separate column to aggregate by. If you only have one column and want to find the max over the entire set, you can add a grouping column with an Add constants step, configured to add a column with, say, the integer 1 in every row.
Configure the Memory Group by step to group on the column of 1s, and use the file name column as the subject. Select Maximum as the grouping type. This will produce a single row with the grouping column, the file name field removed, and an aggregate column containing the max file name.
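To make the data flow concrete, here is a rough Python sketch that mimics what the two steps do to the rows. It is an illustration only, not PDI code or a PDI API; the field names `filename`, `group_key`, and `max_filename` are assumptions chosen for this example.

```python
# Input rows, one per file, as they might arrive from the file-listing step.
rows = [
    {"filename": "file_20130701090000.txt"},
    {"filename": "file_20130701170000.txt"},
    {"filename": "file_20130702090000.txt"},
    {"filename": "file_20130702170000.txt"},
    {"filename": "file_20130703090000.txt"},
    {"filename": "file_20130703170000.txt"},
]

# "Add constants" equivalent: append a column holding the integer 1 to every row.
rows_with_key = [dict(row, group_key=1) for row in rows]

# "Memory Group by" equivalent: group on group_key and keep the maximum filename.
groups = {}
for row in rows_with_key:
    key = row["group_key"]
    current = groups.get(key)
    if current is None or row["filename"] > current:
        groups[key] = row["filename"]

# One output row per group: the grouping column plus the aggregate column.
result = [{"group_key": k, "max_filename": v} for k, v in groups.items()]
print(result)
```

Because every row carries the same constant, there is exactly one group, so the output is a single row holding the most recent file name.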