java.nio.file.NoSuchFileException: hdfs:/nameservice1/user HDFS Scala program
At the time of writing this, I could not find an effective native Scala API to copy and move files. The most common recommendation was to use the java.nio.* package.
UPDATE: The java.nio.* approach may not always work on HDFS, so I found the following solution that does.
Move files from one directory to another using the org.apache.hadoop.fs.FileUtil.copy API
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}

val conf = new Configuration()
val srcFs = FileSystem.get(conf)
val dstFs = FileSystem.get(conf)
val dstPath = new Path(DEST_FILE_DIR)

for (file <- fileList) {
  // The 5th parameter (deleteSource) indicates whether the source should be deleted after the copy
  FileUtil.copy(srcFs, file, dstFs, dstPath, true, conf)
}
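For completeness, the fileList above could be built by listing the source directory. This is only a sketch, assuming a SRC_FILE_DIR constant (analogous to DEST_FILE_DIR) that points at the source directory:

// Assumed: SRC_FILE_DIR is the source directory (not part of the original snippet)
val srcPath = new Path(SRC_FILE_DIR)
val fileList: Seq[Path] = srcFs.listStatus(srcPath)
  .filter(_.isFile)   // keep plain files, skip sub-directories
  .map(_.getPath)     // FileStatus -> Path
  .toSeq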
Old solution using java.nio.* APIs
Example:
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

//correct way - the destination includes the file name
Path s = new File("C:\\test\\input\\FlumeData.123.avro").toPath();
Path d = new File("C:\\test\\output\\FlumeData.123.avro").toPath();
Files.move(s, d, StandardCopyOption.REPLACE_EXISTING);

//incorrect way - the destination is only the directory
Path s = new File("C:\\test\\input\\FlumeData.123.avro").toPath();
Path d = new File("C:\\test\\output").toPath();
Files.move(s, d, StandardCopyOption.REPLACE_EXISTING);
So the essence is that Files.move() requires the complete path of the destination file (including the file name), not just the destination directory.
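One simple way to get this right is to resolve the source file name against the destination directory instead of hard-coding the full target path. This is only a sketch with placeholder paths, shown here in Scala:

import java.nio.file.{Files, Paths, StandardCopyOption}

val src = Paths.get("C:\\test\\input\\FlumeData.123.avro")
val destDir = Paths.get("C:\\test\\output")
// Append the source file name to the destination directory to form the full target path
val dest = destDir.resolve(src.getFileName)
Files.move(src, dest, StandardCopyOption.REPLACE_EXISTING)

Note that, as the update above says, this still goes through the default java.nio file system provider, so it can only be expected to work for local paths, not hdfs:/ URIs.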
The exception below can occur when you don't pass the entire path (including the file name) to Files.move().
Exception:
java.nio.file.NoSuchFileException: hdfs:/nameservice1/user/xxxxx/inputDir/FlumeData.xxx.avro -> hdfs:/nameservice1/user/xxxx/output
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:390)
at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
at java.nio.file.Files.move(Files.java:1347)
at hadoop.scala.FileSorter$$anonfun$moveProcessedFilesToArchiveDir$1.apply(FileSorter.scala:114)
at hadoop.scala.FileSorter$$anonfun$moveProcessedFilesToArchiveDir$1.apply(FileSorter.scala:112)
at scala.collection.immutable.List.foreach(List.scala:318)
at hadoop.scala.FileSorter.moveProcessedFilesToArchiveDir(FileSorter.scala:112)
at hadoop.scala.FileSorter.processFile(FileSorter.scala:78)
at hadoop.scala.FileSorter.init(FileSorter.scala:37)
at hadoop.scala.Driver$.main(Driver.scala:32)
at hadoop.scala.Driver.main(Driver.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)