Anonymous
-
2010-12-06
Here is the code I'm using right now. It started with the code above, and then I added some bugfixes. I noticed the code above would cause my unit tests to fail with "too many open files", so I cleaned up the extract method so that it always closes the file handles. I also cleaned up the paths so that it handles .gz files correctly, etc. In my unit testing this code has worked well, but I am definitely open to optimizations, etc.
public class SevenZipJBindingExtractor {

    private static Logger logger = Logger.getLogger(SevenZipJBindingExtractor.class.getName());

    public void extract(String file, String extractPath) throws SevenZipException, IOException {
        ISevenZipInArchive inArchive = null;
        RandomAccessFile randomAccessFile = null;
        try {
            randomAccessFile = new RandomAccessFile(new File(file), "r");
            inArchive = SevenZip.openInArchive(null, new RandomAccessFileInStream(randomAccessFile));
            inArchive.extract(null, false, new MyExtractCallback(inArchive, extractPath));
        } finally {
            // Always release both handles, even if extraction fails
            if (inArchive != null) {
                inArchive.close();
            }
            if (randomAccessFile != null) {
                randomAccessFile.close();
            }
        }
    }

    private static class MyExtractCallback implements IArchiveExtractCallback {
        private final ISevenZipInArchive inArchive;
        private final String extractPath;

        public MyExtractCallback(ISevenZipInArchive inArchive, String extractPath) {
            this.inArchive = inArchive;
            this.extractPath = StringUtils.ensurePathEndsInSlash(extractPath);
        }

        @Override
        public ISequentialOutStream getStream(final int index, ExtractAskMode extractAskMode) throws SevenZipException {
            return new ISequentialOutStream() {
                @Override
                public int write(byte[] data) throws SevenZipException {
                    String filePath = inArchive.getStringProperty(index, PropID.PATH);
                    FileOutputStream fos = null;
                    try {
                        File path = new File(extractPath + filePath);
                        if (!path.getParentFile().exists()) {
                            path.getParentFile().mkdirs();
                        }
                        if (!path.exists()) {
                            path.createNewFile();
                        }
                        // Reopened in append mode on every call
                        fos = new FileOutputStream(path, true);
                        fos.write(data);
                    } catch (IOException e) {
                        logger.error("IOException while extracting " + filePath, e);
                    } finally {
                        try {
                            if (fos != null) {
                                fos.flush();
                                fos.close();
                            }
                        } catch (IOException e) {
                            logger.error("Could not close FileOutputStream", e);
                        }
                    }
                    return data.length;
                }
            };
        }

        @Override
        public void prepareOperation(ExtractAskMode extractAskMode) throws SevenZipException {
        }

        @Override
        public void setOperationResult(ExtractOperationResult extractOperationResult) throws SevenZipException {
        }

        @Override
        public void setCompleted(long completeValue) throws SevenZipException {
        }

        @Override
        public void setTotal(long total) throws SevenZipException {
        }
    }

    public static void main(String[] args) throws Exception {
        // Both arguments are required
        if (args.length < 2) {
            System.out.println("Usage: java SevenZipJBindingExtractor file extractPath");
            return;
        }
        new SevenZipJBindingExtractor().extract(args[0], args[1]);
    }
}
Here is my modification (by the way, I'm using the latest version, so some interface names have changed):
public class SevenZipJBindingExtractor {

    private static int getStreamCounter = 0;
    private static int writeCounter = 0;

    public void extract(String file, String extractPath) throws SevenZipException, IOException {
        IInArchive inArchive = null;
        RandomAccessFile randomAccessFile = null;
        try {
            randomAccessFile = new RandomAccessFile(new File(file), "r");
            inArchive = SevenZip.openInArchive(null, new RandomAccessFileInStream(randomAccessFile));
            inArchive.extract(null, false, new MyExtractCallback(inArchive, extractPath));
        } catch (Exception ex) {
            ex.printStackTrace();
        } finally {
            if (inArchive != null) {
                inArchive.close();
            }
            if (randomAccessFile != null) {
                randomAccessFile.close();
            }
        }
    }

    private static class MyExtractCallback implements IArchiveExtractCallback {
        private final IInArchive inArchive;
        private final String extractPath;

        public MyExtractCallback(IInArchive inArchive, String extractPath) {
            this.inArchive = inArchive;
            this.extractPath = extractPath;
        }

        @Override
        public ISequentialOutStream getStream(final int index, ExtractAskMode extractAskMode) throws SevenZipException {
            getStreamCounter++;
            System.out.println("running into getStream - " + getStreamCounter);
            return new ISequentialOutStream() {
                @Override
                public int write(byte[] data) throws SevenZipException {
                    writeCounter++;
                    System.out.println("running into write - " + writeCounter);
                    String filePath = inArchive.getStringProperty(index, PropID.PATH);
                    FileOutputStream fos = null;
                    try {
                        File path = new File(extractPath + filePath);
                        if (!path.getParentFile().exists()) {
                            path.getParentFile().mkdirs();
                        }
                        if (!path.exists()) {
                            path.createNewFile();
                        }
                        fos = new FileOutputStream(path, true);
                        fos.write(data);
                    } catch (IOException e) {
                        System.out.println("IOException while extracting " + filePath + e);
                    } finally {
                        try {
                            if (fos != null) {
                                fos.flush();
                                fos.close();
                            }
                        } catch (IOException e) {
                            System.out.println("Could not close FileOutputStream" + e);
                        }
                    }
                    return data.length;
                }
            };
        }

        @Override
        public void prepareOperation(ExtractAskMode extractAskMode) throws SevenZipException {
        }

        @Override
        public void setOperationResult(ExtractOperationResult extractOperationResult) throws SevenZipException {
        }

        @Override
        public void setCompleted(long completeValue) throws SevenZipException {
        }

        @Override
        public void setTotal(long total) throws SevenZipException {
        }
    }

    public static void main(String[] args) throws Exception {
        new SevenZipJBindingExtractor().extract("E:\\test\\a.zip", "E:\\test\\extractPath\\");
    }
}
Thanks for writing this awesome utility. I hate calling things by spawning a process and relying on reading stdout, etc. This gives us a true Java way of doing things, and I really appreciate your effort.
I am posting this because I didn't find the code examples straightforward enough to figure out how to write the extracted contents of an archive to disk. So, after some quick experimentation, I came up with the following code. You provide a path for extraction, and it writes the files from the archive to disk with the path names preserved. It's quick code and may not be the best, so edits for efficiency, etc., are welcome and appreciated.
Hello,
sorry for replying so late.
Your code has a serious problem. It might work sometimes,
if your archive contains only small files, but it will fail if you try to
extract a big file.
The problem is that the ISequentialOutStream.write() method can be called by the 7-Zip
extraction engine more than once. If you extract a really huge file that doesn't fit into
memory, you must be able to process it part by part (which you can do with 7-Zip).
Creating a new file on each call to write() can't be right, though.
The other problem you have is extracting zero-length files: for such files the write()
call is omitted.
I'm pretty busy right now, but I will try to keep this in mind and add a new code snippet
that suits your needs.
Regards,
Boris Brodski
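To make the two problems above concrete, here is a minimal sketch in plain Java with no 7-Zip-JBinding dependency. ChunkedFileSink is a hypothetical stand-in for a per-item output stream, not a library class. It opens the target file once, so a zero-length item still ends up on disk, and it appends every chunk passed to write().

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a per-item output stream; not part of 7-Zip-JBinding. */
final class ChunkedFileSink implements AutoCloseable {
    private final OutputStream out;

    /** Opens (and creates) the target once, so even a zero-length item leaves a file behind. */
    ChunkedFileSink(Path target) throws IOException {
        Path parent = target.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        // With no options given, this behaves as CREATE + TRUNCATE_EXISTING + WRITE.
        this.out = Files.newOutputStream(target);
    }

    /** Appends one chunk; may be called zero, one, or many times per item. */
    int write(byte[] data) throws IOException {
        out.write(data);
        return data.length;
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}

final class ChunkedFileSinkDemo {
    /** Writes the given chunks through a single sink and returns the resulting file size. */
    static long writeChunks(Path target, byte[]... chunks) throws IOException {
        try (ChunkedFileSink sink = new ChunkedFileSink(target)) {
            for (byte[] chunk : chunks) {
                sink.write(chunk);
            }
        }
        return Files.size(target);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sink-demo");
        // Three chunks end up in one 10-byte file.
        System.out.println(writeChunks(dir.resolve("big.bin"), new byte[4], new byte[4], new byte[2]));
        // No chunks at all: the file still exists, with length 0.
        System.out.println(writeChunks(dir.resolve("sub/empty.txt")));
    }
}
```

In a real callback the sink would presumably be created when the engine asks for the item's stream and closed when the engine reports the item's result. That mapping is an assumption to check against the library's documentation.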
Hi,
I made absolutely the same mistake.
And I need help.
Regards,
Mira
Hi Mira,
I would be happy to help you :-)
Just ask your questions here
Regards,
Boris Brodski
Hi Boris,
Every time the ISequentialOutStream.write() method was called, it created a new FileOutputStream, which was OK for small files.
I couldn't figure out how to pass the stream in and write to it many times.
Actually, it was a stupid question, and I found the solution myself.
I think this is OK now.
Now my code looks like this:
Thank you for being ready to help me.
Regards,
Mira
Hello Mira,
sorry for the delay. This is OK so far, but I think you could make your code a little bit better.
My suggestions:
- To extract a single file, you don't need to extract all files and then filter out the one you really want. You can call the extract() method like this:
inArchive.extract(new int[] {index}, false, extractCallback); // extract only the file with index "index"
- In the "public int write(byte[] data) throws SevenZipException" method, you can throw your I/O exceptions wrapped in SevenZipExceptions. This makes proper exception handling much easier:
try {
    inArchive.extract(..);
} catch (SevenZipException e) {
    // Deal with all errors here
}
// …
public static class MyExtractCallback implements IArchiveExtractCallback {
    public int write(byte[] data) throws SevenZipException {
        try {
            fos.write(data);
            return data.length; // Return the number of written bytes only if no error occurs!
        } catch (Exception e) {
            throw new SevenZipException("Error writing output file", e);
        }
    }
}
I hope, this will help.
Regards,
Boris
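The second suggestion can be tried out in isolation. The sketch below uses only the JDK. ExtractionException is a hypothetical stand-in for SevenZipException, and WrappingSink stands in for the stream's write() method. The point is that the byte count is returned only on success and the original IOException travels along as the cause.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical stand-in for the library's exception type (SevenZipException in the thread). */
class ExtractionException extends Exception {
    ExtractionException(String message, Throwable cause) {
        super(message, cause);
    }
}

final class WrappingSink {
    private final OutputStream out;

    WrappingSink(OutputStream out) {
        this.out = out;
    }

    /** Returns the number of bytes written, and only if the write succeeded. */
    int write(byte[] data) throws ExtractionException {
        try {
            out.write(data);
            return data.length;
        } catch (IOException e) {
            // Keep the original exception as the cause so the caller can inspect it.
            throw new ExtractionException("Error writing output file", e);
        }
    }
}

final class WrappingSinkDemo {
    /** An output stream that always fails, to exercise the error path. */
    static OutputStream failingStream() {
        return new OutputStream() {
            @Override
            public void write(int b) throws IOException {
                throw new IOException("disk full");
            }
        };
    }

    /** Returns the cause's message if the write fails, or null if it succeeds. */
    static String tryWrite(OutputStream target, byte[] data) {
        try {
            new WrappingSink(target).write(data);
            return null;
        } catch (ExtractionException e) {
            return e.getCause().getMessage();
        }
    }
}
```

With this shape, the caller needs a single catch block around the extraction call instead of logging inside every write().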
I have the following code:
inArchive = SevenZip.openInArchive(null, new RandomAccessFileInStream(randomAccessFile));
int[] itemsToExtract = { 2 };
SevenZipCallback callback = new SevenZipCallback();
inArchive.extract(itemsToExtract, false, callback);
In that callback I have:
@Override
public ISequentialOutStream getStream(int i, ExtractAskMode extractAskMode) throws SevenZipException {
    System.out.println("7Zip calback getStream class: i: " + i);
    return new ISequentialOutStream() {…..}
}
Hope that is understandable. Basically, I'm passing in 2 as the index to extract, and the callback prints each index it is called with. From the docs and the examples above, this should print once with index 2, but the actual output I'm getting with the latest Java Windows libraries is:
7Zip calback getStream class: i: 0
7Zip calback getStream class: i: 1
7Zip calback getStream class: i: 2
Also, would it be possible to update the library to a newer 7-Zip version? I would like to access WIM archives if possible.
Sorry, I've answered this myself: we need to check the extractAskMode:
if (extractAskMode == ExtractAskMode.EXTRACT) {
    return new ISequentialOutStream() {…};
} else {
    return null;
}
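A small simulation of that check, in plain Java so it runs without the library. AskMode is a hypothetical stand-in for ExtractAskMode. Only EXTRACT comes from the post above. SKIP, and the idea that indexes 0 and 1 are announced in a non-extract mode, are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in; only EXTRACT is taken from the thread, SKIP is an assumed placeholder. */
enum AskMode { EXTRACT, SKIP }

final class AskModeAwareCallback {
    /** Indexes for which a stream was actually opened. */
    final List<Integer> opened = new ArrayList<>();

    /** Returns a stream only when the engine really wants the item's data. */
    OutputStream getStream(int index, AskMode mode) {
        if (mode != AskMode.EXTRACT) {
            return null; // nothing to write for this item
        }
        opened.add(index);
        return new ByteArrayOutputStream();
    }
}

final class AskModeDemo {
    /** Simulates the call pattern from the post: indexes 0..2 are announced, only 2 is extracted. */
    static List<Integer> run() {
        AskModeAwareCallback callback = new AskModeAwareCallback();
        callback.getStream(0, AskMode.SKIP);
        callback.getStream(1, AskMode.SKIP);
        callback.getStream(2, AskMode.EXTRACT);
        return callback.opened;
    }
}
```

Returning null for the other modes means no output file is created for items that were not requested.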
Hi,
I'm trying to extract an ISO archive with folders, subfolders, files, and files without extensions. The problem I have is how to recognize whether a path without an extension is a file or a folder.
To create files and subfolders, I'm currently using some code from the first post in this thread.
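One way to approach this is to stop guessing from the name and rely on a per-item "is folder" flag from the archive itself. I believe 7-Zip-JBinding exposes such a flag as PropID.IS_FOLDER, but treat that as an assumption and check the library's documentation. The sketch below is plain Java. ArchiveItem and ItemMaterializer are hypothetical names, and the flag is simply passed in.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical item description: a path plus the archive's own "is folder" flag. */
final class ArchiveItem {
    final String path;
    final boolean folder;

    ArchiveItem(String path, boolean folder) {
        this.path = path;
        this.folder = folder;
    }
}

final class ItemMaterializer {
    /** Creates directories for folder items and empty files for everything else. */
    static void materialize(Path root, List<ArchiveItem> items) throws IOException {
        Path base = root.toAbsolutePath().normalize();
        for (ArchiveItem item : items) {
            Path target = base.resolve(item.path).normalize();
            // Refuse entries such as "../x" that would land outside the extraction root.
            if (!target.startsWith(base)) {
                throw new IOException("Entry escapes extraction root: " + item.path);
            }
            if (item.folder) {
                Files.createDirectories(target);
            } else {
                Files.createDirectories(target.getParent());
                Files.createFile(target);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("iso-demo");
        materialize(root, List.of(
                new ArchiveItem("docs", true),          // folder without extension
                new ArchiveItem("docs/README", false),  // file without extension
                new ArchiveItem("LICENSE", false)));
        System.out.println(Files.isDirectory(root.resolve("docs")));
        System.out.println(Files.isRegularFile(root.resolve("LICENSE")));
    }
}
```

The names "docs" and "LICENSE" look alike, but the flag decides which one becomes a directory.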
So, with all the changes mentioned in this thread, does someone have one piece of code that will do the extraction of a file the right way? If so, can you post it here? It would be really helpful if the code to fully extract an archive was on the main 7-Zip-JBinding site. Any plans for that?
Thanks!
I want to compress a file into 7z, zip, and rar formats using 7-Zip in Java.
Is there a utility or JAR available that can be used to compress files?
Any updates to my question? I'd like to use this code in a production environment but would like to verify that I am using correct extraction code.
Hello!
Thanks for reminding me. So you want to extract an archive file to disk, preserving the directory structure, just like:
> 7z x archive.7z
right?
Regards,
Boris
I'm not familiar with the command line, so I'm not sure if that's right (I assume it is). Basically, given an archive theArchive and an extraction path extractionPath, I'd like to see the code to do
extractArchive(theArchive, extractionPath)
The code steve973 showed us seems to work but you mentioned some improvements. I'd like to see the final code with the improvements added.
I have been playing with this code today and found that it doesn't work with archives with subdirectories until I changed this code:
to
One more thing - it doesn't work properly with .tar.gz files either
Here is the code I'm using right now. It started with the code above, and then I added some bugfixes. I noticed the code above would cause my unit tests to fail with "too many open files", so I cleaned up the extract method so that it always closes the file handles. I also cleaned up the paths so that it handles .gz files correctly, etc. In my unit testing this code has worked well, but I am definitely open to optimizations, etc.
Hi Folks!
Thank you very much, especially to Brian Pipa, the code above worked perfectly for me!
Hi Brian, thank you for your code! I'm using it to extract my archive.
Generally, it works fine, but I found a problem after digging into it.
Here is an example archive structure (each txt file is 333k):
a.zip
|--1.txt
|--2.txt
|--3.txt
|--4.txt
|--5.txt
|--6.txt
|--7.txt
|--8.txt
|--9.txt
|--10.txt
After running Brian's code, I get this folder structure:
extractPath
|--1.txt
|--2.txt
|--3.txt
|--4.txt
|--5.txt
|--6.txt
|--7.txt
|--8.txt
|--9.txt
|--10.txt
It's OK!
But I found that the "public int write(byte[] data)" function was executed more than 10 times; in fact, 60 times. So I made some modifications for testing:
I added two global counters to count how many times each function is executed.
Here is my modification (by the way, I'm using the latest version, so some interface names have changed):
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import net.sf.sevenzipjbinding.ExtractAskMode;
import net.sf.sevenzipjbinding.ExtractOperationResult;
import net.sf.sevenzipjbinding.IArchiveExtractCallback;
import net.sf.sevenzipjbinding.IInArchive;
import net.sf.sevenzipjbinding.ISequentialOutStream;
import net.sf.sevenzipjbinding.PropID;
import net.sf.sevenzipjbinding.SevenZip;
import net.sf.sevenzipjbinding.SevenZipException;
import net.sf.sevenzipjbinding.impl.RandomAccessFileInStream;
public class SevenZipJBindingExtractor {
}
And here is the console output:
running into getStream - 1
running into write - 1
running into write - 2
running into write - 3
running into write - 4
running into write - 5
running into write - 6
running into getStream - 2
running into write - 7
running into write - 8
running into write - 9
running into write - 10
running into write - 11
running into write - 12
running into getStream - 3
running into write - 13
running into write - 14
running into write - 15
running into write - 16
running into write - 17
running into write - 18
running into getStream - 4
running into write - 19
running into write - 20
running into write - 21
running into write - 22
running into write - 23
running into write - 24
running into getStream - 5
running into write - 25
running into write - 26
running into write - 27
running into write - 28
running into write - 29
running into write - 30
running into getStream - 6
running into write - 31
running into write - 32
running into write - 33
running into write - 34
running into write - 35
running into write - 36
running into getStream - 7
running into write - 37
running into write - 38
running into write - 39
running into write - 40
running into write - 41
running into write - 42
running into getStream - 8
running into write - 43
running into write - 44
running into write - 45
running into write - 46
running into write - 47
running into write - 48
running into getStream - 9
running into write - 49
running into write - 50
running into write - 51
running into write - 52
running into write - 53
running into write - 54
running into getStream - 10
running into write - 55
running into write - 56
running into write - 57
running into write - 58
running into write - 59
running into write - 60
Could anyone explain that? Thanks!
Last edit: troy 2017-03-30
Hello Troy,
your code has a problem. In write() you don't get the entire extracted file at once.
That's not possible, since the extracted file may be very large (for example >1TB). So 7-Zip
passes the extracted file in chunks, calling the write() method multiple times.
So you have to open your file outside of the write() method, append data to the file in write(), and close the file later on.
Regards,
Boris
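The numbers in the console output fit this explanation. Ten files with six write() calls each give 60 calls, and six calls per 333k file is what you would see if the engine handed over data in 64 KiB chunks. The chunk size is inferred from the output, not documented anywhere in this thread, so the sketch below treats it as an assumption. It only simulates the call pattern with JDK classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class ChunkFeedSimulation {
    /** Assumed chunk size; inferred from the numbers in the thread, not documented. */
    static final int CHUNK_SIZE = 64 * 1024;

    /** Feeds `total` bytes to `out` in chunks and returns how many write calls were needed. */
    static int feed(OutputStream out, int total) throws IOException {
        int calls = 0;
        int remaining = total;
        while (remaining > 0) {
            int n = Math.min(CHUNK_SIZE, remaining);
            out.write(new byte[n]);
            remaining -= n;
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) throws IOException {
        // The stream is opened once, outside the write loop, and closed once afterwards.
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            int calls = feed(out, 333 * 1024);
            System.out.println(calls + " write calls, " + out.size() + " bytes");
        }
    }
}
```

Note that feeding zero bytes results in zero write calls, which matches the earlier remark in this thread that write() is never called for zero-length files.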
I appreciate the code samples but I have a few questions.
I tried using the SevenZipJBinding library but it doesn't seem to work on a 64-bit system.
Is this library platform dependent?
If so, does anyone know of a 7Zip decompression algorithm which is platform independent?
I need a decompress/extract method which will work on a Windows, Linux, Unix box running under either 32-bit or 64-bit.
Using the above example by gruntbug I get the following error.
Error Trace
7-Zip-JBinding should work on: Windows 32/64, Linux 32/64 and Mac OS 32/64 (all Intel, no ARM yet)
Have you already tried:
- sevenzipjbinding-4.65-1.05-rc-extr-only-Windows-amd64.zip
- sevenzipjbinding-4.65-1.05-rc-extr-only-AllWindows.zip
- sevenzipjbinding-4.65-1.05-rc-extr-only-AllPlatforms.zip
Download URL: https://sourceforge.net/projects/sevenzipjbind/files/7-Zip-JBinding/
Be sure to select the latest release.
.tar.gz archives can't be handled as a single archive. They are actually a TAR archive inside a GZ archive.
7-Zip-JBinding can extract (and later also create) GZ archives as well as TAR archives, so you need to implement a two-step extraction: first extract the .tar from the .gz, then extract the files from the .tar.
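To illustrate the two layers without depending on the library, here is a sketch that swaps in JDK classes. Step 1 uses java.util.zip.GZIPInputStream to strip the GZ layer, and step 2 walks the 512-byte TAR headers by hand to list the entry names. With 7-Zip-JBinding you would presumably open the archive once per layer instead. TwoStepTarGz and its method names are made up for this example, and sampleTarGz builds a minimal header (name and size only, no checksum), which is enough for this reader but not for real tar tools.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class TwoStepTarGz {
    /** Step 1: strip the GZ layer, yielding the bytes of the inner TAR archive. */
    static byte[] gunzip(byte[] gz) throws IOException {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            return in.readAllBytes();
        }
    }

    /** Step 2: walk the 512-byte TAR headers and collect the entry names. */
    static List<String> listTarNames(byte[] tar) {
        List<String> names = new ArrayList<>();
        int pos = 0;
        // A header block starting with a NUL byte marks the end of the archive.
        while (pos + 512 <= tar.length && tar[pos] != 0) {
            String name = field(tar, pos, 100);
            long size = Long.parseLong(field(tar, pos + 124, 12).trim(), 8); // octal
            names.add(name);
            // Skip the header plus the data, which is padded to a multiple of 512 bytes.
            pos += 512 + (int) ((size + 511) / 512 * 512);
        }
        return names;
    }

    /** Reads a NUL-terminated ASCII header field. */
    private static String field(byte[] buf, int offset, int length) {
        int end = offset;
        while (end < offset + length && buf[end] != 0) {
            end++;
        }
        return new String(buf, offset, end - offset, StandardCharsets.US_ASCII);
    }

    /** Demo helper: builds a one-entry TAR archive (minimal header) wrapped in GZ. */
    static byte[] sampleTarGz(String name, byte[] content) throws IOException {
        byte[] header = new byte[512];
        byte[] nameBytes = name.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(nameBytes, 0, header, 0, nameBytes.length);
        byte[] size = String.format("%011o", content.length).getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(size, 0, header, 124, size.length);
        ByteArrayOutputStream tar = new ByteArrayOutputStream();
        tar.write(header);
        tar.write(content);
        tar.write(new byte[(512 - content.length % 512) % 512]); // pad the data block
        tar.write(new byte[1024]);                                // end-of-archive marker
        ByteArrayOutputStream gz = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(gz)) {
            out.write(tar.toByteArray());
        }
        return gz.toByteArray();
    }
}
```

Usage is the two steps chained together: listTarNames(gunzip(bytes)). The same chaining applies when both steps are done through the library.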
Sorry, I didn't see that there was an AllPlatforms version :)
Now it works perfectly.
Thanks for the fast replies.
This library is amazing, I can't believe I didn't know about it sooner.
Thank you. I'm glad you like it :-)