From: Nguyen A. Q. <aq...@gm...> - 2015-10-27 02:49:39
On Tue, Oct 27, 2015 at 2:30 AM, Phil Roth <pr...@en...> wrote:

> Hi all,
>
> This past July, I gave a talk about using Python to examine malware:
> http://www.slideshare.net/mrphilroth/examining-malware-with-python
> https://www.youtube.com/watch?v=2gyAemhbxnE
>

Thanks for sharing this. It looks like nice work, congrats!

>
> In it, I talk about using machine learning techniques to classify malware.
> Specifically, I compare the performance of classification models based on
> instructions generated by IDA Pro and instructions I generated myself with
> Capstone. Someone with this project made a comment about the talk on
> Twitter:
> https://twitter.com/capstone_engine/status/624580597650862080
>
> Next month, I’m going to be giving a talk to a Meetup group in San
> Francisco where I’m going to include some of the same material. I wanted to
> check here before I give the talk so that I don’t misrepresent what
> Capstone is and is not. I don’t feel like I yet totally understand the
> issues behind that tweet.
>
> My message is going to be: “Disassembled instructions are a great feature
> to use when using machine learning models to classify malware. Results can
> vary based on what disassembler is used. I’ve found that a model based on
> features from a single pass disassembler like Capstone will produce
> slightly worse results than one based on IDA Pro disassembly. But the ease
> of use and repeatability of the results make it a better choice.”
>

What do you mean by "single pass disassembler"? This is how all
disassemblers work, not only Capstone.

Also, can you elaborate on where IDA produces better results? Keep in mind
that IDA is a complicated tool that does a lot more than just disassembling,
while Capstone is designed to do just one simple thing: disassemble the
binary you feed it. Any more complicated processing must be done by your
own program.

>
> Is the error in those statements referring to Capstone Engine as the
> disassembler? Should I be referring to LLVM MC as the disassembler and
> Capstone as the framework through which I used it? Is there some other
> problem that I don’t yet understand?
>

Capstone is based on LLVM MC, but we go far beyond that:
http://www.capstone-engine.org/beyond_llvm.html

Let me know if you have more questions.

Thanks,
Quynh
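
To make the "disassemble the binary you feed it" point concrete, here is a minimal sketch using the Capstone Python bindings. It is only an illustration, not the code from the talk: the helper names `disassemble` and `mnemonic_counts`, the x86-64 mode, the base address, and the sample bytes are all assumptions. Capstone decodes the buffer it is given; anything beyond that (here, counting mnemonics as possible features) is left to the calling program.

```python
from collections import Counter


def disassemble(code, base=0x1000):
    """Yield the mnemonic of each instruction Capstone decodes from `code`.

    Hypothetical helper: assumes x86-64 code and an arbitrary base address.
    """
    # Imported here so the rest of the module works without Capstone installed.
    from capstone import Cs, CS_ARCH_X86, CS_MODE_64

    md = Cs(CS_ARCH_X86, CS_MODE_64)
    # disasm() decodes instructions one after another from the start of the
    # buffer and stops at the first byte sequence it cannot decode.
    for insn in md.disasm(code, base):
        yield insn.mnemonic


def mnemonic_counts(mnemonics):
    """Count mnemonics; this post-processing is the caller's job, not Capstone's."""
    return Counter(mnemonics)


if __name__ == "__main__":
    # Sample bytes: push rbp; mov rbp, rsp; pop rbp; ret
    sample = b"\x55\x48\x89\xe5\x5d\xc3"
    print(mnemonic_counts(disassemble(sample)))
```

Locating code inside an executable, following control flow, or separating code from data would all have to be built on top of this by the caller.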