|
From: Stephen T. <st...@to...> - 2006-11-16 14:52:39
|
How does valgrind load itself at address 0x38000000? I see the make
variable VALT_LOAD_ADDRESS. What I cannot determine is how that is used.

Stephen
|
From: Tom H. <to...@co...> - 2006-11-16 15:11:24
|
In message <116...@ba...>
Stephen Torri <st...@to...> wrote:
> How does valgrind load itself at address 0x38000000? I see the make
> variable VALT_LOAD_ADDRESS. What I cannot determine is how that is used.
By passing -Wl,-defsym,valt_load_address=... to the linker when
linking each of the static tool binaries.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Stephen T. <st...@to...> - 2006-11-17 01:12:36
|
On Thu, 2006-11-16 at 15:11 +0000, Tom Hughes wrote:
> In message <116...@ba...>
> Stephen Torri <st...@to...> wrote:
>
> > How does valgrind load itself at address 0x38000000? I see the make
> > variable VALT_LOAD_ADDRESS. What I cannot determine is how that is used.
>
> By passing -Wl,-defsym,valt_load_address=... to the linker when
> linking each of the static tool binaries.
>
> Tom

Ok. So how does valgrind handle loading the programs it's going to test
at their expected address?

Stephen
|
From: Tom H. <to...@co...> - 2006-11-17 08:32:20
|
In message <116...@ba...>
Stephen Torri <st...@to...> wrote:
> On Thu, 2006-11-16 at 15:11 +0000, Tom Hughes wrote:
>> In message <116...@ba...>
>> Stephen Torri <st...@to...> wrote:
>>
>> > How does valgrind load itself at address 0x38000000? I see the make
>> > variable VALT_LOAD_ADDRESS. What I cannot determine is how that is used.
>>
>> By passing -Wl,-defsym,valt_load_address=... to the linker when
>> linking each of the static tool binaries.
>
> Ok. So how does valgrind handle loading the programs it's going to test
> at their expected address?
That all starts with VG_(do_exec) in coregrind/m_ume.c but basically
we just implement an ELF loader.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
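To make "we just implement an ELF loader" a little more concrete, here is a minimal sketch of the first step any such loader takes: reading the program headers and listing the PT_LOAD segments together with the addresses they expect. This is not Valgrind's code from coregrind/m_ume.c. The names `LoadSegment`, `read_le` and `elf64_load_segments` are hypothetical, and the sketch assumes a little-endian ELF64 image already read into memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical record for one loadable segment; not a Valgrind type.
struct LoadSegment {
    std::uint64_t file_offset;  // where the segment's bytes start in the file
    std::uint64_t vaddr;        // address the segment expects to be mapped at
    std::uint64_t file_size;    // bytes to copy from the file
    std::uint64_t mem_size;     // bytes to reserve (the excess is zero-filled)
    std::uint32_t flags;        // PF_X = 1, PF_W = 2, PF_R = 4
};

// Read a little-endian unsigned integer of type T at byte offset `off`.
template <typename T>
T read_le(const std::vector<std::uint8_t>& buf, std::uint64_t off) {
    if (off > buf.size() || buf.size() - off < sizeof(T))
        throw std::runtime_error("truncated image");
    T value = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i)
        value = static_cast<T>(value | (static_cast<T>(buf[off + i]) << (8 * i)));
    return value;
}

// Collect the PT_LOAD entries of a little-endian ELF64 image held in memory.
std::vector<LoadSegment> elf64_load_segments(const std::vector<std::uint8_t>& img) {
    const std::uint8_t magic[4] = {0x7f, 'E', 'L', 'F'};
    for (std::size_t i = 0; i < 4; ++i)
        if (read_le<std::uint8_t>(img, i) != magic[i])
            throw std::runtime_error("not an ELF file");
    // e_ident[4] == 2 means ELFCLASS64, e_ident[5] == 1 means little-endian.
    if (read_le<std::uint8_t>(img, 4) != 2 || read_le<std::uint8_t>(img, 5) != 1)
        throw std::runtime_error("only little-endian ELF64 is handled here");

    const auto phoff     = read_le<std::uint64_t>(img, 32);  // e_phoff
    const auto phentsize = read_le<std::uint16_t>(img, 54);  // e_phentsize
    const auto phnum     = read_le<std::uint16_t>(img, 56);  // e_phnum

    std::vector<LoadSegment> out;
    for (std::uint16_t i = 0; i < phnum; ++i) {
        const std::uint64_t ph = phoff + std::uint64_t(i) * phentsize;
        if (read_le<std::uint32_t>(img, ph) != 1)  // skip anything but PT_LOAD
            continue;
        out.push_back({read_le<std::uint64_t>(img, ph + 8),    // p_offset
                       read_le<std::uint64_t>(img, ph + 16),   // p_vaddr
                       read_le<std::uint64_t>(img, ph + 32),   // p_filesz
                       read_le<std::uint64_t>(img, ph + 40),   // p_memsz
                       read_le<std::uint32_t>(img, ph + 4)});  // p_flags
    }
    return out;
}
```

A real loader would go on to map each segment at its `vaddr`, copy `file_size` bytes from `file_offset`, set page protections from `flags`, and deal with the interpreter and the initial stack; none of that is shown here.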
|
|
From: Stephen T. <st...@to...> - 2006-11-17 17:25:00
|
On Fri, 2006-11-17 at 08:32 +0000, Tom Hughes wrote:
> > Ok. So how does valgrind handle loading the programs it's going to test
> > at their expected address?
>
> That all starts with VG_(do_exec) in coregrind/m_ume.c but basically
> we just implement an ELF loader.

This confirms my suspicions that I will need to create a loader.

The framework I am building is written in C++. I need the functionality
that you have in aspacemgr for creating the space at the desired location
using the mmap2 system call. Since this coregrind functionality is not
available from the installed headers I will either need to copy the code
from the copy I checked out and use it or create it myself.

What do you recommend that I do?

Stephen
|
From: Tom H. <to...@co...> - 2006-11-17 18:01:12
|
In message <116...@ba...>
Stephen Torri <st...@to...> wrote:
> This confirms my suspicions that I will need to create a loader.
>
> The framework I am building is written in C++. I need the functionality
> that you have in aspacemgr for creating the space at the desired location
> using the mmap2 system call. Since this coregrind functionality is not
> available from the installed headers I will either need to copy the code
> from the copy I checked out and use it or create it myself.
>
> What do you recommend that I do?
I think you need to explain a lot more about what you're trying to
do before I can answer that...
If your code is running as part of a client program managed by valgrind
then you can just call the mmap() function in the C library, or call the
appropriate mmap system call directly, and valgrind will intercept and
handle it and do its best to honour the request.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
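As a minimal sketch of "just call the mmap() function in the C library", the following asks for anonymous memory near a preferred address and reports whether the request was honoured. The names `Mapping`, `map_at_hint` and `unmap` are hypothetical. The sketch assumes a POSIX system and deliberately avoids MAP_FIXED, so the address is only a hint and the caller has to check the result.

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>

// Result of asking the kernel for memory at a preferred address.
struct Mapping {
    void*       addr;        // where the region landed (nullptr on failure)
    std::size_t length;      // size of the region in bytes
    bool        at_request;  // true if the hint was honoured exactly
};

// Ask for `length` bytes of anonymous read/write memory near `wanted`.
// Without MAP_FIXED the address is only a hint, so the caller must check
// where the mapping really ended up.
Mapping map_at_hint(std::uintptr_t wanted, std::size_t length) {
    void* hint = reinterpret_cast<void*>(wanted);
    void* p = mmap(hint, length, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return {nullptr, 0, false};
    return {p, length, p == hint};
}

// Release a region obtained from map_at_hint().
void unmap(const Mapping& m) {
    if (m.addr != nullptr)
        munmap(m.addr, m.length);
}
```

For example, `map_at_hint(0x38000000, 1 << 20)` requests one megabyte at the address mentioned earlier in the thread; if `at_request` comes back false, the range was unavailable and the caller has to decide whether the alternative address is acceptable.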
|
|
From: Stephen T. <st...@to...> - 2006-11-17 18:31:34
|
On Fri, 2006-11-17 at 18:00 +0000, Tom Hughes wrote:
> I think you need to explain a lot more about what you're trying to
> do before I can answer that...

I am working on a reverse engineering shared library which can be called
from a variety of programs. The present problem I am trying to solve is
how to detect whether a binary program is compressed or not. That is, a
packer/compression tool is used on the original binary and produces a
smaller executable. This compressed program is not the real program, but
it's hard to determine from the binary whether this is true or not.

I am supposing that if I mark the memory of the code one color (e.g.
Orange) and the data another (e.g. Green), I can at least detect whether
the suspect binary is a self-modifying program or not. A compressed
binary is only one class of self-modifying code, so merely running the
code and determining whether it attempts to execute any part of its data
section is not specific enough.

The Unpacker component in this framework takes input from other
components in order to do its job. It needs to know the architecture of
the given binary along with the starting code address, code size,
starting data address, data size and a memory map of the suspect binary.
A memory map is a data structure I created for managing the binary
images. It gives me the ability to jump to various points in the image
and write to or read from it.

The Unpacker uses the type of architecture to create a specific Simulator
(e.g. x86). A simulator is a class that uses the VEX library to translate
the binary image to simulate execution. That was the suggestion that
Julian gave me since the framework is only going to handle user space
applications. So to start the simulator working I need to give it the
memory map and the address within it of the first instruction.

So for now I desire to load the suspect binary into its desired location
(e.g. 0x4000000 for win32 programs). If I can, I would like to recover
the unpacking code from the suspect binary and analyze it, hopefully to
discover properties unique to unpacking programs. I would be able to use
this information to create a better compressed binary detector. Also I
would like to recover the original program instructions. So I would have
two memory maps as output, unpacker code and original code.

I know I need to create a PE file loader like valgrind has done for ELF
programs. Right now the fundamental issue is how to locate the program
and called DLLs at their desired location in memory if possible.

I am sorry for being vague. It is my sincere hope that what I have
provided will help you understand the nature of the problems I am trying
to solve.

Stephen
|
From: Julian S. <js...@ac...> - 2006-11-17 19:13:43
|
> I am working on a reverse engineering shared library which can be called
> from a variety of programs. The present problem I am trying to solve is
> how to detect whether a binary program is compressed or not. [...]

FWIW, I have no clue about Windows so I can't help you there.

What I am struck by is that this seems a roundabout way to discover
whether or not an executable is compressed. There are some pretty
effective techniques for guessing whether or not a sequence of bytes
comes from a known source if you have previous examples of the outputs
of candidate sources.

The crudest version would be to simply try to compress the executable.
Even a simple compression algorithm should be able to get x86 code below
4 bits/byte, whereas random data -- i.e., already compressed code -- will
not compress any further.

If you want to get more sophisticated you could use a PPM-based sequence
prediction algorithm. Train it on various source models, e.g. x86 code,
amd64 code, English; then use each one in turn to compress (parts of) the
executable. Typically the model that most closely matches the fragment
you are testing will do noticeably better than the rest. PPM (Prediction
by Partial Matching) is fairly simple to implement.

J
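An even cruder stand-in for "try to compress the executable" is to measure the order-0 Shannon entropy of its bytes. This is a swapped-in technique, not the compression test or the PPM approach described above: it ignores context entirely, so its numbers are not comparable to the 4 bits/byte figure. The sketch below only illustrates the idea; `bits_per_byte` and `looks_compressed` are hypothetical names, and the 7.0 threshold is an uncalibrated assumption.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Order-0 Shannon entropy of a byte sequence, in bits per byte (0.0 to 8.0).
double bits_per_byte(const std::vector<std::uint8_t>& data) {
    if (data.empty())
        return 0.0;
    std::array<std::size_t, 256> counts{};  // histogram of byte values
    for (std::uint8_t b : data)
        ++counts[b];
    double entropy = 0.0;
    for (std::size_t c : counts) {
        if (c == 0)
            continue;  // unused byte values contribute nothing
        const double p = static_cast<double>(c) / static_cast<double>(data.size());
        entropy -= p * std::log2(p);
    }
    return entropy;
}

// Illustrative check only: the default threshold is an assumption, not a
// measured cut-off, and would need tuning against real samples.
bool looks_compressed(const std::vector<std::uint8_t>& data, double threshold = 7.0) {
    return bits_per_byte(data) > threshold;
}
```

Applying `bits_per_byte` per section rather than to the whole file would let a small, low-entropy unpacking stub stand out from a large, high-entropy payload.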
|
From: Tom H. <to...@co...> - 2006-11-17 18:42:16
|
In message <116...@ba...>
Stephen Torri <st...@to...> wrote:
> So for now I desire to load the suspect binary into its desired location
> (e.g. 0x4000000 for win32 programs). If I can I would like to recover
> the unpacking code from the suspect binary to analyze it to hopefully
> discover unique properties to unpacking programs. I would be able to use
> this information to create a better compressed binary detector. Also I
> would like to recover the original program instructions. So I would have
> two memory maps as output, unpacker code and original code.
>
> I know I need to create a PE file loader like valgrind has done for ELF
> programs. Right now the fundamental issue is how to locate the program
> and called DLLs at their desired location in memory if possible.
Ah, well Windows isn't really my area of expertise I'm afraid, but
it does have memory allocation APIs along the lines of mmap so there
shouldn't be any problem allocating the memory.
You just need to read the PE header and work out where each section
wants to load, then try and allocate the memory and read it in.
There are complications though, as if memory serves me right PE code
is not generally position independent (even when in a DLL) so if you
can't load it at the right address you may have to relocate it.
The same thing does apply to ELF, at least on x86 where the linker
will allow you to put non-PIC code in a shared library, but we are
able to use the existing Linux dynamic loader to do most of the work,
and I don't know if the same thing would be possible on Windows.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
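The step "read the PE header and work out where each section wants to load" can be sketched as follows. This is an illustration under stated assumptions, not a complete loader: `PeSection`, `read_le` and `pe_sections` are hypothetical names, the image is assumed to be fully read into memory, and only the header fields needed to compute each section's preferred address are examined.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record describing where one PE section wants to live.
struct PeSection {
    std::string   name;          // up to 8 characters, e.g. ".text"
    std::uint64_t wanted_addr;   // ImageBase + VirtualAddress
    std::uint32_t virtual_size;  // bytes occupied in memory
    std::uint32_t raw_offset;    // PointerToRawData (offset in the file)
    std::uint32_t raw_size;      // SizeOfRawData (bytes present in the file)
};

// Read a little-endian unsigned integer of type T at byte offset `off`.
template <typename T>
T read_le(const std::vector<std::uint8_t>& buf, std::uint64_t off) {
    if (off > buf.size() || buf.size() - off < sizeof(T))
        throw std::runtime_error("truncated image");
    T value = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i)
        value = static_cast<T>(value | (static_cast<T>(buf[off + i]) << (8 * i)));
    return value;
}

// List every section of a PE image together with its preferred address.
std::vector<PeSection> pe_sections(const std::vector<std::uint8_t>& img) {
    if (read_le<std::uint16_t>(img, 0) != 0x5A4D)  // "MZ"
        throw std::runtime_error("missing MZ header");
    const std::uint64_t pe = read_le<std::uint32_t>(img, 0x3C);  // e_lfanew
    if (read_le<std::uint32_t>(img, pe) != 0x00004550)  // "PE\0\0"
        throw std::runtime_error("missing PE signature");

    const std::uint64_t coff = pe + 4;  // 20-byte COFF file header
    const auto nsections = read_le<std::uint16_t>(img, coff + 2);
    const auto opt_size  = read_le<std::uint16_t>(img, coff + 16);
    const std::uint64_t opt = coff + 20;  // optional header

    const auto magic = read_le<std::uint16_t>(img, opt);
    std::uint64_t image_base = 0;
    if (magic == 0x10B)        // PE32: 4-byte ImageBase at offset 28
        image_base = read_le<std::uint32_t>(img, opt + 28);
    else if (magic == 0x20B)   // PE32+: 8-byte ImageBase at offset 24
        image_base = read_le<std::uint64_t>(img, opt + 24);
    else
        throw std::runtime_error("unknown optional header magic");

    std::vector<PeSection> out;
    const std::uint64_t table = opt + opt_size;  // section table follows
    for (std::uint16_t i = 0; i < nsections; ++i) {
        const std::uint64_t s = table + std::uint64_t(i) * 40;  // 40-byte entries
        std::string name;
        for (int k = 0; k < 8; ++k) {
            const char c = static_cast<char>(read_le<std::uint8_t>(img, s + k));
            if (c == '\0')
                break;
            name.push_back(c);
        }
        out.push_back({name,
                       image_base + read_le<std::uint32_t>(img, s + 12),  // VirtualAddress
                       read_le<std::uint32_t>(img, s + 8),                // VirtualSize
                       read_le<std::uint32_t>(img, s + 20),               // PointerToRawData
                       read_le<std::uint32_t>(img, s + 16)});             // SizeOfRawData
    }
    return out;
}
```

If the memory at `wanted_addr` cannot be obtained, the relocation problem described above applies: the loader would have to process the image's base relocation data, which this sketch does not attempt.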
|
|
From: Stephen T. <st...@to...> - 2006-11-17 19:14:20
|
On Fri, 2006-11-17 at 18:42 +0000, Tom Hughes wrote:
> Ah, well Windows isn't really my area of expertise I'm afraid, but
> it does have memory allocation APIs along the lines of mmap so there
> shouldn't be any problem allocating the memory.
>
> You just need to read the PE header and work out where each section
> wants to load, then try and allocate the memory and read it in.
>
> There are complications though, as if memory serves me right PE code
> is not generally position independent (even when in a DLL) so if you
> can't load it at the right address you may have to relocate it.
>
> The same thing does apply to ELF, at least on x86 where the linker
> will allow you to put non PIC code in a shared library, but we are
> able to use the existing linux dynamic loader to do most of the work
> and I don't know if the same thing would be possible on Windows?

The library is developed and running on a Linux system.

Stephen
|
From: Nicholas N. <nj...@cs...> - 2006-11-17 22:50:43
|
On Fri, 17 Nov 2006, Stephen Torri wrote:
> I am working on a reverse engineering shared library which can be called
> from a variety of programs. The present problem I am trying to solve is
> how to detect whether a binary program is compressed or not. That is, a
> packer/compression tool is used on the original binary and produces a
> smaller executable. This compressed program is not the real program, but
> it's hard to determine from the binary whether this is true or not.
>
> [...]
>
> I know I need to create a PE file loader like valgrind has done for ELF
> programs. Right now the fundamental issue is how to locate the program
> and called DLLs at their desired location in memory if possible.
>
> I am sorry for being vague. It is my sincere hope that what I have
> provided will help you understand the nature of the problems I am trying
> to solve.

Sounds to me like this could be written relatively easily as a normal
Valgrind tool, except that you need it to run on Windows. Or maybe you need
it to handle Windows binaries, but the tool can run on Linux?

Nick
|
From: Stephen T. <st...@to...> - 2006-11-17 23:13:26
|
On Sat, 2006-11-18 at 09:50 +1100, Nicholas Nethercote wrote:
> Sounds to me like this could be written relatively easily as a normal
> Valgrind tool, except that you need it to run on Windows. Or maybe you need
> it to handle Windows binaries, but the tool can run on Linux?

My design is to create a library that can be used with programs that run
on Linux. The library will support a wide variety of components that can
be arranged at run-time through XML files containing the layout. This
way, if someone comes up with a new component for doing something, it's
easy for them to add it into the program. I do not want to restrict how
someone can use this program.

Right now my trouble is how to handle allocating a memory block via mmap2
and use that allocator in a C++ STL vector.

All of this is getting off-topic. So if anyone has a solution to this I
would appreciate the help. When I get to actually needing to use VEX I
will ask my questions on this list. So everyone please contact me
off-line regarding this topic.

Stephen
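One possible shape for an mmap-backed allocator usable with a C++ STL vector is sketched below. It is an assumption-laden illustration, not code from Valgrind or from the framework discussed here: `MmapAllocator` and `MappedBytes` are hypothetical names, it relies on the C++11 minimal allocator interface, and it calls the C library's mmap() rather than issuing the mmap2 system call directly.

```cpp
#include <cstddef>
#include <new>
#include <sys/mman.h>
#include <vector>

// Minimal C++11 allocator whose storage comes straight from mmap().
// Every allocation is its own mapping, so this suits a few large buffers
// rather than many small ones.
template <typename T>
struct MmapAllocator {
    using value_type = T;

    MmapAllocator() = default;
    template <typename U>
    MmapAllocator(const MmapAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(-1) / sizeof(T))
            throw std::bad_alloc();  // n * sizeof(T) would overflow
        void* p = mmap(nullptr, n * sizeof(T), PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            throw std::bad_alloc();
        return static_cast<T*>(p);  // page-aligned, so suitably aligned for T
    }

    void deallocate(T* p, std::size_t n) noexcept {
        munmap(p, n * sizeof(T));
    }
};

// The allocator is stateless, so any two instances are interchangeable.
template <typename T, typename U>
bool operator==(const MmapAllocator<T>&, const MmapAllocator<U>&) noexcept {
    return true;
}
template <typename T, typename U>
bool operator!=(const MmapAllocator<T>&, const MmapAllocator<U>&) noexcept {
    return false;
}

// Convenience alias: a byte buffer whose storage is mmap-backed.
using MappedBytes = std::vector<unsigned char, MmapAllocator<unsigned char>>;
```

A vector moves its storage whenever it grows, so this design cannot pin an image to a chosen address. When the address matters, holding the mapping directly and exposing it through a pointer and a length is likely the simpler route.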