Thread: [Openpacket-devel] Use cases and meta data
From: Eric H. <li...@er...> - 2007-01-29 14:14:30
|
Dear Packet Fanatics,

I'm going to talk out loud here in hopes that by not editing myself too much I will be freer to innovate. That means a lot of this could have been said before, but maybe I'll present a new twist.

One of the things we'll need with packet captures is meta data to describe the packet flow. One of the ways to determine what meta-data is necessary is to go through use cases and see what becomes necessary and desirable. My use case here will be testing Network Security Devices (NSD) as that is where the need for OpenPacket is extremely high and where my interest is.

For testing NSDs, one will want to send a stream of packets past the NSD. For inline, perhaps a stream separated into sides going through the NSD. This process should be automated, so that a program should be able to associate the flow with the desired outcome from the NSD. There should be normal flows as well as attack flows available. There may be 'false positive' flows that should not trip the NSD, but might, and these should be labeled normal and associated with the attack.

The following elements are needed to support this. Each element may be either standardized if a value space is defined or normalized if it needs to be. For these purposes, I'll break out the flows into the client and the server, understanding that sometimes that is not appropriate, but mostly it is. Shoulds and musts apply to this use case only.

CVE, standardized, it may be that more than one CVE applies, but for now, if any apply, one must be filled in.
Server Application vendor, normalized, must.
Server Application version, normalized, must.
Client Application vendor, normalized, should.
Client Application version, normalized, should.
Server OS vendor, normalized, should.
Server OS version, normalized, should.
Client OS vendor, normalized, should.
Client OS version, normalized, should.
Vulnerability exploited?, boolean.
Vulnerable, boolean, for passive sensors.
IP in data, boolean, must indicate if the IP address of either the client or server is represented in the data. If so, it would be nice to have a way to define where (for this particular flow) the IPs are in the data so that they can be changed. Something like packet 4, byte offset 54, client IP.
Port in data, boolean, same as IP in data.
Flow-pointer, normalized, a URL for the actual packet dump file.
OpenPacket meta-data, such as reliability of the source or if verified by others.
Meta-data URL, standardized, for getting updates to the meta-data, especially the open-packet portion of the meta-data.

Not all elements need to be a part of the system as stored in the OpenPacket repository, however the method of creating the meta-data should support having all these elements.

Sometimes the OS is what is vulnerable and sometimes it is just the application. Having the OS separate is nice. It may allow a sophisticated testing system to manipulate OS specific fingerprints in a flow, such as Source ports, sequence numbers, options etc. to simulate different OSs from one capture. Is that dreaming, or what. :)

I don't recall if there has been a final decision regarding packaging of captures. In order to support meta-data, they probably need to be gzipped-tarballs that include both the meta-data and the packet capture. Alternatively, if the capture file names could be completely unique and normalized, then a testing system could look locally for a cached copy, by filename, and then use the full URL if not found. This is probably a feature for a later version of OpenPacket.

A wonderful feature for tcpreplay that would support this use case is client-server mode. Here the tcpreplay client and server would 'play' their respective side of the conversation after receiving the packet from the other side. This would be very useful for inline NSD testing.

The reason I have been thinking about this use case is that I am putting together a system that will be able to manage the automated testing. Its current functionality is limited. It can watch a log file, request a remote attack agent to attack, receive confirmation that the attack was made, and check to see if the attack was observed in the logs. The attack agent can only launch HTTP attacks or prewritten HPING attacks right now. It is all in Perl and extensible through plug-ins. I began to think about a tcpreplay plug-in and that brought about this email.

I hope to appease my employer's intellectual property demigods and clean up the functionality, code and docs enough to release the system to CPAN in the not too distant future.

Regards,
Eric Hacker, CISSP

aptronym (AP-troh-NIM) noun
A name that is especially suited to the profession of its owner

I _can_ leave well enough alone, but my criteria for well enough is pretty darn high.
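
The must/should levels in the element list of this message lend themselves to an automated check. The following is only a minimal sketch of that idea: the snake_case field names and the function `check_metadata` are hypothetical, since the thread does not define a concrete format, and the conditional CVE rule ("if any apply, one must be filled in") is left out because it cannot be verified from the record alone.

```python
# Requirement levels for the NSD-testing use case, taken from the list above.
# Field names are hypothetical; the thread does not specify a syntax.
MUST = ("server_app_vendor", "server_app_version")
SHOULD = (
    "client_app_vendor", "client_app_version",
    "server_os_vendor", "server_os_version",
    "client_os_vendor", "client_os_version",
)


def check_metadata(record):
    """Return (errors, warnings) for one capture's meta-data record.

    A missing or empty MUST field is an error; a missing or empty
    SHOULD field is only a warning.
    """
    errors = [f"missing required field: {name}"
              for name in MUST if not record.get(name)]
    warnings = [f"missing recommended field: {name}"
                for name in SHOULD if not record.get(name)]
    return errors, warnings


if __name__ == "__main__":
    record = {"server_app_vendor": "ExampleVendor", "server_app_version": "1.0"}
    errors, warnings = check_metadata(record)
    print("errors:", errors)
    print("warnings:", warnings)
```

Keeping the levels in two plain tuples means a different use case could supply its own lists without touching the check itself.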
|
From: Aaron T. <syn...@gm...> - 2007-01-29 21:01:34
|
Comments inline...

On 1/29/07, Eric Hacker <li...@er...> wrote:
> Dear Packet Fanatics,

[snip]

> The following elements are needed to support this. Each element may be
> either standardized if a value space is defined or normalized if it
> needs to be. For these purposes, I'll break out the flows into the
> client and the server, understanding that sometimes that is not
> appropriate, but mostly it is. Shoulds and musts apply to this
> use case only.

[snip list of description oriented metadata]

Having some experience creating a taxonomy in this area, I can say it's quite doable. However, it does take time to manage the taxonomy and time to connect the dots. In general I suspect that most people with pcaps they'd contribute no longer have this sort of information, and if they did would be unwilling to spend the time to fill out the data for each pcap. I'm not saying some people won't, but a majority won't bother.

> IP in data, boolean, must indicate if the IP address of either the
> client or server is represented in the data. If so, it would be nice
> to have a way to define where (for this particular flow) the IPs are
> in the data so that they can be changed. Something like packet 4, byte
> offset 54, client IP.

I assume the reason you want the above is so that you can properly rewrite IP addresses to match your test bed network. I would argue that this really should be automated (who wants to look through pcaps in Wireshark and find IP addresses??) and hence should be a separate tool. Perhaps the OpenPacket team develops such a tool, but that seems outside of the scope of this project at this time. On a side note, I'd suggest looking at tshark's PDML output. At least that way you won't be spending all your time writing protocol decoders.

> Port in data, boolean, same as IP in data.
> Flow-pointer, normalized, a URL for the actual packet dump file.

I'm not sure what a "flow-pointer" is. More info?

> OpenPacket meta-data, such as reliability of the source or if verified
> by others.
> Meta-data URL, standardized, for getting updates to the meta-data,
> especially the open-packet portion of the meta-data.

Good idea.

> Not all elements need to be a part of the system as stored in the
> OpenPacket repository, however the method of creating the meta-data
> should support having all these elements.

Not sure I 100% follow, but I will say that data is only useful if people have access to it. And people are less likely to contribute data if they don't see the immediate value.

> Sometimes the OS is what is vulnerable and sometimes it is just the
> application. Having the OS separate is nice. It may allow a
> sophisticated testing system to manipulate OS specific fingerprints in
> a flow, such as Source ports, sequence numbers, options etc. to
> simulate different OSs from one capture. Is that dreaming, or what. :)

I think a lot of the vuln target information should come from the CVE. No need to duplicate info IMHO.

> I don't recall if there has been a final decision regarding packaging
> of captures. In order to support meta-data, they probably need to be
> gzipped-tarballs that include both the meta-data and the packet
> capture. Alternatively, if the capture file names could be completely
> unique and normalized, then a testing system could look locally for a
> cached copy, by filename, and then use the full URL if not found. This
> is probably a feature for a later version of OpenPacket.
>
> A wonderful feature for tcpreplay that would support this use case is
> client-server mode. Here the tcpreplay client and server would 'play'
> their respective side of the conversation after receiving the packet
> from the other side. This would be very useful for inline NSD testing.

Yep. Actually, I would love to see OpenPacket ship tcpreplay (tcpprep) cache files along with the pcaps so people can use them in inline mode right away.
I'm sure a lot of people would find that useful.

[snip]

--
Aaron Turner
http://synfin.net/
http://tcpreplay.synfin.net/ - Pcap editing & replay tools for Unix
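
To make the PDML suggestion above concrete, here is a rough sketch of how a tool might use PDML (as produced by `tshark -T pdml`) to locate where an IP address appears in a flow, including inside application data, without writing its own decoders. The PDML fragment is hand-written for illustration and is not real tshark output; the helper `find_ip_fields` and the 1-based packet numbering are assumptions. The match is a plain substring test on each field's `show` attribute, so it would also hit longer addresses that merely contain the searched one.

```python
import xml.etree.ElementTree as ET

# Hand-written PDML-like fragment (illustrative only, not real tshark output).
# It models an FTP PORT command that repeats the client IP in the payload.
SAMPLE_PDML = """
<pdml>
  <packet>
    <proto name="ip" pos="14" size="20">
      <field name="ip.src" show="10.0.0.5" pos="26" size="4"/>
      <field name="ip.dst" show="10.0.0.9" pos="30" size="4"/>
    </proto>
    <proto name="ftp" pos="54" size="19">
      <field name="ftp.request.arg" show="10,0,0,5,4,1" pos="59" size="12"/>
    </proto>
  </packet>
</pdml>
""".strip()


def find_ip_fields(pdml_text, ip):
    """Return (packet_number, field_name, pos, size) for fields showing `ip`.

    Both the dotted form and the comma-separated form used by FTP are tried.
    """
    variants = {ip, ip.replace(".", ",")}
    root = ET.fromstring(pdml_text)
    hits = []
    for number, packet in enumerate(root.iter("packet"), start=1):
        for fld in packet.iter("field"):
            show = fld.get("show", "")
            if any(variant in show for variant in variants):
                hits.append((number, fld.get("name"),
                             int(fld.get("pos", "-1")),
                             int(fld.get("size", "-1"))))
    return hits


if __name__ == "__main__":
    for hit in find_ip_fields(SAMPLE_PDML, "10.0.0.5"):
        print(hit)
```

The `pos` and `size` values are exactly the kind of "packet N, byte offset M" information Eric asks for, so a tool along these lines could generate that meta-data rather than requiring contributors to enter it by hand.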
|
From: Eric H. <li...@er...> - 2007-01-29 22:16:40
|
On 1/29/07, Aaron Turner <syn...@gm...> wrote:
> Comments inline...
>
> On 1/29/07, Eric Hacker <li...@er...> wrote:
> > Dear Packet Fanatics,
>
> [snip]
>
> Having some experience creating a taxonomy in this area, I can say
> it's quite doable. However, it does take time to manage the taxonomy
> and time to connect the dots. In general I suspect that most people
> with pcaps they'd contribute no longer have this sort of information,
> and if they did would be unwilling to spend the time to fill out the data
> for each pcap. I'm not saying some people won't, but a majority won't
> bother.

Sadly, I agree. That's what karma points are for though, right. :)

> > IP in data, boolean, must indicate if the IP address of either the
> > client or server is represented in the data. If so, it would be nice
> > to have a way to define where (for this particular flow) the IPs are
> > in the data so that they can be changed. Something like packet 4, byte
> > offset 54, client IP.
>
> I assume the reason you want the above is so that you can properly
> rewrite IP addresses to match your test bed network. I would argue
> that this really should be automated (who wants to look through pcaps
> in Wireshark and find IP addresses??) and hence should be a separate
> tool. Perhaps the OpenPacket team develops such a tool, but that
> seems outside of the scope of this project at this time. On a side
> note, I'd suggest looking at tshark's PDML output. At least that
> way you won't be spending all your time writing protocol decoders.

Rewriting supports more than just matching the testbed. It allows the simulation of multiple flows. Sure, the IDS catches one attack, but does it handle 500 all at different targets?

I agree that rewriting IP addresses needs to be automated, and perhaps a tool can do it automatically most of the time so this isn't necessary. It just seems to me that often the protocols are new or complex enough that the tools can't support this.
Whereas if the meta data can explicitly provide the location, the tool can do it without knowing the protocol.

PDML sucks as far as XML goes and using other tools to parse it. It is a Packet DISPLAY ML. A good Packet ML would look much like RFC 3252, fixed up a bit.

> > Flow-pointer, normalized, a URL for the actual packet dump file.
>
> I'm not sure what a "flow-pointer" is. More info?

I was assuming that the meta-data and the actual pcap are not in a single file, so the meta-data ought to point to the pcap. Flow pointer is probably a poor choice of words.

> > Not all elements need to be a part of the system as stored in the
> > OpenPacket repository, however the method of creating the meta-data
> > should support having all these elements.
>
> Not sure I 100% follow, but I will say that data is only useful if
> people have access to it. And people are less likely to contribute
> data if they don't see the immediate value.

What I meant here is that the format should be extensible and systems shouldn't choke on meta-data that they don't understand. There is a core set of required meta-data, a set of optional meta-data, but if it is there, this is what it has to look like, and then there's the meta-data that someone needed and decided to give back because someone else might find it useful too. There are limits of course.

> I think a lot of the vuln target information should come from the CVE.
> No need to duplicate info IMHO.

I agree, when it is CVE related. What if it's just some nice Microsoft Exchange traffic that happens to trigger some dumb ISS signature all the time? I think this is the kind of info that is optional, but needs to have rules established on how to put it in.

> Yep. Actually, I would love to see OpenPacket ship tcpreplay
> (tcpprep) cache files along with the pcaps so people can use them in
> inline mode right away. I'm sure a lot of people would find that
> useful.

That's a perfect reason to keep the meta-data format extensible.
It enables you to create a way to ship extra information in the meta-data.

Thanks for tcpreplay, BTW.

Peace,
Hacker
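
The three tiers described in this message (required core, optional but with a fixed shape, and free-form extras) could be handled by a loader along the following lines. This is a sketch under assumptions: the field names, the `CORE` and `OPTIONAL` tables, and `load_metadata` are all hypothetical, and the only "shape" checked here is the Python type of each value.

```python
# Hypothetical tiers; the thread does not fix names or a syntax.
CORE = {"pcap_url": str}                      # must be present
OPTIONAL = {"cve": str, "ip_in_data": bool}   # may be absent, but typed if present


def load_metadata(raw):
    """Split a raw meta-data dict into (known, extensions).

    Raises ValueError if a core field is missing or a known field has the
    wrong type. Unrecognized keys are preserved rather than rejected, so a
    consumer does not choke on meta-data it does not understand.
    """
    for key in CORE:
        if key not in raw:
            raise ValueError(f"missing core field: {key}")

    known, extensions = {}, {}
    for key, value in raw.items():
        expected = CORE.get(key) or OPTIONAL.get(key)
        if expected is None:
            extensions[key] = value          # unknown: carry it along untouched
        elif not isinstance(value, expected):
            raise ValueError(f"field {key!r} should be {expected.__name__}")
        else:
            known[key] = value
    return known, extensions


if __name__ == "__main__":
    known, extra = load_metadata({
        "pcap_url": "http://example.org/captures/sample.pcap",
        "ip_in_data": True,
        "tcpprep_cache": "sample.cache",     # an extension a consumer may ignore
    })
    print(known)
    print(extra)
```

A tcpprep cache reference, as suggested earlier in the thread, would simply land in the extensions dictionary for any consumer that has no use for it.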
|
From: Aaron T. <syn...@gm...> - 2007-01-30 00:06:29
|
On 1/29/07, Eric Hacker <li...@er...> wrote:
> On 1/29/07, Aaron Turner <syn...@gm...> wrote:
> > Comments inline...
> >
> > On 1/29/07, Eric Hacker <li...@er...> wrote:
> > > Dear Packet Fanatics,
> > [snip]
> > > IP in data, boolean, must indicate if the IP address of either the
> > > client or server is represented in the data. If so, it would be nice
> > > to have a way to define where (for this particular flow) the IPs are
> > > in the data so that they can be changed. Something like packet 4, byte
> > > offset 54, client IP.
> >
> > I assume the reason you want the above is so that you can properly
> > rewrite IP addresses to match your test bed network. I would argue
> > that this really should be automated (who wants to look through pcaps
> > in Wireshark and find IP addresses??) and hence should be a separate
> > tool. Perhaps the OpenPacket team develops such a tool, but that
> > seems outside of the scope of this project at this time. On a side
> > note, I'd suggest looking at tshark's PDML output. At least that
> > way you won't be spending all your time writing protocol decoders.
>
> Rewriting supports more than just matching the testbed. It allows the
> simulation of multiple flows. Sure, the IDS catches one attack, but
> does it handle 500 all at different targets?

Fair enough, but it reduces to the same problem once you automate it.

> I agree that rewriting IP addresses needs to be automated, and perhaps
> a tool can do it automatically most of the time so this isn't
> necessary. It just seems to me that often the protocols are new or
> complex enough that the tools can't support this. Whereas if the meta
> data can explicitly provide the location, the tool can do it without
> knowing the protocol.
>
> PDML sucks as far as XML goes and using other tools to parse it. It is
> a Packet DISPLAY ML. A good Packet ML would look much like RFC 3252,
> fixed up a bit.

No argument here. I'm just not aware of any actual implementation of anything better.
The hard part isn't changing the IP address, it's doing all the packet decoding to find the IP address/checksums/etc to edit. I really have no interest in writing decoders for every lame POS protocol which decides to encode L3 data in L4+.

Anyways, that said, if people can agree on a reasonable format to encode packet editing commands, something like (or not, I'm just making this up as I type):

    Packet 57:              # packet to modify
    Offset: 89              # byte offset starting from start of packet
    Direction: C2S          # direction of packet
    Type: IPv4              # type of field
    Encoding: big_endian    # encoding of new value
    Value: 192.168.2.34     # optional new value

I'll see about making sure libtcpedit/tcprewrite supports rewriting the data appropriately. The goal being that you'd take the .pcap, this edit command and a new value and you'd get a new pcap containing the appropriate changes (including L3/4 checksum recalculation). That would hopefully allow anyone to easily modify the packets based on their need.

> > > Flow-pointer, normalized, a URL for the actual packet dump file.
> >
> > I'm not sure what a "flow-pointer" is. More info?
>
> I was assuming that the meta-data and the actual pcap are not in a
> single file, so the meta-data ought to point to the pcap. Flow pointer
> is probably a poor choice of words.

Ah, that makes sense. Yeah, that seems the way to go, at least until the pcap-ng format takes off, but that's at least a few years off.

> > > Not all elements need to be a part of the system as stored in the
> > > OpenPacket repository, however the method of creating the meta-data
> > > should support having all these elements.
> >
> > Not sure I 100% follow, but I will say that data is only useful if
> > people have access to it. And people are less likely to contribute
> > data if they don't see the immediate value.
>
> What I meant here is that the format should be extensible and systems
> shouldn't choke on meta-data that they don't understand.
> There is a core set of required meta-data, a set of optional meta-data, but if it
> is there, this is what it has to look like, and then there's the
> meta-data that someone needed and decided to give back because someone
> else might find it useful too. There are limits of course.

Reasonable.

> > I think a lot of the vuln target information should come from the CVE.
> > No need to duplicate info IMHO.
>
> I agree, when it is CVE related. What if it's just some nice Microsoft
> Exchange traffic that happens to trigger some dumb ISS signature all
> the time? I think this is the kind of info that is optional, but needs
> to have rules established on how to put it in.

In those cases, my feeling is you'll have better luck getting people to use 'tags' or labels rather than a formalized taxonomy. Basically my experience is that contributors of this sort can't be bothered to figure things out and just want something simple (think upload a tarball of pcaps). Even getting per-pcap metadata may be an uphill climb. IMHO you're better off making it super simple for them and writing tools to make it easy for you to clean up.

I'd hope for the best, plan for the worst.

--
Aaron Turner
http://synfin.net/
http://tcpreplay.synfin.net/ - Pcap editing & replay tools for Unix
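
The packet-editing command format floated earlier in this message (Packet, Offset, Direction, Type, Encoding, Value) could be parsed and applied roughly as sketched below. This is an illustration only, not how libtcpedit or tcprewrite work: `EditCommand`, `parse_edit` and `apply_edit` are hypothetical names, packets are modeled as plain byte strings rather than read from a pcap file, packet numbers are assumed to be 1-based, only the IPv4 type is handled, and the L3/L4 checksum recalculation mentioned in the message is deliberately left out.

```python
import ipaddress
from dataclasses import dataclass

SAMPLE_COMMAND = """
Packet 57:              # packet to modify
Offset: 89              # byte offset starting from start of packet
Direction: C2S          # direction of packet
Type: IPv4              # type of field
Encoding: big_endian    # encoding of new value
Value: 192.168.2.34     # optional new value
"""


@dataclass
class EditCommand:
    packet: int       # 1-based packet number (assumption)
    offset: int       # byte offset from the start of the packet
    direction: str    # e.g. "C2S"; carried along but not used here
    field_type: str   # only "IPv4" is handled in this sketch
    encoding: str     # "big_endian" or "little_endian"
    value: str        # may be empty when the caller supplies the value


def parse_edit(text):
    """Parse one edit command written in the key/value layout shown above."""
    fields = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()      # drop trailing comments
        if not line:
            continue
        if line.lower().startswith("packet") and line.endswith(":"):
            fields["packet"] = line[len("packet"):-1].strip()
        else:
            key, _, val = line.partition(":")
            fields[key.strip().lower()] = val.strip()
    return EditCommand(int(fields["packet"]), int(fields["offset"]),
                       fields.get("direction", ""), fields["type"],
                       fields.get("encoding", "big_endian"),
                       fields.get("value", ""))


def apply_edit(packets, cmd, new_value=None):
    """Return a copy of `packets` with the addressed field overwritten."""
    if cmd.field_type != "IPv4":
        raise ValueError(f"unsupported field type: {cmd.field_type}")
    raw = ipaddress.IPv4Address(new_value or cmd.value).packed  # network order
    if cmd.encoding == "little_endian":
        raw = raw[::-1]
    pkt = bytearray(packets[cmd.packet - 1])
    if cmd.offset + len(raw) > len(pkt):
        raise ValueError("edit runs past the end of the packet")
    pkt[cmd.offset:cmd.offset + len(raw)] = raw
    edited = list(packets)
    edited[cmd.packet - 1] = bytes(pkt)
    return edited


if __name__ == "__main__":
    cmd = parse_edit(SAMPLE_COMMAND)
    packets = [bytes(100)] * 57               # 57 dummy packets of zero bytes
    edited = apply_edit(packets, cmd)
    print(edited[56][89:93].hex())
```

Calling `apply_edit` repeatedly with different `new_value` arguments is one way the "500 attacks at different targets" scenario from earlier in the thread might be generated from a single capture, provided checksums are fixed up afterwards by a real tool.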