Mitch Williams <mitch.a.williams@...> wrote:
>If we created an interface such that echo "add eth0" > slaves added the
>slave, but cat slaves showed us all the slaves, it would obviously
>work, but the semantics of reading and writing now become completely
>different, and potentially confusing. Maybe this really isn't an issue,
>but it would have to be carefully documented.
I don't have a problem with this; it certainly seems less
confusing than the other way. Reading the sysfs.txt in the kernel, it
says we're ok to "express an array of values of the same type" (although
it prefers having them on one line vs. one per line, I can see that
that'd be easier for scripts to deal with).
It also avoids the problem with the "boogabooga" test, namely,
# echo boogabooga > /sys/bonding/control/bonds
has two effects: it wipes out all bonds on the system, and
creates a new bonding device "boogaboog". The problem, in other words,
is that there is no illegal input to the "bonds" control file. Anything
you write to it will most likely destroy the entire bonding
configuration, and may or may not create new bonds.
On that topic (I didn't notice this until this morning), in
theory bonding_update_bonds() checks the interface name, but I don't
seem to see the error if I do the above (name too long) or specify a
name like "/////////////////////////".
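For illustration only, here is roughly the kind of check I'd expect to reject those names. This is a user-space sketch, not what bonding_update_bonds() actually does; is_valid_ifname() is a made-up helper, and the rules (non-empty, not "." or "..", shorter than IFNAMSIZ, no slashes or whitespace) are my assumptions about what a sane check would cover.

```python
IFNAMSIZ = 16  # kernel buffer size for interface names, including the NUL


def is_valid_ifname(name: str) -> bool:
    """Hypothetical name check; the rules below are assumptions."""
    # Empty names and the directory entries "." and ".." are refused.
    if not name or name in (".", ".."):
        return False
    # The name plus its terminating NUL must fit in IFNAMSIZ bytes.
    if len(name.encode()) >= IFNAMSIZ:
        return False
    # Slashes would escape the sysfs directory; whitespace confuses parsers.
    if any(c == "/" or c.isspace() for c in name):
        return False
    return True
```

With a check like this in front of the control file, the all-slashes name gets bounced before anything touches the existing bonds.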
>I also considered creating separate add_ and delete_ interfaces, but
>rejected that idea as I felt it just added way too much complexity. We'd
>need this for slaves and ARP targets, and we'd end up with a ton of files
>in each directory.
Yah, the separate add/delete business is, to use a technical
term, icky. To me, "echo add eth0 > /sys/bonding/bond0/slaves" (and
similarly for the list of bond devices) seems to be the most elegant.
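To make the command style concrete, a rough user-space model of how such a file might treat its input. Everything here is hypothetical: apply_command() is an invented name, and "del" as the removal verb is my assumption (only "add" has been proposed so far). The point is that malformed input is refused and the existing list is left alone.

```python
def apply_command(members, line):
    """Apply one 'add NAME' or 'del NAME' command to a list of names.

    Returns a new list. Raises ValueError on anything malformed, so bad
    input never changes the existing configuration.
    """
    parts = line.split()
    if len(parts) != 2:
        raise ValueError("expected '<add|del> <name>'")
    verb, name = parts
    if verb == "add":
        if name in members:
            raise ValueError(f"{name} already present")
        return members + [name]
    if verb == "del":  # assumed verb, not part of the proposal
        if name not in members:
            raise ValueError(f"{name} not present")
        return [m for m in members if m != name]
    raise ValueError(f"unknown command {verb!r}")
```

Feeding it a bare "boogabooga" raises an error instead of replacing the list, which is the behavior the whole-list write can't give us.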
>I guess it comes down to a case of potential user confusion vs safety and
>ease of implementation. If consensus is to go with a command-based (or
>similar) interfaces for adding and removing slaves and bonds, we'll be
>happy to implement it.
One of the places I'm thinking about is the system init scripts
(the sysconfig and initscripts packages, I think they're called). Those
both operate somewhat differently, but generally they add interfaces in
order, so they'll add one bond at a time, and it seems excessively
complicated to require "read the list, add to the list, echo the list
The alternative is to hide all of the icky-poo stuff inside of a
New And Super-Spiffy ifenslave, adding gizmos like "ifenslave create
bond0" that will do all of that list read/edit/write magic under the
covers. Even with that, there's still a race if two attempts are made
to create a bond simultaneously (i.e., two things do the
"read/edit/write" cycle at once; there's no mutual exclusion for that
set of three steps). I dunno why that would happen; maybe the
initscripts some day do parallel network device initializations, but it's
an exposure that the "echo add bond0" style interface doesn't have.
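A toy model of that exposure, in case it helps. BondList is an invented stand-in for the control file, not real bonding code, and the lock inside add() is my assumption about how a command-style write would be serialized. The first function replays the bad interleaving by hand; the second issues two concurrent adds.

```python
import threading


class BondList:
    """Toy stand-in for a control file holding the list of bonds."""

    def __init__(self):
        self._names = []
        self._lock = threading.Lock()

    def read(self):
        return list(self._names)

    def write(self, names):
        # Whole-list write: replaces whatever was there before.
        self._names = list(names)

    def add(self, name):
        # Command-style add: check and update happen under one lock.
        with self._lock:
            if name not in self._names:
                self._names.append(name)


def racy_read_edit_write():
    bonds = BondList()
    seen_by_a = bonds.read()             # A reads the empty list
    seen_by_b = bonds.read()             # B reads before A writes back
    bonds.write(seen_by_a + ["bond0"])   # A writes its edited copy
    bonds.write(seen_by_b + ["bond1"])   # B overwrites A's update
    return bonds.read()


def concurrent_adds():
    bonds = BondList()
    threads = [threading.Thread(target=bonds.add, args=(name,))
               for name in ("bond0", "bond1")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(bonds.read())
```

In the first case bond0 silently disappears; in the second both bonds survive no matter which thread gets there first.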
>If anybody knows how to implement this, please let me know. Same goes for
>the idea of moving the bonding stuff elsewhere in the tree. I ended up
>with creating a bonding subsystem because I couldn't figure out how to
>hook into the class hierarchy without exporting private data structures
>from other parts of the kernel. There are also issues with the level of
>nesting that would be required. Can we add a kset to another kset? I
>don't think that will work.
Ok, here's another thought.
As "command" files for control (i.e., "add foo0" etc) and
version (which is read only for the version). No subdirectories.
For the per-bond instance control fru fru, do what bridge does:
put it in a subdirectory of class/net/[interface]/, e.g., bridge has
We could have
Also, I note that the /sys/class/net/[interface]/ "directories"
contain both "files" and "directories" (whatever they're called in
sysfs-ese), so it must be possible to do that somehow. Still, this may
be more consistent with existing implementations (modulo the extra
"create/destroy" interface that nobody else seems to have in sysfs).
>We've discussed this, but I didn't actually get to it yet. We'd like to
>split "stat" into several different files, one of which would be
>"active_slave" or some such. This would kill both of these birds with one
>stone (or, at least, several pebbles). By doing this, we would also be a
>step closer to the fine-grained user mode control the HP guys were looking
Yah, having a "stat" subdirectory with a zillion separate stat
controls is more in keeping with the Sysfs Way, I suppose. Combined
with the above /sys/class/net/ fru fru, it might blur the boundary
between "statistics" and "controls" with some things as read only, but
many things not, e.g., a generic "link_state" that's read only, but a
failure count that can be set back to zero or any other value that suits
their fancy. Relevant to the above, the "stat" field that lists the
active slave is simply read/write, and writing to it changes the active slave.
If we've got all of the various bits from the stat file in other
individual locations, then we don't need a separate stat at all.
As far as the HP gang goes (and presumably others), in the long
run I think an asynchronous event notifier is needed, most likely
implemented via netlink. It may be that the whole "control" file in
sysfs (the one to create and destroy bonds) ends up being a netlink
interface; it depends upon what the core maintainers think (e.g., if
they say no way to /sys/bonding/stuff).
>> 9- In the long run, there are wrapper gizmos to set a device's
>> mac address and mtu (dev_set_mtu() is in 2.6.11 and has been around a
>> while, dev_set_mac_address() is still pending in bk). These should be
>> used instead of directly fiddling with the relevant bits (as in
>We'll also take care of this, both in bond_main.c and bond_sysfs.c. I
>just didn't know these wrappers existed; it'll be nice to simplify the
>code a bit.
Don't sweat this one too much if you're not up to rebasing to
the latest 2.6.11-netdev patch (which covers all of the instances other
than what's in your new code); it can be done later against a more final
version or something. It's an easy change.
>> 10- I think it might be time to finally ditch the copying of the
>> master's IP address to the slaves. Right now, this happens in
>> ifenslave, and may have had some technical value in the past, but I
>> don't think it does now. Anybody in the usual gang have insights here?
>Now you tell me! That was the biggest pain in the rear to implement.
>However, I'll happily yank it out if it's not needed. We'll have to
>double-check all of the teaming modes to make sure it doesn't affect
Look in the bonding-devel mailing list archives, we went through
the "do we need any of this?" exercise a while back (I created a patch
to propagate all of that crap in real time, and in the end it turns out
to be useless for the majority of it). Hunt for the keyword
"propogation" (or maybe "propogate", I forget) to find the discussion.
If memory serves, the only things the slaves gotta have are MTU, MAC,
VLAN, promiscuous and multicast settings.
-Jay Vosburgh, IBM Linux Technology Center, fubar@...