On Mon, Nov 06, 2006 at 11:41:56AM -0800, Erik Nordmark wrote:
> Edward Pilatowicz wrote:
> [You brought up an issue with /etc/hostname.* etc being ignored when a
> shared-IP zone is booted.]
> >perhaps some kind of warning message should be generated in this
> >scenario instead?
> >something like:
> > Ignoring zone network configuration specified: /etc/hostname.bge0
> > Current network configuration is dictated by the global zone.
> > To use the network configuration specified within this zone it
> > must have an exclusive-IP instance allocated to it by the global
> > zone.
> This is an issue in S10 at two levels:
> 1. The /etc/hostname.<ifname> and related files are ignored when a zone
> is booted.
> 2. Any IP configuration information in $ZROOT/etc/sysidcfg is ignored by
> It might be that IP Instances makes this more of an issue, although I'm
> not certain it will, but in any case the issue is with the original
> shared-IP stack. (Of course, this is far from the only issue with the
> shared-IP stack.)
well, i see it as a new issue because previously we had one consistent
behavior. (always ignore ip configuration files.) now we're going to
have two different behaviors based on an option that someone can set in
zonecfg. where this gets confusing is that this is an option that
they can change at any time in zonecfg and they may not realize the
full extent/impact of what they've done. hence i thought it might be
good to make it obvious that we think something is not quite right.
if i've created an exclusive-IP zone and then i change it to a
shared-IP zone but don't specify IP configuration info (and i expect it
to get this configuration in the same way it used to) i'm going to be
confused as to why things aren't working.
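
a minimal sketch of the check i have in mind, in python purely for
illustration. the function name, the "shared" string and the etc_dir
parameter are assumptions made for this example; only the message text
comes from the proposal quoted above.

```python
import glob
import os


def ignored_config_warnings(ip_type, etc_dir="/etc"):
    """Return warning lines for hostname.* files that a shared-IP zone ignores.

    ip_type and etc_dir are hypothetical inputs; a real check would get the
    stack type from the system rather than from a string argument.
    """
    if ip_type != "shared":
        # exclusive-IP zones do use their own configuration files
        return []
    lines = []
    for path in sorted(glob.glob(os.path.join(etc_dir, "hostname.*"))):
        lines.append("Ignoring zone network configuration specified: " + path)
    if lines:
        lines.append(
            "Current network configuration is dictated by the global zone.")
    return lines
```

an empty result means there is nothing to warn about, so a zone with
no leftover files stays quiet.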
> I don't know what the best way is to proceed on this. One way is to just
> file CRs on the shared-IP stack (smf scripts and sysidtool).
> But medium term it might make sense to instead spend our time on
> aggressively moving away from the shared-IP stack (and maybe even
> ripping out that code in our lifetimes) by making sure that the
> exclusive-IP stack can satisfy the same needs as the shared-IP stack. IP
> Instances is a key building block for this, but we also need some
> additional pieces (some of which are under way):
> - vnic support to get the same type of sharing of the "wire" as with
> the shared-IP stack
> - security (perhaps in the form of GLD-level filtering) to contain
> exclusive-IP zones towards the network (prevent ARP spoofing, redirect
> spoofing, RIP spoofing, etc) since ARP spoofing is prevented for
> shared-IP zones
> - think about scalable/centralizable configuration of IP parameters
> for zones. I think the answer is related to DHCP; perhaps having easy
> configurations where the global zone can be a DHCP server (and NAT?) for
> the non-global zones on the same machine. We have the building blocks
> for this, but we need to better understand the identity of a zone as it
> relates to DHCP before we proceed.
> So long answer to a short question. Adding warning messages doesn't seem
> to help us get to where we need to go.
i think that with vnics this would be a really cool possibility and
would help simplify the new choose-your-ip-stack-type zones model and
also make global and non-global zone behavior more consistent.
are there any proposals for how exclusive IP instances could be used
to replace shared-IP stacks? it would be helpful to see this to
understand what operations would be permitted/restricted/required
within zones if we were to do this.
> >nit: in zonecfg it's "ip-type" whereas above there's no dash.
> I realized that. We'll make them consistent.
> >i looked into it a bit more and as it turns out in linux network devices
> >don't actually have any /dev entries. but we can't simply tell non-native
> >brands not to map exclusive-IP network devices because this could break
> >other zones. (think sn1, belenix, nexenta, etc.)
> >so here's a question. in an exclusive-IP zone, do we have to have
> >network /dev entries to be able to configure network interfaces?
> >or can all the necessary configuration be done via socket operations?
> sockets are implemented using devices (see /etc/sock2path), and ifconfig
> plumb is implemented using DLPI devices.
so for ifconfig, we need to have access to the devices, right?
> >i guess if we need to have access to the device nodes then the easiest
> >thing to do will be to simply map them in and hopefully linux apps will
> >ignore them. if linux apps don't ignore them then we'll have to
> >create /dev and /native/dev as separate namespaces. (currently
> >they are the same.)
> If there is a /dev/bge1 in a lx zone, presumably it doesn't upset
> anything. Or are there applications which try to do something to
> everything they find in /dev?
well, glibc:ttyname() actually traverses all of /dev looking for
devices that match the passed in fd. since bge1 wouldn't be mapped
to a specific linux device we'd have to come up with a mechanism to
avoid dev_t mapping conflicts. (currently we explicitly translate dev_t's
for all devices since all the devices we import have linux equivalents.
once we start mapping in devices with no equivalents we have to ensure
there is no overlap between the new native non-translated dev_ts and
translated linux dev_ts.)
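
to make the overlap concern concrete, here's a rough sketch in python.
it is not how the lx brand stores its translations; the function name
and the (major, minor) tuple representation of a dev_t are assumptions
made only for this illustration.

```python
def find_dev_t_conflicts(translated, native_untranslated):
    """Report native dev_ts that would collide with translated linux dev_ts.

    translated: dict mapping a native dev_t to the linux dev_t it is
        presented as inside the zone (hypothetical representation).
    native_untranslated: iterable of native dev_ts passed through unchanged.
    Returns a list of (dev_t, [native dev_ts already translated to it]).
    """
    # index the translation table by the linux-visible dev_t
    linux_side = {}
    for native, linux in translated.items():
        linux_side.setdefault(linux, []).append(native)

    conflicts = []
    for dev in native_untranslated:
        if dev in linux_side:
            # an untranslated device would look identical to a translated one
            conflicts.append((dev, linux_side[dev]))
    return conflicts
```

any non-empty result is a case where something like ttyname() could
match the wrong /dev entry.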
> >if we don't need/want the nodes then we'll have to add some kind of
> >brand callback such that the lx brand can indicate this when you
> >attempt to add in the interfaces.
> >lastly, we'll probably have to add some kind of mechanism that will
> >allow a brand to iterate over all the network devices which have been
> >exclusively allocated to it. (so when the linux brand does
> >"ifconfig eth0 plumb" we can look at all the available interfaces and
> >plumb one up.)
> We have the system calls to do this. They are used by install which does
> an 'ifconfig -a plumb'. ifconfig uses the zone_get_iflist() call.
> I think that would be sufficient to build something for lx.
cool. as long as zone_get_iflist() is a system call and doesn't need
to parse an xml file to get this info then we should be ok. (since
we don't have full access to native config files and libraries in
the non-native zones.)
> But I don't understand how lx handles multiple Ethernet interfaces; how
> it decides which one is eth0 and which one is eth1.
currently whenever a process is initialized it creates a socket and
does SIOCGIFNUM/SIOCGIFCONF/SIOCGIFFLAGS to figure out what interfaces
are present on the system, sorts them, and then maps them to linux
ethernet interface names. (see ifname_scan()) this works because in
general with shared-IP instances the interfaces assigned to a zone
don't change while the zone is running. with exclusive ip instances
this will no longer be true. we'll be plumbing interfaces and possibly
virtual interfaces, so all this information will be changing dynamically
and we would have to come up with a new mapping mechanism. the
existing mapping mechanism also doesn't support linux virtual interfaces.
(ie, you'll never see eth0:1)
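
as an illustration of the sort-then-map step (not the actual
ifname_scan() code), here's a small python sketch. the sort order
(driver name, then instance number), the skipping of loopback and the
function name are all assumptions; logical interfaces (names with a
":") are dropped to mirror the missing eth0:1 support.

```python
import re


def map_to_linux_names(native_names):
    """Map native interface names to eth0, eth1, ... in sorted order."""

    def sort_key(name):
        # split "bge10" into ("bge", 10) so instances sort numerically
        m = re.match(r"^(.*?)(\d+)$", name)
        if m:
            return (m.group(1), int(m.group(2)))
        return (name, -1)

    physical = sorted(
        (n for n in native_names
         if ":" not in n and not n.startswith("lo")),
        key=sort_key)
    return {name: "eth%d" % i for i, name in enumerate(physical)}
```

with only bge1 present it becomes eth0, but once bge0 is plumbed a
rescan turns bge1 into eth1. that renumbering is exactly why a
per-process snapshot stops working once interfaces can change while
the zone is running.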
> >>We can certainly disable IP instances with non-native brands in zonecfg.
> >>But I assume we'd want to support sn-1, thus "non-native" might not be
> >>the right test.
> >then this seems like something that should be controlled via
> >the brand configuration verification callback. currently the lx
> >brand already looks for unsupported zonecfg options and reports
> >errors if there are any. it seems like it should be enhanced to
> >detect an exclusive-IP config and not allow it. (see the verify
> >subcommand to lx_support.c.)
> I'll take a look at the implementation.
> >hm. i spend most of my time thinking about the NAT case so i hadn't
> >considered dhcp. i guess that will require running the dhcp daemon
> >inside the linux zone which will require supporting whatever
> >daemon/kernel interfaces they use. (probably socket operations.)
> But isn't it just a wad of ioctls?
probably. we have the infrastructure for it, it's just work. ;)
> >currently we just disable them by moving them aside.
> >i guess we could patch them up by adding a line to the top like:
> > /native/usr/lib/brand/lx/lx_configure_ip || exit 0
> >where lx_configure_ip is a script that runs a native binary and
> >tells us if we're configured with an exclusive-IP stack or not.
> That would be more flexible I think.
zones-discuss mailing list