seblog, by Sebastien (http://www.blogger.com/profile/00380169559827106266). Feed last updated 2024-02-21.<br />
<h3>IP Tunneling and Amazon VPC</h3>
Posted by Sebastien on 2009-12-02<br />
<br />
Glenn Brunette has successfully accessed the <a href="http://aws.amazon.com/vpc/">Amazon Virtual Private Cloud</a> using OpenSolaris as a customer gateway. He describes the general concept in his <a href="http://www.blogger.com/gbrunett/entry/new_opensolaris_vpc_gateway">blog</a>, along with pointers to a tool he's developed to automate the configuration. This OpenSolaris customer gateway uses core technologies such as IP tunneling, IPsec, and BGP to provide redundant secure links to the cloud (see the <a href="http://kenai.com/projects/osolvpc/pages/Home#Functional_Diagram">functional diagram</a> that Glenn provides for a depiction of this).<br />
<br />
The IP tunnel configuration using OpenSolaris allowed Glenn and his team (including <a href="http://www.blogger.com/danmcd/">Dan McDonald</a> and others) to troubleshoot the operation of BGP over these tunnels using the observability provided by the <a href="http://hub.opensolaris.org/bin/view/Project+clearview/iptun">Clearview</a> project (integrated in OpenSolaris build 125), which would not have been possible before.<br />
<br />
Thanks for doing this fantastic work, Glenn!<br />
<h3>Fluendo DVD player initial reaction</h3>
Posted by Sebastien on 2009-11-12<br />
<br />
I bought the Fluendo DVD player for OpenSolaris last night, and there seem to be some very rough edges. For one, the /usr/bin/fluendo-dvd script doesn't work out of the box and spews shell syntax errors. It assumes that "/bin/sh" is actually bash, which isn't the case on OpenSolaris. Changing the first line of the script to "#!/bin/bash" fixes the problem and the binary launches.<br />
<br />
I've installed the player on two systems, both running development build 126 of OpenSolaris. One is my Sun Ultra 40 desktop, and the other is my Toshiba Portege r500 laptop. One gripe common to both is that the player has no control buttons at all (e.g. play, stop, pause). To control the player, one needs to go through the "DVD Player" menu, which is very odd.<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9W-m8zuWWL17xl6VpnJyujVU7rgHKtKIGZgA8gEUjLZVjbKF83f9yPKdV-1yrKQXsLYQXTE5rHo5F820LW9Zd8HNlRoMlFdrOANxpHw4n7KasJGe5vSmV7rS6FmjyNmIIc4xpVqsOhbKe/s1600/fluendo-dvd.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="270" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9W-m8zuWWL17xl6VpnJyujVU7rgHKtKIGZgA8gEUjLZVjbKF83f9yPKdV-1yrKQXsLYQXTE5rHo5F820LW9Zd8HNlRoMlFdrOANxpHw4n7KasJGe5vSmV7rS6FmjyNmIIc4xpVqsOhbKe/s400/fluendo-dvd.png" width="400" /></a></div>There is also no evidence of the capability to navigate forward or backwards through a movie at higher or lower rates.<br />
<br />
OpenSolaris itself is not contributing to a positive user experience: watching a movie for more than ten minutes results in the audio stream being corrupted by what sounds like static clicks and hisses. Stopping and restarting the player causes the audio issue to go away, but it comes back after a short time. This exact same issue occurs with other gstreamer applications, so this is not a fluendo-dvd player problem. There is likely a bug in the audio framework.<br />
<br />
Aside from these common issues, the player is unable to play movies on my Toshiba Portege laptop. The first time I attempted to play a movie (after having applied the above fix to the launcher script), the application crashed with a segmentation fault. I have the core dump if anyone from Fluendo wishes to debug the issue (the segfault occurs in <span style="font-family: courier new,courier,monospace;">fluendo_css_descrambler_descramble()</span>). From that point on, any attempt to play a movie results in the following popup.<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgy-9gebkvrLJ5meeIFRxTmcSLrx_i1cYGA3EQg6Gp7Iz9glRcZXH0jsOtc_6T-gnaXttsIZbRmvbIDKy18fbWfkSQZL1o5UPKhSXzekU6HAaafX-XlSsdwE_kCKkAVf5oUlxdPIgmM9hEv/s1600/fluendo-auth-error.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgy-9gebkvrLJ5meeIFRxTmcSLrx_i1cYGA3EQg6Gp7Iz9glRcZXH0jsOtc_6T-gnaXttsIZbRmvbIDKy18fbWfkSQZL1o5UPKhSXzekU6HAaafX-XlSsdwE_kCKkAVf5oUlxdPIgmM9hEv/s1600/fluendo-auth-error.png" /></a></div>I'm not sure what to make of that. Perhaps there are some file permission issues on this system, but the error is cryptic enough that there's no hope of diagnosing what the problem is.<br />
<br />
On the positive side, on the one system where I'm able to get it working, the video quality is excellent.<br />
<br />
The laptop is the main platform from which I'd like to use this, so the current situation is disappointing. I'm hoping that these are simple bugs that can be quickly fixed, and that the mail I sent to the support channel at Fluendo last night will be answered (I'd expect so, since the 20 Euros one pays for this includes 1 year of support).<br />
<h3>OpenSolaris DVD player from Fluendo</h3>
Posted by Sebastien on 2009-11-12<br />
<br />
<a href="http://www.fluendo.com/">Fluendo</a> released their DVD player for OpenSolaris today!<br />
<br />
<a href="http://www.fluendo.com/shop/product/fluendo-dvd-player/">http://www.fluendo.com/shop/product/fluendo-dvd-player/</a><br />
<a href="http://www.fluendo.com/shop/product/fluendo-dvd-player/"><img src="http://www.fluendo.com/media/images/productimage-picture-fluendo-dvd-player-12_t_w69_h86.png" /></a><br />
<h3>IPv6 in Shared-Stack Zones</h3>
Posted by Sebastien on 2009-10-08<br />
<br />
I was recently at an OpenSolaris user-group meeting where a question was asked regarding how IPv6 could be used from a shared-stack zone. For the benefit of anyone who has a similar question, here is an example of a working configuration: <br />
<br />
<br />
<pre>bash-3.2# zoneadm list -iv
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              native   shared
   - test             installed  /export/home/test              native   excl
   - test2            installed  /export/home/test2             native   shared
</pre><br />
The exclusive-stack zone "test" has all of its own networking configured within it, so IPv6 inherently just works there. The question, however, was about shared-stack zones, and so I set up the "test2" zone to demonstrate this. <br />
<br />
<br />
<pre>bash-3.2# zonecfg -z test2
zonecfg:test2> add net
zonecfg:test2:net> set physical=e1000g0
zonecfg:test2:net> set address=fe80::1234/10
zonecfg:test2:net> end
zonecfg:test2> add net
zonecfg:test2:net> set physical=e1000g0
zonecfg:test2:net> set address=2002:a08:39f0:1::1234/64
zonecfg:test2:net> end
zonecfg:test2> verify
zonecfg:test2> commit
zonecfg:test2> exit
bash-3.2# zonecfg -z test2 info
zonename: test2
zonepath: /export/home/test2
brand: native
...
net:
        address: 10.8.57.111/24
        physical: e1000g0
        defrouter not specified
net:
        address: fe80::1234/10
        physical: e1000g0
        defrouter not specified
net:
        address: 2002:a08:39f0:1::1234/64
        physical: e1000g0
        defrouter not specified
</pre><br />
Here I configured a link-local address <span style="font-family: courier new,courier,monospace;">fe80::1234/10</span>, and a global address <span style="font-family: courier new,courier,monospace;">2002:a08:39f0:1::1234/64</span>. Each interface within each zone requires a link-local address for use with neighbor discovery, and the global address is the one that applications and services use for actual IPv6 communication. The global address's prefix is one that is configured on the link to which the interface is connected. In the zone, we end up with:<br />
<br />
<pre>bash-3.2# zlogin test2 ifconfig -a6
lo0:1: flags=2002000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv6,VIRTUAL> mtu 8252 index 1
        inet6 ::1/128
e1000g0:2: flags=2000841<UP,RUNNING,MULTICAST,IPv6> mtu 1500 index 2
        inet6 fe80::1234/10
e1000g0:3: flags=2000841<UP,RUNNING,MULTICAST,IPv6> mtu 1500 index 2
        inet6 2002:a08:39f0:1::1234/64
</pre><br />
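As a quick sanity check on the two address roles described above, here is a minimal Python sketch. It is not part of the zone configuration; it only uses the standard <span style="font-family: courier new,courier,monospace;">ipaddress</span> module, and the helper name <span style="font-family: courier new,courier,monospace;">describe</span> is made up for illustration. It confirms that fe80::1234 is link-local and that the global address falls within the 2002:a08:39f0:1::/64 prefix:<br />

```python
import ipaddress

# Addresses taken from the zonecfg example above.
link_local = ipaddress.ip_interface("fe80::1234/10")
global_addr = ipaddress.ip_interface("2002:a08:39f0:1::1234/64")

# The /64 prefix configured on the link (the network part of the global address).
on_link_prefix = ipaddress.ip_network("2002:a08:39f0:1::/64")


def describe(iface):
    """Classify an IPv6 interface address as link-local or global (illustrative helper)."""
    return "link-local" if iface.ip.is_link_local else "global"


print(describe(link_local))               # link-local
print(describe(global_addr))              # global
print(global_addr.ip in on_link_prefix)   # True
```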
The global zone has IPv6 connectivity using this same prefix as well as a default IPv6 route:<br />
<br />
<pre>bash-3.2# netstat -f inet6 -rn
Routing Table: IPv6
  Destination/Mask            Gateway                     Flags Ref   Use   If
--------------------------- --------------------------- ----- --- ------- -----
2002:a08:39f0:1::/64        2002:a08:39f0:1:214:4fff:fe1e:1e72 U       1       0 e1000g0:1
fe80::/10                   fe80::214:4fff:fe1e:1e72    U       1       0 e1000g0
default                     fe80::1                     UG      1       0 e1000g0
</pre><br />
From the non-global zone, we have IPv6 connectivity:<br />
<br />
<pre>bash-3.2# zlogin test2 ping -sn 2002:8194:aeaa:1:214:4fff:fe70:5530
PING 2002:8194:aeaa:1:214:4fff:fe70:5530 (2002:8194:aeaa:1:214:4fff:fe70:5530): 56 data bytes
64 bytes from 2002:8194:aeaa:1:214:4fff:fe70:5530: icmp_seq=0. time=4.654 ms
64 bytes from 2002:8194:aeaa:1:214:4fff:fe70:5530: icmp_seq=1. time=2.632 ms
64 bytes from 2002:8194:aeaa:1:214:4fff:fe70:5530: icmp_seq=2. time=2.501 ms
64 bytes from 2002:8194:aeaa:1:214:4fff:fe70:5530: icmp_seq=3. time=2.571 ms
^C
----2002:8194:aeaa:1:214:4fff:fe70:5530 PING Statistics----
4 packets transmitted, 4 packets received, 0% packet loss
round-trip (ms) min/avg/max/stddev = 2.501/3.090/4.654/1.044
</pre><br />
The zone can then be configured to use DNS or local hosts to resolve names to IPv6 addresses in order to utilize IPv6 more effectively.<br />
<h3>Clearview IP Tunneling in OpenSolaris</h3>
Posted by Sebastien on 2009-09-25<br />
<br />
I integrated <a href="http://www.opensolaris.org/os/project/clearview/iptun/">Clearview IP Tunneling</a> (the final component of the <a href="http://www.opensolaris.org/os/project/clearview/">Clearview</a> project) into the <a href="http://www.opensolaris.org/os/community/on/">ON consolidation</a> this week. It will be included in OpenSolaris build 125, which will make its way to the <a href="http://pkg.opensolaris.org/dev/">dev repository</a> in due time. Thanks to all who participated, including the Clearview project team (past and present) and members of various OpenSolaris communities who contributed by doing design and code reviews. This brings to a close a project that <a href="http://blogs.sun.com/meem/">Meem</a> and I conceived years ago while doodling network interface requirements on his whiteboard. We've now delivered every component that we initially identified as the solutions to meet our requirements. That's something to be proud of.<br />
<br />
With this integration, IP tunnel links can be created using <span style="font-family: courier new,courier,monospace;">dladm</span>, be given meaningful names using link vanity naming, observed using traditional network observability tools such as <span style="font-family: courier new,courier,monospace;">snoop</span> and <span style="font-family: courier new,courier,monospace;">wireshark</span>, assigned to exclusive stack non-global zones, and created from within non-global zones.<br />
<br />
This integration also enables the use of <span style="font-family: courier new,courier,monospace;">dladm</span> in general from within exclusive stack non-global zones. Aside from the IP tunnel subcommands which are supported from such zones, all of the show-* subcommands now work in such zones, allowing administrators to view datalink configuration pertinent to the zone. This is a first step towards gradually expanding the set of datalink features available in zones.<br />
<br />
Enjoy, and feel free to communicate with us regarding this project at <a href="mailto:clearview-discuss@opensolaris.org">clearview-discuss@opensolaris.org</a>.<br />
<h3>Observe Loopback and Inter-Zone IP Packets With OpenSolaris</h3>
Posted by Sebastien on 2008-11-17<br />
<br />
I'm happy to announce that the IP Observability Devices component of the <a href="http://www.opensolaris.org/os/project/clearview">Clearview project</a> has integrated into OpenSolaris build 103 (also see Phil Kirk's <a href="http://www.opensolaris.org/os/community/on/flag-days/pages/2008110601/">announcement</a> to the ON community). This adds the following new capabilities to OpenSolaris:<br />
<br />
<ul><li>Network observability at the IP layer for traditional DLPI-based tools such as snoop</li>
<li>Observability of loopback IP packets</li>
<li>Observability of inter-zone IP packets</li>
<li>Tools such as snoop can be run from within a non-global zone to observe packets associated with that zone</li>
<li>Snoop filtering based on zone id</li>
</ul><br />
The snoop command has grown a new "-I <interface-name>" option to access this feature. Its semantics are to snoop the IP interface named <interface-name> at the IP layer. When observing a particular IP interface with this facility, packets that have a source or destination IP address assigned to that interface can be observed, as well as packets that are forwarded to or from that IP interface, and broadcast and multicast packets received by that interface. Additional internal filtering is performed to ensure that an observer from a non-global zone can only see packets that belong to that zone, with the exception of the global zone, from which packets to or from any zone that shares its stack can be observed. Any IP interface visible through "ifconfig -a" can be observed using this feature.<br />
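The zone visibility rule described above can be modeled in a few lines of Python. This is only a sketch of the rule as stated in this entry, not the kernel implementation: the names <span style="font-family: courier new,courier,monospace;">ObservedPacket</span> and <span style="font-family: courier new,courier,monospace;">visible_to</span> are hypothetical, and "belongs to that zone" is assumed to mean that the zone is either the packet's source or its destination zone. The per-interface address matching is left out.<br />

```python
from dataclasses import dataclass

GLOBAL_ZONE_ID = 0  # the global zone has zone id 0


@dataclass(frozen=True)
class ObservedPacket:
    """Hypothetical stand-in for the zone ids associated with an observed packet."""
    src_zone: int
    dst_zone: int


def visible_to(observer_zone: int, pkt: ObservedPacket) -> bool:
    """Model of the zone filtering rule described above (an assumption, not kernel code)."""
    if observer_zone == GLOBAL_ZONE_ID:
        # From the global zone, packets to or from any shared-stack zone are visible.
        return True
    # A non-global zone only sees packets that belong to it.
    return observer_zone in (pkt.src_zone, pkt.dst_zone)


# A packet between the global zone and zone 4, seen from different zones.
pkt = ObservedPacket(src_zone=0, dst_zone=4)
print(visible_to(0, pkt))  # True
print(visible_to(4, pkt))  # True
print(visible_to(7, pkt))  # False
```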
<br />
We are also working towards integrating support for these IP Observability Devices into <a href="http://www.wireshark.org/">Wireshark</a> and <a href="http://www.tcpdump.org/">tcpdump</a> in the near future.<br />
<br />
<br />
Here are some examples using snoop:<br />
<br />
<h4>Example 1: Observing the Loopback Interface</h4><br />
<pre>bash-3.2# snoop -I lo0
Using device ipnet/lo0 (promiscuous mode)
localhost -> localhost ICMP Echo request (ID: 37110 Sequence number: 0)
localhost -> localhost ICMP Echo reply (ID: 37110 Sequence number: 0)
</pre><br />
The lo0 interface has the 127.0.0.1 address assigned to it, and so any communication using the address 127.0.0.1 is seen above (in this case, I was simply doing "ping 127.0.0.1"). Snoop's verbose output mode displays a new "ipnet" header that precedes all IP packets observed:<br />
<br />
<pre>bash-3.2# snoop -v -I lo0
Using device ipnet/lo0 (promiscuous mode)
IPNET: ----- IPNET Header -----
IPNET:
IPNET: Packet 1 arrived at 10:40:33.68506
IPNET: Packet size = 108 bytes
IPNET: dli_version = 1
IPNET: dli_type = 4
IPNET: dli_srczone = 0
IPNET: dli_dstzone = 0
IPNET:
...</pre><br />
Note above that the source and destination zone ids are displayed. In this case, I was running "ping 127.0.0.1" in the global zone, and so both the source and destination zone ids are "0".<br />
<br />
<br />
<h4>Example 2: Running Snoop From a Non-Global Zone</h4><br />
<pre>bash-3.2# zoneadm list -v
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              native   shared
   4 test             running    /zones/test                    native   shared
bash-3.2# zlogin test
[Connected to zone 'test' pts/2]
...
bash-3.2# ifconfig -a
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge0:1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
        inet 10.8.57.34 netmask ffffff00 broadcast 10.8.57.255
lo0:1: flags=2002000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv6,VIRTUAL> mtu 8252 index 1
        inet6 ::1/128
bge0:2: flags=202000841<UP,RUNNING,MULTICAST,IPv6,CoS> mtu 1500 index 2
        inet6 2002:a08:39f0:1::f/64
bash-3.2# snoop -I bge0
Using device ipnet/bge0 (promiscuous mode)
whitestar1-2.East.Sun.COM -> mf-ubur-01.East.Sun.COM DNS C 253.57.8.10.in-addr.arpa. Internet PTR ?
mf-ubur-01.East.Sun.COM -> whitestar1-2.East.Sun.COM DNS R 2.0.0.224.in-addr.arpa. Internet PTR ALL-ROUTERS.MCAST.NET.
whitestar1-6.East.Sun.COM -> whitestar1-2.East.Sun.COM TCP D=22 S=62117 Syn Seq=195630514 Len=0 Win=49152 Options=<mss
whitestar1-2.East.Sun.COM -> whitestar1-6.East.Sun.COM TCP D=62117 S=22 Syn Ack=195630515 Seq=195794440 Len=0 Win=49152
whitestar1-6.East.Sun.COM -> whitestar1-2.East.Sun.COM TCP D=22 S=62117 Ack=195794441 Seq=195630515 Len=0 Win=49152
whitestar1-2.East.Sun.COM -> whitestar1-6.East.Sun.COM TCP D=62117 S=22 Push Ack=195630515 Seq=195794441 Len=20 Win=491
</pre><br />
Although not evident from the snoop output above, whitestar1-2 is 10.8.57.34 (the bge0:1 IP address in this non-global zone), and whitestar1-6 is actually an IP address in another zone on the same system. By snooping the bge0 interface, the user sees all packets associated with the bge0 IP addresses in the zone, even those that are locally delivered to other zones. Using snoop's verbose output mode allows us to see which zones these packets are flowing between:<br />
<br />
<pre>bash-3.2# snoop -v -I bge0 whitestar1-6
Using device ipnet/bge0 (promiscuous mode)
IPNET: ----- IPNET Header -----
IPNET:
IPNET: Packet 1 arrived at 10:44:10.86739
IPNET: Packet size = 76 bytes
IPNET: dli_version = 1
IPNET: dli_type = 4
IPNET: dli_srczone = 0
IPNET: dli_dstzone = 4
IPNET:
...</pre><br />
We can see above that the packet was from the global zone to the test zone.<br />
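If you capture this verbose output to a file, the zone ids can be pulled out mechanically. The following Python sketch parses the IPNET header lines exactly as they are printed above; the sample text is copied from this example, and the function name <span style="font-family: courier new,courier,monospace;">parse_ipnet_header</span> is hypothetical rather than part of any snoop tooling.<br />

```python
import re

# Sample text copied from the verbose snoop output shown above.
SAMPLE = """\
IPNET: ----- IPNET Header -----
IPNET:
IPNET: Packet 1 arrived at 10:44:10.86739
IPNET: Packet size = 76 bytes
IPNET: dli_version = 1
IPNET: dli_type = 4
IPNET: dli_srczone = 0
IPNET: dli_dstzone = 4
IPNET:
"""

# Matches lines such as "IPNET: dli_srczone = 0".
FIELD = re.compile(r"^IPNET:\s+(dli_\w+)\s*=\s*(\d+)\s*$")


def parse_ipnet_header(text):
    """Return the dli_* fields of one IPNET header as a dict of ints."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


hdr = parse_ipnet_header(SAMPLE)
print(hdr["dli_srczone"], "->", hdr["dli_dstzone"])  # prints: 0 -> 4
```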
<br />
<h4>Example 3: Filtering by Zone ID</h4><br />
Filtering by zone id can be useful on a system that has multiple zones. In this example, an administrator in the global zone observes packets being sent to or from IP addresses in the "test" zone.<br />
<br />
<pre>bash-3.2# zoneadm list -v
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              native   shared
   4 test             running    /zones/test                    native   shared
bash-3.2# snoop -I bge0 zone 4
Using device ipnet/bge0 (promiscuous mode)
whitestar1-6.East.Sun.COM -> whitestar1-2.East.Sun.COM TCP D=22 S=61658 Syn Seq=374055417 Len=0 Win=49152 Options=<mss
whitestar1-2.East.Sun.COM -> whitestar1-6.East.Sun.COM TCP D=61658 S=22 Syn Ack=374055418 Seq=374124525 Len=0 Win=49152
whitestar1-6.East.Sun.COM -> whitestar1-2.East.Sun.COM TCP D=22 S=61658 Ack=374124526 Seq=374055418 Len=0 Win=49152
</pre><br />
This can be particularly useful with the loopback interface, as the 127.0.0.1 address is shared among all shared-stack zones, and it can be difficult to associate a loopback packet to an application in a zone.<br />
<br />
Note that there is a pending RFE to allow a zone name, in addition to a zone id, as the argument to the snoop "zone" filtering primitive. For now, the zone id is the only allowable argument.<br />
<h3>Clearview Vanity Naming BigAdmin Article</h3>
Posted by Sebastien on 2008-06-05<br />
<br />
I expanded one of my previous blog entries on network datalink vanity naming in OpenSolaris into a more thorough article with more examples. The result is the following BigAdmin article:<br />
<br />
<a href="http://www.sun.com/bigadmin/sundocs/articles/vnamingsol.jsp">http://www.sun.com/bigadmin/sundocs/articles/vnamingsol.jsp</a><br />
<br />
Enjoy.<br />
<h3>Maybe Some Ice Cream With That OpenSolaris</h3>
Posted by Sebastien on 2008-05-30<br />
<br />
Well, the pickles and beer in the refrigerator were not enough to bribe my Ferrari into installing OpenSolaris. Maybe some ice cream will coax it into behaving better. Luckily, the cleaning people empty out the freezer on the last Friday of the month at 2:00pm (which is today!), leaving plenty of room for...<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiS8l17OdHDS7YNoRxHtQUuXPgXW6X9igIlNAp2ZkJ18OoeFXBGbjttynSAOBTJoHn1cM_LNv6KjFVXliEcSY6xX1apoXIhYcW-dh-9g2xOXrljnZEBC-TB5azyE9P3Ds8RmqtgPTUHSVSF/s1600/ferrari-in-the-freezer.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiS8l17OdHDS7YNoRxHtQUuXPgXW6X9igIlNAp2ZkJ18OoeFXBGbjttynSAOBTJoHn1cM_LNv6KjFVXliEcSY6xX1apoXIhYcW-dh-9g2xOXrljnZEBC-TB5azyE9P3Ds8RmqtgPTUHSVSF/s400/ferrari-in-the-freezer.jpg" width="400" /></a></div><br />
<br />
Props to Will Young for claiming to have done something like this first. ;-)<br />
<h3>Not Too Much Mustard on That Ferrari Please</h3>
Posted by Sebastien on 2008-05-30<br />
<br />
My Acer Ferrari 3400 cannot go through an OpenSolaris installation without overheating and powering itself down. Because OpenSolaris has no power management for this laptop, the CPU runs at full clock rate all of the time, a problem that other OSs don't have.<br />
<br />
Luckily, the Ferrari has no problems sharing a cramped space with mustard, pickles, left-over Chinese food, and a beer. Ferrari 3400, meet OpenSolaris 2008.05:<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj2km7Gl9eJJIdM6A5uzOfhImKlM0C6rAegXTxdFIOVIPfDQZnCQCHe-TBz7AMfhzkkMU48IXZj2hV2Tf5Y4TNuhkYjGNH9-R9mpi3EKuTAbOMxmDqC4KNQLQ5ibihPi5rBArlzg-PO1SCZ/s1600/ferrari-in-the-fridge.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj2km7Gl9eJJIdM6A5uzOfhImKlM0C6rAegXTxdFIOVIPfDQZnCQCHe-TBz7AMfhzkkMU48IXZj2hV2Tf5Y4TNuhkYjGNH9-R9mpi3EKuTAbOMxmDqC4KNQLQ5ibihPi5rBArlzg-PO1SCZ/s400/ferrari-in-the-fridge.jpg" width="400" /></a></div>
<h3>Configuring an OpenSolaris 6to4 router</h3>
Posted by Sebastien on 2008-03-28<br />
<br />
A common problem in enterprise networks is that many IT departments have not begun to deploy IPv6 within their supported infrastructure, but developers need IPv6 networking in order to develop and test products which support IPv6. 6to4 (defined in <a href="http://tools.ietf.org/html/rfc3056">RFC 3056</a>) can be a quick way to obtain IPv6 connectivity between IPv6 nodes separated by IPv4 networks such as these. The general idea is that each 6to4 site has a 6to4 router which is responsible for automatically tunneling IPv6 packets from its site to other 6to4 routers in other 6to4 sites (or native IPv6 networks with the use of relay routers<sup><a href="http://www.blogger.com/post-edit.g?blogID=210396651480309156&postID=8598906583052768094#foot1" name="footref1">[1]</a></sup>) over IPv4. 6to4, then, can often be the answer for such developers: configuring a 6to4 router in a lab environment or in a small subnet within an enterprise network is very easy and addresses their basic IPv6 connectivity requirements.<br />
<br />
OpenSolaris<sup><a href="http://www.blogger.com/post-edit.g?blogID=210396651480309156&postID=8598906583052768094#foot2" name="footref2">[2]</a></sup> can be used as a 6to4 router, and I've received so many requests for basic instructions on how to configure a 6to4 router with OpenSolaris, that I've decided to write a short blog entry on the subject. Note that while this blog may come in handy, there is in fact official <a href="http://docs.sun.com/app/docs/doc/819-3000/ipv6-ref-47">Sun documentation on 6to4 routing<sup></sup></a><sup><a href="http://www.blogger.com/post-edit.g?blogID=210396651480309156&postID=8598906583052768094#foot3" name="footref3"> [3]</a></sup> which may be even more useful.<br />
The following instructions create a persistent configuration which will be enabled after a reboot of the system. All of this can also be configured similarly on the running system, but it is simpler to give one set of instructions. Experienced administrators will surely know how to interpret these instructions to apply configuration to the running system, and that's left as an exercise for the reader.<br />
<ol><li>Enable IPv6 on one of the physical interfaces of the 6to4 router:<br />
<pre>touch /etc/hostname6.<intf></pre>Where <intf> is the interface in question (e.g., e1000g0).<br />
<br />
</li>
<li>Configure a 6to4 tunneling interface on the 6to4 router:<br />
<pre>echo "tsrc <v4addr> up" > /etc/hostname6.ip.6to4tun0</pre>Where <v4addr> is the IPv4 address of the 6to4 router.<br />
<br />
</li>
<li>Enable IPv6 forwarding on the 6to4 router:<br />
<pre>routeadm -e ipv6-forwarding</pre><br />
</li>
<li>Reboot the system. When the system comes back up, it will have an IPv6 interface named ip.6to4tun0 which will have an address like 2002:<hex-v4addr>::1 <sup><a href="http://www.blogger.com/post-edit.g?blogID=210396651480309156&postID=8598906583052768094#foot4" name="footref4">[4]</a></sup>. The "2002:<hex-v4addr>::" part is the 48-bit 6to4 site-prefix for your 6to4 site. All IPv6 nodes in the site that use this 6to4 router must share this common prefix, although it needs to be further sub-divided into a distinct /64 prefix for each IPv6 subnet in the site in order to be useful (that's what the remaining 16 bits of the /64 prefix are for). For example, if the site consists of a single IPv6 subnet, then it's easy enough to create a single "2002:<hex-v4addr>:1::/64" prefix by following the remaining steps.<br />
<br />
</li>
<li>Enable IPv6 router advertisements on the 6to4 router so that IPv6 hosts on the subnet automatically configure their IPv6 addresses and use this router as their default router:<br />
<pre>cat << EOF > /etc/inet/ndpd.conf
ifdefault AdvSendAdvertisements 1
prefix 2002:<hex-v4addr>:1::/64 <intf>
EOF
</pre>Where <hex-v4addr> is the same as the <hex-v4addr> displayed in step 4, and <intf> is the physical interface attached to the IPv6 subnet in question. The ":1" following <hex-v4addr> is important, as this is the 16-bit subnet-id for the prefix being advertised. It uniquely identifies this /64 prefix from other prefixes in the site, which all share a common /48. The subnet-id must be non-zero (because the 0 subnet-id was allocated to the 6to4 router's ip.6to4tun0 interface) and unique within the site, so it doesn't necessarily need to be "1".<br />
<br />
If the 6to4 router is attached to more than one subnet, then there would be additional "prefix" entries in the ndpd.conf file above, one for each interface. Each prefix would then have its own unique 16-bit subnet id.<br />
<br />
</li>
<li>Restart the neighbor discovery daemon for the changes to take effect.<br />
<pre>svcadm restart routing/ndp</pre></li>
</ol>At this point, hosts which have IPv6 enabled on the link connected to the 6to4 router's <intf> interface will automatically configure IPv6 addresses based on the advertised prefix, and will have a default route to the 6to4 router. All packets destined off-link to other 6to4 sites will be tunneled to the remote 6to4 routers.
<shameless plug>Of course, when the <a href="http://www.opensolaris.org/os/project/clearview/">Clearview</a> <a href="http://www.opensolaris.org/os/project/clearview/iptun/">IP Tunneling Device Driver</a> component delivers to Nevada, one will be able to use dladm(1M) to create a 6to4 tunnel with a meaningful name, and to observe packets in the 6to4 tunnel using snoop(1M), wireshark, or other such tools.</shameless plug><br />
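The prefix arithmetic from step 4 and the ndpd.conf entries from step 5 can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, 10.1.2.3 stands in for the router's IPv4 address, and e1000g1 is a made-up second interface. Only the standard <span style="font-family: courier new,courier,monospace;">ipaddress</span> module is used.<br />

```python
import ipaddress


def sixtofour_site_prefix(v4addr):
    """Return the 2002:<hex-v4addr>::/48 site prefix for a router's IPv4 address."""
    v4 = ipaddress.IPv4Address(v4addr)
    # 16 bits of 0x2002, then the 32-bit IPv4 address, then 80 zero bits.
    value = (0x2002 << 112) | (int(v4) << 80)
    return ipaddress.IPv6Network((value, 48))


def sixtofour_subnet(v4addr, subnet_id):
    """Return the /64 prefix for a non-zero 16-bit subnet id within the site."""
    if not 0 < subnet_id <= 0xFFFF:
        raise ValueError("subnet id must be a non-zero 16-bit value")
    site = sixtofour_site_prefix(v4addr)
    value = int(site.network_address) | (subnet_id << 64)
    return ipaddress.IPv6Network((value, 64))


def ndpd_conf(v4addr, subnets):
    """Render ndpd.conf text in the form used in step 5.

    subnets maps an interface name to its subnet id (names are illustrative).
    """
    lines = ["ifdefault AdvSendAdvertisements 1"]
    for intf, subnet_id in subnets.items():
        lines.append(f"prefix {sixtofour_subnet(v4addr, subnet_id)} {intf}")
    return "\n".join(lines) + "\n"


print(sixtofour_site_prefix("10.1.2.3"))  # 2002:a01:203::/48
print(ndpd_conf("10.1.2.3", {"e1000g0": 1, "e1000g1": 2}), end="")
# The standard library can also recover the embedded IPv4 address:
print(ipaddress.IPv6Address("2002:0a01:0203:1::1").sixtofour)  # 10.1.2.3
```

Giving each interface its own subnet id keeps every advertised /64 unique within the site's /48, and rejecting a subnet id of 0 mirrors the rule in step 5.<br />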
<br />
<hr size="2" width="100%" /><a href="http://www.blogger.com/post-edit.g?blogID=210396651480309156&postID=8598906583052768094#footref1" name="foot1">[1]</a> I'm skipping discussing relay routers for various reasons which I won't go into here.<br />
<a href="http://www.blogger.com/post-edit.g?blogID=210396651480309156&postID=8598906583052768094#footref2" name="foot2">[2]</a> In fact, any Solaris release starting with Solaris 9.<br />
<a href="http://www.blogger.com/post-edit.g?blogID=210396651480309156&postID=8598906583052768094#footref3" name="foot3">[3]</a> Look for 6to4. Within this documentation, there are also instructions on <a href="http://docs.sun.com/app/docs/doc/819-3000/ipv6-config-tasks-24">how to configure 6to4 on Solaris</a>, similar to this blog entry.<br />
<br />
<a href="http://www.blogger.com/post-edit.g?blogID=210396651480309156&postID=8598906583052768094#footref4" name="foot4">[4]</a> The 2002::/16 prefix is the "magic" 6to4 prefix that allows 6to4 routers to tunnel to one another. The 32 bits that follow these initial 16 bits are an IPv4 address: the IPv4 address of the 6to4 router which is responsible for the automatic IPv6 tunneling of packets for its 6to4 site. For example, when a 6to4 router needs to tunnel an IPv6 packet with a destination of 2002:0a01:0203:1::1, it will know to automatically encapsulate this IPv6 packet in an IPv4 header with a destination of 10.1.2.3 (the IPv4 address of the remote 6to4 router).<br />
<h3>Using New Networking Features in OpenSolaris</h3>
Posted by Sebastien on 2008-01-29<br />
<p>The <a href="http://www.opensolaris.org/os/project/clearview/uv/">Nemo Unification and Vanity Naming</a> component of <a href="http://www.opensolaris.org/os/project/clearview/">project Clearview</a> has integrated into <a href="http://www.opensolaris.org/">OpenSolaris</a> build 83, which (among other things) allows administrators to give meaningful names to network datalink interfaces, including VLAN interfaces. I thought I'd share how I used this feature on one of our lab routers here at Sun.</p><p>The system has four Ethernet NICs, but needs to be the router for 8 separate lab subnets. The aggregate bandwidth of four Gigabit pipes is plenty for all of the lab subnets combined, so it wasn't really worthwhile to go and add four more NICs to the system (plus, that's not really scalable). Instead, I created a single link aggregation (802.3ad) including all four Ethernet links, which carries the untagged subnet, and created individual tagged VLAN interfaces (one for each of the other 7 subnets) on top of this aggregation.<br /></p><p>Step by step, here's what I did. 
Keep in mind that this is done using a nightly build of OpenSolaris from after January 24th 2008. Here was the list of datalinks on the system before I started changing things (bonus points for anyone who can tell me what kind of system I'm doing this on based on the devices listed below) :-) :<br /></p><pre>bash-3.2# dladm show-link<br />LINK CLASS MTU STATE OVER<br />nge0 phys 1500 up --<br />nge1 phys 1500 up --<br />e1000g0 phys 1500 up --<br />e1000g1 phys 1500 up --<br />bash-3.2# dladm show-phys<br />LINK MEDIA STATE SPEED DUPLEX DEVICE<br />nge0 Ethernet up 1000Mb full nge0<br />nge1 Ethernet up 1000Mb full nge1<br />e1000g0 Ethernet up 1000Mb full e1000g0<br />e1000g1 Ethernet up 1000Mb full e1000g1<br /></pre><p>First, I unplumbed all IP interfaces on each of these links by issuing appropriate "ifconfig <intf> unplumb" commands. This was necessary since renaming datalinks requires that no IP interfaces be plumbed above them. I then gave each of these interfaces more generic names. 
The benefit of doing this is that if we replace the Ethernet cards in the future with cards of a different chip set, we won't have to change the interface names associated with those cards (one of the big benefits of Clearview UV vanity naming).</p><pre>bash-3.2# dladm rename-link nge0 eth0<br />bash-3.2# dladm rename-link nge1 eth1<br />bash-3.2# dladm rename-link e1000g0 eth2<br />bash-3.2# dladm rename-link e1000g1 eth3<br />bash-3.2# dladm show-link<br />LINK CLASS MTU STATE OVER<br />eth0 phys 1500 up --<br />eth1 phys 1500 up --<br />eth2 phys 1500 up --<br />eth3 phys 1500 up --<br />bash-3.2# dladm show-phys<br />LINK MEDIA STATE SPEED DUPLEX DEVICE<br />eth0 Ethernet up 1000Mb full nge0<br />eth1 Ethernet up 1000Mb full nge1<br />eth2 Ethernet up 1000Mb full e1000g0<br />eth3 Ethernet up 1000Mb full e1000g1<br /></pre><p>Then I created a link aggregation using these four Ethernet links:</p><pre>bash-3.2# dladm create-aggr -P L2,L3 -l eth0 -l eth1 -l eth2 -l eth3 default0</pre><p>I named the link "default0" because this is the main untagged subnet for the lab network, and the network to which the default route points. Now the set of links looks like:</p><pre>bash-3.2# dladm show-link<br />LINK CLASS MTU STATE OVER<br />eth0 phys 1500 up --<br />eth1 phys 1500 up --<br />eth2 phys 1500 up --<br />eth3 phys 1500 up --<br />default0 aggr 1500 up eth0 eth1 eth2 eth3<br /></pre><p>The next step was to create the VLAN links on top of this aggregation. Our lab subnets have a color-coded naming scheme, which I used when naming the VLAN links. This is convenient when diagnosing network problems with particular systems, as our DNS naming uses a parallel scheme. For example, if a system's hostname is blue-98, I know to do my network snooping on the "blue" link. 
Creating the VLAN links was as simple as:</p><pre>bash-3.2# dladm create-vlan -v 2 -l default0 orange0<br />bash-3.2# dladm create-vlan -v 3 -l default0 green0<br />bash-3.2# dladm create-vlan -v 4 -l default0 blue0<br />bash-3.2# dladm create-vlan -v 5 -l default0 white0<br />bash-3.2# dladm create-vlan -v 6 -l default0 yellow0<br />bash-3.2# dladm create-vlan -v 7 -l default0 red0<br />bash-3.2# dladm create-vlan -v 8 -l default0 cyan0<br /></pre><p>There is now one link for each subnet in the lab (one untagged link, and seven tagged VLAN links).</p><pre>bash-3.2# dladm show-link<br />LINK CLASS MTU STATE OVER<br />eth0 phys 1500 up --<br />eth1 phys 1500 up --<br />eth2 phys 1500 up --<br />eth3 phys 1500 up --<br />default0 aggr 1500 up eth0 eth1 eth2 eth3<br />orange0 vlan 1500 up default0<br />green0 vlan 1500 up default0<br />blue0 vlan 1500 up default0<br />white0 vlan 1500 up default0<br />yellow0 vlan 1500 up default0<br />red0 vlan 1500 up default0<br />cyan0 vlan 1500 up default0<br />bash-3.2# dladm show-vlan<br />LINK VID OVER FLAGS<br />orange0 2 default0 -----<br />green0 3 default0 -----<br />blue0 4 default0 -----<br />white0 5 default0 -----<br />yellow0 6 default0 -----<br />red0 7 default0 -----<br />cyan0 8 default0 -----<br /></pre><p>I then plumbed IP interfaces in each subnet. For example:</p><pre>bash-3.2# ifconfig orange0 plumb ...<br />bash-3.2# ifconfig green0 plumb ...<br />...<br /></pre><p>Configuring this router also involved configuring IPv4 dynamic routing and forwarding, IPv6 dynamic routing and forwarding, etc. All of these latter steps involved placing the network interface names in some sort of persistent configuration (like /etc/hostname.<intf>, /etc/inet/ndpd.conf, and IP filter rules to name a few). This is where giving meaningful names to network interfaces has the most value. 
With all of these interface names in various configuration files, we don't want to ever have to go and reconfigure all of those things if the underlying hardware of the system were to change from under them. Before Clearview UV's vanity naming feature, a VLAN interface above the e1000g1 interface would look something like e1000g8001 (for VLAN tag 8), thanks to the moldy "VLAN PPA-hack". This is ridiculous enough as an interface name, but what happens when I replace my e1000g1 card with a Broadcom card which has a device name of bge0? I need to go fetch every piece of configuration on the system that made reference to e1000g1 and e1000g8001, and change everything to bge0 and bge8000.</p><p>With Clearview UV's vanity naming feature I could have named the link something meaningful like "private1", and assigned the newly added bge0 card that same name (using the dladm rename-link command I showcased above) to keep all of my network configuration intact.</p><br />Sebastienhttp://www.blogger.com/profile/00380169559827106266noreply@blogger.com3tag:blogger.com,1999:blog-210396651480309156.post-9854588924329516902007-09-25T10:07:00.000-07:002011-03-23T08:54:46.415-07:00Early Access to Clearview IP Tunneling<p>Earlier today, <a title="Project Clearview Downloads" href="http://www.opensolaris.org/os/project/clearview/downloads/">early access build 74</a> of <a href="http://www.opensolaris.org/os/project/clearview/" title="Project Clearview">Project Clearview</a> was announced to <a href="http://www.opensolaris.org/jive/thread.jspa?threadID=40403&tstart=0">networking-discuss@opensolaris.org</a> and <a href="http://www.opensolaris.org/jive/thread.jspa?threadID=40402&tstart=0">clearview-discuss@opensolaris.org.</a> This build introduces the new GLDv3-based IP tunneling driver to users. 
With this work, the 6000 or so lines of kernel code that comprised the "tun" STREAMS module is replaced with a GLDv3 driver which is half of that size and has more features.</p><p>With this driver, IP tunnels in Solaris are now fully observable using snoop:</p><pre>seb# snoop -d ip.tun0<br />Using device ip.tun0 (promiscuous mode)<br /> seb -> my-desktop TCP D=60722 S=22 Push Ack=624936085 Seq=693788605 Len=80 Win=49644 (1 encap)<br /> my-desktop -> seb TCP D=22 S=60722 Ack=693788685 Seq=624936085 Len=0 Win=49644 (1 encap)<br /> seb -> dns-server DNS C 3.1.168.192.in-addr.arpa. Internet PTR ? (1 encap) <br /></pre><p>IP tunnels can be given meaningful names (thanks to Clearview vanity naming):</p><pre>seb# dladm create-iptun -T 6to4 -s 10.8.57.44 ipv6gateway0<br />IP tunnel created: ipv6gateway0<br />seb# dladm show-iptun<br />LINK TYPE SOURCE DESTINATION<br />ipv6gateway0 6to4 10.8.57.44 N/A <br />seb# ifconfig ipv6gateway0 inet6 plumb up<br />seb# ifconfig ipv6gateway0 inet6<br />ipv6gateway0: flags=202200041<UP,RUNNING,NONUD,IPv6,CoS> mtu 65515 index 3<br /> inet tunnel src 10.8.57.44<br /> tunnel hop limit 64<br /> inet6 2002:a08:392c::1/1</pre><p><br /></p><pre>seb# dladm create-iptun -T ipv4 -s seb -d vpngateway vpn0<br />IP tunnel created: vpn0<br />seb# ipsecconf -l -i vpn0<br />#INDEX vpn0,1<br />{ tunnel vpn0 negotiate tunnel laddr seb/32 dir out } ipsec { encr_algs aes-cbc(128..256) encr_auth_algs hmac-md5(128) sa shared }<br />#INDEX vpn0,2<br />{ tunnel vpn0 negotiate tunnel laddr seb/32 dir in } ipsec { encr_algs aes-cbc(128..256) encr_auth_algs hmac-md5(128) sa shared }<br />seb# ifconfig vpn0 plumb 10.0.0.1 10.0.0.2 up <br /></pre><p>IP tunnel links are administered using dladm (although pre-existing ifconfig syntax is still supported for backward compatibility):</p><pre>seb# dladm create-iptun -T ipv6 -s me -d you trans0<br />IP tunnel created: trans0<br />seb# dladm show-linkprop trans0<br />LINK PROPERTY VALUE DEFAULT POSSIBLE <br />trans0 
autopush -- -- -- <br />trans0 zone -- -- -- <br />trans0 hoplimit 64 64 -- <br />trans0 encaplimit 4 4 -- <br />seb# dladm set-linkprop -p encaplimit=2 trans0<br />seb# dladm show-linkprop trans0<br />LINK PROPERTY VALUE DEFAULT POSSIBLE <br />trans0 autopush -- -- -- <br />trans0 zone -- -- -- <br />trans0 hoplimit 64 64 -- <br />trans0 encaplimit 2 4 -- <br /></pre><p>We welcome users to bfu these bits and try out the new features. Click <a title="Clearview Downloads" href="http://www.opensolaris.org/os/project/clearview/downloads/">here</a> for download instructions and release notes, and let us know what you think by sending us feedback at <a href="mailto:clearview-discuss@opensolaris.org">clearview-discuss@opensolaris.org</a>.<br /></p><br />Sebastienhttp://www.blogger.com/profile/00380169559827106266noreply@blogger.com0tag:blogger.com,1999:blog-210396651480309156.post-50792772446969870362005-06-14T01:17:00.001-07:002011-03-23T08:54:46.415-07:00How an IP Tunnel Interface Dynamically Adjusts its Link MTUWith the launch of <a href="http://www.opensolaris.org/">OpenSolaris</a> comes the opportunity to discuss the implementation details behind existing Solaris features. I'd like to share some of the details behind one of my contributions to Solaris 10; the implementation of dynamic MTU calculation for IP tunnel interfaces.<br />
<br />
Solaris 8 was the first version of Solaris that implemented the IP in IP tunneling mechanism described in <a href="http://tools.ietf.org/html/rfc1853">RFC1853</a>. It did not, however, implement the "Tunnel MTU Discovery" section of this RFC. Tunneling over IPv6 (<a href="http://tools.ietf.org/html/rfc2473">RFC2473</a>) was implemented very early in Solaris 10 (and backported to Solaris 9 in Update 1) along with a Tunnel MTU Discovery mechanism that worked for IPv6 tunnel interfaces only. One drawback of that IPv6-only implementation was that there was no observability into the tunnel MTU (ifconfig's output always showed a static MTU value that was unrelated to the tunnel interface's actual MTU). A mechanism was needed that worked for both IPv4 and IPv6 tunnels, and that was visible to the administrator.<br />
<br />
This work became more important when customers (internal and external to Sun) started using Solaris' IPsec tunneling to implement VPN solutions. Without proper Tunnel MTU Discovery, things like TCP MSS calculations can take longer to converge to usable values, and protocols that don't have any insight into Path MTU (UDP for example) cause unnecessary IP fragmentation. For more on the benefits of Tunnel MTU Discovery, see the two aforementioned RFCs on IP tunneling.<br />
<br />
Without going into too much detail about the inner workings of the ip and tun modules or every line of code that was changed to implement this feature, I'd like to focus on two aspects of the implementation. The first is the mechanism used by the tun module to obtain path MTU information about the tunnel destination from ip, and the second is the mechanism by which the ip interface's MTU is dynamically changed when the tun module detects a change in the tunnel's link MTU.<br />
<br />
<h2 style="font-weight: normal;"><span style="font-size: large;">IRE_DB_REQ_TYPE</span></h2>In order for the tun module to be able to calculate a useful tunnel MTU, it needs to know the Path MTU of the tunnel destination. The tunnel destination is the IP node we'll send encapsulated packets to when sending them through the tunnel interface. In ifconfig output, it is the "tunnel dst":<br />
<br />
<pre># ifconfig ip.tun0
ip.tun0: flags=10008d1<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST,IPv4> mtu 1480 index 4
inet tunnel src 11.0.0.1 tunnel dst 11.0.0.2
tunnel hop limit 60
inet 10.0.0.1 --> 10.0.0.2 netmask ff000000
</pre><br />
In the above example, IP packets forwarded into ip.tun0 are encapsulated into an outer IP header with a source of 11.0.0.1 and a destination of 11.0.0.2. 11.0.0.2 is the "tunnel destination".<br />
<br />
The Path MTU to the destination is the size of the largest IP packet that can be sent to the destination without being fragmented or triggering an ICMP "fragmentation needed" message. The tunnel MTU of a given tunnel is the Path MTU of the tunnel destination minus any tunneling overhead (the encapsulating IP header, and perhaps IPsec headers if IPsec tunneling is being used).<br />
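<br />
As a rough worked example of that arithmetic (a sketch only, not the tun module's code: the function name tunnel_mtu and the header-length constants are made up for illustration, and the IPsec overhead is simply passed in by the caller):<br />

```c
#include <stdint.h>

/* Illustrative constants, not taken from the Solaris headers. */
#define OUTER_IPV4_HDR_LEN 20u  /* IPv4 header without options */
#define OUTER_IPV6_HDR_LEN 40u  /* fixed IPv6 header */

/*
 * Hypothetical helper: the tunnel MTU is the Path MTU to the tunnel
 * destination minus the encapsulation overhead (outer IP header plus
 * any IPsec overhead). Returns 0 if the overhead would consume the
 * whole path MTU.
 */
uint32_t
tunnel_mtu(uint32_t path_mtu, uint32_t outer_hdr_len, uint32_t ipsec_overhead)
{
    uint32_t overhead = outer_hdr_len + ipsec_overhead;

    if (path_mtu <= overhead)
        return (0);
    return (path_mtu - overhead);
}
```

With a Path MTU of 1500 and no IPsec, tunnel_mtu(1500, OUTER_IPV4_HDR_LEN, 0) gives 1480, which matches the "mtu 1480" in the ifconfig output above. The same path carrying an IPv6 outer header would give 1460.<br />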
<br />
The ip module keeps this Path MTU information in a per-destination cache (aka IRE cache) table. The protocol used to keep track of this Path MTU information is described in <a href="http://tools.ietf.org/html/rfc1191">RFC1191</a>. The ip module provides a number of methods of accessing this per-destination cache. One of them is the <a href="http://cvs.opensolaris.org/source/xref/usr/src/uts/common/inet/ip/ip_ire.c#ire_ctable_lookup">ire_ctable_lookup()</a> functional interface, but because tun and ip are separate STREAMS modules and this functional interface was previously only safe to use within the ip module's STREAMS perimeter<sup><a href="http://www.blogger.com/post-edit.g?blogID=210396651480309156&postID=5079277244696987036#foot1" name="footref1">[1]</a></sup>, tun could not use it.<br />
<br />
Another method ip provides is the IRE_DB_REQ_TYPE STREAMS message. An upstream module can send such a message down to ip, and ip will reply with an IRE_DB_TYPE message and append a copy of the requested IRE<sup><a href="http://www.blogger.com/post-edit.g?blogID=210396651480309156&postID=5079277244696987036#foot2" name="footref2">[2]</a></sup> to the message (assuming the requested IRE is found). This is the method used by the tun module. Periodically, tun sends down this message to get the current Path MTU for its tunnel destination. For example, <a href="http://cvs.opensolaris.org/source/xref/usr/src/uts/common/inet/ip/tun.c#tun_wdata_v4">tun_wdata_v4()</a> does this when it sends a packet down to ip and the Path MTU information it holds has expired.<br />
<br />
<pre>/*
 * Request the destination ire regularly in case Path MTU has
 * increased.
 */
if (TUN_IRE_TOO_OLD(atp))
    tun_send_ire_req(q);
</pre><h2 style="font-weight: normal;"><span style="font-size: large;">DL_NOTIFY_REQ/IND and DL_NOTE_SDU_SIZE</span></h2>Once the tun module has obtained the Path MTU information of the destination, it needs to recalculate the link MTU of the tunnel interface and notify the upper instance of ip if the MTU has changed. The ip module can then update the IP interface's MTU accordingly. The MTU calculation is done by the <a href="http://cvs.opensolaris.org/source/xref/usr/src/uts/common/inet/ip/tun.c#tun_update_link_mtu">tun_update_link_mtu()</a> function, which in turn calls <a href="http://cvs.opensolaris.org/source/xref/usr/src/uts/common/inet/ip/tun.c#tun_sendsdusize">tun_sendsdusize()</a> to notify the ip module of the new MTU if it has changed:<br />
<br />
<pre>/*
 * Given the path MTU to the tunnel destination, calculate tunnel's link
 * mtu. For configured tunnels, we update the tunnel's link MTU and notify
 * the upper instance of IP of the change so that the IP interface's MTU
 * can be updated. If the tunnel is a 6to4 or automatic tunnel, just
 * return the effective MTU of the tunnel without updating it. We don't
 * update the link MTU of 6to4 or automatic tunnels because they tunnel to
 * multiple destinations all with potentially differing path MTU's.
 */
static uint32_t
tun_update_link_mtu(queue_t *q, uint32_t pmtu, boolean_t icmp)
{
    tun_t *atp = (tun_t *)q->q_ptr;
    uint32_t newmtu = pmtu;
    boolean_t sendsdusize = B_FALSE;

    /*
     * If the pmtu provided came from an ICMP error being passed up
     * from below, then the pmtu argument has already been adjusted
     * by the IPsec overhead.
     */
    if (!icmp && (atp->tun_flags & TUN_SECURITY))
        newmtu -= atp->tun_ipsec_overhead;

    if (atp->tun_flags & TUN_L_V4) {
        newmtu -= sizeof (ipha_t);
        if (newmtu < IP_MIN_MTU)
            newmtu = IP_MIN_MTU;
    } else {
        ASSERT(atp->tun_flags & TUN_L_V6);
        newmtu -= sizeof (ip6_t);
        if (atp->tun_encap_lim > 0)
            newmtu -= IPV6_TUN_ENCAP_OPT_LEN;
        if (newmtu < IPV6_MIN_MTU)
            newmtu = IPV6_MIN_MTU;
    }

    if (!(atp->tun_flags & (TUN_6TO4 | TUN_AUTOMATIC))) {
        if (newmtu != atp->tun_mtu) {
            atp->tun_mtu = newmtu;
            sendsdusize = B_TRUE;
        }
        if (sendsdusize)
            tun_sendsdusize(q);
    }
    return (newmtu);
}
</pre><br />
Note, there is a cosmetic bug in the above code. The fix would be a good starter fix for anyone wishing to be introduced to the OpenSolaris development process. :-) The sendsdusize variable is obviously not needed and the last if statement can be reduced to:<br />
<br />
<pre>if (newmtu != atp->tun_mtu &&
    !(atp->tun_flags & (TUN_6TO4 | TUN_AUTOMATIC))) {
    atp->tun_mtu = newmtu;
    tun_sendsdusize(q);
}
</pre><br />
How does the notification between tun and ip work? It's done via a DLPI notification mechanism that is Solaris specific. The dlpi(7P) man page describes the mechanism as "Notification Support", and it includes support for the asynchronous notification of link status (up or down), SDU (send data unit, or MTU) size, link speed, and other information. The tun module uses the SDU notification.<br />
<br />
The mechanism works as follows:<br />
<ol><li>When an IP interface is plumbed, the ip module sends the underlying driver a DL_NOTIFY_REQ DLPI message. The message contains a bitfield representing the notifications that ip is interested in. This is done by <a href="http://cvs.opensolaris.org/source/xref/usr/src/uts/common/inet/ip/ip6_if.c#ill_dl_phys">ill_dl_phys()</a>:<br />
<br />
<pre>/*
 * Allocate a DL_NOTIFY_REQ and set the notifications we want.
 */
notify_mp = ip_dlpi_alloc(sizeof (dl_notify_req_t) + sizeof (long),
    DL_NOTIFY_REQ);
if (notify_mp == NULL)
    goto bad;
((dl_notify_req_t *)notify_mp->b_rptr)->dl_notifications =
    (DL_NOTE_PHYS_ADDR | DL_NOTE_SDU_SIZE | DL_NOTE_FASTPATH_FLUSH |
    DL_NOTE_LINK_UP | DL_NOTE_LINK_DOWN | DL_NOTE_CAPAB_RENEG);
...
ill_dlpi_send(ill, notify_mp);
</pre></li>
<li>The underlying driver (tun in this case) replies with a DL_NOTIFY_ACK containing the subset of notifications that it supports. The tun module only supports DL_NOTE_SDU_SIZE.</li>
<li>When an event that triggers a change in MTU occurs, the driver (tun) sends up a DL_NOTIFY_IND message to those DLPI users that were interested in DL_NOTE_SDU_SIZE notifications. The tun module does this in the <a href="http://cvs.opensolaris.org/source/xref/usr/src/uts/common/inet/ip/tun.c#tun_sendsdusize">tun_sendsdusize()</a> function.</li>
<li>When ip receives the DL_NOTIFY_IND message containing a DL_NOTE_SDU_SIZE notification, it updates the IP tunnel interface's MTU accordingly, and ifconfig shows the new dynamically updated MTU!</li>
</ol><hr /><br />
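The four steps above can be pictured with a small user-space simulation. This is only a sketch of the request/acknowledge/indicate pattern: the NOTE_* bit values, the two structures and all three function names are invented for illustration and are not the real DLPI definitions or the code in ip or tun.<br />

```c
#include <stdint.h>

/* Illustrative notification bits; NOT the real DL_NOTE_* values. */
enum {
    NOTE_PHYS_ADDR = 1u << 0,
    NOTE_SDU_SIZE  = 1u << 1,
    NOTE_LINK_UP   = 1u << 2,
    NOTE_LINK_DOWN = 1u << 3
};

/* Hypothetical stand-in for the driver below ip (tun in the text). */
struct driver {
    uint32_t supported;   /* notifications the driver can generate */
    uint32_t subscribed;  /* subset the upper layer asked for */
    uint32_t sdu;         /* current link MTU */
};

/* Hypothetical stand-in for the IP interface plumbed above the driver. */
struct ip_if {
    uint32_t mtu;
};

/*
 * Steps 1 and 2: the upper layer asks for a set of notifications and
 * the driver acknowledges only the subset it supports.
 */
uint32_t
notify_req(struct driver *drv, uint32_t wanted)
{
    drv->subscribed = wanted & drv->supported;
    return (drv->subscribed);
}

/* Step 4: the upper layer handles an indication from the driver. */
void
notify_ind(struct ip_if *ipif, uint32_t note, uint32_t data)
{
    if (note == NOTE_SDU_SIZE)
        ipif->mtu = data;
}

/*
 * Step 3: when the link MTU changes, the driver sends an indication up,
 * but only if SDU size notifications were requested. Returns 1 if an
 * indication was sent and 0 otherwise.
 */
int
driver_set_sdu(struct driver *drv, struct ip_if *ipif, uint32_t new_sdu)
{
    if (new_sdu == drv->sdu)
        return (0);
    drv->sdu = new_sdu;
    if (!(drv->subscribed & NOTE_SDU_SIZE))
        return (0);
    notify_ind(ipif, NOTE_SDU_SIZE, new_sdu);
    return (1);
}
```

In this model, a driver that supports only NOTE_SDU_SIZE acknowledges just that bit no matter how many notifications were requested, and a later call such as driver_set_sdu(&tun, &ipif, 1400) updates ipif.mtu only when the value actually changed. In the real system these exchanges are STREAMS messages rather than direct function calls.<br />
<br />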
<a href="http://www.blogger.com/post-edit.g?blogID=210396651480309156&postID=5079277244696987036#footref1" name="foot1">[1]</a> The IP Multithreading feature of the <a href="http://www.sun.com/bigadmin/content/networkperf/">FireEngine project</a> now makes it possible for other modules to use this functional interface. Some modules such as <a href="http://cvs.opensolaris.org/source/xref/usr/src/uts/common/ipf/">ipf</a> (IP Filter) and <a href="http://cvs.opensolaris.org/source/xref/usr/src/uts/common/inet/ip/nattymod.c">nattymod</a> (IPsec NAT traversal) already use it. The tun module can now use it as well, which is something we plan on doing.<br />
<br />
<a href="http://www.blogger.com/post-edit.g?blogID=210396651480309156&postID=5079277244696987036#footref2" name="foot2">[2]</a> An IRE, or internet routing entry, is a data structure internal to Solaris' IP implementation used to represent forwarding table entries _and_ per-destination cache entries. Creation and maintenance of IRE tables is by far one of the most complex (some would say overly complex) parts of the ip module. The subject of IREs would make for a very lengthy blog entry on its own.<br />
<br />
<hr /><br />
Technorati Tag: <a href="http://www.technorati.com/tag/OpenSolaris" rel="tag">OpenSolaris</a><br />
<br />
<br />
Technorati Tag: <a href="http://www.technorati.com/tag/Solaris" rel="tag">Solaris</a>Sebastienhttp://www.blogger.com/profile/00380169559827106266noreply@blogger.com0