7/25/2014

PXE, gPXE now pulling dhcp and image


Somewhat unexpectedly.

It appears the driver works with only a minor change.

Originally the driver detected the USB to NIC adapter that I had, but it timed out.


So once I had set up a test environment, I looked over the code and noticed something peculiar.

static struct usb_device_id asix_88178_ids[] = {
    USB_ROM (0x0b95, 0x772a, "asix", "ASIX AX88x72A"),
    USB_ROM (0x1737, 0x0039, "asix", "Linksys USB1000"),
    USB_ROM (0x04bb, 0x0939, "asix", "IO-DATA ETG-US2"),
    USB_ROM (0x050d, 0x5055, "asix", "Belkin F5D5055"),
};

static struct usb_device_id asix_88772_ids[] = {
    USB_ROM (0x2001, 0x3c05, "asix", "DLink DUB-E100"),
};

These seem to designate the "class" of binding function needed by the device:

struct usb_driver asix_88178_usb_driver __usb_driver = {
    .ids = asix_88178_ids,
    .id_count = (sizeof(asix_88178_ids) / sizeof(asix_88178_ids[0])),
    .probe = asix_88178_probe,
};

struct usb_driver asix_88772_usb_driver __usb_driver = {
    .ids = asix_88772_ids,
    .id_count = (sizeof(asix_88772_ids) / sizeof(asix_88772_ids[0])),
    .probe = asix_88772_probe,
};

But I knew from reading the Linux kernel driver code and the Asix documentation that the two families differ in speed, the AX88772 being a 10/100 (Fast Ethernet) adapter and the AX88178 a 10/100/1000 (Gigabit) adapter:

=====================================================
    ASIX AX88178 USB2.0 Gigabit Ethernet Network Adapter
    ASIX AX88772 USB2.0 Fast Ethernet Network Adapter
   
    Driver Compilation & Configuration on the Linux
=====================================================

It appeared the AX88772A adapter I had [ 0x0b95, 0x772a ] was "misclassified".

I reasoned it was being initialized as a 1000 Mbps device, when it is actually a 100 Mbps device.

So I moved the definition to the other category and recompiled.

This time it worked!


It pulled a dhcp address from my local network and appeared in the leases file for my local lan dhcp server.

The only change was this:

static struct usb_device_id asix_88178_ids[] = {
        USB_ROM (0x1737, 0x0039, "asix", "Linksys USB1000"),
        USB_ROM (0x04bb, 0x0939, "asix", "IO-DATA ETG-US2"),
        USB_ROM (0x050d, 0x5055, "asix", "Belkin F5D5055"),
};

static struct usb_device_id asix_88772_ids[] = {
        USB_ROM (0x2001, 0x3c05, "asix", "DLink DUB-E100"),
        USB_ROM (0x0b95, 0x772a, "asix", "ASIX AX88772A"),
};

Hardly worth a changelog edit.
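
To see why moving a single table entry is enough to change how the adapter is initialized, here is a minimal sketch of a table-driven lookup. The structures and names (usb_id, id_table, probe_class, find_probe_class) are hypothetical simplifications of mine, not gPXE's actual types or macros; the only things taken from the driver are the vendor/product pairs and which table each one sits in after the change.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the driver's ID tables.
 * These are NOT gPXE's real structures. */
enum probe_class { PROBE_NONE, PROBE_88178, PROBE_88772 };

struct usb_id {
    uint16_t vendor;
    uint16_t product;
};

struct id_table {
    const struct usb_id *ids;
    size_t count;
    enum probe_class probe;  /* init path shared by every ID in this table */
};

/* Table contents as they stand after the one-line move. */
static const struct usb_id ids_88178[] = {
    { 0x1737, 0x0039 },  /* Linksys USB1000 */
    { 0x04bb, 0x0939 },  /* IO-DATA ETG-US2 */
    { 0x050d, 0x5055 },  /* Belkin F5D5055 */
};

static const struct usb_id ids_88772[] = {
    { 0x2001, 0x3c05 },  /* DLink DUB-E100 */
    { 0x0b95, 0x772a },  /* ASIX AX88772A */
};

static const struct id_table tables[] = {
    { ids_88178, sizeof(ids_88178) / sizeof(ids_88178[0]), PROBE_88178 },
    { ids_88772, sizeof(ids_88772) / sizeof(ids_88772[0]), PROBE_88772 },
};

/* Return the probe class of the first table that lists this
 * vendor/product pair, or PROBE_NONE if no table claims it. */
enum probe_class find_probe_class(uint16_t vendor, uint16_t product)
{
    for (size_t t = 0; t < sizeof(tables) / sizeof(tables[0]); t++) {
        for (size_t i = 0; i < tables[t].count; i++) {
            if (tables[t].ids[i].vendor == vendor &&
                tables[t].ids[i].product == product)
                return tables[t].probe;
        }
    }
    return PROBE_NONE;
}
```

In this model the device itself never says which init path it wants; the table it is listed in decides, so a pair listed in the wrong table gets the wrong probe routine.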

Next task is to see if I can pull a PXE bootable image and boot into that kernel image.

Success (sort of..)

CTRL-B to get to a gPXE prompt

[ I am using a tinyweb server that requires file extensions to serve the files, so I renamed bz2bzImage to bz2bzImage.0 below ]

# dhcp net0
# kernel -n bz2bzImage http://192.168.2.125/gtest/bz2bzImage.0 root=100
# initrd http://192.168.2.125/gtest/initrd.bz2
# boot bz2bzImage

I can manually use the USB to NIC adapter to pull over a kernel image and an initrd image, and manually boot them.



It boots to a TomRtBt image; after login all looks familiar.


So weirdness "aside", we can conclude the usb subsystem and the usb driver in the gPXE image do indeed work.

For some reason gPXE isn't recognizing the net0 device created by the driver as a valid boot device, and it produces the following when left to boot automatically from the command line arguments:

# qemu -cdrom gpxe.iso  -net user -usbdevice host:0b95:772a -bootp http://192.168.2.125/gtest/gtest.gpxe



The following bug report "does" explain why gpxe.usb will not load (it has to be "padded" before it will load):

http://support.etherboot.org/index.php?do=details&task_id=23

The error message "Could not load gPXE" is displayed.

Instead of loading gPXE from disk one sector at a time, this code tries to read
an entire track at once. The size of the image is not taken into account here.

This code will read beyond the end of disk if a virtual machine is run from the
gPXE USB image. Boot will only succeed if the image file size is greater or
equal to the gPXE image size rounded up by 32 KB.

Physical machines and disk media are not affected because they will be larger
than the gPXE USB image size and aligned to 32KB (sectors-per-track).

The following workaround will pad the USB image appropriately:

$ util/padimg.pl -s 32768 bin/rtl8139.usb

The file size is now a multiple of 32 KB:

$ ls -alF bin/rtl8139.usb | awk '{ print $5 }'
98304

The fix appears to be to download the perl code and add it to the util directory of the gpxe src, then run it manually if planning to boot gpxe.usb in a virtual machine such as qemu:

http://git.etherboot.org/?p=gpxe.git;a=commit;h=7741546a406217827c3d4a8d72aaa322b2565c35
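
The arithmetic behind the padding can be sketched as follows. This is only my illustration of "round up to a multiple of 32 KB" as the bug report describes it, not a port of padimg.pl; the function names padded_size and padding_needed are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* 32 KB, the figure quoted in the bug report. */
#define TRACK_BYTES 32768u

/* Round image_bytes up to the next multiple of track_bytes.
 * A size that is already a multiple is returned unchanged. */
uint64_t padded_size(uint64_t image_bytes, uint64_t track_bytes)
{
    assert(track_bytes > 0);
    uint64_t remainder = image_bytes % track_bytes;
    if (remainder == 0)
        return image_bytes;
    return image_bytes + (track_bytes - remainder);
}

/* Number of bytes that would have to be appended to reach that size. */
uint64_t padding_needed(uint64_t image_bytes, uint64_t track_bytes)
{
    return padded_size(image_bytes, track_bytes) - image_bytes;
}
```

With these definitions, padded_size(98304, TRACK_BYTES) returns 98304 unchanged, since 98304 is exactly 3 x 32768, which matches the file size shown by ls.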

This might explain the lack of enumeration for the USB nic (or the failure to detect that it is indeed bootable).

In summary, the Boot Firmware Table (iBFT) is not being updated when the USB nic is detected. If a second virtual nic is added and detected by gPXE, it will indicate the USB nic designated nic0 is "inaccessible" and attempt to use the second virtual nic, which would not be a valid test.

Here is the reference that put me on to the idea:

http://serverfault.com/questions/477708/second-nic-not-working-properly-on-diskless-server-2012

Neither gPXE 1.0.0 nor the Jan. 31, 2013 commit of its successor, iPXE, ever write multiple NICs to the iBFT, even where there are multiple NICs they know about. I've verified this by examination of their source code. My hacky solution was to get the iPXE source tree, and modify the program such that it writes a second NIC section to the iBFT, corresponding to the other NIC in my server (the NIC I was not booting from.)

This suggests to me that the insertion or detection code should be calling a function that updates the device list, but is not. If that function can be found and called, everything should work as expected.

A wonky way to get it to automatically work

I noticed a message "No filename or root path specified" and tracked that down to 
src/usr/autoboot.c

That indicated it was trying to source the boot file name from the dhcp call for net0 (aka the USB nic).
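
As a rough model of that behaviour: the boot file name from dhcp is tried first, then a root path, and only when both are empty does the message appear. The sketch below is my own simplification under that assumption, not code copied from src/usr/autoboot.c; the enum and function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical outcome codes; not taken from gPXE. */
enum boot_source { BOOT_FILENAME, BOOT_ROOT_PATH, BOOT_NOTHING };

/* A setting counts as "set" only if it is a non-empty string. */
static int is_set(const char *value)
{
    return value != NULL && value[0] != '\0';
}

/* Decide what to boot from the two values a dhcp reply may carry. */
enum boot_source choose_boot_source(const char *filename,
                                    const char *root_path)
{
    if (is_set(filename))
        return BOOT_FILENAME;
    if (is_set(root_path))
        return BOOT_ROOT_PATH;
    return BOOT_NOTHING;
}

/* Text to show for each outcome; the last string is the message I saw. */
const char *describe_boot_source(enum boot_source source)
{
    switch (source) {
    case BOOT_FILENAME:
        return "Booting from filename";
    case BOOT_ROOT_PATH:
        return "Booting from root path";
    default:
        return "No filename or root path specified";
    }
}
```

Under this model, a dhcp server that hands out an address but no boot file name ends up in the BOOT_NOTHING case, which is consistent with what I was seeing on net0.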

So I put the path http://192.168.2.125/gtest/gtest.gpxe into the [ Boot File ] field on the dhcp tab of tftpd32.

And that worked.

And now the wonky part..

# qemu -hda bin/gpxe.usb -bootp http://192.168.2.125/gtest/gtest.gpxe -usbdevice host:0b95:772a -net nic -net user

The qemu arguments for booting from a nic and providing a bootp file path are still required or it doesn't work. I'm not sure why, but that tells me the arguments from qemu are not being sourced when attempting to boot from net0, while they are being sourced when booting from net1.

That kind of leads me to believe the problem is with qemu and the way it handles arguments for the nic devices it supports. Since the usb nic is being "passed-through" using -usbdevice instead of being emulated, it could be that the -bootp option isn't interacting with the enumeration mechanism properly.

Bottom line: I think this problem lies in [qemu] and the USB pass through, not with gPXE.

It's very possible that when booting gpxe.usb from a flash stick, or gpxe.iso from media, it won't be a problem: the USB Ethernet device will be detected, and the Boot File path will be used to pull down a configuration file that downloads the next stage.

Actually this is kind of okay, because it means the gpxe.usb image can be generic and its boot target is set by the dhcp server. Any processing logic for selecting the image would then be up to the http server providing the target file gtest.gpxe. For example, http processing could hand out a different file based on IP address, which in turn was assigned based on mac addr, mac vendor code, or anything else dhcp can key off of when assigning an IP. Or the next stage could be a menu selector that downloads other choices.
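
The "different file based on IP address" idea could look something like the sketch below on the http server side. Everything here is hypothetical: the rule table, the subnets, and the script names lab.gpxe and menu.gpxe are invented for illustration (only gtest.gpxe and the 192.168.2.x network appear in my setup), and a real server would hook this into its request handling.

```c
#include <stddef.h>
#include <stdint.h>

/* One rule: clients inside network/netmask get this script. */
struct script_rule {
    uint32_t network;    /* IPv4 network, host byte order */
    uint32_t netmask;    /* IPv4 netmask, host byte order */
    const char *script;  /* file name to hand out */
};

/* Hypothetical rules, most specific first. */
static const struct script_rule rules[] = {
    { 0xC0A80280u, 0xFFFFFF80u, "lab.gpxe"   },  /* 192.168.2.128/25 */
    { 0xC0A80200u, 0xFFFFFF00u, "gtest.gpxe" },  /* 192.168.2.0/24   */
};

/* Return the script for the first rule matching client_ip, or a
 * fallback menu script when no rule matches. */
const char *select_script(uint32_t client_ip)
{
    for (size_t i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
        if ((client_ip & rules[i].netmask) == rules[i].network)
            return rules[i].script;
    }
    return "menu.gpxe";
}
```

Because the first matching rule wins, the narrower /25 entry has to come before the /24 entry that contains it.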