
Hacking update

Wow, time flies. In the past months I've been working on a handful of projects related to the mainframe, but nothing that is really finished - so I thought I would summarize some work I've been doing instead.

Things I've been up to since last post:

  • Learning about PCIe and (not so successfully) adding it to Fejkon
  • Repairing my DS6800 disk array
  • Debugging why the circuit breaker disconnects when redundant power is connected
  • VTE FICON tape virtualization
Let's go over them one by one!

PCIe and Fejkon

Fejkon, my FICON FPGA card, needs to move up to 32 Gbit/s (4x 8 Gbit/s ports) of FICON traffic to and from the card. Normally this calls for some accelerator company to come in and write an FPGA module (IP core) for many, many pennies. As I'm doing this to learn, and for the benefit of the open source community, I'm writing those modules myself.

PCIe is a complex topic and I will be writing a post about how Fejkon's PCIe works when I finish it - I will however explain the issues I have right now for your entertainment.

PCIe can chiefly read and write memory addresses - those addresses can live on a PCIe device or on the host. In the normal mode of operation, e.g. while a device is being set up as the system boots, all memory accesses are initiated by the host ("What card are you?", "How many ports do you have?", etc.) - this is called polling. In the past, data transfers were done this way too: the host was interrupted when data was available and would then ask the device for the contents of its memory buffers. This is quite slow, however, as PCIe has non-trivial latency. In addition, if the host does not have time to process the incoming data it might be lost, so you end up adding a bunch of memory to the device, which raises its cost and complexity.

Instead, all high-speed interfaces today use something called Direct Memory Access (DMA), which is a fancy way of saying that the device is able to operate on host memory directly. The host configures the device, essentially saying "Hey, when you get new packets, write them here - and I'll write the outgoing packets over here". This is much better: when the interrupt to the host fires, the data is already available - potentially even in the exact place where the userspace server wants it (that is called zero-copy), if one does things carefully.

Fejkon will not support zero-copy initially, as that requires a more elaborate DMA engine - and at 32 Gbit/s it shouldn't be needed.

For all of this to be implemented on the FPGA board, you need to send and receive your own so-called Transaction Layer Packets (TLPs). These are roughly analogous to IP packets. I've implemented TLP handling for polled requests so far, but it doesn't quite work right.

I'm currently fighting an issue where the host sends "Hey, please give me the data at 0x4" - the FPGA logs that it received the request and that it sent the correct response (0xdeadbeef), but the host claims it never received the reply and raises a Completion Timeout (called CmplTO). The issue is here if you want to follow along.

After that issue is fixed there are two big items to do:
  • Implement Avalon Master to enable polling of all the various memory addresses in the FPGA
  • Create the DMA engine that will send the TLPs to the host for reading/writing packets as they are streamed in
I have already decided how I want the DMA buffer to be structured, and I've written the support in the QEMU device model as well as in the driver. This means that "only" the hardware is left :-).

Repairing my DS6800 disk array

I have written about my complicated relationship with the DS6800 DASD array in the past. The TL;DR is that it is one of the smallest (i.e., not refrigerator-sized) disk arrays you can get your hands on without it being extremely expensive. It is, however, not a great piece of engineering, as it is prone to bricking itself.

My DS6800 had a broken CompactFlash card in one of the controllers. Thanks to some fellow hackers, I finally managed to find a brand that worked as a replacement.

Reassembling the controller after giving it a "new" flash
I've documented all my findings on the page in the hope that they can be useful to somebody.

After putting it back together, doing a factory reset of the controllers, and applying the latest firmware update, it actually works!
Test zSeries 3390-9 volumes created

This should mean that I have a working DASD to play with - for example, installing the z/VM 5.3 evaluation version to finally start running some mainframe software.

For the Fejkon project this also means that I should now have a viable reference implementation to reverse engineer. Expect to see documentation of the FICON protocol, with PCAPs, down the line!

Right now the final annoying issue is that the array does not link up to the SAN switch, but that's probably just me using the wrong optics or having TX/RX swapped. One disk also appears to be broken, but eBay has a ton of them, so that's easy enough to fix.

Redundant power

Readers might have seen the previous article on how I power the mainframe. This is how I have powered it since the start, but there is one issue: as soon as the secondary power supply is connected, the circuit breaker trips. It doesn't matter whether I connect the front or the back first - it always trips when the secondary is plugged in, even if the whole system is EPO'd and the BPUs are locked out.

Now, in Switzerland, and in Europe in general, circuit breakers are rated for 10 A or 16 A, which is well within the load of my z114. However, circuit breakers for non-industrial use usually include 30 mA ground fault protection. IBM recommends not using such breakers, as the leakage current can reportedly reach hundreds of mA. Sadly, unless the connection box is behind lock and key (or the datacenter qualifies as a cow farm), the 30 mA limit is mandated by law.

Speaking to some electricians about this issue, they taught me how to measure the leakage current through ground, which I have now done.

13.8 mA leakage current to ground on one of the connectors to z114
The result is 13.8 mA. Given that residual-current breakers compare the current in the L and N wires (the difference being the leakage current), I expect it to be about the same on the other feed. This means the combined leakage is likely around 28 mA - well within the tripping zone of a 30 mA circuit breaker!

With this data in hand, the datacenter has agreed to simply try running the other BPU on another circuit breaker. Besides, it's quite silly to have both BPUs connected to the same incoming power anyway :-).

Fingers crossed this will work!

VTE FICON tape virtualization

Finally, you might recall the Bus-Tech adapters I wrote a bit about in the past. I took another look at how the machine that runs those cards is set up. The machine is called a Virtual Tape Engine (VTE) and contains 2x FICON cards and 1x AHA363 compression accelerator. My current project is virtualizing the VTE under VMware ESXi on a much smaller and less power-hungry box, learning about PCIe passthrough in the process.

Right now the issues have been around:
  • The cards use 64-bit MMIO, which VMware requires enabling explicitly through advanced settings
  • The cards also use an I/O region which VMware, for some reason, refuses to enable when booting the VTE (which boots in non-EFI mode)
    • Booting Ubuntu 18.04 in EFI mode results in the cards' regions being assigned correctly
As the VTE ships with the binary kernel modules needed to run the FICON and AHA363 adapters, I'm forced to use it as-is - which sadly means running a 2.6.16 kernel, which has no EFI support.
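For the 64-bit MMIO point, VMware's advanced settings for passthrough live in the VM's .vmx file. If I recall the documented knobs correctly, enabling large 64-bit BARs for passed-through devices looks roughly like this (treat the exact names and sizes as something to verify against VMware's own documentation for your ESXi version):

```
pciPassthru.use64bitMMIO = "TRUE"
pciPassthru.64bitMMIOSizeGB = "64"
```

This only helps with where the BARs get mapped, though - it does nothing for the legacy I/O port region that the non-EFI VTE boot needs.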

The current plan of action is to try KVM instead of VMware - maybe its virtual BIOS handles PCIe passthrough better.

In the end, I hope to have a self-contained box that shows up as virtual 3590 tape drives, hiding away the legacy 2.6.16 kernel. This box would also serve as a reference for Fejkon's 3590 tape support.


Those are the major things I've been working on for the past months - I hope you have enjoyed reading about the progress. If you are interested and have any ideas for how you can help on this journey, please reach out! For more real-time updates on my projects, make sure to follow me on Twitter.

Thanks for reading!

