Patching a Linux Kernel Module

By Evan Sangaline | July 27, 2017

A Bug on Linux? Why, I never!

I’ve been using GNU/Linux for about fifteen years and, I’ve got to admit, it used to be pretty rough around the edges (to put it lightly). A lot can change over fifteen years though; most of the things that were once major problem areas haven’t required a second thought in years. Laptop suspension, WIFI, advanced function keys, sound, and pretty much everything else all typically “just work” these days, and this has been the case for quite a while. “Ah, that’s just an anecdote,” you might say, but it’s honestly closer to a statistic at this point.

I often suggest to people who are thinking about buying a new computer that they try breathing new life into their old one first by putting GNU/Linux on it. The specs just haven’t improved enough recently to make a huge difference for most people and, in my experience, software bloat is usually a bigger problem than hardware limitations. If somebody is interested in giving this a shot then I’m always happy to help them get situated. I’ve done this for quite a number of friends and family members with excellent results.

One such family member is my father, who has been happily running Ubuntu on a 7-year-old Lenovo IdeaPad Z570 for years. Browsers have been steadily increasing their memory usage over that time though, and his 4GB of RAM has been starting to feel a bit tight for even casual web browsing. To help alleviate this, I upgraded him to 8GB and a solid state drive over the previous weekend. He also wanted to try something a bit lighter than Unity, so we decided to move him to Antergos (which has become my go-to for non-technical users who want a light, stable operating system).

So I popped in the new hardware, ran through the Cnchi installer process for Antergos, and rebooted the machine. I was honestly a bit stunned when I then found that I wasn’t able to activate the WIFI. As I started debugging the problem, I quickly found that what was going on was actually kind of interesting. The solution ultimately ended up involving patching the kernel and it took some fun detective work to track down the issue.

The main reason that I decided to write this up is that all you can find on Google about this are a handful of forum posts with well-meaning but ultimately unhelpful replies. There’s probably a vanishingly small number of people using an ancient Lenovo IdeaPad Z570 at this point, but if anybody does find this through search then hopefully this can help them resolve the problem.

The other reason is that it’s an interesting glimpse into how open source software gives you the power to change how things work on pretty much every level. There were frustrating WIFI bugs on OS X that took years to get resolved and I would have loved to have the opportunity to fix those myself at the time. Even if you’re not desperately trying to get GNU/Linux working on your Lenovo IdeaPad Z570, you might still enjoy following along through the process of debugging and patching a kernel issue.

Getting to the Bottom of This

My first thought was that there might be some issue with the GUI network manager, so I hopped into a terminal and tried to bring up the interface directly. First, I used the ip link command to find the name of the WIFI interface. This command just lists the available interfaces and is roughly equivalent to running ifconfig if you’re familiar with the older tools that were common before iproute2.

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp3s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
    link/ether f0:de:f1:8e:c7:a4 brd ff:ff:ff:ff:ff:ff
3: wlp2s0: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN mode DORMANT group default qlen 1000
    link/ether 74:e5:0b:42:ac:a4 brd ff:ff:ff:ff:ff:ff

From this, we can see that the interface name is wlp2s0 where the wl prefix denotes that this is a Wireless-LAN interface. You might be more used to seeing an interface name like wlan0 here, but these names have been phased out on systemd-based systems in favor of predictable network interface names which are stable across reboots and hardware changes.
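To make the naming scheme a little more concrete, here is a minimal sketch that classifies interface names by their two-letter prefix. The iface_kind function is a hypothetical helper of my own, not part of any tool, and it only covers the prefixes that are relevant here (en for Ethernet, wl for wireless LAN, ww for wireless WAN).

```shell
#!/bin/sh
# Classify an interface by the two-letter prefix used in systemd's
# predictable naming scheme. iface_kind is a hypothetical helper name.
iface_kind() {
    case "$1" in
        lo)  echo "loopback" ;;
        en*) echo "ethernet" ;;
        wl*) echo "wireless" ;;
        ww*) echo "wwan" ;;
        *)   echo "unknown" ;;
    esac
}

# The names below are the ones from the `ip link` output above.
for name in lo enp3s0 wlp2s0; do
    printf '%s: %s\n' "$name" "$(iface_kind "$name")"
done
```

The remainder of a name like wlp2s0 encodes where the device sits, in this case PCI bus 2, slot 0, which is why the name survives reboots.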

Knowing the interface identifier, I then tried to bring it up with sudo ip link set wlp2s0 up (the equivalent of sudo ifconfig wlp2s0 up). This operation failed with a helpful error message:

RTNETLINK answers: Operation not possible due to RF-kill

This error message basically means that the WIFI card appears to be deliberately disabled, like you would expect when you put your laptop into airplane mode. I was then able to determine that it seemed to be a hardware switch that was causing the problem by listing the more detailed RF-kill status with rfkill list.

0: phy0: Wireless LAN
    Soft blocked: no
    Hard blocked: yes

Using the Fn+F5 shortcut to toggle the WIFI state appeared to only modify the software block value, while the actual hardware switch had no effect on either state. Unblocking using sudo rfkill unblock all also had no effect, but that’s expected because that should only be able to modify the software blocking state.

So it seemed that the hardware switch was not being read correctly and that this was preventing the system from bringing the interface up.
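The same two flags that rfkill list prints are also exposed through sysfs, which can be handy if the rfkill utility isn’t installed. The following is a rough sketch rather than a polished tool: rfkill_report is a hypothetical function name, and the directory argument exists only so that it can be pointed at a copy of the tree.

```shell
#!/bin/sh
# Print the soft/hard block flags for every rfkill device found under a
# sysfs-style directory. rfkill_report is a hypothetical helper; the
# directory defaults to the real sysfs location.
rfkill_report() {
    base="${1:-/sys/class/rfkill}"
    for dev in "$base"/rfkill*; do
        # Skip the unexpanded glob when no rfkill devices exist.
        [ -d "$dev" ] || continue
        name=$(cat "$dev/name")
        soft=$(cat "$dev/soft")   # 1 means blocked in software
        hard=$(cat "$dev/hard")   # 1 means blocked by hardware/firmware
        printf '%s: soft=%s hard=%s\n' "$name" "$soft" "$hard"
    done
}

rfkill_report
```

On the Z570, this would be expected to report the hard flag as 1 for phy0 no matter what position the physical switch is in.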

Patching the Kernel Module

I mentioned that there weren’t any helpful forum replies relating to the Z570, but related symptoms have also been observed with the earlier Lenovo Yoga laptops. The issue with those was that they didn’t have a hardware switch, but the driver was still attempting to read it. The easy fix there was to simply blacklist the ideapad_laptop kernel module, but that prevented the keyboard from working for the Z570. Instead, I opted to patch the module to disable the support for the hardware switch entirely (credit to 1 and 2 for doing similar things with the Yoga laptops).

The WIFI actually did work on the live-USB, so I was able to boot into that and access the internet. I could then use chroot to effectively drop into the installation on the hard drive. This requires first mounting the drive

sudo mkdir /mnt/antergos
sudo mount /dev/sda2 /mnt/antergos

where sda2 needs to be replaced with the actual installation location (which you can check with sudo fdisk -l). If you’re using Arch, or an Arch-based operating system, on the live-USB then you can simply run arch-chroot /mnt/antergos. For any other distribution, you’ll need to do the steps manually, like so:

# do everything as root
sudo su
cd /mnt/antergos

# mount the temporary API filesystems
mount -t proc proc proc/
mount --rbind /sys sys/
mount --rbind /dev dev/
mount --rbind /run run/

# copy over the DNS details
cp /etc/resolv.conf etc/resolv.conf

# actually change the root
chroot /mnt/antergos /bin/bash

# source the local bash configuration
source /etc/profile
source ~/.bashrc

At this point, you will have a shell that effectively acts as if you were just logged into the hard drive installation. First, let’s make sure that we have both the long-term support kernel and headers installed.

pacman -S linux-lts linux-lts-headers

Take note of the kernel version as it installs; in my case it was 4.9.31-1. Then run ls /usr/lib/modules/ to find the location of the corresponding kernel modules. You’ll want to pick the one which corresponds to the LTS kernel version, so for me it was 4.9.31-1-lts.
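If there are several kernels installed, picking the right directory out of the listing by eye gets error-prone. As a small sketch, assuming that the LTS module directory name ends in -lts as it did for me, something like the following narrows it down. The lts_module_dirs name is made up for illustration.

```shell
#!/bin/sh
# List the module directories whose names end in "-lts".
# lts_module_dirs is a hypothetical helper; the base directory defaults
# to /usr/lib/modules and is a parameter only for illustration.
lts_module_dirs() {
    base="${1:-/usr/lib/modules}"
    for dir in "$base"/*-lts; do
        # Skip the unexpanded glob when nothing matches.
        [ -d "$dir" ] || continue
        basename "$dir"
    done
}

lts_module_dirs
```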

Now, we can move on to actually patching and building the required kernel module. First, make sure that you also have a build environment available by running pacman -S base-devel. Then, simply run the following sequence of commands. Note that if you’re working with a different kernel version then you’ll need to find the correct tarball location at kernel.org and modify the /usr/lib/modules/ path accordingly.

# drop down to a non-root user
su -l sangaline

# make a working directory and move to it
mkdir ~/kernelbuild
cd ~/kernelbuild

# download and extract the kernel source code
wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.9.31.tar.xz
tar -xvJf linux-4.9.31.tar.xz

# clean the source tree
cd linux-4.9.31
make clean
make mrproper

# copy over your existing configuration
cp /usr/lib/modules/4.9.31-1-lts/build/.config ./
cp /usr/lib/modules/4.9.31-1-lts/build/Module.symvers ./

# prepare the source tree
make prepare
make scripts

We’re now prepared to build the kernel module, but first we must patch it to disable the hardware WIFI switch and force its perceived state to ON. The patched changes are as follows.

*** drivers/platform/x86/ideapad-laptop.c	2017-07-26 11:07:37.347085974 -0400
--- drivers/platform/x86/ideapad-laptop-patched.c	2017-07-26 11:12:03.936764755 -0400
***************
*** 40,46 ****
  #include <linux/device.h>
  #include <acpi/video.h>
  
! #define IDEAPAD_RFKILL_DEV_NUM	(3)
  
  #define CFG_BT_BIT	(16)
  #define CFG_3G_BIT	(17)
--- 40,46 ----
  #include <linux/device.h>
  #include <acpi/video.h>
  
! #define IDEAPAD_RFKILL_DEV_NUM	(2)
  
  #define CFG_BT_BIT	(16)
  #define CFG_3G_BIT	(17)
***************
*** 230,238 ****
  		seq_printf(s, "BL power value:\t%s\n", value ? "On" : "Off");
  	seq_printf(s, "=====================\n");
  
- 	if (!read_ec_data(priv->adev->handle, VPCCMD_R_RF, &value))
- 		seq_printf(s, "Radio status:\t%s(%lu)\n",
- 			   value ? "On" : "Off", value);
  	if (!read_ec_data(priv->adev->handle, VPCCMD_R_WIFI, &value))
  		seq_printf(s, "Wifi status:\t%s(%lu)\n",
  			   value ? "On" : "Off", value);
--- 230,235 ----
***************
*** 465,471 ****
  };
  
  static const struct ideapad_rfk_data ideapad_rfk_data[] = {
- 	{ "ideapad_wlan",    CFG_WIFI_BIT, VPCCMD_W_WIFI, RFKILL_TYPE_WLAN },
  	{ "ideapad_bluetooth", CFG_BT_BIT, VPCCMD_W_BT, RFKILL_TYPE_BLUETOOTH },
  	{ "ideapad_3g",        CFG_3G_BIT, VPCCMD_W_3G, RFKILL_TYPE_WWAN },
  };
--- 462,467 ----
***************
*** 484,501 ****
  
  static void ideapad_sync_rfk_state(struct ideapad_private *priv)
  {
- 	unsigned long hw_blocked = 0;
- 	int i;
- 
- 	if (priv->has_hw_rfkill_switch) {
- 		if (read_ec_data(priv->adev->handle, VPCCMD_R_RF, &hw_blocked))
- 			return;
- 		hw_blocked = !hw_blocked;
- 	}
- 
- 	for (i = 0; i < IDEAPAD_RFKILL_DEV_NUM; i++)
- 		if (priv->rfk[i])
- 			rfkill_set_hw_state(priv->rfk[i], hw_blocked);
  }
  
  static int ideapad_register_rfkill(struct ideapad_private *priv, int dev)
--- 480,485 ----
***************
*** 999,1004 ****
--- 983,989 ----
  			ideapad_register_rfkill(priv, i);
  
  	ideapad_sync_rfk_state(priv);
+     write_ec_cmd(priv->adev->handle, VPCCMD_W_WIFI, 1);
  	ideapad_sync_touchpad_state(priv);
  
  	if (acpi_video_get_backlight_type() == acpi_backlight_vendor) {

You can either apply these changes manually, or download and apply the patch directly from our site.

# download the contextual patch from intoli.com
wget https://intoli.com/blog/patching-a-linux-kernel-module/ideapad-laptop.patch

# apply it in the source tree
patch -p0 -i ideapad-laptop.patch

Now that everything is set up to go, we can finally compile the ideapad-laptop kernel module with

make M=drivers/platform/x86

and overwrite the existing module with the new patched one (again, being mindful of which /usr/lib/modules/ path to use):

gzip drivers/platform/x86/ideapad-laptop.ko
sudo cp drivers/platform/x86/ideapad-laptop.ko.gz /usr/lib/modules/4.9.31-1-lts/kernel/drivers/platform/x86/
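One thing worth checking is that the module was built for exactly the kernel it will be loaded into, because the kernel compares a module’s vermagic string against its own release and refuses to load a mismatch. Here is a minimal sketch of that comparison. The vermagic_matches function is a hypothetical helper, and the vermagic string passed to it is an illustrative value rather than captured output.

```shell
#!/bin/sh
# Succeed when the first word of a module's vermagic string equals the
# given kernel release. vermagic_matches is a hypothetical helper.
vermagic_matches() {
    vermagic="$1"
    release="$2"
    # Keep only the text before the first space, e.g. "4.9.31-1-lts".
    [ "${vermagic%% *}" = "$release" ]
}

# On the real system the first argument would come from running
# `modinfo -F vermagic` on the freshly built .ko file, and the second
# from the directory name under /usr/lib/modules/. The string below is
# an assumed example value.
if vermagic_matches "4.9.31-1-lts SMP preempt mod_unload modversions" "4.9.31-1-lts"; then
    echo "module matches kernel"
else
    echo "version mismatch: the kernel will refuse to load this module"
fi
```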

At this point, you might want to double-check your systemd-boot or grub2 config to make sure that it defaults to the correct kernel version. I used systemd-boot so I simply had to modify the default antergos line in /boot/loader/loader.conf to instead read default antergos-lts.

The one last thing that you should probably do is lock the kernel version, so that a routine system upgrade doesn’t replace the patched module and you can rebuild it whenever you do choose to update the kernel. This can be done by simply adding the following line to the [options] section of /etc/pacman.conf:

IgnorePkg = linux-lts linux-lts-headers
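To confirm that the setting was picked up, you can check which packages are listed on the IgnorePkg line. This is a simplified sketch: pkg_is_ignored is a hypothetical helper, it doesn’t check which section the line appears in, and it only matches literal package names even though pacman also accepts glob patterns there.

```shell
#!/bin/sh
# Succeed when a package name appears on an IgnorePkg line of a
# pacman.conf-style file. pkg_is_ignored is a hypothetical helper; the
# file defaults to /etc/pacman.conf.
pkg_is_ignored() {
    pkg="$1"
    conf="${2:-/etc/pacman.conf}"
    awk -v pkg="$pkg" '
        /^[[:space:]]*IgnorePkg[[:space:]]*=/ {
            sub(/^[^=]*=/, "")          # drop everything up to the "="
            for (i = 1; i <= NF; i++)   # remaining fields are package names
                if ($i == pkg) found = 1
        }
        END { exit found ? 0 : 1 }
    ' "$conf"
}

# Only run the check where a pacman configuration actually exists.
if [ -r /etc/pacman.conf ]; then
    for pkg in linux-lts linux-lts-headers; do
        if pkg_is_ignored "$pkg"; then
            echo "$pkg: held back"
        else
            echo "$pkg: will be upgraded"
        fi
    done
fi
```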

The Wrap-Up

If you found this from Google because you were dealing with RF-kill hardware switch issues on a Lenovo IdeaPad Z570, then hopefully this helped you solve the problem! Otherwise, maybe you at least found it interesting to see how you can go about debugging and solving kernel-level issues on GNU/Linux. This was admittedly a bit of a headache to figure out, but I do really enjoy the fact that there’s no such thing as an “unfixable” problem with open source software!

And please do get in touch if you’re ever dealing with any devops or debugging issues of your own and looking for outside assistance.