I enjoy troubleshooting complex technical problems, especially the ones where the erroneous behaviour doesn't make sense at all. This blog post tells the story of such a problem and how I finally managed to understand the root cause behind it.
I recently came across a strange issue affecting some packages deployed through the mobile device management (MDM) solution we use internally. Whilst most packages deployed without any issues, some would install but keep raising an annoying warning. The warning, about a kernel extension that is used by the software but not whitelisted (at least not correctly), popped up every now and then, which worried a few of our users.
As with other issues affecting kernel extensions on macOS, my first thought was to check the privacy profile configuration. The first suspects were:
- Wrong team or bundle ID.
- Wrong code signing requirements.
- Wrong or missing kexts.
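The team ID and kext checks in this list can be scripted rather than eyeballed. What follows is only a minimal sketch, and it rests on assumptions that are not from our actual setup: that the whitelist is expressed as a kernel extension policy payload (payload type `com.apple.syspolicy.kernel-extension-policy`) and that the profile is available as an unsigned .mobileconfig file. The function name is made up for illustration, and code signing requirements are not covered.

```python
import plistlib

# Payload type Apple uses for kernel extension whitelisting profiles.
KEXT_POLICY_TYPE = "com.apple.syspolicy.kernel-extension-policy"


def find_kext_problems(profile_bytes, team_id, bundle_id):
    """Return a list of problems; an empty list means the kext is allowed.

    Hypothetical helper: profile_bytes is the content of an unsigned
    .mobileconfig file (a signed profile would need unwrapping first).
    """
    profile = plistlib.loads(profile_bytes)
    payloads = [
        payload
        for payload in profile.get("PayloadContent", [])
        if payload.get("PayloadType") == KEXT_POLICY_TYPE
    ]
    if not payloads:
        return ["no kernel extension policy payload found"]

    for payload in payloads:
        # Either the whole team is allowed...
        if team_id in payload.get("AllowedTeamIdentifiers", []):
            return []
        # ...or this specific bundle ID is allowed for that team.
        allowed = payload.get("AllowedKernelExtensions", {})
        if bundle_id in allowed.get(team_id, []):
            return []

    return [f"{bundle_id} (team {team_id}) is not allowed by any payload"]
```

Calling it with the profile's bytes plus the team ID and bundle ID reported in the warning gives a quick answer on whether the configuration itself is at fault.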
All of them checked out, so it was time to dig deeper into the rabbit hole. Logs to the rescue, or so I hoped.
The MDM we use provides logs on clients (duh!), so I logged into my test virtual machine and checked them. To my dismay, they contained no useful information. When it came to usefulness, these logs were on par with not having logs at all.
Broken Package Perhaps?
Next, we started thinking it might be the package we were building. See, the security software provider has a package (.pkg) file, but installing it does not connect the endpoint to the central server. To avoid having to activate each endpoint manually, we decided to use a script provided by the vendor which installs the .pkg file and connects the endpoint to our control server.
The problem with this strategy is that we had to build a custom package that does nothing but run a shell script. While that part was easy to figure out, configuring the permissions for the new package took a bit longer to get right.
However, the package wasn't the issue. How did we know? Because the package installed successfully about 50% of the time. If the package itself were broken, the success rate would be 0%, wouldn't it?
How It Works
During a routine call with our vendor, this issue came up, and we discussed it a bit. The support specialist mentioned something that was new information to me. He mentioned they push the PPPC profiles using Apple Push Notification service and that it's separate from their agent installing the pkg. I didn't know this, so I did a bit of digging and learned about the instrumental role APNs plays in mobile device management activities for Apple devices. Allow me to explain.
Apple Push Notification service (APNs) is the centrepiece of the remote notifications feature. It is a robust, secure, and highly efficient service for app developers to propagate information to iOS (and, indirectly, watchOS), tvOS, and macOS devices. (Source)
APNs is how MDMs get configurations to Apple devices. The MDM server sends a push notification to Apple, which relays it to the device over a secure channel; the notification prompts the device to check in with the MDM server and pick up whatever is waiting for it. APNs has many benefits, and once you start reading about it, you realize why it is a cornerstone of the Apple MDM ecosystem. One of the many configurations that depend on APNs is the Privacy Preferences Policy Control (PPPC) profile, the culprit in this saga of failing extensions.
Timing Is Everything
At this point, we had all the pieces of the puzzle; we managed to form a theory that explains the random failures of our PPPC profile deployment and consequently, the warning alerts.
Our MDM sends the new PPPC profile through APNs and pushes the installation package using its agent, which is installed locally on all enrolled devices. The problem with this flow? Timing. There is no guarantee that delivery will always happen in order (PPPC profile -> package file). If the PPPC delivery is late and the MDM agent proceeds with the installation anyway, the required whitelisting won't be in place in time to allow the application to run without any issues.
Since delays in PPPC delivery aren't predictable, neither were the kernel extension warnings. Sometimes we got an alert, other times we didn't, and it all makes perfect sense.
Given the fact that no one can control or even claim to predict the performance of APNs, a more straightforward solution is in order. Our proposal to our vendor was to make their local MDM agent check for the PPPC profile before it installs the package. If it can't detect the whitelisting, then the installation is delayed until the check passes.
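To make the proposal concrete, here is a minimal sketch of the "check first, install later" flow. It is not the vendor's implementation. The function names, the timeout, and the polling interval are all assumptions made for illustration, and the profile check itself is passed in as a callable because how the agent would detect the profile is left open here.

```python
import time


def wait_for_profile(is_profile_installed, timeout=300.0, interval=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until the profile check passes or the timeout expires.

    is_profile_installed is a hypothetical callable returning True once
    the whitelisting profile is present on the device.
    """
    deadline = clock() + timeout
    while True:
        if is_profile_installed():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def deploy(is_profile_installed, install_package, **wait_options):
    """Install the package only after the profile has been detected."""
    if not wait_for_profile(is_profile_installed, **wait_options):
        # Profile never arrived: skip the install instead of triggering
        # the kernel extension warning.
        return False
    install_package()
    return True
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting. Returning `False` on timeout leaves it to the caller to decide whether to retry later or report the failure.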
This extra check might delay the deployment of applications. Still, I would argue the delay is a small price to pay to avoid the onslaught of support tickets sent by users who are worried about random warning alerts.