Duncan's Blog

Sunday, November 7, 2021

Using Cloudflare Access to Protect Home Assistant

In my last post, I mentioned how I've been using Cloudflare Access to secure my Home Assistant server.

If you're not familiar with Cloudflare, it's essentially a global content delivery network. It runs reverse proxies all around the world to provide customers with low-latency caching and DDoS protection. You put Cloudflare in front of your website and it makes the site faster.

Cloudflare Access adds two specific features that we can use to secure Home Assistant:

"Argo tunnels" change the way requests flow from Cloudflare to Home Assistant. Normally, with reverse proxies, the proxy makes a connection to the "origin" server (i.e. Cloudflare would make a connection to our Home Assistant server). With "Argo tunnels", we instead make a connection from the Home Assistant server to Cloudflare to establish a tunnel, and connections are proxied over this tunnel. This ensures that all connections come from Cloudflare directly and means we don't need to accept connections from the internet (e.g. with port forwarding).
"Cloudflare Access" is an additional mechanism to limit access to our server to authorized users only. It uses third-party OAuth providers (e.g. GitHub) to handle identity, and rules we create to decide who is authorized. This is the feature that prevents unauthorized users from even seeing the Home Assistant login screen.

This setup has been working fairly well for me for the last month or so, but it might not be for everyone.

Benefits

Secure: As I mentioned in the previous post, Cloudflare Access adds an important layer of security on top of exposing Home Assistant directly on the internet. Because Cloudflare is doing all the work, the only thing we need to do to keep this working and safe is to update cloudflared from time to time.

Free*: This is all available as part of Cloudflare's free tier. The only expense involved is that you need to own your own domain name. Depending on the TLD and the registrar this can be less than $10/year.

One stop shop: Cloudflare also takes care of DNS and SSL for you, so you don't need to worry about setting up Let's Encrypt. You can even buy domain names from them at cost.

Drawbacks

Configuration is complex: Unfortunately, I haven't found a good configuration guide on how to set up Cloudflare Access with Home Assistant. A lot of what I did was by trial and error. Making matters worse, the Cloudflare Access configuration was recently merged with Cloudflare for Teams, which seems to be an entirely separate unrelated product that we don't want.

Furthermore, Cloudflare Access also feels "bolted on" to Cloudflare's core functionality. While I hope this isn't the case, the way the configuration is laid out gives me the impression that if the Cloudflare Access config were to disappear, the system would "fail open" and expose Home Assistant directly to the internet.

Cookie-based authentication not well supported: Cloudflare Access's authentication scheme relies on cookies, and without cookies enabled, you won't be able to access the protected site. The Android app recently gained cookie support, but the iOS app (as far as I know) does not support cookies.

Trust & Privacy: Unlike Home Assistant Cloud, where your traffic is encrypted end-to-end, with Cloudflare Access, Cloudflare does SSL termination. As a result, Cloudflare has the technical means to intercept the traffic to your Home Assistant instance. I can't imagine this being an actual problem, but it's worth noting for completeness.

Handling of long-lived sessions: Cloudflare Access is configured with a maximum session length, and the longest setting is one month. When a session expires, it's not very graceful. Once a month we'll have to log in again with our GitHub credentials to gain access to Home Assistant, and then (possibly) again with our Home Assistant credentials.

In Conclusion

Cloudflare Access is a relatively new product and it's really exciting that it's available for free. It's great for securing remote access to home servers. I expect the configuration complexity will get ironed out soon.

Sunday, October 17, 2021

Protect Home Assistant from No-Auth Vulnerabilities

I've been using Home Assistant for several years. It's really great to be able to control all of our devices from a single place, and being able to adjust the lights or the thermostat from miles away never seems to get old.

The most common way to set up Home Assistant for remote access is to set up port forwarding or to use Home Assistant Cloud. Either of these puts Home Assistant on the internet, and lets you log in from anywhere.

But it also means that anyone else on the internet can get to your Home Assistant login screen, served by your Home Assistant server.

No-auth vulnerabilities

Home Assistant is designed to refuse access to anyone that doesn't provide valid credentials, but all software has bugs, and modern web applications are surprisingly complex. It's easy to imagine a bug that would allow someone on the internet to access Home Assistant, or worse, the underlying operating system, without successfully logging in.

This sort of no-auth vulnerability is common across the software world. This type of vulnerability has even affected Home Assistant users in the past. In January 2021, Home Assistant disclosed that several vulnerabilities were discovered in third-party custom integrations. Malicious users on the internet, without valid credentials, could read any file accessible to Home Assistant. This would include the Home Assistant database which might contain sensitive data such as location history. (In this particular case, only users with certain third-party integrations were affected, but this type of vulnerability could happen against Home Assistant core, too.)

So, how do we protect ourselves from this kind of vulnerability?

Defense in depth

We want to prevent random people on the internet from being able to talk to any part of Home Assistant at all while still allowing valid users to connect. The easiest way to do this is to put a reverse proxy (such as Nginx) between the internet and your Home Assistant server and configure it to do some sort of access control. You can be creative here: use another username/password, use client SSL certificates, set up IP allow-listing, or create some sort of complicated port knocking scheme.
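
To make one of those options concrete, here is a minimal sketch of the IP allow-listing idea in Python. It only shows the decision a proxy would make for each incoming connection, not a working proxy. The network ranges and the is_allowed() helper are made-up examples, not something Nginx or Home Assistant provides.

```python
import ipaddress

# Hypothetical allow-list: a home LAN range plus one example external address.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.168.1.0/24"),
    ipaddress.ip_network("203.0.113.7/32"),
]


def is_allowed(client_ip, networks=ALLOWED_NETWORKS):
    """Return True if client_ip falls inside any allow-listed network."""
    addr = ipaddress.ip_address(client_ip)
    # An address of a different IP version than the network is simply
    # "not in" it, so mixed IPv4/IPv6 lists are safe to check this way.
    return any(addr in net for net in networks)


if __name__ == "__main__":
    for ip in ("192.168.1.20", "198.51.100.4"):
        print(ip, "allowed" if is_allowed(ip) else "denied")
```

A real reverse proxy would do this check in its own configuration; the point is just that anything failing the check never reaches Home Assistant at all.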

It doesn't have to be perfect. We want to make it hard for anyone scanning around the internet looking to exploit a vulnerability. It might mean you have to log in twice to get to Home Assistant, once to satisfy the reverse proxy, and then once again for Home Assistant, but that's a pretty minor nuisance.

With this approach, Home Assistant is protected, but our reverse proxy could still be vulnerable to remote attacks. We're exchanging one problem for another, but it's probably a net win. The most popular reverse proxies attract a lot of scrutiny from security researchers, so it's likely that vulnerabilities will be discovered and fixed quickly. Still, we should be sure to set up our operating system's automatic updates to make sure we apply security patches quickly.

To the cloud!

We could reduce our risk somewhat further by putting our reverse proxy on a VM in the cloud and connecting to Home Assistant over a tunnel. This way any vulnerabilities would (hopefully) only impact the VM and not our Home Assistant server. Of course, this is easier said than done. I've been thinking about implementing something like this for years, but I've never gotten around to it because it's a pretty big endeavor.

Recently, I was very excited to learn about Cloudflare Access, which is a hosted product that pretty much does exactly this out of the box, without needing me to maintain the VM. And, it's now available on Cloudflare's free tier. I've been using it for about a month and it seems to work well.

In my next post I'll dive into the pros and cons of using Cloudflare Access to secure Home Assistant.

Saturday, September 25, 2021

It's been a while...

So it's been over eight years since my last post. It's amazing how time flies. I've changed the name of the blog, but the content going forward should be about the same as it has always been... which isn't saying much.

Sunday, April 14, 2013

Getting an IR receiver to work with a different remote with LIRC

I've been using MythTV for a long time, and recently I decided to upgrade my DVR to become a dedicated box, using Mythbuntu (replacing my old Debian setup).

The installation went pretty smoothly, but I had a ton of trouble getting my Hauppauge PVR-350 remote working with my Pinnacle PCTV USB TV Tuner / IR Receiver and LIRC. I know I got it working before on Debian only a month ago, so I figured it would be easy.

Of course, it wasn't. I think I spent longer trying to figure out what I'd done before than it would have taken me to fix it from scratch, and that's when I knew it was time to write a Google Fodder entry.

In recent versions of Linux, the kernel has support for many IR receivers built in, and exposes them to LIRC via the "devinput" protocol. LIRC itself does not have any idea what kind of hardware is being used, nor should it. (I mistakenly tried telling LIRC about the actual remote I was using, but that was a dead end.) By default, each piece of hardware loads the keymap associated with the corresponding remote. This mapping of hardware to keymap is done by udev and controlled by the file /etc/rc_maps.cfg. To tell udev to load a different keymap, you can change rc_maps.cfg to point to your custom remote in /etc/rc_keymaps/. The syntax of these files is a little finicky, so be sure to use ir-keytable to test your work. (Notably, I couldn't put comments in my keymap file.) Since /etc/rc_maps.cfg is used at boot time, the fix is actually very easy, which might explain why I forgot it. I've seen forum posts all over the place advocating for running some special ir-keytable commands at boot time, but that's not necessary.

Saturday, September 29, 2012

NoMachine NX, FreeNX, NeatX: Which should I use?

TL;DR: Don't use any of those, use X2Go.

I've used various different VNC servers and clients before, but I've never found them to be very useful. Most of the time the connection is too slow to get anything done. Recently, I found out about NoMachine NX, which has a bunch of really cool technology to make it actually reasonable to use your desktop machine over the internet. And, it runs over SSH, so you don't need to worry about opening up additional ports or encrypting everything.

My own use case was not terribly demanding; I want to access the computer on my desk (running Debian) from my laptop on the couch (running OS X), but I figured if NX works over the internet, it'll work even better over my local wireless network, right?

Unfortunately, NoMachine NX isn't free software. (Although some parts of it are.) Over the years several groups have tried to create alternative server implementations, striving for compatibility with the NoMachine client software.

I first tried FreeNX, using Ubuntu packages on Debian. (I suppose that should have been a red flag: never use software compiled for Ubuntu on Debian, or vice versa. It's just asking for trouble.) It almost worked. I could connect, but then it would immediately crash. I spent hours trying anything I could to fix it, to no avail.

I tried using NoMachine NX Free, the free-of-charge version of NoMachine NX. It installed everything in weird locations on my machine and that made me angry. Also, I couldn't get it working.

I then tried going back to FreeNX, recompiling all the Ubuntu packages from source on a Debian box to rule out version incompatibilities (which was surprisingly difficult). It wasn't until I was halfway through this that I discovered X2Go. The X2Go team maintains a lot of the packages used by FreeNX. Many of the underlying libraries are shared by the two products.

X2Go accomplishes everything that NoMachine NX does, except that it doesn't try for compatibility with the NoMachine NX Client; there is a separate X2Go client, with Windows, Linux and OS X support. That's a good thing, as it allows the X2Go project to control both the client and the server. And, it's packaged for Debian. I installed it and it just worked. Very simple.

So, long story short: use X2Go. Stay away from NoMachine NX, FreeNX, NeatX; they aren't worth your time.

Monday, January 9, 2012

Shutting worker threads down gracefully after a signal in Python

Recently I wrote about a bug in Python around handling of signals in multi-threaded programs. The upstream Python developers suggested that in order to properly handle signals in multi-threaded programs across several operating systems, developers should use some of the newer APIs in the signal library to make sure we get the behavior we want.

A common scenario for threads and signals is a daemon with long running worker threads that need to exit gracefully when SIGTERM or some other signal is received. The basic idea is:

Set up signal handlers
Spawn worker threads
Wait for SIGTERM, and wake up immediately when it is received
Shut down workers gracefully

The challenge is to write the simplest code that will do this portably, (at least) on FreeBSD and Linux, and that requires absolutely no CPU in the main thread while waiting (we want to be able to sleep completely, not have to wake up periodically to check for signals).

This is the best solution I've come up with (skip down to the bottom, it's where the interesting stuff is):

import errno
import fcntl
import os
import signal
import threading

NUM_THREADS = 2
_shutdown = False


class Worker(threading.Thread):

    def __init__(self, *args, **kwargs):
        threading.Thread.__init__(self, *args, **kwargs)
        self._stop_event = threading.Event()

    def run(self):
        # Do something.
        while not self._stop_event.isSet():
            print 'hi from %s' % (self.getName(),)
            self._stop_event.wait(10)

    def shutdown(self):
        self._stop_event.set()
        print 'shutdown %s' % (self.getName(),)
        self.join()


def sig_handler(signum, frame):
    print 'handled'
    global _shutdown
    _shutdown = True


if __name__ == '__main__':

    # Set up signal handling.
    pipe_r, pipe_w = os.pipe()
    flags = fcntl.fcntl(pipe_w, fcntl.F_GETFL, 0)
    flags |= os.O_NONBLOCK
    fcntl.fcntl(pipe_w, fcntl.F_SETFL, flags)
    signal.set_wakeup_fd(pipe_w)

    signal.signal(signal.SIGTERM, sig_handler)

    # Start worker threads.
    workers = [Worker() for i in xrange(NUM_THREADS)]
    for worker in workers:
        worker.start()

    # Sleep until woken by a signal.
    while not _shutdown:
        while True:
            try:
                os.read(pipe_r, 1)
                break
            except OSError, e:
                if e.errno != errno.EINTR:
                    raise

    # Shutdown work threads gracefully.
    for worker in workers:
        worker.shutdown()

Basically, we have to use set_wakeup_fd() to ensure that we can reliably wake up when a signal is delivered. The obvious function to use here (signal.pause()) doesn't work.
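
The listing above is Python 2. For anyone reading this on Python 3, here is a rough sketch of the same wakeup-fd idea. It assumes Python 3.5 or later on a POSIX system, and the run_until_signal() wrapper and its parameters are my own invention for illustration. One difference worth noting: on Python 3 the byte written to the wakeup fd is the signal number, so the sketch reads that instead of setting a global flag in the handler.

```python
import os
import signal
import threading


class Worker(threading.Thread):
    """Worker that loops until asked to stop."""

    def __init__(self):
        super().__init__()
        self._stop_event = threading.Event()

    def run(self):
        while not self._stop_event.is_set():
            # Real work would go here; wait() returns early once stop is set.
            self._stop_event.wait(10)

    def shutdown(self):
        self._stop_event.set()
        self.join()


def _noop_handler(signum, frame):
    """Exists only so the default action (terminate) does not apply."""


def run_until_signal(num_workers=2, signals=(signal.SIGTERM,)):
    """Start workers, sleep until a signal arrives, then stop the workers.

    Must be called from the main thread. Returns the signal number received.
    """
    pipe_r, pipe_w = os.pipe()
    os.set_blocking(pipe_w, False)  # the wakeup fd must be non-blocking
    old_fd = signal.set_wakeup_fd(pipe_w)
    old_handlers = {s: signal.signal(s, _noop_handler) for s in signals}

    workers = [Worker() for _ in range(num_workers)]
    try:
        for worker in workers:
            worker.start()
        # Block with no polling until a handled signal writes one byte
        # (the signal number) into the pipe.
        return os.read(pipe_r, 1)[0]
    finally:
        for worker in workers:
            if worker.is_alive():
                worker.shutdown()
        # Put the previous wakeup fd and handlers back.
        signal.set_wakeup_fd(old_fd)
        for sig, handler in old_handlers.items():
            if handler is not None:
                signal.signal(sig, handler)
        os.close(pipe_r)
        os.close(pipe_w)


if __name__ == "__main__":
    print("pid %d waiting for SIGTERM" % os.getpid())
    print("got signal %d, workers stopped" % run_until_signal())
```

Run it, then send the printed pid a SIGTERM from another terminal. The main thread uses no CPU while it waits, which was the whole point of the exercise.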

Monday, December 19, 2011

Signals and Threads with Python on FreeBSD

Over the last few months, I've been plagued by a fun bug in Python around handling of signals in multi-threaded programs on FreeBSD.

If you kill a multi-threaded program, FreeBSD will deliver the signal to any running thread, while Linux will only deliver the signal to the main thread. Python guarantees that as far as Python is concerned, only the main thread will handle the signal, but it makes no guarantees about anything else. Unfortunately, this leads to a few problems.

If you use the FreeBSD ports build of Python, for the most part you'll get correct thread and signal behavior. The upstream maintainers have installed a patch that basically blocks signals for all threads but the main thread. Unfortunately, this leads to one flaw. If you ever fork from a thread (e.g. to spawn a subprocess), the signals are not unblocked, so your subprocess is unkillable.

If you use a stock version of Python on FreeBSD, you get the following, different, problems. This makes it difficult to write portable code on FreeBSD and Linux.

Working with signals and threads in Python on FreeBSD (stock Python)

If you're writing multi-threaded code on FreeBSD, and you want to handle signals, you need to ensure that you are prepared to handle interrupted system calls in every thread, not just the main thread. Usually this means wrapping them in a try/except block like this:

while True:
    try:
        data = my_sock.recv(4096)
        if not data:
            break
        buffer.append(data)
    except socket.error, e:
        if e.errno != errno.EINTR:
            raise

On Linux, you only need to worry about this in the main thread. On FreeBSD, you need to worry about it everywhere.
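
If you have a lot of these calls, the retry loop can be pulled out into a small helper. This is only a sketch in Python 3 syntax, and retry_on_eintr() is a name I made up. As far as I know, Python 3.5 and later retry interrupted system calls on their own (PEP 475), so a helper like this mostly matters on older interpreters.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func until it finishes without being interrupted by a signal."""
    while True:
        try:
            return func(*args, **kwargs)
        except OSError as e:
            # Anything other than "interrupted system call" is a real error.
            if e.errno != errno.EINTR:
                raise
```

With that in place, a read in a worker thread becomes something like data = retry_on_eintr(my_sock.recv, 4096).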

The other thing you need to avoid is blocking indefinitely in the main thread. It's a common pattern to spawn a thread to handle connections and have your main thread wait for a signal or some other indication it's time to quit. Unfortunately, because of Python's assumptions about signals, this doesn't work on FreeBSD. Not even signal.pause() in the main thread will return when a signal is received. For example, the following code will never exit on FreeBSD.

import os
import signal
import threading
import time

def handler(signum, frame):
    print 'Signal %d handled' % (signum,)

def kill_me():
    time.sleep(1)
    print 'Suicide?'
    os.kill(os.getpid(), signal.SIGTERM)
    time.sleep(1)

if __name__ == '__main__':
    signal.signal(signal.SIGTERM, handler)
    threading.Thread(target=kill_me).start()
    signal.pause()
    print 'Got a signal, exit.'

The fix is to replace the blocking call (in the example above, the signal.pause() in the main thread) with a sleep-loop.

def my_signal_handler(signum, frame):
    global _run
    _run = False

_run = True
signal.signal(signal.SIGTERM, my_signal_handler)

# Spawn some threads ...

while _run:
    time.sleep(1)

# Join the threads ...

If your application has a need to handle signals faster, you'll want to have a shorter sleep. If you can tolerate a longer delay, pick a longer sleep time. This is obviously inefficient, but it's the best you can do with a stock Python.