ActionCable: Handling client connection errors on server side

Ping -> PONG, we will use this technique later

I'm still trying to choose the best blog platform so you can find the original article on Medium.

TLDR: a full code snippet at the end of the article.

If you have ever worked with WebSockets in Ruby closely enough to handle disconnections, you probably know that ActionCable, the Rails framework for working with WebSockets, has a peculiarity, or rather a defect, that shows up in scenarios such as a lost Internet connection. The root of the problem is that, out of the box, ActionCable does not notice (read: does not react to) a client losing its connection quickly enough.

The easiest way to reproduce this is to connect to a channel from a mobile phone and then switch the phone to airplane mode. ActionCable will keep transmitting messages to the WebSocket channel even though the subscriber is physically unreachable.

If you think the problem is a bit contrived, check out this link: github.com/rails/rails/issues/29307

That’s a real problem for mobile clients. And our goal is to implement a solution that eliminates this defect.

Main conditions:

  1. We want a plug-and-play solution (i.e. we can’t touch the frontend or require any client-side modifications)
  2. We don’t want to monkeypatch ActionCable (I hope there is no need to explain why)
  3. We don’t want to use Redis or any other external storage

Step by step: Some circles

First of all, let’s write some ActionCable code so that we can create subscriptions. I assume you have a Rails app with ActionCable (AC) already set up and mounted.

# app/channels/application_cable/connection.rb
module ApplicationCable
  class Connection < ActionCable::Connection::Base
    identified_by :uuid

    def connect
      self.uuid = SecureRandom.uuid
      logger.add_tags 'ActionCable', uuid
    end

    def disconnect
      ActionCable.server.remote_connections.where(uuid: uuid).disconnect
    end
  end
end

# app/channels/sample_channel.rb
class SampleChannel < ApplicationCable::Channel
  def subscribed
    stream_from "sample_channel_#{uuid}"
  end

  def unsubscribed; end
end

This code lets us easily create and destroy connections and identify them by :uuid. Now that we have drawn two circles, it’s time to draw the owl.

Step by step: The Owl

Let’s go back to our problem. For better or worse, there are not many ways to solve it other than ping/pong with regular keep-alive checks. And what luck: AC already has a beat method:

# File actioncable/lib/action_cable/connection/base.rb, line 116
def beat
  transmit type: ActionCable::INTERNAL[:message_types][:ping], message: Time.now.to_i
end

But, emmmmm… what about getting a response? The short answer: no.

Any network-related events like disconnections need to be handled by the caller. If your TCP socket becomes disconnected, the driver cannot detect that and you need to handle it yourself.

Understandable, have a great ping!
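
To make the limitation concrete: as far as I can tell, beat goes through transmit, which JSON-encodes the hash and sends it as an ordinary text message rather than as a WebSocket ping control frame. The sketch below only imitates that payload in plain Ruby (action_cable_style_ping is a hypothetical helper, not part of ActionCable), to show that nothing in the protocol obliges the client to answer it.

```ruby
require 'json'

# Hypothetical helper: imitates the hash that `beat` hands to `transmit`.
# The "ping" string mirrors ActionCable::INTERNAL[:message_types][:ping].
def action_cable_style_ping(now = Time.now)
  JSON.generate(type: 'ping', message: now.to_i)
end

puts action_cable_style_ping(Time.at(1_700_000_000))
# prints {"type":"ping","message":1700000000}
```

Because this is application data, a browser hands it to the JavaScript client as a regular message. A protocol-level ping frame, by contrast, is answered with a pong by the WebSocket implementation itself, which is what makes it usable without frontend changes.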

So we need our own ping, one that can also receive the pong sent back by the client. Let’s start by adding periodically, a method buried in the ActionCable docs that lets us define tasks to run at a fixed interval on a channel. We will periodically send ping messages and use Concurrent::TimerTask to schedule a delayed task that disconnects users who do not respond:

module ApplicationCable
  class Channel < ActionCable::Channel::Base
    CONNECTION_TIMEOUT = 4.seconds
    CONNECTION_PING_INTERVAL = 5.seconds
    periodically :track_users, every: CONNECTION_PING_INTERVAL

    def track_users
      ActionCable.server.connections.each do |conn|
        # Arm a timer that disconnects the client once the timeout elapses
        order66 = Concurrent::TimerTask.new(execution_interval: CONNECTION_TIMEOUT) do
          conn.disconnect
        end
        order66.execute

        # Stand down if the client answered in time
        if pong_received # something we will surely add
          order66.shutdown
        end
      end
    end
  end
end
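
The "arm a timer, cancel it when the pong arrives" idea can be tried out in isolation with nothing but the standard library. This is a minimal sketch, not the article's implementation: DeadlineTimer is a hypothetical class, and the two timers stand in for one silent and one responsive client. As far as I know, Concurrent::TimerTask fires repeatedly on every interval, so a one-shot timer like this one is a slightly closer match to the intent.

```ruby
# One-shot timer: runs the block after `timeout` seconds unless cancelled first.
# DeadlineTimer is a hypothetical name used only for this sketch.
class DeadlineTimer
  def initialize(timeout, &on_expire)
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @cancelled = false
    @thread = Thread.new { on_expire.call if wait_for_deadline(timeout) }
  end

  # Call this from the pong handler to keep the connection alive.
  def cancel
    @mutex.synchronize do
      @cancelled = true
      @cond.signal
    end
  end

  # Blocks until the timer has either fired or been cancelled.
  def join
    @thread.join
  end

  private

  # Returns true if the deadline passed without a cancel.
  def wait_for_deadline(timeout)
    @mutex.synchronize do
      deadline = monotonic_now + timeout
      until @cancelled
        remaining = deadline - monotonic_now
        break if remaining <= 0
        @cond.wait(@mutex, remaining)
      end
      !@cancelled
    end
  end

  def monotonic_now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

events = []
silent = DeadlineTimer.new(0.05) { events << :disconnect_silent_client }
alive  = DeadlineTimer.new(0.5)  { events << :disconnect_alive_client }
alive.cancel # simulates a pong arriving in time
[silent, alive].each(&:join)
p events
```

Only the silent client's callback runs; the cancelled timer wakes up early and exits without firing.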

The only missing piece of the puzzle is pong_received. To find it, we need to look through AC’s dependencies and figure out where we can get access to the WebSocket client. Those dependencies are nio4r and websocket-driver.

A Ctrl+F through the websocket-driver sources rewards us with a better version of the ping method: judging by the specs, it runs the given callback when the matching pong arrives, that is, when the client is alive.

# lib/websocket/driver/hybi.rb:131
def ping(message = '', &callback)
  @ping_callbacks[message] = callback if callback
  frame(message, :ping)
end

# spec/websocket/driver/hybi_spec.rb:449
it "runs the given callback on matching pong" do
  driver.ping("Hi") { @reply = true }
  driver.parse [0x8a, 0x02, 72, 105].pack("C*")
  expect(@reply).to eq true
end
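
The bytes in that spec are a complete pong frame: 0x8a is the FIN bit plus opcode 0xA (pong), 0x02 is the payload length, and 72, 105 spell "Hi". The sketch below decodes such a frame and replays the callback bookkeeping in plain Ruby. PingRegistry and decode_small_frame are hypothetical stand-ins written for illustration, not the driver's code, and the decoder deliberately ignores masking and payloads of 126 bytes or more (real client-to-server frames are masked).

```ruby
# Minimal sketch: handles only unmasked frames with payloads under 126 bytes.
OPCODES = { 0x1 => :text, 0x8 => :close, 0x9 => :ping, 0xA => :pong }.freeze

def decode_small_frame(bytes)
  first, second, *payload = bytes.unpack('C*')
  {
    fin: (first & 0x80) != 0,
    opcode: OPCODES.fetch(first & 0x0F),
    payload: payload.first(second & 0x7F).pack('C*').force_encoding(Encoding::UTF_8)
  }
end

# Hypothetical stand-in for the driver's ping bookkeeping.
class PingRegistry
  def initialize
    @ping_callbacks = {}
  end

  # Remembers the callback under the ping's message, like the driver does.
  def ping(message = '', &callback)
    @ping_callbacks[message] = callback if callback
  end

  # Runs and forgets the callback whose message matches an incoming pong.
  def receive(bytes)
    frame = decode_small_frame(bytes)
    return unless frame[:opcode] == :pong

    callback = @ping_callbacks.delete(frame[:payload])
    callback&.call
  end
end

registry = PingRegistry.new
reply = false
registry.ping('Hi') { reply = true }
registry.receive([0x8a, 0x02, 72, 105].pack('C*'))
puts reply # prints true
```

The callback is keyed by the ping's message, so a pong carrying a different payload would leave it untouched.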

Finally, let’s find some kind of interface to this driver in AC. Ironically, the real socket is wrapped in a way that minimizes our ability to reach it: the connection holds a WebSocket wrapper, which holds the client socket, which in turn holds the driver.

But, as we decided earlier, no monkeypatching. So our solution is not pretty, but it’s honest work: connection.instance_values['websocket'].instance_values['websocket'].instance_variable_get(:@driver)
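
Since this chain depends on private instance variables, any link may be missing (for example, when the request was not a WebSocket upgrade). A nil-safe walk can be sketched as follows. The Fake* classes only mimic the nesting and are not the real ActionCable classes, and dig_ivars is a hypothetical helper that uses plain instance_variable_get so it runs without ActiveSupport.

```ruby
# Stand-in classes that mimic the nesting; NOT the real ActionCable classes.
class FakeDriver; end

class FakeClientSocket
  def initialize
    @driver = FakeDriver.new
  end
end

class FakeWebSocket
  def initialize(client_socket)
    @websocket = client_socket
  end
end

class FakeConnection
  def initialize(websocket)
    @websocket = websocket
  end
end

# Walks a chain of instance variables, returning nil as soon as a link is missing.
def dig_ivars(object, *names)
  names.reduce(object) do |current, name|
    break nil if current.nil?

    current.instance_variable_get(name)
  end
end

connection = FakeConnection.new(FakeWebSocket.new(FakeClientSocket.new))
driver = dig_ivars(connection, :@websocket, :@websocket, :@driver)
puts driver.class # prints FakeDriver
```

Keeping the lookup in one helper also means there is a single place to update if the internals change in a future Rails release.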

And the full code is below (TLDR):

module ApplicationCable
  class Channel < ActionCable::Channel::Base
    CONNECTION_TIMEOUT = 4.seconds
    CONNECTION_PING_INTERVAL = 5.seconds
    periodically :track_users, every: CONNECTION_PING_INTERVAL

    def track_users
      ActionCable.server.connections.each do |conn|
        order66 = Concurrent::TimerTask.new(execution_interval: CONNECTION_TIMEOUT) do
          conn.disconnect
        end
        order66.execute

        # Reach the websocket-driver instance through the private wrappers
        driver = conn.instance_values['websocket'].instance_values['websocket'].instance_variable_get(:@driver)
        # The block runs only when the matching pong arrives, cancelling the timer
        driver.ping { order66.shutdown }
      end
    end
  end
end

Or another version, if you prefer not to rely on an unfamiliar concurrency library:

module ApplicationCable
  class Channel < ActionCable::Channel::Base
    CONNECTION_TIMEOUT = 10.seconds
    CONNECTION_PING_INTERVAL = 5.seconds

    after_subscribe :connection_monitor

    periodically every: CONNECTION_PING_INTERVAL do
      @driver&.ping
      # Disconnect if no pong has arrived within the timeout
      if Time.now - @_last_request_at > @_timeout
        connection.disconnect
      end
    end

    def connection_monitor
      @_last_request_at ||= Time.now
      @_timeout = CONNECTION_TIMEOUT
      @driver = connection.instance_variable_get('@websocket').possible?&.instance_variable_get('@driver')
      # Refresh the timestamp every time the client answers a ping
      @driver&.on(:pong) { @_last_request_at = Time.now }
    end
  end
end
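
The "last seen" bookkeeping in this version can be checked on its own, without a running server. Below is a minimal sketch under stated assumptions: LivenessTracker is a hypothetical class, and the clock is injected so the timeout can be exercised without sleeping. It defaults to a monotonic clock, a design choice of this sketch rather than of the article, because wall-clock time can jump.

```ruby
# Hypothetical helper: remembers when the client was last heard from.
class LivenessTracker
  def initialize(timeout:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @timeout = timeout
    @clock = clock
    @last_seen_at = clock.call
  end

  # Call from the pong handler.
  def touch
    @last_seen_at = @clock.call
  end

  # Call from the periodic check; true means "disconnect this client".
  def expired?
    @clock.call - @last_seen_at > @timeout
  end
end

now = 0.0
tracker = LivenessTracker.new(timeout: 10, clock: -> { now })

now = 5.0
tracker.touch          # pong received at t = 5
now = 14.0
puts tracker.expired?  # prints false (9 s since the last pong)
now = 15.5
puts tracker.expired?  # prints true (10.5 s since the last pong)
```

With a 5-second ping interval and a 10-second timeout, a client has to miss roughly two consecutive pings before it is dropped.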

Hope this article helped you somehow!

Some links for research-lovers:

  1. github.com/rails/rails/issues/24908 — just an interesting discussion on the problem
  2. w3.org/Bugs/Public/show_bug.cgi?id=13104 — Enable keepalive on WebSocket API
  3. github.com/faye/websocket-driver-ruby/issue..
  4. gist.github.com/radalin/8a250c85a8f9bd8c727.. — one of solutions, but it has frontend modifications
  5. blog.heroku.com/real_time_rails_implementin.. — just cool pics
  6. github.com/faye/websocket-driver-ruby/blob/.. — ws-driver sources
  7. github.com/faye/faye-websocket-ruby/blob/0... — ws sources