I started writing this post after reading this blog post published by the Stanford Internet Observatory: https://cyber.fsi.stanford.edu/io/news/clubhouse-china. Parts of it didn't make a lot of sense to me, so this is a deep dive into Clubhouse's backend to see what they use on the infrastructure side.
The Clubhouse iOS app has SSL pinning in place, so observing its traffic requires a jailbroken Apple device with SSL pinning disabled.
Here is the Clubhouse stack as of Feb 16, 2021:
Cloudflare - hosting the APIs
AWS - S3 for storing profile images
Amplitude - product analytics
Data Theorem - app security
PubNub - the actual voice rooms
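A quick way to sort captured requests into these buckets is to match on the hostname. This is only a minimal Python sketch: the hint strings are assumptions based on the hostnames mentioned in this post (the S3 bucket name is redacted, so only the generic amazonaws.com suffix is matched), and the URLs in the usage lines use a placeholder path rather than real endpoints.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hostname hint -> vendor, based on the stack list in this post.
# These hints are assumptions for illustration, not a complete inventory.
VENDOR_HINTS = {
    "clubhouseapi.com": "Clubhouse API (behind Cloudflare)",
    "amazonaws.com": "AWS S3 (profile images)",
    "amplitude.com": "Amplitude (analytics)",
    "datatheorem": "Data Theorem (app security)",
    "pubnub.com": "PubNub (rooms)",
}


def vendor_for(url):
    """Return the vendor label whose hint appears in the URL's hostname."""
    host = urlsplit(url).hostname or ""
    for hint, vendor in VENDOR_HINTS.items():
        if hint in host:
            return vendor
    return "unknown"


def summarize(urls):
    """Count captured requests per vendor."""
    return Counter(vendor_for(u) for u in urls)


# Usage with placeholder paths (the real endpoints are redacted in this post).
captured = [
    "https://www.clubhouseapi.com/redacted",
    "https://api.amplitude.com/redacted",
    "https://clubhouse.pubnub.com/redacted",
]
for vendor, count in summarize(captured).items():
    print(vendor, count)
```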
Here are some detailed steps that explain which APIs support which features. From what I see, the implementation is simple and straightforward.
When a user first downloads Clubhouse and opens the app, they can either sign in or request an invite.
At this point two calls go out from the app. One goes to the overmind.datatheorem endpoint with a bunch of metadata about certs and validation results. The other goes to api.amplitude.com and carries logs about the current session, phone metadata and so on. This is common logging information that all modern apps collect. As you use the app, you see more logs being posted to the api.amplitude.com endpoint.
Once you enter your phone number, and if you have an invite, you get a text message and can log in. The backend hostname is www.clubhouseapi.com, which seems to be hosted on Cloudflare infrastructure.
Below are some example calls with the actual endpoints redacted.
Once you log in, you see a similar screen. If you build products, you can read it as a bunch of features.
On the top left you have search, invite friends, upcoming for you, notifications, profile, etc. On the main screen you have your feed, which is customized to your likes. At the bottom of the screen you see 'Start a room' and the who-is-online feature.
So you can say there are about 8-10 features, which means about 8-10 APIs need to be implemented on the backend to support them.
These features are implemented as separate endpoints, for example 'search', 'settings', etc. They are all served by www.clubhouseapi.com under different API endpoints, and the profile images are loaded directly from an S3 URL. I am redacting the S3 bucket name that they use. Below is a snapshot of requests being fired when using the app.
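To tally which endpoints each host serves, the capture can be grouped by hostname and path. The sketch below assumes the traffic was exported in the HAR format (a "log" object holding "entries", each with a "request" that has "method" and "url"); the post does not say which proxy was used, and the paths in the sample are made up for illustration since the real ones are redacted.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def endpoints_by_host(har):
    """Map each hostname to the sorted list of (method, path) pairs seen."""
    seen = defaultdict(set)
    for entry in har["log"]["entries"]:
        request = entry["request"]
        parts = urlsplit(request["url"])
        seen[parts.hostname].add((request["method"], parts.path))
    return {host: sorted(paths) for host, paths in seen.items()}


# Hypothetical sample capture; these paths are placeholders, not real endpoints.
sample_har = {
    "log": {
        "entries": [
            {"request": {"method": "POST", "url": "https://www.clubhouseapi.com/api/search"}},
            {"request": {"method": "GET", "url": "https://www.clubhouseapi.com/api/settings"}},
            {"request": {"method": "POST", "url": "https://www.clubhouseapi.com/api/search"}},
        ]
    }
}

for host, endpoints in endpoints_by_host(sample_har).items():
    print(host, len(endpoints), "distinct endpoints")
```

Counting distinct (method, path) pairs per host is a rough way to check the "one endpoint per feature" estimate against a real capture.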
The two important features here are joining a room and creating a room. When the feed of available rooms loads, each room has a unique channel name and Channel_id, as you can see below.
When you click Join Channel, the www.clubhouseapi.com endpoint is called with this channel name, and the backend responds with the token information that is necessary to join the room and hear others. From what I see, Clubhouse uses pubnub.com for this.
The recent study from Stanford (https://cyber.fsi.stanford.edu/io/news/clubhouse-china) says they use Agora, but I don't see that happening in the traffic I captured. Below is a sample response of what the Clubhouse API returns to the iOS client.
Then the user joins the already created channel. You can see the requests to the PubNub servers while the call is active.
Something similar happens when you create a room: instead of joining an existing room, you create a new room and join that one.
While you are on a call, Clubhouse needs to know whether you have closed the app. If you quit the app by swiping up, Clubhouse won't know that you have left the call, so they have implemented a ping API call that sends a ping every 45 seconds to let the backend know that the user is still on the call.
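The ping mechanism is a standard heartbeat. Here is a minimal Python sketch of how a backend could use such pings to decide who is still in a room. Only the 45-second interval comes from the observed traffic; the class name, the timeout of two intervals, and the data layout are all assumptions, not Clubhouse's actual implementation.

```python
import time

PING_INTERVAL = 45            # seconds, as observed from the app
TIMEOUT = 2 * PING_INTERVAL   # assumption: tolerate one missed ping


class PresenceTracker:
    """Hypothetical server-side tracker of who is still on a call."""

    def __init__(self, timeout=TIMEOUT, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock
        self._last_seen = {}  # (channel, user_id) -> time of last ping

    def ping(self, channel, user_id):
        """Record a heartbeat from a user in a channel."""
        self._last_seen[(channel, user_id)] = self._clock()

    def active_users(self, channel):
        """Return users whose last ping is within the timeout window."""
        now = self._clock()
        return sorted(
            user
            for (chan, user), seen in self._last_seen.items()
            if chan == channel and now - seen <= self._timeout
        )


# Usage with a fake clock so the example runs instantly.
fake_now = [0.0]
tracker = PresenceTracker(clock=lambda: fake_now[0])
tracker.ping("room-1", "alice")
tracker.ping("room-1", "bob")
fake_now[0] = 45.0
tracker.ping("room-1", "alice")   # bob swiped the app away and stops pinging
fake_now[0] = 100.0
print(tracker.active_users("room-1"))
```

With this approach the backend never needs an explicit "leave" message: a user who kills the app simply stops pinging and drops out once the timeout passes.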
Two hostnames are of interest in the logs above.
www.clubhouseapi.com - This hostname hosts the actual API that powers Clubhouse, and it sits behind Cloudflare. Cloudflare uses Anycast to route traffic to the servers nearest to the user. This may be one of the reasons why the Stanford research team found traffic passing through servers in China: users in China would be routed to the closest servers. The origin servers are hosted somewhere in the US, so that traffic would then travel over Cloudflare's backbone network.
clubhouse.pubnub.com - This is the hostname that is used for all the rooms. PubNub is hosted on AWS and seems to run in multiple regions. So when you join a Clubhouse room from anywhere in the world, you would actually connect to the closest AWS region, and PubNub should handle the rest in terms of data transfer.
So, from the above sleuthing, I don't see anything wrong with how Clubhouse has implemented its app or its backend, or with where the data is hosted. I am super excited to see a social app take off after a decade of just Facebook and Twitter.