From cb54ae7dbce8a8b0a5a81a3e08965fc7ec32a5ce Mon Sep 17 00:00:00 2001
From: =?utf8?q?Duje=20Mihanovi=C4=87?= <duje.mihanovic@skole.hr>
Date: Mon, 15 Jan 2024 17:03:43 +0100
Subject: [PATCH] New post: Matrix delegation and how it may...

---
 content/posts/matrix-delegation/index.md | 76 ++++++++++++++++++++++++
 1 file changed, 76 insertions(+)
 create mode 100644 content/posts/matrix-delegation/index.md

diff --git a/content/posts/matrix-delegation/index.md b/content/posts/matrix-delegation/index.md
new file mode 100644
index 0000000..76324e6
--- /dev/null
+++ b/content/posts/matrix-delegation/index.md
@@ -0,0 +1,76 @@
+---
+title: "Matrix delegation and how it may bite you"
+date: 2024-01-14T11:19:48+01:00
+summary: One of the ways small details can cause big issues.
+---
+For those who don't know, delegation in Matrix is used in server-to-server
+communication to figure out which server serves a given domain. As an example,
+if my own Matrix homeserver were running on `matrix.dujemihanovic.xyz` instead
+of `dujemihanovic.xyz`, I could delegate the latter to the former to save anyone
+wanting to contact me from having to type out the `matrix.`.
+
+Besides the domain name, delegation can also specify which port to use for
+server-to-server communication. The default is `8448`, and if it's blocked you
+can use delegation to move server-to-server traffic to `443`, the port
+client-to-server uses by default. However, if you can, **I'd strongly suggest
+using `8448`!** I had been delegating S2S to `443` for almost the whole time I
+have had this server, for no particular reason, and it seems that this caused an
+extremely weird issue with a certain room.
+
+## What happened?
+
+Message fetching kept breaking **constantly**. What I mean by that is that when
+I joined the room everything would work fine for the first few messages, but at
+some point I would start getting notifications without any new message showing
+up in that room. I noticed that logging out and back in would bring the missing
+messages into my client, but then the aforementioned cycle would repeat no
+matter how many times I logged out and back in *(this also happened on other
+clients besides Element desktop)*. To confirm my homeserver was the issue, I
+joined the room with my old matrix.org account and sure enough that worked just
+fine.
+
+I tried the usual things such as restarting Dendrite and the whole VPS, but to
+no avail. I was pretty insistent that the issue was not with my homeserver but
+with the main server hosting the room *(which, unsurprisingly, turned out to be
+false)* and so I gave up on that. The eye-opening moment was me reading the
+[Conduit
+documentation](https://gitlab.com/famedly/conduit/-/blob/next/DEPLOY.md) *(I had
+considered migrating to it)*, specifically this:
+
+> If Conduit runs behind Cloudflare reverse proxy, which doesn't support port
+> 8448 on free plans,
+
+This implies that routing server-to-server traffic to `443` should only be done
+if it's **absolutely impossible** to use `8448` for this, and the [Synapse
+documentation](https://matrix-org.github.io/synapse/latest/delegate.html#when-do-i-need-delegation)
+said something similar:
+
+> **However**, if your homeserver's APIs aren't accessible on port 8448 and on
+> the domain server_name points to, you will need to let other servers know how
+> to find it using delegation.
+
+## Fixing the issue
+
+Encouraged by this, I fixed up my server:
+
+* allow port `8448` in `ufw`
+* add something like this to `Caddyfile`:
+```
+dujemihanovic.xyz:8448 {
+        reverse_proxy /_matrix/* localhost:8008
+}
+```
+* change `/.well-known/matrix/server` to point to `dujemihanovic.xyz:8448` *(in
+  theory, I could have gotten rid of that `return` directive altogether as
+  `8448` is default anyway, but I still chose to specify it just to be safe)*
+* reload `caddy` and restart `dendrite` *(the latter is, again, just to be
+  safe)*
+
+Once all this was done, the room finally started acting normally.
+
+## Small sidenote
+
+I must note that delegating federation to `443` **should not cause breakage like
+this**. Even so, it did in my case, and for that reason I wrote about it
+anyway. It's very unlikely that you will be affected by this issue, but I still
+believe it is worth pointing out in case you are.
-- 
2.39.5