Welcome

This is my public repository of various guides and resources - it's a collection of personal notes, shared in the hope that it comes in handy for anyone interested in tech, especially areas like server management and networking.

About this Project

This site is built from my GitHub repository using mdBook, a utility to create online books from Markdown files. It's a simple but powerful tool that lets me write in Markdown and have it quickly compiled into a structured, readable site.

How It's Built

  1. Content Structure: All documentation is stored in the src directory, specified in book.toml (a minimal example is sketched below).
  2. Continuous Deployment: The .github/workflows/mdbook.yml file configures GitHub Actions, so whenever there's a push to the repository, GitHub Actions auto-builds the project using mdBook and deploys it to GitHub Pages.
  3. Customization: Various CSS and JavaScript files (src/assets/) are included to tweak the appearance and behaviour of the site.
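
For reference, a minimal book.toml for a site like this might look something like the sketch below - the title and asset paths here are illustrative assumptions rather than the repository's actual values:

[book]
title = "My Knowledge Base" # Illustrative title
src = "src"                 # Markdown sources live here, as mentioned above

[output.html]
additional-css = ["src/assets/custom.css"] # Hypothetical extra styling
additional-js = ["src/assets/custom.js"]   # Hypothetical extra behaviour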

Editing and Contributions

I do accept GitHub Issues and Pull Requests - feel free to propose changes or enhancements, especially if you notice anything inaccurate or in need of extra clarification. You can click the edit icon at the top right of any page to jump to the GitHub editor for that file: simply fork the repo and submit a Pull Request, and I'll get back to you as soon as possible.

Happy reading!

Matrix Synapse Homeserver Guides

This section of my website aims to be a one-stop hub for getting into the Matrix ecosystem and making the most of a Synapse server, from configuring it for the first time to delving into more complex development.

Synapse is the reference homeserver for the Matrix ecosystem – an open-source, decentralised communication platform. It's incredibly flexible, letting you chat, call, and custom-build your own communication solutions.

Getting started with Synapse can be daunting though, so I'm collating a range of guides and resources, from straightforward installation walkthroughs to more intricate development tips - hopefully there'll be something for everyone.

If you have any questions, just drop in to the Synapse Admins room and we'll do our best to help. Now, time to roll up your sleeves, and let's get started!

Deploying a Synapse Homeserver with Docker

  1. Introduction
  2. Model Explanation
  3. Getting Started

Introduction

Welcome to my deployment guide for setting up a Matrix Synapse Homeserver using Docker. Our setup focuses on scalability and efficiency, using multiple "workers" to distribute workload and enhance performance.

Synapse, as the reference homeserver implementation for the Matrix network, offers flexible and decentralised communication capabilities, including messaging, voice calls, and custom integrations.

In a closed system, it performs exceptionally well on default settings, allowing many users to communicate in thousands of chats. However, once it's opened up to the internet to join public rooms, it can struggle to keep up with the hundreds or thousands of other servers it needs to talk to at the speed you'd expect from an instant communication platform.

The official documentation describes a default worker model, but the design in this guide is optimised for a family or team of users who want to access federated rooms at high speed without using lots of extra CPU and RAM.

Model Explanation

In this deployment, we use a variety of Synapse workers, each with a specific role.

We want to give each worker plenty of work to do, so it's not just sitting around using memory for no reason, but also make sure we're not overwhelming individual workers in ways that impact the service.

Here's a diagram of how requests should flow once we're done:

graph TD;
    A[Client\nRequests] --> B[Client Sync &\nStream Writers];
    A --> C[Main\nProcess];
    C --> B;
    C --> E[Background &\nEvent Writer];
    A --> D[Room\nWorkers];
    F[Federation\nRequests] --> D;
    F --> G[Federation\nReader]
    G --> B;
    G --> E;
    H[Media\nRequests] --> I[Media\nRepository];
  • Main Process: Some requests can only go to the Synapse main process, but we also send client requests there when they don't include a room ID. On a server with only a few active users this load is very light (typically signing keys and profile information), so these requests are safe to send here.

  • Client Sync & Stream Writers: A user's main source of truth is from the sync feed, which we're dedicating to the Client Sync worker. By also having this worker responsible for most Stream Writing responsibilities, all other workers send it the typing/receipts/etc events they're aware of, to deliver them directly to users that need to know about them as quickly as possible.

  • Room Workers (4 Instances): When a user is trying to interact with a specific room, it makes sense to store the cache for a single room in a single worker to minimise the amount of caching each worker needs to do. Using load balancing that identifies requests with a room ID in them, we can send all requests for the same room to just one of the Room Workers, so as your server grows, you can simply add more Room Workers to spread the rooms across more workers.

  • Federation Reader: When other servers are sending new data, these requests don't advertise the room ID in the URI, so we collect these on a single Federation Reader, which forwards the events to the Stream/Event Writers. All other requests from another homeserver that specify a room ID in them can go to the same Room Worker the clients use, which helps to make the most of its cache.

  • Media Repository: For media requests, we send these to a dedicated media worker, which handles uploads of attachments/images, generates thumbnails, and provides downloads to both local clients and remote servers.

  • Background Tasks & Event Writing: There are a number of background roles, including maintenance tasks, "pushing" notifications to users, and sending updates to any AppServices you have (like bridges), that are generally quite low-stress for a server with only a few active users, so we combine these with the Event Writer, which is typically only busy when joining a very large/complex room.

  • Federation Senders (4 Instances): These aren't displayed above, as they don't handle inbound requests, but we have several to balance the load so you can communicate with even the largest rooms.

Getting Started

To get your Synapse Homeserver up and running, follow the configuration guides for each component. The end result should be a powerful, self-hosted communication platform. And as always, if questions pop up or you hit a snag, the Synapse Admins room is there to lend a hand.

Deploying a Synapse Homeserver with Docker

Docker Compose with Templates

  1. Docker Compose with Templates
  2. Docker Engine
  3. Environment Files
  4. YAML Templating
  5. Unix Sockets
  6. Redis
  7. PostgreSQL Database
  8. Synapse

Docker Engine

If Docker is not already installed, visit the official guide and select the correct operating system to install Docker Engine.

Once complete, you should now be ready with the latest version of Docker, and can continue the guide.
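
Once it's installed, you can quickly confirm that both Docker Engine and the Compose plugin are available (the exact version numbers will differ) before moving on:

docker --version
docker compose version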

Environment Files

Before creating the Docker Compose configuration itself, let's define the environment variables for the containers - these will live next to docker-compose.yml as .synapse.env and .postgres.env respectively, matching the env_file entries used in the templates below:

  • Synapse:

    SYNAPSE_REPORT_STATS=no
    SYNAPSE_SERVER_NAME=mydomain.com
    UID=1000
    GID=1000
    TZ=Europe/London
    
  • PostgreSQL:

    POSTGRES_DB=synapse
    POSTGRES_USER=synapse
    POSTGRES_PASSWORD=SuperSecretPassword
    POSTGRES_INITDB_ARGS=--encoding=UTF-8 --lc-collate=C --lc-ctype=C
    

YAML Templating

Using YAML Anchors lets you cut down the repeated lines in the config and simplify updating values uniformly.

Docker doesn't try to create anything from blocks starting with x- so you can use them to define an &anchor that you can then recall later as an *alias.

It's easier to show than explain, so take a look at this example, where we establish basic settings for all containers, then upper limits on CPU and RAM for different sizes of container:

version: "3"

x-container-template: &container-template
  depends_on:
    - init-sockets
  restart: always

x-small-container: &small-container
  <<: *container-template
  cpus: 1
  mem_limit: 0.5G

x-medium-container: &medium-container
  <<: *container-template
  cpus: 2
  mem_limit: 4G

x-large-container: &large-container
  <<: *container-template
  cpus: 4
  mem_limit: 8G

Now we've defined these, we can extend further with more specific templates, first defining what a Synapse container looks like, and variants for the two main types of worker:

x-synapse-template: &synapse-template
  <<: *medium-container
  depends_on:
    - init-sockets
    - db
    - redis
  env_file: .synapse.env
  image: matrixdotorg/synapse:latest
  volumes:
    - sockets:/sockets
    - ./logs:/data/logs
    - ./media:/media
    - ./synapse:/data

x-synapse-worker-template: &synapse-worker-template
  <<: *synapse-template
  depends_on:
    - synapse
  environment:
    SYNAPSE_WORKER: synapse.app.generic_worker

x-synapse-media-template: &synapse-media-template
  <<: *synapse-template
  depends_on:
    - synapse
  environment:
    SYNAPSE_WORKER: synapse.app.media_repository

x-postgres-template: &postgres-template
  <<: *medium-container
  depends_on:
    - init-sockets
  image: postgres:16-alpine
  env_file: .postgres.env
  shm_size: 1G

Now this is done, we're ready to start actually defining resources!
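
Before that, here's a quick illustration of how these anchors are consumed: a hypothetical container can pull in one of the size templates and still override individual values - the service name and image here are just placeholders:

  example-worker:
    <<: *small-container   # Inherits depends_on and restart from the templates above
    image: alpine:latest   # Placeholder image
    mem_limit: 1G          # Overrides the 0.5G limit from small-container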

Unix Sockets

If all of your containers live on the same physical server, you can take advantage of Unix sockets to bypass the entire network stack when containers need to talk to each other.

This may sound super technical, but in short, it means two different programs can speak directly via the operating system instead of opening a network connection, reducing the time it takes to connect. Synapse is constantly passing messages between workers and replicating data, so this one change makes a measurable difference to client performance for free!

First, let's define a volume to store the sockets. As the sockets are tiny, we can use tmpfs so it's stored in RAM to make the connections even faster and minimise disk load:

volumes:
  sockets:
    driver_opts:
      type: tmpfs
      device: tmpfs

I then recommend a tiny "init-sockets" container that runs before the others, to make sure the ownership and permissions on the socket directory are set correctly before anything starts trying to write to it:

services:
  init-sockets:
    command:
      - sh
      - -c
      - |
        chown -R 1000:1000 /sockets &&
        chmod 777 /sockets &&
        echo "Sockets initialised!"
    image: alpine:latest
    restart: "no"
    volumes:
      - sockets:/sockets

Redis

To use sockets, Redis requires an adjustment to the launch command, so we'll define that here:

  redis:
    <<: *small-container
    command: redis-server --unixsocket /sockets/synapse_redis.sock --unixsocketperm 660
    image: redis:alpine
    user: "1000:1000"
    volumes:
      - sockets:/sockets
      - ./redis:/data

PostgreSQL Database

Now we can define our PostgreSQL database:

  db:
    <<: *postgres-template
    volumes:
      - sockets:/sockets
      - ./pgsql16:/var/lib/postgresql/data

And if you're following my backups guide, it's now as easy as this to deploy a replica:

  db-replica:
    <<: *postgres-template
    environment:
      POSTGRES_STANDBY_MODE: "on"
      POSTGRES_PRIMARY_CONNINFO: host=/sockets user=synapse password=${SYNAPSE_PASSWORD}
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -h /sockets -p 5433 -U synapse"]
      interval: 5s
      timeout: 5s
      retries: 5
    volumes:
      - sockets:/sockets
      - ./pgrep16:/var/lib/postgresql/data

You can change the "pgsql16" or "pgrep16" paths if you prefer - just make sure to do it before starting for the first time, or you'll need to rename the directories on disk at the same time to avoid any data loss.

Synapse

With all of our templates above, Synapse itself is this easy:

  synapse:
    <<: *synapse-template

In the next sections, we just need to set up the config files for each of these applications and then you're ready to go.

Deploying a Synapse Homeserver with Docker

Synapse Configuration

  1. Synapse Configuration
  2. Default File
  3. Log Config
  4. Homeserver Config
  5. Cache Optimisation

Default File

Before we can modify the Synapse config, we need to create it.

Run this command to launch Synapse only to generate the config file and then close again:

docker compose run -it synapse generate && docker compose down -v

In your "synapse" directory you should now find a number of files like this:

/synapse# ls -lh
total 16K
-rw-r--r-- 1 root root  694 Dec 20 23:20 mydomain.com.log.config
-rw-r--r-- 1 root root   59 Dec 20 23:20 mydomain.com.signing.key
-rw-r--r-- 1 root root 1.3K Dec 20 23:20 homeserver.yaml

The signing key is unique to your server and is vital to maintain for other servers to trust yours in the future. You can wipe the entire database and still be able to federate with other servers if your signing key is the same, so it's worthwhile backing this up now.
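
For example, a simple copy to somewhere outside the Synapse directory is enough - adjust the filename for your own domain and the destination to wherever you keep backups:

cp synapse/mydomain.com.signing.key /path/to/backups/mydomain.com.signing.key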

Log Config

The default log config is very barebones and just logs straight to the console, but you could replace it with something like this to keep a daily log for the past 3 days in your "logs" folder:

version: 1
formatters:
  precise:
    format: '%(asctime)s - %(name)s - %(lineno)d - %(levelname)s - %(request)s - %(message)s'

handlers:
  file:
    class: logging.handlers.TimedRotatingFileHandler
    formatter: precise
    filename: /logs/synapse.log
    when: midnight
    backupCount: 3
    encoding: utf8

  buffer:
    class: synapse.logging.handlers.PeriodicallyFlushingMemoryHandler
    target: file
    capacity: 10
    flushLevel: 30
    period: 5

loggers:
    synapse:
        level: INFO
        handlers: [buffer]
    synapse.storage.SQL:
        level: INFO
        handlers: [buffer]
    shared_secret_authenticator:
        level: INFO
        handlers: [buffer]

root:
    level: INFO
    handlers: [buffer]

Homeserver Config

By default, this file is quite short and relies a lot on defaults. There is no harm in adding blank lines between entries here to make it more readable, or adding comments (starting with the # hash character) to explain what lines mean.

Note: The "secret" or "key" lines are unique to your server and things are likely to misbehave if you change some of them after the server is running. It's generally best to leave them safe at the bottom of the file while you work on the other values.

Here's an example with comments you may wish to use to start with some safe defaults:

# Basic Server Details
server_name: "mydomain.com" # Domain name used by other homeservers to connect to you
public_baseurl: "https://matrix.mydomain.com/" # Public URL of your Synapse server
admin_contact: "mailto:admin@mydomain.com" # Contact email for the server admin
pid_file: "/data/process.pid" # File that stores the process ID of the Synapse server
signing_key_path: "/data/mydomain.com.signing.key" # Location of the signing key for the server

# Logging and Monitoring
log_config: "/data/log.config/synapse.log.config" # Path to the logging configuration file
report_stats: false # Whether to report anonymous statistics
enable_metrics: false # Enable the metrics listener to monitor with Prometheus

# Login and Registration
enable_registration: false # Whether to allow users to register on this server
enable_registration_captcha: true # Whether to enable CAPTCHA for registration
enable_registration_without_verification: false # Allow users to register without email verification
delete_stale_devices_after: 30d # Devices not synced in this long will have their tokens and pushers retired
password_config:
  enabled: true # Set to false to only allow SSO login

# Database and Storage Configuration
database:
  name: psycopg2 # PostgreSQL adapter for Python
  args:
    user: synapse # Username to login to Postgres
    password: SuperSecretPassword # Password for Postgres
    database: synapse # Name of the database in Postgres
    host: "/sockets" # Hostname of the Postgres server, or socket directory
    cp_min: 1 # Minimum number of database connections to keep open
    cp_max: 20 # Maximum number of database connections to keep open

# Redis Configuration
redis:
  enabled: true # Required for workers to operate correctly
  path: "/sockets/synapse_redis.sock" # Path to Redis listening socket

# Network Configuration
listeners:
  - path: "/sockets/synapse_replication_main.sock" # Path to Unix socket
    type: http # Type of listener, almost always http
    resources:
      - names: [replication] # Replication allows workers to communicate with the main thread
        compress: false # Whether to compress responses
  - path: "/sockets/synapse_inbound_main.sock" # Path to Unix socket
    type: http # Type of listener, almost always http
    x_forwarded: true # Use the 'X-Forwarded-For' header to recognise the client IP address
    resources:
      - names: [client, federation] # Client API and federation between homeservers
        compress: false # Whether to compress responses
  - type: metrics # Used for Prometheus metrics later
    port: 10101 # Easy port to remember later?

# Workers will eventually go here
instance_map:
  main: # The main process should always be here
    path: "/sockets/synapse_replication_main.sock"

# Trusted Key Servers
trusted_key_servers: # Servers to check for server keys when another server's keys are unknown
  - server_name: "beeper.com"
  - server_name: "matrix.org"
  - server_name: "t2bot.io"
suppress_key_server_warning: true # Suppress warning that matrix.org is in above list

# Federation Configuration
allow_public_rooms_over_federation: false # Allow other servers to read your public room directory
federation: # Back off retrying dead servers as often
  destination_min_retry_interval: 1m
  destination_retry_multiplier: 5
  destination_max_retry_interval: 365d
federation_ip_range_blacklist: # IP address ranges to forbid for federation
  - '10.0.0.0/8'
  - '100.64.0.0/10'
  - '127.0.0.0/8'
  - '169.254.0.0/16'
  - '172.16.0.0/12'
  - '192.168.0.0/16'
  - '::1/128'
  - 'fc00::/7'
  - 'fe80::/64'

# Cache Configuration
event_cache_size: 30K
caches:
  global_factor: 1
  expire_caches: true
  cache_entry_ttl: 1080m
  sync_response_cache_duration: 2m
  per_cache_factors:
    get_current_hosts_in_room: 3
    get_local_users_in_room: 3
    get_partial_current_state_ids: 0.5
    _get_presence_for_user: 3
    get_rooms_for_user: 3
    _get_server_keys_json: 3
    stateGroupCache: 0.1
    stateGroupMembersCache: 0.2
  cache_autotuning:
    max_cache_memory_usage: 896M
    target_cache_memory_usage: 512M
    min_cache_ttl: 30s

# Garbage Collection (Cache Eviction)
gc_thresholds: [550, 10, 10]
gc_min_interval: [1s, 1m, 2m]

# Media Configuration
media_store_path: "/media" # Path where media files will be stored
media_retention:
  local_media_lifetime: 5y # Maximum time to retain local media files
  remote_media_lifetime: 30d # Maximum time to retain remote media files

# User and Room Management
allow_guest_access: false # Whether to allow guest access
auto_join_rooms: # Rooms to auto-join new users to
  - "#welcome-room:mydomain.com"
autocreate_auto_join_rooms: true # Auto-create auto-join rooms if they're missing
presence:
  enabled: true # Enable viewing/sharing of online status and last active time
push:
  include_content: true # Include content of events in push notifications
user_directory:
  enabled: true # Whether to maintain a user directory
  search_all_users: true # Whether to include all users in user directory search results
  prefer_local_users: true # Whether to give local users higher search result ranking

# Data Retention
retention:
  enabled: false # Whether to enable automatic data retention policies
forget_rooms_on_leave: true # Automatically forget rooms when leaving them
forgotten_room_retention_period: 1d # Purge rooms this long after all local users forgot it

# URL Preview Configuration
url_preview_enabled: true # Whether to enable URL previews in messages
url_preview_accept_language: # Language preferences for URL preview content
  - 'en-GB'
  - 'en-US;q=0.9'
  - '*;q=0.8'
url_preview_ip_range_blacklist: # Forbid previews for URLs at IP addresses in these ranges
  - '10.0.0.0/8'
  - '100.64.0.0/10'
  - '127.0.0.0/8'
  - '169.254.0.0/16'
  - '172.16.0.0/12'
  - '192.168.0.0/16'
  - '::1/128'
  - 'fc00::/7'
  - 'fe80::/64'

# SSO Configuration
oidc_providers:
  - idp_id: authentik
    idp_name: "SSO"
    idp_icon: "mxc://mydomain.com/SomeImageURL"
    discover: true
    issuer: "https://auth.fostered.uk/application/o/matrix/"
    client_id: "SuperSecretClientId"
    client_secret: "SuperSecretClientSecret"
    scopes: ["openid", "profile", "email"]
    allow_existing_users: true
    user_mapping_provider:
      config:
        localpart_template: "{{ user.preferred_username }}"
        display_name_template: "{{ user.name|capitalize }}"
        email_template: "{{ user.email }}"

# Email Configuration
email:
  enable_notifs: true # Whether to enable email notifications
  smtp_host: "smtp.mydomain.com" # Hostname of the SMTP server
  smtp_port: 587 # TCP port to connect to SMTP server
  smtp_user: "SuperSecretEmailUser" # Username to connect to SMTP server
  smtp_pass: "SuperSecretEmailPass" # Password to connect to SMTP server
  require_transport_security: True # Require transport security (TLS) for SMTP
  notif_from: "Matrix <noreply@mydomain.com>" # The From address for notification emails
  app_name: Matrix # Name of the app to use in email templates
  notif_for_new_users: True # Enable notifications for new users

# Security and Authentication
form_secret: "SuperSecretValue1" # Secret for preventing CSRF attacks
macaroon_secret_key: "SuperSecretValue2" # Secret for generating macaroons
registration_shared_secret: "SuperSecretValue3" # Shared secret for registration
recaptcha_public_key: "SuperSecretValue4" # Public key for reCAPTCHA
recaptcha_private_key: "SuperSecretValue5" # Private key for reCAPTCHA
worker_replication_secret: "SuperSecretValue6" # Secret for communication between Synapse and workers

In this case, I've included typical configuration for Authentik in case you want to use SSO instead of Synapse's built-in password database - it's perfectly safe to omit the oidc_providers: section if you're not using SSO, but if you do wish to use it, the official Authentik guide is quite quick and easy to follow once Authentik is installed.

Cache Optimisation

Most of the example configuration above is fairly standard; however, the cache configuration is of particular note for performance tuning.

The defaults (at the time of writing) are below, and in the official documentation under event_cache_size and caches:

event_cache_size: 10K
caches:
  global_factor: 0.5
  expire_caches: true
  cache_entry_ttl: 30m
  sync_response_cache_duration: 2m

In this default case:

  • All of the caches (including the event_cache_size) are halved (so each worker can only actually hold 5,000 events as a maximum)
  • Every entry in the cache expires within 30 minutes
  • cache_autotuning is disabled, so entries leave the cache after 30 minutes or when the server needs to cache something and there isn't enough space to store it.

That last point in particular is a problem: with multiple containers, we don't want every container filling its caches to the maximum and then waiting for the expiry time to evict entries that have only been read once!

I've recommended the following config instead, which:

  • Increases the number of events we can cache to lower load on the database
  • Enables cache_autotuning to remove entries that aren't frequently accessed
  • Allows entries to stay in the cache longer when they're used frequently
  • Modifies individual cache factors to expand caches that are frequently hit by large federated rooms, and restrict ones that are less frequently reused

event_cache_size: 30K
caches:
  global_factor: 1
  expire_caches: true
  cache_entry_ttl: 1080m
  sync_response_cache_duration: 2m
  per_cache_factors:
    get_current_hosts_in_room: 3
    get_local_users_in_room: 3
    get_partial_current_state_ids: 0.5
    _get_presence_for_user: 3
    get_rooms_for_user: 3
    _get_server_keys_json: 3
    stateGroupCache: 0.1
    stateGroupMembersCache: 0.2
  cache_autotuning:
    max_cache_memory_usage: 896M
    target_cache_memory_usage: 512M
    min_cache_ttl: 30s

Furthermore, as this design targets a server with more limited RAM, we've also updated the "garbage collection" thresholds, so Synapse quickly cleans up older cached entries and keeps a healthy amount of cache without running out of memory:

gc_thresholds: [550, 10, 10]
gc_min_interval: [1s, 1m, 2m]

Deploying a Synapse Homeserver with Docker

PostgreSQL Configuration

  1. PostgreSQL Configuration
  2. Creating Database
  3. Configuring PostgreSQL

Creating Database

Before we can modify the PostgreSQL config, we need to let the container generate it, so for now (whether you're deploying a single database or a replica too) just start the primary database like this:

docker compose up db

You should see the image being downloaded, then a few seconds later it should have started, with a few logs to say it's created the database and started listening, e.g.

PostgreSQL init process complete; ready for start up.

2023-12-20 22:58:57.675 UTC [1] LOG:  starting PostgreSQL 16.1 on x86_64-pc-linux-musl, compiled by gcc (Alpine 13.2.1_git20231014) 13.2.1 20231014, 64-bit
2023-12-20 22:58:57.675 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2023-12-20 22:58:57.675 UTC [1] LOG:  listening on IPv6 address "::", port 5432
2023-12-20 22:58:57.686 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2023-12-20 22:58:57.699 UTC [51] LOG:  database system was shut down at 2023-12-20 22:58:57 UTC
2023-12-20 22:58:57.707 UTC [1] LOG:  database system is ready to accept connections

Configuring PostgreSQL

Now you can hit Ctrl+C to close it, and you should find a "pgsql16" folder now exists with a "postgresql.conf" file inside it.

I recommend removing it entirely and replacing it with a template of selected values like this:

# Network
listen_addresses = '0.0.0.0'
max_connections = 500
port = 5432
unix_socket_directories = '/sockets'

# Workers
max_worker_processes = 16
max_parallel_workers = 16
max_parallel_workers_per_gather = 4
max_parallel_maintenance_workers = 4

# Memory
dynamic_shared_memory_type = posix
effective_cache_size = 40GB
effective_io_concurrency = 200
maintenance_work_mem = 1GB
shared_buffers = 4GB
wal_buffers = 32MB
work_mem = 32MB

# Query Planning
enable_partitionwise_join = on
enable_partitionwise_aggregate = on
parallel_setup_cost = 1000
random_page_cost = 1.1

# Performance
commit_delay = 500
commit_siblings = 3
synchronous_commit = off
wal_writer_delay = 500

# Replication
archive_mode = off
checkpoint_completion_target = 0.9
checkpoint_timeout = 15min
hot_standby = off
max_wal_senders = 3
max_wal_size = 4GB
min_wal_size = 1GB
wal_keep_size = 2048
wal_level = replica

# Maintenance
autovacuum_vacuum_cost_limit = 400
autovacuum_analyze_scale_factor = 0.05
autovacuum_vacuum_scale_factor = 0.02
vacuum_cost_limit = 300

# Logging
#log_min_duration_statement = 3000
log_min_messages = warning
log_min_error_statement = warning

# Locale
datestyle = 'iso, mdy'
default_text_search_config = 'pg_catalog.english'
lc_messages = 'en_GB.utf8'
lc_monetary = 'en_GB.utf8'
lc_numeric = 'en_GB.utf8'
lc_time = 'en_GB.utf8'
log_timezone = 'Europe/London'
timezone = 'Europe/London'

# Extensions
#shared_preload_libraries = 'pg_buffercache,pg_stat_statements'

This is quite a high spec configuration, designed for a server with over 16 cores and 64GB RAM and using SSD storage, so you may wish to consult my tuning guide to decide on the best number of workers and amount of cache for your situation.

If in doubt, it's better to be conservative and increase values over time as needed - on a quad-core server with 8GB RAM, these would be reasonable values to start with:

# Workers
max_worker_processes = 4
max_parallel_workers = 4
max_parallel_workers_per_gather = 2
max_parallel_maintenance_workers = 1

# Memory
dynamic_shared_memory_type = posix
effective_cache_size = 2GB
effective_io_concurrency = 200
maintenance_work_mem = 512MB
shared_buffers = 1GB
wal_buffers = 32MB
work_mem = 28MB

Deploying a Synapse Homeserver with Docker

Nginx Configuration

  1. Nginx Configuration
  2. Docker Compose
  3. Configuration Files
    1. nginx.conf
    2. upstreams.conf
    3. maps.conf
    4. locations.conf
    5. proxy.conf
    6. private.conf

Docker Compose

Example Docker Compose deployment:

  nginx:
    <<: *small-container
    depends_on:
      - synapse
    image: nginx:mainline-alpine-slim
    ports:
      - "8008:8008"
      - "8448:8448"
    tmpfs:
      - /var/cache/nginx/client_temp
    volumes:
      - sockets:/sockets
      - ./nginx/config:/etc/nginx
      - ./nginx/logs:/var/log/nginx/

You may already have a reverse proxy in front of your server, but in either case, I recommend a copy of Nginx deployed alongside Synapse itself so that it can easily use the sockets to communicate directly with Synapse and its workers, and be restarted whenever Synapse is.

Having Nginx here will provide a single HTTP port to your network to access Synapse on, so outside your machine it'll behave (almost) exactly the same as a monolithic instance of Synapse, just a lot faster!

Configuration Files

I recommend splitting up the config into more manageable files, so next to my docker-compose.yml I have an nginx directory with the following file structure:

docker-compose.yml
nginx
└── config
    ├── locations.conf
    ├── maps.conf
    ├── nginx.conf
    ├── private.conf
    ├── proxy.conf
    └── upstreams.conf

My current configuration files are below, with a short summary of what's going on:

nginx.conf

This is some fairly standard Nginx configuration for a public HTTP service, with one Nginx worker per CPU core, and larger buffer sizes to accommodate media requests:

# Worker Performance
worker_processes auto;
worker_rlimit_nofile 8192;
pcre_jit on;

# Events Configuration
events {
  multi_accept off;
  worker_connections 4096;
}

# HTTP Configuration
http {
  # Security Settings
  server_tokens off;

  # Connection Optimisation
  client_body_buffer_size 32m;
  client_header_buffer_size 32k;
  client_max_body_size 1g;
  http2_max_concurrent_streams 128;
  keepalive_timeout 65;
  keepalive_requests 100;
  large_client_header_buffers 4 16k;
  resolver 127.0.0.11 valid=60;
  resolver_timeout 10s;
  sendfile on;
  server_names_hash_bucket_size 128;
  tcp_nodelay on;
  tcp_nopush on;

  # Proxy optimisation
  proxy_buffer_size 128k;
  proxy_buffers 4 256k;
  proxy_busy_buffers_size 256k;

  # Gzip Compression
  gzip on;
  gzip_buffers 16 8k;
  gzip_comp_level 2;
  gzip_disable "MSIE [1-6]\.";
  gzip_min_length 1000;
  gzip_proxied any;
  gzip_types application/javascript application/json application/x-javascript application/xml application/xml+rss image/svg+xml text/css text/javascript text/plain text/xml;
  gzip_vary on;

  # Logging
  log_format balanced '"$proxy_host" "$upstream_addr" >> $http_x_forwarded_for '
                      '"$remote_user [$time_local] "$request" $status $body_bytes_sent '
                      '"$http_referer" "$http_user_agent" $request_time';

  # HTTP-level includes
  include maps.conf;
  include upstreams.conf;

  server {
    # Listen to 8008 for all incoming requests
    listen 8008 default_server backlog=2048 reuseport fastopen=256 deferred so_keepalive=on;
    server_name _;
    charset utf-8;

    # Logging
    access_log /var/log/nginx/access.log balanced buffer=64k flush=1m;
    error_log /var/log/nginx/error.log warn;

    # Server-level includes
    include locations.conf;

    # Redirect any unmatched URIs back to host
    location / {
      return 301 https://$host:8448;
    }
  }
}

This specifically just covers HTTP for placing behind another HTTPS proxy if you have one.

If you want this server to handle HTTPS directly in front of the internet, add this:

http {
  # SSL hardening
  ssl_ciphers EECDH+AESGCM:EDH+AESGCM;
  ssl_prefer_server_ciphers on;
  ssl_protocols TLSv1.2 TLSv1.3;
  ssl_session_cache shared:SSL:10m;
  ssl_session_tickets off;
  ssl_session_timeout 1d;
  ssl_stapling on;
  ssl_stapling_verify on;
  add_header Strict-Transport-Security "max-age=63072000; includeSubdomains";

  # Rest of the config above until the server block, then replace server block with below

  # HTTP redirect
  server {
    listen 8008 default_server backlog=2048 reuseport fastopen=256 deferred so_keepalive=on;
    server_name _;

    # Always redirect to HTTPS
    return 301 https://$host:8448$request_uri;
  }

  # Default HTTPS error
  server {
    listen 8448 ssl default_server backlog=2048 reuseport fastopen=256 deferred so_keepalive=on;
    server_name _;
    charset utf-8;
    http2 on;

    # SSL certificate
    ssl_certificate /path/to/ssl/mydomain.com/fullchain.pem;
    ssl_certificate_key /path/to/ssl/mydomain.com/privkey.pem;    

    # Default security headers
    add_header Referrer-Policy "no-referrer";
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload";
    add_header X-Content-Type-Options "nosniff";
    add_header X-Frame-Options "SAMEORIGIN";
    add_header X-XSS-Protection "1; mode=block";

    # Logging
    access_log /var/log/nginx/access.log balanced buffer=64k flush=1m;
    error_log /var/log/nginx/error.log warn;

    # Server-level includes
    include locations.conf;

    # Return 404 for unmatched location
    return 404;
  }
}

upstreams.conf

This is where we actually list the sockets Nginx will send requests to:

# Client non-room requests
upstream synapse_inbound_client_readers {
  # least_conn;
  server unix:/sockets/synapse_inbound_client_reader1.sock max_fails=0;
  keepalive 10;
}

# Client sync workers
upstream synapse_inbound_client_syncs {
  # hash $mxid_localpart consistent;
  server unix:/sockets/synapse_inbound_client_sync1.sock max_fails=0;
  keepalive 10;
}

# Federation non-room requests
upstream synapse_inbound_federation_readers {
  # ip_hash;
  server unix:/sockets/synapse_inbound_federation_reader1.sock max_fails=0;
  keepalive 10;
}

# Media requests
upstream synapse_inbound_media {
  # least_conn;
  server unix:/sockets/synapse_inbound_media1.sock max_fails=0;
  keepalive 10;
}

# Synapse main thread
upstream synapse_inbound_main {
  server unix:/sockets/synapse_inbound_main.sock max_fails=0;
  keepalive 10;
}

# Client/federation room requests
upstream synapse_inbound_room_workers {
  hash $room_name consistent;
  server unix:/sockets/synapse_inbound_rooms1.sock max_fails=0;
  server unix:/sockets/synapse_inbound_rooms2.sock max_fails=0;
  server unix:/sockets/synapse_inbound_rooms3.sock max_fails=0;
  server unix:/sockets/synapse_inbound_rooms4.sock max_fails=0;
  keepalive 10;
}

A major change from the default design is my concept of "room workers" that are each responsible for a fraction of the rooms the server handles.

The theory here is that, by balancing requests using the room ID, each "room worker" only needs to understand a few of the rooms, and its cache can be very specialised, while massively reducing the amount of workers we need overall.

I've included the load balancing method you should use for each one, in case you need to add extra workers - for example, if your server needs to generate lots of thumbnails, or has more than a few users, you may need an extra media worker.
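
For example, if you did add a hypothetical second media worker (media2, with a matching inbound socket), the media upstream would just gain another server line, with the least_conn method uncommented so Nginx sends each request to the least busy worker:

# Media requests
upstream synapse_inbound_media {
  least_conn;
  server unix:/sockets/synapse_inbound_media1.sock max_fails=0;
  server unix:/sockets/synapse_inbound_media2.sock max_fails=0;
  keepalive 10;
}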

maps.conf

These "maps" extract values from incoming requests so Nginx knows which worker to load balance each one to - no changes should be required here:

# Client username from MXID
map $http_authorization $mxid_localpart {
  default                           $http_authorization;
  "~Bearer syt_(?<username>.*?)_.*" $username;
  ""                                $accesstoken_from_urlparam;
}
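
Note: the map above falls back to $accesstoken_from_urlparam when there's no Authorization header, so that variable needs defining as well - a minimal sketch, assuming the client supplied its token via the access_token query parameter, could look like this:

# Client username from access token in URL parameter (assumed fallback)
map $arg_access_token $accesstoken_from_urlparam {
  default                    "unknown_user";
  "~syt_(?<username>.*?)_.*" $username;
}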

# Whether to upgrade HTTP connection
map $http_upgrade $connection_upgrade {
  default upgrade;
  '' close;
}

# Extract room name from URI
map $request_uri $room_name {
  default "not_room";
  "~^/_matrix/(client|federation)/.*?(?:%21|!)(?<room>[\s\S]+)(?::|%3A)(?<domain>[A-Za-z0-9.\-]+)" "!$room:$domain";
}

locations.conf

This is the biggest file, and defines which URIs go to which upstream:

### MAIN OVERRIDES ###

# Client: Main overrides
location ~ ^/_matrix/client/(api/v1|r0|v3|unstable)/(account/3pid/|directory/list/room/|pushrules/|rooms/[\s\S]+/(forget|upgrade)|login/sso/redirect/|register) {
  set $proxy_pass http://synapse_inbound_main;
  include proxy.conf;
}

# Client: OpenID Connect SSO
location ~ ^(/_matrix/client/(api/v1|r0|v3|unstable)/login/sso/redirect|/_synapse/client/(pick_username|(new_user_consent|oidc/callback|pick_idp|sso_register)$)) {
  set $proxy_pass http://synapse_inbound_main;
  include proxy.conf;
}

# Federation: Main overrides
location ~ ^/_matrix/federation/v1/openid/userinfo$ {
  set $proxy_pass http://synapse_inbound_main;
  include proxy.conf;
}

### FEDERATION ###

# Federation rooms
location ~ "^/_matrix/(client|federation)/.*?(?:%21|!)[\s\S]+(?:%3A|:)[A-Za-z0-9.\-]+" {
  set $proxy_pass http://synapse_inbound_room_workers;
  include proxy.conf;
}

location ~ ^/_matrix/federation/v[12]/(?:state_ids|get_missing_events)/(?:%21|!)[\s\S]+(?:%3A|:)[A-Za-z0-9.\-]+ {
  set $proxy_pass http://synapse_inbound_room_workers;
  include proxy.conf;
}

# Federation readers
location ~ ^/_matrix/(federation/(v1|v2)|key/v2)/ {
  set $proxy_pass http://synapse_inbound_federation_readers;
  include proxy.conf;
}

### CLIENTS ###

# Stream: account_data
location ~ ^/_matrix/client/(api/v1|r0|v3|unstable)/[\s\S]+(/tags|/account_data) {
  set $proxy_pass http://synapse_inbound_client_syncs;
  include proxy.conf;
}

# Stream: presence
location ~ ^/_matrix/client/(api/v1|r0|v3|unstable)/presence/ {
  set $proxy_pass http://synapse_inbound_client_syncs;
  include proxy.conf;
}

# Stream: receipts
location ~ ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/[\s\S]+/(receipt|read_markers) {
  set $proxy_pass http://synapse_inbound_client_syncs;
  include proxy.conf;
}

# Stream: to_device
location ~ ^/_matrix/client/(api/v1|r0|v3|unstable)/sendToDevice/ {
  set $proxy_pass http://synapse_inbound_client_syncs;
  include proxy.conf;
}

# Stream: typing
location ~ ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/[\s\S]+/typing {
  set $proxy_pass http://synapse_inbound_client_syncs;
  include proxy.conf;
}

# Note: The following client blocks must come *after* the stream blocks above,
# otherwise some stream requests would be incorrectly routed

# Client: User directory
location ~ ^/_matrix/client/(api/v1|r0|v3|unstable)/user_directory/search {
  set $proxy_pass http://synapse_inbound_client_syncs;
  include proxy.conf;
}

# Client: Rooms
location ~ ^/_matrix/client/.*?![\s\S]+:[A-Za-z0-9.\-]+ {
  set $proxy_pass http://synapse_inbound_room_workers;
  include proxy.conf;
}

# Client: Sync
location ~ ^/_matrix/client/((api/)?[^/]+)/(sync|events|initialSync|rooms/[\s\S]+/initialSync)$ {
  set $proxy_pass http://synapse_inbound_client_syncs;
  include proxy.conf;
}

# Client: Reader
location ~ ^/_matrix/client/(api/v1|r0|v3|unstable)/(room_keys/|keys/(query|changes|claim|upload/|room_keys/)|login|register(/available|/m.login.registration_token/validity|)|password_policy|profile|rooms/[\s\S]+/(joined_members|context/[\s\S]+|members|state|hierarchy|relations/|event/|aliases|timestamp_to_event|redact|send|state/|(join|invite|leave|ban|unban|kick))|createRoom|publicRooms|account/(3pid|whoami|devices)|versions|voip/turnServer|joined_rooms|search|user/[\s\S]+/filter(/|$)|directory/room/[\s\S]+|capabilities) {
  set $proxy_pass http://synapse_inbound_main;
  include proxy.conf;
}

# Media
location ~* ^/_matrix/((client|federation)/[^/]+/)media/ {
  set $proxy_pass http://synapse_inbound_media;
  include proxy.conf;
}

# Matrix default
location /_matrix/ {
  set $proxy_pass http://synapse_inbound_main;
  include proxy.conf;
}

# Media admin
location ~ ^/_synapse/admin/v1/(purge_)?(media(_cache)?|room|user|quarantine_media|users)/[\s\S]+|media$ {
  include private.conf;
  set $proxy_pass http://synapse_inbound_media;
  include proxy.conf;
}

# Matrix admin API
location /_synapse/ {
  include private.conf;
  set $proxy_pass http://synapse_inbound_main;
  include proxy.conf;
}

It starts by forcing some requests to go directly to the main thread, as workers aren't ready to handle them yet. Then, for each type of request (federation/client), specialised requests go to specialised workers, any request with a room ID goes to the "room workers", and whatever's left goes to our dedicated federation/client readers.

You may also notice that the special "stream" endpoints all go to the synapse_inbound_client_syncs group. If you have multiple sync workers, you'll need to split these out to a separate stream writer worker, but for a small number of clients (e.g. a home install) it's best for performance to keep these caches with your sync worker, to maximise caching and minimise queries to your database.

proxy.conf

You may have noticed we used "proxy.conf" many times above. We do this to quickly define standard proxy config, which could easily be overridden per location block if needed later:

proxy_connect_timeout 2s;
proxy_buffering off;
proxy_http_version 1.1;
proxy_read_timeout 3600s;
proxy_redirect off;
proxy_send_timeout 120s;
proxy_socket_keepalive on;
proxy_ssl_verify off;

proxy_set_header Accept-Encoding "";
proxy_set_header Host $host;
proxy_set_header Connection $connection_upgrade;
proxy_set_header Upgrade $http_upgrade;
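
As an example of per-location tweaks, a block can include proxy.conf and then add directives that aren't already set in it (nginx won't accept the same directive twice in one block) - this hypothetical variation on the media location streams uploads straight through to the worker:

location ~* ^/_matrix/((client|federation)/[^/]+/)media/ {
  set $proxy_pass http://synapse_inbound_media;
  include proxy.conf;
  client_max_body_size 512m;   # Per-location upload limit
  proxy_request_buffering off; # Stream uploads to the worker instead of buffering them
}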

private.conf

Lastly, let's approve specific ranges for private access to the admin API. You'll want to define ranges that can access it, which may include your home/work IP, or private ranges if you're hosting at home.

Here, I've specified the standard RFC1918 private ranges:

# Access list for non-public access
allow 10.0.0.0/8;
allow 172.16.0.0/12;
allow 192.168.0.0/16;
deny all;
satisfy all;
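
If you administer the server from a fixed public address, you can also allow just that IP alongside (or instead of) the private ranges - 203.0.113.45 below is a documentation-range placeholder rather than a real address:

allow 203.0.113.45;   # Example home/work IP
allow 192.168.0.0/16; # Local network
deny all;
satisfy all;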

Deploying a Synapse Homeserver with Docker

Worker Configuration

  1. Worker Configuration
  2. Introduction
  3. Synapse Configuration
  4. Worker Config Files
  5. Worker Log Config
  6. Docker Configuration

Introduction

Due to the way Python handles multiple CPU cores, a design decision was made in Synapse to allow splitting work out between multiple processes ("workers") with defined roles, rather than trying to shoehorn all of that work into a single instance of Synapse.

As a result, we can create multiple workers, say what we want them to do to meet our specific server's needs, and tweak the config to optimise them.

My suggested design is different from the official documentation, so feel free to study that first, but my recommended model is based on months of testing on servers of various sizes, to ensure they can efficiently cope with thousands of rooms, including rooms with tens of thousands of users, so I hope you'll find it helps.

I've also included an explanation with a diagram at the bottom of this page to help explain the rationale behind this design, and why it makes the best use of available CPU & RAM.

Synapse Configuration

In the initial homeserver.yaml we didn't reference any workers, so we'll want to add these now.

To begin with, let's tell Synapse the name of workers we want to assign to various roles that can be split out of the main Synapse process:

enable_media_repo: false
federation_sender_instances:
  - sender1
  - sender2
  - sender3
  - sender4
media_instance_running_background_jobs: media1
notify_appservices_from_worker: tasks1
pusher_instances:
  - tasks1
run_background_tasks_on: tasks1
start_pushers: false
stream_writers:
  account_data:
    - client_sync1
  events:
    - tasks1
  presence:
    - client_sync1
  receipts:
    - client_sync1
  to_device:
    - client_sync1
  typing:
    - client_sync1
update_user_directory_from_worker: client_sync1

Four federation senders should be plenty for most federating servers with fewer than a few hundred users, but a later section will explain how to scale up your server to handle hundreds or thousands of users, should the need arise.

Now we've defined the roles, we also need to add an instance_map to tell Synapse how to reach each worker listed in the config entries above:

instance_map:
  main:
    path: "/sockets/synapse_replication_main.sock"
  client_sync1:
    path: "/sockets/synapse_replication_client_sync1.sock"
  media1:
    path: "/sockets/synapse_replication_media1.sock"
  sender1:
    path: "/sockets/synapse_replication_sender1.sock"
  sender2:
    path: "/sockets/synapse_replication_sender2.sock"
  sender3:
    path: "/sockets/synapse_replication_sender3.sock"
  sender4:
    path: "/sockets/synapse_replication_sender4.sock"
  tasks1:
    path: "/sockets/synapse_replication_tasks1.sock"

Worker Config Files

Firstly, I recommend these be stored in a subfolder of your Synapse directory (like "workers") so they're easier to organise.

These are typically very simple, but vary slightly depending on the worker, so I'll explain that below.

worker_app: "synapse.app.generic_worker" # Always this unless "synapse.app.media_repository"
worker_name: "client_sync1" # Name of worker specified in instance map
worker_log_config: "/data/log.config/client_sync.log.config" # Log config file

worker_listeners:
  # Include for any worker in the instance map above:
  - path: "/sockets/synapse_replication_client_sync1.sock"
    type: http
    resources:
      - names: [replication]
        compress: false
  # Include for any worker that receives requests in Nginx:
  - path: "/sockets/synapse_inbound_client_sync1.sock"
    type: http
    x_forwarded: true # Trust the X-Forwarded-For header from Nginx
    resources:
      - names: [client, federation]
        compress: false
  # Include when using Prometheus or compatible monitoring system:
  - type: metrics
    bind_address: ''
    port: 9000

This means, for example, that the Room Workers don't need a replication socket (as they are not in the instance map), but do require an inbound socket, as Nginx will need to forward requests to them:

worker_app: "synapse.app.generic_worker"
worker_name: "rooms1"
worker_log_config: "/data/log.config/rooms.log.config"

worker_listeners:
  - path: "/sockets/synapse_inbound_rooms1.sock"
    type: http
    x_forwarded: true
    resources:
      - names: [client, federation]
        compress: false
  - type: metrics
    port: 10101
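
Conversely, a worker like a federation sender is in the instance map but never receives requests from Nginx, so it only needs a replication socket - a sketch for sender1 (assuming a shared sender.log.config following the same naming pattern) would be:

worker_app: "synapse.app.generic_worker"
worker_name: "sender1"
worker_log_config: "/data/log.config/sender.log.config"

worker_listeners:
  - path: "/sockets/synapse_replication_sender1.sock"
    type: http
    resources:
      - names: [replication]
        compress: false
  - type: metrics
    port: 10101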

As above, I recommend having a separate log config for each type of worker to aid any investigation you need to do later, so I'll explain this in the following section:

Worker Log Config

These have a standard format, but here I have enabled buffered logging to lower disk I/O, and use a daily log that is kept for 3 days before being deleted:

version: 1
formatters:
  precise:
    format: '%(asctime)s - %(name)s - %(lineno)d - %(levelname)s - %(request)s - %(message)s'
handlers:
  file:
    class: logging.handlers.TimedRotatingFileHandler
    formatter: precise
    filename: /data/log/rooms.log
    when: midnight
    backupCount: 3
    encoding: utf8

  buffer:
    class: synapse.logging.handlers.PeriodicallyFlushingMemoryHandler
    target: file
    capacity: 10
    flushLevel: 30
    period: 5

loggers:
  synapse.metrics:
    level: WARN
    handlers: [buffer]
  synapse.replication.tcp:
    level: WARN
    handlers: [buffer]
  synapse.util.caches.lrucache:
    level: WARN
    handlers: [buffer]
  twisted:
    level: WARN
    handlers: [buffer]
  synapse:
    level: INFO
    handlers: [buffer]

root:
  level: INFO
  handlers: [buffer]

Note: While Synapse is running, each line in the log (after the timestamp) starts with a string like synapse.util.caches.lrucache so you can control exactly what is logged for each log type by adding some of them to the loggers section here. In this example, I've suppressed less informative logs to make the more important ones easier to follow.

Docker Configuration

Since we defined a "synapse-worker-template" and "synapse-media-template" in the previous Docker Compose section, these are very simple to define just below our main Synapse container:

  synapse:
    <<: *synapse-template

  client-sync1:
    <<: *synapse-worker-template
    command: run --config-path=/data/homeserver.yaml --config-path=/data/workers/client_sync1.yaml
    healthcheck:
      test: curl -fSs --unix-socket /sockets/synapse_replication_client_sync1.sock http://localhost/health

  federation-reader1:
    <<: *synapse-worker-template
    command: run --config-path=/data/homeserver.yaml --config-path=/data/workers/federation_reader1.yaml
    healthcheck:
      test: curl -fSs --unix-socket /sockets/synapse_inbound_federation_reader1.sock http://localhost/health

  media1:
    <<: *synapse-media-template
    command: run --config-path=/data/homeserver.yaml --config-path=/data/workers/media1.yaml
    healthcheck:
      test: curl -fSs --unix-socket /sockets/synapse_replication_media1.sock http://localhost/health

  rooms1:
    <<: *synapse-worker-template
    command: run --config-path=/data/homeserver.yaml --config-path=/data/workers/rooms1.yaml
    healthcheck:
      test: curl -fSs --unix-socket /sockets/synapse_inbound_rooms1.sock http://localhost/health

  rooms2:
    <<: *synapse-worker-template
    command: run --config-path=/data/homeserver.yaml --config-path=/data/workers/rooms2.yaml
    healthcheck:
      test: curl -fSs --unix-socket /sockets/synapse_inbound_rooms2.sock http://localhost/health

  rooms3:
    <<: *synapse-worker-template
    command: run --config-path=/data/homeserver.yaml --config-path=/data/workers/rooms3.yaml
    healthcheck:
      test: curl -fSs --unix-socket /sockets/synapse_inbound_rooms3.sock http://localhost/health

  rooms4:
    <<: *synapse-worker-template
    command: run --config-path=/data/homeserver.yaml --config-path=/data/workers/rooms4.yaml
    healthcheck:
      test: curl -fSs --unix-socket /sockets/synapse_inbound_rooms4.sock http://localhost/health

  sender1:
    <<: *synapse-worker-template
    command: run --config-path=/data/homeserver.yaml --config-path=/data/workers/sender1.yaml
    healthcheck:
      test: curl -fSs --unix-socket /sockets/synapse_replication_sender1.sock http://localhost/health

  sender2:
    <<: *synapse-worker-template
    command: run --config-path=/data/homeserver.yaml --config-path=/data/workers/sender2.yaml
    healthcheck:
      test: curl -fSs --unix-socket /sockets/synapse_replication_sender2.sock http://localhost/health

  sender3:
    <<: *synapse-worker-template
    command: run --config-path=/data/homeserver.yaml --config-path=/data/workers/sender3.yaml
    healthcheck:
      test: curl -fSs --unix-socket /sockets/synapse_replication_sender3.sock http://localhost/health

  sender4:
    <<: *synapse-worker-template
    command: run --config-path=/data/homeserver.yaml --config-path=/data/workers/sender4.yaml
    healthcheck:
      test: curl -fSs --unix-socket /sockets/synapse_replication_sender4.sock http://localhost/health

  tasks1:
    <<: *synapse-worker-template
    command: run --config-path=/data/homeserver.yaml --config-path=/data/workers/tasks1.yaml
    healthcheck:
      test: curl -fSs --unix-socket /sockets/synapse_replication_tasks1.sock http://localhost/health

The "healthcheck" sections just need to match the socket name from each worker's config file - the /health endpoint listens on both replication and inbound sockets, so you can use either, depending on what the worker has available. This allows Docker to test whether the container is running, so it can be automatically restarted if there are any issues.

With all of the configuration sections above in place, and the Nginx upstream configuration from the previous section, all you should need to do now is run docker compose down && docker compose up -d to bring up Synapse with the new configuration and a much higher capacity!

PostgreSQL for Synapse Homeservers

Welcome to my PostgreSQL guides! If you're here, you're likely interested in optimising and understanding how PostgreSQL works with your Matrix Synapse Homeserver.

PostgreSQL (or usually Postgres) is an advanced, open-source relational database system. It's known for its robustness, performance, and compatibility with a wide range of applications, so it's become a popular choice for businesses, developers, and system administrators alike.

In the context of Matrix Synapse, PostgreSQL plays a crucial role. Synapse manages a significant amount of data – ranging from user information to message history, and all the complex interactions in between. The database is the backbone of this operation, ensuring data integrity, security, and accessibility. Properly tuning and managing your PostgreSQL database is key to maintaining a responsive and efficient Synapse server.

This section aims to guide you through various aspects of PostgreSQL in relation to Synapse. Whether you're setting up PostgreSQL for the first time or looking to optimise an existing database, you'll hopefully find insights and instructions tailored to your needs. From basic configuration to advanced tuning, I'll try to cover a range of topics to help you get the most out of your server.

Remember, each Synapse server is unique, and so are its database requirements. This guide aims to provide general advice and best practices, but it's important for you to assess and adapt these recommendations to fit your specific situation.

Let's dive into the world of PostgreSQL!

Tuning PostgreSQL for a Matrix Synapse Homeserver

Introduction

Welcome to the guide on fine-tuning PostgreSQL for your Matrix Synapse Homeserver. Matrix Synapse is an open-source server implementation for the Matrix protocol, which powers an ever-growing network of secure, decentralised real-time communication. Ensuring that Synapse runs efficiently is crucial, and a significant part of that efficiency comes from the underlying database—PostgreSQL.

Out-of-the-box, PostgreSQL is configured with general-purpose settings that may not align with Synapse's specific demands. This guide will help you customise PostgreSQL, enhancing performance and ensuring your server can handle its unique workload with ease.

Remember, each Synapse server is as individual as its users, rooms, and usage patterns. Therefore, rather than prescribing one-size-fits-all solutions, I'll try to equip you with the knowledge to tailor your database settings to your server's distinct personality. Let's embark on this journey towards a more responsive and optimised Synapse experience.

Tuning PostgreSQL for a Matrix Synapse Homeserver

Statistics Modules

Some of the following sections will rely on statistics generated by either pg_buffercache or pg_stat_statements, which are optional modules included with PostgreSQL that offer statistics on various tasks that PostgreSQL has seen.

These extensions cause PostgreSQL to use slightly more shared memory and a few percent more CPU time - there's no direct harm in leaving them running, but if your objective is maximum performance, then after enabling them and completing your investigation you can disable them again, as covered in the Disabling Statistics section below.

Enabling Statistics

  1. Open your postgresql.conf file, search for the shared_preload_libraries setting, then add pg_buffercache,pg_stat_statements to its value (making sure to comma-separate each entry).

    If it's not present, simply add the following line:

    shared_preload_libraries = 'pg_buffercache,pg_stat_statements'
    
  2. Restart the PostgreSQL server for the changes to take effect, then run these queries:

    CREATE EXTENSION IF NOT EXISTS pg_buffercache;
    CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
    

Resetting Statistics

To reset the statistics collected by pg_stat_statements, you can execute the following commands (pg_stat_reset() additionally clears PostgreSQL's standard per-database statistics):

SELECT pg_stat_statements_reset();
SELECT pg_stat_reset();

If your server has been running a long time, it's definitely worth running this to ensure you're looking at fresh numbers.

You can check when the stats were last reset for each database using a query like this:

SELECT datname AS database,
       stats_reset AS stats_last_reset
FROM pg_stat_database
WHERE datname
NOT LIKE 'template%';

 database  |       stats_last_reset
-----------+-------------------------------
 synapse   | 2023-12-22 12:13:28.708593+00
(1 row)

(Note: An empty value here would mean the stats have never been reset, according to PostgreSQL's records)

Disabling Statistics

Once you're done investigating, there's no need to remove the line from postgresql.conf - simply run the queries below to disable the extensions, so they'll stop running, but be available next time you need them:

DROP EXTENSION IF EXISTS pg_buffercache;
DROP EXTENSION IF EXISTS pg_stat_statements;

Tuning PostgreSQL for a Matrix Synapse Homeserver

2. Worker Configuration

PostgreSQL splits work among processes that handle various tasks, from executing queries to performing maintenance operations. Just like in Synapse, these extra processes are called "workers", and their number and configuration can have a huge influence on the performance of your database.

The Importance of Latency

Speed can be measured in multiple ways: some say "speed" when they mean "bandwidth", but in a realtime application like Synapse that can make hundreds (or thousands) of queries per second, reducing latency (the time it takes for a single piece of data to get from A to B) can make a world of difference.

Synapse's database (particularly the state_groups_state table, which typically contains over 90% of the data) is highly sensitive to latency. Each transaction must complete quickly to prevent concurrency issues, where two queries are trying to write the same data and can reach a deadlock. This is where the balance between CPU-bound and IO-bound operations becomes critical:

  • CPU-bound: The system's performance is primarily limited by CPU power. If a system is CPU-bound, adding more (or faster) cores or optimising the computation can improve performance.
  • I/O-bound: The system spends more time waiting for I/O operations to complete than actually computing. This could be due to slow disk access, network latency, or other I/O bottlenecks.

For Synapse and PostgreSQL, the goal is to strike a balance: we want to ensure the database isn't CPU-bound and has enough computational resources to process queries efficiently. However, allocating excessive CPU simply leaves it I/O-bound, unable to utilise all of that extra power.
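
One quick way to gauge which side of that balance you're on is to sample what the active backends are waiting for. Here's a minimal sketch using pg_stat_activity (assuming your database is called synapse; rows with no wait event are sessions actually executing on the CPU):

SELECT state,
       COALESCE(wait_event_type, 'CPU') AS waiting_on,
       COUNT(*) AS backends
FROM pg_stat_activity
WHERE datname = 'synapse'
GROUP BY state, wait_event_type
ORDER BY backends DESC;

If most active backends are waiting on IO, faster storage or more cache will help more than extra cores; if they're mostly on CPU, the reverse applies.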

Tuning Workers to CPU Cores

The number of workers in PostgreSQL is closely tied to the number of available CPU cores because each worker process can perform tasks concurrently on a separate core. However, too many workers can lead to resource contention and increased context switching, which can degrade performance.

Here's an example of how you might configure the worker settings in postgresql.conf:

# Maximum number of workers in total, including maintenance and replication
# (typically the number of CPU cores you have)
max_worker_processes = 8

# Maximum number of workers in total that can be used for queries
# (capped by max_worker_processes, so typically the same number)
max_parallel_workers = 8

# Maximum number of workers that can be used for a single query
# (typically a quarter to a third of the total workers)
max_parallel_workers_per_gather = 3

# Maximum number of workers that can be used for maintenance operations
# (typically an eighth to a quarter of the total workers)
max_parallel_maintenance_workers = 2

Postgres is generally reliable at choosing how many workers to use for a given query, but it doesn't know the profile of work you expect from it each day, because it has no insight into how your application (in this case Synapse) is designed.

For example, when all workers are busy, Postgres will queue incoming queries until workers are available, which delays those queries being answered. You might be tempted to set max_parallel_workers_per_gather = 1 to ensure more queries are handled immediately, but then if one query requires a lock on a table, all other workers would need to wait to access that data.

In this Synapse case, it's generally better to use parallelism when possible to speed up complex queries, rather than trying to enable the maximum amount of queries to be running at the same time.
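
Whichever values you settle on, you can confirm what the running server is actually using (some of these settings only take effect after a restart or reload); a small sketch:

SELECT name, setting
FROM pg_settings
WHERE name IN ('max_worker_processes',
               'max_parallel_workers',
               'max_parallel_workers_per_gather',
               'max_parallel_maintenance_workers');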

Monitor CPU Utilisation

Use tools like top, htop, or vmstat to monitor CPU usage, or docker stats if using Docker. If the CPU utilisation of Postgres never exceeds 50-60%, consider reducing the number of workers to free up resources for Synapse and other processes.

Analysing Query Performance Time

With pg_stat_statements enabled, you can now monitor the performance of your SQL statements. Here's a query to help you analyse the database's behaviour:

SELECT LEFT(query, 80) AS query,
       calls,
       mean_exec_time AS average_time_ms,
       total_exec_time AS total_time_ms
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

This should show the top 10 queries that consumed the most time on average, including the number of times each query was called and the total execution time taken. The longest queries are usually not the most common, but by comparing the average time before and after a change at each stage, you can gauge the impact of your optimisations.

Balance with Synapse

Remember, Synapse and PostgreSQL are two parts of the same team here, so test at each stage that adjustments made to the database don't adversely impact Synapse's performance.

Browse around your client's UI, scroll through room timelines, and monitor Synapse's logs and performance metrics to ensure everything behaves as expected. We'll cover this in more detail later in the Testing Methodology section.

Tuning PostgreSQL for a Matrix Synapse Homeserver

3. Memory Configuration

Memory plays a pivotal role in the performance of your PostgreSQL database, as does using it efficiently in the right places. Having terabytes of RAM would undoubtedly speed things up, but the benefit typically drops off quickly after a few gigabytes.

Shared Buffers

The shared_buffers setting determines the amount of memory allocated for PostgreSQL to use for caching data. This cache is critical because it allows frequently accessed data to be served directly from memory, which is much faster than reading from disk.

# Set the amount of memory the database server uses for shared memory buffers
shared_buffers = '4GB'

As a general guideline, setting shared_buffers to approximately 25% of the total system memory is a good starting point on a dedicated database server. However, because PostgreSQL relies on the operating system's cache as well, it's not necessary to allocate all available memory to shared_buffers. The optimal size also depends on the nature of your workload and the size of your database.

You can run this query to see the status of your buffers:

WITH block_size AS (
    SELECT setting::integer AS block_size
    FROM pg_settings
    WHERE name = 'block_size'
), buffer_stats AS (
    SELECT
        COUNT(*) * (SELECT block_size FROM block_size) AS total_buffer_bytes,
        SUM(CASE WHEN b.usagecount > 0 THEN (SELECT block_size FROM block_size) ELSE 0 END) AS used_buffer_bytes,
        SUM(CASE WHEN b.isdirty THEN (SELECT block_size FROM block_size) ELSE 0 END) AS unwritten_buffer_bytes
    FROM pg_buffercache b
) SELECT
    pg_size_pretty(total_buffer_bytes) AS total_buffers,
    pg_size_pretty(used_buffer_bytes) AS used_buffers,
    ROUND((used_buffer_bytes::float / NULLIF(total_buffer_bytes, 0)) * 100) AS perc_used_of_total,
    pg_size_pretty(unwritten_buffer_bytes) AS unwritten_buffers,
    ROUND((unwritten_buffer_bytes::float / NULLIF(used_buffer_bytes, 0)) * 100) AS perc_unwritten_of_used
FROM buffer_stats;

 total_buffers | used_buffers | perc_used_of_total | unwritten_buffers | perc_unwritten_of_used
---------------+--------------+--------------------+-------------------+------------------------
 4096 MB       | 1623 MB      |                 40 | 16 MB             |                      1
(1 row)

Here I've allocated 4 GiB, but even after an hour of reasonable use, I'm only actually using 1.6 GiB and the unwritten amount is very low, so I could easily lower the buffer if memory was an issue.

As always, this is a rule of thumb. You may choose to allocate more RAM when you have slow storage and want more of the database available in memory. However, if you're using SSD/NVMe storage, a large allocation can simply be a waste of RAM that would be better left to the OS to use as disk cache.

Shared Memory

Shared memory (specifically the /dev/shm area) plays a vital role in PostgreSQL's performance. It behaves like a ramdisk where files are temporarily stored in memory, and in PostgreSQL it's used frequently during sorting and indexing operations, but also in all sorts of other caching and maintenance tasks.

Unfortunately, Docker typically limits this to 64MB, which can severely limit PostgreSQL's performance. If you're using Docker, manually setting shm_size in Docker to a similar size as the shared_buffers can dramatically improve both query and maintenance performance, as well as reducing disk I/O.

Here's an example of how you might set this in your Docker configuration:

services:
  postgres:
    image: postgres:latest
    shm_size: '1gb'

There is little value in setting this larger than shared_buffers, but the RAM is only consumed while PostgreSQL is using the space, so it's worth setting this to a similar size to shared_buffers if you can afford it.

Effective Cache Size

The effective_cache_size parameter helps the PostgreSQL query planner to estimate how much memory is available for disk caching by the operating system and PostgreSQL combined:

# Set the planner's assumption about the effective size of the disk cache
effective_cache_size = '8GB'

This setting doesn't allocate any memory; it's a hint that helps the planner make more informed decisions about query execution. By telling PostgreSQL how much memory is likely available for caching, it can influence decisions such as whether to use an index scan or a sequential scan.

For example, using the free command, you might see:

# free -h
               total        used        free      shared  buff/cache   available
Mem:            62Gi        23Gi       3.4Gi       5.5Gi        35Gi        32Gi
Swap:          8.0Gi       265Mi       7.7Gi

Or using the top command, you might see:

# top -n1 | head -n5
top - 15:20:35 up 14:26,  1 user,  load average: 0.67, 1.92, 2.58
Threads: 5240 total,   1 running, 5239 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.6 us,  1.5 sy,  0.0 ni, 96.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  64082.5 total,   3382.0 free,  24445.6 used,  36254.9 buff/cache
MiB Swap:   8192.0 total,   7926.7 free,    265.2 used.  33243.2 avail Mem

Here, although only about 3GB is "free", around 36GB is being used by the OS for cache. By setting effective_cache_size to a value that reflects this, PostgreSQL can better judge how likely it is that the data it needs will be served from memory rather than from disk.
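
To sanity-check how well caching is working overall, you can look at the buffer-cache hit ratio per database; a simple sketch (for a well-cached database this is typically above 99%):

SELECT datname AS database,
       ROUND(100.0 * blks_hit / NULLIF(blks_hit + blks_read, 0), 2) AS cache_hit_pct
FROM pg_stat_database
WHERE datname NOT LIKE 'template%';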

Working Memory

The work_mem setting controls the amount of memory used for internal sort operations and hash tables instead of writing to temporary disk files:

# Set the maximum amount of memory to be used for query workspaces
work_mem = '32MB'

Setting this value too low can lead to slow performance due to frequent disk writes, while setting it too high can cause excessive memory consumption if many operations happen concurrently.

Remember that each query operation can potentially use up to work_mem memory, so consider the total potential memory usage under peak load when choosing a value.

You can use this query to see how many temporary files have been written to disk (and their total size) because work_mem wasn't high enough:

SELECT datname,
       temp_files,
       temp_bytes
FROM pg_stat_database
WHERE datname NOT LIKE 'template%';

 datname  | temp_files | temp_bytes
----------+------------+------------
 synapse  |        292 | 7143424000
(1 row)

Here, temporary files are being created for the Synapse database. Gradually increase work_mem by 2-4MB increments, monitoring for 30-60 minutes each time, until temporary files are no longer regularly created.

In practice, values above 32MB often don't make a noticeable difference for Synapse, but you may find higher values (like 64MB or even 128MB) help other applications such as Sliding Sync.

Maintenance Work Memory

Allocating memory for maintenance operations sets aside room for cleaning and organising your workspace. Properly configured, it helps ensure that routine maintenance doesn't disrupt your database's performance.

In PostgreSQL, when a row is deleted or updated, the old data is not immediately removed from the disk. Instead, it's marked as obsolete, and the VACUUM process is expected to run later to clean up this obsolete data, compact the database, and reclaim space.

Setting the maintenance_work_mem to an optimal value ensures that the VACUUM process has enough memory to perform these tasks efficiently. If you have ample available RAM, you should set this higher (e.g. 512MB-1GB) to minimise maintenance time and table locks.

We'll cover maintenance in more detail later, but properly setting maintenance_work_mem now will significantly speed up those tasks later, helping to keep the database compact and efficient.
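
As a rough sketch, you can check and adjust this from psql (as a superuser) instead of editing postgresql.conf directly; the 512MB figure below is just an example, and note that ALTER SYSTEM writes to postgresql.auto.conf, which overrides postgresql.conf:

-- Check the value the running server is currently using
SHOW maintenance_work_mem;

-- Example only: raise it and reload the configuration
ALTER SYSTEM SET maintenance_work_mem = '512MB';
SELECT pg_reload_conf();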

Tuning PostgreSQL for a Matrix Synapse Homeserver

4. Query Planner Configuration

If tuning a database is like orchestrating a complex symphony, the query planner is the conductor guiding this intricate performance. More literally, the query planner evaluates the multiple possible ways a given query could be handled, and attempts to choose the most efficient one.

The planner weighs various factors (such as data size, indexes, and available system resources) to blend performance and accuracy, and we have the opportunity to tune this behaviour, so I've listed a few common options below that could help optimise queries.

Cost-Based Parameters

These parameters help PostgreSQL's query planner estimate the relative cost of different query execution plans:

# Cost of a non-sequentially-fetched disk page
random_page_cost = 1.1

# Cost of a sequentially-fetched disk page
seq_page_cost = 0.7

# Cost of processing each row in a query
cpu_tuple_cost = 0.01

# Cost of processing each index entry during an index scan
cpu_index_tuple_cost = 0.005

# Cost of processing each operator or function executed during a query
cpu_operator_cost = 0.0025

# Cost of setting up parallel workers for a parallel operation
parallel_setup_cost = 1000.0

# Minimum amount of table data for a parallel scan to be considered
min_parallel_table_scan_size = 8MB

  • random_page_cost (Default: 4.0) This setting represents the cost of reading a page randomly from disk. Lowering this value (e.g. to 1.1) can be beneficial on systems with fast I/O, like NVMe/SSDs, as it makes the planner more likely to choose index scans that involve random disk access. On systems with slower HDDs that have a high seek time, you may want to leave it at the default or higher.

  • seq_page_cost (Default: 1.0) This is the estimated cost of reading a page sequentially from disk. Reducing this value to 0.7 would make sequential scans more attractive to the planner, which might be preferable if your storage is extremely fast.

  • cpu_tuple_cost (Default: 0.01) This parameter estimates the cost of processing each row (tuple) during a query. If you have a CPU-optimised environment, you might consider lowering this value to make plans that process more rows seem less expensive.

  • cpu_index_tuple_cost (Default: 0.005) The cost of processing each index entry during an index scan. Adjusting this value influences the planner's preference for index scans over sequential scans. A lower value (e.g. 0.003) might encourage the use of indexes, but should be changed carefully.

  • cpu_operator_cost (Default: 0.0025) This setting estimates the cost of processing each operator or function in a query; raising it slightly makes compute-heavy plans look more expensive, discouraging the planner from choosing them.

  • parallel_setup_cost (Default: 1000.0) The cost of initiating parallel worker processes for a query. Decreasing this value (e.g. to 500) encourages the planner to use parallel query plans, which can be advantageous if you have many CPU cores that are underutilised.

  • min_parallel_table_scan_size (Default: 8MB) Defines the minimum size of a table scan before the planner considers parallel execution. Increasing this value (e.g. to 16MB) may reduce the use of parallelism for smaller tables, focusing parallel resources on larger scans. Decreasing it (e.g. to 4MB) might encourage more parallelism, even for smaller tables.
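
The practical way to see how these costs change the planner's behaviour is to run EXPLAIN on a representative query before and after each adjustment. A minimal sketch (the state group ID here is just a placeholder):

EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM state_groups_state
WHERE state_group = 12345;

The plan output shows whether an index scan or sequential scan was chosen, along with actual timings and buffer usage, so you can confirm a tweak really helped rather than just trusting the theory.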

Partitioning Parameters

At the time of writing, Synapse doesn't use partitioning in tables, so these should have no effect. However, as they have no negative impact on performance, it's worth enabling them in case partitioned tables appear in the future.

# Allows the planner to consider partitions on joins
enable_partitionwise_join = on

# Allows the planner to consider partitions on aggregation
enable_partitionwise_aggregate = on

  • enable_partitionwise_join: (Default: off) Controls whether the planner can generate query plans that join partitioned tables in a way that considers partitions. Enabling this feature (set to on) can lead to more efficient join operations for partitioned tables.

  • enable_partitionwise_aggregate: (Default: off) Controls whether the planner can generate query plans that perform aggregation in a way that considers partitions. Similar to joins, enabling this feature (set to on) can make aggregation queries more efficient for partitioned tables.

Tuning PostgreSQL for a Matrix Synapse Homeserver

5. Maintenance

Regular maintenance of your PostgreSQL database is important and, when configured correctly, PostgreSQL can take care of most of these duties itself.

Vacuuming

When PostgreSQL deletes data, it doesn't immediately remove it from disk, but rather marks it for removal later. This cleanup task is called "vacuuming", where the old data is removed and the database compacted to improve efficiency and leave less to search in future operations.
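
If you ever need to trigger this by hand (for example after clearing out a large amount of data), a minimal sketch from psql looks like this - the table name is just an example:

VACUUM (VERBOSE, ANALYZE) state_groups_state;

VERBOSE prints progress details, and ANALYZE refreshes the planner statistics for the table at the same time.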

Autovacuum

Autovacuum is PostgreSQL's automated janitor, regularly tidying up to save you doing it manually later. It's a helpful feature that can save you time and effort, but as with most PostgreSQL configuration, the defaults are a rough guess at what the majority of applications might benefit from, and can benefit from tuning to work efficiently with a write-heavy application like Synapse.

Here are the defaults:

autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_scale_factor = 0.2
autovacuum_vacuum_cost_limit = -1 # uses value of vacuum_cost_limit
vacuum_cost_limit = 200

  • autovacuum_analyze_scale_factor: How often an ANALYZE operation is triggered, measured as a fraction of the table size, so 0.1 triggers when at least 10% of the table has changed. Analysing the data keeps the statistics up-to-date, which helps PostgreSQL's query planning.
  • autovacuum_vacuum_scale_factor: How often a vacuum operation is triggered, 0.2 would mean it runs when 20% of the table can be freed. A lower value means that vacuum will run more frequently, reclaiming space more aggressively.
  • autovacuum_vacuum_cost_limit: This sets a limit on how much vacuuming work can be done each run. Increasing this value allows the vacuum process to achieve more each cycle, trading extra disk I/O for faster progress.
  • vacuum_cost_limit: This is the global setting for all vacuum operations, including manual ones. You can adjust the two cost limits separately to have manual and autovacuum operations behave differently.

Here is an example that would run operations more frequently:

autovacuum_analyze_scale_factor = 0.05
autovacuum_vacuum_scale_factor = 0.02
autovacuum_vacuum_cost_limit = 400
vacuum_cost_limit = 300

This example causes more frequent disk I/O, which could affect performance if your caching and working memory aren't optimal. However, in my experience, running the operations more frequently helps to reduce the amount of work required each time, which in turn makes the user experience more consistent.
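
To check whether autovacuum is keeping up, you can look at the dead-tuple counts and the last time each table was vacuumed and analysed; a simple sketch:

SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

Tables that consistently show a large n_dead_tup alongside an old last_autovacuum are good candidates for more aggressive settings.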

Tuning PostgreSQL for a Matrix Synapse Homeserver

6. Checkpoints and Replication

This section primarily deals with the performance options when committing data to disk.

However, if you want to back up the database regularly without impacting the performance of the live database, consider setting up a dedicated replica - it won't technically speed up PostgreSQL, but significantly decreases the performance impact of dumping the backup to disk, and backups typically complete faster too. You can find my guide on this here.

Understanding WAL and Checkpoints

Instead of writing each piece of data to the main database file when it arrives, PostgreSQL uses Write-Ahead Logging (WAL) to protect the main database by logging changes into a file as they occur.

This means that, should a crash or power outage occur, the main database file is less likely to become corrupted, and PostgreSQL can try to recover the WAL and commit it into the database the next time it starts up.

In this process, Checkpoints are the points in time where PostgreSQL guarantees that all past changes have been written into the main database files, so tuning this is important to control disk I/O.

Checkpoint Configuration

  • checkpoint_completion_target: Sets the target time for completing the checkpoint's writing work. The default is 0.5, but increasing this to 0.9 (90% of the checkpoint interval) helps to spread out the I/O load to avoid large amounts of work hitting the disk at once.
  • checkpoint_timeout: Sets the maximum time between checkpoints. This is 5 minutes by default, so increasing to 15 minutes can also help smooth out spikes in disk I/O.
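
You can see whether checkpoints are mostly being triggered by the timer (healthy) or requested early (often a sign max_wal_size is being hit) with a query like this - note that on PostgreSQL 17 and later these counters have moved to the pg_stat_checkpointer view:

SELECT checkpoints_timed,
       checkpoints_req
FROM pg_stat_bgwriter;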

WAL Size Configuration

For these values, you can use the query from the Shared Buffers section to see how much of the shared_buffers are consumed by new changes in the checkpoint window.

  • min_wal_size: This sets the minimum size for the WAL. Setting this too low can cause unnecessary disk I/O as Postgres tries to reduce and recreate WAL files, so it's better to set this to a realistic figure. In my example with 785MB of changed data, it would be reasonable to set the min_wal_size to 1GB.
  • max_wal_size: This is a soft limit, and Postgres will create as much WAL as needed, but setting this to an ample figure helps to reduce disk I/O when there is a spike in changes over the checkpoint window. I typically set this to double the shared_buffers value.
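
To see how much WAL is currently sitting on disk (and so sanity-check your minimum and maximum choices), you can use this sketch - it requires superuser or pg_monitor rights:

SELECT pg_size_pretty(SUM(size)) AS wal_on_disk
FROM pg_ls_waldir();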

WAL Level Configuration

  • If not using replication, set wal_level = minimal to keep the WAL efficient with only the data required to restore after a crash.
  • If replicating to another server, set wal_level = replica to store the necessary data for replication. If you've configured replication, PostgreSQL will actually refuse to start if this is not set!

Tuning PostgreSQL for a Matrix Synapse Homeserver

7. Disk Space

Efficient disk space management ensures that your server remains responsive and that you're making the most of your available resources. This is difficult to cover in detail, as the applications and usage of a Matrix server vary wildly, but I've included some general guidance below:

Database Size

Over time, your PostgreSQL database will grow as more data is added. It's important to keep an eye on the size of your tables, especially those that are known to grow rapidly, such as state_groups_state in Synapse.

This query will list your largest tables:

WITH table_sizes AS (
    SELECT table_schema,
           table_name, 
           pg_total_relation_size('"' || table_schema || '"."' || table_name || '"') AS size
    FROM information_schema.tables
    WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
    ORDER BY size DESC
)
SELECT table_schema AS schema,
       table_name AS table,
       pg_size_pretty(size) AS "size"
FROM table_sizes
LIMIT 10;

 schema |            table             |  size
--------+------------------------------+--------
 public | state_groups_state           | 29 GB
 public | event_json                   | 818 MB
...

On a Synapse server, you should find state_groups_state is by far the largest one, and you can see which rooms contribute the most to it with a query like this:

WITH room_counts AS (
    SELECT room_id,
           COUNT(*),
           COUNT(*) * 1.0 / SUM(COUNT(*)) OVER () AS ratio
    FROM state_groups_state
    GROUP BY room_id
), table_sizes AS (
    SELECT table_schema,
           table_name, 
           pg_total_relation_size('"' || table_schema || '"."' || table_name || '"') AS size
    FROM information_schema.tables
    WHERE table_name = 'state_groups_state'
)
SELECT rc.room_id AS room_id,
       rc.count AS state_entries,
       ROUND(rc.ratio * 100, 2) AS percentage,
       pg_size_pretty(ts.size * rc.ratio) AS estimated_size
FROM room_counts rc, table_sizes ts
ORDER BY rc.count DESC
LIMIT 10;

            room_id             | state_entries | percentage | estimated_size
--------------------------------+---------------+------------+----------------
 !OGEhHVWSdvArJzumhm:matrix.org |     125012687 |      91.75 | 26 GB
 !ehXvUhWNASUkSLvAGP:matrix.org |      10003431 |       7.34 | 2152 MB
...

Synapse Compress State Utility

For Synapse, the state_groups_state table can grow significantly. To help manage this, The Matrix Foundation has developed a tool called Synapse Compress State that can compress state maps without losing any data.

For Docker users, I maintain a Docker image of the project, so you can run it without any other dependencies.

Media Size

Media files, such as images, videos, and other message attachments, are stored on the filesystem rather than in the database, but are tracked in PostgreSQL. Large media files can consume significant disk space, and it can be a challenge to narrow down what is using all of that space through Synapse directly.

With this query you can see how many files of each type were uploaded each month, and the total disk space that consumes:

WITH media_size AS (
    SELECT EXTRACT(YEAR FROM to_timestamp(created_ts / 1000)) AS year,
        EXTRACT(MONTH FROM to_timestamp(created_ts / 1000)) AS month,
        media_type AS mime_type,
        COUNT(*) AS files,
        SUM(media_length) AS total_bytes
    FROM local_media_repository
    GROUP BY media_type, year, month
    ORDER BY total_bytes DESC
)
SELECT year,
    month,
    mime_type,
    files,
    pg_size_pretty(total_bytes) AS total_size
FROM  media_size
LIMIT 10;

 year | month | mime_type  | files | total_size
------+-------+------------+-------+------------
 2023 |     9 | video/mp4  |   464 | 2004 MB
 2023 |     9 | image/png  |   592 | 1648 MB
 2023 |    10 | video/mp4  |   308 | 1530 MB
 2023 |     8 | image/png  |  2614 | 1316 MB
 ...

Managing Media Files

Synapse provides configuration options to manage media files, such as:

  • media_store_path: Defines where on the filesystem media files are stored.
  • max_upload_size: Sets the maximum size for uploaded media files.
  • media_retention: Configures the duration for which media files are retained before being automatically deleted.

Here's an example of how you might configure these in your homeserver.yaml:

media_store_path: "/var/lib/synapse/media"
max_upload_size: "10M"
media_retention:
  local_media_lifetime: 3y
  remote_media_lifetime: 30d

It's important to note that this takes effect shortly after the next server start, so make sure you're not removing anything you want to keep. Remote media in particular is less of a concern as this can be re-retrieved later from other homeservers on demand, but some may wish to keep a local copy in case that server goes offline in the future.

Tuning PostgreSQL for a Matrix Synapse Homeserver

8. Testing Methodology

Monitor Database Connections

You can use this query to see the number of active and idle connections open to each database:

SELECT datname AS database,
       state AS connection_state,
       count(*) AS connections
FROM pg_stat_activity
WHERE datname IS NOT NULL
GROUP BY state, datname
ORDER BY datname;

 database | connection_state | connections
----------+------------------+-------------
 synapse  | idle             |          77
 synapse  | active           |          10
(2 rows)

There's no harm in setting max_connections = 500 in your postgresql.conf, however you may wish to limit the number of connections Synapse opens if it's hardly using them.
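
To compare the configured limit against what's actually in use, this sketch shows both side by side:

SELECT setting::int AS max_connections,
       (SELECT COUNT(*) FROM pg_stat_activity) AS connections_in_use
FROM pg_settings
WHERE name = 'max_connections';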

Adjust Synapse Connection Limits

By default, Synapse is tuned for a single process where all database communication is done by one worker. When running multiple (or even dozens of) workers to spread the load, each worker needs significantly fewer database connections to complete its task.

In Synapse, you can configure the cp_min and cp_max values for this:

database:
  name: psycopg2
  args:
...
    cp_min: 1
    cp_max: 6

Synapse uses a network library called Twisted, which appears to open cp_max connections and never close them, so there's no harm in setting cp_min = 1.

On a monolithic (without workers) Synapse server you could easily set cp_max = 20 to cover the many duties it needs to perform. However, with many workers, you can set cp_max = 6 or lower as each worker has fewer specialised tasks.

After any changes, restart Synapse and ensure it's behaving correctly, and that there aren't any logs showing database errors or advising that connections are prematurely closed - it's far easier to revert a small change now than to troubleshoot the source of a problem later after other changes have been made.

Analysing Query Performance

The pg_stat_statements extension is a powerful tool for analysing query performance. There are many different ways to view the data, but below are a couple of examples to try:

Slowest Queries

This will give you the top 5 slowest queries, how many times they've been called, the total execution time, and average execution time:

SELECT LEFT(query, 80) AS short_query,
       calls,
       ROUND(mean_exec_time) AS average_ms,
       ROUND(total_exec_time) AS total_ms
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 5;

Slowest Queries by Type

If you want to analyse a specific query pattern for slowness, you can filter by the query text:

SELECT LEFT(query, 80) AS short_query,
       ROUND(mean_exec_time) AS average_ms,
       calls,
       ROUND(total_exec_time) AS total_ms
FROM pg_stat_statements
WHERE query LIKE '%INSERT INTO events%'
ORDER BY mean_exec_time DESC
LIMIT 5;

This will help you identify places to optimise: here we're looking at events being inserted into the database, but you could just as easily look at large SELECT statements that scan lots of data.

Continuous Monitoring and Iterative Tuning

Tuning a PostgreSQL database for Synapse is an iterative process. Monitor the connections, query performance, and other vital statistics, then adjust the configuration as needed and observe the impact. Document the changes and the reasons for them, as this could be invaluable for future tuning or troubleshooting.

Likewise, if you record user statistics or Synapse metrics, it can be really valuable to record some details when unusual events occur. What happened on the day the server had twice as many active users as usual? How do Synapse and PostgreSQL react when waves of federation traffic arrive from a larger server? These events can help you understand where the server has its bad days and allow you to prepare so you can avoid a panic if the worst should happen.

Tuning PostgreSQL for a Matrix Synapse Homeserver

9. Migrating PostgreSQL Major Versions

Newer releases of PostgreSQL don't just come with minor bug fixes, but often major security and performance improvements. Whenever possible, it's best to keep abreast of these new releases to take advantage of these benefits.

Minor releases of PostgreSQL (like 16.0 to 16.1) typically arrive every quarter and are backwards compatible, so require no extra effort. However, major releases typically come yearly, and your entire database will need to be migrated from one version to the other to be compatible.
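
If you're not sure which major version you're currently running, you can check from psql before planning anything:

SELECT version();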

The guide below is written with a Docker user in mind, but if you're using PostgreSQL directly (or in a VM) you can simply ignore the Docker steps.

Preparing for Migration

Backups are always recommended, however this process is designed to allow you to revert in minutes without any loss of data. That said, any work you do on your database is at your own risk, and it's best to ensure you always have multiple copies of all data readily to hand.

Depending on the speed of your storage, this process can take up to an hour, so you may wish to inform your users about the scheduled downtime.

Creating a Backup with pg_dumpall

The most reliable method to migrate the data is to simply export a copy from the old database and import it into the new one.

pg_dumpall is the tool we'll use for this, as it includes not only all databases but also users and passwords, so the new cluster will be an identical copy of the old one.

  1. Make sure your Synapse server is stopped so the database is no longer being written to

  2. Log into your current PostgreSQL container:

    docker exec -it your_old_postgres_container bash
    
  3. Use pg_dumpall to create a backup:

    pg_dumpall > /var/lib/postgresql/data/pg_backup.sql
    
  4. Exit the container and copy the backup file from the old container to a safe location on your host:

    docker cp <your_old_postgres_container>:/var/lib/postgresql/data/pg_backup.sql .
    

Now you have your entire database in a single .sql file, you can stop PostgreSQL and prepare the new version.

Setting Up the New PostgreSQL Version

If you're using Docker Compose, you can simply update your docker-compose.yml to use the newer image (e.g. postgres:16-alpine) and change the volume mapping to store the data in a new directory, for example when upgrading from PostgreSQL 15:

services:
  db:
    image: postgres:15-alpine
    volumes:
      - ./pgsql15:/var/lib/postgresql/data

When moving to PostgreSQL 16 we'd change this to:

services:
  db:
    image: postgres:16-alpine
    volumes:
      - ./pgsql16:/var/lib/postgresql/data
      - ./pg_backup.sql:/docker-entrypoint-initdb.d/init.sql

Now, when this container starts up, it will automatically load your data before launching the database for the first time.

Note: PostgreSQL will try to create any users you have defined in the environment variables, so if this interferes with the import, you may need to change the username (e.g. from synapse to synapsetemp) then after the import is complete and you're back in Synapse, you can remove this extra user with DROP USER synapsetemp;.

Restoring Data Manually

If you're not using Docker, or want to load in the data manually, you can simply follow these extra instructions:

  1. If using Docker, copy the backup file to the new container and login:

    docker cp pg_backup.sql your_new_postgres_container:/var/lib/postgresql/data/
    docker exec -it your_new_postgres_container bash
    
  2. Restore the backup using psql:

    psql -U postgres -f /var/lib/postgresql/data/pg_backup.sql
    
  3. Assuming that went without error, you can now remove the copy you made:

    rm /var/lib/postgresql/data/pg_backup.sql
    

Completing the Migration

The backup does not include your configuration, so once the restore is complete, stop PostgreSQL and copy over your postgresql.conf file. When you start it again, it's possible some of the configuration options may have changed between versions, so watch the logs to confirm it starts without error.

Once this is complete, you should be safe to start Synapse and confirm it can log in to the database without error.

If you used the automated Docker instructions above, remove the ./pg_backup.sql:/docker-entrypoint-initdb.d/init.sql line from the "volumes" section and delete the pg_backup.sql file. Nothing should break if you leave them there, as docker-entrypoint-initdb.d is only read when the container starts with no existing database, but removing these extra files saves disk space and keeps things tidy for next time.

Reverting Back

If your new version of PostgreSQL doesn't start up correctly, or Synapse can't connect to it, you're not stuck!

Take a copy of the logs to help investigation later, then simply stop Synapse and PostgreSQL, and change back the settings above (e.g. image: and volumes: if you used the Docker Compose method) and bring them back up again.

You should now be back up and running on the previous version with plenty of time to investigate what occurred before the next attempt.

Tuning PostgreSQL for a Matrix Synapse Homeserver

Conclusion

Throughout this guide, we've touched on key areas such as network configuration, worker alignment, memory management, regular maintenance, replication strategies, and systematic testing, but performance tuning PostgreSQL is more than a technical exercise; it's a philosophy of continuous improvement, adaptability, and understanding.

This is very much a starting point for anyone interested in making the most of their hardware, and a well-tuned system is not a destination but a journey: if you embrace the process, apply the principles, and let the knowledge you've acquired today guide you towards a more efficient and reliable Synapse environment, hopefully you'll be empowered to manage your own configuration to meet all demands for many years to come.

Setting Up a Replica for Backups for PostgreSQL in Docker

Introduction

Backing up a write-heavy database like Synapse can be a challenge: in my case, a dump of the database would take >15 minutes and cause all sorts of performance and locking issues in the process.

This guide will walk you through setting up replication from a primary PostgreSQL Docker container to a secondary container dedicated for backups.

By the end, you'll have a backup system that's efficient, minimises performance impact, and keeps your data safe.

Setting Up a Replica for Backups for PostgreSQL in Docker

1. Preparing Docker Compose

Below is an example of my database entry in docker-compose.yml:

volumes:
  sockets:

services:
  db:
    cpus: 4
    image: postgres:alpine
    environment:
      POSTGRES_DB: synapse
      POSTGRES_USER: synapse
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_INITDB_ARGS: "--encoding=UTF-8 --lc-collate=C --lc-ctype=C"
    mem_limit: 8G
    restart: always
    volumes:
      - sockets:/sockets
      - ./pgsql:/var/lib/postgresql/data

As you can see, I'm using a "sockets" volume for Unix socket communication, which, as well as avoiding unnecessary open TCP ports, provides a lower-latency connection when containers are on the same host.

To do the same, just ensure you have this in your postgresql.conf to let Postgres know where to write its sockets:

unix_socket_directories = '/sockets'

If you're not using sockets (e.g. your replica's on a different host) then you may need to adjust some of the later steps to replicate via TCP port instead.

I've then added this replica, almost identical except for the standby configuration with lower resource limits:

  db-replica:
    cpus: 2
    depends_on:
      - db
    image: postgres:alpine
    environment:
      POSTGRES_DB: synapse
      POSTGRES_USER: synapse
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_STANDBY_MODE: "on"
      POSTGRES_PRIMARY_CONNINFO: host=/sockets user=synapse password=${POSTGRES_PASSWORD}
    mem_limit: 2G
    restart: always
    volumes:
      - sockets:/sockets
      - ./pgreplica:/var/lib/postgresql/data

You can try setting lower limits, but I prefer to allow the replica 2 cores to avoid replication interruptions while the backup runs, as on fast storage the backup can easily drive one core to 100%.

Setting Up a Replica for Backups for PostgreSQL in Docker

2. Configuration

  1. Primary Postgres Configuration:

    Now, you'll likely want this at the bottom of your postgresql.conf to make sure it's ready to replicate:

    hot_standby = on
    archive_mode = off
    wal_level = replica
    max_wal_senders = 3
    wal_keep_size = 1024
    

    It'll need to be restarted for these changes to take effect, which is safest done now, before copying the data:

    docker compose down db && docker compose up db -d
    
  2. Preparing Replica Data:

    Postgres replication involves streaming updates as they're made to the database, so to start we'll need to create a duplicate of the current database to use for the replica.

    You can create a copy of your entire database like this, just substitute the container name and user as required:

    docker exec -it synapse-db-1 pg_basebackup -h /sockets -U synapse -D /tmp/pgreplica
    

    The data is initially written to /tmp/ inside the container as it's safest for permissions. We can then move it to /var/lib/postgresql/data/ so we can more easily access it from the host OS:

    docker exec -it synapse-db-1 mv /tmp/pgreplica /var/lib/postgresql/data/
    

    You can hopefully now reach the data and move it to a new directory for your replica, updating the ownership to match your existing Postgres data directory:

    mv ./pgsql/pgreplica ./
    chown -R 70:1000 ./pgreplica
    
  3. Replica Postgres Configuration:

    Now for the replica's postgresql.conf, add this to the bottom to tell it that it's a secondary and scale back its resource usage as it won't be actively serving clients:

    port = 5433
    hot_standby = on
    checkpoint_timeout = 30min
    shared_buffers = 512MB
    effective_cache_size = 1GB
    maintenance_work_mem = 128MB
    work_mem = 4MB
    max_wal_size = 2GB
    max_parallel_workers_per_gather = 1
    max_parallel_workers = 1
    max_parallel_maintenance_workers = 1
    
  4. Primary Postgres Replication

    This will instruct the primary to allow replication:

    # Enable replication for the user
    docker exec -it your_primary_container_name psql -U synapse -c "ALTER USER synapse WITH REPLICATION;"
    
    # Create a replication slot
    docker exec -it your_primary_container_name psql -U synapse -c "SELECT * FROM pg_create_physical_replication_slot('replica_slot_name');"
    

Setting Up a Replica for Backups for PostgreSQL in Docker

3. Starting Replication

Once you run docker compose up db-replica -d, your new replica should be running.

Running this command confirms that the primary sees the replica and is streaming data to it:

docker exec -it synapse-db-1 psql -h /sockets -U synapse -d synapse -c "SELECT application_name, state, sync_priority, sync_state, pg_current_wal_lsn() - sent_lsn AS bytes_behind FROM pg_stat_replication;"

The output should look something like this:

 application_name |   state   | sync_priority | sync_state | bytes_behind
------------------+-----------+---------------+------------+--------------
 walreceiver      | streaming |             0 | async      |            0
(1 row)
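
You can also check the lag from the replica's side; run this sketch against the replica (it listens on port 5433 in this setup) to see how far behind its replay is in time:

SELECT now() - pg_last_xact_replay_timestamp() AS replay_lag;

A few seconds or less is normal for an asynchronous standby that's keeping up, though on a quiet primary the value can appear to grow simply because no new transactions have arrived.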

Replica Logs

When running docker logs synapse-db-replica-1 (adjusting your replica's name as necessary) we should now see messages distinct from the primary's typical "checkpoint" logs. Here's a concise breakdown using an example log:

LOG:  entering standby mode
LOG:  consistent recovery state reached at [WAL location]
LOG:  invalid record length at [WAL location]: wanted [X], got 0
LOG:  started streaming WAL from primary at [WAL location] on timeline [X]
LOG:  restartpoint starting: [reason]
LOG:  restartpoint complete: ...
LOG:  recovery restart point at [WAL location]

Key Points:

  • Entering Standby Mode: The replica is ready to receive WAL records.
  • Consistent Recovery State: The replica is synchronized with the primary's WAL records.
  • Invalid Record Length: An informational message indicating the end of available WAL records.
  • Started Streaming WAL: Active replication is in progress.
  • Restart Points: Periodic checkpoints in the replica for data consistency.
  • Recovery Restart Point: The point where recovery would begin if the replica restarts.

If you're seeing errors here, double-check the steps above: Postgres will refuse to start if the configuration between the two containers is too different, so if you've skipped steps or done them out of order then it should explain quite verbosely what went wrong here.

Setting Up a Replica for Backups for PostgreSQL in Docker

4. Backup Script

I've written the following script to take a backup - the backup is compressed into a gzip archive once it completes, to save space and minimise wear on your storage:

#!/bin/bash

# Define backup directory, filenames, and the number of backups to keep
BACKUP_DIR="/path/to/backups"
CURRENT_BACKUP_DIR="$BACKUP_DIR/backup_$(date +%Y%m%d%H%M)"
CURRENT_BACKUP_ARCHIVE="$CURRENT_BACKUP_DIR.tar.gz"
NUM_BACKUPS_TO_KEEP=6

# Create a new backup using pg_basebackup
# Note: the -D path is interpreted inside the replica container, so BACKUP_DIR
# must be bind-mounted into the container at the same path for the host-side
# compression step below to see the files
mkdir -p $CURRENT_BACKUP_DIR
docker exec synapse-db-replica-1 pg_basebackup -h /sockets -U synapse -D $CURRENT_BACKUP_DIR -Fp

# Check if the backup was successful
if [ $? -eq 0 ]; then
    echo "Backup successful! Compressing the backup directory..."
    
    # Compress the backup directory
    tar -czf $CURRENT_BACKUP_ARCHIVE -C $CURRENT_BACKUP_DIR .
    rm -rf $CURRENT_BACKUP_DIR

    # Check if previous backups exist
    if [ -n "$(ls $BACKUP_DIR/backup_*.tar.gz 2>/dev/null)" ]; then
        PREVIOUS_BACKUPS=($(ls $BACKUP_DIR/backup_*.tar.gz | sort -r))

        # If there are more backups than the specified number, delete the oldest ones
        if [ ${#PREVIOUS_BACKUPS[@]} -gt $NUM_BACKUPS_TO_KEEP ]; then
            for i in $(seq $(($NUM_BACKUPS_TO_KEEP + 1)) ${#PREVIOUS_BACKUPS[@]}); do
                rm -f ${PREVIOUS_BACKUPS[$i-1]}
            done
        fi
    fi
else
    echo "Backup failed!"
    rm -rf $CURRENT_BACKUP_DIR
fi

To configure, simply set the BACKUP_DIR to the location you want your backups to be stored, the NUM_BACKUPS_TO_KEEP to the number of previous backups to store before removal, and update the docker exec line to match your replica's details.

You could also tailor the script to your specific needs, for example, by adding email notifications to let you know when backups are failing for any reason.

Make sure to mark this script as executable so it can be run:

chmod +x /path/to/postgres_backup.sh

We can then configure a cron job (e.g. in /etc/cron.d/postgres) to run it:

30 */4 * * * root /path/to/postgres_backup.sh 2>&1 | logger -t "postgres-backup"

This would run every 4 hours starting at 00:30; alternatively, you could set a specific list of hours like this:

30 3,7,11,15,19,23 * * * root /path/to/postgres_backup.sh 2>&1 | logger -t "postgres-backup"

Setting Up a Replica for Backups for PostgreSQL in Docker

5. Conclusion

And there you have it! A dedicated backup replica for your PostgreSQL database in Docker.

Remember: this is a general guide that should work for a lot of people, but it's impossible to cover every scenario, so if you've read this far, it's recommended to try with a test database first, and spend some time deciding whether this solution is for you.

For potential troubleshooting or further reading, consider referring to the official Postgres documentation on replication and backups.