Initial commit — encrypted chat server + Python clients (v0.8.5)
E2E encrypted chat (X3DH + Double Ratchet, Signal Protocol). Server: asyncio TCP + TLS, MySQL. Clients: PyQt6 GUI + CLI. Secrets (.env, TLS keys, Cloudflare token), runtime data and mobile clients (separate repos) are gitignored. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
252
scaling.md
Normal file
@@ -0,0 +1,252 @@
# Server scaling: capacity growth plan

## Target hardware

- **CPU:** Intel Xeon E5-2630v4 (10 cores / 20 threads, 2.2 GHz)
- **RAM:** 256 GB REG ECC
- **Disk:** 500 GB SSD (boot/OS/DB) + 4 TB HDD (files)
- **Network:** 1 Gbit

Estimated capacity after optimization: **10,000–20,000 users**, **2,000–5,000 messages/s**

---

## Step 1: Immediate changes (done in code)

### 1a. Thread pool (`server.py`)

```env
THREAD_POOL_SIZE=40
```

Sets `ThreadPoolExecutor(max_workers=40)` as the default executor used by `asyncio.to_thread()`.
With 20 hardware threads and a DB latency of ~2–5 ms, 40 workers (2x the hardware threads) is a reasonable setting, because the workers spend most of their time waiting on I/O.
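
As an illustration of the setting above, here is a minimal sketch of how a default executor can be installed so that `asyncio.to_thread()` uses it. It is not taken from `server.py`; `blocking_query` is a hypothetical stand-in for a blocking DB call, and reading `THREAD_POOL_SIZE` directly from the process environment is an assumption.

```python
import asyncio
import os
from concurrent.futures import ThreadPoolExecutor


def blocking_query(n: int) -> int:
    # Hypothetical stand-in for a blocking DB call.
    return n * 2


async def main() -> list[int]:
    # Assumption: THREAD_POOL_SIZE is already in the environment; 40 mirrors the value above.
    pool_size = int(os.environ.get("THREAD_POOL_SIZE", "40"))
    loop = asyncio.get_running_loop()
    # asyncio.to_thread() submits work to the loop's default executor.
    loop.set_default_executor(ThreadPoolExecutor(max_workers=pool_size))
    # Run several blocking calls concurrently without blocking the event loop.
    return await asyncio.gather(*(asyncio.to_thread(blocking_query, i) for i in range(5)))


# asyncio.run() shuts the default executor down when the loop closes.
results = asyncio.run(main())
```

The executor has to be installed before the first `asyncio.to_thread()` call, otherwise the loop lazily creates its own default pool.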

### 1b. DB pool (`.env`)

```env
DB_POOL_SIZE=30
```

30 simultaneous MySQL connections. With 40 thread workers and ~2 ms queries, 30 pooled connections are sufficient.
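
The sizing can be sanity-checked with Little's law (average concurrency = arrival rate x time in system). The sketch below assumes, purely for illustration, one DB query per message; the real ratio is not stated in this document, so scale the result by the actual number of queries per message.

```python
def busy_connections(queries_per_second: int, query_latency_ms: int) -> float:
    """Little's law: average number of connections busy at any moment."""
    return queries_per_second * query_latency_ms / 1000


# Assumption: one query per message, using the throughput and latency ranges from this plan.
low = busy_connections(2000, 2)    # 2,000 msg/s at 2 ms per query
high = busy_connections(5000, 5)   # 5,000 msg/s at 5 ms per query
```

Under that assumption the estimate spans about 4 to 25 busy connections on average, which stays below the pool size of 30. Bursts above the average would queue on the pool rather than fail.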

### 1c. Missing DB indexes (`schema.sql`)

Five new indexes were added for the most frequent queries:

| Index | Table | Query it speeds up |
|-------|-------|--------------------|
| `idx_cm_user (user_id)` | `conversation_members` | `list_user_conversations` (**critical**: without it, a full table scan) |
| `idx_inv_user (user_id)` | `group_invitations` | `get_pending_invitations` |
| `idx_messages_deleted (conversation_id, deleted_at)` | `messages` | `get_deleted_messages_since` |
| `idx_messages_pinned (conversation_id, pinned_at)` | `messages` | `get_pinned_messages` |
| `idx_reads_user (user_id)` | `message_reads` | `get_unread_counts` |

**SQL migration for an existing database:**

```sql
ALTER TABLE conversation_members ADD INDEX idx_cm_user (user_id);
ALTER TABLE group_invitations ADD INDEX idx_inv_user (user_id);
ALTER TABLE messages ADD INDEX idx_messages_deleted (conversation_id, deleted_at);
ALTER TABLE messages ADD INDEX idx_messages_pinned (conversation_id, pinned_at);
ALTER TABLE message_reads ADD INDEX idx_reads_user (user_id);
```
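
Running the migration twice fails on the indexes that already exist, so one option is to generate only the missing statements. The following is a sketch, not part of the repository: `migration_statements` is a hypothetical helper, and it assumes the set of existing index names has been fetched separately (for example from `information_schema.statistics`).

```python
# Index name -> (table, columns), copied from the table above.
REQUIRED_INDEXES = {
    "idx_cm_user": ("conversation_members", ("user_id",)),
    "idx_inv_user": ("group_invitations", ("user_id",)),
    "idx_messages_deleted": ("messages", ("conversation_id", "deleted_at")),
    "idx_messages_pinned": ("messages", ("conversation_id", "pinned_at")),
    "idx_reads_user": ("message_reads", ("user_id",)),
}


def migration_statements(existing: set[str]) -> list[str]:
    """Return ALTER TABLE statements only for indexes not in `existing`."""
    statements = []
    for name, (table, columns) in REQUIRED_INDEXES.items():
        if name in existing:
            continue  # already present, skip to keep the migration re-runnable
        statements.append(f"ALTER TABLE {table} ADD INDEX {name} ({', '.join(columns)});")
    return statements


# Example: a database where only the message_reads index is still missing.
pending = migration_statements(
    {"idx_cm_user", "idx_inv_user", "idx_messages_deleted", "idx_messages_pinned"}
)
```

Matching by index name alone works here because the five names are unique across tables.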

### 1d. Upload directory on the HDD

```env
UPLOAD_DIR=/mnt/hdd/encrypted_chat/uploads
```

Encrypted files and avatars go on the 4 TB HDD, so the SSD stays reserved for the OS and MySQL data.

```bash
mkdir -p /mnt/hdd/encrypted_chat/uploads
chmod 700 /mnt/hdd/encrypted_chat/uploads
```
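
The same setup can also be done at server start-up, so a missing directory does not surface only on the first upload. This is a sketch under assumptions: `ensure_upload_dir` is a hypothetical helper, not a function from the codebase, and the demo uses a temporary directory instead of `/mnt/hdd`.

```python
import stat
import tempfile
from pathlib import Path


def ensure_upload_dir(path: str) -> Path:
    """Create the upload directory if needed and restrict it to the owner."""
    upload_dir = Path(path)
    upload_dir.mkdir(parents=True, exist_ok=True)
    # mkdir's mode argument is filtered by the umask, so set permissions explicitly.
    upload_dir.chmod(0o700)
    return upload_dir


# Demo against a temporary directory (the production path would come from UPLOAD_DIR).
with tempfile.TemporaryDirectory() as tmp:
    created = ensure_upload_dir(str(Path(tmp) / "encrypted_chat" / "uploads"))
    mode = stat.S_IMODE(created.stat().st_mode)
```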

---

## Step 2: MySQL tuning for 256 GB RAM

### `/etc/mysql/mysql.conf.d/tuning.cnf` (or the equivalent in Docker)

```ini
[mysqld]
# === Buffer pool: main cache for data + indexes ===
# 96 GB = ~37% of RAM (MySQL and the app share one machine)
innodb_buffer_pool_size = 96G
innodb_buffer_pool_instances = 16

# === Redo log: larger = less I/O, faster writes (requires MySQL 8.0.30+) ===
innodb_redo_log_capacity = 4G

# === Flush strategy ===
# 2 = write the log to the OS cache on every commit, flush to disk once per second
# Up to ~1 s of transactions can be lost on an OS crash or power loss,
# in exchange for roughly 10x faster writes (estimate)
innodb_flush_log_at_trx_commit = 2
# O_DIRECT = bypass the OS page cache (InnoDB has its own)
innodb_flush_method = O_DIRECT

# === I/O capacity (SSD) ===
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000

# === Connections ===
max_connections = 200

# === Sort/join buffers ===
sort_buffer_size = 4M
join_buffer_size = 4M
read_buffer_size = 2M
read_rnd_buffer_size = 2M

# === Temporary tables ===
tmp_table_size = 256M
max_heap_table_size = 256M

# === Query cache (removed in MySQL 8.0; only relevant on 5.7) ===
# query_cache_type = 0

# === Thread cache ===
thread_cache_size = 64

# === Binary logging (for future replicas) ===
# server-id = 1
# log_bin = /var/log/mysql/mysql-bin
# binlog_expire_logs_seconds = 604800
# max_binlog_size = 256M
```

**If MySQL runs in Docker:**

```yaml
# docker-compose.yml
services:
  mysql:
    image: mysql:8.0
    volumes:
      - /var/lib/mysql:/var/lib/mysql  # data on the SSD
      - ./tuning.cnf:/etc/mysql/conf.d/tuning.cnf
    deploy:
      resources:
        limits:
          memory: 128G  # cap it so memory is left for the app
    environment:
      MYSQL_DATABASE: encrypted_chat
```

### After applying, restart MySQL and verify:

```sql
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
SHOW ENGINE INNODB STATUS\G
```
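
A rough memory budget helps confirm that these values fit under the 128G container limit. The arithmetic below is a simplified sketch: it assumes every connection allocates each session buffer exactly once, and it ignores temporary tables, the thread stacks, and other InnoDB overhead, so treat the result as an approximation rather than a guarantee.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

buffer_pool = 96 * GIB            # innodb_buffer_pool_size
max_connections = 200             # max_connections

# sort_buffer_size + join_buffer_size + read_buffer_size + read_rnd_buffer_size
per_connection = (4 + 4 + 2 + 2) * MIB

# Simplified worst case: all 200 connections use every buffer once at the same time.
session_buffers = max_connections * per_connection
total_gib = (buffer_pool + session_buffers) / GIB

container_limit_gib = 128
headroom_gib = container_limit_gib - total_gib
```

With these inputs the session buffers add 2,400 MiB on top of the buffer pool, for a total of about 98.3 GiB, which leaves roughly 29.7 GiB of headroom inside the container limit.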

---

## Step 3: Recommended `.env` for production

```env
# Server
SERVER_HOST=0.0.0.0
SERVER_PORT=9999

# MySQL
MYSQL_HOST=127.0.0.1
MYSQL_PORT=3306
MYSQL_USER=sifrator
MYSQL_PASSWORD=<strong-password>
MYSQL_DATABASE=encrypted_chat
DB_POOL_SIZE=30

# Performance
THREAD_POOL_SIZE=40

# Storage
UPLOAD_DIR=/mnt/hdd/encrypted_chat/uploads

# TLS (enable for production)
TLS_ENABLED=true
TLS_CERT_FILE=/etc/letsencrypt/live/chat.example.com/fullchain.pem
TLS_KEY_FILE=/etc/letsencrypt/live/chat.example.com/privkey.pem

# Logging
LOG_LEVEL=INFO
```
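
To show how such values might be consumed, here is a small sketch of a typed settings loader. It is illustrative only: the `Settings` class, `load_settings`, and the fallback defaults are assumptions rather than the server's actual configuration code, and it expects the variables to already be present in the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    server_host: str
    server_port: int
    db_pool_size: int
    thread_pool_size: int
    upload_dir: str
    tls_enabled: bool
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables from the .env above; the defaults are assumed fallbacks."""
    return Settings(
        server_host=env.get("SERVER_HOST", "0.0.0.0"),
        server_port=int(env.get("SERVER_PORT", "9999")),
        db_pool_size=int(env.get("DB_POOL_SIZE", "30")),
        thread_pool_size=int(env.get("THREAD_POOL_SIZE", "40")),
        upload_dir=env.get("UPLOAD_DIR", "/mnt/hdd/encrypted_chat/uploads"),
        # Accept a few common spellings of "true"; anything else disables TLS.
        tls_enabled=env.get("TLS_ENABLED", "false").strip().lower() in {"1", "true", "yes"},
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


# Example with an explicit mapping instead of the real environment.
settings = load_settings({"SERVER_PORT": "9999", "TLS_ENABLED": "true", "DB_POOL_SIZE": "30"})
```

Parsing everything once at start-up means a malformed value such as `SERVER_PORT=abc` fails immediately with a `ValueError` instead of later at runtime.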

---

## Step 4: Monitoring (recommended)

### Simple metrics without external tools

Add periodic logging to the server:

```python
# In _periodic_cleanup() (every 10 min):
async with _clients_lock:
    total_connections = sum(len(v) for v in connected_clients.values())
    unique_users = len(connected_clients)
logger.info("[STATS] users=%d connections=%d", unique_users, total_connections)
```

### With external tools (optional)

- **htop**: CPU / RAM usage of the process
- **mysqladmin status**: queries/s, slow queries, connections
- **Prometheus + Grafana**: long-term trends (add only when needed)

---

## Future scaling

### Phase A: Separate MySQL (15K+ users)

Move MySQL to a separate machine (or a managed DB): the app server and Redis on one host, the database on another.

```
[Server: App + Redis] ──TCP──▶ [Server: MySQL]
        │
        └──▶ [HDD/S3: files]
```

### Phase B: Horizontal scaling (50K+ users)

Multiple app servers behind a load balancer, plus Redis Pub/Sub for cross-server notifications.

```
            ┌─── App server 1 ────┐
Client ──▶  │  connected_clients  │──┐
            └─────────────────────┘  │
                                     ├──▶ Redis Pub/Sub ──▶ MySQL
            ┌─── App server 2 ────┐  │
Client ──▶  │  connected_clients  │──┘
            └─────────────────────┘
                       ▲
         Load Balancer (HAProxy / nginx stream)
             (sticky sessions by user_id)
```

Main change: when a user is not connected to this server, `_notify_users()` publishes to Redis instead of delivering through the local `connected_clients`.
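
The routing decision described above might look roughly like the sketch below. It is a simplified, synchronous illustration and not the actual `_notify_users()`: `route_notifications`, the `notify:<user_id>` channel naming, and the injected `deliver_local` / `publish_remote` callables are all assumptions. In a real deployment `publish_remote` would wrap a Redis client's publish call.

```python
import json
from typing import Any, Callable


def route_notifications(
    user_ids: list[int],
    payload: dict[str, Any],
    local_clients: dict[int, list[Any]],
    deliver_local: Callable[[Any, dict[str, Any]], None],
    publish_remote: Callable[[str, str], None],
) -> tuple[list[int], list[int]]:
    """Deliver to local connections when possible, otherwise publish for other servers."""
    local, remote = [], []
    for user_id in user_ids:
        connections = local_clients.get(user_id)
        if connections:
            for conn in connections:
                deliver_local(conn, payload)
            local.append(user_id)
        else:
            # Hypothetical channel naming; other app servers would subscribe to it.
            publish_remote(f"notify:{user_id}", json.dumps(payload))
            remote.append(user_id)
    return local, remote


# Demo with in-memory stand-ins for sockets and Redis.
sent, published = [], []
connected_clients = {1: ["conn-a", "conn-b"]}
local, remote = route_notifications(
    [1, 2],
    {"type": "new_message"},
    connected_clients,
    lambda conn, p: sent.append((conn, p["type"])),
    lambda channel, message: published.append((channel, message)),
)
```

Injecting the two callables keeps the routing logic testable without a running Redis instance.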

### Phase C: DB scaling (100K+ users)

- Read replicas for SELECT queries
- Partition the `messages` table by month
- Shard by `conversation_id`
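
As a rough illustration of the last two items, the sketch below maps a message timestamp to a monthly partition name and a conversation to a shard. Both helpers are hypothetical: the `pYYYYMM` naming and plain modulo sharding are assumptions chosen for simplicity, not decisions recorded in this plan.

```python
from datetime import datetime


def month_partition(sent_at: datetime) -> str:
    """Name of the monthly partition a message would fall into, e.g. p202503."""
    return f"p{sent_at.year:04d}{sent_at.month:02d}"


def shard_for(conversation_id: int, shard_count: int) -> int:
    """Pick a shard so that all messages of one conversation live together."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return conversation_id % shard_count


partition = month_partition(datetime(2025, 3, 14))
shard = shard_for(12345, 4)
```

Modulo sharding is easy to reason about, but changing `shard_count` later moves most conversations to a different shard; consistent hashing or a lookup table are common alternatives if resharding is expected.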

---

## Overview: what is done

| Step | Status | Description |
|------|--------|-------------|
| asyncio.to_thread() for DB | **Done** | 131 DB calls offloaded to the thread pool |
| ThreadPoolExecutor(40) | **Done** | Configurable via `THREAD_POOL_SIZE` |
| DB indexes (5 new) | **Done** | Schema + SQL migration prepared |
| UPLOAD_DIR on HDD | **Configuration** | Set in `.env` |
| MySQL tuning | **Configuration** | Apply `tuning.cnf` |
| TLS certificate | **TODO** | Let's Encrypt or own CA |
| Monitoring | **Optional** | Periodic stats logging |