
Performance Troubleshooting

Diagnose and optimize performance issues.

Performance Metrics

Key Metrics to Monitor

# Get all metrics
curl http://localhost:8080/metrics

# Key performance metrics:
# - Stream latency
# - Throughput
# - Active connections
# - Buffer usage
Metric                         What It Tells You
stream_open_latency_seconds    Time to establish streams
bytes_sent_total               Data throughput
streams_active                 Current load
keepalive_rtt_seconds          Network latency
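
To track one metric over time rather than reading the full dump, filter the output. A minimal sketch, assuming the endpoint serves standard Prometheus text format and streams_active carries no labels:

# Print just the current value of streams_active
curl -s http://localhost:8080/metrics | awk '/^streams_active/ {print $2}'

# Refresh it every 5 seconds
watch -n 5 "curl -s http://localhost:8080/metrics | grep '^streams_active'"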

High Latency

Symptoms

  • Slow page loads
  • SSH feels sluggish
  • High stream_open_latency_seconds

Diagnosis

# Check stream open latency
curl http://localhost:8080/metrics | grep stream_open_latency

# Check keepalive RTT
curl http://localhost:8080/metrics | grep keepalive_rtt

# Count hops
curl http://localhost:8080/healthz | jq '.routes'
# Each route entry corresponds to one hop on the path
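
If the healthz payload returns .routes as a JSON array (an assumption; adjust the filter to the actual shape of your response), jq can count the hops directly:

# Number of route entries on the path
curl -s http://localhost:8080/healthz | jq '.routes | length'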

Solutions

1. Reduce hop count

Fewer hops = lower latency.

Before: A -> B -> C -> D -> E (4 hops)
After: A -> B -> E (2 hops)

2. Use faster transports

QUIC is generally faster than HTTP/2 or WebSocket.

3. Optimize network path

  • Use geographically closer relays
  • Avoid high-latency links

4. Tune keepalive

connections:
  idle_threshold: 60s # Less frequent keepalives

Low Throughput

Symptoms

  • Slow file transfers
  • Low bytes_sent_total rate
  • Buffering on streams

Diagnosis

# Check throughput
curl http://localhost:8080/metrics | grep bytes

# Check for throttling
curl http://localhost:8080/metrics | grep stream

# Check buffer status (if available)

Solutions

1. Increase buffer size

limits:
  buffer_size: 524288 # 512 KB (default 256 KB)

Larger buffers = better throughput, but more memory.

2. Check for bottlenecks

# Test network speed between hops
iperf3 -c peer-address -p 5201

# Check if CPU-bound
top -p $(pgrep muti-metroo)
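
iperf3 needs a server process listening on the far side, and a UDP run reports packet loss directly, which helps decide whether QUIC (below) will make a difference. A sketch, with the port and bandwidth as example values:

# On the remote peer: start an iperf3 server first
iperf3 -s -p 5201

# From this hop: a UDP run at 100 Mbit/s reports jitter and packet loss
iperf3 -u -b 100M -c peer-address -p 5201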

3. Use QUIC

QUIC handles packet loss better than TCP-based transports.

4. Reduce frame overhead

Ensure frames are at or near max size (16 KB).

High Memory Usage

Symptoms

  • Agent using excessive RAM
  • OOM kills
  • System slowdown

Diagnosis

# Check memory usage
ps aux | grep muti-metroo
cat /proc/$(pgrep muti-metroo)/status | grep -i mem

# Check stream count
curl http://localhost:8080/metrics | grep streams_active

Calculation

Memory per stream = buffer_size x number of hops

Example: 1000 streams x 256 KB buffer x 3 hops = 768 MB total
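
The same estimate can be scripted against the live agent. A sketch, assuming streams_active is an unlabeled gauge and a 3-hop path; substitute your own buffer size and hop count:

# Total buffer memory ~= streams x buffer_size x hops
curl -s http://localhost:8080/metrics | awk '/^streams_active/ {streams=$2}
  END {buffer_kb=256; hops=3; printf "~%.0f MB of buffer memory\n", streams*buffer_kb*hops/1024}'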

Solutions

1. Reduce buffer size

limits:
  buffer_size: 131072 # 128 KB

2. Limit concurrent streams

limits:
  max_streams_per_peer: 500
  max_streams_total: 2000

3. Reduce hop count

Fewer hops = less buffering per stream.

4. Add memory limits (container)

# docker-compose.yml
services:
  agent:
    deploy:
      resources:
        limits:
          memory: 1G
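
After restarting the container, you can confirm the limit is applied and watch live usage against it. This assumes Docker Compose v2 and the service name "agent" from the file above:

# Show memory usage against the configured limit
docker stats $(docker compose ps -q agent)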

High CPU Usage

Symptoms

  • Agent using high CPU
  • Slow response times
  • System load high

Diagnosis

# Check CPU usage
top -p $(pgrep muti-metroo)

# CPU profiling
curl http://localhost:8080/debug/pprof/profile?seconds=30 > cpu.prof
go tool pprof cpu.prof

Solutions

1. Reduce logging

agent:
  log_level: "warn" # Not debug or info

2. Limit stream count

More streams = more CPU.

limits:
  max_streams_total: 5000

3. Use faster hardware

CPU-bound workloads benefit from faster cores.

Connection Issues

Too Many Connections

# Check connection count
netstat -an | grep 4433 | wc -l
curl http://localhost:8080/metrics | grep peers_connected
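
To see whether the count is growing or holding steady, sample it over time. A minimal sketch, assuming peers_connected is an unlabeled gauge in the metrics output:

# Sample the peer count every 10 seconds
while true; do
  echo "$(date +%T) $(curl -s http://localhost:8080/metrics | awk '/^peers_connected/ {print $2}')"
  sleep 10
done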

Solutions:

limits:
  max_streams_per_peer: 500 # Limit per peer

Connection Churn

Frequent connect/disconnect cycles waste resources.

# Check reconnection rate
curl http://localhost:8080/metrics | grep peer_disconnects
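
A counter that keeps climbing indicates churn. A rough rate check, assuming peer_disconnects is exposed as a cumulative counter (this sums across all matching series):

# Disconnects over a 60-second window
BEFORE=$(curl -s http://localhost:8080/metrics | awk '/^peer_disconnects/ {sum+=$NF} END {print sum+0}')
sleep 60
AFTER=$(curl -s http://localhost:8080/metrics | awk '/^peer_disconnects/ {sum+=$NF} END {print sum+0}')
echo "$(awk -v a="$AFTER" -v b="$BEFORE" 'BEGIN {print a-b}') disconnects in the last minute"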

Solutions:

  1. Increase timeouts
  2. Improve network stability
  3. Check for misbehaving peers

pprof Debugging

Muti Metroo exposes pprof for profiling:

# CPU profile
curl http://localhost:8080/debug/pprof/profile?seconds=30 > cpu.prof
go tool pprof cpu.prof

# Memory profile
curl http://localhost:8080/debug/pprof/heap > heap.prof
go tool pprof heap.prof

# Goroutine dump
curl http://localhost:8080/debug/pprof/goroutine?debug=2

# Block profile (where goroutines block)
curl http://localhost:8080/debug/pprof/block > block.prof
go tool pprof block.prof
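
Once you have a profile file, go tool pprof can summarize it non-interactively or serve a browsable view:

# Top functions by CPU time
go tool pprof -top cpu.prof

# Interactive graph and flame-graph view in the browser
go tool pprof -http=:8081 heap.prof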

Optimization Checklist

For Latency

  • Minimize hop count
  • Use QUIC transport
  • Geographically optimize relay placement
  • Check network latency between hops

For Throughput

  • Increase buffer sizes
  • Use QUIC transport
  • Check for network bottlenecks
  • Monitor for packet loss

For Memory

  • Reduce buffer size if needed
  • Limit stream counts
  • Monitor active streams
  • Set memory limits

For CPU

  • Reduce logging verbosity
  • Limit stream counts
  • Profile with pprof
  • Check for excessive reconnections

Performance Tuning Guide

Low Latency Priority

routing:
  max_hops: 4 # Limit hops

connections:
  idle_threshold: 60s # Less keepalive traffic

limits:
  buffer_size: 131072 # 128 KB - smaller buffers

High Throughput Priority

limits:
  buffer_size: 524288 # 512 KB - larger buffers
  max_streams_per_peer: 2000

connections:
  idle_threshold: 30s # Detect issues quickly

Memory Constrained

limits:
  buffer_size: 65536 # 64 KB
  max_streams_total: 1000
  max_streams_per_peer: 100

Next Steps