Troubleshooting

This guide helps you diagnose and resolve common issues with PTIIKInsight. Issues are organized by category with step-by-step solutions.

Quick Diagnostics

System Health Check

Run these quick checks to identify common issues:

# Check all services
curl http://localhost:8000/health

# Check specific components
docker-compose ps
docker-compose logs --tail=50

Common Symptoms and Quick Fixes

  • Dashboard not loading: check port 8501, restart the dashboard service

  • API not responding: check port 8000, verify the database connection

  • Model training fails: check memory usage, verify the data format

  • Slow performance: check resource usage, optimize parameters

  • Empty topics: adjust the min_topic_size parameter
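Most of the quick fixes above start with checking whether a port is reachable. This can be scripted with the standard library; a minimal sketch, assuming the default ports from this guide (API on 8000, dashboard on 8501):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Ports assumed from the default configuration in this guide
for name, port in [("api", 8000), ("dashboard", 8501)]:
    status = "up" if port_open("localhost", port) else "DOWN"
    print(f"{name:10s} :{port}  {status}")
```

A "DOWN" result here means the service is not listening at all, which narrows the problem to startup or port binding rather than the application logic.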

Installation Issues

Docker Issues

Docker Compose Fails to Start

Symptoms:

  • Services exit immediately

  • Port binding errors

  • Volume mount issues

Solutions:

# Check Docker daemon
sudo systemctl status docker

# Check port availability
sudo lsof -i :8000
sudo lsof -i :8501

# Fix permission issues
sudo chown -R $USER:$USER ./data
sudo chmod -R 755 ./data

# Restart with fresh containers
docker-compose down --volumes
docker-compose up --build

Out of Memory Errors

Symptoms:

  • Container exits with code 137

  • "Killed" messages in logs

Solutions:

# Increase Docker memory limit
# Docker Desktop: Settings > Resources > Memory > 8GB+

# Check memory usage
docker stats

# Optimize container resources
# Edit docker-compose.yml
services:
  api:
    deploy:
      resources:
        limits:
          memory: 4G
        reservations:
          memory: 2G

Network Connectivity Issues

Symptoms:

  • Services can't communicate

  • Database connection errors

Solutions:

# Check network configuration
docker network ls
docker network inspect ptiikinsight_default

# Recreate network
docker-compose down
docker network prune
docker-compose up

Manual Installation Issues

Python Dependencies

Symptoms:

  • Package installation fails

  • Import errors

  • Version conflicts

Solutions:

# Clean Python environment
pip uninstall -r requirements.txt -y
pip cache purge

# Install with specific versions
pip install --no-cache-dir -r requirements.txt

# Fix CUDA issues (if using GPU)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Check Python version
python --version  # Should be 3.8+

Database Issues

Symptoms:

  • Connection refused

  • Authentication failed

  • Database doesn't exist

Solutions:

# Check PostgreSQL status
sudo systemctl status postgresql

# Start PostgreSQL
sudo systemctl start postgresql

# Create database
sudo -u postgres createdb ptiikinsight

# Check connection
psql -h localhost -p 5432 -U postgres -d ptiikinsight

# Reset password
sudo -u postgres psql
ALTER USER postgres PASSWORD 'your_password';

Runtime Issues

API Issues

API Not Responding

Symptoms:

  • Connection timeout

  • 502 Bad Gateway

  • Service unavailable

Diagnostics:

# Check API logs
docker-compose logs api

# Test API directly
curl http://localhost:8000/health

# Check process
ps aux | grep uvicorn

Solutions:

# Restart API service
docker-compose restart api

# Check configuration
cat .env | grep API

# Test with different port
uvicorn api.main:app --host 0.0.0.0 --port 8001

Database Connection Errors

Symptoms:

  • "Connection to database failed"

  • "Connection pool exhausted"

  • Timeout errors

Solutions:

# Check database status
docker-compose logs db

# Test connection
psql $DATABASE_URL -c "SELECT 1"

# Increase connection pool
# Edit .env
DATABASE_POOL_SIZE=50
DATABASE_MAX_OVERFLOW=20

# Check for connection leaks
docker-compose exec db psql -U postgres -c "SELECT count(*) FROM pg_stat_activity"

Redis Connection Issues

Symptoms:

  • Cache not working

  • Session errors

  • Connection refused

Solutions:

# Check Redis status
docker-compose logs redis

# Test connection
redis-cli -u $REDIS_URL ping

# Clear Redis cache
redis-cli -u $REDIS_URL FLUSHALL

# Check Redis memory
redis-cli -u $REDIS_URL INFO memory

Dashboard Issues

Dashboard Won't Load

Symptoms:

  • Blank page

  • Connection error

  • Python errors

Diagnostics:

# Check dashboard logs
docker-compose logs dashboard

# Test Streamlit directly
streamlit run dashboard/main.py --server.port 8502

Solutions:

# Restart dashboard
docker-compose restart dashboard

# Check port binding
sudo lsof -i :8501

# Clear the Streamlit cache (rm -rf ~/.streamlit/ also deletes your config)
streamlit cache clear

Slow Dashboard Performance

Symptoms:

  • Long loading times

  • Unresponsive interface

  • Timeout errors

Solutions:

# Optimize Streamlit configuration
# Edit .streamlit/config.toml
[server]
maxUploadSize = 200
maxMessageSize = 200

[theme]
base = "light"

# Reduce data display
# Limit number of documents/topics shown
# Use pagination for large datasets
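Pagination itself is framework-independent; a minimal sketch of the helper logic (names are illustrative, not part of the PTIIKInsight API):

```python
def paginate(items, page: int, page_size: int = 20):
    """Return (page_items, total_pages) for 1-indexed page numbers."""
    total_pages = max(1, -(-len(items) // page_size))  # ceiling division
    page = min(max(page, 1), total_pages)              # clamp to a valid page
    start = (page - 1) * page_size
    return items[start:start + page_size], total_pages

docs = [f"doc {i}" for i in range(95)]
page_items, total = paginate(docs, page=5, page_size=20)
print(total, len(page_items))  # -> 5 15 (the last page holds the remainder)
```

Rendering only one page at a time keeps the dashboard responsive regardless of how many documents the model was trained on.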

Model Training Issues

Training Failures

Out of Memory During Training

Symptoms:

  • Training stops unexpectedly

  • "CUDA out of memory" errors

  • System becomes unresponsive

Solutions:

# Reduce batch size
# Edit training parameters
{
  "batch_size": 32,  # Reduce from default
  "max_documents": 10000,  # Limit dataset size
  "use_gpu": false  # Disable GPU if insufficient VRAM
}

# Monitor memory usage
watch -n 1 'free -h'
nvidia-smi  # If using GPU

# Increase system memory
# Add swap space (Linux)
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

Poor Topic Quality

Symptoms:

  • Topics don't make sense

  • Too many similar topics

  • Keywords are not relevant

Solutions:

# Adjust parameters
{
  "min_topic_size": 25,  # Increase for better quality
  "nr_topics": 20,  # Limit number of topics
  "embedding_model": "all-mpnet-base-v2"  # Use better model
}

# Improve data preprocessing
{
  "remove_short_docs": true,
  "min_doc_length": 50,
  "remove_duplicates": true,
  "custom_stopwords": ["custom", "stopwords"]
}

# Try different clustering parameters
{
  "clustering_algorithm": "kmeans",
  "dimensionality_reduction": "pca"
}

Training Never Completes

Symptoms:

  • Training stuck at certain percentage

  • No progress for extended time

  • Memory usage constantly high

Solutions:

# Check training logs
docker-compose logs -f api

# Monitor system resources
htop
iotop

# Reduce dataset size
# Sample your data first
import pandas as pd
df = pd.read_csv('large_dataset.csv')
sample_df = df.sample(n=1000, random_state=42)  # fixed seed for reproducibility
sample_df.to_csv('sample_dataset.csv', index=False)

# Restart training with smaller dataset

Model Performance Issues

Slow Inference

Symptoms:

  • Topic prediction takes long time

  • API timeouts

  • High CPU usage

Solutions:

# Optimize model parameters
{
  "calculate_probabilities": false,  # Disable if not needed
  "use_fast_inference": true,
  "batch_inference_size": 100
}

# Use faster embedding model
{
  "embedding_model": "all-MiniLM-L6-v2"  # Faster than mpnet
}

# Enable caching
{
  "cache_embeddings": true,
  "cache_predictions": true
}
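The cache_embeddings option above is configuration; the underlying idea is memoization, which can be sketched with the standard library (embed_cached is a stand-in for the real embedding call, not part of the PTIIKInsight API):

```python
from functools import lru_cache

@lru_cache(maxsize=10_000)
def embed_cached(text: str) -> tuple:
    # Stand-in for an expensive embedding-model call;
    # returns a tiny fake vector for illustration.
    return (len(text), sum(map(ord, text)) % 997)

embed_cached("recurring query")
embed_cached("recurring query")          # second call served from cache
print(embed_cached.cache_info().hits)    # -> 1
```

Caching helps most when the same documents or queries are embedded repeatedly; for one-off texts it only adds memory overhead.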

High Memory Usage

Symptoms:

  • System slow after training

  • Memory not released

  • Swap usage high

Solutions:

# Release references, then force garbage collection
import gc
del model  # drop the model reference so its memory can be reclaimed
gc.collect()

# Restart model service
docker-compose restart api

# Optimize model storage
# Save the model with compression (plain pickle does not compress;
# gzip shrinks the file on disk)
import gzip
import pickle
with gzip.open('model.pkl.gz', 'wb') as f:
    pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)

Data Issues

Data Loading Problems

File Upload Failures

Symptoms:

  • Upload times out

  • File format errors

  • Encoding issues

Solutions:

# Check file size limits
# Edit .env
MAX_UPLOAD_SIZE_MB=500

# Fix encoding issues
# Convert to UTF-8
iconv -f ISO-8859-1 -t UTF-8 input.csv > output.csv

# Check file format
file your_file.csv
head -n 5 your_file.csv

# Validate CSV structure
python -c "import pandas as pd; print(pd.read_csv('your_file.csv').head())"
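Beyond a quick head() check, upload problems are easier to diagnose with a small validation pass. A sketch using only the standard library; the required column name "text" is an assumption about your data layout:

```python
import csv

def validate_csv(path: str, required_column: str = "text"):
    """Basic upload sanity checks; returns a list of problems found."""
    problems = []
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        if reader.fieldnames is None or required_column not in reader.fieldnames:
            return [f"missing required column: {required_column}"]
        empty = sum(1 for row in reader if not (row[required_column] or "").strip())
    if empty:
        problems.append(f"{empty} row(s) with empty '{required_column}'")
    return problems
```

Running this before uploading catches the two most common failures (wrong column names, blank documents) without waiting for a server-side timeout.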

Data Preprocessing Errors

Symptoms:

  • Text cleaning fails

  • Language detection errors

  • Empty documents after processing

Solutions:

# Debug preprocessing
{
  "debug_preprocessing": true,
  "log_preprocessing_steps": true,
  "preserve_original_text": true
}

# Adjust preprocessing parameters
{
  "min_text_length": 10,  # Reduce minimum
  "max_text_length": 50000,  # Increase maximum
  "remove_html": true,
  "normalize_unicode": true
}

# Check text encoding
python -c "
import chardet
with open('your_file.txt', 'rb') as f:
    result = chardet.detect(f.read())
    print(result)
"

Data Quality Issues

Inconsistent Results

Symptoms:

  • Topics change between runs

  • Inconsistent document assignments

  • Varying quality scores

Solutions:

# Set random seed for reproducibility
{
  "random_seed": 42,
  "deterministic": true
}
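The random_seed setting only helps if it reaches every source of randomness in the pipeline. The pattern can be sketched with the standard library (the same idea applies to NumPy and UMAP seeding, not shown here):

```python
import random

SEED = 42

def sample_documents(docs, n, seed=SEED):
    """Deterministic sampling: the same seed yields the same subset every run."""
    rng = random.Random(seed)   # a local RNG avoids global-state surprises
    return rng.sample(docs, n)

docs = [f"doc {i}" for i in range(100)]
assert sample_documents(docs, 5) == sample_documents(docs, 5)
```

Using a local random.Random instance rather than the module-level functions keeps other code from silently advancing the generator between runs.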

# Use consistent preprocessing
{
  "preprocessing_pipeline": "standard",
  "normalize_text": true,
  "consistent_tokenization": true
}

# Check data consistency
python -c "
import pandas as pd
df = pd.read_csv('your_data.csv')
print(f'Duplicates: {df.duplicated().sum()}')
print(f'Missing values: {df.isnull().sum().sum()}')
print('Text length stats:', df['text'].str.len().describe())
"

Performance Issues

System Performance

High CPU Usage

Symptoms:

  • System sluggish

  • High load average

  • Processes not responding

Solutions:

# Monitor CPU usage
top -p "$(pgrep -d, -f ptiikinsight)"  # -d, joins multiple PIDs with commas

# Limit CPU usage
# Edit docker-compose.yml
services:
  api:
    deploy:
      resources:
        limits:
          cpus: '2.0'

# Optimize thread usage
# Edit .env
WORKERS=2
THREADS_PER_WORKER=2

High Memory Usage

Symptoms:

  • System using swap

  • Out of memory errors

  • Slow response times

Solutions:

# Monitor memory usage
free -h
ps aux --sort=-%mem | head

# Optimize memory usage
# Edit .env
MEMORY_LIMIT=4G
BATCH_SIZE=50
CACHE_SIZE=512MB

# Clean up unused data
docker system prune -a

Disk Space Issues

Symptoms:

  • Disk full errors

  • Slow I/O operations

  • Log files growing large

Solutions:

# Check disk usage
df -h
du -sh /path/to/ptiikinsight/*

# Clean up logs
find ./logs -name "*.log" -mtime +7 -delete

# Compress old data
gzip ./data/processed/*.json

# Move data to external storage
# Configure S3 or external storage

Monitoring and Logging

Log Analysis

Finding Relevant Logs

# API logs
docker-compose logs api | grep ERROR

# Dashboard logs
docker-compose logs dashboard | tail -100

# System logs
journalctl -u docker | grep ptiikinsight

# Application logs
tail -f ./logs/ptiikinsight.log

Common Log Patterns

# Database connection issues
grep "connection" ./logs/*.log

# Memory issues
grep -i "memory\|oom" ./logs/*.log

# Model training issues
grep "training\|model" ./logs/*.log

# API errors
grep "ERROR\|500" ./logs/api.log

Monitoring Setup

Basic Monitoring

# CPU and memory monitoring
watch -n 5 'docker stats --no-stream'

# Disk usage monitoring
watch -n 30 'df -h'

# Network monitoring
netstat -tuln | grep :8000

Advanced Monitoring

# Set up Prometheus monitoring
docker-compose -f docker-compose.monitoring.yml up -d

# Access Grafana dashboard
# http://localhost:3000 (admin/admin)

# Check metrics endpoint
curl http://localhost:8000/metrics

Frequently Asked Questions

General Questions

Q: Why is my training taking so long?
A: Training time depends on dataset size, model complexity, and hardware. Try reducing the dataset size, using a faster embedding model, or increasing hardware resources.

Q: Can I use PTIIKInsight with other languages?
A: Yes. Use a multilingual embedding model such as paraphrase-multilingual-MiniLM-L12-v2 and set the appropriate language in the preprocessing settings.

Q: How do I improve topic quality?
A: Adjust min_topic_size, use a better embedding model, improve data preprocessing, and ensure a sufficient volume of data.

Technical Questions

Q: Can I run PTIIKInsight on Windows?
A: Yes, using Docker Desktop or WSL2. Native Windows installation is possible, but Docker is recommended.

Q: How much memory do I need?
A: A minimum of 8GB RAM, but 16GB+ is recommended for production use. Memory usage scales with dataset size.

Q: Can I use GPU acceleration?
A: Yes. Configure GPU support in Docker and set USE_GPU=true in the environment variables.

Configuration Questions

Q: How do I change the default ports?
A: Edit the .env file to change the API_PORT and DASHBOARD_PORT values.

Q: Can I use external databases?
A: Yes. Configure DATABASE_URL and REDIS_URL to point to the external services.

Q: How do I back up my data?
A: Use docker-compose exec db pg_dump for database backups, and back up the ./data directory.

Getting Additional Help

Community Support

  • GitHub Issues: Report bugs and request features

  • Discord Server: Join the community chat

  • Stack Overflow: Tag questions with ptiikinsight

Professional Support

  • Documentation: Comprehensive guides and tutorials

  • Consulting: Professional implementation support

  • Training: Workshops and training sessions

Reporting Issues

When reporting issues, include:

  1. System information: OS, Docker version, memory

  2. Configuration: .env file (without secrets)

  3. Error logs: Relevant log excerpts

  4. Steps to reproduce: Clear reproduction steps

  5. Expected vs actual behavior: What you expected vs what happened

Contributing

Help improve PTIIKInsight:

  • Report bugs and issues

  • Suggest new features

  • Contribute code improvements

  • Improve documentation

  • Share usage examples

