Troubleshooting
This guide helps you diagnose and resolve common issues with PTIIKInsight. Issues are organized by category with step-by-step solutions.
Quick Diagnostics
System Health Check
Run this comprehensive health check to identify common issues:
# Check all services
curl http://localhost:8000/health
# Check specific components
docker-compose ps
docker-compose logs --tail=50
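If curl is unavailable, the same port-level check can be sketched in Python. This is a minimal sketch that only assumes the ports named in this guide (8000 for the API, 8501 for the dashboard); the function names are illustrative and not part of PTIIKInsight. It tests whether each port accepts a TCP connection, which says nothing about whether the service behind it is healthy.

```python
import socket

# Ports taken from this guide: 8000 for the API, 8501 for the dashboard.
SERVICES = {"api": 8000, "dashboard": 8501}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host: str = "localhost") -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, is_up in check_services().items():
        print(f"{name}: {'reachable' if is_up else 'NOT reachable'}")
```

A port that is not reachable points to the Docker and port-binding checks under Installation Issues; a reachable port with a failing /health response points to the Runtime Issues section.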
Common Symptoms and Quick Fixes
Dashboard not loading: Check port 8501, restart the dashboard service
API not responding: Check port 8000, verify the database connection
Model training fails: Check memory usage, verify the data format
Slow performance: Check resource usage, optimize parameters
Empty topics: Adjust the min_topic_size parameter
Installation Issues
Docker Issues
Docker Compose Fails to Start
Symptoms:
Services exit immediately
Port binding errors
Volume mount issues
Solutions:
# Check Docker daemon
sudo systemctl status docker
# Check port availability
sudo lsof -i :8000
sudo lsof -i :8501
# Fix permission issues
sudo chown -R $USER:$USER ./data
sudo chmod -R 755 ./data
# Restart with fresh containers
# Warning: --volumes also removes the project's volumes, which may include database data
docker-compose down --volumes
docker-compose up --build
Out of Memory Errors
Symptoms:
Container exits with code 137
"Killed" messages in logs
Solutions:
# Increase Docker memory limit
# Docker Desktop: Settings > Resources > Memory > 8GB+
# Check memory usage
docker stats
# Optimize container resources
# Edit docker-compose.yml
services:
  api:
    deploy:
      resources:
        limits:
          memory: 4G
        reservations:
          memory: 2G
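Exit code 137 follows the common convention that a process terminated by a signal reports 128 plus the signal number: 137 - 128 = 9, which is SIGKILL, the signal the kernel's out-of-memory killer sends. The helper below is an illustrative sketch for decoding such codes; its name is hypothetical and it is not part of PTIIKInsight or Docker.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a container exit code using the 128 + signal convention."""
    if code == 0:
        return "exited normally"
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # The signal number is not defined on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with application error {code}"


print(describe_exit_code(137))  # 137 - 128 = 9, i.e. SIGKILL
```

A SIGKILL is not always an out-of-memory kill (docker kill sends it too), so confirm with docker stats or the "Killed" messages in the logs before raising limits.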
Network Connectivity Issues
Symptoms:
Services can't communicate
Database connection errors
Solutions:
# Check network configuration
docker network ls
docker network inspect ptiikinsight_default
# Recreate network
docker-compose down
docker network prune
docker-compose up
Manual Installation Issues
Python Dependencies
Symptoms:
Package installation fails
Import errors
Version conflicts
Solutions:
# Clean Python environment
pip uninstall -r requirements.txt -y
pip cache purge
# Install with specific versions
pip install --no-cache-dir -r requirements.txt
# Fix CUDA issues (if using GPU)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# Check Python version
python --version # Should be 3.8+
Database Issues
Symptoms:
Connection refused
Authentication failed
Database doesn't exist
Solutions:
# Check PostgreSQL status
sudo systemctl status postgresql
# Start PostgreSQL
sudo systemctl start postgresql
# Create database
sudo -u postgres createdb ptiikinsight
# Check connection
psql -h localhost -p 5432 -U postgres -d ptiikinsight
# Reset password (runs the SQL statement through psql)
sudo -u postgres psql -c "ALTER USER postgres PASSWORD 'your_password';"
Runtime Issues
API Issues
API Not Responding
Symptoms:
Connection timeout
502 Bad Gateway
Service unavailable
Diagnostics:
# Check API logs
docker-compose logs api
# Test API directly
curl http://localhost:8000/health
# Check process
ps aux | grep uvicorn
Solutions:
# Restart API service
docker-compose restart api
# Check configuration
grep API .env
# Test with different port
uvicorn api.main:app --host 0.0.0.0 --port 8001
Database Connection Errors
Symptoms:
"Connection to database failed"
"Connection pool exhausted"
Timeout errors
Solutions:
# Check database status
docker-compose logs db
# Test connection
psql $DATABASE_URL -c "SELECT 1"
# Increase connection pool
# Edit .env
DATABASE_POOL_SIZE=50
DATABASE_MAX_OVERFLOW=20
# Check for connection leaks
docker-compose exec db psql -U postgres -c "SELECT count(*) FROM pg_stat_activity"
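Before raising the pool settings, it can help to check that the database server can actually accept that many connections. The sketch below assumes a SQLAlchemy-style pool, where each worker process may open up to pool size plus overflow connections; whether PTIIKInsight pools connections this way is an assumption, and the function names are hypothetical. PostgreSQL's defaults of max_connections = 100 and 3 reserved superuser slots are used as the limits.

```python
def max_db_connections(pool_size: int, max_overflow: int, workers: int) -> int:
    """Upper bound on connections if every worker process owns its own pool."""
    return (pool_size + max_overflow) * workers


def fits_server_limit(
    pool_size: int,
    max_overflow: int,
    workers: int,
    max_connections: int = 100,  # PostgreSQL default
    reserved: int = 3,           # default superuser_reserved_connections
) -> bool:
    """Check the worst-case client demand against what the server will accept."""
    demand = max_db_connections(pool_size, max_overflow, workers)
    return demand <= max_connections - reserved


# Values from the .env example above, with an assumed 2 worker processes.
print(max_db_connections(50, 20, 2))  # (50 + 20) * 2
print(fits_server_limit(50, 20, 2))
```

If the check fails, either lower the pool settings or raise max_connections on the server; otherwise a larger pool only moves the failure from the application to the database.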
Redis Connection Issues
Symptoms:
Cache not working
Session errors
Connection refused
Solutions:
# Check Redis status
docker-compose logs redis
# Test connection
redis-cli -u $REDIS_URL ping
# Clear Redis cache (FLUSHALL deletes every key in every Redis database)
redis-cli -u $REDIS_URL FLUSHALL
# Check Redis memory
redis-cli -u $REDIS_URL INFO memory
Dashboard Issues
Dashboard Won't Load
Symptoms:
Blank page
Connection error
Python errors
Diagnostics:
# Check dashboard logs
docker-compose logs dashboard
# Test Streamlit directly
streamlit run dashboard/main.py --server.port 8502
Solutions:
# Restart dashboard
docker-compose restart dashboard
# Check port binding
sudo lsof -i :8501
# Clear Streamlit cache
# (deleting ~/.streamlit/ would also remove your Streamlit config and credentials)
streamlit cache clear
Slow Dashboard Performance
Symptoms:
Long loading times
Unresponsive interface
Timeout errors
Solutions:
# Optimize Streamlit configuration
# Edit .streamlit/config.toml
[server]
maxUploadSize = 200
maxMessageSize = 200
[theme]
base = "light"
# Reduce data display
# Limit number of documents/topics shown
# Use pagination for large datasets
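Pagination means rendering one slice of the data per page instead of the whole dataset. The sketch below shows the slicing logic only; the function names and the default page size of 50 are illustrative assumptions, and how the result is wired into the dashboard is left open.

```python
from math import ceil
from typing import Sequence


def paginate(items: Sequence, page: int, page_size: int = 50) -> list:
    """Return the items on a 1-based page; an out-of-range page is empty."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    return list(items[start:start + page_size])


def page_count(total_items: int, page_size: int = 50) -> int:
    """Number of pages needed to show total_items."""
    return ceil(total_items / page_size)


documents = [f"document {i}" for i in range(120)]
print(page_count(len(documents)))  # number of pages at 50 per page
print(len(paginate(documents, 3)))  # size of the last, partial page
```

Only the current page is passed to the rendering code, so the cost of drawing the page stays roughly constant as the dataset grows.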
Model Training Issues
Training Failures
Out of Memory During Training
Symptoms:
Training stops unexpectedly
"CUDA out of memory" errors
System becomes unresponsive
Solutions:
# Reduce batch size
# Edit training parameters
{
"batch_size": 32, # Reduce from default
"max_documents": 10000, # Limit dataset size
"use_gpu": false # Disable GPU if insufficient VRAM
}
# Monitor memory usage
watch -n 1 'free -h'
nvidia-smi # If using GPU
# Increase system memory
# Add swap space (Linux)
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
Poor Topic Quality
Symptoms:
Topics don't make sense
Too many similar topics
Keywords are not relevant
Solutions:
# Adjust parameters
{
"min_topic_size": 25, # Increase for better quality
"nr_topics": 20, # Limit number of topics
"embedding_model": "all-mpnet-base-v2" # Use better model
}
# Improve data preprocessing
{
"remove_short_docs": true,
"min_doc_length": 50,
"remove_duplicates": true,
"custom_stopwords": ["custom", "stopwords"]
}
# Try different clustering parameters
{
"clustering_algorithm": "kmeans",
"dimensionality_reduction": "pca"
}
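To see what the min_doc_length and remove_duplicates settings would do to a dataset before retraining, the filtering can be approximated outside the application. This is a rough sketch, not PTIIKInsight's actual preprocessing: the function name is hypothetical, and treating documents as duplicates when they match after lowercasing and collapsing whitespace is an assumption.

```python
def filter_documents(docs, min_doc_length=50, remove_duplicates=True):
    """Drop documents that are too short and, optionally, repeated documents."""
    seen = set()
    kept = []
    for doc in docs:
        text = doc.strip()
        if len(text) < min_doc_length:
            continue  # too short to carry a topic signal
        if remove_duplicates:
            # Assumption: duplicates are compared case- and whitespace-insensitively.
            key = " ".join(text.lower().split())
            if key in seen:
                continue
            seen.add(key)
        kept.append(text)
    return kept


docs = ["too short", "A" * 60, "a" * 60, "B" * 60]
print(len(filter_documents(docs)))
```

Comparing the document count before and after filtering shows how much data remains; if most documents are dropped, a lower min_doc_length may be preferable to training on very little data.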
Training Never Completes
Symptoms:
Training stuck at certain percentage
No progress for extended time
Memory usage constantly high
Solutions:
# Check training logs
docker-compose logs -f api
# Monitor system resources
htop
iotop
# Reduce dataset size
# Sample your data first
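import pandas as pd
df = pd.read_csv('large_dataset.csv')
# Cap the sample at the dataset size; a fixed seed makes the sample repeatable
sample_df = df.sample(n=min(1000, len(df)), random_state=42)
# index=False avoids writing the row index as an extra column
sample_df.to_csv('sample_dataset.csv', index=False)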
# Restart training with smaller dataset
Model Performance Issues
Slow Inference
Symptoms:
Topic prediction takes long time
API timeouts
High CPU usage
Solutions:
# Optimize model parameters
{
"calculate_probabilities": false, # Disable if not needed
"use_fast_inference": true,
"batch_inference_size": 100
}
# Use faster embedding model
{
"embedding_model": "all-MiniLM-L6-v2" # Faster than mpnet
}
# Enable caching
{
"cache_embeddings": true,
"cache_predictions": true
}
High Memory Usage
Symptoms:
System slow after training
Memory not released
Swap usage high
Solutions:
# Force garbage collection
import gc
gc.collect()
# Restart model service
docker-compose restart api
# Optimize model storage
# Save model with compression
import gzip
import pickle
# gzip provides the compression; the pickle protocol alone does not compress
with gzip.open('model.pkl.gz', 'wb') as f:
    pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)
Data Issues
Data Loading Problems
File Upload Failures
Symptoms:
Upload times out
File format errors
Encoding issues
Solutions:
# Check file size limits
# Edit .env
MAX_UPLOAD_SIZE_MB=500
# Fix encoding issues
# Convert to UTF-8
iconv -f ISO-8859-1 -t UTF-8 input.csv > output.csv
# Check file format
file your_file.csv
head -n 5 your_file.csv
# Validate CSV structure
python -c "import pandas as pd; print(pd.read_csv('your_file.csv').head())"
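When pandas is not installed, or when a file fails to parse and the offending rows need to be located, the encoding and structure checks above can be approximated with the standard library. This is a sketch under stated assumptions: the function names are hypothetical, the candidate encodings are a guess at what is common, and latin-1 must stay last because it decodes any byte sequence.

```python
import csv


def guess_encoding(raw: bytes, candidates=("utf-8", "cp1252", "latin-1")) -> str:
    """Return the first candidate encoding that decodes the bytes without error."""
    for encoding in candidates:
        try:
            raw.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


def find_ragged_rows(path: str, encoding: str = "utf-8") -> list:
    """Return (line_number, field_count) for rows whose width differs from the header."""
    problems = []
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            return problems  # empty file
        for row in reader:
            if len(row) != len(header):
                problems.append((reader.line_num, len(row)))
    return problems


def validate_csv(path: str):
    """Guess the encoding, then report structurally inconsistent rows."""
    with open(path, "rb") as f:
        encoding = guess_encoding(f.read())
    return encoding, find_ragged_rows(path, encoding)
```

Calling validate_csv('your_file.csv') returns the guessed encoding and a list of suspicious rows. A successful decode does not prove the guess is right, so treat the result as a hint for the iconv conversion rather than a definitive answer.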
Data Preprocessing Errors
Symptoms:
Text cleaning fails
Language detection errors
Empty documents after processing
Solutions:
# Debug preprocessing
{
"debug_preprocessing": true,
"log_preprocessing_steps": true,
"preserve_original_text": true
}
# Adjust preprocessing parameters
{
"min_text_length": 10, # Reduce minimum
"max_text_length": 50000, # Increase maximum
"remove_html": true,
"normalize_unicode": true
}
# Check text encoding
python -c "
import chardet
with open('your_file.txt', 'rb') as f:
    result = chardet.detect(f.read())
print(result)
"
Data Quality Issues
Inconsistent Results
Symptoms:
Topics change between runs
Inconsistent document assignments
Varying quality scores
Solutions:
# Set random seed for reproducibility
{
"random_seed": 42,
"deterministic": true
}
# Use consistent preprocessing
{
"preprocessing_pipeline": "standard",
"normalize_text": true,
"consistent_tokenization": true
}
# Check data consistency
python -c "
import pandas as pd
df = pd.read_csv('your_data.csv')
print(f'Duplicates: {df.duplicated().sum()}')
print(f'Missing values: {df.isnull().sum()}')
# Reusing the same quote type inside an f-string fails before Python 3.12
lengths = df['text'].str.len()
print(f'Text length stats: {lengths.describe()}')
"
Performance Issues
System Performance
High CPU Usage
Symptoms:
System sluggish
High load average
Processes not responding
Solutions:
# Monitor CPU usage
# top -p expects a comma-separated PID list, so set pgrep's delimiter
top -p "$(pgrep -d ',' -f ptiikinsight)"
# Limit CPU usage
# Edit docker-compose.yml
services:
  api:
    deploy:
      resources:
        limits:
          cpus: '2.0'
# Optimize thread usage
# Edit .env
WORKERS=2
THREADS_PER_WORKER=2
High Memory Usage
Symptoms:
System using swap
Out of memory errors
Slow response times
Solutions:
# Monitor memory usage
free -h
ps aux --sort=-%mem | head
# Optimize memory usage
# Edit .env
MEMORY_LIMIT=4G
BATCH_SIZE=50
CACHE_SIZE=512MB
# Clean up unused data
# Warning: removes stopped containers and ALL unused images, not only PTIIKInsight's
docker system prune -a
Disk Space Issues
Symptoms:
Disk full errors
Slow I/O operations
Log files growing large
Solutions:
# Check disk usage
df -h
du -sh /path/to/ptiikinsight/*
# Clean up logs
find ./logs -name "*.log" -mtime +7 -delete
# Compress old data
gzip ./data/processed/*.json
# Move data to external storage
# Configure S3 or external storage
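Because the find command above deletes files immediately, a dry run that lists the candidates first can be safer. The following sketch approximates the same cleanup in Python; the function names are hypothetical, and the cutoff is a plain seven-day age check, which is close to but not identical to the rounding that find -mtime +7 applies.

```python
import time
from pathlib import Path


def old_log_files(log_dir, max_age_days=7, now=None):
    """List *.log files under log_dir last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return sorted(
        p for p in Path(log_dir).rglob("*.log")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def delete_old_logs(log_dir, max_age_days=7, dry_run=True):
    """Return the files that qualify; delete them only when dry_run is False."""
    candidates = old_log_files(log_dir, max_age_days)
    if not dry_run:
        for path in candidates:
            path.unlink()
    return candidates
```

Calling delete_old_logs('./logs') only reports what would be removed; pass dry_run=False once the list looks right.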
Monitoring and Logging
Log Analysis
Finding Relevant Logs
# API logs
docker-compose logs api | grep ERROR
# Dashboard logs
docker-compose logs dashboard | tail -100
# System logs
journalctl -u docker | grep ptiikinsight
# Application logs
tail -f ./logs/ptiikinsight.log
Common Log Patterns
# Database connection issues
grep "connection" ./logs/*.log
# Memory issues
grep -i "memory\|oom" ./logs/*.log
# Model training issues
grep "training\|model" ./logs/*.log
# API errors
grep "ERROR\|500" ./logs/api.log
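To get a quick overview instead of reading raw grep output, the same patterns can be counted per category. This sketch mirrors the grep expressions above, including their case sensitivity; the category names and the function name are illustrative, and the sample lines are made up for demonstration rather than taken from real PTIIKInsight logs.

```python
import re
from collections import Counter

# Same expressions as the grep commands above; only the memory pattern is
# case-insensitive, matching grep -i.
PATTERNS = {
    "database": re.compile(r"connection"),
    "memory": re.compile(r"memory|oom", re.IGNORECASE),
    "training": re.compile(r"training|model"),
    "api_error": re.compile(r"ERROR|500"),
}


def count_log_patterns(lines):
    """Count how many lines match each category; a line may match several."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    "ERROR connection refused",
    "Out of memory: OOM killer invoked",
    "model training finished",
    "GET /health 200",
]
print(count_log_patterns(sample))
```

To run it against a real file, pass an open file object, for example count_log_patterns(open('./logs/api.log')). The category with the highest count suggests which section of this guide to start with.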
Monitoring Setup
Basic Monitoring
# CPU and memory monitoring
watch -n 5 'docker stats --no-stream'
# Disk usage monitoring
watch -n 30 'df -h'
# Network monitoring
netstat -tuln | grep :8000
Advanced Monitoring
# Set up Prometheus monitoring
docker-compose -f docker-compose.monitoring.yml up -d
# Access Grafana dashboard
# http://localhost:3000 (admin/admin)
# Check metrics endpoint
curl http://localhost:8000/metrics
Frequently Asked Questions
General Questions
Q: Why is my training taking so long?
A: Training time depends on dataset size, model complexity, and hardware. Try reducing dataset size, using a faster embedding model, or increasing hardware resources.
Q: Can I use PTIIKInsight with other languages?
A: Yes, use multilingual embedding models like paraphrase-multilingual-MiniLM-L12-v2 and set the appropriate language in preprocessing settings.
Q: How do I improve topic quality?
A: Adjust min_topic_size, use better embedding models, improve data preprocessing, and ensure sufficient data volume.
Technical Questions
Q: Can I run PTIIKInsight on Windows?
A: Yes, using Docker Desktop or WSL2. Native Windows installation is possible but Docker is recommended.
Q: How much memory do I need?
A: Minimum 8GB RAM, but 16GB+ recommended for production use. Memory usage scales with dataset size.
Q: Can I use GPU acceleration?
A: Yes, configure GPU support in Docker and set USE_GPU=true in environment variables.
Configuration Questions
Q: How do I change the default ports?
A: Edit the .env file to change the API_PORT and DASHBOARD_PORT values.
Q: Can I use external databases?
A: Yes, configure DATABASE_URL and REDIS_URL to point to external services.
Q: How do I back up my data?
A: Use docker-compose exec db pg_dump for database backups and back up the ./data directory.
Getting Additional Help
Community Support
GitHub Issues: Report bugs and request features
Discord Server: Join the community chat
Stack Overflow: Tag questions with ptiikinsight
Professional Support
Documentation: Comprehensive guides and tutorials
Consulting: Professional implementation support
Training: Workshops and training sessions
Reporting Issues
When reporting issues, include:
System information: OS, Docker version, memory
Configuration: .env file (without secrets)
Error logs: Relevant log excerpts
Steps to reproduce: Clear reproduction steps
Expected vs actual behavior: What you expected vs what happened
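Removing secrets from a .env file by hand is easy to get wrong, so a small redaction script can help before pasting configuration into an issue. This is a minimal sketch: the key patterns it treats as sensitive (PASSWORD, SECRET, TOKEN, KEY) are an assumption, SECRET_KEY in the example is a hypothetical variable, and the output should still be reviewed manually.

```python
import re

# Assumption: keys containing these words hold secrets.
SENSITIVE_KEY = re.compile(r"PASSWORD|SECRET|TOKEN|KEY", re.IGNORECASE)
# Matches the password part of URLs such as scheme://user:password@host
URL_CREDENTIALS = re.compile(r"(://[^:/@\s]+:)[^@\s]+(@)")


def redact_env(text: str) -> str:
    """Mask secret values and URL passwords in the contents of a .env file."""
    redacted = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in line:
            redacted.append(line)  # blank lines and comments pass through
            continue
        key, value = line.split("=", 1)
        if SENSITIVE_KEY.search(key):
            value = "***"
        else:
            value = URL_CREDENTIALS.sub(r"\1***\2", value)
        redacted.append(f"{key}={value}")
    return "\n".join(redacted)


example = (
    "API_PORT=8000\n"
    "SECRET_KEY=abc123\n"
    "DATABASE_URL=postgresql://postgres:hunter2@db:5432/ptiikinsight"
)
print(redact_env(example))
```

Non-sensitive settings such as ports stay readable, which keeps the report useful for whoever is diagnosing the issue.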
Contributing
Help improve PTIIKInsight:
Report bugs and issues
Suggest new features
Contribute code improvements
Improve documentation
Share usage examples