API Reference

PTIIKInsight provides a REST API for topic modeling and data scraping. This reference describes the endpoints implemented in the project.

Base URL

http://localhost:8000

The API listens on port 8000 by default and serves automatically generated documentation at /docs.

Available Endpoints

Health Check

Check if the API service is running and healthy.

GET /health

Response:

{
  "status": "healthy",
  "timestamp": 1640995200.0,
  "model_loaded": true
}
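
A client can use this response to decide whether it is safe to send further requests. The sketch below uses only the Python standard library. The helper names `fetch_health` and `is_ready` are made up for this example, and the readiness rule (status is "healthy" and `model_loaded` is true) is an assumption about how a caller might interpret the fields, not something the API enforces.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default base URL from this reference


def fetch_health(base_url=BASE_URL, timeout=5.0):
    """Call GET /health and return the decoded JSON body as a dict."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def is_ready(payload):
    """Assumed readiness rule: healthy status and a loaded model."""
    return payload.get("status") == "healthy" and payload.get("model_loaded") is True
```

With the service running, `is_ready(fetch_health())` gives a single boolean that a script can check before calling the prediction endpoint.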

Data Scraping

Start the web scraping process to collect new research paper data.

POST /scrape

Response:

This endpoint runs scraping asynchronously in the background. The process includes:

  1. Web scraping using the scraping module

  2. Data preprocessing and cleaning

  3. Storing results in CSV format
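
Because the scrape runs in the background, a client only has to send the request and can check for results later. The following is a minimal sketch with the standard library. `build_scrape_request` and `trigger_scrape` are illustrative names, and the assumption that the endpoint needs no request body is inferred from the description above rather than confirmed against the implementation.

```python
import urllib.request

BASE_URL = "http://localhost:8000"  # default base URL from this reference


def build_scrape_request(base_url=BASE_URL):
    """Build a POST /scrape request (assumed to need no body)."""
    return urllib.request.Request(f"{base_url}/scrape", method="POST")


def trigger_scrape(base_url=BASE_URL, timeout=10.0):
    """Send the request and return (HTTP status, raw response text)."""
    request = build_scrape_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

The response is returned as raw text because its exact shape is not shown in this section.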

Get Scraped Data

Retrieve the processed data from previous scraping operations.

Success Response:

No Data Response:

Response Fields:

  • count: Number of records retrieved

  • data: Array of scraped research paper records
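
The two fields above are enough to handle both the success case and the no-data case on the client side. The sketch below works on an already decoded response body, since the endpoint path is not shown here. The function name `summarize_scraped` is hypothetical, and treating a missing or empty `data` field as "no data" is an assumption about the no-data response.

```python
def summarize_scraped(payload):
    """Return (count, records) from a decoded response body.

    A missing or empty `data` field is treated as "no data" (assumption).
    """
    records = payload.get("data") or []
    # Fall back to the number of records if `count` is absent.
    count = payload.get("count", len(records))
    return count, records
```

A caller can then branch on `count == 0` instead of inspecting the response structure directly.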

Topic Prediction

Predict topics for the given input texts with the trained BERTopic model, using the /predict endpoint.

Request Body:

Request Fields:

  • texts: Array of text strings for topic prediction (max 100 texts per request)

Response:

Response Fields:

  • input_count: Number of input texts processed

  • prediction_time: Time taken for prediction in seconds

  • topics: Array of prediction results with topic assignments
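
Inputs longer than the 100-text limit have to be split across several requests. The sketch below shows one way to do that with the standard library. The helper names are illustrative, the use of POST with a JSON body of the form `{"texts": [...]}` is inferred from the request fields above, and the idea that `topics` holds results in input order is an assumption.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default base URL from this reference
MAX_TEXTS = 100  # per-request limit stated in this reference


def batch_texts(texts, size=MAX_TEXTS):
    """Split texts into consecutive batches of at most `size` items."""
    return [texts[i:i + size] for i in range(0, len(texts), size)]


def build_predict_request(texts, base_url=BASE_URL):
    """Build a request carrying {"texts": [...]} as JSON (POST is assumed)."""
    if len(texts) > MAX_TEXTS:
        raise ValueError(f"at most {MAX_TEXTS} texts per request, got {len(texts)}")
    body = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def predict_all(texts, base_url=BASE_URL, timeout=30.0):
    """Send every batch and collect the `topics` arrays in request order."""
    topics = []
    for batch in batch_texts(list(texts)):
        request = build_predict_request(batch, base_url)
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            topics.extend(json.loads(resp.read().decode("utf-8"))["topics"])
    return topics
```

Batching on the client keeps each request under the documented limit, at the cost of one round trip per 100 texts.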

Model Accuracy Update

Update the model accuracy metric for monitoring purposes.

Query Parameters:

  • accuracy: Float value between 0 and 1
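
Validating the value on the client side avoids a round trip for out-of-range input. The sketch below only builds the query string, because the endpoint path is not shown in this section; the caller appends the result to the endpoint URL. The name `accuracy_query` is hypothetical, and treating the bounds 0 and 1 as inclusive is an assumption.

```python
from urllib.parse import urlencode


def accuracy_query(accuracy):
    """Validate the range and return the encoded query string."""
    value = float(accuracy)
    # Bounds are assumed inclusive; NaN fails this comparison and is rejected.
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"accuracy must be between 0 and 1, got {value}")
    return urlencode({"accuracy": value})
```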

Example:

Response:

Error Responses

All endpoints may return error responses in case of failures:

400 Bad Request:

500 Internal Server Error:
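
On the client side, these two cases usually call for different reactions: a 400 means the request should be corrected, while a 500 points to a failure in the service. The sketch below shows one way to separate them with the standard library. The names `describe_status` and `send` and the category labels are made up for illustration, and the error body is returned as raw text because its format is not shown here.

```python
import urllib.error
import urllib.request


def describe_status(code):
    """Map an HTTP status code to a short client-side category."""
    if 200 <= code < 300:
        return "ok"
    if 400 <= code < 500:
        return "client_error"  # e.g. 400: fix the request before retrying
    if 500 <= code < 600:
        return "server_error"  # e.g. 500: the failure is on the service side
    return "unexpected"


def send(request, timeout=10.0):
    """Send a request and return (category, status, raw body text)."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return describe_status(resp.status), resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx and 5xx; the body is still readable.
        return describe_status(err.code), err.code, err.read().decode("utf-8")
```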

Usage Examples

Complete Workflow Example

Python Client Example
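
The following is a minimal client sketch built on the standard library, not code taken from the project. The class name `PTIIKInsightClient` is hypothetical, and the sketch assumes that every endpoint returns JSON and that /predict accepts POST with a `texts` array. Only endpoints whose paths appear in this reference are wrapped.

```python
import json
import urllib.request


class PTIIKInsightClient:
    """Illustrative wrapper around a few endpoints; not part of the project."""

    def __init__(self, base_url="http://localhost:8000", timeout=30.0):
        self.base_url = base_url.rstrip("/")
        self.timeout = timeout

    def _call(self, method, path, payload=None):
        """Send one request and decode the JSON response (assumed format)."""
        data = None if payload is None else json.dumps(payload).encode("utf-8")
        headers = {"Content-Type": "application/json"} if data is not None else {}
        request = urllib.request.Request(
            self.base_url + path, data=data, method=method, headers=headers
        )
        with urllib.request.urlopen(request, timeout=self.timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))

    def health(self):
        return self._call("GET", "/health")

    def scrape(self):
        return self._call("POST", "/scrape")

    def predict(self, texts):
        return self._call("POST", "/predict", {"texts": list(texts)})
```

A typical session would create `PTIIKInsightClient()`, call `health()` first, and then pass up to 100 texts to `predict()`.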

Monitoring and Metrics

The API includes Prometheus metrics for monitoring:

  • model_predictions_total: Total number of predictions made

  • model_prediction_errors_total: Total number of prediction errors

  • model_prediction_duration_seconds: Time spent on predictions

  • model_accuracy: Current model accuracy score

  • scraping_requests_total: Total scraping requests

  • scraping_errors_total: Total scraping errors

Metrics are exposed at the /metrics endpoint for Prometheus to scrape.
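
For a quick check without a Prometheus server, the text returned by /metrics can be read directly. The sketch below parses the Prometheus text exposition format into a dictionary. The function name `parse_metrics` is hypothetical, and the parser is deliberately simple: it keys each sample by its full series string (including any labels) and ignores optional timestamps.

```python
def parse_metrics(text):
    """Parse Prometheus text format into {series: value}."""
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Labeled sample: keep name and labels together as the key.
            cut = line.rindex("}") + 1
            series, rest = line[:cut], line[cut:].split()
        else:
            series, *rest = line.split()
        samples[series] = float(rest[0])
    return samples
```

Looking up `"model_accuracy"` in the result gives the current value of that metric as a float.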

Rate Limits

  • Maximum 100 texts per /predict request

  • Scraping operations are queued and run one at a time

  • No authentication is required in the current version

API Documentation

Interactive API documentation is available at:

  • Swagger UI: http://localhost:8000/docs

  • ReDoc: http://localhost:8000/redoc
