Computational Cost
skillinfer is designed to be fast. No neural networks, no training loops, no GPU. Just matrix algebra.
Complexity
| Operation | Time | Memory |
|---|---|---|
| Build population (`from_dataframe`) | \(O(N \cdot K^2)\) | \(O(K^2)\) |
| Single observation (`observe`) | \(O(K^2)\) | \(O(K^2)\) |
| Batch observation (`observe_many`) | \(O(n \cdot K^2)\) | \(O(K^2)\) |
| Query mean/std | \(O(K)\) | — |
| Most uncertain | \(O(K \log K)\) | — |
| Similarity | \(O(K)\) | — |
Where N = number of entities, K = number of features, n = number of observations.
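The query rows of the table can be illustrated with plain NumPy. This is a minimal sketch, assuming the current state is held as a \(K \times K\) covariance matrix `sigma`; the helper names `query_std` and `most_uncertain` are hypothetical and are not taken from skillinfer's API.

```python
import numpy as np

def query_std(sigma):
    """Per-feature standard deviation: reads the K diagonal entries, O(K)."""
    return np.sqrt(np.diag(sigma))

def most_uncertain(sigma, top=3):
    """Indices of the `top` features with the largest variance.

    Sorting the K variances is what gives the O(K log K) cost.
    """
    variances = np.diag(sigma)
    order = np.argsort(variances)[::-1]  # descending variance
    return order[:top]

# Hypothetical 4-feature covariance, used only for illustration.
sigma = np.diag([0.5, 2.0, 1.0, 4.0])
print(query_std(sigma))
print(most_uncertain(sigma, top=2))
```

Neither query touches the off-diagonal entries, which is why both stay well below the \(O(K^2)\) cost of an update.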
Building the population
The one-time cost is dominated by covariance estimation:
- Ledoit-Wolf: \(O(N \cdot K^2)\) to fit the shrinkage estimator
- Sample covariance: \(O(N \cdot K^2)\) to compute `np.cov`
For typical use cases:
| Dataset | N | K | Build time |
|---|---|---|---|
| LLM benchmarks | 4,576 | 6 | < 10 ms |
| O*NET | 894 | 120 | < 100 ms |
| Large population | 10,000 | 500 | < 1 s |
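The sample-covariance path can be sketched as follows. This is an illustration under assumptions, not skillinfer's actual `from_dataframe` implementation: `build_population` is a hypothetical helper, and the data is synthetic with the same shape as the O*NET row above. A Ledoit-Wolf estimate could be swapped in, for example with scikit-learn's `sklearn.covariance.LedoitWolf`.

```python
import numpy as np

def build_population(X):
    """Estimate the population mean and covariance from an (N, K) array.

    Forming the K x K covariance visits every entity once per feature
    pair, which is where the O(N * K^2) build cost comes from.
    """
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)               # shape (K,)
    sigma = np.cov(X, rowvar=False)   # shape (K, K); rows are entities
    return mu, sigma

# Synthetic stand-in data: N = 894 entities, K = 120 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(894, 120))
mu, sigma = build_population(X)
print(mu.shape, sigma.shape)
```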
Per-observation cost
Each observe() call performs:
- One column extraction: \(O(K)\)
- One division: \(O(K)\)
- One vector-scalar multiply: \(O(K)\)
- One outer product and subtraction: \(O(K^2)\)
The dominant cost is the rank-1 covariance update (outer product), which is \(O(K^2)\).
For K = 120 features (O*NET), each observation takes approximately 0.1 ms. For K = 6 features (LLM benchmarks), it takes on the order of 1 μs.
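The four steps listed above match the standard rule for conditioning a Gaussian on a single observed coordinate. The sketch below assumes that is the update being performed; `observe_sketch` and its `noise_var` parameter are hypothetical names, not skillinfer's API.

```python
import numpy as np

def observe_sketch(mu, sigma, j, y, noise_var=0.0):
    """Condition a Gaussian (mu, sigma) on an observation y of feature j."""
    col = sigma[:, j].copy()                 # column extraction, O(K)
    gain = col / (sigma[j, j] + noise_var)   # division, O(K)
    new_mu = mu + gain * (y - mu[j])         # vector-scalar multiply, O(K)
    new_sigma = sigma - np.outer(gain, col)  # outer product and subtraction, O(K^2)
    return new_mu, new_sigma

# Three features: 0 and 1 are correlated, 2 is independent of both.
mu = np.zeros(3)
sigma = np.array([[1.0, 0.8, 0.0],
                  [0.8, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
new_mu, new_sigma = observe_sketch(mu, sigma, j=0, y=2.0)
print(new_mu)            # feature 1 moves toward the observation, feature 2 does not
print(np.diag(new_sigma))
```

Only the last line of the function touches all \(K^2\) entries, so it sets the per-observation cost.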
Scaling
The method scales comfortably to K = 1,000+ features. The bottleneck is memory for the \(K \times K\) covariance matrix:
| K | Covariance memory | Per-observation time |
|---|---|---|
| 6 | 288 B | ~1 μs |
| 120 | 115 KB | ~0.1 ms |
| 500 | 2 MB | ~1 ms |
| 1,000 | 8 MB | ~5 ms |
| 5,000 | 200 MB | ~100 ms |
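The memory column is consistent with a dense \(K \times K\) matrix at 8 bytes per entry. The short check below assumes float64 storage, which the source does not state explicitly; `covariance_bytes` is a hypothetical helper.

```python
def covariance_bytes(k, itemsize=8):
    """Bytes needed for a dense k x k matrix (itemsize=8 assumes float64)."""
    return k * k * itemsize

for k in (6, 120, 500, 1000, 5000):
    print(k, covariance_bytes(k))
# 6 288
# 120 115200
# 500 2000000
# 1000 8000000
# 5000 200000000
```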
Comparison with alternatives
| Method | Per-prediction cost | Setup cost | GPU needed |
|---|---|---|---|
| skillinfer | \(O(K^2)\) | \(O(N K^2)\) | No |
| Neural collaborative filtering | \(O(d)\) forward pass | Hours of training | Usually |
| Matrix factorization (SVD) | \(O(K \cdot r)\) | \(O(N K r)\) | No |
| Gaussian Process | \(O(N)\) mean, \(O(N^2)\) variance | \(O(N^3)\) | No |
skillinfer has no training loop — the "setup" is a single covariance estimation. Predictions are matrix-vector products, not forward passes through a network.