# Pruning — deep dive

## Problem setup
Given weights \(W\), choose a mask \(M\in\{0,1\}^{\text{shape}(W)}\) that minimizes the loss of the masked network under a sparsity budget:
\[
\min_{M} \; \mathcal{L}(M \odot W) \quad \text{s.t.} \quad \|M\|_0 \le k,
\]
where \(\odot\) is the elementwise product and \(k\) is the number of weights kept.
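The setup above leaves the scoring rule open. As one concrete instance (an illustrative baseline, not prescribed by the text), magnitude pruning keeps the largest-magnitude entries of \(W\) and zeros the rest. A minimal sketch:

```python
import numpy as np

def magnitude_mask(W: np.ndarray, sparsity: float) -> np.ndarray:
    """Return a binary mask keeping the largest-|w| entries of W.

    `sparsity` is the fraction of entries to zero out (the budget).
    """
    k = int(round(sparsity * W.size))  # number of entries to drop
    if k == 0:
        return np.ones_like(W)
    # the k-th smallest magnitude is the cut-off; entries at or below it are dropped
    thresh = np.partition(np.abs(W).ravel(), k - 1)[k - 1]
    return (np.abs(W) > thresh).astype(W.dtype)

W = np.random.default_rng(0).normal(size=(4, 4))
M = magnitude_mask(W, sparsity=0.5)  # M ∈ {0,1}^{shape(W)}, applied as M * W
```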
## Structured vs unstructured

- Unstructured: mask individual weights; most flexible, but the irregular sparsity pattern needs specialized kernels to yield speedups.
- Structured: drop whole channels/filters/heads/blocks, keeping sparsity hardware-friendly because the surviving tensors stay dense. Both variants are sketched below.
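A minimal sketch of the two mask shapes on a 2-D weight matrix (the median-magnitude threshold and the per-row L2 scoring are illustrative choices, not prescribed above):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))  # e.g. 8 output channels x 16 inputs

# Unstructured: zero individual weights (here the smallest half by magnitude).
M_unstructured = (np.abs(W) > np.median(np.abs(W))).astype(W.dtype)

# Structured: drop whole rows (output channels), scored here by L2 norm,
# so the surviving sub-matrix stays dense and hardware-friendly.
keep = np.argsort(np.linalg.norm(W, axis=1))[W.shape[0] // 2:]
M_structured = np.zeros_like(W)
M_structured[keep] = 1.0
```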
## Algorithms (markdown-friendly pseudocode)

### Iterative volume sampling (subset selection)
**Procedure** `IterativeVolumeSample(A ∈ ℝ^{n×d}, s)`

- Initialize \(S \gets \{1,\dots,n\}\).
- While \(|S| > s\): compute
  \[
  q_i \gets \frac{\det\!\big(A_{S\setminus\{i\}}^{\top}\, A_{S\setminus\{i\}}\big)}{\det\!\big(A_{S}^{\top} A_{S}\big)}, \qquad i \in S,
  \]
  then sample \(i \in S\) with probability proportional to \(q_i\) and remove it from \(S\).
- Return \(S\).

Here \(A_S\) denotes the rows of \(A\) indexed by \(S\); the \(d\times d\) Gram form \(A_S^{\top} A_S\) is used because the \(|S|\times|S|\) product \(A_S A_S^{\top}\) is singular whenever \(|S| > d\), making its determinant vanish.
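A runnable sketch of the procedure, assuming \(s \ge d\). Rather than forming determinants explicitly, it uses the matrix determinant lemma, under which the ratio \(q_i\) reduces to \(1 - \tau_i\), where \(\tau_i\) is the leverage score of row \(i\) within the current subset:

```python
import numpy as np

def iterative_volume_sample(A: np.ndarray, s: int, seed=None) -> list:
    """Reverse iterative volume sampling: remove rows one at a time until
    s remain, each removal drawn with probability proportional to q_i.

    By the matrix determinant lemma, the Gram determinant after removing
    row i equals det(A_S^T A_S) * (1 - tau_i), where
    tau_i = a_i^T (A_S^T A_S)^{-1} a_i is row i's leverage score, so
    q_i is proportional to 1 - tau_i and no determinants are needed.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    assert d <= s <= n, "need d <= s <= n so the Gram matrix stays invertible"
    S = list(range(n))
    while len(S) > s:
        A_S = A[S]                                       # rows currently kept
        G_inv = np.linalg.inv(A_S.T @ A_S)               # d x d Gram inverse
        tau = np.einsum("id,de,ie->i", A_S, G_inv, A_S)  # leverage scores
        q = np.clip(1.0 - tau, 0.0, None)
        q /= q.sum()
        S.pop(rng.choice(len(S), p=q))                   # sample i ∝ q_i, remove
    return S

A = np.random.default_rng(1).normal(size=(100, 8))
S = iterative_volume_sample(A, s=16, seed=0)             # keep 16 of 100 rows
```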
## Empirical notes
- Pruning schedules: one-shot (prune to the target sparsity once, then fine-tune) vs gradual (ramp sparsity up over many steps, interleaved with training); gradual schedules typically reach higher sparsity at matched accuracy. A common ramp is sketched below.
- Rescaling and fine-tuning phases: after masking, the surviving weights are usually rescaled and/or briefly retrained to recover the accuracy lost at the pruning step.
- Trade-offs vs quantization: pruning removes parameters, whereas quantization shrinks each one; the two compose, but unstructured sparsity only pays off on kernels and hardware that exploit it.
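One widely used gradual schedule (an assumption here, following the cubic ramp of Zhu & Gupta, 2017, rather than anything stated above) increases the sparsity target from \(s_{\text{init}}\) to \(s_{\text{final}}\) over a pruning window:

```python
def cubic_sparsity(step: int, begin: int, end: int,
                   s_init: float = 0.0, s_final: float = 0.9) -> float:
    """Target sparsity at `step` on a cubic ramp from s_init to s_final.

    Sparsity rises quickly early (many redundant weights) and flattens as
    it approaches s_final; begin == end recovers one-shot pruning.
    """
    if step <= begin:
        return s_init
    if step >= end:
        return s_final
    progress = (step - begin) / (end - begin)
    return s_final + (s_init - s_final) * (1.0 - progress) ** 3

# A gradual run re-computes the mask at the current target every few steps,
# e.g. magnitude_mask(W, cubic_sparsity(t, 1_000, 10_000)) using the sketch above.
```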