Pruning: standard (10m)
L₂-norm sampling of rows
Procedure L2NormSample(A ∈ ℝ^{n×d}, s)
- For each row \( i = 1 \ldots n \): \( \ell_i \gets \lVert A_{i,*} \rVert_2^2 \)
- Let \( P \gets \sum_{j=1}^n \ell_j \)
- For each \( i \): \( p_i \gets \ell_i / P \)
- Initialize \( \tilde A \gets [\,] \)
- Repeat for \( t = 1, \ldots, s \): draw an index \( i_t \sim p \) independently (with replacement), set \( \mathbf r \gets A_{i_t,*} / \sqrt{s\, p_{i_t}} \), and append \( \mathbf r \) as a new row of \( \tilde A \).
- Return \( \tilde A \).
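The steps above can be sketched in NumPy as follows. This is a minimal illustration rather than a reference implementation: the function name `l2_norm_sample`, the `rng` parameter, and the error raised for an all-zero matrix are assumptions added here.

```python
import numpy as np

def l2_norm_sample(A, s, rng=None):
    """Return an (s x d) sketch of A built by squared-row-norm sampling."""
    rng = np.random.default_rng(rng)          # accepts None, a seed, or a Generator
    A = np.asarray(A, dtype=float)

    ell = np.einsum("ij,ij->i", A, A)         # squared L2 norm of each row
    total = ell.sum()
    if total == 0:
        # Assumption: treat an all-zero matrix as an error, since p is undefined.
        raise ValueError("A has no nonzero rows to sample from")
    p = ell / total                           # sampling probabilities, sum to 1

    # Draw s row indices i.i.d. from p (with replacement).
    idx = rng.choice(A.shape[0], size=s, replace=True, p=p)

    # Rescale each sampled row by 1 / sqrt(s * p_i) and stack.
    scale = 1.0 / np.sqrt(s * p[idx])
    return A[idx] * scale[:, None]
```

With this rescaling every sampled row has squared norm \( P/s \), so \( \lVert \tilde A \rVert_F^2 = \lVert A \rVert_F^2 \) holds exactly for any draw, which makes a convenient sanity check.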
Leverage-score sampling
Procedure LeverageSample(A ∈ ℝ^{n×d}, s)
- Thin SVD: \( A = U\Sigma V^\top \)
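- \( \ell_i \gets \lVert U_{i,*} \rVert_2^2 \) (the leverage score of row \( i \)); set \( p_i \gets \ell_i / \sum_{j=1}^n \ell_j \), then draw and rescale \( s \) rows of \( A \) exactly as in L2NormSample and return \( \tilde A \).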
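A possible NumPy sketch of this procedure is shown below. The names `leverage_scores` and `leverage_sample`, the `rng` parameter, and the relative cutoff `tol` used to discard numerically zero singular directions are assumptions made for illustration; the procedure itself only specifies a thin SVD.

```python
import numpy as np

def leverage_scores(A, tol=1e-12):
    """Squared row norms of U from the thin SVD A = U Sigma V^T."""
    A = np.asarray(A, dtype=float)
    U, sigma, _ = np.linalg.svd(A, full_matrices=False)
    if sigma.size == 0 or sigma[0] == 0:
        raise ValueError("A is empty or all zeros")
    # Assumption: keep only directions whose singular value is above a
    # relative cutoff, so U spans the numerical column space of A.
    r = int(np.sum(sigma > tol * sigma[0]))
    U = U[:, :r]
    return np.einsum("ij,ij->i", U, U)        # these sum to r, the numerical rank

def leverage_sample(A, s, rng=None, tol=1e-12):
    """Return an (s x d) sketch of A built by leverage-score sampling."""
    rng = np.random.default_rng(rng)
    A = np.asarray(A, dtype=float)

    ell = leverage_scores(A, tol=tol)
    p = ell / ell.sum()                       # normalize to probabilities

    # Same draw-and-rescale step as in the L2-norm procedure,
    # applied to the rows of A (not of U).
    idx = rng.choice(A.shape[0], size=s, replace=True, p=p)
    scale = 1.0 / np.sqrt(s * p[idx])
    return A[idx] * scale[:, None]
```

For example, `leverage_scores(np.array([[1., 0.], [0., 1.], [0., 0.]]))` gives scores close to `[1, 1, 0]`: the two rows that span the column space carry all the weight, and the zero row is never selected.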
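Both procedures give principled, randomized alternatives to plain magnitude pruning: rather than deterministically keeping the largest-magnitude entries, they keep \( s \) rows drawn with probability proportional to a row importance score (squared norm or leverage score) and rescale them.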
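The rescaling by \( 1/\sqrt{s\,p_{i_t}} \) in both procedures can be motivated by a short calculation. Assuming the \( s \) draws are independent and identically distributed according to \( p \), the sampled matrix matches the Gram matrix of \( A \) in expectation:

\[
\mathbb{E}\big[\tilde A^\top \tilde A\big]
= \sum_{t=1}^{s} \mathbb{E}\!\left[ \frac{A_{i_t,*}^\top A_{i_t,*}}{s\, p_{i_t}} \right]
= \sum_{t=1}^{s} \sum_{i:\, p_i > 0} p_i \, \frac{A_{i,*}^\top A_{i,*}}{s\, p_i}
= \sum_{i:\, p_i > 0} A_{i,*}^\top A_{i,*}
= A^\top A .
\]

The last equality uses the fact that, under either choice of \( p \), a row with \( p_i = 0 \) is a zero row of \( A \) and contributes nothing to \( A^\top A \).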