MCMC sampling¶
Add/exchange/delete¶
[AGR16], [LJS16c] and [LJS16d] derived variants of a Metropolis sampler having for stationary distribution \(\operatorname{DPP}(\mathbf{L})\) (2). The proposal mechanism works as follows.
At state \(S\subset [N]\), propose \(S'\) different from \(S\) by at most 2 elements by picking
Then perform
Add-Delete¶
Pure addition/deletion moves
Add \(S' \leftrightarrow S \cup t\)
Delete \(S' \leftrightarrow S \setminus s\)
Add-Exchange-Delete¶
Mix of exchange and add-delete moves
Delete \(S' \leftrightarrow S \setminus s\)
Exchange \(S' \leftrightarrow S \setminus s \cup t\)
Add \(S' \leftrightarrow S \cup t\)
Hint
Because moves are allowed between subsets having at most 2 different elements, transitions are very local inducing correlation, however fast mixing was proved.
import numpy as np
from dppy.finite_dpps import FiniteDPP
rng = np.random.RandomState(413121)
r, N = 4, 10
# Random feature vectors
Phi = rng.randn(r, N)
L = Phi.T.dot(Phi)
DPP = FiniteDPP('likelihood', **{'L': L})
DPP.sample_mcmc('AED', random_state=rng) # AED, AD, E
print(DPP.list_of_samples) # list of trajectories, here there's only one
[[[0, 2, 3, 6], [0, 2, 3, 6], [0, 2, 3, 6], [0, 2, 3, 6], [0, 2, 3, 6], [0, 2, 3, 6], [0, 2, 6, 9], [0, 2, 6, 9], [2, 6, 9], [2, 6, 9]]]
See also
Zonotope¶
[GBV17] target a projection \(\operatorname{DPP}(\mathbf{K})\) with
where \(\Phi\), is the underlying \(r\times N\) feature matrix satisfying \(\operatorname{rank}(\Phi)=\operatorname{rank}(\mathbf{K})=r\).
In this setting the Number of points is almost surely equal to \(r\) and we have
The original finite ground set is embedded into a continuous domain called a zonotope. The hit-and-run procedure is used to move across this polytope and visit the different tiles. To recover the finite DPP samples one needs to identify the tile in which the successive points lie, this is done by solving linear programs (LPs).
Hint
Sampling from a projection DPP boils down to solving randomized linear programs (LPs).
Important
For its LPs solving needs DPPy uses the cvxopt
library, but cvxopt
is not installed by default when installing DPPy. Please refer to the installation instructions on GitHub for more details on how to install the necessary dependencies.
from numpy.random import RandomState
from dppy.finite_dpps import FiniteDPP
rng = RandomState(413121)
r, N = 4, 10
A = rng.randn(r, N)
DPP = FiniteDPP('correlation', projection=True, **{'A_zono': A})
DPP.sample_mcmc('zonotope', random_state=rng)
print(DPP.list_of_samples) # list of trajectories, here there's only one
[[[2, 4, 5, 7], [2, 4, 5, 7], [2, 4, 5, 7], [1, 4, 5, 7], [1, 4, 5, 7], [1, 4, 5, 7], [0, 4, 7, 8], [0, 2, 7, 9], [0, 2, 7, 9], [2, 4, 5, 7]]]
Note
On the one hand, the Zonotope perspective on sampling projection DPPs yields a better exploration of the state space. Using hit-and-run, moving to any other state is possible but at the cost of solving LPs at each step. On the other hand, the Add/exchange/delete view allows to perform cheap but local moves.
See also
k-DPPs¶
To preserve the size \(k\) of the samples of \(k\!\operatorname{-DPP}(\mathbf{L})\), only Exchange moves can be performed.
Caution
\(k\) must satisfy \(k \leq \operatorname{rank}(L)\)
from numpy.random import RandomState
from dppy.finite_dpps import FiniteDPP
rng = RandomState(123)
r, N = 5, 10
# Random feature vectors
Phi = rng.randn(r, N)
L = Phi.T.dot(Phi)
DPP = FiniteDPP('likelihood', **{'L': L})
k = 3
DPP.sample_mcmc_k_dpp(size=k, random_state=rng)
print(DPP.list_of_samples) # list of trajectories, here there's only one
[[[7, 2, 5], [7, 2, 5], [7, 2, 9], [7, 8, 9], [7, 8, 9], [7, 8, 2], [7, 8, 2], [6, 8, 2], [1, 8, 2], [1, 8, 2]]]
See also
[LJS16a] for a core-set perspective