Inherits from
- model: scipy.maxentropy.maxentropy.model
Attributes
Inherited from base classes
Method summary
- __init__(self, F, counts, numcontexts)
- dual(self, params = None, ignorepenalty = False)
- expectations(self)
- fit(self, algorithm = 'CG')
- lognormconst(self)
- logpmf(self)
Inherited from base classes
- pmf(self)
- pmf_function(self, f = None)
- setfeaturesandsamplespace(self, f, samplespace)
Methods
- __init__(self, F, counts, numcontexts)
The F parameter should be a (sparse) m x size matrix, where m is the number of features and size is |W| * |X|; here |W| is the number of contexts and |X| is the number of elements in the sample space X. The 'counts' parameter should be a row vector stored as a (1 x |W|*|X|) sparse matrix, whose element i*|X|+j is the number of occurrences of x_j in context w_i in the training set (the same context-major ordering that logpmf() uses). This storage format allows efficient multiplication over all contexts in one operation.
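The flat indexing of (context, sample point) pairs can be sketched as follows. This is only an illustration: it assumes the context-major ordering used by logpmf() (index c * numsamplepoints + x), uses a plain Python list as a dense stand-in for the sparse row vector, and the helper flat_index and the toy training pairs are hypothetical names and data, not part of the library.

```python
# Toy sizes (hypothetical): 2 contexts, 3 points in the sample space.
num_contexts = 2        # |W|
num_samplepoints = 3    # |X|


def flat_index(i, j, num_samplepoints):
    """Column of the pair (context w_i, sample point x_j), context-major."""
    return i * num_samplepoints + j


# Hypothetical training set: each entry is one observed (context, point) pair.
training_pairs = [(0, 1), (0, 1), (1, 2), (1, 0)]

# Dense stand-in for the (1 x |W|*|X|) sparse row vector 'counts'.
counts = [0] * (num_contexts * num_samplepoints)
for i, j in training_pairs:
    counts[flat_index(i, j, num_samplepoints)] += 1

print(counts)  # [0, 2, 0, 1, 0, 1]
```

Each row of F would use the same column ordering, so that row i, column flat_index(w, x, ...) holds f_i(w, x).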
- dual(self, params = None, ignorepenalty = False)
The entropy dual function is defined for conditional models as
L(theta) = sum_w q(w) log Z(w; theta) - sum_{w,x} q(w,x) [theta . f(w,x)]
or equivalently as
L(theta) = sum_w q(w) log Z(w; theta) - (theta . K)
where K_i = sum_{w, x} q(w, x) f_i(w, x), and where q(w) is the empirical probability mass function derived from observations of the context w in a training set. Normally q(w, x) will be 1, unless the same class label is assigned to the same context more than once.
Note that both sums are only over the training set {w,x}, not the entire sample space, since q(w,x) = 0 for all w,x not in the training set.
The entropy dual function is proportional to the negative log likelihood.
Compare to the entropy dual of an unconditional model:
L(theta) = log(Z) - theta^T . K
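As a rough numerical illustration of the conditional dual, the sketch below evaluates L(theta) for a toy model using plain Python lists. It is not the library's implementation. It assumes q(w, x) is the count of each pair divided by the total number of observations (the library may normalize differently), that every context shares the full sample space, and that pairs are stored context-major; the function and variable names are hypothetical.

```python
import math


def dual(theta, features, counts, n_w, n_x):
    """L(theta) = sum_w q(w) log Z(w; theta) - theta . K  (toy version)."""
    total = float(sum(counts))
    q = [c / total for c in counts]          # q(w, x), assumed normalization
    value = 0.0
    for w in range(n_w):
        # theta . f(w, x) for each sample point x in context w
        scores = [sum(t * f_i[w * n_x + x] for t, f_i in zip(theta, features))
                  for x in range(n_x)]
        m = max(scores)                      # log-sum-exp shift for stability
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        q_w = sum(q[w * n_x:(w + 1) * n_x])  # q(w) = sum_x q(w, x)
        value += q_w * log_z
    # K_i = sum_{w,x} q(w, x) f_i(w, x)
    K = [sum(qk * fk for qk, fk in zip(q, f_i)) for f_i in features]
    return value - sum(t * k for t, k in zip(theta, K))


# One feature over 2 contexts x 3 sample points (hypothetical data).
features = [[1, 0, 0, 0, 1, 1]]
counts = [0, 2, 0, 1, 0, 1]
value = dual([math.log(2.0)], features, counts, 2, 3)
print(value)
```

Under these assumptions the result equals the average negative log likelihood of the training pairs, which is one way to check the implementation by hand.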
- expectations(self)
Returns the vector of expectations of the features with respect to the distribution p_tilde(w) p(x | w), where p_tilde(w) is the empirical probability mass function value stored as self.p_tilde_context[w].
- fit(self, algorithm = 'CG')
Fits the conditional maximum entropy model subject to the constraints
sum_{w, x} p_tilde(w) p(x | w) f_i(w, x) = k_i
for i=1,...,m, where k_i is the empirical expectation
k_i = sum_{w, x} p_tilde(w, x) f_i(w, x).
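To make the constraints concrete, the sketch below fits a toy conditional model by plain gradient descent on the dual, whose gradient is the model expectation minus k_i. This deliberately swaps in fixed-step gradient descent for the 'CG' algorithm named in the signature, purely for illustration. The normalization of p_tilde (counts divided by the total), the step size, the iteration count, and all function names are assumptions.

```python
import math


def conditional_pmf(theta, features, n_w, n_x):
    """Flat list of p(x | w), indexed by w * n_x + x."""
    p = []
    for w in range(n_w):
        s = [sum(t * f_i[w * n_x + x] for t, f_i in zip(theta, features))
             for x in range(n_x)]
        m = max(s)
        e = [math.exp(v - m) for v in s]
        z = sum(e)
        p.extend(v / z for v in e)
    return p


def model_expectations(theta, features, q_context, n_w, n_x):
    """E_i = sum_{w,x} p_tilde(w) p(x | w) f_i(w, x)."""
    p = conditional_pmf(theta, features, n_w, n_x)
    joint = [q_context[k // n_x] * p[k] for k in range(n_w * n_x)]
    return [sum(j * f for j, f in zip(joint, f_i)) for f_i in features]


def fit_by_gradient_descent(features, counts, n_w, n_x, step=1.0, iters=500):
    total = float(sum(counts))
    q = [c / total for c in counts]                          # p_tilde(w, x)
    q_context = [sum(q[w * n_x:(w + 1) * n_x]) for w in range(n_w)]
    k = [sum(a * b for a, b in zip(q, f_i)) for f_i in features]
    theta = [0.0] * len(features)
    for _ in range(iters):
        e = model_expectations(theta, features, q_context, n_w, n_x)
        # Gradient of the dual with respect to theta_i is E_i - k_i.
        theta = [t - step * (ei - ki) for t, ei, ki in zip(theta, e, k)]
    return theta, k, q_context


features = [[1, 0, 0, 0, 1, 1]]      # one feature, 2 contexts x 3 points
counts = [0, 2, 0, 1, 0, 1]          # hypothetical training counts
theta, k, q_context = fit_by_gradient_descent(features, counts, 2, 3)
fitted = model_expectations(theta, features, q_context, 2, 3)
print(theta, fitted, k)
```

At convergence the fitted expectations should match k to within numerical tolerance, which is exactly the constraint stated above.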
- lognormconst(self)
Compute the elementwise log of the normalization constant (partition function) Z(w)=sum_{y in Y(w)} exp(theta . f(w, y)). The sample space must be discrete and finite. This is a vector with one element for each context w.
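The per-context log partition function can be sketched with a numerically stable log-sum-exp. This is an illustration rather than the library's code: it assumes Y(w) is the full sample space for every context, uses plain lists in place of sparse matrices, and the function name log_norm_consts is hypothetical.

```python
import math


def log_norm_consts(theta, features, num_contexts, num_samplepoints):
    """Return [log Z(w) for each context w].

    features[i][w * num_samplepoints + x] holds f_i(w, x).
    """
    log_z = []
    for w in range(num_contexts):
        # theta . f(w, x) for every sample point x in this context
        scores = [
            sum(t * f_i[w * num_samplepoints + x]
                for t, f_i in zip(theta, features))
            for x in range(num_samplepoints)
        ]
        m = max(scores)  # subtract the max before exponentiating
        log_z.append(m + math.log(sum(math.exp(s - m) for s in scores)))
    return log_z


# One feature with weight log(2): Z(0) = 2 + 1 + 1 and Z(1) = 1 + 2 + 2.
features = [[1, 0, 0, 0, 1, 1]]
log_z = log_norm_consts([math.log(2.0)], features, 2, 3)
print([math.exp(v) for v in log_z])  # values close to 4 and 5
```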
- logpmf(self)
Returns a (sparse) row vector of logarithms of the conditional probability mass function (pmf) values p(x | c) for all pairs (c, x), where c are contexts and x are points in the sample space. The order of these is log p(x | c) = logpmf()[c * numsamplepoints + x].
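The index convention above can be illustrated with a small stand-alone function that returns a flat list instead of a sparse row vector. It computes log p(x | c) = theta . f(c, x) - log Z(c) under the assumption that every context shares the same sample space. The function here only mirrors the documented layout and is not the method itself.

```python
import math


def logpmf(theta, features, num_contexts, num_samplepoints):
    """Flat list with log p(x | c) at index c * num_samplepoints + x."""
    out = []
    for c in range(num_contexts):
        scores = [
            sum(t * f_i[c * num_samplepoints + x]
                for t, f_i in zip(theta, features))
            for x in range(num_samplepoints)
        ]
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        out.extend(s - log_z for s in scores)
    return out


features = [[1, 0, 0, 0, 1, 1]]       # hypothetical single feature
numsamplepoints = 3
lp = logpmf([math.log(2.0)], features, 2, numsamplepoints)

c, x = 1, 2
p = math.exp(lp[c * numsamplepoints + x])   # p(x=2 | c=1), close to 0.4
print(p)

# Within each context the probabilities sum to one.
sums = [sum(math.exp(v) for v in lp[c * 3:(c + 1) * 3]) for c in range(2)]
print(sums)
```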
