Name

sxk_means_groups - determine the 'best' number of clusters in a set of images using K-means classification

Usage

Usage on the command line:

sxk_means_groups.py stackfile output_file <mask_file> --opt_method=k_means_method K1=start_number_of_cluster K2=stop_number_of_clusters --rand_seed=1000 --maxit=max_iter --trials=number_of_trials_of_k_means --crit=criterion_name --CTF --MPI

Usage in Python:

k_means_groups(stackfile, output_file, mask=Image mask, opt_method=k_means_method, K1=start_number_of_cluster, K2=stop_number_of_clusters, rand_seed=1000, maxit=max_iter, trials=number_of_trials_of_k_means, crit=criterion_name, CTF)

Input

stackfile
The input stack of images
output_file
text file in which the values of the clustering criteria are stored
mask

filename of the input image mask. Only pixels where the mask value is > 0.5 are considered in the input images. Note: the mask must have the same dimensions as the input images (default = None, entire images are used)

K1
minimum requested number of clusters
K2
maximum requested number of clusters
trials
number of trials of K-means (see description below) (default one trial)
opt_method
optimization method: 'SSE' or 'cla' (default is SSE) (see description below)
CTF
if set, CTF information stored in file headers will be used (default no CTF)
rand_seed
random seed used to generate random numbers (shown as 1000 in the usage lines above)
crit
names of the criteria to use: 'all' for all criteria, 'C' for Coleman, 'H' for Harabasz, or 'D' for Davies-Bouldin. It is preferable to use the three criteria together ('CHD') and to choose the number of clusters that satisfies all of them (see the Description section below). The letters can be combined freely, for example 'H', 'CD', 'HC', or 'CDH'.
MPI
if set, the MPI version of k-means groups is used
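
As an illustration of the mask rule described above (only pixels where the mask value is > 0.5 are used), here is a minimal NumPy sketch. It is not the code used by sxk_means_groups; the function name masked_pixels and the toy arrays are hypothetical and serve only to show the thresholding and the dimension check.

```python
import numpy as np


def masked_pixels(images, mask=None):
    """Return one row of pixel values per image, keeping only pixels with mask > 0.5.

    Hypothetical helper for illustration; not part of SPARX.
    """
    images = np.asarray(images, dtype=float)
    if mask is None:
        # No mask: the entire image is used.
        return images.reshape(len(images), -1)
    mask = np.asarray(mask)
    if mask.shape != images.shape[1:]:
        raise ValueError("mask must have the same dimensions as the images")
    # Boolean indexing keeps only the pixels selected by the mask.
    return images[:, mask > 0.5]


# Toy stack of two 2x2 images and a mask that selects two pixels.
stack = np.arange(8, dtype=float).reshape(2, 2, 2)
mask = np.array([[1.0, 0.0], [0.4, 0.9]])
vectors = masked_pixels(stack, mask)
```

With this toy mask, each image is reduced to the two pixels whose mask value exceeds 0.5; without a mask, all four pixels of each image are kept.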

Output

output_file
text file containing one column per selected criterion. For example, if crit='CHD', the columns are: (1) number of clusters, (2) values of the Coleman criterion, (3) values of the Harabasz criterion, and (4) values of the Davies-Bouldin criterion
output_file.p
file containing a gnuplot script that plots the values of all criteria on the same range. To use it, run this command in gnuplot: load 'output_file.p'
WATCH_GRP_KMEANS or WATCH_MPI_GRP_KMEANS
file recording the progress of k-means groups. It can be read while the program runs to watch how the criteria evolve.
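
The column layout of output_file for crit='CHD' can be read back with a few lines of Python. The sketch below is a hedged example, not part of SPARX: it assumes the columns are separated by whitespace and that the file has no header line (neither is stated on this page), the function name read_criteria is hypothetical, and the sample numbers are invented purely to demonstrate the parsing.

```python
import io


def read_criteria(lines, names=("K", "Coleman", "Harabasz", "Davies-Bouldin")):
    """Parse rows of criterion values into a list of dictionaries.

    Assumes whitespace-separated columns in the order given for crit='CHD'.
    """
    rows = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(names):
            raise ValueError("expected %d columns, got %d" % (len(names), len(fields)))
        # First column is the number of clusters; the rest are criterion values.
        row = {names[0]: int(float(fields[0]))}
        for name, value in zip(names[1:], fields[1:]):
            row[name] = float(value)
        rows.append(row)
    return rows


# Invented sample content standing in for a real output_file.
sample = io.StringIO("2 0.5 10.0 1.5\n3 0.75 12.5 1.25\n")
table = read_criteria(sample)
```

For a real run, pass an open file object instead of the sample, for example read_criteria(open('output_file')). If a different crit string was used, supply a matching names tuple.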

Description

Reference

Author / Maintainer

Julien Bert

Keywords

category 1
APPLICATIONS

Files

statistics.py, sxk_means_groups.py

See also

sxk_means

Maturity

beta
works for author, often works for others.

Bugs

None. It is perfect.

sxk means groups (last edited 2008-07-24 14:37:34 by Julien)