simulator.Rd
Simulate clustered categorical datasets using a continuous-time Markov chain.
simulator(simK, n_coordinates, n_observations, n_categories, sim_between_t, sim_within_t, use_dirichlet = FALSE, sim_pi)
Argument | Description |
---|---|
simK | Number of clusters. |
n_coordinates | Number of coordinates in the simulated dataset. |
n_observations | Number of observations in the simulated dataset. |
n_categories | Number of categories of the simulated dataset. |
sim_between_t | Between cluster variation. |
sim_within_t | Within cluster variation. |
use_dirichlet | Indicates whether to simulate datasets with a Dirichlet prior. Default is FALSE. |
sim_pi | Mixing proportions: a vector whose length equals the specified number of clusters (simK) and whose values sum to 1. |
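
The constraints on sim_pi can be checked before calling simulator. The sketch below uses base R only; `check_sim_pi` and its `tol` argument are hypothetical names introduced for illustration and are not part of the package. The non-negativity check is an assumption that follows from sim_pi being proportions, and the tolerance allows for floating-point rounding in the sum.

```r
# Hypothetical helper (not part of the package): checks a sim_pi vector
# against the constraints described in the argument table.
check_sim_pi <- function(sim_pi, simK, tol = 1e-8) {
  stopifnot(
    is.numeric(sim_pi),
    length(sim_pi) == simK,      # one proportion per cluster
    all(sim_pi >= 0),            # assumption: proportions are non-negative
    abs(sum(sim_pi) - 1) < tol   # values sum to 1, up to rounding
  )
  invisible(TRUE)
}

# Passes silently for the mixing proportions used in the example call.
sim_pi <- c(0.1, 0.1, 0.2, 0.3, 0.3)
check_sim_pi(sim_pi, simK = 5)
```

A vector that fails any check stops with an error naming the violated condition, which may be easier to diagnose than a failure inside the simulation itself.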
Returns a list of simulation dataset results.
Value:
"CTMC_probabilities"
: Probabilities from the continuous-time Markov chain used in the simulation.
"modes"
: Simulated modes.
"cluster_assignments"
: Simulated cluster assignments.
"cluster_sizes"
: Simulated cluster sizes.
"data"
: Simulated data.
# Simulate a 100 x 10 dataset with 4 categories and 5 true clusters.
data <- simulator(simK = 5, n_coordinates = 10, n_observations = 100,
                  n_categories = 4, sim_between_t = 2, sim_within_t = 1,
                  use_dirichlet = TRUE, sim_pi = c(0.1, 0.1, 0.2, 0.3, 0.3))