Simulate clustered categorical datasets using a continuous-time Markov chain.
simulator(simK, n_coordinates, n_observations, n_categories, sim_between_t, sim_within_t, use_dirichlet = FALSE, sim_pi)
| Argument | Description |
|---|---|
| simK | Number of clusters. |
| n_coordinates | Number of coordinates in the simulated dataset. |
| n_observations | Number of observations in the simulated dataset. |
| n_categories | Number of categories in the simulated dataset. |
| sim_between_t | Between cluster variation. |
| sim_within_t | Within cluster variation. |
| use_dirichlet | Whether to simulate datasets with a Dirichlet prior. Default is FALSE. |
| sim_pi | Mixing proportions: a vector whose length equals the specified number of clusters and whose values sum to 1. |
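
The constraints on sim_pi can be checked before calling simulator. The sketch below uses base R only; `check_sim_pi` is a hypothetical helper written for illustration and is not part of the package, and the tolerance value is an assumption.

```r
# Hypothetical helper (not part of the package): checks a sim_pi vector
# against the constraints stated in the arguments table.
check_sim_pi <- function(sim_pi, simK, tol = 1e-8) {
  stopifnot(is.numeric(sim_pi), length(sim_pi) == simK)  # one proportion per cluster
  stopifnot(all(sim_pi >= 0))                            # proportions cannot be negative
  stopifnot(abs(sum(sim_pi) - 1) < tol)                  # must sum to 1, up to rounding
  invisible(TRUE)
}

# Unnormalized weights can be rescaled into valid mixing proportions first.
weights <- c(1, 1, 2, 3, 3)
sim_pi <- weights / sum(weights)
check_sim_pi(sim_pi, simK = 5)
```

Comparing the sum against 1 with a tolerance rather than `==` avoids spurious failures from floating-point rounding.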
Value:
Returns a list of simulation results with the following elements:
"CTMC_probabilities": Number of observations in each cluster of the best initialization.
"modes": Simulated modes.
"cluster_assignments": Simulated cluster assignments.
"cluster_sizes": Simulated cluster sizes.
"data": Simulated data.
# Simulate a 100 x 10 dataset with 4 categories and 5 true clusters.
data <- simulator(simK = 5, n_coordinates = 10, n_observations = 100,
                  n_categories = 4, sim_between_t = 2, sim_within_t = 1,
                  use_dirichlet = TRUE, sim_pi = c(0.1, 0.1, 0.2, 0.3, 0.3))
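
To give some intuition for how a continuous-time Markov chain can generate clustered categorical data, here is a minimal base R sketch. It assumes an equal-rates chain (every category switches to every other category at rate 1) and assumes that cluster modes evolve from a shared starting profile for a "between" time, with observations then evolving from their mode for a "within" time. Both are illustrative assumptions and may not match what simulator does internally; `ctmc_evolve` is a hypothetical function.

```r
# Illustrative only: an equal-rates CTMC on n_categories states.
# This is an assumed model, not necessarily the one used inside simulator().
ctmc_evolve <- function(states, t, n_categories) {
  K <- n_categories
  # With all off-diagonal rates equal to 1, the probability of ending in the
  # starting category after time t is 1/K + (K - 1)/K * exp(-K * t).
  p_stay <- 1 / K + (K - 1) / K * exp(-K * t)
  vapply(states, function(s) {
    probs <- rep((1 - p_stay) / (K - 1), K)  # equal chance for each other category
    probs[s] <- p_stay
    sample.int(K, size = 1, prob = probs)
  }, integer(1))
}

set.seed(1)
K <- 4
ancestor <- sample.int(K, 10, replace = TRUE)                 # shared starting profile
cluster_mode <- ctmc_evolve(ancestor, t = 2, n_categories = K)  # assumed "between" time
# Five observations from one cluster, each evolved from the mode (assumed "within" time).
obs <- t(replicate(5, ctmc_evolve(cluster_mode, t = 1, n_categories = K)))
dim(obs)
```

Under this assumed model, a larger time pushes each coordinate toward a uniform draw over the categories, so longer times give noisier data, while a time of 0 leaves every coordinate unchanged.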