Data Generation

In this tab, you can generate new datasets and choose distribution types. You can create multiple distributions simultaneously by collecting them in the command box.

Note

This tutorial assumes that you have already selected a project and imported data. For more information, see the Project and Import Data sections.


User Interface Structure

This is a detailed description of all settings you can define for this operation.

Main Settings

Enter the size of the dataset (number of rows) as a positive integer.
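As a minimal sketch of this rule, the following Python function checks that a size entry is a positive integer. The name `parse_size` is hypothetical; the code only illustrates the constraint and is not how the application itself validates the field.

```python
def parse_size(text: str) -> int:
    """Return the dataset size if `text` is a positive integer, else raise ValueError."""
    try:
        size = int(text.strip())
    except ValueError:
        # Covers non-numeric input as well as decimals such as "2.5".
        raise ValueError(f"size must be an integer, got {text!r}") from None
    if size <= 0:
        raise ValueError(f"size must be greater than 0, got {size}")
    return size


print(parse_size("1000"))
```

Entries such as "0", "-5", or "2.5" raise a `ValueError` in this sketch.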

Distribution Options

Select the distribution you want to create. Depending on the distribution, you need to provide between one and four parameters; see the table Distributions and parameters for details. You can also preview your distribution settings with the Show button.

Generation List

Collect all distributions you want to generate in this box.

Filter

As in all Data Enrichment tabs, you can restrict the operation to part of the data by using a Filter. A more detailed description of Filters can be found here. Since you are generating new data, however, the filter has no effect.

Preview/Apply

The Preview button runs the operation without saving the data, while Apply gives you the option to save the data in a folder in content. More information is available here.

Basic Usage

Creating a new dataset always follows the same steps:

  1. Type the data size (number of rows) in the main settings. It must be a positive integer.
  2. Select the distribution type you want.
  3. Type the distribution parameters; see Distributions and parameters for more details.
  4. Add the distribution to the command collection box.
  5. Apply the operation.

The following animation shows an example in which a Normal and a Weibull distribution, both with 1000 rows, are created simultaneously.

[Animation: creating a Normal and a Weibull distribution with 1000 rows]
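For readers who want to see what such a run amounts to outside the interface, here is a rough Python sketch using only the standard library. The `generation_list` structure, the `generate` helper, and all parameter values are assumptions made for illustration; they are not taken from the animation or from the application's internals.

```python
import random

SIZE = 1000  # number of rows, as in the example above

# Hypothetical generation list: one (column name, sampler) pair per distribution.
# Parameter values are illustrative assumptions.
generation_list = [
    ("normal", lambda rng: rng.gauss(0.0, 1.0)),            # mean, standard deviation
    ("weibull", lambda rng: rng.weibullvariate(2.0, 1.5)),  # scale, shape
]


def generate(size, specs, seed=None):
    """Draw `size` values for every entry in `specs` and return them as columns."""
    rng = random.Random(seed)
    return {name: [draw(rng) for _ in range(size)] for name, draw in specs}


data = generate(SIZE, generation_list, seed=42)
for name, column in data.items():
    # Print the column name, its row count, and its sample mean.
    print(name, len(column), round(sum(column) / len(column), 3))
```

Both columns share the same size because, as in the tab, the size is set once and applies to every distribution in the list.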

Distributions and parameters

| Distribution | Parameter 1 | Parameter 2 | Parameter 3 | Parameter 4 | Link |
|---|---|---|---|---|---|
| Beta | alpha (>0) | beta (>0) | - | - | Link |
| Binomial | n (>0) | p (0<=p<=1) | - | - | Link |
| Birnbaum Saunders | c (>0) | - | - | - | Link |
| Burr Type XII | c (>0) | d (>0) | - | - | Link |
| Chi-Square | df (>0) | - | - | - | Link |
| Exponential | scale | - | - | - | Link |
| Generalized Extreme Value | mu | beta (>=0) | - | - | Link |
| F | df numerator (>0) | df denominator (>0) | - | - | Link |
| Gamma | shape (>=0) | scale (>=0) | - | - | Link |
| Generalized Pareto | shape | scale (>=0) | loc | - | Link |
| Geometric | p (0<=p<=1) | - | - | - | Link |
| Half-Normal | loc | scale (>0) | - | - | Link |
| Hypergeometric | number of good (0<=ngood<=1e9) | number of bad (0<=nbad<=1e9) | - | - | Link |
| Inverse Gaussian | mean (>0) | scale (>0) | - | - | Link |
| Logistic | loc | scale (>0) | - | - | Link |
| Loglogistic | shape (>0) | loc | scale (>=0) | - | Link |
| Lognormal | mean | standard deviation (>=0) | - | - | Link |
| Nakagami | nu (>0) | scale (>=0) | - | - | Link |
| Negative Binomial | n (>0) | p (0<=p<=1) | - | - | Link |
| Noncentral F | df numerator (>0) | df denominator (>0) | nonc (>=0) | - | Link |
| Noncentral t | df (>0) | nc | - | - | Link |
| Noncentral Chi-Squared | df (>0) | nonc (>0) | - | - | Link |
| Normal | mean | standard deviation (>=0) | - | - | Link |
| Poisson | lambda (>=0) | - | - | - | Link |
| Rayleigh | scale (>=0) | - | - | - | Link |
| Rician | shape (>0) | loc | scale | - | Link |
| Stable | alpha (0<alpha<=2) | beta (-1<=beta<=1) | loc | scale (>0) | Link |
| Student's t | df (>0) | - | - | - | Link |
| t Location-Scale | df (>0) | loc | scale (>0) | - | Link |
| Uniform (Continuous) | lower boundary | upper boundary | - | - | Link |
| Uniform (Discrete) | lower boundary | upper boundary | - | - | Link |
| Weibull | scale | shape | - | - | Link |
| Linspace | start | stop | - | - | Link |
| Logspace | start (>0) | stop (>0) | - | - | Link |
| Geomspace | start (>0) | stop (>0) | - | - | Link |
| Arange | start | space | - | - | Link |
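The constraints in the table can be read as simple range checks on each parameter. The sketch below encodes four rows (Beta, Binomial, Normal, Stable) in Python to show how such checks might look. The `CONSTRAINTS` mapping and the `check_parameters` function are hypothetical names introduced here; the application's own validation may differ.

```python
# Constraints copied from four rows of the table above. Each entry holds one
# check per parameter, in the same order as the table columns.
CONSTRAINTS = {
    "Beta": [lambda alpha: alpha > 0, lambda beta: beta > 0],
    "Binomial": [lambda n: n > 0, lambda p: 0 <= p <= 1],
    "Normal": [lambda mean: True, lambda sd: sd >= 0],
    "Stable": [
        lambda alpha: 0 < alpha <= 2,
        lambda beta: -1 <= beta <= 1,
        lambda loc: True,
        lambda scale: scale > 0,
    ],
}


def check_parameters(distribution, params):
    """Raise ValueError if the parameter count or any parameter range is wrong."""
    checks = CONSTRAINTS[distribution]
    if len(params) != len(checks):
        raise ValueError(
            f"{distribution} expects {len(checks)} parameters, got {len(params)}"
        )
    for position, (value, ok) in enumerate(zip(params, checks), start=1):
        if not ok(value):
            raise ValueError(
                f"{distribution}: parameter {position} = {value} is out of range"
            )


check_parameters("Beta", [2.0, 5.0])              # passes silently
check_parameters("Stable", [1.5, 0.0, 0.0, 1.0])  # passes silently
```

A call such as `check_parameters("Binomial", [10, 1.5])` raises a `ValueError` in this sketch, because p must lie between 0 and 1.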