2. Data Split

  1. Click on Data Split in the Machine Learning category.

  1. Input Data: Choose whether the data to split includes target data. If it does, select the Feature Data and the Target Data separately. To use only specific columns of a dataset, click the funnel icon.

  2. Test Size: Select the percentage of the input data to hold out for testing; the rest is used for training.

  3. Random State: Set a seed for the random number generator so that the same split is produced on every run. (If not set, the data is split differently each time.)

  4. Shuffle: Shuffle the rows randomly before splitting so that the split does not depend on the original order of the data (for example, rows sorted by class). This reduces ordering bias in the training and test sets.

  5. Stratify: Preserve the class proportions in both the training and test sets so that no class is over- or under-represented in either one. Applies to classification problems.

  6. Allocate to: Assign variable names to the split data.

  7. Code View: Preview the code that will be generated.

  8. Run: Execute the code.
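
The options above map closely onto the parameters of scikit-learn's `train_test_split`. The sketch below is only an illustration of what those options do, assuming the generated code resembles a `train_test_split` call; the toy arrays `X` and `y` and the variable names on the left-hand side are hypothetical and stand in for your own Feature Data, Target Data, and "Allocate to" names.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 100 samples, 2 features,
# and an imbalanced binary target (80 zeros, 20 ones).
X = np.arange(200).reshape(100, 2)   # Feature Data
y = np.array([0] * 80 + [1] * 20)    # Target Data

# Each keyword corresponds to one option in the list above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.25,    # Test Size: hold out 25% of the rows
    random_state=42,   # Random State: fixed seed, same split every run
    shuffle=True,      # Shuffle: randomize row order before splitting
    stratify=y,        # Stratify: keep the 80/20 class ratio in both sets
)

print(X_train.shape, X_test.shape)   # sizes of the two splits
print(y_train.mean(), y_test.mean()) # share of class 1 in each split
```

With Stratify set, both splits keep the original 20% share of class 1. Rerunning the call with the same Random State returns the same rows in each split; leaving it unset gives a different split on each run.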
