Bin a Categorical Field
Categorical fields are modeled as a separate field for each distinct category, so a field with a large number of categories cannot be modeled. Use this dialog to see whether the field can be reduced to a smaller number of categories by combining low-occurrence categories. The dialog is accessible through the Configure dialog.
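To illustrate why the number of categories matters, the following sketch expands a categorical column into one indicator field per distinct category. It is a generic one-hot encoding written in Python, not ADVIZOR's internal implementation; the function name `indicator_fields` and the sample data are hypothetical.

```python
def indicator_fields(values):
    """Expand a categorical column into one 0/1 field per distinct category."""
    categories = sorted(set(values))
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}

# Hypothetical sample column with three distinct categories.
states = ["NY", "CA", "NY", "TX"]
fields = indicator_fields(states)

# One modeled field per distinct category, so the field count grows
# with the number of categories in the original column.
print(len(fields))
```

A column with thousands of distinct values would produce thousands of such fields, which is the situation binning is meant to avoid.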
This dialog is an interface on top of the "bin()" function. You can also create a "bin()" expression directly.
See for a detailed discussion of the modeling capabilities in ADVIZOR AnalystX. Review the overall Analytics to see how this dialog fits into it.
Bin a categorical field like this:
Categorical Fields: This box lists all categorical fields in the table chosen for modeling. Click a field name to see more details about it.
Original Categories: The number of unique categories in the original field.
After Binning: The number of unique categories after the field is binned. Binning is done using the Create button (below).
Group Name: The name used for the group formed from low frequency categories.
Coverage: The percentage of the data, taken from the highest-occurrence categories, whose categories are left unchanged. Categories are aggregated by row count and ordered from most to least frequent; once this threshold is reached, all remaining categories are given the "other" Group Name.
This works best if there are a small number of common categories and a large number of low-occurrence categories. The low-occurrence categories are summarized into a single, new category.
Binned Field: The name of the new field produced. This cannot be modified by the user.
Create: Create the new field with reduced categories.
Remove: Remove a new field that was created. This happens immediately and is not undone by Cancel (below).
Help: Display assistance on using this dialog.
OK: Close the dialog and update the Model Configuration dialog with any changes.
Cancel: Close the dialog and discard any changes.
If OK is used to close the dialog, any newly binned fields are added to the Model Configuration, and the base field is removed. Cancel causes all additions to be forgotten. Remove actions, however, are not reversed by Cancel.
This dialog creates a "bin()" expression to do the binning, which is recorded with the project so that it will be run when the project is refreshed.
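The Coverage and Group Name settings described above can be sketched in Python as follows. This is a minimal illustration of coverage-based binning under stated assumptions, not the actual "bin()" implementation: the function name `bin_by_coverage`, its parameters, the handling of the category that crosses the threshold, and the sample data are all hypothetical.

```python
from collections import Counter

def bin_by_coverage(values, coverage_pct=80, group_name="Other"):
    """Keep the most frequent categories until they cover coverage_pct
    percent of the rows; relabel all remaining categories as group_name."""
    counts = Counter(values)
    total = len(values)
    kept = set()
    covered = 0
    # most_common() orders categories by descending row count.
    for category, count in counts.most_common():
        # Assumption: stop once the kept categories reach the threshold.
        if covered * 100 >= coverage_pct * total:
            break
        kept.add(category)
        covered += count
    return [v if v in kept else group_name for v in values]

# Hypothetical data: two common categories and two rare ones.
cities = ["Boston"] * 5 + ["Chicago"] * 3 + ["Austin", "Reno"]
binned = bin_by_coverage(cities, coverage_pct=80, group_name="Other")
print(sorted(set(binned)))
```

In this sample, "Boston" and "Chicago" together account for 80 percent of the rows and are kept unchanged, while "Austin" and "Reno" are combined into the "Other" group, reducing four original categories to three.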
See Also: