Hi Andrew, thanks for your suggestions!
I’ve had a chance to return to the subject, and here is a short GIF explaining the goal:
Technically, I am converting a string column with categories into a numeric column.
In the new column, each category is assigned a number in order of its appearance. This is useful in many cases: for instance, to anonymize data, or to feed datasets into machine learning algorithms that do not accept string categorical columns.
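To make the goal concrete, here is a minimal Python sketch of the transformation itself, not of any existing feature in the platform. The function name `encode_by_first_appearance` is hypothetical, and starting the numbering at 1 is an assumption; the only requirement from the description above is that numbers follow the order in which categories first appear.

```python
def encode_by_first_appearance(values):
    """Return (encoded, mapping): each distinct value gets an integer code,
    numbered in the order the value first appears in the input."""
    mapping = {}   # category -> code, filled as new categories are seen
    encoded = []
    for value in values:
        if value not in mapping:
            # Assumption: codes start at 1; use len(mapping) for 0-based codes.
            mapping[value] = len(mapping) + 1
        encoded.append(mapping[value])
    return encoded, mapping


column = ["red", "green", "red", "blue", "green"]
encoded, mapping = encode_by_first_appearance(column)
print(encoded)  # [1, 2, 1, 3, 2]
print(mapping)  # {'red': 1, 'green': 2, 'blue': 3}
```

This is a single pass with a dictionary lookup per row, so it should stay fast well beyond 10,000 rows.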
This GIF also shows an approach for Excel. In fact, it only works well in Excel on small datasets and takes considerable time on reasonably sized datasets of 10,000+ elements.
I know that we have a .categories map on the dataframe’s column, but this isn’t quite what I was looking for. I thought it’d be useful to have this data augmentation mechanism out of the box in the UI.
Maybe there’s a formula I could use in Add New Column that I’m not aware of? In any case, I wasn’t going to use the Excel one from the GIF, as it’s more of a curiosity than a daily tool.