Categorization has always been of interest to me. I’m one of those people who take pleasure in organizing things, and categorization is supposed to facilitate that. It’s supposed to separate the entire mass of things into convenient sub-groups. This at once gives you a better sense of the composition of the whole and makes it quicker and easier to find things within it. It’s often questionable, though, to what extent this type of organizational effort is worth it. Especially in the digital world, search technologies have in many cases made it obsolete. There’s little point in painstakingly saving files or emails in particular folders when a keyword search can generally locate the document you’re looking for. There’s another problem, too: choosing the categories themselves and figuring out how to delineate the boundaries between them.
Traditionally, my problem was creating overly specific categories. I took pleasure in forming very tailored ones. This was another instance of being too much of a manager: it gratified my managerial tendencies, but it wasn’t functional. A lot of organization tends to be self-indulgent in this way, an artificial order that flatters the ego of its creator but is in practice useless or even counterproductive. And so every time I returned to my ordered groups, I was confronted with a warren of categories that I would have to sort through. It often wasn’t immediately clear which category I should be looking in because the thing I was looking for had features that aligned it with multiple categories. Of course, you also don’t want the categories to be overly broad because then they don’t sufficiently cut down the population of things that you have to search through.
A principle occurred to me: For optimal categorization, the categories should be broad and distinct enough that for any given thing it is immediately clear which category it would fall into. The categories should then be drawn as minutely as possible up to this limit. In other words, you can continue to break down a population into narrower, more descriptive categories until you begin to hesitate too much in deciding which group any given thing would belong to. The result is that when you need to find something, you know exactly which subsection of the population to look in. Inevitably, there will be a few problematic outliers that might defy the resulting schema, but that can just be accepted. It’s undue attention to these weird cases that leads to the excessive fragmentation caused by niche categories.
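To make the principle a little more concrete, here is a rough sketch in Python. Everything in it is my own illustrative assumption rather than a tool I actually use: the sample files, the `ambiguity` and `finest_clear_scheme` helpers, and the `tolerance` value that stands in for "hesitating too much." A scheme is just a set of named rules, and an item counts as unclear when it fits zero categories or more than one.

```python
from typing import Callable, Dict, List

# A scheme maps a category name to a rule that says whether an item belongs.
Scheme = Dict[str, Callable[[dict], bool]]


def ambiguity(items: List[dict], scheme: Scheme) -> float:
    """Fraction of items that do not fit exactly one category."""
    if not items:
        return 0.0
    unclear = 0
    for item in items:
        matches = sum(1 for rule in scheme.values() if rule(item))
        if matches != 1:
            unclear += 1
    return unclear / len(items)


def finest_clear_scheme(items: List[dict], schemes: List[Scheme],
                        tolerance: float = 0.1) -> Scheme:
    """Walk schemes from broadest to narrowest and keep the last one whose
    ambiguity stays within the tolerance. The tolerance is the share of
    awkward outliers we agree to accept. If even the broadest scheme is
    too ambiguous, it is still returned as a fallback."""
    chosen = schemes[0]
    for scheme in schemes:
        if ambiguity(items, scheme) > tolerance:
            break
        chosen = scheme
    return chosen


# Hypothetical sample data, purely for illustration.
items = [
    {"name": "tax_2023.pdf", "kind": "document", "topics": {"finance"}},
    {"name": "trip.jpg", "kind": "photo", "topics": {"travel"}},
    {"name": "invoice_trip.pdf", "kind": "document", "topics": {"finance", "travel"}},
    {"name": "notes.txt", "kind": "document", "topics": {"work"}},
]

broad: Scheme = {
    "documents": lambda i: i["kind"] == "document",
    "photos": lambda i: i["kind"] == "photo",
}
narrow: Scheme = {
    "finance": lambda i: "finance" in i["topics"],
    "travel": lambda i: "travel" in i["topics"],
    "work": lambda i: "work" in i["topics"],
}
```

With these four files, the broad scheme places everything cleanly, while the narrow one stumbles on the travel invoice, which fits two categories. With a low tolerance the sketch stays with the broad scheme; loosen the tolerance and it accepts the narrower one, outlier and all.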
Apart from search, technology offers another solution to this problem: tagging. Like search, it is more of a circumvention of the issue. Tagging is essentially a way of assigning a thing to multiple categories at once. This obviates the need to remember the one particular category it was placed in; searching by one of the possible tags should be enough. While elegant, this solution is painstaking, as it requires categorizing each thing multiple times in order to capture all of the possible groups it could belong to. Again, the investment might not be worth the downstream efficiency it provides. The principle laid out above seems to offer a comparatively easier, functional method for organizing, and one with which I’ve enjoyed experimenting in simplifying the structure of some of my databases and archives.
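For comparison, the tagging approach mentioned above can be pictured as a small lookup table from each tag to the set of things carrying it. This is only a minimal sketch; the `TagIndex` class and its method names are hypothetical, made up for illustration.

```python
from collections import defaultdict


class TagIndex:
    """Maps each tag to the set of items that carry it."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def tag(self, item, *tags):
        # Filing one item under several tags is the extra effort tagging demands.
        for t in tags:
            self._by_tag[t.lower()].add(item)

    def find(self, tag):
        # Any single tag is enough to retrieve the item; returns a copy.
        return set(self._by_tag.get(tag.lower(), set()))
```

An item tagged with both "finance" and "travel" turns up under either one, which is exactly the convenience, and each extra tag is exactly the cost.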