A few months ago I was invited to enter into a dialogue with Brussels-based artist Rossella Biscotti on the occasion of the exhibition of her 2015 installation “Other” at the Contour Biennale in Mechelen (Belgium). In this work, she uses the Jacquard weaving technique to visualise Belgian census data, and explores the data-subjects that are categorised as “other” within this data-set. The resulting installation consists of four large carpets that display data on various minority-groups and rest-categories of the Brussels population.
My role in this collaboration was to contribute formal or mathematical insights on how rest-categories like “other” or “none of the above” could be understood. In this second blog-post (see the first post) I elaborate on how logico-mathematical insights can become part of such inquiries.

What then can a logico-mathematical approach contribute to artistic research concerned with the classification practices on which census-data are built? Two things at least. First, it can help make the idea of a “logic of classification” more explicit, and develop its implications in purely abstract terms (for instance without associating rest-categories with forms of exclusion). As such, it can reorient our critical attention from how classification-structures affect specific data-subjects in concrete settings to how classification-rules create abstract entities like the profiles or categories that become the primary entities we reason about or use to make decisions. Second, it can be used to explore alternative approaches; in this specific case, different ways of conceptualising how rest-categories should be used in the construction of categories of (in certain respects) similar data-subjects.

In relation to the focus on “other”, I specifically contrasted two different ways in which membership of a rest-category could be conceptualised. The basic principle that underlies both is that data-subjects belong to the same category (or fall under the same profile) if and only if we have attributed them the same values (or values within the same range) for all the relevant data-dimensions. In this way, we can construct categories of, say, all the children aged between 6 and 10 who have at least one sibling. Similarly, we seem to be able to construct the category of all the data-subjects categorised as “other” in the data-dimension “household position”. We can do so even if the actual household-roles of the presumed members of this category have nothing in common apart from the fact that they do not conform to any of the roles privileged by the designers of the census, and that their place or role within a household probably isn’t very common (as in the case of “other nationalities”). Treating such rest-categories as bona fide categories makes sense if we think of labels like “other” or “none of the above” as semantically significant labels: labels that provide sufficient ground for identification because they indicate that we have sufficient evidence to identify the data-subjects that were so labelled.
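The basic principle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the three data-dimensions and the helper `group_by_profile` are hypothetical and are not taken from the census data. Here the label "other" is treated as an ordinary value, so it produces a category like any other.

```python
from collections import defaultdict

# Hypothetical data-subjects: (name, age range, has a sibling, household position).
subjects = [
    ("a", "6-10", True, "child"),
    ("b", "6-10", True, "child"),
    ("c", "6-10", False, "child"),
    ("d", "30-40", False, "other"),
    ("e", "30-40", False, "other"),
]

def group_by_profile(records):
    """Put two data-subjects in the same category iff all their values agree."""
    categories = defaultdict(list)
    for name, *profile in records:
        categories[tuple(profile)].append(name)
    return dict(categories)

categories = group_by_profile(subjects)
for profile, members in categories.items():
    print(profile, members)
```

In this sketch the subjects "d" and "e" end up in one category solely because both carry the label "other", whatever their actual household-roles may be.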

If, however, we think of such labels as a mere indication of the absence of any information, this strategy quickly becomes questionable. In the context of the mentioned household positions, being categorised as “other” results from negative answers to four consecutive yes/no-questions, but does not need to carry any positive information. At least for some rest-categories it thus makes more sense to treat the labels we use to denote these categories along the same lines as the sentinel-values that are customarily used to signal missing data, like 9999 or the NaN (not a number) value defined by the IEEE 754 floating-point standard. Let us stipulate that two data-subjects fall under the same profile or belong to the same category if and only if, first, there is no information that indicates that they are different in a relevant respect (a potentially vacuous sense of being similar), and, in addition, there is also positive evidence that they are similar in the relevant respects. By the second requirement, the label “other” then no longer leads to the creation of a category of others. Because NaN has the property of not being equal to itself (the expression NaN==NaN will typically evaluate to False), this requirement for positive information can be simulated by using NaN to denote rest-categories.
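The behaviour of NaN under comparison can be checked directly in Python. The sketch below is a minimal illustration; the helper `same_category` and the numeric codes are hypothetical. The comparison is done value by value on purpose: Python's built-in tuple equality first checks whether two elements are the very same object, so comparing whole tuples would let a shared NaN object count as equal.

```python
OTHER_AS_NUMBER = 10          # "other" encoded as an ordinary value
OTHER_AS_NAN = float("nan")   # "other" encoded as missing information

def same_category(p, q):
    """True iff every pair of corresponding values compares equal.

    Assumes both profiles have the same length.
    """
    return all(a == b for a, b in zip(p, q))

print(OTHER_AS_NUMBER == OTHER_AS_NUMBER)  # True
print(OTHER_AS_NAN == OTHER_AS_NAN)        # False

# Two data-subjects with the same age band (1), both labelled "other".
print(same_category((1, OTHER_AS_NUMBER), (1, OTHER_AS_NUMBER)))  # True
print(same_category((1, OTHER_AS_NAN), (1, OTHER_AS_NAN)))        # False
```

With the NaN encoding, a data-subject labelled "other" is not even placed in the same category as itself, which is one way of reading the requirement that positive evidence of similarity is needed.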

Using a randomly generated data-set similar to the data used by Biscotti, the difference between the two types of approaches can easily be visualised. In the figures below the sizes of categories are displayed as bubbles; the figure on the left uses the number 10 to denote “other” (and 10==10 evaluates to True), whereas the figure on the right uses NaN.

Here, we immediately see that the presence of data-subjects labelled as “other” leads to the creation of a large periphery of different (because unknown) data-subjects whenever the label used to denote rest-categories indicates the absence of information. This gives us a minimal illustration of how the meaning we assign to the labels that denote categories interacts with the process of creating categories or profiles, and with the subsequent use of these categories as an ontology for describing a given subject-matter.
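The comparison between the two encodings can be reproduced with a small simulation. The following is a sketch under assumptions, not the code behind the figures: the two data-dimensions, the value ranges, the sample size and the function names are all invented for illustration. Categories are built by pairwise comparison rather than by dictionary lookup, because dictionary keys are matched by object identity before equality and would therefore merge records that share one NaN object.

```python
import random

NAN = float("nan")

def same_profile(p, q):
    """True iff every pair of corresponding values compares equal."""
    return all(a == b for a, b in zip(p, q))

def category_sizes(records):
    """Sizes of the categories induced by same_profile.

    Each record is compared with one representative per existing category;
    a record containing NaN matches nothing and becomes a singleton.
    """
    representatives, sizes = [], []
    for record in records:
        for i, rep in enumerate(representatives):
            if same_profile(record, rep):
                sizes[i] += 1
                break
        else:
            representatives.append(record)
            sizes.append(1)
    return sizes

def make_records(n, other_code, seed=0):
    """Hypothetical data: (age band, household position) per data-subject."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        age_band = rng.randrange(3)
        position = rng.choice([1, 2, 3, 4, other_code])
        records.append((age_band, position))
    return records

sizes_number = category_sizes(make_records(200, 10))   # "other" encoded as 10
sizes_nan = category_sizes(make_records(200, NAN))     # "other" encoded as NaN

print("categories with 10: ", len(sizes_number))
print("categories with NaN:", len(sizes_nan))
print("singletons with NaN:", sizes_nan.count(1))
```

Because both runs use the same seed, the two data-sets differ only in how "other" is encoded. With the NaN encoding every data-subject labelled "other" becomes a category of size one, which corresponds to the periphery of small bubbles described above.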