The contiguous part of this—355:312:59—was shown in an older entry, but I recently found the other three pairs mirrored about the same axis, and then the other group. The simplest interpretation of these is: six symmetrical (same number of components) mirrored (same MQ) pairs about the same axis. Ignoring the axis part, in a randomly generated set (using possible MQs 18-115), there are on average 47 of these symmetrical mirrored pairs. If the WTC 1 set of MQs is shuffled, the average is 70. In the actual ordered WTC 1 set, there are 90.

Considering the axis: [at least] six symmetrical mirrors about the same axis occurs about 1% of the time (at random). Of course, there are many different axes in the WTC1 set with multiple mirrors. The largest of these multi-mirrorings are 6, 6, 5, 4, and 3 (the two six groups are in the image). Randomly, this occurs about .002% of the time. If using the WTC1 set and shuffling only prelude and fugue MQs to respective even and odd positions in the index, the odds increase to just .17%. Shuffling the WTC1 set obviously enters into a “begging the question” scenario, but it unsurprisingly doesn’t really boost the odds into “likely” territory.
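For the technically curious, the constrained shuffle mentioned above can be sketched roughly as follows. This is a minimal illustration, not the script behind the quoted percentages: the function name `shuffle_within_parity` is hypothetical, and it assumes a 0-based index in which preludes sit at even positions and fugues at odd positions.

```python
import random

def shuffle_within_parity(mqs, rng=random):
    """Shuffle even-indexed and odd-indexed values separately.

    Assumption: index 0 is a prelude, index 1 its fugue, and so on,
    so preludes stay in prelude slots and fugues in fugue slots.
    """
    evens = mqs[0::2]   # slicing copies, so the input list is untouched
    odds = mqs[1::2]
    rng.shuffle(evens)
    rng.shuffle(odds)
    shuffled = [None] * len(mqs)
    shuffled[0::2] = evens
    shuffled[1::2] = odds
    return shuffled
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable.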

Considering the contiguous bit: the odds of a 3-part contiguous mirror occurring over at least 36 components at random is .07%; shuffling the WTC1 set increases this to .14%. This ignores the 3 other mirrors about the same axis, which would naturally decrease the percentage.

Considering that the difference between the axis positions for both of the pictured groups is <1: the chance of two contiguous groups of 3 symmetrical mirrored pairs about the same axis having axes at or adjacent to the same component is .0007%. That is, I had 7 hits in 1,000,000 attempts. The shuffled set increases it to .0024%. This again does not take into account the size of the two groups, or the other non-contiguous mirrors; I believe the isolated requirements are sufficient here, as some might say applying all of them is overly restrictive. Update 1/5/25: I tried anyway for fun. I made sure that the script was able to find the actual [pictured groups] from the WTC1 set via the same process before setting it to randomize/simulate. In a random set of 48 MQs generated from 18-115, two axes less than 1 apart were found to have 6 symmetrical mirrors, with three of them being contiguous and at least 10 components long, 1 in 1,000,000 times. It was trial #350,114, for the curious.

For the technically curious: my method was to divide the logic into the smallest reasonable steps, and test that the step was operating correctly before moving to the next. The steps building up to this final test are thus:

Create an index of 48 random numbers, using range 18-115, can repeat.
Calculate all contiguous group totals. (0,0), (0,1), (0,2), etc.
List all totals that occur more than once, along with each instance of that total and its index range.
For each recurring total, list all possible pairs of instances whose ranges do not intersect. These will be called “mirrored pairs.”
For each non-intersecting pair, list pairs whose index lengths are the same. These will be called “symmetrical mirrored pairs”.
For each symmetrical mirrored pair, calculate the average of the first instance’s end point with the second instance’s start point, which we will call the “axis”. Ex: (0,5) and (6,11) have an axis of 5.5.
List all axes that repeat, along with their respective mirrored sums, indices of each instance in the mirrored pairs, and starting point of the first instance of each pair.
If there are duplicate instances of starting points within an axis, remove all but the entry with the smallest sum. This gives all unique mirrored pairs.
OR
Remove all pairs that do not have duplicate starting points. This gives all contiguous groups.
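
The first seven steps above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the script used for the figures quoted here: all function names are hypothetical, ranges are 0-based and inclusive, and the last two filtering steps (unique pairs, contiguous groups) are left out.

```python
import random
from collections import defaultdict
from itertools import combinations

def random_mqs(n=48, low=18, high=115, rng=random):
    """Step 1: n random MQs in [low, high], repeats allowed."""
    return [rng.randint(low, high) for _ in range(n)]

def contiguous_totals(mqs):
    """Steps 2-3: map each total to every inclusive (start, end) range summing to it."""
    totals = defaultdict(list)
    for start in range(len(mqs)):
        running = 0
        for end in range(start, len(mqs)):
            running += mqs[end]
            totals[running].append((start, end))
    return totals

def symmetrical_mirrored_pairs(mqs):
    """Steps 4-6: same total, non-intersecting, same length; returns axis too."""
    pairs = []
    for total, ranges in contiguous_totals(mqs).items():
        # Ranges for one total are stored in order of increasing start,
        # so the first range of each combination is the earlier one.
        for (s1, e1), (s2, e2) in combinations(ranges, 2):
            if e1 < s2 and (e1 - s1) == (e2 - s2):
                axis = (e1 + s2) / 2  # e.g. (0,5) and (6,11) give 5.5
                pairs.append((axis, total, (s1, e1), (s2, e2)))
    return pairs

def pairs_by_axis(mqs):
    """Step 7: keep only axes shared by more than one pair."""
    groups = defaultdict(list)
    for pair in symmetrical_mirrored_pairs(mqs):
        groups[pair[0]].append(pair)
    return {axis: group for axis, group in groups.items() if len(group) > 1}
```

For example, `pairs_by_axis(random_mqs())` returns a dictionary keyed by axis; counting the entries per axis over many trials is one way to estimate how often multi-mirror axes turn up at random.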

I’ve been talking to a robot all day: Google’s Gemini 2.0. It has struggled mightily to add double-digit integers, but it has apologized, and I’ve no choice but to accept, as I need it to generate Python code for simulations, which to this untrained eye it has done quite well. Google’s Colab has been running millions of simulations off these scripts without yet requiring me to shovel coal into a furnace to power it.

Conversations at this year’s Bach Network Dialogue Meeting showed me that people are still quite skeptical about any kind of numerical design in Bach. So, I am tackling one of the most essential and unassailable numerical traits of WTC1: the set of component measure quantities (MQs), meaning how many measures each piece in the collection contains. Points about the set:

WTC I Component MQs.
Sorted by size, repeat values colored.

There are 33 unique values: 10 are repeated at least once, and 25 of the 48 components (over half of the set) belong to this “repeated” category. If choosing at random [from the range 18-115], one would expect around 38 unique values; a simulation returned 33 (or fewer) ~2% of the time.
The majority of the repeats occur in the lower MQs: 17 out of 25 (68%) are below the median of 36. This occurs in a simulation ~6% of the time.
Two MQs repeat 4x, one repeats 3x, and seven repeat 2x. MQs repeat [at least] as often in a simulation ~.1% of the time.
There are three noticeable “gaps” in the MQ range: there are no values from 59-69, 75-85, or 88-103. Three gaps of [at least] this size (not limited to the exact ranges) occur ~.02% of the time in random simulation.
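
The first point above can be checked with a short sketch. It assumes "at random" means 48 independent, uniform draws from 18-115 (an assumption on my part), and the function names are hypothetical. The closed form is the standard expected number of distinct values, k(1 - (1 - 1/k)^n), for n draws from k equally likely values.

```python
import random

def expected_unique(n=48, low=18, high=115):
    """Expected number of distinct values in n uniform draws from [low, high]."""
    k = high - low + 1  # 98 possible MQs
    return k * (1 - (1 - 1 / k) ** n)

def simulate_unique(trials=100_000, n=48, low=18, high=115, threshold=33, seed=0):
    """Fraction of random sets with at most `threshold` distinct values."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        distinct = {rng.randint(low, high) for _ in range(n)}
        if len(distinct) <= threshold:
            hits += 1
    return hits / trials
```

`expected_unique()` comes out at roughly 38, in line with the figure above; `simulate_unique()` gives an estimate to compare against the ~2% quoted.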

We might thus conclude that Bach did not merely “arrive” at these MQs in a process of random free composition, which should surprise no one, as Bach was no robot. The question is to what degree Bach was choosing the numbers, as opposed to them being some kind of emergent property of his compositional process and style. For instance, the MQ of 24 repeats 4x, and only in preludes. This could easily be explained as a structure of 2x12, 4x6, etc; preludes being perhaps more likely than fugues to have such an arithmetic structure. Conversely, easily subdivided MQs like 32 and 36 (which are very near the median of the set) are not used at all.

The above points all deal with the set of MQs in isolation. To assess whether or not Bach was choosing MQs, we might consider how they are actually arranged in the work:

The MQ of 35 repeats 4x—twice in prelude and twice in fugue—but only in major; the odds of an MQ occurring 4x but only in a single mode are ~1%.
There are 7 repeated MQs in a row at 31-37 (among a larger cluster of 11 out of 12). There are also voids at 3-8 and 20-25. These occur together about .2% of the time in simulation. As the distribution of repeated MQs across Mode and Prelude/Fugue is essentially equal in both cases—12:13—there is no clear reason why groups of repeated MQs would cluster.
Combinations of 35/27, 19/34, and 29/41 all repeat. Only one of these is not in a P/F pair. Disregarding the P/F tendency, picking at random, this occurs .03% of the time; shuffling the actual WTC1 set improves the odds to .35%.
The MQs that repeat 4x—24 and 35—each have equidistant instances. The first three instances of 24 are nine components apart, and the first-second and third-fourth instances of 35 are each seven components apart.

I cannot imagine a reasonable explanation for all of these that does not surreptitiously accept that Bach was aware of the numbers as an abstraction, apart from the consequence of composing more or fewer measures; from there, the conclusion that Bach chose and arranged MQs as a structural device is a short and convenient stumble away.

I will attach the code for all simulations referenced at some point in the near future, and will likely continue to edit and revise this post if I find errors in the script or my assumptions.

-jtr