I notice something of an inverse relationship between “the common-substring problem’s solution”, and “Kolmogorov complexity” — in the sense that “each is comprised of (and defined by) the set of constituent elements, which are excluded and ignored by the other”…
…but i can’t find this relationship discussed anywhere: —might anyone point me towards the correct search-term, or some ‘accessible’ texts, on this relationship?
To explain:
- The solution to the “common-substring problem”, is all of the “common words, and strings thereof”, between two input texts
the “common-substring problem” is typically discussed in terms of finding only the “longest common-substring” — but rather here, we will be interested in “all commonality”
- “Kolmogorov complexity” frames the “randomness” and “incompressibility” of some sequence; and is based upon a process which “replaces” sub-strings which exist elsewhere (typically framed as the result of some generator function), referred to as compression; leaving only the unrecognised elements behind, which are considered “random” (and incompressible)
So, if we consider our two (input) strings from the “common-substring problem”, imagine:
the first, as “the source of candidate sub-strings for kolmogorov compression replacement”, and
the second, as “the active sequence which is compressed by every candidate substring from (i) found in itself” — (leaving only elements which are not replaced)
What I’m pointing at then, is the way in which:
for sequences , with “common” elements (found )
the “uncommon” elements , (found by , or ), equate to “incompressible” elements; which when measured, result in “kolmogorov complexity” , where
—is this “correct”?
Might anyone point me towards the correct search-term, or some accessible texts, on this relationship?
Many thanks!