Posted in January 2019

More on machine learning and CH

Posted on 2019-01-11 by Klaas Pieter Hart

A few days ago I wrote about a paper establishing an independence result in the field of machine learning. Here I offer a few more comments.

In the paper the authors comment on the relation between their result and actual machine learning. That relation may seem tenuous because none of the functions involved in the arguments is related to any kind of algorithm.
Indeed the constructions of the compression schemes are very non-constructive in that they use repeated applications of the Axiom of Choice.

Now the inequality 2^ℵ₀>ℵ_ω implies there are no compression schemes whatsoever. But it may be of interest to know that consistent examples of compression schemes must be nonconstructive. It turns out that, with the aid of a few standard results from Descriptive Set Theory, it is relatively easy to show outright that there are no Borel measurable monotone compression schemes, and hence no Borel measurable learning functions either, for the class of problems studied in the paper mentioned above.

The details can be found in this note.

Posted in Set Theory, Topologie, verzamelingen, Wiskunde

Machine Learning and the Continuum Hypothesis

Posted on 2019-01-09 by Klaas Pieter Hart

Not even Machine Learning is safe from Set Theory, or so it seems. On the website of the journal Nature there is an article about a paper in Nature Machine Intelligence that connects a certain kind of learnability to the Continuum Hypothesis. The conclusion of the paper is that certain abstract learnability questions are undecidable on the basis of the normal ZFC axioms of Set Theory.

The article tries to explain what is going on but seems to confuse two disparate things: Gödel’s (First) Incompleteness Theorem on the one hand and the undecidability of the Continuum Hypothesis on the other hand.
The first is a very general statement about first-order theories; it states that for every theory that satisfies a number of technical conditions there are statements that have no formal proof and neither do their negations. Elementary number theory is subject to this theorem, as is ZFC Set Theory.
The second is a Set-theoretical statement for which we can prove that neither it nor its negation has a formal proof. It is also more interesting than Gödel’s statements; the latter ‘simply’ assert their own unprovability, whereas the Continuum Hypothesis is a fundamental statement/question about the set of real numbers.

The confusion manifests itself when the Continuum Hypothesis is called a paradox. It is not. The statements from the Incompleteness Theorem, on the other hand, are usually likened to the Liar Paradox in that “This formula is unprovable” looks a lot like the paradox that is “This sentence is false”.

The paper itself also alludes to the Incompleteness Theorem; it even states that it is used in the argument. It is not. No use is made of Gödel’s abstract unprovable sentences.

The Set Theory

So, what is the Set Theory in the paper? The learnability question is shown to be equivalent to the existence of a natural number m and a map η from the family of m-element subsets of [0,1] to the family F of finite subsets of [0,1] that satisfies the following condition: if A is a subset of [0,1] with m+1 elements then it has a subset B with m elements such that η(B) contains A.
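To make the quantifiers in this condition concrete, here is a minimal Python sketch that checks it by brute force on a small finite set. The names `satisfies_condition`, `initial_segment` and `identity` are made up for this illustration and do not come from the paper. On a finite set the condition is of course easy to satisfy, so this only illustrates the shape of the statement, not its difficulty on [0,1].

```python
from itertools import combinations

def satisfies_condition(eta, X, m):
    """Check: every (m+1)-element subset A of X has an
    m-element subset B such that eta(B) contains A."""
    for A in combinations(sorted(X), m + 1):
        # Try every m-element subset B of A as a candidate.
        if not any(set(A) <= set(eta(frozenset(B)))
                   for B in combinations(A, m)):
            return False
    return True

def initial_segment(B):
    # Hypothetical map: send B to all numbers up to its largest element.
    return set(range(max(B) + 1))

def identity(B):
    # Hypothetical map that returns B itself; too small to contain
    # any set with more elements than B.
    return set(B)

X = range(8)
print(satisfies_condition(initial_segment, X, 1))  # True
print(satisfies_condition(identity, X, 1))         # False
```

The first map works with m=1 because a two-element set {a<b} is contained in the initial segment determined by {b}; the second fails because a one-element set cannot contain a two-element one.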

The main theorem of the paper states that an arbitrary set X admits such a map with m=k+1 if and only if X has cardinality at most ℵ_k.

If the Continuum Hypothesis holds then [0,1] has cardinality ℵ₁, hence there is a map as required with m=2. More generally there is a map as required if the cardinality of [0,1] is equal to ℵ_k for some natural number k. These possibilities do not lead to contradictions, hence neither does the learnability statement. On the other hand, the statement that the cardinality of [0,1] is larger than all these ℵ_k does not lead to contradictions either, hence neither does the negation of the learnability statement.

The derivation of the main statement parallels that of the main result of the paper Sur une caractérisation des alephs by Kuratowski from 1951: a set X has cardinality at most ℵ_k if and only if its power X^{k+2} can be written as the union of k+2 sets A₁, …, A_{k+2} such that for every i and every point (x₁,…,x_{k+2}) in the power the set of points y in A_i that satisfy y_j=x_j for j≠i is finite; in Kuratowski’s words “A_i is finite in the direction of the ith axis”.
Indeed one can even construct a map η for m=k+1 from this decomposition of the canonical set ω_k of cardinality ℵ_k.
For notational simplicity we take k=2, so m=3, and ω₂⁴ has a decomposition into four sets A₁, A₂, A₃, and A₄. Given a three-element subset F of ω₂, enumerate it in increasing order: x₁<x₂<x₃. The set η(F) will consist of F itself together with

all x for which (x,x₁,x₂,x₃) belongs to A₁,
all x for which (x₁,x,x₂,x₃) belongs to A₂,
all x for which (x₁,x₂,x,x₃) belongs to A₃,
all x for which (x₁,x₂,x₃,x) belongs to A₄

To see that this works let G be a four-element subset of ω₂, enumerate it as y₁<y₂<y₃<y₄. Then (y₁,y₂,y₃,y₄) belongs to one of the four sets, say it belongs to A₂; then G is a subset of η({y₁,y₃,y₄}): the point y₂ is included in the second line in the list above.
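The same recipe can be tried out in the simplest case k=0, so m=1, where everything is computable. The sketch below is only an illustration under stated assumptions: it takes the natural numbers in place of ω₀, and the decomposition of the plane into `in_A1` (pairs with first coordinate at most the second) and `in_A2` (pairs with second coordinate below the first) is one convenient choice of mine, not something taken from the paper or from Kuratowski's article. Each set is finite in the direction of its own axis, which is what makes η(F) finite.

```python
from itertools import combinations

# Assumed decomposition of N x N (case k = 0) into two sets,
# given as membership tests on pairs t = (t[0], t[1]).
def in_A1(t):
    # Finite in the direction of the first axis:
    # for fixed t[1] only the finitely many t[0] <= t[1] qualify.
    return t[0] <= t[1]

def in_A2(t):
    # Finite in the direction of the second axis:
    # for fixed t[0] only the finitely many t[1] < t[0] qualify.
    return t[1] < t[0]

DECOMPOSITION = [in_A1, in_A2]

def eta(F):
    """F itself plus, for each i, all x such that inserting x at
    position i of the increasing enumeration of F lands in A_(i+1)."""
    xs = tuple(sorted(F))
    result = set(xs)
    # For this particular decomposition every qualifying x is at
    # most max(F), so a bounded search finds all of them.
    for i, member in enumerate(DECOMPOSITION):
        for x in range(max(xs) + 1):
            if member(xs[:i] + (x,) + xs[i:]):
                result.add(x)
    return result

def covered(G, m=1):
    # Does some m-element subset B of G satisfy G <= eta(B)?
    return any(set(G) <= eta(B) for B in combinations(sorted(G), m))

print(sorted(eta({3})))  # [0, 1, 2, 3]
print(all(covered(G) for G in combinations(range(12), 2)))  # True
```

With this choice η({x}) is just the initial segment {0,…,x}, and a pair y₁<y₂ lies in A₁, so it is covered by η({y₂}), exactly as in the argument above with one fewer coordinate.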

Note. The proof of the main theorem (Theorem 1) of the paper is not quite correct: it fails for k=1, for example, as one encounters the cardinal number ℵ_{-1}. Worse: in that case the ordering <₁ seems to have order type ω₁ and ω₀ simultaneously. All this can be repaired with a better write-up.

Posted in Georg Cantor, Logica, onmogelijkheden, Set Theory, verzamelingen, Wiskunde
