<!-- TITLE: A formal approach to meaning -->

# The basics

## Structure of the article

1. Introduction
   1.1. Explanation: meaning versus outcomes
   1.2. Our approach
   1.3. A formal approach
2. The meaning of meaning
   2.1. Intension and extension
   2.2. The meaning in datasets
3. An information-theoretic approach
   3.1. Information and meaning
   3.2. Maximally terse meanings
   3.3. Example
4. Towards translation
5. Conclusion

# First reading

## 1.1. Explanation: meaning versus outcomes

+ Explaining outcomes is often a fool's errand, as not everything had to happen
+ Explaining them meaningfully, however, or explaining meaningful behavior meaningfully, is far less problematic

## 1.2. Our approach

## 1.3. A formal approach

## 2.1. Intension and extension

+ Given a relation between objects and their characteristics:
+ The intent of an object is the set of all attributes it maps to
+ The extent of an attribute is the set of all objects which have it
+ The crucial point is that the **meaning** of objects and attributes is just their intents and extents, respectively

## 2.2. The meaning in datasets

+ Datasets are rows of attributes of cultural objects
+ The dataset is conceptualized as a hypergraph connecting elements
+ A hypergraph ${\bf Q} = [ABCD]$ links elements of the sets $A, B, \ldots$
+ $[ABCD]$ implies a set of possible collapses to induced graphs: $[AB]$, $[BD]$, etc.
+ The "$C$ meaning of $a \in A$" is the set of marginals of qualities in $C$, collapsed over everything else while holding $a$ constant

> Note: this is close to the structure I've built to analyze meaning anyways, I just haven't extracted it yet. It's the N-grams analysis, where I split counts for n-grams by journal and year. You could imagine also splitting by author, article keyword, home institution, who they cite (in my [brainstorming document](/journal-analysis/brainstorm), under "Desired datasets"). **The only difference** is that I am looking also at multi-edges of n-grams, connecting them together insofar as they appear together.
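The intent/extent duality and the "$C$ meaning of $a$" marginals above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the relation, the row data, and all column names (`journal`, `ngram`, etc.) are hypothetical.

```python
from collections import Counter

# Hypothetical object-attribute relation (illustrative names, not from the paper)
relation = {
    ("hamlet", "tragedy"), ("hamlet", "play"),
    ("macbeth", "tragedy"), ("macbeth", "play"),
    ("emma", "novel"),
}

def intent(obj):
    """All attributes the object maps to."""
    return {a for (o, a) in relation if o == obj}

def extent(attr):
    """All objects which have the attribute."""
    return {o for (o, a) in relation if a == attr}

def c_meaning(rows, a_col, a_val, c_col):
    """'C meaning of a in A': marginal distribution over the C column
    among rows where the A column equals a, collapsing everything else."""
    counts = Counter(row[c_col] for row in rows if row[a_col] == a_val)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

# A toy dataset shaped like the n-grams-by-journal-and-year analysis
rows = [
    {"journal": "AJS", "year": 1990, "ngram": "habitus"},
    {"journal": "AJS", "year": 1991, "ngram": "field"},
    {"journal": "ASR", "year": 1990, "ngram": "habitus"},
]

print(intent("hamlet"))   # {'tragedy', 'play'}
print(extent("tragedy"))  # {'hamlet', 'macbeth'}
print(c_meaning(rows, "ngram", "habitus", "journal"))  # {'AJS': 0.5, 'ASR': 0.5}
```

In this framing, the "journal meaning of the n-gram 'habitus'" is just its marginal distribution over journals, which matches the split-counts structure described in the note.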
## 3.1. Information and meaning

+ The authors want to generalize to continuous variables
+ Instead of assessing the meaning of a *specific element* of a continuous variable, they assess the meaning one variable contains about another

## 3.2. Maximally terse meanings

+ **Entropy** quantifies how much information the average symbol holds
+ **Mutual information** is the additional information conveyed when two variables are observed together
+ It can be generalized past one variable's information content about another's, to multiway tables
+ **Variety** is this generalization, "a quantification of the degree to which the information in the table is internally useful—how much knowing an observation’s values on some variables tells us about its value on another variable."
+ **Terseness** of a dataset is the proportion of information in the table which "is of use to us"
+ Terseness gives us a meaningful criterion to *maximize*, as it benefits from the removal of categories which have no meaning at all

## 3.3. Example

+ "In sum, we thus use the degree of quantitative informativeness to guide us in our focus on the formal analysis of certain qualitative meanings."

## 4. Towards translation

He seems to be introducing an algebra for table manipulation that I don't fully understand. It seems rather powerful, given his conclusion about equation 13: "This is the translation of A into E via B". I can't say I understand it, but it seems important.

## 5. Conclusion
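The entropy and mutual-information quantities above can be estimated directly from paired samples. A minimal sketch, assuming the standard Shannon definitions; the `terseness` function below is one plausible reading of "the proportion of information in the table which is of use to us" (taken here as $I(X;Y)/H(X,Y)$), not the paper's exact formula, and the sample data is made up.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of the empirical distribution of a sample."""
    counts = Counter(values)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def terseness(xs, ys):
    """ASSUMPTION: read 'terseness' as the share of the table's total
    information that is mutually informative, I(X;Y) / H(X,Y)."""
    joint = entropy(list(zip(xs, ys)))
    return mutual_information(xs, ys) / joint if joint else 0.0

# Toy two-column table: Y is perfectly predictable from X
xs = ["a", "a", "b", "b"]
ys = ["u", "u", "v", "v"]
print(mutual_information(xs, ys))  # 1.0 bit
print(terseness(xs, ys))           # 1.0
```

Under this reading, a category that is independent of everything else contributes joint entropy but no mutual information, so dropping it raises terseness, which is the maximization intuition in the notes.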