Calculus of Thought

Neuromorphic Logistic Regression in Cognitive Machines

eBook - ePub

  1. 272 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About this book

Calculus of Thought: Neuromorphic Logistic Regression in Cognitive Machines is a must-read for all scientists about a very simple computation method designed to simulate big-data neural processing. This book is inspired by the Calculus Ratiocinator idea of Gottfried Leibniz, which is that machine computation should be developed to simulate human cognitive processes, thus avoiding problematic subjective bias in analytic solutions to practical and scientific problems. The reduced error logistic regression (RELR) method is proposed as such a "Calculus of Thought." This book reviews how RELR's completely automated processing may parallel important aspects of explicit and implicit learning in neural processes. It emphasizes that RELR is really just a simple adjustment to already widely used logistic regression, along with RELR's new applications that go well beyond standard logistic regression in prediction and explanation. Readers will learn how RELR solves some of the most basic problems in today's big and small data related to high dimensionality, multicollinearity, and cognitive bias in capricious outcomes commonly involving human behavior.

  • Provides a high-level introduction and detailed reviews of the neural, statistical and machine learning knowledge base as a foundation for a new era of smarter machines
  • Argues that smarter machine learning to handle both explanation and prediction without cognitive bias must have a foundation in cognitive neuroscience and must embody similar explicit and implicit learning principles that occur in the brain


Information

Year: 2013
Print ISBN: 9780124104075
eBook ISBN: 9780124104525

Chapter 1

Calculus Ratiocinator

Abstract

There is more need than ever to implement Leibniz's Calculus Ratiocinator suggestion concerning a machine that simulates human cognition but without the inherent subjective biases of humans. This need is seen in how predictive models based upon observation data often vary widely across different blinded modelers or across the same standard automated variable selection methods. This unreliability is amplified in today's big data with very high-dimension confounding candidate variables. The single biggest reason for this unreliability is uncontrolled error that is especially prevalent with highly multicollinear input variables. So, modelers need to make arbitrary or biased subjective choices to overcome these problems because widely used automated variable selection methods like standard stepwise methods are simply not built to handle such error. The stacked ensemble method that averages many different elementary models was reviewed as one way to avoid such bias and error and to generate a reliable prediction, but there are disadvantages including lack of automation and lack of transparent, parsimonious, and understandable solutions. A form of logistic regression that also models error events as a component of the maximum likelihood estimation, called Reduced Error Logistic Regression (RELR), was also introduced as a method that avoids this multicollinearity error. An important neuromorphic property of RELR is that it shows stable explicit and implicit learning in small training samples and high-dimension inputs as observed in neurons. Other important neuromorphic properties of RELR consistent with a Calculus Ratiocinator machine were also introduced, including the ability to produce unbiased automatic stable maximum probability solutions and stable causal reasoning based upon matched sample quasi-experiments. Given RELR's connection to information theory, these stability properties are the basis of the new stable information theory that is reviewed in this book with wide-ranging causal and predictive analytics applications.

Keywords

Analytic science; Big data; Calculus Ratiocinator; Causal analytics; Causality; Cognitive neuroscience; Cognitive science; Data mining; Ensemble learning; Explanation; Explicit learning and memory; High dimension data; Implicit learning and memory; Information theory; Logistic regression; Machine learning; Matching experiment; Maximum entropy; Maximum likelihood; Multicollinearity; Neuromorphic; Neuroscience; Observational data; Outcome score matching; Prediction; Predictive analytics; Propensity score; Quasi-experiment; Randomized controlled experiment; Reduced error logistic regression (RELR); Stable information theory
"It is obvious that if we could find characters or signs suited for expressing all our thoughts as clearly and as exactly as arithmetic expresses numbers or geometry expresses lines, we could do in all matters insofar as they are subject to reasoning all that we can do in arithmetic and geometry. For all investigations which depend on reasoning would be carried out by transposing these characters and by a species of calculus."
Gottfried Leibniz, Preface to the General Science, 1677.1
Contents
1. A Fundamental Problem with the Widely Used Methods
2. Ensemble Models and Cognitive Processing in Playing Jeopardy
3. The Brain's Explicit and Implicit Learning
4. Two Distinct Modeling Cultures and Machine Intelligence
5. Logistic Regression and the Calculus Ratiocinator Problem
At the end of his life, starting in 1703, Gottfried Leibniz engaged in a 12-year feud with Isaac Newton over who first invented the calculus and who committed plagiarism. All serious scholarship now indicates that both Newton and Leibniz developed calculus independently.2 Yet, stories about Leibniz's invention of calculus usually focus on this priority dispute with Newton and give much less attention to how Leibniz's vision of calculus differed substantially from Newton's. Whereas Newton was trained in mathematical physics and continued to be associated with academia during the most creative time in his career, Leibniz's early academic failings in math led him to become a lawyer by training and an entrepreneur by profession.3 So Leibniz's deep mathematical insights that led to calculus occurred away from a university professional association. Unlike Newton, whose entire mathematical interests seemed tied to physics, Leibniz clearly had a much broader goal for calculus, with applications in areas well beyond physics that seem to have nothing to do with mathematics. His dream application was a Calculus Ratiocinator, which is synonymous with a Calculus of Thought.4 This can be interpreted as a very precise mathematical model of cognition that could be automated in a machine to answer any important philosophical, scientific, or practical question that traditionally would be answered with human subjective conjecture.5 Leibniz proposed that if we had such a cognitive calculus, we could just say "Let us calculate"6 and always find the most reasonable answers uncontaminated by human bias.
In a sense, this concept of a Calculus Ratiocinator foreshadows today's predictive analytic technology.7 Predictive analytics are widely used today to generate better-than-chance longer-term projections for more stable physical and biological outcomes like climate change, schizophrenia, Parkinson's disease, Alzheimer's disease, diabetes, cancer, and optimal crop yields, and even good short-term projections for less stable social outcomes like marriage satisfaction, divorce, successful parenting, crime, successful businesses, satisfied customers, great employees, successful ad campaigns, stock price changes, and loan decisions, among many others. Until the widespread practice of predictive analytics that came with the introduction of computers in the past century, most of these outcomes were thought to be too capricious to have anything to do with mathematics. Instead, they were traditionally answered with speculative and biased hypotheses or intuitions often rooted in culture or philosophy (Fig. 1.1).
Figure 1.1 Gottfried Wilhelm Leibniz.8
Until just very recently, standard computer technology could only evaluate a small number of predictive features and observations. But we are now in an era of big data and high-performance massively parallel computing, so our predictive models should now become much more powerful. It would seem reasonable that the traditional methods that worked to select important predictive features from small data will scale to high-dimension data and suddenly select predictive models that are much more accurate and insightful. This would give us a new and much more powerful big data machine intelligence technology that is everything that Leibniz imagined in a Calculus Ratiocinator. Big data massively parallel technology should thus theoretically allow completely new data-driven cognitive machines to predict and explain capricious outcomes in science, medicine, business, and government.
Unfortunately, it is not this simple. This is because observation samples are still fairly small in most of today's predictive analytic applications. One reason is that most real-world data are not representative samples of the population to which one wishes to generalize. For example, the people who visit Facebook or search on Google might not be a good representative sample of many populations, so smaller representative samples will need to be taken if the analytics are to generalize very well. Another problem is that many real-world data are not independent observations and instead are often repeated observations from the same individuals. For this reason, data also need to be downsampled significantly to be independent observations. Still another problem is that even when there are many millions of independent representative observations, there is usually a much smaller number of individuals who do things like respond to a particular type of cancer drug or commit fraud or respond to an advertising promotion in the recent past. The informative sample for a predictive model is the group of targeted individuals and a group of similar size that did not show such a response, but these are not usually big data samples in terms of large numbers of observations. So the biggest limitation of big data in the sense of a large number of observations is that most real-world data are not "big" and instead have limited numbers of observations. This is especially true because most predictive models are not built from Facebook or Google data.9
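To make these sampling steps concrete, here is a minimal sketch, assuming a hypothetical pandas DataFrame with columns customer_id and responded (neither name comes from the book): it keeps one observation per individual so the records are roughly independent, then pairs all responders with a similar-sized random group of non-responders.

```python
# Hypothetical illustration of the sampling steps described above (not code
# from the book): deduplicate to roughly independent observations, then
# balance responders against a similar-sized group of non-responders.
import pandas as pd

def build_modeling_sample(df: pd.DataFrame, seed: int = 0) -> pd.DataFrame:
    # One observation per individual, so records are (approximately) independent.
    independent = df.drop_duplicates(subset="customer_id", keep="last")

    # The informative sample: all responders plus an equally sized random
    # draw of non-responders.
    responders = independent[independent["responded"] == 1]
    pool = independent[independent["responded"] == 0]
    non_responders = pool.sample(n=min(len(responders), len(pool)), random_state=seed)

    # Shuffle the combined, class-balanced sample.
    return pd.concat([responders, non_responders]).sample(frac=1, random_state=seed)

# Toy usage with invented data:
raw = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 4, 5, 5, 6],
    "responded":   [0, 1, 0, 0, 1, 0, 0, 0],
    "spend":       [10, 12, 7, 3, 20, 5, 6, 9],
})
print(build_modeling_sample(raw))
```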
Still, most real-world data are "big" in another sense: they are very high dimensional, given that interactions between variables and nonlinear effects are also predictive features. Previously, we did not have the technology to evaluate high dimensions of potentially predictive variables rapidly enough to be useful. The slower processing that was the reason for this "curse of dimensionality" is now behind us. So many might believe that this suddenly allows the evaluation of almost unfathomably high dimensions of data for the selection of important features in much more accurate and smarter big data predictive models simply by applying traditional widely used methods.
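A quick back-of-the-envelope count shows how fast this dimensionality grows once squared terms and pairwise interactions are treated as candidate features; the sketch below is only an illustration of the combinatorics, not a calculation from the book.

```python
# Candidate feature count when main effects, squared terms, and all pairwise
# interactions of p raw variables are considered (illustrative combinatorics only).
from math import comb

def candidate_feature_count(p: int) -> int:
    return p + p + comb(p, 2)  # main effects + squared terms + pairwise interactions

for p in (10, 100, 1000, 10000):
    print(f"{p:>6} raw variables -> {candidate_feature_count(p):>12,} candidate features")
```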
Unfortunately, the traditional widely used methods often do not give unbiased or non-arbitrary predictions and explanations, and this problem will become ever more apparent with today's high-dimension data.

1 A Fundamental Problem with the Widely Used Methods

There is one glaring problem with today's widely used predictive analytic methods that stands in the way of our new data-driven science. This problem is inconsistent with Leibniz's idea of an automated machine that can reproduce the very computations of human cognition, but without the subjective biases of humans. This problem is suggested by the fact that there are probably at least hundreds of predictive analytic methods that are in use today. Each method makes differing assumptions that would not be agreed upon by all, and all have at least one and sometimes many arbitrary parameters. This arbitrary diversity is defended by those who believe a "no free lunch" theorem that argues that there is no one best method across all situations.10,11 Yet, when predictive modelers test various arbitrary algorithms based upon these methods to get a best model for a specific situation, they obviously will test only a tiny subset of the possibilities. So unless there is an obvious very simple best model, different modelers will almost always produce substantially different arbitrary models with the same data.
As examples of this problem of arbitrary methods, there are different types of decision tree methods like CHAID and CART which have different statistical tests to determine branching. Even with the very same method, different user-provided parameters for splitting the branches of the tree will often give quite different decision trees that will generate very different predictions and explanations. Likewise, there are many widely used regression variable selection methods like stepwise and LASSO logistic regression that all differ in the arbitrary assumptions and parameters employed in how one selects important "explanatory" variables. Even with the very same regression method, different user choices in these parameters will almost always generate widely differing explanations and often substantially differing predictions. There are other methods like Principal Component Analysis (PCA), Variable Clustering, and Factor Analysis that attempt to avoid the variable selection problem by greatly reducing the dimensionality of the variables. These methods work well when the data match their underlying assumptions, but most behavioral data are not easily modeled with those assumptions, such as PCA's orthogonal components or the other methods' assumption that one knows how to rotate the components to be nonorthogonal when there are an infinite number of possible rotations. Likewise, there are many other methods like Bayesian Networks, Partial Least Squares, and Structural Equation Modeling that modelers often use to make explanatory inferences. These methods each make differing arbitrary assumptions that often generate wide diversity in explanations and predictions. Likewise, there are a large number of fairly black box methods like Support Vector Machines, Artificial Neural Networks, Random Forests, Stochastic Gradient Boosting, and various Genetic Algorithms that are not completely transparent in their explanations of how the predictions are formed, although some measure of variable importance often can be obtained. These methods can generate quite different predictions and important variables simply because of differing assumptions across the methods or differing user-defined modeling parameters within the methods.
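As a small illustration of this parameter sensitivity, consider standard L1-penalized (LASSO) logistic regression in scikit-learn run on simulated correlated predictors. This is only a hedged sketch of the general point, not the author's RELR method, and the data and penalty values are invented: changing nothing but the user-chosen penalty strength C changes which "explanatory" variables survive selection.

```python
# Illustrative only: the same LASSO logistic regression, with different
# user-chosen penalty strengths C, keeps different sets of variables.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p = 300, 20

# Correlated candidate predictors, as is typical of behavioral data.
shared = rng.normal(size=(n, 1))
X = 0.7 * shared + 0.3 * rng.normal(size=(n, p))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 0).astype(int)

for C in (0.01, 0.1, 1.0, 10.0):
    model = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(X, y)
    kept = np.flatnonzero(model.coef_[0]).tolist()
    print(f"C={C:>5}: variables kept {kept}")
```

With strongly correlated columns, small changes in C typically swap which of the correlated variables carries a nonzero coefficient, which is exactly the kind of arbitrary, user-dependent "explanation" described above.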
Because there are so many methods and because all require unsubstantiated modeling assumptions along with arbitrary user-defined parameters, if you gave exactly the same data to 100 different predictive modelers, you would likely get 100 completely different models unless the solution was very simple. These differing models often would make very different predictions and almost always generate different explanations, to the extent that the method produces transparent models that can be interpreted. In cases where regression methods are used and raw interaction or nonlinear effects are parsimoniously selected without accompanying main effects, the model's predictions are even likely to depend on how variables are scaled, so that currency in Dollars versus Euros would give different predictions.12 Because of such variability, which can even defy basic principles of logic, it is unreasonable to interpret any of these arbitrary models as reflecting a causal and/or most probable explanation or prediction.
Because the widely used methods yield arbitrary and even illogical models in many cases, hardly can we say "Let us calculate" to answer important questions such as the most likely contribution of environmental versus genetic versus other biological factors in causing Parkinson's disease, Alzheimer's disease, prostate cancer, breast cancer, and so on. Hardly can we say "Let us calculate" when we wish to provide a most likely explanation for why there is climate change, or why certain genetic and environmental markers correlate with diseases, or why our business is suddenly losing customers, or how we may decrease costs and yet improve quality in health care. Hardly can we say "Let us calculate" when we wish to know the extent to which sexual orientation and other average gender differences are determined by biological factors or by social factors, when we wish to know whether stricter gun control policies would have a positive or negative impact on crime and murder rates, or when we wish to know whether austerity as an economic intervention tool is helpful or hurtful. Because our widely used predictive analytic methods are so influenced by completely subjective human choices, predictive model explanations and predictions about human diseases, climate change, and business and social outcomes will have substantial variability simply due to our cognitive biases and/or our arbitrary modeling methods. The most important questions of our day relate to various economic, social, medical, and environmental outcomes related to human behavior by cause or effect, but our widely used predictive analytic methods cannot answer these questions reliably.
Even when the very same method is used to select variables, the important variables that the model selects as the basis of explanation are likely to vary across independent observation samples. This sampling variability will be especially prevalent if the observations available to train the model are limited or if there are many possible features that are candidates for explanatory variables, and if there is also more than a modest correlation between at least some of the candidate explanatory variables. This problem of correlation between variables, or multicollinearity, is ultimately the real culprit. The multicollinearity problem is almost always seen with human behavior outcomes. Unlike many physical phenomena, behavioral outcomes usually cannot be understood in terms of easy-to-separate uncorrelated causal components. Models based upon randomized controlled experimental selection methods can avoid this multicollinearity problem through designs that yield variables that are orthogonal.13 Yet, most of today's predictive analytic applications necessarily must deal with observation data, as randomized experiments are usually simply not possible with human behavior in real-world situations. Leo Breiman, who was one of the more prominent statisticians of recent memory, referred to this inability to deal with multicollinearity error as "the quiet scandal of statistics" because the attempts to avoid it in traditional predictive modeling methods are arbitrary and pro...
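This sampling variability can also be made concrete with a small simulation; the sketch below is an invented illustration, not an analysis from the book, in which an L1-penalized logistic regression is refit on several independent small samples of multicollinear predictors and the set of selected variables shifts from sample to sample.

```python
# Invented illustration of selection instability under multicollinearity:
# refit the same variable-selecting model on independent small samples and
# watch the "important" variables change.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
p = 15

def draw_sample(n: int):
    shared = rng.normal(size=(n, 1))              # common source -> multicollinearity
    X = 0.8 * shared + 0.2 * rng.normal(size=(n, p))
    y = (X[:, 0] + rng.normal(size=n) > 0).astype(int)
    return X, y

for trial in range(5):
    X, y = draw_sample(n=150)                     # limited training observations
    fit = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
    print(f"sample {trial}: selected variables {np.flatnonzero(fit.coef_[0]).tolist()}")
```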

Table of contents

  1. Cover image
  2. Title page
  3. Table of Contents
  4. Copyright
  5. Preface
  6. Chapter 1. Calculus Ratiocinator
  7. Chapter 2. Most Likely Inference
  8. Chapter 3. Probability Learning and Memory
  9. Chapter 4. Causal Reasoning
  10. Chapter 5. Neural Calculus
  11. Chapter 6. Oscillating Neural Synchrony
  12. Chapter 7. Alzheimer's and Mind–Brain Problems
  13. Chapter 8. Let Us Calculate
  14. Appendix
  15. Notes and References
  16. Index