Data Science for Mathematicians

Nathan Carter

  1. 516 pages
  2. English

About the book

Mathematicians have skills that, if deepened in the right ways, would enable them to use data to answer questions important to them and others, and to report those answers in compelling ways. Data science combines parts of mathematics, statistics, and computer science. Gaining such power, and the ability to teach it, has reinvigorated the careers of mathematicians. This handbook will help mathematicians better understand the opportunities presented by data science. Data science is a fast-growing field, and this growth affects the curriculum, research, and career opportunities alike. Contributors from both academia and industry present their views on these opportunities and how to take advantage of them.

Information

Year: 2020
ISBN: 9780429675676
Edition: 1
Category: Mathematics
Chapter 1
Introduction
Nathan Carter
Bentley University
1.1 Who should read this book?
1.2 What is data science?
1.3 Is data science new?
1.4 What can I expect from this book?
1.5 What will this book expect from me?
This chapter serves as an introduction to both this text and the field of data science in general. Its first few sections explain the book’s purpose and context. Then Sections 1.4 and 1.5 explain which subjects will be covered and suggest how you should interact with them.
1.1 Who should read this book?
The job market continues to demand data scientists in fields as diverse as health care and music, marketing and defense, sports and academia. Practitioners in these fields have seen the value of evidence-based decision making and communication. Their demand for employees with those skills obligates the academy to train students for data science careers. Such an obligation is not unwelcome because data science has in common with academia a quest for answers.
Yet data science degree programs are quite new; very few faculty in the academy have a PhD in data science specifically. Thus the next generation of data scientists will be trained by faculty in closely related disciplines, primarily statistics, computer science, and mathematics.
Many faculty in those fields are teaching data-science-related courses now. Some do so because they like the material. Others want to try something new. Some want to use the related skills for consulting. Others just want to help their institution as it launches a new program or expands to meet increased demand. Three of my mathematician friends have had their recent careers shaped by a transition from pure mathematics to data science. Their stories serve as examples of this transition.
My friend David earned his PhD in category theory and was doing part-time teaching and part-time consulting using his computer skills. He landed a full-time teaching job at an institution that was soon to introduce graduate courses in data science. His consulting background and computing skills made him a natural choice for teaching some of those courses, which eventually led to curriculum development, a new job title, and grant writing. David is one of the authors of Chapter 8.
Another friend, Sam, completed a PhD in probability and began a postdoctoral position in that field. When his institution needed a new director of its data science masters program, his combination of mathematical background and programming skills made him a great internal candidate. Now in that role, his teaching, expository writing, and career as a whole are largely focused on data science. Sam is the author of Chapter 9.
The third and final friend I’ll mention here, Mahesh, began his career as a number theorist and his research required him to pick up some programming expertise. Wanting to learn a bit more about computing, he saw data science as an exciting space in which to do so. Before long he was serving on a national committee about data science curricula and spending a sabbatical in a visiting position where he could make connections to data science academics and practitioners. Mahesh is the other author of Chapter 8.
These are just the three people closest to me who have made this transition. As you read this, stories of your own friends or colleagues may come to mind. Even if you don’t know a mathematician-turned-data-scientist personally, most mathematicians are familiar with Cathy O’Neil, author of the famous book Weapons of Math Destruction [377], who left algebraic geometry to work in various applied positions and has authored several books on data science.
In each of these stories, a pure mathematician with some computer experience made a significant change to their career by learning and doing data science, a transition that’s so feasible because a mathematical background is excellent preparation for it. Eric Place¹ summarized the state of data science by saying, “There aren’t any experts; it’s just who’s the fastest learner.”
But mathematicians who want to follow a path like that of David, Sam, or Mahesh have had no straightforward way to get started. Those three friends cobbled together their own data science educations from books, websites, software tutorials, and self-imposed project work. This book is here so you don’t have to do that, but can learn from their experiences and those of others. If you have a mathematical background and some computing experience, this book can be your pathway to teaching in a data science program and considering research in the field.
But the book does not exist solely for the benefit of its mathematician readers. Students of data science, as they learn its techniques and best practices, inevitably ask why those techniques work and how they became best practices. Mathematics is one of the disciplines best suited to answering that type of question, in data science or any other quantitative context. We are in the habit of demanding the highest standards of evidence and are not content to know just that a technique works or is widely accepted. Bringing that mindset to data science will give students those important “why” answers and make your teaching of data science more robust. If this book helps you shift or expand your career, it will not be for your benefit only, but for that of our students as well.
1.2 What is data science?
In 2001, William Cleveland published a paper in International Statistical Review [98] that named a new field, “Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics.” As the title suggests, he may not have been intending to name a new field, since he saw his proposal as an expansion of statistics, with roughly 15% of that expanded scope falling under the heading of computer science. Whether data science is a new field is a matter of some debate, as we’ll see in Section 1.3, though I will sometimes refer to it as a field for the sake of convenience.
In Doing Data Science [427], Cathy O’Neil and Rachel Schutt say that the term “data scientist” wasn’t coined until seven years after Cleveland’s article, in 2008. The first people to use it were employees of Facebook and LinkedIn, tech companies where many of the newly christened data scientists were employed.
To explain this new field, Drew Conway created perhaps the most famous and reused diagram in data science, a Venn diagram relating mathematics, statistics, computer science, and domain expertise [107]. Something very close to his original appears in Figure 1.1, but you’re likely to encounter many variations on it, because each writer seems to create one to reflect their own preferences.
Figure 1.1: Rendering of Drew Conway’s “data science Venn diagram” [107].
You can think of the three main circles of the diagram as three academic departments: computer science on the top left, math on the top right, and some other (usually quantitative) discipline on the bottom, one that wants to use mathematics and computing to answer some questions. Conway’s top-left circle uses the word “hacking” instead of computer science, because only a small subset of data science work requires formal software engineering skills. In fact, reading data and computing answers from it sometimes involves unexpected and clever repurposing of data or tools, which the word “hacking” describes very well. And mathematicians, in particular, can take heart from Conway’s labeling of the top-right circle not as statistics alone but as mathematics and statistics, and for good reason. Though Cleveland argued for classifying data science as part of statistics, we will see in Section 1.4 that many areas of mathematics proper are deeply involved in today’s data science work.
The lower-left intersection in the diagram is good news for readers of this text: it claims that data science done without knowledge of mathematics and statistics is a walk into danger. The premise of this text is that mathematicians have less to learn and can thus progress more quickly.
The top intersection is a bit harder to explain, and we will defer a full explanation until Chapter 8, on machine learning. But the gist is that machine learning differs from traditional mathematical modeling in that the analyst does not impose as much structure when using machine learning as he or she would when doing mathematical modeling, thus requiring less domain knowledge. Instead, the machine infers more of the structure on its own.
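As a rough illustration of that gist (not an example taken from this book), the sketch below fits the same synthetic data in two ways. The linear model stands in for traditional modeling, where the analyst fixes the functional form in advance. The k-nearest-neighbors average stands in for machine learning, where no form is assumed. The data, the choice of k-nearest neighbors, and the helper name knn_predict are all assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data from a curve the analyst is assumed not to know in advance.
x = np.sort(rng.uniform(0.0, 2.0 * np.pi, 200))
y = np.sin(x) + rng.normal(0.0, 0.1, x.size)

# Traditional modeling: the analyst imposes the form y = a + b*x,
# and only the two coefficients are estimated from the data.
b, a = np.polyfit(x, y, deg=1)  # polyfit returns the highest degree first
linear_pred = a + b * x


def knn_predict(x_train, y_train, x_query, k=10):
    """Hypothetical helper: predict by averaging the k nearest observations."""
    # Distances between every query point and every training point.
    dist = np.abs(x_query[:, None] - x_train[None, :])
    # Indices of the k closest training points for each query point.
    nearest = np.argsort(dist, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)


# Machine-learning style: no functional form is assumed, so the shape
# of the fitted curve is inferred from the data alone.
knn_pred = knn_predict(x, y, x)

linear_mse = np.mean((y - linear_pred) ** 2)
knn_mse = np.mean((y - knn_pred) ** 2)
print(f"linear MSE: {linear_mse:.3f}, k-NN MSE: {knn_mse:.3f}")
```

Both errors are measured on the same data used for fitting, so the comparison shows only how much structure each approach assumes, not how well either would generalize to new data.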
But Figure 1.1 merely outlines which disciplines come into play. The practice of data science proceeds something like the following.
1. A question arises that could be answered with data.
This may come from the data scientist’s employer or client, who needs the answer to make a strategic decision, or from the data scientist’s own curiosity about the world, perhaps in science, politics, business, or some other area.
2. The data scientist prepares to do an analysis.
This includes find...
