Data Science for Everyone
Preface
Data science is everywhere around us—from the maps that guide our commutes, to the movie recommendations we receive, to the health and environmental reports that help shape public policy. Yet for many students, data science feels distant, complicated, or reserved for college courses. This book takes the opposite view. With the right tools and a curious mindset, anyone—especially high school students—can begin doing real data science today.
The goal of this book is to show you what data science actually looks like in practice. It focuses not on memorizing formulas but on practical skills: setting up your computer correctly, organizing your work, using modern tools such as the command line, Git, VS Code, and Quarto, and building reproducible analyses from day one. These skills form the foundation of professional data science, and developing them early will let you produce analyses that other people can rerun and trust.
Throughout the book, you will work with real datasets from topics you already care about: sports, the environment, health, city life, and more. You will learn how to explore the data, ask thoughtful questions, create visualizations, and communicate your findings clearly. Just as importantly, you will learn how to manage your work like a professional data scientist, building good habits that will serve you through college and beyond.
This book assumes no prior experience with programming or statistics. What it does assume is curiosity, patience, and a willingness to try. In Part I, you will run your first commands at the command line, make your first Git commits, produce your first Quarto report, and build your first real project. Part II then invites you to discover patterns in real-world datasets through guided examples.
My hope is that this book helps you see data science not as something mysterious, but as something you can do, enjoy, and grow with. Every data scientist started where you are now—with a blank screen, an open mind, and a desire to understand the world better.
Welcome to your journey into data science.