

Bootstrap Teaches Rigorous Data Science In An Introductory Computing Module For K-12 Students


From a deluge of job openings to new university programs, Data Science has become a hot topic. But if it’s so important, why wait until a student enters university to introduce it?

Children are natural data scientists! They argue about who was the greatest quarterback, who is the most successful singer, or which chain has the best pizza. These questions quickly shift to data: did athlete X win more trophies than athlete Y, are Grammy nominations or albums sold a better indicator of talent, and so on. As they mature, they want to know whether a law is racist, or whether the outcomes of going to a particular college justify the extra student loans. In a world that’s data-rich, what these students need isn’t data. It’s the ability to ask questions and make meaning from that data.

Bootstrap, founded and co-led by three current Brown CS faculty/staff members, is one of the few groups in the CS Education field that builds its own curriculum, software, and programming tools. This gives us a unique opportunity to fill that gap, with a programming language that makes operations on tabular data (literally, spreadsheets) accessible without the overhead of teaching loops. We’ve leveraged our world-class language development team to bring rigorous Data Science to an introductory computing module. And since our unit of data storage is a spreadsheet, there’s a smooth on-ramp for teachers who are comfortable with Microsoft Excel and Google Sheets.

You can learn more about Bootstrap in the sections below:

World-Class Pedagogy In A True Introductory Course

Thanks to seed funding from Bloomberg, a Bootstrap:Data Science course is already being piloted at several middle schools and high schools. In Rhode Island, high school students used our module to compare college acceptance rates for high schools across the state. Sixth-graders in North Carolina studied the role that Data Science plays in things like credit card fraud detection. By lowering the barrier to entry, Bootstrap:Data Science becomes a true introductory course, addressing learning goals that make it suitable for a number of mainstream classes:

  • Statistics classes can use Bootstrap:Data Science to cover core concepts in statistics, such as measures of center (mean, median, mode), and visualizing data (line, bar, and scatter plots)

  • Business classes can use the Data Science module to make inferences about sales, profits, and customer demand, porting existing spreadsheet-based coursework to our programmed lessons

  • Civics and Social Studies classes can use Data Science to explore the role of data in government and social policy, examining the impact of things like stop-and-frisk, the Electoral College, and third-party voting on the world around them

Even with a powerful tool and a flexible curriculum, a successful course still needs an effective pedagogy. Bootstrap builds on a world-class pedagogical technique developed over decades at the university level, which has been proven to work with students of all ages and abilities via our Bootstrap:Algebra course. More than 20,000 students each year use our structured approach to problem solving (~45% female, ~50% students of color), and we’re excited to bring this approach to Data Science.

Equity, Rigor, And Scale

The motivation for a K-12 Data Science curriculum is clear, but building such a curriculum requires careful thinking about software, pedagogy and curriculum. To blend in with mainstream courses, these curricula should be designed to fit comfortably within existing content strands, and aligned to national and state standards for statistics, CTE, and/or business.

Professional-grade tools like Stata and R offer powerful features, but aren’t explicitly designed to be child-friendly or teacher-friendly. At the opposite extreme, spreadsheets have deep roots in educational settings, but lack the programming component, and some of the features, necessary to build a rigorous introductory Data Science course.

Current attempts to fill the gap in the middle have students program various loops over two-dimensional arrays. Making for-loops and nested data structures a prerequisite for Data Science immediately limits the possible audience of students to a small, elite group, and burns valuable time that could be spent addressing the standards-alignment of mainstream courses.

If a Data Science module needs a few weeks to introduce for-loops and two-dimensional arrays, that’s weeks of time spent before a teacher can address the needed standards for bar, pie and scatter plots, measures of center, or linear regression. This approach rules out scale (only a small percentage of students and teachers are ready for this material) and equity (asking students to self-select into these courses only reinforces existing stereotypes). What’s needed is a holistic approach to Data Science that has all three: equity, rigor, and scale.
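To make the contrast concrete, here is a minimal sketch in Python (not Bootstrap's own language or API). The school names and numbers are invented, and `build_column` and `column` are hypothetical helpers written only for this illustration. The first function uses the loop-and-index style over a two-dimensional array; the second half asks the same question with whole-table operations, so the learner only writes a function that works on one row.

```python
from statistics import mean

# Hypothetical table: one row per school (all names and numbers are made up).
schools = [
    {"school": "North High", "applicants": 200, "accepted": 120},
    {"school": "South High", "applicants": 150, "accepted": 60},
    {"school": "East High", "applicants": 100, "accepted": 80},
]

# The same data as a two-dimensional array, as a loop-based lesson might store it.
grid = [[r["school"], r["applicants"], r["accepted"]] for r in schools]


def mean_rate_with_loops(grid):
    """Loop-and-index style: the student tracks positions and a running total."""
    total = 0.0
    for i in range(len(grid)):
        total += grid[i][2] / grid[i][1]
    return total / len(grid)


def build_column(table, name, row_fn):
    """Hypothetical helper: return a new table with one extra computed column."""
    return [{**row, name: row_fn(row)} for row in table]


def column(table, name):
    """Hypothetical helper: extract one column as a list of values."""
    return [row[name] for row in table]


def acceptance_rate(row):
    """The only function a student writes: it describes a single row."""
    return row["accepted"] / row["applicants"]


# Table style: describe one row, then apply that description to the whole table.
with_rates = build_column(schools, "rate", acceptance_rate)
mean_rate = mean(column(with_rates, "rate"))
print(mean_rate, mean_rate_with_loops(grid))
```

Both versions compute the same average acceptance rate. The difference is in what the student has to understand first: indices and accumulators in one case, a function over a single row in the other.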

Questions That Teachers And Students Care About

This summer, Bootstrap held its first-ever teacher professional development training for Bootstrap:Data Science at CSPdWeek. In an early indicator of the curriculum’s potential, the majority of the applicants were not CS teachers, nor were they being assigned to teach a CS class (see chart below)! The training covered introductory programming, visualization using half a dozen chart types, the core statistical concepts mandated by most state and national standards (mean, median, mode, linear regression, r-squared, etc.), and table queries.
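For readers who want to see the statistical concepts named above side by side, here is a small Python sketch. It is not taken from the Bootstrap materials: the data is invented, and `linear_regression` and `r_squared` are hypothetical helper names that use the standard least-squares formulas.

```python
from statistics import mean, median, mode

# Hypothetical data: hours studied and quiz scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [2, 3, 5, 5, 5]


def linear_regression(xs, ys):
    """Least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variation in ys that the fitted line accounts for."""
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Measures of center for the scores.
center = {"mean": mean(scores), "median": median(scores), "mode": mode(scores)}

# Fit a line to the data and measure how well it fits.
slope, intercept = linear_regression(hours, scores)
r2 = r_squared(hours, scores, slope, intercept)
print(center, slope, intercept, r2)
```

In this made-up data set the mean is pulled below the median and mode by the two low scores, which is the kind of observation the writing side of the course asks learners to explain.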

Staying true to our belief that Data Science must be about more than the tool, the course emphasized the thinking and writing side of things, encouraging teachers to focus on making meaning rather than just writing code. Some investigated the relationship between home ownership and income. Others looked into whether or not louder pop songs tended to have faster beats. In every case, teachers encountered outliers, surprising trends, and more complex relationships than they anticipated.

Just as their students will, these teachers focused on questions they cared about, and used the Data Science concepts they learned to search for answers in the data. By the end of the week, they had created detailed reports of their analysis, which they presented in front of an audience. They found themselves thinking about fitting the material into their own math, business, computing, and social studies classes, and gave us invaluable feedback about how to make the curriculum even better. We still have more to learn and much to do, but we believe that we’ve found a great starting point for an authentic, accessible curriculum: using computation as a vehicle for thinking, talking, and writing about data.

For more information, click the link that follows to contact Brown CS Communication Outreach Specialist Jesse C. Polhemus.