Students across Five Colleges to compete in DataFest

During the 48-hour event, which begins Friday evening and concludes Sunday afternoon, teams from Hampshire, Amherst, Mount Holyoke, and Smith Colleges and the University of Massachusetts Amherst will compete head-to-head for prizes in categories including Best Insight, Best Visualization, and Best Use of External Data. The event, a collaboration among academe, students, and industry, will be held at UMass.

DataFest is an annual competition in which teams of up to five undergraduate students work to reveal insights from a large and rich data set. This unique program takes data-analysis learning beyond the constraints normally encountered in a typical statistics course by enabling the students to work with big data provided by a real client.

Each team presents its findings to a panel of judges composed of professors, data scientists, and representatives of the company or organization that provides the data set for the competition.

Student competitors also try to catch the attention of various company and organization representatives who will attend the event to offer advice and identify students with the best quantitative and analytical skills for potential job opportunities.

“While many participants enjoy DataFest as a friendly competitive event, it means much more to students nearing graduation and the company reps in attendance who are seeking to recruit new statistical talent,” says Robert Gould, Professor of Statistics at UCLA and national organizer for ASA DataFest, which was first held at UCLA in 2011.

“In the relatively short history of DataFest, numerous students showcased their statistical skill during the event and simultaneously developed contacts with employers that have led to offers of full-time employment. Students who do well at DataFest are students who have proven that they can navigate the ‘data deluge.’ And this is very attractive to potential employers,” Gould says.

Each year, the data and the challenge are different, but the common theme carries over: making sense of big data that is larger and more complex than the data sets undergraduate students usually encounter in a classroom. The data set, which consists of real-world data, is not unveiled until the start of the competition, so participating students cannot prepare in advance for the event.