BME 500: Jun Li, Ph.D.

“Pointillism and Big Data: the misguided quest for multi-scale laws in biology”

WHERE: Electrical Engineering and Computer Science Building, 1200

WHEN: February 6, 2020 4:00 pm-5:00 pm


In today’s research we often talk about knowledge extraction from Big Data, and about integration across different scales: molecules, cells, tissues/organs, organisms and their communities. The pursuit of multi-scale synthesis has a long history. For the microscopic world we have largely succeeded in connecting the chemical properties of molecules with the properties of atoms, their constituents, and their interactions. In epidemiology, many are currently applying linear mixed models to quantify the genetic contribution to disease risk in the general population. By and large, we live with the tacit belief that basic principles, once found, will be simple and elegant, and that we can build Systems Biology from the ground level. This leads to a pointillistic research culture, as when we try to explain the heredity of complex traits by summing up the individual actions of millions of DNA variants, or when we look for the neural basis of behavior in the connectivity and firing patterns of millions of neurons.
I will use this talk to share some thoughts on the emerging appreciation that, in biomedical data science, perhaps the best one can learn is not widely generalizable mechanisms, but different laws for different scales of organization. There may not be a good chance, and perhaps no need, to "know" a system by brute-force accumulation of larger and larger data at the bottom level. Acknowledging the irreducibility of higher-level phenomena in biology and medicine can help us appreciate the distinct methods, norms, and compromises in traditional disciplines, and steer society's investment towards a balanced collection of good data on all levels. By giving up the blind celebration of sample size, we can give more attention to new technologies that can measure what was previously inaccessible, and to the next generation of information science that embraces messy, context-specific models.