Lab News

Eric Lo: Multi-Column-at-a-Time Main-Memory Column-Stores: Algorithms, Systems, and Implementation

Time

Thursday, March 16, 2017, 3:00-4:00 PM

 

Venue

Conference Room 106, Institute of Computer Science (计算机所)

 

Abstract

Main-memory analytic databases are gaining ground rapidly because of the strong demand for real-time analytics and the increasing capability of housing terabytes of main memory in modern servers. Modern main-memory analytical databases are “column-stores”, with data tables physically stored in memory as sections of columns of data rather than as rows of data. Query processing in main-memory column-stores has been based on the “column-at-a-time” approach, i.e., a query is evaluated as a sequence of primitive operations (e.g., hashing, sorting) on individual attributes/columns, one at a time. With the advent of several key techniques such as SIMD-accelerated data processing, column encoding, and code generation, our preliminary work showed that a main-memory column-store can attain substantial performance improvements if it supports “multi-column-at-a-time” processing. Multi-column-at-a-time means that a column-store processes multiple columns together instead of one by one. It is a novel query processing paradigm that opens up a much finer level of optimization (e.g., bytes from different columns can be processed together). We are now building the community’s first multi-column-at-a-time enabled main-memory column-store. In this talk, I will cover its design, algorithms, and implementation details. We plan to open-source it afterwards.

Biography

Eric Lo is an associate professor in the Department of Computer Science and Engineering at the Chinese University of Hong Kong (CUHK). He started his PhD study at ETH Zurich (Switzerland) in 2005 and obtained his PhD degree in 2006. Before he returned to Hong Kong, he worked at Google and Microsoft. His recent research focuses on large-scale data processing on modern architectures (e.g., lock-free programming on many-core), distributed Bayesian inference systems for big data, and data science. He has served as a program committee member of all major data engineering conferences and will be a program vice chair of CIKM’18 and ICDE’18. His research has thrice been selected among the best papers of a conference (VLDB’05, ICDE’12, and DASFAA’14).