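Sa Shixuan International Research Center for Big Data Analysis and Management: First International Workshop on Big Data Analysis and Management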


Published: 2012-07-13

July 19, 2012, Beijing

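In the 21st century, information technology has advanced rapidly and the world has entered the era of big data. To meet the challenges of big data and to take full advantage of the opportunities it brings, the international academic community, industry, and even governments are actively proposing responses and drawing up strategic plans. For example, the international Extremely Large Databases conference (XLDB) has been held five times in a row in the United States, Europe, and elsewhere, and the U.S. government has invested 200 million US dollars to establish the "Big Data Research and Development Initiative".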

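Against this background, Renmin University of China, building on the Key Laboratory of Data Engineering and Knowledge Engineering of the Ministry of Education, has established the Sa Shixuan International Research Center for Big Data Analysis and Management. On July 19 this year, the center will hold the "First International Workshop on Big Data Analysis and Management" at the Yifu Conference Center of Renmin University of China. The workshop will examine in depth topics such as the challenges of big data applications in different domains and the construction of big data infrastructure, with the aim of advancing big data research and development.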

Time: July 19, 2012, 9:00-17:00

Venue: Lecture Hall 1, Yifu Conference Center, Renmin University of China, Haidian District, Beijing

Schedule

Time	Speaker	Title
8:00-9:00	Registration
KEYNOTE TALK SESSION CHAIR: Xiaofang Zhou
9:00-9:45	Ooi Beng Chin	Design and Development of an IT Infrastructure for Data-intensive Applications and Analysis
9:45-10:15	Coffee Break
INVITED TALK SESSION CHAIR: Cuiping Li
10:15-10:45	Christian S. Jensen	Managing High-Velocity Mobile Location Data
10:45-11:15	Ling Liu	From Trajectory Mining to Big Data Analytics
11:15-11:45	Zhenkun Yang	Building a scalable RDBMS from scratch
11:45-12:15	Lei Chen	Whom to Ask? Jury Selection for Decision Making Tasks on Microblog Services
12:15-14:00	Lunch
PROJECT REPORT SESSION CHAIR: Jiaheng Lu
14:00-14:30	Jianmin Wang	Research Progress on Unstructured Data Management in China
14:30-15:00	Xiaofang Zhou	Research project on Web information extraction, analysis and management
15:00-15:30	Coffee Break
PANEL DISCUSSION SESSION CHAIR: Ling Liu
15:30-17:00	Weiyi Meng	Big Data Analysis and Management: Opportunities and Challenges
	Christian S. Jensen
	Ling Liu
	Ke Wang
	Chengqi Zhang
	Zhenkun Yang
	Jianmin Wang
	Xiaofang Zhou
	Lei Chen

Talk Details

Talk 1: Design and Development of an IT Infrastructure for Data-intensive Applications and Analysis

Abstract: Web 2.0 applications and enterprise applications are generating massive amounts of different types of data at an unprecedented scale, and the big data problem represents a new emerging challenge that has crippled the IT infrastructures of many modern enterprises. With the fast-growing popularity of cloud computing, owing to its promise of an elastically scalable, pay-per-use model, IT infrastructures in the future are likely to move from enterprise settings to the cloud. However, although cloud computing has demonstrated its usefulness in the context of Web-centric, information-oriented applications, it has not reached the level of maturity needed to support a large class of enterprise applications that rely on a foundational data and information layer based on Database Management Systems (DBMSes). In this talk, I shall discuss our effort in building a cloud infrastructure for big data management, processing and analytics, driven by two application domains, namely social networking and enterprise data management.

Speaker bio: Prof. Ooi Beng Chin is Dean of the School of Computing (SoC) and Director of the Interactive Digital Media Institute (IDMI) at the National University of Singapore (NUS). Beng Chin has served as a PC member for international conferences such as ACM SIGMOD, VLDB, IEEE ICDE, WWW, and SIGKDD, and as Vice PC Chair for ICDE'00, '04 and '06, co-PC Chair for SSD'93 and DASFAA'05, PC Chair for ACM SIGMOD'07, Core DB PC Chair for VLDB'08, and PC co-Chair for IEEE ICDE'12. He was an editor of the VLDB Journal and IEEE Transactions on Knowledge and Data Engineering, and a co-chair of the ACM SIGMOD Jim Gray Best Thesis Award committee. He is serving as the Editor-in-Chief of IEEE Transactions on Knowledge and Data Engineering (TKDE) (2009-2012), an editor of the Distributed and Parallel Databases journal, an advisory board member of SIGMOD, and a trustee board member and executive of the VLDB Endowment.

Talk 2: Managing High-Velocity Mobile Location Data

Abstract:
Deployments of networked sensors enable online applications that feed on real-time sensor data. For example, a variety of applications may exploit a database of up-to-date locations of very large populations of mobile users. This scenario calls for techniques that support the management of workloads that contain queries with low latency requirements as well as massive volumes of updates. Such techniques should exploit the parallelism offered by modern processors. To do so, it is essential to avoid contention among parallel hardware threads. This talk covers two solutions, TwinGrid and PGrid. TwinGrid maintains two copies, or snapshots, of the data: one for the relatively long-duration queries and one for the frequent and very localized updates. The snapshot that receives the updates is frequently made available to queries by means of the C library memcpy function. PGrid avoids keeping two copies, but instead relaxes the query semantics so that updates and queries can occur in parallel. By maintaining two copies of data updated with non-local updates, so-called freshness semantics can be guaranteed. Both solutions use spatial grid indexes and secondary hash indexes on object identifiers.

Speaker bio: Prof. Christian S. Jensen is a Professor of Computer Science at Aarhus
University, Denmark. He was previously at Aalborg University for two decades, and he recently spent a one-year sabbatical at Google Inc., Mountain View. His research concerns data management and data-intensive systems, with a focus on temporal and spatio-temporal data management. Christian is an ACM and an IEEE Fellow, and he is a member of the Royal Danish Academy of Sciences and Letters and the Danish Academy of Technical Sciences. He has received several national and international awards for his research. He is currently vice-chair of ACM SIGMOD and an editor-in-chief of The VLDB Journal.

Talk 3: From Trajectory Mining to Big Data Analytics

Abstract: A trajectory can be defined as a
time-ordered set of states of a dynamical system. The most common example of a trajectory is the path that a moving object follows through space as a function of time. A trajectory can be described mathematically either by the geometry of the path or as the position of the object over time. Mining moving object trajectory data has been gaining significant interest in recent years. However, existing approaches to trajectory clustering are mainly based on density and Euclidean distance measures. We argue that when the spatial clustering of mobile object trajectories is targeted at road-network-aware, location-based applications, density and Euclidean distance are no longer effective measures. This is because traffic flows in a road network and the flow-based density characterization become important factors for finding interesting trajectory clusters of mobile objects travelling in road networks. In this talk, I will briefly introduce our research project NEAT, a fast and effective approach to clustering spatial trajectories of moving objects travelling on road networks. Our method takes into account the physical constraints of the road network, the network proximity, and the traffic flows among consecutive road segments. Trajectory clusters discovered by NEAT are groups of sub-trajectories that describe both dense and highly continuous traffic flows of mobile objects. Our experimental results with mobility traces generated using different scales of real road network maps demonstrate that the NEAT approach is highly accurate and runs orders of magnitude faster than existing Euclidean density-based trajectory clustering approaches. I will conclude the talk by discussing moving object trajectory mining in information networks, such as consumer networks, healthcare information networks, and social networks.

Speaker bio: Prof. Ling Liu is a full Professor in the School of Computer Science at Georgia
Institute of Technology. She directs the research programs in the Distributed Data Intensive Systems Lab (DiSL), examining various aspects of large-scale data-intensive systems with a focus on performance, availability, security, privacy, and energy efficiency. Prof. Liu and her students have released a number of open source software tools, including WebCQ, XWRAP, PeerCrawl, and GTMobiSim. She has published over 300 international journal and conference articles in the areas of databases, distributed systems, and Internet computing. Prof. Liu is a recipient of the 2012 IEEE Computer Society Technical Achievement Award and an Outstanding Doctoral Thesis Advisor award from Georgia Institute of Technology. She has also served as general chair and PC chair of several IEEE and ACM conferences in the data engineering and distributed computing fields, and has served on the editorial boards of over a dozen international journals. Currently Prof. Liu is on the editorial boards of Distributed and Parallel Databases (Springer), Journal of Parallel and Distributed Computing (JPDC), IEEE Transactions on Service Computing (TSC), and ACM Transactions on Web (TWEB). Dr. Liu's current research is primarily sponsored by NSF, IBM, and Intel.

Talk 4: Building a scalable RDBMS from scratch

Abstract: Even if big data of petabytes or even exabytes are
attracting more and more attention, the RDBMS is still the BASE of our society. In this talk, I will share with you OceanBase ( http://oceanbase.taobao.org/ ), a scalable RDBMS built from scratch. OceanBase is a semi-distributed storage system for managing structured data, supporting transactions (ACID) as well as many features of the relational model. With its shared-nothing architecture, it can easily scale to hundreds of billions of records across hundreds of commodity servers on the fly, with fault tolerance. By eliminating random disk writes, it matches commodity solid state disks (SSDs) perfectly and thus enables much higher transactions per second (TPS) and queries per second (QPS). OceanBase has provided relational database services for more than a dozen projects in the product system of Taobao.com and serves billions of real-time queries every day.

Speaker bio: Dr. Zhenkun YANG (yangzhenkun@gmail.com) is a
Senior Researcher with Taobao.com. In recent years, his research interests have been distributed storage and computing systems. He is now the chief architect of OceanBase, an open source scalable relational database at Taobao.com. Before joining Taobao.com, he was a Senior Scientist with Baidu.com, a Lead Researcher with Microsoft Research Asia, and a Chief Researcher with Lenovo Research. He received his bachelor's and master's degrees from the Department of Mathematics, Peking University. After he received his PhD degree from the Department of Computer Science in 1993, he became a faculty member of the Institute of Computer Science and Technology, Peking University, and a full professor in 1997. He received the Cheung Kong Scholar Award, Peking University, in 1999. He was the fourth-listed recipient of the First Class Award of the National Science and Technology Progress of China in 1995. He also won the First Class Award of Science and Technology Progress of Beijing in 1996, the National Youth Science and Technology Award of China in 1998, the Qiushi Eminent of the Chinese Academy of Science and Technology in 1998, and the Wusi Youth Award of Beijing in 2000.

Talk 5: Whom to Ask? Jury Selection for Decision Making Tasks on Microblog Services

Abstract: It is common to see people obtain knowledge on micro-blog services by asking others decision-making questions. In this talk, I will present our recent study on the Jury Selection Problem (JSP), which uses crowdsourcing for decision-making tasks on micro-blog services. Specifically, the problem is to enroll a subset of the crowd under a limited budget, whose aggregated wisdom via a Majority Voting scheme has the lowest probability of drawing a wrong answer (the Jury Error Rate, JER). The challenges of this problem lie in calculating the JER and finding the optimal subset under a limited budget. Because individual error rates vary across the crowd, the calculation of the JER is non-trivial. In our study, we propose two efficient algorithms: a dynamic programming-based algorithm and a divide-and-conquer algorithm. For JSP, we formally propose two models, one for altruistic users (AltrM) and the other for incentive-requiring users (PayM), who require extra payment when enrolled into a task. Based on the two models, we design efficient algorithms for JSP. The efficiency and effectiveness of our proposed algorithms are verified on both synthetic and real micro-blog data.
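
To make the Jury Error Rate concrete, here is a minimal Python sketch of one way to compute it by dynamic programming. It is an illustration under stated assumptions, not necessarily the algorithm presented in the talk: jurors are assumed to err independently, each with a known individual error rate, and the jury size is assumed to be odd so that majority voting cannot tie. The function name `jury_error_rate` is hypothetical.

```python
from typing import Sequence


def jury_error_rate(error_rates: Sequence[float]) -> float:
    """Probability that a majority vote of independent jurors is wrong.

    Assumptions of this sketch (not taken from the talk): jurors err
    independently, and the jury size is odd so no tie can occur.
    """
    n = len(error_rates)
    if n == 0 or n % 2 == 0:
        raise ValueError("this sketch assumes a non-empty jury of odd size")

    # dist[k] = probability that exactly k of the jurors processed so far
    # give a wrong answer. Start with zero jurors: zero wrong with certainty.
    dist = [1.0]
    for e in error_rates:
        nxt = [0.0] * (len(dist) + 1)
        for k, p in enumerate(dist):
            nxt[k] += p * (1.0 - e)   # this juror answers correctly
            nxt[k + 1] += p * e       # this juror answers wrongly
        dist = nxt

    # The majority is wrong when more than half of the jurors are wrong.
    return sum(dist[n // 2 + 1:])


if __name__ == "__main__":
    print(jury_error_rate([0.1, 0.2, 0.3]))
```

For three jurors with error rates 0.1, 0.2 and 0.3, the result is about 0.098, lower than the error rate of the best single juror. The table is built in O(n^2) time for n jurors, which avoids enumerating all 2^n voting outcomes.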

Speaker bio: Prof. Lei Chen received the BS degree in computer science and engineering from Tianjin University, Tianjin, China, in 1994, the MA degree from the Asian Institute of Technology, Bangkok, Thailand, in 1997, and the PhD degree in computer science from the University of Waterloo, Waterloo, Ontario, Canada, in 2005. He is currently an associate professor in the Department of Computer Science and Engineering, Hong Kong University of Science and Technology. His research interests include crowdsourcing on social media, social media analysis, probabilistic and uncertain databases, and privacy-preserving data publishing. So far, he has published more than 150 conference and journal papers. He received best paper awards at DASFAA 2009 and 2010. He is a PC Track chair for ACM SIGMM 2011, ACM CIKM 2012, and IEEE ICDE 2012. He has served as a PC member for SIGMOD, VLDB, ICDE, SIGMM, and WWW. He is a member of the ACM and IEEE. He also serves as the chairman of the ACM Hong Kong Chapter.

Talk 6: Research Progress on Unstructured Data Management in China

Abstract: In 2010, under the national program known as "Core Electronic Devices, Advanced Chipsets and Fundamental Software Products" (abbreviated as HGJ in Chinese), China started supporting research and development activities on unstructured data management systems. This report gives an overview of these HGJ projects during the 11th Five-Year Plan period. In particular, we focus on large-scale unstructured data management systems, especially the research progress on their elastic system architecture, flexible transaction mechanism, and behavior data mining.

Speaker bio: Prof.
Jianmin Wang received the Ph.D. in Computer Software from Tsinghua University in 1995. He is now a professor and doctoral supervisor in the School of Software, Tsinghua University. He has been a member of the expert group for the national "HGJ" major science and technology project of China, and a member of the expert group of the National 863 Program of China. He has been engaged in research on data management and information systems, including unstructured data management, business process management, product data management, benchmarks and frameworks for database evaluation, and novel watermarking and digital rights management.

Talk 7: Research project on Web information extraction, analysis and management

Abstract: In China, we are currently launching a new 863 project called "Big Web information extraction, analysis and management in an open environment". In this project, we will address the challenges of Web information extraction, analysis and management involving Internet-scale data. The project will last three years and is expected to deliver some real applications with petabyte-scale data.

Speaker bio: Professor Xiaofang Zhou is a Professor of Computer Science at
the University of Queensland, and Head of the Data Engineering and Pattern Recognition Research Division at UQ. His research focus is to find effective and efficient solutions for managing, integrating and analyzing very large amounts of complex data for business, scientific and personal applications. He has been working in the areas of spatial and multimedia databases, data quality, high performance query processing, Web information systems and bioinformatics, and has co-authored over 200 research papers, many published in top journals and conferences such as SIGMOD, VLDB, ICDE, ACM Multimedia, The VLDB Journal, ACM Transactions and IEEE Transactions. Xiaofang is an Adjunct Professor of Renmin University of China appointed under the Chinese National Qianren Scheme, and serves as the Director of the RUC-UQ Joint Lab on Data Engineering and Knowledge Engineering.