Deadline extended - SIGKDD Workshop on Large-scale Data Mining: Theory and Applications

                                Call for Papers
Third Workshop on Large-scale Data Mining: Theory and Applications (LDMTA)
   in conjunction with SIGKDD 2011, August 21-24, 2011, San Diego, CA, USA


With advances in data collection and storage technologies, large data
sources have become ubiquitous. Today, organizations routinely collect
terabytes of data on a daily basis with the intent of gleaning non-trivial
insights on their business processes. To benefit from these advances, it is
imperative that data mining and machine learning techniques scale to such
proportions. Such scaling can be achieved through the design of new and
faster algorithms and/or through the employment of parallelism.
Furthermore, it is important to note that emerging and future processor
architectures (like multi-cores) will rely on user-specified parallelism to
provide any performance gains. Unfortunately, achieving such scaling is
non-trivial and only a handful of research efforts in the data mining and
machine learning communities have attempted to address these scales.

At the other end of the spectrum, the past few years have witnessed the
emergence of several platforms for the implementation and deployment of
large-scale analytics. Examples of such platforms include Hadoop (Apache)
and Dryad (Microsoft). These platforms have been developed by the
large-scale distributed processing community and can not only simplify
implementation but also support execution on the cloud, making large-scale
machine learning and data mining both affordable and available to all.
Today, there is a large gap between the data mining/machine learning and
the large-scale distributed processing communities. To make advances in
large-scale analytics, it is imperative that both these communities work
hand-in-hand. The intent of this workshop is to further research efforts on
large-scale data mining and to encourage researchers and practitioners to
share their studies and experiences on the implementation and deployment of
scalable data mining and machine learning algorithms.

Topics of Interest

    * Application case studies that showcase the need for large-scale
machine learning/data mining. Areas of interest include
financial modeling, web mining, medical informatics, climate modeling, and
mining retail and e-commerce data.
    * Parallel and distributed algorithms for large-scale machine
learning/data mining, data preprocessing, and cleaning.
    * Exploiting modern and specialized hardware such as multi-core
processors, GPUs, and the STI Cell processor.
    * Memory hierarchy aware data mining/machine learning algorithms.
    * Streaming data algorithms for machine learning and data mining.
    * New platforms and/or programming model proposals for
parallel/distributed machine learning and data mining for batch and/or
stream domains.
    * Evaluation of platforms (such as Hadoop) and/or programming models
(such as map-reduce) for batch and/or stream domains.
    * Performance studies comparing cloud, grid, and cluster platforms.
    * Data-intensive computing approaches.
    * Future research challenges in cloud and data-intensive computing.

Important dates and guidelines

    Submission deadline: May 21st, 2011
    Notification of acceptance: June 10th, 2011
    Final papers due: June 15th, 2011

All papers submitted should have a maximum length of 8 pages and must be
prepared using the ACM camera‐ready template. Authors are required to
submit their papers electronically in PDF format. The submission site URL
will be available on our website shortly. All submissions should clearly
present the author information, including the authors' names,
affiliations, and email addresses. Submission site is located at

Workshop Co-chairs

    Dr. Chidanand Apte, IBM Research
    Prof. Nitesh V. Chawla, University of Notre Dame
    Dr. Amol Ghoting, IBM Research
    Prof. Yan Liu, University of Southern California
    Dr. Jimeng Sun, IBM Research
    Prof. Jie Tang, Tsinghua University, China
    Dr. Ranga Raju Vatsavai, Oak Ridge National Laboratory

Program Committee

    Shirish Tatikonda, IBM Research
    Gagan Agrawal, Ohio State University
    Jeffrey Yu, Chinese University of Hong Kong
    Alexander Gray, Georgia Tech
    Prabhanjan Kambadur, IBM Research
    Rong Yan, Facebook
    Elad Yom-Tov, Yahoo! Research
    Mohammed Zaki, Rensselaer Polytechnic Institute
    Saeed Salem, North Dakota State University
    Berthold Reinwald, IBM Research
    Yuan Yu, Microsoft Research
    Petros Drineas, Rensselaer Polytechnic Institute
    Misha Bilenko, Microsoft Research
    Ron Bekkerman, LinkedIn
    Vijay Narayanan, Yahoo!
    Milind Bhandarkar, LinkedIn
    Tina Eliassi-Rad, Rutgers University

Steering Committee

    Prof. Christos Faloutsos, Carnegie Mellon University
    Prof. Robert Grossman, University of Illinois at Chicago
    Prof. Jiawei Han, University of Illinois at Urbana-Champaign

