Scott Sanner

Research Overview: Data-driven Decision-making
for Better Business and Smarter Cities

The Present

The push to process large quantities of raw data for prescriptive analytics is the next computing revolution and it's taking place now:

Thanks to Facebook, Google, Microsoft, and Twitter, the Internet is transforming from a passive repository of text to an active real-time recommender of localized, personalized, and socially-aware media — anticipating our preferences from prior targeted interactions.
Thanks to improved sensors, networking, and coordinated signaling, transport systems that affect billions of lives daily are becoming more aware and more adaptive — improving society by saving time and emissions and increasing safety.

While online recommendation and transport optimization may seem worlds apart, they both require optimal decision-making and they both represent an unprecedented tipping point in decision science where (a) decisions (and not just predictions) are predicated upon extremely large amounts of observational data (Big Data, if you will) and (b) we have the computational techniques, resources, and infrastructure to find near-optimal solutions in such settings.

But who can address such challenging computational questions, which span multiple theoretical fields while requiring meticulous engineering attention to work effectively in practice? I can and I am. My research spans not only the data-driven fields of machine learning and information retrieval but also the decision-driven fields of artificial intelligence (AI) and operations research required to go beyond prediction to prescription. My research combines applied theoretical knowledge in the technical areas of

(Bayesian) probabilistic inference and learning in graphical models;
sequential decision-theory: MDPs, POMDPs and reinforcement learning; and
constrained optimization (both convex and non-convex)

with computational toolsets ranging from symbolic algebra with decision diagrams to scalable cloud computing via MapReduce and Apache Spark.

The insights in my research stem from understanding how computational tools (or novel extensions thereof) dovetail with theoretical formalisms such as MDPs or (constrained) optimization to compute near-optimal decisions without sacrificing performance for scalability. As two exemplars of my research innovations:

In social media recommendation (WWW-12, COSN-13), I've introduced novel learning objectives and innovative social features to unlock the recommendation potential of a rich variety and vast quantity of social media data to build state-of-the-art social recommender systems — to evaluate this work, my colleagues and I built a recommender Facebook App that has collected data from over 37,000 users. Variants of this work are now being trialed with a major international online e-book retailer.
In logistics planning (e.g., transport, inventory, etc.) (UAI-11, AAAI-12a, IJCAI-13b, NIPS-12, UAI-13, AAAI-15), I have leveraged symbolic piecewise algebraic computation using novel continuous extensions of decision diagram data structures for the automatic derivation of optimal closed-form policies for multi-signal traffic control and multi-item joint-capacitated inventory control problems. This is the first time in the 50+ year study of these problems that optimal policies in the form of symbolic algebraic functions of state have been algorithmically derived in closed form.

This is just the tip of the iceberg. To elaborate in more detail, following is a survey of the ABCs of my research (or rather, MBDSs) that enable these large-scale data processing and decision-making applications.

M: Mining Massive Media

With the explosion of social media content, data has transformed into a rich network of social interactions and viral content dynamically changing over time. This has led to a data-driven revolution in recommender systems that can mine user interaction patterns and anticipate information diffusion as shown in my work on social collaborative filtering for Facebook (WWW-12, COSN-13). At the same time, this media explosion has led to an information overload requiring automated content analysis to manage it for a variety of user needs such as extracting human-readable topics underlying millions of daily Tweets (SIGIR-13), discovering implicit user communities in collaborative preference data (IJCAI-13a), and performing diverse retrieval that covers all facets of a user's information needs (SIGIR-10, CIKM-11, SIGIR-12).

This explosion of media content has also led to an overload of unstructured text data, which through effective natural language processing can be exploited for information extraction, summarization, and advanced information retrieval. To this end, I've worked on building state-of-the-art systems for automatic summarization, named entity recognition (NER), query expansion, sentiment and opinion analysis, complex sentiment, time and event extraction, hierarchical text classification, and question answering. Combining these various technologies together, at NICTA we've built tools for live Twitter-stream visual summarization (EventWatch) and visual news and blog exploration (OpinionWatch) to help users drill down through mass quantities of natural language content to find the information they need to act on. These technologies are currently licensed by a number of organizations throughout Australia.

B: Bayesian on a Budget

Bayesian decision theory provides a theoretical underpinning for robust decision-making in an online, streaming-data setting, making it ideal for a wide variety of data-driven decision-making applications. My research has cast a number of online decision-making problems, such as real-time dynamic programming (IJCAI-09) and reinforcement learning (ICML-10), in a risk-sensitive Bayesian decision-theoretic framework, where these algorithms incur fewer suboptimal decisions than their non-Bayesian versions without any increase in computational complexity. Furthermore, in preference elicitation for recommendation (AISTATS-10, NIPS-10), I have built on recent advances in approximate Bayesian inference to leverage value of information for making near-optimal, real-time recommendations with minimal user interaction. These results all counter the stigma that Bayesian reasoning is computationally intractable and demonstrate the benefits of Bayesian reasoning on a strict computational budget.
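To make the idea of cheap online Bayesian decision-making concrete, here is a minimal sketch of Thompson sampling with Beta-Bernoulli posteriors. It is a generic textbook technique chosen for illustration, not the risk-sensitive algorithms of the cited papers; the function name `thompson_sampling`, the success rates, and the round count are all assumptions made for this example.

```python
import random


def thompson_sampling(true_rates, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling; returns per-arm [alpha, beta] posteriors.

    `true_rates` are hypothetical success probabilities, used only to
    simulate feedback; the decision-maker never reads them directly.
    """
    rng = random.Random(seed)
    # Start from a uniform Beta(1, 1) prior on every arm's success rate.
    posteriors = [[1, 1] for _ in true_rates]
    for _ in range(n_rounds):
        # Sample one plausible success rate per arm from its posterior.
        samples = [rng.betavariate(a, b) for a, b in posteriors]
        # Act greedily with respect to the sampled rates.
        arm = max(range(len(samples)), key=samples.__getitem__)
        # Simulated Bernoulli feedback for the chosen arm.
        success = rng.random() < true_rates[arm]
        # Conjugate update: a constant-time increment per decision.
        posteriors[arm][0 if success else 1] += 1
    return posteriors


posteriors = thompson_sampling([0.1, 0.5, 0.9], n_rounds=2000)
# Each pull adds exactly one count to the chosen arm's posterior.
pulls = [a + b - 2 for a, b in posteriors]
```

Each decision costs one posterior sample per arm and one counter increment, which is the sense in which a Bayesian method can stay within a strict per-step computational budget.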

D: Decision Diagrams on a Diet

One of the most important data structures for efficiently manipulating expressive functions for probabilistic and decision-theoretic inference is the decision diagram. With David McAllester, I defined the Affine Algebraic Decision Diagram (AADD) (IJCAI-05, AAMAS-10) for compactly representing logical and arithmetic structure in functions yielding up to an exponential-to-linear reduction in space and time for probabilistic and decision-theoretic inference applications. For my thesis work, I developed the first-order ADD (FOADD) (UAI-06, AI Journal 2009) for exploiting first-order logical relational structure used to reason efficiently about infinite domains in probabilistic planning. And crucial to my current work on hybrid stochastic control is the continuous extension of the ADD — the XADD (UAI-11, UAI-13) for efficient symbolic piecewise algebraic computation capable of producing the first exact and bounded approximate solutions for continuous traffic control models, inventory control, and numerous other hybrid control applications (AAAI-12a, NIPS-12, IJCAI-13b).
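As a rough illustration of why decision diagrams save space, the sketch below builds a plain reduced ADD (not the AADD, FOADD, or XADD described above) for a function of ten Boolean variables and compares its size with a full table. The class `ADD` and its methods are hypothetical names for this example, and the naive builder enumerates every assignment, so only the resulting structure is compact, not its construction.

```python
class ADD:
    """Minimal reduced, ordered algebraic decision diagram over Boolean variables."""

    def __init__(self):
        self.unique = {}  # node key -> node id, so equal subgraphs are shared
        self.nodes = []   # node id -> key: ("leaf", value) or (var, low_id, high_id)

    def _intern(self, key):
        # Hash-consing: create a node only if an identical one does not exist.
        if key not in self.unique:
            self.unique[key] = len(self.nodes)
            self.nodes.append(key)
        return self.unique[key]

    def leaf(self, value):
        return self._intern(("leaf", value))

    def node(self, var, low, high):
        # A test whose two branches are identical is redundant and is skipped.
        if low == high:
            return low
        return self._intern((var, low, high))

    def build(self, f, n_vars, prefix=()):
        """Shannon-expand f over variables 0..n_vars-1 (naive enumeration)."""
        if len(prefix) == n_vars:
            return self.leaf(f(prefix))
        var = len(prefix)
        low = self.build(f, n_vars, prefix + (False,))
        high = self.build(f, n_vars, prefix + (True,))
        return self.node(var, low, high)

    def evaluate(self, node_id, assignment):
        """Follow one root-to-leaf path for a full variable assignment."""
        key = self.nodes[node_id]
        while key[0] != "leaf":
            var, low, high = key
            key = self.nodes[high if assignment[var] else low]
        return key[1]


n_vars = 10
dd = ADD()
root = dd.build(sum, n_vars)   # f(x) = number of variables set to True
table_size = 2 ** n_vars       # entries in an explicit table
diagram_size = len(dd.nodes)   # shared nodes in the reduced diagram
```

For this additive function the diagram has 66 nodes against 1,024 table entries, because all partial assignments with the same running sum share one subgraph. A plain ADD still grows quadratically here; the affine transformations of the AADD are what the text credits with reductions down to linear size.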

S: Symbolic Specifications and Solutions

The advent of symbolic solution techniques (often decision diagram-based) for exploiting structure in fully and partially observed sequential decision-making problems has allowed the application of MDPs and POMDPs to extremely large problems (e.g., transport and recommendation) that would have been inconceivable even a decade ago. My work has helped define this research field and has provided order-of-magnitude reductions in time and space complexity, and in some cases the first computationally feasible approach, to (approximately) solve problems with factored (IJCAI-05, AAMAS-10), relational (UAI-05, UAI-06, AI Journal 2009, AAAI-10), and continuous state, action, and observation spaces (UAI-11, AAAI-12a, IJCAI-13b, NIPS-12, UAI-13). I have also applied these ideas to exact probabilistic inference in expressive continuous graphical models, including those required for closed-form non-conjugate Bayesian inference (AAAI-12b).

For an overview of my (and others') work in this area, I gave an AAAI-13 tutorial on this topic.
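For readers less familiar with MDPs, the following sketch shows ordinary tabular value iteration on a tiny chain problem. It is the enumerated-state baseline that symbolic methods aim to improve on, not a symbolic algorithm itself; the four-state chain, the reward of 1 for entering the goal, and the names `chain_mdp` and `value_iteration` are assumptions made for this example.

```python
def chain_mdp(n=4):
    """Hypothetical chain: moving right into the last state earns reward 1."""
    goal = n - 1
    transitions = []
    for s in range(n):
        if s == goal:
            # The goal is absorbing and yields no further reward.
            transitions.append({"left": [(1.0, s, 0.0)], "right": [(1.0, s, 0.0)]})
        else:
            nxt = s + 1
            transitions.append({
                "left": [(1.0, max(s - 1, 0), 0.0)],
                "right": [(1.0, nxt, 1.0 if nxt == goal else 0.0)],
            })
    return transitions


def value_iteration(transitions, gamma=0.9, tol=1e-9):
    """Synchronous Bellman backups over an explicitly enumerated state space.

    transitions[s][a] is a list of (probability, next_state, reward) triples.
    """
    n = len(transitions)
    values = [0.0] * n

    def q(s, a, v):
        # Expected immediate reward plus discounted value of the successor.
        return sum(p * (r + gamma * v[s2]) for p, s2, r in transitions[s][a])

    while True:
        new_values = [max(q(s, a, values) for a in transitions[s]) for s in range(n)]
        delta = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = [max(transitions[s], key=lambda a: q(s, a, values)) for s in range(n)]
    return values, policy


values, policy = value_iteration(chain_mdp())
```

Every backup here touches every state and action explicitly, which is what becomes infeasible when the state is factored, relational, or continuous and what motivates representing the value function symbolically instead.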

In order to exploit symbolic structure in decision-making, one must be able to compactly model such structure. As a major contribution to the planning community, I have developed the unifying relational dynamic influence diagram language (RDDL) to compactly model expressive decision-making problems by leveraging a mix of logical and stochastic programming representations; RDDL permits modeling of complex problems like traffic control that were otherwise impossible to model with previous description languages (e.g., PPDDL). Using RDDL, I ran the 2011 ICAPS International Probabilistic Planning Competition (IPPC) with a record 11 competitors from around the world (including M.I.T., U. Washington, U. Waterloo, KAIST in S. Korea, and N.U.S. in Singapore) (AI Magazine), which has helped shift the focus of probabilistic planning to domains that more closely reflect the expressivity of real-world applications.

The Future

My vision for the future is simple: social media is connecting all information, the Internet is becoming more interactive and personalized, and every embedded system that impacts our lives — from our daily commute to our consumer needs — will become increasingly adaptive to our welfare and that of society. Achieving this vision requires transforming immense quantities of data into actionable decisions in a way that is self-improving (learning), online (requiring efficient algorithms and data structures), robust (Bayesian), expressive (symbolic), and non-myopic (sequentially optimal). My research at the confluence of these areas is creating new representational and computational paradigms and new opportunities for data-driven decision-making to create better business processes and smarter cities. Alan Kay was right, "the best way to predict the future is to invent it."
