
CS Colloquium 2013-2014
Coordinator: Dr. Shahar Maoz
Place: Schreiber 309
Time: 11:00 gathering, cookies and coffee
11:15 - 12:15 colloquium talk

October 13, 2013, 11:15 - 12:15
Rotem Oshman 

Tight Bounds for Set Disjointness in the Message-Passing Model
In many distributed systems, the cost of computation is dominated by the cost of communication between the machines participating in the computation. Communication complexity is therefore a very useful tool in understanding distributed computation, and communication complexity lower bounds have been used extensively to obtain lower bounds on various distributed problems. However, almost all applications of communication complexity lower bounds in distributed computing use two-party lower bounds, despite the fact that distributed computation usually involves many players. Unfortunately, it appears that some interesting distributed problems cannot be addressed using two-player lower bounds, because reductions from two-player problems seem to lose the "hardness" of the problem.
With this motivation in mind, in this work we study communication complexity in the multi-party message-passing model, which has been extensively used in distributed computing and in secure multi-party computation. In the message-passing model there are k players, each with a private n-bit input, and the players communicate with each other over private channels. We show a lower bound of Omega(n*k) on the communication complexity of set disjointness in the message-passing model. To obtain this bound we develop information complexity tools for the model, prove a direct-sum theorem, and show that the information complexity of computing a 1-bit AND with k players is Omega(k).
No prior background on information complexity will be assumed for the talk. This is joint work with Mark Braverman, Faith Ellen, Toniann Pitassi and Vinod Vaikuntanathan.
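To make the problem concrete, here is an illustrative sketch (not from the talk) of k-player set disjointness and the cost of the naive protocol, which matches the Omega(n*k) lower bound up to constants:

```python
def disjointness(inputs):
    """inputs: list of k bit-strings, each of length n.
    Returns True iff the players' sets are disjoint, i.e.
    no coordinate is 1 in every player's input."""
    n = len(inputs[0])
    return not any(all(x[i] == "1" for x in inputs) for i in range(n))

def naive_protocol_cost(k, n):
    # in the naive protocol every player except a coordinator
    # ships its entire n-bit input, costing (k-1)*n bits
    return (k - 1) * n
```

The lower bound in the talk shows that no protocol can do asymptotically better than this trivial one.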

October 27, 2013, 11:15 - 12:15
Prof. Csaba Szepesvári 

Online Learning in Markov Decision Processes under Bandit Feedback
Abstract:
Most reinforcement learning algorithms assume that the system to be controlled can be accurately approximated given the measurements and the available resources. However, this assumption is overly optimistic for too many problems of practical interest. For example, there can be a very large number of unobserved variables that influence the dynamics in a complicated manner. How, then, can one ask for near-optimal performance without requiring an enormous amount of data? An alternative to this standard criterion is based on the concept of regret, borrowed from the online learning literature. Under this alternative criterion, the performance of a learning algorithm is measured by how much total reward is collected as compared to the total reward that could have been collected by the best policy in hindsight from a fixed policy class. How can we design algorithms that keep the so-defined regret small? Do we need to change existing algorithm designs? The talk will provide some answers and raise some more questions.
Joint work with András Antos, András György, Travis Dick and Gergely Neu.
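The regret criterion from the abstract can be sketched in a few lines; the toy reward table and function names below are illustrative, not from the talk. For a fixed class of actions, regret is the gap between what the best fixed action would have earned in hindsight and what the learner actually earned:

```python
def regret(rewards, chosen):
    """rewards[t][a] = reward of action a at step t;
    chosen[t] = the action the learner actually played at step t.
    Returns (best fixed action's total reward) - (learner's total reward)."""
    total = sum(rewards[t][chosen[t]] for t in range(len(rewards)))
    num_actions = len(rewards[0])
    best_fixed = max(sum(r[a] for r in rewards) for a in range(num_actions))
    return best_fixed - total
```

Note that regret compares against the best *fixed* choice, not the best per-step choice, which is what makes keeping it small achievable at all.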
Bio:
Csaba Szepesvari (University of Alberta, Canada)
Csaba Szepesvari received his PhD in 1999 from "Jozsef Attila" University, Szeged, Hungary. After his PhD he spent 8 years in the software industry, developing and directing the development of advanced "artificial intelligence" software. He founded a software company that develops custom software using advanced machine learning algorithms, and helped found a high-frequency trading company. He is currently an Associate Professor at the Department of Computing Science of the University of Alberta and a Principal Investigator of the Alberta Innovates Centre for Machine Learning. He is the co-author of a book on nonlinear approximate adaptive controllers and the sole author of a recent short book on Reinforcement Learning. He has published about 150 journal and conference papers. He serves as an Associate Editor of IEEE Transactions on Automatic Control and AI Communications, is an Action Editor of the Journal of Machine Learning Research and the Machine Learning Journal, and is an editorial board member of the Computer Science Review. His research interests include reinforcement learning, statistical learning theory and online learning.

November 24, 2013, 11:15 - 12:15
Katrina Ligett 

Data Privacy: Tensions and Opportunities
Abstract: Large databases of sensitive personal information are ubiquitous, hold great potential, and are rife with risks. How can we understand the fundamental tradeoffs between making use of such data and respecting individual privacy? Are both goals simultaneously achievable in realistic settings? What could change once we model the fact that privacy considerations might play a role in individuals' decisions to provide their data in the first place? In this talk, we will discuss recent theoretical work addressing these questions, in the formal framework of differential privacy.
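As a minimal illustration of the differential privacy framework mentioned in the abstract, here is a sketch of the Laplace mechanism, the canonical way to release a counting query (sensitivity 1) with epsilon-differential privacy. This is standard background, not material from the talk:

```python
import math
import random

def laplace_noise(scale):
    # inverse-CDF sampling from the Laplace(0, scale) distribution
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def private_count(records, predicate, epsilon):
    """Release |{r in records : predicate(r)}| plus Laplace(1/epsilon) noise.
    Since adding or removing one record changes the count by at most 1,
    this satisfies epsilon-differential privacy."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)
```

The tension the talk explores is visible even here: smaller epsilon means stronger privacy but noisier, less useful answers.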
Bio:
Katrina Ligett is an Assistant Professor of Computer Science and Economics at Caltech. Prior to joining Caltech, she was a postdoctoral scholar in Computer Science at Cornell (during which she was also a visitor to the Research Group on Algorithmic Game Theory at the Hebrew University's Institute for Advanced Studies). She received her PhD in Computer Science from Carnegie Mellon in 2009. She is a recent recipient of the Microsoft Research Faculty Fellowship and an NSF CAREER award.

December 15, 2013, 11:15 - 12:15
Amir Shpilka 

Recent results in Algebraic Complexity with a focus on Polynomial Identity Testing
Abstract: Algebraic complexity is the field that studies the complexity of computing algebraic functions such as multiplication of two matrices, determinants, etc. The main model of computation is Arithmetic Circuits: circuits that use the arithmetic operations and act on field elements (intuitively, an arithmetic circuit is a "program" that only uses the arithmetic operations + and *).
Polynomial Identity Testing (PIT for short) is the problem of efficiently and deterministically deciding whether a given arithmetic circuit computes the zero polynomial. This is one of the most fundamental problems in computational complexity and it is strongly related to the problem of proving lower bounds for arithmetic circuits.
In this talk I will survey several recent results on the complexity of arithmetic circuits, focusing mainly on the Polynomial Identity Testing problem. Specifically I will explain the importance of arithmetic circuits and what makes this model special compared to the usual Boolean model. Then I will discuss the PIT problem and give an overview of some of the ideas behind recent PIT algorithms and the connection to questions in combinatorial geometry.
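For context, the classical *randomized* solution to PIT is the Schwartz-Zippel approach: evaluate the circuit at random points, since a nonzero polynomial of degree d vanishes at a random point drawn from a set S with probability at most d/|S|. The open challenge the talk addresses is doing this deterministically. An illustrative sketch:

```python
import random

def probably_zero(poly, num_vars, trials=20, p=10**9 + 7):
    """poly: a black-box polynomial over Z, called with num_vars arguments.
    Returns False as soon as a nonzero evaluation (mod p) is witnessed;
    returns True if all random evaluations are zero, in which case the
    polynomial is identically zero with overwhelming probability
    (error at most (degree/p) per trial, by Schwartz-Zippel)."""
    for _ in range(trials):
        point = [random.randrange(p) for _ in range(num_vars)]
        if poly(*point) % p != 0:
            return False  # definitely not the zero polynomial
    return True
```

Derandomizing this test for restricted circuit classes is exactly the kind of result the talk surveys.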


December 18, 2013 - special day: Wednesday! Gilman building, auditorium 223! 11:15 - 12:15
Daniel Jackson 

(RE)THINKING SOFTWARE DESIGN
Abstract: What is the essence of software design? Researchers and practitioners have for many years quoted Fred Brooks's assertions that "conceptual integrity is the most important consideration in system design" and is "central to product quality". But what exactly is conceptual integrity? In this talk, I'll report on progress in a new research project that explores this question by attempting to develop a theory of conceptual design, and applying it experimentally in a series of redesigns of common applications (such as Git, Gmail and Dropbox).
Bio:
Daniel Jackson is Professor of Computer Science and a MacVicar Teaching Fellow at the Massachusetts Institute of Technology. He led the development of Alloy (alloy.mit.edu) and is the author of "Software Abstractions: Logic, Language, and Analysis" (MIT Press, 2006). He was chair of a National Academies study entitled "Software for Dependable Systems: Sufficient Evidence?" in 2007, and more recently a member of the study on unintended acceleration. He has broad interests in many areas of software engineering, especially in software design, critical systems and formal methods.


December 22, 2013, 11:15 - 12:15
Shiri Chechik 

Shortest Path Queries: Static, Dynamic and Fault-tolerant
Abstract: In recent years, the practical need for fast retrieval of shortest path queries has significantly increased, in part due to developing GPS navigation systems and other route-planning software.
Classical shortest path algorithms such as Dijkstra's algorithm return shortest path distances in almost linear time in the number of nodes. However, continent-sized road networks require something much faster. It is possible to achieve sublinear-time queries if preprocessing is allowed.
A distance oracle is a data structure that allows fast retrieval of a distance estimate for any pair of nodes. The focus in designing distance oracles is often on the tradeoff between several parameters: the construction time (the time it takes to construct the distance oracle), the size of the data structure, the query time and the quality of the answer (how close is the estimated distance to the real distance).
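These tradeoffs can be illustrated with a toy landmark-based oracle (a sketch for intuition, not the construction from the talk): preprocessing runs BFS from a few landmark nodes, and a query returns the best landmark detour, an upper bound on the true distance. Space is O(|landmarks| * n) and query time is O(|landmarks|), but the answer quality degrades with fewer landmarks:

```python
from collections import deque

def bfs_distances(graph, source):
    """graph: dict mapping node -> list of neighbors (unweighted)."""
    dist = {source: 0}
    q = deque([source])
    while q:
        u = q.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

class LandmarkOracle:
    def __init__(self, graph, landmarks):
        # preprocessing: one BFS table per landmark
        self.tables = {l: bfs_distances(graph, l) for l in landmarks}

    def query(self, u, v):
        # d(u, l) + d(l, v) >= d(u, v) by the triangle inequality
        return min(t[u] + t[v] for t in self.tables.values())
```

Real constructions such as the Thorup-Zwick oracle mentioned below give provable worst-case stretch guarantees rather than this heuristic upper bound.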
Not only are distance oracles important structures in their own right, with both practical and theoretical interest, but they also have strong ties to numerous other problems in theory and distributed computing, such as spanners, compact routing schemes and low-stretch spanning trees.
A major complicating aspect is that real-world networks may change over time. The changes may be either permanent (e.g., some road is added to the network) or temporary (e.g., some roads may be closed for short periods or a major junction may be temporarily blocked).
A dynamic setting is used to capture permanent changes to the network, whereas a fault-tolerance setting captures temporary changes.
In the first part of the talk I will highlight some of my work on shortest path queries in static, dynamic and fault-tolerant settings. In the second part of the talk I will discuss my recent work on constant-time distance oracles, which essentially obtains optimal bounds for general graphs and improves over the Thorup-Zwick distance oracle [STOC '01, J. ACM '05].


December 29, 2013, 11:15 - 12:15
Sivan Sabato, Microsoft Research New England 

Learning with Lower Information Costs
Abstract: In this talk I will consider learning with lower information costs, focusing on linear regression. Linear regression is one of the most widely used methods for prediction and forecasting, with widespread uses in many fields such as the natural sciences, economics and medicine. I will show how to improve the information costs of linear regression in two settings. First, I will present a new estimation algorithm for the standard supervised regression setting. This is the first efficient estimator that enjoys minimax-optimal sample complexity, up to log factors, for general heavy-tailed distributions. The technique is general and can be applied to a larger class of smooth and strongly convex losses. Second, I will consider the challenge of using crowdsourcing for labeling in tasks that usually require experts, and show how to achieve this using linear regression combined with a feature multi-selection approach.
Based on joint work with Daniel Hsu and Adam Kalai.


January 9, 2014 - special date: Thursday! 13:15 - 14:15
Baruch Barzel 

A Peek Into the Inner Workings of Complex Networks: Making Microscopic Sense of Macroscopic Data
Abstract: The era of big data allows us unprecedented access to the behavior of complex systems that were previously considered unmeasurable. Indeed, high throughput data collection, from genetic perturbation experiments to cascading effects in social networks is now frequently used to monitor biological, social and technological systems, in an effort to decipher the rules that govern their dynamical behavior. This, however, confronts us with a new kind of scientific challenge: how to translate the empirically observed data into a predictive model, describing the dynamical mechanisms that govern the system’s behavior. To address this problem we develop a new formalism, which, starting from empirical observations of the system’s response to external perturbations, allows us to reconstruct the underlying mechanisms that govern the interactions between the system’s components. The result is a microscopic model for the system’s rules of interaction, inferred directly from the macroscopic observation of the system’s behavior.
Relevant papers:
Universality in network dynamics. Nature Physics 9, 673–681 (2013) doi:10.1038/nphys2741
Network link prediction by global silencing of indirect correlations. Nature Biotechnology 31, 720–725 (2013) doi:10.1038/nbt.2601
Also featured in 2physics.com (presenting key developments in physics): http://www.2physics.com/search/label/Complex%20System%203


January 12, 2014, 11:15 - 12:15
Iddo Tzameret 

Algebra, Proofs and Complexity: Recent Developments
Abstract: Proof complexity is an established direction towards the central questions of computational complexity, namely the fundamental questions about the limits of efficient computation; it is also the field that provides the theoretical foundation for proof-search algorithms (like DPLL-based SAT solvers). At its heart is the notion of a propositional proof system, which is simply a way to write certificates for Boolean tautologies: each different proof system gives a different way to write proofs of tautologies, and the complexity of a proof is the number of symbols it takes to write it down. Establishing strong size lower bounds on strong enough propositional proof systems, that is, showing that a certain family of tautologies does not have small proofs in the system, is an extremely hard problem in itself, which also has striking implications for computational complexity.
In recent years, a new purely algebraic approach to proof complexity has emerged via the notion of arithmetic proofs, introduced by Hrubes and myself (CCC'09). This approach sets out to provide new structures and tools for old problems in the field, as well as providing an elegant solution to some long-standing open problems and giving rise to new, seemingly fundamental open problems. In this talk I will give a short background on propositional proof complexity and discuss the algebraic approach, some of its achievements, prospects, and basic open problems.
Based on joint works with Pavel Hrubes (CCC'09, STOC'12), Fu Li ('13) and Stephen A. Cook ('13).



February 16, 2014, 11:15 - 12:15
Orna Kupferman, Hebrew University

Variations on Safety
Abstract:
Verification and design of reactive systems involves reasoning about ongoing behaviors: the non-terminating interaction of the system with its environment. Of special interest in formal methods are safety properties, which assert that the system always stays within some allowed region, in which nothing "bad" happens. For example, two processes are never together in some critical section. Equivalently, a property is a safety property if every violation of it occurs after a finite execution of the system. Thus, a computation violates the property if it has a "bad prefix", all of whose extensions violate the property. The talk surveys the theoretical aspects of safety properties as well as their practical advantages with respect to general properties. It also describes several extensions and variations of safety, and a probability-based approach for defining different levels of safety.
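The "bad prefix" characterization has a practical consequence: a safety property can be checked by a runtime monitor over finite traces. A minimal illustrative sketch for the mutual-exclusion example from the abstract (names are ours, not from the talk):

```python
def find_bad_prefix(trace):
    """trace: sequence of sets, each the processes currently in the
    critical section at that step.
    Returns the length of the shortest bad prefix witnessing a violation
    of mutual exclusion, or None if this (finite) trace stays safe."""
    for i, in_cs in enumerate(trace):
        if len(in_cs) > 1:
            # every infinite extension of this prefix violates the property
            return i + 1
    return None
```

This is exactly what makes safety properties easier to handle in practice than general properties: a violation is always witnessed by a finite prefix.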

February 23, 2014, 11:15 - 12:15
Elad Hazan 

Sublinear Optimization for Machine Learning
Abstract:
In many modern optimization problems, particularly those arising in machine learning, the amount of data is too large to apply standard convex optimization methods. We'll discuss new optimization algorithms that make use of randomization to prune the data and produce a correct solution while running in time smaller than the data representation, i.e., sublinear running time. We'll present such sublinear-time algorithms for linear classification, support vector machine training, semidefinite programming and other optimization problems. These new algorithms are based on a primal-dual approach, and use a combination of novel sampling techniques and the randomized implementation of online learning algorithms. We'll describe information-theoretic lower bounds that show our running times to be nearly best possible in the unit-cost RAM model.
The talk will be self-contained; no prior knowledge of convex optimization or machine learning is assumed.

April 6, 2014, 11:15 - 12:15
Michael Walfish 

Verifying the correctness of remote executions: from theoretical possibility to near practicality
Abstract:
How can a machine specify a computation to another one and then, without executing the computation, check that the remote machine carried it out correctly? And how can this be done without assumptions about the performer (replication, etc.) or restrictions on the computation? This is a classic problem in systems security, and it is particularly relevant in the context of cloud computing. For decades, it has been known that this problem can be solved in theory, using probabilistically checkable proofs (PCPs) coupled with cryptographic tools. The rub was practicality: if implemented naively, the theory would be astronomically more expensive than simply executing the computation locally.
Over the last several years, a number of projects have brought this theory to near practicality in the context of implemented systems. In this emerging area of _proof-based verifiable computation_, the pace of progress has been rapid, and there have been many encouraging developments. I will discuss this general area and my group's efforts. We have refined an elegant protocol of Ishai, Kushilevitz, and Ostrovsky (CCC '07); broadened the computational model from Boolean circuits to something general-purpose; developed a compiler that transforms programs into executables that implement the protocol entities; fully implemented the system; accelerated it with GPUs; and extended the machinery to handle real applications of the cloud (MapReduce, etc.).
April 27, 2014, 11:15 - 12:15
Keren Yizhak 

Genome-scale computational methods for the identification of metabolic drug targets in aging and cancer
Abstract:
Metabolism is a key process involved in aging and many human diseases, ranging from obesity to cancer and neurodegenerative disorders. It is also one of the very few key cellular processes that is currently amenable to computational modeling on a genome scale, moving towards the holy grail of an in silico cell. Within this framework, the objective of my research has been to develop new computational methods for studying human metabolism on a large scale. My talk will describe two such methods and demonstrate their applications: (1) First, I will introduce the Metabolic Transformation Algorithm (MTA). This algorithm identifies genetic and dietary perturbations that are most likely to retrieve the healthy metabolic state from a diseased one. Applied to analyze yeast and human aging data, our experimental collaborators (the Cohen lab, BIU) have successfully validated some of MTA's predictions, finding genes whose suppression has more than doubled the longevity of the yeast. (2) Second, moving to study cancer metabolism, I will introduce a novel method termed PRIME that generates personalized human metabolic models based on individual molecular data. PRIME is utilized to build more than 300 models of normal and cancer cell lines that successfully predict an array of metabolic phenotypes in an individual manner. The resulting cell models are then used to predict novel drug targets that selectively suppress cancer cell growth, and drug targets that selectively attenuate cell migration (and hence may potentially inhibit metastasis). Some of the targets have now been experimentally tested and validated by our collaborators (the Frezza lab (Cambridge) and the Van de Water lab (Leiden)). PRIME therefore lays a foundation for future personalized metabolic modeling applications, enhancing the search for novel anti-cancer therapies.
May 11, 2014, 12:45 - 13:45 - NOTE SPECIAL TIME!
Jan Vitek, Purdue University 

The rise of dynamic languages
Abstract:
Formal approaches to program correctness through static software verification have influenced the design of programming languages, programming models, and developer tools for the last fifty years. Yet static techniques can only handle a small fraction of the programs written in the languages they claim to target. By and large, academic research in the field has been about analyzing languages that all look and feel like Pascal: languages with a static type discipline and with read-only programs. This has very little to do with popular languages in use today, such as JavaScript, Python, Lua and R, where typing is dynamic and new behaviors can be synthesized at runtime through powerful reflective programming interfaces. Instead of embracing dynamism and trying to support popular programming idioms, our community keeps proposing solutions that impose static disciplines on programmers. We keep trying to find the inner Pascal in every JavaScript. This is bound to drive practitioners away and ensure our continued irrelevance. Are we bound to repeat history, or is there a way out?
Biography:
Jan Vitek is a Professor of Computer Science at Purdue University and University Faculty Scholar. Over the years, he has worked on topics related to programming languages: their design, use, and implementation. With Noble and Potter, he proposed the notion of flexible alias control, which became known as Ownership Types. He led the Ovm project, which produced the first real-time Java virtual machine to be flight-tested on a ScanEagle drone (he claims no one was harmed). Outcomes of this project include the Schism real-time garbage collector and the FijiVM, a production VM for embedded systems. More recently, Jan worked on dynamic languages, trying to make sense of JavaScript and to design a new language called Thorn. Nowadays, he spends his time with statisticians and data scientists. Jan believes that his 2012 election as Chair of SIGPLAN was an accident; since then he has been busy trying to rock the boat to ensure this does not happen again. In his spare time, Jan enjoys organizing conferences and sitting on PCs (over 25 in the last decade). He founded the MOS (mobile objects), IWACO (alias control), STOP (gradual typing), and TRANSACT (transactional memory) workshop series. He was the first program chair of VEE and chaired ESOP, ECOOP, Coordination and TOOLS. He was the general chair of PLDI (in Beijing!), ISMM and LCTES. He may still be sitting on the steering committees of ECOOP, JTRES, ICFP, OOPSLA, POPL, PLDI, LCTES, ESOP.
May 18, 2014, 11:15 - 12:15
Doron Peled, Bar-Ilan University

Synthesis Using Genetic Programming Based on Model Checking
Abstract:
We show how the use of genetic programming, in combination with model checking and testing, provides a powerful way to synthesize programs. Whereas classical algorithmic synthesis provides alarmingly high complexity and undecidability results, the genetic approach provides surprisingly successful heuristics. We describe several versions of a method for synthesizing sequential and concurrent systems. To cope with the constraints of model checking and of theorem proving, we combine such exhaustive verification methods with testing. We show several examples where we used our approach to synthesize, improve and correct code.

June 1, 2014, 11:15 - 12:15
Michal Feldman 

Beyond Walrasian Equilibrium
Abstract:
We study a combinatorial market design problem, where a collection of indivisible objects is to be priced and sold to potential buyers subject to equilibrium constraints. The classic solution concept for such problems is Walrasian Equilibrium (WE), which provides a simple and transparent pricing structure that achieves optimal social welfare. The main weakness of the WE notion is that it exists only in very restrictive cases. To overcome this limitation, we introduce the notion of a Combinatorial Walrasian Equilibrium (CWE), a natural relaxation of WE. The difference between a CWE and a (non-combinatorial) WE is that the seller can package the items into indivisible bundles prior to sale, and the market does not necessarily clear.
We show that every valuation profile admits a CWE that obtains at least half of the optimal (unconstrained) social welfare. Moreover, we devise a poly-time algorithm that, given an arbitrary allocation $X$, computes a CWE that achieves at least half of the welfare of $X$. Thus, the economic problem of finding a CWE with high social welfare reduces to the algorithmic problem of social-welfare approximation. In addition, we show that every valuation profile admits a CWE that extracts a logarithmic fraction of the optimal welfare as revenue. The strength of our results derives partly from their generality: our results hold for arbitrary valuations that may exhibit complex combinations of substitutes and complements.
Finally, we study relaxed Walrasian equilibria where the seller can package the items into bundles but must clear the market, and equilibria where the seller is restricted to using item prices only, but does not necessarily clear the market.
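To fix intuition for the classic WE notion being relaxed, here is an illustrative brute-force checker (ours, not from the talk): prices and an allocation form a Walrasian equilibrium when every buyer receives a utility-maximizing bundle at the posted prices and every unallocated item is priced at zero (the market clears). The exponential enumeration over bundles is fine only for tiny examples:

```python
from itertools import chain, combinations

def powerset(items):
    s = list(items)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def is_walrasian_eq(items, prices, valuations, allocation):
    """items: frozenset of item names; prices: item -> price;
    valuations: buyer -> function(bundle) -> value;
    allocation: buyer -> frozenset of items (disjoint bundles)."""
    # condition 1: each buyer's bundle maximizes value minus price
    for buyer, v in valuations.items():
        bundle = allocation[buyer]
        utility = v(bundle) - sum(prices[i] for i in bundle)
        for alt in powerset(items):
            alt = frozenset(alt)
            if v(alt) - sum(prices[i] for i in alt) > utility:
                return False
    # condition 2: market clearance -- unsold items must have price zero
    sold = set().union(*allocation.values())
    return all(prices[i] == 0 for i in items - sold)
```

The CWE relaxation drops the clearance requirement and lets the seller pre-bundle items, which is what restores existence for general valuations.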
Based on joint work with Nick Gravin, Brendan Lucier, Shahar Dobzinski, Inbal TalgamCohen, and Omri Weinstein.

June 5, 2014, 13:15 - 14:15 - NOTE SPECIAL DATE/TIME!
Dafna Shahaf, Stanford University 

The Aha! Moment: From Data to Insight
Abstract:
The amount of data in the world is increasing at incredible rates. Largescale data has potential to transform almost every aspect of our world, from science to business; for this potential to be realized, we must turn data into insight. In this talk, I will describe two of my efforts to address this problem computationally: The first project, Metro Maps of Information, aims to help people understand the underlying structure of complex topics, such as news stories or research areas. Metro Maps are structured summaries that can help us understand the information landscape, connect the dots between pieces of information, and uncover the big picture. The second project proposes a framework for automatic discovery of insightful connections in data. In particular, we focus on identifying gaps in medical knowledge: our system recommends directions of research that are both novel and promising. I will formulate both problems mathematically and provide efficient, scalable methods for solving them. User studies on realworld datasets demonstrate that our methods help users acquire insight efficiently across multiple domains.
Bio: Dafna Shahaf is a postdoctoral fellow at Stanford University. She received her Ph.D. from Carnegie Mellon University; prior to that, she earned an M.S. from the University of Illinois at Urbana-Champaign and a B.Sc. from Tel Aviv University. Dafna's research focuses on helping people make sense of massive amounts of data. She has won a best research paper award at KDD 2010, a Microsoft Research Fellowship, a Siebel Scholarship, and a Magic Grant for innovative ideas.

June 8, 2014, 11:15 - 12:15
Eran Tromer, Tel Aviv University 

Succinct Zero-Knowledge Proofs and their Applications
Abstract:
"Computers are unreliable and vulnerable to attacks; therefore we shouldn't believe what they say, unless they prove its correctness."
Imagine how much more robust our networks and protocols would be if this tenet could be realized! Recent cryptographic work aims to bring this to reality and practicality, via "zero-knowledge Succinct Non-interactive ARguments of Knowledge" (zkSNARK) proof systems, and their extension to "Proof-Carrying Data".
This talk will survey recent work in this area, including:
- Theoretical feasibility
- Implementation of zkSNARKs for proving the correct execution of general computation, e.g., expressed as C programs
- Concrete applications, e.g., to improving privacy in the Bitcoin decentralized currency
Joint works with Eli Ben-Sasson, Nir Bitansky, Ran Canetti, Alessandro Chiesa, Christina Garman, Daniel Genkin, Matthew Green, Ian Miers, Eran Tromer, and Madars Virza.
For papers and code see http://www.scipr-lab.org and http://zerocash-project.org



