1. Collaborative learning at scale [electronic resource] [2017] Online
- Book
- 1 online resource.
We introduce an online learning platform that scales collaborative learning. We study the impact of team formation and the team formation process in massive open online classes. We observe that learners prefers team members with similar location, age range and education level. We also observe that team members in more successful teams have diverse skill sets. We model the team formation process as a cooperative game and prove the existence of stable team allocations. We propose a polynomial-time algorithm that finds a stable team allocation for a certain class of utility functions. We use this algorithm to recommend teams to learners. We show that team recommendations increase the percentage of learners who finish the class.
We introduce an online learning platform that scales collaborative learning. We study the impact of team formation and the team formation process in massive open online classes. We observe that learners prefers team members with similar location, age range and education level. We also observe that team members in more successful teams have diverse skill sets. We model the team formation process as a cooperative game and prove the existence of stable team allocations. We propose a polynomial-time algorithm that finds a stable team allocation for a certain class of utility functions. We use this algorithm to recommend teams to learners. We show that team recommendations increase the percentage of learners who finish the class.
2. Design and analysis of loyalty reward programs [electronic resource] [2017] Online
- Book
- 1 online resource.
This thesis provides an in-depth analysis of two major components in the design of loyalty reward programs. First, we discuss the design of coalition loyalty programs - schemes where customers can earn and spend reward points across multiple merchant partners. And second, we conduct a model based comparison of a standalone loyalty reward program against traditional pricing - we theoretically characterize the conditions under which it is better to run a reward program within a competitive environment. Coalition loyalty programs are agreements between merchants allowing their customers to exchange reward points from one merchant to another at agreed upon exchange rates. Such exchanges lead to transfer of liabilities between merchant partners, which need to be frequently settled using payments. We first conduct an empirical investigation of existing coalitions, and formulate an analytical model of bargaining for merchant partners to agree upon the exchange rate and payment parameters. We show that our bargaining model produces networks that are close to optimal in terms of social welfare, in addition to cohering with empirical observations. Then, we introduce a novel alternate methodology for settling the transferred liabilities between merchants participating in a coalition. Our model has three interesting properties -- it is decentralized, arbitrage-proof, and fair against market power concentration -- which make it a real alternative to how settlements happen in coalition loyalty programs. Finally, we investigate the design of an optimal reward program for a merchant competing against a traditional pricing merchant, for varying customer populations, where customers measure their utility in rational economic terms. We assume customers are either myopic or strategic, and have a prior loyalty bias toward the reward program merchant, drawn from a known distribution. We show that for the reward program to perform better, it is necessary for a minimum fraction of the customer population to be strategic, and the loyalty bias distribution to be within an optimal range. This thesis is a useful read for marketers building promotional schemes within retail, researchers in the field of marketing and behavioral science, and companies investigating the intersection of customer behavior, loyalty, and virtual currencies.
This thesis provides an in-depth analysis of two major components in the design of loyalty reward programs. First, we discuss the design of coalition loyalty programs - schemes where customers can earn and spend reward points across multiple merchant partners. And second, we conduct a model based comparison of a standalone loyalty reward program against traditional pricing - we theoretically characterize the conditions under which it is better to run a reward program within a competitive environment. Coalition loyalty programs are agreements between merchants allowing their customers to exchange reward points from one merchant to another at agreed upon exchange rates. Such exchanges lead to transfer of liabilities between merchant partners, which need to be frequently settled using payments. We first conduct an empirical investigation of existing coalitions, and formulate an analytical model of bargaining for merchant partners to agree upon the exchange rate and payment parameters. We show that our bargaining model produces networks that are close to optimal in terms of social welfare, in addition to cohering with empirical observations. Then, we introduce a novel alternate methodology for settling the transferred liabilities between merchants participating in a coalition. Our model has three interesting properties -- it is decentralized, arbitrage-proof, and fair against market power concentration -- which make it a real alternative to how settlements happen in coalition loyalty programs. Finally, we investigate the design of an optimal reward program for a merchant competing against a traditional pricing merchant, for varying customer populations, where customers measure their utility in rational economic terms. We assume customers are either myopic or strategic, and have a prior loyalty bias toward the reward program merchant, drawn from a known distribution. We show that for the reward program to perform better, it is necessary for a minimum fraction of the customer population to be strategic, and the loyalty bias distribution to be within an optimal range. This thesis is a useful read for marketers building promotional schemes within retail, researchers in the field of marketing and behavioral science, and companies investigating the intersection of customer behavior, loyalty, and virtual currencies.
3. Online active learning with linear models [electronic resource] [2017] Online
- Book
- 1 online resource.
In this thesis we address online decision making problems where an agent needs to collect optimal training data to fit statistical models. Decisions on whether to request the response of incoming stochastic observations or not must be made on the spot, and, when selected, the outcomes are immediately revealed. In particular, in the first part, we study scenarios where a single linear model is estimated given a limited budget: only k out of the n incoming observations can be labelled. In the second part, we focus on active learning in the presence of several linear models with heterogeneous unknown noise levels. The goal is to estimate all the models equally well. We design algorithms to efficiently solve the problems, extend them to sparse high-dimensional settings, and derive statistical guarantees in all cases. In addition, we validate our algorithms with synthetic and real-world data, where most of our technical assumptions are violated. Finally, we briefly explore active learning in pure exploration settings for reinforcement learning. In this case, an unknown Markov Decision Process needs to be learned under a fixed budget of episodes.
In this thesis we address online decision making problems where an agent needs to collect optimal training data to fit statistical models. Decisions on whether to request the response of incoming stochastic observations or not must be made on the spot, and, when selected, the outcomes are immediately revealed. In particular, in the first part, we study scenarios where a single linear model is estimated given a limited budget: only k out of the n incoming observations can be labelled. In the second part, we focus on active learning in the presence of several linear models with heterogeneous unknown noise levels. The goal is to estimate all the models equally well. We design algorithms to efficiently solve the problems, extend them to sparse high-dimensional settings, and derive statistical guarantees in all cases. In addition, we validate our algorithms with synthetic and real-world data, where most of our technical assumptions are violated. Finally, we briefly explore active learning in pure exploration settings for reinforcement learning. In this case, an unknown Markov Decision Process needs to be learned under a fixed budget of episodes.
4. Decision-making at scale [electronic resource] [2016] Online
- Book
- 1 online resource.
Recent years have seen an increase in democratic innovations aimed at increasing the participation of the public in policy-making. This observation, coupled with the increasing prevalence of internet-based communication, points to a very real possibility of implementing participatory democracies on a mass-scale in which every individual is invited to contribute their ideas and opinions. This dissertation addresses two challenges to realizing such a vision: scalability and polarization. I first address the challenge of scaling up voting. In the classical theory of social choice, participants submit rankings over all the candidates. This is, unfortunately, not feasible when voting over a large number of contributed ideas. We give algorithms that elicit approximate winners for a large class of social choice functions while only requiring participants to make a small number of comparisons, significantly improving on previous lower bounds. We show that one can further improve these algorithms to preserve voter privacy, resist strategic manipulation, and account for streaming settings. We demonstrate these ideas in Finland's recent off-road traffic law reform. We also propose a model for scaling up aggregation more collaboratively, through sequences of small group deliberations. We consider whether such interactions can be used to find the wisdom of the crowd, defined here to be the median opinion, and show a result indicating the importance of having groups of at least three. Specifically, we show that a small number of triadic interactions are able to find a tight approximation of the generalized median, but that dyadic interactions satisfying natural axioms are unable to. Finally, we briefly describe our work addressing the challenge of polarization in society. Several empirical studies have shown that homophily, i.e. greater interaction between like-minded individuals, results in polarization. However, we show that well-known models of opinion formation based on repeated averaging can never be polarizing, even if individuals are arbitrarily homophilous. We generalize these models to account for a phenomenon well-known in social psychology as biased assimilation and show that a simple model of homophilous networks can result in either polarization, persistent disagreement or consensus depending on how biased individuals are. This model is then used to discuss methods for countering polarization.
Recent years have seen an increase in democratic innovations aimed at increasing the participation of the public in policy-making. This observation, coupled with the increasing prevalence of internet-based communication, points to a very real possibility of implementing participatory democracies on a mass-scale in which every individual is invited to contribute their ideas and opinions. This dissertation addresses two challenges to realizing such a vision: scalability and polarization. I first address the challenge of scaling up voting. In the classical theory of social choice, participants submit rankings over all the candidates. This is, unfortunately, not feasible when voting over a large number of contributed ideas. We give algorithms that elicit approximate winners for a large class of social choice functions while only requiring participants to make a small number of comparisons, significantly improving on previous lower bounds. We show that one can further improve these algorithms to preserve voter privacy, resist strategic manipulation, and account for streaming settings. We demonstrate these ideas in Finland's recent off-road traffic law reform. We also propose a model for scaling up aggregation more collaboratively, through sequences of small group deliberations. We consider whether such interactions can be used to find the wisdom of the crowd, defined here to be the median opinion, and show a result indicating the importance of having groups of at least three. Specifically, we show that a small number of triadic interactions are able to find a tight approximation of the generalized median, but that dyadic interactions satisfying natural axioms are unable to. Finally, we briefly describe our work addressing the challenge of polarization in society. Several empirical studies have shown that homophily, i.e. greater interaction between like-minded individuals, results in polarization. However, we show that well-known models of opinion formation based on repeated averaging can never be polarizing, even if individuals are arbitrarily homophilous. We generalize these models to account for a phenomenon well-known in social psychology as biased assimilation and show that a simple model of homophilous networks can result in either polarization, persistent disagreement or consensus depending on how biased individuals are. This model is then used to discuss methods for countering polarization.
- Book
- 1 online resource.
Personalization, the practice of dynamically tailoring functionality for the needs and wants of each user, is a natural step in the evolution of many computing systems (such as PCs, search engines, and recommendation systems). The merits of personalization are obvious: Users get services more quickly and accurately, context and information gets more relevant, and the interaction between users and the personalized system becomes more amicable. This thesis explores personalization as a natural step of the network's evolution, along with its associated benefits and challenges. At first, personalization might seem irrelevant for networks---after all, the network has a single, unambiguous and objective task to complete: Carry packets from one side to the other as quickly as possible. But in practice, networks are more complicated, continuously taking decisions about which traffic gets priority, how different applications are being charged, and to which WiFi network a user should connect and with what password. Making these decisions while ignoring users often produces suboptimal results. For example, the net neutrality debate highlights the dangers in dampening user choice and innovation when ISPs and Content Providers decide which applications are prioritized or zero rated. Similarly, our day-to-day interaction with several user-agnostic networks results in a fragmented user experience: We use different authentication methods and credentials for each network; reachability to desired resources is subject to the network to which we are connected; and we have very little, if any, control over policies such as firewalls and QoS settings. This thesis presents a generic network architecture for personalizing network functionality. Network operators structure and expose functionality as high-level services (e.g., a fast lane or a zero-rated lane), and then let users tailor these services by expressing their own preferences. A critical piece for personalized network services is how do users communicate their preferences to the network. Network cookies, a generic mapping abstraction, provide this user to network interface in a way that is simple yet expressive, respects the tussle between different stakeholders (e.g., users, ISPs, content providers and policymakers) in terms of security, privacy, revocability and authentication, and can be practically deployed in existing networks. Leveraging network cookies and user preferences, I describe a user-focused alternative for the net neutrality debate: Enable fast or zero-rated lanes and allow users to decide which traffic goes over them. Through user studies for zero rating and fast lanes, I demonstrate that user preferences are heavy tailed, and users are willing to express their preferences if there is a simple way to do so. Network cookies allow users to express high level preferences (e.g., prioritize a website or a mobile application) with high accuracy in the presence of encryption and middleboxes. I validate the approach through the design and prototype implementation of Boost, a user-driven fast lane deployed in 160 home networks. The last part of the thesis extends the concept of personalization by enabling a fully personalized network experience: Users define their network properties once (i.e., their WiFi SSID and password, devices, and policies such as QoS and firewalls) and their network follows them wherever they go---at workplace, public hotspots, or a friend's home. Personal WiFi networks provide users with a simplified and consistent experience regardless of how they are connected to the network. I describe the design and prototype implementation of BeHop, a personalized WiFi infrastructure deployed in a student dorm at Stanford University.
Personalization, the practice of dynamically tailoring functionality for the needs and wants of each user, is a natural step in the evolution of many computing systems (such as PCs, search engines, and recommendation systems). The merits of personalization are obvious: Users get services more quickly and accurately, context and information gets more relevant, and the interaction between users and the personalized system becomes more amicable. This thesis explores personalization as a natural step of the network's evolution, along with its associated benefits and challenges. At first, personalization might seem irrelevant for networks---after all, the network has a single, unambiguous and objective task to complete: Carry packets from one side to the other as quickly as possible. But in practice, networks are more complicated, continuously taking decisions about which traffic gets priority, how different applications are being charged, and to which WiFi network a user should connect and with what password. Making these decisions while ignoring users often produces suboptimal results. For example, the net neutrality debate highlights the dangers in dampening user choice and innovation when ISPs and Content Providers decide which applications are prioritized or zero rated. Similarly, our day-to-day interaction with several user-agnostic networks results in a fragmented user experience: We use different authentication methods and credentials for each network; reachability to desired resources is subject to the network to which we are connected; and we have very little, if any, control over policies such as firewalls and QoS settings. This thesis presents a generic network architecture for personalizing network functionality. Network operators structure and expose functionality as high-level services (e.g., a fast lane or a zero-rated lane), and then let users tailor these services by expressing their own preferences. A critical piece for personalized network services is how do users communicate their preferences to the network. Network cookies, a generic mapping abstraction, provide this user to network interface in a way that is simple yet expressive, respects the tussle between different stakeholders (e.g., users, ISPs, content providers and policymakers) in terms of security, privacy, revocability and authentication, and can be practically deployed in existing networks. Leveraging network cookies and user preferences, I describe a user-focused alternative for the net neutrality debate: Enable fast or zero-rated lanes and allow users to decide which traffic goes over them. Through user studies for zero rating and fast lanes, I demonstrate that user preferences are heavy tailed, and users are willing to express their preferences if there is a simple way to do so. Network cookies allow users to express high level preferences (e.g., prioritize a website or a mobile application) with high accuracy in the presence of encryption and middleboxes. I validate the approach through the design and prototype implementation of Boost, a user-driven fast lane deployed in 160 home networks. The last part of the thesis extends the concept of personalization by enabling a fully personalized network experience: Users define their network properties once (i.e., their WiFi SSID and password, devices, and policies such as QoS and firewalls) and their network follows them wherever they go---at workplace, public hotspots, or a friend's home. Personal WiFi networks provide users with a simplified and consistent experience regardless of how they are connected to the network. I describe the design and prototype implementation of BeHop, a personalized WiFi infrastructure deployed in a student dorm at Stanford University.
6. Modeling information flow in networks [electronic resource] : competition, evolution, and external influence [2016] Online
- Book
- 1 online resource.
In online social networks such as Twitter and Facebook, users are constantly sharing information with people they are connected to, as well as re-sharing information posted by others. Through this process, a single piece of information called a contagion can spread from user to user over the connections until it has reached a large portion of the network. In this thesis, we develop a series of probabilistic methods for modeling the spread of contagions in social networks in order to better understand the factors that affect the process. Our work examines several different phenomena that affect information flows through social networks. One such phenomenon is unobserved sources of information influencing members of the network. We present a model that not only quantifies these hidden information sources but also provides a more accurate view of information spread. We find that as much as 29% of all information spreading through social networks like Twitter originates from sources outside the network. Next, we examine how different contagions spreading through a network can interact with each other. We observe and model competition (when one contagion can decrease the spread of another contagion) and cooperation (when one contagion increases the spread of another). We find that contagion interaction can increase or decrease the probability of contagion spread by more than 70% on average. We also explore the dynamic nature of social network structure, and how these dynamics are affected by the spread of information. As social network users are exposed to new contagions, they are constantly forming new connections with other users as well as deleting connections. We find that the spread of contagions can cause sudden "bursts" in both the creation of new connections and the deletion of old connections. We also find that contagions can change the network structure on a global scale by moving like-minded members closer to each other as well as pushing less similar users farther away. Additionally, we consider the problem of inferring the structure of a hidden network when only patterns of information spread are known. Given only the timing of when each user adopted each piece of information, our method accurately predicts which users are connected to which other users by converting the maximum likelihood estimation of the network into a series of independent convex optimization problems that can be solved efficiently. Taken together, the results presented in this thesis make contributions to understanding how information flows through networks, and they provide insights into the nature of how people exchange information.
In online social networks such as Twitter and Facebook, users are constantly sharing information with people they are connected to, as well as re-sharing information posted by others. Through this process, a single piece of information called a contagion can spread from user to user over the connections until it has reached a large portion of the network. In this thesis, we develop a series of probabilistic methods for modeling the spread of contagions in social networks in order to better understand the factors that affect the process. Our work examines several different phenomena that affect information flows through social networks. One such phenomenon is unobserved sources of information influencing members of the network. We present a model that not only quantifies these hidden information sources but also provides a more accurate view of information spread. We find that as much as 29% of all information spreading through social networks like Twitter originates from sources outside the network. Next, we examine how different contagions spreading through a network can interact with each other. We observe and model competition (when one contagion can decrease the spread of another contagion) and cooperation (when one contagion increases the spread of another). We find that contagion interaction can increase or decrease the probability of contagion spread by more than 70% on average. We also explore the dynamic nature of social network structure, and how these dynamics are affected by the spread of information. As social network users are exposed to new contagions, they are constantly forming new connections with other users as well as deleting connections. We find that the spread of contagions can cause sudden "bursts" in both the creation of new connections and the deletion of old connections. We also find that contagions can change the network structure on a global scale by moving like-minded members closer to each other as well as pushing less similar users farther away. Additionally, we consider the problem of inferring the structure of a hidden network when only patterns of information spread are known. Given only the timing of when each user adopted each piece of information, our method accurately predicts which users are connected to which other users by converting the maximum likelihood estimation of the network into a series of independent convex optimization problems that can be solved efficiently. Taken together, the results presented in this thesis make contributions to understanding how information flows through networks, and they provide insights into the nature of how people exchange information.
- Book
- 1 online resource.
This thesis includes two essays on outsourcing in the presence of uncertainty. The first concerns physical goods and multi-tier supply chains whereas the second discusses the outsourcing of innovation in the context of dynamic contests. In Chapter 2 we study multi-tier supply chain networks in the presence of disruption risk. Firms decide on how to source their inputs from upstream suppliers so as to maximize their expected profits, and prices of intermediate goods are set so that markets clear. We establish that a supply equilibrium exists and provide an explicit characterization of the equilibrium prices, profits, and sourcing decisions. Our characterization allows us to derive insights on how the network structure, i.e., the number of firms in each tier, production costs, and disruption risk affect firms' profits and the prices of intermediate goods. Furthermore, we establish that networks that maximize consumer surplus and aggregate profits for the supply chain are structurally different. In particular, the former have relatively less diversified downstream tiers and generate more variable output than the latter. Finally, we study the question of endogenous chain formation by considering a game of entry, i.e., firms decide whether to engage in production by forming beliefs about their profits in the post-entry supply chain. We argue that endogenous entry leads to chains that are inefficient in terms of the number of firms that engage in production. Chapter 3 studies the design of innovation contests as a tool to outsource innovation when there is uncertainty about the feasibility of the project. In such a contest, participants race towards completing an innovation project and learn about its feasibility from their own efforts and their competitors' gradual progress. Information about the status of competition can alleviate some of the uncertainty inherent in the contest, but it can also adversely affect effort provision from the laggards. We explores the problem of designing the award structure of a contest and its information disclosure policy in a dynamic framework and provides a number of guidelines for maximizing the designer's expected payoff.
This thesis includes two essays on outsourcing in the presence of uncertainty. The first concerns physical goods and multi-tier supply chains whereas the second discusses the outsourcing of innovation in the context of dynamic contests. In Chapter 2 we study multi-tier supply chain networks in the presence of disruption risk. Firms decide on how to source their inputs from upstream suppliers so as to maximize their expected profits, and prices of intermediate goods are set so that markets clear. We establish that a supply equilibrium exists and provide an explicit characterization of the equilibrium prices, profits, and sourcing decisions. Our characterization allows us to derive insights on how the network structure, i.e., the number of firms in each tier, production costs, and disruption risk affect firms' profits and the prices of intermediate goods. Furthermore, we establish that networks that maximize consumer surplus and aggregate profits for the supply chain are structurally different. In particular, the former have relatively less diversified downstream tiers and generate more variable output than the latter. Finally, we study the question of endogenous chain formation by considering a game of entry, i.e., firms decide whether to engage in production by forming beliefs about their profits in the post-entry supply chain. We argue that endogenous entry leads to chains that are inefficient in terms of the number of firms that engage in production. Chapter 3 studies the design of innovation contests as a tool to outsource innovation when there is uncertainty about the feasibility of the project. In such a contest, participants race towards completing an innovation project and learn about its feasibility from their own efforts and their competitors' gradual progress. Information about the status of competition can alleviate some of the uncertainty inherent in the contest, but it can also adversely affect effort provision from the laggards. We explores the problem of designing the award structure of a contest and its information disclosure policy in a dynamic framework and provides a number of guidelines for maximizing the designer's expected payoff.
8. Design of large scale nudge engines [electronic resource] [2015] Online
- Book
- 1 online resource.
Congestion is a widespread problem in modern urban transportation networks; hundreds of billions of dollars are lost each year due to wasted time, extra fuel consumption, traffic accidents, etc. A significant fraction of these losses is due to peak-hour congestion on road networks, which occurs when the demand for transport exceeds capacity over a short time period each day. In this thesis, we explore the feasibility of reducing peak-hour road traffic congestion using incentives to shift drivers' commute times. We first discuss a practical implementation of such an incentive program — CAPRI (Congestion And Parking Relief Incentives). This program aimed to reduce peak-hour vehicular traffic into and out of Stanford University. Commuters who sign up for CAPRI earn points for the "good trips" they make, and these points can be redeemed for rewards (both monetary and in-kind). CAPRI also includes the capability to personalize incentives based on users' historical behavior. To complement the implementation, we develop a theoretical model for optimally reducing the cost of peak-hour congestion with targeted incentives. We study the evolution of congestion on a highway under time-varying demands using fluid models of traffic flow. We then examine the effect of shifting users' commute times on congestion. We show that ideas from the theory of optimal transport of measures can be used to develop cost-effective incentive schemes to reduce congestion. Specifically, we show that the "cost of congestion" and the "cost of nudging" are closely related to the Wasserstein distance between measures. We use this relationship to formulate linear programming problems to compute personalized recommendations and incentives to nudge drivers to the off-peak hour. We find that the resultant reduction in the cost of congestion is significant and that it helps to prioritize commuters for nudging based on their origin and destination.
Congestion is a widespread problem in modern urban transportation networks; hundreds of billions of dollars are lost each year due to wasted time, extra fuel consumption, traffic accidents, etc. A significant fraction of these losses is due to peak-hour congestion on road networks, which occurs when the demand for transport exceeds capacity over a short time period each day. In this thesis, we explore the feasibility of reducing peak-hour road traffic congestion using incentives to shift drivers' commute times. We first discuss a practical implementation of such an incentive program — CAPRI (Congestion And Parking Relief Incentives). This program aimed to reduce peak-hour vehicular traffic into and out of Stanford University. Commuters who sign up for CAPRI earn points for the "good trips" they make, and these points can be redeemed for rewards (both monetary and in-kind). CAPRI also includes the capability to personalize incentives based on users' historical behavior. To complement the implementation, we develop a theoretical model for optimally reducing the cost of peak-hour congestion with targeted incentives. We study the evolution of congestion on a highway under time-varying demands using fluid models of traffic flow. We then examine the effect of shifting users' commute times on congestion. We show that ideas from the theory of optimal transport of measures can be used to develop cost-effective incentive schemes to reduce congestion. Specifically, we show that the "cost of congestion" and the "cost of nudging" are closely related to the Wasserstein distance between measures. We use this relationship to formulate linear programming problems to compute personalized recommendations and incentives to nudge drivers to the off-peak hour. We find that the resultant reduction in the cost of congestion is significant and that it helps to prioritize commuters for nudging based on their origin and destination.
9. Efficient learning in sequential optimization [electronic resource] [2015] Online
- Book
- 1 online resource.
We consider a broad class of online optimization problems in which a decision-maker must balance between exploration and exploitation while learning from partial feedback. In these problems, the decision-maker repeatedly chooses among a set of possible actions, observes an outcome, and receives a reward representing the utility derived from this outcome. She is uncertain about the underlying system and is therefore initially unsure of which action is best. However, as outcomes are observed, she is able to learn over time to make increasingly effective decisions. Her objective is to choose actions sequentially so as to maximize the expected cumulative reward. We focus on three algorithmic approaches that accommodate flexible statistical modeling, and are capable of experimenting efficiently in broad classes of problems. The first part of the thesis focuses on a design principle known as optimism in the face of uncertainty, which underlies many of the most effective exploration algorithms. We provide a regret bound for an optimistic algorithm that applies broadly and can be specialized to many specific model classes. Our bound depends on a new notion of dimension that measures the degree of dependence among actions. We compare our notion to the Vapnik-Chervonenkis dimension, and explain why that and other measures of dimension used in the supervised literature do not suffice when it comes to analyzing optimistic algorithms. We then turn our attention to Thompson sampling, an elegant algorithm for learning in online optimization problems with partial feedback. We derive a close theoretical connection between Thompson sampling and optimistic algorithms. Due to the connection we derive, existing analysis available for specific optimistic algorithms immediately translates to expected regret bounds for Thompson sampling. The second part of the thesis pushes beyond the optimistic principle, and offers a fresh, information-theoretic, perspective on the exploration/exploitation tradeoff. We first revisit Thompson sampling from this perspective and provide novel regret bounds that scale with the entropy of the optimal action distribution. Then, we propose a new algorithm--information-directed sampling (IDS)--and study its performance. IDS quantifies the amount learned by selecting an action through an information theoretic measure, and selects actions by optimizing an objective that explicitly balances attaining a high immediate reward and selecting informative actions. We provide a general regret bound for IDS, demonstrate strong performance in simulation, and show through simple analytic examples that it can dramatically outperform Thompson sampling due to the way it quantifies information.
We consider a broad class of online optimization problems in which a decision-maker must balance between exploration and exploitation while learning from partial feedback. In these problems, the decision-maker repeatedly chooses among a set of possible actions, observes an outcome, and receives a reward representing the utility derived from this outcome. She is uncertain about the underlying system and is therefore initially unsure of which action is best. However, as outcomes are observed, she is able to learn over time to make increasingly effective decisions. Her objective is to choose actions sequentially so as to maximize the expected cumulative reward. We focus on three algorithmic approaches that accommodate flexible statistical modeling, and are capable of experimenting efficiently in broad classes of problems. The first part of the thesis focuses on a design principle known as optimism in the face of uncertainty, which underlies many of the most effective exploration algorithms. We provide a regret bound for an optimistic algorithm that applies broadly and can be specialized to many specific model classes. Our bound depends on a new notion of dimension that measures the degree of dependence among actions. We compare our notion to the Vapnik-Chervonenkis dimension, and explain why that and other measures of dimension used in the supervised literature do not suffice when it comes to analyzing optimistic algorithms. We then turn our attention to Thompson sampling, an elegant algorithm for learning in online optimization problems with partial feedback. We derive a close theoretical connection between Thompson sampling and optimistic algorithms. Due to the connection we derive, existing analysis available for specific optimistic algorithms immediately translates to expected regret bounds for Thompson sampling. The second part of the thesis pushes beyond the optimistic principle, and offers a fresh, information-theoretic, perspective on the exploration/exploitation tradeoff. We first revisit Thompson sampling from this perspective and provide novel regret bounds that scale with the entropy of the optimal action distribution. Then, we propose a new algorithm--information-directed sampling (IDS)--and study its performance. IDS quantifies the amount learned by selecting an action through an information theoretic measure, and selects actions by optimizing an objective that explicitly balances attaining a high immediate reward and selecting informative actions. We provide a general regret bound for IDS, demonstrate strong performance in simulation, and show through simple analytic examples that it can dramatically outperform Thompson sampling due to the way it quantifies information.
10. A buffer-based approach to video rate adaptation [electronic resource] [2014] Online
- Book
- 1 online resource.
During peak viewing time, well over 50% of US Internet traffic is streamed video from Netflix and YouTube. To provide a better streaming experience, these services adapt their video rates by observing and estimating the available capacity. However, accurate capacity estimation is difficult due to highly variable throughput and complex interactions between layers. As a result, existing rate adaptation algorithms often lead to suboptimal video quality and unnecessary rebuffers. This thesis proposes an alternative buffer-based approach to adapt video rate. Rather than presuming that capacity estimation is always required, this approach starts the design by only using the playback buffer occupancy, and then ask when capacity estimation can be helpful. This design process leads to two separate phases of operation: during the steady-state phase, when the buffer encodes adequate information, we choose the video rate based only on the playback buffer; during the startup phase, when the buffer contains little information, we augment the buffer-based design with capacity estimation. This approach is tested with a series of field experiments spanning millions of Netflix users from May to September, 2013. The results demonstrate that although a simple capacity estimation is important during the startup phase, it is unnecessary in the steady state. The buffer-based approach allows us to reduce the rebuffer rate by 10-20% compared to a commercial algorithm used in Netflix, while delivering a similar overall average video rate and a higher video rate in steady state.
During peak viewing time, well over 50% of US Internet traffic is streamed video from Netflix and YouTube. To provide a better streaming experience, these services adapt their video rates by observing and estimating the available capacity. However, accurate capacity estimation is difficult due to highly variable throughput and complex interactions between layers. As a result, existing rate adaptation algorithms often lead to suboptimal video quality and unnecessary rebuffers. This thesis proposes an alternative buffer-based approach to adapt video rate. Rather than presuming that capacity estimation is always required, this approach starts the design by only using the playback buffer occupancy, and then ask when capacity estimation can be helpful. This design process leads to two separate phases of operation: during the steady-state phase, when the buffer encodes adequate information, we choose the video rate based only on the playback buffer; during the startup phase, when the buffer contains little information, we augment the buffer-based design with capacity estimation. This approach is tested with a series of field experiments spanning millions of Netflix users from May to September, 2013. The results demonstrate that although a simple capacity estimation is important during the startup phase, it is unnecessary in the steady state. The buffer-based approach allows us to reduce the rebuffer rate by 10-20% compared to a commercial algorithm used in Netflix, while delivering a similar overall average video rate and a higher video rate in steady state.
Special Collections
Special Collections | Status |
---|---|
University Archives | Request on-site access |
3781 2014 H | In-library use |
11. Mean-field models in network game theory [electronic resource] [2014] Online
- Book
- 1 online resource.
In many contexts, the interactions between a large number of agents are governed by a network structure. Examples include online social networks, computer networks or societies in which each agent interacts only with a given set of individuals. In many situations those agents must make strategic decisions such as whether to form a certain opinion, whether to adopt a new product or technology, whether to get vaccinated or whether to invest in computer security solutions. While these so-called network games have many useful applications, they can pose important challenges in terms of tractability and place an unrealistic cognitive burden upon agents. In this thesis, we study such large network games and propose a mean-field equilibrium concept (MFE) as an approximation that allows for both tractability and realism in terms of the cognitive burden placed on agents. We study different applications. In Chapter 2, we study a network game of technology adoption. When a product or technology is first introduced, there is uncertainty about its value or quality. This quality can be learned by trying the product, at a risk. It can also be learned by letting others try it and free-riding on the information that they generate. We propose a class of dynamic games to study the adoption of technologies of uncertain value, when agents are connected by a network. This class of games allows for referral incentives, whereby an agent can earn rewards by encouraging his neighbors to adopt. We derive a mean-field equilibrium (MFE) and show that a pricing policy that involves referral incentives leads to a double-threshold strategy by which both low and high-degree agents may choose to experiment with the technology of uncertain value whereas the middle-degree agents free-ride on the information revealed by that experimentation. We characterize how different dynamic pricing mechanisms affect the pattern of early/late adoption and information diffusion. Pricing mechanisms that allow a monopolist to guarantee early adoption by agents of high or low degrees are proposed. We illustrate how referral incentives can be preferable on certain networks while inter-temporal price discrimination (i.e. price discounts) may be preferable on others. In Chapter 3, we study cascading failures in networks and the incentives that agents have to invest in costly protection against failure. A finite set of agents are connected through a network and can fail either intrinsically or as a result of the failure of a subset of their neighbors. Particular applications of this game include vaccination, investment in computer security solutions, airport security as well as investments in cash buffers in an interbank system. We derive a Bayes-Nash equilibrium in which agents form a belief about the joint probability of failure of their neighbors. We characterize the equilibrium based on an agent's effective probability of failure and derive conditions under which equilibrium strategies are monotone in degree. We show that this monotonicity is reversed, depending on whether the investment in protection insulates an agent against the failure of his neighbors or just against his own intrinsic failure. The former case defines a game of strategic substitutes in which some agents free-ride on the investment in protection of others, while the latter case defines a game of strategic complements in which agents pool their investments in protection. When failure risk is increasing in degree, protection against the failure of neighbors induces more investment by higher-degree agents whereas protection against intrinsic failure induces more investment by lower-degree agents. Welfare implications are discussed. Finally in Chapter 4, given the difficulty of expressing the beliefs in the previous chapter, we introduce a bounded-rationality solution concept as an approximation: a mean-field equilibrium (MFE). Agents simply consider a mean-field approximation of the cascading process when making their decision of whether to invest in protection. We study the game with binary actions --- where an agent can make a costly investment in protection against failure --- and show that the equilibria are characterized by thresholds in degree, above or below which agents choose to invest. When the costly investment protects them against the failure of their neighbors, the equilibrium is unique, whereas when it protects them only against their own intrinsic failure, there can be multiple equilibria. The mean-field model conveniently allows for comparative statics in terms of the degree distribution. It is shown that a more interconnected system can either induce more or less investment in protection, depending on whether the failure risk is decreasing or increasing in degree.
In many contexts, the interactions between a large number of agents are governed by a network structure. Examples include online social networks, computer networks or societies in which each agent interacts only with a given set of individuals. In many situations those agents must make strategic decisions such as whether to form a certain opinion, whether to adopt a new product or technology, whether to get vaccinated or whether to invest in computer security solutions. While these so-called network games have many useful applications, they can pose important challenges in terms of tractability and place an unrealistic cognitive burden upon agents. In this thesis, we study such large network games and propose a mean-field equilibrium concept (MFE) as an approximation that allows for both tractability and realism in terms of the cognitive burden placed on agents. We study different applications. In Chapter 2, we study a network game of technology adoption. When a product or technology is first introduced, there is uncertainty about its value or quality. This quality can be learned by trying the product, at a risk. It can also be learned by letting others try it and free-riding on the information that they generate. We propose a class of dynamic games to study the adoption of technologies of uncertain value, when agents are connected by a network. This class of games allows for referral incentives, whereby an agent can earn rewards by encouraging his neighbors to adopt. We derive a mean-field equilibrium (MFE) and show that a pricing policy that involves referral incentives leads to a double-threshold strategy by which both low and high-degree agents may choose to experiment with the technology of uncertain value whereas the middle-degree agents free-ride on the information revealed by that experimentation. We characterize how different dynamic pricing mechanisms affect the pattern of early/late adoption and information diffusion. Pricing mechanisms that allow a monopolist to guarantee early adoption by agents of high or low degrees are proposed. We illustrate how referral incentives can be preferable on certain networks while inter-temporal price discrimination (i.e. price discounts) may be preferable on others. In Chapter 3, we study cascading failures in networks and the incentives that agents have to invest in costly protection against failure. A finite set of agents are connected through a network and can fail either intrinsically or as a result of the failure of a subset of their neighbors. Particular applications of this game include vaccination, investment in computer security solutions, airport security as well as investments in cash buffers in an interbank system. We derive a Bayes-Nash equilibrium in which agents form a belief about the joint probability of failure of their neighbors. We characterize the equilibrium based on an agent's effective probability of failure and derive conditions under which equilibrium strategies are monotone in degree. We show that this monotonicity is reversed, depending on whether the investment in protection insulates an agent against the failure of his neighbors or just against his own intrinsic failure. The former case defines a game of strategic substitutes in which some agents free-ride on the investment in protection of others, while the latter case defines a game of strategic complements in which agents pool their investments in protection. When failure risk is increasing in degree, protection against the failure of neighbors induces more investment by higher-degree agents whereas protection against intrinsic failure induces more investment by lower-degree agents. Welfare implications are discussed. Finally in Chapter 4, given the difficulty of expressing the beliefs in the previous chapter, we introduce a bounded-rationality solution concept as an approximation: a mean-field equilibrium (MFE). Agents simply consider a mean-field approximation of the cascading process when making their decision of whether to invest in protection. We study the game with binary actions --- where an agent can make a costly investment in protection against failure --- and show that the equilibria are characterized by thresholds in degree, above or below which agents choose to invest. When the costly investment protects them against the failure of their neighbors, the equilibrium is unique, whereas when it protects them only against their own intrinsic failure, there can be multiple equilibria. The mean-field model conveniently allows for comparative statics in terms of the degree distribution. It is shown that a more interconnected system can either induce more or less investment in protection, depending on whether the failure risk is decreasing or increasing in degree.
Special Collections
Special Collections | Status |
---|---|
University Archives | Request on-site access |
3781 2014 L | In-library use |
12. Trust and mistrust in a networked society [electronic resource] [2013] Online
- Book
- 1 online resource.
Internet based socio-economic interactions often involve strangers, and as a result, are shrouded in varying degrees of mutual mistrust. Effectively combating this mistrust is critical to enabling trusted interactions on the Internet. This dissertation addresses the question of how to foster trust in Internet based socio-economic interactions in two contexts: preventing misbehavior such as spam, fraud, and free-riding in content-sharing systems (e.g., BitTorrent, Yelp, YouTube, email) and online marketplaces, and countering the apparent polarization in society through online social systems that influence the way people form opinions. Most online content sharing systems and marketplaces use centralized virtual currencies or reputation systems to prevent misbehavior and foster trust. We study an alternate model of decentralized currency called credit networks, that is based on trust between individuals, and hence is robust against the problem of cheap pseudonyms. Individuals in a credit network pay for goods and services by issuing their own IOUs (obligations), and express trust in terms of the number of IOUs they are committed to accept from each other. We analyze the liquidity---the ability to trade---in credit networks in terms of the long-term transaction failure probability when individuals repeatedly transact under a probabilistic transaction regime. We show analytically and using simulations that, surprisingly, liquidity in credit networks having a number of well-known network topologies is nearly identical to that in centralized currency systems. We also analyze the structural and economic properties of the credit networks that are formed when individuals strategically decide how much to trust each other. Our results help establish credit networks as a viable alternative to centralized reputation and virtual currency systems. While the Internet has helped bring us closer by making it easier to communicate, there is considerable anecdotal and empirical evidence that we are getting more polarized as a society. In particular, empirical studies have shown that homophily, i.e., greater interaction between like-minded individuals, results in polarization. We study polarization through a mathematical model of opinion formation in a social network. We adopt the view that polarization is not a property of a state of society; instead it is a property of the dynamics through which people form opinions. We say that opinion formation dynamics are polarizing if they result in an increased divergence of opinions. We show that DeGroot's well-known model of opinion formation based on repeated averaging can never be polarizing, even if individuals are arbitrarily homophilous. We generalize DeGroot's model to account for a phenomenon well-known in social psychology as confirmation bias: When presented with mixed or inconclusive evidence on a complex issue, individuals draw undue support for their initial position, thereby arriving at a more extreme opinion. We show that in a simple model of homophilous networks, our biased opinion formation process results in polarization if individuals are sufficiently biased. In other words, homophily alone, without confirmation bias, is not sufficient to polarize society. We discuss the implications of our analysis for the design of online social systems that influence the way people form opinions. In particular, we use confirmation bias as a framework to analyze the polarizing effect of Internet based recommender systems that deliver content tailored to our specific tastes. We also describe an online social platform called Widescope that we built to foster dialogue and enable consensus on government budgets and budget deficits. We report the results of two human subject experiments in which each subject created a federal budget on Widescope and then the subjects interacted in pairs to arrive at a greater consensus on the budget.
Internet based socio-economic interactions often involve strangers, and as a result, are shrouded in varying degrees of mutual mistrust. Effectively combating this mistrust is critical to enabling trusted interactions on the Internet. This dissertation addresses the question of how to foster trust in Internet based socio-economic interactions in two contexts: preventing misbehavior such as spam, fraud, and free-riding in content-sharing systems (e.g., BitTorrent, Yelp, YouTube, email) and online marketplaces, and countering the apparent polarization in society through online social systems that influence the way people form opinions. Most online content sharing systems and marketplaces use centralized virtual currencies or reputation systems to prevent misbehavior and foster trust. We study an alternate model of decentralized currency called credit networks, that is based on trust between individuals, and hence is robust against the problem of cheap pseudonyms. Individuals in a credit network pay for goods and services by issuing their own IOUs (obligations), and express trust in terms of the number of IOUs they are committed to accept from each other. We analyze the liquidity---the ability to trade---in credit networks in terms of the long-term transaction failure probability when individuals repeatedly transact under a probabilistic transaction regime. We show analytically and using simulations that, surprisingly, liquidity in credit networks having a number of well-known network topologies is nearly identical to that in centralized currency systems. We also analyze the structural and economic properties of the credit networks that are formed when individuals strategically decide how much to trust each other. Our results help establish credit networks as a viable alternative to centralized reputation and virtual currency systems. While the Internet has helped bring us closer by making it easier to communicate, there is considerable anecdotal and empirical evidence that we are getting more polarized as a society. In particular, empirical studies have shown that homophily, i.e., greater interaction between like-minded individuals, results in polarization. We study polarization through a mathematical model of opinion formation in a social network. We adopt the view that polarization is not a property of a state of society; instead it is a property of the dynamics through which people form opinions. We say that opinion formation dynamics are polarizing if they result in an increased divergence of opinions. We show that DeGroot's well-known model of opinion formation based on repeated averaging can never be polarizing, even if individuals are arbitrarily homophilous. We generalize DeGroot's model to account for a phenomenon well-known in social psychology as confirmation bias: When presented with mixed or inconclusive evidence on a complex issue, individuals draw undue support for their initial position, thereby arriving at a more extreme opinion. We show that in a simple model of homophilous networks, our biased opinion formation process results in polarization if individuals are sufficiently biased. In other words, homophily alone, without confirmation bias, is not sufficient to polarize society. We discuss the implications of our analysis for the design of online social systems that influence the way people form opinions. In particular, we use confirmation bias as a framework to analyze the polarizing effect of Internet based recommender systems that deliver content tailored to our specific tastes. We also describe an online social platform called Widescope that we built to foster dialogue and enable consensus on government budgets and budget deficits. We report the results of two human subject experiments in which each subject created a federal budget on Widescope and then the subjects interacted in pairs to arrive at a greater consensus on the budget.
Special Collections
Special Collections | Status |
---|---|
University Archives | Request on-site access |
3781 2013 D | In-library use |
13. Using packet histories to troubleshoot networks [electronic resource] [2013] Online
- Book
- 1 online resource.
Operating networks is hard. When a network goes down, network administrators have only a rudimentary set of tools at their disposal to track down the root cause of the outage. As networks have become more complicated, with more network protocols modifying the forwarding behavior below, and more application types running above, the debugging toolkit has remained essentially unchanged, with little or no innovation in years. Today, skilled network administrators frequently use manual, heuristic- driven procedures to configure and maintain networks. Humans are involved almost every time something goes wrong, and we are still far from an era of automated troubleshooting. In this dissertation, I show how packet histories--the full story of every packet's journey through the network--can simplify network diagnosis. A packet history is the route a packet takes through a network, combined with the switch state and header modifications it encounters at each switch on the route. Using packet history as the core construct, I propose an abstraction for systematic network troubleshooting, a framework with which to express the observed error symptoms and pose questions to the network. To demonstrate the usefulness of packet histories and the practical feasibility of constructing them, I built NetSight, an extensible platform that captures packet histories and enables applications to concisely and flexibly retrieve packet histories of interest. Atop NetSight I built four applications that illustrate its flexibility: an interactive network debugger, a live invariant monitor, a path-aware history logger, and a hierarchical network profiler. On a single modern multi-core server, NetSight can process packet histories for the traffic of multiple 10 Gb/s links. For larger networks, NetSight scales linearly with additional servers. To scale it even further to bandwidth-heavy enterprises and datacenter networks, I present two optimized NetSight variants using straightforward additions to switch ASICs and hypervisor-based switches.
Operating networks is hard. When a network goes down, network administrators have only a rudimentary set of tools at their disposal to track down the root cause of the outage. As networks have become more complicated, with more network protocols modifying the forwarding behavior below, and more application types running above, the debugging toolkit has remained essentially unchanged, with little or no innovation in years. Today, skilled network administrators frequently use manual, heuristic- driven procedures to configure and maintain networks. Humans are involved almost every time something goes wrong, and we are still far from an era of automated troubleshooting. In this dissertation, I show how packet histories--the full story of every packet's journey through the network--can simplify network diagnosis. A packet history is the route a packet takes through a network, combined with the switch state and header modifications it encounters at each switch on the route. Using packet history as the core construct, I propose an abstraction for systematic network troubleshooting, a framework with which to express the observed error symptoms and pose questions to the network. To demonstrate the usefulness of packet histories and the practical feasibility of constructing them, I built NetSight, an extensible platform that captures packet histories and enables applications to concisely and flexibly retrieve packet histories of interest. Atop NetSight I built four applications that illustrate its flexibility: an interactive network debugger, a live invariant monitor, a path-aware history logger, and a hierarchical network profiler. On a single modern multi-core server, NetSight can process packet histories for the traffic of multiple 10 Gb/s links. For larger networks, NetSight scales linearly with additional servers. To scale it even further to bandwidth-heavy enterprises and datacenter networks, I present two optimized NetSight variants using straightforward additions to switch ASICs and hypervisor-based switches.
Special Collections
Special Collections | Status |
---|---|
University Archives | Request on-site access |
3781 2013 H | In-library use |
14. Approximate equilibria in large games [electronic resource] [2012] Online
- Book
- 1 online resource.
The complexity of studying Nash equilibrium in large games often scales with the size of the system: as it increases, computing the exact Nash equilibrium can soon become intractable. However, when the number of players in the system approaches infinity and no individual player has a significant impact on the system, we can approximate the system by considering each single player no longer playing against other individual players but a single aggregation of all other players. In this paper, we apply this idea to study and approximate Nash equilibria in two large scale games. In part I, we consider a model of priced resource sharing that combines both queueing behavior and strategic behavior. We study a priority service model where a single server allocates its capacity to agents in proportion to their payment to the system, and users from different classes act to minimize the sum of their cost for processing delay and payment. As the exact processing time of this system is hard to compute and cannot be characterized in closed form, we introduce the concept of aggregate equilibrium to approximate the exact Nash equilibrium, by assuming each individual player plays against a common aggregate priority that characterizes the large system. We then introduce the notion of heavy traffic equilibrium as an alternative approximation of the Nash equilibrium, derived by considering the asymptotic regime where the system load approaches capacity. We show that both aggregate equilibrium and heavy traffic equilibrium are asymptotically exact in heavy traffic. We present some numerical results for both approximate equilibria, and discuss efficiency and revenue, and in particular provide a bound for the price of anarchy of the heavy traffic equilibrium. In part II, we study the reputation system of large scale online marketplace. We develop a large market model to study reputation mechanisms in online marketplaces. We consider two types of sellers: commitment sellers, who are intrinsically honest but may be unable to accurately describe items because of limited expertise; and strategic sellers, who are driven by a profit maximization motive. We focus on stationary equilibria for this dynamic market, in particular, on separating equilibria where strategic sellers are incentivized to describe the items they have for sale truthfully, and characterize the conditions under which such equilibria exist. We then complement our theoretical results with computational analysis and provide insights on the features of markets that may incentivize truthfulness in equilibrium.
The complexity of studying Nash equilibrium in large games often scales with the size of the system: as it increases, computing the exact Nash equilibrium can soon become intractable. However, when the number of players in the system approaches infinity and no individual player has a significant impact on the system, we can approximate the system by considering each single player no longer playing against other individual players but a single aggregation of all other players. In this paper, we apply this idea to study and approximate Nash equilibria in two large scale games. In part I, we consider a model of priced resource sharing that combines both queueing behavior and strategic behavior. We study a priority service model where a single server allocates its capacity to agents in proportion to their payment to the system, and users from different classes act to minimize the sum of their cost for processing delay and payment. As the exact processing time of this system is hard to compute and cannot be characterized in closed form, we introduce the concept of aggregate equilibrium to approximate the exact Nash equilibrium, by assuming each individual player plays against a common aggregate priority that characterizes the large system. We then introduce the notion of heavy traffic equilibrium as an alternative approximation of the Nash equilibrium, derived by considering the asymptotic regime where the system load approaches capacity. We show that both aggregate equilibrium and heavy traffic equilibrium are asymptotically exact in heavy traffic. We present some numerical results for both approximate equilibria, and discuss efficiency and revenue, and in particular provide a bound for the price of anarchy of the heavy traffic equilibrium. In part II, we study the reputation system of large scale online marketplace. We develop a large market model to study reputation mechanisms in online marketplaces. We consider two types of sellers: commitment sellers, who are intrinsically honest but may be unable to accurately describe items because of limited expertise; and strategic sellers, who are driven by a profit maximization motive. We focus on stationary equilibria for this dynamic market, in particular, on separating equilibria where strategic sellers are incentivized to describe the items they have for sale truthfully, and characterize the conditions under which such equilibria exist. We then complement our theoretical results with computational analysis and provide insights on the features of markets that may incentivize truthfulness in equilibrium.
Special Collections
Special Collections | Status |
---|---|
University Archives | Request on-site access |
3781 2012 W | In-library use |
15. Incentive mechanisms for societal networks [electronic resource] [2012] Online
- Book
- 1 online resource.
A "societal network" is an interconnected structure of resources and consumers that drives a societal process. Examples include roads, electricity grids, health-care systems, and waste management networks. Societal networks are plagued by two pressing problems: (i) the demand for resources exceeds their supply, especially at "peak'' hours, and (ii) inefficiencies due to wastage are highly permeable, i.e., while the loss to an individual is very small, the overall loss to the societal network as a whole can be substantial. This suggests that small changes in the behavior of each user can lead to significant gains in the efficiency of societal networks. More importantly, such a change in each user is necessary to combat systemic problems such as congestion. There is a strong urgency to the problems in societal networks -- the scale of the problems has reached epic proportions. Fortunately, help is at hand. Over the past few decades, there have been great technological inventions such as the creation of the global Internet, the Web, cellular telephony and cloud computing. Now is the right time to address the problems in societal networks making use of the ubiquitous presence of smart sensing technology. This thesis presents a first attempt at that -- making use of a variety of sensing technology to monitor user behavior, informing the users about their actions and incentivizing them to adopt a desirable behavior. This thesis develops a general approach to affect change in the behavior of the users of societal networks. At the first level, it is based on fine-grained sensing of user behavior and communicating this information to a platform, and using incentives of monetary, social and other kinds to change user behavior. At the second level, the approach is concerned with the design of low-cost and accurate sensors, and backend algorithms to inform, engage and incentivize users to affect the desired behavior change. We describe some pilot projects which we have conducted to test and verify the effectiveness of this approach. The first project, INSTANT (Infosys-Stanford Traffic project), was conducted over 6 months in Bangalore, India, to incentivize roughly 14,000 commuters of Infosys Technologies to shift their commute to off-peak hours. The second project is a recycling experiment conducted over one week at Stanford to incentivize the members of the Stanford community to recycle. Third, we outline our contributions to Steptacular, an employee wellness program run at Accenture, USA. Finally, we describe CAPRI (Congestion and Parking Relief Incentives), a research project currently underway at Stanford University to incentivize off-peak travel and "off-center" parking.
A "societal network" is an interconnected structure of resources and consumers that drives a societal process. Examples include roads, electricity grids, health-care systems, and waste management networks. Societal networks are plagued by two pressing problems: (i) the demand for resources exceeds their supply, especially at "peak'' hours, and (ii) inefficiencies due to wastage are highly permeable, i.e., while the loss to an individual is very small, the overall loss to the societal network as a whole can be substantial. This suggests that small changes in the behavior of each user can lead to significant gains in the efficiency of societal networks. More importantly, such a change in each user is necessary to combat systemic problems such as congestion. There is a strong urgency to the problems in societal networks -- the scale of the problems has reached epic proportions. Fortunately, help is at hand. Over the past few decades, there have been great technological inventions such as the creation of the global Internet, the Web, cellular telephony and cloud computing. Now is the right time to address the problems in societal networks making use of the ubiquitous presence of smart sensing technology. This thesis presents a first attempt at that -- making use of a variety of sensing technology to monitor user behavior, informing the users about their actions and incentivizing them to adopt a desirable behavior. This thesis develops a general approach to affect change in the behavior of the users of societal networks. At the first level, it is based on fine-grained sensing of user behavior and communicating this information to a platform, and using incentives of monetary, social and other kinds to change user behavior. At the second level, the approach is concerned with the design of low-cost and accurate sensors, and backend algorithms to inform, engage and incentivize users to affect the desired behavior change. We describe some pilot projects which we have conducted to test and verify the effectiveness of this approach. The first project, INSTANT (Infosys-Stanford Traffic project), was conducted over 6 months in Bangalore, India, to incentivize roughly 14,000 commuters of Infosys Technologies to shift their commute to off-peak hours. The second project is a recycling experiment conducted over one week at Stanford to incentivize the members of the Stanford community to recycle. Third, we outline our contributions to Steptacular, an employee wellness program run at Accenture, USA. Finally, we describe CAPRI (Congestion and Parking Relief Incentives), a research project currently underway at Stanford University to incentivize off-peak travel and "off-center" parking.
Special Collections
Special Collections | Status |
---|---|
University Archives | Request on-site access |
3781 2012 M | In-library use |
16. Learning graphical models fundamental limits and efficient algorithms [electronic resource] [2012] Online
- Book
- 1 online resource.
Graphical models provide a flexible and yet powerful language to describe high dimensional probability distributions. Over the last 20 years, graphical models methods have been successfully applied to a broad range of problems, from computer vision to error correcting codes to computational biology, to name a few. In a graphical model, the graph structure characterizes the conditional independence properties of the underlying probability distributions. Roughly speaking, it encodes key information about which variables influence each other. It allows us to answer questions of the type: are variables X and Y dependent because they 'interact' directly, or because they are both dependent on a third variable Z? In many applications, this information has utmost practical importance, and it is therefore crucial to develop efficient algorithms to learn the graphical structure from data. This problem is largely unsolved and for a long time several heuristics have been used without a solid theoretical foundation in place. In the first part of this work, we consider the problem of learning the structure of Ising models (pairwise binary Markov random fields) from i.i.d. samples. While several methods have been proposed to accomplish this task, their relative merits and limitations remain somewhat obscure. By analyzing a number of concrete examples, we show that low-complexity algorithms often fail when the Markov random field develops long-range correlations. More precisely, this phenomenon appears to be related to the Ising model phase transition (although it does not coincide with it). An example of an important algorithm that exhibits this behavior is the L1- regularized logistic regression estimator introduced by Ravikumar et al. Ravikumar et al. proved a set sufficient conditions under which this algorithm exactly learns Ising models, the most interesting being a so-called incoherence condition. In this thesis we show that this incoherence condition is also necessary and analytically establish whether it holds for several families of graphs. In particular, denoting by [theta] the edge strength and by \Delta the maximum degree, we prove that regularized logistic regression succeeds on any graph with \Delta [less than or equal to] 3/(10 \theta) and fails on most regular graphs with \Delta [greater than or equal to] 2 / \theta. In the second part of this work, we address the important scenario in which data is not composed of i.i.d. samples. We focus on the problem of learning the drift coefficient of a p-dimensional stochastic differential equation (SDE) from a sample path of length T. We assume that the drift is parametrized by a high-dimensional vector, and study the support recovery problem in the case where p is allowed to grow with T. In particular, we describe a general lower bound on the sample-complexity T by using a characterization of mutual information as a time integral of conditional variance, due to Kadota, Zakai, and Ziv. For linear SDEs, the drift coefficient is parametrized by a p-by-p matrix which describes which degrees of freedom interact under the dynamics. In this case, we analyze an L1-regularized least-squares estimator and describe an upper bound on T that nearly matches the lower bound on specific classes of sparse matrices. We describe how this same algorithm can be used to learn non-linear SDEs and in addition show by means of a numerical experiment why one should expect the sample-complexity to be of the same order as that for linear SDEs.
Graphical models provide a flexible and yet powerful language to describe high dimensional probability distributions. Over the last 20 years, graphical models methods have been successfully applied to a broad range of problems, from computer vision to error correcting codes to computational biology, to name a few. In a graphical model, the graph structure characterizes the conditional independence properties of the underlying probability distributions. Roughly speaking, it encodes key information about which variables influence each other. It allows us to answer questions of the type: are variables X and Y dependent because they 'interact' directly, or because they are both dependent on a third variable Z? In many applications, this information has utmost practical importance, and it is therefore crucial to develop efficient algorithms to learn the graphical structure from data. This problem is largely unsolved and for a long time several heuristics have been used without a solid theoretical foundation in place. In the first part of this work, we consider the problem of learning the structure of Ising models (pairwise binary Markov random fields) from i.i.d. samples. While several methods have been proposed to accomplish this task, their relative merits and limitations remain somewhat obscure. By analyzing a number of concrete examples, we show that low-complexity algorithms often fail when the Markov random field develops long-range correlations. More precisely, this phenomenon appears to be related to the Ising model phase transition (although it does not coincide with it). An example of an important algorithm that exhibits this behavior is the L1- regularized logistic regression estimator introduced by Ravikumar et al. Ravikumar et al. proved a set sufficient conditions under which this algorithm exactly learns Ising models, the most interesting being a so-called incoherence condition. In this thesis we show that this incoherence condition is also necessary and analytically establish whether it holds for several families of graphs. In particular, denoting by [theta] the edge strength and by \Delta the maximum degree, we prove that regularized logistic regression succeeds on any graph with \Delta [less than or equal to] 3/(10 \theta) and fails on most regular graphs with \Delta [greater than or equal to] 2 / \theta. In the second part of this work, we address the important scenario in which data is not composed of i.i.d. samples. We focus on the problem of learning the drift coefficient of a p-dimensional stochastic differential equation (SDE) from a sample path of length T. We assume that the drift is parametrized by a high-dimensional vector, and study the support recovery problem in the case where p is allowed to grow with T. In particular, we describe a general lower bound on the sample-complexity T by using a characterization of mutual information as a time integral of conditional variance, due to Kadota, Zakai, and Ziv. For linear SDEs, the drift coefficient is parametrized by a p-by-p matrix which describes which degrees of freedom interact under the dynamics. In this case, we analyze an L1-regularized least-squares estimator and describe an upper bound on T that nearly matches the lower bound on specific classes of sparse matrices. We describe how this same algorithm can be used to learn non-linear SDEs and in addition show by means of a numerical experiment why one should expect the sample-complexity to be of the same order as that for linear SDEs.
Special Collections
Special Collections | Status |
---|---|
University Archives | Request on-site access |
3781 2012 B | In-library use |
17. Network models and infectious disease control [electronic resource] : analysis and insights [2012] Online
- Book
- 1 online resource.
Many models of infectious disease do not explicitly consider the underlying contact network through which the disease spreads. However, network structure can greatly influence the dynamics of an epidemic and has implications for infectious disease control policies. This dissertation explores two important themes of network science as it relates to infectious disease control policy: first, how to incorporate network structure into the modeling and evaluation of infectious disease control policies; and second, how to leverage network structure in the design of optimal control policies. Complete network information is rarely available. Furthermore, even when network data is available, it generally only represents a snapshot in time and does not capture population or network dynamics (e.g., the formation and dissolution of contacts). In Chapter 2, we address these data limitations and describe a general simulation framework for modeling the spread of infectious disease in a dynamic contact network. We provide a methodology for inferring important model parameters, such as those governing network structure and network dynamics, from readily available data sources. In Chapter 3, we present an application of this framework in evaluating the effectiveness and cost-effectiveness of mass media campaigns aimed at reducing concurrent sexual partnerships in sub-Saharan Africa for HIV prevention. In Chapter 4, we address the problem of leveraging network structure in designing infectious disease control policies. In particular, we consider the problem of identifying which links to remove from a contact network in order to maximize the number of individuals who are protected from infection. We show that this problem can be posed as a non-convex quadratically-constrained quadratic program (QCQP), from which a link removal algorithm can be derived. Evaluation of the QCQP algorithm on standard network models demonstrates that it exhibits near-optimal performance and outperforms other intuitive link removal algorithms, such as removing links in order of edge centrality.
Many models of infectious disease do not explicitly consider the underlying contact network through which the disease spreads. However, network structure can greatly influence the dynamics of an epidemic and has implications for infectious disease control policies. This dissertation explores two important themes of network science as it relates to infectious disease control policy: first, how to incorporate network structure into the modeling and evaluation of infectious disease control policies; and second, how to leverage network structure in the design of optimal control policies. Complete network information is rarely available. Furthermore, even when network data is available, it generally only represents a snapshot in time and does not capture population or network dynamics (e.g., the formation and dissolution of contacts). In Chapter 2, we address these data limitations and describe a general simulation framework for modeling the spread of infectious disease in a dynamic contact network. We provide a methodology for inferring important model parameters, such as those governing network structure and network dynamics, from readily available data sources. In Chapter 3, we present an application of this framework in evaluating the effectiveness and cost-effectiveness of mass media campaigns aimed at reducing concurrent sexual partnerships in sub-Saharan Africa for HIV prevention. In Chapter 4, we address the problem of leveraging network structure in designing infectious disease control policies. In particular, we consider the problem of identifying which links to remove from a contact network in order to maximize the number of individuals who are protected from infection. We show that this problem can be posed as a non-convex quadratically-constrained quadratic program (QCQP), from which a link removal algorithm can be derived. Evaluation of the QCQP algorithm on standard network models demonstrates that it exhibits near-optimal performance and outperforms other intuitive link removal algorithms, such as removing links in order of edge centrality.
18. Strategic and adaptive execution [electronic resource] [2012] Online
- Book
- 1 online resource.
First, we consider a trader who aims to liquidate a large position in the presence of an arbitrageur who hopes to profit from the trader's activity. The arbitrageur is uncertain about the trader's position and learns from observed price fluctuations. This is a dynamic game with asymmetric information. We present an algorithm for computing perfect Bayesian equilibrium behavior and conduct numerical experiments. Our results demonstrate that the trader's strategy differs significantly from one that would be optimal in the absence of the arbitrageur. In particular, the trader must balance the conflicting desires of minimizing price impact and minimizing information that is signaled through trading. Accounting for information signaling and the presence of strategic adversaries can greatly reduce execution costs. Second, we consider a model in which a trader seeks to maximize expected risk-adjusted profit while trading a single security. In our second model, there is no arbitrageur, and each price change is a linear combination of observed factors, impact resulting from the trader's current and prior activity, and unpredictable random effects. The trader must learn coefficients of a price impact model while trading. We propose a new method for simultaneous execution and learning -- the confidence-triggered regularized adaptive certainty equivalent (CTRACE) policy -- and establish a polylogarithmic finite-time expected regret bound. This bound implies that CTRACE is efficient in the sense that the ([epsilon], [delta])-convergence time is bounded by a polynomial function of 1/[epsilon] and log(1/[delta]) with high probability. In addition, we demonstrate via Monte Carlo simulation that CTRACE outperforms the certainty equivalent policy and a recently proposed reinforcement learning algorithm that is designed to explore efficiently in linear-quadratic control problems.
First, we consider a trader who aims to liquidate a large position in the presence of an arbitrageur who hopes to profit from the trader's activity. The arbitrageur is uncertain about the trader's position and learns from observed price fluctuations. This is a dynamic game with asymmetric information. We present an algorithm for computing perfect Bayesian equilibrium behavior and conduct numerical experiments. Our results demonstrate that the trader's strategy differs significantly from one that would be optimal in the absence of the arbitrageur. In particular, the trader must balance the conflicting desires of minimizing price impact and minimizing information that is signaled through trading. Accounting for information signaling and the presence of strategic adversaries can greatly reduce execution costs. Second, we consider a model in which a trader seeks to maximize expected risk-adjusted profit while trading a single security. In our second model, there is no arbitrageur, and each price change is a linear combination of observed factors, impact resulting from the trader's current and prior activity, and unpredictable random effects. The trader must learn coefficients of a price impact model while trading. We propose a new method for simultaneous execution and learning -- the confidence-triggered regularized adaptive certainty equivalent (CTRACE) policy -- and establish a polylogarithmic finite-time expected regret bound. This bound implies that CTRACE is efficient in the sense that the ([epsilon], [delta])-convergence time is bounded by a polynomial function of 1/[epsilon] and log(1/[delta]) with high probability. In addition, we demonstrate via Monte Carlo simulation that CTRACE outperforms the certainty equivalent policy and a recently proposed reinforcement learning algorithm that is designed to explore efficiently in linear-quadratic control problems.
Special Collections
Special Collections | Status |
---|---|
University Archives | Request on-site access |
3781 2012 P | In-library use |
19. Adaptive decision making in information networks [electronic resource] [2011] Online
- Book
- 1 online resource.
The goal of this thesis is to develop techniques that enable us to analyze network precesses in which nodes of the network make decisions based on local and limited information. Such processes arise naturally in many information networks such as the World Wide Web (WWW), online social networks, and sensor networks. First we study a stochastic version of the online bipartite matching problem in which the sequence of arrivals are i.i.d samples of a given distribution. We use the ideas of offline statistics and adaptivity to give the first algorithm for this problem that improves the 1 − 1/e competitive ratio of the celebrated algorithm by Karp, Vazirani, and Vazirani, for the stochastic version in its general form. We also model and study the evolution of cooperation in populations with local interactions. Every node of the network corresponds on an individual choosing whether to cooperate or defect in a repeated prisoner's dilemma game. The players try to learn the best action by imitating those neighbors who have higher payoffs. We show that when the interactions take place on graphs with large girth, cooperation is more likely to emerge as a result of local clustering of cooperators. We also consider the practical problem of constructing well-connected sensor networks by locally moving the sensors. We design decentralized algorithms in which sensors identify the critical edges and then move in directions that improve connectivity in more critical regions. In our algorithms, nodes determine their movement direction only with local information exchange with their neighbors.
The goal of this thesis is to develop techniques that enable us to analyze network precesses in which nodes of the network make decisions based on local and limited information. Such processes arise naturally in many information networks such as the World Wide Web (WWW), online social networks, and sensor networks. First we study a stochastic version of the online bipartite matching problem in which the sequence of arrivals are i.i.d samples of a given distribution. We use the ideas of offline statistics and adaptivity to give the first algorithm for this problem that improves the 1 − 1/e competitive ratio of the celebrated algorithm by Karp, Vazirani, and Vazirani, for the stochastic version in its general form. We also model and study the evolution of cooperation in populations with local interactions. Every node of the network corresponds on an individual choosing whether to cooperate or defect in a repeated prisoner's dilemma game. The players try to learn the best action by imitating those neighbors who have higher payoffs. We show that when the interactions take place on graphs with large girth, cooperation is more likely to emerge as a result of local clustering of cooperators. We also consider the practical problem of constructing well-connected sensor networks by locally moving the sensors. We design decentralized algorithms in which sensors identify the critical edges and then move in directions that improve connectivity in more critical regions. In our algorithms, nodes determine their movement direction only with local information exchange with their neighbors.
Special Collections
Special Collections | Status |
---|---|
University Archives | Request on-site access |
3781 2011 M | In-library use |
20. An algebraic geometry based approach to decentralized control [electronic resource] [2011] Online
- Book
- 1 online resource.
Decentralized control has been one of the important problems in systems and control engineering. Computing an optimal decentralized controller for general linear systems, however, is known to be a very challenging task. In particular, designing an optimal decentralized controller in the standard framework of a linear system with quadratic cost and Gaussian noise is well known to be extremely hard even in very simple and small sized problems. Because of this fact, previous work has focused on characterizing several different classes of problems for which an optimal decentralized controller may be efficiently computed. The set of quadratically invariant problems is one of the largest known class of such problems. This dissertation provides a novel, general, and powerful framework for addressing decentralized control by introducing the idea of using rational elimination theory of algebraic geometry. We show that, in certain cases, this approach reduces the set of closed-loop maps of decentralized control to the solution set of a collection of linear equations. We show how to use these linear equations to find an optimal decentralized controller. We also prove that if a system is quadratically invariant then under an appropriate technical condition the resulting elimination set is affine. We further illustrate that our approach can be well applied to a strictly larger class of decentralized control problem than the quadratically invariant one by presenting a simple example: the example shows that there are problems which are not quadratically invariant but for which the resulting elimination description is affine.
Decentralized control has been one of the important problems in systems and control engineering. Computing an optimal decentralized controller for general linear systems, however, is known to be a very challenging task. In particular, designing an optimal decentralized controller in the standard framework of a linear system with quadratic cost and Gaussian noise is well known to be extremely hard even in very simple and small sized problems. Because of this fact, previous work has focused on characterizing several different classes of problems for which an optimal decentralized controller may be efficiently computed. The set of quadratically invariant problems is one of the largest known class of such problems. This dissertation provides a novel, general, and powerful framework for addressing decentralized control by introducing the idea of using rational elimination theory of algebraic geometry. We show that, in certain cases, this approach reduces the set of closed-loop maps of decentralized control to the solution set of a collection of linear equations. We show how to use these linear equations to find an optimal decentralized controller. We also prove that if a system is quadratically invariant then under an appropriate technical condition the resulting elimination set is affine. We further illustrate that our approach can be well applied to a strictly larger class of decentralized control problem than the quadratically invariant one by presenting a simple example: the example shows that there are problems which are not quadratically invariant but for which the resulting elimination description is affine.
Special Collections
Special Collections | Status |
---|---|
University Archives | Request on-site access |
3781 2011 S | In-library use |
Looking for different results?
Modify your search: Search all fields
Search elsewhere: Search WorldCat Search library website