(summaries of and key takeaways from two papers I read in December)
Paper: Three States and a Plan: The A.I. of F.E.A.R. (this was the first game design paper I’ve read and it was pretty awesome, combining two of my Computer Science interests: graph theory and A.I.)
- Enemy A.I. in F.E.A.R. = an FSM to express states + A* to plan the sequence of actions that reaches a goal state.
- Separating goals from how the goals can be achieved (i.e. actions) leads to less complex code, makes code reusable, and makes it easier to compose code into more complex systems.
- The planning system in F.E.A.R. is called Goal-Oriented Action Planning (GOAP) and is based on STRIPS with several modifications.
- A* is used to find the least-cost sequence of actions that reaches a goal state. It searches a graph in which the nodes are states of the world and the edges are actions that change the world from one state to another.
- Each action's preconditions and effects are represented both as a fixed-size array capturing the state of the world and as procedural functions.
- Squad behavior is implemented by periodically clustering A.I. characters that are in close physical proximity and issuing squad orders. These orders are simply goals that each A.I. character prioritizes (relative to its current goals) and satisfies if appropriate.
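
To make the planning idea above concrete, here is a minimal sketch of A* over world states in Python. It is my own illustration, not code from the paper: the `Action` class, the `plan` function, the fact names, and the costs are all made up, and it searches forward from the current state for simplicity, which may differ from how the game's planner is built. The heuristic (the number of unmet goal facts) only guarantees a least-cost plan when every action costs at least 1 and fixes at most one goal fact, which holds for this toy example.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Action:
    """A hypothetical action: facts it needs, facts it changes, and a cost."""
    name: str
    preconditions: dict = field(default_factory=dict)
    effects: dict = field(default_factory=dict)
    cost: float = 1.0


def satisfies(state, conditions):
    """True if every required fact has the required value (missing = False)."""
    return all(state.get(k, False) == v for k, v in conditions.items())


def unmet(state, goal):
    """Heuristic: how many goal facts are not yet satisfied."""
    return sum(1 for k, v in goal.items() if state.get(k, False) != v)


def plan(start, goal, actions):
    """A* over world states; returns a list of action names, or None."""
    tie = count()  # tie-breaker so the heap never has to compare dicts
    frontier = [(unmet(start, goal), next(tie), 0.0, start, [])]
    best_cost = {frozenset(start.items()): 0.0}
    while frontier:
        _, _, cost, state, steps = heapq.heappop(frontier)
        if satisfies(state, goal):
            return steps
        for action in actions:
            if not satisfies(state, action.preconditions):
                continue
            new_state = {**state, **action.effects}  # apply the effects
            new_cost = cost + action.cost
            key = frozenset(new_state.items())
            if new_cost < best_cost.get(key, float("inf")):
                best_cost[key] = new_cost
                priority = new_cost + unmet(new_state, goal)
                heapq.heappush(
                    frontier,
                    (priority, next(tie), new_cost, new_state, steps + [action.name]),
                )
    return None  # no sequence of actions reaches the goal


# Made-up example world: shooting takes three cheap steps, melee is one costly step.
actions = [
    Action("draw_weapon", {}, {"armed": True}, cost=1),
    Action("load_weapon", {"armed": True}, {"loaded": True}, cost=1),
    Action("attack", {"armed": True, "loaded": True}, {"target_dead": True}, cost=2),
    Action("melee", {}, {"target_dead": True}, cost=5),
]
start = {"armed": False, "loaded": False, "target_dead": False}
goal = {"target_dead": True}

print(plan(start, goal, actions))  # ['draw_weapon', 'load_weapon', 'attack']
```

The goal only says what should be true (`target_dead`), and the planner works out how to get there. Lowering the cost of `melee` below 4 would flip the plan to a single melee attack without touching the goal or the other actions, which is the decoupling the paper argues for.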
Paper: Kraken: Leveraging Live Traffic Tests to Identify and Resolve Resource Utilization Bottlenecks in Large Scale Web Services
- Kraken is a system that load tests production systems (data centers or services) at Facebook by diverting live user traffic to the systems under test. It monitors metrics like p99 latency and 5xx error rates to decide whether traffic to the system under test should be increased or decreased, and by how much.
- Real user traffic is the most representative load for your system. By using real user traffic to test production systems, you don’t have to worry about capturing the complex system dependencies and interactions that arise out of a service-oriented architecture (SOA).
- Kraken diverts traffic by modifying three kinds of weights: edge weights (from POPs to data centers), cluster weights (from web frontend cluster load balancers to the web frontend clusters), and server weights (from service load balancers to the individual servers that make up the service).
- Kraken reads test input and updates configuration files that are read by Proxygen to implement the edge and cluster weighting. Kraken then reads system metrics from Gorilla to dynamically determine how to adjust the edge and cluster weights based on how the system under test is performing.
- Kraken tests allow Facebook to measure a server’s, cluster’s, and region’s capacity.
- Kraken helps increase system utilization by exposing bottlenecks. By analyzing system metrics and how they change under different levels of load, Facebook was able to fix problems in its systems. One of the issues identified was poor load balancing, which was addressed with pick-2 load balancing.
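
The feedback loop described above (shift traffic, watch the metrics, adjust) can be sketched in a few lines of Python. This is only my illustration of the idea, not Kraken's actual algorithm: the function names, the thresholds (500 ms p99, 1% errors), the step sizes, and the `fake_measure` stand-in for real metrics are all assumptions I made up.

```python
MAX_P99_MS = 500.0      # assumed latency ceiling, not from the paper
MAX_ERROR_RATE = 0.01   # assumed 5xx ceiling (1%), not from the paper


def is_healthy(p99_ms, error_rate):
    """True while both metrics stay under their assumed ceilings."""
    return p99_ms <= MAX_P99_MS and error_rate <= MAX_ERROR_RATE


def next_weight(weight, p99_ms, error_rate, step=0.05, backoff=0.10):
    """Raise the traffic weight while healthy, back off when a limit is hit."""
    if is_healthy(p99_ms, error_rate):
        return min(1.0, weight + step)
    return max(0.0, weight - backoff)


def run_load_test(measure, start=0.10, rounds=20):
    """Drive the loop and return the highest weight observed to be healthy."""
    weight, capacity = start, 0.0
    for _ in range(rounds):
        p99_ms, error_rate = measure(weight)
        if is_healthy(p99_ms, error_rate):
            capacity = max(capacity, weight)
        weight = next_weight(weight, p99_ms, error_rate)
    return capacity


def fake_measure(weight):
    """Stand-in for real metrics: latency grows linearly with traffic share."""
    return 190.0 + 500.0 * weight, 0.001


print(round(run_load_test(fake_measure), 2))  # 0.6
```

In this toy model latency crosses the ceiling between a weight of 0.60 and 0.65, so the loop settles on 0.6 as the measured capacity. A real system would read its metrics from monitoring rather than a formula.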