https://optimization.cbe.cornell.edu/api.php?action=feedcontributions&user=Khaledfahat&feedformat=atom Cornell University Computational Optimization Open Textbook - Optimization Wiki - User contributions [en] 2021-06-14T00:24:27Z User contributions MediaWiki 1.35.0 https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2635 Set covering problem 2020-12-15T02:49:55Z <p>Khaledfahat: /* Approximation via LP relaxation and rounding */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> &lt;br&gt;<br /> Steward: Allen Yang, Fengqi You<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. Given a collection of elements and a family of sets over them, the set covering problem aims to find the minimum number of sets that together contain (cover) all of these elements. &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> The importance of the set covering problem has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a prime example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion. &lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;seven&quot;&gt; P. Slavı́k, [https://www.sciencedirect.com/science/article/abs/pii/S0196677497908877 &quot;A Tight Analysis of the Greedy Algorithm for Set Cover],&quot; ''Journal of Algorithms'', vol. 25, pp. 237-245, 1997. 
&lt;/ref&gt; &lt;ref name=&quot;nine&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?],&quot; ''Operations Research Letters'', vol. 44, pp. 366-369, 2016. &lt;/ref&gt;<br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s. &lt;ref name=&quot;two&quot;&gt; J. Rubin, [https://www.jstor.org/stable/25767684?seq=1 &quot;A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling],&quot; ''Transportation Science'', vol. 7, pp. 34-48, 1973. &lt;/ref&gt;<br /> <br /> == Problem formulation ==<br /> In the set covering problem, two sets are given: a set &lt;math&gt; U &lt;/math&gt; of elements and a set &lt;math&gt; S &lt;/math&gt; of subsets of the set &lt;math&gt; U &lt;/math&gt;. Each subset in &lt;math&gt; S &lt;/math&gt; is associated with a predetermined cost, and the union of all the subsets covers the set &lt;math&gt; U &lt;/math&gt;. This combinatorial problem then concerns finding a sub-collection of these subsets whose union covers the universal set at minimum total cost.&lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt; &lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. 
“The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;<br /> <br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i&lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. &lt;math&gt;\bigcup_{i=1}^n s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i&lt;/math&gt; such that &lt;math&gt; c_i &gt; 0&lt;/math&gt;. The objective is to find the minimum cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:&lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. 
&lt;/ref&gt;<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i:\, u_j \in s_i} y_i \geq 1, \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when all costs are equal, this is the same as minimizing the number of subsets used. The first constraint implies that every element &lt;math&gt; u_j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; must be covered by at least one selected subset, and the second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary, which means that every set is either in the set cover or not.<br /> <br /> Set covering problems are NP-hard optimization problems, so no polynomial-time exact algorithm is known and the running time of exact methods can grow exponentially with the problem size. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale instances in polynomial time. In subsequent sections, we cover two of the most widely used approximation methods for the set cover problem: linear program relaxation methods and classical greedy algorithms. &lt;ref name=&quot;seven&quot; /&gt;<br /> <br /> == Approximation via LP relaxation and rounding ==<br /> Set covering is a classical integer programming problem, and solving integer programs in general is NP-hard. 
Therefore, one approach to achieve an &lt;math&gt; O(\log n) &lt;/math&gt; approximation to the set covering problem in polynomial time is to solve a linear programming (LP) relaxation and round its solution &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt; &lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;. In LP relaxation, we relax the integrality requirement into linear constraints. For instance, if we replace the constraints &lt;math&gt; y_i \in \{0, 1\}&lt;/math&gt; with the constraints &lt;math&gt; 0 \leq y_i \leq 1 &lt;/math&gt;, we obtain the following LP problem that can be solved in polynomial time:<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> subject to &lt;math&gt; \sum_{i:\, u_j \in s_i} y_i \geq 1, \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; 0 \leq y_i \leq 1, \forall i = 1,\ldots,n&lt;/math&gt;<br /> <br /> The above LP formulation is a relaxation of the original ILP set cover problem. This means that every feasible solution of the integer program is also feasible for the LP. Additionally, the value of any feasible solution of the integer program is the same in the LP, since both programs have the same objective function. The optimal value of the LP is therefore a lower bound on the optimal value of the original integer program, because the LP minimizes the same objective over a larger feasible region. 
Moreover, we use LP rounding algorithms to directly round the fractional LP solution to an integral combinatorial solution as follows:<br /> &lt;br&gt;<br /> <br /> <br /> '''Deterministic rounding algorithm''' <br /> &lt;br&gt;<br /> <br /> Suppose we have an optimal solution &lt;math&gt; z^* &lt;/math&gt; for the linear programming relaxation of the set cover problem. We round the fractional solution &lt;math&gt; z^* &lt;/math&gt; to an integer solution &lt;math&gt; z &lt;/math&gt; using an LP rounding algorithm. In general, there are two approaches to rounding: deterministic and randomized. In this section, we explain the deterministic approach. In this approach, we include subset &lt;math&gt; s_i &lt;/math&gt; in our solution if &lt;math&gt; z_i^* \geq 1/d &lt;/math&gt;, where &lt;math&gt; d &lt;/math&gt; is the maximum number of sets in which any element appears. In practice, we set &lt;math&gt; z_i &lt;/math&gt; as follows:&lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;<br /> <br /> &lt;math&gt; z_i = \begin{cases} 1, &amp; \text{if } z_i^* \geq 1/d \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> The rounding algorithm is an approximation algorithm for the set cover problem. The algorithm runs in polynomial time, and &lt;math&gt; z &lt;/math&gt; is a feasible solution to the integer program: each element appears in at most &lt;math&gt; d &lt;/math&gt; sets and its covering constraint sums to at least 1, so at least one of those sets has &lt;math&gt; z_i^* \geq 1/d &lt;/math&gt; and is selected.<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to obtain near-optimal solutions to large-scale set covering instances in polynomial time. 
&lt;ref name=&quot;seven&quot; /&gt; &lt;ref name=&quot;nine&quot; /&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements and removes those elements from the uncovered set, until all elements are covered. &lt;ref name=&quot;ten&quot;&gt; V. Chvatal, [https://pubsonline.informs.org/doi/abs/10.1287/moor.4.3.233 &quot;Greedy Heuristic for the Set-Covering Problem],&quot; ''Mathematics of Operations Research'', vol. 4, pp. 233-235, 1979. &lt;/ref&gt; Let &lt;math&gt; T &lt;/math&gt; be the set that contains the covered elements, and &lt;math&gt; U &lt;/math&gt; be the set that contains the elements of &lt;math&gt; Y &lt;/math&gt; that are still uncovered, where &lt;math&gt; Y &lt;/math&gt; denotes the full set of elements. At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and &lt;math&gt; U = Y &lt;/math&gt;. We iteratively select the set in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt; and add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;. An example of this algorithm is presented below. 
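<br /> <br /> The greedy selection rule described above can also be sketched in Python. This is a minimal illustration under stated assumptions rather than the cited authors' implementation: the function name <code>greedy_set_cover</code>, the unit-cost setting, the first-index tie-breaking rule, and the toy instance are all hypothetical choices made for this sketch.

```python
def greedy_set_cover(universe, subsets):
    """Greedy heuristic for unit-cost set cover (illustrative sketch).

    universe: iterable of elements to cover.
    subsets:  list of Python sets, each a subset of the universe.
    Returns the indices of the chosen subsets, in selection order.
    """
    uncovered = set(universe)  # plays the role of U in the text
    chosen = []                # indices of the selected subsets
    while uncovered:
        # Pick the subset covering the most still-uncovered elements;
        # ties go to the lowest index (an assumption of this sketch).
        best = max(range(len(subsets)),
                   key=lambda i: len(uncovered & subsets[i]))
        if not uncovered & subsets[best]:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]  # these elements are now covered
    return chosen


# Hypothetical toy instance, not taken from the article.
universe = range(1, 6)
subsets = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
cover = greedy_set_cover(universe, subsets)
print(cover)  # prints [0, 3]
```

The sketch tracks only the uncovered elements, since the covered set is simply its complement. The greedy choice is not guaranteed to be optimal; it only guarantees that the returned subsets cover the universe whenever a cover exists.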
<br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset, \; X \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements, &lt;math&gt; X &lt;/math&gt; stores the selected subsets }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements of &lt;math&gt; Y &lt;/math&gt;}<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; X &lt;/math&gt; and the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' &lt;math&gt; X &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we assign cameras to different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1 Camera Location vs Stadium Area<br /> |-<br /> !camera Location<br /> |1<br /> |2<br /> |3<br /> |4<br /> |5<br /> |6<br /> |7<br /> |8<br /> |-<br /> !stadium area<br /> |1,3,4,6,7<br /> |4,7,8,12<br /> |2,5,9,11,13<br /> |1,2,14,15<br /> |3,6,10,12,14<br /> |8,14,15<br /> |1,2,6,11<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2 Stadium Area vs Camera Location<br /> |-<br /> !stadium area<br /> |1<br /> |2<br /> |3<br /> |4<br /> |5<br /> |6<br /> |7<br /> |8<br /> |9<br /> |10<br /> |11<br /> |12<br /> |13<br /> |14<br /> |15<br /> |-<br /> !camera location<br /> |1,4,7,8<br /> |3,4,7,8<br /> |1,5<br /> |1,2,8<br /> |3<br /> |1,5,7,8<br /> |1,2<br /> |2,6,8<br /> |3<br /> |5<br /> |3,7<br /> |2,5,8<br /> |3<br /> |4,5,6<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. 
The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3 Binary Table (All Camera Locations and Stadium Areas)<br /> !<br /> !Camera1<br /> !Camera2<br /> !Camera3<br /> !Camera4<br /> !Camera5<br /> !Camera6<br /> !Camera7<br /> !Camera8<br /> |-<br /> !Stadium1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> !Stadium2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> !Stadium3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> !Stadium5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> !Stadium7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> !Stadium9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> !Stadium12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> !Stadium13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> !Stadium15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. All 15 constraints, which correspond to the 15 stadium areas, are listed below:<br /> <br /> <br /> minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; <br /> <br /> ''s.t. Constraints 1 to 15 are satisfied:''<br /> <br /> &lt;math&gt; z_1 + z_4 + z_7 + z_8 \geqslant 1 \quad (1)&lt;/math&gt;<br /> <br /> &lt;math&gt; z_3 + z_4 + z_7 + z_8 \geqslant 1 \quad (2)&lt;/math&gt;<br /> <br /> &lt;math&gt; z_1 + z_5 \geqslant 1 \quad (3)&lt;/math&gt;<br /> <br /> &lt;math&gt; z_1 + z_2 + z_8 \geqslant 1 \quad (4)&lt;/math&gt;<br /> <br /> &lt;math&gt; z_3 \geqslant 1 \quad (5)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_5 + z_7 + z_8 \geqslant 1 \quad (6)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_2 \geqslant 1 \quad (7)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_2 + z_6 + z_8 \geqslant 1 \quad (8)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_3 \geqslant 1 \quad (9)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_5 \geqslant 1 \quad (10)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_3 + z_7 \geqslant 1 \quad (11)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_2 + z_5 + z_8 \geqslant 1 \quad (12)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_3 \geqslant 1 \quad (13)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_4 + z_5 + z_6 \geqslant 1 \quad (14)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_4 + z_6 \geqslant 1 \quad (15)&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus constraints 2 and 11 are no longer needed, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. 
With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;, <br /> <br /> s.t.:<br /> <br /> &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1 \quad (1)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_5 \geqslant 1 \quad (3)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_2 + z_8 \geqslant 1 \quad (4)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_5 + z_7 + z_8 \geqslant 1 \quad (6)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_2 \geqslant 1 \quad (7)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_2 + z_6 + z_8 \geqslant 1 \quad (8)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_5 \geqslant 1 \quad (10)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_2 + z_5 + z_8 \geqslant 1 \quad (12)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_4 + z_5 + z_6 \geqslant 1 \quad (14)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_4 + z_6 \geqslant 1 \quad (15)&lt;/math&gt;<br /> <br /> <br /> Now consider constraint 10: &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, so &lt;math&gt;z_5&lt;/math&gt; must equal 1. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is satisfied, since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;<br /> <br /> s.t.:<br /> <br /> &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1 \quad (1)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_2 \geqslant 1 \quad (7)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_2 + z_6 + z_8 \geqslant 1 \quad (8)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_4 + z_6 \geqslant 1 \quad (15)&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Satisfying each of these two constraints with exactly one camera gives four combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then discuss each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to equal 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is particularly evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations. &lt;ref name=&quot;three&quot;&gt; R. Church and C. ReVelle, [https://link.springer.com/article/10.1007/BF01942293 &quot;The maximal covering location problem],&quot; ''Papers of the Regional Science Association'', vol. 32, pp. 101-118, 1974. &lt;/ref&gt; Consider the problem of placing fire stations to serve the towns of some city. &lt;ref name=&quot;four&quot;&gt; E. Aktaş, Ö. Özaydın, B. Bozkaya, F. Ülengin, and Ş. Önsel, [https://pubsonline.informs.org/doi/10.1287/inte.1120.0671 &quot;Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality],&quot; ''Interfaces'', vol. 43, pp. 240-255, 2013. &lt;/ref&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a set of adjacent towns. The problem is then solved to minimize the required number of fire stations to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to choosing to build a fire station at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> <br /> A real-world case study involving optimizing fire station locations in Istanbul is analyzed in this reference. &lt;ref name=&quot;four&quot; /&gt; The Istanbul municipality serves 790 subdistricts, which should all be covered by a fire station. 
Each subdistrict is considered covered if it has a neighboring district (a district at most 5 minutes away) that has a fire station. For a detailed computational analysis, we refer the reader to the mentioned academic paper.<br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors could be formulated as a set covering problem. &lt;ref name=&quot;five&quot;&gt; J. Ali and V. Dyo, [https://www.scitepress.org/Link.aspx?doi=10.5220/0006469800830088 &quot;Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach],&quot; ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications'', pp. 83-88, 2017. &lt;/ref&gt; &lt;ref name=&quot;eleven&quot;&gt; P.H. Cruz Caminha , R. De Souza Couto , L.H. Maciel Kosmalski Costa , A. Fladenmuller , and M. Dias de Amorim, [https://www.mdpi.com/1424-8220/18/6/1976 &quot;On the Coverage of Bus-Based Mobile Sensing],&quot; ''Sensors'', 2018. &lt;/ref&gt; Specifically, we are given a collection of bus routes '''''R''''', where each route is divided into segments. Route &lt;math&gt; i &lt;/math&gt; is denoted by &lt;math&gt; R_i &lt;/math&gt;, and segment &lt;math&gt; j &lt;/math&gt; is denoted by &lt;math&gt; S_j &lt;/math&gt;. The segments of two different routes can overlap, and each segment is associated with a length &lt;math&gt; a_j &lt;/math&gt;. The goal is then to select the routes that maximize the total covered distance.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Suppose we want to use at most &lt;math&gt; k &lt;/math&gt; different routes. 
We want to find &lt;math&gt; k &lt;/math&gt; routes that maximize the total length of the covered segments. Let &lt;math&gt; x_i &lt;/math&gt; be the binary decision variable corresponding to selecting route &lt;math&gt; R_i &lt;/math&gt;, and let &lt;math&gt; y_j &lt;/math&gt; be the binary decision variable associated with covering segment &lt;math&gt; S_j &lt;/math&gt;. Let us also denote the set of routes that cover segment &lt;math&gt; j &lt;/math&gt; by &lt;math&gt; C_j &lt;/math&gt;. The problem is then formulated as follows.<br /> <br /> &lt;math&gt;<br /> \begin{align}<br /> \text{max} &amp; ~~ \sum_{j} a_jy_j\\<br /> \text{s.t.} &amp; ~~ \sum_{i\in C_j} x_i \geq y_j \quad \forall j \\<br /> &amp; ~~ \sum_{i} x_i = k \\ <br /> &amp; ~~ x_i,y_{j} \in \{0,1\} \\<br /> \end{align}<br /> &lt;/math&gt;<br /> <br /> The work by Ali and Dyo explores a greedy approximation algorithm to solve an optimal selection problem including 713 bus routes in Greater London. &lt;ref name=&quot;five&quot; /&gt; Using only 14% of the routes (100 routes), the greedy algorithm returns a solution that covers 25% of the segments in Greater London. For details of the approximation algorithm and the real-world case study, we refer the reader to this reference. &lt;ref name=&quot;five&quot; /&gt; For a significantly larger case study involving 5747 buses covering 5060 km, we refer the reader to this academic article. &lt;ref name=&quot;eleven&quot; /&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;six&quot;&gt; E. Marchiori and A. Steenbeek, [https://link.springer.com/chapter/10.1007/3-540-45561-2_36 &quot;An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling],&quot; ''Real-World Applications of Evolutionary Computing. EvoWorkshops 2000. 
Lecture Notes in Computer Science'', 2000. &lt;/ref&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey, which contains several problem instances with the number of flights ranging from 1013 to 7765 flights, for a detailed analysis of the formulation and algorithms that pertain to this significant application. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;eight&quot;&gt; A. Kasirzadeh, M. Saddoune, and F. Soumis [https://www.sciencedirect.com/science/article/pii/S2192437620300820?via%3Dihub &quot;Airline crew scheduling: models, algorithms, and data sets],&quot; ''EURO Journal on Transportation and Logistics'', vol. 6, pp. 111-137, 2017. &lt;/ref&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the least number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> == References ==<br /> &lt;references /&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2634 Set covering problem 2020-12-15T02:49:06Z <p>Khaledfahat: </p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> &lt;br&gt;<br /> Steward: Allen Yang, Fengqi You<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. 
Given a collection of elements and a family of sets over them, the set covering problem aims to find the minimum number of sets that together contain (cover) all of these elements. &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> The importance of the set covering problem has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a prime example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion. &lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;seven&quot;&gt; P. Slavı́k, [https://www.sciencedirect.com/science/article/abs/pii/S0196677497908877 &quot;A Tight Analysis of the Greedy Algorithm for Set Cover],&quot; ''Journal of Algorithms'', vol. 25, pp. 237-245, 1997. &lt;/ref&gt; &lt;ref name=&quot;nine&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?],&quot; ''Operations Research Letters'', vol. 44, pp. 366-369, 2016. &lt;/ref&gt;<br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. 
Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s. &lt;ref name=&quot;two&quot;&gt; J. Rubin, [https://www.jstor.org/stable/25767684?seq=1 &quot;A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling],&quot; ''Transportation Science'', vol. 7, pp. 34-48, 1973. &lt;/ref&gt;<br /> <br /> == Problem formulation ==<br /> In the set covering problem, two sets are given: a set &lt;math&gt; U &lt;/math&gt; of elements and a set &lt;math&gt; S &lt;/math&gt; of subsets of the set &lt;math&gt; U &lt;/math&gt;. Each subset in &lt;math&gt; S &lt;/math&gt; is associated with a predetermined cost, and the union of all the subsets covers the set &lt;math&gt; U &lt;/math&gt;. The problem is to select a sub-collection of these subsets whose union covers the universal set at minimum total cost.&lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt; &lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;<br /> <br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U &lt;/math&gt; = { &lt;math&gt; u_1,..., u_m &lt;/math&gt;} as the universe of elements and &lt;math&gt; S &lt;/math&gt; = { &lt;math&gt; s_1,..., s_n &lt;/math&gt;} as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i&lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. 
&lt;math&gt; \bigcup_{i=1}^n s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i&lt;/math&gt; such that &lt;math&gt; c_i &gt; 0&lt;/math&gt;. The objective is to find the minimum cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:&lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i : u_j \in s_i} y_i \geq 1, \quad \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \quad \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; is the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when every cost &lt;math&gt; c_i&lt;/math&gt; equals 1, it is simply the number of subsets selected. 
The first constraint implies that every element in the universe &lt;math&gt; U &lt;/math&gt; must be covered by at least one selected subset, and the second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary, which means that every set is either in the set cover or not.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time exact algorithm is known, and the running time of known exact methods can grow exponentially with the size of the problem. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. In subsequent sections, we cover two of the most widely used approximation methods for the set cover problem: linear program relaxation methods and classical greedy algorithms. &lt;ref name=&quot;seven&quot; /&gt;<br /> <br /> == Approximation via LP relaxation and rounding ==<br /> Set covering is a classical integer programming problem, and solving integer programs in general is NP-hard. Therefore, one approach to achieving an &lt;math&gt; O(\log n) &lt;/math&gt; approximation to the set covering problem in polynomial time is to solve it via linear programming (LP) ''relaxation'' algorithms &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt; &lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;. In LP relaxation, the integrality requirement is relaxed into linear constraints. 
For instance, if we replace the constraints &lt;math&gt; y_i \in \{0, 1\}&lt;/math&gt; with the constraints &lt;math&gt; 0 \leq y_i \leq 1 &lt;/math&gt;, we obtain the following LP problem that can be solved in polynomial time:<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> subject to &lt;math&gt; \sum_{i : u_j \in s_i} y_i \geq 1, \quad \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; 0 \leq y_i \leq 1, \quad \forall i = 1,\ldots,n&lt;/math&gt;<br /> <br /> The above LP formulation is a relaxation of the original ILP set cover problem. This means that every feasible solution of the integer program is also feasible for the LP. Additionally, any feasible solution of the integer program has the same value in the LP, since both programs share the same objective function. Because the LP minimizes over a larger feasible region, its optimal value is a lower bound on the optimal value of the original integer program. The fractional LP solution can then be rounded to an integral solution as follows:<br /> &lt;br&gt;<br /> <br /> <br /> '''Deterministic rounding algorithm''' <br /> &lt;br&gt;<br /> <br /> Suppose we have an optimal solution &lt;math&gt; z^* &lt;/math&gt; for the linear programming relaxation of the set cover problem. We round the fractional solution &lt;math&gt; z^* &lt;/math&gt; to an integer solution &lt;math&gt; z &lt;/math&gt; using an LP rounding algorithm. In general, there are two approaches to rounding: deterministic and randomized. In this section, we explain the deterministic approach. In this approach, we include subset &lt;math&gt; s_i &lt;/math&gt; in our solution if &lt;math&gt; z_i^* \geq 1/d &lt;/math&gt;, where &lt;math&gt; d &lt;/math&gt; is the maximum number of sets in which any element appears. 
In practice, we set &lt;math&gt; z_i &lt;/math&gt; as follows:&lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;<br /> <br /> &lt;math&gt; z_i = \begin{cases} 1, &amp; \text{if } z_i^* \geq 1/d \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> The rounding algorithm is an approximation algorithm for the set cover problem. It runs in polynomial time, and &lt;math&gt; z &lt;/math&gt; is a feasible solution to the integer program: each element appears in at most &lt;math&gt; d &lt;/math&gt; sets, so its covering constraint forces at least one of those sets to have &lt;math&gt; z_i^* \geq 1/d &lt;/math&gt;, and that set is selected.<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions for large-scale set covering instances in polynomial time. &lt;ref name=&quot;seven&quot; /&gt; &lt;ref name=&quot;nine&quot; /&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements of the universe and removes those elements from the uncovered set, until all elements are covered. &lt;ref name=&quot;ten&quot;&gt; V. Chvatal, [https://pubsonline.informs.org/doi/abs/10.1287/moor.4.3.233 &quot;Greedy Heuristic for the Set-Covering Problem],&quot; ''Mathematics of Operations Research'', vol. 4, pp. 233-235, 1979. &lt;/ref&gt; Let &lt;math&gt; T &lt;/math&gt; be the set that contains the covered elements, and let &lt;math&gt; U &lt;/math&gt; be the set of elements &lt;math&gt; Y &lt;/math&gt; that are still uncovered. At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and &lt;math&gt; U &lt;/math&gt; contains all elements. We iteratively select the set in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt;, add its elements to &lt;math&gt; T &lt;/math&gt;, and remove them from &lt;math&gt; U &lt;/math&gt;. The steps of the algorithm are listed below. 
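<br /> <br /> The greedy rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the function name greedy_set_cover is hypothetical, all subsets are assumed to have unit cost, ties are broken in favour of the first subset in dictionary order, and the input data are the camera locations of Table 1 from the numerical example on this page.

```python
# Hypothetical helper (not from the article): unit-cost greedy set cover.
def greedy_set_cover(universe, subsets):
    uncovered = set(universe)  # U in the text: elements still uncovered
    chosen = []                # labels of the selected subsets, in pick order
    while uncovered:
        # Pick the subset covering the most still-uncovered elements;
        # ties go to the first label in the dictionary's order.
        best = max(subsets, key=lambda j: len(uncovered.intersection(subsets[j])))
        if not uncovered.intersection(subsets[best]):
            raise ValueError('the subsets do not cover the universe')
        chosen.append(best)
        uncovered.difference_update(subsets[best])
    return chosen


# Camera data from Table 1: location -> stadium areas it covers.
cameras = {
    1: {1, 3, 4, 6, 7},
    2: {4, 7, 8, 12},
    3: {2, 5, 9, 11, 13},
    4: {1, 2, 14, 15},
    5: {3, 6, 10, 12, 14},
    6: {8, 14, 15},
    7: {1, 2, 6, 11},
    8: {1, 2, 4, 6, 8, 12},
}

chosen = greedy_set_cover(range(1, 16), cameras)
print(chosen)
```

On this data the sketch selects locations 8, 3, 5, 1 and 4, that is, five cameras. This is one more than the optimum of four derived in the numerical example, which illustrates that the greedy rule is a heuristic and not an exact method.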
<br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset, \; X \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements, &lt;math&gt; X &lt;/math&gt; stores the selected subsets }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements &lt;math&gt; Y &lt;/math&gt;}<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; X &lt;/math&gt; and the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' &lt;math&gt; X &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we place cameras at different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1 Camera Location vs Stadium Area<br /> |-<br /> !camera Location<br /> |1<br /> |2<br /> |3<br /> |4<br /> |5<br /> |6<br /> |7<br /> |8<br /> |-<br /> !stadium area<br /> |1,3,4,6,7<br /> |4,7,8,12<br /> |2,5,9,11,13<br /> |1,2,14,15<br /> |3,6,10,12,14<br /> |8,14,15<br /> |1,2,6,11<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2 Stadium Area vs Camera Location<br /> |-<br /> !stadium area<br /> |1<br /> |2<br /> |3<br /> |4<br /> |5<br /> |6<br /> |7<br /> |8<br /> |9<br /> |10<br /> |11<br /> |12<br /> |13<br /> |14<br /> |15<br /> |-<br /> !camera location<br /> |1,4,7,8<br /> |3,4,7,8<br /> |1,5<br /> |1,2,8<br /> |3<br /> |1,5,7,8<br /> |1,2<br /> |2,6,8<br /> |3<br /> |5<br /> |3,7<br /> |2,5,8<br /> |3<br /> |4,5,6<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. 
The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3 Binary Table (All Camera Locations and Stadium Areas)<br /> !<br /> !Camera1<br /> !Camera2<br /> !Camera3<br /> !Camera4<br /> !Camera5<br /> !Camera6<br /> !Camera7<br /> !Camera8<br /> |-<br /> !Stadium1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> !Stadium2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> !Stadium3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> !Stadium5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> !Stadium7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> !Stadium9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> !Stadium12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> !Stadium13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> !Stadium15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there’s a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints corresponding to the 15 stadium areas are listed below:<br /> <br /> <br /> minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; <br /> <br /> ''s.t. Constraints 1 to 15 are satisfied:''<br /> <br /> &lt;math&gt; z_1 + z_4 + z_7 + z_8 \geqslant 1 \quad (1)&lt;/math&gt;<br /> <br /> &lt;math&gt; z_3 + z_4 + z_7 + z_8 \geqslant 1 \quad (2)&lt;/math&gt;<br /> <br /> &lt;math&gt; z_1 + z_5 \geqslant 1 \quad (3)&lt;/math&gt;<br /> <br /> &lt;math&gt; z_1 + z_2 + z_8 \geqslant 1 \quad (4)&lt;/math&gt;<br /> <br /> &lt;math&gt; z_3 \geqslant 1 \quad (5)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_5 + z_7 + z_8 \geqslant 1 \quad (6)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_2 \geqslant 1 \quad (7)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_2 + z_6 + z_8 \geqslant 1 \quad (8)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_3 \geqslant 1 \quad (9)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_5 \geqslant 1 \quad (10)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_3 + z_7 \geqslant 1 \quad (11)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_2 + z_5 + z_8 \geqslant 1 \quad (12)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_3 \geqslant 1 \quad (13)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_4 + z_5 + z_6 \geqslant 1 \quad (14)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_4 + z_6 \geqslant 1 \quad (15)&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus constraints 2 and 11 are no longer needed, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. 
With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;, <br /> <br /> s.t.:<br /> <br /> &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1 \quad (1)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_5 \geqslant 1 \quad (3)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_2 + z_8 \geqslant 1 \quad (4)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_5 + z_7 + z_8 \geqslant 1 \quad (6)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_2 \geqslant 1 \quad (7)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_2 + z_6 + z_8 \geqslant 1 \quad (8)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_5 \geqslant 1 \quad (10)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_2 + z_5 + z_8 \geqslant 1 \quad (12)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_4 + z_5 + z_6 \geqslant 1 \quad (14)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_4 + z_6 \geqslant 1 \quad (15)&lt;/math&gt;<br /> <br /> <br /> Now consider constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, which forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied regardless of the other &lt;math&gt;z&lt;/math&gt; values. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is satisfied, since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;<br /> <br /> s.t.:<br /> <br /> &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1 \quad (1)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_2 \geqslant 1 \quad (7)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_2 + z_6 + z_8 \geqslant 1 \quad (8)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_4 + z_6 \geqslant 1 \quad (15)&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Each of them is satisfied by setting at least one of its two variables to 1, which gives four minimal combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to equal 1, they are the optimal solutions.<br /> <br /> To conclude, together with &lt;math&gt;z_3 = 1&lt;/math&gt; and &lt;math&gt;z_5 = 1&lt;/math&gt;, our two solutions are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations. &lt;ref name=&quot;three&quot;&gt; R. Church and C. ReVelle, [https://link.springer.com/article/10.1007/BF01942293 &quot;The maximal covering location problem],&quot; ''Papers of the Regional Science Association'', vol. 32, pp. 101-118, 1974. &lt;/ref&gt; Consider the problem of placing fire stations to serve the towns of some city. &lt;ref name=&quot;four&quot;&gt; E. Aktaş, Ö. Özaydın, B. Bozkaya, F. Ülengin, and Ş. Önsel, [https://pubsonline.informs.org/doi/10.1287/inte.1120.0671 &quot;Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality],&quot; ''Interfaces'', vol. 43, pp. 240-255, 2013. &lt;/ref&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a set of adjacent towns. The problem is then solved to minimize the required number of fire stations to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \quad \forall i&lt;/math&gt; <br /> <br /> A real-world case study involving optimizing fire station locations in Istanbul is analyzed in this reference. &lt;ref name=&quot;four&quot; /&gt; The Istanbul municipality serves 790 subdistricts, which should all be covered by a fire station. 
Each subdistrict is considered covered if it has a neighboring district (a district at most 5 minutes away) that has a fire station. For detailed computational analysis, we refer the reader to the mentioned academic paper.<br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes to place pothole detectors. Due to the scarcity of the physical sensors, a detector cannot be placed on every road. The task of finding the maximum coverage using a limited number of detectors could be formulated as a set covering problem. &lt;ref name=&quot;five&quot;&gt; J. Ali and V. Dyo, [https://www.scitepress.org/Link.aspx?doi=10.5220/0006469800830088 &quot;Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach],&quot; ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications'', pp. 83-88, 2017. &lt;/ref&gt; &lt;ref name=&quot;eleven&quot;&gt; P.H. Cruz Caminha, R. De Souza Couto, L.H. Maciel Kosmalski Costa, A. Fladenmuller, and M. Dias de Amorim, [https://www.mdpi.com/1424-8220/18/6/1976 &quot;On the Coverage of Bus-Based Mobile Sensing],&quot; ''Sensors'', 2018. &lt;/ref&gt; Specifically, we are given a collection of bus routes '''''R''''', where each route is divided into segments. Route &lt;math&gt; i &lt;/math&gt; is denoted by &lt;math&gt; R_i &lt;/math&gt;, and segment &lt;math&gt; j &lt;/math&gt; is denoted by &lt;math&gt; S_j &lt;/math&gt;. The segments of two different routes can overlap, and each segment is associated with a length &lt;math&gt; a_j &lt;/math&gt;. The goal is then to select the routes that maximize the total covered distance.<br /> <br /> This differs from the other applications because it results in a maximization formulation rather than a minimization formulation. Suppose we want to use at most &lt;math&gt; k &lt;/math&gt; different routes. 
We want to find &lt;math&gt; k &lt;/math&gt; routes that maximize the total length of the covered segments. Let &lt;math&gt; x_i &lt;/math&gt; be the binary decision variable corresponding to selecting route &lt;math&gt; R_i &lt;/math&gt;, and let &lt;math&gt; y_j &lt;/math&gt; be the binary decision variable associated with covering segment &lt;math&gt; S_j &lt;/math&gt;. Let us also denote the set of routes that cover segment &lt;math&gt; j &lt;/math&gt; by &lt;math&gt; C_j &lt;/math&gt;. The problem is then formulated as follows.<br /> <br /> &lt;math&gt;<br /> \begin{align}<br /> \text{max} &amp; ~~ \sum_{j} a_jy_j\\<br /> \text{s.t.} &amp; ~~ \sum_{i\in C_j} x_i \geq y_j \quad \forall j \\<br /> &amp; ~~ \sum_{i} x_i = k \\ <br /> &amp; ~~ x_i,y_{j} \in \{0,1\} \\<br /> \end{align}<br /> &lt;/math&gt;<br /> <br /> The work by Ali and Dyo explores a greedy approximation algorithm to solve an optimal selection problem including 713 bus routes in Greater London. &lt;ref name=&quot;five&quot; /&gt; Using only 14% of the routes (100 routes), the greedy algorithm returns a solution that covers 25% of the segments in Greater London. For details of the approximation algorithm and the real-world case study, we refer the reader to this reference. &lt;ref name=&quot;five&quot; /&gt; For a significantly larger case study involving 5747 buses covering 5060 km, we refer the reader to this academic article. &lt;ref name=&quot;eleven&quot; /&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;six&quot;&gt; E. Marchiori and A. Steenbeek, [https://link.springer.com/chapter/10.1007/3-540-45561-2_36 &quot;An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling],&quot; ''Real-World Applications of Evolutionary Computing. EvoWorkshops 2000. 
Lecture Notes in Computer Science'', 2000. &lt;/ref&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey, which contains several problem instances with the number of flights ranging from 1013 to 7765 flights, for a detailed analysis of the formulation and algorithms that pertain to this significant application. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;eight&quot;&gt; A. Kasirzadeh, M. Saddoune, and F. Soumis [https://www.sciencedirect.com/science/article/pii/S2192437620300820?via%3Dihub &quot;Airline crew scheduling: models, algorithms, and data sets],&quot; ''EURO Journal on Transportation and Logistics'', vol. 6, pp. 111-137, 2017. &lt;/ref&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the least number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> == References ==<br /> &lt;references /&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2620 Set covering problem 2020-12-14T04:47:07Z <p>Khaledfahat: </p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> &lt;br&gt;<br /> Steward: Allen Yang, Fengqi You<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. 
Given a collection of elements, the set covering problem aims to find the minimum number of sets that incorporate (cover) all of these elements. &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> The set covering problem importance has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a primal example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion. &lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;seven&quot;&gt; P. Slavı́k, [https://www.sciencedirect.com/science/article/abs/pii/S0196677497908877 &quot;A Tight Analysis of the Greedy Algorithm for Set Cover],&quot; ''Journal of Algorithms,'', vol. 25, pp. 237-245, 1997. &lt;/ref&gt; &lt;ref name=&quot;nine&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?],&quot; ''Operations Research Letters'', vol. 44, pp. 366-369, 2016. &lt;/ref&gt;<br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. 
Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s. &lt;ref name=&quot;two&quot;&gt; J. Rubin, [https://www.jstor.org/stable/25767684?seq=1 &quot;A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling],&quot; ''Transportation Science'', vol. 7, pp. 34-48, 1973. &lt;/ref&gt;<br /> <br /> == Problem formulation ==<br /> In the set covering problem, two sets are given: a set &lt;math&gt; U &lt;/math&gt; of elements and a set &lt;math&gt; S &lt;/math&gt; of subsets of the set &lt;math&gt; U &lt;/math&gt;. Each subset in &lt;math&gt; S &lt;/math&gt; is associated with a predetermined cost, and the union of all the subsets covers the set &lt;math&gt; U &lt;/math&gt;. This combinatorial problem then concerns finding the optimal number of subsets whose union covers the universal set while minimizing the total cost.&lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt; &lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;<br /> <br /> The mathematical formulation of the set covering problem is define as follows. We define &lt;math&gt; U &lt;/math&gt; = { &lt;math&gt; u_i,..., u_m &lt;/math&gt;} as the universe of elements and &lt;math&gt; S &lt;/math&gt; = { &lt;math&gt; s_i,..., s_n &lt;/math&gt;} as a collection of subsets such that &lt;math&gt; s_i \in U &lt;/math&gt; and the union of &lt;math&gt; s_i&lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. 
&lt;math&gt;\cup&lt;/math&gt;&lt;math&gt; s_i&lt;/math&gt; = &lt;math&gt; U &lt;/math&gt; ). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has associated cost &lt;math&gt; c_i&lt;/math&gt; such that &lt;math&gt; c_i &gt; 0&lt;/math&gt;. The objective is to find the minimum cost sub-collection of sets &lt;math&gt; X &lt;/math&gt; &lt;math&gt;\subseteq&lt;/math&gt; &lt;math&gt; S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:&lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i: u_j \in s_i} y_i \geq 1, \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets; when every cost &lt;math&gt; c_i &lt;/math&gt; equals 1, this is the number of subsets in the cover. 
The first constraint requires that every element of the universe &lt;math&gt; U &lt;/math&gt; be covered by at least one selected subset, and the second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary, which means that every set is either in the set cover or not.<br /> <br /> Set covering problems are significant NP-hard optimization problems: no polynomial-time exact algorithm is known, and the computational time of exact methods can grow exponentially as the size of the problem increases. Approximation algorithms are therefore used to solve large-scale problems in polynomial time with optimal or near-optimal solutions. In subsequent sections, we cover two of the most widely used polynomial-time approximation methods for the set cover problem: linear programming relaxation methods and classical greedy algorithms. &lt;ref name=&quot;seven&quot; /&gt;<br /> <br /> == Approximation via LP relaxation and rounding ==<br /> Set covering is a classical integer programming problem, and solving integer programs in general is NP-hard. Therefore, one approach to achieving an &lt;math&gt; O(\log n) &lt;/math&gt; approximation to the set covering problem in polynomial time is to solve it via linear programming (LP) ''relaxation'' algorithms &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt; &lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;. In LP relaxation, we relax the integrality requirement into linear constraints. 
For instance, if we replace the constraints &lt;math&gt; y_i \in \{0, 1\}&lt;/math&gt; with the constraints &lt;math&gt; 0 \leq y_i \leq 1 &lt;/math&gt;, we obtain the following LP problem that can be solved in polynomial time:<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> subject to &lt;math&gt; \sum_{i: u_j \in s_i} y_i \geq 1, \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; 0 \leq y_i \leq 1, \forall i = 1,\ldots,n&lt;/math&gt;<br /> <br /> The above LP formulation is a relaxation of the original ILP set cover problem. This means that every feasible solution of the integer program is also feasible for the LP. Additionally, any feasible solution of the integer program has the same value in the LP, since both programs share the same objective function. The optimal value of the LP is therefore a lower bound on the optimal value of the original integer program, because the LP minimizes over a larger feasible set. We then use LP rounding algorithms to round the fractional LP solution to an integral solution as follows:<br /> &lt;br&gt;<br /> <br /> <br /> '''Deterministic rounding algorithm''' <br /> &lt;br&gt;<br /> <br /> Suppose we have an optimal solution &lt;math&gt; z^* &lt;/math&gt; for the linear programming relaxation of the set cover problem. We round the fractional solution &lt;math&gt; z^* &lt;/math&gt; to an integer solution &lt;math&gt; z &lt;/math&gt; using an LP rounding algorithm. In general, there are two approaches to rounding: deterministic and randomized. In this section, we explain the deterministic approach. In this approach, we include subset &lt;math&gt; s_i &lt;/math&gt; in our solution if &lt;math&gt; z_i^* \geq 1/d &lt;/math&gt;, where &lt;math&gt; d &lt;/math&gt; is the maximum number of sets in which any element appears. 
In practice, we set each component &lt;math&gt; z_i &lt;/math&gt; of &lt;math&gt; z &lt;/math&gt; as follows:&lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;<br /> <br /> &lt;math&gt; z_i = \begin{cases} 1, &amp; \text{if } z_i^* \geq 1/d \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> The rounding algorithm is an approximation algorithm for the set cover problem. It runs in polynomial time, and &lt;math&gt; z &lt;/math&gt; is a feasible solution to the integer program: each element appears in at most &lt;math&gt; d &lt;/math&gt; sets, so at least one of the sets containing it has an LP value of at least &lt;math&gt; 1/d &lt;/math&gt; and is selected. Since &lt;math&gt; z_i \leq d z_i^* &lt;/math&gt;, the cost of the rounded solution is at most &lt;math&gt; d &lt;/math&gt; times the optimal LP value.<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find optimal or near-optimal solutions for large-scale set covering instances in polynomial time. &lt;ref name=&quot;seven&quot; /&gt; &lt;ref name=&quot;nine&quot; /&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset that covers the largest number of uncovered elements in the universe &lt;math&gt; U &lt;/math&gt; and marks those elements as covered, until all elements are covered. &lt;ref name=&quot;ten&quot;&gt; V. Chvatal, [https://pubsonline.informs.org/doi/abs/10.1287/moor.4.3.233 &quot;Greedy Heuristic for the Set-Covering Problem],&quot; ''Mathematics of Operations Research'', vol. 4, pp. 233-235, 1979. &lt;/ref&gt; Let &lt;math&gt; T &lt;/math&gt; be the set that contains the covered elements, and let &lt;math&gt; U &lt;/math&gt; be the set that contains the elements &lt;math&gt; Y &lt;/math&gt; that are still uncovered. At the beginning, &lt;math&gt; T &lt;/math&gt; is empty and &lt;math&gt; U &lt;/math&gt; contains every element &lt;math&gt; Y &lt;/math&gt;. We iteratively select the set in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt;, add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;, and remove them from &lt;math&gt; U &lt;/math&gt;. The steps of the algorithm are listed below. 
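<br /> <br /> As an illustration, the greedy rule described above can be sketched in Python. This is a minimal sketch rather than the implementation used in the cited references: the function name `greedy_set_cover`, the dictionary layout, and the tie-breaking rule (the earliest label wins) are assumptions made for this example, and all subsets are taken to have equal cost. The data are the eight camera locations from the numerical example later in this article.

```python
def greedy_set_cover(universe, subsets):
    """Unweighted greedy heuristic (illustrative sketch).

    `subsets` maps a label to the set of elements that label covers.
    Returns the labels in the order in which they were selected.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the label covering the most still-uncovered elements.
        # max() returns the first best label, so ties go to the earliest key.
        best = max(subsets, key=lambda label: len(subsets[label] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


# Camera location -> stadium areas it covers (Table 1 of the numerical example).
cameras = {
    1: {1, 3, 4, 6, 7},
    2: {4, 7, 8, 12},
    3: {2, 5, 9, 11, 13},
    4: {1, 2, 14, 15},
    5: {3, 6, 10, 12, 14},
    6: {8, 14, 15},
    7: {1, 2, 6, 11},
    8: {1, 2, 4, 6, 8, 12},
}

cover = greedy_set_cover(range(1, 16), cameras)
print(cover)  # [8, 3, 5, 1, 4]
```

On this instance the greedy rule selects five camera locations, one more than the optimal four found in the numerical example, which shows that the heuristic is not guaranteed to return an optimal cover.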
<br /> <br /> '''Greedy algorithm for minimum set cover:'''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements &lt;math&gt; Y &lt;/math&gt;}<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to the cover and its elements to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' the selected subsets (the cover)<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example in which cameras are installed at candidate locations in a stadium. Each location covers some stadium areas, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in Table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1 Camera Location vs Stadium Area<br /> |-<br /> !Camera location<br /> |1<br /> |2<br /> |3<br /> |4<br /> |5<br /> |6<br /> |7<br /> |8<br /> |-<br /> !Stadium area<br /> |1,3,4,6,7<br /> |4,7,8,12<br /> |2,5,9,11,13<br /> |1,2,14,15<br /> |3,6,10,12,14<br /> |8,14,15<br /> |1,2,6,11<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered by camera locations {1,4,7,8}, stadium area 2 can be covered by camera locations {3,4,7,8}, while the rest are given in Table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2 Stadium Area vs Camera Location<br /> |-<br /> !Stadium area<br /> |1<br /> |2<br /> |3<br /> |4<br /> |5<br /> |6<br /> |7<br /> |8<br /> |9<br /> |10<br /> |11<br /> |12<br /> |13<br /> |14<br /> |15<br /> |-<br /> !Camera location<br /> |1,4,7,8<br /> |3,4,7,8<br /> |1,5<br /> |1,2,8<br /> |3<br /> |1,5,7,8<br /> |1,2<br /> |2,6,8<br /> |3<br /> |5<br /> |3,7<br /> |2,5,8<br /> |3<br /> |4,5,6<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered by camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not, &lt;math&gt;y_{ij} = 0&lt;/math&gt;. For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. 
The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3 Binary Table (All Camera Locations and Stadium Areas)<br /> !<br /> !Camera1<br /> !Camera2<br /> !Camera3<br /> !Camera4<br /> !Camera5<br /> !Camera6<br /> !Camera7<br /> !Camera8<br /> |-<br /> !Stadium1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> !Stadium2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> !Stadium3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> !Stadium5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> !Stadium7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> !Stadium9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> !Stadium12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> !Stadium13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> !Stadium14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> !Stadium15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. All 15 constraints, which correspond to the 15 stadium areas, are listed below:<br /> <br /> <br /> minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; <br /> <br /> ''s.t. Constraints 1 to 15 are satisfied:''<br /> <br /> &lt;math&gt; z_1 + z_4 + z_7 + z_8 \geqslant 1 \quad (1)&lt;/math&gt;<br /> <br /> &lt;math&gt; z_3 + z_4 + z_7 + z_8 \geqslant 1 \quad (2)&lt;/math&gt;<br /> <br /> &lt;math&gt; z_1 + z_5 \geqslant 1 \quad (3)&lt;/math&gt;<br /> <br /> &lt;math&gt; z_1 + z_2 + z_8 \geqslant 1 \quad (4)&lt;/math&gt;<br /> <br /> &lt;math&gt; z_3 \geqslant 1 \quad (5)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_5 + z_7 + z_8 \geqslant 1 \quad (6)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_2 \geqslant 1 \quad (7)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_2 + z_6 + z_8 \geqslant 1 \quad (8)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_3 \geqslant 1 \quad (9)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_5 \geqslant 1 \quad (10)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_3 + z_7 \geqslant 1 \quad (11)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_2 + z_5 + z_8 \geqslant 1 \quad (12)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_3 \geqslant 1 \quad (13)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_4 + z_5 + z_6 \geqslant 1 \quad (14)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_4 + z_6 \geqslant 1 \quad (15)&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus we no longer need constraints 2 and 11, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. 
With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;, <br /> <br /> s.t.:<br /> <br /> &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1 \quad (1)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_5 \geqslant 1 \quad (3)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_2 + z_8 \geqslant 1 \quad (4)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_5 + z_7 + z_8 \geqslant 1 \quad (6)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_2 \geqslant 1 \quad (7)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_2 + z_6 + z_8 \geqslant 1 \quad (8)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_5 \geqslant 1 \quad (10)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_2 + z_5 + z_8 \geqslant 1 \quad (12)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_4 + z_5 + z_6 \geqslant 1 \quad (14)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_4 + z_6 \geqslant 1 \quad (15)&lt;/math&gt;<br /> <br /> <br /> Now consider constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, which forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is satisfied, since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;<br /> <br /> s.t.:<br /> <br /> &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1 \quad (1)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_1 + z_2 \geqslant 1 \quad (7)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_2 + z_6 + z_8 \geqslant 1 \quad (8)&lt;/math&gt;<br /> <br /> &lt;math&gt;z_4 + z_6 \geqslant 1 \quad (15)&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Each is satisfied at minimum cost by setting exactly one of its two variables to 1, which gives 4 combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then discuss each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to be 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations. &lt;ref name=&quot;three&quot;&gt; R. Church and C. ReVelle, [https://link.springer.com/article/10.1007/BF01942293 &quot;The maximal covering location problem],&quot; ''Papers of the Regional Science Association'', vol. 32, pp. 101-118, 1974. &lt;/ref&gt; Consider the problem of placing fire stations to serve the towns of some city. &lt;ref name=&quot;four&quot;&gt; E. Aktaş, Ö. Özaydın, B. Bozkaya, F. Ülengin, and Ş. Önsel, [https://pubsonline.informs.org/doi/10.1287/inte.1120.0671 &quot;Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality],&quot; ''Interfaces'', vol. 43, pp. 240-255, 2013. &lt;/ref&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a set of adjacent towns. The problem is then solved to minimize the required number of fire stations to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> <br /> A real-world case study involving optimizing fire station locations in Istanbul is analyzed in this reference. &lt;ref name=&quot;four&quot; /&gt; The Istanbul municipality serves 790 subdistricts, which should all be covered by a fire station. 
Each subdistrict is considered covered if it has a neighboring district (a district at most 5 minutes away) that has a fire station. For detailed computational analysis, we refer the reader to the cited paper.<br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors could be formulated as a set covering problem. &lt;ref name=&quot;five&quot;&gt; J. Ali and V. Dyo, [https://www.scitepress.org/Link.aspx?doi=10.5220/0006469800830088 &quot;Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach],&quot; ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications'', pp. 83-88, 2017. &lt;/ref&gt; &lt;ref name=&quot;eleven&quot;&gt; P.H. Cruz Caminha, R. De Souza Couto, L.H. Maciel Kosmalski Costa, A. Fladenmuller, and M. Dias de Amorim, [https://www.mdpi.com/1424-8220/18/6/1976 &quot;On the Coverage of Bus-Based Mobile Sensing],&quot; ''Sensors'', 2018. &lt;/ref&gt; Specifically, we are given a collection of bus routes '''''R''''', where each route is divided into segments. Route &lt;math&gt; i &lt;/math&gt; is denoted by &lt;math&gt; R_i &lt;/math&gt;, and segment &lt;math&gt; j &lt;/math&gt; is denoted by &lt;math&gt; S_j &lt;/math&gt;. The segments of two different routes can overlap, and each segment is associated with a length &lt;math&gt; a_j &lt;/math&gt;. The goal is then to select the routes that maximize the total covered distance.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Suppose we want to use at most &lt;math&gt; k &lt;/math&gt; different routes. 
We want to find &lt;math&gt; k &lt;/math&gt; routes that maximize the total length of covered segments. Let &lt;math&gt; x_i &lt;/math&gt; be the binary decision variable corresponding to selecting route &lt;math&gt; R_i &lt;/math&gt;, and let &lt;math&gt; y_j &lt;/math&gt; be the binary decision variable associated with covering segment &lt;math&gt; S_j &lt;/math&gt;. Let us also denote the set of routes that cover segment &lt;math&gt; j &lt;/math&gt; by &lt;math&gt; C_j &lt;/math&gt;. The problem is then formulated as follows.<br /> <br /> &lt;math&gt;<br /> \begin{align}<br /> \text{max} &amp; ~~ \sum_{j} a_jy_j\\<br /> \text{s.t.} &amp; ~~ \sum_{i\in C_j} x_i \geq y_j \quad \forall j \\<br /> &amp; ~~ \sum_{i} x_i = k \\ <br /> &amp; ~~ x_i,y_{j} \in \{0,1\} \\<br /> \end{align}<br /> &lt;/math&gt;<br /> <br /> The work by Ali and Dyo explores a greedy approximation algorithm to solve an optimal selection problem including 713 bus routes in Greater London. &lt;ref name=&quot;five&quot; /&gt; Using only 14% of the routes (100 routes), the greedy algorithm returns a solution that covers 25% of the segments in Greater London. For details of the approximation algorithm and the real-world case study, we refer the reader to this reference. &lt;ref name=&quot;five&quot; /&gt; For a significantly larger case study involving 5747 buses covering 5060km, we refer the reader to this academic article. &lt;ref name=&quot;eleven&quot; /&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;six&quot;&gt; E. Marchiori and A. Steenbeek, [https://link.springer.com/chapter/10.1007/3-540-45561-2_36 &quot;An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling],&quot; ''Real-World Applications of Evolutionary Computing. EvoWorkshops 2000. 
Lecture Notes in Computer Science'', 2000. &lt;/ref&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey, which contains several problem instances with the number of flights ranging from 1013 to 7765 flights, for a detailed analysis of the formulation and algorithms that pertain to this significant application. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;eight&quot;&gt; A. Kasirzadeh, M. Saddoune, and F. Soumis [https://www.sciencedirect.com/science/article/pii/S2192437620300820?via%3Dihub &quot;Airline crew scheduling: models, algorithms, and data sets],&quot; ''EURO Journal on Transportation and Logistics'', vol. 6, pp. 111-137, 2017. &lt;/ref&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the least number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> == References ==<br /> &lt;references /&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2555 Set covering problem 2020-12-14T00:55:33Z <p>Khaledfahat: /* Problem formulation */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. 
Given a universe of elements and a collection of sets over those elements, the set covering problem aims to find the minimum number of sets that together contain (cover) all of the elements. &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> The importance of the set covering problem has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a prime example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion. &lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;seven&quot;&gt; P. Slavík, [https://www.sciencedirect.com/science/article/abs/pii/S0196677497908877 &quot;A Tight Analysis of the Greedy Algorithm for Set Cover],&quot; ''Journal of Algorithms'', vol. 25, pp. 237-245, 1997. &lt;/ref&gt; &lt;ref name=&quot;nine&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?],&quot; ''Operations Research Letters'', vol. 44, pp. 366-369, 2016. &lt;/ref&gt;<br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. 
Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s. &lt;ref name=&quot;two&quot;&gt; J. Rubin, [https://www.jstor.org/stable/25767684?seq=1 &quot;A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling],&quot; ''Transportation Science'', vol. 7, pp. 34-48, 1973. &lt;/ref&gt;<br /> <br /> == Problem formulation ==<br /> In the set covering problem, two sets are given: a set &lt;math&gt; U &lt;/math&gt; of elements and a set &lt;math&gt; S &lt;/math&gt; of subsets of the set &lt;math&gt; U &lt;/math&gt;. Each subset in &lt;math&gt; S &lt;/math&gt; is associated with a predetermined cost, and the union of all the subsets covers the set &lt;math&gt; U &lt;/math&gt;. The problem is to find a sub-collection of these subsets whose union covers the universal set at minimum total cost.&lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt; &lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;<br /> <br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U &lt;/math&gt; = { &lt;math&gt; u_1,..., u_m &lt;/math&gt;} as the universe of elements and &lt;math&gt; S &lt;/math&gt; = { &lt;math&gt; s_1,..., s_n &lt;/math&gt;} as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of &lt;math&gt; s_i&lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. 
&lt;math&gt;\cup&lt;/math&gt;&lt;math&gt; s_i&lt;/math&gt; = &lt;math&gt; U &lt;/math&gt; ). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has associated cost &lt;math&gt; c_i&lt;/math&gt; such that &lt;math&gt; c_i &gt; 0&lt;/math&gt;. The objective is to find the minimum cost sub-collection of sets &lt;math&gt; X &lt;/math&gt; &lt;math&gt;\subseteq&lt;/math&gt; &lt;math&gt; S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:&lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i: u_j \in s_i} y_i \geq 1, \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets; when every cost &lt;math&gt; c_i &lt;/math&gt; equals 1, this is the number of subsets in the cover. 
The first constraint ensures that every element of the universe &lt;math&gt; U &lt;/math&gt; is covered by at least one selected subset, and the second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary, which means that every set is either in the set cover or not.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time algorithm is known for solving it exactly, and the running time of exact methods can grow exponentially with the problem size. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. In subsequent sections, we cover two of the most widely used approximation methods for the set cover problem: linear program relaxation methods and classical greedy algorithms. &lt;ref name=&quot;seven&quot; /&gt;<br /> <br /> == Approximation via LP relaxation and rounding ==<br /> Set covering is a classical integer programming problem, and solving integer programs in general is NP-hard. Therefore, one approach to achieving an &lt;math&gt; O(\log n) &lt;/math&gt; approximation to the set covering problem in polynomial time is to solve its linear programming (LP) ''relaxation''.&lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;twelve&quot; /&gt; In LP relaxation, we relax the integrality requirement into linear constraints. 
For instance, if we replace the constraints &lt;math&gt; y_i \in \{0, 1\}&lt;/math&gt; with the constraints &lt;math&gt; 0 \leq y_i \leq 1 &lt;/math&gt;, we obtain the following LP problem that can be solved in polynomial time:<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> subject to &lt;math&gt; \sum_{i : u_j \in s_i} y_i \geq 1, \quad \forall j = 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; 0 \leq y_i \leq 1, \quad \forall i = 1, \ldots, n&lt;/math&gt;<br /> <br /> The above LP formulation is a relaxation of the original ILP set cover problem. This means that every feasible solution of the integer program is also feasible for the LP, with the same objective value, since both programs share the same objective function. Consequently, the optimal value of the LP is a lower bound on the optimal value of the original integer program. LP rounding algorithms then convert the fractional LP solution into an integral solution as follows:<br /> &lt;br&gt;<br /> <br /> <br /> '''Deterministic rounding algorithm''' <br /> &lt;br&gt;<br /> <br /> Suppose we have an optimal solution &lt;math&gt; z^* &lt;/math&gt; for the linear programming relaxation of the set cover problem. We round the fractional solution &lt;math&gt; z^* &lt;/math&gt; to an integer solution &lt;math&gt; z &lt;/math&gt; using an LP rounding algorithm. In general, there are two approaches to rounding: deterministic and randomized. In this section, we explain the deterministic approach. In this approach, we include subset &lt;math&gt; s_i &lt;/math&gt; in our solution if &lt;math&gt; z_i^* \geq 1/d &lt;/math&gt;, where &lt;math&gt; d &lt;/math&gt; is the maximum number of sets in which any element appears. 
In practice, we set &lt;math&gt; z &lt;/math&gt; to be as follows:&lt;ref name=&quot;twelve&quot; /&gt;<br /> <br /> &lt;math&gt; z_i = \begin{cases} 1, &amp; \text{if } z_i^* \geq 1/d \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> The rounding algorithm is an approximation algorithm for the set cover problem. It runs in polynomial time, and &lt;math&gt; z &lt;/math&gt; is a feasible solution to the integer program: each element appears in at most &lt;math&gt; d &lt;/math&gt; sets, so at least one variable in its covering constraint has value at least &lt;math&gt; 1/d &lt;/math&gt; in &lt;math&gt; z^* &lt;/math&gt; and is therefore rounded up to 1. Because rounding increases each variable by a factor of at most &lt;math&gt; d &lt;/math&gt;, the cost of &lt;math&gt; z &lt;/math&gt; is at most &lt;math&gt; d &lt;/math&gt; times the LP optimum, and hence at most &lt;math&gt; d &lt;/math&gt; times the optimal cost.<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions to large-scale set covering instances in polynomial time. &lt;ref name=&quot;seven&quot; /&gt; &lt;ref name=&quot;nine&quot; /&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements of the universe and marks those elements as covered, until all elements are covered. &lt;ref name=&quot;ten&quot;&gt; V. Chvatal, [https://pubsonline.informs.org/doi/abs/10.1287/moor.4.3.233 &quot;Greedy Heuristic for the Set-Covering Problem],&quot; ''Mathematics of Operations Research'', vol. 4, pp. 233-235, 1979. &lt;/ref&gt; Let &lt;math&gt; T &lt;/math&gt; be the set of covered elements, and let &lt;math&gt; U &lt;/math&gt; hold the elements that are still uncovered. At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and &lt;math&gt; U &lt;/math&gt; contains every element of the universe. We iteratively select the set in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements remaining in &lt;math&gt; U &lt;/math&gt;, add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;, and remove them from &lt;math&gt; U &lt;/math&gt;. An example of this algorithm is presented below. 
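<br /> <br /> The iterative selection described above can be sketched in Python. This is a minimal illustration that assumes unit costs and subsets given as a dictionary of Python sets; the function name <code>greedy_set_cover</code>, the subset labels, and the small instance are hypothetical and are not taken from the cited references.

```python
def greedy_set_cover(universe, subsets):
    """Return the names of the subsets picked by the greedy heuristic.

    universe: iterable of elements to cover.
    subsets:  dict mapping a (hypothetical) subset name to a set of elements.
    """
    uncovered = set(universe)   # plays the role of U in the text
    covered = set()             # plays the role of T in the text
    chosen = []                 # names of the selected subsets

    while uncovered:
        # Pick the subset covering the most still-uncovered elements;
        # ties go to the first such subset in dictionary order.
        best = max(subsets, key=lambda name: len(subsets[name] & uncovered))
        newly_covered = subsets[best] & uncovered
        if not newly_covered:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        covered |= newly_covered
        uncovered -= newly_covered
    return chosen


# Illustrative instance (made up for this sketch).
example_subsets = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 4}, "D": {4, 5}}
print(greedy_set_cover({1, 2, 3, 4, 5}, example_subsets))  # ['A', 'D']
```

On this instance the sketch first picks A (three new elements) and then D (two new elements). The greedy choice is not guaranteed to be optimal in general; it only guarantees an approximation.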
<br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset, \quad C \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements, &lt;math&gt; C &lt;/math&gt; stores the selected subsets }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements }<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; C &lt;/math&gt; and the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' &lt;math&gt; C &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Consider a simple example in which we place cameras at different locations. Each location covers some areas of a stadium, and our goal is to install the smallest number of cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint requiring that it be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints, one for each stadium area, are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Constraints 2 and 11 are then satisfied as well, so we no longer need them. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Now consider constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, which requires &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Moreover, constraint 4 is satisfied whenever constraint 7 is satisfied, since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
These two constraints share no variables, so each needs its own camera; satisfying each with exactly one camera gives 4 combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to equal 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is particularly evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations. &lt;ref name=&quot;three&quot;&gt; R. Church and C. ReVelle, [https://link.springer.com/article/10.1007/BF01942293 &quot;The maximal covering location problem],&quot; ''Papers of the Regional Science Association'', vol. 32, pp. 101-118, 1974. &lt;/ref&gt; Consider the problem of placing fire stations to serve the towns of some city. &lt;ref name=&quot;four&quot;&gt; E. Aktaş, Ö. Özaydın, B. Bozkaya, F. Ülengin, and Ş. Önsel, [https://pubsonline.informs.org/doi/10.1287/inte.1120.0671 &quot;Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality],&quot; ''Interfaces'', vol. 43, pp. 240-255, 2013. &lt;/ref&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to choosing to build a fire station at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns including town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \quad \forall i&lt;/math&gt; <br /> <br /> A real-world case study involving optimizing fire station locations in Istanbul is analyzed in this reference. &lt;ref name=&quot;four&quot; /&gt; The Istanbul municipality serves 790 subdistricts, which should all be covered by a fire station. 
Each subdistrict is considered covered if it has a neighboring district (a district at most 5 minutes away) that has a fire station. For a detailed computational analysis, we refer the reader to the cited paper.<br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a set covering problem. &lt;ref name=&quot;five&quot;&gt; J. Ali and V. Dyo, [https://www.scitepress.org/Link.aspx?doi=10.5220/0006469800830088 &quot;Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach],&quot; ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications'', pp. 83-88, 2017. &lt;/ref&gt; &lt;ref name=&quot;eleven&quot;&gt; P.H. Cruz Caminha, R. De Souza Couto, L.H. Maciel Kosmalski Costa, A. Fladenmuller, and M. Dias de Amorim, [https://www.mdpi.com/1424-8220/18/6/1976 &quot;On the Coverage of Bus-Based Mobile Sensing],&quot; ''Sensors'', 2018. &lt;/ref&gt; Specifically, we are given a collection of bus routes '''''R''''', where each route is divided into segments. Route &lt;math&gt; i &lt;/math&gt; is denoted by &lt;math&gt; R_i &lt;/math&gt;, and segment &lt;math&gt; j &lt;/math&gt; is denoted by &lt;math&gt; S_j &lt;/math&gt;. The segments of two different routes can overlap, and each segment is associated with a length &lt;math&gt; a_j &lt;/math&gt;. The goal is then to select the routes that maximize the total covered distance.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Suppose we want to use at most &lt;math&gt; k &lt;/math&gt; different routes. 
We want to find &lt;math&gt; k &lt;/math&gt; routes that maximize the total length of the covered segments. Let &lt;math&gt; x_i &lt;/math&gt; be the binary decision variable corresponding to selecting route &lt;math&gt; R_i &lt;/math&gt;, and let &lt;math&gt; y_j &lt;/math&gt; be the binary decision variable associated with covering segment &lt;math&gt; S_j &lt;/math&gt;. Let us also denote the set of routes that cover segment &lt;math&gt; j &lt;/math&gt; by &lt;math&gt; C_j &lt;/math&gt;. The problem is then formulated as follows.<br /> <br /> &lt;math&gt;<br /> \begin{align}<br /> \text{max} &amp; ~~ \sum_{j} a_jy_j\\<br /> \text{s.t.} &amp; ~~ \sum_{i\in C_j} x_i \geq y_j \quad \forall j \\<br /> &amp; ~~ \sum_{i} x_i = k \\ <br /> &amp; ~~ x_i,y_{j} \in \{0,1\} \\<br /> \end{align}<br /> &lt;/math&gt;<br /> <br /> The work by Ali and Dyo explores a greedy approximation algorithm to solve an optimal selection problem including 713 bus routes in Greater London. &lt;ref name=&quot;five&quot; /&gt; Using only 14% of the routes (100 routes), the greedy algorithm returns a solution that covers 25% of the segments in Greater London. For details of the approximation algorithm and the real-world case study, we refer the reader to this reference. &lt;ref name=&quot;five&quot; /&gt; For a significantly larger case study involving 5747 buses covering 5060 km, we refer the reader to this academic article. &lt;ref name=&quot;eleven&quot; /&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;six&quot;&gt; E. Marchiori and A. Steenbeek, [https://link.springer.com/chapter/10.1007/3-540-45561-2_36 &quot;An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling],&quot; ''Real-World Applications of Evolutionary Computing. EvoWorkshops 2000. 
Lecture Notes in Computer Science'', 2000. &lt;/ref&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey, which contains several problem instances with the number of flights ranging from 1013 to 7765 flights, for a detailed analysis of the formulation and algorithms that pertain to this significant application. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;eight&quot;&gt; A. Kasirzadeh, M. Saddoune, and F. Soumis [https://www.sciencedirect.com/science/article/pii/S2192437620300820?via%3Dihub &quot;Airline crew scheduling: models, algorithms, and data sets],&quot; ''EURO Journal on Transportation and Logistics'', vol. 6, pp. 111-137, 2017. &lt;/ref&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the least number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> == References ==<br /> &lt;references /&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2554 Set covering problem 2020-12-14T00:53:37Z <p>Khaledfahat: /* Problem formulation */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. 
Given a collection of elements, the set covering problem aims to find the minimum number of sets that incorporate (cover) all of these elements. &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> The importance of the set covering problem has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a prime example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion. &lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;seven&quot;&gt; P. Slavík, [https://www.sciencedirect.com/science/article/abs/pii/S0196677497908877 &quot;A Tight Analysis of the Greedy Algorithm for Set Cover],&quot; ''Journal of Algorithms'', vol. 25, pp. 237-245, 1997. &lt;/ref&gt; &lt;ref name=&quot;nine&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?],&quot; ''Operations Research Letters'', vol. 44, pp. 366-369, 2016. &lt;/ref&gt;<br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. 
Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s. &lt;ref name=&quot;two&quot;&gt; J. Rubin, [https://www.jstor.org/stable/25767684?seq=1 &quot;A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling],&quot; ''Transportation Science'', vol. 7, pp. 34-48, 1973. &lt;/ref&gt;<br /> <br /> == Problem formulation ==<br /> In the set covering problem, two sets are given: a set &lt;math&gt; U &lt;/math&gt; of elements and a collection &lt;math&gt; S &lt;/math&gt; of subsets of &lt;math&gt; U &lt;/math&gt;. Each subset in &lt;math&gt; S &lt;/math&gt; is associated with a predetermined cost, and the union of all the subsets covers the set &lt;math&gt; U &lt;/math&gt;. The problem is to select a sub-collection of these subsets whose union covers the universal set at minimum total cost.&lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;<br /> <br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. &lt;math&gt; \bigcup_{i=1}^n s_i = U &lt;/math&gt;). 
Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i&lt;/math&gt; such that &lt;math&gt; c_i &gt; 0&lt;/math&gt;. The objective is to find the minimum-cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:&lt;ref name=&quot;one&quot; /&gt;<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i : u_j \in s_i} y_i \geq 1, \quad \forall j = 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \quad \forall i = 1, \ldots, n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when all costs are equal, this is the same as minimizing the number of selected subsets. 
The first constraint ensures that every element of the universe &lt;math&gt; U &lt;/math&gt; is covered by at least one selected subset, and the second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary, which means that every set is either in the set cover or not.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time algorithm is known for solving it exactly, and the running time of exact methods can grow exponentially with the problem size. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. In subsequent sections, we cover two of the most widely used approximation methods for the set cover problem: linear program relaxation methods and classical greedy algorithms. &lt;ref name=&quot;seven&quot; /&gt;<br /> <br /> == Approximation via LP relaxation and rounding ==<br /> Set covering is a classical integer programming problem, and solving integer programs in general is NP-hard. Therefore, one approach to achieving an &lt;math&gt; O(\log n) &lt;/math&gt; approximation to the set covering problem in polynomial time is to solve its linear programming (LP) ''relaxation''.&lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;twelve&quot; /&gt; In LP relaxation, we relax the integrality requirement into linear constraints. 
For instance, if we replace the constraints &lt;math&gt; y_i \in \{0, 1\}&lt;/math&gt; with the constraints &lt;math&gt; 0 \leq y_i \leq 1 &lt;/math&gt;, we obtain the following LP problem that can be solved in polynomial time:<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> subject to &lt;math&gt; \sum_{i : u_j \in s_i} y_i \geq 1, \quad \forall j = 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; 0 \leq y_i \leq 1, \quad \forall i = 1, \ldots, n&lt;/math&gt;<br /> <br /> The above LP formulation is a relaxation of the original ILP set cover problem. This means that every feasible solution of the integer program is also feasible for the LP, with the same objective value, since both programs share the same objective function. Consequently, the optimal value of the LP is a lower bound on the optimal value of the original integer program. LP rounding algorithms then convert the fractional LP solution into an integral solution as follows:<br /> &lt;br&gt;<br /> <br /> <br /> '''Deterministic rounding algorithm''' <br /> &lt;br&gt;<br /> <br /> Suppose we have an optimal solution &lt;math&gt; z^* &lt;/math&gt; for the linear programming relaxation of the set cover problem. We round the fractional solution &lt;math&gt; z^* &lt;/math&gt; to an integer solution &lt;math&gt; z &lt;/math&gt; using an LP rounding algorithm. In general, there are two approaches to rounding: deterministic and randomized. In this section, we explain the deterministic approach. In this approach, we include subset &lt;math&gt; s_i &lt;/math&gt; in our solution if &lt;math&gt; z_i^* \geq 1/d &lt;/math&gt;, where &lt;math&gt; d &lt;/math&gt; is the maximum number of sets in which any element appears. 
In practice, we set &lt;math&gt; z &lt;/math&gt; as follows:&lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;<br /> <br /> &lt;math&gt; z_i = \begin{cases} 1, &amp; \text{if } z_i^* \geq 1/d \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> The rounding algorithm is an approximation algorithm for the set cover problem. It runs in polynomial time, and &lt;math&gt; z &lt;/math&gt; is a feasible solution to the integer program: each element appears in at most &lt;math&gt; d &lt;/math&gt; sets whose LP values sum to at least 1, so at least one of them has value at least &lt;math&gt; 1/d &lt;/math&gt; and is selected. Since rounding increases each variable by a factor of at most &lt;math&gt; d &lt;/math&gt;, the cost of &lt;math&gt; z &lt;/math&gt; is at most &lt;math&gt; d &lt;/math&gt; times the optimal cost.<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find optimal or near-optimal solutions for large-scale set covering instances in polynomial time. &lt;ref name=&quot;seven&quot; /&gt; &lt;ref name=&quot;nine&quot; /&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of uncovered elements of the universe &lt;math&gt; U &lt;/math&gt; and removes those elements from the uncovered set, until all elements are covered. &lt;ref name=&quot;ten&quot;&gt; V. Chvatal, [https://pubsonline.informs.org/doi/abs/10.1287/moor.4.3.233 &quot;Greedy Heuristic for the Set-Covering Problem],&quot; ''Mathematics of Operations Research'', vol. 4, pp. 233-235, 1979. &lt;/ref&gt; Let &lt;math&gt; T &lt;/math&gt; be the set of covered elements, and let &lt;math&gt; U &lt;/math&gt; be the set of elements that are still uncovered. At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and &lt;math&gt; U &lt;/math&gt; contains all elements. We iteratively select the set in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt; and add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;. An example of this algorithm is presented below. 
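The greedy procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the function name `greedy_set_cover`, the dictionary-based input format, and the small instance are all hypothetical choices made for this example, all costs are taken to be equal, and ties are broken by the smallest subset label.

```python
def greedy_set_cover(universe, subsets):
    """Return labels of subsets chosen by the unweighted greedy heuristic.

    universe: iterable of elements to cover.
    subsets:  dict mapping a label to a set of elements (assumed input format).
    """
    uncovered = set(universe)  # plays the role of U in the text
    covered = set()            # plays the role of T in the text
    chosen = []                # labels of the selected subsets
    while uncovered:
        # Pick the subset covering the most still-uncovered elements;
        # sorted() makes tie-breaking deterministic (smallest label wins).
        best = max(sorted(subsets), key=lambda k: len(subsets[k] & uncovered))
        gained = subsets[best] & uncovered
        if not gained:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        covered |= gained
        uncovered -= gained
    return chosen


# Small illustrative instance (not taken from the article).
example_subsets = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 4, 5}, "D": {5}}
cover = greedy_set_cover(range(1, 6), example_subsets)
print(cover)
```

Because the choice at each step is only locally best, the returned cover is feasible but not guaranteed to be of minimum size on every instance.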
<br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements }<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt; and record &lt;math&gt; s_i &lt;/math&gt; as selected<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' the selected subsets<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we assign cameras to different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. All 15 constraints, one for each of the 15 stadium areas, are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints {5,9,13}, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus we no longer need constraints 2 and 11, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints {3,6,12,14} are satisfied regardless of the other &lt;math&gt;z&lt;/math&gt; values. Comparing constraints 4 and 7, constraint 4 is satisfied whenever constraint 7 is satisfied, since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Using as few cameras as possible, each of these two constraints can be satisfied by setting exactly one of its variables to 1, which gives four combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;, and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; equal to 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations. &lt;ref name=&quot;three&quot;&gt; R. Church and C. ReVelle, [https://link.springer.com/article/10.1007/BF01942293 &quot;The maximal covering location problem],&quot; ''Papers of the Regional Science Association'', vol. 32, pp. 101-118, 1974. &lt;/ref&gt; Consider the problem of placing fire stations to serve the towns of some city. &lt;ref name=&quot;four&quot;&gt; E. Aktaş, Ö. Özaydın, B. Bozkaya, F. Ülengin, and Ş. Önsel, [https://pubsonline.informs.org/doi/10.1287/inte.1120.0671 &quot;Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality],&quot; ''Interfaces'', vol. 43, pp. 240-255, 2013. &lt;/ref&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a set of adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to building a fire station at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> <br /> A real-world case study involving optimizing fire station locations in Istanbul is analyzed in this reference. &lt;ref name=&quot;four&quot; /&gt; The Istanbul municipality serves 790 subdistricts, which should all be covered by a fire station. 
Each subdistrict is considered covered if it has a neighboring district (a district at most 5 minutes away) that has a fire station. For a detailed computational analysis, we refer the reader to the cited paper.<br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Due to the scarcity of the physical sensors, it is not possible to place a detector on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a set covering problem. &lt;ref name=&quot;five&quot;&gt; J. Ali and V. Dyo, [https://www.scitepress.org/Link.aspx?doi=10.5220/0006469800830088 &quot;Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach],&quot; ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications'', pp. 83-88, 2017. &lt;/ref&gt; &lt;ref name=&quot;eleven&quot;&gt; P.H. Cruz Caminha, R. De Souza Couto, L.H. Maciel Kosmalski Costa, A. Fladenmuller, and M. Dias de Amorim, [https://www.mdpi.com/1424-8220/18/6/1976 &quot;On the Coverage of Bus-Based Mobile Sensing],&quot; ''Sensors'', 2018. &lt;/ref&gt; Specifically, we are given a collection of bus routes '''''R''''', where each route is divided into segments. Route &lt;math&gt; i &lt;/math&gt; is denoted by &lt;math&gt; R_i &lt;/math&gt;, and segment &lt;math&gt; j &lt;/math&gt; is denoted by &lt;math&gt; S_j &lt;/math&gt;. The segments of two different routes can overlap, and each segment is associated with a length &lt;math&gt; a_j &lt;/math&gt;. The goal is then to select the routes that maximize the total covered distance.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Suppose we want to use at most &lt;math&gt; k &lt;/math&gt; different routes. 
We want to find &lt;math&gt; k &lt;/math&gt; routes that maximize the total length of covered segments. Let &lt;math&gt; x_i &lt;/math&gt; be the binary decision variable corresponding to selecting route &lt;math&gt; R_i &lt;/math&gt;, and let &lt;math&gt; y_j &lt;/math&gt; be the binary decision variable associated with covering segment &lt;math&gt; S_j &lt;/math&gt;. Let us also denote the set of routes that cover segment &lt;math&gt; j &lt;/math&gt; by &lt;math&gt; C_j &lt;/math&gt;. The problem is then formulated as follows.<br /> <br /> &lt;math&gt;<br /> \begin{align}<br /> \text{max} &amp; ~~ \sum_{j} a_jy_j\\<br /> \text{s.t.} &amp; ~~ \sum_{i\in C_j} x_i \geq y_j \quad \forall j \\<br /> &amp; ~~ \sum_{i} x_i = k \\ <br /> &amp; ~~ x_i,y_{j} \in \{0,1\} \\<br /> \end{align}<br /> &lt;/math&gt;<br /> <br /> The work by Ali and Dyo explores a greedy approximation algorithm to solve an optimal selection problem including 713 bus routes in Greater London. &lt;ref name=&quot;five&quot; /&gt; Using only 14% of the routes (100 routes), the greedy algorithm returns a solution that covers 25% of the segments in Greater London. For details of the approximation algorithm and the real-world case study, we refer the reader to this reference. &lt;ref name=&quot;five&quot; /&gt; For a significantly larger case study involving 5747 buses covering 5060km, we refer the reader to this academic article. &lt;ref name=&quot;eleven&quot; /&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;six&quot;&gt; E. Marchiori and A. Steenbeek, [https://link.springer.com/chapter/10.1007/3-540-45561-2_36 &quot;An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling],&quot; ''Real-World Applications of Evolutionary Computing. EvoWorkshops 2000. 
Lecture Notes in Computer Science'', 2000. &lt;/ref&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey, which contains several problem instances with the number of flights ranging from 1013 to 7765 flights, for a detailed analysis of the formulation and algorithms that pertain to this significant application. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;eight&quot;&gt; A. Kasirzadeh, M. Saddoune, and F. Soumis [https://www.sciencedirect.com/science/article/pii/S2192437620300820?via%3Dihub &quot;Airline crew scheduling: models, algorithms, and data sets],&quot; ''EURO Journal on Transportation and Logistics'', vol. 6, pp. 111-137, 2017. &lt;/ref&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the least number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> == References ==<br /> &lt;references /&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2553 Set covering problem 2020-12-14T00:48:04Z <p>Khaledfahat: /* Problem formulation */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. 
Given a collection of elements, the set covering problem aims to find the minimum number of sets that incorporate (cover) all of these elements. &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> The set covering problem importance has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a primal example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion. &lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;seven&quot;&gt; P. Slavı́k, [https://www.sciencedirect.com/science/article/abs/pii/S0196677497908877 &quot;A Tight Analysis of the Greedy Algorithm for Set Cover],&quot; ''Journal of Algorithms,'', vol. 25, pp. 237-245, 1997. &lt;/ref&gt; &lt;ref name=&quot;nine&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?],&quot; ''Operations Research Letters'', vol. 44, pp. 366-369, 2016. &lt;/ref&gt;<br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. 
Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s. &lt;ref name=&quot;two&quot;&gt; J. Rubin, [https://www.jstor.org/stable/25767684?seq=1 &quot;A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling],&quot; ''Transportation Science'', vol. 7, pp. 34-48, 1973. &lt;/ref&gt;<br /> <br /> == Problem formulation ==<br /> In the set covering problem, two sets are given: a set &lt;math&gt; U &lt;/math&gt; of elements and a set &lt;math&gt; S &lt;/math&gt; of subsets of the set &lt;math&gt; U &lt;/math&gt;. Each subset in &lt;math&gt; S &lt;/math&gt; is associated with a predetermined cost, and the union of all the subsets covers the set &lt;math&gt; U &lt;/math&gt;. This combinatorial problem then concerns finding a collection of subsets whose union covers the universal set at minimum total cost.&lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt; &lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;<br /> <br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{ u_1, \ldots, u_m \}&lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i&lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. &lt;math&gt;\bigcup_i s_i = U&lt;/math&gt;). 
Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i&lt;/math&gt; such that &lt;math&gt; c_i &gt; 0&lt;/math&gt;. The objective is to find the minimum-cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:&lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i : u_j \in s_i} y_i \geq 1, \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt; that together cover all elements in the universe. 
The first constraint ensures that every element of the universe &lt;math&gt; U &lt;/math&gt; is covered by at least one selected subset, and the second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; states that the decision variables are binary, meaning that every set is either in the set cover or not.<br /> <br /> Set covering is NP-hard, so no polynomial-time algorithm is known for solving it exactly, and the running time of exact methods can grow exponentially with the size of the problem. For this reason, large-scale instances are usually handled with approximation algorithms, which run in polynomial time and return solutions with a provable bound on their distance from the optimum. In subsequent sections, we cover two of the most widely used polynomial-time approximation methods for the set cover problem: linear programming relaxation methods and classical greedy algorithms. &lt;ref name=&quot;seven&quot; /&gt;<br /> <br /> == Approximation via LP relaxation and rounding ==<br /> Set covering is a classical integer programming problem, and solving integer programs is NP-hard in general. Therefore, one approach to achieving an &lt;math&gt; O&lt;/math&gt;(log&lt;math&gt;n&lt;/math&gt;) approximation to the set covering problem in polynomial time is to solve its linear programming (LP) ''relaxation'' and round the result &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt; &lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;. In LP relaxation, we relax the integrality requirement into linear constraints. 
For instance, if we replace the constraints &lt;math&gt; y_i \in \{0, 1\}&lt;/math&gt; with the constraints &lt;math&gt; 0 \leq y_i \leq 1 &lt;/math&gt;, we obtain the following LP problem that can be solved in polynomial time:<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> subject to &lt;math&gt; \sum_{i : u_j \in s_i} y_i \geq 1, \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; 0 \leq y_i \leq 1, \forall i = 1,\ldots,n&lt;/math&gt;<br /> <br /> The above LP formulation is a relaxation of the original ILP set cover problem. This means that every feasible solution of the integer program is also feasible for the LP. Additionally, any feasible solution of the integer program has the same objective value in the LP, since both programs share the same objective function. The optimal value of the LP is therefore a lower bound on the optimal value of the original integer program, because the LP minimizes over a larger feasible region. The fractional LP solution can then be rounded to an integral solution as follows:<br /> &lt;br&gt;<br /> <br /> <br /> '''Deterministic rounding algorithm''' <br /> &lt;br&gt;<br /> <br /> Suppose we have an optimal solution &lt;math&gt; z^* &lt;/math&gt; for the linear programming relaxation of the set cover problem. We round the fractional solution &lt;math&gt; z^* &lt;/math&gt; to an integer solution &lt;math&gt; z &lt;/math&gt; using an LP rounding algorithm. In general, there are two approaches to rounding: deterministic and randomized. In this section, we explain the deterministic approach. We include subset &lt;math&gt; s_i &lt;/math&gt; in our solution if &lt;math&gt; z_i^* \geq 1/d &lt;/math&gt;, where &lt;math&gt; d &lt;/math&gt; is the maximum number of sets in which any element appears. 
In practice, we set &lt;math&gt; z &lt;/math&gt; as follows:&lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;<br /> <br /> &lt;math&gt; z_i = \begin{cases} 1, &amp; \text{if } z_i^* \geq 1/d \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> The rounding algorithm is an approximation algorithm for the set cover problem. It runs in polynomial time, and &lt;math&gt; z &lt;/math&gt; is a feasible solution to the integer program: each element appears in at most &lt;math&gt; d &lt;/math&gt; sets whose LP values sum to at least 1, so at least one of them has value at least &lt;math&gt; 1/d &lt;/math&gt; and is selected. Since rounding increases each variable by a factor of at most &lt;math&gt; d &lt;/math&gt;, the cost of &lt;math&gt; z &lt;/math&gt; is at most &lt;math&gt; d &lt;/math&gt; times the optimal cost.<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find optimal or near-optimal solutions for large-scale set covering instances in polynomial time. &lt;ref name=&quot;seven&quot; /&gt; &lt;ref name=&quot;nine&quot; /&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of uncovered elements of the universe &lt;math&gt; U &lt;/math&gt; and removes those elements from the uncovered set, until all elements are covered. &lt;ref name=&quot;ten&quot;&gt; V. Chvatal, [https://pubsonline.informs.org/doi/abs/10.1287/moor.4.3.233 &quot;Greedy Heuristic for the Set-Covering Problem],&quot; ''Mathematics of Operations Research'', vol. 4, pp. 233-235, 1979. &lt;/ref&gt; Let &lt;math&gt; T &lt;/math&gt; be the set of covered elements, and let &lt;math&gt; U &lt;/math&gt; be the set of elements that are still uncovered. At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and &lt;math&gt; U &lt;/math&gt; contains all elements. We iteratively select the set in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt; and add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;. An example of this algorithm is presented below. 
<br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements }<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt; and record &lt;math&gt; s_i &lt;/math&gt; as selected<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' the selected subsets<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we assign cameras to different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints that correspond to the 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. These three constraints are then satisfied, and we no longer need constraints 2 and 11 either, as they are also satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is satisfied, since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
There are four minimal combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values that satisfy both constraints.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to equal 1 (four each, counting &lt;math&gt;z_3&lt;/math&gt; and &lt;math&gt;z_5&lt;/math&gt;, versus five for &lt;math&gt;A&lt;/math&gt; and &lt;math&gt;D&lt;/math&gt;), they are the optimal solutions.<br /> <br /> To conclude, our two optimal solutions are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is particularly evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations. &lt;ref name=&quot;three&quot;&gt; R. Church and C. ReVelle, [https://link.springer.com/article/10.1007/BF01942293 &quot;The maximal covering location problem],&quot; ''Papers of the Regional Science Association'', vol. 32, pp. 101-118, 1974. &lt;/ref&gt; Consider the problem of placing fire stations to serve the towns of some city. &lt;ref name=&quot;four&quot;&gt; E. Aktaş, Ö. Özaydın, B. Bozkaya, F. Ülengin, and Ş. Önsel, [https://pubsonline.informs.org/doi/10.1287/inte.1120.0671 &quot;Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality],&quot; ''Interfaces'', vol. 43, pp. 240-255, 2013. &lt;/ref&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to choosing to build a fire station at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \quad \forall i&lt;/math&gt; <br /> <br /> A real-world case study involving optimizing fire station locations in Istanbul is analyzed in this reference. &lt;ref name=&quot;four&quot; /&gt; The Istanbul municipality serves 790 subdistricts, which should all be covered by a fire station. 
Each subdistrict is considered covered if it has a neighboring district (a district at most 5 minutes away) that has a fire station. For a detailed computational analysis, we refer the reader to the paper cited above.<br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Because physical sensors are scarce, it is not possible to place a detector on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a set covering problem. &lt;ref name=&quot;five&quot;&gt; J. Ali and V. Dyo, [https://www.scitepress.org/Link.aspx?doi=10.5220/0006469800830088 &quot;Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach],&quot; ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications'', pp. 83-88, 2017. &lt;/ref&gt; &lt;ref name=&quot;eleven&quot;&gt; P.H. Cruz Caminha, R. De Souza Couto, L.H. Maciel Kosmalski Costa, A. Fladenmuller, and M. Dias de Amorim, [https://www.mdpi.com/1424-8220/18/6/1976 &quot;On the Coverage of Bus-Based Mobile Sensing],&quot; ''Sensors'', 2018. &lt;/ref&gt; Specifically, we are given a collection of bus routes '''''R''''', where each route is divided into segments. Route &lt;math&gt; i &lt;/math&gt; is denoted by &lt;math&gt; R_i &lt;/math&gt;, and segment &lt;math&gt; j &lt;/math&gt; is denoted by &lt;math&gt; S_j &lt;/math&gt;. The segments of two different routes can overlap, and each segment is associated with a length &lt;math&gt; a_j &lt;/math&gt;. The goal is then to select the routes that maximize the total covered distance.<br /> <br /> This differs from the other applications because it results in a maximization formulation rather than a minimization formulation. Suppose we want to use at most &lt;math&gt; k &lt;/math&gt; different routes. 
We want to find &lt;math&gt; k &lt;/math&gt; routes that maximize the total length of the covered segments. Let &lt;math&gt; x_i &lt;/math&gt; be the binary decision variable corresponding to selecting route &lt;math&gt; R_i &lt;/math&gt;, and let &lt;math&gt; y_j &lt;/math&gt; be the binary decision variable associated with covering segment &lt;math&gt; S_j &lt;/math&gt;. Let us also denote the set of routes that cover segment &lt;math&gt; j &lt;/math&gt; by &lt;math&gt; C_j &lt;/math&gt;. The problem is then formulated as follows.<br /> <br /> &lt;math&gt;<br /> \begin{align}<br /> \text{max} &amp; ~~ \sum_{j} a_jy_j\\<br /> \text{s.t.} &amp; ~~ \sum_{i\in C_j} x_i \geq y_j \quad \forall j \\<br /> &amp; ~~ \sum_{i} x_i = k \\ <br /> &amp; ~~ x_i,y_{j} \in \{0,1\} \\<br /> \end{align}<br /> &lt;/math&gt;<br /> <br /> The work by Ali and Dyo explores a greedy approximation algorithm to solve an optimal selection problem including 713 bus routes in Greater London. &lt;ref name=&quot;five&quot; /&gt; Using only 14% of the routes (100 routes), the greedy algorithm returns a solution that covers 25% of the segments in Greater London. For details of the approximation algorithm and the real-world case study, we refer the reader to this reference. &lt;ref name=&quot;five&quot; /&gt; For a significantly larger case study involving 5747 buses covering 5060 km, we refer the reader to this academic article. &lt;ref name=&quot;eleven&quot; /&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;six&quot;&gt; E. Marchiori and A. Steenbeek, [https://link.springer.com/chapter/10.1007/3-540-45561-2_36 &quot;An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling],&quot; ''Real-World Applications of Evolutionary Computing. EvoWorkshops 2000. 
Lecture Notes in Computer Science'', 2000. &lt;/ref&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey, which contains several problem instances with the number of flights ranging from 1013 to 7765 flights, for a detailed analysis of the formulation and algorithms that pertain to this significant application. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;eight&quot;&gt; A. Kasirzadeh, M. Saddoune, and F. Soumis [https://www.sciencedirect.com/science/article/pii/S2192437620300820?via%3Dihub &quot;Airline crew scheduling: models, algorithms, and data sets],&quot; ''EURO Journal on Transportation and Logistics'', vol. 6, pp. 111-137, 2017. &lt;/ref&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the least number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> == References ==<br /> &lt;references /&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2552 Set covering problem 2020-12-14T00:47:00Z <p>Khaledfahat: /* Problem formulation */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. 
Given a set of elements and a collection of sets over them, the set covering problem aims to find the minimum number of sets that incorporate (cover) all of these elements. &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> The importance of the set covering problem has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a prime example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion. &lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;seven&quot;&gt; P. Slavík, [https://www.sciencedirect.com/science/article/abs/pii/S0196677497908877 &quot;A Tight Analysis of the Greedy Algorithm for Set Cover],&quot; ''Journal of Algorithms'', vol. 25, pp. 237-245, 1997. &lt;/ref&gt; &lt;ref name=&quot;nine&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?],&quot; ''Operations Research Letters'', vol. 44, pp. 366-369, 2016. &lt;/ref&gt;<br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. 
Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s. &lt;ref name=&quot;two&quot;&gt; J. Rubin, [https://www.jstor.org/stable/25767684?seq=1 &quot;A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling],&quot; ''Transportation Science'', vol. 7, pp. 34-48, 1973. &lt;/ref&gt;<br /> <br /> == Problem formulation ==<br /> In the set covering problem, two sets are given: a set &lt;math&gt; U &lt;/math&gt; of elements and a set &lt;math&gt; S &lt;/math&gt; of subsets of the set &lt;math&gt; U &lt;/math&gt;. Each subset in &lt;math&gt; S &lt;/math&gt; is associated with a predetermined cost, and the union of all the subsets covers the set &lt;math&gt; U &lt;/math&gt;. This combinatorial problem then concerns finding a collection of subsets whose union covers the universal set at minimum total cost.&lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt; &lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;<br /> <br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i&lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. 
&lt;math&gt;\bigcup_{i} s_i = U&lt;/math&gt;). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i&lt;/math&gt; such that &lt;math&gt; c_i &gt; 0&lt;/math&gt;. The objective is to find the minimum-cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:&lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i : u_j \in s_i} y_i \geq 1, \quad \forall j = 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \quad \forall i = 1, \ldots, n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when all costs are equal, this is the same as minimizing the number of subsets that cover all elements in the universe. 
The first constraint requires that every element in the universe &lt;math&gt; U &lt;/math&gt; be covered by at least one selected subset, and the second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary, which means that every set is either in the set cover or not.<br /> <br /> Set covering problems are NP-hard optimization problems: no polynomial-time exact algorithm is known, and the time required by exact methods can grow exponentially with the size of the problem. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. In subsequent sections, we cover two of the most widely used polynomial-time approximation methods for the set cover problem: linear program relaxation methods and classical greedy algorithms. &lt;ref name=&quot;seven&quot; /&gt;<br /> <br /> == Approximation via LP relaxation and rounding ==<br /> Set covering is a classical integer programming problem, and solving integer programs is NP-hard in general. Therefore, one approach to achieving an &lt;math&gt; O(\log n) &lt;/math&gt; approximation to the set covering problem in polynomial time is to solve it via linear programming (LP) ''relaxation'' algorithms &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt; &lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;. In LP relaxation, we relax the integrality requirement into linear constraints. 
For instance, if we replace the constraints &lt;math&gt; y_i \in \{0, 1\}&lt;/math&gt; with the constraints &lt;math&gt; 0 \leq y_i \leq 1 &lt;/math&gt;, we obtain the following LP problem that can be solved in polynomial time:<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> subject to &lt;math&gt; \sum_{i : u_j \in s_i} y_i \geq 1, \quad \forall j = 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; 0 \leq y_i \leq 1, \quad \forall i = 1, \ldots, n&lt;/math&gt;<br /> <br /> The above LP formulation is a relaxation of the original ILP set cover problem. This means that every feasible solution of the integer program is also feasible for this LP. Additionally, any feasible solution of the integer program has the same value in the LP, since the objective functions of the integer and linear programs are the same. The optimal value of the LP is therefore a lower bound on the optimal value of the original integer program, because the LP minimizes the same objective over a larger feasible region. Moreover, we use LP rounding algorithms to directly round the fractional LP solution to an integral combinatorial solution as follows:<br /> &lt;br&gt;<br /> <br /> <br /> '''Deterministic rounding algorithm''' <br /> &lt;br&gt;<br /> <br /> Suppose we have an optimal solution &lt;math&gt; z^* &lt;/math&gt; for the linear programming relaxation of the set cover problem. We round the fractional solution &lt;math&gt; z^* &lt;/math&gt; to an integer solution &lt;math&gt; z &lt;/math&gt; using an LP rounding algorithm. In general, there are two approaches to rounding: deterministic and randomized rounding algorithms. In this section, we explain the deterministic algorithm. In this approach, we include subset &lt;math&gt; s_i &lt;/math&gt; in our solution if &lt;math&gt; z^* \geq 1/d &lt;/math&gt;, where &lt;math&gt; d &lt;/math&gt; is the maximum number of sets in which any element appears. 
In practice, we set &lt;math&gt; z &lt;/math&gt; to be as follows:&lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;<br /> <br /> &lt;math&gt; z = \begin{cases} 1, &amp; \text{if } z^* \geq 1/d \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> The rounding algorithm is an approximation algorithm for the set cover problem. The algorithm runs in polynomial time, and &lt;math&gt; z &lt;/math&gt; is a feasible solution to the integer program: each element appears in at most &lt;math&gt; d &lt;/math&gt; subsets, so at least one of the subsets covering it has a fractional value of at least &lt;math&gt; 1/d &lt;/math&gt; and is therefore selected.<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions to large-scale set covering instances in polynomial time. &lt;ref name=&quot;seven&quot; /&gt; &lt;ref name=&quot;nine&quot; /&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of uncovered elements in the universe &lt;math&gt; U &lt;/math&gt; and removes those elements from the uncovered set, until all elements are covered. &lt;ref name=&quot;ten&quot;&gt; V. Chvatal, [https://pubsonline.informs.org/doi/abs/10.1287/moor.4.3.233 &quot;Greedy Heuristic for the Set-Covering Problem],&quot; ''Mathematics of Operations Research'', vol. 4, pp. 233-235, 1979. &lt;/ref&gt; Let &lt;math&gt; T &lt;/math&gt; be the set that contains the covered elements, and let &lt;math&gt; U &lt;/math&gt; be the set that contains the elements &lt;math&gt; Y &lt;/math&gt; that are still uncovered. At the beginning, &lt;math&gt; T &lt;/math&gt; is empty and every element &lt;math&gt; Y &lt;/math&gt; is in &lt;math&gt; U &lt;/math&gt;. We iteratively select the subset in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt; and add its elements to the covered set &lt;math&gt; T &lt;/math&gt;. An example of this algorithm is presented below. 
<br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset, \; C \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements, &lt;math&gt; C &lt;/math&gt; stores the selected subsets }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements &lt;math&gt; Y &lt;/math&gt;}<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; C &lt;/math&gt; and the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' &lt;math&gt; C &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let us consider a simple example in which cameras are installed at different locations. Each location covers some stadium areas, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints that correspond to the 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus we no longer need constraints 2 and 11, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is satisfied, since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
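The eliminations carried out so far rest on two generic rules: a constraint with a single variable forces that variable to 1, and a constraint that contains every variable of another constraint is redundant. A minimal Python sketch of these two rules follows, applied to the 15 constraints of this example. The function name `reduce_constraints` and the dictionary representation are assumptions made for illustration.

```python
# Each constraint number maps to the indices j of the z_j variables it contains;
# the constraint reads "sum of these z_j >= 1".
CONSTRAINTS = {
    1: {1, 4, 7, 8}, 2: {3, 4, 7, 8}, 3: {1, 5}, 4: {1, 2, 8}, 5: {3},
    6: {1, 5, 7, 8}, 7: {1, 2}, 8: {2, 6, 8}, 9: {3}, 10: {5},
    11: {3, 7}, 12: {2, 5, 8}, 13: {3}, 14: {4, 5, 6}, 15: {4, 6},
}


def reduce_constraints(constraints):
    """Fix forced variables, then drop satisfied and dominated constraints."""
    remaining = dict(constraints)
    fixed = set()
    changed = True
    while changed:
        changed = False
        # Rule 1: a single-variable constraint forces that variable to 1,
        # which satisfies every constraint containing it.
        forced = {next(iter(v)) for v in remaining.values() if len(v) == 1}
        if forced:
            fixed |= forced
            remaining = {k: v for k, v in remaining.items() if not (v & fixed)}
            changed = True
        # Rule 2: a constraint is dominated when another remaining constraint
        # uses a proper subset of its variables.
        for k in list(remaining):
            if any(o != k and remaining[o] < remaining[k] for o in remaining):
                del remaining[k]
                changed = True
    return fixed, remaining


fixed, remaining = reduce_constraints(CONSTRAINTS)
print(sorted(fixed))      # variables forced to 1
print(sorted(remaining))  # constraint numbers that still need attention
```

On this data the sketch fixes the variables with indices 3 and 5 and leaves constraints 1, 7, 8, and 15, which agrees with the hand derivation.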
Each of these two constraints requires at least one of its variables to equal 1, which gives four minimal combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to equal 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations. &lt;ref name=&quot;three&quot;&gt; R. Church and C. ReVelle, [https://link.springer.com/article/10.1007/BF01942293 &quot;The maximal covering location problem],&quot; ''Papers of the Regional Science Association'', vol. 32, pp. 101-118, 1974. &lt;/ref&gt; Consider the problem of placing fire stations to serve the towns of some city. &lt;ref name=&quot;four&quot;&gt; E. Aktaş, Ö. Özaydın, B. Bozkaya, F. Ülengin, and Ş. Önsel, [https://pubsonline.informs.org/doi/10.1287/inte.1120.0671 &quot;Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality],&quot; ''Interfaces'', vol. 43, pp. 240-255, 2013. &lt;/ref&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to choosing to build a fire station at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> <br /> A real-world case study involving optimizing fire station locations in Istanbul is analyzed in this reference. &lt;ref name=&quot;four&quot; /&gt; The Istanbul municipality serves 790 subdistricts, which should all be covered by a fire station. 
Each subdistrict is considered covered if it has a neighboring district (a district at most 5 minutes away) that has a fire station. For a detailed computational analysis, we refer the reader to the mentioned academic paper.<br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Because physical sensors are scarce, a detector cannot be placed on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a set covering problem. &lt;ref name=&quot;five&quot;&gt; J. Ali and V. Dyo, [https://www.scitepress.org/Link.aspx?doi=10.5220/0006469800830088 &quot;Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach],&quot; ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications'', pp. 83-88, 2017. &lt;/ref&gt; &lt;ref name=&quot;eleven&quot;&gt; P.H. Cruz Caminha, R. De Souza Couto, L.H. Maciel Kosmalski Costa, A. Fladenmuller, and M. Dias de Amorim, [https://www.mdpi.com/1424-8220/18/6/1976 &quot;On the Coverage of Bus-Based Mobile Sensing],&quot; ''Sensors'', 2018. &lt;/ref&gt; Specifically, we are given a collection of bus routes '''''R''''', where each route is divided into segments. Route &lt;math&gt; i &lt;/math&gt; is denoted by &lt;math&gt; R_i &lt;/math&gt;, and segment &lt;math&gt; j &lt;/math&gt; is denoted by &lt;math&gt; S_j &lt;/math&gt;. The segments of two different routes can overlap, and each segment is associated with a length &lt;math&gt; a_j &lt;/math&gt;. The goal is then to select the routes that maximize the total covered distance.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Suppose we want to use at most &lt;math&gt; k &lt;/math&gt; different routes. 
We want to find &lt;math&gt; k &lt;/math&gt; routes that maximize the total length of the covered segments. Let &lt;math&gt; x_i &lt;/math&gt; be the binary decision variable corresponding to selecting route &lt;math&gt; R_i &lt;/math&gt;, and let &lt;math&gt; y_j &lt;/math&gt; be the binary decision variable associated with covering segment &lt;math&gt; S_j &lt;/math&gt;. Let us also denote the set of routes that cover segment &lt;math&gt; j &lt;/math&gt; by &lt;math&gt; C_j &lt;/math&gt;. The problem is then formulated as follows.<br /> <br /> &lt;math&gt;<br /> \begin{align}<br /> \text{max} &amp; ~~ \sum_{j} a_jy_j\\<br /> \text{s.t.} &amp; ~~ \sum_{i\in C_j} x_i \geq y_j \quad \forall j \\<br /> &amp; ~~ \sum_{i} x_i = k \\ <br /> &amp; ~~ x_i,y_{j} \in \{0,1\} \\<br /> \end{align}<br /> &lt;/math&gt;<br /> <br /> The work by Ali and Dyo explores a greedy approximation algorithm to solve an optimal selection problem including 713 bus routes in Greater London. &lt;ref name=&quot;five&quot; /&gt; Using only 14% of the routes (100 routes), the greedy algorithm returns a solution that covers 25% of the segments in Greater London. For details of the approximation algorithm and the real-world case study, we refer the reader to this reference. &lt;ref name=&quot;five&quot; /&gt; For a significantly larger case study involving 5747 buses covering 5060 km, we refer the reader to this academic article. &lt;ref name=&quot;eleven&quot; /&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;six&quot;&gt; E. Marchiori and A. Steenbeek, [https://link.springer.com/chapter/10.1007/3-540-45561-2_36 &quot;An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling],&quot; ''Real-World Applications of Evolutionary Computing. EvoWorkshops 2000. 
Lecture Notes in Computer Science'', 2000. &lt;/ref&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey, which contains several problem instances with the number of flights ranging from 1013 to 7765 flights, for a detailed analysis of the formulation and algorithms that pertain to this significant application. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;eight&quot;&gt; A. Kasirzadeh, M. Saddoune, and F. Soumis [https://www.sciencedirect.com/science/article/pii/S2192437620300820?via%3Dihub &quot;Airline crew scheduling: models, algorithms, and data sets],&quot; ''EURO Journal on Transportation and Logistics'', vol. 6, pp. 111-137, 2017. &lt;/ref&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the least number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> == References ==<br /> &lt;references /&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2551 Set covering problem 2020-12-14T00:45:50Z <p>Khaledfahat: </p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. 
Given a collection of elements, the set covering problem aims to find the minimum number of sets that incorporate (cover) all of these elements. &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> The importance of the set covering problem has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a prime example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion. &lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;seven&quot;&gt; P. Slavı́k, [https://www.sciencedirect.com/science/article/abs/pii/S0196677497908877 &quot;A Tight Analysis of the Greedy Algorithm for Set Cover],&quot; ''Journal of Algorithms'', vol. 25, pp. 237-245, 1997. &lt;/ref&gt; &lt;ref name=&quot;nine&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?],&quot; ''Operations Research Letters'', vol. 44, pp. 366-369, 2016. &lt;/ref&gt;<br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. 
Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s. &lt;ref name=&quot;two&quot;&gt; J. Rubin, [https://www.jstor.org/stable/25767684?seq=1 &quot;A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling],&quot; ''Transportation Science'', vol. 7, pp. 34-48, 1973. &lt;/ref&gt;<br /> <br /> == Problem formulation ==<br /> In the set covering problem, two sets are given: a set &lt;math&gt; U &lt;/math&gt; of elements and a set &lt;math&gt; S &lt;/math&gt; of subsets of the set &lt;math&gt; U &lt;/math&gt;. Each subset in &lt;math&gt; S &lt;/math&gt; is associated with a predetermined cost, and the union of all the subsets covers the set &lt;math&gt; U &lt;/math&gt;. The problem is then to find a collection of subsets whose union covers the universal set at minimum total cost.&lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt; &lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;<br /> <br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i&lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. 
&lt;math&gt;\cup_i s_i = U&lt;/math&gt;). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i&lt;/math&gt; such that &lt;math&gt; c_i &gt; 0&lt;/math&gt;. The objective is to find the minimum-cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:&lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i:\, u_j \in s_i} y_i \geq 1, \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; with unit costs, this is the number of subsets used to cover all elements in the universe. 
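For very small instances, the integer program can be checked by plain enumeration. The sketch below is a hedged illustration rather than a practical solver: it assumes the covering constraint is read as "every element must belong to at least one selected subset", it runs in time exponential in the number of subsets, and the function name `solve_set_cover_exact` and the toy data are hypothetical.

```python
from itertools import combinations


def solve_set_cover_exact(universe, subsets, costs):
    """Enumerate all sub-collections; return (best cost, list of optimal covers).

    Only suitable for tiny instances: there are 2^n - 1 candidate collections.
    """
    target = set(universe)
    names = sorted(subsets)
    best_cost, best_covers = None, []
    for r in range(1, len(names) + 1):
        for chosen in combinations(names, r):
            covered = set().union(*(subsets[i] for i in chosen))
            if not target <= covered:
                continue  # some element is left uncovered: infeasible
            cost = sum(costs[i] for i in chosen)
            if best_cost is None or cost < best_cost:
                best_cost, best_covers = cost, [chosen]
            elif cost == best_cost:
                best_covers.append(chosen)
    return best_cost, best_covers


# Toy instance (assumed for illustration): one expensive subset covers
# everything, but two cheap subsets together cover it at lower cost.
universe = {1, 2, 3, 4}
subsets = {1: {1, 2, 3, 4}, 2: {1, 2}, 3: {3, 4}}
costs = {1: 10, 2: 3, 3: 3}

best_cost, best_covers = solve_set_cover_exact(universe, subsets, costs)
print(best_cost, best_covers)
```

The toy instance shows why the costs matter: the cheapest cover uses two subsets even though a single subset already covers the universe.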
The first constraint implies that every element of the universe &lt;math&gt; U &lt;/math&gt; must be covered, and the second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary, which means that every set is either in the set cover or not.<br /> <br /> Set covering problems are significant NP-hard optimization problems, which means that no polynomial-time exact algorithm is known and that the running time of known exact methods can grow exponentially with the size of the problem. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. In subsequent sections, we will cover two of the most widely used approximation methods for solving the set cover problem in polynomial time: linear program relaxation methods and classical greedy algorithms. &lt;ref name=&quot;seven&quot; /&gt;<br /> <br /> == Approximation via LP relaxation and rounding ==<br /> Set covering is a classical integer programming problem, and solving integer programs in general is NP-hard. Therefore, one approach to achieve an &lt;math&gt; O(\log n) &lt;/math&gt; approximation to the set covering problem in polynomial time is to solve it via linear programming (LP) ''relaxation'' algorithms &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt; &lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;. In LP relaxation, we relax the integrality requirement into linear constraints. 
For instance, if we replace the constraints &lt;math&gt; y_i \in \{0, 1\}&lt;/math&gt; with the constraints &lt;math&gt; 0 \leq y_i \leq 1 &lt;/math&gt;, we obtain the following LP problem that can be solved in polynomial time:<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> subject to &lt;math&gt; \sum_{i:\, u_j \in s_i} y_i \geq 1, \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; 0 \leq y_i \leq 1, \forall i = 1,\ldots,n&lt;/math&gt;<br /> <br /> The above LP formulation is a relaxation of the original ILP set cover problem. This means that every feasible solution of the integer program is also feasible for this LP. Additionally, any feasible solution of the integer program has the same objective value in the LP, since both programs share the same objective function. Solving the LP therefore yields an optimal value that is a lower bound on the optimal value of the original integer program, because the LP minimizes over a larger feasible set. We then use LP rounding algorithms to round the fractional LP solution directly to an integral combinatorial solution as follows:<br /> &lt;br&gt;<br /> <br /> <br /> '''Deterministic rounding algorithm''' <br /> &lt;br&gt;<br /> <br /> Suppose we have an optimal solution &lt;math&gt; z^* &lt;/math&gt; for the linear programming relaxation of the set cover problem. We round the fractional solution &lt;math&gt; z^* &lt;/math&gt; to an integer solution &lt;math&gt; z &lt;/math&gt; using an LP rounding algorithm. In general, there are two approaches to rounding: deterministic and randomized rounding algorithms. In this section, we explain the deterministic algorithm. In this approach, we include subset &lt;math&gt; s_i &lt;/math&gt; in our solution if &lt;math&gt; z_i^* \geq 1/d &lt;/math&gt;, where &lt;math&gt; d &lt;/math&gt; is the maximum number of sets in which any element appears. 
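The thresholding rule just described can be sketched in a few lines of Python. This sketch covers only the rounding step: the fractional solution is supplied by hand rather than computed by an LP solver, and the function name `round_lp_solution`, the toy instance, and the small floating-point tolerance are assumptions made for illustration.

```python
def round_lp_solution(universe, subsets, z_frac):
    """Deterministic rounding: select subset i when z_frac[i] >= 1/d.

    Returns (d, selected subsets, whether the selection covers the universe).
    """
    # d = maximum number of subsets in which any single element appears.
    d = max(sum(1 for s in subsets.values() if u in s) for u in universe)
    # The tolerance guards against floating-point error in solver output.
    chosen = [i for i in sorted(subsets) if z_frac[i] >= 1.0 / d - 1e-9]
    covered = set().union(*(subsets[i] for i in chosen)) if chosen else set()
    return d, chosen, set(universe) <= covered


# Toy instance (assumed): three elements, each contained in exactly two subsets.
universe = {1, 2, 3}
subsets = {"A": {1, 2}, "B": {2, 3}, "C": {1, 3}}
# With unit costs, setting every variable to 0.5 is an optimal LP solution of
# value 1.5: adding the three covering constraints shows the sum is >= 1.5.
z_frac = {"A": 0.5, "B": 0.5, "C": 0.5}

d, chosen, feasible = round_lp_solution(universe, subsets, z_frac)
print(d, chosen, feasible)
```

On this instance the rounded cover selects all three subsets at cost 3, which is exactly d times the LP value of 1.5, while an optimal integral cover needs only two subsets.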
In practice, we set &lt;math&gt; z &lt;/math&gt; to be as follows:&lt;ref name=&quot;twelve&quot;&gt; Williamson, David P., and David B. Shmoys. “The Design of Approximation Algorithms” [https://www.designofapproxalgs.com/book.pdf]. “Cambridge University Press”, 2011. &lt;/ref&gt;<br /> <br /> &lt;math&gt; z_i = \begin{cases} 1, &amp; \text{if } z_i^* \geq 1/d \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> The rounding algorithm is an approximation algorithm for the set cover problem. The algorithm runs in polynomial time, and &lt;math&gt; z &lt;/math&gt; is a feasible solution to the integer program: each element appears in at most &lt;math&gt; d &lt;/math&gt; sets whose LP values sum to at least 1, so at least one of these sets has a value of at least &lt;math&gt; 1/d &lt;/math&gt; and is selected.<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions to large-scale set covering instances in polynomial time. &lt;ref name=&quot;seven&quot; /&gt; &lt;ref name=&quot;nine&quot; /&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of uncovered elements in the universe and removes those elements from the uncovered set, until all elements are covered. &lt;ref name=&quot;ten&quot;&gt; V. Chvatal, [https://pubsonline.informs.org/doi/abs/10.1287/moor.4.3.233 &quot;Greedy Heuristic for the Set-Covering Problem],&quot; ''Mathematics of Operations Research'', vol. 4, pp. 233-235, 1979. &lt;/ref&gt; Let &lt;math&gt; T &lt;/math&gt; be the set that contains the covered elements, and let &lt;math&gt; U &lt;/math&gt; be the set that contains the elements &lt;math&gt; Y &lt;/math&gt; that are still uncovered. At the beginning, &lt;math&gt; T &lt;/math&gt; is empty and &lt;math&gt; U &lt;/math&gt; contains all elements. We iteratively select the set in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt;, add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;, and remove them from &lt;math&gt; U &lt;/math&gt;. The algorithm is summarized below. 
<br /> <br /> '''Greedy algorithm for minimum set cover: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset,\ C \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements, &lt;math&gt; C &lt;/math&gt; stores the selected subsets }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements &lt;math&gt; Y &lt;/math&gt;}<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; C &lt;/math&gt; and the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' &lt;math&gt; C &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we place cameras at different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all areas of the stadium are covered. 
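Before the example is worked by hand, the greedy procedure from the previous section can be sketched in Python and run on the camera coverage sets that this example lists in Table 1. The function name `greedy_set_cover` is an assumption, and ties between equally good locations are broken in favor of the lowest location number, a choice the text does not specify.

```python
# Coverage data from Table 1 of this example: camera location -> stadium areas.
CAMERA_AREAS = {
    1: {1, 3, 4, 6, 7},
    2: {4, 7, 8, 12},
    3: {2, 5, 9, 11, 13},
    4: {1, 2, 14, 15},
    5: {3, 6, 10, 12, 14},
    6: {8, 14, 15},
    7: {1, 2, 6, 11},
    8: {1, 2, 4, 6, 8, 12},
}


def greedy_set_cover(universe, subsets):
    """Repeatedly pick the subset that covers the most uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # max() returns the first key with the largest gain, so ties go to
        # the subset that appears first in the dictionary.
        best = max(subsets, key=lambda i: len(subsets[i] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


cover = greedy_set_cover(range(1, 16), CAMERA_AREAS)
print(cover, len(cover))
```

With this tie-breaking rule the sketch selects five camera locations, one more than the minimum of four derived in this example, which illustrates that the greedy heuristic returns a feasible cover that is not necessarily optimal.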
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints that correspond to the 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus we no longer need constraints 2 and 11, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is satisfied, since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Satisfying each of them with exactly one camera gives 4 combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then discuss each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;, and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to equal 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is particularly evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of public facilities placed at different locations. &lt;ref name=&quot;three&quot;&gt; R. Church and C. ReVelle, [https://link.springer.com/article/10.1007/BF01942293 &quot;The maximal covering location problem],&quot; ''Papers of the Regional Science Association'', vol. 32, pp. 101-118, 1974. &lt;/ref&gt; Consider the problem of placing fire stations to serve the towns of some city. &lt;ref name=&quot;four&quot;&gt; E. Aktaş, Ö. Özaydın, B. Bozkaya, F. Ülengin, and Ş. Önsel, [https://pubsonline.informs.org/doi/10.1287/inte.1120.0671 &quot;Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality],&quot; ''Interfaces'', vol. 43, pp. 240-255, 2013. &lt;/ref&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> <br /> A real-world case study on optimizing fire station locations in Istanbul is analyzed by Aktaş et al. &lt;ref name=&quot;four&quot; /&gt; The Istanbul municipality serves 790 subdistricts, which should all be covered by a fire station. 
Each subdistrict is considered covered if a neighboring district (a district at most 5 minutes away) has a fire station. For a detailed computational analysis, we refer the reader to that paper.<br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Due to the scarcity of the physical sensors, it is not possible to place a detector on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a variant of the set covering problem. &lt;ref name=&quot;five&quot;&gt; J. Ali and V. Dyo, [https://www.scitepress.org/Link.aspx?doi=10.5220/0006469800830088 &quot;Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach],&quot; ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications'', pp. 83-88, 2017. &lt;/ref&gt; &lt;ref name=&quot;eleven&quot;&gt; P.H. Cruz Caminha, R. De Souza Couto, L.H. Maciel Kosmalski Costa, A. Fladenmuller, and M. Dias de Amorim, [https://www.mdpi.com/1424-8220/18/6/1976 &quot;On the Coverage of Bus-Based Mobile Sensing],&quot; ''Sensors'', 2018. &lt;/ref&gt; Specifically, we are given a collection of bus routes '''''R''''', where each route is divided into segments. Route &lt;math&gt; i &lt;/math&gt; is denoted by &lt;math&gt; R_i &lt;/math&gt;, and segment &lt;math&gt; j &lt;/math&gt; is denoted by &lt;math&gt; S_j &lt;/math&gt;. The segments of two different routes can overlap, and each segment is associated with a length &lt;math&gt; a_j &lt;/math&gt;. The goal is then to select the routes that maximize the total covered distance.<br /> <br /> This differs from the other applications because it results in a maximization formulation rather than a minimization formulation. Suppose we want to use at most &lt;math&gt; k &lt;/math&gt; different routes. 
We want to find &lt;math&gt; k &lt;/math&gt; routes that maximize the total length of covered segments; since selecting an additional route never reduces coverage, using exactly &lt;math&gt; k &lt;/math&gt; routes loses no optimality. Let &lt;math&gt; x_i &lt;/math&gt; be the binary decision variable corresponding to selecting route &lt;math&gt; R_i &lt;/math&gt;, and let &lt;math&gt; y_j &lt;/math&gt; be the binary decision variable associated with covering segment &lt;math&gt; S_j &lt;/math&gt;. Let us also denote the set of routes that cover segment &lt;math&gt; j &lt;/math&gt; by &lt;math&gt; C_j &lt;/math&gt;. The problem is then formulated as follows.<br /> <br /> &lt;math&gt;<br /> \begin{align}<br /> \text{max} &amp; ~~ \sum_{j} a_jy_j\\<br /> \text{s.t.} &amp; ~~ \sum_{i\in C_j} x_i \geq y_j \quad \forall j \\<br /> &amp; ~~ \sum_{i} x_i = k \\ <br /> &amp; ~~ x_i,y_{j} \in \{0,1\} \\<br /> \end{align}<br /> &lt;/math&gt;<br /> <br /> The work by Ali and Dyo explores a greedy approximation algorithm to solve an optimal selection problem including 713 bus routes in Greater London. &lt;ref name=&quot;five&quot; /&gt; Using only 14% of the routes (100 routes), the greedy algorithm returns a solution that covers 25% of the segments in Greater London. For details of the approximation algorithm and the real-world case study, we refer the reader to this reference. &lt;ref name=&quot;five&quot; /&gt; For a significantly larger case study involving 5747 buses covering 5060 km, we refer the reader to this academic article. &lt;ref name=&quot;eleven&quot; /&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;six&quot;&gt; E. Marchiori and A. Steenbeek, [https://link.springer.com/chapter/10.1007/3-540-45561-2_36 &quot;An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling],&quot; ''Real-World Applications of Evolutionary Computing. EvoWorkshops 2000. 
Lecture Notes in Computer Science'', 2000. &lt;/ref&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. For a detailed analysis of the formulations and algorithms that pertain to this significant application, we refer the interested reader to this survey, which contains several problem instances ranging from 1013 to 7765 flights. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;eight&quot;&gt; A. Kasirzadeh, M. Saddoune, and F. Soumis, [https://www.sciencedirect.com/science/article/pii/S2192437620300820?via%3Dihub &quot;Airline crew scheduling: models, algorithms, and data sets],&quot; ''EURO Journal on Transportation and Logistics'', vol. 6, pp. 111-137, 2017. &lt;/ref&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> == References ==<br /> &lt;references /&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2549 Set covering problem 2020-12-14T00:33:57Z <p>Khaledfahat: /* Integer linear program formulation */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. 
Given a collection of elements, the set covering problem aims to find the minimum number of sets that incorporate (cover) all of these elements. &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> The importance of the set covering problem has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a prime example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion. &lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;seven&quot;&gt; P. Slavík, [https://www.sciencedirect.com/science/article/abs/pii/S0196677497908877 &quot;A Tight Analysis of the Greedy Algorithm for Set Cover],&quot; ''Journal of Algorithms'', vol. 25, pp. 237-245, 1997. &lt;/ref&gt; &lt;ref name=&quot;nine&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?],&quot; ''Operations Research Letters'', vol. 44, pp. 366-369, 2016. &lt;/ref&gt;<br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. 
Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s. &lt;ref name=&quot;two&quot;&gt; J. Rubin, [https://www.jstor.org/stable/25767684?seq=1 &quot;A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling],&quot; ''Transportation Science'', vol. 7, pp. 34-48, 1973. &lt;/ref&gt;<br /> <br /> == Problem formulation ==<br /> In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding a collection of subsets whose union covers the universal set at minimum total cost.<br /> <br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e., &lt;math&gt; \cup_{i=1}^n s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i &lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i &lt;/math&gt; such that &lt;math&gt; c_i &gt; 0 &lt;/math&gt;. The objective is to find the minimum cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:&lt;ref name=&quot;one&quot;&gt; T. Grossman and A. 
Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i:\, u_j \in s_i} y_i \geq 1, \forall j = 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1, \ldots, n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when all costs are equal, this amounts to minimizing the number of selected subsets. The first constraint requires that every element &lt;math&gt; u_j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; be covered by at least one selected subset, and the second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary, which means that every set is either in the set cover or not.<br /> <br /> Set covering problems are significant NP-hard optimization problems: no polynomial-time exact algorithm is known, and the running time of exact methods can grow exponentially with the size of the problem. Therefore, approximation algorithms are used to obtain optimal or near-optimal solutions to large-scale problems in polynomial time. In subsequent sections, we will cover two of the most widely used approximation methods for solving the set cover problem in polynomial time: linear program relaxation methods and classical greedy algorithms. 
&lt;ref name=&quot;seven&quot; /&gt;<br /> <br /> == Approximation via LP relaxation and rounding ==<br /> Set covering is a classical integer programming problem, and solving integer programs in general is NP-hard. Therefore, one approach to achieving an &lt;math&gt; O(\log n) &lt;/math&gt; approximation to the set covering problem in polynomial time is to solve it via linear programming (LP) ''relaxation'' algorithms. In LP relaxation, we relax the integrality requirement into linear constraints. For instance, if we replace the constraints &lt;math&gt; y_i \in \{0, 1\}&lt;/math&gt; with the constraints &lt;math&gt; 0 \leq y_i \leq 1 &lt;/math&gt;, we obtain the following LP problem that can be solved in polynomial time:<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> subject to &lt;math&gt; \sum_{i:\, u_j \in s_i} y_i \geq 1, \forall j = 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; 0 \leq y_i \leq 1, \forall i = 1, \ldots, n&lt;/math&gt;<br /> <br /> The above LP formulation is a relaxation of the original ILP set cover problem. This means that every feasible solution of the integer program is also feasible for this LP. Additionally, any feasible solution of the integer program has the same objective value in the LP, since the two programs share the same objective function. Solving the LP therefore yields an optimal value that is a lower bound on the optimal value of the original integer program, because the LP minimizes over a larger feasible region. Moreover, we use LP rounding algorithms to round the fractional LP solution directly to an integral solution as follows:<br /> &lt;br&gt;<br /> <br /> <br /> '''Deterministic rounding algorithm''' <br /> &lt;br&gt;<br /> <br /> Suppose we have an optimal solution &lt;math&gt; z^* &lt;/math&gt; for the linear programming relaxation of the set cover problem. 
We round the fractional solution &lt;math&gt; z^* &lt;/math&gt; to an integer solution &lt;math&gt; z &lt;/math&gt; using an LP rounding algorithm. In general, there are two approaches to rounding: deterministic and randomized. In this section, we explain the deterministic approach. In this approach, we include subset &lt;math&gt; s_i &lt;/math&gt; in our solution if &lt;math&gt; z_i^* \geq 1/d &lt;/math&gt;, where &lt;math&gt; d &lt;/math&gt; is the maximum number of sets in which any element appears. In practice, we set &lt;math&gt; z &lt;/math&gt; as follows:<br /> <br /> &lt;math&gt; z_i = \begin{cases} 1, &amp; \text{if } z_i^* \geq 1/d \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> The rounding algorithm is an approximation algorithm for the set cover problem. The algorithm runs in polynomial time, and &lt;math&gt; z &lt;/math&gt; is a feasible solution to the integer program: each element appears in at most &lt;math&gt; d &lt;/math&gt; sets whose LP values sum to at least 1, so at least one of those sets has a value of at least &lt;math&gt; 1/d &lt;/math&gt; and is selected.<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find optimal or near-optimal solutions for large-scale set covering instances in polynomial time. &lt;ref name=&quot;seven&quot; /&gt; &lt;ref name=&quot;nine&quot; /&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of uncovered elements in the universe &lt;math&gt; U &lt;/math&gt; and removes those elements from the uncovered set, until all elements are covered. &lt;ref name=&quot;ten&quot;&gt; V. Chvatal, [https://pubsonline.informs.org/doi/abs/10.1287/moor.4.3.233 &quot;Greedy Heuristic for the Set-Covering Problem],&quot; ''Mathematics of Operations Research'', vol. 4, pp. 233-235, 1979. &lt;/ref&gt; Let &lt;math&gt; T &lt;/math&gt; be the set that contains the covered elements, and let &lt;math&gt; U &lt;/math&gt; be the set that contains the elements &lt;math&gt; Y &lt;/math&gt; that are still uncovered. 
At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and all elements &lt;math&gt; Y &lt;/math&gt; are in &lt;math&gt; U &lt;/math&gt;. We iteratively select the set in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt; and add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;. An example of this algorithm is presented below. <br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements &lt;math&gt; Y &lt;/math&gt;}<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt; and mark &lt;math&gt; s_i &lt;/math&gt; as selected<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' the selected subsets<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we assign cameras to different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint requiring that the area be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. All 15 constraints, one for each stadium area, are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. These three constraints, together with constraints 2 and 11, are satisfied whenever &lt;math&gt;z_3 = 1&lt;/math&gt; and are no longer needed. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 4 and 7, the left-hand side of constraint 4 is that of constraint 7 plus &lt;math&gt;z_8&lt;/math&gt;; since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, constraint 4 is satisfied whenever constraint 7 is, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Satisfying each of them with exactly one camera gives 4 combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then discuss each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;, and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to equal 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is particularly evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of public facilities placed at different locations. &lt;ref name=&quot;three&quot;&gt; R. Church and C. ReVelle, [https://link.springer.com/article/10.1007/BF01942293 &quot;The maximal covering location problem],&quot; ''Papers of the Regional Science Association'', vol. 32, pp. 101-118, 1974. &lt;/ref&gt; Consider the problem of placing fire stations to serve the towns of some city. &lt;ref name=&quot;four&quot;&gt; E. Aktaş, Ö. Özaydın, B. Bozkaya, F. Ülengin, and Ş. Önsel, [https://pubsonline.informs.org/doi/10.1287/inte.1120.0671 &quot;Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality],&quot; ''Interfaces'', vol. 43, pp. 240-255, 2013. &lt;/ref&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> <br /> A real-world case study on optimizing fire station locations in Istanbul is analyzed by Aktaş et al. &lt;ref name=&quot;four&quot; /&gt; The Istanbul municipality serves 790 subdistricts, which should all be covered by a fire station. 
Each subdistrict is considered covered if it has a neighboring district (a district at most 5 minutes away) that has a fire station. For detailed computational analysis, we refer the reader to the mentioned academic paper.<br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a variant of the set covering problem. &lt;ref name=&quot;five&quot;&gt; J. Ali and V. Dyo, [https://www.scitepress.org/Link.aspx?doi=10.5220/0006469800830088 &quot;Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach],&quot; ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications'', pp. 83-88, 2017. &lt;/ref&gt; &lt;ref name=&quot;eleven&quot;&gt; P.H. Cruz Caminha, R. De Souza Couto, L.H. Maciel Kosmalski Costa, A. Fladenmuller, and M. Dias de Amorim, [https://www.mdpi.com/1424-8220/18/6/1976 &quot;On the Coverage of Bus-Based Mobile Sensing],&quot; ''Sensors'', 2018. &lt;/ref&gt; Specifically, we are given a collection of bus routes '''''R''''', where each route is divided into segments. Route &lt;math&gt; i &lt;/math&gt; is denoted by &lt;math&gt; R_i &lt;/math&gt;, and segment &lt;math&gt; j &lt;/math&gt; is denoted by &lt;math&gt; S_j &lt;/math&gt;. The segments of two different routes can overlap, and each segment is associated with a length &lt;math&gt; a_j &lt;/math&gt;. The goal is then to select the routes that maximize the total covered distance.<br /> <br /> This differs from the other applications because it results in a maximization formulation rather than a minimization formulation. Suppose we want to use at most &lt;math&gt; k &lt;/math&gt; different routes. 
We want to find up to &lt;math&gt; k &lt;/math&gt; routes that maximize the total length of covered segments. Let &lt;math&gt; x_i &lt;/math&gt; be the binary decision variable corresponding to selecting route &lt;math&gt; R_i &lt;/math&gt;, and let &lt;math&gt; y_j &lt;/math&gt; be the binary decision variable associated with covering segment &lt;math&gt; S_j &lt;/math&gt;. Let us also denote the set of routes that cover segment &lt;math&gt; j &lt;/math&gt; by &lt;math&gt; C_j &lt;/math&gt;. The problem is then formulated as follows.<br /> <br /> &lt;math&gt;<br /> \begin{align}<br /> \text{max} &amp; ~~ \sum_{j} a_jy_j\\<br /> \text{s.t.} &amp; ~~ \sum_{i\in C_j} x_i \geq y_j \quad \forall j \\<br /> &amp; ~~ \sum_{i} x_i \leq k \\ <br /> &amp; ~~ x_i,y_{j} \in \{0,1\} \\<br /> \end{align}<br /> &lt;/math&gt;<br /> <br /> The work by Ali and Dyo explores a greedy approximation algorithm to solve an optimal selection problem including 713 bus routes in Greater London. &lt;ref name=&quot;five&quot; /&gt; Using only 14% of the routes (100 routes), the greedy algorithm returns a solution that covers 25% of the segments in Greater London. For details of the approximation algorithm and the real-world case study, we refer the reader to this reference. &lt;ref name=&quot;five&quot; /&gt; For a significantly larger case study involving 5747 buses covering 5060 km, we refer the reader to this academic article. &lt;ref name=&quot;eleven&quot; /&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;six&quot;&gt; E. Marchiori and A. Steenbeek, [https://link.springer.com/chapter/10.1007/3-540-45561-2_36 &quot;An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling],&quot; ''Real-World Applications of Evolutionary Computing. EvoWorkshops 2000. 
Lecture Notes in Computer Science'', 2000. &lt;/ref&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey, which contains several problem instances with the number of flights ranging from 1013 to 7765 flights, for a detailed analysis of the formulation and algorithms that pertain to this significant application. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;eight&quot;&gt; A. Kasirzadeh, M. Saddoune, and F. Soumis [https://www.sciencedirect.com/science/article/pii/S2192437620300820?via%3Dihub &quot;Airline crew scheduling: models, algorithms, and data sets],&quot; ''EURO Journal on Transportation and Logistics'', vol. 6, pp. 111-137, 2017. &lt;/ref&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the least number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> == References ==<br /> &lt;references /&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2547 Set covering problem 2020-12-14T00:20:33Z <p>Khaledfahat: /* Integer linear program formulation */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. 
Given a collection of elements, the set covering problem aims to find the minimum number of sets that incorporate (cover) all of these elements. &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> The importance of the set covering problem has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a prime example in teaching computational algorithms. We present an overview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion. &lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;seven&quot;&gt; P. Slavík, [https://www.sciencedirect.com/science/article/abs/pii/S0196677497908877 &quot;A Tight Analysis of the Greedy Algorithm for Set Cover],&quot; ''Journal of Algorithms'', vol. 25, pp. 237-245, 1997. &lt;/ref&gt; &lt;ref name=&quot;nine&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?],&quot; ''Operations Research Letters'', vol. 44, pp. 366-369, 2016. &lt;/ref&gt;<br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. 
Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s. &lt;ref name=&quot;two&quot;&gt; J. Rubin, [https://www.jstor.org/stable/25767684?seq=1 &quot;A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling],&quot; ''Transportation Science'', vol. 7, pp. 34-48, 1973. &lt;/ref&gt;<br /> <br /> == Problem formulation ==<br /> In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding a collection of subsets whose union covers the universal set at minimum total cost.<br /> <br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. &lt;math&gt; \bigcup_{i=1}^n s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i &lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i &lt;/math&gt; such that &lt;math&gt; c_i &gt; 0 &lt;/math&gt;. 
The objective is to find the minimum cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i: u_j \in s_i} y_i \geq 1, \forall j = 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1, \ldots, n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when all costs are equal, this is the same as minimizing the number of selected subsets. The first constraint requires that every element &lt;math&gt; u_j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; be covered by at least one selected subset, and the second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary, which means that every set is either in the set cover or not.<br /> <br /> Set covering is an NP-hard optimization problem, so no polynomial-time exact algorithm is known, and the running time of exact methods can grow exponentially with the size of the problem. Therefore, approximation algorithms are used to obtain near-optimal solutions to large-scale problems in polynomial time. In subsequent sections, we cover two of the most widely used approximation methods for the set cover problem: linear program relaxation methods and classical greedy algorithms. 
&lt;ref name=&quot;seven&quot; /&gt;<br /> <br /> == Approximation via LP relaxation and rounding ==<br /> Set covering is a classical integer programming problem, and solving integer programs in general is NP-hard. Therefore, one approach to achieve an &lt;math&gt; O(\log n) &lt;/math&gt; approximation to the set covering problem in polynomial time is to solve it via linear programming (LP) ''relaxation'' algorithms. In LP relaxation, we relax the integrality requirement into linear constraints. For instance, if we replace the constraints &lt;math&gt; y_i \in \{0, 1\}&lt;/math&gt; with the constraints &lt;math&gt; 0 \leq y_i \leq 1 &lt;/math&gt;, we obtain the following LP problem that can be solved in polynomial time:<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> subject to &lt;math&gt; \sum_{i: u_j \in s_i} y_i \geq 1, \forall j = 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; 0 \leq y_i \leq 1, \forall i = 1, \ldots, n&lt;/math&gt;<br /> <br /> The above LP formulation is a relaxation of the original ILP set cover problem. This means that every feasible solution of the integer program is also feasible for this LP. Additionally, any feasible solution of the integer program has the same value in the LP, since the objective functions of the integer and linear programs are the same. Solving the LP therefore yields an optimal value that is a lower bound for the original integer program, because the LP minimizes the same objective over a larger feasible region. Moreover, we use LP rounding algorithms to directly round the fractional LP solution to an integral combinatorial solution as follows:<br /> &lt;br&gt;<br /> <br /> <br /> '''Deterministic rounding algorithm''' <br /> &lt;br&gt;<br /> <br /> Suppose we have an optimal solution &lt;math&gt; z^* &lt;/math&gt; for the linear programming relaxation of the set cover problem. 
We round the fractional solution &lt;math&gt; z^* &lt;/math&gt; to an integer solution &lt;math&gt; z &lt;/math&gt; using an LP rounding algorithm. In general, there are two approaches to rounding: deterministic and randomized. In this section, we explain the deterministic approach. In this approach, we include subset &lt;math&gt; s_i &lt;/math&gt; in our solution if &lt;math&gt; z_i^* \geq 1/d &lt;/math&gt;, where &lt;math&gt; d &lt;/math&gt; is the maximum number of sets in which any element appears. In practice, we set &lt;math&gt; z_i &lt;/math&gt; as follows:<br /> <br /> &lt;math&gt; z_i = \begin{cases} 1, &amp; \text{if } z_i^* \geq 1/d \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> The rounding algorithm is an approximation algorithm for the set cover problem. The algorithm runs in polynomial time, and &lt;math&gt; z &lt;/math&gt; is a feasible solution to the integer program: each element belongs to at most &lt;math&gt; d &lt;/math&gt; subsets, so its covering constraint forces at least one of them to have &lt;math&gt; z_i^* \geq 1/d &lt;/math&gt;. Since &lt;math&gt; z_i \leq d \, z_i^* &lt;/math&gt;, the cost of &lt;math&gt; z &lt;/math&gt; is at most &lt;math&gt; d &lt;/math&gt; times the LP optimum.<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find optimal or near-optimal solutions for large-scale set covering instances in polynomial time. &lt;ref name=&quot;seven&quot; /&gt; &lt;ref name=&quot;nine&quot; /&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of uncovered elements in the universe &lt;math&gt; U &lt;/math&gt; and removes those elements from the uncovered set, until all elements are covered. &lt;ref name=&quot;ten&quot;&gt; V. Chvatal, [https://pubsonline.informs.org/doi/abs/10.1287/moor.4.3.233 &quot;Greedy Heuristic for the Set-Covering Problem],&quot; ''Mathematics of Operations Research'', vol. 4, pp. 233-235, 1979. &lt;/ref&gt; Let &lt;math&gt; T &lt;/math&gt; be the set that contains the covered elements, and &lt;math&gt; U &lt;/math&gt; be the set that contains the elements &lt;math&gt; Y &lt;/math&gt; that are still uncovered. 
At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and all elements &lt;math&gt; Y &lt;/math&gt; are in &lt;math&gt; U &lt;/math&gt;. We iteratively select the subset in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt; and add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;. An example of this algorithm is presented below. <br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements &lt;math&gt; Y &lt;/math&gt;}<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; record &lt;math&gt; s_i &lt;/math&gt; as selected and add its elements to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' the selected subsets<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we assign cameras at different locations. Each location covers some areas of stadiums, and our goal is to install the fewest cameras such that all areas of stadiums are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there’s a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. All 15 constraints, which correspond to the 15 stadium areas, are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus we no longer need constraints 2 and 11, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is satisfied, since its left-hand side contains &lt;math&gt;z_1 + z_2&lt;/math&gt; and the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Choosing exactly one variable from each of these two constraints gives four candidate combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to equal 1, they are the optimal solutions. No solution can do better, because &lt;math&gt;z_3 = z_5 = 1&lt;/math&gt; is forced and constraints 7 and 15 each require one more camera.<br /> <br /> To conclude, our two solutions are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations. &lt;ref name=&quot;three&quot;&gt; R. Church and C. ReVelle, [https://link.springer.com/article/10.1007/BF01942293 &quot;The maximal covering location problem],&quot; ''Papers of the Regional Science Association'', vol. 32, pp. 101-118, 1974. &lt;/ref&gt; Consider the problem of placing fire stations to serve the towns of some city. &lt;ref name=&quot;four&quot;&gt; E. Aktaş, Ö. Özaydın, B. Bozkaya, F. Ülengin, and Ş. Önsel, [https://pubsonline.informs.org/doi/10.1287/inte.1120.0671 &quot;Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality],&quot; ''Interfaces'', vol. 43, pp. 240-255, 2013. &lt;/ref&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a set of adjacent towns. The problem is then solved to minimize the required number of fire stations to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> subject to &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> <br /> A real-world case study involving optimizing fire station locations in Istanbul is analyzed in this reference. &lt;ref name=&quot;four&quot; /&gt; The Istanbul municipality serves 790 subdistricts, which should all be covered by a fire station. 
Each subdistrict is considered covered if it has a neighboring district (a district at most 5 minutes away) that has a fire station. For detailed computational analysis, we refer the reader to the mentioned academic paper.<br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a variant of the set covering problem. &lt;ref name=&quot;five&quot;&gt; J. Ali and V. Dyo, [https://www.scitepress.org/Link.aspx?doi=10.5220/0006469800830088 &quot;Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach],&quot; ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications'', pp. 83-88, 2017. &lt;/ref&gt; &lt;ref name=&quot;eleven&quot;&gt; P.H. Cruz Caminha, R. De Souza Couto, L.H. Maciel Kosmalski Costa, A. Fladenmuller, and M. Dias de Amorim, [https://www.mdpi.com/1424-8220/18/6/1976 &quot;On the Coverage of Bus-Based Mobile Sensing],&quot; ''Sensors'', 2018. &lt;/ref&gt; Specifically, we are given a collection of bus routes '''''R''''', where each route is divided into segments. Route &lt;math&gt; i &lt;/math&gt; is denoted by &lt;math&gt; R_i &lt;/math&gt;, and segment &lt;math&gt; j &lt;/math&gt; is denoted by &lt;math&gt; S_j &lt;/math&gt;. The segments of two different routes can overlap, and each segment is associated with a length &lt;math&gt; a_j &lt;/math&gt;. The goal is then to select the routes that maximize the total covered distance.<br /> <br /> This differs from the other applications because it results in a maximization formulation rather than a minimization formulation. Suppose we want to use at most &lt;math&gt; k &lt;/math&gt; different routes. 
We want to find up to &lt;math&gt; k &lt;/math&gt; routes that maximize the total length of covered segments. Let &lt;math&gt; x_i &lt;/math&gt; be the binary decision variable corresponding to selecting route &lt;math&gt; R_i &lt;/math&gt;, and let &lt;math&gt; y_j &lt;/math&gt; be the binary decision variable associated with covering segment &lt;math&gt; S_j &lt;/math&gt;. Let us also denote the set of routes that cover segment &lt;math&gt; j &lt;/math&gt; by &lt;math&gt; C_j &lt;/math&gt;. The problem is then formulated as follows.<br /> <br /> &lt;math&gt;<br /> \begin{align}<br /> \text{max} &amp; ~~ \sum_{j} a_jy_j\\<br /> \text{s.t.} &amp; ~~ \sum_{i\in C_j} x_i \geq y_j \quad \forall j \\<br /> &amp; ~~ \sum_{i} x_i \leq k \\ <br /> &amp; ~~ x_i,y_{j} \in \{0,1\} \\<br /> \end{align}<br /> &lt;/math&gt;<br /> <br /> The work by Ali and Dyo explores a greedy approximation algorithm to solve an optimal selection problem including 713 bus routes in Greater London. &lt;ref name=&quot;five&quot; /&gt; Using only 14% of the routes (100 routes), the greedy algorithm returns a solution that covers 25% of the segments in Greater London. For details of the approximation algorithm and the real-world case study, we refer the reader to this reference. &lt;ref name=&quot;five&quot; /&gt; For a significantly larger case study involving 5747 buses covering 5060 km, we refer the reader to this academic article. &lt;ref name=&quot;eleven&quot; /&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;six&quot;&gt; E. Marchiori and A. Steenbeek, [https://link.springer.com/chapter/10.1007/3-540-45561-2_36 &quot;An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling],&quot; ''Real-World Applications of Evolutionary Computing. EvoWorkshops 2000. 
Lecture Notes in Computer Science'', 2000. &lt;/ref&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey, which contains several problem instances with the number of flights ranging from 1013 to 7765 flights, for a detailed analysis of the formulation and algorithms that pertain to this significant application. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;eight&quot;&gt; A. Kasirzadeh, M. Saddoune, and F. Soumis [https://www.sciencedirect.com/science/article/pii/S2192437620300820?via%3Dihub &quot;Airline crew scheduling: models, algorithms, and data sets],&quot; ''EURO Journal on Transportation and Logistics'', vol. 6, pp. 111-137, 2017. &lt;/ref&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the least number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> == References ==<br /> &lt;references /&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2546 Set covering problem 2020-12-14T00:18:24Z <p>Khaledfahat: /* Integer linear program formulation */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. 
Given a collection of elements, the set covering problem aims to find the minimum number of sets that incorporate (cover) all of these elements. &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> The importance of the set covering problem has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a prime example in teaching computational algorithms. We present an overview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion. &lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;seven&quot;&gt; P. Slavík, [https://www.sciencedirect.com/science/article/abs/pii/S0196677497908877 &quot;A Tight Analysis of the Greedy Algorithm for Set Cover],&quot; ''Journal of Algorithms'', vol. 25, pp. 237-245, 1997. &lt;/ref&gt; &lt;ref name=&quot;nine&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?],&quot; ''Operations Research Letters'', vol. 44, pp. 366-369, 2016. &lt;/ref&gt;<br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. 
Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s. &lt;ref name=&quot;two&quot;&gt; J. Rubin, [https://www.jstor.org/stable/25767684?seq=1 &quot;A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling],&quot; ''Transportation Science'', vol. 7, pp. 34-48, 1973. &lt;/ref&gt;<br /> <br /> == Problem formulation ==<br /> In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. The problem is then to find a collection of subsets whose union covers the universal set at minimum total cost.<br /> <br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e., &lt;math&gt; \bigcup_{i=1}^n s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i&lt;/math&gt; such that &lt;math&gt; c_i &gt; 0&lt;/math&gt;. 
The objective is to find the minimum-cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows&lt;ref&gt;1&lt;/ref&gt;:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i:\, u_j \in s_i} y_i \geq 1, \quad \forall j = 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \quad \forall i = 1, \ldots, n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when all costs are equal, this is the same as minimizing the number of selected subsets. The first constraint requires that every element &lt;math&gt; u_j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; be covered by at least one selected subset, and the second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary, which means that every set is either in the set cover or not.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time exact algorithm is known, and the running time of exact methods can grow exponentially with the size of the problem in the worst case. For this reason, approximation algorithms are used to obtain near-optimal solutions to large-scale problems in polynomial time. In subsequent sections, we cover two of the most widely used approximation approaches for the set cover problem: linear program relaxation methods and classical greedy algorithms. 
&lt;ref name=&quot;seven&quot; /&gt;<br /> <br /> == Approximation via LP relaxation and rounding ==<br /> Set covering is a classical integer programming problem, and solving integer programs in general is NP-hard. Therefore, one approach to achieving an &lt;math&gt; O(\log n) &lt;/math&gt; approximation to the set covering problem in polynomial time is to solve its linear programming (LP) ''relaxation''. In LP relaxation, we relax the integrality requirement into linear constraints. For instance, if we replace the constraints &lt;math&gt; y_i \in \{0, 1\}&lt;/math&gt; with the constraints &lt;math&gt; 0 \leq y_i \leq 1 &lt;/math&gt;, we obtain the following LP problem that can be solved in polynomial time:<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> subject to &lt;math&gt; \sum_{i:\, u_j \in s_i} y_i \geq 1, \quad \forall j = 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; 0 \leq y_i \leq 1, \quad \forall i = 1, \ldots, n&lt;/math&gt;<br /> <br /> The above LP formulation is a relaxation of the original ILP set cover problem. This means that every feasible solution of the integer program is also feasible for this LP. Additionally, any feasible solution of the integer program has the same objective value in the LP, since the objective functions of the integer and linear programs are the same. Solving the LP therefore yields an optimal value that is a lower bound on the optimal value of the original integer program, because the LP minimizes the same objective over a larger feasible region. Moreover, we use LP rounding algorithms to directly round the fractional LP solution to an integral combinatorial solution as follows:<br /> &lt;br&gt;<br /> <br /> <br /> '''Deterministic rounding algorithm''' <br /> &lt;br&gt;<br /> <br /> Suppose we have an optimal solution &lt;math&gt; z^* &lt;/math&gt; for the linear programming relaxation of the set cover problem. 
We round the fractional solution &lt;math&gt; z^* &lt;/math&gt; to an integer solution &lt;math&gt; z &lt;/math&gt; using an LP rounding algorithm. In general, there are two approaches to rounding: deterministic and randomized. In this section, we explain the deterministic approach. In this approach, we include subset &lt;math&gt; s_i &lt;/math&gt; in our solution if &lt;math&gt; z_i^* \geq 1/d &lt;/math&gt;, where &lt;math&gt; d &lt;/math&gt; is the maximum number of sets in which any element appears. In practice, we set &lt;math&gt; z_i &lt;/math&gt; as follows:<br /> <br /> &lt;math&gt; z_i = \begin{cases} 1, &amp; \text{if } z_i^* \geq 1/d \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> The rounding algorithm is an approximation algorithm for the set cover problem, with a cost of at most &lt;math&gt; d &lt;/math&gt; times the LP optimum. The algorithm runs in polynomial time, and &lt;math&gt; z &lt;/math&gt; is a feasible solution to the integer program: each element appears in at most &lt;math&gt; d &lt;/math&gt; sets whose LP values sum to at least 1, so at least one of those sets has &lt;math&gt; z_i^* \geq 1/d &lt;/math&gt; and is selected.<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to obtain near-optimal solutions for large-scale set covering instances in polynomial time. &lt;ref name=&quot;seven&quot; /&gt; &lt;ref name=&quot;nine&quot; /&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of uncovered elements in the universe &lt;math&gt; U &lt;/math&gt; and removes those elements from the uncovered set, until all elements are covered. &lt;ref name=&quot;ten&quot;&gt; V. Chvatal, [https://pubsonline.informs.org/doi/abs/10.1287/moor.4.3.233 &quot;Greedy Heuristic for the Set-Covering Problem],&quot; ''Mathematics of Operations Research'', vol. 4, pp. 233-235, 1979. &lt;/ref&gt; Let &lt;math&gt; T &lt;/math&gt; be the set that contains the covered elements, and &lt;math&gt; U &lt;/math&gt; be the set that contains the elements &lt;math&gt; Y &lt;/math&gt; that are still uncovered. 
At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and all elements &lt;math&gt; Y &lt;/math&gt; are in &lt;math&gt; U &lt;/math&gt;. We iteratively select the set in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt; and add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;. An example of this algorithm is presented below. <br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements &lt;math&gt; Y &lt;/math&gt;}<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' the selected subsets &lt;math&gt; s_i &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example in which we place cameras at different locations in a stadium. Each location covers some areas of the stadium, and our goal is to install the fewest cameras such that all areas of the stadium are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. All 15 constraints, which correspond to the 15 stadium areas, are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus we no longer need constraints 2 and 11, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10 is &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, so &lt;math&gt;z_5&lt;/math&gt; must equal 1. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is satisfied, since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Each can be satisfied by setting exactly one of its variables to 1, which gives four combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to equal 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations. &lt;ref name=&quot;three&quot;&gt; R. Church and C. ReVelle, [https://link.springer.com/article/10.1007/BF01942293 &quot;The maximal covering location problem],&quot; ''Papers of the Regional Science Association'', vol. 32, pp. 101-118, 1974. &lt;/ref&gt; Consider the problem of placing fire stations to serve the towns of some city. &lt;ref name=&quot;four&quot;&gt; E. Aktaş, Ö. Özaydın, B. Bozkaya, F. Ülengin, and Ş. Önsel, [https://pubsonline.informs.org/doi/10.1287/inte.1120.0671 &quot;Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality],&quot; ''Interfaces'', vol. 43, pp. 240-255, 2013. &lt;/ref&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to choosing to build a fire station at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns including town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \quad \forall i&lt;/math&gt; <br /> <br /> A real-world case study involving optimizing fire station locations in Istanbul is analyzed in this reference. &lt;ref name=&quot;four&quot; /&gt; The Istanbul municipality serves 790 subdistricts, which should all be covered by a fire station. 
Each subdistrict is considered covered if it has a neighboring district (a district at most 5 minutes away) that has a fire station. For a detailed computational analysis, we refer the reader to the mentioned academic paper.<br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a set covering problem. &lt;ref name=&quot;five&quot;&gt; J. Ali and V. Dyo, [https://www.scitepress.org/Link.aspx?doi=10.5220/0006469800830088 &quot;Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach],&quot; ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications'', pp. 83-88, 2017. &lt;/ref&gt; &lt;ref name=&quot;eleven&quot;&gt; P.H. Cruz Caminha, R. De Souza Couto, L.H. Maciel Kosmalski Costa, A. Fladenmuller, and M. Dias de Amorim, [https://www.mdpi.com/1424-8220/18/6/1976 &quot;On the Coverage of Bus-Based Mobile Sensing],&quot; ''Sensors'', 2018. &lt;/ref&gt; Specifically, we are given a collection of bus routes '''''R''''', where each route is divided into segments. Route &lt;math&gt; i &lt;/math&gt; is denoted by &lt;math&gt; R_i &lt;/math&gt;, and segment &lt;math&gt; j &lt;/math&gt; is denoted by &lt;math&gt; S_j &lt;/math&gt;. The segments of two different routes can overlap, and each segment is associated with a length &lt;math&gt; a_j &lt;/math&gt;. The goal is then to select the routes that maximize the total covered distance.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Suppose we want to use at most &lt;math&gt; k &lt;/math&gt; different routes. 
We want to find &lt;math&gt; k &lt;/math&gt; routes that maximize the total length of the covered segments. Let &lt;math&gt; x_i &lt;/math&gt; be the binary decision variable corresponding to selecting route &lt;math&gt; R_i &lt;/math&gt;, and let &lt;math&gt; y_j &lt;/math&gt; be the binary decision variable indicating whether segment &lt;math&gt; S_j &lt;/math&gt; is covered. Let us also denote the set of routes that cover segment &lt;math&gt; j &lt;/math&gt; by &lt;math&gt; C_j &lt;/math&gt;. The problem is then formulated as follows.<br /> <br /> &lt;math&gt;<br /> \begin{align}<br /> \text{max} &amp; ~~ \sum_{j} a_jy_j\\<br /> \text{s.t.} &amp; ~~ \sum_{i\in C_j} x_i \geq y_j \quad \forall j \\<br /> &amp; ~~ \sum_{i} x_i = k \\ <br /> &amp; ~~ x_i,y_{j} \in \{0,1\} \\<br /> \end{align}<br /> &lt;/math&gt;<br /> <br /> The work by Ali and Dyo explores a greedy approximation algorithm to solve an optimal selection problem involving 713 bus routes in Greater London. &lt;ref name=&quot;five&quot; /&gt; Using only 14% of the routes (100 routes), the greedy algorithm returns a solution that covers 25% of the segments in Greater London. For details of the approximation algorithm and the real-world case study, we refer the reader to this reference. &lt;ref name=&quot;five&quot; /&gt; For a significantly larger case study involving 5747 buses covering 5060 km, we refer the reader to this academic article. &lt;ref name=&quot;eleven&quot; /&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;six&quot;&gt; E. Marchiori and A. Steenbeek, [https://link.springer.com/chapter/10.1007/3-540-45561-2_36 &quot;An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling],&quot; ''Real-World Applications of Evolutionary Computing. EvoWorkshops 2000. 
Lecture Notes in Computer Science'', 2000. &lt;/ref&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey, which contains several problem instances with the number of flights ranging from 1013 to 7765 flights, for a detailed analysis of the formulation and algorithms that pertain to this significant application. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;eight&quot;&gt; A. Kasirzadeh, M. Saddoune, and F. Soumis [https://www.sciencedirect.com/science/article/pii/S2192437620300820?via%3Dihub &quot;Airline crew scheduling: models, algorithms, and data sets],&quot; ''EURO Journal on Transportation and Logistics'', vol. 6, pp. 111-137, 2017. &lt;/ref&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the least number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> == References ==<br /> &lt;references /&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2544 Set covering problem 2020-12-14T00:07:28Z <p>Khaledfahat: /* Approximation via LP relaxation and rounding */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. 
Given a collection of elements, the set covering problem aims to find the minimum number of sets that incorporate (cover) all of these elements. &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> The importance of the set covering problem has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a prime example in teaching computational algorithms. We present an overview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion. &lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;seven&quot;&gt; P. Slavík, [https://www.sciencedirect.com/science/article/abs/pii/S0196677497908877 &quot;A Tight Analysis of the Greedy Algorithm for Set Cover],&quot; ''Journal of Algorithms'', vol. 25, pp. 237-245, 1997. &lt;/ref&gt; &lt;ref name=&quot;nine&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?],&quot; ''Operations Research Letters'', vol. 44, pp. 366-369, 2016. &lt;/ref&gt;<br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. 
Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s. &lt;ref name=&quot;two&quot;&gt; J. Rubin, [https://www.jstor.org/stable/25767684?seq=1 &quot;A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling],&quot; ''Transportation Science'', vol. 7, pp. 34-48, 1973. &lt;/ref&gt;<br /> <br /> == Problem formulation ==<br /> In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding a sub-collection of these subsets whose union covers the universal set at minimum total cost.<br /> <br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e., &lt;math&gt; \cup_{i=1}^n s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i&lt;/math&gt; such that &lt;math&gt; c_i &gt; 0&lt;/math&gt;. 
The objective is to find the minimum-cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe U.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i:\, u_j \in s_i} y_i \geq 1, \quad \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \quad \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when all costs are equal, this is the same as minimizing the number of selected subsets. The first constraint requires that every element &lt;math&gt; u_j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; be covered by at least one selected subset that contains it, and the second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary, which means that every set is either in the set cover or not.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time exact algorithm is known, and the running time of known exact methods can grow exponentially with the size of the problem in the worst case. For this reason, approximation algorithms are used to obtain near-optimal solutions to large-scale instances in polynomial time. In subsequent sections, we cover two of the most widely used approximation methods for the set cover problem: linear program relaxation methods and classical greedy algorithms. 
&lt;ref name=&quot;seven&quot; /&gt;<br /> <br /> == Approximation via LP relaxation and rounding ==<br /> Set covering is a classical integer programming problem, and solving integer programs in general is NP-hard. Therefore, one approach to achieve an &lt;math&gt; O(\log n)&lt;/math&gt; approximation to the set covering problem in polynomial time is to solve it via linear programming (LP) ''relaxation'' algorithms. In LP relaxation, we relax the integrality requirement into linear constraints. For instance, if we replace the constraints &lt;math&gt; y_i \in \{0, 1\}&lt;/math&gt; with the constraints &lt;math&gt; 0 \leq y_i \leq 1 &lt;/math&gt;, we obtain the following LP problem that can be solved in polynomial time:<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> subject to &lt;math&gt; \sum_{i:\, u_j \in s_i} y_i \geq 1, \quad \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; 0 \leq y_i \leq 1, \quad \forall i = 1,\ldots,n&lt;/math&gt;<br /> <br /> The above LP formulation is a relaxation of the original ILP set cover problem. This means that every feasible solution of the integer program is also feasible for this LP. Additionally, any feasible solution of the integer program has the same objective value in the LP, since the objective functions of the integer and linear programs are the same. Consequently, the optimal value of the LP is a lower bound on the optimal value of the original integer program, since the LP minimizes over a larger feasible region. We then use LP rounding algorithms to round the fractional LP solution directly to an integral combinatorial solution as follows:<br /> &lt;br&gt;<br /> <br /> <br /> '''Deterministic rounding algorithm''' <br /> &lt;br&gt;<br /> <br /> Suppose we have an optimal solution &lt;math&gt; z^* &lt;/math&gt; for the linear programming relaxation of the set cover problem. 
We round the fractional solution &lt;math&gt; z^* &lt;/math&gt; to an integer solution &lt;math&gt; z &lt;/math&gt; using an LP rounding algorithm. In general, there are two approaches to rounding: deterministic and randomized. In this section, we explain the deterministic approach. We include subset &lt;math&gt; s_i &lt;/math&gt; in our solution if &lt;math&gt; z_i^* \geq 1/d &lt;/math&gt;, where &lt;math&gt; d &lt;/math&gt; is the maximum number of sets in which any element appears. In practice, we set &lt;math&gt; z_i &lt;/math&gt; as follows:<br /> <br /> &lt;math&gt; z_i = \begin{cases} 1, &amp; \text{if } z_i^* \geq 1/d \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> The rounding algorithm is an approximation algorithm for the set cover problem. It runs in polynomial time, and &lt;math&gt; z &lt;/math&gt; is a feasible solution to the integer program: each element appears in at most &lt;math&gt; d &lt;/math&gt; sets, so at least one variable in its covering constraint has value at least &lt;math&gt; 1/d &lt;/math&gt; and is rounded up to 1. Because each selected variable is scaled up by a factor of at most &lt;math&gt; d &lt;/math&gt;, the cost of &lt;math&gt; z &lt;/math&gt; is at most &lt;math&gt; d &lt;/math&gt; times the LP optimum.<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions for large-scale set covering instances in polynomial time. &lt;ref name=&quot;seven&quot; /&gt; &lt;ref name=&quot;nine&quot; /&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset that covers the largest number of still-uncovered elements of the universe &lt;math&gt; U &lt;/math&gt; and removes the newly covered elements from the uncovered set, until all elements are covered. &lt;ref name=&quot;ten&quot;&gt; V. Chvatal, [https://pubsonline.informs.org/doi/abs/10.1287/moor.4.3.233 &quot;Greedy Heuristic for the Set-Covering Problem],&quot; ''Mathematics of Operations Research'', vol. 4, pp. 233-235, 1979. &lt;/ref&gt; Let &lt;math&gt; T &lt;/math&gt; be the set that contains the covered elements, and let &lt;math&gt; U &lt;/math&gt; be the set that contains the elements &lt;math&gt; Y &lt;/math&gt; that are still uncovered. 
At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and every element &lt;math&gt; Y &lt;/math&gt; belongs to &lt;math&gt; U &lt;/math&gt;. We iteratively select the subset in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt; and add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;. The algorithm is presented below. <br /> <br /> '''Greedy algorithm for minimum set cover: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements &lt;math&gt; Y &lt;/math&gt;}<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt; and mark &lt;math&gt; s_i &lt;/math&gt; as selected<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' the selected subsets of &lt;math&gt; S &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we assign cameras to different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area, there’s a constraint that stadium area &lt;math&gt;i&lt;/math&gt; has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. All 15 constraints, which correspond to the 15 stadium areas, are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. We therefore no longer need constraints 2 and 11, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is satisfied, since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Each of these two constraints can be satisfied by setting exactly one of its variables to 1, which gives four combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to be 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations. &lt;ref name=&quot;three&quot;&gt; R. Church and C. ReVelle, [https://link.springer.com/article/10.1007/BF01942293 &quot;The maximal covering location problem],&quot; ''Papers of the Regional Science Association'', vol. 32, pp. 101-118, 1974. &lt;/ref&gt; Consider the problem of placing fire stations to serve the towns of some city. &lt;ref name=&quot;four&quot;&gt; E. Aktaş, Ö. Özaydın, B. Bozkaya, F. Ülengin, and Ş. Önsel, [https://pubsonline.informs.org/doi/10.1287/inte.1120.0671 &quot;Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality],&quot; ''Interfaces'', vol. 43, pp. 240-255, 2013. &lt;/ref&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a set of adjacent towns. The problem is then solved to minimize the required number of fire stations to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to choosing to build a fire station at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns including town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \quad \forall i&lt;/math&gt; <br /> <br /> A real-world case study involving optimizing fire station locations in Istanbul is analyzed in this reference. &lt;ref name=&quot;four&quot; /&gt; The Istanbul municipality serves 790 subdistricts, which should all be covered by a fire station. 
Each subdistrict is considered covered if it has a neighboring district (a district at most 5 minutes away) that has a fire station. For a detailed computational analysis, we refer the reader to the mentioned academic paper.<br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors could be formulated as a set covering problem. &lt;ref name=&quot;five&quot;&gt; J. Ali and V. Dyo, [https://www.scitepress.org/Link.aspx?doi=10.5220/0006469800830088 &quot;Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach],&quot; ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications'', pp. 83-88, 2017. &lt;/ref&gt; &lt;ref name=&quot;eleven&quot;&gt; P.H. Cruz Caminha, R. De Souza Couto, L.H. Maciel Kosmalski Costa, A. Fladenmuller, and M. Dias de Amorim, [https://www.mdpi.com/1424-8220/18/6/1976 &quot;On the Coverage of Bus-Based Mobile Sensing],&quot; ''Sensors'', 2018. &lt;/ref&gt; Specifically, we are given a collection of bus routes '''''R''''', where each route is divided into segments. Route &lt;math&gt; i &lt;/math&gt; is denoted by &lt;math&gt; R_i &lt;/math&gt;, and segment &lt;math&gt; j &lt;/math&gt; is denoted by &lt;math&gt; S_j &lt;/math&gt;. The segments of two different routes can overlap, and each segment is associated with a length &lt;math&gt; a_j &lt;/math&gt;. The goal is then to select the routes that maximize the total covered distance.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Suppose we want to use at most &lt;math&gt; k &lt;/math&gt; different routes. 
We want to find &lt;math&gt; k &lt;/math&gt; routes that maximize the total length of the covered segments. Let &lt;math&gt; x_i &lt;/math&gt; be the binary decision variable corresponding to selecting route &lt;math&gt; R_i &lt;/math&gt;, and let &lt;math&gt; y_j &lt;/math&gt; be the binary decision variable associated with covering segment &lt;math&gt; S_j &lt;/math&gt;. Let us also denote the set of routes that cover segment &lt;math&gt; j &lt;/math&gt; by &lt;math&gt; C_j &lt;/math&gt;. The problem is then formulated as follows.<br /> <br /> &lt;math&gt;<br /> \begin{align}<br /> \text{max} &amp; ~~ \sum_{j} a_jy_j\\<br /> \text{s.t.} &amp; ~~ \sum_{i\in C_j} x_i \geq y_j \quad \forall j \\<br /> &amp; ~~ \sum_{i} x_i = k \\ <br /> &amp; ~~ x_i,y_{j} \in \{0,1\} \\<br /> \end{align}<br /> &lt;/math&gt;<br /> <br /> The work by Ali and Dyo explores a greedy approximation algorithm to solve an optimal selection problem including 713 bus routes in Greater London. &lt;ref name=&quot;five&quot; /&gt; Using only 14% of the routes (100 routes), the greedy algorithm returns a solution that covers 25% of the segments in Greater London. For details of the approximation algorithm and the real-world case study, we refer the reader to this reference. &lt;ref name=&quot;five&quot; /&gt; For a significantly larger case study involving 5747 buses covering 5060 km, we refer the reader to this academic article. &lt;ref name=&quot;eleven&quot; /&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;six&quot;&gt; E. Marchiori and A. Steenbeek, [https://link.springer.com/chapter/10.1007/3-540-45561-2_36 &quot;An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling],&quot; ''Real-World Applications of Evolutionary Computing. EvoWorkshops 2000. 
Lecture Notes in Computer Science'', 2000. &lt;/ref&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey, which contains several problem instances with the number of flights ranging from 1013 to 7765 flights, for a detailed analysis of the formulation and algorithms that pertain to this significant application. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;eight&quot;&gt; A. Kasirzadeh, M. Saddoune, and F. Soumis [https://www.sciencedirect.com/science/article/pii/S2192437620300820?via%3Dihub &quot;Airline crew scheduling: models, algorithms, and data sets],&quot; ''EURO Journal on Transportation and Logistics'', vol. 6, pp. 111-137, 2017. &lt;/ref&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the least number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> == References ==<br /> &lt;references /&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2534 Set covering problem 2020-12-13T22:00:55Z <p>Khaledfahat: </p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. 
Given a collection of elements, the set covering problem aims to find the minimum number of sets that incorporate (cover) all of these elements. &lt;ref name=&quot;one&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;Computational experience with approximation algorithms for the set covering problem],&quot; ''European Journal of Operational Research'', vol. 101, pp. 81-92, 1997. &lt;/ref&gt;<br /> <br /> The importance of the set covering problem has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a prime example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion. &lt;ref name=&quot;one&quot; /&gt; &lt;ref name=&quot;seven&quot;&gt; P. Slavík, [https://www.sciencedirect.com/science/article/abs/pii/S0196677497908877 &quot;A Tight Analysis of the Greedy Algorithm for Set Cover],&quot; ''Journal of Algorithms'', vol. 25, pp. 237-245, 1997. &lt;/ref&gt; &lt;ref name=&quot;nine&quot;&gt; T. Grossman and A. Wool, [https://www.sciencedirect.com/science/article/abs/pii/S0377221796001610 &quot;What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?],&quot; ''Operations Research Letters'', vol. 44, pp. 366-369, 2016. &lt;/ref&gt;<br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. 
Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s. &lt;ref name=&quot;two&quot;&gt; J. Rubin, [https://www.jstor.org/stable/25767684?seq=1 &quot;A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling],&quot; ''Transportation Science'', vol. 7, pp. 34-48, 1973. &lt;/ref&gt;<br /> <br /> == Problem formulation ==<br /> In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding a sub-collection of these subsets whose union covers the universal set at minimum total cost.<br /> <br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e., &lt;math&gt; \cup_{i=1}^n s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i&lt;/math&gt; such that &lt;math&gt; c_i &gt; 0&lt;/math&gt;. 
The objective is to find the minimum-cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe U.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i:\, u_j \in s_i} y_i \geq 1, \quad \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \quad \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when all costs are equal, this is the same as minimizing the number of selected subsets. The first constraint requires that every element &lt;math&gt; u_j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; be covered by at least one selected subset that contains it, and the second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary, which means that every set is either in the set cover or not.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time exact algorithm is known, and the running time of known exact methods can grow exponentially with the size of the problem in the worst case. For this reason, approximation algorithms are used to obtain near-optimal solutions to large-scale instances in polynomial time. In subsequent sections, we cover two of the most widely used approximation methods for the set cover problem: linear program relaxation methods and classical greedy algorithms. 
&lt;ref name=&quot;seven&quot; /&gt;<br /> <br /> == Approximation via LP relaxation and rounding ==<br /> Set covering is a classical integer programming problem, and solving integer programs in general is NP-hard. Therefore, one approach to achieve an &lt;math&gt; O(\log n)&lt;/math&gt; approximation to the set covering problem in polynomial time is to solve it via linear programming (LP) ''relaxation'' algorithms. In LP relaxation, we relax the integrality requirement into linear constraints. For instance, if we replace the constraints &lt;math&gt; y_i \in \{0, 1\}&lt;/math&gt; with the constraints &lt;math&gt; y_i \geq 0 &lt;/math&gt;, we obtain the following LP problem that can be solved in polynomial time:<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> subject to &lt;math&gt; \sum_{i:\, u_j \in s_i} y_i \geq 1, \quad \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \geq 0, \quad \forall i = 1,\ldots,n&lt;/math&gt;<br /> <br /> <br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions for large-scale set covering instances in polynomial time. &lt;ref name=&quot;seven&quot; /&gt; &lt;ref name=&quot;nine&quot; /&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset that covers the largest number of still-uncovered elements of the universe &lt;math&gt; U &lt;/math&gt; and removes the newly covered elements from the uncovered set, until all elements are covered. &lt;ref name=&quot;ten&quot;&gt; V. Chvatal, [https://pubsonline.informs.org/doi/abs/10.1287/moor.4.3.233 &quot;Greedy Heuristic for the Set-Covering Problem],&quot; ''Mathematics of Operations Research'', vol. 4, pp. 233-235, 1979. &lt;/ref&gt; Let &lt;math&gt; T &lt;/math&gt; be the set that contains the covered elements, and let &lt;math&gt; U &lt;/math&gt; be the set that contains the elements &lt;math&gt; Y &lt;/math&gt; that are still uncovered. 
At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and every element &lt;math&gt; Y &lt;/math&gt; belongs to &lt;math&gt; U &lt;/math&gt;. We iteratively select the subset in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt; and add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;. The algorithm is presented below. <br /> <br /> '''Greedy algorithm for minimum set cover:'''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset, \; C \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements, &lt;math&gt; C &lt;/math&gt; stores the selected subsets }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements &lt;math&gt; Y &lt;/math&gt;}<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; C &lt;/math&gt; and add the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' &lt;math&gt; C &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we assign cameras to different locations. Each location covers some areas of a stadium, and our goal is to install the smallest number of cameras such that all stadium areas are covered. 
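As a rough illustration of the greedy procedure from the previous section, the following Python sketch applies it to the camera coverage data listed in Table 1 below. The function name `greedy_set_cover`, the dictionary layout, and the rule of breaking ties in favor of the lowest-numbered location are assumptions made for this sketch; they are not part of the algorithm statement above.

```python
def greedy_set_cover(universe, subsets):
    """Return the labels of the subsets picked by the greedy heuristic.

    `subsets` maps a label to the set of elements it covers. Ties are
    broken by dictionary order (an assumption of this sketch).
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the subset that covers the most still-uncovered elements.
        best = max(subsets, key=lambda k: len(subsets[k] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


# Camera locations and the stadium areas each one covers (Table 1).
cameras = {
    1: {1, 3, 4, 6, 7},
    2: {4, 7, 8, 12},
    3: {2, 5, 9, 11, 13},
    4: {1, 2, 14, 15},
    5: {3, 6, 10, 12, 14},
    6: {8, 14, 15},
    7: {1, 2, 6, 11},
    8: {1, 2, 4, 6, 8, 12},
}
areas = set(range(1, 16))

cover = greedy_set_cover(areas, cameras)
print(cover)  # [8, 3, 5, 1, 4]
```

With this tie-breaking rule the sketch selects five locations, one more than the four-camera optimum derived in the worked example, which illustrates that the greedy heuristic returns a feasible cover but not necessarily an optimal one.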
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, and &lt;math&gt;z_j = 0&lt;/math&gt; otherwise. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints that correspond to the 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Constraints 2 and 11 are then no longer needed, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; fixed, which also settles constraints 5, 9, and 13, the remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied regardless of the other &lt;math&gt;z&lt;/math&gt; values. Comparing constraints 4 and 7, constraint 4 is satisfied whenever constraint 7 is, because its left-hand side is that of constraint 7 plus the nonnegative term &lt;math&gt;z_8&lt;/math&gt;; constraint 4 is therefore redundant. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Satisfying each of these two constraints with exactly one camera gives four combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values:<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Combinations &lt;math&gt;A&lt;/math&gt; and &lt;math&gt;D&lt;/math&gt; each require a fifth camera, whereas &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; need only four. Since &lt;math&gt;z_3 = z_5 = 1&lt;/math&gt; is forced and constraints 7 and 15 each require at least one further camera, no feasible solution uses fewer than four cameras, so combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations. &lt;ref name=&quot;three&quot;&gt; R. Church and C. ReVelle, [https://link.springer.com/article/10.1007/BF01942293 &quot;The maximal covering location problem],&quot; ''Papers of the Regional Science Association'', vol. 32, pp. 101-118, 1974. &lt;/ref&gt; Consider the problem of placing fire stations to serve the towns of some city. &lt;ref name=&quot;four&quot;&gt; E. Aktaş, Ö. Özaydın, B. Bozkaya, F. Ülengin, and Ş. Önsel, [https://pubsonline.informs.org/doi/10.1287/inte.1120.0671 &quot;Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality],&quot; ''Interfaces'', vol. 43, pp. 240-255, 2013. &lt;/ref&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt; and 0 otherwise. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> <br /> A real-world case study on optimizing fire station locations in Istanbul is analyzed in this reference. &lt;ref name=&quot;four&quot; /&gt; The Istanbul municipality serves 790 subdistricts, which should all be covered by a fire station. 
Each subdistrict is considered covered if it has a neighboring district (a district at most 5 minutes away) that has a fire station. For a detailed computational analysis, we refer the reader to the cited paper.<br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a set covering problem. &lt;ref name=&quot;five&quot;&gt; J. Ali and V. Dyo, [https://www.scitepress.org/Link.aspx?doi=10.5220/0006469800830088 &quot;Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach],&quot; ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications'', pp. 83-88, 2017. &lt;/ref&gt; &lt;ref name=&quot;eleven&quot;&gt; P.H. Cruz Caminha, R. De Souza Couto, L.H. Maciel Kosmalski Costa, A. Fladenmuller, and M. Dias de Amorim, [https://www.mdpi.com/1424-8220/18/6/1976 &quot;On the Coverage of Bus-Based Mobile Sensing],&quot; ''Sensors'', 2018. &lt;/ref&gt; Specifically, we are given a collection of bus routes '''''R''''', where each route is divided into segments. Route &lt;math&gt; i &lt;/math&gt; is denoted by &lt;math&gt; R_i &lt;/math&gt;, and segment &lt;math&gt; j &lt;/math&gt; is denoted by &lt;math&gt; S_j &lt;/math&gt;. The segments of two different routes can overlap, and each segment is associated with a length &lt;math&gt; a_j &lt;/math&gt;. The goal is then to select the routes that maximize the total covered distance.<br /> <br /> This differs from the other applications because it results in a maximization formulation rather than a minimization formulation. Suppose we want to use at most &lt;math&gt; k &lt;/math&gt; different routes. 
We want to find &lt;math&gt; k &lt;/math&gt; routes that maximize the total length of the covered segments. Let &lt;math&gt; x_i &lt;/math&gt; be the binary decision variable that equals 1 if route &lt;math&gt; R_i &lt;/math&gt; is selected, and let &lt;math&gt; y_j &lt;/math&gt; be the binary decision variable that equals 1 if segment &lt;math&gt; S_j &lt;/math&gt; is covered. Let us also denote the set of routes that cover segment &lt;math&gt; j &lt;/math&gt; by &lt;math&gt; C_j &lt;/math&gt;. The problem is then formulated as follows.<br /> <br /> &lt;math&gt;<br /> \begin{align}<br /> \text{max} &amp; ~~ \sum_{j} a_jy_j\\<br /> \text{s.t.} &amp; ~~ \sum_{i\in C_j} x_i \geq y_j \quad \forall j \\<br /> &amp; ~~ \sum_{i} x_i = k \\ <br /> &amp; ~~ x_i,y_{j} \in \{0,1\} \\<br /> \end{align}<br /> &lt;/math&gt;<br /> <br /> The work by Ali and Dyo explores a greedy approximation algorithm to solve an optimal selection problem including 713 bus routes in Greater London. &lt;ref name=&quot;five&quot; /&gt; Using only 14% of the routes (100 routes), the greedy algorithm returns a solution that covers 25% of the segments in Greater London. For details of the approximation algorithm and the real-world case study, we refer the reader to this reference. &lt;ref name=&quot;five&quot; /&gt; For a significantly larger case study involving 5747 buses covering 5060 km, we refer the reader to this academic article. &lt;ref name=&quot;eleven&quot; /&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;six&quot;&gt; E. Marchiori and A. Steenbeek, [https://link.springer.com/chapter/10.1007/3-540-45561-2_36 &quot;An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling],&quot; ''Real-World Applications of Evolutionary Computing. EvoWorkshops 2000. 
Lecture Notes in Computer Science'', 2000. &lt;/ref&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey, which contains several problem instances with the number of flights ranging from 1013 to 7765 flights, for a detailed analysis of the formulation and algorithms that pertain to this significant application. &lt;ref name=&quot;two&quot; /&gt; &lt;ref name=&quot;eight&quot;&gt; A. Kasirzadeh, M. Saddoune, and F. Soumis [https://www.sciencedirect.com/science/article/pii/S2192437620300820?via%3Dihub &quot;Airline crew scheduling: models, algorithms, and data sets],&quot; ''EURO Journal on Transportation and Logistics'', vol. 6, pp. 111-137, 2017. &lt;/ref&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the least number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> == References ==<br /> &lt;references /&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2460 Set covering problem 2020-12-13T05:56:16Z <p>Khaledfahat: /* Approximation via greedy algorithm */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. 
In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding the optimal number of subsets whose union covers the universal set while minimizing the total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; <br /> <br /> The set covering problem importance has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a primal example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion.&lt;sup&gt;1,7,9&lt;/sup&gt; <br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is define as follows. We define U = {&lt;math&gt; u_i&lt;/math&gt;, ….. 
&lt;math&gt; u_m&lt;/math&gt;} as the universe of elements and S = {&lt;math&gt; s_1&lt;/math&gt;, ….,&lt;math&gt; s_n&lt;/math&gt;} as a collection of subsets such that &lt;math&gt; s_i&lt;/math&gt; &lt;math&gt;\subseteq&lt;/math&gt; U and the union of the &lt;math&gt; s_i&lt;/math&gt; covers all elements in U (i.e. &lt;math&gt;\cup&lt;/math&gt;&lt;math&gt; s_i&lt;/math&gt; = U ). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of U and has an associated cost &lt;math&gt; c_i&lt;/math&gt; such that &lt;math&gt; c_i &gt; 0&lt;/math&gt;. The objective is to find the minimum cost sub-collection of sets X &lt;math&gt;\subseteq&lt;/math&gt; S that covers all the elements in the universe U.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints'''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when every cost equals 1, this is the number of selected subsets. The first constraint requires every element j in the universe U to be covered by at least one selected subset, where &lt;math&gt; a_{ij}&lt;/math&gt;=1 if j &lt;math&gt;\in&lt;/math&gt; &lt;math&gt; s_i&lt;/math&gt; and 0 otherwise. 
The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time algorithm is known for solving it exactly, and the running time of known exact methods grows exponentially with problem size in the worst case. For large-scale instances, approximation algorithms are therefore used to obtain near-optimal solutions in polynomial time. One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Approximation via LP relaxation and rounding ==<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find optimal or near-optimal solutions for large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of uncovered elements in the universe &lt;math&gt; U &lt;/math&gt; and deletes the newly covered elements from the uncovered set, until all elements are covered.&lt;sup&gt;10&lt;/sup&gt; Let &lt;math&gt; T &lt;/math&gt; be the set that contains the covered elements, and &lt;math&gt; U &lt;/math&gt; be the set that contains the elements &lt;math&gt; Y &lt;/math&gt; that are still uncovered. At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and every element &lt;math&gt; Y &lt;/math&gt; belongs to &lt;math&gt; U &lt;/math&gt;. We iteratively select the subset in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt; and add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;. The algorithm is presented below. 
<br /> <br /> '''Greedy algorithm for minimum set cover:'''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset, \; C \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements, &lt;math&gt; C &lt;/math&gt; stores the selected subsets }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements &lt;math&gt; Y &lt;/math&gt;}<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; C &lt;/math&gt; and add the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' &lt;math&gt; C &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we assign cameras to different locations. Each location covers some areas of a stadium, and our goal is to install the smallest number of cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, and &lt;math&gt;z_j = 0&lt;/math&gt; otherwise. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints that correspond to the 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Constraints 2 and 11 are then no longer needed, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; fixed, which also settles constraints 5, 9, and 13, the remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied regardless of the other &lt;math&gt;z&lt;/math&gt; values. Comparing constraints 4 and 7, constraint 4 is satisfied whenever constraint 7 is, because its left-hand side is that of constraint 7 plus the nonnegative term &lt;math&gt;z_8&lt;/math&gt;; constraint 4 is therefore redundant. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Each can be satisfied most cheaply by setting exactly one of its two variables to 1, which gives four combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> For each combination, we then determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed to satisfy constraints 1 and 8.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8, for five cameras in total.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied, for four cameras in total.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied, for four cameras in total.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied, for five cameras in total.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to be 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions (with all other &lt;math&gt;z_j = 0&lt;/math&gt;) are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This variation is concerned with maximizing the coverage of public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt; and 0 otherwise. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. Town &lt;math&gt; i &lt;/math&gt; is served when at least one town in &lt;math&gt; S_i &lt;/math&gt; has a fire station, so the problem is formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i = 1, \ldots, n, \quad y_i \in \{0, 1\}&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Because the physical sensors are scarce, a detector cannot be placed on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a variation of the set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This differs from the other applications because it leads to a maximization formulation rather than a minimization formulation. Say we want to use at most &lt;math&gt; k &lt;/math&gt; routes. We want to find at most &lt;math&gt; k &lt;/math&gt; routes that maximize the number of covered segments. Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if route &lt;math&gt; R_i &lt;/math&gt; is chosen and 0 otherwise. Treating each route &lt;math&gt; R_i &lt;/math&gt; as the set of its segments, the problem is formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leq k, \quad y_i \in \{0, 1\}, \forall i = 1, \ldots, N &lt;/math&gt; <br /> <br /> For a greedy approximation solution, see the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey for a detailed analysis of the formulation and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem? &quot;Operations Research Letters&quot;, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.<br /> #Chvatal, V. A Greedy Heuristic for the Set-Covering Problem. &quot;Mathematics of Operations Research&quot; Vol. 4, No. 3 (Aug., 1979), pp. 233-235</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2459 Set covering problem 2020-12-13T05:54:56Z <p>Khaledfahat: /* Approximation via linear program relaxation and rounding */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding the optimal number of subsets whose union covers the universal set while minimizing the total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; <br /> <br /> The set covering problem importance has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a primal example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion.&lt;sup&gt;1,7,9&lt;/sup&gt; <br /> <br /> Second, many problems in different industries can be formulated as set covering problems. 
For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The set covering problem is defined mathematically as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. &lt;math&gt; \cup_{i=1}^n s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i &lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i &lt;/math&gt; such that &lt;math&gt; c_i &gt; 0 &lt;/math&gt;. 
The objective is to find the minimum-cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j = 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1, \ldots, n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i &lt;/math&gt;; when every &lt;math&gt; c_i = 1 &lt;/math&gt;, this is the number of selected subsets. The first constraint requires that every element &lt;math&gt; j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; be covered by at least one selected subset, where &lt;math&gt; a_{ij} = 1 &lt;/math&gt; if &lt;math&gt; j \in s_i &lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time exact algorithm is known, and the running time of known exact methods can grow exponentially with the problem size in the worst case. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. 
One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Approximation via LP relaxation and rounding ==<br /> <br /> == Approximation via greedy algorithm ==<br /> Greedy algorithms can find optimal or near-optimal solutions for large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic is an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements and removes those elements from the uncovered set, until all elements are covered.&lt;sup&gt;10&lt;/sup&gt; Let &lt;math&gt; T &lt;/math&gt; be the set of covered elements, and let &lt;math&gt; U &lt;/math&gt; be the set of elements that are still uncovered. At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and &lt;math&gt; U &lt;/math&gt; contains every element of the universe. We iteratively select the subset in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt;, add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;, and remove them from &lt;math&gt; U &lt;/math&gt;. An example of this algorithm is presented below. 
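<br /> <br /> The greedy selection rule described above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function name greedy_set_cover and the tie-breaking rule (the subset listed first wins) are assumptions made here, and the data are the camera coverage sets from Table 1 of the numerical example in this article.

```python
def greedy_set_cover(universe, subsets):
    """Return a list of subset keys whose union covers `universe`.

    Unweighted greedy rule: repeatedly pick the subset that covers the most
    still-uncovered elements. Ties go to the subset listed first (an
    assumption of this sketch, not part of the algorithm's definition).
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Subset with the largest overlap with the uncovered elements.
        best = max(subsets, key=lambda k: len(subsets[k] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


# Camera location -> stadium areas it covers (Table 1 of the numerical example).
cameras = {
    1: {1, 3, 4, 6, 7},
    2: {4, 7, 8, 12},
    3: {2, 5, 9, 11, 13},
    4: {1, 2, 14, 15},
    5: {3, 6, 10, 12, 14},
    6: {8, 14, 15},
    7: {1, 2, 6, 11},
    8: {1, 2, 4, 6, 8, 12},
}
areas = set(range(1, 16))

cover = greedy_set_cover(areas, cameras)
print(cover)  # [8, 3, 5, 1, 4]
```

With this tie-breaking rule the sketch selects five camera locations, one more than the optimum of four found in the numerical example, which illustrates that the greedy rule is a heuristic and does not guarantee an optimal cover.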
<br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements }<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt; and mark &lt;math&gt; s_i &lt;/math&gt; as selected<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' the selected subsets<br /> <br /> ==Numerical Example==<br /> Let's consider a simple example where we assign cameras to different locations. Each location covers some stadium areas, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, and &lt;math&gt;z_j = 0&lt;/math&gt; otherwise. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. Each stadium area &lt;math&gt;i&lt;/math&gt; must be covered by at least one camera location, which gives one constraint per area. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints, one for each stadium area, are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraints 5, 9, and 13 each force &lt;math&gt;z_3 = 1&lt;/math&gt;. Constraints 2 and 11 are then satisfied automatically and are no longer needed. With &lt;math&gt;z_3 = 1&lt;/math&gt; fixed, the remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Next, compare constraints 4 and 7: the left-hand side of constraint 4 is that of constraint 7 plus &lt;math&gt;z_8&lt;/math&gt;, and the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 holds whenever constraint 7 holds and is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Each can be satisfied most cheaply by setting exactly one of its two variables to 1, which gives four combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> For each combination, we then determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed to satisfy constraints 1 and 8.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8, for five cameras in total.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied, for four cameras in total.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied, for four cameras in total.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied, for five cameras in total.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to be 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions (with all other &lt;math&gt;z_j = 0&lt;/math&gt;) are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This variation is concerned with maximizing the coverage of public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt; and 0 otherwise. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. Town &lt;math&gt; i &lt;/math&gt; is served when at least one town in &lt;math&gt; S_i &lt;/math&gt; has a fire station, so the problem is formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i = 1, \ldots, n, \quad y_i \in \{0, 1\}&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Because the physical sensors are scarce, a detector cannot be placed on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a variation of the set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This differs from the other applications because it leads to a maximization formulation rather than a minimization formulation. Say we want to use at most &lt;math&gt; k &lt;/math&gt; routes. We want to find at most &lt;math&gt; k &lt;/math&gt; routes that maximize the number of covered segments. Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if route &lt;math&gt; R_i &lt;/math&gt; is chosen and 0 otherwise. Treating each route &lt;math&gt; R_i &lt;/math&gt; as the set of its segments, the problem is formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leq k, \quad y_i \in \{0, 1\}, \forall i = 1, \ldots, N &lt;/math&gt; <br /> <br /> For a greedy approximation solution, see the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey for a detailed analysis of the formulation and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem? &quot;Operations Research Letters&quot;, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.<br /> #Chvatal, V. A Greedy Heuristic for the Set-Covering Problem. &quot;Mathematics of Operations Research&quot; Vol. 4, No. 3 (Aug., 1979), pp. 233-235</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2458 Set covering problem 2020-12-13T05:54:30Z <p>Khaledfahat: /* Approximation via greedy algorithm */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding the optimal number of subsets whose union covers the universal set while minimizing the total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; <br /> <br /> The set covering problem importance has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a primal example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion.&lt;sup&gt;1,7,9&lt;/sup&gt; <br /> <br /> Second, many problems in different industries can be formulated as set covering problems. 
For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The set covering problem is defined mathematically as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e., &lt;math&gt; \bigcup_{i=1}^n s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i &lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i &lt;/math&gt; such that &lt;math&gt; c_i &gt; 0 &lt;/math&gt;. 
The objective is to find the minimum cost sub-collection of sets X &lt;math&gt;\subseteq&lt;/math&gt; S that covers all the elements in the universe U.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j= 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when every cost equals 1, this is the number of selected subsets. The first constraint requires that every element j in the universe U be covered by at least one selected subset, where &lt;math&gt; a_{ij}&lt;/math&gt;=1 if &lt;math&gt; u_j \in s_i&lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time exact algorithm is known, and the running time of known exact methods grows exponentially with problem size in the worst case. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. 
One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Approximation via linear program relaxation and rounding == <br /> <br /> <br /> == Approximation via greedy algorithm ==<br /> Greedy algorithms can be used to find optimal or near-optimal solutions for large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements and then deletes those elements from the uncovered set, until all elements are covered.&lt;sup&gt;10&lt;/sup&gt; Let &lt;math&gt; T &lt;/math&gt; be the set that contains the covered elements, and let &lt;math&gt; U &lt;/math&gt; be the set that contains the elements of the universe &lt;math&gt; Y &lt;/math&gt; that are still uncovered. At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and &lt;math&gt; U = Y &lt;/math&gt;. We iteratively select the set in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt;, add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;, and remove them from &lt;math&gt; U &lt;/math&gt;. An example of this algorithm is presented below. 
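The iterative selection described above can be sketched in Python. This is a minimal illustration rather than the authors' implementation: the function name <code>greedy_set_cover</code> and the tie-breaking rule (first label wins) are assumptions, <code>uncovered</code> plays the role of U and <code>covered</code> the role of T, and the instance is the camera data from Table 1 of the Numerical Example section.

```python
def greedy_set_cover(universe, subsets):
    """Return the labels of the subsets picked by the unweighted greedy heuristic."""
    uncovered = set(universe)  # U in the text: elements still to be covered
    covered = set()            # T in the text: elements covered so far
    chosen = []                # labels of the selected subsets, in selection order
    while uncovered:
        # Pick the subset covering the most still-uncovered elements.
        # max() returns the first such label on ties (an assumed tie-break rule).
        best = max(subsets, key=lambda label: len(subsets[label] & uncovered))
        gained = subsets[best] & uncovered
        if not gained:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        covered |= gained
        uncovered -= gained
    return chosen


# Camera locations and the stadium areas they cover (Table 1).
cameras = {
    1: {1, 3, 4, 6, 7},
    2: {4, 7, 8, 12},
    3: {2, 5, 9, 11, 13},
    4: {1, 2, 14, 15},
    5: {3, 6, 10, 12, 14},
    6: {8, 14, 15},
    7: {1, 2, 6, 11},
    8: {1, 2, 4, 6, 8, 12},
}
cover = greedy_set_cover(range(1, 16), cameras)
print(cover)
```

On this instance the heuristic starts with location 8, which covers six areas, and finishes with five locations, one more than the optimum of four derived in the numerical example. Greedy selection is fast but does not guarantee an optimal cover.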
<br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset, \; C \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements, &lt;math&gt; C &lt;/math&gt; stores the selected subsets }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements of &lt;math&gt; Y &lt;/math&gt; }<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; C &lt;/math&gt; and the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' &lt;math&gt; C &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we place cameras at different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all areas of the stadium are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, and &lt;math&gt;z_j = 0&lt;/math&gt; otherwise. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints that correspond to the 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraints 5, 9, and 13 each give &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus we no longer need constraints 2 and 11, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is, because its left-hand side is that of constraint 7 plus the nonnegative term &lt;math&gt;z_8&lt;/math&gt;; constraint 4 is therefore no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Satisfying each of them with exactly one of its two variables gives 4 combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> For each combination, we then determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed to satisfy constraints 1 and 8.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;, and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; equal to 1 (four each, counting &lt;math&gt;z_3&lt;/math&gt; and &lt;math&gt;z_5&lt;/math&gt;, versus five for &lt;math&gt;A&lt;/math&gt; and &lt;math&gt;D&lt;/math&gt;), so they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This variant is concerned with maximizing the coverage of some public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a set of adjacent towns. The problem is then solved to minimize the required number of fire stations to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to choosing to build a fire station at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be a subset of towns including town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i = 1,\ldots,n&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Say we want to use at most &lt;math&gt; k &lt;/math&gt; routes. We want to find at most &lt;math&gt; k &lt;/math&gt; routes that maximize the number of covered segments. Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to choosing route &lt;math&gt; R_i &lt;/math&gt;, and let &lt;math&gt; x_j &lt;/math&gt; be a binary variable equal to 1 if segment &lt;math&gt; j &lt;/math&gt; is covered by at least one chosen route, where &lt;math&gt; M &lt;/math&gt; is the number of distinct segments. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\sum_{j=1}^M x_j&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leq k &lt;/math&gt; <br /> <br /> &lt;math&gt; x_j \leq \sum_{i : j \in R_i} y_i, \forall j = 1,\ldots,M &lt;/math&gt; <br /> <br /> For a greedy approximation solution, see the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey for a detailed analysis of the formulation and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem? ''Operations Research Letters'', North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.<br /> #Chvatal, V. A Greedy Heuristic for the Set-Covering Problem. ''Mathematics of Operations Research'' Vol. 4, No. 3 (Aug., 1979), pp. 233-235</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2457 Set covering problem 2020-12-13T05:52:17Z <p>Khaledfahat: /* Problem formulation */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding a collection of subsets whose union covers the universal set at minimum total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; <br /> <br /> The importance of the set covering problem has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a prime example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion.&lt;sup&gt;1,7,9&lt;/sup&gt; <br /> <br /> Second, many problems in different industries can be formulated as set covering problems. 
For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The set covering problem is defined mathematically as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e., &lt;math&gt; \bigcup_{i=1}^n s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i &lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i &lt;/math&gt; such that &lt;math&gt; c_i &gt; 0 &lt;/math&gt;. 
The objective is to find the minimum cost sub-collection of sets X &lt;math&gt;\subseteq&lt;/math&gt; S that covers all the elements in the universe U.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j= 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when every cost equals 1, this is the number of selected subsets. The first constraint requires that every element j in the universe U be covered by at least one selected subset, where &lt;math&gt; a_{ij}&lt;/math&gt;=1 if &lt;math&gt; u_j \in s_i&lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time exact algorithm is known, and the running time of known exact methods grows exponentially with problem size in the worst case. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. 
One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Approximation via linear program relaxation and rounding == <br /> <br /> <br /> == Approximation via greedy algorithm ==<br /> Greedy algorithms can be used to find optimal or near-optimal solutions for large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements and then deletes those elements from the uncovered set, until all elements are covered.&lt;sup&gt;10&lt;/sup&gt; Let &lt;math&gt; T &lt;/math&gt; be the set that contains the covered elements, and let &lt;math&gt; U &lt;/math&gt; be the set that contains the elements of the universe &lt;math&gt; Y &lt;/math&gt; that are still uncovered. At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and &lt;math&gt; U = Y &lt;/math&gt;. We iteratively select the set in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt;, add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;, and remove them from &lt;math&gt; U &lt;/math&gt;. An example of this algorithm is presented below. 
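The rule above counts covered elements only. When the subsets carry different costs &lt;math&gt; c_i &lt;/math&gt;, as in the ILP formulation, a common weighted variant (the ratio rule of Chvátal, reference 10) selects at each step the subset with the lowest cost per newly covered element. The Python sketch below illustrates that variant under stated assumptions: the function name <code>weighted_greedy_set_cover</code>, the subset labels, and the costs are all hypothetical and chosen only for illustration.

```python
def weighted_greedy_set_cover(universe, subsets, costs):
    """Greedy cover that picks the lowest cost per newly covered element."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Only subsets that still cover something new are candidates.
        candidates = [s for s in subsets if subsets[s] & uncovered]
        if not candidates:
            raise ValueError("the subsets do not cover the universe")
        # Ratio rule: cost divided by the number of newly covered elements.
        # min() keeps the first label on ties (an assumed tie-break rule).
        best = min(candidates, key=lambda s: costs[s] / len(subsets[s] & uncovered))
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


# Hypothetical instance: five elements, four subsets with illustrative costs.
subsets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5}, "D": {1, 2}}
costs = {"A": 3.0, "B": 1.0, "C": 1.0, "D": 1.0}
cover = weighted_greedy_set_cover(range(1, 6), subsets, costs)
total_cost = sum(costs[s] for s in cover)
print(cover, total_cost)
```

With unit costs the ratio rule reduces to the unweighted selection described above, since minimizing one over the number of newly covered elements is the same as maximizing that number.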
<br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset, \; C \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { &lt;math&gt; T &lt;/math&gt; stores the covered elements, &lt;math&gt; C &lt;/math&gt; stores the selected subsets }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { &lt;math&gt; U &lt;/math&gt; stores the uncovered elements of &lt;math&gt; Y &lt;/math&gt; }<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; C &lt;/math&gt; and the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' &lt;math&gt; C &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we place cameras at different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all areas of the stadium are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, and &lt;math&gt;z_j = 0&lt;/math&gt; otherwise. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints that correspond to the 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraints 5, 9, and 13 each give &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus we no longer need constraints 2 and 11, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Constraint 4 is also redundant: its left-hand side is that of constraint 7 plus &lt;math&gt;z_8&lt;/math&gt;, and the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is satisfied whenever constraint 7 is. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
They share no variables, so each requires its own camera. Setting exactly one variable to 1 in each of them gives 4 combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> For each combination, we then determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed to satisfy constraints 1 and 8.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; equal to 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the required number of fire stations to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Because physical sensors are scarce, a detector cannot be placed on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a variant of the set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This differs from the other applications because it results in a maximization formulation rather than a minimization formulation. Say we want to use at most &lt;math&gt; k &lt;/math&gt; of the &lt;math&gt; N &lt;/math&gt; available routes. We want to find the routes that maximize the number of covered segments. Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if route &lt;math&gt; R_i &lt;/math&gt; is chosen, and treat each route &lt;math&gt; R_i &lt;/math&gt; as the set of its segments. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leq k &lt;/math&gt; <br /> <br /> For a greedy approximation solution, see the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey for a detailed analysis of the formulation and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem? &quot;Operations Research Letters&quot;, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.<br /> #Chvatal, V. A Greedy Heuristic for the Set-Covering Problem. &quot;Mathematics of Operations Research&quot; Vol. 4, No. 3 (Aug., 1979), pp. 233-235</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2456 Set covering problem 2020-12-13T05:51:21Z <p>Khaledfahat: </p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding the optimal number of subsets whose union covers the universal set while minimizing the total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; <br /> <br /> The set covering problem importance has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a primal example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion.&lt;sup&gt;1,7,9&lt;/sup&gt; <br /> <br /> Second, many problems in different industries can be formulated as set covering problems. 
For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i&lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. &lt;math&gt;\cup_i s_i = U&lt;/math&gt;). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i &gt; 0&lt;/math&gt;. 
The objective is to find the minimum-cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j= 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when all costs are equal, this amounts to minimizing the number of selected subsets. The first constraint requires that every element &lt;math&gt; j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; be covered by at least one selected subset, where &lt;math&gt; a_{ij} = 1&lt;/math&gt; if &lt;math&gt; j \in s_i&lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time exact algorithm is known, and the time needed by known exact methods can grow exponentially with the problem size in the worst case. Approximation algorithms are therefore used to solve large-scale problems in polynomial time with optimal or near-optimal solutions. 
One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Approximation via linear program relaxation and rounding == <br /> <br /> <br /> == Approximation via greedy algorithm ==<br /> Greedy algorithms can be used to obtain optimal or near-optimal solutions for large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements of the universe &lt;math&gt; U &lt;/math&gt; and removes those elements from the uncovered set, until all elements are covered.&lt;sup&gt;10&lt;/sup&gt; Let &lt;math&gt; T &lt;/math&gt; be the set that contains the covered elements, and &lt;math&gt; U &lt;/math&gt; be the set that contains the elements &lt;math&gt; Y &lt;/math&gt; that are still uncovered. At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and all elements &lt;math&gt; Y \in U &lt;/math&gt;. We iteratively select the subset in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt; and add its elements to the covered elements in &lt;math&gt; T &lt;/math&gt;. An example of this algorithm is presented below. 
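<br /> <br /> As an illustration only, the greedy selection described above can be sketched in Python. The function name greedy_set_cover, the dictionary-based input, and the four-subset toy instance are assumptions made for this sketch and do not come from the cited references; ties are broken in favor of the subset listed first.

```python
def greedy_set_cover(universe, subsets):
    """Return the names of the chosen subsets.

    `subsets` maps a (hypothetical) subset name to the set of elements it covers.
    """
    uncovered = set(universe)  # elements that still need to be covered
    chosen = []                # names of the selected subsets, in pick order
    while uncovered:
        # Pick the subset that covers the most still-uncovered elements.
        best = max(subsets, key=lambda name: len(uncovered.intersection(subsets[name])))
        gained = uncovered.intersection(subsets[best])
        if not gained:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical toy instance, not taken from the text.
toy_universe = range(1, 7)
toy_subsets = {"A": {1, 2, 3, 4}, "B": {1, 2, 5}, "C": {3, 4, 6}, "D": {5, 6}}
print(greedy_set_cover(toy_universe, toy_subsets))  # prints ['A', 'D']
```

On this toy instance the sketch picks A first (four new elements) and then D (the two remaining ones). Here the result is also a minimum cover, but in general the greedy choice need not be optimal.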
<br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; X \leftarrow \emptyset, \; T \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { '''X''' stores the selected subsets, '''T''' stores the covered elements }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { '''U''' stores the uncovered elements '''Y'''}<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in '''U'''<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; X &lt;/math&gt; and the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' &lt;math&gt; X &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let us consider a simple example in which we assign cameras to different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, and &lt;math&gt;z_j = 0&lt;/math&gt; otherwise. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. Each stadium area &lt;math&gt;i&lt;/math&gt; gives one constraint: it has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints, one per stadium area, are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraints 5, 9, and 13 each force &lt;math&gt;z_3 = 1&lt;/math&gt;. Constraints 2 and 11 are then no longer needed, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Constraint 4 is also redundant: its left-hand side is that of constraint 7 plus &lt;math&gt;z_8&lt;/math&gt;, and the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is satisfied whenever constraint 7 is. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
They share no variables, so each requires its own camera. Setting exactly one variable to 1 in each of them gives 4 combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> For each combination, we then determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed to satisfy constraints 1 and 8.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; equal to 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the required number of fire stations to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Because physical sensors are scarce, a detector cannot be placed on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a variant of the set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This differs from the other applications because it results in a maximization formulation rather than a minimization formulation. Say we want to use at most &lt;math&gt; k &lt;/math&gt; of the &lt;math&gt; N &lt;/math&gt; available routes. We want to find the routes that maximize the number of covered segments. Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if route &lt;math&gt; R_i &lt;/math&gt; is chosen, and treat each route &lt;math&gt; R_i &lt;/math&gt; as the set of its segments. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leq k &lt;/math&gt; <br /> <br /> For a greedy approximation solution, see the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey for a detailed analysis of the formulation and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem? &quot;Operations Research Letters&quot;, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.<br /> #Chvatal, V. A Greedy Heuristic for the Set-Covering Problem. &quot;Mathematics of Operations Research&quot; Vol. 4, No. 3 (Aug., 1979), pp. 233-235</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2455 Set covering problem 2020-12-13T05:48:33Z <p>Khaledfahat: /* Greedy approximation algorithm */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding the optimal number of subsets whose union covers the universal set while minimizing the total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; <br /> <br /> The set covering problem importance has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a primal example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion.&lt;sup&gt;1,7,9&lt;/sup&gt; <br /> <br /> Second, many problems in different industries can be formulated as set covering problems. 
For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. &lt;math&gt; \cup_{i=1}^n s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i &lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i &lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i &gt; 0 &lt;/math&gt;). 
The objective is to find the minimum cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j= 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when every cost &lt;math&gt; c_i&lt;/math&gt; equals 1, this is the number of selected subsets. The first constraint requires that every element &lt;math&gt; j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; be covered by at least one selected subset, where &lt;math&gt; a_{ij} = 1&lt;/math&gt; if &lt;math&gt; j \in s_i&lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time algorithm is known that solves it exactly, and the running time of exact methods can grow exponentially with the size of the problem in the worst case. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. 
One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions to large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements of the universe &lt;math&gt; U &lt;/math&gt;, and removes those elements from the uncovered set, until all elements are covered.&lt;sup&gt;10&lt;/sup&gt; Let &lt;math&gt; T &lt;/math&gt; be the set that contains the covered elements, and let &lt;math&gt; U &lt;/math&gt; be the set that contains the elements &lt;math&gt; Y &lt;/math&gt; that are still uncovered. At the beginning of the iteration, &lt;math&gt; T &lt;/math&gt; is empty and all elements &lt;math&gt; Y &lt;/math&gt; belong to &lt;math&gt; U &lt;/math&gt;. We iteratively select the subset in &lt;math&gt; S &lt;/math&gt; that covers the largest number of elements in &lt;math&gt; U &lt;/math&gt;, add its elements to the covered set &lt;math&gt; T &lt;/math&gt;, and remove them from &lt;math&gt; U &lt;/math&gt;. An example of this algorithm is presented below. 
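<br /> <br /> The greedy procedure described above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name <code>greedy_set_cover</code>, the dictionary layout, and the tie-breaking rule (lowest subset label first) are assumptions made for this sketch. The data are the camera locations and stadium areas from Table 1 of the numerical example in this article, with every subset given unit cost.

```python
def greedy_set_cover(universe, subsets):
    """Greedy heuristic for unweighted set cover.

    universe: iterable of elements to cover.
    subsets:  dict mapping a subset label to a set of elements.
    Returns the list of chosen labels, in the order they were picked.
    """
    uncovered = set(universe)   # plays the role of U (uncovered elements)
    chosen = []                 # labels of the selected subsets
    while uncovered:
        # Pick the subset covering the most uncovered elements.
        # Assumption of this sketch: ties go to the lowest label,
        # because max() returns the first maximal item it sees.
        best = max(sorted(subsets), key=lambda k: len(subsets[k] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


# Camera location -> stadium areas it covers (Table 1 of the numerical example).
cameras = {
    1: {1, 3, 4, 6, 7},
    2: {4, 7, 8, 12},
    3: {2, 5, 9, 11, 13},
    4: {1, 2, 14, 15},
    5: {3, 6, 10, 12, 14},
    6: {8, 14, 15},
    7: {1, 2, 6, 11},
    8: {1, 2, 4, 6, 8, 12},
}

cover = greedy_set_cover(range(1, 16), cameras)
print(cover)  # [8, 3, 5, 1, 4]
```

On this instance the sketch selects five camera locations, one more than the optimum of four found in the numerical example, which illustrates that the greedy heuristic returns a feasible cover that is not necessarily optimal.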
<br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset, \ X \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { '''T''' stores the covered elements, '''X''' stores the selected subsets }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { '''U''' stores the uncovered elements '''Y'''}<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in '''U'''<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; X &lt;/math&gt; and the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' &lt;math&gt; X &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we place cameras at different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium, there’s a constraint that the stadium area &lt;math&gt;i&lt;/math&gt; has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. All the 15 constraints that corresponds to 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraint {5,9,13}, we can obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus we no longer need constraint 2 and 11 as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt; forces &lt;math&gt;z_5&lt;/math&gt; to equal 1. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints {3,6,12,14} are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is satisfied, because its left-hand side contains every term of constraint 7 and the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Each of these two constraints needs at least one of its variables to equal 1, and setting exactly one variable to 1 in each gives 4 combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then discuss each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to equal 1 (combinations &lt;math&gt;A&lt;/math&gt; and &lt;math&gt;D&lt;/math&gt; each need one additional camera), they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is particularly evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a set of adjacent towns. The problem is then solved to minimize the required number of fire stations to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to choosing to build a fire station at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors could be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is itself divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Say we want to use at most &lt;math&gt; k &lt;/math&gt; routes. We want to find at most &lt;math&gt; k &lt;/math&gt; routes that maximize the number of covered segments. Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to choosing route &lt;math&gt; R_i &lt;/math&gt;, and let &lt;math&gt; x_j &lt;/math&gt; be a binary variable that equals 1 if segment &lt;math&gt; j &lt;/math&gt; is covered by at least one chosen route. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\sum_{j} x_j&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i} y_i \leq k &lt;/math&gt; <br /> <br /> &lt;math&gt; x_j \leq \sum_{i: j \in R_i} y_i, \forall j &lt;/math&gt; <br /> <br /> For a greedy approximation solution, please check the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey for a detailed analysis of the formulation and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the least number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem? &quot;Operations Research Letters&quot;, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.<br /> #Chvatal, V. A Greedy Heuristic for the Set-Covering Problem. &quot;Mathematics of Operations Research&quot; Vol. 4, No. 3 (Aug., 1979), pp. 233-235</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2454 Set covering problem 2020-12-13T05:42:45Z <p>Khaledfahat: /* Integer linear program formulation */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding the optimal number of subsets whose union covers the universal set while minimizing the total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; <br /> <br /> The set covering problem importance has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a primal example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion.&lt;sup&gt;1,7,9&lt;/sup&gt; <br /> <br /> Second, many problems in different industries can be formulated as set covering problems. 
For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. &lt;math&gt; \cup_{i=1}^n s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i &lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i &lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i &gt; 0 &lt;/math&gt;). 
The objective is to find the minimum cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j= 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when every cost &lt;math&gt; c_i&lt;/math&gt; equals 1, this is the number of selected subsets. The first constraint requires that every element &lt;math&gt; j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; be covered by at least one selected subset, where &lt;math&gt; a_{ij} = 1&lt;/math&gt; if &lt;math&gt; j \in s_i&lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time algorithm is known that solves it exactly, and the running time of exact methods can grow exponentially with the size of the problem in the worst case. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. 
One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions to large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements of the universe '''U''', and removes those elements from the uncovered set, until all elements are covered.&lt;sup&gt;10&lt;/sup&gt; Let '''T''' be the set that contains the covered elements, and let '''U''' be the set that contains the elements '''Y''' that are still uncovered. At the beginning of the iteration, '''T''' is empty and all elements '''Y''' belong to '''U'''. We iteratively select the subset in '''S''' that covers the largest number of elements in '''U''', add its elements to the covered set '''T''', and remove them from '''U'''. An example of this algorithm is presented below. 
<br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset, \ X \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { '''T''' stores the covered elements, '''X''' stores the selected subsets }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { '''U''' stores the uncovered elements '''Y'''}<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in '''U'''<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; X &lt;/math&gt; and the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' &lt;math&gt; X &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we place cameras at different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium, there’s a constraint that the stadium area &lt;math&gt;i&lt;/math&gt; has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. All the 15 constraints that corresponds to 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraint {5,9,13}, we can obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus we no longer need constraint 2 and 11 as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Now consider constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, which forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 10, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is satisfied, since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Each of them requires at least one of its two variables to equal 1, and the two constraints share no variables, so at least two more cameras are needed. Since we want as few cameras as possible, we consider the 4 combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values in which exactly one variable in each constraint equals 1.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to equal 1 (four each, counting &lt;math&gt;z_3&lt;/math&gt; and &lt;math&gt;z_5&lt;/math&gt;, versus five for &lt;math&gt;A&lt;/math&gt; and &lt;math&gt;D&lt;/math&gt;), they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has applications in a wide range of fields, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the decision variable corresponding to choosing to build a fire station at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns that includes town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors could be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is itself divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Say we want to use at most &lt;math&gt; k &lt;/math&gt; routes, and we want to choose them so that the number of covered segments is maximized. Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to choosing route &lt;math&gt; R_i &lt;/math&gt;, where each &lt;math&gt; R_i &lt;/math&gt; is viewed as the set of its segments and &lt;math&gt; N &lt;/math&gt; is the number of routes. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leq k &lt;/math&gt; <br /> <br /> For a greedy approximation solution, please check the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey for a detailed analysis of the formulation and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem? &quot;Operations Research Letters&quot;, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.<br /> #Chvatal, V. A Greedy Heuristic for the Set-Covering Problem. &quot;Mathematics of Operations Research&quot; Vol. 4, No. 3 (Aug., 1979), pp. 233-235</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=2299 Set covering problem 2020-12-11T21:17:08Z <p>Khaledfahat: /* Integer linear program formulation */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding the optimal number of subsets whose union covers the universal set while minimizing the total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; <br /> <br /> The set covering problem importance has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a primal example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion.&lt;sup&gt;1,7,9&lt;/sup&gt; <br /> <br /> Second, many problems in different industries can be formulated as set covering problems. 
For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. &lt;math&gt; \bigcup_{i=1}^n s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i &lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i &lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i &gt; 0 &lt;/math&gt;). 
The objective is to find the minimum-cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j = 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1, \ldots, n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i &lt;/math&gt;; when all costs are equal, this is the same as minimizing the number of selected subsets. The first constraint requires that every element &lt;math&gt; j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; be covered by at least one selected subset, where &lt;math&gt; a_{ij} = 1 &lt;/math&gt; if &lt;math&gt; j \in s_i &lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> Set covering is NP-hard, which means that no polynomial-time algorithm is known for solving the ILP exactly, and large instances can be impractical to solve to optimality. Therefore, approximation algorithms are used to find near-optimal (and sometimes optimal) solutions to large-scale problems in polynomial time. 
One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal (and sometimes optimal) solutions for large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements and removes those elements from the uncovered set, until all elements are covered.&lt;sup&gt;10&lt;/sup&gt; Let '''Y''' denote the full set of elements to be covered, '''T''' the set that contains the covered elements, and '''U''' the set that contains the elements of '''Y''' that are still uncovered. At the beginning of the iteration, '''T''' is empty and all elements of '''Y''' belong to '''U'''. We iteratively select the set in '''S''' that covers the largest number of elements in '''U''', add its elements to the covered elements in '''T''', and remove them from '''U'''. An example of this algorithm is presented below. 
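The greedy loop described above can also be sketched in Python. This is a minimal illustration for the unweighted case, not code from the cited references; the names <code>greedy_set_cover</code>, <code>universe</code>, and <code>subsets</code> are assumptions made for this sketch.

```python
def greedy_set_cover(universe, subsets):
    """Return the indices of the subsets picked greedily to cover `universe`.

    Hypothetical helper for illustration: `subsets` is a list of Python sets.
    """
    uncovered = set(universe)  # plays the role of U in the text
    covered = set()            # plays the role of T in the text
    chosen = []                # indices of the selected subsets

    while uncovered:
        # Pick the subset that covers the most still-uncovered elements.
        best = max(
            range(len(subsets)),
            key=lambda i: len(uncovered.intersection(subsets[i])),
        )
        gain = uncovered.intersection(subsets[best])
        if not gain:
            # No subset covers anything new, so no cover exists.
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        covered.update(gain)
        uncovered.difference_update(gain)

    return chosen


if __name__ == "__main__":
    # Toy instance (made up for this sketch): prints [0, 3]
    print(greedy_set_cover(range(1, 6), [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]))
```

Ties are broken by taking the first subset with the largest gain. The result is a valid cover but is not guaranteed to be a minimum one.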
<br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { '''T''' stores the covered elements }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { '''U''' stores the uncovered elements of '''Y''' }<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in '''U'''<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt; and mark &lt;math&gt; s_i &lt;/math&gt; as selected<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' the selected subsets &lt;math&gt; s_i &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we place cameras at different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints that correspond to the 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus we no longer need constraints 2, 5, 9, 11, and 13, as they are all satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Now consider constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, which forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 10, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is satisfied, since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Each of them requires at least one of its two variables to equal 1, and the two constraints share no variables, so at least two more cameras are needed. Since we want as few cameras as possible, we consider the 4 combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values in which exactly one variable in each constraint equals 1.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to equal 1 (four each, counting &lt;math&gt;z_3&lt;/math&gt; and &lt;math&gt;z_5&lt;/math&gt;, versus five for &lt;math&gt;A&lt;/math&gt; and &lt;math&gt;D&lt;/math&gt;), they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has applications in a wide range of fields, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the decision variable corresponding to choosing to build a fire station at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns that includes town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors could be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is itself divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Say we want to use at most &lt;math&gt; k &lt;/math&gt; routes, and we want to choose them so that the number of covered segments is maximized. Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to choosing route &lt;math&gt; R_i &lt;/math&gt;, where each &lt;math&gt; R_i &lt;/math&gt; is viewed as the set of its segments and &lt;math&gt; N &lt;/math&gt; is the number of routes. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leq k &lt;/math&gt; <br /> <br /> For a greedy approximation solution, please check the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey for a detailed analysis of the formulation and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem? &quot;Operations Research Letters&quot;, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.<br /> #Chvatal, V. A Greedy Heuristic for the Set-Covering Problem. &quot;Mathematics of Operations Research&quot; Vol. 4, No. 3 (Aug., 1979), pp. 233-235</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=1926 Set covering problem 2020-11-26T04:39:50Z <p>Khaledfahat: /* Greedy approximation algorithm */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding the optimal number of subsets whose union covers the universal set while minimizing the total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; <br /> <br /> The set covering problem importance has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a primal example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion.&lt;sup&gt;1,7,9&lt;/sup&gt; <br /> <br /> Second, many problems in different industries can be formulated as set covering problems. 
For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The set covering problem is defined mathematically as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. &lt;math&gt; \cup_{i=1}^n s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i &lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i &lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i &gt; 0 &lt;/math&gt;). 
The objective is to find the minimum cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when all costs are equal to one, this is the number of selected subsets. The first constraint requires that every element &lt;math&gt; u_j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; be covered by at least one selected subset, where &lt;math&gt; a_{ij} = 1 &lt;/math&gt; if &lt;math&gt; u_j \in s_i &lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time algorithm is known for solving it exactly, and solving the integer linear program can become impractical for large instances. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. 
One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions for large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of uncovered elements and deletes the newly covered elements from the uncovered set, until all elements are covered.&lt;sup&gt;10&lt;/sup&gt; Let '''Y''' denote the full set of elements to be covered, let '''T''' be the set containing the elements covered so far, and let '''U''' be the set containing the elements of '''Y''' that are still uncovered. Initially, '''T''' is empty and all elements of '''Y''' belong to '''U'''. We iteratively select the subset in '''S''' that covers the largest number of elements in '''U''', add its elements to '''T''', and remove them from '''U'''. The algorithm is outlined below. 
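The greedy selection rule described above can also be sketched in Python. This is a minimal illustration under stated assumptions rather than a reference implementation: the function name <code>greedy_set_cover</code>, the dictionary representation of '''S''', and the small four-subset instance are hypothetical choices made for this example, and all subsets are assumed to have unit cost.

```python
def greedy_set_cover(universe, subsets):
    """Greedy heuristic for unweighted set cover (illustrative sketch).

    universe: iterable of elements to cover (the set Y in the text).
    subsets:  dict mapping a label to the set of elements it covers (S).
    Returns the labels of the chosen subsets in selection order.
    """
    uncovered = set(universe)  # U in the text: elements still uncovered
    covered = set()            # T in the text: elements covered so far
    chosen = []                # labels of the selected subsets

    while uncovered:
        # Pick the subset that covers the most uncovered elements;
        # ties go to the first label in dictionary order.
        best = max(subsets, key=lambda label: len(subsets[label] & uncovered))
        gained = subsets[best] & uncovered
        if not gained:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        covered |= gained
        uncovered -= gained

    return chosen


# Hypothetical instance, not taken from the article.
example_universe = {1, 2, 3, 4, 5}
example_subsets = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 4}, "D": {4, 5}}
print(greedy_set_cover(example_universe, example_subsets))  # ['A', 'D']
```

On this instance the sketch first selects A, which covers three elements, and then D, which covers the two remaining ones. The greedy choice is a heuristic, so on other instances it may select more subsets than an optimal cover would.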
<br /> <br /> '''Greedy algorithm for minimum set cover: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset, \; X \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { '''T''' stores the covered elements, '''X''' stores the selected subsets }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { '''U''' stores the uncovered elements of '''Y''' }<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in '''U'''<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; X &lt;/math&gt; and the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' &lt;math&gt; X &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example in which cameras are placed at different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7} and camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that they cover are given in Table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered from camera locations {1,4,7,8} and stadium area 2 can be covered from camera locations {3,4,7,8}, while the rest are given in Table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If stadium area &lt;math&gt;i&lt;/math&gt; can be covered from camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not, &lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints that correspond to the 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Constraints 2 and 11 are then no longer needed, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied regardless of the other &lt;math&gt;z&lt;/math&gt; values. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is, because its left-hand side contains the additional nonnegative term &lt;math&gt;z_8&lt;/math&gt;, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Satisfying each of them with exactly one camera gives four combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to equal 1 (four in total, including &lt;math&gt;z_3&lt;/math&gt; and &lt;math&gt;z_5&lt;/math&gt;), they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a set of adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the decision variable corresponding to choosing to build a fire station at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be a subset of towns including town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors could be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is itself divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Say we want to use at most &lt;math&gt; k &lt;/math&gt; routes. We want to find at most &lt;math&gt; k &lt;/math&gt; routes that maximize the number of covered segments. Let &lt;math&gt; y_i &lt;/math&gt; be the decision variable corresponding to choosing route &lt;math&gt; R_i &lt;/math&gt;, where each &lt;math&gt; R_i &lt;/math&gt; is regarded as the set of its segments. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leq k &lt;/math&gt; <br /> <br /> For a greedy approximation solution, please check the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey for a detailed analysis of the formulation and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem? &quot;Operations Research Letters&quot;, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.<br /> #Chvatal, V. A Greedy Heuristic for the Set-Covering Problem. &quot;Mathematics of Operations Research&quot; Vol. 4, No. 3 (Aug., 1979), pp. 233-235</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=1925 Set covering problem 2020-11-26T04:39:18Z <p>Khaledfahat: /* References */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. The problem is then to find a collection of subsets whose union covers the universal set at minimum total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; <br /> <br /> The importance of the set covering problem has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a prime example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion.&lt;sup&gt;1,7,9&lt;/sup&gt; <br /> <br /> Second, many problems in different industries can be formulated as set covering problems. 
For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The set covering problem is defined mathematically as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. &lt;math&gt; \cup_{i=1}^n s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i &lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i &lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i &gt; 0 &lt;/math&gt;). 
The objective is to find the minimum cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j = 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when all costs are equal to one, this is the number of selected subsets. The first constraint requires that every element &lt;math&gt; u_j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; be covered by at least one selected subset, where &lt;math&gt; a_{ij} = 1 &lt;/math&gt; if &lt;math&gt; u_j \in s_i &lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time algorithm is known for solving it exactly, and solving the integer linear program can become impractical for large instances. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. 
One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions for large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of uncovered elements and deletes the newly covered elements from the uncovered set, until all elements are covered. Let '''Y''' denote the full set of elements to be covered, let '''T''' be the set containing the elements covered so far, and let '''U''' be the set containing the elements of '''Y''' that are still uncovered. Initially, '''T''' is empty and all elements of '''Y''' belong to '''U'''. We iteratively select the subset in '''S''' that covers the largest number of elements in '''U''', add its elements to '''T''', and remove them from '''U'''. The algorithm is outlined below. <br /> <br /> '''Greedy algorithm for minimum set cover: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset, \; X \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { '''T''' stores the covered elements, '''X''' stores the selected subsets }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { '''U''' stores the uncovered elements of '''Y''' }<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in '''U'''<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; X &lt;/math&gt; and the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' &lt;math&gt; X &lt;/math&gt;<br /> <br /> 
==Numerical Example==<br /> Let’s consider a simple example in which cameras are placed at different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7} and camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that they cover are given in Table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered from camera locations {1,4,7,8} and stadium area 2 can be covered from camera locations {3,4,7,8}, while the rest are given in Table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. 
If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable 
&lt;math&gt;z_j&lt;/math&gt; to indicate whether a camera is installed at location &lt;math&gt;j&lt;/math&gt;. &lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints that correspond to the 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Constraints 2 and 11 are then no longer needed, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. 
With &lt;math&gt;z_3 = 1&lt;/math&gt; fixed, the remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints {3,6,12,14} are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. In addition, constraint 4 is implied by constraint 7: since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, &lt;math&gt;z_1 + z_2 + z_8 \geqslant z_1 + z_2 \geqslant 1&lt;/math&gt;, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Satisfying each of them with exactly one of its two variables gives 4 combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values (setting both variables of a constraint to 1 would use an extra camera).<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed to satisfy constraints 1 and 8.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. Combinations &lt;math&gt;A&lt;/math&gt; and &lt;math&gt;D&lt;/math&gt; each need one additional camera, whereas combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to be 1, so they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to choosing to build a fire station at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns that includes town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geqslant 1, \forall i&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Due to the scarcity of the physical sensors, it is not possible to place a detector on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Say we can use at most &lt;math&gt; k &lt;/math&gt; routes. We want to find up to &lt;math&gt; k &lt;/math&gt; routes that maximize the number of covered segments. Let &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; be the decision variable corresponding to choosing route &lt;math&gt; R_i &lt;/math&gt;, where each &lt;math&gt; R_i &lt;/math&gt; is treated as the set of segments it contains. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leqslant k &lt;/math&gt; <br /> <br /> For a greedy approximation solution, please check the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to the survey by Kasirzadeh et al. for a detailed analysis of the formulation and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem? &quot;Operations Research Letters&quot;, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.<br /> #Chvatal, V. A Greedy Heuristic for the Set-Covering Problem. &quot;Mathematics of Operations Research&quot; Vol. 4, No. 3 (Aug., 1979), pp. 233-235</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=1924 Set covering problem 2020-11-26T04:27:47Z <p>Khaledfahat: /* Introduction */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding the optimal number of subsets whose union covers the universal set while minimizing the total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; <br /> <br /> The set covering problem importance has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a primal example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion.&lt;sup&gt;1,7,9&lt;/sup&gt; <br /> <br /> Second, many problems in different industries can be formulated as set covering problems. 
For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i&lt;/math&gt; covers all elements in U (i.e. &lt;math&gt; \cup s_i = U &lt;/math&gt;). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of U and has an associated cost &lt;math&gt; c_i&lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i &gt; 0 &lt;/math&gt;). 
The objective is to find the minimum cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe U.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geqslant 1, \forall j= 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt; that cover all elements in the universe; when every cost equals 1, this is the number of selected subsets. The first constraint requires that every element j in the universe U be covered by at least one selected subset, where &lt;math&gt; a_{ij}&lt;/math&gt;=1 if j &lt;math&gt;\in&lt;/math&gt; &lt;math&gt; s_i&lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, which means that no polynomial-time algorithm is known for solving it exactly, and solving the integer linear program can become impractical for large instances. Therefore, approximation algorithms are used to obtain near-optimal solutions to large-scale problems in polynomial time. 
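<br /> <br /> To make the integer program above concrete, the following is a minimal sketch that solves a tiny instance by enumerating every binary vector &lt;math&gt; y &lt;/math&gt;. The instance (three subsets over four elements, with made-up costs) and the function name <code>solve_set_cover_exhaustively</code> are assumptions for illustration only. The loop visits &lt;math&gt; 2^n &lt;/math&gt; candidates, which is why exhaustive search is only practical for very small &lt;math&gt; n &lt;/math&gt;.<br /> <br />

```python
from itertools import product

def solve_set_cover_exhaustively(costs, a):
    """Enumerate every binary vector y and keep the cheapest feasible one.

    costs[i] is c_i; a[i][j] is 1 if subset i covers element j, else 0.
    """
    n, m = len(costs), len(a[0])
    best_cost, best_y = None, None
    for y in product((0, 1), repeat=n):              # 2**n candidate vectors
        # Covering constraint: sum_i a_ij * y_i >= 1 for every element j.
        feasible = all(
            sum(a[i][j] * y[i] for i in range(n)) >= 1 for j in range(m)
        )
        if not feasible:
            continue
        cost = sum(costs[i] * y[i] for i in range(n))  # objective value
        if best_cost is None or cost < best_cost:
            best_cost, best_y = cost, y
    return best_cost, best_y

# Hypothetical instance: 3 subsets over 4 elements.
costs = [3, 2, 2]
a = [
    [1, 1, 1, 1],   # s_1 covers all four elements
    [1, 1, 0, 0],   # s_2 covers elements 1 and 2
    [0, 0, 1, 1],   # s_3 covers elements 3 and 4
]
best_cost, best_y = solve_set_cover_exhaustively(costs, a)
print(best_cost, best_y)
```

<br /> In this made-up instance the single subset &lt;math&gt; s_1 &lt;/math&gt; is cheaper than the pair &lt;math&gt; s_2, s_3 &lt;/math&gt;, so minimizing cost and minimizing the number of subsets happen to agree.<br /> <br />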
One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to obtain near-optimal solutions for large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset that covers the largest number of uncovered elements and removes those elements from the uncovered set, until all elements are covered. Let '''T''' be the set that contains the covered elements, and '''U''' be the set that contains the elements of the universe '''Y''' that are still uncovered. At the beginning of the iteration, '''T''' is empty and all elements of '''Y''' belong to '''U'''. We iteratively select the set in '''S''' that covers the largest number of elements in '''U''' and add its elements to the covered elements in '''T'''. An example of this algorithm is presented below. <br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { '''T''' stores the covered elements }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { '''U''' stores the uncovered elements of '''Y''' }<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in '''U'''<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad 
&lt;/math&gt; '''Return''' the selected subsets &lt;math&gt; s_i &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we place cameras at different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7} and camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that they cover are given in Table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list, for each stadium area, the camera locations that cover it. For instance, stadium area 1 can be covered by camera locations {1,4,7,8} and stadium area 2 can be covered by camera locations {3,4,7,8}, while the rest are given in Table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. 
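<br /> <br /> As an illustration, the reformulation from Table 1 to Table 2, and from there to binary values, can be sketched in a few lines of Python. The variable names <code>covers</code>, <code>covered_by</code> and <code>matrix</code> are hypothetical; the data are copied from Table 1.<br /> <br />

```python
# Table 1: camera location -> stadium areas it covers.
covers = {
    1: [1, 3, 4, 6, 7], 2: [4, 7, 8, 12], 3: [2, 5, 9, 11, 13],
    4: [1, 2, 14, 15], 5: [3, 6, 10, 12, 14], 6: [8, 14, 15],
    7: [1, 2, 6, 11], 8: [1, 2, 4, 6, 8, 12],
}

# Table 2: stadium area -> camera locations that cover it.
covered_by = {area: [] for area in range(1, 16)}
for location in sorted(covers):
    for area in covers[location]:
        covered_by[area].append(location)

# Binary coverage values: row index is (area - 1), column index is (location - 1).
matrix = [
    [1 if location in covered_by[area] else 0 for location in range(1, 9)]
    for area in range(1, 16)
]

print(covered_by[1])   # camera locations covering stadium area 1
print(matrix[0])       # binary row for stadium area 1
```

<br /> Each row of <code>matrix</code> holds the binary coverage values of one stadium area across the 8 camera locations.<br /> <br />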
If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable 
&lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. &lt;math&gt;z_j = 1&lt;/math&gt; if camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium, there’s a constraint that the stadium area &lt;math&gt;i&lt;/math&gt; has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. All the 15 constraints that corresponds to 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraint {5,9,13}, we can obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus we no longer need constraint 2 and 11 as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. 
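<br /> <br /> Because this example has only 8 camera locations, the elimination argument can also be cross-checked by exhaustive search. The sketch below is not part of the original derivation; it tries camera subsets in order of increasing size and returns every smallest subset that covers all 15 stadium areas. The name <code>minimum_covers</code> is hypothetical, and the data are taken from Table 1.<br /> <br />

```python
from itertools import combinations

# Table 1: camera location -> stadium areas it covers.
covers = {
    1: {1, 3, 4, 6, 7}, 2: {4, 7, 8, 12}, 3: {2, 5, 9, 11, 13},
    4: {1, 2, 14, 15}, 5: {3, 6, 10, 12, 14}, 6: {8, 14, 15},
    7: {1, 2, 6, 11}, 8: {1, 2, 4, 6, 8, 12},
}
areas = set(range(1, 16))

def minimum_covers(covers, areas):
    """Return all smallest sets of locations whose union contains every area."""
    for size in range(1, len(covers) + 1):
        found = [
            combo for combo in combinations(sorted(covers), size)
            if set().union(*(covers[c] for c in combo)) >= areas
        ]
        if found:
            return found
    return []

optimal = minimum_covers(covers, areas)
print(optimal)   # every minimum-size selection of camera locations
```

<br /> Exhaustive search of this kind examines up to &lt;math&gt;2^8&lt;/math&gt; subsets here and does not scale to large instances.<br /> <br />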
With &lt;math&gt;z_3 = 1&lt;/math&gt; fixed, the remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints {3,6,12,14} are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. In addition, constraint 4 is implied by constraint 7: since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, &lt;math&gt;z_1 + z_2 + z_8 \geqslant z_1 + z_2 \geqslant 1&lt;/math&gt;, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Satisfying each of them with exactly one of its two variables gives 4 combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values (setting both variables of a constraint to 1 would use an extra camera).<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed to satisfy constraints 1 and 8.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. Combinations &lt;math&gt;A&lt;/math&gt; and &lt;math&gt;D&lt;/math&gt; each need one additional camera, whereas combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to be 1, so they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to choosing to build a fire station at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns that includes town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geqslant 1, \forall i&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Due to the scarcity of the physical sensors, it is not possible to place a detector on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Say we can use at most &lt;math&gt; k &lt;/math&gt; routes. We want to find up to &lt;math&gt; k &lt;/math&gt; routes that maximize the number of covered segments. Let &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; be the decision variable corresponding to choosing route &lt;math&gt; R_i &lt;/math&gt;, where each &lt;math&gt; R_i &lt;/math&gt; is treated as the set of segments it contains. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leqslant k &lt;/math&gt; <br /> <br /> For a greedy approximation solution, please check the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to the survey by Kasirzadeh et al. for a detailed analysis of the formulation and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
“What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?” Operations Research Letters, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=1892 Set covering problem 2020-11-26T01:42:16Z <p>Khaledfahat: /* Greedy approximation algorithm */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding the optimal number of subsets whose union covers the universal set while minimizing the total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; <br /> <br /> The set covering problem importance has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a primal example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion. <br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. 
Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is defined as follows. We define U = {&lt;math&gt; u_1&lt;/math&gt;, ….. &lt;math&gt; u_m&lt;/math&gt;} as the universe of elements and S = {&lt;math&gt; s_1&lt;/math&gt;, ….,&lt;math&gt; s_n&lt;/math&gt;} as a collection of subsets such that &lt;math&gt; s_i&lt;/math&gt; &lt;math&gt;\subseteq&lt;/math&gt; U and the union of &lt;math&gt; s_i&lt;/math&gt; covers all elements in U (i.e. &lt;math&gt;\cup&lt;/math&gt;&lt;math&gt; s_i&lt;/math&gt; = U ). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of U and has an associated cost &lt;math&gt; c_i&lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i&lt;/math&gt; &gt; 0). The objective is to find the minimum cost sub-collection of sets X &lt;math&gt;\subseteq&lt;/math&gt; S that covers all the elements in the universe U.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j= 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; selects the subsets &lt;math&gt; s_i&lt;/math&gt; that cover all elements in the universe by 
minimizing their total cost. The first constraint implies that every element j in the universe U must be covered, where &lt;math&gt; a_{ij}&lt;/math&gt;=1 if j &lt;math&gt;\in&lt;/math&gt; &lt;math&gt; s_i&lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, which means that no polynomial-time algorithm is known for solving the integer linear program to optimality, so large instances can be intractable. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions for large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset that covers the largest number of still-uncovered elements and removes those elements from the uncovered set, until all elements are covered. Let '''Y''' denote the universe of elements, let '''T''' hold the subsets selected so far (and therefore the covered elements), and let '''U''' be the set that contains the elements of '''Y''' that are still uncovered. At the beginning of the iteration, '''T''' is empty and all elements of '''Y''' belong to '''U'''. We iteratively select the set in '''S''' that covers the largest number of elements in '''U''', add it to '''T''', and remove its elements from '''U'''. An example of this algorithm is presented below. 
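The selection rule described in the paragraph above can also be sketched in Python. This is only a minimal illustration under stated assumptions: all subsets have unit cost, subsets are given as Python sets, and the function name greedy_set_cover and the small instance are hypothetical, not taken from the text or its references.

```python
def greedy_set_cover(universe, subsets):
    """Return the indices of the subsets picked by the unit-cost greedy rule."""
    uncovered = set(universe)  # plays the role of U: elements still uncovered
    chosen = []                # plays the role of T: subsets selected so far
    while uncovered:
        # Pick the subset that covers the most still-uncovered elements
        # (ties go to the lowest index).
        best = max(range(len(subsets)),
                   key=lambda i: len(uncovered & subsets[i]))
        if not uncovered & subsets[best]:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


# Small made-up instance (an assumption for illustration only).
universe = {1, 2, 3, 4, 5}
subsets = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
cover = greedy_set_cover(universe, subsets)
print(cover)  # [0, 3]
```

On this instance the first pass picks subset 0 (three new elements) and the second picks subset 3 (the two remaining elements). The greedy rule gives a feasible cover but is not guaranteed to be optimal in general.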
<br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; \quad &lt;/math&gt; &lt;math&gt; T \leftarrow \emptyset &lt;/math&gt; &lt;math&gt; \quad \quad \quad \quad \quad &lt;/math&gt; { '''T''' stores the selected subsets, i.e. the covered elements }<br /> <br /> Step 1: &lt;math&gt; \quad &lt;/math&gt; '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do:''' &lt;math&gt; \quad &lt;/math&gt; { '''U''' stores the uncovered elements of '''Y''' }<br /> <br /> Step 2: &lt;math&gt; \quad \quad \quad &lt;/math&gt; select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in '''U'''<br /> <br /> Step 3: &lt;math&gt; \quad \quad \quad &lt;/math&gt; add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: &lt;math&gt; \quad \quad \quad &lt;/math&gt; remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: &lt;math&gt; \quad &lt;/math&gt; '''End while''' <br /> <br /> Step 6: &lt;math&gt; \quad &lt;/math&gt; '''Return''' &lt;math&gt; T &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we assign cameras at different locations. Each location covers some stadium areas, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that it has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. All 15 constraints, which correspond to the 15 stadium areas, are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints {5,9,13}, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. These three constraints are then satisfied, and we no longer need constraints 2 and 11 either, as they are also satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Now consider constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, which forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints {3,6,12,14} are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is satisfied, since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Setting exactly one variable to 1 in each of these two constraints gives 4 combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then discuss each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied, and we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraint 1 is already satisfied, and constraint 8 is already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraint 1 is already satisfied, and constraint 8 is already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to be 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a set of adjacent towns. The problem is then solved to minimize the required number of fire stations to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the decision variable corresponding to choosing to build a fire station at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be a subset of towns including town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors could be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Say we want to use at most &lt;math&gt; k &lt;/math&gt; routes. We want to find &lt;math&gt; k &lt;/math&gt; routes that maximize the number of covered segments. Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable corresponding to choosing route &lt;math&gt; R_i &lt;/math&gt;, where &lt;math&gt; R_i &lt;/math&gt; is viewed as the set of its segments. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i} y_i = k &lt;/math&gt; <br /> <br /> For a greedy approximation solution, please check the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to this survey for a detailed analysis of the formulation and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
“What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?” Operations Research Letters, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=1885 Set covering problem 2020-11-25T23:50:55Z <p>Khaledfahat: /* Greedy approximation algorithm */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding the optimal number of subsets whose union covers the universal set while minimizing the total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; <br /> <br /> The set covering problem importance has two main aspects: one is pedagogical, and the other is practical. <br /> <br /> First, because many greedy approximation methods have been proposed for this combinatorial problem, studying it gives insight into the use of approximation algorithms in solving NP-hard problems. Thus, it is a primal example in teaching computational algorithms. We present a preview of these methods in a later section, and we refer the interested reader to these references for a deeper discussion. <br /> <br /> Second, many problems in different industries can be formulated as set covering problems. For example, scheduling machines to perform certain jobs can be thought of as covering the jobs. Picking the optimal location for a cell tower so that it covers the maximum number of customers is another set covering application. 
Moreover, this problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is defined as follows. We define U = {&lt;math&gt; u_1&lt;/math&gt;, ….. &lt;math&gt; u_m&lt;/math&gt;} as the universe of elements and S = {&lt;math&gt; s_1&lt;/math&gt;, ….,&lt;math&gt; s_n&lt;/math&gt;} as a collection of subsets such that &lt;math&gt; s_i&lt;/math&gt; &lt;math&gt;\subseteq&lt;/math&gt; U and the union of &lt;math&gt; s_i&lt;/math&gt; covers all elements in U (i.e. &lt;math&gt;\cup&lt;/math&gt;&lt;math&gt; s_i&lt;/math&gt; = U ). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of U and has an associated cost &lt;math&gt; c_i&lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i&lt;/math&gt; &gt; 0). The objective is to find the minimum cost sub-collection of sets X &lt;math&gt;\subseteq&lt;/math&gt; S that covers all the elements in the universe U.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j= 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; selects the subsets &lt;math&gt; s_i&lt;/math&gt; that cover all elements in the universe by 
minimizing their total cost. The first constraint implies that every element j in the universe U must be covered, where &lt;math&gt; a_{ij}&lt;/math&gt;=1 if j &lt;math&gt;\in&lt;/math&gt; &lt;math&gt; s_i&lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, which means that no polynomial-time algorithm is known for solving the integer linear program to optimality, so large instances can be intractable. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions for large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset that covers the largest number of still-uncovered elements and removes those elements from the uncovered set, until all elements are covered. Let '''Y''' denote the universe of elements, let '''T''' hold the subsets selected so far (and therefore the covered elements), and let '''U''' be the set that contains the elements of '''Y''' that are still uncovered. At the beginning of the iteration, '''T''' is empty and all elements of '''Y''' belong to '''U'''. We iteratively select the set in '''S''' that covers the largest number of elements in '''U''', add it to '''T''', and remove its elements from '''U'''. An example of this algorithm is presented below. 
<br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; T \leftarrow \emptyset &lt;/math&gt; { '''T''' stores the selected subsets, i.e. the covered elements }<br /> <br /> Step 1: '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do''' { '''U''' stores the uncovered elements of '''Y''' }<br /> <br /> Step 2: select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in '''U'''<br /> <br /> Step 3: add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: '''End while''' <br /> <br /> Step 6: '''Return''' &lt;math&gt; T &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we assign cameras at different locations. Each location covers some stadium areas, and our goal is to install the fewest cameras such that all stadium areas are covered. We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. 
For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. 
The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. &lt;math&gt;z_j = 1&lt;/math&gt; if camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. 
For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that it has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. All 15 constraints, which correspond to the 15 stadium areas, are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints {5,9,13}, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. These three constraints are then satisfied, and we no longer need constraints 2 and 11 either, as they are also satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. 
z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt; because &lt;math&gt;z_5&lt;/math&gt; is binary. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints {3,6,12,14} are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Constraint 4 is also redundant: its left-hand side contains every term of constraint 7 plus the nonnegative term &lt;math&gt;z_8&lt;/math&gt;, so constraint 4 holds whenever constraint 7 holds. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Each of these two constraints can be satisfied by setting exactly one of its two variables to 1; setting both would only add a camera. This gives four combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> For each combination, we then determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed to satisfy constraints 1 and 8.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. Combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; need no camera beyond the two they already select, whereas &lt;math&gt;A&lt;/math&gt; and &lt;math&gt;D&lt;/math&gt; each need one more, so &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; are the optimal solutions.<br /> <br /> To conclude, together with &lt;math&gt;z_3 = 1&lt;/math&gt; and &lt;math&gt;z_5 = 1&lt;/math&gt;, our two solutions are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. Town &lt;math&gt; i &lt;/math&gt; is served if a station is built at any town in &lt;math&gt; S_i &lt;/math&gt;, so the problem is formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Because physical sensors are scarce, a detector cannot be placed on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This differs from the other applications because it results in a maximization formulation rather than a minimization formulation. Suppose we can use at most &lt;math&gt; k &lt;/math&gt; routes, and we want to choose them so that the number of covered segments is maximized. Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if route &lt;math&gt; R_i &lt;/math&gt; is chosen, where each route &lt;math&gt; R_i &lt;/math&gt; is viewed as the set of its segments and &lt;math&gt; N &lt;/math&gt; is the number of routes. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leq k, \quad y_i \in \{0, 1\} &lt;/math&gt; <br /> <br /> For a greedy approximation solution, see the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to the survey by Kasirzadeh et al. for a detailed analysis of the formulations and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
“What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?” Operations Research Letters, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=1879 Set covering problem 2020-11-25T22:36:39Z <p>Khaledfahat: /* Greedy approximation algorithm */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. The problem is to find a collection of these subsets whose union covers the universal set at minimum total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; The problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is defined as follows. We define U = {&lt;math&gt; u_1&lt;/math&gt;, …, &lt;math&gt; u_m&lt;/math&gt;} as the universe of elements and S = {&lt;math&gt; s_1&lt;/math&gt;, …, &lt;math&gt; s_n&lt;/math&gt;} as a collection of subsets such that &lt;math&gt; s_i&lt;/math&gt; &lt;math&gt;\subseteq&lt;/math&gt; U and the union of &lt;math&gt; s_i&lt;/math&gt; covers all elements in U (i.e. &lt;math&gt;\cup&lt;/math&gt;&lt;math&gt; s_i&lt;/math&gt; = U ). 
Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of U and has an associated cost &lt;math&gt; c_i&lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i&lt;/math&gt; &gt; 0). The objective is to find the minimum cost sub-collection of sets X &lt;math&gt;\subseteq&lt;/math&gt; S that covers all the elements in the universe U.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j= 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when every cost equals 1, this is the number of selected subsets. The first constraint requires that every element j in the universe U be covered by at least one selected subset, where &lt;math&gt; a_{ij}&lt;/math&gt;=1 if j &lt;math&gt;\in&lt;/math&gt; &lt;math&gt; s_i&lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time algorithm is known for solving it exactly, and solving the integer linear program can become impractical for large instances. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. 
One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions for large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements and removes those elements from the uncovered set, until all elements are covered. Let '''T''' be the set that contains the covered elements, '''U''' be the set that contains the elements that are still uncovered, and '''X''' be the collection of subsets selected so far. At the beginning of the iteration, '''T''' and '''X''' are empty and every element of the universe belongs to '''U'''. We iteratively select the subset in '''S''' that covers the largest number of elements in '''U''', add it to '''X''', and move its elements from '''U''' to '''T'''. The steps of the algorithm are presented below. <br /> <br /> '''Greedy algorithm for minimum set cover: '''<br /> <br /> Step 0: &lt;math&gt; T \leftarrow \emptyset, X \leftarrow \emptyset &lt;/math&gt; { '''T''' stores the covered elements, '''X''' the selected subsets }<br /> <br /> Step 1: '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do''' { '''U''' stores the uncovered elements }<br /> <br /> Step 2: select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in '''U'''<br /> <br /> Step 3: add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; X &lt;/math&gt; and the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt;<br /> <br /> Step 4: remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: '''End while''' <br /> <br /> Step 6: '''Return''' &lt;math&gt; X &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we place cameras at different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. 
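As a rough illustration of the greedy procedure from the previous section, here is a minimal Python sketch applied to the camera coverage data listed in Table 1. The function name `greedy_set_cover`, the dictionary layout, and the tie-breaking rule (lowest location number first) are assumptions made for this sketch and are not part of the original article.

```python
def greedy_set_cover(universe, subsets):
    """Return the keys of the chosen subsets, in the order they were picked."""
    uncovered = set(universe)  # elements not yet covered
    chosen = []                # keys of the subsets selected so far
    while uncovered:
        # Pick the subset covering the most uncovered elements;
        # max() keeps the first key on ties, i.e. the lowest location number.
        best = max(subsets, key=lambda j: len(subsets[j] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


# Camera location -> stadium areas it covers (data of Table 1).
cameras = {
    1: {1, 3, 4, 6, 7},
    2: {4, 7, 8, 12},
    3: {2, 5, 9, 11, 13},
    4: {1, 2, 14, 15},
    5: {3, 6, 10, 12, 14},
    6: {8, 14, 15},
    7: {1, 2, 6, 11},
    8: {1, 2, 4, 6, 8, 12},
}

picked = greedy_set_cover(range(1, 16), cameras)
print(picked)  # [8, 3, 5, 1, 4]
```

With this tie-breaking rule the sketch selects five locations, one more than the minimum of four cameras obtained in this example, which illustrates that the greedy method yields a valid cover but not necessarily an optimal one.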
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
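The binary representation just described can be generated directly from Table 1. The following Python sketch is one possible way to do it; the names `cameras`, `covered_by`, and `y` are chosen here for illustration, with `y` stored as a list of rows so that `y[i-1][j-1]` plays the role of &lt;math&gt;y_{ij}&lt;/math&gt;.

```python
# Camera location -> stadium areas it covers (data of Table 1).
cameras = {
    1: [1, 3, 4, 6, 7],
    2: [4, 7, 8, 12],
    3: [2, 5, 9, 11, 13],
    4: [1, 2, 14, 15],
    5: [3, 6, 10, 12, 14],
    6: [8, 14, 15],
    7: [1, 2, 6, 11],
    8: [1, 2, 4, 6, 8, 12],
}
areas = range(1, 16)
locations = range(1, 9)

# Stadium area -> camera locations that cover it (the content of Table 2).
covered_by = {i: [j for j in locations if i in cameras[j]] for i in areas}

# y[i-1][j-1] = 1 if area i can be covered from location j, else 0.
y = [[1 if i in cameras[j] else 0 for j in locations] for i in areas]

for i in areas:
    print(i, covered_by[i], y[i - 1])
```

Each printed row pairs a stadium area with the locations that cover it and with the corresponding 0/1 row, which can be compared against Table 2.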
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium, there’s a constraint that the stadium area &lt;math&gt;i&lt;/math&gt; has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. All the 15 constraints that corresponds to 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraint {5,9,13}, we can obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus we no longer need constraint 2 and 11 as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt; because &lt;math&gt;z_5&lt;/math&gt; is binary. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints {3,6,12,14} are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Constraint 4 is also redundant: its left-hand side contains every term of constraint 7 plus the nonnegative term &lt;math&gt;z_8&lt;/math&gt;, so constraint 4 holds whenever constraint 7 holds. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Each of these two constraints can be satisfied by setting exactly one of its two variables to 1; setting both would only add a camera. This gives four combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> For each combination, we then determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed to satisfy constraints 1 and 8.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. Combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; need no camera beyond the two they already select, whereas &lt;math&gt;A&lt;/math&gt; and &lt;math&gt;D&lt;/math&gt; each need one more, so &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; are the optimal solutions.<br /> <br /> To conclude, together with &lt;math&gt;z_3 = 1&lt;/math&gt; and &lt;math&gt;z_5 = 1&lt;/math&gt;, our two solutions are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. Town &lt;math&gt; i &lt;/math&gt; is served if a station is built at any town in &lt;math&gt; S_i &lt;/math&gt;, so the problem is formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Because physical sensors are scarce, a detector cannot be placed on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This differs from the other applications because it results in a maximization formulation rather than a minimization formulation. Suppose we can use at most &lt;math&gt; k &lt;/math&gt; routes, and we want to choose them so that the number of covered segments is maximized. Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if route &lt;math&gt; R_i &lt;/math&gt; is chosen, where each route &lt;math&gt; R_i &lt;/math&gt; is viewed as the set of its segments and &lt;math&gt; N &lt;/math&gt; is the number of routes. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leq k, \quad y_i \in \{0, 1\} &lt;/math&gt; <br /> <br /> For a greedy approximation solution, see the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to the survey by Kasirzadeh et al. for a detailed analysis of the formulations and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
“What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?” Operations Research Letters, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=1878 Set covering problem 2020-11-25T22:34:01Z <p>Khaledfahat: /* Greedy approximation algorithm */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. The problem is to find a collection of these subsets whose union covers the universal set at minimum total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; The problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is defined as follows. We define U = {&lt;math&gt; u_1&lt;/math&gt;, …, &lt;math&gt; u_m&lt;/math&gt;} as the universe of elements and S = {&lt;math&gt; s_1&lt;/math&gt;, …, &lt;math&gt; s_n&lt;/math&gt;} as a collection of subsets such that &lt;math&gt; s_i&lt;/math&gt; &lt;math&gt;\subseteq&lt;/math&gt; U and the union of &lt;math&gt; s_i&lt;/math&gt; covers all elements in U (i.e. &lt;math&gt;\cup&lt;/math&gt;&lt;math&gt; s_i&lt;/math&gt; = U ). 
Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of U and has an associated cost &lt;math&gt; c_i&lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i&lt;/math&gt; &gt; 0). The objective is to find the minimum cost sub-collection of sets X &lt;math&gt;\subseteq&lt;/math&gt; S that covers all the elements in the universe U.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j= 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when every cost equals 1, this is the number of selected subsets. The first constraint requires that every element j in the universe U be covered by at least one selected subset, where &lt;math&gt; a_{ij}&lt;/math&gt;=1 if j &lt;math&gt;\in&lt;/math&gt; &lt;math&gt; s_i&lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time algorithm is known for solving it exactly, and solving the integer linear program can become impractical for large instances. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. 
One widely used approximation algorithm for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find optimal or near-optimal solutions for large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements and removes those elements from the uncovered set, until all elements are covered. In this section, let '''Y''' be the full set of elements to be covered, '''T''' the set that contains the covered elements, and '''U''' the set that contains the elements of '''Y''' that are still uncovered. At the beginning of the iteration, '''T''' is empty and all elements of '''Y''' belong to '''U'''. We iteratively select the set in '''S''' that covers the largest number of elements in '''U''' and add its elements to the covered elements in '''T'''. The algorithm is presented below. <br /> <br /> '''Greedy algorithm for minimum set cover: '''<br /> <br /> Step 0: &lt;math&gt; T \leftarrow \emptyset &lt;/math&gt; { '''T''' stores the covered elements }<br /> <br /> Step 1: '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do''' { '''U''' stores the uncovered elements of '''Y''' }<br /> <br /> Step 2: select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in '''U'''<br /> <br /> Step 3: add the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt; and record &lt;math&gt; s_i &lt;/math&gt; as selected<br /> <br /> Step 4: remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: '''End while''' <br /> <br /> Step 6: '''Return''' the selected subsets<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example in which cameras are placed at different locations around a stadium. Each location covers some areas of the stadium, and our goal is to install the fewest cameras such that all areas of the stadium are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
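<br /> <br /> The binary representation described here can be built mechanically from Table 1. The following is a minimal sketch, assuming the table is held as a Python dictionary; the names <code>coverage</code>, <code>y</code> and <code>cameras_for_area</code> are illustrative choices, not from the source.<br />

```python
# Table 1: camera location -> stadium areas it covers
coverage = {
    1: {1, 3, 4, 6, 7},
    2: {4, 7, 8, 12},
    3: {2, 5, 9, 11, 13},
    4: {1, 2, 14, 15},
    5: {3, 6, 10, 12, 14},
    6: {8, 14, 15},
    7: {1, 2, 6, 11},
    8: {1, 2, 4, 6, 8, 12},
}
areas = range(1, 16)

# y[i][j] = 1 if stadium area i can be covered from camera location j
y = {i: {j: int(i in coverage[j]) for j in coverage} for i in areas}

# Inverting the matrix row by row recovers Table 2:
# stadium area -> camera locations that cover it
cameras_for_area = {i: [j for j in coverage if y[i][j] == 1] for i in areas}

print(y[1][1], y[1][2])
print(cameras_for_area[1])
```

Keeping both views is convenient: the per-camera sets match how the data are given, while the per-area lists are exactly the index sets that appear in the covering constraints.<br /> <br />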
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints, one per stadium area, are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraints 5, 9 and 13 each require &lt;math&gt;z_3 = 1&lt;/math&gt;, since &lt;math&gt;z_3&lt;/math&gt; is binary. Constraints 2 and 11 are then no longer needed, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1.
 z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12 and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Constraint 4 is also redundant: its left-hand side contains every term of constraint 7 plus the nonnegative term &lt;math&gt;z_8&lt;/math&gt;, so it holds whenever constraint 7 holds. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15, which share no variables. 
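<br /> <br /> Because only six variables remain free at this point, the reduced system can also be enumerated exhaustively as a cross-check on the manual case analysis. The following is a minimal sketch, assuming &lt;math&gt;z_3 = z_5 = 1&lt;/math&gt; are already fixed; the names <code>constraints</code>, <code>free</code> and <code>minimal</code> are illustrative, not from the source.<br />

```python
from itertools import product

# Remaining constraints after fixing z_3 = z_5 = 1,
# each given as the indices j whose z_j must sum to at least 1
constraints = {
    1: (1, 4, 7, 8),
    7: (1, 2),
    8: (2, 6, 8),
    15: (4, 6),
}
free = (1, 2, 4, 6, 7, 8)  # camera locations still undecided

feasible = []
for bits in product((0, 1), repeat=len(free)):
    z = dict(zip(free, bits))
    if all(sum(z[j] for j in idx) >= 1 for idx in constraints.values()):
        # Record the feasible assignment as the list of locations set to 1
        feasible.append([j for j in free if z[j] == 1])

fewest = min(len(s) for s in feasible)
minimal = sorted(s for s in feasible if len(s) == fewest)
print(fewest, minimal)
```

The lists in <code>minimal</code> give the additional camera locations to open on top of locations 3 and 5; enumeration is viable here only because the reduction left so few free variables.<br /> <br />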
Each of these two constraints needs at least one of its variables set to 1. Setting exactly one variable to 1 in each (setting both can only add cameras) gives four combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraint 1 is already satisfied, and constraint 8 is already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraint 1 is already satisfied, and constraint 8 is already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to be 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions (with all other &lt;math&gt;z_j = 0&lt;/math&gt;) are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town together with its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j \in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors could be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route itself is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This is quite different from other applications because it results in a maximization formulation, rather than a minimization formulation. Say we want to use at most &lt;math&gt; k &lt;/math&gt; routes. We want to find at most &lt;math&gt; k &lt;/math&gt; routes that maximize the number of covered segments. Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if route &lt;math&gt; R_i &lt;/math&gt; is chosen, where each route &lt;math&gt; R_i &lt;/math&gt; is viewed as the set of its segments and &lt;math&gt; N &lt;/math&gt; is the number of candidate routes. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leq k &lt;/math&gt; <br /> <br /> For a greedy approximation solution, please check the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to the survey by Kasirzadeh et al. for a detailed analysis of the formulation and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the least number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
“What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?” Operations Research Letters, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=1877 Set covering problem 2020-11-25T22:27:20Z <p>Khaledfahat: </p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. The problem is to find a sub-collection of these subsets whose union covers the universal set at minimum total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; The problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. &lt;math&gt; \bigcup_{i=1}^n s_i = U &lt;/math&gt;). 
Additionally, each set &lt;math&gt; s_i &lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i &lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i &gt; 0 &lt;/math&gt;). The objective is to find the minimum-cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j = 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1, \ldots, n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i &lt;/math&gt;; when all costs are equal, this amounts to minimizing the number of selected subsets. The first constraint requires that every element &lt;math&gt; j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; be covered by at least one selected subset, where &lt;math&gt; a_{ij} = 1 &lt;/math&gt; if &lt;math&gt; j \in s_i &lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time algorithm is known for solving the integer linear program exactly, and large instances can be impractical to solve to optimality. Approximation algorithms are therefore used to obtain optimal or near-optimal solutions to large-scale problems in polynomial time. 
One widely used approximation algorithm for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find optimal or near-optimal solutions for large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements and removes those elements from the uncovered set, until all elements are covered. In this section, let '''Y''' be the full set of elements to be covered, '''T''' the set that contains the covered elements, and '''U''' the set that contains the elements of '''Y''' that are still uncovered. At the beginning of the iteration, '''T''' is empty and all elements of '''Y''' belong to '''U'''. We iteratively select the set in '''S''' that covers the largest number of elements in '''U''' and add its elements to the covered elements in '''T'''. The algorithm is presented below. <br /> <br /> '''Greedy algorithm for minimum set cover: '''<br /> <br /> Step 0: &lt;math&gt; T \leftarrow \emptyset &lt;/math&gt; { '''T''' stores the covered elements }<br /> <br /> Step 1: '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do''' { '''U''' stores the uncovered elements of '''Y''' }<br /> <br /> Step 2: select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in '''U'''<br /> <br /> Step 3: add the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt; and record &lt;math&gt; s_i &lt;/math&gt; as selected<br /> <br /> Step 4: remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: '''End while''' <br /> <br /> Step 6: '''Return''' the selected subsets<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example in which cameras are placed at different locations around a stadium. Each location covers some areas of the stadium, and our goal is to install the fewest cameras such that all areas of the stadium are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints, one per stadium area, are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraints 5, 9 and 13 each require &lt;math&gt;z_3 = 1&lt;/math&gt;, since &lt;math&gt;z_3&lt;/math&gt; is binary. Constraints 2 and 11 are then no longer needed, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1.
 z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12 and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Constraint 4 is also redundant: its left-hand side contains every term of constraint 7 plus the nonnegative term &lt;math&gt;z_8&lt;/math&gt;, so it holds whenever constraint 7 holds. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15, which share no variables. 
Each of these two constraints needs at least one of its variables set to 1. Setting exactly one variable to 1 in each (setting both can only add cameras) gives four combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraint 1 is already satisfied, and constraint 8 is already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraint 1 is already satisfied, and constraint 8 is already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to be 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions (with all other &lt;math&gt;z_j = 0&lt;/math&gt;) are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of some public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town together with its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j \in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Due to the scarcity of the physical sensors, the problem does not allow for placing a detector on every road. The task of finding the maximum coverage using a limited number of detectors could be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route itself is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This differs from the other applications because it results in a maximization formulation rather than a minimization formulation. Say we want to use at most &lt;math&gt; k &lt;/math&gt; routes. We want to find at most &lt;math&gt; k &lt;/math&gt; routes that maximize the number of covered segments. Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if route &lt;math&gt; R_i &lt;/math&gt; is chosen and 0 otherwise. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leq k &lt;/math&gt; <br /> <br /> For a greedy approximation solution, see the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to the survey by Kasirzadeh et al. for a detailed analysis of the formulation and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
“What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?” Operations Research Letters, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=1874 Set covering problem 2020-11-25T21:52:07Z <p>Khaledfahat: </p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding a collection of subsets whose union covers the universal set at minimum total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; The problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. &lt;math&gt; \bigcup_{i=1}^n s_i = U &lt;/math&gt;). 
Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i&lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i &gt; 0&lt;/math&gt;). The objective is to find the minimum cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j= 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; is the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when every cost equals 1, it counts the number of selected subsets. The first constraint requires that every element &lt;math&gt; u_j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; be covered, where &lt;math&gt; a_{ij} = 1&lt;/math&gt; if &lt;math&gt; u_j \in s_i&lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> Set covering problems are significant NP-hard optimization problems, which means that no polynomial-time algorithm is known for solving them exactly, and solving the integer linear program can become impractical for large instances. Therefore, approximation algorithms are used to find near-optimal solutions to large-scale problems in polynomial time. 
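As a minimal sketch of the formulation above, the following Python snippet evaluates the objective and the covering constraints for a candidate selection, and finds the cheapest feasible selection by exhaustive enumeration. The tiny instance (three elements, three subsets, and their costs) and the helper names `is_cover`, `total_cost` and `brute_force_min_cover` are illustrative assumptions rather than part of the original article. Enumeration is exponential in the number of subsets, so it only serves to make the model concrete.

```python
from itertools import product

# Hypothetical instance: a[i][j] = 1 if subset i contains element j.
a = [
    [1, 1, 0],  # subset 1 covers elements 1 and 2
    [0, 1, 1],  # subset 2 covers elements 2 and 3
    [1, 0, 1],  # subset 3 covers elements 1 and 3
]
c = [2, 3, 4]  # assumed positive costs c_i


def is_cover(y, a):
    """Check sum_i a[i][j] * y[i] >= 1 for every element j."""
    m = len(a[0])
    return all(sum(a[i][j] * y[i] for i in range(len(a))) >= 1 for j in range(m))


def total_cost(y, c):
    """Objective value sum_i c[i] * y[i]."""
    return sum(ci * yi for ci, yi in zip(c, y))


def brute_force_min_cover(a, c):
    """Enumerate all binary vectors y and keep the cheapest feasible one."""
    best_y, best_cost = None, float("inf")
    for y in product((0, 1), repeat=len(a)):
        if is_cover(y, a) and total_cost(y, c) < best_cost:
            best_y, best_cost = y, total_cost(y, c)
    return best_y, best_cost
```

For this assumed instance no single subset covers all three elements, so the enumeration selects the two cheapest subsets.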
One of the widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions for large-scale set covering instances in polynomial time.&lt;sup&gt;7,9&lt;/sup&gt; The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements and removes those elements from the uncovered set, until all elements are covered. Let '''T''' be the set of covered elements, '''U''' be the set of elements that are still uncovered, and '''X''' be the collection of selected subsets. At the beginning of the iteration, '''T''' and '''X''' are empty and all elements belong to '''U'''. We iteratively select the set in '''S''' that covers the largest number of elements in '''U''', add its elements to '''T''', and record the set in '''X'''. The algorithm is presented below. <br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> Step 0: &lt;math&gt; T \leftarrow \emptyset, X \leftarrow \emptyset &lt;/math&gt; ('''T''' stores the covered elements, '''X''' stores the selected subsets) <br /> <br /> Step 1: '''While''' &lt;math&gt; U \neq \emptyset &lt;/math&gt; '''do''' ('''U''' stores the uncovered elements)<br /> <br /> Step 2: select &lt;math&gt; s_i \in S &lt;/math&gt; that covers the highest number of elements in '''U'''<br /> <br /> Step 3: add the elements of &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; T &lt;/math&gt; and add &lt;math&gt; s_i &lt;/math&gt; to &lt;math&gt; X &lt;/math&gt;<br /> <br /> Step 4: remove the elements of &lt;math&gt; s_i &lt;/math&gt; from &lt;math&gt; U &lt;/math&gt;<br /> <br /> Step 5: '''End while''' <br /> <br /> Step 6: '''Return''' &lt;math&gt; X &lt;/math&gt;<br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we install cameras at different locations. Each location covers some stadium areas, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
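The two tables and the binary indicators can also be derived mechanically. The snippet below is a small sketch that starts from the data of Table 1, inverts it to obtain Table 2, and builds the indicators. The variable names `covers`, `covered_by` and `y` are chosen here for illustration.

```python
# Table 1 from the example: camera location -> stadium areas it covers.
covers = {
    1: {1, 3, 4, 6, 7},
    2: {4, 7, 8, 12},
    3: {2, 5, 9, 11, 13},
    4: {1, 2, 14, 15},
    5: {3, 6, 10, 12, 14},
    6: {8, 14, 15},
    7: {1, 2, 6, 11},
    8: {1, 2, 4, 6, 8, 12},
}
areas = range(1, 16)

# Table 2: stadium area -> camera locations that cover it.
covered_by = {i: sorted(j for j, s in covers.items() if i in s) for i in areas}

# Binary indicator y[i, j] = 1 if area i is covered by location j, else 0.
y = {(i, j): int(i in covers[j]) for i in areas for j in covers}
```

Each entry of `covered_by` lists exactly the locations whose variables appear in the covering constraint of that stadium area.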
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium, there’s a constraint that the stadium area &lt;math&gt;i&lt;/math&gt; has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. All the 15 constraints that corresponds to 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraint {5,9,13}, we can obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus we no longer need constraint 2 and 11 as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt; forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12 and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 4 and 7, constraint 4 is satisfied whenever constraint 7 is satisfied, since the &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
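Before moving on, the elimination steps used so far can be sketched in code. The snippet below is an illustrative sketch, not part of the original article: each constraint is stored as the set of location indices whose variables appear in it, and the hypothetical helper `reduce_constraints` fixes the variables forced by single-variable constraints, drops the constraints they satisfy, and then drops any constraint implied by a smaller one.

```python
# Covering constraints of the camera example: area -> locations appearing in it.
constraints = {
    1: {1, 4, 7, 8}, 2: {3, 4, 7, 8}, 3: {1, 5}, 4: {1, 2, 8}, 5: {3},
    6: {1, 5, 7, 8}, 7: {1, 2}, 8: {2, 6, 8}, 9: {3}, 10: {5},
    11: {3, 7}, 12: {2, 5, 8}, 13: {3}, 14: {4, 5, 6}, 15: {4, 6},
}


def reduce_constraints(constraints):
    """Fix forced variables, then drop satisfied and dominated constraints."""
    # A constraint with a single variable forces that variable to 1.
    forced = {next(iter(s)) for s in constraints.values() if len(s) == 1}
    # Constraints containing a forced variable are already satisfied.
    left = {k: s for k, s in constraints.items() if not (s & forced)}
    # A constraint is dominated if another remaining constraint uses a strict
    # subset of its variables: satisfying the smaller one satisfies it too.
    kept = {
        k: s for k, s in left.items()
        if not any(t < s for t in left.values())
    }
    return forced, kept


forced, remaining = reduce_constraints(constraints)
```

On the data of this example, `forced` contains locations 3 and 5, and `remaining` holds constraints 1, 7, 8 and 15, matching the list above.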
Constraints 7 and 15 can each be satisfied by setting exactly one of their two variables to 1, which gives four combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed to satisfy constraints 1 and 8.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied, but we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to be 1, they are the optimal solutions. No solution can use fewer than four cameras, because &lt;math&gt;z_3 = 1&lt;/math&gt; and &lt;math&gt;z_5 = 1&lt;/math&gt; are forced and constraints 7 and 15 share no variables, so each requires a further camera.<br /> <br /> To conclude, our two solutions (with all other &lt;math&gt;z_j = 0&lt;/math&gt;) are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is particularly evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage of public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its own town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt; and 0 otherwise. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. The problem is then formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Because physical sensors are scarce, a detector cannot be placed on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a variant of the set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This differs from the other applications because it results in a maximization formulation rather than a minimization formulation. Say we want to use at most &lt;math&gt; k &lt;/math&gt; routes. We want to find at most &lt;math&gt; k &lt;/math&gt; routes that maximize the number of covered segments. Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if route &lt;math&gt; R_i &lt;/math&gt; is chosen and 0 otherwise. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leq k &lt;/math&gt; <br /> <br /> For a greedy approximation solution, see the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights. Due to the complexity of airline schedules, this problem is usually divided into two subproblems: crew pairing and crew assignment. We refer the interested reader to the survey by Kasirzadeh et al. for a detailed analysis of the formulation and algorithms that pertain to this significant application.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;8&lt;/span&gt;<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. 
Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from http://www.jstor.org/stable/25767684<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). https://doi.org/10.1007/BF01942293<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 https://doi.org/10.1006/jagm.1997.0887.<br /> # Kasirzadeh, A., Saddoune, M. &amp; Soumis, F. Airline crew scheduling: models, algorithms, and data sets. ''EURO J Transp Logist'' 6, 111–137 (2017). https://doi.org/10.1007/s13676-015-0080-x<br /> #Vasko, Francis J., et al. 
“What Is the Best Greedy-like Heuristic for the Weighted Set Covering Problem?” Operations Research Letters, North-Holland, 22 Mar. 2016, www.sciencedirect.com/science/article/pii/S016763771600047X.</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=1859 Set covering problem 2020-11-25T19:01:10Z <p>Khaledfahat: /* Integer linear program formulation */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding a collection of subsets whose union covers the universal set at minimum total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; The problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is defined as follows. We define &lt;math&gt; U = \{u_1, \ldots, u_m\} &lt;/math&gt; as the universe of elements and &lt;math&gt; S = \{s_1, \ldots, s_n\} &lt;/math&gt; as a collection of subsets such that &lt;math&gt; s_i \subseteq U &lt;/math&gt; and the union of the &lt;math&gt; s_i &lt;/math&gt; covers all elements in &lt;math&gt; U &lt;/math&gt; (i.e. &lt;math&gt; \bigcup_{i=1}^n s_i = U &lt;/math&gt;). 
Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of &lt;math&gt; U &lt;/math&gt; and has an associated cost &lt;math&gt; c_i&lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i &gt; 0&lt;/math&gt;). The objective is to find the minimum cost sub-collection of sets &lt;math&gt; X \subseteq S &lt;/math&gt; that covers all the elements in the universe &lt;math&gt; U &lt;/math&gt;.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j= 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; is the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when every cost equals 1, it counts the number of selected subsets. The first constraint requires that every element &lt;math&gt; u_j &lt;/math&gt; in the universe &lt;math&gt; U &lt;/math&gt; be covered, where &lt;math&gt; a_{ij} = 1&lt;/math&gt; if &lt;math&gt; u_j \in s_i&lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> Set covering problems are significant NP-hard optimization problems, which means that no polynomial-time algorithm is known for solving them exactly, and solving the integer linear program can become impractical for large instances. Therefore, approximation algorithms are used to find near-optimal solutions to large-scale problems in polynomial time. 
One of the widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions for large-scale set covering instances in polynomial time. The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements and removes those elements from the uncovered set, until all elements are covered. Let '''T''' be the set of covered elements and '''U''' be the set of elements that are still uncovered. At the beginning of the iteration, '''T''' is empty and all elements belong to '''U'''. We iteratively select the set in '''S''' that covers the largest number of elements in '''U''' and add its elements to '''T'''. An example of this algorithm is presented below. <br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we install cameras at different locations. Each location covers some stadium areas, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, and &lt;math&gt;z_j = 0&lt;/math&gt; otherwise. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint requiring that the area be covered by at least one installed camera, that is, &lt;math&gt;\sum_{j=1}^8 y_{ij} z_j \geqslant 1&lt;/math&gt;. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints, one for each stadium area, are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Constraints 2 and 11 are then no longer needed, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 4 and 7, constraint 4 is satisfied whenever constraint 7 is, because its left-hand side is that of constraint 7 plus the nonnegative term &lt;math&gt;z_8&lt;/math&gt;; constraint 4 is therefore no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15, which share no variables. 
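
As a cross-check on the elimination steps, the four remaining constraints (1, 7, 8, and 15) are small enough to enumerate exhaustively. The following is a minimal Python sketch, not part of the original article; the names `FREE`, `REMAINING`, and `is_feasible` are illustrative assumptions, and the sketch takes &lt;math&gt;z_3 = z_5 = 1&lt;/math&gt; as already fixed.

```python
from itertools import product

FREE = (1, 2, 4, 6, 7, 8)   # locations still undecided; z_3 = z_5 = 1 are fixed
REMAINING = {               # constraint number -> locations appearing in it
    1: (1, 4, 7, 8),
    7: (1, 2),
    8: (2, 6, 8),
    15: (4, 6),
}


def is_feasible(z):
    """True if every remaining constraint has at least one installed camera."""
    return all(any(z[j] for j in locations) for locations in REMAINING.values())


# Enumerate all 2^6 assignments of the undecided variables.
candidates = [dict(zip(FREE, bits)) for bits in product((0, 1), repeat=len(FREE))]
feasible = [z for z in candidates if is_feasible(z)]

# Smallest number of additional cameras, and every assignment achieving it.
fewest = min(sum(z.values()) for z in feasible)
best = sorted(
    [j for j in FREE if z[j]] for z in feasible if sum(z.values()) == fewest
)
print(fewest, best)  # 2 [[1, 6], [2, 4]]
```

Under these assumptions, two more cameras are needed beyond locations 3 and 5, placed either at locations 1 and 6 or at locations 2 and 4. Enumeration is only practical here because six binary variables give just 64 assignments.
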
Each of them requires at least one of its variables to equal 1, so at least two more cameras are needed. Choosing exactly one variable per constraint gives four candidate combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. Combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; need no additional camera (&lt;math&gt;z_7 = z_8 = 0&lt;/math&gt;), whereas &lt;math&gt;A&lt;/math&gt; and &lt;math&gt;D&lt;/math&gt; each need one more, so &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; set the fewest &lt;math&gt;z_j&lt;/math&gt; to 1 and are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is particularly evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> <br /> This set covering problem is concerned with maximizing the coverage provided by public facilities placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city. <br /> <br /> Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if a fire station is built at town &lt;math&gt; i &lt;/math&gt;. Let &lt;math&gt; S_i &lt;/math&gt; be the subset of towns consisting of town &lt;math&gt; i &lt;/math&gt; and all its neighbors. Town &lt;math&gt; i &lt;/math&gt; is served if a station is built at any town in &lt;math&gt; S_i &lt;/math&gt;, so the problem is formulated as follows.<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n y_i&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{j\in S_i} y_j \geq 1, \forall i = 1, \ldots, n&lt;/math&gt; <br /> ; The optimal route selection problem<br /> <br /> Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Because physical sensors are scarce, a detector cannot be placed on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''R''''', where each route is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> <br /> This differs from the other applications because it results in a maximization formulation rather than a minimization formulation. Say we want to use at most k routes; we then look for at most k routes that maximize the number of covered segments. Let &lt;math&gt; y_i &lt;/math&gt; be the binary decision variable that equals 1 if route &lt;math&gt; R_i &lt;/math&gt; is chosen, and view each route &lt;math&gt; R_i &lt;/math&gt; as the set of its segments. The problem is then formulated as follows.<br /> <br /> maximize &lt;math&gt;\left| \bigcup_{i : y_i = 1} R_i \right|&lt;/math&gt; <br /> <br /> such that &lt;math&gt; \sum_{i=1}^N y_i \leq k, \quad y_i \in \{0, 1\} &lt;/math&gt; <br /> <br /> For a greedy approximation solution, see the work by Ali and Dyo.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt;<br /> <br /> ;The airline crew scheduling problem<br /> <br /> An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights.<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows it to be solved with off-the-shelf optimizers. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. 
''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from &lt;nowiki&gt;http://www.jstor.org/stable/25767684&lt;/nowiki&gt;<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). &lt;nowiki&gt;https://doi.org/10.1007/BF01942293&lt;/nowiki&gt;<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 &lt;nowiki&gt;https://doi.org/10.1006/jagm.1997.0887.&lt;/nowiki&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=1806 Set covering problem 2020-11-25T06:28:17Z <p>Khaledfahat: </p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. 
Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding a collection of subsets whose union covers the universal set at minimum total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; The problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is defined as follows. We define U = {&lt;math&gt; u_1&lt;/math&gt;, …, &lt;math&gt; u_m&lt;/math&gt;} as the universe of elements and S = {&lt;math&gt; s_1&lt;/math&gt;, …, &lt;math&gt; s_n&lt;/math&gt;} as a collection of subsets such that &lt;math&gt; s_i&lt;/math&gt; &lt;math&gt;\subseteq&lt;/math&gt; U and the union of the &lt;math&gt; s_i&lt;/math&gt; covers all elements in U (i.e. &lt;math&gt;\cup&lt;/math&gt;&lt;math&gt; s_i&lt;/math&gt; = U ). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of U and has an associated cost &lt;math&gt; c_i&lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i&lt;/math&gt; &gt; 0). 
The objective is to find the minimum cost sub-collection of sets X &lt;math&gt;\subseteq&lt;/math&gt; S that covers all the elements in the universe U.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j= 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when all costs are equal, this is the same as minimizing the number of selected subsets. The first constraint requires that every element j in the universe U be covered by at least one selected subset, where &lt;math&gt; a_{ij}&lt;/math&gt;=1 if j &lt;math&gt;\in&lt;/math&gt; &lt;math&gt; s_i&lt;/math&gt; and 0 otherwise. The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time algorithm is known for solving it exactly, and solving the integer linear program to optimality can become impractical for large instances. Approximation algorithms are therefore used to obtain optimal or near-optimal solutions to large-scale problems in polynomial time. 
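
To make the formulation concrete, the following minimal Python sketch evaluates the objective and the covering constraints for every binary vector y on a tiny instance. The instance (`elements`, `subsets`, `costs`) and the function name `solve_by_enumeration` are hypothetical, chosen only for illustration; this is not the article's method and not how a real ILP solver works.

```python
from itertools import product

# Hypothetical toy instance (not from the article): 4 elements, 3 subsets.
elements = [1, 2, 3, 4]                     # u_1 .. u_m
subsets = [{1, 2}, {2, 3, 4}, {1, 3, 4}]    # s_1 .. s_n
costs = [3, 2, 4]                           # c_1 .. c_n

# a[i][j] = 1 if element j belongs to subset s_i, and 0 otherwise.
a = [[int(u in s) for u in elements] for s in subsets]


def solve_by_enumeration(a, costs):
    """Return (minimum cost, y) over all binary y that cover every element."""
    n, m = len(a), len(a[0])
    feasible = [
        (sum(c * y_i for c, y_i in zip(costs, y)), y)       # objective value
        for y in product((0, 1), repeat=n)                   # y_i in {0, 1}
        if all(sum(a[i][j] * y[i] for i in range(n)) >= 1    # covering constraint
               for j in range(m))
    ]
    return min(feasible) if feasible else None


print(solve_by_enumeration(a, costs))  # (5, (1, 1, 0))
```

The loop visits all 2^n vectors, so the running time doubles with every added subset. That growth is the practical reason for turning to approximation algorithms on large instances.
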
One widely used approximation algorithm for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find optimal or near-optimal solutions for large-scale set covering instances in polynomial time. The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements and removes those elements from the uncovered set, until all elements are covered. Let '''Y''' denote the full set of elements to be covered, let '''T''' be the set that contains the covered elements, and let '''U''' be the set that contains the elements of '''Y''' that are still uncovered. At the beginning of the iteration, '''T''' is empty and all elements of '''Y''' belong to '''U'''. We iteratively select the set in '''S''' that covers the largest number of elements in '''U''', move those elements from '''U''' to '''T''', and repeat until '''U''' is empty. An example of this algorithm is presented below. <br /> <br /> '''Greedy algorithm for minimum set cover example: '''<br /> <br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example in which we place cameras at different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. 
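
Before working through this instance by hand, the greedy procedure from the previous section can be tried on it. The following is a minimal Python sketch, not code from the original article: the function name `greedy_set_cover` and the tie-breaking rule (lowest location number first) are assumptions, and `CAMERA_COVERAGE` simply restates the coverage data of Table 1 below.

```python
# Camera location -> stadium areas it covers (data of Table 1).
CAMERA_COVERAGE = {
    1: {1, 3, 4, 6, 7},
    2: {4, 7, 8, 12},
    3: {2, 5, 9, 11, 13},
    4: {1, 2, 14, 15},
    5: {3, 6, 10, 12, 14},
    6: {8, 14, 15},
    7: {1, 2, 6, 11},
    8: {1, 2, 4, 6, 8, 12},
}


def greedy_set_cover(universe, subsets):
    """Repeatedly pick the subset that covers the most uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # max() keeps the first maximum, so ties go to the smallest key.
        best = max(
            sorted(subsets),
            key=lambda j: len(subsets[j].intersection(uncovered)),
        )
        if not subsets[best].intersection(uncovered):
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered.difference_update(subsets[best])
    return chosen


print(greedy_set_cover(range(1, 16), CAMERA_COVERAGE))  # [8, 3, 5, 1, 4]
```

On this data the sketch selects locations 8, 3, 5, 1, and 4, that is, five cameras, whereas the analysis in this section arrives at a cover with four. The greedy method is therefore a heuristic on this instance and not an exact method.
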
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, and &lt;math&gt;z_j = 0&lt;/math&gt; otherwise. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint requiring that the area be covered by at least one installed camera, that is, &lt;math&gt;\sum_{j=1}^8 y_{ij} z_j \geqslant 1&lt;/math&gt;. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints, one for each stadium area, are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Constraints 2 and 11 are then no longer needed, as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 4 and 7, constraint 4 is satisfied whenever constraint 7 is, because its left-hand side is that of constraint 7 plus the nonnegative term &lt;math&gt;z_8&lt;/math&gt;; constraint 4 is therefore no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15, which share no variables. 
Each of them requires at least one of its variables to equal 1, so at least two more cameras are needed. Choosing exactly one variable per constraint gives four candidate combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. Combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; need no additional camera (&lt;math&gt;z_7 = z_8 = 0&lt;/math&gt;), whereas &lt;math&gt;A&lt;/math&gt; and &lt;math&gt;D&lt;/math&gt; each need one more, so &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; set the fewest &lt;math&gt;z_j&lt;/math&gt; to 1 and are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> &lt;math&gt;\text{Solution 1: } z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;\text{Solution 2: } z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is particularly evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> :This set covering problem is concerned with maximizing the coverage provided by a public facility placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town and its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city.<br /> ; The optimal route selection problem<br /> :Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Because physical sensors are scarce, a detector cannot be placed on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''U''''', where each route is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> ;The airline crew scheduling problem<br /> :An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights.<br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows it to be solved with off-the-shelf optimizers. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #Rubin, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from &lt;nowiki&gt;http://www.jstor.org/stable/25767684&lt;/nowiki&gt;<br /> #Church, R., &amp; ReVelle, C. (1974). The maximal covering location problem. ''Papers of the Regional Science Association,'' ''32'', 101–118. &lt;nowiki&gt;https://doi.org/10.1007/BF01942293&lt;/nowiki&gt;<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. 
doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 &lt;nowiki&gt;https://doi.org/10.1006/jagm.1997.0887.&lt;/nowiki&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=1805 Set covering problem 2020-11-25T06:27:41Z <p>Khaledfahat: </p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding the optimal number of subsets whose union covers the universal set while minimizing the total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; The problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is define as follows. We define U = {&lt;math&gt; u_i&lt;/math&gt;, ….. 
&lt;math&gt; u_m&lt;/math&gt;} as the universe of elements and S = {&lt;math&gt; s_1&lt;/math&gt;, …, &lt;math&gt; s_n&lt;/math&gt;} as a collection of subsets such that &lt;math&gt; s_i&lt;/math&gt; &lt;math&gt;\subseteq&lt;/math&gt; U and the union of the &lt;math&gt; s_i&lt;/math&gt; covers all elements in U (i.e. &lt;math&gt;\cup&lt;/math&gt;&lt;math&gt; s_i&lt;/math&gt; = U ). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of U and has an associated cost &lt;math&gt; c_i&lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i&lt;/math&gt; &gt; 0). The objective is to find the minimum cost sub-collection of sets X &lt;math&gt;\subseteq&lt;/math&gt; S that covers all the elements in the universe U.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j= 1,\ldots,m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1,\ldots,n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when all costs are equal, this is the same as minimizing the number of selected subsets. The first constraint requires that every element j in the universe U be covered by at least one selected subset, where &lt;math&gt; a_{ij}&lt;/math&gt;=1 if j &lt;math&gt;\in&lt;/math&gt; &lt;math&gt; s_i&lt;/math&gt; and 0 otherwise. 
The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time algorithm is known for solving it exactly, and solving the integer linear program to optimality can become impractical for large instances. Approximation algorithms are therefore used to obtain optimal or near-optimal solutions to large-scale problems in polynomial time. One widely used approximation algorithm for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find optimal or near-optimal solutions for large-scale set covering instances in polynomial time. The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements and removes those elements from the uncovered set, until all elements are covered. Let '''Y''' denote the full set of elements to be covered, let '''T''' be the set that contains the covered elements, and let '''U''' be the set that contains the elements of '''Y''' that are still uncovered. At the beginning of the iteration, '''T''' is empty and all elements of '''Y''' belong to '''U'''. We iteratively select the set in '''S''' that covers the largest number of elements in '''U''', move those elements from '''U''' to '''T''', and repeat until '''U''' is empty. An example of this algorithm is presented below. <br /> <br /> '''Greedy algorithm for minimum set cover: '''<br /> <br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example in which we place cameras at different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in Table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered by camera locations {1,4,7,8}, stadium area 2 can be covered by camera locations {3,4,7,8}, while the rest are given in Table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If stadium area &lt;math&gt;i&lt;/math&gt; can be covered by camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not, &lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
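<br /> <br />
As a small illustration of this encoding, the following Python sketch builds the binary indicators directly from the data in Table 1 and then recovers the area-to-locations view of Table 2 from them. The names <code>COVERAGE</code>, <code>y</code>, and <code>covering_locations</code> are illustrative assumptions, not notation from the article.<br />

```python
# Table 1 of the example: camera location -> stadium areas it covers.
COVERAGE = {
    1: {1, 3, 4, 6, 7},
    2: {4, 7, 8, 12},
    3: {2, 5, 9, 11, 13},
    4: {1, 2, 14, 15},
    5: {3, 6, 10, 12, 14},
    6: {8, 14, 15},
    7: {1, 2, 6, 11},
    8: {1, 2, 4, 6, 8, 12},
}
AREAS = range(1, 16)      # stadium areas 1..15
LOCATIONS = range(1, 9)   # camera locations 1..8

# y[i][j] = 1 if stadium area i is covered by camera location j, else 0.
y = {i: {j: int(i in COVERAGE[j]) for j in LOCATIONS} for i in AREAS}

# Reading off the 1-entries of row i recovers the "area -> locations" view.
covering_locations = {i: [j for j in LOCATIONS if y[i][j]] for i in AREAS}

print(y[1][1], y[1][2])          # entries for area 1 and locations 1, 2
print(covering_locations[14])    # locations that cover area 14
```

Storing the indicators as a nested dictionary keeps the 1-based area and location numbers of the example; a NumPy array would work equally well for larger instances.<br /> <br />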
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if a camera is installed at location &lt;math&gt;j&lt;/math&gt;, and &lt;math&gt;z_j = 0&lt;/math&gt; otherwise. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint requiring that the area be covered by at least one installed camera. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints, one per stadium area, are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraints 5, 9 and 13 each require &lt;math&gt;z_3 = 1&lt;/math&gt;, since &lt;math&gt;z_3&lt;/math&gt; is binary. Constraints 2 and 11 are then no longer needed, as they are automatically satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12 and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 7 and 4, constraint 4 is satisfied whenever constraint 7 is satisfied, because its left-hand side is that of constraint 7 plus the nonnegative term &lt;math&gt;z_8&lt;/math&gt;, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Satisfying each of them with exactly one variable set to 1 gives four combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed for constraints 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. Counting &lt;math&gt;z_3&lt;/math&gt; and &lt;math&gt;z_5&lt;/math&gt;, combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; set four variables to 1, whereas &lt;math&gt;A&lt;/math&gt; and &lt;math&gt;D&lt;/math&gt; set five, so &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> :This variation is concerned with maximizing the coverage of some public facility placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its own town and all adjacent towns, we can formulate a set covering problem where each subset consists of a town together with its adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city.<br /> ; The optimal route selection problem<br /> :Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Because physical sensors are scarce, a detector cannot be placed on every road. The task of finding the maximum coverage using a limited number of detectors can be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''U''''', where each route is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> ;The airline crew scheduling problem<br /> :An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Treating the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights.<br /> <br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows off-the-shelf optimizers to be used. Moreover, heuristic techniques and greedy algorithms can be used to obtain good solutions to large-scale set covering problems in industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #Rubin, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from &lt;nowiki&gt;http://www.jstor.org/stable/25767684&lt;/nowiki&gt;<br /> #Church, R., &amp; ReVelle, C. (1974). The maximal covering location problem. ''Papers of the Regional Science Association,'' ''32'', 101–118. &lt;nowiki&gt;https://doi.org/10.1007/BF01942293&lt;/nowiki&gt;<br /> #Aktaş, E., Özaydın, Ö., Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. 
doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 &lt;nowiki&gt;https://doi.org/10.1006/jagm.1997.0887.&lt;/nowiki&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=1804 Set covering problem 2020-11-25T06:26:58Z <p>Khaledfahat: </p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding the optimal number of subsets whose union covers the universal set while minimizing the total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; The problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is define as follows. We define U = {&lt;math&gt; u_i&lt;/math&gt;, ….. 
&lt;math&gt; u_m&lt;/math&gt;} as the universe of elements and S = {&lt;math&gt; s_1&lt;/math&gt;, …, &lt;math&gt; s_n&lt;/math&gt;} as a collection of subsets such that &lt;math&gt; s_i&lt;/math&gt; &lt;math&gt;\subseteq&lt;/math&gt; U and the union of all &lt;math&gt; s_i&lt;/math&gt; covers every element in U (i.e. &lt;math&gt;\cup&lt;/math&gt;&lt;math&gt; s_i&lt;/math&gt; = U). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of U and has an associated cost &lt;math&gt; c_i&lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i&lt;/math&gt; &gt; 0). The objective is to find the minimum-cost sub-collection of sets X &lt;math&gt;\subseteq&lt;/math&gt; S that covers all the elements in the universe U.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j = 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1, \ldots, n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when all costs are equal, this is the same as minimizing the number of selected subsets. The first constraint requires every element j in the universe U to be covered by at least one selected subset, where &lt;math&gt; a_{ij}&lt;/math&gt; = 1 if j &lt;math&gt;\in&lt;/math&gt; &lt;math&gt; s_i&lt;/math&gt; and 0 otherwise. 
The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, so no polynomial-time algorithm is known for solving the integer linear program exactly, and large instances can be computationally expensive. For this reason, approximation algorithms are used to obtain near-optimal (and sometimes optimal) solutions to large-scale problems in polynomial time. One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.&lt;sup&gt;7&lt;/sup&gt;<br /> <br /> == Greedy approximation algorithm ==<br /> Greedy algorithms can be used to find near-optimal solutions for large-scale set covering instances in polynomial time. The greedy heuristic applies an iterative process that, at each stage, selects the subset covering the largest number of still-uncovered elements and removes those elements from the uncovered set, until all elements are covered. Let '''Y''' denote the full set of elements, let '''T''' be the set that contains the covered elements, and let '''U''' be the set that contains the elements of '''Y''' that are still uncovered. At the beginning of the iteration, '''T''' is empty and all elements of '''Y''' belong to '''U'''. We iteratively select the set in '''S''' that covers the largest number of elements in '''U''', move those elements from '''U''' to '''T''', and stop when '''U''' is empty. An example of this algorithm is presented below. <br /> <br /> '''Greedy algorithm for minimum set cover '''<br /> <br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we assign cameras at different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. 
We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7}, camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that the cameras can cover are given in table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. For instance, stadium area 1 can be covered with camera location {1,4,7,8}, stadium area 2 can be covered with camera location {3,4,7,8}, while the rest are given in table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Location<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If the stadium area &lt;math&gt;i&lt;/math&gt; can be covered with camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not,&lt;math&gt;y_{ij} = 0&lt;/math&gt;. 
For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. 
&lt;math&gt;z_j = 1&lt;/math&gt; if camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. For each stadium, there’s a constraint that the stadium area &lt;math&gt;i&lt;/math&gt; has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. All the 15 constraints that corresponds to 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraint {5,9,13}, we can obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. Thus we no longer need constraint 2 and 11 as they are satisfied when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. 
z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Now if we take a look at constraint &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt; so &lt;math&gt;z_5&lt;/math&gt; shall equal to 1. As &lt;math&gt;z_5 = 1&lt;/math&gt;, constraint {3,6,12,14} are satisfied no matter what other &lt;math&gt;z&lt;/math&gt; values are taken. If we also take a look at constraint 7 and 4, if constraint 4 will be satisfied as long as constraint 7 is satisfied since &lt;math&gt;z&lt;/math&gt; values are nonnegative, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraint 7 and 15. 
We can have at least 4 combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt;values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then discuss each combination and determine &lt;math&gt;z_7, z_8&lt;/math&gt;values for constraint 1 and 8 to be satisfied.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 already satisfied, we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraint 1 already satisfied, constraint 8 already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraint 1 already satisfied, constraint 8 already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the least amount of &lt;math&gt;z_j&lt;/math&gt; to be 1, they are the optimal solutions.<br /> <br /> To conclude, our two solutions are:<br /> <br /> &lt;math&gt;Solution 1: z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;Solution 2: z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The applications of the set covering problem span a wide range of applications, but its usefulness is evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> :This set covering problems is concerned with maximizing the coverage of some public facility placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a set of adjacent towns. The problem is then solved to minimize the required number of fire stations to serve the whole city.<br /> ; The optimal route selection problem<br /> :Consider the problem of selecting the optimal bus routes to place pothole detectors. Due to scarcity of the physical sensors, the problem does not allow for placing a detector at every road. The task of finding the maximum coverage using a limited number of detectors could be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, giving a collection of routes '''''U''','' where each route itself is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize number of covered segments.<br /> ;The airline crew scheduling problem<br /> :An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights.<br /> :<br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the least number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #RUBIN, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from &lt;nowiki&gt;http://www.jstor.org/stable/25767684&lt;/nowiki&gt;<br /> #Church, R., ReVelle, C. The maximal covering location problem. ''Papers of the Regional Science Association'' 32, 101–118 (1974). &lt;nowiki&gt;https://doi.org/10.1007/BF01942293&lt;/nowiki&gt;<br /> # Aktaş, E., Özaydın, Ö, Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. 
doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> # Petr Slavı́k. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'', Volume 25, Issue 2, Pages 237-254 &lt;nowiki&gt;https://doi.org/10.1006/jagm.1997.0887.&lt;/nowiki&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=1799 Set covering problem 2020-11-25T03:04:04Z <p>Khaledfahat: </p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a set '''''S''''' of subsets of the set '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. This combinatorial problem then concerns finding the optimal number of subsets whose union covers the universal set while minimizing the total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; The problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is define as follows. We define U = {&lt;math&gt; u_i&lt;/math&gt;, ….. 
&lt;math&gt; u_m&lt;/math&gt;} as the universe of elements and S = {&lt;math&gt; s_1&lt;/math&gt;, …, &lt;math&gt; s_n&lt;/math&gt;} as a collection of subsets such that &lt;math&gt; s_i&lt;/math&gt; &lt;math&gt;\subseteq&lt;/math&gt; U and the union of the &lt;math&gt; s_i&lt;/math&gt; covers all elements in U (i.e. &lt;math&gt;\cup&lt;/math&gt;&lt;math&gt; s_i&lt;/math&gt; = U). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of U and has an associated cost &lt;math&gt; c_i&lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i&lt;/math&gt; &gt; 0). The objective is to find the minimum cost sub-collection of sets X &lt;math&gt;\subseteq&lt;/math&gt; S that covers all the elements in the universe U.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j= 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;math&gt; y_i \in \{0, 1\}, \forall i = 1, \ldots, n&lt;/math&gt; <br /> <br /> The objective function &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; minimizes the total cost of the selected subsets &lt;math&gt; s_i&lt;/math&gt;; when all costs are equal, this is equivalent to minimizing the number of selected subsets. The first constraint requires that every element j in the universe U be covered by at least one selected subset, where &lt;math&gt; a_{ij}&lt;/math&gt;=1 if j &lt;math&gt;\in&lt;/math&gt; &lt;math&gt; s_i&lt;/math&gt; and 0 otherwise. 
The second constraint &lt;math&gt; y_i \in \{0, 1\} &lt;/math&gt; indicates that the decision variables are binary.<br /> <br /> The set covering problem is NP-hard, which means that no polynomial-time algorithm is known for solving the integer linear program to optimality, and exact methods can become impractical for large-scale applications. Approximation algorithms are therefore used to obtain near-optimal solutions to large-scale problems in polynomial time. One of the most widely used approximation algorithms for set covering is the classical greedy algorithm.<br /> <br /> == Greedy approximation algorithm ==<br /> <br /> <br /> ==Numerical Example==<br /> Let’s consider a simple example where we assign cameras to different locations. Each location covers some areas of a stadium, and our goal is to install the fewest cameras such that all stadium areas are covered. We have stadium areas from 1 to 15, and possible camera locations from 1 to 8.<br /> <br /> We are given that camera location 1 covers stadium areas {1,3,4,6,7} and camera location 2 covers stadium areas {4,7,8,12}, while the remaining camera locations and the stadium areas that they can cover are given in Table 1 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 1<br /> !Camera Location<br /> !Stadium Areas<br /> |-<br /> |1<br /> |1,3,4,6,7<br /> |-<br /> |2<br /> |4,7,8,12<br /> |-<br /> |3<br /> |2,5,9,11,13<br /> |-<br /> |4<br /> |1,2,14,15<br /> |-<br /> |5<br /> |3,6,10,12,14<br /> |-<br /> |6<br /> |8,14,15<br /> |-<br /> |7<br /> |1,2,6,11<br /> |-<br /> |8<br /> |1,2,4,6,8,12<br /> |}<br /> To reformulate the problem, we can list the stadium areas and the camera locations that cover each stadium area. 
For instance, stadium area 1 can be covered by camera locations {1,4,7,8} and stadium area 2 can be covered by camera locations {3,4,7,8}, while the rest are given in Table 2 below:<br /> {| class=&quot;wikitable&quot;<br /> |+Table 2<br /> !Stadium Area<br /> !Camera Locations<br /> |-<br /> |1<br /> |1,4,7,8<br /> |-<br /> |2<br /> |3,4,7,8<br /> |-<br /> |3<br /> |1,5<br /> |-<br /> |4<br /> |1,2,8<br /> |-<br /> |5<br /> |3<br /> |-<br /> |6<br /> |1,5,7,8<br /> |-<br /> |7<br /> |1,2<br /> |-<br /> |8<br /> |2,6,8<br /> |-<br /> |9<br /> |3<br /> |-<br /> |10<br /> |5<br /> |-<br /> |11<br /> |3,7<br /> |-<br /> |12<br /> |2,5,8<br /> |-<br /> |13<br /> |3<br /> |-<br /> |14<br /> |4,5,6<br /> |-<br /> |15<br /> |4,6<br /> |}<br /> We can then represent the above information using binary values. If stadium area &lt;math&gt;i&lt;/math&gt; can be covered by camera location &lt;math&gt;j&lt;/math&gt;, then we have &lt;math&gt;y_{ij} = 1&lt;/math&gt;. If not, &lt;math&gt;y_{ij} = 0&lt;/math&gt;. For instance, stadium area 1 is covered by camera location 1, so &lt;math&gt;y_{11} = 1&lt;/math&gt;, while stadium area 1 is not covered by camera location 2, so &lt;math&gt;y_{12} = 0&lt;/math&gt;. 
The binary variables &lt;math&gt;y_{ij}&lt;/math&gt; values are given in the table below: <br /> {| class=&quot;wikitable&quot;<br /> |+Table 3<br /> !<br /> !1<br /> !2<br /> !3<br /> !4<br /> !5<br /> !6<br /> !7<br /> !8<br /> |-<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |2<br /> |<br /> |<br /> |1<br /> |1<br /> |<br /> |<br /> |1<br /> |1<br /> |-<br /> |3<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |4<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |-<br /> |5<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |6<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |1<br /> |-<br /> |7<br /> |1<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |8<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |-<br /> |9<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |10<br /> |<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |-<br /> |11<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |-<br /> |12<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |1<br /> |-<br /> |13<br /> |<br /> |<br /> |1<br /> |<br /> |<br /> |<br /> |<br /> |<br /> |-<br /> |14<br /> |<br /> |<br /> |<br /> |1<br /> |1<br /> |1<br /> |<br /> |<br /> |-<br /> |15<br /> |<br /> |<br /> |<br /> |1<br /> |<br /> |1<br /> |<br /> |<br /> |}<br /> <br /> <br /> <br /> We introduce another binary variable &lt;math&gt;z_j&lt;/math&gt; to indicate if a camera is installed at location &lt;math&gt;j&lt;/math&gt;. &lt;math&gt;z_j = 1&lt;/math&gt; if camera is installed at location &lt;math&gt;j&lt;/math&gt;, while &lt;math&gt;z_j = 0&lt;/math&gt; if not. <br /> <br /> Our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt;. 
For each stadium area &lt;math&gt;i&lt;/math&gt;, there is a constraint that the area has to be covered by at least one camera location. For instance, for stadium area 1, we have &lt;math&gt;z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;, while for stadium area 2, we have &lt;math&gt;z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;. The 15 constraints that correspond to the 15 stadium areas are listed below:<br /> <br /> <br /> ''Constraints 1 to 15:''<br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;2. z_3 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;5. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;9. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;11. z_3 + z_7 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;13. z_3 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> From constraints 5, 9, and 13, we obtain &lt;math&gt;z_3 = 1&lt;/math&gt;. These three constraints are then satisfied, and we no longer need constraints 2 and 11, as they also hold when &lt;math&gt;z_3 = 1&lt;/math&gt;. With &lt;math&gt;z_3 = 1&lt;/math&gt; determined, the constraints left are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;3. z_1 + z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;4. z_1 + z_2 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;6. 
z_1 + z_5 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;10. z_5 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;12. z_2 + z_5 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;14. z_4 + z_5 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> Constraint 10, &lt;math&gt;z_5 \geqslant 1&lt;/math&gt;, forces &lt;math&gt;z_5 = 1&lt;/math&gt;. With &lt;math&gt;z_5 = 1&lt;/math&gt;, constraints 3, 6, 12, and 14 are satisfied no matter what values the other &lt;math&gt;z&lt;/math&gt; variables take. Comparing constraints 4 and 7, constraint 4 is satisfied whenever constraint 7 is, because its left-hand side is that of constraint 7 plus the nonnegative term &lt;math&gt;z_8&lt;/math&gt;, so constraint 4 is no longer needed. The remaining constraints are:<br /> <br /> <br /> &lt;math&gt;1. z_1 + z_4 + z_7 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;7. z_1 + z_2 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;8. z_2 + z_6 + z_8 \geqslant 1&lt;/math&gt;<br /> <br /> &lt;math&gt;15. z_4 + z_6 \geqslant 1&lt;/math&gt;<br /> <br /> <br /> The next step is to focus on constraints 7 and 15. 
Each of these two constraints can be satisfied by setting exactly one of its two variables to 1, which gives 4 combinations of &lt;math&gt;z_1, z_2, z_4, z_6&lt;/math&gt; values.<br /> <br /> <br /> &lt;math&gt;A: z_1 = 1, z_2 = 0, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;B: z_1 = 1, z_2 = 0, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> &lt;math&gt;C: z_1 = 0, z_2 = 1, z_4 = 1, z_6 = 0&lt;/math&gt;<br /> <br /> &lt;math&gt;D: z_1 = 0, z_2 = 1, z_4 = 0, z_6 = 1&lt;/math&gt;<br /> <br /> <br /> We can then examine each combination and determine the &lt;math&gt;z_7, z_8&lt;/math&gt; values needed to satisfy constraints 1 and 8.<br /> <br /> <br /> Combination &lt;math&gt;A&lt;/math&gt;: constraint 1 is already satisfied; we need &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 8.<br /> <br /> Combination &lt;math&gt;B&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;C&lt;/math&gt;: constraints 1 and 8 are both already satisfied.<br /> <br /> Combination &lt;math&gt;D&lt;/math&gt;: we need &lt;math&gt;z_7 = 1&lt;/math&gt; or &lt;math&gt;z_8 = 1&lt;/math&gt; to satisfy constraint 1, while constraint 8 is already satisfied.<br /> <br /> Our final step is to compare the four combinations. Since our objective is to minimize &lt;math&gt;\sum_{j=1}^8 z_j&lt;/math&gt; and combinations &lt;math&gt;B&lt;/math&gt; and &lt;math&gt;C&lt;/math&gt; require the fewest &lt;math&gt;z_j&lt;/math&gt; to equal 1, they are the optimal solutions. No solution can use fewer cameras, because &lt;math&gt;z_3 = z_5 = 1&lt;/math&gt; is forced and constraints 7 and 15, which share no variables, each require one additional camera.<br /> <br /> To conclude, our two solutions are:<br /> <br /> Solution 1: &lt;math&gt;z_1 = 1, z_3 = 1, z_5 = 1, z_6 = 1&lt;/math&gt;<br /> <br /> Solution 2: &lt;math&gt;z_2 = 1, z_3 = 1, z_4 = 1, z_5 = 1&lt;/math&gt;<br /> <br /> The minimum number of cameras that we need to install is 4.<br /> <br /> == Applications==<br /> <br /> The set covering problem has a wide range of applications, and its usefulness is especially evident in industrial and governmental planning. 
Variations of the set covering problem that are of practical significance include the following.<br /> ;The optimal location problem<br /> :This set covering problem is concerned with maximizing the coverage of some public facility placed at different locations.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;3&lt;/span&gt; Consider the problem of placing fire stations to serve the towns of some city.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;4&lt;/span&gt; If each fire station can serve its town and all adjacent towns, we can formulate a set covering problem where each subset consists of a set of adjacent towns. The problem is then solved to minimize the number of fire stations required to serve the whole city.<br /> ;The optimal route selection problem<br /> :Consider the problem of selecting the optimal bus routes on which to place pothole detectors. Due to the scarcity of physical sensors, a detector cannot be placed on every road. The task of finding the maximum coverage using a limited number of detectors could be formulated as a set covering problem.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;5&lt;/span&gt; Specifically, we are given a collection of routes '''''U''''', where each route is divided into segments. The segments of two routes can overlap. 
The goal is then to select the routes that maximize the number of covered segments.<br /> ;The airline crew scheduling problem<br /> :An important application of large-scale set covering is the airline crew scheduling problem, which pertains to assigning airline staff to work shifts.&lt;sup&gt;2,6&lt;/sup&gt; Thinking of the collection of flights as a universal set to be covered, we can formulate a set covering problem to search for the optimal assignment of employees to flights.<br /> :<br /> ==Conclusion ==<br /> <br /> The set covering problem, which aims to find the smallest number of subsets that cover some universal set, is a widely known NP-hard combinatorial problem. Due to its applicability to route planning and airline crew scheduling, several methods have been proposed to solve it. Its straightforward formulation allows for the use of off-the-shelf optimizers to solve it. Moreover, heuristic techniques and greedy algorithms can be used to solve large-scale set covering problems for industrial applications. <br /> <br /> ==References==<br /> #Grossman, T., &amp; Wool, A. (1997). Computational experience with approximation algorithms for the set covering problem. ''European Journal of Operational Research,'' ''101''(1), 81-92. doi:10.1016/s0377-2217(96)00161-0<br /> #Rubin, J. (1973). A Technique for the Solution of Massive Set Covering Problems, with Application to Airline Crew Scheduling. ''Transportation Science,'' ''7''(1), 34-48. Retrieved November 23, 2020, from &lt;nowiki&gt;http://www.jstor.org/stable/25767684&lt;/nowiki&gt;<br /> #Church, R., &amp; ReVelle, C. (1974). The maximal covering location problem. ''Papers of the Regional Science Association,'' ''32'', 101–118. &lt;nowiki&gt;https://doi.org/10.1007/BF01942293&lt;/nowiki&gt;<br /> #Aktaş, E., Özaydın, Ö., Bozkaya, B., Ülengin, F., &amp; Önsel, Ş. (2013). Optimizing Fire Station Locations for the Istanbul Metropolitan Municipality. ''Interfaces,'' ''43''(3), 240-255. 
doi:10.1287/inte.1120.0671<br /> #Ali, J., &amp; Dyo, V. (2017). Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach. ''Proceedings of the 14th International Joint Conference on E-Business and Telecommunications''. doi:10.5220/0006469800830088<br /> #Marchiori, E., &amp; Steenbeek, A. (2000). An Evolutionary Algorithm for Large Scale Set Covering Problems with Application to Airline Crew Scheduling. ''EvoWorkshops''.<br /> #Slavík, P. (1997). A Tight Analysis of the Greedy Algorithm for Set Cover. ''Journal of Algorithms,'' ''25''(2), 237-254. &lt;nowiki&gt;https://doi.org/10.1006/jagm.1997.0887&lt;/nowiki&gt;</div> Khaledfahat https://optimization.cbe.cornell.edu/index.php?title=Set_covering_problem&diff=1798 Set covering problem 2020-11-25T02:51:23Z <p>Khaledfahat: /* Problem formulation */</p> <hr /> <div>Authors: Sherry Liang, Khalid Alanazi, Kumail Al Hamoud<br /> <br /> == Introduction ==<br /> <br /> The set covering problem is a significant NP-hard problem in combinatorial optimization. In the set covering problem, two sets are given: a set '''''U''''' of elements and a collection '''''S''''' of subsets of '''''U'''''. Each subset in '''''S''''' is associated with a predetermined cost, and the union of all the subsets covers the set '''''U'''''. The problem is to find a collection of subsets whose union covers the universal set at minimum total cost.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;1&lt;/span&gt; The problem has many applications in the airline industry, and it was explored on an industrial scale as early as the 1970s.&lt;span style=&quot;font-size: 8pt; position:relative; bottom: 0.3em;&quot;&gt;2&lt;/span&gt;<br /> <br /> == Problem formulation ==<br /> The mathematical formulation of the set covering problem is defined as follows. We define U = {&lt;math&gt; u_1&lt;/math&gt;, …, 
&lt;math&gt; u_m&lt;/math&gt;} as the universe of elements and S = {&lt;math&gt; s_1&lt;/math&gt;, …, &lt;math&gt; s_n&lt;/math&gt;} as a collection of subsets such that &lt;math&gt; s_i&lt;/math&gt; &lt;math&gt;\subseteq&lt;/math&gt; U and the union of the &lt;math&gt; s_i&lt;/math&gt; covers all elements in U (i.e. &lt;math&gt;\cup&lt;/math&gt;&lt;math&gt; s_i&lt;/math&gt; = U). Additionally, each set &lt;math&gt; s_i&lt;/math&gt; must cover at least one element of U and has an associated cost &lt;math&gt; c_i&lt;/math&gt; that is larger than zero (i.e. &lt;math&gt; c_i&lt;/math&gt; &gt; 0). The objective is to find the minimum cost sub-collection of sets X &lt;math&gt;\subseteq&lt;/math&gt; S that covers all the elements in the universe U.<br /> <br /> == Integer linear program formulation ==<br /> An integer linear program (ILP) model can be formulated for the minimum set covering problem as follows:<br /> <br /> '''Decision variables'''<br /> <br /> &lt;math&gt; y_i = \begin{cases} 1, &amp; \text{if subset }i\text{ is selected} \\ 0, &amp; \text{otherwise } \end{cases}&lt;/math&gt;<br /> <br /> '''Objective function'''<br /> <br /> minimize &lt;math&gt;\sum_{i=1}^n c_i y_i&lt;/math&gt; <br /> <br /> '''Constraints '''<br /> <br /> &lt;math&gt; \sum_{i=1}^n a_{ij} y_i \geq 1, \forall j= 1, \ldots, m&lt;/math&gt; <br /> <br /> &lt;ma