The GGL implementation of the algorithms intersection, union, difference and symmetric difference is based on set theory (see, among others, http://en.wikipedia.org/wiki/Set_(mathematics)). This theory is applied to spatial sets.
Intersection and union are so-called set-theoretic operations. Those operations work on sets, and geometries (especially polygons and multi-polygons) can be seen as sets of points.
The first section recaps a small, relevant part of the algebra of sets and describes the notation used on this page. The next section extends this algebra of sets to spatial sets (polygons).
(Source of most definitions: http://en.wikipedia.org/wiki/Algebra_of_sets)
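As a small illustration of the four operations on ordinary (non-spatial) sets, the sketch below applies the standard library algorithms to sets of integers. The helper names (`intersection_of`, `union_of`, `difference_of`, `symmetric_difference_of`) are made up for this example and are not part of GGL.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

typedef std::set<int> int_set;

// Elements that are in both a and b.
int_set intersection_of(int_set const& a, int_set const& b)
{
    int_set out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::inserter(out, out.begin()));
    return out;
}

// Elements that are in a, in b, or in both.
int_set union_of(int_set const& a, int_set const& b)
{
    int_set out;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::inserter(out, out.begin()));
    return out;
}

// Elements that are in a but not in b.
int_set difference_of(int_set const& a, int_set const& b)
{
    int_set out;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::inserter(out, out.begin()));
    return out;
}

// Elements that are in exactly one of a and b.
int_set symmetric_difference_of(int_set const& a, int_set const& b)
{
    int_set out;
    std::set_symmetric_difference(a.begin(), a.end(), b.begin(), b.end(),
                                  std::inserter(out, out.begin()));
    return out;
}
```

The symmetric difference equals the union of the two one-sided differences, which is the kind of identity the spatial operations rely on as well.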
There are several laws on sets and we will not discuss them all here. The most important for this page are:
Polygons are sets of points, and therefore polygons follow all definitions and laws for sets. For pragmatic reasons, and for implementation in computer programs, polygons have an orientation: clockwise or counterclockwise. Orientation is not part of most descriptions of set theory, but it is an important aspect of applying sets to polygon operations.
If a polygon is (arbitrarily) defined as having its vertices in clockwise direction:
This definition is important for the spatial interpretation of sets.
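To make orientation concrete, the following sketch computes the signed area of a ring with the shoelace formula. It assumes a coordinate system with the y axis pointing up, in which a clockwise ring has a negative signed area. The `point` struct and the function names are illustrative only and are not GGL types.

```cpp
#include <cstddef>
#include <vector>

struct point { double x, y; };

// Shoelace formula. With the y axis pointing up, the result is
// negative for a clockwise ring and positive for a counterclockwise ring.
double signed_area(std::vector<point> const& ring)
{
    double sum = 0.0;
    std::size_t const n = ring.size();
    for (std::size_t i = 0; i < n; ++i)
    {
        point const& p = ring[i];
        point const& q = ring[(i + 1) % n]; // wraps around to close the ring
        sum += p.x * q.y - q.x * p.y;
    }
    return sum / 2.0;
}

bool is_clockwise(std::vector<point> const& ring)
{
    return signed_area(ring) < 0.0;
}

// Same vertices in opposite order: the orientation flips.
std::vector<point> reversed(std::vector<point> const& ring)
{
    return std::vector<point>(ring.rbegin(), ring.rend());
}
```

Reversing the vertex order keeps the boundary but flips the sign of the area, which is why iterating a polygon backward can stand in for a polygon of the opposite orientation.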
The last observation is helpful in calculating the difference and the symmetric difference:
All spatial set-theoretic operations are implemented in shared code. There is hardly any difference in code between the calculation of an intersection and that of a union. The only difference is the turn taken at each intersection point: for an intersection the right turn should be taken, for a union the left turn.
This is implemented as such in GGL. The turn to be taken is a variable.
There is an alternative to calculate union as well:
There is an additional difference in the handling of disjoint holes (holes which are not intersected). This is also implemented in the same generic way (the implementation will still be tweaked a little to have it even more generic).
For a counterclockwise polygon, the behaviour is the reverse: for intersection take the left path, for union take the right path. This is trivial to implement, but it still has to be done (as orientation was introduced in a later phase in GGL).
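The rule "the turn to be taken is a variable" can be sketched as a small lookup from operation and orientation to a turn. The enumerations and the function `turn_to_take` below are hypothetical names chosen for this illustration; they do not reflect the actual GGL interface.

```cpp
enum operation_type { operation_intersection, operation_union };
enum orientation_type { clockwise, counterclockwise };
enum turn_type { turn_left, turn_right };

// Clockwise polygons: intersection takes the right turn, union the left turn.
// Counterclockwise polygons: the choice is mirrored.
inline turn_type turn_to_take(operation_type op, orientation_type orientation)
{
    turn_type const for_clockwise =
        (op == operation_intersection) ? turn_right : turn_left;

    if (orientation == clockwise)
    {
        return for_clockwise;
    }
    return (for_clockwise == turn_right) ? turn_left : turn_right;
}
```

Keeping the decision in one function means the traversal code itself does not need to know which operation it is computing.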
As explained above, for a difference, the vertices of the first polygon should be iterated by a forward iterator, but the vertices of the second polygon should be iterated by a reverse iterator (or vice versa). This (trivial) implementation still has to be done. It will not be implemented by creating a copy, reversing it, and presenting it as input to the set operation (as outlined above). That is easy and will work, but it has a performance penalty. Instead a reversible iterator will be used, which extends Boost.Range iterators, decorates a Boost.Range iterator at the same time, and can travel forward or backward.
It is currently named reversible_view and usage looks like:

    template <int Direction, typename Range>
    void walk(Range const & range)
    {
        typedef reversible_view<Range, Direction> view_type;
        view_type view(range);

        typename boost::range_const_iterator<view_type>::type it;
        for (it = boost::begin(view); it != boost::end(view); ++it)
        {
            // do something
        }
    }

    walk<1>(range); // forward
    walk<-1>(range); // backward
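For readers who want to experiment without GGL, here is a minimal standalone stand-in for the idea of a direction selected at compile time. `simple_reversible_view` and `collect` are hypothetical names for this sketch; it relies only on the standard container interface (`begin`/`end`, `rbegin`/`rend`) and is not the GGL implementation, which builds on Boost.Range.

```cpp
#include <vector>

// Primary template left undefined: only Direction == 1 and -1 are supported.
template <typename Range, int Direction>
struct simple_reversible_view;

// Direction == 1: iterate forward, without copying the range.
template <typename Range>
struct simple_reversible_view<Range, 1>
{
    typedef typename Range::const_iterator const_iterator;

    explicit simple_reversible_view(Range const& range) : m_range(range) {}

    const_iterator begin() const { return m_range.begin(); }
    const_iterator end() const { return m_range.end(); }

    Range const& m_range;
};

// Direction == -1: iterate backward, again without copying.
template <typename Range>
struct simple_reversible_view<Range, -1>
{
    typedef typename Range::const_reverse_iterator const_iterator;

    explicit simple_reversible_view(Range const& range) : m_range(range) {}

    const_iterator begin() const { return m_range.rbegin(); }
    const_iterator end() const { return m_range.rend(); }

    Range const& m_range;
};

// Walks the view and copies the visited elements, to make the order visible.
template <int Direction, typename Range>
std::vector<typename Range::value_type> collect(Range const& range)
{
    simple_reversible_view<Range, Direction> view(range);
    return std::vector<typename Range::value_type>(view.begin(), view.end());
}
```

Because the direction is a template parameter, the choice between forward and backward iteration is resolved at compile time and the input range is never copied or reversed in memory.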
The algorithm is a modern variant of the graph traversal algorithm, after Weiler-Atherton (http://en.wikipedia.org/wiki/Weiler-Atherton_clipping_algorithm).
It has the following characteristics (some of these points are deviations from Weiler-Atherton):
The actual implementation consists of the following phases.
1. The input geometries are indexed (if necessary). Currently monotonic sections are used for the index; they are created by the algorithm sectionalize. Sections are created on the fly, so no preparation is required beforehand (though preparation would improve performance, and it is possible that there will be an alternative variant where prepared sections or other indexes are part of the input). For box-polygon this phase is not necessary and is skipped. Sectionalizing is done in linear time.
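The idea of monotonic sections can be sketched in a simplified, one-dimensional form: consecutive segments are grouped as long as their direction in x does not change and a maximum segment count is not exceeded. The `section` struct, the function `make_sections` and the restriction to the x direction are assumptions of this sketch; the actual sectionalize algorithm is more elaborate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct point { double x, y; };

struct section
{
    std::size_t begin_index; // index of the first point of the section
    std::size_t end_index;   // index of the last point of the section (inclusive)
    int dir_x;               // -1, 0 or 1: direction of all segments in x
};

inline int sign(double v) { return v > 0 ? 1 : (v < 0 ? -1 : 0); }

// Single pass over the points, so the running time is linear.
std::vector<section> make_sections(std::vector<point> const& pts,
                                   std::size_t max_segments)
{
    assert(max_segments > 0);

    std::vector<section> result;
    for (std::size_t i = 0; i + 1 < pts.size(); ++i)
    {
        int const dir = sign(pts[i + 1].x - pts[i].x);

        // Extend the current section only if the direction is unchanged
        // and the section is not yet full.
        bool const extend = !result.empty()
            && result.back().dir_x == dir
            && result.back().end_index - result.back().begin_index < max_segments;

        if (extend)
        {
            result.back().end_index = i + 1;
        }
        else
        {
            section s = { i, i + 1, dir };
            result.push_back(s);
        }
    }
    return result;
}
```

Within one such section the x coordinates only increase or only decrease, which is the property that allows whole sections to be skipped or scanned cheaply later on.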
2. Intersection points are calculated. Segments of polygon A are compared with segments of polygon B. Segment intersection is only done for segments in overlapping sections. Intersection points are not inserted into the original polygon or into a copy, so a linked list is not necessary. This phase is called get_intersection_points. This function can currently be used for one or two input geometries, for self-intersection or for intersection. Because found intersections are provided with intersection information, including a reference to their source, it is possible (but currently not implemented) to have more than two geometry inputs.
The complexity of getting the intersections is (much) less than quadratic (n*m) because of the monotonic sections. The exact complexity depends on the number of sections and on the shape of the input polygons. In the worst case there are only two monotonic sections per polygon and both overlap; the complexity is then quadratic. However, the sectionalize algorithm has a maximum number of segments per section, so for large polygons there are always more monotonic sections, and in those cases they do not overlap by the definition of "monotonic". For boxes, the complexity is linear.
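The effect of sections on the amount of work can be illustrated by counting how many segment pairs remain once pairs of sections with disjoint bounding boxes are discarded. The types `box` and `section_summary` and the function `candidate_pairs` are hypothetical, and the sketch only models the bounding-box filter, not the further savings that monotonicity allows inside overlapping sections.

```cpp
#include <cstddef>
#include <vector>

struct box { double min_x, min_y, max_x, max_y; };

// Two axis-aligned boxes overlap if they overlap in both x and y.
inline bool boxes_overlap(box const& a, box const& b)
{
    return a.min_x <= b.max_x && b.min_x <= a.max_x
        && a.min_y <= b.max_y && b.min_y <= a.max_y;
}

struct section_summary
{
    box bounds;                // bounding box of the section
    std::size_t segment_count; // number of segments in the section
};

// Number of segment pairs that still need a segment-segment test
// after section pairs with disjoint boxes have been skipped.
std::size_t candidate_pairs(std::vector<section_summary> const& a,
                            std::vector<section_summary> const& b)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        for (std::size_t j = 0; j < b.size(); ++j)
        {
            if (boxes_overlap(a[i].bounds, b[j].bounds))
            {
                count += a[i].segment_count * b[j].segment_count;
            }
        }
    }
    return count;
}
```

For two polygons of 20 segments each, split into two sections of 10, a single overlapping section pair leaves 100 candidate pairs instead of the 400 of a full n*m comparison.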
To give another idea of how sections and indexes work, for a test processing 3918 polygons (but not processing those whose envelopes do not overlap):
In "normal" cases 84% of the time is spent on finding intersection points. These percentages refer to the performance test described elsewhere.
One piece of information stored per intersection point is whether it is trivial. An intersection point is trivial if it is not located at a segment end point.
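The notion of a trivial intersection can be sketched with a plain parametric segment-segment test. The types and the function `intersect` are illustrative only; parallel and collinear segments are deliberately not handled here, and exact floating-point comparisons are used for brevity, which a robust implementation would not do.

```cpp
struct point { double x, y; };
struct segment { point a, b; };

struct intersection_result
{
    bool found;     // true if the segments intersect in a single point
    point location; // the intersection point, valid only if found
    bool trivial;   // true if the point is at no segment end point
};

// Solves s.a + u * (s.b - s.a) == t.a + v * (t.b - t.a) for u and v.
inline intersection_result intersect(segment const& s, segment const& t)
{
    intersection_result result = { false, { 0.0, 0.0 }, false };

    double const rx = s.b.x - s.a.x, ry = s.b.y - s.a.y;
    double const qx = t.b.x - t.a.x, qy = t.b.y - t.a.y;
    double const denom = rx * qy - ry * qx;
    if (denom == 0.0)
    {
        return result; // parallel or collinear: outside the scope of this sketch
    }

    double const dx = t.a.x - s.a.x, dy = t.a.y - s.a.y;
    double const u = (dx * qy - dy * qx) / denom; // position along s, 0..1
    double const v = (dx * ry - dy * rx) / denom; // position along t, 0..1
    if (u < 0.0 || u > 1.0 || v < 0.0 || v > 1.0)
    {
        return result; // the supporting lines cross outside the segments
    }

    result.found = true;
    result.location.x = s.a.x + u * rx;
    result.location.y = s.a.y + u * ry;
    // Trivial: strictly inside both segments, so not at any end point.
    result.trivial = u > 0.0 && u < 1.0 && v > 0.0 && v < 1.0;
    return result;
}
```

Two segments crossing in their interiors give a trivial intersection, while two segments that share an end point give a non-trivial one that needs the later merge and adapt phases.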
3. The found intersection points are merged (merge_intersection_points), and some intersections can be deleted (e.g. in case of collinearities). This merge process consists of sorting the intersection points in X (major) and Y (minor) direction, and merging intersections with a common location. Intersections with common locations occur as soon as segments are collinear or meet at their end points. This phase is skipped if all intersection points are trivial.
About 6% is spent on merging.
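The sort-and-merge step can be sketched as follows: sort on x first and y second, then collect points with an identical location into one group. The `intersection_point` struct and `group_by_location` are hypothetical names, and the exact coordinate comparison is a simplification for this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct intersection_point
{
    double x, y;
    int source_id; // hypothetical reference to the originating segment pair
};

// Order on x (major) and y (minor).
inline bool less_xy(intersection_point const& a, intersection_point const& b)
{
    if (a.x != b.x) { return a.x < b.x; }
    return a.y < b.y;
}

inline bool same_location(intersection_point const& a, intersection_point const& b)
{
    return a.x == b.x && a.y == b.y;
}

// Sorts a copy of the points and groups those sharing a location.
std::vector<std::vector<intersection_point> >
group_by_location(std::vector<intersection_point> points)
{
    std::sort(points.begin(), points.end(), less_xy);

    std::vector<std::vector<intersection_point> > groups;
    for (std::size_t i = 0; i < points.size(); ++i)
    {
        // After sorting, equal locations are adjacent.
        if (groups.empty() || !same_location(groups.back().front(), points[i]))
        {
            groups.push_back(std::vector<intersection_point>());
        }
        groups.back().push_back(points[i]);
    }
    return groups;
}
```

Each resulting group with more than one member marks a location where several segment pairs meet, which is where merging or deleting intersections may be needed.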
4. Some turns need to be adapted. If segments intersect in their interiors, this is never necessary. However, if segments intersect at their end points, it is sometimes necessary to change "side" information to "turn" information. This phase is called adapt_turns.
The image below gives one example of when adapting turns is necessary. There is side information: both segments have sides left and right, and there is also collinearity. However, for an intersection no turn should be taken at all, so no right turn. For a union, both polygons have to be travelled. In this case the side information is adapted to turn information, and both turns will be left. This phase is skipped if all intersection points are trivial.
5. The merged intersection points are enriched (enrich_intersection_points) with information about, among other things, the next intersection point (travel information).
About 3% is spent on enrichment.
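One possible reading of "travel information" is sketched below for a single ring: intersection points are ordered by the segment they lie on and by their distance along that segment, and each point then records its cyclic successor. The struct `ring_intersection`, its fields and the function `assign_next` are assumptions made for this illustration and do not describe the actual data that enrich_intersection_points stores.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct ring_intersection
{
    std::size_t segment_index; // segment of the ring on which the point lies
    double distance;           // distance from the start of that segment
    std::size_t next;          // index of the next intersection along the ring
};

// Order along the ring: first by segment, then by distance on the segment.
inline bool before_on_ring(ring_intersection const& a, ring_intersection const& b)
{
    if (a.segment_index != b.segment_index)
    {
        return a.segment_index < b.segment_index;
    }
    return a.distance < b.distance;
}

// Sorts the points along the ring and links each one to its successor.
// The last point links back to the first, because a ring is closed.
void assign_next(std::vector<ring_intersection>& ips)
{
    std::sort(ips.begin(), ips.end(), before_on_ring);
    for (std::size_t i = 0; i < ips.size(); ++i)
    {
        ips[i].next = (i + 1) % ips.size();
    }
}
```

With such links in place, a traversal can jump from one intersection point to the next without searching.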
6. Polygons are traversed (traverse) using the intersection points, enriched with travel information. The input polygons are traversed and at every intersection point a direction is taken: left for union, right for intersection (for counterclockwise polygons this is the other way round). In some cases separate rings are produced. In some cases new holes are formed.
About 6% is spent on traversal.
7. The created rings are assembled (assemble) into polygon(s) with exterior rings and interior rings. Even if no intersection points are found, this process can be important to find containment and coverage.
Timing of this phase is not yet available, as the comparison program works on rings.
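Finding containment between rings can be sketched with a standard even-odd ray casting test. This is a generic technique, not necessarily what assemble uses. The function names are hypothetical, points exactly on the boundary are not handled reliably, and `ring_in_ring` assumes that the two rings do not cross each other, so that testing one vertex is enough.

```cpp
#include <cstddef>
#include <vector>

struct point { double x, y; };

// Even-odd rule: count how often a horizontal ray from p to the right
// crosses the ring boundary; an odd count means p is inside.
bool point_in_ring(point const& p, std::vector<point> const& ring)
{
    bool inside = false;
    std::size_t const n = ring.size();
    for (std::size_t i = 0, j = n - 1; i < n; j = i++)
    {
        point const& a = ring[i];
        point const& b = ring[j];

        // The edge a-b crosses the horizontal line through p
        // if its end points lie on different sides of that line.
        bool const crosses = (a.y > p.y) != (b.y > p.y);
        if (crosses)
        {
            double const x_at_y = a.x + (p.y - a.y) * (b.x - a.x) / (b.y - a.y);
            if (p.x < x_at_y)
            {
                inside = !inside;
            }
        }
    }
    return inside;
}

// Assumes the rings do not cross: then one vertex decides containment.
bool ring_in_ring(std::vector<point> const& inner, std::vector<point> const& outer)
{
    return !inner.empty() && point_in_ring(inner.front(), outer);
}
```

A ring found inside another ring is a candidate interior ring (hole) of the polygon whose exterior ring contains it.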
April 2, 2011
Copyright © 2007-2011 Barend Gehrels, Amsterdam, the Netherlands
Copyright © 2008-2011 Bruno Lalande, Paris, France
Copyright © 2009-2010 Mateusz Loskot, London, UK