A PROTOTYPE FOR ADAPTIVE ASSOCIATION OF STREET NAMES WITH STREETS ON MAPS
G. Nagy¹, A. Samal², S. Seth², T. Fisher², E. Guthmann², K. Kalafala¹, L. Li², P. Sarkar¹, S. Sivasubramaniam¹, Y. Xu¹
¹ Rensselaer Polytechnic Institute, Troy, NY 12180, USA
² University of Nebraska, Lincoln, NE 68502, USA
email: nagy@ecse.rpi.edu

We present work in progress on the development of a partially-automated system for interpretation of map images. The principal aim of our project is to demonstrate that an adaptive system can decrease operator intervention with increasing conversion volume. The extracted information is evaluated against an established database, and a cost model is constructed to evaluate the entire conversion task.

1 Overview

We report the status and results from a recently initiated project on interpretation of map images. Our immediate goal is to identify street lines and street names in a scanned USGS topographic quadrangle and produce a database of associations between them.

The project involves several challenging tasks that are beyond the current state of the art in automated map conversion. We plan to automate gradually the many functions required, beginning with the most time-consuming aspects of manual digitization. Small sections of the map will be converted first, and an operator will correct any errors. These operator actions will provide feedback to automatically alter the parameters of the processing algorithms. The logged operator interventions will also assist us in the preparation of a cost model.

Our emphasis throughout is on techniques that facilitate the batch conversion of maps of the same type. While the conversion of the first map in the batch may require significant operator intervention, subsequent maps will benefit from adaptive techniques based on the graphic and typesetting consistency within the batch.

We build on considerable previous work.
Vectorization under document-specific constraints and “beautification” are discussed in [1, 2, 3, 4], street-line and street-name extraction in [5, 6], and the separation of text and line art in [7, 8]. Excellent overviews of model-based map interpretation can be found in two recent dissertations [9, 10]. We also believe that over the last two decades much cartographic and image processing expertise has been built into our principal software tool, ARC/INFO [11].

The data flow through the conversion process is illustrated in Figure 1. The registration of the map to geodetic coordinates is required only for evaluation against the reference data. Color separation is currently performed using the hue component of the HSV (Hue, Saturation, and Value) model.

The separation of the black sublayer into street line, street label, and “other” is a difficult task. Contextual cues must be used to differentiate small building icons, traffic islands, and individual characters. Current commercial OCR (Optical Character Recognition) systems cannot
recognize street names with sufficient accuracy. Street names are therefore recognized using domain-specific character prototypes obtained from a few operator-labeled words [12, 13].

[Figure 1: Schematic of data flow. Box labels in the diagram: Washington DC East Quad Map Image; Scanning; Registration; Color Separation (Black Layer, Red Layer; Other Layers for Water, Vegetation, Built-up Areas, Elevations, marked Future Work); Sublayer Separation (Line Graphics, Text Graphics); Line Vectorization; Line Processing (Street Lines, Other Lines); Text Processing (Street Names, Other Text); Street Name Association; Expert System; Street Index; Post-processing & Verification; Evaluation against USGS DLG Files and TIGER/Line Files; outputs Residual Errors, Conversion Cost, and Output DLG Files.]

The initial vectorization is performed by ArcScan. The resulting line segment configuration is evaluated according to map-specific constraints, and corrections are made on the basis of detailed analysis of the nearby foreground pixels, line segment connectivity and orientation, and intersection topology.

The street-line/street-name association is accomplished by determining the line segment nearest to and best aligned with every street-name bounding box, then tracing the chains of parallel street polylines to their logical termination.

Finally, the segment of map that has already been processed is submitted to the operator. Errors are corrected using a GUI (Graphical User Interface) to ARC/INFO. The system keeps a detailed log and associates every correction with the responsible algorithm.

Every one of the processing algorithms described above has several parameters that are initialized to default values according to experience from previous maps. The log of operator corrections provides the opportunity to change these values. Each change is checked against the already existing “correct” database. Some of the steps, such as vectorization and line processing, also have internal (automatic) consistency checks that are used to provide feedback to previous stages.
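The idea of checking each parameter change against the already verified data can be illustrated with a minimal sketch. It is only one plausible reading of the feedback loop, not the project's implementation: the function name `adapt_parameter`, the candidate list, and the `count_errors` callback (which would rerun the affected algorithm on verified map sections and count disagreements with the operator-approved database) are all hypothetical.

```python
from typing import Callable, Dict, Iterable

Params = Dict[str, float]


def adapt_parameter(
    params: Params,
    name: str,
    candidates: Iterable[float],
    count_errors: Callable[[Params], int],
) -> Params:
    """Return a copy of `params` with `name` set to the best candidate.

    `count_errors` is a hypothetical callback: it reruns the algorithm on
    the verified sections and counts disagreements with the "correct"
    database. The original value is kept unless a candidate does better.
    """
    best = dict(params)
    best_errors = count_errors(best)
    for value in candidates:
        trial = dict(params)
        trial[name] = value
        errors = count_errors(trial)
        # Accept a change only if it strictly reduces errors on verified data.
        if errors < best_errors:
            best, best_errors = trial, errors
    return best
```

In such a scheme the candidate values would come from the log of operator corrections, and a change that makes previously verified results worse is simply rejected.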
The verified database of street name associations is compared to TIGER (Topologically Integrated Geographic Encoding and Referencing) and DLG (Digital Line Graph) files for evaluation. The cost-benefit ratio of residual error to operator time and computing resources forms the basis of a model for predicting the cost of new conversion tasks.

2 Image Acquisition and Processing

The map image was provided to us on a CD-ROM by the US Government’s National Imagery and Mapping Agency (NIMA). The high-quality image was obtained by scanning a 7.5′ USGS topographic litho of Washington DC East at 1000 dpi and 24 bits/pixel. The uncompressed image occupies over one gigabyte of storage. After compression, this reduces to about 70 megabytes, still a formidable size for storage and processing. Under these circumstances, subsampling, say, to 250 dpi is a tempting preprocessing alternative.

Figure 2(a) shows a section of the map; identical pieces from this section are shown at 1000 and 250 dpi in Figures 2(b) through 2(e) for illustration. At the lower resolution, character shapes may get distorted (Figure 2(e)), extra gaps may be introduced in a street line where it is particularly thin (Figure 2(c)), and overlapping glyphs may be hard to distinguish from parallel street lines (Figures 2(b) and 2(d)). Intersections can be identified reliably provided fine detail is preserved at street-line intersections. The example in Figure 2(d) shows this not to be true for the 250 dpi image. There is also significant distortion in the regularity of halftone patterns at the lower resolution.

Figure 2: 1000 dpi and 250 dpi images compared.

In our view, the above disadvantages of image subsampling outweigh the potential problems of working with the high-resolution image. A suite of tools, based on the public-domain TIFF library [14], was developed to overcome the cumbersome task of accessing sections of large image files. The image is stored in a tiled format as a mosaic of square tiles.
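As a rough illustration of why a tiled layout makes region access cheap, the sketch below computes which tiles a rectangular pixel region touches, so that only those tiles need to be read. It is an assumption-laden example, not the tool suite itself: the function name `tiles_for_region` is hypothetical, a tile edge of 512 pixels is assumed, and no TIFF library calls are shown.

```python
TILE = 512  # assumed tile edge length in pixels


def tiles_for_region(x0, y0, width, height, tile=TILE):
    """Return (column, row) indices of all tiles overlapping a rectangle.

    The rectangle starts at pixel (x0, y0) and spans `width` x `height`
    pixels. Tiles are listed row by row.
    """
    if width <= 0 or height <= 0:
        return []
    first_col, last_col = x0 // tile, (x0 + width - 1) // tile
    first_row, last_row = y0 // tile, (y0 + height - 1) // tile
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]
```

A region that lies inside one tile costs a single tile read, while a region straddling a tile boundary costs only the few tiles it overlaps rather than the whole image.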
The tile size was kept small (1/2″ or 512-pixel square) so that each tile image could be viewed at full resolution on computer displays, without distortion or need to scroll. Library routines allow efficient extraction of any rectangular region for display or processing. Additional routines allow examination of arbitrary pixels within a tile in the black layer.

The relative ease with which arbitrary map sections can be accessed by the tool suite meant that, for each major image processing operation, we had the choice of doing it at the grid-cell or the whole-image level. In some cases, e.g. connected-component analysis, it is cumbersome
to integrate the data from cell-level analysis to the whole image. In other cases, particularly vectorization and text recognition, cell-level analysis is a viable option.

3 Sublayer Separation

Our strategy for sublayer separation is an initial classification based on connected-component (CC) analysis, and subsequent improvement by contextual analysis. The initial classification aims to distinguish between CCs for Lines, isolated character blocks (Text), and a catch-all class denoting everything else as Icons. Overlapping objects of different classes would obviously be misclassified in this CC-based labeling. The contextual analysis is expected to catch the bulk of errors of this type, and refine the Lines class further into streets, grid-lines, railroads, etc.

At the time of this writing, we have completed only the initial classification of the black layer. Using standard techniques, our CC algorithm found the four-connected components in two complete scans of the compressed black-layer image (about 20 megabytes) and labeled each foreground pixel with its CC number. The total CPU time on a Sparc-20 server was 13.5 minutes.

The decision rule for the initial classification was derived from the ground-truth data for a 3″ x 3″ section of the map. It is based on two attributes of the rectangular (horizontal) bounding box (BB) of each CC. The parameters in the decision rule were chosen to minimize the classification error in the ground truth.

Figure 3: Results of classification on a small section of the map. Counterclockwise from the top-left, the four image panels show, respectively, the black layer, Lines, Icons and Text.

The four panels in Figure 3 show the classification graphically for a part of the 3″ x 3″ tile. In the class identified as Text, there are both erroneous lines (small polygonal shapes at odd-shaped intersections, quite common in DC!) and erroneous icons.
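The initial classification can be sketched as follows, under several stated assumptions. The labeling here uses a simple breadth-first flood fill instead of the two-scan algorithm mentioned above, and the text does not say which two bounding-box attributes the decision rule uses, so the rule on box height and width and the thresholds `text_max` and `line_min` are purely hypothetical. Treating a box as noise when both sides are under 10 pixels is one reading of the 10x10 minimum size.

```python
from collections import deque


def component_boxes(image):
    """Return bounding boxes of the 4-connected foreground components.

    `image` is a list of rows of 0/1 values. Each box is
    (min_row, min_col, max_row, max_col).
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            r0 = r1 = r
            c0 = c1 = c
            while queue:
                y, x = queue.popleft()
                r0, r1 = min(r0, y), max(r1, y)
                c0, c1 = min(c0, x), max(c1, x)
                # Four-connectivity: only up, down, left, right neighbors.
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((r0, c0, r1, c1))
    return boxes


def classify_box(box, noise=10, text_max=60, line_min=200):
    """Toy decision rule on bounding-box height and width (hypothetical)."""
    height = box[2] - box[0] + 1
    width = box[3] - box[1] + 1
    if height < noise and width < noise:
        return "Noise"
    longest = max(height, width)
    if longest >= line_min:
        return "Lines"
    if longest <= text_max:
        return "Text"
    return "Icons"
```

In an adaptive setting, thresholds such as these would be the parameters tuned against ground truth rather than fixed constants.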
We expect many of the errors of omission and commission to be corrected by contextual analysis. Most of the touching glyphs in
the line layer can be identified as missing pieces of the strings in the Text class. ArcScan’s ability to bypass overlapping glyphs during vectorization of street lines can also be used to advantage. The parameters used in the classification rule and contextual analysis are subject to adaptation. A summary of the errors for the 3″ x 3″ tile appears in Table 1. The 24 BBs that were smaller than a minimum size (10x10 pixels) were considered to be noise and not classified.

Table 1: Initial Sublayer Classification

                      Assigned Label
Object Class     Text      Lines     Icons     Total
Text             89.7%     5.6%      4.7%      234
Lines            0.0%      97.4%     2.6%      115
Icons            1.6%      8.9%      89.5%     124

4 Vectorization

An example of the quality of the vectorization performed by ArcScan on the tile of Figure 3 is shown in Figure 4(left). This is better than the output of the several vectorization programs that we have developed ourselves. The objective of line processing is to improve it further, both for accurate location of the streets with respect to the DLG and to facilitate street-name association, as shown in Figure 4(right).

Figure 4: Left: ArcScan vectorization; Right: Improved vectorization.

The first step is to run several checks on the line segments, including histograms of line lengths and node degrees. On the street layer, good vectorization is characterized by few very short line segments and few nodes of degree other than 2. Next, the foreground pixels near the line segments are checked for line thickness, and each line segment whose median line thickness is different from that expected is flagged. Line segments are mated to parallel line segments within the appropriate range of street-width spacing, and unmated segments are flagged.

Our measures of vectorization quality easily distinguish between the two vectorizations in Figure 4. ArcScan requires setting 17 parameters for its vectorization routine.
In the case of suspect vectorization, as indicated by the quality measures, these parameters will be readjusted according to rules stored in an expert system (EXSYS) [15]. Polyline pairs of continuing streets (a chain of mated line segments) are denoted as curb pairs. The two lines in a curb pair are assigned opposite orientations as follows: imagine the two lines to represent opposite lanes of traffic in a two-lane street and assign each the direction according to the US traffic flow convention.
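Two of the quality checks described earlier in this section, the share of very short segments and the share of nodes whose degree is not 2, can be sketched in a few lines. This is an illustrative example only: the function name `quality_measures`, the segment representation as endpoint pairs, and the `min_length` threshold are assumptions, and the line-thickness and mating checks are not shown.

```python
import math
from collections import Counter


def quality_measures(segments, min_length=5.0):
    """Return (short_fraction, non_degree_two_fraction) for a vectorization.

    `segments` is a list of ((x1, y1), (x2, y2)) endpoint pairs. Endpoints
    shared by several segments must have identical coordinates to count as
    one node. `min_length` is a hypothetical threshold in pixels.
    """
    if not segments:
        return 0.0, 0.0
    degree = Counter()
    short = 0
    for a, b in segments:
        degree[a] += 1
        degree[b] += 1
        if math.dist(a, b) < min_length:
            short += 1
    non_two = sum(1 for d in degree.values() if d != 2)
    return short / len(segments), non_two / len(degree)
```

Low values of both fractions would indicate a clean street layer; high values would be one possible trigger for readjusting the vectorization parameters.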
The above steps allow robust separation of continuing street lines from other mistakenly vectorized objects, but are not sufficient to correct errors in the representation of street intersections. This is important, because the street-name association depends on accurate representation of the connectivity of the street network.

Our definition of an intersection is that it is the area bounded by the smallest closed curve with only curb pairs incident upon it (Figure 5(left)). Note that a minor change in the line configuration may be sufficient to change an intersection into two separate intersections (Figure 5(right)).

Figure 5: Left: An intersection; Right: Two separate intersections.

A canonical intersection is one where curb pairs are not connected within the intersection but adjacent line segments from two curb pairs are connected within the intersection and have opposite orientations (one directed toward and the other away from the intersection). This rule requires some modification for underpasses and overpasses (where a street appears to be closed by a crossing), and for cloverleaf highway intersections. The purpose of this analysis is to flag intersections that are incorrectly vectorized and to correct them by reference to the underlying foreground pixels.

5 Street-Name Association

The street-name association is carried out by (1) determining the street line closest to each street name, (2) finding the curb pair that contains this street line, and (3) tracing the chain of curb pairs in both directions as far as possible.

The map characteristics underlying street-name association are the following:
• The baseline of a street name is adjacent to and roughly parallel with the associated curb pair.
• Complete street names consist of one or two specific labels (VERMONT or 32ND) that precede a single generic label (AVE. or ST.), and may be repeated along the street.
• Every street name is associated with a chain of curb pairs, but not every chain is associated with a street name. The USGS map does not show the names of “unimportant” streets.
• A chain of curb pairs that does not undergo abrupt direction changes retains the same name for its entire length unless curb pairs on it are associated with different labels. Name changes for continuing streets take place only at major intersections.
• The streets and highways form a network; there are no unreachable streets.
• Optionally, a gazetteer of street names with approximate grid locations is available to verify the assignments.
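Step (1), choosing the street line that is both near a name and roughly parallel to it, can be sketched as a single cost that combines distance and misalignment. This is a minimal illustration under assumptions: the label is reduced to a center point and a baseline angle, and the function names and the `angle_weight` trade-off are hypothetical rather than taken from the system described here.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def angle_difference(theta1, theta2):
    """Smallest angle between two undirected lines, in [0, pi/2] radians."""
    d = abs(theta1 - theta2) % math.pi
    return min(d, math.pi - d)


def best_segment(label_center, label_angle, segments, angle_weight=50.0):
    """Return the index of the segment with the lowest combined cost.

    Cost = distance to the label center + angle_weight * misalignment.
    `angle_weight` (pixels per radian) is a hypothetical tuning parameter.
    """
    best_index, best_cost = None, math.inf
    for i, (a, b) in enumerate(segments):
        seg_angle = math.atan2(b[1] - a[1], b[0] - a[0])
        cost = (point_segment_distance(label_center, a, b)
                + angle_weight * angle_difference(label_angle, seg_angle))
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index
```

With a positive `angle_weight`, a horizontal label prefers a slightly more distant horizontal street over a nearer cross street; setting the weight to zero reduces the rule to plain nearest-segment matching. The weight is the kind of parameter that operator corrections could adjust.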
Several steps of the street association algorithm require acceptance or rejection of alternatives, or a choice between several candidates. For instance, at a Y-junction, the algorithm must make a choice between two branches, and tracing a street across a complex intersection like a traffic circle is even more difficult. The above assumptions are therefore codified into a parametrized EXSYS rule base. After the initial default assignment, the street-association parameters are modified, like other processing parameters, by feedback from the log of operator corrections.

6 Post-processing, Evaluation and Cost Model

The interface for interactive correction and verification consists of two separate parts: a system-controlled iterative session and an operator-controlled editing session. The system-controlled session is based on the streets recognized by the automatic map conversion process. A single street at a time is presented to the operator along with its associated name. The operator is then asked to either accept the data as shown or make corrections. During the operator-controlled session, the operator can select any existing street line and perform corrections. There is also an option to add lines that the automatic conversion may have missed.

Since one of the paramount objectives of this project is to demonstrate a decrease in the amount of operator interaction needed by providing automatic conversion of the map image, the system produces an operator activity log. This log contains the elapsed times for various operator activities and allows subsequent analysis of the actual amount of operator time required for conversion.

The evaluation phase compares the output of our conversion process against the “ground truth” of the map area. We use existing TIGER and DLG databases of the map area as good approximations of this ground truth.
The DLG dataset for our sample quadrangle was produced by the USGS by manual digitization of a newer version of the same map that we have, with a positional accuracy standard that 90% of points should be within 0.02 inches (40 feet on the ground) of their position on the original map. This street line data has a high degree of correspondence with our scanned map image (Figure 6(left)). The TIGER database, on the other hand, does not share the same origins as our map, and maintains a much lower standard of positional accuracy (Figure 6(right)). This database does, however, contain street names associated with the street vectors. We will use this to evaluate the street name recognition and association.

We will generate a variety of statistical measures to verify the completeness and accuracy of our output. The bulk of this processing will be done using ARC/INFO. We can calculate several measures relating to the street lines without manual intervention:
• Total length of the street network.
• Percentage of the street network within a specified distance of streets in the DLG data.
• Percentage of the DLG street network within a specified distance of streets in our output.
• Average distance from street intersections to the closest intersection in the DLG data.

It is not so straightforward to evaluate the results of street name recognition and association. Some information can be gained by pure comparison of the name output against TIGER data, but a better categorization of errors will require human interaction. For those measures, we will use a random sampling of street segments.
• Percentage of streets correctly associated with names.
• Percentage of streets correctly associated, but with only partially recognized names, and the degree to which the name is recognized.
• Percentage of named streets incorrectly associated or not associated with any name.

Figure 6: Left: DLG; Right: TIGER

The final summary of the evaluation phase will be a cost-benefit model. We will carry out evaluations after various amounts of operator correction of the data and various amounts of conversion volume. At one end of the scale is the effort required to manually enter the entire map, while the other end is fully automated conversion. We will model the relationship between operator intervention time and residual error, and investigate to what degree the various steps of the process benefit from increased operator intervention.

Acknowledgment

We gratefully acknowledge the support of the Intelligent Map Understanding Project of the National Imagery and Mapping Agency. This work is also being supported by the University of Nebraska-Lincoln, Center for Communication and Information Science. Part of this work was carried out in the New York State Center for Advanced Technology (CAT) in Automation, Robotics and Manufacturing at Rensselaer Polytechnic Institute. The CAT is partially funded by a block grant from the New York State Science and Technology Foundation. We thank Environmental Systems Research Institute (ESRI) for software support.

References

[1] T. Pavlidis, C. J. Van Wyk, An Automatic Beautifier for Drawings and Illustrations, Proceedings ACM SIGGRAPH 1985, pp. 225-234 (1985).
[2] O. Hori, A. Okazaki, High Quality Vectorization Based on a Generic Object Model, Structured Document Image Analysis, Springer-Verlag, pp. 325-329 (1992).
[3] M. Roosli, G. Monagan, Adding Geometric Constraints to the Vectorization of Line Drawings, Graphics Recognition, Springer, pp. 49-56 (1996).
[4] T. Kaneko, Line Structure Extraction From Engineering Drawings, Pattern Recognition, Vol. 25, pp. 963-973 (1992).
[5] G. Myers, P. Mulgaonkar, C. Chen, J. DeCurtins, E. Chen, Verification-Based Approach for Automated Text and Feature Extraction from Raster-Scanned Maps, Graphics Recognition, Springer, pp. 190-203 (1996).
[6] D.A. Varley, M. Visvalingam, Road Extraction and Topographic Data Validation using Area Topology, The Computer Journal, Vol. 37, No. 1, pp. 3-15 (1994).
[7] L.A. Fletcher, R. Kasturi, A Robust Algorithm for Text String Separation from Mixed Text/Graphics Images, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 10, No. 6, pp. 910-918 (1988).
[8] M.-K. Kim, M.-K. Park, O.-S. Kwon, Y.-B. Kwon, Automatic Region Labeling of the Layered Map, Graphics Recognition, Springer, pp. 179-189 (1996).
[9] R.D.T. Janssen, The application of model-based image processing to the interpretation of maps, Doctoral Dissertation, Technical University of Delft (1995).
[10] J. Den Hartog, A framework for knowledge-based map interpretation, Doctoral Dissertation, Technical University of Delft (1995).
[11] M. Zeiler, Inside ARC/INFO, Revised Edition, OnWord Press, Santa Fe, NM (1997).
[12] G. Nagy, Y. Xu, Priming the Recognizer, Procs. DAS-96, Malvern, PA, pp. 263-281 (1996).
[13] G. Nagy, Y. Xu, Automatic Prototype Extraction for OCR, accepted for presentation, ICDAR-97, Ulm (1997).
[14] S. Leffler, libtiff software distribution Version 3.4, Copyright (c) Sam Leffler and Silicon Graphics, Inc. (1996). Online: ftp://ftp.sgi.com/graphics/tiff/tiff-v3.4-tar.gz .
[15] Multilogic Inc., EXSYS, http://www.multilogic.com/ .