CONTRIBUTED RESEARCH ARTICLES
osmar: OpenStreetMap and R

OpenStreetMap provides freely accessible and editable geographic data. The osmar package smoothly integrates the OpenStreetMap project into the R ecosystem. The osmar package provides infrastructure to access OpenStreetMap data from different sources, to enable working with the OSM data in the familiar R idiom, and to convert the data into objects based on classes provided by existing R packages. This paper explains the package's concept and shows how to use it. As an application we present a simple navigation device.


Introduction
"OpenStreetMap creates and provides free geographic data such as street maps to anyone who wants them" announces the OpenStreetMap wiki main page (OSM Foundation, 2011), and we think R users want free geographic data. Therefore, the add-on package osmar (Schlesinger and Eugster, 2012) provides extensible infrastructure for integrating the OpenStreetMap project (OSM) into the R project.
The aim of the OpenStreetMap project is to create a free editable map of the world. The project maintains a database of geographic elements (nodes, ways and relations) and features (such as streets, buildings and landmarks). These data are collected and provided by volunteers using GPS devices, aerial imagery, and local knowledge. The most prominent application is the rendering of the geographic data and features into raster images (for example, for the OSM map on the website). However, the project also provides an application programming interface (API) for fetching raw data from and saving to the OSM database.
The OpenStreetMap project provides data in the OSM XML format, which consists of three basic elements:

Node: The basic element. It consists of the attributes latitude and longitude.
Way: An ordered interconnection of nodes to describe a linear feature (e.g., a street). Areas (e.g., buildings) are represented as closed ways.
Relation: A grouping of elements (nodes, ways, and relations) which are somehow geographically related (e.g., bus and cycle routes).

Each element has further attributes like the element ID (unique within the corresponding element group) and timestamp. Furthermore, each element may have an arbitrary number of tags (key-value pairs) which describe the element. Ways and relations, in addition, have references to their members' IDs.
In order to access the data, OSM provides an application programming interface (API) over the hypertext transfer protocol (HTTP) for getting raw data from and putting it to the OSM database. The main API (currently in version 0.6) has calls to get elements (and all other elements referenced by them) by, among other things, their ID and a bounding box. However, the requests are limited (e.g., currently only an area of 0.25 square degrees can be queried). An (unlimited) alternative is provided by planet files. These are compressed OSM XML files containing different OSM database extracts (e.g., the entire world or an individual country or area). Planet files can be downloaded from the OSM wiki and processed using the command-line Java tool Osmosis (Henderson, 2011).
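The size limit on bounding-box requests can be checked before a query is sent. The following is a minimal sketch in base R; the function names bbox_area() and fits_api_limit() are hypothetical helpers introduced here for illustration and are not part of osmar or the OSM API. The limit of 0.25 square degrees is the value stated above.

```r
# Maximum bounding-box area for one API request, in square degrees
# (value taken from the text; the helper names are hypothetical).
max_area <- 0.25

# Area of a bounding box in square degrees.
bbox_area <- function(left, bottom, right, top) {
  stopifnot(right >= left, top >= bottom)
  (right - left) * (top - bottom)
}

# TRUE if the box is small enough for a single API request.
fits_api_limit <- function(left, bottom, right, top, limit = max_area) {
  bbox_area(left, bottom, right, top) <= limit
}

fits_api_limit(11.5, 48.1, 11.6, 48.2)  # a box of 0.1 x 0.1 degrees
fits_api_limit(11.0, 48.0, 12.0, 49.0)  # a box of one full square degree
```

Boxes that fail the check would have to be split into smaller requests or served from a planet file instead.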
For a complete introduction into the OSM project, the OSM API, and the OSM XML file format we refer to the project's wiki available at http://wiki.openstreetmap.org/.
The aim of the package osmar is to provide extensible infrastructure to get and to represent the above described OSM data within R, to enable working with the OSM data in the familiar R idiom, and to convert the OSM data to objects based on classes provided by other packages. Figure 1 visualizes the package's concept. This idea differs from the one followed by existing packages such as OpenStreetMap (Fellows, 2012), RgoogleMaps (Loecher, 2012), and ggmap (Kahle and Wickham, 2012). Whereas these packages provide access to the already rendered data (i.e., raster images), osmar enables the usage of the raw OSM data.
In the following section we present the package's implementation and usage. Note that we try to increase readability by only showing the relevant arguments of plot statements. We refer to the "navigator" demo in the osmar package for the actual plot statements.
The R Journal Vol. 5/1, June ISSN 2073-4859

Getting the data
We begin by defining the data source.

> library("osmar")
The following object(s) are masked from 'package:utils':
    find
> src <- osmsource_api()

We can retrieve elements by using the IDs of the elements. The IDs in these examples have been extracted by hand from the OpenStreetMap website (via its export functionality). For example, one node:

> get_osm(node(1896143 ), source = src)
osmar object
1 nodes, 0 ways, 0 relations

Or, one way with the way-related data only or with the data for all referenced elements (nodes and relations):

The first statement retrieves the way only (because the default value of the full argument is FALSE).
The second statement additionally retrieves all nodes that are members of the way (i.e., all nodes that define the way).
The second possibility to retrieve elements is to specify a bounding box by defining the left, bottom, right, and top coordinates (corner_bbox()), or the center point and width and height in meters (center_bbox()):

> bb <- center_bbox(174.76778, -36.85 56, 7 , 7 )
> ua <- get_osm(bb, source = src)
> ua
osmar object
2427 nodes, 428 ways, 7 relations

The use of planet files via Osmosis as a source works analogously. The source is specified by the function osmsource_osmosis(). The function's two arguments are the path to the planet file (file) and the path to the 'osmosis' tool (osmosis = "osmosis"). Note that by default it is assumed that the Osmosis executable is in your 'PATH' environment variable. The navigator example demonstrates the usage of planet files.

Working with the data
The retrieved osmar object is a list with the three elements nodes, ways, and relations. Each element again is a list containing data.frames for the attributes (the attrs list element) and meta-data (the tags list element) of the OSM elements. Ways and relations additionally have a data.frame containing their members (the refs list element).
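To make this layout concrete, the following sketch builds a small mock of the described list structure by hand in base R. It is not created by osmar and carries none of its classes; the values are made up, and the column names id, lat, lon, ref, and type are assumptions chosen for illustration (only the tag columns k and v appear in the examples of this paper).

```r
# Hand-built mock of the described layout; NOT an actual osmar object.
# All values are made up; column names other than k and v are assumptions.
nodes <- list(
  attrs = data.frame(id  = c(1, 2, 3),
                     lat = c(-36.850, -36.851, -36.852),
                     lon = c(174.767, 174.768, 174.769)),
  tags  = data.frame(id = 2, k = "highway", v = "traffic_signals")
)
ways <- list(
  attrs = data.frame(id = 10),
  tags  = data.frame(id = 10, k = "highway", v = "residential"),
  refs  = data.frame(id = c(10, 10, 10), ref = c(1, 2, 3))
)
relations <- list(
  attrs = data.frame(id = numeric(0)),
  tags  = data.frame(id = numeric(0), k = character(0), v = character(0)),
  refs  = data.frame(id = numeric(0), type = character(0), ref = numeric(0))
)
mock <- list(nodes = nodes, ways = ways, relations = relations)

names(mock)       # the three element groups
names(mock$ways)  # attributes, tags, and members of the ways

# Coordinates of the nodes that make up way 10, in member order
member_ids <- mock$ways$refs$ref[mock$ways$refs$id == 10]
mock$nodes$attrs[match(member_ids, mock$nodes$attrs$id), c("lat", "lon")]
```

Because every part is an ordinary data.frame, the usual R tools for indexing and merging apply directly.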
Summarize. For each element nodes, ways, and relations of an osmar object an individual summary method is available. The overall summary method for an osmar object shows the three individual summaries all at once. In the case of the summary for nodes, the number of elements and tags, as well as the available variables for each corresponding data.frame are shown. The bounding box of the coordinates and a contingency table of the top ten most frequently available key-value pairs are printed.

> summary(ua
The summaries for the other two elements ways and relations are similar. Note that these methods in fact return the contingency table of all available key-value pairs and, in addition, further information which is not printed but may be useful for a descriptive analysis. We refer to the help pages (e.g., ?summary.nodes) for a detailed description of the return values.
Find. In order to find specific elements within the osmar object, the find() function allows the object to be queried with a given condition. As the basis of osmar objects are data.frames, the condition is essentially a logical expression indicating the rows to keep. In addition, one has to specify to which element (nodes, node(); ways, way(); or relations, relation()) and to which data (attributes, attrs(); meta-data, tags(); or members, refs()) the condition applies.
We use the functions find_down() and find_up() to find all related elements for given element IDs. The OSM basic elements define a hierarchy, node ← way ← relation, and these two functions enable us to find the related elements up and down the hierarchy. For example, find_up() on a node returns all related nodes, ways, and relations; find_down() on a node returns only the node itself. On the other hand, find_up() on a relation returns only the relation itself; find_down() on a relation returns the relation and all related ways and nodes.

> hw_ids <- find(ua, way(tags(k == "highway")))
> hw_ids <- find_down(ua, way(hw_ids))

In this example we find all ways that have a tag with the k attribute set to "highway". These contain hardened and recognised land routes between two places used by motorised vehicles, pedestrians, cyclists, etc. The return value of find_down() and find_up() is a list containing the element IDs:

> str(hw_ids)
List of 3
 $ node_ids    : num [1:1321] 25769641 ...
 $ way_ids     : num [1:253] 43 96 8 ...
 $ relation_ids: NULL

Subset. The return value of the find functions can then be used to create subsets of osmar objects. The subset() method for osmar objects takes element IDs and returns the corresponding data as osmar objects. For example, the two subsets based on the traffic signal and bus stop element IDs are:

> ts <- subset(ua, node_ids = ts_ids)
> ts
osmar object
25 nodes, 0 ways, 0 relations
> bs <- subset(ua, node_ids = bs_ids)
> bs
osmar object
15 nodes, 0 ways, 0 relations

The subset based on the highway element IDs is:

> hw <- subset(ua, ids = hw_ids)
> hw
osmar object
1321 nodes, 253 ways, 0 relations

Note that the subsetting of osmar objects is divided into the two steps "finding" and "subsetting" to have more flexibility in handling the related elements (here using find_down() and find_up(), but more sophisticated routines can be imagined).
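The two steps "finding" and "subsetting" can be mimicked on plain data.frames. The sketch below uses base R only and made-up data. It is not how osmar implements find(), find_down(), or subset(); it only illustrates the idea that a logical condition on the tag rows yields way IDs, that the member references lead down to node IDs, and that the IDs then select the rows to keep. The column names id, ref, lat, and lon are assumptions.

```r
# Made-up way tags, way members, and node attributes
# (column names id, ref, lat, lon are assumptions for illustration).
way_tags <- data.frame(id = c(10, 10, 11, 12),
                       k  = c("highway", "name", "building", "highway"),
                       v  = c("residential", "Main St", "yes", "footway"),
                       stringsAsFactors = FALSE)
way_refs <- data.frame(id  = c(10, 10, 11, 11, 11, 12, 12),
                       ref = c(1, 2, 3, 4, 3, 2, 5))
node_attrs <- data.frame(id  = 1:5,
                         lat = c(48.10, 48.11, 48.12, 48.13, 48.14),
                         lon = c(11.50, 11.51, 11.52, 11.53, 11.54))

# Step 1a, "find": a logical condition on the tag rows selects way IDs.
hw_way_ids <- unique(way_tags$id[way_tags$k == "highway"])

# Step 1b, "find down": collect the nodes referenced by those ways.
hw_node_ids <- unique(way_refs$ref[way_refs$id %in% hw_way_ids])
hw_ids <- list(node_ids = hw_node_ids, way_ids = hw_way_ids,
               relation_ids = NULL)

# Step 2, "subset": keep only the rows that belong to the found IDs.
hw <- list(
  nodes = node_attrs[node_attrs$id %in% hw_ids$node_ids, ],
  ways  = way_tags[way_tags$id %in% hw_ids$way_ids, ]
)
str(hw_ids)
```

Keeping the ID list as a separate intermediate result is what allows other strategies for collecting related elements to be swapped in before the subset is taken.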
Plot. The visualization of osmar objects is possible if nodes are available in the object (as only these OSM elements contain latitude and longitude information). The functions plot_nodes() and plot_ways() plot the available nodes as dots and ways as lines, respectively. The plot() method combines these two function calls. Note that this is a plot of the raw data and no projection is done (see the following section for a projected visualization).

Converting the data
In order to use the complete power of R on OpenStreetMap data, it is essential to be able to convert osmar objects into commonly used objects based on classes provided by other packages. Currently, osmar provides two converters, into the sp (Bivand et al., 2008) and the igraph (Csardi, 2011) packages. In this section we show the conversion to sp objects; the navigation device example shows the conversion to igraph objects.
The sp package provides special data structures and utility functions for spatial data. Spatial data classes are available for points, lines, polygons, and others (see Bivand et al., 2008). The osmar package provides the as_sp() function,

> args(as_sp)
function(obj, what = c("points", "lines", "polygons"),
    crs = osm_crs(), simplify = TRUE)
NULL

to convert an osmar object into the corresponding classes for points, lines, and polygons in the sp package (given the required data are available). Note that the appropriate WGS84 coordinate reference system (CRS) for OpenStreetMap data is used (cf. osm_crs()).
In order to finalize the University of Auckland example we create a bus route map and visualize the available bus routes belonging to the bus stops. To do so, we find all bus relations available in the object, retrieve the corresponding data from the OSM API, and convert the data into lines (note that this computation takes some time):

> bus_ids <- find(ua, relation(tags(v == "bus")))
> bus <- lapply(bus_ids,
+   function(i) {
+     raw <- get_osm(relation(i), full = TRUE)
+     as_sp(raw, "lines")
+   })

We use the argument full = TRUE to retrieve the relation itself and all related members. In detail, this means we retrieve all nodes, ways, and relations that are members of the specified relation; and, recursively, all nodes that are members of the retrieved ways.

R as navigator
We always wanted to know how a navigation device works. Now with osmar, R provides the necessary components, and this serves as a nice example of how to use osmar. The general idea is to (1) get the data, (2) extract highways, (3) create a graph of all highway nodes with the distance between the highway nodes as edge weights, (4) compute the shortest path on the graph, and (5) trace the path on the highways.
Get the data. We use a planet file from Munich as the data source and use Osmosis (Henderson, 2011) to process the data. Note that 'osmosis' has to be in your 'PATH' environment variable.
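Steps (3) and (4) of this idea can be sketched without any add-on package. The example below is a base R stand-in on made-up coordinates, not the implementation used in the paper, which relies on igraph for the graph and the shortest path. The helper names haversine() and shortest_path() are hypothetical, the edge weights are great-circle distances on a spherical Earth, and the shortest path is computed with a plain version of Dijkstra's algorithm.

```r
# Great-circle distance in meters between lon/lat points (haversine formula,
# spherical Earth with radius r; an approximation chosen for this sketch).
haversine <- function(lon1, lat1, lon2, lat2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Made-up highway nodes and the segments connecting them.
nodes <- data.frame(id  = c("a", "b", "c", "d"),
                    lon = c(11.570, 11.575, 11.580, 11.575),
                    lat = c(48.137, 48.137, 48.137, 48.140),
                    stringsAsFactors = FALSE)
edges <- data.frame(from = c("a", "b", "a", "d"),
                    to   = c("b", "c", "d", "c"),
                    stringsAsFactors = FALSE)

# Step 3: edge weights are the distances between the connected nodes.
i <- match(edges$from, nodes$id)
j <- match(edges$to, nodes$id)
edges$weight <- haversine(nodes$lon[i], nodes$lat[i],
                          nodes$lon[j], nodes$lat[j])

# Step 4: Dijkstra's algorithm on the undirected weighted graph.
shortest_path <- function(edges, start, end) {
  ids  <- unique(c(edges$from, edges$to))
  dist <- setNames(rep(Inf, length(ids)), ids)
  prev <- setNames(rep(NA_character_, length(ids)), ids)
  done <- setNames(rep(FALSE, length(ids)), ids)
  dist[start] <- 0
  while (!done[end]) {
    open <- names(dist)[!done]
    u <- open[which.min(dist[open])]
    if (is.infinite(dist[u])) return(NULL)  # end is not reachable
    done[u] <- TRUE
    # Relax all edges touching u, in either direction
    hit <- edges$from == u | edges$to == u
    nb  <- ifelse(edges$from[hit] == u, edges$to[hit], edges$from[hit])
    w   <- edges$weight[hit]
    for (k in seq_along(nb)) {
      if (dist[u] + w[k] < dist[nb[k]]) {
        dist[nb[k]] <- dist[u] + w[k]
        prev[nb[k]] <- u
      }
    }
  }
  # Walk back from the end node to the start node
  path <- end
  while (!is.na(prev[path[1]])) path <- c(prev[[path[1]]], path)
  list(nodes = path, length = unname(dist[end]))
}

route <- shortest_path(edges, "a", "c")
route$nodes   # node IDs along the shortest route
route$length  # route length in meters
```

Step (5) then amounts to looking up the ways that contain consecutive pairs of the returned node IDs.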
Compute the route. In order to compute the shortest route between the defined starting and ending nodes, we convert the highway-osmar object into a graph. R provides a set of packages to work with graphs; we decided to use igraph:

> library("igraph")
> gr_muc <- as_igraph(hways_muc)
> summary(gr_muc)

C++ and Javascript toolkit and framework for working with OSM data (Topf, 2012). An R interface (potentially via Rcpp modules; Eddelbuettel and François, 2011) would provide a very fast and flexible way to work with large OSM data sets.

Figure 2: University of Auckland; roads are green lines; bus stops are blue and traffic signals are red points.

Figure 3: Number of modifications per building.

Figure 4: Bus route map of the University of Auckland; roads are green lines; bus stops and bus routes are blue points and lines.

Figure 5: Highway map of Munich center.