Writing Data🔗
pyosmium can also be used to write OSM files. It offers different writer classes which support creating referentially correct files.
Basic writer usage🔗
All writers are created by instantiating them with the name of the file to write to.
Example
writer = osmium.SimpleWriter('my_extra_data.osm.pbf')
The format of the output file is usually determined through the file prefix.
pyosmium will refuse to overwrite any existing files. Either make sure to
delete the files before instantiating a writer or use the parameter
overwrite=true.
All writers are context managers and to ensure that the file is properly closed in the end, the recommended way to use them is in a with statement:
Example
with osmium.SimpleWriter('my_extra_data.osm.pbf') as writer:
# do stuff here
When not used inside a with block, then don't forget to call the close()
function explicitly to close the writer.
Once a writer is instantiated, one of the add* functions can be used to
add an OSM object to the file. You can either use one of the
add_node/way/relation functions to force writing a specific type of
object or use the generic add function, which will try to determine the
object type. The OSM objects are directly written out in the order in which
they are given to the writer object. It is your responsibility as a user to
make sure that the order is correct with respect to the
conventions for object order.
Here is a complete example for a script that converts a file from OPL format to PBF format:
Example
with osmium.SimpleWriter('buildings.osm.pbf') as writer:
for o in osmium.FileProcessor('buildings.opl'):
writer.add(o)
Writing modified objects🔗
In the example above an OSM object from an input file was written out directly
without modifications. Writers can accept OSM nodes, ways and relations
that way. However, usually you want to modify some of the data in the object
before writing it out again. Use the replace() function to create a
mutable version of the object with the given parameters replaced.
Say you want to create a copy of a OSM file with all source tags removed:
Example
with osmium.SimpleWriter('buildings.osm.pbf') as writer:
for o in osmium.FileProcessor('buildings.opl'):
if 'source' in tags:
new_tags = dict(o.tags) # make a copy of the tags
del new_tags['source']
writer.add(o.replace(tags=new_tags))
else:
# No source tag. Write object out as-is.
writer.add(o)
Writing custom objects🔗
You can also write data that is not based on OSM input data at all. The write functions will accept any Python object that mimics the attributes of a node, way or relation.
Here is a simple example that writes out four random points:
Example
from random import uniform
class RandomNode:
def __init__(self, name, id):
self.id = id
self.location = (uniform(-180, 180), uniform(-90, 90))
self.tags = {'name': name}
with osmium.SimpleWriter('points.opl') as writer:
for i in range(4):
writer.add_node(RandomNode(f"Random {i}", i))
The following table gives an overview over the recognised attributes and acceptable types. If an attribute is missing, then pyosmium will choose a suitable default or leave the attribute out completely from the output if that is possible.
| attribute | types |
|---|---|
| id | int |
| version | int (positive non-zero value) |
| visible | bool |
| changeset | int (positive non-zero value) |
| timestamp | str or datetime (will be translated to UTC first) |
| uid | int |
| tags | osmium.osm.TagList, a dict-like object or a list of tuples, where each tuple contains a (key, value) string pair |
| user | str |
| location | (node only) osmium.osm.Location or a tuple of lon/lat coordinates |
| nodes | (way only) osmium.osm.NodeRefList or a list consisting of either osmium.osm.NodeRefs or simple node ids |
| members | (relation only) osmium.osm.RelationMemberList or a list consisting of either osmium.osm.RelationMembers or tuples of (type, id, role). The member type must be a single character 'n', 'w' or 'r'. |
The osmium.osm.mutable module offers pure Python-object versions of Node,
Way and Relation to make the creation of custom objects easier. Any of
the allowable attributes may be set in the constructor. This makes the
example for writing random points a bit shorter:
Example
from random import uniform
with osmium.SimpleWriter('points.opl') as writer:
for i in range(4):
writer.add_node(osmium.osm.mutable.Node(
id=i, location = (uniform(-180, 180), uniform(-90, 90)),
tags={'name': f"Random {i}"}))
Writer types🔗
pyosmium implements three different writer classes: the basic SimpleWriter and the two reference-completing writers ForwardReferenceWriter and BackReferenceWriter.
Writing specific objects only🔗
The SimpleWriter creates an OSM data file by directly writing out any OSM object that it receives in the chosen format.
Writing reference-complete files🔗
The BackReferenceWriter will make sure that the file that is written out is reference-complete, meaning all objects that are directly referenced by the object written are added to the output file as well. This is needed when you want to make sure that geometries can be recreated from the object in the file.
Creating a file with backward references is a two-stage process: while the
writer is open, it will write all objects received through one of the add_*()
functions into a temporary file and keeps a record of which objects are needed
to make the file reference-complete. Once the writer is closed, it collects the
missing object from a given reference file, merges them with the data from
the temporary file and writes out the final result.
Writing files with forward references🔗
The ForwardReferenceWriter completes the written objects with forward references. This is particularly useful when creating geographic extracts of any kind: one selects the node of interest in a particular area and then lets the ForwardReferenceWriter complete the ways and relations referring to the nodes.
Files written by the ForwardReferenceWriter are not necessarily reference-complete. That is easy to see when considering the example of the geographic extract: there may be ways in the area that cross the boundary of the area chosen but only the nodes within the area are written out. This might be useful in many situations as the way would be simply seem to be cut on the area of interest. However, it has the disadvantage that some objects will get invalid geometries, especially when they represent areas.
The other thing to consider during forward completion are indirect references. When completing relations indirectly referenced through ways or other relations, then the resulting file can become big very quickly. For example, a seemingly small extract of the city of Strasbourg can suddenly contain not only the relations for France and Germany but also electoral boundaries and entire timezones. For that reason, when forward-completing relations, it is not recommended to use backward completion.