Using ChatGPT to build automated OpenStreetMap changesets metadata retrieval and storage

· mvexel's blog

#openstreetmap, #programming

# Let's try ChatGPT

I asked ChatGPT the following question earlier:

can you write me python code that retrieves changesets from openstreetmap?

It came up with a decent first try! I had to ask a few follow-up questions to iron out some kinks and add features, and it handled those as well.

# Asking follow-up questions to fix issues and add functionality

In its first try it misunderstood my intent and used the Overpass API to retrieve newly created nodes, ways, and relations. So I asked it to retrieve changeset metadata instead:

I was looking for code that retrieves changeset metadata from https://planet.osm.org/replication/changesets/ - could you rewrite to retrieve from there please?
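To give an idea of what retrieving from that directory involves, here is a minimal sketch of the addressing scheme. It assumes the directory publishes a `state.yaml` containing a `sequence:` line, and that each sequence number maps to a gzipped file whose path is the number zero-padded to nine digits and split into three groups. The helper names `parse_sequence` and `sequence_url` are made up for illustration and are not taken from the ChatGPT output.

```python
import re

BASE_URL = "https://planet.osm.org/replication/changesets"


def parse_sequence(state_text):
    """Extract the sequence number from the text of state.yaml.

    Assumes the file contains a line of the form "sequence: 1234567".
    """
    match = re.search(r"^sequence:\s*(\d+)\s*$", state_text, re.MULTILINE)
    if match is None:
        raise ValueError("no sequence number found in state file")
    return int(match.group(1))


def sequence_url(sequence):
    """Build the URL of the gzipped changeset file for a sequence number."""
    padded = f"{sequence:09d}"  # zero-pad to nine digits
    # Assumed layout: AAA/BBB/CCC.osm.gz
    return f"{BASE_URL}/{padded[0:3]}/{padded[3:6]}/{padded[6:9]}.osm.gz"
```

Keeping these two steps as pure functions means the URL logic can be checked without touching the network.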

Then I asked it to store the results in a sqlite database:

can you rewrite that to store the changeset metadata in a sqlite database?
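A minimal sketch of what the storage step can look like with the standard library's `sqlite3` module follows. The table name, the column selection, and the `store_changesets` helper are assumptions made for this example. It uses `INSERT OR REPLACE` on the assumption that the same changeset can show up in more than one replication file (for example once while open and again after it is closed).

```python
import sqlite3

# Hypothetical schema: a small subset of the changeset attributes.
SCHEMA = """
CREATE TABLE IF NOT EXISTS changesets (
    id INTEGER PRIMARY KEY,
    username TEXT,
    created_at TEXT,
    closed_at TEXT,
    num_changes INTEGER
)
"""


def store_changesets(conn, changesets):
    """Insert changeset dicts, replacing any existing row with the same id."""
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT OR REPLACE INTO changesets "
        "(id, username, created_at, closed_at, num_changes) "
        "VALUES (:id, :username, :created_at, :closed_at, :num_changes)",
        changesets,
    )
    conn.commit()
```

Each dict passed in needs exactly the keys named in the `VALUES` clause; `None` is stored as `NULL`, which suits a changeset that has not been closed yet.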

Next, I thought I'd add scheduling:

can you add code that schedules retrieving the most recent changesets every hour?

Then I had it add a check for the most recent sequence number already in the database:

can you add a check for the last sequence number already stored in the database?
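One way to do such a check, sketched under the assumption that processed sequence numbers are kept in their own table (the `sequences` table and both helper names are hypothetical):

```python
import sqlite3


def last_stored_sequence(conn):
    """Return the highest sequence number recorded so far, or None if there is none."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences (sequence INTEGER PRIMARY KEY)"
    )
    # MAX() over an empty table yields a single row containing NULL.
    row = conn.execute("SELECT MAX(sequence) FROM sequences").fetchone()
    return row[0]


def record_sequence(conn, sequence):
    """Remember that the file for this sequence number has been processed."""
    conn.execute(
        "INSERT OR IGNORE INTO sequences (sequence) VALUES (?)", (sequence,)
    )
    conn.commit()
```

Comparing the returned value with the sequence number in the current state file tells the script whether there is anything new to fetch.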

The last piece of functionality I wanted to add was creating the database if it doesn't exist yet:

can you add code that checks if the database exists and create it if it doesn't?
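As a rough sketch of this step: `sqlite3.connect` creates a missing database file by itself, so the explicit existence check is mostly useful for logging, and `CREATE TABLE IF NOT EXISTS` takes care of the tables. The `open_database` helper and its `schema` parameter are names invented for this example.

```python
import sqlite3
from pathlib import Path


def open_database(path, schema):
    """Open the SQLite file at `path`, creating the file and its tables if needed.

    `schema` is a string of CREATE TABLE IF NOT EXISTS statements.
    """
    is_new = not Path(path).exists()
    conn = sqlite3.connect(path)  # creates the file when it is missing
    if is_new:
        print(f"Creating new database at {path}")
    # Safe to run on every start-up because of IF NOT EXISTS.
    conn.executescript(schema)
    return conn
```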

Finally, I asked it to use the Python scheduler module and add a main section:

can you rewrite this to use the python scheduler module?

can you add an if __name__ == '__main__:' section?
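For illustration, an hourly loop built on the standard library's `sched` module could look roughly like the sketch below. Whether "the python scheduler module" ended up being `sched` or a third-party package is not stated here, so treat the module choice as an assumption; `run_and_reschedule` and `main` are names made up for the example.

```python
import sched
import time

INTERVAL_SECONDS = 60 * 60  # one hour


def run_and_reschedule(scheduler, job, interval):
    """Run `job` once, then queue the next run `interval` seconds from now."""
    job()
    scheduler.enter(interval, 1, run_and_reschedule, (scheduler, job, interval))


def main(job, interval=INTERVAL_SECONDS):
    """Run `job` immediately and then once per interval, until interrupted."""
    scheduler = sched.scheduler(time.time, time.sleep)
    scheduler.enter(0, 1, run_and_reschedule, (scheduler, job, interval))
    scheduler.run()  # blocks, sleeping between runs
```

A script would call `main(fetch_changesets)` from under its `if __name__ == '__main__':` guard, where `fetch_changesets` stands for whatever function does the retrieval. Because the next run is queued after the job returns, the interval is measured from the end of one run rather than from its start.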

# Result

The final result was this almost-working code.

# From ChatGPT output to a working script

I spent the next 30 minutes or so fixing the code by hand and adding a bit of useful output. The final code looks like this.

You can see the diff here. (Thanks, diff2html-cli!)

A summary of the fixes and improvements I made, referring to the destination file line numbers in the diff:

Overall, these are not major changes, and they took me no more than 30 minutes to implement. (I asked ChatGPT for a fix for the error in the original gzip response parsing, which it helpfully provided.)
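For reference, one way to decode a gzipped replication file with only the standard library is sketched below. It is not the fix ChatGPT suggested, and it assumes the decompressed payload is XML with one `<changeset>` element per changeset and optional `<tag k="..." v="..."/>` children. `parse_changesets` is a name chosen for this example.

```python
import gzip
import xml.etree.ElementTree as ET


def parse_changesets(gzipped):
    """Decompress a replication file (bytes) and return one dict per changeset."""
    root = ET.fromstring(gzip.decompress(gzipped))
    changesets = []
    for element in root.iter("changeset"):
        # Attributes (id, user, created_at, ...) arrive as strings.
        record = dict(element.attrib)
        # Child <tag> elements carry the free-form changeset tags.
        record["tags"] = {t.get("k"): t.get("v") for t in element.findall("tag")}
        changesets.append(record)
    return changesets
```

The function takes the raw bytes of the downloaded file, so it can be exercised with a locally compressed sample before pointing it at real data.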

It's still not perfect code by any stretch. Most notably, it will not backfill missed sequences, which would not be too hard to add.
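Backfilling could start from something as small as the sketch below, which lists every sequence number between the last one stored and the current one. The function name and the choice to begin with only the newest file on an empty database are assumptions for illustration.

```python
def sequences_to_fetch(last_stored, current):
    """List the sequence numbers after `last_stored`, up to and including `current`."""
    if last_stored is None:
        # Empty database: start with the newest file only.
        return [current]
    return list(range(last_stored + 1, current + 1))
```

The script would then fetch and store each listed sequence in order, so that a run missed while the machine was off gets filled in on the next pass.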

# To conclude

I'm pretty impressed that ChatGPT was able to produce mostly correct code to do something very specific.

Some people express fear that software engineering skills will become obsolete. John Carmack rightly points out that "product skills" will become more important: understanding what problem you are solving, rather than focusing on the specific tools. I agree, and this is bad news for a few different types of (aspiring) software engineers:

I don't think I've seen such an interesting inflection point in technology since I started messing with the internet in my university's computer lab in 1992.


  1. Or whatever language people speak where you work