Convert GPS-tracked bus movement data from JSON to GeoJSON format

Written by Men Vuthy, 2022


Objective

  • Convert a JSON file of GPS data, which records the movement of city buses in Phnom Penh, to GeoJSON format for analysis in GIS software.

File content before conversion:

[Image: file content before conversion]

Environment

[ ]:
!pip install geopandas
!pip install contextily
!pip install mapclassify
[2]:
cd /content/drive/MyDrive/Colab Notebooks/Bus
/content/drive/MyDrive/Colab Notebooks/Bus

Code

[3]:
# Import necessary module
import json
import geopandas as gpd
[4]:
# Read json file
# The with-block closes the file as soon as the data is loaded
with open("data.json", "r", encoding="utf-8") as f:
    input_file = json.load(f)
[5]:
# Check properties of data

# Check length
print('The length of data is', len(input_file))

# Check variables
print('The variables of first data:')
input_file[0]
The length of data is 13010
The variables of first data:
[5]:
{'_id': {'$oid': '625ce66611021109ba081525'},
 'device_id': '0358735074119172',
 'date': '1604120c0a11',
 'set_count': 'cf',
 'latitude_raw': '013db852',
 'longitude_raw': '0b41b0c0',
 'latitude': 11.567832222222222,
 'longitude': 104.91914666666666,
 'speed': 15,
 'orientation': 'd4ef',
 'lbs': '01c808273a002b33',
 'device_info': '00000010',
 'power': '0f',
 'gsm': 'a8',
 'alert': '0d',
 'power_status': '0',
 'gps_status': '0',
 'charge_status': '0',
 'acc_status': '1',
 'defence_status': '0',
 'from_cmd': 'ping',
 'location': {'type': 'Point',
  'coordinates': [104.91914666666666, 11.567832222222222]},
 'timespan': '2022-04-18 04:17:42',
 'last_submit': '2022-04-18 04:17:42'}
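
Before rearranging the records, it can help to confirm that every record carries the fields the conversion relies on, since a single missing key would raise a KeyError. The following is a minimal sketch, not part of the original notebook: the helper name `missing_key_counts` and the two-record sample are hypothetical and only mimic the shape of the data shown above.

```python
from collections import Counter


def missing_key_counts(records, required):
    """Count, for each required key, how many records lack it."""
    missing = Counter()
    for rec in records:
        for key in required:
            if key not in rec:
                missing[key] += 1
    return dict(missing)


# Hypothetical sample shaped like the GPS records above
sample = [
    {"speed": 15, "location": {"type": "Point", "coordinates": [104.919, 11.568]}},
    {"speed": 22},  # no 'location': this record would break the conversion
]

result = missing_key_counts(sample, required=["speed", "location"])
print(result)
```

With the real data, the same call would be made on `input_file`; an empty dictionary means every record has all the required keys.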

As seen above, each record is a flat set of fields rather than a GeoJSON Feature. GeoJSON expects the attributes under a "properties" member and the position under a "geometry" member, so we need to rearrange the fields into that structure.

The target format can be explored at https://geojson.io/#map=2/20.0/0.0.
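
As an illustration of the target structure, here is a minimal sketch of a FeatureCollection holding a single Point feature. The two properties are taken from the first record shown above; choosing only these two is an assumption made to keep the example short.

```python
import json

# One Feature: attributes go under "properties", position under "geometry"
feature = {
    "type": "Feature",
    "properties": {"speed": 15, "timespan": "2022-04-18 04:17:42"},
    "geometry": {
        "type": "Point",
        # GeoJSON positions are ordered [longitude, latitude]
        "coordinates": [104.91914666666666, 11.567832222222222],
    },
}

# A FeatureCollection is a list of such features
collection = {"type": "FeatureCollection", "features": [feature]}

text = json.dumps(collection, indent=2)
print(text)
```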

[9]:
# Rearrange the variables of each record into a GeoJSON Feature
geojs = {
     "type": "FeatureCollection",
     "features":[
           {
                "type":"Feature",
                "properties": {
                    '_id': d["_id"],
                    'acc_status': d["acc_status"],
                    'alert': d["alert"],
                    'charge_status': d["charge_status"],
                    'date': d["date"],
                    'defence_status': d["defence_status"],
                    'device_id': d["device_id"],
                    'device_info': d["device_info"],
                    'from_cmd': d["from_cmd"],
                    'gps_status': d["gps_status"],
                    'gsm': d["gsm"],
                    'last_submit': d["last_submit"],
                    'latitude': d["latitude"],
                    'latitude_raw': d["latitude_raw"],
                    'lbs': d["lbs"],
                    'location': d["location"],
                    'longitude': d["longitude"],
                    'longitude_raw': d["longitude_raw"],
                    'orientation': d["orientation"],
                    'power': d["power"],
                    'power_status': d["power_status"],
                    'set_count': d["set_count"],
                    'speed': d["speed"],
                    'timespan': d["timespan"]
                    },

                "geometry": {
                    "type": "Point",
                    "coordinates": d["location"]["coordinates"],
                }
         } for d in input_file
    ]
 }

# Save to a new file; the with-block closes the file so all data is flushed to disk
with open("geodata.json", "w", encoding="utf-8") as output_file:
    json.dump(geojs, output_file)
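
Because every field of a record ends up in "properties", the same rearrangement could also be written without listing each key by hand. The sketch below is one possible alternative, not the notebook's method: the function names `record_to_feature` and `records_to_collection` are hypothetical, and it assumes every record has a `location` entry with `coordinates`. Unlike the cell above, it keeps the original key order instead of the alphabetical one.

```python
def record_to_feature(record):
    """Wrap one GPS record as a GeoJSON Feature, keeping every field as a property."""
    return {
        "type": "Feature",
        "properties": dict(record),  # shallow copy of all fields
        "geometry": {
            "type": "Point",
            "coordinates": record["location"]["coordinates"],
        },
    }


def records_to_collection(records):
    """Build a FeatureCollection from a list of GPS records."""
    return {
        "type": "FeatureCollection",
        "features": [record_to_feature(r) for r in records],
    }


# Hypothetical one-record sample shaped like the data above
sample = [
    {
        "device_id": "0358735074119172",
        "speed": 15,
        "location": {"type": "Point", "coordinates": [104.919, 11.568]},
    }
]

fc = records_to_collection(sample)
print(fc["features"][0]["geometry"])
```

With the real data, `records_to_collection(input_file)` would take the place of the hand-written `geojs` dictionary.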

File content after conversion:

[Image: file content after conversion]

Read newly-created GeoJSON file and visualize speed data

[10]:
# Read newly-created GeoJSON file
df = gpd.read_file('geodata.json')
df.head()
[10]:
  | _id | acc_status | alert | charge_status | date | defence_status | device_id | device_info | from_cmd | gps_status | ... | location | longitude | longitude_raw | orientation | power | power_status | set_count | speed | timespan | geometry
--|-----|------------|-------|---------------|------|----------------|-----------|-------------|----------|------------|-----|----------|-----------|---------------|-------------|-------|--------------|-----------|-------|----------|---------
0 | {'$oid': '625ce66611021109ba081525'} | 1 | 0d | 0 | 1604120c0a11 | 0 | 0358735074119172 | 00000010 | ping | 0 | ... | {'type': 'Point', 'coordinates': [104.91914666... | 104.919147 | 0b41b0c0 | d4ef | 0f | 0 | cf | 15 | 2022-04-18T04:17:42 | POINT (104.91915 11.56783)
1 | {'$oid': '625ce6d511021178ef08168e'} | 1 | 0d | 1 | 1604120c0c2e | 1 | 0358735074119172 | 00100111 | ping | 0 | ... | {'type': 'Point', 'coordinates': [104.91769777... | 104.917698 | 0b41a690 | d4e6 | be | 0 | cf | 22 | 2022-04-18T04:19:33 | POINT (104.91770 11.56665)
2 | {'$oid': '625ce6d611021175dd081693'} | 1 | 0d | 0 | 1604120c0e33 | 1 | 0358735074119172 | 00101011 | ping | 0 | ... | {'type': 'Point', 'coordinates': [104.91549333... | 104.915493 | 0b419710 | d4e5 | fd | 0 | cf | 8 | 2022-04-18T04:19:34 | POINT (104.91549 11.56489)
3 | {'$oid': '625ce6d8110211c581081699'} | 0 | 0d | 1 | 1604120c0f2e | 0 | 0358735074119172 | 00000100 | ping | 0 | ... | {'type': 'Point', 'coordinates': [104.91375111... | 104.913751 | 0b418ad0 | d4e6 | d8 | 0 | cf | 12 | 2022-04-18T04:19:36 | POINT (104.91375 11.56349)
4 | {'$oid': '625ce6d9110211599f08169c'} | 0 | 0d | 0 | 1604120c1124 | 1 | 0358735074119172 | 00111001 | ping | 0 | ... | {'type': 'Point', 'coordinates': [104.91170666... | 104.911707 | 0b417c70 | d4d8 | 42 | 0 | cf | 24 | 2022-04-18T04:19:37 | POINT (104.91171 11.56169)

5 rows × 25 columns

[11]:
# Check coordinate reference system
df.crs
[11]:
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
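
The bounds reported above, (-180.0, -90.0, 180.0, 90.0), suggest a quick sanity check on coordinate order. GeoJSON stores positions as [longitude, latitude], and Phnom Penh's longitude is greater than 90, so a swapped pair would fall outside the valid latitude range. The helper below is a hypothetical sketch (the name `is_valid_lon_lat` is not from the notebook), and it can only catch swaps for locations whose longitude lies beyond ±90 degrees.

```python
def is_valid_lon_lat(coords):
    """Return True if coords is a [longitude, latitude] pair within WGS 84 bounds."""
    lon, lat = coords[0], coords[1]
    return -180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0


# Coordinates of the first record, in GeoJSON order
print(is_valid_lon_lat([104.91914666666666, 11.567832222222222]))  # True

# The same pair swapped: 104.9 is not a valid latitude
print(is_valid_lon_lat([11.567832222222222, 104.91914666666666]))  # False
```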
[12]:
# Import plotting libraries and a helper for positioning the colorbar
import contextily as ctx
import mapclassify
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import make_axes_locatable

# Initialize subplots
fig, ax = plt.subplots(figsize=(9, 9))
ax.set_xlabel('Longitude')
ax.set_ylabel('Latitude')

# Reserve a narrow axis on the right for the colorbar
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="3%", pad=0.2)

# Plot points colored by speed, with the colorbar drawn in the reserved axis
df.plot(ax=ax, column='speed', markersize=5, marker="o",
        cmap="viridis",
        cax=cax,
        legend=True, legend_kwds={'label': "Speed (km/h)"})

ctx.add_basemap(ax, crs=df.crs.to_string())
[Figure: bus GPS points colored by speed (km/h) on a basemap]