Parsing JSON with Python
Reading and Writing Files
• Read from and write to text files with Python's built-in open() function
• open() returns a file object that we can use to read from and/or write to the file that we opened
# Open a file for reading
file = open("path/to/file", mode="r")
# Read and print contents of a file
file_contents = file.read()
print(file_contents)
# Close the file
file.close()
# Open a file for writing
file = open("path/to/file", mode="w")
# Write to the file; write() returns the number of characters written
chars_written = file.write("add something")
print(chars_written)
# Close the file
file.close()
• mode is an optional string that specifies the mode in which the file is opened. It defaults to 'r' (read). Other common values are 'w' for writing (truncating the file if it already exists) and 'a' for appending to the end of the file.
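The difference between the modes is easiest to see against a scratch file. The sketch below is illustrative only: the temporary directory and the file name notes.txt are assumptions made for this example and are not part of the original material.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (assumption for this sketch)
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "notes.txt")

    # 'w' creates the file, or truncates it if it already exists
    file = open(path, mode="w")
    file.write("first line\n")
    file.close()

    # Opening with 'w' a second time discards the previous contents
    file = open(path, mode="w")
    file.write("second line\n")
    file.close()

    # 'a' appends to the end of the file instead of truncating it
    file = open(path, mode="a")
    file.write("third line\n")
    file.close()

    # 'r' (the default) opens the file for reading
    file = open(path, mode="r")
    contents = file.read()
    file.close()

print(contents)
```

Only the second and third lines survive, because the second open() in 'w' mode truncated the file before writing.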
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The with statement
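• The with statement runs a block of code under a context manager; the file object returned by open() acts as one
• No need to explicitly close the file: the context manager closes it when the block ends, even if an error occurs inside the block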
# 'with' statement
>>> with open("intro-python/parsing-json/pep20.txt", mode="r") as file:
...     file_contents = file.read()
...     print(file_contents)
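To confirm that the context manager really does close the file, the file object's closed attribute can be checked after the block. This is a minimal sketch: it writes its own hypothetical scratch file (example.txt) so that it does not depend on pep20.txt being present, and the ValueError is raised on purpose to simulate a failure.

```python
import os
import tempfile

# Hypothetical scratch file so the example does not depend on pep20.txt
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")

    with open(path, mode="w") as file:
        file.write("Beautiful is better than ugly.\n")

    # The name 'file' still exists after the block, but the file is closed
    closed_after_write = file.closed

    try:
        with open(path, mode="r") as file:
            first_line = file.readline()
            # Simulated failure while the file is still open
            raise ValueError("simulated failure while processing")
    except ValueError:
        pass

    # The context manager closed the file even though an exception was raised
    closed_after_error = file.closed

print(closed_after_write, closed_after_error)
```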
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
JSON Syntax vs. Python Native Data Type
• Single vs. double quotes: Python accepts either, but JSON does not. JSON strings must be delimited with double quotes " ".
• Capitalization of boolean values: Python uses True and False, while JSON uses the all-lowercase true and false
• Trailing commas: Python allows a trailing comma after the last item in a list or dictionary, but a JSON parser rejects it
• Python can use any hashable object (such as a string, a number, or a tuple of immutable values) as a name in a dictionary. JSON names must be strings
• The outermost element of a JSON document is usually an "object" (which is the JSON name for a name-value pair data structure, like Python's dictionary). The original JSON specification required an object or array at the top level; the current one (RFC 8259) allows any JSON value
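These differences can be checked directly with the standard-library json module. The sketch below uses made-up sample strings; it shows one valid JSON text being parsed, three Python-style variants being rejected, and what happens to a non-string dictionary name when encoding.

```python
import json

# Valid JSON: double-quoted strings, lowercase booleans, no trailing comma
valid_text = '{"name": "GigabitEthernet2", "enabled": true}'
parsed = json.loads(valid_text)
print(parsed)

# Each sample looks reasonable to a Python programmer but is invalid JSON
invalid_samples = [
    "{'name': 'GigabitEthernet2'}",      # single quotes
    '{"enabled": True}',                 # capitalized boolean
    '{"name": "GigabitEthernet2",}',     # trailing comma
]
rejected = []
for sample in invalid_samples:
    try:
        json.loads(sample)
    except json.JSONDecodeError:
        rejected.append(sample)
print(len(rejected), "of", len(invalid_samples), "samples rejected")

# Going the other way, a non-string dictionary name is converted to a string
encoded = json.dumps({1: "one", "up": True})
print(encoded)
```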
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
What is Parsing?
• Parsing is the process of analyzing text into its logical syntactic components
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Using the Python json module
• Not all Python data structures can be encoded as JSON. For example, an instance of a class you created yourself cannot be encoded by default
• It is best to stick with Python's native data structures (dict, list, etc.) when you want your data to be able to be encoded as JSON text
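A short sketch of what goes wrong with a custom class, and one possible workaround. The Interface class here is hypothetical and exists only for illustration; copying its attributes into a plain dict is one option among several, not the only way to handle this.

```python
import json


class Interface:
    """Hypothetical user-defined class, used only for illustration."""

    def __init__(self, name, enabled):
        self.name = name
        self.enabled = enabled


interface = Interface("GigabitEthernet2", True)

# json.dumps() raises TypeError for objects it does not know how to encode
try:
    json.dumps(interface)
    error_message = None
except TypeError as error:
    error_message = str(error)
print(error_message)

# One workaround: copy the attributes into a native dict before encoding
as_dict = {"name": interface.name, "enabled": interface.enabled}
json_text = json.dumps(as_dict)
print(json_text)
```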
#!/usr/bin/env python
"""
Parsing structured JSON text into native Python data structures...
"""
import json
import os
from pprint import pprint
# Get the absolute path for the directory where this file is located "here"
here = os.path.abspath(os.path.dirname(__file__))
# Read in the JSON text
with open(os.path.join(here, "interface-config.json")) as file:
    json_text = file.read()
# Display the type and contents of the json_text variable
print("json_text is a", type(json_text))
print(json_text)
# Use the json module to parse the JSON string into native Python data
json_data = json.loads(json_text)
# Display the type and contents of the json_data variable
print("json_data is a", type(json_data))
pprint(json_data)
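The json module can also parse straight from an open file with json.load(), and encode Python data back into text with json.dumps(). The sketch below is a variation on the script above, not a replacement for it: it writes a small, hypothetical stand-in for interface-config.json into a temporary directory so that it runs on its own.

```python
import json
import os
import tempfile

# Hypothetical stand-in for interface-config.json (assumption for this sketch)
sample_text = (
    '{"ietf-interfaces:interface": '
    '{"name": "GigabitEthernet2", "enabled": true}}'
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "interface-config.json")
    with open(path, mode="w") as file:
        file.write(sample_text)

    # json.load() parses directly from an open file object,
    # combining the read() and loads() steps
    with open(path) as file:
        json_data = json.load(file)

# json.dumps() encodes native Python data back into JSON text
round_trip_text = json.dumps(json_data, indent=2)
print("json_data is a", type(json_data))
print(round_trip_text)
```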
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Working with Nested Data

• The data you want is often nested several levels deep inside other objects and arrays. This is the case with the ip address in the JSON example below
{
  "ietf-interfaces:interface": {
    "name": "GigabitEthernet2",
    "description": "Wide Area Network",
    "enabled": true,
    "ietf-ip:ipv4": {
      "address": [
        {
          "ip": "172.16.0.2",
          "netmask": "255.255.255.0"
        }
      ]
    }
  }
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Indexing Nested Data
• We access an element within a Python data structure using square brackets [ ] and the index of the item we want to "extract"
• Dictionaries use names (keys) as their index
• Lists use the numerical index of where each element sits in the ordered sequence (starting at zero)
Example
• Extracting the "ietf-interfaces:interface" dictionary from the top-level dictionary in the JSON sample above
>>> pprint(json_data["ietf-interfaces:interface"])
{'description': 'Wide Area Network',
 'enabled': True,
 'ietf-ip:ipv4': {'address': [{'ip': '172.16.0.2',
                               'netmask': '255.255.255.0'}]},
 'name': 'GigabitEthernet2'}
• To get the ip address nested inside the "ietf-interfaces:interface" dictionary, chain one index per level
>>> json_data["ietf-interfaces:interface"]["ietf-ip:ipv4"]["address"][0]["ip"]
'172.16.0.2'
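Chained indexing fails loudly when a name or position is missing, so it can help to see both the failure and a gentler alternative. This is a sketch under assumptions: the JSON text repeats the sample above, while the "ietf-ip:ipv6" and "mtu" names are hypothetical lookups chosen because they are absent from that sample.

```python
import json

# Same structure as the sample above, inlined so the example runs on its own
json_text = """
{
  "ietf-interfaces:interface": {
    "name": "GigabitEthernet2",
    "description": "Wide Area Network",
    "enabled": true,
    "ietf-ip:ipv4": {
      "address": [
        {"ip": "172.16.0.2", "netmask": "255.255.255.0"}
      ]
    }
  }
}
"""
json_data = json.loads(json_text)

# One index per level: dict name, dict name, dict name, list position, dict name
interface = json_data["ietf-interfaces:interface"]
ip_address = interface["ietf-ip:ipv4"]["address"][0]["ip"]
print(ip_address)

# A missing name raises KeyError; a list position out of range raises IndexError
try:
    ipv6_address = interface["ietf-ip:ipv6"]["address"][0]["ip"]  # hypothetical name
except (KeyError, IndexError):
    ipv6_address = None
print(ipv6_address)

# dict.get() returns a default instead of raising when a name is absent
mtu = interface.get("mtu", "not set")  # "mtu" is a hypothetical name
print(mtu)
```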
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Looping Through Nested Data
Looping through a list:
# Looping through a list
>>> my_list = [0, 1, 2, 3, 4]
>>> for item in my_list:
...     print(item)
...
0
1
2
3
4
Looping through a dict:
• Use variable unpacking and the dictionary .items() method
# Looping through a dictionary
>>> fruit_inventory = {"apples": 5, "pears": 2, "oranges": 9}
>>> for fruit, quantity in fruit_inventory.items():
...     print("You have {} {}.".format(quantity, fruit))
...
You have 5 apples.
You have 2 pears.
You have 9 oranges.
To loop through nested data, all you need to do is:
1. "Extract" the element that you want to loop through
2. Use that element as the iterable of your for loop (the expression after "in")
3. Within your for loop, your loop variable(s) will iteratively be assigned the values of the items within the element you are iterating over
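Example
# Assumes json_data holds a list of interfaces under
# json_data["ietf-interfaces:interfaces"]["interface"], unlike the
# single-interface sample shown earlier
for interface in json_data["ietf-interfaces:interfaces"]["interface"]:
    print(interface["name"])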
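The same three steps can be applied to the single-interface sample from the nested data section, where the list worth looping over is "address". This is a minimal sketch: the second address entry (172.16.1.2) is a made-up addition so that the loop has more than one item to visit.

```python
import json

# Single-interface sample; the second address entry is hypothetical
json_text = """
{
  "ietf-interfaces:interface": {
    "name": "GigabitEthernet2",
    "ietf-ip:ipv4": {
      "address": [
        {"ip": "172.16.0.2", "netmask": "255.255.255.0"},
        {"ip": "172.16.1.2", "netmask": "255.255.255.0"}
      ]
    }
  }
}
"""
json_data = json.loads(json_text)

# Step 1: extract the element to loop through (a list of dictionaries)
addresses = json_data["ietf-interfaces:interface"]["ietf-ip:ipv4"]["address"]

# Steps 2 and 3: iterate over it; each loop variable is one address dictionary
summaries = []
for address in addresses:
    summaries.append("{} / {}".format(address["ip"], address["netmask"]))

for summary in summaries:
    print(summary)
```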