If you've been writing automated tests for your RESTful APIs, then we probably share the same sentiment about how troublesome it is to manage request and response messages. It might be easy if you're just dealing with 5 to 10 lines of request/response messages, but once they reach 20 or more lines, your productivity in writing tests takes a hit.

To illustrate, here's an example of the request body of a web service that updates a registered user's profile.

{
   "username": "KYLEDPOGI",
   "profile_id": "acc6b57f-3fc3-406e-bc85-a10b687e462f",
   "status": "ACTIVE",
   "email": "kyledpogi@sample.com"
}

And if we write a simple test of the POST endpoint's functionality, it will probably look something like this.

import requests

post_body = {
   "username": "KYLEDPOGI",
   "profile_id": "acc6b57f-3fc3-406e-bc85-a10b687e462f",
   "status": "ACTIVE",
   "email": "kyledpogi@sample.com"
}
post_response = requests.post('POST_ENDPOINT', json=post_body)
# fetch the profile back; assumes the endpoint accepts profile_id as a query parameter
get_response = requests.get('GET_ENDPOINT', params={'profile_id': post_body['profile_id']})
get_data = get_response.json()

# let's assume that the GET request won't return the `status` field, so we have to check the remaining fields manually

assert get_data['username'] == post_body['username']
assert get_data['profile_id'] == post_body['profile_id']
assert get_data['email'] == post_body['email']

The above request body is still manageable as you only need to remember four fields. But in reality, it's more common to process 15 or more fields, some of which are nested. To give you an idea, here's a sample POST body for a fictional REST endpoint where I've added some supplementary values.

{
   "username":"KyleDpogi",
   "profile_id":"acc6b57f-3fc3-406e-bc85-a10b687e462f",
   "status":"ACTIVE",
   "email":"kyleDpogi@sample.com",
   "profile_details":{
      "birthdate":"09/01/1991",
      "address":"Philippines",
      "sex":"MALE",
      "marital_status":"MARRIED",
      "occupation":"Accountant"
   },
   "catalogue":{
      "cat_id":200,
      "cat_name":"KyleDPogi Game Catalogue",
      "cat_list":[
         {
            "name":"DOTA",
            "game_id":"32214",
            "genre":"MOBA"
         }]
   }
}

The most common way to model JSON is to use dictionaries. It's intuitive enough if you're handling a relatively small data set, but with data like the sample above, using Python dictionaries to parse and feed data becomes very unwieldy. The simple assertions I showed in my first example become an eyesore once nested structures are involved. Have a look at some field assertions below.

# Assertions
assert data['username'] == response['username']
assert data['profile_details']['birthdate'] == response['profile_details']['birthdate']
assert data['catalogue']['cat_list'][0]['name'] == response['catalogue']['cat_list'][0]['name']

Recommendation: Use Python's Dataclass

My recommendation is to use Python's dataclasses to model JSON data. They come with built-in features that would be tedious to write from scratch using regular Python classes. To summarize, here are the things you'll gain when you use dataclasses:

  1. You don't need to strictly memorize the structure of the request. You can use your IDE's IntelliSense feature to look up the fields that you've forgotten.
  2. Dataclasses give you the added benefit of comparing two instances without manually asserting each field. The field-by-field comparison is handled for you.
  3. You can add utility methods to the dataclass if you need custom handling of data.

This is not a dataclass tutorial but a good use case where dataclasses can be applied. If you are new to dataclasses, I suggest you check Raymond Hettinger's PyCon 2018 presentation, where he discussed what dataclasses are.

Let your IDE help you figure out what's hard to remember.

If we model the previous example of a game catalogue, it will look like the code below. As you may notice, dataclasses are just regular Python classes. As the name suggests, they are classes that hold data.

from dataclasses import dataclass
from dataclasses import asdict, replace
from dataclasses import field
from uuid import uuid4
from typing import List

@dataclass
class Game:
    name: str = "Legend of Zelda: BOTW"
    game_id: int = 3242223
    genre: str = "RPG"

@dataclass
class GamingCatalogue:
    cat_id: int = 434323
    cat_name: str = "KyleDPogi Game Catalogue"
    cat_list: List[Game] = field(default_factory=lambda: [Game()])

@dataclass
class ProfileDetails:
    birthdate: str = "09/01/1991"
    address: str = "Philippines"
    sex: str = "MALE"
    marital_status: str = "MARRIED"
    occupation: str = "ACCOUNTANT"

@dataclass
class User:
    username: str = "kyledpogi"
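    # note: str(uuid4()) is evaluated once, when the class is defined, so every User() shares this default
    profile_id: str = str(uuid4())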
    status: str = "ACTIVE"
    email: str = "kyledpogi@sample.com"
    profile_details: ProfileDetails = field(default_factory=lambda: ProfileDetails())
    catalogue: GamingCatalogue = field(default_factory=lambda: GamingCatalogue())

Compared with a dictionary, where the nesting gives you an overview of what the JSON data will look like, a model built from dataclasses is flat in structure. If there's a nested object, you have to write a dataclass for it and assign an instance to the field using a default factory. You can take a look at the User dataclass, where I've instantiated the profile_details and catalogue fields this way.
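
As a minimal sketch of why the default factory matters, the snippet below uses a hypothetical Basket class (not part of the model above). It assumes nothing beyond the standard dataclasses module: a factory gives each instance its own list, while a bare mutable default is rejected when the class is created.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Basket:
    # default_factory is called once per instance, so every Basket gets its own list
    items: List[str] = field(default_factory=list)


first = Basket()
second = Basket()
first.items.append("DOTA")

print(first.items)   # ['DOTA']
print(second.items)  # []

# Writing the mutable default directly is rejected when the class is created
rejected = None
try:
    @dataclass
    class BrokenBasket:
        items: List[str] = []
except ValueError as error:
    rejected = str(error)
    print(rejected)
```

The same reasoning applies to nested dataclass fields such as profile_details: the factory builds a fresh nested object for every new instance.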

After that, you can create as many User instances as you need by assigning them to variables, and since User is a class, you can set the values using dot notation.

# uses the User dataclass defined above

user_kyle = User()
user_kyle.profile_details.occupation = "Software Tester"
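
If you would rather not mutate an existing instance, the replace() function from the dataclasses module is one alternative. The sketch below uses simplified, hypothetical Person and Details classes rather than the full User model, and shows that the copy it returns is shallow.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Details:
    occupation: str = "ACCOUNTANT"


@dataclass
class Person:
    username: str = "kyledpogi"
    details: Details = field(default_factory=Details)


base_user = Person()
# replace() builds a new instance and overrides only the named fields
other_user = replace(base_user, username="notkyle")

print(base_user.username)   # kyledpogi
print(other_user.username)  # notkyle

# The copy is shallow: the nested Details object is shared, not duplicated
print(other_user.details is base_user.details)  # True
```

Because the nested object is shared, changing other_user.details.occupation would also change it for base_user, so pass a new Details instance to replace() when the nested values need to differ.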

Below is an image where the IDE/editor uses IntelliSense to suggest the attribute or field names, something you don't get with a plain Python dictionary.

[Image: intellisense]

Assertions Are Now Easier

You can compare dataclass objects using the equality operator ==, and it will compare their fields. Regular Python classes yield a different result because, by default, == only checks whether both sides are the same object. To compare the fields of a regular class, you need to implement the dunder method __eq__() yourself. With dataclasses, this is done automatically.

user_kyle = User()
user_kyle2 = User()
user_notkyle = User()

# update username
user_notkyle.username = 'notkyle'

# will raise AssertionError because the usernames differ
assert user_kyle == user_notkyle

# passes because all fields are equal
assert user_kyle == user_kyle2
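
For contrast, here is a small sketch that puts a regular class next to a dataclass. PlainProfile and DataProfile are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


class PlainProfile:
    def __init__(self, username: str, email: str):
        self.username = username
        self.email = email


@dataclass
class DataProfile:
    username: str
    email: str


plain_a = PlainProfile("kyledpogi", "kyledpogi@sample.com")
plain_b = PlainProfile("kyledpogi", "kyledpogi@sample.com")
data_a = DataProfile("kyledpogi", "kyledpogi@sample.com")
data_b = DataProfile("kyledpogi", "kyledpogi@sample.com")

# Without __eq__, a regular class falls back to identity comparison
print(plain_a == plain_b)  # False

# The generated __eq__ compares the fields of both instances
print(data_a == data_b)  # True

# The generated __repr__ also makes a failing assertion easier to read
print(data_a)  # DataProfile(username='kyledpogi', email='kyledpogi@sample.com')
```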

Add Class Methods For Custom Data Handling

The asdict() function can be used to convert your dataclass to a dictionary. You just need to import it from the dataclasses module and call it with the dataclass instance as the argument. One thing that the dataclasses module does not support is deserializing a dictionary into a dataclass. A response from a REST API call needs to be deserialized if you want to do comparison operations. You can of course create a new instance, but you have to explicitly assign the values like this.

user = User(username=json_response['username'],
    profile_id=json_response['profile_id'],
    status=json_response['status'],
    email=json_response['email'],
    profile_details=ProfileDetails(birthdate=json_response['profile_details']['birthdate'],
                                    address=json_response['profile_details']['address'],
                                    sex=json_response['profile_details']['sex'],
                                    marital_status=json_response['profile_details']['marital_status'],
                                    occupation=json_response['profile_details']['occupation']),
    catalogue=GamingCatalogue(cat_id=json_response['catalogue']['cat_id'],
                                cat_name=json_response['catalogue']['cat_name'],
                                cat_list=[Game(name=game['name'],game_id=game['game_id'], genre=game['genre']) for game in json_response['catalogue']['cat_list']]))

I won't be doing that every time! It's tedious and time-consuming. Luckily, we can create a method and call it every time we want to deserialize a response object.

@dataclass
class User:
    username: str = "kyledpogi"
    profile_id: str = str(uuid4())
    status: str = "ACTIVE"
    email: str = "kyledpogi@sample.com"
    profile_details: ProfileDetails = field(default_factory=lambda: ProfileDetails())
    catalogue: GamingCatalogue = field(default_factory=lambda: GamingCatalogue())

    def load_dictionary(self, dict_variable: dict):
        # no trailing commas here: a trailing comma would turn each value into a one-element tuple
        self.username = dict_variable['username']
        self.profile_id = dict_variable['profile_id']
        self.status = dict_variable['status']
        self.email = dict_variable['email']
        self.profile_details = ProfileDetails(birthdate=dict_variable['profile_details']['birthdate'],
                                            address=dict_variable['profile_details']['address'],
                                            sex=dict_variable['profile_details']['sex'],
                                            marital_status=dict_variable['profile_details']['marital_status'],
                                            occupation=dict_variable['profile_details']['occupation'])
        self.catalogue = GamingCatalogue(cat_id=dict_variable['catalogue']['cat_id'],
                                        cat_name=dict_variable['catalogue']['cat_name'],
                                        cat_list=[Game(name=game['name'], game_id=game['game_id'], genre=game['genre']) for game in dict_variable['catalogue']['cat_list']])

Now all you need to do is pass the parsed response to the dataclass method load_dictionary.

simple_user = User()
simple_user.load_dictionary(response.json())
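
Going the other direction, the asdict() function mentioned earlier turns an instance back into a plain dictionary. The sketch below uses trimmed-down, hypothetical versions of the Game and catalogue classes to show that nested dataclasses and lists are converted recursively.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Game:
    name: str = "DOTA"
    genre: str = "MOBA"


@dataclass
class Catalogue:
    cat_id: int = 200
    cat_list: List[Game] = field(default_factory=lambda: [Game()])


# asdict() walks nested dataclasses and lists and returns plain dicts and lists
payload = asdict(Catalogue())
print(payload)
# {'cat_id': 200, 'cat_list': [{'name': 'DOTA', 'genre': 'MOBA'}]}
```

A dictionary like this can be passed as the json argument of requests.post, the same way post_body was used in the first example.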

Hopefully you're starting to appreciate how helpful dataclasses are in writing clean and maintainable Python code. Coupled with type annotations and type checkers, I hope you'll see a big improvement in your productivity when writing tests.

A big thanks to Devon Divine for the amazing cover image.