Beyond the “Happy Path”

In software development, we often focus on the “happy path”: the ideal route through a system when everything goes as planned. But to create robust and user-friendly applications, we need to venture beyond this path and explore where things can go wrong.

This article discusses a proactive approach to error management that combines intentional error triggering with comprehensive error documentation and guidance for the moment an error happens.

The Happy Path and Beyond

The “happy path” represents the optimal flow through a system. For example, in a robot control application, the happy path for connecting to a robot might be:

  1. Enter the correct IP address
  2. Initiate the connection
  3. Successfully establish a websocket connection
  4. Begin robot control

However, users, especially newcomers, sometimes stray from this path. To create a more robust system, we need to prepare for these deviations.

Intentionally Triggering Errors

The key to understanding where users stray from the path, and what happens when they do, is to intentionally trigger and document errors. This process involves:

  1. Identifying Potential Error Points: For each feature or process, brainstorm ways it could fail or be misused.
  2. Systematically Triggering Errors: Deliberately cause each identified error scenario.
  3. Documenting the Results: Record the steps to reproduce, error messages, and system state for each triggered error.

For our robot connection example, we might intentionally trigger errors by:

  • Entering an incorrect IP address
  • Trying to connect while the robot is powered off
  • Attempting the connection with a firewall blocking the necessary ports
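
The scenarios above can also be triggered from a small script, so that each result is recorded the same way. The following is a minimal sketch, not the tooling used for the robot application: it uses a plain TCP connection from Python's standard socket module as a stand-in for the websocket handshake, and the names ProbeResult and probe_connection are hypothetical.

```python
import socket
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """One documented attempt: what was tried and what happened."""
    scenario: str
    host: str
    port: int
    ok: bool
    error_type: str = ""
    error_message: str = ""


def probe_connection(scenario: str, host: str, port: int,
                     timeout: float = 2.0) -> ProbeResult:
    """Attempt a TCP connection and record the outcome instead of raising."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return ProbeResult(scenario, host, port, ok=True)
    except OSError as exc:
        # Refused connections, timeouts and name-resolution failures
        # are all subclasses of OSError, so one handler records them all.
        return ProbeResult(scenario, host, port, ok=False,
                           error_type=type(exc).__name__,
                           error_message=str(exc))
```

Calling probe_connection once per scenario (wrong IP address, robot powered off, firewall enabled) yields one record per deviation, which can then be copied into the error documentation.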

Comprehensive Error Documentation and Guidance

Once we’ve triggered and documented these errors, we can create informative error messages and guidance. Let’s look at a real-world example of how this might be implemented:

import socket
import sys
from time import sleep

def connect(self, ip, waiting=True):
    # ... (connection attempt code) ...

    i = 0  # number of 0.1 s polls so far
    while not self.is_connected and waiting:
        sleep(0.1)
        i = i + 1
        if i > 50:  # give up after roughly 5 seconds
            print("")
            # Width of the separator line printed below the hints
            width = len("Make sure your laptop is not using a firewall or other security software that blocks the connection") + 3
            print("")
            print(" Could not connect to robot, please check the following:")
            print(" - Make sure the robot is turned on")
            print(" - Make sure the laptop is connected to the same network as the robot")
            print(" - Make sure you have the correct ip address of the robot (" + ip + ")")
            print(" - Make sure that your laptop is not using a firewall or other security software that blocks the connection")
            # Local machine details help when debugging the network setup
            hostname = socket.gethostname()
            IPAddr = socket.gethostbyname(hostname)
            print(" - Your Computer Name is: " + hostname)
            print(" - Your Computer IP Address is: " + IPAddr)
            # Separator line followed by the ASCII art character
            print(f"\\{'_' * width}/" + self.character)
            sys.exit(1)

In this case the errors are easy to catch, because every failure surfaces the same way: the connection is never established. Catching that one condition and showing a useful message with tips on what could have gone wrong leads to a smoother experience for inexperienced users.

When this code hits a timeout while trying to connect, it produces output that looks like this:

Trying to connect to ws://10.0.233.55/live
Connecting ..................................................

 Could not connect to robot, please check the following:
 - Make sure the robot is turned on
 - Make sure the laptop is connected to the same network as the robot
 - Make sure you have the correct ip address of the robot (10.0.233.55)
 - Make sure that your laptop is not using a firewall or other security software that blocks the connection
 - Your Computer Name is: user-laptop
 - Your Computer IP Address is: 192.168.1.100
\__________________________________________________________________________________________/
 _-_  | /
/_  \ |/
(o)(o)
| | |
| \/ /
\    |
 ¯--¯

This error output demonstrates several key principles:

  1. Clear Error Identification: It states that the connection to the robot failed.
  2. Actionable Steps: It provides a list of specific things the user can check.
  3. Relevant Context: It includes the IP address the user was trying to connect to (10.0.233.55).
  4. Debug Information: It provides additional information like the user’s computer name and IP address.
  5. Visual Separation: The error message is visually separated from other output, making it easy to read and understand.
  6. Memorable Presentation: The inclusion of the ASCII art character adds a touch of personality and makes the error message stand out more. :)

If there is a stack trace, you should still display it, as it might contain more information about the error. In the timeout case above, no exception is raised, so there is no stack trace to show.
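
When an exception is involved, the guidance and the stack trace can be shown together. The following is a minimal sketch under stated assumptions: report_error is a hypothetical helper, the hint text is illustrative, and the exception is raised by hand only to have something to report. It relies solely on the standard traceback module.

```python
import sys
import traceback
from typing import Optional, Sequence


def report_error(summary: str, hints: Sequence[str],
                 exc: Optional[BaseException] = None) -> str:
    """Build a report: summary, actionable hints, then the stack trace if any."""
    lines = ["", f" {summary}, please check the following:"]
    lines += [f" - {hint}" for hint in hints]
    if exc is not None:
        lines += ["", " Original stack trace:"]
        # format_exception returns strings that already end in newlines
        trace = traceback.format_exception(type(exc), exc, exc.__traceback__)
        lines.append("".join(trace).rstrip())
    return "\n".join(lines)


try:
    # Raised by hand here; in real code this would come from the connection attempt
    raise ConnectionRefusedError("connection refused")
except ConnectionRefusedError as error:
    print(report_error("Could not connect to robot",
                       ["Make sure the robot is turned on"], error),
          file=sys.stderr)
```

Putting the hints first keeps the friendly part visible to newcomers, while the trace below it stays available for whoever debugs the problem.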

Implementing a Proactive Error Management Approach

To implement this approach in your own projects:

  1. Map the Happy Path: Document the ideal flow through each feature or process.
  2. Identify Deviation Points: For each step in the happy path, consider how users might deviate.
  3. Trigger Errors Intentionally: Systematically cause each identified error scenario.
  4. Document Thoroughly: For each triggered error, record:
    • Steps to reproduce
    • Resulting error message or behavior
    • Relevant system state or environment details
  5. Craft Informative Error Messages: Based on your documentation, create error messages that:
    • Clearly state what went wrong
    • Provide actionable steps for resolution
    • Include relevant context and debug information
  6. Implement in Code: Add these comprehensive error messages to your error handling logic.
  7. Continuous Improvement: Regularly review and update your error catalog based on user feedback and new discoveries.
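
One way to keep the documented errors and the user-facing messages in sync is a small error catalog in code. The sketch below is one possible shape for such a catalog, not a prescribed format; CatalogEntry, CATALOG, and the connect_timeout key are hypothetical names, and the reproduce steps are illustrative.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CatalogEntry:
    """One documented error: how to reproduce it and how to explain it."""
    summary: str                 # what went wrong
    reproduce: Tuple[str, ...]   # steps that trigger the error
    checks: Tuple[str, ...]      # actionable steps shown to the user

    def render(self, **context: str) -> str:
        """Format the user-facing message, filling in run-time context."""
        lines = [f" {self.summary}, please check the following:"]
        lines += [f" - {check.format(**context)}" for check in self.checks]
        return "\n".join(lines)


CATALOG = {
    "connect_timeout": CatalogEntry(
        summary="Could not connect to robot",
        reproduce=("Power off the robot", "Call connect() with its IP address"),
        checks=(
            "Make sure the robot is turned on",
            "Make sure you have the correct ip address of the robot ({ip})",
        ),
    ),
}

print(CATALOG["connect_timeout"].render(ip="10.0.233.55"))
```

Because the reproduction steps live next to the message text, a review of the catalog (step 7) can update both in one place.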

Benefits

  1. Improved User Experience: Users encounter more helpful, actionable error messages.
  2. Reduced Support Burden: Many issues can be resolved without contacting support (or you).
  3. Faster Debugging: With comprehensive error information, developers can identify and fix issues more quickly.
  4. More Robust Software: By anticipating and handling various error scenarios, your software becomes more resilient.
  5. Educational Opportunity: Detailed error messages help users understand the system better and also teach them to solve errors on their own.

Conclusion

By venturing beyond the happy path, deliberately exploring error scenarios, and providing informative guidance, we create software that’s not only more robust but also more user-friendly.

In the world of software development, errors are not just problems to be solved; they’re opportunities to improve user experience and system reliability.
