I wrote about microservices and micro-UIs the other day, and that got me thinking about another value of microservices: acting as a single source of truth. The same can be said of shared classes and functions. This is an argument against code duplication.
Let’s say, for example (and I’m currently seeing this in the terrible app I’m rebuilding), you have two functions/pages/scripts that deal with emails. In the first case the email text string is lower case, must contain the @ sign, must contain a ‘.’ after the @ sign, etc. In the second case the email text string must contain the @ sign, must contain a ‘.’ after the @ sign, etc. If you read that carefully, the email is set to lowercase in the first case but not in the second. I intentionally made it a bit confusing to highlight the problem with duplicate code.
So, building on the example above, we have emails handled in these two places. Because each handles emails in a slightly different way, you cannot be certain that “email” will always come out exactly the same, whatever minor variations the user types or pastes into the UI.
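To make the drift concrete, here is a minimal sketch of the two handlers described above. The function names signupEmail() and loginEmail() are made up for illustration (the real app's code isn't shown here), and the checks are only the ones listed: an @ sign, and a ‘.’ somewhere after it.

```php
<?php
// Hypothetical handler #1: lowercases, then checks for "@" and a "." after it.
function signupEmail(string $raw): ?string
{
    $email = strtolower($raw);
    $at = strpos($email, '@');
    if ($at === false || strpos($email, '.', $at) === false) {
        return null; // rejected
    }
    return $email;
}

// Hypothetical handler #2: the same checks, copied, minus the lowercasing.
function loginEmail(string $raw): ?string
{
    $email = $raw;
    $at = strpos($email, '@');
    if ($at === false || strpos($email, '.', $at) === false) {
        return null; // rejected
    }
    return $email;
}

// Same input, two different "emails" depending on which copy handled it.
var_dump(signupEmail('User@Domain.com') === loginEmail('User@Domain.com')); // bool(false)
```

Both copies accept the address, so nothing errors out. They just quietly disagree about what the address is.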
You may have heard the term “single source of truth” in terms of data. That means you have a single place that is always assumed to be correct for any given piece of data. The “income” data may be in an accounting database while the “Google Analytics” data might be in the web database. That works so long as you always know: “if I want correct income I go to the accounting DB, if I want correct analytics data I go to the web database”.
Now let’s apply that to microservices and functions. If I want a sanitized, validated email I always go to… where? In our example above there are two places to go, and they turn out two different data formats. What we really want is “every time I want a sanitized email I go to SANITIZE_CLASS->email(), every time I want to validate an email I go to VALIDATE_CLASS->email()”. Doing this means you have only one piece of code sanitizing/validating emails, and no matter where the email came from or how it was entered, it’ll always come out of these functions looking exactly the same.
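A rough sketch of what that could look like in PHP follows. Sanitize and Validate are hypothetical class names standing in for SANITIZE_CLASS and VALIDATE_CLASS. The trim() call is my own assumption on top of the lowercasing, and the validation only applies the simple rules from the example (an @ sign, and a ‘.’ after it), not a full email check.

```php
<?php
// Hypothetical stand-in for SANITIZE_CLASS.
final class Sanitize
{
    // The one place that decides what a "clean" email looks like.
    public function email(string $raw): string
    {
        // trim() is an assumption; the lowercasing is the rule from the example.
        return strtolower(trim($raw));
    }
}

// Hypothetical stand-in for VALIDATE_CLASS.
final class Validate
{
    // The one place that decides whether an email is acceptable:
    // it must contain "@" and a "." somewhere after the "@".
    public function email(string $email): bool
    {
        $at = strpos($email, '@');
        return $at !== false && strpos($email, '.', $at) !== false;
    }
}

$sanitize = new Sanitize();
$validate = new Validate();

// Two entry points, two spellings, one result.
$fromSignup = $sanitize->email('  User@Domain.com ');
$fromLogin  = $sanitize->email('user@domain.COM');

var_dump($fromSignup === $fromLogin);            // bool(true)
var_dump($validate->email($fromSignup));         // bool(true)
var_dump($validate->email('user@domain,com'));   // bool(false)
```

If the rules ever need to change (say, swapping the hand-rolled check for something stricter), there is exactly one method to edit, and every caller picks it up.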
In the case of the app I’m rebuilding at the moment, it doesn’t set emails to lowercase, so a user could have an account under “User@domain.com” and another account under “user@domain.com”. Which account/user data gets loaded depends on which email the user entered in the login process. I haven’t wasted the time looking, but I bet there’s more code handling emails, and it’s even possible that data changed under the “User” account would impact the other “user” account.
Bottom line, anything touching or storing data should only ever be written once and should always be precisely consistent.
A classic, real-world example of this is URLs. Take the URLs google.com and GOOGLE.COM: they both resolve to the same place, because host names are not case-sensitive. BUT these URLs do not resolve to the same place: https://www.google.com/search?q=lakebed.io versus https://www.google.com/search?Q=lakebed.io. Do you see the difference? “q” versus “Q” is enough to break Google.
I personally always use the PHP function array_change_key_case(), which by default sets all keys in an array ($_GET and $_POST, for example) to lower case.
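Here is a small sketch of that idea applied to the “q” versus “Q” example. The normalizedQuery() helper is a hypothetical name, and the plain arrays stand in for what $_GET would hold for each URL.

```php
<?php
// Hypothetical helper: the single place where request keys get normalized.
function normalizedQuery(array $params): array
{
    // CASE_LOWER is the default; it is spelled out here for clarity.
    // Only the top-level keys change, the values are left alone.
    return array_change_key_case($params, CASE_LOWER);
}

// Stand-ins for $_GET after ?q=lakebed.io and ?Q=lakebed.io
$lower = normalizedQuery(['q' => 'lakebed.io']);
$upper = normalizedQuery(['Q' => 'lakebed.io']);

var_dump($lower === $upper); // bool(true)
var_dump($upper['q']);       // string(10) "lakebed.io"
```

Everything downstream can then read the “q” key without caring how the user typed it.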