A little bit of background is needed to have a comprehensive picture. This story revolves around our Product API, our Authorization API, and the company’s Authorization Management System, or ams for short.
The Product API is how clients interact with product data, assuming they are authorized.
The company’s ams is the source of truth on whether a client is authorized or not. The Authorization API exists to mediate interactions between our Product API and this source of truth.
The Authorization API supports an in-memory cache of previously authorized clients, and a database fallback in case the cache is empty and ams is down.
So the auth flow is like this:
// NOTE: the string parameter types are an assumption; the original sketch left them untyped.
func isUserAuthorizedForResource(userID, resourceID string) bool {
	var isAuthorized bool
	var err error

	isAuthorized, err = inMemoryCache.getAuthorization(userID, resourceID)
	if err == nil {
		// user was in our cache, return their auth state
		return isAuthorized
	}

	// user not in cache, check the auth system
	isAuthorized, err = authSystem.getAuthorization(userID, resourceID)
	if err == nil {
		return isAuthorized
	}

	// could not get to auth system, fall back to database
	isAuthorized, err = database.getAuthorization(userID, resourceID)
	if err == nil {
		return isAuthorized
	}

	// something really went wrong. cannot authorize user.
	return false
}
A client reached out about an authorization error they were receiving. They had recently submitted a request to the Authorization Management System, ams, to gain access to a particular API resource. But when attempting to access said resource, they received an auth error.
Thinking over the list of possible explanations for the problem, I first checked our source of truth, the ams. And yes, they did in fact have the correct authorization. So that’s not the problem.
Next, I replicated the request to our Authorization Server. Bingo! Unauthorized.
The server’s in-memory cache must have their userID and available resources cached without the most recent addition. Simply clear the cache for that user, let it repopulate, and everything’s right with the world… Right?
Wrong.
Clearing the cache did not help. Is the ams broken, reporting the correct authorization setup in the UI but not via the API?
Very unlikely.
That only leaves the database. It is periodically updated with userIDs and resources, and is only used when the in-memory cache doesn’t have that userID + resource pair and the call to ams fails. Which it doesn’t. So the database shouldn’t be getting used.
Ah, but it is!
Debugging the issue locally revealed that the call to ams would time out after 10ms, and would fall back to the database. Since the client’s authorization change was so recent, it wasn’t captured in the database, so they’d get that unauthorized error. But why is the timeout so short? Everyone on the team would know it should be at least 1s, possibly more.
Finally, we get to the juicy parts. Why was the timeout to the ams so short?
In our Product API application, we specified an acceptable default for the timeout, like so:
func createAuthSystemClient(/* deps provided */) AuthSystem {
	authSystem := AuthSystem{}
	/* misc setup of authSystem ... etc */
	authSystem.timeout = getPropertyOrDefault("auth-system.timeout", "5000")
	return authSystem
}
So if the auth-system.timeout property is not specified in a configuration file or environment variable, it will default to 5s. Which it is obviously not doing.
Time to check the various configuration files, and any possible environment variables. And as expected, I cannot find anything. So it has to be an issue with the code that sets this default to 5s, right?
But no matter how hard I looked, I could see no bug or fault with this getPropertyOrDefault code. In desperation, I ctrl+f for the property.
Lo and behold, in plain sight, auth-system.timeout = "10ms". It was encoded in the application’s embedded configuration. This configuration is supposed to be non-environmental, specifying core properties like the app version, license, etc. But for some reason, someone decided to specify a timeout for the ams.
Removing this entry allowed getPropertyOrDefault to do its job and use 5s as the default timeout. After that, everything worked.
Calls to the ams would succeed and repopulate the in-memory cache with the correct authorization. Eventually the fallback database will be updated too.
All is right with the world.
]]>Certainly from my own experiences, I’ve hidden behind misdirection, or bullshit, when answering a question I do not know the answer to.
Why?
Maybe I didn’t want to look unprofessional in not knowing, and gave out any answer. Or on the flip-side, I wanted to have my voice heard even if I didn’t have something useful to contribute.
It could be I did not understand the question, and didn’t want to ask for clarification.
But at the end of the day, none of the above helps remedy the situation. It clouds the conversation, adding unnecessary complexity to what could be an already complicated topic.
Admitting “I don’t know” is the remedy.
It is clear, concise, and abrupt. It does not leave any room for misinterpretation.
You could make a game
They often get conflated when talking about a “CI/CD pipeline”, and honestly it wasn’t something I’d stopped to consider before. The question seems easy to answer, but I wanted to make sure I did my research.
After all, semantics are important.
So… what better an opportunity to learn the differences than now?
A solid article highlighting the differences is from Atlassian, where the key distinction is that continuous delivery means you can deploy changes to clients quickly and consistently, but not necessarily automatically.
Continuous deployment, on the other hand, goes the extra step and deploys code to production once it has passed through the CI pipeline. No manually triggering a deploy. If the code is green, it goes to prod.
I would personally argue that delivery ties into the idea of always being shippable; that you should be able to deploy at any time, not just frequently.
]]>GET /people/curtis is better than GET /getPeopleByName?name=curtis, right?
Let’s look at a better example:
Which of the above ☝️ is a “RESTful” URI?
It comes from Stefan Tilkov’s video on REST: I don’t Think it Means What You Think it Does; and I think it is a fantastic question.
Most people, myself included, get it wrong. Which one do you think is right?
My vote: #4
How about you? Vote now!
…
…
…
The fact of the matter is that the question simply does not make sense.
There is no such thing as a “RESTful URI”
— S. Tilkov
If you look at the constraints specified by Roy Fielding, none of them directly focus on URIs. The closest would be constraint #4, Uniform Interface (Sec 5.1.5).
Resources are identified in all requests; they are manipulated through representations; all communication is done via self-descriptive messages; and hypermedia is the engine of application state.
Essentially, it is the context through which URIs are presented and used which determines whether you are adhering to the constraint, rather than the URIs themselves.
A big part of understanding this requires understanding HATEOAS. I think Dan Palmer does a great job providing an example in his blog post Your API is not RESTful.
“What do I need to know in advance to use this service?”. The correct answer is the domain, and the protocol. That’s all. Given that information, it should be possible to fully explore everything the service has to offer.
— D. Palmer
He goes on to provide an example of how you can use Amazon via your browser, without ever being concerned about the specific URIs you utilize to buy your books or DVDs.
The same should go for your RESTful API. It doesn’t really matter what your URIs actually are from a REST perspective, as they are simply a means to an end. The endpoints should not need to be documented to use them; rather, responses from an API should include links to related resources, providing the means of navigation without any specific regard to the format of the identifiers.
]]>Hence, this blog.
I am hoping to use this blog for my own benefit. I would like to learn about a variety of topics surrounding software engineering, as well as improve how I communicate my own thoughts.
A blog is a fantastic platform, and I look forward to utilizing it.
]]>