
Anthropic Claude’s new “Skills” and DevOps

Posted by Tim de Boer on March 22, 2026

I have lived DevOps for many years now. And at Interfuze, when we build production-ready software and call it enterprise grade, that means DevOps.

So when the team embarked on building an API to have AI assess our project status reports (PSRs), I was disappointed to find how lacking Anthropic were in this area, given their Enterprise AI strategy.

Fortunately, it seems the direction I would naturally have expected Anthropic to head is starting to happen. And this is probably just a case of “if you surf this close to the edge you should expect some tool functionality to follow”. Hopefully closely!

To set the scene: at Interfuze we have a “zero project surprises” policy. We work in the hard space of digital transformation, and sometimes, despite our best efforts, transformations fail. Fortunately, that doesn’t happen often, but when it does, we should see it coming and let our clients know. Usually through conversation, of course, but always more formally in our regular PSRs.

To that end we hold portfolio reviews each fortnight, and one of the things we do in that review is pull up a few PSRs. Problem is, we can’t do it for every report. Wouldn’t it be great if we could use AI to analyse PSRs and flag the ones that could use a review, so we can better target that time?

That was the challenge I set to two of our lead techs, Raf and Raj: “Provide me an API to which I can submit a PSR and get back whether it was internally consistent and contained the elements expected.”

I thought it would be either easy or impossible. Turned out it was reasonably easy to build something tantalisingly useful. But really tough to turn it into the enterprise grade solution I wanted.

Two problems:

1. There was no simple API to upload a file and have it analysed by a skill. The API exists, but the devil is in the detail around file management and error handling.

2. We couldn’t find a framework to test skills and lock in that testing as part of a deployment pipeline.
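To make the first problem concrete, here is a minimal sketch of the kind of wrapper it calls for: upload a report, run one skill over it, and clean up the upload whether or not the analysis succeeds. It is an illustration only, not our production code. The `SkillClient` interface, its `upload`/`analyse`/`delete` methods, the `Assessment` shape and the `consistent`/`missing` response keys are all assumptions made up for the example; they are not the real Anthropic SDK.

```python
from dataclasses import dataclass
from typing import Protocol


class SkillClient(Protocol):
    """Hypothetical provider interface (an assumption, NOT the real Anthropic SDK)."""

    def upload(self, name: str, data: bytes) -> str: ...
    def analyse(self, file_id: str, skill: str) -> dict: ...
    def delete(self, file_id: str) -> None: ...


@dataclass
class Assessment:
    ok: bool            # True when the call itself succeeded
    consistent: bool    # the report is internally consistent
    missing: list[str]  # expected elements that were not found
    error: str = ""     # populated only when ok is False


def assess_report(client: SkillClient, name: str, data: bytes, skill: str) -> Assessment:
    """Upload a report, run one skill over it, and always remove the upload."""
    if not data:
        return Assessment(False, False, [], "empty file")
    try:
        file_id = client.upload(name, data)
    except Exception as exc:
        return Assessment(False, False, [], f"upload failed: {exc}")
    try:
        # The response keys below are assumed for illustration.
        raw = client.analyse(file_id, skill)
        return Assessment(
            True,
            bool(raw.get("consistent", False)),
            list(raw.get("missing", [])),
        )
    except Exception as exc:
        return Assessment(False, False, [], f"analysis failed: {exc}")
    finally:
        # Runs on success and on failure, so uploads never pile up.
        try:
            client.delete(file_id)
        except Exception:
            pass  # a failed clean-up should not mask the assessment
```

Returning a result object instead of raising keeps the calling API simple: every request gets a structured answer, and the caller decides what counts as a failure.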

Raf fixed the first problem. And we can now submit a PSR to an API to test its quality. We liked the results so much we decided to make it public. You can try it against our reference PSRs, or one of your own, at https://skillfuze.interfuze.com.au.

We plan to make the code public soon too. And potentially release it as a GitHub app. Raf just needs one more brush stroke to perfect it first 😊.

Once that was done, we realised this skills file eval API provided the basis for fixing the second DevOps issue. Raf is on his way toward building a framework around this that will allow us to submit an array of test files to one or more skills, and return a matrix of results to help with:

·     testing skills as part of the skill creation experience,

·     locking in that testing on the skills deployment pipeline,

·     in time perhaps testing for AI drift and

·     perhaps in further time testing that against multiple providers/models to get the best combination of accuracy/cost.
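The matrix idea can be sketched in a few lines. This is a hedged illustration of the concept rather than Raf's framework: `run_matrix`, `regressions` and `gate` are hypothetical names, and the `evaluate` callable stands in for whatever actually sends a file to a skill and returns a pass/fail verdict.

```python
from typing import Callable

# Hypothetical evaluator signature: (skill name, file name) -> passed?
Evaluator = Callable[[str, str], bool]
Matrix = dict[str, dict[str, bool]]


def run_matrix(skills: list[str], files: list[str], evaluate: Evaluator) -> Matrix:
    """Run every test file through every skill; rows are files, columns are skills."""
    return {f: {s: evaluate(s, f) for s in skills} for f in files}


def regressions(matrix: Matrix, expected: Matrix) -> list[tuple[str, str]]:
    """List the (file, skill) cells whose verdict differs from the expected one."""
    return [
        (f, s)
        for f, row in matrix.items()
        for s, got in row.items()
        if expected[f][s] != got
    ]


def gate(matrix: Matrix, expected: Matrix) -> int:
    """Return a process exit code: 0 if every cell matches, 1 otherwise."""
    return 1 if regressions(matrix, expected) else 0


if __name__ == "__main__":
    # Stub evaluator for demonstration only; a real one would call the eval API.
    def stub(skill: str, file: str) -> bool:
        return file.startswith("good")

    expected = {"good.txt": {"psr-review": True}, "bad.txt": {"psr-review": False}}
    matrix = run_matrix(["psr-review"], ["good.txt", "bad.txt"], stub)
    raise SystemExit(gate(matrix, expected))
```

Checking the matrix against a stored set of expected verdicts is what lets a pipeline lock the testing in: a non-zero exit code fails the deployment step. Re-running the same files over time, or with a different evaluator per provider, is one way the drift and multi-model comparisons could reuse the same shape.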

The PSR evaluator is essentially a skill, by the way. And if you’re building skills, remember this: skills are software, and software needs tests. So thankfully, during this work, along came Claude with the release of a skills eval framework, added to its skill-creator skill (I know… that’s a lot of skills within skills!).

I do wish Anthropic would be a little more forthcoming with their roadmap. If we had known this was coming, we might have done things a little differently. And then, thanks to our friends at IMDEX, in talking about this work we’ve also been introduced to skills superpowers.

Fortunately for Raf (who busted a Saturday or two on this), neither tool seems to have completely tackled the DevOps problem. At least for the moment. So we’ll pivot and tackle the problems they’ll likely release in the weeks/months to come.

It’s all moving so fast. A powerful, fun and scary ride all at the same time. With so much to keep on top of, I really do struggle with proclamations of the death of consultants and SaaS. As far as I can tell, we’re needed more than ever.

At least for now!