Machine learning tool scores astronomy images using Claude AI

Vin

So I've mentioned in past debates about IOTD that there should now be enough data out there (general images, TPNs, TPs, IOTDs) to build an ML-based tool that can look at an image and return a score.

I had no idea how to build this, but with Claude having advanced so much, it suddenly dawned on me that Claude could probably build it.

So that’s what I’ve tested & done.

As each call uses API credits, I'm not making it publicly available yet (I pay for those credits!).

But the results have been fascinating and amusing.

I set up a composite score out of 10 by assigning weights to comparisons of a given target image against the IOTD, TPN, and general image sets (putting more emphasis on the putatively better IOTD and TP images).
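To make that concrete, here's a minimal sketch of the weighting idea (the set names, weights, and numbers below are placeholders for illustration, not the tool's actual values):

```python
# Illustrative sketch of the weighted composite - the weights, set names,
# and similarity numbers are placeholders, not the tool's actual values.

# Per-reference-set similarity scores (0-10) for a target image, e.g. how
# closely its craft parameters match each reference population.
similarity = {"iotd": 6.8, "tp": 7.2, "tpn": 7.9, "general": 8.5}

# Heavier weights on the putatively better image sets (IOTD, TP).
weights = {"iotd": 0.4, "tp": 0.3, "tpn": 0.2, "general": 0.1}

composite = sum(weights[s] * similarity[s] for s in weights)
print(f"Composite score: {composite:.1f}/10")  # -> Composite score: 7.3/10
```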

Then I let Claude loose: it analyses a given target image, compares it against those sets, and returns a score. Not just a score, but also feedback on 8 specific domains (signal-to-noise, processing/contrast, etc.). It also returns a summary of strengths, areas for improvement, and an IOTD assessment of what it would take for the image to reach IOTD.
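For anyone curious about the mechanics, the call pattern is roughly along these lines - a sketch assuming the official Anthropic Python SDK, with an illustrative prompt, domain list, and model string rather than the tool's actual ones:

```python
# Minimal sketch of the kind of call involved, using the Anthropic Python
# SDK (pip install anthropic). The model string, domain list, and prompt
# wording are illustrative - not the tool's actual prompt or rubric.
import base64

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

with open("target_image.jpg", "rb") as f:  # hypothetical file name
    image_b64 = base64.standard_b64encode(f.read()).decode()

prompt = (
    "Score this astrophotography image from 0-10 on each of these domains, "
    "with specific feedback per domain: signal-to-noise, processing/contrast, "
    "colour, star quality, detail, composition, background, overall impact. "
    "Then summarise strengths, areas for improvement, and what it would take "
    "to reach IOTD level. Return structured JSON."
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64",
                        "media_type": "image/jpeg",
                        "data": image_b64}},
            {"type": "text", "text": prompt},
        ],
    }],
)
print(message.content[0].text)
```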

Some initial takeaways (non-comprehensive):

  1. The systematic parameter-by-parameter feedback it gives is actually pretty interesting - consistent and well structured.

  2. This feedback also surfaces what it infers IOTD preferences to be - it reveals what humans would describe as the biases implicit in IOTD selections (e.g. phrases like "to reach what IOTD assessments prefer, there needs to be…")

  3. Putting TPs, TPNs and IOTDs into the tool is also interesting - very amusing when you see an IOTD image score lower than images that never made IOTD. To my mind this really draws out what we all know - that IOTD anointment, the way it is currently done, is not consistent. We've all seen images and wondered "how did that get IOTD", and others where we've wondered "how on earth did that not get IOTD/TP". Well, Claude is showing that the arbitrariness is very much there.

I've found it very interesting and useful - having a systematic rubric to analyse images actually gives very good constructive feedback. And reading a dispassionate, data-driven critique of IOTD assessment is also illuminating, and sometimes amusing.

Anyway, I’m going to use it for systematic, consistent feedback on my own images, as well as occasional amusement.

I'm also going to eventually tweak it to include a comparative score and feedback based on APOD. (The overall generalised comparison it came back with on APOD vs IOTD was also very interesting.)

Sorry for the long post!

TLDR? You're going to get better, more consistent, more systematic, more data-driven, more constructive feedback from an AI tool than from arbitrary human experts. And boy oh boy is IOTD arbitrary and inconsistent when you run it through the overall data. (And remember, Claude really is blind when it comes to the provenance of an image.)

Vin

I'm letting some astro buddies test it out, and if they think it's helpful I'll open it up more broadly.

Quinn Groessl

I've seen people feed an unedited .tiff of an object into an AI, asking it to process it for them. It then proceeded to spit out a completely different object's image that it got from who knows where. Between that and my experience of the drivel that most AI pictures are, I would take whatever results you're getting with a grain of salt. Of course I'm biased, but AI is just not particularly good at artistic things.

And then besides that, you're no better than all the companies that are creating AI if you're using copyrighted work to train your model.

Vin

This tool does not do any processing. It analyses a given loaded image (which is not retained) against publicly available images on certain parameters.

It scores the image on those parameters, and gives feedback as well as a quantitative score.

It's not being "trained" on any copyrighted data, and it doesn't scrape any data. It's an analysis and feedback tool.

Once I have some feedback from some guinea-pig astro-buddies, I’ll open it to the community here to try out.

Tony Gondola

Vin · May 13, 2026, 08:56 AM

And boy oh boy is IOTD arbitrary and inconsistent

It's exactly that fact that makes it human. There are things in art that go beyond just the technical aspects that AI can assess. You can make a technically correct image that leaves most people cold, and you can make an image that's technically imperfect and yet fills the viewer with joy and wonder.

Vin

Tony Gondola · May 13, 2026 at 02:46 PM

There are things in art that go beyond just the technical aspects that AI can assess. You can make a technically correct image that leaves most people cold, and you can make an image that's technically imperfect and yet fills the viewer with joy and wonder.

I completely agree that art is subjective — different works affect different people differently, and that's precisely its power. Which is also why most people would hesitate before anointing something as poem of the day or painting of the week. A personal reaction to art is just that — personal.

But IOTD is doing something more than responding to art. It's a collective label applied by a self-selecting group to what constitutes, in their opinion, the best astrophotography. That's no different from what the great academies used to do - and the art world moved on from that presumptuousness long ago. Impressionist paintings weren't considered art. Neither were Cubist ones. You get the drift.

This tool is simply a mirror. It analyses images against a set of craft parameters — signal, processing, composition, detail — and compares them against what a group of people have already selected as IOTD, TPN, and general images. It makes no claim to capture any emotional response to an image (which you’ll admit will vary across every observer); it analyses craft, not feeling. To the extent you believe IOTD reflects expert taste and judgment of what is art, that's already baked into its comparison set.

What's interesting is when that mirror shows inconsistency — when images anointed IOTD score lower than others that weren't. Some will call that bias, others arbitrariness. Either way, the mirror is just reflecting what's already there.

For me personally the score is beside the point. What I value is the systematic, parameter-by-parameter feedback — specific, consistent, dispassionate. I'll then decide what to do with it. That's the beauty of having a tool rather than an arbiter.

YMMV.

Eric Gagné

Vin,

I am in no way judging whether this is right or wrong, or if it should be done or not - that is not for me to decide.

I understand what you said in your last post, but to me the question remains: what is the point, exactly? Are you using this to compare your work with that of others, to decide how your image of target xyz scores compared to theirs?

I am honestly puzzled as to what purpose your tool serves exactly.

Vin

Eric Gagné · May 13, 2026 at 08:35 PM

I am honestly puzzled as to what purpose your tool serves exactly.

My motivation is simple: I want feedback on my images that is systematic, consistent, and precise - feedback on craft rather than opinion.

The tool analyses an image against a set of parameters and uses other image sets (IOTD, TPN, general population) as a reference - not to rank my image against theirs, but to generate structured, specific suggestions for improvement. A consistent, structured rubric.

As a byproduct, it also reveals something interesting: the preferences implicit in what gets selected as IOTD or TPN, and the inconsistencies that emerge when those selections are held up to a consistent rubric. That's an incidental curiosity rather than the point of the tool - but perhaps an illuminating one for some.

So in short: better, more consistent feedback on my own work - that's it.

Kevin Morefield

Interesting exercise! Given the modeling work I did back at my old job, I just love how regressions would find the driving factors behind behaviors. Nothing was more fun than presenting these factors to the humans at the company who thought they understood how our business worked and what drove customer behavior.

The behavior you are trying to model is that of the submitters, reviewers and judges. I would assume you are getting three scores representing the likelihood of an image reaching each of those three statuses. Given that each status is a prerequisite for the next, I would create scores that reflect the likelihood of the image moving from, say, TPN to TP, conditional on it having reached TPN. Then you start to see drivers that conflict - which is the reality of the situation. By that I mean that some factor like color contrast might be a positive in getting to TPN but a negative in getting to IOTD.

You mention inconsistencies. Given that Claude could not know what other TPs were available for promotion to IOTD on the day the judge selected the image, inconsistencies should be expected. The current process is in no way supposed to award IOTD to the best 365 images of the year.

Did Claude take into account only the image itself, or did it look at the imaging locale, equipment, integration, etc.? Regardless of the fact that submitters and reviewers are supposed to see only the image and equipment, I would think it would be interesting to see how that factors in. I'm sure that for two equal images, one done with a backyard refractor and one with a remote CDK, promotions are going to favor the backyard scope.

Time is also a factor. By that I mean that the general quality of the images on AB has improved dramatically over recent years. Having a training set that includes images from more than a few years back would really mess up the scores, I would think.

Did Claude attempt to assess composition in any way? That’s the least objective and yet one of the most important factors in judging images. This is art after all.

Kevin

Tony Gondola

Vin · May 13, 2026, 10:03 PM

So in short: better, more consistent feedback on my own work - that's it.

I guess the best it can do is see whether your image is technically in line with imaging winners.

Gilmour Dickson

I have no desire to see human artistic endeavor judged by a machine. Believe it or not, what we do is photography, and as such there are emotions and feelings involved in looking at (and judging) a picture. The constant creep of AI into this hobby is starting to get a bit much. I have no issue with non-generative tools that help us in processing, but that, for me, is the line.

It's like people jumping into forums and saying "I asked ChatGPT to tell me about (for example) back spacing and I don't understand the answer". Well, honestly, why didn't you just come to the actual humans first…

Not everything has to be or should be “data driven”. Just my 2 cents.

Vin

@Kevin Morefield thanks, this is exactly the kind of structured thinking I was hoping this would prompt - thank you. I'm hoping you'll be willing to share more feedback once you have a chance to play with it?

Your point about conditional scoring is well made and something I want to build in - scoring the TPN→TP transition separately from the general→TPN transition, which would surface the conflicting drivers you describe. I too am curious as to what that might show - it's on my to-do list.
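A rough sketch of what that per-transition scoring could look like (every factor name, weight, and number below is invented for illustration):

```python
# Illustrative sketch of per-transition scoring - every factor, weight, and
# number here is made up. The point: separate weight sets per promotion step
# let conflicting drivers surface (e.g. a factor that helps general -> TPN
# but hurts TP -> IOTD).

factors = {"colour_contrast": 8.0, "detail": 7.0, "composition": 6.5}

transition_weights = {
    "general_to_tpn": {"colour_contrast": 0.5, "detail": 0.3, "composition": 0.2},
    "tpn_to_tp":      {"colour_contrast": 0.2, "detail": 0.5, "composition": 0.3},
    "tp_to_iotd":     {"colour_contrast": -0.1, "detail": 0.5, "composition": 0.6},
}

for transition, weights in transition_weights.items():
    score = sum(w * factors[f] for f, w in weights.items())
    print(f"{transition}: {score:.1f}")
```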

On inconsistency - yes, agreed that IOTD judges can only pick from what is put before them, and Claude has no visibility into what else was available for promotion on a given day. So this tool isn't an attempt to undermine IOTD; it's just an attempt to apply a consistent rubric for feedback. One of the things I've observed from learners in other domains is the frustration and confusion that often come from inconsistent feedback, and that's one of the big motivations for this tool: to apply a consistent rubric. As for any inconsistency it reveals about IOTD or TP selections themselves, that's just about whether the craft parameters of selected images are themselves consistent (a narrower but perhaps still interesting question for some).

Currently it looks only at the image. I've deliberately kept equipment and metadata out for now - partly because the tool is meant to be a craft mirror, partly because the backyard-vs-remote-CDK asymmetry you describe is real (and well discussed here on AstroBin) and I didn't want to bake it in yet. But yes, it would be fascinating to iterate the tool to use separate remote vs backyard reference sets.

On temporal drift - completely agree. Claude's training will skew toward more recent images anyway, and I guess that's inevitable as consensus taste and preferences (which is essentially what the reference sets represent?) evolve.

On composition - yes, it's in the rubric. But it's the parameter I trust least for now, with the reason you name being one of the factors. What's interesting is how Claude describes its composition assessment — it tends to fall back on classical framing rules, which may or may not reflect what IOTD judges actually weight.

The art question is one I was coming back to initially but I'm actually more comfortable with it now. This tool doesn't try to label whether something is "art", and perhaps that's the most honest thing about it.

@Tony Gondola yes, that's a fair and accurate summary - and for me that's the most useful part of it, because it provides such an analytical summary in a consistent, data-driven way.

@Gilmour Dickson I hear you, and I think the concern is completely legitimate. To me, a child's crayon drawing has just as much artistic legitimacy as a Rothko (there's a reason Picasso said "It took me four years to paint like Raphael, but a lifetime to paint like a child"). But I'd gently push back on the framing here - this tool doesn't judge art, and it certainly doesn't replace human connection. It's a private feedback instrument, used by a photographer on their own image, for their own purposes. No different in spirit from using a histogram, a noise analyser, or any other diagnostic tool.

Nobody has to use it. Nobody's images are being judged by a machine without their consent. And the humans on AstroBin, including but by no means limited to the IOTD judges and pickers and submitters whose selections form part of some of the reference sets, are very much still in the loop.

The question of whether to ask an AI or a human first is a good one. For some things, like emotional response, humans are irreplaceable. But for a structured, repeatable craft rubric at 2am when you're processing an image and nobody's around, perhaps a consistent mirror is exactly what many will find helpful? There's no gospel truth in this tool, and that's not what it claims to be or do (for example, some of the feedback it gives on my own images doesn't work for my personal aesthetics, but I already know my aesthetic doesn't overlap with consensus in some respects, and that's fine with me).

Sorry, this has become a long reply, but I wanted to thank each of you for your thoughts. The early feedback from the first guinea-pigs has been positive, so I'm going to share the tool with a few more folks. If their feedback is also that it adds value to their own processes, then I'll put it out to the broader community. At the end of the day, its aim is just to be a tool - I'm sure the rubrics can be refined with many subtleties, and perhaps over time I'll do that. What I don't want to do is over-engineer it, because then it implicitly becomes a judgment tool, whereas the point is a private feedback tool.

Vin

The tool is now open to the community — here's the URL: astroscore.leelaastroimaging.org

Free to use. As per the AstroBin hobby tools guidelines, as it's already been announced in this thread I won't repost it separately. Feedback welcome here, or via DM, or via the tool itself.

Hope folks find it helpful.

Tony Gondola

Not working, nothing happens when I select an image.

Vin

If you drag and drop the image into the box (from a file manager, for example), Tony, then it will show a button to click for analysis.

(Sorry I should make it clearer on the tool that it’s drag & drop, thanks!)

(screenshot attached)

Tony Gondola

mmmmm, now I feel dumb but after dropping the file nothing changes, no button to click.

Vin

Not at all Tony, sorry - I know the UX can be improved.

You can either drag and drop, or now if you just left-click once in the drop area, a file manager should come up which will allow you to select and upload the file.

Once that's loaded, a button saying "analyse image" appears just below the drop area.

I’m attaching a screenshot (you should see an M31 image loaded in that example, and below that the analyse image button).

If you still have problems, pls let me know what machine and browser you're using and I'll see if it might be that - I've used it on a Mac on both Safari and Firefox, and a few other folks have used it on Windows machines, so it should be platform-agnostic.

(screenshot attached)

(Once it's analysed, you may need to scroll down, as the analysis is presented below the image rather than on a fresh screen.)

Tony Gondola

Yeah, I don't see anything like that. If I drag and drop or left-click and select, nothing happens. I'm running Chrome on Win11. I'll give it a try on a few other browsers to see if it makes a difference.

Tony Gondola

Yup, same result on MSN and Brave browsers…

Vin

Tony Gondola · May 14, 2026 at 10:33 PM

Yeah, I don't see anything like that. If I drag and drop or left-click and select, nothing happens. I'm running Chrome on Win11. I'll give it a try on a few other browsers to see if it makes a difference.

Thanks Tony. It may be a specific Chrome on Windows thing (I believe something to do with labels and overlay inputs). So I’ve tried to fix that and redeployed it - hopefully it will work this time for you on Chrome as well (you will need to reload the page to get the latest redeployment).

Pls let me know if there are still any problems and I'll look at it tomorrow (it's late here now). It should definitely also work on other browsers, though.

And thanks again for the click to select suggestion - that’s also been incorporated.
