Our review process

How We Review AI Tools & Agents

Every score on this site is earned through real-world testing — not marketing materials, not press access, not guesswork. Here is exactly how we review, what we measure, and why we think it matters.

202+ Tools reviewed
0 Paid placements
100% Self-funded
6 mo Re-review cycle
The process

From sign-up to published score

01

We buy the tool ourselves

Every tool we cover is purchased at full retail price using our own funds. No press licenses. No vendor-provided accounts. No free trials gifted for coverage. The experience you read about is the same experience you will have as a paying customer.

02

We test on real work for 2+ weeks

We do not run synthetic benchmarks or contrived demos. Every tool gets at least two weeks of active use inside real workflows — real codebases, real content pipelines, real client tasks. We document what works, what breaks, and what surprises us.

03

We score against four criteria

Every tool is measured on the same four dimensions — Reliability, Autonomy, Value, and Ease of Use — each scored from 1 to 10. The overall score is a weighted average. The scoring rubric is consistent across every review so comparisons are meaningful.

04

We re-review every 6 months

AI tools change fast. A product that scored 7.2 in January may deserve a 9.0 in July — or a 5.5. We re-evaluate every published review at least every six months. If a score changes, we show the history. If a tool gets significantly worse, we say so.

Scoring rubric

What every score measures

Each of the four dimensions is scored 1–10 against a published rubric. No guesswork — every point is defensible.

01

Reliability

Does it do what it says, consistently? We measure uptime, hallucination rate, error recovery, and how often the tool produces the expected output on the first attempt over the test period of at least two weeks.

Scale: 1 = Fails constantly; 10 = Flawless in production
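
As a rough illustration of the first-attempt measure described under Reliability, the sketch below computes the share of logged tasks that produced the expected output on the first try. The `TaskRun` record, its field names, and the sample log are all hypothetical; this is one plausible way to keep such a tally, not a description of the site's actual tooling.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    task: str        # short description of the task attempted
    attempts: int    # attempts made before the output was accepted or abandoned
    succeeded: bool  # False if the tool never produced an acceptable output


def first_attempt_rate(runs: list[TaskRun]) -> float:
    """Fraction of logged tasks that succeeded on the very first attempt."""
    if not runs:
        raise ValueError("no task runs logged")
    first_try = sum(1 for r in runs if r.succeeded and r.attempts == 1)
    return first_try / len(runs)


# Invented sample log for illustration only.
runs = [
    TaskRun("refactor module", 1, True),
    TaskRun("draft newsletter", 3, True),
    TaskRun("summarise ticket", 1, True),
    TaskRun("migrate schema", 2, False),
]
print(f"First-attempt success: {first_attempt_rate(runs):.0%}")
```

How a rate like this translates into a 1 to 10 Reliability score depends on the rubric and on the other factors listed above (uptime, hallucination rate, error recovery), which this sketch does not model.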
02

Autonomy

How much can it do without hand-holding? High-scoring tools complete multi-step tasks from a single instruction. Low-scoring tools require constant intervention, re-prompting, or manual correction at every step.

Scale: 1 = Needs constant supervision; 10 = Fully autonomous
03

Value

Is the price justified by the output? We calculate real cost-per-task at typical usage levels and compare it against what the alternative (human time, competing tools, or doing nothing) would cost. A $500/month tool can score a 9 if it saves $5,000 a month.

Scale: 1 = Wildly overpriced; 10 = Pays for itself fast
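
As a rough illustration of the cost-per-task comparison described under Value, the sketch below uses made-up usage figures (200 tasks a month and a $25-per-task alternative). The function names are hypothetical, and the step from a saving to a 1 to 10 score is a rubric judgment that this sketch does not attempt to model.

```python
def cost_per_task(monthly_price: float, tasks_per_month: int) -> float:
    """Tool cost per completed task at a given usage level."""
    if tasks_per_month <= 0:
        raise ValueError("tasks_per_month must be positive")
    return monthly_price / tasks_per_month


def monthly_saving(monthly_price: float, tasks_per_month: int,
                   alternative_cost_per_task: float) -> float:
    """Cost of the alternative for the same workload, minus the tool's price."""
    return alternative_cost_per_task * tasks_per_month - monthly_price


# Hypothetical numbers for illustration only.
price, tasks, alternative = 500.0, 200, 25.0
print(f"Tool: ${cost_per_task(price, tasks):.2f} per task")
print(f"Alternative: ${alternative:.2f} per task")
print(f"Net saving: ${monthly_saving(price, tasks, alternative):,.0f} per month")
```

With these assumed inputs the tool costs $2.50 per task against $25.00 for the alternative. The same tool at 10 tasks a month would cost $50.00 per task, which is why the comparison is made at typical usage levels.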
04

Ease of use

Can a non-expert get results? We test every tool as a first-time user and document time-to-first-value, documentation quality, and how much prior technical knowledge is required to achieve the core use case advertised on the homepage.

Scale: 1 = Requires a developer; 10 = Anyone can start in 10 minutes
Our independence

What we will never do

Independence is not a marketing claim for us — it is the only reason our reviews are worth reading. If we traded scores for money or access, there would be no point in this site existing. These are not aspirational policies. They are how every review has been produced since day one.

Accept payment for a review

No vendor has ever paid to be covered, reviewed, or ranked on this site. None ever will.

Let affiliate links affect scores

We earn commissions on some links. This has zero influence on scores or recommendations. A tool we earn nothing from will outscore a tool we earn from if the testing supports it.

Use press-provided accounts

We pay for our own subscriptions. Vendor-provided demo accounts do not reflect the real onboarding and billing experience, so we do not use them.

Allow companies to review content before publication

Vendors are welcome to flag factual errors after publication. They are never given pre-publication access to edit, moderate, or approve our findings.

Corrections & updates

We make mistakes. When we do, we correct them transparently. If a score changes due to a product update, we add a changelog note with the date and reason. If a factual error is reported by a reader or vendor, we verify it and publish a correction note within 48 hours.

Found an error? Email oneapplefall@gmail.com with "Correction:" in the subject line.

Questions

Methodology FAQ

Can I request a review of my product?

Yes. Email oneapplefall@gmail.com with a brief description of your product and the problem it solves. We prioritise tools our readers are actively asking about. Submission does not guarantee coverage, and we cannot share timelines.

How is the overall score calculated?

The overall score is a weighted average of the four criteria: Reliability (30%), Autonomy (30%), Value (20%), and Ease of Use (20%). For reviews of tools aimed at non-technical users, we weight Ease of Use at 30% and reduce Autonomy to 20%.

Do you review all categories of AI tools?

We focus on AI agents, automation tools, LLMs and developer tools, chatbot platforms, and content generation. We do not currently review AI hardware, image generators (beyond their agent use cases), or general consumer AI apps.

What happens if a product is updated after publication?

Minor updates (bug fixes, small UI changes) are noted in a brief update line at the top of the review. Major feature releases that could materially change the score trigger a full re-evaluation within 30 days.

How do affiliate relationships work?

Some links on the site are affiliate links: we earn a small commission if you click through and buy. We only link to tools we have reviewed. Affiliate relationships are disclosed clearly on every page that contains them. They have no influence on scores, rankings, or recommendations.

Can a company respond to a negative review?

Yes. Vendors can email oneapplefall@gmail.com to flag factual inaccuracies. We will verify and correct genuine errors. We will not change scores or remove negative findings in response to commercial pressure.
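
The weighted average from the FAQ answer on how the overall score is calculated can be sketched as follows. The weights are the ones stated in that answer; the function name, dictionary keys, sample scores, and rounding to one decimal place are assumptions made for illustration.

```python
# Weights as stated in the FAQ; the key names are illustrative.
DEFAULT_WEIGHTS = {
    "reliability": 0.30, "autonomy": 0.30, "value": 0.20, "ease_of_use": 0.20,
}
# For tools aimed at non-technical users: Ease of Use 30%, Autonomy 20%.
NON_TECHNICAL_WEIGHTS = {
    "reliability": 0.30, "autonomy": 0.20, "value": 0.20, "ease_of_use": 0.30,
}


def overall_score(scores: dict[str, float], non_technical: bool = False) -> float:
    """Weighted average of the four criterion scores, each on a 1-10 scale."""
    weights = NON_TECHNICAL_WEIGHTS if non_technical else DEFAULT_WEIGHTS
    for name in weights:
        if not 1 <= scores[name] <= 10:
            raise ValueError(f"{name} must be between 1 and 10")
    total = sum(scores[name] * weight for name, weight in weights.items())
    # Rounding to one decimal is an assumption, not a documented rule.
    return round(total, 1)


# Invented sample scores for illustration only.
sample = {"reliability": 8, "autonomy": 6, "value": 9, "ease_of_use": 7}
print(overall_score(sample))                      # 7.4
print(overall_score(sample, non_technical=True))  # 7.5
```

The same four scores give 7.4 under the default weights and 7.5 under the non-technical weighting, because this sample tool scores higher on Ease of Use than on Autonomy.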
Read the reviews

See the methodology in action.

Every score on the site was earned using this process. Browse our latest hands-on reviews and see for yourself.