Build an AI Horary Bot: From Question to Judgment in 3 API Calls

Ship an AI horary feature in a weekend. Why it retains better than daily horoscopes, what it costs to skip, and where the indie wedge lives in 2026.

Oleg Kopachovets

CTO & Co-Founder

May 18, 2026
12 min read

The shortest path from "question" to "judgment"

Building a horary feature used to mean shipping a small library of doctrine. Question classifier rules. Dignity tables. Planetary hours. Void-of-course handling with at least three competing exception lists. Perfection, frustration, prohibition, translation, collection. About a thousand years of accumulated argument, hard-coded.

You don't need that anymore.

On May 17, 2026 we shipped POST /api/v3/horary/ask. One endpoint. You hand it a plain-language question and the moment the question was asked. It returns a classification, the full technical judgment, a plain-language summary, a confidence band, and a list of typed radicality flags. Total round-trip is around 1.5 seconds.

This post is for indie SaaS builders, mobile app founders, and astrologer-entrepreneurs deciding whether horary is worth shipping. Below: market sizing, retention math, the engineering cost you avoid, confidence-band UX, the multilingual opportunity, and the product judgment that separates a horary feature people use from one they delete.

What /horary/ask actually does

The endpoint runs a four-stage pipeline behind a single POST.

Stage 1: classify. The question is passed to an LLM classifier with 10 few-shot examples per category. Categories include marriage, marriage.reconciliation (for estranged spouses), business.partnership, business.success, fertility, property.sale, property.purchase, litigation, health, lost.object, travel, and a few others. Classification results are cached in Redis under a sha256 hash of the question, with a 7-day TTL. The same question asked twice in a week costs you one LLM call, not two.

Stage 2: analyze. The classification picks the houses involved, and the engine runs the full traditional judgment: Ascendant ruler for the querent, quesited house ruler for the matter, then perfection, frustration, prohibition, void-of-course (with the new two-layer signal), translation of light, collection of light, radicality checks (early/late ascendant, Saturn in the 7th, Moon in the via combusta), planetary dignities, and Bonatti severity flags.

Stage 3: summarize. The technical judgment object goes back through an LLM to produce a 2-4 sentence plain-language summary in the requested language. Five languages are supported: en, ru, de, fr, es.

Stage 4: package. You get back the classification, the full judgment object, the summary, a confidence_band (high / medium / low), and radicality_flags as a typed list. The flags are machine-readable, so you can branch on them in your UI without parsing prose.
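The stage 1 caching behaviour can be pictured with a small sketch. This is not the service's implementation: the key prefix, the whitespace and case normalization, and the in-memory `TTLCache` (standing in for Redis) are all assumptions made for illustration. Only the sha256 hash and the 7-day TTL come from the description above.

```python
import hashlib
import time

SEVEN_DAYS = 7 * 24 * 60 * 60  # TTL in seconds, matching the 7-day window above


def cache_key(question: str) -> str:
    """Hypothetical key format: normalize whitespace and case, then sha256."""
    normalized = " ".join(question.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return "horary:classify:" + digest


class TTLCache:
    """In-memory stand-in for a Redis SET-with-expiry and GET."""

    def __init__(self, now=time.time):
        self._now = now
        self._store = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl=SEVEN_DAYS):
        self._store[key] = (self._now() + ttl, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._now() >= expires_at:
            del self._store[key]  # expired: behave like a cache miss
            return None
        return value


def classify_cached(question, cache, classify):
    """Return a cached classification, calling `classify` only on a miss."""
    key = cache_key(question)
    hit = cache.get(key)
    if hit is not None:
        return hit
    result = classify(question)
    cache.set(key, result)
    return result
```

Here `classify` is whatever function performs the LLM call; passing it in keeps the sketch testable without a network.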
A real response for the question "Will my house sale close this month?" looks like this:
```json
{
  "classification": {
    "category": "property.sale",
    "confidence": 0.94,
    "querent_house": 1,
    "quesited_house": 5
  },
  "judgment": {
    "perfection": "translation_of_light",
    "translator": "Mercury",
    "voc_exception_sign": "Taurus",
    "voc_effective_strength": "mitigated",
    "bonatti_severity": "mild",
    "radicality_flags": ["early_ascendant"],
    "verdict": "yes_with_delay"
  },
  "summary": "The sale is likely to close, but expect a delay. Mercury carries the deal between buyer and seller, meaning paperwork or a small renegotiation. The chart's early ascendant suggests it is too early to be fully decided; ask again if nothing moves in two weeks.",
  "confidence_band": "medium",
  "radicality_flags": ["early_ascendant"]
}
```

That is the entire payload you build a feature on top of. Everything else in this post is product strategy, not parsing.
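For orientation, here is a minimal Python sketch of calling the endpoint and reducing the response to the fields a UI needs. Several details are assumptions, since the post gives only the path and the response shape: the host in `BASE_URL`, the request field names `question` and `asked_at`, and the bearer-token header are placeholders to check against the real API reference.

```python
import json
import urllib.request
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical host: the post gives only the path, not the host or auth scheme.
BASE_URL = "https://api.example.com"
ENDPOINT = "/api/v3/horary/ask"


@dataclass(frozen=True)
class HoraryAnswer:
    category: str
    verdict: str
    summary: str
    confidence_band: str
    radicality_flags: tuple


def build_payload(question: str, asked_at: datetime, lang: str = "en") -> dict:
    """Assumed request shape: `question` and `asked_at` are guessed field names."""
    return {
        "question": question,
        "asked_at": asked_at.astimezone(timezone.utc).isoformat(),
        "lang": lang,
    }


def parse_answer(body: dict) -> HoraryAnswer:
    """Pull out the fields shown in the sample response above."""
    return HoraryAnswer(
        category=body["classification"]["category"],
        verdict=body["judgment"]["verdict"],
        summary=body["summary"],
        confidence_band=body["confidence_band"],
        radicality_flags=tuple(body.get("radicality_flags", [])),
    )


def ask(question: str, asked_at: datetime, api_key: str, lang: str = "en") -> HoraryAnswer:
    """POST the question; the bearer-token header is an assumption."""
    request = urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(build_payload(question, asked_at, lang)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_answer(json.load(response))
```

Keeping `parse_answer` separate from the network call means the UI logic can be exercised against saved responses like the sample above.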

Who is actually paying for this

The astrology app market crossed $12B in annual consumer spend last year. Co-Star sits at roughly 30M downloads with a freemium plus $2.99/mo Plus tier. The Pattern crossed 50M downloads on a $7.99/mo subscription. Sanctuary, Nebula, and Chani all charge $7.99 to $14.99/mo. The sticker price most paying users tolerate is $5 to $15 per month. That is the slot you design for.

The wedge for a small dev or solo astrologer is not a Co-Star clone. The big players sell vibes and daily horoscopes to gen-pop users. They under-serve the slice of users who want answers to specific questions. That slice converts at higher rates because the value is concrete. A user asking "should I quit my job in June" has higher willingness-to-pay than a user idly scrolling a moon-phase widget.

The serious horary audience is small but dense. Niche traditional-astrology apps like Time Nomad, Solar Fire Mobile, and Astro Gold sell to working astrologers at $30-90 one-time. Their UX is for practitioners, not consumers. A consumer-friendly horary chatbot is the missing rung. You don't need a million users. You need ten thousand at $9/mo and a churn rate under 5%.

Two ICPs that work:

  1. Astrologer-entrepreneurs wrapping their practice in an app, charging $15-29/mo for "ask me anything in the chat" with a horary engine doing the heavy lifting between live readings.
  2. Indie devs building a single-purpose horary app (think "Yes or No, but real") at the $4.99/mo impulse-buy tier, betting on volume.

Both make money. Both ship faster with one endpoint than with a year of doctrine code.

Why horary out-retains daily horoscopes

The biggest mistake astrology apps make is leading with "daily horoscope" as the retention loop. Daily horoscopes are a habit, not a reason to stay. Once a user notices the copy is mostly generic, they cancel. Average paid retention on horoscope-led apps drops to 30-40% by day 90.

Horary inverts the loop. A user opens the app because they have a question they cannot answer themselves. They get a specific, time-stamped judgment they can verify against reality. Two weeks later they find out the house sale closed on the delay you predicted. They come back to ask the next thing. The mechanic is outcome-traceable answers, and that is rare in consumer software.

Three retention effects you can design for:

  • Questions cluster around life events. A user asking about a job change also asks about salary, location, manager, and timing within a 30-day window. One paying user can ask 8-12 questions in their first month.
  • Receipts. Show the user a timeline of past questions and actual outcomes (if they tap "happened" or "didn't"). That loop is what makes them tell friends.
  • High-intent referral. Horary users refer at 3-5x the rate of horoscope users, because they refer with a specific story ("it called the delay exactly"), not a vague endorsement.

Combine those and a horary-led app can hit 60%+ day-90 retention. That is the difference between a real business and a stalled one.
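The receipts loop described above needs very little data. A minimal sketch follows, where the `Receipt` fields and both helper functions are hypothetical names chosen for illustration rather than anything the API returns:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Receipt:
    asked_on: date
    question: str
    verdict: str                     # e.g. "yes_with_delay" from the judgment
    happened: Optional[bool] = None  # None until the user taps a button


def timeline(receipts):
    """Newest question first, the order a user would scroll through."""
    return sorted(receipts, key=lambda r: r.asked_on, reverse=True)


def resolution_stats(receipts):
    """Count how many questions the user has marked, and how many happened."""
    resolved = [r for r in receipts if r.happened is not None]
    return {
        "asked": len(receipts),
        "resolved": len(resolved),
        "happened": sum(1 for r in resolved if r.happened),
    }
```

Leaving `happened` as `None` until the user taps keeps unanswered questions out of any hit-rate you show them.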

What hand-wiring this actually costs

If you decide to skip /horary/ask and build the pipeline yourself, here is the honest hour count from teams who have tried.

Classifier. Labeling enough horary question examples, building the prompt, evaluating against a held-out set: 40-80 hours. Maintaining the few-shot list as new question patterns emerge: ongoing.

Doctrine engine. Perfection, frustration, prohibition, translation, collection, with the right edge cases against an ephemeris: 120-200 hours for a working version, plus 80-120 for correctness work. The VoC rules alone (Lilly, Bonatti, Saunders all disagree) eat a full week if you take them seriously.

Radicality and dignity tables. Twelve dignity schemes, planetary hours, sect, three early/late ascendant thresholds depending on tradition: 40-60 hours.

Summarizer. Prompt engineering an LLM to read a structured judgment object and produce a 2-4 sentence answer that doesn't hallucinate the verdict: 30-50 hours, plus an eval set you need to maintain.

Multilingual layer. Per-language prompt tuning and QA across the five languages we ship: 40-80 hours each if you do it well.

Round numbers: 500-800 hours for a v1 you would ship to paying customers. At an $80/hr contractor rate that is $40,000 to $64,000 before a single user sees it. At founder opportunity cost, it is the quarter you weren't shipping the rest of the product.

/horary/ask is one POST and a JSON parse. The decision is whether your weekend is better spent shipping or rebuilding common infrastructure.
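The arithmetic behind those round numbers can be checked in a few lines. The hour ranges are copied from the breakdown above; the one assumption is `EXTRA_LANGUAGES = 4`, which treats English as the baseline and counts per-language tuning only for the other four.

```python
# (low, high) hour estimates copied from the breakdown above.
COMPONENTS = {
    "classifier": (40, 80),
    "doctrine engine, working version": (120, 200),
    "doctrine engine, correctness work": (80, 120),
    "radicality and dignity tables": (40, 60),
    "summarizer": (30, 50),
}
PER_LANGUAGE = (40, 80)
EXTRA_LANGUAGES = 4  # assumption: English is the baseline, four more to tune
RATE_USD = 80        # contractor rate quoted above


def total_hours(extra_languages=EXTRA_LANGUAGES):
    """Sum the low and high ends of every component plus the language work."""
    low = sum(lo for lo, _ in COMPONENTS.values()) + PER_LANGUAGE[0] * extra_languages
    high = sum(hi for _, hi in COMPONENTS.values()) + PER_LANGUAGE[1] * extra_languages
    return low, high


def cost_usd(hours, rate=RATE_USD):
    """Contractor cost for a given number of hours."""
    return hours * rate
```

Under that assumption `total_hours()` returns `(470, 830)`, which rounds to the 500-800 range, and `cost_usd(500)` and `cost_usd(800)` give 40,000 and 64,000. Counting all five languages at the per-language rate instead gives `(510, 910)`, so the high end is sensitive to how the multilingual work is scoped.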

Confidence bands as UX strategy

The confidence_band field is the most product-important thing in the response, and most teams miss it on the first pass. It is not a debug flag. It is your trust-building surface.

Most astrology apps lose users because they overclaim. The bot sounds 100% certain about a question the chart is 60% certain about, the prediction misses, the user feels deceived. Once. They cancel and write a one-star review about how it is "fake".

Horary doctrine is older than astrology apps and the tradition already solved this. A "non-radical" chart is one you are not supposed to read with confidence. The endpoint surfaces that signal so you can pass it to users honestly.

How to translate the three bands into UX:

  • High. Lead with the verdict in bold. "Yes." Or "Yes, with a delay." Show a one-sentence reason underneath. Hide the technical details behind a "see why" disclosure for the curious. Users feel: clean, decisive, trustworthy.
  • Medium. Lead with the verdict but pair it with a "factors to consider" block. List the radicality flags as plain-language caveats. Users feel: thoughtful, not robotic.
  • Low. Do not lead with the verdict. Lead with the meta-judgment: "this question isn't ripe yet". Offer a "rephrase" button and a "try again later" timer. Users feel: respected, not gaslit.

The third one is the move. Most apps will never refuse a question. Refusing one well, when the chart says you should, is the single strongest trust signal a horary bot has. Users who get a "this isn't ready, try in a few hours" answer are more likely to keep paying than users who get a confident wrong answer.
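One way to keep the three layouts honest is to drive them from a single lookup keyed on `confidence_band`. The sketch below is an illustration only: `UiPlan` and its field names are invented here, and falling back to the low-band layout for an unrecognized value is a design choice, not documented API behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UiPlan:
    lead_with_verdict: bool   # show "Yes." / "Yes, with a delay." first
    show_caveats: bool        # list radicality flags as plain-language caveats
    offer_rephrase: bool      # "rephrase" button
    offer_retry_timer: bool   # "try again later" timer


# One plan per band, following the three bullets above.
PLANS = {
    "high": UiPlan(lead_with_verdict=True, show_caveats=False,
                   offer_rephrase=False, offer_retry_timer=False),
    "medium": UiPlan(lead_with_verdict=True, show_caveats=True,
                     offer_rephrase=False, offer_retry_timer=False),
    "low": UiPlan(lead_with_verdict=False, show_caveats=True,
                  offer_rephrase=True, offer_retry_timer=True),
}


def plan_for(confidence_band: str) -> UiPlan:
    """Unknown or missing bands fall back to the most cautious layout."""
    return PLANS.get(confidence_band, PLANS["low"])
```

Defaulting to the cautious layout means a new or misspelled band value can never cause the app to overclaim.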

A real example. User asks "will my business succeed?" The classifier returns business.success at 0.91. The judgment object comes back with early_ascendant. The endpoint sets confidence_band: "low". The right UI here is: "The chart says it's too early to tell. Traditional horary calls this an 'early ascendant', meaning the question hasn't fully ripened. Ask again in 2-4 hours, or rephrase so 'success' has a clearer meaning to you."

Two flag types worth special copy: via_combusta (Moon between 15° Libra and 15° Scorpio) and late_ascendant (29° of any sign). Both have specific traditional meanings and a polished app surfaces them by name. It is also good content marketing. Users screenshot interesting flags and post them.
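Because the flags arrive as a typed list, the caveat copy can live in a plain lookup. The strings below are placeholder copy assembled from the descriptions in this post, and the fallback wording for an unrecognized flag is an assumption; only the three flag names come from the text above.

```python
# Placeholder copy; replace with wording reviewed by an astrologer.
FLAG_COPY = {
    "early_ascendant": "Early ascendant: the question hasn't fully ripened yet.",
    "late_ascendant": "Late ascendant: the ascendant sits at 29° of its sign.",
    "via_combusta": "Via combusta: the Moon is between 15° Libra and 15° Scorpio.",
}


def caveats(radicality_flags):
    """Turn machine-readable flags into one line of display copy each."""
    lines = []
    for flag in radicality_flags:
        copy = FLAG_COPY.get(flag)
        if copy is None:
            # Unknown flag: show a generic caution rather than dropping it.
            copy = f"The chart carries a caution ({flag.replace('_', ' ')})."
        lines.append(copy)
    return lines
```

Surfacing unknown flags generically, instead of silently skipping them, keeps the UI truthful if the API adds flag types later.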

The multilingual angle is bigger than you think

English-first product thinking is leaving money on the table in three regions specifically.

LATAM and Spain. Spanish-speaking astrology consumption per capita is among the highest in the world. Mexico, Argentina, Colombia, and Spain combined run an estimated $1.5-2B in annual astrology spend, with effectively zero quality horary apps localized for Spanish. Sub-$10/mo apps with Spanish-native UX convert at a meaningful premium to translated-English copy.

DACH. German-speaking astrology has a long traditional-astrology lineage and an audience that takes horary more seriously than English-speaking gen-pop. Switzerland and Austria, in particular, contain a small but high-LTV slice. Indie apps charging €9-14/mo in this market see lower churn than equivalent US apps.

France. Smaller market than LATAM or DACH but very low competition for traditional-technique apps. A founder shipping a French-native horary bot is competing with roughly zero polished products.

The lang parameter handles the summary and plain-language radicality copy. The classification and judgment fields stay language-neutral, so your analytics, A/B tests, and routing logic do not fork per language. Five languages on day one is not five forks of business logic. It is one feature, five user surfaces. The English horary slot is filling up. Spanish, German, and French are largely open. Ship localized first.
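Choosing the `lang` value from a device locale is a small function. A minimal sketch, assuming the client hands you a locale tag such as `es-MX` or `de_AT`; the function name and the English fallback are illustrative choices, while the five codes are the ones listed earlier.

```python
SUPPORTED_LANGS = ("en", "ru", "de", "fr", "es")
DEFAULT_LANG = "en"


def pick_lang(locale):
    """Map a locale tag such as 'es-MX' or 'de_AT' to a supported `lang` value."""
    if not locale:
        return DEFAULT_LANG
    # Keep only the primary language subtag, whichever separator is used.
    primary = locale.replace("_", "-").split("-")[0].lower()
    return primary if primary in SUPPORTED_LANGS else DEFAULT_LANG
```

For example, `pick_lang("es-MX")` yields `"es"`, while an unsupported locale such as `"pt-BR"` falls back to `"en"`.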

What goes wrong: scope creep that kills horary apps

Honest section. Most horary features that fail in production fail for the same three reasons, and none of them are technical.

The lottery problem. Users will ask "will I win the lottery". The classifier will accept it. The judgment will look like a coin flip because gambling questions sit outside the doctrine's assumption of a sincere, stake-holding querent. Users will get a fake-feeling answer and churn. The fix is a question-quality gate at the input, not a smarter judgment engine. Block gambling, sports betting, and "will I get rich" patterns at the chat layer.

The "will I be happy" problem. Vague questions produce vague answers. A user asking "will I be happy" gets back an analysis that cannot say anything specific because the question doesn't reference a quesited matter. The right product move is to coach the user into a specific question before sending anything to the API. "Happy about what? Your relationship, your job, your move to Lisbon?" That is a chat turn, not an API call.

The "ask again until you like the answer" problem. Users will rephrase the same question five times until they get a "yes". The 7-day Redis cache catches verbatim repeats on the server side, but a cache keyed on a hash of the question text will not match a rephrasing, so the chat layer has to spot near-duplicates itself. The bigger issue is product trust. Hard rule: surface the earlier answer with a polite "you asked something like this on Tuesday, here's what the chart said" message. Force the user to confront their own pattern. The ones who appreciate it become your highest-LTV users.
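Spotting a rephrased question in the chat layer can start with a crude lexical check. The sketch below uses the standard library's `difflib.SequenceMatcher`; the 0.75 threshold, the history tuple shape, and the function names are assumptions, and character-level similarity will miss rewordings that share few words, so treat it as a starting point rather than a finished detector.

```python
import re
from datetime import datetime, timedelta
from difflib import SequenceMatcher

WINDOW = timedelta(days=7)  # mirrors the 7-day cache window
THRESHOLD = 0.75            # assumed cutoff; tune on real question logs


def normalize(question):
    """Lowercase and strip punctuation so trivial edits do not hide a repeat."""
    return " ".join(re.findall(r"[a-z0-9']+", question.lower()))


def find_recent_repeat(question, history, now):
    """Return the most similar (asked_at, question, summary) entry, or None.

    `history` is a list of (asked_at, question, summary) tuples for one user.
    """
    target = normalize(question)
    best, best_score = None, THRESHOLD
    for entry in history:
        asked_at, past_question, _summary = entry
        if now - asked_at > WINDOW:
            continue  # too old to count as "asking again"
        score = SequenceMatcher(None, target, normalize(past_question)).ratio()
        if score >= best_score:
            best, best_score = entry, score
    return best
```

When a match comes back, show its stored summary with the "you asked something like this" message instead of making a new API call.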

The pattern across all three: refuse work the doctrine refuses. That is what makes horary feel real, and it is the moat against a competitor who lets the bot answer everything. Every refused question is also a request you didn't pay for and a user you didn't disappoint.
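A question-quality gate in front of the API can be a few lines of pattern matching. The patterns and reply copy below are assumptions for illustration; a production list would be built from real chat logs, and a classifier could replace the regexes later.

```python
import re

# Assumed patterns; extend from real chat logs.
BLOCKED = re.compile(
    r"\b(lottery|lotto|jackpot|casino|bet|betting|gamble|gambling|get rich)\b",
    re.IGNORECASE,
)
VAGUE = re.compile(
    r"^\s*will i (ever )?be (happy|ok|okay|fine|successful)\s*\??\s*$",
    re.IGNORECASE,
)


def gate(question):
    """Return (decision, message) where decision is 'refuse', 'coach', or 'send'."""
    if BLOCKED.search(question):
        return (
            "refuse",
            "Horary can't judge games of chance. "
            "Try a question about a decision you hold a real stake in.",
        )
    if VAGUE.match(question):
        return "coach", "Happy about what? Your relationship, your job, a move?"
    return "send", ""
```

Only a `"send"` decision should reach /horary/ask; the other two are handled as chat turns.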

Pricing and rate limits

/horary/ask runs an LLM classifier in stage 1 and an LLM summarizer in stage 3, so it counts against your standard request quota. One question equals one API call regardless of how many internal LLM hops happen behind the scenes.

Practical sizing:

  • Professional ($37/mo, 55,000 requests) covers most indie chatbots through the first few thousand paying users. The Redis classifier cache means repeat questions effectively cost zero against your quota.
  • Business ($99/mo, 220,000 requests) is the right tier once you cross roughly 5,000 daily active users or open a public free tier with no rate limit of your own.
  • Enterprise if you need dedicated capacity, custom rate limits, or SLA-backed latency below 1 second.

Full breakdown on the pricing page. The free tier (50 requests/month) is enough to wire up the integration and put it through real questions before committing.
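A back-of-envelope sizing check, using the quotas quoted above and the rule that one question equals one request. The function names are illustrative, cache effects are ignored, and treating Enterprise as the overflow tier is an assumption; the pricing page is the authority.

```python
# (tier name, monthly request quota) as quoted in this section.
TIERS = [
    ("Free", 50),
    ("Professional", 55_000),
    ("Business", 220_000),
]


def monthly_requests(paying_users, questions_per_user):
    """One question equals one API call, so requests scale linearly."""
    return paying_users * questions_per_user


def smallest_tier(requests_per_month):
    """Pick the first tier whose quota covers the expected volume."""
    for name, quota in TIERS:
        if requests_per_month <= quota:
            return name
    return "Enterprise"  # assumption: anything above Business needs a custom plan
```

For instance, 3,000 paying users asking 10 questions a month is 30,000 requests, which fits Professional; 10,000 users at 12 questions each is 120,000, which needs Business.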

Ship it this weekend

Grab an API key. Pick three test questions from your own life: a real one, a vague one, a clearly bad one. Send them through /horary/ask. Watch how the confidence_band shifts. That's your spec for the UI.

For background, the horary product page covers the doctrine in more depth. The astrology chat API is the companion endpoint for grounded follow-up turns. Electional astrology is the cross-sell for "when" questions, cazimi and combust is the deeper dive for partnership and visibility, and traditional astrology is the parent topic.

Three calls. One weekend. A horary feature that would have been a six-month build a year ago.

Oleg Kopachovets

CTO & Co-Founder

Technical founder at Astrology API, specializing in astronomical calculations and AI-powered astrology
