Coverage for astrocyte/pipeline/intent

"""Intent → RRF channel weights mapping (M34).

Maps :class:`~astrocyte.pipeline.query_intent.QueryIntent` to per-channel
weights for :func:`~astrocyte.pipeline.fusion.weighted_rrf_fusion`.

Why this exists
---------------

Pre-M34, ``fact_recall`` ran 4-5 retrieval channels (semantic, episodic,
temporal, link-expansion, BM25) and fused them with **equal-weight RRF**.
The v015i / v015j bench runs showed this single-pool architecture
shuffles ~3-4 questions across LME categories when temporal coverage
shifts: gains in temporal-reasoning come at the cost of
single-session-preference (and vice versa). Knob-tuning (capping
``top_k_temporal``) couldn't break the trade-off because the channels
compete on rank inside the same fused pool.

M34's intent layer fixes this by **biasing channel contribution per
query intent**. Temporal questions boost the temporal channel; preference
questions damp it. The UNKNOWN fallback stays close to the pre-M34
equal-weight behaviour (only ``temporal`` is damped, to 0.5) for legacy
callers and queries we can't classify.

Design choices
--------------

- **All weights in [0.0, 1.5]**: a bounded range keeps RRF stable. Negative
  weights are rejected by ``weighted_rrf_fusion``; we use ``0.0`` to mute
  a channel rather than skip it conditionally in calling code.
- **Asymmetric biases**: the strongest boost (1.5) is reserved for the
  channel an intent depends on; the strongest dampening (0.2-0.3) for
  channels that introduce noise for that intent. Most channels stay at
  1.0 (neutral).
- **No 0.0 weights in the production table**: every channel gets at
  least 0.2. Hard mutes invite silent failures when the classifier
  misfires; soft dampening preserves graceful degradation.

References
----------

- Design doc: ``docs/_design/m34-query-intent-routing.md``
- Forensic basis: ``docs/_design/m31-lme-quality.md`` §8 (M31c
  anti-composition) + ``benchmark-results/.../astrocyte-v015{i,j}``
- Hindsight parallel: ``hindsight-api-slim/.../memory_engine.py:3009-3211``
  uses per-fact-type segmentation + conditional channel arity; M34 is
  the recall-bias analogue (combined with per-fact-type segmentation
  via M34-4).
"""

from __future__ import annotations

from dataclasses import dataclass

from astrocyte.pipeline.query_intent import QueryIntent

@dataclass(frozen=True)
class ChannelWeights:
    """RRF weights for each fact-recall channel.

    All weights must be ``>= 0.0`` (enforced by
    :func:`~astrocyte.pipeline.fusion.weighted_rrf_fusion`). A weight of
    ``0.0`` mutes the channel; a weight of ``1.0`` is neutral; values
    above ``1.0`` boost the channel's reciprocal-rank contribution.

    Channel names match the keyword arguments of ``fact_recall``:

    - ``semantic``: cosine over fact-text embeddings
    - ``episodic``: episodic-marker entity search (M18a-4)
    - ``temporal``: date-range filter via ``search_facts_temporal``
    - ``link_expansion``: cross-session entity graph (M27)
    - ``bm25``: full-text/BM25 over ``fact_text`` (M31c, re-wired in M34-5)
    """

    semantic: float = 1.0
    episodic: float = 1.0
    temporal: float = 1.0
    link_expansion: float = 1.0
    bm25: float = 1.0

#: Per-intent channel weight table. The single calibration knob of M34.
#:
#: Calibrated against the v015i/v015j failure modes:
#:
#: - SSP regressed -5 when temporal flooded → PREFERENCE-style intents
#:   damp temporal to 0.3.
#: - MS regressed -3 from cross-session dilution → RELATIONAL boosts
#:   link_expansion to 1.5.
#: - SSU held at 7/15 because BM25 was off → FACTUAL boosts bm25 to 1.5.
#: - TR held its +2 in both runs → TEMPORAL keeps temporal at 1.5.
#:
#: UNKNOWN (the safe fallback) gets the same weight profile as the
#: v015j "all equal but slightly damped temporal" setup, so it is never
#: worse than current production behaviour for unclassifiable queries.
INTENT_CHANNEL_WEIGHTS: dict[QueryIntent, ChannelWeights] = {
    QueryIntent.TEMPORAL: ChannelWeights(
        semantic=1.0, episodic=0.7, temporal=1.5, link_expansion=0.5, bm25=1.0,
    ),
    QueryIntent.COMPARATIVE: ChannelWeights(
        semantic=1.0, episodic=1.0, temporal=0.3, link_expansion=1.0, bm25=1.0,
    ),
    QueryIntent.RELATIONAL: ChannelWeights(
        semantic=0.8, episodic=1.0, temporal=0.5, link_expansion=1.5, bm25=1.0,
    ),
    QueryIntent.FACTUAL: ChannelWeights(
        semantic=1.5, episodic=0.5, temporal=0.3, link_expansion=0.5, bm25=1.5,
    ),
    QueryIntent.PROCEDURAL: ChannelWeights(
        semantic=1.2, episodic=0.8, temporal=0.3, link_expansion=0.8, bm25=1.0,
    ),
    QueryIntent.EXPLORATORY: ChannelWeights(
        semantic=1.0, episodic=1.0, temporal=1.0, link_expansion=1.0, bm25=1.0,
    ),
    QueryIntent.UNKNOWN: ChannelWeights(
        semantic=1.0, episodic=1.0, temporal=0.5, link_expansion=1.0, bm25=1.0,
    ),
}


#: Neutral baseline. Identical to ``INTENT_CHANNEL_WEIGHTS[QueryIntent.UNKNOWN]``
#: (so ``temporal`` is damped to 0.5, not 1.0) but exposed as a constant
#: for callers that want to opt out of intent routing without thinking
#: about which fallback to pick. ``ChannelWeights()`` gives strictly
#: equal weights.
NEUTRAL_WEIGHTS: ChannelWeights = INTENT_CHANNEL_WEIGHTS[QueryIntent.UNKNOWN]


def weights_for_intent(intent: QueryIntent | None) -> ChannelWeights:
    """Look up channel weights for an intent.

    Args:
        intent: Classified intent, or ``None`` to use the neutral
            baseline. ``None`` and :attr:`QueryIntent.UNKNOWN` resolve to
            the same baseline. Callers that classify and find UNKNOWN
            should pass UNKNOWN explicitly so debug logs / metrics
            distinguish "classifier ran and was uncertain" from "caller
            didn't classify".

    Returns:
        Frozen :class:`ChannelWeights`. Never raises; unknown enum
        values fall back to :data:`NEUTRAL_WEIGHTS`.
    """
    if intent is None:
        return NEUTRAL_WEIGHTS
    return INTENT_CHANNEL_WEIGHTS.get(intent, NEUTRAL_WEIGHTS)


__all__ = [
    "ChannelWeights",
    "INTENT_CHANNEL_WEIGHTS",
    "NEUTRAL_WEIGHTS",
    "weights_for_intent",
]
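
The docstrings above say that a weight scales a channel's reciprocal-rank contribution and that negative weights are rejected. The sketch below illustrates that idea under stated assumptions: it uses the common RRF form `weight / (k + rank)` with `k = 60`, and the function `weighted_rrf_sketch`, its signature, and the fact ids are hypothetical. It is not the actual `weighted_rrf_fusion` implementation.

```python
from collections import defaultdict


def weighted_rrf_sketch(
    ranked: dict[str, list[str]],
    weights: dict[str, float],
    k: int = 60,
) -> list[str]:
    """Fuse per-channel rankings; each hit adds weight / (k + rank).

    Hypothetical stand-in, assuming the standard RRF formula with a
    rank constant of 60. Channels missing from ``weights`` count as 1.0.
    """
    scores: dict[str, float] = defaultdict(float)
    for channel, fact_ids in ranked.items():
        weight = weights.get(channel, 1.0)
        if weight < 0.0:
            raise ValueError(f"negative weight for channel {channel!r}")
        for rank, fact_id in enumerate(fact_ids, start=1):
            scores[fact_id] += weight / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda fact_id: (-scores[fact_id], fact_id))


# Two channels that disagree on the ordering of the same three facts.
ranked = {
    "semantic": ["f1", "f2", "f3"],
    "temporal": ["f3", "f2", "f1"],
}

equal = weighted_rrf_sketch(ranked, {"semantic": 1.0, "temporal": 1.0})
boosted = weighted_rrf_sketch(ranked, {"semantic": 1.0, "temporal": 1.5})

print(equal)    # ['f1', 'f3', 'f2']
print(boosted)  # ['f3', 'f2', 'f1']
```

With equal weights the two channels cancel out and `f1` and `f3` tie; raising the temporal weight to 1.5 moves the temporal channel's top hit to the front without removing the semantic channel from the pool.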

Coverage for astrocyte/pipeline/intent_weights.py: 100%

17 statements
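
The design choices in the module docstring (production weights kept within [0.2, 1.5], frozen weight objects, a fallback lookup that never raises) lend themselves to a cheap invariant check. The sketch below assumes stand-in types so that it runs without the astrocyte package: `Intent`, `Weights`, `TABLE`, `out_of_range`, and `lookup` are hypothetical names that mirror only a subset of the real table.

```python
from dataclasses import FrozenInstanceError, asdict, dataclass
from enum import Enum


class Intent(Enum):
    """Stand-in for QueryIntent (three members only)."""

    TEMPORAL = "temporal"
    FACTUAL = "factual"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class Weights:
    """Stand-in mirroring the fields of ChannelWeights."""

    semantic: float = 1.0
    episodic: float = 1.0
    temporal: float = 1.0
    link_expansion: float = 1.0
    bm25: float = 1.0


# Values copied from the corresponding rows of the table above.
TABLE = {
    Intent.TEMPORAL: Weights(episodic=0.7, temporal=1.5, link_expansion=0.5),
    Intent.FACTUAL: Weights(
        semantic=1.5, episodic=0.5, temporal=0.3, link_expansion=0.5, bm25=1.5
    ),
    Intent.UNKNOWN: Weights(temporal=0.5),
}


def out_of_range(table, lo=0.2, hi=1.5):
    """Return (intent, channel, weight) triples outside [lo, hi]."""
    return [
        (intent, channel, weight)
        for intent, weights in table.items()
        for channel, weight in asdict(weights).items()
        if not lo <= weight <= hi
    ]


def lookup(intent):
    """Same shape as weights_for_intent: None and misses use UNKNOWN."""
    fallback = TABLE[Intent.UNKNOWN]
    if intent is None:
        return fallback
    return TABLE.get(intent, fallback)


violations = out_of_range(TABLE)          # empty when the table is in bounds
hard_mute = out_of_range({Intent.UNKNOWN: Weights(bm25=0.0)})

try:
    lookup(Intent.TEMPORAL).temporal = 0.0  # frozen dataclass: not allowed
    mutated = True
except FrozenInstanceError:
    mutated = False
```

A check like `out_of_range` could run as a unit test so that a later recalibration cannot silently introduce a hard mute or an unbounded boost.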