2026 edition: How to read Japan's X algorithm

For beginners to X operation / Published: 2026/02/20 · Updated: 2026/03/12


Information on X operation still tends to rely on explanations like "you can grow with this one trick." In 2026, however, the range of behavior that can be verified against public code has expanded, and the premises of the discussion can be made considerably more concrete.

This article is a practical re-edit based on xai-org/x-algorithm as of March 1, 2026, translating the processing structure of the For You recommendation into Japanese-language operation. The title is unchanged, but the body is substantially more detailed.

3 minute summary first

For busy readers, here are the conclusions first.

  1. x-algorithm shows the implementation structure of the For You feed recommendation, with candidate sourcing, exclusion, and scoring clearly separated.
  2. Ranking is not a single metric but a weighted sum of multiple action probabilities, and negative signals such as not_interested / block / mute / report are explicitly treated as deductions.
  3. Filters run in two stages, before and after scoring, so you must design not only to raise the score but also to avoid being dropped in the first place.
  4. Because a ranking mask (candidate isolation) stops candidates from seeing each other, improving the quality of individual posts is directly linked to reproducibility.
  5. For Japanese-language operation, the highest-leverage improvement is clarifying the two-line introduction.

Below, these five points are explained in detail using the vocabulary of the code.

The position of this article (the line between facts and inference)

First, let us clarify where the public facts end and where operational inference begins.

Published facts (verifiable in the repository)

  • Names of the key components of the For You recommendation
  • Stage structure of candidate sourcing, hydration, filtering, scoring, and selection
  • Concept of multi-action prediction and weighted-summation scoring
  • Implementation policy of an attention mask that prevents candidates from seeing each other
  • Typical filter names (duplicate, old, seen, muted, visibility, etc.)

Undisclosed or undetermined (gray areas)

  • Exact weight values and thresholds used in production (parameters may not be disclosed)
  • Model training frequency, serving ratio against real traffic, and regional fine-tuning
  • The overall system picture across all product surfaces

In other words, this article follows two principles: explain the structure on the basis of the code, and do not mix in speculation about concrete values.

Figure 1: Overview of For You recommendation pipeline

For You Pipeline Overview

This diagram summarizes the processing order readable from the public README and the home-mixer implementation. Crucially, scoring does not exist in isolation; it depends heavily on the stages before and after it.
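The stage ordering in the diagram can be sketched as a single function. This is an illustrative skeleton of the flow, not the production implementation; stage names follow the public README, and the function bodies are placeholders.

```python
# Illustrative sketch of the For You stage ordering. Sources, filters, and
# scorers are passed in as plain callables; real stages are far richer.

def for_you_pipeline(query, sources, filters, scorers, post_filters, k):
    candidates = []
    for source in sources:                 # 2. candidate sourcing
        candidates.extend(source(query))
    # 3. candidate hydration would attach metadata here (omitted)
    for f in filters:                      # 4. pre-scoring filters
        candidates = [c for c in candidates if f(query, c)]
    for scorer in scorers:                 # 5. scoring (chained; order matters)
        candidates = scorer(query, candidates)
    ranked = sorted(candidates,            # 6. selection: sort, keep top K
                    key=lambda c: c["score"], reverse=True)[:k]
    for f in post_filters:                 # 7. post-selection filters
        ranked = [c for c in ranked if f(query, c)]
    return ranked
```

Even at this level of abstraction, the two-stage filtering is visible: a candidate can be dropped before it is ever scored, and again after it wins a top-K slot.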

7 steps to read along with the code

We will look at each stage, focusing on home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs.

1. Query Hydration

Here, we will create the "user context" at the time of the recommendation request.

  • User behavior sequence (past reaction history)
  • User features (information derived from follow relationships and settings)

From an operational perspective, this is the basis for past behavior shaping the next delivery. It is also why posts that generate steady responses over time tend to fare better than posts that go viral briefly.
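The "user context" can be pictured as a small record assembled at request time. The field names below are illustrative assumptions, not taken from the repository:

```python
from dataclasses import dataclass, field

# Hypothetical shape of the per-request user context. Every downstream
# stage (filters, scorers) reads from this one object.

@dataclass
class UserContext:
    user_id: int
    engagement_history: list = field(default_factory=list)  # past (post_id, action) pairs
    follows: set = field(default_factory=set)               # slice of the follow graph
    muted_keywords: set = field(default_factory=set)

def hydrate_query(user_id, store):
    """Assemble the context from a user-data store (here, a plain dict)."""
    return UserContext(
        user_id=user_id,
        engagement_history=store.get("history", []),
        follows=set(store.get("follows", [])),
        muted_keywords=set(store.get("muted", [])),
    )
```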

2. Candidate Sourcing

Candidates come from at least two lineages.

  1. Thunder (In-Network)
  2. Phoenix Retrieval (Out-of-Network)

In-Network is the discovery slot fed by accounts you follow; Out-of-Network is the discovery slot that includes accounts you do not follow. The important point is that the system does not pick one from the start: it builds a combined candidate set and narrows it down at later stages.
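The two lineages can be sketched as two source functions whose union forms the candidate pool. The retrieval cutoff and scores below are stand-ins for whatever embedding retrieval actually does, purely for illustration:

```python
# Toy sketch of the two candidate lineages. In-network = followed authors;
# out-of-network = unfollowed authors surfaced by a retrieval score.
# The 0.5 cutoff is an arbitrary assumption.

def in_network_candidates(ctx, posts):
    return [p for p in posts if p["author_id"] in ctx["follows"]]

def out_of_network_candidates(ctx, posts, retrieval_scores):
    return [p for p in posts
            if p["author_id"] not in ctx["follows"]
            and retrieval_scores.get(p["post_id"], 0.0) > 0.5]

def source_candidates(ctx, posts, retrieval_scores):
    # Build the union first; narrowing happens in later stages.
    pool = in_network_candidates(ctx, posts)
    pool += out_of_network_candidates(ctx, posts, retrieval_scores)
    return pool
```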

3. Candidate Hydration

Adds additional information necessary for score calculation to the obtained candidates.

  • Core post data
  • Attributes such as video length
  • Subscription viewing eligibility
  • Author information

This stage is simple, but candidates that miss required data here fall at later stages. That is why metadata completeness is effectively a quality requirement, not just the content of your posts.

4. Pre-Scoring Filters

Drop "unqualified candidates" before scoring. The main filters that appear in public implementations are:

  • DropDuplicatesFilter
  • CoreDataHydrationFilter
  • AgeFilter
  • SelfTweetFilter
  • RetweetDeduplicationFilter
  • IneligibleSubscriptionFilter
  • PreviouslySeenPostsFilter
  • PreviouslyServedPostsFilter
  • MutedKeywordFilter
  • AuthorSocialgraphFilter

The practical point here is simple: before working to raise your score, remove the factors that would drop you from the pool.
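A filter chain is just a sequence of predicates applied in order. The sketch below echoes three of the filter names listed above; the predicate bodies are simplified stand-ins, not the repository's logic:

```python
# Minimal pre-scoring filter chain. Each factory returns a predicate;
# apply_filters runs them in order, shrinking the candidate list.

def drop_duplicates(seen_ids):
    """Stand-in for DropDuplicatesFilter: keep first occurrence only."""
    def f(c):
        if c["post_id"] in seen_ids:
            return False
        seen_ids.add(c["post_id"])
        return True
    return f

def age_filter(max_age_hours):
    """Stand-in for AgeFilter: drop posts older than the cutoff."""
    return lambda c: c["age_hours"] <= max_age_hours

def muted_keyword_filter(muted):
    """Stand-in for MutedKeywordFilter: drop posts containing muted words."""
    return lambda c: not any(w in c["text"] for w in muted)

def apply_filters(candidates, filters):
    out = candidates
    for f in filters:
        out = [c for c in out if f(c)]
    return out
```

Note that a candidate dropped here never reaches the scorer at all, which is exactly why "avoid being dropped" comes before "raise the score."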

5. Scoring

In the public implementation, score-related processing is chained together in multiple steps.

  1. PhoenixScorer: Estimate behavior probability
  2. WeightedScorer: Weighted addition of multiple actions
  3. AuthorDiversityScorer: Attenuate continuous exposure of the same author
  4. OONScorer: Out-of-Network correction

This order matters. Candidates are not simply sorted by the estimated value; corrections for diversity and network type are applied afterward.
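The first two links of the chain can be sketched together: a weighted base score, then an author-diversity attenuation applied on top. The 0.5 decay factor is an assumption for illustration; the production value is not public:

```python
# Sketch of WeightedScorer followed by AuthorDiversityScorer.
# Weights and decay are illustrative assumptions, not production values.

def weighted_score(probs, weights):
    """Weighted sum of action probabilities (negative weights deduct)."""
    return sum(weights.get(a, 0.0) * p for a, p in probs.items())

def author_diversity(candidates, decay=0.5):
    """Attenuate repeated exposure: the n-th post by the same author,
    in descending score order, is multiplied by decay**n."""
    seen = {}
    for c in sorted(candidates, key=lambda c: c["score"], reverse=True):
        n = seen.get(c["author_id"], 0)
        c["score"] *= decay ** n
        seen[c["author_id"]] = n + 1
    return candidates
```

Because the attenuation runs after the base score, a streak of same-author posts loses ground even if each post individually scores well, which is the structural reason repeated same-type posting weakens.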

6. Selection

Sort by final score and take the top K. The structure of selectors/top_k_score_selector.rs is very clear: selection is determined mainly by the final score.
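In Python terms, the selection stage reduces to a one-liner, shown here as a sketch of the idea rather than a port of the Rust selector:

```python
import heapq

# "Sort by final score, keep K" — heapq.nlargest returns the top-k
# candidates in descending score order without sorting the whole list.

def top_k(candidates, k):
    return heapq.nlargest(k, candidates, key=lambda c: c["score"])
```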

7. Post-Selection Filters

Even after selection, candidates can be excluded further. Representative filters:

  • VFFilter
  • DedupConversationFilter

In other words, ranking near the top does not guarantee delivery. Final gates for visibility policy and conversation-duplication control remain.

Figure 2: Practical understanding of score design

Score Composition Diagram

WeightedScorer multiplies each action probability by a weight and sums the results. The important point is that you must not only increase positive reactions but also reduce negative reactions at the same time.

Translate predicted actions into production language

The actions that can be seen in the public README and phoenix/runners.py are translated into posting operations as follows.

Prediction system | Representative actions | Operational meaning
Reaction | like / reply / repost / quote | Strength of empathy, discussion, and diffusion
Transition | click / profile_click | Intent to see more
Dwell | dwell / dwell_time / video_view | Whether the post was watched rather than skipped
Expansion | photo_expand / share / copy_link | Content worth preserving and sharing with others
Relationship | follow_author | Conversion into a long-term relationship
Negative signal | not_interested / block / mute / report | Signs of disagreement, discomfort, or mistrust

The conclusions drawn from this design are clear.

  1. Optimizing for likes alone is not enough.
  2. Bait designs that chase only clicks lose in the long term.
  3. Editing to reduce negative signals is more effective than you might imagine.
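A toy worked example makes these conclusions concrete. The weights below are invented for illustration only (the production values are unpublished); the point is the sign structure, with negative signals entering as deductions:

```python
# Hypothetical weights -- NOT production values. Only the signs matter
# for the argument: negative signals carry negative weight.
WEIGHTS = {
    "like": 1.0, "reply": 2.0, "repost": 1.5,
    "profile_click": 0.8, "video_view": 0.5,
    "not_interested": -3.0, "report": -5.0,
}

def score(probs):
    return sum(WEIGHTS.get(a, 0.0) * p for a, p in probs.items())

# A bait post: many likes, but it also irritates some viewers.
bait = score({"like": 0.30, "not_interested": 0.10, "report": 0.02})
# A solid post: fewer likes, but replies and almost no negative signals.
solid = score({"like": 0.20, "reply": 0.05, "not_interested": 0.01})
```

Under these assumed weights the bait post's like advantage is wiped out by the deductions, while the solid post ends up ahead, which is precisely why like-only optimization is insufficient.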

How to interpret Candidate Isolation

The attention mask in phoenix/grok.py prohibits mutual references between candidates and allows references to user context and history. There are three practical implications:

  1. Less likely to be influenced by other candidates in the batch
  2. Improving individual posts is easily effective
  3. It is more reproducible to improve the post itself than to win with “comparison gacha”
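The masking idea can be sketched in pure Python. This is a stand-in for the concept in phoenix/grok.py, not its code: every token may attend to the user-context tokens, a candidate may attend to itself, but no candidate may attend to another candidate:

```python
# Boolean attention-mask sketch for candidate isolation.
# Rows = attending token, columns = attended-to token.
# Tokens 0..n_context-1 are user context; the rest are candidates.

def isolation_mask(n_context, n_candidates):
    size = n_context + n_candidates
    mask = [[False] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if j < n_context:    # anyone may read the user context
                mask[i][j] = True
            elif i == j:         # a candidate may read itself
                mask[i][j] = True
            # candidate -> other candidate stays False: no mutual reference
    return mask
```

Because the False entries block cross-candidate attention, each candidate's score depends only on itself plus the user context, which is the structural basis for the three implications above.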

Based on this, it is reasonable to improve your posts in the following order:

  1. Clarify the two-line introduction
  2. Optimize paragraph breaks and information density
  3. Use a single CTA
  4. Re-edit after reviewing the reaction log

Figure 3: Diagram of 2-stage filter

Two-Stage Filters Diagram

If you focus too much on the score, you will overlook filter design. In practice, the filters often set the ceiling on exposure first.

Operational mistakes that tend to get clogged with pre-score filters

  • Reposting the same text at short intervals
  • Mass-producing posts while ignoring the already-seen list
  • Ambiguous expressions that touch muted keywords
  • Introductions that are easy to misread for lack of context

Operational mistakes that tend to get clogged with post-score filters

  • Worsened visibility judgments from repeated incendiary text
  • Mass-produced duplicate branches within the same conversation thread

The principle here is "don't fall before you stretch."

Using structure to explain why traditional hacks are not effective

The feeling that "the old ways suddenly stopped working" can be explained by the public structure as follows.

  1. Candidate sourcing has two entrances (in-network and out-of-network)
  2. The score is a multi-objective optimization over multiple actions
  3. Negative signals are explicit deductions
  4. Diversity correction weakens streaks of same-type posts
  5. Non-conforming content is dropped by the two-stage filters

That is why templates that stabilize reaction quality beat short-term hacks.

Practical design in Japanese X (detailed version)

From here on, we will show you the specific steps to translate the above structure into Japanese usage.

1. Post design: strictly adhere to one post, one theme

The more themes you mix, the harder the post is to estimate. Fix one value to convey per post, and leave supplementary information to the replies.

  • First line, equivalent to a heading: Who is it for?
  • 2nd line: What do you get?
  • Body: Conclusion -> Reason -> Specific example -> Action

2. How to make two lines of introduction (template)

The following patterns make reactions easier to compare.

  • Target: "For people who stop responding after the first month of using X"
  • Benefit: “How to identify one area for improvement in 72 hours”
  • Expected value: "Verify without increasing the number of posts"

If you oversell the expected value in the introduction, reactions tend to tilt toward negative signals. Prioritize specific, sincere promises over strong ones.

3. Information density design of main text

In the Japanese-language sphere, feeds move fast and readers skim even long posts. The following rules keep the text readable.

  1. 2-4 lines per paragraph
  2. One point per paragraph
  3. Attach numbers or steps to abstract words
  4. Define terms only on their first appearance

4. Only one CTA

Multiple CTAs split the action rate. It is more measurable to separate posts that aim for replies from posts that aim for profile clicks.

5. Preventive measures against negative reactions

Please check the following before posting.

  • Is the subject too large?
  • Does it unnecessarily provoke the reader's attributes?
  • Are you confusing facts and opinions?
  • Doesn't an assertive tone lead to misunderstandings?

This is not a moral theory, but a practical response to score design.

Operation flow using TenguX (for implementation)

  1. Select one theme candidate with /neta
  2. Create 2 drafts by quoting/rewriting (change only the 2 lines of introduction)
  3. Deliver to JST fixed frame with /queue
  4. Comparison of indicators at 24h and 72h
  5. Only winning introductions are carried over to the next week

The trick here is not to change the entire text every time. To preserve comparability, limit changes to one or two items.

Measurement design: 24h/72h two-stage review

What to see in 24h (initial velocity)

  • imp (delivery volume)
  • engagement (reaction entry point)
  • profile clicks (interest transition)

What you see in 72h (persistent)

  • replies (depth of discussion)
  • reposts/quotes (room for re-diffusion)
  • Whether the template holds up (whether revisit reactions occur)

Examples of practical judgment rules

  • Weak initial velocity -> revise the two-line introduction
  • Good initial velocity but poor persistence -> revise the body structure
  • Responses exist but quality is poor -> revise the subject and assertive expressions
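The three judgment rules above can be written as one triage function. The 0.7 thresholds are placeholders to be calibrated against your own account's baseline, not recommended values, and profile clicks stand in as a rough proxy for response quality:

```python
# Triage a post against its account baselines. Thresholds (0.7x baseline)
# are illustrative assumptions, not tuned values.

def triage(imp_24h, baseline_imp, replies_72h, baseline_replies,
           profile_clicks, baseline_clicks):
    if imp_24h < 0.7 * baseline_imp:
        return "fix: two-line introduction"       # weak initial velocity
    if replies_72h < 0.7 * baseline_replies:
        return "fix: body structure"              # poor persistence
    if profile_clicks < 0.7 * baseline_clicks:
        return "fix: subject and assertive expressions"  # weak quality proxy
    return "keep: carry the template forward"
```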

30-day operation plan (details)

Week 1: Building the foundation for measurement

  1. Fix the post template to one type
  2. Limit yourself to two themes
  3. Create a record sheet for 24h/72h after posting

Week 2: Deployment optimization

  1. A/B test only the two-line introduction
  2. Keep the body fixed
  3. Decide on the most stable introduction pattern

Week 3: Body optimization

  1. Adjust paragraph length and the ratio of bullet points
  2. Increase the amount of concrete examples by 1.2x
  3. Fix the CTA and compare only body differences

Week 4: Re-editing and capitalization

  1. Re-edit the top 3 posts by response
  2. Create different angles on the same theme
  3. Turn the winners into templates as the standard for next month

The deliverable you should create in these 30 days is not a "buzz post" but a "reproducible template."

Common Misconceptions (FAQ)

Q1. Will everything get better if there are more likes?

No. The public structure makes the score a composite of multiple actions. Even a strong like signal can be offset by other signals or by negative ones.

Q2. Can you win by increasing the number of posts?

It increases exposure opportunities in the short term, but can backfire if filters and quality are not in order. In the first month, verification accuracy matters more than post count.

Q3. Should I provoke to target Out-of-Network?

Not recommended. OONScorer exists, but it is not a license to do anything. A rise in negative signals is a long-term disadvantage.

Q4. Can I understand all of X by reading this repository?

What is visible is the core structure of the For You recommendation. Not all production elements and thresholds are public, so avoid definitive claims.

Q5. What is the top priority when using Japanese?

Clarifying the two-line introduction. If the target audience and benefit are unclear, all downstream optimization loses effect.

Practical checklist (before posting)

  1. Is this post focused on one theme?
  2. Is the target audience clearly stated in the first line?
  3. Are the benefits clearly stated in the second line?
  4. Is there one point per paragraph?
  5. Are there any assertive expressions that can lead to misreading?
  6. Is there excessive provocation?
  7. Is there one CTA?
  8. Have you decided on the items to be verified 24h/72h?

Simply satisfying these eight items will significantly improve the reproducibility of your operations.

Summary

What matters for X operation in 2026 is not hunting for tricks but understanding the structure. The practical essence of xai-org/x-algorithm comes down to three points.

  1. The For You recommendation works in stages (candidate sourcing -> hydration -> filter -> score -> final filter)
  2. The score is multi-objective; negative signals are explicit deductions.
  3. Operations that keep improving the quality of individual posts are the most reproducible.

In Japanese-language operation, simply keeping these three points up for 30 days moves you from intuition-driven to verification-driven operation.

Appendix A: Correspondence table between implementation files and highlights

The major files are organized by purpose so you do not get lost when digging deeper. Knowing where to read first for each question also makes it easier to keep up with updates.

Purpose | Where to read first | What to read | Translation into practice
Get the big picture | README.md | For You structure, stage names, main concepts | A map of where to improve
Check the pipeline order | home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs | Order of query/source/hydrator/filter/scorer/selector | Decide improvement priorities
Check the score formula | home-mixer/scorers/weighted_scorer.rs | Weighted sum of actions, negative-signal terms | Avoid the trap of "likes-only" optimization
See the diversity correction | home-mixer/scorers/author_diversity_scorer.rs | Attenuation for the same author | Rethink strategies relying on repeated posting
See the OON correction | home-mixer/scorers/oon_scorer.rs | Existence of in/out-of-network correction | Avoid over-reliance on unfollowed exposure
Understand the ranking mask | phoenix/grok.py | Candidate-isolation mask logic | Grounds the focus on individual post quality
List the predicted actions | phoenix/runners.py | Order of the multi-action outputs | Feed into observation-metric design

Of particular interest in the table above is the combination of weighted_scorer and grok.py. The former shows what is being combined; the latter shows how candidates are compared. Once you understand both, your strategy shifts from guessing to hypothesis testing.

Appendix B: 12 posting templates for Japanese X

Below are templates that are easy to use even in the first month. All assume that the two-line introduction can be compared.

Template 1: Procedure public type

  1. Clarify the target audience
  2. Putting the results obtained in numbers
  3. List the procedure in 3-5 steps
  4. Finally, declare "What should we improve next time?"

Template 2: Failure learning type

  1. Write down the failure situation based on facts
  2. Narrow down to one cause
  3. Show correction steps
  4. Place a check to prevent recurrence

Template 3: Checklist distribution type

  1. Specify the usage situation first
  2. Present 7-10 items to check
  3. Target “items that could not be met” for improvement next time

Template 4: Comparison verification type

  1. Limit the difference between Plan A and Plan B to one
  2. Disclose indicators compared 24h/72h
  3. Decide on only one measure for next time

Template 5: Misunderstanding correction type

  1. Presenting one common misconception
  2. Explain why misunderstandings occur
  3. Reinforcement with implementation structure or observation data
  4. Present practical alternatives

Template 6: Term decomposition type

  1. Choose one difficult word
  2. Define for beginners
  3. Add examples that occur in practice
  4. Indicate the conditions for deciding whether to use or not.

Template 7: Back calculation design type

  1. Put the target indicators first
  2. Count backwards and break down the necessary actions
  3. Convert to post design

Template 8: Case abstraction type

  1. Showing a single case
  2. Abstract success factors into three
  3. Write conditions for transfer to other themes

Template 9: QA proactive type

  1. List the points that are likely to be refuted first
  2. Give short answers to each.
  3. Also specify “conditions that are not applicable”

Template 10: Mini serial type

  1. Whole map on the first shot
  2. The most important point in the second run
  3. Third operational template
  4. Don’t change your CTA each time

Template 11: Weekly review public type

  1. Overview of the week's posts
  2. Two good/two bad points
  3. Fix next week's revision policy to one

Template 12: Redefined type for beginners

  1. Translating established theories for experts into terms for beginners
  2. Attach the minimum steps that can be taken now
  3. Show how to return if you fail

Templates reward consistency more than quantity. For the first month, pick only 2-3 and rotate them; you will learn faster.

Appendix C: 24h/72h verification log recording format

Whether you get results in practice depends more on recording quality than on posting quality. Keeping a log in the format below makes end-of-month judgments less error-prone.

Date | Post ID | Template type | Changes | 24h imp | 24h replies | 72h profile clicks | Subjective memo
03/01 | A001 | Procedure public type | Shortened 2-line introduction | 1200 | 14 | 26 | Clear introduction, quick response
03/03 | A002 | Failure learning type | Fixed CTA to one | 980 | 21 | 18 | More conversation, but transitions weak
03/05 | A003 | Comparison verification type | Shortened paragraph length | 1350 | 16 | 31 | Clicks improved

From this log, make decisions as follows:

  1. Extract only the "changes" of posts that moved the metrics
  2. Categorize the changes (introduction / body / CTA)
  3. Next week, improve only the category that contributed most
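Steps 2 and 3 can be automated over the log. The keyword rules below are illustrative assumptions for bucketing change descriptions; adapt them to however you phrase your own log entries:

```python
from collections import defaultdict

# Bucket each logged "change" into introduction / body / cta, then find
# the category whose changes moved the chosen metric most in total.
# Keyword lists are illustrative, not a fixed taxonomy.

CATEGORY_KEYWORDS = {
    "introduction": ("introduction", "2 lines", "headline"),
    "cta": ("cta",),
    "body": ("paragraph", "text", "structure"),
}

def categorize(change):
    c = change.lower()
    for cat, words in CATEGORY_KEYWORDS.items():
        if any(w in c for w in words):
            return cat
    return "other"

def best_category(rows):
    """rows: (change_description, metric_delta) pairs from the log."""
    totals = defaultdict(float)
    for change, delta in rows:
        totals[categorize(change)] += delta
    return max(totals, key=totals.get)
```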

"Fixing everything" destroys learning. Correcting only one category per week yields higher reproducibility after four weeks.

Appendix D: Editorial guide to reduce negative signals

not_interested / block / mute / report are hard for the poster to observe directly, so manage them through proxy indicators.

Examples of proxy indicators

  1. Profile transitions slow even as reactions increase
  2. Replies exist, but constructive conversation is scarce
  3. Response persistence drops sharply after the day following the post
  4. Exposure suddenly drops when the same theme is posted repeatedly

These are not definitive indicators, but they can be treated as signs that negative signals are rising. In the week symptoms appear, prioritize the following fixes.

  1. Soften overly strong assertions
  2. Avoid divisive headlines
  3. Do not widen the target audience too much
  4. Separate subjectivity from fact clearly
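The first proxy indicator above (reactions up, profile transitions flat) can be checked mechanically. The 20% threshold is an arbitrary assumption for illustration, not a calibrated value:

```python
# Heuristic for proxy indicator 1: engagement rising while profile
# transitions stall. Thresholds are illustrative assumptions.

def negative_signal_suspected(eng_now, eng_prev, clicks_now, clicks_prev):
    eng_up = eng_now > 1.2 * eng_prev       # reactions up more than 20%
    clicks_flat = clicks_now <= clicks_prev  # transitions not growing
    return eng_up and clicks_flat
```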

Appendix E: Replacement steps to follow repository updates

The public implementation may be updated in the future. You can keep this article's content current by monitoring diffs with the following steps.

  1. Check the README.md diff once a month
  2. Check diffs in home-mixer/scorers and home-mixer/filters
  3. Check diffs around the mask in phoenix/grok.py
  4. Record added/deleted action names
  5. Update the "practical translation" sections of the article first

This routine reduces the risk of the explanations going stale.

Reference (as of March 1, 2026)

Related resources

We have compiled templates for putting the content of this article directly into practice.

Next actions

To try this flow in practice, start by drafting post ideas for a single theme.