2026 Edition: How to Read X's Algorithm (for Japanese-Language Operations)
For beginners to X operations / Published: 2026/02/20 · Updated: 2026/03/12

Information about operating on X still tends to rely on one-trick explanations such as "you can grow with this single technique." In 2026, however, the portion of the system that can be verified as public code has grown, and the premises of the discussion can be made considerably more concrete.
This article is a practical re-edit based on xai-org/x-algorithm as of March 1, 2026, translating the processing structure of the For You recommendation into Japanese operational practice.
The title is unchanged, but the body is substantially more detailed.
A 3-minute summary first
For busy readers, here are the conclusions up front.
- x-algorithm shows the implementation structure of the For You feed recommendation, with candidate sourcing, exclusion, and scoring clearly separated.
- Ranking is not a single metric but a weighted sum of multiple predicted action probabilities, and negative signals such as not_interested / block / mute / report are explicitly treated as deductions.
- There is a two-stage filter before and after scoring, so you must design posts not only to "raise the score" but also to "avoid being dropped in the first place."
- Because a ranking mask (candidate isolation) prevents candidates from seeing each other, improving the quality of individual posts is directly tied to reproducibility.
- For Japanese-language operation, the highest-leverage fix is clarifying the two-line introduction.
The rest of this article explains these five points in detail, using the vocabulary of the code.
The position of this article (the line between facts and inference)
First, let's clarify where the public facts end and where operational inference begins.
Published facts (verifiable in the repository)
- The names of the key components of the For You recommendation
- The stage structure: candidate sourcing, hydration, filtering, scoring, and selection
- The concept of predicting multiple actions and combining them with a weighted sum
- The implementation policy of an attention mask that prevents candidates from seeing each other
- Typical filter names (duplicates, stale posts, previously seen, muted keywords, visibility, etc.)
Undisclosed or undetermined (do not assume)
- The exact weight values and thresholds used in production (the params may not be public)
- Model retraining frequency, serving ratios against real traffic, regional fine-tuning
- The overall system picture across every product surface
In other words, this article follows two principles: explain the structure based on the code, and do not mix in speculation about concrete values.
Figure 1: Overview of For You recommendation pipeline
This diagram summarizes the processing order that can be read from the public README and the home-mixer implementation.
Importantly, scoring does not exist in isolation; it depends heavily on the stages before and after it.
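To make the "scoring does not exist in isolation" point concrete, here is a toy sketch of the sandwich structure. All function names, fields, and values below are illustrative, not the actual implementation: a candidate with the highest raw score can still miss delivery if it fails a filter on either side.

```python
# Toy sketch: scoring is sandwiched between filters, so a high score alone
# does not guarantee delivery. All names and values here are illustrative.

def pre_filter(cands):
    # pre-scoring stage: ineligible candidates never get scored
    return [c for c in cands if not c.get("duplicate")]

def score(cands):
    # scoring stage: rank by (already computed) score, descending
    return sorted(cands, key=lambda c: c["score"], reverse=True)

def post_filter(cands):
    # post-selection stage: visibility gates can still drop top candidates
    return [c for c in cands if c.get("visible", True)]

cands = [
    {"id": "a", "score": 0.9, "duplicate": True},   # dropped before scoring
    {"id": "b", "score": 0.8, "visible": False},    # dropped after scoring
    {"id": "c", "score": 0.5},
]
served = post_filter(score(pre_filter(cands)))
# only "c" survives, despite having the lowest raw score
```

Here the lowest-scored candidate is the only one delivered, which is exactly why filter design can matter as much as score design.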
7 steps to read along with the code
We will look at each stage, focusing on home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs.
1. Query Hydration
Here, we will create the "user context" at the time of the recommendation request.
- User behavior sequence (past reaction history)
- User features (information derived from follow relationships and settings)
From an operational perspective, this is the basis for "past behavior affects the next delivery." It is also why posts that consistently generate steady responses tend to have an advantage over posts that go viral only briefly.
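As a hedged illustration, the "user context" assembled at this stage can be pictured as a small record combining history and features. The field names below are hypothetical (the real implementation is in Rust and its schema is not reproduced here):

```python
# Hypothetical sketch of the user context built during query hydration.
# Field names are illustrative, not taken from the repository.
from dataclasses import dataclass, field

@dataclass
class UserContext:
    user_id: int
    action_history: list = field(default_factory=list)  # past reaction sequence
    features: dict = field(default_factory=dict)        # follow graph, settings, etc.

ctx = UserContext(
    user_id=42,
    action_history=[("like", "post_1"), ("reply", "post_2")],
    features={"follows": ["author_a"], "lang": "ja"},
)
```

The key operational takeaway is that the history sequence is an input to every later stage, which is why consistent reactions compound.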
2. Candidate Sourcing
Candidates come from at least two lineages.
- Thunder (In-Network)
- Phoenix Retrieval (Out-of-Network)
In-Network is the slot for posts from accounts you follow; Out-of-Network is the discovery slot that includes accounts you don't follow. The important point is that the system does not "pick one from the start": it builds a combined candidate pool and narrows it down in later stages.
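A minimal sketch of this "pool first, narrow later" shape, using the component names from the public repo but with stub data and simplified merge logic of my own:

```python
# Illustrative two-source candidate pool. Component names come from the
# public repo; the stub data and merge logic are simplifications.

def thunder_candidates(ctx):
    # in-network: posts from accounts the user follows (stub data)
    return [{"id": "t1", "source": "in_network"}]

def phoenix_retrieval(ctx):
    # out-of-network: retrieved from beyond the follow graph (stub data)
    return [{"id": "p1", "source": "out_of_network"}]

def source_candidates(ctx):
    # build one combined pool first; narrowing happens in later stages
    return thunder_candidates(ctx) + phoenix_retrieval(ctx)

pool = source_candidates({"user_id": 1})
```

Because both sources land in one pool, a post competes with both followed and unfollowed content from this point on.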
3. Candidate Hydration
Adds additional information necessary for score calculation to the obtained candidates.
- Core post data
- Attributes such as video length
- Subscription viewing eligibility
- Author information
This stage is simple, but anything missing here causes the candidate to fall at a later stage. That is why metadata integrity is effectively a quality requirement, not just the content of your posts.
4. Pre-Scoring Filters
Drop "unqualified candidates" before scoring. The main filters that appear in public implementations are:
- DropDuplicatesFilter
- CoreDataHydrationFilter
- AgeFilter
- SelfTweetFilter
- RetweetDeduplicationFilter
- IneligibleSubscriptionFilter
- PreviouslySeenPostsFilter
- PreviouslyServedPostsFilter
- MutedKeywordFilter
- AuthorSocialgraphFilter
The practical point here is simple. Before making an effort to raise your score, you need to take action to reduce the factors that cause your score to drop.
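To illustrate how such a chain behaves, here is a sketch with three simplified stand-ins for the filters named above (the predicates and context fields are invented for illustration; the real filters are Rust implementations with their own logic):

```python
# Sketch of a pre-scoring filter chain. Filter roles mirror names in the
# public implementation; predicates and context fields are simplified stand-ins.

def drop_duplicates(cands, ctx):
    seen, out = set(), []
    for c in cands:
        if c["id"] not in seen:
            seen.add(c["id"])
            out.append(c)
    return out

def age_filter(cands, ctx):
    return [c for c in cands if c["age_hours"] <= ctx["max_age_hours"]]

def muted_keyword_filter(cands, ctx):
    return [c for c in cands
            if not any(kw in c["text"] for kw in ctx["muted_keywords"])]

FILTERS = [drop_duplicates, age_filter, muted_keyword_filter]

def apply_filters(cands, ctx):
    for f in FILTERS:  # each filter can only shrink the pool
        cands = f(cands, ctx)
    return cands

ctx = {"max_age_hours": 48, "muted_keywords": ["spam"]}
cands = [
    {"id": "a", "age_hours": 2, "text": "useful thread"},
    {"id": "a", "age_hours": 2, "text": "useful thread"},   # duplicate
    {"id": "b", "age_hours": 100, "text": "old post"},      # too old
    {"id": "c", "age_hours": 5, "text": "spam offer"},      # muted keyword
]
survivors = apply_filters(cands, ctx)
```

Note that the chain is monotone: no later stage can rescue a candidate a filter has dropped, which is why "don't get filtered" precedes "score well."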
5. Scoring
In the public implementation, score-related processing is chained together in multiple steps.
- PhoenixScorer: estimates action probabilities
- WeightedScorer: weighted sum over multiple actions
- AuthorDiversityScorer: attenuates repeated exposure of the same author
- OONScorer: Out-of-Network adjustment
This order is important. It is not simply "in descending order of estimated value", but corrections for diversity and network type are added later.
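The order can be sketched as follows. The scorer names mirror the public repo, but the weights, the 0.5 attenuation factor, and the data layout are invented for illustration; the production values are not public:

```python
# Sketch of the scorer chain order. Scorer roles mirror the public repo;
# the weights and attenuation factor below are made-up illustrations.

WEIGHTS = {"like": 1.0, "reply": 2.0, "report": -5.0}  # hypothetical values

def weighted_score(probs):
    # WeightedScorer-style combination: sum of weight * predicted probability
    return sum(WEIGHTS[a] * p for a, p in probs.items())

def author_diversity(scored):
    # AuthorDiversityScorer-style attenuation: each repeat appearance of
    # the same author is dampened (the 0.5 factor is illustrative)
    seen, out = {}, []
    for author, s in scored:
        penalty = 0.5 ** seen.get(author, 0)
        seen[author] = seen.get(author, 0) + 1
        out.append((author, s * penalty))
    return out

probs = {"like": 0.5, "reply": 0.1, "report": 0.0}
scored = [("alice", weighted_score(probs)),
          ("alice", weighted_score(probs))]
adjusted = author_diversity(scored)
# the second consecutive post by the same author ends up with half the score
```

This is why back-to-back posts from one author compete against each other: the correction runs after the raw weighted score.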
6. Selection
Sort by final score and select the top K results.
The structure of selectors/top_k_score_selector.rs is very clear, and you can see that it is determined mainly by the final score.
7. Post-Selection Filters
Even after selection, candidates can still be excluded. Representative filters:
- VFFilter
- DedupConversationFilter
In other words, "ranking near the top" does not mean "delivery is guaranteed." There are final gates for visibility policy and conversation-duplication control.
Figure 2: Practical understanding of score design
WeightedScorer multiplies each action probability by a weight and sums the results.
The important point is that you must not only "increase positive reactions" but also "reduce negative reactions" at the same time.
Translate predicted actions into production language
The actions that can be seen in the public README and phoenix/runners.py are translated into posting operations as follows.
| Prediction group | Representative actions | Operational meaning |
|---|---|---|
| Reaction | like / reply / repost / quote | Strength of empathy, discussion, and diffusion |
| Transition | click / profile_click | Intent to see more |
| Dwell | dwell / dwell_time / video_view | Whether content was consumed rather than skipped |
| Expansion | photo_expand / share / copy_link | Value worth saving and sharing with others |
| Relationship | follow_author | Conversion into a long-term relationship |
| Negative signal | not_interested / block / mute / report | Signs of disagreement, discomfort, or distrust |
The conclusions drawn from this design are clear.
- Optimizing for likes alone is not enough
- Clickbait designs that chase only clicks are disadvantageous in the long term
- Editing to reduce negative signals is more effective than you might imagine
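A toy calculation makes the second and third points vivid. The weights below are invented for illustration (production weights are not public), but the structure matches the weighted-sum design: a post that wins on clicks can still lose overall once negative signals are deducted.

```python
# Toy numbers only: real production weights and thresholds are not public.
WEIGHTS = {"like": 1.0, "click": 0.5, "report": -8.0, "not_interested": -2.0}

def total(probs):
    # weighted sum over all predicted action probabilities
    return sum(WEIGHTS[a] * probs.get(a, 0.0) for a in WEIGHTS)

clickbait = {"like": 0.10, "click": 0.60, "report": 0.05, "not_interested": 0.20}
sincere   = {"like": 0.20, "click": 0.30, "report": 0.00, "not_interested": 0.02}

# clickbait: 0.10 + 0.30 - 0.40 - 0.40 = -0.40
# sincere:   0.20 + 0.15 - 0.00 - 0.04 =  0.31
```

Even with double the click probability, the clickbait post's total goes negative once reports and "not interested" are counted.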
How to interpret Candidate Isolation
The attention mask in phoenix/grok.py prohibits mutual references between candidates and allows references to user context and history.
There are three practical implications:
- Less likely to be influenced by other candidates in the batch
- Improving individual posts is easily effective
- Improving the post itself is more reproducible than gambling on favorable comparisons
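The isolation described above can be sketched as a boolean attention mask. This is a simplification of the idea in phoenix/grok.py, not its actual code; the sequence layout (context prefix, then candidates) and indices are illustrative:

```python
# Simplified candidate-isolation mask: every position may attend to the
# user-context prefix, but candidates cannot attend to each other.
# The layout (context first, then candidates) is illustrative.

def candidate_isolation_mask(n_context, n_candidates):
    """Return an n x n matrix where True = attention allowed."""
    n = n_context + n_candidates
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j < n_context:                # anyone can see the user context
                mask[i][j] = True
            elif i >= n_context and i == j:  # a candidate sees only itself
                mask[i][j] = True
    return mask

m = candidate_isolation_mask(n_context=2, n_candidates=2)
# candidate at row 2 sees context columns 0-1 and itself (column 2),
# but not the other candidate (column 3)
```

Because no candidate row can attend to another candidate column, each post is evaluated against the user context alone, which is the structural basis for the three implications above.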
Based on this, it is reasonable to improve your posts in the following order:
- Clarification of the two lines of introduction
- Paragraph division and information density optimization
- Single CTA
- Re-edit after viewing the reaction log
Figure 3: Diagram of 2-stage filter
If you focus too much on the score, you will overlook filter design. In practice, the filters often set the ceiling on exposure before the score even matters.
Operational mistakes that tend to get clogged with pre-score filters
- Short-term reposting of the same text
- Mass-posting that ignores what followers have already seen
- Ambiguous expressions that touch mute words
- Introduction that is easy to misread due to lack of context
Operational mistakes that tend to get clogged with post-score filters
- Visibility judgment worsened by repeating incendiary text
- Mass production of duplicate branches within the same conversation thread
The principle here is "don't get dropped before you can grow."
Using structure to explain why traditional hacks are not effective
The feeling that "the old ways suddenly stopped working" can be explained as follows in light of the public structure.
- Candidate sourcing is dual-tracked (In-Network and Out-of-Network)
- The score is a multi-objective optimization over multiple actions
- Negative signals are explicit deductions
- Diversity correction weakens repeated posts of the same type from the same author
- Non-conforming content is dropped by the two-stage filter
That is why "templates that stabilize reaction quality" beat short-term hacks.
Practical design in Japanese X (detailed version)
From here on, we will show you the specific steps to translate the above structure into Japanese usage.
1. Post design: strictly adhere to one post, one theme
The more themes you mix into one post, the harder the predictions become to pin down. Fix a single value to convey per post, and move supplementary information to the replies.
- First line, equivalent to a heading: Who is it for?
- 2nd line: What do you get?
- Body: Conclusion -> Reason -> Specific example -> Action
2. How to make two lines of introduction (template)
The following type makes it easier to compare reactions.
- Target: "For people who stop responding after the first month of using X"
- Benefit: “How to identify one area for improvement in 72 hours”
- Expected value: "Verify without increasing the number of posts"
If the introduction overpromises, you will drift toward negative signals. Prioritize "specific, sincere promises" over "strong promises."
3. Information density design of main text
In the Japanese-language sphere, timelines move fast and readers skim quickly even through long text. The following rules keep things readable.
- 1 paragraph 2-4 lines
- 1 paragraph, 1 point
- Add numbers or steps to abstract words
- Define terms only the first time they appear
4. Only one CTA
Multiple CTAs divide the action rate.
It is more measurable to separate posts that aim for replies from posts that aim for profile clicks.
5. Preventive measures against negative reactions
Please check the following before posting.
- Is the subject too large?
- Does it unnecessarily provoke the reader's attributes?
- Are you confusing facts and opinions?
- Doesn't an assertive tone lead to misunderstandings?
This is not a moral theory, but a practical response to score design.
Operation flow using TenguX (for implementation)
- Select one theme candidate with /neta
- Create two drafts by quote-rewriting (change only the two intro lines)
- Deliver to a fixed JST slot with /queue
- Compare metrics at 24h and 72h
- Carry only the winning introduction into the next week
The trick here is to avoid changing the entire text every time. To protect comparability, limit changes to one or two items.
Measurement design: 24h/72h two-stage review
What to see in 24h (initial velocity)
- imp (delivery volume)
- engagement (entry point of reactions)
- profile clicks (interest transition)
What you see in 72h (persistent)
- replies (depth of discussion)
- reposts / quotes (room for re-diffusion)
- whether the template holds up (whether revisit reactions occur)
Examples of practical judgment rules
- Weak initial velocity -> fix the two intro lines
- Good initial velocity but poor persistence -> fix the body structure
- Responses exist but quality is poor -> fix the subject scope and assertive wording
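The judgment rules above can be encoded as a small triage function. The threshold values here are placeholders of my own (every account's baseline differs), so treat this as a template to calibrate, not a recommendation of specific numbers:

```python
# Triage sketch for the 24h/72h review. Threshold values are placeholders;
# calibrate them against your own account's baseline.

def next_fix(imp_24h, engage_24h, engage_72h, constructive_ratio,
             imp_baseline=1000, decay_limit=0.3, quality_floor=0.5):
    if imp_24h < imp_baseline:
        return "fix the two intro lines"                   # weak initial velocity
    if engage_72h < engage_24h * decay_limit:
        return "fix the body structure"                    # poor persistence
    if constructive_ratio < quality_floor:
        return "fix subject scope and assertive wording"   # poor quality
    return "keep the current template"

action = next_fix(imp_24h=1200, engage_24h=50, engage_72h=10,
                  constructive_ratio=0.8)
```

Checking the rules in this fixed order mirrors the review: reach first, persistence second, quality third.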
30-day operation plan (details)
Week 1: Building the foundation for measurement
- Fixed post template to one type
- Limit to 2 themes
- Create a record sheet for 24 hours/72 hours after posting
Week 2: Deployment optimization
- A/B only 2 lines of introduction
- The text is fixed
- Decide on the most stable template
Week 3: Body optimization
- Adjust paragraph length and bullet point ratio
- Increase the amount of concrete examples by 1.2 times
- Fixed CTA and purely compare text differences
Week 4: Re-editing and capitalization
- Re-edit the top 3 responses
- Create different angles on the same theme
- Create a template and use it as a standard for the next month
The deliverable you should create in these 30 days is not a "buzz post" but a "reproducible template."
Common Misconceptions (FAQ)
Q1. Will everything get better if there are more likes?
No. By its public structure, the score is a composite of multiple actions.
Even strong likes can be offset by other signals or by negative signals.
Q2. Can you win by increasing the number of posts?
It increases exposure opportunities in the short term, but it can backfire if quality and the filters are not aligned. In the first month, verification accuracy matters more than post count.
Q3. Should I provoke to target Out-of-Network?
Not recommended. OONScorer exists, but it is not a license for provocation.
An increase in negative signals is a long-term disadvantage.
Q4. Can I understand all of X by reading this repository?
What is visible is the core structure of the For You recommendation. The full set of production components and thresholds is not public, so avoid definitive claims.
Q5. What is the top priority when using Japanese?
Clarification of the introductory two lines. If the target audience and benefits are unclear, post-optimization will be less effective.
Practical checklist (before posting)
- Is this post focused on one theme?
- Is the target audience clearly stated in the first line?
- Are the benefits clearly stated in the second line?
- Is there one point per paragraph?
- Are there any assertive expressions that can lead to misreading?
- Is there excessive provocation?
- Is there one CTA?
- Have you decided on the items to be verified 24h/72h?
Simply satisfying these eight items will significantly improve the reproducibility of your operations.
summary
What matters for X operation in 2026 is not hunting for tricks but understanding the structure.
The practical essence that can be gleaned from xai-org/x-algorithm is the following three points.
- The For You recommendation works in stages (candidate sourcing -> hydration -> filters -> scoring -> final filters)
- The score is multi-objective, and negative signals are explicitly disadvantageous
- Operations that keep improving the quality of individual posts are the easiest to reproduce
In Japanese-language operation, simply sticking to these three points for 30 days moves you from intuition-driven operation to verification-driven operation.
Appendix A: Correspondence table between implementation files and highlights
This table organizes the major files by purpose so you don't get lost when digging deeper. Knowing where to look first for each question also makes it easier to keep up with updates.
| Purpose | Where to read first | What to read | Translation into practice |
|---|---|---|---|
| Get the big picture | README.md | For You structure, stage names, and main concepts | Become a map of where to improve |
| Check the pipeline order | home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs | Query/source/hydrator/filter/scorer/selector order | You can decide the improvement priority |
| Check the score formula | home-mixer/scorers/weighted_scorer.rs | Weighted summation of multiple actions, negative-signal terms | Avoid the trap of likes-only optimization |
| View diversity correction | home-mixer/scorers/author_diversity_scorer.rs | Attenuation processing for the same author | Review strategies that rely on repeated submissions |
| See OON correction | home-mixer/scorers/oon_scorer.rs | Existence of in/out-network correction | You can design without over-reliance on unfollowed exposure |
| Understanding ranking mask | phoenix/grok.py | Mask logic of candidate isolation | Basis for emphasizing the quality of individual posts |
| List of predicted items | phoenix/runners.py | Implementation order of multi-behavior output | Can be reflected in the design of observation indicators |
Of particular interest in the table above is the combination of weighted_scorer and grok.py.
The former indicates "what is being synthesized" and the latter indicates "how it is compared."
Once you understand these two things, your strategy will change from "guessing" to "hypothesis testing."
Appendix B: 12 submission templates for Japanese X
Below are templates that are easy to use even in your first month. All are written so that their two-line introductions can be compared.
Template 1: Procedure public type
- Clarify the target audience
- Putting the results obtained in numbers
- List the procedure in 3-5 steps
- Finally, declare "What should we improve next time?"
Template 2: Failure learning type
- Write down the failure situation based on facts
- Narrow down to one cause
- Show correction steps
- Place a check to prevent recurrence
Template 3: Checklist distribution type
- Specify the usage situation first
- Present 7-10 items to check
- Target “items that could not be met” for improvement next time
Template 4: Comparison verification type
- Limit the difference between Plan A and Plan B to one
- Disclose indicators compared 24h/72h
- Decide on only one measure for next time
Template 5: Misunderstanding correction type
- Presenting one common misconception
- Explain why misunderstandings occur
- Reinforcement with implementation structure or observation data
- Present practical alternatives
Template 6: Term decomposition type
- Choose one difficult word
- Define for beginners
- Add examples that occur in practice
- Indicate the conditions for deciding whether to use or not.
Template 7: Back calculation design type
- Put the target indicators first
- Count backwards and break down the necessary actions
- Convert to post design
Template 8: Case abstraction type
- Showing a single case
- Abstract success factors into three
- Write conditions for transfer to other themes
Template 9: QA proactive type
- List the points that are likely to be refuted first
- Give short answers to each.
- Also specify “conditions that are not applicable”
Template 10: Mini serial type
- Whole map on the first shot
- The most important point in the second run
- Third operational template
- Don’t change your CTA each time
Template 11: Weekly review public type
- Overview of the week's posts
- Two good/two bad points
- Fix next week's revision policy to one
Template 12: Redefined type for beginners
- Translating established theories for experts into terms for beginners
- Attach the minimum steps that can be taken now
- Show how to return if you fail
With templates, consistency matters more than quantity. In the first month, you will learn faster by picking only two or three and rotating them.
Appendix C: 24h/72h verification log recording format
Whether or not you get results in practice depends more on the "recording quality" than on the posting quality. If you leave a log in the format below, you will be less likely to make a mistake in judgment at the end of the month.
| Date | Post ID | Template type | Changes | 24h imp | 24h replies | 72h profile clicks | Subjective memo |
|---|---|---|---|---|---|---|---|
| 03/01 | A001 | Open procedure | Shorten 2 lines of introduction | 1200 | 14 | 26 | Clear introduction and quick response |
| 03/03 | A002 | Failure learning type | Fixed CTA to one | 980 | 21 | 18 | Conversations have increased, but transitions are weak |
| 03/05 | A003 | Comparison verification type | Shorten paragraph length | 1350 | 16 | 31 | Clicking improved |
From this log, make decisions as follows:
- Extract only the “changes” of posts that have moved indicators
- Categorize changes (introduction/text/CTA)
- Next week, improve only the categories that contributed the most.
“Fixing everything” destroys learning. If you correct only one category every week, the reproducibility after 4 weeks will be higher.
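The rule "improve only the category that contributed most" can be sketched from log rows like those above. The category labels and the simple gain-over-baseline metric are my own illustration; use whatever contribution measure fits your log:

```python
# Sketch: pick the change category that moved 72h profile clicks the most.
# Categories and the gain-over-baseline metric are illustrative choices.

log = [
    {"category": "intro", "clicks_72h": 26},
    {"category": "cta",   "clicks_72h": 18},
    {"category": "body",  "clicks_72h": 31},
]

def best_category(rows, baseline=20):
    # accumulate each category's gain relative to the account baseline
    gains = {}
    for r in rows:
        gains[r["category"]] = gains.get(r["category"], 0) + (r["clicks_72h"] - baseline)
    return max(gains, key=gains.get)

focus = best_category(log)  # the single category to improve next week
```

Keeping the decision to one category per week is what preserves the comparability the log exists to provide.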
Appendix D: Editorial guide to reduce negative signals
not_interested / block / mute / report are hard for a poster to observe directly, so manage them through proxy indicators.
Examples of proxy indicators
- Profile transition slows down even though responses are increasing
- Replies come in, but few constructive conversations develop
- Sustainability of responses sharply decreases after the day after posting
- Exposure suddenly decreases when posting the same theme repeatedly
Although these are not definitive indicators, they can be treated as signs of increasing negative signals. Prioritize the following fixes the week the symptoms appear.
- Reduce too strong assertions
- Avoid divisive headlines
- Don't widen your target audience too much
- Clear separation between subjectivity and fact
Appendix E: Replacement steps to follow repository updates
The public implementation may be updated in the future. You can continuously maintain article content by monitoring differences using the following steps.
- Check the diff of README.md once a month
- Check the diffs of home-mixer/scorers and home-mixer/filters
- Check the diffs around the mask in phoenix/grok.py
- Record added or deleted action names
- Update the "translation into practice" parts of the article first
Running this loop reduces the risk of the explanations going stale.
References (as of March 1, 2026)
Related resources
These resources collect the templates for putting this article's content directly into practice.
Next action
If you want to try this flow, start by drafting posts for a single theme.
