2026 edition: How to read Japan's X algorithm

For beginners to X operation / Published: 2026/02/20 · Updated: 2026/03/12


Information on X operation still tends to rely on explanations like "you can grow with this one trick." In 2026, however, the range of behavior that can be verified against public code has expanded, and the premises of the discussion can be made considerably more concrete.

This article is a practical re-edit based on xai-org/x-algorithm as of March 1, 2026, translating the processing structure of the For You recommendation into Japanese-language operation. The title is unchanged, but the body is substantially more detailed.

3 minute summary first

For busy readers, here are the conclusions first.

  1. x-algorithm shows the implementation structure of the For You feed recommendation, with candidate sourcing, exclusion, and scoring clearly separated.
  2. Ranking is not a single metric but a weighted sum of multiple action probabilities, and negative signals such as not_interested / block / mute / report are explicitly treated as deductions.
  3. Filters run in two stages, before and after scoring, so you must design not only to raise the score but also to avoid being dropped in the first place.
  4. Because a ranking mask (candidate isolation) stops candidates from seeing each other, improving the quality of individual posts is directly linked to reproducibility.
  5. For Japanese-language operation, the highest-leverage improvement is clarifying the two-line introduction.

Below, these five points are explained in detail using the vocabulary of the code.

The position of this article (the line between facts and inference)

First, let us clarify where the public facts end and where operational inference begins.

Published facts (verifiable in the repository)

  • Names of the key components of the For You recommendation
  • Stage structure of candidate sourcing, hydration, filtering, scoring, and selection
  • Concept of multi-action prediction and weighted-summation scoring
  • Implementation policy of an attention mask that prevents candidates from seeing each other
  • Typical filter names (duplicate, old, seen, muted, visibility, etc.)

Undisclosed or undetermined (gray areas)

  • Exact weight values and thresholds used in production (parameters may not be disclosed)
  • Model training frequency, serving ratio against real traffic, and regional fine-tuning
  • The overall system picture across all product surfaces

In other words, this article follows two principles: explain the structure on the basis of the code, and do not mix in speculation about concrete values.

Figure 1: Overview of For You recommendation pipeline

For You Pipeline Overview

This diagram summarizes the processing order readable from the public README and the home-mixer implementation. Crucially, scoring does not exist in isolation; it depends heavily on the stages before and after it.
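The stage ordering in the diagram can be sketched as a single function. This is an illustrative skeleton of the flow, not the production implementation; stage names follow the public README, and the function bodies are placeholders.

```python
# Illustrative sketch of the For You stage ordering. Sources, filters, and
# scorers are passed in as plain callables; real stages are far richer.

def for_you_pipeline(query, sources, filters, scorers, post_filters, k):
    candidates = []
    for source in sources:                 # 2. candidate sourcing
        candidates.extend(source(query))
    # 3. candidate hydration would attach metadata here (omitted)
    for f in filters:                      # 4. pre-scoring filters
        candidates = [c for c in candidates if f(query, c)]
    for scorer in scorers:                 # 5. scoring (chained; order matters)
        candidates = scorer(query, candidates)
    ranked = sorted(candidates,            # 6. selection: sort, keep top K
                    key=lambda c: c["score"], reverse=True)[:k]
    for f in post_filters:                 # 7. post-selection filters
        ranked = [c for c in ranked if f(query, c)]
    return ranked
```

Even at this level of abstraction, the two-stage filtering is visible: a candidate can be dropped before it is ever scored, and again after it wins a top-K slot.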

7 steps to read along with the code

We will look at each stage, focusing on home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs.

1. Query Hydration

Here, we will create the "user context" at the time of the recommendation request.

  • User behavior sequence (past reaction history)
  • User features (information derived from follow relationships and settings)

From an operational perspective, this is the basis for past behavior shaping the next delivery. It is also why posts that generate steady responses over time tend to fare better than posts that go viral briefly.
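The "user context" can be pictured as a small record assembled at request time. The field names below are illustrative assumptions, not taken from the repository:

```python
from dataclasses import dataclass, field

# Hypothetical shape of the per-request user context. Every downstream
# stage (filters, scorers) reads from this one object.

@dataclass
class UserContext:
    user_id: int
    engagement_history: list = field(default_factory=list)  # past (post_id, action) pairs
    follows: set = field(default_factory=set)               # slice of the follow graph
    muted_keywords: set = field(default_factory=set)

def hydrate_query(user_id, store):
    """Assemble the context from a user-data store (here, a plain dict)."""
    return UserContext(
        user_id=user_id,
        engagement_history=store.get("history", []),
        follows=set(store.get("follows", [])),
        muted_keywords=set(store.get("muted", [])),
    )
```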

2. Candidate Sourcing

Candidates come from at least two lineages.

  1. Thunder (In-Network)
  2. Phoenix Retrieval (Out-of-Network)

In-Network is the discovery slot fed by accounts you follow; Out-of-Network is the discovery slot that includes accounts you do not follow. The important point is that the system does not pick one from the start: it builds a combined candidate set and narrows it down at later stages.
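The two lineages can be sketched as two source functions whose union forms the candidate pool. The retrieval cutoff and scores below are stand-ins for whatever embedding retrieval actually does, purely for illustration:

```python
# Toy sketch of the two candidate lineages. In-network = followed authors;
# out-of-network = unfollowed authors surfaced by a retrieval score.
# The 0.5 cutoff is an arbitrary assumption.

def in_network_candidates(ctx, posts):
    return [p for p in posts if p["author_id"] in ctx["follows"]]

def out_of_network_candidates(ctx, posts, retrieval_scores):
    return [p for p in posts
            if p["author_id"] not in ctx["follows"]
            and retrieval_scores.get(p["post_id"], 0.0) > 0.5]

def source_candidates(ctx, posts, retrieval_scores):
    # Build the union first; narrowing happens in later stages.
    pool = in_network_candidates(ctx, posts)
    pool += out_of_network_candidates(ctx, posts, retrieval_scores)
    return pool
```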

3. Candidate Hydration

Adds additional information necessary for score calculation to the obtained candidates.

  • Core post data
  • Attributes such as video length
  • Subscription viewing eligibility
  • Author information

This stage is simple, but candidates that miss required data here fall at later stages. That is why metadata completeness is effectively a quality requirement, not just the content of your posts.

4. Pre-Scoring Filters

Drop "unqualified candidates" before scoring. The main filters that appear in public implementations are:

  • DropDuplicatesFilter
  • CoreDataHydrationFilter
  • AgeFilter
  • SelfTweetFilter
  • RetweetDeduplicationFilter
  • IneligibleSubscriptionFilter
  • PreviouslySeenPostsFilter
  • PreviouslyServedPostsFilter
  • MutedKeywordFilter
  • AuthorSocialgraphFilter

The practical point here is simple: before working to raise your score, remove the factors that would drop you from the pool.
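A filter chain is just a sequence of predicates applied in order. The sketch below echoes three of the filter names listed above; the predicate bodies are simplified stand-ins, not the repository's logic:

```python
# Minimal pre-scoring filter chain. Each factory returns a predicate;
# apply_filters runs them in order, shrinking the candidate list.

def drop_duplicates(seen_ids):
    """Stand-in for DropDuplicatesFilter: keep first occurrence only."""
    def f(c):
        if c["post_id"] in seen_ids:
            return False
        seen_ids.add(c["post_id"])
        return True
    return f

def age_filter(max_age_hours):
    """Stand-in for AgeFilter: drop posts older than the cutoff."""
    return lambda c: c["age_hours"] <= max_age_hours

def muted_keyword_filter(muted):
    """Stand-in for MutedKeywordFilter: drop posts containing muted words."""
    return lambda c: not any(w in c["text"] for w in muted)

def apply_filters(candidates, filters):
    out = candidates
    for f in filters:
        out = [c for c in out if f(c)]
    return out
```

Note that a candidate dropped here never reaches the scorer at all, which is exactly why "avoid being dropped" comes before "raise the score."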

5. Scoring

In the public implementation, score-related processing is chained together in multiple steps.

  1. PhoenixScorer: Estimate behavior probability
  2. WeightedScorer: Weighted addition of multiple actions
  3. AuthorDiversityScorer: Attenuate continuous exposure of the same author
  4. OONScorer: Out-of-Network correction

This order matters. Candidates are not simply sorted by the estimated value; corrections for diversity and network type are applied afterward.
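The first two links of the chain can be sketched together: a weighted base score, then an author-diversity attenuation applied on top. The 0.5 decay factor is an assumption for illustration; the production value is not public:

```python
# Sketch of WeightedScorer followed by AuthorDiversityScorer.
# Weights and decay are illustrative assumptions, not production values.

def weighted_score(probs, weights):
    """Weighted sum of action probabilities (negative weights deduct)."""
    return sum(weights.get(a, 0.0) * p for a, p in probs.items())

def author_diversity(candidates, decay=0.5):
    """Attenuate repeated exposure: the n-th post by the same author,
    in descending score order, is multiplied by decay**n."""
    seen = {}
    for c in sorted(candidates, key=lambda c: c["score"], reverse=True):
        n = seen.get(c["author_id"], 0)
        c["score"] *= decay ** n
        seen[c["author_id"]] = n + 1
    return candidates
```

Because the attenuation runs after the base score, a streak of same-author posts loses ground even if each post individually scores well, which is the structural reason repeated same-type posting weakens.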

6. Selection

Sort by final score and take the top K. The structure of selectors/top_k_score_selector.rs is very clear: selection is determined mainly by the final score.
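In Python terms, the selection stage reduces to a one-liner, shown here as a sketch of the idea rather than a port of the Rust selector:

```python
import heapq

# "Sort by final score, keep K" — heapq.nlargest returns the top-k
# candidates in descending score order without sorting the whole list.

def top_k(candidates, k):
    return heapq.nlargest(k, candidates, key=lambda c: c["score"])
```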

7. Post-Selection Filters

Even after selection, candidates can be excluded further. Representative filters:

  • VFFilter
  • DedupConversationFilter

In other words, ranking near the top does not guarantee delivery. Final gates for visibility policy and conversation-duplication control remain.

Figure 2: Practical understanding of score design

Score Composition Diagram

WeightedScorer multiplies each action probability by a weight and sums the results. The important point is that you must not only increase positive reactions but also reduce negative reactions at the same time.

Translate predicted actions into production language

The actions that can be seen in the public README and phoenix/runners.py are translated into posting operations as follows.

Prediction system | Representative actions | Operational meaning
Reaction | like / reply / repost / quote | Strength of empathy, discussion, and diffusion
Transition | click / profile_click | Intent to see more
Dwell | dwell / dwell_time / video_view | Whether the post was watched rather than skipped
Expansion | photo_expand / share / copy_link | Content worth preserving and sharing with others
Relationship | follow_author | Conversion into a long-term relationship
Negative signal | not_interested / block / mute / report | Signs of disagreement, discomfort, or mistrust

The conclusions drawn from this design are clear.

  1. Optimizing for likes alone is not enough.
  2. Bait designs that chase only clicks lose in the long term.
  3. Editing to reduce negative signals is more effective than you might imagine.
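A toy worked example makes these conclusions concrete. The weights below are invented for illustration only (the production values are unpublished); the point is the sign structure, with negative signals entering as deductions:

```python
# Hypothetical weights -- NOT production values. Only the signs matter
# for the argument: negative signals carry negative weight.
WEIGHTS = {
    "like": 1.0, "reply": 2.0, "repost": 1.5,
    "profile_click": 0.8, "video_view": 0.5,
    "not_interested": -3.0, "report": -5.0,
}

def score(probs):
    return sum(WEIGHTS.get(a, 0.0) * p for a, p in probs.items())

# A bait post: many likes, but it also irritates some viewers.
bait = score({"like": 0.30, "not_interested": 0.10, "report": 0.02})
# A solid post: fewer likes, but replies and almost no negative signals.
solid = score({"like": 0.20, "reply": 0.05, "not_interested": 0.01})
```

Under these assumed weights the bait post's like advantage is wiped out by the deductions, while the solid post ends up ahead, which is precisely why like-only optimization is insufficient.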

How to interpret Candidate Isolation

The attention mask in phoenix/grok.py prohibits mutual references between candidates and allows references to user context and history. There are three practical implications:

  1. Less likely to be influenced by other candidates in the batch
  2. Improving individual posts is easily effective
  3. It is more reproducible to improve the post itself than to win with “comparison gacha”
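The masking idea can be sketched in pure Python. This is a stand-in for the concept in phoenix/grok.py, not its code: every token may attend to the user-context tokens, a candidate may attend to itself, but no candidate may attend to another candidate:

```python
# Boolean attention-mask sketch for candidate isolation.
# Rows = attending token, columns = attended-to token.
# Tokens 0..n_context-1 are user context; the rest are candidates.

def isolation_mask(n_context, n_candidates):
    size = n_context + n_candidates
    mask = [[False] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if j < n_context:    # anyone may read the user context
                mask[i][j] = True
            elif i == j:         # a candidate may read itself
                mask[i][j] = True
            # candidate -> other candidate stays False: no mutual reference
    return mask
```

Because the False entries block cross-candidate attention, each candidate's score depends only on itself plus the user context, which is the structural basis for the three implications above.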

Based on this, it is reasonable to improve your posts in the following order:

  1. Clarify the two-line introduction
  2. Optimize paragraph breaks and information density
  3. Use a single CTA
  4. Re-edit after reviewing the reaction log

Figure 3: Diagram of 2-stage filter

Two-Stage Filters Diagram

If you focus too much on the score, you will overlook filter design. In practice, the filters often set the ceiling on exposure first.

Operational mistakes that tend to get clogged with pre-score filters

  • Reposting the same text at short intervals
  • Mass-producing posts while ignoring the already-seen list
  • Ambiguous expressions that touch muted keywords
  • Introductions that are easy to misread for lack of context

Operational mistakes that tend to get clogged with post-score filters

  • Worsened visibility judgments from repeated incendiary text
  • Mass-produced duplicate branches within the same conversation thread

The principle here is "don't fall before you stretch."

Using structure to explain why traditional hacks are not effective

The feeling that "the old ways suddenly stopped working" can be explained by the public structure as follows.

  1. Candidate sourcing has two entrances (in-network and out-of-network)
  2. The score is a multi-objective optimization over multiple actions
  3. Negative signals are explicit deductions
  4. Diversity correction weakens streaks of same-type posts
  5. Non-conforming content is dropped by the two-stage filters

That is why templates that stabilize reaction quality beat short-term hacks.

Practical design in Japanese X (detailed version)

From here on, we will show you the specific steps to translate the above structure into Japanese usage.

1. Post design: strictly adhere to one post, one theme

The more themes you mix, the harder the post is to estimate. Fix one value to convey per post, and leave supplementary information to the replies.

  • First line, equivalent to a heading: Who is it for?
  • 2nd line: What do you get?
  • Body: Conclusion -> Reason -> Specific example -> Action

2. How to make two lines of introduction (template)

The following patterns make reactions easier to compare.

  • Target: "For people who stop responding after the first month of using X"
  • Benefit: “How to identify one area for improvement in 72 hours”
  • Expected value: "Verify without increasing the number of posts"

If you oversell the expected value in the introduction, reactions tend to tilt toward negative signals. Prioritize specific, sincere promises over strong ones.

3. Information density design of main text

In the Japanese-language sphere, feeds move fast and readers skim even long posts. The following rules keep the text readable.

  1. 2-4 lines per paragraph
  2. One point per paragraph
  3. Attach numbers or steps to abstract words
  4. Define terms only on their first appearance

4. Only one CTA

Multiple CTAs split the action rate. It is more measurable to separate posts that aim for replies from posts that aim for profile clicks.

5. Preventive measures against negative reactions

Please check the following before posting.

  • Is the subject too large?
  • Does it unnecessarily provoke the reader's attributes?
  • Are you confusing facts and opinions?
  • Doesn't an assertive tone lead to misunderstandings?

This is not a moral theory, but a practical response to score design.

Operation flow using TenguX (for implementation)

  1. Select one theme candidate with /neta
  2. Create 2 drafts by quoting/rewriting (change only the 2 lines of introduction)
  3. Deliver to JST fixed frame with /queue
  4. Comparison of indicators at 24h and 72h
  5. Only winning introductions are carried over to the next week

The trick here is not to change the entire text every time. To preserve comparability, limit changes to one or two items.

Measurement design: 24h/72h two-stage review

What to see in 24h (initial velocity)

  • imp (delivery volume)
  • engagement (reaction entry point)
  • profile clicks (interest transition)

What you see in 72h (persistent)

  • replies (depth of discussion)
  • reposts/quotes (room for re-diffusion)
  • Whether the template holds up (whether revisit reactions occur)

Examples of practical judgment rules

  • Weak initial velocity -> revise the two-line introduction
  • Good initial velocity but poor persistence -> revise the body structure
  • Responses exist but quality is poor -> revise the subject and assertive expressions
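The three judgment rules above can be written as one triage function. The 0.7 thresholds are placeholders to be calibrated against your own account's baseline, not recommended values, and profile clicks stand in as a rough proxy for response quality:

```python
# Triage a post against its account baselines. Thresholds (0.7x baseline)
# are illustrative assumptions, not tuned values.

def triage(imp_24h, baseline_imp, replies_72h, baseline_replies,
           profile_clicks, baseline_clicks):
    if imp_24h < 0.7 * baseline_imp:
        return "fix: two-line introduction"       # weak initial velocity
    if replies_72h < 0.7 * baseline_replies:
        return "fix: body structure"              # poor persistence
    if profile_clicks < 0.7 * baseline_clicks:
        return "fix: subject and assertive expressions"  # weak quality proxy
    return "keep: carry the template forward"
```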

30-day operation plan (details)

Week 1: Building the foundation for measurement

  1. Fix the post template to one type
  2. Limit yourself to two themes
  3. Create a record sheet for 24h/72h after posting

Week 2: Deployment optimization

  1. A/B test only the two-line introduction
  2. Keep the body fixed
  3. Decide on the most stable introduction pattern

Week 3: Body optimization

  1. Adjust paragraph length and the ratio of bullet points
  2. Increase the amount of concrete examples by 1.2x
  3. Fix the CTA and compare only body differences

Week 4: Re-editing and capitalization

  1. Re-edit the top 3 posts by response
  2. Create different angles on the same theme
  3. Turn the winners into templates as the standard for next month

The deliverable you should create in these 30 days is not a "buzz post" but a "reproducible template."

Common Misconceptions (FAQ)

Q1. Will everything get better if there are more likes?

No. The public structure makes the score a composite of multiple actions. Even a strong like signal can be offset by other signals or by negative ones.

Q2. Can you win by increasing the number of posts?

It increases exposure opportunities in the short term, but can backfire if filters and quality are not in order. In the first month, verification accuracy matters more than post count.

Q3. Should I provoke to target Out-of-Network?

Not recommended. OONScorer exists, but it is not a license to do anything. A rise in negative signals is a long-term disadvantage.

Q4. Can I understand all of X by reading this repository?

What is visible is the core structure of the For You recommendation. Not all production elements and thresholds are public, so avoid definitive claims.

Q5. What is the top priority when using Japanese?

Clarifying the two-line introduction. If the target audience and benefit are unclear, all downstream optimization loses effect.

Practical checklist (before posting)

  1. Is this post focused on one theme?
  2. Is the target audience clearly stated in the first line?
  3. Are the benefits clearly stated in the second line?
  4. Is there one point per paragraph?
  5. Are there any assertive expressions that can lead to misreading?
  6. Is there excessive provocation?
  7. Is there one CTA?
  8. Have you decided on the items to be verified 24h/72h?

Simply satisfying these eight items will significantly improve the reproducibility of your operations.

Summary

What matters for X operation in 2026 is not hunting for tricks but understanding the structure. The practical essence of xai-org/x-algorithm comes down to three points.

  1. The For You recommendation works in stages (candidate sourcing -> hydration -> filter -> score -> final filter)
  2. The score is multi-objective; negative signals are explicit deductions.
  3. Operations that keep improving the quality of individual posts are the most reproducible.

In Japanese-language operation, simply keeping these three points up for 30 days moves you from intuition-driven to verification-driven operation.

Appendix A: Correspondence table between implementation files and highlights

The major files are organized by purpose so you do not get lost when digging deeper. Knowing where to read first for each question also makes it easier to keep up with updates.

Purpose | Where to read first | What to read | Translation into practice
Get the big picture | README.md | For You structure, stage names, main concepts | A map of where to improve
Check the pipeline order | home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs | Order of query/source/hydrator/filter/scorer/selector | Decide improvement priorities
Check the score formula | home-mixer/scorers/weighted_scorer.rs | Weighted sum of actions, negative-signal terms | Avoid the trap of "likes-only" optimization
See the diversity correction | home-mixer/scorers/author_diversity_scorer.rs | Attenuation for the same author | Rethink strategies relying on repeated posting
See the OON correction | home-mixer/scorers/oon_scorer.rs | Existence of in/out-of-network correction | Avoid over-reliance on unfollowed exposure
Understand the ranking mask | phoenix/grok.py | Candidate-isolation mask logic | Grounds the focus on individual post quality
List the predicted actions | phoenix/runners.py | Order of the multi-action outputs | Feed into observation-metric design

Of particular interest in the table above is the combination of weighted_scorer and grok.py. The former shows what is being combined; the latter shows how candidates are compared. Once you understand both, your strategy shifts from guessing to hypothesis testing.

Appendix B: 12 posting templates for Japanese X

Below are templates that are easy to use even in the first month. All assume that the two-line introduction can be compared.

Template 1: Procedure public type

  1. Clarify the target audience
  2. Putting the results obtained in numbers
  3. List the procedure in 3-5 steps
  4. Finally, declare "What should we improve next time?"

Template 2: Failure learning type

  1. Write down the failure situation based on facts
  2. Narrow down to one cause
  3. Show correction steps
  4. Place a check to prevent recurrence

Template 3: Checklist distribution type

  1. Specify the usage situation first
  2. Present 7-10 items to check
  3. Target “items that could not be met” for improvement next time

Template 4: Comparison verification type

  1. Limit the difference between Plan A and Plan B to one
  2. Disclose indicators compared 24h/72h
  3. Decide on only one measure for next time

Template 5: Misunderstanding correction type

  1. Presenting one common misconception
  2. Explain why misunderstandings occur
  3. Reinforcement with implementation structure or observation data
  4. Present practical alternatives

Template 6: Term decomposition type

  1. Choose one difficult word
  2. Define for beginners
  3. Add examples that occur in practice
  4. Indicate the conditions for deciding whether to use or not.

Template 7: Back calculation design type

  1. Put the target indicators first
  2. Count backwards and break down the necessary actions
  3. Convert to post design

Template 8: Case abstraction type

  1. Showing a single case
  2. Abstract success factors into three
  3. Write conditions for transfer to other themes

Template 9: QA proactive type

  1. List the points that are likely to be refuted first
  2. Give short answers to each.
  3. Also specify “conditions that are not applicable”

Template 10: Mini serial type

  1. Whole map on the first shot
  2. The most important point in the second run
  3. Third operational template
  4. Don’t change your CTA each time

Template 11: Weekly review public type

  1. Overview of the week's posts
  2. Two good/two bad points
  3. Fix next week's revision policy to one

Template 12: Redefined type for beginners

  1. Translating established theories for experts into terms for beginners
  2. Attach the minimum steps that can be taken now
  3. Show how to return if you fail

Templates reward consistency more than quantity. For the first month, pick only 2-3 and rotate them; you will learn faster.

Appendix C: 24h/72h verification log recording format

Whether you get results in practice depends more on recording quality than on posting quality. Keeping a log in the format below makes end-of-month judgments less error-prone.

Date | Post ID | Template type | Changes | 24h imp | 24h replies | 72h profile clicks | Subjective memo
03/01 | A001 | Procedure public type | Shortened 2-line introduction | 1200 | 14 | 26 | Clear introduction, quick response
03/03 | A002 | Failure learning type | Fixed CTA to one | 980 | 21 | 18 | More conversation, but transitions weak
03/05 | A003 | Comparison verification type | Shortened paragraph length | 1350 | 16 | 31 | Clicks improved

From this log, make decisions as follows:

  1. Extract only the "changes" of posts that moved the metrics
  2. Categorize the changes (introduction / body / CTA)
  3. Next week, improve only the category that contributed most
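Steps 2 and 3 can be automated over the log. The keyword rules below are illustrative assumptions for bucketing change descriptions; adapt them to however you phrase your own log entries:

```python
from collections import defaultdict

# Bucket each logged "change" into introduction / body / cta, then find
# the category whose changes moved the chosen metric most in total.
# Keyword lists are illustrative, not a fixed taxonomy.

CATEGORY_KEYWORDS = {
    "introduction": ("introduction", "2 lines", "headline"),
    "cta": ("cta",),
    "body": ("paragraph", "text", "structure"),
}

def categorize(change):
    c = change.lower()
    for cat, words in CATEGORY_KEYWORDS.items():
        if any(w in c for w in words):
            return cat
    return "other"

def best_category(rows):
    """rows: (change_description, metric_delta) pairs from the log."""
    totals = defaultdict(float)
    for change, delta in rows:
        totals[categorize(change)] += delta
    return max(totals, key=totals.get)
```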

"Fixing everything" destroys learning. Correcting only one category per week yields higher reproducibility after four weeks.

Appendix D: Editorial guide to reduce negative signals

not_interested / block / mute / report are hard for the poster to observe directly, so manage them through proxy indicators.

Examples of proxy indicators

  1. Profile transitions slow even as reactions increase
  2. Replies exist, but constructive conversation is scarce
  3. Response persistence drops sharply after the day following the post
  4. Exposure suddenly drops when the same theme is posted repeatedly

These are not definitive indicators, but they can be treated as signs that negative signals are rising. In the week symptoms appear, prioritize the following fixes.

  1. Soften overly strong assertions
  2. Avoid divisive headlines
  3. Do not widen the target audience too much
  4. Separate subjectivity from fact clearly
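The first proxy indicator above (reactions up, profile transitions flat) can be checked mechanically. The 20% threshold is an arbitrary assumption for illustration, not a calibrated value:

```python
# Heuristic for proxy indicator 1: engagement rising while profile
# transitions stall. Thresholds are illustrative assumptions.

def negative_signal_suspected(eng_now, eng_prev, clicks_now, clicks_prev):
    eng_up = eng_now > 1.2 * eng_prev       # reactions up more than 20%
    clicks_flat = clicks_now <= clicks_prev  # transitions not growing
    return eng_up and clicks_flat
```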

Appendix E: Replacement steps to follow repository updates

The public implementation may be updated in the future. You can keep this article's content current by monitoring diffs with the following steps.

  1. Check the README.md diff once a month
  2. Check diffs in home-mixer/scorers and home-mixer/filters
  3. Check diffs around the mask in phoenix/grok.py
  4. Record added/deleted action names
  5. Update the "practical translation" sections of the article first

This routine reduces the risk of the explanations going stale.

Reference (as of March 1, 2026)

Related resources

We have compiled templates for putting the content of this article directly into practice.

Next actions

To try this flow in practice, start by drafting post ideas for a single theme.