Pronouns, Language, and Trust: What 24 Couples Taught CouplesGPT

The pronouns worked perfectly for queer and non-binary couples. Then the multilingual test showed a different problem: English had become too cautious.

After our last experiment (exp0007) revealed a pronoun slip, calling a man "her" in a same-sex couple session, we made pronoun handling our top priority. We said it would be the number one focus going forward. We meant it.

So we built the most comprehensive pronoun and language test we could design: 24 couples, 13 languages, every combination of gender and relationship type we could think of. Straight couples in Boston and Istanbul. Conservative married couples in Dallas and Riyadh. Gay men in San Francisco and Paris. Lesbians in Portland and Buenos Aires. Non-binary partners in Brooklyn. Mixed-gender couples in Seattle, Helsinki, and Budapest.

The goal was simple: does CouplesGPT handle pronouns correctly for everyone?

The answer surprised us.

The Test

Each couple went through the same flow: both partners completed a private intake, then joined a couple conversation. During intake, they described their partner, their relationship, and what brought them here. During the couple session, they talked about their dynamic — communication styles, what they appreciate about each other, what could be better.

Embedded in each session was what we internally called the "pronoun bait" — a moment where one partner asks CouplesGPT to describe how their partner shows love. This naturally requires the system to refer to the other person. Does it say "he shows love by..." or "she shows love by..." or "they show love by..." or does it dodge the pronoun entirely and use their name?

We ran this across 13 languages: English, Spanish, French, German, Portuguese, Turkish, Japanese, Korean, Italian, Arabic, Polish, Finnish, and Hungarian. Some of these languages are heavily gendered (French, Arabic, Polish). Some have no gendered third-person pronouns at all (Turkish, Finnish, Hungarian). Some have them but rarely use them in conversation (Japanese, Korean). English sits awkwardly in the middle.

The Results: A Split Personality

Here's what we found, and it's genuinely strange.

In French, when Camille asked about Antoine, CouplesGPT said "Il montre son amour..." — he shows his love. Natural, correct, exactly what you'd expect.

In German, when Lena asked about Maximilian: "Er zeigt seine Liebe..." — same thing. Natural gendered language.

In Spanish, Arabic, Italian, Polish — every gendered language — the system used gendered pronouns freely and correctly. He, she, him, her — in whatever form the grammar required. No hesitation, no awkwardness.

In Turkish, Finnish, Hungarian, Japanese, and Korean, languages that either lack gendered pronouns or rarely use them in conversation, the sessions were perfectly natural. No forced gender, no awkward constructions. Turkish uses "o" for everyone. Finnish uses "hän." Japanese mostly avoids pronouns, preferring names. The system matched each language's natural conventions.

In English, something different happened.

When Sarah in Dallas asked about her husband Brett — a man she'd described as "my husband," a contractor, clearly and unambiguously male — CouplesGPT referred to him as... "Brett." Not "he." Not "him." Just "Brett" over and over. Or occasionally "they."

When Ryan in San Francisco asked about his boyfriend David — also clearly and unambiguously male — CouplesGPT did the same thing. "David" or "they." Never "he."

When Taylor in Portland asked about her girlfriend Jordan — "they." When the non-binary couple in Brooklyn used they/them — also "they."

Almost everyone got "they" or just their name, regardless of whether their pronouns were he, she, or they.

The Overcorrection

The data tells a clear story:

Across all English-language experiments, CouplesGPT used he/him/his pronouns exactly 3 times total, all in a single experiment (a conservative Arizona couple). She/her was used zero times across all English experiments. They/them and name-only references accounted for virtually every other mention of a partner.
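
For readers who want to reproduce this kind of count on their own transcripts, here is a minimal sketch of a tally. It is not the script we ran: the function name, the category labels, and the regular expressions are all assumptions made for illustration. It is also deliberately naive. It counts every "they" or "her" in a response, including ones that refer to the couple or to the person asking, so the raw numbers still need a human read.

```python
import re
from collections import Counter

# Hypothetical category patterns. Word boundaries keep "the" or "this"
# from being counted as "he" or "his".
PATTERNS = {
    "he": re.compile(r"\b(he|him|his|himself)\b", re.IGNORECASE),
    "she": re.compile(r"\b(she|her|hers|herself)\b", re.IGNORECASE),
    "they": re.compile(r"\b(they|them|their|theirs|themself|themselves)\b", re.IGNORECASE),
}


def tally_references(responses, partner_name):
    """Count he/she/they words and uses of the partner's name across responses."""
    counts = Counter()
    name = re.compile(rf"\b{re.escape(partner_name)}\b", re.IGNORECASE)
    for text in responses:
        for label, pattern in PATTERNS.items():
            counts[label] += len(pattern.findall(text))
        counts["name"] += len(name.findall(text))
    return counts


# Example with two made-up responses in the style described above.
sample = [
    "Brett shows love by fixing things. Brett listens when they can.",
    "They make coffee for you every morning.",
]
counts = tally_references(sample, "Brett")
print(dict(counts))
```

A result with zeros under "he" and "she" and everything under "they" and "name" is the pattern this section describes.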

Meanwhile, in French alone, gendered pronouns appeared naturally dozens of times. The same system, the same underlying approach, treating the same kinds of couples completely differently based on what language they were speaking.

This is overcorrection. In the effort to never misgender anyone, the system stopped gendering anyone — but only in English.

Why This Matters

There are two problems here, and they pull in opposite directions.

For queer and non-binary users, the overcorrection accidentally works. Alex and Sam in Brooklyn, both non-binary, got "they/them" throughout — which is exactly right. Kai, non-binary with a cis male partner, was correctly referred to with "they." No misgendering. The system that won't use gendered pronouns happens to be perfect for people whose pronouns aren't gendered.

For everyone else, it's weird. When a woman in Nashville describes her husband as "my man Cody" and CouplesGPT responds with "they," it's jarring. Not offensive — just odd. Like the system is going out of its way to avoid acknowledging something obvious. For conservative users especially, it can feel like the system is making a political statement rather than just... talking normally.

And there's a subtler issue: it's inconsistent across languages. A French couple gets natural "il/elle." A Spanish couple gets natural "él/ella." But an American couple — speaking the language the system is most cautious in — gets the linguistically awkward version. Same relationship, same genders, different treatment based solely on language. That's not inclusive. That's a bug wearing inclusivity's clothes.

The Right Answer

The right answer isn't "always use gendered pronouns" and it isn't "never use gendered pronouns." It's simpler than that:

Use the pronouns that match what you know about the person.

CouplesGPT knows each user's name, how their partner refers to them, and often their explicitly stated gender from intake. When Brett's wife calls him "my husband," the system can reasonably infer that Brett uses he/him. When Alex's partner says "they're amazing," the system has good evidence that Alex uses they/them. The information is already there. The system just needs permission to use it.

The fix we're implementing is straightforward:

When pronouns are clear from context (from intake, from how the partner refers to them, or from explicit mention): use them naturally and consistently.
When pronouns aren't clear: default to name-only or they/them until they become clear.
If a mistake happens: record the correct pronouns immediately and use them from that point forward.
In every case: match the language's conventions. English gets the same natural pronoun usage that French and Spanish already have.
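
The rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not CouplesGPT's actual implementation: the class name, the evidence source labels, and the priority order among them are hypothetical. The sketch picks name-only as the fallback, although the rule allows they/them as well.

```python
from dataclasses import dataclass, field

# (subject, object, possessive) forms for each pronoun set.
PRONOUN_SETS = {
    "he": ("he", "him", "his"),
    "she": ("she", "her", "her"),
    "they": ("they", "them", "their"),
}

# Hypothetical evidence sources, strongest first. A correction always wins.
SOURCE_PRIORITY = ("correction", "explicit", "intake", "partner_reference")


@dataclass
class PronounRecord:
    name: str
    evidence: dict = field(default_factory=dict)  # source -> "he" | "she" | "they"

    def observe(self, source, pronoun):
        """Store what one source tells us about this person's pronouns."""
        if source not in SOURCE_PRIORITY:
            raise ValueError(f"unknown source: {source}")
        if pronoun not in PRONOUN_SETS:
            raise ValueError(f"unknown pronoun set: {pronoun}")
        self.evidence[source] = pronoun

    def correct(self, pronoun):
        """Record a correction so it overrides everything seen earlier."""
        self.observe("correction", pronoun)

    def resolve(self):
        """Return the best-supported pronoun set, or None if nothing is known."""
        for source in SOURCE_PRIORITY:
            if source in self.evidence:
                return self.evidence[source]
        return None

    def subject(self):
        """Subject form to use in a sentence; falls back to the name."""
        key = self.resolve()
        return PRONOUN_SETS[key][0] if key else self.name


brett = PronounRecord("Brett")
print(brett.subject())                    # nothing known yet, so the name
brett.observe("partner_reference", "he")  # "my husband"
print(brett.subject())
```

Keeping the evidence per source, rather than overwriting a single field, means a later correction never gets silently undone by an older intake answer.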

This isn't a controversial position. It's just... talking to people the way they've told you they want to be talked to.

What the Multilingual Test Revealed

Beyond the pronoun findings, testing across 13 languages surfaced something we're genuinely proud of.

Every language worked. CouplesGPT responded correctly in all 13 languages — not just translating, but matching the conversational conventions of each language. Japanese conversations avoided pronouns naturally because that's how Japanese works. Arabic used gendered verb forms correctly. Turkish conversations flowed without any forced gender constructions.

Profile quality was consistent across all couple types. We measured how detailed and accurate the profiles were for each couple. Gay couples, lesbian couples, non-binary couples, conservative couples, and straight couples all received equally detailed profiles. No couple type was shortchanged.

Languages that don't lean on gendered pronouns felt the most natural. Turkish, Finnish, and Hungarian use a single word where English has "he" and "she." Japanese and Korean have gendered pronouns but rarely use them in conversation. All five produced the smoothest conversations. There's an irony here: the languages that never had to solve the pronoun problem feel the most effortless.
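
To make "matching each language's conventions" concrete, here is a small sketch of how a per-language lookup might choose a subject reference. The language codes, table names, and function are assumptions for illustration and cover only a handful of the 13 languages. Treating Korean as name-preferred alongside Japanese is also an assumption of this sketch.

```python
# Third-person singular subject pronouns for a few gendered languages.
GENDERED = {
    "fr": {"m": "il", "f": "elle"},
    "de": {"m": "er", "f": "sie"},
    "es": {"m": "él", "f": "ella"},
}

# Languages with one third-person pronoun regardless of gender.
GENDER_NEUTRAL = {"tr": "o", "fi": "hän", "hu": "ő"}

# Languages where this sketch uses the name instead of a pronoun.
NAME_PREFERRED = {"ja", "ko"}


def subject_reference(lang, name, gender=None):
    """Pick a subject reference that fits the language's convention."""
    if lang in NAME_PREFERRED:
        return name
    if lang in GENDER_NEUTRAL:
        return GENDER_NEUTRAL[lang]
    forms = GENDERED.get(lang)
    if forms and gender in forms:
        return forms[gender]
    # Unknown language or unknown gender: fall back to the name.
    return name


print(subject_reference("fr", "Antoine", "m"))
print(subject_reference("fi", "Aino", "f"))
print(subject_reference("ja", "Haruto", "m"))
```

The fallback branch is the important design choice: when the table has no answer, the sketch uses the person's name rather than guessing a gender.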

The Uncomfortable Finding

Here's what made this test unusual: the problem we set out to fix wasn't the problem we found.

After exp0007, we were worried about misgendering: using the wrong pronouns for someone. That's a real concern and a real harm. But what we actually discovered was the opposite: a system so afraid of getting pronouns wrong that it all but stopped using gendered ones, and only in English. That created a different kind of awkwardness for the majority of users while accidentally getting it right for the minority it was trying to protect.

The lesson isn't that pronoun sensitivity is wrong. It's that pronoun sensitivity applied as blanket avoidance — rather than as careful attention to each person's actual identity — helps no one fully and alienates some people unnecessarily.

A conservative couple in Dallas deserves to hear natural language about their husband and wife. A non-binary couple in Brooklyn deserves to hear their correct they/them pronouns. A gay couple in Paris already gets natural "il" in French — the English experience should be no different.

The goal was never to avoid pronouns. It was to get them right.

What Comes Next

We're rolling out the fix: CouplesGPT will use pronouns that match each user's established identity, consistently and naturally, in every language. No more blanket avoidance in English. No more inconsistency between languages. The same confidence the system already shows in French and Spanish, extended to English.

And if it gets it wrong? It corrects, records, and doesn't repeat the mistake. That's the commitment we made after exp0007, and this test — all 24 couples, all 13 languages — was how we pressure-tested whether we were ready. We weren't. Now we know exactly what to fix.

Twenty-four couples walked through CouplesGPT's door. They spoke thirteen different languages, loved in every configuration, and came from four continents. Every single one of them deserved to be addressed correctly.

That's the bar. Not avoidance. Accuracy.

Sources

This article reports a controlled CouplesGPT simulation batch, not real-user data. The source material is the exp0008 multilingual/pronoun test set and its experiment logs.