Sticks and stones (CODA) – How names work against women

Mothers tell your daughters

From 2011 to December 2015, five women fought the Japanese Government all the way to the country’s Supreme Court. They were seeking to change a law that compels couples to adopt the same surname in order to legally register their marriage. Although the law does not specify whose name it should be, in practice, 96% of couples take the husband’s name, and the women argued that this made the law unconstitutional, because it violated their basic civil rights.

“By losing your surname … you’re being made light of, you’re not respected … It’s as if part of your self vanishes,” said one of the plaintiffs, Kaori Oguni.

Conservatives were unimpressed. Defending the law, which was passed in 1896, constitutional scholar Masaomi Takanori argued that “Names are the best way to bind families,” and that “Allowing different surnames risks destroying social stability, the maintenance of public order and the basis for social welfare.”

When I planned my three-part series on names, I hadn’t heard about this case. My earlier posts described the legal regulation of names in the West, and the way it has undermined traditional naming practices, distorting the name grammars of many of the world’s languages in the process. These changes have influenced the perception of people’s names, made them harder to process and to remember, and as I showed in my last post, they have not affected all parts of society equally. African-Americans in particular are disadvantaged by the American name system.

On examination, it turns out that there have been winners and losers in the name game. And that is why, when I read about this case, and its result, I realized that I had to add an extra post.

Family names in Japan

As with so many aspects of its culture, Japan’s name system is a unique mixture of indigenous, Western and Sinosphere influences. As in the West, for most of Japanese history, most of the population did not have surnames. First names sufficed for most purposes, to which extra features might be appended in context in order to provide additional discriminating information. Japanese ‘family’ names did not become universal until laws enforcing them were passed during the Meiji period in the late 19th Century.

The requirement for all Japanese people to adopt a surname in addition to their personal names was initially laid out in a decree issued in 1875. The names people adopted in response to this decree came from a variety of places. Some were historical, or chosen by divination. Some people adopted names chosen by a Shinto or Buddhist priest. And some names were simply made up.

An 1876 decree then laid out the rules for marriage, insisting that spouses should both keep their original family (maiden) names. Since Japanese family names are first names, this was in keeping with naming practices in the Sinosphere, where family names are also first names (in Sinosphere countries, a wife keeps her first name on marriage, and a couple’s children then take the father’s first name as a family name).

However, the Family Register Law, which was passed in 1896, changed things again. Wives were now obliged to adopt their husbands’ first (family) name on marriage. This is the regulation that the women wanted the Supreme Court to change.

Families of name grammars

In my last post, I showed how the practice of passing a hereditary name down through generations is a remarkably poor way of communicating family ties. I also showed how, if one ignores the machinations of bureaucracy, and modern myths about ‘family names,’ and attends instead to the communicative properties of native name grammars, there are some striking similarities in the way that Western and Sinosphere first names work.

Throughout most of the history of the East and the West, people’s first names were drawn from a very small pool. Traditionally, Western first names are gendered, and around 50% of men and women were given one of the 3 most common names for their gender. The frequencies with which names were given to children within genders were exponentially distributed, which meant that around 90% of the population had one of the most common 10 names for their gender, and that little more than 100 first names were in common usage.

In the Sinosphere, first names are not gendered (since all children take their father’s first name, the idea of gendered first names makes no sense). However, traditionally, around 50% of people have one of the most common 3 first names, and the frequencies of these names are exponentially distributed, so around 90% of the population had one of the most common 10 names. The number of first names in common usage was also small, amounting to little over 100, which is reflected in the fact that in China the colloquial expression for the common people is, “Bǎijiāxìng” (百家姓), which translates to, “the hundred names.”
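The two figures quoted above (top 3 names covering about 50%, top 10 covering about 90%) are consistent with each other under a very simple model. The sketch below assumes that "exponentially distributed" can be read as a geometric fall-off, where each name is a fixed fraction as common as the one ranked before it. That reading, and the function names, are assumptions made for illustration, not something taken from the historical records.

```python
# Toy model (an assumption, not the historical data): name frequencies fall off
# geometrically, so the k-th most common name has probability (1 - r) * r**(k - 1).

def geometric_ratio(top_k: int, share: float) -> float:
    """Ratio r for which the top_k most common names cover `share` of people."""
    # The top k names of a geometric distribution cover 1 - r**k of the population.
    return (1.0 - share) ** (1.0 / top_k)

def top_share(r: float, k: int) -> float:
    """Share of the population covered by the k most common names."""
    return 1.0 - r ** k

# Calibrate the model on "the top 3 names cover 50% of people".
r = geometric_ratio(top_k=3, share=0.5)

print(f"ratio between successive names: {r:.3f}")
print(f"share covered by top 10 names:  {top_share(r, 10):.1%}")
print(f"share covered by top 100 names: {top_share(r, 100):.4%}")
```

Calibrated only on the top 3 figure, this toy model puts the top 10 names at roughly 90% of the population and leaves almost nobody outside the first hundred names, which is the pattern described for both name systems.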

In both the Sinosphere and the West, last names are drawn from a much larger set, and serve to help discriminate between the large numbers of individuals that share a common first name.

In the 20th century, engineers showed how exponential distributions provide the optimal way of arranging code words in electronic communication systems (thereby paving the way for our information age), and names, of course, play a very important part in natural communication. The basic idea goes like this: if the surnames following the most frequent (and hence least informative) first names are less frequent (and hence more informative), and the surnames following the least frequent (and hence most informative) first names are more frequent (and hence less informative), and if first and last names are exponentially distributed, then the amount of information in any first name – last name sequence can be equalized (so that every first name – last name sequence contains roughly the same information) and optimized (so that the resulting set of names imposes the lowest possible demands on memory and processing while still effectively discriminating between individuals in communication). And in both the Sinosphere and the West, first names were exponentially distributed.
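A toy sketch may make the equalizing idea more concrete. Everything in it is hypothetical: the four first names, their probabilities, the 5-bit budget, and the simplifying assumption that the surnames following a given first name are all equally likely. It only illustrates the trade-off between first-name and last-name information; it is not a model of any real name system.

```python
import math

def bits(p: float) -> float:
    """Information (surprisal) of an outcome with probability p, in bits."""
    return -math.log2(p)

def equalize(first_name_probs: dict, target_bits: float) -> dict:
    """For each first name, find how many equally likely surnames are needed
    so that every full name carries `target_bits` of information."""
    plan = {}
    for first, p_first in first_name_probs.items():
        # Bits the surname must supply on top of the first name.
        last_bits = target_bits - bits(p_first)
        # Simplification: surnames after this first name are equiprobable,
        # so a surname worth `last_bits` bits is one of 2**last_bits options.
        n_surnames = round(2 ** last_bits)
        full_bits = bits(p_first / n_surnames)
        plan[first] = (n_surnames, full_bits)
    return plan

# Made-up first-name frequencies for a hypothetical community.
first_names = {"John": 0.5, "William": 0.25, "Thomas": 0.125, "Cornelius": 0.125}
plan = equalize(first_names, target_bits=5.0)

for first, (n_surnames, full_bits) in plan.items():
    print(f"{first:<10} first name = {bits(first_names[first]):.0f} bits, "
          f"surnames needed = {n_surnames:>2}, full name = {full_bits:.0f} bits")
```

In this sketch the commonest first name is paired with the largest set of (individually rarer) surnames, and the rarest first names with the smallest set of (individually commoner) surnames, so every first name – last name sequence ends up carrying the same 5 bits.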

Which means, to put it another way, that from an information processing perspective, the historical distributions of Western and Sinosphere first names look to be parts of identical communication codes that have been optimized for devices of the same bandwidth. Which is stunning (it stands as an amazing testament to what thousands of years of cultural evolution can do) and also, perhaps, unsurprising, given that the devices in question are human brains.

The distribution of first and last names in traditional Western and Sinosphere grammars conferred other important benefits. Although the small set of frequent first names means that first names don’t do much individuating, it does help to make names less grammatically ambiguous, and easier to identify in speech. And this means that even in contexts where first names fail to discriminate one individual from another, they help to make the more informative last names that serve to further individuate us easier to process in context. In linguistic terms, first names do mainly grammatical work – they help make clear that a name is a name – and last names contribute more to semantics, doing most of the work of discriminating specific individuals from their peers.

Names and marriage

In Sinosphere countries, where first names aren’t gendered, and women keep their names on marriage, there is obviously little gender bias in the name grammar. That is, in these countries, women’s first names are no less efficient in information theoretic terms than men’s – searching a database of women’s names would require no more processing than searching a database of men’s names – and marriage imposes no obvious cost on a woman’s established identity.

By contrast, in the West, marriage usually involved a change to a woman’s established identity (although this trend has diminished somewhat in recent times). And in Japan, the 1896 law that requires couples to adopt the same name has meant that in practice, the act of getting married has inevitably required women to change their established identities.

Because of the grammatical structure of names, which has evolved a very clear form across history, changing one’s name through marriage involves a different linguistic shift in Japan as compared to the West. This is because whereas marriage involves changing one’s last name in the West, it involves changing one’s first name in Japan, and historically, the information communicated by first and last names in the world’s languages has been very different. In Western terms, the Japanese situation is akin to going into a marriage service named Mary, and leaving it named Anne, as opposed to exchanging Smith for Jones.

Which of these is better or worse is an open question that I won’t be addressing here. Because what I want to do in the rest of this post is to reveal an insidious and, until now, hidden consequence of the traditions that often compelled Western women to change their names on marrying. Because in the years since universal ‘family’ names were fixed by legislatures in the West, this practice has led to a substantial erosion of the grammatical and informational properties of women’s names. And, as I will show, this has led to the development of a considerable amount of gender bias in some, if not all, Western naming systems.

What makes for a good name?

In previous posts, I have shown how the logic of native name grammars, with their highly frequent, more grammatical first names and their less frequent, more semantic last names is reflected in people’s favorability judgments about names.

The graph below plots the familiarity of male and female first names on its horizontal axis, and the favorability of British-English speakers towards them on its vertical axis. As you can see, people have a strong bias in favor of frequent first names:


Colman, Hargreaves, & Sluckin (1981)

By contrast, when it comes to last names, frequency and familiarity work differently. The next graph again plots the familiarity of last names on its horizontal axis, and how favorably English speakers feel towards them on its vertical axis, but it shows how the relationship between favorability and familiarity for last names now takes an inverted-U function, such that the most and least familiar names are viewed least favorably, and names in the middle of the familiarity band are viewed most favorably:


Colman, Sluckin, & Hargreaves (1981)

Taken together, these findings give us some insight into the kinds of names that English speakers are likely to judge more or less favorably: All other things being equal, they are likely to favor David Bowie over David Jones, and David Jones over Zowie Jones.

Who gets to have the favorable names?

In my last post, I showed how fixing what was traditionally the flexible part of Western names – the last name – appears to have severely distorted the distribution of English names as populations grew in the 19th and 20th Centuries. This is what the distribution of top 3 names looked like between 1570 and 1700 in the UK:


And this is what happened to the distribution of top 3 names once last names were fixed by legislation. The graph below plots the growth of the population of the UK between 1801 and 1994 on its horizontal axis, and the percentage of people with the top 3 male or female names over time on its vertical axis:


As you can see, the proportion of people with top 3 names dropped from around 50% to around 10% over this time (the r² shows that the increase in population accounts for 96% of the change in the distribution of the top 3 male names). Since we know that people are minded to judge high frequency first names most favorably, this in turn suggests that the overall favorability of the first names that children were given is likely to have declined in this time.

The graph above suggests that the changes in the population in this time affected male and female top 3 names in a similar way (the data for the top 10 names looks much the same). However, although this makes it seem like population changes have affected male and female names equivalently, it turns out that this is not so.

My explanation of what actually happened begins with the 1701‒1800 baptismal records for the parish of Beith, in Scotland:


As you can see, at this point in time the distributions of male and female first names were very similar. 10 male and 10 female names accounted for over 90% of male and female births.

We can evaluate what this may have meant for processing and remembering names by comparing the information in each distribution: for boys, it is 2.7 bits, and for girls it is 2.9 bits, and this, of course, means, uh… hum.

I’ll be the first to admit that ‘bits’ are not the easiest of measures to intuitively grasp. At all. Thankfully, as I described in an earlier post, computational linguists have devised a more intuitive way of thinking about units of information. The first names of Beith in the table above are distributed exponentially (any given name is exponentially less or more likely than the next name), and an exponential distribution is the optimal way of arranging code words for efficient communication. By contrast, the least efficient way to distribute code words is to make them equally probable (such that every name would be equally likely). This gives us a way of making the idea of ‘bits’ of information a little more intuitively comprehensible: If we raise 2 to the bit value of a biased distribution (like an exponential), this transformation tells us how many items an equiprobable distribution would need to contain in order to have that same bit value.

This measure is called perplexity, because it allows us to transform the information represented by abstract bit values into a more intuitive representation, which we can think of as the perplexity we feel when we are asked to choose one thing from a set of equally desirable options (this perplexes me, anyway, especially as the set of desirable options grows). Applying this to Beith’s first names reveals that the distribution of boys’ names had a perplexity of 6.5, while the perplexity of girls’ names was 7.5. Hence, although there were 10 boys’ names in common use in Beith, the efficient way they were distributed meant that they only required as much information processing as 6.5 names in an equiprobable distribution. Or, to put it another way, given that we are describing 10 names in each case, this reveals just how well exponential distributions serve to optimize communication, since by definition, the perplexity of boys’ names would have been 10 if all of the names had been distributed equally among the boys.
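For readers who like to see the arithmetic, here is a minimal sketch of the bits-to-perplexity conversion described above. The function names are illustrative, and the skewed ten-name distribution is made up (it is not the Beith table); only the 2.7 and 2.9 bit values come from the text.

```python
import math

def entropy_bits(probs):
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def perplexity(probs):
    """Size of the equiprobable set carrying the same information: 2 ** entropy."""
    return 2 ** entropy_bits(probs)

# Ten equally likely names: the least efficient case, with perplexity 10.
uniform = [1 / 10] * 10
print("uniform:", round(entropy_bits(uniform), 2), "bits,",
      "perplexity", round(perplexity(uniform), 2))

# A hypothetical skewed distribution over ten names (NOT the Beith data):
# each name is 0.7 times as common as the one ranked before it.
weights = [0.7 ** rank for rank in range(10)]
skewed = [w / sum(weights) for w in weights]
print("skewed: ", round(entropy_bits(skewed), 2), "bits,",
      "perplexity", round(perplexity(skewed), 2))

# The bit values quoted in the text, converted directly to perplexities.
print(round(2 ** 2.7, 1), round(2 ** 2.9, 1))  # → 6.5 7.5
```

Any skew in how the ten names are shared out pulls the perplexity below 10, which is the sense in which a biased distribution asks less of memory than an even one.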

All of which means that when we look at the data for over 90% of births in Beith in the Eighteenth Century, we see no great differences in the information in male and female first names. And this in turn means that we can expect that there were no great differences in the memorability or recognizability of male and female first names in Beith at this time.

We can then compare this to what had happened in the US by the end of the Twentieth Century by analyzing a data set released by the US Census Bureau, which provides separate first and last name data collected from people living in 5,300 predefined blocks (or block clusters) during the 1990 Census. The data set comprises 6,290,251 last name records (which in turn represent 88,799 unique family names), 3,003,954 male first name records (comprising 1,219 unique male first names), and 3,184,399 female first name records (comprising 4,275 unique female first names).

As we saw in my last post, the number of unique last names dwarfs the number of unique first names (even if we add male and female first names together, there are roughly sixteen times as many last names). But what is perhaps more interesting is the change in number of unique female first names as compared to the number of unique male first names.

The total first name stock in Beith in the 18th Century was 112 names, 50 of which were male and 62 of which were female (i.e., there were around 20% more female first names). By contrast, in the US Census sample, there are around three and a half times more female names than male names. Thus while the perplexity of male names is 167.2, it rises to 385.6 for female first names, which looks like this:


In simple terms, this means that recalling any given boy’s name from this sample is akin to remembering a single item from a shopping list of 167 items; whereas recalling a female name is like remembering a single item from a shopping list of 386 items. Which, hopefully, gives you some idea of the degree to which the growth in information in first names during the 20th Century affected female names far more than male names.
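The same conversion can be run in reverse to put the Census perplexities back on the scale used for Beith. The sketch below uses only the figures quoted in the text (2.7 and 2.9 bits for Beith, perplexities of 167.2 and 385.6 for the Census sample); the variable names are illustrative.

```python
import math

# Figures quoted in the text.
beith_male_bits, beith_female_bits = 2.7, 2.9
census_male_perplexity, census_female_perplexity = 167.2, 385.6

# Perplexity is 2 ** bits, so bits are log2(perplexity).
census_male_bits = math.log2(census_male_perplexity)
census_female_bits = math.log2(census_female_perplexity)

beith_gap = beith_female_bits - beith_male_bits
census_gap = census_female_bits - census_male_bits

print(f"Census male first names:   {census_male_bits:.2f} bits")
print(f"Census female first names: {census_female_bits:.2f} bits")
print(f"female minus male, Beith:  {beith_gap:.2f} bits")
print(f"female minus male, Census: {census_gap:.2f} bits")
print(f"Census perplexity ratio (female / male): "
      f"{census_female_perplexity / census_male_perplexity:.2f}")
```

On these numbers the gap between female and male first names grows from about a fifth of a bit in Beith to a little over one bit in the Census sample. Since each extra bit doubles the size of the equivalent shopping list, that is why the female list comes out more than twice as long as the male one.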

Further, when it comes to the favorability of these names, extrapolating from the smaller Census data set to the population as a whole indicates that the average frequency of the top 20 male first names in 1990 was 2,340,000 people per name (close to 50 million males had a top 20 male first name). These figures fall to 1,285,000 per name, and just 25.7 million females, for the top 20 female names. Given that frequency is a good predictor of favorability for first names (and that frequency is essential to the grammatical function of first names in traditional name grammars), it appears that the 20th Century saw a serious decline in the chances of a female receiving a name that would be as easy to process or remember as a male name, or that other people would judge to be as favorable as a male name.

The what and why of women’s names

This raises two questions: Why has this happened, and does it really matter?

I’ll address the last one first, and although I usually try to make sure that any comments on this blog are quantitatively grounded in empirical data, for once I’ll begin my answer with an anecdote. In my last post, I described how through accidents of history, name laws have come to discriminate against African-Americans; and above, I showed that the information content of women’s first names has risen out of all proportion to men’s. A while ago, I was describing these findings to a linguist colleague of mine, and after looking thoughtful for a moment, he told me about the names of the children of another linguist colleague, who had two girls and two boys.

The girls had beautiful African names that I hadn’t encountered before, and that I barely caught as he spoke them aloud, which was not just because I’m somewhat inattentive. To our brains, perception, prediction and uncertainty reduction are often much the same thing, and, of course, the uncertainty we experience increases as names become more uncommon, both moment to moment as a name unfolds (compare Th-eliz-be-a to Eliz-a-be-th) and then, about whether what we heard is actually a name or not. A consequence of this is that the more uncommon and uncertain a name is, the less likely we will be to even hear it properly, let alone recognize or remember it. (This is likely to be a particular problem for first names, which the statistics of English lead our brains to expect to be familiar.) Accordingly, while it bothers me that I can’t remember the two daughters’ names, the fact is, I can’t.

Yet when it comes to the sons, while they weren’t actually called Peter and Paul, their names were pretty darned close. And because of the ease with which I recognized them, and the sheer memorability that their names gain from being so familiar, I can still recall them to this day.

This is one reason why I think all this matters.

In my day job, I spend a lot of my time exploring how frequency and familiarity help to make processing language – and remembering what we read and hear – easier. It is what makes me appreciate just how remarkable historical naming practices were, because somehow, without having any explicit knowledge of what they were doing, or just how much it helped them to communicate about names, people managed to share around names like John and Nate and Cornelius so that on average, everyone in a community got close to the most frequent and familiar name it is possible to come up with while still allowing them to be individuated from others by it.

And this is also why it is that, despite the fact that I’d need a time travel machine to provide you with direct empirical evidence for it, I am confident that processing and remembering names was easier, and more gender neutral in the past than it is today. And it’s also why I’m confident that processing and remembering names has gotten harder, and that this is affecting women more than men.

“Man bites dog”

I can, however, offer one indirect piece of empirical evidence for this claim. There is an old journalistic chestnut that holds that “Dog bites man” is not news, but “Man bites dog” is. Following this maxim, in an earlier post, I showed how, despite the fact that the number of elderly adults in the US population is growing all the time, the frequency with which we use words that refer to the elderly (counting things like old man, old lady, old woman, senior citizen, elderly, aged, pensioner, etc) is going down:


As “Man bites dog” suggests, people generally prefer to talk and hear about things that are interesting or informative. And since terms like “old geezer” will inevitably become less informative as the number of old geezers increases, it seems that people use them less as a consequence. (Or, to put it another way, when was the last time you walked into the kitchen saying, “Hello, I am walking into the kitchen”?)

With this in mind, while there are some good reasons why we should not always take individual Google n-gram counts too seriously, I think the trends revealed in the following plots are still quite informative. The first two plot the relative frequencies with which the phrases “forget his name” and “remember his name” and “forget her name” and “remember her name” occur in the Google Books corpus in the period between 1750 and 2000:


And the next two plot the relative frequencies with which the phrases “forgot his name” and “remembered his name” and “forgot her name” and “remembered her name” occur in the Google Books corpus in the same period:


As you can see, when it comes to what writers write about, situations in which people forget a name have become less and less newsworthy over time. Meanwhile, the act of remembering a name has become ever more worthy of mention. Which suggests both that remembering names was once far easier than it is today, and that forgotten names have now become something of a default expectation.

Which is exactly what the changes to the informativity of names that I have described would predict.


All of which brings me to my second question. Why is the change in the informativity (and hence the memorability, recognizability and the ease of processing) of names affecting women more than men?

As Oscar Wilde famously put it, the truth is rarely pure and never simple. I’m sure that numerous forces are in play here, yet I think there are good reasons for believing that one particular factor is clearly influencing this effect, and it brings me back to where I began this post: The historical expectation that a woman will change her name on marrying.

I mentioned above that for our brains, perception, prediction and uncertainty reduction are often much the same thing. And I have described some of the many ways that uncertainty can act, more and less obviously, to affect both our perception of the world, and our behavior in it. The effects of uncertainty are obvious in, say, a visit to Starbucks, where my friend Melody inevitably gets her latte back in a cup marked Melanie; or at a party, when one might feel more confident that one has remembered Jenn’s name correctly than one does Jeramina’s.

Yet people also react in remarkably consistent ways in response to more subtle forms of uncertainty. As I showed above, throughout the English speaking world, once laws were enacted that fixed the distribution of last names, populations have responded to the way this has increased uncertainty about identities by increasing the information in first names. As populations with fixed last names grow, people begin to use more first names, and they begin to use first names that have lower average frequencies. And across the English speaking world, populations have done this so consistently in response to the challenges that fixed last names impose that their behavior looks as if it has been influenced by a natural law.

So, to return to how uncertainty affects the naming of children: based on the way people judge names, and on the need for names to individuate as best they can, we can infer from the evidence above that Anne Bowie would be a good choice for an English name. By contrast, many parents might feel that Anne Smith lacks a little when it comes to individuation. And herein lies the rub, because the expectation that a woman will change her name on marrying necessarily increases the uncertainty that parents encounter when naming a daughter as compared to a son. Because, of course, if Anne Bowie marries Nate Smith, then even today, there is a decent chance that she will become Anne Smith.

For most intents and purposes, parents naming a son know what the informativity of his first name in relation to his last name will be throughout his life, regardless of whether he is a David Bowie or a David Jones (so, yes, there are exceptions). Accordingly, they can balance the various constraints that naming imposes with a fair degree of confidence that once they have done so, their job is done. However, the expectation that a woman might change her name on marrying means that a parent cannot be certain that an Anne Bowie won’t later turn into an Anne Smith. This means that when it comes to giving a baby girl a name, parents are faced with more uncertainty than they are when they are naming a baby boy. When faced with uncertainty about names, history has shown conclusively that, in the West at least, when last names are fixed, parents invariably respond by increasing the information in first names.

Since increasing the information in first names is the natural response of parents faced with last name uncertainty, and since the expectation that a woman might change her name on marrying increases last name uncertainty for girls as compared to boys, it seems that this must inevitably be playing a part in increasing the difference in the information in male and female names. A difference that will be making female names harder to process, recognize and remember than male first names.

What happens with Japanese names, I don’t know. I don’t have the data to properly find out, because despite the fact that people’s names were regulated across the globe for the purpose of bureaucratic data collection, data on names themselves is guarded carefully (albeit for all sorts of very good reasons). My best guess, based on information theory, and the evidence I have described, is that the Japanese practice of forcing women to change their first names on marriage has an even worse effect when it comes to processing and remembering names — let alone sense of self identity, and all the other things that spring from names — than the Western practice of changing last names. But, I should emphasize, I really can’t be sure.

What I do know is that Japan is still upholding an iniquitous law that is barely 150 years old, and that, running roughshod over tens of thousands of years of social and linguistic evolution, it still effectively forces women to change their names on marriage. Because in December 2015, Kaori Oguni and the other four women lost their Supreme Court case.

I hope they don’t give up the fight.


Bertrand, M., & Mullainathan, S. (2004). Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination. American Economic Review, 94(4), 991-1013. DOI: 10.1257/0002828042002561

Colman, A., Sluckin, W., & Hargreaves, D. (1981). The effect of familiarity on preferences for surnames. British Journal of Psychology, 72(3), 363-369. DOI: 10.1111/j.2044-8295.1981.tb02195.x

Colman, A., Hargreaves, D., & Sluckin, W. (1981). Preferences for Christian names as a function of their experienced familiarity. British Journal of Social Psychology, 20(1), 3-5. DOI: 10.1111/j.2044-8309.1981.tb00465.x

Crook, A. (2012). Personal Names in 18th-Century Scotland: a case study of the parish of Beith (North Ayrshire). Journal of Scottish Name Studies, 6, 1-10.

Scott, J. C. (1998). Seeing Like a State: How Certain Schemes to Improve the Human Condition Have Failed. Yale University Press.

Shannon, C. (1948). A Mathematical Theory of Communication. Bell System Technical Journal, 27(3), 379-423. DOI: 10.1002/j.1538-7305.1948.tb01338.x

Shannon, C. (1951). Prediction and Entropy of Printed English. Bell System Technical Journal, 30(1), 50-64. DOI: 10.1002/j.1538-7305.1951.tb01366.x